At Berkeley we have been testing Matterhorn 1.3 extensively on the
reference platform defined in the wiki. Unfortunately, we have found
this combination to be very unstable. Our capture agents crash daily,
which would severely limit our ability to use them in any sort of
production setting.
Our testing is as follows:
We have deployed 4 capture agents, in 4 different buildings, using the
reference platform defined in the wiki here:
http://opencast.jira.com/wiki/display/MHDOC/Reference+Hardware+1.3
We have installed Ubuntu 10.10 64-bit and then installed the capture
agent software as per the instructions here:
http://opencast.jira.com/wiki/display/MHDOC/Install+Capture+Agent+v1.3
We install using all of the default settings. The only changes to the
configuration are defining the admin server and passwords, increasing
the log level to debug, and setting the logs to rotate on a daily
basis.
We then schedule recordings on the devices via the admin server. The
recordings are roughly an hour or an hour and a half in length. These
correspond to typical class lengths. We record presentation, audio, and
video. The recordings are usually scheduled back to back throughout the
day with 5-minute breaks.
Using the above testing method, we experience daily issues with the
agents. The issues seem to occur when the agent is recording. These
issues are the following:
1. The machine hangs. When this occurs it is completely unresponsive
and must be power cycled. There is nothing in the system logs or
Matterhorn logs that would indicate why. I have documented this issue
in the following JIRA ticket:
http://opencast.jira.com/browse/MH-8762
2. The JVM crashes. When this occurs there is sometimes, but not
always, a Java crash log. There is nothing in the Matterhorn or system
logs that indicates why this might have occurred. Restarting Matterhorn
is sufficient to get the capture agent back up again. I have documented
this issue in the following JIRA ticket:
http://opencast.jira.com/browse/MH-8756
3. Matterhorn hangs. When this occurs, the JVM is still running but
Matterhorn is unresponsive. It stops logging and does not respond to
HTTP requests. There is nothing in the Matterhorn or system logs that
indicates why this might have occurred. Restarting Matterhorn is
sufficient to get the capture agent back up again. I have documented
this issue in the following JIRA ticket:
http://opencast.jira.com/browse/MH-8763
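For anyone who wants to tell issues 2 and 3 apart automatically, here
is a minimal sketch of a check, not something we currently run. It
assumes the agent answers HTTP on some known URL (the localhost:8080
address in the usage note is only a placeholder) and that the caller
already knows whether the JVM process is alive. The function names and
the returned labels are made up for illustration.

```python
import http.client
import socket
import urllib.error
import urllib.request


def is_responsive(url, timeout=10.0):
    """Return True if the server answers the HTTP request at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status (401, 404, ...) still proves that the
        # server accepted the connection and replied.
        return True
    except (urllib.error.URLError, http.client.HTTPException,
            socket.timeout, OSError):
        # Connection refused, timeout, or a broken reply: no usable answer.
        return False


def classify(jvm_running, http_ok):
    """Map the two observations onto the failure modes described above.

    The labels are hypothetical names for this sketch only.
    """
    if not jvm_running:
        return "jvm-crashed"        # issue 2: the JVM process is gone
    if not http_ok:
        return "matterhorn-hung"    # issue 3: JVM alive, no HTTP answer
    return "ok"
```

A cron job could call classify(jvm_running, is_responsive("http://localhost:8080/"))
every few minutes and restart Matterhorn on anything other than "ok".
This cannot help with issue 1, since a hung machine cannot run the check.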
I recently asked this on the list, but unless I missed it I didn't see
any responses. Is anyone running Matterhorn 1.3 capture agents on the
reference platform? Are we the only people in the community testing
this? Has anyone else seen these stability problems?
--
Jon