At Berkeley we have been testing Matterhorn 1.3 extensively on the reference platform defined in the wiki. Unfortunately, we have found this combination to be very unstable. The capture agents crash daily, which would severely limit our ability to use them in any production setting.

Our testing is as follows:

We have deployed four capture agents in four different buildings, using the reference platform defined in the wiki here: http://opencast.jira.com/wiki/display/MHDOC/Reference+Hardware+1.3

We installed Ubuntu 10.10 64-bit and then installed the capture agent software following the instructions here: http://opencast.jira.com/wiki/display/MHDOC/Install+Capture+Agent+v1.3

We installed using all of the default settings. The only changes to the configuration were defining the admin server and passwords, increasing the log level to debug, and setting the logs to rotate daily.

We then schedule recordings on the devices via the admin server. The recordings are roughly an hour to an hour and a half long, which corresponds to typical class lengths. We record presentation, audio, and video. The recordings are usually scheduled back to back throughout the day with 5-minute breaks between them.

With this testing method, we run into problems with the agents every day. The problems seem to occur while an agent is recording, and they fall into three categories:

1. The machine hangs. When this occurs, it is completely unresponsive and must be power cycled. There is nothing in the system logs or Matterhorn logs that would indicate why. I have documented this issue in the following JIRA ticket:
http://opencast.jira.com/browse/MH-8762

2. The JVM crashes. When this occurs, there is sometimes, but not always, a Java crash log. There is nothing in the Matterhorn or system logs that indicates why this might have occurred. Restarting Matterhorn is sufficient to get the capture agent back up again. I have documented this issue in the following JIRA ticket:
http://opencast.jira.com/browse/MH-8756

3. Matterhorn hangs. When this occurs, the JVM is still running but Matterhorn is unresponsive. It stops logging and does not respond to HTTP requests. There is nothing in the Matterhorn or system logs that indicates why this might have occurred. Restarting Matterhorn is sufficient to get the capture agent back up again. I have documented this issue in the following JIRA ticket:
http://opencast.jira.com/browse/MH-8763
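
For anyone trying to reproduce this, the three cases above can be told apart by symptoms alone: whether the machine answers at all, whether the JVM process is still alive, and whether Matterhorn answers HTTP requests. Below is a minimal Python sketch of that triage, not anything shipped with Matterhorn. The function names, the choice of port 22 for the reachability probe, and whatever URL is passed in are assumptions for illustration. The `jvm_running` flag has to be determined on the agent itself (for example by looking at the process list), which the sketch does not attempt.

```python
import http.client
import socket
import urllib.error
import urllib.request


def host_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds.

    Port 22 (SSH) is an assumption; use any port known to be open on the agent.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_responds(url, timeout=10.0):
    """Return True if the server sends back any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An error status (401, 404, ...) still means the server answered.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a dropped connection.
        return False


def classify(reachable, jvm_running, http_ok):
    """Map the three observations onto the failure modes described above."""
    if not reachable:
        return "machine hang"
    if not jvm_running:
        return "JVM crash"
    if not http_ok:
        return "Matterhorn hang"
    return "healthy"
```

Running something like this from another host every few minutes and logging the result with a timestamp might at least show which failure happened and how far into a recording it occurred. An unreachable machine could also mean a network problem rather than a hang, so that result is only a hint.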

I recently asked about this on the list but, unless I missed them, I did not see any responses. Is anyone running Matterhorn 1.3 capture agents on the reference platform? Are we the only people in the community testing this? Has anyone else seen these stability problems?

--
Jon