Hi,
We do run the reference hardware, but only rarely see any hangs. With
confidence monitoring enabled, however, we see hangs rather reliably (on
average every second day, during a recording, with nothing in the logs).
We also run Ubuntu 10.10 with MH 1.3 on the capture agents.
We do *not* raise the debug level above the default, so I don't know its
impact. We do not rotate the logs either.
Regards, Andreas
Jonathan Felder wrote on Fri, 20 Apr 2012, subject "[Matterhorn-users]...":
At Berkeley we have been doing a lot of testing with Matterhorn 1.3 and the
reference platform defined in the wiki. Unfortunately we have found this
combination to be very unstable during our testing. We have been
experiencing daily crash issues with these agents which would severely impact
our capability to use them in any sort of production setting.
Our testing is as follows:
We have deployed 4 capture agents, in 4 different buildings, using the
reference platform defined in the wiki here:
http://opencast.jira.com/wiki/display/MHDOC/Reference+Hardware+1.3
We have installed Ubuntu 10.10 64-bit and then installed the capture agent
software as per the instructions here:
http://opencast.jira.com/wiki/display/MHDOC/Install+Capture+Agent+v1.3
We install using all of the default settings and the only alterations to the
configuration come from defining the admin server, passwords, increasing the
log level to debug, and instructing the logs to rotate on a daily basis.
We then schedule recordings on the devices via the admin server. The
recordings are roughly an hour or an hour and a half in length. These
correspond to typical class lengths. We record presentation, audio, and
video. The recordings are usually scheduled back to back throughout the day
with 5 minute breaks.
Using the above testing method, we experience daily issues with the agents.
The issues seem to occur when the agent is recording. These issues are the
following:
1. The machine hangs. When this occurs it is completely unresponsive and
must be power cycled. There is nothing in the system logs or Matterhorn logs
that would indicate why. I have documented this issue in the following Jira
ticket:
http://opencast.jira.com/browse/MH-8762
2. The JVM crashes. When this occurs there is sometimes, but not always, a
Java crash log. There is nothing in the Matterhorn or system logs that
indicates why this might have occurred. Restarting Matterhorn is sufficient
to get the capture agent back up again. I have documented this issue in the
following Jira ticket:
http://opencast.jira.com/browse/MH-8756
3. Matterhorn hangs. When this occurs, the JVM is still running but
Matterhorn is unresponsive. It stops logging and does not respond to HTTP
requests. There is nothing in the Matterhorn or system logs that indicates
why this might have occurred. Restarting Matterhorn is sufficient to get the
capture agent back up again. I have documented this issue in the following
Jira ticket:
http://opencast.jira.com/browse/MH-8763
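Since neither the JVM crash (2) nor the Matterhorn hang (3) leaves anything
in the logs, one way to at least tell them apart and timestamp them might be
a small external probe run from cron. The following is only a rough sketch
and not part of Matterhorn: the function names are hypothetical, and the
agent URL and the way the JVM's PID is obtained are assumptions that would
need to be adapted. It cannot detect a full machine hang (1) from the agent
itself; that would need a check from another host.

```python
import http.client
import os
import urllib.error
import urllib.request


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (Unix only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def probe_http(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL gives any HTTP response within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is not hung.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, reset, malformed reply, and so on.
        return False


def classify(jvm_alive: bool, http_ok: bool) -> str:
    """Map the two observations onto the failure modes described above."""
    if not jvm_alive:
        return "jvm-crashed"      # issue 2: the JVM process is gone
    if not http_ok:
        return "matterhorn-hung"  # issue 3: JVM running, no HTTP response
    return "ok"
```

A cron job could call classify(pid_alive(pid), probe_http(url)) every minute
and append the result with a timestamp to a file, where pid is read from
wherever the agent's PID is recorded and url is the agent's HTTP address
(for example http://localhost:8080/, which is an assumption here).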
I recently asked this on the list, but unless I missed it I didn't see any
responses. Is anyone running Matterhorn 1.3 capture agents with the
reference platform? Are we the only people in the community testing this?
Has anyone else seen these stability problems?
--
Jon
_______________________________________________
Matterhorn-users mailing list
[email protected]
http://lists.opencastproject.org/mailman/listinfo/matterhorn-users
-----------------------
[email protected]
01/58801 DW 41523
mobil: 0664/60 588 4523
TU Wien
DVR-Nummer: 0005886
-----------------------