[ 
http://opencast.jira.com/browse/MH-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=30748#comment-30748
 ] 

Adam McKenzie commented on MH-8756:
-----------------------------------

This comment should contain all of the steps needed to replicate the changes I 
made to configuration, code and logging while trying to diagnose this problem. 
When running the tests, note that in rare cases the crash occurs while the 
agent is idle rather than during a capture. Crashes seem more likely when the 
captures are close together. For a typical test I schedule 4 captures of 1.5 
hours each with 1 minute between them; it usually fails on the 2nd or 3rd 
capture. 
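
To make the schedule concrete, here is a small sketch that prints the start and 
end of each capture as minutes from the first start. The function name 
capture_offsets is made up for illustration; it is not part of Matterhorn. 

```shell
# capture_offsets COUNT DURATION_MIN GAP_MIN
# Prints "start end" (minutes from the first start), one capture per line.
# Hypothetical helper for planning a test run, not a Matterhorn tool.
capture_offsets() {
  count=$1; duration=$2; gap=$3
  i=0
  while [ "$i" -lt "$count" ]; do
    start=$(( i * (duration + gap) ))
    end=$(( start + duration ))
    echo "$start $end"
    i=$(( i + 1 ))
  done
}
```

Running capture_offsets 4 90 1 gives 0 90, 91 181, 182 272 and 273 363, so a 
full test run takes a little over 6 hours. 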

Test 1: Sun JDK vs. OpenJDK
The capture agents I was using were configured for Sun Java, so I installed 
OpenJDK using the package manager and switched the JVM to it by running:
sudo update-alternatives --config java
sudo update-alternatives --config javac
sudo update-alternatives --config jar
I got the same results with Sun Java as I did with OpenJDK. 
If you need to install Sun Java: 
1. Get Java:
wget http://download.oracle.com/otn-pub/java/jdk/7/jdk-7-linux-x64.tar.gz

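2. Uncompress it (use the name of the archive you actually downloaded; the 
file and directory names below are for the 7u1 build):
tar -xvf jdk-7u1-linux-x64.tar.gz

3. Move to a new location:
sudo mkdir -p /usr/lib/jvm
sudo mv jdk1.7.0_01/ /usr/lib/jvm/java-7-oracle/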

4. Install the binaries:
sudo update-alternatives --install /usr/bin/javac javac \
  /usr/lib/jvm/java-7-oracle/bin/javac 1
sudo update-alternatives --install /usr/bin/java java \
  /usr/lib/jvm/java-7-oracle/bin/java 1
sudo update-alternatives --install /usr/bin/jar jar \
  /usr/lib/jvm/java-7-oracle/bin/jar 1

5. Set the correct version of java:
sudo update-alternatives --config java
sudo update-alternatives --config javac
sudo update-alternatives --config jar 

To use 1.6 instead of 1.7, change the download location and the paths in the 
steps above accordingly. 

Test 2: Epiphan_VGA2USB vs. V4LSRC
In the configuration file found at 
/opt/matterhorn/felix/conf/services/org.opencastproject.capture.impl.ConfigurationManager.properties
 change the line:
capture.device.Epiphan_VGA2USB.type=EPIPHAN_VGA2USB
to:
capture.device.Epiphan_VGA2USB.type=V4LSRC
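
If you would rather script this change than edit the file by hand, something 
like the following sketch should work. The helper name set_property is my own 
invention, and it assumes the key already exists in the file as a plain 
key=value line with no surrounding spaces. 

```shell
# set_property FILE KEY VALUE
# Rewrites the line "KEY=..." in FILE to "KEY=VALUE"; fails if KEY is absent.
# Hypothetical helper, assuming plain key=value lines.
set_property() {
  file=$1; key=$2; value=$3
  tmp=$(mktemp) || return 1
  # awk compares the key as a plain string, so the dots in it are not
  # treated as regex wildcards (as they would be in a naive sed pattern).
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { exit(found ? 0 : 1) }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through the original file so its ownership and mode are kept.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

With the path from above stored in a variable such as CONF, the call would be: 
set_property "$CONF" capture.device.Epiphan_VGA2USB.type V4LSRC
(run with enough privileges to write the file). 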


Test 3: mpeg2 vs. x264
In the configuration file found at 
/opt/matterhorn/felix/conf/services/org.opencastproject.capture.impl.ConfigurationManager.properties
add the following lines (the file has no existing equivalents to replace):
capture.device.camera.codec=x264enc
capture.device.camera.container=mp4mux
capture.device.camera.bitrate=2048

capture.device.vga.codec=x264enc
capture.device.vga.container=mp4mux
capture.device.vga.bitrate=2048
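
Since these lines are additions rather than replacements, a script that adds 
them should avoid appending duplicates when it is run twice. A minimal sketch, 
with add_property_if_missing being a made-up helper name: 

```shell
# add_property_if_missing FILE KEY VALUE
# Appends "KEY=VALUE" to FILE unless a line starting with "KEY=" exists.
# Hypothetical helper, assuming plain key=value lines.
add_property_if_missing() {
  file=$1; key=$2; value=$3
  # Plain-string match at the start of the line, so dots in the key are literal.
  if awk -v k="$key" 'index($0, k "=") == 1 { found = 1 }
                      END { exit(found ? 0 : 1) }' "$file"; then
    return 0   # already present: leave the existing line untouched
  fi
  # Make sure the file ends with a newline before appending.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  printf '%s=%s\n' "$key" "$value" >> "$file"
}
```

Call it once per line, for example: 
add_property_if_missing "$CONF" capture.device.camera.codec x264enc
where CONF holds the path of the properties file. 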


Test 4: Increasing GST_DEBUG levels
I was hoping that increasing the GStreamer debug level would give hints as to 
which GStreamer elements might be causing the issue. I used GST_DEBUG levels 2 
and 3; levels 4 and 5 are too noisy for the time it takes to reproduce the 
crash (about 1GB of log per 10 minutes) and slow the process down to the point 
where it won't capture. There didn't seem to be any connection with GStreamer 
activity: the GStreamer log output stopped once the start of the capture had 
completed. To increase the GST debug level, run:
export GST_DEBUG=3
Then run Felix manually using the shell script at 
/opt/matterhorn/felix/bin/start_matterhorn.sh so that you can see the logs; 
otherwise you will need to change the redirects in the init script so that it 
dumps the output for you. 
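
Because a run can take hours before it fails, it may help to keep a copy of 
the output on disk while still watching it on the console. The wrapper below 
is only a sketch: run_with_gst_debug is a made-up name, the log path is up to 
you, and it assumes your GStreamer build honours GST_DEBUG_NO_COLOR. 

```shell
# run_with_gst_debug LEVEL LOGFILE COMMAND [ARGS...]
# Runs COMMAND with GST_DEBUG set to LEVEL, showing stdout and stderr on the
# console and saving a copy in LOGFILE. Hypothetical helper.
run_with_gst_debug() {
  level=$1; logfile=$2; shift 2
  # GStreamer writes its debug output to stderr, so merge it into stdout
  # before tee. GST_DEBUG_NO_COLOR keeps colour escape codes out of the file.
  GST_DEBUG="$level" GST_DEBUG_NO_COLOR=1 "$@" 2>&1 | tee "$logfile"
}
```

For example: 
run_with_gst_debug 3 /tmp/gst-debug.log /opt/matterhorn/felix/bin/start_matterhorn.sh
Note that the function returns the exit status of tee, not of the command it 
ran. 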


Test 5: Setting LD_PRELOAD to include libjsig.so
Add the line:
LD_PRELOAD=/usr/lib/jvm/java-6-openjdk/jre/lib/amd64/libjsig.so
to the /etc/init.d/matterhorn script, alongside the rest of the environment 
variables. 


Test 6: Reducing GStreamer buffers from 512MB to 64MB
Change the lines:
capture.device.camera.buffer.bytes=536870912
capture.device.vga.buffer.bytes=536870912
capture.device.audio.buffer.bytes=536870912
To:
capture.device.camera.buffer.bytes=67108864
capture.device.vga.buffer.bytes=67108864
capture.device.audio.buffer.bytes=67108864
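
The byte values are just the sizes in MB multiplied by 1024 * 1024. If you 
want to try other buffer sizes, a one-line helper (the name mib_to_bytes is 
mine) does the conversion: 

```shell
# mib_to_bytes SIZE_MB
# Converts a size in MB (1 MB = 1024 * 1024 bytes here) to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}
```

mib_to_bytes 512 prints 536870912 and mib_to_bytes 64 prints 67108864, which 
match the values above. 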

Test 7: Disabled zipping and ingestion of new captures (old captures might 
still be zipped and an ingest attempted for them)
Apply the following patch to your 1.3 source files (which can be found in 
/opt/matterhorn/capture-agent/matterhorn-source/):

Index: modules/matterhorn-capture-agent-impl/src/main/java/org/opencastproject/capture/impl/jobs/StopCaptureJob.java
===================================================================
--- modules/matterhorn-capture-agent-impl/src/main/java/org/opencastproject/capture/impl/jobs/StopCaptureJob.java	(revision 12179)
+++ modules/matterhorn-capture-agent-impl/src/main/java/org/opencastproject/capture/impl/jobs/StopCaptureJob.java	(working copy)
@@ -79,7 +79,7 @@
       trigger.getJobDataMap().put(JobParameters.SCHEDULER, sched);

       // Schedule the serializeJob
-      sched.scheduleJob(job, trigger);
+      // sched.scheduleJob(job, trigger);

       logger.info("stopCaptureJob complete");

@@ -94,9 +94,6 @@
         e.printStackTrace();
       }

-    } catch (SchedulerException e) {
-      logger.error("Couldn't schedule task: {}", e);
-      e.printStackTrace();
     } catch (Exception e) {
       logger.error("Unexpected exception: {}", e);
       e.printStackTrace();
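
One way to apply the patch above is with GNU patch from the root of the source 
tree. This is a sketch under a few assumptions: the patch has been saved to a 
file of your choosing, the helper name apply_patch_checked is made up, and 
-p0 is used because the paths in the Index line are relative to the source 
root. 

```shell
# apply_patch_checked SRC_DIR PATCH_FILE
# Tries the patch with --dry-run first and only applies it if that succeeds.
# PATCH_FILE must be an absolute path because the function changes directory.
# Hypothetical helper built on GNU patch.
apply_patch_checked() {
  src_dir=$1; patch_file=$2
  (
    cd "$src_dir" &&
    patch -p0 --dry-run < "$patch_file" &&
    patch -p0 < "$patch_file"
  )
}
```

For example, if the patch was saved as /tmp/stop-capture.patch: 
apply_patch_checked /opt/matterhorn/capture-agent/matterhorn-source/ /tmp/stop-capture.patch
If the dry run reports failed hunks, nothing in the source tree is modified. 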



Test 8: Attached gdb to a JVM that crashed
Install the GStreamer debug symbol packages:
sudo apt-get install \
  gstreamer0.10-plugins-bad-multiverse-dbg \
  gstreamer0.10-plugins-ugly-dbg \
  gstreamer0.10-plugins-base-dbg \
  gstreamer0.10-ffmpeg-dbg \
  gstreamer0.10-plugins-bad-dbg \
  gstreamer0.10-plugins-good-dbg

Then follow the instructions at https://wiki.ubuntu.com/Backtrace to attach gdb 
to the java program. 
                
> Matterhorn Capture Agent JVM Crashes - libavcodec.so.52
> -------------------------------------------------------
>
>                 Key: MH-8756
>                 URL: http://opencast.jira.com/browse/MH-8756
>             Project: Matterhorn Project
>          Issue Type: Bug
>          Components: Capture (Devices and Software)
>    Affects Versions: 1.3, 1.2.1
>            Reporter: Jonathan Felder
>            Assignee: Adam McKenzie
>         Attachments: gdb-may-11.txt, hs_err_pid1347.log, hs_err_pid1466.log, 
> hs_err_pid26145.log, hs_err_pid4252.log, jvm-may-11.txt
>
>

