[ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644100#action_12644100 ]
Brian Bockelman commented on HADOOP-4541:
-----------------------------------------

Dummy me forgot the greatest debugging tool of them all: gdb. Here's the stack trace of one of these suckers in action:

(gdb) where
#0  0x0000002a975a0a47 in CollectedHeap::allocate_from_tlab_slow () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#1  0x0000002a97529551 in CollectedHeap::common_mem_allocate_noinit () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#2  0x0000002a9793374b in typeArrayKlass::allocate () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#3  0x0000002a9769f5e8 in jni_NewByteArray () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#4  0x0000002a971ef363 in hdfsWrite (fs=Variable "fs" is not available.) at hdfs.c:590
#5  0x0000002a970ea97a in globus_l_gfs_hdfs_dump_buffers (hdfs_handle=0x6de080) at globus_gridftp_server_hdfs.c:507

That dump_buffers call (internal to my application) worries me; give me a few hours to step through things in gdb. If errors aren't propagated correctly in that particular function, it could lead to an application-side infinite loop (a sketch of the suspect pattern follows the quoted issue below).

> Infinite loop in error handler for libhdfs
> ------------------------------------------
>
>                 Key: HADOOP-4541
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4541
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: libhdfs
>    Affects Versions: 0.18.1
>            Reporter: Brian Bockelman
>         Attachments: libhdfs_failure.txt
>
>
> If there is a problem writing out to HDFS, libhdfs gets stuck in an infinite loop.
> Unfortunately, my program reroutes stderr/stdout to /dev/null, so I am attaching the strace output. You can see the Java stack traces, which are written over and over.
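To make the worry concrete, here is a minimal sketch of the failure mode I suspect. This is NOT the actual globus_l_gfs_hdfs_dump_buffers code; the helper names are invented, and the only libhdfs behavior it relies on is the documented contract that hdfsWrite() returns the number of bytes written, or -1 on error:

#include <stdio.h>
#include "hdfs.h"   /* libhdfs: hdfsFS, hdfsFile, tSize, hdfsWrite() */

/* Buggy pattern (hypothetical): -1 from hdfsWrite() is treated as
 * "0 bytes written, try again", so a persistent write error spins
 * forever, and the JVM prints the same Java stack trace on every
 * failed call, matching the looping traces seen in the strace log. */
static int dump_buffer_buggy(hdfsFS fs, hdfsFile file,
                             char *buf, tSize len)
{
    tSize written = 0;
    while (written < len) {
        tSize rc = hdfsWrite(fs, file, buf + written, len - written);
        written += rc;   /* BUG: rc == -1 moves 'written' backwards */
    }
    return 0;
}

/* Fixed pattern: check for -1 and propagate the error to the caller
 * so the transfer can be torn down instead of looping. */
static int dump_buffer_fixed(hdfsFS fs, hdfsFile file,
                             char *buf, tSize len)
{
    tSize written = 0;
    while (written < len) {
        tSize rc = hdfsWrite(fs, file, buf + written, len - written);
        if (rc < 0) {
            fprintf(stderr, "hdfsWrite failed, aborting transfer\n");
            return -1;
        }
        written += rc;
    }
    return 0;
}

If dump_buffers (or anything above it in the call chain) swallows the -1 the way the first variant does, that alone would explain the endless Java stack traces in the attached strace output.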