[
https://issues.apache.org/jira/browse/HDFS-13585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506720#comment-16506720
]
Nalini Ganapati edited comment on HDFS-13585 at 7/2/18 5:45 PM:
----------------------------------------------------------------
Getting this fixed is critical for us. The crash shows up when multiple native
libraries use JNI in the same application. In our case, the application was
GATK (https://github.com/broadinstitute/gatk), which packages numerous jars,
some of them with embedded native libraries. We worked around the issue by
rewriting hdfsThreadDestructor in
hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/thread_local_storage.c
as follows:
static void hdfsThreadDestructor(void *v)
{
    JavaVM *vm;
    JNIEnv *env = v;
    jint ret;

    ret = (*env)->GetJavaVM(env, &vm);
    if (ret) {
        fprintf(stderr, "hdfsThreadDestructor: GetJavaVM failed with error %d\n",
                ret);
        (*env)->ExceptionDescribe(env);
    } else {
        // Buggy JVM support, DetachCurrentThread throws exceptions sometimes.
        // Workaround is to try to AttachCurrentThread as it is a noop if the
        // Thread is already attached.
        ret = (*vm)->AttachCurrentThread(vm, (void**)&env, 0);
        if (ret == JNI_OK) {
            (*vm)->DetachCurrentThread(vm);
        }
    }
}
Is there any possibility of getting this issue fixed soon?
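
For readers unfamiliar with the mechanism: the backtrace in the issue description shows hdfsThreadDestructor being called from __nptl_deallocate_tsd, which suggests it is a POSIX thread-specific-data destructor that runs while a thread exits. The following is a minimal sketch of that mechanism only, with no JNI or libhdfs involved. All names in it (demo_key, demo_destructor, worker, run_demo) are hypothetical and are not taken from libhdfs.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical names throughout; this is an illustration, not libhdfs code. */
static pthread_key_t demo_key;
static int destructor_calls = 0;

/* Runs on the exiting thread if a non-NULL value is stored under demo_key. */
static void demo_destructor(void *v)
{
    int *state = v;
    fprintf(stderr, "demo_destructor: cleaning up state %d\n", *state);
    destructor_calls++;
}

static void *worker(void *arg)
{
    /* Attach per-thread state to the key. In the workaround above, the
     * value handed to the destructor is a JNIEnv pointer instead. */
    pthread_setspecific(demo_key, arg);
    return NULL; /* thread exit triggers demo_destructor(arg) */
}

/* Returns the number of destructor calls observed, or -1 on error. */
int run_demo(void)
{
    static int state = 42;
    pthread_t t;

    destructor_calls = 0;
    if (pthread_key_create(&demo_key, demo_destructor) != 0)
        return -1;
    if (pthread_create(&t, NULL, worker, &state) != 0) {
        pthread_key_delete(demo_key);
        return -1;
    }
    pthread_join(t, NULL); /* the destructor has run once join returns */
    pthread_key_delete(demo_key);
    return destructor_calls;
}
```

Calling run_demo() from a program linked with -pthread should report one destructor call. The point of the sketch is timing: the destructor runs during thread teardown, so in the real case it can run while the JVM is already shutting down, which is the situation the workaround tries to tolerate.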
> libhdfs SIGSEGV during shutdown of Java application.
> ----------------------------------------------------
>
> Key: HDFS-13585
> URL: https://issues.apache.org/jira/browse/HDFS-13585
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: native
> Affects Versions: 2.7.5
> Environment: Centos 7
> gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
> Reporter: Nalini Ganapati
> Priority: Critical
>
> We are using libhdfs for HDFS support from our native library. This has been
> working mostly fine with Java/Spark applications, but some of them throw a
> SIGSEGV in hdfsThreadDestructor(). We tried to dynamically load and unload
> libhdfs.so using dlopen/dlclose, but to no avail: we still see the segfault.
> Is this a known issue? Thread-local storage appears to be involved; are there
> workarounds?
>
> Here is a call stack from gdb java <core file>
> (gdb) bt
> #0 0x00007f3333ad21f7 in raise () from /usr/lib64/libc.so.6
> #1 0x00007f3333ad38e8 in abort () from /usr/lib64/libc.so.6
> #2 0x00007f3333380259 in os::abort(bool) () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #3 0x00007f3333585986 in VMError::report_and_die() () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #4 0x00007f3333389ec7 in JVM_handle_linux_signal () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #5 0x00007f333337d678 in signalHandler(int, siginfo_t*, void*) () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #6 <signal handler called>
> #7 0x00007f3333341e66 in Monitor::ILock(Thread*) () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #8 0x00007f33333428f6 in Monitor::lock_without_safepoint_check() () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #9 0x00007f333358bc21 in VM_Exit::wait_if_vm_exited() () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #10 0x00007f333314fee5 in jni_DetachCurrentThread () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/lib/amd64/server/libjvm.so
> #11 0x00007f32f2645f15 in hdfsThreadDestructor (v=0x7f332c018bc8) at /home/kshvachk/Work/Hadoop/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/os/posix/thread_local_storage.c:49
> #12 0x00007f3334490c22 in __nptl_deallocate_tsd () from /usr/lib64/libpthread.so.0
> #13 0x00007f3334490e33 in start_thread () from /usr/lib64/libpthread.so.0
> #14 0x00007f3333b9534d in clone () from /usr/lib64/libc.so.6