[
https://issues.apache.org/jira/browse/HDFS-16021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029943#comment-18029943
]
lifulong commented on HDFS-16021:
---------------------------------
Hi [~jeremy.coulon], glad to find this issue; we are experiencing the same
problem. The Java process using libhdfs crashes in the {{hdfsThreadDestructor}}
function with a core dump. The {{env}} pointer is not null, and we suspected it
might be related to JVM shutdown, but there is no clear evidence, since logs show
other threads were still processing data normally at the time.
> heap-use-after-free in hdfsThreadDestructor
> -------------------------------------------
>
> Key: HDFS-16021
> URL: https://issues.apache.org/jira/browse/HDFS-16021
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.9.1, 2.9.2, 3.3.0
> Reporter: Jeremy Coulon
> Priority: Major
> Attachments: fix-hdfsThreadDestructor.patch, hdfs-asan.log
>
>
> Related to HDFS-12628 HDFS-13585 HDFS-14488 HDFS-15270
>
> We have experienced crashes located in libhdfs hdfsThreadDestructor() for a
> long time. The crash is almost systematic with OpenJ9 and more sporadic with
> the Hotspot JVM.
>
> I finally got to the root cause of this bug thanks to AddressSanitizer. This
> is quite difficult to set up because you need to rebuild the test case,
> Hadoop, and openjdk-hotspot >= 13 with specific compiler options.
>
> See hdfs-asan.log for details.
>
> *Analysis:*
> In hdfsThreadDestructor(), you are making several JNI calls in order to
> detach the thread from the JVM:
>
> {code:c}
> /* Detach the current thread from the JVM */
> if (env) {
>     ret = (*env)->GetJavaVM(env, &vm);
>     /*
>      * More code here...
>      */
> }{code}
> This is fine if the thread was created in the C/C++ world.
>
> However, if the thread was created in the Java world, this is absolutely
> wrong. When a Java thread terminates, the JVM deallocates some memory which
> contains (among other things) the thread-specific JNIEnv. Only then is
> hdfsThreadDestructor() called. The *env* variable is not NULL but points
> to memory which was just released. This is the heap-use-after-free detected
> by ASan.
>
> I have been working on a patch that fixes the issue (see attachment).
>
> Here is the idea:
> * In hdfsThreadDestructor(), we need to know if the thread was created by
> Java or C/C++. If it was created by C/C++, we should make JNI calls in order
> to detach the current thread. If it was created by Java, we don't need to
> make any JNI call: the thread is already detached.
> * In getGlobalJNIEnv(), we can detect if the thread was created by Java or
> C/C++. It can be done by calling *vm->GetEnv()*. Then we store this
> information inside ThreadLocalState.
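
The idea described above can be illustrated with a small, JVM-free sketch. This is not the attached patch and it does not use JNI: the `attached_by_native` field, `get_thread_state()`, and the `detach_calls` counter are hypothetical names, and the `already_attached` argument merely stands in for the result that a real `GetEnv()` call would give (`JNI_EDETACHED` for a thread the JVM does not yet know about). The only point it shows is that the pthread key destructor consults a flag recorded at attach time instead of touching a possibly freed `JNIEnv`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state; the real ThreadLocalState in libhdfs
 * holds more fields. Only the flag matters for this sketch. */
typedef struct {
    int attached_by_native; /* 1 if native code attached this thread */
} ThreadLocalState;

static pthread_key_t state_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

/* Counts simulated detach calls. The demo joins each thread before
 * reading it, so no extra locking is used here. */
static int detach_calls = 0;

/* Runs at thread exit. It never dereferences a JNIEnv: it only looks at
 * the flag stored when the state was created. */
static void thread_destructor(void *v) {
    ThreadLocalState *state = v;
    if (state->attached_by_native) {
        /* Real code would call (*vm)->DetachCurrentThread(vm) here. */
        detach_calls++;
    }
    free(state);
}

static void make_key(void) {
    int rc = pthread_key_create(&state_key, thread_destructor);
    assert(rc == 0);
    (void)rc;
}

/* Stand-in for getGlobalJNIEnv(): 'already_attached' simulates what
 * GetEnv() would report for the calling thread (assumption, no JVM). */
static ThreadLocalState *get_thread_state(int already_attached) {
    pthread_once(&key_once, make_key);
    ThreadLocalState *state = pthread_getspecific(state_key);
    if (state == NULL) {
        state = calloc(1, sizeof(*state));
        if (state == NULL) {
            return NULL;
        }
        /* A thread the JVM already knows was created by Java; otherwise
         * native code attaches it and must detach it later. */
        state->attached_by_native = !already_attached;
        if (pthread_setspecific(state_key, state) != 0) {
            free(state);
            return NULL;
        }
    }
    return state;
}

static void *worker(void *arg) {
    int created_by_java = *(int *)arg;
    get_thread_state(created_by_java);
    return NULL;
}

/* Runs one thread to completion and returns how many simulated detach
 * calls its destructor made. */
static int detach_calls_for_thread(int created_by_java) {
    pthread_t t;
    int before = detach_calls;
    int rc = pthread_create(&t, NULL, worker, &created_by_java);
    assert(rc == 0);
    (void)rc;
    pthread_join(t, NULL);
    return detach_calls - before;
}
```

In this sketch, `detach_calls_for_thread(1)` (a simulated Java-created thread) performs no detach, while `detach_calls_for_thread(0)` (a simulated native thread) performs exactly one. In real libhdfs code the flag would be set from the return value of `GetEnv()` before `AttachCurrentThread()` is called.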