[ 
https://issues.apache.org/jira/browse/HDFS-16021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029943#comment-18029943
 ] 

lifulong commented on HDFS-16021:
---------------------------------

Hi [~jeremy.coulon], glad to find this issue: we're experiencing the same 
problem. The Java process using libhdfs crashes in the {{hdfsThreadDestructor}} 
function with a core dump. The {{env}} pointer is not null, and we suspected it 
might be related to JVM shutdown, but there's no clear evidence, since logs show 
other threads were still processing data normally at the time.

> heap-use-after-free in hdfsThreadDestructor
> -------------------------------------------
>
>                 Key: HDFS-16021
>                 URL: https://issues.apache.org/jira/browse/HDFS-16021
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.9.1, 2.9.2, 3.3.0
>            Reporter: Jeremy Coulon
>            Priority: Major
>         Attachments: fix-hdfsThreadDestructor.patch, hdfs-asan.log
>
>
> Related to HDFS-12628 HDFS-13585 HDFS-14488 HDFS-15270 
>  
> We have experienced crashes located in libhdfs hdfsThreadDestructor() for a 
> long time. The crash is almost systematic with OpenJ9 and more sporadic with 
> the HotSpot JVM.
>  
> I finally got to the root cause of this bug thanks to AddressSanitizer. This 
> is quite difficult to set up because you need to rebuild the test case, 
> Hadoop, and OpenJDK HotSpot >= 13 with specific compiler options.
>  
> See hdfs-asan.log for details.
>  
> *Analysis:*
> In hdfsThreadDestructor(), you are making several JNI calls in order to 
> detach the thread from the JVM:
>  
> {code:java}
> /* Detach the current thread from the JVM */
> if (env) {
>   ret = (*env)->GetJavaVM(env, &vm);
>   /*
>    *  More code here...
>    */
> }{code}
> This is fine if the thread was created in the C/C++ world.
>  
> However if the thread was created in the Java world, this is absolutely 
> wrong. When a Java thread terminates, the JVM deallocates some memory which 
> contains (among other things) the thread specific JNIEnv. Then 
> hdfsThreadDestructor() is called. The *env* variable is not NULL but points 
> to memory which was just released. This is the heap-use-after-free detected 
> by ASan.
>  
> I have been working on a patch that fixes the issue (see attachment).
>  
> Here is the idea:
>  * In hdfsThreadDestructor(), we need to know if the thread was created by 
> Java or C/C++ . If it was created by C/C++ we should make JNI calls in order 
> to detach the current thread. If it was created by Java, we don't need to 
> make any JNI call: thread is already detached.
>  * In getGlobalJNIEnv(), we can detect if the thread was created by Java or 
> C/C++. It can be done by calling *vm->GetEnv()*. Then we store this 
> information inside ThreadLocalState.
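
The idea in the two bullets above can be sketched without a real JVM. The code below is only an illustration under stated assumptions, not the attached patch: {{mock_vm}}, {{mock_get_env}}, {{mock_attach}}, {{mock_detach}}, {{thread_state}} and the {{attached_by_native}} flag are hypothetical stand-ins for the JavaVM, GetEnv(), AttachCurrentThread(), DetachCurrentThread() and ThreadLocalState. Only the return values 0 and -2 mirror the real JNI_OK and JNI_EDETACHED constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Same numeric values as JNI_OK and JNI_EDETACHED in jni.h. */
enum { MOCK_OK = 0, MOCK_EDETACHED = -2 };

/* Hypothetical stand-in for the JavaVM as seen from one thread. */
struct mock_vm {
    bool thread_attached; /* true if the thread is already known to the VM */
    int detach_calls;     /* how many times the detach stand-in was called */
};

/* Stand-in for vm->GetEnv(): succeeds only if the thread is attached. */
static int mock_get_env(const struct mock_vm *vm) {
    return vm->thread_attached ? MOCK_OK : MOCK_EDETACHED;
}

/* Stand-in for AttachCurrentThread(). */
static void mock_attach(struct mock_vm *vm) {
    vm->thread_attached = true;
}

/* Stand-in for DetachCurrentThread(). */
static void mock_detach(struct mock_vm *vm) {
    vm->thread_attached = false;
    vm->detach_calls++;
}

/* Hypothetical per-thread state, playing the role of ThreadLocalState. */
struct thread_state {
    struct mock_vm *vm;
    bool attached_by_native; /* true only if native code attached the thread */
};

/* Plays the role of getGlobalJNIEnv(): record who owns the attachment. */
static struct thread_state *thread_state_create(struct mock_vm *vm) {
    struct thread_state *st = malloc(sizeof *st);
    assert(st != NULL);
    st->vm = vm;
    if (mock_get_env(vm) == MOCK_OK) {
        /* Already attached: the thread was created by Java. */
        st->attached_by_native = false;
    } else {
        /* Not attached: a C/C++ thread, so attach it and remember that. */
        mock_attach(vm);
        st->attached_by_native = true;
    }
    return st;
}

/* Plays the role of hdfsThreadDestructor(): detach only what we attached. */
static void thread_state_destroy(struct thread_state *st) {
    if (st->attached_by_native) {
        mock_detach(st->vm);
    }
    /* For a Java-created thread no VM call is made at all. */
    free(st);
}
```

In this sketch a thread that was already attached when its state was created never reaches the detach call, which is the path that touched freed memory in the report above.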



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
