[ 
https://issues.apache.org/jira/browse/HDFS-16021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Coulon updated HDFS-16021:
---------------------------------
    Description: 
Related to HDFS-12628 HDFS-13585 HDFS-14488 HDFS-15270 

 

We have experienced crashes in libhdfs hdfsThreadDestructor() for a long time. 
The crash is almost systematic with OpenJ9 and more sporadic with the HotSpot 
JVM.

 

I finally got to the root cause of this bug thanks to AddressSanitizer. This 
is quite difficult to set up because you need to rebuild the test case, 
hadoop, and openjdk-hotspot >= 13 with specific compiler options.

 

See hdfs-asan.log for details.

 

*Analysis:*

hdfsThreadDestructor() makes several JNI calls in order to detach the thread 
from the JVM:

 
{code:java}
/* Detach the current thread from the JVM */
if (env) {
  ret = (*env)->GetJavaVM(env, &vm);
  /*
   *  More code here...
   */
}{code}
This is fine if the thread was created in the C/C++ world.

 

However, if the thread was created in the Java world, this is wrong. When a 
Java thread terminates, the JVM deallocates memory that contains (among other 
things) the thread-specific JNIEnv. hdfsThreadDestructor() is only called after 
that. At that point the *env* variable is not NULL but points to memory that 
was just released, so dereferencing it is the heap-use-after-free detected by 
ASan.
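
The ordering problem can be reproduced without a JVM. The following is only a 
minimal sketch using plain pthreads: struct fake_env, tls_destructor() and 
run_ordering_demo() are hypothetical names that do not exist in libhdfs, and a 
flag stands in for the JVM freeing the JNIEnv so that the demo itself stays 
well-defined. It shows that a thread-specific destructor still receives a 
non-NULL pointer after the owner has released the object.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread JNIEnv owned by the JVM. */
struct fake_env {
    int released; /* set to 1 where a real JVM would free the memory */
};

static pthread_key_t env_key;
static pthread_once_t env_key_once = PTHREAD_ONCE_INIT;
static int destructor_saw_released = -1; /* -1: destructor has not run yet */

/* Runs at thread exit, like hdfsThreadDestructor(). The pointer it receives
 * is non-NULL even though the owner has already released the object. */
static void tls_destructor(void *value)
{
    struct fake_env *env = value;
    destructor_saw_released = env->released;
}

static void make_env_key(void)
{
    int rc = pthread_key_create(&env_key, tls_destructor);
    assert(rc == 0);
    (void)rc;
}

static void *ordering_thread_main(void *arg)
{
    struct fake_env *env = arg;
    pthread_once(&env_key_once, make_env_key);
    pthread_setspecific(env_key, env);
    /* The "JVM" tears down its per-thread data before the thread exits.
     * A flag is used instead of free() to avoid a real use-after-free. */
    env->released = 1;
    return NULL;
}

/* Returns 1 if the destructor observed an already-released env. */
static int run_ordering_demo(void)
{
    struct fake_env env = { 0 };
    pthread_t t;
    int rc = pthread_create(&t, NULL, ordering_thread_main, &env);
    assert(rc == 0);
    (void)rc;
    pthread_join(t, NULL);
    return destructor_saw_released;
}
```

With a real JNIEnv the destructor has no such flag to inspect, which is why a 
NULL check on *env* alone cannot protect it.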

 

I have been working on a patch that fixes the issue (see attachment).

 

Here is the idea:
 * In hdfsThreadDestructor(), we need to know whether the thread was created by 
Java or C/C++. If it was created by C/C++, we should make JNI calls in order to 
detach the current thread. If it was created by Java, we don't need to make any 
JNI call: the thread is already detached.
 * In getGlobalJNIEnv(), we can detect whether the thread was created by Java 
or C/C++ by calling *vm->GetEnv()*. We then store this information inside 
ThreadLocalState.
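
The two points above can be sketched as follows. This is a simulation under 
stated assumptions, not the attached patch: it avoids jni.h entirely, so 
fake_get_env(), fake_detach(), the created_by_java field and the simplified 
struct thread_local_state are all hypothetical. The return codes mirror the 
JNI values JNI_OK (0), which GetEnv() returns when the current thread is 
already attached to the JVM, and JNI_EDETACHED (-2), which it returns 
otherwise.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define FAKE_JNI_OK          0   /* mirrors JNI_OK */
#define FAKE_JNI_EDETACHED (-2)  /* mirrors JNI_EDETACHED */

/* Hypothetical, simplified ThreadLocalState. */
struct thread_local_state {
    void *env;           /* stand-in for JNIEnv* */
    int created_by_java; /* hypothetical flag recorded at first use */
};

static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;
static int detach_calls; /* counts simulated detach sequences */

/* Stand-in for vm->GetEnv(). The attach that real code would perform after
 * EDETACHED is folded in here, so env_out is always set. */
static int fake_get_env(int already_attached, void **env_out)
{
    static int dummy_env;
    *env_out = &dummy_env;
    return already_attached ? FAKE_JNI_OK : FAKE_JNI_EDETACHED;
}

/* Stand-in for the GetJavaVM() + DetachCurrentThread() sequence. */
static void fake_detach(void *env) { (void)env; detach_calls++; }

/* Destructor: only threads that were attached from C/C++ touch env. */
static void state_destructor(void *value)
{
    struct thread_local_state *state = value;
    if (!state->created_by_java && state->env != NULL) {
        fake_detach(state->env);
    }
    free(state);
}

static void make_state_key(void)
{
    int rc = pthread_key_create(&state_key, state_destructor);
    assert(rc == 0);
    (void)rc;
}

/* Plays the role of getGlobalJNIEnv(): records the thread's origin. */
static void *origin_thread_main(void *arg)
{
    int already_attached = *(int *)arg;
    struct thread_local_state *state = calloc(1, sizeof(*state));
    assert(state != NULL);
    state->created_by_java =
        (fake_get_env(already_attached, &state->env) == FAKE_JNI_OK);
    pthread_once(&state_key_once, make_state_key);
    pthread_setspecific(state_key, state);
    return NULL;
}

/* Runs one thread and returns how many detach sequences its exit caused. */
static int detach_calls_for_thread(int already_attached)
{
    int before = detach_calls;
    pthread_t t;
    int rc = pthread_create(&t, NULL, origin_thread_main, &already_attached);
    assert(rc == 0);
    (void)rc;
    pthread_join(t, NULL);
    return detach_calls - before;
}
```

In this sketch, detach_calls_for_thread(1) models a Java-created thread and 
triggers no detach, while detach_calls_for_thread(0) models a C/C++ thread and 
triggers exactly one.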



> heap-use-after-free in hdfsThreadDestructor
> -------------------------------------------
>
>                 Key: HDFS-16021
>                 URL: https://issues.apache.org/jira/browse/HDFS-16021
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.9.1, 2.9.2, 3.3.0
>            Reporter: Jeremy Coulon
>            Priority: Major
>         Attachments: fix-hdfsThreadDestructor.patch, hdfs-asan.log


