[
https://issues.apache.org/jira/browse/HDFS-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193998#comment-16193998
]
Kai Zheng commented on HDFS-12606:
----------------------------------
Thanks for the ping, Eddy. By design, we can have multiple coder instances for
concurrent coding tasks, and no global static variable should prevent this,
barring bugs. We guard the ISA-L calls in Java rather than relying on ISA-L's
thread model. We can investigate this when we are back in the office next Monday.
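
As a rough illustration of what guarding the native calls in Java can look
like, here is a minimal sketch. It is not the actual Hadoop coder code:
GuardedCoder, nativeHandle and releaseCount are hypothetical names, and a
pure-Java XOR parity stands in for the native ISA-L call. It assumes the only
shared state is a per-instance native handle, so each coder instance
serializes its own encode() and release() calls and frees the handle at most
once.

```java
import java.io.IOException;

/** Hypothetical stand-in for a coder that owns a native resource. */
public class GuardedCoder {
    // Simulates a native handle; 0 means "already released".
    private long nativeHandle = 1L;
    // Counts how many times the simulated free actually ran.
    private int releaseCount = 0;

    /** XOR parity as a pure-Java placeholder for the native encode call. */
    public synchronized byte[] encode(byte[][] dataUnits) throws IOException {
        if (nativeHandle == 0) {
            // Fail in Java instead of touching freed native memory.
            throw new IOException("coder already released");
        }
        byte[] parity = new byte[dataUnits[0].length];
        for (byte[] unit : dataUnits) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= unit[i];
            }
        }
        return parity;
    }

    /** Idempotent release: the simulated free happens at most once. */
    public synchronized void release() {
        if (nativeHandle != 0) {
            nativeHandle = 0;
            releaseCount++;
        }
    }

    public synchronized int getReleaseCount() {
        return releaseCount;
    }
}
```

Because the lock is per instance, separate coder instances can still run
concurrently; only calls on the same instance are serialized. A crash like the
one quoted below could then only come from state shared across instances or
from a path that bypasses the guard, which is an assumption of this sketch
rather than a finding.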
> JVM crashes when running NNBench on EC enabled.
> ------------------------------------------------
>
> Key: HDFS-12606
> URL: https://issues.apache.org/jira/browse/HDFS-12606
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: erasure-coding
> Affects Versions: 3.0.0-beta1
> Reporter: Lei (Eddy) Xu
> Priority: Critical
>
> When running NNBench on an RS(6,3) directory, the JVM crashes with a
> "double free or corruption" error:
> {code}
> 08:16:29 Running NNBENCH.
> 08:16:29 WARNING: Use "yarn jar" to launch YARN applications.
> 08:16:31 NameNode Benchmark 0.4
> 08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Test Inputs:
> 08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Test Operation: create_write
> 08:16:31 17/10/04 08:16:31 INFO hdfs.NNBench: Start time: 2017-10-04
> 08:18:31,16
> :
> :
> 08:18:54 *** Error in `/usr/java/jdk1.8.0_144/bin/java': double free or
> corruption (out): 0x00007ffb55dbfab0 ***
> 08:18:54 ======= Backtrace: =========
> 08:18:54 /lib64/libc.so.6(+0x7c619)[0x7ffb5b85f619]
> 08:18:54 [0x7ffb45017774]
> 08:18:54 ======= Memory map: ========
> 08:18:54 00400000-00401000 r-xp 00000000 ca:01 276832134
> /usr/java/jdk1.8.0_144/bin/java
> 08:18:54 00600000-00601000 rw-p 00000000 ca:01 276832134
> /usr/java/jdk1.8.0_144/bin/java
> 08:18:54 0173e000-01f91000 rw-p 00000000 00:00 0 [heap]
> 08:18:54 603600000-614700000 rw-p 00000000 00:00 0
> 08:18:54 614700000-72bd00000 ---p 00000000 00:00 0
> 08:18:54 72bd00000-73a500000 rw-p 00000000 00:00 0
> 08:18:54 73a500000-7c0000000 ---p 00000000 00:00 0
> 08:18:54 7c0000000-7c0400000 rw-p 00000000 00:00 0
> 08:18:54 7c0400000-800000000 ---p 00000000 00:00 0
> 08:18:54 7ffb20174000-7ffb208ab000 rw-p 00000000 00:00 0
> 08:18:54 7ffb208ab000-7ffb20975000 ---p 00000000 00:00 0
> 08:18:54 7ffb20975000-7ffb20b75000 rw-p 00000000 00:00 0
> 08:18:54 7ffb20b75000-7ffb20d75000 rw-p 00000000 00:00 0
> 08:18:54 7ffb20d75000-7ffb20d8a000 r-xp 00000000 ca:01 209866
> /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20d8a000-7ffb20f89000 ---p 00015000 ca:01 209866
> /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20f89000-7ffb20f8a000 r--p 00014000 ca:01 209866
> /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20f8a000-7ffb20f8b000 rw-p 00015000 ca:01 209866
> /usr/lib64/libgcc_s-4.8.5-20150702.so.1
> 08:18:54 7ffb20f8b000-7ffb20fbd000 r-xp 00000000 ca:01 553654092
> /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
> 08:18:54 7ffb20fbd000-7ffb211bc000 ---p 00032000 ca:01 553654092
> /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
> 08:18:54 7ffb211bc000-7ffb211c2000 rw-p 00031000 ca:01 553654092
> /usr/java/jdk1.8.0_144/jre/lib/amd64/libsunec.so
> :
> :
> 08:18:54 7ffb5c3fb000-7ffb5c3fc000 r--p 00000000 00:00 0
> 08:18:54 7ffb5c3fc000-7ffb5c3fd000 rw-p 00000000 00:00 0
> 08:18:54 7ffb5c3fd000-7ffb5c3fe000 r--p 00021000 ca:01 637266
> /usr/lib64/ld-2.17.so
> 08:18:54 7ffb5c3fe000-7ffb5c3ff000 rw-p 00022000 ca:01 637266
> /usr/lib64/ld-2.17.so
> 08:18:54 7ffb5c3ff000-7ffb5c400000 rw-p 00000000 00:00 0
> 08:18:54 7ffdf8767000-7ffdf8788000 rw-p 00000000 00:00 0 [stack]
> 08:18:54 7ffdf878b000-7ffdf878d000 r-xp 00000000 00:00 0 [vdso]
> 08:18:54 ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
> {code}
> It happens on both {{jdk1.8.0_144}} and {{jdk1.8.0_121}} in our environments.
> The native code used in erasure coding is highly suspect, i.e., ISA-L is not
> thread safe:
> [https://01.org/sites/default/files/documentation/isa-l_open_src_2.10.pdf]