[
https://issues.apache.org/jira/browse/MAPREDUCE-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559111#comment-14559111
]
Tony Reix commented on MAPREDUCE-6346:
--------------------------------------
Hi Binglin, thanks for answering!
About big-endian vs. little-endian: I'm now using RHEL 7.1 on PPC64LE, which
is little-endian, so endianness should not be the issue.
About unaligned memory access: BLOCK_SIZE on PPC64LE is not 4096, so that may
have an impact. Or it may be something else.
Yes, it looks like something is corrupted at creation time and only causes the
crash later, during reading and merging, long after the corruption occurred.
Thanks for the suggestions for finding the root cause: that is what I need,
since the code is too complicated for me.
Since I have very little understanding of this code, I'm interested in
detailed, precise suggestions and instructions about what to trace and where in
the code. Would you mind pointing out where it would be useful to add trace
statements, so that you or other experts on this code could more easily locate
the issue?
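To make the tracing request concrete, here is a minimal sketch of the kind of
check I have in mind around the crashing frame. It decodes one value in the
Hadoop variable-length long format (the format that ReadVLongInner parses),
but refuses to read past the end of the buffer and prints a trace line
instead of crashing. This is not the nativetask code: the name
checkedReadVLong, its parameters and the trace messages are hypothetical, and
it assumes the caller knows how many bytes remain in the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper, NOT part of libnativetask.
// Decodes one Hadoop-style variable-length long from buf without reading
// more than `avail` bytes. On success, stores the decoded value and the
// number of bytes consumed, and returns true. On failure, prints a trace
// line to stderr and returns false.
bool checkedReadVLong(const char* buf, size_t avail, int64_t& value, size_t& len) {
  if (avail == 0) {
    fprintf(stderr, "vlong trace: empty buffer at %p\n",
            static_cast<const void*>(buf));
    return false;
  }
  // int8_t makes the sign explicit: whether plain char is signed or
  // unsigned is implementation-defined in C++.
  const int8_t first = static_cast<int8_t>(buf[0]);
  if (first >= -112) {
    // Single-byte encoding: the byte is the value itself.
    value = first;
    len = 1;
    return true;
  }
  // Multi-byte encoding: the first byte carries the sign and total length.
  const bool negative = first < -120;
  len = static_cast<size_t>(negative ? -119 - first : -111 - first);
  if (len > avail) {
    fprintf(stderr, "vlong trace: need %zu bytes, only %zu left at %p\n",
            len, avail, static_cast<const void*>(buf));
    return false;
  }
  // The remaining len - 1 bytes hold the magnitude, most significant first.
  uint64_t acc = 0;
  for (size_t i = 1; i < len; i++) {
    acc = (acc << 8) | static_cast<uint8_t>(buf[i]);
  }
  // Negative values are stored as the one's complement of the magnitude.
  value = static_cast<int64_t>(negative ? ~acc : acc);
  return true;
}
```

Calling something like this at the point where the spill file is read back
would show the buffer address and the remaining length for the first record
that does not decode cleanly, which may help to tell whether the data was
already bad when it was written.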
> mapred.nativetask.kvtest.KVTest crashes on PPC64LE
> --------------------------------------------------
>
> Key: MAPREDUCE-6346
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6346
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0
> Environment: RHEL 7.1 - PPC64 LE - OpenJDK
> rhel-2.5.5.1.ael7b_1-ppc64le u79-b14
> Reporter: Tony Reix
> Attachments: TR
>
>
> Test org.apache.hadoop.mapred.nativetask.kvtest.KVTest (and 5 or 6 other
> tests) crashes on PPC64LE .
> ....
> 15/04/28 10:46:06 INFO Mid-spill: { id: 4, collect: 245 ms, in-memory sort:
> 32 ms, in-memory records: 48202, merge&spill: 80 ms, uncompressed size:
> 5031451, real size: 3739319 path:
> /tmp/hadoop-reixt/mapred/local/localRunner/reixt/jobcache/job_local408221154_0008/attempt_local408221154_0008_m_000000_0/output/spill4.out
> }
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00003fff6c7d8e50, pid=945, tid=70366264881616
> #
> # JRE version: OpenJDK Runtime Environment (7.0_79-b14) (build
> 1.7.0_79-mockbuild_2015_04_10_10_48-b00)
> # Java VM: OpenJDK 64-Bit Server VM (24.79-b02 mixed mode linux-ppc64
> compressed oops)
> # Derivative: IcedTea 2.5.5
> # Distribution: Built on Red Hat Enterprise Linux Server release 7.1 (Maipo)
> (Fri Apr 10 10:48:01 EDT 2015)
> # Problematic frame:
> # C [libnativetask.so.1.0.0+0x58e50]
> NativeTask::WritableUtils::ReadVLongInner(char const*, unsigned int&)+0x40
> #
> # Core dump written. Default location:
> /home/reixt/HADOOP-2.7.0/hadoop-FromApache-Trunk-201504241115/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/core
> or core.945
> #
> # An error report file with more information is saved as:
> # /tmp/jvm-945/hs_error.log
> #
> # If you would like to submit a bug report, please include
> # instructions on how to reproduce the bug and visit:
> # http://icedtea.classpath.org/bugzilla
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> /bin/sh: line 1: 945 Aborted (core dumped)
> /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79-2.5.5.1.ael7b_1.ppc64le/jre/bin/java
> -Xmx4096m -XX:MaxPermSize=768m -XX:+HeapDumpOnOutOfMemoryError -jar
> /home/reixt/HADOOP-2.7.0/hadoop-FromApache-Trunk-201504241115/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/surefire/surefirebooter9078773752877532263.jar
>
> /home/reixt/HADOOP-2.7.0/hadoop-FromApache-Trunk-201504241115/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/surefire/surefire4138802116387705281tmp
>
> /home/reixt/HADOOP-2.7.0/hadoop-FromApache-Trunk-201504241115/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/surefire/surefire_01525011254551870798tmp
> /tmp/jvm-945/hs_error.log :
> # C [libnativetask.so.1.0.0+0x58e50]
> NativeTask::WritableUtils::ReadVLongInner(char const*, unsigned int&)+0x40