[ https://issues.apache.org/jira/browse/HADOOP-12941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-12941.
-------------------------------------
    Resolution: Won't Fix

There is no IA64 any more, sorry

> abort in Unsafe_GetLong when running IA64 HPUX 64bit mode 
> ----------------------------------------------------------
>
>                 Key: HADOOP-12941
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12941
>             Project: Hadoop Common
>          Issue Type: Bug
>         Environment: hpux IA64  running 64bit mode 
>            Reporter: gene bradley
>            Priority: Major
>
> Now that we have a core to look at, we can sort of see what is going on:
>
> #14 0x9fffffffaf000dd0 in Java native_call_stub frame
> #15 0x9fffffffaf014470 in JNI frame: sun.misc.Unsafe::getLong (java.lang.Object, long) ->long
> #16 0x9fffffffaf0067a0 in interpreted frame: org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo (byte[], int, int, byte[], int, int) ->int bci: 74
> #17 0x9fffffffaf0066e0 in interpreted frame: org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo (java.lang.Object, int, int, java.lang.Object, int, int) ->int bci: 16
> #18 0x9fffffffaf006720 in interpreted frame: org.apache.hadoop.hbase.util.Bytes::compareTo (byte[], int, int, byte[], int, int) ->int bci: 11
> #19 0x9fffffffaf0066e0 in interpreted frame: org.apache.hadoop.hbase.KeyValue$KVComparator::compareRowKey (org.apache.hadoop.hbase.Cell, org.apache.hadoop.hbase.Cell) ->int bci: 36
> #20 0x9fffffffaf0066e0 in interpreted frame: org.apache.hadoop.hbase.KeyValue$KVComparator::compare (org.apache.hadoop.hbase.Cell, org.apache.hadoop.hbase.Cell) ->int bci: 3
> #21 0x9fffffffaf0066e0 in interpreted frame: org.apache.hadoop.hbase.KeyValue$KVComparator::compare (java.lang.Object, java.lang.Object) ->int bci: 9
>
> ;; Line: 400
> 0xc00000003ad84d30:0 <Unsafe_GetLong+0x130>:    (p1)  ld8              r45=[r34]
> 0xc00000003ad84d30:1 <Unsafe_GetLong+0x131>:          adds             r34=16,r32
> 0xc00000003ad84d30:2 <Unsafe_GetLong+0x132>:          adds             ret0=8,r32;;
> 0xc00000003ad84d40:0 <Unsafe_GetLong+0x140>:          add              ret1=r35,r45 <==== r35 is off
> 0xc00000003ad84d40:1 <Unsafe_GetLong+0x141>:          ld8              r35=[r34],24
> 0xc00000003ad84d40:2 <Unsafe_GetLong+0x142>:          nop.i            0x0
> 0xc00000003ad84d50:0 <Unsafe_GetLong+0x150>:          ld8              r41=[ret0];;
> 0xc00000003ad84d50:1 <Unsafe_GetLong+0x151>:          ld8.s            r49=[r34],-24
> 0xc00000003ad84d50:2 <Unsafe_GetLong+0x152>:          nop.i            0x0
> 0xc00000003ad84d60:0 <Unsafe_GetLong+0x160>:          ld8              r39=[ret1];; <=== abort
> 0xc00000003ad84d60:1 <Unsafe_GetLong+0x161>:          ld8              ret0=[r35]
> 0xc00000003ad84d60:2 <Unsafe_GetLong+0x162>:          nop.i            0x0;;
> 0xc00000003ad84d70:0 <Unsafe_GetLong+0x170>:          cmp.ne.unc       p1=r0,ret0;;      M,MI
> 0xc00000003ad84d70:1 <Unsafe_GetLong+0x171>:    (p1)  mov              r48=r41
> 0xc00000003ad84d70:2 <Unsafe_GetLong+0x172>:    (p1)  chk.s.i          r49,Unsafe_GetLong+0x290
>
> (gdb) x /10i $pc-48*2
> 0x9fffffffaf000d70:           flushrs                                  MMI
> 0x9fffffffaf000d71:           mov              r44=r32
> 0x9fffffffaf000d72:           mov              r45=r33
> 0x9fffffffaf000d80:           mov              r46=r34                 MMI
> 0x9fffffffaf000d81:           mov              r47=r35
> 0x9fffffffaf000d82:           mov              r48=r36
> 0x9fffffffaf000d90:           mov              r49=r37                 MMI
> 0x9fffffffaf000d91:           mov              r50=r38
> 0x9fffffffaf000d92:           mov              r51=r39
> 0x9fffffffaf000da0:           adds             r14=0x270,r4            MMI
> (gdb) p /x $r35
> $9 = 0x22
> (gdb) x /x $ret1
> 0x9ffffffe1d0d2bda:     0x677a68676c78743a
> (gdb) x /x $r45+0x22
> 0x9ffffffe1d0d2bda:     0x677a68676c78743a
>
> So here is the problem. This is a 64bit JVM:
>
>  0 : /opt/java8/bin/IA64W/java
>  1 : -Djava.util.logging.config.file=/test28/gzh/tomcat/conf/logging.properties
>  2 : -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
>  3 : -Dorg.apache.catalina.security.SecurityListener.UMASK=022
>  4 : -server
>  5 : -XX:PermSize=128m
>  6 : -XX:MaxPermSize=256m
>  7 : -Djava.endorsed.dirs=/test28/gzh/tomcat/endorsed
>  8 : -classpath
>  9 : /test28/gzh/tomcat/bin/bootstrap.jar:/test28/gzh/tomcat/bin/tomcat-juli.jar
> 10 : -Dcatalina.base=/test28/gzh/tomcat
> 11 : -Dcatalina.home=/test28/gzh/tomcat
> 12 : -Djava.io.tmpdir=/test28/gzh/tomcat/temp
> 13 : org.apache.catalina.startup.Bootstrap
> 14 : start
>
> Since they are not passing any -Xmx values, we are taking defaults, which 
> look at the system resources. So what is happening here is that a 32 bit word 
> aligned address is being used to index into a byte array:
>
> (gdb) jo 0x9ffffffe1d0d2bb8
> _mark = 0x0000000000000001, _klass = 0x9fffffffa8c00768, instance of type [B
> length of the array: 118
> 0 0 0 102 0 0 0 8 0 70 103 122 104 103 108 120 116 58 70 83 78 
> 95 50 48 49 53 49 48 50 50 44 65 44 49 52 52 53 52 55 57 57 51 51 57 53 56 46 
> 52 56 54 55 50 48 51 49 99 57 97 101 52 57 101 97 101 49 100 56 49 51 53 51 
> 99 99 97 97 54 98 56 100 46 4 105 110 102 111 115 101 113 110 117 109 68 117 
> 114 105 110 103 79 112 101 110 0 0 1 80 -6 96 -95 -48 4 0 0 0 0 0 0 0 4
>
> This is the whole string:
>
> (gdb) x /2s 0x9ffffffe1d0d2bd8
> 0x9ffffffe1d0d2bd8:      ""
> 0x9ffffffe1d0d2bd9:      "Fgzhglxt:FSN_20151022,A,1445479933958.48672031c9ae49eae1d81353ccaa6b8d.\004infoseqnumDuringOpen"
>
> To me this is a bug in the caller, potentially in 
> org.apache.hadoop.hbase.util.Bytes$LexicographicalComparerHolder$UnsafeComparer::compareTo.
> Why are they calling Unsafe_GetLong on a byte array? There is no checking of 
> alignment, and I really think this is a bug on their part. As far as I know, 
> GetLong expects 64 bit alignment. I did find some other 64 bit users who saw 
> this with the same stack trace as this customer:
> https://issues.apache.org/jira/browse/PHOENIX-1438
> http://permalink.gmane.org/gmane.comp.java.hadoop.hbase.devel/39017
> The fix would go here, by adding a test for ia64. Looking at the code from 
> an earlier bug, they are already checking whether the box is sparc:
> static Comparer<byte[]> getBestComparer() {
> +      if (System.getProperty("os.arch").equals("sparc")) {  <====
> +        if (LOG.isTraceEnabled()) {
> +          LOG.trace("Lexicographical comparer selected for "
> +              + "byte aligned system architecture");
> +        }
> +        return lexicographicalComparerJavaImpl();
> +      }
>        try {
>          Class<?> theClass = Class.forName(UNSAFE_COMPARER_NAME);
>
> So this is 'fixable' from a Java class perspective. Hari said he will talk 
> with his open source contact.
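
For illustration only, here is a minimal sketch of how such an architecture test could be written. Instead of adding one more name next to "sparc", it inverts the check and enables the Unsafe path only on an allow-list of x86-family `os.arch` values. The class name, the method names and the list itself are hypothetical and are not taken from the Hadoop or HBase sources; the exact `os.arch` string reported by the HP-UX JVM is not assumed here, which is one reason to prefer an allow-list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Hadoop or HBase.
class ComparerSelection {
    // Assumption: os.arch values commonly reported on x86-family JVMs.
    private static final Set<String> UNALIGNED_OK =
        new HashSet<>(Arrays.asList("x86", "i386", "amd64", "x86_64"));

    // True only for architectures on the allow-list above.
    static boolean unalignedAccessAssumedSafe(String osArch) {
        return osArch != null
            && UNALIGNED_OK.contains(osArch.toLowerCase(Locale.ROOT));
    }

    // Returns a label for the comparer that would be chosen.
    static String chooseComparer(String osArch) {
        return unalignedAccessAssumedSafe(osArch) ? "unsafe" : "pure-java";
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        System.out.println(arch + " -> " + chooseComparer(arch));
    }
}
```

With this shape, any architecture that is not explicitly listed (sparc, ia64, or anything else) falls back to the pure-Java comparer without needing its own special case.
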
> This Hadoop bug report points to the same problem in the same code:
> https://issues.apache.org/jira/browse/HADOOP-11466
> In that case the symptom of the unaligned accesses was bad performance 
> instead of a crash. This shows diffs for that fix:
> http://mail-archives.apache.org/mod_mbox/hadoop-common-commits/201501.mbox/%3cb19d5f83ca7148b782e5b432817b6...@git.apache.org%3E
> Those diffs show that the fix only avoids the bad code when running on "sparc". 
> They really should have instead avoided that bad code on every architecture 
> other than x86. They should not be assuming that the FastByteComparisons 
> enhancement will work on other processors and actually improve performance. 
> On processors that do allow unaligned accesses, but only at great cost, they 
> are just creating bad performance that will be hard for anyone to ever find.
> For all IA64 customers this will be an issue when running 64 bit. The IA 
> processor enforces alignment on instruction types
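
As a worked check of the alignment point, the sketch below redoes the address arithmetic from the gdb session in plain Java. The two constants are taken from the description above (the object address given to `jo`, which is also what `$r45` must hold since `$r45+0x22` equals `$ret1`, and the offset in `$r35`). The class and method names are hypothetical, and nothing here touches sun.misc.Unsafe.

```java
// Hypothetical helper that only does arithmetic on the addresses from the report.
class AlignmentCheck {
    // True when addr is a multiple of width (width must be a power of two).
    static boolean isAligned(long addr, int width) {
        return (addr & (width - 1)) == 0;
    }

    public static void main(String[] args) {
        long arrayOop = 0x9ffffffe1d0d2bb8L; // byte[] object shown by "jo"
        long offset = 0x22L;                 // $r35 in the gdb session
        long addr = arrayOop + offset;       // address the faulting ld8 reads from

        System.out.println(Long.toHexString(addr)); // 9ffffffe1d0d2bda
        System.out.println(isAligned(arrayOop, 8)); // true
        System.out.println(isAligned(addr, 8));     // false
    }
}
```

The array object itself sits on an 8-byte boundary, but the offset 0x22 moves the 8-byte load to an address whose low three bits are 010, which matches the abort on the `ld8 r39=[ret1]` instruction.
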



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
