Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core 
dumps on Solaris 10 on SPARC.

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263
#
# JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# V  [libjvm.so+0xc52d78]  Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making increasing 
use of "unsafe" functions in Java, presumably to speed things up, and some 
CPUs are more picky than others about memory alignment.  In particular, x86 
will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and 
(understandably) Oracle has said that if you're going to use unsafe 
functions you need to understand what you're doing: 
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of Elasticsearch shows 
that the new use of "unsafe" memory access functions is in the 
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java:            if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java:            } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:                return UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:                return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:        return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:                long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:                long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:            if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:            } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:            return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:                return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM SIGBUS 
error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method accepts 
an arbitrary offset and passes it to an unsafe function with no check that 
it's a multiple of 8:

    public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
        long h1 = seed;
        long h2 = seed;

        if (length >= 16) {

            final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
            final int end = offset + len16;
            for (int i = offset; i < end; i += 16) {
                long k1 = UnsafeUtils.readLongLE(key, i);
                long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such as 
SPARC, Itanium and PowerPC that don't support unaligned 64-bit memory 
access.
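
For comparison, here is a minimal sketch of a little-endian read that does not depend on the alignment of the offset. It is only an illustration of the idea, not the actual UnsafeUtils code: the class name PortableReads and both method names are made up for this example. The first method assembles the value one byte at a time; the second goes through java.nio.ByteBuffer, which as far as I know deals with alignment itself for heap arrays.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of Elasticsearch.
final class PortableReads {
    private PortableReads() {}

    /** Reads 8 bytes starting at offset as a little-endian long, one byte at a time. */
    static long readLongLE(byte[] buf, int offset) {
        long value = 0;
        // Start with the most significant byte (offset + 7) and shift it up
        // as the lower bytes are OR-ed in; no multi-byte load is ever issued.
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (buf[offset + i] & 0xFFL);
        }
        return value;
    }

    /** Same result via an absolute ByteBuffer read in little-endian order. */
    static long readLongLEViaBuffer(byte[] buf, int offset) {
        return ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
    }

    public static void main(String[] args) {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        // Offset 1 is deliberately not a multiple of 8.
        System.out.println(Long.toHexString(readLongLE(data, 1)));
        System.out.println(Long.toHexString(readLongLEViaBuffer(data, 1)));
    }
}
```

The byte-wise loop is slower than a single 64-bit load, which is presumably why the unsafe path was introduced, but it behaves the same on every architecture.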

Does Elasticsearch have any policy for support of hardware other than x86?  
If not, I don't think many people would care but you really ought to 
clearly say so on your platform support page.  If you do intend to support 
non-x86 architectures then you need to be much more careful about the use 
of unsafe memory accesses.
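
One possible way of being more careful, sketched under my own assumptions: only take the unaligned fast path when the JVM reports an architecture that is known to tolerate it, and fall back to byte-wise reads everywhere else. The class name UnalignedPolicy, the method names and the list of architecture strings below are hypothetical choices for this example, not anything Elasticsearch does today.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical policy class for illustration; not part of Elasticsearch.
final class UnalignedPolicy {
    // Assumption: these os.arch values identify x86-family JVMs, which
    // tolerate misaligned loads. Anything else is treated as strict.
    private static final Set<String> TOLERANT =
            new HashSet<>(Arrays.asList("i386", "x86", "amd64", "x86_64"));

    private UnalignedPolicy() {}

    /** Returns true only for architectures on the allow list above. */
    static boolean allowsUnaligned(String osArch) {
        return osArch != null && TOLERANT.contains(osArch.toLowerCase(Locale.ROOT));
    }

    /** Applies the policy to the JVM this code is running on. */
    static boolean currentJvmAllowsUnaligned() {
        return allowsUnaligned(System.getProperty("os.arch"));
    }

    public static void main(String[] args) {
        System.out.println("os.arch = " + System.getProperty("os.arch"));
        System.out.println("unaligned fast path allowed: " + currentJvmAllowsUnaligned());
    }
}
```

An allow list errs on the safe side: an unknown architecture gets the slow but portable path rather than a SIGBUS.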

Regards,

David
