Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2, I'm getting JVM core dumps on Solaris 10 on SPARC.
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263
#
# JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# V  [libjvm.so+0xc52d78]  Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making increasing use of "unsafe" functions in Java, presumably to speed things up, and some CPUs are more picky than others about memory alignment. In particular, x86 will tolerate misaligned memory access whereas SPARC won't. Somebody has tried to report this to Oracle in the past and (understandably) Oracle has said that if you're going to use unsafe functions you need to understand what you're doing:

http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of Elasticsearch shows that the new uses of "unsafe" memory access functions are in the BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.
bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);
bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.
bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java: return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java: return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM SIGBUS error I'm seeing.
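For what it's worth, the effect of UnsafeUtils.readLongLE can be had without any alignment requirement by assembling the long one byte at a time; single-byte loads are always aligned. This is only a sketch of the idea (class and method names are mine, not Elasticsearch's):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ReadLongLE {

    // Alignment-safe little-endian long read: builds the value from
    // individual bytes, so it works on SPARC, Itanium and PowerPC too.
    static long readLongLE(byte[] key, int offset) {
        long v = 0;
        for (int i = 7; i >= 0; i--) {
            v = (v << 8) | (key[offset + i] & 0xFFL);
        }
        return v;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16];
        // Write a long at offset 3, which is not 8-byte aligned.
        ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).putLong(3, 0x1122334455667788L);
        System.out.println(Long.toHexString(readLongLE(buf, 3))); // prints 1122334455667788
    }
}
```

Of course, a byte-by-byte loop is slower than a single 64-bit load on x86, which is presumably why the unsafe version was introduced in the first place.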
A quick look at the MurmurHash3 class shows that the hash128 method accepts an arbitrary offset and passes it to an unsafe function with no check that it's a multiple of 8:

public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
    long h1 = seed;
    long h2 = seed;

    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
        final int end = offset + len16;
        for (int i = offset; i < end; i += 16) {
            long k1 = UnsafeUtils.readLongLE(key, i);
            long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such as SPARC, Itanium and PowerPC that don't support unaligned 64-bit memory access.

Does Elasticsearch have any policy on supporting hardware other than x86? If not, I don't think many people would care, but you really ought to say so clearly on your platform support page. If you do intend to support non-x86 architectures, you need to be much more careful about the use of unsafe memory accesses.

Regards,
David
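To make the alignment problem concrete: on a 64-bit HotSpot VM the data of a byte[] starts at some fixed offset from the object base (the real value comes from Unsafe.arrayBaseOffset(byte[].class); the 16 used below is an assumption, typical with compressed oops), so whether an 8-byte load at a caller-supplied offset is aligned depends entirely on that offset:

```java
public class AlignmentCheck {

    // Assumed start of byte[] data relative to the object base; the real
    // value comes from Unsafe.arrayBaseOffset(byte[].class).
    static final long ARRAY_BASE = 16;

    // Would an 8-byte load at this array offset be 8-byte aligned?
    static boolean isLongAligned(int offset) {
        return ((ARRAY_BASE + offset) & 7) == 0;
    }

    public static void main(String[] args) {
        // hash128(key, offset, ...) accepts arbitrary offsets, but only
        // one in eight keeps the underlying 64-bit loads aligned:
        for (int offset : new int[] {0, 1, 3, 7, 8}) {
            System.out.println("offset " + offset + " -> aligned: " + isLongAligned(offset));
        }
    }
}
```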
