Hi! It looks like you are running the RocksDB state backend 1.1 (is still an old version packaged into your JAR file?)
This line indicates that: org.apache.flink.contrib.streaming.state. RocksDBStateBackend.performSemiAsyncSnapshot (the method does not exist in 1.2 any more) Can you try and run 1.2 and see if that still occurs? In general, I cannot vouch 100% for RocksDBs JNI API, but in our 1.2 tests so far it was stable. Stephan On Wed, Mar 22, 2017 at 3:13 PM, Florian König <florian.koe...@micardo.com> wrote: > Hi, > > I have experienced two crashes of Flink 1.2 by SIGSEGV, caused by > something in RocksDB. What is the preferred way to report them? All I got > at the moment are two hs_err_pid12345.log files. They are over 4000 lines > long each. Is there anything significant that I should extract to help you > guys and/or put into a JIRA ticket? > > The first thing that came to my mind was the stack traces (see below). > Anything else? > > Thanks > Florian > > ---- > > Stack: [0x00007fec04341000,0x00007fec04442000], sp=0x00007fec0443ff48, > free space=1019k > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) > J 10252 org.rocksdb.RocksDB.get(J[BIJ)[B (0 bytes) @ 0x00007fec925887cc > [0x00007fec92588780+0x4c] > J 27241 C2 org.apache.flink.contrib.streaming.state. > RocksDBValueState.value()Ljava/lang/Object; (78 bytes) @ > 0x00007fec94010ca4 [0x00007fec940109c0+0x2e4] > j com.micardo.backend.TransitionProcessor$2.getValue()Ljava/lang/Long;+7 > j com.micardo.backend.TransitionProcessor$2. > getValue()Ljava/lang/Object;+1 > J 38483 C2 org.apache.flink.runtime.metrics.dump.MetricDumpSerialization. > serializeGauge(Ljava/io/DataOutput;Lorg/apache/flink/runtime/metrics/dump/ > QueryScopeInfo;Ljava/lang/String;Lorg/apache/flink/metrics/Gauge;)V (114 > bytes) @ 0x00007fec918eabf0 [0x00007fec918eabc0+0x30] > J 38522 C2 org.apache.flink.runtime.metrics.dump.MetricDumpSerialization$ > MetricDumpSerializer.serialize(Ljava/util/Map; > Ljava/util/Map;Ljava/util/Map;Ljava/util/Map;)Lorg/apache/ > flink/runtime/metrics/dump/MetricDumpSerialization$MetricSerializationResult; > (471 bytes) @ 0x00007fec94eb6260 [0x00007fec94eb57a0+0xac0] > J 47531 C2 org.apache.flink.runtime.metrics.dump. > MetricQueryService.onReceive(Ljava/lang/Object;)V (453 bytes) @ > 0x00007fec95ca57a0 [0x00007fec95ca4da0+0xa00] > J 5815 C2 akka.actor.UntypedActor.aroundReceive(Lscala/ > PartialFunction;Ljava/lang/Object;)V (7 bytes) @ 0x00007fec91e3ae6c > [0x00007fec91e3adc0+0xac] > J 5410 C2 akka.actor.ActorCell.invoke(Lakka/dispatch/Envelope;)V (104 > bytes) @ 0x00007fec91d5bc44 [0x00007fec91d5b9a0+0x2a4] > J 6628 C2 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue. > runTask(Lscala/concurrent/forkjoin/ForkJoinTask;)V (60 bytes) @ > 0x00007fec9212d050 [0x00007fec9212ccc0+0x390] > J 40130 C2 scala.concurrent.forkjoin.ForkJoinWorkerThread.run()V (182 > bytes) @ 0x00007fec923f8170 [0x00007fec923f7fc0+0x1b0] > v ~StubRoutines::call_stub > > -------------------------------------- > > Stack: [0x00007f167a5b7000,0x00007f167a6b8000], sp=0x00007f167a6b5f40, > free space=1019k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > C [libstdc++.so.6+0xc026b] std::basic_string<char, > std::char_traits<char>, std::allocator<char> >::basic_string(std::string > const&)+0xb > C [librocksdbjni8426686507832168508.so+0x2f14ca] > rocksdb::BackupEngine::Open(rocksdb::Env*, rocksdb::BackupableDBOptions > const&, rocksdb::BackupEngine**)+0x3a > C [librocksdbjni8426686507832168508.so+0x180ad5] > Java_org_rocksdb_BackupEngine_open+0x25 > J 50030 org.rocksdb.BackupEngine.open(JJ)J (0 bytes) @ > 0x00007f16cb79aa36 [0x00007f16cb79a980+0xb6] > J 49809 C1 org.apache.flink.contrib.streaming.state.RocksDBStateBackend. > performSemiAsyncSnapshot(JJ)Ljava/util/HashMap; (416 bytes) @ > 0x00007f16cd2733d4 [0x00007f16cd2719a0+0x1a34] > J 51766 C2 org.apache.flink.contrib.streaming.state.RocksDBStateBackend. > snapshotPartitionedState(JJ)Ljava/util/HashMap; (40 bytes) @ > 0x00007f16cb40a1fc [0x00007f16cb40a1a0+0x5c] > J 50547 C2 org.apache.flink.streaming.api.operators. > AbstractUdfStreamOperator.snapshotOperatorState(JJ)Lorg/ > apache/flink/streaming/runtime/tasks/StreamTaskState; (206 bytes) @ > 0x00007f16cb8be89c [0x00007f16cb8be7e0+0xbc] > J 52232 C2 > org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(JJ)Z > (650 bytes) @ 0x00007f16cbfbbf60 [0x00007f16cbfbb540+0xa20] > J 52419 C2 org.apache.flink.streaming.runtime.io.BarrierBuffer. > notifyCheckpoint(Lorg/apache/flink/runtime/io/network/api/CheckpointBarrier;)V > (25 bytes) @ 0x00007f16cbdd2624 [0x00007f16cbdd25c0+0x64] > J 41649 C2 org.apache.flink.streaming.runtime.io.StreamInputProcessor. > processInput(Lorg/apache/flink/streaming/api/operators/ > OneInputStreamOperator;Ljava/lang/Object;)Z (439 bytes) @ > 0x00007f16cc8aed5c [0x00007f16cc8add40+0x101c] > J 33374% C2 org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run()V > (42 bytes) @ 0x00007f16cbdc21d0 [0x00007f16cbdc20c0+0x110] > j org.apache.flink.streaming.runtime.tasks.StreamTask.invoke()V+301 > j org.apache.flink.runtime.taskmanager.Task.run()V+700 > j java.lang.Thread.run()V+11 > v ~StubRoutines::call_stub > V [libjvm.so+0x68c616] > V [libjvm.so+0x68cb21] > V [libjvm.so+0x68cfc7] > V [libjvm.so+0x723d80] > V [libjvm.so+0xa69dcf] > V [libjvm.so+0xa69efc] > V [libjvm.so+0x91d9d8] > C [libpthread.so.0+0x8064] start_thread+0xc4 > > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) > J 50030 org.rocksdb.BackupEngine.open(JJ)J (0 bytes) @ > 0x00007f16cb79a9c4 [0x00007f16cb79a980+0x44] > J 49809 C1 org.apache.flink.contrib.streaming.state.RocksDBStateBackend. > performSemiAsyncSnapshot(JJ)Ljava/util/HashMap; (416 bytes) @ > 0x00007f16cd2733d4 [0x00007f16cd2719a0+0x1a34] > J 51766 C2 org.apache.flink.contrib.streaming.state.RocksDBStateBackend. > snapshotPartitionedState(JJ)Ljava/util/HashMap; (40 bytes) @ > 0x00007f16cb40a1fc [0x00007f16cb40a1a0+0x5c] > J 50547 C2 org.apache.flink.streaming.api.operators. > AbstractUdfStreamOperator.snapshotOperatorState(JJ)Lorg/ > apache/flink/streaming/runtime/tasks/StreamTaskState; (206 bytes) @ > 0x00007f16cb8be89c [0x00007f16cb8be7e0+0xbc] > J 52232 C2 > org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(JJ)Z > (650 bytes) @ 0x00007f16cbfbbf60 [0x00007f16cbfbb540+0xa20] > J 52419 C2 org.apache.flink.streaming.runtime.io.BarrierBuffer. > notifyCheckpoint(Lorg/apache/flink/runtime/io/network/api/CheckpointBarrier;)V > (25 bytes) @ 0x00007f16cbdd2624 [0x00007f16cbdd25c0+0x64] > J 41649 C2 org.apache.flink.streaming.runtime.io.StreamInputProcessor. > processInput(Lorg/apache/flink/streaming/api/operators/ > OneInputStreamOperator;Ljava/lang/Object;)Z (439 bytes) @ > 0x00007f16cc8aed5c [0x00007f16cc8add40+0x101c] > J 33374% C2 org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run()V > (42 bytes) @ 0x00007f16cbdc21d0 [0x00007f16cbdc20c0+0x110] > j org.apache.flink.streaming.runtime.tasks.StreamTask.invoke()V+301 > j org.apache.flink.runtime.taskmanager.Task.run()V+700 > j java.lang.Thread.run()V+11 > v ~StubRoutines::call_stub > > > >