[ 
https://issues.apache.org/jira/browse/FLINK-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262584#comment-16262584
 ] 

Andrey commented on FLINK-5465:
-------------------------------

We were able to reproduce this issue on Flink 1.3.2. This is very critical bug, 
because failing job could bring whole flink cluster down. 

Failing job amplifying chances for rocksdb crash by constant start/cancel.

> RocksDB fails with segfault while calling AbstractRocksDBState.clear()
> ----------------------------------------------------------------------
>
>                 Key: FLINK-5465
>                 URL: https://issues.apache.org/jira/browse/FLINK-5465
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0
>            Reporter: Robert Metzger
>         Attachments: hs-err-pid26662.log
>
>
> I'm using Flink 699f4b0.
> {code}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f91a0d49b78, pid=26662, tid=140263356024576
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 
> 1.7.0_67-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni-linux64.so+0x1aeb78]  
> rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # 
> /yarn/nm/usercache/robert/appcache/application_1484132267957_0007/container_1484132267957_0007_01_000010/hs_err_pid26662.log
> Compiled method (nm) 1869778  903     n       org.rocksdb.RocksDB::remove 
> (native)
>  total in heap  [0x00007f91b40b9dd0,0x00007f91b40ba150] = 896
>  relocation     [0x00007f91b40b9ef0,0x00007f91b40b9f48] = 88
>  main code      [0x00007f91b40b9f60,0x00007f91b40ba150] = 496
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to