[jira] [Commented] (FLINK-3948) EventTimeWindowCheckpointingITCase Fails with Core Dump
[ https://issues.apache.org/jira/browse/FLINK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316023#comment-15316023 ]

ASF GitHub Bot commented on FLINK-3948:
---

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2072

    Looks great, +1 to merge

> EventTimeWindowCheckpointingITCase Fails with Core Dump
> ---
>
> Key: FLINK-3948
> URL: https://issues.apache.org/jira/browse/FLINK-3948
> Project: Flink
> Issue Type: Bug
> Components: state backends
> Affects Versions: 1.1.0
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Critical
>
> It fails because of a core dump in RocksDB.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3948) EventTimeWindowCheckpointingITCase Fails with Core Dump
[ https://issues.apache.org/jira/browse/FLINK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315477#comment-15315477 ]

ASF GitHub Bot commented on FLINK-3948:
---

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/2072

    [FLINK-3948] Protect RocksDB cleanup by cleanup lock

    Before, cleanup could run while an asynchronous checkpoint was still in progress. Now cleanup and asynchronous checkpointing are both protected by a lock.

    This was what caused `EventTimeWindowCheckpointingITCase` to fail. I have now run it more than 100 times on Travis and have not observed a build failure related to this.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink rocksdb/fix-core-dump

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2072.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2072

commit c8456b45c47e67cc316d5bb979de36a6225eebd4
Author: Aljoscha Krettek
Date: 2016-06-04T05:59:48Z

    Revert "[FLINK-3960] ignore EventTimeWindowCheckpointingITCase for now"

    This reverts commit 98a939552e12fc699ff39111bbe877e112460ceb.

commit 13c8593ec9074aa086caf4329b21e331a1c54d58
Author: Aljoscha Krettek
Date: 2016-05-20T20:37:14Z

    [FLINK-3948] Protect RocksDB cleanup by cleanup lock

    Before, it could happen that an asynchronous checkpoint was going on when trying to do cleanup. Now we protect cleanup and asynchronous checkpointing by a lock.

> EventTimeWindowCheckpointingITCase Fails with Core Dump
> ---
>
> Key: FLINK-3948
> URL: https://issues.apache.org/jira/browse/FLINK-3948
> Project: Flink
> Issue Type: Bug
> Components: state backends
> Affects Versions: 1.1.0
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Critical
>
> It fails because of a core dump in RocksDB.
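
For readers following the thread, the idea described in the pull request (cleanup and asynchronous checkpointing guarded by one lock) can be sketched as below. This is only an illustration under stated assumptions, not the code from the patch: the class `GuardedBackend`, its methods, and the `disposed` flag are hypothetical names, and the native RocksDB calls are replaced by a counter so the sketch runs without any dependencies.

```java
/**
 * Hypothetical stand-in for a state backend that owns a native resource.
 * All names here are illustrative; this is not Flink's actual implementation.
 */
final class GuardedBackend {
    /** Single lock shared by the asynchronous snapshot path and cleanup. */
    private final Object cleanupLock = new Object();

    /** Guarded by cleanupLock: true once the (simulated) native handle is released. */
    private boolean disposed = false;

    /** Guarded by cleanupLock: number of snapshots that touched the live resource. */
    private int completedSnapshots = 0;

    /**
     * Would run on an asynchronous checkpoint thread.
     * Returns false instead of touching the resource if cleanup already happened.
     */
    boolean materializeSnapshot() {
        synchronized (cleanupLock) {
            if (disposed) {
                return false; // resource is gone; accessing it natively could crash
            }
            completedSnapshots++; // placeholder for reading from the native store
            return true;
        }
    }

    /** Cleanup waits for any snapshot currently inside the lock, then releases. */
    void dispose() {
        synchronized (cleanupLock) {
            disposed = true; // placeholder for closing the native handle
        }
    }

    int completedSnapshots() {
        synchronized (cleanupLock) {
            return completedSnapshots;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedBackend backend = new GuardedBackend();
        // Simulated asynchronous checkpoint thread: stops once the backend is disposed.
        Thread checkpointer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                if (!backend.materializeSnapshot()) {
                    break;
                }
            }
        });
        checkpointer.start();
        backend.dispose();
        checkpointer.join();
        System.out.println("snapshots before dispose: " + backend.completedSnapshots());
    }
}
```

Because both paths take the same lock, a snapshot either finishes before cleanup starts or observes `disposed` and backs off; it never runs against a resource that is being released.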
[jira] [Commented] (FLINK-3948) EventTimeWindowCheckpointingITCase Fails with Core Dump
[ https://issues.apache.org/jira/browse/FLINK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294781#comment-15294781 ]

Aljoscha Krettek commented on FLINK-3948:
---

RocksDB seems to be somewhat sensitive to the environment and configuration. I changed the configuration to this:

{code}
private static class RocksDbOptionsFactory implements OptionsFactory {
    final long targetFileSize = 100;
    final long writeBufferSize = 100;

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        currentOptions
            .setMaxBackgroundCompactions(1)
            .setMaxBackgroundFlushes(1)
            .setMaxOpenFiles(1);
        return currentOptions;
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        currentOptions
            .setTargetFileSizeBase(targetFileSize)
            .setMaxBytesForLevelBase(4 * targetFileSize)
            .setWriteBufferSize(writeBufferSize)
            .setMinWriteBufferNumberToMerge(1)
            .setMaxWriteBufferNumber(1);
        return currentOptions;
    }
}
{code}

And now I even get this on my local machine:

{code}
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0001288fb173, pid=53485, tid=62699
#
# JRE version: Java(TM) SE Runtime Environment (8.0_40-b25) (build 1.8.0_40-b25)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.40-b25 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C [librocksdbjni2649341092967859180..jnilib+0xc0173] rocksdb::TableCache::FindTable(rocksdb::EnvOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Cache::Handle**, bool, bool, rocksdb::HistogramImpl*)+0x93
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/aljoscha/Dev/work/flink/flink-tests/hs_err_pid53485.log
[thread 25603 also had an error]
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
{code}

> EventTimeWindowCheckpointingITCase Fails with Core Dump
> ---
>
> Key: FLINK-3948
> URL: https://issues.apache.org/jira/browse/FLINK-3948
> Project: Flink
> Issue Type: Bug
> Components: state backends
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Critical
>
> It fails because of a core dump in RocksDB.