[
https://issues.apache.org/jira/browse/FLINK-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141796#comment-17141796
]
Yun Tang commented on FLINK-18338:
----------------------------------
I have figured out why this happens, and this [successful
CI run|https://dev.azure.com/myasuka/flink/_build/results?buildId=157&view=results]
across multiple core modules supports the explanation.
The root cause: the newly added test {{RocksDBStateMisuseOptionTest}} never
disposes its {{RocksDBKeyedStateBackend}}.
The code below can reproduce the crash with frocksdbjni 5.17.2-artisans-2.0:
{code:java}
// Load the native library from an explicit directory.
NativeLibraryLoader.getInstance().loadLibrary("/tmp/rocksdb-lib");
// RocksDB.open() fills this list with the column family handles it creates.
List<ColumnFamilyHandle> cf = new ArrayList<>(1);
try (DBOptions options = new DBOptions().setCreateIfMissing(true);
     ColumnFamilyOptions columnFamilyOptions = new ColumnFamilyOptions();
     RocksDB rocksdb = RocksDB.open(
         options,
         "/tmp/rocksdb-2",
         Collections.singletonList(
             new ColumnFamilyDescriptor("default".getBytes(), columnFamilyOptions)),
         cf)) {
    rocksdb.put(ByteBuffer.allocate(4).array(), ByteBuffer.allocate(4).array());
}
// The handles in cf are never closed here, so the DB is closed while they are still alive.
{code}
RocksDB-java relies on
[#finalize|https://github.com/facebook/rocksdb/wiki/RocksJava-Basics#memory-management]
to release the underlying C++ objects when the JVM runs GC. However, if a column
family handle is not destroyed before RocksDB itself is destroyed, the
[assert|https://github.com/dataArtisans/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/db/column_family.cc#L1238]
fails at the [versions
reset|https://github.com/dataArtisans/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/db/db_impl.cc#L515]
while the DB is closing. Since the order in which GC finalizes objects is not
guaranteed, the CI fails only intermittently.
I'll create a new PR to fix FLINK-17800 and avoid this problem.
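For illustration only, here is a minimal sketch of the close ordering described above: handles first, DB last, done explicitly rather than left to GC. It uses a hypothetical stand-in class ({{Tracked}}) instead of the real RocksJava types so that it runs without the native library; the names {{CloseOrderSketch}}, {{Tracked}}, {{open}} and {{closeOrder}} are assumptions made for this sketch and are not part of RocksDB or Flink.

```java
import java.util.ArrayList;
import java.util.List;

class CloseOrderSketch {

    /** Hypothetical stand-in for a native-backed object; records when it is closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    /** Mimics an open call that fills a caller-supplied list with handles. */
    static Tracked open(List<Tracked> handlesOut, List<String> log) {
        handlesOut.add(new Tracked("handle", log));
        return new Tracked("db", log);
    }

    /** Opens the fake DB, closes the handles first, then the DB; returns the close order. */
    static List<String> closeOrder() {
        List<String> log = new ArrayList<>();
        List<Tracked> handles = new ArrayList<>();
        try (Tracked db = open(handles, log)) {
            try {
                // ... work with db and handles ...
            } finally {
                // Close every handle explicitly while the DB is still open.
                for (Tracked handle : handles) {
                    handle.close();
                }
            }
        } // The DB is closed here, after the inner finally block has run.
        return log;
    }

    public static void main(String[] args) {
        System.out.println(closeOrder()); // prints [handle, db]
    }
}
```

With the real library the same shape should apply, since {{ColumnFamilyHandle}} and {{RocksDB}} are both {{AutoCloseable}}: close each handle in the list before leaving the try block that owns the DB.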
> RocksDB tests crash the JVM on CI
> ---------------------------------
>
> Key: FLINK-18338
> URL: https://issues.apache.org/jira/browse/FLINK-18338
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends, Tests
> Affects Versions: 1.11.0
> Reporter: Chesnay Schepler
> Assignee: Yun Tang
> Priority: Blocker
> Labels: test-stability
> Fix For: 1.11.0
>
>
> Something about {{pure virtual method called}}.
> Seen this twice in separate PRs.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3615&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3632&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1