Dear Community Members,

I would like to start a discussion on two tickets I filed recently:
Add support for Java based AssociativeMergeOperator via JNI
<https://issues.apache.org/jira/browse/FLINK-39455>
Support ReducingMergeState and AggregatingMergeState backed by Java based
associative merge operators
<https://issues.apache.org/jira/browse/FLINK-39456>

Copying the motivation from the FLIP doc
<https://docs.google.com/document/d/1HwEDRGoSZIUU1SYxTih4qp8FM6LjTdIrDs7CJHm4iB0/edit?usp=sharing>:

Flink supports RocksDBReducingState and RocksDBAggregatingState state
variables that do a synchronous read-modify-write on every add call. While
this works well in many scenarios, it can be expensive for write-heavy
workloads and may become a bottleneck.
RocksDB's AssociativeMergeOperator is a storage-level primitive designed
for associative, and often commutative, operations: integer counters, set
union, list append, approximate sketches, top-K structures, Bloom filters,
and similar patterns. However, frocksdb (the RocksDB fork used in Flink)
does not support Java based associative merge operators.
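
To make the difference concrete, here is a minimal in-memory sketch in
plain Java. It is not Flink or RocksDB code; MergeSketch,
ReadModifyWriteStore and MergeStore are hypothetical names used only for
illustration. The first store reads the current value on every add, while
the second only appends an operand and folds the operands later, roughly
the way a merge operator lets RocksDB defer the work to a read or a
compaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Hypothetical in-memory model; not Flink or RocksDB API. */
public class MergeSketch {

    /** Read-modify-write: every add looks up the old value before writing. */
    public static final class ReadModifyWriteStore {
        private final Map<String, Long> values = new HashMap<>();
        private final BinaryOperator<Long> fn;
        public int reads = 0;

        public ReadModifyWriteStore(BinaryOperator<Long> fn) {
            this.fn = fn;
        }

        public void add(String key, long v) {
            reads++; // the lookup that sits on the write path
            Long old = values.get(key);
            values.put(key, old == null ? v : fn.apply(old, v));
        }

        public Long get(String key) {
            reads++;
            return values.get(key);
        }
    }

    /** Merge-style: add only appends an operand; folding happens lazily. */
    public static final class MergeStore {
        private final Map<String, List<Long>> operands = new HashMap<>();
        private final BinaryOperator<Long> fn;
        public int reads = 0;

        public MergeStore(BinaryOperator<Long> fn) {
            this.fn = fn;
        }

        public void add(String key, long v) {
            // Blind write: no lookup of the current value.
            operands.computeIfAbsent(key, k -> new ArrayList<>()).add(v);
        }

        /** Folds pending operands into one value, standing in for compaction. */
        public void compact(String key) {
            List<Long> pending = operands.get(key);
            if (pending == null || pending.size() < 2) {
                return;
            }
            Long acc = pending.get(0);
            for (int i = 1; i < pending.size(); i++) {
                acc = fn.apply(acc, pending.get(i));
            }
            pending.clear();
            pending.add(acc);
        }

        public Long get(String key) {
            reads++;
            compact(key);
            List<Long> pending = operands.get(key);
            return pending == null ? null : pending.get(0);
        }
    }

    public static void main(String[] args) {
        ReadModifyWriteStore rmw = new ReadModifyWriteStore(Long::sum);
        MergeStore merge = new MergeStore(Long::sum);
        for (long v = 1; v <= 3; v++) {
            rmw.add("k", v);
            merge.add("k", v);
        }
        System.out.println("rmw reads before get:   " + rmw.reads);
        System.out.println("merge reads before get: " + merge.reads);
        System.out.println("values: " + rmw.get("k") + " / " + merge.get("k"));
    }
}
```

Folding the operands late is only safe because the function is
associative: the grouping in which operands get combined does not change
the result.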

This FLIP has two parts:
1. Support for Java based AssociativeMergeOperator in frocksdb via JNI
2. Support ReducingMergeState and AggregatingMergeState backed by Java
based associative merge operators

The first part proposes exposing the associative merge operator as a Java
class in frocksdb with minimal JNI overhead. RocksDB can call these
operators during flushing and compaction.
The second part builds on the frocksdb support developed in the first part
to provide ReducingMergeState and AggregatingMergeState state variables
with user-defined ReduceFunction and AggregateFunction on the RocksDB
state backend.
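
As a rough illustration of what a Java based associative merge could look
like at the byte level, here is a small self-contained sketch. The
AssociativeMerge interface and the CounterMergeSketch class are
hypothetical names that I made up for this example; the actual API
proposed in the FLIP doc may look different. The callback takes the
existing value (possibly absent) and one operand, and returns the merged
value; the example implements a long counter on 8-byte big-endian
encodings.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch; not the FLIP's or frocksdb's actual API. */
public class CounterMergeSketch {

    /** Assumed callback shape: merge an existing value with one operand. */
    public interface AssociativeMerge {
        byte[] merge(byte[] key, byte[] existingValue, byte[] operand);
    }

    public static byte[] encode(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    public static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    /** Long addition; a null existing value means the key has no value yet. */
    public static final AssociativeMerge LONG_ADD = (key, existing, operand) -> {
        long base = existing == null ? 0L : decode(existing);
        return encode(base + decode(operand));
    };

    public static void main(String[] args) {
        byte[] merged = LONG_ADD.merge(null, encode(40), encode(2));
        System.out.println(decode(merged));
    }
}
```

Because the existing value and the operand share one encoding, two
operands can also be merged with each other before a base value is known,
which is what allows partial merging during flush and compaction.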

This enhancement opens up a powerful feature of RocksDB to Java. Flink
users can use it to build interesting associative data structures
on streaming data. I have added benchmark details from a prototype
implementation in the FLIP doc.

Looking forward to feedback.

FLIP in Google doc
<https://docs.google.com/document/d/1HwEDRGoSZIUU1SYxTih4qp8FM6LjTdIrDs7CJHm4iB0/edit?usp=sharing>

Best,
-Soumitra.
