lucasbru commented on code in PR #22378:
URL: https://github.com/apache/kafka/pull/22378#discussion_r3357070146


##########
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java:
##########
@@ -311,19 +312,16 @@ private void addValueProvidersToMetricsRecorder() {
     }
 
     /**
-     * Creates lightweight {@link ColumnFamilyOptions} for the offsets column 
family. The offsets CF
-     * stores only a small number of key-value pairs (one per changelog 
partition), so it does not
-     * need the heavyweight options used for the data CF (large write buffers, 
bloom filters,
-     * aggressive compaction). Sharing the data CF's options causes 
unnecessary write amplification
-     * and compaction pressure that can contribute to RocksDB write stalls 
under heavy restore I/O.
+     * The offsets CF stores only a small number of key-value pairs
+     * (one per changelog partition), so it does not need the heavyweight
+     * options used for the data CF.
+     * Uses the default options for compaction and max write buffer number.
      */
-    protected static ColumnFamilyOptions createOffsetsCFOptions() {
-        final ColumnFamilyOptions offsetsCFOptions = new ColumnFamilyOptions();
-        offsetsCFOptions.setCompressionType(CompressionType.NO_COMPRESSION);
-        offsetsCFOptions.setCompactionStyle(CompactionStyle.LEVEL);
-        offsetsCFOptions.setWriteBufferSize(1024 * 1024L); // 1MB — sufficient 
for offset metadata
-        offsetsCFOptions.setMaxWriteBufferNumber(2);
-        return offsetsCFOptions;
+    protected ColumnFamilyOptions createOffsetsCFOptions() {

Review Comment:
   This now assigns the `offsetsCfOptions` field as a side effect and also 
returns it, and `close()` relies on the field to free the native options. It 
works because each open path calls it exactly once, but a method named 
`create...()` that mutates instance state is a bit of a footgun: a second call 
(or an override that doesn't delegate to super) would orphan the previous 
`ColumnFamilyOptions` and reintroduce the leak, since only the last instance 
gets closed. Could we keep this a pure factory and assign at the single call 
site (`offsetsCfOptions = createOffsetsCFOptions()`), or rename it to something 
like `initOffsetsCFOptions()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to