klion26 commented on a change in pull request #10329:
[FLINK-12785][StateBackend] RocksDB savepoint recovery can use a lot of
unmanaged memory
URL: https://github.com/apache/flink/pull/10329#discussion_r354146743
##########
File path:
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBWriteBatchWrapper.java
##########
@@ -113,4 +126,16 @@ public void close() throws RocksDBException {
}
IOUtils.closeQuietly(batch);
}
+
+ private void flushIfNeeded() throws RocksDBException {
+ boolean needFlush = batch.count() == capacity || (batchSize > 0
&& batch.getDataSize() >= batchSize);
+ if (needFlush) {
+ flush();
+ }
+ }
+
+ @VisibleForTesting
+ long getDataSize() {
+ return batch.getDataSize();
Review comment:
Yesterday I benchmarked this in single-shot mode only. I have now run the
benchmark in both single-shot and average-time modes; in `averagetime` mode,
computing the size manually is ~2% faster than calling `getDataSize()`:
```
Benchmark                                      Mode  Cnt  Score   Error  Units
BBenchmark.testGetDataSizeAfterPut             avgt   15  0.369 ± 0.067  us/op
BBenchmark.testRawGetDataSizeAfterPut          avgt   15  0.389 ± 0.088  us/op
BBenchmark.testGetDataSizeAfterPutOneShot        ss   15  2.458 ± 1.467  us/op
BBenchmark.testRawGetDataSizeAfterPutOneShot     ss   15  2.024 ± 0.223  us/op
```
The benchmark code is attached below:
```
// Single-shot: size computed in Java from the key/value lengths.
@Benchmark
@BenchmarkMode(Mode.SingleShotTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public long testGetDataSizeAfterPutOneShot() throws RocksDBException {
    writeBatch.put(handle, dummy, dummy);
    return 2 + calculateVarint32Length(dummy.length) + dummy.length +
        calculateVarint32Length(dummy.length) + dummy.length;
}

// Number of bytes needed to varint32-encode the given length.
private int calculateVarint32Length(int input) {
    if (input < (1 << 7)) {
        return 1;
    }
    if (input < (1 << 14)) {
        return 2;
    }
    if (input < (1 << 21)) {
        return 3;
    }
    if (input < (1 << 28)) {
        return 4;
    }
    return 5;
}

// Single-shot: size queried from RocksDB via getDataSize().
@Benchmark
@BenchmarkMode(Mode.SingleShotTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public long testRawGetDataSizeAfterPutOneShot() throws RocksDBException {
    writeBatch.put(handle, dummy, dummy);
    return writeBatch.getDataSize();
}

// Average time: size computed in Java from the key/value lengths.
@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public long testGetDataSizeAfterPut() throws RocksDBException {
    writeBatch.put(handle, dummy, dummy);
    return 2 + calculateVarint32Length(dummy.length) + dummy.length +
        calculateVarint32Length(dummy.length) + dummy.length;
}

// Average time: size queried from RocksDB via getDataSize().
@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public long testRawGetDataSizeAfterPut() throws RocksDBException {
    writeBatch.put(handle, dummy, dummy);
    return writeBatch.getDataSize();
}
```
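For readers who want to check the manual size formula without a RocksDB dependency, here is a minimal standalone sketch of the same arithmetic. The class and method names (`WriteBatchSizeEstimate`, `varint32Length`, `estimatedPutSize`) are hypothetical and not part of the PR. The constant `2` is copied from the benchmark above; my assumption is that it stands for one record-type byte plus a one-byte column family id, which would only hold for small column family ids. Note that the formula gives the size of a single put record, while `getDataSize()` reports the size of the whole batch.

```java
public class WriteBatchSizeEstimate {

    /** Bytes needed to varint32-encode a length (7 payload bits per byte). */
    public static int varint32Length(int value) {
        int length = 1;
        // Each further byte is needed while bits remain above the low 7.
        while ((value >>>= 7) != 0) {
            length++;
        }
        return length;
    }

    /**
     * Estimated size in bytes of one put record, following the benchmark formula.
     * ASSUMPTION: the fixed overhead of 2 is taken from the benchmark code and is
     * presumed to be 1 record-type byte + 1 column-family-id byte.
     */
    public static long estimatedPutSize(int keyLength, int valueLength) {
        final int recordOverhead = 2;
        return recordOverhead
            + varint32Length(keyLength) + keyLength
            + varint32Length(valueLength) + valueLength;
    }

    public static void main(String[] args) {
        // Small key/value: both lengths fit in a single varint byte.
        System.out.println(estimatedPutSize(8, 8));
        // Lengths >= 128 need a second varint byte each.
        System.out.println(estimatedPutSize(200, 200));
    }
}
```

The loop is equivalent to the chain of `if` statements in `calculateVarint32Length`; either form avoids the native call, which is the cost the benchmark is trying to isolate.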