klion26 commented on a change in pull request #10329: 
[FLINK-12785][StateBackend] RocksDB savepoint recovery can use a lot of 
unmanaged memory
URL: https://github.com/apache/flink/pull/10329#discussion_r354146743
 
 

 ##########
 File path: 
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBWriteBatchWrapper.java
 ##########
 @@ -113,4 +126,16 @@ public void close() throws RocksDBException {
                }
                IOUtils.closeQuietly(batch);
        }
+
+       private void flushIfNeeded() throws RocksDBException {
+               boolean needFlush = batch.count() == capacity || (batchSize > 0 && batch.getDataSize() >= batchSize);
+               if (needFlush) {
+                       flush();
+               }
+       }
+
+       @VisibleForTesting
+       long getDataSize() {
+               return batch.getDataSize();
 
 Review comment:
   Yesterday I benchmarked this in single-shot mode only. I have now run the 
benchmark in both single-shot and average-time modes. In `averagetime` mode, 
computing the size by hand instead of calling `getDataSize()` gives about a 
2% improvement.
   
   ```
   Benchmark                                     Mode  Cnt  Score   Error  Units
   BBenchmark.testGetDataSizeAfterPut            avgt   15  0.369 ± 0.067  us/op
   BBenchmark.testRawGetDataSizeAfterPut         avgt   15  0.389 ± 0.088  us/op
   BBenchmark.testGetDataSizeAfterPutOneShot       ss   15  2.458 ± 1.467  us/op
   BBenchmark.testRawGetDataSizeAfterPutOneShot    ss   15  2.024 ± 0.223  us/op
   ```
   
   The benchmark code is attached below:
   ```
   // Single-shot: size computed by hand from the key and value lengths.
   @Benchmark
   @BenchmarkMode(Mode.SingleShotTime)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   public long testGetDataSizeAfterPutOneShot() throws RocksDBException {
       writeBatch.put(handle, dummy, dummy);
       return 2 + calculateVarint32Length(dummy.length) + dummy.length
               + calculateVarint32Length(dummy.length) + dummy.length;
   }

   // Number of bytes needed to encode a non-negative int as a varint32.
   private int calculateVarint32Length(int input) {
       if (input < (1 << 7)) {
           return 1;
       }
       if (input < (1 << 14)) {
           return 2;
       }
       if (input < (1 << 21)) {
           return 3;
       }
       if (input < (1 << 28)) {
           return 4;
       }
       return 5;
   }

   // Single-shot: size reported by RocksDB through getDataSize().
   @Benchmark
   @BenchmarkMode(Mode.SingleShotTime)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   public long testRawGetDataSizeAfterPutOneShot() throws RocksDBException {
       writeBatch.put(handle, dummy, dummy);
       return writeBatch.getDataSize();
   }

   // Average time: size computed by hand from the key and value lengths.
   @Benchmark
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   public long testGetDataSizeAfterPut() throws RocksDBException {
       writeBatch.put(handle, dummy, dummy);
       return 2 + calculateVarint32Length(dummy.length) + dummy.length
               + calculateVarint32Length(dummy.length) + dummy.length;
   }

   // Average time: size reported by RocksDB through getDataSize().
   @Benchmark
   @BenchmarkMode(Mode.AverageTime)
   @OutputTimeUnit(TimeUnit.MICROSECONDS)
   public long testRawGetDataSizeAfterPut() throws RocksDBException {
       writeBatch.put(handle, dummy, dummy);
       return writeBatch.getDataSize();
   }
   ```
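
For readers who want to try the hand-computed size without a RocksDB dependency, here is a minimal, standalone sketch of the same formula. The class name `WriteBatchSizeSketch` and the method names are hypothetical. The constant `2` is copied from the benchmark; reading it as one record-type byte plus a one-byte column family id is an assumption about the write batch layout, not something stated in this thread.

```java
public final class WriteBatchSizeSketch {

    // Copied from the benchmark. ASSUMPTION: one record-type byte plus a
    // one-byte (varint) column family id; the thread does not confirm this.
    private static final int ASSUMED_RECORD_OVERHEAD = 2;

    private WriteBatchSizeSketch() {}

    /** Bytes needed to encode a non-negative int as a varint (7 payload bits per byte). */
    static int varint32Length(int value) {
        int length = 1;
        while ((value >>>= 7) != 0) {
            length++;
        }
        return length;
    }

    /** Estimated bytes that one put(key, value) adds, mirroring the benchmark's formula. */
    static long estimatedPutSize(int keyLength, int valueLength) {
        return ASSUMED_RECORD_OVERHEAD
                + varint32Length(keyLength) + keyLength
                + varint32Length(valueLength) + valueLength;
    }

    public static void main(String[] args) {
        int[] lengths = {0, 127, 128, 16_383, 16_384};
        for (int n : lengths) {
            System.out.println(n + " -> " + varint32Length(n) + " varint byte(s), "
                    + estimatedPutSize(n, n) + " byte(s) per put");
        }
    }
}
```

Note that this formula yields the size of a single put, whereas `getDataSize()` in the benchmark is called on a batch that keeps growing. A wrapper that relied on the hand-computed value would presumably have to keep its own running total and reset it on flush.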
   
