Guozhang Wang created KAFKA-13239:
-------------------------------------

             Summary: Use RocksDB.ingestExternalFile for restoration
                 Key: KAFKA-13239
                 URL: https://issues.apache.org/jira/browse/KAFKA-13239
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: Guozhang Wang


Now that we are on a newer version of RocksDB, we can consider using the new

{code}
ingestExternalFile(final ColumnFamilyHandle columnFamilyHandle,
      final List<String> filePathList,
      final IngestExternalFileOptions ingestExternalFileOptions)
{code}

for restoring changelog into state stores. More specifically:

1) Use a larger default batch size for restore consumer polling so that 
each poll returns as many records as possible.
2) For a single batch of records returned from a restore consumer poll call, 
first write them as a single SST file using the {{SstFileWriter}}. The existing 
{{DBOptions}} could be used to construct the {{EnvOptions}} and {{Options}} for 
the writer. Do not ingest the written file into the db yet within each 
iteration.
3) At the end of the restoration, call {{RocksDB.ingestExternalFile}} with the 
paths of all the written files as the parameter. The {{IngestExternalFileOptions}} 
would be specifically configured to allow key range overlapping with mem-table.
4) One specific note: after the call in 3), RocksDB may execute heavy 
compaction in the background. If normal processing starts immediately, before 
the compaction cools down, its attempts to {{put}} new records into the store 
may see heavy write stalls. To work around this we would consider calling 
{{RocksDB.compactRange()}}, which blocks until the compaction is completed.
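
One detail that step 2) implies but does not spell out: {{SstFileWriter}} requires keys to be added in strictly ascending order according to its comparator, while a restore batch arrives in offset order and may contain the same key more than once. Below is a minimal sketch of how a batch might be prepared before it is handed to the writer. It assumes the default bytewise (unsigned lexicographic) comparator; the {{RestoreBatchSorter}} and {{Record}} names are hypothetical and not part of Kafka Streams or RocksDB, and the RocksDB calls themselves are left out so the snippet runs with the JDK alone (Java 9+).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
public class RestoreBatchSorter {

    /** Hypothetical stand-in for one changelog record; a null value is a tombstone. */
    public static final class Record {
        public final byte[] key;
        public final byte[] value;

        public Record(final byte[] key, final byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    // Assumption: the store uses RocksDB's default bytewise comparator,
    // i.e. unsigned lexicographic ordering of the key bytes.
    private static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

    /**
     * Collapses a batch (given in offset order) to the latest record per key
     * and returns the survivors in ascending key order, so that each key is
     * strictly greater than the one before it.
     */
    public static List<Record> sortAndDedupe(final List<Record> batch) {
        final TreeMap<byte[], byte[]> latest = new TreeMap<>(BYTEWISE);
        for (final Record record : batch) {
            // A later offset overwrites an earlier one for the same key.
            latest.put(record.key, record.value);
        }
        final List<Record> sorted = new ArrayList<>(latest.size());
        for (final Map.Entry<byte[], byte[]> entry : latest.entrySet()) {
            sorted.add(new Record(entry.getKey(), entry.getValue()));
        }
        return sorted;
    }

    public static void main(final String[] args) {
        final List<Record> batch = List.of(
            new Record(bytes("b"), bytes("1")),
            new Record(bytes("a"), bytes("2")),
            new Record(bytes("b"), bytes("3")));

        for (final Record record : sortAndDedupe(batch)) {
            // In a real restore loop, a non-null value would be written to the
            // SST file as a put and a null value (tombstone) as a delete.
            System.out.println(new String(record.key, StandardCharsets.UTF_8)
                + "=" + new String(record.value, StandardCharsets.UTF_8));
        }
        // prints "a=2" then "b=3"
    }

    private static byte[] bytes(final String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Sorting per batch only orders keys within one file. Files written from different batches can still overlap each other in key range, which is worth keeping in mind when configuring the ingestion in step 3).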




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
