Sophie Blee-Goldman created KAFKA-8627:
------------------------------------------
Summary: Investigate batching on state restore
Key: KAFKA-8627
URL: https://issues.apache.org/jira/browse/KAFKA-8627
Project: Kafka
Issue Type: Improvement
Components: streams
Reporter: Sophie Blee-Goldman

Currently, when rebuilding state from scratch, we form batches from whatever is returned by poll() and write them to RocksDB. Given the structure of RocksDB, inserting large sorted batches gives the best performance when writing large amounts of data at once, so we should investigate the potential restore-time improvement of

1) Larger batches: either by tuning the restore consumer to return larger amounts of data, buffering records into larger batches, or both
2) Sorting batches

These two factors are likely to be coupled, so we should explore the performance gains/hits by varying both if possible (i.e., turn sorting on/off with a variety of batch sizes).
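
To make the two knobs concrete, here is a minimal sketch of what "buffer into larger batches, optionally sorted" could look like. It is not the Kafka Streams restore code: `RestoreBatcher`, `Entry`, and the `sink` callback are hypothetical names invented for illustration, and the sink stands in for whatever would actually write the batch to RocksDB. The comparator assumes the store uses RocksDB's default bytewise key ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not a Kafka Streams class): buffers restored records across polls. */
final class RestoreBatcher {

    /** Stand-in for one restored changelog record. */
    static final class Entry {
        final byte[] key;
        final byte[] value;

        Entry(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Unsigned lexicographic byte order, matching RocksDB's default bytewise comparator. */
    static final Comparator<Entry> BY_KEY = (a, b) -> {
        int n = Math.min(a.key.length, b.key.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a.key[i] & 0xff, b.key[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.key.length, b.key.length);
    };

    private final int maxBatchSize;            // knob 1: batch size
    private final boolean sort;                // knob 2: sorting on/off
    private final Consumer<List<Entry>> sink;  // assumed to write one batch to the store
    private final List<Entry> buffer = new ArrayList<>();

    RestoreBatcher(int maxBatchSize, boolean sort, Consumer<List<Entry>> sink) {
        this.maxBatchSize = maxBatchSize;
        this.sort = sort;
        this.sink = sink;
    }

    /** Adds the records from one poll(); flushes each time the buffer reaches maxBatchSize. */
    void add(List<Entry> polled) {
        for (Entry e : polled) {
            buffer.add(e);
            if (buffer.size() >= maxBatchSize) {
                flush();
            }
        }
    }

    /** Hands the buffered records to the sink; must also be called once at the end of restore. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<Entry> batch = new ArrayList<>(buffer);
        buffer.clear();
        if (sort) {
            // List.sort is stable, so records with equal keys keep their changelog
            // order and the latest value for a key is still written last.
            batch.sort(BY_KEY);
        }
        sink.accept(batch);
    }
}
```

A benchmark could then construct this with `sort` on and off across a range of `maxBatchSize` values and time the restore. The other half of item 1, making poll() itself return more data, is a matter of consumer configs such as `max.poll.records` and `fetch.max.bytes`, which Kafka Streams accepts for the restore consumer under the `restore.consumer.` prefix.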