[GitHub] [kafka] guozhangwang commented on a change in pull request #8661: KAFKA-9603: Do not turn on bulk loading for segmented stores on stand-by tasks

GitBox Wed, 13 May 2020 16:58:31 -0700


guozhangwang commented on a change in pull request #8661:
URL: https://github.com/apache/kafka/pull/8661#discussion_r424796625




##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/AbstractRocksDBSegmentedBytesStore.java
##########
@@ -248,17 +243,6 @@ void restoreAllInternal(final Collection<KeyValue<byte[], byte[]>> records) {
             final long segmentId = segments.segmentId(timestamp);
             final S segment = segments.getOrCreateSegmentIfLive(segmentId, context, observedStreamTime);
             if (segment != null) {
-                // This handles the case that state store is moved to a new client and does not
-                // have the local RocksDB instance for the segment. In this case, toggleDBForBulkLoading
-                // will only close the database and open it again with bulk loading enabled.
-                if (!bulkLoadSegments.contains(segment)) {
-                    segment.toggleDbForBulkLoading(true);
-                    // If the store does not exist yet, the getOrCreateSegmentIfLive will call openDB that
-                    // makes the open flag for the newly created store.
-                    // if the store does exist already, then toggleDbForBulkLoading will make sure that
-                    // the store is already open here.
-                    bulkLoadSegments = new HashSet<>(segments.allSegments());
-                }

Review comment:
       Yup, that makes sense to me. I'm thinking about a world where standbys (and also restoring tasks) are executed on different threads. The concern about IQ is indeed valid, given a large set of un-compacted L0 files.
   
   In the even larger scope, where we would have checkpoints, I believe bulk loading would not be very necessary, since we would no longer have a huge number of records to catch up on :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
