deepthi912 commented on code in PR #18020:
URL: https://github.com/apache/pinot/pull/18020#discussion_r3013792572


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/immutable/ImmutableSegmentImpl.java:
##########
@@ -160,13 +159,14 @@ public void persistDocIdsSnapshot(String fileName, 
ThreadSafeMutableRoaringBitma
         LOGGER.warn("Previous snapshot was not taken cleanly. Remove tmp file: 
{}", tmpFile);
         FileUtils.deleteQuietly(tmpFile);
       }
-      MutableRoaringBitmap docIdsSnapshot = docIds.getMutableRoaringBitmap();
-      try (DataOutputStream dataOutputStream = new DataOutputStream(new 
FileOutputStream(tmpFile))) {
-        docIdsSnapshot.serialize(dataOutputStream);
+      byte[] bytes = docIds.toBytes();

Review Comment:
   Yeah I haven't seen any longer pause from the small benchmark I have taken . 
Older code was completely cloning the bitmap entirely. Atleast with that we 
reduce the number of lock contentions as well as bytes which are eligible for 
gc immediately. With off-heap memory:
   
   It's might not be worth it here:
   
   Direct buffers are slower to allocate. ByteBuffer.allocateDirect() requires 
a system call (mmap) and has higher per-allocation overhead. For a ~40KB buffer 
created per segment, this would add overhead dominate any benefit. For maybe 
larger bitmaps it might make sense but it doesn't explain the purpose. We might 
just use it to save the heap memory and bring down heap overhead or long gc 
pauses. I don't think in our scenario we had a long gc pause. @KKcorps correct 
me if I am wrong.



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/indexsegment/immutable/ImmutableSegmentImpl.java:
##########
@@ -160,13 +159,14 @@ public void persistDocIdsSnapshot(String fileName, 
ThreadSafeMutableRoaringBitma
         LOGGER.warn("Previous snapshot was not taken cleanly. Remove tmp file: 
{}", tmpFile);
         FileUtils.deleteQuietly(tmpFile);
       }
-      MutableRoaringBitmap docIdsSnapshot = docIds.getMutableRoaringBitmap();
-      try (DataOutputStream dataOutputStream = new DataOutputStream(new 
FileOutputStream(tmpFile))) {
-        docIdsSnapshot.serialize(dataOutputStream);
+      byte[] bytes = docIds.toBytes();

Review Comment:
   Yeah I haven't seen any longer pause from the small benchmark I have taken . 
Older code was completely cloning the bitmap. Atleast with that we reduce the 
number of lock contentions as well as bytes which are eligible for gc 
immediately. With off-heap memory:
   
   It's might not be worth it here:
   
   Direct buffers are slower to allocate. ByteBuffer.allocateDirect() requires 
a system call (mmap) and has higher per-allocation overhead. For a ~40KB buffer 
created per segment, this would add overhead dominate any benefit. For maybe 
larger bitmaps it might make sense but it doesn't explain the purpose. We might 
just use it to save the heap memory and bring down heap overhead or long gc 
pauses. I don't think in our scenario we had a long gc pause. @KKcorps correct 
me if I am wrong.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to