imply-cheddar commented on code in PR #16698:
URL: https://github.com/apache/druid/pull/16698#discussion_r1669780949


##########
processing/src/main/java/org/apache/druid/segment/writeout/FileWriteOutBytes.java:
##########
@@ -33,22 +33,25 @@
 import java.nio.channels.FileChannel;
 import java.nio.channels.WritableByteChannel;
 
-final class FileWriteOutBytes extends WriteOutBytes
+public final class FileWriteOutBytes extends WriteOutBytes
 {
   private final File file;
   private final FileChannel ch;
   private long writeOutBytes;
 
-  /** Purposely big-endian, for {@link #writeInt(int)} implementation */
-  private final ByteBuffer buffer = ByteBuffer.allocate(4096); // 4K page 
sized buffer
+  /**
+   * Purposely big-endian, for {@link #writeInt(int)} implementation.
+   * Direct because there is a material difference in performance when writing 
direct buffers
+   */
+  private final ByteBuffer buffer = ByteBuffer.allocateDirect(32768); // 32K 
page sized buffer

Review Comment:
   Are you sure about the memory leak?  `DirectByteBuffer` creates a `Cleaner` 
which watches the references and frees the memory when it is GC'd.  There 
shouldn't be a leak from creating a thing here.  It is true that the memory 
will be used for some time before GC is invoked, but this shouldn't create a 
leak.
   
   In terms of performance, the updates here were done in response to profiling 
data.  Part of the speed up comes from the larger page size, another part from 
the direct flushing.
   
   



##########
processing/src/main/java/org/apache/druid/segment/writeout/TmpFileSegmentWriteOutMedium.java:
##########
@@ -43,21 +69,41 @@ public final class TmpFileSegmentWriteOutMedium implements 
SegmentWriteOutMedium
   @Override
   public WriteOutBytes makeWriteOutBytes() throws IOException
   {
-    File file = File.createTempFile("filePeon", null, dir);
-    FileChannel ch = FileChannel.open(
-        file.toPath(),
-        StandardOpenOption.READ,
-        StandardOpenOption.WRITE
+    return new LazilyAllocatingHeapWriteOutBytes(

Review Comment:
   Because it has broad-reaching performance improving impacts?  The 
alternative would be to create a new thing, update everything to use it and 
then delete the old, which is basically the same as changing in place.  The 
reason to leave the old one would be if we wanted to maintain configurability 
to do the old thing, but I don't know why that would be useful here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to