[GitHub] [druid] loquisgon commented on a change in pull request #11123: Avoid memory mapping hydrants after they are persisted & after they are merged for native batch ingestion

GitBox Mon, 26 Apr 2021 12:17:27 -0700


loquisgon commented on a change in pull request #11123:
URL: https://github.com/apache/druid/pull/11123#discussion_r620578672




##########
File path: server/src/main/java/org/apache/druid/segment/realtime/FireHydrant.java
##########
@@ -34,17 +34,53 @@
 
 import javax.annotation.Nullable;
 import java.io.Closeable;
+import java.io.File;
 import java.util.Optional;
 import java.util.concurrent.atomic.AtomicReference;
 import java.util.function.Function;
 
 /**
+ *
  */
 public class FireHydrant
 {
   private final int count;
   private final AtomicReference<ReferenceCountingSegment> adapter;
   private volatile IncrementalIndex index;
+  private File persistedFile;
+  private SegmentId persistedSegmentId;
+
+  /**
+   * @return The persisted file path. This is needed to recreate mapped files before merging.
+   * it will be null for real time hydrants.
+   */
+  public @Nullable File getPersistedFile()
+  {
+    return persistedFile;
+  }
+
+  /**
+   * @return The persisted segment id. This is needed to recreate mapped files before merging.
+   * It will be null for real time hydrants
+   */
+  public @Nullable SegmentId getPersistedSegmentId()

Review comment:
       Having `persistedSegmentId` in addition to `segmentId` is kind of ugly, 
but I was trying to touch the existing code as little as possible. To use a 
single segment id, I would have to introduce a private `segmentId` member that 
is sometimes set from the adapter and sometimes from the persisted segment id, 
and I am on the fence about whether that would be even more confusing. In any 
case, as we discussed, the next step is to split Appenderator into two 
implementations: one for batch and one for real time. The real-time 
appenderator would be practically the same implementation as before these 
changes. The batch appenderator would take the current incremental direction to 
its logical conclusion: avoid keeping in memory any data structures that native 
batch does not need (which should then remove the OOMs caused by memory 
consumption growing with data size). This code is therefore temporary, and we 
can have a cleaner version in the BatchAppenderator implementation.
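
For illustration only, the single-`segmentId` alternative described above might look roughly like the sketch below. It is not code from the PR: `SketchHydrant`, `SketchSegment`, and `persistAndUnmap` are hypothetical names, and a plain `String` stands in for Druid's `SegmentId`. The id is first taken from the adapter and later overwritten with the persisted id once the mapped adapter is dropped, which is exactly the dual-source behavior that could be confusing.

```java
import java.io.File;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the segment adapter; only the id matters here.
final class SketchSegment
{
  private final String id;

  SketchSegment(String id)
  {
    this.id = id;
  }

  String getId()
  {
    return id;
  }
}

// Hypothetical, simplified hydrant with ONE segmentId member (assumption:
// a String replaces SegmentId to keep the sketch self-contained).
final class SketchHydrant
{
  private final AtomicReference<SketchSegment> adapter;
  private volatile String segmentId;   // set from the adapter OR at persist time
  private volatile File persistedFile; // stays null for real-time hydrants

  SketchHydrant(SketchSegment segment)
  {
    this.adapter = new AtomicReference<>(segment);
    this.segmentId = segment.getId(); // source 1: the adapter
  }

  // Batch path: remember where the data lives, then drop the mapped adapter.
  void persistAndUnmap(File file, String persistedId)
  {
    this.persistedFile = file;
    this.segmentId = persistedId;     // source 2: the persisted segment id
    this.adapter.set(null);
  }

  String getSegmentId()
  {
    return segmentId;
  }

  File getPersistedFile()
  {
    return persistedFile;
  }

  boolean isMapped()
  {
    return adapter.get() != null;
  }
}
```

With this shape, callers only ever see `getSegmentId()`, but the field has two writers, so the trade-off is a smaller API against less obvious provenance of the id.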




