jihoonson commented on issue #8492: Improper Appenderator.add() calls concurrent with persist
URL: https://github.com/apache/incubator-druid/issues/8492#issuecomment-529728974

Thank you for creating this issue.

During ingestion, `persist()` can happen multiple times for the same segment. A `Sink` is the logical representation of a single segment and can have multiple `FireHydrant`s, each of which holds either an `IncrementalIndex` in memory or a `QueryableIndex` on disk. Note that each of these `IncrementalIndex` or `QueryableIndex` instances is a part of the same segment, and they should be merged before the segment is pushed to deep storage.

`AppenderatorImpl.persistAll()` persists all in-memory parts of the segments to disk. The persist works as follows: the "main" thread first [creates a new `FireHydrant` with a new `IncrementalIndex`](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java#L561), and then the "persist" thread [performs the persist in the background](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java#L579). As a result, the next time the "main" thread calls `AppenderatorImpl.add()`, the new row is added to the new `FireHydrant` instead of the one being persisted.
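
To make the hand-off described above concrete, here is a minimal, self-contained sketch of the swap-then-persist pattern. It is not Druid's code: `HydrantSwapSketch`, `Hydrant`, `Sink`, `swap()`, and `persistedRowCount` are hypothetical names invented for illustration, and the "persist" is reduced to recording a row count. It only assumes what the comment states: the "main" thread swaps in a fresh hydrant before the "persist" thread starts working on the old one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class HydrantSwapSketch {

    /** Hypothetical stand-in for a FireHydrant: one part of a segment. */
    static final class Hydrant {
        final List<String> rows = new ArrayList<>();
        // Written by the "persist" thread, read by "main" after Future.get().
        int persistedRowCount = -1;
    }

    /** Hypothetical stand-in for a Sink: one segment made of several hydrants. */
    static final class Sink {
        private final List<Hydrant> hydrants = new ArrayList<>();
        private Hydrant current;

        Sink() {
            swap(); // start with one empty, writable hydrant
        }

        /** Called only from the "main" thread; always writes to the newest hydrant. */
        void add(String row) {
            current.rows.add(row);
        }

        /** Installs a fresh hydrant and returns the previous one (null on first call). */
        Hydrant swap() {
            Hydrant old = current;
            current = new Hydrant();
            hydrants.add(current);
            return old;
        }

        Hydrant current() {
            return current;
        }

        int hydrantCount() {
            return hydrants.size();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService persistExecutor = Executors.newSingleThreadExecutor();
        Sink sink = new Sink();
        sink.add("row-1");
        sink.add("row-2");

        // "main" thread: swap first, so later add() calls cannot touch the old hydrant.
        Hydrant toPersist = sink.swap();

        // "persist" thread: works on the now-frozen hydrant in the background.
        Future<?> persisted = persistExecutor.submit(() -> {
            toPersist.persistedRowCount = toPersist.rows.size();
        });

        // "main" thread: this row lands in the new hydrant, not the one being persisted.
        sink.add("row-3");

        persisted.get();
        persistExecutor.shutdown();

        System.out.println("rows in persisted hydrant: " + toPersist.persistedRowCount);
        System.out.println("rows in current hydrant:   " + sink.current().rows.size());
        System.out.println("hydrants in sink:          " + sink.hydrantCount());
    }
}
```

The point of the ordering is that the swap happens on the same thread that calls `add()`, so the old hydrant is never written to once the background task has been submitted.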