jihoonson commented on issue #8492: Improper Appenderator.add() calls 
concurrent with persist
URL: 
https://github.com/apache/incubator-druid/issues/8492#issuecomment-529728974
 
 
   Thank you for creating this issue.
   
   During ingestion, `persist()` can happen multiple times for the same 
segment. A `Sink` is the logical representation of a single segment and can have 
multiple `FireHydrant`s, each of which holds either an `IncrementalIndex` in memory or a 
`QueryableIndex` on disk. Note that each of these `IncrementalIndex` or 
`QueryableIndex` objects is a part of the same segment, and they should all be merged before the segment is 
pushed to deep storage. `AppenderatorImpl.persistAll()` persists all 
in-memory parts of the segments to disk. The persist works as follows: the 
"main" thread first [creates a new `FireHydrant` with a new 
`IncrementalIndex`](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java#L561),
 and then the "persist" thread [performs the persist in the 
background](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/org/apache/druid/segment/realtime/appenderator/AppenderatorImpl.java#L579).
 As a result, the next time the "main" thread calls `AppenderatorImpl.add()`, 
the new row is added to the new `FireHydrant` instead of the one that is 
being persisted.
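
To make the ordering concrete, here is a minimal sketch of the swap-then-persist pattern described above. It is not the actual `AppenderatorImpl` code: the `Sink` and `Hydrant` classes, the `swap()` method, and this `persistAll()` helper are simplified, hypothetical stand-ins, and the "persist" is only simulated by setting a flag. It assumes, as described above, that `add()` and the swap are both called only from the "main" thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: hypothetical stand-ins, NOT Druid's real classes.
class SwapThenPersistSketch {

  /** Stand-in for a FireHydrant: one part of a segment. */
  static class Hydrant {
    final int id;
    final List<String> rows = new ArrayList<>();  // plays the role of the in-memory index
    volatile boolean persisted = false;           // set by the "persist" thread

    Hydrant(int id) {
      this.id = id;
    }
  }

  /** Stand-in for a Sink: one segment made of several hydrants. */
  static class Sink {
    final List<Hydrant> hydrants = new ArrayList<>();
    Hydrant current;

    Sink() {
      current = new Hydrant(0);
      hydrants.add(current);
    }

    /** Called only from the "main" thread; always writes to the newest hydrant. */
    void add(String row) {
      current.rows.add(row);
    }

    /** Called only from the "main" thread; returns the hydrant that is now read-only. */
    Hydrant swap() {
      Hydrant old = current;
      current = new Hydrant(old.id + 1);
      hydrants.add(current);
      return old;
    }
  }

  /** Swap on the calling ("main") thread, then hand the old hydrant to the "persist" thread. */
  static Future<?> persistAll(Sink sink, ExecutorService persistExecutor) {
    Hydrant toPersist = sink.swap();
    return persistExecutor.submit(() -> {
      // A real implementation would write the index to disk here; this only marks it.
      toPersist.persisted = true;
    });
  }

  public static void main(String[] args) throws Exception {
    ExecutorService persistExecutor = Executors.newSingleThreadExecutor();
    Sink sink = new Sink();

    sink.add("row-1");                                // goes to hydrant 0
    Future<?> persistFuture = persistAll(sink, persistExecutor);
    sink.add("row-2");                                // goes to hydrant 1, even if the persist is still running

    persistFuture.get();                              // wait for the background persist
    persistExecutor.shutdown();

    for (Hydrant h : sink.hydrants) {
      System.out.println("hydrant " + h.id + " rows=" + h.rows + " persisted=" + h.persisted);
    }
  }
}
```

Because the swap happens on the same thread that calls `add()`, no row can land in a hydrant after it has been handed off for persisting, as long as nothing else calls `add()` concurrently.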

