jihoonson commented on issue #11231: URL: https://github.com/apache/druid/issues/11231#issuecomment-840760944
@loquisgon thanks for adding details. I have some comments on those.

- Currently, the `Sink` tracks how many rows have been added to it, which is the sum of the in-memory segment size and all persisted segment sizes. This metadata is used to check whether the segment has hit `maxRowsPerSegment` and needs to be pushed. In this proposal, the batch appenderator will remove all in-memory metadata of the `Sink` once it is persisted, which will lose the row count in the sink. How are you thinking of tracking this information?
- When a segment hits `maxRowsPerSegment`, what will the task do in the merge phase? It currently pushes all segments at once, including even the small segments that haven't hit the limit yet. The current behavior is designed for streaming ingestion to keep track of the offsets of pushed segments. The batch task doesn't have to do the same. Instead, it can selectively push only the segments that are large enough. This will create "better" segments in terms of segment size.

The below are nit comments.

- The `AppenderatorImpl` currently persists all in-memory segments when it hits the memory limit. This is because, in streaming ingestion, the realtime task should be able to stop and restart when the middleManager it is running on is restarted. The persist operation writes a checkpoint on disk that records the stream offsets corresponding to the persisted data, so that the task can resume from the last checkpoint on restart. The batch task doesn't need this checkpoint as it's not restorable, and thus doesn't need to persist all in-memory segments at once either. So, maybe it's worth considering persisting not all segments but only some of them, until enough memory space is available. This can also help create "better" segments by mitigating https://github.com/apache/druid/issues/11252.
  However, https://github.com/apache/druid/issues/11252 perhaps needs to be solved by redesigning the merge operation so that it can be shared by both batch and streaming ingestion, and all ingestion methods can benefit from it.
- The proposal will reuse the existing `FireHydrants` and `Sinks` for the batch appenderator. However, these are designed to serve realtime query processing and thus have data structures and architectures that support concurrent access. Imagining what the batch version of `FireHydrants` and `Sinks` would look like, I think they will likely be very similar to the realtime version. So, even though the concurrency support will be unnecessary for the batch appenderator, I think it's OK to reuse them.
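On the first point, one possible answer (a sketch under my own assumptions, not something in the proposal or Druid's actual `Sink` API) is a small per-sink counter that survives persists: the persisted total is rolled up when the in-memory hydrant's metadata is dropped, so `maxRowsPerSegment` can still be enforced. All names here are illustrative.

```java
// Hypothetical sketch: a per-sink row counter that survives persists, so a
// batch appenderator can still enforce maxRowsPerSegment after dropping the
// in-memory hydrant metadata. Not Druid's actual Sink API.
class SinkRowCounter {
  private long persistedRows = 0;  // rows already persisted to disk
  private long inMemoryRows = 0;   // rows in the current in-memory hydrant

  void add(int numRows) {
    inMemoryRows += numRows;
  }

  // Called when the in-memory hydrant is persisted and its metadata dropped:
  // fold the in-memory count into the persisted total before clearing it.
  void onPersist() {
    persistedRows += inMemoryRows;
    inMemoryRows = 0;
  }

  long totalRows() {
    return persistedRows + inMemoryRows;
  }

  boolean needsPush(long maxRowsPerSegment) {
    return totalRows() >= maxRowsPerSegment;
  }
}
```

The counter is all the appenderator needs for the size check, so the rest of the per-hydrant metadata can be freed after each persist.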
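The selective-push idea for the merge phase could look roughly like the following (again a hypothetical sketch; the map of row counts stands in for whatever bookkeeping the batch appenderator keeps, and is not an existing Druid structure):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: push only sinks whose row count has reached
// maxRowsPerSegment, leaving smaller sinks open to accumulate more rows
// and produce "better"-sized segments. Illustrative, not Druid's API.
class SelectivePush {
  static List<String> selectSinksToPush(Map<String, Long> rowCountsBySink,
                                        long maxRowsPerSegment) {
    List<String> toPush = new ArrayList<>();
    for (Map.Entry<String, Long> e : rowCountsBySink.entrySet()) {
      if (e.getValue() >= maxRowsPerSegment) {
        toPush.add(e.getKey());  // large enough: push and free its resources
      }
      // smaller sinks stay open so they can keep growing
    }
    return toPush;
  }
}
```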
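And for the nit about not persisting everything at once: a batch task could persist the largest in-memory sinks first and stop as soon as enough memory is reclaimed. This is a sketch of one possible policy, assuming the appenderator tracks in-memory bytes per sink; the names and the largest-first ordering are my assumptions, not the proposal's.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: instead of persisting every in-memory sink when the
// memory limit is hit, persist the largest sinks first and stop once enough
// memory has been reclaimed. Illustrative, not Druid's AppenderatorImpl logic.
class PartialPersist {
  static List<String> selectSinksToPersist(Map<String, Long> bytesInMemoryBySink,
                                           long bytesToFree) {
    List<Map.Entry<String, Long>> sinks =
        new ArrayList<>(bytesInMemoryBySink.entrySet());
    // Largest sinks first, so fewer persists (and fewer small intermediate
    // segments) are needed to reclaim the requested amount of memory.
    sinks.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));

    List<String> toPersist = new ArrayList<>();
    long freed = 0;
    for (Map.Entry<String, Long> e : sinks) {
      if (freed >= bytesToFree) {
        break;  // enough memory reclaimed; leave the rest in memory
      }
      toPersist.add(e.getKey());
      freed += e.getValue();
    }
    return toPersist;
  }
}
```

Persisting largest-first also mitigates the many-tiny-intermediate-segments problem, since small sinks keep their rows in memory until they are worth writing out.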
