mcvsubbu commented on issue #6302: URL: https://github.com/apache/pinot/issues/6302#issuecomment-896393436
@yupeng9 and I discussed offline, and here is the summary for Upsert support:

For upsert tables, we need an occasional refill of older data. During refill, the segments change data, and if ingestion is also going on, then it messes up the indexing since other (uploaded realtime) segments are being modified. A few minutes of pause where queries are still being served (but potentially served with stale data) is acceptable. The requirement is to pause all partitions, not one at a time.

Having things pause while retaining a CONSUMING segment with data in it is hard. Pausing is easy, but in case a server gets restarted, it is hard to remember the exact offset when things were paused, and consume up to that.

Instead, I suggest we support two controller API primitives:
- Complete current consuming segments NOW, and optionally create a new segment. This may address Issue #7280 (basically, force a commit now).
- Restart consumption by creating new consuming segments.
- Support the above two for a single partition or all partitions (we can start with all I think).

Things to think about and handle correctly (just listing a few):
- Make sure the server initiates ONE call to complete segments. The pause operation needs to be invoked on the `PartitionConsumer` object. Not sure what happens if an ONLINE transition already comes in. A bad race condition can happen in such a case.
- Make sure the periodic task does not restart a paused consumption.
- Decide if paused state needs to be marked in zk (in idealstate?)
- Handle server restart during paused state.

I am sure people can add more.
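To make the two proposed primitives and the listed concerns concrete, here is a minimal sketch of the intended state transitions. It is a toy, single-process, in-memory model and not Pinot code: `ToyController`, `force_commit`, `resume`, and `periodic_repair` are hypothetical names chosen for illustration, and Python is used only for brevity. It assumes that "paused" simply means no partition has a CONSUMING segment, and that the paused flag is stored somewhere durable (the comment leaves open whether that would be the idealstate).

```python
from dataclasses import dataclass, field
from enum import Enum


class SegmentState(Enum):
    CONSUMING = "CONSUMING"
    ONLINE = "ONLINE"


@dataclass
class ToyController:
    """Hypothetical in-memory stand-in for the controller (illustration only)."""

    num_partitions: int
    paused: bool = False  # assumed to be persisted in a real system
    # partition id -> list of (sequence number, state), oldest first
    segments: dict = field(default_factory=dict)
    commit_calls: int = 0  # counts segment completions, to check idempotency

    def __post_init__(self):
        # Every partition starts with one CONSUMING segment.
        for partition in range(self.num_partitions):
            self.segments[partition] = [(0, SegmentState.CONSUMING)]

    def force_commit(self, create_new=False):
        """Primitive 1: complete all CONSUMING segments now.

        With create_new=False the table ends up paused; with create_new=True
        a fresh CONSUMING segment is created and ingestion continues.
        """
        for segs in self.segments.values():
            seq, state = segs[-1]
            if state is SegmentState.CONSUMING:
                # Only a CONSUMING segment is completed, so a repeated call
                # does not commit the same segment twice.
                segs[-1] = (seq, SegmentState.ONLINE)
                self.commit_calls += 1
                if create_new:
                    segs.append((seq + 1, SegmentState.CONSUMING))
        self.paused = not create_new

    def resume(self):
        """Primitive 2: restart consumption by creating new CONSUMING segments."""
        self.paused = False
        self._create_missing_consuming_segments()

    def periodic_repair(self):
        """Periodic task: recreate missing CONSUMING segments unless paused."""
        if self.paused:
            return  # must not restart a paused consumption
        self._create_missing_consuming_segments()

    def _create_missing_consuming_segments(self):
        for segs in self.segments.values():
            seq, state = segs[-1]
            if state is SegmentState.ONLINE:
                segs.append((seq + 1, SegmentState.CONSUMING))
```

In this model a refill would run between `force_commit()` and `resume()`: queries keep hitting the ONLINE segments, no CONSUMING segment holds data that a restarted server would need to replay to an exact offset, and the periodic task leaves the table alone until `resume()` is called. The sketch does not model the race with an incoming ONLINE transition on the server side, which remains an open question in the comment.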
