mcvsubbu commented on issue #6302:
URL: https://github.com/apache/pinot/issues/6302#issuecomment-896393436


   @yupeng9 and I discussed this offline, and here is the summary for Upsert 
support:
   
   For upsert tables, we need an occasional refill of older data. During 
refill, the data in existing segments changes, and if ingestion is going on at 
the same time, it messes up the indexing, since other (uploaded realtime) 
segments are being modified. A pause of a few minutes, during which queries are 
still served (but potentially with stale data), is acceptable. The requirement 
is to pause all partitions together, not one at a time.
   
   Pausing while retaining a CONSUMING segment with data in it is hard. The 
pause itself is easy, but if a server gets restarted, it is hard to remember 
the exact offset at which consumption was paused and to consume back up to it.
   
   Instead, I suggest we support two controller API primitives:
   - Complete the current consuming segments NOW, and optionally create a new 
segment. This may address Issue #7280 (basically, force a commit now).
   - Restart consumption by creating new consuming segments.

   Both primitives should be supported for a single partition or for all 
partitions (we can start with all, I think).
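
   To make the two primitives concrete, here is a minimal in-memory sketch in Java. It is only a model of the proposal: the class `ConsumptionControlSketch`, its methods, and the `SegmentState` enum are hypothetical names, not existing Pinot APIs, and it covers only the "all partitions" case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the two proposed primitives; none of these names are Pinot APIs.
final class ConsumptionControlSketch {
    enum SegmentState { CONSUMING, COMMITTED }

    // Partition id -> ordered list of segment states for that partition.
    private final Map<Integer, List<SegmentState>> segmentsByPartition = new TreeMap<>();

    ConsumptionControlSketch(int numPartitions) {
        for (int p = 0; p < numPartitions; p++) {
            List<SegmentState> segments = new ArrayList<>();
            segments.add(SegmentState.CONSUMING); // every partition starts out consuming
            segmentsByPartition.put(p, segments);
        }
    }

    private static SegmentState latest(List<SegmentState> segments) {
        return segments.get(segments.size() - 1);
    }

    // Primitive 1: commit every CONSUMING segment now, optionally opening a new one.
    void forceCommit(boolean createNewConsumingSegment) {
        for (List<SegmentState> segments : segmentsByPartition.values()) {
            if (latest(segments) == SegmentState.CONSUMING) {
                segments.set(segments.size() - 1, SegmentState.COMMITTED);
                if (createNewConsumingSegment) {
                    segments.add(SegmentState.CONSUMING);
                }
            }
        }
    }

    // Primitive 2: restart consumption by creating a new CONSUMING segment
    // for every partition that does not currently have one.
    void resumeConsumption() {
        for (List<SegmentState> segments : segmentsByPartition.values()) {
            if (latest(segments) != SegmentState.CONSUMING) {
                segments.add(SegmentState.CONSUMING);
            }
        }
    }

    // The table counts as paused when no partition has a CONSUMING segment.
    boolean isPaused() {
        for (List<SegmentState> segments : segmentsByPartition.values()) {
            if (latest(segments) == SegmentState.CONSUMING) {
                return false;
            }
        }
        return true;
    }

    int segmentCount(int partition) {
        return segmentsByPartition.get(partition).size();
    }
}
```

   In this model a pause is just `forceCommit(false)`: nothing is left half-consumed, so there is no offset to remember across a server restart. A plain force commit (the Issue #7280 case) would be `forceCommit(true)`.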
   
   Things to think about and handle correctly (just listing a few):
   - Make sure the server initiates exactly ONE call to complete a segment. 
The pause operation needs to be invoked on the `PartitionConsumer` object. I am 
not sure what happens if an ONLINE transition has already come in; a bad race 
condition can happen in such a case.
   - Make sure the periodic task does not restart a paused consumption.
   - Decide whether the paused state needs to be marked in ZK (in the 
idealstate?). Handle server restart during the paused state.
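
   Two of the points above can be sketched with plain atomics. This is a sketch under assumptions, not how Pinot does it: `PartitionConsumerSketch` and `RepairTaskSketch` are hypothetical classes, the counter stands in for the real segment-completion call, and the in-memory `paused` flag stands in for whatever persistent marker (for example in ZK) is eventually chosen.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a per-partition consumer; not the Pinot class.
final class PartitionConsumerSketch {
    private final AtomicBoolean completionRequested = new AtomicBoolean(false);
    private final AtomicInteger completionCalls = new AtomicInteger();

    // May be called by both a pause request and an ONLINE transition.
    // compareAndSet lets only the first caller initiate the completion.
    boolean requestCompletion() {
        if (completionRequested.compareAndSet(false, true)) {
            completionCalls.incrementAndGet(); // placeholder for the real completion call
            return true;
        }
        return false; // someone else already initiated completion
    }

    int completionCalls() {
        return completionCalls.get();
    }
}

// Hypothetical stand-in for the periodic task that repairs missing consuming segments.
final class RepairTaskSketch {
    // Assumption: a real implementation would read this from persistent state,
    // so that it survives controller and server restarts.
    private final AtomicBoolean paused = new AtomicBoolean(false);

    void pause() {
        paused.set(true);
    }

    void resume() {
        paused.set(false);
    }

    // Recreate a missing consuming segment only when the table is not paused.
    boolean shouldRecreateConsumingSegment(boolean consumingSegmentMissing) {
        return consumingSegmentMissing && !paused.get();
    }
}
```

   The guard only shows the "exactly one call" part; it does not decide which of the two callers should win when an ONLINE transition races with a pause, which is the open question above.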
   
   I am sure people can add more.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


