mcvsubbu commented on issue #6302:
URL: https://github.com/apache/pinot/issues/6302#issuecomment-896474846


   I suggest providing an option to either discard all the consuming 
segments or complete all of them. I feel that handling server restarts when 
a segment is paused partway through consumption will be hard. 
   
   Take the case where a server starts consuming a segment at offset 100 and 
there are two replicas, A and B. After some time, a "pause" command is issued. 
Replica A is at 150 and Replica B is at 160; both stop consuming, with the 
rows still in memory.
   
   Now A gets restarted, losing its in-memory rows. Ingestion into the stream 
has continued, so the latest offset available in the stream is 200. 
   
   Should A just serve data up to 100 (i.e., only data that has been 
committed) while B serves data up to 160?
   Should A consume up to 150 and stop?
   Should A consume up to 160 and stop? What if there are three replicas?
   
   It gets harder to maintain state.
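
   To make the divergence concrete, here is a toy model of the scenario above. It is not Pinot code: `servable_offset` and the other names are hypothetical, and it assumes a restarted replica loses its uncommitted in-memory rows and falls back to the last committed offset.

```python
# Toy model of the two-replica scenario; all names are hypothetical.
COMMITTED_OFFSET = 100  # the consuming segment starts here; earlier data is committed


def servable_offset(in_memory_offset, restarted):
    """Offset up to which a replica can serve data.

    Assumption: a restart drops the uncommitted in-memory rows, so the
    replica can only serve what has been committed.
    """
    return COMMITTED_OFFSET if restarted else in_memory_offset


paused_at = {"A": 150, "B": 160}  # offsets where each replica stopped on "pause"

before_restart = {r: servable_offset(o, restarted=False) for r, o in paused_at.items()}
after_restart = {r: servable_offset(o, restarted=(r == "A")) for r, o in paused_at.items()}

print(before_restart)  # {'A': 150, 'B': 160}
print(after_restart)   # {'A': 100, 'B': 160}
```

   The replicas already disagree before the restart, and the restart widens the gap unless some extra per-replica pause state is persisted.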
   
   We can provide the operator an option to either complete the current 
consuming segment or discard it. Either way, we "pause" at a point where 
everything is committed.
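
   A minimal sketch of the two operator options, under stated assumptions. `PausePolicy` and `pause` are hypothetical names, not Pinot's API, and the "complete" branch assumes the segment is committed at the furthest offset any replica has reached, with the other replicas ending up at that same offset.

```python
# Hypothetical sketch of the proposed pause options; not Pinot's actual API.
from enum import Enum


class PausePolicy(Enum):
    COMPLETE = "complete"  # commit the consuming segment, then stop
    DISCARD = "discard"    # drop the uncommitted rows, then stop


def pause(committed_offset, replica_offsets, policy):
    """Return the offset each replica serves up to once the pause takes effect."""
    if policy is PausePolicy.COMPLETE:
        # Assumption: the segment is committed at the furthest consumed offset.
        pause_point = max(replica_offsets.values())
    else:
        # Discarding falls back to the last committed offset.
        pause_point = committed_offset
    # Either way, every replica ends at the same, fully committed offset.
    return {replica: pause_point for replica in replica_offsets}


offsets = {"A": 150, "B": 160}
print(pause(100, offsets, PausePolicy.COMPLETE))  # {'A': 160, 'B': 160}
print(pause(100, offsets, PausePolicy.DISCARD))   # {'A': 100, 'B': 100}
```

   With either policy a restarted replica has nothing extra to reconcile, because the pause point is always a committed offset.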


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


