mcvsubbu commented on issue #6302: URL: https://github.com/apache/pinot/issues/6302#issuecomment-896474846
I suggest providing an option to either discard all the consuming segments or complete all of them. I feel that handling server restarts when a segment is paused halfway through consumption will be hard.

Take the case where the servers start to consume a segment at offset 100, and there are two replicas, A and B. After some time, a "pause" command is issued. Replica A is at offset 150 and replica B is at 160; both stop consuming, with the rows still in memory. Now A gets restarted. Ingestion into the stream has continued, so the latest offset available in the stream is 200.

Should A just serve data up to offset 100 (i.e., the data that has been committed) while B serves data up to 160? Should A consume up to 150 and stop? Should A consume up to 160 and stop? What if there are three replicas? It gets harder to maintain state.

We can give the operator an option to either complete the current consuming segment or discard it. Either way, we "pause" at a point where everything is committed.
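To make the scenario concrete, here is a minimal Python sketch of the offsets involved. It is a toy model, not Pinot code: `Replica`, `restart`, `pause_with_discard`, and `pause_with_complete` are hypothetical names, and "complete" is modeled with one simple assumed policy (every replica ends at the highest offset any replica has reached, and that offset is committed).

```python
from dataclasses import dataclass

COMMITTED_OFFSET = 100  # end of the last committed segment in the example


@dataclass
class Replica:
    name: str
    consumed_offset: int  # data up to this offset is servable by the replica

    def restart(self, committed_offset: int) -> None:
        # Rows of a consuming segment live only in memory, so a restart
        # drops the replica back to the last committed offset.
        self.consumed_offset = committed_offset


def servable_offsets(replicas):
    """Return the offset each replica can currently serve up to."""
    return {r.name: r.consumed_offset for r in replicas}


def pause_with_discard(replicas, committed_offset):
    """Assumed 'discard' option: drop uncommitted rows on every replica."""
    for r in replicas:
        r.consumed_offset = committed_offset
    return committed_offset


def pause_with_complete(replicas):
    """Assumed 'complete' option: commit at the furthest offset reached."""
    end_offset = max(r.consumed_offset for r in replicas)
    for r in replicas:
        r.consumed_offset = end_offset
    return end_offset


if __name__ == "__main__":
    # Pause mid-segment, then restart A: the replicas now disagree.
    a, b = Replica("A", 150), Replica("B", 160)
    a.restart(COMMITTED_OFFSET)
    print("mid-segment pause, A restarted:", servable_offsets([a, b]))

    # Pause at a committed point: a restart changes nothing.
    a, b = Replica("A", 150), Replica("B", 160)
    committed = pause_with_complete([a, b])
    a.restart(committed)
    print("pause after completing:", servable_offsets([a, b]))
```

With either option, the offset a restarted replica recovers to is the same offset the other replicas are serving, so there is no per-replica pause state to track.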
