mcvsubbu commented on PR #12045: URL: https://github.com/apache/pinot/pull/12045#issuecomment-1830742824
> I'm good with the change, and want to start a discussion on how we want to handle consumption error at high level. > > Error can occur when: > > 1. Consuming segment is created (with state transition) > > 2. During consumption, including segment commit (no state transition) > > 3. Post segment commit, switching from CONSUMING to ONLINE (with state transition) > > > Currently: > > * Server notifies controller when 1 or 2 happens, and controller changes IS to OFFLINE > > * When 1 or 3 happens, segment moves to ERROR state in EV > > > The current behavior can leave segment in 3 conditions, and IMO quite hard to manage: > > 1. IS: OFFLINE, EV: ERROR > > 2. IS: OFFLINE, EV: CONSUMING -> OFFLINE > > 3. IS: ONLINE, EV: ERROR > > > What this PR is trying to fix is when 1 happens: IS: OFFLINE, EV: ERROR -> OFFLINE (by segment reset). A easier way to fix this might be to not throw exception after notifying the controller, and handle 1 & 2 using the same mechanism: IS: OFFLINE, EV: CONSUMING -> OFFLINE. This way we do not need to wait for 30 seconds before issuing the segment reset. Sure, but that could result in queries going to the server when the segment does not exist. The earlier fix did not change the EV to OFFLINE all the time. We found instances when EV was in ERROR state, and the IS was in OFFLINE state. Anyways, I have added integration tests so that any other fix we adopt can still be tested locally at least. > > I also want to discuss if we want to mark IS as OFFLINE so that we can auto-recover. If we can make 2 changing EV to ERROR, then we may use segment reset to fix the consumption error (this can also apply to OFFLINE table). This approach has 2 benefits: > > * More general, and can fix both CONSUMING and ONLINE segments > > * IS only contains ONLINE/CONSUMING state, which makes other code much error-prune. Contributors usually miss the handling of OFFLINE state in IS because they won't know it can exist Open to discussions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
