haibow commented on issue #151: Feature request: realtime table bouncing
URL: https://github.com/apache/incubator-pinot/issues/151#issuecomment-525983999
 
 
   @mcvsubbu Thanks for your input!
   
   In our environment, around 2/3 of our production tables are REALTIME, and we 
are seeing more and more requests to evolve schemas, especially with our recent 
efforts to onboard customers in a self-serve mode. 
   
   Our flush thresholds are 5 hours by default, by which time the new schema 
should be picked up. The problem is when to reload segments. If we reload 
segments right after the schema change (by calling the reload endpoint from the 
onboarding portal when the user bumps the Kafka schema version), the consuming 
segments keep consuming and **seal with the old schema**. This results in a 
schema mismatch at query time, causing that segment to be dropped, since it is 
the only segment with the old schema: old segments have been reloaded with the 
new schema, and new segments pick up the new schema as well. Reloading segments 
X hours after the schema change could solve the issue, but it adds operational 
overhead.
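
   A toy model of this sequence, as a sketch only: the `Segment` class, the 
`reload_all` helper, and the column names are hypothetical stand-ins, not Pinot 
classes or APIs. It shows how a reload issued right after the schema change 
leaves exactly one segment on the old schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    columns: frozenset       # columns the segment was built with
    consuming: bool = False  # True while the segment is still ingesting


def reload_all(segments, schema):
    """Toy reload: completed segments pick up the schema, consuming ones do not."""
    for seg in segments:
        if not seg.consuming:
            seg.columns = schema


old_schema = frozenset({"id", "ts"})
new_schema = frozenset({"id", "ts", "country"})

segments = [
    Segment("seg_0", old_schema),
    Segment("seg_1", old_schema),
    Segment("seg_2", old_schema, consuming=True),  # started before the change
]

# Reload right after the schema change: only completed segments are updated.
reload_all(segments, new_schema)

# The consuming segment later seals with the schema it started with.
segments[2].consuming = False

# A segment created after the change starts with the new schema.
segments.append(Segment("seg_3", new_schema, consuming=True))

mismatched = [s.name for s in segments if s.columns != new_schema]
print(mismatched)
```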
   
   So, if we can force-flush the consuming segments upon a schema change, then 
once the new segments start consuming with the new schema, we can call the 
reload endpoint on all old segments. That would both solve the schema mismatch 
issue and allow consumption with the new schema right away.
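
   The ordering proposed here could be sketched roughly as follows. The three 
callables are hypothetical hooks that the onboarding portal would have to 
supply; none of them is an existing Pinot API, and the polling interval and 
timeout are arbitrary assumptions.

```python
import time
from typing import Callable


def bounce_after_schema_change(
    force_flush: Callable[[], None],
    new_segments_consuming: Callable[[], bool],
    reload_old_segments: Callable[[], None],
    poll_interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> bool:
    """Flush consuming segments, wait for their successors, then reload.

    All three callables are hypothetical hooks supplied by the caller.
    Returns True if the reload was issued, False if the wait timed out.
    """
    force_flush()
    deadline = time.monotonic() + timeout_s
    while not new_segments_consuming():
        if time.monotonic() >= deadline:
            return False  # never reload before the new segments are consuming
        time.sleep(poll_interval_s)
    reload_old_segments()
    return True


# Usage with fake hooks: the new segments show up on the third poll.
calls = []
polls = iter([False, False, True])
ok = bounce_after_schema_change(
    force_flush=lambda: calls.append("flush"),
    new_segments_consuming=lambda: next(polls),
    reload_old_segments=lambda: calls.append("reload"),
    poll_interval_s=0.0,
)
```

   The point of the wait loop is that the reload is only issued once no 
consuming segment can still seal with the old schema.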
   
   If you anticipate issues with force-flushing segments across partitions, we 
can instead manage customers' expectations on the time it takes to pick up the 
new schema and address only the schema mismatch issue. For example, upon 
completion of the consuming segment, could we fetch the latest schema and add 
the new column(s) to the mutable segment?
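
   As a rough illustration of that second option, assuming a segment can be 
modelled as a list of row dicts and a schema as a mapping from column name to 
default value (both are simplifications made up for this sketch, not Pinot 
data structures):

```python
def add_missing_columns(rows, latest_schema):
    """Return rows padded with default values for columns they lack.

    rows: list of dicts, a toy stand-in for a mutable segment's records.
    latest_schema: dict of column name -> default value (hypothetical shape).
    Only columns present in latest_schema are kept in the result.
    """
    return [
        {col: row.get(col, default) for col, default in latest_schema.items()}
        for row in rows
    ]


# Rows consumed under the old schema, before the "country" column existed.
consumed = [{"id": 1, "ts": 100}, {"id": 2, "ts": 200}]

# Latest schema fetched at segment completion, with an assumed default value.
latest = {"id": 0, "ts": 0, "country": "unknown"}

sealed = add_missing_columns(consumed, latest)
print(sealed)
```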

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

