haibow commented on issue #4225: Make Pinot schema evolution easier
URL: https://github.com/apache/incubator-pinot/issues/4225#issuecomment-521824275

Reload does not seem to work well with REALTIME tables: the table ends up with segments that have inconsistent schemas.

For realtime tables, after updating the schema and calling the `reload` endpoint, all ONLINE segments are reloaded with the new schema, but the CONSUMING segment is skipped. As a result, the consuming segment keeps consuming with the old schema and eventually seals with it.

Tested on a realtime table with LLC (code last checked in from master on [04/11/19](https://github.com/apache/incubator-pinot/commit/26330f3a2c3309e7cf574e1fff86a1de9fb934ff)). The segment that was consuming at the time of the reload is the only segment left with the old schema, both while it is in CONSUMING state and later in ONLINE state.

Impact:
- When querying data within a small time range after the time of the reload, the field added in the new schema is not returned in the query result.
- When querying data over a somewhat larger time range, we see messages like the one below:
`MergeResponseError: responses for table: $table from servers: [$server1, server2] got dropped due to data schema inconsistency.`

Reloading the table/segment again after the consuming segment seals reloads it with the new schema and brings the whole table back to a healthy state, but this is operationally inefficient. So it looks more like a bug now. We might need to revisit approaches like:
- flushing the consuming segment (and reloading)
- adding the new columns in memory and refreshing the schema before consuming more rows (without a forced flush)

@mayankshriv @Jackie-Jiang thoughts?
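
To make the sequence above concrete, here is a minimal toy model of the reported behavior. It is not Pinot code: `Segment`, `reload_table`, and `stale_segments` are hypothetical names invented for illustration, and the column names are made up. It only assumes what is described above, namely that reload applies the new schema to ONLINE segments and skips the CONSUMING one.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    state: str           # "ONLINE" or "CONSUMING"
    columns: frozenset   # columns the segment currently exposes


def reload_table(segments, new_columns):
    """Toy model of the reported reload: only ONLINE segments get the new schema."""
    for seg in segments:
        if seg.state == "ONLINE":
            seg.columns = frozenset(new_columns)
        # CONSUMING segments are skipped and keep the old schema.


def stale_segments(segments, expected_columns):
    """Names of segments whose columns differ from the expected schema."""
    expected = frozenset(expected_columns)
    return [seg.name for seg in segments if seg.columns != expected]


old_schema = frozenset({"id", "ts"})
new_schema = old_schema | {"country"}  # hypothetical new column

table = [
    Segment("seg_0", "ONLINE", old_schema),
    Segment("seg_1", "ONLINE", old_schema),
    Segment("seg_2", "CONSUMING", old_schema),
]

# First reload: the consuming segment is left behind.
reload_table(table, new_schema)
after_first_reload = stale_segments(table, new_schema)

# The consuming segment seals (still with the old schema), then a second reload fixes it.
table[2].state = "ONLINE"
reload_table(table, new_schema)
after_second_reload = stale_segments(table, new_schema)

print(after_first_reload, after_second_reload)
```

In this model the first reload leaves exactly one stale segment, which corresponds to the window in which queries can hit the schema inconsistency, and only the second reload (after sealing) clears it.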