anuragrai16 commented on PR #17380:
URL: https://github.com/apache/pinot/pull/17380#issuecomment-3698494110

   > > > We need to discuss when we should use data CRC instead of index CRC, 
and what the side effects are. When using data CRC, an index-only change in the 
deep store (i.e. a new index being added) won't be honored. This could prevent 
users from creating the index from minion to reduce index creation on the 
server. Given we want to solve the problem of real-time committed segments 
potentially having different CRCs, I feel a better way to address this is to 
add a flag in ZK metadata to indicate that we can check only the data CRC. This 
flag exists only in committed segments, not in segments pushed from other 
ingestion flows.
   > > 
   > > 
   > > Thanks @Jackie-Jiang, we can do that. For my understanding: in the 
current code, I'm only using data CRC in `doAddOnlineSegment` of the classes 
`OfflineTableDataManager` and `RealtimeTableDataManager`, which are called in 
the Helix state transitions `onBecomeOnlineFromConsuming` and 
`onBecomeOnlineFromOffline`.
   > > For the other flows, reload segment and replace segments (used by 
minions), data CRC is not used. So, in a way, is the code already handling this 
point? Or are there other flows that might be accidentally included in this?
   > 
   > These Helix state transitions apply to a lot of scenarios, not only to 
committed segments. E.g. when a server starts, all segments are loaded 
through these 2 state transitions, which will make the server ignore changes 
applied by minions.
   
   
   @Jackie-Jiang Makes sense. I updated the logic to persist an optional ZK 
flag for realtime committing segments; the flag tells the replicas to use 
data CRC for replace. PTAL. 
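
A minimal sketch of the flag-based check discussed in this thread, for illustration only. Every name here (`DataCrcFlagSketch`, `ZkSegmentMetadata`, `LocalSegment`, `useDataCrcForReplace`, `canKeepLocalCopy`) is hypothetical and not taken from the Pinot codebase or from this PR. It assumes the server keeps its local copy when the relevant CRC matches, comparing data CRC only when the flag is present and the index CRC otherwise.

```java
/** Illustrative only: none of these types or names come from the Pinot codebase. */
public class DataCrcFlagSketch {

  /** Hypothetical stand-in for the CRC fields kept in a segment's ZK metadata. */
  static final class ZkSegmentMetadata {
    final long indexCrc;
    final long dataCrc;
    /** Assumed optional flag, set only for realtime committed segments. */
    final boolean useDataCrcForReplace;

    ZkSegmentMetadata(long indexCrc, long dataCrc, boolean useDataCrcForReplace) {
      this.indexCrc = indexCrc;
      this.dataCrc = dataCrc;
      this.useDataCrcForReplace = useDataCrcForReplace;
    }
  }

  /** Hypothetical stand-in for the CRCs of the segment copy already on the server. */
  static final class LocalSegment {
    final long indexCrc;
    final long dataCrc;

    LocalSegment(long indexCrc, long dataCrc) {
      this.indexCrc = indexCrc;
      this.dataCrc = dataCrc;
    }
  }

  /** Returns true when the local copy can be kept instead of being replaced. */
  static boolean canKeepLocalCopy(ZkSegmentMetadata zk, LocalSegment local) {
    if (zk.useDataCrcForReplace) {
      // Flag present: replicas may have built different index bytes over the same data.
      return zk.dataCrc == local.dataCrc;
    }
    // Flag absent: compare index CRC so index-only changes in deep store are honored.
    return zk.indexCrc == local.indexCrc;
  }

  public static void main(String[] args) {
    LocalSegment replica = new LocalSegment(222L, 999L);

    // Committed realtime segment: index CRC differs, data CRC matches, flag is set.
    ZkSegmentMetadata committed = new ZkSegmentMetadata(111L, 999L, true);
    System.out.println(canKeepLocalCopy(committed, replica)); // true

    // Segment changed by another flow (e.g. a new index added): no flag, index CRC differs.
    ZkSegmentMetadata pushed = new ZkSegmentMetadata(333L, 999L, false);
    System.out.println(canKeepLocalCopy(pushed, replica)); // false
  }
}
```

Because the decision depends on the per-segment flag rather than on which state transition is running, a server restart would still pick up index changes applied by minions for segments that do not carry the flag.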


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

