klsince commented on issue #7320:
URL: https://github.com/apache/pinot/issues/7320#issuecomment-901581362


   Re 2, local index cleanup runs as part of segment reloading. The cleanup 
happens in the same thread that reloads the segment, and it relies on the 
existing failure handling of segment reloading (as in 
[code](https://github.com/apache/pinot/blob/master/pinot-server/src/main/java/org/apache/pinot/server/starter/helix/HelixInstanceDataManager.java#L276))
 to keep the on-disk state consistent on failure and to swap in the cleaned 
segment atomically. Because the swap is atomic, queries can continue to run 
against the existing segment until the new one is swapped in.
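
   To illustrate the swap semantics only, here is a minimal sketch. It is 
not the code in HelixInstanceDataManager; the class `SegmentSwapSketch` and 
its methods are hypothetical names made up for this example. It just shows 
the general pattern: build the cleaned segment on the side, publish it with a 
single reference update, and leave the old one in place if anything fails.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical holder for the segment that queries currently see. */
public class SegmentSwapSketch<T> {
  private final AtomicReference<T> current;

  public SegmentSwapSketch(T initial) {
    this.current = new AtomicReference<>(initial);
  }

  /** Queries always read whatever segment is currently published. */
  public T query() {
    return current.get();
  }

  /**
   * Rebuilds the segment off to the side and publishes it in one step.
   * If the rebuild throws, the old segment stays published untouched.
   */
  public boolean reload(UnaryOperator<T> rebuild) {
    T old = current.get();
    try {
      T rebuilt = rebuild.apply(old); // add/remove indices on a private copy
      current.set(rebuilt);           // single atomic publish
      return true;
    } catch (RuntimeException e) {
      return false;                   // failure: queries keep using 'old'
    }
  }
}
```

   A failed `reload` returns false and `query()` keeps returning the 
previous segment, which is the behavior described above.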
   
   Just like adding new indices during segment reloading, removing indices is 
done inside a segment folder that is **not** accessed by ongoing queries, so 
it's safe to modify the files in that folder. To clean up, we simply copy the 
indices defined in the table config into a temp file, then rename it back to 
the index file, effectively removing those no longer set in the table config. 
In the implementation, we use transferTo() for the copy, as in [the 
PR](https://github.com/apache/pinot/pull/7301/files#diff-52f126f7138a706a5fddb9257af1c558c4623269bc69308212a77c06021cbef7R433).
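
   The copy-then-rename step can be sketched roughly as below. This is a 
simplified illustration, not the code from the PR: `IndexCleanupSketch`, 
`Entry`, and the (offset, size) bookkeeping are assumptions made up for the 
example. Only `FileChannel.transferTo()` and `Files.move()` are real JDK 
calls.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class IndexCleanupSketch {
  /** Hypothetical descriptor: where one index lives inside the index file. */
  public static final class Entry {
    public final long offset;
    public final long size;

    public Entry(long offset, long size) {
      this.offset = offset;
      this.size = size;
    }
  }

  /**
   * Copies only the indices named in 'keep' into a temp file, then renames
   * the temp file over the index file. Returns the new offsets.
   */
  public static Map<String, Entry> cleanup(Path indexFile, Map<String, Entry> entries,
      Set<String> keep) throws IOException {
    Path tmp = indexFile.resolveSibling(indexFile.getFileName() + ".tmp");
    Map<String, Entry> kept = new LinkedHashMap<>();
    try (FileChannel src = FileChannel.open(indexFile, StandardOpenOption.READ);
        FileChannel dst = FileChannel.open(tmp, StandardOpenOption.CREATE,
            StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
      long writePos = 0;
      for (Map.Entry<String, Entry> e : entries.entrySet()) {
        if (!keep.contains(e.getKey())) {
          continue; // index no longer in the table config: skip it
        }
        Entry old = e.getValue();
        long copied = 0;
        // transferTo() may copy fewer bytes than requested, so loop.
        while (copied < old.size) {
          long n = src.transferTo(old.offset + copied, old.size - copied, dst);
          if (n <= 0) {
            throw new IOException("Unexpected end of index file for " + e.getKey());
          }
          copied += n;
        }
        kept.put(e.getKey(), new Entry(writePos, old.size));
        writePos += old.size;
      }
    }
    // Rename the temp file back to the index file.
    Files.move(tmp, indexFile, StandardCopyOption.REPLACE_EXISTING);
    return kept;
  }
}
```

   Since the original index file is only replaced by the final rename, a 
failure during the copy leaves it as it was.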
   
   Besides, cleanup happens after adding new indices so that newly created 
indices are kept after cleanup. In the implementation, cleanup just needs to 
run after all PinotDataBuffers are closed, as in [the 
PR](https://github.com/apache/pinot/pull/7301/files#diff-52f126f7138a706a5fddb9257af1c558c4623269bc69308212a77c06021cbef7R365).
 
   
   Hope this helps clarify a bit more, and feel free to comment on the PR. Thanks!
   
   Re 1, I assume it's not always possible to ssh into the servers to delete 
segments, so an option to force a segment download via the Pinot RESTful API 
would be convenient. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


