ankitsultana opened a new issue, #10243:
URL: https://github.com/apache/pinot/issues/10243

   At present, the segment commit flow makes a `PinotFS::listFiles` call. 
This happens in `PinotLLCRealtimeSegmentManager`:
   
   
https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java#L494
   
   For realtime tables with high ingestion throughput, segment commits are 
quite frequent (hundreds per minute). Such tables also have a large number of 
segments (tens of thousands), which makes this `listFiles` call costly. It can 
not only increase ingestion latency but also put pressure on the underlying 
file system used by PinotFS.
   
   There are two options to eliminate this:
   
   1. If the tmp file path is deterministic, avoid the listFiles call and 
directly delete the file
   2. Make the listFiles call async and run in the background.
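
   A minimal sketch of the two options, using only `java.nio.file` on a local 
directory rather than the actual PinotFS API. The class name, the method names 
and the `<segmentName>.tmp.<suffix>` naming scheme are assumptions made for 
illustration; they are not taken from the Pinot codebase.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical helper, not part of Pinot.
class TmpSegmentCleanup {

  // Assumed naming scheme: "<segmentName>.tmp.<suffix>".
  static Path tmpPathFor(Path tableDir, String segmentName, String suffix) {
    return tableDir.resolve(segmentName + ".tmp." + suffix);
  }

  // Option 1: the tmp path is known up front, so delete it directly
  // without listing the directory. Returns true if a file was removed.
  static boolean deleteDeterministic(Path tableDir, String segmentName, String suffix)
      throws IOException {
    return Files.deleteIfExists(tmpPathFor(tableDir, segmentName, suffix));
  }

  // Option 2: keep the listing, but run it on a background executor so the
  // commit path does not wait for it. Completes with the number of files deleted.
  static CompletableFuture<Integer> cleanupAsync(
      Path tableDir, String segmentName, ExecutorService executor) {
    String prefix = segmentName + ".tmp.";
    return CompletableFuture.supplyAsync(() -> {
      int deleted = 0;
      // Only entries whose file name starts with the tmp prefix are visited.
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(
          tableDir, p -> p.getFileName().toString().startsWith(prefix))) {
        for (Path p : stream) {
          if (Files.deleteIfExists(p)) {
            deleted++;
          }
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      return deleted;
    }, executor);
  }
}
```

   Option 1 removes the listing cost entirely but only works if the committer 
can reconstruct the exact tmp path. Option 2 keeps the same load on the 
underlying file system and only takes it off the commit path.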
   
   cc: @Jackie-Jiang 


