Jackie-Jiang commented on issue #4741: Make the download url of a realtime LLC 
segment as a list of urls.
URL: https://github.com/apache/incubator-pinot/pull/4741#issuecomment-546092366
 
 
   Let's hold off on this change. I'm against putting the server URL into the 
segment ZK metadata for the following reasons:
   - Servers should not modify segment ZK metadata; it should only be touched 
by the controller. We use segment ZK metadata as the segment lock, and if we 
allow servers to modify it, we might run into unforeseen race conditions.
   - Whenever a segment moves from one server to another, it is hard to keep 
the segment ZK metadata consistent with the segment location. Think of a 
rebalance, where all segments might be moved, causing multiple modifications 
to the segment ZK metadata of every segment in the table.
   - Storing the server URL alongside the deep storage URL implies that the 
server returns the segment in the same format as deep storage, which is not 
the case. The server keeps segments uncompressed, while deep storage stores 
them compressed.
   
   Instead, I am proposing the following approach:
   - Keep the download URI in segment ZK metadata pointing only to deep storage.
   - Introduce the concept of downloading a segment from a peer, which allows 
a server to download a segment from other servers serving the same segment if 
deep storage is down.
   - When deep storage is down, a server can read the table's external view, 
find other ONLINE servers with the segment, and construct the download URI 
from the ONLINE server's instance id.
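
   The peer lookup in the last bullet could look roughly like the sketch 
below. Everything in it is an assumption for illustration, not existing Pinot 
code: the external view is modeled as a plain map (segment name -> instance 
id -> state), the instance id is assumed to look like `Server_<host>_<port>`, 
and the class name, the admin port parameter, and the `/segments/...` path 
are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; names and URI layout are illustrative assumptions.
final class PeerSegmentLocator {
  private static final String SERVER_PREFIX = "Server_";

  // Returns the instance ids of other servers that report the segment as ONLINE.
  // externalView is modeled as: segment name -> (instance id -> state).
  static List<String> onlinePeers(Map<String, Map<String, String>> externalView,
      String segmentName, String selfInstanceId) {
    List<String> peers = new ArrayList<>();
    Map<String, String> instanceStates = externalView.get(segmentName);
    if (instanceStates == null) {
      return peers;
    }
    // Copy into a TreeMap so the result order is deterministic.
    for (Map.Entry<String, String> entry
        : new TreeMap<String, String>(instanceStates).entrySet()) {
      boolean isOnline = "ONLINE".equals(entry.getValue());
      boolean isSelf = entry.getKey().equals(selfInstanceId);
      if (isOnline && !isSelf) {
        peers.add(entry.getKey());
      }
    }
    return peers;
  }

  // Builds a download URI from an instance id assumed to be "Server_<host>_<port>".
  // The admin port and the URI path are assumptions, passed in or hard-coded here.
  static String peerDownloadUri(String instanceId, int adminPort,
      String tableName, String segmentName) {
    int lastUnderscore = instanceId.lastIndexOf('_');
    if (!instanceId.startsWith(SERVER_PREFIX)
        || lastUnderscore < SERVER_PREFIX.length()) {
      throw new IllegalArgumentException("Unexpected instance id: " + instanceId);
    }
    String host = instanceId.substring(SERVER_PREFIX.length(), lastUnderscore);
    return "http://" + host + ":" + adminPort + "/segments/" + tableName + "/"
        + segmentName;
  }
}
```

   Because the peers are derived from the external view at download time, 
nothing has to be written back to the segment ZK metadata.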
   
   With the new approach:
   - No need to touch segment ZK metadata
   - The logic is cleaner: first try deep storage, and if that fails, fall 
back to peer download.
   - Segment move can be automatically picked up by the external view
   - When downloading a segment from a peer, always expect it to be 
uncompressed, so there is no need to handle both compressed and uncompressed 
segments.
   - Offline tables can also benefit from peer download.
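
   A minimal sketch of the "deep storage first, then peers" order, under 
stated assumptions: the `Fetcher` interface and the class name are 
hypothetical, and how the peer URIs are obtained is left to the caller.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual Pinot segment fetcher API.
final class SegmentFetchWithPeerFallback {
  // Assumed abstraction over "download the segment at this URI".
  interface Fetcher {
    void fetch(String uri) throws IOException;
  }

  // Tries deep storage first, then each peer in order.
  // Returns the URI that succeeded, or throws if every candidate failed.
  static String download(String deepStoreUri, List<String> peerUris,
      Fetcher fetcher) throws IOException {
    List<String> candidates = new ArrayList<>();
    candidates.add(deepStoreUri);
    candidates.addAll(peerUris);

    IOException lastFailure = null;
    for (String uri : candidates) {
      try {
        fetcher.fetch(uri);
        return uri;
      } catch (IOException e) {
        // Remember the failure and move on to the next candidate.
        lastFailure = e;
      }
    }
    throw new IOException(
        "Segment download failed from deep storage and all peers", lastFailure);
  }
}
```

   The caller would still need to know which source succeeded, since a 
segment from deep storage is compressed while one from a peer is not; that 
is why the sketch returns the winning URI.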

