xiangfu0 commented on issue #7275:
URL: https://github.com/apache/pinot/issues/7275#issuecomment-898986667


   > Can you add some more specific requirements? Are you looking for a no 
downtime move?
   I think we should have a no-downtime solution for this.
   
   There are two types of deep store access patterns:
   1. Via the controller: segments are typically stored on the controller, 
either on a local disk or on a mounted remote disk (NAS, EBS, ...).
   2. Directly from a remote store, e.g. S3/HDFS/WebHDFS/GCS/ADLS.
   
   So there are four scenarios (Type I and Type II are patterns 1 and 2 above):
   A. Type I -> Type I:
   If we want to switch to an instance with a bigger disk, bring down one 
controller instance, rsync the segment data from the old controller instance 
to the new one, then update the download URL in the segment metadata with the 
new instance host.
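
   A minimal sketch of the last step in scenario A, swapping the host in a 
stored download URL. The function name and the example URL layout are 
assumptions made for illustration; this is not Pinot's API, and writing the 
result back to the segment metadata is left out.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_download_host(download_url: str, old_host: str, new_host: str) -> str:
    """Return download_url with old_host replaced by new_host.

    URLs that point at a different host are returned unchanged.
    Hypothetical helper: any user-info part of the URL is not preserved.
    """
    parts = urlsplit(download_url)
    # urlsplit lowercases the hostname, so compare case-insensitively.
    if parts.hostname != old_host.lower():
        return download_url
    # Keep an explicit port if the original URL had one.
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

   Only the host changes; the path stays the same, which is why the rsync 
has to preserve the on-disk layout.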
   
   B. Type I -> Type II:
   To achieve a no-downtime migration, the Pinot controller needs to switch 
to the new deep store first, so that all new segments use the new deep store 
download URIs. Meanwhile, clients can still download existing segments from 
the controller with the old download URIs.
   
   The gap here is that we should allow users to configure multiple deep stores 
and read/write from them.
   https://github.com/apache/pinot/issues/7302 is a prerequisite.
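
   To illustrate what reading from multiple deep stores could look like 
during the transition, here is a rough sketch that picks a store by the 
scheme of the download URI. The `stores` mapping and the function name are 
hypothetical; the actual design belongs to the issue linked above.

```python
from urllib.parse import urlsplit


def select_store(download_uri: str, stores: dict, default_scheme: str = "http"):
    """Pick the configured store whose key matches the URI scheme.

    `stores` maps a scheme such as "http" or "s3" to whatever object
    handles that store (hypothetical; plain values work for the sketch).
    """
    scheme = urlsplit(download_uri).scheme or default_scheme
    try:
        return stores[scheme]
    except KeyError:
        raise ValueError(f"no deep store configured for scheme {scheme!r}") from None
```

   With both the controller and the remote store registered, old and new 
download URIs resolve side by side while segments are being moved.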
   
   C. Type II -> Type I:
   Please think twice and don't do it.
   
   D. Type II -> Type II:
   This is simple. We can achieve this right now by:
   - Looping over all segments of all tables
   - Downloading each segment and copying it to the new deep store (keeping 
the relative path)
   - Updating the download URI in the segment metadata.
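
   The loop above can be sketched as follows. This is an illustration under 
assumptions, not Pinot code: `copy_segment` and `update_download_uri` are 
hypothetical callables supplied by the caller, and `segments` is assumed to 
map each segment name to its current download URI.

```python
from typing import Callable, Dict


def relocate_uri(download_uri: str, old_prefix: str, new_prefix: str) -> str:
    """Map a URI under old_prefix to the same relative path under new_prefix."""
    old = old_prefix.rstrip("/") + "/"
    if not download_uri.startswith(old):
        raise ValueError(f"{download_uri!r} is not under {old_prefix!r}")
    relative_path = download_uri[len(old):]
    return new_prefix.rstrip("/") + "/" + relative_path


def migrate_segments(
    segments: Dict[str, str],
    old_prefix: str,
    new_prefix: str,
    copy_segment: Callable[[str, str], None],
    update_download_uri: Callable[[str, str], None],
) -> Dict[str, str]:
    """Copy every segment to the new store, then repoint its metadata."""
    migrated = {}
    for name, old_uri in segments.items():
        new_uri = relocate_uri(old_uri, old_prefix, new_prefix)
        # Copy first; the metadata is only updated once the copy returned.
        copy_segment(old_uri, new_uri)
        update_download_uri(name, new_uri)
        migrated[name] = new_uri
    return migrated
```

   Copying before updating means a failed copy leaves the old download URI 
in place, so the segment stays downloadable from the old store.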
   
   I think overall we need to:
   - Implement the missing feature: https://github.com/apache/pinot/issues/7302
   - Implement a primitive API that downloads a segment from its current 
download URI, uploads it to the new location, then updates the download URL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


