steFaiz commented on PR #8105:
URL: https://github.com/apache/paimon/pull/8105#issuecomment-4611109819

   > Are there any application scenarios?
   
   I think this is currently a rather temporary solution. The existing raw data 
tables (the original tables) are typically built on an ODPS + OSS pipeline, and 
there's a highly complex downstream dependency chain—for instance, dozens of 
ODPS tables might depend on this single raw table.
   
   After switching the original table to Paimon, the downstream ODPS tables 
cannot be replaced by Paimon immediately. We need to switch the whole chain 
gradually:
   
   For example:
   
   1. The original source, ODPS + OSS, is replaced by ODPS + Paimon:
      a. the original data is double-written to both Paimon and ODPS
      b. previously, ODPS stored structured columns + an OSS path
      c. now, ODPS stores structured columns + a Paimon BlobDescriptor
   2. Downstream ODPS tables only need to change their parsing logic:  
      from parsing OSS paths to parsing Paimon BlobDescriptors
   3. Gradually replace all ODPS tables with Paimon tables.
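
   The parsing change in step 2 might look roughly like the sketch below. It is only an illustration: `BlobRef`, `parse_oss_path`, `parse_blob_descriptor`, and the JSON encoding of the descriptor are hypothetical and are not Paimon's actual serialization format. The only assumption taken from Paimon is that a descriptor identifies a blob by a URI plus an offset and a length.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class BlobRef:
    """Hypothetical view of where a blob lives: a URI plus a byte range."""
    uri: str
    offset: int
    length: int  # -1 means "the whole object"


def parse_oss_path(value: str) -> BlobRef:
    """Old logic: the ODPS column holds a plain OSS path to a whole object."""
    parsed = urlparse(value)
    if parsed.scheme != "oss" or not parsed.netloc:
        raise ValueError(f"not an OSS path: {value!r}")
    return BlobRef(uri=value, offset=0, length=-1)


def parse_blob_descriptor(value: str) -> BlobRef:
    """New logic: the ODPS column holds a descriptor.

    The JSON layout here is an assumption made for this sketch only.
    """
    fields = json.loads(value)
    return BlobRef(
        uri=fields["uri"],
        offset=int(fields["offset"]),
        length=int(fields["length"]),
    )
```

   With something like this, a downstream job swaps only the parse function and keeps the rest of its read path unchanged.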
   
   BlobConsumer is only for the first step: after writing a batch of Paimon 
records, we can write the blob descriptors to ODPS immediately.
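
   As a rough illustration of that flow (not the actual BlobConsumer API in this PR), the sketch below uses hypothetical names throughout: `Descriptor`, `InMemoryBlobFile`, and `write_batch` stand in for the Paimon writer, and a plain list stands in for the ODPS sink.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Descriptor(NamedTuple):
    """Hypothetical stand-in for a blob descriptor."""
    uri: str
    offset: int
    length: int


class InMemoryBlobFile:
    """Hypothetical stand-in for a blob file that the writer appends to."""

    def __init__(self, uri: str) -> None:
        self.uri = uri
        self.data = bytearray()

    def append(self, payload: bytes) -> Descriptor:
        # Record where the payload starts, then append it to the file.
        offset = len(self.data)
        self.data.extend(payload)
        return Descriptor(self.uri, offset, len(payload))


def write_batch(
    rows: Iterable[Tuple[str, bytes]],
    blob_file: InMemoryBlobFile,
    consumer: Callable[[str, Descriptor], None],
) -> None:
    """Write each blob and hand its descriptor to the consumer callback."""
    for key, payload in rows:
        consumer(key, blob_file.append(payload))


# The list plays the role of the ODPS table that stores
# structured columns + descriptor.
odps_rows: List[Tuple[str, Descriptor]] = []
blob_file = InMemoryBlobFile("oss://bucket/table/blob-0")
write_batch(
    [("a", b"hello"), ("b", b"paimon!")],
    blob_file,
    lambda key, descriptor: odps_rows.append((key, descriptor)),
)
```

   The point of the callback is that the writer does not need to know about ODPS: whatever receives the descriptors decides where they go.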
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

