zhongyujiang commented on issue #3627: URL: https://github.com/apache/paimon/issues/3627#issuecomment-2211612314
Hi @BsoBird

Tiered storage is one of the application scenarios. Implementing multi-location management has additional benefits, such as enabling a smoother transition of data from Hadoop to cloud object storage. Of course, this also requires support for multi-location metadata management.

> You just need to regularly set the hdfs directory to a different storage policy, and then perform a compaction can be.

Are you referring to Paimon data compaction? I think that's something we want to avoid. Compaction requires a complete rewrite of the data, and the overhead of decoding and encoding is not trivial.

> When is the best time to do your data migrated to this path, it becomes a problem. There also seems to be a consistency issue to consider, what if for some other reason the same data files appear in both paths? It's an unavoidable question.

I believe data administrators can perform data migration operations on demand. Data consistency is indeed a consideration, but it can be achieved by using atomic swaps, so I don't think this is an unavoidable issue.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
