YannByron commented on PR #5436: URL: https://github.com/apache/hudi/pull/5436#issuecomment-1111127124
> In general, the design guideline to consider at first priority is not to double write, for these reasons:
>
> 1. The CDC detail records occupy several times the storage of the actual base data files. This is not acceptable for production, especially for a lake format; we already have active timeline commits for history snapshots.
> 2. The double write would obviously reduce the write throughput.
> 3. If we double write the log files, we need to handle the transaction between the data file completeness and the CDC logs. For example, if we write the log successfully but the data files fail, should we fail over, and how do we do the failover? Recover from the log files? There are many corner cases to handle, just as we already did for the metadata table.
> 4. What about the TTL of the log files: should it be managed separately from the data files? Say we keep the 10 latest commits for data files; should we also keep that many for log files? How do we clean them, and which component cleans them? The existing cleaning service? Note that the log data set is huge, so the cleaning must be efficient enough.

1. Now, we have two table types: COW and MOR. As a lake format, we need to weigh different concerns for the different table types, which are used in different scenarios. As I said above, we should focus more on query performance for COW tables and on write performance for MOR tables. Your solution in the Google doc does the same thing for both. If I understand your solution correctly, it needs a full join to detect the changes for COW. It is implemented with two time-travel queries, i.e., we need to load both versions of the file group even if just one record changed in the COW table (in most streaming cases, perhaps only a tiny fraction changes in one commit).
2. The write throughput is the main point for MOR. In most cases, we do not need to write out extra CDC files.
   The point at which the CDC files have to be generated is when the MOR table writes out a base file, not a log file. After all, in the normal case the MOR table also needs to rewrite base files; it does not always write to log files.
3. Hudi transactions are managed by the timeline. If writing the CDC files or the data files fails, the commit should not complete.
4. The management of log files stays as usual. Only the CDC files need to be cleaned in time by the clean service.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
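To make the cost argument about COW concrete, here is a minimal sketch of inferring CDC events by diffing two snapshots of a file group on the record key, as a full-join-style comparison would. Everything here is illustrative (plain dicts standing in for the two time-travel reads); these are not Hudi APIs.

```python
# Hypothetical sketch: inferring CDC records for a COW table by comparing two
# snapshot reads (the "two time-travel queries" approach). The key observation
# is that BOTH full versions of the file group must be loaded and scanned,
# even when only a single record actually changed.

def infer_cdc(before, after):
    """Diff two snapshots (dicts of record_key -> row) and emit CDC events
    as (op, key, old_row, new_row) tuples."""
    events = []
    # A full outer join on the record key: visit every key present in
    # either version, so the whole of both snapshots is traversed.
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old is None:
            events.append(("insert", key, None, new))
        elif new is None:
            events.append(("delete", key, old, None))
        elif old != new:
            events.append(("update", key, old, new))
    return events

v1 = {"k1": {"amt": 10}, "k2": {"amt": 20}, "k3": {"amt": 30}}
v2 = {"k1": {"amt": 10}, "k2": {"amt": 25}, "k4": {"amt": 40}}
# One update, one delete, one insert -- but all of v1 and v2 were scanned
# to find them, which is the overhead the comment above objects to.
print(sorted(infer_cdc(v1, v2)))
```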
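For the retention question in point 4, a "keep the N latest commits" policy for CDC files can be sketched as below. The file layout and function name are hypothetical, not Hudi's actual clean service.

```python
# Hypothetical sketch of retention-based cleaning for CDC files, mirroring the
# "keep the 10 latest commits" policy discussed above.

def plan_cdc_clean(cdc_files_by_commit, num_commits_retained=10):
    """Given {commit_time: [cdc file paths]}, return the files to delete so
    that only the newest `num_commits_retained` commits keep CDC files."""
    # Hudi-style commit times (yyyyMMddHHmmss) sort lexicographically.
    commits = sorted(cdc_files_by_commit)
    expired = commits[:-num_commits_retained] if num_commits_retained else commits
    return [f for c in expired for f in cdc_files_by_commit[c]]

files = {f"2022042{i}": [f"cdc_{i}.log"] for i in range(5)}  # 5 commits
print(plan_cdc_clean(files, num_commits_retained=3))  # drops the 2 oldest
```

Since the CDC log set can be large, a real implementation would fold this into the existing cleaning service rather than scan for expired files separately.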
