TheR1sing3un commented on PR #12627: URL: https://github.com/apache/hudi/pull/12627#issuecomment-2591556537
If I do not unify the naming rule on the Spark side first, a problem occurs in the following scenario: insert and then bulk_insert. In this case, file naming becomes inconsistent, and one bucket ends up with two file IDs:

<img width="1096" alt="image" src="https://github.com/user-attachments/assets/069fb234-5c84-41fa-b3c6-52b91a3c3b19" />
<img width="780" alt="image" src="https://github.com/user-attachments/assets/858e2622-ae45-488d-8a05-ca4ccf56ec54" />

@danny0405 How about we unify the Spark side in this PR first, and then unify Spark/Flink in the next PR?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
