tlchao opened a new issue #3417: URL: https://github.com/apache/iceberg/issues/3417
I run a job that uses the Java API to create an Iceberg partitioned table. The job appends data to the table every two minutes, which produces a large number of small files. I would like to merge the small files in a separate Spark job that runs in parallel with the writing job. However, I have found that the two operations produce a snapshot conflict: while the merge job is running, the writing job fails with "snap-xxxx.avro can't be found". The only workaround we have for now is to run the writing job and the merge job serially, which seems inefficient. Is there a better way? Thanks!
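
For illustration, the serial workaround described above could be modeled roughly as follows. This is only a sketch under stated assumptions: the class `SerialTableJobs`, the method `runExclusively`, and the two placeholder tasks are hypothetical names, the sleeps stand in for the real append and merge work, and no Iceberg or Spark calls are made. In the real setup the two jobs run as separate processes, so an in-process lock like this one would have to be replaced by some external coordination.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class SerialTableJobs {
    // Hypothetical shared lock: whichever job holds it is the only one touching the table.
    private final ReentrantLock tableLock = new ReentrantLock();
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger maxActive = new AtomicInteger();

    // Runs a job body while holding the lock and records how many jobs were active at once.
    void runExclusively(Runnable job) {
        tableLock.lock();
        try {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            try {
                job.run();
            } finally {
                active.decrementAndGet();
            }
        } finally {
            tableLock.unlock();
        }
    }

    int maxConcurrentJobs() {
        return maxActive.get();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submits interleaved append and merge tasks from two threads and
    // returns the highest number of tasks that ever ran at the same time.
    static int demo() {
        SerialTableJobs jobs = new SerialTableJobs();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Runnable appendBatch = () -> pause(5);      // placeholder for the periodic append
        Runnable mergeSmallFiles = () -> pause(20); // placeholder for the small-file merge
        for (int i = 0; i < 5; i++) {
            pool.execute(() -> jobs.runExclusively(appendBatch));
            pool.execute(() -> jobs.runExclusively(mergeSmallFiles));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return jobs.maxConcurrentJobs();
    }

    public static void main(String[] args) {
        System.out.println("max jobs running at once: " + demo());
    }
}
```

The sketch shows the cost of this approach: every append has to wait for a running merge to finish, which is the inefficiency the question is about.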
