tlchao opened a new issue #3417:
URL: https://github.com/apache/iceberg/issues/3417


   I have a job that uses the Java API to write to a partitioned Iceberg 
table. The job appends data to the table every two minutes, which produces a 
lot of small files. I would like to merge the small files in a separate Spark 
job that runs in parallel with the writes. However, I have found that the two 
operations produce snapshot conflicts: while the merge job is running, the 
writing job fails with "snap-xxxx.avro can't be found". For now the only 
workaround is to run the writing job and the merging job serially, which 
seems inefficient. Is there a better way? Thanks!
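
To make the scenario concrete, here is a minimal in-memory sketch of an append job and a merge job committing against the same table pointer, using compare-and-swap with retry. It is only a toy model and does not use the Iceberg API: `OptimisticCommitSketch`, `DataFile`, `append`, `compact`, and `run` are hypothetical names, and whether a real table behaves this way depends on the catalog and commit path in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class OptimisticCommitSketch {
    // Hypothetical stand-in for a data file: a name and a row count.
    static final class DataFile {
        final String name;
        final long rows;
        DataFile(String name, long rows) { this.name = name; this.rows = rows; }
    }

    // The "table" is a single pointer to an immutable list of live files.
    private final AtomicReference<List<DataFile>> current =
        new AtomicReference<>(List.of());

    // Re-apply the change on the latest state and swap it in;
    // retry if another writer committed in between.
    private void commit(UnaryOperator<List<DataFile>> change) {
        while (true) {
            List<DataFile> base = current.get();
            List<DataFile> next = List.copyOf(change.apply(base));
            if (current.compareAndSet(base, next)) {
                return;
            }
        }
    }

    // Writing job: add one small file.
    void append(DataFile file) {
        commit(base -> {
            List<DataFile> out = new ArrayList<>(base);
            out.add(file);
            return out;
        });
    }

    // Merge job: replace the planned files with one merged file,
    // keeping any files that were appended after planning.
    void compact(List<DataFile> planned) {
        long rows = 0;
        for (DataFile f : planned) {
            rows += f.rows;
        }
        DataFile merged = new DataFile("merged", rows);
        commit(base -> {
            if (!base.containsAll(planned)) {
                throw new IllegalStateException("planned files were removed by another commit");
            }
            List<DataFile> out = new ArrayList<>(base);
            out.removeAll(planned);
            out.add(merged);
            return out;
        });
    }

    // Run both jobs in parallel and return the total row count left in the table.
    static long run(int appends) {
        OptimisticCommitSketch table = new OptimisticCommitSketch();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < appends; i++) {
                table.append(new DataFile("small-" + i, 1));
            }
        });
        Thread compactor = new Thread(() -> {
            for (int i = 0; i < 20; i++) {
                List<DataFile> planned = table.current.get();
                if (planned.size() > 1) {
                    table.compact(planned);
                }
            }
        });
        writer.start();
        compactor.start();
        try {
            writer.join();
            compactor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = 0;
        for (DataFile f : table.current.get()) {
            total += f.rows;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("total rows = " + run(100));
    }
}
```

In this model neither job blocks the other: a commit that loses the race rebuilds its change on the newest file list and tries again, so no appended rows are lost while the merge runs.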


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


