zhangdove opened a new pull request #1313:
URL: https://github.com/apache/iceberg/pull/1313


   My use case:
   1. table.expireSnapshots().cleanExpiredFiles(false).commit() 
      [https://github.com/apache/iceberg/pull/1244]
   2. actions.removeOrphanFiles().olderThan(t1).execute()
   
   The first step takes about two seconds. In the second step, however, 
deleting the files gets slower and slower as the number of files grows, which 
is not what I want. If I understand correctly, the files are deleted by a 
single thread in the Spark driver. Can we move the file deletion from the 
driver to Spark's executors, so that orphaned files are deleted by multiple 
threads in parallel?
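
   To illustrate the general idea of deleting files concurrently instead of 
one at a time, here is a minimal sketch that uses only the Java standard 
library. It is not Iceberg's or Spark's implementation: it runs a thread pool 
inside a single JVM rather than distributing work to executors, and the names 
ParallelDelete and deleteAll are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, for illustration only (not part of Iceberg).
public class ParallelDelete {

  // Deletes every path in `paths` using a fixed pool of `threads` workers
  // and returns the number of files that were actually removed.
  public static int deleteAll(List<Path> paths, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit one delete task per file; the tasks run concurrently.
      List<Future<Boolean>> results = new ArrayList<>();
      for (Path path : paths) {
        results.add(pool.submit(() -> {
          try {
            return Files.deleteIfExists(path);
          } catch (IOException e) {
            // Treat a failed delete as "not removed" and keep going.
            return false;
          }
        }));
      }

      // Wait for every task and count the successful deletions.
      int deleted = 0;
      for (Future<Boolean> result : results) {
        if (result.get()) {
          deleted++;
        }
      }
      return deleted;
    } finally {
      pool.shutdown();
    }
  }
}
```

   With many small files on a high-latency store, most of the time for each 
delete is spent waiting on I/O, so issuing the deletes concurrently may reduce 
the total time. The actual gain depends on the storage system and is not 
measured here.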
   


