kbendick commented on PR #5478:
URL: https://github.com/apache/iceberg/pull/5478#issuecomment-1209758658

   The current log confuses users because the distributed engine (in this 
case Spark) then goes on to do some cleanup anyway. Maybe we can update the 
log to say something like "skipping cleanup during current iteration; cleanup 
might occur later if this action is run from a distributed processing 
engine"? Then possibly add logs to Spark as well?
   
   > I would like a lot to distinguish between "didn't try" and "nothing to 
clean up" -- even when "nothing to clean up" is because there were no expired 
files.
   
   Maybe we need a flag such as "delayedFileCleanup", or more generally 
"runningViaDistributedAction", and then let Spark / other engines control the 
logs? I don't love it, but I'm throwing it out there because the current 
logging is admittedly confusing when the Spark action runs (which is the 
primary way users interact with this action).
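   
   A rough sketch of what I mean, only to illustrate the idea. Everything in 
it is hypothetical: the class, the enum, the method names and the 
"delayedFileCleanup" parameter are made up for this comment and are not 
existing Iceberg APIs. It separates "didn't try" from "nothing to clean up" so 
each case can get its own log message:

```java
import java.util.List;

// Hypothetical sketch: none of these names exist in Iceberg today.
final class CleanupLogSketch {

  // The three cases the log should be able to tell apart.
  enum CleanupOutcome {
    SKIPPED_DELEGATED, // didn't try: the engine (e.g. Spark) may clean up later
    NOTHING_TO_CLEAN,  // tried, but there were no expired files
    CLEANED            // tried and found files to delete
  }

  // delayedFileCleanup is the assumed flag an engine would set when it
  // intends to handle file cleanup itself.
  static CleanupOutcome outcome(boolean delayedFileCleanup, List<String> expiredFiles) {
    if (delayedFileCleanup) {
      return CleanupOutcome.SKIPPED_DELEGATED;
    }
    if (expiredFiles.isEmpty()) {
      return CleanupOutcome.NOTHING_TO_CLEAN;
    }
    return CleanupOutcome.CLEANED;
  }

  // Maps each outcome to a distinct, unambiguous log message.
  static String message(CleanupOutcome outcome) {
    switch (outcome) {
      case SKIPPED_DELEGATED:
        return "Skipping cleanup during current iteration; cleanup might occur "
            + "later if this action is run from a distributed processing engine";
      case NOTHING_TO_CLEAN:
        return "Attempted cleanup; no expired files to delete";
      default:
        return "Cleaned up expired files";
    }
  }
}
```

   With something like this, the core code would log the "skipped" message 
only when the flag is set, and the engine would be responsible for logging 
what it eventually cleaned up.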


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

