aokolnychyi commented on issue #4346:
URL: https://github.com/apache/iceberg/issues/4346#issuecomment-1083412195


   @kbendick, I think being able to customize `DeleteOrphanFiles` with a custom 
way of obtaining actual files is a great feature but I find it orthogonal. No 
matter what way we use to get actual files, we still need to normalize the 
locations and decide what to do if the scheme/authority don't match.
   
   I'd consider exposing some sort of a strategy in `DeleteOrphanFiles` that 
would allow users to customize not only how to obtain actual files but also 
other things (e.g. how to perform normalization). For example, I find it 
reasonable to use Hadoop for normalization if we use Hadoop for listing. If we 
rely on another way of computing actual files, maybe we should use something 
else for normalizing.
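
   As a rough illustration of the idea above, here is a minimal sketch of what such a strategy could look like. Everything in it is hypothetical: `OrphanFileStrategy`, `PathOnlyStrategy`, and `OrphanFinder` are made-up names, not part of the Iceberg API, and comparing by path only (ignoring scheme and authority) is just one possible answer to the mismatch question, not a decision. It relies only on `java.net.URI` from the JDK.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical strategy interface; no such type exists in Iceberg today.
interface OrphanFileStrategy {
  // How to obtain the files that actually exist under the table location.
  List<String> listActualFiles(String tableLocation);

  // How to reduce a location to the form used for comparison.
  String normalize(String location);
}

// Illustrative strategy: the listing is supplied up front, and locations are
// compared by path only, so differing schemes/authorities are ignored.
final class PathOnlyStrategy implements OrphanFileStrategy {
  private final List<String> actualFiles;

  PathOnlyStrategy(List<String> actualFiles) {
    this.actualFiles = actualFiles;
  }

  @Override
  public List<String> listActualFiles(String tableLocation) {
    // Stand-in for a real listing (Hadoop, object store inventory, etc.).
    return actualFiles;
  }

  @Override
  public String normalize(String location) {
    // Assumption: dropping scheme and authority is acceptable for this table.
    return URI.create(location).getPath();
  }
}

final class OrphanFinder {
  // Returns the actual files whose normalized location matches no valid file.
  static List<String> findOrphans(
      OrphanFileStrategy strategy, String tableLocation, List<String> validFiles) {
    Set<String> validNormalized = new HashSet<>();
    for (String file : validFiles) {
      validNormalized.add(strategy.normalize(file));
    }

    List<String> orphans = new ArrayList<>();
    for (String actual : strategy.listActualFiles(tableLocation)) {
      if (!validNormalized.contains(strategy.normalize(actual))) {
        orphans.add(actual);
      }
    }
    return orphans;
  }
}
```

   The point of the sketch is that listing and normalization live behind the same interface, so an implementation that lists with Hadoop could also normalize with Hadoop, while another could pair a different listing source with its own normalization rules.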
   
   I think once we know what to do for the current implementation, we can think 
of a way to make it pluggable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


