kbendick commented on issue #4346: URL: https://github.com/apache/iceberg/issues/4346#issuecomment-1083357207
Also adding that I have been working with @bijanhoule on a patch that will allow users to provide their own dataframe of actual files, so that the listing can be skipped entirely if all of the files are already known. For example, a cluster administrator may be able to provide a list of files from HDFS, or a list of files may be obtainable from the cloud provider (such as an S3 inventory list). This is similar in that it allows us to avoid the list operation, but it does put the onus on the user to provide the correct listing. This would be _in addition_ to the work mentioned above, as an alternative option.
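
To illustrate the idea of a caller-supplied listing, here is a minimal sketch in plain Python. It is not the patch itself and does not use any Iceberg or Spark API: the function name `actual_files` and its `provided` parameter are hypothetical, and a plain set of paths stands in for the dataframe mentioned in the comment. It only shows the trade-off described: when a listing is supplied, no directory walk happens and the caller is responsible for its correctness.

```python
import os
from typing import Iterable, Optional, Set


def actual_files(root: str, provided: Optional[Iterable[str]] = None) -> Set[str]:
    """Return the set of file paths considered to exist under `root`.

    Hypothetical helper for illustration only. If `provided` is given,
    it is trusted as-is and the filesystem is never listed.
    """
    if provided is not None:
        # Caller-supplied listing (e.g. from an HDFS admin report or an
        # S3 inventory export): skip the walk; correctness is on the caller.
        return set(provided)

    # Fallback: perform the (potentially expensive) recursive listing.
    found: Set[str] = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.join(dirpath, name))
    return found
```

Passing `provided=[...]` returns exactly those paths without touching storage, while omitting it falls back to walking `root`.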
