RussellSpitzer commented on issue #2917: URL: https://github.com/apache/iceberg/issues/2917#issuecomment-891122748
Reading specific files isn't really supported at the moment, although you may be able to repurpose the rewrite code to handle it. To do so, you would have to build `BaseFileScanTask`s, which you could do by converting the rows of the files metadata table into `DataFile`s (see `SparkDataFile`) and then handing those to the manager. One of the key issues is that the way we read the files depends on the table's state at that point in time, so we have to know which partition spec we are using; for the residual evaluator you can just plug in an always-true evaluator.

---

If you just want to read the files, it may make more sense to pass the file paths to the normal Spark reading code rather than going through the Iceberg DataSource. Something like https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala#L595:

`spark.read.parquet(files: _*)`
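The second suggestion above can be sketched as follows. This is a hypothetical snippet, not code from the issue: the example paths and the idea of collecting them from the table's `files` metadata table are assumptions, and the `spark.read.parquet` call is left commented out because it needs a live `SparkSession`.

```scala
// Hypothetical list of data-file paths. In practice you might collect these
// from Iceberg's `files` metadata table, e.g. something like:
//   spark.read.format("iceberg").load("db.tbl.files").select("file_path")
val files: Seq[String] = Seq(
  "s3://bucket/warehouse/db/tbl/data/part-00000.parquet",
  "s3://bucket/warehouse/db/tbl/data/part-00001.parquet"
)

// `files: _*` expands the Seq into the varargs parameter of
// DataFrameReader.parquet(paths: String*), reading the files directly
// with plain Spark and bypassing the Iceberg DataSource entirely:
// val df = spark.read.parquet(files: _*)
```

Note that reading this way loses Iceberg's schema evolution and partition metadata; you get exactly what is physically in the Parquet files.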
