GrigorievNick opened a new issue #2917:
URL: https://github.com/apache/iceberg/issues/2917
I want to read a few files from a previous version of the table.
I can find the files and their metadata by adding the `.files` suffix to the
table name.
```
val files = sparkSession
  .read
  .option("snapshot-id", previousVersion)
  .format("iceberg")
  .load(s"$testTable.files")
  // .filter(<my filters to find the files>)
```
I also see that, in Spark3RewriteActions, I can specify a list of files to
read from the table using `FILE_SCAN_TASK_SET_ID`:
```
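import org.apache.iceberg.spark.{FileScanTaskSetManager, SparkReadOptions}

// Stage the tasks under an ID, then read them back using the same ID.
val manager = FileScanTaskSetManager.get
manager.stageTasks(table, "fileGroupId", fileScanTasksList)
sparkSession
  .read
  .format("iceberg")
  .option("snapshot-id", previous)
  .option(SparkReadOptions.FILE_SCAN_TASK_SET_ID, "fileGroupId")
  .load(testTable)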
```
What I can't find is how to create a `FileScanTask` from `$testTable.files` or
from any other output.
In the code, I see that `DataTableScan` uses `ManifestReadtask`, but I don't see
any code that reads from one or a few predefined Iceberg files.
Do I need to write my own `FileScanTask` implementation, or is there existing
code that already does this?