rdblue commented on a change in pull request #1508:
URL: https://github.com/apache/iceberg/pull/1508#discussion_r725707350
##########
File path:
spark3/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java
##########
@@ -101,24 +103,35 @@ public Table getTable(StructType schema, Transform[]
partitioning, Map<String, S
SparkSession spark = SparkSession.active();
setupDefaultSparkCatalog(spark);
String path = options.get("path");
+ Long snapshotId = Spark3Util.propertyAsLong(options,
SparkReadOptions.SNAPSHOT_ID, null);
+ Long asOfTimestamp = Spark3Util.propertyAsLong(options,
SparkReadOptions.AS_OF_TIMESTAMP, null);
Review comment:
I think that we want to generalize this a bit more rather than adding
special classes. I've taken a shot at building the needed feature in #3269. In
addition to solving the problem here of needing to pass a snapshot ID through
an identifier, it defines a syntax for allowing users to do it, which basically
adds support for time travel queries to SQL.
I think that's a cleaner way of doing this because we only have to pass a
snapshot ID through -- timestamp can be resolved when the table is loaded from
the identifier. I think it's also a bit less code.
Could you take a look at that issue? To move forward with this one, I think
we should either get the identifier format in #3269 in first, or just remove
the additions here for Spark 3 and add Spark 3 support once that goes in. What
do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]