ted-jenks opened a new issue, #15699:
URL: https://github.com/apache/iceberg/issues/15699

   ### Feature Request / Improvement
   
   Spark SQL supports point-in-time queries via `VERSION AS OF` and `TIMESTAMP 
AS OF`, but there is no SQL syntax for querying a range of versions or 
timestamps. Range queries would be useful for incremental consumption patterns.
   
   The underlying scan infrastructure fully supports this. `IncrementalScan` 
provides `fromSnapshotInclusive`, `fromSnapshotExclusive`, and `toSnapshot`.
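
To make the inclusive/exclusive distinction concrete, here is a toy Python model of a linear snapshot history. It is only a sketch: `Snapshot`, `snapshots_in_range`, and `history` are hypothetical names invented for illustration, and this is not how Iceberg implements `IncrementalScan`. It assumes each snapshot records its parent and that the start snapshot is an ancestor of the end snapshot.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for a table snapshot: an ID plus a parent pointer.
    snapshot_id: int
    parent_id: Optional[int]


def snapshots_in_range(
    snapshots: Dict[int, Snapshot],
    to_snapshot: int,
    from_snapshot: int,
    from_inclusive: bool,
) -> List[int]:
    """Walk parent pointers back from to_snapshot until from_snapshot is reached.

    Returns the selected snapshot IDs oldest-first. Raises ValueError if
    from_snapshot is not an ancestor of (or equal to) to_snapshot.
    """
    selected: List[int] = []
    current: Optional[int] = to_snapshot
    while current is not None:
        if current == from_snapshot:
            # The start snapshot is kept only for the inclusive variant.
            if from_inclusive:
                selected.append(current)
            return list(reversed(selected))
        selected.append(current)
        current = snapshots[current].parent_id
    raise ValueError(f"{from_snapshot} is not an ancestor of {to_snapshot}")


# Assumed linear history for illustration: 1 <- 2 <- 3 <- 4 <- 5
history = {i: Snapshot(i, i - 1 if i > 1 else None) for i in range(1, 6)}

inclusive = snapshots_in_range(history, to_snapshot=5, from_snapshot=1, from_inclusive=True)
exclusive = snapshots_in_range(history, to_snapshot=5, from_snapshot=1, from_inclusive=False)
```

In this model the inclusive range over snapshots 1 to 5 covers all five snapshots, while the exclusive range skips snapshot 1. Any new range syntax would need to state which of the two start semantics it uses.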
   
   The closest existing option is the `create_changelog_view` procedure:
   ```sql
     CALL spark_catalog.system.create_changelog_view(
       table => 'db.tbl',
       options => map('start-snapshot-id','1','end-snapshot-id', '2')
     );
     SELECT * FROM tbl_changes;
   ```
   However, this requires a procedure call followed by a query against a separate view. It does not provide a direct SQL query syntax for version ranges.
   
   ### Proposal
   
   Add SQL syntax for version/timestamp range queries. Possible forms (open for 
discussion):
   
   ```sql
   -- by snapshot ID
   SELECT * FROM db.table VERSION BETWEEN 1 AND 5
   
   -- by timestamp
   SELECT * FROM db.table TIMESTAMP BETWEEN '2024-01-01' AND '2024-06-01'
   
   -- by tag/branch ref
   SELECT * FROM db.table VERSION BETWEEN 'tag-a' AND 'tag-b'
   ```
   
   
   
   ### Query engine
   
   Spark
   
   ### Willingness to contribute
   
   - [ ] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

