kbendick commented on a change in pull request #2617:
URL: https://github.com/apache/iceberg/pull/2617#discussion_r637817843
##########
File path: core/src/main/java/org/apache/iceberg/StaticTableScan.java
##########
@@ -25,16 +25,33 @@
class StaticTableScan extends BaseTableScan {
private final Function<StaticTableScan, DataTask> buildTask;
+ // Metadata table name that the buildTask that this StaticTableScan will
return data for.
+ private final String scannedTableName;
- StaticTableScan(TableOperations ops, Table table, Schema schema,
Function<StaticTableScan, DataTask> buildTask) {
+ StaticTableScan(TableOperations ops, Table table, Schema schema, String
scannedTableName,
+ Function<StaticTableScan, DataTask> buildTask) {
super(ops, table, schema);
this.buildTask = buildTask;
+ this.scannedTableName = scannedTableName;
}
- private StaticTableScan(TableOperations ops, Table table, Schema schema,
+ private StaticTableScan(TableOperations ops, Table table, Schema schema,
String scannedTableName,
Function<StaticTableScan, DataTask> buildTask,
TableScanContext context) {
super(ops, table, schema, context);
this.buildTask = buildTask;
+ this.scannedTableName = scannedTableName;
+ }
+
+ @Override
+ public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
+ }
+
+ @Override
+ public TableScan appendsAfter(long fromSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
Review comment:
I believe the `MetadataTableType` scans could be unified into just
`StaticTableScan` and `BaseAllMetadataTableScan`, so that the overrides could
be done in two places, with the names and metadata types provided to the user
for the error message.
The current layout of refined error messages for time based / snapshot
specific / incremental metadata table scans:
```
MetadataTableType Current Scan Class Base Snapshot / Time /
Incremental Scan Overrides
ENTRIES BaseTableScan None
FILES BaseTableScan None
HISTORY StaticTableScan None
SNAPSHOTS StaticTableScan None
MANIFESTS StaticTableScan None
PARTITIONS StaticTableScan None
ALL_DATA_FILES BaseAllMetadataTableScan [asOfTime, useSnapshot]
ALL_MANIFESTS BaseAllMetadataTableScan [asOfTime, useSnapshot]
ALL_ENTRIES BaseAllMetadataTableScan None
```
[AllDataFilesTableScan specific
overrides](https://github.com/apache/iceberg/blob/32ce969218b75915cb7396c541bdc19f4ac8bd9f/core/src/main/java/org/apache/iceberg/AllDataFilesTable.java#L92-L100)
[AllManifestsTableScan specific
overrides](https://github.com/apache/iceberg/blob/32ce969218b75915cb7396c541bdc19f4ac8bd9f/core/src/main/java/org/apache/iceberg/AllManifestsTable.java#L99-L107)
The two that have overrides are implemented in the final implementing scan
classes.
To have consistent, enriched, user-friendly error messages and to ensure we
catch all cases, we'd need to pass the scanned metadata table name (as it can
technically be overridden - and is in some tests) and the MetadataTableType.
Users encounter this problem often enough when switching back and forth
between editing a snapshot specific query and inspecting a metadata table, and
then reach out for help or spend a few minutes realizing where they went wrong
and it would be a big quality of life upgrade (as well as making the code much
more intentional).
So I propose changing the inheritance of these classes to match their
`MetadataTableType` counterparts
```
MetadataTableType New Scan Class
ENTRIES **EntriesTableScan extends StaticTableScan**
FILES **FilesTableScan extends StaticTableScan**
HISTORY StaticTableScan
SNAPSHOTS StaticTableScan
MANIFESTS StaticTableScan
PARTITIONS StaticTableScan
ALL_DATA_FILES BaseAllMetadataTableScan
ALL_MANIFESTS BaseAllMetadataTableScan
ALL_ENTRIES BaseAllMetadataTableScan
```
If we decide to implement any of the snapshot / time / incremental scan
specific functionality, then the actual concrete scan class can override it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]