kbendick commented on a change in pull request #2617:
URL: https://github.com/apache/iceberg/pull/2617#discussion_r650294316
##########
File path: core/src/main/java/org/apache/iceberg/ManifestEntriesTable.java
##########
@@ -69,18 +69,35 @@ MetadataTableType metadataTableType() {
private static class EntriesTableScan extends BaseTableScan {
- EntriesTableScan(TableOperations ops, Table table, Schema schema) {
+ private final String scannedTableName;
+
+ EntriesTableScan(TableOperations ops, Table table, Schema schema, String
scannedTableName) {
super(ops, table, schema);
+ this.scannedTableName = scannedTableName;
}
- private EntriesTableScan(TableOperations ops, Table table, Schema schema,
TableScanContext context) {
+ private EntriesTableScan(TableOperations ops, Table table, Schema schema,
String scannedTableName,
+ TableScanContext context) {
super(ops, table, schema, context);
+ this.scannedTableName = scannedTableName;
+ }
+
+ @Override
+ public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
+ }
+
+ @Override
+ public TableScan appendsAfter(long fromSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
Review comment:
I guess we don't, especially if we want it to say `Incremental scan is
not supported for HISTORY scan of table $table.name", where "$table.name" would
be "foo" and not "foo#history".
I would be ok with that.
I think I misunderstood your original request for the change in the wording
(it will no longer say "table foo#history", but the "for HISTORY scan of table
foo" will suffice) 👍
##########
File path: core/src/main/java/org/apache/iceberg/ManifestEntriesTable.java
##########
@@ -69,18 +69,35 @@ MetadataTableType metadataTableType() {
private static class EntriesTableScan extends BaseTableScan {
- EntriesTableScan(TableOperations ops, Table table, Schema schema) {
+ private final String scannedTableName;
+
+ EntriesTableScan(TableOperations ops, Table table, Schema schema, String
scannedTableName) {
super(ops, table, schema);
+ this.scannedTableName = scannedTableName;
}
- private EntriesTableScan(TableOperations ops, Table table, Schema schema,
TableScanContext context) {
+ private EntriesTableScan(TableOperations ops, Table table, Schema schema,
String scannedTableName,
+ TableScanContext context) {
super(ops, table, schema, context);
+ this.scannedTableName = scannedTableName;
+ }
+
+ @Override
+ public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
+ }
+
+ @Override
+ public TableScan appendsAfter(long fromSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
Review comment:
So that makes sense.
I got it to work with `Incremental scan is not supported for %s scan of
table %s` with the first one being all possible values in
`MetadataTableType.values()` and the table always being `table().name()` (e.g.
without the $metadataIdentifier).
However, not having the actual scanned table name still leaves us without a
solution to this issue, that the events emitted reference the table itself and
are potentially incorrect. https://github.com/apache/iceberg/issues/2635
Maybe we can meet Monday or Tuesday and discuss this further? Its possible
https://github.com/apache/iceberg/issues/2635 is not a bug. Maybe you can help
me better understand the use case for the events, such as `ScanEvent`?
It would seem to me that if we emit a `ScanEvent` when doing an scan on one
of the AllMetadataTables, we'd want the emitted scan event to use the scanned
table name (e.g. $tableName[.|#]$metadataIdentifier), to distinguish from a
scan event on the table's data itself. But possibly that's not necessary?
I can push the updated code and then roll back if we decide not to go with
it after we talk.
##########
File path: core/src/main/java/org/apache/iceberg/ManifestEntriesTable.java
##########
@@ -69,18 +69,35 @@ MetadataTableType metadataTableType() {
private static class EntriesTableScan extends BaseTableScan {
- EntriesTableScan(TableOperations ops, Table table, Schema schema) {
+ private final String scannedTableName;
+
+ EntriesTableScan(TableOperations ops, Table table, Schema schema, String
scannedTableName) {
super(ops, table, schema);
+ this.scannedTableName = scannedTableName;
}
- private EntriesTableScan(TableOperations ops, Table table, Schema schema,
TableScanContext context) {
+ private EntriesTableScan(TableOperations ops, Table table, Schema schema,
String scannedTableName,
+ TableScanContext context) {
super(ops, table, schema, context);
+ this.scannedTableName = scannedTableName;
+ }
+
+ @Override
+ public TableScan appendsBetween(long fromSnapshotId, long toSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
+ }
+
+ @Override
+ public TableScan appendsAfter(long fromSnapshotId) {
+ throw new UnsupportedOperationException(
+ String.format("Incremental scan is not supported for metadata table
%s", scannedTableName));
Review comment:
Actually, I think that this is fine.
I'm not too sure I fully understand the use case of `ScanEvent`, so I'm not
going to worry about whether or not it's a bug at this moment in time.
I agree that it's more consistent with the rest of the code to be using a
method that's overridden vs a constructor parameter.
I have updated to do that, and for the few classes inherit directly from
`BaseTableScan`, I've used the same named helper function to keep things
somewhat consistent.
##########
File path: core/src/main/java/org/apache/iceberg/HistoryTable.java
##########
@@ -82,6 +82,11 @@ private DataTask task(TableScan scan) {
// override planFiles to avoid the check for a current snapshot because
this metadata table is for all snapshots
return CloseableIterable.withNoopClose(HistoryTable.this.task(this));
}
+
+ @Override
+ protected String tableType() {
+ return String.valueOf(HistoryTable.this.metadataTableType());
Review comment:
For scans that inherit from `StaticTableScan`, it's possible to get the
table type from the surrounding class's method.
For the entries that extend `BaseAllMetadataTableScan`, it's not possible to
call methods from the surrounding class so I used the metadata table type enum
values directly (such as
[here](https://github.com/apache/iceberg/pull/2617/files#diff-c106422c51344f58d260f5578bfc8e720106dcd441f8348660155bb0b331062bR87-R91)).
Not really in love with one style or another, though this one here would be
harder to mess up in a refactor as it's more DRY (at the expense of being a bit
more verbose).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]