stevenzwu commented on code in PR #5318:
URL: https://github.com/apache/iceberg/pull/5318#discussion_r930218433
##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java:
##########
@@ -68,26 +72,59 @@
private final ReaderFunction<T> readerFunction;
private final SplitAssignerFactory assignerFactory;
+ // Can't use SerializableTable as enumerator needs a regular table
+ // that can discover table changes
+ private transient Table table;
+
IcebergSource(
TableLoader tableLoader,
ScanContext scanContext,
ReaderFunction<T> readerFunction,
- SplitAssignerFactory assignerFactory) {
+ SplitAssignerFactory assignerFactory,
+ Table table) {
this.tableLoader = tableLoader;
this.scanContext = scanContext;
this.readerFunction = readerFunction;
this.assignerFactory = assignerFactory;
+ this.table = table;
Review Comment:
We are avoiding the feature of inferring parallelism. But I think this
refactoring is still good.
- It avoid double loading of the table. a `Table` is loaded in the builder
to get fields like `schema`, `io`, `encryption` etc. It will be loaded again in
the `IcebergSource#createEnumerator` method, which also runs in the
jobmanager/driver.
- `table/lazyTable()` is used by the `name()` getter.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]