rdblue commented on a change in pull request #4060:
URL: https://github.com/apache/iceberg/pull/4060#discussion_r801820735
##########
File path:
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/source/DataIterator.java
##########
@@ -41,16 +42,49 @@
private final FileScanTaskReader<T> fileScanTaskReader;
private final InputFilesDecryptor inputFilesDecryptor;
- private Iterator<FileScanTask> tasks;
+ private final CombinedScanTask combinedTask;
+
+ private Iterator<FileScanTask> fileTasksIterator;
private CloseableIterator<T> currentIterator;
+ private int fileOffset;
+ private long recordOffset;
public DataIterator(FileScanTaskReader<T> fileScanTaskReader,
CombinedScanTask task,
FileIO io, EncryptionManager encryption) {
this.fileScanTaskReader = fileScanTaskReader;
this.inputFilesDecryptor = new InputFilesDecryptor(task, io, encryption);
- this.tasks = task.files().iterator();
+ this.combinedTask = task;
+
+ this.fileTasksIterator = task.files().iterator();
this.currentIterator = CloseableIterator.empty();
+
+ // fileOffset starts at -1 because we started
+ // from an empty iterator that is not from the split files.
+ this.fileOffset = -1;
+ this.recordOffset = 0L;
+ }
+
+ public void seek(int startingFileOffset, long startingRecordOffset) {
Review comment:
This implementation doesn't actually seek to the given offsets; it skips that
number of files and/or records from the current position. In other words, it
only works if it is called before the first call to `next`.
I think this needs to validate that `fileOffset` is still `-1`. If it isn't,
this should fail immediately with an `IllegalStateException`, because seek is
only supported just after initialization, not after any records have been
consumed. (Or else we'll have to fix the implementation.)
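
As a rough sketch of the guard I have in mind, assuming a much-simplified
stand-in for the class: `SeekGuardSketch` is a hypothetical name, and a list of
record lists replaces the `CombinedScanTask` and its file readers. It is not
the actual `DataIterator` code, only an illustration of failing fast when
`fileOffset` is no longer `-1`:

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical, simplified stand-in for DataIterator: each "file" is a list of records.
class SeekGuardSketch<T> implements Iterator<T> {
  private final Iterator<List<T>> fileTasksIterator;
  private Iterator<T> currentIterator = Collections.emptyIterator();
  private int fileOffset = -1;      // -1 means no file has been opened yet
  private long recordOffset = 0L;

  SeekGuardSketch(List<List<T>> files) {
    this.fileTasksIterator = files.iterator();
  }

  // Skips startingFileOffset files, then startingRecordOffset records in the next file.
  // Only valid on a freshly constructed iterator.
  void seek(int startingFileOffset, long startingRecordOffset) {
    if (fileOffset != -1) {
      throw new IllegalStateException(
          "seek is only supported before any file is opened, fileOffset=" + fileOffset);
    }

    // Skip whole files that precede the starting file.
    for (int i = 0; i < startingFileOffset; i++) {
      if (!fileTasksIterator.hasNext()) {
        throw new IllegalArgumentException("File offset is past the last file: " + startingFileOffset);
      }
      fileTasksIterator.next();
    }

    if (!fileTasksIterator.hasNext()) {
      throw new IllegalArgumentException("File offset is past the last file: " + startingFileOffset);
    }
    currentIterator = fileTasksIterator.next().iterator();

    // Skip records inside the starting file.
    for (long i = 0; i < startingRecordOffset; i++) {
      if (!currentIterator.hasNext()) {
        throw new IllegalArgumentException("Record offset is past the end of the file: " + startingRecordOffset);
      }
      currentIterator.next();
    }

    fileOffset = startingFileOffset;
    recordOffset = startingRecordOffset;
  }

  @Override
  public boolean hasNext() {
    // Opening the next file moves fileOffset away from -1, so a later seek is rejected.
    while (!currentIterator.hasNext() && fileTasksIterator.hasNext()) {
      currentIterator = fileTasksIterator.next().iterator();
      fileOffset += 1;
      recordOffset = 0L;
    }
    return currentIterator.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    recordOffset += 1;
    return currentIterator.next();
  }
}
```

With a check like this, a `seek` issued after records have been consumed fails
immediately instead of silently skipping relative to the current position.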
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]