gyfora commented on a change in pull request #14174:
URL: https://github.com/apache/flink/pull/14174#discussion_r531123588
##########
File path:
flink-core/src/main/java/org/apache/flink/api/common/io/DelimitedInputFormat.java
##########
@@ -738,15 +739,18 @@ public void reopen(FileInputSplit split, Long state)
throws IOException {
Preconditions.checkArgument(state == -1 || state >=
split.getStart(),
" Illegal offset "+ state +", smaller than the splits
start=" + split.getStart());
- try {
+ // If we checkpointed at the beginning of split simply call open
+ if (split.getStart() == state) {
this.open(split);
- } finally {
- this.offset = state;
+ return;
}
- if (state > this.splitStart + split.getLength()) {
+ super.open(split);
Review comment:
this.open basically assumes we are at the beginning of the split and
immediately fills the readbuffer and tries to drop partial lines at the
beginning of the split. It also calls super.open
This behaviour would lead to problems in case we restore from a checkpoint
when we never have to do this and seeking to the checkpoint offset can actually
mean backward seeking due to this.open's behaviour (and that doesn't work for
compressed streams)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]