[ https://issues.apache.org/jira/browse/FLINK-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124791#comment-16124791 ]
ASF GitHub Bot commented on FLINK-7423:
---------------------------------------

Github user XuPingyong commented on the issue:

    https://github.com/apache/flink/pull/4525

@greghogan, if the object passed to nextRecord may be reused internally by the InputFormat, do similar cases need to be reconsidered?

In `DataSourceTask.java`:

    OT reuse = serializer.createInstance();

    // as long as there is data to read
    while (!this.taskCanceled && !format.reachedEnd()) {
        OT returned;
        if ((returned = format.nextRecord(reuse)) != null) {
            output.collect(returned);
        }
    }

And in many batch drivers:

    final MutableObjectIterator<T> in = taskContext.getInput(0);
    T value = serializer.createInstance();

    while (running && (value = in.next(value)) != null) {
        .......
    }

In my opinion:

1. `null` records are meaningless, but a `null` return value is meaningful for an input or format: it signals the end. If a user only calls `InputFormat#nextRecord` without `InputFormat#reachedEnd`, returning `null` is the only way the format can signal the end.
2. Callers should not have to assume that the object returned by `InputFormat#nextRecord` may be passed in again. If an immutable object is returned, an exception will be thrown when it is reused in the next call to `InputFormat#nextRecord`.

@greghogan, could you please give some cases in which the object passed to nextRecord can be reused internally by the InputFormat?

Thanks.

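To make the two loop shapes under discussion concrete, here is a minimal sketch in plain Java. `RecordFormat`, `ReuseDemo`, and `skippingFormat` are hypothetical names invented for this illustration and are not Flink classes; the sketch also assumes a format that ignores the `reuse` argument and returns `null` for a record it skips, which is only one of the behaviours debated in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the two InputFormat methods discussed here;
// this is NOT Flink's actual interface.
interface RecordFormat<T> {
    boolean reachedEnd();

    // May fill `reuse`, return a different object, or return null.
    T nextRecord(T reuse);
}

final class ReuseDemo {

    // Assumed behaviour: the format ignores `reuse`, returns its own
    // objects, and returns null for empty entries it wants to skip.
    static RecordFormat<StringBuilder> skippingFormat(List<String> raw) {
        final Iterator<String> it = raw.iterator();
        return new RecordFormat<StringBuilder>() {
            @Override
            public boolean reachedEnd() {
                return !it.hasNext();
            }

            @Override
            public StringBuilder nextRecord(StringBuilder reuse) {
                String s = it.next();
                return s.isEmpty() ? null : new StringBuilder(s);
            }
        };
    }

    // Shaped like the DataSourceTask loop: `reuse` is created once and
    // never overwritten; a null result is skipped.
    static List<String> readAll(RecordFormat<StringBuilder> format) {
        List<String> out = new ArrayList<>();
        StringBuilder reuse = new StringBuilder();
        while (!format.reachedEnd()) {
            StringBuilder returned = format.nextRecord(reuse);
            if (returned != null) {
                out.add(returned.toString());
            }
        }
        return out;
    }

    // Shaped like the loop in the issue description: the return value
    // replaces the reuse object, and a null result ends the loop.
    static List<String> readAllOverwriting(RecordFormat<StringBuilder> format) {
        List<String> out = new ArrayList<>();
        StringBuilder next = new StringBuilder();
        while (!format.reachedEnd()) {
            next = format.nextRecord(next);
            if (next != null) {
                out.add(next.toString());
            } else {
                break; // `next` is now null and would be passed on as-is
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "", "b");
        System.out.println(readAll(skippingFormat(raw)));            // prints [a, b]
        System.out.println(readAllOverwriting(skippingFormat(raw))); // prints [a]
    }
}
```

Under these assumptions the overwriting loop stops at the first `null` and leaves `null` in the variable that would be handed to the next `nextRecord` call, whereas the loop that keeps a separate `reuse` instance reads every record. Whether `null` should mean "skip" or "end" is exactly the open question above, so the sketch takes no position on the intended contract.
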
> Always reuse an instance to get elements from the inputFormat
> ---------------------------------------------------------------
>
>                 Key: FLINK-7423
>                 URL: https://issues.apache.org/jira/browse/FLINK-7423
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>            Reporter: Xu Pingyong
>            Assignee: Xu Pingyong
>
> In InputFormatSourceFunction.java:
> {code:java}
> OUT nextElement = serializer.createInstance();
> while (isRunning) {
>     format.open(splitIterator.next());
>     // for each element we also check if cancel
>     // was called by checking the isRunning flag
>     while (isRunning && !format.reachedEnd()) {
>         nextElement = format.nextRecord(nextElement);
>         if (nextElement != null) {
>             ctx.collect(nextElement);
>         } else {
>             break;
>         }
>     }
>     format.close();
>     completedSplitsCounter.inc();
>     if (isRunning) {
>         isRunning = splitIterator.hasNext();
>     }
> }
> {code}
> The format may return a different element, or null, from nextRecord;
> passing that return value back in as the next reuse object may then
> cause an exception.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)