stevenzwu edited a comment on issue #1643:
URL: https://github.com/apache/iceberg/issues/1643#issuecomment-714700304


   Let's look at the nextRecord contract from the `InputFormat`. reuse or not 
is controlled by the caller of `nextRecord(OT reuse)`. 
   ```
        /**
         * @param reuse Object that may be reused.
         */
        OT nextRecord(OT reuse) throws IOException;
   ```
   
   If the input `reuse` arg is null, there is no reuse. Otherwise, input arg 
should be used for deserialization. That is how `BinaryInputFormat` implemented.
   ```
        @Override
        public T nextRecord(T record) throws IOException {
                   // ...
                record = this.deserialize(record, this.dataInputStream);
                   // ...
        }
   ```
   
   `CloseableIterable<T>` interface is just NOT a good fit for InputFormat 
interface. Now the `FlinkInputFormat` ignore the reuse hint from caller and 
simply returns reused RowData object.
   ```
     @Override
     public RowData nextRecord(RowData reuse) {
       return iterator.next();
     }
   ```
   
   So technically, `FlinkInputFormat` is not conforming to the contract of 
`InputFormat` interface.  If we truly want to implement the correct nextRecord 
contract, we need to change the `CloseableIterable/Iterator` to support `next(T 
reuse)`, which is impossible to change the java `Iterator` interface. If it 
can't conform to the contract, maybe the default behavior should be the more 
conservative (and less performant) behavior of no reuse.
   
   We don't use InputFormat internally. So its behavior is not very relevant to 
us. This is purely a semantic discussion. I will address other issues in 
separate comments.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to