[ 
https://issues.apache.org/jira/browse/HUDI-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HUDI-6896:
------------------------------
    Description: 
org.apache.hudi.io.storage.HoodieAvroHFileReader.RecordIterator#hasNext uses 
org.apache.hadoop.hbase.io.hfile.HFileScanner#isSeeked to decide whether to 
seek to the start of the file.
{code:java}
        if (!scanner.isSeeked()) {
          hasRecords = scanner.seekTo();
        }
{code}
If isSeeked returns false, the scanner seeks to the start of the file.

After the end of the file is reached, isSeeked still returns false, so the 
next time hasNext is called the scanner seeks to the start of the file again, 
leading to an infinite loop.

Documentation for HFileScanner#isSeeked:
True if scanner has had one of the seek calls invoked; i.e. seekBefore(Cell) or 
seekTo() or seekTo(Cell). Otherwise returns false.
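
The behaviour described above can be reproduced in isolation with a minimal sketch. Everything in it is an assumption made for illustration: FakeScanner is a hypothetical stand-in that models only the isSeeked/seekTo/next behaviour reported in this issue (it is not the HBase class), ScannerIterator is not the Hudi RecordIterator, and the rememberEof flag is one possible guard, not necessarily the fix adopted by the project.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for HFileScanner; models only the behaviour described in this issue. */
final class FakeScanner {
    private final List<String> records;
    private int pos = -1; // -1 means "non-seeked"

    FakeScanner(List<String> records) {
        this.records = records;
    }

    boolean isSeeked() {
        return pos >= 0;
    }

    /** Positions at the first record; returns false when there are no records. */
    boolean seekTo() {
        pos = records.isEmpty() ? -1 : 0;
        return pos == 0;
    }

    /** Advances by one record; at end of file the scanner falls back to the non-seeked state. */
    boolean next() {
        if (pos < 0 || pos + 1 >= records.size()) {
            pos = -1;
            return false;
        }
        pos++;
        return true;
    }

    String current() {
        return records.get(pos);
    }
}

/** Illustrative iterator: rememberEof=false reproduces the report, true applies the guard. */
final class ScannerIterator implements Iterator<String> {
    private final FakeScanner scanner;
    private final boolean rememberEof;
    private boolean eof = false;
    private String buffered = null;

    ScannerIterator(FakeScanner scanner, boolean rememberEof) {
        this.scanner = scanner;
        this.rememberEof = rememberEof;
    }

    @Override
    public boolean hasNext() {
        if (rememberEof && eof) {
            return false; // guard: never touch the scanner again after end of file
        }
        if (buffered != null) {
            return true; // keeps hasNext idempotent while a record is buffered
        }
        boolean hasRecords;
        if (!scanner.isSeeked()) {
            hasRecords = scanner.seekTo(); // same check as the snippet quoted above
        } else {
            hasRecords = scanner.next();
        }
        if (!hasRecords) {
            eof = true;
            return false;
        }
        buffered = scanner.current();
        return true;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String record = buffered;
        buffered = null;
        return record;
    }
}
```

With rememberEof set to false, a second hasNext call after the iterator has reported exhaustion seeks back to the start and returns true again, so any caller that polls hasNext more than once never terminates. With rememberEof set to true, hasNext keeps returning false once the end of the file has been seen.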

  was:org.apache.hudi.io.storage.HoodieAvroHFileReader.RecordIterator#hasNext 
uses org.apache.hadoop.hbase.io.hfile.HFileScanner#isSeeked to seek to the 
first line of the file.


> HoodieAvroHFileReader.RecordIterator iteration never terminates
> ---------------------------------------------------------------
>
>                 Key: HUDI-6896
>                 URL: https://issues.apache.org/jira/browse/HUDI-6896
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Lokesh Jain
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
