[ 
https://issues.apache.org/jira/browse/DRILL-8070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PJ Fanning updated DRILL-8070:
------------------------------
    Description: 
In ExcelBatchReader, this code wrongly assumes that rowIterator returns every row:
{code:java}
for (int i = 1; i < rowNumber; i++) {
    currentRow = rowIterator.next();
}
{code}
 
There are two for loops like this.

Empty rows are not necessarily returned by the iterator: rows without 
populated cells can simply be skipped. Counting calls to next() therefore 
does not reliably land on row rowNumber when earlier rows are empty. Think of 
the sheet as a sparse matrix, because that is how it is stored.
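
A minimal sketch of the difference, assuming a sparse sheet modeled as a TreeMap keyed by 0-based row index (a stand-in for illustration only, not the ExcelBatchReader code or its fix). The class name SparseRowSkip and the methods rowByCount and rowByIndex are hypothetical. With Apache POI, the row's own index would come from Row.getRowNum() rather than from a map key.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: a sparse sheet is modeled as a TreeMap whose keys
// are 0-based row indexes. Empty rows simply have no entry.
class SparseRowSkip {

    // Counting approach: assumes the n-th call to next() yields row index n.
    static Map.Entry<Integer, String> rowByCount(
            Iterator<Map.Entry<Integer, String>> rows, int targetIndex) {
        for (int i = 0; i < targetIndex && rows.hasNext(); i++) {
            rows.next(); // discards a stored row, whatever its real index is
        }
        return rows.hasNext() ? rows.next() : null;
    }

    // Index approach: compares each row's own index with the target.
    static Map.Entry<Integer, String> rowByIndex(
            Iterator<Map.Entry<Integer, String>> rows, int targetIndex) {
        while (rows.hasNext()) {
            Map.Entry<Integer, String> row = rows.next();
            if (row.getKey() >= targetIndex) {
                return row; // first stored row at or after the target
            }
        }
        return null; // no stored row at or after the target
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> sheet = new TreeMap<>();
        sheet.put(0, "title");
        sheet.put(1, "notes");
        // Row index 2 is empty, so it is not stored at all.
        sheet.put(3, "header");
        sheet.put(4, "data");

        // Counting overshoots because the empty row was never returned.
        System.out.println(rowByCount(sheet.entrySet().iterator(), 3).getValue()); // prints "data"
        // Checking the index lands on the intended row.
        System.out.println(rowByIndex(sheet.entrySet().iterator(), 3).getValue()); // prints "header"
    }
}
```

If the target row itself is empty, rowByIndex returns the next stored row. Whether that or "no row" is the right behavior depends on the caller, so treat it as a design choice of this sketch.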

  was:
In ExcelBatchReader, this code makes the wrong assumption:

```

for (int i = 1; i < rowNumber; i++) {
  currentRow = rowIterator.next();
}

```

 

There are 2 for loops like this.

 

Empty Rows will not necessarily be returned by the iterator. Basically, rows 
without populated cells could easily be skipped. Think of the Sheet as being 
represented as a sparse matrix - because it is stored like this.


> format-excel assumes that rowIterator returns every row - it doesn't
> --------------------------------------------------------------------
>
>                 Key: DRILL-8070
>                 URL: https://issues.apache.org/jira/browse/DRILL-8070
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>            Reporter: PJ Fanning
>            Priority: Major
>
> In ExcelBatchReader, this code wrongly assumes that rowIterator returns every row:
> {code:java}
> for (int i = 1; i < rowNumber; i++) {
>     currentRow = rowIterator.next();
> }
> {code}
>  
> There are two for loops like this.
> Empty rows are not necessarily returned by the iterator: rows without 
> populated cells can simply be skipped. Think of the sheet as a sparse 
> matrix, because that is how it is stored.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
