[
https://issues.apache.org/jira/browse/DRILL-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041206#comment-17041206
]
ASF GitHub Bot commented on DRILL-7514:
---------------------------------------
vvysotskyi commented on pull request #1991: DRILL-7514: Update Apache POI to
Latest Version
URL: https://github.com/apache/drill/pull/1991#discussion_r382168457
##########
File path:
contrib/format-excel/src/main/java/org/apache/drill/exec/store/excel/ExcelBatchReader.java
##########
@@ -273,7 +273,12 @@ private int getColumnCount() {
     int columnCount;
     if (readerConfig.headerRow >= 0) {
-      columnCount = sheet.getRow(sheet.getFirstRowNum()).getPhysicalNumberOfCells();
+      try {
+        columnCount = sheet.getRow(sheet.getFirstRowNum()).getPhysicalNumberOfCells();
+      } catch (NullPointerException e) {
Review comment:
Thanks for fixing it, but I'm afraid we can still get an NPE here in other
cases: according to the JavaDoc for `sheet.getRow()`, null may be returned when
a row with the specified index isn't defined. Also, there is a line of code
below this one that may have an issue as well. I would recommend refactoring
the method to something like this:
```
private int getColumnCount() {
  int rowNumber = readerConfig.headerRow > 0 ? sheet.getFirstRowNum() : 0;
  XSSFRow sheetRow = sheet.getRow(rowNumber);
  return sheetRow != null ? sheetRow.getPhysicalNumberOfCells() : 0;
}
```
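To see the effect of the null check in isolation, here is a minimal sketch that does not depend on POI. `SparseSheet` and `ColumnCountSketch` are hypothetical stand-ins invented for illustration, not Drill or POI classes. The sketch assumes that an undefined row is returned as null (as described above) and that an empty sheet reports -1 as its first row number.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a spreadsheet sheet: rows are stored sparsely,
// and getRow() returns null for an index that was never defined.
final class SparseSheet {
  private final Map<Integer, List<String>> rows = new HashMap<>();

  void putRow(int index, List<String> cells) {
    rows.put(index, cells);
  }

  // Returns null for an undefined row, which is the case the review warns about.
  List<String> getRow(int index) {
    return rows.get(index);
  }

  // Lowest defined row index, or -1 for an empty sheet (an assumption of this sketch).
  int getFirstRowNum() {
    return rows.keySet().stream().min(Integer::compare).orElse(-1);
  }
}

final class ColumnCountSketch {
  // Same shape as the suggested refactoring: pick the row index, fetch the
  // row once, and fall back to 0 columns when the row does not exist.
  static int getColumnCount(SparseSheet sheet, int headerRow) {
    int rowNumber = headerRow > 0 ? sheet.getFirstRowNum() : 0;
    List<String> row = sheet.getRow(rowNumber);
    return row != null ? row.size() : 0;
  }
}
```

With this shape, an empty sheet or a sheet whose row 0 is undefined yields a column count of 0 instead of throwing, so no `try`/`catch` around the lookup is needed.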
> Update Apache POI to Latest Version
> -----------------------------------
>
> Key: DRILL-7514
> URL: https://issues.apache.org/jira/browse/DRILL-7514
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Minor
> Fix For: 1.18.0
>
>
> Drill's Excel Format Plugin uses Apache POI to parse Excel files. While this
> reader is effective in that it parses formulae and data types, it uses memory
> inefficiently and will struggle to read very large Excel files.
> The latest version of POI addresses some of the memory issues and hopefully
> Drill will be able to query larger Excel files without running out of memory.
>