[
https://issues.apache.org/jira/browse/NIFI-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896701#comment-17896701
]
Philipp Korniets commented on NIFI-13988:
-----------------------------------------
Also, I noticed that ConvertExcelToCSV uses string types everywhere:
https://github.com/apache/nifi/blob/8ecf23e77c8ca828a77f3b84554ed3347d8f7fa2/nifi-nar-bundles/nifi-poi-bundle/nifi-poi-processors/src/main/java/org/apache/nifi/processors/poi/ConvertExcelToCSVProcessor.java#L475C1-L480C14
{code:java}
String stringCellValue = cell.getStringCellValue();
final CellType type = cell.getCellType();
if(type.equals(CellType.BOOLEAN) && formatBooleans) {
    stringCellValue = stringCellValue.equals("1") ? "TRUE" : "FALSE";
}
{code}
Given that we can't control what data comes from the vendor, would this be a
better approach, or at least an option?
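To illustrate the idea, the error in the description comes down to Long.parseLong being handed an empty string. Below is a minimal sketch of a blank-tolerant conversion. BlankSafeNumbers and its toLong method are hypothetical names used only for this example; they are not part of NiFi or Apache POI, and this is not a proposal for how ExcelRecordReader should actually be changed.

```java
import java.util.Optional;

// Hypothetical helper, not a NiFi class: treats blank cell text as a missing
// value instead of passing it on to Long.parseLong.
final class BlankSafeNumbers {

    private BlankSafeNumbers() {
    }

    // Returns an empty Optional for null or blank text, otherwise the parsed long.
    // Non-blank text that is not a valid long still throws NumberFormatException.
    static Optional<Long> toLong(final String cellText) {
        if (cellText == null || cellText.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Long.parseLong(cellText.trim()));
    }
}
```

With something along these lines, an empty cell in a numeric column would map to a missing value, while genuinely malformed numbers would still fail loudly.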
> ExcelReader - Use Starting Row schema strategy and string empty values
> ----------------------------------------------------------------------
>
> Key: NIFI-13988
> URL: https://issues.apache.org/jira/browse/NIFI-13988
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 2.0.0
> Reporter: Philipp Korniets
> Assignee: Daniel Stieglitz
> Priority: Major
> Attachments: Test workbook NiFi2_0.xlsx
>
>
> When Use Starting Row is the schema strategy in ExcelReader, it analyses the
> first 10 rows. The problem appears with empty cells of *Numerical* type, which
> can appear anywhere after the first 10 rows. Such cells *look like* NULL, but
> are actually empty strings.
> A file with example data is attached.
> Field: Exercise Price.
> Use Starting Row throws an error:
> {code:java}
> Caused by: java.lang.NumberFormatException: For input string: ""
>     at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
>     at java.base/java.lang.Long.parseLong(Long.java:719)
>     at java.base/java.lang.Long.parseLong(Long.java:832)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.toLong(DataTypeUtils.java:1391)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:213)
>     at org.apache.nifi.serialization.record.util.DataTypeUtils.convertType(DataTypeUtils.java:174)
>     at org.apache.nifi.excel.ExcelRecordReader.convert(ExcelRecordReader.java:170)
>     at org.apache.nifi.excel.ExcelRecordReader.lambda$getCurrentRowValues$0(ExcelRecordReader.java:127)
>     at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
>     at java.base/java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:617)
>     at org.apache.nifi.excel.ExcelRecordReader.getCurrentRowValues(ExcelRecordReader.java:114)
>     at org.apache.nifi.excel.ExcelRecordReader.nextRecord(ExcelRecordReader.java:84)
>     ... 28 common frames omitted
> {code}