[
https://issues.apache.org/jira/browse/NIFI-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688050#comment-17688050
]
Daniel Stieglitz commented on NIFI-10792:
-----------------------------------------
[~exceptionfactory] I am not sure how NIFI-11167 will help if we are to use the
POI API. As far as I can tell, ConvertExcelToCSVProcessor currently uses the XSSF
Event API. The stack trace when reading an Excel file larger than 10MB is:
{code:java}
at org.apache.poi.util.IOUtils.throwRFE(IOUtils.java:599)
at org.apache.poi.util.IOUtils.checkLength(IOUtils.java:276)
at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:230)
at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:203)
at org.apache.poi.openxml4j.util.ZipArchiveFakeEntry.<init>(ZipArchiveFakeEntry.java:82)
at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:98)
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:132)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:319)
at org.apache.nifi.processors.poi.ConvertExcelToCSVProcessor$1.process(ConvertExcelToCSVProcessor.java:240)
{code}
The starting point for using the XSSF Event API is to use the
{code:java}
org.apache.poi.openxml4j.opc.OPCPackage
{code}
which, as the stack trace above shows, is where the failure originates.
Also, I am not sure how the SXSSF model would be pertinent, as it is used for
creating an Excel document, whereas we are trying to create a reader.
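
To make the failure in the stack trace easier to picture, here is a minimal sketch of the kind of guard that the IOUtils.toByteArray / checkLength frames suggest: an entry is read fully into memory and rejected once it exceeds a maximum length. This is an assumption about the general shape of the check, not POI's actual code. The class name BoundedReadSketch, its method, and the small limit used in main are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; this is NOT POI's implementation.
final class BoundedReadSketch {

    // Reads the whole stream into memory and fails as soon as more than
    // maxLength bytes have been read.
    static byte[] toByteArray(InputStream in, int maxLength) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
            if (total > maxLength) {
                throw new IllegalStateException(
                        "Tried to read " + total + " bytes, but the limit is " + maxLength);
            }
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Small stand-in for the 100,000,000 limit quoted in the reported error.
        int limit = 1_000;

        byte[] ok = toByteArray(new ByteArrayInputStream(new byte[500]), limit);
        System.out.println("Read " + ok.length + " bytes");

        try {
            toByteArray(new ByteArrayInputStream(new byte[1_500]), limit);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

If the real check behaves like this, raising the limit (as the error message suggests via IOUtils.setByteArrayMaxOverride()) only moves the threshold. The entry is still buffered in memory in full.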
> ConvertExcelToCSVProcessor : Failed to convert file over 10MB
> --------------------------------------------------------------
>
> Key: NIFI-10792
> URL: https://issues.apache.org/jira/browse/NIFI-10792
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core UI
> Affects Versions: 1.17.0, 1.16.3, 1.18.0
> Reporter: mayki
> Priority: Critical
> Labels: Excel, csv, processor
> Fix For: 1.15.3
>
> Attachments: ConvertExcelToCSVProcessor_1_18_0_with_POI_OLD.PNG,
> ConvertExcelToCSVProcessor_1_19_1.PNG
>
>
> Hello all,
> It seems all versions later than 1.15.3 introduce a failure in the processor
> *ConvertExcelToCSVProcessor* with this error:
> {code:java}
> Tried to allocate an array of length 101,695,141, but the maximum length for
> this record type is 100,000,000. If the file is not corrupt or large, please
> open an issue on bugzilla to request increasing the maximum allowable size
> for this record type. As a temporary workaround, consider setting a higher
> override value with IOUtils.setByteArrayMaxOverride() {code}
> I have tested with 2 different NiFi instances: version 1.15.3 ==> works OK.
> Since upgrading to 1.16, 1.17, and 1.18 ==> the same processor *fails* with files
> greater than 10MB.
> Could you help us correct this bug?