[ 
https://issues.apache.org/jira/browse/NIFI-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697026#comment-17697026
 ] 

David Handermann commented on NIFI-10792:
-----------------------------------------

Thanks for working on a solution [~dstiegli1]!

In answer to your questions:

1. The excel-streaming-reader is licensed under the Apache Software License 2.0 
and does not include any special notices, so in this case no additional license 
or notice changes are required.

2. Does the unit test generate a 20 MB file, or does it add a 20 MB file to the 
repository? Adding a 20 MB file to the repository is not a good option. If the 
file is generated programmatically, then the question is how long the unit test 
takes to run. If it is more than 1 second, the test could be annotated with 
EnabledIfSystemProperty("nifi.test.performance"), which disables the test by 
default, but allows it to be run when the named System property is specified.

3. In the case of row cache size and buffer size, I think less is more in terms 
of Property Descriptors. In other words, setting reasonable defaults is better 
than introducing user-facing properties that most users will never need to 
change. I recommend a row cache size of 100 and a buffer size of 4096, 
following the example from the library itself. Perhaps the buffer size could be 
larger, but if the changes have the general behavior of always reading forward 
through rows, then the small buffer size seems like the best option.
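
The opt-in behavior described in point 2 can be sketched in plain Java. JUnit 5's 
@EnabledIfSystemProperty takes a "named" property and a "matches" regular 
expression, and the test is skipped when the property is absent or its value does 
not match. The sketch below mimics only that check, without JUnit on the 
classpath. The class name PerformanceTestGate, the method isEnabled, and the 
"true" pattern are illustrative assumptions, not NiFi code.

```java
import java.util.regex.Pattern;

public class PerformanceTestGate {
    // Property name taken from the comment above.
    static final String PROPERTY = "nifi.test.performance";

    // Hypothetical helper: returns true only when the named System property
    // is set and its entire value matches the given regular expression.
    static boolean isEnabled(String named, String matches) {
        String value = System.getProperty(named);
        return value != null && Pattern.matches(matches, value);
    }

    public static void main(String[] args) {
        // Result depends on whether -Dnifi.test.performance=... was passed to the JVM.
        System.out.println("enabled at startup: " + isEnabled(PROPERTY, "true"));

        // Setting the property programmatically stands in for the -D flag.
        System.setProperty(PROPERTY, "true");
        System.out.println("enabled after setting: " + isEnabled(PROPERTY, "true"));
    }
}
```

With a gate like this, the slow test stays out of the default build and runs only 
when the JVM executing the tests is started with -Dnifi.test.performance=true 
(assuming "true" is the pattern chosen).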

> ConvertExcelToCSVProcessor : Failed to convert file over 10MB 
> --------------------------------------------------------------
>
>                 Key: NIFI-10792
>                 URL: https://issues.apache.org/jira/browse/NIFI-10792
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core UI
>    Affects Versions: 1.17.0, 1.16.3, 1.18.0
>            Reporter: mayki
>            Priority: Critical
>              Labels: Excel, csv, processor
>             Fix For: 1.15.3
>
>         Attachments: ConvertExcelToCSVProcessor_1_18_0_with_POI_OLD.PNG, 
> ConvertExcelToCSVProcessor_1_19_1.PNG
>
>
> Hello all,
> It seems all versions after 1.15.3 introduce a failure in the processor 
> *ConvertExcelToCSVProcessor* with this error:
> {code:java}
> Tried to allocate an array of length 101,695,141, but the maximum length for 
> this record type is 100,000,000. If the file is not corrupt or large, please 
> open an issue on bugzilla to request increasing the maximum allowable size 
> for this record type. As a temporary workaround, consider setting a higher 
> override value with IOUtils.setByteArrayMaxOverride() {code}
> I have tested with 2 different NiFi instances on version 1.15.3 ==> works: OK.
> Since upgrading to 1.16, 1.17, and 1.18 ==> the same processor *fails* with files 
> larger than 10MB.
> Could you help us correct this bug?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
