[ 
https://issues.apache.org/jira/browse/DRILL-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543561#comment-17543561
 ] 

ASF GitHub Bot commented on DRILL-8149:
---------------------------------------

cgivre commented on PR #2483:
URL: https://github.com/apache/drill/pull/2483#issuecomment-1140357037

   > @luocooong @cgivre I'm not sure how unit-testable these changes are. To 
really test the features would require very large xlsx files and even with 
all the new settings in place, the tests would be slow to run and use a lot of 
memory. Would it be feasible to treat these properties as something that users 
would only need to set in rare circumstances? If users run into issues with the 
properties in real-world scenarios, then maybe the code and tests can be beefed 
up then.
   
   I'm ok with not unit testing these.  Since all that is happening here is 
that we are creating some new config variables in the format config and 
passing them down to a reader, I think it is fine.  So, unless there is any 
objection, LGTM +1.
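
   For readers following the thread, the "config variable passed down to a 
reader" pattern can be sketched roughly as below. This is only an illustration 
and not the code in the PR: `ExcelLimitsSketch`, `FormatConfig`, `Reader`, 
`maxArraySize` and `applyLimits` are hypothetical names, and treating a negative 
value as "no override" is an assumption of the sketch. The only real API 
involved is POI's static `IOUtils.setByteArrayMaxOverride(int)`, which is 
injected as an `IntConsumer` so the sketch compiles and runs without POI on the 
classpath.

```java
import java.util.function.IntConsumer;

// Hypothetical names throughout; these are not the actual Drill classes.
public class ExcelLimitsSketch {

  /** Stand-in for the format config: carries an optional byte array limit. */
  static final class FormatConfig {
    // Assumed convention for this sketch: a negative value means "no override".
    final int maxArraySize;

    FormatConfig(int maxArraySize) {
      this.maxArraySize = maxArraySize;
    }
  }

  /** Stand-in for the reader: applies the limit before a workbook is opened. */
  static final class Reader {
    private final IntConsumer byteArrayMaxOverride;

    Reader(IntConsumer byteArrayMaxOverride) {
      this.byteArrayMaxOverride = byteArrayMaxOverride;
    }

    /** Forwards the configured limit; returns true if an override was applied. */
    boolean applyLimits(FormatConfig config) {
      if (config.maxArraySize < 0) {
        return false; // nothing configured, leave the library default in place
      }
      byteArrayMaxOverride.accept(config.maxArraySize);
      return true;
    }
  }

  public static void main(String[] args) {
    int[] applied = {-1};
    // With POI available, pass IOUtils::setByteArrayMaxOverride here instead.
    Reader reader = new Reader(value -> applied[0] = value);
    boolean changed = reader.applyLimits(new FormatConfig(500_000_000));
    System.out.println(changed + " " + applied[0]);
  }
}
```

   Because the reader only forwards a value, a unit test can check the 
pass-through with a stub consumer, while the behaviour on genuinely large xlsx 
files would still need the slow, memory-heavy tests discussed above.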




> format-excel plugin needs to support POI IOUtils byte array overrides to 
> support big files
> ------------------------------------------------------------------------------------------
>
>                 Key: DRILL-8149
>                 URL: https://issues.apache.org/jira/browse/DRILL-8149
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Data Types
>    Affects Versions: 1.19.0
>            Reporter: PJ Fanning
>            Priority: Major
>
> [https://poi.apache.org/components/configuration.html] - see 
> [org.apache.poi.util.IOUtils.setByteArrayMaxOverride(int 
> maxOverride)|https://poi.apache.org/apidocs/5.0/org/apache/poi/util/IOUtils.html#setByteArrayMaxOverride-int-]
> Core POI code tries to set limits on resource allocations. 
> excel-streaming-reader may not be as heavily affected by these settings 
> because it only uses parts of the core POI codebase.
> POI 5.2.1 (due in next few weeks) fixes a few issues but there is some 
> evidence that core POI users are hitting issues when loading large files and 
> having to set the byte array max override setting.
> I can do some testing of the format-excel plugin to see if it can hit these 
> issues with large files.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
