GitHub user centic9-dt added a comment to the discussion: Is There a Better 
Architecture Than Per-Formula Ephemeral Workbooks?

Apache POI is mostly concerned with reading and writing the various Microsoft 
formats. Evaluating and handling formulas is more of a necessity for being able 
to read and write files.

Therefore, formula evaluation currently depends heavily on the underlying 
workbook structures. Formulas also usually reference other cells, so there is a 
natural need to access data in the workbook throughout formula evaluation.
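
To illustrate that dependency, here is a minimal sketch (the class and method 
names are made up for this example): the evaluator is created from the workbook, 
and resolving `A1+B1` means reading the referenced cells from that workbook. It 
uses HSSF only to keep the example small; the `org.apache.poi.ss.usermodel` 
calls are the same for XSSF.

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellValue;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example class, not part of Apache POI.
public class FormulaNeedsWorkbook {

    // Builds a throwaway workbook, puts a and b into A1 and B1,
    // and evaluates the formula "A1+B1" stored in C1.
    static double evaluateSum(double a, double b) {
        try (Workbook wb = new HSSFWorkbook()) {
            Sheet sheet = wb.createSheet("calc");
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue(a);   // A1
            row.createCell(1).setCellValue(b);   // B1

            Cell formula = row.createCell(2);    // C1
            formula.setCellFormula("A1+B1");

            // The evaluator is tied to the workbook it was created from
            // and looks up A1 and B1 there.
            FormulaEvaluator evaluator =
                    wb.getCreationHelper().createFormulaEvaluator();
            CellValue result = evaluator.evaluate(formula);
            return result.getNumberValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluateSum(2.0, 3.0));
    }
}
```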

Large parts of formula evaluation should be independent of the actual 
Workbook-format (HSSF, XSSF, SXSSF), so you might be able to experiment with a 
few things:
* Profile your workload to find where the actual CPU and GC hotspots are, using 
tools like IntelliJ IDEA, Java Flight Recorder or similar. Maybe things can be 
improved for your specific case
* Try to cache the XSSFWorkbook and just "clean it" between calls, thus 
reducing memory allocations and lowering GC impact, but at the risk of potential 
side-effects between operations
* The XML-based XSSF/SXSSF classes introduce more memory churn than the older 
binary format. Using the .xls format via HSSFWorkbook might improve performance 
if your downstream processing can handle this older binary format
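
A rough sketch of how the second and third ideas could be combined, assuming 
your formulas only need a handful of numeric inputs. `ReusableEvaluator` is a 
hypothetical helper, not a POI class: it keeps one in-memory HSSFWorkbook alive, 
removes the previous input cells before each call, and clears the evaluator's 
cached results so nothing stale leaks into the next evaluation. It is not 
thread-safe, and whether it actually helps is something only profiling your own 
workload can tell.

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: one long-lived workbook with a scratch formula cell.
// Not thread-safe; use one instance per thread.
public class ReusableEvaluator implements AutoCloseable {
    private final Workbook workbook = new HSSFWorkbook();
    private final Sheet sheet = workbook.createSheet("scratch");
    private final Row inputs = sheet.createRow(0);                      // A1, B1, ...
    private final Cell formulaCell = sheet.createRow(1).createCell(0);  // A2
    private final FormulaEvaluator evaluator =
            workbook.getCreationHelper().createFormulaEvaluator();

    // Evaluates a formula against the given inputs (placed in A1, B1, ...).
    // Assumes the formula produces a numeric result.
    public double evaluate(String formula, double... values) {
        // "Clean" step: drop input cells left over from the previous call.
        for (int i = inputs.getLastCellNum() - 1; i >= 0; i--) {
            Cell old = inputs.getCell(i);
            if (old != null) {
                inputs.removeCell(old);
            }
        }
        for (int i = 0; i < values.length; i++) {
            inputs.createCell(i).setCellValue(values[i]);
        }
        formulaCell.setCellFormula(formula);

        // Cell contents changed behind the evaluator's back, so reset its cache.
        evaluator.clearAllCachedResultValues();
        return evaluator.evaluate(formulaCell).getNumberValue();
    }

    @Override
    public void close() {
        try {
            workbook.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (ReusableEvaluator reusable = new ReusableEvaluator()) {
            System.out.println(reusable.evaluate("SUM(A1:C1)", 1, 2, 3));
            // Only two inputs this time: the old C1 must not be counted.
            System.out.println(reusable.evaluate("SUM(A1:C1)", 4, 5));
        }
    }
}
```

The second call in `main` is the interesting one: without the clean step, the 
old value in C1 would still be part of the sum, which is exactly the kind of 
side-effect between operations to watch out for.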


GitHub link: 
https://github.com/apache/poi/discussions/1061#discussioncomment-16875933
