How about we adopt a fully lazy-evaluation approach for Formula objects? A Formula would be constructed from either _byteEncoding and _encodedTokenLen, or from _ptgTokens, but not necessarily both. Ptg.readTokens or Ptg.serializeTokens would be called only when the missing representation is actually needed.
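Here is a minimal sketch of the idea, not a patch against the real class. Everything in it is hypothetical: LazyFormula stands in for Formula, int[] stands in for Ptg[], and the toy decode/encode methods stand in for Ptg.readTokens and Ptg.serializeTokens (the real signatures differ). The call counters exist only to show when a conversion actually runs.

```java
import java.util.Arrays;

/** Hypothetical stand-in for Formula; illustrates lazy conversion only. */
final class LazyFormula {
    // Demo-only counters that record how often each conversion runs.
    static int decodeCalls = 0;
    static int encodeCalls = 0;

    private byte[] byteEncoding;   // null until supplied or serialized on demand
    private int encodedTokenLen;
    private int[] tokens;          // stand-in for Ptg[]; null until supplied or parsed on demand

    private LazyFormula(final byte[] byteEncoding, final int encodedTokenLen) {
        this.byteEncoding = byteEncoding.clone();
        this.encodedTokenLen = encodedTokenLen;
    }

    private LazyFormula(final int[] cachedTokens) {
        this.tokens = cachedTokens.clone();
    }

    static LazyFormula fromBytes(final byte[] byteEncoding, final int encodedTokenLen) {
        return new LazyFormula(byteEncoding, encodedTokenLen);
    }

    static LazyFormula fromTokens(final int[] cachedTokens) {
        return new LazyFormula(cachedTokens);
    }

    /** Parses the byte encoding on first use, then reuses the cached tokens. */
    int[] getTokens() {
        if (tokens == null) {
            tokens = decode(byteEncoding, encodedTokenLen);
        }
        return tokens.clone();
    }

    /** Serializes the tokens on first use, then reuses the cached bytes. */
    byte[] getBytes() {
        if (byteEncoding == null) {
            byteEncoding = encode(tokens);
            encodedTokenLen = byteEncoding.length;
        }
        return byteEncoding.clone();
    }

    // Toy codec (assumption): one unsigned byte per token.
    private static int[] decode(final byte[] data, final int len) {
        decodeCalls++;
        final int[] out = new int[len];
        for (int i = 0; i < len; i++) {
            out[i] = data[i] & 0xFF;
        }
        return out;
    }

    private static byte[] encode(final int[] toks) {
        encodeCalls++;
        final byte[] out = new byte[toks.length];
        for (int i = 0; i < toks.length; i++) {
            out[i] = (byte) toks[i];
        }
        return out;
    }

    public static void main(final String[] args) {
        final LazyFormula f = fromBytes(new byte[] {1, 2, 3}, 3);
        f.getTokens();
        f.getTokens(); // second call hits the cache
        System.out.println(Arrays.toString(f.getTokens()));
        System.out.println("decodeCalls=" + decodeCalls + ", encodeCalls=" + encodeCalls);
    }
}
```

Note that this sketch is not thread-safe: two threads calling getTokens() at once could both run the decode. If Formula instances are shared across threads, the lazy fields would need synchronization or a benign-race design.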
As I understand it, reading and serializing tokens is pretty expensive, and eager evaluation is wasted work if the results are never used. There should be:

> private Formula(final byte[] byteEncoding, final int encodedTokenLen);
> private Formula(final Ptg[] cachedTokens);

I don't think a third constructor is necessary with the current code and enough lazy evaluation, but here it is anyway:

> private Formula(final byte[] byteEncoding, final int encodedTokenLen, final Ptg[] cachedTokens);

I think we're less likely to rewrite org.apache.poi.hssf.model with SpreadsheetVersion code, since it would encourage incorrect usage (the mailing list would explode). However, if you can get to the bottom of why HSSF formula evaluation is faster with some profiling (perhaps it's the expensive xmlbean reading and writing that's slowing XSSF down), then we could go down that avenue.

1. Find and fix the largest contributors to the overall formula evaluation time for your XSSF and HSSF test cases.

2. Create a GenericSSEvaluationWorkbook that stores sheets, cells, and rows in plain old Java objects, without the underlying HSSF bytestream or XSSF/SXSSF XML data structures that are needed for serialization. This wouldn't need to be a writeable workbook. It could plug into the current formula evaluation code.

3. Rewrite some of the XSSF classes that wrap an xmlbean so that they fully read their data from the xmlbean up front and only recreate the bean when writing out. We want to go in this direction with XSSF, as it would make POI faster, use less memory, and make it easier to transition to a different XML library. Maybe I'm overstating, but long term this would make it easier to merge HSSF, XSSF, SXSSF, and other SS interfaces, enabling format-agnostic classes that could convert between xls and xlsx.

4. Other ideas?
