cowwoc opened a new pull request, #952:
URL: https://github.com/apache/poi/pull/952

   ## Summary
   
   Adds `XSSFOptimiser` class to provide optimization methods for XSSF 
workbooks, equivalent to the existing `HSSFOptimiser` for HSSF workbooks.
   
   This addresses the common "style explosion" problem where programmatically 
created XSSF workbooks accumulate too many duplicate cell styles and fonts, 
causing:
   - Slow file opening in Excel
   - Excel hangs or crashes with large workbooks
   - Increased file size
   - The "Too many different cell formats" error
   
   ## Changes
   
   **New files:**
   - `poi-ooxml/src/main/java/org/apache/poi/xssf/usermodel/XSSFOptimiser.java`
   - 
`poi-ooxml/src/test/java/org/apache/poi/xssf/usermodel/TestXSSFOptimiser.java`
   
   **Methods added:**
   - `optimiseCellStyles(XSSFWorkbook)` - Remaps cells using duplicate styles 
to canonical (first) occurrence
   - `optimiseFonts(XSSFWorkbook)` - Updates style font references to canonical 
fonts
   
   ## Implementation Notes
   
   Unlike HSSF, XSSF's `StylesTable` maintains a separate Java list from the 
underlying `CTStylesheet` XML. Removing entries from `CTStylesheet` can cause 
`XmlValueDisconnectedException` on save because the Java objects become 
disconnected from their XML backing.
   
   The implementation uses a **safe remapping approach**:
   - Identifies duplicate styles/fonts by comparing all properties (not indices)
   - Remaps cells to use canonical (first occurrence) styles
   - Does NOT remove entries from XML (avoids disconnection issues)
   - Prevents further style explosion
   
   ## Test Coverage
   
   Includes 6 comprehensive tests:
   - `testOptimiseCellStylesRemapsDuplicates`
   - `testOptimiseCellStylesHandlesUnusedStyles`
   - `testOptimiseCellStylesPreservesDifferentStyles`
   - `testOptimiseFontsRemapsDuplicates`
   - `testWorkbookSaveableAfterOptimization`
   - `testOptimizationPreservesFormulas`
   
   ## Usage Example
   
   ```java
   XSSFWorkbook workbook = new XSSFWorkbook(new 
FileInputStream("bloated.xlsx"));
   XSSFOptimiser.optimiseFonts(workbook);
   XSSFOptimiser.optimiseCellStyles(workbook);
   workbook.write(new FileOutputStream("optimized.xlsx"));
   ```
   
   Closes #951


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to