Hi,

The OOXML support in POI works really well for Apache Tika, but I'm a bit annoyed [1] by the size of the ooxml-schemas jar file.

I looked at the compile-ooxml-xsds target that generates the jar, and the difference in the input and output sizes is pretty amazing:

* input: 220K ooxml-lib/OfficeOpenXML-XMLSchema.zip
* output: 14M ooxml-lib/ooxml-schemas-1.0.jar

That's a 62x difference! Are all of the generated code and xsd snippets in ooxml-schemas needed by POI, or would there be some way to streamline the jar?

[1] http://jukkaz.wordpress.com/2009/10/16/putting-poi-on-a-diet/

BR,

Jukka Zitting
