Nick, do you think we should follow up on the Tika side? Do we know if we can handle this?
---------- Forwarded message ---------
From: Nick Burch <[email protected]>
Date: Fri, Oct 9, 2020 at 4:43 PM
Subject: XLSX wrapped in an OLE2 CompObj/Package - should WorkbookFactory handle it?
To: <[email protected]>

Hi All

Over on Stack Overflow <https://stackoverflow.com/q/64269294/685641> there's a user who was getting what they thought was an embedded XLSX file out of a PPT, but finding it was an OLE2 wrapper with CompObj and Package entries. The real XLSX was in the Package part. Passing the outer OLE2 stream to WorkbookFactory didn't work.

What do people think here? Should we have WorkbookFactory spot this case, grab the OOXML out of the POIFS and try to load that? Update HSLF to optionally extract the OOXML out of the OLE2? Record the gotcha in the docs somewhere? Something else?

Cheers
Nick
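
For anyone wanting to reproduce the "spot this case" step by hand, here is a minimal sketch in plain Java that classifies a stream by its leading magic bytes. It is not POI or Tika code: `ContainerSniffer` and `sniff` are hypothetical names made up for illustration, and it only tells an OLE2 container apart from a ZIP-based (OOXML-style) one. It does not look inside the OLE2 for a Package entry; that part would still need POIFS.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of POI or Tika: classifies data by its magic bytes. */
final class ContainerSniffer {
    // OLE2 compound document header signature (first 8 bytes of the file)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\3\4", which OOXML packages start with
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Returns "OLE2", "ZIP" or "UNKNOWN" based on the leading bytes. */
    static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return "OLE2";
        if (startsWith(header, ZIP_MAGIC)) return "ZIP";
        return "UNKNOWN";
    }

    // True if data begins with prefix; false for null or too-short input
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        // Sample header: the OLE2 signature followed by padding
        byte[] sample = Arrays.copyOf(OLE2_MAGIC, 16);
        System.out.println(sniff(sample)); // prints "OLE2"
    }
}
```

A caller that sees "OLE2" for something it expected to be an XLSX would then know it has the wrapper rather than the workbook, and could go looking for the Package entry before handing anything to WorkbookFactory.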
