The reports from 3.17 vs 4.0.0-SNAPSHOT are here: http://162.242.228.174/reports/poi-4.0.0_reports.tar.gz

Aside from the two issues I've already identified (the StackOverflowError and a small regression in boilerplate/template identification in ppt), it looks like more files are being identified as tika-ooxml rather than as .docx or .pptx. This may be a Tika-level issue, but I want to look into that.

If anyone notices anything else, please let me know!

On Wed, Aug 8, 2018 at 7:25 AM Tim Allison <[email protected]> wrote:
>
> Hi Andi,
> I think I'm mostly good. If you could take a look at:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=62592. The thmx file
> doesn't open in pptx and may be malformed, but we need to prevent the
> StackOverflowError...infinite recursion.
> I don't like the patch because it requires the caching of
> ContentTypeEntry after its initial creation, which we're doing
> currently, but it doesn't feel right to require that.
> So, if you have a better solution, please help! :)
>
> Thank you!
>
> Best,
>
> Tim
> On Tue, Aug 7, 2018 at 4:54 PM Andreas Beeker <[email protected]> wrote:
> >
> > Hi Tim,
> >
> > On 7/31/18 9:49 PM, Tim Allison wrote:
> > > I'm trying to upgrade Tika to 4.0.0-SNAPSHOT.
> > >
> > > 2) To confirm OLEShape has become HSLFObjectShape?
> >
> > You are correct.
> >
> > Can I help you with the upgrade?
> >
> > Andi
