The reports from 3.17 vs 4.0.0-SNAPSHOT are here:
http://162.242.228.174/reports/poi-4.0.0_reports.tar.gz

Aside from the two issues I've already identified (the
StackOverflowError and a small regression in boilerplate/template
identification in ppt), it looks like more files are being identified
as tika-ooxml rather than as .docx or .pptx.  This may be a Tika-level
issue, but I want to look into that.

If anyone notices anything else, please let me know!
On Wed, Aug 8, 2018 at 7:25 AM Tim Allison <[email protected]> wrote:
>
> Hi Andi,
>   I think I'm mostly good.  If you could take a look at:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=62592.  The thmx file
> doesn't open in pptx and may be malformed, but we need to prevent the
> StackOverflowError...infinite recursion.
>   I don't like the patch because it requires the caching of
> ContentTypeEntry after its initial creation, which we're doing
> currently, but it doesn't feel right to require that.
>   So, if you have a better solution, please help! :)
>
>   Thank you!
>
>             Best,
>
>                    Tim
> On Tue, Aug 7, 2018 at 4:54 PM Andreas Beeker <[email protected]> wrote:
> >
> > Hi Tim,
> >
> > On 7/31/18 9:49 PM, Tim Allison wrote:
> > >   I'm trying to upgrade Tika to 4.0.0-SNAPSHOT.
> > >
> > > 2) To confirm OLEShape has become HSLFObjectShape?
> >
> > You are correct.
> >
> > Can I help you with the upgrade?
> >
> > Andi
> >
> >
