Dunno where you are on this...I'm still snowed in. It would be great if we
could upgrade to PDFBox 1.8.11 if we haven't done so yet. TIKA-1830. Last I
tried, we had to remove some "exceptional" handling in the unit test comparing
the sequential to the non-sequential parser because the tests now pass. Other
than that, should be straightforward. I raise this only because Uwe Schindler
noted how important this improvement is for Solr running on Java 9.
If I had time, I'd also want to finish the upgrade to POI and then run the
massive corpus tests. Maybe tomorrow, but not today...argh...
Cheers,
Tim
________________________________________
From: Markus Jelsma <[email protected]>
Sent: Thursday, January 21, 2016 3:41 PM
To: [email protected]
Subject: RE: [DISCUSS] Tika 1.12-rc1 (was Re: New Tika release)
Chris - that would be awesome! Nutch 1.12 can then bundle Tika 1.12!
Markus
-----Original message-----
> From: Mattmann, Chris A (3980) <[email protected]>
> Sent: Thursday 21st January 2016 21:30
> To: [email protected]
> Subject: [DISCUSS] Tika 1.12-rc1 (was Re: New Tika release)
>
> Fine by me. I can cut a 1.12-rc1 this weekend.
>
> If I don’t hear objections from the other devs, I’ll go for it
> on Friday. Also this will be the first Git release, so should
> be fun! :)
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Markus Jelsma <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Thursday, January 21, 2016 at 12:27 PM
> To: "[email protected]" <[email protected]>
> Subject: New Tika release
>
> >Hello PMC,
> >
> >With TIKA-1835 committed, Apache Nutch can finally fully support text and
> >link extraction via Boilerpipe, something many Nutch users (myself not
> >included) have been looking forward to for the last few years. We, as
> >Nutch PMC, cannot release Nutch with that support without a new Tika
> >release, so our users must wait until one is available. I do not want to
> >put an additional burden on a Tika release manager or anyone else, but I do
> >want to kindly ask the Tika PMC to discuss a possible early release of a
> >new Apache Tika.
> >
> >Please let me know what you think.
> >
> >Regards,
> >Markus
>
>