Hi Chris,

I have tested Tika 1.9-rc2. In particular, I checked the new work on CTAKESParser. Thank you for your great work.
My vote for this RC is +1.

Thanks,
Giuseppe

On Mon, Jun 8, 2015 at 8:58 AM, Konstantin Gribov <[email protected]> wrote:

> Hi, Chris.
>
> The SHA1 hash and GPG signature are valid for all published artifacts. I've
> tested 1.9-rc2 on several text docs (rtf, pdf, doc, docx) and the results are
> quite good.
>
> I've found a minor regression since 1.7 (it may be related to POI, not Tika
> itself), but it shouldn't prevent releasing 1.9 from rc2. I'll try to
> create a doc that reproduces it and file a ticket in JIRA, because I can't
> share the original doc file on which it can be reproduced. JFYI,
> o.a.t.p.microsoft.OfficeParser produces U+200B (zero width space)
> where U+00AD (soft hyphen) should be. The same document saved as odt and docx
> has different content (one has U+00AD at the same position, one has nothing
> there, like tika-app-1.7 had).
>
> [x] +1 Release this package as Apache Tika 1.9
> [ ] -1 Do not release this package because…
>
> Thank you for preparing this release.
>
> --
> Best regards,
> Konstantin Gribov
>
> Sun, 7 June 2015 at 4:47, Mattmann, Chris A (3980) <[email protected]>:
>
> > Hi Folks,
> >
> > A second candidate for the Tika 1.9 release is available at:
> >
> > https://dist.apache.org/repos/dist/dev/tika/
> >
> > The release candidate is a zip archive of the sources in:
> > http://svn.apache.org/repos/asf/tika/tags/1.9-rc2/
> >
> > The SHA1 checksum of the archive is
> > 9b78c9e9ce9640b402b7fef8e30f3cdbe384f44c.
> >
> > In addition, a staged maven repository is available here:
> > https://repository.apache.org/content/repositories/orgapachetika-1011/
> >
> >
> > Please vote on releasing this package as Apache Tika 1.9.
> > The vote is open for the next 72 hours and passes if a majority of at
> > least three +1 Tika PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Tika 1.9
> > [ ] -1 Do not release this package because…
> >
> > Cheers,
> > Chris
> >
> > P.S. Of course here is my +1.
> >
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Chief Architect
> > Instrument Software and Science Data Systems Section (398)
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 168-519, Mailstop: 168-527
> > Email: [email protected]
> > WWW: http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Associate Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
