Woohoo! Thank you!

-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattm...@jpl.nasa.gov]
Sent: Thursday, September 29, 2016 1:27 PM
To: dev@tika.apache.org
Subject: Re: Tika 1.14?

If there aren’t any objections I’ll roll 1.14 this weekend with an RC1 by Monday.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect, Instrument Software and Science Data Systems Section (398)
Manager, Open Source Projects Formulation and Development Office (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

On 9/29/16, 8:07 AM, "Allison, Timothy B." <talli...@mitre.org> wrote:

I didn't find any showstoppers. Are we ready for Chris to roll 1.14-rc1?

Some notes:

We're getting quite a few new attachments: 315k (mostly from newly recognized mbox, and MSOffice)

New mimetypes: mbox, text/calendar, x-sh, vnd.djvu, dbf, and many more

The upgraded copy of icu4j is misidentifying a handful of files as UTF-16[LB]E.

We're missing a small amount of text from custom PPT templates (known issue)

We're getting quite a few new exceptions for attachments that weren't formerly extracted. These are unknown embedded objects that are being misidentified as PSD, other image files or TTF.

We're getting quite a few new exceptions for files that are now correctly identified as "x-ms-asx" because they contain invalid xml

-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Wednesday, September 28, 2016 1:34 PM
To: dev@tika.apache.org
Subject: RE: Tika 1.14?

All,

I finished running the regression tests. I have just started going through the results.
Reports are available here: https://github.com/tballison/share/blob/master/tika_comparisons/reports_1_14-trunk_vs_1_13.zip

-----Original Message-----
From: Chris Mattmann [mailto:mattm...@apache.org]
Sent: Thursday, September 22, 2016 12:25 PM
To: dev@tika.apache.org
Subject: Re: Tika 1.14?

Sounds great to me Tim. If you tell me when the tests are done, I’d be happy to RC a release!

On 9/21/16, 11:31 AM, "Allison, Timothy B." <talli...@mitre.org> wrote:

All,

PDFBox 2.0.3 is now integrated, I'm about to push the integration with POI-3.15. I have a few cleanup things I'd like to take care of.

Any other items for 1.14?

Should we aim for Mon 26th for final code changes for 1.14? I can run the regression tests, and then maybe we could cut the release candidate some time mid to end of next week?

Best,
Tim