I didn't find any showstoppers.  Are we ready for Chris to roll 1.14-rc1?

Some notes:
- We're getting quite a few new attachments: 315k (mostly from newly
  recognized mbox, and from MSOffice).
- New mimetypes: mbox, text/calendar, x-sh, vnd.djvu, dbf, and many more.
- The upgraded copy of icu4j is misidentifying a handful of files as
  UTF-16LE or UTF-16BE.
- We're missing a small amount of text from custom PPT templates (known
  issue).
- We're getting quite a few new exceptions for attachments that weren't
  formerly extracted.  These are unknown embedded objects that are being
  misidentified as PSD, other image files, or TTF.
- We're getting quite a few new exceptions for files that are now correctly
  identified as "x-ms-asx"; the exceptions occur because the files contain
  invalid XML.


-----Original Message-----
From: Allison, Timothy B. [mailto:[email protected]] 
Sent: Wednesday, September 28, 2016 1:34 PM
To: [email protected]
Subject: RE: Tika 1.14?

All,
  I finished running the regression tests.  I have just started going through 
the results.

Reports are available here:

https://github.com/tballison/share/blob/master/tika_comparisons/reports_1_14-trunk_vs_1_13.zip



-----Original Message-----
From: Chris Mattmann [mailto:[email protected]] 
Sent: Thursday, September 22, 2016 12:25 PM
To: [email protected]
Subject: Re: Tika 1.14?

Sounds great to me Tim. If you tell me when the tests are done, I’d be happy to 
RC a release!





On 9/21/16, 11:31 AM, "Allison, Timothy B." <[email protected]> wrote:

    All,
      PDFBox 2.0.3 is now integrated, and I'm about to push the integration
with POI-3.15.  I have a few cleanup things I'd like to take care of.
      Any other items for 1.14?
      Should we aim for Mon 26th for final code changes for 1.14?  I can run 
the regression tests, and then maybe we could cut the release candidate some 
time mid to end of next week?
    
           Best,
    
                   Tim
    
    


