Thanks!
+1

BR,
Oleg

On Tue, Aug 4, 2015 at 5:37 AM, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> +1
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
> -----Original Message-----
> From: "Allison, Timothy B." <talli...@mitre.org>
> Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> Date: Tuesday, July 28, 2015 at 11:08 AM
> To: "dev@tika.apache.org" <dev@tika.apache.org>
> Subject: RE: release Tika 1.10?
>
> >Just finished the run against ~2.8 million docs (4.8 million including
> >attachments) from a combination of govdocs1 and Common Crawl.  I compared
> >1.9 with trunk.
> >
> >Most looks good.
> >
> >Some highlights:
> >* Thanks to Andrew Jackson and TIKA-1678, we're now getting better
> >metadata out of ~1300 of the 550k PDFs. This appears to be far more common
> >in Common Crawl PDFs than in govdocs1 PDFs.
> >* No significant changes found in the handful of msg files...I wanted to
> >check after the work on TIKA-1238.
> >* Thanks to Andreas Beeker and TIKA-1046/POI 54332, there are far fewer
> >PPT exceptions.
> >* There are a few more files in Common Crawl that are now incorrectly
> >identified as RFC vs text (TIKA-1602), but this is a tiny handful (a total
> >of 4 documents across both CC and govdocs1).
> >
> >A regret:
> >This run used the digesting parser for both container and embedded files.
> > This causes some truncated (=corrupt) package files to throw an
> >exception before they otherwise would.  The opposite happens, too (more
> >embedded files when using the digester), but this is extremely rare. This
> >means that for truncated gz, x-xz and x-archive files there are many more
> >with fewer attachments in Tika 1.10-SNAPSHOT than in Tika 1.9.
> >
> >With Konstantin's and Bob's fix of TIKA-1524, I think we're in good shape
> >for 1.10...from my perspective.
> >
> >             Best,
> >
> >                       Tim
> >-----Original Message-----
> >From: David Meikle [mailto:loo...@gmail.com]
> >Sent: Sunday, July 26, 2015 10:50 AM
> >To: dev@tika.apache.org
> >Subject: Re: release Tika 1.10?
> >
> >
> >> On 23 Jul 2015, at 14:07, Allison, Timothy B. <talli...@mitre.org>
> >>wrote:
> >>
> >>  With the fix of TIKA-1690, I think it makes sense to roll a new
> >>release (1.10) in the next week or so.  I'd like to get TIKA-1667
> >>(upgrade poi) in before the release.  Are there any other blockers on
> >>1.10?
> >
> >+1 from me too.  As discussed on private, I will roll the release on
> >Tuesday night (UK Time) to give people time to shout for other candidates.
> >
> >Cheers,
> >Dave
>
>
