Thank _you_ Dominik for your tools and collaboration!

-----Original Message-----
From: Dominik Stadler [mailto:[email protected]]
Sent: Wednesday, October 5, 2016 3:33 PM
To: [email protected]
Subject: Re: Apache Tika's public regression corpus

Great writeup, Tim, thanks for taking the time to tell people about things that we do!

Dominik.

On Wed, Oct 5, 2016 at 7:56 PM, Allison, Timothy B. <[email protected]> wrote:

> All,
>
> I recently blogged about some of the work we're doing with a large
> scale regression corpus to make Tika, POI and PDFBox more robust and
> to identify regressions before release. If you'd like to chip in with
> recommendations, requests or Hadoop/Spark clusters (why not shoot for the
> stars), please do!
>
> http://openpreservation.org/blog/2016/10/04/apache-tikas-regression-corpus-tika-1302/
>
> Many thanks, again, to Rackspace for our vm and to Common Crawl and
> govdocs1 for most of our files!
>
> Cheers,
>
> Tim
>
