Re: Resource Sharing Tika Corpus with Any23

Tim Allison Fri, 30 Nov 2018 11:23:23 -0800

I think that'd be great.  Some questions:

1) Would you use the same input docs that we're using or would you
need/want a new TB drive for your input/output?  How much space will
you need for your eval framework including outputs?
2) Would you be willing to coordinate with us and PDFBox and POI
around release times?
3) Would you be running your processing every so often (around your
releases) or would it be constant aside from our releases?  I ask
because I'd like @Tobias Ospelt to have cycles for his fuzzing work
when we're not getting ready for a release.


Onward!

Cheers,

           Tim
On Fri, Nov 30, 2018 at 2:08 PM Lewis John Mcgibbney
<[email protected]> wrote:
>
> Hi dev@tika,
> Over at Any23 we have been discussing the prospect of running large scale
> jobs over a significant, challenging dataset, same as is done with Tika via
> Tika batch on the VM.
> Is there any possibility that a very small number of us from the Any23 team
> could access the VM and the dataset(s)? If the answer is yes, we will move
> ahead with building a test suite.
> Thank you for your consideration dev@tika,
> Lewis
>
> --
>
> *Lewis*
> Dr. Lewis J. McGibbney Ph.D, B.Sc
> *Skype*: lewis.john.mcgibbney
