How many TB would that be?
 
> Still haven’t had time to put the server in a DMZ. Ugh.
> 
> Yes, more than happy to share.
> 
> If anyone has recommendations for file hosting for a couple of TB, let me
> know.
> 
> One option would be to work with CommonCrawl to bump the max file size one
> crawl a year...
> 
> On Tue, Jun 2, 2020 at 1:48 AM Tilman Hausherr <[email protected]>
> wrote:
> 
> > Can we / I access these files? Most differences are improvements or not
> > meaningful, but there are a few I'd like to have a look at, e.g.
> > 
> > commoncrawl3/commoncrawl3/XO/XOAAGISRMRPZQRZF4LSMJERGEYK5QI2T
> > 
> > the word "antrag" loses the first "a". Although maybe the "a" was a big
> > one and gets assigned to another line.
> > 
> > Tilman
> > 
> > On 02.06.2020 at 02:58, Tim Allison wrote:
> > > Reports are available here:
> > > https://github.com/tballison/share/blob/master/tika_comparisons/reports-pdfbox-2.0.20.tgz
> > > Looks like there are trivial differences in content with a slight
> > > improvement over 2.0.19.  I don't see any differences in exceptions or
> > > attachments.
> > > 
> > > Cheers,
> > > 
> > >          Tim
> > > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> > 
> > 
-- 
Maruan Sahyoun

FileAffairs GmbH
Josef-Schappe-Straße 21
40882 Ratingen

Tel: +49 (2102) 89497 88
Fax: +49 (2102) 89497 91
[email protected]
www.fileaffairs.de

Managing Director: Maruan Sahyoun
Commercial Register: AG Düsseldorf, HRB 53837
VAT ID: DE248275827


