How many TB would that be?

> Still haven’t had time to put the server in a DMZ. Ugh.
>
> Yes, more than happy to share.
>
> If anyone has recommendations for file hosting for a couple of TB, let me
> know.
>
> One option would be to work with CommonCrawl to bump the max file size one
> crawl a year...
>
> On Tue, Jun 2, 2020 at 1:48 AM Tilman Hausherr <[email protected]>
> wrote:
>
> > Can we / I access these files? Most differences are improvements or not
> > meaningful, but there are a few I'd like to have a look at, e.g.
> >
> > commoncrawl3/commoncrawl3/XO/XOAAGISRMRPZQRZF4LSMJERGEYK5QI2T
> >
> > the word "antrag" loses the first "a". Although maybe the "a" was a big
> > one and gets assigned to another line.
> >
> > Tilman
> >
> > On 02.06.2020 at 02:58, Tim Allison wrote:
> >
> > > Reports are available here:
> > > https://github.com/tballison/share/blob/master/tika_comparisons/reports-pdfbox-2.0.20.tgz
> > > Looks like there are trivial differences in content with a slight
> > > improvement over 2.0.19. I don't see any differences in exceptions or
> > > attachments.
> > >
> > > Cheers,
> > >
> > > Tim
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]

-- 
Maruan Sahyoun
FileAffairs GmbH
Josef-Schappe-Straße 21
40882 Ratingen

Tel: +49 (2102) 89497 88
Fax: +49 (2102) 89497 91
[email protected]
www.fileaffairs.de

Managing Director: Maruan Sahyoun
Commercial register: AG Düsseldorf, HRB 53837
VAT ID: DE248275827
