Thanks! And indeed, no statics please.
Markus

-----Original message-----
> From:Tim Allison <[email protected]>
> Sent: Monday 26th August 2019 19:40
> To: [email protected]
> Subject: Re: How to increase ZIP bomb maximum depth
>
> Oh, ok. This is helpful. Got it. The AutoDetectParser automatically
> wraps the incoming handler in a SecureContentHandler. Some options...
>
> 1) We could have the AutoDetectParser skip wrapping a
> SecureContentHandler around the incoming handler if the user calls
> parse with a SecureContentHandler...
> 2) We could add SecureContentHandler parameter settings to the
> AutoDetectParser, and it would configure the SecureContentHandler
> accordingly...I think there are a few subtleties, but this might get
> you configurability via tika-config.xml.
>
> I'm not offering static thresholds on the SecureContentHandler. :D
>
> Fellow devs, how else might we make this work and make it configurable
> via tika-config.xml?
>
> Cheers,
>
> Tim
>
>
> On Mon, Aug 26, 2019 at 1:24 PM Markus Jelsma
> <[email protected]> wrote:
> >
> > Hello Tim,
> >
> > I use Tika embedded in another Java application, passing it a custom
> > ContentHandler which collects interesting stuff, which we, after the parse,
> > use to construct meaningful text.
> >
> > ReadableContentHandler handler = new ReadableContentHandler(url,
> > config);
> >
> > AutoDetectParser parser = new AutoDetectParser(tikaConfig);
> > parser.parse(stream, handler, new Metadata(), context);
> >
> > My ContentHandler does not extend SecureContentHandler, so I never have a
> > chance to pass a different value for the nesting limit check.
> >
> > Many thanks,
> > Markus
> >
> > -----Original message-----
> > > From:Tim Allison <[email protected]>
> > > Sent: Monday 26th August 2019 19:11
> > > To: [email protected]
> > > Subject: Re: How to increase ZIP bomb maximum depth
> > >
> > > Hi Markus,
> > >
> > > This requires some work...the zip bomb protections are currently
> > > handled by the handler. We allow for configuration of the parsers,
> > > detectors, charset detectors, but not yet the handlers. IIRC, we've
> > > talked a bit about specifying a custom handler via the commandline at
> > > least in tika-server. I wonder if we should allow for a default
> > > handler configuration that would specify a handler to be used by the
> > > facade Tika.parse(inputStream)?
> > >
> > > Fellow devs have any recommendations?
> > >
> > > How are you currently calling Tika? Via tika-server, Solr's DIH or
> > > something else?
> > >
> > > Best,
> > >
> > > Tim
> > >
> > > On Mon, Aug 26, 2019 at 11:20 AM Markus Jelsma
> > > <[email protected]> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I've been looking around to increase the limit, but I don't seem to be
> > > > able to find how. I know there is a setter for it, but using
> > > > AutoDetectParser, I'd like to set it via tika-config. I haven't seen a
> > > > parameter for tika-config that would set that value and the manual on
> > > > Configuring Tika doesn't mention it.
> > > >
> > > > Many thanks,
> > > > Markus
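
For illustration, option 1 from the thread (skip the automatic wrapping when the caller already passes a depth-checking handler) might look roughly like the sketch below. It is not Tika's code: DepthLimitingHandler, wrapIfNeeded and acceptsDepth are hypothetical names, the default limit of 100 is an assumed value, and only the org.xml.sax classes shipped with the JDK are used, so it runs without Tika on the classpath.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class DepthLimitSketch {

    /** Hypothetical stand-in for a depth-checking handler; NOT Tika's SecureContentHandler. */
    public static class DepthLimitingHandler extends DefaultHandler {
        private final ContentHandler delegate;
        private int maximumDepth = 100; // assumed default, for illustration only
        private int depth = 0;

        public DepthLimitingHandler(ContentHandler delegate) {
            this.delegate = delegate;
        }

        public void setMaximumDepth(int maximumDepth) {
            this.maximumDepth = maximumDepth;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            depth++;
            if (depth > maximumDepth) {
                throw new SAXException("Nesting depth " + depth + " exceeds limit " + maximumDepth);
            }
            delegate.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            delegate.endElement(uri, localName, qName);
            depth--;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            delegate.characters(ch, start, length);
        }
    }

    /** Option 1: wrap only when the caller did not already supply a depth-checking handler. */
    public static ContentHandler wrapIfNeeded(ContentHandler handler) {
        if (handler instanceof DepthLimitingHandler) {
            return handler; // keep the caller's own limits
        }
        return new DepthLimitingHandler(handler);
    }

    /** Helper for the demo: opens and closes `levels` nested elements, reports whether that was allowed. */
    public static boolean acceptsDepth(ContentHandler handler, int levels) {
        try {
            for (int i = 0; i < levels; i++) {
                handler.startElement("", "div", "div", new AttributesImpl());
            }
            for (int i = 0; i < levels; i++) {
                handler.endElement("", "div", "div");
            }
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        DepthLimitingHandler custom = new DepthLimitingHandler(new DefaultHandler());
        custom.setMaximumDepth(200); // caller-chosen limit survives because no re-wrapping happens

        ContentHandler kept = wrapIfNeeded(custom);
        ContentHandler wrapped = wrapIfNeeded(new DefaultHandler());

        System.out.println("custom handler kept as-is: " + (kept == custom));
        System.out.println("150 levels with limit 200: " + acceptsDepth(kept, 150));
        System.out.println("150 levels with default limit: " + acceptsDepth(wrapped, 150));
    }
}
```

With a check like this, a caller who needs a higher limit would build the depth-checking handler around their own ContentHandler and set the limit before calling parse. Whether AutoDetectParser should behave this way is exactly the open question in the thread.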
