Oh, ok. This is helpful. Got it. The AutoDetectParser automatically
wraps the incoming handler in a SecureContentHandler. Some options...
1) We could have the AutoDetectParser skip wrapping a
SecureContentHandler around the incoming handler if the user calls
parse with a SecureContentHandler...
2) We could add SecureContentHandler parameter settings to the
AutoDetectParser, and it would configure the SecureContentHandler
accordingly...I think there are a few subtleties, but this might get
you configurability via tika-config.xml.
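
To make the two options concrete, here is a rough sketch using only the JDK's
SAX classes. It is not Tika code: DepthLimitingHandler is a hypothetical
stand-in for the SecureContentHandler, HandlerWrappingSketch is a hypothetical
stand-in for the wrapping step in the AutoDetectParser, and the default of 100
is just an illustrative value. Option 1 is the instanceof check; option 2 is
the setter that a config file could drive.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for SecureContentHandler: limits element nesting depth. */
class DepthLimitingHandler extends DefaultHandler {
    private final ContentHandler delegate;
    private final int maxDepth;
    private int depth = 0;

    DepthLimitingHandler(ContentHandler delegate, int maxDepth) {
        this.delegate = delegate;
        this.maxDepth = maxDepth;
    }

    int getMaxDepth() {
        return maxDepth;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        depth++;
        // Fail before forwarding the event once the nesting limit is exceeded.
        if (depth > maxDepth) {
            throw new SAXException("Nesting depth " + depth + " exceeds limit " + maxDepth);
        }
        delegate.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        delegate.endElement(uri, localName, qName);
        depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        delegate.characters(ch, start, length);
    }
}

/** Hypothetical stand-in for the parser-side step that wraps the incoming handler. */
class HandlerWrappingSketch {
    // Illustrative default only; option 2 would let configuration override it.
    private int maxDepth = 100;

    void setMaxDepth(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    DepthLimitingHandler secure(ContentHandler incoming) {
        // Option 1: the caller already supplied a secured handler, so reuse it
        // and keep whatever limit the caller chose.
        if (incoming instanceof DepthLimitingHandler) {
            return (DepthLimitingHandler) incoming;
        }
        // Option 2: wrap the plain handler using the parser's configured limit.
        return new DepthLimitingHandler(incoming, maxDepth);
    }
}
```

With something like this, a caller could either wrap their own handler with
whatever limit they need before calling parse, or leave the handler alone and
set the limit on the parser.
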
I'm not offering static thresholds on the SecureContentHandler. :D
Fellow devs, how else might we make this work and make it configurable
via tika-config.xml?
Cheers,
Tim
On Mon, Aug 26, 2019 at 1:24 PM Markus Jelsma
<[email protected]> wrote:
>
> Hello Tim,
>
> I use Tika embedded in another Java application, passing it a custom
> ContentHandler that collects interesting stuff, which we use after the
> parse to construct meaningful text.
>
> ReadableContentHandler handler = new ReadableContentHandler(url, config);
>
> AutoDetectParser parser = new AutoDetectParser(tikaConfig);
> parser.parse(stream, handler, new Metadata(), context);
>
> My ContentHandler does not extend SecureContentHandler, so I never have a
> chance to pass a different value for the nesting limit check.
>
> Many thanks,
> Markus
>
> -----Original message-----
> > From:Tim Allison <[email protected]>
> > Sent: Monday 26th August 2019 19:11
> > To: [email protected]
> > Subject: Re: How to increase ZIP bomb maximum depth
> >
> > Hi Markus,
> >
> > This requires some work...the zip bomb protections are currently
> > handled by the handler. We allow for configuration of the parsers,
> > detectors, charset detectors, but not yet the handlers. IIRC, we've
> > talked a bit about specifying a custom handler via the commandline at
> > least in tika-server. I wonder if we should allow for a default
> > handler configuration that would specify a handler to be used by the
> > facade Tika.parse(inputStream)?
> >
> > Fellow devs have any recommendations?
> >
> > How are you currently calling Tika? Via tika-server, Solr's DIH or
> > something else?
> >
> > Best,
> >
> > Tim
> >
> > On Mon, Aug 26, 2019 at 11:20 AM Markus Jelsma
> > <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > I've been looking for a way to increase the limit, but I can't seem
> > > to find one. I know there is a setter for it, but since I'm using
> > > AutoDetectParser, I'd like to set it via tika-config. I haven't seen a
> > > parameter for tika-config that would set that value, and the manual on
> > > Configuring Tika doesn't mention it.
> > >
> > > Many thanks,
> > > Markus
> > >
> > >
> >