Hi Tim,

Could you clarify the pros and cons of ForkParser (after your
refactoring) versus TikaServer? Maybe we should send those to the users
list and the wiki...

Thanks

2018-05-29 16:27 GMT-03:00 Tim Allison <talli...@apache.org>:

> Ken,
>   Once TIKA-2653 is done and 1.19(?) is released, I'll propose switching
> ERH to the ForkParser.  There's also an open ticket for using tika-server.
> I think users should have both options.
>
> On Tue, May 29, 2018 at 3:25 PM, Tim Allison <talli...@apache.org> wrote:
>
>> 1: CORRECTION: the ForkParser by itself (without my mods) will protect
>> against OOMs, permanent hangs, and native lib crashes.  My proposed mods (on
>> TIKA-2653) only move the parser dependencies out of Solr's dependencies.
>>
>> 2: Also, note the discussion on where to place this information.
>> Cassandra Targett advocates putting this guidance in the main users' guide.
>>
>> On Tue, May 29, 2018 at 3:22 PM, Tim Allison <talli...@apache.org> wrote:
>>
>>> Yes, my mods to the ForkParser should make it more robust, and will help
>>> with OOMs, permanent hangs and native lib crashes.  But those changes are
>>> still in the works...
>>>
>>> On Tue, May 29, 2018 at 3:18 PM, Luís Filipe Nassif <lfcnas...@gmail.com> wrote:
>>>
>>>> Hi Ken,
>>>>
>>>> Threads will not help with OutOfMemoryErrors or crashes caused by native
>>>> libs. ForkParser can help, after the refactoring started by Tim to handle
>>>> some of its limitations. See TIKA-2653.
>>>>
>>>> 2018-05-29 16:11 GMT-03:00 Ken Krugler <kkrugler_li...@transpac.com>:
>>>>
>>>> > Thanks for the ref, Tim.
>>>> >
>>>> > I’m curious why SolrCell doesn’t fire up threads when parsing docs with
>>>> > Tika (or use the fork parser), to mitigate issues with hangs & crashes?
>>>> >
>>>> > — Ken
>>>> >
>>>> > > On May 29, 2018, at 11:54 AM, Tim Allison <talli...@apache.org> wrote:
>>>> > >
>>>> > > All,
>>>> > >
>>>> > >  Over the weekend, Shawn Heisey very kindly drafted a wikipage about the
>>>> > > challenges of using Solr's ExtractingRequestHandler and the guidance to
>>>> > > avoid it in production.
>>>> > >
>>>> > >   I completely agree with this point, and I think that Shawn did a very
>>>> > > nice job of capturing some of the challenges.  If you have any feedback or
>>>> > > would like to make edits, see:
>>>> > >
>>>> > > https://wiki.apache.org/solr/RecommendCustomIndexingWithTika
>>>> > >
>>>> > >   Cheers,
>>>> > >
>>>> > >                 Tim
>>>> >
>>>> > --------------------------------------------
>>>> > http://about.me/kkrugler
>>>> > +1 530-210-6378
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>
