Thanks Nick.

I had set an AutoDetectParser on the ParseContext, and that was causing Tika
to extract text from the embedded objects recursively. Once I removed it, I
got the text of just the parent file.
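
For anyone who finds this thread later, here is a rough sketch of the
difference. It assumes tika-core and tika-parsers are on the classpath; the
class name ParentOnlyExtractor, the method extractText and the recurse flag
are made up for illustration and are not part of Tika.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

// Hypothetical helper class, named for this example only.
final class ParentOnlyExtractor {

    // Returns the extracted text of the file. When recurse is false, no
    // Parser is registered on the ParseContext, so embedded documents
    // (attachments) are not parsed.
    static String extractText(Path file, boolean recurse)
            throws IOException, SAXException, TikaException {
        Parser parser = new AutoDetectParser();
        ParseContext context = new ParseContext();

        if (recurse) {
            // This is the line I had in my code. It tells Tika which
            // parser to use for embedded documents, which turns on
            // recursion into the children.
            context.set(Parser.class, parser);
        }

        // A write limit of -1 disables the default character limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();

        try (InputStream stream = Files.newInputStream(file)) {
            parser.parse(stream, handler, metadata, context);
        }
        return handler.toString();
    }
}
```

Calling extractText(path, false) corresponds to what I ended up with, and
extractText(path, true) to what I had before.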

Regards,
Shiv


On Fri, Nov 29, 2013 at 4:16 PM, Nick Burch <[email protected]> wrote:

> On Fri, 29 Nov 2013, Shiv Kenche wrote:
>
>> I have a parent doc file with many attachments (children) embedded in it.
>> I need to extract the text content of the parent doc file, but I do not
>> need the text of its children.
>>
>
> Tika does not recurse into embedded documents by default. To enable
> recursion, you need to set a Parser object onto the ParseContext, to be
> used to handle the child objects. Without one, Tika will process the outer
> (parent) document only.
>
> Nick
>
