[ https://issues.apache.org/jira/browse/TIKA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902585#comment-14902585 ]
Nick Burch commented on TIKA-1740:
----------------------------------
You might be better off writing your own recursion handler. Take a look at how
things like {{RecursiveParserWrapper}} and the Tika App embedded resources
extractor work, and then do something specialised for your use case.
{{RecursiveParserWrapper}} is designed to make things easy for many common use
cases, but isn't expected to work for everyone!
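
As a rough illustration of what such a specialised handler could look like, here is a minimal sketch that keeps one ContentHandler object per document instead of its toString() value. It uses only the JDK's org.xml.sax types and is not Tika's API: {{RecursionSketch}}, {{DocumentSource}}, {{HandlerCollector}}, {{WordCountHandler}} and {{text}} are hypothetical names made up for this example. In Tika itself the per-document hook would be something along the lines of an EmbeddedDocumentExtractor set on the ParseContext, so check the signatures for the Tika version you are using.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RecursionSketch {

    /** Hypothetical stand-in for whatever parses one (sub)document into SAX events. */
    interface DocumentSource {
        void emit(ContentHandler handler) throws SAXException;
    }

    /** Keeps the handler object for every document rather than its toString() value. */
    static final class HandlerCollector<H extends ContentHandler> {
        private final Supplier<H> factory;
        private final List<H> handlers = new ArrayList<>();

        HandlerCollector(Supplier<H> factory) {
            this.factory = factory;
        }

        /** Intended to be called once for the container and once per embedded document. */
        void handle(DocumentSource doc) throws SAXException {
            H handler = factory.get(); // fresh handler for each document
            doc.emit(handler);
            handlers.add(handler);     // keep the object itself
        }

        List<H> handlers() {
            return handlers;
        }
    }

    /** Example "transformer" handler: builds a custom result (here, a word count). */
    static final class WordCountHandler extends DefaultHandler {
        int words;

        @Override
        public void characters(char[] ch, int start, int length) {
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                words += text.split("\\s+").length;
            }
        }
    }

    /** Fake document that emits its text as a single characters() event. */
    static DocumentSource text(String s) {
        return handler -> {
            handler.startDocument();
            handler.characters(s.toCharArray(), 0, s.length());
            handler.endDocument();
        };
    }

    public static void main(String[] args) throws SAXException {
        HandlerCollector<WordCountHandler> collector =
                new HandlerCollector<>(WordCountHandler::new);
        collector.handle(text("outer container text"));
        collector.handle(text("embedded attachment"));
        for (WordCountHandler h : collector.handlers()) {
            System.out.println(h.words);
        }
    }
}
```

The caller ends up with a list of typed handler objects, one per document, and can read whatever custom state each handler built up. Nothing is written into metadata, so no flag to suppress a content field is needed in this arrangement.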
> RecursiveParserWrapper returning ContentHandler-s
> -------------------------------------------------
>
> Key: TIKA-1740
> URL: https://issues.apache.org/jira/browse/TIKA-1740
> Project: Tika
> Issue Type: Wish
> Components: core, parser
> Reporter: Andrea
>
> I would like a mechanism for building a custom object from a parsing
> result. This is easy to do with a custom ContentHandler "transformer", but
> how can I achieve the same thing with a RecursiveParserWrapper? There I can
> only set a ContentHandlerFactory, and the parser just calls the handler's
> toString method and stores the result as metadata. Could there be a way to
> get the entire ContentHandler object for each subfile instead of the result
> of the toString method? A flag to disable production of the TIKA_CONTENT
> metadata would also be needed.