[
https://issues.apache.org/jira/browse/TIKA-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372099#comment-14372099
]
Tyler Palsulich commented on TIKA-1351:
---------------------------------------
I think this would be a nice feature, but updating every Parser is a large
task. I believe there is a DummyContentHandler that just discards everything
(although extraction is still done)? I forget the exact name.
> Parser implementations should accept null content handlers
> ----------------------------------------------------------
>
> Key: TIKA-1351
> URL: https://issues.apache.org/jira/browse/TIKA-1351
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Reporter: Sergey Beryozkin
> Priority: Minor
>
> Applications that let users search documents based only on their
> metadata do not need the content parsed.
> The only workaround I've found so far is to pass a no-op content handler
> that ignores the content events, but this does not stop a parser such as
> PDFParser from parsing the content.
> Proposal: update the parser API docs to let implementers know that
> ContentHandler can be null, and update the shipped implementations to parse
> only the metadata when ContentHandler is null.
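
To make the difference between the workaround and the proposal concrete, here is a minimal sketch. It does not use Tika's actual Parser API: SketchParser, its parse signature, and the contentExtractions counter are hypothetical stand-ins invented for illustration. The only real classes are the JDK's SAX types; org.xml.sax.helpers.DefaultHandler is a ContentHandler whose callbacks all do nothing, so it can serve as the no-op handler.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NullHandlerSketch {

    /** Hypothetical parser standing in for something like PDFParser (not Tika's API). */
    static class SketchParser {
        // Counts how many times the (pretend) expensive content extraction ran.
        int contentExtractions = 0;

        void parse(String document, ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            // Cheap step: metadata is always populated.
            metadata.put("Content-Length", String.valueOf(document.length()));

            // Proposed behaviour: a null handler means "metadata only".
            if (handler == null) {
                return;
            }

            // Expensive step: extract the text and emit SAX events.
            contentExtractions++;
            char[] text = document.toCharArray();
            handler.startDocument();
            handler.characters(text, 0, text.length);
            handler.endDocument();
        }
    }

    public static void main(String[] args) throws SAXException {
        SketchParser parser = new SketchParser();
        Map<String, String> metadata = new LinkedHashMap<>();

        // Workaround: the no-op handler discards every event,
        // but the parser still performs the extraction.
        parser.parse("hello", new DefaultHandler(), metadata);
        System.out.println("extractions after no-op handler: " + parser.contentExtractions);

        // Proposal: a null handler skips the extraction entirely.
        parser.parse("hello", null, metadata);
        System.out.println("extractions after null handler: " + parser.contentExtractions);
        System.out.println("metadata: " + metadata);
    }
}
```

In this sketch the counter is 1 after the first call and stays at 1 after the second, while the metadata is filled in both times. A real implementation would need the same null check in every shipped parser, which is the "large task" mentioned in the comment.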