[ https://issues.apache.org/jira/browse/TIKA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116562#comment-14116562 ]
Hudson commented on TIKA-674:
-----------------------------
SUCCESS: Integrated in tika-trunk-jdk1.6 #164 (See
[https://builds.apache.org/job/tika-trunk-jdk1.6/164/])
Fix for TIKA-674: CompositeParser should indicate which parser was actually
selected for parsing, contributed by Andrzej Bialecki. (mattmann:
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1621531)
* /tika/trunk/CHANGES.txt
* /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java
> CompositeParser should indicate which parser was actually selected for parsing
> ------------------------------------------------------------------------------
>
> Key: TIKA-674
> URL: https://issues.apache.org/jira/browse/TIKA-674
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 0.10
> Reporter: Andrzej Bialecki
> Assignee: Chris A. Mattmann
> Fix For: 1.6
>
>
> If multiple parsers support the same MIME type and AutoDetectParser (or
> another CompositeParser) is used, the parse output does not indicate which
> of the alternative parsers was actually used. I think the name of the parser
> (its fully qualified class name?) should be added to the metadata,
> something like this trivial patch:
> {code}
> Index: tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java
> ===================================================================
> --- tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java (revision 1135167)
> +++ tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java (working copy)
> @@ -238,6 +238,7 @@
>          try {
>              TikaInputStream taggedStream = TikaInputStream.get(stream, tmp);
>              TaggedContentHandler taggedHandler = new TaggedContentHandler(handler);
> +            metadata.add("X-Parsed-By", parser.getClass().getName());
>              try {
>                  parser.parse(taggedStream, taggedHandler, metadata, context);
>              } catch (RuntimeException e) {
> {code}
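
For illustration only, here is a minimal, self-contained sketch of the idea behind the patch: a composite picks one of several parsers that claim the same type and records the chosen parser's class name under "X-Parsed-By" before delegating. Everything in it is hypothetical (the ParsedByDemo class, the SimpleParser interface, the two toy parsers, and the plain Map used as metadata); none of it is Tika's API, and the actual change in CompositeParser.java may differ in detail.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParsedByDemo {

    /** Hypothetical stand-in for a parser; this is not Tika's Parser interface. */
    interface SimpleParser {
        boolean supports(String mimeType);
        void parse(String input, Map<String, String> metadata);
    }

    /** First of two hypothetical parsers that both accept text/plain. */
    static class PlainTextParser implements SimpleParser {
        public boolean supports(String mimeType) {
            return mimeType.equals("text/plain");
        }
        public void parse(String input, Map<String, String> metadata) {
            metadata.put("length", String.valueOf(input.length()));
        }
    }

    /** Second hypothetical parser; accepts any text/* type. */
    static class GenericTextParser implements SimpleParser {
        public boolean supports(String mimeType) {
            return mimeType.startsWith("text/");
        }
        public void parse(String input, Map<String, String> metadata) {
            metadata.put("length", String.valueOf(input.length()));
        }
    }

    /** Uses the first parser that supports the type and records its class name. */
    static Map<String, String> parseWith(List<SimpleParser> parsers,
                                         String mimeType, String input) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (SimpleParser parser : parsers) {
            if (parser.supports(mimeType)) {
                // Same idea as the patch: tag the metadata, then delegate.
                metadata.put("X-Parsed-By", parser.getClass().getName());
                parser.parse(input, metadata);
                return metadata;
            }
        }
        throw new IllegalArgumentException("No parser for " + mimeType);
    }

    public static void main(String[] args) {
        List<SimpleParser> parsers =
                Arrays.asList(new PlainTextParser(), new GenericTextParser());
        // Both parsers accept text/plain; the metadata shows which one ran.
        System.out.println(parseWith(parsers, "text/plain", "hello").get("X-Parsed-By"));
        System.out.println(parseWith(parsers, "text/html", "<p>hi</p>").get("X-Parsed-By"));
    }
}
```

With the metadata tagged this way, a caller can tell which of the overlapping parsers handled a document by reading a single key, without needing to know how the composite makes its selection.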