[ https://issues.apache.org/jira/browse/TIKA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116556#comment-14116556 ]

Hudson commented on TIKA-674:
-----------------------------

FAILURE: Integrated in tika-trunk-jdk1.7 #185 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.7/185/])
Fix for TIKA-674: CompositeParser should indicate which parser was actually 
selected for parsing contributed by Andrzej Bialecki. (mattmann: 
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1621531)
* /tika/trunk/CHANGES.txt
* /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java


> CompositeParser should indicate which parser was actually selected for parsing
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-674
>                 URL: https://issues.apache.org/jira/browse/TIKA-674
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 0.10
>            Reporter: Andrzej Bialecki 
>            Assignee: Chris A. Mattmann
>             Fix For: 1.6
>
>
> If multiple parsers exist that support the same mime type, and 
> AutoDetectParser (or another CompositeParser) is used, then the parse output 
> does not indicate which of the alternative parsers was actually used. I think 
> that the name of the parser (FQCN?) should be added to the metadata.
> Something like this trivial patch:
> {code}
> Index: tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java
> ===================================================================
> --- tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java       (revision 1135167)
> +++ tika-core/src/main/java/org/apache/tika/parser/CompositeParser.java       (working copy)
> @@ -238,6 +238,7 @@
>          try {
>              TikaInputStream taggedStream = TikaInputStream.get(stream, tmp);
>              TaggedContentHandler taggedHandler = new TaggedContentHandler(handler);
> +            metadata.add("X-Parsed-By", parser.getClass().getName());
>              try {
>                  parser.parse(taggedStream, taggedHandler, metadata, context);
>              } catch (RuntimeException e) {
> {code}
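
To illustrate the idea in the patch above outside of Tika, here is a minimal, self-contained sketch. All types in it (SimpleParser, SimpleMetadata, SimpleCompositeParser and the two toy parsers) are hypothetical stand-ins invented for this example, not Tika classes. The only parts taken from the issue are the "X-Parsed-By" key and the use of parser.getClass().getName() as its value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Small demo: two parsers support the same type; metadata shows which one ran. */
class ParsedByDemo {
    public static void main(String[] args) {
        SimpleCompositeParser composite = new SimpleCompositeParser(
                Arrays.asList(new UpperCaseParser(), new ReverseParser()));
        SimpleMetadata metadata = new SimpleMetadata();
        String output = composite.parse("text/plain", "hello", metadata);
        System.out.println(output);
        System.out.println(metadata.getValues(SimpleCompositeParser.PARSED_BY));
    }
}

/** Hypothetical stand-in for a parser (assumption, not Tika's Parser interface). */
interface SimpleParser {
    boolean supports(String mimeType);
    String parse(String input);
}

/** Hypothetical multi-valued metadata with an add(name, value) method. */
class SimpleMetadata {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    List<String> getValues(String name) {
        return values.getOrDefault(name, new ArrayList<>());
    }
}

/** Toy parser: upper-cases plain text. */
class UpperCaseParser implements SimpleParser {
    public boolean supports(String mimeType) { return "text/plain".equals(mimeType); }
    public String parse(String input) { return input.toUpperCase(Locale.ROOT); }
}

/** Toy parser: reverses plain text. Supports the same type as UpperCaseParser. */
class ReverseParser implements SimpleParser {
    public boolean supports(String mimeType) { return "text/plain".equals(mimeType); }
    public String parse(String input) { return new StringBuilder(input).reverse().toString(); }
}

/** Delegates to the first supporting parser and records its class name. */
class SimpleCompositeParser {
    static final String PARSED_BY = "X-Parsed-By";
    private final List<SimpleParser> parsers;

    SimpleCompositeParser(List<SimpleParser> parsers) {
        this.parsers = parsers;
    }

    String parse(String mimeType, String input, SimpleMetadata metadata) {
        for (SimpleParser parser : parsers) {
            if (parser.supports(mimeType)) {
                // Same idea as the patch: record the delegate before invoking it.
                metadata.add(PARSED_BY, parser.getClass().getName());
                return parser.parse(input);
            }
        }
        throw new IllegalArgumentException("No parser for " + mimeType);
    }
}
```

Because the value is recorded before the delegate runs, it is still present in the metadata if the delegate throws, which may help when diagnosing which parser failed. In this sketch the first registered parser wins; how the real CompositeParser chooses among alternatives is not covered here.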



--
This message was sent by Atlassian JIRA
(v6.2#6252)
