[ 
https://issues.apache.org/jira/browse/TIKA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753683#action_12753683
 ] 

Yonik Seeley commented on TIKA-193:
-----------------------------------

Hmmm, I'm testing Solr Cell from the current solr-trunk (which has Tika 0.4), 
and I'm seeing Content-Type added twice, for PDFs only.

<arr name="attr_Content-Type">
  <str>application/pdf</str>
  <str>application/pdf</str>
</arr>


> PDFParser adds mime-type twice
> ------------------------------
>
>                 Key: TIKA-193
>                 URL: https://issues.apache.org/jira/browse/TIKA-193
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.3
>            Reporter: Jonathan Koren
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: patch
>
>
> Using AutoDetectParser to call PDFParser causes the mime-type to be added
> twice.  It should be added exactly once.
>
> Proposed Fix:
> parser/pdf/PDFParser.java should be changed from:
>     metadata.add(Metadata.CONTENT_TYPE, "application/pdf");
> to:
>     metadata.set(Metadata.CONTENT_TYPE, "application/pdf");
> as per other Tika bundled parsers.
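
The difference between add and set described in the issue can be illustrated with a small stand-alone sketch. MetadataSketch below is a hypothetical stand-in for a multi-valued metadata container and is not Tika's actual Metadata class. It also assumes that the type detected by the auto-detecting wrapper is written to the metadata before the PDF parser runs, which is why a later add produces a second value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a multi-valued metadata container.
// This is NOT Tika's Metadata class; it only mimics add/set semantics.
class MetadataSketch {
    private final Map<String, List<String>> values = new HashMap<>();

    // add(): appends a value and keeps whatever is already stored under the name.
    public void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // set(): discards any existing values and stores exactly one.
    public void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    // Returns a copy of the stored values, or an empty list if none exist.
    public List<String> getValues(String name) {
        return new ArrayList<>(values.getOrDefault(name, new ArrayList<>()));
    }

    public static void main(String[] args) {
        // Assumed sequence: the wrapper records the detected type first,
        // then the parser calls add(), so the value ends up stored twice.
        MetadataSketch buggy = new MetadataSketch();
        buggy.set("Content-Type", "application/pdf");
        buggy.add("Content-Type", "application/pdf");
        System.out.println("add: " + buggy.getValues("Content-Type"));

        // With set() in the parser, the earlier value is replaced.
        MetadataSketch fixed = new MetadataSketch();
        fixed.set("Content-Type", "application/pdf");
        fixed.set("Content-Type", "application/pdf");
        System.out.println("set: " + fixed.getValues("Content-Type"));
    }
}
```

Under these assumptions the first case stores two identical values, matching the duplicated attr_Content-Type entries shown in the comment, while the second stores one.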

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
