[ 
https://issues.apache.org/jira/browse/TIKA-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284042#comment-15284042
 ] 

Hudson commented on TIKA-1928:
------------------------------

FAILURE: Integrated in tika-2.x #94 (See 
[https://builds.apache.org/job/tika-2.x/94/])
TIKA-1928 More tests for files with # in them (nick: rev 
398e73dee3e6ece5a0af376abe954b75bce0cfda)
* tika-core/src/test/java/org/apache/tika/mime/MimeDetectionTest.java
TIKA-1928 Fix detection for filenames containing a #, avoid (nick: rev 
2a69db7bbad17f235e66274186f5f3814f32a56e)
* tika-core/src/test/java/org/apache/tika/detect/NameDetectorTest.java
* tika-core/src/main/java/org/apache/tika/detect/NameDetector.java


> Filename detection misses when a # is in a filename
> ---------------------------------------------------
>
>                 Key: TIKA-1928
>                 URL: https://issues.apache.org/jira/browse/TIKA-1928
>             Project: Tika
>          Issue Type: Bug
>          Components: detector
>    Affects Versions: 1.12
>         Environment: java 8
>            Reporter: Jean Coudon
>            Priority: Minor
>             Fix For: 1.14
>
>
> If a filename contains a pound character (#), its type is detected as 
> application/octet-stream instead of the proper type that is detected when 
> the filename does not contain the pound character.
> {code:java}
> import org.apache.tika.Tika;
> import org.apache.tika.metadata.Metadata;
>
> Metadata metadata = new Metadata();
> Tika tika = new Tika();
> metadata.add(Metadata.RESOURCE_NAME_KEY, "test#.pdf");
> // Tika uses NameDetector if the first parameter is null
> System.out.println(tika.detect(null, metadata));
> // prints application/octet-stream instead of application/pdf
> {code}
> Tested for application/pdf and application/xml.
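
One plausible way a '#' could cause this kind of miss is that a resource name is treated like a URL and everything from the '#' onward is dropped as a fragment before the extension is examined. The following is a minimal, self-contained sketch of that idea and of one possible heuristic for avoiding it. It is not Tika's NameDetector code and not necessarily what the commits above changed; the class and method names (HashNameSketch, naiveExtension, carefulExtension) are hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration only; this is not Tika's NameDetector source.
final class HashNameSketch {

    // Returns the lower-cased text after the last '.', or "" if there is none.
    static String extensionOf(String name) {
        int dot = name.lastIndexOf('.');
        if (dot == -1 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Naive approach: always treat '#' as the start of a URL fragment.
    static String naiveExtension(String name) {
        int hash = name.indexOf('#');
        if (hash != -1) {
            name = name.substring(0, hash);
        }
        return extensionOf(name);
    }

    // Assumed heuristic: strip the fragment only when a '.' appears before
    // the '#', i.e. when the part before the '#' already carries an extension.
    static String carefulExtension(String name) {
        int hash = name.indexOf('#');
        if (hash != -1 && name.lastIndexOf('.', hash) != -1) {
            name = name.substring(0, hash);
        }
        return extensionOf(name);
    }

    public static void main(String[] args) {
        System.out.println("[" + naiveExtension("test#.pdf") + "]");        // prints []
        System.out.println("[" + carefulExtension("test#.pdf") + "]");      // prints [pdf]
        System.out.println("[" + carefulExtension("page.html#top") + "]");  // prints [html]
    }
}
```

With the naive variant, "test#.pdf" is reduced to "test", which has no extension, so a name-based lookup would have nothing to match and would fall back to a generic type. The heuristic variant is ambiguous for names such as "a.b#c.pdf", so it should be read as an illustration of the trade-off rather than a complete rule.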



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
