[ https://issues.apache.org/jira/browse/TIKA-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Sidorenko updated TIKA-516:
----------------------------------

    Attachment: Test.java

It seems the code works fine with the latest svn trunk (1004971). I've 
attached a slightly modified version of the test case.
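
The attachment itself is not reproduced in this message. As a rough sketch
of how one might check that repeated detection gives the same answer, the
helper below tallies the content types returned over several runs. The class
name DetectionTally and its methods are hypothetical (they are not part of
Tika or of Test.java), and the detector in main is a stub standing in for a
real Tika call.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Tika or of the attached Test.java. */
class DetectionTally {

    /** Calls the detector `runs` times and counts how often each result is returned. */
    static Map<String, Integer> tally(Callable<String> detector, int runs) throws Exception {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < runs; i++) {
            // A null result is recorded under the key "null".
            String type = String.valueOf(detector.call());
            counts.merge(type, 1, Integer::sum);
        }
        return counts;
    }

    /** True when every run produced the same type (or there were no runs at all). */
    static boolean isConsistent(Map<String, Integer> counts) {
        return counts.size() <= 1;
    }

    public static void main(String[] args) throws Exception {
        // Stub detector standing in for a real Tika call; it alternates on purpose.
        int[] calls = {0};
        Map<String, Integer> counts = tally(
                () -> (calls[0]++ % 2 == 0) ? "application/vnd.ms-excel" : "application/msword",
                10);
        System.out.println(counts + " consistent=" + isConsistent(counts));
    }
}
```

With Tika on the classpath, the stub could be swapped for a lambda that
opens excel5.xls, runs AutoDetectParser.parse() on it as in the quoted code,
and returns metadata.get(Metadata.CONTENT_TYPE). A consistent build should
then report a single content type for all runs.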

> Excel 5 files are inconsistently detected as either "application/msword" or 
> "application/vnd.ms-excel"
> ------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-516
>                 URL: https://issues.apache.org/jira/browse/TIKA-516
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.7
>            Reporter: Victor Kazakov
>            Priority: Minor
>         Attachments: excel5.xls, Test.java
>
>
> Using the AutoDetectParser on an Excel 5 file inconsistently detects it as 
> either "application/msword" or "application/vnd.ms-excel".
> See the following code:
>       public static void main(String[] args) throws Exception {
>               File file = new File("excel5.xls");
>               for (int i = 0; i < 10; i++) {
>                       // Open a fresh stream per run and close it before the next one
>                       FileInputStream stream = new FileInputStream(file);
>                       try {
>                               AutoDetectParser parser = new AutoDetectParser();
>                               Metadata metadata = new Metadata();
>                               metadata.set(Metadata.RESOURCE_NAME_KEY, file.getName());
>                               parser.parse(stream, new DefaultHandler(), metadata);
>                               System.out.println(metadata.get(Metadata.CONTENT_TYPE));
>                       } finally {
>                               stream.close();
>                       }
>               }
>       }
> An example output is:
> application/vnd.ms-excel
> application/msword
> application/msword
> application/vnd.ms-excel
> application/vnd.ms-excel
> application/vnd.ms-excel
> application/vnd.ms-excel
> application/msword
> application/vnd.ms-excel
> application/msword
> The Excel 5 file I used is attached to this bug.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
