[ 
https://issues.apache.org/jira/browse/TIKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giorgiana Ciobanu updated TIKA-3811:
------------------------------------
    Description: 
I need to detect the MIME type of a file, but for security reasons I want to 
exclude detection by file name extension. 

I added a tika-config_test.xml (see attached) to my unit test, but Tika still 
detects the file type by file name extension.
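
For reference, excluding a detector from Tika's default set is usually written 
along the lines of the sketch below. This is an assumed, minimal shape based on 
Tika's documented detector configuration, not the contents of the attached 
tika-config_test.xml, which may differ.

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch; the attached tika-config_test.xml may differ. -->
<properties>
  <detectors>
    <!-- Ask DefaultDetector to skip NameDetector when loading detectors. -->
    <detector class="org.apache.tika.detect.DefaultDetector">
      <detector-exclude class="org.apache.tika.detect.NameDetector"/>
    </detector>
  </detectors>
</properties>
{code}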

I attached a test file that is wrongly detected as text/vtt because of its 
file extension; it should be detected as text/plain.

 

The code of my unit test:
{code:java}
File file = new File(getClass().getClassLoader()
        .getResource("invalid_format.vtt").getFile());
TikaConfig tikaConfig = new TikaConfig(getClass().getClassLoader()
        .getResourceAsStream("tika-config_test.xml"));

// Returns text/vtt, but should be text/plain
String mimeType = new Tika(tikaConfig).detect(file);
{code}
 

  was:
I need to detect the MIME type of a file, but for security reasons I want to 
exclude detection by file name extension. 

I added a tika-config_test.xml (see attached) to my unit test, but Tika still 
detects the file type by file name extension.

I attached a test file that is wrongly detected as text/vtt because of its 
file extension; it should be detected as text/plain.

 

The code of my unit test:

 

 
{code:java}
File file = new File(getClass().getClassLoader()
        .getResource("invalid_format.vtt").getFile());
TikaConfig tikaConfig = new TikaConfig(getClass().getClassLoader()
        .getResourceAsStream("tika-config_test.xml"));

// Returns text/vtt, but should be text/plain
String mimeType = new Tika(tikaConfig).detect(file);
{code}
 


> Exclude NameDetector not working for Tika.detect(file)
> ------------------------------------------------------
>
>                 Key: TIKA-3811
>                 URL: https://issues.apache.org/jira/browse/TIKA-3811
>             Project: Tika
>          Issue Type: Bug
>          Components: config, core, detector
>    Affects Versions: 2.3.0
>            Reporter: Giorgiana Ciobanu
>            Priority: Major
>         Attachments: invalid_format.vtt, tika-config_test.xml
>
>
> I need to detect the MIME type of a file, but for security reasons I want to 
> exclude detection by file name extension. 
> I added a tika-config_test.xml (see attached) to my unit test, but Tika still 
> detects the file type by file name extension.
> I attached a test file that is wrongly detected as text/vtt because of its 
> file extension; it should be detected as text/plain.
>  
> The code of my unit test:
> {code:java}
> File file = new File(getClass().getClassLoader()
>         .getResource("invalid_format.vtt").getFile());
> TikaConfig tikaConfig = new TikaConfig(getClass().getClassLoader()
>         .getResourceAsStream("tika-config_test.xml"));
>
> // Returns text/vtt, but should be text/plain
> String mimeType = new Tika(tikaConfig).detect(file);
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
