[ https://issues.apache.org/jira/browse/TIKA-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781311#action_12781311 ]
Yuan-Fang Li commented on TIKA-309:
-----------------------------------

Hi Chris, Jukka,

Yes, the Tika tests are passing for me. However, my test for one of the ontologies ("http://www.w3.org/2002/07/owl#") is still failing, and here is why.

In the test tika-core/src/test/java/org/apache/tika/mime/MimeDetectionTest.java, the method testUrl(String expected, String url, String file) actually tests the content of the file named by "file", with the url serving only as a clue for the detection. My test, however, opens an input stream on the actual URL and uses that to detect the mime type.

For the above URL, Tika is testing against the file named "test-difficult-rdf2.xml". The only difference I can see between this file and the actual content of the URL is the one line at the top:

    <?xml version='1.0' encoding='ISO-8859-1'?>

This line is present in the Tika test file but not in the content served at the URL.

So, if you remove or comment out that line in "test-difficult-rdf2.xml" and run the test with the following Maven command, it will fail:

    mvn -Dtest=MimeDetectionTest test

Or, you could use the following test case to test against the real URL:

    @Test
    public void testRDFStreamMimeType() throws IOException {
        URL url = new URL("http://www.w3.org/2002/07/owl#");
        // Detect from the live stream only; no file name or URL hint is put in the metadata.
        final InputStream stream = new BufferedInputStream(url.openStream());
        try {
            MimeTypes mimeTypes = TikaConfig.getDefaultConfig().getMimeRepository();
            Metadata metadata = new Metadata();
            String mime = mimeTypes.detect(stream, metadata).toString();
            assertEquals("application/rdf+xml", mime);
        } finally {
            stream.close();
        }
    }

Cheers
Yuan-Fang

> Mime type application/rdf+xml not correctly detected
> ----------------------------------------------------
>
>                 Key: TIKA-309
>                 URL: https://issues.apache.org/jira/browse/TIKA-309
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>    Affects Versions: 0.5
>            Reporter: Yuan-Fang Li
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 0.5
>
>
> Mime type detector using AutoDetectParser and Metadata returns
> "application/xml" for the URL http://www.w3.org/2002/07/owl#, where it should
> be "application/rdf+xml". The correct mime type is also suggested here:
> http://www.w3.org/TR/owl-ref/#MIMEType.
> P.S., Tika was downloaded from svn and built with Maven last week.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.