On Thu, 24 Apr 2014, אברהם חיון wrote:
These two are aliases. You might need to check that you're using the canonical form.

*Can you please elaborate? What is the difference between the alias and
the canonical form?*

From the Tika mimetypes file:

  <mime-type type="application/xml">
    <acronym>XML</acronym>
    <_comment>Extensible Markup Language</_comment>
    <tika:link>http://en.wikipedia.org/wiki/Xml</tika:link>
    <tika:uti>public.xml</tika:uti>
    <alias type="text/xml"/>

So, the official / canonical mimetype is application/xml, while text/xml is an alias for it.

MediaTypeRegistry (http://tika.apache.org/1.5/api/org/apache/tika/mime/MediaTypeRegistry.html) can give you the aliases for a given canonical type. You can use its normalize call to turn an alias into the canonical form if needed.
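
To make the alias lookup concrete, here is a minimal sketch in plain Java of what normalizing an alias amounts to. It is not Tika's code: the class AliasTableSketch, its loadAliases and normalize helpers, and the cut-down SAMPLE entry (including the <mime-info> wrapper element) are all made up for illustration. With Tika on the classpath you would ask MediaTypeRegistry rather than build your own table.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: a hand-rolled alias table, not Tika's MediaTypeRegistry.
class AliasTableSketch {

    // A cut-down entry modelled on the fragment quoted above.
    // The <mime-info> wrapper is an assumption made so the sample parses.
    static final String SAMPLE =
          "<mime-info>"
        + "<mime-type type=\"application/xml\">"
        + "<acronym>XML</acronym>"
        + "<alias type=\"text/xml\"/>"
        + "</mime-type>"
        + "</mime-info>";

    // Maps each alias to the canonical type whose entry declares it.
    static Map<String, String> loadAliases(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> aliasToCanonical = new HashMap<>();
            NodeList types = doc.getElementsByTagName("mime-type");
            for (int i = 0; i < types.getLength(); i++) {
                Element type = (Element) types.item(i);
                String canonical = type.getAttribute("type");
                NodeList aliases = type.getElementsByTagName("alias");
                for (int j = 0; j < aliases.getLength(); j++) {
                    Element alias = (Element) aliases.item(j);
                    aliasToCanonical.put(alias.getAttribute("type"), canonical);
                }
            }
            return aliasToCanonical;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse mimetypes XML", e);
        }
    }

    // Returns the canonical type for an alias; any other type is returned unchanged.
    static String normalize(Map<String, String> aliases, String type) {
        return aliases.getOrDefault(type, type);
    }

    public static void main(String[] args) {
        Map<String, String> aliases = loadAliases(SAMPLE);
        System.out.println(normalize(aliases, "text/xml"));        // application/xml
        System.out.println(normalize(aliases, "application/xml")); // application/xml
    }
}
```

In this sketch a type that is not a known alias passes through unchanged, so calling normalize on a type that is already canonical is harmless.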


Tika doesn't know about this one. Is it a common alias?

*Not used a lot, but several places list it as an XML type, like here:*
*http://filext.com/file-extension/XML*
*or*
*http://help.dottoro.com/lapuadlp.php*

If they're commonly used aliases, please open a JIRA issue and suggest them.

*Where should I look to find the accepted media type and aliases for
every format?*

https://svn.apache.org/repos/asf/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml

Nick
