On Thu, 24 Apr 2014, אברהם חיון wrote:
Here is the simple code (Thank you Nick):
List<MediaType> mts = new ArrayList<MediaType>();
// All of these should return XML type
mts.add(MediaType.parse("text/xml"));
mts.add(MediaType.parse("application/xml"));

These two are aliases for the same type. You might need to check that you're using the canonical form.

mts.add(MediaType.parse("application/x-xml"));

Tika doesn't know about this one. Is it a common alias?

mts.add(MediaType.parse("application/atom+xml"));
mts.add(MediaType.parse("application/rss+xml"));

// All of these should return Compress or ZIP type
mts.add(MediaType.parse("application/gzip"));
mts.add(MediaType.parse("application/x-gzip"));
mts.add(MediaType.parse("application/x-compress"));

None of these is zip! Zip is application/zip. These are all compression formats that are different from zip.

mts.add(MediaType.parse("application/x-gunzip"));
mts.add(MediaType.parse("application/gzipped"));
mts.add(MediaType.parse("application/gzip-compressed"));
mts.add(MediaType.parse("gzip/document"));

Tika doesn't know about any of those. If they're common, you might want to suggest them as new aliases and/or new mime types.
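
In the meantime, one possible workaround is to map the unknown names to a canonical name yourself before handing the string to MediaType.parse. The following is only a rough sketch using plain Java collections, not a Tika API: the class name AliasNormalizer, the canonicalize method and the alias table are all hypothetical, and picking application/gzip and application/xml as the canonical names is an assumption you should check against the types your Tika version actually uses.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class AliasNormalizer {
    // Hypothetical alias table. These mappings are assumptions made for
    // illustration; they are not shipped with Tika.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("application/x-xml", "application/xml");
        ALIASES.put("application/x-gunzip", "application/gzip");
        ALIASES.put("application/gzipped", "application/gzip");
        ALIASES.put("application/gzip-compressed", "application/gzip");
        ALIASES.put("gzip/document", "application/gzip");
    }

    // Returns the assumed canonical name for a known alias, or the
    // trimmed, lower-cased input when the name is not in the table.
    static String canonicalize(String type) {
        String key = type.trim().toLowerCase(Locale.ROOT);
        return ALIASES.getOrDefault(key, key);
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("application/gzipped"));
        System.out.println(canonicalize("Application/X-XML"));
        System.out.println(canonicalize("application/zip"));
    }
}
```

The string returned by canonicalize can then be passed to MediaType.parse as in the code above. Names that are not in the table, such as application/zip, pass through unchanged, so zip is never folded into gzip.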

Nick
