Hi, I'm new to Tika and I have a question regarding content types.
I would like to check whether the content type provided by a content repository actually matches the content. To do that, I use Tika to detect the type from the content stream and compare it to the provided content type. That works well if the two content types are the same or one is a subtype of the other.

But some cases require a fuzzier comparison. If, for example, Tika detects "application/xhtml+xml" and the repository reports "text/html", that would be a close enough match for my purpose. Is there a simple way in Tika to do such fuzzy comparisons?

Thanks,
Florian
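
P.S. To make the idea concrete, here is a minimal sketch of the kind of fuzzy check I have in mind. It is plain Java and deliberately does not use any Tika API. The class name, the method names, and the hand-maintained table of "close enough" types are placeholders made up for illustration, and the only pair in the table is the one from my example above.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FuzzyMediaTypeMatch {

    // Hypothetical, hand-maintained groups of types that count as "close enough".
    // This table is an assumption for illustration, not something Tika provides.
    private static final List<Set<String>> CLOSE_GROUPS = List.of(
            Set.of("text/html", "application/xhtml+xml"));

    // Drop parameters such as "; charset=UTF-8" and lower-case the type.
    static String baseType(String mediaType) {
        int semicolon = mediaType.indexOf(';');
        String base = semicolon >= 0 ? mediaType.substring(0, semicolon) : mediaType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // True if both types are equal after normalization,
    // or if both appear in the same "close enough" group.
    static boolean closeEnough(String detected, String reported) {
        String a = baseType(detected);
        String b = baseType(reported);
        if (a.equals(b)) {
            return true;
        }
        for (Set<String> group : CLOSE_GROUPS) {
            if (group.contains(a) && group.contains(b)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Detected type first, repository-reported type second.
        System.out.println(closeEnough("application/xhtml+xml", "text/html; charset=UTF-8"));
        System.out.println(closeEnough("application/pdf", "text/html"));
    }
}
```

The subtype case that already works for me is left out of this sketch. What I am hoping for is something built into Tika that would replace the hard-coded table.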
