On Sun, 8 Feb 2015, [email protected] wrote:
> I'm trying to add a custom mime type. I've seen solutions that involve
> writing a custom-mimetypes.xml file, but I'd really prefer to add my
> custom type programmatically.

Currently, I think we only support loading magic from a combination of (one) main mimetypes file and (many) custom mimetypes files. That loading does some sanity checking in the process.

Originally, Tika only supported the single core magic file. It took a little bit of rejigging to handle the custom ones too. More rejigging is possible!

> Mostly this is because the magic bytes for the file format are already defined elsewhere in code (which I'd prefer not to duplicate), and I also want to leave the possibility open for user-defined mime types at runtime.

Would you anticipate adding these additional mimetypes once, when getting your TikaConfig object, or do you foresee wanting to add them on the fly?

> Instead of getting the Detector through the TikaConfig, I've tried
> instantiating a new MagicDetector with the desired byte pattern and
> MediaType, grabbing a DefaultDetector, and adding my new MagicDetector
> to the DefaultDetector's getDetectors() List, and then performing
> detection.

I'm not sure that'll work, as I'm not sure the detectors list can be modified like that. What happens if you get a default TikaConfig object, grab the normal detectors from that, build your custom one, then create a CompositeDetector from that list plus the new one, and use that CompositeDetector from then on?
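
To make the shape of that suggestion concrete, here is a rough stand-alone model in plain Java. It deliberately does not use Tika's classes: Detector, PrefixDetector and ChainDetector are hypothetical stand-ins for Tika's Detector, MagicDetector and CompositeDetector, the application/x-example type and its magic bytes are made up, and the "first match wins" rule is a simplification rather than a description of how CompositeDetector resolves competing answers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DetectorChainSketch {

    /** Hypothetical stand-in for a detector: returns a type name, or null for "no opinion". */
    interface Detector {
        String detect(InputStream input) throws IOException;
    }

    /** Hypothetical stand-in for a magic detector: matches a fixed byte pattern at offset 0. */
    static final class PrefixDetector implements Detector {
        private final String type;
        private final byte[] pattern;

        PrefixDetector(String type, byte[] pattern) {
            this.type = type;
            this.pattern = pattern.clone();
        }

        @Override
        public String detect(InputStream input) throws IOException {
            // Peek at the head of the stream, then rewind so later detectors see the same bytes.
            input.mark(pattern.length);
            try {
                byte[] head = new byte[pattern.length];
                int read = 0;
                while (read < head.length) {
                    int n = input.read(head, read, head.length - read);
                    if (n < 0) {
                        break; // stream is shorter than the pattern
                    }
                    read += n;
                }
                return (read == pattern.length && Arrays.equals(head, pattern)) ? type : null;
            } finally {
                input.reset();
            }
        }
    }

    /** Hypothetical stand-in for a composite detector: asks each detector in turn. */
    static final class ChainDetector implements Detector {
        private final List<Detector> detectors;

        ChainDetector(List<Detector> detectors) {
            this.detectors = new ArrayList<>(detectors); // private copy; the caller's list is untouched
        }

        @Override
        public String detect(InputStream input) throws IOException {
            for (Detector detector : detectors) {
                String type = detector.detect(input);
                if (type != null) {
                    return type; // simplification: first match wins
                }
            }
            return "application/octet-stream"; // nothing matched
        }
    }

    /** Builds a new composite from the "stock" detectors plus one custom detector. */
    static Detector buildDetector() {
        List<Detector> detectors = new ArrayList<>();
        // Stands in for whatever the default configuration already provides.
        detectors.add(new PrefixDetector("application/pdf",
                "%PDF-".getBytes(StandardCharsets.US_ASCII)));
        // The custom type, built from magic bytes that already live elsewhere in code.
        detectors.add(new PrefixDetector("application/x-example",
                new byte[] {0x7F, 'X', 'M', 'T'}));
        return new ChainDetector(detectors);
    }

    /** Convenience wrapper: detects the type of an in-memory byte array. */
    static String detect(byte[] data) {
        try {
            // ByteArrayInputStream supports mark/reset, which PrefixDetector relies on.
            return buildDetector().detect(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(detect(new byte[] {0x7F, 'X', 'M', 'T', 0x00}));
        System.out.println(detect("%PDF-1.4".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The point of the sketch is that nothing existing gets mutated: the stock detectors are left alone and a new composite is built around them. Because the composite is just an object you construct, you could rebuild it whenever a user defines another type at runtime.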

Nick
