>I am just a bit worried that the one abstraction to rule them all may preclude
>me from easily handling more esoteric parts of some document formats.

If you find that your needs are too use-case-specific to fold into Tika, you can easily create your own parser, or a modified version of one of ours, within the Tika framework. [1]

As someone who learned the hard way, doing my own custom content extraction work before drinking the Tika Kool-Aid, I highly encourage going with the one abstraction until it fails you. At that point, either modify Tika through patches/issues or create your own custom parsers and use them within the Tika framework.

>I presume that the best way to request enhancements is to create a JIRA entry
>so it can be tracked?

Yes, please.

Cheers,

Tim

[1] https://tika.apache.org/1.13/parser_guide.html