Hi, I use Tika through the Solr ExtractingRequestHandler and I face a very common use case, namely postprocessing Tika fields in order to normalize some field values or override them with explicitly passed "literal" values.
Apart from some vague statements about "ContentHandler", I failed to find any good examples of this, even though it appears to be quite an important feature. I would also like to work at the API "field" level rather than with XPath on the raw Tika output. Does anyone know of good resources or samples showing the proper way to "postprocess" fields in the context of a Solr integration?

PS: I could have posted this on the Solr mailing list, but I know that while Tika outputs XML, it also overrides fields passed to the ExtractingRequestHandler, so I guess the changes I need to make would rather apply somewhere around the Tika API.

Thank you in advance.
