You can place a Tika configuration file (for example, one called tika.config) in your Solr core's conf directory and point Solr's ExtractingRequestHandler at it with the tika.config parameter in solrconfig.xml. In that file you can register your custom parser.
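
As a rough sketch of what that could look like: the parser class com.example.tika.MxParser, the media type application/x-mx and the file name tika-config.xml are all made-up placeholders, so substitute your own. The exact config format also depends on the Tika version bundled with your Solr release, so please check it against the Tika docs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- conf/tika-config.xml (hypothetical file name) -->
<properties>
  <parsers>
    <!-- keep the stock parsers available for all other file types -->
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <!-- hypothetical custom parser class and media type for *.mx files -->
    <parser class="com.example.tika.MxParser">
      <mime>application/x-mx</mime>
    </parser>
  </parsers>
</properties>
```

The handler in solrconfig.xml then needs to be told where the file is, via the tika.config parameter:

```xml
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <!-- path to the Tika config file sketched above (assumed name) -->
  <str name="tika.config">tika-config.xml</str>
</requestHandler>
```

The jar containing your parser also has to be on the classpath Solr uses for the extraction contrib, for instance through a lib directive in solrconfig.xml. Otherwise the class named in the config cannot be loaded.
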
See https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 23 Jul 2015, at 22:03, Aditya Dhulipala <[email protected]> wrote:
>
> Hi,
>
> I have implemented a new file-type parser for Tika. It parses a custom
> file type (*.mx).
>
> I would like my Solr instance to use my version of Tika with the mx parser.
>
> I found this via a Google search:
>
> https://lucidworks.com/blog/extending-apache-tika-capabilities/
>
> But it seems to be over 5 years old, and the "download project" link is
> broken.
>
> Can anybody help me with this?
>
> I tried replacing the tika-* jars within contrib/extraction/lib under
> solr-root with my compiled tika-* jars. But that didn't work; Solr is still
> using the old Tika binaries (i.e. without the .mx parser). I know that my
> tika-* jars are working correctly, because I can run them in GUI mode and
> parse a test .mx file.
>
> Thanks!
>
> -
>
> Aditya
