(Cross-posted from Stack Overflow)
This feels like a basic, dumb question, but my reading of the documentation has
not led me to an answer.
I'm using Solr to index journal articles. With the out-of-the-box
configuration, it indexes the text of the documents, but I want to use
Grobid to pull out the authors, title, affiliations, etc. I have Grobid up and
running as a service.
I added
<str name="tika.config">/path/to/tika-config.xml</str>
to the requestHandler for /update/extract in solrconfig.xml.
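For context, here is a minimal sketch of where that line sits. It assumes the stock handler definition: the class attribute and startup="lazy" are taken from the default solrconfig.xml and may differ by Solr version, and any existing defaults block is left out. Only the tika.config line is the actual addition.

```xml
<!-- Sketch only: attributes other than name are assumed from the stock config -->
<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler">
  <!-- The added line: points Solr Cell at a custom Tika configuration -->
  <str name="tika.config">/path/to/tika-config.xml</str>
</requestHandler>
```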
The tika-config looks like:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.journal.JournalParser">
      <mime>application/pdf</mime>
    </parser>
  </parsers>
</properties>
I'm getting a ClassNotFoundException when I try to import a document, but I
can't figure out where to set the classpath to fix it.