On 5/4/2016 7:15 AM, Betsey Benagh wrote:
> (X-posted from stack overflow)
>
> This feels like a basic, dumb question, but my reading of the documentation
> has not led me to an answer.
>
>
> I'm using Solr to index journal articles. Using the out-of-the-box
> configuration, it indexed the text of the documents, but I'm looking to use
> Grobid to pull out the authors, title, affiliations, etc. I got grobid up and
> running as a service.
>
> I added
>
> <str name="tika.config">/path/to/tika-config.xml</str>
>
> to the requestHandler for /update/extract in solrconfig.xml
>
> The tika-config looks like:
>
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <properties>
> <parsers>
> <parser class="org.apache.tika.parser.journal.JournalParser">
> <mime>application/pdf</mime>
> </parser>
> </parsers>
> </properties>
>
>
> I'm getting a ClassNotFound exception when I try to import a document, but
> can't figure out where to set the classpath to fix it.
I do not know anything about grobid.

We'll need to see the exception -- the entire multi-line stacktrace,
including any "caused by" sections.

In general, you should create a lib directory in the solr home and place
all extra jars in that directory. Otherwise you need <lib> elements in
solrconfig.xml to load jars -- and they will be loaded once for every
core that uses that <lib> element. ${solr.solr.home}/lib loads jars
*once* when Solr starts and makes them available to all cores.
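
For reference, a rough sketch of what the per-core <lib> alternative
might look like in solrconfig.xml. The directory shown is a placeholder
I made up for illustration, not a path from your install, and which
jars actually belong there depends on what the stacktrace says is
missing:

```xml
<!-- solrconfig.xml: per-core alternative to ${solr.solr.home}/lib -->
<!-- /path/to/extra/jars is a hypothetical placeholder directory -->
<!-- regex selects which files in that directory get loaded -->
<lib dir="/path/to/extra/jars" regex=".*\.jar" />
```

With the shared ${solr.solr.home}/lib approach no such element is
needed; the jars only have to be copied into that directory before Solr
starts.
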
Thanks,
Shawn