Re: missing a directory, can not process pdf files

xxxx xxxx Wed, 19 Sep 2012 12:01:20 -0700

1) So what does this look like, for example?
2) Without curl? What does this look like? I am very confused because they use 
curl in the example but at the same time say that we should not use curl. Also, 
I have not installed curl.
-------- Original Message --------
> Date: Wed, 19 Sep 2012 11:47:54 -0700 (PDT)
> From: Chris Hostetter <hossman_luc...@fucit.org>
> To: solr-user@lucene.apache.org
> Subject: Re: missing a directory, can not process pdf files


> 
> : user:~/solr/example/exampledocs$ java -jar post.jar test.pdf doesnt work
> 
> 1) you can use post.jar to send PDFs, but you have to use the option to 
> tell solr you are sending a PDF file - because by default it assumes you 
> are posting XML.  you can see the problem by looking at the output from 
> post.jar and the solr logs...
> 
> hossman@frisbee:~/tmp/solr-4.0-BETA/bin-zip/apache-solr-4.0.0-BETA/example/exampledocs$
> java -jar post.jar /tmp/test.pdf 
> SimplePostTool version 1.5
> Posting files to base url http://localhost:8983/solr/update using
> content-type application/xml..
> ...
> 
> And in the Solr logs...
> 
> ...
> SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 middle byte 
> 0xe3 (at char #10, byte #-1)
>       at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:159)
> ...
> 
> ...if you specify the type things should work fine on the client side.
> 
> As for the Server side...
> 
> 2) by default Solr's "/update" handler supports Solr Documents in XML, 
> JSON, CSV, and JavaBin.  If you want to use the "ExtractingRequestHandler"
> to parse rich documents you just have to change the URL exactly as noted 
> in the wiki you mentioned
> ("-Durl=http://localhost:8983/solr/update/extract")
> 
> 
> -Hoss
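
For reference, the two points above (set the content type on the client side, and point the URL at /update/extract on the server side) could be combined into a single post.jar call with no curl involved. This is only a sketch under assumptions: Solr is running on localhost:8983, the command is run from example/exampledocs, and "doc1" is a made-up value for the unique key field. The -Dtype and -Dparams options are quoted from memory of the post.jar usage text, so check "java -jar post.jar -help" to see which options your version supports.

```shell
#!/bin/sh
# Sketch: send a PDF to the ExtractingRequestHandler using post.jar only.
# Assumed values (not from the thread): the file name and the document id.
SOLR_URL="http://localhost:8983/solr/update/extract"
PDF_FILE="test.pdf"
DOC_ID="doc1"   # hypothetical id, passed to Solr as literal.id

# The -D settings are JVM system properties, so they must come before -jar.
POST_CMD="java -Durl=$SOLR_URL -Dtype=application/pdf -Dparams=literal.id=$DOC_ID -jar post.jar $PDF_FILE"

# Print the command so it can be reviewed first.
echo "$POST_CMD"

# Uncomment to actually send the file (needs a running Solr instance):
# $POST_CMD
```

If the content type is left out, post.jar falls back to application/xml, which is what produces the "Invalid UTF-8 middle byte" error shown in the logs above.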
