1) So what does this look like, for example?
2) And without curl, what does it look like?

I am very confused because they use curl in the example but at the same time say that we should not use curl. Also, I have not installed curl.

-------- Original Message --------
> Date: Wed, 19 Sep 2012 11:47:54 -0700 (PDT)
> From: Chris Hostetter <hossman_luc...@fucit.org>
> To: solr-user@lucene.apache.org
> Subject: Re: missing a directory, can not process pdf files
>
> : user:~/solr/example/exampledocs$ java -jar post.jar test.pdf doesn't work
>
> 1) You can use post.jar to send PDFs, but you have to use the option to
> tell Solr you are sending a PDF file, because by default it assumes you
> are posting XML. You can see the problem by looking at the output from
> post.jar and the Solr logs...
>
> hossman@frisbee:~/tmp/solr-4.0-BETA/bin-zip/apache-solr-4.0.0-BETA/example/exampledocs$ java -jar post.jar /tmp/test.pdf
> SimplePostTool version 1.5
> Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
> ...
>
> And in the Solr logs...
>
> ...
> SEVERE: org.apache.solr.common.SolrException: Invalid UTF-8 middle byte 0xe3 (at char #10, byte #-1)
>         at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:159)
> ...
>
> ...if you specify the type, things should work fine on the client side.
>
> As for the server side...
>
> 2) By default Solr's "/update" handler supports Solr documents in XML,
> JSON, CSV, and JavaBin. If you want to use the "ExtractingRequestHandler"
> to parse rich documents, you just have to change the URL exactly as noted
> in the wiki you mentioned
> ("-Durl=http://localhost:8983/solr/update/extract")
>
>
> -Hoss
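
On the "without curl" question: curl is only one way to issue an HTTP POST, so any HTTP client should be able to do the same job. Below is a minimal sketch in Python (standard library only) of posting a PDF to the /update/extract URL quoted above. The function names, the "doc1" id and the choice to pass literal.id and commit=true as query parameters are assumptions made for illustration, not something taken from the thread, and the sketch has not been run against Solr 4.0-BETA.

```python
import urllib.parse
import urllib.request


def build_extract_request(pdf_path, doc_id,
                          base_url="http://localhost:8983/solr/update/extract"):
    """Build (but do not send) a POST request carrying the raw PDF bytes.

    base_url is the extract URL quoted in the thread; doc_id is a
    hypothetical unique key chosen by the caller.
    """
    # Assumption: literal.id supplies the document's unique key and
    # commit=true asks Solr to commit right after indexing.
    query = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    with open(pdf_path, "rb") as f:
        body = f.read()
    return urllib.request.Request(
        base_url + "?" + query,
        data=body,
        # Declare the payload as PDF so it is not treated as XML.
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )


def post_pdf(pdf_path, doc_id):
    """Send the request to a running Solr and return its response as text."""
    request = build_extract_request(pdf_path, doc_id)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With Solr running locally, calling post_pdf("test.pdf", "doc1") would send the file. Building the request is kept separate from sending it so that the URL and content type can be inspected without a server.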