The following page has been changed by seanoc5:
http://wiki.apache.org/solr/ExtractingRequestHandler

------------------------------------------------------------------------------
  
   * Check out Solr trunk or get a 1.4 release or later if it exists
   * cd example
-  * Add the Configuration as defined below to the solrconfig.xml (or your 
solrconfig.xml), the libs will be added to the Solr home lib automatically by 
the example target, but the example Solr configuration does not contain the 
configuration of the ExtractingRequestHandler
+  * Add the Configuration as defined below to the solrconfig.xml (or your 
solrconfig.xml), the libs will be added to the Solr home lib automatically by 
the example target, but the example Solr configuration does not contain the 
configuration of the ExtractingRequestHandler 
+   *''for recent Solr code from svn, just uncomment the existing section in 
solr/conf/solrconfig.xml under the 'example' dir''
   * java -jar start.jar
  
  
  In a separate window, post a file:
  
-  *  curl 
http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text  
-F "[email protected]" //Note, the trunk/site contains some nice example 
docs.
+  *  curl 
http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text  
-F "[email protected]" //Note, the trunk/site contains some nice example 
docs 
+   * hint: [email protected] needs a valid path (absolute or relative), 
e.g. "myfi...@../../site/tutorial.html" if you are still in exampledocs dir.
+   * with recent svn, you may need to add a unique '''id''' param to curl (see 
[http://www.nabble.com/Missing-required-field:-id-Using-ExtractingRequestHandler-td22611039.html
 nabble msg]):
+   * e.g. curl 
http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text\&ext.literal.id=123
 -F "myfi...@../../site/tutorial.html"
  
  or
  
@@ -51, +55 @@

         <!> NOTE, this literally streams the file, which does not, then, 
provide info to Solr about the name of the file.
  
  or whatever other way you know how to do it.  Don't forget to COMMIT!
+  * e.g. curl "http://localhost:8983/solr/update/" -H "Content-Type: text/xml" 
--data-binary '<commit waitFlush="false"/>'   --see 
[http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#example.source
 LucidImagination note]
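
The post-then-commit sequence described above can be sketched as a small shell helper. This is only an illustration, assuming Solr is listening on localhost:8983 as in the example; the function names, the SOLR_URL variable, and the form-field name "upload" are hypothetical and not taken from the page. The ext.* parameters and the commit message are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: assumes the example Solr instance from the steps above.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build the extract URL for a given document id.
# Parameter names (ext.idx.attr, ext.def.fl, ext.literal.id) come from the page;
# the id is inserted as-is, so it must not need URL encoding.
build_extract_url() {
    printf '%s/update/extract?ext.idx.attr=true&ext.def.fl=text&ext.literal.id=%s\n' \
        "$SOLR_URL" "$1"
}

# Post one file for extraction, then send a commit.
# "upload" is a hypothetical form-field name used only for this sketch.
post_and_commit() {
    doc_id="$1"
    file_path="$2"
    # Fail early if the path is wrong (see the hint about valid paths).
    [ -r "$file_path" ] || { echo "cannot read $file_path" >&2; return 1; }
    curl "$(build_extract_url "$doc_id")" -F "upload=@$file_path" || return 1
    curl "$SOLR_URL/update" -H "Content-Type: text/xml" \
        --data-binary '<commit waitFlush="false"/>'
}
```

Called as, for instance, post_and_commit 123 ../../site/tutorial.html, it checks the path before contacting Solr, which addresses the most common failure mentioned in the hint. Building the URL inside a quoted string also avoids having to escape each & for the shell.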
  
  = Configuration =
  
