Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
notification.

The following page has been changed by FergusMcMenemie:
http://wiki.apache.org/solr/ExtractingRequestHandler

The comment on the change is:
Added note for indexing large documents.

------------------------------------------------------------------------------
  or whatever other way you know how to do it.  Don't forget to COMMIT!
   * e.g. curl "http://localhost:8983/solr/update/" -H "Content-Type: text/xml" 
--data-binary '<commit waitFlush="false"/>'   --see 
[http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#example.source
 LucidImagination note]
  
- If you are not working from the supplied example/solr directory you must copy 
all libraries from example/solr/libs into a libs directory within your own solr 
directory. The !ExtractingRequestHandler is not incorporated into the solr war 
file, you have to install it seperatly.
+ If you are not working from the supplied example/solr directory you must copy 
all libraries from example/solr/libs into a libs directory within your own solr 
directory. The !ExtractingRequestHandler is not incorporated into the solr war 
file, you have to install it separately.
  
  = Configuration =
  
@@ -83, +83 @@

  In the defaults section, we are mapping Tika's Last-Modified Metadata 
attribute to a field named last_modified.  We are also telling it to ignore 
undeclared fields.  These are all overridden parameters.
  
  The tika.config entry points to a file containing a Tika configuration.  You 
would only need this if you have customized your own Tika configuration.  The 
Tika config contains info about parsers, mime types, etc.
+ 
+ You may also need to adjust the {{{multipartUploadLimitInKB}}} attribute as 
follows if you are submitting very large documents. The 
{{{enableRemoteStreaming}}} is not used by the !ExtractingRequestHandler.
+ {{{
+   <requestDispatcher handleSelect="true" >
+     <requestParsers enableRemoteStreaming="false" 
multipartUploadLimitInKB="20480" />
+     ....
+ }}}
  
  Lastly, the date.formats allows you to specify various 
java.text.SimpleDateFormat date formats for working with transforming extracted 
input to a Date.  Solr comes configured with the following date formats (see 
the DateUtil class in Solr)
  {{{
