Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by FergusMcMenemie:
http://wiki.apache.org/solr/ExtractingRequestHandler

The comment on the change is:
Added note for indexing large documents.

------------------------------------------------------------------------------
  or whatever other way you know how to do it. Don't forget to COMMIT!
   * e.g. curl "http://localhost:8983/solr/update/" -H "Content-Type: text/xml" --data-binary '<commit waitFlush="false"/>'
   --see [http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#example.source LucidImagination note]

- If you are not working from the supplied example/solr directory you must copy all libraries from example/solr/libs into a libs directory within your own solr directory. The !ExtractingRequestHandler is not incorporated into the solr war file, you have to install it seperatly.
+ If you are not working from the supplied example/solr directory you must copy all libraries from example/solr/libs into a libs directory within your own solr directory. The !ExtractingRequestHandler is not incorporated into the solr war file, you have to install it separately.

  = Configuration =

@@ -83, +83 @@
  In the defaults section, we are mapping Tika's Last-Modified Metadata attribute to a field named last_modified. We are also telling it to ignore undeclared fields. These are all overridden parameters. The tika.config entry points to a file containing a Tika configuration. You would only need this if you have customized your own Tika configuration. The Tika config contains info about parsers, mime types, etc.
+
+ You may also need to adjust the {{{multipartUploadLimitInKB}}} attribute as follows if you are submitting very large documents. The {{{enableRemoteStreaming}}} is not used by the !ExtractingRequestHandler.
+ {{{
+ <requestDispatcher handleSelect="true" >
+ <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="20480" />
+ ....
+ }}}

  Lastly, the date.formats allows you to specify various java.text.SimpleDateFormat date formats for working with transforming extracted input to a Date. Solr comes configured with the following date formats (see the DateUtil class in Solr) {{{
