[Solr Wiki] Update of "PreAnalyzedField" by AndrzejBialecki

Apache Wiki Fri, 11 May 2012 12:16:59 -0700

The "PreAnalyzedField" page has been changed by AndrzejBialecki:
http://wiki.apache.org/solr/PreAnalyzedField

New page:
= Using the PreAnalyzedField type for integration with external document processing pipelines =

''This field type has been available since Solr 4.0.''

The PreAnalyzedField type provides a way to send serialized token streams to
Solr, optionally with independent stored values of a field, and have this
information stored and indexed without any additional text processing applied
in Solr. This is useful if a user wants to submit field content that was
already processed by an existing external text processing pipeline (e.g.
tokenized, annotated, stemmed, or expanded with synonyms), while still using
all the rich per-token attributes that Lucene's TokenStream provides.

== Pluggable serialization ==
The serialization format is pluggable through implementations of the
PreAnalyzedParser interface. There are two out-of-the-box implementations:

 * JsonPreAnalyzedParser - as the name suggests, it parses content that uses 
JSON to represent a field's content. This is the default parser, used if the 
field type is not configured otherwise.
 * SimplePreAnalyzedParser - uses a simple, strict plain text format, which in 
some situations may be easier to create than JSON.

== Configuration options ==
There is only one configuration parameter, `parserImpl`. Its value should be
the fully qualified class name of a class that implements the
PreAnalyzedParser interface. The default value is
`org.apache.solr.schema.JsonPreAnalyzedParser`.
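
As a rough illustration of the `parserImpl` parameter, a schema.xml fragment along the following lines could select the plain text parser instead of the default. This is only a sketch: the type name `preanalyzed` and the field name `pre_content` are made up for this example, and the fully qualified name of SimplePreAnalyzedParser is assumed to be in the same package as the JSON parser named above.

{{{
<!-- Hypothetical type name; parserImpl may be omitted to get the JSON default. -->
<fieldType name="preanalyzed" class="solr.PreAnalyzedField"
           parserImpl="org.apache.solr.schema.SimplePreAnalyzedParser"/>

<!-- Hypothetical field name; the stored value, if any, travels with the serialized content. -->
<field name="pre_content" type="preanalyzed" indexed="true" stored="true"/>
}}}

Leaving out `parserImpl` entirely should be equivalent to specifying `org.apache.solr.schema.JsonPreAnalyzedParser`.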
