Hi FMC,
On 5/3/2011 at 12:37 PM, FatMan Corp wrote:
> Hi, I would like to get another's field information for the same document
> within a Tokenizer class.
> How can this be achieved?
Use <copyField>s in your schema
<http://wiki.apache.org/solr/SchemaXml#Copy_Fields>, and associate different
analysis pipelines with each field. Each field's analysis pipeline will be fed
the original raw text.
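
As a rough sketch of that setup, here is a schema.xml fragment. The field and
type names (body, body_std, text_ws, text_std) are made up for illustration;
only the element names and the factory classes are standard Solr. The
fragment belongs inside the <schema> element, and your real schema will
differ:

```xml
<!-- Hypothetical names throughout: adjust to your own schema. -->
<types>
  <!-- Pipeline 1: split on whitespace only -->
  <fieldType name="text_ws" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>
  <!-- Pipeline 2: standard tokenization plus lowercasing -->
  <fieldType name="text_std" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <field name="body"     type="text_ws"  indexed="true" stored="true"/>
  <field name="body_std" type="text_std" indexed="true" stored="false"/>
</fields>

<!-- body_std receives the same raw input as body, then runs its own analysis -->
<copyField source="body" dest="body_std"/>
```

Note that the copy happens before analysis, so body_std sees the raw text
sent to body, not body's tokens.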
Presently Lucene's analysis pipeline is single-field only: you have to create
separate analysis pipelines for each field, with an extra pass over the
original text for each field. I personally think Lucene should provide
multi-field analysis capabilities, but this would not be a simple change. Even
if Lucene does eventually gain this capability, modifying Solr to expose it
would be an added layer of complexity, and given that <copyField> already
exists as a workaround, there may be little motivation to do so.
Some of the use cases full multi-field analysis could serve are already handled
in Lucene (but not yet in Solr) by TeeSinkTokenFilter
<http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/analysis/TeeSinkTokenFilter.html>.
An enterprising Lucene user could write a single-pass tokenizer that emits
tokens with one type per target field, then employ one TeeSinkTokenFilter per
field to approximate full multi-field analysis. Adding TeeSinkTokenFilter
support to Solr, though, would require substantial changes to Solr's code and
schema format (schema schema?).
Steve
> -Original Message-
> From: FatMan Corp [mailto:fatmanc...@gmail.com]
> Sent: Tuesday, May 03, 2011 12:37 PM
> To: solr-user@lucene.apache.org
> Subject: Getting field information inside a Tokenizer
>
> Hi, I would like to get another's field information for the same document
> within a Tokenizer class.
> How can this be achieved?
>
> Thanks