[
https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630885#action_12630885
]
michaelsembwever edited comment on LUCENE-1380 at 9/14/08 5:58 AM:
---------------------------------------------------------------------
> All this patch does is to set all position increment of the tokens produced
> by the ShingleFilter to 0, right?
> I'm going to remove this for 2.4 fix and recommend you to use the filter
> strategy mentioned.
Adding the new TokenFilter isn't as easy as ABC: Lucene needs the filter class
added to its classpath, and Solr needs a matching TokenFilterFactory before it
can read the filter from its configuration files. That is a lot of work when
we're (almost) agreed that removing positional information from all tokens
makes sense when using the ShingleFilter.
If it were just the one installation I wouldn't have a problem with adding the
custom TokenFilter, but because our use case is an open-source, documented
system (read http://sesat.no/howto-solr-query-evaluation.html) I'd like to
make it as easy as possible for third parties.
I would also think that, because this is a way to replace commercial and
competing technology from FAST, the community would be behind such an
enhancement...
> Patch for ShingleFilter.enablePositions
> ---------------------------------------
>
> Key: LUCENE-1380
> URL: https://issues.apache.org/jira/browse/LUCENE-1380
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/analyzers
> Reporter: Michael Semb Wever
> Assignee: Karl Wettin
> Priority: Trivial
> Attachments: LUCENE-1380.patch, LUCENE-1380.patch
>
>
> Make it possible for *all* words and shingles to be placed at the same
> position.
> Default is to place each shingle at the same position as the unigram (or
> first shingle if outputUnigrams=false). That is, each coterminal token has
> positionIncrement=1 and every other token a positionIncrement=0.
> This leads to a MultiPhraseQuery where at least one word/shingle must be
> matched from each word/token. This is not always desired.
> See http://comments.gmane.org/gmane.comp.jakarta.lucene.user/34746 for
> mailing list thread.
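To make the two position layouts in the description concrete, here is a rough standalone sketch in plain Java. It does not use Lucene at all: the class ShingleSketch, the Tok type and the samePosition flag are hypothetical names made up for illustration, and the ordering of emitted tokens is an assumption rather than the actual ShingleFilter behaviour. In the default mode each unigram advances the position by 1 and the shingles starting at that word get an increment of 0; in the flattened mode every token gets an increment of 0, as the quoted comment describes the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone model of shingles and position increments (no Lucene dependency). */
final class ShingleSketch {

    /** Hypothetical stand-in for a token: its text plus its position increment. */
    static final class Tok {
        final String text;
        final int posInc;

        Tok(String text, int posInc) {
            this.text = text;
            this.posInc = posInc;
        }

        @Override
        public String toString() {
            return text + "(+" + posInc + ")";
        }
    }

    /**
     * Emits each word followed by the shingles that start at that word.
     * samePosition=false: unigrams get increment 1, shingles get 0.
     * samePosition=true: every token gets increment 0 (assumed patch behaviour).
     */
    static List<Tok> shingles(String[] words, int maxShingleSize, boolean samePosition) {
        List<Tok> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            out.add(new Tok(words[start], samePosition ? 0 : 1));
            StringBuilder shingle = new StringBuilder(words[start]);
            // Grow the shingle one word at a time, up to maxShingleSize words.
            for (int size = 2; size <= maxShingleSize && start + size <= words.length; size++) {
                shingle.append(' ').append(words[start + size - 1]);
                out.add(new Tok(shingle.toString(), 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"please", "divide", "this"};
        System.out.println("default:   " + shingles(words, 2, false));
        System.out.println("flattened: " + shingles(words, 2, true));
    }
}
```

With the default layout a query parser sees three distinct positions, which is what produces a MultiPhraseQuery requiring a match at each one. With the flattened layout all tokens share one position, so they can be treated as alternatives.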