I just noticed this in SynonymFilter in trunk:
// TODO: we should set PositionLengthAttr too...
It looks like the code does in fact set the PositionLengthAttribute, so
maybe it is just a dead TODO.
And I see the following comment, which I had seen before and which was the
basis for my assertion that arbitrary graphs were not supported:
* <p><b>NOTE</b>: when a match occurs, the output tokens
* associated with the matching rule are "stacked" on top of
* the input stream (if the rule had
* <code>keepOrig=true</code>) and also on top of another
* matched rule's output tokens. This is not a correct
* solution, as really the output should be an arbitrary
* graph/lattice. For example, with the above match, you
* would expect an exact <code>PhraseQuery</code> <code>"y b
* c"</code> to match the parsed tokens, but it will fail to
* do so. This limitation is necessary because Lucene's
* TokenStream (and index) cannot yet represent an arbitrary
* graph.</p>
Granted, some of that is specific to index-time support for synonyms, which
I am avoiding, but it is a source of some confusion. If full graphs are
somehow supported at query time (or in the TokenStream in general), that
should be stated more clearly.
-- Jack Krupansky
-----Original Message-----
From: Robert Muir
Sent: Friday, August 10, 2012 1:44 PM
To: dev@lucene.apache.org
Subject: Re: Proposal: Full support for multi-word synonyms at query time
On Fri, Aug 10, 2012 at 1:36 PM, Jack Krupansky <j...@basetechnology.com>
wrote:
One of the ongoing potholes of Solr and Lucene is lack of full support for
multi-word synonyms at query time. The root of the problem is twofold:
individual terms are presented for analysis, which precludes recognition of
multi-term synonyms, and the output stream from the analysis process is a
single, linear stream without regard to any graph/lattice structure for
multiple synonyms.
But this is not true. PositionLengthAttribute was already added, which
makes it a graph.
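
To illustrate how a position length turns a flat token stream into a graph,
here is a minimal, self-contained sketch. It is not Lucene code: the Token and
Arc classes, the paths() helper, and the "dns" / "domain name service" synonym
are hypothetical stand-ins chosen for illustration. The only assumption taken
from Lucene is the convention that a token starts at the running sum of
position increments and ends position-length positions later.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenGraphSketch {
    /** Hypothetical stand-in for a token's term, position increment and position length. */
    public static final class Token {
        final String term; final int posInc; final int posLen;
        public Token(String term, int posInc, int posLen) {
            this.term = term; this.posInc = posInc; this.posLen = posLen;
        }
    }

    /** One arc of the token graph: the term spans from node 'from' to node 'to'. */
    public static final class Arc {
        final String term; final int from; final int to;
        Arc(String term, int from, int to) { this.term = term; this.from = from; this.to = to; }
    }

    /** Converts tokens to arcs: start = sum of increments, end = start + position length. */
    public static List<Arc> toArcs(List<Token> tokens) {
        List<Arc> arcs = new ArrayList<>();
        int pos = -1; // so that a first token with posInc=1 lands on node 0
        for (Token t : tokens) {
            pos += t.posInc;
            arcs.add(new Arc(t.term, pos, pos + t.posLen));
        }
        return arcs;
    }

    /** Enumerates every term sequence that leads from node 0 to the last node. */
    public static List<String> paths(List<Token> tokens) {
        List<Arc> arcs = toArcs(tokens);
        int end = 0;
        for (Arc a : arcs) end = Math.max(end, a.to);
        List<String> out = new ArrayList<>();
        walk(arcs, 0, end, "", out);
        return out;
    }

    private static void walk(List<Arc> arcs, int node, int end, String prefix, List<String> out) {
        if (node == end) { out.add(prefix.trim()); return; }
        for (Arc a : arcs) {
            if (a.from == node) walk(arcs, a.to, end, prefix + " " + a.term, out);
        }
    }

    public static void main(String[] args) {
        // Synonym with position length 3: "dns" spans all of "domain name service".
        List<Token> graph = List.of(
            new Token("domain", 1, 1), new Token("dns", 0, 3),
            new Token("name", 1, 1), new Token("service", 1, 1));
        System.out.println(paths(graph));   // [domain name service, dns]

        // Same tokens, but "dns" merely stacked with position length 1.
        List<Token> stacked = List.of(
            new Token("domain", 1, 1), new Token("dns", 0, 1),
            new Token("name", 1, 1), new Token("service", 1, 1));
        System.out.println(paths(stacked)); // [domain name service, dns name service]
    }
}
```

In the second case the synonym is only stacked on the first position, so the
stream admits the spurious sequence "dns name service". That is the kind of
mismatch the SynonymFilter comment describes; with a position length recorded,
each alternative is a separate path through the graph.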
--
lucidimagination.com
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org