Michael McCandless created LUCENE-6721:
------------------------------------------
Summary: How to handle back-compat for new graph TokenFilters?
Key: LUCENE-6721
URL: https://issues.apache.org/jira/browse/LUCENE-6721
Project: Lucene - Core
Issue Type: Wish
Reporter: Michael McCandless
LUCENE-6664 has a patch for a new synonym filter that correctly handles
multi-token synonyms, unlike the known bugs we have today (see
http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
for examples).
But since existing query parsers and indexer (and I'm sure many other external
analysis consumers) ignore {{PosLenAtt}}, that patch also has a back-compat
layer, {{SausageGraphFilter}}, to "squash" the graph back down so these
components work as best they can...
Anyway, unless we can figure out how to make the back-compat even better than
{{SausageGraphFilter}}, we can't really move forward with graph token filters.
Robert suggested entirely new attributes for graph token streams, but I don't
see how that can work: it seems like we'd then need to have 2 copies of certain
token filters, e.g. {{StopFilter}} and {{StopGraphFilter}}.
Maybe we could fix indexer and at least our query parsers to barf if they every
see {{PosLenAtt}} != 1? Then you'd know you need to add the back compat layer
to your analysis chain...
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]