[jira] [Created] (LUCENE-6721) How to handle back-compat for new graph TokenFilters?

Michael McCandless (JIRA) Wed, 05 Aug 2015 08:56:32 -0700

Michael McCandless created LUCENE-6721:
------------------------------------------


             Summary: How to handle back-compat for new graph TokenFilters?
                 Key: LUCENE-6721
                 URL: https://issues.apache.org/jira/browse/LUCENE-6721
             Project: Lucene - Core
          Issue Type: Wish
            Reporter: Michael McCandless


LUCENE-6664 has a patch for a new synonym filter that correctly handles 
multi-token synonyms, unlike the current synonym filter, which has known bugs 
(see http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html 
for examples).

But since the existing query parsers and the indexer (and I'm sure many other 
external analysis consumers) ignore {{PosLenAtt}}, that patch also has a 
back-compat layer, {{SausageGraphFilter}}, to "squash" the graph back down so 
these components work as best they can...

Anyway, unless we can figure out how to make the back-compat even better than 
{{SausageGraphFilter}}, we can't really move forward with graph token filters.

Robert suggested entirely new attributes for graph token streams, but I don't 
see how that can work: it seems like we'd then need to have 2 copies of certain 
token filters, e.g. {{StopFilter}} and {{StopGraphFilter}}.

Maybe we could fix the indexer and at least our query parsers to barf if they 
ever see {{PosLenAtt}} != 1?  Then you'd know you need to add the back-compat 
layer to your analysis chain...
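
A minimal sketch of what such a strict check might look like, written in plain 
Java with no Lucene dependency so it runs on its own. Everything here is an 
assumption for illustration: {{PosLenCheck}}, {{Token}} and {{requireNoGraph}} 
are hypothetical names, not part of Lucene or of the LUCENE-6664 patch, and a 
real consumer would read the position length from the token stream's 
attributes instead of from a list.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from Lucene or the LUCENE-6664 patch.
final class PosLenCheck {

    // Stand-in for the per-token values an analysis consumer would read.
    static final class Token {
        final String term;
        final int positionIncrement;
        final int positionLength;

        Token(String term, int positionIncrement, int positionLength) {
            this.term = term;
            this.positionIncrement = positionIncrement;
            this.positionLength = positionLength;
        }
    }

    // Fail fast on a graph token instead of silently ignoring its position length.
    static void requireNoGraph(List<Token> tokens) {
        for (Token t : tokens) {
            if (t.positionLength != 1) {
                throw new IllegalArgumentException(
                    "token '" + t.term + "' has position length " + t.positionLength
                    + "; add the back-compat layer to the analysis chain");
            }
        }
    }

    public static void main(String[] args) {
        // "dns" as a synonym spanning the three positions of "domain name service".
        List<Token> graph = Arrays.asList(
            new Token("domain", 1, 1),
            new Token("dns", 0, 3),
            new Token("name", 1, 1),
            new Token("service", 1, 1));
        try {
            requireNoGraph(graph);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of throwing rather than squashing silently is that a consumer that 
cannot handle graphs surfaces the problem at analysis time, and the fix (adding 
the back-compat layer) stays an explicit choice in the chain.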




