[ https://issues.apache.org/jira/browse/LUCENE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453279#comment-17453279 ]
ASF subversion and git services commented on LUCENE-10010:
----------------------------------------------------------
Commit a39337e595c3ae8da909de2c62e142b4e88f9ec3 in lucene's branch
refs/heads/main from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=a39337e5 ]
LUCENE-10010: fix TestMockAnalyzer to determinize
The test would randomly fail if RegExp parsing returned an NFA, because
the test did not explicitly determinize the automaton itself.
This is a bit of a trap in RegExp: it calls minimize() as it parses,
so most of the time it returns a DFA. This may be
unnecessary...
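
To illustrate why a consumer that expects a DFA has to determinize explicitly, here is a minimal, self-contained sketch of the classic subset construction in Python. It does not use Lucene's API: the dictionary-based NFA representation and the names `is_deterministic`, `determinize`, and `nfa` are assumptions made for this illustration, and the NFA is assumed to have no epsilon transitions.

```python
from collections import deque


def is_deterministic(transitions):
    """Return True if no state has more than one target for any symbol."""
    return all(
        len(targets) <= 1
        for by_symbol in transitions.values()
        for targets in by_symbol.values()
    )


def determinize(transitions, start, accepting):
    """Subset construction: every DFA state is a frozenset of NFA states."""
    dfa_start = frozenset([start])
    dfa_transitions = {}
    dfa_accepting = set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        # Merge the outgoing transitions of all NFA states in this subset.
        merged = {}
        for state in current:
            for symbol, targets in transitions.get(state, {}).items():
                merged.setdefault(symbol, set()).update(targets)
        dfa_transitions[current] = {}
        for symbol, targets in merged.items():
            nxt = frozenset(targets)
            # Stored as a one-element set so is_deterministic() also works here.
            dfa_transitions[current][symbol] = {nxt}
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return dfa_transitions, dfa_start, dfa_accepting


# Hypothetical NFA for (a|b)*ab: state 0 has two targets on "a".
nfa = {0: {"a": {0, 1}, "b": {0}}, 1: {"b": {2}}}
dfa, dfa_start, dfa_accepting = determinize(nfa, 0, {2})
```

Code that walks `nfa` assuming a single target per symbol would behave incorrectly, while the same code run on `dfa` is safe. That is the kind of assumption the test relied on implicitly.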
> Should we have a NFA Query?
> ---------------------------
>
> Key: LUCENE-10010
> URL: https://issues.apache.org/jira/browse/LUCENE-10010
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/search
> Affects Versions: 9.0
> Reporter: Haoyu Zhai
> Priority: Major
> Time Spent: 9h
> Remaining Estimate: 0h
>
> Today, when a {{RegexpQuery}} is created, it is translated to an NFA,
> determinized to a DFA, and eventually becomes an {{AutomatonQuery}}, which is
> very fast. However, not every NFA can be determinized to a DFA easily; the
> example given in LUCENE-9981 showed how easily a short regexp can break the
> determinization process.
> Maybe, instead of marking those kinds of queries as adversarial cases, we
> could add a new kind of NFA query that executes directly on the NFA, so there
> is no need to worry about the determinization process or the size of the
> determinized DFA. It should be slower, but it would also make those
> adversarial cases doable.
> [This article|https://swtch.com/~rsc/regexp/regexp1.html] describes a
> simple but efficient way of searching over an NFA; essentially, it is a
> partial determinization process that determinizes only the necessary part of
> the DFA. Maybe we could give it a try?
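
The partial determinization idea in the description can be sketched as follows. This is a minimal Python illustration, not Lucene code and not the proposed implementation: the class `LazyDfaMatcher`, the helper `nth_from_end_nfa`, and the dictionary-based, epsilon-free NFA representation are all assumptions made for the example. The matcher tracks the set of active NFA states and caches each (state set, symbol) step, so only the DFA states actually reached by the input are ever built.

```python
class LazyDfaMatcher:
    """Runs an epsilon-free NFA directly, caching subset transitions on demand."""

    def __init__(self, transitions, start, accepting):
        self.transitions = transitions          # state -> symbol -> set of states
        self.start = frozenset([start])
        self.accepting = frozenset(accepting)
        self.cache = {}                         # (state set, symbol) -> state set

    def step(self, current, symbol):
        key = (current, symbol)
        nxt = self.cache.get(key)
        if nxt is None:
            # First visit: compute this one DFA transition and remember it.
            targets = set()
            for state in current:
                targets.update(self.transitions.get(state, {}).get(symbol, ()))
            nxt = frozenset(targets)
            self.cache[key] = nxt
        return nxt

    def matches(self, text):
        current = self.start
        for symbol in text:
            current = self.step(current, symbol)
            if not current:                     # no active NFA states left
                return False
        return bool(current & self.accepting)


def nth_from_end_nfa(n):
    """NFA for (a|b)*a(a|b){n}: n + 2 states, but 2**(n + 1) DFA states."""
    transitions = {0: {"a": {0, 1}, "b": {0}}}
    for i in range(1, n + 1):
        transitions[i] = {"a": {i + 1}, "b": {i + 1}}
    return transitions, 0, {n + 1}


matcher = LazyDfaMatcher(*nth_from_end_nfa(20))
hit = matcher.matches("b" * 5 + "a" + "b" * 20)   # "a" is 21st from the end
miss = matcher.matches("a" + "b" * 19)            # input is too short to match
```

For this pattern, full determinization would need 2**21 DFA states, whereas the cache only grows with the distinct state sets the input actually visits. The trade-off is extra work per character on a cache miss, which matches the expectation in the description that such a query would be slower but feasible.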