ASF subversion and git services commented on LUCENE-7914:

Commit f059ad065553460412f247c61b887a0c80406c0e in lucene-solr's branch 
refs/heads/branch_7x from [~jimczi]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f059ad06 ]

LUCENE-7914: AnalyzingSuggesterTest#testRandomRealisticKeys: trim big titles to 
make sure that they can pass the max recursion level in Operations#topsortState.

> Add safeguards to RegExp.toAutomaton and Operations
> ---------------------------------------------------
>                 Key: LUCENE-7914
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7914
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Jim Ferenczi
>             Fix For: master (8.0), 7.1
>         Attachments: LUCENE-7914.patch, LUCENE-7914.patch, LUCENE-7914.patch, 
> LUCENE-7914.patch, LUCENE-7914.patch
> When creating an automaton from a regexp, some operators can create more 
> states than maxDeterminizedStates:
> {code}
> a{10000}
> {code}
> The example above <b>creates</b> a single path with 10k states already 
> determinized so maxDeterminizedStates is not checked. 
> Some operations on automaton like Operations.isFinite or 
> Operations.topoSortStates are recursive and the maximum level of recursion 
> depends on the longest path in the automaton. So a large automaton like above 
> can exceed java's stack.
> In most of the cases we are covered by maxDeterminizedStates but there will 
> always be adversarial cases where a large automaton is created from a small 
> input so I think we should also have safeguards in the recursive methods. 
> I've attached a patch that adds a max recursion level to Operations.isFinite 
> and Operations.topoSortStates in order to limit stack overflows. The limit is 
> set to 1000 so any automaton with a path bigger than 1000 would throw an 
> IllegalStateException.
> The patch also uses maxDeterminizedStates to limit the number of states that 
> a repeat operator can create and throw a TooComplex..Exception when this 
> limit is reached.
> Finally the patch adds the ability to skip Operations.isFinite on 
> AutomatonQuery and uses this as an optimization for PrefixQuery that uses 
> infinite automatons only.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to