[ https://issues.apache.org/jira/browse/LUCENE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121364#comment-16121364 ]
ASF subversion and git services commented on LUCENE-7914: --------------------------------------------------------- Commit e16df81cb9a1e687295237ce45339be2e7185519 in lucene-solr's branch refs/heads/branch_7_0 from [~jimczi] [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e16df81 ] LUCENE-7914: Fix TestSuggestField#testRealisticKeys: trim big titles to make sure that they can pass the max recursion level in Operations#topsortState. > Add safeguards to RegExp.toAutomaton and Operations > --------------------------------------------------- > > Key: LUCENE-7914 > URL: https://issues.apache.org/jira/browse/LUCENE-7914 > Project: Lucene - Core > Issue Type: Bug > Reporter: Jim Ferenczi > Fix For: master (8.0), 7.1 > > Attachments: LUCENE-7914.patch, LUCENE-7914.patch, LUCENE-7914.patch, > LUCENE-7914.patch, LUCENE-7914.patch > > > When creating an automaton from a regexp, some operators can create more > states than maxDeterminizedStates: > {code} > a{10000} > {code} > The example above <b>creates</b> a single path with 10k states already > determinized so maxDeterminizedStates is not checked. > Some operations on automaton like Operations.isFinite or > Operations.topoSortStates are recursive and the maximum level of recursion > depends on the longest path in the automaton. So a large automaton like above > can exceed java's stack. > In most of the cases we are covered by maxDeterminizedStates but there will > always be adversarial cases where a large automaton is created from a small > input so I think we should also have safeguards in the recursive methods. > I've attached a patch that adds a max recursion level to Operations.isFinite > and Operations.topoSortStates in order to limit stack overflows. The limit is > set to 1000 so any automaton with a path bigger than 1000 would throw an > IllegalStateException. > The patch also uses maxDeterminizedStates to limit the number of states that > a repeat operator can create and throw a TooComplex..Exception when this > limit is reached. > Finally the patch adds the ability to skip Operations.isFinite on > AutomatonQuery and uses this as an optimization for PrefixQuery that uses > infinite automatons only. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org