mikemccand commented on code in PR #13072:
URL: https://github.com/apache/lucene/pull/13072#discussion_r1504523798

##########
lucene/core/src/java/org/apache/lucene/util/automaton/Automaton.java:
##########
@@ -92,6 +93,7 @@ public Automaton() {
   public Automaton(int numStates, int numTransitions) {
     states = new int[numStates * 2];
     isAccept = new BitSet(numStates);
+    terminable = new BitSet(numStates);

Review Comment:
   One spooky thing about this new `BitSet` is it is "best effort" now? I.e. one could create an Automaton that indeed has some states that match all suffixes, but forget to set the bit here? E.g. if I build a `RegexpQuery` that is actually a `PrefixQuery` we won't set this?

   Everything else about `Automaton` today is fundamental (states, transitions, isAccept) and necessary, but this new member is more a best effort optimization?

##########
lucene/core/src/java/org/apache/lucene/util/automaton/Automaton.java:
##########
@@ -70,6 +70,7 @@ public class Automaton implements Accountable, TransitionAccessor {
   private int[] states;
   private final BitSet isAccept;
+  private final BitSet terminable;

Review Comment:
   At first I couldn't understand what `terminable` means. If a bit is set for a state, does that mean this state accepts everything from now on (all possible suffixes)? Could we maybe rename it to `isMatchAllSuffix` or so?

##########
lucene/core/src/java/org/apache/lucene/util/automaton/RunAutomaton.java:
##########
@@ -67,12 +68,16 @@ protected RunAutomaton(Automaton a, int alphabetSize) {
     points = a.getStartPoints();
     size = Math.max(1, a.getNumStates());
     accept = new FixedBitSet(size);
+    terminable = new FixedBitSet(size);

Review Comment:
   Perhaps, instead of storing this concept in `Automaton`, we could solely store it in `RunAutomaton`, if we can efficiently find final states that effectively have `.*` transitions to themselves?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
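
For illustration, the last suggestion (finding accept states that effectively have `.*` transitions to themselves) could be sketched roughly as below. This is a self-contained sketch, not Lucene code: `MatchAllSuffixSketch`, `Range`, and `findMatchAllSuffixStates` are hypothetical names, and the transition table is a stand-in for whatever `Automaton` exposes. It assumes a deterministic automaton whose per-state transitions are sorted by `min` and do not overlap. The check is conservative: it only recognizes an accept state whose own self-loops cover the whole alphabet, which should be enough for a minimal DFA but can miss states in a non-minimal one.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a deterministic automaton; this is not Lucene's API. */
final class MatchAllSuffixSketch {

  /** One labeled range transition: every label in [min, max] leads to dest. */
  static final class Range {
    final int min;
    final int max;
    final int dest;

    Range(int min, int max, int dest) {
      this.min = min;
      this.max = max;
      this.dest = dest;
    }
  }

  /**
   * Marks each state that is accepting and whose outgoing transitions are all
   * self-loops that together cover [alphabetMin, alphabetMax] without gaps.
   *
   * Assumes transitions[state] is sorted by min and non-overlapping, and that
   * alphabetMax is below Integer.MAX_VALUE so that max + 1 cannot overflow.
   */
  static BitSet findMatchAllSuffixStates(
      Range[][] transitions, BitSet isAccept, int alphabetMin, int alphabetMax) {
    BitSet result = new BitSet(transitions.length);
    for (int state = 0; state < transitions.length; state++) {
      if (!isAccept.get(state)) {
        continue; // a non-accepting state rejects the empty suffix
      }
      int nextUncovered = alphabetMin; // smallest label not yet covered by a self-loop
      boolean onlySelfLoops = true;
      for (Range r : transitions[state]) {
        if (r.dest != state || r.min > nextUncovered) {
          onlySelfLoops = false; // leaves the state, or there is a gap in coverage
          break;
        }
        nextUncovered = Math.max(nextUncovered, r.max + 1);
      }
      if (onlySelfLoops && nextUncovered > alphabetMax) {
        result.set(state);
      }
    }
    return result;
  }
}
```

Because the result is derived from states and transitions alone, it cannot drift out of sync with the automaton the way a manually set bit could, which is the "best effort" concern raised in the first comment. The cost is one pass over all transitions at construction time.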