[
https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034245#comment-14034245
]
Robert Muir commented on LUCENE-5752:
-------------------------------------
I think the tests, docs, etc. look great here. I really like the patch;
I'm just worried about a few minor things:
* concatenate: as mentioned before, we rely on this today in quite a few
places, and its runtime has now changed significantly (when the left side is a
singleton)
* singleton: speaking of which, this optimization is removed, but are we sure
about this? In practice it is probably extremely effective, maybe even
outweighing any other optimizations we could do.
* regex/wildcard parsing: we should really test that this isn't totally crazy
(read: blowing up) now.
* acceptStates: should this really be a HashSet? Is there a reason not to use a
BitSet?
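To make the acceptStates question concrete, here is a minimal sketch of accept-state tracking backed by java.util.BitSet. It assumes states are small non-negative ints, as the issue description says. The class and method names are hypothetical and are not taken from the patch.

```java
import java.util.BitSet;

// Hypothetical holder for accept states; NOT the class from the patch.
// Assumes states are dense, non-negative ints.
final class AcceptStateSet {
    // One bit per state: no boxing to Integer and no per-entry hash node.
    private final BitSet bits = new BitSet();

    // Mark or unmark a state as accepting.
    void setAccept(int state, boolean accept) {
        bits.set(state, accept);
    }

    // Membership test is a single word lookup plus a mask.
    boolean isAccept(int state) {
        return bits.get(state);
    }

    // Number of accept states.
    int count() {
        return bits.cardinality();
    }

    // Smallest accept state >= from, or -1 if there is none.
    // Gives iteration in increasing state order.
    int nextAccept(int from) {
        return bits.nextSetBit(from);
    }
}
```

Compared with a HashSet of Integer, this avoids boxing and iterates in sorted order. The trade-off is that memory grows with the largest state number rather than with the number of accept states.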
> Explore light weight Automaton replacement
> ------------------------------------------
>
> Key: LUCENE-5752
> URL: https://issues.apache.org/jira/browse/LUCENE-5752
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 5.0
>
> Attachments: LUCENE-5752.patch
>
>
> This effort started with the patch on LUCENE-4556, to create a "light
> weight" replacement for the current object-heavy Automaton class
> (which creates separate State and Transition objects).
> I took that initial patch much further, and cut over most places in
> Lucene that use Automaton to LightAutomaton. Tests pass.
> The core idea of LightAutomaton is all states are ints, and you build
> up the automaton under the restriction that you add all outgoing
> transitions one state at a time. This worked well for most
> operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also
> added a separate builder to add transitions in any order and then in
> the end they are sorted and added to the real automaton.
> If this is successful I think we should just replace the current
> Automaton with LightAutomaton; right now they both exist in my current
> patch...
> This is very much a work in progress, and I'm not sure the
> restrictions the API imposes are "reasonable" (some algos got uglier).
> But I think it's at least worth exploring/iterating... I'll make a branch and
> commit my current state.
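
The design described in the quoted issue text can be sketched roughly as follows: states are ints, all outgoing transitions of a state must be added together, and a separate builder accepts transitions in any order and sorts them at the end. This is only an illustration under those stated assumptions. The class names, method names, and array layout are hypothetical and are not the LightAutomaton API from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: hypothetical names, not the API from the patch.
final class IntAutomatonSketch {
    // Each transition uses three slots: dest, min label, max label.
    private int[] transitions = new int[12];
    private int transitionCount = 0;
    // Per state: index of its first transition (-1 if none) and how many it has.
    private int[] firstTransition = new int[4];
    private int[] numTransitions = new int[4];
    private int stateCount = 0;
    private int currentState = -1;

    int createState() {
        if (stateCount == firstTransition.length) {
            firstTransition = Arrays.copyOf(firstTransition, stateCount * 2);
            numTransitions = Arrays.copyOf(numTransitions, stateCount * 2);
        }
        firstTransition[stateCount] = -1;
        return stateCount++;
    }

    // Restriction: all transitions leaving one state must be added contiguously.
    void addTransition(int source, int dest, int min, int max) {
        if (source < 0 || source >= stateCount || dest < 0 || dest >= stateCount) {
            throw new IllegalArgumentException("unknown state");
        }
        if (source != currentState) {
            if (firstTransition[source] != -1) {
                throw new IllegalStateException(
                    "transitions for state " + source + " were already added");
            }
            currentState = source;
            firstTransition[source] = transitionCount;
        }
        if (transitionCount * 3 == transitions.length) {
            transitions = Arrays.copyOf(transitions, transitions.length * 2);
        }
        int off = transitionCount * 3;
        transitions[off] = dest;
        transitions[off + 1] = min;
        transitions[off + 2] = max;
        transitionCount++;
        numTransitions[source]++;
    }

    // Follow one label from a state; returns -1 if no transition matches.
    int step(int state, int label) {
        int start = firstTransition[state];
        for (int i = 0; i < numTransitions[state]; i++) {
            int off = (start + i) * 3;
            if (transitions[off + 1] <= label && label <= transitions[off + 2]) {
                return transitions[off];
            }
        }
        return -1;
    }
}

// Accepts transitions in any order, then sorts by source state and replays them.
final class AnyOrderBuilderSketch {
    private final List<int[]> pending = new ArrayList<>();
    private int stateCount = 0;

    int createState() {
        return stateCount++;
    }

    void addTransition(int source, int dest, int min, int max) {
        pending.add(new int[] {source, dest, min, max});
    }

    IntAutomatonSketch finish() {
        pending.sort(Comparator.comparingInt((int[] t) -> t[0])
                               .thenComparingInt(t -> t[2]));
        IntAutomatonSketch a = new IntAutomatonSketch();
        for (int i = 0; i < stateCount; i++) {
            a.createState();
        }
        for (int[] t : pending) {
            a.addTransition(t[0], t[1], t[2], t[3]);
        }
        return a;
    }
}
```

Sorting by source state is what lets the builder satisfy the "one state at a time" restriction on behalf of callers that produce transitions out of order.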