> That's what I invented, and yes, it was invented by countless people before > :) >
You know I didn't mean to sound rude, right? I'm really admiring your ability to come up with these solutions by yourself, I'm merely copying other folks' ideas. Anyway, the optimization you're describing is sure possible. Lucene's FST implementation can actually combine both approaches because always expanding nodes is inefficient and those already expanded will allow a binary search (assuming the automaton structure is known to the implementation). Another refinement of this idea creates a detached table (err.. index :) of states to start from inside the automaton, so that you don't have to go through the initial 2-3 states which are more or less always large and even binary search is costly there. Dawid