I believe this is still slower than a Perl solution. You cannot crush a
scripting language with C++ if all you do is create objects and issue calls in a
loop. That is essentially what a scripting language would do. Driven only in
first gear, a hot sports car will not outrun my '95 GMC Safari van.
In order to have C++ truly crush everything else, you must abandon STL, object
creation in a loop, redundant copying of the data, and other performance
killers. Use your own hand-crafted hash. Compute the hash index in a loop as you
process each character. Or you might actually do better organizing the
dictionary into a sorted list, and using an incremental binary search. Or maybe
put the dictionary in a trie.
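
To make the hand-crafted hash idea concrete, here is a rough sketch, not a
tuned implementation. The names WordSet and for_each_word are made up for
illustration, FNV-1a is just one convenient per-character hash, and I am
assuming words are runs of alphabetic characters and that the table is sized
(power of two) well above the number of distinct words. Slots point into the
caller's buffer, so no word is ever copied.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical word set: open addressing with linear probing.
// Slots point into the caller's buffer; word bytes are never copied.
struct WordSet {
    struct Slot { const char* p; std::uint32_t len; std::uint32_t hash; };
    Slot* slots;
    std::size_t mask;

    // Assumption: pow2 is a power of two and larger than the number of
    // distinct words, otherwise probing would never find an empty slot.
    explicit WordSet(std::size_t pow2)
        : slots(static_cast<Slot*>(std::calloc(pow2, sizeof(Slot)))),
          mask(pow2 - 1) {
        assert(slots != nullptr && pow2 != 0 && (pow2 & (pow2 - 1)) == 0);
    }
    ~WordSet() { std::free(slots); }
    WordSet(const WordSet&) = delete;
    WordSet& operator=(const WordSet&) = delete;

    // Returns true if the word was not in the set yet.
    bool insert(const char* p, std::uint32_t len, std::uint32_t h) {
        for (std::size_t i = h & mask;; i = (i + 1) & mask) {
            Slot& s = slots[i];
            if (!s.p) { s.p = p; s.len = len; s.hash = h; return true; }
            if (s.hash == h && s.len == len && std::memcmp(s.p, p, len) == 0)
                return false;
        }
    }

    bool contains(const char* p, std::uint32_t len, std::uint32_t h) const {
        for (std::size_t i = h & mask;; i = (i + 1) & mask) {
            const Slot& s = slots[i];
            if (!s.p) return false;
            if (s.hash == h && s.len == len && std::memcmp(s.p, p, len) == 0)
                return true;
        }
    }
};

// One pass over the buffer. The FNV-1a hash is updated as each character
// is read, so by the time a word ends its hash is already known.
template <class F>
void for_each_word(const char* buf, std::size_t n, F&& on_word) {
    const std::uint32_t kOffset = 2166136261u, kPrime = 16777619u;
    std::size_t start = 0;
    std::uint32_t h = kOffset;
    bool in_word = false;
    for (std::size_t i = 0; i <= n; ++i) {
        const bool alpha =
            i < n && std::isalpha(static_cast<unsigned char>(buf[i])) != 0;
        if (alpha) {
            if (!in_word) { start = i; h = kOffset; in_word = true; }
            h = (h ^ static_cast<unsigned char>(buf[i])) * kPrime;
        } else if (in_word) {
            on_word(buf + start, static_cast<std::uint32_t>(i - start), h);
            in_word = false;
        }
    }
}
```

The callback gets a pointer, a length, and the finished hash, so a lookup is
a mask, a probe, and at most a memcmp. Whether this actually beats a sorted
array or a trie depends on the data; measure before believing any of it.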
Also fine tune the I/O to eliminate waste. Have a buffer, read chunks into it,
then deal with each chunk in place one at a time. Do not copy anything if you
can at all avoid it! Figure out a way to deal with words being chopped off
at the end of an I/O block. In short, you have essentially full control at the
CPU level - use it!
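
One possible way to handle the chopped-off word, as a sketch only: when a
block ends in the middle of a word, slide just that fragment to the front of
the buffer and read the next block in right behind it. The name scan_words is
hypothetical, words are again assumed to be runs of alphabetic characters,
and a word longer than the whole buffer simply gets split. The only bytes
ever moved are the fragment at a block boundary.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: calls on_word(ptr, len) for every alphabetic run
// read from `in`, using caller-supplied scratch space buf[0..cap).
// Pointers passed to on_word are only valid during the call.
template <class F>
void scan_words(std::FILE* in, char* buf, std::size_t cap, F&& on_word) {
    assert(in != nullptr && buf != nullptr && cap > 1);
    std::size_t kept = 0;  // length of the unfinished word at buf[0..kept)
    for (;;) {
        const std::size_t got = std::fread(buf + kept, 1, cap - kept, in);
        const std::size_t end = kept + got;
        const bool last = (got == 0);  // EOF or error: flush what is left
        kept = 0;
        std::size_t i = 0;
        while (i < end) {
            // Skip separators, then find the end of the next word.
            while (i < end && !std::isalpha(static_cast<unsigned char>(buf[i])))
                ++i;
            const std::size_t start = i;
            while (i < end && std::isalpha(static_cast<unsigned char>(buf[i])))
                ++i;
            if (start == i) break;  // nothing but separators remained
            if (i == end && !last && end - start < cap) {
                // The word touches the end of the block and may continue in
                // the next one: carry only this fragment to the front.
                kept = end - start;
                std::memmove(buf, buf + start, kept);
                break;
            }
            on_word(buf + start, i - start);
        }
        if (last) break;
    }
}
```

In real use the buffer would be something like 64K rather than a few bytes,
and if the hash is computed per character you would want to carry the
partial hash across the boundary too instead of rescanning the fragment.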
Anybody want to flex his C/C++ muscle?
--
Sasha Pachev
AskSasha Linux Consulting
http://www.asksasha.com
Running Blog
http://sasha.fastrunningblog.com
/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/