Incidentally, @lagerratrobe, once you save a lot of time by replacing sorting
by a perfect commutative hash, the next optimization is making the IO part
faster (you need cligen-0.9.43 for this):
import strutils, tables, cligen/[mfile, mslice]
const prime: array[26,
Incidentally, in the small alphabet case (and realistically anagrams are
usually related to word games over small alphabets) there is a neat trick for
rapid signature computation: map the 26 letters to the first 26 primes (2, 3,
5, 7, .., 97, 101). Then make the signature of a word the product
To clarify how @cdome's comment differs from @SolitudeSF's and @lagerratrobe's
own observation, the intent was always for `-d:danger` to imply `-d:release`,
but I guess there are several versions of Nim out there for which this
implication does not hold. You have to define both for those Nim
-d:release -d:danger might give extra speed
BTW, the Unicode version of this with 65536 to 2e6 "letters" (depending upon if
you can restrict to "one Unicode plane") is definitely trickier. You might
think a 2 pass radix sort would help, but since most words are short most of
your sorts are _very_ small. Indeed, even insertion sort might
const file = staticRead("dict.txt").strip.splitLines
Besides @SolitudeSF's important suggestion, an algorithmic improvement you can
use is to replace `algorithm.sorted(lower, system.cmp)` with a hand-written
"counting sort". For ASCII (i.e. 1 byte textual characters) this is (very?)
easy and should be quite a bit faster than `algorithm.sorted`.
use -d:danger if you need performance
Oh! strtabs would have saved some time. Still, it was good to understand how to
use the core features.
Thank you for the feedback.
Tiny optimization to create less copies:
proc makeSignature(word: string): string =
let ascii = unidecode(word)
result = strutils.toLowerAscii(ascii)
result.sort(lower, system.cmp) # or algorithm.sort(result, lower,
system.cmp)
Run
Note that there is
Answering my own question on this. Using the "-d: release" compile option made
all the difference. With that, I was able to reduce the time to:
real0m0.286s
user0m0.269s
sys 0m0.012s
Run
Hi All,
I read about Nim for the 1st time last week and decided to make something over
the weekend. I was pleasantly surprised that I was able to cobble something
together that did what I wanted it to do, but am a bit surprised that it's not
faster. Maybe some of you could give some
12 matches
Mail list logo