Pete Kazmier wrote:
train:: [B.ByteString] - WordFreq
train words = frequencyMap
where
frequencyMap = foldr incWordCount M.empty words
incWordCount w m = M.insertWith (+) w 1 m
So is 'incWordCount' strict in its second argument? I'm still not
sure exactly
Udo Stenzel wrote:
There is another bug of this sort in your code. Consider
incWordCount w m = M.insertWith (+) w 1 m
There is no reason to evaluate the sum inside the map, instead an
unevaluated thunk is put in there.
Would not Data.Map.insertWith' do the trick?
b
Bryan O'Sullivan wrote:
Udo Stenzel wrote:
There is another bug of this sort in your code. Consider
incWordCount w m = M.insertWith (+) w 1 m
There is no reason to evaluate the sum inside the map, instead an
unevaluated thunk is put in there.
Would not Data.Map.insertWith'
Albert Y. C. Lai wrote:
I try using WordSet = [String] (plus corresponding change in code) and
get great speedup, actually way more than 3x. There was also a memory
growth phenomenon using Set String, and replacement by [String] stops
that too, now it's constant space (constant = 20M). It is
Ketil Malde wrote:
Hm - nobody suggested using ByteStrings yet?
I wrote an independent port of Norvig's spellchecker, because I figured
it would make the basis of an interesting tutorial. For such a small
program, it's been quite a challenge!
I started out using lazy ByteStrings,
On Sun, 2007-04-22 at 00:25 -0400, Pete Kazmier wrote:
Pete Kazmier [EMAIL PROTECTED] writes:
I'd love to see other Haskell implementations as well if anyone has a
few moments to spare. Admittedly, it took me several hours to get my
version working, but I'm a Haskell newbie.
Ketil Malde [EMAIL PROTECTED] writes:
On Sun, 2007-04-22 at 00:25 -0400, Pete Kazmier wrote:
Pete Kazmier [EMAIL PROTECTED] writes:
I'd love to see other Haskell implementations as well if anyone has a
few moments to spare. Admittedly, it took me several hours to get my
version
Bryan O'Sullivan [EMAIL PROTECTED] writes:
After this switch, I found that spellchecking one word still took 4x
as long in Haskell as Norvig's Python program. Since I was checking
only one word in each case, essentially all of the runtime was taken
up by building the word frequency map.
Pete Kazmier wrote:
Bryan O'Sullivan [EMAIL PROTECTED] writes:
After this switch, I found that spellchecking one word still took 4x
as long in Haskell as Norvig's Python program. Since I was checking
only one word in each case, essentially all of the runtime was taken
up by building the word
Pete Kazmier wrote:
Bryan, out of curiosity, is a non bytestring version of your code
faster?
No, but the difference is smaller than I expected: the lazy ByteString
version is about 1.8x faster than using String.
b
___
Haskell-Cafe
Derek Elkins [EMAIL PROTECTED] writes:
After reading Bryan's post, I switched my right fold to a strict left
fold and almost tripled my original speed. Could someone provide some
guidelines on when to use each? I thought foldr should be used when
building some large data structure such as
Pete Kazmier wrote:
Derek Elkins [EMAIL PROTECTED] writes:
After reading Bryan's post, I switched my right fold to a strict left
fold and almost tripled my original speed. Could someone provide some
guidelines on when to use each? I thought foldr should be used when
building some large data
On Sun, 2007-04-22 at 11:51 -0400, Pete Kazmier wrote:
type WordFreq = M.Map B.ByteString Int
train:: [B.ByteString] - WordFreq
train words = frequencyMap
where
frequencyMap = foldr incWordCount M.empty words
incWordCount w m = M.insertWith (+) w 1 m
On Sun, Apr 22, 2007 at 10:10:44PM +0200, Ketil Malde wrote:
On Sun, 2007-04-22 at 11:51 -0400, Pete Kazmier wrote:
type WordFreq = M.Map B.ByteString Int
train:: [B.ByteString] - WordFreq
train words = frequencyMap
where
frequencyMap = foldr incWordCount
On 4/22/07, Pete Kazmier [EMAIL PROTECTED] wrote:
Worse - and this is true for ByteStrings, too - String comparisons are
O(n), which means lookups in Sets and Maps are expensive. A trie (i.e,
a search tree where each internal node corresponds to a word prefix, and
has one branch per letter
On Sun, 2007-04-22 at 12:55 -0400, Pete Kazmier wrote:
After reading Bryan's post, I switched my right fold to a strict left
fold and almost tripled my original speed. Could someone provide some
guidelines on when to use each? I thought foldr should be used when
building some large data
Bryan O'Sullivan wrote:
In my profile results, I find that simply converting words to lower case
accounts for a whopping 40% of time and allocation (see the attachment
for my definition of the train function).
COST CENTREMODULE %time %alloc
lower
Hi Pete,
Recently I read an interesting article by Peter Norvig[1] on how to
write a spelling corrector in 21-lines of Python. I wanted to try and
implement it in Haskell. My implementation is terribly slow and I was
hoping for some tips on how to improve it and make it idiomatic.
I had a
Pete Kazmier [EMAIL PROTECTED] writes:
I'd love to see other Haskell implementations as well if anyone has a
few moments to spare. Admittedly, it took me several hours to get my
version working, but I'm a Haskell newbie. Unfortunately, I think it
runs as slow as it took me to write it!
I try using WordSet = [String] (plus corresponding change in code) and
get great speedup, actually way more than 3x. There was also a memory
growth phenomenon using Set String, and replacement by [String] stops
that too, now it's constant space (constant = 20M). It is possible to
attribute
20 matches
Mail list logo