Kamil Dworakowski wrote:
On Jun 22, 10:03 am, Eugene Kirpichov ekirpic...@gmail.com wrote:
Hey, you're using String I/O!
nWORDS <- fmap (train . map B.pack . words) (readFile "big.txt")
This should be
nWORDS <- fmap (train . B.words) (B.readFile "big.txt")
By the way, which exact file do you use
Using Bryan O'Sullivan's fantastic BloomFilter I got it down below
Python's run time! Now it is 35.56s, 28% of the time is spent on GC,
which I think means there is still some room for improvement.
One easy way to fix the GC time is to increase the default heap size.
./a.out +RTS -A200M
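A rough way to check the GC share from inside the program, rather than reading it off the RTS summary. This is only a sketch and it assumes a much newer GHC than the 6.10.x discussed in this thread: GHC.Stats with getRTSStats needs base 4.10 or later, and the program has to be started with +RTS -T. The names gcFraction and reportGC are made up for illustration.

```haskell
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Fraction of elapsed run time spent in the garbage collector so far
-- (0.28 would correspond to the "28% of the time" figure above).
gcFraction :: RTSStats -> Double
gcFraction s
  | total == 0 = 0
  | otherwise  = fromIntegral (gc_elapsed_ns s) / fromIntegral total
  where
    total = gc_elapsed_ns s + mutator_elapsed_ns s

-- Nothing means the RTS is not collecting statistics,
-- i.e. the program was not started with +RTS -T.
reportGC :: IO (Maybe Double)
reportGC = do
  enabled <- getRTSStatsEnabled
  if enabled
    then fmap (Just . gcFraction) getRTSStats
    else return Nothing
```

Calling reportGC at the end of main once with the default allocation area and once with a larger -A lets you compare the GC share against the total run time, which is the trade-off being discussed here.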
On Jun 22, 10:12 pm, Bulat Ziganshin bulat.zigans...@gmail.com
wrote:
Hello Kamil,
Tuesday, June 23, 2009, 12:54:49 AM, you wrote:
I went back to using Strings instead of ByteStrings and with that
hashtable the program finishes in 31.5s! w00t!
and GC times are? also, try ByteString+HT,
Hello Kamil,
Tuesday, June 23, 2009, 11:17:43 AM, you wrote:
One easy way to fix the GC time is to increase the default heap size.
./a.out +RTS -A200M
It does make the GC only 1.4% of run time but it increases it overall
by 14s.
not surprising - you lose L2 cache locality. try to use -A
On Jun 22, 10:03 am, Eugene Kirpichov ekirpic...@gmail.com wrote:
Hey, you're using String I/O!
nWORDS <- fmap (train . map B.pack . words) (readFile "big.txt")
This should be
nWORDS <- fmap (train . B.words) (B.readFile "big.txt")
By the way, which exact file do you use as a misspellings file?
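To spell out the difference between the two lines quoted above, here is a minimal sketch. It assumes B is Data.ByteString.Char8 (strict) and leaves train out; readWordsString and readWordsBS are hypothetical helper names, not something from the original program.

```haskell
import qualified Data.ByteString.Char8 as B

-- String I/O: the file is read as a String, split with the Prelude's
-- words, and only afterwards is each word packed into a ByteString.
readWordsString :: FilePath -> IO [B.ByteString]
readWordsString path = fmap (map B.pack . words) (readFile path)

-- ByteString I/O: the file is read as one strict ByteString and split
-- with B.words, so no intermediate String is ever built.
readWordsBS :: FilePath -> IO [B.ByteString]
readWordsBS path = fmap B.words (B.readFile path)
```

For plain ASCII text both should return the same list of words; the second version just avoids going through a list of Char first.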
On Jun 22, 6:46 am, Bulat Ziganshin bulat.zigans...@gmail.com wrote:
Hello Kamil,
Monday, June 22, 2009, 12:01:40 AM, you wrote:
Right... Python uses hashtables while here I have a tree with log n
you can try this pure hashtable approach:
import Prelude hiding (lookup)
import qualified
Am Montag 22 Juni 2009 21:31:50 schrieb Kamil Dworakowski:
On Jun 22, 6:46 am, Bulat Ziganshin bulat.zigans...@gmail.com wrote:
Hello Kamil,
Monday, June 22, 2009, 12:01:40 AM, you wrote:
Right... Python uses hashtables while here I have a tree with log n
you can try this pure
On Jun 22, 9:10 am, Ketil Malde ke...@malde.org wrote:
Kamil Dworakowski ka...@dworakowski.name writes:
Right... Python uses hashtables while here I have a tree with log n
access time. I did not want to use the Data.HashTable, it would
pervade my program with IO. The alternative is an ideal
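For reference, the tree-based counting being described looks roughly like this. It is a sketch, not the code from the original program: it assumes Data.Map.Strict from a current containers package (newer than what shipped with GHC 6.10), and count is a hypothetical helper added for illustration.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Count how often each word occurs. Every insert and lookup is
-- O(log n) in the number of distinct words, and the function stays
-- pure: no IO is needed anywhere.
train :: Ord k => [k] -> Map.Map k Int
train = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty

-- Frequency of a word, 0 if it was never seen.
count :: Ord k => k -> Map.Map k Int -> Int
count = Map.findWithDefault 0
```

The trade-off in the thread is exactly this: the Map keeps everything pure at the cost of logarithmic access, while a mutable hashtable gives near-constant access but lives in IO.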
On Jun 22, 9:06 pm, Daniel Fischer daniel.is.fisc...@web.de wrote:
Am Montag 22 Juni 2009 21:31:50 schrieb Kamil Dworakowski:
On Jun 22, 6:46 am, Bulat Ziganshin bulat.zigans...@gmail.com wrote:
Hello Kamil,
Monday, June 22, 2009, 12:01:40 AM, you wrote:
Right... Python uses
Hello Kamil,
Tuesday, June 23, 2009, 12:54:49 AM, you wrote:
I went back to using Strings instead of ByteStrings and with that
hashtable the program finishes in 31.5s! w00t!
and GC times are? also, try ByteString+HT, it should be pretty easy to
write hashByteString
--
Best regards,
Bulat
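One possible shape for the hashByteString mentioned above, as a sketch only. It swaps in the FNV-1a hash, which is my choice and not something proposed in the thread, and it returns Int32 on the assumption that the hashtable wants a key -> Int32 hash function, as Data.HashTable did at the time.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as B
import Data.Bits (xor)
import Data.Int (Int32)
import Data.Word (Word32, Word8)

-- 32-bit FNV-1a over the bytes of a strict ByteString.
-- The arithmetic wraps around in Word32; the result is then
-- reinterpreted as Int32.
hashByteString :: B.ByteString -> Int32
hashByteString = fromIntegral . BS.foldl' step 2166136261
  where
    step :: Word32 -> Word8 -> Word32
    step h b = (h `xor` fromIntegral b) * 16777619
```

A strict left fold keeps this to a single pass over the bytes with no intermediate list, so hashing a word should cost far less than unpacking it to a String first.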
Am Montag 22 Juni 2009 22:54:49 schrieb Kamil Dworakowski:
Wait! Have you typed that definition into the msg off the top of your
head? :)
No, took a bit of looking.
I went back to using Strings instead of ByteStrings and with that
hashtable the program finishes in 31.5s! w00t!
Nice :D
kamil:
On Jun 22, 9:10 am, Ketil Malde ke...@malde.org wrote:
Kamil Dworakowski ka...@dworakowski.name writes:
Right... Python uses hashtables while here I have a tree with log n
access time. I did not want to use the Data.HashTable, it would
pervade my program with IO. The alternative
Hello Don,
Tuesday, June 23, 2009, 1:22:46 AM, you wrote:
One easy way to fix the GC time is to increase the default heap size.
./a.out +RTS -A200M
to be exact, -A isn't a heap size - it sets the frequency of generation-1
collections. by default, a collection is performed every 512 kbytes, tied to
L2
What are the keys in your Map? Strict ByteStrings?
And you're using ghc 6.10.x?
For the record, I use 6.10.1 and strict ByteString everywhere now. I
used to have some lazy IO with vanilla strings, but switching to
Data.ByteString.Char8.readFile didn't change the time at all. The
big.txt is