Send Beginners mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."
Today's Topics:
1. Re: Fwd: Implementing a spellchecker - problem with
Data.HashTable performance (Karol Samborski)
2. Re: Fwd: Implementing a spellchecker - problem with
Data.HashTable performance (Brent Yorgey)
3. Re: Two questions for a production project... (umptious)
4. Re: Fwd: Implementing a spellchecker - problem with
Data.HashTable performance (umptious)
5. Re: Fwd: Implementing a spellchecker - problem with
Data.HashTable performance (Karol Samborski)
6. Re: Training tasks (umptious)
----------------------------------------------------------------------
Message: 1
Date: Fri, 20 Apr 2012 12:30:38 +0200
From: Karol Samborski <[email protected]>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
problem with Data.HashTable performance
To: Rados?aw Szymczyszyn <[email protected]>
Cc: [email protected]
Message-ID:
<CACe2dTt7vt2q3ypbaB+XU35CU=t1jn7bnucrzzcf5gwfzib...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
W dniu 20 kwietnia 2012 11:33 u?ytkownik Rados?aw Szymczyszyn
<[email protected]> napisa?:
> Hello fellow Haskellers,
>
> It's my first post to this list, so at first: Greetings to everyone!
>
> I've wanted to pick up Haskell for some time and I've run in
> to a perfect (at least it seemed so) oocasion at my Natural Language
> Processing course. At first I tried using Python, but it turned out
> that some extra performance wouldn't hurt and I hoped Haskell would
> give me some boost without forcing me to sacrifice high-level
> expressiveness at the same time.
>
> The first lesson was to use ByteStrings instead of ordinary Strings
> for IO. Then I needed to build a dictionary of words found in the
> source text. That's the problematic part - I've tried using a few
> structures, namely Data.Map and Data.HashTable from base package and
> Data.HashMap from unordered-containers. In all cases I couldn't reach
> acceptable performance. As of that, I'd like to ask about some advice
> and comments about the following examples.
>
> First example [1] just builds a HashTable of ~1 500 000 ByteStrings.
> The program takes about 11 seconds to execute. Second example [2] does
> roughly the same, but taking the HashTable data from a file (just a
> dictionary of valid Polish forms). Reading data itself is not a
> problem, as I've measured the execution time of a program which just
> reads the file (and prints its last line to actually force the read,
> hence "handle" Haskell's laziness) and it takes no more than a second.
> Coupled together reading and building, however, takes unbearably long.
> I didn't wait till the program stopped and just terminated it, but it
> had been running for ~10 minutes already.
>
> The most disappointing fact is that a roughly equivalent (and rather
> effortless to write in contrast to the Haskell example - I'm just
> learning though) Python test which just reads everything into a dict
> takes about 10 seconds to execute (reading and updating the dict of
> course). Writing the spellchecker in Python seemed pretty
> straightforward for me (but its certainly not the only language I
> know; I've had some Java (who didn't nowadays), system level C and am
> daily working in Erlang). I understand of course, that the most
> convenient tool is generally the one you know the best, but my
> experience writing the Haskell equivalent reached both extremes -
> effortless expression of the logic, but frustration because of the
> unexplainable (for me at least) performance behaviour.
>
> The question is why is it so slow? (or AM I RILY DAT DUMB? ;)
>
> The link to examples:
> [1] https://gist.github.com/2411699 TestHashTableArtificial
> [2] https://gist.github.com/2411699 TestReadIntoHashTable
>
> I'm looking forward to somebody shedding some light on this. Thanks in
> advance!
>
> Best regards,
> Radek Szymczyszyn
Hi!
Did you try with -O2?
I've created file "formy.utf8" filled with 1400000 lines of word
"?y?niejszymi" and here is my output:
$ ghc --make -O2 TestReadIntoHashTable.hs -o test -rtsopts
$ time ./test +RTS -K100m -RTS
1400000
real 0m3.265s
user 0m3.097s
sys 0m0.143s
Best,
Karol Samborski
------------------------------
Message: 2
Date: Fri, 20 Apr 2012 07:52:56 -0400
From: Brent Yorgey <[email protected]>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
problem with Data.HashTable performance
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=utf-8
I seem to recall that Data.HashTable is quite slow and not recommended
for use. Try using Data.HashMap.Lazy/Strict from the
unordered-containers package instead.
-Brent
On Fri, Apr 20, 2012 at 11:33:51AM +0200, Rados?aw Szymczyszyn wrote:
> Hello fellow Haskellers,
>
> It's my first post to this list, so at first: Greetings to everyone!
>
> I've wanted to pick up Haskell for some time and I've run in
> to a perfect (at least it seemed so) oocasion at my Natural Language
> Processing course. At first I tried using Python, but it turned out
> that some extra performance wouldn't hurt and I hoped Haskell would
> give me some boost without forcing me to sacrifice high-level
> expressiveness at the same time.
>
> The first lesson was to use ByteStrings instead of ordinary Strings
> for IO. Then I needed to build a dictionary of words found in the
> source text. That's the problematic part - I've tried using a few
> structures, namely Data.Map and Data.HashTable from base package and
> Data.HashMap from unordered-containers. In all cases I couldn't reach
> acceptable performance. As of that, I'd like to ask about some advice
> and comments about the following examples.
>
> First example [1] just builds a HashTable of ~1 500 000 ByteStrings.
> The program takes about 11 seconds to execute. Second example [2] does
> roughly the same, but taking the HashTable data from a file (just a
> dictionary of valid Polish forms). Reading data itself is not a
> problem, as I've measured the execution time of a program which just
> reads the file (and prints its last line to actually force the read,
> hence "handle" Haskell's laziness) and it takes no more than a second.
> Coupled together reading and building, however, takes unbearably long.
> I didn't wait till the program stopped and just terminated it, but it
> had been running for ~10 minutes already.
>
> The most disappointing fact is that a roughly equivalent (and rather
> effortless to write in contrast to the Haskell example - I'm just
> learning though) Python test which just reads everything into a dict
> takes about 10 seconds to execute (reading and updating the dict of
> course). Writing the spellchecker in Python seemed pretty
> straightforward for me (but its certainly not the only language I
> know; I've had some Java (who didn't nowadays), system level C and am
> daily working in Erlang). I understand of course, that the most
> convenient tool is generally the one you know the best, but my
> experience writing the Haskell equivalent reached both extremes -
> effortless expression of the logic, but frustration because of the
> unexplainable (for me at least) performance behaviour.
>
> The question is why is it so slow? (or AM I RILY DAT DUMB? ;)
>
> The link to examples:
> [1] https://gist.github.com/2411699 TestHashTableArtificial
> [2] https://gist.github.com/2411699 TestReadIntoHashTable
>
> I'm looking forward to somebody shedding some light on this. Thanks in
> advance!
>
> Best regards,
> Radek Szymczyszyn
>
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners
------------------------------
Message: 3
Date: Fri, 20 Apr 2012 14:38:34 +0100
From: umptious <[email protected]>
Subject: Re: [Haskell-beginners] Two questions for a production
project...
To: Michael Orlitzky <[email protected]>
Cc: [email protected]
Message-ID:
<cae20bnv0b+wweshuclzap08xrcxhd1bh_1i1_3u-aas2qwx...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
This is the sort of question you might want to ask on Haskell cafe too.
But
- This really comes down to whether the GC can trusted, and I would
astonished if it was otherwise. It would make writing web frameworks in
Haskell rather pointless. (Otoh I can't find a list of real world users for
Yesod etc.)
- A lot of people use XMonad to manage their desktops and just leave their
machines on *forever.*
- If Haskell does prove to be unsuitable (which is very unlikely) then
Erlang is a safe bet. It's about the most reliable language there is - it
was designed for telecoms. It's much simpler to learn than Haskell,
although less funky (eg no lazy evaluation) and closer to a scripting
language in speed. Get a copy of Programming *Erlang*: Software for a
Concurrent World by Joe *Armstrong* and with even minimal Haskell
experience you should be writing code straight away.
On 20 April 2012 03:43, Michael Orlitzky <[email protected]> wrote:
> On 04/19/2012 07:40 PM, Mike Meyer wrote:
> > I've got a project coming up that I could use haskell for, providing I
> > can convince myself that it's appropriate. The critical question is:
> >
> > Anyone using Haskell in a long-running production application? I'm
> > talking about something where the program would run for weeks or
> > months non-stop, with either multiple threads or multiple copies.
>
> I have a rather stupid application that parses feeds from the Twitter
> API. For each username given on the command line, it calls forkIO and
> the new thread enters a recursive loop:
>
> recurse x y z = do
> ...
> recurse x y z
>
> forever and ever.
>
> Contrary to my expectations (I still basically don't know what I'm doing
> in Haskell) it seems to run in constant space for months at a time.
>
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://www.haskell.org/pipermail/beginners/attachments/20120420/8768de00/attachment-0001.htm>
------------------------------
Message: 4
Date: Fri, 20 Apr 2012 14:45:25 +0100
From: umptious <[email protected]>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
problem with Data.HashTable performance
Cc: [email protected]
Message-ID:
<cae20bnv3fex8etersm2ekl-5vzuvw5z8vrhwz3sakgyx4wt...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
While I think Misters Samborski and Yorgey have fixed the performance
problem, I'm still curious about the apparent synergy effect:
> First example [1] just builds a HashTable of ~1 500 000 ByteStrings.
> > The program takes about 11 seconds to execute.
>
> > Reading data itself is not a
> > problem, as I've measured the execution time of a program which just
> > reads the file (and prints its last line to actually force the read,
> > hence "handle" Haskell's laziness) and it takes no more than a second.
>
> > Coupled together reading and building, however, takes unbearably long.
> > I didn't wait till the program stopped and just terminated it, but it
> > had been running for ~10 minutes already.
>
???
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://www.haskell.org/pipermail/beginners/attachments/20120420/474e2415/attachment-0001.htm>
------------------------------
Message: 5
Date: Fri, 20 Apr 2012 16:01:53 +0200
From: Karol Samborski <[email protected]>
Subject: Re: [Haskell-beginners] Fwd: Implementing a spellchecker -
problem with Data.HashTable performance
To: umptious <[email protected]>
Cc: [email protected]
Message-ID:
<cace2dtuhrobegtxuuq1qrql1b8b8rtznpttcepdnt4cvvn0...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
W dniu 20 kwietnia 2012 15:45 u?ytkownik umptious <[email protected]> napisa?:
> While I think Misters Samborski and Yorgey have fixed the performance
> problem, I'm still curious about the apparent synergy effect:
>
Hi,
I wrote about the example program of that effect last time. With -O2
flag it took 3 sek.
Best,
Karol Samborski
------------------------------
Message: 6
Date: Fri, 20 Apr 2012 15:02:21 +0100
From: umptious <[email protected]>
Subject: Re: [Haskell-beginners] Training tasks
To: [email protected]
Message-ID:
<CAE20bNuGd+MaWL6Jme8zqo33ePUo-tKi82ZO-=xwggcxcac...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
On 19 April 2012 20:20, Michael Litchard <[email protected]> wrote:
> Hi, I noticed one of your suggestions
>
> > - A version of the classic Traveller/Elite trading system - a really nice
> > little toy business rules system that can be code either minimally or to
> > show off the funkier features of a language to get more flexibility and
> more
> > interesting pricing rules. Minimal version would be a few tens of lines
> and
> > the config file declaring item
>
> I couldn't find anything on Elite, could you give me a reference?
> I found a reference concerning Traveller, but if you had anything in
> mind, it may be helpful.
>
> Thanks :)
>
If you download Oolite, available from most Linux repositories, it should
have the same trading rules as the original Elite (which it stole from the
Traveller pen and paper game.) There's a space trading game for the Palm
Pilot that has improved more interesting rules - maybe you can find a Palm
emulator if you don't have one? -
http://en.wikipedia.org/wiki/Space_Trader_%28Palm_OS%29
Basically, you like goods that can be traded and you have worlds to sell
them on. Worlds have government types and economic types - and in more
complex variants ecologies and events - that affect the price and supply of
goods. Eg Fish will be cheap on a Farming world with an Ocean environment,
but expensive on an Industrial Desert world especially if it is Rich; Guns
will sell for more if a War is going on; Survival Equipment and any Food
Item will sell for more in the aftermath of a disaster. You can also model
legality based on a Government type.
The challenge is to get a system that allows trade rules to be concise and
expressive while being written in a form that can be edited by a
non-programmer - the config can be an .hs file, but it should be one that
even a non-coder can understand and edit. It's a nice problem because it's
scalable (you increase the complexity of the rules) and it has a lot of
characteristics of real business systems. I knocked out the core of a
solution last night - together with the recursive Shape Editor it's one of
the best language test programs I know - and should have something nice
around Monday if you want a copy.
- Jay
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://www.haskell.org/pipermail/beginners/attachments/20120420/e5675244/attachment.htm>
------------------------------
_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners
End of Beginners Digest, Vol 46, Issue 37
*****************************************