[Haskell-cafe] Slowdown with GHC when using multiple CPUs

2012-06-29 Thread David Powell
I recently ran into a problem writing a program that I hoped I could speed
up by running over multiple CPUs.  I want non-haskell users to be able to
run the tool and take advantage of multiple CPUs.  *But* there is a serious
slowdown when the RTS is run with -N and some of the CPUs are already busy.

This is already mentioned in the GHC docs, but the problem I experienced
was more serious than I expected causing a slowdown of around 2x compared
to running with a single CPU.

I reproduced the problem with the following code from the haskell wiki:

{-# LANGUAGE BangPatterns #-}
import Data.Digest.Pure.MD5
import qualified Data.ByteString.Lazy as L
import System.Environment
import Control.Concurrent
import Control.Monad (replicateM_)

main = do
files <- getArgs
str <- newEmptyMVar
mapM_ (forkIO . hashAndPrint str) files
printNrResults (length files) str

printNrResults i var = replicateM_ i (takeMVar var >>= putStrLn)

hashAndPrint str f = do
bs <- L.readFile f
let !h = show $ md5 bs
putMVar str (f ++ ": " ++ h)


When run on 4 idle CPU cores, I get the following wall clock times:
  ./run +RTS -N1   : 20.4 sec
  ./run +RTS -N2   : 11.0 sec
  ./run +RTS -N4   : 6.7 sec

When run on the same 4 core machine, but with 2 cores already busy:
  ./run +RTS -N1   : 23.5 sec
  ./run +RTS -N2   : 14.1 sec
  ./run +RTS -N4   : 57.8 sec   < Blowout...

This is quite a problem in practice when running on a shared server.  Is
there anything that can be done to address this?

(I wrote up a few more details here:
http://thunking.drp.id.au/2012/06/slowdown-with-ghc-when-using-multiple.html
)

Thanks,

-- 
David Powell
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: HDBC-postgresql and safe/unsafe FFI calls

2010-09-03 Thread David Powell
Hi John,

My current usage of hdbc is in a server process that takes requests from
multiple clients, queries the database, and returns a result.  Having a
single db query block everything else isn't really workable for me.  I
suspect this will also be an issue for others.  As an example, the
persistent-postgresql package which is part of the new Yesod web framework
will have the same problem.

I can send you a patch, but I am concerned with the issue Leon raised about
libpq needing to be compiled with thread support.  This is the default on my
platforms (macosx, debian), but probably is dangerous to rely on.  I guess
we can just test the result of 'PQisthreadsafe()', in connectPostgreSQL and
raise an error if it is false.

Cheers,

-- David


On Fri, Sep 3, 2010 at 1:36 AM, John Goerzen  wrote:

> Hi David,
>
> I've had varying arguments from people that want me to mark things safe or
> unsafe for various performance reasons.  I'm happy to apply your change if
> you like.  Can you send me a diff (and attach your explanation here to it,
> which I'll use as a commit message for future reference)?
>
> Thanks,
>
> -- John
>
>
> On 09/01/2010 09:40 PM, David Powell wrote:
>
>> Greetings,
>>
>> I'm having an issue with the HDBC-postgresql package that requires me to
>> manually patch it before installation for most of my use cases.
>>
>> All the FFI calls in this package are marked "unsafe".  Unfortunately,
>> this means that whenever I issue a slow sql query, all other processing
>> stops.  In most places that I want to use this module, I've had to
>> manually patch it to at least mark the PQexec and PQexecParams calls as
>> "safe".
>>
>> Is there any reason these calls should not be marked as "safe"?  I
>> understand that there a little extra runtime overhead with this, but I'd
>> have thought that negligible given all the other processing that goes on
>> with these particular calls under the hood.
>>
>> Cheers,
>>
>> --
>> David Powell
>>
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] HDBC-postgresql and safe/unsafe FFI calls

2010-09-01 Thread David Powell
Thanks Jason, I think I had read that - I quite enjoy Edward's posts.
Re-reading, seems to confirm what I thought, most (all?) of the FFI calls in
HDBC-postgresql should be changed to "safe".

-- David

On Thu, Sep 2, 2010 at 2:47 PM, Jason Dagit  wrote:

> On Wed, Sep 1, 2010 at 7:40 PM, David Powell  wrote:
> > Greetings,
> >
> > I'm having an issue with the HDBC-postgresql package that requires me to
> > manually patch it before installation for most of my use cases.
> >
> > All the FFI calls in this package are marked "unsafe".  Unfortunately,
> this
> > means that whenever I issue a slow sql query, all other processing stops.
> > In most places that I want to use this module, I've had to manually patch
> it
> > to at least mark the PQexec and PQexecParams calls as "safe".
> >
> > Is there any reason these calls should not be marked as "safe"?  I
> > understand that there a little extra runtime overhead with this, but I'd
> > have thought that negligible given all the other processing that goes on
> > with these particular calls under the hood.
>
> Have you read this?
> http://blog.ezyang.com/2010/07/safety-first-ffi-and-threading/
>
> Perhaps it answers your questions?
>
> Jason
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] HDBC-postgresql and safe/unsafe FFI calls

2010-09-01 Thread David Powell
Greetings,

I'm having an issue with the HDBC-postgresql package that requires me to
manually patch it before installation for most of my use cases.

All the FFI calls in this package are marked "unsafe".  Unfortunately, this
means that whenever I issue a slow sql query, all other processing stops.
In most places that I want to use this module, I've had to manually patch it
to at least mark the PQexec and PQexecParams calls as "safe".

Is there any reason these calls should not be marked as "safe"?  I
understand that there a little extra runtime overhead with this, but I'd
have thought that negligible given all the other processing that goes on
with these particular calls under the hood.

Cheers,

-- 
David Powell
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] zip-archive performance/memmory usage

2010-08-10 Thread David Powell
I was interested to see if I could determine what was happening with this.
After some playing around, I noticed the code was running significantly
faster if I *didn't* compile it, but ran it with 'runghc' instead (running
under ghci was also fast).

Here are the running times I found.  The 'Zip.hs' program comes with the
zip-archive package.  The runtime of the compiled version didn't seem to be
affected by optimisations.  Regardless, I'm quite surprised running
interpreted was significantly faster than compiled.

> time runghc ./Zip.hs -l ~/jdk1.6.0_05-src.zip
 1.48s user 0.17s system 97% cpu 1.680 total

> time ./dist/build/Zip/Zip -l ~/jdk1.6.0_05-src.zip
 89.00s user 1.06s system 98% cpu 1:31.84 total

The file 'jdk1.6.0_05-src.zip' was just an 18MB zip file I had lying
around.  I'm using ghc 6.12.1

Cheers,

-- 
David Powell


On Tue, Aug 10, 2010 at 12:10 PM, Jason Dagit  wrote:

>
>
> On Mon, Aug 9, 2010 at 4:29 PM, Pieter Laeremans wrote:
>
>> Hello,
>>
>> I'm trying some haskell scripting. I'm writing a script to print some
>> information
>> from a zip archive.  The zip-archive library does look nice but the
>> performance of zip-archive/lazy bytestring
>> doesn't seem to scale.
>>
>> Executing :
>>
>>eRelativePath $ head $ zEntries archive
>>
>> on an archive of around 12 MB with around 20 files yields
>>
>> Stack space overflow: current size 8388608 bytes.
>>
>
> So it's a stack overflow at about 8 megs.  I don't have a strong sense of
> what is normal, but that seems like a small stack to me.  Oh, actually I
> just check and that is the default stack size :)
>
> I looked at Zip.hs (included as an example).  The closest I see to your
> example is some code for listing the files in the archive.  Perhaps you
> should try the supplied program on your archive and see if it too has a
> stack overflow.
>
> The line the author uses to list files is:
> List-> mapM_ putStrLn $ filesInArchive archive
>
> But, you're taking the head of the entries, so I don't see how you'd be
> holding on to too much data.  I just don't see anything wrong with your
> program.  Did you remember to compile with optimizations?  Perhaps try the
> author's way of listing entries and see if performance changes?
>
>
>>
>> The script in question can be found at :
>>
>> http://github.com/plaeremans/HaskellSnipplets/blob/master/ZipList.hs
>>
>> I'm using the latest version of haskell platform.  Are these libaries not
>> production ready,
>> or am I doing something terribly wrong ?
>>
>
> Not production ready would be my assumption.  I think an iteratee style
> might be more appropriate for these sorts of nested streams of potentially
> large size anyway.  I'm skeptical of anything that depends on lazy
> bytestrings or lazy io.  In this case, the performance would appear to be
> depend on lazy bytestrings.
>
> You might want to experiment with increasing the stack size.  Something
> like this:
> ./ZipList +RTS -K100M -RTS foo.zip
>
> Jason
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Thread scheduling

2010-06-10 Thread David Powell
On Fri, Jun 11, 2010 at 3:34 PM, Luke Palmer  wrote:

> Say, using System.Time.getClockTime.
>
> Luke
>
> On Thu, Jun 10, 2010 at 11:31 PM, Luke Palmer  wrote:
> > On Thu, Jun 10, 2010 at 11:50 AM, Andrew Coppin
> >  wrote:
> >> Control.Concurrent provides the threadDelay function, which allows you
> to
> >> make the current thread sleep until T=now+X. However, I can't find any
> way
> >> of making the current thread sleep until T=X. In other words, I want to
> >> specify an absolute wakeup time, not a relative one.
> >
> > Modulo a small epsilon between the two actions, can't you just get the
> > current time and subtract it from the target time?  threadDelay is
> > allowed to delay for too long anyway, so doing it this way does not
> > lose you any correctness.
> >
> > Luke
> >
>

This is a slightly different issue, but isn't there a potential problem with
threadDelay?  I noticed that internally threadDelay uses gettimeofday() as
the absolute time source (on linux at least).   Isn't there potential
problem with this since wall-clock time isn't guaranteed to be monotonic
increasing?  On linux, I'd have thought the "right" thing to do would be to
use clock_gettime(CLOCK_MONOTONIC) although that is probably not very
portable.

-- David
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] executeFile failing on macosx

2010-05-16 Thread David Powell
Indeed System.Process does work for me.  I had avoided it because it is a
little more awkward to use it when you want the actual PIDs.  I don't
understand why System.Process.runProcess works for me, but executeFile does
not.  I did find this issue (for python)
http://bugs.python.org/issue6800which I think is the same thing I'm
hitting and they also claim it is fixed
in macosx 10.6.

Anyway, I'll work with System.Process for now.  Thanks for your help.

-- David

On Mon, May 17, 2010 at 9:41 AM, Thomas Schilling
wrote:

> Works fine on 10.6.3.  If you run with +RTS -N2, though, you'll get
> "forking not supported with +RTS -N greater than 1"
>
> The reason for this is that forking won't copy over the threads which
> means that the Haskell IO manager stops working (you'd have to somehow
> reinitialise the RTS while leaving heap and runtime stacks in tact --
> very tricky).
>
> I'm using http://hackage.haskell.org/package/process to run external
> processes.  I haven't had any problems with it.
>
> On 17 May 2010 00:06, David Powell  wrote:
> >
> > On Mon, May 17, 2010 at 1:33 AM, Bulat Ziganshin <
> bulat.zigans...@gmail.com>
> > wrote:
> >>
> >> Hello David,
> >>
> >> Sunday, May 16, 2010, 7:18:29 PM, you wrote:
> >>
> >> > "executeFile" is failing for me on Mac OS X 10.5.8, with ghc 6.12.1
> >> > when compiling with "-threaded".  Compiling without -threaded, or
> >> > running on linux is fine.
> >> >>  forkProcess $ executeFile "/bin/echo" False ["Ok"] Nothing
> >>
> >> afair, forkProcess and -threaded shouldn't work together on any Unix.
> >> can you try forkIO or forkOS instead?
> >>
> >
> > Hi Bulat,
> >
> > Both, forkIO and forkOS fail in the same way for me with -threaded.  I
> > believe this is because macosx requires the process to only have a single
> > thread when doing an execv(), which I thought was the purpose of
> > forkProcess?
> >
> > Cheers,
> >
> > -- David
> >
> > ___
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> >
>
>
>
> --
> Push the envelope.  Watch it bend.
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] executeFile failing on macosx

2010-05-16 Thread David Powell
On Mon, May 17, 2010 at 1:33 AM, Bulat Ziganshin
wrote:

> Hello David,
>
> Sunday, May 16, 2010, 7:18:29 PM, you wrote:
>
> > "executeFile" is failing for me on Mac OS X 10.5.8, with ghc 6.12.1
> > when compiling with "-threaded".  Compiling without -threaded, or running
> on linux is fine.
> >>  forkProcess $ executeFile "/bin/echo" False ["Ok"] Nothing
>
> afair, forkProcess and -threaded shouldn't work together on any Unix.
> can you try forkIO or forkOS instead?
>
>
Hi Bulat,

Both, forkIO and forkOS fail in the same way for me with -threaded.  I
believe this is because macosx requires the process to only have a single
thread when doing an execv(), which I thought was the purpose of
forkProcess?

Cheers,

-- David
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] executeFile failing on macosx

2010-05-16 Thread David Powell
"executeFile" is failing for me on Mac OS X 10.5.8, with ghc 6.12.1 when
compiling with "-threaded".  Compiling without -threaded, or running on
linux is fine.

When compiled with -threaded, the following snippet produces the error:
  testProg: /bin/echo: executeFile: failed (Operation not supported)

> import System.Posix.Process
> main = do
>
>  forkProcess $ executeFile "/bin/echo" False ["Ok"] Nothing

Any suggestions for a work around for this would be appreciated.

Cheers,

-- 
David Powell
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] hSetEncoding on socket handles

2010-05-11 Thread David Powell
Greetings,

I am having trouble sending unicode characters as utf8 over a socket handle.
Despite setting the encoding on the socket handle to utf8, it still seems to
use some other encoding when writing to the socket.  It works correctly when
writing to stdout, but not to a socket handle.  I am using ghc 6.12.1 and
network-2.2.1.7.  I can get it to work using System.IO.UTF8, but I was under
the impression this was no longer necessary?

I also don't seem to understand the interaction between hSetEncoding and
hSetBinaryMode because if I set the binary mode to 'False' and the encoding
to
utf8 on the socket, then when writing to the socket the string seems to be
truncated at the first non-ascii codepoint.

Here is a test snippet, which can be used with netcat as a listening server
(ie. nc -l 1234).

> import System.IO
> import Network
> main = do
>  let a="λ"
>  s <- connectTo "127.0.0.1" (PortNumber 1234)
>  hSetEncoding s utf8
>  hSetEncoding stdout utf8
>  hPutStrLn s a
>  putStrLn a
>  hClose s

Thanks,

David
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe