[Haskell-cafe] Slowdown with GHC when using multiple CPUs
I recently ran into a problem writing a program that I hoped I could speed up by running over multiple CPUs. I want non-haskell users to be able to run the tool and take advantage of multiple CPUs. *But* there is a serious slowdown when the RTS is run with -N and some of the CPUs are already busy. This is already mentioned in the GHC docs, but the problem I experienced was more serious than I expected causing a slowdown of around 2x compared to running with a single CPU. I reproduced the problem with the following code from the haskell wiki: {-# LANGUAGE BangPatterns #-} import Data.Digest.Pure.MD5 import qualified Data.ByteString.Lazy as L import System.Environment import Control.Concurrent import Control.Monad (replicateM_) main = do files <- getArgs str <- newEmptyMVar mapM_ (forkIO . hashAndPrint str) files printNrResults (length files) str printNrResults i var = replicateM_ i (takeMVar var >>= putStrLn) hashAndPrint str f = do bs <- L.readFile f let !h = show $ md5 bs putMVar str (f ++ ": " ++ h) When run on 4 idle CPU cores, I get the following wall clock times: ./run +RTS -N1 : 20.4 sec ./run +RTS -N2 : 11.0 sec ./run +RTS -N4 : 6.7 sec When run on the same 4 core machine, but with 2 cores already busy: ./run +RTS -N1 : 23.5 sec ./run +RTS -N2 : 14.1 sec ./run +RTS -N4 : 57.8 sec < Blowout... This is quite a problem in practice when running on a shared server. Is there anything that can be done to address this? (I wrote up a few more details here: http://thunking.drp.id.au/2012/06/slowdown-with-ghc-when-using-multiple.html ) Thanks, -- David Powell ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: HDBC-postgresql and safe/unsafe FFI calls
Hi John, My current usage of hdbc is in a server process that takes requests from multiple clients, queries the database, and returns a result. Having a single db query block everything else isn't really workable for me. I suspect this will also be an issue for others. As an example, the persistent-postgresql package which is part of the new Yesod web framework will have the same problem. I can send you a patch, but I am concerned with the issue Leon raised about libpq needing to be compiled with thread support. This is the default on my platforms (macosx, debian), but probably is dangerous to rely on. I guess we can just test the result of 'PQisthreadsafe()', in connectPostgreSQL and raise an error if it is false. Cheers, -- David On Fri, Sep 3, 2010 at 1:36 AM, John Goerzen wrote: > Hi David, > > I've had varying arguments from people that want me to mark things safe or > unsafe for various performance reasons. I'm happy to apply your change if > you like. Can you send me a diff (and attach your explanation here to it, > which I'll use as a commit message for future reference)? > > Thanks, > > -- John > > > On 09/01/2010 09:40 PM, David Powell wrote: > >> Greetings, >> >> I'm having an issue with the HDBC-postgresql package that requires me to >> manually patch it before installation for most of my use cases. >> >> All the FFI calls in this package are marked "unsafe". Unfortunately, >> this means that whenever I issue a slow sql query, all other processing >> stops. In most places that I want to use this module, I've had to >> manually patch it to at least mark the PQexec and PQexecParams calls as >> "safe". >> >> Is there any reason these calls should not be marked as "safe"? I >> understand that there a little extra runtime overhead with this, but I'd >> have thought that negligible given all the other processing that goes on >> with these particular calls under the hood. >> >> Cheers, >> >> -- >> David Powell >> > > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] HDBC-postgresql and safe/unsafe FFI calls
Thanks Jason, I think I had read that - I quite enjoy Edward's posts. Re-reading, seems to confirm what I thought, most (all?) of the FFI calls in HDBC-postgresql should be changed to "safe". -- David On Thu, Sep 2, 2010 at 2:47 PM, Jason Dagit wrote: > On Wed, Sep 1, 2010 at 7:40 PM, David Powell wrote: > > Greetings, > > > > I'm having an issue with the HDBC-postgresql package that requires me to > > manually patch it before installation for most of my use cases. > > > > All the FFI calls in this package are marked "unsafe". Unfortunately, > this > > means that whenever I issue a slow sql query, all other processing stops. > > In most places that I want to use this module, I've had to manually patch > it > > to at least mark the PQexec and PQexecParams calls as "safe". > > > > Is there any reason these calls should not be marked as "safe"? I > > understand that there a little extra runtime overhead with this, but I'd > > have thought that negligible given all the other processing that goes on > > with these particular calls under the hood. > > Have you read this? > http://blog.ezyang.com/2010/07/safety-first-ffi-and-threading/ > > Perhaps it answers your questions? > > Jason > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] HDBC-postgresql and safe/unsafe FFI calls
Greetings, I'm having an issue with the HDBC-postgresql package that requires me to manually patch it before installation for most of my use cases. All the FFI calls in this package are marked "unsafe". Unfortunately, this means that whenever I issue a slow sql query, all other processing stops. In most places that I want to use this module, I've had to manually patch it to at least mark the PQexec and PQexecParams calls as "safe". Is there any reason these calls should not be marked as "safe"? I understand that there a little extra runtime overhead with this, but I'd have thought that negligible given all the other processing that goes on with these particular calls under the hood. Cheers, -- David Powell ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] zip-archive performance/memmory usage
I was interested to see if I could determine what was happening with this. After some playing around, I noticed the code was running significantly faster if I *didn't* compile it, but ran it with 'runghc' instead (running under ghci was also fast). Here are the running times I found. The 'Zip.hs' program comes with the zip-archive package. The runtime of the compiled version didn't seem to be affected by optimisations. Regardless, I'm quite surprised running interpreted was significantly faster than compiled. > time runghc ./Zip.hs -l ~/jdk1.6.0_05-src.zip 1.48s user 0.17s system 97% cpu 1.680 total > time ./dist/build/Zip/Zip -l ~/jdk1.6.0_05-src.zip 89.00s user 1.06s system 98% cpu 1:31.84 total The file 'jdk1.6.0_05-src.zip' was just an 18MB zip file I had lying around. I'm using ghc 6.12.1 Cheers, -- David Powell On Tue, Aug 10, 2010 at 12:10 PM, Jason Dagit wrote: > > > On Mon, Aug 9, 2010 at 4:29 PM, Pieter Laeremans wrote: > >> Hello, >> >> I'm trying some haskell scripting. I'm writing a script to print some >> information >> from a zip archive. The zip-archive library does look nice but the >> performance of zip-archive/lazy bytestring >> doesn't seem to scale. >> >> Executing : >> >>eRelativePath $ head $ zEntries archive >> >> on an archive of around 12 MB with around 20 files yields >> >> Stack space overflow: current size 8388608 bytes. >> > > So it's a stack overflow at about 8 megs. I don't have a strong sense of > what is normal, but that seems like a small stack to me. Oh, actually I > just check and that is the default stack size :) > > I looked at Zip.hs (included as an example). The closest I see to your > example is some code for listing the files in the archive. Perhaps you > should try the supplied program on your archive and see if it too has a > stack overflow. > > The line the author uses to list files is: > List-> mapM_ putStrLn $ filesInArchive archive > > But, you're taking the head of the entries, so I don't see how you'd be > holding on to too much data. I just don't see anything wrong with your > program. Did you remember to compile with optimizations? Perhaps try the > author's way of listing entries and see if performance changes? > > >> >> The script in question can be found at : >> >> http://github.com/plaeremans/HaskellSnipplets/blob/master/ZipList.hs >> >> I'm using the latest version of haskell platform. Are these libaries not >> production ready, >> or am I doing something terribly wrong ? >> > > Not production ready would be my assumption. I think an iteratee style > might be more appropriate for these sorts of nested streams of potentially > large size anyway. I'm skeptical of anything that depends on lazy > bytestrings or lazy io. In this case, the performance would appear to be > depend on lazy bytestrings. > > You might want to experiment with increasing the stack size. Something > like this: > ./ZipList +RTS -K100M -RTS foo.zip > > Jason > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Thread scheduling
On Fri, Jun 11, 2010 at 3:34 PM, Luke Palmer wrote: > Say, using System.Time.getClockTime. > > Luke > > On Thu, Jun 10, 2010 at 11:31 PM, Luke Palmer wrote: > > On Thu, Jun 10, 2010 at 11:50 AM, Andrew Coppin > > wrote: > >> Control.Concurrent provides the threadDelay function, which allows you > to > >> make the current thread sleep until T=now+X. However, I can't find any > way > >> of making the current thread sleep until T=X. In other words, I want to > >> specify an absolute wakeup time, not a relative one. > > > > Modulo a small epsilon between the two actions, can't you just get the > > current time and subtract it from the target time? threadDelay is > > allowed to delay for too long anyway, so doing it this way does not > > lose you any correctness. > > > > Luke > > > This is a slightly different issue, but isn't there a potential problem with threadDelay? I noticed that internally threadDelay uses gettimeofday() as the absolute time source (on linux at least). Isn't there potential problem with this since wall-clock time isn't guaranteed to be monotonic increasing? On linux, I'd have thought the "right" thing to do would be to use clock_gettime(CLOCK_MONOTONIC) although that is probably not very portable. -- David ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] executeFile failing on macosx
Indeed System.Process does work for me. I had avoided it because it is a little more awkward to use it when you want the actual PIDs. I don't understand why System.Process.runProcess works for me, but executeFile does not. I did find this issue (for python) http://bugs.python.org/issue6800which I think is the same thing I'm hitting and they also claim it is fixed in macosx 10.6. Anyway, I'll work with System.Process for now. Thanks for your help. -- David On Mon, May 17, 2010 at 9:41 AM, Thomas Schilling wrote: > Works fine on 10.6.3. If you run with +RTS -N2, though, you'll get > "forking not supported with +RTS -N greater than 1" > > The reason for this is that forking won't copy over the threads which > means that the Haskell IO manager stops working (you'd have to somehow > reinitialise the RTS while leaving heap and runtime stacks in tact -- > very tricky). > > I'm using http://hackage.haskell.org/package/process to run external > processes. I haven't had any problems with it. > > On 17 May 2010 00:06, David Powell wrote: > > > > On Mon, May 17, 2010 at 1:33 AM, Bulat Ziganshin < > bulat.zigans...@gmail.com> > > wrote: > >> > >> Hello David, > >> > >> Sunday, May 16, 2010, 7:18:29 PM, you wrote: > >> > >> > "executeFile" is failing for me on Mac OS X 10.5.8, with ghc 6.12.1 > >> > when compiling with "-threaded". Compiling without -threaded, or > >> > running on linux is fine. > >> >> forkProcess $ executeFile "/bin/echo" False ["Ok"] Nothing > >> > >> afair, forkProcess and -threaded shouldn't work together on any Unix. > >> can you try forkIO or forkOS instead? > >> > > > > Hi Bulat, > > > > Both, forkIO and forkOS fail in the same way for me with -threaded. I > > believe this is because macosx requires the process to only have a single > > thread when doing an execv(), which I thought was the purpose of > > forkProcess? > > > > Cheers, > > > > -- David > > > > ___ > > Haskell-Cafe mailing list > > Haskell-Cafe@haskell.org > > http://www.haskell.org/mailman/listinfo/haskell-cafe > > > > > > > > -- > Push the envelope. Watch it bend. > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] executeFile failing on macosx
On Mon, May 17, 2010 at 1:33 AM, Bulat Ziganshin wrote: > Hello David, > > Sunday, May 16, 2010, 7:18:29 PM, you wrote: > > > "executeFile" is failing for me on Mac OS X 10.5.8, with ghc 6.12.1 > > when compiling with "-threaded". Compiling without -threaded, or running > on linux is fine. > >> forkProcess $ executeFile "/bin/echo" False ["Ok"] Nothing > > afair, forkProcess and -threaded shouldn't work together on any Unix. > can you try forkIO or forkOS instead? > > Hi Bulat, Both, forkIO and forkOS fail in the same way for me with -threaded. I believe this is because macosx requires the process to only have a single thread when doing an execv(), which I thought was the purpose of forkProcess? Cheers, -- David ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] executeFile failing on macosx
"executeFile" is failing for me on Mac OS X 10.5.8, with ghc 6.12.1 when compiling with "-threaded". Compiling without -threaded, or running on linux is fine. When compiled with -threaded, the following snippet produces the error: testProg: /bin/echo: executeFile: failed (Operation not supported) > import System.Posix.Process > main = do > > forkProcess $ executeFile "/bin/echo" False ["Ok"] Nothing Any suggestions for a work around for this would be appreciated. Cheers, -- David Powell ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] hSetEncoding on socket handles
Greetings, I am having trouble sending unicode characters as utf8 over a socket handle. Despite setting the encoding on the socket handle to utf8, it still seems to use some other encoding when writing to the socket. It works correctly when writing to stdout, but not to a socket handle. I am using ghc 6.12.1 and network-2.2.1.7. I can get it to work using System.IO.UTF8, but I was under the impression this was no longer necessary? I also don't seem to understand the interaction between hSetEncoding and hSetBinaryMode because if I set the binary mode to 'False' and the encoding to utf8 on the socket, then when writing to the socket the string seems to be truncated at the first non-ascii codepoint. Here is a test snippet, which can be used with netcat as a listening server (ie. nc -l 1234). > import System.IO > import Network > main = do > let a="λ" > s <- connectTo "127.0.0.1" (PortNumber 1234) > hSetEncoding s utf8 > hSetEncoding stdout utf8 > hPutStrLn s a > putStrLn a > hClose s Thanks, David ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe