Perf regression: ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) (f686682)
Hi, the attached commit seems to have regressed the scs nofib benchmark by ~3%: http://ghcspeed-nomeata.rhcloud.com/timeline/?ben=nofib/time/scsenv=1#/?exe=2base=2+68ben=nofib/time/scsenv=1revs=50equid=on The graph unfortunately is in the wrong order, as the tool gets confused by timezones and by commits with identical CommitDate, e.g. due to rebasing. This needs to be fixed, I manually verified that the commit below is the first that shows the above-noise-level-increase of runtime. (Other benchmarks seem to be unaffected.) Is this regression expected and intended or unexpected? Is it fixable? Or is is this simply inexplicable? Thanks, Joachim Am Montag, den 04.08.2014, 13:13 + schrieb g...@git.haskell.org: Repository : ssh://g...@git.haskell.org/ghc On branch : master Link : http://ghc.haskell.org/trac/ghc/changeset/f6866824ce5cdf5359f0cad78c49d65f6d43af12/ghc --- commit f6866824ce5cdf5359f0cad78c49d65f6d43af12 Author: Sergei Trofimovich sly...@gentoo.org Date: Mon Aug 4 08:10:33 2014 -0500 ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) Summary: The patch names most of RTS threads and ghc (the tool) threads. It makes nicer debug and eventlog output for ghc itself. Signed-off-by: Sergei Trofimovich sly...@gentoo.org Test Plan: ran debugged ghc under '+RTS -Ds' Reviewers: simonmar, austin Reviewed By: austin Subscribers: phaskell, simonmar, relrod, ezyang, carter Differential Revision: https://phabricator.haskell.org/D101 --- f6866824ce5cdf5359f0cad78c49d65f6d43af12 compiler/main/GhcMake.hs | 14 ++ libraries/base/GHC/Event/Thread.hs | 8 ++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/compiler/main/GhcMake.hs b/compiler/main/GhcMake.hs index 33f163c..0c63203 100644 --- a/compiler/main/GhcMake.hs +++ b/compiler/main/GhcMake.hs @@ -63,6 +63,7 @@ import qualified Data.Set as Set import qualified FiniteMap as Map ( insertListWith ) import Control.Concurrent ( forkIOWithUnmask, killThread ) +import qualified GHC.Conc as CC import Control.Concurrent.MVar import Control.Concurrent.QSem import Control.Exception @@ -80,6 +81,11 @@ import System.IO.Error ( isDoesNotExistError ) import GHC.Conc ( getNumProcessors, getNumCapabilities, setNumCapabilities ) +label_self :: String - IO () +label_self thread_name = do +self_tid - CC.myThreadId +CC.labelThread self_tid thread_name + -- - -- Loading the program @@ -744,10 +750,18 @@ parUpsweep n_jobs old_hpt stable_mods cleanup sccs = do | ((ms,mvar,_),idx) - comp_graph_w_idx ] +liftIO $ label_self main --make thread -- For each module in the module graph, spawn a worker thread that will -- compile this module. let { spawnWorkers = forM comp_graph_w_idx $ \((mod,!mvar,!log_queue),!mod_idx) - forkIOWithUnmask $ \unmask - do +liftIO $ label_self $ unwords +[ worker --make thread +, for module +, show (moduleNameString (ms_mod_name mod)) +, number +, show mod_idx +] -- Replace the default log_action with one that writes each -- message to the module's log_queue. The main thread will -- deal with synchronously printing these messages. diff --git a/libraries/base/GHC/Event/Thread.hs b/libraries/base/GHC/Event/Thread.hs index 6e991bf..dcfa32a 100644 --- a/libraries/base/GHC/Event/Thread.hs +++ b/libraries/base/GHC/Event/Thread.hs @@ -39,6 +39,7 @@ import GHC.Event.Manager (Event, EventManager, evtRead, evtWrite, loop, import qualified GHC.Event.Manager as M import qualified GHC.Event.TimerManager as TM import GHC.Num ((-), (+)) +import GHC.Show (showSignedInt) import System.IO.Unsafe (unsafePerformIO) import System.Posix.Types (Fd) @@ -244,11 +245,14 @@ startIOManagerThreads = forM_ [0..high] (startIOManagerThread eventManagerArray) writeIORef numEnabledEventManagers (high+1) +show_int :: Int - String +show_int i = showSignedInt 0 i + restartPollLoop :: EventManager - Int - IO ThreadId restartPollLoop mgr i = do M.release mgr !t - forkOn i $ loop mgr - labelThread t IOManager + labelThread t (IOManager on cap ++ show_int i) return t startIOManagerThread :: IOArray Int (Maybe (ThreadId, EventManager)) @@ -258,7 +262,7 @@ startIOManagerThread eventManagerArray i = do let create = do !mgr - new True !t - forkOn i $ loop mgr -labelThread t
linker_unload validate related issue (how to duplicate that?).
Folks, I've noted that validate is failing on Linux recently due to issue in linker_unload. As I've submitted some patch to this test case recently which fixes this on Solaris I'm kind of curious if I broke it or not. Anyway, strange thing is: when I configure ghc and run the test by (g)make TEST=linker_unload on both Linux and Solaris I get no failure. When I validate on Linux (validate is not working on Solaris yet), then I get failure in linker_unload: Wrong exit code (expected 0 , actual 2 ) Stdout: Stderr: /bin/sh: 1: Syntax error: Unterminated quoted string make[3]: *** [linker_unload] Error 2 *** unexpected failure for linker_unload(normal) when I try to run: cd testsuite make TEST=linker_unload inside this validation tree I again get no failure in this test: [...] = linker_unload(normal) 2522 of 4082 [0, 0, 0] cd ./rts $MAKE -s --no-print-directory linker_unload/dev/null linker_unload.run.stdout 2linker_unload.run.stderr OVERALL SUMMARY for test run started at Wed Aug 6 10:55:17 2014 CEST 0:00:08 spent to go through 4082 total tests, which gave rise to 13459 test cases, of which 13458 were skipped 0 had missing libraries 1 expected passes 0 expected failures 0 caused framework failures 0 unexpected passes 0 unexpected failures make[1]: Leaving directory `/home/karel/src/validate-test/testsuite/tests' I've also noted that this test case fails on Solaris builders with strange error: = linker_unload(normal) 170 of 4082 [0, 0, 1] cd ./rts $MAKE -s --no-print-directory linker_unload /dev/null linker_unload.run.stdout 2linker_unload.run.stderr Wrong exit code (expected 0 , actual 2 ) Stdout: Stderr: linker_unload: internal error: loadObj: can't read `/buildbot/gabor-ghc-head-builder/builder/tempbuild/build/bindisttest/install/libHSinteg_BcPVjqcazPNGsNFG4agFty.a' (GHC version 7.9.20140806 for i386_unknown_solaris2) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug gmake[3]: *** [linker_unload] Abort (core dumped) So the question is: why validate fails and why builder fails on this particular test and why my common testing on both Solaris and Linux is not able to duplicate the issue? What's so different between validate and builders and between my common: perl boot; ./configure some params; gmake -j12; cd testsuite; gmake THREADS=12 fast ? Thanks! Karel ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
Re: linker_unload validate related issue (how to duplicate that?).
Austin and I chatted about it, and it's probably because the test is not creating ghcconfig.h early enough. I haven't looked further on how to fix it though. Edward Excerpts from Karel Gardas's message of 2014-08-06 10:16:20 +0100: Folks, I've noted that validate is failing on Linux recently due to issue in linker_unload. As I've submitted some patch to this test case recently which fixes this on Solaris I'm kind of curious if I broke it or not. Anyway, strange thing is: when I configure ghc and run the test by (g)make TEST=linker_unload on both Linux and Solaris I get no failure. When I validate on Linux (validate is not working on Solaris yet), then I get failure in linker_unload: Wrong exit code (expected 0 , actual 2 ) Stdout: Stderr: /bin/sh: 1: Syntax error: Unterminated quoted string make[3]: *** [linker_unload] Error 2 *** unexpected failure for linker_unload(normal) when I try to run: cd testsuite make TEST=linker_unload inside this validation tree I again get no failure in this test: [...] = linker_unload(normal) 2522 of 4082 [0, 0, 0] cd ./rts $MAKE -s --no-print-directory linker_unload/dev/null linker_unload.run.stdout 2linker_unload.run.stderr OVERALL SUMMARY for test run started at Wed Aug 6 10:55:17 2014 CEST 0:00:08 spent to go through 4082 total tests, which gave rise to 13459 test cases, of which 13458 were skipped 0 had missing libraries 1 expected passes 0 expected failures 0 caused framework failures 0 unexpected passes 0 unexpected failures make[1]: Leaving directory `/home/karel/src/validate-test/testsuite/tests' I've also noted that this test case fails on Solaris builders with strange error: = linker_unload(normal) 170 of 4082 [0, 0, 1] cd ./rts $MAKE -s --no-print-directory linker_unload /dev/null linker_unload.run.stdout 2linker_unload.run.stderr Wrong exit code (expected 0 , actual 2 ) Stdout: Stderr: linker_unload: internal error: loadObj: can't read `/buildbot/gabor-ghc-head-builder/builder/tempbuild/build/bindisttest/install/libHSinteg_BcPVjqcazPNGsNFG4agFty.a' (GHC version 7.9.20140806 for i386_unknown_solaris2) Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug gmake[3]: *** [linker_unload] Abort (core dumped) So the question is: why validate fails and why builder fails on this particular test and why my common testing on both Solaris and Linux is not able to duplicate the issue? What's so different between validate and builders and between my common: perl boot; ./configure some params; gmake -j12; cd testsuite; gmake THREADS=12 fast ? Thanks! Karel ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
Re: linker_unload validate related issue (how to duplicate that?).
Just for the record validate fails since it is using ghc installed into bindisttest/install dir/ subdirectory. The spaces here are really nasty test as my fix to linker_unload has not counted with the possibility of having ghc installed in such location (cut -d ' ' ... does bad thing in this case). So yes, that was me who broke validate but this should be already fixed by revert of problematic patch. Sorry for that, Karel On 08/ 6/14 11:16 AM, Karel Gardas wrote: So the question is: why validate fails and why builder fails on this particular test and why my common testing on both Solaris and Linux is not able to duplicate the issue? What's so different between validate and builders and between my common: perl boot; ./configure some params; gmake -j12; cd testsuite; gmake THREADS=12 fast ? ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
Re: Perf regression: ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) (f686682)
Hi Sergei, Am Mittwoch, den 06.08.2014, 22:15 +0300 schrieb Sergei Trofimovich: On Wed, 06 Aug 2014 09:30:45 +0200 Joachim Breitner m...@joachim-breitner.de wrote: the attached commit seems to have regressed the scs nofib benchmark by ~3%: http://ghcspeed-nomeata.rhcloud.com/timeline/?ben=nofib/time/scsenv=1#/?exe=2base=2+68ben=nofib/time/scsenv=1revs=50equid=on That's a test of compiled binary performance, not the compiler, right? Correct. The graph unfortunately is in the wrong order, as the tool gets confused by timezones and by commits with identical CommitDate, e.g. due to rebasing. This needs to be fixed, I manually verified that the commit below is the first that shows the above-noise-level-increase of runtime. (Other benchmarks seem to be unaffected.) Is this regression expected and intended or unexpected? Is it fixable? Or is is this simply inexplicable? The graph looks mysterious (18 ms bump). Bencmark does not use haskell threads at all. Yes, I was surprised by that as well. I'll try to reproduce degradation locally and will investigate. Thanks! The only runtime part affected by the patch only renames threads (the renamer gets called once for each created thread): diff --git a/libraries/base/GHC/Event/Thread.hs b/libraries/base/GHC/Event/Thread.hs index 6e991bf..dcfa32a 100644 --- a/libraries/base/GHC/Event/Thread.hs +++ b/libraries/base/GHC/Event/Thread.hs @@ -39,6 +39,7 @@ import GHC.Event.Manager (Event, EventManager, evtRead, evtWrite, loop, import qualified GHC.Event.Manager as M import qualified GHC.Event.TimerManager as TM import GHC.Num ((-), (+)) +import GHC.Show (showSignedInt) import System.IO.Unsafe (unsafePerformIO) import System.Posix.Types (Fd) @@ -244,11 +245,14 @@ startIOManagerThreads = forM_ [0..high] (startIOManagerThread eventManagerArray) writeIORef numEnabledEventManagers (high+1) +show_int :: Int - String +show_int i = showSignedInt 0 i + restartPollLoop :: EventManager - Int - IO ThreadId restartPollLoop mgr i = do M.release mgr !t - forkOn i $ loop mgr - labelThread t IOManager + labelThread t (IOManager on cap ++ show_int i) return t startIOManagerThread :: IOArray Int (Maybe (ThreadId, EventManager)) @@ -258,7 +262,7 @@ startIOManagerThread eventManagerArray i = do let create = do !mgr - new True !t - forkOn i $ loop mgr -labelThread t IOManager +labelThread t (IOManager on cap ++ show_int i) writeIOArray eventManagerArray i (Just (t,mgr)) old - readIOArray eventManagerArray i case old of It does replace a reference to the a string (IOManager) by something involving allocation and computation. I guess that could have a measurable effect. What happens to programs relying on very cheap threads? Do we have benchmarks for this class of programs at all? Greetings, Joachim -- Joachim “nomeata” Breitner m...@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nome...@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nome...@debian.org signature.asc Description: This is a digitally signed message part ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
Re: Perf regression: ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) (f686682)
On Wed, 6 Aug 2014 22:15:34 +0300 Sergei Trofimovich sly...@gentoo.org wrote: On Wed, 06 Aug 2014 09:30:45 +0200 Joachim Breitner m...@joachim-breitner.de wrote: Hi, the attached commit seems to have regressed the scs nofib benchmark by ~3%: http://ghcspeed-nomeata.rhcloud.com/timeline/?ben=nofib/time/scsenv=1#/?exe=2base=2+68ben=nofib/time/scsenv=1revs=50equid=on That's a test of compiled binary performance, not the compiler, right? The graph unfortunately is in the wrong order, as the tool gets confused by timezones and by commits with identical CommitDate, e.g. due to rebasing. This needs to be fixed, I manually verified that the commit below is the first that shows the above-noise-level-increase of runtime. (Other benchmarks seem to be unaffected.) Is this regression expected and intended or unexpected? Is it fixable? Or is is this simply inexplicable? The graph looks mysterious (18 ms bump). Bencmark does not use haskell threads at all. I'll try to reproduce degradation locally and will investigate. I think I know what happens. According to perf the benchmark spends 34%+ of time in garbage collection ('perf record -- $args'/'perf report'): 27,91% test test [.] evacuate 9,29% test test [.] s9Lz_info 7,46% test test [.] scavenge_block And the whole benchmark runs a tiny bit more than 300ms. It is exactly in line with major GC timer (0.3s). If we run $ time ./test inverter 345 10n 4u 1/dev/null multiple times there is heavy instability in there (with my patch reverted): real0m0.319s real0m0.305s real0m0.307s real0m0.373s real0m0.381s which is +/- 80ms drift! Let's try to kick major GC earlier instead of running right at runtime shutdown time: $ time ./test inverter 345 10n 4u +RTS -I0.1 1/dev/null real0m0.304s real0m0.308s real0m0.302s real0m0.304s real0m0.308s real0m0.306s real0m0.305s real0m0.312s which is way more stable behaviour. Thus my theory is that my changed stepped from 90% of time 1 GC run per run to 90% of time 2 GC runs per run -- Sergei signature.asc Description: PGP signature ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
Re: Perf regression: ghc --make: add nicer names to RTS threads (threaded IO manager, make workers) (f686682)
Hi, Am Mittwoch, den 06.08.2014, 23:40 +0300 schrieb Sergei Trofimovich: I think I know what happens. According to perf the benchmark spends 34%+ of time in garbage collection ('perf record -- $args'/'perf report'): 27,91% test test [.] evacuate 9,29% test test [.] s9Lz_info 7,46% test test [.] scavenge_block And the whole benchmark runs a tiny bit more than 300ms. It is exactly in line with major GC timer (0.3s). If we run $ time ./test inverter 345 10n 4u 1/dev/null multiple times there is heavy instability in there (with my patch reverted): real0m0.319s real0m0.305s real0m0.307s real0m0.373s real0m0.381s which is +/- 80ms drift! Let's try to kick major GC earlier instead of running right at runtime shutdown time: $ time ./test inverter 345 10n 4u +RTS -I0.1 1/dev/null real0m0.304s real0m0.308s real0m0.302s real0m0.304s real0m0.308s real0m0.306s real0m0.305s real0m0.312s which is way more stable behaviour. Thus my theory is that my changed stepped from 90% of time 1 GC run per run to 90% of time 2 GC runs per run great analysis, thanks. I think in this case we should not worry about it. From a QA perspective we are already doing well if we consider the apparent regressions. If we can explain them and consider it acceptable, then it’s fine. Greetings, Joachim -- Joachim “nomeata” Breitner m...@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nome...@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nome...@debian.org signature.asc Description: This is a digitally signed message part ___ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs