[Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread Peter Bex
Hi hackers,

While looking over a system call trace while debugging another issue,
I noticed a LOT of system calls to __sigprocmask14 (on NetBSD).

It turns out these are generated by setjmp() and longjmp(), so each
minor GC makes two superfluous system calls.  According
to POSIX, it is actually /unspecified/ whether those affect the signal mask:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/longjmp.html
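
A rough way to see what a given platform does is to change the mask between setjmp() and longjmp() and look at it afterwards.  The C below is only a sketch of that idea: the function names are made up for illustration, it uses SIGUSR1 as an arbitrary probe signal, and it assumes a single-threaded process.

```c
#define _XOPEN_SOURCE 700  /* ask for POSIX declarations (sigprocmask etc.) */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Returns nonzero if SIGUSR1 is currently blocked. */
static int usr1_is_blocked(void)
{
    sigset_t cur;
    int rc = sigprocmask(SIG_BLOCK, NULL, &cur);  /* NULL set: query only */
    assert(rc == 0);
    (void)rc;
    return sigismember(&cur, SIGUSR1) == 1;
}

/*
 * Returns 1 if longjmp() put back the signal mask that was in effect
 * at setjmp() time, 0 if it left the mask alone.  POSIX allows both,
 * so the answer depends on the platform.
 */
int plain_longjmp_restores_mask(void)
{
    jmp_buf env;
    sigset_t usr1, caller_mask;
    int restored;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);

    /* Start from a known state: SIGUSR1 unblocked. */
    sigprocmask(SIG_UNBLOCK, &usr1, &caller_mask);

    if (setjmp(env) == 0) {
        /* Change the mask after setjmp(), then jump back. */
        sigprocmask(SIG_BLOCK, &usr1, NULL);
        longjmp(env, 1);
    }

    /* If SIGUSR1 is unblocked again, longjmp() restored the old mask. */
    restored = !usr1_is_blocked();

    sigprocmask(SIG_SETMASK, &caller_mask, NULL);  /* put the caller's mask back */
    return restored;
}
```

A result of 1 would match the extra sigprocmask calls seen in the trace; a result of 0 means setjmp()/longjmp() leave the mask alone on that system.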

This setting of the signal mask is probably even unintended and may be
the cause of bugs when the mask is set, since the GC might reset any
masked signals.  The following program prints "signaling" and
"GOT SIGINT" with Chicken master on NetBSD (using csi):

(use posix)
(set-signal-handler! signal/int (lambda (sig) (print "GOT SIGINT")))
(signal-mask! signal/int)
(let lp ((i 0))
  (if (= i 1000)
      (begin (print "signaling")
             (process-signal (current-process-id) signal/int)
             (lp 0))
      (lp (add1 i))))

With the attached patch it only prints "signaling", which is the same
behavior this program shows on Linux with both versions.  I also made
a small test but I wasn't sure whether the GC is guaranteed to be
triggered so it might not be a very valid test (it doesn't when
you compile it, but for me it does in csi).  This test is attached.
If it's okay, feel free to add it to the testsuite.
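
For comparison, the behaviour the program should show (a masked signal stays pending and the handler does not run) can be sketched directly in C.  This is only an illustration, not the attached test: the names are hypothetical, it uses SIGUSR1 rather than SIGINT, and it assumes a single-threaded process.

```c
#define _XOPEN_SOURCE 700  /* ask for POSIX declarations */
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t got_signal = 0;

static void on_usr1(int sig)
{
    (void)sig;
    got_signal = 1;
}

/*
 * Returns 1 if a blocked SIGUSR1 stayed pending (handler not run)
 * until it was unblocked and was delivered afterwards, 0 otherwise.
 */
int blocked_signal_is_deferred(void)
{
    struct sigaction sa, old_sa;
    sigset_t usr1, old_mask, pending;
    int deferred, delivered;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, &old_sa);

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    sigprocmask(SIG_BLOCK, &usr1, &old_mask);

    got_signal = 0;
    raise(SIGUSR1);  /* signal ourselves while SIGUSR1 is masked */

    sigpending(&pending);
    deferred = (got_signal == 0) && (sigismember(&pending, SIGUSR1) == 1);

    /* Unblocking lets the pending signal through to the handler. */
    sigprocmask(SIG_UNBLOCK, &usr1, NULL);
    delivered = (got_signal == 1);

    /* Restore the caller's mask and handler. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return deferred && delivered;
}
```

The bug described above corresponds to something unblocking the signal behind the program's back, so that the handler fires while the program still believes the signal is masked.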

So, to solve this, it's more correct to use sigsetjmp/siglongjmp to
explicitly set the mask or not.  However, according to this GNULib
manual http://www.gnu.org/software/gnulib/manual/gnulib.html#sigsetjmp
this function isn't supported by MingW, MSVC and Minix.
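
To illustrate the explicit variant: with sigsetjmp() the second argument decides whether the mask is saved, and siglongjmp() restores it only in that case.  The sketch below uses a hypothetical function name, SIGUSR1 as an arbitrary probe signal, and assumes a single-threaded process.

```c
#define _XOPEN_SOURCE 700  /* ask for POSIX declarations */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/*
 * Returns 1 if siglongjmp() restored the signal mask saved by
 * sigsetjmp(env, savemask), 0 if the mask was left untouched.
 */
int siglongjmp_restores_mask(int savemask)
{
    sigjmp_buf env;
    sigset_t usr1, caller_mask, now;
    int rc, restored;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);

    /* Start from a known state: SIGUSR1 unblocked. */
    rc = sigprocmask(SIG_UNBLOCK, &usr1, &caller_mask);
    assert(rc == 0);
    (void)rc;

    if (sigsetjmp(env, savemask) == 0) {
        /* Block SIGUSR1 after the save point, then jump back. */
        sigprocmask(SIG_BLOCK, &usr1, NULL);
        siglongjmp(env, 1);
    }

    /* Unblocked again means the saved mask was restored. */
    sigprocmask(SIG_BLOCK, NULL, &now);  /* NULL set: query only */
    restored = (sigismember(&now, SIGUSR1) == 0);

    sigprocmask(SIG_SETMASK, &caller_mask, NULL);
    return restored;
}
```

Calling it with savemask 1 should report a restored mask and with savemask 0 an untouched one, which is the explicit control that plain setjmp()/longjmp() do not give.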

That's why the attached patch uses _setjmp and _longjmp.
Those are in POSIX, and pretty widely supported (but they're slated
to be *possibly* removed in future POSIX versions).  According to the
aforementioned GNULib manual, _setjmp/_longjmp are the fastest
way to set and restore the registers without signal mask.
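
The same kind of probe can be written for the underscore variants the patch switches to.  This is a sketch with a made-up function name; it assumes a platform that declares _setjmp/_longjmp in <setjmp.h>, uses SIGUSR1 as an arbitrary probe signal, and assumes a single-threaded process.

```c
#define _XOPEN_SOURCE 700  /* ask for POSIX/XSI declarations */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/*
 * Returns 1 if _longjmp() changed the signal mask back to what it was
 * at _setjmp() time, 0 if it left the mask alone.  POSIX says these
 * two functions shall not manipulate the mask, so 0 is expected.
 */
int underscore_longjmp_restores_mask(void)
{
    jmp_buf env;
    sigset_t usr1, caller_mask, now;
    int restored;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);

    /* Start from a known state: SIGUSR1 unblocked. */
    sigprocmask(SIG_UNBLOCK, &usr1, &caller_mask);

    if (_setjmp(env) == 0) {
        /* Block SIGUSR1 after the save point, then jump back. */
        sigprocmask(SIG_BLOCK, &usr1, NULL);
        _longjmp(env, 1);
    }

    /* Still blocked means the jump did not touch the mask. */
    sigprocmask(SIG_BLOCK, NULL, &now);  /* NULL set: query only */
    restored = (sigismember(&now, SIGUSR1) == 0);

    sigprocmask(SIG_SETMASK, &caller_mask, NULL);
    return restored;
}
```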

I've also attached two outputs from the resurrected Chicken benchmarks
by Mario: https://github.com/mario-goulart/chicken-benchmarks
These clearly show about a 5% speed improvement across the board.
(you can see a clear comparison by using the compare.scm program from
 that repo).  I compiled Chicken with itself before running this
benchmark.  This may have an effect because according to Mario the
compilation time is included in the measurements.

Cheers,
Peter
-- 
http://sjamaan.ath.cx
--
The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music.
-- Donald Knuth
From c21fd1ae69cdae80c563753d56c66480eb5fe2a3 Mon Sep 17 00:00:00 2001
From: Peter Bex <peter@xs4all.nl>
Date: Wed, 13 Jun 2012 22:54:26 +0200
Subject: [PATCH] Use _setjmp/_longjmp instead of setjmp/longjmp to prevent
 unnecessary system call overhead for saving/restoring the
 signal mask on systems where this is done

---
 chicken.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/chicken.h b/chicken.h
index 837a51c..31960b0 100644
--- a/chicken.h
+++ b/chicken.h
@@ -874,8 +874,8 @@ DECL_C_PROC_p0 (128,  1,0,0,0,0,0,0,0)
 # define C_gettimeofday gettimeofday
 # define C_gmtime   gmtime
 # define C_localtime localtime
-# define C_setjmp   setjmp
-# define C_longjmp  longjmp
+# define C_setjmp   _setjmp
+# define C_longjmp  _longjmp
 # define C_alloca   alloca
 # define C_strerror strerror
 # define C_isalpha  isalpha
-- 
1.7.9.1

((repetitions . 10)
 (installation-prefix . /home/sjamaan/chicken-test)
 (csc-options . )
 (results
   (triangl . 10.951)
   (travinit . 0.907)
   (traverse . 3.569)
   (takr . 7.649)
   (takl . 2.825)
   (tak . 4.519)
   (sort1 . 41.279)
   (slatex . 10.483)
   (scheme . 0.724)
   (sboyer . 410.743)
   (puzzle . 0.682)
   (psyntax . 22.867)
   (nucleic2 . 50.316)
   (nqueens . 0.77)
   (nfa . 33.036)
   (nestedloop . 45.192)
   (nboyer . 177.593)
   (nbody . 49.134)
   (mazefun . 42.822)
   (maze . 2.124)
   (lattice . 118.835)
   (kanren . 62.096)
   (hanoi . 6.023)
   (graphs . 15.114)
   (fread . 17.907)
   (fprint . 2.769)
   (fibc . 18.013)
   (fib . 2.034)
   (fft . 0.501)
   (earley . 0.62)
   (dynamic . 2.404)
   (div-rec . 1.031)
   (div-iter . 0.339)
   (destructive . 1.219)
   (deriv . 10.01)
   (dderiv . 9.12)
   (ctak . 2.708)
   (cpstak . 6.139)
   (conform . 1.853)
   (browse . 1.85)
   (boyer . 1.446)
   (binarytrees . 1.224)
   (0 . 0)))
((repetitions . 10)
 (installation-prefix . /home/sjamaan/chicken-master)
 (csc-options . )
 (results
   (triangl . 10.715)
   (travinit . 0.911)
   (traverse . 3.59)
   (takr . 7.663)
   (takl . 2.733)
   (tak . 4.562)
   (sort1 . 42.108)
   (slatex . 

Re: [Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread Mario Domenech Goulart
Hi,

On Wed, 13 Jun 2012 23:22:02 +0200 Peter Bex <peter@xs4all.nl> wrote:

> I've also attached two outputs from the resurrected Chicken benchmarks
> by Mario: https://github.com/mario-goulart/chicken-benchmarks
> These clearly show about a 5% speed improvement across the board.
> (you can see a clear comparison by using the compare.scm program from
>  that repo).  I compiled Chicken with itself before running this
> benchmark.  This may have an effect because according to Mario the
> compilation time is included in the measurements.

Just a small clarification: the timings in the log files don't take into
account the compilation time.  Those numbers are just for the execution
time.

The message at the end of the execution of run.scm (Total run time: ...)
does take compilation time into account (sorry, Peter -- I thought you
were talking about that).

So, the normalized results you see when you feed compare.scm the log
files consider only execution time of the benchmarked programs.
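
As a rough illustration of what a normalized figure means, the sketch below divides the slowest run's time by a given run's time, so the slowest run scores 1.00 and faster runs score higher.  Whether compare.scm computes it exactly this way is an assumption on my part (it is merely consistent with "larger numbers indicate better results"), and the function name is hypothetical.

```c
#include <assert.h>

/*
 * Score of one run relative to the slowest run of the same program:
 * 1.0 for the slowest run, larger for faster runs.
 * Assumed normalization, not taken from compare.scm itself.
 */
double normalized_score(double seconds, double slowest_seconds)
{
    assert(seconds > 0.0 && slowest_seconds >= seconds);
    return slowest_seconds / seconds;
}
```

With the two triangl timings from Peter's logs (10.951 and 10.715 seconds), the faster run would score about 1.02 and the slower one 1.00.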

Sorry for the confusion.

Best wishes.
Mario
-- 
http://parenteses.org/mario

___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread Mario Domenech Goulart
Hi Peter,

On Wed, 13 Jun 2012 23:22:02 +0200 Peter Bex <peter@xs4all.nl> wrote:

> I've also attached two outputs from the resurrected Chicken benchmarks
> by Mario: https://github.com/mario-goulart/chicken-benchmarks
> These clearly show about a 5% speed improvement across the board.
> (you can see a clear comparison by using the compare.scm program from
>  that repo).  I compiled Chicken with itself before running this
> benchmark.  This may have an effect because according to Mario the
> compilation time is included in the measurements.

Unfortunately I could not observe any improvement on Linux x86.  The
performance is actually slightly worse with your patch (using master,
c48a109d668f3186bb4a213940c0b0b81a1ad03d).  I ran the benchmarks with no
csc options and with -O3.  Here are the results ([2] is the chicken with
the _setjmp/_longjmp patch):

+---[1]:
|- installation-prefix: /home/mario/local/chicken-2012-06-13
|- csc-options: 
|- repetitions: 10

+---[2]:
|- installation-prefix: /home/mario/local/chicken-2012-06-13-longjmp
|- csc-options: 
|- repetitions: 10

Displaying normalized results (larger numbers indicate better results)

Programs        [1]    [2]

0               1.00   1.00
binarytrees     1.03   1.00
boyer           1.01   1.00
browse          1.00   1.00
conform         1.00   1.00
cpstak          1.01   1.00
ctak            1.00   1.00
dderiv          1.00   1.00
deriv           1.00   1.00
destructive     1.02   1.00
div-iter        1.00   1.00
div-rec         1.00   1.00
dynamic         1.00   1.00
earley          1.06   1.00
fft             1.03   1.00
fib             1.03   1.00
fibc            1.00   1.00
fprint          1.01   1.00
fread           1.01   1.00
graphs          1.00   1.00
hanoi           1.00   1.00
kanren          1.00   1.00
lattice         1.04   1.00
maze            1.01   1.00
mazefun         1.00   1.00
nbody           1.00   1.00
nboyer          1.01   1.00
nestedloop      1.01   1.00
nfa             1.03   1.00
nqueens         1.00   1.00
nucleic2        1.03   1.00
paraffins       1.00   1.00
peval           1.00   1.00
psyntax         1.00   1.00
puzzle          1.00   1.00
sboyer          1.01   1.00
scheme          1.05   1.00
slatex          1.05   1.00
sort1           1.00   1.00
tak             1.03   1.00
takl            1.01   1.00
takr            1.03   1.00
traverse        1.00   1.00
travinit        1.03   1.00
triangl         1.03   1.00



+---[1]:
|- installation-prefix: /home/mario/local/chicken-2012-06-13
|- csc-options: -O3
|- repetitions: 10

+---[2]:
|- installation-prefix: /home/mario/local/chicken-2012-06-13-longjmp
|- csc-options: -O3
|- repetitions: 10

Displaying normalized results (larger numbers indicate better results)

Programs        [1]    [2]

0               1.00   1.00
binarytrees     1.00   1.00
boyer           1.02   1.00
browse          1.00   1.00
conform         1.01   1.00
cpstak          1.00   1.01
ctak            1.00   1.00
dderiv          1.01   1.00
deriv           1.00   1.00
destructive     1.04   1.00
div-iter        1.03   1.00
div-rec         1.00   1.01
dynamic         1.00   1.00
earley          1.03   1.00
fft             1.04   1.00
fib             1.00   1.00
fibc            1.00   1.00
fprint          1.00   1.01
fread           1.01   1.00
graphs          1.00   1.00
hanoi           1.00   1.00
kanren          1.00   1.00
lattice         1.00   1.00
maze            1.00   1.01
mazefun         1.00   1.00
nbody           1.03   1.00
nboyer          1.02   1.00
nestedloop      1.02   1.00
nfa             1.01   1.00
nqueens         1.05   1.00
nucleic2        1.00   1.00
paraffins       1.00   1.00
peval           1.00   1.00
psyntax         1.00   1.00
puzzle          1.02   1.00
sboyer          1.00   1.00
scheme          1.09   1.00

Re: [Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread Jim Ursetto

On Jun 13, 2012, at 7:28 PM, Mario Domenech Goulart wrote:

> Unfortunately I could not observe any improvement on Linux x86.  The
> performance is actually slightly worse with your patch (using master,
> c48a109d668f3186bb4a213940c0b0b81a1ad03d).  I ran the benchmarks with no
> csc options and with -O3.  Here are the results ([2] is the chicken with
> the _setjmp/_longjmp patch):

There won't be any improvement on Linux because _setjmp == setjmp;
Linux doesn't save the signal mask on setjmp() unless the obscure __FAVOR_BSD
is #defined.  The performance regression you observed could just be
statistical noise as well -- but, sometimes gcc will inline known calls
and it might do that for setjmp and not _setjmp, even though setjmp is
just a macro alias for _setjmp.  Only way to be sure is to look at the
assembly output.

Other than figuring that out, it would be a good idea to test on mingw
and OS X (I was going to do this).  However testing on other platforms
like cygwin or Solaris (or more obscure?) is problematic.  It is not
really a question of whether _setjmp works but if every platform supports
_setjmp.  I don't know if this is something to throw in before the
release, if one is coming soon, unless we are going to test every
supported platform before release.  Anyone else?
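
One way to hedge against platforms without _setjmp would be to select the pair at compile time.  The sketch below is only an illustration of that idea, not what the patch does: HAVE_UNDERSCORE_SETJMP is a hypothetical macro that a configure step would have to define, and the SKETCH_ names are made up.

```c
#include <setjmp.h>

/*
 * HAVE_UNDERSCORE_SETJMP is hypothetical: it stands for "this platform
 * is known to provide _setjmp/_longjmp".  Without it we fall back to
 * the standard pair, which always exists.
 */
#ifdef HAVE_UNDERSCORE_SETJMP
# define SKETCH_setjmp  _setjmp
# define SKETCH_longjmp _longjmp
#else
# define SKETCH_setjmp  setjmp
# define SKETCH_longjmp longjmp
#endif

static jmp_buf restart_point;

/* Jumps back to restart_point; never returns to its caller. */
static void bounce(void)
{
    SKETCH_longjmp(restart_point, 1);
}

/* Returns 1 once control has come back through the saved context. */
int jump_roundtrip(void)
{
    if (SKETCH_setjmp(restart_point) == 0) {
        bounce();
        return 0;  /* not reached */
    }
    return 1;
}
```

The rest of the code would only ever use the SKETCH_ names, so an unsupported platform costs a fallback to the slower pair rather than a build failure.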

Jim


Re: [Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread John Cowan
Jim Ursetto scripsit:

> Other than figuring that out, it would be a good idea to test on mingw
> and OS X (I was going to do this).  However testing on other platforms
> like cygwin or Solaris (or more obscure?) is problematic.  It is not
> really a question of whether _setjmp works but if every platform supports
> _setjmp.

On Cygwin, _setjmp and _longjmp are supported.  I can test, but probably
not until next week.

On Solaris, _setjmp and _longjmp don't manipulate the signal mask,
whereas setjmp and longjmp do, according to the man pages.  I can't test
on Solaris.

-- 
John Cowan  co...@ccil.org
http://www.ccil.org/~cowan
Humpty Dump Dublin squeaks through his norse
Humpty Dump Dublin hath a horrible vorse
But for all his kinks English / And his irismanx brogues
Humpty Dump Dublin's grandada of all rogues.  --Cousin James


Re: [Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread Jim Ursetto
On Jun 13, 2012, at 8:51 PM, John Cowan wrote:

> On Cygwin, _setjmp and _longjmp are supported.  I can test, but probably
> not until next week.

Cool, thanks.  We can probably assume that if it compiles, it works fine.

> On Solaris, _setjmp and _longjmp don't manipulate the signal mask,
> whereas setjmp and longjmp do, according to the man pages.  I can't test
> on Solaris.

Peter mentioned to me that _setjmp is supported on Solaris but not on very
old Solaris (like, 2.5).  The question is whether there are any supported
systems that don't support _setjmp at all.  We believe that, if a system
does support _setjmp, it will do the right thing.

Jim



Re: [Chicken-hackers] [PATCH] Change setjmp/longjmp to _setjmp/_longjmp to avoid overhead and fix signal masking bug

2012-06-13 Thread John Cowan
Jim Ursetto scripsit:

> Peter mentioned to me that _setjmp is supported on Solaris but not on
> very old Solaris (like, 2.5).

I can't believe that anyone really cares about a 17-year-old release.
Backward compat is fine, but there need to be limits.  Does current
Chicken build with gcc 2.7?  It's just about as old.

-- 
Henry S. Thompson said, / Syntactic, structural,   John Cowan
Value constraints we / Express on the fly. co...@ccil.org
Simon St. Laurent: Your / Incomprehensible http://www.ccil.org/~cowan
Abracadabralike / schemas must die!
