Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Dominik Pantůček
Hi Sam,

On 02. 05. 20 14:26, Sam Tobin-Hochstadt wrote:
> I successfully reproduced this on the first try, which is good. Here's
> my debugging advice (I'm also looking at it):
> 
> 1. To use a binary with debugging symbols, use
> `racket/src/build/racket/racket3m` from the checkout of the Racket
> repository that you built.
> 2. When running racket in GDB, there are lots of segfaults because of
> the GC; you'll want to use `handle SIGSEGV nostop noprint`
> 3. It may not work for this situation because of parallelism, but if
> you can reproduce the bug using `rr` [1] it will be almost infinitely
> easier to find and fix.

Thanks for the hints, and also thanks for opening the GitHub issue for
that. I'll try to post my results (if any) there.

> 
> I'm also curious about your experience with Racket CS and futures.
> It's unlikely to have the _same_ bugs, but it would be good to find
> the ones there are. :)

This is going to be a really hard one. With all the tricks I have learned
during the past weeks, I get almost 400 frames per second in my experiment
using 3m and unsafe operations. Without unsafe operations it goes down
to 300, and without unsafe operations and with the de-optimized flip
function shown in the example plus set-argb-pixels, I am at about 50 fps
(that is presumably a completely "safe" version that does not rely on my
own bounds and type checking).

With CS, I have not been able to quickly get anything working other than
the de-optimized version with set-argb-pixels, and I am at about 5 fps.
Also, the thread scheduling is "interesting" at best. I am postponing the
work on that - I assume it can take another few weeks to understand how
to properly use all the fixnum/flonum-related stuff with CS.
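
For illustration only, here is a minimal sketch of what the "safe" and
"unsafe" variants of the per-pixel store might look like. This is not the
code from the experiment: the function names store-argb!/safe and
store-argb!/unsafe are hypothetical, and the unsafe version is only correct
if the caller guarantees that all arguments are fixnums and the index is in
range.

```racket
#lang racket/base
(require racket/fixnum racket/unsafe/ops)

;; Safe variant: every operation checks argument types and index bounds.
(define (store-argb!/safe pixels i c)
  (define base (fx* i 4))
  (bytes-set! pixels base #xff)
  (bytes-set! pixels (fx+ base 1) (fxand (fxrshift c 16) #xff))
  (bytes-set! pixels (fx+ base 2) (fxand (fxrshift c 8) #xff))
  (bytes-set! pixels (fx+ base 3) (fxand c #xff)))

;; Unsafe variant: same arithmetic, but without any checks.
;; An out-of-range index here corrupts memory instead of raising an error.
(define (store-argb!/unsafe pixels i c)
  (define base (unsafe-fx* i 4))
  (unsafe-bytes-set! pixels base #xff)
  (unsafe-bytes-set! pixels (unsafe-fx+ base 1)
                     (unsafe-fxand (unsafe-fxrshift c 16) #xff))
  (unsafe-bytes-set! pixels (unsafe-fx+ base 2)
                     (unsafe-fxand (unsafe-fxrshift c 8) #xff))
  (unsafe-bytes-set! pixels (unsafe-fx+ base 3)
                     (unsafe-fxand c #xff)))

;; Two-pixel buffers; write the color #x112233 into pixel 1 of each.
(define safe-buf (make-bytes 8 0))
(define unsafe-buf (make-bytes 8 0))
(store-argb!/safe safe-buf 1 #x112233)
(store-argb!/unsafe unsafe-buf 1 #x112233)
```

Both variants write the same four bytes; the difference is only in what
happens when the inputs are wrong.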


Thanks again!
Dominik


> 
> [1] https://rr-project.org
> 

Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Sam Tobin-Hochstadt
I opened https://github.com/racket/racket/issues/3145 to avoid too
much mailing list traffic, and posted a stack trace there.

Sam


On Sat, May 2, 2020 at 8:31 AM Matthew Flatt  wrote:
>
> I wasn't able to produce a crash on my first try, but the Nth try
> worked, so this is very helpful!
>
> I'm investigating, too...
>
> At Sat, 2 May 2020 08:26:10 -0400, Sam Tobin-Hochstadt wrote:
> > I successfully reproduced this on the first try, which is good. Here's
> > my debugging advice (I'm also looking at it):
> >
> > 1. To use a binary with debugging symbols, use
> > `racket/src/build/racket/racket3m` from the checkout of the Racket
> > repository that you built.
> > 2. When running racket in GDB, there are lots of segfaults because of
> > the GC; you'll want to use `handle SIGSEGV nostop noprint`
> > 3. It may not work for this situation because of parallelism, but if
> > you can reproduce the bug using `rr` [1] it will be almost infinitely
> > easier to find and fix.
> >
> > I'm also curious about your experience with Racket CS and futures.
> > It's unlikely to have the _same_ bugs, but it would be good to find
> > the ones there are. :)
> >
> > [1] https://rr-project.org
> >

Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Dexter Lagan
Hi Dominik,

  Ah that explains why I was getting an incorrect number of threads! I didn’t 
think about using future-visualizer, but I’ll give it a try. Thanks!

Dex

> On May 2, 2020, at 2:27 PM, Dominik Pantůček  
> wrote:
> 
> Hi Dex,
> 
>> On 02. 05. 20 14:10, Dexter Lagan wrote:
>> Hello,
>> 
>>   I’ve been getting inconsistent results as well. A while ago I made a
>> benchmark based on a parallel spectral norm computation. The benchmark
>> works fine on Windows on most systems and uses all cores, but crashes
>> randomly on other systems. I haven’t been able to figure out why. On
>> Linux it doesn’t seem to use more than one core. I’d be interested to
>> know if this is related. Here’s the benchmark code :
>> 
>> https://github.com/DexterLagan/benchmark
> 
> Beware that (processor-count) returns the number of hyper-threaded
> (logical) cores, so your v1.3 is actually requesting twice as many
> threads as there are hardware threads. At least on Linux this is the
> case (I checked just now).
> 
> Interesting idea... 16 threads:
> 
> $ time racket crash.rkt -d 4
> SIGSEGV MAPERR si_code 1 fault on addr (nil)
> Aborted (core dumped)
> 
> real    6m37,579s
> user    32m55,192s
> sys     0m35,124s
> 
> So that is consistent with what I see.
> 
> Have you tried using future-visualizer[1] to check why it uses only a
> single CPU thread? Last summer I spent quite some time with it to help
> me find the right futures usage patterns that actually enable the
> speculative computation in parallel. Usually, if your code is too deep
> and keeps allocating "something" each frame, it goes back to the runtime
> thread for each allocation.
> 
> 
> Cheers,
> Dominik
> 
> [1] https://docs.racket-lang.org/future-visualizer/index.html
> 
>> 
>> Dex
>> 

Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Matthew Flatt
I wasn't able to produce a crash on my first try, but the Nth try
worked, so this is very helpful!

I'm investigating, too...

At Sat, 2 May 2020 08:26:10 -0400, Sam Tobin-Hochstadt wrote:
> I successfully reproduced this on the first try, which is good. Here's
> my debugging advice (I'm also looking at it):
> 
> 1. To use a binary with debugging symbols, use
> `racket/src/build/racket/racket3m` from the checkout of the Racket
> repository that you built.
> 2. When running racket in GDB, there are lots of segfaults because of
> the GC; you'll want to use `handle SIGSEGV nostop noprint`
> 3. It may not work for this situation because of parallelism, but if
> you can reproduce the bug using `rr` [1] it will be almost infinitely
> easier to find and fix.
> 
> I'm also curious about your experience with Racket CS and futures.
> It's unlikely to have the _same_ bugs, but it would be good to find
> the ones there are. :)
> 
> [1] https://rr-project.org
> 

Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Dominik Pantůček
Hi Dex,

On 02. 05. 20 14:10, Dexter Lagan wrote:
> Hello,
> 
>   I’ve been getting inconsistent results as well. A while ago I made a
> benchmark based on a parallel spectral norm computation. The benchmark
> works fine on Windows on most systems and uses all cores, but crashes
> randomly on other systems. I haven’t been able to figure out why. On
> Linux it doesn’t seem to use more than one core. I’d be interested to
> know if this is related. Here’s the benchmark code :
> 
> https://github.com/DexterLagan/benchmark

Beware that (processor-count) returns the number of hyper-threaded
(logical) cores, so your v1.3 is actually requesting twice as many
threads as there are hardware threads. At least on Linux this is the
case (I checked just now).

Interesting idea... 16 threads:

$ time racket crash.rkt -d 4
SIGSEGV MAPERR si_code 1 fault on addr (nil)
Aborted (core dumped)

real    6m37,579s
user    32m55,192s
sys     0m35,124s

So that is consistent with what I see.
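
The relation between the -d depth and the number of parallel tasks, together
with the (processor-count) caveat, can be sketched roughly as follows. This
is only an illustration: the helper names parallel-tasks and depth-for-cores
are hypothetical and do not appear in crash.rkt or in the benchmark.

```racket
#lang racket/base
(require racket/future)

;; Binary partitioning to depth d splits the work into 2^d ranges,
;; so -d 3 gives 8 parallel tasks and -d 4 gives 16.
(define (parallel-tasks depth)
  (expt 2 depth))

;; Largest depth whose task count does not exceed the given core count.
(define (depth-for-cores cores)
  (sub1 (integer-length cores)))

;; processor-count reports logical processors (hyper-threads included),
;; so the depth derived from it already accounts for hyper-threading.
(define suggested-depth (depth-for-cores (processor-count)))
(printf "~a logical processors -> -d ~a (~a tasks)\n"
        (processor-count) suggested-depth (parallel-tasks suggested-depth))
```

Doubling the value of (processor-count) before choosing the number of tasks
would therefore oversubscribe the machine by a factor of two.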

Have you tried using future-visualizer[1] to check why it uses only a
single CPU thread? Last summer I spent quite some time with it to help
me find the right futures usage patterns that actually enable the
speculative computation in parallel. Usually, if your code is too deep
and keeps allocating "something" each frame, it goes back to the runtime
thread for each allocation.
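
As a rough sketch of how such an inspection can start without opening the
GUI, the future-visualizer/trace library can collect the future events as a
list. This assumes the future-visualizer package is installed; the workload
sum-in-future is a hypothetical stand-in and not the benchmark code.

```racket
#lang racket/base
(require racket/future future-visualizer/trace)

;; Hypothetical workload: sums the integers below n inside a future.
(define (sum-in-future n)
  (touch (future (λ ()
                   (for/fold ([acc 0]) ([i (in-range n)])
                     (+ acc i))))))

;; trace-futures-thunk runs the thunk with future tracing enabled
;; and returns the recorded events as a list.
(define result #f)
(define events
  (trace-futures-thunk (λ () (set! result (sum-in-future 100000)))))

(printf "sum = ~a, ~a trace events recorded\n" result (length events))
```

The recorded events are the raw material that the graphical visualizer
displays; events showing a future blocking or synchronizing with the
runtime thread are the ones that explain lost parallelism.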


Cheers,
Dominik

[1] https://docs.racket-lang.org/future-visualizer/index.html

> 
> Dex
> 

Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Sam Tobin-Hochstadt
I successfully reproduced this on the first try, which is good. Here's
my debugging advice (I'm also looking at it):

1. To use a binary with debugging symbols, use
`racket/src/build/racket/racket3m` from the checkout of the Racket
repository that you built.
2. When running racket in GDB, there are lots of segfaults because of
the GC; you'll want to use `handle SIGSEGV nostop noprint`
3. It may not work for this situation because of parallelism, but if
you can reproduce the bug using `rr` [1] it will be almost infinitely
easier to find and fix.

I'm also curious about your experience with Racket CS and futures.
It's unlikely to have the _same_ bugs, but it would be good to find
the ones there are. :)

[1] https://rr-project.org


Re: [racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Dexter Lagan
Hello,

  I’ve been getting inconsistent results as well. A while ago I made a 
benchmark based on a parallel spectral norm computation. The benchmark works 
fine on Windows on most systems and uses all cores, but crashes randomly on 
other systems. I haven’t been able to figure out why. On Linux it doesn’t seem 
to use more than one core. I’d be interested to know if this is related. Here’s 
the benchmark code:

https://github.com/DexterLagan/benchmark

Dex

> On May 2, 2020, at 1:56 PM, Dominik Pantůček  
> wrote:
> 
> Hello fellow Racketeers,
> 
> during my research into how Racket can be used as a generic software
> rendering platform, I've hit some limits of Racket's (native) thread
> handling. Once I started getting SIGSEGVs, I strongly suspected I was
> doing too many unsafe operations - and to be honest, that was true.
> There was one off-by-one memory access :).
> 
> But that was easy to resolve - I just switched to safe/contracted
> versions of everything and found and fixed the bug. But I still got
> occasional SIGSEGVs. So I dug even deeper than before (during the last
> two months I've read most of the JIT inlining code) and noticed that the
> crashes disappear when I refrain from calling bytes-set! in parallel
> using futures.
> 
> So I started creating a minimal crashing example (MCE). At first, I
> failed miserably. Just filling a byte array over and over again, I was
> unable to reproduce the crash. But then I realized that in my
> application, threads come into play and that might be the cause. And
> suddenly, creating the MCE was really easy:
> 
> Create a new eventspace using parameterize/make-eventspace, put the actual
> code in an application thread (thread ...) and make the main thread wait
> for this application thread using thread-wait. Before starting the
> application thread, I create a simple window, a bitmap and a canvas that
> I keep redrawing using refresh-now after each iteration. The funny thing
> is that now it keeps crashing even without actually modifying the bitmap
> in question. All I need to do is mess with some byte array in 8 threads.
> Sometimes it takes a minute on my computer before it crashes, sometimes
> it takes longer, but it eventually crashes pretty consistently.
> 
> And it is just 60 lines of code:
> 
> #lang racket/gui
> 
> (require racket/future racket/fixnum racket/cmdline)
> 
> (define width 800)
> (define height 600)
> 
> (define framebuffer (make-fxvector (* width height)))
> (define pixels (make-bytes (* width height 4)))
> 
> (define max-depth 0)
> 
> (command-line
>  #:once-each
>  (("-d" "--depth") d "Futures binary partitioning depth"
>   (set! max-depth (string->number d))))
> 
> (file-stream-buffer-mode (current-output-port) 'none)
> 
> (parameterize ((current-eventspace (make-eventspace)))
>   (define win (new frame%
>                    (label "test")
>                    (width width)
>                    (height height)))
>   (define bmp (make-bitmap width height))
>   (define canvas (new canvas%
>                       (parent win)
>                       (paint-callback
>                        (λ (c dc)
>                          (send dc draw-bitmap bmp 0 0)))
>                       ))
> 
>   (define (single-run)
>     (define (do-bflip start end (depth 0))
>       (cond ((fx< depth max-depth)
>              (define cnt (fx- end start))
>              (define cnt2 (fxrshift cnt 1))
>              (define mid (fx+ start cnt2))
>              (let ((f (future
>                        (λ ()
>                          (do-bflip start mid (fx+ depth 1))))))
>                (do-bflip mid end (fx+ depth 1))
>                (touch f)))
>             (else
>              (for ((i (in-range start end)))
>                (define c (fxvector-ref framebuffer i))
>                (bytes-set! pixels (+ (* i 4) 0) #xff)
>                (bytes-set! pixels (+ (* i 4) 1) (fxand (fxrshift c 16) #xff))
>                (bytes-set! pixels (+ (* i 4) 2) (fxand (fxrshift c 8) #xff))
>                (bytes-set! pixels (+ (* i 4) 3) (fxand c #xff))))))
>     (do-bflip 0 (* width height))
>     (send canvas refresh-now))
>   (send win show #t)
> 
>   (define appthread
>     (thread
>      (λ ()
>        (let loop ()
>          (single-run)
>          (loop)))))
>   (thread-wait appthread))
> 
> Note: the code is deliberately de-optimized to highlight the problem.
> Not even mentioning CPU cache coherence here.
> 
> Running this from the command line, I can adjust the number of threads.
> Running with 8 threads:
> 
> $ time racket crash.rkt -d 3
> SIGSEGV MAPERR si_code 1 fault on addr (nil)
> Aborted (core dumped)
> 
> real    1m18,162s
> user    7m11,936s
> sys     0m3,832s
> $ time racket crash.rkt -d 3
> SIGSEGV MAPERR si_code 1 fault on addr (nil)
> Aborted (core dumped)
> 
> real    3m44,005s
> user    20m10,920s
> sys     0m11,702s
> $ time racket crash.rkt -d 3
> SIGSEGV MAPERR si_code 1 fault on addr (nil)
> Aborted (core dumped)
> 
> real    2m1,650s
> user    10m58,392s
> sys     0m6,445s
> $ time racket crash.rkt -d 3
> 

[racket-users] Futures + threads SIGSEGV

2020-05-02 Thread Dominik Pantůček
Hello fellow Racketeers,

during my research into how Racket can be used as a generic software
rendering platform, I've hit some limits of Racket's (native) thread
handling. Once I started getting SIGSEGVs, I strongly suspected I was
doing too many unsafe operations - and to be honest, that was true.
There was one off-by-one memory access :).

But that was easy to resolve - I just switched to safe/contracted
versions of everything and found and fixed the bug. But I still got
occasional SIGSEGVs. So I dug even deeper than before (during the last
two months I've read most of the JIT inlining code) and noticed that the
crashes disappear when I refrain from calling bytes-set! in parallel
using futures.
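
For illustration, a minimal sketch of that pattern: one byte string
written with bytes-set! from a future and from the calling thread at the
same time. The names (buf, fill!, parallel-fill!) and the buffer size are
hypothetical and not taken from the actual application; the sketch only
shows the kind of call involved and is not claimed to crash.

```racket
#lang racket/base
(require racket/future racket/fixnum)

;; Hypothetical buffer size, chosen only for illustration.
(define size (* 800 600 4))
(define buf (make-bytes size))

;; Fill the index range [start, end) of buf with the byte value v.
(define (fill! start end v)
  (for ((i (in-range start end)))
    (bytes-set! buf i v)))

;; Split the buffer in two: the first half is filled in a future, the
;; second half in the calling thread; touch waits for the future.
(define (parallel-fill! v)
  (define mid (fxrshift size 1))
  (define f (future (λ () (fill! 0 mid v))))
  (fill! mid size v)
  (touch f))

(parallel-fill! #xff)
```

The two halves never overlap, so the writes do not race on any single
byte.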

So I started creating a minimal crashing example (MCE). At first, I
failed miserably. Just filling a byte array over and over again, I was
unable to reproduce the crash. But then I realized that in my
application, threads come into play and that might be the cause. And
suddenly, creating an MCE was really easy:

Create a new eventspace using parameterize/make-eventspace, put the
actual code in an application thread (thread ...) and make the main
thread wait for this application thread using thread-wait. Before
starting the application thread, I create a simple window, a bitmap and
a canvas that I keep redrawing using refresh-now after each iteration.
Funny thing is, now it keeps crashing even without actually modifying
the bitmap in question. All I need to do is to mess with some byte array
in 8 threads. Sometimes it takes a minute on my computer before it
crashes, sometimes it takes longer, but it eventually crashes pretty
consistently.

And it is just 60 lines of code:

#lang racket/gui

(require racket/future racket/fixnum racket/cmdline)

(define width 800)
(define height 600)

(define framebuffer (make-fxvector (* width height)))
(define pixels (make-bytes (* width height 4)))

(define max-depth 0)

(command-line
 #:once-each
 (("-d" "--depth") d "Futures binary partitioning depth"
  (set! max-depth (string->number d))))

(file-stream-buffer-mode (current-output-port) 'none)

(parameterize ((current-eventspace (make-eventspace)))
  (define win (new frame%
                   (label "test")
                   (width width)
                   (height height)))
  (define bmp (make-bitmap width height))
  (define canvas (new canvas%
                      (parent win)
                      (paint-callback
                       (λ (c dc)
                         (send dc draw-bitmap bmp 0 0)))))

  (define (single-run)
    (define (do-bflip start end (depth 0))
      (cond ((fx< depth max-depth)
             (define cnt (fx- end start))
             (define cnt2 (fxrshift cnt 1))
             (define mid (fx+ start cnt2))
             (let ((f (future
                       (λ ()
                         (do-bflip start mid (fx+ depth 1))))))
               (do-bflip mid end (fx+ depth 1))
               (touch f)))
            (else
             (for ((i (in-range start end)))
               (define c (fxvector-ref framebuffer i))
               (bytes-set! pixels (+ (* i 4) 0) #xff)
               (bytes-set! pixels (+ (* i 4) 1) (fxand (fxrshift c 16) #xff))
               (bytes-set! pixels (+ (* i 4) 2) (fxand (fxrshift c 8) #xff))
               (bytes-set! pixels (+ (* i 4) 3) (fxand c #xff))))))
    (do-bflip 0 (* width height))
    (send canvas refresh-now))
  (send win show #t)

  (define appthread
    (thread
     (λ ()
       (let loop ()
         (single-run)
         (loop)))))
  (thread-wait appthread))

Note: the code is deliberately de-optimized to highlight the problem.
Not even mentioning CPU cache coherence here.

Running this from the command line, I can adjust the number of threads.
Running with 8 threads:

$ time racket crash.rkt -d 3
SIGSEGV MAPERR si_code 1 fault on addr (nil)
Aborted (core dumped)

real    1m18,162s
user    7m11,936s
sys     0m3,832s
$ time racket crash.rkt -d 3
SIGSEGV MAPERR si_code 1 fault on addr (nil)
Aborted (core dumped)

real    3m44,005s
user    20m10,920s
sys     0m11,702s
$ time racket crash.rkt -d 3
SIGSEGV MAPERR si_code 1 fault on addr (nil)
Aborted (core dumped)

real    2m1,650s
user    10m58,392s
sys     0m6,445s
$ time racket crash.rkt -d 3
SIGSEGV MAPERR si_code 1 fault on addr (nil)
Aborted (core dumped)

real    8m8,666s
user    45m52,359s
sys     0m25,184s
$

With 4 threads it didn't crash even after quite some time:

$ time racket crash.rkt -d 2
^Cuser break
  context...:
   "crash.rkt": [running body]
   temp35_0
   for-loop
   run-module-instance!
   perform-require!

real    20m18,706s
user    61m38,546s
sys     0m22,719s
$
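
As a rough illustration of how -d maps to the amount of parallel work:
each level of do-bflip halves its range, so a depth of d gives 2^d leaf
ranges (8 for -d 3, 4 for -d 2). The helper names below (leaf-count,
leaf-sizes) are hypothetical; they only mirror the splitting arithmetic
of the example and do not create any futures.

```racket
#lang racket/base

;; Number of leaf ranges produced by a binary partitioning of the
;; given depth.
(define (leaf-count depth)
  (expt 2 depth))

;; Sizes of the leaf ranges for a given element count. Like the
;; fxrshift in do-bflip, the first half gets the rounded-down share.
(define (leaf-sizes count depth)
  (if (zero? depth)
      (list count)
      (let ((half (quotient count 2)))
        (append (leaf-sizes half (sub1 depth))
                (leaf-sizes (- count half) (sub1 depth))))))

(leaf-count 3)                ; 8 leaf ranges for -d 3
(leaf-count 2)                ; 4 leaf ranges for -d 2
(leaf-sizes (* 800 600) 3)    ; eight ranges of 60000 pixels each
```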


I'll re-run the 4-thread test overnight.

What would be the best approach to debugging this issue? I assume I'll
load the racket binary in gdb and see the stack traces at the moment of
the crash, but that won't reveal the source of the problem (judging
based on my previous experience of debugging heavily multi-threaded