hi everyone! for the benefit of those JACK devs who don't follow linux-audio-* (if any), and for posterity in the JACK mailing list archives:
there has been a very insightful discussion on LAU/LAD about ringbuffer peculiarities and a possible bug. olivier guilyardi has written test cases that pinpoint those problems. he has made them available at http://svn.samalyse.com/misc/rbtest. can i suggest that the rest of this discussion take place on or at least be cc:ed to jack-devel? best, jörn (who has no idea at all about concurrent programming, but knows a bit about community engineering and data mining :-D) Olivier Guilyardi wrote: > Jack O'Quin wrote: >> On Sun, Oct 19, 2008 at 2:55 AM, Paul Davis <[EMAIL PROTECTED]> wrote: >> >>> i don't know whether to shoot myself or eat another couple of the >>> oft-promised hats. >> Don't beat yourself up too badly. Multiple threads accessing shared memory >> is *very* tricky. Even smart people (like you) get it wrong sometimes. > > I agree with Jack. I remember a friend of mine once qualifying multi-threading > of "kernel quantization". This is complex but also challenging :) > > Although your understanding of these matters is far beyond mine, I think that > I > have a good professional experience of coding methodology. And to me the > problem > here is indeed methodological: there is a lack of unit tests in JACK. > >>> so the next question is how best to prevent it. as far as i can see we >>> have a couple of proposals: >>> >>> 1) fons' design, which never actually wraps readptr or writeptr, but >>> masks the address used to access the data buffer >>> >>> 2) removing the intermediate state's visibility >>> >>> i admit to preferring (2) even though i know that with a 64 bit index, >>> not wrapping the ptrs is not really a problem. >> I also prefer (2). >> >>> however, it is not totally clear to me how to prevent an optimizing >>> compiler from doing the wrong thing here. unlike the claims made by >>> someone involved with portaudio, i believe that it is correct to declare >>> the read_ptr and write_ptr volatile, so that the compiler knows that it >>> cannot try to be clever about it accesses of the "other" ptr (i.e. >>> read_ptr for the writing thread, and vice versa). maybe the comment on >>> the use of volatile was based on some idea that i thought it made the >>> variables thread safe, which is and never was the case. >>> >>> i suspect that the safest code looks like: >>> >>> size_t tmp = (ptr + incr) & mask; >>> barrier(); >>> ptr = tmp; >>> >>> but i am not sure whether barrier needs to be read, write or both. >>> >>> i think that the simpler code: >>> >>> ptr = (ptr + incr) & mask; >>> >>> is subject to potential compiler and/or processor "optimization" that >>> might reduce it back to the problem case of two ops without an >>> intermediate load/store location. the volatile declaration ought to >>> prevent the compiler from doing this, and i don't see why a processor >>> would do this, ever, but clearly i've already been deeply wrong about >>> this. does anybody know for certain? >> It is best to avoid assumptions about what some future compiler may >> consider an "optimization". If the register pressure is high at some >> point in the program, it may decide to store some value just to free up >> register space for other variables. >> >> The "volatile" declaration should remove any need for the compiler >> barrier() statement, AFAICT. Note that barrier() is a compiler >> directive, and has no effect on the CPU's ability to reorder cache >> operations in an SMP memory hierarchy. >> >> Here is a fairly clear and complete description of the memory barrier >> issues: http://lxr.linux.no/linux/Documentation/memory-barriers.txt >> >> As best I can tell, both ring buffer threads require "general" memory >> barriers, because they both read and write (different) shared data. In >> Linux kernel terms, they'd need to use smp_mb(), since the problems >> they address are multiprocessor-only. But, since JACK is not compiled >> differently for UP and SMP, the full mb() seems more appropriate. >> >> That stuff makes my head hurt. > > Maybe that is because your approach is too theoretical. I've read many > theories > about whether memory barriers are needed or not in lock-free ring buffers, > and I > think it might lead nowhere without experimenting, that is: write test cases. > > Until now, I just couldn't write a test that fail because of the lack of > memory > barrier on x86. However, I think I found another bug in Jack ringbuffer, by > writing another test. > > It's a bit of a weird test, I call it "bit circle": I start 16 "node" threads. > Each node : > - communicate with its adjacent node through 2 ring buffers > - is responsible for shifting an integer by one bit > - send the shifted result to the next node through a ring buffer > - checks that the value read from previous node is correct. > > On my Debian Quad Core and Mac OS X Core Duo boxes, this test fails with > Jack's > ringbuffer even with my patch applied. But it succeeds with Portaudio (both > with > and without memory barriers), and also with Fons' implementation. > > I wish someone could eventually run my test suite on a PowerPC, especially > SMP. > > The usage is unchanged: > svn co http://svn.samalyse.com/misc/rbtest > cd rbtest > make test > > The run time is now shorter, about 2 minutes. Below is the output. You can see > that jack (test-bit-circle-jack) fails the bit circle test, even when patched > (test-bit-circle-jack-fix1). "|" is printed when a node thread starts, and "-" > when a node has read 1000 values, to ensure data is really flowing. > > The test-int-array-* tests are the same as my original test. All *-lfq tests > use > Fons's Lfq ringbuffer (modified to use char instead of uint32_t) as backend. > > ./run-tests.sh bit-circle 512 jack jack-fix1 portaudio portaudio-nobarrier lfq > > test-bit-circle-jack - starting (5s max) - buffer size: 512 > || FAILURE: 2 != 514 > > test-bit-circle-jack-fix1 - starting (5s max) - buffer size: 512 > || FAILURE: 2 != 514 > > test-bit-circle-portaudio - starting (5s max) - buffer size: 512 > ||-|-|-||-|-|-|-|-||--|||--|---- SUCCESS > > test-bit-circle-portaudio-nobarrier - starting (5s max) - buffer size: 512 > |||--|-|||---||-||-|-|--||--|--- SUCCESS > > test-bit-circle-lfq - starting (5s max) - buffer size: 512 > |||--|||-|||-|||--|||-------|--- SUCCESS > > ./run-tests.sh int-array 16 jack jack-fix1 portaudio portaudio-nobarrier lfq > > test-int-array-jack - starting (20s max) - array/buffer size: 8/16 > [reader started] [writer started] [flowing] 52572 != 52568 at offset 0 > FAILURE in chunk 91822 > > test-int-array-jack-fix1 - starting (20s max) - array/buffer size: 8/16 > [reader started] [writer started] [flowing] SUCCESS > > test-int-array-portaudio - starting (20s max) - array/buffer size: 8/16 > [reader started] [writer started] [flowing] SUCCESS > > test-int-array-portaudio-nobarrier - starting (20s max) - array/buffer size: > 8/16 > [reader started] [writer started] [flowing] SUCCESS > > test-int-array-lfq - starting (20s max) - array/buffer size: 8/16 > [writer started] [reader started] [flowing] SUCCESS > -- jörn nettingsmeier home://germany/45128 essen/lortzingstr. 11/ http://spunk.dnsalias.org phone://+49/201/491621 Kurt is up in Heaven now. _______________________________________________ Linux-audio-dev mailing list [email protected] http://lists.linuxaudio.org/mailman/listinfo/linux-audio-dev
