Re: std::atomic and lock free algorithms with Core Audio

Hovik Melikyan Sun, 11 Sep 2016 06:10:57 -0700

Thanks Ross. I've seen Jeff Preshing's blog; it comes up in the top
results when you google these things. I also found, and am now watching,
Herb Sutter's talks, which are really great if someone wants to
understand these things from scratch:

https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2

>
> If there is no ordering requirement between individual value updates and
> other atomic/non-atomic code, then atomic relaxed is sufficient. The other
> orderings are only useful when considering multiple related reads/writes
> where their relative update order needs to be maintained.
>

On second thought, mute and gain on, say, a mixer bus are in fact
related. If your app mutes the bus and then changes the gain, you don't
want to hear the change of gain. So maybe a good implementation of a
node with mute/gain should guarantee sequential consistency, and I'm
now wondering how, for example, Apple's AUs deal with this.
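
One way to keep mute and gain consistent without ordering two separate
atomics is to pack both into a single 64-bit value, so the audio thread
always reads a pair that the control thread actually wrote. Below is a
minimal sketch, assuming a single control thread and a platform where
64-bit atomics are lock-free (worth checking with is_lock_free() at
startup). BusState and BusParams are hypothetical names, not part of
Core Audio.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical bus state. Gain and mute share one 64-bit word so they are
// always read and written together.
struct alignas(8) BusState {
    float         gain;
    std::uint32_t muted;   // 0 or 1; 32 bits wide so the struct has no padding
};
static_assert(sizeof(BusState) == 8, "BusState must fit in 64 bits");

class BusParams {
public:
    // Control (UI) thread only: a plain load/modify/store is safe because
    // this sketch assumes exactly one writer thread.
    void setMuted(bool muted) {
        BusState s = state_.load(std::memory_order_relaxed);
        s.muted = muted ? 1u : 0u;
        state_.store(s, std::memory_order_relaxed);
    }

    void setGain(float gain) {
        BusState s = state_.load(std::memory_order_relaxed);
        s.gain = gain;
        state_.store(s, std::memory_order_relaxed);
    }

    // Audio thread: one load yields a mute/gain pair that the control
    // thread really wrote, never a mix of old mute and new gain.
    float effectiveGain() const {
        const BusState s = state_.load(std::memory_order_relaxed);
        return s.muted ? 0.0f : s.gain;
    }

private:
    std::atomic<BusState> state_{BusState{1.0f, 0u}};
};
```

Because there is only one atomic object, relaxed ordering is enough for
the pair itself: the states the audio thread can observe are exactly the
ones stored, in the order they were stored. Separate mute and gain
atomics would need release/acquire (or stronger) to get the same effect.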

From my experience with stock AUs, they don't always react to
attribute changes immediately. For example, the EQ family was so slow
to respond in one of my apps that I had to write my own versions of
these filters based on vDSP. By "immediately" here I mean that changes
take effect before the next rendering cycle at the latest.

I'm now beginning to think of some kind of unified messaging system
for CoreAudio apps that would read messages safely and only once at
the start of a rendering cycle, then distribute them across the graph.
This would allow only one load-acquire per rendering cycle, and
no synchronization or ordering constraints (even "relaxed" ones?)
for the rest of rendering.
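
The once-per-cycle idea might look roughly like the sketch below: a
preallocated single-producer/single-consumer ring where the render
callback performs one load-acquire of the write index, handles
everything published up to that point, and then touches only plain
(non-atomic) data it owns. ParamMessage, MessageRing and their fields
are hypothetical names chosen for illustration, and the sketch assumes
exactly one control thread and one audio thread.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical message: "set parameter `param` on node `node` to `value`".
struct ParamMessage {
    std::uint32_t node;
    std::uint32_t param;
    float         value;
};

// Single-producer/single-consumer ring with preallocated storage.
// Capacity must be a power of two so index wrapping is a bit mask.
template <std::size_t Capacity>
class MessageRing {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Control thread only. Returns false when the ring is full.
    bool push(const ParamMessage& m) {
        const std::size_t w = write_.load(std::memory_order_relaxed);
        const std::size_t r = read_.load(std::memory_order_acquire);
        if (w - r == Capacity) return false;
        slots_[w & (Capacity - 1)] = m;
        write_.store(w + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Audio thread only, once at the top of the render callback.
    // The single load-acquire makes every message published so far visible.
    template <typename Handler>
    std::size_t drain(Handler&& handle) {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        const std::size_t w = write_.load(std::memory_order_acquire);
        for (std::size_t i = r; i != w; ++i)
            handle(slots_[i & (Capacity - 1)]);
        read_.store(w, std::memory_order_release);       // hand slots back
        return w - r;
    }

private:
    std::array<ParamMessage, Capacity> slots_{};
    std::atomic<std::size_t> write_{0};
    std::atomic<std::size_t> read_{0};
};
```

The handler passed to drain() would copy each value into state owned by
the audio thread, so the rest of the cycle reads ordinary variables with
no atomics at all. Note that the consumer still does one store-release
per cycle to return the slots to the producer.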

Michael Tyson's TPCircularBuffer is probably a good candidate, but a
quick look at the source raises a few flags, to be honest. Plus, my
overall experience with TheAmazingAudioEngine in the past has been
rather negative, and I can't rule out that TPCircularBuffer was the
cause. TAAE is a bit overengineered in any case.

So I think I'd start with Herb Sutter's 1P1C lock-free queue described here:
http://www.drdobbs.com/parallel/writing-lock-free-code-a-corrected-queue/210604448

Memory allocations with this implementation happen only on the
producer's side, which is good for audio apps. I think it comes down
to optimizing these allocations by e.g. reusing message objects, or
doing something even more complicated, like this:
https://github.com/cameron314/readerwriterqueue
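
To make the "reuse message objects" idea concrete, here is a sketch of
a linked 1P1C queue in the spirit of the Dr. Dobb's article, but with
explicit memory orders and with consumed nodes recycled by the producer
instead of deleted. It is not the article's code. SpscQueue,
nodesAllocated() and the free-list scheme are assumptions made for
illustration, and T is assumed to be trivially copyable so the consumer
never allocates or frees.

```cpp
#include <atomic>
#include <cstddef>
#include <type_traits>

template <typename T>
class SpscQueue {
    static_assert(std::is_trivially_copyable<T>::value,
                  "sketch assumes plain-data messages");
    struct Node {
        T     value{};
        Node* next = nullptr;
    };
public:
    SpscQueue() {
        Node* dummy = newNode();
        first_ = dummy;
        divider_.store(dummy, std::memory_order_relaxed);
        last_.store(dummy, std::memory_order_relaxed);
    }
    ~SpscQueue() { destroyChain(first_); destroyChain(free_); }
    SpscQueue(const SpscQueue&) = delete;
    SpscQueue& operator=(const SpscQueue&) = delete;

    // Producer thread only. Allocates only when no consumed node is free.
    void produce(const T& value) {
        reclaimConsumed();
        Node* n = free_;
        if (n) free_ = n->next; else n = newNode();
        n->value = value;
        n->next  = nullptr;
        Node* tail = last_.load(std::memory_order_relaxed);
        tail->next = n;
        last_.store(n, std::memory_order_release);     // publish the node
    }

    // Consumer (audio) thread only. Never allocates or frees.
    bool consume(T& out) {
        Node* d = divider_.load(std::memory_order_relaxed);
        if (d == last_.load(std::memory_order_acquire)) return false;
        Node* next = d->next;
        out = next->value;
        divider_.store(next, std::memory_order_release); // release old nodes
        return true;
    }

    // Producer-side counter, for illustration only.
    std::size_t nodesAllocated() const { return allocated_; }

private:
    Node* newNode() { ++allocated_; return new Node; }

    // Move every node the consumer has passed onto the producer's free list.
    void reclaimConsumed() {
        Node* d = divider_.load(std::memory_order_acquire);
        while (first_ != d) {
            Node* n = first_;
            first_  = n->next;
            n->next = free_;
            free_   = n;
        }
    }

    static void destroyChain(Node* n) {
        while (n) { Node* next = n->next; delete n; n = next; }
    }

    Node*              first_ = nullptr;  // producer only: oldest live node
    Node*              free_  = nullptr;  // producer only: recycled nodes
    std::size_t        allocated_ = 0;    // producer only
    std::atomic<Node*> divider_{nullptr}; // written by consumer
    std::atomic<Node*> last_{nullptr};    // written by producer
};
```

In a steady produce/consume rhythm the queue settles on a fixed number
of nodes and stops calling new. A burst of messages still allocates on
the producer side, which is where a preallocated pool or a block-based
design such as readerwriterqueue would come in.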

That's what I've been able to gather so far... Does the above make sense?

--
Hovik Melikyan

On 11 September 2016 at 08:08, Ross Bencina wrote:
> On 11/09/2016 8:38 AM, Hovik Melikyan wrote:
>>
>> Firstly, for atomic/aligned types such as Float32 or Int32 there has
>> to be a simple mechanism, and I'm wondering which of the
>> memory_order models should be used with these parameters.
>>
>> The thing with parameters like volume and mute is that in principle it
>> is not required for changes to be immediately available for the audio
>> thread, but it should be "soon enough", say they should take effect
>> during the next rendering cycle. My first question is then, if I use
>> std::atomic<> (which I admit I still don't understand very well),
>> which memory ordering model to use? Is the relaxed model sufficient in
>> this case? On the other hand, is acquire/release semantics too
>> expensive for what I need?
>
>
> If there is no ordering requirement between individual value updates and
> other atomic/non-atomic code, then atomic relaxed is sufficient. The other
> orderings are only useful when considering multiple related reads/writes
> where their relative update order needs to be maintained.
>
>> Second, in more complicated cases like passing a block of parameters
>> to a parametric EQ, what is the best mechanism?
>
>
> Not sure.
>
>
>> Another option I've found is a less known algorithm called "optimistic
>> reader" that involves a shared counter. The writer increments the
>> counter once when it starts writing and one more time when it ends.
>> The reader, in its turn, reads the counter, reads the data, then
>> compares the counter with the stored value. If the values don't match
>> it means the data block is likely inconsistent, and that it should
>> retry. While it sounds simple and elegant, there is at least one major
>> drawback that potentially the consumer may end up looping a lot. Also
>> for me there is a question again what memory ordering model to use
>> with the counter.
>
>
> The reader can give up if it tries too many times (and if you think about
> it, it's pretty unlikely that you'd need more than 2 retries). A similar
> thing would apply to using a try-lock.
>
> This is a good explanation of release/acquire:
>
> http://preshing.com/20120913/acquire-and-release-semantics/
>
> also
>
> http://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/
>
> There are many other excellent posts on Jeff Preshing's blog.
>
> Ross.
>
 _______________________________________________
Coreaudio-api mailing list