Re: How does Core Audio handle heterogeneous sample rates?

Brian Willoughby Sun, 05 Nov 2017 18:25:58 -0800

Most applications, including iTunes, only send audio to a single device. 
Therefore, only one sample rate is handled. However, if that application mixes 
multiple audio files at different sample rates, then there will be sample rate 
conversion. It is up to the application developer to decide on the type of 
sample rate conversion and, while CoreAudio provides many options, there’s no 
rule that prevents a developer from using non-CoreAudio solutions. In addition, 
some applications have problems when a single audio file’s sample rate does not 
match the device sample rate, even in those situations where there isn’t a 
second sample rate on input.

The number of applications which can open multiple audio devices is small 
compared to the overall macOS market, but if you’re in a modern recording 
studio then you might actually see quite a few applications that support this. 
Companies like Mark of the Unicorn (MOTU) have developed their own features to 
support multiple audio devices. Their Digital Performer (DP) product allows 
users to connect input and output from multiple pieces of audio hardware. MOTU 
DP handles all of the sample rate conversion within the application, although 
it is unclear (to me, at least) to what degree they leverage CoreAudio SRC 
versus their own SRC, and whether they support audio interfaces besides their 
own brand (they probably do, thanks to CoreAudio drivers).

Independent of applications, macOS supports aggregate devices within the 
CoreAudio subsystem. These Aggregate Devices are created and configured by the 
user and then presented transparently to the application as if they were a 
single, physical device. In other words, the application code is unaware that 
the aggregate device is not a single hardware device, and thus there is no 
special code within the app. The answer to your questions in this case is that 
a CoreAudio aggregate device has a master audio device which does not incur 
sample rate conversion, while all other audio devices that belong to the 
aggregate have some amount of automatic SRC, as needed. There are sometimes 
issues when the user incorrectly or incompletely configures an aggregate 
device, so it’s important to be aware that it’s not a fool-proof option.

CoreAudio itself offers a number of building blocks to application developers, 
many of which have automatic SRC. The default output audio device AudioUnit and 
the HAL output device AudioUnit both have optional SRC support. There is an 
AudioStreamBasicDescription (ASBD) for both input and output scopes of those 
AudioUnits, and if the sample rate does not match then SRC will be performed. 
The ExtAudioFile API allows nearly all audio file formats to be opened and 
decoded, and again there is both an input and output ASBD such that SRC may be 
performed, as needed. In addition to these handy tools for audio device I/O and 
audio file I/O, CoreAudio also offers the same SRC feature as AUConverter, an 
independent AudioUnit that can convert between sample rates and/or sample 
formats. This AudioUnit allows control of the SRC quality, including some of 
the best SRC in the industry (see http://src.infinitewave.ca for 
details). When directly hosting an AUConverter, application developers have the 
ability to control the level of quality, and thus control the tradeoff between 
CPU usage and audio quality. Although AUConverter is used by CoreAudio in many 
places (default output device, ExtAudioFile, aggregate devices), it is not 
always possible for the user to control the quality in an immediately 
accessible fashion. In most of those situations, the SRC quality is a bit lower 
than best-in-the-industry, probably because Apple's assumption is that most 
macOS users care more about CPU availability than ultimate audio quality. In 
response, there is a small market for carefully coded macOS applications that 
specifically set the SRC quality to maximum, but it is rather difficult to 
discern what is happening behind the scenes for any given app.
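
To make the "input ASBD versus output ASBD" rule above concrete, here is a minimal sketch in plain Python. It does not call CoreAudio at all; the function names `needs_src` and `src_ratio` are hypothetical and only model the arithmetic: conversion is needed whenever the two rates differ, and the conversion ratio reduces to a small fraction.

```python
from fractions import Fraction


def needs_src(input_rate_hz: int, output_rate_hz: int) -> bool:
    """True when the two scopes disagree on sample rate, so SRC would run."""
    return input_rate_hz != output_rate_hz


def src_ratio(input_rate_hz: int, output_rate_hz: int) -> Fraction:
    """Output frames produced per input frame, reduced to lowest terms."""
    return Fraction(output_rate_hz, input_rate_hz)


# A 44.1 kHz file played on a 48 kHz device: 160 output frames per 147 input frames.
print(needs_src(44100, 48000), src_ratio(44100, 48000))
# Matching rates: no conversion, ratio of exactly 1.
print(needs_src(48000, 48000), src_ratio(48000, 48000))
```

In a real application the two rates would come from the ASBDs on the input and output scopes; this sketch only shows what the mismatch implies.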

The short answer is that CoreAudio does not directly handle producers with 
different sample rates. Instead, the application developer and/or the user must 
configure various options available in CoreAudio to control this. As far as I 
know, there is no completely automatic handling, because someone (either 
developer or user) must choose the mechanism that controls this.

As for your challenge to pick a rate that won’t incur resampling, it’s up to 
you as a developer to design the graph within your application that handles 
audio. Whether you use the CoreAudio AUGraph API or build your own audio flow, 
it’s still up to your code to query the outside sample rates (such as the 
output device or the input file) and decide how to manage mismatches. You could 
query everything and then choose the highest sample rate to preserve maximum 
quality, but that would use more CPU. You could also choose the lowest sample 
rate and suffer the loss of quality. When the sample rates are fairly close, 
such as 44.1 and 48 kHz, there might not be much quality difference, but your 
app will still need a reasonable amount of intelligence to automatically choose 
the overall sample rate that involves the least amount of SRC (if your goal is 
to avoid SRC as much as possible). CoreAudio does not make any of these 
decisions automatically for you. There are default sample rates, but they are 
just defaults, not optimizations based on all of your application’s resources.
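
The three strategies described above (highest rate, lowest rate, or least amount of SRC) can be sketched roughly as follows. This is plain Python, not a CoreAudio API: `pick_session_rate` and `conversions_needed` are hypothetical helpers, the rates are assumed to have been queried already, and every stream (including the output device) is assumed to count equally, which a real app may not want.

```python
from collections import Counter


def pick_session_rate(source_rates_hz, device_rate_hz, strategy="fewest_conversions"):
    """Choose one processing rate for a set of sources and one output device."""
    rates = list(source_rates_hz) + [device_rate_hz]
    if strategy == "highest":
        return max(rates)   # preserves quality, costs more CPU
    if strategy == "lowest":
        return min(rates)   # cheapest, loses bandwidth
    if strategy == "fewest_conversions":
        counts = Counter(rates)
        # The most common rate wins; ties go to the higher rate.
        return max(counts, key=lambda rate: (counts[rate], rate))
    raise ValueError(f"unknown strategy: {strategy}")


def conversions_needed(source_rates_hz, device_rate_hz, session_rate_hz):
    """How many streams (sources plus device) would need SRC at this rate."""
    rates = list(source_rates_hz) + [device_rate_hz]
    return sum(1 for rate in rates if rate != session_rate_hz)


sources = [44100, 44100, 96000]
device = 48000
for strategy in ("highest", "lowest", "fewest_conversions"):
    rate = pick_session_rate(sources, device, strategy)
    print(strategy, rate, conversions_needed(sources, device, rate))
```

Weighting the device more heavily than the files, or preferring integer-ratio conversions, would be reasonable refinements, but that is a design decision for the app.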

As for your linear phase question, it appears that all of Apple’s SRC options 
are linear phase. Look at the src.infinitewave.ca site for Afconvert, Apple 
CoreAudio, and Logic. You can examine many aspects of the SRC performance. Note 
that Afconvert (bats) is the highest quality available from Apple, although the 
test results on that site may represent older macOS releases than the one 
you’re working on.

A little more information on what you’re trying to build, and whether you’re on 
macOS or iOS, would help narrow down the many possibilities.

Finally, I don’t think you’re rambling. I’ve put a lot of time into developing 
tools that allow me to verify the bit-perfect accuracy of various macOS 
software. I do music mastering, and prefer to ensure that there is no SRC in my 
monitoring path unless it is part of the mastering decision process. It can be 
difficult to be certain, but it is still quite possible to confirm that SRC is 
not occurring with specific software and hardware setups.
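
For illustration only, the simplest form of a bit-perfect check is a sample-by-sample comparison of what was sent against what was captured. The sketch below is plain Python and is not the tooling mentioned above; `first_mismatch` is a hypothetical name, and it assumes both streams are already time-aligned lists of integer PCM samples at the same bit depth.

```python
def first_mismatch(sent, received):
    """Return the index of the first differing sample, or None if identical."""
    for index, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return index
    if len(sent) != len(received):
        # One stream is a strict prefix of the other.
        return min(len(sent), len(received))
    return None


reference = [0, 1200, -1200, 32767, -32768]
print(first_mismatch(reference, list(reference)))                     # identical capture
print(first_mismatch(reference, [0, 1200, -1199, 32767, -32768]))     # one sample altered
```

Any SRC in the path would normally change sample values (and usually the sample count), so a mismatch shows up quickly; a real test also has to deal with latency and alignment, which this sketch ignores.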

Brian Willoughby
Sound Consulting

On Nov 5, 2017, at 5:18 PM, Brian Armstrong <[email protected]> wrote:
> I was curious how Core Audio handles producers with different sample rates, 
> let's say 44,100 Hz and 48,000 Hz. I guess this may also be a question about 
> sound cards. My suspicion is that one consumer must have to face resampling 
> on the way out. If that's the case, I'm curious what the resampling looks 
> like, and if it has linear phase.
> 
> I'm trying to nail down some quirks that might be sample rate related in a 
> modem library I work on. If there is resampling that occurs, seems my best 
> bet is to pick a rate that won't incur resampling, whatever that rate might 
> be? But I'm not sure how I would get that rate.
> 
> Sorry for the rambling here :)
> -Brian
