On 26.11.2015 18:47, Tanu Kaskinen wrote:
On Thu, 2015-11-26 at 08:41 +0100, Georg Chini wrote:
On 26.11.2015 01:49, Tanu Kaskinen wrote:
On Wed, 2015-11-25 at 22:58 +0100, Georg Chini wrote:
On 25.11.2015 19:49, Tanu Kaskinen wrote:
On Wed, 2015-11-25 at 16:05 +0100, Georg Chini wrote:
On 25.11.2015 09:00, Georg Chini wrote:
OK, understood. Strange that you are talking about 75% and 25%
average buffer fills. Doesn't that hint at a connection between
sink latency and buffer_latency?
I believe I found something in the sink or alsa code back in February
which at least supported my choice of the 0.75, but I have to admit
that I can't find it anymore.
Let's take the case I mentioned in my last mail. I have requested
20 ms for the sink/source latency and 5 ms for the memblockq.
What does it mean that you request 20 ms "sink/source latency"? There
is the sink latency and the source latency. Does 20 ms "sink/source
latency" mean that you want to give 10 ms to the sink and 10 ms to the
source? Or 20 ms to both?
I try to configure source and sink to the same latency, so when I
say source/sink latency = 20 ms I mean that I configure both to
20 ms.
In the end they may still end up configured to different
latencies (for example HDA -> USB).
The minimum necessary buffer_latency is determined by the larger
of the two.
For simplicity in this thread I always assume they are both equal.
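If, say, the HDA side came out at 20 ms and the USB side at 25 ms, the
25 ms side would set the lower bound for buffer_latency.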

The
20 ms cannot be satisfied, I get 25 ms as sink/source latency when
I try to configure it (USB device).
I don't understand how you get 25 ms. default_fragment_size was 5 ms
and default_fragments was 4, multiply those and you get 20 ms.
You are right. The configured latency is 20 ms but in fact I am seeing
up to 25 ms.
25 ms reported as the sink latency? If the buffer size is 20 ms, then
that would mean that there's 5 ms buffered later in the audio path.
That sounds a bit high to me, but not impossible. My understanding is
that USB transfers audio in 1 ms packets, so there has to be at least 1
ms of extra buffer after the basic alsa ringbuffer; maybe the extra buffer
contains several packets.
I did not check the exact value; maybe it is not 25 but 24 ms. In any
case it is significantly larger than the configured value.

For the loopback code it means that the target latency is not what
I specified on the command line but the average sum of source and
sink latency + buffer_latency.
The target latency should be "configured source latency +
buffer_latency + configured sink latency". The average latencies of the
sink and source don't matter, because you need to be prepared for the
worst-case scenario, in which the source buffer is full and the sink
wants to refill its buffer before the source pushes its buffered audio
to the memblockq.
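As a minimal sketch of that sum (function and parameter names are mine,
purely for illustration; values in microseconds):

#include <stdint.h>

/* Worst case: the source buffer is full and the sink wants a full
 * refill before the source pushes anything to the memblockq, so both
 * configured latencies plus the memblockq margin have to fit. */
uint64_t worst_case_target(uint64_t configured_source_latency,
                           uint64_t buffer_latency,
                           uint64_t configured_sink_latency) {
    return configured_source_latency + buffer_latency
           + configured_sink_latency;
}

With the 20 ms / 5 ms / 20 ms values from this thread that gives 45 ms.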
Using your suggestion would again considerably raise the achievable
lower limit. Obviously it is not necessary to go to the full range.
How is that obviously not necessary? For an interrupt-driven alsa
source I see how that is not necessary, hence the suggestion for
optimization, but other than that, I don't see the obvious reason.
Obviously in the sense that it is working not only for interrupt-driven
alsa sources but also for bluetooth devices and timer-based alsa devices.
I really spent a lot of time on stability tests, so I know it works
reliably for the devices I could test.

That special case is also difficult to explain. There are two situations
where I use the average sum of source and sink latency.
1) The latency specified cannot be satisfied
2) sink/source latency and buffer_latency are both specified

In case 1) the sink/source latency will be set as low as possible
and buffer_latency will be derived from the sink/source latency
using my safeguards.
In case 2), sink/source latency will be set to the nearest possible
value (but may be higher than specified), and buffer_latency is
set to the command-line value.

Now in both cases you have sink/source latency + buffer_latency
as the target value for the controller - at least if you want to handle
it similarly to normal operation.
The problem now is that the configured sink/source latency is
possibly different from what you get on average. So I replaced
sink/source latency with the average sum of the measured
latencies.
Of course the average measured latency of a sink or source is lower
than the configured latency. The configured latency represents the
situation where the sink or source buffer is full, and the buffers
won't be full most of the time. That doesn't mean that the total
latency doesn't need to be big enough to contain both of the configured
latencies, because you need to handle the case where both buffers
happen to be full at the same time.
I am not using sink or source latency alone; I am using the
average sum of source and sink latency, which is normally
slightly higher than a single configured latency.
How can it be possible that both buffers are full at the same
time? This could only happen if there is some congestion and
then there is a problem with the audio anyway. In a steady
state, when one buffer is mostly empty, the other one must be
mostly full. Otherwise the latency would jump around wildly.
There are three buffers, not two. The sum of the buffer fill levels of
the sink and source will jump around wildly, but the total latency will
stay constant, because the memblockq will always contain the empty
space of the sink and source buffers.

Note that the measured latency of the sink and source doesn't jump
around that wildly, because the measurement often causes a reset in the
buffer fill levels. But every time the sink buffer is refilled or the
source pushes out data, the latency of the sink or source jumps. Sink
refills and source emptyings don't (generally) happen in a
synchronized manner, so the latency sum of the sink and source does
jump around.

If the sink and source were synchronized, the combined latency wouldn't
jump around, and you could reduce the total latency. But that's not the
case.
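To put made-up numbers on it: with 20 ms sink and source buffers and a
45 ms total, a snapshot of 18 ms in the source and 4 ms in the sink
leaves 23 ms in the memblockq; when the sink then refills to 20 ms the
memblockq drops to 7 ms, and when the source pushes its 18 ms out it
climbs back to 25 ms, while the total stays at 45 ms the whole time.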

The average is also used to compare the "real"
source/sink latency + buffer_latency
against the configured overall latency, and the larger of the two
values is the controller target. This is the mechanism used
to increase the overall latency in case of underruns.
I don't understand this paragraph. I thought the reason why the
measured total latency is compared against the configured total latency
is that you then know whether you should increase or decrease the sink
input rate. I don't see how averaging the measurements helps here.
Normally, the configured overall latency is used as a target for
the controller. Now there must be some way to detect at runtime
whether this target can be achieved at all.
You know whether the target is achievable when you know the maximum
latency of the sink and source. If the sum of those is larger than the
target, the target is not achievable. Measuring the average latency
doesn't bring any new information.
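A hypothetical check along those lines (names are mine, values in
microseconds):

#include <stdbool.h>
#include <stdint.h>

/* The target is reachable only if the worst-case (maximum) source and
 * sink latencies fit into it; measured averages add no information. */
bool target_achievable(uint64_t target_latency,
                       uint64_t max_source_latency,
                       uint64_t max_sink_latency) {
    return max_source_latency + max_sink_latency <= target_latency;
}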

By the way, the fact that the real sink latency can be higher than the
configured latency is problematic, when thinking whether the target
latency can be achieved. The extra latency needs to be compensated by
decreasing buffer_latency, and if such extra margin doesn't exist, then
the target latency is not achievable.

It's not currently possible to separate the alsa ringbuffer latency
from the total sink latency, so you don't know how much such extra
latency there is. You could look at the sink latency reports, and if they
go beyond the configured latency, then you know that there's *at least*
that much extra latency, but it would be nice if the alsa sink could
report the ringbuffer and total latencies separately. The alsa API
supports this, but PulseAudio's own APIs don't. The source has the same
problem, but it also has the additional problem that the latency
measurements cause the buffer to be emptied first, so the measurements
never show larger latencies than configured. I propose that for now we
ignore such extra latencies. We could add some safety margin to
buffer_latency to cover these latencies, but it's not nice to force
that for cases where such extra latencies don't exist.

So I compare the target value against buffer_latency +
average_sum_of_source_and_sink_latency and set the controller
target to the larger of the two.
This is the way the underrun protection works. In normal operation,
the configured overall latency is larger than the sum above and
buffer_latency is not used at all. When underruns occur, buffer_latency
is increased until the sum gets larger than the configured latency
and the controller switches the target.
And what does this have to do with increasing the latency on underruns?
If you get an underrun, then you know buffer_latency is too low, so you
bump it up by 5 ms (if I recall your earlier email correctly), causing
the configured total latency to go up by 5 ms as well. As far as I can
see, the measured latency is not needed for anything in this operation.
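Something like this minimal sketch, if I read your description right
(the names and the 5 ms step are taken from this thread, not from the
actual code):

#include <stdint.h>

#define UNDERRUN_BUMP_USEC 5000 /* 5 ms */

/* On underrun, bump buffer_latency, which bumps the configured total
 * latency by the same amount; no measured latency is involved. */
void handle_underrun(uint64_t *buffer_latency,
                     uint64_t *configured_total_latency) {
    *buffer_latency += UNDERRUN_BUMP_USEC;
    *configured_total_latency += UNDERRUN_BUMP_USEC;
}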

----

Using your example (usb sound card with 4 * 5 ms sink and source
buffers), my algorithm combined with the alsa source optimization
yields the following results:

configured sink latency = 20 ms
configured source latency = 20 ms
maximum source buffer fill level = 5 ms
buffer_latency = 0 ms
target latency = 25 ms

So you see that the results aren't necessarily overly conservative.
That's different from what you proposed above, but sounds
like a reasonable approach. The calculation would be slightly
different because I defined buffer_latency = 5 ms on the
command line. So the result would be 30 ms, which is more
sensible. First, we already know that the 25 ms won't work.
Second, the goal of the calculation was to find a working
target latency using the configured buffer_latency, so you
can't ignore it.
My calculation leads to around 27.5 ms instead of your 30 ms,
so the two values are close enough to each other, and your
proposal has the advantage of being constant.
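For reference, the 30 ms would break down as 5 ms maximum source fill
(20 ms / 4 fragments) + 5 ms buffer_latency from the command line +
20 ms maximum sink fill.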

I will replace the average sum with
0.25 * configured_source_latency + configured_sink_latency
in the next version if my tests with that value are successful.
Do you mean that you're going to use 0.25 as the multiplier regardless
of the number of fragments?

Previously I've been saying that in the general case the target latency
should be "configured source latency + buffer_latency + configured sink
latency". To generalize the alsa source exception, I'll use the
following definition instead from now on: "target latency = maximum
source buffer fill level + buffer_latency + maximum sink buffer fill
level". Usually the maximum fill levels have to be assumed to be the
same as the configured latencies, but in the interrupt-driven alsa
source case the maximum fill level is known to be "configured source
latency / fragments".
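As a sketch of that definition (names are mine, values in microseconds):

#include <stdbool.h>
#include <stdint.h>

/* target = maximum source buffer fill + buffer_latency + maximum sink
 * buffer fill. For an interrupt-driven alsa source the maximum fill is
 * assumed to be configured_source_latency / fragments; otherwise the
 * full configured source latency has to be used. */
uint64_t target_latency(uint64_t configured_source_latency,
                        uint64_t buffer_latency,
                        uint64_t configured_sink_latency,
                        unsigned fragments,
                        bool irq_driven_alsa_source) {
    uint64_t max_source_fill =
        (irq_driven_alsa_source && fragments > 0)
            ? configured_source_latency / fragments
            : configured_source_latency;
    return max_source_fill + buffer_latency + configured_sink_latency;
}

With a 20 ms source, 4 fragments, buffer_latency = 0 and a 20 ms sink
this reproduces the 25 ms from the example above.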

OK, now I finally got you. That took quite a while, sorry.

Let me summarize:

- The minimum achievable latency is maximum source fill + maximum sink fill
- The sink maximum fill level is always 100%, regardless of the device type
- The source maximum fill level depends on device type
   # For interrupt-driven alsa sources it is one default-fragment-size
   # For timer-based alsa devices it is 50% of the configured latency
   # For the general case it is unknown and may be 100%

Do we have an agreement so far?

As we are talking about fail-safe measures, I think it is OK
to assume that 50% is a reasonable value even for the general case.
As already said, there is a mechanism in the controller that will handle
the cases where that assumption is not true, and I would not want
to throw away the potential latency reduction just to be
100% on the safe side.

If we are on the same page now, I would implement it as stated
above.
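Roughly along these lines (the enum and names are just illustration;
the 50% for the general case is my assumption, backed by the runtime
safeguard, not a guaranteed worst case):

#include <stdint.h>

typedef enum {
    SOURCE_TYPE_ALSA_IRQ,   /* interrupt-driven alsa */
    SOURCE_TYPE_ALSA_TIMER, /* timer-based alsa */
    SOURCE_TYPE_OTHER       /* bluetooth, anything else */
} source_type_t;

/* Assumed maximum source buffer fill level per device type. */
uint64_t max_source_fill(source_type_t type,
                         uint64_t configured_source_latency,
                         unsigned fragments) {
    switch (type) {
    case SOURCE_TYPE_ALSA_IRQ:
        /* one default-fragment-size */
        return fragments > 0 ? configured_source_latency / fragments
                             : configured_source_latency;
    case SOURCE_TYPE_ALSA_TIMER:
    case SOURCE_TYPE_OTHER:
    default:
        /* assume 50% of the configured latency */
        return configured_source_latency / 2;
    }
}

The sink side would always count as 100% of its configured latency.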

