Re: [PATCH v2] ARM: staging: bcm2835-audio: interpolate audio delay

2019-01-01 Thread Mike Brady
Apologies for the delay coming back to this.

Having had a long look at the further implications of this proposed change, I 
would like to withdraw the suggested patch.

I agree that is would seem excessive to have to change the PCM core to 
accommodate it. Furthermore, the idea only works if is is legitimate to assume 
that frames are smoothly consumed in the interpolation interval. If that 
assumption were not to hold for some reason, then the delay reported would be 
wrong.

Thanks to everyone for their comments and inputs.

Regards
Mike



> On 13 Nov 2018, at 16:50, Takashi Iwai  wrote:
> 
> On Sun, 11 Nov 2018 19:21:29 +0100,
> Mike Brady wrote:
>> 
>> /* hardware definition */
>> static const struct snd_pcm_hardware snd_bcm2835_playback_hw = {
>>  .info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER |
>>   SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
>> - SNDRV_PCM_INFO_DRAIN_TRIGGER | SNDRV_PCM_INFO_SYNC_APPLPTR),
>> + SNDRV_PCM_INFO_BATCH | SNDRV_PCM_INFO_DRAIN_TRIGGER |
>> + SNDRV_PCM_INFO_SYNC_APPLPTR),
> 
> As already mentioned, the addition of SNDRV_PCM_INFO_BATCH should be
> irrelevant with this change.  Please create another patch to add this
> and clarify it in the changelog.
> 
>> diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
>> index 4e6110d778bd..574df7d7a1fa 100644
>> --- a/sound/core/pcm_lib.c
>> +++ b/sound/core/pcm_lib.c
>> @@ -229,19 +229,38 @@ static void update_audio_tstamp(struct 
>> snd_pcm_substream *substream,
>>  (runtime->audio_tstamp_report.actual_type ==
>>  SNDRV_PCM_AUDIO_TSTAMP_TYPE_DEFAULT)) {
>> 
>> -/*
>> - * provide audio timestamp derived from pointer position
>> - * add delay only if requested
>> - */
>> +// provide audio timestamp derived from pointer position
>> 
>>  audio_frames = runtime->hw_ptr_wrap + runtime->status->hw_ptr;
>> 
>> -if (runtime->audio_tstamp_config.report_delay) {
>> +/*
>> + * If the runtime->delay is greater than zero, it's a
>> + * genuine delay, e.g. a delay due to a hardware fifo.
>> + *
>> + * But if the runtime->delay is less than zero, it's an
>> + * interpolated estimate of the number of frames output
>> + * since the hardware pointer was last updated.
>> + *
>> + * It would be calculated in the pointer callback.
>> + * For example, for the bcm_2835 driver, it is calculated in
>> + * snd_bcm2835_pcm_pointer().
>> + */
>> +
>> +if (runtime->delay < 0) {
>> +// The delay is an interpolated estimate...
>>  if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
>> -audio_frames -=  runtime->delay;
>> -else
>> -audio_frames +=  runtime->delay;
>> +audio_frames += runtime->delay;
>> +} else {
>> +// The delay is a real delay. Add it if requested.
>> +if (runtime->audio_tstamp_config.report_delay) {
>> +if (substream->stream ==
>> +SNDRV_PCM_STREAM_PLAYBACK)
>> +audio_frames -=  runtime->delay;
>> +else
>> +audio_frames +=  runtime->delay;
>> +}
>>  }
> 
> Well, honestly speaking, I'm really not fond of changing the PCM core
> just for this.
> 
> Can we make bcm audio driver providing the finer pointer update
> instead?  If we have a module option to disable that behavior, it's an
> enough excuse in case anyone really cares about the accuracy.
> 
> 
> thanks,
> 
> Takashi

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-11-11 Thread Mike Brady


> On 6 Nov 2018, at 21:31, Takashi Iwai  wrote:
> 
> On Tue, 06 Nov 2018 22:05:11 +0100,
> Mike Brady wrote:
>> 
>> 
>>> On 5 Nov 2018, at 16:11, Takashi Iwai  wrote:
>>> 
>>> On Mon, 05 Nov 2018 16:57:07 +0100,
>>> Mike Brady wrote:
>>>> 
>>>>> One another thing I'd like to point out is that the value given in the
>>>>> patch is nothing but an estimated position, optimistically calculated
>>>>> via the system timer.  Mike and I had already discussion in another
>>>>> thread, and another possible option would be to provide the proper
>>>>> timestamp-vs-hwptr pair, instead of updating the timestamp always at
>>>>> the status read.
>>>> 
>>>> Agreed — that would give the caller the information needed to do the
>>>> interpolation for themselves if desired.
>>> 
>>> And now I wonder whether the problem is still present with the latest
>>> code.  There was a (kind of) regression in this regard when we
>>> introduced the fine-grained hardware timestamping, but it should have
>>> been addressed by the commit 20e3f985bb875fea4f86b04eba4b6cc29bfd6b71
>>>   ALSA: pcm: update tstamp only if audio_tstamp changed
>>> 
>>> Could you double-check whether the tstamp field gets still updated
>>> even if no hwptr (and delay) is changed?
>> 
>> Yes, this could be a bit problematic. The function update_audio_tstamp in 
>> pcm_lib.c could include the interpolated delay in the calculation of 
>> audio_tstamp, and hence
>> could trigger the update of tstamp.
> 
> Well, my question is about the current driver as-is.
> It has no runtime->delay, so far, hence audio_tstamp is calculated
> only from the hwptr position.  As the corresponding tstamp gets
> updated only when the audio_tstamp (i.e. hwptr) is updated, the driver
> should provide the consistent pair of audio_tstamp (i.e. hwptr) vs
> tstamp.

Fair enough — I had been thinking about the situation with the patch in place.

>> Another issue, as I see it, is that the the audio_tstamp value would depend 
>> on whether, and when, a snd_pcm_delay() call (which recalculates the 
>> interpolation and puts it into the delay field) was made immediately prior 
>> to it. By zeroing the delay when a GPU interrupt occurs, you could be 
>> certain that the interpolated delay would be less than or equal to the true 
>> delay, but this doesn’t seem very satisfactory — you have neither the 
>> timestamp of the last update nor the correctly interpolated timestamp. 
> 
> No, audio_stamp field is updated at snd_pcm_period_elapsed() call as
> well as tstamp field.

That's great.

> Basically the driver provides three things: hwptr, tstamp and
> audio_tstamp.  For the default configuration (like bcm audio does),
> audio_tstamp is calculated from hwptr, so it can be seen as the hwptr
> represented in timespec.  OTOH, tstamp is the actual system time that
> is updated only when audio_tstamp changes -- which means tstamp gets
> updated *only* at snd_pcm_period_elapsed() call on bcm audio.
> 
> And, my point is that you should be able to interpolate the actual
> position in user-space side based on these information; it doesn't
> have to be done in the kernel at all.

That is true, of course. The problem is that the snd_pcm_delay() call is so 
inaccurate though.

>> Sadly, therefore, I’m now of the view that this approach to interpolating 
>> the delay between GPU interrupts is not really viable. Would that be your 
>> view?

> Actually there were some bugs in the past that the tstamp was updated
> at each snd_pcm_status(), but it should have been fixed in the recent
> kernels.  That's why I asked to re-check the current status.

Yes, as far as I can see, that's fixed. 

In further testing, however, I noticed that the audio_frames calculation in 
update_audio_tstamp() in pcm_lib.c didn't include the delay,
so now it does if the delay field is negative, which it is "naturally" in this 
case. With that change, the delay reported by snd_pcm_delay() and
calculated as you referred to above are consistent.

So, overall, I am happier that this approach is at least viable. But two issues 
remain, in my view:
First, is it "a good idea"?
Second, the delay field is now being used as a delay if its positive and an 
interpolation if it's negative. It works, but would it be better to have an 
extra "interpolation" field?

I'll post the updated patch shortly.

Thanks,
Mike

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-11-06 Thread Mike Brady

> On 5 Nov 2018, at 16:11, Takashi Iwai  wrote:
> 
> On Mon, 05 Nov 2018 16:57:07 +0100,
> Mike Brady wrote:
>> 
>>> One another thing I'd like to point out is that the value given in the
>>> patch is nothing but an estimated position, optimistically calculated
>>> via the system timer.  Mike and I had already discussion in another
>>> thread, and another possible option would be to provide the proper
>>> timestamp-vs-hwptr pair, instead of updating the timestamp always at
>>> the status read.
>> 
>> Agreed — that would give the caller the information needed to do the
>> interpolation for themselves if desired.
> 
> And now I wonder whether the problem is still present with the latest
> code.  There was a (kind of) regression in this regard when we
> introduced the fine-grained hardware timestamping, but it should have
> been addressed by the commit 20e3f985bb875fea4f86b04eba4b6cc29bfd6b71
>ALSA: pcm: update tstamp only if audio_tstamp changed
> 
> Could you double-check whether the tstamp field gets still updated
> even if no hwptr (and delay) is changed?

Yes, this could be a bit problematic. The function update_audio_tstamp in 
pcm_lib.c could include the interpolated delay in the calculation of 
audio_tstamp, and hence
could trigger the update of tstamp.

Another issue, as I see it, is that the the audio_tstamp value would depend on 
whether, and when, a snd_pcm_delay() call (which recalculates the interpolation 
and puts it into the delay field) was made immediately prior to it. By zeroing 
the delay when a GPU interrupt occurs, you could be certain that the 
interpolated delay would be less than or equal to the true delay, but this 
doesn’t seem very satisfactory — you have neither the timestamp of the last 
update nor the correctly interpolated timestamp. 

Sadly, therefore, I’m now of the view that this approach to interpolating the 
delay between GPU interrupts is not really viable. Would that be your view?

Regards
Mike

> thanks,
> 
> Takashi

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-11-05 Thread Mike Brady
Thanks for the comments and suggestions.

> On 25 Oct 2018, at 08:37, Takashi Iwai  wrote:
> 
> Well, in the API POV, it's nothing wrong to keep hwptr sticking while
> updating only delay value.  It implies that the hardware chip doesn't
> provide the hwptr update.

As I understand it, this driver stages settings, data and status information 
for the true audio driver which is part of VideoCore (VC). The driver 
communicates with the VC by sending messages. Responses come back in 
asynchronous callbacks. There doesn’t seem to be any other source of data or 
status.

When parameters such as frame format, rate and period size have been set up, 
the VC executes periodic callbacks to retrieve period-sized buffers of data. At 
44,100 frames per second and with standard 444-frame periods, callbacks occur 
approximately every 10.07 milliseconds.

> Though, usually the delay value is provided also from the hardware,
> e.g. reading the link position or such.  It's a typical case like
> USB-audio, where the hwptr gets updated and the delay indicates the
> actual position *behind* hwptr.  That works because hwptr shows the
> position in the ring buffer at which you can access the data.  And it
> doesn't mean that hwptr is the actually playing position, but it can
> be ahead of the current position, when many samples are queued on
> FIFO.  The delay is provided to correct the position back to the
> actual point.

 The information that the alsa snd_pcm_delay() function depends on is updated 
during these callbacks. Thus, a user program monitoring the snc_pcm_delay() 
value closely will see sudden jumps in the value every 10 milliseconds or so — 
a 10 millisecond jitter. 
> 
> But, this also doesn't mean that the delay shouldn't be used for the
> purpose like this patchset, either.  OTOH, providing a finer hwptr
> value would be likely more apps-friendly; there must be many programs
> that don't evaluate the delay value properly.

With interpolation, the number of frames that would have been output from the 
time of the last callback is subtracted from the delay to give a more accurate 
estimate of the actual delay at the time it is requested.

> So, I suppose that hwptr update might be a better option if the code
> won't become too complex.  Let's see.

Having looked at the code, there does not seem to be a good way to avoid 
interpolation. Later versions of the interface include a message type of 
VC_AUDIO_MSG_TYPE_LATENCY (see 
https://github.com/raspberrypi/firmware/blob/master/opt/vc/include/interface/vmcs_host/vc_vchi_audioserv_defs.h#L158)
 which seems to be a request to return the latency. However, the latency would 
be returned in an asynchronous callback (see function audio_vchi_callback in 
bcm2835-vchiq.c). One can wait for the result, but it seems that it could take 
up to 10 milliseconds (see function bcm2835_audio_send_msg_locked in 
bcm2835-vchiq.c). This is hardly tolerable, and to avoid it, one would have to 
store both the latency returned and the time the request was sent (or the time 
the reply was returned — it’s not clear which would be correct) and interpolate 
from that to the time the delay is requested. In other words, from the point of 
view of avoiding interpolation, this is likely to be no better than the present 
suggestion. There wold also be a need to make the latency request periodically, 
adding to the overhead.

Without getting a good deal more information about the VC, which may not be 
available, I’m afraid I can’t see a way of getting a better fix on the 
instantaneous values of pointers such as the hw_ptr. BTW, I have not been able 
to find a source for the file vc_vchi_audioserv_defs.h, which looks like a 
Broadcom file and which appears to have two  versions. If anyone could point me 
to the source, I’d be grateful.

> One another thing I'd like to point out is that the value given in the
> patch is nothing but an estimated position, optimistically calculated
> via the system timer.  Mike and I had already discussion in another
> thread, and another possible option would be to provide the proper
> timestamp-vs-hwptr pair, instead of updating the timestamp always at
> the status read.

Agreed — that would give the caller the information needed to do the 
interpolation for themselves if desired.

> Maybe it's worth to have a module option to suppress this optimistic
> hwptr update behavior, in case something went wrong with clocks?

Following this suggestion, I have updated the patch to include a module 
parameter ‘enable_delay_interpolation’, and I will post that later for 
consideration.

Regards
Mike


> thanks,
> 
> Takashi

> ___
> Alsa-devel mailing list
> alsa-de...@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-10-28 Thread Mike Brady


> On 25 Oct 2018, at 08:37, Takashi Iwai  wrote:
> 
> On Thu, 25 Oct 2018 00:02:34 +0200,
> Kirill Marinushkin wrote:
>> 
 When you play sound - the pointer increments.
>>> 
>>> Unfortunately, when you play sound, the pointer does not actually 
>>> increment, for up to about 10 milliseconds. I know of no way to actually 
>>> access the true “live” position of the frame that is being played at any 
>>> instant; hence the desire to estimate it.
>>> 
>> 
>> Your vision of situation in the opposite from my vision. What you see as a
>> symptom - I see as a root cause. As I see, you should fix the
>> pointer-not-incrementing. Why do you think that it's okay that the pointer is
>> not updating during sound play? Why do you insist that there is a delay? I 
>> don't
>> understand why we are so stuck here.
> 
> Well, in the API POV, it's nothing wrong to keep hwptr sticking while
> updating only delay value.  It implies that the hardware chip doesn't
> provide the hwptr update.
> 
> Though, usually the delay value is provided also from the hardware,
> e.g. reading the link position or such.  It's a typical case like
> USB-audio, where the hwptr gets updated and the delay indicates the
> actual position *behind* hwptr.  That works because hwptr shows the
> position in the ring buffer at which you can access the data.  And it
> doesn't mean that hwptr is the actually playing position, but it can
> be ahead of the current position, when many samples are queued on
> FIFO.  The delay is provided to correct the position back to the
> actual point.
> 
> But, this also doesn't mean that the delay shouldn't be used for the
> purpose like this patchset, either.  OTOH, providing a finer hwptr
> value would be likely more apps-friendly; there must be many programs
> that don't evaluate the delay value properly.
> 
> So, I suppose that hwptr update might be a better option if the code
> won't become too complex.  Let's see.

Indeed. It will take me a few days to look into this…

Regards
Mike

> One another thing I'd like to point out is that the value given in the
> patch is nothing but an estimated position, optimistically calculated
> via the system timer.  Mike and I had already discussion in another
> thread, and another possible option would be to provide the proper
> timestamp-vs-hwptr pair, instead of updating the timestamp always at
> the status read.
> 
> Maybe it's worth to have a module option to suppress this optimistic
> hwptr update behavior, in case something went wrong with clocks?
> 
> 
> thanks,
> 
> Takashi

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-10-28 Thread Mike Brady
Hi Kirill. Thanks for the post.
Mike

> On 25 Oct 2018, at 18:20, Kirill Marinushkin  wrote:
> 
> Hello Takashi, Mike,
> 
> @Takashi
> 
> On 10/25/18 09:37, Takashi Iwai wrote:
>> Well, in the API POV, it's nothing wrong to keep hwptr sticking while
>> updating only delay value.  It implies that the hardware chip doesn't
>> provide the hwptr update.
> 
> Thank you for the clarification. Modifying `runtime->delay` from the `pointer`
> function looked wrong for me. Now I understand the motivation and the 
> use-case.
> I will be more careful when analyzing the code which doesn't fit my 
> expectations.
> 
> @Mike
> 
> I was wrong. You can ignore my comments. Please don't take them personal: it's
> all about having high-quality code in kernel.
> 
> Best Regards,
> Kirill
> ___
> Alsa-devel mailing list
> alsa-de...@alsa-project.org
> http://mailman.alsa-project.org/mailman/listinfo/alsa-devel

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-10-24 Thread Mike Brady
Hi Kirill. Thanks again for your comments.

> On 24 Oct 2018, at 19:06, Kirill Marinushkin  wrote:
> 
> Hello Mike,
> 
> On 10/24/18 10:20, Mike Brady wrote:
>> Hi Kirill. Thanks for your comments.
>> 
>>> On 22 Oct 2018, at 23:25, Kirill Marinushkin  
>>> wrote:
>>> 
>>> AFAIU, this patch is wrong. Please correct me, maybe I misunderstand 
>>> something.
>>> 
>>>> The problem that this patch seeks to resolve is that when userland asks for
>>>> the delay
>>> 
>>> The userspace asks not for delay, but for the pointer.
>> 
>> The call in question is snd_pcm_delay. I presume this delay is calculated 
>> from knowledge of the stream position “pos", the period (buffer?) number 
>> (and period/buffer size) and the snd_pcm_runtime structure’s “delay" field 
>> (“runtime->delay”).
>> 
> 
> 
> In kernel, this delay is calculated in `snd_pcm_calc_delay()`, defined at
> `sound/core/pcm_native.c:889`. If you analyze how this calculation is done, 
> you
> will see that the returned value is a summary of the following fields:
> 
> * runtime->status->hw_ptr
> * runtime->buffer_size
> * runtime->control->appl_ptr
> * runtime->boundary
> * runtime->delay

That’s very useful, thanks.

>>> You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here 
>>> you are
>>> supposed to increase `alsa_stream->pos` with the proper offset. Instead, you
>>> imitate a delay, but in fact the delay is not increased.
>>> 
>>> So, the proper solution should be to fix the reported pointer.
>> 
>> I think there is a difficulty with this. The “pos” pointer looks to have to 
>> be modulo the buffer size. This causes a problem, as I see it, in that if 
>> the calculated (pos + interpolated delay in bytes) is longer than the buffer 
>> size
> 
> 
> There is no "interpolated delay". The concept of "interpolated delay" is
> incorrect.

Yes, my language here is wrong. What I mean is the estimated number of frames 
output since the pointer was last updated — let’s call it the `interpolated 
frame count`.

> When you play sound - the pointer increments.

Unfortunately, when you play sound, the pointer does not actually increment, 
for up to about 10 milliseconds. I know of no way to actually access the true 
“live” position of the frame that is being played at any instant; hence the 
desire to estimate it.

What actually seems to be happening is that when `bcm2835_playback_fifo` is 
called, the pointer is updated, but as frames are individually output to the 
DAC, this pointer does not increment. It is not updated until the next time 
`bcm2835_playback_fifo` is called.

> But in this commit you increment the delay, as if sound doesn't play.

It is true that the patch does make use of the  snd_pcm_runtime structure’s 
“delay" field (aka "runtime->delay” here). That field is defined for: “/* extra 
delay; typically FIFO size */”. Clearly it is not being used for that here — it 
is being used simply because it is part of the calculation done in 
snd_pcm_calc_delay(), as you point out. At present, it looks like that field 
isn’t being used –– it’s set to zero –– and not modified anywhere else in the 
driver, AFAICS. If it was necessary, it would be a simple matter to preserve 
whatever value it was given.

>>> As a result,
>>> userspace will recieve the correct delay, instead of these crazy 10 ms.
>> 
>> Just to point out that with the proposed patch, it appears that the correct 
>> delay is being reported, (apart, possibly, from any delay originally set in 
>> the snd_pcm_delay field, as mentioned above).
> 
> 
> Then I would like to point out the alsa-lib function `snd_pcm_avail()` - it 
> will return the wrong value.

It is already the case that snd_pcm_avail() does not return the true delay. The 
ALSA documentation states: "The value returned by that call [i.e. the 
snd_pcm_avail*() functions] is not directly related to the delay…” 

Kind regards
Mike

>> 
>>> FYI, there is
>>>> a discussion of the effects of a downstream equivalent of this suggested 
>>>> patch
>>>> at:
>>>> https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.
>>> 
>>> Thank you for the link, it clarified for me what you try to achieve.
>>> 
>>> On 10/22/18 21:17, Mike Brady wrote:
>>>> When the BCM2835 audio output is used, userspace sees a jitter up to 10ms
>>>> in the audio position, aka "delay" -- the number of frames that must
>>>> be output before a new frame would be played.
&g

Re: [alsa-devel] [PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-10-24 Thread Mike Brady
Hi Kirill. Thanks for your comments.

> On 22 Oct 2018, at 23:25, Kirill Marinushkin  wrote:
> 
> AFAIU, this patch is wrong. Please correct me, maybe I misunderstand 
> something.
> 
>> The problem that this patch seeks to resolve is that when userland asks for
>> the delay
> 
> The userspace asks not for delay, but for the pointer.

The call in question is snd_pcm_delay. I presume this delay is calculated from 
knowledge of the stream position “pos", the period (buffer?) number (and 
period/buffer size) and the snd_pcm_runtime structure’s “delay" field 
(“runtime->delay”).

> You modify the function, which is called `snd_bcm2835_pcm_pointer`. Here you 
> are
> supposed to increase `alsa_stream->pos` with the proper offset. Instead, you
> imitate a delay, but in fact the delay is not increased.
> 
> So, the proper solution should be to fix the reported pointer.

I think there is a difficulty with this. The “pos” pointer looks to have to be 
modulo the buffer size. This causes a problem, as I see it, in that if the 
calculated (pos + interpolated delay in bytes) is longer than the buffer size, 
it must be wrapped, but AFAIK we are unable to increment a buffer index used in 
the snd_pcm_delay calculation. Hence the calculation of the actual position 
would be wrong. This is why the snd_pcm_runtime delay field is used. On 
reflection, BTW, the patch assumes that the field's original value was zero — 
that can be rectified.

> As a result,
> userspace will recieve the correct delay, instead of these crazy 10 ms.

Just to point out that with the proposed patch, it appears that the correct 
delay is being reported, (apart, possibly, from any delay originally set in the 
snd_pcm_delay field, as mentioned above).

All the best,
Mike

> FYI, there is
>> a discussion of the effects of a downstream equivalent of this suggested 
>> patch
>> at:
>> https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.
> 
> Thank you for the link, it clarified for me what you try to achieve.
> 
> On 10/22/18 21:17, Mike Brady wrote:
>> When the BCM2835 audio output is used, userspace sees a jitter up to 10ms
>> in the audio position, aka "delay" -- the number of frames that must
>> be output before a new frame would be played.
>> Make this a bit nicer for userspace by interpolating the position
>> using the CPU clock.
>> The overhead is small -- an extra ktime_get() every time a GPU message
>> is sent -- and another call and a few calculations whenever the delay
>> is sought from userland.
>> At 48,000 frames per second, i.e. approximately 20 microseconds per
>> frame, it would take a clock inaccuracy of
>> 20 microseconds in 10 milliseconds -- 2,000 parts per million --
>> to result in an inaccurate estimate, whereas
>> crystal- or resonator-based clocks typically have an
>> inaccuracy of 10s to 100s of parts per million.
>> 
>> Signed-off-by: Mike Brady 
>> ---
>> Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag
>> 
>> .../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++
>> .../vc04_services/bcm2835-audio/bcm2835.h |  1 +
>> 2 files changed, 21 insertions(+)
>> 
>> diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c 
>> b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
>> index e66da11af5cf..9053b996cada 100644
>> --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
>> +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
>> @@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream 
>> *alsa_stream,
>>  atomic_set(_stream->pos, pos);
>> 
>>  alsa_stream->period_offset += bytes;
>> +alsa_stream->interpolate_start = ktime_get();
>>  if (alsa_stream->period_offset >= alsa_stream->period_size) {
>>  alsa_stream->period_offset %= alsa_stream->period_size;
>>  snd_pcm_period_elapsed(substream);
>> @@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct 
>> snd_pcm_substream *substream)
>>  atomic_set(_stream->pos, 0);
>>  alsa_stream->period_offset = 0;
>>  alsa_stream->draining = false;
>> +alsa_stream->interpolate_start = ktime_get();
>> 
>>  return 0;
>> }
>> @@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream 
>> *substream)
>> {
>>  struct snd_pcm_runtime *runtime = substream->runtime;
>>  struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
>> +ktime_t now = ktime_get();
>> +
>> +/* Give userspace better dela

[PATCH v2] staging: bcm2835-audio: interpolate audio delay

2018-10-22 Thread Mike Brady
When the BCM2835 audio output is used, userspace sees a jitter up to 10ms
in the audio position, aka "delay" -- the number of frames that must
be output before a new frame would be played.
Make this a bit nicer for userspace by interpolating the position
using the CPU clock.
The overhead is small -- an extra ktime_get() every time a GPU message
is sent -- and another call and a few calculations whenever the delay
is sought from userland.
At 48,000 frames per second, i.e. approximately 20 microseconds per
frame, it would take a clock inaccuracy of
20 microseconds in 10 milliseconds -- 2,000 parts per million --
to result in an inaccurate estimate, whereas
crystal- or resonator-based clocks typically have an
inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady 
---
Changes in v2 -- remove inappropriate addition of SNDRV_PCM_INFO_BATCH flag

 .../vc04_services/bcm2835-audio/bcm2835-pcm.c | 20 +++
 .../vc04_services/bcm2835-audio/bcm2835.h |  1 +
 2 files changed, 21 insertions(+)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c 
b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
index e66da11af5cf..9053b996cada 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
@@ -74,6 +74,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream 
*alsa_stream,
atomic_set(_stream->pos, pos);
 
alsa_stream->period_offset += bytes;
+   alsa_stream->interpolate_start = ktime_get();
if (alsa_stream->period_offset >= alsa_stream->period_size) {
alsa_stream->period_offset %= alsa_stream->period_size;
snd_pcm_period_elapsed(substream);
@@ -243,6 +244,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream 
*substream)
atomic_set(_stream->pos, 0);
alsa_stream->period_offset = 0;
alsa_stream->draining = false;
+   alsa_stream->interpolate_start = ktime_get();
 
return 0;
 }
@@ -292,6 +294,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream 
*substream)
 {
struct snd_pcm_runtime *runtime = substream->runtime;
struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
+   ktime_t now = ktime_get();
+
+   /* Give userspace better delay reporting by interpolating between GPU
+* notifications, assuming audio speed is close enough to the clock
+* used for ktime
+*/
+
+   if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
+   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
+   u64 interval =
+   (ktime_to_ns(ktime_sub(now,
+   alsa_stream->interpolate_start)));
+   u64 frames_output_in_interval =
+   div_u64((interval * runtime->rate), 10);
+   snd_pcm_sframes_t frames_output_in_interval_sized =
+   -frames_output_in_interval;
+   runtime->delay = frames_output_in_interval_sized;
+   }
 
return snd_pcm_indirect_playback_pointer(substream,
_stream->pcm_indirect,
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h 
b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h
index e13435d1c205..595ad584243f 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h
@@ -78,6 +78,7 @@ struct bcm2835_alsa_stream {
unsigned int period_offset;
unsigned int buffer_size;
unsigned int period_size;
+   ktime_t interpolate_start;
 
struct bcm2835_audio_instance *instance;
int idx;
-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH] staging: bcm2835-audio: interpolate audio delay

2018-10-22 Thread Mike Brady
Hi Kirill.

The problem that this patch seeks to resolve is that when userland asks for the
delay -- the time to hear -- the driver responds with a figure that is only 
updated
when a GPU interrupt occurs, now approximately every 10 milliseconds.

As far as I am aware, there is no way to get a more up-to-date delay number on 
request,
hence the present patch to interpolate based on the time of the last update, 
the time of
the request and the nominal frame rate.

It does have the drawback of relying on the accuracy of the CPU clock and on the
nominal rate but on reasonable assumptions, it will not be in error over
the approximately 10 ms interval.

As you'll probably know, this is the built-in DAC in the full-sized Raspberry 
Pi devices and
is widely used. As a developer of Shairport Sync, I have noticed that the 
"slop" in
delay on the Pi's built-in DAC is very large by comparison with other DACs. 
FYI, there is
a discussion of the effects of a downstream equivalent of this suggested patch 
at:
https://github.com/raspberrypi/firmware/issues/1026#issuecomment-415746016.

Best wishes
Mike

On Thu, Oct 18, 2018 at 09:48:36PM +0200, Kirill Marinushkin wrote:
> Hello Mike,
> 
> On 10/18/18 12:57, Mike Brady wrote:
> > +   ktime_t now = ktime_get();
> > +
> > +   /* Give userspace better delay reporting by interpolating between GPU
> > +* notifications, assuming audio speed is close enough to the clock
> > +* used for ktime
> > +*/
> > +
> > +   if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
> > +   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
> > +   u64 interval =
> > +   (ktime_to_ns(ktime_sub(now,
> > +   alsa_stream->interpolate_start)));
> > +   u64 frames_output_in_interval =
> > +   div_u64((interval * runtime->rate), 10);
> > +   snd_pcm_sframes_t frames_output_in_interval_sized =
> > +   -frames_output_in_interval;
> > +   runtime->delay = frames_output_in_interval_sized;
> > +   }
> 
> This doesn't look like a good solution for me. More like a workaround. What is
> the root cause of the delay?
> 
> Best Regards,
> Kirill
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH] staging: bcm2835-audio: interpolate audio delay

2018-10-18 Thread Mike Brady
When the BCM2835 audio output is used, userspace sees a jitter up to 10ms
in the audio position, aka "delay" -- the number of frames that must
be output before a new frame would be played.
Make this a bit nicer for userspace by interpolating the position
using the CPU clock.
The overhead is small -- an extra ktime_get() every time a GPU message
is sent -- and another call and a few calculations whenever the delay
is sought from userland.
At 48,000 frames per second, i.e. approximately 20 microseconds per
frame, it would take a clock inaccuracy of
20 microseconds in 10 milliseconds -- 2,000 parts per million --
to result in an inaccurate estimate, whereas
crystal- or resonator-based clocks typically have an
inaccuracy of 10s to 100s of parts per million.

Signed-off-by: Mike Brady 
---
 .../vc04_services/bcm2835-audio/bcm2835-pcm.c | 23 ++-
 .../vc04_services/bcm2835-audio/bcm2835.h |  1 +
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c 
b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
index e66da11af5cf..250adf82385a 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c
@@ -12,7 +12,8 @@
 static const struct snd_pcm_hardware snd_bcm2835_playback_hw = {
.info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER |
 SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID |
-SNDRV_PCM_INFO_DRAIN_TRIGGER | SNDRV_PCM_INFO_SYNC_APPLPTR),
+SNDRV_PCM_INFO_BATCH | SNDRV_PCM_INFO_DRAIN_TRIGGER |
+SNDRV_PCM_INFO_SYNC_APPLPTR),
.formats = SNDRV_PCM_FMTBIT_U8 | SNDRV_PCM_FMTBIT_S16_LE,
.rates = SNDRV_PCM_RATE_CONTINUOUS | SNDRV_PCM_RATE_8000_48000,
.rate_min = 8000,
@@ -74,6 +75,7 @@ void bcm2835_playback_fifo(struct bcm2835_alsa_stream 
*alsa_stream,
atomic_set(_stream->pos, pos);
 
alsa_stream->period_offset += bytes;
+   alsa_stream->interpolate_start = ktime_get();
if (alsa_stream->period_offset >= alsa_stream->period_size) {
alsa_stream->period_offset %= alsa_stream->period_size;
snd_pcm_period_elapsed(substream);
@@ -243,6 +245,7 @@ static int snd_bcm2835_pcm_prepare(struct snd_pcm_substream 
*substream)
atomic_set(_stream->pos, 0);
alsa_stream->period_offset = 0;
alsa_stream->draining = false;
+   alsa_stream->interpolate_start = ktime_get();
 
return 0;
 }
@@ -292,6 +295,24 @@ snd_bcm2835_pcm_pointer(struct snd_pcm_substream 
*substream)
 {
struct snd_pcm_runtime *runtime = substream->runtime;
struct bcm2835_alsa_stream *alsa_stream = runtime->private_data;
+   ktime_t now = ktime_get();
+
+   /* Give userspace better delay reporting by interpolating between GPU
+* notifications, assuming audio speed is close enough to the clock
+* used for ktime
+*/
+
+   if ((ktime_to_ns(alsa_stream->interpolate_start)) &&
+   (ktime_compare(alsa_stream->interpolate_start, now) < 0)) {
+   u64 interval =
+   (ktime_to_ns(ktime_sub(now,
+   alsa_stream->interpolate_start)));
+   u64 frames_output_in_interval =
+   div_u64((interval * runtime->rate), 10);
+   snd_pcm_sframes_t frames_output_in_interval_sized =
+   -frames_output_in_interval;
+   runtime->delay = frames_output_in_interval_sized;
+   }
 
return snd_pcm_indirect_playback_pointer(substream,
_stream->pcm_indirect,
diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h 
b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h
index e13435d1c205..595ad584243f 100644
--- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h
+++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835.h
@@ -78,6 +78,7 @@ struct bcm2835_alsa_stream {
unsigned int period_offset;
unsigned int buffer_size;
unsigned int period_size;
+   ktime_t interpolate_start;
 
struct bcm2835_audio_instance *instance;
int idx;
-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel