Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)

Andrew Fish via groups.io Fri, 16 Apr 2021 07:34:37 -0700


> On Apr 16, 2021, at 6:22 AM, Marvin Häuser <mhaeu...@posteo.de> wrote:
> 
> Good day,
> 
> Sorry for the nitpicking.
> 
> - Protocols always need a "Revision" field as first member. This is used to 
> be able to expand its capabilities in later revisions without introducing a 
> new, distinct protocol.
> - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause 
> confusion if input is ever added. Input in my opinion should be a separate 
> protocol as there is no reason why they would necessarily be coupled 
> topology-wise (think of an USB microphone, it will never have any sort of 
> output).
> - To make code safety a bit easier, try to use "CONST" for "IN" (non-OUT) 
> pointers, so that CONST can be propagated where possible.
> - Please do *not* make the events caller-owned. We had it multiple times 
> already on production firmware that events are left dangling and may be 
> polled/signaled after ExitBS(). The caller should be able to decide on some 
> policy maybe (i.e. abort or block on ExitBS() until the playback finished), 
> as cut-off audio may be awkward; but the callee definitely should implement 
> "event safety" itself. Maybe avoid exposing events directly at all and 
> provide nice abstractions the caller cannot misuse.
> - I don't think audio should be required at all, the required subset should 
> firstly consider minimalism and security. Accessibility will not be of 
> concern for some IoT device, the audio code would simply eat space, and 
> introduce a larger surface for bugs.
>


Marvin,

Generally how we work this in the UEFI Specification is we make it optional via 
the following wording: “If a platform includes the ability to play audio in EFI 
then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL must be implemented. 

Basically this requirement will get added to UEFI Specification 2.6.2 
Platform-Specific Elements.

Thanks,

Andrew Fish

> Best regards,
> Marvin
> 
> On 16.04.21 01:42, Ethin Probst wrote:
>> Hi Andrew,
>> 
>> What would that protocol interface look like if we utilized your idea?
>> With mine (though I need to add channel mapping as well), your
>> workflow for playing a stereo sound from left to right would probably
>> be something like this:
>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>> 2) Place the Wave file in the Firmware Volume using a given UUID as
>> the name. As simple as editing the platform FDF file.
>> 3) Write some BDS code
>>   a) Lookup Wave file by UUID and read it into memory.
>>   b) Decode the audio file (audio devices will not do this decoding
>> for you, you have to do that yourself).
>>   c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate
>> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit
>> PCM audio, the channel mapping, the number of samples, and the samples
>> themselves.
>>   d) call EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx()
>> for a playback event to signal.
>>   e) call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you
>> just created.
>> The reason that LoadBuffer() takes so many parameters is because the
>> device does not know the audio that your passing in. If I'm given an
>> array of 16-bit audio samples, its impossible to know the parameters
>> (sample rate, sample format, channel mapping, etc.) from that alone.
>> Using your idea, though, my protocol could be greatly simplified.
>> Forcing a particular channel mapping, sample rate and sample format on
>> everyone would complicate application code. From an application point
>> of view, one would, with that type of protocol, need to do the
>> following:
>> 1) Load an audio file in any audio file format from any storage mechanism.
>> 2) Decode the audio file format to extract the samples and audio metadata.
>> 3) Resample the (now decoded) audio samples and convert (quantize) the
>> audio samples into signed 16-bit PCM audio.
>> 4) forward the samples onto the EFI audio protocol.
>> There is another option. (I'm happy we're discussing this now -- we
>> can hammer out all the details now which will make a lot of things
>> easier.) Since I'll most likely end up splitting the device-specific
>> interfaces to different audio protocols, we could make a simple audio
>> protocol that makes various assumptions about the audio samples your
>> giving it (e.g.: sample rate, format, ...). This would just allow
>> audio output and input in signed 16-bit PCM audio, as you've
>> suggested, and would be a simple and easy to use interface. Something
>> like:
>> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL {
>>   EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset;
>>   EFI_SIMPLE_AUDIO_PROTOCOL_START Start;
>>   EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop;
>> } EFI_SIMPLE_AUDIO_PROTOCOL;
>> This way, users and driver developers have a simple audio protocol
>> they can implement if they like. It would assume signed 16-bit PCM
>> audio and mono channel mappings at 44100 Hz. Then, we can have an
>> advanced protocol for each device type (HDA, USB, VirtIO, ...) that
>> exposes all the knobs for sample formats, sample rates, that kind of
>> thing. Obviously, like the majority (if not all) UEFI protocols, these
>> advanced protocols would be optional. I think, however, that the
>> simple audio protocol should be a required protocol in all UEFI
>> implementations. But that might not be possible. So would this simpler
>> interface work as a starting point?
>> 
>> On 4/15/21, Andrew Fish <af...@apple.com> wrote:
>>> 
>>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst <harlydavid...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Is there any necessity for audio input and output to be implemented
>>>>> within the same protocol?  Unlike a network device (which is
>>>>> intrinsically bidirectional), it seems natural to conceptually separate
>>>>> audio input from audio output.
>>>> Nope, there isn't a necessity to make them in one, they can be
>>>> separated into two.
>>>> 
>>>>> The code controlling volume/mute may not have any access to the sample
>>>>> buffer.  The most natural implementation would seem to allow for a
>>>>> platform to notice volume up/down keypresses and use those to control the
>>>>> overall system volume, without any knowledge of which samples (if any)
>>>>> are currently being played by other code in the system.
>>>> Your assuming that the audio device your implementing the
>>>> volume/muting has volume control and muting functionality within
>>>> itself, then.
>>> Not really. We are assuming that audio hardware has a better understanding
>>> of how that system implements volume than some generic EFI Code that is by
>>> definition platform agnostic.
>>> 
>>>> This may not be the case, and so we'd need to
>>>> effectively simulate it within the driver, which isn't too hard to do.
>>>> As an example, the VirtIO driver does not have a request type for
>>>> muting or for volume control (this would, most likely, be within the
>>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore,
>>>> either the driver would have to simulate the request or return
>>>> EFI_UNSUPPORTED this instance.
>>>> 
>>> So this is an example of above since the audio hardware knows it is routing
>>> the sound output into another subsystem, and that subsystem controls the
>>> volume. So the VirtIo Sound Driver know best how to bstract volume/mute for
>>> this platform.
>>> 
>>>>> Consider also the point of view of the developer implementing a driver
>>>>> for some other piece of audio hardware that happens not to support
>>>>> precisely the same sample rates etc as VirtIO.  It would be extremely
>>>>> ugly to force all future hardware to pretend to have the same
>>>>> capabilities as VirtIO just because the API was initially designed with
>>>>> VirtIO in mind.
>>>> Precisely, but the brilliance of VirtIO
>>> The brilliance of VirtIO is that it just needs to implement a generic device
>>> driver for a given operating system. In most cases these operating systems
>>> have sounds subsystems that manage sound and want fine granularity of
>>> control on what is going on. So the drivers are implemented to maximizes
>>> flexibility since the OS has lots of generic code that deals with sound, and
>>> even user configurable knobs to control audio. In our case that extra layer
>>> does not exist in EFI and the end user code just want to tell the driver do
>>> some simple things.
>>> 
>>> Maybe it is easier to think about with an example. Lets say I want to play a
>>> cool sound on every boot. What would be the workflow to make the happen.
>>> 1) Encode the sound using a standard tool into a Wave PCM 16.
>>> 2) Place the Wave file in the Firmware Volume using a given UUID as the
>>> name. As simple as editing the platform FDF file.
>>> 3) Write some BDS code
>>>   a) Lookup Wave file by UUID and read it into memory.
>>>   b) Point the EFI Sound Protocol at the buffer with the Wave file
>>>   c) Tell the EFI Sound Protocol to play the sound.
>>> 
>>> If you start adding in a lot of perimeters that work flow starts getting
>>> really complicated really quickly.
>>> 
>>>> is that the sample rate,
>>>> sample format, &c., do not have to all be supported by a VirtIO
>>>> device. Notice, also, how in my protocol proposal I noted that the
>>>> sample rates, at least, were "recommended," not "required." Should a
>>>> device not happen to support a sample rate or sample format, all it
>>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1
>>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what
>>>> sample rates it supports, channel mappings, &c.
>>>> 
>>>> I do understand how just using a predefined sample rate and sample
>>>> format might be a good idea, and if that's the best way, then that's
>>>> what we'll do. The protocol can always be revised at a later time if
>>>> necessary. I apologize if my stance seems obstinate.
>>>> 
>>> I think if we add the version into the protocol and make sure we have a
>>> separate load and play operation we could add a member to set the extra
>>> perimeters if needed. There might also be some platform specific generic
>>> tunables that might be interesting for a future member function.
>>> 
>>> Thanks,
>>> 
>>> Andrew Fish
>>> 
>>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that a
>>>> network driver would be another good way of figuring out how async
>>>> works in UEFI.
>>>> 
>>>> On 4/15/21, Andrew Fish <af...@apple.com> wrote:
>>>>> 
>>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown <mc...@ipxe.org> wrote:
>>>>>> 
>>>>>> On 15/04/2021 06:28, Ethin Probst wrote:
>>>>>>> - I hoped to add recording in case we in future want to add
>>>>>>> accessibility aids like speech recognition (that was one of the todo
>>>>>>> tasks on the EDK2 tasks list)
>>>>>> Is there any necessity for audio input and output to be implemented
>>>>>> within
>>>>>> the same protocol?  Unlike a network device (which is intrinsically
>>>>>> bidirectional), it seems natural to conceptually separate audio input
>>>>>> from
>>>>>> audio output.
>>>>>> 
>>>>>>> - Muting and volume control could easily be added by just replacing
>>>>>>> the sample buffer with silence and by multiplying all the samples by a
>>>>>>> percentage.
>>>>>> The code controlling volume/mute may not have any access to the sample
>>>>>> buffer.  The most natural implementation would seem to allow for a
>>>>>> platform to notice volume up/down keypresses and use those to control
>>>>>> the
>>>>>> overall system volume, without any knowledge of which samples (if any)
>>>>>> are
>>>>>> currently being played by other code in the system.
>>>>>> 
>>>>> I’ve also thought of adding NVRAM variable that would let setup, UEFI
>>>>> Shell,
>>>>> or even the OS set the current volume, and Mute. This how it would be
>>>>> consumed concept is why I proposed mute and volume being separate APIs.
>>>>> The
>>>>> volume up/down API in addition to fixed percentage might be overkill, but
>>>>> it
>>>>> does allow a non liner mapping to the volume up/down keys. You would be
>>>>> surprised how picky audiophiles can be and it seems they like to file
>>>>> Bugzillas.
>>>>> 
>>>>>>> - Finally, the reason I used enumerations for specifying parameters
>>>>>>> like sample rate and stuff was that I was looking at this protocol
>>>>>>> from a general UEFI applications point of view. VirtIO supports all of
>>>>>>> the sample configurations listed in my gist, and it seems reasonable
>>>>>>> to allow the application to control those parameters instead of
>>>>>>> forcing a particular parameter configuration onto the developer.
>>>>>> Consider also the point of view of the developer implementing a driver
>>>>>> for
>>>>>> some other piece of audio hardware that happens not to support
>>>>>> precisely
>>>>>> the same sample rates etc as VirtIO.  It would be extremely ugly to
>>>>>> force
>>>>>> all future hardware to pretend to have the same capabilities as VirtIO
>>>>>> just because the API was initially designed with VirtIO in mind.
>>>>>> 
>>>>>> As a developer on the other side of the API, writing code to play sound
>>>>>> files on an arbitrary unknown platform, I would prefer to simply
>>>>>> consume
>>>>>> as simple as possible an abstraction of an audio output protocol and
>>>>>> not
>>>>>> have to care about what hardware is actually implementing it.
>>>>>> 
>>>>> It may make sense to have an API to load the buffer/stream and other APIs
>>>>> to
>>>>> play or pause. This could allow an optional API to configure how the
>>>>> stream
>>>>> is played back. If we add a version to the Protocol that would at least
>>>>> future proof us.
>>>>> 
>>>>> We did get feedback that it is very common to speed up the auto playback
>>>>> rates for accessibility. I’m not sure if that is practical with a simple
>>>>> PCM
>>>>> 16 wave file with the firmware audio implementation. I guess that is
>>>>> something we could investigate.
>>>>> 
>>>>> In terms of maybe adding text to speech there is an open source project
>>>>> that
>>>>> conceptually we could port to EFI. It would likely be a binary that
>>>>> would
>>>>> have to live on the EFI System Partition due to size. I was thinking
>>>>> that
>>>>> words per minute could be part of that API and it would produce a PCM 16
>>>>> wave file that the audio protocol we are discussing could play.
>>>>> 
>>>>>> Both of these argue in favour of defining a very simple API that
>>>>>> expresses
>>>>>> only a common baseline capability that is plausibly implementable for
>>>>>> every piece of audio hardware ever made.
>>>>>> 
>>>>>> Coupled with the relatively minimalistic requirements for boot-time
>>>>>> audio,
>>>>>> I'd probably suggest supporting only a single format for audio data,
>>>>>> with
>>>>>> a fixed sample rate (and possibly only mono output).
>>>>>> 
>>>>> In my world the folks that work for Jony asked for a stereo boot bong to
>>>>> transition from left to right :). This is not the CODEC you are looking
>>>>> for
>>>>> was our response…. I also did not mention that some languages are right
>>>>> to
>>>>> left, as the only thing worse than one complex thing is two complex
>>>>> things
>>>>> to implement.
>>>>> 
>>>>>> As always: perfection is achieved, not when there is nothing more to
>>>>>> add,
>>>>>> but when there is nothing left to take away.  :)
>>>>>> 
>>>>> "Simplicity is the ultimate sophistication”
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Andrew Fish
>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Michael
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> --
>>>> Signed,
>>>> Ethin D. Probst
>>> 
>> 
> 
> 
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#74210): https://edk2.groups.io/g/devel/message/74210
Mute This Topic: https://groups.io/mt/81710286/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] VirtIO Sound Driver (GSoC 2021)

Reply via email to