> On Apr 16, 2021, at 6:22 AM, Marvin Häuser <mhaeu...@posteo.de> wrote: > > Good day, > > Sorry for the nitpicking. > > - Protocols always need a "Revision" field as first member. This is used to > be able to expand its capabilities in later revisions without introducing a > new, distinct protocol. > - Consider the name EFI_SIMPLE_AUDIO_OUTPUT(!)_PROTOCOL, to not cause > confusion if input is ever added. Input in my opinion should be a separate > protocol as there is no reason why they would necessarily be coupled > topology-wise (think of an USB microphone, it will never have any sort of > output). > - To make code safety a bit easier, try to use "CONST" for "IN" (non-OUT) > pointers, so that CONST can be propagated where possible. > - Please do *not* make the events caller-owned. We had it multiple times > already on production firmware that events are left dangling and may be > polled/signaled after ExitBS(). The caller should be able to decide on some > policy maybe (i.e. abort or block on ExitBS() until the playback finished), > as cut-off audio may be awkward; but the callee definitely should implement > "event safety" itself. Maybe avoid exposing events directly at all and > provide nice abstractions the caller cannot misuse. > - I don't think audio should be required at all, the required subset should > firstly consider minimalism and security. Accessibility will not be of > concern for some IoT device, the audio code would simply eat space, and > introduce a larger surface for bugs. >
Marvin, Generally how we work this in the UEFI Specification is we make it optional via the following wording: “If a platform includes the ability to play audio in EFI then the EFI_SIMPLE_AUDIO_OUTPUT_PROTOCOL must be implemented. Basically this requirement will get added to UEFI Specification 2.6.2 Platform-Specific Elements. Thanks, Andrew Fish > Best regards, > Marvin > > On 16.04.21 01:42, Ethin Probst wrote: >> Hi Andrew, >> >> What would that protocol interface look like if we utilized your idea? >> With mine (though I need to add channel mapping as well), your >> workflow for playing a stereo sound from left to right would probably >> be something like this: >> 1) Encode the sound using a standard tool into a Wave PCM 16. >> 2) Place the Wave file in the Firmware Volume using a given UUID as >> the name. As simple as editing the platform FDF file. >> 3) Write some BDS code >> a) Lookup Wave file by UUID and read it into memory. >> b) Decode the audio file (audio devices will not do this decoding >> for you, you have to do that yourself). >> c) Call EFI_AUDIO_PROTOCOL.LoadBuffer(), passing in the sample rate >> of your audio, EFI_AUDIO_PROTOCOL_SAMPLE_FORMAT_S16 for signed 16-bit >> PCM audio, the channel mapping, the number of samples, and the samples >> themselves. >> d) call EFI_BOOT_SERVICES.CreateEvent()/EFI_BOOT_SERVICES.CreateEventEx() >> for a playback event to signal. >> e) call EFI_AUDIO_PROTOCOL.StartPlayback(), passing in the event you >> just created. >> The reason that LoadBuffer() takes so many parameters is because the >> device does not know the audio that your passing in. If I'm given an >> array of 16-bit audio samples, its impossible to know the parameters >> (sample rate, sample format, channel mapping, etc.) from that alone. >> Using your idea, though, my protocol could be greatly simplified. >> Forcing a particular channel mapping, sample rate and sample format on >> everyone would complicate application code. From an application point >> of view, one would, with that type of protocol, need to do the >> following: >> 1) Load an audio file in any audio file format from any storage mechanism. >> 2) Decode the audio file format to extract the samples and audio metadata. >> 3) Resample the (now decoded) audio samples and convert (quantize) the >> audio samples into signed 16-bit PCM audio. >> 4) forward the samples onto the EFI audio protocol. >> There is another option. (I'm happy we're discussing this now -- we >> can hammer out all the details now which will make a lot of things >> easier.) Since I'll most likely end up splitting the device-specific >> interfaces to different audio protocols, we could make a simple audio >> protocol that makes various assumptions about the audio samples your >> giving it (e.g.: sample rate, format, ...). This would just allow >> audio output and input in signed 16-bit PCM audio, as you've >> suggested, and would be a simple and easy to use interface. Something >> like: >> typedef struct EFI_SIMPLE_AUDIO_PROTOCOL { >> EFI_SIMPLE_AUDIO_PROTOCOL_RESET Reset; >> EFI_SIMPLE_AUDIO_PROTOCOL_START Start; >> EFI_SIMPLE_AUDIO_PROTOCOL_STOP Stop; >> } EFI_SIMPLE_AUDIO_PROTOCOL; >> This way, users and driver developers have a simple audio protocol >> they can implement if they like. It would assume signed 16-bit PCM >> audio and mono channel mappings at 44100 Hz. Then, we can have an >> advanced protocol for each device type (HDA, USB, VirtIO, ...) that >> exposes all the knobs for sample formats, sample rates, that kind of >> thing. Obviously, like the majority (if not all) UEFI protocols, these >> advanced protocols would be optional. I think, however, that the >> simple audio protocol should be a required protocol in all UEFI >> implementations. But that might not be possible. So would this simpler >> interface work as a starting point? >> >> On 4/15/21, Andrew Fish <af...@apple.com> wrote: >>> >>>> On Apr 15, 2021, at 1:11 PM, Ethin Probst <harlydavid...@gmail.com> >>>> wrote: >>>> >>>>> Is there any necessity for audio input and output to be implemented >>>>> within the same protocol? Unlike a network device (which is >>>>> intrinsically bidirectional), it seems natural to conceptually separate >>>>> audio input from audio output. >>>> Nope, there isn't a necessity to make them in one, they can be >>>> separated into two. >>>> >>>>> The code controlling volume/mute may not have any access to the sample >>>>> buffer. The most natural implementation would seem to allow for a >>>>> platform to notice volume up/down keypresses and use those to control the >>>>> overall system volume, without any knowledge of which samples (if any) >>>>> are currently being played by other code in the system. >>>> Your assuming that the audio device your implementing the >>>> volume/muting has volume control and muting functionality within >>>> itself, then. >>> Not really. We are assuming that audio hardware has a better understanding >>> of how that system implements volume than some generic EFI Code that is by >>> definition platform agnostic. >>> >>>> This may not be the case, and so we'd need to >>>> effectively simulate it within the driver, which isn't too hard to do. >>>> As an example, the VirtIO driver does not have a request type for >>>> muting or for volume control (this would, most likely, be within the >>>> VIRTIO_SND_R_PCM_SET_PARAMS request, see sec. 5.14.6.4.3). Therefore, >>>> either the driver would have to simulate the request or return >>>> EFI_UNSUPPORTED this instance. >>>> >>> So this is an example of above since the audio hardware knows it is routing >>> the sound output into another subsystem, and that subsystem controls the >>> volume. So the VirtIo Sound Driver know best how to bstract volume/mute for >>> this platform. >>> >>>>> Consider also the point of view of the developer implementing a driver >>>>> for some other piece of audio hardware that happens not to support >>>>> precisely the same sample rates etc as VirtIO. It would be extremely >>>>> ugly to force all future hardware to pretend to have the same >>>>> capabilities as VirtIO just because the API was initially designed with >>>>> VirtIO in mind. >>>> Precisely, but the brilliance of VirtIO >>> The brilliance of VirtIO is that it just needs to implement a generic device >>> driver for a given operating system. In most cases these operating systems >>> have sounds subsystems that manage sound and want fine granularity of >>> control on what is going on. So the drivers are implemented to maximizes >>> flexibility since the OS has lots of generic code that deals with sound, and >>> even user configurable knobs to control audio. In our case that extra layer >>> does not exist in EFI and the end user code just want to tell the driver do >>> some simple things. >>> >>> Maybe it is easier to think about with an example. Lets say I want to play a >>> cool sound on every boot. What would be the workflow to make the happen. >>> 1) Encode the sound using a standard tool into a Wave PCM 16. >>> 2) Place the Wave file in the Firmware Volume using a given UUID as the >>> name. As simple as editing the platform FDF file. >>> 3) Write some BDS code >>> a) Lookup Wave file by UUID and read it into memory. >>> b) Point the EFI Sound Protocol at the buffer with the Wave file >>> c) Tell the EFI Sound Protocol to play the sound. >>> >>> If you start adding in a lot of perimeters that work flow starts getting >>> really complicated really quickly. >>> >>>> is that the sample rate, >>>> sample format, &c., do not have to all be supported by a VirtIO >>>> device. Notice, also, how in my protocol proposal I noted that the >>>> sample rates, at least, were "recommended," not "required." Should a >>>> device not happen to support a sample rate or sample format, all it >>>> needs to do is return EFI_INVALID_PARAMETER. Section 5.14.6.2.1 >>>> (VIRTIO_SND_R_JACK_GET_CONFIG) describes how a jack tells you what >>>> sample rates it supports, channel mappings, &c. >>>> >>>> I do understand how just using a predefined sample rate and sample >>>> format might be a good idea, and if that's the best way, then that's >>>> what we'll do. The protocol can always be revised at a later time if >>>> necessary. I apologize if my stance seems obstinate. >>>> >>> I think if we add the version into the protocol and make sure we have a >>> separate load and play operation we could add a member to set the extra >>> perimeters if needed. There might also be some platform specific generic >>> tunables that might be interesting for a future member function. >>> >>> Thanks, >>> >>> Andrew Fish >>> >>>> Also, thank you, Laszlo, for your advice -- I hadn't considered that a >>>> network driver would be another good way of figuring out how async >>>> works in UEFI. >>>> >>>> On 4/15/21, Andrew Fish <af...@apple.com> wrote: >>>>> >>>>>> On Apr 15, 2021, at 5:07 AM, Michael Brown <mc...@ipxe.org> wrote: >>>>>> >>>>>> On 15/04/2021 06:28, Ethin Probst wrote: >>>>>>> - I hoped to add recording in case we in future want to add >>>>>>> accessibility aids like speech recognition (that was one of the todo >>>>>>> tasks on the EDK2 tasks list) >>>>>> Is there any necessity for audio input and output to be implemented >>>>>> within >>>>>> the same protocol? Unlike a network device (which is intrinsically >>>>>> bidirectional), it seems natural to conceptually separate audio input >>>>>> from >>>>>> audio output. >>>>>> >>>>>>> - Muting and volume control could easily be added by just replacing >>>>>>> the sample buffer with silence and by multiplying all the samples by a >>>>>>> percentage. >>>>>> The code controlling volume/mute may not have any access to the sample >>>>>> buffer. The most natural implementation would seem to allow for a >>>>>> platform to notice volume up/down keypresses and use those to control >>>>>> the >>>>>> overall system volume, without any knowledge of which samples (if any) >>>>>> are >>>>>> currently being played by other code in the system. >>>>>> >>>>> I’ve also thought of adding NVRAM variable that would let setup, UEFI >>>>> Shell, >>>>> or even the OS set the current volume, and Mute. This how it would be >>>>> consumed concept is why I proposed mute and volume being separate APIs. >>>>> The >>>>> volume up/down API in addition to fixed percentage might be overkill, but >>>>> it >>>>> does allow a non liner mapping to the volume up/down keys. You would be >>>>> surprised how picky audiophiles can be and it seems they like to file >>>>> Bugzillas. >>>>> >>>>>>> - Finally, the reason I used enumerations for specifying parameters >>>>>>> like sample rate and stuff was that I was looking at this protocol >>>>>>> from a general UEFI applications point of view. VirtIO supports all of >>>>>>> the sample configurations listed in my gist, and it seems reasonable >>>>>>> to allow the application to control those parameters instead of >>>>>>> forcing a particular parameter configuration onto the developer. >>>>>> Consider also the point of view of the developer implementing a driver >>>>>> for >>>>>> some other piece of audio hardware that happens not to support >>>>>> precisely >>>>>> the same sample rates etc as VirtIO. It would be extremely ugly to >>>>>> force >>>>>> all future hardware to pretend to have the same capabilities as VirtIO >>>>>> just because the API was initially designed with VirtIO in mind. >>>>>> >>>>>> As a developer on the other side of the API, writing code to play sound >>>>>> files on an arbitrary unknown platform, I would prefer to simply >>>>>> consume >>>>>> as simple as possible an abstraction of an audio output protocol and >>>>>> not >>>>>> have to care about what hardware is actually implementing it. >>>>>> >>>>> It may make sense to have an API to load the buffer/stream and other APIs >>>>> to >>>>> play or pause. This could allow an optional API to configure how the >>>>> stream >>>>> is played back. If we add a version to the Protocol that would at least >>>>> future proof us. >>>>> >>>>> We did get feedback that it is very common to speed up the auto playback >>>>> rates for accessibility. I’m not sure if that is practical with a simple >>>>> PCM >>>>> 16 wave file with the firmware audio implementation. I guess that is >>>>> something we could investigate. >>>>> >>>>> In terms of maybe adding text to speech there is an open source project >>>>> that >>>>> conceptually we could port to EFI. It would likely be a binary that >>>>> would >>>>> have to live on the EFI System Partition due to size. I was thinking >>>>> that >>>>> words per minute could be part of that API and it would produce a PCM 16 >>>>> wave file that the audio protocol we are discussing could play. >>>>> >>>>>> Both of these argue in favour of defining a very simple API that >>>>>> expresses >>>>>> only a common baseline capability that is plausibly implementable for >>>>>> every piece of audio hardware ever made. >>>>>> >>>>>> Coupled with the relatively minimalistic requirements for boot-time >>>>>> audio, >>>>>> I'd probably suggest supporting only a single format for audio data, >>>>>> with >>>>>> a fixed sample rate (and possibly only mono output). >>>>>> >>>>> In my world the folks that work for Jony asked for a stereo boot bong to >>>>> transition from left to right :). This is not the CODEC you are looking >>>>> for >>>>> was our response…. I also did not mention that some languages are right >>>>> to >>>>> left, as the only thing worse than one complex thing is two complex >>>>> things >>>>> to implement. >>>>> >>>>>> As always: perfection is achieved, not when there is nothing more to >>>>>> add, >>>>>> but when there is nothing left to take away. :) >>>>>> >>>>> "Simplicity is the ultimate sophistication” >>>>> >>>>> Thanks, >>>>> >>>>> Andrew Fish >>>>> >>>>>> Thanks, >>>>>> >>>>>> Michael >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> Signed, >>>> Ethin D. Probst >>> >> > > > > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#74210): https://edk2.groups.io/g/devel/message/74210 Mute This Topic: https://groups.io/mt/81710286/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-