On Mon, 08.10.07 20:02, Matteo Settenvini ([EMAIL PROTECTED]) wrote:

Hi!

OK, a little bit late because I was travelling, but here's my reply to the whole PA thread on desktop-devel (as the maintainer of PA). This is a long reply, so you might want to grab yourself a cup of coffee (or mango lassi?) before you start reading through it.

I'd like to thank davidz, matteo, haddess, jan for jumping into the discussion in PA's defense and replying more quickly than I did. Thanks, dudes!

> It has been a while since esound has received some attention - releases
> are almost stalled. Looking at the GNOME wiki, it seems that Pulseaudio
> is the stronger candidate between alternatives, and that it allows for
> quite a lot of nifty things.
>
> I'm running pulseaudio since four or five months now on two of my
> desktop systems, both x86 and PPC, and I must say that I'm really
> satisfied by it.
> It's quite stable and has very few compelling bugs for the normal user
> (e.g. when using it as an esound replacement on a machine with more than
> a logged in user it doesn't share the esd socket, or similar).
>
> It also seems to be actively developed, and is shipped by default with
> Fedora 8.
>
> Can it be eligible for inclusion in GNOME 2.22?

Coincidentally we discussed just this during the GNOME Summit on Sunday. Here are my 10¢ on this and all the issues raised in the whole thread following Matteo's proposal.

I am not sure that PA should become "part" of GNOME. A blessed dependency sure, but really a new module of GNOME? Probably not.

Fedora now ships PA by default, and SuSE is moving to PA as well. (Of the big distros only that spaceboy distro doesn't seem to love us anymore, as I haven't heard from them in a while.)

There are still a couple of rough edges, like that we ship two volume controls: one being the native PA volume control, which can do all kinds of nifty things like per-stream volumes and moving streams between devices.
And gnome-volume-control, which is much sexier UI-wise (i18n, ...), but exposes a lot of cruft we'd prefer to get rid of (i.e. all kinds of stupid alsa mixer tracks and checkboxes nobody really understands, shows every device thrice, ...). Resolving this duplication probably needs a little bit tighter integration of PA and GNOME: either the volume control tool in GNOME would need to link directly against PA -- or we'd have to wrap all the special PA features in GST's mixer interfaces -- which I think doesn't make that much sense. Too many abstraction layers are bad, especially if there'd only be a single backend driver which would implement most of it.

A couple of direct replies to what people brought up in their emails:

Martin Meyer suggested that PA was "heavy-weight". This is quite frankly bullshit. It depends on how you compile PA. Sure, PA is a little bit bigger than ESD, but not that much. It of course becomes bigger if you compile *all* the modules we ship. But you don't have to do this -- if you just want the core, then just compile the core and PA is tiny. A lot of embedded people are now starting to adopt PA -- people with much stronger constraints than we generally have on GNOME as a desktop for a PC. So, the "bloat", "heavy-weight" issue is nonsense. You can compile the SVN PA fine with just two external dependencies (ALSA and liboil -- both libraries are nowadays installed on all distros anyway -- so they don't really count) and it works fine. Everything else is optional, and can be split off into separate packages. And even without those extra modules PA is still very useful.

Regarding GST vs. PulseAudio: there is just no "vs."! Gstreamer does muxing/demuxing/decoding/encoding of media streams. PA is a low-level PCM-only sound server. They're two different things. You could compare this to X11 and GTK: X11 just does a bit of windowing and drawing for you; GTK does all those UI things on top.
PA does just a bit of buffering, mixing, filtering for you; GST does all those nice decoding/encoding/muxing/demuxing things on top.

Regarding the PA vs. dmix issue Sven Neumann brought up: yes, if you only care about the simplest form of mixing, then dmix is sufficient for you. However, if we want to provide anything that remotely comes near to what Vista or MacOS X provide, then we need some kind of sound server, just like the one they are shipping. (MS likes to call the sound server a "userspace sound system", though, but that's just the terminology. The important fact is that they have a real-time process which serializes access to the PCM devices.)

So what does PA offer you beyond dmix right now? From a user perspective this is: moving streams on-the-fly between devices; distributing audio on multiple audio devices at the same time; per-stream volumes; fast-user-switching support; automatic saving/restoring of per-application devices, volumes; sensible hotplug support; "rescuing" streams to another audio device, if you accidentally pull your usb cable; network support; ... the list goes on and on and on. Also, ALSA is Linux specific (though personally I think this doesn't really matter).

Gustavo brought up the issue that PA "hogs" the sound device. Sure we do. The idea is having everything go through PA, so that we can treat everything the same. However, since there are some APIs that are notoriously hard to virtualize (e.g. OSS with mmap) and some areas where you don't want the extra context-switching PA adds (pro audio, for now), there's now a tool called "pasuspender": when passed a command line, it suspends PA's sound card access, executes that command, and resumes PA's access again afterwards. So, prefix your "quake2" invocation with "pasuspender" and everything should be fine. Also, we now close all audio devices after 1s of idle time by default. We do this mostly to save power.
However this also has the side effect of releasing the audio device quickly for other apps. The drawback of course is that many sound cards generate pops and clicks every time you open/close the device (some intel hda for example), but that can probably be worked around in the drivers (according to Takashi) and I guess you cannot have everything at the same time, so power saving is more important for now. In practice you probably shouldn't notice PA's presence at all -- unless you try to play an ALSA stream to hw:0 and a PA stream at the same time. And last but not least, we have been shipping a PA plugin for libasound for a while now. It's enabled by default in F8 and redirects all ALSA audio to PA -- unless some borked app hard codes "hw:0" as device name.

Regarding Flash and PA: As Bastien pointed out, in F8 we ship a plugin for the flash player which makes it compatible with PA. With that plugin Flash and PA are perfectly compatible.

Gustavo repeatedly brought up the compatibility with current (closed-source) stuff: PA is also "the compatible sound server". We provide compatibility with OSS, ALSA, ESD, GST, LIBAO, Xine, MPlayer, ... (in various degrees, but mostly pretty high-quality). Right now Quake2 is the only relevant app I know that doesn't really work on top of PA, but for those cases we have pasuspender. Basically, I think this is a non-issue these days. And for almost all of the remaining apps we have compat problems with, we can fix our compat layers for them. Most of the time the applications are misusing the APIs, but we're happy to try to add the necessary stuff to our compat layers to get them working.

Regarding hardware mixing support: this is bullshit. You know, a while back all sound cards had wavetable stuff built in hw. And then this became obsolete - because it could be done with less effort and without problems in software, with faster CPUs.
Then, there were MPEG decoder cards which soon became obsolete -- because it could be done with less effort in software, with faster CPUs. And then, some vendors added hw mixing to their cards. But that was 6 years ago -- if you look at current sound card designs (HDA) you'll notice that they only support a single stream. They are high-quality but very feature-limited DACs. HW mixing is dead technology, it's out of fashion, made redundant by stuff that nowadays is available in the CPU: MMX, SSE. Using hw mixing imposes a greater burden on your USB, PCI buses, and might generate more IRQs. The place to do mixing is nowadays the CPU -- it's one of the reasons MMX, SSE were added to the CPU in the first place. Accelerating mixing in hw is really not what you want to do these days. But, if you really insist that you want to use this obsolete technology in your sound system then you're welcome to send me a patch or add a module to PA. But honestly, the next one who comes up with the hw mixing issue should please do his homework and read up on what happened in sound card design in the last 10 years, thank you very much. Asking for hw mixing in PA is like asking for support for MPEG decoder cards in GST. Also, never forget: PA does much more than just mixing audio. That's just the tiniest part of it.

Gustavo then played the latency card: yes, PA increases the latency over direct hw access. But so does dmix, because it enforces fixed fragment settings for all apps. What you really want to do (which however right now is only partially implemented in PA) is allowing per-stream fragment settings, by scheduling audio based on timer interrupts instead of sound io interrupts (based on fixed fragment settings). Those timer interrupts can be changed dynamically, so we can move the wakeup points during playback without too much effort.
However this needs some kind of kernel support (hrtimers, HPET), which has only become available very recently and on x86 only (not even amd64 yet), so a few months will pass until we get this fully implemented. Once we have that, however, we basically get the same PCM pipeline that Vista and MacOS have: a huge mixing buffer managed by a real-time userspace sound server which allows rewriting at any time and notifying clients dynamically, scheduled via timer interrupts. In essence, in the long run we really *need* something like PA, if we want to provide low latencies (i.e. short fragments == frequent interrupts) and low power consumption (i.e. few interrupts == huge fragments) at the same time and switch between them dynamically. Yes, right now, PA increases your achievable latencies a bit (but just a bit), but in the end we *need* a process that does the audio scheduling based on timers -- something that PA will then do. Of course, PA doesn't fully implement this yet, which is partially PA's fault and partially the fault of the kernel, which sucks when it comes to timers right now. We're getting there.

Then, Gustavo played the stability card: Yepp, sure, PA is relatively new code. But I mean, esd is more than ten years old these days. And you'd call it stable? Come on! PA is stable enough for inclusion in F8, and it is actively maintained. And that should be all that counts. Oh, and sound is not really life-critical, is it? If you lose audio on your desktop all you lose is a bit of background music, it's not that PA eats all your files for breakfast. The "stability" argument is just a trick to disallow innovation. Gustavo, PA in F8 is very much different than PA 0.9.6. As suggested by Matthias, please try it in F8. You know, Gustavo, that RH did a lot of work on PA before we included it in F8, to make it seamless and as bug-free as possible? Sure there might be an issue left here and there. But that's the case in every piece of software.

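To put rough numbers on the fragment trade-off from the latency paragraph above (short fragments == frequent interrupts, huge fragments == few interrupts), here is a minimal sketch. The function names, the 48 kHz rate and the fragment sizes are illustrative assumptions, not PA defaults or PA code.

```c
#include <stdio.h>

/* Wakeups (interrupts) per second needed to keep a device fed when it is
 * refilled one fragment at a time. fragment_frames must be non-zero.
 * Names and numbers are illustrative only. */
double wakeups_per_second(unsigned rate_hz, unsigned fragment_frames)
{
    return (double) rate_hz / (double) fragment_frames;
}

/* Playback time covered by one fragment, in milliseconds. This is roughly
 * the latency a fragment of that size contributes. */
double fragment_duration_ms(unsigned rate_hz, unsigned fragment_frames)
{
    return 1000.0 * (double) fragment_frames / (double) rate_hz;
}

/* Print one line of the trade-off for a given fragment size. */
void print_tradeoff(unsigned rate_hz, unsigned fragment_frames)
{
    printf("%6u frames: %8.2f ms per fragment, %8.2f wakeups/s\n",
           fragment_frames,
           fragment_duration_ms(rate_hz, fragment_frames),
           wakeups_per_second(rate_hz, fragment_frames));
}
```

At 48 kHz, a 240-frame fragment covers 5 ms and needs 200 wakeups per second, while a 48000-frame fragment covers a full second and needs only one. A fixed fragment setting forces every app onto one point of that curve, which is why switching between the two dynamically is the interesting part.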
So, to the next big technical issue Gustavo found in PA: he thinks its developers are stubborn. Thank you very much, Gustavo, I love you too. Maybe it is you that is stubborn here, with spreading all this FUD? (Just as a side note: do you know that Takashi, the upstream ALSA maintainer, also maintains PA in Suse? Maybe you're more Catholic than the Pope in your insistence on ALSA dmix?)

Regarding CPU load: the version of PA that ships in F8 uses exactly 0.00% CPU when idle -- unless some stupid app polls for the volume all the time, which might raise it a bit -- but that should be fixed in the app.

Frederic still loves ESD. ESD is bad, in latency, in features, in code, in everything. I am not sure if you, Frederic, noticed that ESD only supports 2ch, 16bit, 44kHz audio. Have you noticed all those 5.1 sound systems popping up all around you? Have you noticed that everyone hates esd? And that the most well known trick to get your audio working on your Linux desktop is called "killall esd"? No one wants to maintain ESD -- do you? There are just so many reasons why ESD should be obsoleted... Dude, for the next one who seriously suggests ESD as our path to the future in desktop audio I will personally buy a ticket for a time machine, so he can fast-forward for 10 years or so and join the rest of us in 2007!

Regarding cross-desktop support: I personally don't care too much about KDE, but apparently you can set it up just fine as described here:

http://pulseaudio.org/wiki/PerfectSetup

Xine (which I think is what amarock -- or whatever that awful media player everyone but me loves so much is called -- uses for the hard stuff) also ships a native PA driver.

Ronald, you say: "Userspace daemons are out." This is completely bogus. Just have a look at other OSes. Like MacOSX, like Vista. One of the new Vista features is the new "userspace sound system". In Unix nomenclature this translates to "daemon".
A userspace sound system is the way it needs to be; it's the way it is done by the systems which currently ship more powerful and useful sound systems than we do. As mentioned earlier, the PCM pipeline you really want is one RT thread per device that drives all streams based on timers, not on IO IRQs, managing a large, rewritable playback buffer. HW mixing is dead, and the lock-free magic dmix does is not really powerful enough for what is required from a sound system these days. PA is an implementation of the aforementioned ideal audio server design. (Not complete, as mentioned above, though.) This is a very good read about the design of CoreAudio, which basically does what we want to do in PA as well:

http://developer.apple.com/DOCUMENTATION/DeviceDrivers/Conceptual/WritingAudioDrivers/AudioFamilyDesign/chapter_3_section_3.html#//apple_ref/doc/uid/TP30000731-CJBIDABE

Ronald, you claim: "sound daemon is the right solution _only_ for networked audio". This is also bogus. There's a lot of stuff you want to do in a sound server. For example: policy decisions like "every time I plug in my USB headset I want all voip playback streams to automatically switch to it, and every time I start my voip app I want its stream to go through the usb headset". Then, doing all this kind of "compiz for audio" stuff. For example, what I will probably make available in PA pretty soon is the ability to do "spatial" event sounds, i.e. if you press a button on the left side of your screen its event sound goes out of the left speaker, and vice versa. Or stuff like automatically sliding down the volume of all windows that are currently not in the foreground (i.e. you start two totems and only the one in the foreground is at 100% volume, the other one at 30% or so; and when you switch windows the volumes automatically slide to the opposite). Right now, PA basically just provides the infrastructure for these kinds of things, but now that the groundwork is done, I can focus on the "earcandy" part.

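As a rough illustration of the "spatial" event sound idea above, the sketch below maps a pointer position to left/right gains. It is only an assumption of how such a mapping could look: the names stereo_gain and pan_for_pointer are hypothetical, it is not PA code, and it uses plain linear panning where a real implementation might well prefer a constant-power curve.

```c
/* Left/right gain pair, each in the range 0.0 .. 1.0.
 * Hypothetical type, for illustration only. */
typedef struct {
    double left;
    double right;
} stereo_gain;

/* Map the horizontal pointer position to stereo gains using linear panning:
 * x = 0 is fully left, x = screen_width - 1 is fully right. Positions outside
 * the screen are clamped; an unusable width falls back to the centre. */
stereo_gain pan_for_pointer(int pointer_x, int screen_width)
{
    stereo_gain g;
    double pos = 0.5; /* centre by default */

    if (screen_width > 1) {
        pos = (double) pointer_x / (double) (screen_width - 1);
        if (pos < 0.0)
            pos = 0.0;
        if (pos > 1.0)
            pos = 1.0;
    }

    g.left = 1.0 - pos;
    g.right = pos;
    return g;
}
```

A click at the left edge of a 1024 pixel wide screen gets a gain of 1.0 on the left channel and 0.0 on the right, and the reverse at the right edge. The important point is that this decision sits in the sound server, which knows about all streams and devices, and not in each application.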
In short: there are both user-visible (like these effects, moving streams between devices, per-stream volumes) and technical (doing low-latency and low power-consumption at the same time) reasons why a userspace sound daemon is the way forward.

Ronald, the "alsa-plugin" ships an OSS backend, just as a side note.

Regarding GSmartMix: some parts of gsm live on, like the new sound preferences dialog which allows per-class devices and stuff. The problem I saw with gsm is that it was limited to GST. And yeah, not all apps use GST, and many apps never will. I hope to work with Marc-André to get the remaining ideas of gsm into PA, as soon as I export the necessary meta information for all streams in PA.

Ronald, in a way PA is just a reimplementation of dmix. You can autolaunch it via libasound, and you shouldn't notice much of a difference, except that you suddenly can do device aggregation, per-stream volumes with just a few clicks, and so on.

Jan: dmix doesn't involve a daemon anymore. They now do some atomic ops magic of mixing everything lock-free with a single mix buffer and a couple of saturation buffers. It's a technically brilliant solution, though probably not the best for your CPU caches, and it falls back to locking mode on multicore and non-x86.

Gustavo: PA by default uses pretty large playback buffers which apps can rewrite at any time. This is the very definition of what MS calls "GlitchFree", and is the way to go to provide never-drop-out guarantees and quick reaction when seeking. We don't really pass those large buffers down to the hw yet, but that's mostly because of the hrtimer mess mentioned above. PA in F8 should not drop out, unless you configure it manually to some strange settings. If you ship a shitty HZ=100-with-no-preemption kernel, then yes, this increases the chance of a drop-out. But really, if you want to shoot yourself in the foot then go for it, but don't blame PA for it, don't do it the ESR way.
In any reasonable setup PA shouldn't drop out. The way forward to get something like "GlitchFree" on Linux is called "PulseAudio", and contrary to what you are claiming, ALSA dmix is not.

Gustavo: as I tried to make clear above, the way to go is a userspace sound server. And once we have that, it's perfectly fine to do network support in it as well. And again: no modern sound card supports hw mixing anymore. That's the past, get over it.

Gustavo: OSS is only dead as an implementation of a kernel sound system (though some people from 4front might even claim the contrary here). OTOH it is very alive as an API, and (unfortunately) it is going to stay around for a long time still. It's a much smaller API than ALSA, and portable and used in a lot of commercial apps. That's why we support it for compatibility in PA.

Regarding RT support in PA: Right now on F8 rt for pa is not enabled by default, due to security. I'd really love to enable it by default, which we could do if we had a safe process babysitter daemon which would supervise PA and run at a higher rtprio than PA. Hopefully someone will eventually replace init/gnome-session with something which can babysit processes very well, and this thing should then do rt-supervising as well. Also, contrary to what Gustavo says, you don't need to be root to do RT, all you need is RLIMIT_RTPRIO set to something > 0.

Regarding event sounds: Yes, I disable them too by default, I think everyone reasonable (except davidz, maybe :-)) does that. But why do we do that? Partly because the sounds we have right now in GNOME suck big time and are annoying like hell. And partly because they are triggered far too often. If you ever used a MacOS machine you probably know that the event sounds there are a lot more subtle and ... useful.

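Going back to the RLIMIT_RTPRIO remark in the RT paragraph above: a minimal, Linux-specific sketch of how a process could check that limit and then ask for SCHED_FIFO might look like this. The helper names are made up for illustration, and this is not how PA itself is written; only getrlimit(), sched_get_priority_max() and sched_setscheduler() are real calls.

```c
#include <string.h>
#include <sched.h>
#include <sys/resource.h>

/* Highest realtime priority this process may request according to its
 * RLIMIT_RTPRIO soft limit (Linux-specific). Returns 0 if the limit allows
 * no realtime priority at all, and -1 on error. Privileged processes
 * (CAP_SYS_NICE) are not bound by this limit. */
long max_rt_priority_allowed(void)
{
    struct rlimit rl;
    int sched_max = sched_get_priority_max(SCHED_FIFO);

    if (sched_max < 0 || getrlimit(RLIMIT_RTPRIO, &rl) != 0)
        return -1;

    /* An unlimited or oversized limit is capped at the scheduler maximum. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t) sched_max)
        return sched_max;

    return (long) rl.rlim_cur;
}

/* Try to switch the calling process to SCHED_FIFO at the given priority.
 * Returns 0 on success and -1 on failure, with errno set by the kernel. */
int try_enable_rt(int priority)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

An unprivileged process would first call max_rt_priority_allowed() and only call try_enable_rt() with a priority no higher than the returned value. If the limit is 0, an administrator has to raise it first (how that is configured is distribution-specific), but no root privileges are needed in the process itself.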
I can think of a couple of places where sound events make a lot of sense, if they are high-quality:

- When you get an email a human voice should say something like "You've got mail", instead of some stupid "ding" sound that no one knows the meaning of.

- When long-running actions complete you might also want a human voice saying "CD burning finished", or "download finished".

- For incoming IMs you should have a subtle "ping" sound. Having a human voice every time is probably too much, given their frequency.

- Some UI actions like workspace switching/fast-user switching, and minimizing/maximizing might be good candidates for event sounds too.

So basically, what I try to say is: just because current sound events suck, there's no reason they *have to* suck. I hope someone will eventually give the sound theming spec another shot and provide us with more useful, internationalized default sound samples.

OK, so much about defending PA. I hope I answered every single question, comment, and bit of FUD spread. If not, just give me a ping!

So, where do we go from here? At the Summit and internally at RH we discussed a little how we should go on with PA and GNOME. So, here is basically what I plan:

There are basically three areas where GNOME currently interfaces with PA via compat layers only and where we should replace the relevant code with something newer:

1. Currently esd is explicitly started via gnome-session. In F8 we provide a compat script called "esd" that starts up PA. So, g-s thinks it starts esd, while it actually starts PA. This is OK, but this hard coded dependency on a binary called "esd" should go away. Instead PA should be started via XDG autostart or suchlike. This would require some serializing of sound events to fix the race we get when one app wants to play a sound event and pa is not fully started yet. Not too difficult. This removes the hard dep on ESD and doesn't even replace it with a PA specific one. Gustavo, Ronald, I hope you rejoice?

2.
Sound events are generated directly via libesd from libgnome. This hard dep sucks as well. What I propose instead is this: I will introduce a new sound event API called "libcanberra", which is intended to be cross-platform, cross-toolkit and well-supported on PA. It basically exports just a single variadic function:

    cbr_play(c, id,
             CBR_META_ROLE, "event",
             CBR_META_NAME, "click-event",
             CBR_META_SOUND_FILE_WAV, "/usr/share/sounds/foo.wav",
             CBR_META_DESCRIPTION, "Button has been clicked",
             CBR_META_ICON_NAME, "clicked",
             CBR_META_X11_DISPLAY, ":0",
             CBR_META_X11_XID, "4711",
             CBR_META_POINTER_X, "46",
             CBR_META_POINTER_Y, "766",
             CBR_META_LANGUAGE, "de_DE",
             -1);

If that function is called, the caller should pass as many properties as possible. Then, libcanberra will try to find the right sound file for this event, and contact the sound server for playback. The meta information is passed for three reasons: to do transparent i18n, for a11y, and for sound effects (i.e. the spatial sound effects I mentioned earlier, with the POINTER_X and POINTER_Y props). (In reality the API will probably have a couple more functions, for caching, and for predefining properties so that you don't have to specify them for each event again. So maybe 5 functions or so.)

As soon as I have a version of this library I will write a small module for gtk (the kind you can load into every gtk app with --gtk-module) which will basically do what libgnome currently does: hooking into a couple of signals -- but instead of direct calls to libesd it will call the aforementioned libcanberra function with the appropriate parameters.

Advantages: suddenly sound events work for non-gnome apps (i.e. apps that only use gtk) too. We can remove yet another part from libgnome, and last but not least, yet another hard dep on ESD is gone, and not even replaced by one on PA. Not even libcanberra becomes a hard dep of Gtk. Gustavo, Ronald, this is where you should rejoice, again.

3. Mixer APIs.
There are three mixer control tools right now: the OSD that is shown when you press your volume-up/volume-down keys; the mixer applet; and gnome-volume-control. The OSD is supported fine through gst-pulse (our rocking PA plugin for gst), but for the applet and the standalone mixer I'd like to see a replacement. Right now both use the gst mixer abstraction API, which only exposes a very limited set of what our PA mixer can do and which quite frankly is a big mess. We'd have two options here: fix the gst mixer api, so that it exports the whole functionality that PA offers. Or, just make the mixer depend directly on the PA libs. I'd vote for the latter. Why? Because abstraction APIs in most cases suck, especially if a large part of the API is only implemented in a single backend (which would be PA). That's why in F9 we will probably drop g-v-c and replace it with PA's specific mixer tool called "pavucontrol", which we already ship. (I mentioned this already above.)

So, what I'd like to see is that pavucontrol could become a part of GNOME proper eventually, and for that to work PA would need to become a blessed dependency. While I don't see much value in developing two volume control tools in parallel, we could even keep g-v-c around for those who prefer to stick with their bare 90s-style audio systems. (Ronald, Gustavo, that's again where you should rejoice.)

The question of course remains which mixer app to maintain in GNOME. My own pavucontrol is quite featureful, but I think it's not the best thing UI-wise (though some people seem to disagree with me -- and do like it). I'd be happy if someone would pick this up. If no one picks it up, I will probably hack up some pa-specific applet and stick it together with pavucontrol in GNOME SVN, and then suggest it for inclusion into GNOME proper.

So far my plans. When we have dealt with these three issues, GNOME should work fine both with PA and without PA. It will take some time to implement them all.
But I hope that even people like Gustavo and Ronald can live with it.

Oh, and I hope that my comments on Gustavo's and Ronald's position didn't sound too harsh. It's just that I consider your positions badly-informed and a bit FUDish, it's not intended to be personal.

Any questions?

Yours,
    the stubborn Lennart

--
Lennart Poettering                        Red Hat, Inc.
lennart [at] poettering [dot] net         ICQ# 11060553
http://0pointer.net/lennart/           GnuPG 0x1A015CC4
_______________________________________________
desktop-devel-list mailing list
desktop-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/desktop-devel-list