Ok, where to begin ... I guess first off, the IRC meeting was a big help for me in getting a sense of what was already clear and what was not. I fairly deliberately didn't lurk that because I was more interested in listening to what other people were "thinking out loud" at this stage, and it was a great window into that. Keep doing those :)
There are some open questions from that which do have straightforward answers, so I'll try to sort this from easy to Hard ...

<dondelelcaro> and the server aparently isn't capable of transcoding or whatever
<Diziet> dondelelcaro: AFAICT the server doesn't ever transcode so every client needs to send a codec everyone understands.

The server is basically just a packet amplifier: it takes whatever each client sends, and forwards a copy to each other participant. It doesn't do much more than that.

Clients don't even need to be sending with the same codec. If you configure your client bandwidth for <= 32kbit/s it will send Speex, and all other clients will decode that fine. But it's not symmetrical, so they may still be sending you Celt at 96kbit/s, or Opus at any bitrate, regardless of your own client configuration.

There isn't really any "negotiation" in the normal protocol sense of that word, except when the client supports multiple versions of Celt; then the server will indicate which version the connected clients have in common.

<vorlon> Debian runs a public mumble server, easy enough to attack

Because of the above, the server itself isn't vulnerable to problems in the codec code. It may have its own problems, but I'm not aware of any of those if they exist. It's only the client code that's exposed here, but anyone connecting to a server has a 'direct' connection to all of the other clients there.

<rra> In other words, we don't *know* of any security problems; we (for some value of we) just think the code is horrible.
<Diziet> No. "We" think the code is dead upstream, is the main point.
<vorlon> my understanding was that someone concocted a proof of concept against later versions of the code, and that the opus code has been proofed against it

It's not that the code is even particularly horrible (as code written by Heavy Math people goes :) Mostly it's that there is nobody analysing the problems found and corrected in later code, and/or backporting any needed fixes to the version that mumble wants to continue using. People are saying they want to keep using it, but nobody is taking on the role of actually being responsible for it - or indicating anything other than that they are explicitly _not_ prepared to take on that task so far.

<Diziet> Not that it's horrible. I haven't seen anyone claim the opus code is much better than the celt code from a security pov.

I'll make that claim unambiguously now (if I hadn't done so already :).

Celt was an entirely experimental codebase, each 'version' of it that was tagged existed *only* for testing the quality of the audio that it produced. Next to no attention was paid to the normal "release issues" of a piece of "production" software beyond what other people submitted as patches. Once the listening test results were in, code was changed, the bitstream broken as/if needed, and audio quality testing continued.

Being free software, and an experiment conducted fully in the open, people were free to do with it what they wished -- but if they did, the onus was *entirely* on them to worry about release quality and maintenance issues. That's not something the upstream developers devoted any real attention to at all before the bitstream freeze.

Opus by comparison has its C code as the normative part of an IETF proposed standard. An utterly insane number of hours went into QA testing it for "release issues" after the final bitstream freeze, and vetting it for precisely these sorts of problems (and that work is still ongoing).
There are slides from one of the IETF meetings documenting some of that process -- and there are things that should be obvious from even a cursory look at the code - like Opus actually has a test suite, with near complete code coverage, that fuzzes the code intelligently on every run etc. etc.

I won't go so far as to claim it's completely bug free. But people actually care if there is even a hint that it isn't. We're in an entirely different phase of the development now, where release polishing is at the forefront of What Matters to the maintainers. No version of celt had that sort of attention, especially not one as old as 0.7.1.

<Diziet> What's weird is why don't we have references to this vuln ?

It's not really that weird. As per the above Thorvald and I became solely responsible for celt 0.7.1 when we decided to include it in squeeze - so there is nobody else spending any time on this - and you never before asked me, or apparently the mumble upstream people you said you spoke to, for any such further detail :)

<Diziet> If so we could see "can we apply the patch to celt" which might be interesting info.

The mumble upstream folk were given a (not exhaustive) list of commits to look at when this first came to our attention - and I asked them about their progress with those again last week. I got the same reply as I did initially though: The patches don't directly apply to the older code, and far too much had changed for there to be any trivial mapping back to it that they were able to follow. Which doesn't mean the problems don't exist in the old code, just that the places where a fix was later applied did not make this easy to answer with any confidence.

If somebody has time to volunteer to analyse this in more detail, then I'm sure we can get them more information. I'd be delighted if that resulted in a plausible belief that the old code really is safe still, or patches to make it so.
I just don't buy people telling me "pfft, it's fine" when they haven't looked at all - after a person who had done much of the insanely thorough testing of this code told me that they thought it wasn't ...

<rra> Backing up a little bit: Assume that we all decide that it's okay to reintroduce celt. Do we actually have someone who is willing to do the work of reintroducing celt into the archive? I mean, is Ron willing to do that, or is someone else willing and capable to do it if Ron doesn't want to be stuck supporting it because he doesn't agree with it?

We don't really want to reintroduce celt as a public package whatever is decided here. There really is nothing except mumble with any excuse to still be using this now. So the main question is, are we comfortable shipping mumble with it enabled as a private lib? Simply uploading that is a no-brainer, anyone can do it, and I won't refuse to do that if that's where consensus lands and Thorvald doesn't have time to do so.

But I only committed to being responsible for celt 0.7.1 until we had a bitstream frozen version to ship, which we now do, and Thorvald doesn't appear to have the time to commit to that for another release cycle either. So my big concern is that we have nobody stepping in to fill the gap of an "upstream" maintainer, who will diligently investigate issues like this rather than just say "I'll worry about it when someone else sends me a patch" ...

<rra> vorlon: My understanding was that we were unsure whether the existing clients out there in the world that speak celt would actually negotiate speex.

As I mentioned above, there is no negotiation for this. If you have a client that can encode speex, it can just send it and any other client will be able to decode it. But that's kind of orthogonal to what they will send you. They could send you Opus in return, and if you don't have a client that can decode Opus, then you won't be able to hear them.
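
To make that asymmetry concrete, here is a rough toy model of it in Python. It is only a sketch of the behaviour described above, not mumble code: the names `Client`, `Relay` and `pick_send_codec` are made up for illustration, and the only rules it encodes are the ones already stated (a sender at 32kbit/s or less sends Speex, the server forwards copies untouched, and a receiver hears only what it can decode).

```python
# Toy model only: every name here is hypothetical, none of it is mumble's API.
from dataclasses import dataclass, field


def pick_send_codec(bandwidth_kbit: int, preferred: str) -> str:
    """Sender-side choice: at 32 kbit/s or below, send Speex."""
    return "speex" if bandwidth_kbit <= 32 else preferred


@dataclass
class Client:
    name: str
    decoders: set          # codecs this client is able to decode
    bandwidth_kbit: int    # configured outgoing bandwidth
    preferred: str         # codec used above the Speex threshold
    heard: list = field(default_factory=list)
    dropped: list = field(default_factory=list)

    def send_codec(self) -> str:
        # Depends only on this client's own settings, never on the peers.
        return pick_send_codec(self.bandwidth_kbit, self.preferred)

    def receive(self, sender: str, codec: str) -> None:
        # The receiver either has a decoder for the payload or it does not.
        if codec in self.decoders:
            self.heard.append((sender, codec))
        else:
            self.dropped.append((sender, codec))


class Relay:
    """Packet amplifier: forwards a copy to every other client, no transcoding."""

    def __init__(self) -> None:
        self.clients = []

    def join(self, client: Client) -> None:
        self.clients.append(client)

    def forward(self, sender: Client) -> None:
        codec = sender.send_codec()
        for client in self.clients:
            if client is not sender:
                client.receive(sender.name, codec)


if __name__ == "__main__":
    relay = Relay()
    old = Client("old", {"speex", "celt"}, 32, "celt")
    new = Client("new", {"speex", "celt", "opus"}, 96, "opus")
    relay.join(old)
    relay.join(new)
    relay.forward(old)   # old sends Speex, which new can decode
    relay.forward(new)   # new sends Opus, which old cannot decode
    print("new heard:", new.heard)
    print("old heard:", old.heard, "dropped:", old.dropped)
```

The point of the sketch is that `send_codec` never looks at the other participants, so lowering your own bandwidth changes what you send but has no effect on what is sent to you.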
The lack of real bi-directional negotiation is part of why this is such a mess in the transition period, but that's sort of fundamental to the way the server operates and can't trivially be fixed.

<Diziet> But AIUI that would involve downgrading all the clients in a channel to speex which might well be unacceptable to the userbase effectively making our version of the mumble client unuseable in those contexts.

Talking about "downgrading" to speex is only meaningful when comparing it to opus. Celt isn't a speech coder, so it doesn't perform well under conditions where speex does, and vice versa. Neither is clearly "down" from the other, at least not when comparing with celt 0.7.1. They are different tools, specialised for different jobs.

It would be just as valid to say that "downgrading to celt 0.7.1" would have the effect you mention. And that's empirically true because there are already people blocking its use on their servers, and the number of people doing that will only grow over the lifetime of Wheezy now that they can do it just by setting opusthreshold instead of hacking at the code to change the permitted celt versions. Celt 0.7.1 gives poor results at bitrates where both opus and speex shine.

<Diziet> dondelelcaro: I think we know that if we reenable the embedded celt it will work as intended with existing clients.

We do gain an extra possible dimension of interoperability. Unfortunately that's not enough on its own to ensure it will work with other existing clients and servers - either at all, or with acceptable results. And even mumble upstream is hoping to be able to phase out all codecs other than opus in a shorter time than the lifecycle of Wheezy.

I agree the backward compatibility issue is important. But "existing clients" is currently not a stationary target either. If we lock this into another stable release, we are likely to be the last ones left carrying the hot potato alone, long after it stops actually being useful to anyone.
<vorlon> I want to see what actually happens when two clients connect to a server having only speex in common
<vorlon> right now we don't have such a test

If you set the requested bandwidth in both clients to 32kbit/s or less, then that's exactly what will happen (if you have old enough clients to still have speex encoding support :)

Lying about the celt version so that clients thought they didn't have anything else in common was the crux of the trick Thorvald thought we could pull, though, as I understood it, yeah.

Of which I sadly have no new news at this stage ... :( Thorvald is back, but the other mumble upstream folk and I only got to talk to him for a couple of minutes before "Argh. Work. gotta go. bbl." And he hasn't yet been back again ...

On the brighter side, he has got upstream snapshots rolling again, so there are opus enabled clients getting around and more testing has been happening on those. And the other upstream folk and I have had some reasonably productive discussions on mitigation and interop issues.

We're still talking about getting speex put back into the client, which seems to be getting some agreement, but we're not quite all said and done on that just yet. There's a couple of things that still need looking at.

They've added a config option for the client now, which permits users to disable celt - which at least gives people an option to turn it off fast should that be needed faster than we can push code updates.

And the SECCOMP sandboxed version of celt has been pushed upstream now. The guy who worked on that sounds like he's pretty happy with it, but I looked at the code and do worry that it's probably too big a change to push into a stable release without more live testing, and probably a few other pairs of eyes auditing it. It seems to be a good answer that I don't totally want to take off the table, but probably not something that could seriously be considered for wheezy proper without more people taking an intense interest in it very soon.
So whichever way you slice it - we still don't have a one-true killer solution here yet that's clearly without fault. My gut feeling is still kind of saying we should target bpo, where we can push upstream fixes as fast as they come, since it looks like that is going to be needed for a while - but that answer sucks for another group of people too ...

Anyhow, way too many words, sorry about that - and there's still lots I haven't covered - but I'm playing this with an open hand, so if there's questions, do ask them, please. And I'll try to give shorter answers ...

<rra> Damn, can't get someone else to do the work for us. :)

That's kind of the perfect summary of the mess we're immersed in, yeah :)

Ron