Bug#848508: LANG=C wormhole :/
On 2017-01-03 15:13:23, Daniel Kahn Gillmor wrote:
> On Tue 2017-01-03 14:20:43 -0500, anarcat wrote:
>> I'm happy to follow whatever upstream decides, but I'd like to point out
>> that this is not just a feature request ("non-ASCII wordlist", which can
>> be supported fine even if we go back to py2 btw), but an actual bug
>> ("fails to work").
>
> "fails to work when the user explicitly sets LANG=C on a program that
> deals with human-readable text in 2017" :)

I think this is inaccurate: users do not need to explicitly set LANG=C,
it is the default when unset. :)

>> C.UTF-8 is not necessarily available on all Debian systems, let alone the
>> default. In fact, I believe the default locale, on Debian systems, is
>> still C. Having our package fail to work in that locale breaks the
>> Principle Of Least Astonishment.
>
> I don't think this is the case, but i could be wrong. What makes you
> think that this is true?

primarily out of gut feeling: i commonly get pushback from people not
running unicode locales when i report bugs triggered by my funny
name. but also, i have a debian wheezy VM here created with
vmdebootstrap (so fairly plain). here's the locale:

root@debian:~# echo $LANG

ie. not set. i believe that a minimal Debian install will not set LANG if
you login through the console, nor if you login through SSH. graphical
environments typically set that variable, but we can't assume the users
will have a display manager to set that up.

i believe we may be conflating "having C.UTF-8 *available*" with "having
LANG set to some UTF-8 locale". it may be true that most systems will
have a UTF-8 locale available (although I question even that, given my
experience with this VM), but i am pretty certain that we can't assume
LANG will be properly set.

> the default for debian systems is to install
> a task-$LANGUAGE package based on the choice made during d-i, and
> configures a sensible locale. C.UTF-8
> is always available.

in this (vm)debootstrap-built chroot here, it is not the case:
task-language is not installed, and the C.UTF-8 locale is not
configured. and even if it was, it is not necessarily set.

root@debian:~# apt-cache policy task-english
task-english:
  Installed: (none)
  Candidate: 3.14.1
  Version table:
     3.14.1 0
        500 http://httpredir.debian.org/debian/ wheezy/main amd64 Packages

d-i is not the only way to install debian... (i'd be curious to see if
debirf actually sets that up correctly, btw ;)

> I do note that when LANG is completely unset, we see the same failure,
> even though C.UTF-8 is available. In that case, i'd recommend that we
> just explicitly set LANG=C.UTF-8 (within wormhole) to work around
> python-click's idiosyncrasies on py3.

i think that's not necessarily a good idea: this is *exactly* the kind
of stuff the python-click warning is there for - to avoid assuming any
sort of encoding or locale, and forcing the user to decide on it. by
setting the locale, we are basically ignoring the warning, and we might
as well just catch the exception and/or silence it (which is possible
with monkeypatching).

> But if the user deliberately sets LANG=whatever to something
> non-unicode, i don't think it's unreasonable for wormhole to decline to
> work in that environment if one of its dependencies is dependent on a
> UTF-8 locale.

as we have seen, the problem is not only when a user deliberately
configures a "wrong" locale, but also when no locale is configured, which
is a surprisingly common situation.

>> I still believe the simplest fix, in the short term, is to revert back
>> packaging to Python2. We could (and should, anyways) provide both
>> python2 and python3 bindings for the magic-wormhole *libraries* and make
>> the binary use the python2 libraries until the click bug is fixed or
>> Debian defaults to a UTF-8 locale.
>
> why not just (a) fix the unset $LANG situation with a small patch, and

because that silences a real issue with python3-click that we do not
want to silence.
click needs to be fixed, we shouldn't hide potential errors
like this. what if the user is running under a latin1 locale that just
happens to work because it's an extension of ASCII? before you tell me
how wrong that sounds, consider that i have done exactly that for about
a decade, over various operating systems...

> (b) tag the python-click bug as "affects: magic-wormhole" and leave it
> as is?

that sounds like a good idea in any case...

my bottom line on this bug is that wormhole is a file transfer program
that doesn't, a priori, have to specifically deal with locale
problems. garbage in, garbage out. i agree that if someone has the wrong
locale and/or passes corrupt data to wormhole, it should bail out
preemptively. but in this case, there is a legitimate use case where no
locale is configured, or, actually, the C locale is configured (by
default) and only ASCII data is passed. we shouldn't bail out in that
specific case and i don't know of anything special in wormhole that
should make
Bug#848508: LANG=C wormhole :/
On Tue 2017-01-03 14:20:43 -0500, anarcat wrote:
> I'm happy to follow whatever upstream decides, but I'd like to point out
> that this is not just a feature request ("non-ASCII wordlist", which can
> be supported fine even if we go back to py2 btw), but an actual bug
> ("fails to work").

"fails to work when the user explicitly sets LANG=C on a program that
deals with human-readable text in 2017" :)

> C.UTF-8 is not necessarily available on all Debian systems, let alone the
> default. In fact, I believe the default locale, on Debian systems, is
> still C. Having our package fail to work in that locale breaks the
> Principle Of Least Astonishment.

I don't think this is the case, but i could be wrong. What makes you
think that this is true? the default for debian systems is to install
a task-$LANGUAGE package based on the choice made during d-i, and
configures a sensible locale. C.UTF-8 is always available.

I do note that when LANG is completely unset, we see the same failure,
even though C.UTF-8 is available. In that case, i'd recommend that we
just explicitly set LANG=C.UTF-8 (within wormhole) to work around
python-click's idiosyncrasies on py3.

But if the user deliberately sets LANG=whatever to something
non-unicode, i don't think it's unreasonable for wormhole to decline to
work in that environment if one of its dependencies is dependent on a
UTF-8 locale.

> I still believe the simplest fix, in the short term, is to revert back
> packaging to Python2. We could (and should, anyways) provide both
> python2 and python3 bindings for the magic-wormhole *libraries* and make
> the binary use the python2 libraries until the click bug is fixed or
> Debian defaults to a UTF-8 locale.

why not just (a) fix the unset $LANG situation with a small patch, and
(b) tag the python-click bug as "affects: magic-wormhole" and leave it
as is?

        --dkg
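
For illustration, option (a) above could look roughly like the sketch below. It only defaults LANG when no locale variable is set at all, so an explicit LANG=C is left alone, matching the distinction drawn in the message. The helper name and constants are hypothetical; this is not code from magic-wormhole or python-click, and whether changing the environment from inside an already-running interpreter is early enough for click's check has not been verified here.

```python
# Hypothetical helper, not part of magic-wormhole or python-click.
# Variables that can select the character-type locale, in POSIX precedence order.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")

# Assumption: C.UTF-8 is available on the target system.
FALLBACK_LOCALE = "C.UTF-8"


def with_default_locale(environ):
    """Return a copy of `environ` with LANG defaulted to FALLBACK_LOCALE.

    The default is applied only when none of the locale variables is set
    to a non-empty value; an explicit choice by the user is never changed.
    """
    env = dict(environ)
    if not any(env.get(name) for name in LOCALE_VARS):
        env["LANG"] = FALLBACK_LOCALE
    return env
```

A wrapper would pass something like `os.environ` to this function and start the real program with the returned mapping. Note that this is exactly the kind of workaround that hides the python-click check rather than fixing it.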
Bug#848508: LANG=C wormhole :/
On Tue, Jan 03, 2017 at 02:03:44PM -0500, Daniel Kahn Gillmor wrote:
> Control: forwarded 848508 https://github.com/warner/magic-wormhole/issues/127
>
> It'd be really nice for wormhole to stay on python 3 -- i would like to
> be able to run a system free of python2 in the near future, and i'd also
> like to be able to have wormhole available.

I'd like to have Debian free of Python2 as well, but it is not going to
happen in stretch, not by a long shot.

> I've forwarded the debian bug report upstream to see whether Brian has
> any suggested resolution.

Great, thanks!

> but I note that once we're talking about wormhole using non-ASCII
> wordlists (see https://github.com/warner/magic-wormhole/issues/26),
> "LANG=C wormhole receive" is going to be a buggy invocation no matter
> what anyway.

I'm happy to follow whatever upstream decides, but I'd like to point out
that this is not just a feature request ("non-ASCII wordlist", which can
be supported fine even if we go back to py2 btw), but an actual bug
("fails to work").

> I'm inclined to just say "don't do that" on debian systems, where we
> expect C.UTF-8 to be available anyway.

C.UTF-8 is not necessarily available on all Debian systems, let alone the
default. In fact, I believe the default locale, on Debian systems, is
still C. Having our package fail to work in that locale breaks the
Principle Of Least Astonishment.

I still believe the simplest fix, in the short term, is to revert back
packaging to Python2. We could (and should, anyways) provide both
python2 and python3 bindings for the magic-wormhole *libraries* and make
the binary use the python2 libraries until the click bug is fixed or
Debian defaults to a UTF-8 locale.

Until then, we are, by default, not working at all unless the user does
an extra configuration. I think this is an unacceptable situation and a
bug that should be fixed. Failing that, someone should mark this bug as
unfixed and just close this issue.
I certainly wouldn't do that myself as I believe this is a bug that we
should work around until Click does the right thing. Not everyone is a
unicode geek like we are. ;)

A.

--
I worry about my child and the Internet all the time, even though she's
too young to have logged on yet. Here's what I worry about. I worry that
10 or 15 years from now, she will come to me and say 'Daddy, where were
you when they took freedom of the press away from the Internet?'
                        - Mike Godwin, Electronic Frontier Foundation
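
The point about the default locale can be made concrete with a small sketch of how the character-type locale is resolved from the environment: LC_ALL wins over LC_CTYPE, which wins over LANG, and empty values count as unset. The fallback to "C" when nothing is set is an assumption matching glibc behaviour; the function name is hypothetical and the code does not call into the C library.

```python
def effective_ctype_locale(environ):
    """Return the locale name that governs character handling.

    Follows the POSIX precedence LC_ALL > LC_CTYPE > LANG and treats an
    empty value as unset. Falls back to "C" when nothing is set (assumed
    here, as on glibc systems).
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(name)
        if value:
            return value
    return "C"
```

With an empty environment, such as the freshly bootstrapped system described in this thread, the function returns "C", the same result as running with LANG=C set explicitly.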
Bug#848508: LANG=C wormhole :/
Control: forwarded 848508 https://github.com/warner/magic-wormhole/issues/127

It'd be really nice for wormhole to stay on python 3 -- i would like to
be able to run a system free of python2 in the near future, and i'd also
like to be able to have wormhole available.

I've forwarded the debian bug report upstream to see whether Brian has
any suggested resolution.

but I note that once we're talking about wormhole using non-ASCII
wordlists (see https://github.com/warner/magic-wormhole/issues/26),
"LANG=C wormhole receive" is going to be a buggy invocation no matter
what anyway.

I'm inclined to just say "don't do that" on debian systems, where we
expect C.UTF-8 to be available anyway.

        --dkg
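
As a rough illustration of why a non-ASCII wordlist and an ASCII-only locale do not mix, the sketch below checks whether a piece of text survives encoding in a given codec. The function name and the sample words are made up for the example and are not taken from any wormhole wordlist.

```python
def encodable(text, encoding):
    """Return True if `text` can be encoded with `encoding` without error."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical code words, for illustration only.
ascii_word = "purple-sausages"
accented_word = "çà-et-là"
```

Here `encodable(ascii_word, "ascii")` is true while `encodable(accented_word, "ascii")` is false, and both succeed with "utf-8". An ASCII-only wordlist is therefore unaffected by the C locale, which is the case the bug report is about.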