> It seems to me that part of the point of the byte/string split (and the
> lack of automatic coercion) is to make the programmer be explicit about
> converting between unicode and bytes. Having these functions, which
> convert between binary formats (ASCII-only representations of binary data
> and back) accept unicode strings is reintroducing automatic coercions,
> and I think it will lead to the same kind of bugs that automatic string
> coercions yielded in Python2: a program works fine until the input
> turns out to have non-ASCII data in it, and then it blows up with an
> unexpected UnicodeError.

I agree with the change in principle, but I also agree with you about
the choice of error:

    py> binascii.a2b_hex("MURRAY")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    binascii.Error: Non-hexadecimal digit found
    py> binascii.a2b_hex("VLÖWIS")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: string argument should contain only ASCII characters

I think it should give binascii.Error in both cases: Ö is as much
a non-hexadecimal digit as M.

With that changed, I'd have no issues with the patch: these functions
are already fairly strict in their input, whether it's bytes or
Unicode. So the chances that non-ASCII characters get it to fall over
in a way that never causes problems in pure-ASCII communities are
very low.

> If most people agree with Antoine I won't fight it, but it seems to me
> that accepting unicode in the binascii and base64 APIs is a bad idea.

No - it's only the choice of error that is a bad idea.

Regards,
Martin
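
For illustration only, here is a minimal sketch of the behaviour suggested above, written as a wrapper rather than as a change to binascii itself. The name a2b_hex_strict is hypothetical and not part of any library. It assumes a Python 3 version in which binascii.a2b_hex accepts ASCII-only str input and in which binascii.Error is a subclass of ValueError.

```python
import binascii


def a2b_hex_strict(data):
    """Hypothetical wrapper: report every kind of bad input as binascii.Error."""
    try:
        return binascii.a2b_hex(data)
    except binascii.Error:
        # Already the desired exception type (e.g. a non-hex ASCII letter).
        raise
    except ValueError as exc:
        # Non-ASCII characters in a str argument surface as a plain
        # ValueError; convert it so callers only need one except clause.
        raise binascii.Error("Non-hexadecimal digit found") from exc


if __name__ == "__main__":
    for text in ("MURRAY", "VLÖWIS"):
        try:
            a2b_hex_strict(text)
        except binascii.Error as err:
            print(text, "->", type(err).__name__, err)
```

With such a wrapper, a caller that catches binascii.Error handles "MURRAY" and "VLÖWIS" identically, which is the consistency the message argues for.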