On Thu, Apr 25, 2013 at 7:43 AM, Antoine Pitrou solip...@pitrou.net wrote:
On Thu, 25 Apr 2013 04:19:36 +0200
Lennart Regebro rege...@gmail.com wrote:
On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull step...@xemacs.org
wrote:
RFC 4648 repeatedly refers to *characters*, without
Lennart Regebro writes:
Base64 is an encoding that transforms between 8-bit streams. Let it be
that. Don't try to shoehorn it into a completely different kind of
encoding.
By completely different kind of encoding do you mean codec?
I think that would be an unfortunate result. These
On Thu, Apr 25, 2013 at 8:57 AM, Stephen J. Turnbull step...@xemacs.org wrote:
I think that would be an unfortunate result. These operations on
streams are theoretically nicely composable. It would be nice if
practice reflected that by having a uniform API for all of these
operations
On Thu, Apr 25, 2013 at 4:57 PM, Stephen J. Turnbull step...@xemacs.org wrote:
Lennart Regebro writes:
Base64 is an encoding that transforms between 8-bit streams. Let it be
that. Don't try to shoehorn it into a completely different kind of
encoding.
By completely different kind of
On Thu, 25 Apr 2013 08:38:12 +0200,
Lennart Regebro rege...@gmail.com wrote:
On Thu, Apr 25, 2013 at 7:43 AM, Antoine Pitrou solip...@pitrou.net
wrote:
On Thu, 25 Apr 2013 04:19:36 +0200
Lennart Regebro rege...@gmail.com wrote:
On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull
On Thu, Apr 25, 2013 at 11:25 AM, Antoine Pitrou solip...@pitrou.net wrote:
On Thu, 25 Apr 2013 08:38:12 +0200,
Yes it is. Base64 takes 8-bit bytes and transforms them into another
8-bit stream that can be safely transmitted over various channels that
would mangle an unencoded 8-bit stream,
On 2013-04-25, at 11:25 , Antoine Pitrou wrote:
Besides, I would consider an RFC more authoritative than a
Wikipedia definition.
Base encoding of data is used in many situations to store or transfer
data in environments that, perhaps for legacy reasons, are restricted
to US-ASCII [1] data.
On Thu, 25 Apr 2013 12:46:43 +0200,
Xavier Morel catch-...@masklinn.net wrote:
On 2013-04-25, at 11:25 , Antoine Pitrou wrote:
Besides, I would consider an RFC more authoritative than a
Wikipedia definition.
Base encoding of data is used in many situations to store or
transfer
On Thu, 25 Apr 2013 12:05:01 +0200,
Lennart Regebro rege...@gmail.com wrote:
The Wikipedia page does talk about *text* and *characters* for
the result of base64 encoding.
So are you saying that you want the Python implementation of base64
encoding to take 8-bit binary data in bytes format
On Thu, Apr 25, 2013 at 2:57 PM, Antoine Pitrou solip...@pitrou.net wrote:
I can think of many use cases where I want to *embed* base64-encoded
data in a larger text *before* encoding that text and transmitting
it over an 8-bit channel.
That still doesn't mean that this should be the default
On 04/25/2013 01:43 AM, Antoine Pitrou wrote:
On Thu, 25 Apr 2013 04:19:36 +0200 Lennart Regebro rege...@gmail.com
wrote:
On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull
step...@xemacs.org wrote:
RFC 4648 repeatedly refers to *characters*,
On Apr 25, 2013, at 03:34 PM, Lennart Regebro wrote:
In the case of JSON objects, they are intended for data exchange, and
hence in the end need to be byte strings.
Except that they're not.
http://bugs.python.org/issue10976
-Barry
___
Python-Dev
On 25/04/2013 14:34, Lennart Regebro wrote:
On Thu, Apr 25, 2013 at 2:57 PM, Antoine Pitrou solip...@pitrou.net wrote:
I can think of many use cases where I want to *embed* base64-encoded
data in a larger text *before* encoding that text and transmitting
it over an 8-bit channel.
That still
On Thu, Apr 25, 2013 at 10:07 AM, Barry Warsaw ba...@python.org wrote:
On Apr 25, 2013, at 03:34 PM, Lennart Regebro wrote:
In the case of JSON objects, they are intended for data exchange, and
hence in the end need to be byte strings.
Except that they're not.
On Thu, Apr 25, 2013 at 4:22 PM, MRAB pyt...@mrabarnett.plus.com wrote:
The JSON specification says that it's text. Its string literals can
contain Unicode codepoints. It needs to be encoded to bytes for
transmission and storage, but JSON itself is not a bytestring format.
OK, fair enough.
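A small illustration of the point conceded above, as a sketch rather than anything posted in the thread (the payload and key names are made up for the example): `json.dumps` produces a `str`, bytes only appear once that text is encoded for transmission, and binary data has to become base64 *text* before it can be embedded in the JSON document.

```python
import base64
import json

raw = b"\x00\xffbinary blob"  # hypothetical binary payload

# b64encode returns bytes in Python 3; decode as ASCII to get text
# that can sit inside a JSON string literal.
b64_text = base64.b64encode(raw).decode("ascii")

document = json.dumps({"name": "caf\u00e9", "blob": b64_text}, ensure_ascii=False)
assert isinstance(document, str)  # JSON itself is text

wire = document.encode("utf-8")  # bytes only at the transmission step
assert isinstance(wire, bytes)

# Receiving side: bytes -> text -> objects -> original binary data.
received = json.loads(wire.decode("utf-8"))
restored = base64.b64decode(received["blob"])
assert restored == raw
```

The explicit `.decode("ascii")` is the "extra call" the thread keeps coming back to; whether base64 output should be `str` by default is exactly what is being argued.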
On Thu, 25 Apr 2013 09:55:26 -0400,
Tres Seaver tsea...@palladion.com wrote:
On 04/25/2013 01:43 AM, Antoine Pitrou wrote:
On Thu, 25 Apr 2013 04:19:36 +0200 Lennart Regebro
rege...@gmail.com wrote:
On Thu, Apr 25, 2013 at 3:54 AM,
On Thu, 25 Apr 2013 15:34:45 +0200,
Lennart Regebro rege...@gmail.com wrote:
I don't agree that there is a significant difference between those
wordings in this context. The end result is the same: Things intended
to be handled/seen as textual should be unicode strings, things
intended
On 25/04/2013 15:22, MRAB wrote:
On 25/04/2013 14:34, Lennart Regebro wrote:
On Thu, Apr 25, 2013 at 2:57 PM, Antoine Pitrou solip...@pitrou.net wrote:
I can think of many use cases where I want to *embed* base64-encoded
data in a larger text *before* encoding that text and transmitting
it over
On Thu, Apr 25, 2013 at 5:27 PM, Antoine Pitrou solip...@pitrou.net wrote:
On Thu, 25 Apr 2013 15:34:45 +0200,
Lennart Regebro rege...@gmail.com wrote:
I don't agree that there is a significant difference between those
wordings in this context. The end result is the same: Things intended
Lennart Regebro writes:
On Thu, Apr 25, 2013 at 4:22 PM, MRAB pyt...@mrabarnett.plus.com wrote:
The JSON specification says that it's text. Its string literals can
contain Unicode codepoints. It needs to be encoded to bytes for
transmission and storage, but JSON itself is not a
On Thu, 25 Apr 2013, Lennart Regebro wrote:
On Thu, Apr 25, 2013 at 4:22 PM, MRAB pyt...@mrabarnett.plus.com wrote:
The JSON specification says that it's text. Its string literals can
contain Unicode codepoints. It needs to be encoded to bytes for
transmission and storage, but JSON itself is
MRAB writes:
RFC 4648 says "Base encoding of data is used in many situations to
store or transfer data in environments that, perhaps for legacy reasons,
are restricted to US-ASCII [1] data."
To me, US-ASCII is an encoding, so it appears to be talking about
encoding binary data
On 23.04.2013 23:37, Nick Coghlan wrote:
On 24 Apr 2013 01:25, M.-A. Lemburg m...@egenix.com wrote:
On 23.04.2013 17:15, Barry Warsaw wrote:
On Apr 22, 2013, at 06:22 PM, Guido van Rossum wrote:
You can ask the same question about all the other codecs. (And that
question has indeed been
On 23.04.2013 19:24, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 9:04 AM, M.-A. Lemburg m...@egenix.com wrote:
On 23.04.2013 17:47, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote:
Just as a reminder: we have the general purpose
encode()/decode()
On 4/24/2013 1:22 AM, M.-A. Lemburg wrote:
On 23.04.2013 19:24, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 9:04 AM, M.-A. Lemburg m...@egenix.com wrote:
On 23.04.2013 17:47, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote:
Just as a reminder:
On 04/23/2013 09:29 AM, Stephen J. Turnbull wrote:
By RFC specification, BASE64 is a *textual* representation of
arbitrary binary data.
It isn't text in the sense Py3k means: it is a representation for
transmission on-the-wire for protocols which
Tres Seaver writes:
On 04/23/2013 09:29 AM, Stephen J. Turnbull wrote:
By RFC specification, BASE64 is a *textual* representation of
arbitrary binary data.
It isn't text in the sense Py3k means:
RFC 4648 repeatedly refers to *characters*, without specifying an
encoding for them. In
On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull step...@xemacs.org wrote:
RFC 4648 repeatedly refers to *characters*, without specifying an
encoding for them. In fact, if you copy accurately, you can write
BASE64 on a napkin and that napkin will accurately transmit the data
(assuming it
On Thu, 25 Apr 2013 04:19:36 +0200
Lennart Regebro rege...@gmail.com wrote:
On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull step...@xemacs.org
wrote:
RFC 4648 repeatedly refers to *characters*, without specifying an
encoding for them.
[...]
Base64 is an encoding that transforms
Steven D'Aprano wrote:
- If it is no burden to have to import a module and call an external
function for some transformations, why have encode and decode methods at
all?
Now that all text strings are unicode, the unicode codecs
are in a sense special, in that you can't do any string
I/O at
R. David Murray writes:
You transform *into* the encoding, and untransform *out* of the
encoding. Do you have an example where that would be ambiguous?
In the bytes-to-bytes case, any pair of character encodings (eg, UTF-8
and ISO-8859-15) would do. Or how about in text, ReST to HTML?
On Tue, 23 Apr 2013 22:29:33 +0900, Stephen J. Turnbull step...@xemacs.org
wrote:
R. David Murray writes:
You transform *into* the encoding, and untransform *out* of the
encoding. Do you have an example where that would be ambiguous?
In the bytes-to-bytes case, any pair of character
On Wed, Apr 24, 2013 at 12:16 AM, R. David Murray rdmur...@bitdance.com wrote:
On Tue, 23 Apr 2013 22:29:33 +0900, Stephen J. Turnbull
step...@xemacs.org wrote:
R. David Murray writes:
You transform *into* the encoding, and untransform *out* of the
encoding. Do you have an example
On Apr 22, 2013, at 10:30 PM, Donald Stufft wrote:
I may be dull, but it wasn't until I started using Python 3 that it really
clicked in my head what encode/decode did exactly. In Python 2 I just sort of
sprinkled one or the other when there were errors until the pain stopped. I
mostly attribute
On Apr 22, 2013, at 06:22 PM, Guido van Rossum wrote:
You can ask the same question about all the other codecs. (And that
question has indeed been asked in the past.)
Except for rot13. :-)
The fact that you can do this instead *is* a bit odd. ;)
from codecs import getencoder
encoder =
On 23.04.2013 17:15, Barry Warsaw wrote:
On Apr 22, 2013, at 06:22 PM, Guido van Rossum wrote:
You can ask the same question about all the other codecs. (And that
question has indeed been asked in the past.)
Except for rot13. :-)
The fact that you can do this instead *is* a bit odd. ;)
On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote:
Just as a reminder: we have the general purpose
encode()/decode() functions in the codecs module:
import codecs
r13 = codecs.encode('hello world', 'rot-13')
These interface directly to the codec interfaces, without
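For readers following along, here is a runnable version of the snippet quoted above, extended with the reverse direction and a bytes-to-bytes codec. This is a sketch assuming Python 3.2 or later, where the `rot_13` and `hex_codec` codecs are available through the `codecs` module functions; the `hex_codec` part is an addition, not something from the original message.

```python
import codecs

# str -> str codec, reached through the module-level functions
r13 = codecs.encode("hello world", "rot-13")
print(r13)                           # uryyb jbeyq
print(codecs.decode(r13, "rot-13"))  # hello world

# bytes -> bytes codec, same interface
hexed = codecs.encode(b"hi", "hex_codec")
print(hexed)                               # b'6869'
print(codecs.decode(hexed, "hex_codec"))   # b'hi'
```

Unlike `str.encode()` and `bytes.decode()`, these functions place no restriction on the input and output types, which is why they work for codecs that are not text encodings.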
On 23.04.2013 17:47, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote:
Just as a reminder: we have the general purpose
encode()/decode() functions in the codecs module:
import codecs
r13 = codecs.encode('hello world', 'rot-13')
These interface
R. David Murray writes:
On Tue, 23 Apr 2013 22:29:33 +0900, Stephen J. Turnbull
step...@xemacs.org wrote:
R. David Murray writes:
You transform *into* the encoding, and untransform *out* of the
encoding. Do you have an example where that would be ambiguous?
In the
On Tue, Apr 23, 2013 at 9:04 AM, M.-A. Lemburg m...@egenix.com wrote:
On 23.04.2013 17:47, Guido van Rossum wrote:
On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg m...@egenix.com wrote:
Just as a reminder: we have the general purpose
encode()/decode() functions in the codecs module:
import
On Wed, 24 Apr 2013 01:49:39 +0900, Stephen J. Turnbull step...@xemacs.org
wrote:
R. David Murray writes:
On Tue, 23 Apr 2013 22:29:33 +0900, Stephen J. Turnbull
step...@xemacs.org wrote:
R. David Murray writes:
You transform *into* the encoding, and untransform *out* of the
On 4/23/2013 12:49 PM, Stephen J. Turnbull wrote:
Which is an obnoxious API, since (1) you've now made it impossible to
use transform for
bytestring.transform(from='utf-8', to='iso-8859-1')
bytestring.transform(from='ulaw', to='mp3')
textstring.transform(from='rest', to='html')
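As a side note on the spelling above: `from` is a reserved word in Python, so `transform(from=..., to=...)` could not be written literally as a keyword call. A minimal sketch of the first case (charset to charset) using only what exists today might look like the following; `recode`, `source` and `target` are hypothetical names chosen for illustration, not a proposed or existing API.

```python
def recode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character encoding to another.

    Hypothetical helper: both directions are named explicitly, so there
    is no question of which side counts as "encoded".
    """
    return data.decode(source).encode(target)


utf8 = "caf\u00e9".encode("utf-8")             # b'caf\xc3\xa9'
latin1 = recode(utf8, "utf-8", "iso-8859-1")   # b'caf\xe9'
print(latin1)
```

This only covers text encodings; the ulaw-to-mp3 and ReST-to-HTML cases in the quoted message have no standard-library codec behind them and are left out.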
On 24 Apr 2013 01:25, M.-A. Lemburg m...@egenix.com wrote:
On 23.04.2013 17:15, Barry Warsaw wrote:
On Apr 22, 2013, at 06:22 PM, Guido van Rossum wrote:
You can ask the same question about all the other codecs. (And that
question has indeed been asked in the past.)
Except for
Stephen J. Turnbull wrote:
By RFC specification, BASE64 is a
*textual* representation of arbitrary binary data. (Cf. URIs.) The
natural interpretation of .encode('base64') in that context would be
as a bytes-to-text encoder. However, ... In
practice, we invariably use an ASCII octet stream
R. David Murray writes:
I think you're completely missing my point here. The problem is that
in the cases I mention, what is encoded data and what is decoded data
can only be decided by asking the user.
I think I understood that. I don't understand why that's a
problem.
It's a
Greg Ewing writes:
Web developers might grumble about the need for an extra call,
but they can no longer claim it would kill the performance of
their web server.
Of course they can. There never was any performance measurement that
supported that claim in the first place. I don't see how
Terry Jan Reedy writes:
.transform should be explicit and always take two args, no implicit
defaults, the 'from form' and the 'to' form. They can be labelled by
position in the natural order (from, to)
Not natural to escaped-from-C programmers, though. I hesitate to say
make it
Hi everyone,
Take a look at this question:
http://stackoverflow.com/questions/16122435/python-3-how-do-i-use-bytes-to-bytes-and-string-to-string-encodings/16122472?noredirect=1#comment23034787_16122472
Is there really no way to use base64 that's as short as:
b'whatever'.encode('base64')
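For context, a sketch of the two spellings that do work in Python 3 (assuming 3.2 or later for the `base64_codec` name; this is not part of the original question):

```python
import base64
import codecs

data = b"whatever"

# The "two lines" version: import plus a function call.
via_module = base64.b64encode(data)
print(via_module)                      # b'd2hhdGV2ZXI='

# The codec is still there, but only through the codecs module functions.
# Note that it appends a trailing newline, as the Python 2 codec did.
via_codec = codecs.encode(data, "base64_codec")
print(via_codec)

# Both round-trip back to the original bytes.
assert base64.b64decode(via_module) == data
assert codecs.decode(via_codec, "base64_codec") == data
```

Neither is as short as the Python 2 method call, which is the complaint the rest of the thread discusses.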
Hi,
Your question has been discussed for 4 years in the following issue:
http://bugs.python.org/issue7475
The last proposition is to add transform() and untransform() methods
to bytes and str types. But nobody implemented the idea. If I remember
correctly, the missing point is how to define which
If two lines is cumbersome, you're in for a cumbersome life as a programmer.
On Apr 22, 2013 7:31 AM, Ram Rachum r...@rachum.com wrote:
Hi everyone,
Take a look at this question:
On 22 April 2013 12:39, Calvin Spealman ironfro...@gmail.com wrote:
If two lines is cumbersome, you're in for a cumbersome life as a programmer.
One of which is essentially Python's equivalent of a declaration...
Paul
On Mon, Apr 22, 2013 at 7:39 AM, Calvin Spealman ironfro...@gmail.com wrote:
If two lines is cumbersome, you're in for a cumbersome life as a programmer.
Other encodings are either missing completely from the stdlib, or have
corrupted behavior. For example, string_escape is gone, and
unicode_escape
On Mon, Apr 22, 2013 at 09:50:14AM -0400, Devin Jeanpierre
jeanpierr...@gmail.com wrote:
unicode_escape doesn't make any sense anymore -- python code is text,
not bytes, so why does 'abc'.encode('unicode_escape') return bytes?
AFAIU the situation is simple: unicode.encode(encoding) returns
Devin Jeanpierre writes:
why does 'abc'.encode('unicode_escape') return bytes?
Duck-typing: encode always turns unicode into bytes.
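The behaviour being questioned is easy to reproduce. A short sketch (the sample string is arbitrary): `str.encode` returns `bytes` whatever the codec, so getting escaped *text* back takes one more step.

```python
escaped = "abc\n\u00e9".encode("unicode_escape")
print(type(escaped).__name__)   # bytes
print(escaped)                  # b'abc\\n\\xe9'

# The escaped form is pure ASCII, so turning it into text is a plain decode.
as_text = escaped.decode("ascii")
print(as_text)                  # abc\n\xe9  (backslashes are literal here)

# Decoding with the same codec restores the original string.
assert escaped.decode("unicode_escape") == "abc\n\u00e9"
```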
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
On Mon, 22 Apr 2013 09:50:14 -0400, Devin Jeanpierre jeanpierr...@gmail.com
wrote:
On Mon, Apr 22, 2013 at 7:39 AM, Calvin Spealman ironfro...@gmail.com wrote:
If two lines is cumbersome, you're in for a cumbersome life as a programmer.
Other encodings are either missing completely from the
Victor Stinner wrote:
The last proposition is to add transform() and untransform() methods
to bytes and str types. ... If I remember
correctly, the missing point is how to define which types are
supported by a codec
Also, for any given codec, which direction is transform
and which is
On Tue, 23 Apr 2013 11:16:20 +1200, Greg Ewing greg.ew...@canterbury.ac.nz
wrote:
Victor Stinner wrote:
The last proposition is to add transform() and untransform() methods
to bytes and str types. ... If I remember
correctly, the missing point is how to define which types are
supported
--Guido van Rossum (sent from Android phone)
On Apr 22, 2013 6:09 PM, R. David Murray rdmur...@bitdance.com wrote:
On Tue, 23 Apr 2013 11:16:20 +1200, Greg Ewing
greg.ew...@canterbury.ac.nz wrote:
Victor Stinner wrote:
The last proposition is to add transform() and untransform() methods
On 23/04/13 09:16, Greg Ewing wrote:
Victor Stinner wrote:
The last proposition is to add transform() and untransform() methods
to bytes and str types. ... If I remember
correctly, the missing point is how to define which types are
supported by a codec
Also, for any given codec, which
On Apr 22, 2013, at 10:04 PM, Steven D'Aprano st...@pearwood.info wrote:
On 23/04/13 09:16, Greg Ewing wrote:
Victor Stinner wrote:
The last proposition is to add transform() and untransform() methods
to bytes and str types. ... If I remember
correctly, the missing point is how to define
On Mon, Apr 22, 2013 at 7:04 PM, Steven D'Aprano st...@pearwood.info wrote:
As others have pointed out in the past, repeatedly, the codec system is
completely general and can transform bytes-bytes and text-text just as
easily as bytes-text. Or indeed any bijection, as the docs for 2.7 point
On Tue, Apr 23, 2013 at 4:04 AM, Steven D'Aprano st...@pearwood.info wrote:
As others have pointed out in the past, repeatedly, the codec system is
completely general and can transform bytes-bytes and text-text just as
easily as bytes-text.
Yes, but the encode()/decode() methods are not, and
Using decode() and encode() would break that predictability. But someone
suggested the use of transform() and untransform() instead. That would
clarify that the transformation is bytes -> bytes and Unicode string ->
Unicode string.
On 23 Apr 2013 05:50, Lennart Regebro rege...@gmail.com wrote:
On