Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Stephen J. Turnbull
Lennart Regebro writes:

 > Base64 is an encoding that transforms between 8-bit streams. Let it be
 > that. Don't try to shoehorn it into a completely different kind of
 > encoding.

By "completely different kind of encoding" do you mean "codec"?

I think that would be an unfortunate result.  These operations on
streams are theoretically nicely composable.  It would be nice if
practice reflected that by having a uniform API for all of these
operations (charset translation, encoded text to internal, content
transfer encoding, compression ...).  I think it would be useful, too,
though I can't prove that.

Anyway, this discussion belongs on python-ideas at this point.  Or
would, if I had an idea about implementation.  I'll take it there when
I do have something to say about implementation.
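
As a rough illustration of the composability described above, the stdlib
bytes-to-bytes codecs can already be chained by hand. This is only a sketch:
the helper names apply_chain and undo_chain are invented for the example, and
it assumes a Python 3 in which codecs.encode() accepts the "zlib_codec" and
"base64_codec" names.

```python
import codecs

def apply_chain(text, charset, transforms):
    """Encode text to bytes, then apply each bytes-to-bytes codec in order."""
    data = text.encode(charset)
    for name in transforms:
        data = codecs.encode(data, name)
    return data

def undo_chain(data, charset, transforms):
    """Reverse apply_chain: undo the transforms last-to-first, then decode."""
    for name in reversed(transforms):
        data = codecs.decode(data, name)
    return data.decode(charset)

# Hypothetical pipeline: charset translation, compression, transfer encoding.
steps = ["zlib_codec", "base64_codec"]
wire = apply_chain("naïve café", "utf-8", steps)
restored = undo_chain(wire, "utf-8", steps)
```

Each step here goes through the same codecs.encode()/codecs.decode() call,
which is the kind of uniformity the paragraph above is asking for.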
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A decade as a core dev

2013-04-24 Thread Georg Brandl
Am 18.04.2013 17:02, schrieb Brett Cannon:
> Today marks my 10 year anniversary as a core developer on Python. I
> wrote a blog post to mark the occasion
> (http://sayspy.blogspot.ca/2013/04/a-decade-of-commits.html), but I
> wanted to personally thank python-dev for the past decade (and
> whatever comes in the future). All of you taught me how to really
> program and for that I will be eternally grateful. And the friendships
> I have built through this list are priceless.

Hah, I only have 2 years to go. Time flies like an unladen swallow...
Congrats :)

Georg



Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Lennart Regebro
On Thu, Apr 25, 2013 at 7:43 AM, Antoine Pitrou  wrote:
> On Thu, 25 Apr 2013 04:19:36 +0200
> Lennart Regebro  wrote:
>> On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull  
>> wrote:
>> > RFC 4648 repeatedly refers to *characters*, without specifying an
>> > encoding for them.
> [...]
>>
>> Base64 is an encoding that transforms between 8-bit streams.
>
> No, it isn't. What Stephen wrote above.

Yes it is. Base64 takes 8-bit bytes and transforms them into another
8-bit stream that can be safely transmitted over various channels that
would mangle an unencoded 8-bit stream, such as email etc.

http://en.wikipedia.org/wiki/Base64

>> Either you get a "LookupError: unknown
>> encoding: base64", which is what you get now, or you get a
>> UnicodeEncodeError if the text is not ASCII. We don't want the
>> latter, because it means that code that looks fine for the developer
>> breaks in real life because the developer was American
>
> That's bogus.

No, that's real life.

> By the same argument, we should suppress any
> encoding which isn't able to represent all possible unicode strings.

No, if you explicitly use such an encoding it is because you need to
because you are transferring data to a system that needs the encoding
in question. Unicode errors are unavoidable at that point, not an
unexpected surprise because a conversion happened implicitly that you
didn't know about.
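
A small sketch of the distinction being drawn here, using only the stdlib
base64 module on Python 3. The sample values are made up for the example.

```python
import base64

# Base64 as a bytes-to-bytes transform: bytes in, bytes out.
raw = b"\x00\xffbinary payload"
encoded = base64.b64encode(raw)

# Handing it a str is rejected outright; no implicit conversion happens.
try:
    base64.b64encode("text")
    str_rejected = False
except TypeError:
    str_rejected = True

# An explicitly requested charset conversion can fail, but it fails at a
# point the developer chose and can see.
name = "Åsa"
try:
    name.encode("ascii")
    explicit_error = None
except UnicodeEncodeError as exc:
    explicit_error = exc
```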

//Lennart


Re: [Python-Dev] slow hg clone of python repo?

2013-04-24 Thread Antoine Pitrou
On Wed, 24 Apr 2013 16:24:15 -0700
Guido van Rossum  wrote:
> It's a big repo. Patience.

We are actually having bandwidth issues with the current OSU/OSL
hosting of python.org machines, which is affecting not only
hg.python.org but also pypi.python.org, for at least some users.

I believe Noah and friends/colleagues are investigating :-)

Regards

Antoine.




Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Antoine Pitrou
On Thu, 25 Apr 2013 04:19:36 +0200
Lennart Regebro  wrote:
> On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull  
> wrote:
> > RFC 4648 repeatedly refers to *characters*, without specifying an
> > encoding for them.
[...]
> 
> Base64 is an encoding that transforms between 8-bit streams.

No, it isn't. What Stephen wrote above.

> Either you get a "LookupError: unknown
> encoding: base64", which is what you get now, or you get a
> UnicodeEncodeError if the text is not ASCII. We don't want the
> latter, because it means that code that looks fine for the developer
> breaks in real life because the developer was American

That's bogus. By the same argument, we should suppress any
encoding which isn't able to represent all possible unicode strings.
That's almost all encodings provided by Python (including utf-8, if
you consider lone surrogates).

I'm sorry for Americans, but they *still* must know about character
encodings, and be ready to handle UnicodeErrors, when using Python 3 for
encoding/decoding bytestrings. There's no way around it.
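
To make the point about partial encodings concrete, here is a minimal check
on Python 3. The helper name can_encode is invented for the example.

```python
def can_encode(text, encoding):
    """Return True if text is representable in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

samples = ["café", "日本語", "\ud800"]  # the last one is a lone surrogate
results = {
    (text, enc): can_encode(text, enc)
    for text in samples
    for enc in ("ascii", "latin-1", "utf-8")
}
```

With these samples, ASCII accepts none of the strings, Latin-1 accepts only
the first, and UTF-8 accepts everything except the lone surrogate.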

Regards

Antoine.




Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Lennart Regebro
On Thu, Apr 25, 2013 at 3:54 AM, Stephen J. Turnbull  wrote:
> RFC 4648 repeatedly refers to *characters*, without specifying an
> encoding for them.  In fact, if you copy accurately, you can write
> BASE64 on a napkin and that napkin will accurately transmit the data
> (assuming it doesn't run into sleet or gloom of night).

Or Mrs Cake.

> What else is that but "text in the sense of Py3k"?

Text in the sense of Py3k is Unicode. That an 8-bit character stream
(or in this case 6-bit) fits in the 31-bit character space of Unicode
doesn't make it Unicode, and hence doesn't make it text. (Napkins of
course have an even higher bit density than 31 bits per character,
unless you write very small.) From the viewpoint of Py3k, bytes data is
not text.

This is a very useful way to deal with Unicode. See also
http://regebro.wordpress.com/2011/03/23/unconfusing-unicode-what-is-unicode/

> My point is not that Python's base64 codec *should* be bytes-to-str
> and back.

Base64 does not convert between a Unicode character stream and an
8-bit byte stream. It converts between an 8-bit byte stream and another
8-bit byte stream. It therefore should be bytes to bytes. To fit
Unicode text into Base64 you have to first use an encoding on that
Unicode text to convert it to bytes.

> What I'm groping toward is an idea of a "variable method", so that we
> could use .encode and .decode where they are TOOWTDI for people even
> though a purely formal interpretation of duck-typing would say "but
> why is that blue whale quacking, waddling, and flying?"  In other
> words (although I have no idea how best to implement it), I would like
> "somestring.encode('base64')" to fail with "I don't know how to do
> that" (an attribute lookup error?), the same way that
> "somebytes.encode('utf-8')" does in Python 3 today.

There are only two options there. Either you get a "LookupError: unknown
encoding: base64", which is what you get now, or you get a
UnicodeEncodeError if the text is not ASCII. We don't want the
latter, because it means that code that looks fine for the developer
breaks in real life because the developer was American and didn't
think of this, but his client happens to have an accent in the name.

Base64 is an encoding that transforms between 8-bit streams. Let it be
that. Don't try to shoehorn it into a completely different kind of
encoding.
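
The two-step route described above (text to bytes with an explicit charset,
then bytes to bytes with Base64) might look like this on Python 3. The helper
names and the default charset are assumptions made for the example.

```python
import base64

def text_to_base64(text, charset="utf-8"):
    """Step 1: str -> bytes via an explicit charset. Step 2: bytes -> bytes."""
    return base64.b64encode(text.encode(charset))

def base64_to_text(data, charset="utf-8"):
    """Undo the two steps in reverse order."""
    return base64.b64decode(data).decode(charset)

token = text_to_base64("Renée")      # a name with an accent
round_trip = base64_to_text(token)
```

Because the charset is chosen explicitly, a name with an accent goes through
without any implicit ASCII conversion getting in the way.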

//Lennart


Re: [Python-Dev] slow hg clone of python repo?

2013-04-24 Thread R. David Murray
On Wed, 24 Apr 2013 18:14:18 -0700, Eli Bendersky  wrote:
> On Wed, Apr 24, 2013 at 4:37 PM, Sean Felipe Wolfe wrote:
> 
> > On Wed, Apr 24, 2013 at 4:24 PM, Guido van Rossum 
> > wrote:
> > > It's a big repo. Patience.
> > >
> > > On Wed, Apr 24, 2013 at 4:17 PM, Sean Felipe Wolfe 
> > wrote:
> > >> Hey everybody, I'm trying to download the python sources with hg and
> > >> it's taking a while ... 7+ minutes so far and all I've got is
> > >> .../cpython and .../cpython/.hg . Any ideas as to why there's a
> > >> delay?
> > >>
> > >> I'm following the dev guide with this command:
> > >> hg clone http://hg.python.org/cpython
> > >>
> > >> I'm on Linux Mint 14, using the supplied hg version 2.2.2 . My
> > >> internet connection seems speedy enough.
> > >>
> >
> 
> Sean, 7 minutes doesn't sound bad. Keep in mind that with Hg, the whole
> repository is being cloned to your computer - all active (and inactive)
> branches, all history, etc. The up-side is that after this initial clone,
> subsequent pulls are pretty quick and all other operations are local and
> super fast (log, blame, etc.)

To further clarify what Eli said, "the whole repo" gets put into that
.hg directory *first*, and only at the end is a working directory
checkout done.  So all you will see is cpython/.hg until the very last
moment when it will start telling you about the checkout being done.

--David


Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Stephen J. Turnbull
Tres Seaver writes:

 > On 04/23/2013 09:29 AM, Stephen J. Turnbull wrote:
 > > By RFC specification, BASE64 is a *textual* representation of
 > > arbitrary binary data.
 > 
 > It isn't "text" in the sense Py3k means:

RFC 4648 repeatedly refers to *characters*, without specifying an
encoding for them.  In fact, if you copy accurately, you can write
BASE64 on a napkin and that napkin will accurately transmit the data
(assuming it doesn't run into sleet or gloom of night).  What else is
that but "text in the sense of Py3k"?

My point is not that Python's base64 codec *should* be bytes-to-str
and back.  My point is that, both in the formal spec and in historical
evolution, that is a plausible interpretation of ".encode('base64')"
which happens to be the reverse of the normal codec convention, where
".encode(codec)" is a *string* method, and ".decode(codec)" is a
*bytes* method.

This is not harder to learn for people (for BASE64 encoding or for
coded character sets), because in each case there's a natural sense of
direction for *en*coding vs. *de*coding.  But it does break duck-
typing, as does the web developer bytes-to-bytes usage of BASE64.

What I'm groping toward is an idea of a "variable method", so that we
could use .encode and .decode where they are TOOWTDI for people even
though a purely formal interpretation of duck-typing would say "but
why is that blue whale quacking, waddling, and flying?"  In other
words (although I have no idea how best to implement it), I would like
"somestring.encode('base64')" to fail with "I don't know how to do
that" (an attribute lookup error?), the same way that
"somebytes.encode('utf-8')" does in Python 3 today.
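
For reference, this is how the current behaviour can be observed on
Python 3. The variable names are only for the example, and the exact error
message differs between Python versions, so only the exception type is
recorded.

```python
# str has .encode but no .decode; bytes has .decode but no .encode.
bytes_has_encode = hasattr(b"abc", "encode")
str_has_decode = hasattr("abc", "decode")

# Calling the missing method fails at attribute lookup.
try:
    b"abc".encode("utf-8")
    bytes_outcome = "encoded"
except AttributeError:
    bytes_outcome = "AttributeError"

# Asking str.encode for base64 fails later, at codec lookup.
try:
    "abc".encode("base64")
    str_outcome = "encoded"
except LookupError:
    str_outcome = "LookupError"
```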



Re: [Python-Dev] slow hg clone of python repo?

2013-04-24 Thread Eli Bendersky
On Wed, Apr 24, 2013 at 4:37 PM, Sean Felipe Wolfe wrote:

> On Wed, Apr 24, 2013 at 4:24 PM, Guido van Rossum 
> wrote:
> > It's a big repo. Patience.
> >
> > On Wed, Apr 24, 2013 at 4:17 PM, Sean Felipe Wolfe 
> wrote:
> >> Hey everybody, I'm trying to download the python sources with hg and
> >> it's taking a while ... 7+ minutes so far and all I've got is
> >> .../cpython and .../cpython/.hg . Any ideas as to why there's a
> >> delay?
> >>
> >> I'm following the dev guide with this command:
> >> hg clone http://hg.python.org/cpython
> >>
> >> I'm on Linux Mint 14, using the supplied hg version 2.2.2 . My
> >> internet connection seems speedy enough.
> >>
>

Sean, 7 minutes doesn't sound bad. Keep in mind that with Hg, the whole
repository is being cloned to your computer - all active (and inactive)
branches, all history, etc. The up-side is that after this initial clone,
subsequent pulls are pretty quick and all other operations are local and
super fast (log, blame, etc.)

Eli


Re: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library

2013-04-24 Thread Greg Ewing

R. David Murray wrote:

> If 'a' is now an instance of MyEnum, then I would expect that:
>
> MyEnum.a.b
>
> would be valid

That is indeed a quirk, but it's not unprecedented. Exactly
the same thing happens in Java. This compiles and runs:

  enum Foo {
    a, b
  }

  public class Main {

    public static void main(String[] args) {
      System.out.printf("%s\n", Foo.a.b);
    }

  }

There probably isn't much use for that behaviour, but on
the other hand, it's probably not worth going out of our
way to prevent it.

--
Greg


Re: [Python-Dev] slow hg clone of python repo?

2013-04-24 Thread Sean Felipe Wolfe
On Wed, Apr 24, 2013 at 4:24 PM, Guido van Rossum  wrote:
> It's a big repo. Patience.
>
> On Wed, Apr 24, 2013 at 4:17 PM, Sean Felipe Wolfe  
> wrote:
>> Hey everybody, I'm trying to download the python sources with hg and
>> it's taking a while ... 7+ minutes so far and all I've got is
>> .../cpython and .../cpython/.hg . Any ideas as to why there's a
>> delay?
>>
>> I'm following the dev guide with this command:
>> hg clone http://hg.python.org/cpython
>>
>> I'm on Linux Mint 14, using the supplied hg version 2.2.2 . My
>> internet connection seems speedy enough.
>>
>> TIA!
>> Sean

Thanks :) It actually completed quickly after I sent the email.  :P


Re: [Python-Dev] slow hg clone of python repo?

2013-04-24 Thread Guido van Rossum
It's a big repo. Patience.

On Wed, Apr 24, 2013 at 4:17 PM, Sean Felipe Wolfe  wrote:
> Hey everybody, I'm trying to download the python sources with hg and
> it's taking a while ... 7+ minutes so far and all I've got is
> .../cpython and .../cpython/.hg . Any ideas as to why there's a
> delay?
>
> I'm following the dev guide with this command:
> hg clone http://hg.python.org/cpython
>
> I'm on Linux Mint 14, using the supplied hg version 2.2.2 . My
> internet connection seems speedy enough.
>
> TIA!
> Sean
>
> --
> A musician must make music, an artist must paint, a poet must write,
> if he is to be ultimately at peace with himself.
> - Abraham Maslow



-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] slow hg clone of python repo?

2013-04-24 Thread Sean Felipe Wolfe
Hey everybody, I'm trying to download the python sources with hg and
it's taking a while ... 7+ minutes so far and all I've got is
.../cpython and .../cpython/.hg . Any ideas as to why there's a
delay?

I'm following the dev guide with this command:
hg clone http://hg.python.org/cpython

I'm on Linux Mint 14, using the supplied hg version 2.2.2 . My
internet connection seems speedy enough.

TIA!
Sean

-- 
A musician must make music, an artist must paint, a poet must write,
if he is to be ultimately at peace with himself.
- Abraham Maslow


Re: [Python-Dev] A decade as a core dev

2013-04-24 Thread Sean Felipe Wolfe
On Thu, Apr 18, 2013 at 8:02 AM, Brett Cannon  wrote:
> Today marks my 10 year anniversary as a core developer on Python. I
> wrote a blog post to mark the occasion
> (http://sayspy.blogspot.ca/2013/04/a-decade-of-commits.html), but I
> wanted to personally thank python-dev for the past decade (and
> whatever comes in the future). All of you taught me how to really
> program and for that I will be eternally grateful. And the friendships
> I have built through this list are priceless.


Congratulations Brett :)  I am just getting started on my
contributory journey and this is good positive reinforcement.
Saludos!!


Re: [Python-Dev] I cannot create bug reports

2013-04-24 Thread Brian Curtin
On Wed, Apr 24, 2013 at 1:55 PM, Daniel Wong  wrote:
> Thank you. That was the problem.
>
> I feel kind of stupid now. In my defense, the error message could have been
> more helpful, and requesting the bug creation form could have thrown up a
> login error instead of showing up blank. File another bug?

Bugs about the bug tracker go to http://psf.upfronthosting.co.za/roundup/meta/


Re: [Python-Dev] I cannot create bug reports

2013-04-24 Thread Daniel Wong
Thank you. That was the problem.

I feel kind of stupid now. In my defense, the error message could have been
more helpful, and requesting the bug creation form could have thrown up a
login error instead of showing up blank. File another bug?


On Wed, Apr 24, 2013 at 11:45 AM, Ian Cordasco
wrote:

> The first thing that comes to mind is that your session expired and
> you need to log-in again. After logging in myself I see the form in
> all of its glory.
>
> On Wed, Apr 24, 2013 at 2:35 PM, Daniel Wong 
> wrote:
> > Glorious members of python-dev,
> >
> > I'd like to submit a patch, but I cannot create a bug report. As of this
> > morning (US West Coast), when I go to
> > http://bugs.python.org/issue?@template=item I get no form fields.
> >
> > I went there last night, and I was able to get a form. I kept that tab open
> > over night, and tried to submit this morning. When I did that, I got
> > permission denied errors. It seems that something weird has happened to my
> > account, or bug tracker itself changed in my sleep.
> >
> > Anyone have any idea what's going on here?
> >
> > Daniel


Re: [Python-Dev] I cannot create bug reports

2013-04-24 Thread Ian Cordasco
The first thing that comes to mind is that your session expired and
you need to log-in again. After logging in myself I see the form in
all of its glory.

On Wed, Apr 24, 2013 at 2:35 PM, Daniel Wong  wrote:
> Glorious members of python-dev,
>
> I'd like to submit a patch, but I cannot create a bug report. As of this
> morning (US West Coast), when I go to
> http://bugs.python.org/issue?@template=item I get no form fields.
>
> I went there last night, and I was able to get a form. I kept that tab open
> over night, and tried to submit this morning. When I did that, I got
> permission denied errors. It seems that something weird has happened to my
> account, or bug tracker itself changed in my sleep.
>
> Anyone have any idea what's going on here?
>
> Daniel


[Python-Dev] I cannot create bug reports

2013-04-24 Thread Daniel Wong
Glorious members of python-dev,

I'd like to submit a patch, but I cannot create a bug report. As of this
morning (US West Coast), when I go to
http://bugs.python.org/issue?@template=item I get no form fields.

I went there last night, and I was able to get a form. I kept that tab open
over night, and tried to submit this morning. When I did that, I got
permission denied errors. It seems that something weird has happened to my
account, or bug tracker itself changed in my sleep.

Anyone have any idea what's going on here?

Daniel


Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Tres Seaver

On 04/23/2013 09:29 AM, Stephen J. Turnbull wrote:
> By RFC specification, BASE64 is a *textual* representation of
> arbitrary binary data.

It isn't "text" in the sense Py3k means:  it is a representation for
transmission on-the-wire for protocols which require 7-bit-safe data.
Nobody working with base64-encoded data is going to expect to do "normal"
string processing on that data:  the closest thing to that is splitting
it into 72-byte chunks for transmission via e-mail.
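
As an illustration of that kind of chunking, the stdlib helper
base64.encodebytes() (Python 3.1 and later) emits newline-separated lines.
Note that it wraps at 76 output characters, the MIME limit, rather than the
72 mentioned above; the payload below is made up for the example.

```python
import base64

payload = bytes(range(256)) * 2           # 512 arbitrary bytes
mime_body = base64.encodebytes(payload)   # bytes, with b"\n" line breaks

lines = mime_body.splitlines()
longest = max(len(line) for line in lines)

# The line breaks are transport framing only; decoding ignores them.
decoded = base64.decodebytes(mime_body)
```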

Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"http://palladion.com



Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread Glenn Linderman

On 4/24/2013 1:22 AM, M.-A. Lemburg wrote:
> On 23.04.2013 19:24, Guido van Rossum wrote:
>> On Tue, Apr 23, 2013 at 9:04 AM, M.-A. Lemburg  wrote:
>>> On 23.04.2013 17:47, Guido van Rossum wrote:
>>>> On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg  wrote:
>>>>> Just as reminder: we have the general purpose
>>>>> encode()/decode() functions in the codecs module:
>>>>>
>>>>> import codecs
>>>>> r13 = codecs.encode('hello world', 'rot-13')
>>>>>
>>>>> These interface directly to the codec interfaces, without
>>>>> enforcing type restrictions. The codec defines the supported
>>>>> input and output types.
>>>>
>>>> As an implementation mechanism I see nothing wrong with this. I hope
>>>> the codecs module lets you introspect the input and output types of a
>>>> codec given by name?
>>>
>>> At the moment there is no standard interface to access supported
>>> input and output types... but then: regular Python functions or
>>> methods also don't provide such functionality, so no surprise
>>> there ;-)
>>
>> Not quite the same though. Each function has its own unique behavior.
>> But codecs support a standard interface, *except* that the input and
>> output types sometimes vary.
>
> The codec system itself
>
>>> It's mostly a matter of specifying the supported type
>>> combinations in the codec documentation.
>>>
>>> BTW: What would be a use case where you'd want to
>>> programmatically access such information before calling
>>> the codec ?
>>
>> As you know, in Python 3, most code working with bytes doesn't also
>> work with strings, and vice versa (except for a few cases where we've
>> gone out of our way to write polymorphic code -- but users rarely do
>> so, and any time you use a string or bytes literal you basically limit
>> yourself to that type).
>>
>> Suppose I write a command-line utility that reads a file, runs it
>> through a codec, and writes the result to another file. Suppose the
>> name of the codec is a command-line argument (as well as the
>> filenames). I need to know whether to open the files in text or binary
>> mode based on the name of the codec.
>
> Ok, so you need to know which codecs your tool can support and
> which of those need text input and which bytes input.
>
> I've been thinking about this some more: I think that type
> information alone is not flexible enough to cover such
> use cases.

Maybe MIME type and encoding would be sufficient type information, but
probably not str vs. bytes.

> In your use case you'd want to only permit use of a certain
> set of codecs, not simply all of them, since some might
> not implement what you actually want to achieve with the tool,
> e.g. a user might have installed a codec set that adds
> support for reading and writing image data, but your
> intended use was to only support text data.

MIME type supports this sort of concept, with the two-level hierarchy of
naming the type... text/xml text/plain image/jpeg

> So what we need is a way to allow the codecs to say e.g.
> "I work on text", "I support encoding bytes and text",
> "I encode to bytes", "I'm reversible", "I transform
> input data", "I support bytes and text, and will create
> same type output", "I work on image data", "I work on
> X509 certificates", "I work on XML data", etc.

Guess what I think you are re-inventing here
Nope, guess again
Yep, MIME types _plus_ encodings.

> In other words, we need a form of tagging system, with a
> set of standard tags that each codec can publish and
> which also allows non-standard tags (which can then at
> some point be made standard, if there's agreement on them).

Hmm.  Sounds just like the registry for, um, you guessed it: MIME types.

> Given a codec name you could then ask the codec registry for
> the codec tags and verify that the chosen codec handles
> text data, needs bytes or text encoding input and
> creates bytes as encoding output. If the registry returns
> codec tags that don't include the "I work on text" tag,
> the tool could then raise an error.

For just doing text encoding transformations,  text/plain would work as
a MIME type, and the encodings of interest for the encodings.

Seems like "str" always means "Unicode" but the MIME type can vary;
"bytes" might mean encoded text, and the MIME type can also vary.

For non-textual transformations, "encoding" might mean Base 64, BinHex,
or other such representations... but those can also be applied to text,
so it might be a 3rd dimension, or it might just be a list of encodings
rather than a single encoding.

Compression could be another dimension, or perhaps another encoding.

But really, then, a transformation needs to be a list of steps; a codec
can sign up to perform one or more of the steps, a sequence of codecs
would have to be found, capable of performing a subsequence of the
steps, and then run in the appropriate order.

This all sounds so general, that probably the Python compiler could be
implemented as a codec :)  Or any compiler. Probably a web server could
be implemented as a codec too :)  Well, maybe not, codecs have limited
error handling and reporting abilities.
___
Python-Dev mailing list
Python-Dev@python.org
htt

Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread M.-A. Lemburg
On 23.04.2013 19:24, Guido van Rossum wrote:
> On Tue, Apr 23, 2013 at 9:04 AM, M.-A. Lemburg  wrote:
>> On 23.04.2013 17:47, Guido van Rossum wrote:
>>> On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg  wrote:
>>>> Just as reminder: we have the general purpose
>>>> encode()/decode() functions in the codecs module:
>>>>
>>>> import codecs
>>>> r13 = codecs.encode('hello world', 'rot-13')
>>>>
>>>> These interface directly to the codec interfaces, without
>>>> enforcing type restrictions. The codec defines the supported
>>>> input and output types.
>>>
>>> As an implementation mechanism I see nothing wrong with this. I hope
>>> the codecs module lets you introspect the input and output types of a
>>> codec given by name?
>>
>> At the moment there is no standard interface to access supported
>> input and output types... but then: regular Python functions or
>> methods also don't provide such functionality, so no surprise
>> there ;-)
> 
> Not quite the same though. Each function has its own unique behavior.
> But codecs support a standard interface, *except* that the input and
> output types sometimes vary.

The codec system itself

>> It's mostly a matter of specifying the supported type
>> combinations in the codec documentation.
>>
>> BTW: What would be a use case where you'd want to
>> programmatically access such information before calling
>> the codec ?
> 
> As you know, in Python 3, most code working with bytes doesn't also
> work with strings, and vice versa (except for a few cases where we've
> gone out of our way to write polymorphic code -- but users rarely do
> so, and any time you use a string or bytes literal you basically limit
> yourself to that type).
> 
> Suppose I write a command-line utility that reads a file, runs it
> through a codec, and writes the result to another file. Suppose the
> name of the codec is a command-line argument (as well as the
> filenames). I need to know whether to open the files in text or binary
> mode based on the name of the codec.

Ok, so you need to know which codecs your tool can support and
which of those need text input and which bytes input.

I've been thinking about this some more: I think that type
information alone is not flexible enough to cover such
use cases.

In your use case you'd want to only permit use of a certain
set of codecs, not simply all of them, since some might
not implement what you actually want to achieve with the tool,
e.g. a user might have installed a codec set that adds
support for reading and writing image data, but your
intended use was to only support text data.

So what we need is a way to allow the codecs to say e.g.
"I work on text", "I support encoding bytes and text",
"I encode to bytes", "I'm reversible", "I transform
input data", "I support bytes and text, and will create
same type output", "I work on image data", "I work on
X509 certificates", "I work on XML data", etc.

In other words, we need a form of tagging system, with a
set of standard tags that each codec can publish and
which also allows non-standard tags (which can then at
some point be made standard, if there's agreement on them).

Given a codec name you could then ask the codec registry for
the codec tags and verify that the chosen codec handles
text data, needs bytes or text encoding input and
creates bytes as encoding output. If the registry returns
codec tags that don't include the "I work on text" tag,
the tool could then raise an error.
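
A purely hypothetical sketch of such a tag lookup, applied to the
command-line tool use case quoted above. Nothing like CODEC_TAGS, get_tags()
or file_modes_for() exists in the codecs module; the names and the tag
vocabulary are invented here only to make the idea concrete.

```python
# Hypothetical tag table: a real design would let each codec publish its tags.
CODEC_TAGS = {
    "utf-8": {"text", "encodes-str-to-bytes", "reversible"},
    "rot_13": {"text", "encodes-str-to-str", "reversible"},
    "base64_codec": {"binary", "encodes-bytes-to-bytes", "reversible"},
}

def get_tags(codec_name):
    """Return the (hypothetical) tags published for a codec name."""
    return CODEC_TAGS.get(codec_name, frozenset())

def file_modes_for(codec_name):
    """Pick (input mode, output mode) for a text-only encoding tool."""
    tags = get_tags(codec_name)
    if "text" not in tags:
        raise ValueError("codec %r does not work on text" % codec_name)
    if "encodes-str-to-bytes" in tags:
        return ("r", "wb")    # read text, write bytes
    if "encodes-str-to-str" in tags:
        return ("r", "w")     # read text, write text
    raise ValueError("codec %r has no known encode signature" % codec_name)

modes = file_modes_for("utf-8")
```

The tool decides from the tags alone, before any data is read, whether to
open its files in text or binary mode or to refuse the codec.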

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 24 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...   http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

2013-04-17: Released eGenix mx Base 3.2.6 ... http://egenix.com/go43

: Try our mxODBC.Connect Python Database Interface for free ! ::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Python-Dev] Why can't I encode/decode base64 without importing a module?

2013-04-24 Thread M.-A. Lemburg
On 23.04.2013 23:37, Nick Coghlan wrote:
> On 24 Apr 2013 01:25, "M.-A. Lemburg"  wrote:
>>
>> On 23.04.2013 17:15, Barry Warsaw wrote:
>>> On Apr 22, 2013, at 06:22 PM, Guido van Rossum wrote:
>>>
>>>>> You can ask the same question about all the other codecs.  (And that
>>>>> question has indeed been asked in the past.)
>>>>
>>>> Except for rot13. :-)
>>>
>>> The fact that you can do this instead *is* a bit odd. ;)
>>>
>>> from codecs import getencoder
>>> encoder = getencoder('rot-13')
>>> r13 = encoder('hello world')[0]
>>
>> Just as reminder: we have the general purpose
>> encode()/decode() functions in the codecs module:
>>
>> import codecs
>> r13 = codecs.encode('hello world', 'rot-13')
>>
>> These interface directly to the codec interfaces, without
>> enforcing type restrictions. The codec defines the supported
>> input and output types.
> 
> If we already have those, why aren't they documented? 

Good question. I added them in 2004 and probably just forgot
to add the documentation:

http://hg.python.org/cpython-fullhistory/rev/8ea2cb1ec598

I guess the doc-strings could be used as basis for the
documentation.
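
For readers who have not come across them, the two spellings discussed in
this thread look like this on Python 3. The codec is named "rot_13" here,
which is the name of the stdlib codec module; treat the variable names as
illustrative.

```python
import codecs

# General-purpose functions: the codec decides which types it accepts.
r13 = codecs.encode("hello world", "rot_13")
back = codecs.decode(r13, "rot_13")

# The longer route through the codec's stateless encoder function, which
# returns a (result, length consumed) tuple.
encoder = codecs.getencoder("rot_13")
via_encoder, consumed = encoder("hello world")
```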

-- 
Marc-Andre Lemburg
eGenix.com
