Re: [Python-Dev] Pre-PEP: The "bytes" object

2006-02-24 Thread Michael Hoffman
[Neil Schemenauer]
>> @classmethod
>> def fromhex(self, data):
>>     data = re.sub(r'\s+', '', data)
>>     return bytes(binascii.unhexlify(data))

[Jason Orendorff]
> If it's to be a classmethod, I guess that should be "return self(
> binascii.unhexlify(data))".

Am I the only one who finds the use of "self" on a classmethod to be
incredibly confusing? Can we please follow PEP 8 and use "cls"
instead?
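
For illustration, a minimal sketch of the classmethod with the PEP 8 spelling. HexBytes is a hypothetical stand-in for the proposed type (here a plain subclass of Python 3's bytes), not the pre-PEP's actual implementation. Using "cls" rather than a hard-coded "bytes" also means subclasses get instances of themselves back:

```python
import binascii
import re


class HexBytes(bytes):
    """Hypothetical stand-in for the proposed bytes type."""

    @classmethod
    def fromhex(cls, data):
        # Strip all whitespace so "de ad be ef" and "deadbeef" are equivalent.
        data = re.sub(r'\s+', '', data)
        # Construct via cls so that subclasses return their own type.
        return cls(binascii.unhexlify(data))


class TaggedBytes(HexBytes):
    """Illustrative subclass, used only to show what cls buys us."""


value = TaggedBytes.fromhex("de ad be ef")
```

Here value is a TaggedBytes instance; with "return bytes(...)" it would silently be a plain bytes object.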
-- 
Michael Hoffman <[EMAIL PROTECTED]>
European Bioinformatics Institute

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Stephen J. Turnbull
> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes:

Ron> We could call it transform or translate if needed.

You're still losing the directionality, which is my primary objection
to "recode".  The absence of directionality is precisely why "recode"
is used in that sense for i18n work.

There really isn't a good reason that I can see to use anything other
than the pair "encode" and "decode".  In monolingual environments,
once _all_ human-readable text (specifically including Python programs
and console I/O) is automatically mapped to a Python (unicode) string,
most programmers will never need to think about it as long as Python
(the project) very very strongly encourages that all Python programs
be written in UTF-8 if there's any chance the program will be reused
in a locale other than the one where it was written.  (Alternatively
you can depend on PEP 263 coding cookies.)  Then the user (or the
Python interpreter) just changes console and file I/O codecs to the
encoding in use in that locale, and everything just works.

So the remaining uses of "encode" and "decode" are for advanced users
and specialists: people using stuff like base64 or gzip, and those who
need to use unicode codecs explicitly.

I could be wrong about the possibility to get rid of explicit unicode
codec use in monolingual environments, but I hope that we can at least
try to achieve that.

>> Unlikely.  Errors like "A
>> string".encode("base64").encode("base64") are all too easy to
>> commit in practice.

Ron> Yes,... and wouldn't the above just result in a copy so it
Ron> wouldn't be an out right error.

No, you either get the following:

A string. -> QSBzdHJpbmcu -> UVNCemRISnBibWN1

or you might get an error if base64 is defined as bytes->unicode.
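
The first outcome can be reproduced with the standard base64 module. This is a sketch in Python 3 spelling, where base64 maps bytes to bytes; the variable names are illustrative:

```python
import base64

original = b"A string."
once = base64.b64encode(original)   # b'QSBzdHJpbmcu'
twice = base64.b64encode(once)      # b'UVNCemRISnBibWN1'

# Undoing the accidental double encoding takes two decodes, not one.
restored = base64.b64decode(base64.b64decode(twice))
```

Nothing fails at encode time; the mistake only shows up when a single decode yields base64 text instead of the original data.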

Ron> * Given that the string type gains a __codec__ attribute
Ron> to handle automatic decoding when needed.  (is there a reason
Ron> not to?)

Ron>     str(object[,codec][,error]) -> string coded with codec

Ron>     unicode(object[,error]) -> unicode

Ron>     bytes(object) -> bytes

str == unicode in Py3k, so this is a non-starter.  What do you want to
say?

Ron>  * a recode() method is used for transformations that
Ron> *do_not* change the current codec.

I'm not sure what you mean by the "current codec".  If it's attached
to an "encoded object", it should be the codec needed to decode the
object.  And it should be allowed to be a "codec stack".  So suppose
you start with a unicode object "obj".  Then

>>> bytes = bytes (obj, 'utf-8')    # implicit .encode()
>>> print bytes.codec
['utf-8']
>>> wire = bytes.encode ('base64')  # with apologies to Greg E.
>>> print wire.codec
['base64', 'utf-8']
>>> obj2 = wire.decode ('gzip')
CodecMatchException
>>> obj2 = wire.decode (wire.codec)
>>> print obj == obj2
True
>>> print obj2.codec
[]

or maybe None for the last.  I think this would be very nice as a
basis for improving the email module (for one), but I don't really
think it belongs in Python core.
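
A rough sketch of how such a "codec stack" could be tracked in present-day Python. Everything here (the Encoded class, its method names, the small codec table) is hypothetical and only illustrates the idea; it is not an existing or proposed API:

```python
import base64


class Encoded:
    """Hypothetical wrapper: bytes plus the codecs needed to decode them."""

    # Byte-to-byte transformations known to this sketch.
    _BYTE_CODECS = {'base64': (base64.b64encode, base64.b64decode)}

    def __init__(self, data, codec):
        self.data = data
        self.codec = list(codec)   # outermost codec first

    @classmethod
    def from_text(cls, text, encoding):
        # The innermost codec is always the character encoding.
        return cls(text.encode(encoding), [encoding])

    def encode(self, name):
        encoder, _ = self._BYTE_CODECS[name]
        return Encoded(encoder(self.data), [name] + self.codec)

    def decode_all(self):
        # Peel off the byte-level codecs, then decode the characters.
        *outer, innermost = self.codec
        data = self.data
        for name in outer:
            _, decoder = self._BYTE_CODECS[name]
            data = decoder(data)
        return data.decode(innermost)


obj = "A string."
wire = Encoded.from_text(obj, 'utf-8').encode('base64')
obj2 = wire.decode_all()
```

Because the stack travels with the data, the receiver never has to guess which decode to apply, which is the property that would help something like the email module.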

Ron> That may be why it wasn't done this way to start.  (?)

I suspect the real reason is that Marc-Andre had the generalized codec
in mind from Day 0, and your proposal only works with duck-typing if
codecs always have a well-defined signature with two different types
for the argument and return of the "constructor".

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
   Ask not how you can "do" free software business;
  ask what your business can "do for" free software.


Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes:

Greg> Stephen J. Turnbull wrote:

>> No, base64 isn't a wire protocol.  It's a family[...].

Greg> Yes, and it's up to the programmer to choose those code
Greg> units (i.e. pick an encoding for the characters) that will,
Greg> in fact, pass through the channel he is using without
Greg> corruption. I don't see how any of this is inconsistent with
Greg> what I've said.

It's not.  It just shows that there are other "correct" ways to think
about the issue.

>> Only if you do no transformations that will harm the
>> base64-encoding.  ...  It doesn't allow any of the usual
>> transformations on characters that might be applied globally to
>> a mail composition buffer, for example.

Greg> I don't understand that. Obviously if you rot13 your mail
Greg> message or turn it into pig latin or something, it's going
Greg> to mess up any base64 it might contain.  But that would be a
Greg> silly thing to do to a message containing base64.

What "message containing base64"?  "Any base64 in there?"  "Nope,
nobody here but us Unicode characters!"  I certainly hope that in Py3k
bytes objects will have neither ROT13 nor case-changing methods, but
str objects certainly will.  Why give up the safety of that
distinction?

Greg> Given any piece of text, there are things it makes sense to
Greg> do with it and things it doesn't, depending entirely on the
Greg> use to which the text will eventually be put.  I don't see
Greg> how base64 is any different in this regard.

If you're going to be binary about it, it's not different.  However
the kind of "text" for which Unicode was designed is normally produced
and consumed by people, who wll pt up w/ ll knds f nnsns.  Base64
decoders will not put up with the same kinds of nonsense that people
will.

You're basically assuming that the person who implements the code that
processes a Unicode string is the same person who implemented the code
that converts a binary object into base64 and inserts it into a
string.  I think that's a dangerous (and certainly invalid) assumption.

I know I've lost time and data to applications that make assumptions
like that.  In fact, that's why "MULE" is a four-letter word in Emacs
channels.

>> So then you bring it right back in with base64.  Now they need
>> to know about bytes<->unicode codecs.

Greg> No, they need to know about the characteristics of the
Greg> channel over which they're sending the data.

I meant it in a trivial sense: "How do you use a bytes<->unicode codec
properly without knowing that it's a bytes<->unicode codec?"

In most environments, it should be possible to hide bytes<->unicode
codecs almost all the time, and I think that's a very good thing.  I
don't think it's a good idea to gratuitously introduce wire protocols
as unicode codecs, even if a class of bit patterns which represent the
integer 65 are denoted "A" in various sources.  Practicality beats
purity (especially when you're talking about the purity of a pregnant
virgin).

Greg> It might be appropriate to use base64 followed by some
Greg> encoding, but the programmer needs to be aware of that and
Greg> choose the encoding wisely. It's not possible to shield him
Greg> from having to know about encodings in that situation, even
Greg> if the encoding is just ascii.

What do you think the email module does?  Assuming conforming MIME
messages and receivers capable of handling UTF-8, the user of the
email module does not need to know anything about any encodings at
all.  With a little more smarts, the email module could even make a
good choice of output encoding based on the _language_ of the text,
removing the restriction to UTF-8 on the output side, too.  With the
aid of file(1), it can make excellent guesses about attachments.

Sure, the email module programmer needs to know, but the email module
programmer needs to know an awful lot about codecs anyway, since mail
at that level is a binary channel, while users will be throwing a
mixed bag of binary and textual objects at it.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
   Ask not how you can "do" free software business;
  ask what your business can "do for" free software.


Re: [Python-Dev] getdefault(), the real replacement for setdefault()

2006-02-24 Thread Barry Warsaw
On Feb 23, 2006, at 4:41 PM, Thomas Wouters wrote:

> On Wed, Feb 22, 2006 at 10:29:08PM -0500, Barry Warsaw wrote:
>> d.getdefault('foo', list).append('bar')
>
>> Anyway, I don't think it's an either/or choice with Guido's subclass.
>> Instead I think they are different use cases.  I would add getdefault()
>> to the standard dict API, remove (eventually) setdefault(), and add
>> Guido's subclass in a separate module.  But I /wouldn't/ clutter the
>> built-in dict's API with on_missing().
>
> +1. This is a much closer match to my own use of setdefault than Guido's
> dict subtype. I'm +0 on the subtype, but I prefer the call-time decision on
> whether to fall back to a default or not.

Cool!  As your reward:

SF patch #1438113

https://sourceforge.net/tracker/index.php?func=detail&aid=1438113&group_id=5470&atid=305470
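
For readers without the patch at hand, a minimal sketch of what a getdefault() along these lines might look like. The subclass name and the exact semantics (call the factory only on a miss, store the result, return it) are assumptions for illustration and may differ from the patch itself:

```python
class GetDefaultDict(dict):
    """Hypothetical dict subclass; semantics assumed, not taken from the patch."""

    def getdefault(self, key, factory):
        # Return the existing value, or build one with factory(),
        # store it under key, and return it.
        try:
            return self[key]
        except KeyError:
            value = self[key] = factory()
            return value


d = GetDefaultDict()
d.getdefault('foo', list).append('bar')
d.getdefault('foo', list).append('baz')
```

Unlike setdefault('foo', []), the default is only constructed when the key is actually missing, and the choice of default is made at the call site rather than fixed on the dict.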

-Barry



Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-24 Thread James Y Knight
On Feb 24, 2006, at 1:54 AM, Greg Ewing wrote:
> Thomas Wouters wrote:
>> On Thu, Feb 23, 2006 at 05:25:30PM +1300, Greg Ewing wrote:
>>
>>> As an aside, is there any chance that this could be
>>> changed in 3.0? I.e. have the for-loop create a new
>>> binding for the loop variable on each iteration.
>>
>> You can't do that without introducing a whole new scope
>> for the body of the 'for' loop,
>
> There's no need for that. The new scope need only
> include the loop variable -- everything else could
> still refer to the function's main scope.

No, that would be insane. You get the exact same problem, now even  
more confusing:

l=[]
for x in range(10):
   y = x
   l.append(lambda: (x, y))

print l[0]()

With your suggestion, that would print (0, 9).

Unless python grows a distinction between creating a binding and  
assigning to one as most other languages have, this problem is here  
to stay.
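
For comparison, a short sketch of what the example does today and of the usual default-argument workaround, written in Python 3 spelling. The list names are illustrative:

```python
# Current behaviour: both x and y are looked up when the lambda is called,
# so every closure sees the values left over from the final iteration.
late = []
for x in range(10):
    y = x
    late.append(lambda: (x, y))
late_result = late[0]()        # (9, 9)

# Workaround: default arguments are evaluated when the lambda is created,
# which gives each closure its own snapshot of x and y.
early = []
for x in range(10):
    y = x
    early.append(lambda x=x, y=y: (x, y))
early_result = early[0]()      # (0, 0)
```

Rebinding only the loop variable would fix x but not y, which is where the mixed (0, 9) result comes from.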

James


Re: [Python-Dev] defaultdict and on_missing()

2006-02-24 Thread Raymond Hettinger
> Michael Chermside wrote:
>> The next() method of iterators was an interesting
>> object lesson. ... Since it was sometimes invoked by name
>> and sometimes by special mechanism, the choice was to use the
>> unadorned name, but later experience showed that it would have been
>> better the other way.

[Greg]
> Any thoughts about fixing this in 3.0?

IMO, it isn't broken. It was an intentional divergence from naming conventions. 
The reasons for the divergence haven't changed.  Code that uses next() is more 
understandable, friendly, and readable without the walls of underscores.


Raymond 




Re: [Python-Dev] defaultdict and on_missing()

2006-02-24 Thread Alex Martelli
On 2/24/06, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> > Michael Chermside wrote:
> >> The next() method of iterators was an interesting
> >> object lesson. ... Since it was sometimes invoked by name
> >> and sometimes by special mechanism, the choice was to use the
> >> unadorned name, but later experience showed that it would have been
> >> better the other way.
>
> [Greg]
> > Any thoughts about fixing this in 3.0?
>
> IMO, it isn't broken. It was an intentional divergence from naming 
> conventions.
> The reasons for the divergence haven't changed.  Code that uses next() is more
> understandable, friendly, and readable without the walls of underscores.

Wouldn't, say, next(foo) [[with a hypothetical builtin 'next'
internally calling foo.__next__(), just like builtin 'len' internally
calls foo.__len__()]] be just as friendly etc? No biggie either way,
but that would seem to be more aligned with Python's usual approach.
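
A minimal sketch of that arrangement. The function is named next_ here to make clear it is a hypothetical stand-in for the builtin being discussed, and Countdown is an illustrative iterator:

```python
def next_(iterator):
    """Hypothetical builtin: dispatch to __next__ the way len() uses __len__."""
    return type(iterator).__next__(iterator)


class Countdown:
    """Illustrative iterator counting down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


counter = Countdown(2)
first = next_(counter)    # 2
second = next_(counter)   # 1
```

The underscores appear once, in the class that defines the iterator; calling code only ever writes next_(counter).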


Alex


[Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Neal Norwitz
Martin and I were talking about dropping support for older versions of
Windows (of the non-NT flavor).  We both thought that it was
reasonable to stop supporting Win9x (including WinME) in Python 2.6. 
I updated PEP 11 to reflect this.

The Python 2.5 installer will present a warning message on the systems
which will not be supported in Python 2.6.

n


Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Georg Brandl
Neal Norwitz wrote:
> Martin and I were talking about dropping support for older versions of
> Windows (of the non-NT flavor).  We both thought that it was
> reasonable to stop supporting Win9x (including WinME) in Python 2.6. 
> I updated PEP 11 to reflect this.
> 
> The Python 2.5 installer will present a warning message on the systems
> which will not be supported in Python 2.6.

Hey, someone even wanted to continue supporting DOS...

Georg



Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Michael Foord
Georg Brandl wrote:
> Neal Norwitz wrote:
>   
>> Martin and I were talking about dropping support for older versions of
>> Windows (of the non-NT flavor).  We both thought that it was
>> reasonable to stop supporting Win9x (including WinME) in Python 2.6. 
>> I updated PEP 11 to reflect this.
>>
>> The Python 2.5 installer will present a warning message on the systems
>> which will not be supported in Python 2.6.
>> 
>
> Hey, someone even wanted to continue supporting DOS...
>
>   
A lot of people are still using Windows 98.  But I guess if no one is 
volunteering to maintain the code...

Michael Foord

> Georg


Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Aahz
On Fri, Feb 24, 2006, Michael Foord wrote:
> Georg Brandl wrote:
>> Neal Norwitz wrote:
>>   
>>> Martin and I were talking about dropping support for older versions of
>>> Windows (of the non-NT flavor).  We both thought that it was
>>> reasonable to stop supporting Win9x (including WinME) in Python 2.6. 
>>> I updated PEP 11 to reflect this.
>>>
>>> The Python 2.5 installer will present a warning message on the systems
>>> which will not be supported in Python 2.6.
>>
>> Hey, someone even wanted to continue supporting DOS...
>   
> A lot of people are still using Windows 98.  But I guess if no one is 
> volunteering to maintain the code...

DOS has some actual utility for low-grade devices and is overall a
simpler platform to deliver code for.  At the standard 18-month release
cycle, it will be beginning of 2008 for the release of 2.6, which is ten
years after Win98.
-- 
Aahz ([EMAIL PROTECTED])   <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis


Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Alexander Schremmer
On Fri, 24 Feb 2006 10:29:27 -0800, Aahz wrote:

> DOS has some actual utility for low-grade devices and is overall a
> simpler platform to deliver code for.  At the standard 18-month release
> cycle, it will be beginning of 2008 for the release of 2.6, which is ten
> years after Win98.

The last Windows release of that branch was Windows ME, in September 2000,
i.e. you have to wait till 2010 in order to be ten years after the last
legacy OS release.

Kind regards,
Alexander



Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Guido van Rossum
On 2/24/06, Michael Foord <[EMAIL PROTECTED]> wrote:
> A lot of people are still using Windows 98.  But I guess if no one is
> volunteering to maintain the code...

Agreed. If they're so keen on using an antiquated OS, perhaps they
would be perfectly happy using a matching Python version... Somehow I
doubt this is going to be a big deal for anyone affected.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Trent Mick
[Neal Norwitz wrote]
> Martin and I were talking about dropping support for older versions of
> Windows (of the non-NT flavor).  We both thought that it was
> reasonable to stop supporting Win9x (including WinME) in Python 2.6. 
> I updated PEP 11 to reflect this.

Are there specific code areas in mind that would be ripped out for this
or is this mainly to avoid having to test on and ensure new code is
compatible with?

Trent

-- 
Trent Mick
[EMAIL PROTECTED]


Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Facundo Batista
2006/2/24, Neal Norwitz <[EMAIL PROTECTED]>:

> Martin and I were talking about dropping support for older versions of
> Windows (of the non-NT flavor).  We both thought that it was
> reasonable to stop supporting Win9x (including WinME) in Python 2.6.

+1

.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-24 Thread Jeremy Hylton
On 2/24/06, James Y Knight <[EMAIL PROTECTED]> wrote:
> On Feb 24, 2006, at 1:54 AM, Greg Ewing wrote:
> > Thomas Wouters wrote:
> >> On Thu, Feb 23, 2006 at 05:25:30PM +1300, Greg Ewing wrote:
> >>
> >>> As an aside, is there any chance that this could be
> >>> changed in 3.0? I.e. have the for-loop create a new
> >>> binding for the loop variable on each iteration.
> >>
> >> You can't do that without introducing a whole new scope
> >> for the body of the 'for' loop,
> >
> > There's no need for that. The new scope need only
> > include the loop variable -- everything else could
> > still refer to the function's main scope.
>
> No, that would be insane. You get the exact same problem, now even
> more confusing:
>
> l=[]
> for x in range(10):
>    y = x
>    l.append(lambda: (x, y))
>
> print l[0]()
>
> With your suggestion, that would print (0, 9).
>
> Unless python grows a distinction between creating a binding and
> assigning to one as most other languages have, this problem is here
> to stay.

The more practical complaint is that list comprehensions use the same
namespace as the block that contains them.  It's much easier to miss
an assignment to, say, i in a list comprehension than it is in a
separate statement in the body of a for loop.  Since list comps are
expressions, the only variable at issue is the index variable.  It
would be simple to fix by renaming, but I suspect we're stuck with the
current behavior for backwards compatibility reasons.

Jeremy


Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Ron Adam

* The following reply is a longer explanation than I intended of why 
codings like 'rot' aren't the same thing as pure unicode codecs (and how 
they differ), and why they probably should be treated differently.
If you already understand that, then I suggest skipping this.  But if 
you like detailed logical analysis, it might be of some interest even if 
it's reviewing the obvious to those who already know.

(And hopefully I didn't make any really obvious errors myself.)


Stephen J. Turnbull wrote:
>> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes:
> 
> Ron> We could call it transform or translate if needed.
> 
> You're still losing the directionality, which is my primary objection
> to "recode".  The absence of directionality is precisely why "recode"
> is used in that sense for i18n work.

I think you're not understanding what I suggested.  It might help if we 
could agree on some points and then go from there.

So, let's consider a "codec" and a "coding" as being two different things, 
where a codec is a subset of unicode characters expressed in 
a native format, and a coding is *not* a subset of the unicode 
character set but an _operation_ performed on text.  So you would have 
the following properties.

codec ->  text is always in *one_codec* at any time.

coding ->  operation performed on text.

Let's add a special default coding called 'none' to represent a 
do-nothing coding. (figuratively, for explanation purposes)

'none' -> return the input as is, or the uncoded text


Given the above relationships we have the following possible 
transformations.

   1. codec to like codec:   'ascii' to 'ascii'
   2. codec to unlike codec:   'ascii' to 'latin1'

And we have coding relationships of:

   a. coding to like coding  # Unchanged, do nothing
   b. coding to unlike coding


Then we can express all the possible combinations as...

[1.a, 1.b, 2.a, 2.b]


1.a -> coding in codec to like coding in like codec:

'none' in 'ascii' to 'none' in 'ascii'

1.b -> coding in codec to diff coding in like codec:

'none' in 'ascii' to 'base64' in 'ascii'

2.a -> coding in codec to same coding in diff codec:

'none' in 'ascii' to 'none' in 'latin1'

2.b -> coding in codec to diff coding in diff codec:

'none' in 'latin1' to 'base64' in 'ascii'

This last one is a problem, as some codecs combine coding with character 
set encoding and return text in a different encoding than they received. 
The line is also blurred between types and encodings.  Is unicode an 
encoding?  Will bytes also be an encoding?


Using the above combinations:

(1.a) is just creating a new copy of an object.

s = str(s)


(1.b) is recoding an object; it returns a copy of the object in the same 
encoding.

s = s.encode('hex-codec')  # ascii str -> ascii str coded in hex
s = s.decode('hex-codec')  # ascii str coded in hex -> ascii str

* these are really two different operations. And encoding repeatedly 
results in nested codings.  Codecs (as a pure subset of unicode) don't 
have that property.

* the hex-codec also fits the 2.b pattern below if the source string is 
of a different type than ascii. (or the default string?)
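
The nesting property can be seen directly. This is a sketch in Python 3 spelling, using binascii.hexlify in place of the 'hex-codec' string method shown above; the variable names are illustrative:

```python
import binascii

raw = b"abc"

# A coding nests: each application wraps the previous result.
once = binascii.hexlify(raw)     # b'616263'
twice = binascii.hexlify(once)   # b'363136323633'

# A character-set codec does not nest: text that fits in ascii has the
# same bytes whether it is encoded as ascii or as latin1.
as_ascii = "abc".encode('ascii')
as_latin1 = as_ascii.decode('ascii').encode('latin-1')
```

Getting raw back from twice takes exactly two unhexlify calls, whereas as_ascii and as_latin1 are already the same bytes.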


(2.a) creates a copy encoded in a new codec.

s = s.encode('latin1')

* I believe string constructors should have an encoding argument for use 
with unicode strings.

s = str(u, 'latin1')   # This would match the bytes constructor.


(2.b) are combinations of the above.

   s = u.encode('base64')
  # unicode to ascii string as base64 coded characters

   u = unicode(s.decode('base64'))
  # ascii string coded in base64 to unicode characters

or

   >>> u = unicode(s, 'base64')
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: decoder did not return an unicode object (type=str)

Ooops...  ;)

So is coding the same as a codec?  I think they have different 
properties and should be treated differently except when the 
practicality over purity rule is needed.  And in those cases maybe the 
names could clearly state the result.

u.decode('base64ascii')  # name indicates coding to codec


> A string. -> QSBzdHJpbmcu -> UVNCemRISnBibWN1

Looks like the underlying sequence is:

  native string -> unicode -> unicode coded base64 -> coded ascii str

And decode operation would be...

  coded ascii str -> unicode coded base64 -> unicode -> ascii str

Except it may combine some of these steps to speed it up.

Since it's a hybrid codec that includes a coding operation, we have to 
treat it as a codec.


> Ron> * Given that the string type gains a __codec__ attribute
> Ron> to handle automatic decoding when needed.  (is there a reason
> Ron> not to?)
> 
> Ron>str(object[,codec][,error]) -> string coded with codec
> 
> Ron>unicode(object[,error]) -> unicode
> 
> Ron>bytes(object) -> bytes
> 
> str == unicode in Py3k, so this is a non-starter.  What do you want

Re: [Python-Dev] problem with genexp

2006-02-24 Thread Neal Norwitz
On 2/20/06, Jiwon Seo <[EMAIL PROTECTED]> wrote:
> Regarding this Grammar change;  (last October)
>  from   argument: [test '=' ] test [gen_for]
>  to  argument: test [gen_for] | test '=' test ['(' gen_for ')']
>
> - to raise error for "bar(a = i for i in range(10)) )"
>
> I think we should change it to
>  argument: test [gen_for] | test '=' test
>
> instead of
>  argument: test [gen_for] | test '=' test ['(' gen_for ')']
>
> that is, without ['(' gen_for ')'] . We don't need that extra term,
> because "test" itself includes generator expressions - with all those
> parentheses.

Works for me, committed.
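
A quick way to check the resulting behaviour from Python itself, sketched against a current Python 3 interpreter. The helper name compiles is illustrative, and bar is never called, only parsed:

```python
def compiles(source):
    """Return True if source parses as an expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# A bare generator expression as a keyword argument value is rejected...
bare_keyword = compiles("bar(a = i for i in range(10))")

# ...while the parenthesized form is an ordinary "test" and is accepted,
# as is a bare generator expression used as the sole positional argument.
parenthesized = compiles("bar(a = (i for i in range(10)))")
sole_positional = compiles("bar(i for i in range(10))")
```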

n


Re: [Python-Dev] Pre-PEP: The "bytes" object

2006-02-24 Thread Neil Schemenauer
Michael Hoffman <[EMAIL PROTECTED]> wrote:
> Am I the only one who finds the use of "self" on a classmethod to be
> incredibly confusing? Can we please follow PEP 8 and use "cls"
> instead?

Sorry, using "self" was an oversight.  It should be "cls", IMO.

  Neil



Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Greg Ewing
Stephen J. Turnbull wrote:

> the kind of "text" for which Unicode was designed is normally produced
> and consumed by people, who wll pt up w/ ll knds f nnsns.  Base64
> decoders will not put up with the same kinds of nonsense that people
> will.

The Python compiler won't put up with that sort of
nonsense either. Would you consider that makes Python
source code binary data rather than text, and that
it's inappropriate to represent it using a unicode
string?

> You're basically assuming that the person who implements the code that
> processes a Unicode string is the same person who implemented the code
> that converts a binary object into base64 and inserts it into a
> string.

No, I'm assuming the user of base64 knows the
characteristics of the channel he's using. You
can only use base64 if you know the channel
promises not to munge the particular characters
that base64 uses. If you don't know that, you
shouldn't be trying to send base64 through that
channel.

> In most environments, it should be possible to hide bytes<->unicode
> codecs almost all the time,

But it *is* hidden in the situation I'm talking
about, because all the Unicode encoding/decoding
takes place inside the implementation of the
text channel, which I'm taking as a given.

> I don't think it's a good idea to gratuitously introduce
> wire protocols as unicode codecs,

I am *not* saying that base64 is a unicode codec!
If that's what you thought I was saying, it's no
wonder we're confusing each other.

It's just a transformation from bytes to
text. I'm only calling it unicode because all
text will be unicode in Py3k. In py2.x it could
just as well be a str -- but a str interpreted
as text, not binary.

> What do you think the email module does?
> Assuming conforming MIME messages

But I'm not assuming mime in the first place. If I
have a mail interface that will accept chunks of
binary data and encode them as a mime message for
me, then I don't need to use base64 in the first
place.

The only time I need to use something like base64
is when I have something that will only accept
text. In Py3k, "accepts text" is going to mean
"takes a character string as input", where
"character string" is a distinct type from
"binary data". So having base64 produce anything
other than a character string would be awkward
and inconvenient.
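
In Python 3 as eventually released, base64.b64encode returns bytes, so handing the result to a text-only interface needs one extra step. A minimal sketch of that step (the helper names are hypothetical):

```python
import base64


def to_base64_text(data: bytes) -> str:
    """Encode binary data as a base64 character string."""
    # base64 output uses only ASCII characters, so this decode cannot fail.
    return base64.b64encode(data).decode('ascii')


def from_base64_text(text: str) -> bytes:
    """Recover the binary data from a base64 character string."""
    return base64.b64decode(text.encode('ascii'))


payload = b"\x00\xff"
text = to_base64_text(payload)        # 'AP8='
round_trip = from_base64_text(text)
```

The character string can then be embedded in anything that accepts text, and the channel's own encoding step stays out of sight.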

I phrased that paragraph carefully to avoid using
the word "unicode" anywhere. Does that make it
clearer what I'm getting at?

--
Greg


Re: [Python-Dev] defaultdict and on_missing()

2006-02-24 Thread Greg Ewing
Raymond Hettinger wrote:
> Code that 
> uses next() is more understandable, friendly, and readable without the 
> walls of underscores.

There wouldn't be any walls of underscores, because

   y = x.next()

would become

   y = next(x)

The only time you would need to write underscores is
when defining a __next__ method. That would be no worse
than defining an __init__ or any other special method,
and has the advantage that it clearly marks the method
as being special.

--
Greg


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-24 Thread Greg Ewing
Jeremy Hylton wrote:

> The more practical complaint is that list comprehensions use the same
> namespace as the block that contains them.
> ... but I suspect we're stuck with the
> current behavior for backwards compatibility reasons.

There will be no backwards compatibility in 3.0,
so perhaps this could be fixed then?

Greg


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-24 Thread Guido van Rossum
On 2/24/06, Greg Ewing <[EMAIL PROTECTED]> wrote:
> Jeremy Hylton wrote:
> > The more practical complaint is that list comprehensions use the same
> > namespace as the block that contains them.
> > ... but I suspect we're stuck with the
> > current behavior for backwards compatibility reasons.
>
> There will be no backwards compatibility in 3.0,
> so perhaps this could be fixed then?

Yes that's the plan. [f(x) for x in S] will be syntactic sugar for
list(f(x) for x in S) which already avoids the scope problem.
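
A small sketch of the resulting behavior, assuming Python 3 semantics. The function `f`, the list `S`, and the outer value of `x` are placeholders chosen for illustration.

```python
def f(x):
    return x * 2

S = [1, 2, 3]
x = "outer"   # a name in the enclosing block

# Under Python 3 both forms run the loop in their own scope,
# so the loop variable does not overwrite the outer `x`.
via_comprehension = [f(x) for x in S]
via_generator = list(f(x) for x in S)
```

Under Python 2 the list comprehension would leave `x` bound to the last element of `S`, which is the scope problem referred to above.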

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Pre-PEP: The "bytes" object

2006-02-24 Thread Ron Adam
Neil Schemenauer wrote:
> Michael Hoffman <[EMAIL PROTECTED]> wrote:
>> Am I the only one who finds the use of "self" on a classmethod to be
>> incredibly confusing? Can we please follow PEP 8 and use "cls"
>> instead?
> 
> Sorry, using "self" was an oversight.  It should be "cls", IMO.
> 
>   Neil

IMO2

Why was it decided that the unicode encoding argument should be ignored 
if the first argument is a string?  Wouldn't an exception be better 
rather than give the impression it does something when it doesn't?

Ron



Re: [Python-Dev] Pre-PEP: The "bytes" object

2006-02-24 Thread Neil Schemenauer
Ron Adam <[EMAIL PROTECTED]> wrote:
> Why was it decided that the unicode encoding argument should be ignored 
> if the first argument is a string?  Wouldn't an exception be better 
> rather than give the impression it does something when it doesn't?

From the PEP:

There is no sane meaning that the encoding can have in that
case.  str objects *are* byte arrays and they know nothing about
the encoding of character data they contain.  We need to assume
that the programmer has provided str object that already uses
the desired encoding.

Raising an exception would be a valid option.  However, passing the
string through unchanged makes the transition from str to bytes
easier.

  Neil



Re: [Python-Dev] Dropping support for Win9x in 2.6

2006-02-24 Thread Tim Peters
[Neal Norwitz]
>> Martin and I were talking about dropping support for older versions of
>> Windows (of the non-NT flavor).  We both thought that it was
>> reasonable to stop supporting Win9x (including WinME) in Python 2.6.
>> I updated PEP 11 to reflect this.

It's OK by me, but I have the same question as Trent:

[Trent Mick]
> Are there specific code areas in mind that would be ripped out for this
> or is this mainly to avoid having to test on and ensure new code is
> compatible with?

Seems unlikely it's the latter, since I'm not sure any Python developer
tests on a pre-NT Windows anymore anyway.  Maybe Raymond is still
running WinME?

About the former, I don't see much potential.  The ugliest 9x-ism is
w9xpopen.exe, but comments in the places it's used say it's needed on
NT too if the user is running command.com.  If so, it stays.

There's a bit of excruciating Win9x-specific code in winsound.c that
could go away, and I suppose we could assume that Unicode filenames
are always supported on Windows.

Maybe best is that if someone reports a Win9x-specific bug against
2.6+, we could close it as Won't-Fix at once instead of letting it sit
around ignored for years :-)


Re: [Python-Dev] Pre-PEP: The "bytes" object

2006-02-24 Thread Ron Adam
Neil Schemenauer wrote:
> Ron Adam <[EMAIL PROTECTED]> wrote:
>> Why was it decided that the unicode encoding argument should be ignored 
>> if the first argument is a string?  Wouldn't an exception be better 
>> rather than give the impression it does something when it doesn't?
> 
> From the PEP:
> 
> There is no sane meaning that the encoding can have in that
> case.  str objects *are* byte arrays and they know nothing about
> the encoding of character data they contain.  We need to assume
> that the programmer has provided str object that already uses
> the desired encoding.
> 
> Raising an exception would be a valid option.  However, passing the
> string through unchanged makes the transition from str to bytes
> easier.
> 
>   Neil

I guess I'm concerned that if the string isn't already in the specified 
encoding it could pass through without complaint and not be encoded as 
expected.

>>> b.bytes(u'abc', 'hex-codec')
bytes([54, 49, 54, 50, 54, 51])

>>> b.bytes('abc', 'hex-codec')
bytes([97, 98, 99])  # not hex

If this was in a function I would need to do a check of some sort 
anyways or cast to unicode beforehand, or encode beforehand.  Which 
negates the advantage of having the codec argument in bytes unfortunately.

def hexabyte(s):
    s = unicode(s)
    return bytes(s, 'hex-codec')

or

def hexabyte(s):
    s = s.encode('hex-codec')
    return bytes(s)

It seems to me if you are specifying a codec for bytes, then you will 
not be expecting to get an already encoded string, and if you do, it may 
not be in the codec you want since you are probably not specifying the 
default codec.
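
A rough sketch of the exception-raising alternative, written against Python 3 types (`str` for text, `bytes` for already-encoded data). The helper name `strict_bytes` is hypothetical and not part of the PEP.

```python
def strict_bytes(value, encoding=None):
    """Hypothetical helper: build bytes, refusing an encoding it would ignore."""
    if isinstance(value, str):
        # Text must be encoded, so an encoding is required.
        if encoding is None:
            raise TypeError("text input requires an encoding")
        return value.encode(encoding)
    # Anything else is treated as already-encoded data; an encoding
    # argument here would be silently ignored, so reject it instead.
    if encoding is not None:
        raise TypeError("encoding given but input is already encoded")
    return bytes(value)


from_text = strict_bytes("abc", "ascii")
from_data = strict_bytes(b"abc")
```

With this shape, the caller finds out immediately when a codec is passed alongside data that cannot use it. Python 3's built-in `bytes` constructor behaves along similar lines: `bytes(b'abc', 'ascii')` raises `TypeError`.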

Ron

