Re: [Python-Dev] PEP 343: Context managers a superset of decorators?

2006-02-13 Thread Eric Sumner
On 2/12/06, Josiah Carlson <[EMAIL PROTECTED]> wrote:
[paragraphs swapped]
> The desire for context managers to have access to its enclosing scope is
> another discussion entirely, though it may do so without express
> permission via stack frame manipulation.

My main point was that, with relatively small changes to 343, it can
replace the decorator syntax with a more general solution that matches
the style of the rest of the language better.  The main change (access
to scopes) makes this possible, and the secondary change (altering the
block syntax) mitigates (but does not remove) the syntax difficulties
presented.  I realize that I made an assumption that may not be valid;
namely, that a new scope is generated by the 'with' statement.  Stack
frame manipulation would not be able to provide access to a scope that
no longer exists.

> Re-read the decorator PEP: http://www.python.org/peps/pep-0318.html to
> understand why both of these options (indentation and prefix notation)
> are undesirable for a general decorator syntax.

With the changes that I propose, both syntaxes are equivalent and can
be used interchangeably.  While each of them has problems, I believe
that in situations where one has a problem, the other usually does
not.

From this point on, I provide a point-by-point reaction to the most
applicable syntax objections listed in PEP 318.  If you're not
interested in this, bail out now.

In the PEP, there is no discussion of a prefix notation in which the
decorator is placed before the 'def' on the same line.  The most
similar example has the decorator between the 'def' and the parameter
list.  It mentions two problems:

> There are a couple of objections to this form. The first is that it breaks
> easily 'greppability' of the source -- you can no longer search for 'def foo('
> and find the definition of the function. The second, more serious, objection
> is that in the case of multiple decorators, the syntax would be extremely
> unwieldy.

The first does not apply, as this syntax does not separate 'def' and
the function name.  The second is still a valid concern, but the
decorator list can easily be broken across multiple lines.

The main objection to an indented syntax seems to be that it requires
decorated functions to be indented an extra level.  For simple
decorators, the compacted syntax could be used to sidestep this
problem.  The main complaints about the J2 proposal don't quite apply:
the code in the block is a sequence of statements and 'with' is
already going to be added to the language as a compound statement.

  -- Eric
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 343: Context managers a superset of decorators?

2006-02-13 Thread Nick Coghlan
Eric Sumner wrote:
> I realize that I made an assumption that may not be valid;
> namely, that a new scope is generated by the 'with' statement.

The with statement uses the existing scope - it's just a way of factoring out 
try/finally boilerplate code. No more, and, in fact, fractionally less (the 
'less' being the fact that just like any other Python function, you only get 
to supply one value to be bound to a name in the invoking scope).

Trying to link this with the function definition pipelining provided by 
decorators seems like a bit of a stretch. It certainly isn't a superset of the 
decorator functionality - if you want a statement that manipulates the 
namespace it contains, that's what class statements are for :)
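
A small sketch of this point in present-day Python. It assumes the contextlib module, which postdates this thread; `managed`, `log`, and `demo` are hypothetical names chosen for illustration. It shows that the block runs in the enclosing scope and that the manager supplies exactly one value for the `as` name.

```python
from contextlib import contextmanager

log = []

@contextmanager
def managed(name):
    # Code before the yield plays the role of the setup before 'try'.
    log.append("enter " + name)
    try:
        # Exactly one value is handed back to be bound by 'as'.
        yield name.upper()
    finally:
        # Runs like a 'finally' clause, even if the block raises.
        log.append("exit " + name)

def demo():
    with managed("res") as handle:
        inner = handle + "!"   # bound in demo()'s own scope, not a new one
    # Both names are still visible after the block has ended.
    return inner, handle

result = demo()
```

Because no new scope is created, nothing needs to be "exported" when the block exits.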

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


[Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Georg Brandl
The above docs are from August 2005 while docs.python.org/dev is current.
Shouldn't the old docs be removed?


Georg



Re: [Python-Dev] PEP 343: Context managers a superset of decorators?

2006-02-13 Thread Eric Sumner
On 2/13/06, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Eric Sumner wrote:
> > I realize that I made an assumption that may not be valid;
> > namely, that a new scope is generated by the 'with' statement.
>
> The with statement uses the existing scope - it's just a way of factoring out
> try/finally boilerplate code. No more, and, in fact, fractionally less (the
> 'less' being the fact that just like any other Python function, you only get
> to supply one value to be bound to a name in the invoking scope).

Ok.  These changes are more substantial than I thought, then.

> Trying to link this with the function definition pipelining provided by
> decorators seems like a bit of a stretch. It certainly isn't a superset of the
> decorator functionality - if you want a statement that manipulates the
> namespace it contains, that's what class statements are for :)

Several examples of how the 'with' block would be used involve
transactions which are either rolled back or confirmed.  All of these
use the transaction capabilities of some external database.  With
separate scopes, the '__exit__' function can decide which names to
export outwards to the containing scope.  Unlike class statements, the
contained scope is used temporarily and can be discarded when the
'with' statement is completed.  This would allow a context manager to
provide a local transaction handler.

To me, it is not much of a leap from copying data between scopes to
modifying it as it goes through, which is exactly what decorators do. 
The syntax that this provides for decorators seems reasonable enough
(to me) to make the '@' syntax redundant.  However, this is a larger
change than I thought, and maybe not worth the effort to implement.
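
For reference, the pipelining that the '@' syntax performs is a rebinding of the function name once the 'def' has completed. A minimal sketch, where `traced` is a made-up decorator used only for illustration:

```python
def traced(func):
    """Hypothetical decorator: wrap func and record the arguments of each call."""
    def wrapper(*args, **kwargs):
        wrapper.calls.append(args)
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@traced
def add(a, b):
    return a + b

def mul(a, b):
    return a * b
mul = traced(mul)   # the same rebinding, written out by hand

add(1, 2)
mul(2, 3)
```

Any block-based replacement would have to reproduce this "define, transform, rebind" sequence, which is why it needs access to the names created inside the block.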

  -- Eric


Re: [Python-Dev] nice()

2006-02-13 Thread Smith
| From: Josiah Carlson <[EMAIL PROTECTED]>
| "Alan Gauld" <[EMAIL PROTECTED]> wrote:
|| However I do dislike the name nice() - there is already a nice() in
|| the 
|| os module with a fairly well understood function. 

perhaps trim(), nearly(), about(), defer_the_pain_of() :-) I've waited to think 
of names until after writing this. The reason for the last name option may 
become apparent after reading the rest of this post.

|| But I'm sure some
|| time with a thesaurus can overcome that single mild objection. :-)
| 
| Presumably it would be located somewhere like the math module.

I would like to see it as accessible as round, int, float, and repr. I really 
think a round-from-the-left is a nice tool to have. It's obviously very easy to 
build your own if you know what tools to use. Not everyone is going to be 
reading the python-dev or similar lists, however, and so having it handy would 
be nice.

| From: Greg Ewing <[EMAIL PROTECTED]>
| Smith wrote:
| 
|| When teaching some programming to total newbies, a common
|| frustration is how to explain why a==b is False when a and b are
|| floats computed by different routes which ``should'' give the
|| same results (if arithmetic had infinite precision).
| 
| This is just a special case of the problems inherent
| in the use of floating point. As with all of these,
| papering over this particular one isn't going to help
| in the long run -- another one will pop up in due
| course.
| 
| Seems to me it's better to educate said newbies not
| to use algorithms that require comparing floats for
| equality at all. 

I think that having a helper function like nice() is a middle ground solution 
to the problem, falling short of using only decimal or rational values for 
numbers and doing better than requiring a test of error between floating values 
that should be equal but aren't because of alternate methods of computation. 
Just like the argument for having true division being the default behavior for 
the computational environment, it seems a little unfriendly to expect the more 
casual user to have to worry that 3*0.1 is not the same as 3/10.0. I know--they 
really are different, and one should (eventually) understand why, but does 
anyone really want the warts of floating point representation to be popping up 
in their work if they could be avoided, or at least easily circumvented?

I know you know why the following numbers show up as not equal, but this would 
be an example of the pain in working with a reasonably simple exercise of, say, 
computing the bin boundaries for a histogram where bins are a width of 0.1: 

###
>>> for i in range(20):
...  if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)):
...   print i,repr(i*.1),repr(i/10.),i*.1,i/10.
... 
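3 0.30000000000000004 0.29999999999999999 0.3 0.3
6 0.60000000000000009 0.59999999999999998 0.6 0.6
7 0.70000000000000007 0.69999999999999996 0.7 0.7
12 1.2000000000000002 1.2 1.2 1.2
14 1.4000000000000001 1.3999999999999999 1.4 1.4
17 1.7000000000000002 1.7 1.7 1.7
19 1.9000000000000001 1.8999999999999999 1.9 1.9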
###

For, say, garden variety numbers that aren't full of garbage digits resulting 
from fp computation, the boundaries computed as 0.1*i are not going to agree 
with such simple numbers as 1.4 and 0.7.
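
The same experiment can be sketched in current Python. The definition of nice() is not given in this message, so the version here is an assumption: it rounds to a fixed number of significant ("leading") digits, with a default of 12 picked only for illustration.

```python
def nice(x, digits=12):
    """Hypothetical stand-in for nice(): round x to `digits` significant digits."""
    return float("%.*g" % (digits, x))

mismatches = []
for i in range(20):
    raw_equal = (i * 0.1 == i / 10.0)
    nice_equal = (nice(i * 0.1) == nice(i / 10.0))
    # Report the cases where plain == and the rounded comparison disagree.
    if raw_equal != nice_equal:
        mismatches.append(i)
        print(i, repr(i * 0.1), repr(i / 10.0))
```

Note that repr() in current Python prints the shortest string that round-trips, so the printed digits differ from the 17-digit forms shown in the session above even though the underlying values are the same.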

Would anyone (and I truly don't know the answer) really mind if all floating 
point values were filtered through whatever lies behind the str() manipulation 
of floats before the computation was made? I'm not saying that strings would be 
compared, but that float(str(x)) would be compared to float(str(y)) if x were 
being compared to y as in x<=y. If this could be done, wouldn't a lot of grief 
just go away and not require the use of decimal or rational types for many 
users? 
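
A sketch of what such a filtered comparison would look like. It assumes the Python 2 behaviour of str() on floats, which kept 12 significant digits; in current Python str() round-trips exactly, so the rounding has to be spelled out. The helper names are hypothetical, and math.isclose() (added to the standard library long after this thread) is shown only as a point of comparison.

```python
import math

def py2_str_round(x):
    """Mimic Python 2's str(float): keep 12 significant digits."""
    return float("%.12g" % x)

def le_filtered(x, y):
    """x <= y after both sides have been passed through the str()-style rounding."""
    return py2_str_round(x) <= py2_str_round(y)

a = 3 * 0.1     # slightly above 0.3
b = 3 / 10.0    # the double closest to 0.3

plain = a <= b                              # unfiltered comparison
filtered = le_filtered(a, b)                # comparison proposed in the post
close = math.isclose(a, b, rel_tol=1e-12)   # relative-tolerance test
```

The filter moves the problem rather than removing it: two values that straddle a rounding boundary at the 12th digit will still compare unequal.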

I understand that the above really is just a patch over the problem, but I'm 
wondering if it moves the problem far enough away that most users wouldn't have 
to worry about it. Here, for example, are the first values where the running 
sum doesn't equal the straight multiple of some step size:

###
>>> def go(x,n=1000):
...  s=0;i=0
...  while snice(i*x):
...return i,s,i*x,`s`,`i*x`
... 
>>> for i in range(1,100):
...  print i, go(i/1000.)
...  print
...  
1 (60372 60.371999 60.372 60.3719994 60.372)

2 (49645 99.28 99.29 99.2849998 99.296)
###
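
The body of go() above was damaged in the archive and is left as found. The following is only a guess at its intent, based on the surrounding description: keep a running sum of x and report the first step at which the sum and the straight multiple i*x disagree after rounding. Both the loop body and the nice() stand-in are assumptions.

```python
def nice(x, digits=12):
    """Hypothetical stand-in for nice(): round x to `digits` significant digits."""
    return float("%.*g" % (digits, x))

def go(x, n=1000):
    """Return (i, running_sum, i*x) at the first disagreement, or None if none below n."""
    s = 0.0
    i = 0
    while s < n:
        i += 1
        s += x
        if nice(s) != nice(i * x):
            return i, s, i * x
    return None
```

With a step such as 0.5, which is exactly representable, the running sum never drifts and the function returns None.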

The soonest the breakdown occurs is at the 22496th multiple of 0.041 for the 
range given above. By the time someone starts getting into needs of iterating 
so many times, they will be ready to use the more sophisticated option of 
nice()--the one which makes it more versatile and less of a patch--the option 
to round the answers to a given number of leading digits rather than a given 
decimal precision like round. nice() gives a simple way to think about making a 
comparison of floats. You just have to ask yourself at what "part per X" do you 
no longer care whether the numbers are different or not. e.g., for 
approximately 1 part in 1

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
One recommendation: for starters, I'd much rather see the bytes type
standardized without a literal notation. There should be lots of
ways to create bytes objects from string objects, with specific
explicit encodings, and those should suffice, at least initially.

I also wonder if having a b"..." literal would just add more confusion
-- bytes are not characters, but b"..." makes it appear as if they
are.

--Guido

On 2/11/06, Bengt Richter <[EMAIL PROTECTED]> wrote:
> On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum <[EMAIL PROTECTED]> 
> wrote:
>
> >> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer <[EMAIL PROTECTED]>
> >> >The backwards compatibility problems *seem* to be relatively minor.
> >> >I only found one instance of breakage in the standard library.  Note
> >> >that my patch does not change PyObject_Str(); that would break
> >> >massive amounts of code.  Instead, I introduce a new function:
> >> >PyString_New().  I'm not crazy about the name but I couldn't think
> >> >of anything better.
> >
> >On 2/10/06, Bengt Richter <[EMAIL PROTECTED]> wrote:
> >> Should this not be coordinated with PEP 332?
> >
> >Probably.. But that PEP is rather incomplete. Wanna work on fixing that?
> >
> I'd be glad to add my thoughts, but first of course it's Skip's PEP,
> and Martin casts a long shadow when it comes to character coding issues
> that I suspect will have to be considered.
>
> (E.g., if there is a b'...' literal for bytes, the actual characters of
> the source code itself that the literal is being expressed in could be ascii
> or latin-1 or utf-8 or utf16le a la Microsoft, etc. UIAM, I read that the 
> source
> is at least temporarily normalized to Unicode, and then re-encoded (except now
> for string literals?) per coding cookie or other encoding inference. (I may be
> out of date, gotta catch up).
>
> If one way or the other a string literal is in Unicode, then presumably so is
> a byte string b'...' literal -- i.e. internally u"b'...'" just before
> being turned into bytes.
>
> Should that then be an internal straight u"b'...'".encode('byte') with 
> default ascii + escapes
> for non-ascii and non-printables, to define the full 8 bits without encoding 
> error?
> Should unicode be encodable into byte via a specific encoding? E.g., 
> u'abc'.encode('byte','latin1'),
> to distinguish producing a mutable byte string vs an immutable str type as 
> with u'abc'.encode('latin1').
> (but how does this play with str being able to produce unicode? And when do 
> these changes happen?)
> I guess I'm getting ahead of myself ;-)
>
> So I would first ask Skip what he'd like to do, and Martin for some hints on 
> reading, to avoid
> going down paths he already knows lead to brick walls ;-) And I need to think 
> more about PEP 349.
>
> I would propose to do the reading they suggest, and edit up a new version of 
> pep-0332.txt
> that anyone could then improve further. I don't know about an early deadline. 
> I don't want
> to over-commit, as time and energies vary. OTOH, as you've noticed, I could 
> be spending my
> time more effectively ;-)
>
> I changed the thread title, and will wait for some signs from you, Skip, 
> Martin, Neil, and I don't
> know who else might be interested...
>
> Regards,
> Bengt Richter
>
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread M.-A. Lemburg
Guido van Rossum wrote:
> One recommendation: for starters, I'd much rather see the bytes type
> standardized without a literal notation. There should be lots of
> ways to create bytes objects from string objects, with specific
> explicit encodings, and those should suffice, at least initially.
> 
> I also wonder if having a b"..." literal would just add more confusion
> -- bytes are not characters, but b"..." makes it appear as if they
> are.

Agreed.

Given that we have a source code encoding which would need
to be honored, b"..." doesn't really make all that much sense
(unless you always use hex escapes).

Note that if we drop the string type, all codecs which currently
return strings will have to return bytes. This gives you a pretty
exhaustive way of defining your binary literals in Python :-)

Here's one:

data = "abc".encode("latin-1")

To simplify things we might want to have

bytes("abc")

do the above encoding per default.
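
A sketch of how this reads in Python 3, where str.encode() does return a bytes object. The sample text is chosen for illustration. It also shows why the source encoding matters for a literal: the same character maps to different bytes under different codecs, whereas a hex escape is unambiguous.

```python
text = "caf\u00e9"                 # 'café', written with an escape to stay ASCII-safe

latin1 = text.encode("latin-1")    # one byte for the accented character
utf8 = text.encode("utf-8")        # two bytes for the same character

# A literal spelled with a hex escape does not depend on the source encoding.
hex_literal = b"caf\xe9"

round_trip = latin1.decode("latin-1")
```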

> --Guido
> 
> On 2/11/06, Bengt Richter <[EMAIL PROTECTED]> wrote:
>> On Fri, 10 Feb 2006 21:35:26 -0800, Guido van Rossum <[EMAIL PROTECTED]> 
>> wrote:
>>
>>>> On Sat, 11 Feb 2006 05:08:09 +0000 (UTC), Neil Schemenauer <[EMAIL PROTECTED]>
>>>>> The backwards compatibility problems *seem* to be relatively minor.
>>>>> I only found one instance of breakage in the standard library.  Note
>>>>> that my patch does not change PyObject_Str(); that would break
>>>>> massive amounts of code.  Instead, I introduce a new function:
>>>>> PyString_New().  I'm not crazy about the name but I couldn't think
>>>>> of anything better.
>>> On 2/10/06, Bengt Richter <[EMAIL PROTECTED]> wrote:
>>>> Should this not be coordinated with PEP 332?
>>> Probably.. But that PEP is rather incomplete. Wanna work on fixing that?
>>>
>> I'd be glad to add my thoughts, but first of course it's Skip's PEP,
>> and Martin casts a long shadow when it comes to character coding issues
>> that I suspect will have to be considered.
>>
>> (E.g., if there is a b'...' literal for bytes, the actual characters of
>> the source code itself that the literal is being expressed in could be ascii
>> or latin-1 or utf-8 or utf16le a la Microsoft, etc. UIAM, I read that the 
>> source
>> is at least temporarily normalized to Unicode, and then re-encoded (except 
>> now
>> for string literals?) per coding cookie or other encoding inference. (I may 
>> be
>> out of date, gotta catch up).
>>
>> If one way or the other a string literal is in Unicode, then presumably so is
>> a byte string b'...' literal -- i.e. internally u"b'...'" just before
>> being turned into bytes.
>>
>> Should that then be an internal straight u"b'...'".encode('byte') with 
>> default ascii + escapes
>> for non-ascii and non-printables, to define the full 8 bits without encoding 
>> error?
>> Should unicode be encodable into byte via a specific encoding? E.g., 
>> u'abc'.encode('byte','latin1'),
>> to distinguish producing a mutable byte string vs an immutable str type as 
>> with u'abc'.encode('latin1').
>> (but how does this play with str being able to produce unicode? And when do 
>> these changes happen?)
>> I guess I'm getting ahead of myself ;-)
>>
>> So I would first ask Skip what he'd like to do, and Martin for some hints on 
>> reading, to avoid
>> going down paths he already knows lead to brick walls ;-) And I need to 
>> think more about PEP 349.
>>
>> I would propose to do the reading they suggest, and edit up a new version of 
>> pep-0332.txt
>> that anyone could then improve further. I don't know about an early 
>> deadline. I don't want
>> to over-commit, as time and energies vary. OTOH, as you've noticed, I could 
>> be spending my
>> time more effectively ;-)
>>
>> I changed the thread title, and will wait for some signs from you, Skip, 
>> Martin, Neil, and I don't
>> know who else might be interested...
>>
>> Regards,
>> Bengt Richter
>>
>>
> 
> 
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote:
>One recommendation: for starters, I'd much rather see the bytes type
>standardized without a literal notation. There should be lots of
>ways to create bytes objects from string objects, with specific
>explicit encodings, and those should suffice, at least initially.
>
>I also wonder if having a b"..." literal would just add more confusion
>-- bytes are not characters, but b"..." makes it appear as if they
>are.

Why not just have the constructor be:

 bytes(initializer [,encoding])

Where initializer must be either an iterable of suitable integers, or a 
unicode/string object.  If the latter (i.e., it's a basestring), the 
encoding argument would then be required.  Then, there's no need for 
special codec support for the bytes type, since you call bytes on the thing 
to be encoded.  And of course, no need for a 'b' literal.
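
For comparison, the bytes constructor in Python 3 behaves much as described here: it accepts an iterable of integers in range(256), or a string together with a required encoding. A short sketch:

```python
# From an iterable of suitable integers.
from_ints = bytes([97, 98, 99])

# From text: the encoding must be given explicitly.
from_text = bytes("abc", "latin-1")

# Text without an encoding is rejected rather than guessed at.
try:
    bytes("abc")
except TypeError as exc:
    missing_encoding_error = exc
else:
    missing_encoding_error = None

# Integers outside 0..255 are rejected as well.
try:
    bytes([256])
except ValueError as exc:
    range_error = exc
else:
    range_error = None
```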



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Guido van Rossum
On 2/10/06, Mark Russell <[EMAIL PROTECTED]> wrote:
>
> On 10 Feb 2006, at 12:45, Nick Coghlan wrote:
> > An alternative would be to call it "__discrete__", as that is the key
> > characteristic of an indexing type - it consists of a sequence of discrete
> > values that can be isomorphically mapped to the integers.
>
> Another alternative: __as_ordinal__.  Wikipedia describes ordinals as
> "numbers used to denote the position in an ordered sequence" which seems a
> pretty precise description of the intended result.  The "as_" prefix also
> captures the idea that this should be a lossless conversion.

Aren't ordinals generally assumed to be non-negative? The numbers used
as slice or sequence indices can be negative!

Also, I don't buy the reason for 'as'; I don't see how this word would
require the conversion to be lossless.

The PEP continues to use __index__ and I'm happy with that.
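
A sketch of the hook as it can be used in current Python, including the negative case mentioned above. `Position` is a hypothetical class invented for the example.

```python
import operator

class Position:
    """Hypothetical integer-like type that is not a subclass of int."""
    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Must return a real int; negative values are allowed.
        return self.value

letters = list("abcdef")

last_two = letters[Position(-2):]       # negative slice bound
third = letters[Position(2)]            # plain subscript
as_int = operator.index(Position(-2))   # explicit conversion via the same hook
```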

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] ssize_t branch (Was: release plan for 2.5 ?)

2006-02-13 Thread Guido van Rossum
On 2/12/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Neal Norwitz wrote:
> > I'm tempted to say we should merge now.  I know the branch works on
> > 64-bit boxes.  I can test on a 32-bit box if Martin hasn't already.
> > There will be a lot of churn fixing problems, but maybe we can get
> > more people involved.
>
> The ssize_t branch has now all the API I want it to have. I just
> posted the PEP to comp.lang.python, maybe people have additional
> things they consider absolutely necessary.
>
> There are two aspects left, and both can be done after the merge:
> - a lot of modules still need adjustments, to really support
>   64-bit collections. This shouldn't cause any API changes, AFAICT.
>
> - the printing of Py_ssize_t values should be supported. I think
>   Tim proposed to provide the 'z' formatter across platforms.
>   This is a new API, but it's a pure extension, so it can be
>   done in the trunk.

Great news. I'm looking forward to getting this over with!

> I would like to avoid changing APIs after the merge to the trunk
> has happened; I remember Guido saying (a few years ago) that this
> change must be a single large change, rather than many small incremental
> changes. I agree, and I hope I have covered everything that needs
> to be covered.

Let me qualify that a bit -- I'd be okay with one honking big change
followed by some minor adjustments. I'd say that, since you've already
done so much in the branch, we're quickly approaching the point where
the extra testing we get from merging soon out-benefits the problems
some folks may experience due to the branch not being perfect yet.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Jim Jewett
Guido:

> I don't like __true_int__ very much. Personally,
> I'm fine with calling it __index__

index is OK, but is there a reason __integer__ would be
rejected?

__int__ roughly follows the low-level C implementation,
and may do odd things on unusual input.

__integer__ properly creates a conceptual integer, so
it won't lose or corrupt information (unless the class
writer does this intentionally).
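
The distinction being drawn can be sketched with the behaviour of Python as later released, where the lossless hook ended up being named __index__ and is reachable through operator.index():

```python
import operator

# int() goes through __int__ and silently drops the fractional part.
truncated = int(3.7)

# operator.index() refuses anything that is not a true integer.
try:
    operator.index(3.7)
except TypeError:
    index_rejects_float = True
else:
    index_rejects_float = False

# Subscripting uses the lossless hook, so a float is not accepted as an index.
try:
    "abc"[1.0]
except TypeError:
    subscript_rejects_float = True
else:
    subscript_rejects_float = False
```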

-jJ


Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Guido van Rossum
On 2/13/06, Jim Jewett <[EMAIL PROTECTED]> wrote:
> Guido:
>
> > I don't like __true_int__ very much. Personally,
> > I'm fine with calling it __index__
>
> index is OK, but is there a reason __integer__ would be
> rejected?
>
> __int__ roughly follows the low-level C implementation,
> and may do odd things on unusual input.
>
> __integer__ properly creates a conceptual integer, so
> it won't lose or corrupt information (unless the class
> writer does this intentionally).

Given the number of folks who misappreciate the difference between
__getattr__ and __getattribute__, I'm not sure I'd want to encourage
using abbreviated and full forms of the same term in the same context.
When confronted with the existence of __int__ and __integer__ I can
see plenty of confusion ahead.
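
For readers unsure of the difference being alluded to, a minimal sketch; the class names are made up for the example. __getattr__ is consulted only after normal lookup fails, while __getattribute__ is called for every attribute access.

```python
class Fallback:
    """__getattr__ runs only when normal lookup fails."""
    present = "found normally"

    def __getattr__(self, name):
        return "fallback for " + name

class Intercept:
    """__getattribute__ runs on every attribute access."""
    present = "found normally"

    def __getattribute__(self, name):
        return "intercepted " + name

f = Fallback()
i = Intercept()
```

Here `f.present` is found by normal lookup and only `f.missing` reaches the hook, whereas both `i.present` and `i.missing` go through it.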

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] nice()

2006-02-13 Thread Raymond Hettinger
Please do not spam multiple mail lists with these posts (edu-sig, 
python-dev, and tutor).

Raymond



- Original Message - 
From: "Smith" <[EMAIL PROTECTED]>
To: 
Cc: ; 
Sent: Monday, February 13, 2006 12:10 PM
Subject: Re: [Python-Dev] nice() 


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Guido van Rossum
Shouldn't docs.python.org be removed? It seems to add more confusion
than anything, especially since most links on python.org continue to
point to python.org/doc/.

On 2/13/06, Georg Brandl <[EMAIL PROTECTED]> wrote:
> The above docs are from August 2005 while docs.python.org/dev is current.
> Shouldn't the old docs be removed?

(Now that I work for Google I realize more than ever before the
importance of keeping URLs stable; PageRank(tm) numbers don't get
transferred as quickly as contents. I have this worry too in the
context of the python.org redesign; 301 permanent redirect is *not*
going to help PageRank of the new page.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Fred L. Drake, Jr.
On Monday 13 February 2006 10:03, Georg Brandl wrote:
 > The above docs are from August 2005 while docs.python.org/dev is current.
 > Shouldn't the old docs be removed?

I'm afraid I've generally been too busy to chime in much on this topic, but 
I've spent a bit of time thinking about it, and would like to keep on top of 
the issue still.

The automatically-maintained version of the development docs is certainly 
preferable to the manually-maintained-by-me version, and I've updated the 
link from www.python.org/doc/ to refer to that version for now.  However, I 
do have some concerns about how this is all structured still.

One of the goals of docs.python.org was to be able to do a Google site-search 
and only see the current version.  Having multiple versions on that site is 
contrary to that purpose.  I'd like to see the development version(s) move 
back to being in the www.python.org/dev/doc/ hierarchy.

What I would also like to see is to have an automatically-updated version for 
each of the maintainer versions of Python, as well as the development trunk.  
That would mean two versions at this point (2.4.x, 2.5.x); only one of those 
is currently handled automatically.


  -Fred

-- 
Fred L. Drake, Jr.   


[Python-Dev] moving content around (Re: http://www.python.org/dev/doc/devel still available)

2006-02-13 Thread Fredrik Lundh
Guido van Rossum wrote:

> (Now that I work for Google I realize more than ever before the
> importance of keeping URLs stable; PageRank(tm) numbers don't get
> transferred as quickly as contents. I have this worry too in the
> context of the python.org redesign; 301 permanent redirect is *not*
> going to help PageRank of the new page.)

so what's the best way to move stuff around?

wikipedia seems to display the content from the "new" location under
the old URL, but with a small blurb at the top that says "redirected
from ", e.g.

http://en.wikipedia.org/wiki/F_Scott_Fitzgerald

(not sure if it's done that way to avoid HTTP roundtrips, or for some
obscure googlerank reason...)







Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Jim Jewett
Is there a reason __integer__ would be rejected?

Guido van Rossum answered:

> Given the number of folks who misappreciate the difference between
> __getattr__ and __getattribute__, I'm not sure I'd want to encourage
> using abbreviated and full forms of the same term in the same context.
> When confronted with the existence of __int__ and __integer__ I can
> see plenty of confusion ahead.

I see this case as slightly different.

getattr and getattribute are both things you might
reasonably want to do.  __int__ is something you
probably shouldn't be doing very often anymore;
it is being kept for backwards compatibility.

Switching getattr and getattribute will cause bugs,
which may be hard to diagnose, even for people
who might reasonably be using the hooks.  Switching
__int__ and (newname) won't matter, unless
__int__ was already doing something unexpected.
Since backwards compatibility means we can't
prevent __int__ from doing the unexpected, a
similar name might be *good* -- at least it would
tip people off that __int__ might not be what they
want.

I can't think of any way to associate getattr vs
getattribute with timing or precedence.  I already
associate int with a specific C datatype and integer
with something more abstract.  (I'm not sure the
new method is a better match for my integer
concept, and it probably isn't a better match
for java.lang.Integer, but ... the separation is there.)

-jJ


Re: [Python-Dev] moving content around (Re: http://www.python.org/dev/doc/devel still available)

2006-02-13 Thread Guido van Rossum
On 2/13/06, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
>
> > (Now that I work for Google I realize more than ever before the
> > importance of keeping URLs stable; PageRank(tm) numbers don't get
> > transferred as quickly as contents. I have this worry too in the
> > context of the python.org redesign; 301 permanent redirect is *not*
> > going to help PageRank of the new page.)
>
> so what's the best way to move stuff around?

I don't know; my point was to avoid needless moving rather than giving
a best practice for moving.

> wikipedia seems to display the content from the "new" location under
> the old URL, but with a small blurb at the top that says "redirected
> from ", e.g.
>
> http://en.wikipedia.org/wiki/F_Scott_Fitzgerald
>
> (not sure if it's done that way to avoid HTTP roundtrips, or for some
> obscure googlerank reason...)

Can't say I understand that particular example. Wikipedia has
different requirements though; there are aliases (e.g. homonyms,
synonyms) that won't go away. For python.org we're looking at
minimizing the URL space churn.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote:
> >One recommendation: for starters, I'd much rather see the bytes type
> >standardized without a literal notation. There should be lots of
> >ways to create bytes objects from string objects, with specific
> >explicit encodings, and those should suffice, at least initially.
> >
> >I also wonder if having a b"..." literal would just add more confusion
> >-- bytes are not characters, but b"..." makes it appear as if they
> >are.
>
> Why not just have the constructor be:
>
>  bytes(initializer [,encoding])
>
> Where initializer must be either an iterable of suitable integers, or a
> unicode/string object.  If the latter (i.e., it's a basestring), the
> encoding argument would then be required.  Then, there's no need for
> special codec support for the bytes type, since you call bytes on the thing
> to be encoded.  And of course, no need for a 'b' literal.

It'd be cruel and unusual punishment though to have to write

  bytes("abc", "Latin-1")

I propose that the default encoding (for basestring instances) ought
to be "ascii" just like everywhere else. (Meaning, it should really be
the system default encoding, which defaults to "ascii" and is
intentionally hard to change.)
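
To make the constructor forms under discussion concrete, here is a minimal sketch. It assumes the bytes type as it later shipped in Python 3, which did not exist when this thread was written, so it shows how things turned out rather than what was being proposed for 2.5. The variable names are illustrative only.

```python
# Sketch against the Python 3 bytes type (an assumption relative to this
# 2006 thread, where bytes() was still only a proposal).

# Form 1: an iterable of integers, each in range(256).
from_ints = bytes([97, 98, 99])

# Form 2: a text string plus an explicit encoding.
from_text = bytes("abc", "latin-1")

# Python 3 requires the encoding for text input; omitting it is a TypeError
# rather than a silent fallback to some default.
try:
    bytes("abc")
    missing_encoding_error = None
except TypeError as exc:
    missing_encoding_error = exc

print(from_ints, from_text, type(missing_encoding_error).__name__)
```

Both forms produce the same three bytes here because "abc" is pure ASCII; the choice of default only matters once non-ASCII characters appear.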

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 351

2006-02-13 Thread Guido van Rossum
I've rejected PEP 351, with a reference to this thread as the rationale.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread Guido van Rossum
On 2/12/06, Greg Ewing <[EMAIL PROTECTED]> wrote:
>  > [A large head-exploding set of rules]
>
> Blarg.
>
> Const - Just Say No.

+1

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Jim Jewett
Travis wrote:

>  The patch adds a new API function int PyObject_AsIndex(obj)

How did you decide between int and long?

Why not ssize_t?

Also, if index is being added as a builtin, should the failure
result be changed?  I'm thinking that this may become a
replacement for isinstance(val, (int, long)).  If so, it might
be nice not to raise errors, or at least to raise a more
specific subclass.  (Catching a TypeError and then
checking the message string ... does not seem clean.)

-jJ


Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Guido van Rossum
On 2/13/06, Jim Jewett <[EMAIL PROTECTED]> wrote:
> Travis wrote:
>
> >  The patch adds a new API function int PyObject_AsIndex(obj)
>
> How did you decide between int and long?
>
> Why not ssize_t?

It should be the same type used everywhere for indexing. In the svn
HEAD that's int. Once PEP 353 lands it should be ssize_t. I've made
Travis aware of this issue already.

> Also, if index is being added as a builtin, should the failure
> result be changed?

I don't like to add a built-in index() at this point; mostly because
of Occam's razor (we haven't found a need).

> I'm thinking that this may become a
> replacement for isinstance(val, (int, long)).

But only if it's okay if values > sys.maxint (or some other constant
indicating the limit of ssize_t) are not required to be supported.

> If so, it might
> be nice not to raise errors, or at least to raise a more
> specific subclass.  (Catching a TypeError and then
> checking the message string ... does not seem clean.)

I'm not sure what you mean. How could index(x) ever replace
isinstance(x, (int, long)) without raising an exception? Surely
index("abc") *should* raise an exception.
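
The check being debated can be written without inspecting the exception message, since the failure is an ordinary TypeError. A minimal sketch, assuming `operator.index` (the standard-library spelling that shipped later; no built-in `index()` is implied) and a hypothetical helper named `is_index_like`:

```python
import operator


def is_index_like(value):
    """Return True if value can be used as a sequence index.

    Hypothetical helper: it relies only on the exception type raised by
    operator.index, never on the text of the error message.
    """
    try:
        operator.index(value)
    except TypeError:
        return False
    return True


print(is_index_like(3), is_index_like("abc"), is_index_like(3.0))
```

Note that a float is rejected even when it holds a whole number, which is the behavioural difference from an `int(x)` conversion.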

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread A.M. Kuchling
On Mon, Feb 13, 2006 at 03:52:44PM -0500, Fred L. Drake, Jr. wrote:
> What I would also like to see is to have an automatically-updated
> version for each of the maintainer versions of Python, as well as
> the development trunk.  That would mean two versions at this point
> (2.4.x, 2.5.x); only one of those is currently handled
> automatically.

If Thomas could set up a wildcard DNS of some sort, would it be a good
idea to have lots of hostnames, e.g. docs-24.python.org,
docs-25.python.org, etc.?  We could probably make it work in Apache
with mod_rewrite so that we aren't endlessly tweaking the config file
as new versions are released.
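
The rewrite rule itself would live in the Apache configuration, which is not shown in this thread. The mapping it would need to perform can be sketched in Python instead. Everything below is hypothetical: the `docs-NN` hostname pattern follows the examples above, while the `/data/docs` base path and the function name are invented for illustration.

```python
import re

# Hypothetical hostname scheme: docs-24.python.org -> version "2.4".
HOST_PATTERN = re.compile(r"^docs-(\d)(\d+)\.python\.org$")


def doc_root_for_host(hostname, base="/data/docs"):
    """Map a versioned docs hostname to a directory, or None if it does not match."""
    match = HOST_PATTERN.match(hostname.lower())
    if match is None:
        return None
    major, minor = match.groups()
    return f"{base}/{major}.{minor}"


print(doc_root_for_host("docs-24.python.org"))
print(doc_root_for_host("www.python.org"))
```

Because the version is derived from the hostname, a new release would need only a new directory, not a configuration change.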

--amk



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Aahz
On Mon, Feb 13, 2006, Jim Jewett wrote:
>
> getattr and getattribute are both things you might reasonably want to
> do. __int__ is something you probably shouldn't be doing very often
> anymore; it is being kept for backwards compatibility.

And how do you convert a float to an int?  __int__ is NOT going away; the
sole purpose of __index__ is to enable sequence index functionality and
similar use-cases for int-like objects that do not subclass from int.
(For example, one might want to allow an enumeration type to index into
a list.)
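
A minimal sketch of that enumeration use case, assuming the `__index__` protocol as it eventually shipped. The `Color` class and its attributes are hypothetical names chosen for illustration.

```python
class Color:
    """Hypothetical enumeration-like type that does not subclass int."""

    def __init__(self, name, ordinal):
        self.name = name
        self.ordinal = ordinal

    def __index__(self):
        # Called by sequence indexing and slicing.
        return self.ordinal


palette = ["red", "green", "blue"]
GREEN = Color("green", 1)

picked = palette[GREEN]    # indexing goes through __index__
tail = palette[GREEN:]     # so does slicing
truncated = int(3.7)       # float -> int conversion still uses __int__

# A float is convertible with int() but is not accepted as an index.
try:
    palette[1.0]
    float_index_error = None
except TypeError as exc:
    float_index_error = exc

print(picked, tail, truncated, type(float_index_error).__name__)
```

The two hooks answer different questions: `__int__` allows a lossy conversion, while `__index__` asserts that the object really is an integer for indexing purposes.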
-- 
Aahz ([EMAIL PROTECTED])   <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis


Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread Jeremy Hylton
It sounds like the right answer for Python is to change the signature
of PyArg_ParseTupleAndKeywords() back.  We'll fix it when C fixes its
const rules.

Jeremy

On 2/13/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 2/12/06, Greg Ewing <[EMAIL PROTECTED]> wrote:
> >  > [A large head-exploding set of rules]
> >
> > Blarg.
> >
> > Const - Just Say No.
>
> +1
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread Guido van Rossum
+1

On 2/13/06, Jeremy Hylton <[EMAIL PROTECTED]> wrote:
> It sounds like the right answer for Python is to change the signature
> of PyArg_ParseTupleAndKeywords() back.  We'll fix it when C fixes its
> const rules.
>
> Jeremy
>
> On 2/13/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > On 2/12/06, Greg Ewing <[EMAIL PROTECTED]> wrote:
> > >  > [A large head-exploding set of rules]
> > >
> > > Blarg.
> > >
> > > Const - Just Say No.
> >
> > +1
> >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)


--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread M.-A. Lemburg
Guido van Rossum wrote:
> On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
>> At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote:
>>> One recommendation: for starters, I'd much rather see the bytes type
>>> standardized without a literal notation. There should be lots of
>>> ways to create bytes objects from string objects, with specific
>>> explicit encodings, and those should suffice, at least initially.
>>>
>>> I also wonder if having a b"..." literal would just add more confusion
>>> -- bytes are not characters, but b"..." makes it appear as if they
>>> are.
>> Why not just have the constructor be:
>>
>>  bytes(initializer [,encoding])
>>
>> Where initializer must be either an iterable of suitable integers, or a
>> unicode/string object.  If the latter (i.e., it's a basestring), the
>> encoding argument would then be required.  Then, there's no need for
>> special codec support for the bytes type, since you call bytes on the thing
>> to be encoded.  And of course, no need for a 'b' literal.
> 
> It'd be cruel and unusual punishment though to have to write
> 
>   bytes("abc", "Latin-1")
> 
> I propose that the default encoding (for basestring instances) ought
> to be "ascii" just like everywhere else. (Meaning, it should really be
> the system default encoding, which defaults to "ascii" and is
> intentionally hard to change.)

We're talking about Py3k here: "abc" will be a Unicode string,
so why restrict the conversion to 7 bits when you can have 8 bits
without any conversion problems ?

While we're at it: I'd suggest that we remove the auto-conversion
from bytes to Unicode in Py3k and the default encoding along with
it. In Py3k the standard lib will have to be Unicode compatible
anyway and string parser markers like "s#" will have to go away
as well, so there's not much need for this anymore.

(Maybe a bit radical, but I guess that's what Py3k is meant for.)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Feb 13 2006)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 


Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread Jeremy Hylton
On 2/10/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Jeremy Hylton wrote:
> > Ok.  I reviewed the original problem and you're right, the problem was
> > not that it failed outright but that it produced a warning about the
> > deprecated conversion:
> > warning: deprecated conversion from string constant to 'char*''
> >
> > I work at a place that takes the same attitude as python-dev about
> > warnings:  They're treated as errors and you can't check in code that
> > the compiler generates warnings for.
>
> In that specific case, I think the compiler's warning should be turned
> off; it is a bug in the compiler if that specific warning cannot be
> turned off separately.

The compiler in question is gcc and the warning can be turned off with
-Wno-write-strings.  I think we'd be better off leaving that option
on, though.  This warning will help me find places where I'm passing a
string literal to a function that does not take a const char*.  That's
valuable, not insensate.

Jeremy

> While it is true that the conversion is deprecated, the C++ standard
> defines this as
>
> "Normative for the current edition of the Standard, but not guaranteed
> to be part of the Standard in future revisions."
>
> The current version is from 1998. I haven't been following closely,
> but I believe there are no plans to actually remove the feature
> in the next revision.
>
> FWIW, Annex D also defines these features as deprecated:
> - the use of "static" for objects in namespace scope (AFAICT
>   including C file-level static variables and functions)
> - C library headers (i.e. )
>
> Don't you get a warning when including Python.h, because that
> include ?
>
> > Nonetheless, the consensus on the c++ sig and python-dev at the time
> > was to fix Python.  If we don't allow warnings in our compilations, we
> > shouldn't require our users to accept warnings in theirs.
>
> We don't allow warnings for "major compilers". This specific compiler
> appears flawed (or your configuration of it).
>
> Regards,
> Martin
>


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Thomas Wouters
On Mon, Feb 13, 2006 at 05:41:00PM -0500, A.M. Kuchling wrote:
> On Mon, Feb 13, 2006 at 03:52:44PM -0500, Fred L. Drake, Jr. wrote:
> > What I would also like to see is to have an automatically-updated
> > version for each of the maintainer versions of Python, as well as
> > the development trunk.  That would mean two versions at this point
> > (2.4.x, 2.5.x); only one of those is currently handled
> > automatically.

> If Thomas could set up a wildcard DNS of some sort,

That wouldn't be a problem. I fear what it'll do to the PageRank though ;-)

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Michael Foord
Guido van Rossum wrote:
> Shouldn't docs.python.org be removed? It seems to add more confusion
> than anything, especially since most links on python.org continue to
> point to python.org/doc/.
>
>   
AllTheWeb reports about 1200 links into the docs.python.org subdomain.
(This differs from Google's link: feature, which I believe only shows
links to a specific URL.)

http://www.alltheweb.com/search?cat=web&cs=utf8&q=link%3Adocs.python.org&rys=0&itag=crv&_sb_lang=pref

It's where I link to as well. Be a shame to lose it. ;-)

Michael Foord

> On 2/13/06, Georg Brandl <[EMAIL PROTECTED]> wrote:
>   
>> The above docs are from August 2005 while docs.python.org/dev is current.
>> Shouldn't the old docs be removed?
>> 
>
> (Now that I work for Google I realize more than ever before the
> importance of keeping URLs stable; PageRank(tm) numbers don't get
> transferred as quickly as contents. I have this worry too in the
> context of the python.org redesign; 301 permanent redirect is *not*
> going to help PageRank of the new page.)
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 10:55 PM 2/13/2006 +0100, M.-A. Lemburg wrote:
>Guido van Rossum wrote:
> > On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> >> At 09:55 AM 2/13/2006 -0800, Guido van Rossum wrote:
> >>> One recommendation: for starters, I'd much rather see the bytes type
> >>> standardized without a literal notation. There should be lots of
> >>> ways to create bytes objects from string objects, with specific
> >>> explicit encodings, and those should suffice, at least initially.
> >>>
> >>> I also wonder if having a b"..." literal would just add more confusion
> >>> -- bytes are not characters, but b"..." makes it appear as if they
> >>> are.
> >> Why not just have the constructor be:
> >>
> >>  bytes(initializer [,encoding])
> >>
> >> Where initializer must be either an iterable of suitable integers, or a
> >> unicode/string object.  If the latter (i.e., it's a basestring), the
> >> encoding argument would then be required.  Then, there's no need for
> >> special codec support for the bytes type, since you call bytes on the thing
> >> to be encoded.  And of course, no need for a 'b' literal.
> >
> > It'd be cruel and unusual punishment though to have to write
> >
> >   bytes("abc", "Latin-1")
> >
> > I propose that the default encoding (for basestring instances) ought
> > to be "ascii" just like everywhere else. (Meaning, it should really be
> > the system default encoding, which defaults to "ascii" and is
> > intentionally hard to change.)
>
>We're talking about Py3k here: "abc" will be a Unicode string,
>so why restrict the conversion to 7 bits when you can have 8 bits
>without any conversion problems ?

Actually, I thought we were talking about adding bytes() in 2.5.

However, now that you've brought this up, it actually makes perfect sense 
to just use latin-1 as the effective encoding for both strings and 
unicode.  In Python 2.x, strings are byte strings by definition, so it's 
only in 3.0 that an encoding would be required.  And again, latin1 is a 
reasonable, roundtrippable default encoding.

So, it sounds like making the encoding default to latin-1 would be a 
reasonably safe approach in both 2.x and 3.x.


>While we're at it: I'd suggest that we remove the auto-conversion
>from bytes to Unicode in Py3k and the default encoding along with
>it. In Py3k the standard lib will have to be Unicode compatible
>anyway and string parser markers like "s#" will have to go away
>as well, so there's not much need for this anymore.

I thought all this was already in the plan for 3.0, but maybe I assume too 
much.  :)



Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread M.-A. Lemburg
Tim Peters wrote:
> [Jeremy]
>>>> I added some const to several API functions that take char* but
>>>> typically called by passing string literals.
> 
> [Tim]
>>> If he had _stuck_ to that, we wouldn't be having this discussion :-)
>>> (that is, nobody passes string literals to
>>> PyArg_ParseTupleAndKeywords's kws argument).
> 
> [Jeremy]
>> They are passing arrays of string literals.  In my mind, that was a
>> nearly equivalent use case.  I believe the C++ compiler complains
>> about passing an array of string literals to char**.
> 
> It's the consequences:  nobody complains about tacking "const" on to a
> former honest-to-God "char *" argument that was in fact not modified,
> because that's not only helpful for C++ programmers, it's _harmless_
> for all programmers.  For example, nobody could sanely object (and
> nobody did :-)) to adding const to the attribute-name argument in
> PyObject_SetAttrString().  Sticking to that creates no new problems
> for anyone, so that's as far as I ever went.

Well, it broke my C extensions... I now have this in my code:

/* The keyword array changed to const char* in Python 2.5 */
/* PY_VERSION_HEX is a full 32-bit value (0x02050000 for 2.5), so a
   comparison against 0x0205 would be true on every Python version. */
#if PY_VERSION_HEX >= 0x02050000
# define Py_KEYWORDS_STRING_TYPE const char
#else
# define Py_KEYWORDS_STRING_TYPE char
#endif
...
static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL};
...
if (!PyArg_ParseTupleAndKeywords(args,kws,format,kwslist,&a1))
    goto onError;

The crux is that code which should be portable across Python
versions won't work otherwise: you either get Python 2.5 xor
Python 2.x (for x < 5) compatibility.

Not too happy about it, but then compared to the ssize_t
changes and the relative imports PEP, this one is an easy
one to handle.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Fred L. Drake, Jr.
On Monday 13 February 2006 15:40, Guido van Rossum wrote:
 > Shouldn't docs.python.org be removed? It seems to add more confusion
 > than anything, especially since most links on python.org continue to
 > point to python.org/doc/.

docs.python.org was created specifically to make searching the most recent 
"stable" version of the docs easier (using Google's site: modifier, no less).  
I don't know what the link count statistics say (other than what you 
mention), and don't know which gets hit more often, but I still think it's a 
reasonable approach.

I've been switching links to point to docs.python.org whenever I find an older 
link that points to www.python.org/doc/current/; other parts of the doc/ area 
from the site didn't move, and perhaps that's a problem that should be 
addressed.

 > (Now that I work for Google I realize more than ever before the
 > importance of keeping URLs stable; PageRank(tm) numbers don't get
 > transferred as quickly as contents. I have this worry too in the
 > context of the python.org redesign; 301 permanent redirect is *not*
 > going to help PageRank of the new page.)

Maybe I'm just not getting why that's relevant.


  -Fred

-- 
Fred L. Drake, Jr.   


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Fredrik Lundh
Fred L. Drake, Jr. wrote:

> docs.python.org was created specifically to make searching the most recent
> "stable" version of the docs easier (using Google's site: modifier, no less).
> I don't know what the link count statistics say (other than what you
> mention), and don't know which gets hit more often

I've been looking into page stats for the AltPyDotOrgCms activity; from
what I can tell, it's roughly evenly distributed (~55% on www.python.org/doc,
45% on docs.python.org)







Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread M.-A. Lemburg
Phillip J. Eby wrote:
>>>> Why not just have the constructor be:
>>>>
>>>>  bytes(initializer [,encoding])
>>>>
>>>> Where initializer must be either an iterable of suitable integers, or a
>>>> unicode/string object.  If the latter (i.e., it's a basestring), the
>>>> encoding argument would then be required.  Then, there's no need for
>>>> special codec support for the bytes type, since you call bytes on the thing
>>>> to be encoded.  And of course, no need for a 'b' literal.
>>> It'd be cruel and unusual punishment though to have to write
>>>
>>>   bytes("abc", "Latin-1")
>>>
>>> I propose that the default encoding (for basestring instances) ought
>>> to be "ascii" just like everywhere else. (Meaning, it should really be
>>> the system default encoding, which defaults to "ascii" and is
>>> intentionally hard to change.)
>> We're talking about Py3k here: "abc" will be a Unicode string,
>> so why restrict the conversion to 7 bits when you can have 8 bits
>> without any conversion problems ?
> 
> Actually, I thought we were talking about adding bytes() in 2.5.

Then we'd need to make the "ascii" encoding assumption
again, just like Guido proposed.

> However, now that you've brought this up, it actually makes perfect sense 
> to just use latin-1 as the effective encoding for both strings and 
> unicode.  In Python 2.x, strings are byte strings by definition, so it's 
> only in 3.0 that an encoding would be required.  And again, latin1 is a 
> reasonable, roundtrippable default encoding.

It is. However, it's not a reasonable assumption of the
default encoding since there are many encodings out there
that special case the characters 0x80-0xFF, hence the choice
of using ASCII as default encoding in Python.

The conversion from Unicode to bytes is different in this
respect, since you are converting from a "bigger" type to
a "smaller" one. Choosing latin-1 as default for this
conversion would give you all 8 bits, instead of just 7
bits that ASCII provides.
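
The "8 bits versus 7 bits" point can be checked directly. A minimal sketch, assuming Python 3 str and bytes semantics (which postdate this thread); the variable names are illustrative only.

```python
# Every byte value 0x00-0xFF maps to exactly one code point under latin-1,
# so decoding and re-encoding is lossless.
all_bytes = bytes(range(256))
as_text = all_bytes.decode("latin-1")
round_tripped = as_text.encode("latin-1")

# ASCII covers only the first 128 code points, so the same text cannot be
# encoded; the error reports where the first unencodable character sits.
try:
    as_text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

print(round_tripped == all_bytes, len(as_text), ascii_error.start)
```

Round-tripping shows that latin-1 never fails, not that it is the encoding the data was actually written in. That is the distinction drawn in the reply below.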

> So, it sounds like making the encoding default to latin-1 would be a 
> reasonably safe approach in both 2.x and 3.x.

Reasonable for bytes(): yes. In general: no.

>> While we're at it: I'd suggest that we remove the auto-conversion
>> from bytes to Unicode in Py3k and the default encoding along with
>> it. In Py3k the standard lib will have to be Unicode compatible
>> anyway and string parser markers like "s#" will have to go away
>> as well, so there's not much need for this anymore.
> 
> I thought all this was already in the plan for 3.0, but maybe I assume too 
> much.  :)

Wouldn't want to wait for Py4D :-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> > It'd be cruel and unusual punishment though to have to write
> >
> >   bytes("abc", "Latin-1")
> >
> > I propose that the default encoding (for basestring instances) ought
> > to be "ascii" just like everywhere else. (Meaning, it should really be
> > the system default encoding, which defaults to "ascii" and is
> > intentionally hard to change.)
>
> We're talking about Py3k here: "abc" will be a Unicode string,
> so why restrict the conversion to 7 bits when you can have 8 bits
> without any conversion problems ?

As Phillip guessed, I was indeed thinking about introducing bytes()
sooner than that, perhaps even in 2.5 (though I don't want anything
rushed).

Even in Py3k though, the encoding issue stands -- what if the file
encoding is Unicode? Then using Latin-1 to encode bytes by default
might not be what the user expected. Or what if the file encoding is
something totally different? (Cyrillic, Greek, Japanese, Klingon.)
Any default other than ASCII isn't going to work as expected. ASCII
isn't going to work as expected either, but it will complain loudly
(by throwing a UnicodeError) whenever you try it, rather than causing
subtle bugs later.
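
The loud-versus-silent trade-off can be sketched as follows, assuming Python 3 string semantics (not available when this was written). The sample word is an arbitrary choice.

```python
text = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# A latin-1 default succeeds silently, but yields bytes that differ from
# what a consumer expecting UTF-8 would need.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# An ASCII default refuses outright, so the mismatch surfaces immediately.
try:
    text.encode("ascii")
    ascii_failure = None
except UnicodeError as exc:
    ascii_failure = exc

print(latin1_bytes, utf8_bytes, type(ascii_failure).__name__)
```

Neither default gives the "right" bytes when the intended encoding is something else; the difference is only whether the problem is reported at the point of conversion.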

> While we're at it: I'd suggest that we remove the auto-conversion
> from bytes to Unicode in Py3k and the default encoding along with
> it.

I'm not sure which auto-conversion you're talking about, since there
is no bytes type yet. If you're talking about the auto-conversion from
str to unicode: the bytes type should not be assumed to have *any*
properties that the current str type has, and that includes
auto-conversion.

> In Py3k the standard lib will have to be Unicode compatible
> anyway and string parser markers like "s#" will have to go away
> as well, so there's not much need for this anymore.
>
> (Maybe a bit radical, but I guess that's what Py3k is meant for.)

Right.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> Actually, I thought we were talking about adding bytes() in 2.5.

I was.

> However, now that you've brought this up, it actually makes perfect sense
> to just use latin-1 as the effective encoding for both strings and
> unicode.  In Python 2.x, strings are byte strings by definition, so it's
> only in 3.0 that an encoding would be required.  And again, latin1 is a
> reasonable, roundtrippable default encoding.
>
> So, it sounds like making the encoding default to latin-1 would be a
> reasonably safe approach in both 2.x and 3.x.

I disagree. IMO the same reasons why we don't do this now for the
conversion between str and unicode apply to bytes.

> >While we're at it: I'd suggest that we remove the auto-conversion
> >from bytes to Unicode in Py3k and the default encoding along with
> >it. In Py3k the standard lib will have to be Unicode compatible
> >anyway and string parser markers like "s#" will have to go away
> >as well, so there's not much need for this anymore.

I don't know yet what the C API will look like in 3.0. But it may well
have to support auto-conversion from Unicode to char* using some
system default encoding (e.g. the Windows default code page?) in order
to be able to conveniently wrap OS APIs that use char* instead of some
sort of Unicode (and each OS has its own way of interpreting char* as
Unicode -- I believe Apple uses UTF-8?).

> I thought all this was already in the plan for 3.0, but maybe I assume too
> much.  :)

In Py3k, I can see two reasonable approaches to conversion between
strings (Unicode) and bytes: always require an explicit encoding, or
assume ASCII. Anything else is asking for trouble IMO.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 12:03 AM 2/14/2006 +0100, M.-A. Lemburg wrote:
>The conversion from Unicode to bytes is different in this
>respect, since you are converting from a "bigger" type to
>a "smaller" one. Choosing latin-1 as default for this
>conversion would give you all 8 bits, instead of just 7
>bits that ASCII provides.

I was just pointing out that since byte strings are bytes by definition, 
then simply putting those bytes in a bytes() object doesn't alter the 
existing encoding.  So, using latin-1 when converting a string to bytes 
actually seems like the One Obvious Way to do it.
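
The property being relied on here, that Latin-1 pairs each byte value with
the code point of the same number, can be checked with a short sketch. It
assumes present-day Python 3 syntax, which is not what this thread was
written against.

```python
# Sketch only: modern Python 3. Latin-1 maps byte value n to code point n.
all_bytes = bytes(range(256))

as_text = all_bytes.decode("latin-1")      # never fails: every byte is valid
round_tripped = as_text.encode("latin-1")  # gives back the same 256 bytes

# Each character's code point equals the byte value it came from.
same_values = [ord(ch) for ch in as_text] == list(all_bytes)

# ASCII, by contrast, rejects any byte above 0x7f.
try:
    all_bytes.decode("ascii")
    ascii_accepts_all = True
except UnicodeDecodeError:
    ascii_accepts_all = False
```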

I'm so accustomed to being wary of encoding issues that the idea doesn't 
*feel* right at first - I keep going, "but you can't know what encoding 
those bytes are".  Then I go, Duh, that's the point.  If you convert 
str->bytes, there's no conversion and no interpretation - neither the str 
nor the bytes object knows its encoding, and that's okay.  So 
str(bytes_object) (in 2.x) should also just turn it back to a normal 
bytestring.

In fact, the 'encoding' argument seems useless in the case of str objects, 
and it seems it should default to latin-1 for unicode objects.  The only 
use I see for having an encoding for a 'str' would be to allow confirming 
that the input string in fact is valid for that encoding.  So, 
"bytes(some_str,'ascii')" would be an assertion that some_str must be valid 
ASCII.


> > So, it sounds like making the encoding default to latin-1 would be a
> > reasonably safe approach in both 2.x and 3.x.
>
>Reasonable for bytes(): yes. In general: no.

Right, I was only talking about bytes().

For 3.0, the type formerly known as "str" won't exist, so only the Unicode 
part will be relevant then.



Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 12:03 AM 2/14/2006 +0100, M.-A. Lemburg wrote:
> >The conversion from Unicode to bytes is different in this
> >respect, since you are converting from a "bigger" type to
> >a "smaller" one. Choosing latin-1 as default for this
> >conversion would give you all 8 bits, instead of just 7
> >bits that ASCII provides.
>
> I was just pointing out that since byte strings are bytes by definition,
> then simply putting those bytes in a bytes() object doesn't alter the
> existing encoding.  So, using latin-1 when converting a string to bytes
> actually seems like the One Obvious Way to do it.

This actually makes some sense -- bytes(s) where isinstance(s, str)
should just copy the data, since we can't know what encoding the user
believes it is in anyway. (With the exception of string literals,
where it makes sense to assume that the user believes it is in the
same encoding as the source code -- but I believe non-ASCII characters
in string literals are disallowed anyway, or at least known to cause
undefined results in rats.)

> I'm so accustomed to being wary of encoding issues that the idea doesn't
> *feel* right at first - I keep going, "but you can't know what encoding
> those bytes are".  Then I go, Duh, that's the point.  If you convert
> str->bytes, there's no conversion and no interpretation - neither the str
> nor the bytes object knows its encoding, and that's okay.  So
> str(bytes_object) (in 2.x) should also just turn it back to a normal
> bytestring.

You've got me convinced. Scrap my previous responses in this thread.

> In fact, the 'encoding' argument seems useless in the case of str objects,

Right.

> and it seems it should default to latin-1 for unicode objects.

But here I disagree.

> The only
> use I see for having an encoding for a 'str' would be to allow confirming
> that the input string in fact is valid for that encoding.  So,
> "bytes(some_str,'ascii')" would be an assertion that some_str must be valid
> ASCII.

We already have ways to assert that a string is ASCII.

> For 3.0, the type formerly known as "str" won't exist, so only the Unicode
> part will be relevant then.

And I think then the encoding should be required or default to ASCII.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Michael Foord
Phillip J. Eby wrote:
[snip..]
>
> In fact, the 'encoding' argument seems useless in the case of str objects, 
> and it seems it should default to latin-1 for unicode objects.  The only 
>   
-1 for having an implicit encode that behaves differently to other 
implicit encodes/decodes that happen in Python. Life is confusing enough 
already.

Michael Foord



Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Michael Foord <[EMAIL PROTECTED]> wrote:
> Phillip J. Eby wrote:
> [snip..]
> >
> > In fact, the 'encoding' argument seems useless in the case of str objects,
> > and it seems it should default to latin-1 for unicode objects.  The only
> >
> -1 for having an implicit encode that behaves differently to other
> implicit encodes/decodes that happen in Python. Life is confusing enough
> already.

But adding an encoding doesn't help. The str.encode() method always
assumes that the string itself is ASCII-encoded, and that's not good
enough:

>>> "abc".encode("latin-1")
'abc'
>>> "abc".decode("latin-1")
u'abc'
>>> "abc\xf0".decode("latin-1")
u'abc\xf0'
>>> "abc\xf0".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position
3: ordinal not in range(128)
>>>

The right way to look at this is, as Phillip says, to consider
conversion between str and bytes as not an encoding but a data type
change *only*.
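
The failure in the session above comes from an implicit step: a 2.x str is
first decoded with the default encoding (ASCII) before being encoded. A
rough emulation, assuming present-day Python 3 with a bytes object
standing in for a 2.x str; py2_str_encode is a hypothetical helper, not a
real API.

```python
# Sketch only: Python 3 emulation of Python 2's str.encode().
# A Python 3 bytes object stands in for a 2.x str.
def py2_str_encode(data: bytes, encoding: str) -> bytes:
    # Python 2 first decoded the byte string with the default encoding
    # (ASCII) and only then encoded the resulting unicode object.
    return data.decode("ascii").encode(encoding)

ok = py2_str_encode(b"abc", "latin-1")   # pure ASCII passes through

try:
    py2_str_encode(b"abc\xf0", "latin-1")
    error = None
except UnicodeDecodeError as exc:
    error = exc                          # raised by the hidden ASCII decode
```

Note that the error is a decode error even though the caller asked for an
encode, which is exactly what the traceback in the session shows.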

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Barry Warsaw
On Mon, 2006-02-13 at 15:44 -0800, Guido van Rossum wrote:

> The right way to look at this is, as Phillip says, to consider
> conversion between str and bytes as not an encoding but a data type
> change *only*.

That sounds right to me too.
-Barry





Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Michael Foord
Guido van Rossum wrote:
> On 2/13/06, Michael Foord <[EMAIL PROTECTED]> wrote:
>   
>> Phillip J. Eby wrote:
>> [snip..]
>> 
>>> In fact, the 'encoding' argument seems useless in the case of str objects,
>>> and it seems it should default to latin-1 for unicode objects.  The only
>>>
>>>   
>> -1 for having an implicit encode that behaves differently to other
>> implicit encodes/decodes that happen in Python. Life is confusing enough
>> already.
>> 
>
> But adding an encoding doesn't help. The str.encode() method always
> assumes that the string itself is ASCII-encoded, and that's not good
> enough:
>
>   
Sorry - I meant for the unicode to bytes case. A default encoding that 
behaves differently to the current implicit encodes/decodes would be 
confusing IMHO.

I agree that string to bytes shouldn't change the value of the bytes. 
The least confusing description of a non-unicode string is 'byte-string'.

Michael Foord
> >>> "abc".encode("latin-1")
> 'abc'
> >>> "abc".decode("latin-1")
> u'abc'
> >>> "abc\xf0".decode("latin-1")
> u'abc\xf0'
> >>> "abc\xf0".encode("latin-1")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position
> 3: ordinal not in range(128)
>
> The right way to look at this is, as Phillip says, to consider
> conversion between str and bytes as not an encoding but a data type
> change *only*.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)



Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Alex Martelli
On 2/13/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
   ...
> I don't like to add a built-in index() at this point; mostly because
> of Occam's razor (we haven't found a need).

I thought you had agreed, back when I had said that __index__ should
also be made easily available to implementors of Python-coded classes
implementing sequences, more elegantly than by demanding that they
code x.__index__() [I can't think offhand of any other special-named
method that you HAVE to call directly -- there's always some syntax or
functionality in the standard library to call it more elegantly on
your behalf].  This doesn't necessarily argue that index should be in
the built-ins module, of course, but I thought there was a sentiment
towards having it in either the operator or math modules.
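
A minimal sketch of what that looks like, assuming the operator.index()
function as it later shipped in Python; the Offset class is hypothetical
and exists only for illustration.

```python
import operator

class Offset:
    """Hypothetical integer-like type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Tells Python this object can act as an exact integer index.
        return self.value

letters = ["a", "b", "c", "d", "e"]
start, stop = Offset(1), Offset(3)

as_int = operator.index(start)   # calls start.__index__() on our behalf
sliced = letters[start:stop]     # slicing consults __index__ as well
```

A Python-coded sequence class can call operator.index(x) in its own
__getitem__ instead of spelling out x.__index__().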


Alex


[Python-Dev] bdist_* to stdlib?

2006-02-13 Thread Guido van Rossum
In private email, Phillip Eby suggested adding these things to the
2.5 standard library:

bdist_deb, bdist_msi, and friends

He explained them as follows:

"""
bdist_deb makes .deb files (packages for Debian-based Linux distros, like
Ubuntu).  bdist_msi makes .msi installers for Windows (it's by Martin v.
Loewis).  Marc Lemburg proposed on the distutils-sig that these and various
other implemented bdist_* formats (other than bdist_egg) be included in the
next Python release, and there was no opposition there that I recall.
"""

I guess bdist_egg should also be added if we support setuptools (not
setuplib as I mistakenly called it previously)? (I'm still a bit
unclear on the various concepts here, not having made a distribution
of anything in a very long time...)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP for adding an sq_index slot so that any object, a or b, can be used in X[a:b] notation

2006-02-13 Thread Guido van Rossum
Sorry, you're right. operator.index() sounds fine.

--Guido

On 2/13/06, Alex Martelli <[EMAIL PROTECTED]> wrote:
> On 2/13/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>...
> > I don't like to add a built-in index() at this point; mostly because
> > of Occam's razor (we haven't found a need).
>
> I thought you had agreed, back when I had said that __index__ should
> also be made easily available to implementors of Python-coded classes
> implementing sequences, more elegantly than by demanding that they
> code x.__index__() [I can't think offhand of any other special-named
> method that you HAVE to call directly -- there's always some syntax or
> functionality in the standard library to call it more elegantly on
> your behalf].  This doesn't necessarily argue that index should be in
> the built-ins module, of course, but I thought there was a sentiment
> towards having it in either the operator or math modules.
>
>
> Alex
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 03:23 PM 2/13/2006 -0800, Guido van Rossum wrote:
>On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > The only
> > use I see for having an encoding for a 'str' would be to allow confirming
> > that the input string in fact is valid for that encoding.  So,
> > "bytes(some_str,'ascii')" would be an assertion that some_str must be valid
> > ASCII.
>
>We already have ways to assert that a string is ASCII.

I didn't mean that it was the only purpose.  In Python 2.x, practical code 
has to sometimes deal with "string-like" objects.  That is, code that takes 
either strings or unicode.  If such code calls bytes(), it's going to want 
to include an encoding so that unicode conversions won't fail.  But 
silently ignoring the encoding argument in that case isn't a good idea.

Ergo, I propose to permit the encoding to be specified when passing in a 
(2.x) str object, to allow code that handles both str and unicode to be 
"str-stable" in 2.x.

I'm fine with rejecting an encoding argument if the initializer is not a 
str or unicode; I just don't want the call signature to vary based on a 
runtime distinction between str and unicode.  And, I don't want the 
encoding argument to be silently ignored when you pass in a string.  If I 
assert that I'm encoding ASCII (or utf-8 or whatever), then the string 
should be required to be valid.  If I don't pass in an encoding, then I'm 
good to go.

(This is orthogonal to the issue of what encoding is used as a default for 
conversions from the unicode type, btw.)


> > For 3.0, the type formerly known as "str" won't exist, so only the Unicode
> > part will be relevant then.
>
>And I think then the encoding should be required or default to ASCII.

The reason I'm arguing for latin-1 is symmetry in 2.x versions only.  (In 
3.x, there's no str vs. unicode, and thus nothing to be symmetrical.)  So, 
if you invoke bytes() without an encoding on a 2.x basestring, you should 
get the same result.  Latin-1 produces "the same result" when viewed in 
terms of the resulting byte string.

If we don't go with latin-1, I'd argue for requiring an encoding for 
unicode objects in 2.x, because that seems like the only reasonable way to 
break the symmetry between str and unicode, even though it forces 
"str-stable" code to specify an encoding.  The key is that at least *one* 
of the signatures needs to be stable in meaning across both str and unicode 
in 2.x in order to allow unicode-safe, str-stable code to be written.

(Again, for 3.x, this issue doesn't come into play because there's only one 
string type to worry about; what the default is or whether there's a 
default is therefore entirely up to you.)



Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Michael Foord <[EMAIL PROTECTED]> wrote:
> Sorry - I meant for the unicode to bytes case. A default encoding that
> behaves differently to the current implicit encodes/decodes would be
> confusing IMHO.

And I am in agreement with you there (I think only Phillip argued otherwise).

> I agree that string to bytes shouldn't change the value of the bytes.

It's a deal then.

Can the owner of PEP 332 update the PEP to record these decisions?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Missing PyCon 2006

2006-02-13 Thread Ka-Ping Yee
Hi folks.  I had been planning to attend PyCon this year and was really
looking forward to it, but i need to cancel.  I am sorry that i won't
be getting to see you all in a couple of weeks.

If you know anyone who hasn't yet registered but wants to go, please
contact me -- we can transfer my registration.  Thanks, and sorry for
using python-dev for this.


-- ?!ng


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> I didn't mean that it was the only purpose.  In Python 2.x, practical code
> has to sometimes deal with "string-like" objects.  That is, code that takes
> either strings or unicode.  If such code calls bytes(), it's going to want
> to include an encoding so that unicode conversions won't fail.

That sounds like a rather hypothetical example. Have you thought it
through? Presumably code that accepts both str and unicode either
doesn't care about encodings, but simply returns objects of the same
type as the arguments -- and then it's unlikely to want to convert the
arguments to bytes; or it *does* care about encodings, and then it
probably already has to special-case str vs. unicode because it has to
control how str objects are interpreted.

> But
> silently ignoring the encoding argument in that case isn't a good idea.
>
> Ergo, I propose to permit the encoding to be specified when passing in a
> (2.x) str object, to allow code that handles both str and unicode to be
> "str-stable" in 2.x.

Again, have you thought this through?

What would bytes("abc\xf0", "latin-1") *mean*? Take the string
"abc\xf0", interpret it as being encoded in XXX, and then encode from
XXX to Latin-1. But what's XXX? As I showed in a previous post,
"abc\xf0".encode("latin-1") *fails* because the source for the
encoding is assumed to be ASCII.

I think we can make this work only when the string in fact only
contains ASCII and the encoding maps ASCII to itself (which most
encodings do -- but e.g. EBCDIC does not). But I'm not sure how useful
that is.
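
A small sketch of the "maps ASCII to itself" condition, assuming
present-day Python 3 and its stdlib codec cp500 as a representative
EBCDIC encoding (the choice of cp500 is mine, not from the thread).

```python
# Sketch only: modern Python 3; cp500 is one of the stdlib EBCDIC codecs.
text = "abc"
ascii_bytes = text.encode("ascii")

# For each candidate encoding, does pure-ASCII text keep its ASCII bytes?
ascii_preserved = {
    name: text.encode(name) == ascii_bytes
    for name in ("latin-1", "utf-8", "cp500")
}

ebcdic_bytes = text.encode("cp500")   # different byte values for a, b, c
```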

> I'm fine with rejecting an encoding argument if the initializer is not a
> str or unicode; I just don't want the call signature to vary based on a
> runtime distinction between str and unicode.

I'm still not sure that this will actually help anyone.

> And, I don't want the
> encoding argument to be silently ignored when you pass in a string.

Agreed.

> If I
> assert that I'm encoding ASCII (or utf-8 or whatever), then the string
> should be required to be valid.

Defined how? That the string is already in that encoding?

> If I don't pass in an encoding, then I'm
> good to go.
>
> (This is orthogonal to the issue of what encoding is used as a default for
> conversions from the unicode type, btw.)

Right. The issues are completely different!

> > > For 3.0, the type formerly known as "str" won't exist, so only the Unicode
> > > part will be relevant then.
> >
> >And I think then the encoding should be required or default to ASCII.
>
> The reason I'm arguing for latin-1 is symmetry in 2.x versions only.  (In
> 3.x, there's no str vs. unicode, and thus nothing to be symmetrical.)  So,
> if you invoke bytes() without an encoding on a 2.x basestring, you should
> get the same result.  Latin-1 produces "the same result" when viewed in
> terms of the resulting byte string.

Only if you assume the str object is encoded in Latin-1.

Your argument for symmetry would be a lot stronger if we used Latin-1
for the conversion between str and Unicode. But we don't. I like the
other interpretation (which I thought was yours too?) much better: str
<--> bytes conversions don't use encodings but simply change the type
without changing the bytes; conversion between either and unicode
works exactly the same, and requires an encoding unless all the
characters involved are pure ASCII.

> If we don't go with latin-1, I'd argue for requiring an encoding for
> unicode objects in 2.x, because that seems like the only reasonable way to
> break the symmetry between str and unicode, even though it forces
> "str-stable" code to specify an encoding.  The key is that at least *one*
> of the signatures needs to be stable in meaning across both str and unicode
> in 2.x in order to allow unicode-safe, str-stable code to be written.

Using ASCII as the default encoding has the same property -- it can
remain stable across the 2.x / 3.0 boundary.

> (Again, for 3.x, this issue doesn't come into play because there's only one
> string type to worry about; what the default is or whether there's a
> default is therefore entirely up to you.)

A nice-to-have property would be that it might be possible to write
code that today deals with Unicode and str, but in 3.0 will deal with
Unicode and bytes instead. But I'm not sure how likely that is since
bytes objects won't have most methods that str and Unicode objects
have (like lower(), find(), etc.).

There's one property that bytes, str and unicode all share: type(x[0])
== type(x), at least as long as len(x) >= 1. This is perhaps the
ultimate test for string-ness.

Or should b[0] be an int, if b is a bytes object? That would change
things dramatically.
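
For reference, the two possible answers can be seen side by side in
Python 3 as it was eventually released, which is an assumption relative
to this thread: indexing text yields text, while indexing bytes yields an
int and only slicing yields bytes.

```python
# Sketch only: behaviour of Python 3 as later released.
text = "abc"
data = b"abc"

text_item = text[0]      # a length-1 str: passes the string-ness test
data_item = data[0]      # an int: fails the string-ness test
data_slice = data[0:1]   # a length-1 bytes object

text_is_stringy = type(text_item) == type(text)
data_is_stringy = type(data_item) == type(data)
```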

There's also the consideration for APIs that, informally, accept
either a string or a sequence of objects. Many of these exist, and
they are probably all being converted to support unicode as well as
str (if it makes sense at all). Should a bytes object be considered as
a sequen

Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread James Y Knight
On Feb 13, 2006, at 7:09 PM, Guido van Rossum wrote:

> On 2/13/06, Michael Foord <[EMAIL PROTECTED]> wrote:
>> Sorry - I meant for the unicode to bytes case. A default encoding that
>> behaves differently to the current implicit encodes/decodes would be
>> confusing IMHO.
>
> And I am in agreement with you there (I think only Phillip argued  
> otherwise).
>
>> I agree that string to bytes shouldn't change the value of the bytes.
>
> It's a deal then.
>
> Can the owner of PEP 332 update the PEP to record these decisions?

So, in python2.X, you have:
- bytes("\x80"), you get a bytestring with a single byte of value  
0x80 (when no encoding is specified, and the object is a str, it  
doesn't try to encode it at all).
- bytes("\x80", encoding="latin-1"), you get an error, because  
encoding "\x80" into latin-1 implicitly decodes it into a unicode  
object first, via the system-wide default: ascii.
- bytes(u"\x80"), you get an error, because the default encoding for  
a unicode string is ascii.
- bytes(u"\x80", encoding="latin-1"), you get a bytestring with a  
single byte of value 0x80.

In py3k, when the str object is eliminated, then what do you have?  
Perhaps
- bytes("\x80"), you get an error, encoding is required. There is no  
such thing as "default encoding" anymore, as there's no str object.
- bytes("\x80", encoding="latin-1"), you get a bytestring with a  
single byte of value 0x80.
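
The cases above can be modelled in a short sketch. It is written in
present-day Python 3, so it is an approximation: a Python 3 bytes object
stands in for a 2.x str and a Python 3 str stands in for a 2.x unicode
object. The names bytes_2x and outcome are hypothetical helpers, not
proposed APIs.

```python
# Sketch only: Python 3 model of the 2.x semantics listed above.
def bytes_2x(obj, encoding=None):
    if isinstance(obj, bytes):              # models a 2.x str
        if encoding is None:
            return bytes(obj)               # type change only, no encoding
        # With an encoding, 2.x would implicitly decode via ASCII first.
        return obj.decode("ascii").encode(encoding)
    if isinstance(obj, str):                # models a 2.x unicode object
        return obj.encode(encoding or "ascii")
    raise TypeError("expected a str-like or unicode-like object")

def outcome(func, *args, **kwargs):
    """Return the result, or the name of the exception type raised."""
    try:
        return func(*args, **kwargs)
    except (UnicodeError, TypeError) as exc:
        return type(exc).__name__

results_2x = [
    outcome(bytes_2x, b"\x80"),                       # copied as-is
    outcome(bytes_2x, b"\x80", encoding="latin-1"),   # implicit ASCII decode
    outcome(bytes_2x, "\x80"),                        # default ASCII encode
    outcome(bytes_2x, "\x80", encoding="latin-1"),    # explicit encoding
]

# The two py3k cases, run against the bytes() of Python 3 as later shipped.
results_3x = [
    outcome(bytes, "\x80"),
    outcome(bytes, "\x80", encoding="latin-1"),
]
```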


James


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, James Y Knight <[EMAIL PROTECTED]> wrote:
> So, in python2.X, you have:
> - bytes("\x80"), you get a bytestring with a single byte of value
> 0x80 (when no encoding is specified, and the object is a str, it
> doesn't try to encode it at all).
> - bytes("\x80", encoding="latin-1"), you get an error, because
> encoding "\x80" into latin-1 implicitly decodes it into a unicode
> object first, via the system-wide default: ascii.
> - bytes(u"\x80"), you get an error, because the default encoding for
> a unicode string is ascii.
> - bytes(u"\x80", encoding="latin-1"), you get a bytestring with a
> single byte of value 0x80.

Yes to all.

> In py3k, when the str object is eliminated, then what do you have?
> Perhaps
> - bytes("\x80"), you get an error, encoding is required. There is no
> such thing as "default encoding" anymore, as there's no str object.
> - bytes("\x80", encoding="latin-1"), you get a bytestring with a
> single byte of value 0x80.

Yes to both again.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] http://www.python.org/dev/doc/devel still available

2006-02-13 Thread Jeremy Hylton
On 2/13/06, Fred L. Drake, Jr. <[EMAIL PROTECTED]> wrote:
> On Monday 13 February 2006 15:40, Guido van Rossum wrote:
>  > Shouldn't docs.python.org be removed? It seems to add more confusion
>  > than anything, especially since most links on python.org continue to
>  > point to python.org/doc/.
>
> docs.python.org was created specifically to make searching the most recent
> "stable" version of the docs easier (using Google's site: modifier, no less).
> I don't know what the link count statistics say (other than what you
> mention), and don't know which gets hit more often, but I still think it's a
> reasonable approach.

Why not do a query like this?
http://www.google.com/search?q=site%3Apython.org/doc/current%20urllib&hl=en

Jeremy


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Neil Schemenauer
Guido van Rossum <[EMAIL PROTECTED]> wrote:
>> In py3k, when the str object is eliminated, then what do you have?
>> Perhaps
>> - bytes("\x80"), you get an error, encoding is required. There is no
>> such thing as "default encoding" anymore, as there's no str object.
>> - bytes("\x80", encoding="latin-1"), you get a bytestring with a
>> single byte of value 0x80.
>
> Yes to both again.

I haven't been following this discussion about bytes() real closely
but I don't think that bytes() should do the encoding.  We already
have a way to spell that:

"\x80".encode('latin-1')

Also, I think it would be useful to introduce byte array literals at
the same time as the bytes object.  That would allow people to use
byte arrays without having to get involved with all the silly string
encoding confusion.

  Neil



Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Fred L. Drake, Jr.
On Monday 13 February 2006 21:52, Neil Schemenauer wrote:
 > Also, I think it would be useful to introduce byte array literals at
 > the same time as the bytes object.  That would allow people to use
 > byte arrays without having to get involved with all the silly string
 > encoding confusion.

bytes([0, 1, 2, 3])


  -Fred

-- 
Fred L. Drake, Jr.   


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Guido van Rossum
On 2/13/06, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> Guido van Rossum <[EMAIL PROTECTED]> wrote:
> >> In py3k, when the str object is eliminated, then what do you have?
> >> Perhaps
> >> - bytes("\x80"), you get an error, encoding is required. There is no
> >> such thing as "default encoding" anymore, as there's no str object.
> >> - bytes("\x80", encoding="latin-1"), you get a bytestring with a
> >> single byte of value 0x80.
> >
> > Yes to both again.
>
> I haven't been following this discussion about bytes() real closely
> but I don't think that bytes() should do the encoding.  We already
> have a way to spell that:
>
> "\x80".encode('latin-1')

But in 2.5 we can't change that to return a bytes object without
creating HUGE incompatibilities.

In general I've come to appreciate that there are two ways of
converting an object of type A to an object of type B: ask an A
instance to convert itself to a B, or ask the type B to create a new
instance from an A. Depending on what A and B are, both APIs make
sense; sometimes reasons of decoupling require that A can't know about
B, in which case you have to use the latter approach; sometimes B
can't know about A, in which case you have to use the former. Even
when A == B we sometimes support both APIs: to create a new list from
a list a, you can write a[:] or list(a); to create a new dict from a
dict d, you can write d.copy() or dict(d).

An advantage of the latter API is that there's no confusion about the
resulting type -- dict(d) is definitely a dict, and list(a) is
definitely a list. Not so for d.copy() or a[:] -- if the input type is
another mapping or sequence, it'll probably return an object of that
same type.
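
A small sketch of the difference, assuming present-day Python 3; the
tuple and OrderedDict inputs are my choice of "another sequence" and
"another mapping".

```python
from collections import OrderedDict

t = (1, 2, 3)
od = OrderedDict(a=1, b=2)

# Asking the instance to copy itself: the result follows the input type.
slice_type = type(t[:])        # tuple
copy_type = type(od.copy())    # OrderedDict

# Asking the target type to build a new instance: the result is fixed.
list_type = type(list(t))      # list
dict_type = type(dict(od))     # dict
```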

Again, it depends on the application which is better.

I think that bytes(s, ) is fine, especially for expressing a
new type, since it is unambiguous about the result type, and has no
backwards compatibility issues.
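
For comparison, this is how the two spellings behave in the Python 3
releases that eventually shipped. It is shown only as an illustration of
the two APIs, not as what was being proposed for 2.5:

```python
text = "\x80"

# Both spellings produce the same single byte.
via_constructor = bytes(text, "latin-1")
via_method = text.encode("latin-1")

# Without an encoding the constructor refuses a text string.
try:
    bytes(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True
```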

> Also, I think it would useful to introduce byte array literals at
> the same time as the bytes object.  That would allow people to use
> byte arrays without having to get involved with all the silly string
> encoding confusion.

You missed the part where I said that introducing the bytes type
*without* a literal seems to be a good first step. A new type, even
built-in, is much less drastic than a new literal (which requires
lexer and parser support in addition to everything else).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Barry Warsaw
On Feb 13, 2006, at 7:29 PM, Guido van Rossum wrote:

> There's one property that bytes, str and unicode all share: type(x[0])
> == type(x), at least as long as len(x) >= 1. This is perhaps the
> ultimate test for string-ness.

But not perfect, since of course other containers can contain objects  
of their own type too.  But it leads to an interesting issue...

> Or should b[0] be an int, if b is a bytes object? That would change
> things dramatically.

This makes me think I want an unsigned byte type, which b[0] would  
return.  In another thread I think someone mentioned something about  
fixed width integral types, such that you could have an object that  
was guaranteed to be 8-bits wide, 16-bits wide, etc.   Maybe you also  
want signed and unsigned versions of each.  This may seem like YAGNI  
to many people, but as I've been working on a tightly
embedded/extended application for the last few years, I've definitely had  
occasions where I wish I could more closely and more directly model  
my C values as Python objects (without using the standard workarounds  
or writing my own C extension types).

But anyway, without hyper-generalizing, it's still worth asking  
whether a bytes type is just a container of byte objects, where the  
contained objects would be distinct, fixed 8-bit unsigned integral  
types.
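
As a rough illustration of what fixed-width values look like, the ctypes
module wraps C integer types. This is only a sketch of the idea, not the
built-in byte type being wished for here; note that ctypes does no
overflow checking, so out-of-range values wrap silently:

```python
import ctypes

# An unsigned 8-bit value: 300 does not fit, so it wraps modulo 256.
u8 = ctypes.c_uint8(300)

# A signed 8-bit value: 200 is reinterpreted as a negative number.
i8 = ctypes.c_int8(200)

# A 16-bit unsigned value holds 300 without wrapping.
u16 = ctypes.c_uint16(300)

print(u8.value, i8.value, u16.value)
```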

> There's also the consideration for APIs that, informally, accept
> either a string or a sequence of objects. Many of these exist, and
> they are probably all being converted to support unicode as well as
> str (if it makes sense at all). Should a bytes object be considered as
> a sequence of things, or as a single thing, from the POV of these
> types of APIs? Should we try to standardize how code tests for the
> difference? (Currently all sorts of shortcuts are being taken, from
> isinstance(x, (list, tuple)) to isinstance(x, basestring).)

I think bytes objects are very much like string objects today --  
they're the photons of Python since they can act like either  
sequences or scalars, depending on the context.  For example, we have  
code that needs to deal with situations where an API can return  
either a scalar or a sequence of those scalars.  So we have a utility  
function like this:

    def thingiter(obj):
        try:
            it = iter(obj)
        except TypeError:
            yield obj
        else:
            for item in it:
                yield item

Maybe there's a better way to do this, but the most obvious problem  
is that (for our use cases), this fails for strings because in this  
context we want strings to act like scalars.  So we add a little test  
just before the "try:" like "if isinstance(obj, basestring): yield  
obj".  But that's yucky.
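
The special case described above could be sketched as follows.
basestring exists only in Python 2, so this version assumes Python 3 and
tests for (str, bytes) instead; that substitution is an assumption, not
part of the original utility:

```python
def thingiter(obj):
    # Strings are iterable, but here they should behave like scalars.
    if isinstance(obj, (str, bytes)):
        yield obj
        return
    try:
        it = iter(obj)
    except TypeError:
        # Not iterable at all: treat it as a single scalar.
        yield obj
    else:
        for item in it:
            yield item
```

With this, list(thingiter("abc")) gives a one-element list, while a list
or tuple argument is still flattened one level.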

I don't know what the solution is -- if there /is/ a solution short  
of special case tests like above, but I think the key observation is  
that sometimes you want your string to act like a sequence and  
sometimes you want it to act like a scalar.  I suspect bytes objects  
will be the same way.

-Barry




Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Phillip J. Eby
At 04:29 PM 2/13/2006 -0800, Guido van Rossum wrote:
>On 2/13/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > I didn't mean that it was the only purpose.  In Python 2.x, practical code
> > has to sometimes deal with "string-like" objects.  That is, code that takes
> > either strings or unicode.  If such code calls bytes(), it's going to want
> > to include an encoding so that unicode conversions won't fail.
>
>That sounds like a rather hypothetical example. Have you thought it
>through? Presumably code that accepts both str and unicode either
>doesn't care about encodings, but simply returns objects of the same
>type as the arguments -- and then it's unlikely to want to convert the
>arguments to bytes; or it *does* care about encodings, and then it
>probably already has to special-case str vs. unicode because it has to
>control how str objects are interpreted.

Actually, it's the other way around.  Code that wants to output 
uninterpreted bytes right now and accepts either strings or Unicode has to 
special-case *unicode* -- not str, because str is the only "bytes type" we 
currently have.
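
A minimal sketch of that special case, assuming a Python 3 style split in
which str is text and bytes is raw data (in 2.x the test would be against
unicode). The helper name to_wire and its signature are hypothetical:

```python
def to_wire(data, encoding):
    """Return raw bytes suitable for output (hypothetical helper)."""
    if isinstance(data, str):
        # Only the text type needs special handling: it must be encoded.
        return data.encode(encoding)
    # Anything that is already bytes-like passes through uninterpreted.
    return bytes(data)
```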

This creates an interesting issue in WSGI for Jython, which of course only 
has one (unicode-based) string type now.  Since there's no bytes type in 
Python in general, the only solution we could come up with was to treat 
such strings as latin-1:

 http://www.python.org/peps/pep-0333.html#unicode-issues

This is why I'm biased towards latin-1 encoding of unicode to bytes; it's 
"the same thing" as an uninterpreted string of bytes.

I think the difference in our viewpoints is that you're still thinking 
"string" thoughts, whereas I'm thinking "byte" thoughts.  Bytes are just 
bytes; they don't *have* an encoding.

So, if you think of "converting a string to bytes" as meaning "create an 
array of numerals corresponding to the characters in the string", then this 
leads to a uniform result whether the characters are in a str or a unicode 
object.  In other words, to me, bytes(str_or_unicode) should be treated as:

 bytes(map(ord, str_or_unicode))

In other words, without an encoding, bytes() should simply treat str and 
unicode objects *as if they were a sequence of integers*, and produce an 
error when an integer is out of range.  This is a logical and consistent 
interpretation in the absence of an encoding, because in that case you 
don't care about the encoding - it's just raw data.
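
A sketch of that interpretation on a Python 3 interpreter, where
bytes() already accepts an iterable of integers and rejects values
outside 0-255. The helper name raw_bytes is hypothetical:

```python
def raw_bytes(text):
    """Treat text as a plain sequence of integers (hypothetical helper)."""
    return bytes(map(ord, text))

ok = raw_bytes("abc\xf0")

try:
    raw_bytes("\u0100")      # ordinal 256 does not fit in a byte
    out_of_range = False
except ValueError:
    out_of_range = True
```

For every string that it accepts, this gives the same bytes as encoding
with latin-1.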

If, however, you include an encoding, then you're stating that you want to 
encode the *meaning* of the string, not merely its integer values.


>What would bytes("abc\xf0", "latin-1") *mean*? Take the string
>"abc\xf0", interpret it as being encoded in XXX, and then encode from
>XXX to Latin-1. But what's XXX? As I showed in a previous post,
>"abc\xf0".encode("latin-1") *fails* because the source for the
>encoding is assumed to be ASCII.

I'm saying that XXX would be the same encoding as you specified.  i.e., 
including an encoding means you are encoding the *meaning* of the string.

However, I believe I mainly proposed this as an alternative to having 
bytes(str_or_unicode) work like bytes(map(ord,str_or_unicode)), which I 
think is probably a saner default.
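
The failure quoted above comes from the implicit ASCII decode that
Python 2 performs before encoding a str. A sketch that spells out that
hidden step explicitly, so that it also runs on Python 3 (the variable
names are illustrative):

```python
data = b"abc\xf0"   # what the Python 2 str literal "abc\xf0" holds

try:
    # Python 2 does this decode implicitly inside str.encode().
    data.decode("ascii").encode("latin-1")
    failed = False
except UnicodeDecodeError:
    failed = True

# Choosing latin-1 for XXX instead makes the round trip an identity.
round_trip = data.decode("latin-1").encode("latin-1")
```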


>Your argument for symmetry would be a lot stronger if we used Latin-1
>for the conversion between str and Unicode. But we don't.

But that's because we're dealing with its meaning *as a string*, not merely 
as ordinals in a sequence of bytes.


>  I like the
>other interpretation (which I thought was yours too?) much better: str
><--> bytes conversions don't use encodings but simply change the type
>without changing the bytes;

I like it better too.  The part you didn't like was where MAL and I believe 
this should be extended to Unicode characters in the 0-255 range also.  :)


>There's one property that bytes, str and unicode all share: type(x[0])
>== type(x), at least as long as len(x) >= 1. This is perhaps the
>ultimate test for string-ness.
>
>Or should b[0] be an int, if b is a bytes object? That would change
>things dramatically.

+1 for it being an int.  Heck, I'd want to at least consider the 
possibility of introducing a character type (chr?) in Python 3.0, and 
getting rid of the "iterating a string yields strings" 
characteristic.  I've found it to be a bit of a pain when dealing with 
heterogeneous nested sequences that contain strings.
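
For reference, the bytes type that eventually shipped in Python 3
indexes to an int, while text strings still yield strings. The snippet
is only an illustration of the two behaviours under discussion:

```python
b = b"abc"
s = "abc"

first_byte = b[0]      # indexing bytes gives an int
first_slice = b[0:1]   # slicing still gives bytes
first_char = s[0]      # indexing text gives a length-1 str

print(first_byte, first_slice, first_char)
```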


>There's also the consideration for APIs that, informally, accept
>either a string or a sequence of objects. Many of these exist, and
>they are probably all being converted to support unicode as well as
>str (if it makes sense at all). Should a bytes object be considered as
>a sequence of things, or as a single thing, from the POV of these
>types of APIs? Should we try to standardize how code tests for the
>difference? (Currently all sorts of shortcuts are being taken, from
>isinstance(x, (list, tuple)) to isinstance(x, basestring).)

I'm inclined to think of certain features at least in terms of the buffer 
interface, but 

Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread Martin v. Löwis
Jeremy Hylton wrote:
> The compiler in question is gcc and the warning can be turned off with
> -Wno-write-strings.  I think we'd be better off leaving that option
> on, though.  This warning will help me find places where I'm passing a
> string literal to a function that does not take a const char*.  That's
> valuable, not insensate.

Hmm. I'd say this depends on what your reaction to the warning is.
If you sprinkle const_casts in the code, nothing is gained.

Perhaps there is some value in finding functions which ought to expect
const char*. For that, occasional checks should be sufficient; I cannot
see a point in having code permanently pass with that option. In
particular not if you are interfacing with C libraries.

Regards,
Martin


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
M.-A. Lemburg wrote:
> We're talking about Py3k here: "abc" will be a Unicode string,
> so why restrict the conversion to 7 bits when you can have 8 bits
> without any conversion problems ?

YAGNI. If you have a need for a byte string in source code, it will
typically be "random" bytes, which can be nicely used through

  bytes([0x73, 0x9f, 0x44, 0xd2, 0xfb, 0x49, 0xa3, 0x14,  0x8b, 0xee])

For larger blocks, people should use base64.string_to_bytes (which
can become a synonym for base64.decodestring in Py3k).
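
A sketch of the base64 route using today's base64.b64encode and
base64.b64decode. The name base64.string_to_bytes above is only a
suggestion made in this message; the sketch does not assume it exists:

```python
import base64

raw = bytes([0x73, 0x9f, 0x44, 0xd2, 0xfb, 0x49, 0xa3, 0x14, 0x8b, 0xee])

# The ASCII-only text that would be embedded in source code.
encoded = base64.b64encode(raw).decode("ascii")

# Decoding the text gives back the original bytes.
decoded = base64.b64decode(encoded)
```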

If you have bytes that are meaningful text for some application
(say, a wire protocol), it is typically ASCII-Text. No protocol
I know of uses non-ASCII characters for protocol information.

Of course, you need a way to get .encode output as bytes somehow,
both in 2.5, and in Py3k. I suggest writing

  bytes(s.encode(encoding))

In 2.5, bytes() can be constructed from strings, and will do a
conversion; in Py3k, .encode will already return a bytes object, so
this will be a no-op.

Regards,
Martin


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
Phillip J. Eby wrote:
> I was just pointing out that since byte strings are bytes by definition, 
> then simply putting those bytes in a bytes() object doesn't alter the 
> existing encoding.  So, using latin-1 when converting a string to bytes 
> actually seems like the One Obvious Way to do it.

This is a misconception. In Python 2.x, the type str already *is* a
bytes type. So if S is an instance of 2.x str, bytes(S) does not need
to do any conversion. You don't need to assume it is latin-1: it's
already bytes.

> In fact, the 'encoding' argument seems useless in the case of str objects, 
> and it seems it should default to latin-1 for unicode objects.

I agree with the former, but not with the latter. There shouldn't be a
conversion of Unicode objects to bytes at all. If you want bytes from
a Unicode string U, write

  bytes(U.encode(encoding))

Regards,
Martin


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
Guido van Rossum wrote:
>>In py3k, when the str object is eliminated, then what do you have?
>>Perhaps
>>- bytes("\x80"), you get an error, encoding is required. There is no
>>such thing as "default encoding" anymore, as there's no str object.
>>- bytes("\x80", encoding="latin-1"), you get a bytestring with a
>>single byte of value 0x80.
> 
> 
> Yes to both again.

Please reconsider, and don't give bytes() an encoding= argument.
It doesn't need one. In Python 3, people should write

  "\x80".encode("latin-1")

if they absolutely want to, although they'd better write

  bytes([0x80])

Now, the first form isn't valid in 2.5, but

  bytes(u"\x80".encode("latin-1"))

could work in all versions.
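
On a Python 3 interpreter (3.3 or later, assumed here because of the u
prefix) the three spellings can be compared directly:

```python
from_encode = "\x80".encode("latin-1")
from_ints = bytes([0x80])

# Wrapping already-encoded data in bytes() changes nothing.
portable = bytes(u"\x80".encode("latin-1"))
```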

Regards,
Martin


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Adam Olsen
On 2/13/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> M.-A. Lemburg wrote:
> > We're talking about Py3k here: "abc" will be a Unicode string,
> > so why restrict the conversion to 7 bits when you can have 8 bits
> > without any conversion problems ?
>
> YAGNI. If you have a need for a byte string in source code, it will
> typically be "random" bytes, which can be nicely used through
>
>   bytes([0x73, 0x9f, 0x44, 0xd2, 0xfb, 0x49, 0xa3, 0x14,  0x8b, 0xee])
>
> For larger blocks, people should use base64.string_to_bytes (which
> can become a synonym for base64.decodestring in Py3k).
>
> If you have bytes that are meaningful text for some application
> (say, a wire protocol), it is typically ASCII-Text. No protocol
> I know of uses non-ASCII characters for protocol information.

What would that imply for repr()?  To support eval(repr(x)) it would
have to produce whatever format the source code includes to begin
with.

If I understand correctly there are three main candidates:
1. Direct copying to str in 2.x, pretending it's latin-1 in unicode in 3.x
2. Direct copying to str/unicode if it's only ascii values, switching
to a list of hex literals if there's any non-ascii values
3. b"foo" literal with ascii for all ascii characters (other than \
and "), \xFF for individual characters that aren't ascii

Given the choice I prefer the third option, with the second option as
my runner up.  The first option just screams "silent errors" to me.
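
The third option is close to what Python 3 later adopted for repr() of
bytes (with single quotes rather than double quotes). A quick check that
such a repr round-trips through eval():

```python
x = bytes([0x66, 0x6f, 0x6f, 0xff])   # "foo" followed by a non-ASCII byte

text = repr(x)          # ASCII for printable bytes, \xNN escapes otherwise
restored = eval(text)   # the repr is valid source for the same value

print(text)
```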


--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Baffled by PyArg_ParseTupleAndKeywords modification

2006-02-13 Thread Martin v. Löwis
M.-A. Lemburg wrote:
>>It's the consequences:  nobody complains about tacking "const" on to a
>>former honest-to-God "char *" argument that was in fact not modified,
>>because that's not only helpful for C++ programmers, it's _harmless_
>>for all programmers.  For example, nobody could sanely object (and
>>nobody did :-)) to adding const to the attribute-name argument in
>>PyObject_SetAttrString().  Sticking to that creates no new problems
>>for anyone, so that's as far as I ever went.
> 
> 
> Well, it broke my C extensions... I now have this in my code:
> 
> /* The keyword array changed to const char* in Python 2.5 */
> #if PY_VERSION_HEX >= 0x0205
> # define Py_KEYWORDS_STRING_TYPE const char
> #else
> # define Py_KEYWORDS_STRING_TYPE char
> #endif
> ...
> static Py_KEYWORDS_STRING_TYPE *kwslist[] = {"yada", NULL};
> ...

You did not read Tim's message carefully enough. He wasn't talking
about PyArg_ParseTupleAndKeywords *at all*. He only talked about
changing char* arguments to const char*, e.g. in
PyObject_SetAttrString. Did that break your C extensions also?

Regards,
Martin


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread James Y Knight
On Feb 14, 2006, at 12:20 AM, Phillip J. Eby wrote:
>  bytes(map(ord, str_or_unicode))
>
> In other words, without an encoding, bytes() should simply treat  
> str and
> unicode objects *as if they were a sequence of integers*, and  
> produce an
> error when an integer is out of range.  This is a logical and  
> consistent
> interpretation in the absence of an encoding, because in that case you
> don't care about the encoding - it's just raw data.


If you're talking about "raw data", then make bytes(unicodestring)  
produce what buffer(unicodestring) currently does -- something  
completely and utterly worthless. :) [it depends on how you compiled  
python and what endianness your system has.]
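
The dependence on build and byte order can be imitated with explicit
encodings. The codec names below are assumptions standing in for the
interpreter's internal representation (2-byte or 4-byte units, little or
big endian):

```python
text = "a"

# The same one-character string as "raw memory" under four layouts.
layouts = {
    name: text.encode(name)
    for name in ("utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")
}

for name, raw in layouts.items():
    print(name, list(raw))
```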

There really is no case where you don't care about the  
encoding...there is always a specific desired output encoding, and  
you have to think about what encoding that is. The argument that  
latin-1 is a sensible default just because you can convert to latin-1  
by chopping off the upper 3 bytes of a unicode character's ordinal  
position is not convincing; you're still doing an encoding operation,  
it just happens to be computationally easy. That Jython programs have  
to pretend that unicode strings are an appropriate way to store  
bytes, and thus often have to do fake "latin-1" conversions which are  
really no such thing, doesn't make a convincing argument either.  
Using unicode strings to store bytes read from or written to a socket  
is really just broken.
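
A short sketch of the point that latin-1 is still an encoding choice:
the same character becomes different bytes under a different codec, and
latin-1 merely happens to match the ordinal. Python 3 syntax is assumed:

```python
text = "\xe9"   # LATIN SMALL LETTER E WITH ACUTE

latin1 = text.encode("latin-1")   # one byte, equal to the ordinal
utf8 = text.encode("utf-8")       # two bytes

same_as_ordinal = latin1 == bytes([ord(text)])
```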

Actually having any default encoding at all is IMO a poor idea, but  
as python has one at the moment (ascii), might as well keep using it  
for consistency until it's eliminated
(sys.setdefaultencoding('undefined') is my friend.)

James


Re: [Python-Dev] bdist_* to stdlib?

2006-02-13 Thread Martin v. Löwis
Guido van Rossum wrote:
> In private email, Phillip Eby suggested to add these things to the
> 2.5. standard library:
> 
> bdist_deb, bdist_msi, and friends
[...]
> I guess bdist_egg should also be added if we support setuptools (not
> setuplib as I mistakenly called it previously)? 

I'm in favour of that (and not only because I wrote bdist_msi :-).
I think distutils should support all native package formats we can
get code for.

I'm actually opposed to bdist_egg, from a conceptual point of view.
I think it is wrong if Python creates its own packaging format
(just as it was wrong that Java created jar files - but they are
without deployment procedures even today). The burden should be
on the developers' side, for creating packages for the various systems,
not on the users' side, when each piece of software comes with its own
deployment infrastructure.

OTOH, users are fond of eggs, for reasons that I haven't yet
understood.

From a release management point of view, I would still like to
make another bdist_msi release before contributing it to Python.

Regards,
Martin


Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

2006-02-13 Thread Martin v. Löwis
Adam Olsen wrote:
> What would that imply for repr()?  To support eval(repr(x))

I don't think eval(repr(x)) needs to be supported for the bytes
type. However, if that is desirable, it should return something
like

  bytes([1,2,3])
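
A sketch of a repr in that style, assuming a bytes type that can be
built from a list of integers (as in Python 3). The helper name
list_style_repr is hypothetical:

```python
def list_style_repr(b):
    """Render bytes as a bytes([...]) constructor call (hypothetical helper)."""
    return "bytes(%r)" % (list(b),)

text = list_style_repr(bytes([1, 2, 3]))

# The text is valid source code that rebuilds the same value.
restored = eval(text)
```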

Regards,
Martin


Re: [Python-Dev] bdist_* to stdlib?

2006-02-13 Thread Thomas Wouters
On Mon, Feb 13, 2006 at 04:04:26PM -0800, Guido van Rossum wrote:
> In private email, Phillip Eby suggested to add these things to the
> 2.5 standard library:
> 
> bdist_deb, bdist_msi, and friends

FWIW, I've been using a patched distutils with bdist_deb, and it's worked
fine for the most part. The only issue I had was with a setuptools package
(rather than distutils), which I'm sure can be worked out. (Not that I'm
particularly convinced setuptools is the right approach for a .deb, but I
haven't really seen the point of setuptools anyway ;)

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!