date:20140116

Re: [Python-Dev] PEP 460 reboot

2014-01-16 Thread Cameron Simpson

On 14Jan2014 20:23, Antoine Pitrou  wrote:
> On Tue, 14 Jan 2014 10:52:05 -0800
> Guido van Rossum  wrote:
> > Quite a few people have spoken out in favor of loud
> > failures rather than silent "wrong" output. But I think that in the
> > specific context of formatting output, there is a long and IMO good
> > tradition of producing (slightly) wrong output in favor of more
> > strict behavior. Consider for example what to do when a number
> > doesn't fit in the given width. Would you rather raise an exception,
> > truncate the
> > value, or mess up the formatting? All languages newer than Fortran
> > that I've used have chosen the latter, and I still agree it's a good
> > idea.
> 
> Well that's useful when printing out human-readable stuff on stdout,
> much less when you're emitting binary data that's supposed to conform
> to a well-defined protocol. I expect bytes formatting to be used for
> the latter, not the former.

I'm 12 hours behind in this thread still, but I'm with Antoine here.

With protocols, there's a long and IMO good tradition in the RFC
world of being generous in what you accept and conservative in what
you send, and writing bytes data constitutes "send" to my mind.

While having numbers overflow their widths is (only) often ok for
human reports, even that is a PITA for machine parsing later.

By way of a text example, my personal bugbear is the UNIX "ps" command
in its many flavours. It has fixed width columns with fields that
frequently overflow these days, and the overflowing numbers abut
each other. Post processing this rubbish is a disaster (I don't
want to write "ps", but I have written things that want to read its
output).

Of course the fix is easy in some ways, use format strings saying
"%-5d %-5d %-5d" instead of "%-6d%-6d%-6d". But the authors of ps
didn't. And quietly overflowing these fields is exactly what breaks
my post processing programs.
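To make the failure mode concrete, here is a small Python sketch comparing the two format strings quoted above. The sample row and the variable names are made up for illustration; the first value is deliberately wider than its column.

```python
# Hypothetical sample row: the first value needs 7 columns.
row = (1234567, 89, 5)

packed = "%-6d%-6d%-6d" % row      # fixed-width fields, no separator
spaced = "%-5d %-5d %-5d" % row    # narrower fields, explicit separator

# Whitespace splitting is what a naive post-processor would do.
packed_fields = packed.split()
spaced_fields = spaced.split()

print(repr(packed), packed_fields)
print(repr(spaced), spaced_fields)
```

With the packed format the overflowing first field runs straight into the second one, so a whitespace split recovers two fields instead of three. With the explicit separator the columns no longer line up, but all three values survive.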

Morally, this is the same as mojibake.

Therefore I am firmly in the "fail loudly" camp: if the format
string doesn't behave as you naively expected it to, find out early
while you can easily fix it.
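As a rough illustration of what "fail loudly" could look like for a fixed-width field, here is a minimal sketch. The helper name `strict_field` is hypothetical and not part of any proposal in this thread.

```python
def strict_field(value, width):
    """Left-justify an integer in exactly `width` columns, or raise."""
    text = "%-*d" % (width, value)
    if len(text) > width:
        # Refuse to emit a field that would spill into its neighbour.
        raise ValueError("%r does not fit in %d columns" % (value, width))
    return text


print(repr(strict_field(42, 6)))
```

The point of the design is that the overflow is reported where the data is produced, not discovered later by whatever parses the output.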

Cheers,
-- 
Cameron Simpson 

Motorcycles are like peanuts... who can stop at just one?
- Zebee Johnstone  aus.motorcycles Poser Permit #1
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Argument Clinic Derby update

2014-01-16 Thread Larry Hastings




The Derby is moving forward, and I have maybe a half-dozen contributors 
so far.  They've made a small dent in making the conversion but I'd have 
to say it's going slowly.  We could use more people contributing patches!


To the contributors with patches that are stalled in the tracker: Sorry, 
but there are only so many hours in the day.  I really am spending all 
day on this, every day, but I've also been adding new features in 
response to user requests and that's eaten a lot of my time.  I'm 
getting to the patches in the order they arrived in my mailbox.  Please 
sit tight, I'll get to yours, and I really do appreciate your 
contribution.  If you have more time to contribute, please consider 
writing more patches.  It really will help!


And I guess I could use some volunteers to review patches, too!


//arry/

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Serhiy Storchaka


16.01.14 07:55, Larry Hastings wrote:

> * itertools.repeat deliberately makes it impossible to provide an
>   argument for "times" that behaves the same as not providing the
>   "times" argument, and
> * there is currently no way to implement this behavior using Argument
>   Clinic.  (I'd have to add a hack where impl functions also get args
>   and kwargs.)


/*[clinic input]
itertools.times
object: object
[
times: int
]
[clinic start generated code]*/


> Are you suggesting that, when converting builtins to Argument Clinic
> with unrepresentable default values, we're permitted to tweak the
> defaults to something representable?


I think we need some standard placeholder for an unrepresentable default
value. Maybe "..." or "?"?




Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Serhiy Storchaka


16.01.14 08:05, Guido van Rossum wrote:

> In this specific case it's clear to me that the special-casing of
> negative count is intentional -- presumably it emulates sequence
> repetition, where e.g. 'a'*-1 == ''.


In this specific case it's contrary to sequence repetition. Because 
repeat('a', -1) repeats 'a' forever. This is a point of Vajrasky's issue 
[1].



> But not accepting None is laziness -- accepting either a number or
> None requires much more effort, if you need to have the number as a C
> integer. I'm not sure how AC could make this any easier, unless you
> want to special-case maxint or -maxint-1.


getattr(foo, 'bar', None) is not the same as getattr(foo, 'bar'). So 
None can't be used as universal default value.
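A short sketch of why None cannot stand in for "argument not given". The sentinel `_MISSING`, the function `lookup`, and the class `Foo` are hypothetical names used only for this illustration; the pattern mirrors how the three-argument form of getattr behaves.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default supplied"


def lookup(obj, name, default=_MISSING):
    """Behave like getattr(): raise unless a default was really passed."""
    try:
        return getattr(obj, name)
    except AttributeError:
        if default is _MISSING:
            raise
        return default


class Foo:
    pass


foo = Foo()
print(lookup(foo, 'bar', None))   # None is a legitimate default value
try:
    lookup(foo, 'bar')            # no default at all: the error propagates
except AttributeError as exc:
    print("raised:", exc)
```

Because None is itself a value a caller may want back, the "not passed" state needs a marker that no caller can supply by accident.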



[1] http://bugs.python.org/issue19145



Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Terry Reedy

On 1/16/2014 3:31 AM, Serhiy Storchaka wrote:

> 16.01.14 08:05, Guido van Rossum wrote:
>
>> In this specific case it's clear to me that the special-casing of
>> negative count is intentional -- presumably it emulates sequence
>> repetition, where e.g. 'a'*-1 == ''.
>
> In this specific case it's contrary to sequence repetition. Because
> repeat('a', -1) repeats 'a' forever.

'Forever' only when the keyword is used and the value is -1.
In 3.4b2

>>> itertools.repeat('a', -1)
repeat('a', 0)
>>> itertools.repeat('a', times=-1)
repeat('a')
>>> itertools.repeat('a', times=-2)
repeat('a', -2)

> This is a point of Vajrasky's issue [1].

The first line is correct in both behavior and representation.
The second line behavior (and corresponding repr) are wrong.
The third line repr is wrong but the behavior is like the first.
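For readers who want to check their own interpreter, here is a small sketch that compares sequence repetition with both calling conventions. The keyword call is wrapped in islice so it terminates even on a build that still has the behaviour shown above for 3.4b2; the variable names are only for illustration.

```python
import itertools

# Sequence repetition treats a negative count as zero.
empty_str = 'a' * -1
empty_list = [0] * -1

# Positional negative count: an empty iterator, matching the first line above.
positional = list(itertools.repeat('a', -1))

# Keyword negative count: capped at 3 items in case it repeats forever.
keyword = list(itertools.islice(itertools.repeat('a', times=-1), 3))

print(repr(empty_str), empty_list, positional, keyword)
```

If the two calling conventions agree, `positional` and `keyword` are both empty lists; three items in `keyword` would indicate the keyword special case discussed in issue 19145.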

[1] http://bugs.python.org/issue19145

--
Terry Jan Reedy


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Nick Coghlan

On 16 Jan 2014 17:53, "Ethan Furman"  wrote:
>
> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>>
>>
>> This is why I have argued that if you specify it as "if there is a
format spec specified, then the return value from
>> calling __format__() will have str.decode('ascii', 'strict') called on
it" you get the support for the various
>> number-specific format specs for free.
>
>
> It may work like this under the hood, but it's an implementation detail.
 Since the numeric format codes will call int, index, or float on the
object (to handle subclasses), we could then call __format__ on the
resulting int or float to do the heavy lifting; but since __format__ on
anything else would never be called I don't want to give that impression.

I have a different proposal: let's *just* add mod formatting to bytes, and
leave the extensible formatting system as a text only operation.

We don't really care if bytes supports that method for version
compatibility purposes, and the deliberate flexibility of the design makes
it hard to translate into the binary domain.

So let's just not provide that - let's accept that, for the binary domain,
printf style formatting is just a better fit for the job :)
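To give a feel for the printf-style use case, here is a minimal sketch of building wire-format lines with % on bytes. It assumes an interpreter where bytes supports % (Python 3.5 and later, where PEP 461 landed); the helper names `status_line` and `content_length` are hypothetical.

```python
def status_line(code, reason):
    """Build an HTTP-style status line; `reason` must already be bytes."""
    return b"HTTP/1.1 %d %s\r\n" % (code, reason)


def content_length(body):
    """Numeric codes such as %d produce ASCII digits in the byte stream."""
    return b"Content-Length: %d\r\n" % len(body)


print(status_line(200, b"OK") + content_length(b"hello"))
```

Note that text has to be encoded explicitly before interpolation: passing a str to %s in a bytes format is rejected with a TypeError.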

Cheers,
Nick.

>
>
>> It also means if you pass in a string that you just want the strict
ASCII bytes
>> of then you can get it with {:s}.
>
>
> This isn't going to happen.  If the user wants a string to be in the byte
stream, it has to either be a bytes literal or explicitly encoded [1].
>
> --
> ~Ethan~
>
> [1] Apologies if this has already been answered.  I wanted to make sure I
responded to all the ideas/objects, and I may have responded more than once
to some.  It's been a long few threads.  ;)
>

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Greg Ewing


Nick Coghlan wrote:


> I have a different proposal: let's *just* add mod formatting to bytes,
> and leave the extensible formatting system as a text only operation.


+1

--
Greg

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Nick Coghlan

On 16 Jan 2014 11:45, "Carl Meyer"  wrote:
>
> Hi Ethan,
>
> I haven't chimed into this discussion, but the direction it's headed
> recently seems right to me. Thanks for putting together a PEP. Some
> comments on it:
>
> On 01/15/2014 05:13 PM, Ethan Furman wrote:
> > 
> > Abstract
> > ========
> >
> > This PEP proposes adding the % and {} formatting operations from str to
> > bytes [1].
>
> I think the PEP could really use a rationale section summarizing _why_
> these formatting operations are being added to bytes; namely that they
> are useful when working with various ASCIIish-but-not-properly-text
> network protocols and file formats, and in particular when porting code
> dealing with such formats/protocols from Python 2.
>
> Also I think it would be useful to have a section summarizing the
> primary objections that have been raised, and why those objections have
> been overruled (presuming the PEP is accepted). For instance: the main
> objection, AIUI, has been that the bytes type is for pure bytes-handling
> with no assumptions about encoding, and thus we should not add features
> to it that assume ASCIIness, and that may be attractive nuisances for
> people writing bytes-handling code that should not assume ASCIIness but
> will once they use the feature.

Close, but not quite - the concern was that this was a feature that didn't
*inherently* imply a restriction to ASCII compatible data, but only did so
when the numeric formatting codes were used. This made it a source of value
dependent compatibility errors based on the format string, akin to the kind
of value dependent errors seen when implicitly encoding arbitrary text as
ASCII.
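A tiny sketch of the kind of value dependent error meant here. The function name `to_wire` is hypothetical; it stands in for any code path that encodes arbitrary text as ASCII without the caller having asked for it.

```python
def to_wire(text):
    """Encode text as ASCII; succeeds or fails depending on the data."""
    return text.encode('ascii', 'strict')


print(to_wire('abc'))            # fine while the data happens to be ASCII
try:
    to_wire('caf\u00e9')         # same code, different value: now it raises
except UnicodeEncodeError as exc:
    print("raised:", exc.reason)
```

The code is identical in both calls; only the data differs, which is why such defects tend to stay latent until non-ASCII input shows up.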

Guido's successful counter was to point out that the parsing of the format
string itself assumes ASCII compatible data, thus placing at least the
mod-formatting operation in the same category as the currently existing
valid-for-sufficiently-ASCII-compatible-data only operations.

Current discussions suggest to me that the argument against implicit
encoding operations that introduce latent data driven defects may still
apply to bytes.format though, so I've reverted to being -1 on that.

Cheers,
Nick.

>And the refutation: that the bytes type
> already provides some operations that assume ASCIIness, and these new
> formatting features are no more of an attractive nuisance than those;
> since the syntax of the formatting mini-languages themselves
> assumes ASCIIness, there is not likely to be any temptation to use it
> with binary data that cannot.
>
> Although it can be hard to arrive at accurate and agreed-on summaries of
> the discussion, recording such summaries in the PEP is important; it may
> help save our future selves and colleagues from having to revisit all
> these same discussions and megathreads.
>
> > Overriding Principles
> > =====================
> >
> > In order to avoid the problems of auto-conversion and value-generated
> > exceptions,
> > all object checking will be done via isinstance, not by values contained
> > in a
> > Unicode representation.  In other words::
> >
> >   - duck-typing to allow/reject entry into a byte-stream
> >   - no value generated errors
>
> This seems self-contradictory; "isinstance" is type-checking, which is
> the opposite of duck-typing. A duck-typing implementation would not use
> isinstance, it would call / check for the existence of a certain magic
> method instead.
>
> I think it might also be good to expand (very) slightly on what "the
> problems of auto-conversion and value-generated exceptions" are; that
> is, that the benefit of Python 3's model is that encoding is explicit,
> not implicit, making it harder to unwittingly write code that works as
> long as all data is ASCII, but fails as soon as someone feeds in
> non-ASCII text data.
>
> Not everyone who reads this PEP will be steeped in years of discussion
> about the relative merits of the Python 2 vs 3 models; it doesn't hurt
> to spell out a few assumptions.
>
>
> > Proposed semantics for bytes formatting
> > =======================================
> >
> > %-interpolation
> > ---------------
> >
> > All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
> > will be supported, and will work as they do for str, including the
> > padding, justification and other related modifiers, except locale.
> >
> > Example::
> >
> >   >>> b'%4x' % 10
> >   b'   a'
> >
> > %c will insert a single byte, either from an int in range(256), or from
> > a bytes argument of length 1.
> >
> > Example:
> >
> > >>> b'%c' % 48
> > b'0'
> >
> > >>> b'%c' % b'a'
> > b'a'
> >
> > %s is restricted in what it will accept::
> >
> >   - input type supports Py_buffer?
> > use it to collect the necessary bytes
> >
> >   - input type is something else?
> > use its __bytes__ method; if there isn't one, raise an exception [2]
> >
> > Examples:
> >
> > >>> b'%s' % b'abc'
> > b'abc'
> >
> > >>> b'%s' % 3.14
> > Trac

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Antoine Pitrou

On Wed, 15 Jan 2014 21:55:46 -0800
Larry Hastings  wrote:
> 
> Passing in "None" here is inconvenient as it's an integer argument.

Inconvenient for whom? The callee or the caller?

Regards

Antoine.



Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Antoine Pitrou

On Thu, 16 Jan 2014 04:42:43 -0500
Terry Reedy  wrote:

> On 1/16/2014 3:31 AM, Serhiy Storchaka wrote:
> > 16.01.14 08:05, Guido van Rossum wrote:
> >> In this specific case it's clear to me that the special-casing of
> >> negative count is intentional -- presumably it emulates sequence
> >> repetition, where e.g. 'a'*-1 == ''.
> >
> > In this specific case it's contrary to sequence repetition. Because
> > repeat('a', -1) repeats 'a' forever.
> 
> 'Forever' only when the keyword is used and the value is -1.
> In 3.4b2
> 
>  >>> itertools.repeat('a', -1)
> repeat('a', 0)
>  >>> itertools.repeat('a', times=-1)
> repeat('a')
>  >>> itertools.repeat('a', times=-2)
> repeat('a', -2)

Looks like a horrible bug to me. Passing an argument by position should
mean the same as passing it by keyword!

Regards

Antoine.



Re: [Python-Dev] cpython: asyncio: Fix CoroWrapper (fix my previous commit)

2014-01-16 Thread Antoine Pitrou

On Thu, 16 Jan 2014 01:55:43 +0100 (CET)
victor.stinner  wrote:
> http://hg.python.org/cpython/rev/f07161c4f3aa
> changeset:   88494:f07161c4f3aa
> user:        Victor Stinner
> date:        Thu Jan 16 01:55:29 2014 +0100
> summary:
>   asyncio: Fix CoroWrapper (fix my previous commit)
> 
> Add __name__ and __doc__ to __slots__
> 
> files:
>   Lib/asyncio/tasks.py |  4 +---
>   1 files changed, 1 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/Lib/asyncio/tasks.py b/Lib/asyncio/tasks.py
> --- a/Lib/asyncio/tasks.py
> +++ b/Lib/asyncio/tasks.py
> @@ -32,9 +32,7 @@
>  
>  
>  class CoroWrapper:
> -    """Wrapper for coroutine in _DEBUG mode."""
> -
> -    __slots__ = ['gen', 'func']
> +    __slots__ = ['gen', 'func', '__name__', '__doc__']
>  

Why did you remove the docstring?

Regards

Antoine.



Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Georg Brandl

On 16.01.2014 12:39, Antoine Pitrou wrote:
> On Thu, 16 Jan 2014 04:42:43 -0500
> Terry Reedy  wrote:
> 
>> On 1/16/2014 3:31 AM, Serhiy Storchaka wrote:
>> > 16.01.14 08:05, Guido van Rossum wrote:
>> >> In this specific case it's clear to me that the special-casing of
>> >> negative count is intentional -- presumably it emulates sequence
>> >> repetition, where e.g. 'a'*-1 == ''.
>> >
>> > In this specific case it's contrary to sequence repetition. Because
>> > repeat('a', -1) repeats 'a' forever.
>> 
>> 'Forever' only when the keyword is used and the value is -1.
>> In 3.4b2
>> 
>>  >>> itertools.repeat('a', -1)
>> repeat('a', 0)
>>  >>> itertools.repeat('a', times=-1)
>> repeat('a')
>>  >>> itertools.repeat('a', times=-2)
>> repeat('a', -2)
> 
> Looks like a horrible bug to me. Passing an argument by position should
> mean the same as passing it by keyword!

Indeed, that should be fixed regardless of AC.

Georg


Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread MRAB


On 2014-01-16 05:32, Larry Hastings wrote:
[snip]


In the specific case of SHA1_new's "string" parameter, we could lie and
claim that the default value is b''.  Internally we could still use NULL
as a default and get away with it.  But this is only a happy
coincidence.  Many (most?) functions like this won't have a clever
Python value we can trick you with.
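The "happy coincidence" can be checked from Python. This is a small sketch using the public hashlib API instead of the C-level SHA1_new; the variable names are illustrative only.

```python
import hashlib

# Omitting the initial data and passing an empty bytes object give
# the same digest, which is why b'' would work as an advertised default.
no_argument = hashlib.sha1().hexdigest()
empty_bytes = hashlib.sha1(b'').hexdigest()

print(no_argument == empty_bytes, no_argument)
```

This only works because hashing zero bytes is a no-op; a function whose C default has no equivalent Python value gets no such shortcut.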

What else could we do?  We could add a special value, let's call it
sys.NULL, whose specific semantics are "turns into NULL when passed into
builtins".  This would solve the problem but it's really, really awful.


Would it be better if it were called "__null__"?


The only other option I can see: don't convert SHA1_new() to use
Argument Clinic, and don't provide introspection information for it either.

Can you, gentle reader, suggest a better option?


//arry/

p.s. Ryan's function signatures above suggest that he's converting code
from using PyArg_ParseTuple into using PyArg_ParseTupleAndKeywords.  I
don't think he's *actually* doing that, and if I saw that in patches
submitted to me I would ask that it be fixed.




Re: [Python-Dev] Common subset of python 2 and python 3

2014-01-16 Thread Markus Unterwaditzer

On Wed, Jan 15, 2014 at 01:22:44PM +0100, "Martin v. Löwis" wrote:
> On 12.01.14 18:39, Nachshon David Armon wrote:
> >>> I propose that this new version of python use the python 3 unicode model.
> >>> As the version of python will be fully compatible with both python 2 and
> >>> with python 3 but NOT necessarily with all existing code in either. It is
> >>> designed as a porting tool only.
> 
> I don't think that it is possible to write an interpreter that is fully
> compatible for all it accepts. Would you think that the program
> 
> print(repr(2**80).endswith("L"))
> 
> is in the subset that should be supported by both Python 2 and Python 3?

IMO Python 2 and 3 do have this part in common when you talk about valid syntax
and available methods and functions, but not in terms of behavior. I think a
new proposed Python version should simply crash on your example.

I'm kind-of playing devil's advocate here because I agree with previous posters
that such a Python version is unnecessary given tox and "python2 -3".

> 
> Notice that it prints "True" in Python 2 and "False" in Python 3. So if
> this common-version interpreter *rejects* the above program, which
> operation (**, repr, endswith) would you want to ban from subset?

Warnings about using certain string methods on repr() might be a neat thing to
add to "python -3" or static analysis tools.
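For reference, here is a sketch of Martin's example next to a spelling that does not depend on the long-integer repr suffix. The variable names are illustrative; the claim about Python 2 printing a trailing "L" comes from Martin's message.

```python
import sys

value_repr = repr(2 ** 80)
ends_with_L = value_repr.endswith("L")   # True on Python 2, False on Python 3

# str() gives plain digits without a type suffix, so code that only
# needs the digits does not have to inspect repr() at all.
digits = str(2 ** 80)

print(sys.version_info[0], ends_with_L, digits)
```

This is the sort of pattern a warning could point at: the result of a string method on repr() output changes between major versions, while the str() form does not.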

> 
> Regards,
> Martin
> 

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Brett Cannon

On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman  wrote:

> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>
>>
>> This is why I have argued that if you specify it as "if there is a format
>> spec specified, then the return value from
>> calling __format__() will have str.decode('ascii', 'strict') called on
>> it" you get the support for the various
>> number-specific format specs for free.
>>
>
> It may work like this under the hood, but it's an implementation detail.


I'm arguing it's not an implementation detail but a definition of how
bytes.format() would work.


>  Since the numeric format codes will call int, index, or float on the
> object (to handle subclasses),


But that's **only** because the numeric types choose to as part of their
__format__() implementation; it is not inherent to str.format().


> we could then call __format__ on the resulting int or float to do the
> heavy lifting;


It's not just the heavy lifting; it does **all** the lifting for format
specifications.


> but since __format__ on anything else would never be called I don't want
> to give that impression.
>
>
Fine, if you're worried about bytes.format() overstepping by implicitly
calling str.encode() on the return value of __format__() then you will need
__bytes__format__() to get equivalent support.
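A minimal sketch of the definition argued for above: format the value with its ordinary __format__, then strictly encode the resulting text as ASCII. The function name `format_field_as_bytes` is hypothetical, and this only illustrates the idea for a single replacement field; it is not an implementation of bytes.format().

```python
def format_field_as_bytes(value, format_spec):
    """Delegate to __format__, then strictly encode the text as ASCII."""
    text = format(value, format_spec)      # calls type(value).__format__
    return text.encode('ascii', 'strict')  # non-ASCII output raises


print(format_field_as_bytes(255, '04x'))
print(format_field_as_bytes(3.14159, '.2f'))
```

All numeric format specs come for free this way, at the cost of a value dependent UnicodeEncodeError whenever some __format__ returns non-ASCII text.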

-Brett


>
>  It also means if you pass in a string that you just want the strict ASCII
>> bytes
>> of then you can get it with {:s}.
>>
>
> This isn't going to happen.  If the user wants a string to be in the byte
> stream, it has to either be a bytes literal or explicitly encoded [1].
>
> --
> ~Ethan~
>
> [1] Apologies if this has already been answered.  I wanted to make sure I
> responded to all the ideas/objects, and I may have responded more than once
> to some.  It's been a long few threads.  ;)
>

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Brett Cannon

On Thu, Jan 16, 2014 at 4:56 AM, Nick Coghlan  wrote:

>
> On 16 Jan 2014 17:53, "Ethan Furman"  wrote:
> >
> > On 01/15/2014 06:45 AM, Brett Cannon wrote:
> >>
> >>
> >> This is why I have argued that if you specify it as "if there is a
> format spec specified, then the return value from
> >> calling __format__() will have str.decode('ascii', 'strict') called on
> it" you get the support for the various
> >> number-specific format specs for free.
> >
> >
> > It may work like this under the hood, but it's an implementation detail.
>  Since the numeric format codes will call int, index, or float on the
> object (to handle subclasses), we could then call __format__ on the
> resulting int or float to do the heavy lifting; but since __format__ on
> anything else would never be called I don't want to give that impression.
>
> I have a different proposal: let's *just* add mod formatting to bytes, and
> leave the extensible formatting system as a text only operation.
>
> We don't really care if bytes supports that method for version
> compatibility purposes, and the deliberate flexibility of the design makes
> it hard to translate into the binary domain.
>
> So let's just not provide that - let's accept that, for the binary domain,
> printf style formatting is just a better fit for the job :)
>

Or PEP 460 for bytes.format() and PEP 461 for %.

-Brett


> Cheers,
> Nick.
>
> >
> >
> >> It also means if you pass in a string that you just want the strict
> ASCII bytes
> >> of then you can get it with {:s}.
> >
> >
> > This isn't going to happen.  If the user wants a string to be in the
> byte stream, it has to either be a bytes literal or explicitly encoded [1].
> >
> > --
> > ~Ethan~
> >
> > [1] Apologies if this has already been answered.  I wanted to make sure
> I responded to all the ideas/objects, and I may have responded more than
> once to some.  It's been a long few threads.  ;)
> >

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Ethan Furman


On 01/16/2014 04:49 AM, Michael Urman wrote:

On Thu, Jan 16, 2014 at 1:52 AM, Ethan Furman  wrote:

Is this an intended exception to the overriding principle?



Hmm, thanks for spotting that.  Yes, that would be a value error if anything
over 255 is used, both currently in Py2, and for bytes in Py3.  As Carl
suggested, a little more explanation is needed in the PEP.


FYI, note that str/unicode already has another value-dependent
exception with %c. I find the message surprising, as I wasn't aware
Python had a 'char' type:


>>> '%c' % 'a'
'a'
>>> '%c' % 'abc'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: %c requires int or char


Python doesn't have a char type, it has str's of length 1... which are usually 
referred to as char's.  ;)
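A small sketch for trying the %c cases side by side. The helper `try_format` is hypothetical and simply reports the exception type by name; the bytes cases assume an interpreter where bytes supports % (Python 3.5 and later).

```python
def try_format(fmt, arg):
    """Return fmt % arg, or the name of the exception it raised."""
    try:
        return fmt % arg
    except (TypeError, OverflowError, ValueError) as exc:
        return type(exc).__name__


print(try_format('%c', 97))       # int is interpreted as a code point
print(try_format('%c', 'abc'))    # rejected: not a length-1 string
print(try_format(b'%c', 48))      # int in range(256) becomes one byte
print(try_format(b'%c', 256))     # out of range: a value dependent error
```

The out-of-range case is the value dependent exception under discussion: the argument has the right type, and only its value decides whether formatting succeeds.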

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Neil Schemenauer

Greg Ewing  wrote:
> Neil Schemenauer wrote:
>> Objects that implement __str__ can also implement __bytes__ if they
>> can guarantee that ASCII characters are always returned,
>
> I think __ascii__ would be a better name. I'd expect
> a method called __bytes__ on an int to return some
> version of its binary value.

I realize now we can't use __bytes__.  Currently, passing an int
to bytes() causes it to construct an object with that many null
bytes.
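The existing behaviour is easy to see in a short sketch; the variable names are illustrative only.

```python
zeros = bytes(3)                    # three null bytes, not b'3'
digits = str(3).encode('ascii')     # the ASCII digits have to be asked for
single = bytes([3])                 # one byte with the value 3

print(zeros, digits, single)
```

Since bytes(n) already has this meaning for integers, giving int a __bytes__ method that returned ASCII digits would make bytes(3) ambiguous.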

If we are going to support format() (I'm not convinced it is necessary
and could easily be added in a later version), then we need an
equivalent to __format__.  My vote is either:

    def __formatascii__(self, spec):
        ...

or

    def __ascii__(self, spec):
        ...

Previously I was thinking of __bformat__ or __formatb__ but having
ascii in the method name is a great reminder.

Objects with a natural arbitrary byte representation can implement
__bytes__ and %s should use that if it exists.
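As a sketch of that last point, here is a class with a natural byte representation being interpolated with %s. The class `Packet` and its framing bytes are made up for illustration, and the example assumes an interpreter where bytes supports % and consults __bytes__ (Python 3.5 and later).

```python
class Packet:
    """Hypothetical object with a natural byte representation."""

    def __init__(self, payload):
        self.payload = payload

    def __bytes__(self):
        # Frame the payload with STX/ETX control bytes.
        return b'\x02' + self.payload + b'\x03'


framed = b'<%s>' % Packet(b'abc')
print(framed)
```

Nothing here assumes ASCII: the object decides its own bytes, which is a different job from the ASCII-only numeric formatting that a __formatascii__-style hook would cover.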

  Neil


Re: [Python-Dev] cpython: asyncio: Fix CoroWrapper (fix my previous commit)

2014-01-16 Thread Guido van Rossum

Because somehow you can't have a slot named __doc__ *and* a docstring
in the class. Try it. (I tried to work around this but didn't get very
far.)

On Thu, Jan 16, 2014 at 3:40 AM, Antoine Pitrou  wrote:
> On Thu, 16 Jan 2014 01:55:43 +0100 (CET)
> victor.stinner  wrote:
>> http://hg.python.org/cpython/rev/f07161c4f3aa
>> changeset:   88494:f07161c4f3aa
>> user:        Victor Stinner
>> date:        Thu Jan 16 01:55:29 2014 +0100
>> summary:
>>   asyncio: Fix CoroWrapper (fix my previous commit)
>>
>> Add __name__ and __doc__ to __slots__
>>
>> files:
>>   Lib/asyncio/tasks.py |  4 +---
>>   1 files changed, 1 insertions(+), 3 deletions(-)
>>
>>
>> diff --git a/Lib/asyncio/tasks.py b/Lib/asyncio/tasks.py
>> --- a/Lib/asyncio/tasks.py
>> +++ b/Lib/asyncio/tasks.py
>> @@ -32,9 +32,7 @@
>>
>>
>>  class CoroWrapper:
>> -    """Wrapper for coroutine in _DEBUG mode."""
>> -
>> -    __slots__ = ['gen', 'func']
>> +    __slots__ = ['gen', 'func', '__name__', '__doc__']
>>
>
> Why did you remove the docstring?
>
> Regards
>
> Antoine.
>
>



-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] cpython: asyncio: Fix CoroWrapper (fix my previous commit)

2014-01-16 Thread Christian Heimes

On 16.01.2014 16:57, Guido van Rossum wrote:
> Because somehow you can't have a slot named __doc__ *and* a docstring
> in the class. Try it. (I tried to work around this but didn't get very
> far.)

That's true for all class attributes. You can't have a slot and a class
attribute at the same time. After all the __doc__ string is stored in a
class attribute, too.

>>> class Example:
...     __slots__ = ("egg",)
...     # This doesn't work
...     egg = None
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'egg' in __slots__ conflicts with class variable


>>> class Example:
...     """doc"""
...     __slots__ = ("__doc__",)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: '__doc__' in __slots__ conflicts with class variable



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Michael Urman

On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon  wrote:
> Fine, if you're worried about bytes.format() overstepping by implicitly
> calling str.encode() on the return value of __format__() then you will need
> __bytes__format__() to get equivalent support.

Could we just re-use PEP-3101's note (easily updated for Python 3):

Note for Python 2.x: The 'format_spec' argument will be either
a string object or a unicode object, depending on the type of the
original format string.  The __format__ method should test the type
of the specifiers parameter to determine whether to return a string or
unicode object.  It is the responsibility of the __format__ method
to return an object of the proper type.

If __format__ receives a format_spec of type bytes, it should return
bytes. For such cases on objects that cannot support bytes (i.e. for
str), it can raise. This appears to avoid the need for additional
methods. (As does Nick's proposal of leaving it out for now.)
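
A rough sketch of what such a type-dispatching __format__ could look like. Port and TextOnly are hypothetical classes, not anything proposed in this thread. The built-in format() currently rejects a bytes format_spec, so the method is called directly here.

```python
class Port:
    """Hypothetical value type used only for illustration."""

    def __init__(self, number):
        self.number = number

    def __format__(self, format_spec):
        if isinstance(format_spec, bytes):
            # bytes spec in, bytes out
            text = format(self.number, format_spec.decode("ascii"))
            return text.encode("ascii")
        # str spec in, str out
        return format(self.number, format_spec)


class TextOnly:
    """Hypothetical type that cannot support bytes, so it raises."""

    def __format__(self, format_spec):
        if isinstance(format_spec, bytes):
            raise TypeError("TextOnly does not support bytes formatting")
        return "text"


as_bytes = Port(80).__format__(b"05d")
as_text = format(Port(80), "x")
```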

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Brett Cannon

On Thu, Jan 16, 2014 at 11:33 AM, Michael Urman  wrote:

> On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon  wrote:
> > Fine, if you're worried about bytes.format() overstepping by implicitly
> > calling str.encode() on the return value of __format__() then you will
> need
> > __bytes__format__() to get equivalent support.
>
> Could we just re-use PEP-3101's note (easily updated for Python 3):
>
> Note for Python 2.x: The 'format_spec' argument will be either
> a string object or a unicode object, depending on the type of the
> original format string.  The __format__ method should test the type
> of the specifiers parameter to determine whether to return a string or
> unicode object.  It is the responsibility of the __format__ method
> to return an object of the proper type.
>
> If __format__ receives a format_spec of type bytes, it should return
> bytes. For such cases on objects that cannot support bytes (i.e. for
> str), it can raise. This appears to avoid the need for additional
> methods. (As does Nick's proposal of leaving it out for now.)
>

That's a very good catch, Michael! I think that makes sense if there is
precedent. Unfortunately that bit from the PEP never made it into the
documentation so I'm not sure if there is a backwards-compatibility worry.

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Neil Schemenauer

Carl Meyer  wrote:
> I think the PEP could really use a rationale section summarizing _why_
> these formatting operations are being added to bytes

I agree.  My attempt at re-writing the PEP is below.

>> In order to avoid the problems of auto-conversion and
>> value-generated exceptions, all object checking will be done via
>> isinstance, not by values contained in a Unicode representation.
>> In other words::
>> 
>>   - duck-typing to allow/reject entry into a byte-stream
>>   - no value generated errors
>
> This seems self-contradictory; "isinstance" is type-checking, which is
> the opposite of duck-typing.

Again, I agree.  We should avoid isinstance checks if possible.

Abstract

This PEP proposes adding %-interpolation to the bytes object.

Rationale

A disruptive but useful change introduced in Python 3.0 was the clean
separation of byte strings (i.e. the "bytes" object) from character
strings (i.e. the "str" object).  The benefit is that character
encodings must be explicitly specified and the risk of corrupting
character data is reduced.

Unfortunately, this separation has made writing certain types of
programs more complicated and verbose.  For example, programs that deal
with network protocols often manipulate ASCII encoded strings.  Since
the "bytes" type does not support string formatting, extra encoding and
decoding between the "str" type is required.
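
A small sketch of the round trip this paragraph refers to, as it has to be written without bytes formatting. The helper names are illustrative only.

```python
def content_length_header(n):
    # Format as text, then encode the result back to bytes.
    return ("Content-Length: %d\r\n" % n).encode("ascii")


def status_line(version, code, reason):
    # version and reason arrive as bytes, so they are decoded only to be
    # re-encoded once the formatting is done.
    text = "%s %d %s\r\n" % (
        version.decode("ascii"),
        code,
        reason.decode("ascii"),
    )
    return text.encode("ascii")
```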

For simplicity and convenience it is desirable to introduce formatting
methods to "bytes" that allow formatting of ASCII-encoded character
data.  This change would blur the clean separation of byte strings and
character strings.  However, it is felt that the practical benefits
outweigh the purity costs.  The implicit assumption of ASCII-encoding
would be limited to formatting methods.

One source of many problems with the Python 2 Unicode implementation is
the implicit coercion of Unicode character strings into byte strings
using the "ascii" codec.  If the character strings contain only ASCII
characters, all was well.  However, if the string contains a non-ASCII
character then coercion causes an exception.

The combination of implicit coercion and value dependent failures has
proven to be a recipe for hard to debug errors.  A program may seem to
work correctly when tested (e.g. string input that happened to be ASCII
only) but later would fail, often with a traceback far from the source
of the real error.  The formatting methods for bytes() should avoid this
problem by not implicitly encoding data that might fail based on the
content of the data.
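
To illustrate what "value dependent" means here, a minimal sketch: the explicit encode below stands in for the implicit Python 2 coercion, and the function name is made up.

```python
def to_wire(text):
    # Succeeds or fails depending on the characters in the value,
    # not on the type of the argument.
    return text.encode("ascii")


ok = to_wire("GET /index.html")  # ASCII-only test input: works

try:
    to_wire("GET /caf\u00e9")    # same type, different value: fails
    failure = None
except UnicodeEncodeError as exc:
    failure = exc
```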

Another desirable feature is to allow arbitrary user classes to be used
as formatting operands.  Generally this is done by introducing a special
method that can be implemented by the new class.

Proposed semantics for bytes formatting
=======================================

Special method __ascii__

A new special method, analogous to __format__, is introduced.  This
method takes a single argument, a format specifier.  The return
value is a bytes object.  Objects that have an ASCII only
representation can implement this method to allow them to be used as
format operands.  Objects with natural byte representations should
implement __bytes__ or the Py_buffer API.

%-interpolation
---------------

All the numeric formatting codes (such as %x, %o, %e, %f, %g, etc.)
will be supported, and will work as they do for str, including the
padding, justification and other related modifiers.  To avoid having to
introduce two special methods, the format specifications will be
translated to equivalent __format__ specifiers and the __ascii__ method
of each argument will be called.

Example::

   >>> b'%4x' % 10
   b'   a'

%c will insert a single byte, either from an int in range(256), or from
a bytes argument of length 1.

Example:

>>> b'%c' % 48
b'0'

>>> b'%c' % b'a'
b'a'

%s is restricted in what it will accept::

  - input type supports Py_buffer or has __bytes__?
    use it to collect the necessary bytes (may contain non-ASCII
    characters)

  - input type is something else?
    use its __ascii__ method; if there isn't one, raise TypeError

Examples:

>>> b'%s' % b'abc'
b'abc'

>>> b'%s' % 3.14
b'3.14'

>>> b'%4s' % 12
b'  12'

>>> b'%s' % 'hello world!'
Traceback (most recent call last):
...
TypeError: 'hello world!' has no __ascii__ method, perhaps you need to
encode it?
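
The %s rule above can be sketched in pure Python roughly as follows. This is only an illustration of the draft, not an implementation: `interpolate_s` and `Celsius` are made-up names, and `__ascii__` is the hypothetical special method proposed here, which no built-in type has. As a result, the int and float examples above cannot be reproduced by this sketch.

```python
def interpolate_s(value):
    """Sketch of the draft's %s dispatch; not CPython's implementation."""
    cls = type(value)
    if hasattr(cls, "__bytes__"):
        return bytes(value)
    try:
        # Stand-in for "supports Py_buffer".
        return bytes(memoryview(value))
    except TypeError:
        pass
    ascii_method = getattr(cls, "__ascii__", None)  # hypothetical method
    if ascii_method is None:
        raise TypeError(
            "%r has no __ascii__ method, perhaps you need to encode it?"
            % (value,)
        )
    result = ascii_method(value, "")
    if not isinstance(result, bytes):
        raise TypeError("__ascii__ must return bytes")
    return result


class Celsius:
    """Hypothetical user class with an ASCII-only representation."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __ascii__(self, format_spec):
        return format(self.degrees, format_spec).encode("ascii") + b" C"
```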

.. note::

   Because the str type does not have a __ascii__ method, attempts to
   directly use 'a string' as a bytes interpolation value will raise an
   exception.  To use 'string' values, they must be encoded or otherwise
   transformed into a bytes sequence::

  'a string'.encode('latin-1')

Unsupported % format codes
^^

%r (which calls __repr__) is not supported

format
------

The format() method will not be implemented at this time but may be
added in a later Python release.  The __ascii__ method is

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Ethan Furman


On 01/16/2014 06:45 AM, Brett Cannon wrote:

> On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman wrote:
>> On 01/15/2014 06:45 AM, Brett Cannon wrote:
>>> This is why I have argued that if you specify it as
>>> "if there is a format spec specified, then the return
>>> value from calling __format__() will have
>>> str.decode('ascii', 'strict') called on it" you get
>>> the support for the various number-specific format
>>> specs for free.
>>
>> Since the numeric format codes will call int, index,
>> or float on the object (to handle subclasses),
>
> But that's **only** because the numeric types choose
> to as part of their __format__() implementation; it is
> not inherent to str.format().


As I understand it, str.format will call the object's __format__.  So, for 
example, if I say:

  u'the value is: %d' % myNum(17)

then it will be myNum.__format__ that gets called, not int.__format__; this is precisely what we don't want, since we can't
know that myNum is only going to return ASCII characters.


This is why I would have bytes.__format__, as part of its parsing, call int, index, or float depending on the format 
code; so the above example would have bytes.__format__ calling int() on myNum(17), at which point we either have an int 
type or an exception was raised because myNum isn't really an integer.  Once we have an int, whose format we know and 
trust, then we can call its __format__ and proceed from there.
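
A minimal sketch of the idea, assuming a made-up int subclass (MyNum) and a made-up helper (format_int_field): str.format dispatches to the subclass's __format__, while converting to an exact int first means only int.__format__ is involved.

```python
class MyNum(int):
    """Hypothetical int subclass whose __format__ returns non-ASCII text."""

    def __format__(self, format_spec):
        return "\u2460\u2466"  # circled digits, not ASCII


def format_int_field(value, format_spec="d"):
    # int() yields an exact int, so the subclass's __format__ is bypassed
    # and the result is known to be ASCII.
    number = int(value)
    return format(number, format_spec).encode("ascii")


via_subclass = "{}".format(MyNum(17))     # uses MyNum.__format__
via_exact_int = format_int_field(MyNum(17))
```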


On the flip side, if myNum does define its own __format__, it will not be called by bytes.format, and perhaps that is
another good reason for bytes to only support %-interpolation and not format?


--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Neil Schemenauer

Michael Urman  wrote:
> If __format__ receives a format_spec of type bytes, it should return
> bytes. For such cases on objects that cannot support bytes (i.e. for
> str), it can raise. This appears to avoid the need for additional
> methods. (As does Nick's proposal of leaving it out for now.)

That's an interesting idea.  I proposed __ascii__ as an analogous
method to __format__ for bytes formatting and to have
%-interpolation use it.  However, overloading __format__ based on
the type of the argument could work.

I see with Python 3:

>>> (1).__format__(b'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not bytes

A TypeError exception is what we want if the object does not support
bytes formatting.  Some possible problems:

- It could be hard to provide a helpful exception message since it
  is generated inside the __format__ method rather than inside the
  bytes.__mod__ method (in the case of a missing __ascii__ method).
  The most common error will be using a str object and so we could
  modify the __format__ method of str to provide a nice hint (use
  encode()).

- Is there some risk that an object will unwittingly implement a
  __format__ method that unintentionally accepts a bytes argument?
  That requires some investigation.


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Michael Urman

On Thu, Jan 16, 2014 at 11:13 AM, Neil Schemenauer  wrote:
> A TypeError exception is what we want if the object does not support
> bytes formatting.  Some possible problems:
>
> - It could be hard to provide a helpful exception message since it
>   is generated inside the __format__ method rather than inside the
>   bytes.__mod__ method (in the case of a missing __ascii__ method).
>   The most common error will be using a str object and so we could
>   modify the __format__ method of str to provide a nice hint (use
>   encode()).

The various format functions could certainly intercept and wrap
exceptions raised by __format__ methods. Once the core types were
modified to expect bytes in format_spec, however, this may not be
critical; __format__ methods which delegate would work as expected,
str could certainly be clear about why it raised, and custom
implementations would be handled per comments I'll make on your second
point. Overall I suspect this is no worse than unhandled values in the
format_spec are today.

> - Is there some risk that an object will unwittingly implement a
>   __format__ method that unintentionally accepts a bytes argument?
>   That requires some investigation.

Agreed. Some quick armchair calculations suggest to me that there are
three likely outcomes:
 - Properly handle the type (perhaps written with the 2.x clause in mind)
 - Raise an exception internally (perhaps ValueError, such as from
format(3, 'q'))
 - Mishandle and return a str (perhaps due to if/else defaulting)
The first and second outcome may well reflect what we want, and the
third could easily be detected and turned into an exception by the
format functions.
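
As a sketch of how a format function might detect the third outcome, assuming a hypothetical helper (call_bytes_format) and two made-up classes:

```python
def call_bytes_format(obj, format_spec):
    """Hypothetical helper a bytes-formatting function might use."""
    if not isinstance(format_spec, bytes):
        raise TypeError("format_spec must be bytes")
    result = type(obj).__format__(obj, format_spec)
    if not isinstance(result, bytes):
        # Outcome 3: the method mishandled the spec and returned str.
        raise TypeError(
            "%s.__format__ returned %s for a bytes format_spec"
            % (type(obj).__name__, type(result).__name__)
        )
    return result


class Sloppy:
    # Returns str no matter what kind of spec it was given.
    def __format__(self, format_spec):
        return "sloppy"


class Careful:
    # Returns the same type as the spec it was given.
    def __format__(self, format_spec):
        return b"careful" if isinstance(format_spec, bytes) else "careful"
```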

I'm uncertain whether this reflects all the scenarios we would care about.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Glenn Linderman


On 1/16/2014 8:41 AM, Brett Cannon wrote:
> That's a very good catch, Michael! I think that makes sense if there
> is precedence. Unfortunately that bit from the PEP never made it into
> the documentation so I'm not sure if there is a
> backwards-compatibility worry.


No.  If __format__ is called with bytes format, and returns str, there 
would be an exception generated on the spot.


If __format__ is called with bytes format, and tries to use it as str, 
there would be an exception generated on the spot.


Prior to 3.whenever-this-is-implemented, Python 3 only provides str 
formats to __format__, right? So new code is required to pass bytes to 
__format__.

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Eric V. Smith

On 01/16/2014 11:23 AM, Ethan Furman wrote:
> On 01/16/2014 06:45 AM, Brett Cannon wrote:
>> On Thu, Jan 16, 2014 at 2:51 AM, Ethan Furman wrote:
>>> On 01/15/2014 06:45 AM, Brett Cannon wrote:

>>>> This is why I have argued that if you specify it as
>>>>  "if there is a format spec specified, then the return
>>>> value from calling __format__() will have
>>>>  str.decode('ascii', 'strict') called on it" you get
>>>> the support for the various number-specific format
>>>>  specs for free.
> 
>>> Since the numeric format codes will call int, index,
>>>  or float on the object (to handle subclasses),
>>
>> But that's **only** because the numeric types choose
>>  to as part of their __format__() implementation; it is
>> not inherent to str.format().
> 
> As I understand it, str.format will call the object's __format__.  So,
> for example, if I say:
> 
>   u'the value is: %d' % myNum(17)
> 
> then it will be myNum.__format__ that gets called, not int.__format__;
> this is precisely what we don't want, since can't know that myNum is
> only going to return ASCII characters.

"Magic" methods, including __format__, are called on the type, not the
instance.

> This is why I would have bytes.__format__, as part of its parsing, call
> int, index, or float depending on the format code; so the above example
> would have bytes.__format__ calling int() on myNum(17), at which point
> we either have an int type or an exception was raised because myNum
> isn't really an integer.  Once we have an int, whose format we know and
> trust, then we can call its __format__ and proceed from there.
> 
> On the flip side, if myNum does define it's own __format__, it will not
> be called by bytes.format, and perhaps that is another good reason for
> bytes to only support %-interpolation and not format?

For the first iteration of bytes.format(), I think we should just
support the exact types of int, float, and bytes. It will call the
type's __format__ (with the object as "self") and encode the result to
ASCII. For the stated use case of 2.x compatibility, I suspect this will
cover > 90% of the uses in real code. If we find there are cases where
real code needs additional types supported, we can consider adding
__format_ascii__ (or whatever name we cook up).
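
A rough sketch of that "exact types only" rule. format_exact is a made-up name, and the handling of format specs for bytes operands is left out as an assumption of this sketch.

```python
def format_exact(value, format_spec=""):
    """Sketch of the exact-type rule; names are illustrative only."""
    cls = type(value)
    if cls is bytes:
        if format_spec:
            raise ValueError("bytes format specs are not handled in this sketch")
        return value
    if cls is int or cls is float:
        # Call the type's __format__ with the object as "self", then
        # encode the str result to ASCII.
        return cls.__format__(value, format_spec).encode("ascii")
    # Subclasses (bool included) and everything else are rejected.
    raise TypeError("unsupported type for bytes formatting: %s" % cls.__name__)
```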

Eric.



Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Guido van Rossum

On Thu, Jan 16, 2014 at 1:42 AM, Terry Reedy  wrote:
> On 1/16/2014 3:31 AM, Serhiy Storchaka wrote:
>>
>> 16.01.14 08:05, Guido van Rossum wrote:
>>>
>>> In this specific case it's clear to me that the special-casing of
>>> negative count is intentional -- presumably it emulates sequence
>>> repetition, where e.g. 'a'*-1 == ''.
>>
>>
>> In this specific case it's contrary to sequence repetition. Because
>> repeat('a', -1) repeats 'a' forever.
>
>
> 'Forever' only when the keyword is used and the value is -1.
> In 3.4b2
>
> >>> itertools.repeat('a', -1)
> repeat('a', 0)
> >>> itertools.repeat('a', times=-1)
> repeat('a')
> >>> itertools.repeat('a', times=-2)
> repeat('a', -2)
>
>
>> This is a point of Vajrasky's issue [1].
>
> The first line is correct in both behavior and representation.
> The second line behavior (and corresponding repr) are wrong.
> The third line repr is wrong but the behavior is like the first.
>
>> [1] http://bugs.python.org/issue19145

Eew. This is much more wacko than I thought. (Serves me right for
basically not caring about itertools :-(. ) It also mostly sounds
unintended -- I can't imagine the intention was to treat the keyword
argument differently from the positional argument, but I can easily
imagine getting the logic slightly wrong.

If I had complete freedom in redefining the spec I would treat
positional and keyword the same, interpret absent or None to mean
"forever" and explicit negative integers to mean the same as zero, and
make repr show a positional integer >= 0 if the repeat isn't None.
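
A pure-Python sketch of the semantics described in that paragraph. This is an illustration only, not itertools' implementation.

```python
import itertools


class repeat:
    """Sketch: positional and keyword 'times' are treated identically."""

    def __init__(self, obj, times=None):
        self.obj = obj
        # Absent or None means "forever"; negative counts act like zero.
        self.times = None if times is None else max(0, times)

    def __iter__(self):
        return self

    def __next__(self):
        if self.times is None:
            return self.obj
        if self.times == 0:
            raise StopIteration
        self.times -= 1
        return self.obj

    def __repr__(self):
        if self.times is None:
            return "repeat(%r)" % (self.obj,)
        return "repeat(%r, %d)" % (self.obj, self.times)


first_three = list(itertools.islice(repeat("a"), 3))
```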

But I don't know if that's too much of a change.

@Antoine:
>> Passing in "None" here is inconvenient as it's an integer argument.
>
> Inconvenient for whom? The callee or the caller?

I meant for the callee -- it's slightly complex to code up right. But
IMO worth it.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Guido van Rossum

On Thu, Jan 16, 2014 at 12:31 AM, Serhiy Storchaka  wrote:
> getattr(foo, 'bar', None) is not the same as getattr(foo, 'bar'). So None
> can't be used as universal default value.

Not universal, but I still think that most functions don't need to
have such a subtle distinction.

E.g. in the case of sha1() I still believe that it's totally fine to
switch the default to b''. In that particular case I don't see the
need to also accept None as a way to specify the default.

Basically, my philosophy about this is that anytime you can't easily
reimplement the same signature in Python (without reverting to
manually parsing the args using *args and **kwds) it is a pain, and
you should think twice before canonizing such a signature.

Also, there are two somewhat different cases:

(a) The default can easily be expressed as a value of the same type
that the argument normally has. This is the sha1() case. In this case
I see no need to also accept None as an argument (unless it is
currently accepted, which it isn't for sha1()). Another example is
.read() -- here, passing in a negative integer means the same
as not passing an argument.

(b) The default has a special meaning that does something different
than any valid value. A good example is getattr(), which must forever
be special.

To me, most functions should fall in (a) even if there is currently
ambiguity, and it feels like repeat() was *meant* to be in (a).
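
The two cases can be sketched in Python as follows. get_attribute and _MISSING are made-up names, and the sha1 wrapper is only there to show the shape of the signature.

```python
import hashlib


def sha1(data=b""):
    # Case (a): the default is an ordinary value of the argument's type.
    return hashlib.sha1(data)


_MISSING = object()  # private sentinel, distinct from every real value


def get_attribute(obj, name, default=_MISSING):
    # Case (b): "no default given" has to differ from every possible
    # value, None included, so a sentinel is needed.
    try:
        return getattr(obj, name)
    except AttributeError:
        if default is _MISSING:
            raise
        return default
```

In case (b) a caller has no way to spell the default explicitly, which is what makes such signatures awkward to describe.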

I'm not sure how AC should deal with (b), but I still hope that true
examples are rare enough that we can keep hand-coding them.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] cpython: asyncio: Fix CoroWrapper (fix my previous commit)

2014-01-16 Thread Guido van Rossum

Yeah the confusing thing is that omitting the docstring fixes it --
the class still has a __doc__ attribute but apparently it comes from
the metaclass. :-)

I guess you *could* have both a class and an instance __doc__ by
making a really clever descriptor, but it seems simpler to just use a
comment instead of a docstring. :-)

I'll do this now.

On Thu, Jan 16, 2014 at 8:14 AM, Christian Heimes  wrote:
> On 16.01.2014 16:57, Guido van Rossum wrote:
>> Because somehow you can't have a slot named __doc__ *and* a docstring
>> in the class. Try it. (I tried to work around this but didn't get very
>> far.)
>
> That's true for all class attributes. You can't have a slot and a class
> attribute at the same time. After all the __doc__ string is stored in a
> class attribute, too.
>
> >>> class Example:
> ...     __slots__ = ("egg",)
> ...     # This doesn't work
> ...     egg = None
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: 'egg' in __slots__ conflicts with class variable
>
>
> >>> class Example:
> ...     """doc"""
> ...     __slots__ = ("__doc__",)
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: '__doc__' in __slots__ conflicts with class variable
>
>



-- 
--Guido van Rossum (python.org/~guido)

[Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Yury Selivanov

The whole discussion of whether clinic should write its output
right in the source file (buffered or not), or in a separate sidefile,
started because we currently cannot run the clinic during the build
process, since it’s written in python.

But what if, at some point, someone implements the Tools/clinic.py in
pure C, so that integrating it directly in the build process will be
possible?  In this case, the question is — should we use python code 
in the argument clinic DSL?

If we keep it strictly declarative, then, at least, we’ll have this
possibility in the future.

--  
Yury Selivanov


Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Ethan Furman


On 01/16/2014 10:30 AM, Eric V. Smith wrote:
> On 01/16/2014 11:23 AM, Ethan Furman wrote:
>> On 01/16/2014 06:45 AM, Brett Cannon wrote:
>>> But that's **only** because the numeric types choose
>>> to as part of their __format__() implementation; it is
>>> not inherent to str.format().
>>
>> As I understand it, str.format will call the object's __format__.  So,
>> for example, if I say:
>>
>>    u'the value is: %d' % myNum(17)
>>
>> then it will be myNum.__format__ that gets called, not int.__format__;
>> this is precisely what we don't want, since can't know that myNum is
>> only going to return ASCII characters.
>
> "Magic" methods, including __format__, are called on the type, not the
> instance.

Yes, that's why I said `myNum(17)` and not `myNum`.

>> This is why I would have bytes.__format__, as part of its parsing, call
>> int, index, or float depending on the format code; so the above example
>> would have bytes.__format__ calling int() on myNum(17), at which point
>> we either have an int type or an exception was raised because myNum
>> isn't really an integer.  Once we have an int, whose format we know and
>> trust, then we can call its __format__ and proceed from there.
>>
>> On the flip side, if myNum does define it's own __format__, it will not
>> be called by bytes.format, and perhaps that is another good reason for
>> bytes to only support %-interpolation and not format?
>
> For the first iteration of bytes.format(), I think we should just
> support the exact types of int, float, and bytes. It will call the
> type's__format__ (with the object as "self") and encode the result to
> ASCII. For the stated use case of 2.x compatibility, I suspect this will
> cover > 90% of the uses in real code. If we find there are cases where
> real code needs additional types supported, we can consider adding
> __format_ascii__ (or whatever name we cook up).


That can certainly be our fallback position if we can't decide now how we want 
to handle int and float subclasses.

--
~Ethan~

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Eric V. Smith

On 01/16/2014 01:55 PM, Ethan Furman wrote:
>> "Magic" methods, including __format__, are called on the type, not the
>> instance.
> 
> Yes, that's why I said `myNum(17)` and not `myNum`.

Oops, apologies. I misread the code.

Eric.



Re: [Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Guido van Rossum

On Thu, Jan 16, 2014 at 11:15 AM, Yury Selivanov
 wrote:
> The whole discussion of whether clinic should write its output
> right in the source file (buffered or not), or in a separate sidefile,
> started because we currently cannot run the clinic during the build
> process, since it’s written in python.

But that's why the output is checked in. It's the same with the parser
IIRC. (And yes, there's a bootstrap issue -- but that's solved by
using an older Python version.)

> But what if, at some point, someone implements the Tools/clinic.py in
> pure C, so that integrating it directly in the build process will be
> possible?  In this case, the question is — should we use python code
> in the argument clinic DSL?
>
> If we keep it strictly declarative, then, at least, we’ll have this
> possibility in the future.

Sounds like a pretty unlikely scenario. Why would you implement clinic in C?

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Tres Seaver

On 01/16/2014 12:32 AM, Larry Hastings wrote:
> We could add a special value, let's call it sys.NULL, whose specific
> semantics are "turns into NULL when passed into builtins".  This would
> solve the problem but it's really, really awful.

That doesn't smell too bad to me -- I would prefer to be able to build
up all such calls programmatically for testing purposes (e.g., to ensure
identical semantics for all code paths between a Python reference
implementation and a C extension).


Tres.
- -- 
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Larry Hastings


On 01/16/2014 03:38 AM, Antoine Pitrou wrote:

> On Wed, 15 Jan 2014 21:55:46 -0800
> Larry Hastings  wrote:
>> Passing in "None" here is inconvenient as it's an integer argument.
>
> Inconvenient for whom? The callee or the caller?


The callee, specifically the C argument parsing code.  (Even more 
specifically: the Argument Clinic argument parsing code generator.)



//arry/

Re: [Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Yury Selivanov

Guido,

On Thursday, January 16, 2014, Guido van Rossum  wrote:

> On Thu, Jan 16, 2014 at 11:15 AM, Yury Selivanov
> > wrote:
> > The whole discussion of whether clinic should write its output
> > right in the source file (buffered or not), or in a separate sidefile,
> > started because we currently cannot run the clinic during the build
> > process, since it’s written in python.
>
> But that's why the output is checked in. It's the same with the parser
> IIRC. (And yes, there's a bootstrap issue -- but that's solved by
> using an older Python version.)
>
> > But what if, at some point, someone implements the Tools/clinic.py in
> > pure C, so that integrating it directly in the build process will be
> > possible?  In this case, the question is — should we use python code
> > in the argument clinic DSL?
> >
> > If we keep it strictly declarative, then, at least, we’ll have this
> > possibility in the future.
>
> Sounds like a pretty unlikely scenario. Why would you implement clinic in
> C?


Unlikely, yes.

There is just one reason for having it in C --
having it integrated in the build process,
so that the generated output/sidefiles
are not in the repository.

Yury

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Larry Hastings


On 01/16/2014 04:21 AM, MRAB wrote:

> On 2014-01-16 05:32, Larry Hastings wrote:
> [snip]
>> We could add a special value, let's call it
>> sys.NULL, whose specific semantics are "turns into NULL when passed into
>> builtins".  This would solve the problem but it's really, really awful.
>
> Would it be better if it were called "__null__"?


No.  The problem is not the name, the problem is in the semantics. This 
would mean a permanent special case in Python's argument parsing (and 
"special cases aren't special enough to break the rules"), and would 
inflict these same awful semantics on alternate implementations like 
PyPy, Jython, and IronPython.



//arry/

Re: [Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Larry Hastings


On 01/16/2014 11:36 AM, Guido van Rossum wrote:

> On Thu, Jan 16, 2014 at 11:15 AM, Yury Selivanov
>  wrote:
>> If we keep it strictly declarative, then, at least, we’ll have this
>> possibility in the future.
>
> Sounds like a pretty unlikely scenario. Why would you implement clinic in C?


We'll never reimplement Argument Clinic in C.  I could list many reasons 
for this.  Suffice to say, I'm not doing it, and I doubt anyone else 
would ever step up to the plate and try it.


And, "form follows function".  It's a bad idea to limit Argument 
Clinic's features today based on what might be inconvenient someday in 
some hypothetical rewrite in C.  Argument Clinic should be maximally 
useful, right now.



//arry/

Re: [Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Larry Hastings


On 01/16/2014 11:46 AM, Yury Selivanov wrote:

There is just one reason for having it in C --
having it integrated in the build process,
so that the generated output/sidefiles
are not in the repository.


It's possible to integrate Argument Clinic into the build process 
without rewriting it in C.  We could write a small C program that looked 
on your path for a suitable Python 3 interpreter, and ran 
Tools/clinic/clinic.py under that interpreter.  If it failed to find 
such an interpreter it could print a warning message.


Alternatively, we could add a checksum for the Clinic *input* block to 
the output somewhere.  This would give the C tool the ability to check 
and see if the Clinic input had changed, and only bother to run 
clinic.py if it had.
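
A minimal sketch of the checksum idea above, written in Python rather than C for brevity. The function names and the choice of SHA-1 are assumptions made for illustration; nothing here is taken from clinic.py itself.

```python
import hashlib


def input_checksum(block_text):
    """Return a hex digest for the text of one Clinic input block."""
    return hashlib.sha1(block_text.encode("utf-8")).hexdigest()


def needs_regeneration(block_text, recorded_checksum):
    """True if the input block no longer matches the checksum that was
    stored alongside the previously generated output."""
    return input_checksum(block_text) != recorded_checksum


# Hypothetical input block, for illustration only.
block = "zlib.compress\n    data: Py_buffer\n"
recorded = input_checksum(block)

print(needs_regeneration(block, recorded))              # input unchanged
print(needs_regeneration(block + "    /\n", recorded))  # input edited
```

A build step could then skip running clinic.py whenever every recorded checksum still matches its input block.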


However, the generated output is still going to be checked in to the 
repository regardless.



//arry/

Re: [Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Georg Brandl

Am 16.01.2014 20:46, schrieb Yury Selivanov:
> Guido,
> 
> On Thursday, January 16, 2014, Guido van Rossum wrote:
> 
> On Thu, Jan 16, 2014 at 11:15 AM, Yury Selivanov wrote:
> > The whole discussion of whether clinic should write its output
> > right in the source file (buffered or not), or in a separate sidefile,
> > started because we currently cannot run the clinic during the build
> > process, since it’s written in python.
> 
> But that's why the output is checked in. It's the same with the parser
> IIRC. (And yes, there's a bootstrap issue -- but that's solved by
> using an older Python version.)
> 
> > But what if, at some point, someone implements the Tools/clinic.py in
> > pure C, so that integrating it directly in the build process will be
> > possible?  In this case, the question is — should we use python code
> > in the argument clinic DSL?
> >
> > If we keep it strictly declarative, then, at least, we’ll have this
> > possibility in the future.
> 
> Sounds like a pretty unlikely scenario. Why would you implement clinic in C?
> 
> 
> Unlikely, yes.

About as unlikely as switching the Python sources to C++ and using templates
to implement a Clinic-like DSL :)

Georg


Re: [Python-Dev] python code in argument clinic annotations

2014-01-16 Thread Yury Selivanov

Larry,

On January 16, 2014 at 2:58:02 PM, Larry Hastings ([email protected]) wrote:
> However, the generated output is still going to be checked in
> to the repository regardless.

OK. Since it looks like it’s 100% accepted to commit it to the repo, then
my question is moot.

And again, Larry, kudos for pulling the AC off.

-
Yury

[Python-Dev] Closing the Clinic output format debate (at least for now)

2014-01-16 Thread Larry Hastings




The current tally of votes, by order of popularity:

   Side file: +6
   Buffer: +1.5
   Multiple buffers, Modified buffer, Forward buffer: +1
   Original: -5


However, as stated, support for "side files" will not go in unless Guido 
explicitly states that it's okay with him.  He has not. Therefore it's 
not going in.  If you want this feature, take it up with our BDFL.  I 
feel my hands are tied.


Second-best is all the buffer approaches, collectively.  Since there was 
no clear winner, I'm going to make the new default the "modified buffer" 
approach, as that's the only one that does not require rearranging your 
code to use.  However, to encourage continued experimentation, I'm going 
to leave in the configurability (at least for now), so people can keep 
experimenting.  Maybe we'll find something in the future that's a clear 
new favorite.


As a stretch goal, I'd like to also add Zachary Ware's proposed 
"forward" buffer, as a further concession to experimentation.  It 
shouldn't be too messy, but if it gets out of hand I'll back out of it.


Finally, I'm going to add support for "presets" so you can switch 
between original / modified buffer / buffer / forward buffer with just 
one statement.  (Multiple buffers doesn't need a different preset.)


I'll also keep the line prefix (and add a line suffix too) and see if a 
prefix of "/*clinic*/" helps.



//arry/

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Mark Shannon


On 16/01/14 19:43, Larry Hastings wrote:

On 01/16/2014 04:21 AM, MRAB wrote:

On 2014-01-16 05:32, Larry Hastings wrote:
[snip]


We could add a special value, let's call it
sys.NULL, whose specific semantics are "turns into NULL when passed into
builtins".  This would solve the problem but it's really, really awful.


Would it be better if it were called "__null__"?


No.  The problem is not the name, the problem is in the semantics. This
would mean a permanent special case in Python's argument parsing (and
"special cases aren't special enough to break the rules"), and would
inflict these same awful semantics on alternate implementations like
PyPy, Jython, and IronPython.


Indeed.

Why not just change the clinic spec a bit, from
'The "default" is a Python literal value.' to
'The "default" is a Python literal value or NULL.'?

A NULL default would imply the parameter is optional
with no default.


Cheers,
Mark.

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Terry Reedy


On 1/16/2014 5:11 AM, Nick Coghlan wrote:


Guido's successful counter was to point out that the parsing of the
format string itself assumes ASCII compatible data,


Did you see my explanation, which I wrote in response to one of your 
earlier posts, of why I think the statement "the parsing of the format 
string itself assumes ASCII compatible data" is confused and wrong? The 
above seems to say that what I wrote is impossible, but perhaps I 
misunderstand what Guido and you mean. Among my questions are "by data, 
do you mean interpolated objects or interpolated bytes?" and "what 
restriction on 'data' do you intend by 'ASCII compatible'?".


--
Terry Jan Reedy


Re: [Python-Dev] Closing the Clinic output format debate (at least for now)

2014-01-16 Thread Larry Hastings


On 01/16/2014 12:13 PM, Larry Hastings wrote:



The current tally of votes, by order of popularity:

Side file: +6
Buffer: +1.5
Multiple buffers, Modified buffer, Forward buffer: +1
Original: -5



I should add, that's out of a total of eleven votes cast.  So the side 
file was a clear winner but far from unanimous.


Since the votes were all public, the tally might as well be too. Here it 
is in handy "CSV" format:


   "Names", "Original", "Side File", "Buffer", "Multiple Buffers", "Modified Buffer", "Forward Buffer"
   "Totals", -5, 6, 1.5, 1, 1, 1
   "Brett Cannon", 0, 0, 1, 1, 0, 0
   "Antoine Pitrou", -0.5, 1, 0, 0, 0, 0
   "Raymond Hettinger", -1, 1, 0, 0, 0, 0
   "Zachary Ware", -1, 0, 1, 1, 0, 0
   "Georg Brandl", -1, 0, 0, 1, 0, 0
   "Serhiy Storchaka", -1, 1, 0, 0, 0, 0
   "Yury Selivanov", 0, 1, -1, -1, -1, 0
   "Ryan Smith-Roberts", 1, 0, 0, -1, 1, 1
   "Ethan Furman", 0, 0, 0.5, 0, 1, 0
   "Meador Inge", -0.5, 1, 0, 0, 0, 0
   "Stefan Krah", -1, 1, 0, 0, 0, 0
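
The "Totals" row can be rechecked from the individual votes. The sketch below is only an illustration: the helper name is made up, and the header row is written with the opening quote on "Original" restored so that the csv module parses it cleanly.

```python
import csv
import io

# The per-voter rows from the tally above (the "Totals" row is left out
# so that it can be recomputed).
TALLY = '''\
"Names", "Original", "Side File", "Buffer", "Multiple Buffers", "Modified Buffer", "Forward Buffer"
"Brett Cannon", 0, 0, 1, 1, 0, 0
"Antoine Pitrou", -0.5, 1, 0, 0, 0, 0
"Raymond Hettinger", -1, 1, 0, 0, 0, 0
"Zachary Ware", -1, 0, 1, 1, 0, 0
"Georg Brandl", -1, 0, 0, 1, 0, 0
"Serhiy Storchaka", -1, 1, 0, 0, 0, 0
"Yury Selivanov", 0, 1, -1, -1, -1, 0
"Ryan Smith-Roberts", 1, 0, 0, -1, 1, 1
"Ethan Furman", 0, 0, 0.5, 0, 1, 0
"Meador Inge", -0.5, 1, 0, 0, 0, 0
"Stefan Krah", -1, 1, 0, 0, 0, 0
'''


def column_totals(text):
    """Sum each vote column; returns {column name: total}."""
    # skipinitialspace lets the reader accept ", " between fields.
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    header, votes = rows[0], rows[1:]
    totals = [0.0] * (len(header) - 1)
    for row in votes:
        for i, cell in enumerate(row[1:]):
            totals[i] += float(cell)
    return dict(zip(header[1:], totals))


totals = column_totals(TALLY)
for name, total in totals.items():
    print(name, total)
```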



//arry/

Re: [Python-Dev] Closing the Clinic output format debate (at least for now)

2014-01-16 Thread Georg Brandl

Am 16.01.2014 22:20, schrieb Larry Hastings:
> On 01/16/2014 12:13 PM, Larry Hastings wrote:
>>
>>
>> The current tally of votes, by order of popularity:
>>
>> Side file: +6
>> Buffer: +1.5
>> Multiple buffers, Modified buffer, Forward buffer: +1
>> Original: -5
>>
> 
> I should add, that's out of a total of eleven votes cast.  So the side file
> was a clear winner but far from unanimous.
> 
> Since the votes were all public, the tally might as well be too.  Here it is in
> handy "CSV" format:
> 
> "Names", "Original", "Side File", "Buffer", "Multiple Buffers", "Modified
> Buffer", "Forward Buffer"
> "Totals", -5, 6, 1.5, 1, 1, 1
> "Brett Cannon", 0, 0, 1, 1, 0, 0
> "Antoine Pitrou", -0.5, 1, 0, 0, 0, 0
> "Raymond Hettinger", -1, 1, 0, 0, 0, 0
> "Zachary Ware", -1, 0, 1, 1, 0, 0
> "Georg Brandl", -1, 0, 0, 1, 0, 0
> "Serhiy Storchaka", -1, 1, 0, 0, 0, 0
> "Yury Selivanov", 0, 1, -1, -1, -1, 0
> "Ryan Smith-Roberts", 1, 0, 0, -1, 1, 1
> "Ethan Furman", 0, 0, 0.5, 0, 1, 0
> "Meador Inge", -0.5, 1, 0, 0, 0, 0
> "Stefan Krah", -1, 1, 0, 0, 0, 0

Although this is neglecting the difference between +0 and -0 :)

Georg



Re: [Python-Dev] Closing the Clinic output format debate (at least for now)

2014-01-16 Thread Larry Hastings


On 01/16/2014 01:24 PM, Georg Brandl wrote:

Although this is neglecting the difference between +0 and -0 :)


I hear LibreOffice is accepting external patches again.


//arry/

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Ryan Smith-Roberts

Let me expand on the issue, and address some of the replies.

The goal of Argument Clinic is to create new docstring signatures for
builtins, with the following properties:

1) Useful. While one can create a signature of func(*args) and then
document complex and arbitrary restrictions on what args contains, that
isn't helpful to the end-user examining the docstring, or to automated
tools.

2) Inspectable. For a signature to be compatible with inspect.signature(),
it *must be a valid native Python declaration*. This means no optional
positional arguments of the form func(foo[, bar]), and no non-Python
default values.

3) Correct. The semantics of the builtin's signature should match the
expectations users have about pure Python declarations.

There are two classes of builtins whose signatures do not have these
properties. The first is those with very weird signatures, like
curses.window.addstr(). It's fine that those don't get converted, they're
hopeless. A second class is builtins with "almost but not quite" usable
signatures, mostly the ones with optional positional parameters. It would
be nice to "rescue" those builtins. So, let us return to my original
example, getservbyname(). Its current signature:

socket.getservbyname(servicename[, protocolname])

This is not an inspectable signature, since pure Python does not support
bracketed arguments. To make it inspectable, we must give protocolname a
(valid Python) default value:

socket.getservbyname(servicename, protocolname=None)

Unfortunately, while useful and inspectable, this signature is not correct.
For a pure Python function, passing None for protocolname is the same as
omitting it. However, if you pass None to getservbyname(), it raises a
TypeError. So, we have these three options:

1) Don't give getservbyname() an inspectable signature.
2) Lie to the user about the acceptability of None.
3) Alter the semantics of getservbyname() to treat None as equivalent to
omitting protocolname.

Obviously #2 is out. My question: is #3 ever acceptable? It's a real
change, as it breaks any code that relies on the TypeError exception.
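
Option #3 can be sketched in pure Python. The wrapper below is hypothetical (it is not how socketmodule.c is written, and shadowing the name getservbyname is only for illustration); it shows a signature that inspect.signature() accepts and whose None default really does mean "argument omitted".

```python
import inspect
import socket


def getservbyname(servicename, protocolname=None):
    """Hypothetical wrapper: None is treated the same as leaving
    protocolname out, which is what option #3 proposes."""
    if protocolname is None:
        return socket.getservbyname(servicename)
    return socket.getservbyname(servicename, protocolname)


sig = inspect.signature(getservbyname)
print(sig)
print(sig.parameters["protocolname"].default is None)
```

With these semantics the documented default is truthful: callers that pass None explicitly get the same result as callers that omit the argument, at the cost of no longer raising TypeError for None.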


Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Guido van Rossum

On Thu, Jan 16, 2014 at 1:18 PM, Terry Reedy  wrote:
> On 1/16/2014 5:11 AM, Nick Coghlan wrote:
>
>> Guido's successful counter was to point out that the parsing of the
>> format string itself assumes ASCII compatible data,
>
> Did you see my explanation, which I wrote in response to one of your earlier
> posts, of why I think the statement "the parsing of the format string itself
> assumes ASCII compatible data" is confused and wrong? The above seems to
> say that what I wrote is impossible, but perhaps I misunderstand what Guido
> and you mean. Among my questions are "by data, do you mean interpolated
> objects or interpolated bytes?" and "what restriction on 'data' do you
> intend by 'ASCII compatible'?".

Can you move the meta-discussion off-list? I'm getting tired of "did
you understand what I said".

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Terry Reedy


On Thu, Jan 16, 2014 at 1:42 AM, Terry Reedy  wrote:



>>> itertools.repeat('a', -1)
repeat('a', 0)
>>> itertools.repeat('a', times=-1)
repeat('a')
>>> itertools.repeat('a', times=-2)
repeat('a', -2)



The first line is correct in both behavior and representation.
The second line behavior (and corresponding repr) are wrong.
The third line repr is wrong but the behavior is like the first.


[1] http://bugs.python.org/issue19145


On 1/16/2014 1:42 PM, Guido van Rossum wrote:

If I had complete freedom in redefining the spec I would treat
positional and keyword the same, interpret absent or None to mean
"forever" and explicit negative integers to mean the same as zero, and
make repr show a positional integer >= 0 if the repeat isn't None.

But I don't know if that's too much of a change.
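
A rough pure-Python sketch of the spec Guido describes, for comparison with the session above. The class name Repeat is made up for illustration; this is not the itertools implementation and not a proposed patch.

```python
import itertools


class Repeat:
    """Sketch of the described semantics: positional and keyword
    'times' behave the same, absent or None means "forever", and
    negative counts are treated as zero."""

    def __init__(self, obj, times=None):
        self.obj = obj
        self.times = None if times is None else max(0, times)

    def __iter__(self):
        if self.times is None:
            while True:
                yield self.obj
        else:
            for _ in range(self.times):
                yield self.obj

    def __repr__(self):
        # Show a positional integer >= 0 unless the repeat is endless.
        if self.times is None:
            return "repeat(%r)" % (self.obj,)
        return "repeat(%r, %d)" % (self.obj, self.times)


print(repr(Repeat('a', -1)))
print(repr(Repeat('a', times=-1)))
print(repr(Repeat('a')))
print(list(itertools.islice(Repeat('a'), 3)))
```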


I copied the unsnipped stuff above to a tracker message.

http://bugs.python.org/issue19145

--
Terry Jan Reedy


Re: [Python-Dev] Closing the Clinic output format debate (at least for now)

2014-01-16 Thread Guido van Rossum

I am tired of being the only blocker. So I withdraw my preference. Do
what you all can agree on without me.

On Thu, Jan 16, 2014 at 12:13 PM, Larry Hastings  wrote:
>
>
> The current tally of votes, by order of popularity:
>
> Side file: +6
> Buffer: +1.5
> Multiple buffers, Modified buffer, Forward buffer: +1
> Original: -5
>
>
> However, as stated, support for "side files" will not go in unless Guido
> explicitly states that it's okay with him.  He has not.  Therefore it's not
> going in.  If you want this feature, take it up with our BDFL.  I feel my
> hands are tied.
>
> Second-best is all the buffer approaches, collectively.  Since there was no
> clear winner, I'm going to make the new default the "modified buffer"
> approach, as that's the only one that does not require rearranging your code
> to use.  However, to encourage continued experimentation, I'm going to leave
> in the configurability (at least for now), so people can keep experimenting.
> Maybe we'll find something in the future that's a clear new favorite.
>
> As a stretch goal, I'd like to also add Zachary Ware's proposed "forward"
> buffer, as a further concession to experimentation.  It shouldn't be too
> messy, but if it gets out of hand I'll back out of it.
>
> Finally, I'm going to add support for "presets" so you can switch between
> original / modified buffer / buffer / forward buffer with just one
> statement.  (Multiple buffers doesn't need a different preset.)
>
> I'll also keep the line prefix (and add a line suffix too) and see if a
> prefix of "/*clinic*/" helps.
>
>
> /arry
>



-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Jan Kaliszewski


16.01.2014 17:33, Michael Urman wrote:

On Thu, Jan 16, 2014 at 8:45 AM, Brett Cannon wrote:
Fine, if you're worried about bytes.format() overstepping by implicitly
calling str.encode() on the return value of __format__() then you will
need __bytes__format__() to get equivalent support.

Could we just re-use PEP-3101's note (easily updated for Python 3):

Note for Python 2.x: The 'format_spec' argument will be either
a string object or a unicode object, depending on the type of the
original format string.  The __format__ method should test the type
of the specifiers parameter to determine whether to return a string or
unicode object.  It is the responsibility of the __format__ method
to return an object of the proper type.

If __format__ receives a format_spec of type bytes, it should return
bytes. For such cases on objects that cannot support bytes (i.e. for
str), it can raise. This appears to avoid the need for additional
methods. (As does Nick's proposal of leaving it out for now.)


-1.

I'd treat the format()+.__format__()+str.format()-"ecosystem" as
a nice text-data-oriented, *complete* Py3k feature, backported to
Python 2 to share the benefits of the feature with it as well as
to make the 2-to-3 transition a bit easier.

IMHO, the PEP-3101's note cited above just describes a workaround
over the flaws of the Py2's obsolete text model.  Moving such
complications into Py3k would make the feature (and especially the
ability to implement your own .__format__()) harder to understand
and make use of -- for little profit.

Such a move is not needed for compatibility.  And, IMHO, the
format()/__format__()/str.format()-matter is all about nice and
flexible *text* formatting, not about binary data interpolation.

16.01.2014 10:56, Nick Coghlan wrote:


I have a different proposal: let's *just* add mod formatting to
bytes, and leave the extensible formatting system as a text only
operation.

We don't really care if bytes supports that method for version
compatibility purposes, and the deliberate flexibility of the design
makes it hard to translate into the binary domain.

So let's just not provide that - let's accept that, for the binary
domain, printf style formatting is just a better fit for the job :)


+1!

However, I am not sure if %s should be limited to bytes-like
objects.  As "practicality beats purity", I would be +0.5 for
enabling the following:

- input type supports Py_buffer?
  use it to collect the necessary bytes

- input type has the __bytes__() method?
  use it to collect the necessary bytes

- input type has the encode() method?
  raise TypeError

- otherwise:
  use something equivalent to ascii(obj).encode('ascii')
  (note that it would nicely format numbers + format other
  object in more-or-less useful way without the fear of
  encountering a non-ascii data).

  another option: use str()-representation of strictly
  defined types, e.g.: int, float, decimal.Decimal,
  fractions.Fraction...
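
The lookup order listed above can be sketched as a plain Python function. The helper name percent_s_bytes is hypothetical, and this is only one reading of the proposal (the first "otherwise" variant), not what any patch implements.

```python
def percent_s_bytes(obj):
    """Hypothetical helper: bytes that %s would interpolate for obj
    under the rules listed above."""
    # 1. Objects supporting the buffer protocol: take their bytes.
    try:
        view = memoryview(obj)
    except TypeError:
        pass
    else:
        return view.tobytes()

    # 2. Objects defining __bytes__(): use it.
    to_bytes = getattr(type(obj), "__bytes__", None)
    if to_bytes is not None:
        return to_bytes(obj)

    # 3. Objects with encode() are text: refuse to guess an encoding.
    if hasattr(obj, "encode"):
        raise TypeError("text must be encoded explicitly")

    # 4. Everything else: an ASCII-only representation.
    return ascii(obj).encode("ascii")


print(percent_s_bytes(b"raw"))
print(percent_s_bytes(42))
print(percent_s_bytes(1.5))
```

Passing a str raises TypeError here, which keeps the "no implicit encoding" rule while still letting numbers through.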

Cheers.
*j


Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Tres Seaver


On 01/16/2014 04:08 PM, Ryan Smith-Roberts wrote:

> [L]et us return to my original example, getservbyname(). Its current
> signature:
> 
> socket.getservbyname(servicename[, protocolname])
> 
> This is not an inspectable signature, since pure Python does not
> support bracketed arguments. To make it inspectable, we must give
> protocolname a (valid Python) default value:
> 
> socket.getservbyname(servicename, protocolname=None)
> 
> Unfortunately, while useful and inspectable, this signature is not
> correct. For a pure Python function, passing None for protocolname is
> the same as omitting it. However, if you pass None to getservbyname(),
> it raises a TypeError. So, we have these three options:
> 
> 1) Don't give getservbyname() an inspectable signature. 2) Lie to the
> user about the acceptability of None. 3) Alter the semantics of
> getservbyname() to treat None as equivalent to omitting protocolname.
> 
> Obviously #2 is out. My question: is #3 ever acceptable? It's a real 
> change, as it breaks any code that relies on the TypeError exception.

+1 for #3, especially in a new "major" release (w/ sufficient
documentation of the change).


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Terry Reedy


On 1/16/2014 4:59 PM, Guido van Rossum wrote:


I'm getting tired of "did you understand what I said".


I was asking whether I needed to repeat myself, but forget that.
I was also saying that while I understand 'ascii-compatible encoding', I 
do not understand the notion of 'ascii-compatible data' or statements 
based on it.


Re: [Python-Dev] AC Derby and accepting None for optional positional arguments

2014-01-16 Thread Larry Hastings



On 01/16/2014 01:08 PM, Ryan Smith-Roberts wrote:
There are two classes of builtins whose signatures do not have these 
properties. The first is those with very weird signatures, like 
curses.window.addstr(). It's fine that those don't get converted, 
they're hopeless.


Speaking as the father of Argument Clinic, I disagree.  My goal with 
Clinic is to convert every function in CPython whose semantics can be 
expressed with a PyArg_ parsing function.


For example, curses.window.addstr could be converted just fine.  Its 
signature is exactly the same as curses.window.addch which has already 
been converted.


socket.sendto eludes me for now--but I haven't given up yet.


Don't give up hope,


//arry/

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Nick Coghlan

On 17 Jan 2014 09:36, "Terry Reedy"  wrote:
>
> On 1/16/2014 4:59 PM, Guido van Rossum wrote:
>
>> I'm getting tired of "did you understand what I said".
>
>
> I was asking whether I needed to repeat myself, but forget that.
> I was also saying that while I understand 'ascii-compatible encoding', I
> do not understand the notion of 'ascii-compatible data' or statements based
> on it.

There are plenty of data formats (like SMTP and HTTP) that are constrained
to be ASCII compatible, either globally, or locally in the parts being
manipulated by an application (such as a file header). ASCII incompatible
segments may be present, but in ways that allow the data processing to
handle them correctly. The ASCII assuming methods on bytes objects are
there to help in dealing with that kind of data.

If the binary data is just one large block in a single text encoding, it's
generally easier to just decode it to text, but multipart formats generally
don't allow that.
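
A small illustration of that kind of data, assuming a made-up HTTP-like message: the header block is ASCII and can be handled with the bytes methods directly, while the body is an opaque payload that is only decoded once its declared encoding is known.

```python
# Made-up message: ASCII headers, blank line, then a UTF-8 body.
raw = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n" + "na\u00efve".encode("utf-8")
)

# Split on the ASCII delimiter without decoding anything.
head, _, body = raw.partition(b"\r\n\r\n")

headers = {}
for line in head.split(b"\r\n"):
    name, _, value = line.partition(b":")
    # strip() and lower() assume ASCII, which holds for this segment.
    headers[name.strip().lower()] = value.strip()

length = int(headers[b"content-length"].decode("ascii"))
print(headers)
print(length == len(body))
print(body.decode("utf-8"))
```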


Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Greg


On 17/01/2014 10:18 a.m., Terry Reedy wrote:

On 1/16/2014 5:11 AM, Nick Coghlan wrote:


Guido's successful counter was to point out that the parsing of the
format string itself assumes ASCII compatible data,


Nick's initial arguments against bytes formatting were very
abstract and philosophical, along the lines that it violated
some pure mental model of text/bytes separation.

Then Guido said something that Nick took to be an equal and
opposite philosophical argument that cancelled out his original
objections, and he withdrew them.

I don't think it matters whether the internal details of that
debate make sense to the rest of us. The main thing is that
a consensus seems to have been reached on bytes formatting
being basically a good thing.

--
Greg


Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Ethan Furman


On 01/16/2014 05:32 PM, Greg wrote:


I don't think it matters whether the internal details of that
debate make sense to the rest of us. The main thing is that
a consensus seems to have been reached on bytes formatting
being basically a good thing.


And a good thing, too, on both counts!  :)

A few folks have suggested not implementing .format() on bytes;  I've
been resistant, but then I remembered that format is also a function.


http://docs.python.org/3/library/functions.html?highlight=ascii#format
==
format(value[, format_spec])

Convert a value to a “formatted” representation, as controlled by
format_spec. The interpretation of format_spec will depend on the type
of the value argument, however there is a standard formatting syntax
that is used by most built-in types: Format Specification Mini-Language.

The default format_spec is an empty string which usually gives the same
effect as calling str(value).

A call to format(value, format_spec) is translated to
type(value).__format__(format_spec) which bypasses the instance
dictionary when searching for the value’s __format__() method. A
TypeError exception is raised if the method is not found or if either
the format_spec or the return value are not strings.

==
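
As a quick illustration of the quoted documentation, here is a made-up class (Celsius is not from the thread) whose __format__() is picked up by both the format() builtin and str.format(), and which must return a str:

```python
class Celsius:
    """Hypothetical type with its own formatting hook."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Delegate the numeric part to float's __format__ and append
        # a unit; the result has to be a str.
        return format(self.degrees, format_spec) + " C"


reading = Celsius(21.456)
via_builtin = format(reading, ".1f")
via_method = "{:.1f}".format(reading)
print(via_builtin)
print(via_method)
```

Both spellings go through the same text-oriented protocol, which is why leaving .format() off bytes also means leaving the format() builtin alone.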

Given that, I can relent on .format and just go with .__mod__ .  A low-level 
service for a low-level protocol, what?  ;)

--
~Ethan~

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Stephen J. Turnbull

Meta enough that I'll take Guido out of the CC.

Nick Coghlan writes:

 > There are plenty of data formats (like SMTP and HTTP) that are
 > constrained to be ASCII compatible,

"ASCII compatible" is a technical term in encodings, which means
"bytes in the range 0-127 always have ASCII coded character semantics,
do what you like with bytes in the range 128-255."[1]

Worse, it's clearly confusing in this discussion.  Let's stop using
this term to mean

the data format has elements that are defined to contain only
bytes with ASCII coded character semantics

(which is the relevant restriction AFAICS -- I don't know of any
ASCII-compatible formats where the bytes 128-255 are used for any
purpose other than encoding non-ASCII characters).  OTOH, if it *is*
an ASCII-compatible text encoding, the semantics are dubious if the
bytes versions of many of these methods/operations are used.

A documentation suggestion: It's easy enough to rewrite

 > constrained to be ASCII compatible, either globally, or locally in
 > the parts being manipulated by an application (such as a file
 > header). ASCII incompatible segments may be present, but in ways
 > that allow the data processing to handle them correctly.

as 

containing 'well-defined segments constrained to be (strictly)
ASCII-encoded' (aka ASCII segments).

And then you can say 

 are designed for use *only* on bytes
that are ASCII segments; use on other data is likely to cause
hard-to-diagnose corruption.
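
One small case of what such a warning guards against, using Latin-1 data chosen purely for illustration: bytes.upper() only touches the ASCII letters, so the result silently differs from upper-casing the decoded text.

```python
latin1 = "caf\u00e9".encode("latin-1")   # b'caf\xe9'

# ASCII-assuming method applied to data that is not an ASCII segment:
as_bytes = latin1.upper()

# Decoding first gives the text-level answer:
as_text = latin1.decode("latin-1").upper().encode("latin-1")

print(as_bytes)
print(as_text)
print(as_bytes == as_text)
```

No exception is raised in either case; the two results simply disagree on the final byte.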

If there are other use cases for "ASCII-compatible data formats" as
defined above (not worrying about codecs, because they are a very
small minority of code-to-be-written at this point), I don't know
about them.  Does anyone?  If there are any, I'll be happy to revise.
If not, that seems to be a precise and intelligible statement of the
restrictions that is useful to the practical use cases.  And nothing
stops users who think they know what they're doing from using them in
other contexts (which can be documented if they turn out to be broadly
useful).

Footnotes: 
[1]  "ASCII coded character semantics" is of course mildly ambiguous
due to considerations like EOL conventions.  But "you know what I'm
talking about".


Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Stephen J. Turnbull

Greg writes:

 > I don't think it matters whether the internal details of [the EIBTI
 > vs. PBP] debate make sense to the rest of us. The main thing is
 > that a consensus seems to have been reached on bytes formatting
 > being basically a good thing.

I think some of it matters to the documentation.


Re: [Python-Dev] Closing the Clinic output format debate (at least for now)

2014-01-16 Thread Nick Coghlan

On 17 January 2014 08:01, Guido van Rossum  wrote:
> I am tired of being the only blocker. So I withdraw my preference. Do
> what you all can agree on without me.

I had been staying out of the debate because I haven't had time to
participate in the derby yet (if nobody has claimed the builtins yet,
I was planning to do that this weekend). However, reviewing the
changes for http://bugs.python.org/issue20189 has now been enough to
convince me that a separate generated file is the way to go.

My rationale is the way it affects the code review process:
with a separate file, I can skip to the next file in the review as
soon as I see ".clinic" in the file name. We may even be able to teach
Rietveld to skip over clinic files (or at least suggest skipping them)
automatically.

With the current intermingled hand written + generated format, I can't
tell just from the file name whether or not there are manual changes I
need to review. Fortunately, in this particular case, Larry provided a
list of the files with real changes in them, but I now think it makes
more sense to instead bake the "this is all generated code, if you
have reviewed the input changes and trust argument clinic to do the
right thing, you can just skip reviewing it" notification directly
into the filenames.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Neil Schemenauer

Greg  wrote:
> I don't think it matters whether the internal details of that
> debate make sense to the rest of us. The main thing is that
> a consensus seems to have been reached on bytes formatting
> being basically a good thing.

I've been mostly steering clear of the metaphysical and writing
code today. ;-)  An extremely rough patch has been uploaded:

http://bugs.python.org/issue20284

I have a new one almost ready that introduces __ascii__ rather than
overloading __format__.  I like it better, will upload to issue
tracker soon.

Regards,

  Neil

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Steven D'Aprano

On Fri, Jan 17, 2014 at 11:19:44AM +0900, Stephen J. Turnbull wrote:
> Meta enough that I'll take Guido out of the CC.
> 
> Nick Coghlan writes:
> 
>  > There are plenty of data formats (like SMTP and HTTP) that are
>  > constrained to be ASCII compatible,
> 
> "ASCII compatible" is a technical term in encodings, which means
> "bytes in the range 0-127 always have ASCII coded character semantics,
> do what you like with bytes in the range 128-255."[1]

Examples, and counter-examples, may help. Let me see if I have got this 
right: an ASCII-compatible encoding may be an ASCII-superset like 
Latin-1, or a variable-width encoding like UTF-8 where the ASCII chars 
are encoded to the same bytes as ASCII, and non-ASCII chars are not. A 
counter-example would be UTF-16, or some of the Asian encodings like 
Big5. Am I right so far?
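
A quick way to check the encoding side of this in an interpreter is
sketched below. The helper names are made up for illustration, and the
sample characters are arbitrary. The first helper tests whether ASCII
text encodes to the same bytes as under ASCII; the second tests whether
a non-ASCII character's encoding contains any byte in the range 0-127,
which is the other half of the definition quoted above.

```python
def same_bytes_as_ascii(sample, encoding):
    """True if pure-ASCII `sample` encodes to the same bytes as under ASCII."""
    return sample.encode(encoding) == sample.encode("ascii")

def has_bytes_in_ascii_range(char, encoding):
    """True if any byte of the encoded character falls in the range 0-127."""
    return any(b < 128 for b in char.encode(encoding))

# Latin-1 and UTF-8 keep "T" as the single byte 84; UTF-16 does not.
results = {enc: same_bytes_as_ascii("T", enc)
           for enc in ("latin-1", "utf-8", "utf-16-le", "utf-16")}
print(results)

# UTF-8 encodes U+00E9 using only bytes in the range 128-255.
print(has_bytes_in_ascii_range("\u00e9", "utf-8"))

# The same check can be run on a two-byte Big5 character such as U+529F.
print(has_bytes_in_ascii_range("\u529f", "big5"))
```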

But Nick isn't talking about an encoding, he's talking about a data 
format. I think that an ASCII-compatible format means one where (in at 
least *some* parts of the data) bytes between 0 and 127 have the same 
meaning as in ASCII, e.g. byte 84 is to be interpreted as ASCII 
character "T". This doesn't mean that every byte 84 means "T", only that 
some of them do -- hopefully well-defined sections of the data. Below, 
you introduce the term "ASCII segments" for these.

> Worse, it's clearly confusing in this discussion.  Let's stop using
> this term to mean
> 
> the data format has elements that are defined to contain only
> bytes with ASCII coded character semantics
> 
> (which is the relevant restriction AFAICS -- I don't know of any
> ASCII-compatible formats where the bytes 128-255 are used for any
> purpose other than encoding non-ASCII characters).  OTOH, if it *is*
> an ASCII-compatible text encoding, the semantics are dubious if the
> bytes versions of many of these methods/operations are used.
> 
> A documentation suggestion: It's easy enough to rewrite
> 
>  > constrained to be ASCII compatible, either globally, or locally in
>  > the parts being manipulated by an application (such as a file
>  > header). ASCII incompatible segments may be present, but in ways
>  > that allow the data processing to handle them correctly.
> 
> as 
> 
> containing 'well-defined segments constrained to be (strictly)
> ASCII-encoded' (aka ASCII segments).
> 
> And then you can say 
> 
>  are designed for use *only* on bytes
> that are ASCII segments; use on other data is likely to cause
> hard-to-diagnose corruption.

An example: if you have the byte b'\x63', calling upper() on that will 
return b'\x43'. That is only meaningful if the byte is intended as the 
ASCII character "c".
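
To make the corruption risk concrete, here is a small sketch. The
header text and the packed integer are invented for illustration; the
point is only that bytes.upper() cannot tell an ASCII segment from
binary data that happens to use the same byte values.

```python
import struct

# An ASCII segment: upper() does what the author intends.
header = b"content-type: text/plain"
upper_header = header.upper()
print(upper_header)

# Not an ASCII segment: a little-endian 16-bit integer whose two bytes
# both happen to be 0x63, the same value as ASCII "c".
packed = struct.pack("<H", 0x6363)
damaged = packed.upper()
before = struct.unpack("<H", packed)[0]
after = struct.unpack("<H", damaged)[0]
print(before, after)   # the number silently changes

# Latin-1 text: bytes.upper() only touches bytes 97-122, so the
# non-ASCII letter is left alone, unlike str.upper().
latin1 = "\u00e9".encode("latin-1")
print(latin1.upper() == latin1, "\u00e9".upper())
```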

> Footnotes: 
> [1]  "ASCII coded character semantics" is of course mildly ambiguous
> due to considerations like EOL conventions.  But "you know what I'm
> talking about".

I think I know what you're talking about, but I won't know for sure
unless I explain it back to you.

-- 
Steven

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Nick Coghlan

On 17 January 2014 11:51, Ethan Furman  wrote:
> On 01/16/2014 05:32 PM, Greg wrote:
>>
>>
>> I don't think it matters whether the internal details of that
>> debate make sense to the rest of us. The main thing is that
>> a consensus seems to have been reached on bytes formatting
>> being basically a good thing.
>
>
> And a good thing, too, on both counts!  :)
>
> A few folks have suggested not implementing .format() on bytes;  I've been
> resistant, but then I remembered that format is also a function.
>
> http://docs.python.org/3/library/functions.html?highlight=ascii#format
> ==
> format(value[, format_spec])
>
> Convert a value to a “formatted” representation, as controlled by
> format_spec. The interpretation of format_spec will depend on the type of
> the value argument, however there is a standard formatting syntax that is
> used by most built-in types: Format Specification Mini-Language.
>
> The default format_spec is an empty string which usually gives the same
> effect as calling str(value).
>
> A call to format(value, format_spec) is translated to
> type(value).__format__(format_spec) which bypasses the instance dictionary
> when searching for the value’s __format__() method. A TypeError exception is
> raised if the method is not found or if either the format_spec or the return
> value are not strings.
> ==
>
> Given that, I can relent on .format and just go with .__mod__ .  A low-level
> service for a low-level protocol, what?  ;)

Exactly - while I'm a fan of the new extensible formatting system and
strongly prefer it to printf-style formatting for text, it also has a
whole lot of complexity that is hard to translate to the binary
domain, including the format() builtin and __format__ methods.

Since the relevant use cases appear to be already covered adequately
by printf-style formatting, attempting to translate the flexible text
formatting system as well just becomes additional complexity we don't
need.

I like Stephen Turnbull's suggestion of using "binary formats with
ASCII segments" to distinguish the kind of formats we're talking about
from ASCII compatible text encodings, and I think Python 3.5 will end
up with a suite of solutions that suitably covers all use cases, just
by bringing back printf-style formatting directly to bytes:

* format(), str.format(), str.format_map(): a rich extensible text
formatting system, including date interpolation support
* str.__mod__: retained primarily for backwards compatibility, may
occasionally be used as a text formatting optimisation tool (since the
inflexibility means it will likely always be marginally faster than
the rich formatting system for the cases that it covers)
* bytes.__mod__, bytearray.__mod__: restored in Python 3.5 to simplify
production of data in variable length binary formats that contain
ASCII segments
* the struct module: rich (but not extensible) formatting system for
fixed length binary formats
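
As a rough illustration of how the four tools in this list divide the
work, here is a minimal sketch. It assumes an interpreter where
bytes.__mod__ exists (Python 3.5 or later, per the plan described
above); the field names and values are invented for the example.

```python
import datetime
import struct

# 1. Rich, extensible text formatting (note the date interpolation).
day = datetime.date(2014, 1, 16)
text_rich = "{:%Y-%m-%d} {:>5}".format(day, 42)

# 2. printf-style text formatting on str.
text_printf = "%s=%d" % ("width", 42)

# 3. printf-style formatting on bytes: a variable-length binary format
#    with ASCII segments (an HTTP-like header followed by a raw body).
body = b"hello"
wire = b"Content-Length: %d\r\n\r\n%s" % (len(body), body)

# 4. struct: a fixed-length binary record (unsigned 16-bit, unsigned 32-bit).
record = struct.pack("<HI", 1, 1024)

print(text_rich)
print(text_printf)
print(wire)
print(record)
```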

In Python 2, the binary format with ASCII segments use case was
intermingled with general purpose text formatting on the str type,
which is I think the main reason it has taken us so long to convince
ourselves it is something that is genuinely worth bringing back in a
more limited form in Python 3, rather than just being something we
wanted back because we were used to having it in Python 2.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PEP 461 updates

2014-01-16 Thread Glenn Linderman


On 1/16/2014 9:46 PM, Nick Coghlan wrote:
> On 17 January 2014 11:51, Ethan Furman  wrote:
>> On 01/16/2014 05:32 PM, Greg wrote:
>>> I don't think it matters whether the internal details of that
>>> debate make sense to the rest of us. The main thing is that
>>> a consensus seems to have been reached on bytes formatting
>>> being basically a good thing.
>>
>> And a good thing, too, on both counts!  :)
>>
>> A few folks have suggested not implementing .format() on bytes;  I've been
>> resistant, but then I remembered that format is also a function.
>>
>> http://docs.python.org/3/library/functions.html?highlight=ascii#format
>> ==
>> format(value[, format_spec])
>>
>> Convert a value to a “formatted” representation, as controlled by
>> format_spec. The interpretation of format_spec will depend on the type of
>> the value argument, however there is a standard formatting syntax that is
>> used by most built-in types: Format Specification Mini-Language.
>>
>> The default format_spec is an empty string which usually gives the same
>> effect as calling str(value).
>>
>> A call to format(value, format_spec) is translated to
>> type(value).__format__(format_spec) which bypasses the instance dictionary
>> when searching for the value’s __format__() method. A TypeError exception is
>> raised if the method is not found or if either the format_spec or the return
>> value are not strings.
>> ==
>>
>> Given that, I can relent on .format and just go with .__mod__ .  A low-level
>> service for a low-level protocol, what?  ;)
>
> Exactly - while I'm a fan of the new extensible formatting system and
> strongly prefer it to printf-style formatting for text, it also has a
> whole lot of complexity that is hard to translate to the binary
> domain, including the format() builtin and __format__ methods.
>
> Since the relevant use cases appear to be already covered adequately
> by printf-style formatting, attempting to translate the flexible text
> formatting system as well just becomes additional complexity we don't
> need.
>
> I like Stephen Turnbull's suggestion of using "binary formats with
> ASCII segments" to distinguish the kind of formats we're talking about
> from ASCII compatible text encodings,


I liked that too, and almost said so on his posting, but will say it 
here, instead.



> and I think Python 3.5 will end
> up with a suite of solutions that suitably covers all use cases, just
> by bringing back printf-style formatting directly to bytes:
>
> * format(), str.format(), str.format_map(): a rich extensible text
> formatting system, including date interpolation support
> * str.__mod__: retained primarily for backwards compatibility, may
> occasionally be used as a text formatting optimisation tool (since the
> inflexibility means it will likely always be marginally faster than
> the rich formatting system for the cases that it covers)
> * bytes.__mod__, bytearray.__mod__: restored in Python 3.5 to simplify
> production of data in variable length binary formats that contain
> ASCII segments
> * the struct module: rich (but not extensible) formatting system for
> fixed length binary formats


Adding format codes with variable length could extend the struct module 
to additional uses. C structs, on which it is modeled, often get around 
the difficulty of variable length items by defining one variable length 
item at the end, or by defining offsets in the fixed part that point to 
variable length parts that follow. Such a structure cannot presently be 
created by struct alone.
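
For illustration, the usual workaround today looks something like the
sketch below: struct handles the fixed part and plain concatenation
and slicing handle the variable tail. The record layout (a 16-bit type
code followed by a 16-bit payload length) and the function names are
hypothetical, chosen only for this example.

```python
import struct

# Hypothetical fixed header: record type and payload length,
# both little-endian unsigned 16-bit integers.
HEADER = struct.Struct("<HH")

def pack_record(kind, payload):
    """Fixed header packed by struct, variable-length payload appended by hand."""
    return HEADER.pack(kind, len(payload)) + payload

def unpack_record(data):
    """Read the fixed header, then slice out the payload it describes."""
    kind, length = HEADER.unpack_from(data, 0)
    start = HEADER.size
    return kind, data[start:start + length]

blob = pack_record(7, b"variable length tail")
print(blob)
print(unpack_record(blob))
```

The length bookkeeping lives outside the format string, which is the
part that struct by itself does not express.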



> In Python 2, the binary format with ASCII segments use case was
> intermingled with general purpose text formatting on the str type,
> which is I think the main reason it has taken us so long to convince
> ourselves it is something that is genuinely worth bringing back in a
> more limited form in Python 3, rather than just being something we
> wanted back because we were used to having it in Python 2.
>
> Cheers,
> Nick.



Re: [Python-Dev] PEP 461 - Adding % and {} formatting to bytes

2014-01-16 Thread Steven D'Aprano

On Thu, Jan 16, 2014 at 08:23:13AM -0800, Ethan Furman wrote:

> As I understand it, str.format will call the object's __format__.  So, for 
> example, if I say:
> 
>   u'the value is: %d' % myNum(17)
> 
> then it will be myNum.__format__ that gets called, not int.__format__; 

I seem to have missed something, because I am completely confused... Why 
are you talking about str.format and then showing an example using % instead?

%d calls __str__, not __format__. This is in Python 3.3:

py> class MyNum(int):
...     def __str__(self):
...         print("Calling MyNum.__str__")
...         return super().__str__()
...     def __format__(self, format_spec):
...         print("Calling MyNum.__format__")
...         return super().__format__(format_spec)
...
py> n = MyNum(17)
py> u"%d" % n
Calling MyNum.__str__
'17'

By analogy, if we have a bytes %d formatting, surely it should either:

(1) call type(n).__bytes__(n), which is guaranteed to raise if the 
result isn't ASCII (i.e. like len() raises if the result isn't 
an int); or

(2) call type(n).__str__(n).encode("ascii", "strict").

Personally, I lean towards (2), even though that means you can't have a 
single class provide an ASCII string to b'%d' and a non-ASCII string to 
u'%d'. 
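
A rough model of option (2), purely as a sketch: the helper name
format_d_as_bytes and the FancyNum class are invented here, and this is
not a claim about what any Python release actually does for bytes %d.

```python
def format_d_as_bytes(value):
    """Hypothetical model of option (2): use the type's __str__,
    then insist that the result is pure ASCII."""
    return type(value).__str__(value).encode("ascii", "strict")

class MyNum(int):
    pass

class FancyNum(int):
    def __str__(self):
        # Arabic-Indic digits for 17: valid text, but not ASCII.
        return "\u0661\u0667"

print(format_d_as_bytes(MyNum(17)))

try:
    format_d_as_bytes(FancyNum(17))
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

Under this model a class that returns non-ASCII text fails loudly at
the point of formatting rather than putting unexpected bytes on the
wire.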

> this 
> is precisely what we don't want, since we can't know that myNum is only going 
> to return ASCII characters.

It seems to me that Consenting Adults applies here. If class MyNum 
returns a non-ASCII string, then you ought to get a runtime exception, 
exactly the same as happens with just about every other failure in 
Python. If you don't want that possible exception, then don't use MyNum, 
or explicitly wrap it in a call to int:

b'the value is: %d' % int(MyNum(17))

The *worst* solution would be to completely ignore MyNum.__str__. 
That's a nasty violation of the Principle Of Least Surprise, and will 
lead to confusion ("why isn't my class' __str__ method being called?") 
and bugs.

* Explicit is better than implicit -- better to explicitly 
  wrap MyNum in a call to int() than to have bytes %d 
  automagically do it for you;

* Special cases aren't special enough to break the rules -- 
  bytes %d isn't so special that standard Python rules about
  calling special methods should be ignored;

* Errors should never pass silently -- if MyNum does the wrong 
  thing when used with bytes %d, you should get an exception.

> This is why I would have bytes.__format__, as part of its parsing, call 
> int, index, or float depending on the format code; so the above example 
> would have bytes.__format__ calling int() on myNum(17), 

The above example you give doesn't have any bytes in it. Can you explain 
what you meant to say? I'm guessing you intended this:

b'the value is: %d' % MyNum(17)

rather than using u'' as actually given, but I don't really know.

-- 
Steven