Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Brett Cannon
On Sat, Sep 3, 2016, 17:45 Yury Selivanov  wrote:

>
> >
> > But without that new API (basically what Christian proposed) you'd need
> > to iterate over the list in order to find the object that belongs to
> > Pyjion.
> >
> >
> > Yes.
>
> Yeah, which means the same for my opcode patch... Which unfortunately
> will make things slower :(
>
> >   If we manage to implement my opcode caching idea, we'll have at
> > least two known users of co_extra.  Without a way to claim a particular
> > index in co_extra you will have some overhead to locate your objects.
> >
> >
> > Two things. One, I would want any new API to start with an underscore
> > so people know we can and will change its semantics as necessary. Two,
> > Guido would have to re-accept the PEP as this is a shift in the use of
> > the field if this is how people want to go.
>
>
> Since this isn't a user-facing/public API feature, are we *really*
> forced to accept/implement the PEP before the beta?
>

I say yes since people could want to use it during the beta for testing
(it's Ned's call in the end, though).


> I'd be happy to spend some time tomorrow/Monday to hammer out an
> alternative approach to co_extra. Let's see if we can find a slightly
> better approach.
>

OK!

-brett


> Yury
>


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Greg Ewing

Nick Coghlan wrote:

For synchronous code, that's a relatively easy burden to push back
onto the programmer - assuming fair thread scheduling, a with
statement can reliably ensure prompt resource cleanup.

That assurance goes out the window as soon as you explicitly pause
code execution inside the body of the with statement - it doesn't
matter whether it's via yield, yield from, or await, you've completely
lost that assurance of immediacy.


I don't see how this is any worse than a thread containing
an ordinary with-statement that waits for something that
will never happen. If that's the case, then you've got a
deadlock, and you have more to worry about than resources
not being released.

I think what all this means is that an event loop must
not simply drop async tasks on the floor. If it's asked
to cancel a task, it should do that by throwing an
appropriate exception into it and letting it unwind
itself.

To go along with that, the programmer needs to understand
that he can't just fire off a task and abandon it if it
uses external resources and is not guaranteed to finish
under its own steam. He needs to arrange a timeout or
other mechanism to cancel it if it doesn't complete in a
timely manner.

If those things are done, an async with should be exactly
as adequate for resource cleanup as an ordinary with is in
a thread. It also shouldn't be necessary to have any
special protocol for finalising an async generator; async
with together with a way of throwing an exception into a
task should be all that's needed.

--
Greg
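
To make that concrete, here is a minimal asyncio sketch (an
illustration, not from the thread) showing that cancelling a task
paused inside an "async with" still runs the cleanup:

    import asyncio

    class Resource:
        async def __aenter__(self):
            print('acquired')
            return self

        async def __aexit__(self, *exc):
            print('released')    # runs as CancelledError unwinds the task

    async def worker():
        async with Resource():
            await asyncio.sleep(3600)    # a wait that would never finish

    async def main():
        try:
            # Time out and cancel the task if it doesn't finish promptly.
            await asyncio.wait_for(worker(), timeout=0.1)
        except asyncio.TimeoutError:
            pass

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())    # prints 'acquired', then 'released'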



Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Brett Cannon
Great, thanks!

On Sat, Sep 3, 2016, 17:59 Guido van Rossum  wrote:

> Brett, I have not followed everything here but I have no problem with
> tweaks at this level as long as you are happy with it.
>
> --Guido (mobile)
>
> On Sep 3, 2016 5:39 PM, "Brett Cannon"  wrote:
>
>>
>>
>> On Sat, 3 Sep 2016 at 17:27 Yury Selivanov 
>> wrote:
>>
>>>
>>> On 2016-09-03 5:19 PM, Brett Cannon wrote:
>>> >
>>> >
>>> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov wrote:
>>> >
>>> >
>>> >
>>> > On 2016-09-03 4:15 PM, Christian Heimes wrote:
>>> > > On 2016-09-04 00:03, Yury Selivanov wrote:
>>> > >>
>>> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>>> > >>> Below is the `co_extra` section of PEP 523 with the update
>>> > saying that
>>> > >>> users are expected to put a tuple in the field for easier
>>> > simultaneous
>>> > >>> use of the field.
>>> > >>>
>>> > >>> Since the `co_extra` discussions do not affect CPython itself
>>> I'm
>>> > >>> planning on landing the changes stemming from the PEP probably
>>> > on Monday.
>>> > >> Tuples are immutable.  If you have multiple co_extra users then
>>> > they
>>> > >> will have to either mutate tuple (which isn't always possible,
>>> for
>>> > >> instance, you can't increase size), or to replace it with
>>> > another tuple.
>>> > >>
>>> > >> Creating lists is a bit more expensive, but item access speed
>>> > should be
>>> > >> in the same ballpark.
>>> > >>
>>> > >> Another question -- sorry if this was discussed before -- why
>>> > do we want
>>> > >> a PyObject* there at all?  I.e. why don't we create a dedicated
>>> > struct
>>> > >> CoExtraContainer to manage the stuff in co_extra? My
>>> > understanding is
>>> > >> that the users of co_extra are C-level python optimizers and
>>> > profilers,
>>> > >> which don't need the overhead of CPython API.
>>> >
>>> >
>>> > As Chris pointed out in another email, the overhead is only in the
>>> > allocation, not the iteration/access; if you use the PyTuple macros to
>>> > get the size and index into the tuple, the overhead is negligible.
>>>
>>> Yes, my point was that it's as cheap to use a list as a tuple for
>>> co_extra.  If we decide to store PyObject in co_extra.
>>>
>>> > >>
>>> > >> This way my work to add an extra caching layer (which I'm very
>>> much
>>> > >> willing to continue to work on) wouldn't require another set of
>>> > extra
>>> > >> fields for code objects.
>>> > > Quick idea before I go to bed:
>>> > >
>>> > > You could adopt a similar API to OpenSSL's
>>> CRYPTO_get_ex_new_index()
>>> > > API,
>>> > >
>>> >
>>> https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
>>> > >
>>> > >
>>> > > static int code_index = 0;
>>> > >
>>> > > int PyCodeObject_NewIndex() {
>>> > >  return code_index++;
>>> > > }
>>> > >
>>> > > A library like Pyjion has to acquire an index first. In further
>>> > calls it
>>> > > uses the index as offset into the new co_extra field. Libraries
>>> > don't
>>> > > have to hard-code their offset and two libraries will never
>>> > conflict.
>>> > > PyCode_New() can pre-populate co_extra with a PyTuple of size
>>> > > code_index. This avoids most resizes if you load Pyjion early.
>>> For
>>> > > code_index == 0, leave the field NULL.
>>> >
>>> > Sounds like a very good idea!
>>> >
>>> >
>>> > The problem with this is the pre-population. If you don't get your
>>> > index assigned before the very first code object is allocated then you
>>> > still have to manage the size of the tuple in co_extra. So what this
>>> > would do is avoid the iteration but not the allocation overhead.
>>> >
>>> > If we open up the can of worms in terms of custom functions for this
>>> > (which I was trying to avoid), then you end up with Py_ssize_t
>>> > _PyCode_ExtraIndex(), PyObject *
>>> >   _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int
>>> > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data)
>>> > which does all the right things for creating or resizing the tuple as
>>> > necessary and which I think matches mostly what Nick had proposed
>>> > earlier. But the pseudo-code for _PyCode_GetExtra() would be::
>>> >
>>> >   if co_extra is None:
>>> >       co_extra = (None,) * _next_extra_index
>>> >       return None
>>> >   elif len(co_extra) <= index:
>>> >       ... pad out tuple
>>> >       return None
>>> >   else:
>>> >       return co_extra[index]
>>> >
>>> > Is that going to save us enough to want to have a custom API for this?
>>>
>>> But without that new API (basically what Christian proposed) you'd need
>>> to iterate over the list in order to find the object that belongs to
>>> Pyjion.
>>
>>

Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Guido van Rossum
Brett, I have not followed everything here but I have no problem with
tweaks at this level as long as you are happy with it.

--Guido (mobile)

On Sep 3, 2016 5:39 PM, "Brett Cannon"  wrote:

>
>
> On Sat, 3 Sep 2016 at 17:27 Yury Selivanov 
> wrote:
>
>>
>> On 2016-09-03 5:19 PM, Brett Cannon wrote:
>> >
>> >
>> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov wrote:
>> >
>> >
>> >
>> > On 2016-09-03 4:15 PM, Christian Heimes wrote:
>> > > On 2016-09-04 00:03, Yury Selivanov wrote:
>> > >>
>> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>> > >>> Below is the `co_extra` section of PEP 523 with the update
>> > saying that
>> > >>> users are expected to put a tuple in the field for easier
>> > simultaneous
>> > >>> use of the field.
>> > >>>
>> > >>> Since the `co_extra` discussions do not affect CPython itself
>> I'm
>> > >>> planning on landing the changes stemming from the PEP probably
>> > on Monday.
>> > >> Tuples are immutable.  If you have multiple co_extra users then
>> > they
>> > >> will have to either mutate tuple (which isn't always possible,
>> for
>> > >> instance, you can't increase size), or to replace it with
>> > another tuple.
>> > >>
>> > >> Creating lists is a bit more expensive, but item access speed
>> > should be
>> > >> in the same ballpark.
>> > >>
>> > >> Another question -- sorry if this was discussed before -- why
>> > do we want
>> > >> a PyObject* there at all?  I.e. why don't we create a dedicated
>> > struct
>> > >> CoExtraContainer to manage the stuff in co_extra? My
>> > understanding is
>> > >> that the users of co_extra are C-level python optimizers and
>> > profilers,
>> > >> which don't need the overhead of CPython API.
>> >
>> >
>> > As Chris pointed out in another email, the overhead is only in the
>> > allocation, not the iteration/access; if you use the PyTuple macros to
>> > get the size and index into the tuple, the overhead is negligible.
>>
>> Yes, my point was that it's as cheap to use a list as a tuple for
>> co_extra.  If we decide to store PyObject in co_extra.
>>
>> > >>
>> > >> This way my work to add an extra caching layer (which I'm very
>> much
>> > >> willing to continue to work on) wouldn't require another set of
>> > extra
>> > >> fields for code objects.
>> > > Quick idea before I go to bed:
>> > >
>> > > You could adopt a similar API to OpenSSL's
>> CRYPTO_get_ex_new_index()
>> > > API,
>> > >
>> > https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
>> > >
>> > >
>> > > static int code_index = 0;
>> > >
>> > > int PyCodeObject_NewIndex() {
>> > >  return code_index++;
>> > > }
>> > >
>> > > A library like Pyjion has to acquire an index first. In further
>> > calls it
>> > > uses the index as offset into the new co_extra field. Libraries
>> > don't
>> > > have to hard-code their offset and two libraries will never
>> > conflict.
>> > > PyCode_New() can pre-populate co_extra with a PyTuple of size
>> > > code_index. This avoids most resizes if you load Pyjion early. For
>> > > code_index == 0, leave the field NULL.
>> >
>> > Sounds like a very good idea!
>> >
>> >
>> > The problem with this is the pre-population. If you don't get your
>> > index assigned before the very first code object is allocated then you
>> > still have to manage the size of the tuple in co_extra. So what this
>> > would do is avoid the iteration but not the allocation overhead.
>> >
>> > If we open up the can of worms in terms of custom functions for this
>> > (which I was trying to avoid), then you end up with Py_ssize_t
>> > _PyCode_ExtraIndex(), PyObject *
>> >   _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int
>> > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data)
>> > which does all the right things for creating or resizing the tuple as
>> > necessary and which I think matches mostly what Nick had proposed
>> > earlier. But the pseudo-code for _PyCode_GetExtra() would be::
>> >
>> >   if co_extra is None:
>> >       co_extra = (None,) * _next_extra_index
>> >       return None
>> >   elif len(co_extra) <= index:
>> >       ... pad out tuple
>> >       return None
>> >   else:
>> >       return co_extra[index]
>> >
>> > Is that going to save us enough to want to have a custom API for this?
>>
>> But without that new API (basically what Christian proposed) you'd need
>> to iterate over the list in order to find the object that belongs to
>> Pyjion.
>
>
> Yes.
>
>
>>   If we manage to implement my opcode caching idea, we'll have at
>> least two known users of co_extra.  Without a way to claim a particular
>> index in co_extra you will have some overhead to locate your objects.

Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Yury Selivanov




But without that new API (basically what Christian proposed) you'd need
to iterate over the list in order to find the object that belongs to
Pyjion.


Yes.


Yeah, which means the same for my opcode patch... Which unfortunately 
will make things slower :(



  If we manage to implement my opcode caching idea, we'll have at
least two known users of co_extra.  Without a way to claim a particular
index in co_extra you will have some overhead to locate your objects.


Two things. One, I would want any new API to start with an underscore 
so people know we can and will change its semantics as necessary. Two, 
Guido would have to re-accept the PEP as this is a shift in the use of 
the field if this is how people want to go.



Since this isn't a user-facing/public API feature, are we *really* 
forced to accept/implement the PEP before the beta?


I'd be happy to spend some time tomorrow/Monday to hammer out an 
alternative approach to co_extra. Let's see if we can find a slightly 
better approach.


Yury


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Brett Cannon
On Sat, 3 Sep 2016 at 17:27 Yury Selivanov  wrote:

>
> On 2016-09-03 5:19 PM, Brett Cannon wrote:
> >
> >
> > On Sat, 3 Sep 2016 at 16:43 Yury Selivanov wrote:
> >
> >
> >
> > On 2016-09-03 4:15 PM, Christian Heimes wrote:
> > > On 2016-09-04 00:03, Yury Selivanov wrote:
> > >>
> > >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
> > >>> Below is the `co_extra` section of PEP 523 with the update
> > saying that
> > >>> users are expected to put a tuple in the field for easier
> > simultaneous
> > >>> use of the field.
> > >>>
> > >>> Since the `co_extra` discussions do not affect CPython itself I'm
> > >>> planning on landing the changes stemming from the PEP probably
> > on Monday.
> > >> Tuples are immutable.  If you have multiple co_extra users then
> > they
> > >> will have to either mutate tuple (which isn't always possible, for
> > >> instance, you can't increase size), or to replace it with
> > another tuple.
> > >>
> > >> Creating lists is a bit more expensive, but item access speed
> > should be
> > >> in the same ballpark.
> > >>
> > >> Another question -- sorry if this was discussed before -- why
> > do we want
> > >> a PyObject* there at all?  I.e. why don't we create a dedicated
> > struct
> > >> CoExtraContainer to manage the stuff in co_extra? My
> > understanding is
> > >> that the users of co_extra are C-level python optimizers and
> > profilers,
> > >> which don't need the overhead of CPython API.
> >
> >
> > As Chris pointed out in another email, the overhead is only in the
> > allocation, not the iteration/access; if you use the PyTuple macros to
> > get the size and index into the tuple, the overhead is negligible.
>
> Yes, my point was that it's as cheap to use a list as a tuple for
> co_extra.  If we decide to store PyObject in co_extra.
>
> > >>
> > >> This way my work to add an extra caching layer (which I'm very
> much
> > >> willing to continue to work on) wouldn't require another set of
> > extra
> > >> fields for code objects.
> > > Quick idea before I go to bed:
> > >
> > > You could adopt a similar API to OpenSSL's
> CRYPTO_get_ex_new_index()
> > > API,
> > >
> >
> https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
> > >
> > >
> > > static int code_index = 0;
> > >
> > > int PyCodeObject_NewIndex() {
> > >  return code_index++;
> > > }
> > >
> > > A library like Pyjion has to acquire an index first. In further
> > calls it
> > > uses the index as offset into the new co_extra field. Libraries
> > don't
> > > have to hard-code their offset and two libraries will never
> > conflict.
> > > PyCode_New() can pre-populate co_extra with a PyTuple of size
> > > code_index. This avoids most resizes if you load Pyjion early. For
> > > code_index == 0, leave the field NULL.
> >
> > Sounds like a very good idea!
> >
> >
> > The problem with this is the pre-population. If you don't get your
> > index assigned before the very first code object is allocated then you
> > still have to manage the size of the tuple in co_extra. So what this
> > would do is avoid the iteration but not the allocation overhead.
> >
> > If we open up the can of worms in terms of custom functions for this
> > (which I was trying to avoid), then you end up with Py_ssize_t
> > _PyCode_ExtraIndex(), PyObject *
> >   _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int
> > _PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data)
> > which does all the right things for creating or resizing the tuple as
> > necessary and which I think matches mostly what Nick had proposed
> > earlier. But the pseudo-code for _PyCode_GetExtra() would be::
> >
> >   if co_extra is None:
> >       co_extra = (None,) * _next_extra_index
> >       return None
> >   elif len(co_extra) <= index:
> >       ... pad out tuple
> >       return None
> >   else:
> >       return co_extra[index]
> >
> > Is that going to save us enough to want to have a custom API for this?
>
> But without that new API (basically what Christian proposed) you'd need
> to iterate over the list in order to find the object that belongs to
> Pyjion.


Yes.


>   If we manage to implement my opcode caching idea, we'll have at
> least two known users of co_extra.  Without a way to claim a particular
> index in co_extra you will have some overhead to locate your objects.
>

Two things. One, I would want any new API to start with an underscore so
people know we can and will change its semantics as necessary. Two, Guido
would have to re-accept the PEP as this is a shift in the use of the field
if this is how people want to go.

Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Yury Selivanov


On 2016-09-03 5:19 PM, Brett Cannon wrote:



On Sat, 3 Sep 2016 at 16:43 Yury Selivanov wrote:




On 2016-09-03 4:15 PM, Christian Heimes wrote:
> On 2016-09-04 00:03, Yury Selivanov wrote:
>>
>> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>>> Below is the `co_extra` section of PEP 523 with the update
saying that
>>> users are expected to put a tuple in the field for easier
simultaneous
>>> use of the field.
>>>
>>> Since the `co_extra` discussions do not affect CPython itself I'm
>>> planning on landing the changes stemming from the PEP probably
on Monday.
>> Tuples are immutable.  If you have multiple co_extra users then
they
>> will have to either mutate tuple (which isn't always possible, for
>> instance, you can't increase size), or to replace it with
another tuple.
>>
>> Creating lists is a bit more expensive, but item access speed
should be
>> in the same ballpark.
>>
>> Another question -- sorry if this was discussed before -- why
do we want
>> a PyObject* there at all?  I.e. why don't we create a dedicated
struct
>> CoExtraContainer to manage the stuff in co_extra? My
understanding is
>> that the users of co_extra are C-level python optimizers and
profilers,
>> which don't need the overhead of CPython API.


As Chris pointed out in another email, the overhead is only in the
allocation, not the iteration/access; if you use the PyTuple macros to
get the size and index into the tuple, the overhead is negligible.


Yes, my point was that it's as cheap to use a list as a tuple for
co_extra, if we decide to store a PyObject in co_extra.



>>
>> This way my work to add an extra caching layer (which I'm very much
>> willing to continue to work on) wouldn't require another set of
extra
>> fields for code objects.
> Quick idea before I go to bed:
>
> You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index()
> API,
>
https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
>
>
> static int code_index = 0;
>
> int PyCodeObject_NewIndex() {
>  return code_index++;
> }
>
> A library like Pyjion has to acquire an index first. In further
calls it
> uses the index as offset into the new co_extra field. Libraries
don't
> have to hard-code their offset and two libraries will never
conflict.
> PyCode_New() can pre-populate co_extra with a PyTuple of size
> code_index. This avoids most resizes if you load Pyjion early. For
> code_index == 0, leave the field NULL.

Sounds like a very good idea!


The problem with this is the pre-population. If you don't get your 
index assigned before the very first code object is allocated then you 
still have to manage the size of the tuple in co_extra. So what this 
would do is avoid the iteration but not the allocation overhead.


If we open up the can of worms in terms of custom functions for this 
(which I was trying to avoid), then you end up with Py_ssize_t 
_PyCode_ExtraIndex(), PyObject *
  _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int 
_PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data) 
which does all the right things for creating or resizing the tuple as 
necessary and which I think matches mostly what Nick had proposed 
earlier. But the pseudo-code for _PyCode_GetExtra() would be::


  if co_extra is None:
      co_extra = (None,) * _next_extra_index
      return None
  elif len(co_extra) <= index:
      ... pad out tuple
      return None
  else:
      return co_extra[index]

Is that going to save us enough to want to have a custom API for this?


But without that new API (basically what Christian proposed) you'd need 
to iterate over the list in order to find the object that belongs to 
Pyjion.  If we manage to implement my opcode caching idea, we'll have at 
least two known users of co_extra.  Without a way to claim a particular 
index in co_extra you will have some overhead to locate your objects.


Yury





Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Brett Cannon
On Sat, 3 Sep 2016 at 16:55 Yury Selivanov  wrote:

>
>
> On 2016-09-03 4:13 PM, Chris Angelico wrote:
> > On Sun, Sep 4, 2016 at 8:03 AM, Yury Selivanov 
> wrote:
> >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
> >>> Below is the `co_extra` section of PEP 523 with the update saying that
> >>> users are expected to put a tuple in the field for easier simultaneous
> use
> >>> of the field.
> >>>
> >>> Since the `co_extra` discussions do not affect CPython itself I'm
> planning
> >>> on landing the changes stemming from the PEP probably on Monday.
> >>
> >> Tuples are immutable.  If you have multiple co_extra users then they
> will
> >> have to either mutate tuple (which isn't always possible, for instance,
> you
> >> can't increase size), or to replace it with another tuple.
> > Replace it, but only as they register themselves with a particular
> > function. Imagine a profiler doing something vaguely like this:
>
> "Replacing" makes it error prone to cache the pointer even for small
> periods of time. Defining co_extra using Python C API forces us to
> acquire the GIL etc (aside from other performance penalties). Although
> we probably would recommend to use the GIL anyways, I'm not sure tuple
> really simplifies anything here.
>
> >
> > class FunctionStats:
> >     def __init__(self):
> >         self.info = [whatever, whatever, blah blah]
> >
> > def profile(func):
> >     """Decorator to mark a function for profiling"""
> >     func.__code__.co_extra += (FunctionStats(),)
> >     return func
> >
> > Tuple immutability impacts the initialization only. After that, you
> > just iterate over it.
>
> I wasn't aware we wanted to expose co_extra to Python land.
>

We are most definitely not exposing the field to Python code.


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Brett Cannon
On Sat, 3 Sep 2016 at 16:43 Yury Selivanov  wrote:

>
>
> On 2016-09-03 4:15 PM, Christian Heimes wrote:
> > On 2016-09-04 00:03, Yury Selivanov wrote:
> >>
> >> On 2016-09-03 12:27 PM, Brett Cannon wrote:
> >>> Below is the `co_extra` section of PEP 523 with the update saying that
> >>> users are expected to put a tuple in the field for easier simultaneous
> >>> use of the field.
> >>>
> >>> Since the `co_extra` discussions do not affect CPython itself I'm
> >>> planning on landing the changes stemming from the PEP probably on
> Monday.
> >> Tuples are immutable.  If you have multiple co_extra users then they
> >> will have to either mutate tuple (which isn't always possible, for
> >> instance, you can't increase size), or to replace it with another tuple.
> >>
> >> Creating lists is a bit more expensive, but item access speed should be
> >> in the same ballpark.
> >>
> >> Another question -- sorry if this was discussed before -- why do we want
> >> a PyObject* there at all?  I.e. why don't we create a dedicated struct
> >> CoExtraContainer to manage the stuff in co_extra?  My understanding is
> >> that the users of co_extra are C-level python optimizers and profilers,
> >> which don't need the overhead of CPython API.
>

As Chris pointed out in another email, the overhead is only in the
allocation, not the iteration/access; if you use the PyTuple macros to get
the size and index into the tuple, the overhead is negligible.


> >>
> >> This way my work to add an extra caching layer (which I'm very much
> >> willing to continue to work on) wouldn't require another set of extra
> >> fields for code objects.
> > Quick idea before I go to bed:
> >
> > You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index()
> > API,
> >
> https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html
> >
> >
> > static int code_index = 0;
> >
> > int PyCodeObject_NewIndex() {
> >  return code_index++;
> > }
> >
> > A library like Pyjion has to acquire an index first. In further calls it
> > uses the index as offset into the new co_extra field. Libraries don't
> > have to hard-code their offset and two libraries will never conflict.
> > PyCode_New() can pre-populate co_extra with a PyTuple of size
> > code_index. This avoids most resizes if you load Pyjion early. For
> > code_index == 0, leave the field NULL.
>
> Sounds like a very good idea!
>

The problem with this is the pre-population. If you don't get your index
assigned before the very first code object is allocated then you still have
to manage the size of the tuple in co_extra. So what this would do is avoid
the iteration but not the allocation overhead.

If we open up the can of worms in terms of custom functions for this (which
I was trying to avoid), then you end up with Py_ssize_t
_PyCode_ExtraIndex(), PyObject *
  _PyCode_GetExtra(PyCodeObject *code, Py_ssize_t index), and int
_PyCode_SetExtra(PyCodeObject *code, Py_ssize_t index, PyObject *data)
which does all the right things for creating or resizing the tuple as
necessary and which I think matches mostly what Nick had proposed earlier.
But the pseudo-code for _PyCode_GetExtra() would be::

  if co_extra is None:
      co_extra = (None,) * _next_extra_index
      return None
  elif len(co_extra) <= index:
      ... pad out tuple
      return None
  else:
      return co_extra[index]

Is that going to save us enough to want to have a custom API for this?

-Brett


>
> Yury
>
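
An illustrative pure-Python model of the three private functions
sketched above (the names come from the message; the padding behaviour
is an assumption based on the pseudo-code):

    _next_extra_index = 0    # interpreter-wide counter
    _co_extra = {}           # models the co_extra slot, keyed by code object

    def _PyCode_ExtraIndex():
        """Claim the next free index; each optimizer calls this once."""
        global _next_extra_index
        index = _next_extra_index
        _next_extra_index += 1
        return index

    def _PyCode_SetExtra(code, index, data):
        extra = _co_extra.get(code) or ()
        if len(extra) <= index:    # pad out to the claimed index
            extra += (None,) * (index + 1 - len(extra))
        _co_extra[code] = extra[:index] + (data,) + extra[index + 1:]

    def _PyCode_GetExtra(code, index):
        extra = _co_extra.get(code)
        if extra is None or len(extra) <= index:
            return None
        return extra[index]

    def f(): pass
    MY_INDEX = _PyCode_ExtraIndex()    # e.g. what Pyjion would do at import
    _PyCode_SetExtra(f.__code__, MY_INDEX, {'jitted': True})
    assert _PyCode_GetExtra(f.__code__, MY_INDEX) == {'jitted': True}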


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Chris Angelico
On Sun, Sep 4, 2016 at 9:49 AM, Yury Selivanov  wrote:
>
>
> On 2016-09-03 4:13 PM, Chris Angelico wrote:
>> Replace it, but only as they register themselves with a particular
>> function. Imagine a profiler doing something vaguely like this:
>
>
> "Replacing" makes it error prone to cache the pointer even for small periods
> of time. Defining co_extra using Python C API forces us to acquire the GIL
> etc (aside from other performance penalties). Although we probably would
> recommend to use the GIL anyways, I'm not sure tuple really simplifies
> anything here.

If everyone behaves properly, it should be safe.

tuple_pointer = co_extra
max_index = len(tuple_pointer)
is tuple_pointer[0] mine? No
-- someone appends to the tuple --
is tuple_pointer[1] mine? No

The only effect of caching is that, in effect, mutations aren't seen
till the end of the iteration - a short time anyway.

>> class FunctionStats:
>>     def __init__(self):
>>         self.info = [whatever, whatever, blah blah]
>>
>> def profile(func):
>>     """Decorator to mark a function for profiling"""
>>     func.__code__.co_extra += (FunctionStats(),)
>>     return func
>>
>> Tuple immutability impacts the initialization only. After that, you
>> just iterate over it.
>
>
> I wasn't aware we wanted to expose co_extra to Python land.  I'm not
> convinced it's a good idea, because exposing, say, Pyjion JIT state to
> Python doesn't make any sense.  At least for Python 3.6 I don't think we
> would want to expose this field.
>
> Moreover, profiling Python with a pure Python profiler is kind of slow...
> I'm sure people use C for that anyways.

This is what I get for overly embracing the notion that Python is
executable pseudo-code :) Yes, this would normally be happening in C,
but notionally, it'll be like that.

ChrisA


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Yury Selivanov



On 2016-09-03 4:13 PM, Chris Angelico wrote:

On Sun, Sep 4, 2016 at 8:03 AM, Yury Selivanov  wrote:

On 2016-09-03 12:27 PM, Brett Cannon wrote:

Below is the `co_extra` section of PEP 523 with the update saying that
users are expected to put a tuple in the field for easier simultaneous use
of the field.

Since the `co_extra` discussions do not affect CPython itself I'm planning
on landing the changes stemming from the PEP probably on Monday.


Tuples are immutable.  If you have multiple co_extra users then they will
have to either mutate tuple (which isn't always possible, for instance, you
can't increase size), or to replace it with another tuple.

Replace it, but only as they register themselves with a particular
function. Imagine a profiler doing something vaguely like this:


"Replacing" makes it error prone to cache the pointer even for small 
periods of time. Defining co_extra using Python C API forces us to 
acquire the GIL etc (aside from other performance penalties). Although 
we probably would recommend to use the GIL anyways, I'm not sure tuple 
really simplifies anything here.




class FunctionStats:
    def __init__(self):
        self.info = [whatever, whatever, blah blah]

def profile(func):
    """Decorator to mark a function for profiling"""
    func.__code__.co_extra += (FunctionStats(),)
    return func

Tuple immutability impacts the initialization only. After that, you
just iterate over it.


I wasn't aware we wanted to expose co_extra to Python land.  I'm not 
convinced it's a good idea, because exposing, say, Pyjion JIT state to 
Python doesn't make any sense.  At least for Python 3.6 I don't think we 
would want to expose this field.


Moreover, profiling Python with a pure Python profiler is kind of 
slow...  I'm sure people use C for that anyways.


Yury



Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Yury Selivanov



On 2016-09-03 4:15 PM, Christian Heimes wrote:

On 2016-09-04 00:03, Yury Selivanov wrote:


On 2016-09-03 12:27 PM, Brett Cannon wrote:

Below is the `co_extra` section of PEP 523 with the update saying that
users are expected to put a tuple in the field for easier simultaneous
use of the field.

Since the `co_extra` discussions do not affect CPython itself I'm
planning on landing the changes stemming from the PEP probably on Monday.

Tuples are immutable.  If you have multiple co_extra users then they
will have to either mutate tuple (which isn't always possible, for
instance, you can't increase size), or to replace it with another tuple.

Creating lists is a bit more expensive, but item access speed should be
in the same ballpark.

Another question -- sorry if this was discussed before -- why do we want
a PyObject* there at all?  I.e. why don't we create a dedicated struct
CoExtraContainer to manage the stuff in co_extra?  My understanding is
that the users of co_extra are C-level python optimizers and profilers,
which don't need the overhead of CPython API.

This way my work to add an extra caching layer (which I'm very much
willing to continue to work on) wouldn't require another set of extra
fields for code objects.

Quick idea before I go to bed:

You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index()
API,
https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html


static int code_index = 0;

int PyCodeObject_NewIndex() {
 return code_index++;
}

A library like Pyjion has to acquire an index first. In further calls it
uses the index as offset into the new co_extra field. Libraries don't
have to hard-code their offset and two libraries will never conflict.
PyCode_New() can pre-populate co_extra with a PyTuple of size
code_index. This avoids most resizes if you load Pyjion early. For
code_index == 0, leave the field NULL.


Sounds like a very good idea!

Yury


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Christian Heimes
On 2016-09-04 00:03, Yury Selivanov wrote:
> 
> 
> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>> Below is the `co_extra` section of PEP 523 with the update saying that
>> users are expected to put a tuple in the field for easier simultaneous
>> use of the field.
>>
>> Since the `co_extra` discussions do not affect CPython itself I'm
>> planning on landing the changes stemming from the PEP probably on Monday.
> 
> Tuples are immutable.  If you have multiple co_extra users then they
> will have to either mutate tuple (which isn't always possible, for
> instance, you can't increase size), or to replace it with another tuple.
> 
> Creating lists is a bit more expensive, but item access speed should be
> in the same ballpark.
> 
> Another question -- sorry if this was discussed before -- why do we want
> a PyObject* there at all?  I.e. why don't we create a dedicated struct
> CoExtraContainer to manage the stuff in co_extra?  My understanding is
> that the users of co_extra are C-level python optimizers and profilers,
> which don't need the overhead of CPython API.
> 
> This way my work to add an extra caching layer (which I'm very much
> willing to continue to work on) wouldn't require another set of extra
> fields for code objects.

Quick idea before I go to bed:

You could adopt a similar API to OpenSSL's CRYPTO_get_ex_new_index()
API,
https://www.openssl.org/docs/manmaster/crypto/CRYPTO_get_ex_new_index.html


static int code_index = 0;

int PyCodeObject_NewIndex() {
    return code_index++;
}

A library like Pyjion has to acquire an index first. In further calls it
uses the index as offset into the new co_extra field. Libraries don't
have to hard-code their offset and two libraries will never conflict.
PyCode_New() can pre-populate co_extra with a PyTuple of size
code_index. This avoids most resizes if you load Pyjion early. For
code_index == 0, leave the field NULL.

Christian
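
A tiny Python model of the index-reservation idea (illustrative only;
the real API would live in C):

    code_index = 0

    def PyCodeObject_NewIndex():
        global code_index
        index = code_index
        code_index += 1
        return index

    # Each extension claims its slot once at import time, so two
    # libraries can never collide.
    PYJION_INDEX = PyCodeObject_NewIndex()      # -> 0
    PROFILER_INDEX = PyCodeObject_NewIndex()    # -> 1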


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Chris Angelico
On Sun, Sep 4, 2016 at 8:03 AM, Yury Selivanov  wrote:
> On 2016-09-03 12:27 PM, Brett Cannon wrote:
>>
>> Below is the `co_extra` section of PEP 523 with the update saying that
>> users are expected to put a tuple in the field for easier simultaneous use
>> of the field.
>>
>> Since the `co_extra` discussions do not affect CPython itself I'm planning
>> on landing the changes stemming from the PEP probably on Monday.
>
>
> Tuples are immutable.  If you have multiple co_extra users then they will
> have to either mutate tuple (which isn't always possible, for instance, you
> can't increase size), or to replace it with another tuple.

Replace it, but only as they register themselves with a particular
function. Imagine a profiler doing something vaguely like this:

class FunctionStats:
    def __init__(self):
        self.info = [whatever, whatever, blah blah]

def profile(func):
    """Decorator to mark a function for profiling"""
    func.__code__.co_extra += (FunctionStats(),)
    return func

Tuple immutability impacts the initialization only. After that, you
just iterate over it.

ChrisA


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Koos Zevenhoven
On Sun, Sep 4, 2016 at 1:23 AM, Ivan Levkivskyi  wrote:
> On 4 September 2016 at 00:11, Random832  wrote:
>>
>> On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote:
>> > I guess one reason I don't like bchr (nor chrb, really) is that they
>> > look just like a random sequence of letters in builtins, but not
>> > recognizable the way asdf would be.
>> >
>> > I guess I have one last pair of suggestions for the name of this
>> > function: bytes.chr or bytes.char.
>>
>> What about byte? Like, not bytes.byte, just builtins.byte.
>
>
> I like this option, it would be very "symmetric" to have, compare:
>
> >>> chr(42)
> '*'
> >>> str()
> ''
>
> with this:
>
> >>> byte(42)
> b'*'
> >>> bytes()
> b''
>
> It is easy to explain and remember this.

In one way, I like it, but on the other hand, indexing a bytes gives
an integer, so maybe a 'byte' is just an integer in range(256). Also,
having both byte and bytes would be a slight annoyance with
autocomplete.

-- Koos


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Ivan Levkivskyi
On 4 September 2016 at 00:11, Random832  wrote:

> On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote:
> > I guess one reason I don't like bchr (nor chrb, really) is that they
> > look just like a random sequence of letters in builtins, but not
> > recognizable the way asdf would be.
> >
> > I guess I have one last pair of suggestions for the name of this
> > function: bytes.chr or bytes.char.
>
> What about byte? Like, not bytes.byte, just builtins.byte.
>

I like this option, it would be very "symmetric" to have, compare:

>>> chr(42)
'*'
>>> str()
''

with this:

>>> byte(42)
b'*'
>>> bytes()
b''

It is easy to explain and remember this.

--
Ivan
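
Such a builtin would be a one-liner in pure Python (a sketch; the
range check mirrors what bytes([i]) already enforces):

    def byte(i):
        """Return a length-1 bytes object; the inverse of ord() for bytes."""
        if not 0 <= i <= 255:
            raise ValueError('byte must be in range(0, 256)')
        return bytes([i])

    assert byte(42) == b'*'
    assert ord(byte(42)) == 42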


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Koos Zevenhoven
On Sat, Sep 3, 2016 at 6:41 PM, Ethan Furman  wrote:
>>>
>>> Open Questions
>>> ==
>>>
>>> Do we add ``iterbytes`` to ``memoryview``, or modify
>>> ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation?
>>> Or do we ignore memory for now and add it later?
>>
>>
>> Apparently memoryview.cast('s') comes from Nick Coghlan:
>>
>> .
>> However, since 3.5 (https://bugs.python.org/issue15944) you can call
>> cast("c") on most memoryviews, which I think already does what you
>> want:
>>
>> >>> tuple(memoryview(b"ABC").cast("c"))
>> (b'A', b'B', b'C')
>
>
> Nice!
>

Indeed! Exposing this as bytes_instance.chars would make porting from
Python 2 really simple. Of course even better would be if slicing the
view would return bytes, so the porting rule would be the same for all
bytes subscripting:

py2str[SOMETHING]

becomes

py3bytes.chars[SOMETHING]

With the "c" memoryview there will be a distinction between slicing
and indexing.

And Random832 seems to be making some good points.

--- Koos
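
As an illustration of that .chars idea (a hypothetical attribute, not
part of any current proposal), a wrapper where both indexing and
slicing return bytes:

    class CharsView:
        """Wrap bytes so view[i] and view[i:j] both return bytes."""
        def __init__(self, data):
            self._data = data

        def __getitem__(self, key):
            if isinstance(key, slice):
                return self._data[key]       # slicing bytes already gives bytes
            return bytes([self._data[key]])  # indexing: length-1 bytes, not int

    view = CharsView(b'ABC')
    assert view[0] == b'A'     # Python 2 str-style indexing
    assert view[1:] == b'BC'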


> --
> ~Ethan~
>



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Random832
On Sat, Sep 3, 2016, at 18:06, Koos Zevenhoven wrote:
> I guess one reason I don't like bchr (nor chrb, really) is that they
> look just like a random sequence of letters in builtins, but not
> recognizable the way asdf would be.
> 
> I guess I have one last pair of suggestions for the name of this
> function: bytes.chr or bytes.char.

What about byte? Like, not bytes.byte, just builtins.byte.


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Random832


On Sat, Sep 3, 2016, at 08:08, Martin Panter wrote:
> On 1 September 2016 at 19:36, Ethan Furman  wrote:
> > Deprecation of current "zero-initialised sequence" behaviour without removal
> > 
> >
> > Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> > argument and interpret it as meaning to create a zero-initialised sequence
> > of the given size::
> >
> > >>> bytes(3)
> > b'\x00\x00\x00'
> > >>> bytearray(3)
> > bytearray(b'\x00\x00\x00')
> >
> > This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
> > it in place for at least as long as Python 2.7 is supported, possibly
> > indefinitely.
> 
> Can you clarify what “deprecate” means? Just add a note in the
> documentation, or make calls trigger a DeprecationWarning as well?
> Having bytearray(n) trigger a DeprecationWarning would be a minor
> annoyance for code being compatible with Python 2 and 3, since
> bytearray(n) is supported in Python 2.

I don't think bytearray(n) should be deprecated. I don't think that
deprecating bytes(n) should entail also deprecating bytearray(n).

If I were designing these classes from scratch, I would not feel any
impulse to make their constructors take the same arguments or have the
same semantics, and I'm a bit unclear on what the reason for this
decision was.

I also don't think bytes.fromcount(n) is necessary. What's wrong with
b'\0'*n? I could swear this has been answered before, but I don't recall
what the answer was. I don't think the rationale mentioned in the PEP is
an adequate explanation: it references an earlier decision about a
conceptually different class (it's an operation that's much more common
with mutable classes than immutable ones - when's the last time you did
(None,)*n relative to [None]*n), without actually explaining the real
reason for either underlying decision (having bytearray(n) and having
both classes take the same constructor arguments).

I think that the functions we should add/keep are:
bytes(values: Union[bytes, bytearray, Iterable[int]])
bytearray(count: int)
bytearray(values: Union[bytes, bytearray, Iterable[int]])
bchr(integer)

If, incidentally, we're going to add a .fromsize method, it'd be nice to
add a way to provide a fill value other than 0. Also, maybe we should
also add it for list and tuple (with the default value None)?
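
A sketch of possible semantics for that (the name, signature and
default fill value are all unsettled, not an agreed API):

    def fromsize(cls, size, fill=b'\x00'):
        if size < 0:
            raise ValueError('negative count')
        return cls(fill * size)

    assert fromsize(bytes, 3) == b'\x00\x00\x00'
    assert fromsize(bytearray, 2, fill=b'\xff') == bytearray(b'\xff\xff')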

For the (string, encoding) signatures, there's no good reason to keep
them [TOOWTDI is str.encode] but no good reason to get rid of them
either.


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Koos Zevenhoven
On Sat, Sep 3, 2016 at 7:59 PM, Nick Coghlan  wrote:
> On 3 September 2016 at 03:54, Koos Zevenhoven  wrote:
>> chrb seems to be more in line with some bytes versions in for instance os
>> than bchr.
>
> The mnemonic for the current name in the PEP is that bchr is to chr as
> b"" is to "". The PEP should probably say that in addition to pointing
> out the 'unichr' Python 2 inspiration, though.

Thanks for explaining. Indeed I hope that unichr does not affect any
naming decisions that will remain in the language for a long time.

> The other big difference between this and the os module case, is that
> the resulting builtin constructor pairs here are str/chr (arbitrary
> text, single code point) and bytes/bchr (arbitrary binary data, single
> binary octet). By contrast, os.getcwd() and os.getcwdb() (and similar
> APIs) are both referring to the same operating system level operation,
> they're just requesting a different return type for the data.

But chr and "bchr" are also requesting a different return type. The
difference is that the data is not coming from an os-level operation
but from an int.

I guess one reason I don't like bchr (nor chrb, really) is that they
look just like a random sequence of letters in builtins, but not
recognizable the way asdf would be.

I guess I have one last pair of suggestions for the name of this
function: bytes.chr or bytes.char.

-- Koos


> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +


Re: [Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Yury Selivanov



On 2016-09-03 12:27 PM, Brett Cannon wrote:
Below is the `co_extra` section of PEP 523 with the update saying that 
users are expected to put a tuple in the field for easier simultaneous 
use of the field.


Since the `co_extra` discussions do not affect CPython itself I'm 
planning on landing the changes stemming from the PEP probably on Monday.


Tuples are immutable.  If you have multiple co_extra users then they
will have to either mutate the tuple (which isn't always possible; for
instance, you can't increase its size) or replace it with another tuple.


Creating lists is a bit more expensive, but item access speed should be 
in the same ballpark.


Another question -- sorry if this was discussed before -- why do we want 
a PyObject* there at all?  I.e. why don't we create a dedicated struct 
CoExtraContainer to manage the stuff in co_extra?  My understanding is 
that the users of co_extra are C-level python optimizers and profilers, 
which don't need the overhead of CPython API.


This way my work to add an extra caching layer (which I'm very much 
willing to continue to work on) wouldn't require another set of extra 
fields for code objects.


Yury



Re: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations

2016-09-03 Thread Yury Selivanov



On 2016-08-30 2:20 PM, Guido van Rossum wrote:

I'm happy to present PEP 526 for your collective review:
https://www.python.org/dev/peps/pep-0526/ (HTML)
https://github.com/python/peps/blob/master/pep-0526.txt (source)

There's also an implementation ready:
https://github.com/ilevkivskyi/cpython/tree/pep-526

I don't want to post the full text here but I encourage feedback on
the high-order ideas, including but not limited to

- Whether (given PEP 484's relative success) it's worth adding syntax
for variable/attribute annotations.

- Whether the keyword-free syntax idea proposed here is best:
   NAME: TYPE
   TARGET: TYPE = VALUE


I'm in favour of the PEP, and I like the syntax.  I find it much better
than any previously discussed alternatives.


Static typing is becoming increasingly more popular, and the benefits of 
using static type checkers for big code bases are clear.  The PEP 
doesn't really change the semantics of the language, it only allows 
better tooling (using comments for annotations was fine too, but 
dedicated syntax makes this feature a first class citizen).


Yury
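
For reference, the kind of annotations the proposed syntax enables
(examples adapted from PEP 526):

    from typing import ClassVar, Dict, List

    primes: List[int] = []
    captain: str                  # annotation without an initial value

    class Starship:
        stats: ClassVar[Dict[str, int]] = {}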



[Python-Dev] Tweak to PEP 523 for storing a tuple in co_extra

2016-09-03 Thread Brett Cannon
Below is the `co_extra` section of PEP 523 with the update saying that
users are expected to put a tuple in the field for easier simultaneous use
of the field.

Since the `co_extra` discussions do not affect CPython itself I'm planning
on landing the changes stemming from the PEP probably on Monday.

--

Expanding ``PyCodeObject``
--------------------------

One field is to be added to the ``PyCodeObject`` struct
[#pycodeobject]_::

  typedef struct {
     ...
     PyObject *co_extra;  /* "Scratch space" for the code object. */
  } PyCodeObject;

The ``co_extra`` will be ``NULL`` by default and will not be used by
CPython itself. Third-party code is free to use the field as desired.
Values stored in the field are expected to not be required in order
for the code object to function, allowing the loss of the data of the
field to be acceptable. The field will be freed like all other fields
on ``PyCodeObject`` during deallocation using ``Py_XDECREF()``.

Code using the field is expected to always store a tuple in the field.
This allows for multiple users of the field to not trample over each
other while being as performant as possible. Typical usage of the
field is expected to roughly follow the following pseudo-code::

  if co_extra is None:
      data = DataClass()
      co_extra = (data,)
  else:
      assert isinstance(co_extra, tuple)
      for x in co_extra:
          if isinstance(x, DataClass):
              data = x
              break
      else:
          data = DataClass()
          co_extra += (data,)

Using a list was considered but was found to be less performant, and
with a key use-case being JIT usage, the performance consideration
made a tuple preferable to a list. A tuple also makes more sense
semantically, as the objects stored in the tuple will be
heterogeneous.

A dict was also considered, but once again performance was more
important. While a dict will have constant overhead in looking up
data, the overhead for the common case of a single object being stored
in the data structure leads to a tuple having better performance
characteristics (i.e. iterating a tuple of length 1 is faster than
the overhead of hashing and looking up an object in a dict).
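
The pseudo-code above is nearly runnable Python; a self-contained
version, with DataClass standing in for an optimizer's per-code-object
state:

    class DataClass:
        """Stand-in for a JIT's or profiler's per-code-object data."""

    def get_or_create(co_extra):
        """Return (data, co_extra), appending a new DataClass if needed."""
        if co_extra is None:
            data = DataClass()
            return data, (data,)
        assert isinstance(co_extra, tuple)
        for x in co_extra:
            if isinstance(x, DataClass):
                return x, co_extra
        data = DataClass()
        return data, co_extra + (data,)

    data, co_extra = get_or_create(None)         # first use allocates the tuple
    again, co_extra = get_or_create(co_extra)    # later uses find the entry
    assert again is data and co_extra == (data,)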


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Yury Selivanov

Hi Oscar,

I don't think PyPy is in breach of the language spec here. Python made
a decision a long time ago to shun RAII-style implicit cleanup in
favour if with-style explicit cleanup.

The solution to this problem is to move resource management outside of
the generator functions. This is true for ordinary generators without
an event-loop etc. The example in the PEP is

async def square_series(con, to):
    async with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        async for row in cursor:
            yield row['i'] ** 2

async for i in square_series(con, 1000):
    if i == 100:
        break

The normal generator equivalent of this is:

def square_series(con, to):
    with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        for row in cursor:
            yield row['i'] ** 2

This code is already broken: move the with statement outside to the
caller of the generator function.


Exactly.

I used 'async with' in the PEP to demonstrate that the cleanup 
mechanisms are powerful enough to handle bad code patterns.


Thank you,
Yury
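
A sketch of the repaired pattern described above, with the resource
owned by the caller rather than by the suspended generator:

    def square_series(cursor):
        # The generator no longer owns the transaction; it only consumes rows.
        for row in cursor:
            yield row['i'] ** 2

    # Cleanup now happens when the with block exits, regardless of when
    # (or whether) the abandoned generator is finalized.  ('con' is the
    # connection object from the example above.)
    with con.transaction():
        cursor = con.cursor('SELECT generate_series(0, $1) AS i', 1000)
        for i in square_series(cursor):
            if i == 100:
                break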


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Nick Coghlan
On 4 September 2016 at 04:38, Oscar Benjamin  wrote:
> On 3 September 2016 at 16:42, Nick Coghlan  wrote:
>> On 2 September 2016 at 19:13, Nathaniel Smith  wrote:
>>> This works OK on CPython because the reference-counting gc will call
>>> handle.__del__() at the end of the scope (so on CPython it's at level
>>> 2), but it famously causes huge problems when porting to PyPy with
>>> it's much faster and more sophisticated gc that only runs when
>>> triggered by memory pressure. (Or for "PyPy" you can substitute
>>> "Jython", "IronPython", whatever.) Technically this code doesn't
>>> actually "leak" file descriptors on PyPy, because handle.__del__()
>>> will get called *eventually* (this code is at level 1, not level 0),
>>> but by the time "eventually" arrives your server process has probably
>>> run out of file descriptors and crashed. Level 1 isn't good enough. So
>>> now we have all learned to instead write
> ...
>>> BUT, with the current PEP 525 proposal, trying to use this generator
>>> in this way is exactly analogous to the open(path).read() case: on
>>> CPython it will work fine -- the generator object will leave scope at
>>> the end of the 'async for' loop, cleanup methods will be called, etc.
>>> But on PyPy, the weakref callback will not be triggered until some
>>> arbitrary time later, you will "leak" file descriptors, and your
>>> server will crash.
>>
>> That suggests the PyPy GC should probably be tracking pressure on more
>> resources than just memory when deciding whether or not to trigger a
>> GC run.
>
> PyPy's GC is conformant to the language spec

The language spec doesn't say anything about what triggers GC cycles -
that's purely a decision for runtime implementors based on the
programming experience they want to provide their users.

CPython runs GC pretty eagerly, with it being immediate when the
automatic reference counting is sufficient and the cyclic GC doesn't
have to get involved at all.

If I understand correctly, PyPy currently decides whether or not to
trigger a GC cycle based primarily on memory pressure, even though the
uncollected garbage may also be holding on to system resources other
than memory (like file descriptors).

For synchronous code, that's a relatively easy burden to push back
onto the programmer - assuming fair thread scheduling, a with
statement can reliably ensure prompt resource cleanup.

That assurance goes out the window as soon as you explicitly pause
code execution inside the body of the with statement - it doesn't
matter whether it's via yield, yield from, or await, you've completely
lost that assurance of immediacy.
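
A minimal sketch of the hazard (hypothetical code, not from either PEP):
the file below stays open for however long the await takes, even though
it is guarded by a with statement:

    async def log_next_event(path, event):
        with open(path, 'a') as f:       # resource acquired promptly...
            await event.wait()           # ...but held across an arbitrary pause
            f.write('event happened\n')  # cleanup only runs once we get here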

At that point, even CPython doesn't ensure prompt release of resources
- it just promises to try to clean things up as soon as it can and as
best it can (which is usually pretty soon and pretty well, with recent
iterations of 3.x, but event loops will still happily keep things
alive indefinitely if they're waiting for events that never happen).

For synchronous generators, you can make your API a bit more
complicated, and ask your caller to handle the manual resource
management, but you may not want to do that.

The asynchronous case is even worse though, as there, you often simply
can't readily push the burden back onto the programmer, because the
code is *meant* to be waiting for events and reacting to them, rather
than proceeding deterministically from beginning to end.

So while it's good that PEP 492 and 525 attempt to adapt synchronous
resource management models to the asynchronous world, it's also
important to remember that there's a fundamental mismatch of
underlying concepts when it comes to trying to pair up deterministic
resource management with asynchronous code - you're often going to
want to tip the model on its side and set up a dedicated resource
manager that other components can interact with, and then have the
resource manager take care of promptly releasing the resources when
the other components go away (perhaps with notions of leases and lease
renewals if you simply cannot afford unexpected delays in resources
being released).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Yury Selivanov

Hi Nathaniel,

On 2016-09-02 2:13 AM, Nathaniel Smith wrote:

On Thu, Sep 1, 2016 at 3:34 PM, Yury Selivanov  wrote:

Hi,

I've spent quite a while thinking and experimenting with PEP 525 trying to
figure out how to make asynchronous generators (AG) finalization reliable.
I've tried to replace the on-GC callback with a callback that intercepts the
first iteration of AGs.  It turns out it's very hard to work with weak-refs and
make the asyncio event loop reliably track and shut down all open AGs.

My new approach is to replace the "sys.set_asyncgen_finalizer(finalizer)"
function with "sys.set_asyncgen_hooks(firstiter=None, finalizer=None)".

1) Can/should these hooks be used by other types besides async
generators? (e.g., async iterators that are not async generators?)
What would that look like?


Asynchronous iterators (classes implementing __aiter__, __anext__) 
should use __del__ for any cleanup purposes.


sys.set_asyncgen_hooks only supports asynchronous generators.
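
For contrast, a hand-written asynchronous iterator would do its cleanup
synchronously in __del__ (an illustrative sketch, not code from the PEP):

    class Lines:
        def __init__(self, path):
            self.f = open(path)            # stand-in for a real resource
        def __aiter__(self):
            return self
        async def __anext__(self):
            line = self.f.readline()
            if not line:
                raise StopAsyncIteration
            return line
        def __del__(self):
            self.f.close()                 # synchronous cleanup only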



2) In the asyncio design it's legal for an event loop to be stopped
and then started again. Currently (I guess for this reason?) asyncio
event loops do not forcefully clean up resources associated with them
on shutdown. For example, if I open a StreamReader, loop.stop() and
loop.close() will not automatically close it for me. When, concretely,
are you imagining that asyncio will run these finalizers?


I think we will add another API method to the asyncio event loop, which 
users will call before closing the loop.  In my reference implementation 
I added a synchronous `loop.shutdown()` method.




3) Should the cleanup code in the generator be able to distinguish
between "this iterator has left scope" versus "the event loop is being
violently shut down"?


This is already handled in the reference implementation.  When an AG is 
iterated for the first time, the loop starts tracking it by adding it to 
a weak set.  When the AG is about to be GCed, the loop removes it from 
the weak set, and schedules its 'aclose()'.


If 'loop.shutdown()' is called, it means that the loop is being "violently 
shut down", so we schedule 'aclose' for all AGs in the weak set.




4) More fundamentally -- this revision is definitely an improvement,
but it doesn't really address the main concern I have. Let me see if I
can restate it more clearly.

Let's define 3 levels of cleanup handling:

   Level 0: resources (e.g. file descriptors) cannot be reliably cleaned up.

   Level 1: resources are cleaned up reliably, but at an unpredictable time.

   Level 2: resources are cleaned up both reliably and promptly.

In Python 3.5, unless you're very anal about writing cumbersome 'async
with' blocks around every single 'async for', resources owned by async
iterators land at level 0. (Because the only cleanup method available
is __del__, and __del__ cannot make async calls, so if you need async
calls to do clean up then you're just doomed.)

I think the revised draft does a good job of moving async
generators from level 0 to level 1 -- the finalizer hook gives a way
to effectively call back into the event loop from __del__, and the
shutdown hook gives us a way to guarantee that the cleanup happens
while the event loop is still running.
Right.  It's good to hear that you agree that the latest revision of the 
PEP makes AG cleanup reliable (albeit with unpredictable timing; more on 
that below).


My goal was exactly this - make the mechanism reliable, with the same 
predictability as what we have for __del__.



But... IIUC, it's now generally agreed that for Python code, level 1
is simply *not good enough*. (Or to be a little more precise, it's
good enough for the case where the resource being cleaned up is
memory, because the garbage collector knows when memory is short, but
it's not good enough for resources like file descriptors.) The classic
example of this is code like:

   def get_file_contents(path):
       handle = open(path)
       return handle.read()

I think this is where I don't agree with you 100%.  There are no strict 
guarantees that an object will be GCed in a timely manner in CPython or 
PyPy.  If it's part of a ref cycle, it might not be cleaned up at all.


All in all, in all your examples I don't see exactly where AGs 
differ from, let's say, synchronous generators.


For instance:

   async def read_json_lines_from_server(host, port):
       reader, _ = await asyncio.open_connection(host, port)
       async for line in reader:
           yield json.loads(line)

You would expect to use this like:

   async for data in read_json_lines_from_server(host, port):
       ...


If you rewrite the above code without the 'async' keyword, you'd have a 
synchronous generator with *exactly* the same problems.

tl;dr: AFAICT this revision of PEP 525 is enough to make it work
reliably on CPython, but I have serious concerns that it bakes a
CPython-specific design into the language. I would prefer a design
that actually aims for "level 2" cleanup semantics (for example, [1])



I honestly don't see why PEP 525 can't be implemented in PyPy.

Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Oscar Benjamin
On 3 September 2016 at 16:42, Nick Coghlan  wrote:
> On 2 September 2016 at 19:13, Nathaniel Smith  wrote:
>> This works OK on CPython because the reference-counting gc will call
>> handle.__del__() at the end of the scope (so on CPython it's at level
>> 2), but it famously causes huge problems when porting to PyPy with
>> its much faster and more sophisticated gc that only runs when
>> triggered by memory pressure. (Or for "PyPy" you can substitute
>> "Jython", "IronPython", whatever.) Technically this code doesn't
>> actually "leak" file descriptors on PyPy, because handle.__del__()
>> will get called *eventually* (this code is at level 1, not level 0),
>> but by the time "eventually" arrives your server process has probably
>> run out of file descriptors and crashed. Level 1 isn't good enough. So
>> now we have all learned to instead write
...
>> BUT, with the current PEP 525 proposal, trying to use this generator
>> in this way is exactly analogous to the open(path).read() case: on
>> CPython it will work fine -- the generator object will leave scope at
>> the end of the 'async for' loop, cleanup methods will be called, etc.
>> But on PyPy, the weakref callback will not be triggered until some
>> arbitrary time later, you will "leak" file descriptors, and your
>> server will crash.
>
> That suggests the PyPy GC should probably be tracking pressure on more
> resources than just memory when deciding whether or not to trigger a
> GC run.

PyPy's GC is conformant to the language spec AFAICT:
https://docs.python.org/3/reference/datamodel.html#object.__del__

"""
object.__del__(self)

Called when the instance is about to be destroyed. This is also called
a destructor. If a base class has a __del__() method, the derived
class’s __del__() method, if any, must explicitly call it to ensure
proper deletion of the base class part of the instance. Note that it
is possible (though not recommended!) for the __del__() method to
postpone destruction of the instance by creating a new reference to
it. It may then be called at a later time when this new reference is
deleted. It is not guaranteed that __del__() methods are called for
objects that still exist when the interpreter exits.
"""

Note the last sentence. It is also not guaranteed (across different
Python implementations and regardless of the CPython-specific notes in
the docs) that any particular object will cease to exist before the
interpreter exits. Taken together these two imply that it is not
guaranteed that *any* __del__ method will ever be called.

Antoine's excellent work in PEP 442 has improved the situation with
CPython but the language spec (covering all implementations) remains
the same and changing that requires a new PEP and coordination with
other implementations. Without changing it, it is a mistake to base a new
core language feature (async finalisation) on CPython-specific
implementation details. Already using with (or try/finally etc.)
inside a generator function behaves differently under PyPy:

$ cat gentest.py

def generator_needs_finalisation():
    try:
        for n in range(10):
            yield n
    finally:
        print('Doing important cleanup')

for obj in generator_needs_finalisation():
    if obj == 5:
        break

print('Process exit')

$ python gentest.py
Doing important cleanup
Process exit

So here the cleanup is triggered by the reference count of the
generator falling to zero at the break statement. Under CPython this
corresponds to Nathaniel's "level 2" cleanup. If we keep another
reference around it gets done at process exit:

$ cat gentest2.py

def generator_needs_finalisation():
    try:
        for n in range(10):
            yield n
    finally:
        print('Doing important cleanup')

gen = generator_needs_finalisation()
for obj in gen:
    if obj == 5:
        break

print('Process exit')

$ python gentest2.py
Process exit
Doing important cleanup

So that's Nathaniel's "level 1" cleanup. However, if you run either of
these scripts under PyPy the cleanup simply won't occur (i.e. "level
0" cleanup):

$ pypy gentest.py
Process exit
$ pypy gentest2.py
Process exit

I don't think PyPy is in breach of the language spec here. Python made
a decision a long time ago to shun RAII-style implicit cleanup in
favour of with-style explicit cleanup.

The solution to this problem is to move resource management outside of
the generator functions. This is true for ordinary generators without
an event-loop etc. The example in the PEP is

async def square_series(con, to):
    async with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        async for row in cursor:
            yield row['i'] ** 2

async for i in square_series(con, 1000):
    if i == 100:
        break

The normal generator equivalent of this is:

def square_series(con, to):
    with con.transaction():
        cursor = con.cursor(
            'SELECT generate_series(0, $1) AS i', to)
        for row in cursor:
            yield row['i'] ** 2

This code is already broken: move the with statement outside to the
caller of the generator function.

Re: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8

2016-09-03 Thread Adam Bartoš
Nick Coghlan (ncoghlan at gmail.com) on Sat Sep 3 12:27:44 EDT 2016 wrote:

> After also reading the Windows console encoding PEP, I realised
> there's a couple of missing discussions here regarding the impacts on
> sys.argv, os.environ, and os.environb.
>
> The reason that's relevant is that "sys.getfilesystemencoding" is a
> bit of a misnomer, as it's also used to determine the assumed encoding
> of command line arguments and environment variables.
>
>
Regarding sys.argv, AFAIK Unicode arguments work well on Python 3. Even
non-BMP characters are transferred correctly.


Adam Bartoš
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Nick Coghlan
On 3 September 2016 at 03:54, Koos Zevenhoven  wrote:
> chrb seems to be more in line with some bytes versions in for instance os
> than bchr.

The mnemonic for the current name in the PEP is that bchr is to chr as
b"" is to "". The PEP should probably say that in addition to pointing
out the 'unichr' Python 2 inspiration, though.
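
In other words, a pure-Python sketch of the proposed builtin (illustrative,
not the actual C implementation) would be:

    def bchr(i):
        """bytes analogue of chr(): int in range(256) -> length-1 bytes."""
        if not 0 <= i <= 255:
            raise ValueError("bchr() arg not in range(256)")
        return bytes([i])

    assert chr(65) == 'A'      # text: code point -> str
    assert bchr(65) == b'A'    # binary: octet -> bytes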

The other big difference between this and the os module case, is that
the resulting builtin constructor pairs here are str/chr (arbitrary
text, single code point) and bytes/bchr (arbitrary binary data, single
binary octet). By contrast, os.getcwd() and os.getcwdb() (and similar
APIs) are both referring to the same operating system level operation,
they're just requesting a different return type for the data.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Nick Coghlan
On 3 September 2016 at 21:35, Martin Panter  wrote:
>> On Saturday 3 September 2016, Random832  wrote:
>>> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote:
>>> > The problem with only having `bchr` is that it doesn't help with
>>> > `bytearray`;
>>>
>>> What is the use case for bytearray.fromord? Even in the rare case
>>> someone needs it, why not bytearray(bchr(...))?
>
> On 3 September 2016 at 08:47, Victor Stinner  wrote:
>> Yes, this was my point: I don't think that we need a bytearray method to
>> create a mutable string from a single byte.
>
> I agree with the above. Having an easy way to turn an int into a bytes
> object is good. But I think the built-in bchr() function on its own is
> enough. Just like we have bytes object literals, but the closest we
> have for a bytearray literal is bytearray(b". . .").

This is a good point - earlier versions of the PEP didn't include
bchr(), they just had the class methods, so "bytearray(bchr(...))"
wasn't an available spelling (if I remember the original API design
correctly, it would have been something like
"bytearray(bytes.byte(...))"), which meant there was a strong
consistency argument in having the alternate constructor on both
types. Now that the PEP proposes the "bchr" builtin, the "fromord"
constructors look less necessary.

Given that, and the uncertain deprecation time frame for accepting
integers in the main bytes and bytearray constructors, perhaps both
the "fromsize" and "fromord" parts of the proposal can be deferred
indefinitely in favour of just adding the bchr() builtin?

We wouldn't gain the "initialise a region of memory to an arbitrary
value" feature, but it can be argued that wanting that is a sign
someone may be better off with a more specialised memory manipulation
library, rather than relying solely on the builtins.
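
For reference, the spellings already available today without fromsize()
(facts about current Python, not new proposals):

    zeroed = bytearray(1024)            # zero-initialised buffer (the
                                        # behaviour the PEP would deprecate)
    filled = bytearray(b'\xff') * 1024  # 1024 bytes of an arbitrary fill value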

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8

2016-09-03 Thread Nick Coghlan
On 4 September 2016 at 00:49, Nick Coghlan  wrote:
> On 2 September 2016 at 08:31, Steve Dower  wrote:
>> This proposal would remove all use of the *A APIs and only ever call the *W
>> APIs. When Windows returns paths to Python as str, they will be decoded from
>> utf-16-le and returned as text (in whatever the minimal representation is).
>> When
>> Windows returns paths to Python as bytes, they will be decoded from
>> utf-16-le to
>> utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it
>> is
>> possible to have invalid surrogates in filenames). Equally, when paths are
>> provided as bytes, they are decoded from utf-8 into utf-16-le and passed to
>> the
>> *W APIs.
>
> The overall proposal looks good to me, there's just a terminology
> glitch here: utf-8 <-> utf-16-le should either be described as
> transcoding, or else as decoding and then re-encoding. As they're both
> text codecs, there's no "decoding" operation that switches between
> them.

After also reading the Windows console encoding PEP, I realised
there's a couple of missing discussions here regarding the impacts on
sys.argv, os.environ, and os.environb.

The reason that's relevant is that "sys.getfilesystemencoding" is a
bit of a misnomer, as it's also used to determine the assumed encoding
of command line arguments and environment variables.

With the PEP currently stating that all use of the "*A" Windows APIs
will be removed, I'm guessing these will just start working as
expected, but it should be covered explicitly.

In addition, if the subprocess module is going to be excluded from
these changes, that should be called out explicitly (keeping in mind
that on *nix, the only subprocess pipe configurations that are
straightforward to set up in Python 3 are raw binary mode and
universal newlines mode, with the latter implicitly treating the pipes
as UTF-8 text).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects

2016-09-03 Thread Chris Angelico
On Sun, Sep 4, 2016 at 2:09 AM, Nick Coghlan  wrote:
> On 3 September 2016 at 08:50, Chris Angelico  wrote:
>> Got it, thanks. I hope the vagaries of linear search don't mess with
>> profilers - a debugger isn't going to be bothered by whether it gets
>> first slot or second, but profiling and performance might get subtle
>> differences based on which thing looks at a function first. A dict
>> would avoid that (constant-time lookups with a pre-selected key will
>> be consistent), but costs a lot more.
>
> Profiling with a debugger enabled is going to see a lot more
> interference from the debugger than it is from a linear search through
> a small tuple for its own state :)

Right; I was contrasting the debugger at one end (linear search is
utterly dwarfed by other costs) with a profiler at the other end
(wants minimal cost, and minimal noise, and a linear search gives cost
and noise). In between, an optimizer is an example of something that
could mess with the profiler based on activation ordering (and thus
which one gets first slot).

> Optimising compilers and VM profilers are clearly a case where
> cooperation will be desirable, as are optimising compilers and
> debuggers. However, that cooperation is still going to need to be
> worked out on a pairwise basis - the PEP can't magically make
> arbitrary pairs of plugins compatible, all it can do is define some
> rules and guidelines that make it easier for plugins to cooperate when
> they want to do so.

Obviously, but AIUI the rules sound pretty simple:

1) Base compiler: co_extra = ()
2) Modifier: co_extra += (MyState(),)
3) Repeat #2 for other tools
4) for obj in co_extra: if obj.__class__ is MyState: do stuff

Anyone who puts a non-tuple into co_extra is playing badly with other
people. Anyone who doesn't use a custom class is risking collisions.
Beyond that, it should be pretty straight-forward.
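
Spelled out as Python-level pseudocode (co_extra is really a C-level field
accessed via the PEP 523 API, so the attribute access here is illustrative
only):

    class MyState:
        """Per-code-object state owned by exactly one tool."""

    def get_state(code):
        extra = code.co_extra or ()              # rule 1: starts life as a tuple
        for obj in extra:                        # rule 4: linear search
            if obj.__class__ is MyState:         # identity check avoids collisions
                return obj
        state = MyState()
        code.co_extra = extra + (state,)         # rule 2: append, don't replace
        return state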

ChrisA
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [erratum] Emotional responses to PEPs 484 and 526

2016-09-03 Thread Guido van Rossum
On Sat, Sep 3, 2016 at 8:18 AM, Stephen J. Turnbull
 wrote:
> Stephen J. Turnbull writes:
>
>  > My version ... furthermore makes mypy into a units checker,
>
> That isn't true, mypy does want annotations on all the variables it
> checks and does not infer them from initializer type.

But it does! Mypy emphatically does *not* need annotations on all
variables; it infers most variable types from the first expression
assigned to them. E.g. here:

  output = []
  n = 0
  output.append(n)
  reveal_type(output)

it will reveal the type List[int] without any help from annotations.

There are cases where it does require annotation on empty containers,
when it's less obvious how the container is filled, and other, more
complicated situations, but a sequence of assignments as in Nick's
example is a piece of cake for it.
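
For contrast, an illustrative sketch of the empty-container case that does
need an annotation:

    from typing import List

    output: List[int] = []   # nothing is appended nearby, so mypy cannot
                             # infer the element type without the annotation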

In fact, the one place where *I* wanted a type annotation was here:

  expected: V = Ohm(100k)*input

because I haven't had a need to use Ohm's law in a long time, so I
could personally use the hint that Ohm times Amps makes Volts (but
again, given suitable class definitions, mypy wouldn't have needed
that annotation).

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Update on PEP 523 and adding a co_extra field to code objects

2016-09-03 Thread Nick Coghlan
On 3 September 2016 at 08:50, Chris Angelico  wrote:
> Got it, thanks. I hope the vagaries of linear search don't mess with
> profilers - a debugger isn't going to be bothered by whether it gets
> first slot or second, but profiling and performance might get subtle
> differences based on which thing looks at a function first. A dict
> would avoid that (constant-time lookups with a pre-selected key will
> be consistent), but costs a lot more.

Profiling with a debugger enabled is going to see a lot more
interference from the debugger than it is from a linear search through
a small tuple for its own state :)

Optimising compilers and VM profilers are clearly a case where
cooperation will be desirable, as are optimising compilers and
debuggers. However, that cooperation is still going to need to be
worked out on a pairwise basis - the PEP can't magically make
arbitrary pairs of plugins compatible, all it can do is define some
rules and guidelines that make it easier for plugins to cooperate when
they want to do so.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526)

2016-09-03 Thread Koos Zevenhoven
What's up with the weird subthreads, Stephen?!

On Guido's suggestion, I'm working on posting those type-checking thoughts here.

-- Koos

On Sat, Sep 3, 2016 at 6:17 PM, Stephen J. Turnbull
 wrote:
> Please respect Reply-To, set to python-ideas.
>
> Greg Ewing writes:
>  > Chris Angelico wrote:
>  > > Forcing people to write 1.0 just to be compatible with 1.5 will cause
>  > > a lot of annoyance.
>  >
>  > Indeed, this would be unacceptable IMO.
>
> But "forcing" won't happen.  Just ignore the warning.  *All* such
> Python programs will continue to run (or crash) exactly as if the type
> declarations weren't there.  If you don't like the warning, either
> don't run the typechecker, or change your code to placate it.
>
> But allowing escapes from a typechecker means allowing escapes.  All
> of them, not just the ones you or I have preapproved.  I want my
> typechecker to be paranoid, and loud about it.
>
> That doesn't mean I would never use a type like "Floatable" (ie, any
> type subject to implicit conversion to float).  But in the original
> example, I would probably placate the typechecker.  YMMV, of course.
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/k7hoven%40gmail.com



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 525, third round, better finalization

2016-09-03 Thread Nick Coghlan
On 2 September 2016 at 19:13, Nathaniel Smith  wrote:
> This works OK on CPython because the reference-counting gc will call
> handle.__del__() at the end of the scope (so on CPython it's at level
> 2), but it famously causes huge problems when porting to PyPy with
> its much faster and more sophisticated gc that only runs when
> triggered by memory pressure. (Or for "PyPy" you can substitute
> "Jython", "IronPython", whatever.) Technically this code doesn't
> actually "leak" file descriptors on PyPy, because handle.__del__()
> will get called *eventually* (this code is at level 1, not level 0),
> but by the time "eventually" arrives your server process has probably
> run out of file descriptors and crashed. Level 1 isn't good enough. So
> now we have all learned to instead write
>
>  # good modern Python style:
>  def get_file_contents(path):
>      with open(path) as handle:
>          return handle.read()

This only works if the file fits in memory - otherwise you just have
to accept the fact that you need to leave the file handle open until
you're "done with the iterator", which means deferring the resource
management to the caller.

> and we have fancy tools like the ResourceWarning machinery to help us
> catch these bugs.
>
> Here's the analogous example for async generators. This is a useful,
> realistic async generator, that lets us incrementally read from a TCP
> connection that streams newline-separated JSON documents:
>
>   async def read_json_lines_from_server(host, port):
>       reader, _ = await asyncio.open_connection(host, port)
>       async for line in reader:
>           yield json.loads(line)
>
> You would expect to use this like:
>
>   async for data in read_json_lines_from_server(host, port):
>       ...

The actual synchronous equivalent to this would look more like:

def read_data_from_file(path):
    with open(path) as f:
        for line in f:
            yield line

(Assume we're doing something interesting to each line, rather than
reproducing normal file iteration behaviour)

And that has the same problem as your asynchronous example: the caller
needs to worry about resource management on the generator and do:


from contextlib import closing

with closing(read_data_from_file(path)) as itr:
    for line in itr:
        ...

Which means the problem causing your concern doesn't arise from the
generator being asynchronous - it comes from the fact the generator
actually *needs* to hold the FD open in order to work as intended (if
it didn't, then the code wouldn't need to be asynchronous).

> BUT, with the current PEP 525 proposal, trying to use this generator
> in this way is exactly analogous to the open(path).read() case: on
> CPython it will work fine -- the generator object will leave scope at
> the end of the 'async for' loop, cleanup methods will be called, etc.
> But on PyPy, the weakref callback will not be triggered until some
> arbitrary time later, you will "leak" file descriptors, and your
> server will crash.

That suggests the PyPy GC should probably be tracking pressure on more
resources than just memory when deciding whether or not to trigger a
GC run.

> For correct operation, you have to replace the
> simple 'async for' loop with this lovely construct:
>
>   async with aclosing(read_json_lines_from_server(host, port)) as ait:
>       async for data in ait:
>           ...
>
> Of course, you only have to do this on loops whose iterator might
> potentially hold resources like file descriptors, either currently or
> in the future. So... uh... basically that's all loops, I guess? If you
> want to be a good defensive programmer?

At that level of defensiveness in asynchronous code, you need to start
treating all external resources (including file descriptors) as a
managed pool, just as we have process and thread pools in the standard
library, and many database and networking libraries offer connection
pooling. It limits your per-process concurrency, but that limit exists
anyway at the operating system level - modelling it explicitly just
lets you manage how the application handles those limits.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Ethan Furman

On 09/02/2016 06:17 PM, Greg Ewing wrote:

Ethan Furman wrote:



The problem with only having `bchr` is that it doesn't
 help with `bytearray`; the problem with not having
 `bchr` is who wants to write `bytes.fromord`?


If we called it 'bytes.fnord' (From Numeric Ordinal)
people would want to write it just for the fun factor.


Very good point!  :)

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Ethan Furman

On 09/03/2016 05:08 AM, Martin Panter wrote:

On 1 September 2016 at 19:36, Ethan Furman wrote:



Deprecation of current "zero-initialised sequence" behaviour without removal


Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence
of the given size::

 >>> bytes(3)
 b'\x00\x00\x00'
 >>> bytearray(3)
 bytearray(b'\x00\x00\x00')

This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
it in place for at least as long as Python 2.7 is supported, possibly
indefinitely.


Can you clarify what “deprecate” means? Just add a note in the
documentation, [...]


This one.


Addition of "getbyte" method to retrieve a single byte
--

This PEP proposes that ``bytes`` and ``bytearray`` gain the method
``getbyte``
which will always return ``bytes``::


Should getbyte() handle negative indexes? E.g. getbyte(-1) returning
the last byte.


Yes.
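
A pure-Python sketch of that behaviour (illustrative, not the proposed C
implementation):

    def getbyte(data, i):
        b = data[i]          # int indexing already handles negative indexes
        return bytes([b])    # always return bytes, even from a bytearray

    assert getbyte(b'ABC', -1) == b'C'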


Open Questions
==

Do we add ``iterbytes`` to ``memoryview``, or modify
``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation?  Or
do we ignore memory for now and add it later?


Apparently memoryview.cast('s') comes from Nick Coghlan:
.
However, since 3.5 (https://bugs.python.org/issue15944) you can call
cast("c") on most memoryviews, which I think already does what you
want:

>>> tuple(memoryview(b"ABC").cast("c"))
(b'A', b'B', b'C')


Nice!

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [erratum] Emotional responses to PEPs 484 and 526

2016-09-03 Thread Ivan Levkivskyi
On 3 September 2016 at 17:18, Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Stephen J. Turnbull writes:
>
>  > My version ... furthermore makes mypy into a units checker,
>
> That isn't true, mypy does want annotations on all the variables it
> checks and does not infer them from initializer type.
>

I have heard that pytype (https://github.com/google/pytype) does more type
inference (although it has some weaknesses).
In general, I think it is OK that the amount of annotations needed depends
on the type checker
(there is actually a note on this in the last revision of PEP 526).

--
Ivan
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] [erratum] Emotional responses to PEPs 484 and 526

2016-09-03 Thread Stephen J. Turnbull
Stephen J. Turnbull writes:

 > My version ... furthermore makes mypy into a units checker,

That isn't true, mypy does want annotations on all the variables it
checks and does not infer them from initializer type.

Sorry for the misinformation.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What should a good type checker do? (was: Please reject or postpone PEP 526)

2016-09-03 Thread Stephen J. Turnbull
Please respect Reply-To, set to python-ideas.

Greg Ewing writes:
 > Chris Angelico wrote:
 > > Forcing people to write 1.0 just to be compatible with 1.5 will cause
 > > a lot of annoyance.
 > 
 > Indeed, this would be unacceptable IMO.

But "forcing" won't happen.  Just ignore the warning.  *All* such
Python programs will continue to run (or crash) exactly as if the type
declarations weren't there.  If you don't like the warning, either
don't run the typechecker, or change your code to placate it.

But allowing escapes from a typechecker means allowing escapes.  All
of them, not just the ones you or I have preapproved.  I want my
typechecker to be paranoid, and loud about it.

That doesn't mean I would never use a type like "Floatable" (ie, any
type subject to implicit conversion to float).  But in the original
example, I would probably placate the typechecker.  YMMV, of course.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 529: Change Windows filesystem encoding to UTF-8

2016-09-03 Thread Nick Coghlan
On 2 September 2016 at 08:31, Steve Dower  wrote:
> This proposal would remove all use of the *A APIs and only ever call the *W
> APIs. When Windows returns paths to Python as str, they will be decoded from
> utf-16-le and returned as text (in whatever the minimal representation is).
> When
> Windows returns paths to Python as bytes, they will be decoded from
> utf-16-le to
> utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it
> is
> possible to have invalid surrogates in filenames). Equally, when paths are
> provided as bytes, they are decoded from utf-8 into utf-16-le and passed to
> the
> *W APIs.

The overall proposal looks good to me, there's just a terminology
glitch here: utf-8 <-> utf-16-le should either be described as
transcoding, or else as decoding and then re-encoding. As they're both
text codecs, there's no "decoding" operation that switches between
them.
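
That is, the conversion round-trips through str (a sketch; the function
names are illustrative, while surrogatepass is the error handler the PEP
specifies for tolerating lone surrogates in filenames):

    def fs_bytes_to_wide(path_bytes):
        # utf-8 bytes -> str -> utf-16-le bytes for the *W APIs
        return (path_bytes.decode('utf-8', 'surrogatepass')
                          .encode('utf-16-le', 'surrogatepass'))

    def wide_to_fs_bytes(wide_bytes):
        # utf-16-le bytes from Windows -> str -> utf-8 bytes
        return (wide_bytes.decode('utf-16-le', 'surrogatepass')
                          .encode('utf-8', 'surrogatepass'))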

As far as the timing of this particular change goes, I think you make
a good case that all of the cases that will see a behaviour change
with this PEP have already been receiving deprecation warnings since
3.3, which would make it acceptable to change the default behaviour in
3.6.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [New-bugs-announce] [issue27948] f-strings: allow backslashes only in the string parts, not in the expression parts

2016-09-03 Thread Eric V. Smith
I'm aware of the buildbot failures due to this commit. I'm working on it.

Sorry about that: tests passed on my machine.

Eric.

On 09/03/2016 09:24 AM, Eric V. Smith wrote:
> 
> New submission from Eric V. Smith:
> 
> See issue 27921.
> 
> Currently (and for 3.6 beta 1), backslashes are not allowed anywhere in 
> f-strings. This needs to be changed to allow them in the string parts, but 
> not in the expression parts.
> 
> Also, require that the start and end of an expression be literal '{' and '}', 
> not escapes like '\x7b' and '\u007d'.
> 
> --
> assignee: eric.smith
> components: Interpreter Core
> messages: 274294
> nosy: eric.smith
> priority: normal
> severity: normal
> stage: needs patch
> status: open
> title: f-strings: allow backslashes only in the string parts, not in the 
> expression parts
> type: behavior
> versions: Python 3.6
> 
> ___
> Python tracker 
> 
> ___
> ___
> New-bugs-announce mailing list
> new-bugs-annou...@python.org
> https://mail.python.org/mailman/listinfo/new-bugs-announce
> 
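
For illustration, the rule distinguishes cases like these (hypothetical
snippets, based on the issue text):

    items = ['a', 'b']
    ok = f"one\ntwo {items[0]}"   # backslash in the string part: allowed
    # f"{'\n'.join(items)}"       # backslash in the expression part:
                                  # rejected under this rule
    joined = '\n'.join(items)
    also_ok = f"{joined}"         # workaround: hoist the expression out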

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Martin Panter
On 1 September 2016 at 23:28, Random832  wrote:
> On Thu, Sep 1, 2016, at 18:28, Steve Dower wrote:
>> This is a raw (bytes) IO class that requires text to be passed encoded
>> with utf-8, which will be decoded to utf-16-le and passed to the Windows 
>> APIs.
>> Similarly, bytes read from the class will be provided by the operating
>> system as utf-16-le and converted into utf-8 when returned to Python.
>
> What happens if a character is broken across a buffer boundary? e.g. if
> someone tries to read or write one byte at a time (you can't do a
> partial read of zero bytes, there's no way to distinguish that from an
> EOF.)
>
> Is there going to be a higher-level text I/O class that bypasses the
> UTF-8 encoding step when the underlying bytes stream is a console? What
> if we did that but left the encoding as mbcs? I.e. the console is text
> stream that can magically handle characters that aren't representable in
> its encoding. Note that if anything does os.read/write to the console's
> file descriptors, they're gonna get MBCS and there's nothing we can do
> about it.

Maybe it is too complicated and impractical, but I have imagined that
the sys.stdin/stdout/stderr could be custom TextIOBase objects. They
would not be wrappers or do much encoding (other than maybe newline
encoding). To solve the compatibility problems with code that uses
stdout.buffer or whatever, you could add a custom “buffer” object,
something like my AsciiBufferMixin class:
https://gist.github.com/vadmium/d1b07d771fbf4347683c005c40991c02

Just putting this idea out there, but maybe Steve’s UTF-8 encoding
solution is good enough.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Emotional responses to PEPs 484 and 526

2016-09-03 Thread Nick Coghlan
On 3 September 2016 at 18:03, Stephen J. Turnbull
 wrote:
> Therefore, I think Nick's version was an abuse of variable annotation.
> I don't mean to criticize Nick, as he was trying to make the best of
> an unlikely proposal.  But if Nick can fall into this trap[2], I think
> the fears of many that type annotations will grow like fungus on code
> that really doesn't need them, and arguably is better without them,
> are quite reasonable.

I suggest lots of things on python-ideas that I would probably oppose
if they ever made it as far as python-dev - enabling that kind of
speculative freedom is a large part of *why* we have a brainstorming
list.

For me, type annotations fall into the same category in practice as
metaclasses and structural linters: if you're still asking yourself
the question "Do I need one?" the answer is an emphatic "No". They're
tools designed to solve particular problems, so you reach for them
when you have those problems, rather than as a matter of course.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 526 ready for review: Syntax for Variable and Attribute Annotations

2016-09-03 Thread Nick Coghlan
On 3 September 2016 at 02:17, Guido van Rossum  wrote:
> Pinning down the semantics is not why I am pushing for PEP 526 -- I
> only want to pin down the *syntax* to the point where we won't have to
> change it again for many versions, since it's much harder to change
> the syntax than it is to change the behavior of type checkers (which
> have fewer backwards compatibility constraints, a faster release
> cycle, and narrower user bases than core Python itself).

+1 from me as well for omitting any new type semantics that aren't
absolutely necessary from the PEP (i.e. nothing beyond ClassVar) - I
only figured it was worth bringing up here as the question had already
arisen.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Martin Panter
On 1 September 2016 at 19:36, Ethan Furman  wrote:
> Deprecation of current "zero-initialised sequence" behaviour without removal
> 
>
> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> argument and interpret it as meaning to create a zero-initialised sequence
> of the given size::
>
> >>> bytes(3)
> b'\x00\x00\x00'
> >>> bytearray(3)
> bytearray(b'\x00\x00\x00')
>
> This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
> it in place for at least as long as Python 2.7 is supported, possibly
> indefinitely.

Can you clarify what “deprecate” means? Just add a note in the
documentation, or make calls trigger a DeprecationWarning as well?
Having bytearray(n) trigger a DeprecationWarning would be a minor
annoyance for code being compatible with Python 2 and 3, since
bytearray(n) is supported in Python 2.

> Addition of "getbyte" method to retrieve a single byte
> --
>
> This PEP proposes that ``bytes`` and ``bytearray`` gain the method
> ``getbyte``
> which will always return ``bytes``::

Should getbyte() handle negative indexes? E.g. getbyte(-1) returning
the last byte.

> Open Questions
> ==
>
> Do we add ``iterbytes`` to ``memoryview``, or modify
> ``memoryview.cast()`` to accept ``'s'`` as a single-byte interpretation?  Or
> do we ignore memory for now and add it later?

Apparently memoryview.cast('s') comes from Nick Coghlan:
.
However, since 3.5 (https://bugs.python.org/issue15944) you can call
cast("c") on most memoryviews, which I think already does what you
want:

>>> tuple(memoryview(b"ABC").cast("c"))
(b'A', b'B', b'C')
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Martin Panter
> On Saturday 3 September 2016, Random832  wrote:
>> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote:
>> > The problem with only having `bchr` is that it doesn't help with
>> > `bytearray`;
>>
>> What is the use case for bytearray.fromord? Even in the rare case
>> someone needs it, why not bytearray(bchr(...))?

On 3 September 2016 at 08:47, Victor Stinner  wrote:
> Yes, this was my point: I don't think that we need a bytearray method to
> create a mutable string from a single byte.

I agree with the above. Having an easy way to turn an int into a bytes
object is good. But I think the built-in bchr() function on its own is
enough. Just like we have bytes object literals, but the closest we
have for a bytearray literal is bytearray(b". . .").
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Martin Panter
On 2 September 2016 at 17:54, Koos Zevenhoven  wrote:
> On Thu, Sep 1, 2016 at 10:36 PM, Ethan Furman  wrote:
>> * Deprecate passing single integer values to ``bytes`` and ``bytearray``
>> * Add ``bytes.fromsize`` and ``bytearray.fromsize`` alternative
>> constructors
>> * Add ``bytes.fromord`` and ``bytearray.fromord`` alternative constructors
>> * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
>> * Add ``bytes.iterbytes`` and ``bytearray.iterbytes`` alternative
>> iterators
>
> I wonder if from_something with an underscore is more consistent (according
> to a quick search perhaps yes).

That would not be too inconsistent with the sister constructor bytes.fromhex().
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Adam Bartoš
>
> The use of an ASCII compatible encoding is required to maintain
> compatibility with code that bypasses the TextIOWrapper and directly
> writes ASCII bytes to the standard streams (for example,
> [process_stdinreader.py]).
> Code that assumes a particular encoding for the standard streams other than
> ASCII will likely break.


Note that, for example, in IDLE there are sys.std* stream objects that don't
have a buffer attribute. I would argue that it is incorrect to suppose that
there is always one.

Adam Bartoš
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Adam Bartoš
Steve Dower (steve.dower at python.org) on Thu Sep 1 18:28:53 EDT 2016 wrote

I'm about to be offline for a few days, so I wanted to get my current
> draft PEPs out for people can read and review.
>
> I don't believe there is a lot of change as a result of either PEP, but
> the impact of what change there is needs to be weighed against the benefits.
>
> If anything, I'm likely to have underplayed the impact of this change
> (though I've had a *lot* of support for this one). Just stating my
> biases up-front - take it as you wish.
>
> See https://bugs.python.org/issue1602 for the current proposed patch for
> this PEP. I will likely update it after my upcoming flights, but it's in
> pretty good shape right now.
>
> Cheers,
> Steve
>
>
Did you consider that the hard-wired readline hook
`_PyOS_WindowsConsoleReadline` won't be needed in future if
http://bugs.python.org/issue17620 gets resolved so the default hook on
Windows just reads from sys.stdin? This would also reduce code duplication
and all the Read/WriteConsoleW stuff would be gathered together in one
special class.

Regards,
Adam Bartoš
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 528: Change Windows console encoding to UTF-8

2016-09-03 Thread Adam Bartoš
Paul Moore (p.f.moore at gmail.com) on Fri Sep 2 05:23:04 EDT 2016 wrote

>
> On 2 September 2016 at 03:35, Steve Dower  wrote:
> > I'd need to test to be sure, but writing an incomplete code point should
> > just truncate to before that point. It may currently raise OSError if that
> > truncated to zero length, as I believe that's not currently distinguished
> > from an error. What behavior would you propose?
>
> For "correct" behaviour, you should retain the unwritten bytes, and
> write them as part of the next call (essentially making the API
> stateful, in the same way that incremental codecs work). I'm pretty
> sure that this could cause actual problems, for example I think invoke
> (https://github.com/pyinvoke/invoke) gets byte streams from
> subprocesses and dumps them direct to stdout in blocks (so could
> easily end up splitting multibyte sequences). It's arguable that it
> should be decoding the bytes from the subprocess and then re-encoding
> them, but that gets us into "guess the encoding used by the
> subprocess" territory.
>
> The problem is that we're not going to simply drop some bad data in
> the common case - it's not so much the dropping of the start of an
> incomplete code point that bothers me, as the encoding error you hit
> at the start of the *next* block of data you send. So people will get
> random, unexplained, encoding errors.
>
> I don't see an easy answer here other than a stateful API.
>
>
Isn't the buffered IO wrapper for this?
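
For reference, incremental codecs in the standard library already provide
exactly this kind of statefulness:

    import codecs

    dec = codecs.getincrementaldecoder('utf-8')()
    assert dec.decode(b'\xe2\x82') == ''      # incomplete euro sign: buffered
    assert dec.decode(b'\xac') == '\u20ac'    # completed on the next call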



> > Reads of less than four bytes fail instantly, as in the worst case we need
> > four bytes to represent one Unicode character. This is an unfortunate
> > reality of trying to limit it to one system call - you'll never get a full
> > buffer from a single read, as there is no simple mapping between
> > length-as-utf8 and length-as-utf16 for an arbitrary string.
>
> And here - "read a single byte" is a not uncommon way of getting some
> data. Once again see invoke:
> https://github.com/pyinvoke/invoke/blob/master/invoke/platform.py#L147
>
> used at
> https://github.com/pyinvoke/invoke/blob/master/invoke/runners.py#L548
>
> I'm not saying that there's an easy answer here, but this *will* break
> code. And actually, it's in violation of the documentation: see
> https://docs.python.org/3/library/io.html#io.RawIOBase.read
>
> """
> read(size=-1)
>
> Read up to size bytes from the object and return them. As a
> convenience, if size is unspecified or -1, readall() is called.
> Otherwise, only one system call is ever made. Fewer than size bytes
> may be returned if the operating system call returns fewer than size
> bytes.
>
> If 0 bytes are returned, and size was not 0, this indicates end of
> file. If the object is in non-blocking mode and no bytes are
> available, None is returned.
> """
>
> You're not allowed to return 0 bytes if the requested size was not 0,
> and you're not at EOF.
>
>

That's why it should rather be signaled by an exception. Even when one
doesn't transcode UTF-16 to UTF-8, reading just one byte is still
impossible, and I would argue also incorrect here. I raise ValueError in
win_unicode_console.


Adam Bartoš
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Emotional responses to PEPs 484 and 526

2016-09-03 Thread Stephen J. Turnbull
Guido van Rossum writes:

 > I just spoke to someone who noted that [PEP 526] is likely to evoke
 > an outsize emotional response. (Similar to what happened with PEP 484.)

Emotional, yes, but I resent the "outsize" part.  Although that word
to the wise is undoubtedly enough, i.e., tl;dr if you like, let me
explain why I have one foot in each camp.

Compare Nick's version of "scientific code with SI units":

  from circuit_units import A, V, Ohm, seconds

  delta: A
  for delta in [-500n, 0, 500n]:
      input: A = 2.75u + delta
      wait(seconds(1u))
      expected: V = Ohm(100k)*input
      tolerance: V = 2.2m
      fails = check_output(expected, tolerance)
      print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % (
          'FAIL' if fails else 'pass',
          input, get_output(), expected, get_output() - expected
      ))

with

  from circuit_units import VoltType, uA, mV, kOhm, u_second

  expected: VoltType

  for delta in [-0.5*uA, 0*uA, 0.5*uA]:
      input = 2.75*uA + delta
      wait(1*u_second)
      expected = (100*kOhm)*input
      tolerance = 2.2*mV
      fails = check_output(expected, tolerance)
      print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % (
          'FAIL' if fails else 'pass',
          input, get_output(), expected, get_output() - expected
      ))

In Nick's version, literals like 500n ("500 nano-whatevers") require
Ken Kundert's proposed syntax change.  I left that in because it
streamlines the expressions.  I wrote the latter because I really
disliked Nick's version, streamlined with "SI scale syntax" or not.
Nick didn't explicitly type the *_output functions, so I didn't
either.[1]  I assume they're annotated in their module of definition.

The important point about the second version is that if we accept the
hypothesis that the pseudo-literals like '[-0.5*uA, 0*uA, 0.5*uA]' are
good enough to implicitly type the variables they're assigned to (as
they are in this snippet), mypy will catch "unit errors" (which the
circuit_units module converts into TypeErrors) in the expressions.  I
think that this hypothesis is appropriate in the context of the thread.

Therefore, I think Nick's version was an abuse of variable annotation.
I don't mean to criticize Nick, as he was trying to make the best of
an unlikely proposal.  But if Nick can fall into this trap[2], I think
the fears of many that type annotations will grow like fungus on code
that really doesn't need them, and arguably is better without them,
are quite reasonable.

The point here is *not* that Nick's version is "horrible" (as many of
the "emotional" type refuseniks might say), whatever that might mean.
I can easily imagine that the snippet above is part of the embedded
software for air traffic control or medical life support equipment,
and a belt and suspenders approach ("make units visible to reviewers"
+ "mypy checking" + "full coverage in unit tests") is warranted.  Ie,
Nick's version is much better than mine in that context because the
hypothesis that "implicit declaration" is good enough is invalid.  But
in the context of discussion of how to make measurement units visible
and readable in a Python program, he grabbed an inappropriate tool
because it was close to hand.

My version does everything the OP asked for; it furthermore makes mypy
into a units checker, and it does so in a way that any intermediate
Python programmer (and many novices as well) can immediately grasp.
If Nick had been given the constraint that novices should be able to
read it, I suspect he would have written the same snippet I did.  Nick?


Footnotes: 
[1]  I think both versions could have better variable naming along
with a few other changes to make them more readable, but those aspects
were dictated by an earlier post.  I don't have the nerve to touch
Nick's code, so left those as is.

[2]  Or perhaps he did it intentionally, trying to combine two cool
ideas, variable type annotations and SI units, in one example.  If so,
I think that this combination was unwise in context.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: last round (?)

2016-09-03 Thread Victor Stinner
Yes, this was my point: I don't think that we need a bytearray method to
create a mutable string from a single byte.

Victor

On Saturday 3 September 2016, Random832  wrote:

> On Fri, Sep 2, 2016, at 19:44, Ethan Furman wrote:
> > The problem with only having `bchr` is that it doesn't help with
> > `bytearray`;
>
> What is the use case for bytearray.fromord? Even in the rare case
> someone needs it, why not bytearray(bchr(...))?
> ___
> Python-Dev mailing list
> Python-Dev@python.org 
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> victor.stinner%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com