Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Glenn Linderman

On 11/20/2018 10:33 PM, Nathaniel Smith wrote:

On Tue, Nov 20, 2018 at 6:05 PM Glenn Linderman  wrote:

On 11/20/2018 2:17 PM, Victor Stinner wrote:

IMHO performance and hiding implementation details are exclusive. You
should either use the C API with impl. details for best performances,
or use a "limited" C API for best compatibility.

The "limited" C API concept would seem to be quite sufficient for extensions 
that want to extend Python functionality to include new system calls, etc. (pywin32, 
pyMIDI, pySide, etc.) whereas the numpy and decimal might want best performance.

To make things more complicated: numpy and decimal are in a category
of modules where if you want them to perform well on JIT-based VMs,
then there's no possible C API that can achieve that. To get the
benefits of a JIT on code using numpy or decimal, the JIT has to be
able to see into their internals to do inlining etc., which means they
can't be written in C at all [1], at which point the C API becomes
irrelevant.

It's not clear to me how this affects any of the discussion in
CPython, since supporting JITs might not be part of the goal of a new
C API, and I'm not sure how many packages fall between the
numpy/decimal side and the pure-ffi side.

-n

[1] Well, there's also the option of teaching your Python JIT to
handle LLVM bitcode as a source language, which is the approach that
Graal is experimenting with. It seems completely wacky to me to hope
you could write a C API emulation layer like PyPy's cpyext, and
compile that + C extension code to LLVM bitcode, translate the LLVM
bitcode to JVM bytecode, inline the whole mess into your Python JIT,
and then fold everything away to produce something reasonable. But I
could be wrong, and Oracle is throwing a lot of money at Graal so I
guess we'll find out.

Interesting, thanks for the introduction to wacky. I was quite content 
with the idea that numpy, and other modules that would choose to use the 
unlimited API, would be sacrificing portability to non-CPython 
implementations... except by providing a Python equivalent (decimal, and 
some others do that, IIRC).


Regarding JIT in general, though, it would seem that "precompiled" 
extensions like numpy would not need to be re-compiled by the JIT.


But if it does, then the JIT better understand/support C syntax, but JVM 
JITs probably don't! so that leads to the scenario you describe.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Nathaniel Smith
On Tue, Nov 20, 2018 at 6:05 PM Glenn Linderman  wrote:
>
> On 11/20/2018 2:17 PM, Victor Stinner wrote:
>> IMHO performance and hiding implementation details are exclusive. You
>> should either use the C API with impl. details for best performances,
>> or use a "limited" C API for best compatibility.
>
> The "limited" C API concept would seem to be quite sufficient for extensions 
> that want to extend Python functionality to include new system calls, etc. 
> (pywin32, pyMIDI, pySide, etc.) whereas the numpy and decimal might want best 
> performance.

To make things more complicated: numpy and decimal are in a category
of modules where if you want them to perform well on JIT-based VMs,
then there's no possible C API that can achieve that. To get the
benefits of a JIT on code using numpy or decimal, the JIT has to be
able to see into their internals to do inlining etc., which means they
can't be written in C at all [1], at which point the C API becomes
irrelevant.

It's not clear to me how this affects any of the discussion in
CPython, since supporting JITs might not be part of the goal of a new
C API, and I'm not sure how many packages fall between the
numpy/decimal side and the pure-ffi side.

-n

[1] Well, there's also the option of teaching your Python JIT to
handle LLVM bitcode as a source language, which is the approach that
Graal is experimenting with. It seems completely wacky to me to hope
you could write a C API emulation layer like PyPy's cpyext, and
compile that + C extension code to LLVM bitcode, translate the LLVM
bitcode to JVM bytecode, inline the whole mess into your Python JIT,
and then fold everything away to produce something reasonable. But I
could be wrong, and Oracle is throwing a lot of money at Graal so I
guess we'll find out.

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Glenn Linderman

On 11/20/2018 2:17 PM, Victor Stinner wrote:

Le mar. 20 nov. 2018 à 23:08, Stefan Krah  a écrit :

Intuitively, it should probably not be part of a limited API, but I never
quite understood the purpose of this API, because I regularly need any
function that I can get my hands on.
(...)
Reading typed strings directly into an array with minimal overhead.

IMHO performance and hiding implementation details are exclusive. You
should either use the C API with impl. details for best performances,
or use a "limited" C API for best compatibility.


The "limited" C API concept would seem to be quite sufficient for 
extensions that want to extend Python functionality to include new 
system calls, etc. (pywin32, pyMIDI, pySide, etc.) whereas the numpy and 
decimal might want best performance.



Since I would like to not touch the C API with impl. details, you can
imagine to have two compilation modes: one for best performances on
CPython, one for best compatibility (ex: compatible with PyPy). I'm
not sure how the "compilation mode" will be selected.


The nicest interface from a compilation point of view would be to have 
two #include files: One to import the limited API, and one to import the 
performance API. Importing both should be allowed and should work.


If you import the performance API, you have to learn more, and be more 
careful.


Of course, there might be appropriate subsets of each API, having 
multiple include files, to avoid including everything, but that is a 
refinement.




Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/v%2Bpython%40g.nevcal.com


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Victor Stinner
Le mar. 20 nov. 2018 à 23:08, Stefan Krah  a écrit :
> Intuitively, it should probably not be part of a limited API, but I never
> quite understood the purpose of this API, because I regularly need any
> function that I can get my hands on.
> (...)
> Reading typed strings directly into an array with minimal overhead.

IMHO performance and hiding implementation details are exclusive. You
should either use the C API with impl. details for best performances,
or use a "limited" C API for best compatibility.

Since I would like to not touch the C API with impl. details, you can
imagine to have two compilation modes: one for best performances on
CPython, one for best compatibility (ex: compatible with PyPy). I'm
not sure how the "compilation mode" will be selected.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Stefan Krah


On Mon, Nov 19, 2018 at 04:08:07PM +0100, Victor Stinner wrote:
> Le lun. 19 nov. 2018 à 13:18, Stefan Krah  a écrit :
> > In practice people desperately *have* to use whatever is there, including
> > functions with underscores that are not even officially in the C-API.
> >
> > I have to use _PyFloat_Pack* in order to be compatible with CPython,
> 
> Oh, I never used this function. These functions are private (name
> prefixed by "_") and excluded from the limited API.
> 
> For me, the limited API should be functions available on all Python
> implementations. Does it make sense to provide PyFloat_Pack4() in
> MicroPython, Jython, IronPython and PyPy? Or is it something more
> specific to CPython? I don't know the answer. If yes, open an issue to
> propose to make this function public?

It depends on what the goal is: If PyPy wants to be able to use as many
C extensions as possible, then yes.

The function is just one example of what people have to use to be 100%
compatible with CPython (or copy these functions and maintain them ...).


Intuitively, it should probably not be part of a limited API, but I never
quite understood the purpose of this API, because I regularly need any
function that I can get my hands on.


> > I need PyUnicode_KIND()
> 
> IMHO this one should not be part of the public API. The only usage
> would be to micro-optimize, but such API is very specific to one
> Python implementation. For example, PyPy doesn't use "compact string"
> but UTF-8 internally. If you use PyUnicode_KIND(), your code becomes
> incompatible with PyPy.
> 
> What is your use case?

Reading typed strings directly into an array with minimal overhead.


> I would prefer to expose the "_PyUnicodeWriter" API than PyUnicode_KIND().
> 
> > need PyUnicode_AsUTF8AndSize(),
> 
> Again, that's a micro-optimization and it's very specific to CPython:
> result cached in the "immutable" str object. I don't want to put it in
> a public API. PyUnicode_AsUTF8String() is better since it doesn't
> require an internal cache.
> 
> > I *wish* there were PyUnicode_AsAsciiAndSize().
> 
> PyUnicode_AsASCIIString() looks good to me. Sadly, it doesn't return
> the length, but usually the length is not needed.

Yes, these are all just examples.  It's also very useful to be able to
do PyLong_Type.tp_as_number->nb_multiply or grab as_integer_ratio from
the float PyMethodDef.

The latter two cases are for speed reasons but also because sometimes
you *don't* want a method from a subclass (Serhiy was very good in
finding corner cases :-).


Most C modules that I've seen have some internals. Psycopg2:

PyDateTime_DELTA_GET_MICROSECONDS
PyDateTime_DELTA_GET_DAYS
PyDateTime_DELTA_GET_SECONDS
PyList_GET_ITEM
Bytes_GET_SIZE
Py_BEGIN_ALLOW_THREADS
Py_END_ALLOW_THREADS

floatobject.h and longintrepr.h are also popular.


Stefan Krah



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bpo-34532 status

2018-11-20 Thread Steve Dower
It's merged now. Sorry for not seeing the update (I get far too many 
github notifications/emails to be able to pay attention to them - 
pinging on the issue tracker generally works best).


Cheers,
Steve

On 20Nov2018 1002, Brett Cannon wrote:
To provide context, https://bugs.python.org/issue34532 is about making 
the Python launcher on Windows not return an error condition when using 
`py -0` (and probably `py --list` and `py --list-paths`).


On Tue, 20 Nov 2018 at 08:29, Brendan Gerrity > wrote:


Just wanted to check on bpo-34532/pr#9039. The requested changes
were submitted as a commit to the PR.

Best,
Brendan


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bpo-34532 status

2018-11-20 Thread Brett Cannon
To provide context, https://bugs.python.org/issue34532 is about making the
Python launcher on Windows not return an error condition when using `py -0`
(and probably `py --list` and `py --list-paths`).

On Tue, 20 Nov 2018 at 08:29, Brendan Gerrity  wrote:

> Just wanted to check on bpo-34532/pr#9039. The requested changes were
> submitted as a commit to the PR.
>
> Best,
> Brendan
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] bpo-34532 status

2018-11-20 Thread Brendan Gerrity
Just wanted to check on bpo-34532/pr#9039. The requested changes were
submitted as a commit to the PR.

Best,
Brendan
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Nathaniel Smith
On Tue, Nov 20, 2018 at 1:34 AM Petr Viktorin  wrote:
>
> On 11/19/18 12:14 PM, Victor Stinner wrote:
> > To design a new C API, I see 3 options:
> >
> > (1) add more functions to the existing Py_LIMITED_API
> > (2) "fork" the current public C API: remove functions and hide as much
> > implementation details as possible
> > (3) write a new C API from scratch, based on the current C API.
> > Something like #define newcapi_Object_GetItem PyObject_GetItem"?
> > Sorry, but "#undef " doesn't work. Only very few
> > functions are defined using "#define ...".
> >
> > I dislike (1) because it's too far from what is currently used in
> > practice. Moreover, I failed to find anyone who can explain me how the
> > C API is used in the wild, which functions are important or not, what
> > is the C API, etc.
>
> One big, complex project that now uses the limited API is PySide. They
> do some workarounds, but the limited API works. Here's a writeup of the
> troubles they have with it:
> https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst

AFAIK the only two projects that use the limited API are
PySide-generated modules and cffi-generated modules. I guess if there
is some cleanup needed to remove stuff that snuck into the limited
API, then that will be fine as long as you make sure they aren't used
by either of those two projects.

For the regular C API, I guess the PyPy folks, and especially Matti
Picus, probably know more than anyone else about what parts are
actually used in the wild, since they've spent way more time digging
into real projects. (Do you want to know about the exact conditions in
which real projects rely on being able to skip calling PyType_Ready on
a statically allocated PyTypeObject? Matti knows...)

> I hope the new C API will be improvements (and clarifications) of the
> stable ABI, rather than a completely new thing.
> My ideal would be that Python 4.0 would keep the same API (with
> questionable things emulated & deprecated), but break *ABI*. The "new C
> API" would become that new stable ABI -- and this time it'd be something
> we'd really want to support, without reservations.

We already break ABI with every feature release – at least for the
main ABI. The limited ABI supposedly doesn't, but probably does, and
as noted above it has such limited use that it's probably still
possible to fix any stuff that's leaked out accidentally.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-20 Thread Petr Viktorin

On 11/19/18 12:14 PM, Victor Stinner wrote:

To design a new C API, I see 3 options:

(1) add more functions to the existing Py_LIMITED_API
(2) "fork" the current public C API: remove functions and hide as much
implementation details as possible
(3) write a new C API from scratch, based on the current C API.
Something like #define newcapi_Object_GetItem PyObject_GetItem"?
Sorry, but "#undef " doesn't work. Only very few
functions are defined using "#define ...".

I dislike (1) because it's too far from what is currently used in
practice. Moreover, I failed to find anyone who can explain me how the
C API is used in the wild, which functions are important or not, what
is the C API, etc.


One big, complex project that now uses the limited API is PySide. They 
do some workarounds, but the limited API works. Here's a writeup of the 
troubles they have with it: 
https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst



I propose (2). We control how much changes we do at each milestone,
and we start from the maximum compatibility with current C API. Each
change can be discussed and experimented to define what is the C API,
what we want, etc. I'm working on this approach for 1 year, that's why
many discussions popped up around specific changes :-)


I hope the new C API will be improvements (and clarifications) of the 
stable ABI, rather than a completely new thing.
My ideal would be that Python 4.0 would keep the same API (with 
questionable things emulated & deprecated), but break *ABI*. The "new C 
API" would become that new stable ABI -- and this time it'd be something 
we'd really want to support, without reservations.


One thing that did not work with the stable ABI was that it's "opt-out"; 
I think we can agree that a new one must be "opt-in" from the start.
I'd also like the "new API" to be a *strict subset* of the stable ABI: 
if a new function needs to be added, it should be added to both.



Some people recently proposed (3) on python-dev. I dislike this option
because it starts by breaking the backward compatibility. It looks
like (1), but worse. The goal and the implementation are unclear to
me.

--

Replacing PyDict_GetItem() (specialized call) with PyObject_Dict()
(generic API) is not part of my short term plan. I wrote it in the
roadmap, but as I wrote before, each change should be discusssed,
experimented, benchmarked, etc.

Victor
Le lun. 19 nov. 2018 à 12:02, M.-A. Lemburg  a écrit :


On 19.11.2018 11:53, Antoine Pitrou wrote:

On Mon, 19 Nov 2018 11:28:46 +0100
Victor Stinner  wrote:

Python internals rely on internals to implement further optimizations,
than modifying an "immutable" tuple, bytes or str object, because you
can do that at the C level. But I'm not sure that I would like 3rd
party extensions to rely on such things.


I'm not even talking about *modifying* tuples or str objects, I'm
talking about *accessing* their value without going through an abstract
API that does slot lookups, indirect function calls and object unboxing.

For example, people may need a fast way to access the UTF-8
representation of a unicode object.  Without making indirect function
calls, and ideally without making a copy of the data either.  How do
you do that using the generic C API?


Something else you need to consider is creating instances of
types, e.g. a tuple. In C you will have to be able to put
values into the data structure before it is passed outside
the function in order to build the tuple.

If you remove this possibility to have to copy data all the
time, losing the advantages of having a rich C API.
  --
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Nov 19 2018)

Python Projects, Coaching and Consulting ...  http://www.egenix.com/
Python Database Interfaces ...   http://products.egenix.com/
Plone/Zope Database Interfaces ...   http://zope.egenix.com/



::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
   http://www.malemburg.com/

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/encukou%40gmail.com


___
Python-Dev mailing list
Python-Dev@python.org

Re: [Python-Dev] Need discussion for a PR about memory and objects

2018-11-20 Thread Jeff Allen

On 20/11/2018 00:14, Chris Barker via Python-Dev wrote:
On Mon, Nov 19, 2018 at 1:41 AM Antoine Pitrou > wrote:


I'd rather keep the reference to memory addressing than start
doing car
analogies in the reference documentation.


I agree -- and any of the car analogies will probably be only valid in 
some jurisdictions, anyway.


I think being a bit more explicit about what properties an ID has, and 
how the id() function works, and we may not need an anlogy at all, 
it's not that difficult a concept. And methions that in c_python the 
id is (currently) the memory address is a good idea for those that 
will wonder about it, and if there is enough explanation, folks that 
don't know about memory addresses will not get confused.

...

I suggest something like the following:

"""
Every object has an identity, a type and a value. An object’s identity 
uniquely identifies the object. It will remain the same as long as 
that object exists. No two different objects will have the same id at 
the same time, but the same id may be re-used for future objects once 
one has been deleted. The ‘is’ operator compares the identity of two 
objects; the id() function returns an integer representing its 
identity. ``id(object_a) == id(object_b)`` if and only if they are the 
same object.


**CPython implementation detail:** For CPython, id(x) is the memory 
address where x is stored.

"""
I agree that readers of a language reference should be able to manage 
without the analogies.


I want to suggest that identity and id() are different things.

The notion of identity in Python is what we access in phrases like "two 
different objects" and "the same object" in the text above. For me it 
defies definition, although one may make statements about it. A new 
object, wherever stored, is identically different from all objects 
preceding it.


Any Python has to implement the concept of identity in order to refer, 
without confusion, to objects in structures and bound by names. In 
practice, Python need only identify an object while the object exists in 
the interpreter, and the object exists as long as something refers to it 
in this way. To this extent, the identifier in the implementation need 
not be unique for all time.


The id() function returns an integer approximating this second idea. 
There is no mechanism to reach the object itself from the result, since 
it does not keep the object in existence, and worse, it may now be the 
id of a different object. In defining id(), I think it is confusing to 
imply that this number *is* the identity of the object that provided it.


Jeff Allen
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com