[Python-Dev] Re: Switching to Discourse

2022-07-21 Thread Stefan Behnel

h.vetin...@gmx.com wrote on 18.07.22 at 18:04:

One of the comments in the retro was:

Searching the archives is much easier and has turned up many old threads that I 
probably would have had trouble finding before, since I haven’t been subscribed for 
that long.


I'm actually reading python-dev, c.l.py etc. through Gmane, and have done 
that ever since I joined. Simply because it's a mailing list of which I 
don't need a local (content) copy, and wouldn't want one. Gmane seems to 
have a complete archive that's searchable, regardless of "when I subscribed".


It's really sad that Discourse lacks an NNTP interface. There's an 
unmaintained bridge to NNTP servers [1], but not an emulating interface 
that would serve the available discussions via NNTP messages, so that users 
can get them into their NNTP/Mail clients to read them in proper discussion 
threads. I think adding that next to the existing web interface would serve 
everyone's needs just perfectly.


Anyone up for giving that a try? It can't be *that* difficult. ;-)

Stefan


[1] https://github.com/sman591/discourse-nntp-bridge

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/USPYYNP24UYQQ64YBBTHNOEDNGX46LVM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Switching to Discourse

2022-07-16 Thread Stefan Behnel

Petr Viktorin wrote on 15.07.22 at 13:18:
The discuss.python.org experiment has been going on for quite a while, and 
while the platform is not without its issues, we consider it a success. The 
Core Development category is busier than python-dev. According to staff, 
discuss.python.org is much easier to moderate. If you're following 
python-dev but not discuss.python.org, you're missing out.


That's one of the reasons why I pretty much lost track of CPython 
development since d.p.o was introduced. It's sad, but it was just too much 
work for me (compared to threaded Newsgroups) to follow the discussions 
there, definitely more than I wanted to invest.


It's not the only reason, though, so please make the decision about the home of 
CPython discussions that suits the (currently) more active part of the 
development community.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TA5YNMEJURKMJHTSYTM5Z6G2YQ6UM5TP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introducing C API incompatible changes in Python)

2022-02-10 Thread Stefan Behnel

Petr Viktorin wrote on 10.02.22 at 11:22:
So, should there be a mechanism to set source/lineno/position on 
tracebacks/exceptions, rather than always requiring a frame for it?


There's "_PyTraceback_Add()" currently, but it's incomplete in terms of 
what Cython would need.
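
For reference, CPython's existing private helper only takes flat C-level 
arguments, which is part of why it falls short (its signature, as found in 
CPython's traceback handling code):

void _PyTraceback_Add(const char *funcname, const char *filename, int lineno);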


As it stands, Cython could make use of a function that accepted

- string object arguments for filename and function name
- (optionally) a 'globals' dict (or a reference to the current module)
- (optionally) a 'locals' mapping
- (optionally) a code object
- a C integer source line
- a C integer position, probably start and end lines and columns

to add a traceback level to the current exception.

I'm not sure about the code object since that's a rather heavy thing, but 
given that Cython needs to create code objects in order for its functions 
to be introspectible, that seems like a worthwhile option to have.
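
As a rough sketch (the name and exact signature below are invented, not an 
existing or proposed CPython API), such a function might be declared as:

#include <Python.h>

/* Hypothetical: append one traceback level to the currently raised
 * exception, based on the argument list above. */
int PyException_AddTracebackLevel(
        PyObject *filename,         /* str */
        PyObject *funcname,         /* str */
        PyObject *globals,          /* dict, may be NULL */
        PyObject *locals,           /* mapping, may be NULL */
        PyObject *code,             /* code object, may be NULL */
        int lineno,                 /* C integer source line */
        int start_line, int start_col,
        int end_line, int end_col); /* position range */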


However, with the recent frame stack refactoring and frame objects now 
being created lazily, according to


https://bugs.python.org/issue44032
https://bugs.python.org/issue44590

I guess Cython should rather integrate with the new stack frame 
infrastructure in general. That shifts the requirements a bit.


An API function like the above would then still be helpful for the reduced 
API compile mode, I guess. But as soon as Cython uses InterpreterFrame 
structs internally, it would no longer be helpful for the fast mode.


InterpreterFrame objects are based on bytecode instructions again, which 
brings us back to co_positions.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YSP36JL5SRSPEG4X67G5RMWUWLVXSDC5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introducing C API incompatible changes in Python)

2022-02-09 Thread Stefan Behnel

Andrew Svetlov wrote on 09.02.22 at 19:40:

Stefan, do you really need to emulate the call stack with positions?
Could the __note__ string, with the Cython-generated part of the exception
traceback, solve your needs (https://www.python.org/dev/peps/pep-0678/)?


Thanks for the link, but I think it would be surprising for users if a 
traceback displayed some code positions differently than others, when all 
code lines refer to Python code.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BSDVX7MJFDZ6PFB7FG7Z3R4IO56FZ47T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introducing C API incompatible changes in Python)

2022-02-09 Thread Stefan Behnel

Guido van Rossum wrote on 09.02.22 at 19:36:

On Wed, Feb 9, 2022 at 9:41 AM Pablo Galindo Salgado wrote:

On Wed, 9 Feb 2022 at 17:38, Stefan Behnel wrote:

Pablo Galindo Salgado wrote on 09.02.22 at 17:40:

Should there be a getter/setter for co_positions?


We consider the representation of co_positions private


Yes, and that's the issue.


I can only say that currently, I am not confident about exposing such an API,
at least for co_positions, as the internal implementation is very likely to
change heavily, and we want to keep the possibility of changing it between
patch versions if required (to address bugs and other things like that).

>
> It might require a detailed API design proposal coming from outside
> CPython
> (e.g. from Cython) to get this to change. I imagine for co_positions in
> particular this would have to use a "builder" pattern.
>
> I am unclear on how this would work though, given that Cython generates C
> code, not CPython bytecode. How would the synthesized co_positions be
> used?
> Would Cython just generate a co_positions fragment at the moment an
> exception is raised, pointing at the .pyx file from which the code was
> generated?

So, what we currently do is to update the line number (which IIRC is really 
the start line number of the current function) on the current frame when an 
exception is raised, and set the bytecode offset to 0. That's a hack, but it 
shows the correct code line in the traceback. It probably conflicts with pdb, 
but there are still other issues with that anyway.


I remember looking into the old lnotab mapping at some point and trying to 
implement that with fake bytecode offsets, but I never got it finished.


The idea is pretty simple, though. Instead of bytecode offsets, we'd count 
our syntax tree nodes and just store the code position range of each syntax 
node at the "bytecode offset" of the node's counter number. That's 
probably fairly easy to do in C code, maybe even with a statically 
allocated data structure. Then, instead of setting the frame function's 
line number, we'd set the frame's bytecode instruction counter to the 
number of the failing syntax node, and CPython would retrieve the code 
position from that offset.
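
As a minimal sketch of that layout (all names and values invented for 
illustration), the generated C code could contain something like:

/* One statically allocated entry per syntax node of a function,
 * indexed by the node's counter number. */
typedef struct {
    int start_line, start_col;
    int end_line, end_col;
} __pyx_source_pos;

static const __pyx_source_pos __pyx_positions_myfunc[] = {
    {10, 4, 10, 27},   /* node 0 */
    {11, 8, 11, 15},   /* node 1 */
    /* ... one entry per syntax node ... */
};

/* On an exception in node N, set the frame's instruction counter to N;
 * the position lookup then finds the range stored at "offset" N. */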


That sounds simple enough, probably simpler than any API usage – but 
depends on implementation details.


Especially the idea of storing all this statically in the data segment of 
the shared library sounds very tempting.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GAJFB6ABFYXF3RFXFDQ3YUZD23FMXPEY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP-657 and co_positions (was: Please update Cython *before* introducing C API incompatible changes in Python)

2022-02-09 Thread Stefan Behnel

Pablo Galindo Salgado wrote on 09.02.22 at 17:40:

Should there be a getter/setter for co_positions?


We consider the representation of co_positions private


Yes, and that's the issue.



so we don't want (for now) to add
getters/setters. If you want to get the position of an instruction, you can
use PyCode_Addr2Location.


What Cython needs is the other direction. How can we provide the current 
source position range for a given piece of code to an exception?


As it stands, the way to do this is to copy CPython's implementation 
details into Cython, so that it can generate the specific data structures 
that CPython uses internally to represent code positions.


I would prefer using an API instead that allows exposing this mapping 
directly to CPython's traceback handling, rather than having to emulate 
bytecode positions. While that would probably be quite doable, it's far 
from a nice interface for something that is not based on bytecode.


And that's not just a Cython issue. The same applies to Domain Specific 
Languages or other programming languages that integrate with Python and 
want to show users code positions for their source code.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VQSWX6MFKIA3RYPSX7O6RTVC422LTJH4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-08 Thread Stefan Behnel

Inada Naoki wrote on 08.02.22 at 06:15:

On Tue, Feb 8, 2022 at 1:47 PM Guido van Rossum wrote:


Thanks for trying it! I'm curious why it would be slower (perhaps less 
locality? perhaps the ...Id... APIs have some other trick up their sleeve?) but 
since it's also messier and less backwards compatible than just leaving 
_Py_IDENTIFIER alone and just not using it, I'd say let's not spend more time 
on that alternative and just focus on the two other horses still in the race: 
immortal objects or what you have now.



I think it's because statically allocated strings are not interned.


That would explain such a difference.



I think deepfreeze should stop using statically allocated strings for
interned strings too.


… or consider the statically allocated string to be the interned string 
value, unless another one already exists, which shouldn't be the case for 
CPython-internal strings.
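
A heavily simplified sketch of that idea (names invented here; CPython's 
real interning code lives in Objects/unicodeobject.c and differs in detail):

/* Let a statically allocated string become the interned value,
 * unless an equal string was interned earlier. */
static void
intern_static_string(PyObject *interned_dict, PyObject **p_str)
{
    /* PyDict_SetDefault() returns a borrowed reference to the value
     * that ends up in the dict: either *p_str or an earlier entry. */
    PyObject *entry = PyDict_SetDefault(interned_dict, *p_str, *p_str);
    if (entry != NULL && entry != *p_str) {
        Py_INCREF(entry);
        *p_str = entry;  /* the static object itself is never freed */
    }
}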


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5NE7EI3TVW4C3ZZI6LO5HNPIZRQNPMHG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-04 Thread Stefan Behnel

Eric Snow wrote on 04.02.22 at 17:35:

On Fri, Feb 4, 2022 at 8:21 AM Stefan Behnel wrote:

Correct. We (intentionally) have our own way to intern strings and do not
depend on CPython's identifier framework.


You're talking about __Pyx_StringTabEntry (and __Pyx_InitString())?


Yes, that's what we generate. The C code parsing is done here:

https://github.com/cython/cython/blob/79637b23da77732e753b1e1ab5669b3e29978be3/Cython/Compiler/Code.py#L531-L550

The deduplication is a bit complex on our side because it needs to handle 
Python source encodings, and also distinguishes between identifiers (that 
become 'str' in Py2), plain Unicode strings and byte strings. You don't 
need most of that for plain C code. But it's done here:


https://github.com/cython/cython/blob/79637b23da77732e753b1e1ab5669b3e29978be3/Cython/Compiler/Code.py#L1009-L1088

And then there's a whole bunch of code that helps in getting Unicode 
character code points and arbitrary byte values in very long strings pushed 
through C compilers, while keeping it mostly readable for interested users. :)


https://github.com/cython/cython/blob/master/Cython/Compiler/StringEncoding.py

You probably don't need that either, as long as you only deal with ASCII 
strings.


Anyway, have fun. Feel free to ask if I can help.

Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QHJBAKIQUKFPIM6GZ7DYNJF3HDMDQQUH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-04 Thread Stefan Behnel

Ronald Oussoren via Python-Dev wrote on 03.02.22 at 14:46:

On 2 Feb 2022, at 23:41, Eric Snow wrote:
* a little less convenient: adding a global string requires modifying
a separate file from the one where you actually want to use the string
* strings can get "orphaned" (I'm planning on checking in CI)
* some strings may never get used for any given ./python invocation
(not that big a difference though)


The first two cons can probably be fixed by adding some indirection, with some
markers at the place of use and a script that uses those to generate the
C definitions.

Although my gut feeling is that adding the CI check you mention is good
enough and adding the tooling for generating code isn’t worth the additional
complexity.


It's what we do in Cython, and it works really well there. It's very 
straightforward; you just write something like


PYUNICODE("some text here")
PYIDENT("somename")

in your C code and Cython creates a deduplicated global string table from 
them and replaces the string constants with the corresponding global 
variables. (We have two different names because an identifier in Py2 is 
'str', not 'unicode'.)


Now, the thing with CPython is that the C sources where the replacement 
would take place are VCS controlled. And a script that replaces the 
identifiers would have to somehow make sure that the new references do not 
get renamed, which would lead to non-local changes when strings are added.


What you could try is to number the identifiers, i.e. use a macro like

_Py_STR(123, "some text here")

where you manually add a new identifier as

_Py_STR("some text here")

and the number is filled in automatically by a script that finds all of 
them, deduplicates, and adds new identifiers at the end, adding 1 to the 
maximum number that it finds. That makes sure that identifiers that already 
have an ID number will not be touched, deleted strings disappear 
automatically, and non-local changes are prevented.


Defining the _Py_STR() macro as

   #define _Py_STR(id, text)  (_Py_global_string_table[id])

or

   #define _Py_STR(id, text)  (_Py_global_string_table##id)

would also give you a compile error if someone forgets to run the script.
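
To make that concrete, the pieces could fit together like this (all names 
and numbers are illustrative):

/* The script-maintained global table, filled during startup: */
static PyObject *_Py_global_string_table[64];

#define _Py_STR(id, text)  (_Py_global_string_table[id])

/* In C code, after the script assigned id 17 to this literal: */
static PyObject *
get_text_attribute(PyObject *obj)
{
    return PyObject_GetAttr(obj, _Py_STR(17, "some text here"));
}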

Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LD3JM2NQ5ZUZDK63RH4IVZPCZ7HC4X3G/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-04 Thread Stefan Behnel

Petr Viktorin wrote on 03.02.22 at 13:47:

On 02. 02. 22 11:50, Stefan Behnel wrote:
Maybe we should advertise the two modes more. And make sure that both 
work. There are certainly issues with the current state of the "limited 
API" implementation, but that just needs work and testing.


I wonder if it can it be renamed? "Limited API" has a specific meaning 
since PEP 384, and using it for the public API is adding to the general 
confusion in this area :(


I was more referring to it as an *existing* compilation mode of Cython that 
avoids the usage of CPython implementation details. The fact that the 
implementation is incomplete just means that we spill over into non-limited 
API code when no limited API is available for a certain feature. That will 
usually be public API code, unless that is really not available either.


One recent example is the new error locations in tracebacks, where PEP 657 
explicitly lists the new "co_positions" field in code objects as an 
implementation detail of CPython. If we want to implement this in Cython, 
then there is no other way than to copy these implementation details almost 
verbatim from CPython and to depend on them.


https://www.python.org/dev/peps/pep-0657/

In this specific case, we're lucky that this can be considered an entirely 
optional feature that we can separately disable when users request "public 
API" mode (let's call it that). Not sure if that's what users want, though.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A55HYBIFBOTAX5IB4YUYWUHI3IDLRD2F/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-04 Thread Stefan Behnel

Victor Stinner wrote on 03.02.22 at 22:46:

Oh right, Cython seems to be a false positive.

A code search found 3 references to __Pyx_PyObject_LookupSpecial():

PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz: Cython-0.29.26/Cython/Compiler/ExprNodes.py:
    lookup_func_name = '__Pyx_PyObject_LookupSpecial'
PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz: Cython-0.29.26/Cython/Compiler/Nodes.py:
    code.putln("%s = __Pyx_PyObject_LookupSpecial(%s, %s); %s" % (
PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz: Cython-0.29.26/Cython/Utility/ObjectHandling.c:
    static CYTHON_INLINE PyObject* __Pyx_PyObject_LookupSpecial(PyObject* obj, PyObject* attr_name) {

Oh, that's not "_PyObject_LookupSpecial()", it doesn't use the
_Py_Identifier type:

static CYTHON_INLINE PyObject*
__Pyx_PyObject_LookupSpecial(PyObject* obj, PyObject* attr_name)
{ ... }


Correct. We (intentionally) have our own way to intern strings and do not 
depend on CPython's identifier framework.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4ATP4FSVRNI5CLAJDN43QRDH5IHW7BW2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Victor Stinner wrote on 02.02.22 at 23:23:

On Wed, Feb 2, 2022 at 3:54 PM Stefan Behnel wrote:

So people using stable Python versions like Python 3.10 would not need
Cython, but people testing the "next Python" (Python 3.11) would not
have to manually remove generated C code.


That sounds like an environment variable might help?


Something like CYTHON_FORCE_REGEN=1 would be great :-)


https://github.com/cython/cython/commit/b859cf2bd72d525a724149a6e552abecf9cd9d89

Note that this only applies when cythonize() is actually called. Some 
setup.py scripts may not do that unless requested to.




My use case is to use a project on the "next Python" version (the main
branch) when the project contains outdated generated C code, whereas I
have a more recent Cython version installed.


That use case would probably be covered by the Cython version check now, in 
case that stays in (the decision is pending user feedback).


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N6R5BE4GVNYRUTOET5QRQ5N2ZCJYZC7X/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Ronald Oussoren via Python-Dev wrote on 02.02.22 at 16:44:

On 2 Feb 2022, at 11:50, Stefan Behnel wrote:
Petr Viktorin wrote on 02.02.22 at 10:22:

- "normal" public API, covered by the backwards compatibility policy (users 
need to recompile for every minor release, and watch for deprecation warnings)


That's probably close to what "-DCYTHON_LIMITED_API" does by itself as it stands. I can 
see that being a nice feature that just deserves a more suitable name. (The name was chosen because 
it was meant to also internally define "Py_LIMITED_API" at some point. Not sure if it 
will ever do that.)



- internal API (underscore-prefixed names, `internal` headers, things 
documented as private)
AFAIK, only the last one is causing trouble here.


Yeah, and that's the current default mode on CPython.


Is it possible to automatically pick a different default version when building 
with a too new CPython version?  That way projects can at least be used and 
tested with pre-releases of CPython, although possibly with less performance.


As I already wrote elsewhere, that is making the assumption (or at least 
optimising for the case) that a new CPython version always breaks Cython. 
And it has the drawback that we'd get less feedback on the "normal" 
integration and may thus end up noticing problems only later in the CPython 
development cycle.


I don't think this really solves a problem.

In any case, before we start playing with the default settings, I'd rather 
let users see what *they* can make of the available options. Then we can 
still come back and see which use cases there are and how to support them 
better.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2SIGLMW4HNF5BDF2DTFZFXCHNSR4VAGB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Petr Viktorin wrote on 02.02.22 at 10:22:
Moving off the internal (unstable) API would be great, but I don't think 
Cython needs to move all the way to the limited API.

There are three "levels" in the C API:

- limited API, with long-term ABI compatibility guarantees


That's what "-DCYTHON_LIMITED_API -DPy_LIMITED_API=..." is supposed to do, 
which currently fails for much if not most code.



- "normal" public API, covered by the backwards compatibility policy (users 
need to recompile for every minor release, and watch for deprecation warnings)


That's probably close to what "-DCYTHON_LIMITED_API" does by itself as it 
stands. I can see that being a nice feature that just deserves a more 
suitable name. (The name was chosen because it was meant to also internally 
define "Py_LIMITED_API" at some point. Not sure if it will ever do that.)



- internal API (underscore-prefixed names, `internal` headers, things 
documented as private)


AFAIK, only the last one is causing trouble here.


Yeah, and that's the current default mode on CPython.

Maybe we should advertise the two modes more. And make sure that both work. 
There are certainly issues with the current state of the "limited API" 
implementation, but that just needs work and testing.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ESEPW36K3PH4RM7OFVKAOE4QMBI2WYVU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Victor Stinner wrote on 02.02.22 at 11:35:

I wish that there would be a 3rd option: ship C code generated by
Cython *but* run Cython if this C code "looks" outdated, for example
if building the C code fails with a compiler error.


So, one thing I did yesterday was to make sure that .c files get 
regenerated when a different Cython version is used at build time than what 
was used to generate them originally.


Thinking about this some more now, I'm no longer sure that this is really a 
good idea, because it can lead to "random" build failures when a package 
does not pin its Cython version and a newer (or, probably worse, older) one 
happens to be installed at build time.


Not sure how to best deal with this. I'm open to suggestions, although this 
might be the wrong forum.


Let's discuss it in a ticket:

https://github.com/cython/cython/issues/4611

Note that what you propose sounds more like a setuptools feature than a 
Cython feature, though.




So people using stable Python versions like Python 3.10 would not need
Cython, but people testing the "next Python" (Python 3.11) would not
have to manually remove generated C code.


That sounds like an environment variable might help?

I don't really want to add something like a "last supported CPython 
version". There is no guarantee that the code breaks between CPython 
versions, so that would just introduce an artificial support blocker.




In Fedora RPM packages of Python projects, we have to force manually
running Cython. For example, the numpy package does: "rm PKG-INFO"
with the comment: "Force re-cythonization (ifed for PKG-INFO presence
in setup.py)".
https://src.fedoraproject.org/rpms/numpy/blob/rawhide/f/numpy.spec#_107

In my pythonci project, I use a worse hack, I search for generated C
files and remove them manually with this shell command:

 rm -f -v $(grep -rl '/\* Generated by Cython') PKG-INFO

This command searches for the pattern "/* Generated by Cython".


Right. Hacks like these are just awful. There must be a better way.

Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V76GA5DRWPEJ7PRBSPRQX335WARZLUHJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Guido van Rossum wrote on 02.02.22 at 01:43:

It may be hard to imagine if you're working on Cython, which only exists
because of performance needs, but there are other things that people want
to test with the upcoming CPython release in addition to performance


I know. Cython (and originally Pyrex) has come a long way from a tool to 
get stuff done to a dependency that a large number of packages depend on. 
Maintainer decisions these days are quite different from those 10 years 
ago. Let alone 20.


Let's just try to keep things working in general, and fix stuff that needs 
to be broken.




On Tue, Feb 1, 2022 at 4:14 PM Stefan Behnel wrote:

I'd rather make it more obvious to users what their intentions are. And
there is already a way to do that – the Limited API. (and similarly, HPy)


Your grammar confuses me. Do you want users to be clearer in expressing
their intentions?


Erm, sort of. They should be able to choose and express what they prefer, 
in a simple way.




For Cython, support for the Limited API is still work in progress, although
many things are in place already. Getting it to work completely would give
users a simple way to decide whether they want to opt in for a) speed,
lots of wheels and adaptations for each CPython version, or b) less
performance, less hassle.


But until that work is complete, we're stuck with the unlimited API, right?
And by its own statements in a recent post here, HPy is still not ready for
all use cases, so it's also still a pipe dream.


Yes. HPy is certainly far from ready for anything real, but even for the 
Limited API, it's still unclear whether it's actually complete enough to 
cover Cython's needs. Basically, the API that Cython uses must really be 
able to implement CPython on top of itself. And at the same time interact 
not with the reimplementation but with the underlying original, at the C 
level. The C-API, and especially the Limited API, were never really meant 
for that.




As it looks now, that switch can be done after the code generation, by
defining a simple C define in their build script. That also makes both
modes easily comparable. I think that is as good as it can get.


Do you have specific instructions for package developers here? I could
imagine that the scikit-learn maintainer (sorry to pick on you guys :-)
might not know where to start with this if until now they've always been
able to rely on either numpy wheels or building everything from source with
default settings.


It's not well documented yet, since the implementation isn't complete, and 
so, a bunch of things simply won't work. I don't remember if the buffer 
protocol is part of the Limited API by now, but last I checked it was still 
missing, so the scikit-learn (or NumPy) people would be fairly unhappy with 
the current state of affairs.


But it's mostly just passing "-DCYTHON_LIMITED_API" to your C compiler. 
That's the part that will still work but won't do (yet) what you think. 
Because then, you currently also have to define "-DPy_LIMITED_API=..." and 
that's when your C compiler will get angry with you.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2UFG7IPKR77HQG36BZAUEUDJJKIGBSLE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Thomas Caswell wrote on 01.02.22 at 23:15:

I think it would be better to discourage projects from including the output
of cython in their sdists.  They should either have cython as a build-time
requirement or provide built wheels (which are specific to a platform and
CPython version).  The middle ground of not expecting the user to have
cython while expecting them to have a working C compiler is a very narrow
case and I think asking those users to install cython is worth the forward
compatibility for Python versions you get by requiring people installing
from source to re-cythonize.


I agree. Shipping the generated C sources was a very good choice as long as 
CPython's C-API was very stable and getting a build-time dependency safely 
installed on the user's side was very difficult.


These days, it's the opposite way.

Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KTWDJGHPQW7AIKDQQYV4IFHAKQZVXACL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Guido van Rossum wrote on 02.02.22 at 00:21:

On Tue, Feb 1, 2022 at 3:07 David wrote:

Greg Ewing wrote:

To address this there could be an option to choose between
"compatible code" and "fast code", with the former restricting
itself to the stable API.


To some extent, that exists at the moment - many of the real abuses of the
CPython internals can be controlled by setting C defines. For the
particular feature that caused this discussion the majority of the uses can
be turned off by defining CYTHON_USE_EXC_INFO_STACK=0 and
CYTHON_FAST_THREAD_STATE=0. (There are still a few uses relating to
coroutines, but those two flags are sufficient to get Cython to build
itself and Numpy on Python 3.11a4).

Obviously it could still be better. But the desire to support PyPy (and
the beginnings of the limited API) mean that Cython does actually have
alternate "clean" code-paths for a lot of cases.


Hm... So maybe the issue is either with Cython's default settings (perhaps
traditionally it defaults to "as fast as possible but relies on internal
APIs a lot"?) or with the Cython settings selected by default by projects
*using* Cython?

I wonder if a solution during CPython's rocky alpha release cycle could be
to default (either in Cython or in projects using it) to the "not quite as
fast but not relying on a lot of internal APIs" mode, and to switch to
Cython's faster mode only once (a) beta is entered and (b) Cython has been
fixed to work with that beta?


This seems tempting – with the drawback that it would make Cython modules 
less comparable between final and alpha/beta CPython releases. Users 
would then start reporting ghost performance regressions, because it 
(understandably) feels important to them that the slow-down they witness 
gets resolved before the final release, and they just won't know 
that this will happen automatically, triggered by the version switch. :)


Feels a bit like car manufacturers who switch their exhaust cleaners on and 
off based on the test mode detection.


More importantly, though, we'd get fewer bug reports during the alpha/beta 
cycle ourselves, because things may look like they work but can still stop 
working when we switch back to fast mode.


I'd rather make it more obvious to users what their intentions are. And 
there is already a way to do that – the Limited API. (and similarly, HPy)


For Cython, support for the Limited API is still work in progress, although 
many things are in place already. Getting it to work completely would give 
users a simple way to decide whether they want to opt in for a) speed, lots 
of wheels and adaptations for each CPython version, or b) less performance, 
less hassle.


As it looks now, that switch can be done after the code generation, by 
defining a simple C define in their build script. That also makes both 
modes easily comparable. I think that is as good as it can get.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FXSNX7UCQWNXXC7OWG4LBLILAYXQEOUB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Hi Irit,

Irit Katriel via Python-Dev wrote on 01.02.22 at 23:04:

There are two separate issues here. One is the timing of committing changes into
cython, and the other is the process by which the cython devs learn about
cpython development.

On the first issue, you wrote:

I'm reluctant to work on adapting Cython during alphas, because it

happened more than once that incompatible changes in CPython were rolled
back or modified again during alpha, beta and rc phases. That means more
work for me and the Cython project, and its users. Code that Cython users
generate and release on their side with a release version of Cython will
then be broken, and sometimes even more broken than with an older Cython
release.


I saw in your patch that you make changes such that they impact only the
new cpython version. So for old versions the generated code should not be
broken. Surely you don't guarantee that cython code generated for an alpha
version of cpython will work on later versions as well?  Users who generate
code for an alpha version should regenerate it for the next alpha and for
beta, right?


I'd just like to note that we are talking about three different projects 
and dependency levels here (CPython, Cython, and a project that uses 
Cython); all three have different release cycles, and not all projects can 
afford to go through a new release with a new Cython version regularly, or 
on the "emergency" event of a new CPython release. Some don't even provide 
wheels and require their users to do a source build on their side, often 
with a fixed Cython version dependency, or even with pre-generated and 
shipped C sources, which makes it harder for the end users to upgrade 
Cython as a work-around.


But at least it should be as easy for the maintainers as updating their 
Cython version and pushing a new release. In most cases. And things are 
also becoming easier these days with improvements in the packaging 
ecosystem. It can just take a bit until everyone has had the chance to 
upgrade along the food chain.




On the second issue:


I don't have the capacity to follow all relevant changes in CPython,
incompatible or not.


We get that, and this is why we're asking to work with you on cython updates
so that this will be easier for all of us. There are a number of cpython
core devs
who would like to help cython maintenance. We realise how important and
thinly resourced cython is, and we want to reduce your maintenance burden.
With better communication we could find ways to do that.


I'm sure we will. Thanks for your help. It is warmly appreciated.



Returning to the issue that started this thread - how do you suggest we
proceed with the exc_info change?


I'm not done sorting out the options yet. Regarding CPython, I think it's 
best to keep the current changes in there. It should be easier for us to 
continue from where we are now than to adapt again to a revert in CPython.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BHIQL4P6F7OPMCAP6U24XEZUPQKI62UT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Greg Ewing wrote on 01.02.22 at 23:33:

On 2/02/22 8:48 am, Guido van Rossum wrote:
It seems to me that a big part of the problem is that Cython feels 
entitled to use arbitrary CPython internals.


I think the reason for this is that Cython is trying to be two
things at once: (1) an interface between Python and C, (2) a
compiler that turns Python code into fast C code.

To address this there could be an option to choose between
"compatible code" and "fast code", with the former restricting
itself to the stable API.


There is even more than such an option. We use a relatively large set of 
feature flags that allow us to turn the usage of certain implementation 
details of the C-API on and off. We use this to adapt to different Python 
C-API implementations (currently CPython, PyPy, GraalPython and the Limited 
C-API), although with different levels of support and reliability.


Here's the complete list of feature sets for the different targets:

https://github.com/cython/cython/blob/5a76c404c803601b6941525cb8ec8096ddb10356/Cython/Utility/ModuleSetupCode.c#L56-L311

This can also be used to enable and disable certain dependencies on CPython 
implementation details, e.g. PyList, PyLong or PyUnicode, but also type 
specs versus PyTypeObject structs.


Most of these feature flags can be disabled by users. There is no hard 
guarantee that this always works, because it's impossible to test all 
combinations, and there are bugs as well, but most of the flags are 
independent, so it should usually be possible to disable them individually.
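
As a simplified illustration of how such a flag gates code paths (this is 
not Cython's actual code):

/* Defaults to on unless overridden, e.g. with -DCYTHON_FAST_THREAD_STATE=0 */
#ifndef CYTHON_FAST_THREAD_STATE
  #define CYTHON_FAST_THREAD_STATE 1
#endif

#if CYTHON_FAST_THREAD_STATE
    /* fast path: access interpreter internals directly */
#else
    /* fallback path: use only public C-API calls */
#endif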


So, one of the tools that we have up our sleeves when it comes to 
supporting new CPython versions is also to selectively disable the 
dependency on a certain C-API feature that changed, at least until we have 
a way to adapt to the change itself.


In the specific case of the "exc_info" changes, however, that didn't quite 
work, because that change was really not anticipated at that level of 
impact. But there is an implementation for Cython 3.0 alpha now, and we'll 
eventually have a legacy 0.29.x release out that will also adapt in one way 
or another. Just takes a bit more time.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QPAWLCS2FINPLVSDFFQCMVIELXETKQ3W/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Christian Heimes wrote on 01.02.22 at 16:42:

On 01/02/2022 16.08, Victor Stinner wrote:

I would prefer to introduce C API incompatible changes differently:
first fix Cython, and *then* introduce the change.

- (1) Propose a Cython PR and get it merged
- (2) Wait until a new Cython version is released
- (3) If possible, wait until numpy is released with regenerated Cython code
- (4) Introduce the incompatible change in Python

Note: Fedora doesn't need (3) since we always regenerated Cython code in 
numpy.


this is a reasonable request for beta releases, but IMHO it is not feasible 
for alphas. During alphas we want to innovate fast and play around. Your 
proposal would slow down innovation and impose additional burden on core 
developers.


Let's at least try not to run into a catch-22.

I'm reluctant to work on adapting Cython during alphas, because it 
happened more than once that incompatible changes in CPython were rolled 
back or modified again during alpha, beta and rc phases. That means more 
work for me and the Cython project, and its users. Code that Cython users 
generate and release on their side with a release version of Cython will 
then be broken, and sometimes even more broken than with an older Cython 
release.


But Victor is right, OTOH, that the longer we wait with adapting Cython, 
the longer users have to wait with testing their code in upcoming CPython 
versions, and the higher the chance of post-beta and post-rc rollbacks and 
changes in CPython.


I don't have the capacity to follow all relevant changes in CPython, 
incompatible or not. Even a breakage of Cython's CPython-dev CI job 
doesn't always mean that there is something to do on our side; that job is 
therefore silenced to avoid breaking our own project workflows, and only 
looked at irregularly. Additionally, since Cython is a crucial part of 
the Python ecosystem, breakage of Cython by CPython sometimes stalls the 
build pipelines of CI images, which means that new CPython dev versions 
don't reach the CI servers for a while, during which the breakage will go 
even more unnoticed.


I think you should generally appreciate Cython (and the few other C-API 
abstraction tools) as an opportunity to get a large number of extensions 
adapted to CPython's now faster development all at once. The quicker these 
tools adapt, the quicker you can get user feedback on your own changes, and 
the more time you have to validate and refine them during the alpha and 
beta cycles.


You can even see the adaptation as a way to validate your own changes in 
the real world. It's cool to write new code, but difficult to find out 
whether it behaves the way you want for the intended audience. So – be part 
of your own audience.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LJDI74V4IOHPCMQUEGH6VIQWHLM3MADG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making code object APIs unstable

2021-09-07 Thread Stefan Behnel

Guido van Rossum wrote on 07.09.21 at 00:44:

In addition, I just heard from the SC that they've approved the exception.
So we will remove these two APIs from 3.11 without deprecation.
Erm, hang on – when I wrote that I'm fine with *changing* them, I wasn't 
thinking of actually *removing* them. At least not both. PyCode_NewEmpty() 
isn't a good replacement since it takes low-level arguments (char* instead 
of Python strings). It's good for the simple use case that it was written 
for (and Cython already uses it for that), but not so great for anything 
beyond that.


What I could try is to create only a single dummy code object and then 
always call .replace() on it to create new ones. But that seems hackish and 
requires managing yet another bit of global state across static and 
generated code parts.
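
A rough sketch of that hack (error handling kept minimal, names invented):

/* Derive a real code object from one shared dummy by calling its
 * Python-level replace() method with the fields to override. */
static PyObject *
code_from_dummy(PyObject *dummy_code, const char *name)
{
    PyObject *result = NULL;
    PyObject *meth = PyObject_GetAttrString(dummy_code, "replace");
    PyObject *args = PyTuple_New(0);
    PyObject *kwargs = Py_BuildValue("{s:s}", "co_name", name);
    if (meth != NULL && args != NULL && kwargs != NULL) {
        result = PyObject_Call(meth, args, kwargs);
    }
    Py_XDECREF(meth);
    Py_XDECREF(args);
    Py_XDECREF(kwargs);
    return result;  /* NULL on error */
}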


I could also switch to _PyCode_New(), though it's not exactly what I would 
call an attractive option, both for usability reasons and its future API 
stability. (Cython also still generates C89 code, i.e. no partial struct 
initialisations.)


Any suggestions?

Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/B6WFHGVAASF4MFUMPIHBZEUCPXVANU7D/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making code object APIs unstable

2021-08-31 Thread Stefan Behnel

Guido van Rossum wrote on 13.08.21 at 19:24:

In 3.11 we're changing a lot of details about code objects. Part of this is
the "Faster CPython" work, part of it is other things (e.g. PEP 657 -- Fine
Grained Error Locations in Tracebacks).

As a result, the set of fields of the code object is changing. This is
fine, the structure is part of the internal API anyway.

But there's a problem with two public API functions, PyCode_New() and
PyCode_NewWithPosArgs(). As we have them in the main (3.11) branch, their
signatures are incompatible with previous versions, and they have to be
since the set of values needed to create a code object is different. (The
types.CodeType constructor signature is also changed, and so is its
replace() method, but these aren't part of any stable API).

Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are part of the PEP
387 stable ABI. What should we do?

A. We could deprecate them, keep (restore) their old signatures, and create
crippled code objects (no exception table, no endline/column tables,
qualname defaults to name).

B. We could deprecate them, restore the old signatures, and always raise an
error when they are called.

C. We could just delete them.

D. We could keep them, with modified signatures, and to heck with ABI
compatibility for these two.

E. We could get rid of PyCode_NewWithPosArgs(), update PyCode() to add the
posonlyargcount (which is the only difference between the two), and d*mn
the torpedoes.

F. Like (E), but keep PyCode_NewWithPosArgs() as an alias for PyCode_New()
(and deprecate it).

If these weren't part of the stable ABI, I'd choose (E). [...]


I also vote for (E). The creation of a code object is tied to interpreter 
internals and thus shouldn't be (or have been) declared stable.


I think the only problem with that argument is that code objects are 
required for frames. You could argue the same way about frames, but then it 
becomes really tricky to, you know, create frames for non-Python code.


Since we're discussing this in the context of PEP 657, I wonder if there's 
a better way to create tracebacks from C code, other than creating fake 
frames with fake code objects.


Cython uses code objects and frames for the following use cases:

- tracing generated C code at the Python syntax level
- profiling C-implemented functions
- tracebacks for C code

Having a way to do these three efficiently (i.e. with close to zero runtime 
overhead) without having to reach into internals of the interpreter state, 
code objects and frames, would be nice.


Failing that, I'm ok with declaring the relevant structs and C-API 
functions non-stable and letting Cython use them as such, as we always did.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XYNNMH57O7CYWHYKTD3ELZTM3B4M53HL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PR checks hang because travis does not report back to github

2020-10-06 Thread Stefan Behnel
Victor Stinner wrote on 05.10.20 at 12:25:
> Would you mind reporting the issue to
> https://github.com/python/core-workflow/issues so we can aggregate
> information about this issue?

Done.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/77GIBSETK44CPRBTF34VNR7CRT66WUGP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PR checks hang because travis does not report back to github

2020-10-05 Thread Stefan Behnel
Ned Deily wrote on 05.10.20 at 01:19:
> On Oct 4, 2020, at 15:55, Terry Reedy wrote:
>>
>> On 10/4/2020 2:32 PM, Mariatta wrote:
>>> This is a known issue and I have brought it up in GitHub OS Maintainers 
>>> Feedback Group. It happens to other projects as well.
>>> Currently we have a branch protection rule where even administrators couldn't 
>>> merge the PR unless all the required checks passed.
>>> Perhaps we can relax the rule to allow administrators to merge the stuck 
>>> PRs. At least temporarily until Travis/GitHub fixes it. Does this sound 
>>> okay?
>>
>> If we are told how to ping the admins, it would be better than being stuck.
> 
> If you run into a problem like this with a stuck PR, contact the release 
> manager for the branch directly via email. Release managers can override the 
> restrictions and we don't always read every list immediately.
> 
> Because this was a trivial change and because of time zones, I've taken the 
> liberty of acting on Pablo's behalf: it's now merged. 

Thank you Ned, and good to know.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/P7HPR3BSG7BO3I7562VULK64ZROGOZT2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PR checks hang because travis does not report back to github

2020-10-04 Thread Stefan Behnel
Hi devs,

I have a trivial documentation PR

https://github.com/python/cpython/pull/22464

for which travis (unsurprisingly) had a successful run,

https://travis-ci.com/github/python/cpython/builds/187435578

but github lists the travis build as "created" instead of "passed".

https://github.com/python/cpython/pull/22464/checks?check_run_id=1188595760

I already tried closing the PR and reopening it, and also triggering the
build again on travis side, but github still fails to pick up the build status.

I tried creating a new PR, but it seems that github (or travis) deduplicates
the build requests and still refers to the original build, so that there is
still no response from travis.

I also cannot find a way to terminate the checks process in github, or
otherwise make it stop waiting for Godot.

Is this a known issue? Is there anything I can do about it?

Thanks,
Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4TNKVOJ2LUJZZHHIBNORZ7GIVMYMNDER/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-07-02 Thread Stefan Behnel
Victor Stinner wrote on 02.07.20 at 00:07:
> On Wed, Jul 1, 2020 at 23:43, Eric V. Smith wrote:
>>
>> On 7/1/2020 3:43 PM, Stefan Behnel wrote:
>>> Petr Viktorin wrote on 30.06.20 at 14:51:
>>>> For example, could we only deprecate the bad parts, but not remove them
>>>> until the experiments actually show that they are preventing a beneficial
>>>> change?
>>> Big nod on this one.
>>
>> At one of the core sprints (maybe at Microsoft?) there was talk of
>> adding a new API without changing the existing one.
>>
> There is the https://github.com/pyhandle/hpy project which is
> implemented on top of the existing C API.
> 
> But this project doesn't solve problems listed in PEP 620, since
> CPython must continue to support existing C extensions.
Maybe I'm missing something here, but how is "removing parts of the C-API"
the same as "supporting existing C extensions"? It seems to me that the two
are direct opposites.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/U6HIYIHPIHSNRIG4IYUTQESJTUXKGOXC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-07-01 Thread Stefan Behnel
Petr Viktorin wrote on 30.06.20 at 14:51:
> For example, could we only deprecate the bad parts, but not remove them
> until the experiments actually show that they are preventing a beneficial
> change?

Big nod on this one.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AUTEISLLD2IGCB2RQVLX74YETCWKGVXH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-26 Thread Stefan Behnel
Victor Stinner wrote on 26.06.20 at 14:39:
> Well, the general problem is to track when the caller ends using a resource.

Although that is less of a problem if you only allow exposing the internal
data representation and nothing else. In that case, you can tie the
lifetime of the data access to the lifetime of the object.

Minus moving GCs, as Carl also pointed out. But even there, you could get
away (probably for quite a while) with pinning the data if someone asked
for it.
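
PyUnicode_AsUTF8() is an existing example of that pattern: the returned 
buffer is cached inside the str object and stays valid for the object's 
lifetime.

/* Borrowed, lifetime-tied data access: valid as long as obj is alive;
 * do not free() the result. */
static const char *
utf8_view(PyObject *obj)
{
    return PyUnicode_AsUTF8(obj);
}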

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NS5MTQFCD7TRZCXS4ZSI3PCPEA5OL6PJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-24 Thread Stefan Behnel
Gustavo Carneiro wrote on 24.06.20 at 19:19:
> On Wed, 24 Jun 2020 at 17:22, Victor Stinner wrote:
>> The question becomes: how to promote the limited C API? Should it
>> become the default, rather than an opt-in option?
> 
> It would be interesting to find out what is the performance impact of using
> limited C API, vs normal API, on some popular extensions.  This is
> something that I wish had been included in PEP 384.

It couldn't, because even today it is still fairly difficult to convert
existing code to the limited API. Some code cannot even be migrated at all,
e.g. because the entire buffer protocol is missing from it. Some bugs were
only fixed in Py3.9; time will tell if anything else is missing.

The only major project that I know of that has been migrated (recently, with
a lot of effort) is the PyQt project. And a GUI toolkit probably doesn't have
all that many performance-critical parts that are dominated by the CPython
C-API. (I'm just guessing; it probably has some, somewhere.)


> It would be great if the limited API could be the default, as it allows
> building extensions once that work across most python versions.

We are adding a C compile mode for the limited API to Cython. That's also a
lot of effort, and probably won't be finished soon, but once that becomes
usable, we'd have a whole bunch of real-world extensions that we could
use for benchmarking, many of which were written for speed. We could even
take a regular Python module and compile it in both C-API variants to compare
"pure Python" to "full C-API" to "limited C-API".

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SENQBEJCJ7NYC72ZZ7BGIEDDBTUOXLI4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-24 Thread Stefan Behnel
Victor Stinner wrote on 24.06.20 at 17:40:
> My practical problem is how to prevent C extensions accessing the
> PyFloatObject.ob_fval member directly.

Do extensions really do that in their code? I mean, there *is* a macro for
doing exactly this thing, which suggests that users should precisely *not* do
it themselves but use the macro. I would simply say that anyone accessing
the structure fields directly instead of using the intended macro is
on their own with that choice. If their code breaks, they'll have to fix it
in the way that was intended for the last 23 years (I looked that up).

I don't have any data, but to me, this sounds like a non-issue to start with.


> In my tests, I renamed PyObject
> members. For example, rename PyObject.ob_type to PyObject._ob_type,
> and update Py_TYPE() and Py_SET_TYPE(). If a C function accesses
> directly PyObject.ob_type, a compilation error is issued.

I think the path of
- making macros / (inline) functions available for all use cases
- making them available in a backport header file
- telling people to use those instead of direct struct access

is the right way. If/when we notice in the future that we need to change an
object struct, and macros are available for the use cases that we break (or
can be made available during a suitable deprecation phase), then extension
authors will notice at that point that they will have to switch to the
macros instead of doing whatever breaks for them (or not).
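
To make that concrete, a minimal sketch of what such a backport header could
provide, using Py_SET_TYPE() (added in CPython 3.9) as an example. The version
guard and the helper name are illustrative, not an official CPython header:

    #include <Python.h>

    /* Provide Py_SET_TYPE() (added in CPython 3.9) on older versions, so
     * that extension code can use the macro unconditionally instead of
     * writing to PyObject.ob_type directly. */
    #if PY_VERSION_HEX < 0x03090000 && !defined(Py_SET_TYPE)
    static inline void compat_Py_SET_TYPE(PyObject *ob, PyTypeObject *type)
    {
        ob->ob_type = type;
    }
    #define Py_SET_TYPE(ob, type) compat_Py_SET_TYPE((PyObject *)(ob), (type))
    #endif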


> One option would be to have a different stricter build mode where
> PyFloat_AS_DOUBLE() becomes a function call. Example:
> 
> #ifndef Py_LIMITED_API
> #  ifdef OPAQUE_STRUCTURE
> #define PyFloat_AS_DOUBLE(op) PyFloat_AsDouble(op)
> #  else
> #define PyFloat_AS_DOUBLE(op) (((PyFloatObject *)(op))->ob_fval)
> #  endif
> #endif

I think that's too broad. Why make all structs opaque, when we don't even
know which ones we may want to touch in the future at all? And, who would
really use this mode?


> Or maybe it's time to extend the limited C API: add
> PyFloat_AS_DOUBLE() macro as a function call. Extending the limited C
> API has multiple advantages:
> 
> * It eases the transition of C extensions to the limited C API
> * Py_LIMITED_API already exists, there is no need to add yet another
> build mode or any new macro
> * Most structures are *already* opaque in the limited C API.

We will have to grow it anyway, so why not. We could also add yet another
optional header file that adds everything from the full C-API that we can
somehow map to the limited C-API, as macros or inline functions. In the
worst case, we could still implement a missing function as a lookup and
call through a Python object method, if there's no other way to do it in
the limited C-API.

In the end, this could lead to a "full C-API wrapper", implemented on top
of the limited C-API. Sounds like a good way to port existing code.
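
As a rough sketch (assuming such a header would only be active under
Py_LIMITED_API), a few entries of that wrapper could simply redirect the fast
macros to their function-call equivalents:

    /* Map full-C-API macros onto limited-C-API function calls.  Slower,
     * but keeps existing extension code compiling unchanged.  (The
     * function variants add type checks and error handling, so the
     * semantics are not perfectly identical.) */
    #ifdef Py_LIMITED_API
    #define PyFloat_AS_DOUBLE(op)   PyFloat_AsDouble(op)
    #define PyList_GET_ITEM(op, i)  PyList_GetItem((PyObject *)(op), (i))
    #define PyTuple_GET_ITEM(op, i) PyTuple_GetItem((PyObject *)(op), (i))
    #endif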


> The question becomes: how to promote the limited C API? Should it
> become the default, rather than an opt-in option?

With the above "full wrapper", it could become the default. That would give
authors three choices:

- the full C-API (being tied to a minor release)
- the limited C-API (limited but providing forward compatibility)
- the wrapper (being slower but providing forward+backward compatibility)

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TVSHK6HS6G3JQ4P5OO3FM2KFDXKP3OTM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-24 Thread Stefan Behnel
Victor Stinner wrote on 24.06.20 at 02:14:
> On Tue, 23 Jun 2020 at 16:56, Stefan Behnel wrote:
>>> Members of ``PyObject`` and ``PyTupleObject`` structures have not
>>> changed since the "Initial revision" commit (1990)
>>
>> While I see an advantage in hiding the details of PyObject (specifically
>> memory management internals), I would argue that there simply isn't much to
>> improve in PyTupleObject, so these two don't fly at the same level for me.
> 
> There are different reasons to make PyTupleObject opaque:
> [Some reasons why *PyObject* should not be exposed]
>
> * Prevent C extensions to make assumptions on how a Python
> implementation stores a tuple. Currently, C extensions are designed to
> have best performances with CPython, but it makes them run slower on
> PyPy.
> 
> * It becomes possible to experiment with a more efficient PyTupleObject
> layout, in terms of memory footprint or runtime performance, depending
> on the use case. For example, storing directly numbers as numbers
> rather than PyObject. Or maybe use a different layout to make
> PyList_AsTuple() an O(1) operation. I had a similar idea about
> converting a bytearray into a bytes without having to copy memory. It
> also requires to modify PyBytesObject to experiment such idea. An
> array of PyObject* is the most efficient storage for all use cases.

Note, I understand the difference between ABI and API. Keeping
PyTuple_GET_ITEM() a macro or inline function can break the ABI at some
point once PyTupleObject changes in an incompatible way in Py3.14, and it
may do different things in PyPy entirely at some point. That's fine. We
have a policy of allowing ABI breakage between CPython minor releases.

But this does not mean that PyTupleObject needs to become an opaque type
that requires a function call into CPython for PyTuple_GET_ITEM(). It *may*
become that at some point, when there is a reason to change it into a
function call. In the current implementation, there is no such reason. In a
future implementation, there may or may not be a reason. We do not know
that. As of now, we're just needlessly slowing down existing code by
preventing the C compiler from seeing that PyTuple_GET_ITEM() literally just
does a single pointer dereference.

This applies ten-fold to types like PyLong and PyFloat, where getting
straight at the native C value is also just a pointer dereference.

Basically, what I'm asking is to keep things as efficient as they are *in
CPython* as long as there is no reason to change them *in CPython*.
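
For reference, this is roughly what the macro expands to in current CPython
headers: a cast plus an array access that the C compiler can optimise freely.

    /* Simplified from CPython's tupleobject.h. */
    #define PyTuple_GET_ITEM(op, i) (((PyTupleObject *)(op))->ob_item[i])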


>> If we remove CPython specific features from the (de-facto) "official public
>> Python C-API", then I think there should be a "public CPython 3.X C-API"
>> that actively exposes the data structures natively, not just an "internal"
>> one. That way, extension authors can take the usual decision between
>> performance, maintenance effort and platform independence.
> 
> I would like to promote "portable" C code, rather than promote writing
> CPython specific code.
> 
> I mean that the "default" should be the portable API, and writing
> CPython specific code would be a deliberate opt-in choice.

That's what I mean by "public CPython 3.X C-API". Don't discourage its use,
don't hide away details. Just make it clear what is CPython specific and
what isn't, but without judging. It's a good thing for extensions to be
fast on CPython.


>> I haven't come across a use
>> case yet where I had to change a ref-count by more than 1, but allowing
>> users to arbitrarily do that may require way more infrastructure under the
>> hood than allowing them to create or remove a single reference to an
>> object. I think explicit is really better than implicit here.
> 
> Py_SET_REFCNT() is not Py_INCREF(). It's used for special functions
> like free lists, resurrect an object, save/restore reference counter
> (during resurrection), etc.

Exactly, so it is Py_INCREF() or Py_DECREF(), just without side-effects.
I'm arguing that the use case is also practically the same: increase or
decrease the refcount of an object, but without triggering the deallocation
machinery. Now read my paragraph above again. :)
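
As a sketch of the two-macro idea, with invented names (these are not
existing CPython macros): each adjusts the refcount by exactly one, like
Py_INCREF()/Py_DECREF(), but without ever triggering the deallocation
machinery.

    /* Hypothetical names, for illustration only. */
    #define Py_INCREF_NO_DEALLOC(op) (((PyObject *)(op))->ob_refcnt++)
    #define Py_DECREF_NO_DEALLOC(op) (((PyObject *)(op))->ob_refcnt--)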

Is it too late in the Py3.9 cycle to switch to two separate macros?

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K7YQLIWPTEHPH7KTEFGN6ALH2SN6U6YJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-23 Thread Stefan Behnel
Hi Victor,

thanks for your continued work on improving the C-API.

I'll comment on the PEP inline.

Victor Stinner wrote on 22.06.20 at 14:10:
> PEP available at: https://www.python.org/dev/peps/pep-0620/
> [...]
> Motivation
> ==
> 
> The C API blocks CPython evolutions
> ---
> 
> Adding or removing members of C structures is causing multiple backward
> compatibility issues.
> 
> Adding a new member breaks the stable ABI (PEP 384), especially for
> types declared statically (e.g. ``static PyTypeObject MyType =
> {...};``). In Python 3.4, the PEP 442 "Safe object finalization" added
> the ``tp_finalize`` member at the end of the ``PyTypeObject`` structure.
> For ABI backward compatibility, a new ``Py_TPFLAGS_HAVE_FINALIZE`` type
> flag was required to announce if the type structure contains the
> ``tp_finalize`` member. The flag was removed in Python 3.8 (`bpo-32388
<https://bugs.python.org/issue32388>`_).

Probably not the best example. I think this is pretty much normal API
evolution. Changing the deallocation protocol for objects is going to
impact any public API in one way or another. PyTypeObject is also not
exposed with its struct fields in the limited API, so your point regarding
"tp_print" is also not a strong one.


> Same CPython design since 1990: structures and reference counting
> -
> Members of ``PyObject`` and ``PyTupleObject`` structures have not
> changed since the "Initial revision" commit (1990)

While I see an advantage in hiding the details of PyObject (specifically
memory management internals), I would argue that there simply isn't much to
improve in PyTupleObject, so these two don't fly at the same level for me.


> Why is PyPy more efficient than CPython?
> 
> 
> The PyPy project is a Python implementation which is 4.2x faster than
> CPython on average. PyPy developers chose to not fork CPython, but start
> from scratch to have more freedom in terms of optimization choices.
> 
> PyPy does not use reference counting, but a tracing garbage collector
> which moves objects. Objects can be allocated on the stack (or even not
> at all), rather than always having to be allocated on the heap.
> 
> Objects layouts are designed with performance in mind. For example, a
> list strategy stores integers directly as integers, rather than objects.
> 
> Moreover, PyPy also has a JIT compiler which emits fast code thanks to
> the efficient PyPy design.

I would be careful with presenting examples of PyPy optimisations here.
Whichever ones you choose could easily give the impression that they are the
most important changes that made PyPy faster and should therefore be
followed in CPython. I doubt that there are any "top changes" that made the
biggest difference for PyPy. Even large breakthroughs on their side stand
on the shoulders of other important changes that may not have been visible
by themselves in the performance graphs.

CPython will not be rewritten from scratch, will continue to have its own
infrastructure, and will therefore have its own specific tweaks that it
will benefit from. Trying things out is fine, but there is no guarantee
that following a specific change in PyPy will make a similar difference in
CPython and its own ecosystem.


> PyPy bottleneck: the Python C API
> -
> While PyPy is way more efficient than CPython to run pure Python code,
> it is as efficient or slower than CPython to run C extensions.
> [...]
> Hide implementation details
> ---
> 
> Hiding implementation details from the C API has multiple advantages:
> 
> * It becomes possible to experiment with more advanced optimizations in
>   CPython than just micro-optimizations. For example, tagged pointers,
>   and replace the garbage collector with a tracing garbage collector
>   which can move objects.
> * Adding new features in CPython becomes easier.
> * PyPy should be able to avoid conversions to CPython objects in more
>   cases: keep efficient PyPy objects.
> * It becomes easier to implement the C API for a new Python
>   implementation.
> * More C extensions will be compatible with Python implementations other
>   than CPython.

I understand the goal of experimenting with new optimisations and larger
changes internally.

If, however, the goal is to make it easier for other implementations to
support (existing?) C extensions, then breaking all existing C extensions
in CPython first does not strike me as a good way to get there. :)

My feeling is that PyPy specifically is better served with the HPy API,
which is different enough to consider it a mostly separate API, or an
evolution of the limited API, if you want. Suggesting that extension
authors support two different APIs is a lot to ask, but forcing them to support the
existing CPython C-API (for legacy reasons) and the changed CPython C-API
(for future compatibility), and 

[Python-Dev] Re: (PEP 620) C API for efficient loop iterating on a sequence of PyObject** or other C types

2020-06-23 Thread Stefan Behnel
Victor Stinner wrote on 23.06.20 at 11:18:
> Maybe an object can
> generate a temporary PyObject** view which requires to allocate
> resources (like memory) and the release function would release these
> resources.

I agree that this is more explicit when it comes to resource management,
but there is nothing that beats direct native data structure access when it
comes to speed. If a "PyObject*[]" is not what the runtime uses internally
as data structure, then why hand it out as an interface to users who
require performance? There's PyIter_Next() already for those who don't.
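
For reference, the temporary-view idea quoted above could look roughly like
the following. All names here are invented for illustration; this is not a
proposed or existing CPython interface:

    /* A temporary view of a sequence as a C array of object pointers.
     * Acquiring it may allocate resources; releasing it frees them. */
    typedef struct {
        PyObject **items;       /* borrowed references */
        Py_ssize_t length;
        void *internal;         /* opaque bookkeeping for the release call */
    } PySequenceItemsView;

    int  PySequence_AcquireItemsView(PyObject *seq, PySequenceItemsView *view);
    void PySequence_ReleaseItemsView(PySequenceItemsView *view);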

If the intention is to switch to a more efficient internal data structure
inside of CPython (or expose in PyPy whatever that uses), then I would look
more at PEP-393 for a good interface here, or "array.array". It's perfectly
fine to have 20 different internal array types, as long as they are
explicitly and safely exposed to users.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YWN2NXAHAHN4FAZTVCSP4ZVLDE2CGVXQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Cython and incompatible C API changes

2020-06-22 Thread Stefan Behnel
Victor Stinner wrote on 17.06.20 at 13:25:
> On Wed, 17 Jun 2020 at 12:38, Petr Viktorin wrote:
>>> There is an ongoing discussion about always requiring to run Cython
>>> when installing a C extension which uses Cython.
>>
>> Do you have a link to that discussion?

Yeah, I was wondering, too. :)


> Hum, I forgot where the discussion happened. Maybe it wasn't a proper
> "discussion", but just a few tweets:
> https://twitter.com/tacaswell/status/1266472526806474752
> 
> Thomas A Caswell wrote: "also, if you use cython please make it a
> build-time dependency and please don't put the generated c code in the
> sdists. cython can only handle the changes in the CPyhon c-api if you
> let it!"

So much for random opinions on the Internet. ;-)

I still recommend generating the C code on the maintainer side and then
shipping it. Both approaches have their pros and cons, but shipping the
generated code is definitely what I recommend.

First of all, making Cython a build-time dependency and then pinning an
exact Cython version with it is entirely useless, because the C code that
Cython outputs is deterministic and you can just generate it on your side
and ship it. One dependency less, lots of user-side complexity avoided. So,
the only case we're talking about here is allowing different (usually
newer) Cython versions to build your code.

If you ship the C file, then you know what you get and you don't depend on
whatever Cython version users have installed on their side. You avoid the
maintenance burden of having to respond to bug reports for seemingly
unrelated C code lines or bugs in certain Cython versions. The C code that
Cython generates is very intentionally adaptive to where you compile it and
we work hard to do all environment-specific adaptations in the C code and
not in the code generator that creates it. It's the holy cow of "generate
once, compile everywhere". But obviously, it cannot take as-of-now unknown
future environmental changes into account, such as changes to the CPython
C-API.

If, instead, you use Cython at package build time, then you risk build
failures on the user side due to users having a buggy Cython version installed
(which may not have existed when you shipped the package, so you couldn't
exclude it), or your code failing to compile with the installed Cython due
to incompatible language changes. However, if those (somewhat exceptional)
cases don't happen, then you may end up with a setting in which your code
also adapts to newer environments by using a recent Cython version
automatically. That is definitely an advantage.

Basically, for maintained packages, I consider shipping the generated C
code the right way. Less hassle, easier debugging, better user experience.
For unmaintained packages, regenerating the C code at build time *can*
extend the lifetime of the package to newer environments for as long as it
does not run into failures due to Cython compiler changes (so you trade one
compatibility level for another one).

The question is whether the point at which a package becomes unmaintained
can ever be clear enough to make the switch. Regardless of which way you
choose, at some point in the future someone will have to do something,
either to your code or to your build setup, in order to prevent fatal bitrot.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2K3IKBD4K7INMVV3LK6SJY6EXDDNC2M2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Stefan Behnel
Paul Moore wrote on 13.04.20 at 14:25:
> On a related but different note, what is the recommended policy
> (assuming it's not to use the C API) for embedding Python, and for
> exposing the embedding app to Python as a C extension? My standard
> example of this is the Vim interface to Python - see
> https://github.com/vim/vim/blob/master/src/if_python3.c. I originally
> wrote this back in the Python 1.5 days, so it's *very* old, and quite
> likely not how I'd write it now, even using the C API. But what's the
> recommendation for code like that in the face of these changes, and
> the suggestion that using 3rd party tools is the normal way to write C
> extensions?

Embedding is not very well documented overall. I recently looked through
the docs to collect what a user would need to know in this case, and ended
up creating at least a little link collection, because I failed to find a
good place to refer users to. The things people need to know from the
CPython docs are scattered across different places, and lack a complete
real-world-like example that "most people" could start from. (I don't think
many users will pass strings into Python to execute code there.)

https://cython.readthedocs.io/en/latest/src/tutorial/embedding.html

From Cython's PoV, the main thing that future embedders need to understand
is that it's not really different from extending – you just have to start
the Python runtime before doing anything else. I think there should be some
help for getting that done, and then it's just executing your Python code
in some module. Cython then has its ways to go back and forth from there,
e.g. by writing cdef (C) functions as entry points for your application.
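
For the startup part, a bare-bones sketch from the embedding side could look
like this, assuming the Cython-generated module is compiled into the
application and registered via the inittab mechanism (the module and function
names are illustrative):

    #include <Python.h>

    /* Module init function generated by Cython for a module "myapp". */
    extern PyObject *PyInit_myapp(void);

    int main(void)
    {
        /* Make the compiled-in module importable, then start the runtime. */
        PyImport_AppendInittab("myapp", PyInit_myapp);
        Py_Initialize();

        PyObject *module = PyImport_ImportModule("myapp");
        if (module == NULL)
            PyErr_Print();
        Py_XDECREF(module);

        Py_Finalize();
        return 0;
    }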

Cython currently doesn't really have "direct" support for embedding. You
can let it generate a C main function for you to start your program, but
that's not what you want in the case of vim. There's a "cython_freeze"
script that generates an inittab list in addition, but it's a bit
simplistic and not integrated. We have a beginners ticket for integrating
it better:

https://github.com/cython/cython/issues/2849

What I would like to see eventually is to let users pass a list of modules
into Cython's frontend (maybe cythonize(), maybe not), and then it would
just generate a single distutils Extension from them that links everything
together and registers all modules on import, optionally with a generated
exported C function that starts up the whole thing. That seems simple
enough to do and use, and you end up with a shared library that your
application can load. PRs welcome. :)

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y6VRSVWYSV63AFQNAQEIJZBDZZG7QOTM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Stefan Behnel
André Malo wrote on 14.04.20 at 13:39:
> I think it does not serve well as a policy for CPython. Since we're talking
> hypotheticals right now, if Cython vanishes tomorrow, we're kind of left
> empty-handed. Such kind of a runtime, if considered part of the compatibility
> "promise", should be provided by the core itself, no?

There was some discussion a while ago about integrating a stripped-down
variant of Cython into CPython's stdlib. I argued against that because
the selling point of Cython is the full package, and stripping that down
wouldn't lead to something equally helpful for users.

I think it's good to have separate projects (and, in fact, it's more than
one) deal with this need.

In the end, it's an external tool, like your editor, your C compiler, your
debugger and whatever else you need for developing Python extensions. It
spits out C code and lets you do with it what you want. There's no reason
it should be part of the CPython project, core or stdlib. It's even written
in Python. If it doesn't work for you, you can fix it.


> A good way to test that promise (or other implications like performance)
> might also be to rewrite the standard library extensions in Cython and see
> where it leads.

Not sure I understand what you're saying here. stdlib extension modules are
currently written in C, with a bit of code generation. How is that different?


> I personally see myself using the python-provided runtime (types, methods, 
> GC), out of convenience (it's there, so why not use it). The vision of the 
> future outlined here can easily lead to backing off from that and rebuilding 
> all those things and really only keep touchpoints with python when it comes
> to interfacing with python itself. It's probably even desirable that way

That's actually not an uncommon thing to do. Some packages really only use
Cython or pybind11 to wrap their otherwise native C or C++ code. It's a
choice given specific organisational/project/developer constraints, and
choices are good.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QZSX36TPAKLXAA3O6KLUNCPKVJ2SKASN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP: Modify the C API to hide implementation details

2020-04-14 Thread Stefan Behnel
Steve Dower wrote on 14.04.20 at 00:27:
> On 13Apr2020 2308, André Malo wrote:
>> For one thing, if you open up APIs for Cython, they're open for everybody
>> (Cython being "just" another C extension).
>> More to the point: The ABIs have the same problem as they have now,
>> regardless of how responsive the Cython developers are. Once you compiled the extension,
>> you're using the ABI and are supposedly not required to recompile to stay
>> compatible.
>>
>> So, where I'm getting at is: Either you open up to everybody or nobody. In C
>> there's not really an in-between.
> 
> On a technical level, you are correct.
> 
> On a policy level, we don't make changes that would break users of the C
> API. Because we can't track everyone who's using it, we have to assume that
> everything is used and any change will cause breakage.
> 
> To make sure it's possible to keep developing CPython, we declare parts of
> the API off limits (typically by prepending them with an underscore). If
> you use these, and you break, we're sorry but we aren't going to fix it.
> 
> This line of discussion is basically saying that we would designate a
> broader section of the API that is off limits, most likely the parts that
> are only useful for increased performance (rather than increased
> functionality). We would then specifically include the Cython
> team/volunteers in discussions about how to manage changes to these parts
> of the API to avoid breaking them, and possibly do simultaneous releases to
> account for changes so that their users have more time to rebuild.
> 
> Effectively, when we change our APIs, we would break everyone except Cython
> because we've worked with them to avoid the breakage. Anyone else using it
> has to make their own effort to follow CPython development and detect any
> breakage themselves (just like today).
> 
> So probably the part you're missing is where we would give ourselves
> permission to break more APIs in a release, while simultaneously
> encouraging people to use Cython as an isolation layer from those breaks.

To add to that, the main difference for users here is a choice:

1) I want to use whatever is in the C-API and will fix my broken code
myself whenever there's a new CPython release.

2) I write my code against the stable ABI, accept the performance
limitations, and hope that it'll "never" break and my code just keeps
working (even through future compatibility layers, if necessary).

3) I use Cython and rerun it on my code at least once for each new CPython
release series, because I want to get the best performance for each target
version.

4) I use Cython and activate its (yet to be completed) stable ABI mode, so
that I don't have to target separate (C)Python releases but can release a
single wheel, at the cost of reduced performance.

And then there are a couple of grey areas, e.g. people using Cython plus a
bit of the C-API directly, for which they are then responsible themselves
again. But it's still way easier to adapt 3% of your code every couple of
CPython releases than all of your modules for each new release. That's just
the normal price that you pay for manual optimisations.

A nice feature of Cython here is that 3) and 4) are actually not mutually
exclusive, at least as it looks so far. You should eventually be able to
generate both from your same sources (we are trying hard to keep them in
the same C file), and even mix them on PyPI, e.g. distribute a generic
stable ABI wheel for all Pythons that support it, plus accelerated wheels
for CPython 3.9 and 3.10. You may even be able to release a pure Python
wheel as well, as we currently do for Cython itself to better support PyPy.

And to drive the point home, if CPython starts changing its C-API more
radically, or comes up with a new one, we can add the support for it to
Cython and then, in the best case, users will still only have to rerun it
on their code to target that new API. Compare that to case 1).


> (Cython is still just a placeholder name here, btw. There are 1-2 other
> projects that could be considered instead, though I think Cython is the
> only one that also provides a usability improvement as well as API
> stability.)

pybind11 and mypyc could probably make a similar offer to users. The
important point is just that we centralise the abstraction and adaptation work.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BHH3XKTCKZ73WQNHPVHYNBMJPYBELZFV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Accepting PEP 573 (Module State Access from C Extension Methods)

2020-03-23 Thread Stefan Behnel
As (first-time) BDFL delegate, I accept PEP 573 for Python 3.9,
"Module State Access from C Extension Methods"

https://www.python.org/dev/peps/pep-0573/

Petr, Nick, Eric and Marcel, thank you for your work and intensive
discussions on this PEP, and also to everyone else who got involved on
mailing lists, sprints and conferences.

It was a long process with several iterations, much thinking, rethinking
and cutting down along the way, Python 3.7 *and* 3.8 being missed, but 3.9
now finally being hit. Together with several other improvements to the
C-API in the upcoming release, this will help make extension modules less
"different" and easier to adapt for subinterpreters.

Best,
Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MRM5KDM3ITIE6ROK336UQGHKFANYWX6R/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Last call for comments on PEP 573 (Module State Access from C Extension Methods)

2020-03-10 Thread Stefan Behnel
Hi Petr!

Petr Viktorin wrote on 14.01.20 at 14:37:
> It also includes a more drastic change: it removes the MRO walker from the
> proposal.
> Reflecting on the feedback, it became clear to me that a MRO walker, as it
> was described, won't give correct results in all cases: specifically, is a
> slot is overridden by setting a special method from Python code, the walker
> won't be able to find module. Think something like:
>     c_add = Class.__add__  # where nb_add uses the MRO walker
>     Class.__add__ = lambda *args: "evil"
>     c_add(Class(), 0)  # Exception: Defining type has not been found.
> 
> This can be solved, but it will need a different design and more
> discussion. I'd rather defer it to the future.
> Meanwhile, extension authors can use their own MRO walker if they're OK
> with some surprising edge cases.

I read the last update. I can't say I'm happy about the removal since I was
seeing the MRO walker function as a way to hide internals so that extension
authors can start using it and CPython can adapt the internals later. But I
do see that there are issues with it, and I accept your choice to keep the
PEP even more minimal than it already was.

Are there any more points to discuss? If not, I would soon like to accept
the PEP, so that we can focus more on the implementation and usage.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OGMUNR4ZPMPXTWJMFCWPZ5ITJJ2G7O3F/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Last call for comments on PEP 573 (Module State Access from C Extension Methods)

2019-11-25 Thread Stefan Behnel
Hi all,

I think that PEP 573 is ready to be accepted, to greatly improve the state
of extension modules in CPython 3.9.

https://www.python.org/dev/peps/pep-0573/

It has come a long way since the original proposal and went through several
iterations and discussions by various interested people, effectively
reducing its scope quite a bit. So this is the last call for comments on
the latest version of the PEP, before I will pronounce on it. Please keep
the discussion in this thread.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IORACHSY6Z5CYTMXFKZWFT4P4ZS2LQ4A/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Pass the Python thread state to internal C functions

2019-11-15 Thread Stefan Behnel
Victor Stinner wrote on 12.11.19 at 23:03:
> Are you ok to modify internal C functions to pass explicitly tstate?

FWIW, I started doing the same internally in Cython a while back because,
like others, I considered it wasteful to look the thread state up all over the
place, often multiple times inside of one function (usually related to
try-finally and exception handling). I think it similarly makes sense
inside of CPython. I would also find it reasonable to make it part of a new
C-API.
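
The pattern itself is simple. Here is a sketch with illustrative names (not
actual CPython internals): look the thread state up once at the API boundary
and pass it down explicitly.

    #include <Python.h>

    /* Internal helper: receives the thread state as an argument instead
     * of re-fetching it. */
    static PyObject *
    helper(PyThreadState *tstate, PyObject *arg)
    {
        (void)tstate;  /* a real helper would use it, e.g. for exception state */
        Py_INCREF(arg);
        return arg;
    }

    /* API boundary: look the thread state up exactly once, then pass it on. */
    static PyObject *
    entry_point(PyObject *self, PyObject *arg)
    {
        PyThreadState *tstate = PyThreadState_Get();
        (void)self;
        return helper(tstate, arg);
    }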

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OZMEP27S6Q4OQ4CMCFPSRPM4FGUI2ZHQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The Python 2 death march

2019-09-10 Thread Stefan Behnel
Ned Batchelder wrote on 10.09.19 at 16:54:
> this seems confusing to me
> What does the "official EOL date" mean if there's a release in April?

Also, what day in April? For example, planning the release for the 1st
could possibly add further to the confusion.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ETQO7XXHRSAB43SXKO6FAMHI45OKV6HB/


[Python-Dev] Re: Comparing dict.values()

2019-07-29 Thread Stefan Behnel
Kristian Klette wrote on 23.07.19 at 22:59:
> During the sprints after EuroPython, I made an attempt at adding support for
> comparing the results from `.values()` of two dicts.
> 
> Currently the following works as expected:
> 
> ```
> d = {'a': 1234}
> 
> d.keys() == d.keys()
> d.items() == d.items()
> ```
> 
> but `d.values() == d.values()` does not return the expected
> results. It always returns `False`. The symmetry is a bit off.
> 
> In the bug trackers[0] and the Github PR[1], I was asked
> to raise the issue on the python-dev mailing list to find
> a consensus on what comparing `.values()` should do.
> 
> I'd argue that Python should compare the values as expected here,
> or if we don't want to encourage that behaviour, maybe we should
> consider raising an exception. 
> Returning just `False` seems a bit misleading.
> 
> What are your thoughts on the issue?

FWIW, after reading most of this thread, I do not like the idea of raising
an exception for an innocent comparison. Just think of a list of arbitrary
objects, including a dict values view for some reason, and you're looking
for the right object in the list. Maybe in some kind of generic tool,
decorator, iter-helper, or whatever, something that has to deal with
arbitrary objects provided by random users, which uses "in" instead of a
loop with "is" comparisons.

I also kind-of like the idea of having

d.values() == d.values()

return True, and let the comparison return False for everything
else. This seems to be the only reasonable behaviour that might(!) have a
use case, maybe in the same line as the argument above. I can't really see
a reason for implementing anything more than that.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CGRTXKDHS7GBBZY5GQ6ZM2FAVHPHJBAQ/


[Python-Dev] Re: Expected stability of PyCode_New() and types.CodeType() signatures

2019-06-11 Thread Stefan Behnel
Victor Stinner wrote on 12.06.19 at 00:09:
> So yeah, the PyCode_New() change is very annoying in practical, since
> every single project using Cython requires a new release in practice.

I think Cython's deal with regard to this is:

"""
If you use Cython, we will try hard to cover up the adaptations for
upcoming (and existing) CPython versions for you. But you'll likely have to
rerun Cython on your project before a new CPython major release comes out.
"""

That's obviously not ideal for projects that end up being unmaintained. But
then again, you can't freeze time forever, and /any/ change to a dependency
can end up being fatal to them.

I personally think that rerunning Cython when a new CPython release comes
out is an acceptable price to pay for a project. In the worst case, this
can even be done by others, as you suggested as a common Fedora practice
elsewhere, Victor.

(To be fair, I'll have to add that new Cython versions are also part of
those dependencies that might end up introducing incompatible changes and
forcing your code to be adapted. The upcoming Cython 3.0 release will be
such a case. However, given our compatibility record so far, I still
consider Cython a very fair deal.)

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4J3ZPCLJBHLPJRT32X2JIBAQ3C7OAL5H/


[Python-Dev] Re: Expected stability of PyCode_New() and types.CodeType() signatures

2019-06-11 Thread Stefan Behnel
Neil Schemenauer wrote on 08.06.19 at 22:46:
> It would be great if we had a system that did CI testing with the
> top PyPI modules.  E.g. pull the latest versions of the top 100 PyPI
> modules and test them with the latest CPython branch.

FWIW, travis-ci provides the latest CPython master builds (and some latest
dev branches). We use them in Cython for our CI tests.

One of the problems is that their images include some widely used libraries
like NumPy, some of which in turn depend on Cython these days. It has
already happened once that they failed to provide updated images because
they lacked a Cython version that worked with those libraries, and we didn't
notice that a change in Cython was needed because the CI builds kept
using an outdated CPython master version. :) Ah, circular
dependencies… I think they fixed something about that, though. It wasn't a
problem this time, at least.

We also have the "Cython testbed", which we (irregularly) use before
releases to check that we didn't break more than was broken before the release.

https://travis-ci.org/cython-testbed

It's pretty much what was asked for here, just for Cython, and it turns out
to be a considerable amount of work to keep this from breaking arbitrarily
for the included projects, even without changing something in Cython along
the way.

Thus, personally, I would prefer a decentralised CI approach, where
interested/important projects test themselves against CPython master (which
many of them do already) and report back when they notice an
undermotivated breakage. Some projects do that with Cython (and CPython)
already, and that works quite well so far and seems the least work for
everyone.

Stefan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EOGBRUUGY7PGUUONITJYFNIHURQ5E4OK/


Re: [Python-Dev] Expected stability of PyCode_New() and types.CodeType() signatures

2019-06-01 Thread Stefan Behnel
Serhiy Storchaka wrote on 01.06.19 at 09:02:
> I have a related proposition. Yesterday I reported two bugs (and Pablo
> quickly fixed them) related to handling positional-only arguments. These
> bugs occurred due to subtly changing the meaning of co_argcount. When
> we make some existing parameters positional-only, we do not add new
> arguments, but mark existing parameters. But co_argcount now means only the
> number of positional-or-keyword parameters. Most code which used
> co_argcount now needs to be changed to use co_posonlyargcount+co_argcount.
> 
> I propose to make co_argcount meaning the number of positional parameters
> (i.e. positional-only + positional-or-keyword). This would remove the need
> of changing the code that uses co_argcount.

Sounds reasonable to me. The main distinction points are positional
arguments vs. keyword arguments vs. local variables. Whether the positional
ones are positional-or-keyword or positional-only is irrelevant in many cases.


> PyCode_New() can be kept unchanged, but we can add new PyCode_New2() or
> PyCode_NewEx() with different signature.

It's not a commonly used function, and it's easy for C code to adapt. I
don't think it's worth adding a new function to the C-API here, compared to
just changing the signature. Very few users would benefit, at the cost of
added complexity.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 580/590 discussion

2019-05-10 Thread Stefan Behnel
Petr Viktorin wrote on 10.05.19 at 00:07:
> On 5/9/19 5:33 PM, Jeroen Demeyer wrote:
>> Maybe you misunderstood my proposal. I want to allow both for extra
>> flexibility:
>>
>> - METH_FASTCALL (possibly combined with METH_KEYWORDS) continues to work
>> as before. If you don't want to care about the implementation details of
>> vectorcall, this is the right thing to use.
>>
>> - METH_VECTORCALL (using exactly the vectorcallfunc signature) is a new
>> calling convention for applications that want the lowest possible
>> overhead at the cost of being slightly harder to use.
> 
> Then we can, in the spirit of minimalism, not add METH_VECTORCALL at all.
> [...]
> METH_FASTCALL is currently not documented, and it should be renamed before
> it's documented. Names with "fast" or "new" generally don't age well.

I personally don't see an advantage in having both, apart from helping code
that wants to be fast also on Py3.7, for example. It unnecessarily
complicates the CPython implementation and C-API.
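
For context, the difference is mainly in the C signatures. The METH_FASTCALL
form below is what CPython uses internally today, and the vectorcall form is
the one from PEP 590 (the function name is illustrative):

    /* METH_FASTCALL (without keywords): arguments as a C array plus count. */
    static PyObject *
    my_method(PyObject *self, PyObject *const *args, Py_ssize_t nargs);

    /* PEP 590 vectorcall: callable, args, flagged argument count, kw names. */
    typedef PyObject *(*vectorcallfunc)(PyObject *callable,
                                        PyObject *const *args,
                                        size_t nargsf,
                                        PyObject *kwnames);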

I'd be ok with removing FASTCALL in favour of VECTORCALL. That's more code
to generate for Cython in order to adapt to Py<3.6, Py3.6, Py3.7 and then
Py>=3.[89], but well, seeing the heap of code that we *already* generate,
it's not going to hurt our users much.

It would, however, be (selfishly) helpful if FASTCALL could still go
through a deprecation period, because we'd like to keep the current Cython
0.29.x release series compatible with Python 3.8, and I'd like to avoid
adding support for VECTORCALL and compiling out FASTCALL in a point
release. Removing it in Py3.9 seems ok to me.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-27 Thread Stefan Behnel
Matthias Klose wrote on 25.04.19 at 13:48:
> Are there use cases where you only want to load *some*
> debug extensions, even if more are installed?

Not sure if there are _important_ use cases (that could justify certain
design decisions), but I can certainly imagine using a non-debug (and
therefore faster) Pandas or NumPy for preparing some data that I need for
debugging my own code. More generally, whenever I can avoid using a debug
version of a *dependency* that I don't need to include in my debug
analysis, it's probably a good idea to not use the debug version.

Even given venvs and virtualisation techniques, it would probably be nice
if users could install debug+nondebug versions of libraries once and then
import the right one as needed, rather than having to set up a new
environment (while they're on a train in the middle of nowhere without fast
access to PyPI).

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Any core dev event plans for EP19?

2019-04-26 Thread Stefan Behnel
Berker Peksağ wrote on 26.04.19 at 01:15:
> Note that this year's core dev sprint will be held in London. See
> https://discuss.python.org/t/2019-core-dev-sprint-location-date/489
> for the previous discussion. There are only two months between both
> events, so perhaps we can leave things like discussions on active PEPs
> to the core dev sprint?
> (And welcome to the team!)

Ah, nice! Thanks for telling me; I wasn't aware of it. London is just a day
by train from where I live, so I'm totally in for that.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Any core dev event plans for EP19?

2019-04-25 Thread Stefan Behnel
Hi core devs,

there are several core dev events happening at the US PyCon this year, so I
was wondering if we could organise something similar at EuroPython. Does
anyone have any plans or ideas already? And, how many of us are planning to
attend EP19 in Basel this year? Unless there's something already going on
that I missed, I can (try to) set up a poll on dpo to count the interest
and collect ideas.

Sprints would probably be a straightforward option, a mentoring session
could be another, a language summit or PEP discussion/mentoring round would
also be a possibility. More ideas welcome.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use C extensions compiled in release mode on a Python compiled in debug mode

2019-04-24 Thread Stefan Behnel
Jeroen Demeyer wrote on 24.04.19 at 09:24:
> On 2019-04-24 01:44, Victor Stinner wrote:
>> I would like to
>> be able to run C extensions compiled in release mode on a Python
>> compiled in debug mode
> 
> That seems like a very good idea. I would certainly use the debug mode
> while developing CPython or C extensions.

+1

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] No longer enable Py_TRACE_REFS by default in debug build

2019-04-12 Thread Stefan Behnel
Serhiy Storchaka wrote on 11.04.19 at 17:30:
> If reducing the Python memory footprint is an argument for disabling
> Py_TRACE_REFS, it is a weak argument because there is larger overhead in
> the debug build.

I think what Victor is arguing is rather that we have better ways to debug
memory problems these days, so we might be able to get rid of a relic that
no one is using (or should be using) anymore and that has its drawbacks
(such as a very different ABI and higher memory load).

I don't really have an opinion here, but I can at least say that I never
found a use case for Py_TRACE_REFS myself and therefore certainly wouldn't
miss it.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Replacement for array.array('u')?

2019-03-22 Thread Stefan Behnel
Antoine Pitrou wrote on 22.03.19 at 11:39:
> On Fri, 22 Mar 2019 20:31:33 +1300 Greg Ewing wrote:
>> A poster on comp.lang.python is asking about array.array('u').
>> He wants an efficient mutable collection of unicode characters
>> that can be initialised from a string.
> 
> TBH, I think anyone trying to use array.array should be directed to
> Numpy these days.  The only reason for array.array being here is that
> it predates Numpy.  Otherwise we'd never have added it.

Well, maybe it wouldn't get *added* these days anymore, with pip+PyPI
nicely in place. But being there already, it makes for a nice and efficient
"batteries included" list replacement for simple data that would otherwise
waste a lot of object memory.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

2019-03-21 Thread Stefan Behnel
Victor Stinner wrote on 21.03.19 at 01:22:
> Alternatives have been proposed like a recipe to sort node attributes
> before serialization, but honestly, it's way too complex.

Hm, really? Five lines of simple and obvious Python code that provide a
fast and completely Python-version-agnostic solution to the problem that a
few users have are "way too complex"? That sounds a bit extreme to me.


> I don't want
> to have to copy such recipe to every project. Add a new function,
> import it, use it where XML is written into a file, etc. Taken alone,
> maybe it's acceptable. But please remember that some companies are
> still porting their large Python 2 code base to Python 3. This new
> backward incompatible gets on top of the pile of other backward
> incompatible changes between 2.7 and 3.8.
> 
> I would prefer to be able to "just add" sort=True. Don't forget that
> tests like "if sys.version >= (3, 8):"  will be needed which makes the
> overall fix more complicated.

Yes, exactly! Users would have to add that option *conditionally* to their
code somewhere. Personally, I really dislike having to say "if Python
version is X do this, otherwise, do that". I prefer a solution that just
works. There are at least four approaches that generally work across Python
releases: ignoring the ordering, using C14N, creating attributes in order,
sorting attributes before serialisation. I'd prefer if users picked one of
those, preferably the right one for their use case, rather than starting to
put version-specific kludges into their code.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

2019-03-19 Thread Stefan Behnel
Ned Batchelder wrote on 19.03.19 at 12:53:
> I need to re-engineer my tests.

… or sort the attributes before serialisation, or use C14N always, or
change your code to create the attributes in sorted-by-name order. The new
behaviour allows for a couple of ways to deal with the issue of backwards
compatibility.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Is XML serialization output guaranteed to be bytewise identical forever?

2019-03-19 Thread Stefan Behnel
Nathaniel Smith wrote on 19.03.19 at 00:15:
> That seems potentially simpler to implement than canonical XML
> serialization

C14N is already implemented for ElementTree, just needs to be ported to
Py3.8 and merged.

https://bugs.python.org/issue13611

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Making PyInterpreterState an opaque type

2019-02-19 Thread Stefan Behnel
Steve Dower wrote on 19.02.19 at 21:40:
> On 19Feb2019 1212, Stefan Behnel wrote:
>> Then it's up to the users to decide
>> how much work they want to invest into keeping up with C-API changes vs.
>> potentially sub-optimal but stable C-API usage.
> [...]
> And it's not up to the users - it's up to the package developers.

I meant "users" as in "users of the C-API", i.e. package developers.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Making PyInterpreterState an opaque type

2019-02-19 Thread Stefan Behnel
Nick Coghlan wrote on 19.02.19 at 15:00:
> On Tue, 19 Feb 2019 at 20:41, Antoine Pitrou wrote:
>> On Mon, 18 Feb 2019 19:04:31 -0800 Steve Dower wrote:
>>> If you always rebuild your extension for every micro version (3.x.y) of
>>> CPython, then sure, go ahead and use this.
>>
>> Usually we would guarantee that API details don't change in bugfix
>> versions (i.e. the 3.x.y -> 3.x.(y + 1) transition).  Has that changed?
>> That may turn out a big problem for several third-party extensions...
> 
> This is the genuine technical difference between the three levels:
> 
> * Py_BUILD_CORE -> no ABI stability guarantees at all
> * standard -> stable within a maintenance branch
> * Py_LIMITED_API -> stable across feature releases

I'm happy with this split, and I think this is how it should be. There is
no reason (notwithstanding critical bugs) to break the C-API within a
maintenance (3.x) release series. Apart from the 3.5.[12] slip, CPython has
proven very reliable in these guarantees.

We can (or at least could) easily take care in Cython to enable
version-specific features and optimisations only from CPython alpha/beta
releases on, and not for features that only arrive in later point releases, so
that users can compile their code in, say, CPython 3.7.5 and it will work
correctly in 3.7.1.

We never cared about Py_BUILD_CORE (because that's obviously internal), and
it's also not very likely that we will have a Py_LIMITED_API backend
anywhere in the near future (although we would consider PRs for it that
implement the support as an optional C compile-time feature).

What I would ask, though, and I think that's also Jeroen's request, is to
be careful what you lock up behind Py_BUILD_CORE. Any new functionality
should be available to extension modules by default, unless there is a good
reason why it should remain internal. Usually, there is a reason why this
functionality was added, and I doubt that there are many cases where these
reasons are entirely internal to CPython.

One thing that is not mentioned above is underscore-private C-API
functions. I imagine that they are a bit annoying for CPython itself
because promoting them to public means renaming them, which is already a
breaking change. But they are a clear marker for potential future breakage,
which is good. Still, my experience so far suggests that they also fall
under the "keep stable in maintenance branch" rule, which is even better.

So, yeah, I'm happy with the status quo, and a bit worried about all the
moving around of declarations and that scent of a sword of Damocles hanging
over their potential confinement. IMHO, things should just be public and
potentially marked as "unstable" to advertise a risk of breakage in
future CPython X.Y feature releases. Then it's up to the users to decide
how much work they want to invest into keeping up with C-API changes vs.
potentially sub-optimal but stable C-API usage.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Add more SyntaxWarnings?

2019-01-30 Thread Stefan Behnel
MRAB schrieb am 29.01.19 um 19:55:
> On 2019-01-29 13:44, Nick Coghlan wrote:
>> FWIW, we have pretty decent evidence that error messages don't have to
>> provide a wonderful explanation on their own in order to be helpful:
>> they just need to be distinctive enough that a web search will
>> reliably get you to a page that gives you relevant information.
>>
>> Pre-seeded answers on Stack Overflow are excellent for handling the
>> second half of that approach (see [1] for a specific example).
>> [1]
>> https://stackoverflow.com/questions/25445439/what-does-syntaxerror-missing-parentheses-in-call-to-print-mean-in-python
> 
> I have a vague recollection that a certain computer system (Amiga?) had a
> 'why' command. If it reported an error, you could type "why" and it would
> give you more details.
> 
> I suspect that all that was happening was that when the error occurred it
> would store the additional details somewhere that the 'why' command would
> simply retrieve.

So … are you suggesting using the webbrowser module inside the REPL to
look up the exception message of the previously printed stack trace on
Stack Overflow when a user types "why()"?

I faintly recall someone implementing something in that direction. It's
probably in some package on PyPI.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] C API changes

2018-11-26 Thread Stefan Behnel
Armin Rigo schrieb am 26.11.18 um 06:37:
> On Sun, 25 Nov 2018 at 10:15, Stefan Behnel wrote:
>> Overall, this seems like something that PyPy could try out as an
>> experiment, by just taking a simple extension module and replacing all
>> increfs with newref assignments. And obviously implementing the whole thing
>> for the C-API
> 
> Just to be clear, I suggested making a new API, not just tweaking
> Py_INCREF() and hoping that all the rest works as it is.  I'm
> skeptical about that.

Oh, I'm not skeptical at all. I'm actually sure that it's not that easy. I
would guess that such an automatic transformation should work in something
like 70% of the cases. Another 25% should be trivial to fix manually, and
the remaining 5% … well. They can probably still be changed with some
thinking and refactoring. That also involves cases where pointer equality
is used to detect object identity. Having a macro for that might be a good
idea.
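
A minimal sketch of what such a macro could look like (the name is made
up; CPython would define it as a plain pointer comparison, while a
handle-based implementation could define it differently):

/* object identity check that does not hard-code pointer equality */
#define Py_IS_IDENTICAL(a, b)  ((a) == (b))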

Overall, relatively easy. And therefore not unlikely to happen. The lower
the bar, the more likely we will see adoption.

Also note that explicit Py_INCREF() calls are actually not that common. I
just checked and found only 465 calls in 124K lines of Cython generated C
code for Cython itself, and 725 calls in 348K C lines of lxml. Not exactly
a snap, but definitely not huge. All other objects originate from the C-API
in one way or another, which you control.


> To start with, a ``Py_NEWREF()`` like you describe *will* lead people
> just renaming all ``Py_INCREF()`` to ``Py_NEWREF()`` ignoring the
> return value, because that's the easiest change and it would work fine
> on CPython.

First of all, as long as Py_INCREF() is not going away, they probably won't
change anything. Therefore, before we discuss how laziness will hinder the
adoption, I would rather like to see an actual motivation for them to do
it. And since this change seems to have zero advantages in CPython, but
adds a tiny bit of complexity, I think it's now up to PyPy to show that
this added complexity has an advantage that is large enough to motivate
it. If you could come up with a prototype that demonstrates the advantage
(or at least uncovers the problems we'd face), we could actually discuss
about real solutions rather than uncertain ideas.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] C API changes

2018-11-25 Thread Stefan Behnel
Hi Armin,

Armin Rigo schrieb am 25.11.18 um 06:15:
> On Sat, 24 Nov 2018 at 22:17, Stefan Behnel wrote:
>> Couldn't this also be achieved via reference counting? Count only in C
>> space, and delete the "open object" when the refcount goes to 0?
> 
> The point is to remove the need to return the same handle to C code if
> the object is the same one.  This saves one of the largest costs of
> the C API emulation, which is looking up the object in a big
> dictionary to know if there is already a ``PyObject *`` that
> corresponds to it or not---for *all* objects that go from Python to C.

Ok, got it. And since the handle is a simple integer, there's also no
additional cost for memory allocation on the way out.


> Once we do that, then there is no need for a refcount any more.  Yes,
> you could add your custom refcount code in C, but in practice it is
> rarely done.  For example, with POSIX file descriptors, when you would
> need to "incref" a file descriptor, you instead use dup().  This gives
> you a different file descriptor which can be closed independently of
> the original one, but they both refer to the same file.

Ok, then an INCREF() would be replaced by such a dup() call that creates
and returns a new handle. In CPython, it would just INCREF and return the
PyObject*, which is as fast as the current Py_INCREF().

For PyPy, however, that means that increfs become more costly. One of the
outcomes of a recent experiment with tagged pointers for integers was that
they make increfs and decrefs more expensive, and (IIUC) that reduced the
overall performance quite visibly. In the case of pointers, it's literally
just adding a tiny condition that makes this so much slower. In the case of
handles, it would add a lookup and a reference copy in the handles array.
That's way more costly already than just the simple condition.

Now, it's unclear if this performance degradation is specific to CPython
(where PyObject* is native), or if it would also apply to PyPy. But I guess
the only way to find this out would be to try it.

IIUC, the only thing that is needed is to replace

Py_INCREF(obj);

with

obj = Py_NEWREF(obj);

which CPython would implement as

#define Py_NEWREF(obj)  (Py_INCREF(obj), obj)

Py_DECREF() would then just invalidate and clean up the handle under the hood.

There are probably some places in user code where this would end up leaking
a reference by accident because of unclean reference handling (it could
overwrite the old handle in the case of a temporary INCREF/DECREF cycle),
but it might still be enough for trying it out. We could definitely switch
to this pattern in Cython (in fact, we already use such a NEWREF macro in a
couple of places, since it's a common pattern).
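
To make the pattern concrete, a sketch in plain C (the "self->cached"
field is made up for illustration):

/* before: create a new owned reference by bumping the refcount */
Py_INCREF(value);
self->cached = value;

/* after: store whatever Py_NEWREF() returns; under a handle scheme,
 * this may be a different handle than 'value' itself */
self->cached = Py_NEWREF(value);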

Overall, this seems like something that PyPy could try out as an
experiment, by just taking a simple extension module and replacing all
increfs with newref assignments. And obviously implementing the whole thing
for the C-API, but IIUC, you might be able to tweak that into your cpyext
wrapping layer somehow, without manually rewriting all C-API functions?

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] C API changes

2018-11-24 Thread Stefan Behnel
Armin Rigo schrieb am 23.11.18 um 14:15:
> In PyPy we'd have a global table of
> "open objects", and a handle would be an index in that table; closing
> a handle means writing NULL into that table entry.  No emulated
> reference counting needed: we simply use the existing GC to keep alive
> objects that are referenced from one or more table entries.  The cost
> is limited to a single indirection.

Couldn't this also be achieved via reference counting? Count only in C
space, and delete the "open object" when the refcount goes to 0?

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] General concerns about C API changes

2018-11-18 Thread Stefan Behnel
Gregory P. Smith schrieb am 15.11.18 um 01:03:
> From my point of view: A static inline function is a much nicer modern code
> style than a C preprocessor macro.

It's also slower to compile, given that function inlining happens at a much
later point in the compiler pipeline than macro expansion. In fact, the C
compiler never even sees the macros, whereas whether to inline a function
or not is a dedicated decision during the optimisation phase based on
metrics collected in earlier stages. For something as ubiquitous as
Py_INCREF/Py_DECREF, it might even be visible in the compilation times.

Oh, BTW, I don't know if this was mentioned in the discussion before, but
transitive inlining can easily be impacted by the switch from a macro to an
inline function. Since inlining happens long before the final CPU code
generation, the C compiler needs to use heuristics for estimating the
eventual "code weight" of an inline function, and then sums up all weights
within a calling function to decide whether to also inline that calling
function into the transitive callers or not.

Now imagine that you have an inline function that executes several
Py_INCREF/Py_DECREF call cycles, and the C compiler happens to slightly
overestimate the weights of these two. Then it might end up deciding
against inlining the function now, whereas it previously might have decided
for it since it was able to see the exact source code expanded from the
macros. I think that's what Raymond meant by his concerns regarding
changing macros into inline functions. C compilers might be smart enough to
always inline CPython's new inline functions themselves, but the style
change can still have unexpected transitive impacts on code that uses them.
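
For reference, the two shapes side by side (a simplified sketch; the
names are made up, but the macro is close to what Py_INCREF expands to):

/* macro: expanded textually before any optimisation pass runs */
#define MY_INCREF(op)  (((PyObject *)(op))->ob_refcnt++)

/* static inline function: stays a function call in the compiler's view
 * until the inliner decides, based on its weight heuristics, whether
 * to expand it */
static inline void
my_incref(PyObject *op)
{
    op->ob_refcnt++;
}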

I agree with Raymond that as long as there is no clear gain in this code
churn, we should not underestimate the risk of degrading code on the user side.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Experiment an opt-in new C API for Python? (leave current API unchanged)

2018-11-18 Thread Stefan Behnel
Neil Schemenauer schrieb am 17.11.18 um 00:10:
> I think making PyObject an opaque pointer would help.

... well, as long as type checks are still as fast as with "ob_type", and
visible to the C compiler so that it can eliminate redundant ones, I
wouldn't mind. :)


> - Borrowed references are a problem.  However, because they are so
>   commonly used and because the source code changes needed to change
>   to a non-borrowed API is non-trivial, I don't think we should try
>   to change this.  Maybe we could just discourage their use?

FWIW, the code that Cython generates has a macro guard [1] that makes it
avoid borrowed references where possible, e.g. when it detects compilation
under PyPy. That's definitely doable already, right now.


> - It would be nice to make PyTypeObject an opaque pointer as well.
>   I think that's a lot more difficult than making PyObject opaque.
>   So, I don't think we should attempt it in the near future.  Maybe
>   we could make a half-way step and discourage accessing ob_type
>   directly.  We would provide functions (probably inline) to do what
>   you would otherwise do by using op->ob_type->.

I've sometimes been annoyed by the fact that protocol checks require two
pointer indirections in CPython (or even three in some cases), so that the
C compiler is essentially prevented from making any assumptions, and the
CPU branch prediction is also stretched a bit more than necessary. At
least, the slot check usually comes right before the call, so that the
lookups are not wasted. Inline functions are unlikely to improve that
situation, but at least they shouldn't make it worse, and they would be
more explicit.

Needless to say that Cython also has a macro guard in [1] that disables
direct slot access and makes it fall back to C-API calls, for users and
Python implementations where direct slot support is not wanted/available.


>   One reason you want to discourage access to ob_type is that
>   internally there is not necessarily one PyTypeObject structure for
>   each Python level type.  E.g. the VM might have specialized types
>   for certain sub-domains.  This is like the different flavours of
>   strings, depending on the set of characters stored in them.  Or,
>   you could have different list types.  One type of list if all
>   values are ints, for example.

An implementation like this could also be based on the buffer protocol.
It's already supported by the array.array type (which people probably also
just use when they have a need like this and don't want to resort to NumPy).
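
A C consumer would then go through the buffer protocol instead of relying
on the concrete type, roughly like this (a minimal sketch; "process_bytes"
is a made-up callback, error handling shortened):

Py_buffer view;
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0)
    return NULL;   /* or whatever error handling fits the caller */
/* view.buf points at view.len contiguous bytes, whatever the
 * concrete container type is */
process_bytes((const char *) view.buf, view.len);
PyBuffer_Release(&view);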


>   Basically, with CPython op->ob_type is super fast.  For other VMs,
>   it could be a lot slower.  By accessing ob_type you are saying
>   "give me all possible type information for this object pointer".
>   By using functions to get just what you need, you could be putting
>   less burden on the VM.  E.g. "is this object an instance of some
>   type" is faster to compute.

Agreed. I think that inline functions (well, or macros, because why not?)
that check for certain protocols explicitly could be helpful.
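
Something along these lines, for example (a sketch; the helper name is
made up), which keeps the question narrow enough that another VM could
answer it without materialising a full type object:

/* "does this object support item access by index?" */
static inline int
supports_sequence_getitem(PyObject *op)
{
    PySequenceMethods *sq = Py_TYPE(op)->tp_as_sequence;
    return sq != NULL && sq->sq_item != NULL;
}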


> - APIs that return pointers to the internals of objects are a
>   problem.  E.g. PySequence_Fast_ITEMS().  For CPython, this is
>   really fast because it is just exposing the internal details of
>   the layout that is already in the correct format.  For other VMs,
>   that API could be expensive to emulate.  E.g. you have a list to
>   store only ints.  If someone calls PySequence_Fast_ITEMS(), you
>   have to create real PyObjects for all of the list elements.

But that's intended by the caller, right? They want a flat serial
representation of the sequence, with potential conversion to a (list) array
if necessary. They might be a bit badly named, but that's exactly the
contract of the "PySequence_Fast_*()" line of functions.
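
The contract in code, for reference (these are the real C-API calls;
error handling shortened):

PyObject *fast = PySequence_Fast(obj, "expected a sequence");
if (fast == NULL)
    return NULL;
Py_ssize_t n = PySequence_Fast_GET_SIZE(fast);
PyObject **items = PySequence_Fast_ITEMS(fast);  /* flat array view */
for (Py_ssize_t i = 0; i < n; i++) {
    /* items[i] are references borrowed from 'fast' */
}
Py_DECREF(fast);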

In Cython, we completely avoid these functions, because they are way too
generic for optimisation purposes. Direct type checks and code
specialisation are much more effective.


> - Reducing the size of the API seems helpful.  E.g. we don't need
>   PyObject_CallObject() *and* PyObject_Call().  Also, do we really
>   need all the type specific APIs, PyList_GetItem() vs
>   PyObject_GetItem()?  In some cases maybe we can justify the bigger
>   API due to performance.  To add a new API, someone should have a
>   benchmark that shows a real speedup (not just that they imagine it
>   makes a difference).

So, in Cython, we use macros wherever possible, and often avoid generic
protocols in favour of type specialisations. We sometimes keep local copies
of C-API helper functions, because inlining them allows the C compiler to
strip down and streamline the implementation at compile time, rather than
jumping through generic code. (Also, it's sometimes required in order to
backport new CPython features to Py2.7+.)

PyPy's cpyext often just maps type specific C-API functions to the same
generic code, obviously, but in CPython, having a way to bypass protocols
and going 

Re: [Python-Dev] The future of the wchar_t cache

2018-10-20 Thread Stefan Behnel
Serhiy Storchaka schrieb am 20.10.2018 um 13:06:
> Currently the PyUnicode object contains two caches: for UTF-8
> representation and for wchar_t representation. They are needed not for
> optimization but for supporting C API which returns borrowed references for
> such representations.
> 
> The UTF-8 cache always was in unicode objects (but in Python 2 it was not a
> UTF-8 cache, but a 8-bit representation cache). Initially it was needed for
> compatibility with 8-bit str, for implementing the "s" and "z" format units
> in PyArg_Parse(). Now it is used also for PyUnicode_AsUTF8() and
> PyUnicode_AsUTF8AndSize().
> 
> The wchar_t cache was added with PEP 393 in 3.3 as a replacement for the
> former Py_UNICODE representation. Now Py_UNICODE is defined as an alias of
> wchar_t, and the C API which returned a pointer to Py_UNICODE content
> returns now a pointer to the cached wchar_t representation. It is the "u"
> and "Z" format units in PyArg_Parse(), PyUnicode_AsUnicode(),
> PyUnicode_AsUnicodeAndSize(), PyUnicode_GET_SIZE(),
> PyUnicode_GET_DATA_SIZE(), PyUnicode_AS_UNICODE(), PyUnicode_AS_DATA().
> 
> All this increase the size of the unicode object. It includes the constant
> overhead of additional pointer and size fields, and the overhead of the
> cached representation proportional to the string length. The following
> table contains number of bytes per character for different kinds, with and
> without filling specified caches.
> 
>          raw   +utf8     +wchar_t       +utf8+wchar_t
>                        Windows  Linux  Windows  Linux
> ASCII     1      1        3       5       3       5
> UCS1      1     2-3       3       5      4-5     6-7
> UCS2      2     3-5       2       6      3-5     7-9
> UCS4      4     5-8      6-8      4      7-12    5-8
> 
> There is also a new C API added in 3.3 for getting wchar_t representation
> without using the cache: PyUnicode_AsWideChar() and
> PyUnicode_AsWideCharString(). Currently it uses the cache, this has
> benefits and disadvantages.
> 
> Old Py_UNICODE based API is deprecated, and will be removed eventually.
> I want to ask about the future of the wchar_t cache. Is the benefit of
> caching the wchar_t representation larger than the disadvantage of spending
> more memory? The wchar_t representation is as natural for the Windows API
> as the UTF-8 representation is for the POSIX API. But in all other cases it
> is just a waste of memory. Are there reasons to keep the wchar_t cache
> after removing the deprecated API?

I'd be happy to get rid of it. But regarding the use under Windows, I
wonder if there's interest in keeping it as a special Windows-only feature,
e.g. to speed up the data exchange with the Win32 APIs. I guess it would
have to provide a visible (performance?) advantage to justify such special
casing over the code removal.
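
For comparison, the cache-free API mentioned above hands the caller a
copy to manage instead of a borrowed pointer into the cache (a sketch,
error handling shortened):

Py_ssize_t size;
wchar_t *w = PyUnicode_AsWideCharString(obj, &size);
if (w == NULL)
    return NULL;
/* use the buffer, e.g. pass it to a Win32 API call */
PyMem_Free(w);   /* caller owns the copy, no cache involved */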

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Stefan Behnel
Guido van Rossum schrieb am 21.09.2018 um 19:35:
> Though now I start worrying about interned strings. That's a concept that's
> a little closer to being a feature.

True. While there's the general '"ab"+"cd" is (not) "abcd"' caveat, I'm
sure quite a bit of code out there assumes that parsed identifiers in a
module, such as the names of functions and classes, are interned, since
this was often communicated. And in fact, explicitly interning the same
name might return a different string object with this change than what's in
the module/class dict.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-21 Thread Stefan Behnel
Larry Hastings schrieb am 14.09.2018 um 23:27:
> What the patch does: it takes all the Python modules that are loaded as
> part of interpreter startup and deserializes the marshalled .pyc file into
> precreated objects stored as static C data.

What about the small integers cache? The C serialisation generates several
PyLong objects that would normally reside in the cache. Is this handled
somewhere? I guess the cache could entirely be loaded from the data
segment. And the same would have to be done for interned strings. Basically
anything that CPython only wants to have one instance of.
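
The identity assumption at stake, spelled out in C (this holds in
current CPython because of the small integer cache):

PyObject *a = PyLong_FromLong(5);
PyObject *b = PyLong_FromLong(5);
assert(a == b);   /* same cached object for small ints in CPython */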

That would severely limit the application of this optimisation to external
modules, though. I don't see a way how they could load their data
structures from the data segment without duplicating all sorts of "singletons".

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-20 Thread Stefan Behnel
Carl Shapiro schrieb am 20.09.2018 um 20:21:
> On Wed, Sep 19, 2018 at 12:32 AM, Stefan Behnel wrote:
> 
>> Also, one thing that would be interesting to find out is whether constant
>> Python data structures can actually be pre-allocated in the data segment
>> (and I mean their object structs). Then things like tuples of strings
>> (argument lists and what not) could be loaded and the objects quickly
>> initialised (although, is that even necessary?), rather than having to heap
>> allocate and create them. Probably something that we should try out in
>> Cython.
> 
> I might not be fully understanding the scope of your question but this
> patch does allocate constant data structures in the data segment.  We could
> be more aggressive with that but we limit our scope to what is presented to
> the un-marshaling code.

Ah, thanks, yes, it works recursively, also for tuples and code objects.
Took me a while to figure out how to open the "frozemodules.c" file, but
looking at that makes it clear. Yes, that's what I meant.


> This may be relevant to Cython, as well.

Totally. This might actually be more relevant for Cython than for CPython
in the end, because it wouldn't be limited to the stdlib and its core modules.

It's a bit more difficult for us, because this probably won't work easily
across Python releases (2.[67] and 3.[45678] for now) and also definitely
not for PyPy, but that just means some multiplication of the generated
code, and we have the dynamic part of it already. Supporting that for
Unicode strings will be fun, I'm sure. :)

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?

2018-09-19 Thread Stefan Behnel
Carl Shapiro schrieb am 18.09.2018 um 22:44:
> How might people feel about using the linker to bundle a list of pre-loaded
> modules into a single-file executable?

One way to do that would be to compile Python modules with Cython and link
them in statically, instead of compiling them to .pyc files.

Advantage: you get native C .o files, fast and straight forward to link.

Disadvantage: native code is much more voluminous than byte code, so the
overall binary size would grow substantially.

Also, one thing that would be interesting to find out is whether constant
Python data structures can actually be pre-allocated in the data segment
(and I mean their object structs). Then things like tuples of strings
(argument lists and what not) could be loaded and the objects quickly
initialised (although, is that even necessary?), rather than having to heap
allocate and create them. Probably something that we should try out in Cython.
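
A hedged sketch of what such a pre-allocated object struct might look
like for a bytes constant (field layout as in current CPython; whether
the runtime tolerates such objects everywhere is exactly the open
question):

static struct {
    PyObject_VAR_HEAD
    Py_hash_t ob_shash;
    char ob_sval[4];
} canned_abc = {
    PyVarObject_HEAD_INIT(&PyBytes_Type, 3),
    -1,      /* hash not computed yet */
    "abc"
};
/* usable as (PyObject *) &canned_abc, never deallocated */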

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use of Cython

2018-09-04 Thread Stefan Behnel
Yury Selivanov schrieb am 04.09.2018 um 18:19:
> On Sat, Sep 1, 2018 at 6:12 PM Stefan Behnel wrote:
>> Yury Selivanov schrieb am 07.08.2018 um 19:34:
>>> The first goal is to compile mypy with it to make it faster, so I hope
>>> that the project will be completed.
>>
>> That's not "the first goal". It's the /only/ goal. The only intention of
>> mypyc is to be able to compile and optimise enough of Python to speed up
>> the kind or style of code that mypy uses.
>>
>>> Essentially, mypyc will be similar
>>> to Cython, but mypyc is a *subset of Python*, not a superset.
>>
>> Which is bad, right? It means that there will be many things that simply
>> don't work, and that you need to change your code in order to make it
>> compile at all. Cython is way beyond that point by now. Even RPython will
>> probably continue to be way better than mypyc for quite a while, maybe
>> forever, who knows.
> 
> To be clear I'm not involved with mypyc, but my understanding is that
> the entire Python syntax will be supported, except some dynamic
> features like patching `globals()`, `locals()`, or classes, or
> __class__.

No, that's not the goal, at least from what I understood from my
discussions with Jukka. The goal is to make it compile mypy, be it by
supporting Python features in mypyc or by avoiding Python features in mypy.
I'm sure they will take any shortcut they can in order to avoid having to
make mypyc too capable, because mypyc is not more than a means to an end.
For example, they may easily get away without supporting generators and
closures, which are quite difficult to implement in C. But finding a
non-trivial piece of Python code out there that uses neither of the two is
probably not easy.

I'm also sure they will avoid Python semantics wherever they can, because
implementing them in the same way as CPython and Cython would mean that
certain constructs cannot safely be statically reasoned about, and thus
cannot be optimised. Avoiding (full) Python semantics relieves you from
these restrictions, and if you control both sides, the compiler and the
code that it compiles, then it becomes much easier to apply arbitrary
optimisations at will.

IMHO, what they are implementing is much closer to ShedSkin than to Cython.


>>> Interfacing with C libraries can be easily achieved with cffi.
>>
>> Except that it will be fairly slow. cffi is not designed for static
>> analysis but for runtime operations.
> 
> Could you please clarify this point?  My current understanding is that
> you can build a static compiler with a knowledge about cffi so that it
> can compile calls like `ffi.new("something_t[]", 80)` to pure C.

I'm sure there is a relatively large subset of cffi's API that could be
compiled statically, as long as the declarations and their usage are kept
simple and fully visible to the compiler. What that subset is remains to be
seen once someone actually tries to do it.


> Yeah, statically compiling cffi-enabled code is probably the way to go
> for mypyc and Cython.

I doubt it, given the expected restrictions and verbosity. But debating
this is useless as long as no-one attempts to actually write a static
compiler for cffi(-like) code.


> Using Cython/C types usually means
> that you need to use pxd/pyx files which means that the code isn't
> Python anymore.

I'm aware that this is a very common misconception that is difficult to get
out of people's heads. You probably got this idea from wrapping a native
library, in which case the only choice you have in order to declare an
external C-API is really to use Cython's special syntax. However, this
would not apply to most use cases in the CPython project context, and it
also does not necessarily apply to most of the code in a Cython module even
if it uses external libraries.

Cython has four ways to provide type declarations: cdef statements in
Cython code, external .pxd files for Python or Cython files, special
decorators and declaration functions, and PEP-484/526 type annotations.

All four have their use cases (e.g. syntax support in different Python
versions, efficiency of expression, readability for people with different
backgrounds, etc.), and all but the first allow users to keep their module
code in Python syntax. As long as you do not call into external native
code, it's your choice which of these you prefer for your code base,
project context and developer background. You can even mix them at will, if
you feel like it.


> I know that Cython has a mode to use decorators in
> pure Python code to annotate types, but they are less intuitive than
> using typing annotations in 3.6+.

You can use PEP-484/526 type annotations to declare Cython types in Python
code that you intend to compile. It's entirely up to you, and it's an
entirely subjective measure whic

Re: [Python-Dev] Use of Cython

2018-09-01 Thread Stefan Behnel
Yury,

given that people are starting to quote enthusiastically the comments you
made below, let me set a couple of things straight.

Yury Selivanov schrieb am 07.08.2018 um 19:34:
> On Mon, Aug 6, 2018 at 11:49 AM Ronald Oussoren via Python-Dev wrote:
> 
>> I have no strong opinion on using Cython for tests or in the stdlib, other 
>> than that it is a fairly large dependency.  I do think that adding a 
>> “Cython-lite” tool the CPython distribution would be less ideal, creating 
>> and maintaining that tool would be a lot of work without clear benefits over 
>> just using Cython.
> 
> Speaking of which, Dropbox is working on a new compiler they call "mypyc".
> 
> mypyc will compile type-annotated Python code to an optimized C.

That's their plan. Saying that "it will" is a bit premature at this point.
The list of failed attempts at writing static Python compilers is rather
long, even if you only count those that compile the usual "easy subset" of
Python.

I wish them the best of luck and endurance, but they have a long way to go.


> The
> first goal is to compile mypy with it to make it faster, so I hope
> that the project will be completed.

That's not "the first goal". It's the /only/ goal. The only intention of
mypyc is to be able to compile and optimise enough of Python to speed up
the kind or style of code that mypy uses.


> Essentially, mypyc will be similar
> to Cython, but mypyc is a *subset of Python*, not a superset.

Which is bad, right? It means that there will be many things that simply
don't work, and that you need to change your code in order to make it
compile at all. Cython is way beyond that point by now. Even RPython will
probably continue to be way better than mypyc for quite a while, maybe
forever, who knows.


> Interfacing with C libraries can be easily achieved with cffi.

Except that it will be fairly slow. cffi is not designed for static
analysis but for runtime operations. You can obviously also use cffi from
Cython – but then, why would you, if you can get much faster code much more
easily without using cffi?

That being said, if someone wants to write a static cffi optimiser for
Cython, why not, I'd be happy to help with my advice. The cool thing is
that this can be improved gradually, because compiling the cffi code
probably already works out of the box. It's just not (much) faster than
when interpreted.


> Being a
> strict subset of Python means that mypyc code will execute just fine
> in PyPy.

So does normal (non-subset) Python code. You can run it in PyPy, have
CPython interpret it, or compile it with Cython if you want it to run
faster in CPython, all without having to limit yourself to a subset of
Python. Seriously, you make this sound like requiring users to rewrite
their code to make it compilable with mypyc was a good thing.


> They can even apply some optimizations to it eventually, as
> it has a strict and static type system.

In case "they" refers to PyPy here, then I remember the PyPy project
stating very clearly that they are not interested in PEP-484 typing because
it is completely irrelevant for their JIT. It's really best for them to
ignore it.

That's similar for Cython, simply because PEP-484 typing isn't designed for
optimisation at all, definitely not for C-level optimisation. Still, Cython
can make some use of PEP-484 typing, if you use it to define specific C
types. That allows normal execution in CPython, static analysis with
PEP-484 analyser tools (e.g. PyCharm or mypy), and efficient optimisation
by Cython. The best of all worlds. See the docs on how to do that, it's
been supported for about a year now (and has been around in a similar,
non-PEP-484 form for years before that PEP even existed).


> I'd be more willing to start using mypyc+cffi in CPython stdlib
> *eventually*, than Cython now.  Cython is a relatively complex and
> still poorly documented language.

You are free to improve the documentation or otherwise help us find and
discuss concrete problems with it. Calling Cython a "poorly documented
language" could easily feel offensive towards those who have put a lot of
work into the documentation, wiki, tutorials, trainings and what not that
help people use the language. Even Stack Overflow is getting better and
better at documenting Cython these days, even though responses over there
that describe work-arounds tend to get outdated fairly quickly.

Besides, don't forget that it's Python, so consider reading the Python
documentation first if something is unclear. And maybe some documentation
of C data types as well. (.5 wink)


> I'm speaking from experience after
> writing thousands of lines of Cython in uvloop & asyncpg.  In skillful
> hands Cython is amazing, but I'd be cautious to advertise and use it
> in CPython.

Why not? You didn't actually give any reasons for that.


> I'm also -1 on using Cython to test C API. While writing C tests is
> annoying (I wrote a fair share myself), their very purpose is to make
> third-party 

Re: [Python-Dev] Let's change to C API!

2018-08-23 Thread Stefan Behnel
Greg Ewing schrieb am 23.08.2018 um 03:34:
> Neil Schemenauer wrote:
>> Perhaps a "argument clinic on steroids" would be the proper
>> approach.  So, extensions would mostly be written in C.  However, we
>> would have a pre-processor that does some "magic" to make using the
>> Python API cleaner.
> 
> You seem to have started on the train of thought that
> led me to create Pyrex (the precursor to Cython).

Greg, thank you so much for doing that. It's a great design that we and
hordes of Cython users out there continue to benefit from.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Let's change to C API!

2018-08-23 Thread Stefan Behnel
Antoine Pitrou schrieb am 23.08.2018 um 09:04:
> On Thu, 23 Aug 2018 08:07:08 +0200
> Jeroen Demeyer wrote:
>>> - the maintenance problem (how do ensure we can change small things in
>>>the C API, especially semi-private ones, without having to submit PRs
>>>to Cython as well)  
>>
>> Why don't you want to submit PRs to Cython?
> 
> Because it forces a much longer cycle time when we want to introduce a
> change in the C API: first prototype the C API change, then notice it
> breaks Cython, then try to make a PR (which isn't trivial, given
> Cython's size), then wait for the PR to be merged and a new Cython to
> be released.

I think you can put that argument back into the attic. When CPython 3.6 and
3.7 came out, I swear I had already forgotten which new features they
provided, because we had implemented and released most of the major
features 6-12 months earlier in Cython (backported to Py2.6). And it has
happened more than once that we pushed out a point release within a few
days to fix a real need on user side.

What I would rather like to see instead is that both of our sides try to
jointly discuss ideas for C-API changes, so that we don't even run into the
problem that changes we made on one side surprisingly break the other.

Don't forget that the spark for this whole discussion was to make it easier
to change the C-API at all. Being able to change Cython in one place and
then adapt a whole bunch of real world extensions out there by simply
regenerating their C code with it is a really cool feature. Basically, it
passes the ability to do that back into your own hands.


>> If you're saying "I don't 
>> want to wait for the next stable release of Cython", you could use 
>> development versions of Cython for development versions of CPython.
> 
> But depending on the development version of a compiler isn't very
> enticing, IMHO.

In case that need arises, feel free to ask which git revision we recommend
for use in CPython. In the worst case, we can always create a stable branch
for you that makes sure we don't break your productivity while we're doing
our thing.


>>> - the debugging problem (Cython's generated C code is unreadable,
>>>unlike Argument Clinic's, which can make debugging annoying)  
>>
>> Luckily, you don't need to read the C code most of the time. And it's 
>> also a matter of experience: I can read Cython-generated C code just fine.
> 
> Let's be serious here.  Regardless of the experience, nobody enjoys
> reading / stepping through code like the following:

Ok, you posted generated C code, let's read it together.


>   __Pyx_TraceLine(206,0,__PYX_ERR(1, 206, __pyx_L1_error))

This shows that you have enabled the generation of line tracing code with
the directive "linetrace=True", and Cython is translating line 206 of one
of your source modules here.


>   __Pyx_XDECREF(__pyx_r);
>   __pyx_t_2 = __Pyx_GetModuleGlobalName(__pyx_n_s_datetime); if 
> (unlikely(!__pyx_t_2)) __PYX_ERR(1, 206, __pyx_L1_error)
>   __Pyx_GOTREF(__pyx_t_2);
>   __pyx_t_3 = __Pyx_PyObject_GetAttrStr(__pyx_t_2, __pyx_n_s_datetime); if 
> (unlikely(!__pyx_t_3)) __PYX_ERR(1, 206, __pyx_L1_error)
>   __Pyx_GOTREF(__pyx_t_3);
>   __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
>   __pyx_t_2 = __Pyx_PyObject_GetAttrStr(__pyx_t_3, 
> __pyx_n_s_utcfromtimestamp); if (unlikely(!__pyx_t_2)) __PYX_ERR(1, 206, 
> __pyx_L1_error)
>   __Pyx_GOTREF(__pyx_t_2);
>   __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;

This implements "datetime.datetime.utcfromtimestamp", probably used in a
"return" statement.


>   __pyx_t_3 = __Pyx_PyFloat_DivideObjC(__pyx_v_x, __pyx_float_1e3, 1e3, 0); 
> if (unlikely(!__pyx_t_3)) __PYX_ERR(1, 206, __pyx_L1_error)
>   __Pyx_GOTREF(__pyx_t_3);

This is "x/1e3", optimised for fast computation in the case that "x" turns
out to be a number, especially a float object.


>   __pyx_t_4 = NULL;
>   if (CYTHON_UNPACK_METHODS && likely(PyMethod_Check(__pyx_t_2))) {
> __pyx_t_4 = PyMethod_GET_SELF(__pyx_t_2);
> if (likely(__pyx_t_4)) {
>   PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_2);
>   __Pyx_INCREF(__pyx_t_4);
>   __Pyx_INCREF(function);
>   __Pyx_DECREF_SET(__pyx_t_2, function);
> }
>   }
>   if (!__pyx_t_4) {
> __pyx_t_1 = __Pyx_PyObject_CallOneArg(__pyx_t_2, __pyx_t_3); if 
> (unlikely(!__pyx_t_1)) __PYX_ERR(1, 206, __pyx_L1_error)
> __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
> __Pyx_GOTREF(__pyx_t_1);
>   } else {
> #if CYTHON_FAST_PYCALL
> if (PyFunction_Check(__pyx_t_2)) {
>   PyObject *__pyx_temp[2] = {__pyx_t_4, __pyx_t_3};
>   __pyx_t_1 = __Pyx_PyFunction_FastCall(__pyx_t_2, __pyx_temp+1-1, 1+1); 
> if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 206, __pyx_L1_error)
>   __Pyx_XDECREF(__pyx_t_4); __pyx_t_4 = 0;
>   __Pyx_GOTREF(__pyx_t_1);
>   __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
> } else
> #endif
> #if CYTHON_FAST_PYCCALL
> if (__Pyx_PyFastCFunction_Check(__pyx_t_2)) {
>   PyObject *__pyx_temp[2] = {__pyx_t_4, 

Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-11 Thread Stefan Behnel
Petr Viktorin schrieb am 10.08.2018 um 13:48:
> Would this be better than a flag + raising an error on init?

Ok, I've implemented this in Cython for now, to finally move the PEP-489
support forward. The somewhat annoying drawback is that module reloading
previously *seemed* to work, simply because it didn't actually do anything.
Now, people will get an exception in cases that previously worked silently.
An exception would probably have been better from the beginning, because it
clearly tells people that what they are trying is not supported. Now it's a
bit of a breaking change. I'll see how it goes.

Thanks for your feedback on this.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Let's change to C API!

2018-08-11 Thread Stefan Behnel
Antoine Pitrou schrieb am 11.08.2018 um 15:19:
> On Fri, 10 Aug 2018 19:15:11 +0200 Armin Rigo wrote:
>> Currently, the C API only allows Psyco-style JITting (much slower than
>> PyPy).  All three other points might not be possible at all without a
>> seriously modified C API.  Why?  I have no proof, but only
>> circumstantial evidence.  Each of (2), (3), (4) has been done in at
>> least one other implementation: PyPy, Jython and IronPython.  Each of
>> these implementation has also got its share of troubles with emulating
>> the CPython C API.  You can continue to think that the C API has got
>> nothing to do with it.  I tend to think the opposite.  The continued
>> absence of major performance improvements for either CPython itself or
>> for any alternative Python implementation that *does* support the C
>> API natively is probably proof enough---I think that enough time has
>> passed, by now, to make this argument.
> [...]
> That leaves us with CPython and PyPy, which are only two data points.
> And there are enough differences, AFAIK, between those two that picking
> up "supports the C API natively" as the primary factor leading to a
> performance difference sounds arbitrary.

IMHO, while it's not clear to what extent the C-API hinders performance
improvements or jittability of code in CPython, I think it's fair to assume
that it's easier to improve internals when they are internal and not part
of a public API. Whether it's worth the effort to design a new C-API, or at
least make major changes to it, I cannot say, lacking an actual comparable
implementation of such a design that specifically targets better performance.

As it stands, extensions can actually make good use of the fact that the
C-API treats them (mostly, see e.g. PEPs 575/580) as first class citizens
in the CPython ecosystem. So, the status quo is at least a tradeoff.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Stefan Behnel
Petr Viktorin schrieb am 10.08.2018 um 11:51:
> On 08/10/18 11:21, Stefan Behnel wrote:
>> coming back to PEP 489 [1], the multi-phase extension module
>> initialization. We originally designed it as an "all or nothing" feature,
>> but as it turns out, the "all" part is so difficult to achieve that most
>> potential users end up with "nothing". So, my question is: could we split
>> it up so that projects can get at least the main advantages: module spec
>> and unicode module naming.
>>
>> PEP 489 is a great protocol in the sense that it allows extension modules
>> to set themselves up in the same way that Python modules do: load, create
>> module, execute module code. Without it, creating the module and executing
>> its code are a single step that is outside of the control of CPython, which
>> prevents the module from knowing its metadata and CPython from knowing
>> up-front what the module will actually be.
>>
>> Now, the problem with PEP 489 is that it requires support for reloading and
>> subinterpreters at the same time [2]. For this, extension modules must
>> essentially be free of static global state, which comprises both the module
>> code itself and any external native libraries that it uses. That is
>> somewhere between difficult and impossible to achieve. PEP 573 [3] explains
>> some of the reasons, and lists solutions for some of the issues, but cannot
>> solve the general problem that some extension modules simply cannot get rid
>> of their global state, and are therefore inherently incompatible with
>> reloading and subinterpreters.
> 
> Are there any issues that aren't explained in PEP 573?
> I don't think Python modules should be *inherently* incompatible with
> subinterpreters. Static global state is perhaps unavoidable in some cases,
> but IMO it should be managed when it's exposed to Python.
> If there are issues not in the PEPs, I'd like to collect the concrete cases
> in some document.

There's always the case where an external native library simply isn't
re-entrant and/or requires configuration to be global. I know, there's
static linking and there are even ways to load an external shared library
multiple times, but that's just adding to the difficulties. Let's just
accept that some things are not easy enough to make for a good requirement.


>> I would like the requirement in [2] to be lifted in PEP 489, to make the
>> main features of the PEP generally available to all extension modules.
>>
>> The question is then how to opt out of the subinterpreter support. The PEP
>> explicitly does not allow backporting new init slot functions/features:
>>
>> "Unknown slot IDs will cause the import to fail with SystemError."
>>
>> But at least changing this in Py3.8 should be doable and would be really
>> nice.
> 
> I don't think we can just silently skip unknown slots -- that would mean
> modules wouldn't be getting features they asked for.
> Do you have some more sophisticated model for slots in mind, or is this
> something to be designed?

Sorry for not being clear here. I was asking for changing the assumptions
that PEP 489 makes about modules that claim to support the multi-step
initialisation part of the PEP. Adding a new (flag?) slot was just one idea
for opting out of multi-initialisation support.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Can we split PEP 489 (extension module init) ?

2018-08-10 Thread Stefan Behnel
Hi,

coming back to PEP 489 [1], the multi-phase extension module
initialization. We originally designed it as an "all or nothing" feature,
but as it turns out, the "all" part is so difficult to achieve that most
potential users end up with "nothing". So, my question is: could we split
it up so that projects can get at least the main advantages: module spec
and unicode module naming.

PEP 489 is a great protocol in the sense that it allows extension modules
to set themselves up in the same way that Python modules do: load, create
module, execute module code. Without it, creating the module and executing
its code are a single step that is outside of the control of CPython, which
prevents the module from knowing its metadata and CPython from knowing
up-front what the module will actually be.
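
For reference, the shape of a module that opts in to this protocol (a
minimal sketch; "mymod" and its exec function are made up):

static int
mymod_exec(PyObject *module)
{
    /* runs during module execution, like top-level Python code */
    return PyModule_AddIntConstant(module, "answer", 42);
}

static PyModuleDef_Slot mymod_slots[] = {
    {Py_mod_exec, (void *) mymod_exec},
    {0, NULL}
};

static struct PyModuleDef mymod_def = {
    PyModuleDef_HEAD_INIT, "mymod", NULL, 0, NULL,
    mymod_slots, NULL, NULL, NULL
};

PyMODINIT_FUNC
PyInit_mymod(void)
{
    /* return the def only; CPython creates and executes the module */
    return PyModuleDef_Init(&mymod_def);
}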

Now, the problem with PEP 489 is that it requires support for reloading and
subinterpreters at the same time [2]. For this, extension modules must
essentially be free of static global state, which comprises both the module
code itself and any external native libraries that it uses. That is
somewhere between difficult and impossible to achieve. PEP 573 [3] explains
some of the reasons, and lists solutions for some of the issues, but cannot
solve the general problem that some extension modules simply cannot get rid
of their global state, and are therefore inherently incompatible with
reloading and subinterpreters.

I would like the requirement in [2] to be lifted in PEP 489, to make the
main features of the PEP generally available to all extension modules.

The question is then how to opt out of the subinterpreter support. The PEP
explicitly does not allow backporting new init slot functions/features:

"Unknown slot IDs will cause the import to fail with SystemError."

But at least changing this in Py3.8 should be doable and would be really nice.

What do you think?

Stefan



[1] https://www.python.org/dev/peps/pep-0489/
[2]
https://www.python.org/dev/peps/pep-0489/#subinterpreters-and-interpreter-reloading
[3] https://www.python.org/dev/peps/pep-0573/

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use of Cython

2018-08-09 Thread Stefan Behnel
Hi,

this is getting a bit out of scope for this list. I propose to move further
questions about general Cython usage to the cython-users mailing list.

Matěj Cepl schrieb am 08.08.2018 um 12:44:
> On 2018-08-06, 15:13 GMT, Stefan Behnel wrote:
>> Not sure I understand this correctly, but I think we're on the 
>> same page here: writing test code in C is cumbersome, writing 
>> test code in a mix of C and Python across different files is 
>> aweful. And making it difficult to write or even just review 
>> test code simply means that people will either waste their 
>> precious contribution time on it, or try to get around it.
> 
> I was thinking about the same when porting M2Crypto to py3k 
> (M2Crypto is currently swig-based mix of C-code and Python). Is 
> it even possible to make a mix of Cython, swig-based C, and 
> Python?

As long as you take the decision at a per-module basis, sure. If you want
to mix them inside of a single module, then it's either Cython+C or Swig+C,
not all three. But as Antoine suggested, unless you really want an
identical mapper for a whole range of different languages, Swig is likely not
what you should use these days.


> In the end I rather stayed with plain C, because the 
> combination seems unimaginably awful.

Probably worth expanding your imagination. :)

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use of Cython

2018-08-06 Thread Stefan Behnel
Ronald Oussoren via Python-Dev schrieb am 06.08.2018 um 15:25:
>> On 5 Aug 2018, at 18:14, Nick Coghlan wrote:
>> On 5 August 2018 at 18:06, Ronald Oussoren wrote:
>>> I’m not sure if I understand this, ctypes and cffi are used to access C APIs
>>> without writing C code including the CPython API (see for example
>>> ).
>>>
>>> The code code below should be mostly equivalent to the Cython example posted
>>> earlier:
>>>
>>> import unittest
>>> import ctypes
>>> from ctypes import pythonapi
>>>
>>> class PyObject(ctypes.Structure):
>>>     _fields_ = (
>>>         ('ob_refcnt', ctypes.c_ssize_t),
>>>     )
>>>
>>> pythonapi.PyList_Append.argtypes = [ctypes.py_object, ctypes.py_object]
>>>
>>> def refcount(v):
>>>     return PyObject.from_address(id(v)).ob_refcnt
>>
>> The quoted code is what I was referring to in:
>> 
>> ctypes & cffi likely wouldn't help as much in the case, since they
>> don't eliminate the need to come up with custom code for parts 3 & 4,
>> they just let you write that logic in Python rather than C.
>> 
> 
> And earlier Nick wrote:
>> 1. The test case itself (what action to take, which assertions to make about 
>> it)
>> 2. The C code to make the API call you want to test
>> 3. The Python->C interface for the test case from 1 to pass test
>> values in to the code from 2
>> 4. The C->Python interface to get state of interest from 2 back to the
>> test case from 1
> 
> For all of Cython, ctypes and cffi you almost never have to write (2), and 
> hence (3) and (4), but can write that code in Python.

Which then means that you have a mix of Python and C in many cases. I guess
that's what you meant by your next sentence:

> This is at the cost of making it harder to know which bits of the CPython API 
> are used in step (2), which makes it harder to review a testcase. 

Not sure I understand this correctly, but I think we're on the same page
here: writing test code in C is cumbersome, writing test code in a mix of C
and Python across different files is awful. And making it difficult to
write or even just review test code simply means that people will either
waste their precious contribution time on it, or try to get around it.


> BTW. In other projects I use tests there almost all of the test code is in C, 
> the unittest runner only calls a C function and uses the result of that 
> function to deduce if the test passed or failed. This only works nicely for 
> fairly simple tests (such as the example test in this thread), not for more 
> complicated and interesting tests due to having to write more C code.

I totally agree with that. For less trivial tests, people will often want
to steer the test case at the C level, because some things are really
difficult to do from Python. Good luck making assertions about reference
counts when you're orchestrating the C-API through ctypes. And this is
where Cython shines – your code *always* ends up running in C, regardless
of how much of it is plain Python. But at any point, you can do pretty
arbitrary C things, all in the same function. And unittest can execute that
function directly for you, without having to write a Python wrapper or
separate test runner.

And for the really hard cases, you can resort to writing a literal C code
snippet in your Cython source file (as a string) and let Cython drop it
into the file it generates, e.g. to quickly define a macro, a little C
function, or an interface wrapper around a C macro that would otherwise be
difficult to test. That little feature removes the last reason for
resorting to a separate C file.

Stefan

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Use of Cython

2018-08-04 Thread Stefan Behnel
Antoine Pitrou schrieb am 04.08.2018 um 15:57:
> Le 04/08/2018 à 15:13, Nick Coghlan a écrit :
>>
>> It'd be *really* nice to at least be able to write some of the C API
>> tests directly in Cython rather than having to fiddle about with
>> splitting the test between the regrtest parts that actually define the
>> test case and the extension module parts that expose the interfaces
>> that we want to test.
> 
> Actually, I think testing the C API is precisely the kind of area where
> you don't want to involve a third-party, especially not a moving target
> (Cython is actively maintained and generated code will vary after each
> new Cython release).  Besides, Cython itself calls the C API, which
> means you might end up involuntarily testing the C API against itself.
> 
> If anything, testing the C API using ctypes or cffi would probably be
> more reasonable... assuming we get ctypes / cffi to compile everywhere,
> which currently isn't the case.

I agree that you would rather not want to let Cython (or another tool)
generate the specific code that tests a specific C-API call, but you could
still use Cython to get around writing the setup, validation and unittest
boilerplate code in C. Basically, a test could then look something like
this (probably works, although I didn't test it):

from cpython.object cimport PyObject
from cpython.list cimport PyList_Append

def test_PyList_Append_on_empty_list():
    # setup code
    l = []
    assert len(l) == 0
    value = "abc"
    pyobj_value = <PyObject*> value
    refcount_before = pyobj_value.ob_refcnt

    # conservative test call, translates to the expected C code,
    # although with exception propagation if it returns -1:
    errcode = PyList_Append(l, value)

    # validation
    assert errcode == 0
    assert len(l) == 1
    assert l[0] is value
    assert pyobj_value.ob_refcnt == refcount_before + 1


If you don't want the exception handling, you can define your own
declaration of PyList_Append() that does not have it. But personally, I'd
rather use try-except in my test code than manually taking care of cleaning
up (unexpected) exceptions.


What we do in Cython, BTW, is write doctests in compiled ".pyx" files. That
allows us to execute certain parts of a test in Python (the doctest code)
and other parts in Cython (the compiled functions/classes that have the
doctests), and thus to do a direct comparison between Python and Cython. An
example that you could find in a test ".pyx" file:

def test_times2(x):
    """
    doctest that gets executed by Python:

    >>> test_times2(3) == 3 * 2
    True
    """
    # Cython compiled code in a compiled function that gets tested:
    return x * 2


Given that CPython's current "_testcapimodule.c" is only a good 5000 lines
long (divide that by the number of public C-API functions and macros!), I'm
sure the above could help in improving the unit test coverage of the C-API
quite quickly.

Stefan



Re: [Python-Dev] Using Cython for the stdlib (was: Let's change to C API!)

2018-08-01 Thread Stefan Behnel
Brett Cannon wrote on 01.08.2018 at 18:17:
> On Tue, 31 Jul 2018 at 13:42 Stefan Behnel wrote:
>> Antoine Pitrou wrote on 31.07.2018 at 09:45:
>>> Also, a C extension can be built-in (linked statically into the
>>> interpreter), which I think would be hard to do with Cython.
>>
>> Someone recently contributed a feature of hiding the pyinit function for
>> the embedding case, so people do these things already. This could use the
>> normal inittab mechanism, for example. What I think you might be referring
>> to is that Cython modules require the CPython runtime to be initialised to
>> a certain extent, so you couldn't implement "sys" in Cython, for example.
> 
> I think the key thing is that on Windows all extension modules are built-in
> modules, so that use-case would need to be supported (I don't know Cython
> well enough to know whether this would be doable if we converted as much as
> possible to Cython itself).

As Steve noted, this is probably easy. What Cython produces is just the C
code file for an extension module. Whether you turn that into a shared
library or statically link it into something else (that knows how to
initialise an extension module) is up to you.

I would say, from the point where CPython is ready to initialise its own
extension modules, it can also initialise Cython generated modules. So,
just to give an example, if you want to compile difflib.py into an
accelerator module and link that into the core, that's probably fine, as
long as you first initialise everything that difflib needs in its module
init code (such as importlib to execute the module level imports) before
you initialise the compiled difflib module.

Stefan



Re: [Python-Dev] Let's change to C API!

2018-07-31 Thread Stefan Behnel
Jeroen Demeyer wrote on 31.07.2018 at 14:01:
> On 2018-07-31 12:56, Victor Stinner wrote:
>> I would be nice to be able to use something to "generate" C
>> extensions, maybe even from pure Python code.
> 
> Cython has a "pure Python mode" which does exactly that. There are several
> ways to include typing information, to ensure that a module remains
> Python-compatible but can be compiled by Cython in an optimized way.

FWIW, modules like difflib can easily be sped up by factors when compiling
and optimising them with Cython, without giving up the Python syntax
compatibility. I just gave a Cython talk at EuroPython last week where I
used difflib as one of my examples.
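
To give an idea of what that looks like (invented example, not actual
difflib code), the typed "pure Python mode" that Jeroen describes keeps the
module usable as normal Python code:

import cython

@cython.ccall
def ratio(matches: cython.Py_ssize_t, length: cython.Py_ssize_t) -> cython.double:
    # plain Python semantics, but compiles to a typed C function in Cython
    if length == 0:
        return 1.0
    return 2.0 * matches / length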

Stefan



Re: [Python-Dev] Let's change to C API!

2018-07-31 Thread Stefan Behnel
Antoine Pitrou wrote on 31.07.2018 at 09:45:
> On Tue, 31 Jul 2018 09:27:03 +0200
> Jeroen Demeyer  wrote:
>> On 2018-07-31 08:58, Antoine Pitrou wrote:
>>> I think Stefan is right that we
>>> should push people towards Cython and alternatives, rather than direct
>>> use of the C API (which people often fail to use correctly, in my
>>> experience).  
>>
>> I know this probably isn't the correct place to bring it up, but I'm 
>> sure that CPython itself could benefit from using Cython. For example, 
>> most of the C extensions in Modules/ could be written in Cython.
> 
> We don't depend on any third-party Python modules.  Adding a Cython
> dependency for CPython development would be a tough sell.

I don't really want to get into that discussion (it's more about processes
than arguments), but let me note that the CPython development already has a
couple of dependencies, such as github and its bots, or tools like argument
clinic (admittedly included), make and a C compiler (not included), and a
text editor. It's not like it's free of tools that help in writing and
maintaining the code. That's pretty much the level at which I also see
Cython. It's more complex than argument clinic, but it otherwise serves a
similar need.


> Also, a C extension can be built-in (linked statically into the
> interpreter), which I think would be hard to do with Cython.

Someone recently contributed a feature of hiding the pyinit function for
the embedding case, so people do these things already. This could use the
normal inittab mechanism, for example. What I think you might be referring
to is that Cython modules require the CPython runtime to be initialised to
a certain extent, so you couldn't implement "sys" in Cython, for example.
But Jeroen is right, Cython should be a viable option for (most of?) the
extension modules in the stdlib. Whether the CPython core devs would accept
it in their workflow or not is a totally different question.

Stefan



Re: [Python-Dev] [PEP 576/580] Reserve one type slot for Cython

2018-07-30 Thread Stefan Behnel
Jeroen Demeyer wrote on 30.07.2018 at 16:40:
> On 2018-07-30 15:35, INADA Naoki wrote:
>> As repeatedly said, PEP 580 is a very complicated protocol
>> when just implementing a callable object.
> 
> Can you be more concrete what you find complicated? Maybe I can improve the
> PEP to explain it more. Also, I'm open to suggestions to make it less
> complicated.
> 
>> It is optimized for implementing a custom method object, although
>> almost only Cython wants the custom method type.
> 
> For the record, Numba also seems interested in the PEP:
> https://groups.google.com/a/continuum.io/d/msg/numba-users/2G6k2R92MIM/P-cFKW7xAgAJ

To add to that record, I (briefly) talked to Ronan Lamy about this at
EuroPython and PyPy could also be interested in generalising the call
protocol, especially with the future goal of extending it into a C level
call protocol that their JIT could understand and build a cffi-like
interface on.

Stefan



Re: [Python-Dev] Benchmarks why we need PEP 576/579/580

2018-07-22 Thread Stefan Behnel
Jeroen Demeyer wrote on 22.07.2018 at 22:54:
> On 2018-07-22 22:32, Antoine Pitrou wrote:
>> - more importantly, issue26110 is entirely internal changes, while you
>>    are proposing to add a new protocol (which is like a new API)
> 
> Just to make sure I understand you: your point is that it's automatically
> more complicated because it's an API instead of an internal change?

I think it's more that it changes something substantial in a way that is
not just internal but visible to users. I think it's more than just a
tracker issue and worth going through the PEP process.

Stefan



Re: [Python-Dev] Benchmarks why we need PEP 576/579/580

2018-07-22 Thread Stefan Behnel
Guido van Rossum wrote on 22.07.2018 at 01:14:
> The cost would be if we were to end up maintaining all that code and it
> wouldn’t make much difference.

Well, this is exactly my point. Someone has to maintain the *existing* code
base and help newcomers to get into it and understand it. This is not easy.
The proposed implementation *already* makes a difference. And it does not
even degrade the performance while doing that, isn't that great?

To make this clear – right now, there is someone who stands up and
volunteers to invest the work to clean up the current implementation. He
has already designed, and even implemented, a protocol that applies to all
types of callables in the same way *and* that is extensible for current and
future needs and optimisations. I think this is way more than anyone could
ask for, and it would be very sad if this chance was wasted, and we would
have to remain with the current implementation.

Stefan



Re: [Python-Dev] Benchmarks why we need PEP 576/579/580

2018-07-21 Thread Stefan Behnel
Guido van Rossum wrote on 21.07.2018 at 22:46:
> Given the cost of a mistake here I recommend a higher standard.

May I ask what you think the "cost of a mistake" is here?

Jeroen has already implemented most of this, and is willing to provide
essentially a shrink-wrapped implementation. He has shown, using the
current application benchmark suite, that his implementation does not
degrade the application performance (that we know of). He has motivated in
PEP form, and shown in his implementation, that the changes avoid much of
the special casing that's currently littered in various spots of the
interpreter and replace it with a much clearer protocol, thus reducing the
overall maintenance cost. He has laid out a cleanup path to get rid of the
current quirks in the split between function/method types, thus making the
code easier to explain and lowering the entry barrier for newcomers to the
code base. And, he has motivated that this protocol enables a future
extension towards a specialised (faster) C level call protocol, which third
party extensions would benefit from.

Given all that, I'm having a hard time finding a "cost" in this. To me, it
reads like a plain net win for all sides.

Stefan



Re: [Python-Dev] Micro-benchmarks for PEP 580

2018-07-10 Thread Stefan Behnel
INADA Naoki wrote on 11.07.2018 at 02:01:
> On Wed, Jul 11, 2018 at 7:47 AM Victor Stinner wrote:
>> I proposed something simpler, but nobody tried to implement it.
>> Instead of calling the long and complex PyArg_Parse...() functions,
>> why not generate C code to parse arguments instead? The idea looks
>> like "inlining" PyArg_Parse...() in its caller, but technically it
>> means that Argument Clinic generates C code to parse arguments.
> 
> But I have worries about it.  If we do it for all functions, it makes the
> Python binary fatter and consumes more CPU cache.  Once the CPU cache
> starts thrashing, application performance quickly degrades.

Now, I'd like to see benchmark numbers for that before I believe it. Macro
benchmarks, not micro benchmarks! *wink*

Stefan



Re: [Python-Dev] Micro-benchmarks for PEP 580

2018-07-10 Thread Stefan Behnel
INADA Naoki wrote on 11.07.2018 at 02:12:
> On Tue, Jul 10, 2018 at 10:20 PM Antoine Pitrou wrote:
>> On Tue, 10 Jul 2018 21:59:28 +0900
>> INADA Naoki wrote:
>>>
>>> Then, the function is called from another C extension like this:
>>>
>>> PyObject_CallFunction(func, "n", 42);
>>>
>>> Currently, we create a temporary long object for passing the argument.
>>> If there was a protocol for exposing the format used by PyArg_Parse*, we
>>> could bypass the temporary Python object and call myfunc_impl directly.

Note that this is not fast at all. It actually has to parse the format
description at runtime. For really fast calls, this should be avoided, and
it can be avoided by using a str object for the signature description and
interning it. That relies on signature normalisation, which requires a
proper formal specification of C/C++ signature strings, which ... is pretty
much the can of worms that Antoine mentioned.


>> This is another can of worms to open.  If you're worried about the
>> added complexity of PEP 580, what you're proposing is one or two orders
>> of magnitude more complicated.
> 
> This is just an example of a possible optimization, to explain why I want
> an example application first.
> I know Cython is important for data scientists.  But I don't know how
> it is typically used.
> 
> If my idea has 50% gain and current PEP 580 has only 5% gain,
> why should we accept PEP 580?

Because PEP 580 is also meant as a preparation for a fast C-to-C call
interface in Python.

Unpacking C callables is quite an involved protocol, and we have been
pushing the idea around and away in the Cython project for some seven
years. It's about time to consider it more seriously now, and there are
plans to implement it on top of PEP 580 (which is also mentioned in the PEP).


> And PEP 576 seems a much simpler and more straightforward way to expose
> FASTCALL.

Except that it does not get us one step forward on the path towards what
you proposed. So, why would *you* prefer it over PEP 580?


>>> I think optimizations like this can boost the performance of
>>> applications that use Cython heavily.
>>
>> You can already define the C signature of a function in Cython and
>> intra-Cython calls will benefit from it where possible.  Cooperation
>> from core Python is not necessary for that.
> 
> Why not allow it for extensions written in C, without Cython?

It should be. They just need a simpler protocol, which is PEP 580.

Stefan



Re: [Python-Dev] On the METH_FASTCALL calling convention

2018-07-08 Thread Stefan Behnel
Jeroen Demeyer wrote on 08.07.2018 at 09:07:
> On 2018-07-07 10:55, Serhiy Storchaka wrote:
>> The first part of
>> handling arguments can be made outside of the C function, by the calling
>> API.
> 
> Sure, it could be done but I don't see the advantage. I don't think you
> will gain performance because you are just moving code from one place to
> another.

You probably can, by allowing the caller to decide how to map the keyword
arguments. Passing them as a flat array is the fastest way for the callee
to evaluate them, so that's great. For the caller, they might already be
available in that format or not, so the caller can save time if they are,
and only needs to invest time if they are not.


> And how do you plan to deal with *args and **kwds in your
> proposal? You'll need to make sure that this doesn't become slower.

That, on the other hand, is an actual concern. If we already have a tuple
and dict, unpacking them for the call and then repacking them on the other
side is a serious performance regression – for this specific use case.

The question is, how important is that use case compared to everything
else? And, since we have more than one supported signature anyway, can we
leave that case to a non-fastcall case? In the end, it's up to the callee
to decide which protocol to support and use, and if the intention is to
work with **kwargs, then maybe the callee should not use the fastcall
protocol in the first place.

Stefan



Re: [Python-Dev] PEP 575, 576, 579 and 580

2018-07-07 Thread Stefan Behnel
INADA Naoki wrote on 07.07.2018 at 17:16:
>> 2. The new API should be used internally so that 3rd party extensions
>> are not second-class citizens in terms of call performance.
> 
> These PEPs propose a new public protocol which can be implemented
> by 3rd party extensions, especially Cython.
> In this sense, it's not used only *internally*.

I think Mark meant that the API should *also* be used internally, in the
same way that external code uses it. Meaning, there shouldn't be a separate
internal API.

Stefan



Re: [Python-Dev] PEP 575, 576, 579 and 580

2018-07-07 Thread Stefan Behnel
Nick Coghlan wrote on 07.07.2018 at 16:14:
> when the new calling
> convention is tied to a protocol that any type can implement (as PEP
> 580 proposes), the concern doesn't even arise.

Nick, +1 to all of what you said in your reply, and I also really like the
fact that this proposal is creating a new, general protocol that removes
lots of type special casing from places where objects are being called
"efficiently". We're essentially designing a Fast Duck Calling convention here.

Stefan



Re: [Python-Dev] On the METH_FASTCALL calling convention

2018-07-07 Thread Stefan Behnel
Hi Serhiy!

Serhiy Storchaka wrote on 07.07.2018 at 10:55:
> Here is my idea. Split each of the keyword argument parsing functions into
> two parts. The first part linearizes the keyword arguments: it converts
> positional and keyword arguments (in whatever form they were presented)
> into a linear array of PyObject* (with NULLs for unspecified optional
> arguments). The second part is common and similar to _PyArg_ParseStack(),
> but supports NULLs. It converts an array of PyObject* to a sequence of C
> values. I tried to implement this idea, it is not simple, and the results
> were mixed, but I haven't lost hope.

That proposal sounds good to me. Cython currently does something similar
/inside/ of its function entry code, in that it executes an unrolled series
of PyDict_GetItem() calls for the expected keyword arguments (instead of
iterating over a dict, which turned out to be slower), and maps those to an
array of arguments, all before it passes over that array to convert the
values to the expected C types. I agree that it makes sense to do the name
matching outside of the callee since the caller knows best in what way
(sequence, dict, ...) the arguments are available and can decide on the
fastest way to map them to a flat array, given the expected argument names.
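
As a rough Python model of that entry code (the real thing is an unrolled
sequence of C calls, not a loop, and the names here are invented):

_MISSING = object()   # stands in for NULL in the C version

def linearize_keywords(kwargs, expected_names):
    # one PyDict_GetItem() per expected argument name
    values = [_MISSING] * len(expected_names)
    matched = 0
    for i, name in enumerate(expected_names):
        value = kwargs.get(name, _MISSING)
        if value is not _MISSING:
            values[i] = value
            matched += 1
    if matched != len(kwargs):
        # a name was passed that never matched an expected one
        raise TypeError("got an unexpected keyword argument")
    return values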

And I think the proposal would fit nicely into PEP-580.


> And here we return to METH_FASTCALL|METH_KEYWORDS. The first part of
> handling arguments can be made outside of the C function, by the calling
> API. Then the signature of the C function can be simpler, the same as for
> METH_FASTCALL. But we need to expose the list of keyword parameter names as
> an attribute of CFunction.

And names should be expected to be interned, so that matching the keywords
can be done via pointer comparisons in almost all cases. That should make
it pretty fast, and errors can be detected in a slow separate pass if the
pointer matching fails. I think we cannot strictly assume a predictable,
helpful ordering of the keyword arguments on the calling side (that would
allow for a one-pass merge), but it's rather common for users to pass
keyword arguments in the order in which the signature expects them, so I'm
sure there's a fast algorithm (e.g. something like Insertion Sort) to match
both sides in negligible time.
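
One possible shape of that fast pass, as a Python sketch with "is" standing
in for the C pointer comparison (invented helper, not actual Cython output):

def match_keyword_names(sig_names, passed_names):
    # scan forward from the last match, so keywords passed in signature
    # order are matched in a single linear pass
    positions = []
    start = 0
    n = len(sig_names)
    for name in passed_names:
        for offset in range(n):
            i = (start + offset) % n
            if sig_names[i] is name:    # interned names, pointer match
                positions.append(i)
                start = i + 1
                break
        else:
            return None   # fast path failed, do a slow string-comparison pass
    return positions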

Stefan



Re: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods

2018-07-07 Thread Stefan Behnel
INADA Naoki wrote on 07.07.2018 at 10:08:
> Thank you.  Do you plan to make it the default when PEP 580 is accepted
> and implemented?

It will become the default at some point, I'm sure. Note that we will still
have to support older Python versions, though, currently 2.6+, which would
not have the improvements available. Some things might be backportable for
us, at least to older Py3.x releases, but we'll see.


> Personally speaking, I used Cython as a quick & easy alternative way of
> writing extension types.
> I don't need compatibility with pure Python functions.  I prefer
> minimal and lightweight.
> So I will disable it explicitly or stop using Cython.

I'll try to keep the directive available as a compatibility switch for you. ;)


> But if you believe PEP 580 makes many Cython users happy, I believe you.

It's more of a transitive thing, for the users of your code. If the
functions and methods in the code that I write behave like Python
functions, then people who use my code will not run into surprises and
quirks when trying to do their stuff with them and things will generally
"just work", e.g. inspecting the functions when debugging or using them
interactively, looking up their signature, default arguments and
annotations, generating documentation from them, pickling, assigning them
as methods ... basically anything that people implicitly expect to be able
to do with Python functions (or even callables in general) and that doesn't
work (well, or at all) with PyCFunction.

Stefan



Re: [Python-Dev] PEP 579 and PEP 580: refactoring C functions and methods

2018-07-07 Thread Stefan Behnel
INADA Naoki wrote on 07.07.2018 at 06:10:
> How often are "custom method types" used?
> 
> I thought Cython used it by default.
> But when I read the code generated by Cython, I can't find it.
> It uses normal PyMethodDef and tp_methods.
> 
> I found CyFunction in the Cython repository, but I can't find out
> how to use it.  The Cython documentation doesn't explain anything
> about it.

Its usage is disabled by default because of some of the problems that
Jeroen addresses in his PEP(s).

You can enable Cython's own function type by setting the compiler directive
"binding=True", e.g. from your setup.py or in a comment at the very top of
your source file:

# cython: binding=True

The directive name "binding" stems from the fact that CyFunctions bind as
methods when put into classes, but it's really misleading these days
because the main advantage is that it makes Cython compiled functions look
and behave much more like Python functions, including introspection etc.

Stefan



Re: [Python-Dev] On the METH_FASTCALL calling convention

2018-07-07 Thread Stefan Behnel
Jeroen Demeyer wrote on 05.07.2018 at 16:53:
> The only case when this handling of keywords is suboptimal is when using
> **kwargs. In that case, a dict must be converted to a tuple. It looks hard
> to me to support efficiently both the case of fixed keyword arguments
> (f(foo=x)) and a keyword dict (f(**kwargs)). Since the former is more
> common than the latter, the current choice is optimal.

Wrappers that adapt or add some arguments (think partial()) aren't all that
uncommon, even when speed is not irrelevant. But I agree that actual
keyword arguments should rarely be involved in those calls.

Typically, it's calls with 1 to ~3 positional arguments that occur in
performance critical situations. Often just one, rarely more, and zero
arguments is a fast case anyway. Keyword arguments will always suffer some
kind of penalty compared to positional arguments, regardless of how they
are implemented (at runtime). But they can easily be avoided in many cases,
and anyone designing a performance relevant API that *requires* keyword
arguments deserves to have their code forked away from them. :)

The current keyword calling conventions seem fine with me and I do not see
a reason why we should invest discussion time and distributed brain
capacity into "improving" them.

Stefan


PS: Passing keyword arguments through wrappers unchanged might be a case to
consider in the future, but the calling PEPs don't seem the right place to
discuss those, as it's not just a call issue but also a compiler issue.



Re: [Python-Dev] Can we make METH_FASTCALL public, from Python 3.7? (ref: PEP 579

2018-06-20 Thread Stefan Behnel
Serhiy Storchaka wrote on 20.06.2018 at 18:56:
> 20.06.18 18:42, INADA Naoki wrote:
>> I don't have any idea about changing METH_FASTCALL more.
>> If Victor and Serhiy think so, and PyPy maintainers like it too, I want
>> to make it public as soon as possible.
> 
> I don't have objections against making the METH_FASTCALL method calling
> convention public. But only for positional-only parameters, the protocol
> for keyword parameters is more complex and still can be changed.

That's also the level that Cython currently uses/supports, exactly because
keyword arguments are a) quite a bit more complex, b) a lot less often used
and c) pretty much never used in performance critical code.

Cython also currently limits the usage to Py3.6+, although I'm considering
generally enabling it for everything since Py2.6, as soon as Cython starts
using the calling convention for its own functions, just in case it ends up
calling itself without prior notice. :)

Stefan



[Python-Dev] C-level calling (was: PEP 575 (Unifying function/method classes) update)

2018-06-20 Thread Stefan Behnel
Victor Stinner wrote on 19.06.2018 at 16:59:
> 2018-06-19 13:58 GMT+02:00 Jeroen Demeyer :
>> Personally, I think that you are exaggerating these issues.
> 
> I'm not trying to convince you to abandon the idea. I would be happy
> to be able to use FASTCALL in more cases! I just tried to explain why
> I chose to abandon my idea.
> 
> FASTCALL is cute on tiny microbenchmarks, but I'm not sure that having
> spent almost one year on it was worth it :-)

Fastcall is actually nice, also because it has the potential to *simplify*
several things with regard to calling Python objects from C.

Thanks for implementing it, Victor.

Just to add another bit of background on top of the current discussion,
there is an idea around, especially in the scipy/big-data community (and
I'm not giving any guarantees here that it will lead to a PEP +
implementation, as it depends on people's workload) to design a dedicated C
level calling interface for Python. Think of it as similar to the buffer
interface, but for calling arbitrary C functions by bypassing the Python
call interface entirely. Objects that wrap some kind of C function (and
there are tons of them in the CPython world) would gain C signature meta
data, maybe even for overloaded signatures, and C code that wants to call
them could validate that meta data and call them as native C calls.

But that is a rather big project to undertake, and I consider Jeroen's new
PEP also a first step in that direction.

Stefan



Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-06-18 Thread Stefan Behnel
Victor Stinner wrote on 18.06.2018 at 15:09:
> I tried two options to add support for FASTCALL on calling an object:
> add a flag in tp_flags and reuse tp_call, or add a new tp_fastcall
> slot. I failed to implement correctly any of these two options.
> 
> There are multiple issues with tp_fastcall:
> 
> * ABI issue: it's possible to load a C extension using the old ABI,
> without tp_fastcall: it's not possible to write type->tp_fastcall on
> such type. This limitation causes different issues.

Not a problem if we rededicate the unused (since Py3.0) "tp_print" slot for it.

Even better, since the slot exists already in Py3.0+, tools like Cython,
NumPy (with its ufuncs etc.) or generic function dispatchers, basically
anything that benefits from fast calls, can enable support for it in all
CPython 3.x versions and benefit from faster calls among each other,
independent of the support in CPython. The explicit type flag opt-in that
the PEP proposes makes this completely safe.


> * If tp_call is modified, tp_fastcall may be outdated. Same if
> tp_fastcall is modified.

Slots are fixed at type creation and should never be modified afterwards.


> What happens on "del obj.__call__" or "del type.__call__"?

$ python3.7 -c 'del len.__call__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object attribute '__call__' is
read-only

$ python3.7 -c 'del type.__call__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'type'

And a really lovely one:

$ python3.7 -c 'del (lambda:0).__call__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: __call__


> * Many public functions of the C API still require the tuple and dict
> to pass positional and keyword arguments, so a compatibility layer is
> required to types who only want to implement FASTCALL.

Well, yes. It would require a trivial piece of code to map between the two.
Fine with me.
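
As a Python model of that mapping (the real shim would be a few lines of C;
names invented):

def as_fastcall(func, args, kwargs):
    # flat array: positional arguments first, keyword values appended,
    # with the keyword names passed separately as a tuple
    kwnames = tuple(kwargs)
    stack = list(args)
    stack.extend(kwargs[name] for name in kwnames)
    return func(stack, len(args), kwnames)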


> Related issue:
> what if something calls tp_call with (args: tuple, kwargs: dict)?
> Crash or call a compatibility layer converting arguments to FASTCALL
> calling convention?

The latter, obviously. Also easy to implement, with the usual undefined
dict order caveat (although that's probably solved when running in Py3.6+).


> I abandoned my idea for two reasons:
> 
> 1) in the worst case, my changes caused a crash which is not accepted
> for an optimization.

This isn't really an optimisation. It's a generalisation of the call protocol.


> My first intent was to remove the
> property_descr_get() hack because its implementation is fragile and
> caused crashes.

Not sure which hack you mean.


> 2) we implemented a lot of other optimizations which made calls faster
> without having to touch tp_call nor tp_fastcall. The benefit of
> FASTCALL for tp_call/tp_fastcall was not really significant.

What Jeroen said. Cleaning up the implementation and generalising the call
protocol is going to open up a wonderfully bright future for CPython. :)

Stefan


