Overall, I support the efforts to improve the C API, but over the last few 
weeks have become worried.  I don't want to hold up progress with fear, 
uncertainty, and doubt.  Yet, I would like to be more comfortable that we're 
all aware of what is occurring and what are the potential benefits and risks.

* Inline functions are great.  They provide true local variables, better 
separation of concerns, are far less kludgy than text based macro substitution, 
and will typically generate the same code as the equivalent macro.  This is 
good tech when used within a single source file where it has predictable 
results.
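
A minimal sketch of the difference between text substitution and a true 
local variable. All names here (SQUARE_MACRO, square_inline, next_value) are 
hypothetical and chosen for illustration; none of them come from the CPython 
headers.

```c
/* Hypothetical names for illustration only; not from the CPython headers. */

/* Text-substitution macro: the argument expression is pasted in twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: the argument is evaluated once, into a true local. */
static inline int square_inline(int x)
{
    return x * x;
}

/* Counter used to expose how often the argument expression is evaluated. */
static int call_count = 0;

static int next_value(void)
{
    call_count++;
    return 3;
}

/* Returns how many times next_value() ran when passed through the macro. */
static int calls_via_macro(void)
{
    call_count = 0;
    (void)SQUARE_MACRO(next_value());
    return call_count;
}

/* Returns how many times next_value() ran when passed to the function. */
static int calls_via_inline(void)
{
    call_count = 0;
    (void)square_inline(next_value());
    return call_count;
}
```

The macro evaluates its argument twice and the function evaluates it once, so 
the two forms agree only when the argument has no side effects.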

However, I'm not at all confident about moving these into header files which 
are included in multiple target .c files which need to be compiled into separate 
.o files and linked to other existing libraries.

With a macro, I know for sure that the substitution is taking place.  This 
happens at all levels of optimization and in a debug mode.  The effects are 
100% predictable and have a well-established track record in our mature 
battle-tested code base.  With cross module function calls, I'm less confident 
about what is happening, partly because compilers are free to ignore inline 
directives and partly because the semantics of inlining are less clear when 
crossing module boundaries.
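
To make the header case concrete, here is a rough sketch of the two forms 
side by side.  The names (demo_object, demo_incref, DEMO_INCREF) are made up 
for this example and are not the actual CPython definitions.

```c
/* Sketch of what a shared header might contain; all names are hypothetical. */
#ifndef DEMO_REFCOUNT_H
#define DEMO_REFCOUNT_H

typedef struct {
    long refcnt;
} demo_object;

/* 'static inline' gives every .c file that includes this header its own
 * private copy.  The 'inline' keyword is a request, not a guarantee: the
 * compiler may still emit an ordinary function and call it. */
static inline void demo_incref(demo_object *op)
{
    op->refcnt++;
}

/* Macro form of the same operation: always expanded by the preprocessor,
 * regardless of optimization level. */
#define DEMO_INCREF(op) ((op)->refcnt++)

#endif /* DEMO_REFCOUNT_H */

/* Calling through a function pointer shows that demo_incref is a real
 * function that can exist out of line in this translation unit. */
static void (*const demo_incref_ptr)(demo_object *) = demo_incref;
```

Whether a given call to demo_incref is actually inlined depends on the 
compiler and its flags, which is the uncertainty described above; the macro 
has no such dependency.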

* Other categories of changes that we make tend to have only a shallow reach.  
However, these C API changes will likely touch every C extension that has ever 
been written, some of which are highly tuned but not actively re-examined.  If 
any mistakes are made, they will likely be pervasive.  Accordingly, caution is 
warranted.

My expectation was that the changes would be conducted in experimental 
branches. But extensive changes are already being made (or about to be made) on 
the 3.8 master. If, a year from now, we decide that the changes were 
destabilizing or that the promised benefits didn't materialize, they will be 
difficult to undo because there are so many of them and because they will be 
interleaved with other changes.

The original motivation was to achieve a 2x speedup in return for significantly 
churning the C API. However, the current rearranging of the include files and 
macro-to-inline-function changes only give us churn.  At the very best, they 
will be performance neutral.  At worst, formerly cheap macro calls will become 
expensive in places that we haven't thought to run timings on.  Given that 
compilers don't have to honor an inline directive, we can't really know for 
sure -- perhaps today it works out fine, and perhaps tomorrow the compilers opt 
for a different behavior.

Maybe everything that is going on is fine.  Maybe it's not. I am not expert 
enough to know for sure, but we should be careful before green-lighting such an 
extensive series of changes directly to master.  Reasonable questions to ask 
are: 1) What are the risks to third-party modules?  2) Do we really know that 
the macro-to-inline-function transformations are semantically neutral?  3) If 
there is no performance benefit (none has been seen so far, nor is any promised 
in the pending PRs), is it worth it?

We do know that PyPy folks have had their share of issues with the C API, but 
I'm not sure that we can make any of this go away without changing the 
foundations of the whole ecosystem.  It is inconvenient for a full GC 
environment to interact with the API for a reference counted environment -- I 
don't think we can make this challenge go away without giving up reference 
counting.  It is inconvenient for a system that manifests objects on demand to 
interact with an API that assumes that objects have identity and never move 
once they are created -- I don't think we can make this go away either.  It is 
inconvenient for a system that uses unboxed data to interact with our API where 
everything is an object that includes a type pointer and reference count -- we 
have provided an API for boxing and unboxing, but the trip back-and-forth is 
inconveniently expensive -- I don't think we can make that go away either 
because too much of the ecosystem depends on that API.  There are some things 
that can be mitigated, such as challenges with borrowed references, but that 
doesn't seem to have been the focus of any of the PRs.
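
As a rough illustration of why the boxing round trip costs something, here is 
a toy layout in which every value carries a reference count and a type 
pointer.  This is not the real CPython object layout, and every toy_* name 
is invented for the example; in CPython itself the corresponding calls are 
functions such as PyLong_FromLong and PyLong_AsLong.

```c
#include <stdlib.h>

/* Toy layout, not the real CPython structs: every boxed value carries a
 * reference count and a type pointer in front of its payload. */
typedef struct toy_type {
    const char *name;
} toy_type;

typedef struct toy_object {
    long refcnt;
    const toy_type *type;
} toy_object;

typedef struct toy_int {
    toy_object base;   /* must be first so the pointers are interchangeable */
    long value;
} toy_int;

static const toy_type toy_int_type = { "int" };

/* Boxing: a heap allocation plus header setup.  Returns NULL on failure. */
static toy_object *toy_box_long(long v)
{
    toy_int *boxed = malloc(sizeof *boxed);
    if (boxed == NULL)
        return NULL;
    boxed->base.refcnt = 1;
    boxed->base.type = &toy_int_type;
    boxed->value = v;
    return &boxed->base;
}

/* Unboxing: a type check plus a load.  Returns 0 on success, -1 on a
 * NULL or wrongly typed object. */
static int toy_unbox_long(const toy_object *op, long *out)
{
    if (op == NULL || op->type != &toy_int_type)
        return -1;
    *out = ((const toy_int *)op)->value;
    return 0;
}

/* Drop one reference; free the box when the count reaches zero. */
static void toy_decref(toy_object *op)
{
    if (--op->refcnt == 0)
        free(op);
}
```

A system that keeps a plain long in a register has to pay for the 
allocation, the header writes, and the later type check each time the value 
crosses this kind of API.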

In short, I'm somewhat concerned about the extensive changes that are 
occurring.  I do know they will touch substantially every C module in the 
entire ecosystem.  I don't know whether they are safe or whether they will give 
any real benefit.

FWIW, none of this is a criticism of the work being done.  Someone needs to 
think deeply about the C API or else progress will never be made.  That said, 
it is a high risk project with many PRs going directly into master, so it does 
warrant having buy-in that the churn isn't destabilizing and will actually 
produce a benefit that is worth it.


Raymond







_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev