[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2021-09-21 Thread STINNER Victor


STINNER Victor  added the comment:

There is no consensus on changing things, so I just close my issue.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-10 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

PyModule_AddObject() has unique weird design, it is easy to misuse, and most 
code misuse it, but fixing it would break the code which uses it correctly.

I did not see any problems with PyTuple_GetItem().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-10 Thread STINNER Victor


STINNER Victor  added the comment:

> If you want to work only with non-borrowed references, (...)

This is not my goal here. My goal is to reduce the risk of memory leaks.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-10 Thread STINNER Victor


STINNER Victor  added the comment:

In bpo-1635741, I added PyModule_AddObjectRef() (commit 
8021875bbcf7385e651def51bc597472a569042c):
https://docs.python.org/dev/c-api/module.html#c.PyModule_AddObjectRef

"Similar to PyModule_AddObject() but don't steal a reference to the value on 
success."

I was tired of bugs caused by misusage of the surprising PyModule_AddObject() 
API.

PyModule_AddObject() *is* useful in some cases, but it is confusing in most 
cases...

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-10 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 78ba7c69ada2f7eefd4e3ea375337d78dc2ce721 by kj in branch 'master':
bpo-42294: Grammar fixes in doc glossary strong/weak refs (GH-23227)
https://github.com/python/cpython/commit/78ba7c69ada2f7eefd4e3ea375337d78dc2ce721


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-10 Thread Ken Jin


Change by Ken Jin :


--
nosy: +kj
nosy_count: 5.0 -> 6.0
pull_requests: +22130
pull_request: https://github.com/python/cpython/pull/23227

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I concur with Mark.

If you want to work only with non-borrowed references, use PySequence_GetItem() 
and PySequence_SetItem(). It has a cost: it is slower and needs checking 
errors. If you need more performant solution and binary compatibility across 
versions, use PyTuple_GetItem() and PyTuple_SetItem() (borrowed references is 
the part of optimization). If you don't need binary compatibility, but need 
speed, use macros.

And no need to expand the C API. It is already large enough.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


STINNER Victor  added the comment:

> PyTuple_SetItem() does clear the previous item, it uses Py_XSETREF. The macro 
> version (PyTuple_SET_ITEM) does not clear the previous item.

Oh sorry, I was thinking at PyTuple_SET_ITEM().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread hai shi


Change by hai shi :


--
nosy: +shihai1991

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread Ronald Oussoren


Ronald Oussoren  added the comment:

PyTuple_SetItem() does clear the previous item, it uses Py_XSETREF. The macro 
version (PyTuple_SET_ITEM) does not clear the previous item.

--
nosy: +ronaldoussoren

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


STINNER Victor  added the comment:

Mark Shannon:
> The old functions aren't going away, so these additional functions provide no 
> real safety. You can't stop C programmers trading away correctness for some 
> perceived performance benefit :(

In my experience, newcomers tend to copy existing more. If slowly, the code 
base moves towards safer code less error-prone code like Py_NewRef() or 
Py_SETREF(), slowly, we will avoid a bunch of bugs.


> If we were designing the API from scratch, then this would be a better set of 
> functions. But because the old functions remain, it just means we are making 
> the API larger.

New API VS enhance the existing API. So far, no approach won. I wrote the PEP 
620 to enhance the C API and towards a more opaque API, and there is the HPy 
project which is a new API written correctly from the start. But HPy is not 
usable yet, and migrating C extensions to HPy will take years.

Also, enhancing the existing API and writing a new API are not exclusive option.

What is the issue of making the C API larger?


> Please don't add macros, use inline functions.

For Py_NewRef(), I used all at once :-) static inline function + function + 
macro :-)

It's exported as a regular function for the stable ABI, but overriden by a 
static inline function with a macro. The idea is to allow to use it for 
developers who cannot use static inline functions (ex: extension modules not 
written in C).

I chose to redefine functions as static inline functions in the limited C API. 
If it's an issue, we can consider to only do that in Include/cpython/object.h.


> There seems to be some confusion about borrowed references and stolen 
> references in 
> https://pythoncapi.readthedocs.io/bad_api.html#borrowed-references
"Stealing" a reference is perfectly safe. Returning a "borrowed" reference is 
not.
>
> So, don't bother with `PyTuple_SetItemRef()`, as `PyTupleSetItem()` is safe.

I'm really annoyed that almost all functions increase the refcount of their 
arugments, except a bunch of special cases. I would like to move towards a more 
regular API.

PyTuple_SetItem() is annoying because it steals a reference to the item. Also, 
it doesn't clear the reference of the previous item, which is also likely to 
introduce a reference leak.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread Mark Shannon


Mark Shannon  added the comment:

I'm not convinced that this is worth the effort.

The old functions aren't going away, so these additional functions provide no 
real safety.
You can't stop C programmers trading away correctness for some perceived 
performance benefit :(

If we were designing the API from scratch, then this would be a better set of 
functions. But because the old functions remain, it just means we are making 
the API larger.


Please don't add macros, use inline functions.

There seems to be some confusion about borrowed references and stolen 
references in https://pythoncapi.readthedocs.io/bad_api.html#borrowed-references
"Stealing" a reference is perfectly safe. Returning a "borrowed" reference is 
not.

So, don't bother with `PyTuple_SetItemRef()`, as `PyTupleSetItem()` is safe.

--
nosy: +Mark.Shannon

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +22107
pull_request: https://github.com/python/cpython/pull/23209

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +22105
pull_request: https://github.com/python/cpython/pull/23207

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 23c5f93b83f78f295313e137011edb18b24c37c2 by Victor Stinner in 
branch 'master':
bpo-42294: Add borrowed/strong reference to doc glossary (GH-23206)
https://github.com/python/cpython/commit/23c5f93b83f78f295313e137011edb18b24c37c2


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


Change by STINNER Victor :


--
keywords: +patch
pull_requests: +22104
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/23206

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42294] [C API] Add new C functions with more regular reference counting like PyTuple_GetItemRef()

2020-11-09 Thread STINNER Victor


New submission from STINNER Victor :

The C API of Python uses and abuses borrowed references and stealing references 
for performance. When such function is used in some very specific code for best 
performances, problems arise when they are the only way to access objects. 
Reference counting in C is error prone, most people, even experimented core 
developers, get it wrong. Examples of issues:

* Reference leaks: objects are never deleted causing memory leaks. For example, 
an error handling code which forgets to call Py_DECREF() on a newly create 
object.

* Unsafe borrowed references: call arbitrary Python code can delete the 
referenced objects, and so the borrowed reference becomes a dangling pointer. 
Most developers are confident that a function call cannot run arbitrary Python 
code, whereas a single Py_DECREF() can trigger a GC collection which runs 
finalizers which can be arbitrary Python code. Many functions have been fixed 
manually by adding Py_INCREF() and Py_DECREF() around "unsafe" function calls.


Borrowed references and stealing references make reference counting code 
special, even more complex to review. I propose to use new function to make 
refecence counting code more regular, simpler to review, and so less error 
prone.

Examples:

* Add PyTuple_GetItem(): similar to PyTuple_GetItem() but returns a strong 
reference (or NULL if the tuple item is not set)
* Add PyTuple_SetItemRef(): similar to PyTuple_SetItem() but don't steal a 
reference to the new item

The C API has a long list of functions using borrowed references, so I'm not 
sure where we should stop. I propose to start with the most common functions: 
PyDict, PyTuple, PyList, and see how it goes.

--

PyTuple_GetItem() is a function call which checks arguments: raise an exception 
if arguments are invalid. For best performances, PyTuple_GET_ITEM() macro is 
providing to skip these checks. This macro also returns a borrowed reference.

I'm not if a new PyTuple_GET_ITEM_REF() macro should be added: similar to 
PyTuple_GET_ITEM() but returns a strong reference.

Same open question abut PyTuple_SET_ITEM(tuple, index, item) macro which is 
also special:

* Don't call Py_XINCREF(item)
* Don't call Py_XDECREF() on the old item

If a new PyTuple_SET_ITEM_REF() macro is added, I would prefer to make the 
function more "regular" in term of reference counting, and so call Py_XDECREF() 
on the old item. When used on a newly created tuple, it would add an useless 
Py_XDECREF(NULL), compared to PyTuple_SET_ITEM(). Again, my idea here is to 
provide functions with a less surprising behavior and more regular reference 
counting. There are alternatives to build a new tuple without the useless 
Py_XDECREF(NULL), like Py_BuildValue().

Code which requires best performance could continue to use PyTuple_SET_ITEM() 
which is not deprecated, and handle reference counting manually.

--

An alternative is to use abstract functions like:

* PyTuple_GetItem() => PySequence_GetItem()
* PyDict_GetItem() => PyObject_GetItem()
* etc.

I propose to keep specialized functions per type to avoid the overhead of 
indirection. For example, PySequence_GetItem(obj, index) calls 
Py_TYPE(obj)->tp_as_sequence->sq_item(obj, index) which implies multiple 
indirection:

* Get the object type from PyObject.ob_type
* Dereference *type to get PyTypeObject.tp_as_sequence
* Dereference *PyTypeObject.tp_as_sequence to get PySequenceMethods.sq_item

--

I don't plan to get rid of borrowed references. Sometimes, they are safe and 
replacing them with strong references would require explicit reference counting 
code which is again easy to get wrong.

For example, Py_TYPE() returns a borrowed reference to an object type. The 
function is commonly used to access immediately to a type member, with no risk 
of calling arbitrary Python code between the Py_TYPE() call and the read of the 
type attribute. For example, the following code is perfectly safe:

PyErr_Format(PyExc_TypeError, "exec() globals must be a dict, not 
%.100s",
 Py_TYPE(globals)->tp_name);


--

See also bpo-42262 where I added Py_NewRef() and Py_XNewRef() functions.

See https://pythoncapi.readthedocs.io/bad_api.html#borrowed-references for 
details about issues caused by borrowed references and a list of functions 
using borrowed references.

--
components: C API
messages: 380578
nosy: vstinner
priority: normal
severity: normal
status: open
title: [C API] Add new C functions with more regular reference counting like 
PyTuple_GetItemRef()
versions: Python 3.10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com