[issue46376] PyMapping_Check returns 1 for list

2022-01-25 Thread Ben Avrahami


Ben Avrahami  added the comment:

IMHO, I don't think any alternative to aviramha's solution addresses the issue, 
And I don't think the need is niche enough to be ignored.

PyType_HasFeature excludes strings, bytes, and other esoteric types.
PyMapping_Check includes mappings like dict and MappingProxyType.
PySequence_Check includes non-dict mappings like MappingProxyType.

The only possible solutions right now are:

a. Import the "collections.abc.Sequence" class into C and run 
"PyObject_IsInstance" on it. This would be the correct solution, but it would 
be slower than aviramha's propsal.
b. Perform's aviramha's proposal manually: first check the 
"Py_TPFLAGS_SEQUENCE" feature flag, and fallback for instance checking strings, 
bytecodes, and other esoteric types individually. This would be correct and 
fast, but is cumbersome to perform, and users are bound to forget some types.

A question as simple as "would isinstance(X, ) returns true?" 
should be easier to answer for low-level developer, one only needs to look at 
the finagling numpy, among others, has to perform to parse a sequence (the 
relevant function in numpy is called "PyArray_FromAny").

A simple implementation of a new function for non-CPython alternatives will be 
to do implement solution a: import "collections.abc.Sequence", and check for 
instance. For CPYTHON, this can be optimized by implementing solution b.

There is already a precedence for ABI functions that only exist because its 
trivial implementation can be optimized with CPython specific behaviour. I 
don't understand the reluctance to add this one.

--
nosy: +avrahami.ben

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-23 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

I would really like this to be left alone.  We've been without a reliable 
version for over a decade.  That is strong evidence that we really don't need 
this unless it can be done perfectly (which it can't).  

Also, PyMapping_Check is just a CPython specific optimization (and a flawed 
one).  Because it can't be made completely reliable, we should not encourage 
people to use it, nor should we enshrine it in the stable API.

And if a modification can potentially break long stable code (such as that it 
the re module), then we absolutely shouldn't do it. x

Further, there is a general design issue.  Abstract base classes were invented 
to solve this specific problem (distinguishing mappings from sequences).  ABCs 
(and typing) are a now well established practice, and it is foolish to try to 
do an end run around using them.

I am strongly opposed to this going forward and request that a PEP be made if 
it is pursued further.   It is an SC level decision to allow stable code to be 
broken, to guarantee a CPython specific API for something that cannot be made 
correct in the general case, and to bypass the intended way to do it.

If the core concern is that isinstance() checks for ABCs are too slow, then 
efforts should be made to optimize them rather than creating an unreliable, 
CPython only alternative.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-23 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

It is difficult to distinguish Mapping from Sequence because they have the same 
set of dunder methods. The difference is only in semantic -- __iter__ yields 
items for Sequence and keys for Mapping, __getitem__ gets an item by index (or 
a sequence by slice) for Sequence and value by key for Mapping. But semantic 
cannot be tested here.

In these cases when both Sequence and Mapping are accepted but handled 
differently (like in the dict constructor) we test for existence of the "keys" 
method. It is not good, because it is not a dunder method, and therefore should 
be looked up at instance, not only at a type.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-23 Thread Aviram


Aviram  added the comment:

I sent it to sig-capi - 
https://mail.python.org/archives/list/capi-...@python.org/thread/T6DHEKHKKZIYU2GEPGHUQJ3DHTJXZGWW/

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-19 Thread Petr Viktorin


Petr Viktorin  added the comment:

I'd post it to capi-sig, or to the existing thread on python-dev. But here's a 
good place too, especially if you want feedback from a smaller group of people 
first.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-19 Thread Aviram


Aviram  added the comment:

Sure, I will do so. The proposal should be written here, right?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-19 Thread Petr Viktorin


Petr Viktorin  added the comment:

Changing the existing functions is a no-go for backwards compatibility reasons. 
I think the best way forward would be to add a new function, and then possibly 
deprecate the old one if it's deemed dangerous.

If you want to push this forward, could you summarize the use case(s) and 
expected behavior (docs) for such a function? It probably doesn't need a PEP, 
but it's worth looking here for what to think about when making a proposal: 
https://www.python.org/dev/peps/pep-0001/#what-belongs-in-a-successful-pep

--
nosy: +petr.viktorin

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-15 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
assignee: rhettinger -> 

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-15 Thread Bar Harel


Bar Harel  added the comment:

> That was true in 2011 and it is still true today

Python's methodology greatly shifted since 2011. For the better or worse, 
Python lost some of its duck typing. Nowadays, most people use linters. 
Wherever you'd try to pass sre_parse.SubPattern, the linter will throw an error 
saying it's not a Sequence even if it fully behaves like one. You can silence 
that error but you'll continue receiving such warnings all over the code, 
whether in stdlib or out in the wild. The meaning of Sequence now shifted to 
"inherits from abc.Sequence". The only thing wrong with PySequence_Check is 
that the ecosystem shifted, but its view of a Sequence remained the same.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-15 Thread Bar Harel


Bar Harel  added the comment:

@rhettinger I completely understand what you're saying and at first I agreed 
with you. Before I gave it a closer look, I thought about the same thing - we 
want reliability. Reliability is important and will avoid subtle bugs, which is 
why I was against this change for the exact reasons you mentioned: it is both 
breaking and unreliable.

I then realized that this change can be a reliable replacement for 
isinstance(obj, collections.abc.Sequence) at the C level. Let's use the broken 
sre_parse.SubPattern as an example - it does not register or inherit from 
collections.abc.Sequence, and isinstance(SubPattern, collections.abc.Sequence) 
== False. We cannot know programmatically if SubPattern is a Sequence, we 
cannot type hint it as such, and apart from reading the documentation, we 
cannot deal with the type differently in dynamic code that accepts either 
sequences or mappings. I dare to say, counting on it being a sequence, 
especially on a LBYL language like C is even less reliable. While SubPattern 
"embraces" the spirit of duck typing, it is very hard to fit in light of all 
recent changes advocating for a more structured and well defined types. After 
all, this feature was requested in order to solve reliability issues in 
statically typed languages.

Putting everything aside, the grand question still remains: do you think that 
there's a use for an efficient C-API isinstance check for Sequences and 
Mappings? I would presume the answer is yes. Would we encourage it? I have no 
clue. But if there's a need, we can either change this function as it has the 
same "spirit" or introduce a new one to prevent breaking existing code.

To answer your question: per specification, testing for Py_TPFLAGS_SEQUENCE 
using PyType_HasFeature, does not take strings, bytes and bytearray into 
consideration, and will not suffice. It is an incorrect solution that is even 
less reliable and falls into the exact pitfall of "guesswork" (for example 
SubPattern currently doesn't work with it either). It is not encouraged or 
easily thought of. PySequence_Check which is much more intuitive yet doesn't 
work either and that's where fixing it can have an edge.

A theoretical `PyIsInstance_Sequence` can check for TPFLAGS_SEQUENCE and 
Str/Bytes/ByteArray_Check. If I'm not wrong, doing so will be 100% reliable, 
identical to isinstance(obj, Sequence), and will be very efficient.

As a side-note, the C-API documentation for TP_FLAGS is not clear atm. It 
mentions for example tp_as_sequence and says "if such a flag bit is clear, the 
type fields it guards must not be accessed and must be considered to have a 
zero or NULL value instead" yet Py_TPFLAGS_SEQUENCE does not actually coincide 
with sequences per specification. I know it has a different explanation as well 
and the flag has its own docstring, but it is still a bit misleading.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-15 Thread Aviram


Aviram  added the comment:

I agree that a developer should and would prefer the `Py_TPFLAGS_*` but when 
you visit https://docs.python.org/3/c-api/sequence.html
It seems like the best practice to determine Sequence protocol is by using this 
function, hence leading to confusion. There's no recommendation to use the new 
`Py_TPFLAGS_*`.
To have this knowledge of `Py_TPFLAGS_*` one should be very knowledgable in 
Python's C-API.
How about adding a deprecation note to `PyMapping_Check` & `PySequence_Check` 
in the documentation, suggesting the alternative path (to use 
`PyType_HasFeature`)?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Here's question to focus on:  In what circumstances should a developer ever 
prefer PyMapping_Check() over PyType_HasFeature() with Py_TPFLAGS_MAPPING?

The latter doesn't give any new, useful information:

   return o && Py_TYPE(o)->tp_as_mapping &&
   Py_TYPE(o)->tp_as_mapping->mp_subscript;

I don't see any reason to build on top of this. It's best to just let it go 
gently into the good night without disrupting anything that currently happens 
to work.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> what about simply excluding TPFLAGS_MAPPING from PySequence 
> and TPFLAGS_Sequence from PyMapping? It will remove the types 
> that are 100% not sequences or mappings, as these flags 
> are mutually exclusive by definition.

This is more plausible than the proposed breaking change.


> The result will be much more accurate

If they can't be made fully reliable, why would we ever recommend that someone 
use these functions in real code?  They can be made to guess better than they 
guess now, but there is still guesswork.  

ISTM developers should follow the structure pattern matching implementation and 
refuse the temptation to guess.  If a class declares itself as a mapping or 
sequence, that is reliable information.  In contrast, these functions attempt 
to divine meaning in the absence of a clear declaration.  Using these functions 
will likely result in subtle bugs.

Once Py_TPFLAGS_MAPPING and Py_TPFLAGS_SEQUENCE became available, we should 
have deprecated these functions.  No one should use them anymore. Their core 
design is flawed; they tried to deduce semantics from structural artifacts.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread John Millikin


Change by John Millikin :


--
nosy:  -jmillikin

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Bar Harel


Bar Harel  added the comment:

I thought about it, what about simply excluding TPFLAGS_MAPPING from PySequence 
and TPFLAGS_Sequence from PyMapping? It will remove the types that are 100% not 
sequences or mappings, as these flags are mutually exclusive by definition. The 
result will be much more accurate, yet not cause a breaking change, apart from 
places where it is truly not a sequence or mapping.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Inada Naoki


Inada Naoki  added the comment:

collections.abc.Mapping is fixed by https://bugs.python.org/issue43977
We can be same thing if backward compatibility allows it.

--
nosy: +methane

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

s/it isn't work breaking other things/it isn't worth breaking other things/

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> It changes behavior for objects not being iterable/sequences 
> if not inheriting from `abc.collections.Sequence`.

This would be a breaking change (for example, it broke the long stable 
sre_parse code).

The utility of PySequence_Check and PyMapping_Check is already so low that it 
isn't work breaking other things to just to marginally improve these two minor 
and rarely used functions.

These functions will never be fully reliable.  The documentation explains that 
in general, we can't tell if a class with __getitem__ is a mapping or a 
sequence.  Sometimes hints are present (such as the tp_flags), but the can't 
get a reliable result.  As Guido observed, "calling PyMapping_Check() was never 
particularly reliable, and extension modules depending on it probably always 
had subtle bugs."  That was true in 2011 and it is still true today.

I recommend closing this.  These functions are mostly unimportant and 
unreliable and cannot be made correct.  In contrast, iterability is important 
and needs to be stable.  Special cases aren't important enough to break the 
rules.

--
assignee:  -> rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Bar Harel


Bar Harel  added the comment:

Another question we should ask is about duck typing. Is a sequence which 
doesn't inherit from abc.Sequence considered a sequence? Whatever the answer 
is, PySequence specifically looks for a sequence and removes duck typing out of 
the picture. The object will not pass static typing and will not pass 
isinstance check, so there's no reason for it to pass PySequence.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Aviram


Aviram  added the comment:

I submitted a draft patch. Using TPFlags alone doesn't cut it as some types are 
excluded (bytes, str, bytearray) in sequence and same for mapping. I'm thinking 
of checking for those cases specifically as those are very very specific 
casings. Would love some input.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Irit Katriel


Change by Irit Katriel :


--
type: behavior -> enhancement
versions:  -Python 3.10, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Brandt Bucher


Change by Brandt Bucher :


--
nosy: +brandtbucher

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Aviram


Change by Aviram :


--
keywords: +patch
pull_requests: +28799
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/30600

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Bar Harel


Bar Harel  added the comment:

Do note, the relevant functions are in the Stable ABI, and their promise will 
slightly change, yet modifying the current functions instead of creating new 
ones may still be beneficial.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Bar Harel


Bar Harel  added the comment:

Up until now, trying to distinguish between actual sequences and mappings in 
C-API was a pain. Same methods are implemented in customer user classes, and 
the ABCs could have either been registered dynamically or had an appropriate 
__subclasshook__. On top of that, making the same checks as isinstance 
regarding the whole ABC and __mro__ chain would considerably slow down the code.

The new tp_flags are set in both registered abc classes, and subclasses of the 
appropriate types. As tp_flags exists, we don't even need to account for 
tp_as_mapping and tp_as_sequence. According to the docs "if such a flag bit is 
clear, the type fields it guards must not be accessed and must be considered to 
have a zero or NULL value instead.". User made classes that don't inherit from 
the Sequence and Mapping ABCs are not considered either, and there is no 
__subclasshook__ for these ABCs so inheritance is fully enforced.

As we cover builtins, c-extension classes and user-made Python classes, 
checking tp_flags is guaranteed to be accurate for distinguishing between 
mapping and sequence types, and would be as fast as a pointer dereference can 
be.

Modifying the current PySequence_Check and PyMapping_Check might be a breaking 
change, especially on a C-API level. However one can argue that checking for a 
mapping and expecting a failure on user-made mappings counts as a bug. Having 
them accurately return the correct type in a fast manner using tp_flags is 
therefore an acceptable breaking change.

--
nosy: +bar.harel

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Aviram


Aviram  added the comment:

Would checking the TPFLAGS for `Py_TPFLAGS_SEQUENCE` & `Py_TPFLAGS_MAPPING` 
when using `PySequence_Check` & `PyMapping_Check` be a valid fix?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Ashley Anderson


Change by Ashley Anderson :


--
nosy: +aganders3

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Aviram


Aviram  added the comment:

https://github.com/PyO3/pyo3/issues/2072 Relevant discussion in PyO3 related 
issue

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Aviram


Change by Aviram :


--
nosy: +benjamin.peterson, bukzor, georg.brandl, jmillikin, levkivskyi, 
miss-islington, pitrou, rhettinger, serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46376] PyMapping_Check returns 1 for list

2022-01-14 Thread Aviram


New submission from Aviram :

This is re-open of https://bugs.python.org/issue5945.
In the former issue, it was decided that documenting the odd behavior and later 
providing clean, good ABC C API would be the long term solution.
Few years passed and there are no alternatives AFAIK
I am wondering if that's due to lack of resources or just forgotten?
I don't mind contributing the necessary change in case there's really nothing 
in progress and it is something Python community wants to fix.
Hopefully there's an existing solution and I just didn't search well enough.

--
components: C API
messages: 410555
nosy: aviramha
priority: normal
severity: normal
status: open
title: PyMapping_Check returns 1 for list
type: behavior
versions: Python 3.10, Python 3.11, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com