[issue15573] Support unknown formats in memoryview comparisons

2012-11-02 Thread Roundup Robot
Roundup Robot added the comment: New changeset 969069d464bc by Stefan Krah in branch '3.3': Issue #15814: Use hash function that is compatible with the equality http://hg.python.org/cpython/rev/969069d464bc -- ___ Python tracker

[issue15573] Support unknown formats in memoryview comparisons

2012-08-29 Thread Stefan Krah
Stefan Krah added the comment: We overlooked one thing. Since hashing is defined in terms of tobytes(), the equality-hash invariant is now broken: from _testbuffer import ndarray x = ndarray([1,2,3], shape=[3], format='f') y = ndarray([1,2,3], shape=[3], format='B') a = memoryview(x) b =

[issue15573] Support unknown formats in memoryview comparisons

2012-08-29 Thread Nick Coghlan
Nick Coghlan added the comment: Perhaps the hash could just be based on a subset of the items? For example, hash the format, shape and the first and last items? Yes, such an approach means you're more likely to get collisions if you do start hashing these, but that seems better than making

[issue15573] Support unknown formats in memoryview comparisons

2012-08-29 Thread Stefan Krah
Stefan Krah added the comment: I'm trying to think of an optimization for the native types. It should be possible to cast any native type element to unsigned char and use the truncated value for the bytes hash. Well, for double probably it's required to go from double - int64_t - unsigned

[issue15573] Support unknown formats in memoryview comparisons

2012-08-29 Thread Martin v . Löwis
Martin v. Löwis added the comment: 1. I STRONGLY request to take hashing out of this issue. The only further action that this issue can have is complete reversal of the patch, starting over. Now that the issue has been resolved, a NEW issue arises, namely that hashing is inconsistent with

[issue15573] Support unknown formats in memoryview comparisons

2012-08-29 Thread Stefan Krah
Stefan Krah added the comment: I agree that this isn't a blocker due to the limited use of memoryview hashing. Tracking this in #15814. -- status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-29 Thread Arfrever Frehtes Taifersar Arahesis
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com: -- resolution: - fixed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread STINNER Victor
STINNER Victor added the comment: Even if I would prefer to see a fully working new implementation of the buffer interface, I agree with Georg: it's too late for such huge change in Python 3.3. Can' we wait Python 3.4 to change the equality operator? It also took me three major releases to fix

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Stefan Krah
Stefan Krah added the comment: The effect of the change is pretty minimal though: Previously not equal was returned for unknown formats, now the struct module figures it out. memoryobject.c has 100% coverage except for a couple of lines that are either impossible to reach or very hard to reach.

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Nick Coghlan
Nick Coghlan added the comment: With Stefan's major improvements to the test suite for 3.3, the risk is low and I *really* don't want to spend the life of the 3.3 series excusing the current comparison behaviour. By contrast, explaining the shift in definition as moving from raw memory

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Georg Brandl
Georg Brandl added the comment: Well, I'm not against you committing it as long as you commit it *right now*. I'm about to cut the tag. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Stefan Krah
Stefan Krah added the comment: You can look at the attached memoryobject.c.gcov: *All* failure paths are taken since Python API functions are wrapped in macros that set PyExc_FailAPIError and return failure at predefined points the test suite. That should read: With a patch that wraps

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Stefan Krah
Stefan Krah added the comment: Great, give me 20 minutes or so. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___ ___ Python-bugs-list

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Nick Coghlan
Nick Coghlan added the comment: I'm currently on IRC and double checking the patch locally. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Nick Coghlan
Nick Coghlan added the comment: Stefan, was there something specific you wanted to do before committing? Otherwise I can commit it as soon as this full test run finishes. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Roundup Robot
Roundup Robot added the comment: New changeset afa3dedfee18 by Nick Coghlan in branch 'default': Close #15573: use value-based memoryview comparisons (patch by Stefan Krah) http://hg.python.org/cpython/rev/afa3dedfee18 -- resolution: - fixed stage: patch review - committed/rejected

[issue15573] Support unknown formats in memoryview comparisons

2012-08-25 Thread Stefan Krah
Stefan Krah added the comment: Except for Misc/NEWS the patch can be committed unchanged. Please go ahead! [The remaining doc changes are tracked elsewhere.] -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-24 Thread Martin v . Löwis
Martin v. Löwis added the comment: Is this still blocking the release? If so, it should be resolved within the next twelve hours, or else it may block the release until September. -- ___ Python tracker rep...@bugs.python.org

[issue15573] Support unknown formats in memoryview comparisons

2012-08-24 Thread Georg Brandl
Georg Brandl added the comment: I'm not very keen to hold up the release for long times again, especially for a patch of this size and lots of potential breakage. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-19 Thread Georg Brandl
Changes by Georg Brandl ge...@python.org: -- priority: deferred blocker - release blocker ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-19 Thread Stefan Krah
Stefan Krah added the comment: I think issue15573-struct-2.diff is ready to go and I'd rather commit sooner than later. Nick, can I interpret your last review comment as go ahead? :) -- ___ Python tracker rep...@bugs.python.org

[issue15573] Support unknown formats in memoryview comparisons

2012-08-19 Thread Georg Brandl
Georg Brandl added the comment: Small nit: when you put versionchanged:: 3.3 there should be a hint of *what* changed exactly :) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-19 Thread Stefan Krah
Stefan Krah added the comment: Right. I'm tracking all versionchanged issues in #15724. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-17 Thread Stefan Krah
Stefan Krah added the comment: There is still one corner case involving NaNs: Released memoryviews always compare equal. I took that over from the 3.2 implementation. import array a = array.array('d', [float('nan')]) m = memoryview(a) m == m False m.release() m == m True I guess we have

[issue15573] Support unknown formats in memoryview comparisons

2012-08-15 Thread Stefan Krah
Stefan Krah added the comment: Here is a new patch that hopefully addresses all comments. -- Added file: http://bugs.python.org/file26831/issue15573-struct-2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-14 Thread Stefan Krah
Stefan Krah added the comment: Here is a patch implementing by-value comparisons for all format strings understood by the struct module. It is slightly longer than promised, since for larger arrays it is necessary to cache an unpacking object for acceptable performance. The fast path for

[issue15573] Support unknown formats in memoryview comparisons

2012-08-14 Thread Stefan Krah
Stefan Krah added the comment: I didn't get my own comments as mail, so this is just a notification that I've answered Nick's comments. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-14 Thread Nick Coghlan
Nick Coghlan added the comment: Yeah, as soon as I got your email I realised I'd only been searching for usage *in the patch* rather than the full file. Oops :( More comments in the review. -- ___ Python tracker rep...@bugs.python.org

[issue15573] Support unknown formats in memoryview comparisons

2012-08-12 Thread Stefan Krah
Stefan Krah added the comment: Martin v. L??wis rep...@bugs.python.org wrote: Here is a more formal definition of my last proposal, v and w are equal iff v.shape() == w.shape() and ((v.format == w.format and v.tobytes() == w.tobytes()) or v.tolist() == w.tolist()) if tolist raises

[issue15573] Support unknown formats in memoryview comparisons

2012-08-12 Thread Stefan Krah
Stefan Krah added the comment: To be specific, after a quick look the only function from _testbuffer.c that would be needed is unpack_single(). This really does not look very complex. -- ___ Python tracker rep...@bugs.python.org

[issue15573] Support unknown formats in memoryview comparisons

2012-08-12 Thread Nick Coghlan
Nick Coghlan added the comment: We do import modules in the Object/* code if we really need to (Grep for PyImport_Import in the Objects dir) It seems to me like this case is rapidly reaching (or has even gone well past) the point where that's the sensible thing to do. A couple of key issues

[issue15573] Support unknown formats in memoryview comparisons

2012-08-12 Thread Stefan Krah
Stefan Krah added the comment: Thanks, Nick. I'll work on a struct module patch then. At this point for me this is the only satisfying solution. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Nick Coghlan
Nick Coghlan added the comment: Stefan's proposed definition is correct. Shapes define array types, format strings define entry types, then the actual memory contents define the value. It's not Stefan's definition of equivalence, it's what a statically typed array *means*. The 3.2 way is

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: Nick: I still disagree. Would you agree that array.array constitutes a statically typed array? Yet py array.array('b',b'foo') == array.array('B',b'foo') True py array.array('i',[1,2,3]) == array.array('L', [1,2,3]) True So the array object (rightfully)

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: it might *still* happen, I should say, since this problem is exactly what Victor says this issue intends to solve (for comparison of two 'u' arrays). -- ___ Python tracker rep...@bugs.python.org

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Stefan Krah
Stefan Krah added the comment: So we have two competing proposals: 1) Py_buffers are strongly typed arrays in the ML sense (e.g. array of float, array of int). This is probably what I'd prefer on the C level, imagine a function like function like PyBuffer_Compare(v, w). Backwards

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: Here is a patch doing the comparison on abstract values if the formats are different. -- Added file: http://bugs.python.org/file26766/abstractcmp.diff ___ Python tracker rep...@bugs.python.org

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Nick Coghlan
Nick Coghlan added the comment: OK, I think I finally understand what Martin is getting at from a semantic point of view, and I think I can better explain the background of the issue and why Stefan's proposed solution is both necessary and correct. The ideal definition of equivalence for

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Nick Coghlan
Nick Coghlan added the comment: Short version: 3.3 implemented new constraints on memoryview equality comparisons to eliminate cases that were incorrectly returning True when they should have returned False. However, the format constraints are currently too restrictive, so some memoryview

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Nick Coghlan
Nick Coghlan added the comment: I can see value in adopting the abstraction approach, but it's not part of this issue. That problem already existed when using memoryview in 3.2, we can wait until 3.4 to fix it. There's an actual, concrete regression in 3.3 that may break code, and comparing

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Nick Coghlan
Nick Coghlan added the comment: And, to be honest, I'd be quite happy for congratulations, you have reached the threshold where you need NumPy rather than memoryview to be the long term answer to getting by value comparison semantics. -- ___ Python

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: However, the format constraints are currently too restrictive, so some memoryview comparisons that correctly returned True in 3.2 are now also returning False. This patch fixes *those* comparisons to once again return True. No, it doesn't - at least not in

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: And, to be honest, I'd be quite happy for congratulations, you have reached the threshold where you need NumPy rather than memoryview to be the long term answer to getting by value comparison semantics. IMO, this threshold is already reached when you start

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Nick Coghlan
Nick Coghlan added the comment: Hmm, you're right. OK, here's a simpler proposal for 3.3: 1. Always look at shape first. If they're different, then they're not equal 2. If both formats are known, compare by unpacked value 3. If either format is unknown, compare by memory contents (just like

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Antoine Pitrou
Antoine Pitrou added the comment: Le samedi 11 août 2012 à 19:52 +, Nick Coghlan a écrit : I'd be happier if the compare-by-value didn't make complete copies of the entire array though. Ditto. If a and b are bytes objects, comparing memoryview(a) and memoryview(b) should be as cheap as

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: Am 11.08.12 21:59, schrieb Antoine Pitrou: Le samedi 11 août 2012 à 19:52 +, Nick Coghlan a écrit : I'd be happier if the compare-by-value didn't make complete copies of the entire array though. Ditto. If a and b are bytes objects, comparing

[issue15573] Support unknown formats in memoryview comparisons

2012-08-11 Thread Martin v . Löwis
Martin v. Löwis added the comment: Here is a more formal definition of my last proposal, v and w are equal iff v.shape() == w.shape() and ((v.format == w.format and v.tobytes() == w.tobytes()) or v.tolist() == w.tolist()) if tolist raises an exception for unsupported codes, they are not

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: haypo: thanks for stating the issue. ISTM that this classifies as an obscure bug: you have to use memoryviews, and you need to compare them for equality, and the comparison needs to be non-trivial, where trivial is defined by both are 1D byte arrays. While

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Stefan Krah
Stefan Krah added the comment: I can easily provide a specification that makes the current implementation correct Yes, the current specification is: memoryview only attempts to compare arrays with known (single character native) formats and returns not equal otherwise. The problem is that

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: Note that in 3.2 memoryview would return equal for arrays that simply aren't equal, if those arrays happen to have the same bit pattern. This is exactly my point. If a memoryview has the same memory as another, why are they then not rightfully considered

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Stefan Krah
Stefan Krah added the comment: PEP-3118 specifies strongly typed multi-dimensional arrays. The existing code in the 3.2 memoryview as well as numerous comments by Travis Oliphant make it clear that these capabilities were intended from the start for memoryview as well. Perhaps the name

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: Can you please elaborate on the specification: 1. what does it mean that the formats of v and w are equal? 2. Victor's clarification about this issue isn't about comparing two arrays, but an array with a string object. So: when is an array equal to some

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Stefan Krah
Stefan Krah added the comment: 1. what does it mean that the formats of v and w are equal? I'm using array and Py_buffer interchangeably since a Py_buffer struct actually describes a multi-dimensional array. v and w are Py_buffer structs. So v.format must equal w.format, where format is a

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Stefan Krah
Changes by Stefan Krah stefan-use...@bytereef.org: -- stage: needs patch - patch review ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: So v.format must equal w.format, where format is a format string in struct module syntax. The topic of this issue is to determine under what circumstances two strings in struct module syntax are considered equal. And that is exactly my question: We don't

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread STINNER Victor
STINNER Victor added the comment: Can't we start with something simple (for ptyhon 3.3?), and elaborate later? In my specific example, both object have the same format string and the same content. So i expect that they are equal. Le 10 août 2012 19:47, Martin v. Löwis rep...@bugs.python.org a

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: Can't we start with something simple (for ptyhon 3.3?), and elaborate later? Sure: someone would have to make a proposal what exactly that is. IMO, the 3.2 definition *was* simple, but apparently it was considered too simple. So either Stefan gets to define

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Stefan Krah
Stefan Krah added the comment: The ideal specification is: 1) Arrays v and w are equal iff format and shape are equal and for all valid indices allowed by shape memcmp((char *)PyBuffer_GetPointer(v, indices), (char *)PyBuffer_GetPointer(w, indices), itemsize) ==

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Stefan Krah
Stefan Krah added the comment: Martin v. Loewis rep...@bugs.python.org wrote: Sure: someone would have to make a proposal what exactly that is. IMO, the 3.2 definition *was* simple, but apparently it was considered too simple. It was simply broken in multiple ways. Example: from numpy

[issue15573] Support unknown formats in memoryview comparisons

2012-08-10 Thread Martin v . Löwis
Martin v. Löwis added the comment: I find Stefan's proposed equality confusing. Why is it based on memcmp? Either it compares memory (i.e. internal representations), or it compares abstract values. If it compares abstract values, then it shouldn't be a requirement that the format strings are

[issue15573] Support unknown formats in memoryview comparisons

2012-08-09 Thread Martin v . Löwis
Martin v. Löwis added the comment: Can someone please explain why this is a release blocker? What is the specific regression from 3.2 that this deals with? -- nosy: +loewis ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573

[issue15573] Support unknown formats in memoryview comparisons

2012-08-09 Thread STINNER Victor
STINNER Victor added the comment: What is the specific regression from 3.2 that this deals with? I don't know if it must be called a regression, but at least the behaviour is different in Python 3.2 and 3.3. For example, an Unicode array is no more equal to its memoryview: Python 3.3.0b1

[issue15573] Support unknown formats in memoryview comparisons

2012-08-08 Thread Stefan Krah
Stefan Krah added the comment: Right, byte order specifiers are always at the beginning of the string. That is at least something. I wonder if we should tighten PEP-3118 to demand a canonical form of format strings, such as (probably incomplete): - Whitespace is disallowed. - Except for

[issue15573] Support unknown formats in memoryview comparisons

2012-08-07 Thread Stefan Krah
New submission from Stefan Krah: Continuing the discussion from #13072. I hit a snag here: Determining in full generality whether two format strings describe identical items is pretty complicated, see also #3132. I'm attaching a best effort fmtcmp() function that should do the following: -

[issue15573] Support unknown formats in memoryview comparisons

2012-08-07 Thread Nick Coghlan
Nick Coghlan added the comment: I confess I was thinking of an even simpler format strings must be identical fallback, but agree your way is better, since it reproduces the 3.2 behaviour in many more cases where ignoring the format string actually did the right thing. The struct docs for the

[issue15573] Support unknown formats in memoryview comparisons

2012-08-07 Thread Christian Heimes
Changes by Christian Heimes li...@cheimes.de: -- nosy: +christian.heimes ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15573 ___ ___

[issue15573] Support unknown formats in memoryview comparisons

2012-08-07 Thread Meador Inge
Meador Inge added the comment: I agree that the general case is complicated. It will get even more complicated if the full of PEP 3118 gets implemented since it turns into a tree comparison. In general, I think you will probably have to compute some canonical form and then compare the