[Numpy-discussion] dtype comparison and hashing - bug?
Hi,

I have just run into this oddness:

    In [28]: dt1 = np.dtype('f4')

    In [29]: dt1.str
    Out[29]: '<f4'

    In [30]: dt2 = dt1.newbyteorder('<')

    In [31]: dt2.str
    Out[31]: '<f4'

    In [32]: dt1 == dt2
    Out[32]: True

    In [33]: hash(dt1) == hash(dt2)
    Out[33]: False

This is the same as:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg13299.html

My question was: does the team still agree this is a bug? Can anyone offer a pointer as to how it should be fixed?

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
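For readers who want to repeat the check on their own installation, here is a minimal sketch that wraps the session above in a function. The helper names `eq_implies_same_hash` and `check_f4_byteorder` are hypothetical (they are not NumPy functions), and the guarded import is only there so the snippet still loads where NumPy is absent. It makes no claim about what any particular NumPy release returns.

```python
try:
    import numpy as np
except ImportError:  # NumPy not installed; check_f4_byteorder() then returns None
    np = None


def eq_implies_same_hash(a, b):
    """Return False only when a == b but hash(a) != hash(b)."""
    return (not (a == b)) or hash(a) == hash(b)


def check_f4_byteorder():
    """Repeat the session above: 'f4' versus an explicit little-endian copy."""
    if np is None:
        return None
    dt1 = np.dtype('f4')
    dt2 = dt1.newbyteorder('<')
    return eq_implies_same_hash(dt1, dt2)


print(check_f4_byteorder())
```

A result of False means the installation shows the inconsistency reported in this thread.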
Re: [Numpy-discussion] dtype comparison and hashing - bug?
On Wed, Oct 20, 2010 at 5:08 PM, Matthew Brett matthew.br...@gmail.com wrote:
> Hi,
>
> I have just run into this oddness:
>
>     In [28]: dt1 = np.dtype('f4')
>
>     In [29]: dt1.str
>     Out[29]: '<f4'
>
>     In [30]: dt2 = dt1.newbyteorder('<')
>
>     In [31]: dt2.str
>     Out[31]: '<f4'
>
>     In [32]: dt1 == dt2
>     Out[32]: True
>
>     In [33]: hash(dt1) == hash(dt2)
>     Out[33]: False
>
> This is the same as:
>
> http://www.mail-archive.com/numpy-discussion@scipy.org/msg13299.html
>
> My question was: does the team still agree this is a bug?

This should have been fixed when I implemented the hashing protocol for dtypes. This is a bug in the hashing protocol implementation, most likely caused by = and < being considered different by the hashing function.

I will try to take a look at it (the function to fix is _array_descr_builtin in hashdesc.c if you feel like doing some C right now :) ).

cheers,

David
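The suspected cause can be illustrated without touching NumPy internals. The following is a toy model, not NumPy's C implementation: `ToyDescr`, `hash_raw`, and `hash_normalized` are hypothetical names. It assumes that equality treats '=' (native) and the explicit native character as the same byte order, and compares a hash built from the raw character with one built from the normalized character.

```python
import sys

# Explicit byte-order character that matches this machine's native order.
NATIVE = '<' if sys.byteorder == 'little' else '>'


class ToyDescr:
    """Toy stand-in for a builtin dtype: kind, item size, byte-order char."""

    def __init__(self, kind, itemsize, byteorder='='):
        self.kind = kind
        self.itemsize = itemsize
        self.byteorder = byteorder

    def normalized_byteorder(self):
        # Map '=' onto the explicit native character so both spellings agree.
        return NATIVE if self.byteorder == '=' else self.byteorder

    def __eq__(self, other):
        if not isinstance(other, ToyDescr):
            return NotImplemented
        return (self.kind, self.itemsize, self.normalized_byteorder()) == (
            other.kind, other.itemsize, other.normalized_byteorder())

    def hash_raw(self):
        # Suspect variant: hashes the raw character, so '=' and the native
        # character will generally produce different values.
        return hash((self.kind, self.itemsize, self.byteorder))

    def hash_normalized(self):
        # Consistent variant: hashes exactly the fields that __eq__ compares.
        return hash((self.kind, self.itemsize, self.normalized_byteorder()))

    __hash__ = hash_normalized


a = ToyDescr('f', 4, '=')
b = ToyDescr('f', 4, NATIVE)
print(a == b)
print(a.hash_raw() == b.hash_raw())
print(a.hash_normalized() == b.hash_normalized())
```

The design point is that the hash should be computed from the same normalized fields that equality inspects, so the two can never disagree.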
Re: [Numpy-discussion] dtype comparison and hashing - bug?
Hi,

> It already has a ticket :)
>
> http://projects.scipy.org/numpy/ticket/1637

Oops, sorry. Thanks for pointing that out.

Cheers,

Matthew
Re: [Numpy-discussion] dtype comparison and hashing
On Wed, Oct 15, 2008 at 12:56 PM, Robert Kern [EMAIL PROTECTED] wrote:
> On Wed, Oct 15, 2008 at 02:20, Geoffrey Irving [EMAIL PROTECTED] wrote:
> > Hello,
> >
> > Currently in numpy, comparing dtypes for equality with == does an internal PyArray_EquivTypes check, which means that the dtypes NPY_INT and NPY_LONG compare as equal in python. However, the hash function for dtypes reduces to id(), which is therefore inconsistent with ==. Unfortunately I can't produce a python snippet showing this since I don't know how to create a NPY_INT dtype in pure python.
> >
> > Based on the source it looks like hash should raise a type error, since tp_hash is null but tp_richcompare is not. Does the following snippet throw an exception for others?
> >
> >     >>> import numpy
> >     >>> hash(numpy.dtype('int'))
> >     5708736
> >
> > This might be the problem:
> >
> >     /* Macro to get the tp_richcompare field of a type if defined */
> >     #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
> >                             ? (t)->tp_richcompare : NULL)
> >
> > I'm using the default Mac OS X 10.5 installation of python 2.5 and numpy, so maybe those weren't compiled correctly. Has anyone else seen this issue?
>
> Actually, the problem is that we provide a hash function explicitly. In multiarraymodule.c:
>
>     PyArrayDescr_Type.tp_hash = (hashfunc)_Py_HashPointer;
>
> That is a violation of the hashing protocol (objects which compare equal and are hashable need to hash equal), and should be fixed.

Thanks for finding that.

Geoffrey
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] dtype comparison and hashing
Hello,

Currently in numpy, comparing dtypes for equality with == does an internal PyArray_EquivTypes check, which means that the dtypes NPY_INT and NPY_LONG compare as equal in python. However, the hash function for dtypes reduces to id(), which is therefore inconsistent with ==. Unfortunately I can't produce a python snippet showing this since I don't know how to create a NPY_INT dtype in pure python.

Based on the source it looks like hash should raise a type error, since tp_hash is null but tp_richcompare is not. Does the following snippet throw an exception for others?

    >>> import numpy
    >>> hash(numpy.dtype('int'))
    5708736

This might be the problem:

    /* Macro to get the tp_richcompare field of a type if defined */
    #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
                            ? (t)->tp_richcompare : NULL)

I'm using the default Mac OS X 10.5 installation of python 2.5 and numpy, so maybe those weren't compiled correctly. Has anyone else seen this issue?

Thanks,

Geoffrey
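The expectation that hash should raise has a close analogue in pure Python 3, sketched below. This is an analogy only, not the C-level tp_hash/tp_richcompare logic of Python 2.5 discussed above: in Python 3, a class that defines `__eq__` without `__hash__` has its `__hash__` set to `None`, so `hash()` raises `TypeError`. `EqOnly` and `is_hashable` are hypothetical names.

```python
class EqOnly:
    """Defines equality but no hash, so instances are unhashable in Python 3."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return self.value == other.value


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable(EqOnly(1)))   # False: __hash__ was implicitly set to None
print(is_hashable(object()))    # True: plain objects hash by identity
```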
Re: [Numpy-discussion] dtype comparison and hashing
On Wed, Oct 15, 2008 at 02:20, Geoffrey Irving [EMAIL PROTECTED] wrote:
> Hello,
>
> Currently in numpy, comparing dtypes for equality with == does an internal PyArray_EquivTypes check, which means that the dtypes NPY_INT and NPY_LONG compare as equal in python. However, the hash function for dtypes reduces to id(), which is therefore inconsistent with ==. Unfortunately I can't produce a python snippet showing this since I don't know how to create a NPY_INT dtype in pure python.
>
> Based on the source it looks like hash should raise a type error, since tp_hash is null but tp_richcompare is not. Does the following snippet throw an exception for others?
>
>     >>> import numpy
>     >>> hash(numpy.dtype('int'))
>     5708736
>
> This might be the problem:
>
>     /* Macro to get the tp_richcompare field of a type if defined */
>     #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
>                             ? (t)->tp_richcompare : NULL)
>
> I'm using the default Mac OS X 10.5 installation of python 2.5 and numpy, so maybe those weren't compiled correctly. Has anyone else seen this issue?

Actually, the problem is that we provide a hash function explicitly. In multiarraymodule.c:

    PyArrayDescr_Type.tp_hash = (hashfunc)_Py_HashPointer;

That is a violation of the hashing protocol (objects which compare equal and are hashable need to hash equal), and should be fixed.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco
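Why a pointer-based hash breaks the protocol can be seen in a small pure-Python model. It is only a sketch: `PointerHashed` and `FieldHashed` are hypothetical classes, and `id()` stands in for `_Py_HashPointer`. Two distinct objects that compare equal get different identity hashes, so set and dict lookups miss.

```python
class PointerHashed:
    """Equality by value, hash by identity: violates the hashing protocol."""

    def __init__(self, kind, itemsize):
        self.kind = kind
        self.itemsize = itemsize

    def __eq__(self, other):
        if not isinstance(other, PointerHashed):
            return NotImplemented
        return (self.kind, self.itemsize) == (other.kind, other.itemsize)

    def __hash__(self):
        return id(self)  # stand-in for a pointer-based hash


class FieldHashed(PointerHashed):
    """Same equality, but the hash is derived from the compared fields."""

    def __hash__(self):
        return hash((self.kind, self.itemsize))


a, b = PointerHashed('i', 4), PointerHashed('i', 4)
print(a == b, b in {a})   # equal, yet the set lookup fails

c, d = FieldHashed('i', 4), FieldHashed('i', 4)
print(c == d, d in {c})   # equal, and the set lookup succeeds
```

Hashing the fields that equality compares is the simplest way to keep the two consistent; it relies on those fields not changing while the object is used as a key.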