[Numpy-discussion] dtype comparison and hashing - bug?

2010-10-20 Thread Matthew Brett
Hi,

I have just run into this oddness:

In [28]: dt1 = np.dtype('f4')

In [29]: dt1.str
Out[29]: '<f4'

In [30]: dt2 = dt1.newbyteorder('<')

In [31]: dt2.str
Out[31]: '<f4'

In [32]: dt1 == dt2
Out[32]: True

In [33]: hash(dt1) == hash(dt2)
Out[33]: False

This is the same as:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg13299.html

My question was - does the team still agree this is a bug?  Can anyone
offer a pointer as to how it should be fixed?

Best,

Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] dtype comparison and hashing - bug?

2010-10-20 Thread David Cournapeau
On Wed, Oct 20, 2010 at 5:08 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
> Hi,
>
> I have just run into this oddness:
>
> In [28]: dt1 = np.dtype('f4')
>
> In [29]: dt1.str
> Out[29]: '<f4'
>
> In [30]: dt2 = dt1.newbyteorder('<')
>
> In [31]: dt2.str
> Out[31]: '<f4'
>
> In [32]: dt1 == dt2
> Out[32]: True
>
> In [33]: hash(dt1) == hash(dt2)
> Out[33]: False
>
> This is the same as:
>
> http://www.mail-archive.com/numpy-discussion@scipy.org/msg13299.html
>
> My question was - does the team still agree this is a bug?

This should have been fixed when I implemented the hashing protocol
for dtypes. This is a bug in the hashing protocol implementation, most
likely caused by '=' and '<' being considered different by the hashing
function. I will try to take a look at it (the function to fix is
_array_descr_builtin in hashdesc.c if you feel like doing some C right
now :) ).
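
A minimal sketch of the idea in pure Python, not the actual C code in
hashdesc.c: if the byte-order character is normalized before it is fed to
the hash, '=' (native) and the explicit native character hash alike. The
names canonical_byteorder and descr_hash are hypothetical, and the
(kind, itemsize, byteorder) triple is only an assumed stand-in for what a
builtin dtype hash would consider.

```python
import sys


def canonical_byteorder(ch):
    """Map '=' (native) to the explicit '<' or '>' for this machine."""
    if ch == '=':
        return '<' if sys.byteorder == 'little' else '>'
    return ch


def descr_hash(kind, itemsize, byteorder):
    # Hash the normalized triple so equivalent descriptions hash alike.
    return hash((kind, itemsize, canonical_byteorder(byteorder)))


# The explicit byte-order character that matches this machine.
native = '<' if sys.byteorder == 'little' else '>'

# '=' and the explicit native character now give the same hash.
print(descr_hash('f', 4, '=') == descr_hash('f', 4, native))
```

Normalizing before hashing keeps the hash consistent with an equality
check that already treats the two spellings as the same type.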

cheers,

David


Re: [Numpy-discussion] dtype comparison and hashing - bug?

2010-10-20 Thread Matthew Brett
Hi,

> It already has a ticket :)
>
>    http://projects.scipy.org/numpy/ticket/1637

Oops - sorry - thanks for pointing that out.

Cheers,

Matthew


Re: [Numpy-discussion] dtype comparison and hashing

2008-10-18 Thread Geoffrey Irving
On Wed, Oct 15, 2008 at 12:56 PM, Robert Kern [EMAIL PROTECTED] wrote:
> On Wed, Oct 15, 2008 at 02:20, Geoffrey Irving [EMAIL PROTECTED] wrote:
>> Hello,
>>
>> Currently in numpy comparing dtypes for equality with == does an
>> internal PyArray_EquivTypes check, which means that the dtypes NPY_INT
>> and NPY_LONG compare as equal in python.  However, the hash function
>> for dtypes reduces to id(), which is therefore inconsistent with ==.
>> Unfortunately I can't produce a python snippet showing this since I
>> don't know how to create a NPY_INT dtype in pure python.
>>
>> Based on the source it looks like hash should raise a type error,
>> since tp_hash is null but tp_richcompare is not.  Does the following
>> snippet throw an exception for others?
>>
>> >>> import numpy
>> >>> hash(numpy.dtype('int'))
>> 5708736
>>
>> This might be the problem:
>>
>> /* Macro to get the tp_richcompare field of a type if defined */
>> #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
>> ? (t)->tp_richcompare : NULL)
>>
>> I'm using the default Mac OS X 10.5 installation of python 2.5 and
>> numpy, so maybe those weren't compiled correctly.  Has anyone else
>> seen this issue?
>
> Actually, the problem is that we provide a hash function explicitly.
> In multiarraymodule.c:
>
>    PyArrayDescr_Type.tp_hash = (hashfunc)_Py_HashPointer;
>
> That is a violation of the hashing protocol (objects which compare
> equal and are hashable need to hash equal), and should be fixed.

Thanks for finding that.

Geoffrey
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] dtype comparison and hashing

2008-10-15 Thread Geoffrey Irving
Hello,

Currently in numpy comparing dtypes for equality with == does an
internal PyArray_EquivTypes check, which means that the dtypes NPY_INT
and NPY_LONG compare as equal in python.  However, the hash function
for dtypes reduces to id(), which is therefore inconsistent with ==.
Unfortunately I can't produce a python snippet showing this since I
don't know how to create a NPY_INT dtype in pure python.

Based on the source it looks like hash should raise a type error,
since tp_hash is null but tp_richcompare is not.  Does the following
snippet throw an exception for others?

>>> import numpy
>>> hash(numpy.dtype('int'))
5708736

This might be the problem:

/* Macro to get the tp_richcompare field of a type if defined */
#define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
 ? (t)->tp_richcompare : NULL)

I'm using the default Mac OS X 10.5 installation of python 2.5 and
numpy, so maybe those weren't compiled correctly.  Has anyone else
seen this issue?

Thanks,
Geoffrey


Re: [Numpy-discussion] dtype comparison and hashing

2008-10-15 Thread Robert Kern
On Wed, Oct 15, 2008 at 02:20, Geoffrey Irving [EMAIL PROTECTED] wrote:
> Hello,
>
> Currently in numpy comparing dtypes for equality with == does an
> internal PyArray_EquivTypes check, which means that the dtypes NPY_INT
> and NPY_LONG compare as equal in python.  However, the hash function
> for dtypes reduces to id(), which is therefore inconsistent with ==.
> Unfortunately I can't produce a python snippet showing this since I
> don't know how to create a NPY_INT dtype in pure python.
>
> Based on the source it looks like hash should raise a type error,
> since tp_hash is null but tp_richcompare is not.  Does the following
> snippet throw an exception for others?
>
> >>> import numpy
> >>> hash(numpy.dtype('int'))
> 5708736
>
> This might be the problem:
>
> /* Macro to get the tp_richcompare field of a type if defined */
> #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
> ? (t)->tp_richcompare : NULL)
>
> I'm using the default Mac OS X 10.5 installation of python 2.5 and
> numpy, so maybe those weren't compiled correctly.  Has anyone else
> seen this issue?

Actually, the problem is that we provide a hash function explicitly.
In multiarraymodule.c:

PyArrayDescr_Type.tp_hash = (hashfunc)_Py_HashPointer;

That is a violation of the hashing protocol (objects which compare
equal and are hashable need to hash equal), and should be fixed.
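
To illustrate why this matters, here is a small pure-Python sketch that
needs no numpy. The classes PointerHashed and ValueHashed are
hypothetical stand-ins, not numpy types: the first pairs value-based
equality with an identity-based hash (the same shape of problem as
hashing the pointer), and the second hashes the same value that ==
compares.

```python
class PointerHashed:
    """Hypothetical type: value-based ==, identity-based hash (the bug)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, PointerHashed) and self.name == other.name

    def __hash__(self):
        return id(self)


class ValueHashed(PointerHashed):
    """Same equality, but the hash is derived from the compared value."""

    def __hash__(self):
        return hash(self.name)


a, b = PointerHashed('f4'), PointerHashed('f4')
broken_lookup = b in {a: 'float32'}   # equal key, yet not found

c, d = ValueHashed('f4'), ValueHashed('f4')
fixed_lookup = d in {c: 'float32'}    # equal key is found

print(a == b, broken_lookup)  # True False
print(c == d, fixed_lookup)   # True True
```

With the identity-based hash, two equal objects behave as different
dictionary keys, which is exactly what the protocol is meant to rule out.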

-- 
Robert Kern

I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth.
  -- Umberto Eco