Re: [Numpy-discussion] NaN as dictionary key?

2009-04-23 Thread josef . pktd
2009/4/20 Wes McKinney wesmck...@gmail.com:
 I assume that, because NaN != NaN, even though both have the same hash value
 (hash(NaN) == -32768), that Python treats any NaN double as a distinct key
 in a dictionary.

 In [76]: a = np.repeat(nan, 10)

 In [77]: d = {}

 In [78]: for i, v in enumerate(a):
    ....:     d[v] = i
    ....:
    ....:

 In [79]: d
 Out[79]:
 {nan: 0,
  nan: 1,
  nan: 6,
  nan: 4,
  nan: 3,
  nan: 9,
  nan: 7,
  nan: 2,
  nan: 8,
  nan: 5}

 I'm not sure if this ever worked in a past version of NumPy, however, I have
 code which does a group by value and occasionally in the real world those
 values are NaN. Any ideas or a way around this problem?

For non-hashable keys, I convert them to strings, e.g. with repr or str
or some other string representation for floating point.

I use this, for example, to feed them to unique1d.

Josef


>>> a
array([ NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN])
>>> np.unique1d(a)
array([ NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN,  NaN])

Casting to type string does not work well with NaN (the NaNs are converted
automatically in the cast):
>>> np.unique1d(a.astype(str))
array(['1'],
  dtype='|S1')
>>> a.astype(str)
array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1'],
  dtype='|S1')

>>> np.unique1d([repr(ii) for ii in a])
array(['nan'],
  dtype='|S3')


But NaNs don't round-trip. Is this intended? (At least they don't on Windows.)

>>> np.unique1d(np.arange(10).astype(str)).astype(float)
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
>>> np.all(np.array([repr(ii) for ii in np.pi*np.arange(10)]).astype(float) ==
np.pi*np.arange(10))
True

>>> np.unique1d([repr(ii) for ii in a]).astype(float)
Traceback (most recent call last):
  File "<pyshell#120>", line 1, in <module>
ValueError: invalid literal for float(): nan
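
For readers on current Python 3, here is a minimal sketch of the repr-as-key idea above, using only the standard library rather than unique1d. The names values, unique_keys and restored are illustrative and not from the thread. It relies on float() accepting the string "nan" in Python 3, which was not the case in the session shown above.

```python
import math

# Sample data with two separate NaN objects (illustrative values).
values = [0.5, float("nan"), 2.0, float("nan")]

# repr() gives every NaN the same text, so all NaNs collapse to one key.
unique_keys = sorted({repr(v) for v in values})

# float() parses "nan" in Python 3, so the string keys convert back.
restored = [float(key) for key in unique_keys]

print(unique_keys)
print(restored)
```

The restored NaN still compares unequal to itself, so check it with math.isnan rather than ==.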
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NaN as dictionary key?

2009-04-23 Thread Bruce Southey
Hi,
Perhaps you want something that uses isfinite and friends, such as:

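import numpy as np
a = np.array([1, 2, 3, np.inf, np.nan, 10])
e = {}
for i, v in enumerate(a):
    if np.isfinite(v):
        # finite values are safe to use directly as keys
        e[v] = i
    else:
        # inf and nan are keyed by their string form instead
        e[repr(v)] = i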

You should probably call isfinite once on the whole array, outside the loop.

If you really do not care about NaN and infinity, then you could use a 
masked array where NaN and infinity are masked.

Bruce
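
The same finite/non-finite split can be sketched without NumPy. This is only an illustration under stated assumptions: math.isfinite stands in for np.isfinite so the snippet runs on the standard library alone, and the helper name dict_key and the sample values are hypothetical, not part of the suggestion above.

```python
import math

def dict_key(value):
    """Hypothetical helper: finite floats key as themselves, others by repr."""
    return value if math.isfinite(value) else repr(value)

# Sample data with two NaNs and one infinity (illustrative values).
values = [1.0, 2.0, 3.0, math.inf, math.nan, 10.0, math.nan]

last_index = {}
for index, value in enumerate(values):
    # Every NaN maps to the single string key "nan", so later NaNs
    # overwrite earlier ones instead of creating new entries.
    last_index[dict_key(value)] = index

print(last_index)
```

A group-by would append to a list per key instead of overwriting, but the key normalization is the same.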



Re: [Numpy-discussion] NaN as dictionary key?

2009-04-20 Thread David Cournapeau
On Mon, Apr 20, 2009 at 11:42 PM, Wes McKinney wesmck...@gmail.com wrote:
 I assume that, because NaN != NaN, even though both have the same hash value
 (hash(NaN) == -32768), that Python treats any NaN double as a distinct key
 in a dictionary.

I think that, strictly speaking, nan should not be hashable because
nan != nan. But since that's not an error in Python, I am not sure we
should do anything about it.

David
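
A small pure-Python sketch of the behaviour discussed in this thread, assuming CPython's dict lookup, which tests object identity before equality. The variable names are illustrative. It suggests why iterating over an array of NaNs, which presumably yields a separate scalar object per element, produces one dictionary entry per element.

```python
nan_a = float("nan")
nan_b = float("nan")  # a second, distinct NaN object

d = {nan_a: "first"}
d[nan_b] = "second"   # different object and nan != nan, so a new entry
d[nan_a] = "third"    # same object, found by identity, so overwritten

same_object_found = nan_a in d
fresh_nan_found = float("nan") in d  # a brand-new NaN matches no key

print(len(d), same_object_found, fresh_nan_found)
```

So a NaN can act as a key, but only the exact object that was inserted can look its entry up again.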