Hi all,

This is a question which has been bugging me for a while.  I have an (N,
3) array where N ~ 16 of points.  These points are all unique and
separated by a reasonable distance.

I wish to sort these points into a canonical order in a fashion which is
robust against small perturbations.  In other words changing any
component of any of the points by an epsilon ~ 1e-12 should not affect
the resulting sorted order.

Considering a direct application of np.lexsort:

  In [6]: my_array = np.array([[-0.5, 0, 2**0.5],
                               [0.5, 0, 2**0.5 - 1e-15]])

  In [7]: my_array[np.lexsort(my_array.T)]
  Out[7]:  array([[ 0.5       ,  0.        ,  1.41421356],
                  [-0.5       ,  0.        ,  1.41421356]])

however, if the small 1e-15 perturbation is removed the order changes to
the 'natural' ordering.  Hence, np.lexsort is out.

Rounding the array before sorting is not suitable either; just because
(a - b) < epsilon does not mean that np.around(a, decimals=x) ==
np.around(b, decimals=b).

I am therefore looking for an efficient (= within a factor of 10 of
np.lexsort) solution to the problem.  I've looked at writing my own
comparison function cmp(x, y) which looks at the next dimension if
abs(x[i] - y[i]) < epsilon however using this with sorted is thousands
of times slower.  Given that I have well over 100,000 of these arrays
this is nuisance.

My other idea is to therefore find a means of quickly replacing all
numbers within 10*epsilon of a given number in an array with that
number.  This should permit the application of np.lexsort in order to
obtain the desired ordering (which is what I'm interesting in).
However, I am yet to figure out how to do this efficiently.

Before I throw in the towel and drop down to C are there any other neat
tricks I am missing?

Regards, Freddie.

Attachment: signature.asc
Description: OpenPGP digital signature

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to