On 27/10/13 20:22, [email protected] wrote: > On Sun, Oct 27, 2013 at 3:22 PM, Freddie Witherden > <[email protected]> wrote: >> On 27/10/13 18:54, Daniele Nicolodi wrote: >>> On 27/10/2013 19:42, Freddie Witherden wrote: >>>> On 27/10/13 18:35, Nathaniel Smith wrote: >>>>> On Sun, Oct 27, 2013 at 6:28 PM, Freddie Witherden >>>>> <[email protected]> wrote: >>>>>> Hi all, >>>>>> >>>>>> This is a question which has been bugging me for a while. I have an (N, >>>>>> 3) array where N ~ 16 of points. These points are all unique and >>>>>> separated by a reasonable distance. >>>>>> >>>>>> I wish to sort these points into a canonical order in a fashion which is >>>>>> robust against small perturbations. In other words changing any >>>>>> component of any of the points by an epsilon ~ 1e-12 should not affect >>>>>> the resulting sorted order. >>>>> >>>>> I don't understand how this is possible even in principle. >>>>> >>>>> Say your points are >>>>> >>>>> a = [0, 0, 0] >>>>> b = [0, 0, 1e-12] >>>>> >>>>> According to your criterion, either a or b should go first -- I don't >>>>> know which. Let's say our canonical ordering decides that a goes >>>>> first. But if you perturb both of them, then you have >>>>> >>>>> a = [0, 0, 1e-12] >>>>> b = [0, 0, 0] >>>>> >>>>> And now your requirement says that a still has to go first. But if a >>>>> goes first this time, then b had to go first the last time, by >>>>> symmetry. Thus your criterion is self-contradictory...? >>>> >>>> Not exactly; in your case the distance between a and b is of the order >>>> epislon. However, my points are "all unique and separated by a >>>> reasonable distance." This requires at least one of the components of >>>> any two points to differ in all instances, permitting an ordering to be >>>> defined. (Where if epislon ~ 1e-12 the minimum instance between any two >>>> points is of order ~ 1e-6.) >>> >>> Do you mean that all you points are distributed around some fixed points >>> in your space? In this case, it looks like what you are looking for is >>> categorization or clustering and not sorting. Once you perform >>> clustering, you can simply define an arbitrary order in which report the >>> content of each cluster. If this is not the case, the problem that >>> Nathaniel highlishts is still present. >> >> I am not entirely sure what you mean here. If x is my array of points >> of size (16, 3) then I am guarenteeing that >> >> np.min(scipy.spatial.distance.pdist(x)) >= 1e-6 >> >> In this instance I am unsure how the issue highlighted by Nathaniel >> might arise. Of course it is (very) possible that I am missing >> something, however, I believe under the terms of this constraint that it >> is always possible to define an order with which to iterate through the >> points which is invarient to shuffling of the points and small >> pertubations of the components. > > > If the epsilon or scale depends on the column, then, I think, divmod > should work to cut off the noise > >>>> my_array[np.lexsort(divmod(my_array, [1e-1, 1e-12, 1])[0].T)] > array([[-0.5 , 0. , 1.41421356], > [ 0.5 , 0. , 1.41421356]])
An interesting proposal. However, it appears to have the same issues as the rounding approach. Consider: In [5]: a, b = 1.0 + 1e-13, 1.0 - 1e-13 In [6]: abs(a - b) < 1e-12 Out[6]: True In [7]: divmod(a, 1e-6)[0] == divmod(b, 1e-6)[0] Out[7]: False Hence should np.lexsort encounter such a pair it will consider a and b to be different even though they are within epsilon of one another. Regards, Freddie.
signature.asc
Description: OpenPGP digital signature
_______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
