I think it should return an "undefined" symbol. There is no angle between two zero vectors.
In a practical sense, taking two zero vectors to be equivalent in the context of user-item vectors, say, is dodgy in my opinion. That is akin to saying "If we both hate everything on this restaurant's menu we are the same person."

On Thu, Apr 4, 2013 at 11:56 AM, Dan Filimon <dangeorge.fili...@gmail.com> wrote:

> Suneel is right. :)
>
> Let me explain how this came up:
> - When clustering, and assigning a point to a cluster, the centroid needs
>   to be updated.
> - To update the centroid in the nearest neighbor searcher classes, the
>   centroid must first be removed.
> - To remove the centroid, we get the closest vector (search for it, and it
>   should be itself) and then remove it from the data structures.
> => However, when the centroid is 0, the nearest vector (which should be
>    itself) has a huge distance (1 rather than 0) and this trips a check.
>
> On Thu, Apr 4, 2013 at 9:46 PM, Sean Owen <sro...@gmail.com> wrote:
>
> > It sounds pretty undefined, but I would tend to define the distance as
> > 0 in this case of course. And that means defining the cosine as 1.
> > Which class in particular? There are a few implementations of this
> > distance measure.
> >
> > On Thu, Apr 4, 2013 at 7:42 PM, Dan Filimon <dangeorge.fili...@gmail.com> wrote:
> >
> > > In the case where both vectors are all zeros, the angle between them is 0,
> > > so the cosine is therefore 1 and so the distance returned should be 0
> > > (unless I misunderstood what the distance does).
> > >
> > > In Mahout, when calling distance() however, if both the denominator and
> > > dotProduct are 0 (which is true when both vectors are 0), the returned
> > > value is 1.
> > >
> > > This looks like a bug to me and I would open a JIRA issue and fix it but I
> > > want to make sure there's nothing I could possibly be missing.
> > >
> > > Thoughts?
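
To make the two positions in this thread concrete, here is a minimal standalone sketch in plain Java. It is not Mahout's code: the class name CosineDistanceSketch and both method names are hypothetical, and it works on double[] rather than Mahout's Vector type. The first method reports the zero-vector case as undefined (an empty OptionalDouble); the second applies the convention Sean suggests, distance 0 when both vectors are zero. Returning 1 when exactly one vector is zero is my own assumption, not something settled above.

```java
import java.util.OptionalDouble;

/** Hypothetical sketch; not Mahout's implementation. */
public final class CosineDistanceSketch {
    private CosineDistanceSketch() {}

    /** Returns 1 - cos(a, b), or empty when either vector has zero norm. */
    public static OptionalDouble distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double dot = 0.0;
        double normASquared = 0.0;
        double normBSquared = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normASquared += a[i] * a[i];
            normBSquared += b[i] * b[i];
        }
        double denominator = Math.sqrt(normASquared) * Math.sqrt(normBSquared);
        if (denominator == 0.0) {
            // No angle is defined when a zero vector is involved.
            return OptionalDouble.empty();
        }
        // Clamp to [-1, 1] to absorb floating-point rounding.
        double cosine = Math.max(-1.0, Math.min(1.0, dot / denominator));
        return OptionalDouble.of(1.0 - cosine);
    }

    /**
     * Convention from the thread: two zero vectors are at distance 0.
     * Assumption: exactly one zero vector yields distance 1.
     */
    public static double distanceZeroAsIdentical(double[] a, double[] b) {
        OptionalDouble d = distance(a, b);
        if (d.isPresent()) {
            return d.getAsDouble();
        }
        return (isZero(a) && isZero(b)) ? 0.0 : 1.0;
    }

    private static boolean isZero(double[] v) {
        for (double x : v) {
            if (x != 0.0) {
                return false;
            }
        }
        return true;
    }
}
```

With the second method, a zero centroid is at distance 0 from itself, so a "nearest vector is itself" check like the one Dan describes would pass. With the first, the caller is forced to decide what an undefined distance means in its own context.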