No, that can't be right. A similarity measure that always returned 1 would be of no use. I assume he's thinking of the fact that some similarity metrics use features that are not real-valued but binary (present or absent). Even then, the result is not always 1, nor restricted to 0 or 1; it is a value in [-1,1].
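To make that concrete, here is a minimal sketch in plain Python (not Mahout code). The page names, user IDs, and both helper functions are made up for illustration. It also shows one possible way to end up with a constant 1, which is only a guess at what your programmer is seeing: if every visit is stored as a preference value of 1.0 and a cosine-style measure is computed over just the users two pages have in common, every overlapping pair scores 1. A set-based measure such as the Tanimoto (Jaccard) coefficient does not have that problem.

```python
import math

# Hypothetical binary data: page -> set of users who visited it.
visits = {
    "page_a": {"u1", "u2", "u3"},
    "page_b": {"u2", "u3", "u4"},
    "page_c": {"u3"},
}


def tanimoto(users_a, users_b):
    """Tanimoto (Jaccard) coefficient: shared visitors / all visitors."""
    union = users_a | users_b
    if not union:
        return 0.0
    return len(users_a & users_b) / len(union)


def cosine_on_overlap(prefs_a, prefs_b):
    """Cosine similarity computed only over users present in both items.

    Assumption for illustration: this mimics a rating-based measure being
    fed binary data. It is not taken from the Mahout source.
    """
    common = prefs_a.keys() & prefs_b.keys()
    if not common:
        return None  # no shared users, similarity undefined
    dot = sum(prefs_a[u] * prefs_b[u] for u in common)
    norm_a = sum(prefs_a[u] ** 2 for u in common)
    norm_b = sum(prefs_b[u] ** 2 for u in common)
    return dot / math.sqrt(norm_a * norm_b)


# Every visit becomes a preference value of 1.0.
prefs = {page: {u: 1.0 for u in users} for page, users in visits.items()}

pages = sorted(visits)
for i, p in enumerate(pages):
    for q in pages[i + 1:]:
        print(p, q,
              "tanimoto=%.3f" % tanimoto(visits[p], visits[q]),
              "cosine_on_overlap=%s" % cosine_on_overlap(prefs[p], prefs[q]))
```

With this toy data the Tanimoto scores differ from pair to pair, while the overlap-only cosine is 1 for every pair that shares at least one visitor. If that matches the symptom, the fix is to use a similarity intended for binary data rather than a rating-based one.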

On Fri, Dec 21, 2012 at 11:45 PM, Kai R. Larsen <[email protected]> wrote:
> Hi,
>
> My sincere apologies if this is a naïve question (I'm sure it is).
>
> I've engaged a programmer to take a weblog and focus on 250 pages containing
> items that may be similar (or not). The goal is to create item-item
> relationship tables where every cell contains a score for how similar two
> items are. He now tells me that only two of the (many) Mahout algorithms can
> be used to generate such tables, and those that do generate a distance of 1
> or some other constant value between all pairs.
>
> This can't be true, can it? There must be a way to tease out such
> information from the algorithms. Any advice? Any ideas why all
> relationships would be one? While it is common for the website users to have
> visited only one page at a time, it is not pervasive.
>
> Best,
>
> Kai Larsen
