Always good to test, but I think that Euclidean distance with L_1 regularization is probably more interesting. A rough sketch of what I mean is below.
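To make that concrete, here is a minimal sketch (my own illustration, not Mahout code and not tested at any scale) of Lloyd-style k-means where assignment still uses squared Euclidean distance but each centroid update minimizes squared error plus lam * sum of the absolute values of the centroid's coefficients. Under that objective the per-coordinate centroid update is just soft-thresholding of the cluster mean, which is the mechanism that tends to push coefficients to exact zero on sparse, text-like data. All names and parameters here are hypothetical.

```python
import numpy as np

def soft_threshold(m, t):
    # Proximal operator of t * |.|: shrink toward zero, exact zeros below t.
    return np.sign(m) * np.maximum(np.abs(m) - t, 0.0)

def l1_regularized_kmeans(X, k, lam, n_iter=25, seed=0):
    """Lloyd iterations for: sum_i ||x_i - c_{a(i)}||^2 + lam * sum_j ||c_j||_1.

    Assignment uses plain squared Euclidean distance; the centroid update is
    the closed-form minimizer of the regularized per-cluster objective, i.e.
    soft-thresholding of the cluster mean by lam / (2 * cluster_size).
    """
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: L_1-regularized centroid for each non-empty cluster.
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the old centroid for an empty cluster
            mean = members.mean(axis=0)
            centroids[j] = soft_threshold(mean, lam / (2.0 * len(members)))
    return centroids, labels
```

On tf-idf-like vectors the expected effect is that small centroid coefficients get zeroed out, giving sparse centroids. Whether that actually helps is exactly the open question below: the problem is still non-convex, so the regularized updates can land in a poor local optimum.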
On Mon, Mar 4, 2013 at 12:00 PM, Chris Harrington <[email protected]> wrote:

> So if I'm understanding what you are saying is, simply put, that I should
> investigate the use of L_1 as my distance measure when measuring vector
> distance within a cluster?
>
>
> On 1 Mar 2013, at 16:24, Ted Dunning wrote:
>
> > What Sean says is just right, except that I was (telegraphically)
> > getting at a slightly different point with L_1:
> >
> > On Wed, Feb 27, 2013 at 7:23 AM, Chris Harrington <[email protected]> wrote:
> >
> >> Is L_1 regularization the same as manhattan distance?
> >>
> >
> > L_1 metric is manhattan distance, yes.
> >
> > L_1 regularization of k-means refers to something a little bit different.
> >
> > The idea with regularization is that you add some sort of penalty to the
> > function you are optimizing. This penalty pushes the optimization toward
> > a solution that you would prefer on some other grounds than just the
> > optimization alone. Regularization often helps in solving underdetermined
> > systems where there are an infinite number of solutions and we have to
> > pick a preferred solution.
> >
> > There isn't anything that says that you have to be optimizing the same
> > kind of function as the regularization. Thus k-means, which is inherently
> > optimizing squared error, can quite reasonably be regularized with L_1
> > (the sum of the absolute values of the centroids' coefficients).
> >
> > I haven't tried this at all seriously yet. L_1 regularization tends to
> > help drive toward sparsity, but it is normally used in convex problems
> > where we can guarantee a findable global optimum. The k-means problem,
> > however, is not convex, so adding the regularization may screw things up
> > in practice. For text-like data, I have a strong intuition that the
> > idealized effect of L_1 should be very good, but the pragmatic effect may
> > not be so useful.
