For the same reason you would use a kernel instead of a linear SVM: you can get better separation in a different feature space. But text is already so high-dimensional that the gain may be small.
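
To make the options in this thread concrete, here is a rough sketch (not anyone's actual setup) of the histogram intersection kernel, cosine similarity written as a kernel, and a bare-bones kernel k-means that only needs the kernel matrix. It assumes NumPy and a small dense matrix X of non-negative term counts, one row per document; the function names are made up for illustration, and the broadcasting approach is O(n^2 d) in memory, so it only suits toy data.

```python
import numpy as np

def histogram_intersection_kernel(X):
    """K[i, j] = sum_k min(X[i, k], X[j, k]); rows of X must be non-negative."""
    return np.minimum(X[:, None, :], X[None, :, :]).sum(axis=2)

def cosine_kernel(X):
    """Cosine similarity is a linear kernel on L2-normalised rows."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)  # leave all-zero rows as zeros
    return Xn @ Xn.T

def kernel_kmeans(K, k, n_iter=20, seed=0):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)  # random initial assignment
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)   # empty clusters stay at infinity
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x_i) - mu_c||^2
            #   = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_{j,l} K_jl,  j, l in cluster c
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Running kernel_kmeans(cosine_kernel(X), k) and kernel_kmeans(histogram_intersection_kernel(X), k) on the same documents is one cheap way to see whether the kernel choice changes the clusters at all. Results depend on the random initial assignment, so try a few seeds.
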

On Thu, Jul 14, 2011 at 11:14 AM, Eshwaran Vijaya Kumar <[email protected]> wrote:

> Assuming the OP was doing cosine similarity (as is commonly done with text)
> while clustering, wouldn't that implicitly imply the use of a Kernel ? Would
> using a separate kernel help?
>
> On Jul 14, 2011, at 6:56 AM, Hector Yee wrote:
>
> > The histogram intersection kernel would work well and it has no parameters
> >
> > Sent from my iPad
> >
> > On Jul 14, 2011, at 2:38 AM, Vckay <[email protected]> wrote:
> >
> >> I am clustering some real world text data using K-Means. I recently came
> >> across Kernel K-Means and wanted to know if someone who has had experience
> >> with Kernels could comment on their appropriateness for text data, i.e,
> >> Would using a Kernel boost k-means quality? ( I know this is rather general
> >> but it is sort of hard to figure out if my high dimensional real world data
> >> is linearly separable.) If so, are there any Kernel's with "practically
> >> accepted" parameters?
> >>
> >> Thanks
> >> VC

--
Yee Yang Li Hector
http://hectorgon.blogspot.com/ (tech + travel)
http://hectorgon.com (book reviews)
