Assuming the OP was using cosine similarity (as is common with text) while clustering, wouldn't that already imply the use of a kernel? Would using a separate kernel help?
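To make the question concrete, here is a rough sketch of what I mean. Cosine similarity is just the linear kernel (plain dot product) applied to L2-normalized vectors, so k-means on normalized term vectors is already kernel k-means with that kernel. The histogram intersection kernel mentioned below is included for comparison. The function names and the two toy term-count vectors are made up for illustration, and the code assumes nonzero, non-negative, equal-length vectors.

```python
import math


def l2_normalize(x):
    # Scale a vector to unit Euclidean length (assumes x is not all zeros).
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x]


def linear_kernel(x, y):
    # Plain dot product: the kernel that ordinary k-means uses implicitly.
    return sum(a * b for a, b in zip(x, y))


def cosine_similarity(x, y):
    # Dot product divided by the product of the two Euclidean norms.
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return linear_kernel(x, y) / (norm_x * norm_y)


def histogram_intersection_kernel(x, y):
    # Sum of element-wise minima; no parameters to tune.
    # Intended for non-negative features such as term counts.
    return sum(min(a, b) for a, b in zip(x, y))


# Hypothetical term-count vectors for two documents.
doc_a = [2.0, 0.0, 1.0, 3.0]
doc_b = [1.0, 1.0, 0.0, 2.0]

cos_ab = cosine_similarity(doc_a, doc_b)
lin_ab = linear_kernel(l2_normalize(doc_a), l2_normalize(doc_b))
hik_ab = histogram_intersection_kernel(doc_a, doc_b)

# cos_ab and lin_ab agree up to floating-point rounding.
print(cos_ab, lin_ab, hik_ab)
```

For unit-length vectors the squared Euclidean distance is 2 - 2 * cosine, so ranking by distance and ranking by cosine similarity give the same nearest centroid. A separate kernel would only change things if it induces a different geometry than this one.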

On Jul 14, 2011, at 6:56 AM, Hector Yee wrote:

> The histogram intersection kernel would work well and it has no parameters
>
> Sent from my iPad
>
> On Jul 14, 2011, at 2:38 AM, Vckay <[email protected]> wrote:
>
>> I am clustering some real world text data using K-Means. I recently came
>> across Kernel K-Means and wanted to know if someone who has had experience
>> with Kernels could comment on their appropriateness for text data, i.e,
>> Would using a Kernel boost k-means quality? ( I know this is rather general
>> but it is sort of hard to figure out if my high dimensional real world data
>> is linearly separable.) If so, are there any Kernel's with "practically
>> accepted" parameters?
>>
>> Thanks
>> VC
