Hi Vckay,

Are you running k-means directly on the raw text vectors? It is common to
first obtain a lower-dimensional, dense representation of the documents
using Singular Value Decomposition or a similar technique, and then to
cluster that representation instead of the sparse term vectors. You may
want to take a look at the SVD algorithms in Mahout.
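
To make the idea concrete, here is a rough sketch in plain Python. It is
not Mahout code: the toy term counts and every function name are made up
for illustration, the SVD is approximated with simple power iteration
(which assumes the number of components does not exceed the rank of the
matrix), and the k-means is a bare-bones Lloyd loop with a deterministic
farthest-first start.

```python
import math
import random


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def dist2(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def top_right_singular_vectors(A, k, iters=200, seed=0):
    """Approximate the top-k right singular vectors of A using power
    iteration on A^T A, deflating against vectors already found."""
    rng = random.Random(seed)
    At = [list(col) for col in zip(*A)]  # transpose of A
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(len(A[0]))]
        for _ in range(iters):
            Av = [dot(row, v) for row in A]
            v = [dot(row, Av) for row in At]  # v <- A^T (A v)
            for b in basis:  # Gram-Schmidt step against earlier vectors
                proj = dot(v, b)
                v = [x - proj * y for x, y in zip(v, b)]
            norm = math.sqrt(dot(v, v))
            v = [x / norm for x in v]
        basis.append(v)
    return basis


def kmeans(points, k, iters=20):
    """Plain Lloyd iterations with farthest-first initial centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def cluster_documents(A, n_components, n_clusters):
    """Project each document onto the top singular directions, then
    cluster the dense low-dimensional vectors."""
    basis = top_right_singular_vectors(A, n_components)
    reduced = [[dot(row, b) for b in basis] for row in A]
    return kmeans(reduced, n_clusters)


# Toy document-term counts (hypothetical data): rows are documents,
# columns are terms. Documents 0-2 and 3-5 use disjoint vocabularies.
docs = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 0, 0, 0, 0],
    [3, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 1, 2],
]

labels = cluster_documents(docs, n_components=2, n_clusters=2)
print(labels)
```

On real data you would use a proper sparse SVD implementation and usually
TF-IDF weights rather than raw counts; the point here is only the order of
the steps: reduce first, then cluster.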

Best,
Fernando.

2011/7/14 Vckay <[email protected]>

> I am clustering some real-world text data using k-means. I recently came
> across kernel k-means and wanted to know if someone who has had experience
> with kernels could comment on their appropriateness for text data, i.e.,
> would using a kernel boost k-means quality? (I know this is rather general,
> but it is hard to figure out whether my high-dimensional real-world data
> is linearly separable.) If so, are there any kernels with "practically
> accepted" parameters?
>
> Thanks
> VC
>
