Re: Document Clustering

Stefan Groschupf Tue, 11 Nov 2003 12:35:56 -0800

Marcel Stor wrote:

Stefan Groschupf wrote:

Hi,

How is document clustering different from, or related to, text categorization?

Clustering: the algorithm tries to find its own categories and puts the documents that match into them. You group all documents with minimal distance to each other together.

Would I be correct to say that you have to define a "distance threshold" parameter in order to define when to build a new category for a certain group?

I'm not sure. There are different data mining algorithms that could be used, and it depends on the algorithm. I prefer support vector machines (SVM). There you calculate distances between multidimensional vectors in a multidimensional space. One vector represents one document.
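
To make the distance idea concrete, here is a minimal sketch of the threshold approach asked about above. It is not an SVM and not taken from any particular library: it assumes plain bag-of-words vectors, cosine distance, and a simple "leader" style grouping where a document joins the nearest existing group if it is within the threshold and starts a new group otherwise. The function names (to_vector, cosine_distance, threshold_cluster), the sample documents, and the threshold value 0.5 are all made up for illustration.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent one document as a bag-of-words vector (term -> count)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def threshold_cluster(docs, threshold):
    """Group documents by distance to each group's first member (its leader).

    A document joins the closest leader within `threshold`;
    otherwise it becomes the leader of a new group.
    Returns a list of groups, each a list of document indices.
    """
    clusters = []  # list of (leader_vector, [document indices])
    for i, text in enumerate(docs):
        vec = to_vector(text)
        best, best_dist = None, threshold
        for cluster in clusters:
            d = cosine_distance(vec, cluster[0])
            if d <= best_dist:
                best, best_dist = cluster, d
        if best is None:
            clusters.append((vec, [i]))
        else:
            best[1].append(i)
    return [members for _, members in clusters]


docs = [
    "lucene index search query",
    "search query index lucene analyzer",
    "football match goal score",
    "goal score football league",
]
print(threshold_cluster(docs, threshold=0.5))
```

With a smaller threshold more groups appear, and with a larger one documents merge into fewer groups, so in this kind of scheme the threshold does decide when a new category is created. Other algorithms (k-means, for example) fix the number of groups instead of a distance threshold.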

Stefan
