Marcel Stor wrote:
Stefan Groschupf wrote:
I'm not sure. There are different data mining algorithms that could be used, and it depends on the algorithm. I prefer support vector machines (SVM). There you calculate distances between multidimensional vectors in a multidimensional space.
Hi,
How is document clustering different/related to text categorization?
Clustering: the algorithm tries to find its own categories and puts the documents that match
into them. You group all documents with minimal distance together.
Would I be correct to say that you have to define a "distance threshold"
parameter in order to define when to build a new category for a certain
group?
One vector represents one document.
Stefan
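
To make the "distance threshold" idea from the question concrete, here is a minimal sketch in Python. It is not an SVM and not necessarily what anyone in this thread uses; it assumes a simple "leader" style clustering with cosine distance over word-count vectors. The names to_vector, cosine_distance and cluster, and the default threshold of 0.7, are made up for illustration.

```python
import math
from collections import Counter


def to_vector(text):
    """Turn a document into a sparse term-frequency vector (word -> count)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors (0 = same direction)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def cluster(docs, threshold=0.7):
    """Put each document into the cluster of its nearest leader, or open a
    new cluster when no leader is closer than `threshold` (hypothetical value)."""
    leaders = []   # vector of the first document in each cluster
    clusters = []  # document indices per cluster
    for i, text in enumerate(docs):
        vec = to_vector(text)  # one vector represents one document
        best, best_dist = None, threshold
        for k, leader in enumerate(leaders):
            d = cosine_distance(vec, leader)
            if d < best_dist:
                best, best_dist = k, d
        if best is None:
            leaders.append(vec)
            clusters.append([i])
        else:
            clusters[best].append(i)
    return clusters


docs = [
    "lucene search index query",
    "search index query ranking",
    "football match goal score",
    "goal score football league",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

In this sketch a smaller threshold yields more, tighter categories, and a larger one merges more documents into the same group. Whether a real algorithm needs such a parameter depends on the algorithm, as said above.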
