Hi guys, I'm working on a text mining project and would like to apply text clustering to identify trends in the documents. The set of documents that goes into the clustering process is dynamic: it depends on the query results from a large repository.
I tried some Mahout algorithms, such as LDA and KMeans with Canopy, but I haven't found a good way to choose the number of clusters or topics. For Canopy and KMeans I need to set two parameters, t1 and t2. For LDA I also need to set the number of topics. Is there a good way to choose these parameters, or is there a good text clustering algorithm that handles dynamic data? Thanks in advance, Donni
