Not exactly. I was trying to work out the logic for this calculation, but before doing that I wanted to ask everyone for suggestions.
Anyway, I will give Streaming KMeans a try.

On Thu, Mar 13, 2014 at 3:43 AM, Suneel Marthi <[email protected]> wrote:

> Is there any rationale for what you are proposing?
>
> It's better to go with Streaming KMeans than the combination of Canopy -
> KMeans clustering.
>
> Moreover, Canopy clustering (due to a single reducer in the Canopy Generation
> phase) is more likely to fail with large datasets, and that's a behavior
> that's been oft reported by several users in these forums.
>
> On Wednesday, March 12, 2014 4:17 PM, Bikash Gupta <
> [email protected]> wrote:
>
> Hi,
>
> Finding the right T1 and T2 for canopy is a time-consuming task that needs
> manual intervention. I am planning to automate the calculation.
>
> The idea is to increment T1 and T2 in steps of x times 3.1 and x times 2.1,
> and to collect the approximate T1 and T2 for each K clusters.
>
> Not sure if this is a good idea. Please suggest!
>
> --
> Thanks & Regards
> Bikash Gupta

--
Thanks & Regards
Bikash Kumar Gupta
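
For anyone who wants to experiment with the threshold sweep described in the original message, here is a minimal sketch in plain Python. It is not Mahout code and not a tested recipe: the function names (canopy_centers, sweep_thresholds), the use of Euclidean distance, and the choice to keep the first (T1, T2) pair seen for each K are all assumptions made for illustration. The multipliers 3.1 and 2.1 come from the message above.

```python
import math


def canopy_centers(points, t1, t2):
    """Greedy canopy pass over `points` with Euclidean distance.

    Returns a list of (center, members) pairs. Members are all points
    within t1 of the center; points within t2 stop being candidates
    for new centers. Requires t1 > t2.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining[0]
        # Loose membership: every input point within t1 of the center.
        members = [p for p in points if math.dist(center, p) <= t1]
        canopies.append((center, members))
        # Tight binding: drop candidates within t2 (this includes the center).
        remaining = [p for p in remaining if math.dist(center, p) > t2]
    return canopies


def sweep_thresholds(points, steps, t1_unit=3.1, t2_unit=2.1):
    """Try T1 = x * t1_unit and T2 = x * t2_unit for x = 1..steps.

    Returns a dict mapping each observed canopy count K to the first
    (T1, T2) pair that produced it. Keeping the first pair is an
    arbitrary choice for this sketch.
    """
    k_to_thresholds = {}
    for x in range(1, steps + 1):
        t1, t2 = x * t1_unit, x * t2_unit
        k = len(canopy_centers(points, t1, t2))
        k_to_thresholds.setdefault(k, (t1, t2))
    return k_to_thresholds


if __name__ == "__main__":
    sample = [(0, 0), (1, 0), (10, 0), (11, 0), (30, 0)]
    for k, (t1, t2) in sorted(sweep_thresholds(sample, steps=15).items()):
        print(f"K={k}: T1={t1:.1f}, T2={t2:.1f}")
```

Note that in this sketch the number of canopies depends only on T2, since T1 only controls how loosely points are attached to an existing canopy. The sweep also runs the canopy pass once per step, so on a large dataset it would be expensive, which is in line with the concern raised in the reply.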
