Not exactly. I was trying to build the logic for this calculation, but before
doing that I wanted to get suggestions from everyone.
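
For reference, here is a rough sketch of the logic I had in mind. It is only
an illustration under some assumptions: a textbook canopy pass in plain Python
(not Mahout's implementation), Euclidean distance, and "x times of 3.1 and
2.1" read as T1 = 3.1 * x and T2 = 2.1 * x. The function names are made up
for this example.

```python
import math


def canopy(points, t1, t2):
    """Textbook canopy pass; returns a list of (center, members) pairs."""
    if not (t1 > t2 > 0):
        raise ValueError("expected T1 > T2 > 0")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining[0]
        # Loose threshold T1 decides which points belong to this canopy.
        members = [p for p in remaining if math.dist(center, p) < t1]
        canopies.append((center, members))
        # Tight threshold T2 decides which points can no longer seed a canopy.
        remaining = [p for p in remaining if math.dist(center, p) >= t2]
    return canopies


def sweep_thresholds(points, steps, t1_unit=3.1, t2_unit=2.1):
    """Try T1 = x * t1_unit, T2 = x * t2_unit for x = 1..steps.

    Returns a dict mapping each observed canopy count K to the first
    (smallest) pair (T1, T2) that produced it.
    """
    first_seen = {}
    for x in range(1, steps + 1):
        t1, t2 = x * t1_unit, x * t2_unit
        k = len(canopy(points, t1, t2))
        first_seen.setdefault(k, (t1, t2))
    return first_seen


if __name__ == "__main__":
    sample = [(0, 0), (1, 0), (10, 0), (11, 0), (30, 0)]
    for k, (t1, t2) in sorted(sweep_thresholds(sample, 15).items()):
        print(f"K={k}: T1={t1:.1f}, T2={t2:.1f}")
```

Note that in this version only T2 changes the number of canopies; T1 only
changes which points each canopy contains, so the sweep is really a search
over T2 with T1 tied to it.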


Anyway, I will give Streaming KMeans a try.


On Thu, Mar 13, 2014 at 3:43 AM, Suneel Marthi <[email protected]> wrote:

> Is there any rationale for what you are proposing?
>
> It's better to go with Streaming KMeans than the combination of Canopy -
> KMeans clustering.
>
> Moreover, Canopy clustering (due to a single reducer in Canopy Generation
> phase) is more likely to fail with large datasets and that's a behavior
> that's been oft reported by several users in these forums.
>
> On Wednesday, March 12, 2014 4:17 PM, Bikash Gupta <
> [email protected]> wrote:
>
> Hi,
>
> Finding the right T1 and T2 for canopy clustering is a time-consuming task
> that needs manual intervention. I am planning to automate the calculation.
>
> The idea is to increment T1 and T2 by x times 3.1 and x times 2.1
> respectively, and to collect the approximate T1 and T2 for each cluster
> count K.
>
> I am not sure if this is a good idea. Please suggest!
>
> --
> Thanks & Regards
> Bikash Gupta
>



-- 
Thanks & Regards
Bikash Kumar Gupta
