Re: canopy creating canopies with the same points

Scott C. Cote Mon, 24 Mar 2014 07:29:45 -0700

Reinis,

The documentation has several Jiras open, one of them with my name on it.

Fortunately, canopy clustering has a good page (as well as
some outdated pages).

Please see this link for your question:

        http://mahout.apache.org/users/clustering/canopy-clustering.html

as I believe that it is well written.

To directly answer your question:

Remember that T1 > T2. Points within T2 of a cluster's center are added
to the cluster and removed from the "input set", while points within T1
(but outside T2) are added to the cluster but NOT removed from the
"input set" (and therefore may be added to another cluster later in the
process).
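
As a rough illustration of that rule, here is a minimal Python sketch of
the textbook canopy pass. It is NOT Mahout's actual code: the function
name canopy, the choice of the first remaining point as the next center,
and the Euclidean distance are all assumptions made for the example.

```python
import math

def canopy(points, t1, t2):
    """Textbook canopy pass (illustrative only); requires t1 > t2."""
    if not t1 > t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)          # the "input set"
    canopies = []
    while remaining:
        # Assumption: take the first remaining point as the next center.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = math.dist(center, p)  # Euclidean distance (assumed metric)
            if d < t1:
                members.append(p)     # within T1: joins this canopy
            if d >= t2:
                still_remaining.append(p)  # outside T2: stays in the input set
        remaining = still_remaining
        canopies.append((center, members))
    return canopies

# Hypothetical 1-D data: (2.5,) is within T1 but outside T2 of (0.0,).
result = canopy([(0.0,), (1.0,), (2.5,), (6.0,)], t1=3.0, t2=1.5)
for center, members in result:
    print(center, members)
```

With these made-up values the point (2.5,) lands in the first canopy and
then also becomes the center of a second one, because it was never
removed from the input set. The point (1.0,) is within T2 of the first
center, so it is removed and appears in one canopy only.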

Scott

On 3/24/14, 6:44 AM, "Reinis Vicups" <[email protected]> wrote:

>Hi,
>
>apparently I am misunderstanding the way canopy works. I thought that
>once a data point is added to a canopy, it is removed from the list of
>to-be-clustered points, so that each point is assigned to one canopy.
>
>In the example below this is not the case:
>
>:C-28{n=1 c=[70:11.686, 72:7.170, 236:8.182, 396:238.981, 468:40.572,
>556:10.985, 889:8.678, 1101:114
>:C-29{n=1 c=[70:11.686, 72:7.170, 236:8.182, 396:217.804, 468:33.560,
>556:10.985, 889:8.678, 1101:113
>:C-30{n=1 c=[70:11.686, 72:7.170, 236:8.182, 396:215.841, 468:37.231,
>556:10.985, 889:8.678, 1101:113
>:C-31{n=1 c=[70:11.686, 72:7.170, 236:8.182, 396:206.121, 468:32.243,
>556:10.985, 889:8.678, 1101:112
>
>So is the correct assumption that only the points within T2 get assigned
>to exactly one canopy, or can even points within T2 get assigned to more
>than one canopy?
>
>greets
>reinis
