That's going to be easier if you can work off trunk, since the clustering
output has been cleaned up and is now written in a better format, per
https://issues.apache.org/jira/browse/MAHOUT-1505

E.g., note that each cluster record lists its member points under "points":

{
  "top_terms": [
    {"all":3.0149030685424805},
    {"english":3.0149030685424805},
    {"best":3.0149030685424805},
    {"spaniel":3.0149030685424805},
    {"springer":3.0149030685424805},
    {"dogs":1.9162907600402832}
  ],
  "cluster_id": 7,
  "cluster": {
    "r": [],
    "c": [
      {"all":3.015},
      {"best":3.015},
      {"dogs":1.916},
      {"english":3.015},
      {"spaniel":3.015},
      {"springer":3.015}
    ],
    "n": 1,
    "identifier": "C-7"
  },
  "points": [
    {
      "point": [
        {"all":3.015},
        {"best":3.015},
        {"dogs":1.916},
        {"english":3.015},
        {"spaniel":3.015},
        {"springer":3.015}
      ],
      "vector_name": "P(14)",
      "weight": "1.0"
    }
  ]
}
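
Once you have output in that shape, pulling the points out per cluster is
mostly a matter of reading the "points" array. A minimal Python sketch,
assuming each cluster is written as one JSON object per line (that layout is
my assumption, so adjust the reading loop if your file differs). It relies
only on the field names shown in the sample above; the function names are
hypothetical helpers, not part of Mahout:

```python
import json


def points_by_cluster(json_lines):
    """Map each cluster_id to the vector names of the points assigned to it.

    Assumes one JSON cluster record per input line (an assumption about
    the file layout, not something guaranteed by the sample above).
    """
    clusters = {}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        names = [p["vector_name"] for p in record.get("points", [])]
        clusters.setdefault(record["cluster_id"], []).extend(names)
    return clusters


def point_as_dict(point):
    """Flatten [{"all": 3.015}, {"dogs": 1.916}] into one term -> weight dict."""
    merged = {}
    for entry in point:
        merged.update(entry)
    return merged


# Trimmed-down record using the same field names as the sample above.
sample = (
    '{"cluster_id": 7, "points": [{"point": [{"all": 3.015}, '
    '{"dogs": 1.916}], "vector_name": "P(14)", "weight": "1.0"}]}'
)
print(points_by_cluster([sample]))  # {7: ['P(14)']}
```

To run it over a real dump, pass an open file handle instead of the list,
e.g. points_by_cluster(open(path)), where path is wherever you saved the
output.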


On Fri, Jun 13, 2014 at 2:42 AM, Kamesh <[email protected]> wrote:

> Hi All,
> Please help me in getting the data points inside each cluster.
> The output of the clustering algorithm is the center and radius of each
> cluster. How do we derive the actual data points inside each cluster
> from this output?
>
> --
> Kamesh.
>
