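This will be easier if you can work off trunk: the clustering output has
been cleaned up there and is written in a more usable format, per
https://issues.apache.org/jira/browse/MAHOUT-1505
Each cluster record lists its member points under "points", alongside the
cluster's id, center, and top terms. E.g.,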
{
  "top_terms": [
    {"all":3.0149030685424805},
    {"english":3.0149030685424805},
    {"best":3.0149030685424805},
    {"spaniel":3.0149030685424805},
    {"springer":3.0149030685424805},
    {"dogs":1.9162907600402832}
  ],
  "cluster_id": 7,
  "cluster": {
    "r": [],
    "c": [
      {"all":3.015},
      {"best":3.015},
      {"dogs":1.916},
      {"english":3.015},
      {"spaniel":3.015},
      {"springer":3.015}
    ],
    "n": 1,
    "identifier": "C-7"
  },
  "points": [
    {
      "point": [
        {"all":3.015},
        {"best":3.015},
        {"dogs":1.916},
        {"english":3.015},
        {"spaniel":3.015},
        {"springer":3.015}
      ],
      "vector_name": "P(14)",
      "weight": "1.0"
    }
  ]
}
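
Once you have output in that shape, collecting the points per cluster is
mostly a matter of reading the "points" array of each record. Here is a rough
sketch in Python, not anything that ships with Mahout. It assumes each cluster
is written as one JSON object per line, which you should check against your
actual dump; the function name points_by_cluster is made up for illustration.

```python
import json


def points_by_cluster(lines):
    """Map each cluster_id to the vector names of the points assigned to it.

    Assumes one JSON cluster record per line (an assumption about the dump
    layout, not something confirmed here).
    """
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        names = [p.get("vector_name") for p in record.get("points", [])]
        result.setdefault(record["cluster_id"], []).extend(names)
    return result


# A trimmed-down record modelled on the example above.
sample = (
    '{"cluster_id": 7, "points": [{"point": [{"all": 3.015}], '
    '"vector_name": "P(14)", "weight": "1.0"}]}'
)
print(points_by_cluster([sample]))  # {7: ['P(14)']}
```

To run it over a real dump you would pass an open file handle instead of the
one-element list. If your vectors are not named, "vector_name" may be missing,
in which case you would need to keep the "point" entries themselves.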
On Fri, Jun 13, 2014 at 2:42 AM, Kamesh <[email protected]> wrote:
> Hi All,
> Please help me in getting the data points inside each cluster.
> The output of the clustering algorithm is center of the cluster and radius
> of the cluster. How do we derive actual data points inside each cluster
> from this output.
>
> --
> Kamesh.
>