> what are the data samples in this cluster

Mehmet's response below works for exploring the hierarchical tree. However, Birch currently doesn't store the data samples that belong to a given subcluster. If you need them, as far as I know, a reasonable approximation is to take the data samples that are closer to the centroid of the subcluster in question (accessible via _CFNode.centroids_) than to any other subcluster centroid at the same depth of the hierarchical tree.
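A minimal sketch of that approximation, restricted to the subclusters directly under the root. The toy data from make_blobs, the Birch parameters, and the names `nearest` and `members` are illustrative assumptions, and `root_` / `centroids_` belong to a private class, so they may change between scikit-learn versions:

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances_argmin

# Toy data, assumed for illustration only.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# n_clusters=None skips the final global clustering step.
brc = Birch(threshold=0.5, branching_factor=50, n_clusters=None).fit(X)

# One centroid row per subcluster stored directly in the root node.
root = brc.root_
centroids = root.centroids_

# For each sample, index of the closest root-level subcluster centroid.
nearest = pairwise_distances_argmin(X, centroids)

# Approximate membership: sample indices grouped by nearest centroid.
members = {i: np.flatnonzero(nearest == i) for i in range(len(centroids))}

for i, idx in members.items():
    print(f"subcluster {i}: {len(idx)} samples (approximate)")
```

This only approximates the assignment Birch made while building the tree, since centroids move as samples are inserted. To do the same at a deeper level, the centroids of all nodes at that depth would have to be stacked first.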

Alternatively, the modifications in PR https://github.com/scikit-learn/scikit-learn/pull/8808 aimed to make this process easier.
--
Roman

On 23/08/17 13:44, Suzen, Mehmet wrote:
Hi Sema,

You can access the CFNode from the fit output: assign the result of
fit to a variable so that you have the fitted object.

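from sklearn.cluster import Birch
brc = Birch()  # assumed: any Birch instance; X is your data array
brc_fit = brc.fit(X)  # fit returns the fitted estimator itself
brc_fit_cfnode = brc_fit.root_  # root node of the CF tree
print(brc_fit_cfnode)
# prints something like: <sklearn.cluster.birch._CFNode object at 0x7ff31acbf668>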

Then you can work with the CFNode; see its documentation here:
https://kite.com/docs/python/sklearn.cluster.birch._CFNode
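As a rough sketch of exploring the tree from the root, the snippet below walks every node and counts subclusters per depth. It assumes the private attributes `subclusters_`, `child_` and `n_samples_` as found in the scikit-learn source; the helper `walk`, the toy data and the parameter values are made up for illustration:

```python
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs


def walk(node, depth=0):
    """Yield (depth, subcluster) for every subcluster under `node`."""
    for sc in node.subclusters_:
        yield depth, sc
        # Leaf subclusters have no child node; internal ones point to one.
        if sc.child_ is not None:
            yield from walk(sc.child_, depth + 1)


# Toy data, assumed for illustration only.
X, _ = make_blobs(n_samples=500, centers=5, random_state=0)

# A small branching factor forces the tree to grow beyond a single node.
brc = Birch(threshold=0.3, branching_factor=10, n_clusters=None).fit(X)

counts = {}      # depth -> number of subclusters at that depth
n_leaf_sc = 0    # subclusters without a child node
for depth, sc in walk(brc.root_):
    counts[depth] = counts.get(depth, 0) + 1
    if sc.child_ is None:
        n_leaf_sc += 1

print("subclusters directly under the root:", len(brc.root_.subclusters_))
print("subclusters per depth:", counts)
for i, sc in enumerate(brc.root_.subclusters_):
    print(f"root subcluster {i}: n_samples_ = {sc.n_samples_}")
```

The count at depth 0 answers "how many clusters are there under the root node", and `n_samples_` gives the size of each one, although not which samples it contains.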

Also, see this example comparing Birch with mini-batch k-means:
http://scikit-learn.org/stable/auto_examples/cluster/plot_birch_vs_minibatchkmeans.html

Hope this is what you are after.

Best,
Mehmet

On 23 August 2017 at 10:55, Sema Atasever <s.atase...@gmail.com> wrote:
Dear scikit-learn members,

Considering the "CF-tree" data structure:

- How can I access the Clustering Feature Tree in Birch?

- For example, how many clusters are there in the hierarchy under the root
node and what are the data samples in this cluster?

- Can I get them separately for 3 trees?

Best.

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
