Re: [Scikit-learn-general] BIRCH: merge subclusters

2016-02-07 Thread Joel Nothman
It's not clear *why* you're doing this. The model will automatically recluster the subclusters after identifying them, as long as you specify either a number of clusters or a clustering model to the n_clusters parameter. Can you fit this post-processing into that "final clustering" framework? On
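
A minimal sketch of the "final clustering" step described above, assuming scikit-learn's `Birch` estimator. The toy data, the threshold value, and the choice of `AgglomerativeClustering` with average linkage are illustrative assumptions, not taken from the thread. It shows `n_clusters` given once as an integer and once as a clustering model:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, Birch

# Toy data (assumption): three well-separated 2-D blobs.
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.randn(50, 2) for c in centers])

# Option 1: an integer. The subcluster centroids are then
# reclustered into that many final clusters.
birch_int = Birch(threshold=0.2, n_clusters=3).fit(X)

# Option 2: a clustering model, which is fit on the subcluster
# centroids in place of the default final step.
final_step = AgglomerativeClustering(n_clusters=3, linkage="average")
birch_model = Birch(threshold=0.2, n_clusters=final_step).fit(X)

n_subclusters = len(birch_int.subcluster_centers_)
n_final_int = len(np.unique(birch_int.labels_))
n_final_model = len(np.unique(birch_model.labels_))
print(n_subclusters, n_final_int, n_final_model)
```

A custom merging criterion could in principle be expressed as the model passed in option 2, provided it exposes the usual clustering estimator interface.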

[Scikit-learn-general] BIRCH: merge subclusters

2016-02-07 Thread Dženan Softić
Hi, I am doing some experiments with BIRCH. When BIRCH finishes, I would like to merge subclusters based on some criteria. I am doing this by calling the "merge_subcluster" method on the subcluster that I want to merge into, passing it the subcluster object of the second cluster:

Re: [Scikit-learn-general] BIRCH: merge subclusters

2016-02-07 Thread Dženan Softić
Hi, Thank you for your reply. My aim is not to use the global clustering step, but rather to use BIRCH for online clustering (of a possibly infinite stream). I was also trying to set the BIRCH threshold automatically. In order to do so, I use the gap statistic (I developed it on top of Apache Spark) for certain
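
A rough sketch of the online setup described above, assuming scikit-learn's `Birch` with `n_clusters=None` (which skips the global clustering step) and `partial_fit` for incremental updates. The simulated stream, batch size, and fixed threshold are illustrative assumptions; the automatic threshold selection mentioned in the message is not reproduced here:

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.RandomState(0)

# n_clusters=None: no global clustering step, so the
# subclusters themselves are the result.
model = Birch(threshold=0.5, n_clusters=None)

# Simulated stream (assumption): ten mini-batches drawn
# around two fixed centers.
stream_centers = ([0.0, 0.0], [5.0, 5.0])
for _ in range(10):
    batch = np.vstack(
        [rng.normal(loc=c, scale=0.3, size=(20, 2)) for c in stream_centers]
    )
    model.partial_fit(batch)  # updates the CF tree incrementally

centroids = model.subcluster_centers_
# With n_clusters=None, predict returns the index of the
# nearest subcluster centroid.
labels = model.predict(np.array([[0.1, -0.1], [4.9, 5.2]]))
print(centroids.shape, labels)
```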