Re: [Scikit-learn-general] BIRCH - Testing datasets

2015-12-06 Thread Dženan Softić
Thanks. That makes sense. Actually, I am also trying to make the threshold dynamic; I still have to test my approach.
Best,
On Mon, Nov 30, 2015 at 9:15 PM, Manoj Kumar wrote:
> Ah well, the value of the threshold set depends on your data.
>
> If your data is on the scale of 1e4 - 1e5, it is

Re: [Scikit-learn-general] BIRCH - Testing datasets

2015-11-30 Thread Manoj Kumar
Ah well, the value of the threshold depends on your data. If your data is on the scale of 1e4 - 1e5, you should expect to need a really high threshold, because the distances between samples are on the same scale. We are trying to produce heuristics for an optimal "auto" threshold parameter here, (http
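To illustrate the point about scale, here is a minimal sketch using only the Python standard library. It is not taken from the thread and it does not call scikit-learn; the helper name `mean_nearest_neighbor_distance`, the random test points, and the factor `1e5` are all assumptions made for the illustration. It only shows that distances between samples grow by the same factor as the data, which is why a radius-style threshold that works on unit-scale data has to grow with them.

```python
import math
import random


def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point.

    Hypothetical helper for this illustration, not a scikit-learn function.
    """
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


# Assumed toy data: 200 random 2-D points in the unit square.
rng = random.Random(0)
unit_points = [(rng.random(), rng.random()) for _ in range(200)]

# The same points, stretched to the 1e5 scale mentioned above.
scale = 1e5
scaled_points = [(x * scale, y * scale) for x, y in unit_points]

d_unit = mean_nearest_neighbor_distance(unit_points)
d_scaled = mean_nearest_neighbor_distance(scaled_points)

# Typical sample distances are multiplied by the same factor as the data,
# so a distance threshold tuned on the unit-scale data would need to be
# multiplied by `scale` as well (or the data rescaled before clustering).
print("unit scale:   ", d_unit)
print("scaled by 1e5:", d_scaled)
print("ratio:        ", d_scaled / d_unit)
```

Rescaling the data before clustering is the other way to get the same effect; which of the two is preferable depends on whether the original units matter for interpreting the clusters.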

Re: [Scikit-learn-general] BIRCH - Testing datasets

2015-11-30 Thread Manoj Kumar
Hi,
Can you provide your script for testing? Thanks!
On Mon, Nov 30, 2015 at 3:06 PM, Dženan Softić wrote:
> Hi,
>
> I am trying to test BIRCH with the original datasets found here:
> https://cs.joensuu.fi/sipu/datasets/
> (100K points, 100 clusters)
>
> The problem is setting the threshold

[Scikit-learn-general] BIRCH - Testing datasets

2015-11-30 Thread Dženan Softić
Hi, I am trying to test BIRCH with the original datasets found here: https://cs.joensuu.fi/sipu/datasets/ (100K points, 100 clusters). The problem is setting the threshold: I need to set it above 10,000 to get decent results. That is very odd, because in the BIRCH example ( http://scikit-learn.org/st