Re: [scikit-learn] Construct the microclusters using a CF-Tree

2017-07-03 Thread Roman Yurchak
Hello Sema, as far as I can tell, your dataset has n_samples=65909 and n_features=539. Clustering high-dimensional data is problematic for a number of reasons (https://en.wikipedia.org/wiki/Clustering_high-dimensional_data#Problems); besides, the BIRCH implementation doesn't scale well for

Re: [scikit-learn] Scikit-learn workshop and sprint at EuroScipy 2017 in Erlangen

2017-07-03 Thread Tim Head
Hey, On Wed, Jun 28, 2017 at 9:42 AM Olivier Grisel wrote: > Do you have any suggestion? The workshop duration is 90 min. > Looks like a good setup. Two thoughts: should we construct an example that uses a pipeline to illustrate the point that you should put your whole pipeline into your g

Re: [scikit-learn] Construct the microclusters using a CF-Tree

2017-07-03 Thread Sema Atasever
Dear Roman, When I try the code with the original data (*data.dat*) as you suggested, I get the following error: *Memory Error* --> (*error.png*). How can I overcome this problem? Thank you so much in advance.