I just had the same issue recently. It's been fixed in the dev (0.14)
branch. If you pull/build/install that, everything should be fine.
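
To check whether a given install is affected, a small smoke test along these lines may help. It is only a sketch: the synthetic sparse matrix stands in for the TF-IDF features, and the function name `fit_many_clusters` and its defaults are my own assumptions, not part of document_clustering.py. I have not verified that random data triggers the same center reassignment as the newsgroups data, so a clean run is a hint rather than proof.

```python
# Smoke test: fit MiniBatchKMeans with many clusters on sparse input.
# The synthetic data and the function name are illustrative assumptions,
# not part of the original document_clustering.py example.
import scipy.sparse as sp
import sklearn
from sklearn.cluster import MiniBatchKMeans


def fit_many_clusters(n_clusters=200, n_samples=1000, n_features=50, seed=0):
    # Random sparse matrix standing in for the TF-IDF feature matrix.
    X = sp.random(n_samples, n_features, density=0.2, format="csr",
                  random_state=seed)
    km = MiniBatchKMeans(n_clusters=n_clusters, init="k-means++", n_init=1,
                         init_size=1000, batch_size=1000, random_state=seed)
    km.fit(X)
    return km


if __name__ == "__main__":
    print("scikit-learn version:", sklearn.__version__)
    km = fit_many_clusters()
    print("cluster centers:", km.cluster_centers_.shape)
```

If the fit completes and the centers have one row per requested cluster, the install is probably fine for this case.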

F.


On 1 February 2013 13:40, Vinay B, <[email protected]> wrote:

> From the scikit script at
> http://scikit-learn.org/dev/_downloads/document_clustering.py , it
> appears that the number of clusters is set to the number of newsgroup
> subfolders. I'm guessing that's done more out of convenience. On the
> other hand, users should be able to set an arbitrary number of
> clusters, for better or worse, depending on the desired cluster
> granularity.
>
> But if I increase the number of clusters to a moderately large value, as
> shown below, I get an error. The code changes and output are below.
>
> Thanks
> .
> .
> .
> true_k = 200; #<============EXPLICITLY SET NUMBER OF DESIRED CLUSTERS
> # Do the actual clustering
> if opts.minibatch:
>     km = MiniBatchKMeans(n_clusters=true_k, init='k-means++', n_init=1,
>                          init_size=1000,
>                          batch_size=1000, verbose=1)
> else:
>     km = KMeans(n_clusters=true_k, init='k-means++', max_iter=100,
>                 n_init=1, verbose=1)
> .
> .
> .
>
> Output:
>
>
> None
> Usage: ClusteringApp.py [options]
>
> Options:
>   -h, --help            show this help message and exit
>   --no-minibatch        Use ordinary k-means algorithm (in batch mode).
>   --no-idf              Disable Inverse Document Frequency feature weighting.
>   --use-hashing         Use a hashing feature vectorizer
>   --n-features=N_FEATURES
>                         Maximum number of features (dimensions)to extract from
>                         text.
> Loading 20 newsgroups dataset for categories:
> ['alt.atheism', 'talk.religion.misc', 'comp.graphics', 'sci.space']
> 3387 documents
> 4 categories
>
> Extracting features from the training dataset using a sparse vectorizer
> done in 3.176239s
> n_samples: 3387, n_features: 10000
>
> Clustering sparse data with MiniBatchKMeans(batch_size=1000,
> compute_labels=True, init=k-means++,
>         init_size=1000, k=None, max_iter=100, max_no_improvement=10,
>         n_clusters=200, n_init=1, random_state=None,
>         reassignment_ratio=0.01, tol=0.0, verbose=1)
> Init 1/1 with method: k-means++
> Inertia for init 1/1: 654.117274
> Minibatch iteration 1/400:mean batch inertia: 0.863513, ewa inertia: 0.863513
> Minibatch iteration 2/400:mean batch inertia: 0.813080, ewa inertia: 0.833741
> Minibatch iteration 3/400:mean batch inertia: 0.815186, ewa inertia: 0.822788
> Minibatch iteration 4/400:mean batch inertia: 0.801274, ewa inertia: 0.810088
> Minibatch iteration 5/400:mean batch inertia: 0.800503, ewa inertia: 0.804430
> Minibatch iteration 6/400:mean batch inertia: 0.802421, ewa inertia: 0.803244
> Minibatch iteration 7/400:mean batch inertia: 0.789954, ewa inertia: 0.795398
> Minibatch iteration 8/400:mean batch inertia: 0.793326, ewa inertia: 0.794175
> Minibatch iteration 9/400:mean batch inertia: 0.792347, ewa inertia: 0.793096
> [_mini_batch_step] Reassigning 124 cluster centers.
> Traceback (most recent call last):
>   File "/home/vinayb/python/HelloPython/examples/ClusteringApp.py", line 114, in <module>
>     km.fit(X)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/cluster/k_means_.py", line 1221, in fit
>     verbose=self.verbose)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/cluster/k_means_.py", line 888, in _mini_batch_step
>     centers[to_reassign] = new_centers
> ValueError: setting an array element with a sequence.
>
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_d2d_jan
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>