[scikit-learn] Clustering with sparse matrix
Hi all, I have a sparse matrix where each row (item) has 160 features. In each row, only three or four features are non-zero. Can I cluster this data? I'm thinking of using PCA to reduce the dimensionality. Thanks for any answer. Luigi ___ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn
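A minimal sketch of one way to approach the question above, not a definitive recipe. It assumes the data is held in a SciPy CSR matrix and swaps in TruncatedSVD for PCA, since TruncatedSVD accepts sparse input directly, whereas PCA centers the data, which can turn a sparse matrix dense. The synthetic data, the 20 components, and the 5 clusters are all hypothetical values chosen only for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(0)
n_items, n_features = 500, 160  # 160 features as in the question; 500 rows is made up

# Synthetic stand-in for the real data: up to four non-zero features per row.
# Duplicate (row, col) pairs are summed, so some rows end up with fewer entries.
rows = np.repeat(np.arange(n_items), 4)
cols = rng.integers(0, n_features, size=n_items * 4)
vals = rng.random(n_items * 4) + 0.1
X = sparse.csr_matrix((vals, (rows, cols)), shape=(n_items, n_features))

# Reduce dimensionality without converting X to a dense array.
svd = TruncatedSVD(n_components=20, random_state=0)  # 20 is an arbitrary choice
X_reduced = svd.fit_transform(X)

# Cluster the reduced representation.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)  # 5 is an arbitrary choice
labels = kmeans.fit_predict(X_reduced)
```

KMeans also accepts sparse input, so with only 160 features it may be worth trying it on X directly and comparing the results with the reduced version.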
[scikit-learn] Getting the indexes of the data points after clustering using Kmeans
Hi, I have applied KMeans clustering using the scikit-learn library: kmeans=KMeans(max_iter=4,n_clusters=10,n_init=10).fit(euclidean_dist) After fitting, I would like to get the data points in each cluster so that I can use them to fit a further model. Example: kmeans.cluster_centers_[1] gives me a distance array of all the data points. Is there a way in scikit-learn to get the IDs/indices of the data points in each cluster? Regards
Re: [scikit-learn] Getting the indexes of the data points after clustering using Kmeans
Hi, if you have your original points stored in a NumPy array, you can get all points from cluster i as follows:

    cluster_points = points[kmeans.labels_ == i]

"kmeans.labels_" contains the cluster label of each point. "kmeans.labels_ == i" creates a boolean mask that selects only the points belonging to cluster i, and indexing "points" with that mask returns those points.

BTW: the fit method takes the raw points as its input parameter, not the distance matrix.

Regards, Christian

(In reply to prince gosavi's message of Wed, 21 Feb 2018, 11:16.)
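A minimal sketch of the masking approach described in this reply, extended to recover the row indices that the original question asked for. The data here is synthetic and the names cluster_indices and indices_by_cluster are hypothetical, introduced only for illustration. np.flatnonzero is used to turn the boolean mask into integer indices.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical raw data: three blobs of 20 points each in 2D.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# Fit on the raw points, not on a precomputed distance matrix.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)

i = 1
mask = kmeans.labels_ == i            # True where a point belongs to cluster i
cluster_indices = np.flatnonzero(mask)  # row indices of those points
cluster_points = points[mask]           # the points themselves

# Row indices for every cluster, keyed by cluster label.
indices_by_cluster = {
    k: np.flatnonzero(kmeans.labels_ == k) for k in range(kmeans.n_clusters)
}
```

Indexing with cluster_indices gives the same rows as indexing with the mask, so the indices can be stored and later used to look up the matching rows in any other array aligned with points.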
Re: [scikit-learn] Getting the indexes of the data points after clustering using Kmeans
Hi, Thanks for your hint. It just saved my day. Regards, Rajkumar

(In reply to Christian Braune's message of Wed, Feb 21, 2018, 4:28 PM.)