Sorry, just noticed that I had forgotten to attach a sample image. Regards
On Wed, Dec 9, 2020 at 1:21 PM Abhishek Ghose <abhishek.ghose...@gmail.com> wrote:

> Hi,
>
> A quick way I use is to draw a convex hull (scipy) around the points in a
> cluster.
> Here's a short example - k-means with k=2 is run on synthetic data:
>
> from sklearn.datasets import make_blobs
> from sklearn.cluster import KMeans
> from matplotlib import pyplot as plt
> from scipy.spatial import ConvexHull
>
>
> X, _ = make_blobs(centers=2)
> kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
>
> # uncomment the next line if you're using a notebook
> #%matplotlib inline
> for label in set(kmeans.labels_):
>     X_clust = X[kmeans.labels_==label]
>     hull = ConvexHull(X_clust, qhull_options='QJ')
>     vertices_cycle = hull.vertices.tolist()
>     vertices_cycle.append(hull.vertices[0])
>     plt.plot(X_clust[vertices_cycle, 0], X_clust[vertices_cycle, 1],
>              'k--', lw=1)
>     plt.scatter(X_clust[:, 0], X_clust[:, 1])
>
> Note:
>
>    1. You can still have overlaps between boundaries - but I think this
>    is a good effort-to-results tradeoff.
>    2. To draw a closed boundary, you'd need to add the first vertex to
>    the list returned by the hull function - the above code does that.
>    3. You'd need to handle the case for clusters with <=2 points
>    explicitly - not shown in the above code.
>    4. I use the "QJ" option (other options at the qhull library page,
>    which scipy internally uses: http://www.qhull.org/html/qh-optq.htm) to
>    joggle the points a bit when they lie on a line.
>
> Regards
>
>
> On Wed, Dec 9, 2020 at 12:41 PM Brown J.B. via scikit-learn <
> scikit-learn@python.org> wrote:
>
>> Dear Mahmood,
>>
>> Andrew's solution with a circle will guarantee you render an image in
>> which every point is covered within some circle.
>>
>> However, if data contains outliers or artifacts, you might get circles
>> which are excessively large and distort the image you want.
>> For example, imagine if there were a single red point in Andrew's image
>> at the coordinate (3,10); then, the resulting circle would cover all points
>> in the entire plot, which is unlikely to be what you want.
>> You could potentially generate a density estimate for each class and then
>> have matplotlib render the contour lines (i.e., the curves along which the
>> estimate takes a specific value), but as was said, this is not the job of
>> Kmeans, but rather of general data analysis.
>>
>> The ellipsoid solution proposed to you is, in a sense, a middle ground
>> between these two solutions (the circles and the density plots).
>> You could adjust the (4 or 5) parameters of an ellipsoid to cover "most"
>> of the points for a particular class and tolerate that the ellipsoids don't
>> cover a few outliers or artifacts (e.g., the coordinate (3,10) I mentioned
>> above).
>> The resulting functional forms of the ellipses might be more precise than
>> circles and less complex than density contours, and might lead to
>> actionable knowledge depending on your context/domain.
>>
>> Hope this helps.
>> J.B. Brown
>>
>> On Wed, Dec 9, 2020 at 21:08, Mahmood Naderan <mahmood...@gmail.com> wrote:
>>
>>> >Mebbe principal components analysis would suggest an
>>> >ellipsoid containing "most" of the points in a "cloud".
>>>
>>> Sorry, I didn't understand. Can you explain more?
>>> Regards,
>>> Mahmood
>>>
>>>
>>> On Wed, Dec 9, 2020 at 8:55 PM The Helmbolds via scikit-learn <
>>> scikit-learn@python.org> wrote:
>>>
>>>> [scikit-learn] Drawing contours in KMeans4
>>>>
>>>> Mebbe principal components analysis would suggest an ellipsoid
>>>> containing "most" of the points in a "cloud".
>>>>
>>>> "You won't find the right answers if you don't ask the right
>>>> questions!" (Robert Helmbold, 2013)
>>>>
>>>>
>>>> On Wednesday, December 9, 2020, 12:22:49 PM MST, Andrew Howe <
>>>> ahow...@gmail.com> wrote:
>>>>
>>>>
>>>> Ok, I see.
>>>> Well, the attached notebook demonstrates doing this by simply
>>>> finding the maximum distance from each centroid to its datapoints and
>>>> drawing a circle using that radius. It's simple, but will hopefully at
>>>> least point you in a useful direction.
>>>> [image: image.png]
>>>> Andrew
>>>>
>>>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>>>> J. Andrew Howe, PhD
>>>> LinkedIn Profile <http://www.linkedin.com/in/ahowe42>
>>>> ResearchGate Profile <http://www.researchgate.net/profile/John_Howe12/>
>>>> Open Researcher and Contributor ID (ORCID)
>>>> <http://orcid.org/0000-0002-3553-1990>
>>>> Github Profile <http://github.com/ahowe42>
>>>> Personal Website <http://www.andrewhowe.com>
>>>> I live to learn, so I can learn to live. - me
>>>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>>>>
>>>>
>>>> On Wed, Dec 9, 2020 at 12:59 PM Mahmood Naderan <mahmood...@gmail.com>
>>>> wrote:
>>>>
>>>> I mean a circle/contour to group the points in a cluster for better
>>>> representation.
>>>> For example, if there are six clusters, it will be more meaningful to
>>>> group large data points in a circle or contour.
>>>>
>>>> Regards,
>>>> Mahmood
>>>>
>>>>
>>>> On Wed, Dec 9, 2020 at 11:49 AM Andrew Howe <ahow...@gmail.com> wrote:
>>>>
>>>> Contours generally indicate a third variable - often a probability
>>>> density. Kmeans doesn't provide density estimates, so what precisely would
>>>> you want the contours to represent?
>>>>
>>>> Andrew
>>>>
>>>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>>>> J. Andrew Howe, PhD
>>>> LinkedIn Profile <http://www.linkedin.com/in/ahowe42>
>>>> ResearchGate Profile <http://www.researchgate.net/profile/John_Howe12/>
>>>> Open Researcher and Contributor ID (ORCID)
>>>> <http://orcid.org/0000-0002-3553-1990>
>>>> Github Profile <http://github.com/ahowe42>
>>>> Personal Website <http://www.andrewhowe.com>
>>>> I live to learn, so I can learn to live.
>>>> - me
>>>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>>>>
>>>>
>>>> On Wed, Dec 9, 2020 at 9:41 AM Mahmood Naderan <mahmood...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi
>>>> I use the following code to highlight the cluster centers with some red
>>>> dots.
>>>>
>>>> kmeans = KMeans(n_clusters=6, init='k-means++', max_iter=100,
>>>>                 n_init=10, random_state=0)
>>>> pred_y = kmeans.fit_predict(a)
>>>> plt.scatter(a[:,0], a[:,1])
>>>> plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
>>>>             s=100, c='red')
>>>> plt.show()
>>>>
>>>> I would like to know if it is possible to draw contours over the
>>>> clusters. Is there any way for that?
>>>> Please let me know if there is a function or option in KMeans.
>>>>
>>>> Regards,
>>>> Mahmood
>>>>
>>>>
>>>> _______________________________________________
>>>> scikit-learn mailing list
>>>> scikit-learn@python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>>
>
>
> --
> Computers: The
> eventual realization of Douglas Adams' musings - the world
> depends on machines controlled by mice.
>

--
Computers: The eventual realization of Douglas Adams' musings - the world depends on machines controlled by mice.
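
For reference, the circle approach Andrew describes in the quoted thread (the largest distance from each centroid to its own points, used as a radius) could look roughly like the sketch below. His notebook is not reproduced in this thread, so this is not his code: the helper names `cluster_circles` and `draw_circles` are made up for illustration, and the inputs are assumed to be the data array plus the `labels_` and `cluster_centers_` of a fitted KMeans.

```python
import numpy as np


def cluster_circles(X, labels, centers):
    """Return one (center, radius) pair per cluster (hypothetical helper).

    The radius is the largest Euclidean distance from the centroid to any
    point assigned to that cluster, so every member lies inside its circle.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    circles = []
    for k, center in enumerate(np.asarray(centers, dtype=float)):
        members = X[labels == k]
        if len(members) == 0:
            radius = 0.0  # empty cluster: nothing to enclose
        else:
            radius = float(np.linalg.norm(members - center, axis=1).max())
        circles.append((center, radius))
    return circles


def draw_circles(ax, circles):
    """Draw dashed circle outlines on a matplotlib Axes (hypothetical helper)."""
    from matplotlib.patches import Circle  # only needed when plotting

    for center, radius in circles:
        ax.add_patch(Circle(tuple(center), radius, fill=False,
                            linestyle="--", linewidth=1))
    ax.set_aspect("equal")  # otherwise the circles render as ellipses
    ax.autoscale_view()     # make sure the axis limits include the circles
```

With the `kmeans` and `a` from Mahmood's original snippet, this would be called as `draw_circles(plt.gca(), cluster_circles(a, kmeans.labels_, kmeans.cluster_centers_))` before `plt.show()`. As J.B. points out, a single outlier inflates the radius, which is the main weakness of this variant.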
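
Since the PCA/ellipsoid suggestion in the quoted thread was left unexplained, here is a minimal sketch of one way to read it: take the principal axes of each cluster from the eigenvectors of its covariance matrix and scale them by a chosen number of standard deviations. The name `pca_ellipse` and the `n_std` parameter are assumptions made for this example, not something proposed in the thread. For a roughly Gaussian cluster, `n_std=2` covers about 86% of the points, so a few outliers are deliberately left outside.

```python
import numpy as np


def pca_ellipse(points, n_std=2.0):
    """Return (center, width, height, angle_deg) for a 2-D point cloud.

    Hypothetical helper: the ellipse axes are the principal components of the
    points, and each full axis length is 2 * n_std standard deviations.
    """
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the last column of
    # eigvecs is the direction of largest variance (first principal component)
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, 0.0)  # guard against tiny negative round-off
    major = eigvecs[:, -1]
    angle_deg = float(np.degrees(np.arctan2(major[1], major[0])))
    width = 2.0 * n_std * float(np.sqrt(eigvals[-1]))   # along the major axis
    height = 2.0 * n_std * float(np.sqrt(eigvals[0]))   # along the minor axis
    return center, width, height, angle_deg


def draw_ellipse(ax, center, width, height, angle_deg):
    """Draw a dashed ellipse outline on a matplotlib Axes (hypothetical helper)."""
    from matplotlib.patches import Ellipse  # only needed when plotting

    ax.add_patch(Ellipse(tuple(center), width, height, angle=angle_deg,
                         fill=False, linestyle="--", linewidth=1))
    ax.autoscale_view()
```

For each cluster label, `draw_ellipse(plt.gca(), *pca_ellipse(X[kmeans.labels_ == label]))` would draw one outline. Raising `n_std` trades tighter outlines for better coverage; clusters with fewer than two points still need separate handling, as with the convex hull.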