Don’t get me wrong, but you’d have to either label them manually yourself, ask domain experts, or use a platform like Amazon Mechanical Turk (or collect them in some other way).
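As a rough sketch of how the pieces discussed in this thread fit together once you do have labels: the snippet below clusters the five sample rows you posted with DBSCAN and compares the result against ground truth. Note that the `labels_true` values, the 100 m neighborhood radius, and `min_samples=2` are assumptions I made up purely for illustration; you would replace them with your own labels and settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

# The five sample rows from this thread, as (latitude, longitude) in degrees.
coords_deg = np.array([
    [37.76901, -122.429299],
    [37.76904, -122.42913],
    [37.76878, -122.429092],
    [37.7763, -122.424249],
    [37.77627, -122.424657],
])

# The haversine metric expects [lat, lon] pairs in radians.
coords_rad = np.radians(coords_deg)

# Assumption: points within roughly 100 m of each other belong together.
# eps is an angle in radians, so divide the distance in km by the
# Earth's mean radius (about 6371 km).
eps = 0.1 / 6371.0
db = DBSCAN(eps=eps, min_samples=2, metric="haversine")
labels_pred = db.fit_predict(coords_rad)  # a label of -1 would mark noise

# Hypothetical ground truth, e.g. assigned by hand or by a domain expert.
# The label names do not need to match the cluster indices; only the
# grouping matters for the adjusted Rand index.
labels_true = [1, 1, 1, 0, 0]

ari = adjusted_rand_score(labels_true, labels_pred)
print("predicted labels:", labels_pred)
print("adjusted Rand index:", ari)
```

Without some `labels_true` of this kind there is nothing for adjusted_rand_score to compare the clustering against, which is why the labels have to come from outside the data.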
> On Apr 3, 2017, at 7:38 AM, Shuchi Mala <shuchi...@gmail.com> wrote:
>
> How can I get ground truth labels of the training examples in my dataset?
>
> With Best Regards,
> Shuchi Mala
> Research Scholar
> Department of Civil Engineering
> MNIT Jaipur
>
>
> On Fri, Mar 31, 2017 at 8:17 PM, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> Hi, Shuchi,
>
> regarding labels_true: you’d only be able to compute the Rand index adjusted
> for chance if you have the ground truth labels of the training examples in
> your dataset.
>
> The second parameter, labels_pred, takes in the predicted cluster labels
> (indices) that you got from the clustering. E.g.,
>
> dbscn = DBSCAN()
> labels_pred = dbscn.fit_predict(X)  # DBSCAN has no separate predict method
>
> Best,
> Sebastian
>
>
> > On Mar 31, 2017, at 12:02 AM, Shuchi Mala <shuchi...@gmail.com> wrote:
> >
> > Thank you so much for your quick reply. I have one more doubt. The below
> > statement is used to calculate rand score.
> >
> > metrics.adjusted_rand_score(labels_true, labels_pred)
> > In my case what will be labels_true and labels_pred and how I will
> > calculate labels_pred?
> >
> > With Best Regards,
> > Shuchi Mala
> > Research Scholar
> > Department of Civil Engineering
> > MNIT Jaipur
> >
> >
> > On Thu, Mar 30, 2017 at 8:38 PM, Shane Grigsby <shane.grig...@colorado.edu> wrote:
> > Since you're using lat / long coords, you'll also want to convert them to
> > radians and specify 'haversine' as your distance metric; i.e.:
> >
> > coords = np.vstack([lats.ravel(), longs.ravel()]).T
> > coords *= np.pi / 180.  # to radians
> >
> > ...and:
> >
> > db = DBSCAN(eps=0.3, min_samples=10, metric='haversine')
> > # replace eps and min_samples as appropriate
> > db.fit(coords)
> >
> > Cheers,
> > Shane
> >
> >
> > On 03/30, Sebastian Raschka wrote:
> > Hi, Shuchi,
> >
> > 1. How can I add data to the data set of the package?
> >
> > You don’t need to add your dataset to the dataset module to run your
> > analysis.
> > A convenient way to load it into a numpy array would be via
> > pandas. E.g.,
> >
> > import pandas as pd
> > df = pd.read_csv('your_data.txt', delimiter=r"\s+")
> > X = df.values
> >
> > 2. How I can calculate Rand index for my data?
> >
> > After you have run the clustering, you can use the “adjusted_rand_score”
> > function, e.g., see
> > http://scikit-learn.org/stable/modules/clustering.html#adjusted-rand-score
> >
> > 3. How to use make_blobs command for my data?
> >
> > The make_blobs command is just a utility function to create toy datasets;
> > you wouldn’t need it in your case since you already have “real” data.
> >
> > Best,
> > Sebastian
> >
> >
> > On Mar 30, 2017, at 4:51 AM, Shuchi Mala <shuchi...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I have the data with following attributes: (Latitude, Longitude). Now I am
> > performing clustering using DBSCAN for my data. I have following doubts:
> >
> > 1. How can I add data to the data set of the package?
> > 2. How I can calculate Rand index for my data?
> > 3. How to use make_blobs command for my data?
> >
> > Sample of my data is:
> > Latitude   Longitude
> > 37.76901   -122.429299
> > 37.76904   -122.42913
> > 37.76878   -122.429092
> > 37.7763    -122.424249
> > 37.77627   -122.424657
> >
> >
> > With Best Regards,
> > Shuchi Mala
> > Research Scholar
> > Department of Civil Engineering
> > MNIT Jaipur
> >
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn@python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
> >
> > --
> > *PhD candidate & Research Assistant*
> > *Cooperative Institute for Research in Environmental Sciences (CIRES)*
> > *University of Colorado at Boulder*