Don’t get me wrong, but you’d have to either label them manually
yourself, ask domain experts, or use a platform like Amazon Mechanical
Turk (or collect them in some other way).
> On Apr 3, 2017, at 7:38 AM, Shuchi Mala <shuchi...@gmail.com> wrote:
>
> How can I get ground truth labels of the training examples in my dataset?
>
> With Best Regards,
> Shuchi Mala
> Research Scholar
> Department of Civil Engineering
> MNIT Jaipur
>
>
> On Fri, Mar 31, 2017 at 8:17 PM, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> Hi, Shuchi,
>
> regarding labels_true: you’d only be able to compute the Rand index
> adjusted for chance if you have the ground truth labels of the training
> examples in your dataset.
>
> The second parameter, labels_pred, takes the predicted cluster
> labels (indices) that you got from the clustering. E.g.,
>
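> from sklearn.cluster import DBSCAN
> dbscn = DBSCAN()
> # DBSCAN has no predict() method; fit_predict returns the cluster labels
> labels_pred = dbscn.fit_predict(X)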
>
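To make the roles of labels_true and labels_pred concrete, here is a minimal pure-Python sketch of the Rand index adjusted for chance, computed from pair counts. It is for illustration only and assumes small hand-made label lists; in practice metrics.adjusted_rand_score is the function to call. The name adjusted_rand_index and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-adjusted Rand index from pair counts (illustrative sketch)."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label lists must have the same length")

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treat as trivially identical

    # Pairs of samples that share a cluster in both, in truth, in prediction
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = in_true * in_pred / total_pairs  # agreement expected by chance
    max_index = (in_true + in_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (max_index - expected)


# Hypothetical toy labels: same grouping, different cluster IDs
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Hypothetical toy labels: the prediction merges two true clusters
partial_match = adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1])
```

The cluster IDs themselves do not matter, only which samples are grouped together, so the first call scores a perfect match even though the IDs are swapped.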
> Best,
> Sebastian
>
>
> > On Mar 31, 2017, at 12:02 AM, Shuchi Mala <shuchi...@gmail.com> wrote:
> >
> > Thank you so much for your quick reply. I have one more doubt. The
> > statement below is used to calculate the Rand score.
> >
> > metrics.adjusted_rand_score(labels_true, labels_pred)
> > In my case, what will labels_true and labels_pred be, and how will I
> > calculate labels_pred?
> >
> > With Best Regards,
> > Shuchi Mala
> > Research Scholar
> > Department of Civil Engineering
> > MNIT Jaipur
> >
> >
> > On Thu, Mar 30, 2017 at 8:38 PM, Shane Grigsby <shane.grig...@colorado.edu> wrote:
> > Since you're using lat / long coords, you'll also want to convert
> > them to radians and specify 'haversine' as your distance metric; i.e.:
> >
> > import numpy as np
> > coords = np.vstack([lats.ravel(), longs.ravel()]).T  # columns: (lat, long)
> > coords = np.radians(coords)  # degrees to radians
> >
> > ...and:
> >
> > from sklearn.cluster import DBSCAN
> > db = DBSCAN(eps=0.3, min_samples=10, metric='haversine')
> > # replace eps and min_samples as appropriate; with 'haversine', eps is in radians
> > db.fit(coords)
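With the haversine metric on radian inputs, distances come out in radians on the unit sphere, so eps has to be given in radians too. The sketch below shows one way to convert a neighbourhood radius in kilometres, and checks the distance between two of the sample points posted in this thread. The 6371 km mean Earth radius is an approximation, and the helper names and the 100 m radius are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; an approximation


def km_to_radians(distance_km):
    """Convert a surface distance in km to an angle in radians."""
    return distance_km / EARTH_RADIUS_KM


def haversine_radians(lat1, lon1, lat2, lon2):
    """Great-circle distance on the unit sphere; all inputs in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * math.asin(math.sqrt(a))


# Two of the sample points from the original question, converted to radians
p1 = (math.radians(37.76901), math.radians(-122.429299))
p2 = (math.radians(37.7763), math.radians(-122.424249))

dist_rad = haversine_radians(p1[0], p1[1], p2[0], p2[1])
dist_km = dist_rad * EARTH_RADIUS_KM  # back to kilometres for readability

# Hypothetical choice: treat points within 100 m as neighbours
eps_100m = km_to_radians(0.1)
```

A value such as eps_100m could then be passed as eps to DBSCAN in place of the 0.3 placeholder, which on this scale would correspond to a radius of roughly 1900 km.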
> >
> > Cheers,
> > Shane
> >
> >
> > On 03/30, Sebastian Raschka wrote:
> > Hi, Shuchi,
> >
> > 1. How can I add data to the data set of the package?
> >
> > You don’t need to add your dataset to the datasets module to run your
> > analysis. A convenient way to load it into a NumPy array would be via
> > pandas. E.g.,
> >
> > import pandas as pd
> > df = pd.read_csv('your_data.txt', delimiter=r"\s+")
> > X = df.values
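If pandas is not at hand, the same whitespace-delimited layout can also be read with the standard library. This is a minimal sketch, assuming a header row followed by two numeric columns as in the sample posted in this thread; read_lat_lon is a hypothetical helper name, and the in-memory text stands in for a real file.

```python
import io
import math

# Stand-in for a file on disk, using the sample rows from the question
sample = io.StringIO(
    "Latitude Longitude\n"
    "37.76901 -122.429299\n"
    "37.76904 -122.42913\n"
    "37.76878 -122.429092\n"
    "37.7763 -122.424249\n"
    "37.77627 -122.424657\n"
)


def read_lat_lon(handle):
    """Read a header line, then (latitude, longitude) rows split on whitespace."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        rows.append((float(fields[0]), float(fields[1])))
    return header, rows


header, rows = read_lat_lon(sample)

# Convert degrees to radians, as needed for a haversine distance
coords_rad = [(math.radians(lat), math.radians(lon)) for lat, lon in rows]
```

For a real file, the same helper could be given the handle returned by open('your_data.txt').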
> >
> > 2. How I can calculate Rand index for my data?
> >
> > After you have run the clustering, you can use the “adjusted_rand_score”
> > function; e.g., see
> > http://scikit-learn.org/stable/modules/clustering.html#adjusted-rand-score
> >
> > 3. How to use make_blobs command for my data?
> >
> > The make_blobs command is just a utility function for creating toy
> > datasets; you wouldn’t need it in your case since you already have
> > “real” data.
> >
> > Best,
> > Sebastian
> >
> >
> > On Mar 30, 2017, at 4:51 AM, Shuchi Mala <shuchi...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I have data with the following attributes: (Latitude, Longitude).
> > I am now clustering my data using DBSCAN. I have the following
> > doubts:
> >
> > 1. How can I add data to the data set of the package?
> > 2. How I can calculate Rand index for my data?
> > 3. How to use make_blobs command for my data?
> >
> > Sample of my data is :
> > Latitude Longitude
> > 37.76901 -122.429299
> > 37.76904 -122.42913
> > 37.76878 -122.429092
> > 37.7763 -122.424249
> > 37.77627 -122.424657
> >
> >
> > With Best Regards,
> > Shuchi Mala
> > Research Scholar
> > Department of Civil Engineering
> > MNIT Jaipur
> >
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn@python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
> >
> > --
> > *PhD candidate & Research Assistant*
> > *Cooperative Institute for Research in Environmental Sciences (CIRES)*
> > *University of Colorado at Boulder*