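The randomisation only changes the order of the data, not the set of data
points. The classifier is therefore fitted on the same training documents and
scored on the same test documents on every run, so the score does not change.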
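
To illustrate, here is a minimal sketch rather than the original poster's
script. It assumes a recent scikit-learn (where `model_selection` has replaced
the older `cross_validation` module) and uses a small made-up corpus in place
of the 20 newsgroups data; `docs`, `test_docs` and the helper `accuracy` are
hypothetical names. Shuffling with different seeds leaves the score unchanged,
whereas drawing a different subset of the training data each time can change
it.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.utils import shuffle

# Hypothetical toy corpus standing in for the 20 newsgroups training set.
docs = [
    "the rocket reached orbit",
    "nasa launched a new satellite",
    "the shuttle docked with the station",
    "astronauts trained for the mission",
    "the goalie saved the puck",
    "the team won the hockey game",
    "the striker scored a late goal",
    "fans cheered the home team",
]
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fixed held-out documents, playing the role of the 'test' subset.
test_docs = [
    "the satellite left orbit",
    "the team scored a goal",
    "a new mission for the station",
]
test_labels = np.array([0, 1, 0])


def accuracy(train_docs, train_labels):
    """Fit a raw-count naive Bayes pipeline and score it on the fixed test set."""
    clf = Pipeline([("vect", CountVectorizer()), ("clf", MultinomialNB())])
    clf.fit(train_docs, train_labels)
    return float(np.mean(clf.predict(test_docs) == test_labels))


# 1) Shuffling: every seed yields the same documents in a different order.
shuffled_scores = []
for seed in (0, 1, 2):
    d, y = shuffle(docs, labels, random_state=seed)
    assert sorted(d) == sorted(docs)  # same set of data points
    shuffled_scores.append(accuracy(d, y))

# 2) Resampling: every seed draws a different half of the training data.
subset_scores = []
for seed in (0, 1, 2):
    d, _, y, _ = train_test_split(
        docs, labels, train_size=0.5, stratify=labels, random_state=seed
    )
    subset_scores.append(accuracy(d, y))

print("shuffled:", shuffled_scores)  # identical values
print("subsets: ", subset_scores)    # may differ from seed to seed
```

If the aim is to see how the score varies, the training set itself has to
change between runs, for example by resampling or cross-validating as in the
second loop.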

On 27 August 2015 at 22:44, Andrew Howe <ahow...@gmail.com> wrote:

> I'm working through the tutorial, and also experimenting kind of on my
> own.  I'm on the text analysis example, and am curious about the relative
> merits of analyzing by word frequency, relative frequency, and adjusted
> relative frequency.  Using the 20 newsgroups data, I've built a set of
> pipelines within a cross_validation loop; the important part of the code is
> here:
>
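> # imports needed by this excerpt
> import datetime as dat
> import numpy as np
> from sklearn.datasets import fetch_20newsgroups
> from sklearn.feature_extraction.text import CountVectorizer
> from sklearn.naive_bayes import MultinomialNB
> from sklearn.pipeline import Pipeline
>
> # get the data, seeding the shuffle from the current time of day
> # (categories, test_ccrs and mccnt are defined elsewhere in the script)
> nw = dat.datetime.now()
> rndstat = nw.hour*3600 + nw.minute*60 + nw.second
> twenty_train = fetch_20newsgroups(subset='train', categories=categories,
>                                   random_state=rndstat, shuffle=True,
>                                   download_if_missing=False)
> twenty_test = fetch_20newsgroups(subset='test', categories=categories,
>                                  random_state=rndstat, shuffle=True,
>                                  download_if_missing=False)
>
> # first with raw counts
> text_clf = Pipeline([('vect', CountVectorizer()), ('clf', MultinomialNB())])
> text_clf.fit(twenty_train.data, twenty_train.target)
> pred = text_clf.predict(twenty_test.data)
> # fraction of test documents classified correctly (np.mean avoids
> # integer division under Python 2)
> test_ccrs[mccnt, 0] = np.mean(pred == twenty_test.target)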
>
> The issue is that every time I run this, though I've confirmed the data
> sampled is different, the value in test_ccrs is *always* the same.  Am I
> missing something?
>
> Thanks!
> Andrew
>
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
> J. Andrew Howe, PhD
> Editor-in-Chief, European Journal of Mathematical Sciences
> Executive Editor, European Journal of Pure and Applied Mathematics
> www.andrewhowe.com
> http://www.linkedin.com/in/ahowe42
> https://www.researchgate.net/profile/John_Howe12/
> I live to learn, so I can learn to live. - me
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
