Thanks for the reply. I misused random.seed: it returns None.
I then passed an integer to random_state, but the unexpected behavior
remained.
After I cloned the estimator with sklearn.base.clone, the results became
reasonable:
clfs = [(clone(pipe).fit(x[train_index], y[train_index]),
         (x[test_index], y[test_index]))
        for train_index, test_index in KFold(x.shape[0], n_folds=8,
                                             shuffle=True, random_state=254)]
scores = [m.accuracy_score(p[1][1], p[0].predict(p[1][0])) for p in clfs]
It looks like the estimator keeps its previous state when it is reused for
training.
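
A likely explanation, sketched below in plain Python: by scikit-learn
convention fit returns the estimator itself, so a list built from repeated
pipe.fit(...) calls holds the same object several times, and every entry
reflects only the last fit. If that is what happened here, the shared model
was last trained on the final fold's training rows, which include the test
rows of the earlier folds; their scores would be inflated, and only the last
fold would be scored out of sample. MajorityClassifier and the label lists
are hypothetical stand-ins, not scikit-learn code, and copy.deepcopy plays
the role of sklearn.base.clone only loosely (clone returns a new unfitted
estimator with the same parameters).

```python
import copy
import random


class MajorityClassifier:
    """Toy stand-in for an estimator: predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self  # same convention as scikit-learn: fit returns the estimator

    def predict(self, X):
        return [self.label_ for _ in X]


# Hypothetical training labels for three "folds".
train_labels = [[0, 0, 1], [1, 1, 0], [2, 2, 2]]

model = MajorityClassifier()

# Reusing one estimator: every list entry is the same object.
shared = [model.fit(None, y) for y in train_labels]
all_same = all(m is model for m in shared)
shared_labels = [m.label_ for m in shared]  # only the last fit survives

# Copying before each fit: one independent model per fold.
independent = [copy.deepcopy(model).fit(None, y) for y in train_labels]
independent_labels = [m.label_ for m in independent]

# random.seed seeds the stdlib generator and returns None, so
# random_state=random.seed(1234) is the same as random_state=None.
seed_result = random.seed(1234)
```

Here shared_labels shows the last fold's label repeated for every entry,
while independent_labels differs per fold. cross_val_score avoids the
problem because it clones the estimator for each fold internally.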
2014-12-13 9:57 GMT+08:00 Andy <t3k...@gmail.com>:
>
> random.seed returns None, and scikit-learn does not use the random module;
> it uses numpy.random.
> You should just pass the integer.
>
>
> On 12/09/2014 06:50 PM, He-chien Tsai wrote:
>
> Thanks for your approach. I didn't notice that cross_val_score accepts a
> cross-validator as cv.
> Your approach makes the strange behavior disappear!
> But I still can't figure out what mistake I made; my original code looks
> fine to me.
>
> BTW, I used a pipeline because I plan to add data transformations.
>
> 2014-12-10 4:33 GMT+08:00 Sebastian Raschka <se.rasc...@gmail.com>:
>
>> What is your dataset size? I am a little curious whether you need the
>> pipe.fit() call. I usually do the CV like this:
>>
>> clf1 = Pipeline([
>>     ('classifier', RandomForestClassifier(n_estimators=100,
>>         min_samples_leaf=10, random_state=random.seed(1234)))
>> ])
>>
>> cv = KFold(n=X_train.shape[0],
>>            n_folds=5,
>>            random_state=123)
>>
>> scores = cross_val_score(clf1, X_train, y_train, cv=cv,
>>                          scoring='accuracy')
>>
>> Best,
>> Sebastian
>>
>>
>> > On Dec 9, 2014, at 3:05 PM, He-chien Tsai <depot...@gmail.com> wrote:
>> >
>> > I got strange cross-validation scores in two runs, even when I tried
>> > different values of random_state in KFold: the last fold scores
>> > significantly lower than the other folds, like this:
>> > [0.66555285540704734,
>> > 0.64459295261239369,
>> > 0.64611178614823817,
>> > 0.6488456865127582,
>> > 0.65268915223336377,
>> > 0.65603160133697969,
>> > 0.66423579459130966,
>> > 0.097538742023700997]
>> >
>> > [0.82442284325637905,
>> > 0.8353584447144593,
>> > 0.82685297691373028,
>> > 0.82320777642770349,
>> > 0.82685297691373028,
>> > 0.82989064398541923,
>> > 0.82006079027355627,
>> > 0.64133738601823709]
>> > My code is below
>> > pipe = Pipeline([
>> >     ('classifier', RandomForestClassifier(n_estimators=100,
>> >         min_samples_leaf=10, random_state=random.seed(1234)))
>> > ])
>> > clfs = [(pipe.fit(x[train_index], y[train_index]),
>> >          (x[test_index], y[test_index]))
>> >         for train_index, test_index in KFold(x.shape[0], n_folds=8,
>> >             shuffle=True, random_state=random.seed(125))]
>> > scores = [m.accuracy_score(p[1][1], p[0].predict(p[1][0]))
>> >           for p in clfs]
>> >
>> ------------------------------------------------------------------------------
>> > Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
>> > from Actuate! Instantly Supercharge Your Business Reports and Dashboards
>> > with Interactivity, Sharing, Native Excel Exports, App Integration &
>> more
>> > Get technology previously reserved for billion-dollar corporations, FREE
>> >
>> http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk_______________________________________________
>> > Scikit-learn-general mailing list
>> > Scikit-learn-general@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>>
>>
>
>
>
>
>