Oh yes, I get that now. All this while I thought the problem was with the
mac itself, because of a similar issue discussed here:
https://github.com/scikit-learn/scikit-learn/issues/5115
Thanks a lot for clearing this up. I am going to change the loop and see
if I can run the parallel implementation on the mac.
It probably ran on the server only because it has many more processors.
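
To make the arithmetic in the quoted reply concrete, here is a rough sketch. It only multiplies the n_jobs values of nested objects to get a worst-case worker count. worst_case_workers is a made-up helper, not part of scikit-learn or joblib, and what joblib actually spawns may depend on its version and backend.

```python
import os


def worst_case_workers(*n_jobs_per_level, n_cpus=None):
    """Multiply n_jobs across nesting levels; -1 is read as 'all CPUs'.

    Hypothetical helper for illustration only.
    """
    n_cpus = n_cpus or os.cpu_count() or 1
    total = 1
    for n_jobs in n_jobs_per_level:
        total *= n_cpus if n_jobs == -1 else n_jobs
    return total


# The quoted example: 4 processors, a k-fold loop with n_jobs=5 wrapped
# around a grid search with n_jobs=2.
print(worst_case_workers(5, 2, n_cpus=4))   # 10 workers for 4 CPUs

# Parallelising only the innermost object stays within the CPU budget.
print(worst_case_workers(1, -1, n_cpus=4))  # 4
```

On a server with many more processors the same product may still fit, which would explain why the code ran there.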
Thanks,
Amita
On Thu, May 12, 2016 at 7:35 PM, Sebastian Raschka <se.rasc...@gmail.com>
wrote:
> I am not that much into the multi-processing implementation in
> scikit-learn / joblib, but I think this could be one reason why your mac
> hangs. I’d say the safest approach is to set the n_jobs parameter only
> for the innermost object.
>
> E.g., if you have 4 processors and you set the GridSearch to n_jobs=2 and
> a k-fold loop to, e.g., n_jobs=5, I can imagine that it would blow up
> because you are suddenly trying to run 10 processes on 4 processors, if
> that makes sense.
>
>
> > On May 12, 2016, at 10:26 PM, Amita Misra <amis...@ucsc.edu> wrote:
> >
> > I had not thought about the n_jobs parameter, mainly because it does not
> > run on my mac and the system just hangs if I use it.
> > The same code runs on the linux server though.
> >
> > I have one more clarification to seek.
> > I was running it on the server with this code. Would this be fine, or
> > should I move the n_jobs=3 to GridSearchCV?
> >
> > grid_search = GridSearchCV(pipeline, param_grid=param_grid,
> >                            scoring=scoringcriteria, cv=5)
> > scores = cross_validation.cross_val_score(grid_search, X_train, Y_train,
> >                                           cv=cvfolds, n_jobs=3)
> >
> > Thanks,
> > Amita
> >
> > On Thu, May 12, 2016 at 6:58 PM, Sebastian Raschka <se.rasc...@gmail.com>
> > wrote:
> > You are welcome, and I am glad to hear that it works :). And “your”
> > approach is definitely the cleaner way to do it. I think you just need to
> > be a bit careful about the n_jobs parameter: in practice, I would only set
> > it to n_jobs=-1 in the inner loop.
> >
> > Best,
> > Sebastian
> >
> >
> > > On May 12, 2016, at 7:17 PM, Amita Misra <amis...@ucsc.edu> wrote:
> > >
> > > Thanks.
> > > Actually, there were 2 people running the same experiments, and the
> > > other person was doing it the way you have shown above.
> > > We were getting the same results, but since the methods were different I
> > > wanted to make sure that I am doing it the right way.
> > >
> > > Thanks,
> > > Amita
> > >
> > > On Thu, May 12, 2016 at 2:43 PM, Sebastian Raschka
> > > <se.rasc...@gmail.com> wrote:
> > > I see; that’s what I thought. At first glance, the approach (code)
> > > looks correct to me, but I haven’t done it this way yet. Typically, I use
> > > a more “manual” approach, iterating over the outer folds myself (since I
> > > typically use nested CV for algorithm selection):
> > >
> > >
> > > gs_est = … your gridsearch, pipeline, estimator with param grid and cv=5
> > > skfold = StratifiedKFold(y=y_train, n_folds=5, shuffle=True,
> > >                          random_state=123)
> > >
> > > for outer_train_idx, outer_valid_idx in skfold:
> > >     # inner loop: tune hyperparameters on the outer training part only
> > >     gs_est.fit(X_train[outer_train_idx], y_train[outer_train_idx])
> > >     # outer loop: score the tuned model on the held-out outer fold
> > >     y_pred = gs_est.predict(X_train[outer_valid_idx])
> > >     acc = accuracy_score(y_true=y_train[outer_valid_idx], y_pred=y_pred)
> > >     print(' | inner ACC %.2f%% | outer ACC %.2f%%' %
> > >           (gs_est.best_score_ * 100, acc * 100))
> > >     cv_scores[name].append(acc)
> > >
> > > However, it should essentially do the same thing as your code, if I see
> > > it correctly.
> > >
> > >
> > > > On May 12, 2016, at 4:50 PM, Amita Misra <amis...@ucsc.edu> wrote:
> > > >
> > > > Actually, I do not have an independent test set, and hence I want to
> > > > use it as an estimate of the generalization performance. My classifier
> > > > is a fixed SVM, and I want to learn the parameters and also get an
> > > > unbiased performance estimate using only one set of data.
> > > >
> > > > I wanted to ensure that my code correctly does a nested 10*5 CV, i.e.
> > > > that the parameters are learnt on one set and the final evaluation to
> > > > get the predicted score is done on a different set.
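
One way to convince oneself of that separation is to look at the indices alone. The following is a plain-Python sketch, not scikit-learn code: kfold_indices and nested_splits are hypothetical helpers, the folds are contiguous and unshuffled, and 100 samples is an arbitrary assumption.

```python
def kfold_indices(indices, n_folds):
    """Yield (train, valid) index lists using contiguous, unshuffled folds."""
    indices = list(indices)
    base, extra = divmod(len(indices), n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        valid = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, valid
        start += size


def nested_splits(n_samples, outer_folds=10, inner_folds=5):
    """Yield (outer_valid, inner_train, inner_valid) for every inner split."""
    for outer_train, outer_valid in kfold_indices(range(n_samples), outer_folds):
        # The inner CV only ever sees the outer training indices.
        for inner_train, inner_valid in kfold_indices(outer_train, inner_folds):
            yield outer_valid, inner_train, inner_valid


n_splits = 0
for outer_valid, inner_train, inner_valid in nested_splits(100):
    # Tuning data must never overlap the outer evaluation fold.
    assert not set(outer_valid) & (set(inner_train) | set(inner_valid))
    n_splits += 1

print(n_splits)  # 10 outer folds x 5 inner folds = 50
```

If the assertion holds for every split, the parameters are always selected on data that the outer evaluation fold never touches.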
> > > >
> > > > Amita
> > > >
> > > >
> > > >
> > > > On Thu, May 12, 2016 at 1:24 PM, Sebastian Raschka
> > > > <se.rasc...@gmail.com> wrote:
> > > > I would say there are 2 different applications of nested CV. You
> > > > could use it for algorithm selection (with hyperparam tuning in the inner
> > > > loop). Or, you could use it as an estimate of the generalization
> > > > performance (only hyperparam tuning), which has been reported to be less
> > > > biased than a k-fold CV estimate (Varma, S., & Simon, R. (2006). Bias
> > > > in error estimation when using cross-validation for model selection. BMC
> > > > Bioinformatics, 7, 91. http://doi.org/10.1186/1471-2105-7-91).
> > > >
> > > > By “you could use it as an estimate of the generalization
> > > > performance (only hyperparam tuning)” I mean as a replacement for k-fold on
> > > > the training set and evaluation on an independent test set.
> > > >
> > > > > On May 12, 2016, at 4:16 PM, Алексей Драль <aad...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > Hi Amita,
> > > > >
> > > > > As far as I understand your question, you only need one CV loop to
> > > > > optimize your objective with the scoring function provided:
> > > > >
> > > > > ===
> > > > > pipeline = Pipeline([('scale', preprocessing.StandardScaler()),
> > > > >                      ('filter', SelectKBest(f_regression)),
> > > > >                      ('svr', svm.SVR())])
> > > > > C_range = [0.1, 1, 10, 100]
> > > > > gamma_range = numpy.logspace(-2, 2, 5)
> > > > > param_grid = [{'svr__kernel': ['rbf'], 'svr__gamma': gamma_range,
> > > > >                'svr__C': C_range}]
> > > > > grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5,
> > > > >                            scoring=scoring_function)
> > > > > grid_search.fit(X_train, Y_train)
> > > > > ===
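
For a sense of how much work this grid implies, here is a small counting sketch in plain Python. It assumes the default behaviour of refitting the best candidate once on the full training data, and the 10 outer folds are taken from the nested setup in the original question. No scikit-learn objects are involved.

```python
from itertools import product

C_range = [0.1, 1, 10, 100]
# Same five values as numpy.logspace(-2, 2, 5): 0.01, 0.1, 1, 10, 100
gamma_range = [10.0 ** e for e in range(-2, 3)]

# One candidate per (kernel, gamma, C) combination in the grid.
candidates = list(product(['rbf'], gamma_range, C_range))

inner_folds = 5
# +1 assumes one final refit of the best candidate per grid search.
fits_per_search = len(candidates) * inner_folds + 1

outer_folds = 10
print(len(candidates))                # 20
print(fits_per_search)                # 101
print(outer_folds * fits_per_search)  # 1010
```

So a single grid search costs about a hundred pipeline fits, and wrapping it in a 10-fold outer loop multiplies that by ten.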
> > > > >
> > > > > You should be able to find more details in the documentation:
> > > > > • http://scikit-learn.org/stable/modules/grid_search.html#grid-search
> > > > > • http://scikit-learn.org/stable/modules/grid_search.html#gridsearch-scoring
> > > > >
> > > > > 2016-05-12 17:05 GMT+01:00 Amita Misra <amis...@ucsc.edu>:
> > > > > Hi,
> > > > >
> > > > > I have a limited dataset and hence want to learn the parameters
> > > > > and also evaluate the final model.
> > > > > From the documentation it looks like nested cross-validation is the
> > > > > way to do it. I have the code, but I still want to be sure that I am
> > > > > not overfitting in any way.
> > > > >
> > > > > pipeline = Pipeline([('scale', preprocessing.StandardScaler()),
> > > > >                      ('filter', SelectKBest(f_regression)),
> > > > >                      ('svr', svm.SVR())])
> > > > > C_range = [0.1, 1, 10, 100]
> > > > > gamma_range = numpy.logspace(-2, 2, 5)
> > > > > param_grid = [{'svr__kernel': ['rbf'], 'svr__gamma': gamma_range,
> > > > >                'svr__C': C_range}]
> > > > > grid_search = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
> > > > > Y_pred = cross_validation.cross_val_predict(grid_search, X_train,
> > > > >                                             Y_train, cv=10)
> > > > >
> > > > > correlation = numpy.ma.corrcoef(Y_train, Y_pred)[0, 1]
> > > > >
> > > > >
> > > > > Please let me know if my understanding is correct.
> > > > >
> > > > > This is 10*5 nested cross-validation. The inner CV folds over the
> > > > > training data run a grid search over the hyperparameters, and the outer
> > > > > folds evaluate the performance.
> > > > >
> > > > >
> > > > >
> > > > > Thanks,
> > > > > Amita
> > > > > --
> > > > > Amita Misra
> > > > > Graduate Student Researcher
> > > > > Natural Language and Dialogue Systems Lab
> > > > > Baskin School of Engineering
> > > > > University of California Santa Cruz
> > > > >
> > > > >
> > > > >
> ------------------------------------------------------------------------------
> > > > > Mobile security can be enabling, not merely restricting. Employees
> who
> > > > > bring their own devices (BYOD) to work are irked by the imposition
> of MDM
> > > > > restrictions. Mobile Device Manager Plus allows you to control
> only the
> > > > > apps on BYO-devices by containerizing them, leaving personal data
> untouched!
> > > > > https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
> > > > > _______________________________________________
> > > > > Scikit-learn-general mailing list
> > > > > Scikit-learn-general@lists.sourceforge.net
> > > > > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Yours sincerely,
> > > > > Alexey A. Dral
--
Amita Misra
Graduate Student Researcher
Natural Language and Dialogue Systems Lab
Baskin School of Engineering
University of California Santa Cruz