Re: [Scikit-learn-general] Feature selection and cross validation; and identifying chosen features

2015-02-11 Thread Andy
On 02/11/2015 04:22 PM, Timothy Vivian-Griffiths wrote: Hi Gilles, Thank you so much for clearing this up for me. So, am I right in thinking that the feature selection is carried out for every CV fold, and then once the best parameters have been found, the pipeline is then run on the whole

Re: [Scikit-learn-general] Feature selection and cross validation; and identifying chosen features

2015-02-11 Thread Vlad Niculae
On 11 Feb 2015, at 16:31, Andy t3k...@gmail.com wrote: On 02/11/2015 04:22 PM, Timothy Vivian-Griffiths wrote: Hi Gilles, Thank you so much for clearing this up for me. So, am I right in thinking that the feature selection is carried out for every CV fold, and then once the best

[Scikit-learn-general] Feature selection and cross validation; and identifying chosen features

2015-02-11 Thread Timothy Vivian-Griffiths
Hi Gilles, Thank you so much for clearing this up for me. So, am I right in thinking that the feature selection is carried out for every CV fold, and then once the best parameters have been found, the pipeline is then run on the whole training set in order to get the .best_estimator_? One final
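
The behaviour asked about here can be checked with a small experiment. The following is only a sketch under assumptions that are not in the original message: the data is synthetic, the step names and grid values are invented for illustration, and it uses the current sklearn.model_selection import path rather than whatever module was in use in 2015. It relies on GridSearchCV's default refit=True, which refits the best pipeline on all the data passed to fit().

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic data, purely illustrative (not from the thread).
X, y = make_classification(n_samples=150, n_features=12,
                           n_informative=4, random_state=0)

# Hypothetical step names; the selector sits inside the pipeline, so it is
# re-fitted on the training part of every CV fold during the search.
pipe = Pipeline([
    ("feature_selection", SelectKBest(f_classif)),
    ("clf", SVC(kernel="linear")),
])

# refit=True is the default: after the search, the best parameter setting
# is fitted once more on all of X, y and stored as best_estimator_.
grid = GridSearchCV(pipe, {"feature_selection__k": [3, 6, 9]}, cv=5)
grid.fit(X, y)

# If the refit used the whole training set, the selector's scores must
# match the scores computed directly on the full data.
refit_selector = grid.best_estimator_.named_steps["feature_selection"]
full_data_scores, _ = f_classif(X, y)
matches_full_data = np.allclose(refit_selector.scores_, full_data_scores)
print(matches_full_data)
```

The comparison against f_classif(X, y) is just one convenient way to confirm that the selector inside best_estimator_ saw all of the data rather than a single fold.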

Re: [Scikit-learn-general] Feature selection and cross validation; and identifying chosen features

2015-02-11 Thread Joel Nothman
You could use grid2.best_estimator_.named_steps['feature_selection'].get_support(), or .transform(feature_names) instead of .get_support(). Note for instance that if you have a pipeline of multiple feature selectors, for some reason, .transform(feature_names) remains useful while .get_support()
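
A minimal sketch of the get_support() route, for illustration only. The object name grid2 and the step name 'feature_selection' come from the message above; everything else (the synthetic data, the feature_names array, SelectKBest as the selector, the parameter grid, and the modern sklearn.model_selection import) is an assumption made so the example runs on its own.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic data and made-up feature names (assumptions, not from the thread).
X, y = make_classification(n_samples=120, n_features=10,
                           n_informative=3, random_state=0)
feature_names = np.array(["f%d" % i for i in range(X.shape[1])])

# The step name 'feature_selection' matches the one used in the message.
pipe = Pipeline([
    ("feature_selection", SelectKBest(f_classif)),
    ("clf", SVC(kernel="linear")),
])
param_grid = {"feature_selection__k": [2, 4, 6], "clf__C": [0.1, 1.0]}
grid2 = GridSearchCV(pipe, param_grid, cv=3)
grid2.fit(X, y)

# get_support() returns a boolean mask over the input features of the
# fitted selector inside the refitted best pipeline.
selector = grid2.best_estimator_.named_steps["feature_selection"]
mask = selector.get_support()

# Index the name array with the mask to recover the chosen feature names.
chosen = feature_names[mask]
print(chosen)
```

get_support(indices=True) returns integer positions instead of a boolean mask, which can be handier when the names live in a plain Python list.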

Re: [Scikit-learn-general] Feature selection and cross validation; and identifying chosen features

2015-02-11 Thread Gilles Louppe
On 11 February 2015 at 22:22, Timothy Vivian-Griffiths vivian-griffith...@cardiff.ac.uk wrote: Hi Gilles, Thank you so much for clearing this up for me. So, am I right in thinking that the feature selection is carried out for every CV fold, and then once the best parameters have been found, the

Re: [Scikit-learn-general] Feature selection and cross validation

2015-02-10 Thread Gilles Louppe
Hi Tim, On 9 February 2015 at 19:54, Timothy Vivian-Griffiths vivian-griffith...@cardiff.ac.uk wrote: Just a quick follow up to some of the previous problems that I have had: after getting some kind assistance at the PyData London meetup last week, I found out why I was getting different

[Scikit-learn-general] Feature selection and cross validation

2015-02-09 Thread Timothy Vivian-Griffiths
Just a quick follow up to some of the previous problems that I have had: after getting some kind assistance at the PyData London meetup last week, I found out why I was getting different results using an SVC in R, and it was happening because R scales the inputs automatically whereas sklearn
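
The message is cut off at this point, so the following is only a sketch under the assumption that the contrast being drawn is that scikit-learn's SVC leaves input scaling to the user. One common way to handle that is to put a StandardScaler in front of the SVC in a pipeline; the data and the exaggerated feature scales below are invented for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data with deliberately mismatched feature scales (illustrative).
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
X = X * np.array([1.0, 10.0, 100.0, 1000.0, 0.01])

# SVC does not standardise its inputs, so scaling is added explicitly.
# Inside a pipeline the scaler is fitted on training data only whenever
# the pipeline is cross-validated.
scaled_svc = make_pipeline(StandardScaler(), SVC())
scaled_svc.fit(X, y)

# make_pipeline names each step after its lowercased class name.
scaler = scaled_svc.named_steps["standardscaler"]
X_scaled = scaler.transform(X)
print(X_scaled.mean(axis=0).round(6), X_scaled.std(axis=0).round(6))
```

Keeping the scaler inside the pipeline, rather than scaling X once up front, avoids leaking test-fold statistics into the training folds during cross-validation.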