Re: [Scikit-learn-general] Subclassing vectorizers

2016-03-23 Thread Fred Mailhot
Thanks very much everyone; seems to be working now!

On 23 March 2016 at 00:58, Sebastian Raschka wrote:
> Hah, and I just wanted to write regarding the VotingClassifier; I
> remember my struggle quite well when I tried to make it pipeline and
> GridSearch compatible

2016-03-22 Thread Sebastian Raschka
Hah, and I just wanted to write regarding the VotingClassifier; I remember my struggle quite well when I tried to make it pipeline and GridSearch compatible until I figured that one out :P

> On Mar 23, 2016, at 12:34 AM, Joel Nothman wrote:
>
> And I lied that none

2016-03-22 Thread Joel Nothman
And I lied that none of the scikit-learn estimators define their own get_params. Of course the following do: VotingClassifier, Kernel (and subclasses), Pipeline and FeatureUnion.

On 23 March 2016 at 15:04, Joel Nothman wrote:
> something like the following may suffice:
>

2016-03-22 Thread Joel Nothman
Something like the following may suffice:

    def get_params(self, deep=True):
        out = super(WordCooccurrenceVectorizer, self).get_params(deep=deep)
        out['w2v_clusters'] = self.w2v_clusters
        return out

On 23 March 2016 at 15:01, Joel Nothman wrote:
> Hi Fred,
>
> We
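For readers following along, here is a rough, self-contained sketch of why an override like the one above can help. It does not use scikit-learn: SketchBase is a hypothetical stand-in that discovers parameters from the __init__ signature, and WordCooccurrenceSketch is an assumed shape for the subclass (the actual class from the thread is not shown), in which w2v_clusters arrives through **kwargs and is therefore invisible to signature-based discovery. The Python 3 form super() is used instead of the two-argument form above.

```python
import inspect


class SketchBase:
    """Hypothetical stand-in for a base estimator; not scikit-learn's code."""

    def get_params(self, deep=True):
        # Discover parameter names from the subclass's __init__ signature,
        # skipping `self` and any **kwargs catch-all.
        sig = inspect.signature(type(self).__init__)
        return {
            name: getattr(self, name)
            for name, p in sig.parameters.items()
            if name != "self" and p.kind is not p.VAR_KEYWORD
        }

    def set_params(self, **params):
        # Only names reported by get_params() are accepted.
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError("Invalid parameter %r" % name)
            setattr(self, name, value)
        return self


class WordCooccurrenceSketch(SketchBase):
    # Assumption for illustration: w2v_clusters is passed via **kwargs,
    # so the signature alone does not reveal it.
    def __init__(self, lowercase=True, **kwargs):
        self.lowercase = lowercase
        self.w2v_clusters = kwargs.pop("w2v_clusters", None)

    def get_params(self, deep=True):
        # Same idea as the override suggested in the thread.
        out = super().get_params(deep=deep)
        out["w2v_clusters"] = self.w2v_clusters
        return out


vec = WordCooccurrenceSketch(w2v_clusters={"cat": 0, "dog": 0})
params = vec.get_params()                   # now includes w2v_clusters
rebuilt = WordCooccurrenceSketch(**params)  # a clone-like rebuild keeps it
vec.set_params(w2v_clusters={"cat": 1})     # and a search could now set it
```

Without the override, get_params() in this sketch would report only lowercase, so a rebuild from the parameters would silently drop w2v_clusters and set_params would reject it.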

2016-03-22 Thread Joel Nothman
Hi Fred,

We use the __init__ signature to get the list of parameters that (a) can be set by grid search; and (b) need to be copied to a cloned instance of the estimator (with any fitted model discarded) in constructing ensembles, cross validation, etc. While none of the scikit-learn library of
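
The mechanism described in this message, parameters read off the __init__ signature and then reused to build an unfitted copy, can be sketched in plain Python roughly as follows. This is a simplified illustration under stated assumptions, not scikit-learn's implementation: init_param_names, get_params, clone and TinyVectorizer are all hypothetical names defined here.

```python
import inspect
from collections import Counter


def init_param_names(cls):
    """Names of the explicit parameters of cls.__init__, excluding self."""
    sig = inspect.signature(cls.__init__)
    return sorted(
        name
        for name, p in sig.parameters.items()
        if name != "self" and p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
    )


def get_params(estimator):
    """Read each discovered parameter back from the same-named attribute."""
    names = init_param_names(type(estimator))
    return {name: getattr(estimator, name) for name in names}


def clone(estimator):
    """Build a fresh instance from constructor parameters only."""
    return type(estimator)(**get_params(estimator))


class TinyVectorizer:
    # Hypothetical estimator, used only for illustration.
    def __init__(self, lowercase=True, min_count=1):
        self.lowercase = lowercase
        self.min_count = min_count

    def fit(self, docs):
        tokens = " ".join(docs).split()
        if self.lowercase:
            tokens = [t.lower() for t in tokens]
        counts = Counter(tokens)
        # Fitted state: tokens seen at least min_count times.
        self.vocabulary_ = sorted(t for t, c in counts.items() if c >= self.min_count)
        return self


vec = TinyVectorizer(min_count=2).fit(["Hello world", "hello again"])
copy = clone(vec)
params = get_params(copy)                       # constructor parameters survive
has_fitted_state = hasattr(copy, "vocabulary_") # fitted state does not
```

In this sketch, anything stored on the instance that is not named in the __init__ signature is neither settable by a parameter search nor carried over to the copy, which is the situation points (a) and (b) describe.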