Hi Olivier,
Thank you for your feedback. I'll follow your recommendations.
Instance reduction is the opposite of oversampling (like an inverse SMOTE),
so it would require the pipeline to accept transformers that change the
number of samples (axis=0 in the input data). Maybe in the future I could
switch to that approach!
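
To make the point concrete, here is a minimal sketch of what I mean. It is
not the scikit-protopy code: CondensedNN and OneNN are hypothetical names,
OneNN is only a pure-Python stand-in for a real KNN classifier, and the
reduction rule is Hart's condensed nearest neighbour. It takes the classifier
as `base_estimator`, and `reduce` returns fewer rows for both X and y, which
is what a transformer-style `transform(X)` cannot express.

```python
import copy


class OneNN:
    """Stand-in 1-nearest-neighbour classifier exposing fit/predict only."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

        labels = []
        for x in X:
            # Index of the stored sample closest to x.
            nearest = min(range(len(self.X_)), key=lambda i: sq_dist(x, self.X_[i]))
            labels.append(self.y_[nearest])
        return labels


class CondensedNN:
    """Hypothetical reducer: keeps a subset that the base estimator needs
    in order to classify every training sample correctly."""

    def __init__(self, base_estimator=None):
        # The classifier is passed in as an object, not as loose arguments.
        self.base_estimator = base_estimator

    def fit(self, X, y):
        base = OneNN() if self.base_estimator is None else self.base_estimator
        keep = [0]  # seed the store with the first sample
        changed = True
        while changed:
            changed = False
            for i in range(len(X)):
                if i in keep:
                    continue
                # Refit a fresh copy on the current store only.
                model = copy.deepcopy(base)
                model.fit([X[j] for j in keep], [y[j] for j in keep])
                # A misclassified sample is added to the store.
                if model.predict([X[i]])[0] != y[i]:
                    keep.append(i)
                    changed = True
        self.keep_ = sorted(keep)
        return self

    def reduce(self, X, y):
        # Both X and y shrink along axis=0.
        return [X[i] for i in self.keep_], [y[i] for i in self.keep_]


X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]

reducer = CondensedNN(base_estimator=OneNN()).fit(X, y)
X_red, y_red = reducer.reduce(X, y)
```

For now the reduced set has to be computed outside of a Pipeline and then
passed to the final classifier by hand.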
Again, thanks!
On Fri, Jul 4, 2014 at 5:28 AM, Olivier Grisel <[email protected]>
wrote:
> 2014-07-04 3:35 GMT+02:00 Dayvid Victor <[email protected]>:
> > Hi Olivier,
> >
> > I solved this issue, but after talking to some people on the mailing list,
> > they advised me to start a new project (already referenced in the wiki)
> > and later think about including instance reduction in sklearn.
> >
> > https://github.com/dvro/scikit-protopy (name is not final yet);
> >
> > If you could take a look and give me some pointers, like:
> >
> > Is the use of 'fit' and 'reduce' ok? Or should I use 'transform'?
>
> I don't know what instance reduction is about, but if you want to use
> it as a feature transformation layer to be used in a sklearn Pipeline
> you have to implement the transformer interface:
>
> Make sure to have a look at:
> http://scikit-learn.org/dev/developers/#apis-of-scikit-learn-objects
>
> However keep in mind that the current Pipeline implementation does not
> accept transformers that change the number of samples (axis=0 in the
> input data). This is a known limitation that prevents us from using a
> pipeline of transformers to perform resampling operations.
>
> > Should I do the classifier setup in the __init__ (passing all arguments of
> > the KNN to the InstanceReduction constructor)?
>
> You might want to pass a KNN instance as `base_estimator` directly,
> following the pattern used in BaggingClassifier for instance.
>
> > Do you think I should call it scikits.protopy (use: from scikits.protopy import A)
> > in order to follow the scikits pattern?
>
> We abandoned this pattern in sklearn years ago, as namespace
> packages tend to cause a lot of issues with installation tools.
>
> I would rather recommend using a flat name instead (both for the
> project name and the package name).
>
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
--
*Dayvid Victor R. de Oliveira*
PhD Candidate in Computer Science at Federal University of Pernambuco (UFPE)
MSc in Computer Science at Federal University of Pernambuco (UFPE)
BSc in Computer Engineering - Federal University of Pernambuco (UFPE)