A little question regarding how it’s currently handled ...
So, if I have one of scikit-learn’s feature selectors in a pipeline, and it
selected e.g. the features idx=[1, 12, 23] after “.fit”. Now, if I use
“.predict” on that pipeline, wouldn’t the feature selector’s transform method
only pass X[:, [1, 12, 23]] on to the next step?
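(A minimal sketch of the behaviour in question, assuming a SelectKBest step; any selector exposing get_support() behaves the same way. The fitted selector's transform() slices out the selected columns at predict time, so the pipeline still expects the full X as input.)
"""
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=200, n_features=30, random_state=0)

pipe = Pipeline([
    ('select', SelectKBest(f_classif, k=3)),
    ('clf', LogisticRegression()),
])
pipe.fit(X, y)

# Which columns survived the fit?
print(pipe.named_steps['select'].get_support(indices=True))

# At predict time the selector's transform() does essentially X[:, idx],
# so the full feature matrix is still required as input to the pipeline.
print(pipe.predict(X[:5]))
"""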
Cool, thanks for feedback!
Any outstanding PRs addressing something like this or anyone on this list
been thinking of/working on solutions?
I imagine it might be implemented as a step in a pipeline (e.g.
FeatureRemover()) and be generally applicable / potentially benefit many
sklearners. Not sure i
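(A rough sketch of what such a step could look like. FeatureRemover is hypothetical, not an existing scikit-learn class; it simply drops the listed columns so that downstream steps never see them.)
"""
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class FeatureRemover(BaseEstimator, TransformerMixin):
    # Stateless pipeline step that drops the given column indices.
    def __init__(self, indices=()):
        self.indices = indices

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        keep = np.setdiff1d(np.arange(X.shape[1]), self.indices)
        return X[:, keep]

# Usage (hypothetical): Pipeline([('drop', FeatureRemover([1, 12, 23])), ('clf', ...)])
"""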
Dear all,
I've created a pull request with new functions to perform feature selection
using IG (information gain) and GR (gain ratio): https://github.com/scikit-learn/scikit-learn/pull/6534
These are two popular feature selection methods; it would be great if
scikit-learn implemented them.
The implementation closely follows th
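(For readers who do not know the acronyms: a rough illustration of how information gain and gain ratio can be scored per discrete feature. This is only a sketch, not the code from the pull request.)
"""
import numpy as np

def _entropy(values):
    # Shannon entropy (in bits) of a discrete array.
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(x, y):
    # IG(y; x) = H(y) - H(y | x) for a single discrete feature x.
    h_y_given_x = sum((x == v).mean() * _entropy(y[x == v]) for v in np.unique(x))
    return _entropy(y) - h_y_given_x

def gain_ratio(x, y):
    # GR normalises IG by the feature's own entropy H(x).
    h_x = _entropy(x)
    return information_gain(x, y) / h_x if h_x > 0 else 0.0

# Per-feature scores for a discrete feature matrix X (numpy arrays assumed):
# scores = [information_gain(X[:, j], y) for j in range(X.shape[1])]
"""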
Currently there is no automatic mechanism for eliminating the generation of
features that are not selected downstream. It needs to be achieved manually.
On 15 March 2016 at 08:05, Philip Tully wrote:
> Hi,
>
> I'm trying to optimize the time it takes to make a prediction with my
> model(s). I re
Hi,
I'm trying to optimize the time it takes to make a prediction with my
model(s). I realized that when I perform feature selection during the
model fit(), the features that were dropped are likely still computed when I
go to predict() or predict_proba(). An optimization would then involve
actually eliminating
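(There is no built-in way to skip computing the dropped features, but a manual workaround is to read get_support() after fitting, restrict feature extraction to those columns upstream, and call the final estimator directly. A sketch, with the column slicing standing in for "only compute these features":)
"""
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=50, random_state=0)
pipe = Pipeline([('select', SelectKBest(k=5)), ('clf', LogisticRegression())])
pipe.fit(X, y)

keep = pipe.named_steps['select'].get_support(indices=True)

# At prediction time, only produce the `keep` columns and feed them straight
# to the final estimator, bypassing the (now redundant) selector step:
X_new_small = X[:10, keep]
print(pipe.named_steps['clf'].predict(X_new_small))
"""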
Ah, the API changes...
but now I'm getting something like:
import mlxtend.classifier.EnsembleClassifier
Traceback (most recent call last):
File "", line 1, in
File "mlxtend/classifier/__init__.py", line 8, in
from .ensemble import EnsembleClassifier
File "mlxtend/classifier/ensemble.p
Hey,
the mlxtend library worked great on my computer.
Now I installed it on a server.
import mlxtend works fine,
but if I want to import the EnsembleClassifier it gives me an error like:
from mlxtend.sklearn import EnsembleClassifier :
"No module named sklearn"
import sklearn works also.
Doe
Hi, Herbert,
I can't help you with the accuracy problem since this can be due to many
different things. However, there is now a way to combine different classifiers
for majority rule voting, the sklearn.ensemble.VotingClassifier. It is not in
the current stable release yet but you could get it
Thanks, that helped.
But I just can't get a higher accuracy than 45%... don't know why, also
with logistic regression and so on...
Is there a way to combine for example an SVM with a decision tree?
Herb
On 2 June 2015 at 11:19, Michael Eickenberg
wrote:
> Some configurations are not implemente
Some configurations are not implemented or difficult to evaluate in the
dual. Setting dual=True/False doesn't change the result, so please don't
vary it as you would vary other parameters. It can however sometimes yield
a speed-up. Here you should try setting dual=False as a first means of
debugging.
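(A minimal illustration of the constraint being described, using the newer loss/penalty naming; exact spellings differ between scikit-learn versions.)
"""
from sklearn.svm import LinearSVC

# L1 penalty with the squared hinge loss is only implemented in the primal,
# so it requires dual=False:
clf = LinearSVC(penalty='l1', loss='squared_hinge', dual=False)

# With the default L2 penalty, dual=True and dual=False give the same result;
# the setting only affects which formulation is solved (and hence speed):
clf_primal = LinearSVC(penalty='l2', loss='squared_hinge', dual=False)
"""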
Does anyone know why this failure occurs?
ValueError: Unsupported set of arguments: loss='l1' and
penalty='squared_hinge' are not supported when dual=True, Parameters:
penalty='l1', loss='squared_hinge', dual=True
I'm using a LinearSVC (in Andreas' example code).
On 1 June 2015 at 13:38, Herber
Cool, thx for that!
Herb
On 1 June 2015 at 12:16, JAGANADH G wrote:
> Hi
>
> I have listed sklearn feature selection with minimal examples here
>
>
> http://nbviewer.ipython.org/github/jaganadhg/data_science_notebooks/blob/master/sklearn/scikit_learn_feature_selection.ipynb
>
> Jagan
>
> On Th
Hi
I have listed sklearn feature selection with minimal examples here
http://nbviewer.ipython.org/github/jaganadhg/data_science_notebooks/blob/master/sklearn/scikit_learn_feature_selection.ipynb
Jagan
On Thu, May 28, 2015 at 10:14 PM, Herbert Schulz
wrote:
> Thank's to both of you!!! I realy
Thanks to both of you!!! I really appreciate it! I will try everything this
weekend.
Best regards,
Herb
On 28 May 2015 at 18:21, Sebastian Raschka wrote:
> I agree with Andreas,
> typically, a large number of features also shouldn't be a big problem for
> random forests in my experience; howev
I agree with Andreas,
typically, a large number of features also shouldn't be a big problem for
random forests in my experience; however, it of course depends on the number of
trees and training samples.
If you suspect that overfitting might be a problem using unregularized
classifiers, also co
Hi Herbert.
1) Often reducing the features space does not help with accuracy, and
using a regularized classifier leads to better results.
2) To do feature selection, you need two methods: one to reduce the set
of features, another that does the actual supervised task
(classification here).
Ha
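(A sketch of the two-method setup being described: a univariate selector to shrink the feature set, then the actual classifier, chained in a Pipeline. The data below roughly mimics the problem in the original question, 800 samples with 2048 binary features and 6 classes; k=200 is arbitrary and should be tuned.)
"""
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=800, n_features=2048, n_informative=50,
                           n_classes=6, n_clusters_per_class=1, random_state=0)
X = (X > 0).astype(int)                      # crude stand-in for 0/1 features

pipe = Pipeline([
    ('reduce', SelectKBest(chi2, k=200)),    # method 1: reduce the feature set
    ('clf', LinearSVC(dual=False)),          # method 2: the supervised task
])
pipe.fit(X, y)
"""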
Hello,
I'm using scikit-learn for machine learning.
I have 800 samples with 2048 features, therefore I want to reduce my
features to hopefully get a better accuracy.
It is a multiclass problem (classes 0-5), and the features consist of 1's
and 0's: [1,0,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0]
I'm us
Hi Tim.
Nearly everything in scikit-learn will assume numeric features, or
one-hot encoded categorical features.
You can feed categorical variables encoded as integers, but usually this
will not result in the desired behavior.
For the ordinal (ordered) data, tree-based methods like the
RandomFor
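(A small sketch of the two cases; the columns are made up. One-hot encode nominal categories; ordinal integer codes can be fed as-is to tree-based models, which only use the ordering to pick split thresholds.)
"""
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Nominal (unordered) categories, e.g. an id column: one-hot encode.
ids = np.array([[100], [99], [100], [42]])
ids_onehot = OneHotEncoder().fit_transform(ids).toarray()

# Ordinal (ordered) values, e.g. version numbers: a RandomForestClassifier
# can consume the raw integer coding directly.
versions = np.array([[1], [2], [3], [2]])
"""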
Hi all,
In my classification problem,
some features are numerical (e.g. 10.1, 1), and
some features are categorical though numerically coded as nonnegative numbers
(such as id coded as 100, 99), and
some features are ordered though numerically coded as nonnegative numbers (such
as versions
On 11 February 2015 at 22:22, Timothy Vivian-Griffiths
wrote:
> Hi Gilles,
>
> Thank you so much for clearing this up for me. So, am I right in thinking
> that the feature selection is carried for every CV-fold, and then once the
> best parameters have been found, the pipeline is then run on the
You could use
grid2.best_estimator_.named_steps['feature_selection'].get_support(),
or .transform(feature_names) instead of .get_support(). Note for instance
that if you have a pipeline of multiple feature selectors, for some reason,
.transform(feature_names) remains useful while .get_support() do
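(A compact sketch of that pattern with current import paths; the step name 'feature_selection' and the feature names are made up.)
"""
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
feature_names = np.array(['f%d' % i for i in range(X.shape[1])])

pipe = Pipeline([('feature_selection', SelectKBest()), ('svc', SVC())])
grid2 = GridSearchCV(pipe, {'feature_selection__k': [5, 10], 'svc__C': [1, 10]})
grid2.fit(X, y)

# Boolean mask of the features kept by the refit best_estimator_:
mask = grid2.best_estimator_.named_steps['feature_selection'].get_support()
print(feature_names[mask])
"""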
> On 11 Feb 2015, at 16:31, Andy wrote:
>
>
> On 02/11/2015 04:22 PM, Timothy Vivian-Griffiths wrote:
>> Hi Gilles,
>>
>> Thank you so much for clearing this up for me. So, am I right in thinking
>> that the feature selection is carried for every CV-fold, and then once the
>> best parameters
On 02/11/2015 04:22 PM, Timothy Vivian-Griffiths wrote:
> Hi Gilles,
>
> Thank you so much for clearing this up for me. So, am I right in thinking
> that the feature selection is carried for every CV-fold, and then once the
> best parameters have been found, the pipeline is then run on the whole
Hi Gilles,
Thank you so much for clearing this up for me. So, am I right in thinking that
the feature selection is carried out for every CV fold, and then once the best
parameters have been found, the pipeline is refit on the whole training set
in order to get the .best_estimator_?
One final th
Hi Tim,
On 9 February 2015 at 19:54, Timothy Vivian-Griffiths
wrote:
> Just a quick follow up to some of the previous problems that I have had:
> after getting some kind assistance at the PyData London meetup last week, I
> found out why I was getting different results using an SVC in R, and it w
Just a quick follow up to some of the previous problems that I have had: after
getting some kind assistance at the PyData London meetup last week, I found out
why I was getting different results using an SVC in R, and it was happening
because R scales the inputs automatically whereas sklearn does not.
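(The usual fix on the sklearn side, sketched: make the scaling explicit by prepending a StandardScaler to the SVC.)
"""
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Scaling is not applied automatically; ask for it explicitly:
clf = Pipeline([('scale', StandardScaler()), ('svc', SVC(kernel='rbf'))])
"""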
On 11/02/2014 04:15 PM, Lars Buitinck wrote:
> 2014-11-02 22:09 GMT+01:00 Andy :
>>> No. That would be backward stepwise selection. Neither that, nor its
>>> forward cousin (find most discriminative feature, then second-most,
>>> etc.) are implemented in scikit-learn.
>>>
>> Isn't RFE the backward
2014-11-02 22:09 GMT+01:00 Andy :
>> No. That would be backward stepwise selection. Neither that, nor its
>> forward cousin (find most discriminative feature, then second-most,
>> etc.) are implemented in scikit-learn.
>>
> Isn't RFE the backward step selection using a maximum number of features?
On 10/20/2014 04:29 PM, Lars Buitinck wrote:
> 2014-10-20 22:08 GMT+02:00 George Bezerra :
>> Not an expert, but I think the idea is that you remove (or add) features one
>> by one, starting from the ones that have the least (or most) impact.
>>
>> E.g., try removing a feature, if performance impro
There are feature selection algorithms based on evolutionary algorithms,
so, despite the exponential search space, you can fix the number of
evaluations.
Experimentally, this approach has found optimal solutions for
instance/feature/classifier selection without exploring the whole search
space.
2014-10-21 4:14 GMT+02:00 Joel Nothman :
> I assume Robert's query is about RFECV.
Oh wait, RFE = backward subset selection. I'm an idiot, sorry.
I assume Robert's query is about RFECV.
On 21 October 2014 07:35, Manoj Kumar
wrote:
> Hi,
>
> No expert here, either but there are also feature selection classes which
> compute the score per feature.
>
> A simple example would be the f_classif, which in a very broad way
> measures how a certai
*Roberto
On 21 October 2014 13:14, Joel Nothman wrote:
> I assume Robert's query is about RFECV.
>
> On 21 October 2014 07:35, Manoj Kumar
> wrote:
>
>> Hi,
>>
>> No expert here, either but there are also feature selection classes which
>> compute the score per feature.
>>
>> A simple example w
Hi,
No expert here either, but there are also feature selection classes which
compute a score per feature.
A simple example would be f_classif, which in a very broad way compares
how a certain feature varies across all the classes to how it varies
within a particular class (a naive expla
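(A tiny illustration of the per-feature scores being described, using f_classif on iris; the scores can then drive SelectKBest or SelectPercentile.)
"""
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# f_classif computes a one-way ANOVA F-value per feature: roughly, how much
# the feature varies across the class means relative to within each class.
scores, pvalues = f_classif(X, y)
print(scores)

# The same criterion used as a selector, keeping the two best features:
X_top2 = SelectKBest(f_classif, k=2).fit_transform(X, y)
"""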
2014-10-20 22:08 GMT+02:00 George Bezerra :
> Not an expert, but I think the idea is that you remove (or add) features one
> by one, starting from the ones that have the least (or most) impact.
>
> E.g., try removing a feature, if performance improves, keep it that way and
> move on to the next fea
Not an expert, but I think the idea is that you remove (or add) features
one by one, starting from the ones that have the least (or most) impact.
E.g., try removing a feature; if performance improves, keep it that way and
move on to the next feature. It's a greedy approach: not optimal, but it
avoids
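(A rough, unoptimized sketch of that greedy backward pass, scoring each candidate removal with cross-validation; newer scikit-learn versions also ship a SequentialFeatureSelector that implements this idea properly.)
"""
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=15, random_state=0)
est = LogisticRegression()

# Greedily drop a feature whenever its removal does not hurt the CV score;
# repeat until no single removal helps.
kept = list(range(X.shape[1]))
best = cross_val_score(est, X[:, kept], y).mean()
improved = True
while improved and len(kept) > 1:
    improved = False
    for j in list(kept):
        trial = [k for k in kept if k != j]
        score = cross_val_score(est, X[:, trial], y).mean()
        if score >= best:
            best, kept, improved = score, trial, True
            break
print(kept, best)
"""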
I'm not sure if I correctly understood the feature selection algorithms.
Basically, is accuracy, or any other scoring function, used to determine whether
to keep a specific feature or not? If so, how is the optimal subset of features
determined? Brute force would be exponential in complexity.
Th
> Date: Thu, 9 Oct 2014 06:58:46 +0200
> From: peter.z...@gmail.com
> To: scikit-learn-general@lists.sourceforge.net
> Subject: Re: [Scikit-learn-general] Feature selection: floating search
> algorithm
>
> Hi Nikolay,
>
> On Wed, Oct 8, 2014 at 10:03 PM, Nikolay Mayorov
Hi Nikolay,
On Wed, Oct 8, 2014 at 10:03 PM, Nikolay Mayorov wrote:
> Hi!
>
> Do you think scikit-learn will benefit from the general algorithm of feature
> selection as described by P.Pudil et al. in "Floating search methods in
> feature selection"?
>
> It is a wrapper method which alternates f
Hi!
Do you think scikit-learn will benefit from the general algorithm of feature
selection as described by P.Pudil et al. in "Floating search methods in
feature selection"?
It is a wrapper method which alternates feature additions and removals
(starting from an empty or full set of features).
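(Not in scikit-learn itself, but the floating variant from Pudil et al. is available, for example, in mlxtend, which also comes up elsewhere on this list; a sketch, assuming mlxtend is installed and using its current parameter names.)
"""
from mlxtend.feature_selection import SequentialFeatureSelector
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=100, n_features=10, random_state=0)

sfs = SequentialFeatureSelector(KNeighborsClassifier(),
                                k_features=5,
                                forward=True,
                                floating=True,   # allow conditional removals
                                cv=3)
sfs = sfs.fit(X, y)
print(sfs.k_feature_idx_)
"""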
Hi Andreas,
you can find an extensive description of those techniques in this doctoral
thesis from a friend of mine at Oxford University (pp. 99-105), together
with the appropriate references.
http://people.maths.ox.ac.uk/tsanas/Preprints/DPhil%20thesis.pdf
Let me provide you with a brief summ
Hi Andrea.
Thanks a lot for wanting to contribute.
Could you elaborate a bit on the algorithms that you want to implement
(i.e. a reference paper) and their usage? I haven't heard of them
(except Gram-Schmidt, but I'm not sure how that works in this context)
and I am sure others could use some detail
Hi everybody,
my name is Andrea Bravi, I have been subscribed to this mailing list for
quite some time, however I have just finished convincing myself that I
should contribute to this cool project, rather than simply using it.
As a brief introduction, I am a researcher applying machine learning
m
>
> That said, as Olivier mentioned, the GradientBoostingClassifier could
>> implement a "transform", and that might be a good idea.
>>
>
> Ok, then maybe that's something I can tackle if it's not too hairy?
>
>
I tried something really dumb, but it seems to work in my case:
"""
class ExtGradien
> On Wed, Jul 17, 2013 at 09:09:02AM +0200, Eustache DIEMERT wrote:
> > Ok, then for folks like me that come to numpy because (thanks to)
> sklearn, than
> > why not point a (few) good tutorials somewhere in the docs ?
>
> Indeed. What would people think of pointing to the scipy-lectures
> (http://
On Wed, Jul 17, 2013 at 09:09:02AM +0200, Eustache DIEMERT wrote:
> Ok, then for folks like me that come to numpy because (thanks to) sklearn,
> than
> why not point a (few) good tutorials somewhere in the docs ?
Indeed. What would people think of pointing to the scipy-lectures
(http://scipy-lec
I agree that the narrative feature selection documentation should
include an inline toy example to demonstrate how to combine a selector
transformer in a pipeline, as this is the canonical way to use
feature selection, especially if you want to cross validate the impact
of the feature selection hyperparameters.
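(Something like the inline toy example being asked for, sketched: a selector inside a Pipeline whose selection hyperparameter is cross-validated together with the classifier.)
"""
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
pipe = Pipeline([('select', SelectPercentile(chi2)),
                 ('svc', LinearSVC(dual=False))])

# Cross-validate the amount of feature selection jointly with the classifier:
grid = GridSearchCV(pipe, {'select__percentile': [5, 10, 20, 50]}, cv=3)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
"""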
Mmm
Maybe just including the simple pipeline you provide in the feature
selection doc [1] would suffice to point to the recommended way to do that?
Like a sub-sub-section dubbed "Including feature selection in a prediction
pipeline"?
What do you think?
Would it be too detailed? Should we le
>
> Yes. Learn numpy. Seriously, this may sound provocative but it's the
> biggest favor you can do yourself.
Ok, then for folks like me that come to numpy because of (thanks to) sklearn,
why not point to a (few) good tutorials somewhere in the docs?
I mean if it's an implicit requirement, then
2013/7/16 Olivier Grisel
> Feature selectors should implement the `Transformer` API so that they
> can be used in a Pipeline and make it possible to cross validate them.
>
>
That's what I thought too. Do we have an example of cross-validating
feature selection + learning?
> The univariate feat
On Tue, Jul 16, 2013 at 05:09:09PM +0200, Eustache DIEMERT wrote:
> What is missing IMHO is a simple example on how to actually transform the
> dataset after the initial feature selection !
I beg to disagree. We have a huge amount of examples. Probably too many.
We need to move people away from co
Oh, well that's sad! Given that it assigns feature_importances_, is there
any reason it should not incorporate the mixin to provide it with
transform()? (I assumed that transform was available wherever
feature_importances_ was.)
On Wed, Jul 17, 2013 at 3:38 PM, Gael Varoquaux <
gael.varoqu...@nor
Hey Joel,
I am afraid that I think that the GradientBoostingClassifier does not
implement the transform method.
Gaël
On Wed, Jul 17, 2013 at 07:42:20AM +1000, Joel Nothman wrote:
> Sorry, I made a mistake: unless the classifier has penalty=l1, its default
> feature selection threshold (as used i
Sorry, I made a mistake: unless the classifier has penalty=l1, its default
feature selection threshold (as used in a pipeline currently) is the mean
feature importance score.
On Wed, Jul 17, 2013 at 7:11 AM, Joel Nothman
wrote:
> For your example, Eustache, the following would work (with a dense
For your example, Eustache, the following would work (with a dense or
sparse X):
"""
clf = GradientBoostingClassifier()
clf.fit(X, y)
clf.fit(clf.transform(threshold=1e-3), y)
"""
Alternatively, use a Pipeline:
"""
clf = Pipeline([
    ('sel', GradientBoostingClassifier()),
    ('clf', GradientBoostingClassifier()),
])
"""
Feature selectors should implement the `Transformer` API so that they
can be used in a Pipeline and make it possible to cross validate them.
The univariate feature selectors already implement the transformer API:
http://scikit-learn.org/stable/modules/feature_selection.html#univariate-feature-sel
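(Because selectors are transformers, the whole chain can be treated as one estimator; a minimal sketch. Re-fitting the selection inside each fold keeps the CV score honest.)
"""
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=40, random_state=0)

pipe = Pipeline([('select', SelectKBest(f_classif, k=10)),
                 ('clf', LogisticRegression())])

# The selector is re-fit inside every fold, so the selection itself is
# cross-validated along with the classifier.
print(cross_val_score(pipe, X, y, cv=5).mean())
"""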
Hi Sklearners,
I was trying out several feature selection methods of sklearn on the Arcene
dataset [1] and it occurred to me that despite the numerous examples [2] in
the docs, most of them were just plotting/printing the most relevant features.
What is missing IMHO is a simple example of how to actually transform the
dataset after the initial feature selection.
On 02/22/2013 12:03 PM, Christian wrote:
> Hi,
>
> when I train a classification model with feature selected data, I'll
> need for future scoring issues the selector object and the model object.
> So I'll must persist both ( i.e. with pickle ), right ?
Yes.
But the selector is just a mask of siz
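(One way to avoid juggling two pickles, sketched: keep the selector and the model in a single fitted Pipeline and persist that one object.)
"""
import pickle
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=200, n_features=30, random_state=0)
pipe = Pipeline([('select', SelectKBest(k=5)),
                 ('clf', LogisticRegression())]).fit(X, y)

# The pickled Pipeline carries the selector mask and the model together:
restored = pickle.loads(pickle.dumps(pipe))
print(restored.predict(X[:3]))
"""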
Hi,
when I train a classification model with feature-selected data, I'll
need both the selector object and the model object for future scoring.
So I must persist both (i.e. with pickle), right?
Many thanks
Christian
On Fri, Jun 15, 2012 at 4:50 PM, Yaroslav Halchenko wrote:
>
> On Fri, 15 Jun 2012, josef.p...@gmail.com wrote:
>> https://github.com/PyMVPA/PyMVPA/blob/master/mvpa2/misc/dcov.py#L160
>> looks like a double sum, but wikipedia only has one sum, elementwise product.
>
> sorry -- I might be slow -- w
On Fri, 15 Jun 2012, josef.p...@gmail.com wrote:
> https://github.com/PyMVPA/PyMVPA/blob/master/mvpa2/misc/dcov.py#L160
> looks like a double sum, but wikipedia only has one sum, elementwise product.
sorry -- I might be slow -- what sum? there is only an outer product in
line 160: Axy = Ax[:, None
On Fri, Jun 15, 2012 at 4:20 PM, Yaroslav Halchenko wrote:
> Here is a comparison to output of my code (marked with >):
>
> 0.00458652660079 0.788017364828 0.00700027844478 0.00483928213727
>> 0.145564526722 0.480124905375 0.422482399359 0.217567496918
> 6.50616752373e-07 7.99461373461e-05 0.0070
Here is a comparison to output of my code (marked with >):
0.00458652660079 0.788017364828 0.00700027844478 0.00483928213727
> 0.145564526722 0.480124905375 0.422482399359 0.217567496918
6.50616752373e-07 7.99461373461e-05 0.00700027844478 0.0094610687282
> 0.120884106118 0.249205123601 0.4224823
On Fri, Jun 15, 2012 at 3:50 PM, wrote:
> On Fri, Jun 15, 2012 at 10:45 AM, Yaroslav Halchenko
> wrote:
>>
>> On Fri, 15 Jun 2012, Satrajit Ghosh wrote:
>>> hi yarik,
>>> here is my attempt:
>>>
>>> [1]https://github.com/satra/scikit-learn/blob/enh/covariance/sklearn/covariance/distan
On Fri, Jun 15, 2012 at 10:45 AM, Yaroslav Halchenko
wrote:
>
> On Fri, 15 Jun 2012, Satrajit Ghosh wrote:
>> hi yarik,
>> here is my attempt:
>>
>> [1]https://github.com/satra/scikit-learn/blob/enh/covariance/sklearn/covariance/distance_covariance.py
>> i'll look at your code in det
On Fri, 15 Jun 2012, Satrajit Ghosh wrote:
>hi yarik,
>here is my attempt:
>
> [1]https://github.com/satra/scikit-learn/blob/enh/covariance/sklearn/covariance/distance_covariance.py
>i'll look at your code in detail later today to understand the uv=True
it is just to compute dCov
hi yarik,
here is my attempt:
https://github.com/satra/scikit-learn/blob/enh/covariance/sklearn/covariance/distance_covariance.py
i'll look at your code in detail later today to understand the uv=True case.
cheers,
satra
On Fri, Jun 15, 2012 at 10:19 AM, Yaroslav Halchenko wrote:
> I haven't
I haven't had a chance to play with it extensively but I have a basic
implementation:
https://github.com/PyMVPA/PyMVPA/blob/master/mvpa2/misc/dcov.py
which still lacks statistical assessment, but provides dCov, dCor values
and yes -- it is "inherently multivariate", but since also could be
useful
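(For reference, a tiny numpy sketch of the dCov/dCor estimator for two 1-D samples, following the standard definition of distance correlation; this is the plain V-statistic version, with no significance testing.)
"""
import numpy as np

def dcor(x, y):
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    a = np.abs(x - x.T)                                  # pairwise distances
    b = np.abs(y - y.T)
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()    # double centring
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()                               # squared dCov
    dvar_x, dvar_y = (A * A).mean(), (B * B).mean()
    return np.sqrt(dcov2) / (dvar_x * dvar_y) ** 0.25

rng = np.random.RandomState(0)
x = rng.randn(200)
print(dcor(x, x ** 2))    # picks up the nonlinear dependence
"""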
hi yarik,
> hm... interesting -- and there is no comparison against "minimizing
> independence"? e.g. dCov measure
> http://en.wikipedia.org/wiki/Distance_correlation which is really simple
> to estimate and as intuitive as a correlation coefficient
>
thanks for bringing up dCov. have you had a cha
Submitted 5/07; Revised 6/11; Published 5/12
It takes such a long time ...
On Fri, Jun 15, 2012 at 8:58 PM, Satrajit Ghosh wrote:
> fyi
>
> -- Forwarded message --
> From: joshua vogelstein
> Date: Fri, Jun 15, 2012 at 12:35 AM
>
> http://jmlr.csail.mit.edu/papers/volume13/song
hm... interesting -- and there is no comparison against "minimizing
independence"? e.g. dCov measure
http://en.wikipedia.org/wiki/Distance_correlation which is really simple
to estimate and as intuitive as a correlation coefficient
On Fri, 15 Jun 2012, Satrajit Ghosh wrote:
>fyi
>
fyi
-- Forwarded message --
From: joshua vogelstein
Date: Fri, Jun 15, 2012 at 12:35 AM
http://jmlr.csail.mit.edu/papers/volume13/song12a/song12a.pdf
these guys define a nice nonlinear/nonparametric measure of correlation
that might be of interest to you.
---
Say you are working with text document classification in a particular
domain. You want to train the system. Is there an established criterion for
choosing the right feature vector?
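(There is no single established criterion; a common baseline, sketched with placeholder data names: bag-of-words / tf-idf features, optionally pruned with a univariate chi2 test, then a linear classifier, all handled as one Pipeline.)
"""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ('tfidf', TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
    ('select', SelectPercentile(chi2, percentile=20)),
    ('clf', LogisticRegression()),
])
# pipe.fit(train_texts, train_labels)   # train_texts / train_labels: your data
"""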