Hi folks,
I just added our planned common sprint to the pycon sprint page:
https://us.pycon.org/2012/community/sprints/projects/
I listed myself and Olivier as 'leaders' just so the organizers have
someone to contact. Please add your name to that list if you plan on
participating, as I imagine
CoefSelectTransformerMixin must be deprecated or deleted (I would
favor the latter, as I guess nobody uses it in user-land code) and
replaced by SelectorMixin. I will do it later today.
https://github.com/scikit-learn/scikit-learn/issues/518
Mathieu
--
On Tue, Jan 24, 2012 at 11:07:20PM +0100, Andreas wrote:
> Taking the mean means that if a feature has a strong positive weight for one
> class and a strong negative weight for another class, they might cancel,
> leading to the feature being not present in the solution.
> Why does that make sense?
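To make the cancellation concrete, here is a minimal sketch with made-up
weights (the array below is hypothetical, not taken from any fitted model).
It compares the signed mean over classes with the sum of absolute values,
which is only one possible alternative and not necessarily what the library
does:

```python
import numpy as np

# Hypothetical per-class weights: rows are classes, columns are features.
coef = np.array([
    [ 2.0, 0.1, 0.0],
    [-2.0, 0.1, 0.0],
    [ 0.0, 0.1, 0.5],
])

# Signed mean over classes: the +2 and -2 of feature 0 cancel out.
mean_score = coef.mean(axis=0)

# Sum of absolute values over classes: feature 0 stays the strongest.
abs_score = np.abs(coef).sum(axis=0)

print(mean_score)
print(abs_score)
```

With the signed mean, feature 0 scores zero even though it is the most
discriminative feature for two of the three classes.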
Hi everybody.
At the moment I'm trying to understand feature selection.
I was looking at the "L1 based feature selection" that is described in
the docs.
I was trying to use that with LinearSVC but I don't really understand
what is going on. Maybe someone can explain.
I am in the multi-class setu
On Tue, Jan 24, 2012 at 08:02:34AM -0500, Satrajit Ghosh wrote:
>this list generates a lot of practical useful information such as your
>response below that gets "lost" (i.e. difficult to search if you don't
>have the right terms) in the mailing list archives. could we think about
>
2012/1/24 Satrajit Ghosh :
> hi olivier and others,
>
> this list generates a lot of practical useful information such as your
> response below that gets "lost" (i.e. difficult to search if you don't have
> the right terms) in the mailing list archives. could we think about how to
> capture such in
hi olivier and others,
this list generates a lot of practical useful information such as your
response below that gets "lost" (i.e. difficult to search if you don't have
the right terms) in the mailing list archives. could we think about how to
capture such information in the docs/wiki?
cheers,
Which classifier have you tried? Are you sure you selected the best
hyper-parameters with GridSearchCV? Have you tried to normalize the
dataset? For instance have a look at:
http://scikit-learn.org/dev/modules/preprocessing.html
For very sparse data with large variance in the feature, you shou
Am 15.01.2012 19:45, schrieb Gael Varoquaux:
> On Sun, Jan 15, 2012 at 07:39:00PM +0100, Philipp Singer wrote:
>> The problem is that my representation is very sparse so I have a huge
>> amount of zeros.
> That's actually good: some of our estimators are able to use a sparse
> representation to spe
On 01/24/2012 10:49 AM, Olivier Grisel wrote:
> 2012/1/24 Dimitrios Pritsos:
>> Thank you very much for the advice. I will try this too (today!).
>> However, it seems that I might need to use partial_fit() in the near
>> future, after I collect/crawl a new corpus.
>> So a question is, my re
2012/1/24 Dimitrios Pritsos :
>
> Thank you very much for the advice. I will try this too (today!).
> However, it seems that I might need to use partial_fit() in the near
> future, after I collect/crawl a new corpus.
> So a question is, my result (20%) was due to some sort of bug in
> part
On 01/23/2012 09:11 PM, Olivier Grisel wrote:
> 2012/1/23 Dimitrios Pritsos:
>> However, when I do the same test using partial_fit() for the same
>> sub-set of my Data Set (see above) I am getting ~20%.
>>
>> Any Suggestions?
> Do a grid search to find the best alpha on SGDClassifier (and on C for
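A grid search over alpha for SGDClassifier could look roughly like the sketch
below. The synthetic dataset, the alpha grid, and the other settings are
assumptions made for illustration only, and the import path is the one used
by recent scikit-learn releases:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data; replace with your own feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Hypothetical grid of regularization strengths to try.
param_grid = {"alpha": [1e-5, 1e-4, 1e-3, 1e-2]}

# 3-fold cross-validated search over alpha.
search = GridSearchCV(
    SGDClassifier(max_iter=1000, tol=1e-3, random_state=0),
    param_grid,
    cv=3,
)
search.fit(X, y)

best_alpha = search.best_params_["alpha"]
print(best_alpha, search.best_score_)
```

The best alpha depends heavily on the data, so the grid usually needs to be
adapted (often on a log scale) to the problem at hand.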