2013/1/10 Gael Varoquaux gael.varoqu...@normalesup.org:
On Thu, Jan 10, 2013 at 03:57:23PM +1100, Juan Nunez-Iglesias wrote:
More precisely, I think David wants a function that will take a set of RFs
and return a new classifier object that does all the weighted averaging
Andy suggested for
Hi there. I'm not sure if you have been answered yet, so perhaps I can
help.
MultinomialNB has a parameter called `class_weight` which you can set at
initialization.
| class_weight : array-like, size=[n_classes,]
|     Prior probabilities of the classes. If specified, the priors are not
|     adjusted according to the data.
... or more simply:
pipeline.fit(X, y, nb__sample_weight=sample_weight)
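To make the effect of those sample weights concrete, here is a minimal numpy sketch of how per-sample weights enter a multinomial NB fit (with Laplace smoothing, alpha=1). This is an illustrative reimplementation, not scikit-learn's own code; the function names are made up for the example.

```python
import numpy as np

def fit_weighted_mnb(X, y, sample_weight, alpha=1.0):
    """Minimal multinomial NB fit with per-sample weights (illustrative sketch)."""
    classes = np.unique(y)
    n_features = X.shape[1]
    log_prior = np.empty(len(classes))
    feature_log_prob = np.empty((len(classes), n_features))
    for i, c in enumerate(classes):
        w = sample_weight[y == c]
        # weighted class prior: share of total weight belonging to class c
        log_prior[i] = np.log(w.sum() / sample_weight.sum())
        # weighted feature counts with Laplace smoothing
        counts = (w[:, None] * X[y == c]).sum(axis=0) + alpha
        feature_log_prob[i] = np.log(counts / counts.sum())
    return classes, log_prior, feature_log_prob

def predict_mnb(X, classes, log_prior, feature_log_prob):
    # joint log-likelihood of each sample under each class
    joint = X @ feature_log_prob.T + log_prior
    return classes[np.argmax(joint, axis=1)]
```

Up-weighting the samples of a class shifts both its prior and its feature counts, which is exactly what the `nb__sample_weight` keyword passes through the pipeline to the final estimator.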
On 10 January 2013 15:20, Gilles Louppe g.lou...@gmail.com wrote:
Hi,
I don't know how it interfaces with NLTK's SklearnClassifier, but if
you can work your way using only Scikit-Learn for training, then can
you pass
Dear SciKitters,
when running a PCA on a rather small dataset, I end up in the situation
that the first principal component is predominant.
My dataset contains 694 samples with 177 features each.
Here comes my code:
X = dataDescrs_array
y = dataActs_array
target_names = ['inactive','active']
On Thu, Jan 10, 2013 at 03:25:55PM +0100, paul.czodrow...@merckgroup.com wrote:
I fear that I mixed up my syntax...
Syntax looks good.
If there is one largely predominant component in the data, you should be
able to see it with your naked eye: all the features should have series
that look
Sorry for the confusion, guys.
But I did not scale my features - they contain a wild mixture of values:
- floats ranging from 0 to 1200
- floats ranging from 0 to 60
- integers between 0 and 25
and so on...
My fault!
BTW, I tried to re-run the IRIS example (
Is such a table available some place in the docs?
Ideally it would have time complexity as a function of both number of
samples and features per sample.
Thank you,
--
Andrew Winterman
714 362 6823
2013/1/10 Lars Buitinck l.j.buiti...@uva.nl:
2013/1/10 Jason Rudy ja...@clinicast.net
I'm working on an implementation of MARS [1] that I'd like to share, and
it seems like sklearn would be a good place for it. The MARS algorithm is
currently available as part of the R package earth and is
2013/1/10 Andrew Winterman andywinter...@gmail.com:
Is such a table available some place in the docs?
Ideally it would have time complexity as a function of both number of
samples and features per sample.
Nope. That would be a great contribution!
--
Olivier
http://twitter.com/ogrisel
+1 for the contribution. I was looking for this quite frequently.
On Thu, Jan 10, 2013 at 12:55 PM, Olivier Grisel
olivier.gri...@ensta.org wrote:
2013/1/10 Andrew Winterman andywinter...@gmail.com:
Is such a table available some place in the docs?
Ideally it would have time complexity as
On Fri, Jan 11, 2013 at 2:40 AM, Andrew Winterman
andywinter...@gmail.com wrote:
Is such a table available some place in the docs?
Ideally it would have time complexity as a function of both number of
samples and features per sample.
I think it would fit in this PR:
Thanks for the help, guys. Indeed it's easy enough to implement a class
for combining the classifiers in a model-specific way. Thanks for the note
on the oob-score!
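For readers landing on this thread later, the model-specific combination mentioned above can be sketched as a weighted average of each forest's `predict_proba` output, with each forest's `oob_score_` as its weight. The function names below are illustrative, not a scikit-learn API:

```python
import numpy as np

def weighted_average_proba(probas, weights):
    """Combine per-classifier probability arrays by a weighted mean.

    probas  : list of (n_samples, n_classes) arrays, e.g. clf.predict_proba(X)
    weights : one weight per classifier, e.g. each forest's oob_score_
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the result is still a distribution
    return sum(wi * p for wi, p in zip(w, probas))

def combined_predict(probas, weights, classes):
    """Pick the class with the highest weighted-average probability."""
    avg = weighted_average_proba(probas, weights)
    return classes[np.argmax(avg, axis=1)]
```

All classifiers must share the same `classes_` ordering for the per-column average to make sense; that holds when they were fit on the same label set.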
On Thu, Jan 10, 2013 at 2:28 AM, Olivier Grisel olivier.gri...@ensta.orgwrote:
2013/1/10 Gael Varoquaux
I agree, I'll fork that and do some work on it if I have time this weekend.
Should the classifiers docstrings also note their time complexity?
Seems like something you'd want to know...
On Thu, Jan 10, 2013 at 10:12 AM, Mathieu Blondel math...@mblondel.org wrote:
On Fri, Jan 11, 2013 at 2:40
Yes please. I was looking all over the place for these the last week or so.
On Thu, Jan 10, 2013 at 1:33 PM, Andrew Winterman
andywinter...@gmail.com wrote:
I agree, I'll fork that and do some work on it if I have time this weekend.
Should the classifiers docstrings also note their time
PR #804 had some comments about generating the tables automatically, which
would be nice. How about adding a consistently structured `Complexity`
section to the docstrings and using it to populate the table?
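As a concrete illustration, a structured section of that kind could look like the sketch below. The section name and its contents are only a proposal from this thread, not an existing numpydoc convention, and the estimator here is entirely hypothetical:

```python
class ExampleEstimator:
    """Hypothetical estimator illustrating a structured ``Complexity`` section.

    Parameters
    ----------
    n_iter : int
        Number of passes over the data.

    Complexity
    ----------
    fit : ``O(n_samples * n_features * n_iter)``
    predict : ``O(n_features)`` per sample
    """

    def __init__(self, n_iter=5):
        self.n_iter = n_iter
```

A doc-build script could then scan each estimator's ``__doc__`` for the ``Complexity`` heading and collect the entries into a single table.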
On Thu, Jan 10, 2013 at 6:38 PM, Ronnie Ghose ronnie.gh...@gmail.com wrote:
yes please. I was
That seems to make sense to me, especially since we'll want to analyze
the algorithm as written.
On Thu, Jan 10, 2013 at 10:46 AM, Vlad Niculae zephy...@gmail.com wrote:
PR #804 had some comments about generating the tables automatically, which
would be nice. How about a consistently
This is a general problem if the features are not in the same units.
As you saw, PCA assumes that features all have equal importance.
If you want all to have the same weight, you have to rescale (using
StandardScaler for example).
The problem is: it is not clear whether this is the right thing
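To see why the mixed feature ranges reported above (0-1200, 0-60, 0-25) make one component dominate, here is a small numpy sketch of the zero-mean, unit-variance rescaling that `StandardScaler` performs before PCA. The synthetic data merely mimics the ranges from the mail; it is not the original dataset:

```python
import numpy as np

def standardize(X):
    """Zero-mean, unit-variance scaling per feature (what StandardScaler does)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

# features on wildly different scales, as in the dataset described above
rng = np.random.RandomState(0)
X = np.column_stack([
    rng.uniform(0, 1200, 200),   # float feature, range 0..1200
    rng.uniform(0, 60, 200),     # float feature, range 0..60
    rng.randint(0, 26, 200),     # integer feature, range 0..25
])

Xs = standardize(X)
# after scaling, every feature has variance 1, so no single feature
# dominates the covariance matrix that PCA diagonalises
```

In scikit-learn terms this is `StandardScaler().fit_transform(X)` followed by `PCA().fit(Xs)`; whether equal weighting is the right choice is, as noted above, a modelling decision.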
Hi Jason.
Thanks for wanting to contribute MARS to sklearn.
There is even an issue requesting the feature ;)
https://github.com/scikit-learn/scikit-learn/issues/845
I think it would be a great addition.
You should be aware of the fact that contributing to sklearn is a bit
more than just
On 01/10/2013 07:46 PM, Vlad Niculae wrote:
PR #804 had some comments about generating the tables automatically,
which would be nice. How about a consistently structured `Complexity`
section to the docstrings, and use it to populate the table?
-1
That would mean hacking the numpy
Hi everybody.
Long and general mail coming on.
TL;DR version: do we want to plan for the future?
Today I read this blog post on the scope of open source projects:
http://brianegranger.com/?p=249
It made me dig up an old mail draft I wrote after reading a post by Gael:
I am +1 on a plan, since it's helpful for newbies like myself in
orienting themselves, and helps focus developer effort.
That said, the breadth of this project is pretty amazing, and it's
probably a good idea to keep classifiers which are up-and-coming in
academia available. I guess I'm voting
On 11 January 2013 10:21, Lars Buitinck l.j.buiti...@uva.nl wrote:
2013/1/10 Andreas Mueller amuel...@ais.uni-bonn.de:
I wanted to ask: should we try to make plans? We get a lot of PRs and
have more and more contributors and I think it might be nice
if we had some form of road map to give
2013/1/11 Lars Buitinck l.j.buiti...@uva.nl:
2013/1/10 Andreas Mueller amuel...@ais.uni-bonn.de:
I wanted to ask: should we try to make plans? We get a lot of PRs and
have more and more contributors and I think it might be nice
if we had some form of road map to give everything a bit more
Hi all,
One component of a good roadmap would be to make sure we emphasize good
implementations of fundamental ML algorithms. One area I'd like to work
on is density estimation: KDE in particular is an important component of
a wide variety of algorithms, and there is not (to my knowledge) a
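For reference, the naive form of KDE that fast implementations would improve on is only a few lines; the sketch below is a brute-force O(n_train * n_query) Gaussian KDE in one dimension, written here purely to fix ideas (the function name is made up):

```python
import numpy as np

def gaussian_kde_1d(x_train, x_query, bandwidth):
    """Naive 1-D Gaussian kernel density estimate.

    Sums one Gaussian bump per training point at every query point,
    so the cost is O(len(x_train) * len(x_query)) -- exactly what
    tree-based KDE implementations are designed to avoid.
    """
    diff = (x_query[:, None] - x_train[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diff ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth
```

Each density value is an average of kernels, so the estimate integrates to one; the quadratic cost is what makes a properly optimised version worth having in a library.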
2013/1/11 Vlad Niculae zephy...@gmail.com:
I completely agree with everyone regarding 1.0 and I really think we should
make a clear list of issues for this (just saying API is pretty vague).
However there is life after the 1.0, and I think Andy's message was more
about that kind of long-term
On Thu, Jan 10, 2013 at 10:33:05AM -0800, Andrew Winterman wrote:
Should the classifiers docstrings also note their time complexity?
I think that it would be good.
Thanks,
G