Hi Raghav.
I feel that your proposal lacks some focus.
I'd remove these two:
* Mallows' Cp for LASSO / LARS
* Implement built-in abs max scaler, Nesterov's momentum and finish up the
  Multilayer Perceptron module.
And, as discussed in this thread, probably also:
* Forge a self-sufficient ML tutorial based on scikit-learn.
If you feel that your proposal does not have enough material (I'm not sure that's the case),
two things could be added that are more closely related to the
cross-validation and grid-search part
(though they are probably difficult from an API standpoint). One is making CV objects
(aka path algorithms, or generalized cross-validation)
work together with GridSearchCV.
The other is working out how to allow early stopping using a validation set.
The two are probably related (imho).
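
To make the early-stopping idea concrete, here is a minimal, library-independent sketch. It is not an existing scikit-learn API: the function name, the `partial_step` and `validation_loss` hooks, and the `patience` parameter are all assumptions made for illustration. The loop trains incrementally and stops once the loss on the held-out validation set has failed to improve for `patience` consecutive iterations.

```python
def fit_with_early_stopping(partial_step, validation_loss, max_iter=100, patience=3):
    """Run partial_step() until the validation loss stops improving.

    partial_step: hypothetical hook that performs one training increment.
    validation_loss: hypothetical hook returning the current held-out loss.
    Returns (best_iteration, best_loss, n_iterations_run).
    """
    best_loss = float("inf")
    best_iter = -1
    stale = 0  # consecutive iterations without improvement
    n_run = 0
    for i in range(max_iter):
        partial_step()
        n_run = i + 1
        loss = validation_loss()
        if loss < best_loss:
            best_loss, best_iter, stale = loss, i, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_iter, best_loss, n_run


# Toy usage: replay a canned validation-loss curve instead of a real model.
curve = iter([5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0])
state = {"loss": None}


def step():
    state["loss"] = next(curve)


def current_loss():
    return state["loss"]


result = fit_with_early_stopping(step, current_loss, max_iter=7, patience=3)
print(result)  # (2, 3.0, 6): best at iteration 2, stopped after 6 iterations
```

Note that the final value of 2.0 in the toy curve is never reached, which shows the usual trade-off: a small `patience` saves computation but can stop before a later improvement.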
Olivier also mentioned cross-validation for out-of-core (partial_fit)
algorithms.
I feel that is not as important, but might also tie into your proposal.
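
As a rough illustration of what cross-validation for partial_fit estimators could look like, the sketch below feeds the training part of each fold to the estimator in mini-batches and then scores it on the held-out part. Everything here is an assumption for illustration: `RunningMeanRegressor` is a hypothetical toy estimator, and `cross_val_partial_fit` is not a scikit-learn function. In a truly out-of-core setting the batches would be read from disk rather than indexed from an in-memory list.

```python
class RunningMeanRegressor:
    """Hypothetical toy estimator: predicts the running mean of y seen so far."""

    def __init__(self):
        self.n_ = 0
        self.mean_ = 0.0

    def partial_fit(self, X, y):
        # Incrementally update the mean; X is ignored by this toy model.
        for target in y:
            self.n_ += 1
            self.mean_ += (target - self.mean_) / self.n_
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


def batches(indices, batch_size):
    """Yield consecutive chunks of at most batch_size indices."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


def cross_val_partial_fit(make_estimator, X, y, n_folds=2, batch_size=2):
    """Return the per-fold mean squared error using interleaved folds."""
    n = len(y)
    scores = []
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        est = make_estimator()  # fresh estimator for every fold
        for batch in batches(train, batch_size):
            est.partial_fit([X[i] for i in batch], [y[i] for i in batch])
        preds = est.predict([X[i] for i in test])
        mse = sum((y[i] - p) ** 2 for i, p in zip(test, preds)) / len(test)
        scores.append(mse)
    return scores


X = [[0], [1], [2], [3]]
y = [1.0, 3.0, 1.0, 3.0]
fold_scores = cross_val_partial_fit(RunningMeanRegressor, X, y, n_folds=2, batch_size=1)
print(fold_scores)  # [4.0, 4.0]
```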
Finishing the refactoring of model_evaluation in three days seems a bit
optimistic, if you include reviews.
For sample_weight support, I'm not sure there are obvious ways to extend
sample_weight to all the algorithms that you mentioned.
How would it work for spectral clustering and agglomerative clustering,
for example?
In general, I feel you should focus on fewer things, and go into more
detail about what to do for each of them.
Otherwise the proposal looks good.
For the wiki, having links to the issues might be helpful.
Thanks for the application :)
Andy
On 03/22/2015 08:52 PM, Raghav R V wrote:
Two things:
* The subject should have been "Multiple Metric Support in grid_search
and cross_validation modules and other general improvements" and not
multiple metric learning! Sorry for that!
* The link was not available due to the trailing "." (dot), which has
been fixed now!
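
For readers unfamiliar with the corrected subject, multiple metric support roughly means scoring every cross-validation split with several metrics in one pass, instead of refitting once per metric. The following is a minimal, library-independent sketch of that idea. All names (`cross_val_multi`, `k_fold_indices`, `threshold_fit_predict`) are hypothetical and do not reflect the API of the grid_search or cross_validation modules.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) for contiguous, unshuffled folds."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    n_pos = sum(t == positive for t in y_true)
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    return hits / n_pos if n_pos else 0.0  # defined as 0.0 when no positives


def cross_val_multi(fit_predict, X, y, scorers, n_folds=3):
    """Fit once per fold, then apply every scorer to the same predictions."""
    scores = {name: [] for name in scorers}
    for train, test in k_fold_indices(len(y), n_folds):
        y_pred = fit_predict([X[i] for i in train],
                             [y[i] for i in train],
                             [X[i] for i in test])
        y_true = [y[i] for i in test]
        for name, scorer in scorers.items():
            scores[name].append(scorer(y_true, y_pred))
    return scores


def threshold_fit_predict(X_train, y_train, X_test):
    """Toy model: predict 1 when x exceeds the mean of the training inputs."""
    threshold = sum(X_train) / len(X_train)
    return [1 if x > threshold else 0 for x in X_test]


X = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]
scores = cross_val_multi(threshold_fit_predict, X, y,
                         {"accuracy": accuracy, "recall": recall})
print(scores)
# {'accuracy': [1.0, 1.0, 1.0], 'recall': [0.0, 1.0, 1.0]}
```

The first fold contains no positive samples, so its recall falls back to 0.0 even though accuracy is perfect, which is one small example of why reporting several metrics side by side can be informative.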
Thanks
R
On Mon, Mar 23, 2015 at 5:47 AM, Raghav R V <rag...@gmail.com
<mailto:rag...@gmail.com>> wrote:
1. the link is broken
Ah! Sorry :) -
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-Multiple-metric-support-for-CV-and-grid_search-and-other-general-improvements.
2. that sounds quite difficult and unfortunately conducive to
cheating
Hmm... Should I simply opt for adding more examples, then?
On Sun, Mar 22, 2015 at 7:57 PM, Raghav R V <rag...@gmail.com
<mailto:rag...@gmail.com>> wrote:
Hi,
1. This is my proposal for the multiple metric learning
project as a wiki page -
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-Multiple-metric-support-for-CV-and-grid_search-and-other-general-improvements.
Possible mentors : Andreas Mueller (amueller) and Joel
Nothman (jnothman)
Any feedback/suggestions/additions/deletions would be
awesome. :)
2. Given that there is huge interest among students in
learning about ML, do you think it would be within the
scope of, or beneficial to, skl to have all the exercises
and/or concepts from a good-quality book (ESL / PRML /
Murphy) or an academic course like Ng's CS229 (not the
less rigorous Coursera version) implemented using
sklearn? Or perhaps we could instead enhance our tutorials
and examples to serve as a self-study guide for learning ML?
I have included this in my GSoC proposal but was not quite
sure if this would be a useful idea!!
Or would it be better if I simply add more examples?
Please let me know your views!!
Thanks
R
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go
Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media,
is your hub for all
things parallel software development, from weekly thought
leadership blogs to
news, videos, case studies, tutorials and more. Take a
look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
<mailto:Scikit-learn-general@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general