Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2016-01-31 Thread Daniel Homola
Dear all, I migrated my Python implementation of the Boruta algorithm to: https://github.com/danielhomola/boruta_py I also implemented 3 mutual information based feature selection methods (JMI, JMIM, MRMR) and wrapped them in a scikit-learn-like interface: https://github.com/danielhomola/mifs
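The same idea can be sketched with scikit-learn's built-in mutual information scorer (this is not the mifs API, just an illustrative stand-in on synthetic data):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Synthetic data: 10 features, only 3 of which are informative.
X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           random_state=0)

# Keep the 3 features with the highest estimated mutual information.
selector = SelectKBest(mutual_info_classif, k=3).fit(X, y)
X_sel = selector.transform(X)
```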

Re: [Scikit-learn-general] Contributing to Scikit-Learn(GSOC)

2016-01-11 Thread Andy
Hi Imaculate. We have found that in recent years, we were quite limited in terms of mentoring resources. Many of the core-devs are very busy, and we already have many contributions waiting for reviews. If you are interested in working on scikit-learn as part of GSoC, I suggest you start cont

Re: [Scikit-learn-general] Contributing to scikit-learn

2016-01-10 Thread Raghav R V
Hi Antoine, Welcome to scikit-learn! Please see if you find this issue interesting to start with - https://github.com/scikit-learn/scikit-learn/issues/6149 Thanks On Sat, Jan 9, 2016 at 6:42 PM, WENDLINGER Antoine < antoinewendlin...@gmail.com> wrote: > Hi everyone, > > Let me introduce myself

[Scikit-learn-general] Contributing to Scikit-Learn(GSOC)

2016-01-09 Thread Imaculate Mosha
Hi all, I would like to contribute to scikit-learn, ideally for Google Summer of Code. I'm a third-year undergrad student. I did an introductory course on Machine Learning, but after learning scikit-learn I realised we only scratched the surface; we did neural networks, reinforcement learning a

[Scikit-learn-general] Contributing to scikit-learn

2016-01-09 Thread WENDLINGER Antoine
Hi everyone, Let me introduce myself : my name is Antoine, I'm a 21-year-old French student in Computer Science, and would love to contribute to scikit-learn. This would be my first contribution to an open-source project so I'm a bit lost and do not really know where to start. I read the pages ab

Re: [Scikit-learn-general] Contributing to scikit-learn

2015-09-10 Thread Rohit Shinde
Hi Gael, Heeding your advice, I was looking over the possible bugs and I have decided to solve this one: https://github.com/scikit-learn/scikit-learn/issues/5229. Any pointers on how to approach this one? Thanks, Rohit. On Thu, Sep 10, 2015 at 10:27 AM, Gael Varoquaux < gael.varoqu...@normalesu

Re: [Scikit-learn-general] Contributing to scikit-learn

2015-09-09 Thread Gael Varoquaux
I would strongly recommend starting with something easier, like issues labelled 'easy'. Starting with such a big project is most likely going to lead you to approach the project in a way that is not well adapted to scikit-learn, and thus to code that does not get merged. Cheers, Gaël On Thu, Sep

Re: [Scikit-learn-general] Contributing to scikit-learn

2015-09-09 Thread Rohit Shinde
Hello everyone, I have built scikit-learn and I am ready to start coding. Can I get some pointers on how I could start contributing to the projects I mentioned in the earlier mail? Thanks, Rohit. On Mon, Sep 7, 2015 at 11:50 AM, Rohit Shinde wrote: > Hi Jacob, > > I am interested in Global opt

Re: [Scikit-learn-general] Contributing to scikit-learn

2015-09-06 Thread Rohit Shinde
Hi Jacob, I am interested in Global optimization based hyperparameter optimization and Generalised Additive Models. However, I don't know what kind of background would be needed and if mine would be sufficient for it. I would like to know the prerequisites for it. On Sun, Sep 6, 2015 at 9:58 PM,

Re: [Scikit-learn-general] Contributing to scikit-learn

2015-09-06 Thread Jacob Schreiber
Hi Rohit I'm glad you want to contribute to scikit-learn! Which idea were you interested in working on? The metric learning and GMM code is currently being worked on by GSOC students AFAIK. Jacob On Sun, Sep 6, 2015 at 8:18 AM, Rohit Shinde wrote: > Hello everyone, > > I am Rohit. I am interes

[Scikit-learn-general] Contributing to scikit-learn

2015-09-06 Thread Rohit Shinde
Hello everyone, I am Rohit. I am interested in contributing to scikit-learn. I am quite proficient in Python, Java, C++ and Scheme. I have taken undergrad courses in Machine Learning and data mining. I was also part of this year's GSoC under The OpenCog Foundation. I was looking at the ideas

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-05-08 Thread Andreas Mueller
It doesn't need to be super technical, and we try to keep the user guide "easy to understand". No bonus points for unnecessary latex ;) The example should be as illustrative and fair as possible, and built-in datasets are preferred. It shouldn't be too heavyweight, though. If you like, you can sh

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-05-08 Thread Daniel Homola
Hi Andy, Thanks! Will definitely do a github pull request once Miron has confirmed he benchmarked my implementation by running it on the datasets the method was published with. I wrote a blog post about it, which explains the differences, but in a quite casual and non-rigorous way: http://danielh

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-05-08 Thread Andreas Mueller
Btw, an example that compares this against existing feature selection methods and explains the differences and advantages would help users and convince us to merge ;) On 05/08/2015 02:34 PM, Daniel Homola wrote: Hi all, I wrote a couple of weeks ago about implementing the Boruta all-relevant f

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-05-08 Thread Andreas Mueller
Hi Daniel. That looks cool. Can you do a github pull request? See the contributor docs: http://scikit-learn.org/dev/developers/index.html Thanks, Andy On 05/08/2015 02:34 PM, Daniel Homola wrote: Hi all, I wrote a couple of weeks ago about implementing the Boruta all-relevant feature selectio

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-05-08 Thread Daniel Homola
Hi all, I wrote a couple of weeks ago about implementing the Boruta all-relevant feature selection algorithm in Python. I think it's ready to go now. I wrote fit, transform and fit_transform methods for it to make it sklearn-like. Here it is: https://bitbucket.org/danielhomola/borut

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-17 Thread Gilles Louppe
Hi, In general, I agree that we should at least add a way to compute feature importances using permutations. This is an alternative, yet standard, way to do it in comparison to what we do (mean decrease of impurity, which is also standard). Assuming we provide permutation importances as a buildin
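Permutation importance as Gilles describes it can be sketched like this (illustrative only; the dataset and parameters are made up):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
baseline = rf.score(X_te, y_te)

# Permutation importance: shuffle one feature at a time on held-out data
# and record the resulting drop in accuracy.
rng = np.random.RandomState(0)
importances = []
for j in range(X_te.shape[1]):
    X_perm = X_te.copy()
    rng.shuffle(X_perm[:, j])  # destroy feature j's information
    importances.append(baseline - rf.score(X_perm, y_te))
```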

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-15 Thread Satrajit Ghosh
hi andy and dan, i've been using a similar heuristic with extra trees quite effectively. i have to look at the details of this R package and the paper, but in my case i add a feature that has very low correlation with my target class/value (depending on the problem) and choose features that have a

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-15 Thread Daniel Homola
Hi Andy, So at each iteration the X predictor matrix (n by m) is copied and each column of the copy is shuffled. This shuffled matrix is then stacked next to the original (n by 2m) and fed into the RF to get the feature importances. Also at the start of the method, a vect
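A single iteration of the shadow-feature step described above could be sketched as follows (a simplified illustration, not the actual boruta_py code; the iterative accept/reject bookkeeping is omitted and the data is synthetic):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=6, n_informative=2,
                           random_state=0)
rng = np.random.RandomState(0)

# Build the "shadow" matrix: a copy of X with every column shuffled,
# then stack it next to the original (n by 2m).
X_shadow = X.copy()
for j in range(X_shadow.shape[1]):
    rng.shuffle(X_shadow[:, j])
X_boruta = np.hstack([X, X_shadow])

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_boruta, y)
imp = rf.feature_importances_
real, shadow = imp[:X.shape[1]], imp[X.shape[1]:]

# A real feature scores a "hit" if it beats the best shadow feature.
hits = real > shadow.max()
```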

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-15 Thread Andreas Mueller
Hi Dan. I saw that paper, but it is not well-cited. My question is more how different this is from what we already have. So it looks like some (5) random control features are added and the feature importances are compared against the control. The question is whether the feature importance that

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-15 Thread Daniel Homola
Hi Andy, This is the paper: http://www.jstatsoft.org/v36/i11/ which was cited 79 times according to Google Scholar. Regarding your second point, the first 3 questions of the FAQ on the Boruta website answer it, I guess: https://m2.icm.edu.pl/boruta/ 1. *So, what's so special about Boruta?*

Re: [Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-15 Thread Andreas Mueller
Hi Daniel. That sounds potentially interesting. Is there a widely cited paper for this? I didn't read the paper, but it looks very similar to RFE(RandomForestClassifier()). Is it qualitatively different from that? Does it use a different feature importance? btw: your mail is flagged as spam as

[Scikit-learn-general] Contributing to scikit-learn with a re-implementation of a Random Forest based iterative feature selection method

2015-04-15 Thread Daniel Homola
Hi all, I needed a multivariate feature selection method for my work. As I'm working with biological/medical data, where n < p or even n << p, I started to read up on Random Forest based methods, as in my limited understanding RF copes pretty well with this suboptimal situation. I came across

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Juan Nunez-Iglesias
On Mon, Feb 3, 2014 at 5:49 AM, Andy wrote: > We should have an FAQ. > It should include > > What is the project name? scikit-learn, not scikit or SciKit nor sci-kit > learn. > > How do you pronounce the project name? sy-kit learn. sci stands for > science! > > Do you want to add this awesome new

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Andy
On 02/02/2014 07:41 PM, Vlad Niculae wrote: > I've heard stchee-kit once, along with stchee-pee and num-pee. > We should have an FAQ. It should include What is the project name? scikit-learn, not scikit or SciKit nor sci-kit learn. How do you pronounce the project name? sy-kit learn. sci stands

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Andy
On 02/02/2014 06:39 PM, Hadayat Seddiqi wrote: > i always said "skikit" > Many people do ;) sci as in science =)

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Vlad Niculae
I've heard stchee-kit once, along with stchee-pee and num-pee. Vlad On Sun Feb 2 18:39:58 2014, Hadayat Seddiqi wrote: > i always said "skikit" > > > On Sun, Feb 2, 2014 at 12:20 PM, Andy > wrote: > > On 02/02/2014 12:06 PM, Olivier Grisel wrote: > > Note: the n

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Hadayat Seddiqi
i always said "skikit" On Sun, Feb 2, 2014 at 12:20 PM, Andy wrote: > On 02/02/2014 12:06 PM, Olivier Grisel wrote: > > Note: the name of the project is scikit-learn, not scikit or SciKit > > nor sci-kit learn. Cheers, > I should make this my signature from now on. Also including > pronounciati

Re: [Scikit-learn-general] contributing to scikit

2014-02-02 Thread Andy
On 02/01/2014 10:42 PM, Robert Layton wrote: > > Finally, when choosing classifiers, it's our preference to focus on > heavily used classifiers, rather than state of the art. Many of the > core devs (and myself) have coded classifiers that are scikit-learn > compatible, but not in the library be

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Andy
On 02/02/2014 12:06 PM, Olivier Grisel wrote: > Note: the name of the project is scikit-learn, not scikit or SciKit > nor sci-kit learn. Cheers, I should make this my signature from now on. Also including pronunciation (sy-kit learn)

Re: [Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Olivier Grisel
2014/2/2 Jitesh Khandelwal : > Hi, > > I have used scikit-learn for academic purposes and I like it very much. > > I want to contribute to it. I have gone through the developers documentation > and setup my local working directory. > > As suggested in the developers documentation, it did look for s

[Scikit-learn-general] Contributing to Scikit

2014-02-02 Thread Jitesh Khandelwal
Hi, I have used scikit-learn for academic purposes and I like it very much. I want to contribute to it. I have gone through the developers documentation and set up my local working directory. As suggested in the developers documentation, I did look for some "EASY" tagged issues in the issue trac

Re: [Scikit-learn-general] contributing to scikit

2014-02-01 Thread Robert Layton
Hi Joseph, In theory, you should be able to take any classifier in sklearn and base your implementation off that. That said, there are a few caveats. Some classifiers are older, before coding was more formalised. Others have a lot of cython code hooks, and can be difficult to read. That all said,

Re: [Scikit-learn-general] contributing to scikit

2014-02-01 Thread Joseph Perla
Is this the right place to ask? I'm just going to send in a pull request if nobody has any suggestions. j On Fri, Jan 31, 2014 at 7:10 PM, Joseph Perla wrote: > I love SciKit and I'm going to contribute an SGD classifier for > semi-supervised problems. > > I already read through all the contribut

[Scikit-learn-general] contributing to scikit

2014-01-31 Thread Joseph Perla
I love SciKit and I'm going to contribute an SGD classifier for semi-supervised problems. I already read through all the contributor documentation and I've read many of the docs. I'm asking the list if I should model my code off of the style/quality of the SGDClassifier class or if there is a bet

Re: [Scikit-learn-general] Contributing to scikit-learn

2013-10-14 Thread Olivier Grisel
Please have a look at the contributors guide: http://scikit-learn.org/stable/developers/#contributing-code In particular this doc mentions [Easy] tagged issues: https://github.com/scikit-learn/scikit-learn/issues?labels=Easy But in general the best way to contribute is to actually use the libra

[Scikit-learn-general] Contributing to scikit-learn

2013-10-14 Thread Ankit Agrawal
Hi, I am Ankit Agrawal, a 4th year undergrad majoring in EE with specialization in Communications and Signal Processing at IIT Bombay. I completed my GSoC with scikit-image this year and have a good grasp of Python (and a little bit of Cython). I have completed a course in ML, and have tak

Re: [Scikit-learn-general] Contributing to Scikit-Learn

2013-10-02 Thread Olivier Grisel
2013/10/2 Manoj Kumar : > Hi, > > I am Manoj Kumar, a junior undergrad from Birla Institute of Technology and > Science. > > I've just completed my Google Summer of Code under SymPy. So I have a good > programming background in Python. > > Regarding my Machine Learning background, I've taken an inf

[Scikit-learn-general] Contributing to Scikit-Learn

2013-10-02 Thread Manoj Kumar
Hi, I am Manoj Kumar, a junior undergrad from Birla Institute of Technology and Science. I've just completed my Google Summer of Code under SymPy. So I have a good programming background in Python. Regarding my Machine Learning background, I've taken an informal Coursera course, under Andrew Ng.

Re: [Scikit-learn-general] contributing to scikit-learn

2013-08-01 Thread Andreas Mueller
Hey Eustache. Nice write-up. So who are the tinkerers and who are the prophets? ;) Cheers, Andy On 08/01/2013 03:40 PM, Eustache DIEMERT wrote: Hi list, Not so long ago I had my first PR merged into sklearn. Overall it was a very cool experience, thanks to many of you :) Here is a little po

Re: [Scikit-learn-general] contributing to scikit-learn

2013-08-01 Thread Gael Varoquaux
On Thu, Aug 01, 2013 at 03:40:05PM +0200, Eustache DIEMERT wrote: > Here is a little post that tells the story: http://stochastics.komodo.re/posts/contributing-to-sklearn.html Cool! Glad you enjoyed it. I tweeted you :) https://twitter.com/GaelVaroquaux/status/362934648302616576 Thanks a lot

[Scikit-learn-general] contributing to scikit-learn

2013-08-01 Thread Eustache DIEMERT
Hi list, Not so long ago I had my first PR merged into sklearn. Overall it was a very cool experience, thanks to many of you :) Here is a little post that tells the story : http://stochastics.komodo.re/posts/contributing-to-sklearn.html Cheers, Eustache

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread Vandana Bachani
Hi David, Yes I use one-hot encoding, but my understanding of one-hot encoding says that each discrete attribute can be represented as a bit pattern. So the node corresponding to that input attribute is actually a set of nodes representing that bit pattern. An unknown just means that the bit for un
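The bit-pattern-with-unknown encoding described here can be sketched in plain NumPy (the categories and values are hypothetical):

```python
import numpy as np

# Hypothetical discrete attribute; None marks a missing value.
categories = ["red", "green", "blue"]
values = ["red", None, "blue"]

def one_hot_with_unknown(value, categories):
    # One bit per category plus a trailing "unknown" bit.
    bits = np.zeros(len(categories) + 1, dtype=int)
    if value is None:
        bits[-1] = 1
    else:
        bits[categories.index(value)] = 1
    return bits

encoded = np.array([one_hot_with_unknown(v, categories) for v in values])
# encoded:
# [[1 0 0 0]
#  [0 0 0 1]
#  [0 0 1 0]]
```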

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread David Warde-Farley
On Thu, Jun 07, 2012 at 10:40:32AM -0700, Vandana Bachani wrote: > Hi Andreas, > > I agree missing data is not specific to MLP. > We dealt with it pretty simply, as you mentioned, by taking the mean over the > dataset for continuous-valued attributes. > Another thing that I feel is not adequately explored

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread Vandana Bachani
wrote: >>> >>> I think you sent this mail only to me, please send all mails to the mailing >>> list. Btw. Andreas is my mentor, so he is the one in charge here :-) >>> >>> Ad 1) Afaik all you need is one hidden layer, it's certainly possible to >>> add the po

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread eat
the possibility, but I think we decided that it's not a priority. >> >> Ad 2) Good idea >> >> David >> >> -- Forwarded message -- >> From: Vandana Bachani >> Date: Tue, Jun 5, 2012 at 6:59 PM >> Subject: Re: [Scikit-l

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread David Warde-Farley
On Thu, Jun 07, 2012 at 03:09:11PM +, LI Wei wrote: > Intuitively maybe we can set the missing values using the average over the > nearest neighbors calculated using these existing features? Not sure > whether it is the correct way to do it :-) That's known as "imputation" (or in a particular
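For reference, this nearest-neighbour averaging is exactly what scikit-learn's later KNNImputer implements (the class did not exist at the time of this thread; a minimal sketch):

```python
import numpy as np
from sklearn.impute import KNNImputer

# One missing entry; distances are computed on the observed features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [np.nan, 6.0],
              [8.0, 8.0]])

# Each NaN is replaced by the mean of that feature over the 2 nearest rows.
X_filled = KNNImputer(n_neighbors=2).fit_transform(X)
```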

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread LI Wei
all you need is one hidden layer, it's certainly possible to > add the possibility, but I think we decided that it's not a priority. > > Ad 2) Good idea > > David > > ------ Forwarded message -- > From: Vandana Bachani > Date: Tue, Jun 5, 2012

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-07 Thread Andreas Mueller
decided that it's not a priority. Ad 2) Good idea David -- Forwarded message -- From: Vandana Bachani Date: Tue, Jun 5, 2012 at 6:59 PM Subject: Re: [Scikit-learn-general] Contributing to scikit-learn To: h4wk...@gmail.com

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-06 Thread xinfan meng
Deep learning literature says that the more layers you have, the fewer hidden nodes you need in each layer. But I agree one hidden layer would be sufficient for now. On Thu, Jun 7, 2012 at 11:12 AM, David Warde-Farley < warde...@iro.umontreal.ca> wrote: > On 2012-06-05, at 1:51 PM, David Marek wrote:

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-06 Thread David Warde-Farley
On 2012-06-05, at 1:51 PM, David Marek wrote: > 1) Afaik all you need is one hidden layer, The universal approximator theorem says that any continuous function can be approximated arbitrarily well if you have one hidden layer with enough hidden units, but it says nothing about the ease of find
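The one-hidden-layer claim can be illustrated with scikit-learn's MLPRegressor (a later addition to the library; the hidden layer size and data are arbitrary):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel()

# A single hidden layer with enough units can fit a smooth 1-D target well.
mlp = MLPRegressor(hidden_layer_sizes=(50,), solver="lbfgs",
                   max_iter=2000, random_state=0).fit(X, y)
score = mlp.score(X, y)  # R^2 on the training data
```

Note that this only shows representational capacity on an easy target; as the quoted message says, the theorem guarantees nothing about how easy the fit is to find.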

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-05 Thread David Marek
Hi, As Gael and Olivier said, I am working on mlp this summer, it's my GSOC project. So there is some existing code (in Cython) and you won't be able to just use your class project, but you should definitely look at it. I will be grateful for every help and suggestion. I have got basic classificat

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-05 Thread Olivier Grisel
2012/6/5 Gael Varoquaux : > Hi Vandana and Shreyas, > > Welcome and thanks for the interest, > > With regards to MLP (multi-layer perceptrons), David Marek is right now > working on such feature: > https://github.com/davidmarek/scikit-learn/tree/gsoc_mlp > you can probably pitch in with him: 4 eyes

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-05 Thread Andreas Mueller
Hi Shreyas. In particular, the VBGMM and DPGMM might need some attention. Once you are a bit familiar with the GMM code, you could have a look at issue 393. Any help would be much appreciated :) Cheers, Andy Am 05.06.2012 08:07, schrieb

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-05 Thread Shreyas Karkhedkar
Hi Gael, Thanks for the response. Vandana and I are really excited about contributing to scikits. I will go through the GMM code and will put in suggestions for refactoring - and if possible implement some new features. Once again, on behalf of Vandana and I, thanks for the reply. Looking forwa

Re: [Scikit-learn-general] Contributing to scikit-learn

2012-06-04 Thread Gael Varoquaux
Hi Vandana and Shreyas, Welcome and thanks for the interest, With regards to MLP (multi-layer perceptrons), David Marek is right now working on such feature: https://github.com/davidmarek/scikit-learn/tree/gsoc_mlp you can probably pitch in with him: 4 eyes are always better than only 2. With re

[Scikit-learn-general] Contributing to scikit-learn

2012-06-04 Thread Vandana Bachani
Hi, My friend Shreyas and I want to contribute to the scikit-learn code. I want to add code for neural networks (Multi-layer Perceptrons) and Shreyas has some ideas for the Expectation-Maximization algorithm and Gaussian Mixture Models. Please let us know how we can contribute to the code and if we