Re: [Scikit-learn-general] GSOC idea

2013-04-29 Thread Şükrü Bezen
Thanks for the feedback. Actually, implementation and testing of association rule learning can be finished sooner than I thought, two weeks in total, because of its simplicity, so the updated schedule would be: • Getting familiar with scikit-learn, API structure etc. (1 week) • Gene

Re: [Scikit-learn-general] GSOC idea

2013-04-29 Thread Gael Varoquaux
On Mon, Apr 29, 2013 at 01:28:09AM +0300, Şükrü Bezen wrote: > • Getting familiar with scikit-learn, API structure etc. (1 week) > • Generating, finding datasets for future use. (1-3 days) > • Implementing association rule learning, (1 week) > • Testing, documenting (1 week) > • Implement

Re: [Scikit-learn-general] GSOC idea

2013-04-28 Thread Şükrü Bezen
Hi again, For collaborative filtering: www.stat.osu.edu/~dmsl/Sarwar_2001.pdf For association rule learning: http://rakesh.agrawal-family.com/papers/vldb94apriori.pdf As for the schedule: - Getting familiar with scikit-learn, API structure etc. (1 week) - Generating, finding datasets

Re: [Scikit-learn-general] GSOC idea

2013-04-24 Thread Vlad Niculae
Thank you, Do you have some references prepared? It would be useful. I am not sure if what is in my head is correct but I think association rule learning is interesting and a kind of method that I would like to see in scikit-learn, as well as finding frequent itemsets. I hope I'm thinking of the

Re: [Scikit-learn-general] GSOC idea

2013-04-24 Thread Şükrü Bezen
Hi Vlad, Focusing on the proposal now and looking for a mentor later sounds good to me. I am considering collaborative filtering with *user similarity* and *item similarity*, and also *association rule learning* for finding the general behaviour of a user-item group. I think those two would be

Re: [Scikit-learn-general] GSOC idea

2013-04-23 Thread Vlad Niculae
Hi Şükrü We can focus on the proposal now and decide later who is better to mentor it. I could do it but it is not the thing I would be the best at mentoring, so to solve the chicken-and-egg problem we can optimize the decisions jointly when the time comes. Did you start working on your proposal

[Scikit-learn-general] GSOC idea: stacked generalization

2013-04-20 Thread Kemal Eren
Hello scikit-learn team, I currently work as a developer for the ilastik project (http://ilastik.org/), and I will be starting a PhD in bioinformatics at UCSD this fall. I would like to participate in the Google Summer of Code this year. I have hacked on scikit-learn for my own work in the past.

Re: [Scikit-learn-general] GSOC idea

2013-04-20 Thread Şükrü Bezen
I am still looking for a mentor to back up this idea of mine. Is anyone interested? On Wed, Apr 17, 2013 at 2:43 AM, Mathieu Blondel wrote: > > > On Mon, Apr 15, 2013 at 10:45 PM, Olivier Grisel > wrote: > >> Also I would rather avoid adding fancy new application specific public >> API just for th

Re: [Scikit-learn-general] GSOC idea

2013-04-16 Thread Mathieu Blondel
On Mon, Apr 15, 2013 at 10:45 PM, Olivier Grisel wrote: > Also I would rather avoid adding fancy new application specific public > API just for the recsys use case. Especially before the 1.0 release. > If we can stick to the existing public fit / transform / predict API > (using scipy.sparse matri

Re: [Scikit-learn-general] GSOC idea

2013-04-15 Thread Şükrü Bezen
Thanks for your valuable feedback. I am not planning to implement any fancy public APIs, at least not before finishing the core part with the existing APIs. Any ideas for a mentor for this project? On Mon, Apr 15, 2013 at 4:45 PM, Olivier Grisel wrote: > 2013/4/15 Andreas Mueller : > > Hi Şükrü. > >

Re: [Scikit-learn-general] GSOC idea

2013-04-15 Thread Olivier Grisel
2013/4/15 Andreas Mueller : > Hi Şükrü. > I think this is an awesome idea. > Finding a good mentor might be a problem, though. Any takers? > Also, I wouldn't set the goals too high. Having a good API that works for > many applications > and a solid and efficient implementation of one or two core techni

Re: [Scikit-learn-general] GSOC idea

2013-04-15 Thread Andreas Mueller
Hi Şükrü. I think this is an awesome idea. Finding a good mentor might be a problem, though. Any takers? Also, I wouldn't set the goals too high. Having a good API that works for many applications and a solid and efficient implementation of one or two core techniques would go a long way, and imho w

[Scikit-learn-general] GSOC idea

2013-04-15 Thread Şükrü Bezen
Hello, My name is Şükrü BEZEN and I am working on my MSc degree at METU Computer Engineering on the topic of recommendation systems. I would like to implement the core recommendation system algorithms in scikit-learn. That would include collaborative filtering, content filtering and some hy

Re: [Scikit-learn-general] GSoC idea

2012-03-24 Thread Andreas
Hi Rishabh. Please have a look at http://wiki.python.org/moin/SummerOfCode/2012. Before applying for a given project/task, you should start on some smaller tasks in the project to familiarize yourself with the code base and working with the devs. The "issues" page is a good place to start https:/

[Scikit-learn-general] GSoC idea

2012-03-24 Thread Rishabh Dixit
Hi all, I am a second-year student at BITS Pilani, India. While going through the GSoC idea page of SciKit, I found that I can work on the idea of implementing a stochastic gradient descent algorithm to learn a multi-layer perceptron. I am well versed in Python and C and currently working on ne

Re: [Scikit-learn-general] GSoc Idea

2012-03-15 Thread Vikram Kamath
Hi Lars, I think there might have been some ambiguity in this. I just want to clarify that I DO NOT intend on replacing the current optimized CART implementation. I am only proposing the addition of the C5.0 implementation as a feature. And yes, I think allowing the user to supply a bo

Re: [Scikit-learn-general] GSoc Idea

2012-03-12 Thread Lars Buitinck
2012/3/12 Vikram Kamath : > 1. Splits in CART are restricted to binary splits (a C4.5/C5.0 D-Tree is > m-ary) All our learners work on numeric data, meaning categorical data must be split into binary features according to a one-of-K representation prior to handing it to a learner. So unless you
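For readers unfamiliar with the one-of-K representation mentioned in the message above, the following is a minimal sketch in plain Python. The helper name `one_of_k` and the sample colour data are hypothetical and chosen only for illustration; this is not scikit-learn's own encoder, just an outline of the idea that each category becomes its own binary feature.

```python
def one_of_k(values):
    """Encode categorical values as binary indicator rows (one-of-K).

    Hypothetical helper for illustration only; not a scikit-learn API.
    """
    # Fix a stable column order: one column per distinct category.
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1  # exactly one active feature per sample
        rows.append(row)
    return categories, rows


categories, rows = one_of_k(["red", "green", "red", "blue"])
```

Each sample ends up with exactly one non-zero entry, so a learner that only supports binary splits on numeric features can still separate the categories, one threshold test per column.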

Re: [Scikit-learn-general] GSoc Idea

2012-03-12 Thread Vikram Kamath
Hi, This is in response to Peter and Andreas' queries about the differences between CART and C4.5/C5.0. 1. Splits in CART are restricted to binary splits (a C4.5/C5.0 D-Tree is m-ary) 2. Differences between C4.5/C5.0 and CART include differences in: a. splitting criteria b. the pruning meth

Re: [Scikit-learn-general] GSoc Idea

2012-03-11 Thread Peter Prettenhofer
2012/3/10 Andreas > Hi Vikram. > Thanks for sending in your idea. > I am not so familiar with the differences between C4.5 and they are not > very clear > to me from the webpage. > From what I understand, C5.0 features include: > - sample weights (called case weights), for which already a p

Re: [Scikit-learn-general] GSoc Idea

2012-03-10 Thread Andreas
Hi Vikram. Thanks for sending in your idea. I am not so familiar with the differences between C4.5 and they are not very clear to me from the webpage. From what I understand, C5.0 features include: - sample weights (called case weights), for which already a pull request exists