Hi Issam.
Please stay on the list ;)

Sorry I have been so critical of your proposal. Merging the two PRs is a good idea.
I just don't think the "speeding up" is realistic. As I said, we will not include Numba in scikit-learn any time soon. And we will not include GPU implementations in scikit-learn any time soon. (I think there is a consensus on this for the time being.)

For the algorithms: deep belief nets are not really much to program. They boil down to "train a couple of RBMs, then throw the weights in a neural network". So implementing a DBN is more about getting the API of RBMs and MLPs right.

Sorry, I am a bit behind on my scikit-learn mail and haven't followed the GSoC process much. Maybe some of the mentors can say more about the current state?

Cheers,
Andy

On 05/04/2013 09:17 AM, Issam wrote:
> Hi Andy,
>
> I agree, Numba Pro is commercial, I put it because someone suggested
> using it in the comments :). But the idea is to provide a speed boost,
> for which basic Numba might suffice.
>
> As per the schedule I devised, the first few weeks are dedicated to
> pulling MLP and RBM, documenting them, testing them on public
> datasets, fixing any serious defects and publishing them for scikit
> users to use. So taking those two pull requests will be the starting
> point.
>
> Even though I am not very deep into "Deep Learning" yet, I consider
> myself quite an expert in Neural Networks (which deep learning is about).
> Also, I have a professor specialized in DL whom I'm consulting
> regularly.
> The problem with not giving such a detailed proposal is that I'm
> having a busy semester, which will end on May 26, during which I will
> be submitting 3 papers on Machine Learning. From May 26 onwards, I would
> be devoting myself full time to deep learning.
>
> You say the Theano developers took several years establishing Theano,
> but deep learning is a relatively new area, so they had to write the
> algorithms from scratch, and probably took a long time developing the
> API, website, servers and the infrastructure of Theano, which scikit
> already has.
> In addition, I will only be developing the main Deep Learning
> algorithm, for instance "Deep Belief Nets". Since its full
> description/pseudo code is already out there, the main difficulty will
> be integrating it into scikit rather than the actual implementation.
>
> Anyhow, if this doesn't work out, I would be implementing Deep Learning
> algorithms anyway throughout the summer :), since it's my future area
> of research.
>
> I only request a mentor to kindly familiarize me with the
> convention that scikit expects for pushing the code into the
> repository; everything else I can do alone or with my supervisor,
> who is an expert in the area :).
>
> Thanks,
> Yours truly,
>
> --Issam
>
>
> On 5/4/2013 9:46 AM, Andy wrote:
>> On 05/03/2013 10:30 PM, Issam wrote:
>>> Hi Andy,
>>>
>>> The main idea behind proposing to use GPU techniques is to have an
>>> efficient implementation. It doesn't need to be GPU techniques,
>>> rather any technique such as Numba Pro that speeds up deep learning
>>> algorithms, which demand a lot of computation.
>>>
>> Sorry, I think you misunderstood my remark.
>> I was not suggesting to use Numba Pro. This is a commercial product
>> and definitely cannot be used in scikit-learn.
>> I was rather asking how you would go about doing an efficient
>> implementation.
>> This is something that needs to be well thought-through, rather than
>> an afterthought.
>> Afaik several labs (!) have been working on Theano for several years
>> (!).
>>
>>> So are you suggesting that there would be no mentor for this
>>> project? The objective is mainly to get scikit-learn started with
>>> general-purpose deep learning algorithms...
>> I don't know. Is there a mentor? Who?
>>
>> I know a lot of the core people (Lars, Gael, Olivier, me) have been
>> very busy and the GSoC is not as organized as it should be.
>> But mentors should be assigned prior to application.
>>
>> For getting sklearn started with deep learning:
>> Let me reiterate: there are MLP and RBM implementations. The RBM is
>> done. It only needs finishing touches.
>> It was done by a deep learning expert, with reviews of people active
>> in the field.
>> We really focused on having a working, well-documented,
>> well-illustrated numpy implementation at first.
>> I am not entirely certain about the current state of the MLP pull
>> request, but I think there are now several working versions of it.
>>
>> Any deep learning contributions to scikit-learn should take these two
>> pull requests as starting points.
>>
>> Cheers,
>> Andy
>

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general