Hi,

The model representation (the tree structure) shouldn't depend on whether
the input data is sparse or dense.
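
To make that concrete, here is a minimal sketch in plain Python. It is a toy,
not scikit-learn's actual Cython code: the array names mirror the attributes
of the fitted tree, but the values, the LEAF marker and the predict helper
are made up for illustration. The tree is stored as parallel arrays, and the
same arrays serve a dense row (a list) and a sparse row (a dict holding only
the non-zero entries).

```python
# Toy tree as parallel arrays: entry i of each list describes node i.
# Hypothetical values; only the layout is the point of this sketch.
LEAF = -1

children_left = [1, LEAF, 3, LEAF, LEAF]
children_right = [2, LEAF, 4, LEAF, LEAF]
feature = [0, LEAF, 2, LEAF, LEAF]        # feature tested at each node
threshold = [0.5, 0.0, 1.5, 0.0, 0.0]     # split threshold at each node
value = [None, "a", None, "b", "c"]       # prediction stored at the leaves


def predict(get_feature):
    """Walk the tree; get_feature(j) returns the sample's j-th feature."""
    node = 0
    while children_left[node] != LEAF:
        if get_feature(feature[node]) <= threshold[node]:
            node = children_left[node]
        else:
            node = children_right[node]
    return value[node]


# The same sample, stored densely and sparsely.
dense_row = [1.0, 0.0, 2.0, 0.0]
sparse_row = {0: 1.0, 2: 2.0}  # missing keys are implicit zeros

dense_pred = predict(lambda j: dense_row[j])
sparse_pred = predict(lambda j: sparse_row.get(j, 0.0))
print(dense_pred, sparse_pred)
```

Only the feature lookup differs between the two calls. The arrays that
describe the tree are untouched, which is why the sparse work belongs in the
splitter rather than in the tree itself.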

You might be interested in this issue:
https://github.com/scikit-learn/scikit-learn/issues/655.

Best,
Arnaud


On 07 Mar 2014, at 18:13, vamsi kaushik <[email protected]> wrote:

> Hi Gael,
> Thanks for that.
> 
> I have some issues to clarify before diving into the proposal itself.
> I have a potential architecture in mind for the sparse implementation.
> 
> First, I feel a separate base splitter (SparseSplitter) must be implemented,
> which is then inherited by the Sparsebestsplitter etc.
> We can construct algorithms for efficiently extracting random features or
> extracting samples.
> 
> But how do we implement a tree that works on sparse data?
> "The binary tree is represented as a number of parallel arrays. The i-th
> element of each array holds information about the node `i`"
> And I don't think trees work on sparse matrices by default.
> 
> Excuse my noobness, but can anyone help me out?
> 
> 
> On Thu, Mar 6, 2014 at 12:01 PM, Gael Varoquaux 
> <[email protected]> wrote:
> Hi Kaushik,
> 
> I suggest that you create a page on the wiki for your proposal and start
> iterating on it there. You also need to ask for feedback on that page on
> the mailing list, and to find two mentors (a mentor and a co-mentor) who
> are willing to mentor you and ideally help build your proposal.
> 
> Cheers,
> 
> Gaël
> 
> On Thu, Mar 06, 2014 at 04:45:58AM +0530, vamsi kaushik wrote:
> > Hi,
> > I am Kaushik and I would like to submit a proposal for GSoC 2014. I have
> > been working on the sparse matrix implementation in decision trees. I
> > would like to write a proposal regarding the project and I would be glad
> > if anyone could help me out.
> 
> > thanks,
> > kaushik varanasi
> 
> > ------------------------------------------------------------------------------
> > Subversion Kills Productivity. Get off Subversion & Make the Move to 
> > Perforce.
> > With Perforce, you get hassle-free workflows. Merge that actually works.
> > Faster operations. Version large binaries.  Built-in WAN optimization and 
> > the
> > freedom to use Git, Perforce or both. Make the move to Perforce.
> > http://pubads.g.doubleclick.net/gampad/clk?id=122218951&iu=/4140/ostg.clktrk
> 
> > _______________________________________________
> > Scikit-learn-general mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> 
> 
> --
>     Gael Varoquaux
>     Researcher, INRIA Parietal
>     Laboratoire de Neuro-Imagerie Assistee par Ordinateur
>     NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>     Phone:  ++ 33-1-69-08-79-68
>     http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
> 

