2012/3/5 Jinhui Li <[email protected]>:
> Hi there,
>
> Again, I have a question about tree.py.
>
> The fit function of basedecisionregressor convert the X to dense format.
>
> In my case, there are 100M train samples, 30K features, most data are
> zeros.
>
> please give some suggestions?

AFAIK nobody has been working on that; it would make a great pull request ;)

The scipy.sparse.csc_matrix data structure is probably the best suited for
decision trees, which slice the dataset feature-wise rather than sample-wise.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
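
To illustrate why a CSC layout fits feature-wise access, here is a minimal sketch, not the tree.py implementation. The toy data and the helpers feature_nonzeros and split_on_threshold are hypothetical names made up for this example; only the scipy.sparse.csc_matrix attributes (indptr, indices, data) are real. It shows that the nonzeros of one feature sit in a contiguous slice, so a candidate split can be evaluated without densifying X.

```python
import numpy as np
import scipy.sparse as sp

# Toy data: 6 samples, 4 features, mostly zeros (values are made up).
X_dense = np.array([
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 4.0, 0.0, 0.0],
])
X = sp.csc_matrix(X_dense)


def feature_nonzeros(X_csc, j):
    """Hypothetical helper: (row indices, values) of the nonzeros of feature j.

    In CSC format the entries of column j are stored contiguously between
    indptr[j] and indptr[j + 1], so this is a cheap slice.
    """
    start, end = X_csc.indptr[j], X_csc.indptr[j + 1]
    return X_csc.indices[start:end], X_csc.data[start:end]


def split_on_threshold(X_csc, j, threshold):
    """Hypothetical helper: boolean mask of samples with X[:, j] <= threshold."""
    rows, vals = feature_nonzeros(X_csc, j)
    # Samples not stored in the column are implicit zeros.
    mask = np.full(X_csc.shape[0], 0.0 <= threshold)
    # Overwrite the entries that are explicitly stored.
    mask[rows] = vals <= threshold
    return mask


rows, vals = feature_nonzeros(X, 2)
left = split_on_threshold(X, 2, 1.0)
```

Only the stored entries of the chosen feature are touched; the implicit zeros are handled with a single comparison. A real tree builder would also need sorting and impurity computation per feature, which this sketch leaves out.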
