2012/3/5 Jinhui Li <[email protected]>:
> Hi there,
>
> Again, I have a question about tree.py.
>
> The fit function of basedecisionregressor converts X to dense format.
>
> In my case, there are 100M training samples and 30K features, and most
> of the values are zeros.
>
> Could you please give some suggestions?

AFAIK nobody has been working on that; it would make a great pull request ;)

The scipy.sparse.csc_matrix data structure is probably the best suited
for decision trees, which slice the dataset feature-wise rather than
sample-wise.
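
To illustrate why CSC fits feature-wise access, here is a minimal sketch on
a toy matrix. It is not how tree.py works today (fit currently densifies X);
the helper name column_nonzeros is hypothetical and only shows that the
nonzero entries of one feature are stored contiguously in a csc_matrix.

```python
import numpy as np
from scipy import sparse

# Toy stand-in for a large, mostly-zero training matrix (samples x features).
X = sparse.csc_matrix(np.array([
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [0.0, 0.0, 4.0],
]))


def column_nonzeros(X_csc, j):
    """Hypothetical helper: (row indices, values) of the nonzeros in feature j.

    In CSC layout, indptr[j]:indptr[j + 1] delimits the entries of column j,
    so the slices below never touch the other columns.
    """
    start, end = X_csc.indptr[j], X_csc.indptr[j + 1]
    return X_csc.indices[start:end], X_csc.data[start:end]


# Nonzero samples and values for feature 1, e.g. as input to a split search.
rows, values = column_nonzeros(X, 1)
```

A split search over one feature would only need to visit these stored
entries and could treat every other sample as zero, which is what makes the
layout attractive when most of the data is zeros.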

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
