At the moment you have three options:
1) get more memory
2) do feature selection - 400k features on 200k samples very likely
includes a lot of redundant or irrelevant features
3) submit a PR adding sparse matrix support to the random forest - this
is going to be a lot of work and I doubt it's worth it.
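
To put rough numbers on options 1 and 2, here is a minimal, library-free
sketch. It assumes float64 storage (8 bytes per value) and uses a toy sparse
format of one {feature_index: value} dict per sample. The helper names
dense_size_bytes and prune_rare_features and the min_count cutoff are made up
for illustration and are not part of scikit-learn. Dropping features that are
nonzero in only a handful of samples is just one simple selection heuristic,
not a recommendation for your data.

```python
from collections import Counter


def dense_size_bytes(n_samples, n_features, itemsize=8):
    """Bytes needed for a dense n_samples x n_features array (float64 assumed)."""
    return n_samples * n_features * itemsize


def prune_rare_features(rows, min_count):
    """Drop features that are nonzero in fewer than min_count samples.

    rows is a list of {feature_index: value} dicts, one per sample.
    Returns (pruned_rows, kept): kept lists the surviving original feature
    indices in ascending order, and pruned_rows is re-keyed to positions
    in kept.
    """
    # Count in how many samples each feature is nonzero.
    counts = Counter(j for row in rows for j, v in row.items() if v != 0)
    kept = sorted(j for j, c in counts.items() if c >= min_count)
    new_index = {j: i for i, j in enumerate(kept)}
    pruned = [
        {new_index[j]: v for j, v in row.items() if j in new_index}
        for row in rows
    ]
    return pruned, kept


# Roughly the sizes from this thread, stored densely as float64.
print(dense_size_bytes(200_000, 400_000) / 1e9, "GB")

# Tiny example: features 3 and 7 are rarer than feature 0.
rows = [{0: 1.0, 7: 2.0}, {0: 3.0, 3: 1.0}, {0: 0.5, 7: 4.0}]
pruned, kept = prune_rare_features(rows, min_count=2)
print(kept)
print(pruned)
```

If the pruned data is small enough, dense_size_bytes(n_samples, len(kept))
tells you whether converting it to a dense array is realistic.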

All the best
Brian
On Apr 24, 2013 5:14 AM, "Calvin Morrison" <mutanttur...@gmail.com> wrote:

> get more memory?
>
> On 23 April 2013 17:06, Alex Kopp <ark...@cornell.edu> wrote:
> > Hi,
> >
> > I am looking to build a random forest regression model with a pretty large
> > amount of sparse data. I noticed that I cannot fit the random forest model
> > with a sparse matrix. Unfortunately, a dense matrix is too large to fit in
> > memory. What are my options?
> >
> > For reference, I have just over 400k features and just over 200k training
> > examples
> >
> >
> > ------------------------------------------------------------------------------
> > Try New Relic Now & We'll Send You this Cool Shirt
> > New Relic is the only SaaS-based application performance monitoring service
> > that delivers powerful full stack analytics. Optimize and monitor your
> > browser, app, & servers with just a few lines of code. Try New Relic
> > and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_apr
> > _______________________________________________
> > Scikit-learn-general mailing list
> > Scikit-learn-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> >
