Just wanted to leave a note on this after I got my new iMac (and installed R64 from the AT&T site): quantreg did run, after topping out at a whopping 12 GB of swap space (Mac OS X, at least, should theoretically have as much swap space as there is free space on the HD; it grows the swap dynamically as memory usage goes up). I did get a "caught segfault" error, but not until I did a ?rqss and clicked on a PDF vignette in the help browser (I was able to run summary(tahoe_rq) with no problem). I don't know whether the Mac help browser has some issue on 64-bit systems; it may be worth looking into.

I figure it's best to work out the parameters (tau) on a random subset first, if only for efficiency's sake, and then deploy the algorithm on the entire dataset -- something along the lines of the sketch below.
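A minimal sketch of that subset-first workflow, assuming the boundary_data columns from the calls quoted below (the 10% fraction and the lambda value are arbitrary placeholders, not recommendations):

library(quantreg)

## Tune tau/lambda on a random ~10% subset first; the fraction is
## arbitrary -- pick whatever your RAM tolerates.
set.seed(1)
idx <- sample(nrow(boundary_data), round(0.1 * nrow(boundary_data)))
fit_sub <- rqss(ltbmu_4_stemsha_30m_exp.img ~ qss(ltbmu_eto_annual_mm.img, lambda = 10),
                tau = 0.99, data = boundary_data[idx, ])

## Once the settings look sensible, rerun the same call on the full data.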
--j

roger koenker wrote:
My earlier comment is probably irrelevant, since you are fitting only one qss component and have no other covariates. A word of warning, though, for when you go back to this on your new machine: you are almost surely going to want to specify a large lambda for the qss component in the rqss call. The default of 1 is likely to produce something very, very rough with such a large dataset.
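For example, something like this (the lambda value here is purely illustrative and would need to be tuned for the data at hand):

tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ qss(ltbmu_eto_annual_mm.img, lambda = 100),
                 tau = 0.99, data = boundary_data)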


url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoen...@uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Urbana, IL 61801



On Jun 24, 2009, at 5:04 PM, Jonathan Greenberg wrote:

Yep, it's looking like a memory issue -- we have 6 GB of RAM and 1 GB of swap. I did notice that the analysis takes far less memory (and runs to completion) if I do:

tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ ltbmu_eto_annual_mm.img, tau = 0.99, data = boundary_data)
  (which I assume fits a line to the quantiles)
vs.
tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ qss(ltbmu_eto_annual_mm.img), tau = 0.99, data = boundary_data)
  (which is fitting a spline)

Unless anyone else has hints as to whether I'm making a mistake in my call (beyond randomly subsetting the data -- I'd like to run the analysis on the full dataset to begin with), I'll just wait until my new computer, which has more RAM, comes in next week. I'd like to fit a spline to the upper 1% of the data. Thanks!

--j


roger koenker wrote:
Jonathan,

Take a look at the output of sessionInfo(); it should say x86_64 if you have a 64-bit installation, or at least I think that is the case.
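For example (the pointer-size check is an extra test of my own, not from the original exchange; 8 bytes means a 64-bit build):

sessionInfo()            # platform string should begin with x86_64
.Machine$sizeof.pointer  # 8 on a 64-bit build, 4 on a 32-bit one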

Regarding rqss(): in my experience, memory problems are usually due to the fact that early in the processing there is a call to model.matrix(), which creates the design (a.k.a. X) matrix for the problem. This matrix is then coerced to matrix.csr sparse format, but the dense form is often too big for the machine to cope with. Ideally, someone would write an R version of model.matrix() that builds the matrix in sparse form from the start, but this is a non-trivial task. (Or at least so it appeared to me when I looked into it a few years ago.) An option is to roll your own X matrix: take a smaller version of the data, apply the formula, look at the structure of X, and then try to make a sparse version of the full X matrix. This is usually not that difficult, but "usually" is based on a rather small sample that may not be representative of your problems.
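A rough sketch of that roll-your-own route, under the assumption of a formula simple enough to assemble by hand (rq.fit.sfn() is quantreg's sparse-matrix fitter; a penalized qss fit would also need the spline basis built by hand, which is the non-trivial part):

library(quantreg)   # loads SparseM, which supplies matrix.csr

## Step 1: apply the formula to a small slice to learn X's layout.
small <- boundary_data[1:1000, ]
str(model.matrix(~ ltbmu_eto_annual_mm.img, data = small))

## Step 2: assemble the full X directly and coerce it to csr form,
## avoiding a huge dense intermediate (here X is just two columns).
X <- as.matrix.csr(cbind(1, boundary_data$ltbmu_eto_annual_mm.img))
y <- boundary_data$ltbmu_4_stemsha_30m_exp.img
fit <- rq.fit.sfn(X, y, tau = 0.99)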

Hope that this helps,

Roger

url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoen...@uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Urbana, IL 61801



On Jun 24, 2009, at 4:07 PM, Jonathan Greenberg wrote:

R-ers:

I installed R 2.9.0 from the Debian package manager on our amd64 system, which currently has 6 GB of RAM. My first question is whether this installation is a true 64-bit build (i.e., should R have access to more than 4 GB of RAM?). I suspect so, because I was running rqss() (from package quantreg, installed via install.packages() -- I noticed it compiled from source) and watched the memory usage spike to 4.9 GB (my input data contains more than 500,000 samples).

With this said, after 30 mins or so of processing, I got the following error:

tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img ~ qss(ltbmu_eto_annual_mm.img), tau = 0.99, data = boundary_data)
Error: cannot allocate vector of size 1.5 Gb

The dataset is a bit big (300 MB or so), so I'm not providing it unless it's necessary to solve this memory problem.

Thoughts? Do I need to compile either R itself "by hand" or the quantreg package?

--j

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307

