[R] memory allocation glitches

2007-08-14 Thread Ben Bolker
(not sure whether this is better for R-devel or R-help ...) I am currently trying to debug someone else's package (they're not available at the moment, and I would like it to work *now*), which among other things allocates memory for a persistent buffer that gets used by various functions. The

Re: [R] memory allocation glitches

2007-08-14 Thread Peter Dalgaard
Ben Bolker wrote: (not sure whether this is better for R-devel or R-help ...) Hardcore debugging is usually better off in R-devel. I'm leaving it in R-help though. I am currently trying to debug someone else's package (they're not available at the moment, and I would like it to work

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-10 Thread Michael Cassin
Thanks for all the comments, The artificial dataset is as representative of my 440MB file as I could design. I did my best to reduce the complexity of my problem to minimal reproducible code as suggested in the posting guidelines. Having searched the archives, I was happy to find that the topic

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-10 Thread Prof Brian Ripley
I don't understand why one would run a 64-bit version of R on a 2GB server, especially if one were worried about object size. You can run 32-bit versions of R on x86_64 Linux (see the R-admin manual for a comprehensive discussion), and most other 64-bit OSes default to 32-bit executables.

Re: [R] R memory usage

2007-08-09 Thread Prof Brian Ripley
See ?gc ?Memory-limits On Wed, 8 Aug 2007, Jun Ding wrote: Hi All, I have two questions in terms of the memory usage in R (sorry if the questions are naive, I am not familiar with this at all). 1) I am running R in a linux cluster. By reading the R helps, it seems there are no default

[R] Memory problem

2007-08-09 Thread Gang Chen
I got a long list of error message repeating with the following 3 lines when running the loop at the end of this mail: R(580,0xa000ed88) malloc: *** vm_allocate(size=327680) failed (error code=3) R(580,0xa000ed88) malloc: *** error: can't allocate region R(580,0xa000ed88) malloc: *** set a

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Michael Cassin
Hi, I've been having similar experiences and haven't been able to substantially improve the efficiency using the guidance in the I/O Manual. Could anyone advise on how to improve the following scan()? It is not based on my real file, please assume that I do need to read in characters, and can't

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
If we add quote = FALSE to the write.csv statement, it's twice as fast to read it in. On 8/9/07, Michael Cassin [EMAIL PROTECTED] wrote: Hi, I've been having similar experiences and haven't been able to substantially improve the efficiency using the guidance in the I/O Manual. Could anyone
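A minimal sketch of the comparison being described (file and data frame names are made up; the 2x speedup is the poster's observation, not a guarantee):

    # write the file without quoting character fields, then time reading it back
    write.csv(df, "noquote.csv", quote = FALSE, row.names = FALSE)
    system.time(read.csv("noquote.csv"))   # compare against reading the quoted version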

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Michael Cassin
Thanks for looking, but my file has quotes. It's also 400MB, and I don't mind waiting, but don't have 6x the memory to read it in. On 8/9/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: If we add quote = FALSE to the write.csv statement, it's twice as fast to read it in. On 8/9/07, Michael

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
Another thing you could try would be reading it into a database and then from there into R. The devel version of sqldf has this capability. That is, it will use RSQLite to read the file directly into the database without going through R at all, and then read it from there into R, so it's a

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
Just one other thing. The command in my prior post reads the data into an in-memory database. If you find that is a problem then you can read it into a disk-based database by adding the dbname argument to the sqldf call naming the database. The database need not exist. It will be created by
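A sketch of the approach described in the last two posts, based on the sqldf examples of that era (file name and format arguments are placeholders, not the poster's actual call):

    library(sqldf)
    f <- file("big.csv")                      # the file is not read into R here
    df <- sqldf("select * from f",
                dbname = tempfile(),          # disk-based SQLite database instead of in-memory
                file.format = list(header = TRUE, row.names = FALSE))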

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Michael Cassin
I really appreciate the advice and this database solution will be useful to me for other problems, but in this case I need to address the specific problem of scan and read.* using so much memory. Is this expected behaviour? Can the memory usage be explained, and can it be made more efficient?

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
One other idea. Don't use byrow = TRUE. Matrices are stored in column order so that might be more efficient. You can always transpose it later. Haven't tested it to see if it helps. On 8/9/07, Michael Cassin [EMAIL PROTECTED] wrote: I really appreciate the advice and this database solution
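Spelled out, the suggestion looks roughly like this (field count and file name are placeholders):

    v <- scan("big.csv", what = "", sep = ",")   # flat character vector; assume 10 fields per record
    m <- matrix(v, nrow = 10)                    # fill by column (the default): each record becomes a column
    # t(m)                                       # transpose later only if a row-per-record layout is really needed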

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Charles C. Berry
On Thu, 9 Aug 2007, Michael Cassin wrote: I really appreciate the advice and this database solution will be useful to me for other problems, but in this case I need to address the specific problem of scan and read.* using so much memory. Is this expected behaviour? Can the memory usage be

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
Try it as a factor: big2 <- rep(letters,length=1e6) object.size(big2)/1e6 [1] 4.000856 object.size(as.factor(big2))/1e6 [1] 4.001184 big3 <- paste(big2,big2,sep='') object.size(big3)/1e6 [1] 36.2 object.size(as.factor(big3))/1e6 [1] 4.001184 On 8/9/07, Charles C. Berry [EMAIL

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Charles C. Berry
I do not see how this helps Mike's case: res <- (as.character(1:1e6)) object.size(res) [1] 3624 object.size(as.factor(res)) [1] 4224 Anyway, my point was that if two character vectors for which all.equal() yields TRUE can differ by almost an order of magnitude in object.size(), and

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Charles C. Berry wrote: On Thu, 9 Aug 2007, Michael Cassin wrote: I really appreciate the advice and this database solution will be useful to me for other problems, but in this case I need to address the specific problem of scan and read.* using so much memory. Is this

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Gabor Grothendieck
The examples were just artificially created data. We don't know what the real case is but if each entry is distinct then factors won't help; however, if they are not distinct then there is a huge potential savings. Also if they are really numeric, as in your example, then storing them as numeric

Re: [R] Memory problem

2007-08-09 Thread Gang Chen
It seems the problem lies in this line: try(fit.lme <- lme(Beta ~ group*session*difficulty+FTND, random = ~1|Subj, Model), tag <- 1); As lme fails for most iterations in the loop, the 'try' function catches one error message for each failed iteration. But the puzzling part is, why does the

[R] R memory usage

2007-08-08 Thread Jun Ding
Hi All, I have two questions in terms of the memory usage in R (sorry if the questions are naive, I am not familiar with this at all). 1) I am running R in a linux cluster. By reading the R helps, it seems there are no default upper limits for vsize or nsize. Is this right? Is there an upper

Re: [R] memory error with 64-bit R in linux

2007-07-19 Thread Paul Gilbert
You might try running top while R runs, to get a better idea of what is happening. 64-bit R takes more memory than 32-bit (longer pointers) and for a large problem I would say that 2GB RAM is a minimum if you want any speed. Slowness is likely related to needing to use swap space. The cannot

[R] memory error with 64-bit R in linux

2007-07-18 Thread zhihua li
Hi netters, I'm using the 64-bit R-2.5.0 on an x86-64 CPU with 2 GB of RAM. The operating system is SUSE 10. The system information is: -uname -a Linux someone 2.6.13-15.15-smp #1 SMP Mon Feb 26 14:11:33 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux I used heatmap to process a matrix of the

Re: [R] memory error with 64-bit R in linux

2007-07-18 Thread jim holtman
Are you paging? That might explain the long run times. How much space are your other objects taking up? The matrix by itself should only require about 13MB if it is numeric. I would guess it is some of the other objects that you have in your working space. Put some gc() in your loop to see
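A small monitoring sketch along those lines (the loop body stands in for the real work):

    for (i in 1:10) {
        ## ... the real computation goes here ...
        print(gc())                                   # memory in use and trigger levels at each iteration
    }
    # sizes of the objects currently in the workspace, largest first
    rev(sort(sapply(ls(), function(x) object.size(get(x)))))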

Re: [R] memory error with 64-bit R in linux

2007-07-18 Thread James MacDonald
The dist object for the rows of the matrix will be 16000x16000, which if there are any copies will easily suck up all of your RAM. A more pertinent question is what use would a heatmap of that size be? How do you plan to visualize 16000 rows? In a pdf? You certainly couldn't publish such a
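The back-of-the-envelope numbers behind that warning:

    16000 * 16000 * 8 / 2^30        # ~1.9 GiB for a full 16000 x 16000 double matrix
    16000 * 15999 / 2 * 8 / 2^30    # ~0.95 GiB even for the lower-triangle 'dist' object, before any copies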

Re: [R] memory error with 64-bit R in linux

2007-07-18 Thread zhihua li
it wrong? Thanks a lot! From: jim holtman [EMAIL PROTECTED] To: zhihua li [EMAIL PROTECTED] CC: r-help@stat.math.ethz.ch Subject: Re: [R] memory error with 64-bit R in linux Date: Wed, 18 Jul 2007 17:50:31 -0500 Are you paging? That might explain the long run times. How much space are your

Re: [R] memory error with 64-bit R in linux

2007-07-18 Thread jim holtman
-help@stat.math.ethz.ch Subject: Re: [R] memory error with 64-bit R in linux Date: Wed, 18 Jul 2007 17:50:31 -0500 Are you paging? That might explain the long run times. How much space are your other objects taking up? The matrix by itself should only require about 13MB if it is numeric. I would

[R] memory problem

2007-07-14 Thread Li, Xue
Hi, My computer has 2GB of RAM and I also request 2GB of virtual RAM from the C drive, so in total I have 4GB of RAM. Before I open the R workspace, I also add C:\Program Files\R\R-2.5.0\bin\Rgui.exe --max-mem-size=3000Mb--max-vsize=3000Mb into the target of R by right-clicking the R

[R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-06-26 Thread ivo welch
dear R experts: I am of course no R expert, but use it regularly. I thought I would share some experimentation with memory use. I run a linux machine with about 4GB of memory, and R 2.5.0. Upon startup, gc() reports used (Mb) gc trigger (Mb) max used (Mb) Ncells 268755 14.4

Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-06-26 Thread Prof Brian Ripley
The R Data Import/Export Manual points out several ways in which you can use read.csv more efficiently. On Tue, 26 Jun 2007, ivo welch wrote: dear R experts: I am of course no R experts, but use it regularly. I thought I would share some experimentation with memory use. I run a linux
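A sketch of the kind of tuning that manual suggests (column types and counts are placeholders for the real file):

    df <- read.csv("big.csv",
                   colClasses = c("integer", "numeric", "character"),  # skip type guessing
                   nrows = 1e6,              # a generous upper bound helps pre-allocation
                   comment.char = "",        # turn off comment scanning
                   quote = "\"")             # or "" if the file really has no quoted fields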

[R] Memory increase in R

2007-04-18 Thread Hong Su An
Dear All: Please help me to increase the memory in R. I am trying to make a Euclidean distance matrix. The number of rows in the data is 500,000. Therefore, the dimension of the Euclidean distance matrix is 500,000*500,000. When I run the data in R, R could not make the distance matrix because of memory

Re: [R] Memory increase in R

2007-04-18 Thread jim holtman
You would need 2TB (2,000,000,000,000 bytes) to store a single copy of your data. You probably need to rescale your problem. Even if you had the memory, the computation would take a very long time. On 4/18/07, Hong Su An [EMAIL PROTECTED] wrote: Dear All: Please help me to increase the memory in R.
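The arithmetic behind the 2 TB figure:

    500000 * 500000 * 8      # = 2e12 bytes (~1.8 TiB) for one dense double matrix, ignoring any copies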

Re: [R] Memory management

2007-04-12 Thread yoooooo
Okay thanks, I'm going through the docs now.. and I came across this.. The named field is set and accessed by the SET_NAMED and NAMED macros, and takes values 0, 1 and 2. R has a `call by value' illusion, so an assignment like b <- a appears to make a copy of a and refer to it as b.

Re: [R] Memory management

2007-04-11 Thread yoooooo
I guess I have more reading to do. Are there any websites where I can read up on memory management, or specifically on what happens when we 'pass in' variables, and which strategy is better in which situation? Thanks~ - y Prof Brian Ripley wrote: On Tue, 10 Apr 2007, yoo wrote: Hi

Re: [R] Memory management

2007-04-11 Thread Prof Brian Ripley
Start with the 'R Internals' manual. R has 'call by value' semantics, but lazy copying (the idea is to make a copy only when an object is changed and there are still references to the original version, but that idea is partially implemented). 'which strategy is better at which situation' is
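A small illustration of the lazy-copying behaviour described above (tracemem() needs an R build with memory profiling enabled, which is not universal):

    a <- runif(1e6)
    tracemem(a)
    b <- a         # no copy yet: a and b share the same memory
    b[1] <- 0      # modifying b forces the copy, and tracemem reports it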

Re: [R] Memory management

2007-04-11 Thread Charilaos Skiadas
Before you go down that road, I would recommend first seeing if it is really a problem. Premature code optimization is in my opinion never a good idea. Also, reading the Details on ?attach you will find this: The database is not actually attached. Rather, a new environment is

[R] Memory management

2007-04-10 Thread yoooooo
Hi all, I'm just curious how memory management works in R... I need to run an optimization that keeps calling the same function with a large set of parameters... so then I start to wonder if it's better if I attach the variables first vs passing them in (coz that involves a lot of copying.. )

Re: [R] Memory management

2007-04-10 Thread Prof Brian Ripley
On Tue, 10 Apr 2007, yoo wrote: Hi all, I'm just curious how memory management works in R... I need to run an optimization that keeps calling the same function with a large set of parameters... so then I start to wonder if it's better if I attach the variables first vs passing them in

[R] memory, speed, and assigning results into new v. existing variable

2007-03-23 Thread David L. Van Brunt, Ph.D.
I have a very large data frame, and I'm doing a conversion of all columns into factors. Takes a while (thanks to folks here though, for making faster!), but am wondering about optimization from a memory perspective... Internally, am I better off assigning into a new data frame, or doing one of
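One common idiom for the conversion itself, modifying the existing data frame column by column (df stands in for the real object; whether this beats building a fresh data frame is exactly the question being asked):

    df[] <- lapply(df, factor)    # replace every column with its factor version, keeping the data.frame attributes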

Re: [R] memory, speed, and assigning results into new v. existing variable

2007-03-23 Thread David L. Van Brunt, Ph.D.
. Van Brunt, Ph.D. Sent: Friday, March 23, 2007 11:15 AM To: R-Help List Subject: [R] memory, speed, and assigning results into new v. existing variable I have a very large data frame, and I'm doing a conversion of all columns into factors. Takes a while (thanks to folks here though, for making

[R] Memory error

2007-03-08 Thread Andrew Perrin
Greetings- Running R 2.4.0 under Debian Linux, I am getting a memory error trying to read a very large file: library(foreign) oldgrades.df <- read.spss('Individual grades with AI (Nov 7 2006).sav', to.data.frame=TRUE) Error: cannot allocate vector of size 10826 Kb This file is, granted,

Re: [R] Memory error

2007-03-08 Thread Uwe Ligges
Andrew Perrin wrote: Greetings- Running R 2.4.0 under Debian Linux, I am getting a memory error trying to read a very large file: library(foreign) oldgrades.df <- read.spss('Individual grades with AI (Nov 7 2006).sav', to.data.frame=TRUE) Error: cannot allocate vector of size 10826 Kb

Re: [R] Memory error

2007-03-08 Thread Andrew Perrin
Zoinks, thanks! I will seek to pare that file down and try again. Andy -- Andrew J Perrin - andrew_perrin (at) unc.edu - http://perrin.socsci.unc.edu Assistant Professor of Sociology; Book Review Editor, _Social Forces_

Re: [R] Memory error

2007-03-08 Thread Thomas Lumley
On Thu, 8 Mar 2007, Andrew Perrin wrote: Greetings- Running R 2.4.0 under Debian Linux, I am getting a memory error trying to read a very large file: library(foreign) oldgrades.df <- read.spss('Individual grades with AI (Nov 7 2006).sav', to.data.frame=TRUE) Error: cannot allocate vector

Re: [R] Memory Limits in Ubuntu Linux

2007-03-07 Thread Bos, Roger
as good. Thanks, Roger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 06, 2007 5:37 PM To: Bos, Roger Subject: RE: [R] Memory Limits in Ubuntu Linux Thanks for your prompt reply! The windows 3GB switch is quite problematic - it was not useable

Re: [R] Memory Limits in Ubuntu Linux

2007-03-07 Thread davidkat
not quite as good as my Windows setup with Tinn-R, but almost as good. Thanks, Roger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 06, 2007 5:37 PM To: Bos, Roger Subject: RE: [R] Memory Limits in Ubuntu Linux Thanks for your prompt

Re: [R] Memory Limits in Ubuntu Linux

2007-03-07 Thread Bos, Roger
line. HTH, Roger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 07, 2007 12:27 PM To: Bos, Roger Cc: r-help@stat.math.ethz.ch Subject: RE: [R] Memory Limits in Ubuntu Linux Thanks for the tips, Roger. fyi: When I added /3GB to the boot.ini

[R] Memory Limits in Ubuntu Linux

2007-03-06 Thread davidkat
I am an R user trying to get around the 2Gig memory limit in Windows, so here I am days later with a working Ubuntu, and R under Ubuntu. But - the memory problems seem worse than ever. R code that worked under Windows fails, unable to allocate memory. Searching around the web, it appears that

Re: [R] Memory Limits in Ubuntu Linux

2007-03-06 Thread Uwe Ligges
[EMAIL PROTECTED] wrote: I am an R user trying to get around the 2Gig memory limit in Windows, so here I am days later with a working Ubuntu, and R under Ubuntu. But - the memory problems seem worse than ever. R code that worked under windows fails, unable to allocate memory. Searching

Re: [R] Memory Limits in Ubuntu Linux

2007-03-06 Thread Christos Hatzis
www.nuverabio.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Tuesday, March 06, 2007 3:44 PM To: r-help@stat.math.ethz.ch Subject: [R] Memory Limits in Ubuntu Linux I am an R user trying to get around the 2Gig memory limit

Re: [R] Memory Limits in Ubuntu Linux

2007-03-06 Thread Dirk Eddelbuettel
On 6 March 2007 at 12:43, [EMAIL PROTECTED] wrote: | I am an R user trying to get around the 2Gig memory limit in Windows, so The real limit on 32-bit systems is a 3GB address space. R under Windows can get there, see the R-Windows FAQ. | here I am days later with a working Ubuntu, and R under

Re: [R] memory management uestion [Broadcast]

2007-02-20 Thread Liaw, Andy
I don't see why making copies of the columns you need inside the loop is better memory management. If the data are in a matrix, accessing elements is quite fast. If you're worrying about speed of that, do what Charles suggest: work with the transpose so that you are accessing elements in the

Re: [R] memory management uestion [Broadcast]

2007-02-20 Thread Federico Calboli
Liaw, Andy wrote: I don't see why making copies of the columns you need inside the loop is better memory management. If the data are in a matrix, accessing elements is quite fast. If you're worrying about speed of that, do what Charles suggest: work with the transpose so that you are

Re: [R] memory management uestion [Broadcast]

2007-02-20 Thread Charles C. Berry
On Tue, 20 Feb 2007, Federico Calboli wrote: Liaw, Andy wrote: I don't see why making copies of the columns you need inside the loop is better memory management. If the data are in a matrix, accessing elements is quite fast. If you're worrying about speed of that, do what Charles

Re: [R] memory management uestion [Broadcast]

2007-02-20 Thread Federico Calboli
Charles C. Berry wrote: This is a bit different than your original post (where it appeared that you were manipulating one row of a matrix at a time), but the issue is the same. As suggested in my earlier email this looks like a caching issue, and this is not peculiar to R. Viz.

[R] memory management uestion

2007-02-19 Thread Federico Calboli
Hi All, I would like to ask the following. I have an array of data in an object, let's say X. I need to use a for loop on the elements of one or more columns of X, and I am having a debate with a colleague about the best memory management. I believe that if I do: col1 = X[,1] col2 = X[,2]

Re: [R] memory management uestion

2007-02-19 Thread Charles C. Berry
On Mon, 19 Feb 2007, Federico Calboli wrote: Hi All, I would like to ask the following. I have an array of data in an object, let's say X. I need to use a for loop on the elements of one or more columns of X and I am having a debate with a colleague about the best memory management.

Re: [R] memory management uestion

2007-02-19 Thread Federico Calboli
Charles C. Berry wrote: Whoa! You are accessing one ROW at a time. Either way this will tangle up your cache if you have many rows and columns in your original data. You might do better to do Y <- t( X ) ### use '<-' ! for (i in whatever ){ do something using Y[ , i ] } My
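Spelled out, that suggestion looks roughly like this (names are placeholders):

    Y <- t(X)                        # pay for the transpose once
    for (i in seq_len(ncol(Y))) {
        v <- Y[, i]                  # a column of Y is a row of X, read from contiguous memory
        ## ... do something with v ...
    }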

Re: [R] memory-efficient column aggregation of a sparse matrix

2007-02-01 Thread Douglas Bates
On 1/31/07, Jon Stearley [EMAIL PROTECTED] wrote: I need to sum the columns of a sparse matrix according to a factor - ie given a sparse matrix X and a factor fac of length ncol(X), sum the elements by column factors and return the sparse matrix Y of size nrow(X) by nlevels(f). The appended

Re: [R] memory-efficient column aggregation of a sparse matrix

2007-02-01 Thread roger koenker
Doug is right, I think, that this would be easier with full indexing using the matrix.coo class, if you want to use SparseM. But then the tapply approach seems to be the way to go. url: www.econ.uiuc.edu/~roger Roger Koenker email: [EMAIL PROTECTED] Department of Economics

Re: [R] memory-efficient column aggregation of a sparse matrix

2007-02-01 Thread Jon Stearley
On Feb 1, 2007, at 6:22 AM, Douglas Bates wrote: It turns out that in the sparse matrix code used by the Matrix package the triplet representation allows for duplicate index positions with the convention that the resulting value at a position is the sum of the values of any triplets with
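A sketch of that idea using the Matrix package (X and fac as in the original post, with X assumed to be a sparse Matrix; duplicated (i, j) pairs are summed when the triplets are compressed, which is what performs the aggregation):

    library(Matrix)
    Xt <- as(X, "dgTMatrix")                               # triplet form; i/j slots are 0-based
    Y  <- sparseMatrix(i    = Xt@i + 1,
                       j    = as.integer(fac)[Xt@j + 1],   # remap each column index to its factor level
                       x    = Xt@x,
                       dims = c(nrow(X), nlevels(fac)))    # nrow(X) by nlevels(fac) result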

[R] memory-efficient column aggregation of a sparse matrix

2007-01-31 Thread Jon Stearley
I need to sum the columns of a sparse matrix according to a factor - ie given a sparse matrix X and a factor fac of length ncol(X), sum the elements by column factors and return the sparse matrix Y of size nrow(X) by nlevels(f). The appended code does the job, but is unacceptably

Re: [R] Memory leak with character arrays?

2007-01-18 Thread Jean lobry
Dear Peter, The file that I'm reading contains the upstream regions of the yeast genome, with each upstream region labeled using a FASTA header, i.e.: FASTA header for gene 1 upstream region. . FASTA header for gene 2 upstream you may

[R] Memory leak with character arrays?

2007-01-17 Thread Peter Waltman
Hi - When I'm trying to read in a text file into a labeled character array, the memory footprint of R will exceed 4 GB or more. I've seen this behavior on Mac OS X, Linux for AMD64 and x86_64, and the R versions are 2.4, 2.4 and 2.2, respectively. So, it would seem that this is

Re: [R] Memory leak with character arrays?

2007-01-17 Thread jim holtman
What does the FASTA header look like? You are using 'gene' to access things in the array, and if (for example) 'gene' is a character vector of 10, then for every element of the vectors that you are using (I count about 4-5 that use this index) you are going to have at least 550 * 6000 * 5 * 10

Re: [R] Memory leak with character arrays?

2007-01-17 Thread Peter Waltman

Re: [R] Memory leak with character arrays?

2007-01-17 Thread Peter Waltman

Re: [R] memory problem --- use sparse matrices

2007-01-09 Thread Zoltan Kmetty
Unfortunately, I have to fill all the cells with numbers, so I need a better machine, or I have to split the data into smaller parts, but that way is much slower; I don't see any other alternative. But thanks for your help, because I work with big networks too (1 vertex), and

Re: [R] memory problem

2007-01-08 Thread Thomas Lumley
On Sat, 6 Jan 2007, Zoltan Kmetty wrote: Hi! I had some memory problem with R - hope somebody could tell me a solution. I work with very large datasets, but R cannot allocate enough memoty to handle these datasets. You haven't said what you want to do with these datasets. -thomas

Re: [R] memory problem --- use sparse matrices

2007-01-08 Thread Martin Maechler
UweL == Uwe Ligges [EMAIL PROTECTED] on Sun, 07 Jan 2007 09:42:08 +0100 writes: UweL Zoltan Kmetty wrote: Hi! I had some memory problem with R - hope somebody could tell me a solution. I work with very large datasets, but R cannot allocate enough

Re: [R] memory problem

2007-01-07 Thread Uwe Ligges
Zoltan Kmetty wrote: Hi! I had some memory problem with R - hope somebody could tell me a solution. I work with very large datasets, but R cannot allocate enough memory to handle these datasets. I want to work with a matrix with rows = 100,000,000 and columns = 10. I know this is 1 milliard

Re: [R] memory problem

2007-01-07 Thread Bos, Roger
: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges Sent: Sunday, January 07, 2007 3:42 AM To: Zoltan Kmetty Cc: r-help@stat.math.ethz.ch Subject: Re: [R] memory problem Zoltan Kmetty wrote: Hi! I had some memory problem with R - hope somebody could tell me a solution. I work

[R] memory problem

2007-01-06 Thread Zoltan Kmetty
Hi! I had some memory problem with R - hope somebody could tell me a solution. I work with very large datasets, but R cannot allocate enough memory to handle these datasets. I want to work with a matrix with rows = 100,000,000 and columns = 10. I know this is 1 milliard (10^9) cases, but I thought R could handle
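The size such a matrix would need, before any copies are made:

    1e8 * 10 * 8 / 2^30     # ~7.5 GiB as doubles, far beyond a 32-bit address space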

Re: [R] memory limits in R loading a dataset and using the package tree

2007-01-05 Thread Weiwei Shi
IMHO, R is not good at really large-scale data mining, esp. when the algorithm is complicated. The alternatives are 1. sampling your data; sometimes you really do not need that large number of records and the accuracy might already be good enough when you load less. 2. find an alternative

Re: [R] memory limits in R loading a dataset and using the packagetree

2007-01-05 Thread Sicotte, Hugues Ph.D.
PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Weiwei Shi Sent: Friday, January 05, 2007 2:12 PM To: domenico pestalozzi Cc: r-help@stat.math.ethz.ch Subject: Re: [R] memory limits in R loading a dataset and using the packagetree IMHO, R is not good at really large-scale data mining, esp. when

[R] memory limits in R loading a dataset and using the package tree

2007-01-04 Thread domenico pestalozzi
I think the question is discussed in other threads, but I don't find exactly what I want. I'm working in Windows XP with 2GB of memory and a Pentium 4 3.00GHz. I need to work with large datasets, generally from 300,000 to 800,000 records (according to the project), and about 300

Re: [R] memory limits in R loading a dataset and using the package tree

2007-01-04 Thread Prof Brian Ripley
Please read the rw-FAQ Q2.9. There are ways to raise the limit, and you have not told us that you used them (nor the version of R you used, which matters as the limits are version-specific). Beyond that, there are ways to use read.table more efficiently: see its help page and the 'R Data

Re: [R] Memory problem on a linux cluster using a large data set [Broadcast]

2006-12-21 Thread Iris Kolder
Kolder [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch; N.C. Onland-moret [EMAIL PROTECTED] Sent: Monday, December 18, 2006 7:48:23 PM Subject: RE: [R] Memory problem on a linux cluster using a large data set [Broadcast] In addition to my off-list reply to Iris (pointing her to an old post

Re: [R] Memory problem on a linux cluster using a large data set [Broadcast]

2006-12-21 Thread Thomas Lumley
On Thu, 21 Dec 2006, Iris Kolder wrote: Thank you all for your help! So with all your suggestions we will try to run it on a computer with a 64-bit processor. But I've been told that the new R versions all work on a 32-bit processor. I read in other posts that only the old R versions

Re: [R] Memory problem on a linux cluster using a large data set [Broadcast]

2006-12-21 Thread Martin Morgan
Section 8 of the Installation and Administration guide says that on 64-bit architectures the 'size of a block of memory allocated is limited to 2^32-1 (8 GB) bytes'. The wording 'a block of memory' here is important, because this sets a limit on a single allocation rather than the memory consumed

[R] Memory problem on a linux cluster using a large data set

2006-12-18 Thread Iris Kolder
Hello, I have a large data set: 320,000 rows and 1000 columns. All the data has the values 0, 1, 2. I wrote a script to remove all the rows with more than 46 missing values. This works perfectly on a smaller dataset. But the problem arises when I try to run it on the larger data set: I get an error

Re: [R] Memory problem on a linux cluster using a large data set

2006-12-18 Thread Martin Morgan
Iris -- I hope the following helps; I think you have too much data for a 32-bit machine. Martin Iris Kolder [EMAIL PROTECTED] writes: Hello, I have a large data set 320.000 rows and 1000 columns. All the data has the values 0,1,2. It seems like a single copy of this data set will be at
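Rough sizing of that data set (one copy only):

    320000 * 1000 * 8 / 2^30    # ~2.4 GiB as doubles
    320000 * 1000 * 4 / 2^30    # ~1.2 GiB even as integers, so a few copies exhaust a 32-bit process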

Re: [R] Memory problem on a linux cluster using a large data set [Broadcast]

2006-12-18 Thread Liaw, Andy
In addition to my off-list reply to Iris (pointing her to an old post of mine that detailed the memory requirement of RF in R), she might consider the following: - Use larger nodesize - Use sampsize to control the size of bootstrap samples Both of these have the effect of reducing sizes of trees
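A sketch of those two suggestions (argument values are placeholders; x and y stand for the predictors and response):

    library(randomForest)
    rf <- randomForest(x, y,
                       nodesize = 50,        # larger terminal nodes, hence smaller trees
                       sampsize = 10000)     # smaller bootstrap sample per tree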

Re: [R] memory problem [cluster]

2006-12-05 Thread Martin Maechler
Roger == Roger Bivand [EMAIL PROTECTED] on Sat, 2 Dec 2006 22:11:12 +0100 (CET) writes: Roger On Sat, 2 Dec 2006, Dylan Beaudette wrote: Hi Stephano, Roger Looks like you used my example verbatim Roger (http://casoilresource.lawr.ucdavis.edu/drupal/node/221) Roger :)

Re: [R] memory problem [cluster]

2006-12-02 Thread Dylan Beaudette
Hi Stephano, Looks like you used my example verbatim (http://casoilresource.lawr.ucdavis.edu/drupal/node/221) :) While my approach has not *yet* been published, the original source [4] by Roger Bivand certainly has. Just a reminder. That said, I would highly recommend reading up on the

Re: [R] memory problem [cluster]

2006-12-02 Thread Roger Bivand
On Sat, 2 Dec 2006, Dylan Beaudette wrote: Hi Stephano, Looks like you used my example verbatim (http://casoilresource.lawr.ucdavis.edu/drupal/node/221) :) From exchanges on R-sig-geo, I believe the original questioner is feeding NAs to clara, and the error message in clara() is overrunning

[R] memory problem

2006-12-01 Thread Massimo Di Stefano
Hi all, frustrated by this error: today I bought a 1 GB memory module for my laptop, so it now has 1.28 GB instead of the old 512 MB, but I get the same error :-( How can I fix it? Repeated for a small area (about 20x20 km and res=20m) it works fine! Do you have any suggestions? Is there a method to look

[R] memory management

2006-10-30 Thread Federico Calboli
Hi All, just a quick (?) question while I wait for my code to run... I'm comparing the identity of the lines of a dataframe, doing all possible pairwise comparisons. In doing so I use identical(), but that's by the way. I'm doing a (not so) quick and dirty check, and subsetting the data as

Re: [R] memory management

2006-10-30 Thread bogdan romocea
:[EMAIL PROTECTED] On Behalf Of Federico Calboli Sent: Monday, October 30, 2006 11:35 AM To: r-help Subject: [R] memory management Hi All, just a quick (?) question while I wait my code runs... I'm comparing the identity of the lines of a dataframe, doing all possible pairwise comparisons

[R] Memory allocation

2006-09-07 Thread alex lam \(RI\)
Dear list, I have been trying to run the function qvalue under the package qvalue on a vector with about 20 million values. asso_p.qvalue <- qvalue(asso_p.vector) Error: cannot allocate vector of size 156513 Kb sessionInfo() Version 2.3.1 (2006-06-01) i686-pc-linux-gnu attached base packages:

Re: [R] Memory allocation

2006-09-07 Thread Prof Brian Ripley
On Thu, 7 Sep 2006, alex lam (RI) wrote: Dear list, I have been trying to run the function qvalue under the package qvalue on a vector with about 20 million values. asso_p.qvalue <- qvalue(asso_p.vector) Error: cannot allocate vector of size 156513 Kb sessionInfo() Version 2.3.1

[R] Memory issues

2006-09-03 Thread Davendra Sohal
Hi, I'm using R on Windows and upgraded the computer memory to 4GB, as R was telling me that it is out of memory (for making heatmaps). It still says that the maximum memory is 1024Mb, even if I increase it using memory.limit and memory.size. Is there a way to permanently increase R's memory quota
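For reference, the Windows-only calls mentioned (values are illustrative; see ?memory.limit and the rw-FAQ answer cited in the reply below):

    memory.limit()              # current cap in Mb
    memory.limit(size = 3000)   # raise it for this session; 32-bit Windows stays well under 4 GB per process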

Re: [R] Memory issues

2006-09-03 Thread Prof Brian Ripley
Please do read the rw-FAQ, Q2.9 (and the posting guide). In particular, Windows never gives 4GB to a single 32-bit user process. On Sun, 3 Sep 2006, Davendra Sohal wrote: Hi, I'm using R on Windows and upgraded the computer memory to 4GB, as R was telling me that it is out of memory (for

Re: [R] Memory usage decreases drastically after save workspace, quit, restart, load workspace

2006-08-26 Thread Prof Brian Ripley
On Sat, 26 Aug 2006, Klaus Thul wrote: Dear all, I have the following problem: - I have written a program in R which runs out of physical memory on my MacBook Pro with 1.5 GB RAM How does R know about physical memory on a virtual-memory OS? I presume the symptom is swapping by your

[R] Memory usage decreases drastically after save workspace, quit, restart, load workspace

2006-08-25 Thread Klaus Thul
Dear all, I have the following problem: - I have written a program in R which runs out of physical memory on my MacBook Pro with 1.5 GB RAM - The memory usage is much higher than I would expect from the actual data in my global environment - When I save the workspace, quit R, restart

Re: [R] memory problems when combining randomForests

2006-08-01 Thread Ramon Diaz-Uriarte
Dear Eleni, But if every time you remove a variable you pass some test data (ie data not used to train the model) and base the performance of the new, reduced model on the error rate on the confusion matrix for the test data, then this overfitting should not be an issue, right? (unless of

Re: [R] memory problems when combining randomForests

2006-07-31 Thread Eleni Rapsomaniki
Hello I've just realised attachments are not allowed, so the data for the example in my previous message is: pos.df = read.table("http://www.savefile.com/projects3.php?fid=6240314&pid=847249&key=119090", header=T)

Re: [R] memory problems when combining randomForests

2006-07-31 Thread Weiwei Shi
Hi, Andy: What is Jerry Friedman's ISLE? I googled it and did not find the paper on it. Could you give me a link, please? Thanks, Weiwei On 7/31/06, Eleni Rapsomaniki [EMAIL PROTECTED] wrote: Hello I've just realised attachments are not allowed, so the data for the example in my

Re: [R] memory problems when combining randomForests

2006-07-31 Thread Eleni Rapsomaniki
Hi Andy, I get different order of importance for my variables depending on their order in the training data. Perhaps answering my own question, the change in importance rankings could be attributed to the fact that before passing my data to randomForest I impute the missing values randomly

Re: [R] memory problems when combining randomForests [Broadcast]

2006-07-31 Thread Liaw, Andy
@stat.math.ethz.ch Subject: Re: [R] memory problems when combining randomForests [Broadcast] Hi, Andy: What's the Jerry Friedman's ISLE? I googled it and did not find the paper on it. Could you give me a link, please? Thanks, Weiwei On 7/31/06, Eleni Rapsomaniki [EMAIL PROTECTED] mailto:[EMAIL PROTECTED

Re: [R] memory problems when combining randomForests

2006-07-31 Thread Weiwei Shi
Found it in another paper: importance sampled learning ensemble (ISLE), which originates from Friedman and Popescu (2003). On 7/31/06, Weiwei Shi [EMAIL PROTECTED] wrote: Hi, Andy: What is Jerry Friedman's ISLE? I googled it and did not find the paper on it. Could you give me a link,
