Re: [R] R Memory Usage Concerns

2009-09-15 Thread Evan Klitzke
On Mon, Sep 14, 2009 at 10:01 PM, Henrik Bengtsson h...@stat.berkeley.edu wrote: As already suggested, you're (much) better off if you specify colClasses, e.g. tab <- read.table("~/20090708.tab", colClasses=c("factor", "double", "double")); Otherwise, R has to load all the data, make a best guess
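The colClasses advice above can be sketched as a small self-contained example; the file here is a synthetic temp file standing in for the real log data, and "numeric" is used because it is the class name documented for colClasses in ?read.table:

```r
## Minimal sketch of the colClasses suggestion, on synthetic data.
f <- tempfile(fileext = ".tab")
writeLines(c("alpha 1.5 2.5",
             "beta 3.5 4.5",
             "alpha 5.5 6.5"), f)

## Declaring the classes up front lets read.table skip the
## guess-then-coerce pass that otherwise copies every column.
tab <- read.table(f, colClasses = c("factor", "numeric", "numeric"))

str(tab)   # V1 is a factor, V2 and V3 are numeric
unlink(f)
```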

Re: [R] R Memory Usage Concerns

2009-09-15 Thread Thomas Lumley
On Tue, 15 Sep 2009, Evan Klitzke wrote: On Mon, Sep 14, 2009 at 10:01 PM, Henrik Bengtsson h...@stat.berkeley.edu wrote: As already suggested, you're (much) better off if you specify colClasses, e.g. tab <- read.table("~/20090708.tab", colClasses=c("factor", "double", "double")); Otherwise, R has

Re: [R] R Memory Usage Concerns

2009-09-15 Thread Carlos J. Gil Bellosta
Hello, I do not know whether my package colbycol may help you. It can help you read files that would not otherwise fit into memory. Internally, as the name indicates, data is read into R in a column-by-column fashion. IO times increase, but you need just a fraction of the intermediate memory
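colbycol's own API is not shown in this thread, but the column-at-a-time idea it describes can be sketched in base R: re-read the file once per column, using colClasses = "NULL" to drop every other column, so only one column's worth of intermediate memory is needed at a time (the helper name here is hypothetical):

```r
## Column-by-column reading sketch using only base R.
f <- tempfile()
writeLines(c("alpha 1.5 2.5",
             "beta 3.5 4.5"), f)

## Read only column j of an ncols-column file; "NULL" in colClasses
## tells read.table to skip a column entirely.
read_one_col <- function(file, ncols, j, class) {
  cc <- rep("NULL", ncols)
  cc[j] <- class
  read.table(file, colClasses = cc)[[1]]
}

v2 <- read_one_col(f, ncols = 3, j = 2, class = "numeric")
v2   # the second column: 1.5 3.5
unlink(f)
```

The trade-off matches the one described above: the file is scanned once per column, so IO cost goes up, but peak memory stays near the size of a single column.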

[R] R Memory Usage Concerns

2009-09-14 Thread Evan Klitzke
Hello all, To start with, these measurements are on Linux with R 2.9.2 (64-bit build) and Python 2.6 (also 64-bit). I've been investigating R for some log file analysis that I've been doing. I'm coming at this from the angle of a programmer who's primarily worked in Python. As I've been playing

Re: [R] R Memory Usage Concerns

2009-09-14 Thread jim holtman
When you read your file into R, show the structure of the object: str(tab) and also the size of the object: object.size(tab) This will tell you what your data looks like and the size it takes in R. Also, in read.table, use colClasses to define what the format of the data is; it may make it faster. You
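The two inspection calls suggested above look like this on a small synthetic data frame standing in for the real log data:

```r
## str() and object.size() on a toy data frame.
tab <- data.frame(V1 = factor(c("a", "b", "a")),
                  V2 = c(1.5, 2.5, 3.5))

str(tab)           # compact summary: dimensions, per-column class, first values
object.size(tab)   # approximate bytes the object occupies in R

## print method accepts a units argument for readability
print(object.size(tab), units = "Kb")
```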

Re: [R] R Memory Usage Concerns

2009-09-14 Thread Eduardo Leoni
And, by the way, factors take up _more_ memory than character vectors. object.size(sample(c("a","b"), 1000, replace=TRUE)) 4088 bytes object.size(factor(sample(c("a","b"), 1000, replace=TRUE))) 4296 bytes On Mon, Sep 14, 2009 at 11:35 PM, jim holtman jholt...@gmail.com wrote: When you read your file

Re: [R] R Memory Usage Concerns

2009-09-14 Thread Evan Klitzke
On Mon, Sep 14, 2009 at 8:35 PM, jim holtman jholt...@gmail.com wrote: When you read your file into R, show the structure of the object: ... Here's the data I get: tab <- read.table("~/20090708.tab") str(tab) 'data.frame': 1797601 obs. of 3 variables: $ V1: Factor w/ 6 levels "biz_details",..:

Re: [R] R Memory Usage Concerns

2009-09-14 Thread Evan Klitzke
On Mon, Sep 14, 2009 at 8:58 PM, Eduardo Leoni leoni...@msu.edu wrote: And, by the way, factors take up _more_ memory than character vectors. object.size(sample(c("a","b"), 1000, replace=TRUE)) 4088 bytes object.size(factor(sample(c("a","b"), 1000, replace=TRUE))) 4296 bytes I think this is just

Re: [R] R Memory Usage Concerns

2009-09-14 Thread hadley wickham
I think this is just because you picked short strings. If the factor is mapping the string to a native integer type, the strings would have to be larger for you to notice: object.size(sample(c("a pretty long string", "another pretty long string"), 1000, replace=TRUE)) 8184 bytes
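The comparison above can be re-run side by side; exact byte counts depend on R version and platform (the thread's 4088/4296/8184 figures are from one 64-bit build), so no outputs are asserted here:

```r
## Factor vs character memory, with longer strings.
chars <- sample(c("a pretty long string", "another pretty long string"),
                1000, replace = TRUE)
fact  <- factor(chars)

## A character vector stores one string pointer per element (the string
## contents themselves are shared in R's string cache); a factor stores
## one integer code per element plus a small levels attribute.
object.size(chars)
object.size(fact)
```

With tiny strings the factor's levels and class attributes outweigh the savings, which is why the short-string example earlier in the thread shows the factor as larger; with longer or more numerous strings the integer codes win.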

Re: [R] R Memory Usage Concerns

2009-09-14 Thread hadley wickham
its 32-bit representation. This seems like it might be too conservative to me, since it implies that R allocated exactly as much memory for the lists as there were numbers in the list (e.g. typically in an interpreter like this you'd be allocating on order-of-two boundaries, i.e. sizeof(obj)

Re: [R] R Memory Usage Concerns

2009-09-14 Thread Henrik Bengtsson
As already suggested, you're (much) better off if you specify colClasses, e.g. tab <- read.table("~/20090708.tab", colClasses=c("factor", "double", "double")); Otherwise, R has to load all the data, make a best guess of the column classes, and then coerce (which requires a copy). /Henrik On Mon, Sep