Re: [R] Memory management

2007-04-12 Thread yoooooo

Okay thanks, I'm going through the docs now... and I came across this: 

The named field is set and accessed by the SET_NAMED and NAMED macros, and
takes values 0, 1 and 2. R has a `call by value' illusion, so an assignment
like 
 b <- a
appears to make a copy of a and refer to it as b. However, if neither a nor
b is subsequently altered there is no need to copy. What really happens is
that a new symbol b is bound to the same value as a and the named field on
the value object is set (in this case to 2). When an object is about to be
altered, the named field is consulted. A value of 2 means that the object
must be duplicated before being changed. 

What does it mean that the new symbol b is bound to the same value as a? 
Does it mean b holds a pointer to the same value object that a points to? 

Thanks!!
- yooo


yoo wrote:
 
 I guess I have more reading to do... Are there any websites where I can
 read up on memory management, or specifically what happens when we 'pass
 in' variables, and which strategy is better in which situation? 
 
 Thanks~
 - y
 
 
 Prof Brian Ripley wrote:
 
 On Tue, 10 Apr 2007, yoo wrote:
 

 Hi all, I'm just curious how memory management works in R... I need to
 run an optimization that keeps calling the same function with a large set
 of parameters... so then I start to wonder if it's better if I attach the
 variables first vs passing them in (since that involves a lot of
 copying...)
 
 Your parenthetical comment is wrong: no copying is needed to 'pass in' a 
 variable.
 
 Thus, I do this
 fn3 <- function(x, y, z, a, b, c){ sum(x, y, z, a, b, c) }
 fn4 <- function(){ sum(x, y, z, a, b, c) }

 rdn <- rep(1.1, times=1e8)
 r <- proc.time()
 for (i in 1:5)
  fn3(rdn, rdn, rdn, rdn, rdn, rdn)
 time1 <- proc.time() - r
 print(time1)

 lt <- list(x = rdn, y = rdn, z = rdn, a = rdn, b = rdn, c = rdn)
 attach(lt)
 r <- proc.time()
 for (i in 1:5)
  fn4()
 time2 <- proc.time() - r
 print(time2)
 detach(lt)

 The output is
 [1] 25.691  0.003 25.735  0.000  0.000
 [1] 25.822  0.005 25.860  0.000  0.000

 Turns out attaching takes longer to run... which is counterintuitive
 (unless the search of the pos=2 environment takes a long time as well).
 Do you guys know why this is the case?
 
 I would not trust timing differences of that nature: they often depend on 
 the state of the system, and in particular of the garbage collector.
 You should be using system.time() for that reason: it calls the garbage 
 collector immediately before timing.
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK            Fax:  +44 1865 272595
 


Re: [R] Memory management

2007-04-11 Thread yoooooo

I guess I have more reading to do... Are there any websites where I can read
up on memory management, or specifically what happens when we 'pass in'
variables, and which strategy is better in which situation? 

Thanks~
- y


Prof Brian Ripley wrote:
 
 On Tue, 10 Apr 2007, yoo wrote:
 

 Hi all, I'm just curious how memory management works in R... I need to
 run an optimization that keeps calling the same function with a large set
 of parameters... so then I start to wonder if it's better if I attach the
 variables first vs passing them in (since that involves a lot of
 copying...)
 
 Your parenthetical comment is wrong: no copying is needed to 'pass in' a 
 variable.
 
 Thus, I do this
 fn3 <- function(x, y, z, a, b, c){ sum(x, y, z, a, b, c) }
 fn4 <- function(){ sum(x, y, z, a, b, c) }

 rdn <- rep(1.1, times=1e8)
 r <- proc.time()
 for (i in 1:5)
  fn3(rdn, rdn, rdn, rdn, rdn, rdn)
 time1 <- proc.time() - r
 print(time1)

 lt <- list(x = rdn, y = rdn, z = rdn, a = rdn, b = rdn, c = rdn)
 attach(lt)
 r <- proc.time()
 for (i in 1:5)
  fn4()
 time2 <- proc.time() - r
 print(time2)
 detach(lt)

 The output is
 [1] 25.691  0.003 25.735  0.000  0.000
 [1] 25.822  0.005 25.860  0.000  0.000

 Turns out attaching takes longer to run... which is counterintuitive
 (unless the search of the pos=2 environment takes a long time as well).
 Do you guys know why this is the case?
 
 I would not trust timing differences of that nature: they often depend on 
 the state of the system, and in particular of the garbage collector.
 You should be using system.time() for that reason: it calls the garbage 
 collector immediately before timing.
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK            Fax:  +44 1865 272595
 


Re: [R] Memory management

2007-04-11 Thread Prof Brian Ripley
Start with the 'R Internals' manual.  R has 'call by value' semantics, but 
lazy copying (the idea is to make a copy only when an object is changed 
and there are still references to the original version, although that idea 
is only partially implemented).
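
A minimal way to watch the lazy copying happen (a sketch; it assumes an R
build with memory profiling enabled, so that tracemem() is available):

  a <- runif(1e6)
  tracemem(a)   # report whenever this object is duplicated
  b <- a        # no copy yet: b is bound to the same value object
  b[1] <- 0     # a duplication is reported just before the change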

'which strategy is better in which situation' is difficult.  'S 
Programming' (see the FAQ) has a lot of accumulated wisdom, though some of 
it has been superseded by changes to S and R.  We keep making changes to 
reduce copying (another slew of changes is planned for 2.6.0), so this is 
something that is very hard to keep up with.

We can tell you that some things are likely to be bad, and 'S Programming' 
is a good place to find out about most of those.

On Wed, 11 Apr 2007, yoo wrote:


 I guess I have more reading to do... Are there any websites where I can read
 up on memory management, or specifically what happens when we 'pass in'
 variables, and which strategy is better in which situation?

 Thanks~
 - y


 Prof Brian Ripley wrote:

 On Tue, 10 Apr 2007, yoo wrote:


 Hi all, I'm just curious how memory management works in R... I need to
 run an optimization that keeps calling the same function with a large set
 of parameters... so then I start to wonder if it's better if I attach the
 variables first vs passing them in (since that involves a lot of
 copying...)

 Your parenthetical comment is wrong: no copying is needed to 'pass in' a 
 variable.

 Thus, I do this
 fn3 <- function(x, y, z, a, b, c){ sum(x, y, z, a, b, c) }
 fn4 <- function(){ sum(x, y, z, a, b, c) }

 rdn <- rep(1.1, times=1e8)
 r <- proc.time()
 for (i in 1:5)
  fn3(rdn, rdn, rdn, rdn, rdn, rdn)
 time1 <- proc.time() - r
 print(time1)

 lt <- list(x = rdn, y = rdn, z = rdn, a = rdn, b = rdn, c = rdn)
 attach(lt)
 r <- proc.time()
 for (i in 1:5)
  fn4()
 time2 <- proc.time() - r
 print(time2)
 detach(lt)

 The output is
 [1] 25.691  0.003 25.735  0.000  0.000
 [1] 25.822  0.005 25.860  0.000  0.000

 Turns out attaching takes longer to run... which is counterintuitive
 (unless the search of the pos=2 environment takes a long time as well).
 Do you guys know why this is the case?

 I would not trust timing differences of that nature: they often depend on
 the state of the system, and in particular of the garbage collector.
 You should be using system.time() for that reason: it calls the garbage
 collector immediately before timing.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595



Re: [R] Memory management

2007-04-11 Thread Charilaos Skiadas
Before you go down that road, I would recommend first seeing if it is  
really a problem. Premature code optimization is in my opinion never  
a good idea.

Also, reading the Details section of ?attach you will find this:

The database is not actually attached. Rather, a new environment is
created on the search path and the elements of a list (including
columns of a data frame) or objects in a save file or an environment
are copied into the new environment. If you use <<- or assign to
assign to an attached database, you only alter the attached copy, not
the original object. (Normal assignment will place a modified version
in the user's workspace: see the examples.) For this reason attach
can lead to confusion.

So in fact it is the attaching that has to do copying, not the other  
way around.
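
A tiny illustration (a sketch; lt here is just a throwaway list):

  lt <- list(x = 1:3)
  attach(lt)
  x <- 99      # normal assignment: a modified copy lands in the workspace
  lt$x         # still 1:3 -- the attached original is untouched
  detach(lt)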

As for references, perhaps there is a better one, but searching for
"pass" in 'Writing R Extensions' I found the following on page 41:

Some memory allocation is obvious in interpreted code, for example,
   y <- x + 1
allocates memory for a new vector y. Other memory allocation is less
obvious and occurs because R is forced to make good on its promise of
'call-by-value' argument passing. When an argument is passed to a function
it is not immediately copied. Copying occurs (if necessary) only when the
argument is modified. This can lead to surprising memory use.

Perhaps a better source is section 4.3.3 of 'The R Language Definition',
on Argument Evaluation.
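
To see that promise in action (a sketch; again assumes a build where
tracemem() is available):

  f <- function(v) sum(v)                  # argument is never modified
  g <- function(v) { v[1] <- 0; sum(v) }   # modification forces a copy
  x <- runif(1e6)
  tracemem(x)
  f(x)   # no duplication reported
  g(x)   # a duplication is reported before v is changed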


On Apr 11, 2007, at 8:25 AM, yoo wrote:


 I guess I have more reading to do... Are there any websites where I can
 read up on memory management, or specifically what happens when we 'pass
 in' variables, and which strategy is better in which situation?

 Thanks~
 - y


 On Tue, 10 Apr 2007, yoo wrote:


 Hi all, I'm just curious how memory management works in R... I need to
 run an optimization that keeps calling the same function with a large
 set of parameters... so then I start to wonder if it's better if I
 attach the variables first vs passing them in (since that involves a
 lot of copying...)


Haris Skiadas
Department of Mathematics and Computer Science
Hanover College



[R] Memory management

2007-04-10 Thread yoooooo

Hi all, I'm just curious how memory management works in R... I need to run an
optimization that keeps calling the same function with a large set of
parameters... so then I start to wonder if it's better if I attach the
variables first vs passing them in (since that involves a lot of copying...)

Thus, I do this
fn3 <- function(x, y, z, a, b, c){ sum(x, y, z, a, b, c) }
fn4 <- function(){ sum(x, y, z, a, b, c) }

rdn <- rep(1.1, times=1e8)
r <- proc.time()
for (i in 1:5)
  fn3(rdn, rdn, rdn, rdn, rdn, rdn)
time1 <- proc.time() - r
print(time1)

lt <- list(x = rdn, y = rdn, z = rdn, a = rdn, b = rdn, c = rdn)
attach(lt)
r <- proc.time()
for (i in 1:5)
  fn4()
time2 <- proc.time() - r
print(time2)
detach(lt)

The output is
[1] 25.691  0.003 25.735  0.000  0.000
[1] 25.822  0.005 25.860  0.000  0.000

Turns out attaching takes longer to run... which is counterintuitive (unless
the search of the pos=2 environment takes a long time as well). Do you guys
know why this is the case? 


Re: [R] Memory management

2007-04-10 Thread Prof Brian Ripley
On Tue, 10 Apr 2007, yoo wrote:


 Hi all, I'm just curious how memory management works in R... I need to run an
 optimization that keeps calling the same function with a large set of
 parameters... so then I start to wonder if it's better if I attach the
 variables first vs passing them in (since that involves a lot of copying...)

Your parenthetical comment is wrong: no copying is needed to 'pass in' a 
variable.

 Thus, I do this
 fn3 <- function(x, y, z, a, b, c){ sum(x, y, z, a, b, c) }
 fn4 <- function(){ sum(x, y, z, a, b, c) }

 rdn <- rep(1.1, times=1e8)
 r <- proc.time()
 for (i in 1:5)
  fn3(rdn, rdn, rdn, rdn, rdn, rdn)
 time1 <- proc.time() - r
 print(time1)

 lt <- list(x = rdn, y = rdn, z = rdn, a = rdn, b = rdn, c = rdn)
 attach(lt)
 r <- proc.time()
 for (i in 1:5)
  fn4()
 time2 <- proc.time() - r
 print(time2)
 detach(lt)

 The output is
 [1] 25.691  0.003 25.735  0.000  0.000
 [1] 25.822  0.005 25.860  0.000  0.000

 Turns out attaching takes longer to run... which is counterintuitive (unless
 the search of the pos=2 environment takes a long time as well). Do you guys
 know why this is the case?

I would not trust timing differences of that nature: they often depend on 
the state of the system, and in particular of the garbage collector.
You should be using system.time() for that reason: it calls the garbage 
collector immediately before timing.
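
For example (a sketch reusing the objects defined above; system.time()
runs gc() first, so the numbers are less noisy):

  system.time(for (i in 1:5) fn3(rdn, rdn, rdn, rdn, rdn, rdn))
  system.time(for (i in 1:5) fn4())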

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595



Re: [R] memory management question [Broadcast]

2007-02-20 Thread Liaw, Andy
I don't see why making copies of the columns you need inside the loop is
better memory management.  If the data are in a matrix, accessing
elements is quite fast.  If you're worrying about speed of that, do what
Charles suggest: work with the transpose so that you are accessing
elements in the same column in each iteration of the loop.

Andy 

From: Federico Calboli
 
 Charles C. Berry wrote:
 
   Whoa! You are accessing one ROW at a time.
   
   Either way this will tangle up your cache if you have many rows and 
   columns in your original data.
   
   You might do better to do
   
   Y <- t( X ) ### use '<-' !
   
   for (i in whatever ){
   do something using Y[ , i ]
   }
 
 My question is NOT how to write the fastest code, it is 
 whether dummy variables (for lack of better words) make the 
 memory management better, i.e. faster, or not.
 
 Best,
 
 Fede
 
 --
 Federico C. F. Calboli
 Department of Epidemiology and Public Health Imperial 
 College, St Mary's Campus Norfolk Place, London W2 1PG
 
 Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193
 
 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com
 


Re: [R] memory management question [Broadcast]

2007-02-20 Thread Federico Calboli
Liaw, Andy wrote:
 I don't see why making copies of the columns you need inside the loop is
 better memory management.  If the data are in a matrix, accessing
 elements is quite fast.  If you're worrying about speed of that, do what
 Charles suggest: work with the transpose so that you are accessing
 elements in the same column in each iteration of the loop.

As I said, this is pretty academic; I am not looking for how to do something 
differently.

Having said that, let me present this code:

for(i in gp){
    new[i,1] = ifelse(srow[i] > 0, new[srow[i], zippo[i]], sav[i])
    new[i,2] = ifelse(drow[i] > 0, new[drow[i], zappo[i]], sav[i])
}

where gp is a large vector and srow and drow are the dummy variables for:

srow = data[,2]
drow = data[,4]

If instead of the dummy variables I access the array directly (and it's a 60 
x 6 array) the loop takes 2-3 days --not sure exactly, I killed it after 48 
hours.

If I use dummy variables the code runs in 10 minutes-ish.

Comments?

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] memory management question [Broadcast]

2007-02-20 Thread Charles C. Berry
On Tue, 20 Feb 2007, Federico Calboli wrote:

 Liaw, Andy wrote:
  I don't see why making copies of the columns you need inside the loop is
  better memory management.  If the data are in a matrix, accessing
  elements is quite fast.  If you're worrying about speed of that, do what
  Charles suggest: work with the transpose so that you are accessing
  elements in the same column in each iteration of the loop.

 As I said, this is pretty academic; I am not looking for how to do something 
 differently.

 Having said that, let me present this code:

 for(i in gp){
    new[i,1] = ifelse(srow[i] > 0, new[srow[i], zippo[i]], sav[i])
    new[i,2] = ifelse(drow[i] > 0, new[drow[i], zappo[i]], sav[i])
 }

 where gp is a large vector and srow and drow are the dummy variables for:

 srow = data[,2]
 drow = data[,4]

 If instead of the dummy variables I access the array directly (and it's a 
 60 x 6 array) the loop takes 2-3 days --not sure exactly, I killed it after 
 48 hours.

 If I use dummy variables the code runs in 10 minutes-ish.

 Comments?


This is a bit different than your original post (where it appeared that 
you were manipulating one row of a matrix at a time), but the issue is the 
same.

As suggested in my earlier email this looks like a caching issue, and this 
is not peculiar to R.

Viz.

Most modern CPUs are so fast that for most program workloads the locality 
of reference of memory accesses, and the efficiency of the caching and 
memory transfer between different levels of the hierarchy, is the 
practical limitation on processing speed. As a result, the CPU spends much 
of its time idling, waiting for memory I/O to complete.

(from http://en.wikipedia.org/wiki/Memory_hierarchy)


The computation you have is challenging to your cache, and the effect of 
dropping unused columns of your 'data' object by assigning the 
columns used to 'srow' and 'drow' has lightened the load.

If you do not know why SAXPY and friends are written as they are, a little 
bit of study will be rewarded by a much better understanding of these 
issues. I think Golub and Van Loan's 'Matrix Computations' touches on this 
(but I do not have my copy close to hand to check).



 Best,

 Fede

 -- 
 Federico C. F. Calboli
 Department of Epidemiology and Public Health
 Imperial College, St Mary's Campus
 Norfolk Place, London W2 1PG

 Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com


Charles C. Berry                         (858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901



Re: [R] memory management question [Broadcast]

2007-02-20 Thread Federico Calboli
Charles C. Berry wrote:

 
 
 This is a bit different than your original post (where it appeared that 
 you were manipulating one row of a matrix at a time), but the issue is 
 the same.
 
 As suggested in my earlier email this looks like a caching issue, and 
 this is not peculiar to R.
 
 Viz.
 
 Most modern CPUs are so fast that for most program workloads the 
 locality of reference of memory accesses, and the efficiency of the 
 caching and memory transfer between different levels of the hierarchy, 
 is the practical limitation on processing speed. As a result, the CPU 
 spends much of its time idling, waiting for memory I/O to complete.
 
 (from http://en.wikipedia.org/wiki/Memory_hierarchy)
 
 
 The computation you have is challenging to your cache, and the effect of 
 dropping unused columns of your 'data' object by assigning the columns 
 used to 'srow' and 'drow' has lightened the load.
 
 If you do not know why SAXPY and friends are written as they are, a 
 little bit of study will be rewarded by a much better understanding of 
 these issues. I think Golub and Van Loan's 'Matrix Computations' touches 
 on this (but I do not have my copy close to hand to check).

Thanks for the clarifications. My bottom line is, I prefer dummy variables 
because they allow me to write cleaner code, with a shorter line for the same 
instruction, i.e. fewer chances of errors creeping in (+ turning into -, etc.).

I've been challenged that that's memory inefficient, and I wanted the 
opinion of people with more experience than mine on the matter.

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



[R] memory management question

2007-02-19 Thread Federico Calboli
Hi All,

I would like to ask the following.

I have an array of data in an object, let's say X.

I need to use a for loop on the elements of one or more columns of X and I am 
having a debate with a colleague about the best memory management.

I believe that if I do:

col1 = X[,1]
col2 = X[,2]
...
colx = X[,x]


and then

for(i in whatever){
do something using col1[i], col2[i] ... colx[i]
}

my memory management is better than doing:

for(i in whatever){
do something using X[i,1], X[i,2] ... X[i,x]
}

BTW, here I *have to* use a for() loop and no nifty tapply, lapply and family.

Any comment is welcome.

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] memory management question

2007-02-19 Thread Charles C. Berry
On Mon, 19 Feb 2007, Federico Calboli wrote:

 Hi All,

 I would like to ask the following.

 I have an array of data in an object, let's say X.

 I need to use a for loop on the elements of one or more columns of X and I am
 having a debate with a colleague about the best memory management.


Yez guys should take this fight out into the parking lot. ;-)

Armed with gc(), system.time(), and whatever memory monitoring tools your 
OSes provide, you can pound each other with memory usage and timing stats 
till one of you screams 'uncle' or you both have had enough and decide to 
shake hands and come back inside.
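
Something in that spirit, say (a sketch; f, whatever, col1, col2 and X are
stand-ins for the objects in this thread):

  gc()                                    # settle the heap before timing
  system.time(for (i in whatever) f(col1[i], col2[i]))
  gc()
  system.time(for (i in whatever) f(X[i, 1], X[i, 2]))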



 I believe that if I do:

 col1 = X[,1]
 col2 = X[,2]
 ...
 colx = X[,x]


 and then

 for(i in whatever){
 do something using col1[i], col2[i] ... colx[i]
 }

 my memory management is better than doing:

 for(i in whatever){
 do something using X[i,1], X[i,2] ... X[i,x]
 }


Whoa! You are accessing one ROW at a time.

Either way this will tangle up your cache if you have many rows and 
columns in your original data.

You might do better to do

Y <- t( X ) ### use '<-' !

for (i in whatever ){
do something using Y[ , i ]
}
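
R stores matrices column-major, so Y[ , i ] reads one contiguous block
while X[ i , ] strides across memory. A rough timing sketch of the
difference, with made-up sizes:

  X <- matrix(runif(1e6), nrow = 1e5, ncol = 10)
  Y <- t(X)
  system.time(for (i in 1:1e5) sum(X[i, ]))  # row access: strided reads
  system.time(for (i in 1:1e5) sum(Y[, i]))  # column access: contiguous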



 BTW, here I *have to* use a for() loop and no nifty tapply, lapply and family.

 Any comment is welcome.

 Best,

 Fede

 -- 
 Federico C. F. Calboli
 Department of Epidemiology and Public Health
 Imperial College, St Mary's Campus
 Norfolk Place, London W2 1PG

 Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com



Charles C. Berry                         (858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901



Re: [R] memory management question

2007-02-19 Thread Federico Calboli
Charles C. Berry wrote:

 Whoa! You are accessing one ROW at a time.
 
 Either way this will tangle up your cache if you have many rows and 
 columns in your original data.
 
 You might do better to do
 
 Y <- t( X ) ### use '<-' !
 
 for (i in whatever ){
 do something using Y[ , i ]
 }

My question is NOT how to write the fastest code, it is whether dummy variables 
(for lack of better words) make the memory management better, i.e. faster, or 
not.

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



[R] memory management

2006-10-30 Thread Federico Calboli
Hi All,

just a quick (?) question while I wait for my code to run...

I'm comparing the identity of the lines of a dataframe, doing all possible 
pairwise comparisons. In doing so I use identical(), but that's by the way. I'm 
doing a (not so) quick and dirty check, and subsetting the data as

data[row.numb,]

and

data[a different row,]

I suspect the problem there is that I load the whole frame data[,] into 
memory every time, making the biz quite slow and wasteful. As I'm idly 
waiting, I thought: had I put every line of data[,] as an item of a list, 
then done my pairwise comparisons using the list, would I have had better 
performance?

(do I win the prize for the most convoluted sentence sent to the R-help?)

For the pedants, yes, I know I could kill the process and try it myself, but 
the spirit of the question is: is there a way of dealing with big data 
*efficiently*?

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] memory management

2006-10-30 Thread bogdan romocea
This was asked before. Collapse the data frame into a vector, e.g.
v <- apply(DF, 1, function(x) paste(x, collapse = "_"))
then work with the values of that vector (table, unique etc). If your
data frame is really large, run this in a DBMS.
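
Then, for instance, the pairwise-identity check reduces to (a sketch on
the collapsed vector):

  dup <- duplicated(v)   # TRUE for rows identical to some earlier row
  which(dup)             # their positions, with no n^2 identical() calls
  table(table(v))        # how many rows occur once, twice, ...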


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of
 Federico Calboli
 Sent: Monday, October 30, 2006 11:35 AM
 To: r-help
 Subject: [R] memory management

 Hi All,

 just a quick (?) question while I wait my code runs...

 I'm comparing the identity of the lines of a dataframe, doing all
 possible pairwise comparisons. In doing so I use identical(), but
 that's by the way. I'm doing a (not so) quick and dirty check, and
 subsetting the data as

 data[row.numb,]

 and

 data[a different row,]

 I suspect the problem there is that I load the whole frame data[,]
 into memory every time, making the biz quite slow and wasteful. As
 I'm idly waiting, I thought: had I put every line of data[,] as an
 item of a list, then done my pairwise comparisons using the list,
 would I have had better performance?

 (do I win the prize for the most convoluted sentence sent to
 the R-help?)

 For the pedants, yes, I know I could kill the process and try it
 myself, but the spirit of the question is: is there a way of dealing
 with big data *efficiently*?

 Best,

 Fede

 --
 Federico C. F. Calboli
 Department of Epidemiology and Public Health
 Imperial College, St Mary's Campus
 Norfolk Place, London W2 1PG

 Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com



[R] memory management

2005-12-01 Thread Afshartous, David

All,

I've written some functions that use a list and a list of sub-lists and 
I'm running into memory problems, even after changing memory.limit().
Does it make any difference to the handling of memory if I use simple 
vectors and matrices instead of the list and list of sub-lists?  I 
suspect no, but just want to check.
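
A quick way to compare the two footprints directly (a sketch using
object.size(); the sizes here are purely illustrative):

  m <- matrix(0, 1000, 100)               # one flat block of doubles
  l <- lapply(1:1000, function(i) as.list(rep(0, 100)))  # list of sub-lists
  object.size(m)   # about 800,000 bytes: 1000*100 doubles plus a header
  object.size(l)   # much larger: every sub-list and cell carries overhead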

thanks!
Dave

ps - please reply to [EMAIL PROTECTED]



Re: [R] Memory management on Windows (was Size of jpegs/pngs)

2005-10-02 Thread Prof Brian Ripley
I think this is an issue about the amount of graphics memory.  You are asking 
for an image of about 17000*2000*3 bytes = 102Mb, and you need more than that.
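
In R, the back-of-the-envelope check is just width x height x 3 bytes per
24-bit RGB pixel:

  17000 * 2000 * 3 / 1e6   # ~102 Mb for a single such bitmap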

From the help page:

  Windows imposes limits on the size of bitmaps: these are not
  documented in the SDK and may depend on the version of Windows. It
  seems that 'width' and 'height' are each limited to 2^15-1 and
  there is a 16Mb limit on the total amount of memory in Windows
  95/98/ME.

so I do wonder why you are surprised.

My laptop appears to be limited to about half your example with a 128Mb 
graphics card (and lots of other things going on).

On Sun, 2 Oct 2005 [EMAIL PROTECTED] wrote:

 Dear all

 I have trouble with setting the size for jpegs and pngs. I need to save 
 a dendrogram of 1000 words into a jpeg or png file. On one of my 
 computers, the following works just fine:

 bb <- agnes(aa, method = "ward")
 jpeg("C:/Temp/test.txt", width = 17000, height = 2000)
 plot(bb)
 dev.off()

 On my main computer, however, this doesn't work:
 jpeg("C:/Temp/test.txt", width = 17000, height = 2000)
 Error in jpeg("C:/Temp/test.txt", width = 17000, height = 2000) :
    unable to start device devWindows
 In addition: Warning message:
 Unable to allocate bitmap

 This is a Windows XP Pro SP2 system, running this version of R:
 > R.version
          _
 platform i386-pc-mingw32
 arch     i386
 os       mingw32
 system   i386, mingw32
 status
 major    2
 minor    1.1
 year     2005
 month    06
 day      20
 language R

 which is started with a shortcut:
 C:\rw2011\bin\Rgui.exe --max-mem-size=1500M

 I checked the web and the R-help pages, tried out the ppsize option, and 
 compared the options settings with those of the machine that works 
 (which actually runs R 2.0.1 of 15 Nov 2004), but couldn't come up with 
 an explanation. Any idea what I am doing wrong?

Did you read the help page?

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595



Re: [R] Memory Management under Linux: Problems to allocate large amounts of data

2005-06-30 Thread Dubravko Dolic
Dear Prof. Ripley,

Thank you for your quick answer. You're right in assuming that we run R on a 
32-bit system. My technician tried to install R on an emulated 64-bit Opteron 
machine, which led to some trouble, maybe because the Opteron includes a 32-bit 
processor which emulates 64-bit (AMD64 x86_64). As you seem to have good 
experience with running R on a 64-bit OS, I feel encouraged to have another 
try at this.



-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]] 
Sent: Wednesday, 29 June 2005 15:18
To: Dubravko Dolic
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Memory Management under Linux: Problems to allocate large 
amounts of data

Let's assume this is a 32-bit Xeon and a 32-bit OS (there are 
64-bit-capable Xeons).  Then a user process like R gets a 4GB address 
space, 1GB of which is reserved for the kernel.  So R has a 3GB address 
space, and it is trying to allocate a 2GB contiguous chunk.  Because of 
memory fragmentation that is quite unlikely to succeed.

We run 64-bit OSes on all our machines with 2GB or more RAM, for this 
reason.

On Wed, 29 Jun 2005, Dubravko Dolic wrote:

 Dear Group

 I'm still trying to bring a lot of data into R (see older postings). After 
 solving some troubles with the database I do most of the work in MySQL. 
 But it could still be nice to work on some data using R. For this I can 
 use a dedicated server with Gentoo Linux as OS hosting only R. This 
 server is a nice machine with two CPUs and 4GB RAM which should do the 
 job:

 Dual Intel XEON 3.06 GHz
 4 x 1 GB RAM PC2100 CL2
 HP Proliant DL380-G3

 I read the R-Online help on memory issues and the article on garbage 
 collection from the R-News 01-2001 (Luke Tierney). Also the FAQ and some 
 newsgroup postings were very helpful on understanding memory issues 
 using R.

 Now I try to read data from a database. The data I want to read 
 consists of 158902553 rows and one field (column) and is of type 
 bigint(20) in the database. I received the message that R could not 
 allocate the 2048000 Kb (almost 2GB) vector. As I have 4GB of RAM 
 I could not imagine why this happened. In my understanding R under Linux 
 (32-bit) should be able to use the full RAM. As there is not much space 
 used by the OS and R as such (free shows about 670 MB in use after 
 dbSendQuery and fetch), there are 3GB to be occupied by R. Is that 
 correct?

Not really.  The R executable code and the Ncells are already in the 
address space, and this is a virtual memory OS, so the amount of RAM is 
not relevant (it would still be a 3GB limit with 12GB of RAM).

 After that I started R by setting n/vsize explicitly

 R --min-vsize=10M --max-vsize=3G --min-nsize=500k --max-nsize=100M

 > mem.limits()
     nsize     vsize
 104857600        NA

 and received the same message.


 A garbage collection delivered the following information:

 > gc()
           used (Mb) gc trigger   (Mb) limit (Mb)  max used   (Mb)
 Ncells  217234  5.9     500000   13.4       2800    280050   13.4
 Vcells   87472  0.7  157650064 1202.8       3072 196695437 1500.7


 Now I'm at a loss. Maybe anyone could give me a hint where I should read 
 further or which information could take me further.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595



Re: [R] Memory Management under Linux: Problems to allocate large amounts of data

2005-06-30 Thread Prof Brian Ripley

On Thu, 30 Jun 2005, Dubravko Dolic wrote:


Dear Prof. Ripley,

Thank you for your quick answer. You're right in assuming that we run R on 
a 32-bit system. My technician tried to install R on an emulated 64-bit 
Opteron machine, which led to some trouble, maybe because the Opteron 
includes a 32-bit processor which emulates 64-bit (AMD64 x86_64). As you 
seem to have good experience with running R on a 64-bit OS, I feel 
encouraged to have another try at this.


It should work out of the box on an Opteron Linux system: it does for 
example on FC3 and SuSE 9.x.  Some earlier Linux distros for x86_64 are
not fully 64-bit, but we ran R on FC2 (although some packages could not be 
installed).


Trying to build a 32-bit version of R on FC3 does not work for me: the 
wrong libgcc_s is found.  (One might want a 32-bit version for speed on 
small tasks.)



-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 29 June 2005 15:18
To: Dubravko Dolic
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Memory Management under Linux: Problems to allocate large 
amounts of data

Let's assume this is a 32-bit Xeon and a 32-bit OS (there are
64-bit-capable Xeons).  Then a user process like R gets a 4GB address
space, 1GB of which is reserved for the kernel.  So R has a 3GB address
space, and it is trying to allocate a 2GB contiguous chunk.  Because of
memory fragmentation that is quite unlikely to succeed.

We run 64-bit OSes on all our machines with 2GB or more RAM, for this
reason.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595

Re: [R] Memory Management under Linux: Problems to allocate large amounts of data

2005-06-30 Thread Peter Dalgaard
Prof Brian Ripley [EMAIL PROTECTED] writes:

 On Thu, 30 Jun 2005, Dubravko Dolic wrote:
 
   Dear Prof. Ripley,
  
   Thank you for your quick answer. You're right in assuming that we run
   R on a 32-bit system. My technician tried to install R on an emulated
   64-bit Opteron machine, which led to some trouble, maybe because the
   Opteron includes a 32-bit processor which emulates 64-bit (AMD64
   x86_64). As you seem to have good experience with running R on a
   64-bit OS, I feel encouraged to have another try at this.

Er? What is an emulated Opteron machine? Opterons are 64 bit.
 
  It should work out of the box on an Opteron Linux system: it does for
 example on FC3 and SuSE 9.x.  Some earlier Linux distros for x86_64 are
 not fully 64-bit, but we ran R on FC2 (although some packages could
 not be installed).
 
 Trying to build a 32-bit version of R on FC3 does not work for me: the
 wrong libgcc_s is found.  (One might want a 32-bit version for speed
 on small tasks.)

On FC4 it is even easier: "yum install R R-devel" gets you a working R
2.1.1 straight away (from Fedora Extras). Only if you want to include
hardcore optimized BLAS or do not like the performance hit of having R
as a shared library do you need to compile at all.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Memory Management under Linux: Problems to allocate large amounts of data

2005-06-30 Thread Dubravko Dolic
Dear Peter,

AMD64 and EM64T (Intel) were designed as 32-bit CPUs which are able to address 
64-bit registers, so they are not pure 64-bit systems. This is why they are 
much cheaper than a real 64-bit machine.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Peter Dalgaard
Sent: Thursday, 30 June 2005 11:48
To: Prof Brian Ripley
Cc: Dubravko Dolic; r-help@stat.math.ethz.ch
Subject: Re: [R] Memory Management under Linux: Problems to allocate large 
amounts of data

Prof Brian Ripley [EMAIL PROTECTED] writes:

 On Thu, 30 Jun 2005, Dubravko Dolic wrote:
 
   Dear Prof. Ripley,
  
   Thank you for your quick answer. You're right in assuming that we run
   R on a 32-bit system. My technician tried to install R on an emulated
   64-bit Opteron machine, which led to some trouble, maybe because the
   Opteron includes a 32-bit processor which emulates 64-bit (AMD64
   x86_64). As you seem to have good experience with running R on a
   64-bit OS, I feel encouraged to have another try at this.

Er? What is an emulated Opteron machine? Opterons are 64 bit.
 
  It should work out of the box on an Opteron Linux system: it does for
 example on FC3 and SuSE 9.x.  Some earlier Linux distros for x86_64 are
 not fully 64-bit, but we ran R on FC2 (although some packages could
 not be installed).
 
 Trying to build a 32-bit version of R on FC3 does not work for me: the
 wrong libgcc_s is found.  (One might want a 32-bit version for speed
 on small tasks.)

On FC4 it is even easier: "yum install R R-devel" gets you a working R
2.1.1 straight away (from Fedora Extras). Only if you want to include
hardcore optimized BLAS or do not like the performance hit of having R
as a shared library do you need to compile at all.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] Memory Management under Linux: Problems to allocate large amounts of data

2005-06-29 Thread Dubravko Dolic
Dear Group

I'm still trying to bring a lot of data into R (see older postings). After 
solving some troubles with the database I do most of the work in MySQL. But 
it could still be nice to work on some data using R. For this I can use a 
dedicated server with Gentoo Linux as OS hosting only R. This server is a 
nice machine with two CPUs and 4GB RAM which should do the job:

Dual Intel XEON 3.06 GHz
4 x 1 GB RAM PC2100 CL2
HP Proliant DL380-G3

I read the R online help on memory issues and the article on garbage collection 
from R News 01-2001 (Luke Tierney). Also the FAQ and some newsgroup postings 
were very helpful for understanding memory issues in R.

Now I try to read data from a database. The data I want to read consists of 
158902553 rows and one field (column) and is of type bigint(20) in the 
database. I received the message that R could not allocate the 2048000 Kb 
(almost 2GB) vector. As I have 4GB of RAM I could not imagine why this 
happened. In my understanding R under Linux (32-bit) should be able to use 
the full RAM. As there is not much space used by the OS and R as such (free 
shows about 670 MB in use after dbSendQuery and fetch), there are 3GB to be 
occupied by R. Is that correct?

After that I started R by setting n/vsize explicitly

R --min-vsize=10M --max-vsize=3G --min-nsize=500k --max-nsize=100M

> mem.limits()
    nsize     vsize
104857600        NA

and received the same message.


A garbage collection delivered the following information:

> gc()
          used (Mb) gc trigger   (Mb) limit (Mb)  max used   (Mb)
Ncells  217234  5.9     500000   13.4       2800    280050   13.4
Vcells   87472  0.7  157650064 1202.8       3072 196695437 1500.7


Now I'm at a loss. Maybe anyone could give me a hint where I should read 
further or which information could take me further.





Dubravko Dolic
Statistical Analyst
Tel:      +49 (0)89-55 27 44 - 4630
Fax:     +49 (0)89-55 27 44 - 2463
Email: [EMAIL PROTECTED]
Komdat GmbH
Nymphenburger Straße 86
80636 München
-
ONLINE MARKETING THAT WORKS
-


Re: [R] Memory Management under Linux: Problems to allocate large amounts of data

2005-06-29 Thread Prof Brian Ripley
Let's assume this is a 32-bit Xeon and a 32-bit OS (there are 
64-bit-capable Xeons).  Then a user process like R gets a 4GB address 
space, 1GB of which is reserved for the kernel.  So R has a 3GB address 
space, and it is trying to allocate a 2GB contiguous chunk.  Because of 
memory fragmentation that is quite unlikely to succeed.

We run 64-bit OSes on all our machines with 2GB or more RAM, for this 
reason.

On Wed, 29 Jun 2005, Dubravko Dolic wrote:

 Dear Group

 I'm still trying to bring a lot of data into R (see older postings). After 
 solving some troubles with the database I do most of the work in MySQL. 
 But it could still be nice to work on some data using R. For this I can 
 use a dedicated server with Gentoo Linux as OS hosting only R. This 
 server is a nice machine with two CPUs and 4GB RAM which should do the 
 job:

 Dual Intel XEON 3.06 GHz
 4 x 1 GB RAM PC2100 CL2
 HP Proliant DL380-G3

 I read the R-Online help on memory issues and the article on garbage 
 collection from the R-News 01-2001 (Luke Tierney). Also the FAQ and some 
 newsgroup postings were very helpful on understanding memory issues 
 using R.

 Now I try to read data from a database. The data I want to read 
 consists of 158902553 rows and one field (column) and is of type 
 bigint(20) in the database. I received the message that R could not 
 allocate the 2048000 Kb (almost 2GB) vector. As I have 4GB of RAM 
 I could not imagine why this happened. In my understanding R under Linux 
 (32-bit) should be able to use the full RAM. As there is not much space 
 used by the OS and R as such (free shows about 670 MB in use after 
 dbSendQuery and fetch), there are 3GB to be occupied by R. Is that 
 correct?

Not really.  The R executable code and the Ncells are already in the 
address space, and this is a virtual memory OS, so the amount of RAM is 
not relevant (it would still be a 3GB limit with 12GB of RAM).
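
For scale, a rough calculation (assuming the column arrives as doubles):

  158902553 * 8 / 2^20   # ~1212 Mb for one copy of the column alone,
                         # which fetch needs as one contiguous chunk
                         # inside that fragmented 3GB address space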

 After that I started R by setting n/vsize explicitly

 R --min-vsize=10M --max-vsize=3G --min-nsize=500k --max-nsize=100M

 > mem.limits()
     nsize     vsize
 104857600        NA

 and received the same message.


 A garbage collection delivered the following information:

 > gc()
           used (Mb) gc trigger   (Mb) limit (Mb)  max used   (Mb)
 Ncells  217234  5.9     500000   13.4       2800    280050   13.4
 Vcells   87472  0.7  157650064 1202.8       3072 196695437 1500.7


 Now I'm at a loss. Maybe anyone could give me a hint where I should read 
 further or which information could take me further.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595
