Thanks, Roger

I feel we've got a low-RAM machine which would need a bit of an upgrade
(it is a recent server, though)!
The Linux machine unfortunately also has only 4 GB of RAM.
But I still maintain that it would be interesting to have, within R, a
way of automatically swapping memory to disk when needed ...

Didier

Roger Bivand wrote:
> On Tue, 11 Sep 2007, [EMAIL PROTECTED] wrote:
>
>>
>>> These days in GIS one may have to manipulate big datasets or arrays.
>>>
>>> Here I am on Windows with 4 GB of RAM;
>>> my aim was to have an array of dim 298249 x 12 x 10 x 22, but that's 2.9 GB
>>
>
> Assuming double precision (there is no single precision in R), that is 5.8 GB.
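
As a quick sanity check, the requirement can be computed directly in R
(8 bytes per double element):

    # memory needed to hold a double array of dim 298249 x 12 x 10 x 22
    prod(c(298249, 12, 10, 22)) * 8 / 2^30
    # about 5.87 GiB, before any copies the computation itself may make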
>
>>
>> It used to be (maybe still is?) the case that a single process could
>> only 'claim' a chunk of at most 2 GB on Windows.
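
For what it's worth, the limit can be queried, and sometimes raised, from
within R itself; a minimal sketch using the Windows-only memory functions
(whatever you request, the 2-3 GB per-process ceiling of 32-bit Windows
still applies):

    memory.size(max = TRUE)   # most memory obtained from the OS so far (Mb)
    memory.limit()            # current address-space limit (Mb)
    memory.limit(size = 3000) # request a higher limit, if the OS allows it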
>>
>>
>> Also remember to compute the overhead for R objects... 58 bytes per
>> object, I think it is.
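
The per-object overhead is easy to inspect with object.size(); the exact
header size varies with the R version and with 32- vs 64-bit builds, so
the figure above is approximate:

    object.size(numeric(0))   # header overhead of an empty double vector
    object.size(numeric(1))   # one element adds 8 bytes on top of that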
>>
>>
>>> It is also strange that at one point dd needed 300.4 Mb and then
>>> 600.7 Mb (?), even though I had made some room by removing ZZ?
>>
>>
>> Approximately double the size - many things the interpreter does involve
>> making an additional copy of the data and then working with *that*. This
>> might be happening here, though I didn't read your code carefully enough
>> to be certain.
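
One way to check is to watch the "max used" statistics that gc() reports
around the suspect step; dd below simply stands in for whatever object
the original code was building:

    gc(reset = TRUE)   # reset the maximum-memory counters
    dd <- dd * 2       # stand-in for the step that appeared to double memory
    gc()               # the "max used" column shows any transient extra copy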
>>
>>
>>> which I don't really know if it was taken into account, as the limit is
>>> greater than the physical RAM of 4 GB ...?
>>
>> :)
>>
>>> would it be easier using Linux?
>>
>> Possibly a little bit - on a Linux machine you can at least run a PAE
>> kernel (giving you a lot more address space to work with) and have the
>> ability to turn on a bit more virtual memory.
>>
>> Usually, with data of the size you're trying to work with, I try to
>> find a way to preprocess the data a bit more before I apply R's tools
>> to it. Sometimes we stick it into a database (Postgres) and select out
>> the bits we want our inferences to be sourced from.  ;)
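
A minimal sketch of that workflow using DBI (the driver choice and the
database, table and column names are all made up for illustration):

    library(DBI)
    library(RPostgreSQL)   # one of several Postgres drivers for R

    con <- dbConnect(PostgreSQL(), dbname = "gisdata")
    # pull in only the subset the analysis actually needs
    dat <- dbGetQuery(con,
        "SELECT x, y, value FROM observations WHERE region = 'A'")
    dbDisconnect(con)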
>>
>> It might be simplest to just hunt up a machine with 8 or 16 GB of
>> memory in it, and run those bits of the analysis that really need
>> memory on that machine...
>
> Yes, if there is no other way, a 64-bit machine with lots of RAM would 
> not be so constrained, but maybe this is a matter of first deciding why 
> doing statistics on that much data is worth the effort? It may be, but 
> just trying to read large amounts of data into memory is perhaps not 
> justified in itself.
>
> Can you tile or subset the data, accumulating intermediate results? 
> This is the approach the biglm package takes, and the R/GDAL interface 
> also supports subsetting from an external file.
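
A minimal sketch of the chunked fit with biglm, where read_chunk() and
n_chunks are hypothetical placeholders for however the data are sliced:

    library(biglm)

    # fit on the first chunk, then fold in the rest one piece at a time
    fit <- biglm(y ~ x1 + x2, data = read_chunk(1))
    for (i in 2:n_chunks)
        fit <- update(fit, read_chunk(i))
    summary(fit)

With rgdal, readGDAL() likewise takes offset and region.dim arguments, so
a large external raster can be read one window at a time.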
>
> Depending on the input format of the data, you should be able to do 
> all you need provided that you do not try to keep all the data in 
> memory. Using a database may be a good idea, or if the data are 
> multiple remote sensing images, subsetting and accumulating results.
>
> Roger
>
>>
>> --e
>>
>


-- 
Dr Didier Leibovici    
        http://www.nottingham.ac.uk/cgs/leibovici.shtml
Centre for Geospatial Science     
Sir Clive Granger Building
University of Nottingham,
University Park
Nottingham NG7 2RD, UK
        Tel: +44 - (0)115 84 66058   Fax: +44 (0)115 95 15249



_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
