Re: [R] Help with spatial analysis (cluster)

2007-04-27 Thread Roger Bivand
ONKELINX, Thierry Thierry.ONKELINX at inbo.be writes:

 
 Dear Francisco,
 
 The distance matrix would be 102000 x 102000, so it would contain 
 10404000000 values. Even if you need only one byte for each value, this 
 would require 9.7 GB. So the distance matrix won't fit in the RAM of your 
 computer.

Perhaps you could make progress by using a 2D kernel density - there are 
functions among others in the MASS and splancs packages, or by binning - 
Bioconductor's hexbin package comes to mind. Then you would be looking for 
areas of increased density on the grid (in points per unit area or equivalently 
counts per bin) rather than at the interpoint distances. The kernel2d() 
function in splancs handles a data set of your size with no problems.
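
A minimal sketch of both ideas, using simulated points because the original data are not available. It assumes the MASS package (shipped with R) and its kde2d() function; the variable names lon and lat, the default bandwidth, the 50 x 50 grid and the simulated dense patch are illustrative choices, not taken from the thread. The binning part uses plain cut() and table() on rectangular cells as a stand-in for hexbin; splancs's kernel2d() is not shown.

```r
library(MASS)   # provides kde2d()

set.seed(1)
n <- 102000
# Simulated coordinates (hypothetical data): a uniform background
# plus one denser patch centred near lon = -0.4, lat = 39.5
lon <- c(runif(n - 2000, -1.5, 0.5), rnorm(2000, mean = -0.4, sd = 0.05))
lat <- c(runif(n - 2000, 38.0, 40.5), rnorm(2000, mean = 39.5, sd = 0.05))

# 2D kernel density estimate on a 50 x 50 grid. Memory use grows with
# (grid size x n), not with the n x n interpoint distances.
dens <- kde2d(lon, lat, n = 50)

# Grid cell with the highest estimated density
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)
c(lon = dens$x[peak[1, 1]], lat = dens$y[peak[1, 2]])

# Binning alternative: counts per rectangular cell
counts <- table(cut(lon, breaks = 50), cut(lat, breaks = 50))
sum(counts)   # every point falls in exactly one cell
```

With real data one would replace the simulated lon and lat by the observed columns and then inspect dens$z (for example with image() or contour()) for areas of increased density.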

Roger

(with apologies for pruning, gmane is very dictatorial)

 
 Cheers,
 
 Thierry


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with spatial analysis (cluster)

2007-04-25 Thread Francisco Pastor
Hi R-users

I'm a beginner with R and statistics, so I need some help to start my data
analysis. I've been reading some docs and tutorials on R and cluster analysis.
I've got a large dataset (102000 points) with values of longitude, latitude and
temperature and want to see if I can find groups (clusters).

Following some tutorials I can look for principal components but get an error
with calculation of distances:

> matriz.distancias <- dist(comp.obs)
Error in vector("double", length) : specified vector size is too big (translated
from Spanish)

So, my questions are: is the dataset too big? could you point me to any docs
explaining how to study spatially distributed data (lon,lat,data)?

Thanks in advance


___
Francisco Pastor
Meteorology department
Fundación CEAM
[EMAIL PROTECTED]
http://www.gva.es/ceamet
http://www.gva.es/ceam
Paterna, Valencia, Spain
___
Registered Linux user: 363952



Re: [R] Help with spatial analysis (cluster)

2007-04-25 Thread ONKELINX, Thierry
Dear Francisco,

The distance matrix would be 102000 x 102000, so it would contain 10404000000 
values. Even if you need only one byte for each value, this would require 
9.7 GB. So the distance matrix won't fit in the RAM of your computer.
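
For reference, the arithmetic can be checked in R itself. This is a rough sketch, not part of the original reply: it assumes 8 bytes per value (R stores distances as doubles) and that dist() keeps only the lower triangle, n(n-1)/2 values. The variable names are illustrative.

```r
n <- 102000                       # number of points in the data set

# Full square matrix: n * n values
full_values <- n^2                # 1.0404e10

# dist() keeps only the lower triangle: n * (n - 1) / 2 values
tri_values <- n * (n - 1) / 2

gib <- 2^30                       # bytes per GiB
full_1byte_gib <- full_values / gib        # one byte per value, full matrix
tri_double_gib <- tri_values * 8 / gib     # 8-byte doubles, lower triangle

round(c(full_1byte = full_1byte_gib, dist_doubles = tri_double_gib), 1)
```

Under these assumptions the full matrix at one byte per value comes to about 9.7 GiB, and the object dist() would actually try to allocate comes to about 38.8 GiB, so the conclusion holds either way.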

Cheers,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology 
and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully 
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of 
uncertainties, a surgery of suppositions. ~M.J.Moroney

 


