[R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-16 Thread Edzer J. Pebesma
Tom,

please try to use the variogram function in package gstat;
it doesn't (try to) store all pairwise differences, but rather
accumulates them for distance intervals.
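As a rough illustration of accumulating per distance interval instead of storing every pairwise difference, here is a minimal base-R sketch. It is not gstat's actual code (gstat does this in C); the function name `binned_semivariogram` and its arguments `cutoff` and `width` are made up for this example. Memory use stays proportional to the number of points plus the number of bins, while the running time is still quadratic in the number of points.

```r
# Hypothetical helper: empirical semivariogram using per-bin accumulators.
# x, y: coordinates; z: observed values. Pairs farther apart than `cutoff`
# are ignored; `width` is the width of each distance interval.
binned_semivariogram <- function(x, y, z, cutoff, width) {
  nbins <- ceiling(cutoff / width)
  np    <- numeric(nbins)   # pair counts, kept as doubles
  ssq   <- numeric(nbins)   # running sums of squared differences
  dsum  <- numeric(nbins)   # running sums of distances
  n <- length(z)
  for (i in seq_len(n - 1)) {
    j <- (i + 1):n
    d <- sqrt((x[j] - x[i])^2 + (y[j] - y[i])^2)
    keep <- d > 0 & d <= cutoff
    if (!any(keep)) next
    bin <- ceiling(d[keep] / width)          # interval index of each pair
    sq  <- (z[j][keep] - z[i])^2
    s_sq <- rowsum(sq, bin)                  # per-bin sums for this point
    s_d  <- rowsum(d[keep], bin)
    idx  <- as.integer(rownames(s_sq))
    ssq[idx]  <- ssq[idx]  + s_sq[, 1]
    dsum[idx] <- dsum[idx] + s_d[, 1]
    np[idx]   <- np[idx]   + tabulate(bin, nbins)[idx]
  }
  # Empty bins give NaN for dist and gamma.
  data.frame(np = np, dist = dsum / np, gamma = ssq / (2 * np))
}

# Three points on a line, values 0, 1, 3
v <- binned_semivariogram(c(0, 1, 2), c(0, 0, 0), c(0, 1, 3),
                          cutoff = 2, width = 1)
print(v)
```

Only the three accumulator vectors grow with the number of bins; nothing of length n*(n-1)/2 is ever allocated.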

It will take a while to do this, and there is a chance that
you overflow the counter that keeps the number of point
pairs for each interval: 304000^2 > 2^32; it is stored as
a C long, so it may work on a 64-bit architecture. Otherwise,
I'd suggest sampling your data set.
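The counter concern and the sampling suggestion can be checked with a few lines of base R. This is only a sketch: the total number of pairs is an upper bound on what any single interval can hold, and the subsample size of 10000 is an arbitrary choice made for illustration.

```r
n <- 304000                       # number of points reported in the thread
n_pairs <- n * (n - 1) / 2        # total pairs, computed in double precision

print(n_pairs > 2^31 - 1)         # TRUE: too many for a signed 32-bit counter
print(n_pairs > 2^32)             # TRUE: too many for an unsigned one as well
print(n_pairs < 2^63)             # TRUE: fits comfortably in a 64-bit long

# Sampling: draw a random subset of row indices (size chosen arbitrarily).
set.seed(1)
keep <- sample(n, 10000)
sub_pairs <- length(keep) * (length(keep) - 1) / 2
print(sub_pairs < 2^31 - 1)       # TRUE: the subsample is safe on 32 bits

# `data1` is the data frame from the original post; uncomment to subset it.
# data_sub <- data1[keep, ]
```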

I'd be interested to hear whether you succeed (or not).
--
Edzer

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-16 Thread Tom Colson
It took about 2 hours on the 64-bit Windows platform. Now I just need to
find my notes from ST733 and remember how to use gstat to estimate the
parameters.
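For estimating the parameters, a minimal gstat sketch might look like the following. It assumes the sp and gstat packages are installed and uses the `meuse` example data shipped with sp in place of the lidar points from this thread; the spherical model and the starting values passed to `vgm()` are illustrative guesses, not recommendations for elevation data.

```r
library(sp)
library(gstat)

# Example data from the sp package, standing in for the lat/long/height table.
data(meuse)
coordinates(meuse) <- ~ x + y          # promote to a spatial points object

# Sample variogram, accumulated per distance interval.
v <- variogram(log(zinc) ~ 1, meuse)

# Starting values for partial sill, range and nugget (assumed, to be tuned).
m0 <- vgm(psill = 1, model = "Sph", range = 900, nugget = 1)

# Fit the model to the sample variogram.
fit <- fit.variogram(v, m0)
print(fit)                             # fitted nugget, partial sill and range
```

The fitted object has one row per model component, so the nugget is in the "Nug" row and the partial sill and range are in the "Sph" row.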


http://www4.ncsu.edu/~tpcolson/variog.jpg

Thomas Colson
North Carolina State University
Department of Forestry and Environmental Resources
(919) 673 8023
[EMAIL PROTECTED]

Calendar:
www4.ncsu.edu/~tpcolson
 


[R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-15 Thread Tom Colson
I have what R seems to consider a very large dataset, a 12MB text file of
lat, long, and height values, 130,000 rows to be exact.

Here's what I get:
> data1 <- data.frame(read.table("BE3720078500WC20020828.txt", sep=",",
      header=T))
> raw.data <- as.geodata(data1)
> variog.1.b <- variog(raw.data)
variog: computing omnidirectional variogram
Error in vector("double", length) : vector size specified is too large
> round(memory.limit()/1048576.0, 2)
[1] 4000



The "vector size specified is too large" error seems to be a common one, but I
haven't seen any workarounds posted...and the help.archive web site seems to
be down. I can plot the dataset and do some elementary stats on it...no
variogram though.


Any ideas on how to compute variograms on datasets with 100 to 300k points? 
Thanks 

Thomas Colson
North Carolina State University
Department of Forestry and Environmental Resources
(919) 673 8023
[EMAIL PROTECTED]

Calendar:
www4.ncsu.edu/~tpcolson



Re: [R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-15 Thread Tom Colson
 
At 4 GB, I'm at the 32-bit Windows limit.

Thomas Colson
North Carolina State University
Department of Forestry and Environmental Resources
(919) 673 8023
[EMAIL PROTECTED]

Calendar:
www4.ncsu.edu/~tpcolson
 

-Original Message-
From: Berton Gunter [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 15, 2005 2:34 PM
To: 'Tom Colson'
Subject: RE: [R] Error in vector(double,length) : vector size specified is
too largeVLDs

 
> Any ideas on how to compute variograms on datasets with 100 to 300k
> points?
> Thanks

Get more memory? ... it's cheap! :-)

-- Bert Gunter
Genentech



Re: [R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-15 Thread Tom Colson
 
> rm(data1)
> variog.1.b <- variog(raw.data)
variog: computing omnidirectional variogram
Error in vector("double", length) : vector size specified is too large

Turns out I was wrong re: # of rows...it's 304,000


Same problem. Version is 2.1.1, hardware is Dual Xeon 3.6 4 GB RAM, XP Pro
64 Bit. Can reproduce the problem with 64Bit R 2.1.1 running on Fedora 4,
same hardware. 



Thomas Colson
North Carolina State University
Department of Forestry and Environmental Resources
(919) 673 8023
[EMAIL PROTECTED]

Calendar:
www4.ncsu.edu/~tpcolson
 

-Original Message-
From: Douglas Grove [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 15, 2005 2:23 PM
To: Tom Colson
Subject: Re: [R] Error in vector(double, length) : vector size specified
is too largeVLDs

Well you could start by removing large objects that you aren't using (e.g.
'data1') and seeing if that helps. 

There may be other suggestions but you haven't told us what platform you're
working on, as the posting guide requests:

> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

Doug




Re: [R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-15 Thread Peter Dalgaard
Tom Colson <[EMAIL PROTECTED]> writes:

> > rm(data1)
> > variog.1.b <- variog(raw.data)
> variog: computing omnidirectional variogram
> Error in vector("double", length) : vector size specified is too large
>
> Turns out I was wrong re: # of rows...it's 304,000
>
> Same problem. Version is 2.1.1, hardware is Dual Xeon 3.6 4 GB RAM, XP Pro
> 64 Bit. Can reproduce the problem with 64Bit R 2.1.1 running on Fedora 4,
> same hardware.

Variograms involve the differences between all pairs of points, which
can become a rather large number of values: 304000*303999/2 in your
case, about 344GB by my reckoning (at 8 bytes per double). And the
distances between them make for a similar quantity.
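The reckoning can be reproduced in a few lines of base R, assuming one 8-byte double per pairwise difference. The comparison with `.Machine$integer.max` is an added observation: R versions of that era could not create a vector with more than 2^31 - 1 elements, which would explain why the error also appears on a 64-bit build regardless of installed RAM.

```r
n <- 304000
n_pairs <- n * (n - 1) / 2            # 46,207,848,000 pairs
bytes   <- n_pairs * 8                # one double per pairwise difference

print(bytes / 2^30)                   # about 344 (GiB)
print(bytes / 1e9)                    # about 370 (decimal GB)

# More elements than a single R vector could hold before long vectors existed.
print(n_pairs > .Machine$integer.max) # TRUE
```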

Now, some algorithms may be smarter than to keep all values in memory,
but you haven't even told us where you got the variog() from. It
doesn't seem to be in the standard packages, although we do have
variogram() and Variogram() in spatial and nlme.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-15 Thread Roger Bivand
On 15 Sep 2005, Peter Dalgaard wrote:

> Tom Colson <[EMAIL PROTECTED]> writes:
>
> > > rm(data1)
> > > variog.1.b <- variog(raw.data)
> > variog: computing omnidirectional variogram
> > Error in vector("double", length) : vector size specified is too large
> >
> > Turns out I was wrong re: # of rows...it's 304,000
> >
> > Same problem. Version is 2.1.1, hardware is Dual Xeon 3.6 4 GB RAM, XP Pro
> > 64 Bit. Can reproduce the problem with 64Bit R 2.1.1 running on Fedora 4,
> > same hardware.
>
> Variograms involve the differences between all pairs of points which
> can become a rather large number of values. 304000*303999/2 in your
> case, about 344GB by my reckoning. And the distances between them
> makes for a similar quantity.
>
> Now, some algorithms may be smarter than to keep all values in memory,
> but you haven't even told us where you got the variog() from. It
> doesn't seem to be in the standard packages, although we do have
> variogram() and Variogram() in spatial and nlme.

Right, this is from geoR, which uses full matrices. I think both fields 
and gstat can work with larger data sets. Whether model-based 
geostatistics is what you need for interpolating a digital elevation model 
is another question.

 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]



Re: [R] Error in vector(double, length) : vector size specified is too large....VLDs

2005-09-15 Thread Tom Colson
Yes, using geoR. 

I can interpolate the DEM quite easily in GRASS (v.surf.rst, kriging) and
with block kriging in ArcInfo. What we need, though, is to be able to estimate
or even nail down the variogram for these data sets. Where am I going with
this? I'm guessing that variables such as slope, ruggedness, etc. are going
to produce different sill, range, and nugget values, which I can then use to
fine-tune the interpolation process, rather than using the same spline or
kriging parameters on, say, a whole state boundary's worth of Lidar data. And
yes, I can estimate the variogram in ArcInfo (limited to 1 points) and
can also import the DEM from GRASS into R using spgrass...but the point is
to analyze the point data BEFORE I make the DEM.

So I'm guessing that geoR isn't ever going to handle data of this size, and I
need to be using gstat? (As I write this, gstat(variogram) has been plugging
away for the last 10 minutes with no errors.)

Thanks for quick replies


Thomas Colson
North Carolina State University
Department of Forestry and Environmental Resources
(919) 673 8023
[EMAIL PROTECTED]

Calendar:
www4.ncsu.edu/~tpcolson
 
