from:"Markus Metz"

Re: [GRASS-stats] rgrass7 - SQLite and GML drivers not working for readVECT

2017-10-20 Thread Markus Metz

Hi Vero,

On Thu, Oct 19, 2017 at 9:26 PM, Veronica Andreo <veroand...@gmail.com> wrote:
>
> Hello again,
>
> I come back to this thread because the issue was solved for readVECT, but
> I now realize (when trying to write vectors back into GRASS after some
> processing in R) that writeVECT shows the same problem, i.e. the only
> driver working is ESRI Shapefile (all smooth, no errors), but driver =
> "SQLite" throws the same error as reported for readVECT at the beginning of
> this thread. Would it be possible to fix writeVECT as well?

In both readVECT and writeVECT, in the file vect_link.R, replace

  ogrDGRASSs <- gsub(" ", "_", sapply(strsplit(ogrDGRASS, ": "), "[", 2))

with

  ogrDGRASSs <- gsub(" ", "_", trimws(sapply(strsplit(ogrDGRASS, " [(]"), "[", 1)))

This works with all versions of GRASS 7 and all versions of GDAL.
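
As a quick check of the replacement, the sketch below applies both expressions to the GRASS >= 7.3 driver lines quoted further down in this thread. The contents of ogrDGRASS are typed in by hand here as an assumption; in rgrass7 they would come from the output of v.in.ogr -f.

```r
# Sample driver lines as quoted for GRASS >= 7.3 in this thread
# (typed by hand; real input would come from v.in.ogr -f).
ogrDGRASS <- c("GML (rw+): Geography Markup Language (GML)",
               "SQLite (rw+): SQLite / Spatialite",
               "ESRI Shapefile (rw+): ESRI Shapefile",
               "GeoJSON (rw+): GeoJSON")

# Old processing: take the description after ": "
old <- gsub(" ", "_", sapply(strsplit(ogrDGRASS, ": "), "[", 2))

# New processing: take the short name before " ("
new <- gsub(" ", "_", trimws(sapply(strsplit(ogrDGRASS, " [(]"), "[", 1)))

print(old)
print(new)
```

Only the second form yields "SQLite" and "GML", which is what a driver name such as driver = "SQLite" can be matched against.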

Markus M


>
> Here an example:
>
> > library(rgrass7)
> Loading required package: sp
> Loading required package: XML
> GRASS GIS interface loaded with GRASS version: GRASS 7.3.svn (2017)
> and location: eu_laea
>
> > bbox <- readVECT("bbox_greece", driver = "SQLite")
> WARNING: No attribute table found -> using only category numbers as attributes
> Exporting 1 area (may take some time)...
>  100%
> v.out.ogr complete. 1 feature (Polygon type) written to 
> (SQLite format).
> OGR data source with driver: SQLite
> ...
> WARNING: No attribute table found -> using only category numbers as attributes
> Exporting 1 area (may take some time)...
>  100%
> v.out.ogr complete. 1 feature (Polygon type) written to 
> (SQLite format).
>
> > writeVECT(bbox, "bbox_from_R", driver = "SQLite")
> Error: driver %in% candDrivers is not TRUE
>
> > writeVECT(bbox, "bbox_from_R", driver = "ESRI Shapefile")
> ... all goes fine...
>
> Sorry for bothersome and thanks much in advance!
>
> best,
> Vero
>
> ps: sessionInfo() and ogrDrivers() are the same as before.
>
> 2017-10-11 15:03 GMT+02:00 Roger Bivand <roger.biv...@nhh.no>:
>>
>> OK, thanks, will revise at next release.
>>
>>
>> Roger
>>
>> On Wed, 11 Oct 2017, Markus Metz wrote:
>>
>>> Dear Roger,
>>>
>>> On Wed, Oct 11, 2017 at 2:36 PM, Roger Bivand <roger.biv...@nhh.no> wrote:
>>>>
>>>>
>>>> Dear Markus,
>>>>
>>>> I can't see how to get the same strings out without conditioning,
>>>
>>>
>>> with
>>>
>>> ogrDGRASSs <- gsub(" ", "_", trimws(sapply(strsplit(ogrDGRASS, " [(]"),
>>> "[", 1)))
>>>
>>>> because for v.in.ogr -f and GDAL >= 2.0, GRASS < 7.3 presents for example:
>>>>
>>>> GML (rw): GML
>>>> SQLite (rw): SQLite
>>>> ESRI Shapefile (rw): ESRI Shapefile
>>>> GeoJSON (rw): GeoJSON
>>>>
>>>> (readOGR used the string following ":" )
>>>
>>>
>>> The structure of the output of r.in.gdal -f and v.in.ogr -f is
>>>
>>>  <short name> (<read/write flags>): <description>
>>>
>>> readOGR must use the string preceding "(". Anything following ":" is a
>>> description which can change any time. Before GDAL 2.0, there was nothing
>>> else but the short name for OGR drivers, therefore the short name was used
>>> as description.
>>>
>>>>
>>>> and >= 7.3:
>>>>
>>>> GML (rw+): Geography Markup Language (GML)
>>>> SQLite (rw+): SQLite / Spatialite
>>>> ESRI Shapefile (rw+): ESRI Shapefile
>>>> GeoJSON (rw+): GeoJSON
>>>>
>>>> where the string after ":" is different.
>>>
>>>
>>> the string before the read/write flags, i.e. before "(" is identical.
>>>
>>>> If we can depend on all GRASS < 7.3 having the same short name position,
>>>
>>> yes
>>>
>>>> I could avoid conditioning by changing the string processing to suit
>>>> >= 7.3 and apply it to all previous; I chose not to modify the string
>>>> processing for < 7.3 to avoid any problems I can't readily check.
>>>
>>>
>>> For all versions of GRASS 7 and all versions of GDAL, the short name
>>> position has been and continues to be the first position. For v.in.ogr -f,
>>> the short name may also appear after ":", but only if there is no long name.
>>>
>>> Best regards,
>>>
>>>

Re: [GRASS-stats] rgrass7 - SQLite and GML drivers not working for readVECT

2017-10-11 Thread Markus Metz

Dear Roger,

On Wed, Oct 11, 2017 at 1:41 PM, Roger Bivand  wrote:
>
> New version submitted to CRAN; until then:
>
> install.packages("rgrass7", repos="http://R-Forge.R-project.org")
>
> should pick up the latest version; #3425 closed. Please report back
> whether this works ... (conditioning on GRASS version to create comparable
> driver name strings).

I don't think there is a need to condition on the GRASS version, see my
suggestion in #3425

Markus M
>
> Roger
>
>
> On Wed, 11 Oct 2017, Roger Bivand wrote:
>
>> Thanks for trying to contribute. The GH site is not the rgrass7
>> development site - that is SVN on R-forge (GH is a very preliminary trial
>> site for using sf vector representation in R, and maybe raster raster
>> representation (or forthcoming stars), instead of sp classes).
>>
>> GRASS 7.2.2 works OK with the current logic checks; I can reproduce the
>> issue in 7.3 (latest); there is a change in vector/v.in.ogr/main.c
>> returning the DriverLongName for GDAL >= 2.0; in GRASS 7.2.2, there is no
>> such change. Could the GRASS developer responsible for this obvious
>> regression provide an additional flag in v.in.ogr (and v.external,
>> v.out.ogr) to permit backwards compatibility? See line 387, needs to change
>>
>> #if GDAL_VERSION_NUM >= 200
>>
>> to add a !backwards_compatible test too.
>>
>> I'll hold off trying to fix this in rgrass7 because it is a regression.
>> I can add the backwards_compatibility=TRUE flag to readVECT() once it is
>> exposed.
>>
>> This is:
>>
>> https://trac.osgeo.org/grass/ticket/3425
>>
>> Roger
>>
>> On Tue, 10 Oct 2017, Ahmadou Dicko wrote:
>>
>>> In the readVECT function, v.in.ogr is used internally to list the
>>> supported vector formats, and that list is compared with the formats
>>> available through rgdal (or sf).
>>> However, using v.external instead of v.in.ogr fixes this single problem
>>> because the form of its output is different.
>>> For example, if you use v.in.ogr you will have to compare
>>> SQLite_/_Spatialite (GRASS) to SQLite (R) and they are not the same.
>>>
>>> I tried to send a PR, let me know if it works
>>>
>>> https://github.com/rsbivand/rgrass7/pull/1
>>>
>>> Best,
>>>
>>> On Tue, Oct 10, 2017 at 9:29 PM, Helmut Kudrnovsky wrote:
>>>
> Sent: Tuesday, 10 October 2017 at 23:24
> From: "Ahmadou Dicko"
> To: "Helmut Kudrnovsky"
> Cc: "Roger Bivand", "grass-stats@lists.osgeo.org" <grass-stats@lists.osgeo.org>
> Subject: Re: [GRASS-stats] rgrass7 - SQLite and GML drivers not working for readVECT
>
>
> Hi everyone,
>
> I think that using v.external -f (instead of v.in.ogr -f) can fix this
> issue (didn't try yet)
>
>
>
> execGRASS("v.external", flags = "f", intern = TRUE)
>  [1] "ARCGEN"         "AVCBin"         "AVCE00"
>  [4] "AeronavFAA"     "AmigoCloud"     "BNA"
>  [7] "CAD"            "CSV"            "CSW"
> [10] "Carto"          "Cloudant"       "CouchDB"
> [13] "DGN"            "DXF"            "EDIGEO"
> [16] "ESRI_Shapefile" "ElasticSearch"  "GFT"
> [19] "GML"            "GPKG"           "GPSBabel"
> [22] "GPSTrackMaker"  "GPX"            "GeoJSON"
> [25] "GeoRSS"         "Geoconcept"     "Geomedia"
> [28] "HTF"            "HTTP"           "Idrisi"
> [31] "JML"            "JPEG2000"       "KML"
> [34] "MSSQLSpatial"   "MapInfo_File"   "Memory"
> [37] "MySQL"          "ODBC"           "ODS"
> [40] "OGR_GMT"        "OGR_GRASS"      "OGR_PDS"
> [43] "OGR_SDTS"       "OGR_VRT"        "OSM"
> [46] "OpenAir"        "OpenFileGDB"    "PCIDSK"
> [49] "PDF"            "PGDUMP"         "PGeo"
> [52] "PLSCENES"       "PostgreSQL"     "REC"


 in a quick check, there is no difference in available formats.



> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: roger.biv...@nhh.no
> Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
> http://orcid.org/-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0J=en
> ___
> grass-stats mailing list
> grass-stats@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-stats

Re: [GRASS-stats] readVECT6 - ogr fails to export some columns

2012-04-20 Thread Markus Metz

On Fri, Apr 20, 2012 at 4:47 PM, Eric Momsen eric.mom...@ndsu.edu wrote:
 Hi,

 I've been using GRASS and R for a few months now, and just ran into a snag.
  We added some new attribute columns to our vector file in GRASS, and now
 when I try to move the data into R the transfer fails.

 The original vector map in grass was imported from a shapefile.  I then used
 v.db.addcol and v.rast.stats to add more attributes to the vector map.

 I first thought it had to do with null values, so I tried adding a column
 filled with a constant value.  This didn't help.

 I tried to reproduce the error in SPEARFISH, columns added with v.rast.stats
 were fine, but v.db.addcol wouldn't execute.


 I don't know why the installation let me add columns in one location and not
 another, maybe this is the root cause or another issue altogether?

 v.db.addcolumn --verbose map=landuse@PERMANENT layer=1 columns=test

 Adding column test to the table
 DBMI-DBF driver error:
 SQL parser error (syntax error, unexpected $end processing
 '') in statement:
 ALTER TABLE landuse ADD COLUMN test

I think the column type is missing. Try e.g.
v.db.addcolumn --verbose map=landuse@PERMANENT layer=1 columns='test double precision'

 Unable to execute statement.
 ERROR: Error while executing: 'ALTER TABLE landuse ADD COLUMN test'
 ERROR: Unable to add column test.


 Any ideas for what I'm missing, or a workaround to get the data to R?  (I
 was supposed to get an R data file ready today for others to work with that
 aren't used to GRASS...I had previously done the GRASS-R transfer via a
 script a number of times, and didn't expect any problems!)

 Here are the R messages:

 source("/home/emomsen/Documents/loaddata.R")

 first command in file:
 field2007 <- readVECT6("ACSC_2007_Field_Boundary")

 Loading required package: rgdal
 Geospatial Data Abstraction Library extensions to R successfully loaded
 Loaded GDAL runtime: GDAL 1.9.0, released 2011/12/29
 Path to GDAL shared files: /usr/local/share/gdal
 Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
 Path to PROJ.4 shared files: (autodetected)
 Available OGR Drivers:
 Warning 1: Field COUNTY of width 1000 truncated to 255.
 snip... these are OK
 ERROR 6: Failed to add field named 'HARVEST_DAY'
 ERROR 6: Failed to add field named 'NDVI_04_B_DAY'
 snip...These were added with v.db.addcol
 ERROR 6: Failed to add field named 'NDVI_04_B_mean'

The column name is probably too long for the OGR dbf driver, the max
length is 10 I think. That is, NDVI_04_B_DAY becomes truncated to
NDVI_04_B_ and NDVI_04_B_mean also becomes truncated to NDVI_04_B_,
resulting in duplicate column names. You could try a shorter column
prefix with v.rast.stats.
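
A minimal sketch of that truncation in R, assuming the 10-character limit mentioned above. The column names are taken from the messages in this thread; the check itself is only an illustration, not part of spgrass6 or OGR.

```r
# Column names taken from the messages above.
cols <- c("COUNTY", "NDVI_04_B_DAY", "NDVI_04_B_mean")

# Assumed 10-character limit for DBF field names.
short <- substr(cols, 1, 10)
print(short)

# Original names that collide with another name after truncation.
collide <- cols[duplicated(short) | duplicated(short, fromLast = TRUE)]
print(collide)
```

Running a check like this on the attribute table's column names before export shows which ones need a shorter prefix.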

Markus M

 ...snip,... there are about 100 of these, extended statistics for 10
 rasters.


 The import does finish, all of the original columns for the shapefile are
 imported to R.  v.db.select shows most of the attribute columns together
 (the last ones are lost from the line length), and querying the map gives
 all the attributes.

 Thanks for any help!

 Eric

 sessionInfo()
 R version 2.14.1 (2011-12-22)
 Platform: x86_64-pc-linux-gnu (64-bit)

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] rgdal_0.7-8     spgrass6_0.7-10 XML_3.9-4       sp_0.9-98

 loaded via a namespace (and not attached):
 [1] grid_2.14.1    lattice_0.20-0


 g.gisenv -n

 LANG=en_US.UTF-8
 GRASS_ADDON_PATH=/home/emomsen/v.krige
 GISDBASE=/home/shared/research/GRASSDATA
 LOCATION_NAME=transferField
 ADDON_PATH=/home/emomsen/v.krige
 GUI=wxpython
 MAPSET=PERMANENT



Re: [GRASS-stats] new module for multiple regression with large raster maps

2011-11-08 Thread Markus Metz

Dylan Beaudette wrote:
 On Friday, October 07, 2011, Markus Metz wrote:
 There is a new module r.regression.multi as grass7 add-on to calculate
 multiple regressions with raster maps. The motivation for this module
 is to calculate regression coefficients and statistics for very large
 datasets, too large for e.g. R. The module uses less than 3 MB memory
 for 400 million cells with one response variable and 8 predictors.
 Including residuals, this makes a total of 4 billion numbers.
 Calculation takes about 4 minutes for this dataset on my laptop. In
 addition to the slope estimates, statistics provided are R squared,
 adjusted R squared, F, AIC, corrected AIC, BIC for the full model, and
 F, AIC, corrected AIC, BIC for each predictor. Results are identical
 to those produced by R (with smaller test datasets).

 Markus M


 Very cool. Can this module fit models from point data, and produce predictions
 in the form of a raster?

Yes, as of r49130. Residuals are also available as output raster.

Markus M

[GRASS-stats] i.pca fixes in trunk

2011-11-04 Thread Markus Metz

Hi all,

based on the wiki for Principal Components Analysis [0], numerous
discussions in the mailing lists [1,2,3,4], particularly a comment by
Edzer Pebesma [5], and personal demand, I have fixed a few issues in
i.pca in trunk r49090.

- the faulty or missing centering of the input bands described in [0]
should be fixed
- i.pca has a new flag -n to normalize input bands with (x - mean) / stddev
- values of the output maps are now calculated depending on the input
band transformation (centering or normalization). Is this OK?
- Eigen values, (vectors), and [percent importance] are now written to
stdout instead of stderr

The results of i.pca for the examples using SPOT imagery in the wiki
[0] are now identical to R's princomp() results. If the new -n flag is
used, the results of i.pca are identical to princomp(center = TRUE,
scale = TRUE).
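
For readers who want to see what the normalization does on the R side, here is a small sketch on synthetic data (not the SPOT imagery from the wiki). Note that in base R the center and scale. arguments belong to prcomp(), while princomp() takes cor = TRUE instead; the sketch uses prcomp(). The band names and values are made up for illustration.

```r
set.seed(1)
# Synthetic stand-in for three image bands with different means and spreads.
bands <- cbind(b1 = rnorm(200, mean = 50, sd = 5),
               b2 = rnorm(200, mean = 120, sd = 20),
               b3 = rnorm(200, mean = 10, sd = 1))
bands[, "b2"] <- bands[, "b2"] + 2 * bands[, "b1"]  # introduce some correlation

# Normalize each band by hand: (x - mean) / stddev
z <- scale(bands, center = TRUE, scale = TRUE)

pca_manual  <- prcomp(z, center = FALSE, scale. = FALSE)
pca_builtin <- prcomp(bands, center = TRUE, scale. = TRUE)

# Both give the eigenvalues of the correlation matrix.
print(pca_builtin$sdev^2)
print(eigen(cor(bands))$values)
```

Normalizing the bands first and then running an uncentered, unscaled PCA gives the same component variances as letting prcomp() center and scale.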

Tested also with 9 input maps in a region with 400 million cells.

Markus M


[0] http://grass.osgeo.org/wiki/Principal_Components_Analysis
[1] http://lists.osgeo.org/pipermail/grass-user/2009-February/048722.html
[2] http://lists.osgeo.org/pipermail/grass-stats/2009-March/000933.html
[3] http://lists.osgeo.org/pipermail/grass-stats/2009-March/000942.html
[4] http://lists.osgeo.org/pipermail/grass-stats/2009-April/001028.html
[5] http://lists.osgeo.org/pipermail/grass-stats/2009-April/000977.html

Re: [GRASS-stats] Re: Problem with v.distance in spgrass6 package in R

2011-11-03 Thread Markus Metz

2011/11/3 Roger Bivand roger.biv...@nhh.no:
 On Wed, 2 Nov 2011, toke wrote:

 Hi Roger and everybody else

 I finally found time to translate the problem into spearfish data.

 To recapitulate I would like to use the v.distance command to transfer
 information from polygons to point data.

 The problem is that the v.distance command does not update the column
 specified to receive the calculation done by the v.distance command.


 I do not see the same problem, and the command works for me - can others
 please try to reproduce it?

The spearfish example works for me too, with exactly the same result as yours.

Markus M


 Roger

 My output in R:

 library(spgrass6)

 Loading required package: sp
 Loading required package: rgdal
 Geospatial Data Abstraction Library extensions to R successfully loaded
 Loaded GDAL runtime: GDAL 1.8.1, released 2011/07/09
 Path to GDAL shared files: /usr/local/share/gdal
 Loaded PROJ.4 runtime: Rel. 4.7.1, 23 September 2009, [PJ_VERSION: 470]
 Path to PROJ.4 shared files: (autodetected)
 Loading required package: XML
 GRASS GIS interface loaded with GRASS version: GRASS 6.4.2svn47586 (2011)
 and location: spearfish60

 execGRASS("g.copy", vect="'fields@PERMANENT',fields1")

 Copy vector fields@PERMANENT to current mapset as fields1

 execGRASS("g.copy", vect="'archsites@PERMANENT',archsites1")

 Copy vector archsites@PERMANENT to current mapset as archsites1

 execGRASS("v.db.addcol", map="archsites1",

 + columns="\"fields double precision\"")

 fields1 <- readVECT6("fields1")

 WARNING: The map contains islands. To preserve them in the output map, use
         the -c flag
 Exporting 65 areas (may take some time)...
  100%
 v.out.ogr complete. 67 features written to fields1 (ESRI_Shapefile).
 OGR data source with driver: ESRI Shapefile
 Source: /home/rsb/topics/grassdata/spearfish60/rsb/.tmp/reclus2, layer:
 fields1
 with 67 features and 2 fields
 Feature type: wkbPolygon with 2 dimensions

 archsites1 <- readVECT6("archsites1")

 Exporting 25 geometries...
  100%
 v.out.ogr complete. 25 features written to archsite (ESRI_Shapefile).
 OGR data source with driver: ESRI Shapefile
 Source: /home/rsb/topics/grassdata/spearfish60/rsb/.tmp/reclus2, layer:
 archsite
 with 25 features and 3 fields
 Feature type: wkbPoint with 2 dimensions

 summary(archsites1)

 Object of class SpatialPointsDataFrame
 Coordinates:
              min     max
 coords.x1  589860  608355
 coords.x2 4914479 4926490
 Is projected: TRUE
 proj4string :
 [+proj=utm +zone=13 +datum=NAD27 +units=m +no_defs +ellps=clrk66
 +nadgrids=@conus,@alaska,@ntv2_0.gsb,@ntv1_can.dat]
 Number of points: 25
 Data attributes:
      cat                      str1        fields
  Min.   : 1   No Name            :12   Min.   : NA
  1st Qu.: 7   Bob Miller         : 1   1st Qu.: NA
  Median :13   Boulder Creek Cabin: 1   Median : NA
  Mean   :13   Canyon Station     : 1   Mean   :NaN
  3rd Qu.:19   Cole Creek Mine    : 1   3rd Qu.: NA
  Max.   :25   Elkhorn Peak       : 1   Max.   : NA
              (Other)            : 8   NA's   : 25

 set.echoCmdOption(TRUE)

 [1] FALSE

 execGRASS("v.distance", flags="p", from="archsites1", to="fields1",

 from_type="point", to_type="point,line,area", dmax=as.integer(1),
 upload="to_attr", column="fields", to_column="cat")
 GRASS command: v.distance -p from=archsites1 to=fields1 from_type=point
 to_type=point,line,area dmax=1 upload=to_attr column=fields to_column=cat
  100%
  100%
  100%
 from_cat|fields
 1|63
 2|63
 3|63
 4|63
 5|null
 6|null
 7|null
 8|null
 9|25
 10|63
 11|null
 12|null
 13|63
 14|63
 15|null
 16|63
 17|null
 18|63
 19|63
 20|63
 21|null
 22|63
 23|63
 24|null
 25|63
 v.distance complete.

 set.echoCmdOption(FALSE)

 [1] TRUE

 archsites1 <- readVECT6("archsites1")

 Exporting 25 geometries...
  100%
 v.out.ogr complete. 25 features written to archsite (ESRI_Shapefile).
 OGR data source with driver: ESRI Shapefile
 Source: /home/rsb/topics/grassdata/spearfish60/rsb/.tmp/reclus2, layer:
 archsite
 with 25 features and 3 fields
 Feature type: wkbPoint with 2 dimensions

 summary(archsites1)

 Object of class SpatialPointsDataFrame
 Coordinates:
              min     max
 coords.x1  589860  608355
 coords.x2 4914479 4926490
 Is projected: TRUE
 proj4string :
 [+proj=utm +zone=13 +datum=NAD27 +units=m +no_defs +ellps=clrk66
 +nadgrids=@conus,@alaska,@ntv2_0.gsb,@ntv1_can.dat]
 Number of points: 25
 Data attributes:
      cat                      str1        fields
  Min.   : 1   No Name            :12   Min.   :25.00
  1st Qu.: 7   Bob Miller         : 1   1st Qu.:63.00
  Median :13   Boulder Creek Cabin: 1   Median :63.00
  Mean   :13   Canyon Station     : 1   Mean   :60.47
  3rd Qu.:19   Cole Creek Mine    : 1   3rd Qu.:63.00
  Max.   :25   Elkhorn Peak       : 1   Max.   :63.00
              (Other)            : 8   NA's   :10.00

 table(archsites1$fields)

 25 63
  1 14
 ## alternatively:

 execGRASS("v.distance", from="archsites1", to="fields1",

 from_type="point", to_type="point,line,area", dmax=as.integer(1),
 upload="to_attr",

[GRASS-stats] new module for multiple regression with large raster maps

2011-10-07 Thread Markus Metz

There is a new module r.regression.multi as grass7 add-on to calculate
multiple regressions with raster maps. The motivation for this module
is to calculate regression coefficients and statistics for very large
datasets, too large for e.g. R. The module uses less than 3 MB memory
for 400 million cells with one response variable and 8 predictors.
Including residuals, this makes a total of 4 billion numbers.
Calculation takes about 4 minutes for this dataset on my laptop. In
addition to the slope estimates, statistics provided are R squared,
adjusted R squared, F, AIC, corrected AIC, BIC for the full model, and
F, AIC, corrected AIC, BIC for each predictor. Results are identical
to those produced by R (with smaller test datasets).
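
To compare against R on a smaller test dataset, the same kinds of statistics can be obtained from lm(). This is only a sketch on synthetic data. How r.regression.multi defines AIC and corrected AIC is not stated in this announcement, so the AICc formula below (the usual small-sample correction) and the parameter count taken from logLik() are assumptions.

```r
set.seed(42)
n <- 500
# Synthetic data standing in for raster cell values (one row per cell).
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 2 * d$x1 - 0.5 * d$x2 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = d)
s <- summary(fit)

k <- attr(logLik(fit), "df")   # parameters as counted by R: coefficients + sigma
aic <- AIC(fit)
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)   # assumed small-sample correction

stats <- c(R2 = s$r.squared, adjR2 = s$adj.r.squared,
           F = unname(s$fstatistic["value"]),
           AIC = aic, AICc = aicc, BIC = BIC(fit))
print(round(stats, 3))
```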

Markus M

Re: [GRASS-stats] Re: [GRASS-user] Testing i.pca ~ prcomp(), m.eigensystem ~ princomp()

2009-04-01 Thread Markus Metz



Edzer Pebesma wrote:

Markus, a few notes:

- if you do PCA on uncentered data, by computing the eigenvalues of the
uncentered covariance matrix, this implies that bands with a larger mean
will get more influence on the final PCAs. I have so far not managed
to find an argument why this would be desirable.
  
Add it to wiki? E.g. bands entered in a PCA should have the same mean, 
but normalization is also an option.

- if you do PCA on (band-mean)/sd(band), it means that you first
normalize (scale) 

I think scale and normalize are two different things.

each variable to mean zero and unit variance. This
procedure is identical to doing PCA on the correlation matrix. It means
that, unlike for unscaled variables, variables with larger variance will
not get more influence on the PCA than others. For image analysis I can
see a place for both; if bands with low variance indicate insignificant
and perhaps noisy information, you may downweight them. 

Variance is dependent on range, I would rather use something like 
coefficient of variation (stddev/mean) to get some scale-independent 
indicator on the amount of information in a given band. A downscaled 
band (e.g. MODIS scale of 0.0001) still has the same information but 
lower variance.
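
A small illustration of that point in R, on synthetic values (the range 2000 to 8000 is made up; only the 0.0001 scale factor comes from the text above):

```r
set.seed(7)
# Synthetic band values; 0.0001 is the scale factor mentioned above.
band <- runif(1000, min = 2000, max = 8000)
scaled <- band * 0.0001

# Coefficient of variation: stddev / mean
cv <- function(x) sd(x) / mean(x)

print(c(var_raw = var(band), var_scaled = var(scaled)))
print(c(cv_raw = cv(band), cv_scaled = cv(scaled)))
```

The variance shrinks by the square of the scale factor, while stddev/mean is unchanged.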

- Only in case of normalized variables, or equivalently PCA on
correlations, it makes sense to select PC's with an eigenvalue larger
than 1. The reasoning is fairly weak, but goes like this: if a PC has
eigenvalue > 1, it explains more variance than any of the original
variables, which all have variance 1.
  
Sounds good to me, why should I use a component that explains less than 
any of the original bands? And the whole purpose of a PCA is variable 
reduction to get a new set of variables, each explaining the whole 
dataset better than one of the original variables/bands. A PCA produces 
as many components as input variables, so some selection is usually 
necessary for further processing, could also be % explained variance. 
OTOH, sometimes only the first component is of interest. There may be 
exceptions for imagery processing, e.g. haze reduction (would have to 
read up on imagery processing too to say anything more about where 
components with eigenvalue < 1 could be useful).
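
The two selection rules mentioned here can be sketched in R as follows. The data are synthetic, and the 90 % threshold is an arbitrary value chosen for the example, not a recommendation from this thread.

```r
set.seed(3)
# Synthetic stand-in for five bands: three correlated, two independent.
base <- rnorm(300)
bands <- cbind(base + rnorm(300, sd = 0.3),
               base + rnorm(300, sd = 0.3),
               base + rnorm(300, sd = 0.3),
               rnorm(300),
               rnorm(300))

pca <- prcomp(bands, center = TRUE, scale. = TRUE)
eig <- pca$sdev^2              # eigenvalues of the correlation matrix
explained <- eig / sum(eig)    # share of total variance per component

keep_eig <- which(eig > 1)                               # eigenvalue > 1 rule
keep_var <- seq_len(which(cumsum(explained) >= 0.9)[1])  # assumed 90 % threshold

print(round(eig, 3))
print(keep_eig)
print(keep_var)
```

With normalized input the eigenvalues sum to the number of bands, which is why 1 serves as the reference value: it is the variance of each original band.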


