Re: [R] How to export/save an mrpp object?
Nikos: I finally ran mrpp tests. I think all is fine but one very important issue: I have no idea how to export/save an mrpp object. Tried anything I know and searched the archives but found nothing. David W: And what happened when you tried what seems like the obvious: save(mrpp_obj, file=) # rm(list=ls() ) # Only uncomment if you are ready for your workspace to clear load(mrpp_store.Rdata) Right, clearing did the trick. Any ideas? Is really copy-pasting the mrpp results the only way? Many of us have no idea what such an object is, since you have not described the packages and functions used to create it. If you want an ASCII version then dput or dump are also available. Multiresponse Permuation Procedures (MRPP) is implemented in the vegan package. The function mrpp() returns (an object of class mrpp) something like: --%--- # check class class ( samples_bitemporal_modis.0001.mrpp ) [1] mrpp # check structure str ( samples_bitemporal_modis.0001.mrpp ) List of 12 $ call: language mrpp(dat = samples_bitemporal_modis.0001[, 1:5], grouping = samples_bitemporal_modis.0001[[Class]]) $ delta : num 0.126 $ E.delta : num 0.202 $ CS : logi NA $ n : Named int [1:5] 335 307 183 188 27 ..- attr(*, names)= chr [1:5] Urban Vegetation Bare ground Burned ... $ classdelta : Named num [1:5] 0.1255 0.1045 0.1837 0.0981 0.1743 ..- attr(*, names)= chr [1:5] Urban Vegetation Bare ground Burned ... $ Pvalue : num 0.001 $ A : num 0.378 $ distance: chr euclidean $ weight.type : num 1 $ boot.deltas : num [1:999] 0.202 0.202 0.202 0.203 0.202 ... $ permutations: num 999 - attr(*, class)= chr mrpp --%--- Now I've tried the following: --%--- # 1. save(d) it save ( samples_bitemporal_modis.0001.mrpp , file=exported.mrpp.R ) # 2. loade(d) it in a new object... loadedmrpp - load ( exported.mrpp.R) # 3. (tried) to check it... str ( exported.mrpp.R) chr samples_bitemporal_modis.0001.mrpp # it did not cross my mind immediately to... get(loadedmrpp) Call: mrpp(dat = samples_bitemporal_modis.0001[, 1:5], grouping = samples_bitemporal_modis.0001[[Class]]) Dissimilarity index: euclidean Weights for groups: n Class means and counts: Urban Vegetation Bare ground Burned Water delta 0.1255 0.1045 0.1837 0.0981 0.1743 n 335307183 18827 Chance corrected within-group agreement A: 0.3778 Based on observed delta 0.1258 and expected delta 0.2022 Significance of delta: 0.001 Based on 999 permutations # ...or to work on a clean workspace! --%--- Thank you David. Cheers, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to export/save an mrpp object?
Gavin: [...] I think you should read ?load to rethink what these functions do and how they do it. [...] Absolutely correct. I will. Thank you for your time Gavin, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to export/save an mrpp object?
Greets (again) :-) I finally ran mrpp tests. I think all is fine but one very important issue: I have no idea how to export/save an mrpp object. Tried anything I know and searched the archives but found nothing. Any ideas? Is really copy-pasting the mrpp results the only way? Thank you for your attention, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimate between-axes vs within-axes heterogeneity of multivariate matrices
On Wednesday 22 of December 2010 05:57:17 Nikos Alexandris wrote: [...] Apologies for repeating the same question (trying to understand the problem myself). I started to get a grip on this. But anyway, my questions are actually not directly about R questions - sorry for the traffic. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimate between-axes vs within-axes heterogeneity of multivariate matrices
This time with a more-R oriented question: Is the mrpp {vegan} package [1] useful in trying to check, or get a clue about the differences between- and within-axes (or variables or dimensions or columns) of a multivariate matrix? The description explains: ...(MRPP) provides a test of whether there is a significant difference between two or more groups of sampling units. ... ... difference may be one of location (differences in mean) or one of spread (differences in within-group distance) ... and ... Function mrpp operates on a data.frame matrix where rows are observations and responses data matrix. The response(s) may be uni- or multivariate. ... Question: what about the observations being actually the columns? Is a simple transposing of the matrix enough? Any other alternatives or hints? Thak you, Nikos --- [1] http://cc.oulu.fi/~jarioksa/softhelp/vegan/html/mrpp.html --%--- Nikos: My question(s) in the end might be silly but I am no expert on this, so here it goes: Noy-Meir (1973), Pielou (1984) and a few others have pointed to non-centered PCA being in some cases useful. They clearly explain that it is the case when multi-dimensional data display distinct clusters (which have zero, or near-zero, projections in some subset of the axes) and the task is (exactly) to separate this clusters among the principal components. I have done my complete work using prcomp() and tested combinations of center=FALSE/TRUE and scale=FALSE/TRUE. I would like to now check this between-axes vs within-axes heterogeneity of my data and cross-check results with the various tested PCA-versions. Is there any (official or custom) function available in R that could answer this question? Some relative/comparative (preferrable simple and intuitive) measure(s)? Something that would graphically perhaps give an indication without time-consuming clustering, sampling or whatsoever processing? Even though the above mentoined authors mention some measure for the assymetry of the yielded compoenents ( uncentered - unipolar, centered - bipolar) I find the concept a bit hard to understand. Isn't there a quick way (function) to just say (with numbers of plots of course) well, it seems that the data are heterogenous looking at between- axes or the other way around it looks like the variables differ within, more than between? Apologies for repeating the same question (trying to understand the problem myself). Thank you, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Estimate between-axes vs within-axes heterogeneity of multivariate matrices
Hi! My question(s) in the end might be silly but I am no expert on this, so here it goes: Noy-Meir (1973), Pielou (1984) and a few others have pointed to non-centered PCA being in some cases useful. They clearly explain that it is the case when multi-dimensional data display distinct clusters (which have zero, or near-zero, projections in some subset of the axes) and the task is (exactly) to separate this clusters among the principal components. I have done my complete work using prcomp() and tested combinations of center=FALSE/TRUE and scale=FALSE/TRUE. I would like to now check this between-axes vs within-axes heterogeneity of my data and cross-check results with the various tested PCA-versions. Is there any (official or custom) function available in R that could answer this question? Some relative/comparative (preferrable simple and intuitive) measure(s)? Something that would graphically perhaps give an indication without time-consuming clustering, sampling or whatsoever processing? Even though the above mentoined authors mention some measure for the assymetry of the yielded compoenents ( uncentered - unipolar, centered - bipolar) I find the concept a bit hard to understand. Isn't there a quick way (function) to just say (with numbers of plots of course) well, it seems that the data are heterogenous looking at between- axes or the other way around it looks like the variables differ within, more than between? Apologies for repeating the same question (trying to understand the problem myself). Thank you, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Estimate between-axes vs within-axes heterogeneity of multivariate matrices
Hi! My question(s) in the end might be silly but I am no expert on this, so here it goes: Noy-Meir (1973), Pielou (1984) and a few others have pointed to non-centered PCA being in some cases useful. They clearly explain that it is the case when multi-dimensional data display distinct clusters (which have zero, or near-zero, projections in some subset of the axes) and the task is (exactly) to separate this clusters among the principal components. I have done my complete work using prcomp() and tested combinations of center=FALSE/TRUE and scale=FALSE/TRUE. I would like to now check this between-axes vs within-axes heterogeneity of my data and cross-check results with the various tested PCA-versions. Is there any (official or custom) function available in R that could answer this question? Some relative/comparative (preferrable simple and intuitive) measure(s)? Something that would graphically perhaps give an indication without time-consuming clustering, sampling or whatsoever processing? Even though the above mentoined authors mention some measure for the assymetry of the yielded compoenents ( uncentered - unipolar, centered - bipolar) I find the concept a bit hard to understand. Isn't there a quick way (function) to just say (with numbers of plots of course) well, it seems that the data are heterogenous looking at between- axes or the other way around it looks like the variables differ within, more than between? Apologies for repeating the same question (trying to understand the problem myself). Thank you, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Factor Loadings in Vegan's PCA
On Wednesday 30 of June 2010 23:02:09 afso...@unisinos.br wrote: Hi all, I am using the vegan package to run a prcincipal components analysis on forest structural variables (tree density, basal area, average height, regeneration density) in R. However, I could not find out how to extract factor loadings (correlations of each variable with each pca axis), as is straightforwar in princomp. Do anyone know how to do that? Moreover, do anyone knows a function r package that produces rotated-pca and biplots? Most packages I found did only one of these tasks (princomp, psych, vegan). Thanks a lot, Alexandre Hi Alexandre. I haven't used the vegan package. But using princomp() and/or prcomp() is really easy. Easy is also the extraction of the loadings. Just check the structure of the result of princomp() to find the loadings or the result of prcomp() to find the rotation ( which is the same as the loadings in princomp() ). For plotting you might want to have a look at the plotpc R package. It's something I really like (and have customised it a lot to suit my needs of plotting bivariate rotated axes (=the principal components) of a given data set using the prcomp R function and even more). Don't know though if this is what you are looking for. ( ...my custom version however is not ready for a generic use. It would take to somebody to spend more time on it to remove hardcoded stuff and figure out smart automatic ways to handle co-plotting scaled and non-scaled axes (as a result of scaled=standardised and non-scaled=unstandardised PCA) on the same plot.) Also interesting is the smoothScatter functions that produces a smoothed color density representation of the scatterplot, obtained through a kernel density estimate. It used densCols to produce a vector containing colors which encode the local densities at each point in a scatterplot. Very interesting if the data-matrix to be transformed is large. Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Note on PCA (not directly with R)
Christofer Bogaso wrote: Dear all, I am looking for some interactive study materials on Principal component analysis. Basically I would like to know what we are actually doing with PCA? Having in mind the eigenvalue decomposition and a bivariate data set, the sum- it-all in a few sentences I think is: - PCA rotates (and scales) the data set in such a way that aligns the thransformed axes (the principal components) with the direction(s) of maximum variance - eigen values are proportional to the lentgh of the axes of variation - eigen (or characteristic) vectors define the rotation What is happening within the dataset at the time of doing PCA. The algorithm (classically): - mean-centers the data matrix - calculates the covariance matrix (non-standardised PCA) or the correlation matrix (standartised PCA, a step also known as scaling) - calculates the the eigenvalue decomposition (EVD) (the eigenvectors and eigenvalues) of a data variance-covariance (non-standardised) or the correlation matrix (standardised) - sorts the variances (i.e. the eigenvalues) in decreasing order and finally projects the original dataset signals into what is named Principal Components or scores, by multiplying them with the eigenvectors which act as weighting coefficients. The algorithm does actually three (or more?) things: - minimises the mean square error of approximating the original data set, - keeps the maximum possible variance(s) of the original data set, - gives decorrelated variables Probably a 3-dimensional interactive explanation would be best for me. I have gone through some online materials specially Wikipedia etc, however what I need a movable explanation to understand that. Any suggestion please? For what is worth, I think a 2-dimensional example is better to start with. You can have a look at the plotpc() package. It really is educational. Good luck, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Attempt to customise the plotpc() function
Nikos Alexandris: Among the (R-)tools, I've seen on the net, for (bivariate) Principal Component scatter plots (+histograms), plotpc [1] is the one I like most. [...] I started the modification by attempting first to get a prcomp version of plotpc() (named it plotpc.svd()) by altering the following: [...] I am bit lost now about where I should continue looking for required modifications in the code. Any hints? Once again I am replying to myself ;-) I've spend many-many hours of searching in manuals, on the net and trying crazy things to understand where this mystical (to me) function un() could be sourced from. Eventually I decided to change all of its occurrences with unit() as I am sure this was the function meant to be used but I hit another wall, another unknown function my.plot.something(). I was then sure that they are sourced from somewhere. At some point I was enlightened and had a look in the source plotpc.R where I just found what I was expecting to found ;-). It may look silly but to the unexperienced useR it is not. I was only looking at the print-out of plotpc (without the parentheses) and was puzzled that the un() function is nowhere. - Question: why isn't the whole source of plotpc.R printed out with plotpc? Or why isn't any clue given where this un() function is coming from? methods() and getAnywhere() say anything about it. * Note to self: look at the Source.R Anyhow, it was a long learning-night-session and I am good to go. In fact, I am very close to what I want to do. I've added the option to use either prcomp or princomp as well as if the input dataset should be centered and/or scaled. I have still some strange issue with respect to how the plotted histograms (of PC1 and PC2) along with their text are flipped. Hopefully I'll fix this too. I will be contacting soon the author to post him my modifications as enhancement wishes. If anybody is interested to know or help please post here. Regards, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Attempt to customise the plotpc() function
Peter Ehlers wrote: Nikos, I think you can just replace the line pc - princomp(x[,1:2], scores=TRUE, na.action=na.fail) with pc - prcomp(x[,1:2], retx=TRUE, center=pc.center, scale.=pc.scale, na.action=na.fail) and rename the components of pc names(pc) - c('sdev', 'loadings', 'center', 'scale', 'scores') Right. Υet, it is still not enough. I had to change the definition of the limits that feed viewport mainly because of the huge difference of an unscaled vs. scaled dataset before the pc-analysis takes place. Because I want to give (me) the option to have really informative plots, I've added an extra grid.points() in case the data are transformed (centered and/or scaled) to print both the original and the transformed (with another pch and/or color) point cloud. and then use the rest of the plotpc() code as is (except for maybe having to use flip1=TRUE, etc). Hmm... I am _now_ working on it to understand how I could make this automatic!. If I give flip1, flip2 (=TRUE) the histograms are located where they should (optically) be printed but the text (rotation angle) that accompanies the histogram is I think not correct. It is quite the opposite angle that is being printed. Any ideas? As to why other functions used in plotpc() are not printed when you ask R to print plotpc(): why should they be? Can you imagine the mess that would result if you got the printouts of is.na(), pushViewport, popViewport, ...? Egad! Thank you Peter. I understand it now. [ Ignorant me but if you don't know something you will probably do mistakes (which is after all the learning process. ] Anyway, as you've discovered, when you want to modify code, look at the sources. Thank you Peter. Kindest regards, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Attempt to customise the plotpc() function
Peter Ehlers wrote: and then use the rest of the plotpc() code as is (except for maybe having to use flip1=TRUE, etc). Nikos: Hmm... I am _now_ working on it to understand how I could make this automatic!. If I give flip1, flip2 (=TRUE) the histograms are located where they should (optically) be printed but the text (rotation angle) that accompanies the histogram is I think not correct. It is quite the opposite angle that is being printed. Any ideas? I was wrong. flip is about the location only and has nothing to do with the rotation angle of the principal components (right?). So it's ok. Maybe there is still a way to auto-define the flips? But it's not worth spending time on it... Thanks again, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Attempt to customise the plotpc() function
Dear R-list, Among the (R-)tools, I've seen on the net, for (bivariate) Principal Component scatter plots (+histograms), plotpc [1] is the one I like most. By default it performs PCA on a bivariate dataset based on R's princomp() (which is the eigenvector-based algebraic solution to PCA). I would like to modify plotpc() in order be able, as an end-user, to define whether the PCA should be performed by using: (a) princomp() (the default currently) (b) the prcomp() function (which is the SVD-based algebraic solution to PCA) Of course I want to have control over the parameters that prcomp() accepts, namely: - data centering (argument center) - data scaling (argument scale) Hopefully this is the right place to post this question(s). I started the modification by attempting first to get a prcomp version of plotpc() (named it plotpc.svd()) by altering the following: - accept the arguments pccenter and pcscale in line 1: --%--- function ( x , pccenter = TRUE, scale = FALSE, xrange =... ) --%--- - accept both the center and the scale parameters for the scale() function in line 134: --%--- x - scale (x , center = pccenter , scale = pcscale ) --%--- - change princomp to prcomp in line 140 as well as all required arguments (that I know off): --%--- pc - prcomp ( x[, 1:2] , center = FALSE , scale = FALSE , retx = TRUE , na.action = na.fail ) --%--- -- in line 144: --%--- angle - atan2(pc$rotation[2, 1], pc$rotation[1, 1]) --%--- -- in lines 147 and 148, I replaced pc$scores with pc$x Running the modified function ends up with an error: --%--- # x is a data.frame with two columns and works fine with plotpc() plotpc.svd (x , pccenter = TRUE , pcscale = FALSE ) Error in inherits(unit, unit) : could not find function un # getting more info traceback() 5: inherits(unit, unit) 4: is.unit(y) 3: textGrob(label = label, x = x, y = y, just = just, hjust = hjust, vjust = vjust, rot = rot, check.overlap = check.overlap, default.units = default.units, name = name, gp = gp, vp = vp) 2: grid.text(main, y = un(ymean + xlim[2] + xrange/10), gp = gpar(cex = 1, font = 2)) 1: plotpc.svd(x, pccenter = TRUE, pcscale = FALSE) --%--- I am bit lost now about where I should continue looking for required modifications in the code. Any hints? Thanks in advance, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Attempt to customise the plotpc() function
On Saturday 15 of May 2010 19:09:30 Nikos Alexandris wrote: Among the (R-)tools, I've seen on the net, for (bivariate) Principal Component scatter plots (+histograms), plotpc [1] is the one I like most. [...] [1] http://cran.r-project.org/web/packages/plotpc/index.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cross-checking a custom function for separability indices
they just are numbered as usual [1] [2] [3]... 6. separability.matrices_multiref - function ( Reference.Data.Frame.A , Target.Data.Frame.A , Reference.Data.Frame.B , Target.Data.Frame.B ) comments: - same as function 5 (above) but accepts 4 different samples (as data.frames of course). Questions: If you made it read this post till here, maybe you could give hints on how to handle: 1. the definition of the directory path that leads to the custom functions that are required to be source(d)? Currently hard-coded at a specific directory where I store custom functions. Is it better to make 1 _big_ script which includes all functions? What is good coding practice in such occasions ? 2. column naming after a column by column merging of three different matrices (data.frames)? This concerns separability.matrices(). 3. the long command built-up using assign() and else() ? For some reason the functions fails to run if the else(...) statement is moved in the next line. For this reason the script separability.matrix.R does not respect the 78 characters width limit. # Custom function for various Separability Measures # by Nikos Alexandris, Freiburg, 8.04.2010 # ( based on Divergence and Jeffries-Matusita, requires input variables as.matrices ) separability.measures - function ( Vector.1 , Vector.2 ) { # convert vectors to matrices in case they are not Matrix.1 - as.matrix (Vector.1) Matrix.2 - as.matrix (Vector.2) # define means mean.Matrix.1 - mean ( Matrix.1 ) mean.Matrix.2 - mean ( Matrix.2 ) # define difference of means mean.difference - mean.Matrix.1 - mean.Matrix.2 # define covariances for supplied matrices cv.Matrix.1 - cov ( Matrix.1 ) cv.Matrix.2 - cov ( Matrix.2 ) # define the halfsum of cv's as p p - ( cv.Matrix.1 + cv.Matrix.2 ) / 2 # --% # calculate the Bhattacharryya index bh.distance - 0.125 * t ( mean.difference ) * p^ ( -1 ) * mean.difference + 0.5 * log ( det ( p ) / sqrt ( det ( cv.Matrix.1 ) * det ( cv.Matrix.2 ) ) ) # --% # calculate Jeffries-Matusita # following formula is bound between 0 and 2.0 jm.distance - 2 * ( 1 - exp ( -bh.distance ) ) # also found in the bibliography: # jm.distance - 1000 * sqrt ( 2 * ( 1 - exp ( -bh.distance ) ) ) # the latter formula is bound between 0 and 1414.0 # --% # calculate the divergence # trace (is the sum of the diagonal elements) of a square matrix trace.of.matrix - function ( SquareMatrix ) { sum ( diag ( SquareMatrix ) ) } # term 1 divergence.term.1 - 1/2 * trace.of.matrix ( ( cv.Matrix.1 - cv.Matrix.2 ) * ( cv.Matrix.2^ (-1) - cv.Matrix.1^ (-1) ) ) # term 2 divergence.term.2 - 1/2 * trace.of.matrix ( ( cv.Matrix.1^ (-1) + cv.Matrix.2^ (-1) ) * ( mean.Matrix.1 - mean.Matrix.2 ) * t ( mean.Matrix.1 - mean.Matrix.2 ) ) # divergence divergence - divergence.term.1 + divergence.term.2 # --% # and the transformed divergence transformed.divergence - 2 * ( 1 - exp ( - ( divergence / 8 ) ) ) # --% # print results --- look at separability.measures.pp # get column names (hopefully they exist) --- to use for/with... ??? --- name.of.Matrix.1 - colnames ( Matrix.1 ) name.of.Matrix.2 - colnames ( Matrix.2 ) # create new objects to be used with... ? assign ( divergence.vector , divergence , envir = .GlobalEnv) assign ( jm.distance.vector, jm.distance, envir = .GlobalEnv) assign ( bh.distance.vector, bh.distance, envir = .GlobalEnv ) assign ( transformed.divergence.vector , transformed.divergence , envir = .GlobalEnv ) # print some message --- ? # a just do it solution using cat() cat ( paste ( Divergence: , round ( divergence
[R] Cross-checking a custom function for separability indices
Hi list! I have prepared a custom function (below) in order to calculate separability indices (Divergence, Bhattacharyya, Jeffries-Matusita, Transformed divergene) between two samples of (spectral land cover) classes. I need help to cross-compare results to verify that it works as expected (since I don't know of any other foss-tool that will give me quickly some results). Does anybody use another mean/tool/function (within or out of R) to calculate these distances? For example, # two samples sample.1 - c (1362, 1411, 1457, 1735, 1621, 1621, 1791, 1863, 1863, 1838) sample.2 - c (1354, 1458, 1458, 1458, 1550, 1145, 1428, 1573, 1573, 1657) # running the custom function (below) separability.measures ( sample.1 , sample.2 ) Divergence: 1.5658 Bhattacharryya: 0.1683 Jeffries-Matusita: 0.3098 Transformed divergence: 0.3555 Thanks in advance, Nikos # --- Code --- # Custom function for various Separability Measures # by Nikos Alexandris, Freiburg, 8.04.2010 # ( based on Divergence and Jeffries-Matusita, requires input variables as.matrices ) separability.measures - function ( Vector.1 , Vector.2 ) { # convert vectors to matrices in case they are not Matrix.1 - as.matrix (Vector.1) Matrix.2 - as.matrix (Vector.2) # define means mean.Matrix.1 - mean ( Matrix.1 ) mean.Matrix.2 - mean ( Matrix.2 ) # define difference of means mean.difference - mean.Matrix.1 - mean.Matrix.2 # define covariances for supplied matrices cv.Matrix.1 - cov ( Matrix.1 ) cv.Matrix.2 - cov ( Matrix.2 ) # define the halfsum of cv's as p p - ( cv.Matrix.1 + cv.Matrix.2 ) / 2 # --% # calculate the Bhattacharryya index bh.distance - 0.125 * t ( mean.difference ) * p^ ( -1 ) * mean.difference + 0.5 * log10 ( det ( p ) / sqrt ( det ( cv.Matrix.1 ) * det ( cv.Matrix.2 ) ) ) # --% # calculate Jeffries-Matusita jm.distance - 2 * ( 1 - exp ( - bh.distance ) ) # --% # calculate the divergence # trace (is the sum of the diagonal elements) of a square matrix trace.of.matrix - function ( SquareMatrix ) { sum ( diag ( SquareMatrix ) ) } # term 1 divergence.term.1 - 1/2 * trace.of.matrix ( ( cv.Matrix.1 - cv.Matrix.2 ) * ( cv.Matrix.2^ (-1) - cv.Matrix.1^ (-1) ) ) # term 2 divergence.term.2 - 1/2 * trace.of.matrix ( ( cv.Matrix.1^ (-1) + cv.Matrix.2^ (-1) ) * ( mean.Matrix.1 - mean.Matrix.2 ) * t ( mean.Matrix.1 - mean.Matrix.2 ) ) # divergence divergence - divergence.term.1 + divergence.term.2 # --% # and the transformed divergence transformed.divergence - 2 * ( 1 - exp ( - ( divergence / 8 ) ) ) # --% # print results --- look at separability.measures.pp # get column names (hopefully they exist) --- to use for/with... ??? --- name.of.Matrix.1 - colnames ( Matrix.1 ) name.of.Matrix.2 - colnames ( Matrix.2 ) # create new objects to be used with... ? assign ( divergence.vector , divergence , envir = .GlobalEnv) assign ( jm.distance.vector, jm.distance, envir = .GlobalEnv) assign ( bh.distance.vector, bh.distance, envir = .GlobalEnv ) assign ( transformed.divergence.vector , transformed.divergence , envir = .GlobalEnv ) # print some message --- ? # a just do it solution using cat() cat ( paste ( Divergence: , round ( divergence , digits = 4 ) , sep = ) , \n) cat ( paste ( Bhattacharryya: , round ( bh.distance , digits = 4 ) , sep = ) , \n) cat ( paste ( Jeffries-Matusita: , round ( jm.distance , digits = 4 ) , sep = ) , \n) cat ( paste ( Transformed divergence: , round ( transformed.divergence , digits = 4 ) , sep = ) , \n ) cat (\n) } # --- Code
Re: [R] c(), or cat(), or paste(), all cause unwanted reordering
On Thu, 2010-03-25 at 11:54 -0800, Jeff Brown wrote: Wow, you guys are awesome. Thanks! Thanks for the cat() question Jeff and to all guRus out there for the replies. This is something I was looking for the last hour. Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] c(), or cat(), or paste(), all cause unwanted reordering
Jeff Brown wrote: Wow, you guys are awesome. Thanks! Nikos Alexandris wrote: Thanks for the cat() question Jeff and to all guRus out there for the replies. This is something I was looking for the last hour. I can't seem to make this run: I have a function ( Column.of.Matrix.1 , Column.of.Matrix.2 ) which gives me an output like: --%--- Divergence: 0.2605 Jeffries-Matusita: 0.04489 Bhattacharryya: 0.0227 Transformed divergence: 0.06406 --%--- The custom function uses cat() as well to print the above. Now, I would like to use this function in a for() loop and print before the results a message like Separability measures between A and B:. I get A and B using colnames(Matrix.X[i]). Everything seems to be ok except that the message is printed after the results. Although I read all relevant posts in this thread about paste(), cat(), c(), I still don't understand why the message is being printed in the end when I use cat( c( ...)) or paste ( c(...)) or other combinations: --%--- cat ( c ( Separability measures between: , 1st sample , and , 2nd sample , \n , function ( Matrix.1 [i] , Matrix.2 [i] ) , collapse = ) ) --%--- Can I *force* the output of the function to be printed in the end? Thank you, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] c(), or cat(), or paste(), all cause unwanted reordering
On Tue, 2010-04-06 at 19:29 +0200, Nikos Alexandris wrote: Jeff Brown wrote: Wow, you guys are awesome. Thanks! Nikos Alexandris wrote: Thanks for the cat() question Jeff and to all guRus out there for the replies. This is something I was looking for the last hour. I can't seem to make this run: I have a function ( Column.of.Matrix.1 , Column.of.Matrix.2 ) which gives me an output like: --%--- Divergence: 0.2605 Jeffries-Matusita: 0.04489 Bhattacharryya: 0.0227 Transformed divergence: 0.06406 --%--- The custom function uses cat() as well to print the above. Now, I would like to use this function in a for() loop and print before the results a message like Separability measures between A and B:. I get A and B using colnames(Matrix.X[i]). Everything seems to be ok except that the message is printed after the results. Although I read all relevant posts in this thread about paste(), cat(), c(), I still don't understand why the message is being printed in the end when I use cat( c( ...)) or paste ( c(...)) or other combinations: --%--- cat ( c ( Separability measures between: , 1st sample , and , 2nd sample , \n , function ( Matrix.1 [i] , Matrix.2 [i] ) , collapse = ) ) --%--- Can I *force* the output of the function to be printed in the end? Thank you, Nikos Got it by using a new function which first calls cat(...) and then the custom.function that gives what I need. Thanks, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Define column names to a series of data.frames
Nikos Alexandris wrote: ... I have 6 data frames consisting of 6 rows x 7 columns put together from other data.frames. ... I want to give the following column names to each data.frame: (SDev, PC1, PC2, PC3, PC4, PC5, PC6) ... How is it to be done at once for all data.frames that the function objects(pattern = SomePattern) can find? Henrique Dallazuanna wrote: Yes, just in the list. If they want change the name in Environment GlobalEnv: for(i in ls(pattern = DF[0-9])) assign(i, `names-`(get(i), c(SDev,PC1, PC2, PC3, PC4, PC5, PC6)), globalenv()) It works also using colnames() and rownames(), for example: --%--- # collect results based on svd svd.results - objects ( pattern = ^svd.result_of_.* ) # collect results based on eigenvector(s) eigenvector.results - objects ( pattern = ^eigenvector.result_of_.* ) # combine all in one pca.results - c ( svd.results , eigenvector.results ) # (re-)define column names for ( i in pca.results ) assign ( i , `colnames-` ( get ( i ) , column_names ) , globalenv ( ) ) # (re-)define row names for ( i in pca.results ) assign ( i , `rownames-` ( get ( i ) , row_names ) , globalenv ( ) ) --%--- Henrique D., you are a guRu! Thank you, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Odp: Define column names to a series of data.frames
Thanks to all for the replies. I'll post back here when I get things working (...in a couple of days). Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Define column names to a series of data.frames
Greets to the list! I am aware that this topic has been discussed several times. And I've read quite some related posts [1]. Yet, can't seem to give a solution to my problem. I have 6 data frames consisting of 6 rows x 7 columns put together from other data.frames. Something like: a b c d e f g v1 # # # # # # # v2 # # # # # # # v3 # # # # # # # v4 # # # # # # # v5 # # # # # # # v6 # # # # # # # I want to give the following column names to each data.frame: (SDev, PC1, PC2, PC3, PC4, PC5, PC6) Works fine for one data.frame: column_names - c(SDev, PC1, PC2, PC3, PC4, PC5, PC6) names( df1 ) - column_names How is it to be done at once for all data.frames that the function objects(pattern = SomePattern) can find? I've tried several things with assign, get, paste, for but I am not getting anywhere. I need to integrate this in a function that can handle lot's of data.frames and not only 6. Thank you, Nikos --- [1] I am stuck on this post: http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg32063.html. It reads: dframes - c(a,b,c) cols - c(one,two) df - data.frame(11:20, 21:30) names(df) - cols assign(dframes[2], df) Can't understand the logic behind the [2]. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Best R text editors?
[Answering to the threads question] For those who use Gnome's gedit, there is now RGedit: http://sourceforge.net/projects/rgedit Apologies if this was already posted, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] labeling in qplot
On Tue, 2009-08-04 at 20:16 -0700, Mohan S wrote: HI am plotting different density plots in one graph each with a different color. And i want to add labels to plot mentioning which color belongs to which data series. p2 - qplot(corArms, data = data1, geom = density, adjust=0.4, main=Arms Correlation All) + layer(data=data2, geom=density, adjust=0.4, color=I(2)) + layer(data=data3, geom=density, adjust=0.4, color=I(3)); how can i add such a legend. Any pointers? Hi Mo. Perhaps smartlegend in gplots may be what you are looking for. The example below might _not_ be close to what you demonstrate above - it's just an example. # install gplots if not installed library(gplots) # plot a density plot(density(mod.b2.06, from=0, to=6000), main=Density plots for both the pre- postfire MODIS bands 2, 6 and 7, col=magenta, lty=2, lwd=2) # add another one lines(density(mod.b2.07, from=0, to=6000), col=magenta4, lty=1, lwd=2) # example of smartlegend smartlegend(x=right, y=top, c(Prefire MODIS band 2, Postfire MODIS band 2), col=c(magenta, magenta4), lty=2:1, lwd=2, cex=1.5) Hope this helps. Kindest regards, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] why is 0 not an integer?
On Wed, 2009-08-05 at 13:16 -0700, Steve Jaffe wrote: Why when I assign 0 to an element of an integer vector does the type change to numeric? Here is a particularly perplexing example: v - 0:10 v [1] 0 1 2 3 4 5 6 7 8 9 10 class(v) [1] integer v[1] - 0 try this: v - as.integer(0) Nikos class(v) [1] numeric #!! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] productivity tools in R?
On Thu, 2009-07-02 at 00:01 -0500, Gene Leynes wrote: playwith(xyplot(z3), time.mode = TRUE) WoW! Looks (and is) GrEaT! Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on creating a sequence of vectors
megh: I want to create a number of vectors like : vec1 - rnorm(1) vec2 - rnorm(2) vec3 - rnorm(3) and so on... Maybe try the assign() function. Something like: for (i in 1:10) assign ( paste ( vec , i , sep = ) , rnorm(i) ) Kind regards, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to plot hyperbolic iso-lines of a cost function?
Hi R-specialists! I would like to draw some hyperbolic (iso-)lines of a cost function in a bi-dimensional space ( =shapes of a cost function ), based on (the general form): C(x) = c1*Ce + c2*Oe + c3*{ 1 - [ (1 - Ce)^a * (1 - Oe)^b ] } where: - Oe/Ce are Omission/Commission (ranging between 0 and 1 ) values in a bi-dimensional space ( see example of Oe/Ce values [1]) - c1, c2, c3, a and b are some parameters which determine sensitivity of the function Is there a ready-to-use plotting - function? Kind regards, Nikos --- [1] Omission/ Commission errors 0.0.42497404 0.010124160.35377707 0.031503210.30218908 0.061586110.26031315 0.101884180.21955793 0.155588280.17899955 0.225111560.14204273 0.316631640.10425152 0.427280350.06922038 0.584208960.03561727 0.903683970. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GRASS raster data processing
Maayt: I just imported two raster maps into R using the SPGRASS6 package, one containing elevation data and the other containing an erosion index: Kar_inc -readRAST6(Incis_Kar, plugin=FALSE) Kar_dem - readRAST6(DEM_Kar, plugin=FALSE) I just wanted to make a xy plot of erosion parameter vs elevation. How does this work? I don't get how to handle SpatialGridDataFrames... Maarten, you can check some web-pages with respect to GRASS R [1][2]. This is a question for grass-stats actually [3]. A quick answer: check the structure of the newly created object and you will find that the numbers are to be found in (e.g. for Incis_Kar ) kar_...@data$incis_kar. So use @data and the $ to access a slot. Kind regards, Nikos --- [1] http://grass.osgeo.org/wiki/R [2] http://grass.ibiblio.org/statsgrass/index.php#grassR [3] http://grass.osgeo.org/statsgrass/index.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] create vectors within a double loop
Hi R-list. This is my first post. I'll try to be as precise as possible with the difficulty I have to get things done. I have a hard time trying to construct a double for loop and create within the inner loop new objects (in this case vectors). I posted this question in a non-directly related with pure R-problems list (in grass-stats). In addition, I think I wasn't precise enough. Many thanks for your time in advance, Nikos --- The data * A list of 10 different names called classifications * data.frames starting with the classification names and a common _thresholds suffix. Each _thresholds data.frame contains 10 columns with numeric values (each column consists of 875 rows) The goal I want to create a new vector for each classification which will hold 10 numeric values (e.g. some sum) one for each column in the classification-thresholds objects. The code # loop over classifications for (x in classifications) { # loop over sequence 1 to 10 for (i in 1:10) # store sum's per source column assign ( (paste ( x, _sums, sep = )[i], sum ( !is.na ( get ( paste ( x, _thresholds, sep = ) )[ ,i] ) ) ) } The problem The above piece of code gives me 10 new vectors but with only 1 value (e.g. the last derived from the loop) _instead_ of 10. Questions 1. How can I construct this properly 2. Related question: how can I print the structure of each column of each classification with a for loop? e.g. # a single loop work perfectly as follows: for (i in 1:10) str(burned_eig_normalised_cov.omission_thresholds[,i]) int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... logi [1:875] NA NA NA NA NA NA ... # now, I would like to say for (x in classifications) { for (i in 1:10) str(paste(x, .omission_thresholds, sep=)[,i]) } Error in paste(x, .omission_thresholds, sep = )[, i] : incorrect number of dimensions Why is this wrong? Given the common prefix, how can I paste the prefix and the suffix and access the columns? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create vectors within a double loop
jim holtman: You might want to look at how to use 'lapply' to create lists. Here is one way of doing it: # create test data a_threshold - b_threshold - as.data.frame(matrix(sample(c(1:5, NA), 100, TRUE), 10)) classification - c('a', 'b') result - lapply(classification, function(.cls){ + colSums(!is.na(get(paste(.cls, '_threshold', sep='' + }) Jim, thank you for your time. You saved my day ;-). Just for your interest, my code (even if not elegant) works also with colSums(). Great function. Of course, the lapply() is probably the elegant solution to many tasks like mine. Kindest regards, Nikos The code # loop over classifications for (x in classifications) { # loop over sequence 1 to 10 for (i in 1:10) # store sum's per source column assign ( (paste ( x, _sums, sep = )[i], # changed from sum to colSums colSums ( !is.na ( get ( paste ( x, _thresholds, sep = ) ) # removed the [ ,i] # ) ) ) } The 2nd question remains or it is again solved with an _apply()_ function. Maybe I'll find the answer in R_inferno? Questions --%--- 2. Related question: how can I print the structure of each column of each classification with a for loop? e.g. # a single loop work perfectly as follows: for (i in 1:10) str(burned_eig_normalised_cov.omission_thresholds[,i]) int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... int [1:875] NA NA NA NA NA NA NA NA NA NA ... logi [1:875] NA NA NA NA NA NA ... # now, I would like to say for (x in classifications) { for (i in 1:10) str(paste(x, .omission_thresholds, sep=)[,i]) } Error in paste(x, .omission_thresholds, sep = )[, i] : incorrect number of dimensions Why is this wrong? Given the common prefix, how can I paste the prefix and the suffix and access the columns? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.