Re: [R] points marking
Dear Gregory,

Thanks for your reply and help. Let me explain my problem again; my script is below.

Dom <- c(195, 568, 559)
fkbp <- barplot(Dom, col = "black", xlab = "", border = NA, space = 7, xlim = c(0, 650), ylim = c(0, 87), las = 2, horiz = TRUE)
axis(1, at = seq(0, 600, 10), las = 2)

# 1. Segments 1
segments(164, 7.8, 192, 7.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)
segments(45, 15.8, 138, 15.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)
segments(160, 15.8, 255, 15.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)
segments(277, 15.8, 378, 15.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)
segments(51, 23.8, 145, 23.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)
segments(167, 23.8, 262, 23.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)
segments(284, 23.8, 381, 23.8, col = "green", pch = 23, cex = 9, lty = "solid", lwd = 20)

# 2. Segments 2
segments(399, 15.8, 432, 15.8, col = "blue", pch = 21, cex = 9, lty = "solid", lwd = 20)
segments(448, 15.8, 475, 15.8, col = "blue", pch = 21, cex = 9, lty = "solid", lwd = 20)
segments(486, 15.8, 515, 15.8, col = "blue", pch = 21, cex = 9, lty = "solid", lwd = 20)
segments(401, 23.8, 434, 23.8, col = "blue", pch = 21, cex = 9, lty = "solid", lwd = 20)
segments(450, 23.8, 475, 23.8, col = "blue", pch = 21, cex = 9, lty = "solid", lwd = 20)
segments(486, 23.8, 517, 23.8, col = "blue", pch = 21, cex = 9, lty = "solid", lwd = 20)

I have solved one part of my query: marking the stretch from one position to another works fine. But I now have another issue. I am using two sets of segments, 1 and 2, and I want a different shape for set 2, so I am giving pch=21, but it seems to draw a solid line for both. Drawing a different shape for every chunk of segments is the whole point. I want to write a script that can generate figures like the ones produced by the tool linked below.

http://www.expasy.ch/tools/mydomains/

Thank you
Jeet

On Thu, Jun 10, 2010 at 11:10 PM, Greg Snow greg.s...@imail.org wrote:

> Your question is not really clear; do either of these examples do what you want?
>
> with(anscombe, plot(x1, y2, ylim=range(y2,y3)) )
> with(anscombe, points(x1, y3, col='blue', pch=2) )
> with(anscombe, segments(x1, y2, x1, y3, col=ifelse( y2>y3, 'green','red') ) )
>
> with(anscombe, plot(x1, y2, ylim=range(y2,y3), type='n') )
> with(anscombe[order(anscombe$x1),], polygon( c( x1,rev(x1) ), c(y2, rev(y3)), col='grey' ) )
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of khush
>> Sent: Thursday, June 10, 2010 7:48 AM
>> To: r-help@r-project.org
>> Subject: [R] points marking
>>
>> Hi,
>> How can I mark points on the x axis of a graph, keeping x constant and changing y from y1 to y2? I want to highlight the area from y1 to y2. Any suggestions?
>>
>> Thank you
>> Jeet

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] ff package when reading .csv files
Hi,

My aim is to read a large .csv file into R. I ran the following code and am using R version 10.1 on Windows.

library(ff)
read.csv.ffdf(x=NULL, "file.csv", fileEncoding="", nrows=-1, first.rows=NULL, next.rows=NULL, levels=NULL, appendLevels=TRUE, FUN="read.table", transFUN=NULL, asffdf_args=list(), BATCHBYTES=getOption("ffbatchbytes"), VERBOSE=FALSE)

Error in read.table.ffdf(FUN = "read.csv", ...) :
  formal argument "FUN" matched by multiple actual arguments

Can anyone help me to fix this error? Thanks in advance.
Re: [R] points marking
Hi,

I am not sure you can do what you want. Segments are not points, so your pch option is (I believe) ignored. You could play with the lmitre and lend parameters, but that probably would not help much. You could try looking at ?symbols, but you would probably need to change the source code to suit your needs.

Regards
Petr

r-help-boun...@r-project.org wrote on 11.06.2010 08:00:04:

> [...]
> I have solved one part of my query: marking the stretch from one position to another works fine. But I now have another issue. I am using two sets of segments, 1 and 2, and I want a different shape for set 2, so I am giving pch=21, but it seems to draw a solid line for both. Drawing a different shape for every chunk of segments is the whole point. I want to write a script that can generate figures like the ones produced by the tool linked below.
>
> http://www.expasy.ch/tools/mydomains/
>
> Thank you
> Jeet
> [...]
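Since segments() only draws lines, one way to get a different shape per group is to draw the ranges with rect() and points() instead. The following is a minimal sketch under stated assumptions: it reuses a few coordinates from the script quoted above, `draw_domains` is a hypothetical helper name, and the choice of rectangles for group 1 and rows of circles for group 2 is only an illustration, not what the mydomains tool does.

```r
# Hypothetical helper: draw each domain with a shape chosen by its group.
draw_domains <- function(dom) {
  plot(NA, xlim = c(0, 650), ylim = c(0, 30),
       xlab = "position", ylab = "", yaxt = "n")
  for (i in seq_len(nrow(dom))) {
    d <- dom[i, ]
    if (d$group == 1) {
      # group 1: filled rectangle spanning start..end
      rect(d$start, d$y - 1.5, d$end, d$y + 1.5, col = "green", border = NA)
    } else {
      # group 2: a row of filled circles (points() does honour pch)
      xs <- seq(d$start, d$end, by = 8)
      points(xs, rep(d$y, length(xs)), pch = 21, bg = "blue", cex = 1.5)
    }
  }
  invisible(nrow(dom))
}

dom <- data.frame(
  start = c(45, 160, 277, 399, 448, 486),
  end   = c(138, 255, 378, 432, 475, 515),
  y     = 15.8,
  group = c(1, 1, 1, 2, 2, 2)
)

pdf(NULL)                      # null device so the sketch also runs non-interactively
n_drawn <- draw_domains(dom)
invisible(dev.off())
```

Keeping the coordinates in a data frame rather than in one call per segment also makes it easier to add a third group with its own shape later.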
Re: [R] Retrieving the 2 row of dist computations
Edit: I'm stupid and visualized the dist matrix incorrectly in my head. It should be Column # = x, Row # = y:

n = 827-(x-2)
index = y-1+(n+827)(827-n+1)/2

Everything works just fine. Thanks!

Jeff08 wrote:

> Edit: There is something funky about the code. It definitely returns the right column of the distance data, but returns an incorrect row.
>
> Code:
>
> NCols=250
> NRows=829
> myMat <- matrix(runif(NCols*NRows), ncol=NCols)
> d <- dist(myMat)
> e <- sort.list(d)
> e <- e[1:5] ##Retrieve minimum 5 distances
> k <- 5
> res <- matrix(NA, ncol = 2, nrow = k)
> ds <- sort(d)
> for(i in 1:k) res[i, ] <- which(as.matrix(d) == ds[i], arr.ind = TRUE)[1,]
> colnames(res) <- c('row','col')
> rownames(res) <- 1:k
> res
>
> I have derived the formula for 829 rows, to check whether the returned column and row match the index given by e.
>
> Column # = x, Row # = y.
> n = 828-(x-2)
> index = y+(n+828)(828-n+1)/2
>
> Formula R CODE
>
> ##Just checking for row 1
> i <- 1
> y <- res[i,1]
> x <- res[i,2]
> n <- (828-(x-2))
> index1 <- (y+(n+828)*(828-n+1)/2)
> index2 <- e[i]
> ##index1 should equal index2, but this is not the case
> ##you can tell that the column is right because index1 and index2 are close
> ##(a change in row of 1 shifts the index by 1, but a change in column
> ## shifts index by ~400 on average)
>
> You can then compare this index to the one given by e[i].
>
> Jorge Ivan Velez wrote:
>
>> Hi there,
>>
>> I am sure there is a better way to do it, but here is a suggestion:
>>
>> res <- matrix(NA, ncol = 2, nrow = 5)
>> for(i in 1:5) res[i, ] <- which(as.matrix(d) == sort(d)[i], arr.ind = TRUE)[1,]
>> res
>>
>> HTH,
>> Jorge
>>
>> On Wed, Jun 9, 2010 at 11:30 PM, Jeff08 wrote:
>>
>>> Dear R Gurus,
>>>
>>> As you probably know, dist calculates the distance between every two rows of data. What I am interested in is the actual two rows that have the least distance between them, rather than the numerical value of the distance itself.
>>>
>>> For example, the minimum distance in the following sample run is d[14], which is .3826119, and the rows are 3 and 6. I need to find a generic way to retrieve these rows, for a generic matrix of NRows (in this example NRows=7).
>>>
>>> NCols=5
>>> NRows=7
>>> myMat <- matrix(runif(NCols*NRows), ncol=NCols)
>>> d <- dist(myMat)
>>>
>>>           1         2         3         4         5         6
>>> 2 0.7202138
>>> 3 0.7866527 0.9052319
>>> 4 0.6105235 1.0754259 0.8897555
>>> 5 0.5032729 1.0789359 0.9756421 0.4167131
>>> 6 0.6007685 0.6949224 0.3826119 0.7590029 0.7994574
>>> 7 0.9751200 1.2218754 1.0547197 0.5681905 0.7795579 0.8291303
>>>
>>> e <- sort.list(d)
>>> e <- e[1:5] ##Retrieve minimum 5 distances
>>> [1] 14 16 4 18 5
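The index arithmetic discussed in this thread can be avoided altogether. The sketch below relies on the fact that a dist object stores the lower triangle of the distance matrix column by column, which is the same order in which which(lower.tri(...), arr.ind = TRUE) lists the cells. `dist_index_to_pair` is a hypothetical helper name, and the data are random, as in the original post.

```r
# Map positions in a dist vector back to (row, col) pairs.
# dist() stores the lower triangle column by column, the same order
# that which(lower.tri(...), arr.ind = TRUE) enumerates.
dist_index_to_pair <- function(k, n) {
  pairs <- which(lower.tri(matrix(0, n, n)), arr.ind = TRUE)
  pairs[k, , drop = FALSE]
}

set.seed(1)
myMat <- matrix(runif(5 * 7), ncol = 5)
d <- dist(myMat)
e <- order(d)[1:5]                       # positions of the 5 smallest distances
res <- dist_index_to_pair(e, attr(d, "Size"))
res                                      # one (row, col) pair per distance
```

For a 7-row matrix, position 14 maps to row 6 and column 3, which matches the printed distance matrix in the original question. For very large matrices a closed-form index formula uses less memory than building the full n-by-n helper matrix.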
Re: [R] Desolve package: How to pass thousand of parameters to C compiled code?
On 6/8/2010 2:03 PM, Ben Bolker wrote:

> cmmu_mile at yahoo.com writes:
>
>> Hi,
>>
>> I have used the deSolve package for my ODE problem regarding infectious disease transmission and am currently trying to pass lots (roughly a thousand) of model parameters to the compiled C model (I have to use compiled C code instead of R code purely because of speed). I can't define them one by one, as it would take ages to finish and would also be quite difficult to revise. I have read the instructions in "Writing Code in Compiled Languages", which is very well written, but still have no clue how to proceed. Can anyone suggest the best way to go about this, and also places where I can find examples of deSolve compiled C code?
>
> There's another example of compiled code in http://www.jstatsoft.org/v33/i09/paper , but it probably won't say anything the 'Writing Code' guide doesn't already cover. I found the guide pretty complete ...
>
> I would suggest that you pack your parameters into a single numeric vector (or several, if that makes more sense) and pass them that way.
>
> What kind of infectious disease data do you have that will allow you to fit a model with thousands of parameters ... ?? (Just curious.)

I agree with using vectors (and matrices) as recommended by Ben. In addition, you may consider using structures and unions instead of #define macros:

    union my_parms_vec {
      /* named view of the parameters: 1 + 1 + 15 = 17 doubles */
      struct { double foo1, foo2, bar[15]; };
      /* flat view of the same memory, filled by odeparms() */
      double value[17];
    };

    static union my_parms_vec p;

    // ...
    // This structure then has to be initialized
    // as vector 'value' in initmod:

    /* initializer */
    void initmod(void (* odeparms)(int *, double *)) {
      int N = 17;               /* must match the length of 'value' */
      odeparms(&N, p.value);    /* odeparms expects a pointer to N */
    }

so that you can use constructs like p.foo1 or p.bar[7] in your model (i.e. derivs) function.

Given the large number of parameters, you may also consider whether they are really parameters and not something like external forcings, for which recent versions of deSolve have separate mechanisms. See ?forcings

Hope it helps

Thomas Petzoldt

PS: There is also a dedicated mailing list for discussions like this:
https://stat.ethz.ch/mailman/listinfo/r-sig-dynamic-models
[R] Calculation of r squared from a linear regression
Hi,

I'm trying to verify the calculation of the coefficient of determination (r squared) for linear regression. I've done the calculation manually with a simple test case, using the definition of r squared outlined in the summary(lm) help. There seems to be a discrepancy between what R produced and the manual calculation. Does anyone know why this is so? What does the multiple r squared reported in summary(lm) represent?

# The test case:
x <- c(1,2,3,4)
y <- c(1.6,4.4,5.5,8.3)
dummy <- data.frame(x, y)
fm1 <- lm(y ~ x-1, data = dummy)
summary(fm1)
betax <- fm1$coeff["x"] * sd(x) / sd(y)
# cd is coefficient of determination
cd <- betax * cor(y, x)

Thanks.
Re: [R] Misplacement of Greek letter
Newlines are not supported within expressions for mathematical notation. See ?plotmath

Best,
Uwe Ligges

On 11.06.2010 07:36, beloitstudent wrote:

> Hello.
>
> I am trying to get my axis label to read as follows:
>
> (The symbol) Delta AUC blah blah...
> then below it... (some other text)
>
> The problem is that the Delta symbol shows up beside the (some other text) rather than beside the AUC. Does anyone know how I can get the Delta to remain beside AUC? Here is the actual command, should you care to look at it.
>
> par(mar=c(8,8,4,4))
> par(adj=.5)
> par(font.lab=2)
> barplot(c(5724.688,7290.7875), ylab=expression(paste(Delta,"AUC 45-200 min \n(μg/ml * 155 min)")), xlim=c(0,1), width=0.16, border="black", font.lab=2, cex.lab=2.1, cex.names=2, names.arg=c("Saline \n(n=8)", "Exendin\n(9-39) (n=8)"), mgp=c(3,4,0), axisnames=TRUE, ylim=c(0,1), col=c('grey88','grey71'), axes=FALSE, lwd=2, space=.5)
> axis(2, at=c(0, 2000, 6000,1), lwd=2, font=1.7, pos=-.025, cex.axis=2)
> abline(h=0, untf=FALSE, lty=1, lwd=2)
> arrows(.4, 8297.291, .4, 6284.284, code=3, angle=90, lwd=2)
> arrows(.16,7071.79,.16,4377.585, code=3, angle=90, lwd=2)
>
> Thanks all!
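One plotmath construct that stacks two lines without a newline is atop(), which is documented in ?plotmath. A minimal sketch, assuming simplified bar names and default axis limits rather than the poster's full barplot call:

```r
# Two-line axis label: Delta stays with "AUC" on the first line.
lab <- expression(atop(Delta * "AUC 45-200 min",
                       "(" * mu * "g/ml * 155 min)"))

pdf(NULL)                          # null device; use a real device to view the plot
par(mar = c(5, 8, 4, 2))           # extra left margin for the stacked label
barplot(c(5724.688, 7290.7875),
        names.arg = c("Saline", "Exendin(9-39)"),
        ylab = lab)
invisible(dev.off())
```

Note that atop() centres the two lines on each other, so the left margin may need adjusting for long labels.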
Re: [R] HOW to install RSQLite database
Are you asking how to install the RSQLite package or how to create a SQLite database? The two are somewhat distinct questions.

RSQLite is just a package of functions that lets R access data in an SQLite database. There isn't a separate SQLite program, just a library that is compiled into RSQLite.

Regards
David

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of vijaysheegi
Sent: 10 June 2010 16:22
To: r-help@r-project.org
Subject: [R] HOW to install RSQLite database

> Please let me know where I have to type the commands below to get the RSQLite database installed. Please let me know the solution. Thanks in advance.
>
> RSQLite -- Embedding the SQLite engine in R
>
> (The RSQLite package includes a recent copy of the SQLite distribution from http://www.sqlite.org.)
>
> Installation
>
> There are 3 alternatives for installation:
>
> 1. Simple installation:
>
>    R CMD INSTALL RSQLite-.tar.gz
>
>    The installation automatically detects whether SQLite is available in any of your system directories; if it's not available, it installs the SQLite engine and the R-SQLite interface under the package directory $R_PACKAGE_DIR/sqlite.
>
> 2. If you have SQLite installed in a non-system directory (e.g., in $HOME/sqlite),
>
>    a) You can use
>
>       export PKG_LIBS="-L$HOME/sqlite/lib -lsqlite"
>       export PKG_CPPFLAGS="-I$HOME/sqlite/include"
>       R CMD INSTALL RSQLite-.tar.gz
>
>    b) or you can use the --with-sqlite-dir configuration argument
>
>       R CMD INSTALL --configure-args=--with-sqlite-dir=$HOME/sqlite \
>          RSQLite-.tar.gz
>
> 3. If you don't have SQLite but would rather install the version we provide into a directory different from the RSQLite package, for instance $HOME/app/sqlite, use
>
>    R CMD INSTALL --configure-args=--enable-sqlite=$HOME/app/sqlite \
>       RSQLite-.tar.gz
>
> Usage
>
> Note that if you use an *existing* SQLite library that resides in a non-system directory (e.g., other than /lib, /usr/lib, /usr/local/lib) you may need to include it in your LD_LIBRARY_PATH, prior to invoking R. For instance
>
>    export LD_LIBRARY_PATH=$HOME/sqlite/lib:$LD_LIBRARY_PATH
>    R
>    > library(help=RSQLite)
>    > library(RSQLite)
>
> (If you use the --enable-sqlite=DIR configuration argument, the SQLite library is statically linked to the RSQLite R package, and you need not worry about setting LD_LIBRARY_PATH.)
Re: [R] setting the current working directory to the location of the source file
Isn't this what source(..., chdir=TRUE) is for? See help(source).

/H

On Fri, Jun 11, 2010 at 2:33 AM, Marcin Gomulka mrgo...@gmail.com wrote:

> AFAIK a script run through source() does not have any legit way to learn about its own location. I need this to make sure that the script will find its datafiles after I move the whole directory. (The datafiles are in the same directory.)
>
> Here is a hack I invented to work around it:
>
> print(getwd())
> source_pathname = get("ofile", envir = parent.frame())
> source_dirname = dirname(source_pathname)
> setwd(source_dirname)
> print(getwd())
>
> Question: Is there a better, cleaner way?
>
> Thanks,
> mrgomel
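A small self-contained check of the chdir = TRUE behaviour follows. It is a sketch only: the directory, script and data file names are made up for the demonstration and are created in a temporary location.

```r
# Create a throwaway directory holding a script and its data file.
script_dir <- tempfile("proj")
dir.create(script_dir)
writeLines("1,2,3", file.path(script_dir, "data.csv"))
writeLines(c(
  "wd_inside <- getwd()",                              # where the script thinks it is
  "dat <- scan('data.csv', sep = ',', quiet = TRUE)"   # relative path to the data file
), file.path(script_dir, "analysis.R"))

wd_before <- getwd()
# chdir = TRUE: the working directory is set to the script's directory
# while it is evaluated, then restored afterwards.
source(file.path(script_dir, "analysis.R"), chdir = TRUE)
wd_after <- getwd()
```

The script finds data.csv through a relative path, and the caller's working directory is unchanged once source() returns.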
Re: [R] Calculation of r squared from a linear regression
On 2010-06-11 2:16, Sandra Hawthorne wrote:

> [...] There seems to be a discrepancy between what R produced and the manual calculation. Does anyone know why this is so? What does the multiple r squared reported in summary(lm) represent?
>
> fm1 <- lm(y ~ x-1, data = dummy)
> summary(fm1)
> betax <- fm1$coeff["x"] * sd(x) / sd(y)
> # cd is coefficient of determination
> cd <- betax * cor(y, x)

The discrepancy is due to an incorrect manual calculation. You're using (incorrectly, at that) formulas for simple regression _with an intercept term_, whereas your model has _no_ intercept term. What summary.lm reports is clearly described on the help page. See r.squared in the Value section.

-Peter Ehlers
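To make the distinction concrete, the sketch below recomputes both versions of R squared by hand for the test case from the question. It follows the definition given in the Value section of ?summary.lm: with an intercept the total sum of squares is taken about mean(y), without one it is taken about zero.

```r
x <- c(1, 2, 3, 4)
y <- c(1.6, 4.4, 5.5, 8.3)

# Model without an intercept: R^2 is measured against zero, not mean(y).
fm0 <- lm(y ~ x - 1)
r2_origin <- 1 - sum(resid(fm0)^2) / sum(y^2)

# Model with an intercept: R^2 is measured against mean(y) and,
# for a single predictor, equals the squared correlation.
fm1 <- lm(y ~ x)
r2_intercept <- 1 - sum(resid(fm1)^2) / sum((y - mean(y))^2)

c(origin = r2_origin, intercept = r2_intercept)
```

Both hand-computed values agree with summary(fm0)$r.squared and summary(fm1)$r.squared respectively, and the two are not comparable with each other because they use different baselines.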
Re: [R] Calculation of r squared from a linear regression
On Fri, 2010-06-11 at 01:16 -0700, Sandra Hawthorne wrote:

> [...] There seems to be a discrepancy between what R produced and the manual calculation. Does anyone know why this is so? What does the multiple r squared reported in summary(lm) represent?
> [...]

Sorry Sandra, but the problem is in your script. Look at this:

> x <- c(1,2,3,4)
> y <- c(1.6,4.4,5.5,8.3)
> dummy <- data.frame(x, y)
> fm1 <- lm(y ~ x, data = dummy)
> summary(fm1)

Call:
lm(formula = y ~ x, data = dummy)

Residuals:
    1     2     3     4
-0.17  0.51 -0.51  0.17

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.3500     0.6584  -0.532   0.6481
x             2.1200     0.2404   8.818   0.0126 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5376 on 2 degrees of freedom
Multiple R-squared: 0.9749, Adjusted R-squared: 0.9624
F-statistic: 77.76 on 1 and 2 DF,  p-value: 0.01262

> betax <- fm1$coeff[2] * sd(x) / sd(y)
> # cd is coefficient of determination
> cd <- betax * cor(y, x)
> cd
       x
0.974924

The formula fm1$coeff[2] * sd(x) / sd(y) is valid only if the model has an intercept.

--
Bernardo Rangel Tura, M.D, MPH, Ph.D
National Institute of Cardiology
Brazil
[R] Clustering algorithms don't find obvious clusters
I have a directed graph which is represented as a matrix of the form

0 4 0 1
6 0 0 0
0 1 0 5
0 0 4 0

Each row corresponds to an author (A, B, C, D) and the values say how many times this author has cited the other authors. Hence the first row says that author A has cited author B four times and author D one time. Thus the matrix represents two groups of authors, (A,B) and (C,D), who cite each other. But there is also a weak link between the groups. In reality this matrix is much bigger and very sparse, but it still consists of distinct groups of authors.

My problem is that when I cluster the matrix using pam, clara or agnes, the algorithms do not find the obvious clusters. I have tried to turn it into a dissimilarity matrix before clustering, but that did not help either.

The layout of the clustering is not that important to me; my primary interest is to get the right nodes into the right clusters.

Sincerely
Henrik
Re: [R] Misplacement of Greek letter
Not that I want to encourage the creation of the infamous skyscraper plot (which has an extremely low information/ink ratio), but if you're having trouble with axis labelling, it's probably best to look at mtext(). You can use several mtext calls to place parts of your label on different margin lines.

-Peter Ehlers

On 2010-06-10 23:36, beloitstudent wrote:

> Hello.
>
> I am trying to get my axis label to read as follows:
>
> (The symbol) Delta AUC blah blah...
> then below it... (some other text)
>
> The problem is that the Delta symbol shows up beside the (some other text) rather than beside the AUC. Does anyone know how I can get the Delta to remain beside AUC?
> [...]
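The mtext() suggestion can be sketched as follows. The margin line numbers (5 and 3.5) and the simplified bar names are assumptions chosen for illustration and would need tuning for the poster's actual figure.

```r
# One expression per line of the label; plotmath keeps Delta next to "AUC".
label_lines <- list(
  expression(Delta * "AUC 45-200 min"),
  expression("(" * mu * "g/ml * 155 min)")
)
margin_lines <- c(5, 3.5)          # distance of each label line from the axis

pdf(NULL)                          # null device; use a real device to view the plot
par(mar = c(5, 8, 4, 2))           # wide left margin to hold both lines
barplot(c(5724.688, 7290.7875),
        names.arg = c("Saline", "Exendin(9-39)"), ylab = "")
for (i in seq_along(label_lines)) {
  mtext(label_lines[[i]], side = 2, line = margin_lines[i])
}
invisible(dev.off())
```

Because each part of the label is placed separately, the two lines can also be given different sizes or fonts through the cex and font arguments of mtext().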
Re: [R] ff package when reading .csv files
On 2010-06-11 0:37, dhanush wrote:

> Hi,
>
> My aim is to read a large .csv file into R. [...]
>
> Error in read.table.ffdf(FUN = "read.csv", ...) :
>   formal argument "FUN" matched by multiple actual arguments
>
> Can anyone help me to fix this error? Thanks in advance.

read.csv.ffdf() is just a wrapper for read.table.ffdf() with FUN set to "read.csv". You should not use any FUN= argument in read.csv.ffdf. Either call read.table.ffdf with FUN="read.csv" or call read.csv.ffdf with no FUN specification. Nor is there any need to specify all the other arguments just to give them their default values.

-Peter Ehlers
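The mechanism behind this error can be reproduced in base R without ff. In the sketch below, `reader` and `csv_reader` are hypothetical stand-ins for read.table.ffdf() and read.csv.ffdf(); they only mimic the way the wrapper fixes FUN and forwards everything else through `...`.

```r
# Hypothetical stand-ins for read.table.ffdf() and read.csv.ffdf():
# the wrapper already fixes FUN, so the caller must not pass it again.
reader <- function(file, FUN = "read.table", ...) paste(FUN, "on", file)
csv_reader <- function(...) reader(FUN = "read.csv", ...)

# Fine: FUN is supplied once, by the wrapper.
ok <- csv_reader(file = "file.csv")

# Fails: FUN arrives twice, once from the wrapper and once via '...'.
err <- tryCatch(
  csv_reader(file = "file.csv", FUN = "read.table"),
  error = function(e) conditionMessage(e)
)
```

With ff loaded, the corresponding fix for the original call would be to drop FUN (and the other defaulted arguments) and pass only the file name.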
Re: [R] Clustering algorithms don't find obvious clusters
On 11/06/2010 12:45, Henrik Aldberg wrote:

> I have a directed graph which is represented as a matrix [...]
>
> My problem is that when I cluster the matrix using pam, clara or agnes, the algorithms do not find the obvious clusters. [...] My primary interest is to get the right nodes into the right clusters.

Hello Henrik,

You can do graph clustering with the igraph package. Example:

library(igraph)
simM <- NULL
simM <- rbind(simM, c(0, 4, 0, 1))
simM <- rbind(simM, c(6, 0, 0, 0))
simM <- rbind(simM, c(0, 1, 0, 5))
simM <- rbind(simM, c(0, 0, 4, 0))
G <- graph.adjacency(simM, weighted=TRUE, mode="directed")
plot(G, layout=layout.kamada.kawai)

### walktrap.community
wt <- walktrap.community(G, modularity=TRUE)
wmemb <- community.to.membership(G, wt$merges, steps=which.max(wt$modularity)-1)
V(G)$color <- rainbow(3)[wmemb$membership+1]
plot(G)

I hope it helps
Etienne
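For comparison, here is a base R sketch that does not need igraph. It is not the method from the reply above: it swaps in plain hierarchical clustering (hclust) on a dissimilarity derived from the citation counts, assuming that citations in both directions can simply be added together. Clustering the raw rows as coordinate vectors compares citation profiles instead (A's row and B's row look nothing alike), which may be why the obvious groups were missed.

```r
cit <- matrix(c(0, 4, 0, 1,
                6, 0, 0, 0,
                0, 1, 0, 5,
                0, 0, 4, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(LETTERS[1:4], LETTERS[1:4]))

sim <- cit + t(cit)              # total citations exchanged by each pair
dis <- as.dist(max(sim) - sim)   # many mutual citations = small distance
groups <- cutree(hclust(dis, method = "average"), k = 2)
groups
```

On this toy matrix A and B land in one cluster and C and D in the other. The same `dis` object could also be passed to pam() or agnes() from the cluster package, which accept dissimilarities directly.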
Re: [R] Cforest and Random Forest memory use
You say you're on a 64-bit box, but are you running 64-bit R?

-Peter Ehlers

On 2010-06-10 4:36, Matthew OKane wrote:

> Hi all,
>
> I'm having great trouble working with the cforest (from the party package) and randomForest functions. Large data sets seem to create very large model objects, which means I cannot work with the number of observations I need to, despite running on a large 8GB 64-bit box. I would like the object to hold only the trees themselves, as I intend to export them out of R. Is there any way, either through options or by editing the code and recompiling, that I can reduce their footprint?
>
> I've had a look at the cforest code and the culprit is the 'ensemble' area of the object. I suspect this part of the object contains something related to the number of observations (I have savesplitstats set to FALSE, so this shouldn't be the issue).
>
> Thanks,
> Matt
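A quick way to answer the 64-bit question from inside R, plus a rough way to measure how much memory an object takes. This is a sketch with a small stand-in vector; for the real case `obj` would be the fitted forest.

```r
# Pointer size in bits: 64 for a 64-bit build of R, 32 for a 32-bit build.
r_bits <- .Machine$sizeof.pointer * 8
cat("R build:", R.version$arch, "-", r_bits, "bit\n")

# Rough size of an object in memory, here a small stand-in vector.
obj <- numeric(1e5)
obj_mb <- as.numeric(object.size(obj)) / 1024^2
cat("object size:", round(obj_mb, 2), "MB\n")
```

A 32-bit R build is limited in how much memory it can address even on a 64-bit machine with 8GB of RAM, which is why the build matters here.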
[R] glm-test?
Dear R-users,

I would like to test whether a sample distribution differs significantly from a population distribution. Neither is normally distributed. How should I proceed? Somehow using glm models? How?

The population and the sample data are here. They can be loaded using the load command.

http://users.utu.fi/attenka/D_Pop
http://users.utu.fi/attenka/D_Samp

Best regards,

Atte Tenkanen
University of Turku, Finland
Department of Musicology
+35823335278
http://users.utu.fi/attenka/
Re: [R] How to adjust plot size?
I'm not sure what part of the process is giving you trouble, but if you play around with the mar part of the code, you get a lot of flexibility over the margins. Also, if you pre-set the dimensions of the window the plot is created in, you get even more control. E.g.

x11(width=9, height=6, pointsize=12)
par(mfrow=c(1,2), mar=c(3,2,2,2))
plot(1:10)
plot(1:10)
savePlot("c:\\test.wmf", type="wmf")

Guy

liang wrote:

> Greetings!
>
> When inserting the following R curves into a Word file, there is a big margin in the graph. How do I remove the margin? I tried fin, but it seems not to be compatible with mfrow.
>
> par(mfrow=c(1,2), mar=c(10,1,10,2))
> plot(1:10)
> plot(1:10)
>
> Thanks for your help,
> Liang
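The same idea also works with a file device, which avoids the Windows-only x11()/savePlot() pair. A minimal sketch, assuming a PDF is an acceptable output format; the file name and the particular width, height and margin values are placeholders.

```r
out <- file.path(tempdir(), "two_panels.pdf")  # hypothetical output file
pdf(out, width = 9, height = 4)                # device size in inches
par(mfrow = c(1, 2),                           # two panels side by side
    mar = c(4, 4, 1, 1))                       # bottom, left, top, right margins in lines
plot(1:10)
plot(1:10)
invisible(dev.off())                           # close the device to write the file
```

Matching the device's width-to-height ratio to the panel layout is what removes most of the white space; mar then trims the rest.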
Re: [R] glm-test?
Which test do you want to use? Once you know that, tell us and we'll tell you where to find it in R.

Cheers
Joris

On Fri, Jun 11, 2010 at 1:50 PM, Atte Tenkanen atte...@utu.fi wrote:
> Dear R-users,
>
> I would like to test whether a sample distribution differs significantly from a population distribution. They are not normally distributed. How should I proceed? [...]

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] Cforest and Random Forest memory use
Also, you have not said:

- your OS
- your version of R
- your version of party
- your code
- what "Large data set" means
- what "very large model objects" means

So... how is anyone supposed to help you?

Max
Re: [R] glm-test?
Take a look at this document:

http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

All the information you need is in there.

Cheers
Joris

On Fri, Jun 11, 2010 at 2:50 PM, Atte Tenkanen atte...@utu.fi wrote:
> I would have tried a z-test (n=67), but since the distribution is not normal but positively skewed, should I somehow transform the data? Values are between 0 and 1.
>
> atte

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
[R] R in Linux: problem with special characters
Hi,

I'm working with the 64-bit version of R 2.11.0 for Linux. My session info is:

R version 2.11.0 (2010-04-22)
x86_64-redhat-linux-gnu

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

When I try to print words with special characters, the printed result contains an escape code in place of the special character. For example, if I run print("dúvida") the result is:

> print("dúvida")
[1] "d\372vida"

Does this problem have something to do with the locale settings? If I run the locale command on the Linux server, I get:

[daniel.fernan...@pt-lnx13 ~]$ locale
LANG=pt_PT.UTF-8
LC_CTYPE=C
LC_NUMERIC=C
LC_TIME=C
LC_COLLATE=C
LC_MONETARY=C
LC_MESSAGES=C
LC_PAPER=C
LC_NAME=C
LC_ADDRESS=C
LC_TELEPHONE=C
LC_MEASUREMENT=C
LC_IDENTIFICATION=C
LC_ALL=C

Thanks in advance for your help,

Daniel
[R] r code to broaden the boarder of the bars of a histogram
To whom it may concern,

I have a problem concerning the design of a histogram. How do I change the border widths of the bars of a histogram? The initial command is:

hist(punkte, breaks=30, xlab="Punkte", ylab="Häufigkeit", main="Histogramm", col=heat.colors(30), border="red")

I suspect that it has to do with the lwd parameter but can't figure it out.

Kind regards,

Andreas Baranowski
University of Klagenfurt
Universitätsstraße 65-67
9020 Klagenfurt
Austria
[R] Unable to load an object
R-help,

I cannot seem to get an object back after saving it, either from a file with an .RData extension or from output written via dput. Whenever I try to import the object into another workspace I just get nothing.

> geoFeatures <- load("geoFeatures.RData")
> geoFeatures
[1] "geoFeatures"

The geoFeatures.RData workspace contains a list object called geoFeatures.

Thanks in advance

> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          11.1
year           2010
month          05
day            31
svn rev        52157
language       R
version.string R version 2.11.1 (2010-05-31)
[R] lm without error
this is not an important question, but I wonder why lm returns an error, and whether this can be shut off. it would seem to me that returning NA's would make more sense in some cases---after all, the problem is clearly that coefficients cannot be computed.

I know that I can trap the lm.fit() error---although I have always found this to be quite inconvenient---and this is easy if I have only one regression in my lm() statement. but, let's presume I have a matrix with a few thousand dependent y variables (and the same independent X variables). Let's presume one of the y variables contains only NA's. I believe I now cannot use lm(y ~ X), because one of the regressions will throw the lm.fit exception. (all the other y vectors should have worked.)

or is there a way to get lm() to work in such situations?

/iaw

Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
Re: [R] glm-test?
I would have tried a z-test (n=67), but since the distribution is not normal but positively skewed, should I somehow transform the data? Values are between 0 and 1.

atte

> Which test do you want to use? Once you know that, tell us and we'll tell you where to find it in R.
>
> Cheers
> Joris
Re: [R] Unable to load an object
Read the posting guide please.

You can save and load the RData file perfectly well. You just didn't save what you think you saved, but why that is can only be worked out once we get to see your actual code.

Cheers
Joris

On Fri, Jun 11, 2010 at 10:28 AM, Luis Ridao Cruz lu...@hav.fo wrote:
> R-help,
>
> I seem not to get an object saved neither with .RData extension nor output via dput. Whenever I try to import the above object in another workspace I just get nothing. [...]

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] lm without error
Obvious solution: check your data before you throw it into lm(). lm() shouldn't work in that situation, and if it did, I'd no longer use R.

Cheers
Joris

On Fri, Jun 11, 2010 at 2:49 PM, ivo welch ivo...@gmail.com wrote:
> this is not an important question, but I wonder why lm returns an error, and whether this can be shut off. it would seem to me that returning NA's would make more sense in some cases---after all, the problem is clearly that coefficients cannot be computed. [...]

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
[R] removing a non empty directory
I'd like to automatically remove a directory that may be non-empty. I tried:

> file.remove(NewDir, recursive=TRUE)
[1] FALSE
Warning message:
In file.remove(NewDir, recursive = TRUE) :
  cannot remove file 'Prostate_Validated_mirWalk', reason 'Directory not empty'

Is there another command to remove entire directories, including their contents?

Thank you.
Maura
Re: [R] Unable to load an object
If file 'geoFeatures.RData' contains an object with name 'geoFeatures', it is loaded if you do:

load("geoFeatures.RData");

However, when you do:

geoFeatures <- load("geoFeatures.RData");

it will be loaded, but immediately overwritten because you create a new object with the same name. Note that load() returns a character vector of the object *names* loaded. Here is an example illustrating what is going on:

> x <- 1;
> save(x, file="foo.RData");
> rm(x);
> res <- load("foo.RData");
> res
[1] "x"
> str(x);
 num 1
> x <- load("foo.RData");
> str(x);
 chr "x"

You can easily load your data file into a new environment as:

library(R.utils);
env <- loadToEnv("foo.RData");

Then 'env' will be an environment containing your data, e.g.

> ll(envir=env);
  member data.class dimension objectSize
1      x    numeric         1         32

See also saveObject() and loadObject() in R.utils.

/Henrik

On Fri, Jun 11, 2010 at 4:15 PM, Joris Meys jorism...@gmail.com wrote:
> Read the posting guide please.
>
> You can perfectly save and load the RData file. You just didn't save what you think you saved, but why that is can only be solved when we get to see your actual code. [...]
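The same idea can be sketched with base R only, assuming no extra packages are installed: load() takes an envir argument, so the file can be read into a fresh environment and the object retrieved by the name that load() returns. The toy list and the temporary file below are stand-ins for the real geoFeatures data.

```r
# Hypothetical stand-in for the real object and file.
f <- tempfile(fileext = ".RData")
geoFeatures <- list(a = 1, b = "x")
save(geoFeatures, file = f)
rm(geoFeatures)

# Load into a separate environment so nothing in the workspace is overwritten.
env <- new.env()
loaded <- load(f, envir = env)   # character vector of object *names*
loaded

# Fetch the actual object by name.
restored <- get(loaded[1], envir = env)
str(restored)
```

If only a single object needs to be stored, saveRDS() and readRDS() (available in later R versions) avoid the name issue altogether, because readRDS() returns the object itself rather than its name.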
Re: [R] lm without error
1) Please use reproducible, minimal examples when discussing the behavior of R.

2) Perhaps ?try could help.

ivo welch wrote:
> this is not an important question, but I wonder why lm returns an error, and whether this can be shut off. it would seem to me that returning NA's would make more sense in some cases---after all, the problem is clearly that coefficients cannot be computed. [...]
[R] Documentation of B-spline function
Good morning,

This is a documentation-related question about the B-spline function bs() in R. The help file states:

df: degrees of freedom; one can specify df rather than knots; bs() then chooses df-degree-1 knots at suitable quantiles of x (which will ignore missing values).

So if one were to specify a spline with 6 degrees of freedom (and no intercept), a basis with 6-3-1 = 2 internal knots should be created. However, this is not what happens:

> library(splines)
> s1 <- bs(women$height, df = 6, degree = 3)
> s2 <- bs(women$height, df = 6, degree = 2)
> attributes(s1)$knots
 25%  50%  75%
61.5 65.0 68.5
> attributes(s2)$knots
 20%  40%  60%  80%
60.8 63.6 66.4 69.2

That is, the basis is created with an extra knot: bs() chooses df-degree internal knots.

The documentation of ns() states that "ns() then chooses df - 1 - intercept knots ...", suggesting that the spline functions create the basis with df-degree internal knots if no intercept is specified, but df-degree-1 internal knots if the caller explicitly asks for an intercept.

> s1 <- bs(women$height, df = 6, degree = 3, intercept = TRUE)
> s2 <- bs(women$height, df = 6, degree = 2, intercept = TRUE)
> attributes(s1)$knots
33.3% 66.7%
 62.7  67.3
> attributes(s2)$knots
 25%  50%  75%
61.5 65.0 68.5

Is it possible to change the documentation of these functions to reflect their actual behaviour? For example, something like the following:

df: degrees of freedom; one can specify df rather than knots; bs() then chooses df-degree-1 knots at suitable quantiles of x (which will ignore missing values) if the intercept argument is TRUE, and df-degree knots if intercept=FALSE.

Christos Argyropoulos
[R] Odp: r code to broaden the boarder of the bars of a histogram
Hi

Look at the source code:

graphics:::plot.histogram

You will find that the boxes are actually drawn by rect(). So if you want to use standard graphics, you probably need to modify the source code and set up your own version of plot.histogram. Maybe the ggplot2 package offers some way to do what you want, but you will have to check that yourself.

Regards
Petr

r-help-boun...@r-project.org wrote on 11.06.2010 15:09:21:
> To whom it may concern,
>
> I have a problem concerning the design of a histogram. How do I change the border widths of the bars of a histogram? [...]
Re: [R] R in Linux: problem with special characters
On Fri, Jun 11, 2010 at 2:48 PM, daniel fernandes danielpas...@hotmail.com wrote:
> Does this problem have something to do with the locale settings? If I run the locale command on the Linux server, I get:

Possibly.

> print("dúvida")
[1] "dúvida"
> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

li...@debian-liv:~$ locale
LANG=en_GB.UTF-8
LC_CTYPE=en_GB.UTF-8
LC_NUMERIC=en_GB.UTF-8
LC_TIME=en_GB.UTF-8
LC_COLLATE=en_GB.UTF-8
LC_MONETARY=en_GB.UTF-8
LC_MESSAGES=en_GB.UTF-8
LC_PAPER=en_GB.UTF-8
LC_NAME=en_GB.UTF-8
LC_ADDRESS=en_GB.UTF-8
LC_TELEPHONE=en_GB.UTF-8
LC_MEASUREMENT=en_GB.UTF-8
LC_IDENTIFICATION=en_GB.UTF-8
LC_ALL=

Liviu
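The two locale listings differ mainly in LC_ALL: on the server it is set to C, and under POSIX rules LC_ALL takes precedence over both the individual LC_* variables and LANG, so the pt_PT.UTF-8 value in LANG never takes effect. A rough sketch of how one might check and change this from inside R follows. It assumes the pt_PT.UTF-8 locale is actually installed on the machine, which is not known from the thread; unsetting LC_ALL in the shell before starting R is the other obvious thing to try.

```r
# Inspect the character-type locale of the running session.
Sys.getlocale("LC_CTYPE")

# Try to switch to a UTF-8 locale. "pt_PT.UTF-8" is taken from the LANG value
# shown in the question and may not be installed on every system.
res <- suppressWarnings(Sys.setlocale("LC_CTYPE", "pt_PT.UTF-8"))

if (identical(res, "")) {
  # Sys.setlocale() returns an empty string when the request cannot be honoured.
  message("Locale not available on this system; LC_CTYPE left unchanged.")
} else {
  print("d\u00favida")   # the word from the question, written with an escape
}
```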
Re: [R] r code to broaden the boarder of the bars of a histogram
Use truehist() in pkg:MASS.

-Peter Ehlers

On 2010-06-11 7:09, Andreas Baranowski wrote:
> To whom it may concern,
>
> I have a problem concerning the design of a histogram. How do I change the border widths of the bars of a histogram? [...]
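Another option that may be worth trying, offered as a sketch rather than a tested answer to the original question: since the bars are drawn by rect(), and rect() takes its default line width from par("lwd"), setting par(lwd=) before calling hist() should thicken the borders. The data vector below is made up, because the original punkte was not posted, and the null PDF device is only there so that the example runs without a screen.

```r
# Hypothetical data standing in for 'punkte'.
set.seed(42)
punkte <- round(rnorm(200, mean = 50, sd = 10))

pdf(NULL)                 # off-screen device, used here only for illustration
old <- par(lwd = 3)       # line width picked up by the bar borders
h <- hist(punkte, breaks = 30, xlab = "Punkte", ylab = "Häufigkeit",
          main = "Histogramm", col = heat.colors(30), border = "red")
par(old)                  # restore the previous line width
dev.off()
```

Restoring the old value afterwards keeps later plots from inheriting the thicker lines.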
Re: [R] lm without error
thanks, everybody.

joris---let me disagree with you, please. there are so many possibilities of how lm.fit could fail that by the time I am done with pre-checking, I may as well write my own lm() routine.

erik---let me disagree with you, too. I did know about ?try and it is useful when the dependent variable is just one vector---except if you have thousands of dependent variables (to run thousands of regressions with one lm() statement). if an error is thrown, you then have to determine which of the columns actually was responsible for the error, and then you have to restart it. if you want a minimal example to explain this dilemma better:

y= matrix(rnorm(1000), nrow=10, ncol=100)
y[,28]= rep(NA, 10)
x=rnorm(10)
lm( y ~ x )  ## now what do you do? hunt for which column was responsible?

gabor---this seems to be exactly what I wanted to get---coefficients without triggering an lm.fit() error. thanks (yet again). in my example,

coefs= qr.coef( qr(x), y )

works great.

regards,

/iaw

On Fri, Jun 11, 2010 at 10:46 AM, Erik Iverson er...@ccbr.umn.edu wrote:
> 1) please use reproducible, minimal examples when discussing behavior of R.
>
> 2) perhaps ?try could help.
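For comparison, here is one way the "trap the error per column" route could look, as a sketch that reuses the toy data from the message above. It fits each column separately and returns NA coefficients for any column where lm() fails, so there is no need to hunt for the offending column by hand. The helper name fit_one is made up for this illustration.

```r
set.seed(1)
y <- matrix(rnorm(1000), nrow = 10, ncol = 100)
y[, 28] <- NA                     # one dependent variable with no usable data
x <- rnorm(10)

# Hypothetical helper: fit one regression, fall back to NA on any error.
fit_one <- function(yj) {
  tryCatch(coef(lm(yj ~ x)),
           error = function(e) c(NA_real_, NA_real_))
}

coefs <- apply(y, 2, fit_one)     # 2 x 100 matrix: intercept and slope per column
rownames(coefs) <- c("(Intercept)", "x")

# Columns for which no coefficients could be computed.
which(colSums(is.na(coefs)) > 0)
```

This runs one lm() call per column, so it is slower than a single matrix-response fit, but the failing columns are identified automatically and the remaining results are kept.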
Re: [R] points marking
Those graphs look like chromosome maps. If so, you may want to look into the Bioconductor project; they may have some prewritten functions to do this. If not, the lend argument (see ?par) may be something to look at.

If you really want points and segments, you will need to plot the points with the points() function and the segments separately. segments() accepts vectors, so you don't need to split things into multiple calls.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: khush [mailto:bioinfo.kh...@gmail.com]
Sent: Friday, June 11, 2010 12:00 AM
To: Greg Snow
Cc: r-help@r-project.org
Subject: Re: [R] points marking

> Dear Gregory,
>
> Thanks for your reply and help. [...] I solved one part of my query, i.e. marking points from one position to another works fine, but I have another issue now. I am using two sets of segments, 1 and 2, and I want to draw different shapes for segments 2, so I am giving pch=21, but it seems to give a solid line for both. I want to draw different shapes for every chunk of segments; that is the whole point. I want to write a script which can generate such figures; below is a link to one such tool.
>
> http://www.expasy.ch/tools/mydomains/
>
> Thank you
> Jeet
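The advice above could be sketched roughly as follows. The coordinates are copied from the script in the question; the empty plotting frame, the symbol size and the use of interval midpoints for the point symbols are assumptions made only for illustration. Note that segments() draws lines only, so pch and cex have no effect there; symbols have to come from points().

```r
pdf(NULL)   # off-screen device so the sketch runs anywhere

# Empty frame with the same x range as in the question.
plot(1, type = "n", xlim = c(0, 650), ylim = c(0, 30), xlab = "", ylab = "")

# First group: all seven segments in ONE vectorised call.
x0 <- c(164,  45, 160, 277,  51, 167, 284)
x1 <- c(192, 138, 255, 378, 145, 262, 381)
y  <- c(7.8, 15.8, 15.8, 15.8, 23.8, 23.8, 23.8)
segments(x0, y, x1, y, col = "green", lwd = 20, lend = "butt")

# Second group: drawn as point symbols at the interval midpoints instead.
bx0 <- c(399, 448, 486, 401, 450, 486)
bx1 <- c(432, 475, 515, 434, 475, 517)
by  <- c(15.8, 15.8, 15.8, 23.8, 23.8, 23.8)
points((bx0 + bx1) / 2, by, pch = 21, bg = "blue", cex = 3)

dev.off()
```

With lend = "butt" the thick lines stop exactly at the given coordinates instead of extending past them with rounded caps, which matters when lwd is as large as 20.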
[R] Windows, OSX and Linux: updating a graphic device and double buffering
Hello there,

I'm struggling with the base graphics system on different operating systems. I would like to get an animation effect by re-plotting with the plot function. See the attached code example: move the slider quickly from one side to the other. I experience different levels of success, depending on which OS I use.

- Linux (Ubuntu 9.10, R 2.9.2-3): Each plot command gets displayed. However, the labels flicker.
- Windows: works perfectly; double buffering is really well implemented.
- OS X: Moving the slider slowly results in displaying the output of almost all plot commands. However, if you move the slider quickly, only a few of the plot commands get displayed (the orange bullet jumps).

This question has been asked before, but hopefully things have changed: Is it possible to double buffer my output manually? If not, is it possible to force an update of my graphics device, e.g. with a command like dev.flush()? How can I run my code on all operating systems without any jumps?

Regards,

Adrian Waddell

Here is a minimal example code:

library(tcltk)
tt <- tktoplevel()
tkpack(top <- tkframe(tt), side = "top")
SliderValue <- tclVar("50")
SliderValueLabel <- tklabel(top, text=as.character(tclvalue(SliderValue)))
tkpack(tklabel(top, text="Slider Value : "), SliderValueLabel, side = "left")
tkconfigure(SliderValueLabel, textvariable=SliderValue)
slider <- tkscale(tt, from=0, to=100, showvalue=F, variable=SliderValue, resolution=1, orient="horizontal", length = 200)
tkpack(slider, side = "top")
tkconfigure(slider, command=function(...){
  state <- unlist(...)
  myPlotFn(state)
})
myPlotFn <- function(state) {
  plot(1,1, type = 'n', xlim = c(-10,110), ylim = c(-2,2))
  lines(c(0,100), c(0,0), lwd=4)
  points(as.numeric(tclvalue(SliderValue)),0, cex = 10, pch = 19, col = "orange")
}
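For readers on later R versions: grDevices gained dev.hold() and dev.flush() in R 2.14.0, which postdates this message, and they can be wrapped around the drawing calls so that a frame is shown only once it is complete. The sketch below assumes such a version. Whether it removes the flicker depends on the device, because on devices without buffering support the two calls simply do nothing. The function name draw_frame is made up for the example, and the null PDF device only stands in for an on-screen device.

```r
pdf(NULL)   # stand-in for an interactive device

# Hypothetical replacement for myPlotFn(): draw one complete frame.
draw_frame <- function(pos) {
  dev.hold()               # suspend screen updates (no-op if unsupported)
  on.exit(dev.flush())     # release the finished frame even if drawing fails
  plot(1, 1, type = "n", xlim = c(-10, 110), ylim = c(-2, 2))
  lines(c(0, 100), c(0, 0), lwd = 4)
  points(pos, 0, cex = 10, pch = 19, col = "orange")
  invisible(pos)
}

# Simulate a few slider positions.
for (p in seq(0, 100, by = 25)) draw_frame(p)
dev.off()
```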
Re: [R] removing a non empty directory
?unlink

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Fri, 11 Jun 2010, mau...@alice.it wrote:

I'd like to remove automatically a directory that may be non empty. I tried:

    file.remove(NewDir, recursive=TRUE)
    [1] FALSE
    Warning message:
    In file.remove(NewDir, recursive = TRUE) :
      cannot remove file 'Prostate_Validated_mirWalk', reason 'Directory not empty'

Is there another command to remove entire directories including their contents?

Thank you.
Maura
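A minimal sketch of what the pointer to ?unlink amounts to. The directory and file names here are invented for the demonstration and are created in a temporary location, so nothing real is deleted.

```r
# Hypothetical directory with one file in it, created only for this demonstration
new_dir <- file.path(tempdir(), "demo_dir")
dir.create(new_dir)
writeLines("some content", file.path(new_dir, "a.txt"))

# recursive = TRUE is what allows a non-empty directory to be removed
status <- unlink(new_dir, recursive = TRUE)   # 0 on success, 1 on failure
still_there <- file.exists(new_dir)
```

Unlike file.remove(), unlink() does not raise a warning when it fails, so checking the returned status is worthwhile.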
[R] Managing list elements
Hello,

I have two lists with the same number of elements:

    tail(LHS)
    [[1]]
    [1] antecedentes.factor_riesgo=17 antecedentes.estado=1 antecedentes.medio=4 tarjetas_flagrancia.adquiriente2=

    [[2]]
    [1] antecedentes.riesgo=1 antecedentes.estado=1 antecedentes.medio=4 tarjetas_flagrancia.adquiriente2=

    [[3]]
    [1] resultado_investig.pto_comp=N antecedentes.riesgo=1 antecedentes.factor_riesgo=17 antecedentes.medio=4

    [[4]]
    [1] resultado_investig.pto_comp=N antecedentes.riesgo=1 antecedentes.factor_riesgo=17 tarjetas_flagrancia.adquiriente2=

    [[5]]
    [1] resultado_investig.pto_comp=N antecedentes.factor_riesgo=17 antecedentes.medio=4 tarjetas_flagrancia.adquiriente2=

    [[6]]
    [1] resultado_investig.pto_comp=N antecedentes.riesgo=1 antecedentes.medio=4 tarjetas_flagrancia.adquiriente2=

    tail(RHS)
    [[1]]
    [1] antecedentes.riesgo=1

    [[2]]
    [1] antecedentes.factor_riesgo=17

    [[3]]
    [1] tarjetas_flagrancia.adquiriente2=

    [[4]]
    [1] antecedentes.medio=4

    [[5]]
    [1] antecedentes.riesgo=1

    [[6]]
    [1] antecedentes.factor_riesgo=17

I would like to create a new list from these two in which every list entry has one entry with the corresponding elements from LHS and another entry with the corresponding element from RHS (LHS doesn't always have three or four elements per entry). Do you know how I can do this?

Thank you
Felipe Parra
[R] passing contrasts=FALSE to contrast functions -- why does this exist?
Hello,

I've noticed that all contrast functions, like contr.treatment, contr.poly, etc., take a logical argument called 'contrasts'. The default is TRUE, in which case they do their normal thing of returning an n x (n-1) matrix whose columns are linearly independent of the intercept. If contrasts=FALSE, they instead return an n x n matrix with full rank (usually the identity matrix, corresponding to dummy coding, but contr.poly returns orthogonal polynomials that include the zeroth-order constant term, instead of starting with the linear term as it normally would).

Why does this argument exist?

My initial theory was that this was added to support the smart handling of redundancy in model matrix construction: depending on what other terms exist in a formula, sometimes R will choose to contrast code a factor in n-1 columns, and sometimes it will choose to dummy code it in n columns. So it would make sense to call the contrast function with contrasts=TRUE in the former case and contrasts=FALSE in the latter case, and that way if the contrast function for some reason wanted a full-rank coding *besides* dummy coding then it could do that (like contr.poly).

But in fact, when R decides it wants dummy coding, it doesn't call the contrast function, it just dummy codes unconditionally:

    a <- factor(c("a", "b", "c"))
    trace(contr.treatment)
    invisible(model.matrix(~ a))      # contrast coded
    trace: ctrfn(levels(x), contrasts = contrasts)
    invisible(model.matrix(~ 0 + a))  # dummy coded

In fact, I can't find any code anywhere in R that ever uses contrasts=FALSE. So what's going on? Is this a bug and R *should* be using contrasts=FALSE to dummy code factors?

Confusedly yours,
-- Nathaniel
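To make the two return shapes described above concrete, here is a small sketch that calls the contrast functions directly on three levels. It only illustrates the behaviour the post describes; it does not answer why the argument exists. The variable names are arbitrary.

```r
# Default: n x (n-1) coding, no column for the baseline level
coded <- contr.treatment(3)

# contrasts = FALSE: n x n identity matrix, i.e. plain dummy coding
full <- contr.treatment(3, contrasts = FALSE)

# contr.poly with contrasts = FALSE keeps the constant (zeroth-order) column
poly_full <- contr.poly(3, contrasts = FALSE)

print(coded)
print(full)
print(round(poly_full, 3))
```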
[R] rules-class
Hello,

I have a rules-class object which I got from the apriori function in the arules package. Now I would like to keep just a subset of the rules. Does anybody know how I can create an object which contains only the subset corresponding to some indices I give it? For example, if I have the following rules-class object:

      lhs                                rhs                                   support   confidence lift
    1 {}                              => {tarjetas_flagrancia.adquiriente2=}   0.9960369 0.9960369  1.000
    2 {datafono.marca5=}              => {tarjetas_flagrancia.adquiriente2=}   0.8049805 0.9984959  1.0024688
    3 {Target=1}                      => {resultado_investig.pto_comp=N}       0.8134982 0.9991282  1.0427348
    4 {Target=1}                      => {tarjetas_flagrancia.adquiriente2=}   0.8106589 0.9956411  0.9996026
    5 {antecedentes.estado=1}         => {antecedentes.medio=4}                0.8420383 1.000      1.1392183
    6 {antecedentes.estado=1}         => {tarjetas_flagrancia.adquiriente2=}   0.8382231 0.9954691  0.9994299
    7 {antecedentes.factor_riesgo=17} => {antecedentes.riesgo=1}               0.8507926 1.000      1.1750886
    8 {antecedentes.riesgo=1}         => {antecedentes.factor_riesgo=17}       0.8507926 0.9997567  1.1750886

I would like my result to be the following rules-class object if I choose to keep the third and eighth rules:

      lhs                                rhs                                   support   confidence lift
    3 {Target=1}                      => {resultado_investig.pto_comp=N}       0.8134982 0.9991282  1.0427348
    8 {antecedentes.riesgo=1}         => {antecedentes.factor_riesgo=17}       0.8507926 0.9997567  1.1750886

Thank you
Felipe Parra
Re: [R] Managing list elements
Luis -

I *think* that

    mapply(list, LHS, RHS, SIMPLIFY=FALSE)

will give you what you want, but without a reproducible example it's hard to tell.

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Fri, 11 Jun 2010, Luis Felipe Parra wrote:

I would like to create a new list from this two which would have in every list entry one entry with the corresponding elements from LHS and another entry with the corresponding element from RHS (LHS doesn't always have three 4 elements per entry). Do you know how can I do this?
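A small reproducible sketch of the mapply() suggestion. The two short lists below are cut-down stand-ins for the poster's LHS and RHS, and the element names lhs and rhs are an addition for readability rather than part of the original answer.

```r
# Cut-down stand-ins for the poster's lists (entries may differ in length)
LHS <- list(c("antecedentes.estado=1", "antecedentes.medio=4"),
            c("antecedentes.riesgo=1"))
RHS <- list("antecedentes.riesgo=1",
            "antecedentes.factor_riesgo=17")

# Pair the i-th element of LHS with the i-th element of RHS;
# SIMPLIFY = FALSE keeps the result as a list of lists
combined <- mapply(list, lhs = LHS, rhs = RHS, SIMPLIFY = FALSE)

str(combined)
```

Each entry can then be accessed as combined[[i]]$lhs and combined[[i]]$rhs.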
[R] Transforming simulation data which is spread across many files into a barplot
I'm an R newbie, and I'm just trying to use some of its graphing capabilities, but I'm a bit stuck, basically in massaging the already available data into a format R likes.

I have a simulation environment which produces logs, which represent a number of different things. I then run a Python script on this data to put it in a nicer format. Essentially, the Python script reduces the number of files by two orders of magnitude.

What I'm left with is a number of files, each of which has two columns of data in it. The files look something like this:

    --1000.log--
    Sent Received
    405.0 3832.0
    176.0 1742.0
    176.0 1766.0
    176.0 1240.0
    356.0 3396.0
    ...

This file, called 1000.log, represents a data point at 1000. What I'd like to do is use a loop to read in 50 or so of these files, and then produce a stacked barplot. Ideally, the stacked barplot would have 1 bar per file, and two stacks per bar. The first stack would be the mean of the sent values, and the second would be the mean of the received values.

I've used a loop to read files in R before, something like this:

    for (i in 1:50) {
      tmpFile <- paste(base, i*100, ".log", sep="")
      tmp <- read.table(tmpFile)
    }

But I really don't know how to handle massaging this data into the matrix I need. I hope this makes sense; I find it a little hard to describe. Can anyone give me some help jumping into this one?

Thanks
--
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario
Re: [R] Transforming simulation data which is spread across many files into a barplot
On Fri, Jun 11, 2010 at 1:32 PM, Ian Bentley ian.bent...@gmail.com wrote:

I've used a loop to read files in R before, something like this:

    for (i in 1:50) {
      tmpFile <- paste(base, i*100, ".log", sep="")
      tmp <- read.table(tmpFile)
    }

    # Load data: one data frame with an .id column holding the file name
    library(plyr)
    paths <- dir(base, pattern = "\\.log", full.names = TRUE)
    names(paths) <- basename(paths)
    df <- ldply(paths, read.table, header = TRUE)

    # Compute averages per file
    avg <- ddply(df, ".id", summarise,
      sent = mean(Sent), received = mean(Received))

You can read more about plyr at http://had.co.nz/plyr.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
[R] How to code mixed model with nested factors in lmer
Hi,

I have a coding question about mixed models in R. I am using R 2.11.0 on Windows.

I have an experiment with 2 fixed effect factors, A and B. The levels of B are nested within the levels of A. The model is very similar to a split plot design except for the nesting relationship between the 2 fixed effect factors. For example: there are 2 levels for A, GM and ZM. There are 7 levels of B in total (G1-G4 and Z1-Z3), but G1, G2, G3, G4 only appear in GM while Z1, Z2 and Z3 only appear in ZM.

The SAS code is like the following:

    Proc Mixed data = a;
      Class A B rep;
      Model y = A B(A);
      Random rep rep*A;
    Run;

I tried to code this model in R using lmer(). It turned out that R has some issues with nested levels in the fixed effects. I changed and tried a few models just for diagnostic purposes, and I found that whenever I included the nested fixed effects, I got the following error message:

    Problem in .C(mixed_EM,: Singularity in backsolve, while calling subroutine mixed_EM.

I am wondering whether someone can help me set up this mixed model in R.

Thanks
Huiyan Zhao
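One way to see where a singularity can come from is to look at the fixed-effect design matrix. The sketch below uses simulated data with the factor layout from the question (the data, the seed, and the object names are all invented). It shows that coding the nesting as A/B produces more columns than the design has distinct cells, while a model in B alone spans the same seven cell means with a full-rank matrix, because the B labels are unique across A. The lmer() call is an assumed translation of the SAS random statement, not a confirmed equivalent, and it only runs if lme4 is installed.

```r
set.seed(1)

# Simulated layout: 4 reps, B levels G1-G4 occur only in GM, Z1-Z3 only in ZM
d <- expand.grid(rep = factor(1:4),
                 B   = factor(c("G1", "G2", "G3", "G4", "Z1", "Z2", "Z3")))
d$A <- factor(ifelse(substr(as.character(d$B), 1, 1) == "G", "GM", "ZM"))
d$y <- rnorm(nrow(d))   # placeholder response

# Nested coding creates columns for A:B combinations that never occur
X_nested <- model.matrix(~ A / B, data = d)
# Coding by B alone has exactly one column per observed cell
X_cells  <- model.matrix(~ B, data = d)

rank_nested <- qr(X_nested)$rank
rank_cells  <- qr(X_cells)$rank

# Assumed translation of "Random rep rep*A": random intercepts for rep and rep-within-A
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(y ~ B + (1 | rep) + (1 | rep:A), data = d)
}
```

With this parameterisation the effect of A is not a separate coefficient; it would have to be recovered as a contrast between the G and Z levels of B.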
Re: [R] Transforming simulation data which is spread across manyfiles into a barplot
Ouch! Lousy plot. Instead, plot the 50 (mean sent, mean received) pairs as a y vs x scatterplot to see the relationship.

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hadley Wickham
Sent: Friday, June 11, 2010 11:53 AM
To: Ian Bentley
Cc: r-help@r-project.org
Subject: Re: [R] Transforming simulation data which is spread across manyfiles into a barplot

    # Load data: one data frame with an .id column holding the file name
    library(plyr)
    paths <- dir(base, pattern = "\\.log", full.names = TRUE)
    names(paths) <- basename(paths)
    df <- ldply(paths, read.table, header = TRUE)

    # Compute averages per file
    avg <- ddply(df, ".id", summarise,
      sent = mean(Sent), received = mean(Received))

You can read more about plyr at http://had.co.nz/plyr.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
Re: [R] Transforming simulation data which is spread across manyfiles into a barplot
I'm not trying to see the relation between sent and received, but rather to show how these grow across the increasing complexity of the 50 data points.

On 11 June 2010 15:02, Bert Gunter gunter.ber...@gene.com wrote:

Ouch! Lousy plot. Instead, plot the 50 (mean sent, mean received) pairs as a y vs x scatterplot to see the relationship.

Bert Gunter
Genentech Nonclinical Biostatistics

--
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario
Re: [R] Transforming simulation data which is spread across many files into a barplot
Try this:

    base <- "file"  # replace as appropriate
    N <- 50
    filenames <- paste(base, seq_len(N)*100, ".log", sep = "")

    # One column of means (Sent, Received) per file;
    # header = TRUE skips the "Sent Received" line shown in the sample file
    mat <- sapply(filenames, function(fn)
      colMeans(read.table(fn, header = TRUE, col.names = c("Sent", "Received"))))

    barplot(mat)

On Fri, Jun 11, 2010 at 2:32 PM, Ian Bentley ian.bent...@gmail.com wrote:

I've used a loop to read files in R before, something like this:

    for (i in 1:50) {
      tmpFile <- paste(base, i*100, ".log", sep="")
      tmp <- read.table(tmpFile)
    }

But I really don't know how to handle massaging this data into the matrix I need. I hope this makes sense, I find it a little hard to describe. Can anyone give me some help jumping into this one?
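For anyone who wants to try the sapply() approach without the original logs, here is a self-contained sketch. It writes three small made-up log files to a temporary directory (the directory name, file sizes, and values are all invented, and the files are assumed to start with a "Sent Received" header line), then builds the 2 x n matrix of means that barplot() stacks by default.

```r
# Create a few fake log files with a header line, as in the sample above
log_dir <- file.path(tempdir(), "logs_demo")
dir.create(log_dir, showWarnings = FALSE)
sizes <- c(100, 200, 300)
for (s in sizes) {
  fake <- data.frame(Sent = s * c(1, 2, 3), Received = s * c(10, 20, 30))
  write.table(fake, file.path(log_dir, paste0(s, ".log")),
              row.names = FALSE, quote = FALSE)
}

# One column per file, one row per variable
filenames <- file.path(log_dir, paste0(sizes, ".log"))
mat <- sapply(filenames, function(fn) colMeans(read.table(fn, header = TRUE)))
colnames(mat) <- sizes   # label bars by data point rather than by full path

# A matrix argument gives stacked bars: one bar per column, one segment per row
barplot(mat, legend.text = rownames(mat), xlab = "data point", ylab = "mean")
```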
Re: [R] Transforming simulation data which is spread acrossmanyfiles into a barplot
So two time series? Fair enough. But less is more. Plot them as separate series of points connected by lines, with different colors for the two different series. Or as two trellis plots. You may also wish to overlay a smooth to help the reader see the trend (e.g. via a loess or other nonparametric smooth, or perhaps just a fitted line). The only part of a bar that conveys information is the top. The rest of the fill is chartjunk (Tufte's term) and distracts.

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ian Bentley
Sent: Friday, June 11, 2010 12:15 PM
To: Bert Gunter
Cc: r-help@r-project.org; Hadley Wickham
Subject: Re: [R] Transforming simulation data which is spread acrossmanyfiles into a barplot

I'm not trying to see the relation between sent and received, but rather to show how these grow across the increasing complexity of the 50 data points.

--
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario
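The suggestion of two line series with a smooth could look roughly like the following. The data are simulated stand-ins for the 50 per-file means (the slopes, noise levels, and colours are arbitrary choices), and lowess() is used here as one readily available nonparametric smoother.

```r
set.seed(42)

# Simulated per-file means standing in for the real results
complexity    <- seq(100, 5000, by = 100)            # 50 data points
mean_sent     <- 0.1 * complexity + rnorm(50, sd = 20)
mean_received <- 0.9 * complexity + rnorm(50, sd = 100)

# Two series of points connected by lines, one colour each
matplot(complexity, cbind(mean_sent, mean_received),
        type = "b", pch = 1, lty = 1, col = c("blue", "red"),
        xlab = "data point", ylab = "mean")

# Overlay a smooth per series to show the trend
smooth_sent     <- lowess(complexity, mean_sent)
smooth_received <- lowess(complexity, mean_received)
lines(smooth_sent, col = "blue", lwd = 2)
lines(smooth_received, col = "red", lwd = 2)

legend("topleft", legend = c("sent", "received"),
       col = c("blue", "red"), lty = 1, pch = 1)
```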
[R] comparing reshape's
I thought I would share the following.

System: Mac Pro 2.26GHz, OSX, 8GB of memory (not a constraint), R 2.11.0, 64-bit version.

Task: I have a long data set: 2.2 million observations (factor xid, factor yid, variable zcontent), which I want to map into a sparse matrix of 948 columns and 16,350 rows.

There are two commonly used functions to accomplish this:

    library(stats)
    outcome = reshape( subset(mydataframe, select=c(yid,xid,zcontent)),
                       timevar="yid", idvar="xid", direction="wide" )

takes about 9,600 seconds.

    library(reshape)
    melted = melt( subset(mydataframe, select=c(yid,xid,zcontent)), id=c("xid", "yid") )
    outcome = cast( melted, xid ~ yid )

takes about 875 seconds.

So, for large reshape jobs from long to wide, the reshape library is much more efficient. YMMV.

/iaw

Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
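To show what the long-to-wide step produces, here is a toy version with four observations instead of 2.2 million. The column names follow the post; the data values are made up. The tapply() line is an extra base-R alternative that is not mentioned in the post, included only because it returns the xid-by-yid matrix directly; no timing claim is made for it.

```r
# Toy long-format data with the same column names as in the post
long <- data.frame(
  xid      = factor(c("r1", "r1", "r2", "r2")),
  yid      = factor(c("a", "b", "a", "b")),
  zcontent = c(1, 2, 3, 4)
)

# Base R reshape: one row per xid, one zcontent.<yid> column per yid level
wide <- reshape(long, timevar = "yid", idvar = "xid", direction = "wide")

# Alternative: build the xid x yid matrix directly
m <- tapply(long$zcontent, list(long$xid, long$yid), sum)

print(wide)
print(m)
```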
Re: [R] glm-test?
Thanks!

Atte

Take a look at this document: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

All information you need is in there.

Cheers
Joris

On Fri, Jun 11, 2010 at 2:50 PM, Atte Tenkanen atte...@utu.fi wrote:

I would have tried z-test (n=67) but since the distribution is not normally distributed, but positive skew, I should somehow transform the data? Values are between 0 and 1.

atte

Which test do you want to use? Once you know that, tell us and we'll tell you where to find it in R.

Cheers
Joris

On Fri, Jun 11, 2010 at 1:50 PM, Atte Tenkanen atte...@utu.fi wrote:

Dear R-users,

I would like to test, whether a sample distribution differs significantly from a population distribution. They are not normally distributed. How should I proceed? Using somehow glm-models? How?

The population and the sample data are here. They can be loaded using the load-command.

http://users.utu.fi/attenka/D_Pop
http://users.utu.fi/attenka/D_Samp

Best regards,

Atte Tenkanen
University of Turku, Finland
Department of Musicology
+35823335278
http://users.utu.fi/attenka/

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
[R] Handling character string
Dear all,

Is there any R function to say these 2 character strings temp and temp are actually the same? If I type the following code, R says they are indeed different:

    temp == temp
    [1] FALSE

Is there any way out?
Re: [R] lm without error
This will give the coefficients of each regression for which there are no missing values in the dependent variable and NAs for the rest:

    # test data
    set.seed(123)
    y <- cbind(y1 = 1:4, y2 = c(NA, 2:4))
    x <- 1:4 + rnorm(4)

    qr.coef(qr(cbind(1, x)), y)
             y1 y2
      0.8607244 NA
    x 0.6049789 NA

On Fri, Jun 11, 2010 at 8:49 AM, ivo welch ivo...@gmail.com wrote:

this is not an important question, but I wonder why lm returns an error, and whether this can be shut off. it would seem to me that returning NA's would make more sense in some cases---after all, the problem is clearly that coefficients cannot be computed.

I know that I can trap the lm.fit() error---although I have always found this to be quite inconvenient---and this is easy if I have only one regression in my lm() statement. but, let's presume I have a matrix with a few thousand dependent y variables (and the same independent X variables). Let's presume one of the y variables contains only NA's. I believe I now cannot use lm(y ~ X), because one of the regressions will throw the lm.fit exception. (all the other y vectors should have worked.)

or is there a way to get lm() to work in such situations?

/iaw

Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
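The error-trapping route mentioned in the quoted question can be sketched as follows: fit one lm() per column and return NA coefficients when a fit fails. This is an alternative to the qr.coef() approach above, not a replacement for it; the helper name coef_or_na and the small data set are invented for the example, and fitting thousands of separate lm() calls will be slower than a single QR decomposition.

```r
# Fit one regression; on error (e.g. an all-NA response) return NA coefficients
coef_or_na <- function(yj, x) {
  tryCatch(
    coef(lm(yj ~ x)),
    error = function(e) c("(Intercept)" = NA_real_, x = NA_real_)
  )
}

x <- c(1.1, 1.9, 3.2, 3.8)
Y <- cbind(y1 = c(1, 2, 3, 4),
           y2 = rep(NA_real_, 4))   # a response column containing only NAs

# One column of coefficients per response variable
coefs <- apply(Y, 2, coef_or_na, x = x)
print(coefs)
```

Unlike the qr.coef() version, a column with only some NAs still gets coefficients here, because lm() drops the incomplete rows for that regression.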
Re: [R] Handling character string
Megh Dal wrote:

Dear all,

Is there any R function to say these 2 character strings temp and temp are actually same? If I type following code R says there are indeed different:

    temp == temp
    [1] FALSE

You don't say how you're defining "same", but it definitely requires more explanation, since they are not the same. Why should those two strings be the same in your mind? Do you want to remove leading white space, all white space, just one space, etc.?

You might find the examples in ?sub useful.
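As an illustration of the options listed in the reply, the sketch below assumes the two strings differ only by a leading space. That is a guess, since the archive does not preserve the exact strings. It compares them after removing leading white space with sub() and after removing all white space with gsub().

```r
a <- "temp"
b <- " temp"   # assumption: the second string carries a leading space

raw_equal <- a == b                               # plain comparison

# Remove leading white space only
lead_equal <- sub("^\\s+", "", b) == a

# Remove every white space character, wherever it occurs
all_removed <- gsub("\\s", "", " te mp ")
```

Which variant is appropriate depends on what "same" is supposed to mean for the data at hand.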
[R] Rgui crashed on Windows XP Home
Hi there,

I just installed R 2.11.1 on my PC, which runs Windows XP Home. The installation was successful; however, when I double-click the R icon, I get the following error message:

    R for Windows GUI front-end has encountered a problem and needs to close. We are sorry for the inconvenience.

The error signature is:

    AppName: rgui.exe    AppVer: 2.111.52157.0    ModName: msvcrt.dll
    ModVer: 7.0.2600.2180    Offset: d2b5

I got gdb and ran Rgui.exe under it, which gives the following:

    (gdb) run
    Starting program: D:\Program Files\R\R-2.11.1\bin/Rgui.exe
    [New Thread 2460.0xb7c]

    Program received signal SIGSEGV, Segmentation fault.
    0x77c1d2b5 in msvcrt!mblen () from C:\WINDOWS\system32\msvcrt.dll
    (gdb) bt
    #0  0x77c1d2b5 in msvcrt!mblen () from C:\WINDOWS\system32\msvcrt.dll
    #1  0x77c1d3a9 in msvcrt!mbstowcs () from C:\WINDOWS\system32\msvcrt.dll
    #2  0x635597f9 in GA_newwindow () from D:\Program Files\R\R-2.11.1\bin\Rgraphapp.dll
    #3  0x63543a25 in GA_newcontrol () from D:\Program Files\R\R-2.11.1\bin\Rgraphapp.dll
    #4  0x63543c3f in GA_newimagebutton () from D:\Program Files\R\R-2.11.1\bin\Rgraphapp.dll
    #5  0x6c723d7e in setupui () from D:\Program Files\R\R-2.11.1\bin\R.dll
    #6  0x004014e2 in ?? ()
    #7  0x00401425 in ?? ()
    #8  0x00401708 in ?? ()
    #9  0x0040124b in ?? ()
    #10 0x004012b8 in ?? ()
    #11 0x7c816fe7 in RegisterWaitForInputIdle () from C:\WINDOWS\system32\kernel32.dll
    #12 0x in ?? ()
    (gdb)

Rterm.exe runs normally. I don't know whether this is a bug in Rgui or a problem with my system.

Thanks for any help.

Regards,
Jinsong
Re: [R] Clustering algorithms don't find obvious clusters
Henrik,

The clustering algorithms you refer to (and almost all others) expect the matrix to be symmetric. They do not seek a graph-theoretic solution, but rather proximity in geometric or topological space. How did you convert your matrix to a dissimilarity?

Dave Roberts

Henrik Aldberg wrote:
> I have a directed graph which is represented as a matrix of the form
>
>     0 4 0 1
>     6 0 0 0
>     0 1 0 5
>     0 0 4 0
>
> Each row corresponds to an author (A, B, C, D) and the values say how many times this author has cited the other authors. Hence the first row says that author A has cited author B four times and author D one time. Thus the matrix represents two groups of authors, (A,B) and (C,D), who cite each other. But there is also a weak link between the groups. In reality this matrix is much bigger and very sparse, but it still consists of distinct groups of authors.
>
> My problem is that when I cluster the matrix using pam, clara or agnes, the algorithms do not find the obvious clusters. I have tried to turn it into a dissimilarity matrix before clustering but that did not help either.
>
> The layout of the clustering is not that important to me; my primary interest is to get the right nodes into the right clusters.
>
> Sincerely
> Henrik
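To make the point about symmetry concrete, here is a minimal sketch of one possible conversion. It is an assumption on my part, not the method either poster used: the citation counts are symmetrised as `m + t(m)`, turned into a dissimilarity by subtracting from the maximum, and clustered with `hclust()` from base R.

```r
# Citation matrix from the post: rows cite columns (authors A-D)
m <- matrix(c(0, 4, 0, 1,
              6, 0, 0, 0,
              0, 1, 0, 5,
              0, 0, 4, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:4], LETTERS[1:4]))

# Assumed symmetrisation: total citations exchanged by each pair
s <- m + t(m)

# Assumed conversion: many citations => small dissimilarity
d <- as.dist(max(s) - s)

hc <- hclust(d, method = "average")
cl <- cutree(hc, k = 2)
print(cl)   # A and B share one cluster, C and D the other
```

The same `dist` object can be passed to `pam()` or `agnes()` from the cluster package. Other symmetrisations (for example the pairwise maximum) are equally defensible, and the choice may matter on a large sparse matrix.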
Re: [R] Handling character string
I think the poster wants ?regex.

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erik Iverson
Sent: Friday, June 11, 2010 2:06 PM
To: Megh Dal
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] Handling character string

Megh Dal wrote:
> Dear all,
>
> Is there any R function to say these 2 character strings temp and temp are actually same? If I type following code R says there are indeed different :
>
>     temp == temp
>     [1] FALSE

You don't say how you're defining "same", but it definitely requires more explanation, since they are not the same. Why should those two strings be the same in your mind? Do you want to remove leading white space, all white space, just one space, etc.?

You might find the examples in ?sub useful.
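The pointers to ?sub and ?regex can be illustrated with a short sketch. The two strings in the original post were lost in the archive, so the values below (one with a leading space) are assumed purely for illustration, and the helper names `trim` and `squash` are hypothetical.

```r
a <- "temp"
b <- " temp"          # assumed: the original strings are not recoverable

a == b                # FALSE: the strings differ by a leading space

# Remove leading and trailing white space only
trim <- function(s) gsub("^[[:space:]]+|[[:space:]]+$", "", s)
trim(a) == trim(b)    # TRUE

# Remove all white space, wherever it occurs
squash <- function(s) gsub("[[:space:]]+", "", s)
squash("t e mp") == squash(a)   # TRUE
```

Which of the two is appropriate depends on what "same" is supposed to mean, which is exactly the question raised in the reply.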
[R] Decision values from KSVM
Hi,

I'm working on a project using the kernlab library. For one phase, I want the decision values from the SVM prediction, not the class label. The e1071 library has this function, but I can't find the equivalent in ksvm.

In general, when an SVM is used for classification, the label of an unknown test case is decided by the sign {-,+} of its resulting value as calculated from the SVM. I want the actual values as a proximal representation of the strength of the decision. (Further from the hyperplane indicates more confidence.)

Any suggestions?
Re: [R] R in Linux: problem with special characters
daniel fernandes wrote:
> Hi,
>
> I'm working with the 64 bit version of R 2.11.0 for Linux. My session info is:
>
>     R version 2.11.0 (2010-04-22)
>     x86_64-redhat-linux-gnu
>
>     locale:
>     [1] C
>
>     attached base packages:
>     [1] stats graphics grDevices utils datasets methods base
>
> When I try to print words with special characters, the result is that the printed expression has some kind of code substituting for the special character. For example, if I run print("dúvida") the result is:
>
>     print("dúvida")
>     [1] "d\372vida"
>
> Does this problem have something to do with the locale settings? If I run the locale command on the Linux server, I get:

Yes, it's your locale settings. The C locale doesn't support the ú character in your string, and displays it in octal.

Duncan Murdoch

>     [daniel.fernan...@pt-lnx13 ~]$ locale
>     LANG=pt_PT.UTF-8
>     LC_CTYPE=C
>     LC_NUMERIC=C
>     LC_TIME=C
>     LC_COLLATE=C
>     LC_MONETARY=C
>     LC_MESSAGES=C
>     LC_PAPER=C
>     LC_NAME=C
>     LC_ADDRESS=C
>     LC_TELEPHONE=C
>     LC_MEASUREMENT=C
>     LC_IDENTIFICATION=C
>     LC_ALL=C
>
> Thanks in advance for your help,
> Daniel
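A minimal sketch of how the character-type locale can be inspected and changed from within R follows. It assumes a `pt_PT.UTF-8` locale is installed on the server, which the thread does not confirm; if it is not, `Sys.setlocale()` returns an empty string and the session is left unchanged. Note also that the shell output shows `LC_ALL=C`, which takes precedence over `LANG`, so the environment R is started from may need fixing as well.

```r
# Current character-type locale ("C" in the session shown above)
old <- Sys.getlocale("LC_CTYPE")
print(old)

# Try to switch to a UTF-8 locale (assumed to exist on this machine)
res <- suppressWarnings(Sys.setlocale("LC_CTYPE", "pt_PT.UTF-8"))
if (identical(res, "")) {
  message("Requested locale is not available; LC_CTYPE left unchanged")
}

# In a UTF-8 locale this should display the accented character as is
print("d\u00favida")

# Restore the previous setting
invisible(Sys.setlocale("LC_CTYPE", old))
```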
Re: [R] Rgui crashed on Windows XP Home
Jinsong Zhao wrote:
> Hi there,
>
> I just installed R 2.11.1 on my PC, which runs Windows XP Home. The installation was successful; however, when I double-click the R icon, I get the following error message:
>
>     R for Windows GUI front-end has encountered a problem and needs to close. We are sorry for the inconvenience.

The error occurs in msvcrt.dll, a Microsoft DLL. It happened after a call from one of the R DLLs while setting up the GUI. I don't really know what to suggest to fix this, other than the usual things: try running R with the --vanilla command line argument, try shutting down everything else on your system, etc.

Duncan Murdoch

> The error signature is:
>
>     AppName: rgui.exe    AppVer: 2.111.52157.0    ModName: msvcrt.dll
>     ModVer: 7.0.2600.2180    Offset: d2b5
>
> I got gdb and ran Rgui.exe under it, which gives the following:
>
>     (gdb) run
>     Starting program: D:\Program Files\R\R-2.11.1\bin/Rgui.exe
>     [New Thread 2460.0xb7c]
>
>     Program received signal SIGSEGV, Segmentation fault.
>     0x77c1d2b5 in msvcrt!mblen () from C:\WINDOWS\system32\msvcrt.dll
>     (gdb) bt
>     #0  0x77c1d2b5 in msvcrt!mblen () from C:\WINDOWS\system32\msvcrt.dll
>     #1  0x77c1d3a9 in msvcrt!mbstowcs () from C:\WINDOWS\system32\msvcrt.dll
>     #2  0x635597f9 in GA_newwindow () from D:\Program Files\R\R-2.11.1\bin\Rgraphapp.dll
>     #3  0x63543a25 in GA_newcontrol () from D:\Program Files\R\R-2.11.1\bin\Rgraphapp.dll
>     #4  0x63543c3f in GA_newimagebutton () from D:\Program Files\R\R-2.11.1\bin\Rgraphapp.dll
>     #5  0x6c723d7e in setupui () from D:\Program Files\R\R-2.11.1\bin\R.dll
>     #6  0x004014e2 in ?? ()
>     #7  0x00401425 in ?? ()
>     #8  0x00401708 in ?? ()
>     #9  0x0040124b in ?? ()
>     #10 0x004012b8 in ?? ()
>     #11 0x7c816fe7 in RegisterWaitForInputIdle () from C:\WINDOWS\system32\kernel32.dll
>     #12 0x in ?? ()
>     (gdb)
>
> Rterm.exe runs normally. I don't know whether this is a bug in Rgui or a problem with my system.
>
> Thanks for any help.
>
> Regards,
> Jinsong
Re: [R] Documentation of B-spline function
On Fri, 11 Jun 2010, Christos Argyropoulos wrote:
> Good morning,
>
> This is a documentation related question about the B-spline function in R. In the help file it is stated that:
>
>     df   degrees of freedom; one can specify df rather than knots; bs() then chooses df-degree-1 knots at suitable quantiles of x (which will ignore missing values).

Not in R 2.11.1, where help(bs) says:

    df   degrees of freedom; one can specify df rather than knots; bs() then chooses df-degree (minus one if there is an intercept) knots at suitable quantiles of x (which will ignore missing values).

> So if one were to specify a spline with 6 degrees of freedom (and no intercept) then a basis with 6-3-1 = 2 internal knots should be created. However this is not what happens:
>
>     library(splines)
>     s1 <- bs(women$height, df = 6, deg = 3)
>     s2 <- bs(women$height, df = 6, deg = 2)
>     attributes(s1)$knots
>      25%  50%  75%
>     61.5 65.0 68.5
>     attributes(s2)$knots
>      20%  40%  60%  80%
>     60.8 63.6 66.4 69.2
>
> i.e. the basis is created with an extra knot, i.e. bs() chooses df-degree internal knots.
>
> The documentation of ns states that: ns() then chooses df - 1 - intercept knots ... suggesting that the spline functions create the basis with df-degree internal knots if no intercept is specified, but df-degree-1 internal knots if the caller explicitly asks for an intercept.

If you knew that 1 - TRUE == 0, then you know that is what it says.

>     s1 <- bs(women$height, df = 6, deg = 3, intercept = T)
>     s2 <- bs(women$height, df = 6, deg = 2, intercept = T)
>     attributes(s1)$knots
>     33.3% 66.7%
>      62.7  67.3
>     attributes(s2)$knots
>      25%  50%  75%
>     61.5 65.0 68.5
>
> Is it possible to change the documentation of these functions to reflect their actual behaviour? For example, something like the following:
>
>     df   degrees of freedom; one can specify df rather than knots; bs() then chooses df-degree-1 knots at suitable quantiles of x (which will ignore missing values) if the intercept argument is TRUE and df-degree if intercept=FALSE.

R-devel is where you post stuff like this, but be sure to refer to current versions to avoid being flamed for non-compliance with posting guidelines.

HTH,
Chuck

> Christos Argyropoulos

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
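The knot counts discussed in this exchange can be checked directly. The sketch below simply counts the interior knots `bs()` chooses for the four combinations shown above; the helper name `n_interior` is hypothetical. The counts agree with the rule df - degree - intercept, with `intercept` read as 0 or 1.

```r
library(splines)

# Hypothetical helper: number of interior knots bs() picks for women$height
n_interior <- function(df, degree, intercept) {
  b <- bs(women$height, df = df, degree = degree, intercept = intercept)
  length(attr(b, "knots"))
}

for (degree in c(3, 2)) {
  for (intercept in c(FALSE, TRUE)) {
    cat(sprintf("df = 6, degree = %d, intercept = %-5s -> %d interior knots\n",
                degree, intercept, n_interior(6, degree, intercept)))
  }
}
```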
[R] lmer() with no intercept
Hi,

I asked this before, but haven't got any response, so I would like to have another try. Thanks for your help. I also tried twice to join the mixed-model mailing list so that I can ask the question there, but I still haven't got permission to join that list.

===

Hi,

I am wondering how I can specify no intercept in a mixed model using lmer().

Here is an example dataset, attached (test.txt). There are 3 workers who, on 5 days, measured a response variable y on an independent variable x. I want to use a quadratic term (x2 in the dataset) to model the relationship between y and x.

    test <- read.table("test.txt", sep = '\t', header = T)

If I simply use lm() and ignore worker and day, so that I can try a linear regression both with and without an intercept, here is what I get:

    lm(y ~ x + x2, data = test)
    Coefficients:
    (Intercept)            x           x2
     -1.7749104    0.1099160   -0.0006152

    lm(y ~ x + x2 - 1, data = test)
    Coefficients:
             x          x2
     0.0490097  -0.0001962

Now, I want to try a mixed model with worker and day as random effects.

With an intercept:

    lmer(y ~ x + x2 + (1|worker) + (1|day), data = test)
    Fixed effects:
                  Estimate Std. Error t value
    (Intercept) -1.324e+00  4.490e-01  -2.948
    x            1.117e-01  8.563e-03  13.041
    x2          -6.357e-04  7.822e-05  -8.127

Without an intercept:

    lmer(y ~ x + x2 + (1|worker) + (1|day) - 1, data = test)
    Fixed effects:
         Estimate Std. Error t value
    x   1.107e-01  8.528e-03  12.981
    x2 -6.304e-04  7.805e-05  -8.077

It seems to work fine. But if you look at the fixed-effect coefficients of the two mixed models, the coefficients for x and x2 are not much different, regardless of whether an intercept is included or not. This is not the case for the simple linear regressions using lm() above.

If I plot all 4 models in the following plot:

    xyplot(y ~ x, groups = worker, test, col.line = "grey", lwd = 2,
           panel = function(x, y) {
               panel.xyplot(x, y, type = 'p')
               x <- sort(x)
               panel.lines(x, -1.324 + 0.1117*x - 0.0006357*x*x)
               panel.lines(x, 0.1107*x - 0.0006304*x*x, col = 'red')
               panel.lines(x, 0.04901*x - 0.0001962*x*x, col = 'blue')
               panel.lines(x, -1.7749 + 0.10992*x - 0.0006152*x*x, col = 'green')
           })

As you can see, the mixed model without an intercept (red line) does not fit the data very well (it is at the top edge of the data instead of in the middle), so I guess I did something wrong here.

Can anyone make any suggestions?

Thanks
John

       worker day     x                  x2                y
    1  AC     05MAY10 -0.744840518760792 0.554787398387846 -4.30960132841301
    2  AC     05MAY10 0.787599395527709  0.620312807835613 -4.71542109959628
    3  AC     05MAY10 0.545530976094587  0.297604045878713 -2.31063823031730
    4  AC     05MAY10 10.4116867886136   108.403221784190  -2.68580284811662
    5  AC     05MAY10 45.8678415780720   2103.85889103111  0.857277739439742
    6  AC     05MAY10 84.8497464051184   7199.4794650129   2.45727135537005
    7  AC     05MAY10 103.133586922502   10636.5367515012  2.13169532824448
    8  AC     05MAY10 105.748839904875   11182.8171412270  1.28410512992259
    9  CAH    05MAY10 -0.899013850822378 0.808225903970481 -1.03955702898241
    10 CAH    05MAY10 2.98715094752831   8.92307078331928  0.829280624483265
    11 CAH    05MAY10 1.72294341018182   2.96853399468895  -0.50214940111739
    12 CAH    05MAY10 20.4231486028650   417.104998854705  3.11786294644213
    13 CAH    05MAY10 69.3541875183234   4810.00332632677  3.34246850536472
    14 CAH    05MAY10 111.307639403626   12389.3905896076  2.70345650675251
    15 CAH    05MAY10 124.004258887019   15377.0562221188  4.00616663408449
    16 CAH    05MAY10 128.613638097411   16541.4679046517  5.1806910416212
    17 YG     05MAY10 -0.944438526313875 0.891964129985924 -6.26534533776664
    18 YG     05MAY10 2.03694726901381   4.14915417674284  -3.78598472716979
    19 YG     05MAY10 2.33837780519899   5.46801075984726  -4.2138305943426
    20 YG     05MAY10 21.6366580986433   468.144973677588  1.89365585649664
    21 YG     05MAY10 67.5861096660383   4567.88221978976  4.58402791158291
    22 YG     05MAY10 107.949510371836   11653.0967895192  2.05571359621679
    23 YG     05MAY10 112.126993688471   12572.4627136144  4.05242444227599
    24 YG     05MAY10 116.153400981747   13491.6125596265  4.34316209268166
    25 AC     06MAY10 -1.08663195011827  1.18076899501784  -2.12763929077743
    26 AC     06MAY10 -0.426141043811507 0.181596189220761 -2.10662282720267
    27 AC     06MAY10 -1.10752325011792  1.22660774955177  -3.01995286978557
    28 AC     06MAY10 7.78787375907454   60.6509776872818  -1.64834644316891
Re: [R] lm without error
On Fri, Jun 11, 2010 at 5:28 PM, ivo welch ivo.we...@gmail.com wrote:
> thanks, everybody.
>
> joris---let me disagree with you, please. there are so many possibilities of how lm.fit could fail that by the time I am done with pre-checking, I may as well write my own lm() routine.

If we all would agree, life would be boring, no? ;-)

I see your point, but the thing is that if the function returns only NA coefficients (and it can't do anything else than that, as a fit is mathematically not possible), you have literally no information about what went wrong. If a fit can't be done, I'd like to know why it happened, and not just get the answer "Not Available". That's an error message too.

Checking your data before doing an analysis is what we call Good Statistical Practice. Using a model on data you didn't check before is like starting to drive to another country without checking which direction you have to go. Pretty unlikely you're going to arrive at the right spot...

This said, you gave us what we need.

>     y = matrix(rnorm(1000), nrow=10, ncol=100)
>     y[,28] = rep(NA, 10)
>     x = rnorm(10)
>     lm( y ~ x )   ## now what do you do? hunt for which column was responsible?

I'd do :

    y = matrix(rnorm(1000), nrow=10, ncol=100)
    y[,28] = rep(NA, 10)
    x = rnorm(10)
    getOut <- which(colSums(is.na(y)) == dim(y)[1])
    lm( y[-getOut] ~ x )

Cheers
Joris

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] lm without error
did it again... it has to be

    getOut <- which(colSums(is.na(y)) == dim(y)[1])
    lm( y[,-getOut] ~ x )

of course.

Cheers
Joris

On Sat, Jun 12, 2010 at 2:22 AM, Joris Meys jorism...@gmail.com wrote:
> On Fri, Jun 11, 2010 at 5:28 PM, ivo welch ivo.we...@gmail.com wrote:
> > thanks, everybody.
> >
> > joris---let me disagree with you, please. there are so many possibilities of how lm.fit could fail that by the time I am done with pre-checking, I may as well write my own lm() routine.
>
> If we all would agree, life would be boring, no? ;-)
>
> I see your point, but the thing is that if the function returns only NA coefficients (and it can't do anything else than that, as a fit is mathematically not possible), you have literally no information about what went wrong. If a fit can't be done, I'd like to know why it happened, and not just get the answer "Not Available". That's an error message too.
>
> Checking your data before doing an analysis is what we call Good Statistical Practice. Using a model on data you didn't check before is like starting to drive to another country without checking which direction you have to go. Pretty unlikely you're going to arrive at the right spot...
>
> This said, you gave us what we need.
>
> >     y = matrix(rnorm(1000), nrow=10, ncol=100)
> >     y[,28] = rep(NA, 10)
> >     x = rnorm(10)
> >     lm( y ~ x )   ## now what do you do? hunt for which column was responsible?
>
> I'd do :
>
>     y = matrix(rnorm(1000), nrow=10, ncol=100)
>     y[,28] = rep(NA, 10)
>     x = rnorm(10)
>     getOut <- which(colSums(is.na(y)) == dim(y)[1])
>     lm( y[-getOut] ~ x )
>
> Cheers
> Joris

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
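One caveat with negative indexing is worth a sketch: if no column is entirely NA, `which()` returns `integer(0)` and `y[, -integer(0)]` selects no columns at all. The variant below is my own adaptation rather than Joris's code. It uses a logical mask instead, and puts the coefficients back into a full-width matrix so that skipped columns show up as NA; the names `all_na` and `coefs` are hypothetical.

```r
set.seed(1)
y <- matrix(rnorm(1000), nrow = 10, ncol = 100)
y[, 28] <- NA
x <- rnorm(10)

# Logical mask: TRUE for response columns with no data at all
all_na <- colSums(is.na(y)) == nrow(y)

# Fit all usable responses in one multi-response lm() call
y_ok <- y[, !all_na, drop = FALSE]
fit <- lm(y_ok ~ x)

# Full-width result: NA for skipped columns, estimates elsewhere
coefs <- matrix(NA_real_, nrow = 2, ncol = ncol(y),
                dimnames = list(c("(Intercept)", "x"), NULL))
coefs[, !all_na] <- coef(fit)

which(all_na)        # the columns that could not be fitted
coefs[, 26:30]
```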
Re: [R] lmer() with no intercept
Try a different example:

    set.seed(123)
    N <- 24
    k <- 6
    x <- 1:N
    f <- rep(rnorm(k, 0, 4), each = N/k)
    e <- rnorm(N)
    y <- x + f + e
    fac <- gl(k, N/k)

    library(lme4)
    fm1 <- lmer(y ~ x + (1|fac)); fm1
    fm0 <- lmer(y ~ x - 1 + (1|fac)); fm0

    plot(y, fitted(fm0))
    abline(a = 0, b = 1, lty = 2, col = "blue")
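One way to look further into the no-intercept fit is to separate the fixed-effect part of the prediction from the part contributed by the random intercepts. The sketch below reuses the simulated setup from the example above but stores it in a data frame; it assumes the lme4 package is installed, and the variable names `fixed_only` and `with_re` are hypothetical. It is offered as a diagnostic, not as an explanation of the original poster's data.

```r
library(lme4)

set.seed(123)
N <- 24
k <- 6
d <- data.frame(x = 1:N, fac = gl(k, N / k))
d$y <- d$x + rep(rnorm(k, 0, 4), each = N / k) + rnorm(N)

# Mixed model without a fixed intercept
fm0 <- lmer(y ~ x - 1 + (1 | fac), data = d)

b  <- fixef(fm0)[["x"]]                     # fixed slope
re <- ranef(fm0)$fac[["(Intercept)"]]       # one predicted intercept per group

fixed_only <- b * d$x                       # the line a fixed-effects plot would show
with_re    <- fixed_only + re[as.integer(d$fac)]  # should match fitted(fm0)

mean(re)                                    # average shift carried by the random intercepts
plot(d$x, d$y)
lines(d$x, fixed_only, col = "red")
points(d$x, with_re, pch = 3, col = "blue")
```

If the predicted random intercepts do not average out near zero, a curve drawn from the fixed effects alone will sit away from the middle of the data even though the fitted values track it well.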
[R] sharing experience - installing R Spatial Views
Hi Guys,

I would like to share my experience of installing the Spatial task view packages for R. I could not install 32 of the packages that are part of the Spatial view, and I spent about 2 days using Google searches to solve ALL of those problems. I hope somebody may benefit from my experience.

I admit that I do not have excellent programming skills, so perhaps some of the steps I took are not necessary to solve the problem at hand. But it works and I am happy.

I was lucky when a problem had already been solved by someone else, or when it was just a matter of installing another R package that the failing package depends on. Sometimes the problem was not with an R package but with a package that has to be installed in my operating system (in my case LinuxMint 8, based on Ubuntu karmic). So I just guessed the nature of the problem and tried to solve it via a similar problem that had already been solved.

I found many problems solved, directly or indirectly, on the Nabble - OSGeo FOSS4G websites. Of course many problems were also solved from websites other than Nabble, but I forgot their addresses, as I searched again and again for all those problems. I do appreciate everyone who posted their problems, solved or not, so that I could solve my own, either directly or indirectly (by using the idea and guessing from an already solved problem).

I found that we need to change from gcc version 4 to 3 in order for some packages to be installed (I forgot which ones). The script I used to achieve that, taken from the internet, is placed at the bottom of this message.

Finally, below are the lists of problems and solutions from installing the Spatial view.

install-R views Spatial in linuxmint 8 (ubuntu karmic based)

requirement: update to the latest R (2.11.1)

# install package ctv #

    install.packages("ctv", dependencies=TRUE)

problems:
    package: XML
    error: cannot find xml2-config
    solved: apt-get install libxml2-dev

# install views Spatial #

    install.views("Spatial", dependencies=TRUE)

problems:

    The downloaded packages are in '/tmp/Rtmpb0gy50/downloaded_packages'
    There were 32 warnings (use warnings() to see them)

    warnings()
    Warning messages:

1: In install.packages(pkgs, repos = views[[i]]$repository, ... : installation of package 'RODBC' had non-zero exit status
    message error: configure error: ODBC headers sql.h and sqlext.h not found
    solved: apt-get install unixodbc unixodbc-dev

2: In install.packages(pkgs, repos = views[[i]]$repository, ... : installation of package 'Rmpi' had non-zero exit status
    message error: configure error: Cannot find mpi.h header file
    solved:
    - first add the debian source repository: copy the existing "main restricted" lines and add -src, so that
          deb http://.. karmic main restricted.
          deb http://.. karmic-updates main restricted.
      become
          deb-src http://.. karmic main restricted.
          deb-src http://.. karmic-updates main restricted.
    - apt-get update
    - apt-get build-dep r-cran-rmpi
    - R CMD INSTALL Rmpi --configure-args=--with-mpi=/usr/lib/openmpi

3: In install.packages(pkgs, repos = views[[i]]$repository, ... : installation of package 'rpvm' had non-zero exit status
    message error: Try to guess if pvm is installed somewhere ... Cannot find pvm. If pvm is installed, set PVM_ROOT to where pvm is. Otherwise, please install pvm first. ERROR: configuration failed for package 'rpvm'
    solved: apt-get install pvm pvm-dev

4: In install.packages(pkgs, repos = views[[i]]$repository, ... : installation of package 'rsprng' had non-zero exit status
    message error: Cannot find sprng 2.0 header file.
    solved: install libsprng2-dev

5: In install.packages(pkgs, repos = views[[i]]$repository, ... : installation of package 'tkrplot' had non-zero exit status
    message error:
        tcltkimg.c:2:16: tk.h: No such file or directory
        tcltkimg.c:469: error: expected declaration specifiers before 'Tcl_Interp'
        tcltkimg.c:468: warning: type of 'interp' defaults to 'int'
        make: *** [tcltkimg.o] Error 1
    solved: apt-get install tk8.5-dev

6: In install.packages(pkgs, repos = views[[i]]$repository, ... : installation of package 'rJava' had non-zero exit status
    problems: no java compiler javac and javah
    check with R CMD javareconf

hendro-linux R-packages # R
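After a long `install.views()` run it can be hard to tell from the warnings which packages are still missing. A small sketch of one way to check, using only base R: the helper name `missing_pkgs` is hypothetical, and the package list is just the set named in the warnings above plus `stats` as a control.

```r
# Hypothetical helper: which of the requested packages are not installed?
missing_pkgs <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

wanted <- c("stats", "RODBC", "Rmpi", "rpvm", "rsprng", "tkrplot", "rJava")
still_missing <- missing_pkgs(wanted)
print(still_missing)
```

Re-running this after each system library is installed shows whether the corresponding R package now builds, without repeating the whole task view installation.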
[R] Compiling R with multi-threaded BLAS math libraries - why not actually ?
Hello all,

I came across David Smith's new post "Performance benefits of linking R to multithreaded math libraries":

    http://www.r-bloggers.com/performance-benefits-of-linking-r-to-multithreaded-math-libraries/
    http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html

It explains how (and why) the REvolution distribution of R uses different BLAS math libraries, so as to allow multi-threaded mathematical computation.

What the post doesn't explain is why the native R distribution doesn't use the multi-threaded version of the libraries. Is it because the R-devel team didn't get to it yet, or is it for some technical reason?

Could someone please help to explain the situation?

Thanks in advance,
Tal

p.s.: I wasn't sure whether to send the question here or to R-devel; I decided to send it here. If I am in the wrong, please let me know.

----------------Contact Details:----------------
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
Re: [R] Compiling R with multi-threaded BLAS math libraries - why not actually ?
The reason that the BLAS libraries are not installed as part of the standard distribution is that it's desirable for the standard distribution to be the same on all machines, whereas you need a different BLAS library for each different CPU type.
[R] R can't find gcc library that other programs can.
R is used in the Sage project. R is building on Solaris 10 with SPARC processors. Until recently, I did not give it much more thought, as it appeared to build OK.

More recently someone noticed a test failure. It would appear that a number of modules (Matrix, class, mgcv, nnet, rpart, spatial, and survival) are all failing to build. But the R build process does not terminate. Is that intentional?

The way the build is failing is very odd indeed.

    ld.so.1: R: fatal: libgcc_s.so.1: open failed: No such file or directory

This is despite the location of libgcc_s.so being specified in LD_LIBRARY_PATH, and numerous other parts of Sage linking to this library OK.

There's some information about the issue here:

    http://trac.sagemath.org/sage_trac/ticket/9201

Does anyone have any suggestions? In particular, do you consider the failure of one or more of those modules sufficiently serious that the build of R should stop, or do you consider some failures like this not so serious, so that it is right for the build to continue?

(The reason I want to know this is that we want to test for this failure in Sage. Depending on the seriousness of this, we may break the build if such a module fails, or we may decide to continue and test things later, where a test failure would not stop Sage working.)

Dave
Re: [R] Compiling R with multi-threaded BLAS math libraries - why not actually ?
In the case of REvolution R, David mentioned using the Intel MKL, a proprietary library which may not be distributed in the way R is distributed. Maybe REvolution has a license to redistribute the library. For the others, I suspect Gabor has the right idea, that the R-core team would rather not keep architecture dependent code in the sources, although there is a very small amount already (`grep -R __asm__`).

However, I know that using Linux (Debian in particular) it is fairly straightforward to build R with `enhanced' BLAS libraries. The R Installation and Administration manual has a pretty good section on linking with enhanced BLAS and LAPACK libs, including the Intel MKL, if you are willing to cough up $399, or swear not to use the library commercially or academically.

Maybe a short tutorial using free software, such as ATLAS, would be suitable content for an r-bloggers post :) ?

Matt Shotwell
Graduate Student
Div. Biostatistics and Epidemiology
Medical University of South Carolina

On Fri, 2010-06-11 at 19:21 -0400, Tal Galili wrote:
> Hello all,
>
> I came across David Smith's new post "Performance benefits of linking R to multithreaded math libraries":
>
>     http://www.r-bloggers.com/performance-benefits-of-linking-r-to-multithreaded-math-libraries/
>     http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html
>
> It explains how (and why) the REvolution distribution of R uses different BLAS math libraries, so as to allow multi-threaded mathematical computation.
>
> What the post doesn't explain is why the native R distribution doesn't use the multi-threaded version of the libraries. Is it because the R-devel team didn't get to it yet, or is it for some technical reason?
>
> Could someone please help to explain the situation?
>
> Thanks in advance,
> Tal
>
> p.s.: I wasn't sure whether to send the question here or to R-devel; I decided to send it here. If I am in the wrong, please let me know.
>
> ----------------Contact Details:----------------
> Contact me: tal.gal...@gmail.com | 972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
Re: [R] Compiling R with multi-threaded BLAS math libraries - why not actually ?
On 11 June 2010 at 23:01, Matt Shotwell wrote: | In the case of REvolution R, David mentioned using the Intel MKL, | proprietary library which may not be distributed in the way R is | distributed. Maybe REvolution has a license to redistribute the library. | For the others, I suspect Gabor has the right idea, that the R-core team | would rather not keep architecture dependent code in the sources, | although there is a very small amount already (`grep -R __asm__`). | | However, I know using Linux (Debian in particular) it is fairly | straightforward to build R with `enhanced' BLAS libraries. The R | Administration and Installation manual has a pretty good section on | linking with enhanced BLAS and LAPACK libs, including the Intel MKL, if | you are willing cough up $399, or swear not to use the library | commercially or academically. BLAS is actually an interface standard, so there is no _rebuilding of R_ required. BLAS allows you to simply drop in a better BLAS library. The R Inst + Admin manual has the details. | Maybe a short tutorial using free software, such as ATLAS would be | suitable content for an r-bloggers post :) ? Given the drop-in nature, on suitable platforms all it takes is sudo apt-get install libatlas3gf-base which gets you there most of the way using 'base' Atlas (ie not cpu tuned). On my platform I also see libatlas3gf-3dnow libatlas3gf-base libatlas3gf-core2sse3 libatlas3gf-sse2 libatlas3gf-sse3 libatlas3gf-sse but this list will differ for different hardware platforms. Whenever I looked at this I found the different between 'base' and 'more tuned' atlas libraries to be rather small so I tend to just stick with base. Also, Atlas on Debian/Ubuntu is still single-threaded (as opposed to the MKL). But one can drop in the Goto BLAS from U Texas which are 'free' but non-redistributable. 

Lastly, as David's article said and as knowledgeable people often repeat: unless you do _lots_ of linear algebra this will not noticeably affect overall R performance, as a lot of time is spent in other areas too.

--
Regards, Dirk
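
One rough way to judge whether a drop-in BLAS matters for a given workload is to time a few linear-algebra-heavy operations before and after switching libraries. The following is a minimal sketch, not a rigorous benchmark: the matrix size `n` and the choice of operations are assumptions made for illustration, and a single timing run is noisy.

```r
# Time operations that R hands off to whatever BLAS/LAPACK it is linked against.
set.seed(1)
n <- 500                          # hypothetical size; increase for a real comparison
a <- matrix(rnorm(n * n), n, n)
b <- matrix(rnorm(n * n), n, n)

# [["elapsed"]] extracts wall-clock seconds from the system.time() result
t_mult  <- system.time(ab <- a %*% b)[["elapsed"]]
t_cross <- system.time(cp <- crossprod(a))[["elapsed"]]
t_solve <- system.time(x  <- solve(a, b))[["elapsed"]]

print(c(multiply = t_mult, crossprod = t_cross, solve = t_solve))
```

Running the same script under each library gives comparable numbers. In recent R versions, `sessionInfo()` also reports which BLAS and LAPACK libraries are in use, which helps confirm that the swap actually took effect.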
Re: [R] Decision values from KSVM
Any suggestions?

?predict.ksvm has an argument called type:

    one of "response", "probabilities", "votes", "decision" indicating the type of output: predicted values, matrix of class probabilities, matrix of vote counts, or matrix of decision values.

Max
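
A short sketch of how that argument might be used, assuming the kernlab package is installed. The two-class subset of the built-in iris data and the object names are illustrative choices, not taken from the original question.

```r
library(kernlab)

# Two-class subset of iris (an illustrative data set, not from the thread)
two_class <- droplevels(subset(iris, Species != "setosa"))

# Fit a C-classification SVM with an RBF kernel
set.seed(42)
fit <- ksvm(Species ~ ., data = two_class, type = "C-svc", kernel = "rbfdot")

# Decision values: a matrix with one column per pairwise classifier
dec <- predict(fit, two_class[, 1:4], type = "decision")

# Predicted class labels, for comparison with the decision values
lab <- predict(fit, two_class[, 1:4], type = "response")

head(data.frame(decision = dec[, 1], label = lab))
```

With only two classes there is a single pairwise classifier, so `dec` has one column. Note that `type = "probabilities"` would additionally require fitting with `prob.model = TRUE`.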
Re: [R] Overlay of barchart and xyplot
Hi,

I have an example below of adding a key to the merged plot. You cannot have the key on the right-hand side because that viewport is used by the second ylab (ylab2 from doubleYScale). Well, if you really wanted to, you could do it with the grid package, using frameGrob or somesuch.

library(latticeExtra)  # provides doubleYScale(); also loads lattice
library(grid)          # provides frameGrob() and packGrob()

NTLST_Dispersal_VAR_00_08$Year <- factor(NTLST_Dispersal_VAR_00_08$Year,
    levels = c(1999,2000,2001,2002,2003,2004,2005,2006,2007), ordered = TRUE)

dispersal <- barchart(LDP_PER*100 + SPP_PER*100 + SPG_PER*100 ~ Year | District,
    data = NTLST_Dispersal_VAR_00_08,
    stack = TRUE, layout = c(5,5),
    scales = list(x = list(rot = 90)),
    xlab = "Year", ylab = "%",
    strip = strip.custom(bg = "light gray"),
    par.settings = simpleTheme(col = c("dark gray", "light gray", "white")),
    auto.key = list(points = FALSE, rectangles = TRUE)
)

vars <- xyplot(sqrt(Infestation_NUM) + AI ~ Year | District,
    data = NTLST_Dispersal_VAR_00_08,
    layout = c(5,5), type = "b",
    ylab = "Square roots of number of infested cells/Landscape aggregation index",
    auto.key = list(lines = TRUE)
)

dblplot <- doubleYScale(dispersal, vars, use.style = FALSE, add.ylab2 = TRUE)

dblplot <- update(dblplot,
    par.settings = simpleTheme(fill = c("white", "dark gray", "black"), border = "black",
        col.line = "black", col.points = "black", pch = c(16,17), lty = c(1,1,1,2,1))
)

## include second key at the bottom
update(dblplot, legend = list(bottom = vars$legend$top))

## Otherwise you could just include a "key" argument in the first plot which includes all the items explicitly.

## Or merge the two 'auto.key's at the top:
mergeLegends <- function(a, b, ...)
{
    g <- frameGrob()
    agrob <- a
    ## build each legend grob from its (fun, args) specification if needed
    if (!inherits(a, "grob")) {
        a <- eval(as.call(c(as.symbol(a$fun), a$args)), getNamespace("lattice"))
    }
    if (!inherits(b, "grob")) {
        b <- eval(as.call(c(as.symbol(b$fun), b$args)), getNamespace("lattice"))
    }
    ## pack the two legends side by side
    g <- packGrob(g, a, side = "left")
    packGrob(g, b, side = "right")
}

update(dblplot, legend = list(top = list(fun = mergeLegends,
    args = list(a = dispersal$legend$top, b = vars$legend$top))))

On 5 June 2010 04:49, Chen, Huapeng FOR:EX huapeng.c...@gov.bc.ca wrote:

Hi Felix,

Thanks for your help and advice. The following code is close to what I want, but I still cannot customize the lines or add a key in any way. par.settings in the final plot does not seem to work, except for pch and lty, but those overwrite the par.settings of the barchart. I have also attached the data I used, via dput. I would appreciate any further help.

Thanks,
Huapeng

# code #
NTLST_Dispersal_VAR_00_08$Year <- factor(NTLST_Dispersal_VAR_00_08$Year,
    levels = c(1999,2000,2001,2002,2003,2004,2005,2006,2007), ordered = TRUE)

dispersal <- barchart(NTLST_Dispersal_VAR_00_08$LDP_PER*100 +
        NTLST_Dispersal_VAR_00_08$SPP_PER*100 +
        NTLST_Dispersal_VAR_00_08$SPG_PER*100 ~
        NTLST_Dispersal_VAR_00_08$Year | NTLST_Dispersal_VAR_00_08$District,
    data = NTLST_Dispersal_VAR_00_08,
    horizontal = FALSE, stack = TRUE, layout = c(5,5),
    xlab = "Year", ylab = "%",
    strip = strip.custom(bg = "light gray"),
    par.settings = simpleTheme(col = c("dark gray", "light gray", "white"))
    #key=list(space="right",size=10,
    #  rectangles=list(size=1.7, border="black", col = c("white", "dark gray", "black")),
    #  lines=list(pch=c(16,17),lty=c(1,2),col="black",type="b"),
    #  text=list(text=c("SPG","SPP","LDP")))
    #auto.key=TRUE
)

xyplot(sqrt(NTLST_Dispersal_VAR_00_08$Infestation_NUM) + NTLST_Dispersal_VAR_00_08$AI ~
        NTLST_Dispersal_VAR_00_08$Year | NTLST_Dispersal_VAR_00_08$District,
    data = NTLST_Dispersal_VAR_00_08,
    layout = c(5,5), type = "b",
    ylab = "Square roots of number of infested cells/Landscape aggregation index"
    #par.settings = simpleTheme(col = c("black", "black"), pch=c(16,17)),
    #key=list(space="right",size=10,
    #  rectangles=list(size=1.7, border="black", col = c("white", "dark gray", "black")),
    #  lines=list(pch=c(16,17),lty=c(1,2),col="black",type="b"),
    #  text=list(text=c("t4","t5")))
)

doubleYScale(dispersal, vars, use.style=FALSE, add.ylab2 = TRUE)

update(trellis.last.object(),
    par.settings = simpleTheme(fill = c("white", "dark gray", "black"),
        border="black",col.line="black",
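
The legend merging in mergeLegends rests on grid's frame packing: frameGrob() creates an empty container and packGrob() adds children on a chosen side. Here is a minimal sketch of that mechanism on its own, using two text labels as stand-ins for the real legend grobs. The labels and object names are illustrative and do not come from the thread.

```r
library(grid)

# Stand-ins for the two legend grobs (hypothetical labels)
bar_key  <- textGrob("bars: LDP / SPP / SPG")
line_key <- textGrob("lines: sqrt(cells) / AI")

# Start from an empty frame and pack one grob on each side
merged <- frameGrob()
merged <- packGrob(merged, bar_key,  side = "left")
merged <- packGrob(merged, line_key, side = "right")

# Draw the combined grob on a fresh page
grid.newpage()
grid.draw(merged)
```

Because the result is itself a grob, it can be returned from a legend function in the same way mergeLegends returns its packed frame.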
Re: [R] Date conversion
Thanks Joshua,

I wanted to use some kind of date format in LaTeX but ended up using exactly what you and Marc suggested.

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

----- Original Message -----
From: Joshua Wiley jwiley.ps...@gmail.com
To: Felipe Carrillo mazatlanmex...@yahoo.com
Cc: r-h...@stat.math.ethz.ch
Sent: Thu, June 10, 2010 1:18:27 PM
Subject: Re: [R] Date conversion

Hello Felipe,

Is this what you want?

format(as.Date("3/10/10", format="%m/%d/%y"), "%B %d, %Y")

Josh

On Thu, Jun 10, 2010 at 8:29 AM, Felipe Carrillo mazatlanmex...@yahoo.com wrote:

Hi:

Can't find a way to convert from shortDate to LongDate format. I got: 3/10/10 that I want to convert to March 10, 2010. I am using:

\documentclass[11pt]{article}
\usepackage{longtable,verbatim}
\usepackage{ctable}
\usepackage{datetime}
\title{my title}
\begin{document}
% Convert date
\dddate\3/10/10
\end{document}

My report is changing every two weeks so I will eventually use \Sexpr{report[1,1]} to grab the date from column 1, row 1 of a table named report, but right now my report has the date formatted as described above (3/10/10).

--
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/
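
The conversion Josh suggests can be wrapped in a small helper so that it applies to whatever value the report supplies. This is a sketch only: the function name `to_long_date` and its `in_format` argument are hypothetical, and the month name produced by `%B` depends on the current locale.

```r
# Convert short dates such as "3/10/10" to a long form such as "March 10, 2010".
to_long_date <- function(x, in_format = "%m/%d/%y") {
  # as.character() guards against the column having been read in as a factor
  d <- as.Date(as.character(x), format = in_format)
  # %B is the full month name in the current locale; %d is a zero-padded day
  format(d, "%B %d, %Y")
}

to_long_date("3/10/10")
to_long_date(c("3/10/10", "12/1/09"))   # vectorised over its input
```

In a Sweave document the same call could be placed inline, for example `\Sexpr{to_long_date(report[1,1])}`, assuming `report[1,1]` holds the date in month/day/two-digit-year form.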