Re: [R] How to get intersection of multiple vectors?
孟欣 lm_mengxin at 163.com writes:

> v1 <- c("a","b","c","d")
> v2 <- c("a","b","e")
> v3 <- c("a","f","g")
>
> I want to get the intersection of v1, v2 and v3, i.e. "a". How can I do that? I only know how to intersect 2 vectors via the intersect function, and don't know how to deal with multiple vectors.
>
> Many thanks

    Reduce(intersect, list(v1 = c("a","b","c","d"), v2 = c("a","b","e"), v3 = c("a","f","g")))

--
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] sqldf for Very Large Tab Delimited Files
On Wed, Feb 1, 2012 at 11:57 PM, HC hca...@yahoo.co.in wrote:

> Hi All,
>
> I have a very (very) large tab-delimited text file without headers. There are only 8 columns and millions of rows. I want to split this file into numerous pieces by subsetting it for individual stations. The station is given in the first column. I am trying to learn and use the sqldf package for this but am stuck in a couple of places.
>
> To simulate my requirement, I have taken the iris dataset as an example and have done the following:
> (1) create a tab-delimited file without headers
> (2) read it using the read.csv.sql command
> (3) write the result of a query, getting the first 10 records
>
> Here is the reproducible code that I am trying:
>
>     # Text data file
>     write.table(iris, "irisNoH.txt", sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)
>     # create an empty database (can skip this step if database already exists)
>     sqldf("attach myTestdbT as new")
>     f1 <- file("irisNoH.txt")
>     attr(f1, "file.format") <- list(header = FALSE, sep = "\t")
>     # read into table called irisTab in the mytestdb sqlite database
>     read.csv.sql("irisNoH.txt", sql = "create table main.irisTab1 as select * from file", dbname = "mytestdb")
>     res1 <- sqldf("select * from main.irisTab1 limit 10", dbname = "mytestdb")
>     write.table(res1, "iris10.txt", sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)
>
>     # For querying records of a particular species - unresolved problems
>     #a1 <- "virginica"
>     #attr(f1, "names") <- c("A1","A2","A3","A4","A5")
>     #res2 <- fn$sqldf("select * from main.irisTab1 where A5 = '$a1'")
>
> In the above, I am not able to:
> (1) assign names to the various columns
> (2) query for a particular value of a column; in this case for a particular species, say virginica
> (3) I guess fn$sqldf can do the job but it requires assigning column names
>
> Any help would be most appreciated.

Ignoring your iris file for a moment, to query the 5th column (getting its name via sql rather than via R) we can do this:

    library(sqldf)
    species <- "virginica"
    nms <- names(dbGetQuery(con, "select * from iris limit 0"))
    fn$dbGetQuery(con, "select * from iris where `nms[5]` = '$species' limit 3")

Now, sqldf is best used when you are getting the data from R, but if you want to store it in a database and just leave it there then you might be better off using RSQLite directly like this (the eol = "\r\n" in the dbWriteTable statement was needed on my Windows system but you may not need that depending on your platform):

    write.table(iris, "irisNoH.txt", sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)

    library(sqldf)
    library(RSQLite)

    con <- dbConnect(SQLite(), dbname = "mytestdb")
    dbWriteTable(con, "iris", "irisNoH.txt", sep = "\t", eol = "\r\n")

    species <- "virginica"
    nms <- names(dbGetQuery(con, "select * from iris limit 0"))
    fn$dbGetQuery(con, "select * from iris where `nms[5]` = '$species' limit 3")

    dbDisconnect(con)

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
[R] time conversion from second to Y M D H M S format
I have some time data in seconds:

    time <- c(126230400, 126252000, 126273600, 126295200, 126316800, 126338400)

Now I want to convert this time to Y M D H M S format. I have tried the following code, but it does not give me the output in Y M D H M S:

    time_t1 <- as.POSIXlt(time, origin = "2005-01-01", tz = "GMT")
    time_f <- as.POSIXct(time, origin = "2005-01-01", tz = "GMT")

Could somebody please tell me how to fix this problem?

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/time-conversion-from-second-to-Y-M-D-H-M-S-format-tp4350831p4350831.html
Sent from the R help mailing list archive at Nabble.com.
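No reply to this question appears in this digest. As a hedged sketch only: the conversion in the post already yields date-time objects, so what may be missing is an explicit format() call. The variable names `secs`, `stamps` and `formatted` are made up for illustration, and the space-separated format string is an assumption about what "Y M D H M S" should look like.

```r
# Seconds counted from the origin used in the post (2005-01-01, GMT)
secs <- c(126230400, 126252000, 126273600, 126295200, 126316800, 126338400)

# Convert to a date-time object first ...
stamps <- as.POSIXct(secs, origin = "2005-01-01", tz = "GMT")

# ... then render it as text; the format string is an assumption
formatted <- format(stamps, "%Y %m %d %H %M %S")
formatted[1]   # "2009 01 01 00 00 00"
```

If separate numeric fields are wanted instead of a string, as.POSIXlt(stamps) exposes components such as $hour and $min.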
Re: [R] gee: suppress printout
On 02.02.2012 06:37, Ginata86 wrote:

> I am using the method to sink the output. However, it only suppresses "user's initial regression estimate" and still displays the following sentence: "Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27"

You have been told to use suppressMessages() already.

Uwe Ligges

> I am just wondering: is there any way to suppress this one as well? I need to run this in a loop many times, and it is annoying to have it displayed each time.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/gee-suppress-printout-tp908053p4350605.html
> Sent from the R help mailing list archive at Nabble.com.
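A minimal sketch of how the two kinds of output can be silenced together inside a loop. It does not call gee(); `noisy_fit` is a hypothetical stand-in that emits one message() and one printed line, on the assumption (taken from the reply above) that the banner is a message.

```r
# Hypothetical stand-in for a chatty fitting function; this is NOT gee()
noisy_fit <- function(x) {
  message("Beginning hypothetical fit")  # a message condition (stderr)
  print("initial estimate")              # ordinary printed output (stdout)
  mean(x)
}

results <- numeric(3)
for (i in 1:3) {
  # suppressMessages() drops the message; capture.output() collects the
  # printed line into 'printed' instead of showing it on the console
  printed <- capture.output(
    results[i] <- suppressMessages(noisy_fit(c(i, i + 2)))
  )
}
results
```

Using capture.output() rather than sink() avoids having to remember to restore the console afterwards.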
[R] Summary.formula question
Dear all,

Before my question, I wish all of you the very best for 2012.

I'm using summary.formula to make a table. I have something like this:

    s1 <- summary(fdh ~ cup5 + cup6 + schef + cpro1 + stratify(id2), data = dat, na.include = F)

The output gives the marginal row named "overall", but is it possible to add a marginal column?

Sincerely,

Justin BEM
BP 1917 Yaoundé
Tél (237) 76043774
Re: [R] knncat broken on R 2.14?
Works for me.

Uwe Ligges

On 02.02.2012 02:09, Nick Matzke wrote:

> Hi,
>
> Until recently I was using the knncat classifier function of knncat on an old computer (2.12, Mac OS X 10.4), and everything worked great.
>
> However, now that I have updated to R 2.14.1 (on Mac OS X 10.7), knncat seems broken. Problems:
>
> 1. It seems to output verbose output by default, and regardless of whether I put 0 or 1 into the verbose option.
> 2. It seems to just predict everything to be class 0.
> 3. I can see this problem even when running the example script under ?knncat.
>
> Given example from ?knncat, and expected behavior:
> ===
>     library(knncat)
>     library(MASS)  # Load data set
>     syncat <- knncat(synth.tr, classcol=3)
>     ## Not run:
>     syncat
>     Train set misclass rate: 12.8
>     synpred <- predict(syncat, synth.tr, synth.te, train.classcol=3, newdata.classcol=3)
>     table(synpred, synth.te$yc)
>     synpred   0   1
>           0 460  91
>           1  40 409
> ===
>
> Actual behavior:
> ==
>     library(MASS)  # Load data set
>     syncat <- knncat(synth.tr, classcol=3)
>     Global U:
>                  [,1]        [,2]        [,3]        [,4]        [,5]
>      [1,] 3.20705e+02 1.91438e+02 1.51645e+02 1.20514e+02 8.21981e+01
>      [2,] 1.91438e+02 1.14799e+02 9.09814e+01 7.22355e+01 4.92108e+01
>      [3,] 1.51645e+02 9.09814e+01 7.21094e+01 5.72461e+01 3.89942e+01
>      [4,] 1.20514e+02 7.22355e+01 5.72461e+01 4.54553e+01 3.09702e+01
>      [5,] 8.21981e+01 4.92108e+01 3.89942e+01 3.09702e+01 2.11076e+01
>      [6,] 5.57264e+01 3.34476e+01 2.65108e+01 2.10446e+01 1.43334e+01
>      [7,] 2.51586e+01 1.52490e+01 1.20991e+01 9.58523e+00 6.51199e+00
>      [8,] 1.45272e+01 8.85590e+00 7.03089e+00 5.56357e+00 3.77419e+00
>      [9,] 8.30201e+00 5.07964e+00 4.03440e+00 3.19007e+00 2.16202e+00
>     [10,] 3.36454e+00 2.07002e+00 1.64502e+00 1.29931e+00 8.79344e-01
>                  [,6]        [,7]        [,8]        [,9]       [,10]
>      [1,] 5.57264e+01 2.51586e+01 1.45272e+01 8.30201e+00 3.36454e+00
>      [2,] 3.34476e+01 1.52490e+01 8.85590e+00 5.07964e+00 2.07002e+00
>      [3,] 2.65108e+01 1.20991e+01 7.03089e+00 4.03440e+00 1.64502e+00
>      [4,] 2.10446e+01 9.58523e+00 5.56357e+00 3.19007e+00 1.29931e+00
>      [5,] 1.43334e+01 6.51199e+00 3.77419e+00 2.16202e+00 8.79344e-01
>      [6,] 9.74699e+00 4.45226e+00 2.58857e+00 1.48583e+00 6.06140e-01
>      [7,] 4.45226e+00 2.07551e+00 1.22083e+00 7.05915e-01 2.9e-01
>      [8,] 2.58857e+00 1.22083e+00 7.22786e-01 4.19620e-01 1.74066e-01
>      [9,] 1.48583e+00 7.05915e-01 4.19620e-01 2.44220e-01 1.01671e-01
>     [10,] 6.06140e-01 2.9e-01 1.74066e-01 1.01671e-01 4.25445e-02
>     Global W:
>                  [,1]        [,2]        [,3]        [,4]        [,5]
>      [1,] 3.68224e+02 2.36787e+02 1.93972e+02 1.58837e+02 1.12597e+02
>      [2,] 2.36787e+02 1.58549e+02 1.32054e+02 1.09539e+02 7.88945e+01
>      [3,] 1.93972e+02 1.32054e+02 1.11021e+02 9.27903e+01 6.74442e+01
>      [4,] 1.58837e+02 1.09539e+02 9.27903e+01 7.82101e+01 5.74745e+01
>      [5,] 1.12597e+02 7.88945e+01 6.74442e+01 5.74745e+01 4.31835e+01
>     [...etc...lots of undesired detailed/verbose output, then...]
>     syncat
>     Training set misclass rate: 50%
>     synpred <- predict(syncat, synth.tr, synth.te, train.classcol=3,
>     + newdata.classcol=3)
>     table(synpred, synth.te$yc)
>     synpred   0   1
>           0 500 500
> ==
>
> Any help would be much appreciated!
>
> Cheers,
> Nick
>
> PS computer details
> =
> knncat is version 1.1.11
> =
> ==
> R version 2.14.1 (2011-12-22)
> Copyright (C) 2011 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> [R.app GUI 1.43 (5989) x86_64-apple-darwin9.8.0]
> ==
Re: [R] User Interface Equivalent Code
On 30.01.2012 20:26, Ajay Askoolum wrote:

> When I plot, the plot's user interface offers me a choice: File | Copy to the Clipboard | as a Bitmap.
>
> What is the equivalent code for achieving this but without the plot interface becoming visible?

For something *equivalent*, see ?dev.copy. Since you are on Windows: see also ?bmp for a cleaner, more direct approach that prints into the device right away. You may also want to consider producing vector graphics formats rather than bitmaps; the former are preferable in most, but not all, cases.

Uwe Ligges

> Thanks.
Re: [R] Makefile to compile .so in src (was: Re: automated libR location)
On 01.02.2012 03:37, Matyas Sustik wrote:

> Prof Brian Ripley wrote:
>> 'library' in R has a different meaning: I've altered the subject to be more accurate 'libR'. This is what R CMD SHLIB is for: it does all this for you in a portable way. But if you want to DIY, you can use R CMD config to find out the appropriate linker incantation.
>
> Thank you for the clarification. I do not insist on doing it myself and would welcome R doing it automatically. I guess I did not fully understand the instructions in the document R Extensions. (English is my second language.) A small example compiling a single .so would help greatly, I think.
>
> My current Makefile, which I put in src of a package skeleton, looks like this to create the .so on Linux:
>
>     all : QUIC.so
>     OBJECTS = QUIC.o QUIC.so
>     PKG_LIBS = @LAPACK_LIBS@ @BLAS_LIBS@
>     QUIC.o : QUIC.cpp
>         g++ -O3 -DNDEBUG -Wall -fpic -pthread -shared -fno-omit-frame-pointer -ansi -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -c QUIC.cpp -o QUIC.o
>     QUIC.so : QUIC.o
>         g++ -std=gnu99 -shared -lm -llapack -lblas -L/usr/lib/R/lib -lR -O3 QUIC.o -o QUIC.so

1. I don't believe you really need all the flags from above. If you do, use a Makevars file within a package.

2. R CMD SHLIB QUIC.cpp should do the trick already; perhaps some linker flags are required for BLAS, which can be specified on the same line, see R CMD SHLIB --help

Uwe Ligges

> This actually built and created a loadable package but I would want to do it more the R-way.
>
> Thanks in advance!
> -Matyas
Re: [R] finding rows in a matrix that match a vector
On 28.01.2012 05:43, Melissa Patrician wrote:

> Hi,
>
> Please excuse my inexperience, but I am just learning R (this is my very first day programming in R) and having a really hard time figuring out how to do the following:
>
> I have a matrix that is 1000 rows by 6 columns (named 'table.combos') and a 1 row by 6 column vector (named 'mine'). I want to find every row in 'table.combos' that equals 'mine' and then count the number of times that this is the case.
>
> In MATLAB, I would use the 'find' command, but I cannot seem to figure out what syntax to use for R. Can anyone please help? Again, I'm assuming this is probably a very easy thing to do, but since I am new to R, I am having a hard time figuring it out.
>
> I did some research on previous posts and saw that the 'apply' function appears to do something like this, but I don't know what function I am supposed to pass to 'apply'.

If you like the apply way, with your vector 'mine' and your matrix 'table.combos':

    sum(apply(table.combos, 1, function(x) all(x == mine)))

Uwe Ligges

> Thanks in advance for the help!
>
> Cheers,
> Melissa
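The one-liner above can be tried on a toy case. The 4-by-3 matrix and the vector below are made-up data (the original matrix is 1000 by 6), and `matches` is a name introduced here only for illustration.

```r
# Made-up stand-ins for 'table.combos' and 'mine'
table.combos <- matrix(c(1, 2, 3,
                         4, 5, 6,
                         1, 2, 3,
                         7, 8, 9), ncol = 3, byrow = TRUE)
mine <- c(1, 2, 3)

# One logical per row: TRUE when every element of the row equals 'mine'
matches <- apply(table.combos, 1, function(x) all(x == mine))

which(matches)  # row numbers of the matching rows: 1 3
sum(matches)    # how many rows match: 2
```

which() plays roughly the role of MATLAB's find here, while sum() counts the TRUE values.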
Re: [R] How to get intersection of multiple vectors?
On Thu, Feb 02, 2012 at 01:55:07PM +0800, 孟欣 wrote:

> v1 <- c("a","b","c","d")
> v2 <- c("a","b","e")
> v3 <- c("a","f","g")
>
> I want to get the intersection of v1, v2 and v3, i.e. "a". How can I do that? I only know how to intersect 2 vectors via the intersect function, and don't know how to deal with multiple vectors.

Hi.

Set intersection is an associative operation. So,

    intersect(intersect(v1, v2), v3)

or

    intersect(v1, intersect(v2, v3))

yield the correct result.

Hope this helps.

Petr Savicky.
Re: [R] How to get intersection of multiple vectors?
On Thu, Feb 02, 2012 at 01:55:07PM +0800, 孟欣 wrote:

> v1 <- c("a","b","c","d")
> v2 <- c("a","b","e")
> v3 <- c("a","f","g")
>
> I want to get the intersection of v1, v2 and v3, i.e. "a". How can I do that? I only know how to intersect 2 vectors via the intersect function, and don't know how to deal with multiple vectors.

Hi.

Try the following

    intersectSeveral <- function(...) { Reduce(intersect, list(...)) }

    intersectSeveral(v1, v2, v3)
    [1] "a"

Hope this helps.

Petr Savicky.
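For reference, a small self-contained run of the approaches suggested in this thread, using the three vectors from the question. The names `nested` and `common` are introduced here for illustration only.

```r
v1 <- c("a", "b", "c", "d")
v2 <- c("a", "b", "e")
v3 <- c("a", "f", "g")

# Nested pairwise calls; valid because set intersection is associative
nested <- intersect(intersect(v1, v2), v3)

# Reduce() folds intersect() over a list, so it works for any number of vectors
common <- Reduce(intersect, list(v1, v2, v3))

identical(nested, common)  # TRUE
common                     # "a"
```

The Reduce() form is convenient when the vectors are already stored in a list whose length is not known in advance.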
[R] Writing out data
What is the best way to write out comma-separated data while a program is running (rather than waiting until the end and using write.csv)?

At the moment I'm doing this, but I guess it's not the most efficient. The data is in a column in the matrix postcount, and I'm using a loop to write out each of the 100 elements.

    for (j in 1:100) {
      cat(postcount[1,j], ",", file=filename, append=TRUE)
    }
    cat("\n", file=filename, append=TRUE)

Thank you!

Thomas

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please send it back to me, and immediately delete it. Please do not use, copy or disclose the information contained in this message or in any attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. This message has been checked for viruses but the contents of an attachment may still contain software viruses which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.
Re: [R] While loop working with TRUE/FALSE?
Thanks to Berend and the others,

I've found a solution which works fine for my problem. I have not just 2 vectors, but 4. The question is whether q1 and q2 are equal to w1 and w2. The computation time is very short, even for large data.

    q1 <- c(9,5,1,5)
    q2 <- c(9,2,1,5)
    w1 <- c(9,4,4,4,5)
    w2 <- c(9,4,4,4,5)
    v <- vector()
    for (i in 1:(length(q1))){
      v[i] <- any((q1[i] == w1) & (q2[i] == w2))
    }

best regards

--
View this message in context: http://r.789695.n4.nabble.com/While-loop-working-with-TRUE-FALSE-tp4348340p4351214.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] While loop working with TRUE/FALSE?
Hey Chris,

I would take advantage of the apply function:

    apply(cbind(q1, q2), 1, function(x) any((x[1] == w1) & (x[2] == w2)))

Regards
PF

On Thu, Feb 2, 2012 at 12:55 PM, Chris82 rubenba...@gmx.de wrote:

> Thanks to Berend and the others,
>
> I've found a solution which works fine for my problem. I have not just 2 vectors, but 4. The question is whether q1 and q2 are equal to w1 and w2. The computation time is very short, even for large data.
>
>     q1 <- c(9,5,1,5)
>     q2 <- c(9,2,1,5)
>     w1 <- c(9,4,4,4,5)
>     w2 <- c(9,4,4,4,5)
>     v <- vector()
>     for (i in 1:(length(q1))){
>       v[i] <- any((q1[i] == w1) & (q2[i] == w2))
>     }
>
> best regards

--
+---
| Patrizio Frederic,
| http://www.economia.unimore.it/frederic_patrizio/
+---
Re: [R] Writing out data
On 02/02/2012 11:40 AM, Thomas wrote:

> What is the best way to write out comma-separated data while a program is running (rather than waiting until the end and using write.csv)?
>
> At the moment I'm doing this, but I guess it's not the most efficient. The data is in a column in the matrix postcount, and I'm using a loop to write out each of the 100 elements.
>
>     for (j in 1:100) {
>       cat(postcount[1,j], ",", file=filename, append=TRUE)
>     }
>     cat("\n", file=filename, append=TRUE)
>
> Thank you!
>
> Thomas

Hi,

write.csv also supports an append argument. Maybe that is faster than using cat.

cheers,
Paul

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
Re: [R] Writing out data
Correct me if I'm wrong, but I think that write.csv() doesn't have an append argument; write.table() does though.

Ivan

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
ivan.calan...@u-bourgogne.fr

On 02/02/12 13:43, Paul Hiemstra wrote:

> On 02/02/2012 11:40 AM, Thomas wrote:
>> What is the best way to write out comma-separated data while a program is running (rather than waiting until the end and using write.csv)?
>>
>> At the moment I'm doing this, but I guess it's not the most efficient. The data is in a column in the matrix postcount, and I'm using a loop to write out each of the 100 elements.
>>
>>     for (j in 1:100) {
>>       cat(postcount[1,j], ",", file=filename, append=TRUE)
>>     }
>>     cat("\n", file=filename, append=TRUE)
>>
>> Thank you!
>>
>> Thomas
>
> Hi,
>
> write.csv also supports an append argument. Maybe that is faster than using cat.
>
> cheers,
> Paul
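A minimal sketch of the write.table() route mentioned above, writing one whole row per call instead of one element per cat() call. The matrix contents and the temporary file are made up for illustration; `postcount` is reused from the question, and `out_file` and `written` are hypothetical names.

```r
# Made-up data: 3 rows of 100 values each
postcount <- matrix(1:300, nrow = 3)
out_file <- tempfile(fileext = ".csv")   # hypothetical output path

for (i in seq_len(nrow(postcount))) {
  # drop = FALSE keeps the row as a 1 x 100 matrix, so it is written as one line
  write.table(postcount[i, , drop = FALSE], file = out_file, sep = ",",
              append = TRUE, col.names = FALSE, row.names = FALSE)
}

# Read the file back to check what was written
written <- as.matrix(read.csv(out_file, header = FALSE))
dim(written)   # 3 100
```

Each call still reopens the file, so this is tidier than the element-wise loop but not necessarily the fastest option.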
[R] Draw vertical line segments between pairs of points
Dear all,

How can I add vertical lines above a bar graph to display p-values (between pairs of points)?

Regards
ML

--
Mohamed Lajnef, IE INSERM U955 eq 15
Pôle de Psychiatrie
Hôpital CHENEVIER
40, rue Mesly
94010 CRETEIL Cedex FRANCE
mohamed.laj...@inserm.fr
tel : 01 49 81 32 79
Sec : 01 49 81 32 90
fax : 01 49 81 30 99
[R] matrix element position: from length to dim
How can I convert a position along the length of a matrix into a position in dim?

    a = matrix(c(1:999), nrow=9)
    which(a==87)  # position in length 1:length(a)
    87
    which(a==87, arr.ind=TRUE)  # position in dim
         row col
    [1,]   6  10
Re: [R] While loop working with TRUE/FALSE?
On Feb 2, 2012, at 6:55 AM, Chris82 wrote:

> Thanks to Berend and the others,
>
> I've found a solution which works fine for my problem. I have not just 2 vectors, but 4. The question is whether q1 and q2 are equal to w1 and w2. The computation time is very short, even for large data.
>
>     q1 <- c(9,5,1,5)
>     q2 <- c(9,2,1,5)
>     w1 <- c(9,4,4,4,5)
>     w2 <- c(9,4,4,4,5)
>     v <- vector()
>     for (i in 1:(length(q1))){
>       v[i] <- any((q1[i] == w1) & (q2[i] == w2))

This suggests a lack of understanding re: how to use logical functions. The any() function is completely superfluous here. It will return exactly the same vector as would:

    (q1[i] == w1) & (q2[i] == w2)

If you wanted to use any() to pick out cases where either q1[i] == w1 or q2[i] == w2, then do not put an ampersand between those arguments but rather a comma.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] matrix element position: from length to dim
On Thu, Feb 02, 2012 at 02:08:37PM +0100, Ana wrote:

> How can I convert a position along the length of a matrix into a position in dim?
>
>     a = matrix(c(1:999), nrow=9)
>     which(a==87)  # position in length 1:length(a)
>     87
>     which(a==87, arr.ind=TRUE)  # position in dim
>          row col
>     [1,]   6  10

Hi.

Assume

    d <- dim(a)
    i <- 87

Try the following two approaches.

1.

    x <- rep(FALSE, times=prod(d))
    x[i] <- TRUE
    which(array(x, dim=d), arr.ind=TRUE)
         row col
    [1,]   6  10

2.

    c((i - 1) %% d[1], (i - 1) %/% d[1]) + 1
    [1]  6 10

Hope this helps.

Petr Savicky.
Re: [R] Stuck with levels while reassigning dataframe colnames?
Thanks for the advice,

    df <- read.table(infile, sep="", skip = 1, header=TRUE)

is indeed much cleaner from the outset (and was my usual way of doing it). I was unaware that readLines(infile, n=1) could get me the first line without reading the whole file again.

But I do need to get my head around these levels...

---Jean Plamondon

--
View this message in context: http://r.789695.n4.nabble.com/Stuck-with-levels-while-reassigning-dataframe-colnames-tp4350435p4351528.html
Sent from the R help mailing list archive at Nabble.com.
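A small sketch of the pattern described above, run against a made-up file. The file contents and the names `first_line` and `df` are assumptions for illustration; passing stringsAsFactors = FALSE is shown as one possible way to sidestep factor levels, not as the advice given earlier in the thread.

```r
# Made-up input: one free-text line, then a header row, then data
infile <- tempfile(fileext = ".txt")
writeLines(c("run 42 metadata", "x y", "1 a", "2 b"), infile)

# Read only the first line, without parsing the rest of the file
first_line <- readLines(infile, n = 1)

# Skip that line and treat the next one as the header;
# stringsAsFactors = FALSE keeps text columns as character, so no levels arise
df <- read.table(infile, sep = "", skip = 1, header = TRUE,
                 stringsAsFactors = FALSE)
str(df)
```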
Re: [R] While loop working with TRUE/FALSE?
Hi

> Thanks to Berend and the others,
>
> I've found a solution which works fine for my problem. I have not just 2 vectors, but 4. The question is whether q1 and q2 are equal to w1 and w2. The computation time is very short, even for large data.
>
>     q1 <- c(9,5,1,5)
>     q2 <- c(9,2,1,5)
>     w1 <- c(9,4,4,4,5)
>     w2 <- c(9,4,4,4,5)
>     v <- vector()
>     for (i in 1:(length(q1))){
>       v[i] <- any((q1[i] == w1) & (q2[i] == w2))
>     }

If I understand correctly, you want to know whether any value in q1 is also in w1 and the corresponding value in q2 is in w2. Therefore

    any((q2[2] == w2) & (q1[2] == w1))

is true only if there are common elements in both vector pairs, which is what

    (q1 %in% w1) & (q2 %in% w2)
    [1]  TRUE FALSE FALSE  TRUE

does. For small vectors (several values) the timing will probably be similar; however, with moderate vectors of a few thousand values there is a considerable speedup.

    q1 <- sample(q1, 1, replace=T)
    q2 <- sample(q2, 1, replace=T)
    w1 <- sample(w1, 10, replace=T)
    w2 <- sample(w2, 10, replace=T)
    v2 <- vector()
    system.time({
    + for (i in 1:(length(q1))){
    + v2[i] <- any((q1[i] == w1) & (q2[i] == w2))
    + }})
       user  system elapsed
      34.36    1.69   36.16
    system.time(v <- ((q1 %in% w1) & (q2 %in% w2)))
       user  system elapsed
       0.01    0.00    0.02
    all.equal(v, v2)
    [1] TRUE

Regards
Petr

> best regards
>
> --
> View this message in context: http://r.789695.n4.nabble.com/While-loop-working-with-TRUE-FALSE-tp4348340p4351214.html
> Sent from the R help mailing list archive at Nabble.com.
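One caveat may be worth checking before swapping the loop for %in%. On the data from this thread the two forms agree, but in general the loop requires both matches at the same position of w1 and w2, whereas %in% tests each pair separately. The sketch below uses the thread's vectors plus a made-up counter-example; `w1b`, `w2b`, `pairwise` and `separate` are hypothetical names.

```r
q1 <- c(9, 5, 1, 5); q2 <- c(9, 2, 1, 5)
w1 <- c(9, 4, 4, 4, 5); w2 <- c(9, 4, 4, 4, 5)

# Loop version from the thread, written with vapply()
pairwise <- vapply(seq_along(q1),
                   function(i) any((q1[i] == w1) & (q2[i] == w2)),
                   logical(1))
# Vectorised version from the reply
separate <- (q1 %in% w1) & (q2 %in% w2)
identical(pairwise, separate)    # TRUE for this data

# Made-up counter-example: both values occur, but never at the same position
w1b <- c(1, 2); w2b <- c(2, 1)
any((1 == w1b) & (1 == w2b))     # FALSE
(1 %in% w1b) & (1 %in% w2b)      # TRUE
```

If positional pairing matters, one option is to compare pasted keys, for example paste(q1, q2) %in% paste(w1, w2).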
Re: [R] matrix element position: from length to dim
Also take a look at the arrayInd() function, which is what which() uses internally for the arr.ind = TRUE case.

Michael

On Thu, Feb 2, 2012 at 8:25 AM, Petr Savicky savi...@cs.cas.cz wrote:

> On Thu, Feb 02, 2012 at 02:08:37PM +0100, Ana wrote:
>> How can I convert a position along the length of a matrix into a position in dim?
>>
>>     a = matrix(c(1:999), nrow=9)
>>     which(a==87)  # position in length 1:length(a)
>>     87
>>     which(a==87, arr.ind=TRUE)  # position in dim
>>          row col
>>     [1,]   6  10
>
> Hi.
>
> Assume
>
>     d <- dim(a)
>     i <- 87
>
> Try the following two approaches.
>
> 1.
>
>     x <- rep(FALSE, times=prod(d))
>     x[i] <- TRUE
>     which(array(x, dim=d), arr.ind=TRUE)
>          row col
>     [1,]   6  10
>
> 2.
>
>     c((i - 1) %% d[1], (i - 1) %/% d[1]) + 1
>     [1]  6 10
>
> Hope this helps.
>
> Petr Savicky.
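A short sketch comparing arrayInd() with the arithmetic approach from the quoted reply, on the matrix from the question. The names `pos_builtin` and `pos_manual` are introduced here for illustration.

```r
a <- matrix(1:999, nrow = 9)
i <- which(a == 87)          # linear (column-major) index: 87

# Built-in conversion from linear index to (row, col)
pos_builtin <- arrayInd(i, dim(a))

# Manual conversion: R stores matrices column by column, so the row is the
# remainder and the column is the quotient of the zero-based index
d <- dim(a)
pos_manual <- c((i - 1) %% d[1], (i - 1) %/% d[1]) + 1

pos_builtin   # a 1 x 2 matrix holding 6 and 10
pos_manual    # 6 10
```

arrayInd() also accepts a vector of indices and arrays with more than two dimensions, which the manual formula does not cover as written.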
Re: [R] Writing out data
I believe connections were designed to do this as efficiently as possible by keeping the I/O path open rather than reopening it on every call, as write.table(append = TRUE) would do, though I may be wrong on the details: see ?connections. Prof Ripley has a good article about them in R News 1/1 -- http://cran.r-project.org/doc/Rnews/Rnews_2001-1.pdf -- but it's rather out of date.

Michael

On Thu, Feb 2, 2012 at 7:56 AM, Ivan Calandra ivan.calan...@u-bourgogne.fr wrote:
> Correct me if I'm wrong, but I think that write.csv() doesn't have an append argument; write.table() does though.
>
> Ivan
>
> --
> Ivan CALANDRA
> Université de Bourgogne
> UMR CNRS/uB 6282 Biogéosciences
> 6 Boulevard Gabriel
> 21000 Dijon, FRANCE
> +33(0)3.80.39.63.06
> ivan.calan...@u-bourgogne.fr
>
> On 02/02/12 13:43, Paul Hiemstra wrote:
>> On 02/02/2012 11:40 AM, Thomas wrote:
>>> What is the best way to write out comma separated data as a program is running (rather than waiting to the end using write.csv)? At the moment I'm doing this, but I guess it's not the most efficient. The data is in a column in the matrix postcount, and I'm using a loop to write out each of the 100 elements.
>>>
>>>   for (j in 1:100) {
>>>     cat(postcount[1, j], ",", file = filename, append = TRUE)
>>>   }
>>>   cat("\n", file = filename, append = TRUE)
>>>
>>> Thank you!
>>>
>>> Thomas
>>
>> Hi,
>>
>> write.csv also supports an append argument. Maybe that is faster than using cat.
>>
>> cheers,
>> Paul
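A hedged sketch of the connection-based approach: open the file once, write one comma-separated line per step, and close it at the end. The postcount matrix here is stand-in data and the temporary file name is an assumption made only so the example is self-contained.

```r
# Stand-in data (hypothetical): 3 rows of 100 values each.
postcount <- matrix(seq_len(300), nrow = 3)

# Hypothetical output path; a temp file keeps the example self-contained.
filename <- tempfile(fileext = ".csv")

# Open the connection once instead of reopening the file on every write.
con <- file(filename, open = "w")
for (i in seq_len(nrow(postcount))) {
  # Build one comma-separated line per row and write it immediately.
  writeLines(paste(postcount[i, ], collapse = ","), con)
}
close(con)

# Read the file back to check what was written.
written <- readLines(filename)
```

In a long-running program the writeLines() call would sit inside the main loop, so each result reaches the file as soon as it is computed; calling flush(con) after a write is an option if the file needs to be inspected while the program is still running.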
[R] pgfSweave doesn't lazyload my objects
Hi all,

I'm struggling a bit to get pgfSweave to lazy-load objects when compiling a .Rnw file for a second time. Caching works, except that on every run all objects get cached again. I've used cacheSweave, which works fine; all cached objects from code chunks with the option cache = TRUE are lazy-loaded. I've tried it on two machines ... I'm pretty sure I'm overlooking something obvious. Below are my .Rnw and sessionInfo. Pointers, suggestions, etc. are most welcome.

%%% RNW; test_png.Rnw %%
\documentclass{article}
\begin{document}
some bla bla text

<<large-chunk-no-cache, cache=false>>=
mm1 <- matrix(1:1e7, 1e3, 1e4)
@

<<c2>>=
print(length(mm1))
@

<<large-chunk-do-cache, cache=true>>=
mm2 <- matrix(1:1e7, 1e3, 1e4)
@

<<c4>>=
print(length(mm2))
@
\end{document}
% END RNW %%

I am running the following R command:

  pgfSweave('test_pgf.Rnw', compile.tex=F)

### SESSION INFO
R version 2.14.0 (2011-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base

other attached packages:
 [1] pgfSweave_1.2.1 tikzDevice_0.6.2 cacheSweave_0.6 formatR_0.3-4
 [5] optparse_0.9.4  getopt_1.17      highlight_0.3.1 parser_0.0-14
 [9] Rcpp_0.9.9      codetools_0.2-8  stashR_0.3-4    filehash_2.2

loaded via a namespace (and not attached):
[1] digest_0.5.1 grid_2.14.0
Re: [R] gee: suppress printout
I don't think it can be removed. A message like this has been coming out for several years, and there may be a good reason why it is there. Your best bet is probably to approach the package maintainer with a suggestion to alter the code.

Regards
Søren

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ginata86
Sent: 2 February 2012 06:38
To: r-help@r-project.org
Subject: Re: [R] gee: suppress printout

> I am using sink() to capture the output. However, it only suppresses "user's initial regression estimate" and still displays the following line:
>
>   Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
>
> Is there any way to suppress this one as well? I need to run this in a loop many times, and it's annoying to have it displayed each time.
[R] The less than (<) operator doesn't seem to perform as expected
The example here puzzles me. It seems like the < operator doesn't work as expected.

  l <- 0.6
  u <- seq(0.4, 0.7, 0.1)
  u
  [1] 0.4 0.5 0.6 0.7
  mygrid <- expand.grid(l = l, u = u)
  mygrid
      l   u
  1 0.6 0.4
  2 0.6 0.5
  3 0.6 0.6
  4 0.6 0.7
  mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
  mygridcollapsed
      l   u
  3 0.6 0.6
  4 0.6 0.7

In this little example I expect 'mygridcollapsed' to return only row 4, and for it to return row 3 seems wrong. The strange thing is that it seems to work if I start the u-sequence at 0.5.

  l <- 0.6
  u <- seq(0.5, 0.7, 0.1)
  u
  [1] 0.5 0.6 0.7
  mygrid <- expand.grid(l = l, u = u)
  mygrid
      l   u
  1 0.6 0.5
  2 0.6 0.6
  3 0.6 0.7
  mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
  mygridcollapsed
      l   u
  3 0.6 0.7

Maybe I'm missing something...

Best wishes
Jonas Hal
[R] How to retrieve a column name of a data frame
Hi,

I'd like to know how to retrieve a column name of a data frame. For instance:

  df = data.frame(c1 = c('a', 'b'), c2 = c(1, 2))
  df
    c1 c2
  1  a  1
  2  b  2

I would like to retrieve the name of the column that contains the value 2 (here, the column is c2).

Thanks for your help
[R] calculation of probability values from multivariate normal densities
Hi,

I would like to know whether there is any R function that allows calculation of probability values (between 0 and 1) from multivariate normal densities. I would be grateful for any output.

Cheers,
MG
[R] get mean of same elements in a data.frame
Hi,

I have the following data.frame:

  data.frame(x = c(1:10), y = rnorm(10, 2, 1), label = rep(c('a', 'b', 'c', 'd', 'e'), 2))

In this data.frame there is a label variable containing strings. Each string appears twice. Now I would like to have the mean of the corresponding x (and y) values for every unique label element. For the label 'a', for example, the x values are 1 and 6, so the resulting value should be 3.5.

How can I do this in R?
Re: [R] Troubles with stemming (tm + Snowball packages) under MacOS
The Sys.setenv(NOAWT=TRUE) code indeed solved my problem, which was exactly what Julien described. The key is that you have to deactivate AWT BEFORE loading RWeka/Snowball. If I do so, it still prints a few warning messages, but that should not affect anything. I am running the lsa package, which requires RWeka and Snowball. My R version is 2.14.1, under Mac OS X 10.6.8. My code snippet is below:

  dtm <- textmatrix(ldir, minWordLength=1, stopwords=stopwords_en, stemming=TRUE, language="english")
  Refreshing GOE props...
  ---Registering Weka Editors---
  Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): org.hsqldb.jdbcDriver - Warning, not in CLASSPATH?
  [KnowledgeFlow] Loading properties and plugins...
  [KnowledgeFlow] Initializing KF...

Julien Velcin wrote
> I have deactivated AWT (as described in http://r.789695.n4.nabble.com/Problem-with-Snowball-amp-RWeka-td3402126.html) with:
>
>   Sys.setenv(NOAWT=TRUE)
>
> The command tm_map(reuters, stemDocument) gives the following errors :
Re: [R] How to retrieve a column name of a data frame
  colnames(df)[2]

Michael

On Thu, Feb 2, 2012 at 10:31 AM, ikuzar raz...@hotmail.fr wrote:
> Hi, I'd like to know how to retrieve a column name of a data frame. For instance:
>
>   df = data.frame(c1 = c('a', 'b'), c2 = c(1, 2))
>   df
>     c1 c2
>   1  a  1
>   2  b  2
>
> I would like to retrieve the column name which value is 2 (here, the column is c2)
>
> thanks for your help
Re: [R] get mean of same elements in a data.frame
There are *many* ways, but here are two:

  df = data.frame(x = c(1:10), y = rnorm(10, 2, 1), label = rep(c('a', 'b', 'c', 'd', 'e'), 2))

  with(df, ave(x, label))           # Returns the group mean in each spot (useful if you want to add a group-mean column to df)
  with(df, tapply(x, label, mean))  # Probably more what you were looking for.

Michael

On Thu, Feb 2, 2012 at 10:47 AM, Martin Batholdy batho...@googlemail.com wrote:
> Hi, I have the following data.frame:
>
>   data.frame(x = c(1:10), y = rnorm(10, 2, 1), label = rep(c('a', 'b', 'c', 'd', 'e'), 2))
>
> in this data.frame there is a label-variable containing strings. Each string is represented two times. Now I would like to have the mean of the corresponding x (and y-values) for every unique label-element. For the label 'a' for example there is an x value of 1 and 6. So the resulting value should be 3.5.
>
> How can I do this in R?
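Since the question asks for the means of both x and y, a third option (not from the thread) is aggregate() with a formula, which handles several columns at once. This is only a sketch; set.seed() is added so the random y values are reproducible.

```r
set.seed(1)  # only so the random y column is reproducible

df <- data.frame(x = 1:10,
                 y = rnorm(10, 2, 1),
                 label = rep(c("a", "b", "c", "d", "e"), 2))

# One row per label, with the mean of x and of y in separate columns.
means <- aggregate(cbind(x, y) ~ label, data = df, FUN = mean)
print(means)
```

The result is an ordinary data frame with columns label, x and y, which can be merged back onto df if a group-mean column is wanted.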
Re: [R] The less than (<) operator doesn't seem to perform as expected
You need to back up a bit to see the root cause of the problem, which is that seq()'s calculations necessarily involve some roundoff error (since it works with 52 binary digits of precision):

  u <- seq(from=0.4, to=0.7, by=0.1)
  u - c(0.4, 0.5, 0.6, 0.7)
  [1] 0.000000e+00 0.000000e+00 1.110223e-16 0.000000e+00
  u - (4:7) * 0.1
  [1]  0.000000e+00  0.000000e+00  0.000000e+00 -1.110223e-16
  u - (4:7) / 10
  [1] 0.000000e+00 0.000000e+00 1.110223e-16 0.000000e+00
  u - cumsum(c(0.4, 0.1, 0.1, 0.1))
  [1]  0.000000e+00  0.000000e+00  0.000000e+00 -1.110223e-16

I find the easiest way around this sort of problem is to use integer sequences (use them as subscripts into your real sequence and do the tests on the subscripts).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jonas Hal
Sent: Thursday, February 02, 2012 2:01 AM
To: r-help@r-project.org
Subject: [R] The less than (<) operator doesnt seem to perform as expected

> The example here puzzles me. It seems like the < operator doesn't work as expected.
>
>   l <- 0.6
>   u <- seq(0.4, 0.7, 0.1)
>   u
>   [1] 0.4 0.5 0.6 0.7
>   mygrid <- expand.grid(l = l, u = u)
>   mygrid
>       l   u
>   1 0.6 0.4
>   2 0.6 0.5
>   3 0.6 0.6
>   4 0.6 0.7
>   mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
>   mygridcollapsed
>       l   u
>   3 0.6 0.6
>   4 0.6 0.7
>
> In this little example I expect 'mygridcollapsed' only to return row 4 and for it to return row 3 seems wrong. The strange thing is it seems to work if I start the u-sequence at 0.5.
>
>   l <- 0.6
>   u <- seq(0.5, 0.7, 0.1)
>   u
>   [1] 0.5 0.6 0.7
>   mygrid <- expand.grid(l = l, u = u)
>   mygrid
>       l   u
>   1 0.6 0.5
>   2 0.6 0.6
>   3 0.6 0.7
>   mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
>   mygridcollapsed
>       l   u
>   3 0.6 0.7
>
> Maybe I'm missing something...
>
> Best wishes
> Jonas Hal
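A minimal sketch of the integer-sequence idea, assuming the grid is measured in tenths so that 4:7 stands for 0.4 to 0.7. The names l_int, u_int and grid are made up for illustration; the comparison is done on exact integers and the values are scaled back to decimals only afterwards.

```r
# Work in tenths: 6 stands for 0.6, and 4:7 for 0.4, 0.5, 0.6, 0.7.
l_int <- 6L
u_int <- 4:7

# Build the grid on the integer scale, where comparisons are exact.
grid <- expand.grid(l = l_int, u = u_int)
grid <- grid[grid$l < grid$u, ]

# Convert back to the real scale only after the filtering is done.
grid$l <- grid$l / 10
grid$u <- grid$u / 10
print(grid)
```

Because 6 < 6 is exactly FALSE for integers, only the row for 0.7 survives the filter, which is the result the original poster expected.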
Re: [R] calculation of probability values from multivariate normal densities
I'm not sure what you mean by "probability values", but you can get a multivariate normal density from

  library(mvtnorm)
  ?dmvnorm

The same package also provides pmvnorm(), but one has to be slightly more comfortable handling CDFs in the multivariate case.

Michael

2012/2/2 Michał Góralski mgora...@ibch.poznan.pl:
> Hi, I would like to know, if there's any R function, which allows calculation of probability values (0,1) from multivariate normal densities. I would be grateful for any output.
>
> Cheers, MG
Re: [R] The less than (<) operator doesn't seem to perform as expected
This is R FAQ 7.31, about machine representation of floating point numbers.

  mygrid$u[3] - mygrid$l[3]
  [1] 1.110223e-16

So mygrid$l[3] < mygrid$u[3] is true, though the difference is very, very small and due solely to the limitations of computers.

Sarah

On Thu, Feb 2, 2012 at 5:00 AM, Jonas Hal j...@brf.dk wrote:
> The example here puzzles me. It seems like the < operator doesn't work as expected.
>
>   l <- 0.6
>   u <- seq(0.4, 0.7, 0.1)
>   u
>   [1] 0.4 0.5 0.6 0.7
>   mygrid <- expand.grid(l = l, u = u)
>   mygrid
>       l   u
>   1 0.6 0.4
>   2 0.6 0.5
>   3 0.6 0.6
>   4 0.6 0.7
>   mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
>   mygridcollapsed
>       l   u
>   3 0.6 0.6
>   4 0.6 0.7
>
> In this little example I expect 'mygridcollapsed' only to return row 4 and for it to return row 3 seems wrong. The strange thing is it seems to work if I start the u-sequence at 0.5.
>
>   l <- 0.6
>   u <- seq(0.5, 0.7, 0.1)
>   u
>   [1] 0.5 0.6 0.7
>   mygrid <- expand.grid(l = l, u = u)
>   mygrid
>       l   u
>   1 0.6 0.5
>   2 0.6 0.6
>   3 0.6 0.7
>   mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
>   mygridcollapsed
>       l   u
>   3 0.6 0.7
>
> Maybe I'm missing something...
>
> Best wishes
> Jonas Hal

--
Sarah Goslee
http://www.functionaldiversity.org
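In the spirit of FAQ 7.31, one hedged way to make the comparison robust is to require the difference to exceed a small tolerance instead of using a bare <. The tolerance of 1e-8 below is an illustrative assumption, not something from the thread, and strictly_less is a made-up name.

```r
l <- 0.6
u <- seq(0.4, 0.7, 0.1)

# Assumed tolerance: differences smaller than this are treated as "equal".
tol <- 1e-8

# u counts as strictly greater than l only if it exceeds l by more than tol.
strictly_less <- (u - l) > tol
print(strictly_less)

# all.equal() applies a similar tolerance when testing near-equality.
print(isTRUE(all.equal(u[3], l)))

mygrid <- expand.grid(l = l, u = u)
mygridcollapsed <- mygrid[(mygrid$u - mygrid$l) > tol, ]
print(mygridcollapsed)
```

The tolerance should be chosen well below the spacing of the grid (0.1 here) and well above the roundoff error (about 1e-16), so there is a wide range of safe values.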
Re: [R] The less than (<) operator doesn't seem to perform as expected
It's likely an infelicity of floating point representations (R FAQ 7.31) but, admittedly, not a case I would have expected to present itself. If you want it to work out as expected, try this:

  l <- 0.6
  u <- seq(0.4, 0.7, 0.1)
  l.int <- (6L) / 10
  u.int <- seq(4, 7) / 10
  l < u
  l.int < u.int

Michael

On Thu, Feb 2, 2012 at 5:00 AM, Jonas Hal j...@brf.dk wrote:
> The example here puzzles me. It seems like the < operator doesn't work as expected.
>
>   l <- 0.6
>   u <- seq(0.4, 0.7, 0.1)
>   u
>   [1] 0.4 0.5 0.6 0.7
>   mygrid <- expand.grid(l = l, u = u)
>   mygrid
>       l   u
>   1 0.6 0.4
>   2 0.6 0.5
>   3 0.6 0.6
>   4 0.6 0.7
>   mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
>   mygridcollapsed
>       l   u
>   3 0.6 0.6
>   4 0.6 0.7
>
> In this little example I expect 'mygridcollapsed' only to return row 4 and for it to return row 3 seems wrong. The strange thing is it seems to work if I start the u-sequence at 0.5.
>
>   l <- 0.6
>   u <- seq(0.5, 0.7, 0.1)
>   u
>   [1] 0.5 0.6 0.7
>   mygrid <- expand.grid(l = l, u = u)
>   mygrid
>       l   u
>   1 0.6 0.5
>   2 0.6 0.6
>   3 0.6 0.7
>   mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
>   mygridcollapsed
>       l   u
>   3 0.6 0.7
>
> Maybe I'm missing something...
>
> Best wishes
> Jonas Hal
Re: [R] How to retrieve a column name of a data frame
  colnames( df )[2]
  [1] "c2"

On Thursday 02 February 2012 07:31:33 ikuzar wrote:
> Hi, I'd like to know how to retrieve a column name of a data frame. For instance:
>
>   df = data.frame(c1 = c('a', 'b'), c2 = c(1, 2))
>   df
>     c1 c2
>   1  a  1
>   2  b  2
>
> I would like to retrieve the column name which value is 2 (here, the column is c2)
>
> thanks for your help
[R] glmer question
I would like to fit the following model:

  logit(p_{ij}) = \mu + a_i + b_j

where a_i ~ N(0, \sigma_a^2), b_j ~ N(0, \sigma_b^2) and \sigma_a = \sigma_b.

Is it possible to fit a model with such a constraint on the variance components in glmer?
Re: [R] How to retrieve a column name of a data frame
Sorry, it was not clear: my program has to return the column name corresponding to a value, for example 'b' (so the corresponding column is c1). How can I retrieve c1?

Thanks
[R] Organizing Large Datasets
Recently I've run into memory problems while using data.frames for a reasonably large dataset. I've solved those problems using arrays, and that prompted me to run a few benchmarks. I would like to share the results.

Let us start with the data. There are N subjects classified into G groups. These subjects are observed for T periods, and each observation consists of M variables. So, this is a standard panel. Suppose, though, that it's reasonably large, with hundreds of variables, tens of thousands of subjects, and over a decade of periods.

As I see it, there are three common ways to organize such data.

The first way is a single table, where each row is an observation (columns are Group, Subject, Period, plus all M variables). This is the standard way in econometrics software; let me call it the wide format.

The second way is to have a separate table for data, where each row is an observation of a particular variable, i.e. the columns are Subject, Period, Variable, Value, and to have a separate table with the classification of subjects into groups. This would be a standard way to organize data in a relational database (a star schema); let me call it the long format.

Finally, given that I'm talking about dense data, the data can be organized as a multidimensional array (subjects, periods, variables), plus one would need vectors with names for the elements of each of the dimensions.

I did two benchmarks: 1) creating random data in the respective format, and 2) aggregating over groups. As data.table can be faster than data.frame, I've included both. Here is the source code:

https://docs.google.com/uc?id=0B-uoYmSQJJvwNTdjNzljZjUtZmVhYS00ZTQ5LTgyMjEtYmJhMjg1OTBhOTU5

The results, in brief, are as follows. The long format (star schema) is dominated by all other options w.r.t. time and memory usage (no big surprise, R is not MySQL). Concerning the wide format, data.table is faster and more memory efficient than data.frame. Finally, the wide format with a data.table and the array format are similar in execution times, but the array format requires less memory. More importantly, if I need to do aggregations over variables, then the wide format is not that suitable anymore, whereas the array can be used just as before.

So, a data.cube package anyone?

Andrei.
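To make the array layout concrete, here is a small sketch of a (subjects, periods, variables) array aggregated over groups. It is not the benchmark code linked above; the sizes, the names panel, group and agg, and the use of apply() with tapply() are all illustrative assumptions.

```r
set.seed(42)  # only for reproducible random data

n_subj <- 6; n_per <- 3; n_var <- 2

# Dense panel stored as a 3-d array: subjects x periods x variables.
panel <- array(rnorm(n_subj * n_per * n_var),
               dim = c(n_subj, n_per, n_var),
               dimnames = list(subject  = paste0("s", 1:n_subj),
                               period   = paste0("t", 1:n_per),
                               variable = c("v1", "v2")))

# Separate vector classifying each subject into a group.
group <- rep(c("g1", "g2"), each = 3)

# For every (period, variable) cell, average the subjects within each group.
# The result has dimensions groups x periods x variables.
agg <- apply(panel, c(2, 3), function(x) tapply(x, group, mean))
print(dim(agg))
```

Aggregating over variables instead would only change the margins passed to apply(), which is the flexibility the post attributes to the array format. On data of the size described, apply() with tapply() may well be slower than the approaches that were actually benchmarked.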
Re: [R] How to retrieve a column name of a data frame
I'd use something like

  which(df == "b", arr.ind = TRUE)

which gives the column number in the second spot; this gives you

  colnames(df)[which(df == "b", arr.ind = TRUE)[2]]

Michael

On Thu, Feb 2, 2012 at 11:00 AM, ikuzar raz...@hotmail.fr wrote:
> Sorry, it was not clear: my program have to return column name corresponding to a value, for example 'b' (so, the corresponding column is c1) How to retrieve c1 ?
>
> Thanks
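A slightly more defensive sketch of the same idea: indexing the "col" column of the which() result by name keeps working when the value occurs more than once. The names hits and found are illustrative, and stringsAsFactors = FALSE is set explicitly so the behaviour does not depend on the R version.

```r
df <- data.frame(c1 = c("a", "b"), c2 = c(1, 2), stringsAsFactors = FALSE)

# Matrix of (row, col) positions where the value "b" occurs.
hits <- which(df == "b", arr.ind = TRUE)

# Translate the column positions into column names; unique() collapses repeats.
found <- unique(colnames(df)[hits[, "col"]])
print(found)

# The same lookup for the numeric value 2 from the original question.
found_num <- unique(colnames(df)[which(df == 2, arr.ind = TRUE)[, "col"]])
print(found_num)
```

If the value is absent, hits has zero rows and found is an empty character vector, which the calling code can test with length(found) == 0.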
Re: [R] Windows 7 installation of .qz package from SourceForge
Thank you Duncan, the second set of instructions worked. Both would probably have worked, but I had some code (below) that threw an error. It's designed to automatically set the internet connection to my Windows setting, so that R knows the proxy server used at my worksite. I had seen this suggestion in the archives of R-Help. However, this code in the R-2.14.0/etc/Rprofile.site file wasn't being recognized as a valid command, so I commented it out (the code works fine when I run it in the RGUI).

Thanks again, Paul

  # use the proxy settings for Windows/IE:
  setInternet2(TRUE)

Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Wednesday, February 01, 2012 11:48 AM
To: Prew, Paul
Cc: r-help@r-project.org
Subject: Re: [R] Windows 7 installation of .qz package from SourceForge

On 12-02-01 12:07 PM, Prew, Paul wrote:
> Hello, I'm trying to install the package metRology from SourceForge. I save the zip file metRology_0.9-06.tar.gz to my Windows 7 machine, and try to install the package using the RGUI: Packages > Install packages from local zip files > metRology_0.9-06.tar.gz. There's no .zip extension, but R seems to go to work on the installation with a couple warning messages and one error.

tar.gz files are source packages. .zip is a binary image of an installed package. To install a tar.gz, you can't use the menu items. If you have the necessary tools set up properly, it should work to do

  install.packages(".../path/to/metRology_0.9-06.tar.gz", type="source", repos=NULL)

from within R. If that fails (as it likely will if the package has C or Fortran code and you haven't installed the right compilers), you need to get the R tools from <CRAN mirror>/bin/windows/Rtools.

You can also install from outside R using

  R CMD INSTALL .../path/to/metRology_0.9-06.tar.gz

Duncan Murdoch

> The menu Packages > Load Package ... doesn't provide metRology as one of the choices.
>
> == R session window ==
>
>   utils:::menuInstallLocal()
>   Warning in unzip(zipname, exdir = dest) : error 1 in extracting from zip file
>   Warning in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
>     cannot open compressed file 'metRology_0.9-06.tar.gz/DESCRIPTION', probable reason 'No such file or directory'
>   Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
>     cannot open the connection
>
> Do I use a utility such as 7-zip to decompress the underlying files, then re-zip into a file with the .zip extension?
>
> Thank you, Paul
>
>   sessionInfo()
>   R version 2.14.0 (2011-10-31)
>   Platform: i386-pc-mingw32/i386 (32-bit)
>
>   locale:
>   [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252  LC_MONETARY=English_United States.1252
>   [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>
>   attached base packages:
>   [1] tcltk stats graphics grDevices utils datasets methods base
>
>   other attached packages:
>   [1] Rcmdr_1.8-1 car_2.0-12 nnet_7.3-1 MASS_7.3-16
>
>   loaded via a namespace (and not attached):
>   [1] tools_2.14.0
Re: [R] formula error inside function
Hi, I have fixed this. I replaced the = with -. I do not think this is the most elegant way, so if anyone else has any better ideas they would be very much apperitiaded. New lines: formulaGenotype - test_variable~Genotype + Gender formulaNull - test_variable~Gender Cheers, Hugh On 01/29/2012 07:46 PM, Hugh Morgan wrote: Hi, I have what I suppose is the same problem as this. I am using the linear mixed model function lme, and this does not seems to take the attribute model=TRUE at the end of the function. Is there a more general way of solving this problem? Is my description of the problem below correct (from my understanding of cran.r-project.org/doc/contrib/Fox-Companion/appendix-scope.pdf)? Using the test script: calculate_mixed_model_p - function() { dataObj=read.csv('dataMini.csv', header=TRUE, sep=,, dec=.) colnames(dataObj) attach(dataObj) library(nlme) formulaGenotype = test_variable~Genotype + Gender formulaNull = test_variable~Gender finalModelGenotype = lme(formulaGenotype, random=~1|Date, dataObj, na.action=na.omit, method=ML, keep.data = TRUE) finalModelNull = lme(formulaNull, random=~1|Date, dataObj, na.action=na.omit, method=ML) anovaModel = anova (finalModelGenotype,finalModelNull) print(anovaModel) } Fails with: Error in eval(expr, envir, enclos) : object 'formulaGenotype' not found I THINK function lme(...) constructs an object (finalModelGenotype) that has as part of it a link (pointer?) to object formulaGenotype. During construction this is in the function scope as it was passed to it. When finalModelGenotype is later passed to function anova(...) the link is still there but as the lme(...) scope no longer exists the link is now broken. Any help greatly apperitiated, Hugh PS, I tried to make this script self contained, and generated the data object with the following lines. It looks identical when you print it, but the lme function fails with error at [2]. If someone was to tell me what I am doing wrong I may be able to post easier scripts. 
[1] dataObj=data.frame(test_variable=c(23.0,20.2,23.8,25.6,24.6,22.7,27.7,27.5,23.5,22.8,22.3,20.9,26.6,23.8,24.5,26.8,23.2,29.9,23.3,22.5,22.2,27.2,28.1,24.5,22.7,20.7,26.2,27.1,22.0,22.2,26.7,28.5,22.2,22.1,25.3,21.7,29.3), Gender=c(Female,Female,Male,Male,Male,Female,Male,Male,Female,Female,Female,Female,Male,Male,Male,Male,Female,Male,Female,Female,Female,Male,Male,Male,Female,Female,Male,Male,Female,Female,Male,Male,Female,Female,Female,Female,Male), Genotype=c(10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,10028,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0), Assay.Date=c(01/07/2009,01/07/2009,01/07/2009,01/07/2009,01/07/2009,01/07/2009,01/07/2009,01/07/2009,01/07/2009,07/07/2010,07/07/2010,07/07/2010,01/07/2009,07/07/2010,07/07/2010,07/07/2010,02/06/2010,02/06/2010,02/06/2010,02/06/2010,02/06/2010,02/06/2010,02/06/2010,02/06/2010,17/06/2010,17/06/2010,17/06/2010,17/06/2010,16/06/2010,16/06/2010,16/06/2010,16/06/2010,22/06/2010,22/06/2010,22/06/2010,22/06/2010,22/06/2010), Weight=c(9.9,9.5,9.9,10,9.9,9.8,10.2,10.4,9.9,9.8,9.9,9.5,9.8,9.5,9.8,9.9,9.5,10,9.8,9.5,9.7,10,10.2,9.9,9.9,9.5,10,10,9.8,9.9,10.2,10.1,9.8,9.9,10.2,9.8,10) ) [2] Error in `rownames-`(`*tmp*`, value = c(1, 2, 3, 4, 5, 6, : attempt to set rownames on object with no dimensions In addition:Warning message: In Ops.factor(y[revOrder], Fitted) : - not meaningful for factors On 01/25/2012 01:25 PM, Terry Therneau wrote: I want use survfit() and basehaz() inside a function, but it doesn't work. Could you take a look at this problem. Thanks for your help. Your problem has to do with environments, and these lines fmla- as.formula(Surv(time, event) ~ Z1 + Z2) BaseFun- function(x){ start.coxph- coxph(x, phmmd) ... survfit(start.coxph) } Basefun(fmla) The survfit routine needs to reconstruct the model matrix, and by default in R this is done in the context where the model formula was first defined. 
Unfortunately this is outside the function, leading to problems -- your argument x is unknown in the outer environment. The solution is to add model=TRUE to the coxph call so that the model frame is saved and survfit doesn't have to do reconstruction. If you think this should work as is, well, so do I. I spent a lot of time on this issue a few months ago and finally threw in the towel. The interaction of environments with model.frame and model.matrix is subtle and far from obvious. (Just to be clear: I didn't say broken. Each aspect of the process has well thought out reasons.) The standard modeling functions lm, glm, etc. changed their defaults from model=F to model=T at some point. This costs some memory, but coxph may need to do the same. Terry T
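The environment behaviour Terry describes can be seen with a base-R analogue. The sketch below is an illustration only: it uses lm in place of coxph or lme, and the names fit_inside and plot_data are made up for the example. It shows that a formula remembers the environment where it was created, and that storing the model frame (model = TRUE) keeps the data available after the fitting function has returned.

```r
# A formula created here carries this environment with it.
fmla <- y ~ x

# Hypothetical helper: the data exist only inside the function.
fit_inside <- function(f) {
  plot_data <- data.frame(x = c(1, 2, 3, 4, 5),
                          y = c(2.1, 3.9, 6.2, 7.8, 10.1))
  # model = TRUE stores the model frame in the fitted object
  lm(f, data = plot_data, model = TRUE)
}

fit <- fit_inside(fmla)

# The formula inside the fit still points at the environment where fmla
# was created, not at the finished call to fit_inside().
same_env <- identical(environment(formula(fit)), environment(fmla))

# Because the model frame was saved, it can be recovered without plot_data.
saved_frame <- model.frame(fit)
```

Any later function that re-evaluates the formula looks in that creation environment, which is why objects local to the fitting function are no longer found unless the model frame was kept.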
Re: [R] Function to compute multi-response, multi-rater kappa?
Luk: Don't know if this solves your desire for an implementation in R, but the most general extension of Cohen's kappa for testing agreement that I'm aware of is the one based on multi-response randomized block permutation procedures (MRBP) developed by Paul Mielke and Ken Berry. They calculate a generalized measure of agreement that can be applied to nominal, ordinal, or continuous data. I know it is used for computing and testing Cohen's kappa for multiple raters with nominal data. I'm unsure whether it is easily applied to multiple responses at the same time as multiple raters, but it might be. Might check out Mielke and Berry (2007. Permutation Methods, 2nd ed, pp 150-166). We distribute a package of permutation software (Blossom) that computes all the multiresponse permutation procedure family of statistics including MRBP (we're in the process of porting it to an R package but it will be several months before it's ready to go). There are versions of MRPP (less complete) available in the vegan package for R, but I don't know whether it will do the randomized block variant required for Cohen's kappa. Brian Brian S. Cade, PhD U. S. Geological Survey Fort Collins Science Center 2150 Centre Ave., Bldg. C Fort Collins, CO 80526-8818 email: brian_c...@usgs.gov tel: 970 226-9326 From: Luk Arbuckle luk.arbuc...@gmail.com To: David Winsemius dwinsem...@comcast.net Cc: r-help@r-project.org Date: 02/01/2012 08:37 PM Subject: Re: [R] Function to compute multi-response, multi-rater kappa? Sent by: r-help-boun...@r-project.org Although interesting, Dave, this doesn't fit my problem. I want to measure the percentage agreement corrected for chance using an extension of kappa. The example data presented in the paper you linked to is considering an ordinal measure (ranked preference), whereas I'm looking to measure correlation between a nominal measure (agreement between non-ordered categories). 
The paper by Kraemer is cited over 100 times in Google Scholar, mostly in the health sciences, so I'm surprised it's not implemented in R. But I suppose this is a niche problem (multi-response version of kappa), or that there is some other extension to kappa, maybe in the social sciences, that I'm not aware of. Cheers, Luk Arbuckle On Wed, Feb 1, 2012 at 17:13, David Winsemius wrote: Searching on multiple raters attributes at the same site brings up http://finzi.psych.upenn.edu/R/library/smacof/doc/smacof.pdf (by Jan De Leeuw) Which has as one example multiple raters scoring different breads. On Feb 1, 2012, at 4:41 PM, Luk Arbuckle wrote: Thanks David, but those are not multi-response versions of the kappa. Extensions to multiple raters are common and well known. I am hoping someone familiar with multiple response extensions of kappa might see my post and be able to help. As I said, my search on cran has failed. I tried all the expected keywords, and looked through several kappa functions, but I don't see any that deal with the multi-response case as I've described it. Either it isn't available in R, or I'm looking in the wrong place. I did not intentionally double post, nor try to deceive your efforts to block double posting. I am not receiving my posts, contrary to my settings, so I rewrote the first one. I thought maybe it was blocked the first time because someone thought it wasn't an R question, so I changed the subject. Cheers, Luk Arbuckle On Wed, Feb 1, 2012 at 16:25, David Winsemius wrote: On Feb 1, 2012, at 3:13 PM, Luk Arbuckle wrote: I'm very sorry for double posting! My r-help setting Receive your own posts to the list? is set to Yes, and Mail delivery is Enabled. Yet I did not get a copy of my post (this message is a reply from my sent mail). I only learned of the double posting when I found it copied in an r-help archive. Again, my apologies. I actually had a chance to prevent that second posting. 
It looked familiar when viewed in the moderation queue and I took a quick look at what was in my inbox, but since you used a different subject line my search failed. Speaking of searching ... you are asked in the Posting Guide (that no one reads) to post the specifics of your own efforts. My first search with "multi-rater kappa" failed. My second search is here: http://search.r-project.org/cgi-bin/namazu.cgi?query=multiple+raters+kappa&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02 Having gotten more than one apparently on-target result with relatively minor effort, I see no point in my expending even more time. -- David. Luk Arbuckle On Wed, Feb 1, 2012 at 13:47, Luk Arbuckle wrote: I'm looking for a function in R that extends kappa to
Re: [R] time conversion from second to Y M D H M S format
Dear Uwe, Thanks for the reply. I have tried the format function that you suggested, format(time_t1, "%Y %m %d %H %M %S"), and I got

    > format(time_t1, "%Y %m %d %H %M %S")
    [1] "126230400" "126252000" "126273600" "126295200" "126316800" "126338400"

I think something is not working correctly.
Re: [R] Function to compute multi-response, multi-rater kappa?
This is very interesting, Brian, thanks. I was starting to wonder if the reason I can't find an implementation of Kraemer's approach might be that there are better methods. The key extension of kappa by Kraemer, for my purposes, was the jackknife estimate to improve on estimated standard errors (the standard errors by Fleiss are an asymptotic estimate). But given that my sample size and number of response categories are large it may not matter. I would like to test this, however, and I will certainly look at the references you cite to learn about this other approach. Cheers! Luk Arbuckle
Re: [R] Plotting bar graph over a geographical map
If you are willing to use base graphics instead of ggplot2 graphs, then look at the subplot function in the TeachingDemos package. One of the examples there shows adding multiple small bar graphs to a map. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of sjlabrie Sent: Tuesday, January 31, 2012 9:53 PM To: r-help@r-project.org Subject: [R] Plotting bar graph over a geographical map

Hi, I am looking for a way to plot bars on a map instead of the standard points. I have been using the ggplot2 and maps libraries. The points are added with the function geom_point. I know that there is a function geom_bar but I can't figure out how to use it. Thank you for your help, Simon

    ### R-code
    library(ggplot2)
    library(maps)
    measurements <- read.csv("all_podo.count.csv", header=T)
    allworld <- map_data("world")
    pdf("map.pdf")
    ggplot(measurements, aes(long, lat)) +
      geom_polygon(data = allworld, aes(x = long, y = lat, group = group),
                   colour = "grey70", fill = "grey70") +
      geom_point(aes(size = ref)) +
      opts(axis.title.x = theme_blank(), axis.title.y = theme_blank()) +
      geom_bar(aes(y = normcount))
    dev.off()
    ###
[R] Vertical string with horizontal letters
I'm trying to format text on a plot such that the string is vertical but the letters are horizontal. I tried

    text(1,1,label="output", srt=270)

This gives the string rotation I want, but it rotates the entire output so the letters are also rotated. I've also tried

    text(1,1,label="output", srt=270, crt=270)

to no avail. par()$crt doesn't seem to affect text? The format I want is demonstrated below:

    o
    u
    t
    p
    u
    t

Thanks.
Re: [R] sqldf for Very Large Tab Delimited Files
On Thu, Feb 2, 2012 at 3:11 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Feb 1, 2012 at 11:57 PM, HC hca...@yahoo.co.in wrote: Hi All, I have a very (very) large tab-delimited text file without headers. There are only 8 columns and millions of rows. I want to make numerous pieces of this file by sub-setting it for individual stations. Station is given in the first column. I am trying to learn and use the sqldf package for this but am stuck in a couple of places. To simulate my requirement, I have taken the iris dataset as an example and have done the following: (1) create a tab-delimited file without headers. (2) read it using the read.csv.sql command (3) write the result of a query, getting the first 10 records. Here is the reproducible code that I am trying:

    # Text data file
    write.table(iris, "irisNoH.txt", sep = "\t", quote = FALSE, col.names=FALSE, row.names = FALSE)
    # create an empty database (can skip this step if database already exists)
    sqldf("attach myTestdbT as new")
    f1 <- file("irisNoH.txt")
    attr(f1, "file.format") <- list(header=FALSE, sep="\t")
    # read into table called irisTab in the mytestdb sqlite database
    read.csv.sql("irisNoH.txt", sql = "create table main.irisTab1 as select * from file", dbname = "mytestdb")
    res1 <- sqldf("select * from main.irisTab1 limit 10", dbname = "mytestdb")
    write.table(res1, "iris10.txt", sep = "\t", quote = FALSE, col.names=FALSE, row.names = FALSE)
    # For querying records of a particular species - unresolved problems
    #a1 <- "virginica"
    #attr(f1, "names") <- c("A1","A2","A3","A4","A5")
    #res2 <- fn$sqldf("select * from main.irisTab1 where A5 = '$a1'")

In the above, I am not able to: (1) assign the names to various columns (2) query for a particular value of a column; in this case for a particular species, say virginica (3) I guess fn$sqldf can do the job but it requires assigning column names. Any help would be most appreciated.
Ignoring your iris file for a moment, to query the 5th column (getting its name via sql rather than via R) we can do this:

    library(sqldf)
    species <- "virginica"
    nms <- names(dbGetQuery(con, "select * from iris limit 0"))
    fn$dbGetQuery(con, "select * from iris where `nms[5]` = '$species' limit 3")

Now, sqldf is best used when you are getting the data from R but if you want to store it in a database and just leave it there then you might be better off using RSQLite directly like this (the eol = "\r\n" in the dbWriteTable statement was needed on my Windows system but you may not need that depending on your platform):

    write.table(iris, "irisNoH.txt", sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)
    library(sqldf)
    library(RSQLite)
    con <- dbConnect(SQLite(), dbname = "mytestdb")
    dbWriteTable(con, "iris", "irisNoH.txt", sep = "\t", eol = "\r\n")
    species <- "virginica"
    nms <- names(dbGetQuery(con, "select * from iris limit 0"))
    fn$dbGetQuery(con, "select * from iris where `nms[5]` = '$species' limit 3")
    dbDisconnect(con)

There seems to have been a pasting error here. The first part was intended to show how to do this using sqldf and the second using RSQLite. Thus the first part was intended to be:

    library(sqldf)
    species <- "virginica"
    # obviously we could just do nms <- names(iris) but to get
    # names from database instead
    nms <- names(dbGetQuery(con, "select * from iris limit 0"))
    # use 5th column
    fn$sqldf("select * from iris where `nms[5]` = '$species' limit 3")

and the second part that illustrates RSQLite was ok. Note that fn$ comes from the gsubfn package which sqldf loads. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
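For the original goal of cutting a large headerless tab-delimited file into one piece per station, a base-R alternative may also be worth trying. To be clear, this is not the sqldf or RSQLite approach discussed in this thread: it is a minimal sketch that reads the file in chunks and appends each line to a file named after its first field. The function name split_by_first_column, the chunk size, and the iris stand-in data (with Species moved to the first column to play the role of the station) are all assumptions made for illustration.

```r
# Hypothetical helper: stream a tab-delimited file and write one output
# file per distinct value of the first column.
split_by_first_column <- function(path, out_dir, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    # the key is everything before the first tab
    keys <- sub("\t.*$", "", lines)
    for (key in unique(keys)) {
      cat(lines[keys == key],
          file = file.path(out_dir, paste0(key, ".txt")),
          sep = "\n", append = TRUE)
    }
  }
  invisible(NULL)
}

# Stand-in data: iris with Species as the first column, no header.
demo_file <- tempfile(fileext = ".txt")
demo_dir <- tempfile("stations")
dir.create(demo_dir)
write.table(iris[, c(5, 1:4)], demo_file, sep = "\t", quote = FALSE,
            col.names = FALSE, row.names = FALSE)

split_by_first_column(demo_file, demo_dir, chunk_size = 40L)
pieces <- sort(list.files(demo_dir))
```

Only one chunk is held in memory at a time, so the approach does not depend on the whole file fitting in RAM, at the cost of reopening an output file for each key in each chunk.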
Re: [R] Vertical string with horizontal letters
One possible solution is to use strsplit to break on each character and then paste to put in a \n after each character. Then when you plot, the text should be in the format you desire.

    x <- "output"
    y <- unlist(strsplit(x, NULL))
    p <- cat(paste(y, collapse="\n"))
    plot.new()
    text(.5, .5, paste(y, collapse="\n"))

Cheers, Tyler
Re: [R] The less than (<) operator doesnt seem to perform as expected
On Thu, Feb 02, 2012 at 10:00:58AM +, Jonas Hal wrote: The example here puzzles me. It seems like the < operator doesn't work as expected.

    > l <- 0.6
    > u <- seq(0.4, 0.7, 0.1)
    > u
    [1] 0.4 0.5 0.6 0.7
    > mygrid <- expand.grid("l" = l, "u" = u)
    > mygrid
        l   u
    1 0.6 0.4
    2 0.6 0.5
    3 0.6 0.6
    4 0.6 0.7
    > mygridcollapsed <- mygrid[mygrid$l < mygrid$u, ]
    > mygridcollapsed
        l   u
    3 0.6 0.6
    4 0.6 0.7

In this little example I expect 'mygridcollapsed' only to return row 4 and for it to return row 3 seems wrong. The strange thing is it seems to work if I start the u-sequence at 0.5.

Hi. As others pointed out, the problem is the different rounding error of 0.6 and seq(0.4, 0.7, 0.1)[3]. Try

    print(0.6, digits=20)
    [1] 0.59999999999999997780
    print(seq(0.4, 0.7, 0.1)[3], digits=20)
    [1] 0.60000000000000008882

Use round(, digits=1) to force the same rounding in seq(0.4, 0.7, 0.1) and in c(0.4, 0.5, 0.6, 0.7):

    round(seq(0.4, 0.7, 0.1), digits=1) == c(0.4, 0.5, 0.6, 0.7)

Hope this helps. Petr Savicky.
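Besides rounding, another common way to deal with this is to compare with a tolerance. The sketch below reuses l and u from the question; the tolerance value 1e-8 is an assumption for illustration and should be chosen to suit the scale of the data.

```r
l <- 0.6
u <- seq(0.4, 0.7, 0.1)

# Exact comparison: the third element is a hair above 0.6, so it passes "l < u".
exact <- l < u

# Tolerant comparison: require u to exceed l by more than a small tolerance.
tol <- 1e-8  # assumed tolerance
tolerant <- (u - l) > tol

# all.equal() treats the two representations of 0.6 as equal.
nearly_equal <- isTRUE(all.equal(u[3], l))
```

With the tolerant comparison only the fourth element (0.7) counts as strictly greater than 0.6, which is the result the original poster expected.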
Re: [R] Vertical string with horizontal letters
There are only a few graphics devices that honor the 'crt' setting to rotate characters differently from the string rotation (postscript is the only one I know of, and then not always). For your specific case you could do something like:

    text(1,1, paste( unlist(strsplit('output','')), collapse='\n'), adj=c(0,1))

You could use gsub instead of the paste and strsplit, but it adds an extra line feed at the beginning and end, or use:

    gsub('(?<=.)(?=.)','\n','output', perl=TRUE)

You may also want to play around a little with the adj=c(0,1) to get the positioning that you want.
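The strsplit and paste idea from this thread can be wrapped in a small helper. This is only a sketch: the name stack_letters is made up here, and the plotting call is guarded so that it runs only in an interactive session.

```r
# Hypothetical helper: put each character of a string on its own line.
stack_letters <- function(s) {
  paste(strsplit(s, "", fixed = TRUE)[[1]], collapse = "\n")
}

label <- stack_letters("output")

# Draw the stacked label in the middle of an empty plot.
if (interactive()) {
  plot.new()
  text(0.5, 0.5, label)
}
```

Because the label is an ordinary string with embedded newlines, it can be passed to text(), mtext() or title() without touching srt or crt.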
Re: [R] Vertical string with horizontal letters
I apologize for the improperly formatted submission. I had my hotmail set to plain text instead of rich text.

    x <- "output"
    y <- unlist(strsplit(x, NULL))
    plot.new()
    text(.5, .5, paste(y, collapse="\n"))
Re: [R] Probit regression with limited parameter space
[cc'ing back to r-help] On 12-02-02 01:56 PM, Sally Luo wrote: I tried to adapt your code to my model and got the results as below. I don't know how to fix the warning messages. It says rearrange the lower (or upper) bounds to match 'start'. The warning is overly conservative in this case. I should work on engineering the package so that it handles this better. You can disregard them. In answer to your previous questions: * size refers to the number of trials per observation (1, if you have binary data) * you've got the form of the lower and upper bounds right. * you've got the formula in 'parameters' right -- this builds a linear model (using R's model.matrix) on the probit scale based on the 8 parameters And two of the estimates for my restricted parameters are on the boundary. The warning message says the variance-covariance calculations may be unreliable. Those parameters are the ones of interest to my study. Can I still make inferences using the p-values reported by mle2 in this case? That's quite tricky unfortunately, and it isn't a problem that's specific to the mle2 package. The basic issue is that the whole derivation of the multivariate normal sampling distribution of the maximum likelihood estimator depends on the maximum likelihood being an interior local maximum (and hence having a negative-definite hessian, or a positive-definite information matrix), which is untrue on the boundary -- the Wikipedia article on maximum likelihood mentions this issue, for example http://en.wikipedia.org/wiki/Maximum_likelihood Perhaps someone here can suggest an approach (although it gets outside the scope of R help, or you can ask on http://stats.stackexchange.com ... Thanks for your help. 
Sally

    > mle2fit <- mle2(y~dbinom(pnorm(pprobit),size=1),
    +     parameters=list(pprobit~x1+x2+x3+x4+x5+x6+x7+x8),
    +     start=list(pprobit=0),
    +     optimizer="nlminb",
    +     lower=c(-Inf,-1,-1,-1,-Inf,-Inf,-Inf,-Inf,-Inf),
    +     upper=c(Inf,1,1,1,Inf,Inf,Inf,Inf,Inf),
    +     data=d)
    Warning messages:
    1: In fix_order(call$lower, "lower bounds", -Inf) :
      lower bounds not named: rearranging to match 'start'
    2: In fix_order(call$upper, "upper bounds", Inf) :
      upper bounds not named: rearranging to match 'start'
    3: In mle2(y ~ dbinom(pnorm(pprobit), size = 1), parameters = list(pprobit ~ :
      some parameters are on the boundary: variance-covariance calculations based on Hessian may be unreliable

On Wed, Feb 1, 2012 at 11:16 PM, Sally Luo shali...@gmail.com wrote: Prof. Bolker, Thanks a lot for your reply. In my model, I have 9 explanatory variables and I need to restrict the range of parameters 2-4 to (-1,1). I tried to modify the univariate probit example you gave in your reply, however, I could not get through. Specifically, I am not sure what 'pprobit' represents in your code? How should I code this part if I have more than one variable? Also does size refer to the number of parameters? Since only 3 parameters need to be restricted in my model, should I write lower=c(-Inf, -1,-1,-1, -Inf, -Inf, -Inf, -Inf, -Inf) and upper=c(Inf, 1,1,1, Inf, Inf, Inf, Inf, Inf)? Thanks again for your kind help. Best, Sally

On Wed, Feb 1, 2012 at 7:19 AM, Ben Bolker bbol...@gmail.com wrote: Sally Luo shali623 at gmail.com writes: Dear R helpers, I need to estimate a probit model with box constraints placed on several of the model parameters. I have the following two questions: 1) How are the standard errors calculated in glm (family=binomial(link="probit"))? I ran a typical probit model using the glm probit link and the nlminb function with my own coding of the loglikelihood, separately. As nlminb does not produce the hessian matrix, I used hessian (numDeriv) to calculate it. 
However, the standard errors calculated using the hessian function are quite different from the ones generated by the glm function, although the parameter estimates are very close. I was wondering what makes this difference in the estimation of standard errors and how this computation is carried out in glm. 2) Does anyone know how to estimate a constrained probit model in R (to be specific, I need to restrain the range of three parameters to [-1,1])? Among the optimization functions, so far nlminb and spg work for my problem, but neither produces a hessian matrix. As I mentioned above, if I use the hessian function and calculate standard errors manually, the standard errors seem not right.

I'm a little biased, but I think the bbmle package is the easiest way to get this done -- it provides convenient wrappers for a range of optimizers including nlminb. I would warn however that you should be very careful interpreting the meaning of the Hessian matrix if some of your parameters lie on the boundary of the feasible space ...

    set.seed(101)
    x <- runif(100)
    p <- pnorm(1+3*x)
    y <- rbinom(100,p,size=1)
    d
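For readers who want to see the mechanics without bbmle, here is a minimal base-R sketch of a box-constrained probit fit. It is not the mle2 call from this thread: it swaps in optim with method "L-BFGS-B", uses a single covariate with the simulated data from Ben's example, and the names negloglik and on_boundary are made up for illustration. The slope is restricted to [-1, 1] while the intercept is left free.

```r
set.seed(101)
x <- runif(100)
y <- rbinom(100, size = 1, prob = pnorm(1 + 3 * x))

# Negative log-likelihood of a probit model with intercept beta[1] and
# slope beta[2]; log.p = TRUE keeps the value finite for extreme eta.
negloglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * pnorm(eta, log.p = TRUE) +
       (1 - y) * pnorm(eta, lower.tail = FALSE, log.p = TRUE))
}

fit <- optim(c(0, 0), negloglik, method = "L-BFGS-B",
             lower = c(-Inf, -1), upper = c(Inf, 1), hessian = TRUE)

# Flag a slope estimate sitting on (or numerically at) a bound.
on_boundary <- abs(fit$par[2]) >= 1 - 1e-6
```

The Hessian returned by optim is a finite-difference approximation at the reported estimate. As noted in the thread, standard errors derived from it should not be trusted when on_boundary is TRUE, because the usual asymptotic argument assumes an interior maximum.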
[R] R-Project at university.
Dear reader, I'm an engineering student at the Silesian University of Technology in Gliwice, Poland. My field of study is Technology and Mechanical Engineering (integrated manufacturing systems), and I also hold a Bachelor's degree in Automation and Robotics. I have a few questions about the R-Project. As far as I can tell from your website, the program appears to be free to use, which caught my eye. Does that mean the program can be used by any degree student, for instance as a learner or a teacher? Are there any limitations on how the program can be used? For example, if I wanted to compile a *.exe program file using R, could that be done without any cost? I would also like further information on the terms and conditions, including the license, and any other useful information about using the R-Project as a learning tool for university students and projects. Hope to hear from you soon, thank you for your time. Sincerely, Karol Porwol
[R] Predict function
I've created a linear model and am trying to use the predict function to predict the outcome of a sports game. I have four explanatory variables a, b, c, d, where a and b relate to the home team and c and d relate to the away team. I'd like to know the probability that the home team wins (assuming no draws). Many thanks, David
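One possible way to get a win probability, offered as a sketch rather than as the poster's actual model: if the outcome is recorded as 0/1 (home win or not), a logistic regression fitted with glm returns probabilities from predict(..., type = "response"). Everything below is an assumption for illustration, including the simulated data, the column name home_win, and the coefficients used to generate it.

```r
set.seed(1)
n <- 200

# Simulated stand-in data: a, b describe the home team; c, d the away team.
games <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
games$home_win <- rbinom(n, size = 1,
                         prob = plogis(0.3 + games$a - games$c))

# Logistic regression of the binary outcome on the four predictors.
fit <- glm(home_win ~ a + b + c + d, family = binomial, data = games)

# Predicted probability of a home win for one hypothetical fixture.
new_game <- data.frame(a = 0.5, b = 0, c = -0.2, d = 0.1)
p_home <- predict(fit, newdata = new_game, type = "response")
```

A plain lm fit predicts the expected value of the response, which is not guaranteed to lie between 0 and 1; the logistic link is what keeps p_home a valid probability.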
Re: [R] kernel smoothing of disease rates at locations
Is it possible to apply a kernel smoothing regression whose estimator or indeed the confidence intervals cannot take negative values or values greater than 1? Best regards, Ioanna
[R] Two-Way PERMANOVA with Repeated Measurements
Hello, I want to perform a permanova where the first factor, called Treatment, has four levels. The second factor involves sampling the same research plots for four consecutive years, hence the repeated measurements. I have been able to use the adonis function from the package vegan to run this analysis. Code below:

    TC.perMANOVA.adonis <- adonis(TC.PerMANOVA ~ Treatment*Year, data=TC.PerMANOVA.ENV, permutations=99, method="bray", strata = NULL)

However, my concern is that this does not take into account that Year is a repeated measurement on the same research plots. Any suggestions would be appreciated.
Re: [R] time conversion from second to Y M D H M S format
It works for me as well so there's something funny on your end: please run the following *verbatim* (in a vanilla R session):

    sink("ForRHelp.txt")
    print(sessionInfo())
    cat("\n")
    print(.Platform)
    time <- as.POSIXct(c(126230400, 126252000, 126273600), origin="2005-01-01", tz="GMT")
    print(time)
    cat(format(time[1], "%Y %m %d %H %M %S"), "\n")
    cat(format(time[2], "%Y %m %d %H %M %S"), "\n")
    cat(format(time[3], "%Y %m %d %H %M %S"), "\n")
    sink()
    print(paste("Text file in", getwd()))

and send the resulting txt file to the list (so we can see exactly your system config and what not). Michael

On Thu, Feb 2, 2012 at 11:57 AM, uday uday_143...@hotmail.com wrote: Dear Uwe, Thanks for the reply. I have tried the format function that you suggested, format(time_t1, "%Y %m %d %H %M %S"), and I got

    format(time_t1, "%Y %m %d %H %M %S")
    [1] 126230400 126252000 126273600 126295200 126316800 126338400

I think something is not working correctly.
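For reference, a minimal sketch of the conversion being discussed, using the three values from the script above. It assumes, as that script does, that the seconds are counted from 2005-01-01 GMT; the names `secs`, `stamps` and `formatted` are only illustrative. The class check is there because one possible cause of output like that quoted above (a guess from what is shown, not something confirmed in the thread) is that the vector handed to format() is still plain numeric rather than POSIXct.

```r
# Seconds since the assumed origin 2005-01-01 00:00:00 GMT
secs <- c(126230400, 126252000, 126273600)

# Convert to date-times; without this step format() only sees numbers
stamps <- as.POSIXct(secs, origin = "2005-01-01", tz = "GMT")

print(class(secs))    # plain numeric
print(class(stamps))  # POSIXct / POSIXt

# Year, month, day, hour, minute, second separated by spaces
formatted <- format(stamps, "%Y %m %d %H %M %S")
print(formatted)
```

If class(time_t1) reports "numeric", converting it with as.POSIXct() first, as above, would be the thing to try.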
[R] Fiedler
Hi. I am looking for a function in R for computing the Fiedler vector of a graph (the eigenvector associated with the second smallest eigenvalue of the Laplacian of the graph). Alternatively, I am searching for an efficient method to compute just a few eigenvalues/vectors of a matrix (the smallest). Many thanks. Massimo Franceschet
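A minimal base-R sketch of the definition in the question, assuming a small undirected graph given as an edge list (the 4-node path graph and all variable names here are made up for illustration). It uses the dense eigen() routine, so it computes every eigenpair and is only reasonable for small graphs; it is not the efficient few-eigenvalues method the question also asks about.

```r
# Path graph 1 - 2 - 3 - 4, stored as a two-column edge list
edges <- rbind(c(1, 2), c(2, 3), c(3, 4))
n <- 4

# Symmetric adjacency matrix built by matrix indexing
A <- matrix(0, n, n)
A[edges] <- 1
A[edges[, 2:1]] <- 1

# Graph Laplacian: degree matrix minus adjacency matrix
L <- diag(rowSums(A)) - A

# For symmetric input eigen() returns eigenvalues in decreasing order,
# so the second smallest one sits at position n - 1
e <- eigen(L, symmetric = TRUE)
fiedler_value  <- e$values[n - 1]
fiedler_vector <- e$vectors[, n - 1]

print(fiedler_value)
print(fiedler_vector)
```

The signs of the entries of the Fiedler vector split the path into its two halves, which is the usual reason for wanting it.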
Re: [R] kernel smoothing of disease rates at locations
Caveat: Both of the following claims are subject to verification by true experts, which I am not. But I believe:

1. If all values being smoothed are positive, then the smoother must be also. If there are negative values, this is no longer true, and your question needs much more detail to get an answer.

2. Just restrict the (pointwise) CI's to your desired range if they fall outside of it. This assumes that values outside the range cannot occur. If that assumption is wrong, again you will need to provide much greater detail.

-- Bert

On Thu, Feb 2, 2012 at 10:09 AM, ioanna ii54...@msn.com wrote: Is it possible to apply a kernel smoothing regression whose estimator or indeed the confidence intervals cannot take negative values or values greater than 1? Best regards, Ioanna

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
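The two points above can be sketched as follows. This is only an illustration on simulated rates: the data, the bandwidth and the 0.15 interval half-width are all made up, and clip01() is a hypothetical helper. Point 1 is shown with stats::ksmooth(), a local-average smoother with non-negative weights, so its fitted values stay inside the range of the data; that property need not hold for other smoothers such as local-linear fits.

```r
set.seed(42)

# Simulated disease rates in [0, 1] at 60 locations (illustrative only)
loc  <- seq(0, 10, length.out = 60)
rate <- pmin(pmax(0.1 + 0.05 * sin(loc) + rnorm(60, sd = 0.05), 0), 1)

# Point 1: a kernel-weighted average of values in [0, 1] stays in [0, 1]
fit <- ksmooth(loc, rate, kernel = "normal", bandwidth = 2)
print(range(fit$y))

# Point 2: truncate pointwise interval limits to the admissible range
clip01 <- function(v) pmin(pmax(v, 0), 1)
lower <- clip01(fit$y - 0.15)  # 0.15 is a made-up half-width
upper <- clip01(fit$y + 0.15)
print(range(lower))
print(range(upper))
```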
Re: [R] time conversion from second to Y M D H M S format
On 02-02-2012, at 19:23, R. Michael Weylandt wrote: It works for me as well so there's something funny on your end: please run the following *verbatim* (in a vanilla R session):

    sink("ForRHelp.txt")
    print(sessionInfo())
    cat("\n")
    print(.Platform)
    time <- as.POSIXct(c(126230400, 126252000, 126273600), origin="2005-01-01", tz="GMT")
    print(time)
    cat(format(time[1], "%Y %m %d %H %M %S"), "\n")
    cat(format(time[2], "%Y %m %d %H %M %S"), "\n")
    cat(format(time[3], "%Y %m %d %H %M %S"), "\n")
    sink()
    print(paste("Text file in", getwd()))

and send the resulting txt file to the list (so we can see exactly your system config and what not). Michael

I appear to have the same or similar problem on Mac OS X 10.6.8. I ran the above script with R --vanilla. The result is

    R version 2.14.1 Patched (2012-01-30 r58238)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
    [1] en_GB/en_GB/en_GB/C/en_GB/en_GB

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    $OS.type
    [1] unix
    $file.sep
    [1] /
    $dynlib.ext
    [1] .so
    $GUI
    [1] X11
    $endian
    [1] little
    $pkgType
    [1] mac.binary.leopard
    $path.sep
    [1] :
    $r_arch
    [1] x86_64

    [1] 2009-01-01 00:00:00 GMT 2009-01-01 06:00:00 GMT
    [3] 2009-01-01 12:00:00 GMT
    2009 01 01 00 00 00
    2009 01 01 06 00 00
    2009 01 01 12 00 00

Berend
Re: [R] R-Project at university.
Disclaimer: I am not a lawyer, so this all should be verified elsewhere, but as best I understand it (and I would welcome verification by someone who knows more about this): The R-Project (broadly taken) is an extensive collection of packages plus a core interpreter. The interpreter, the base packages, and most available add-on packages are licensed under the GPL (GNU General Public License). Consequently they are free to use, at no charge, for anyone; however, commercial redistribution is trickier since the GPL is copyleft. If you don't have any intent to redistribute (i.e., to write R code to give to anyone else) the license questions almost certainly don't apply to you. If you are willing to put your code under a widely accepted open-source license, it is quite easy to redistribute and the R-Project (in the form of CRAN) provides a powerful platform for doing so. Much more information about this can be found here: https://www.gnu.org/licenses/gpl-faq.html

Certainly, in my experience, it is widely used by faculty and students in an academic context with no legal worries. R is an interpreted language, so one can't make executables from it. Anything else you want to do, you can do for free. Including ordering pizza! (though I imagine one would still be expected to pay for the pizza) Michael

On Thu, Feb 2, 2012 at 1:28 PM, Sylhetrin sylhet...@gmail.com wrote: Dear reader, I'm an engineering student at the Silesian University of Technology in Gliwice, Poland. My field of study is Technology and Mechanical Engineering on Integrated process of manufacturing systems, and I also hold a Bachelor's degree in Automation and Robotics.

I have a few questions about the R-Project. As far as I can tell from your website, the program appears to be free to use, which caught my eye. Does that mean the program (r-project) can be used by any degree student, for instance as a learner or a teacher? On the other hand, are there any limitations on how the program can be used? For example, if I wanted to compile an *.exe program file using the R program, could that be done without any cost? I would also like further information on terms and conditions, including the license, and any other useful information about using r-project as a learning tool for university students and projects. Hope to hear from you soon, thank you for your time. Sincerely, Karol Porwol
Re: [R] Probit regression with limited parameter space
[cc'ing back to r-help again -- I do this so the answers can be archived and viewed by others]

On 12-02-02 02:41 PM, Sally Luo wrote: Prof. Bolker, Thanks for your quick reply and detailed explanation. I also ran the unrestricted model using

    glmfit <- glm(y~x1+x2+x3+x4+x5+x6+x7+x8, family=binomial(link="probit"), data=d)

However, the results I got from glm and mle2 (both for the unrestricted model) are not very similar (please see below). In your earlier example, both glm and mle2 produce almost the same estimation results. I just hope to figure out what might cause the discrepancy in the estimation results I've got.

    coef(summary(glmfit))
                    Estimate   Std. Error    z value     Pr(>|z|)
    (Intercept) -0.853900059 0.2464179864 -3.4652505 5.297377e-04
    x1           1.627125691 0.3076174699  5.2894450 1.226881e-07
    x2          -0.092716326 0.5229866504 -0.1772824 8.592866e-01
    x3          -3.301509522 0.9169991843 -3.6003407 3.178004e-04
    x4           7.187483436 2.2135961171  3.2469715 1.166401e-03
    x5          -0.002544181 0.0112740324 -0.2256673 8.214602e-01
    x6           6.978374268 2.2347939216  3.1226030 1.792594e-03
    x7          -0.009832379 0.0113807583 -0.8639476 3.876167e-01
    x8          -0.001252075 0.0002304789 -5.4324941 5.557178e-08

    coef(summary(mle2fit))
                            Estimate  Std. Error     z value        Pr(z)
    pprobit.(Intercept) -0.603492668 0.230117071  -2.6225463 8.727541e-03
    pprobit.x1           1.645984346 0.288479906   5.7057158 1.158552e-08
    pprobit.x2          -0.157361533 0.523048376  -0.3008546 7.635253e-01
    pprobit.x3          -3.935203692 0.932692587  -4.2191862 2.451857e-05
    pprobit.x4           7.512701611 0.062911076 119.4177885 0.00e+00
    pprobit.x5          -0.001475556 0.011525137  -0.1280293 8.981258e-01
    pprobit.x6           7.399355063 0.018372749 402.7353318 0.00e+00
    pprobit.x7          -0.010113008 0.011647725  -0.8682388 3.852636e-01
    pprobit.x8          -0.001650021 0.000244997  -6.7348622 1.640854e-11

My best guess is that you are running into optimization problems.
The big advantage of glm() is that it uses a special-purpose optimization method (iteratively reweighted least squares) that is generally much more robust/reliable than general-purpose nonlinear optimizers such as nlminb. If there is indeed a GLM fitting routine coded in R, somewhere, that someone has adapted to work with box constraints, it will probably perform better than mle2. Some general suggestions for troubleshooting this:

* check the log-likelihoods returned by the two methods. If they are very close (say within 0.01 likelihood units), then the issue is that you just have a very flat goodness-of-fit surface, and the two sets of coefficients are in practice very similar to each other.

* if possible, try starting each approach (glm(), mle2()) from the solution found by the other (it's a little bit of a pain to get the syntax right here) and see if they get stuck right where they are or whether they find that one answer or the other is right.

* if you were using one of the optimizing methods from optim() (rather than nlminb), e.g. L-BFGS-B, I would suggest you try using parscale to rescale the parameters to have approximately equal magnitudes near the solution. This apparently isn't possible with nlminb, but you could try optimizer="optim" (the default), method="L-BFGS-B" and see how you do (although L-BFGS-B is often a bit finicky). Alternatively, you can try optimizer="optimx", in which case you have a larger variety of unconstrained optimizers to choose from (you have to install the optimx package and take a look at its documentation). Alternatively, you can scale your input variables (e.g. use scale() on your input matrix to get zero-centered, sd 1 variables), although you would then have to adjust your lower and upper bounds accordingly.

* it's a bit more work, but you may be able to unpack this a bit and provide analytical derivatives. That would help a lot.

In short: you are entering the quagmire of numerical optimization methods.
I have learned most of this stuff by trial and error -- can anyone on the list suggest a good/friendly introduction? (Press et al Numerical Recipes; Givens and Hoeting's Computational Statistics book looks good, although I haven't read it ...) Ben Bolker On Thu, Feb 2, 2012 at 1:12 PM, Ben Bolker bbol...@gmail.com wrote: [cc'ing back to r-help] On 12-02-02 01:56 PM, Sally Luo wrote: I tried to adapt your code to my model and got the results as below. I don't know how to fix the warning messages. It says rearrange the lower (or upper) bounds to match 'start'. The warning is overly conservative in this case. I should work on engineering the package so that it handles this better. You can disregard them. In answer to your previous questions: * size refers to the number of trials per observation (1, if you have binary data)
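The first two troubleshooting suggestions above can be sketched in base R. This is a hedged illustration on simulated data, not a reproduction of the model in the thread: the data, the bounds of -5 and 5, and the names negll(), opt_fit and restart are all assumptions, and optim() with L-BFGS-B stands in for the bbmle::mle2() call discussed above.

```r
set.seed(1)

# Simulated probit data (illustrative only)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, pnorm(-0.5 + 1.0 * x1 - 0.5 * x2))
X  <- cbind(1, x1, x2)

# Reference fit by iteratively reweighted least squares
glm_fit <- glm(y ~ x1 + x2, family = binomial(link = "probit"))

# Probit negative log-likelihood, kept on the log scale for stability
negll <- function(beta) {
  eta <- drop(X %*% beta)
  -sum(ifelse(y == 1, pnorm(eta, log.p = TRUE), pnorm(-eta, log.p = TRUE)))
}

# General-purpose optimizer with box constraints (bounds are made up)
opt_fit <- optim(rep(0, 3), negll, method = "L-BFGS-B",
                 lower = rep(-5, 3), upper = rep(5, 3))

# Suggestion 1: compare the log-likelihoods of the two fits
ll_glm <- as.numeric(logLik(glm_fit))
ll_opt <- -opt_fit$value
print(c(glm = ll_glm, optim = ll_opt))

# Suggestion 2: restart the optimizer from the glm() solution
restart <- optim(coef(glm_fit), negll, method = "L-BFGS-B",
                 lower = rep(-5, 3), upper = rep(5, 3))
print(rbind(glm = coef(glm_fit), optim = opt_fit$par, restart = restart$par))
```

If the two log-likelihoods agree closely but the coefficients do not, that points to a flat likelihood surface; if the restart moves away from the glm() solution, the bounds or the optimizer settings deserve a closer look.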
Re: [R] time conversion from second to Y M D H M S format
On 02-02-2012, at 21:10, Berend Hasselman wrote: I appear to have the same or similar problem on Mac OS X 10.6.8. I ran the above script with R --vanilla. [...]

    [1] 2009-01-01 00:00:00 GMT 2009-01-01 06:00:00 GMT
    [3] 2009-01-01 12:00:00 GMT
    2009 01 01 00 00 00
    2009 01 01 06 00 00
    2009 01 01 12 00 00

Disregard my previous posting. Results are correct. Berend
[R] possibly Error in R version 2.12.1 (2010-12-16)
Hi, the following code demonstrates a possible error in R (or perhaps you can explain to me why this happens, thanks in advance). Code:

    #
    testClass <- function( stackData= c()) {
      list(
        write= function( ...) {
          sChain= ''
          for( s in c( stackData, ...)) {
            sChain= paste( sChain, '', sub( '', '', s), '', sep, sep='')
          }
          write( sChain, fHandle, append=TRUE)
        },
        stackIt1 = function( ...) {
          testClass( stackData= c( stackData, ...))
        },
        stackIt2 = function( ...) {
          tmp= c( stackData, ...)
          testClass( stackData= tmp)
        },
        getStack = function() {
          stackData
        },
        NULL
      )
    }
    to1= testClass()
    for( i in 4:2) { to1= to1$stackIt1( i) }
    print( all( rep( 2, 3) == to1$getStack()))   # error!
    to2= testClass()
    for( i in 4:2) { to2= to2$stackIt2( i) }
    print( all( 4:2 == to2$getStack()))   # correct!
    # what is the difference between stackIt1 and stackIt2?
    # (error appears only by using a for loop)
    # End of Code

    version
    platform       i486-pc-linux-gnu
    arch           i486
    os             linux-gnu
    system         i486, linux-gnu
    status
    major          2
    minor          12.1
    year           2010
    month          12
    day            16
    svn rev        53855
    language       R
    version.string R version 2.12.1 (2010-12-16)

The code, written in an R file and called via source( 'Fname.R'), shows two subsequent outputs of 'TRUE', which is not OK in my mind. Thanks for your attention. Regards
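A reduced sketch of what appears to be going on, offered as a probable explanation rather than a confirmed diagnosis from the thread: R evaluates function arguments lazily, so in the stackIt1 style the expression c(stackData, ...) is stored as a promise and only evaluated when getStack() is finally called, by which time the loop variable i is 2. The stackIt2 style evaluates it immediately through the assignment to tmp. The names make_lazy() and make_forced() are made up here; force() is a base R function.

```r
# Lazy variant: 'stack' remains an unevaluated promise inside each closure
make_lazy <- function(stack = c()) {
  list(push = function(...) make_lazy(stack = c(stack, ...)),
       get  = function() stack)
}

# Forced variant: force() evaluates the argument when the object is built
make_forced <- function(stack = c()) {
  force(stack)
  list(push = function(...) make_forced(stack = c(stack, ...)),
       get  = function() stack)
}

lazy <- make_lazy()
for (i in 4:2) lazy <- lazy$push(i)
lazy_result <- lazy$get()      # promises are evaluated only now, when i is 2

forced <- make_forced()
for (i in 4:2) forced <- forced$push(i)
forced_result <- forced$get()  # each value was captured as it was pushed

print(lazy_result)
print(forced_result)
```

Under this reading the behaviour is documented lazy evaluation rather than a bug, and adding force(stackData) as the first statement of testClass() would make stackIt1 behave like stackIt2.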
Re: [R] Need to Write a Code that can find the molecular weight of various compounds
Hi Paul! Thanks a lot! I tried downloading the Rdisop file and encountered this error:

    Error: package ‘Rdisop’ is not installed for 'arch=x64'

I tried downloading directly from the source using R and got this error:

    Error in file(filename, r, encoding = encoding) : cannot open the connection
    In addition: Warning message:
    In file(filename, r, encoding = encoding) : cannot open: HTTP status was '404 Not Found'

So I'm not sure if the source is still there? I also tried the Rcdk method, and received this error:

    Loading required package: rJava
    Error : .onLoad failed in loadNamespace() for 'rJava', details:
      call: fun(libname, pkgname)
      error: JAVA_HOME cannot be determined from the Registry
    Error: package ‘rJava’ could not be loaded

So! I downloaded Java and the newest rJava package from http://www.rforge.net/rJava/ but still received this error:

    Error : .onLoad failed in loadNamespace() for 'rJava', details:
      call: fun(libname, pkgname)
      error: JAVA_HOME cannot be determined from the Registry
    Error: package/namespace load failed for ‘rJava’

Any ideas? ):
Re: [R] Need to Write a Code that can find the molecular weight of various compounds
I also tried downloading the JDK version of Java and received this new error when running it:

    Error : .onLoad failed in loadNamespace() for 'rJava', details:
      call: dirname(this$RuntimeLib)
      error: a character vector argument expected
    Error: package/namespace load failed for ‘rJava’
Re: [R] Splitting up large set of survey data into categories
push (sorry ;-))
[R] Post hoc test for lm() or glm() ?
Hi R-helpers, TukeyHSD() works for models fitted with aov(), but could anyone point me to a function that performs a similar post hoc test for models fitted with lm() or glm()? Thanks in advance, Mark
Re: [R] Problem with GMT+/- time zones
Wow. Thanks very much for pointing that out - I never would have guessed it was deliberate that + and - were reversed! For future reference for anyone else similarly confused by this departure from time zone and mathematical convention, here's the relevant part of en.wikipedia.org/wiki/Tz_database:

The special area of Etc is used for some administrative zones, particularly for Etc/UTC which represents Coordinated Universal Time. In order to conform with the POSIX style, those zone names beginning with Etc/GMT have their sign reversed from what most people expect. In this style, zones west of GMT have a positive sign and those east have a negative sign in their name (e.g. Etc/GMT-14 is 14 hours ahead/east of GMT.)

Thanks, Andrew

On 2/02/2012, at 20:46, Jeff Newmiller wrote: "Should" has nothing to do with it. That is the way the Olson tz database works. See en.wikipedia.org/wiki/Tz_database. Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers). Sent from my phone. Please excuse my brevity.

Andrew Digby andrewdi...@mac.com wrote: I'm struggling with time zone conversion when the zone is expressed as an hours offset from GMT. Can anyone confirm that the behaviour below is incorrect? It seems that the GMT offsets are backwards:

    format(as.POSIXct("2011-05-23 17:23:00", tz="Europe/London"), tz="America/New_York", usetz=T)
    [1] 2011-05-23 12:23:00 EDT

- this works.

    format(as.POSIXct("2011-05-23 17:23:00", tz="GMT"), tz="GMT-5", usetz=T)
    [1] 2011-05-23 22:23:00 GMT

- this doesn't work: 17:23:00 GMT should be 12:23:00 GMT-5! Thanks.

R version 2.13.0 (2011-04-13) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
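A short sketch of the reversed-sign convention quoted above, using the Etc/GMT zone names from the tz database. It assumes the installed time zone data includes the Etc area (it normally does); the variable names are illustrative. No abbreviation is printed because the label shown for these zones differs between tz database versions.

```r
# 17:23 GMT on the date used in the question
t_gmt <- as.POSIXct("2011-05-23 17:23:00", tz = "GMT")

# POSIX-style sign: "Etc/GMT+5" is 5 hours WEST of (behind) GMT ...
west5 <- format(t_gmt, "%Y-%m-%d %H:%M:%S", tz = "Etc/GMT+5")

# ... and "Etc/GMT-5" is 5 hours EAST of (ahead of) GMT
east5 <- format(t_gmt, "%Y-%m-%d %H:%M:%S", tz = "Etc/GMT-5")

print(west5)
print(east5)
```

So to display a time in what is commonly written as "GMT-5", the zone name to pass is "Etc/GMT+5".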
Re: [R] Post hoc test for lm() or glm() ?
The R multcomp package provides one general approach to multiplicity correction. For general contrasts in lm and glm, the rms package's ols and Glm functions make this even easier to use. Frank

Mark Na wrote: Hi R-helpers, TukeyHSD() works for models fitted with aov(), but could anyone point me to a function that performs a similar post hoc test for models fitted with lm() or glm()? Thanks in advance, Mark

- Frank Harrell Department of Biostatistics, Vanderbilt University
Re: [R] Post hoc test for lm() or glm() ?
The glht function in the multcomp package is what you are looking for. There are additional examples in the ?MMC help file in the HH package. Rich

On Thu, Feb 2, 2012 at 3:42 PM, Mark Na mtb...@gmail.com wrote: Hi R-helpers, TukeyHSD() works for models fitted with aov(), but could anyone point me to a function that performs a similar post hoc test for models fitted with lm() or glm()? Thanks in advance, Mark
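A minimal sketch of the glht() suggestion, using the warpbreaks data set that ships with R. The choice of data set and the single-factor model are assumptions made for illustration; the multcomp calls are guarded because the package may not be installed.

```r
# One-way model fitted with lm() rather than aov()
fit <- lm(breaks ~ tension, data = warpbreaks)
print(coef(fit))

# All pairwise (Tukey-type) comparisons of the 'tension' levels
if (requireNamespace("multcomp", quietly = TRUE)) {
  tukey <- multcomp::glht(fit, linfct = multcomp::mcp(tension = "Tukey"))
  print(summary(tukey))   # multiplicity-adjusted p-values
  print(confint(tukey))   # simultaneous confidence intervals
} else {
  message("Package 'multcomp' is not installed; skipping glht() example")
}
```

The same pattern should carry over to a model fitted with glm(), with the comparisons then made on the scale of the linear predictor.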
Re: [R] Post hoc test for lm() or glm() ?
Thank you Richard and Frank for your very quick and helpful replies. Cheers, Mark

On Thu, Feb 2, 2012 at 2:58 PM, Frank Harrell f.harr...@vanderbilt.edu wrote: The R multcomp package provides one general approach to multiplicity correction. For general contrasts in lm and glm, the rms package's ols and Glm functions make this even easier to use. Frank

Mark Na wrote: Hi R-helpers, TukeyHSD() works for models fitted with aov(), but could anyone point me to a function that performs a similar post hoc test for models fitted with lm() or glm()? Thanks in advance, Mark

- Frank Harrell Department of Biostatistics, Vanderbilt University
Re: [R] pgfSweave doesn't lazyload my objects
I can reproduce it but I do not know why this happens. FWIW, I tried the knitr package and it worked well except that you have to write cache=TRUE or FALSE instead of true/false.

    library(knitr)
    knit('test_pgf.Rnw')

Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA

On Thu, Feb 2, 2012 at 6:12 AM, Ludo Pagie lpa...@xs4all.nl wrote: Hi all, I'm struggling a bit to get pgfSweave to lazyload objects when compiling a .Rnw file for a second time. Caching works fine except that for every run all objects get cached again and again. I've used cacheSweave which works fine; all cached objects from code-chunks with option cache = TRUE are lazy loaded. I've tried it on two machines ... I'm pretty sure I'm overlooking something obvious. Below is my .Rnw and sessionInfo. Pointers, suggestions, etc. are most welcome.

    %%% RNW; test_png.Rnw %%
    \documentclass{article}
    \begin{document}
    some bla bla text
    <<large-chunk-no-cache, cache=false>>=
    mm1 <- matrix(1:1e7, 1e3, 1e4)
    @
    <<c2>>=
    print(length(mm1))
    @
    <<large-chunk-do-cache, cache=true>>=
    mm2 <- matrix(1:1e7, 1e3, 1e4)
    @
    <<c4>>=
    print(length(mm2))
    @
    \end{document}
    % END RNW %%

I am running the following R command:

    pgfSweave('test_pgf.Rnw', compile.tex=F)

    ### SESSION INFO
    R version 2.14.0 (2011-10-31)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C LC_NAME=C
     [9] LC_ADDRESS=C LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] tools stats graphics grDevices utils datasets methods
    [8] base

    other attached packages:
    [1] pgfSweave_1.2.1 tikzDevice_0.6.2 cacheSweave_0.6 formatR_0.3-4
    [5] optparse_0.9.4 getopt_1.17 highlight_0.3.1 parser_0.0-14
    [9] Rcpp_0.9.9 codetools_0.2-8 stashR_0.3-4 filehash_2.2

    loaded via a namespace (and not attached):
    [1] digest_0.5.1 grid_2.14.0
[R] Calculate the natural log of cdf between 2 intervals
Hello all, I was wondering if there is an R function to do the following:

    [*] log(pnorm(x)-pnorm(y)), where x > y.

I don't want all the area under the natural log of the normal pdf less than x, I only want the area between y and x. I am aware of the ability to specify log.p=TRUE, which gives me the log of the probability that X <= x. This does not help me, because the following code:

    pnorm(x, log.p=TRUE)-pnorm(y, log.p=TRUE)

is not the same as [*] mathematically. I cannot use [*] because some of my x's are far less than the mean, more than 10 sd. This causes me to take log(0), which is an error. Thus, I need to stay on the log scale, since, for z more than 10 sd below the mean, log(pnorm(z)) is an error, and pnorm(z, log.p=TRUE) is stable even though theoretically they are equivalent. Thanks for your time, Justin
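One way to stay on the log scale, sketched here as a suggestion rather than an answer given in the thread, is the identity log(a - b) = log(a) + log(1 - exp(log(b) - log(a))) for a > b > 0, applied to the two log-probabilities that pnorm(..., log.p=TRUE) returns. The function name log_pnorm_diff() is made up.

```r
# log(pnorm(x) - pnorm(y)) for x > y, computed without leaving the log scale
log_pnorm_diff <- function(x, y) {
  stopifnot(all(x > y))
  lx <- pnorm(x, log.p = TRUE)   # log P(X <= x), stable far into the tail
  ly <- pnorm(y, log.p = TRUE)   # log P(X <= y)
  # log(a - b) = log(a) + log1p(-exp(log(b) - log(a)))
  lx + log1p(-exp(ly - lx))
}

# Agrees with the direct formula where that formula is usable
stable <- log_pnorm_diff(-1, -2)
direct <- log(pnorm(-1) - pnorm(-2))
print(c(stable = stable, direct = direct))

# Far in the lower tail the direct formula underflows, the stable one does not
far_stable <- log_pnorm_diff(-40, -41)
far_direct <- log(pnorm(-40) - pnorm(-41))
print(c(stable = far_stable, direct = far_direct))
```

For intervals far in the upper tail the same idea can be applied after reflecting, since P(y < X <= x) equals P(-x <= X < -y) for a standard normal.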
Re: [R] how can i calculate the mean of my data which is only bigger than 75?
Hi Michael, thanks, but here are more explanations of my questions to get more help (also please have a look at the data below):

Three questions to give more concrete help: i) Is your data set stored as a matrix or a data.frame?

My data is in a data frame.

ii) What are you trying to get the mean of -- all variables pooled or of each variable independently?

Actually I would like to have all the rows where either u, v, W1 or W2 is bigger than 75. We'll then have a new data frame with u, v, W1 or W2 bigger than 75; x, y, Z1 and Z2 don't matter, they will just follow whatever the results would be (in this new data frame we should still have x, y, Z1, Z2, u, v, W1 and W2, but only the rows with values of u or v or W1 or W2 that are bigger than 75).

iii) When you say >=75 for all variables, do you mean only use a row if it's >=75 for each element or just only use the >=75 elements for each calculation independently.

After we have the new data frame, I would like to have the mean of x, y, Z1 and Z2 (of the absolute values, without taking the negative signs into consideration). If possible, it should give all the results together (mean of x=.., y=.., Z1=.. and Z2=..) and not one by one.

Another question: if I would like to create a new data frame with only the maximum values of x (for example if I have 0.456, -0.456, and many more of these values as the maximum values of x), how can I do it (without taking the negative signs into consideration)?

I hope my questions are clear now. Thanks in advance, Yakamu

          x       y      Z1      Z2  u  v W1 W2
    -0.0077 -0.4665 -0.0048 -0.1302 70 26 59 54
    -0.0028 -0.0055  0.0026 -0.001  62 42 82 62
    -0.0123  0.006  -0.003   0.0029 74 18 83 78
     0.0232  0.0367  0.0028  0.0027 65 34 74 78
    -0.0075  0.1141 -0.0018  0.0363 63  0 77 69
     0.004  -0.0032  0.0036 -0.0156 14 40 70 64
    -0.003  -0.0392 -0.006  -0.0212 55 42 63 69
    -0.0116 -0.0028  0.0031  0.0209 59 23 69 35
     0.0171 -0.0496 -0.0055  0.0118 35 57 73 42
    -0.0135 -0.0324  0.0001  0.0004 55 45 57 55
     0.0345  0.004   0.0041  0.0079 77 38 57 71
    -0.0206 -0.0152  0.003   0.0104 55 30 56 81
    -0.0044  0.0343  0.0059  0.0105 74 52 58 75
     0.0138 -0.065   0.0016 -0.0064 68 64 70 56
    -0.0303  0.0012 -0.009   0.0025 66 32 42 52
    -0.0231  0.0379 -0.0006  0.0116 70 49 61 34
     0.0305  0.078  -0.0081 -0.0082 83 45 22 18
    -0.03    0.0978  0.0118  0.0103 88 25 31 68
     0.0072 -0.0019  0.0049  0.0055 79 50 67 71

--- On Wed, 2/1/12, R. Michael Weylandt michael.weyla...@gmail.com wrote: From: R. Michael Weylandt michael.weyla...@gmail.com Subject: Re: [R] how can i calculate the mean of my data which is only bigger than 75? To: Yakamu Yakamu iam_yak...@yahoo.com Cc: r-help@r-project.org Date: Wednesday, February 1, 2012, 12:47 PM

I'm not entirely sure what you mean, but it's likely one of these:

    apply(data, 2, function(x) mean(x[x > 75]))
    mean(data[apply(data, 1, function(x) all(x > 75)), ])
    mean(data[data > 75])

Three questions to give more concrete help: i) Is your data set stored as a matrix or a data.frame? ii) What are you trying to get the mean of -- all variables pooled or of each variable independently? iii) When you say >=75 for all variables, do you mean only use a row if it's >=75 for each element or just only use the >=75 elements for each calculation independently. Michael
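A possible reading of the request above, sketched on five rows copied from the posted table (rows 1, 2, 3, 4 and 6). It assumes that "bigger than 75" means strictly greater and that a row is kept when any one of u, v, W1, W2 exceeds 75; the object names d, big, abs_means and max_rows are made up.

```r
# Five rows from the posted data, enough to show the idea
d <- data.frame(
  x  = c(-0.0077, -0.0028, -0.0123, 0.0232,  0.0040),
  y  = c(-0.4665, -0.0055,  0.0060, 0.0367, -0.0032),
  Z1 = c(-0.0048,  0.0026, -0.0030, 0.0028,  0.0036),
  Z2 = c(-0.1302, -0.0010,  0.0029, 0.0027, -0.0156),
  u  = c(70, 62, 74, 65, 14),
  v  = c(26, 42, 18, 34, 40),
  W1 = c(59, 82, 83, 74, 70),
  W2 = c(54, 62, 78, 78, 64)
)

# Keep every row where at least one of u, v, W1, W2 is above 75
keep <- rowSums(d[, c("u", "v", "W1", "W2")] > 75) > 0
big  <- d[keep, ]
print(big)

# Means of the absolute values of x, y, Z1, Z2, all in one result
abs_means <- colMeans(abs(big[, c("x", "y", "Z1", "Z2")]))
print(abs_means)

# Rows whose x is largest in absolute value (the sign is ignored)
max_rows <- d[abs(d$x) == max(abs(d$x)), ]
print(max_rows)
```

Changing `> 75` to `>= 75` in the line that builds `keep` would give the other reading of the threshold.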
Re: [R] time conversion from second to Y M D H M S format
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of uday
Sent: Thursday, February 02, 2012 8:57 AM
To: r-help@r-project.org
Subject: Re: [R] time conversion from second to Y M D H M S format

> Dear Uwe, Thanks for the reply. I have tried the format function that you suggested, format(time_t1, "%Y %m %d %H %M %S"), and I got
> format(time_t1, "%Y %m %d %H %M %S")
> [1] 126230400 126252000 126273600 126295200 126316800 126338400
> I think something is not working correctly.

You are right, something is not working correctly. But you haven't shown what you did from beginning to end, so we don't know what that something might be. Try this:

time <- c(126230400, 126252000, 126273600, 126295200, 126316800, 126338400)
time_t1 <- as.POSIXlt(time, origin = "2005-01-01", tz = "GMT")
time_t1
[1] "2009-01-01 00:00:00 GMT" "2009-01-01 06:00:00 GMT"
[3] "2009-01-01 12:00:00 GMT" "2009-01-01 18:00:00 GMT"
[5] "2009-01-02 00:00:00 GMT" "2009-01-02 06:00:00 GMT"
format(time_t1, "%Y %m %d %H %M %S")
[1] "2009 01 01 00 00 00" "2009 01 01 06 00 00" "2009 01 01 12 00 00"
[4] "2009 01 01 18 00 00" "2009 01 02 00 00 00" "2009 01 02 06 00 00"

Does that not do what you wanted?

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
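The conversion above can be wrapped in a small helper. This is only a sketch: the function name seconds_to_ymdhms is made up for illustration, and the origin of 2005-01-01 in GMT is taken from Dan's example rather than from anything known about the original data. The point it illustrates is that format() only produces date fields once the numbers have been turned into a date-time object; applied to plain numbers it just returns the numbers.

```r
# Hypothetical helper: seconds since a given origin -> "Y m d H M S" strings.
seconds_to_ymdhms <- function(secs, origin = "2005-01-01", tz = "GMT") {
  stamps <- as.POSIXct(secs, origin = origin, tz = tz)  # numeric -> date-time
  format(stamps, "%Y %m %d %H %M %S")                   # date-time -> text
}

secs <- c(126230400, 126252000, 126273600)
out  <- seconds_to_ymdhms(secs)
print(out)
```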
Re: [R] R-Project at university.
I suggest you read the GNU license included in the source code and on the CRAN website. The essence is that you are free to use R and to change it, but if you pass your changes on to anyone else, you have to make the source code of those changes available to those you give it to. Most users do not need to concern themselves with these conditions, since most users don't change it.

Regarding compiling .exe files, R is not really designed for that.

---
Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
DCN: jdnew...@dcn.davis.ca.us
---
Sent from my phone. Please excuse my brevity.

Sylhetrin sylhet...@gmail.com wrote:

> Dear reader,
> I'm an engineering student at the Silesian University of Technology in Gliwice, Poland. My field of study is Technology and Mechanical Engineering (integrated manufacturing process systems), and I also hold a Bachelor's degree in Automation and Robotics.
> I have a few questions about the R Project. As far as I can tell from your website, the program is free to use, which caught my eye. Does that mean it can be used by any university student or teacher? Are there any limitations on how the program can be used? For example, if I wanted to compile an *.exe program file using R, could that be done without any cost?
> I would also like further information on the terms and conditions, including the license, and any other useful information about using R as a learning tool for university students and projects.
> Hope to hear from you soon, thank you for your time.
> Sincerely, Karol Porwol
[R] gsub syntax help
I have some elements in a vector with extraneous information (e.g. file name and sample IDs) that I'd like to strip from every element. For example, I would like "SPI1.S1.str1.P3.sample.tif" "SPI1.S1.STR2.P1.sample.tif" to read "SPI1.S1.str1.P3" "SPI1.S1.STR2.P1". Will someone help me with the syntax in gsub? It needs to be something like gsub("garbage", "everything except garbage", dataframe), I think, but it's the "everything except garbage" that's giving me trouble.

Thanks
Ben Caldwell
Re: [R] gsub syntax help
In the example you gave, all that has to be done is replace ".sample.tif" at the end of the string with "", which is easy.

avec <- c("SPI1.S1.str1.P3.sample.tif", "SPI1.S1.STR2.P1.sample.tif")
gsub("\\.sample\\.tif$", "", avec)
[1] "SPI1.S1.str1.P3" "SPI1.S1.STR2.P1"

If your real data are more complex, we need to know what they look like.

Sarah

On Thu, Feb 2, 2012 at 4:42 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote:
> I have some elements in a vector with extraneous information (e.g. file name and sample IDs) that I'd like to strip from every element. For example, I would like "SPI1.S1.str1.P3.sample.tif" "SPI1.S1.STR2.P1.sample.tif" to read "SPI1.S1.str1.P3" "SPI1.S1.STR2.P1". Will someone help me with the syntax in gsub? It needs to be something like gsub("garbage", "everything except garbage", dataframe), I think, but it's the "everything except garbage" that's giving me trouble.
> Thanks
> Ben Caldwell

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] gsub syntax help
Oh, perfect. I was running gsub(".sample.tif", "", avec). Your change, gsub("\\.sample\\.tif$", "", avec), did it.

Thanks Sarah

Ben Caldwell

On Thu, Feb 2, 2012 at 1:48 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
> In the example you gave, all that has to be done is replace ".sample.tif" at the end of the string with "", which is easy.
> avec <- c("SPI1.S1.str1.P3.sample.tif", "SPI1.S1.STR2.P1.sample.tif")
> gsub("\\.sample\\.tif$", "", avec)
> [1] "SPI1.S1.str1.P3" "SPI1.S1.STR2.P1"
> If your real data are more complex, we need to know what they look like.
> Sarah
>
> On Thu, Feb 2, 2012 at 4:42 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote:
>> I have some elements in a vector with extraneous information (e.g. file name and sample IDs) that I'd like to strip from every element. For example, I would like "SPI1.S1.str1.P3.sample.tif" "SPI1.S1.STR2.P1.sample.tif" to read "SPI1.S1.str1.P3" "SPI1.S1.STR2.P1". Will someone help me with the syntax in gsub? It needs to be something like gsub("garbage", "everything except garbage", dataframe), I think, but it's the "everything except garbage" that's giving me trouble.
>> Thanks
>> Ben Caldwell
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
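To illustrate why the escaped and anchored pattern is the safer one, here is a small sketch. The third file name is invented for the comparison and is not from the thread. In a regular expression an unescaped "." matches any character and an unanchored pattern can match anywhere in the string, so the loose pattern can remove more than the suffix.

```r
# Two names from the thread plus one made-up name that exposes the difference.
files <- c("SPI1.S1.str1.P3.sample.tif",
           "SPI1.S1.STR2.P1.sample.tif",
           "a_sample_tif.P2.sample.tif")

# Loose: "." matches any character, and matches are replaced everywhere.
loose <- gsub(".sample.tif", "", files)

# Strict: literal dots, and only a match at the end of the string ($).
strict <- sub("\\.sample\\.tif$", "", files)

print(loose)
print(strict)
```

On the two real names both patterns give the same result; they differ only on the invented third name, where the loose pattern also deletes "_sample_tif" from the middle.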
Re: [R] Calculate the natural log of cdf between 2 intervals
On Thu, Feb 02, 2012 at 01:18:42PM -0800, justin jarvis wrote:
> Hello all, I was wondering if there is an R function to do the following:
> [*] log(pnorm(x)-pnorm(y)), where x > y.
> I don't want all the area under the natural log of the normal pdf less than x, I only want the area between y and x. I am aware of the ability to specify log.p=TRUE, which gives me the log of the probability that X <= x. This does not help me, because the following code:
> pnorm(x, log.p=TRUE)-pnorm(y,log.p=TRUE)
> is not the same as [*] mathematically. I cannot use [*] because some of my x's are far less than the mean, more than 10 sd. This causes me to take the log(0), which is an error. Thus, I need to stay in the log scale, since, for z less than 10 sd below the mean,

Hello:

Try the following.

x <- -20
y <- -19.9
xplog <- pnorm(x, log.p=TRUE)
yplog <- pnorm(y, log.p=TRUE)
logdiff <- yplog + log1p(-exp(xplog - yplog))
logdiff
[1] -202.0626

In exact arithmetic, we have

exp(logdiff) = exp(yplog + log1p(-exp(xplog - yplog)))
             = exp(yplog + log(1 - exp(xplog - yplog)))
             = exp(yplog) * (1 - exp(xplog - yplog))
             = exp(yplog) - exp(xplog)

So we have

exp(logdiff) = exp(yplog) - exp(xplog)
logdiff = log(exp(yplog) - exp(xplog))

as required.

Hope this helps.

Petr Savicky.
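The identity above can be packaged as a function. This is a sketch only: the name log_pnorm_diff and its argument names are invented here, and it assumes lower < upper, as in Petr's example where x = -20 and y = -19.9. It stays on the log scale throughout, so it remains finite where the direct formula underflows.

```r
# Hypothetical helper: log(pnorm(upper) - pnorm(lower)), computed on the log scale.
# Assumes lower < upper (elementwise).
log_pnorm_diff <- function(lower, upper) {
  lp_lower <- pnorm(lower, log.p = TRUE)
  lp_upper <- pnorm(upper, log.p = TRUE)
  # log(exp(a) - exp(b)) = a + log1p(-exp(b - a)) for a > b
  lp_upper + log1p(-exp(lp_lower - lp_upper))
}

moderate <- log_pnorm_diff(-1, 0.5)            # direct formula also works here
direct   <- log(pnorm(0.5) - pnorm(-1))
extreme  <- log_pnorm_diff(-40, -39.9)         # direct formula gives log(0) here
naive    <- log(pnorm(-39.9) - pnorm(-40))

print(c(moderate = moderate, direct = direct))
print(c(extreme = extreme, naive = naive))
```

For moderate arguments the helper and the direct formula agree; far in the tail the direct formula returns -Inf because both probabilities underflow to zero, while the log-scale version does not.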
[R] an unusual use for R
I thought some of you might be amused by this. In my non-work time, I'm an avid weaver and teacher of weaving. I'm working on a project involving creating many detailed weaving patterns, so I wrote R code to automate it. Details here: http://stringpage.com/blog/?p=822

If the overlap between R users and avid tablet weavers turns out to be >1, I'll polish it up and turn it into a package.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
[R] How to run GLM with burr distribution?
I want to run the glm() function for my data, but instead of using the family distributions in R, I need the 4P Burr distribution. Can someone please explain how I can go about doing that, or please provide me with an example? I'm new to R.

E.g. Model1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian, data = anorexia)
[R] bigkmeans not parallel
I'm using bigkmeans in 'biganalytics' to cluster my 60,000 by 600,000 matrix. I'm using an 8-core Linux VM. I have registered the parallel backend with registerDoMC() and checked how many cores are registered with getDoParWorkers(). It returns 8, which is the number of cores I have on my machine. I then ran the test below, whose results show improved speed due to parallel execution.

check <- function(n) {
  for (i in 1:1000) {
    sme <- matrix(rnorm(100), 10, 10)
    solve(sme)
  }
}
times <- 100 # times to run the loop
system.time(x <- foreach(j=1:times) %dopar% check(j))
user system elapsed
--- 4
system.time(x <- foreach(j=1:times) %do% check(j))
user system elapsed
---- 16

But when I run my data in bigkmeans

ans <- bigkmeans(data, 200, nstart=5, iter.max=20)

I see only one R process in the system monitor, and only one CPU has high usage. I guess it's not really running in parallel. I also tried doSNOW, though it's meant for multiple clusters.

cl <- makeCluster(8, type="SOCK")
registerDoSNOW(cl)
ans <- bigkmeans(data, 200, nstart = 30)

There are 8 R processes but only 1 is running. Is it because I have something misconfigured? Or does bigkmeans not support parallel execution? Thanks in advance for any advice.

Regards, Lishu

--
View this message in context: http://r.789695.n4.nabble.com/bigkmeans-not-parallel-tp4353036p4353036.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] an unusual use for R
Brilliant Sarah! I love seeing such unexpected and creative applications. I'm not a weaver but am a knitter (and a knotter actually) and have mused about using R to help design elements of textured knitting patterns, e.g. as seen in single-colour, traditional fisherman's jumpers from England and Scotland. I've yet to do anything more than muse though. Hope it turns into a package :)

Michael

On 3 February 2012 09:54, Sarah Goslee sarah.gos...@gmail.com wrote:
> I thought some of you might be amused by this. In my non-work time, I'm an avid weaver and teacher of weaving. I'm working on a project involving creating many detailed weaving patterns, so I wrote R code to automate it. Details here: http://stringpage.com/blog/?p=822
> If the overlap between R users and avid tablet weavers turns out to be >1, I'll polish it up and turn it into a package.
> Sarah
> --
> Sarah Goslee
> http://www.functionaldiversity.org
[R] auto.key and simpleTheme
Dear all,

parallel(~iris[1:4], groups = Species, iris, par.settings = simpleTheme(lwd = c(1,3,1), lty = c(1,1,2), col.line = 1), auto.key = T)

Despite the use of par.settings and simpleTheme, the lines in the key and graph are not the same. Any suggestions why?

Regards, Marcin
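One possible explanation, offered as an assumption rather than a confirmed diagnosis: with auto.key = TRUE the key is built from point symbols by default, so the line settings passed through simpleTheme() never appear in it. The sketch below asks the key for lines instead of points. It uses parallelplot(), the current lattice name for the function called parallel() in the post.

```r
library(lattice)

# Same plot as in the question, but the key is told to show lines, not points.
p <- parallelplot(~iris[1:4], groups = Species, data = iris,
                  par.settings = simpleTheme(lwd = c(1, 3, 1),
                                             lty = c(1, 1, 2),
                                             col.line = 1),
                  auto.key = list(lines = TRUE, points = FALSE))
print(p)
```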
Re: [R] an unusual use for R
Well, I have to say, how nice to find a valid use for string theory :-) . Now that we all know you are in fact the mistress of skulls, guess we better tread lightly!

Carl

quote
From: Sarah Goslee sarah.goslee_at_gmail.com
Date: Thu, 02 Feb 2012 17:54:04 -0500
> I thought some of you might be amused by this. In my non-work time, I'm an avid weaver and teacher of weaving. I'm working on a project involving creating many detailed weaving patterns, so I wrote R code to automate it. Details here: http://stringpage.com/blog/?p=822
> If the overlap between R users and avid tablet weavers turns out to be >1, I'll polish it up and turn it into a package.

--
Sent from my Cray XK6
Pendeo-navem mei anguillae plena est.
Re: [R] sqldf for Very Large Tab Delimited Files
Hi Gabor,

Thank you very much for your guidance and help. I could run the following code successfully on a 500 MB test data file. A snapshot of the data file is attached herewith.

code start***
library(sqldf)
library(RSQLite)
iFile <- "Test100.txt"
con <- dbConnect(SQLite(), dbname = "myTest100")
dbWriteTable(con, "TestDB100", iFile, sep = "\t") #, eol = "\r\n")
nms <- names(dbGetQuery(con, "select * from TestDB100 limit 0"))
nRec <- fn$dbGetQuery(con, "select count(*) from TestDB100")
aL1 <- 1
while (aL1 <= nRec) {
  res1 <- fn$dbGetQuery(con, "select * from (select * from TestDB100 limit '$aL1',1)")
  istn <- res1[1,1]
  res1 <- fn$dbGetQuery(con, "select * from TestDB100 where `nms[1]` = '$istn'")
  icount <- dim(res1)[1]
  oFile <- paste(istn, "_Test.txt", sep = "")
  write.table(res1, oFile, sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)
  aL1 <- aL1 + icount
}
dbDisconnect(con)
code end***

However, the actual data file that I want to handle is about *160 GB*. When I use the same code on that file, it gives the following error for the dbWriteTable(con, ...) statement:

error start**
dbWriteTable(con, "TestDB", iFile, sep = "\t") #, eol = "\r\n")
Error in try({ : RS-DBI driver: (RS_sqlite_getline could not realloc)
[1] FALSE
error end**

I am not sure about the reason for this error. Is it due to the big file size? I understood from the sqldf webpage that SQLite can work with even larger files than this and is restricted only by disc space, not RAM. I have about 400 GB of free space on the PC I am using, with Windows 7 as the operating system. I am assuming that the above dbWriteTable command uses only disc memory, so that is not the issue. In fact, this file was created using MySQLdump and I do not have access to the original MySQL database file.

I want to know the following:
(1) Am I missing something in the above code that is preventing handling of this big 160 GB file?
(2) Should this be handled outside of R, if R is becoming a limitation here? And if yes, what is a possible way forward?

Thank you again for your quick response and all the help.

HC

http://r.789695.n4.nabble.com/file/n4353362/Test100.txt Test100.txt

--
View this message in context: http://r.789695.n4.nabble.com/sqldf-for-Very-Large-Tab-Delimited-Files-tp4350555p4353362.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] sqldf for Very Large Tab Delimited Files
On Thu, Feb 2, 2012 at 8:07 PM, HC hca...@yahoo.co.in wrote:
> Hi Gabor,
> Thank you very much for your guidance and help. I could run the following code successfully on a 500 MB test data file. A snapshot of the data file is attached herewith.
> code start***
> library(sqldf)
> library(RSQLite)
> iFile <- "Test100.txt"
> con <- dbConnect(SQLite(), dbname = "myTest100")
> dbWriteTable(con, "TestDB100", iFile, sep = "\t") #, eol = "\r\n")
> nms <- names(dbGetQuery(con, "select * from TestDB100 limit 0"))
> nRec <- fn$dbGetQuery(con, "select count(*) from TestDB100")
> aL1 <- 1
> while (aL1 <= nRec) {
>   res1 <- fn$dbGetQuery(con, "select * from (select * from TestDB100 limit '$aL1',1)")
>   istn <- res1[1,1]
>   res1 <- fn$dbGetQuery(con, "select * from TestDB100 where `nms[1]` = '$istn'")
>   icount <- dim(res1)[1]
>   oFile <- paste(istn, "_Test.txt", sep = "")
>   write.table(res1, oFile, sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)
>   aL1 <- aL1 + icount
> }
> dbDisconnect(con)
> code end***
> However, the actual data file that I want to handle is about *160 GB*. When I use the same code on that file, it gives the following error for the dbWriteTable(con, ...) statement:
> error start**
> dbWriteTable(con, "TestDB", iFile, sep = "\t") #, eol = "\r\n")
> Error in try({ : RS-DBI driver: (RS_sqlite_getline could not realloc)
> [1] FALSE
> error end**
> I am not sure about the reason for this error. Is it due to the big file size? I understood from the sqldf webpage that SQLite can work with even larger files than this and is restricted only by disc space, not RAM. I have about 400 GB of free space on the PC I am using, with Windows 7 as the operating system. I am assuming that the above dbWriteTable command uses only disc memory, so that is not the issue. In fact, this file was created using MySQLdump and I do not have access to the original MySQL database file.
> I want to know the following:
> (1) Am I missing something in the above code that is preventing handling of this big 160 GB file?
> (2) Should this be handled outside of R, if R is becoming a limitation here? And if yes, what is a possible way forward?
> Thank you again for your quick response and all the help.
> HC
> http://r.789695.n4.nabble.com/file/n4353362/Test100.txt Test100.txt

I think it's unlikely SQLite could handle a database that large unless you can divide it into multiple separate databases. At one time the SQLite site said it did not handle databases over 1 GB, and although I think that is outdated by more recent versions of SQLite, it's still likely true that your size is too large for it.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
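If the goal is only to cut the file into one piece per station, one alternative to loading it into a database is to stream it in chunks and append each line to its station's file. The sketch below is a base R illustration under stated assumptions, not something proposed in the thread: the function name split_by_station and the chunk size are invented, the station is assumed to be everything before the first tab, and station values are assumed to be usable as file names. The "_Test.txt" suffix follows HC's code.

```r
# Hypothetical helper: stream a tab-delimited file and write one file per station.
split_by_station <- function(infile, outdir, chunk_lines = 100000L) {
  con <- file(infile, open = "r")
  on.exit(close(con))
  repeat {
    lines <- readLines(con, n = chunk_lines)    # only one chunk is held in memory
    if (length(lines) == 0L) break
    station <- sub("\t.*$", "", lines)          # text before the first tab
    for (s in unique(station)) {
      out <- file(file.path(outdir, paste0(s, "_Test.txt")), open = "a")
      writeLines(lines[station == s], out)      # append this chunk's rows
      close(out)
    }
  }
  invisible(NULL)
}

# Small demonstration on a temporary file.
demo_dir <- tempfile("split_demo")
dir.create(demo_dir)
demo_in <- file.path(demo_dir, "input.txt")
writeLines(c("A\t1", "B\t2", "A\t3"), demo_in)
split_by_station(demo_in, demo_dir, chunk_lines = 2L)
print(list.files(demo_dir))
```

Because output files are opened in append mode, the target directory should not already contain station files from an earlier run.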
[R] Nested? Two-way ANOVA with repeated measures
Dear R-users,

I have 3 plant populations (fixed). Within each population there is the same number of "families" (random): the seed progeny of the same plant. These families were exposed to 2 treatments (fixed) and their response was measured (mean values for 25 seedlings per family per treatment are presented in the data table). I would like to know if there is a significant difference in the response of populations between the treatments (primarily the interaction term, and the main effects as well), taking into account an important (from a biological point of view) fact: the progeny of each plant (i.e. family) was exposed to both treatments.

Taking an artificial example, one could easily do it with the car package:

library(car)
dat <- data.frame(Family = 1:60,                                  # Plant family name
                  Pop = rep(c("Pop1","Pop2","Pop3"), each=20),    # Population name
                  Cond1 = rnorm(60, 15, 1),                       # obtained values at experimental conditions 1
                  Cond2 = rnorm(60, 20, 1))                       # experimental conditions 2
# rearrange data
data.wide <- data.frame(Family = 1:20,
                        subset(dat, dat$Pop == "Pop1")[3:4],
                        subset(dat, dat$Pop == "Pop2")[3:4],
                        subset(dat, dat$Pop == "Pop3")[3:4])
names(data.wide)[2:7] <- c("Pop1.Cond1","Pop1.Cond2", "Pop2.Cond1","Pop2.Cond2", "Pop3.Cond1","Pop3.Cond2")
# define the structure of analysis
design <- data.frame(Pop = rep(c("Pop1","Pop2","Pop3"), each=2), Cond = rep(c("Cond1","Cond2")))
# define the model
mod <- lm(as.matrix(data.wide[, -1]) ~ 1)
an <- Anova(mod, idata = design, idesign = ~Pop * Cond)
summary(an)

But obviously this is not the right way to analyse these data, because plant families are nested within the populations. So I'm struggling with how to incorporate this information into the model. Thanks in advance for any suggestions and/or helpful links!

Vladimir.

PS. If it is easier to do it with the long format of the data, one can run this code:

library(reshape2)
data.long <- melt(dat, measure.vars=c("Cond1", "Cond2"), variable.name="Cond")

--
Vladimir Mikryukov, PhD
Institute of Plant & Animal Ecology UD RAS, Lab. of Population and Community Ecotoxicology
[8 Marta 202, 620144, Ekaterinburg, Russia]
[R] Tukey Type III SS vs Type I SS
Hi. I am looking for help on how to use Tukey multiple comparisons with Type III SS, because I read on Quick-R that Type I SS is used by default. I am wondering whether glht would help. Thanks for any help.

Vera
Re: [R] about error while using anova function
I don't know what your data look like, but I recently ran into this error message while using Anova() in the {car} package, and I resolved it by replacing the categorical predictors in my model with orthogonal contrasts. I did something along these lines:

fac <- factor(c("M","F","M","M","F")) # A categorical predictor.
contrasts(fac)                        # Default numerical values used by Anova(), anova(), lm(), aov(), etc.
contrasts(fac) <- contr.sum           # Orthogonal contrasts to apply to fac.
facCont <- contrasts(fac)[fac]        # Assign values in new variable.

Replace fac with facCont in the model. Of course, this may have nothing to do with the reason your model was misbehaving.

--
View this message in context: http://r.789695.n4.nabble.com/about-error-while-using-anova-function-tp4159463p4353590.html
Sent from the R help mailing list archive at Nabble.com.
[R] Contour plot with messy field data.
Hello, I have some data that will be in the form:

structure(list(station = structure(c(20L, 2L, 4L, 19L, 3L, 11L, 1L, 5L, 10L, 12L, 17L, 18L, 6L, 9L, 13L, 16L, 7L, 8L, 15L, 14L), .Label = c("1", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "2", "3", "4", "5", "6", "7", "8", "9", "scope"), class = "factor"),
distance = c(0, 2, 0.8, 3, 1.2, 1.7, 1, 1.4, 2.8, 2.2, 4.5, 4.2, 2.8, 3.6, 3.4, 4.8, 3.8, 4.2, 4.8, 4.4),
degrees = c(0, 90, 89.9, 82.4, -59, 69.4, 10.8, 45, 69, 26.6, 63.4, 61.6, 45, 56.3, 28.1, 51.7, 38.7, 45, 38.3, 25.4),
z = c(0L, 0L, -1L, 0L, 0L, -1L, 0L, -1L, -1L, 0L, 0L, 0L, -1L, -1L, 0L, 0L, -1L, -1L, 0L, 0L),
x = c(0, 0, 0, 0.4, 0.6, 0.6, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3.8, 4),
y = c(0, 2, 0.8, 3, -1, 1.6, 0.19, 1, 2.6, 1, 4, 3.7, 2, 3, 1.6, 3.8, 2.4, 3, 3, 1.9)),
.Names = c("station", "distance", "degrees", "z", "x", "y"),
row.names = c(1L, 11L, 13L, 10L, 12L, 20L, 2L, 14L, 19L, 3L, 8L, 9L, 15L, 18L, 4L, 7L, 16L, 17L, 6L, 5L), class = "data.frame")

Basically, I would like to create a contour plot and eventually be able to calculate the % area within each contour. This example only has 2 contour lines, of height z=0 and z=-1. I would like to have many more, and these z values likely won't fall nicely into a contour with others. This may give you an idea of the spatial arrangement of observations: http://r.789695.n4.nabble.com/file/n4353603/Slide1.png

Does anyone have any suggestions on how to begin doing this?

--
View this message in context: http://r.789695.n4.nabble.com/Contour-plot-with-messy-field-data-tp4353603p4353603.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Need to Write a Code that can find the molecular weight of various compounds
Matthew,

My fault, I should have sent you to the current release cycle page. The link was for the old 2.6 Bioconductor release, sorry about that. :(

Personally, I find the easiest way to install any Bioconductor package is to open R and copy and paste the following code:

source("http://bioconductor.org/biocLite.R")
biocLite("Rdisop")

The main page for Rdisop is here: http://www.bioconductor.org/packages/release/bioc/html/Rdisop.html . This time I've given you the correct page / current release :)

As for rJava and rCDK, they can be difficult to install. I'm going to assume that you're on Linux, as you said you tried to download the source code. In this case you want to make sure that you have your $JAVA_HOME path set. In the terminal first try:

sudo R CMD javareconf
## If that doesn't work then try
echo $JAVA_HOME
## if that returns nothing try:
export JAVA_HOME=/usr/lib/jvm/java-6-sun

This page looks like quite a good explanation: http://stackoverflow.com/questions/3311940/r-rjava-package-install-failing

Hope it helps,
Paul

On Feb 2, 2012, at 7:26 PM, matthew.ttd.nguyen wrote:
> I also tried downloading the JDK version of Java and received this new error when running it:
> Error : .onLoad failed in loadNamespace() for 'rJava', details:
>   call: dirname(this$RuntimeLib)
>   error: a character vector argument expected
> Error: package/namespace load failed for rJava
> --
> View this message in context: http://r.789695.n4.nabble.com/Need-to-Write-a-Code-that-can-find-the-molecular-weight-of-various-compounds-tp4342874p4352510.html
> Sent from the R help mailing list archive at Nabble.com.
[R] Logistic population growth and deSolve
Hello,

I am new to R and I am having problems trying to model logistic population growth with the deSolve package. I would like to run the model for four populations with the same initial population and carrying capacity but with different growth rates, and put the results into a data frame.

When I run the following lines of code, I get unexpected results from output, although the format is more or less what I am looking for. When I call the function directly, I get the results I expected, but for only one time step. I haven't been able to work out what the problem is, and I haven't gotten any error messages to show me where I am making a mistake. Advice would be greatly appreciated. Thanks.

library(deSolve)

# shared carrying capacity K and one growth rate per population
parameters = c(K = 305, ra = 0.8, rb = 1.5, rc = 2.1, rd = 2.6)
# all four populations start at the same size
state = c(Na = 5, Nb = 5, Nc = 5, Nd = 5)

# derivative function in the form ode() expects: returns a list whose
# first element holds the rates of change, in the same order as state
logGrowth = function(time, state, parameters) {
  with(as.list(c(state, parameters)), {
    dNa.dt = ra * Na * (1 - (Na / K))
    dNb.dt = rb * Nb * (1 - (Nb / K))
    dNc.dt = rc * Nc * (1 - (Nc / K))
    dNd.dt = rd * Nd * (1 - (Nd / K))
    return(list(c(dNa.dt, dNb.dt, dNc.dt, dNd.dt)))
  })
}

times = 1:20
output = ode(y = state, times = times, func = logGrowth, parms = parameters)
print(output)

# direct call to the derivative function
logGrowth(times, state, parameters)

--
View this message in context: http://r.789695.n4.nabble.com/Logistic-population-growth-and-deSolve-tp4353655p4353655.html
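One way to judge whether the ode() output in the message above is "unexpected" is to compare it with the closed-form solution of the logistic equation, which needs only base R. The sketch below is not from the original post: `logistic_exact`, `exact` and `initial_slopes` are hypothetical names, and the initial time is assumed to be the first entry of `times` (t = 1), matching how ode() treats the first time point. It also computes the right-hand side at the initial state, which is all that a single direct call to a derivative function evaluates (rates of change, not a trajectory).

```r
# Closed-form logistic solution with N(t0) = N0:
#   N(t) = K / (1 + ((K - N0) / N0) * exp(-r * (t - t0)))
logistic_exact <- function(t, r, K, N0, t0 = t[1]) {
  K / (1 + ((K - N0) / N0) * exp(-r * (t - t0)))
}

K     <- 305
N0    <- 5
rates <- c(Na = 0.8, Nb = 1.5, Nc = 2.1, Nd = 2.6)
times <- 1:20

# One column per growth rate, one row per time step.
exact <- data.frame(
  time = times,
  sapply(rates, function(r) logistic_exact(times, r, K, N0))
)

# Rates of change at the initial state, r * N0 * (1 - N0 / K).
initial_slopes <- rates * N0 * (1 - N0 / K)

print(head(exact))
print(initial_slopes)
```

If the numerical output agrees with `exact` to within solver tolerance, the model is probably behaving correctly: with growth rates this large, all four populations sit very close to K after only a few time steps, which can look surprising at first glance.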
Re: [R] Contour plot with messy field data.
Ok, I've since found this:

# the previously posted dataset is called dat
attach(dat)
library(akima)
data.interp <- interp(x, y, z)
contour(data.interp)

Any idea how to calculate the area within specified contour lines?

Thanks

chuck.01 wrote:

Hello, I have some data that will be in the form:

structure(list(
  station = structure(c(3L, 20L, 2L, 4L, 19L, 11L, 1L, 5L, 10L, 12L,
    17L, 18L, 6L, 9L, 13L, 16L, 7L, 8L, 15L, 14L),
    .Label = c("1", "10", "11", "12", "13", "14", "15", "16", "17", "18",
      "19", "2", "3", "4", "5", "6", "7", "8", "9", "scope"),
    class = "factor"),
  distance = c(1.2, 0, 2, 0.8, 3, 1.7, 1, 1.4, 2.8, 2.2, 4.5, 4.2, 2.8,
    3.6, 3.4, 4.8, 3.8, 4.2, 4.8, 4.4),
  degrees = c(-59, 0, 90, 89.9, 82.4, 69.4, 10.8, 45, 69, 26.6, 63.4,
    61.6, 45, 56.3, 28.1, 51.7, 38.7, 45, 38.3, 25.4),
  z = c(0L, 0L, 0L, -1L, 0L, -1L, 0L, -1L, -1L, 0L, 0L, 0L, -1L, -1L,
    0L, 0L, -1L, -1L, 0L, 0L),
  x = c(-0.6, 0, 0, 0, 0.4, 0.6, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3,
    3.8, 4),
  y = c(1, 0, 2, 0.8, 3, 1.6, 0.19, 1, 2.6, 1, 4, 3.7, 2, 3, 1.6, 3.8,
    2.4, 3, 3, 1.9)),
  .Names = c("station", "distance", "degrees", "z", "x", "y"),
  row.names = c(12L, 1L, 11L, 13L, 10L, 20L, 2L, 14L, 19L, 3L, 8L, 9L,
    15L, 18L, 4L, 7L, 16L, 17L, 6L, 5L),
  class = "data.frame")

I would like to create a contour plot and eventually be able to calculate the % area within each contour. This only has 2 contour lines, at heights z=0 and z=-1; I would like to have many more, and these z values likely won't fall nicely into a contour with others. Also, linear interpolation adding at least one point between observed points would be great.

This may give you an idea of the spatial arrangement of observations: http://r.789695.n4.nabble.com/file/n4353603/Slide1.png

Does anyone have any suggestions on how to begin doing this?

--
View this message in context: http://r.789695.n4.nabble.com/Contour-plot-with-messy-field-data-tp4353603p4353662.html
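On the question of the area within a contour line in the thread above, one possible approach (not something proposed in the thread itself) is to extract the contour coordinates with contourLines() from grDevices and apply the shoelace formula to each closed line. The sketch below uses a synthetic surface rather than the poster's data so that the answer can be checked against a known value; `polygon_area` is a hypothetical helper, and all variable names are made up. The expectation that the gridded list returned by akima's interp() can be passed to contourLines() in the same way is an assumption that should be checked against its documentation.

```r
# Shoelace formula for the area of a closed polygon given its vertices.
polygon_area <- function(x, y) {
  n <- length(x)
  nxt <- c(2:n, 1)  # index of the next vertex, wrapping around to the first
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}

# Synthetic test surface z = x^2 + y^2; its z = 1 contour is the unit circle.
gx <- seq(-2, 2, length.out = 200)
gy <- seq(-2, 2, length.out = 200)
gz <- outer(gx, gy, function(x, y) x^2 + y^2)

# contourLines() returns a list of lines, each with $level, $x and $y.
cl <- grDevices::contourLines(gx, gy, gz, levels = 1)
areas <- sapply(cl, function(line) polygon_area(line$x, line$y))

# Share of the gridded region that lies inside the contour.
total_area <- diff(range(gx)) * diff(range(gy))
fraction_inside <- sum(areas) / total_area

print(sum(areas))        # should be close to pi for this test surface
print(fraction_inside)
```

This only makes sense for contours that close on themselves. A contour that runs off the edge of the interpolated region comes back as an open line, and the wrap-around step would then close it with a straight segment and misstate the area. Nested contours would also need their inner areas subtracted to get the area between two levels.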