[R] Constrained Nonlinear Optimization - lack of convergence
Hello,

I am attempting to use the 'alabama' package to solve a constrained nonlinear optimization problem. The problem has both equality and inequality constraints (the heq and hin functions are used). All constraints are smooth, i.e. I can differentiate them easily to produce the heq.jac and hin.jac functions. My initial solution is feasible. I am attempting to maximize a function, phi, so I supply '-phi' as the objective (cost) function and '-dphi/dz' as its gradient.

I will gladly provide the detailed code, but perhaps an overview of the problem is sufficient.

0. I installed 'alabama' and successfully solved the example problem.

1. My constraints are:
     z >= 0 (for several elements of the vector z)
     z = 0 (for the remaining elements of the vector z)
     Z - sum(z) >= 0, where Z is a constant real number.

2. My cost function to maximize is (or, minimize -phi):
     phi == SUM[ p[i]*LN{f[i]} ], where the sum is over i = 1:length(z), and
     f[i] == {(1-r)*SUM[z+s] - z[i] - s[i]} * z[i]/(z[i]+s[i]) + Z - sum(z),
   where s and p are constant vectors of length(z).
   Note: the elements of p and s are all > 0, and (1-r) is a scalar > 0.
   Note: under the constraints listed above, f[i] should always be >= 0.

3. I can readily calculate the gradient of phi, where in general:
     dphi/dz = d/dz[i] of phi == p*f'/f, where f' is df/dz[i].

4. I created functions for the inequality and equality constraints and their Jacobians, the cost function, and the gradient of the cost function.

5. I use the 'auglag' function of the alabama package. As a first attempt, I used only a single inequality constraint of the form z > 0 (all other z constraints are z = 0), plus the Z - sum(z) > 0 inequality constraint. I used the default settings, except that I tried various 'methods', e.g. BFGS and Nelder-Mead.

Reviewing the alabama source code leads me to believe that it automatically builds the Lagrangian of the cost function augmented with Lagrange multipliers, and also builds the gradient of the augmented Lagrangian. Hence I assume (perhaps incorrectly) that auglag automatically generates the dual problem and attempts to solve it by calling 'optim'.

MY ISSUE:
The code often runs successfully (converges), sometimes satisfying both KKT1 and KKT2 (TRUE), sometimes only one of the two. Sometimes it fails to converge at all. When it does converge, I do not obtain the same optimum from different initial conditions. When it fails to converge, I often end up with a NaN generated by log(f[i]), meaning that f[i] < 0, and I observe that some or all of the elements of the vector z are less than zero, despite my constraints.

QUESTION
Other than the obvious (review my code for typos etc., which I believe have been resolved):

1. Can the alabama procedure take a solution path that does not satisfy the constraints? If not, then I must still have an error in my code and must review it yet again.

2. If the path may violate some of the constraints (perhaps due to steep gradients), how can I avoid this situation?

2a. I presume that some of the issues may come from differences in scaling, e.g. say s=[200,500,400,300,100], p=[0.1,0.2,0.4,0.1,0.2], Z=1000, (1-r)=0.8, and the initial starting point z=[0,0,200,0,0]. However, I am not experienced at scaling these or the constraints. Any suggestions?

2b. I am not an expert in optimization, but I have some background in math/engineering. I suspect and hope that something as simple as relaxing the constraints from z >= 0 to z >= delta, where delta is a small positive number, may help. Any comments? I admit I am lazy for not trying this, as I only thought of it while writing this post.

2c. I know just enough to be dangerous about penalty functions: I know they exist, but I am uncertain how to use them and how to select the term 'sig0'. Suggestions?

2d. Thinking more, I have not rigorously attempted to modify the convergence tolerance, because I suspect my issue is the solution path leaving the feasible region rather than convergence itself. Am I incorrect in thinking so?

I would appreciate any assistance that someone can provide. Again, if the code is required, I will share it, but I hope that I have defined my problem well enough above to spare anyone from having to sort through and debug my code.

Much appreciated,
Tim

--
View this message in context: http://r.789695.n4.nabble.com/Constrainted-Nonlinear-Optimization-lack-of-convergence-tp3531534p3531534.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
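The objective described in this post can be sketched in base R as follows. This is only an illustration under stated assumptions: the names f_vals, neg_phi and one_minus_r and the penalty value big are hypothetical, the numbers are the ones quoted in item 2a, and returning a large finite value whenever some f[i] <= 0 is one common way to keep log() from producing NaN at infeasible trial points. It is not something the post confirms the 'alabama' package does on its own.

```r
# Constants quoted in item 2a of the post
s <- c(200, 500, 400, 300, 100)
p <- c(0.1, 0.2, 0.4, 0.1, 0.2)
Z <- 1000
one_minus_r <- 0.8   # the scalar (1-r)

# f[i] = {(1-r)*sum(z+s) - z[i] - s[i]} * z[i]/(z[i]+s[i]) + Z - sum(z)
f_vals <- function(z) {
  (one_minus_r * sum(z + s) - z - s) * z / (z + s) + Z - sum(z)
}

# Cost function -phi, guarded so that trial points with f <= 0 (or a
# non-finite f) return a large finite value instead of NaN.
# 'big' is a hypothetical penalty value chosen for illustration.
neg_phi <- function(z, big = 1e10) {
  f <- f_vals(z)
  if (any(!is.finite(f)) || any(f <= 0)) return(big)
  -sum(p * log(f))
}

z0 <- c(0, 0, 200, 0, 0)        # starting point from the post
f_vals(z0)                      # all elements positive at the start
neg_phi(z0)                     # finite cost at the feasible start
neg_phi(c(0, 0, 2000, 0, 0))    # sum(z) > Z gives f <= 0, so the guard fires
```

A guard like this only prevents NaN from reaching the optimizer; it does not by itself keep the iterates feasible, and the flat penalty is not differentiable at the boundary, so a gradient-based method may still need a barrier term or tighter bounds.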
[R] PAM Clustering Ignores Cluster Number Parameter
I am using PAM with k = 10 clusters, but I get only one cluster ID for all my observations. I couldn't find any discussion of this in the help file or on the mailing lists. Is there a reasonable explanation for this result?

    cIDs <- pam(all, 10, cluster.only = TRUE, do.swap = FALSE)
    table(cIDs)
    cIDs
        0
    16671

The matrix of observations can be found at: http://129.94.136.7/file_dump/dario/all.obj

I'm using R version 2.13.0 (2011-04-13) on Platform: x86_64-unknown-linux-gnu (64-bit) and have cluster_1.13.3.

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010 Australia
[R] scatter plot: multiple Y variables and error bars
# Hi all,
# Using the example data that follows, can someone please show me how to get a
# scatterplot of points with error bars in the Y direction? Something like this
# works for one Y:
#   xYplot(Cbind(y1, l1, u1) ~ x1, data = y)
# but this:
#   xYplot(Cbind(y1, l1, u1) + Cbind(y2, l2, u2) ~ x1, data = y)
# doesn't give me what I would have expected, which is both sets of points with
# their respective error bars. Any examples would be greatly appreciated, and I
# am not partial to xYplot, so please share anything you like.

y1 <- c(1, 1.2, 0.9, 1, 1.2)
u1 <- c(1.3, 1.4, 1.3, 1.2, 1.4)
l1 <- c(0.8, 0.9, 0.85, 0.8, 0.9)
x1 <- c(1:5)
y2 <- c(1.2, 1.4, 1.2, 1.4, 1.5)
u2 <- c(1.5, 1.8, 1.6, 1.6, 1.7)
l2 <- c(1.1, 1.3, 1.0, 1.2, 1.4)
y <- data.frame(y1, u1, l1, x1)

## thanks ahead of time!

--
View this message in context: http://r.789695.n4.nabble.com/scatter-plot-multiple-Y-variables-and-error-bars-tp3531563p3531563.html
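Since the poster is not tied to xYplot, one possible approach uses only base graphics: draw each series with plot()/points() and add the vertical bars with arrows(). This is a sketch, not the Hmisc answer; the colours, plotting symbols and the variable name ylim_all are arbitrary choices made for illustration.

```r
# Example data from the post
x1 <- 1:5
y1 <- c(1, 1.2, 0.9, 1, 1.2)
u1 <- c(1.3, 1.4, 1.3, 1.2, 1.4)
l1 <- c(0.8, 0.9, 0.85, 0.8, 0.9)
y2 <- c(1.2, 1.4, 1.2, 1.4, 1.5)
u2 <- c(1.5, 1.8, 1.6, 1.6, 1.7)
l2 <- c(1.1, 1.3, 1.0, 1.2, 1.4)

# Make the y axis wide enough for every error bar of both series
ylim_all <- range(l1, u1, l2, u2)

# First series: points plus vertical bars (arrows with flat heads at both ends)
plot(x1, y1, ylim = ylim_all, pch = 16, col = "blue", xlab = "x1", ylab = "y")
arrows(x1, l1, x1, u1, angle = 90, code = 3, length = 0.05, col = "blue")

# Second series drawn on the same axes
points(x1, y2, pch = 17, col = "red")
arrows(x1, l2, x1, u2, angle = 90, code = 3, length = 0.05, col = "red")

legend("topleft", legend = c("y1", "y2"), pch = c(16, 17), col = c("blue", "red"))
```

Where the two series overlap, shifting one of them slightly along x (for example x1 + 0.05) keeps the bars from being drawn on top of each other.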
[R] How to make array of regression objects
Dear all,

I have fitted a couple of logistic regressions that model the distribution of some event. Currently, I store them like this:

    o1 <- lrm(...)
    o2 <- lrm(...)
    o3 <- lrm(...)
    ...

Then I wrote a function to pick the required regression object from these variables by its number:

    get_object <- function(obj_name, nModel) {
      eval(parse(text = paste("o <- ", obj_name, nModel, sep = "")))
      o
    }

Is there a better way to do it? I have tried to store the objects in a matrix using data.frame(), but they are destroyed in the process and predict() no longer recognizes them.

Regards,
Dmitrij Kudriavcev
[R] text mining analysis and word visualization of pdfs
Dear Lists,

What is the appropriate software package for dumping, say, 20 PDFs in a folder, then creating a data visualization with frequency counts of certain words, as well as measuring correlation within each file for certain key relationships or key words? I am doing text analysis of biases in publications sponsored by enterprise software vendors, and need to come up with a statistical threshold.

Regards,
Ajay Ohri
Websites- http://decisionstats.com
[R] Date_Time detected as Duplicated (but they are not!)
I have a problem with duplicated date_time stamps that I do not see as duplicated. I read a file with observations taken every 30 minutes:

    aur2009 = read.csv(paste(datadir, "AUR_ECPP_2009.csv", sep="/"), sep=";", stringsAsFactors=F)
    aur2009[1:3,1:5]
      Date.Time     E_filled E_filled_flag LE_filled LE_filled_flag
    1 1/1/2009 0:00        0           NaN      5.86            NaN
    2 1/1/2009 0:30        0           NaN      5.05            NaN
    3 1/1/2009 1:00        0           NaN      5.56            NaN

    delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M")
    aur2009[,1] = as.POSIXct(delme)
                Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
    1 2009-01-01 00:00:00        0           NaN      5.86            NaN
    2 2009-01-01 00:30:00        0           NaN      5.05            NaN
    3 2009-01-01 01:00:00        0           NaN      5.56            NaN

    aur2009ts = ts(aur2009)
    row.names(aur2009ts) = as.character(delme)
    aur2009ts[1:3,1:5]
                         Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
    2009-01-01 00:00:00 1230764400        0           NaN      5.86            NaN
    2009-01-01 00:30:00 1230766200        0           NaN      5.05            NaN
    2009-01-01 01:00:00 1230768000        0           NaN      5.56            NaN

Then:

    aur2009z = zoo(aur2009[,2:12], as.POSIXct(delme))
    Warning message:
    In zoo(aur2009[, 2:12], as.POSIXct(delme)) :
      some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique

So I investigate:

    any(duplicated(aur2009ts[,1]))
    [1] TRUE
    aur2009ts[(duplicated(aur2009ts[,1])),1:5]
                         Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
    2009-03-29 02:00:00 1238284800        0           NaN       1.2            NaN
    2009-03-29 02:30:00 1238286600        0           NaN       1.2            NaN

But note the surprise:

    aur2009ts[aur2009ts[,1]==1238284800,1:5]
                         Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
    2009-03-29 01:00:00 1238284800        0           NaN     -0.58            NaN
    2009-03-29 02:00:00 1238284800        0           NaN      1.20            NaN
    aur2009ts[aur2009ts[,1]==1238286600,1:5]
                         Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
    2009-03-29 01:30:00 1238286600        0           NaN     -0.34            NaN
    2009-03-29 02:30:00 1238286600        0           NaN      1.20            NaN

The dates detected as duplicated are actually different times that got the same value in the ts version of the object! What am I doing wrong? They are all observations every 30 min, so why are these two encoded as the same time?

Any help appreciated
Agus
Re: [R] Date_Time detected as Duplicated (but they are not!)
See under Note in ?strptime:

    Remember that in most timezones some times do not occur and some occur twice because of transitions to/from summer time. ‘strptime’ does not validate such times (it does not assume a specific timezone), but conversion by ‘as.POSIXct’ will do so.

On Wed, May 18, 2011 at 3:53 PM, Agustin Lobo agustin.l...@ictja.csic.es wrote:
> I have a problem with duplicated date_time stamps that I do not see as duplicated.
> [...]
> The dates detected as duplicated are actually different times that got the same value in the ts version of the object! What am I doing wrong? They are all observations every 30min, why are these 2 encoded as the same time?

--
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsum...@gmail.com
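The effect described in the ?strptime note can be illustrated with the four clock times around the transition. The sketch below is hedged: the time zone name "Europe/Madrid" is an assumption about the original poster's locale, and what as.POSIXct returns for clock times that do not exist in a zone is platform-dependent, so no particular output is claimed for that part. The UTC half relies only on the fact that UTC has no summer-time transitions.

```r
# Four consecutive half-hourly stamps in the format used in the thread
stamps <- c("3/29/2009 1:00", "3/29/2009 1:30", "3/29/2009 2:00", "3/29/2009 2:30")

# Interpreted as UTC: every clock time exists exactly once
utc <- as.POSIXct(strptime(stamps, "%m/%d/%Y %H:%M", tz = "UTC"))
any(duplicated(utc))      # no duplicates among these four values
diff(as.numeric(utc))     # constant spacing of 1800 seconds

# Interpreted in a zone whose clocks jumped forward that night (assumed zone):
# 02:00 and 02:30 local time did not exist, so the result for those two
# stamps depends on the platform (they may collapse, shift, or become NA).
local <- as.POSIXct(strptime(stamps, "%m/%d/%Y %H:%M", tz = "Europe/Madrid"))
print(as.numeric(local))
```

If the stamps in a file really are UTC, passing tz = "UTC" at parse time avoids this class of problem; duplicates that remain afterwards have to come from somewhere else, for example from the file itself.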
Re: [R] Help with PLSR with jack knife
Amit Patel amitrh...@yahoo.co.uk writes:

> BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, validation = "LOO")
> and
> BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, validation = "CV")
> [...]
> Now I am unsure of how to utilise these to identify the significant variables.

You can use the jackknife built into plsr to get an indication of which variables are significant, by adding the argument jackknife = TRUE to the plsr call. Use jack.test(BHPLS1) to do the test. But _PLEASE_ do read the Warning section in ?jack.test!

--
Regards,
Bjørn-Helge Mevik
Re: [R] help with PLSR Loadings
Amit Patel amitrh...@yahoo.co.uk writes:

> x <- loadings(BHPLS1)
> my loadings contain variable names rather than numbers.

No, they don't.

> str(x)
> loadings [1:94727, 1:10] -0.00113 -0.03001 -0.00059 -0.00734 -0.02969 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : chr [1:94727] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
>   ..$ : chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
>  - attr(*, "explvar")= Named num [1:10] 14.57 6.62 7.59 5.91 3.26 ...
>   ..- attr(*, "names")= chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...

Look at the first line of output. These are the values, and they are numeric (it is a matrix). The other lines are attributes of the matrix.

> plot(BHPLS1, "loadings", comps = 1:2, legendpos = "topleft", labels = "numbers", xlab = "nm")
> Error in loadingplot.default(x, ...) :
>   Could not convert variable names to numbers.

This says that loadingplot.default could not convert variable _names_ to numbers. That is not surprising, since the variable names are PCIList1, PCIList2, etc., and the documentation for loadingplot says:

    with '"numbers"', the variable names are converted to numbers, if possible. Variable names of the forms '"number"' or '"number text"' (where the space is optional), are handled.

So don't ask the plot function to use numbers as labels. Use e.g. names instead: labels = "names".

Tip: It is always a good idea to read the output and error messages very carefully.

--
Regards,
Bjørn-Helge Mevik
Re: [R] text mining analysis and word visualization of pdfs
Ajay Ohri wrote:

> What is the appropriate software package for dumping say 20 PDFS in a folder, then creating data visualization with frequency counts of certain words as well as measure correlation within each file for certain key relationships or key words.

pdftotext + Unix™ for Poets + R (ggplot2)

HTH.

--
Karl Ove Hufthammer
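Assuming the PDFs have already been converted to plain text (for instance with pdftotext, as suggested above), the counting step might look roughly like this in base R. The function name count_words, the demo sentences and the keyword list are all hypothetical and only serve to make the sketch self-contained.

```r
# Count how often each keyword occurs in one plain-text file
count_words <- function(path, keywords) {
  text  <- tolower(paste(readLines(path, warn = FALSE), collapse = " "))
  words <- unlist(strsplit(text, "[^a-z]+"))   # split on anything that is not a letter
  words <- words[nzchar(words)]                # drop empty tokens
  counts <- table(factor(words, levels = keywords))
  setNames(as.integer(counts), keywords)
}

# Tiny demo file standing in for the output of pdftotext
tmp <- tempfile(fileext = ".txt")
writeLines(c("Cloud software is great.", "Our cloud beats every other cloud!"), tmp)

res <- count_words(tmp, c("cloud", "software", "bias"))
res
```

For a whole folder, sapply(list.files(dir, pattern = "\\.txt$", full.names = TRUE), count_words, keywords = kw) would give a keywords-by-files matrix (with dir and kw supplied by the user) that can then be passed to a plotting function.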
[R] XLSM Question
Hi,

I would like to ask how to read an array from an *.xlsm file. I have tried different packages such as xlsReadWrite and RODBC, in each case with the latest versions of the add-ons and of R. When I tried RODBC, I received the following error:

    library(RODBC)
    con = odbcConnectExcel("C:\\Temp.xlsm")
    Warning messages:
    1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
      [RODBC] ERROR: state HY000, code -5120, message [Microsoft][ODBC Excel Driver] External table is not in the expected format.
    2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
      ODBC connection failed

Many thanks,
Alexandros
Re: [R] Date_Time detected as Duplicated (but they are not!)
And is it not possible to ignore daylight saving time? My data are in UTC, with no saving-time changes:

    delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M", tz="UTC")
    any(duplicated(delme))
    [1] TRUE
    delme = as.POSIXct(aur2009[,1], "%m/%d/%Y %H:%M", tz="UTC")
    any(duplicated(delme))
    [1] TRUE

Agus

On Wed, May 18, 2011 at 8:55 AM, Michael Sumner mdsum...@gmail.com wrote:
> See under Note in ?strptime:
> Remember that in most timezones some times do not occur and some occur twice because of transitions to/from summer time. ‘strptime’ does not validate such times (it does not assume a specific timezone), but conversion by ‘as.POSIXct’ will do so.
> [...]
Re: [R] Date_Time detected as Duplicated (but they are not!)
Dear Agustin:

What are the duplicated times? It looks as if they really do occur twice or more in your original data: perhaps two stamps closer together than the resolution of your clock?

    delme[duplicated(delme)]
    aur2009[duplicated(delme), 1]

On 18 May 2011, at 8:49 AM, Agustin Lobo wrote:
> and is it not possible to ignore savings time? My data are in UTC, with no savings time changes
> [...]
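One way to follow this suggestion is to look for duplicates in the raw character stamps before any date conversion, which separates repeats that are already in the file from repeats created by the conversion. The sketch below uses made-up stamps (the vector stamps is hypothetical, not the poster's data) and flags every member of a duplicated group, not just the later copies.

```r
# Hypothetical raw stamps as they might appear in the first column of the file
stamps <- c("3/29/2009 1:00", "3/29/2009 1:30", "3/29/2009 1:30", "3/29/2009 2:00")

# duplicated() marks only the second and later copies; combining it with
# fromLast = TRUE marks the first copy as well
dup_any <- duplicated(stamps) | duplicated(stamps, fromLast = TRUE)

which(dup_any)     # row numbers of all stamps that occur more than once
stamps[dup_any]    # the stamps themselves
```

If this finds nothing in the raw strings but duplicated() on the converted values still returns TRUE, the duplicates are being introduced by the conversion step.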
[R] Integral Symbol
Dear All,

I am documenting an R package, which means writing the *.Rd files inside the man folder of the package structure. I was wondering how to write the symbol for an integral in a formula, similar to this one in LaTeX:

    \int_{0}^{10} \Omega(t)dt

I already tried

    \deqn{\int_{0}^{10} \Omega(t)dt}

but it does not work. Any idea? Which math symbols does the R help system recognise?

Regards,
Javier Hidalgo Carrio
[R] DCC-GARCH model
Hello,

I have a few questions concerning the DCC-GARCH model and its programming in R. Here is what I want to do: I take quotes of two indices, SP500 and DJ, and the aim is to estimate the coefficients of the DCC-GARCH model for them. This is how I do it:

    library(tseries)
    p1 = get.hist.quote(instrument = "^gspc", start = "2005-01-07", end = "2009-09-04", compression = "w", quote = "AdjClose")
    p2 = get.hist.quote(instrument = "^dji", start = "2005-01-07", end = "2009-09-04", compression = "w", quote = "AdjClose")
    p = cbind(p1, p2)
    y = diff(log(p))*100
    y[,1] = y[,1] - mean(y[,1])
    y[,2] = y[,2] - mean(y[,2])
    T = length(y[,1])

    library(ccgarch)
    library(fGarch)
    f1 = garchFit(~ garch(1,1), data = y[,1], include.mean = FALSE)
    f1 = f1@fit$coef
    f2 = garchFit(~ garch(1,1), data = y[,2], include.mean = FALSE)
    f2 = f2@fit$coef

    a = c(f1[1], f2[1])
    A = diag(c(f1[2], f2[2]))
    B = diag(c(f1[3], f2[3]))
    dccpara = c(0.2, 0.6)
    dccresults = dcc.estimation(inia = a, iniA = A, iniB = B, ini.dcc = dccpara, dvar = y, model = "diagonal")
    dccresults$out
    DCCrho = dccresults$DCC[,2]
    matplot(DCCrho, type = 'l')

dccresults$out gives me the estimated coefficients of the DCC-GARCH model.

My first question: how can I check whether these coefficients are significant? How can I test them for significance?

My second question: is it true that matplot(DCCrho, type = 'l') shows the conditional correlation between the two indices in question?

And the third one: what actually is dccpara, and why do I get totally different DCC-alpha and DCC-beta coefficients if I change dccpara from c(0.2, 0.6) to, let's say, c(0.01, 0.98)? What determines which values should be chosen?

Hopefully someone will find time to give me a hand. Thank you very much in advance for looking at what I wrote and helping me.

Best regards
Marcin
[R] Need expert help with model.matrix
Dear experts:

Is it possible to create a new function based on stats:::model.matrix.default so that an alternative factor coding is used when the function is called, instead of the default factor coding? Basically, I'd like to reproduce the results in 'mat' below without having to explicitly specify my desired factor coding (identity matrices) in 'contrasts.arg'.

    dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
    ca <- contrasts(dd$a, contrasts = FALSE)  # 3 x 3 identity matrix
    cb <- contrasts(dd$b, contrasts = FALSE)  # 4 x 4 identity matrix
    mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a = ca, b = cb))

My approach was to modify the code of model.matrix by explicitly setting the contrasts argument of the contr.identity and contrasts functions to FALSE. This is shown at the bottom of the email in the function model.matrix2:

    contr.identity <- contr.treatment
    formals(contr.identity)$contrasts <- FALSE
    contrasts <- contrasts
    formals(contrasts)$contrasts <- FALSE

However, I believe this function is still using contrasts = TRUE, as it doesn't return the identity contrasts:

    mat2 <- model.matrix2(~ a + b, dd)

Any help here is much appreciated.

Axel.

-

    model.matrix2 <- function (object, data = environment(object),
                               contrasts.arg = NULL, xlev = NULL, ...)
    {
        t <- if (missing(data)) terms(object) else terms(object, data = data)
        if (is.null(attr(data, "terms")))
            data <- model.frame(object, data, xlev = xlev)
        else {
            reorder <- match(sapply(attr(t, "variables"), deparse,
                                    width.cutoff = 500)[-1L], names(data))
            if (any(is.na(reorder)))
                stop("model frame and formula mismatch in model.matrix()")
            if (!identical(reorder, seq_len(ncol(data))))
                data <- data[, reorder, drop = FALSE]
        }
        int <- attr(t, "response")
        contr.identity <- contr.treatment
        formals(contr.identity)$contrasts <- FALSE
        contrasts <- contrasts
        formals(contrasts)$contrasts <- FALSE
        if (length(data)) {
            contr.funs <- c('contr.identity', 'contr.poly')
            namD <- names(data)
            for (i in namD)
                if (is.character(data[[i]])) {
                    data[[i]] <- factor(data[[i]])
                    warning(gettextf("variable '%s' converted to a factor", i),
                            domain = NA)
                }
            isF <- sapply(data, function(x) is.factor(x) || is.logical(x))
            isF[int] <- FALSE
            isOF <- sapply(data, is.ordered)
            for (nn in namD[isF])
                if (is.null(attr(data[[nn]], "contrasts")))
                    contrasts(data[[nn]]) <- contr.funs[1 + isOF[nn]]
            #browser()
            if (!is.null(contrasts.arg) && is.list(contrasts.arg)) {
                if (is.null(namC <- names(contrasts.arg)))
                    stop("invalid 'contrasts.arg' argument")
                for (nn in namC) {
                    if (is.na(ni <- match(nn, namD)))
                        warning(gettextf("variable '%s' is absent, its contrast will be ignored", nn),
                                domain = NA)
                    else {
                        ca <- contrasts.arg[[nn]]
                        if (is.matrix(ca))
                            contrasts(data[[ni]], ncol(ca)) <- ca
                        else contrasts(data[[ni]]) <- contrasts.arg[[nn]]
                    }
                }
            }
        }
        else {
            isF <- FALSE
            data <- list(x = rep(0, nrow(data)))
        }
        ans <- .Internal(model.matrix(t, data))
        cons <- if (any(isF))
            lapply(data[isF], function(x) attr(x, "contrasts"))
        else NULL
        attr(ans, "contrasts") <- cons
        ans
    }
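An alternative to copying the internals of model.matrix.default is a thin wrapper that builds the identity contrasts for every factor and hands them to the stock model.matrix through contrasts.arg. This is a sketch under assumptions, not the approach from the post: the function name model_matrix_identity is hypothetical, and it only treats columns that are already factors in the model frame.

```r
# Wrapper: full dummy (identity) coding for every factor in the model frame
model_matrix_identity <- function(formula, data) {
  mf <- model.frame(formula, data)
  is_fac <- vapply(mf, is.factor, logical(1))
  # contrasts(x, contrasts = FALSE) returns the identity matrix for factor x
  ca <- lapply(mf[is_fac], contrasts, contrasts = FALSE)
  if (length(ca) == 0) ca <- NULL   # no factors: fall back to the default call
  model.matrix(formula, mf, contrasts.arg = ca)
}

# Data and reference result from the post
dd  <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))
ca  <- contrasts(dd$a, contrasts = FALSE)
cb  <- contrasts(dd$b, contrasts = FALSE)
mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a = ca, b = cb))

mat2 <- model_matrix_identity(~ a + b, dd)
colnames(mat2)
```

The wrapper avoids calling .Internal() from user code, which a copied and edited model.matrix2 would otherwise have to keep in step with each R version.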
Re: [R] Integral Symbol
See the section on writing mathematics in Rd files in the manual Writing R Extensions. It shows how to produce high-quality formulas in LaTeX-generated output and ASCII versions otherwise.

If you want to provide an excellent HTML version as well, the section on conditional text is also worth reading.

Uwe Ligges

On 18.05.2011 10:55, Javi Hidalgo wrote:
> Dear All, I am documenting a R package. That means writing the *.Rd files inside the \man folder of the package structure I was wondering how to write the symbol for an integral function in a formula.
> [...]
> I already tried \deqn{\int_{0}^{10} \Omega(t)dt} but it does not work. Any idea? Which math symbols does R-help recognise?
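As a concrete illustration of what that manual section describes, \deqn (and \eqn) accept an optional second argument: the first is used for LaTeX-based output such as the PDF manual, the second is a plain-text fallback for the other help formats. The wording of the fallback below is only an example, not something taken from the thread.

```latex
\deqn{\int_{0}^{10} \Omega(t)\,dt}{integral from 0 to 10 of Omega(t) dt}
```

With only one argument, the LaTeX source is shown verbatim in text help, which may be what "does not work" referred to in the original question.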
Re: [R] How to make array of regression objects
Hi Dmitrij,

I think the usual way is to store the results in a list:

o <- list()
o[[1]] <- lrm(...)
o[[2]] <- lrm(...)
o[[...]] <- lrm(...)

then you can access the results like o[[1]], o[[2]] ...

Best,
Ista

On Tue, May 17, 2011 at 11:53 PM, Dmitrij Kudriavcev dimitrij.kudriav...@ntsg.lt wrote:
> Dear all,
> I have fitted a couple of logistic regressions that model the distribution of some event. Currently, I store them like this:
> o1 <- lrm(...)
> o2 <- lrm(...)
> o3 <- lrm(...)
> ...
> Then I made a function to pick the required regression object from these variables by its number:
> get_object <- function(obj_name, nModel) { eval(parse(text = paste("o <- ", obj_name, nModel, sep = ""))) o }
> Is there a better way to do it? I have tried to store them in a matrix using data.frame(), but the objects are destroyed after that and the predict() function does not recognize them.
> Regards,
> Dmitrij Kudriavcev

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
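The list approach can be sketched end to end as follows. This is only an illustration: it swaps in glm() from base R for lrm() so that it runs without the rms package, and the simulated data and the three formulas are invented.

```r
# Simulated data (an assumption for the sketch, not from the thread).
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, 1, plogis(0.5 * d$x))

# One fitted model per formula, all kept in a single list.
formulas <- list(y ~ 1, y ~ x, y ~ x + I(x^2))
models <- lapply(formulas, function(f) glm(f, family = binomial, data = d))

# Pick a model by number; the object stays intact, so predict() works.
p <- predict(models[[2]], newdata = data.frame(x = 0), type = "response")
```

Because each list element is the unmodified fit object, no eval(parse(...)) lookup is needed.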
[R] Multidimensional List access
Dear all, I have data organized like this: Data[[CharacteristicsList1]][[CharacteristicsList2]][[CharacteristicsList3]][[CharacteristicsList4]], where at the CharacteristicsList4 level a data frame is stored with columns named V1, V2, ..., Vn. Is there an easy way to get a vector of all the values in, let's say, V3 across all CharacteristicsLists without the need for for-loops? I figured there could be something like Data[[:]][[:]][[:]][[:]][[V3]]. Many thanks, Ingo
[R] R-code in R-file documentation
Hello List, I would like to insert code from .r files into a LaTeX appendix (possibly using Sweave). I was considering:

<<results=tex,eval=true,echo=true>>=
source("file.r")
@

but I would just like to echo the code and not evaluate the code within the file. Maybe:

<<results=tex,eval=true,echo=false>>=
cat("\\begin{verbatim}")
readLines("file.r")
cat("\\end{verbatim}")
@

The above works well other than the line numbers which are included (which isn't so bad). Thanks for the help and ideas! Brian
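The "line numbers" are most likely the [1], [2], ... index prefixes that R adds when a character vector is printed. A hedged sketch of one way around them, writing the lines with writeLines() instead of printing them (the helper name echo_verbatim and the temporary file are made up for the example):

```r
# A stand-in for file.r, created only for this sketch.
src <- tempfile(fileext = ".r")
writeLines(c("x <- 1:10", "mean(x)"), src)

# Emit the file contents inside a verbatim environment without evaluating them.
echo_verbatim <- function(path) {
  cat("\\begin{verbatim}\n")
  writeLines(readLines(path))   # writeLines() adds no [1] index prefixes
  cat("\\end{verbatim}\n")
}

echo_verbatim(src)
```

Inside a Sweave chunk with results=tex and echo=false, the body of such a helper would take the place of the bare readLines() call.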
[R] [R-pkgs] new package SamplingStrata
Dear R users, I would like to announce that a new package (SamplingStrata version 0.9) for the optimal stratification of sampling frames is now available on CRAN. This package offers an approach for determining the best stratification of a sampling frame, the one that ensures the minimum sample size while satisfying precision constraints in a multivariate and multidomain case. This approach is based on a genetic algorithm: each solution (i.e. a particular partition of the sampling frame into strata) is considered as an individual in a population to be evolved; the fitness of all individuals is evaluated by calculating (using the Bethel-Chromy algorithm) the sample size satisfying accuracy constraints on the target estimates. The package covers all the phases, from the optimisation of the sampling frame, up to the design of the stratified sample, ending with the selection of the units. In the tar.gz (directory: \inst\doc) there is a vignette ('SamplingStrataVignette.pdf') showing a complete application, from the optimisation of the sampling frame to the selection of the required sample. I would appreciate any feedback.

Sincerely, Giulio Barcaroli

--
Giulio Barcaroli
Methods, Tools and Methodological Support
Italian National Institute of Statistics
barca...@istat.it

___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
[R] Query Gene ontology
Dear R-users, I'm looking for a way to query the gene ontology in R like in the GO browser (AmiGO). I tried different packages (NCBI2R, GOsim ...) but I did not find a way to extract the gene names associated with a GO term. Could you tell me if there is a way to do that? Thanks, Hervé

Hervé Lemaître
U1000 Imagerie et Psychiatrie INSERM - CEA - Faculté de Médecine Paris Sud 11
Service Hospitalier Frédéric Joliot
4, Place du Général Leclerc
91401 ORSAY, FRANCE
Tél: (+33) 1 69 86 77 84
Fax: (+33) 1 69 86 78 10
Re: [R] text mining analysis and word visualization of pdfs
On Wed, May 18, 2011 at 1:44 PM, Karl Ove Hufthammer k...@huftis.org wrote:
> Ajay Ohri wrote:
>> What is the appropriate software package for dumping, say, 20 PDFs in a folder, then creating data visualizations with frequency counts of certain words, as well as measuring correlation within each file for certain key relationships or key words?
> pdftotext + Unix for Poets + R (ggplot2)

What about the tm package? I am a beginner and I don't know much about this, but I recall that it does have the ability to handle PDFs. A few words from the experts would be nice.
Re: [R] Query Gene ontology
You will find a wider and more experienced audience for this question on the Bioconductor mailing list.

-- David

On May 18, 2011, at 5:14 AM, LEMAITRE Hervé Université Paris Sud wrote:
> Dear R-users, I'm looking for a way to query the gene ontology in R like in the GO browser (AmiGO). I tried different packages (NCBI2R, GOsim ...) but I did not find the way to extract genes names associated to a GO term. Could you tell me if there is a way to do that?

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Query Gene ontology
LEMAITRE Hervé Université Paris Sud herve.lemaitre at cea.fr writes:
> I'm looking for a way to query the gene ontology in R like in the GO browser (AmiGO). I tried different packages (NCBI2R, GOsim ...) but I did not find the way to extract genes names associated to a GO term. Could you tell me if there is a way to do that?

You will probably have better luck posting this question on the Bioconductor mailing list (read the posting guide, and search the list archives, first ...)

Ben Bolker
Re: [R] R-code in R-file documentation
Brian Oney zenlines at gmail.com writes:
> Hello List, I would like to insert code from .r files into a LaTeX appendix (possibly using Sweave). I was considering:
[snip]

Wouldn't it be easier to use the LaTeX listings package?
https://stat.ethz.ch/pipermail/r-help/2006-September/113688.html
https://stat.ethz.ch/pipermail/r-help/2006-September/113103.html
Re: [R] Need expert help with model.matrix
On Wed, May 18, 2011 at 7:15 AM, Axel Urbiz axel.ur...@gmail.com wrote:
> Dear experts:
> Is it possible to create a new function based on stats:::model.matrix.default so that an alternative factor coding is used when the function is called, instead of the default factor coding? Basically, I'd like to reproduce the results in 'mat' below, without having to explicitly specify my desired factor coding (identity matrices) in 'contrasts.arg'.
>
> dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
> ca <- contrasts(dd$a, contrasts = FALSE) # 3 x 3 identity matrix
> cb <- contrasts(dd$b, contrasts = FALSE) # 4 x 4 identity matrix
> mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a = ca, b = cb))
>
> My approach was to modify the code in model.matrix by explicitly setting the contrasts argument of the contr.identity and contrasts functions to FALSE. This is shown at the bottom of the email in the function model.matrix2:
>
> contr.identity <- contr.treatment
> formals(contr.identity)$contrasts <- FALSE
> contrasts <- contrasts
> formals(contrasts)$contrasts <- FALSE
>
> However, I believe this function is using contrasts = TRUE, as it doesn't return the identity contrasts:
>
> mat2 <- model.matrix2(~ a + b, dd)
>
> Any help here is much appreciated. Axel.

If your objective in all this is ultimately to get lm coefficients in the original coding then see ?dummy.coef

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
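As a side note to the question quoted above, here is a hedged sketch of one way to get the identity coding without copying model.matrix.default: build the contrasts.arg list programmatically for every factor column. The wrapper name model_matrix_identity is hypothetical and it assumes all factors of interest are columns of the data frame passed in.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# Reference result from the original post, with hand-built identity matrices.
ca <- contrasts(dd$a, contrasts = FALSE)
cb <- contrasts(dd$b, contrasts = FALSE)
mat <- model.matrix(~ a + b, dd, contrasts.arg = list(a = ca, b = cb))

# Hypothetical wrapper: derive the identity coding for each factor automatically.
model_matrix_identity <- function(formula, data) {
  is_fac <- vapply(data, is.factor, logical(1))
  ident <- lapply(data[is_fac], contrasts, contrasts = FALSE)
  model.matrix(formula, data, contrasts.arg = ident)
}

mat2 <- model_matrix_identity(~ a + b, dd)
```

This leaves the default method untouched and keeps the coding choice in one small function.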
Re: [R] Question on approximations of full logistic regression model
Thank you for your advice, Tim. I am reading your paper and the other materials on your website. I could not find an R package for your bootknife method. Is there any R package for this procedure?

(11/05/17 14:13), Tim Hesterberg wrote:
> My usual rule is that whatever gives the widest confidence intervals in a particular problem is most accurate for that problem :-)
>
> Bootstrap percentile intervals tend to be too narrow. Consider the case of the sample mean; the usual formula CI is
>   xbar +- t_alpha sqrt( (1/(n-1)) sum((x_i - xbar)^2)) / sqrt(n)
> The bootstrap percentile interval for symmetric data is roughly
>   xbar +- z_alpha sqrt( (1/(n )) sum((x_i - xbar)^2)) / sqrt(n)
> It is narrower than the formula CI because
>   * z quantiles rather than t quantiles
>   * standard error uses divisor of n rather than (n-1)
>
> In stratified sampling, the narrowness factor depends on the stratum sizes, not the overall n. In regression, estimates for some quantities may be based on a small subset of the data (e.g. coefficients related to rare factor levels).
>
> This doesn't mean we should give up on the bootstrap. There are remedies for the bootstrap biases, see e.g. Hesterberg, Tim C. (2004), Unbiasing the Bootstrap-Bootknife Sampling vs. Smoothing, Proceedings of the Section on Statistics and the Environment, American Statistical Association, 2924-2930. http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf
>
> And other methods have their own biases, particularly in nonlinear applications such as logistic regression.
>
> Tim Hesterberg
>
>> Thank you for your reply, Prof. Harrell. I agree with you. Dropping only one variable does not actually help a lot. I have one more question. During analysis of this model I found that the confidence intervals (CIs) of some coefficients provided by bootstrapping (bootcov function in rms package) were narrower than the CIs provided by the usual variance-covariance matrix, and the CIs of other coefficients were wider. My data has no cluster structure. I am wondering which CIs are better. I guess the bootstrap ones, but is that right? I would appreciate your help in advance.
>> -- KH
>>
>> (11/05/16 12:25), Frank Harrell wrote:
>>> I think you are doing this correctly except for one thing. The validation and other inferential calculations should be done on the full model. Use the approximate model to get a simpler nomogram but not to get standard errors. With only dropping one variable you might consider just running the nomogram on the entire model.
>>> Frank
>>>
>>> KH wrote:
>>>> Hi, I am trying to construct a logistic regression model from my data (104 patients and 25 events). I built a full model consisting of five predictors with the use of penalization from the rms package (lrm, pentrace etc.) because of the events-per-variable issue. Then, I tried to approximate the full model by a step-down technique, predicting L from all of the component variables using ordinary least squares (ols in the rms package) as follows. I would like to know whether I am doing this right or not.
>>>>
>>>> library(rms)
>>>> plogit <- predict(full.model)
>>>> full.ols <- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure, sigma=1)
>>>> fastbw(full.ols, aics=1e10)
>>>>
>>>>  Deleted        Chi-Sq  d.f.  P       Residual  d.f.  P       AIC     R2
>>>>  stenosis         1.41  1     0.2354    1.41    1     0.2354   -0.59  0.991
>>>>  x2              16.78  1     0.       18.19    2     0.0001   14.19  0.882
>>>>  procedure       26.12  1     0.       44.31    3     0.       38.31  0.711
>>>>  ClinicalScore   25.75  1     0.       70.06    4     0.       62.06  0.544
>>>>  x1              83.42  1     0.      153.49    5     0.      143.49  0.000
>>>>
>>>> Then, I fitted an approximation to the full model using the most important variables (until the R^2 for predictions from the reduced model against the original Y drops below 0.95), that is, dropping stenosis.
>>>>
>>>> full.ols.approx <- ols(plogit ~ x1+x2+ClinicalScore+procedure)
>>>> full.ols.approx$stats
>>>>
>>>>  n        Model L.R.   d.f.   R2         g          Sigma
>>>>  104.000  487.9006640  4.000  0.9908257  1.3341718  0.1192622
>>>>
>>>> This approximate model had an R^2 against the full model of 0.99. Therefore, I updated the original full logistic model, dropping stenosis as a predictor.
>>>>
>>>> full.approx.lrm <- update(full.model, ~ . -stenosis)
>>>> validate(full.model, bw=F, B=1000)
>>>>
>>>>             index.orig  training  test    optimism  index.corrected  n
>>>>  Dxy        0.6425      0.7017    0.6131   0.0887   0.5539           1000
>>>>  R2         0.3270      0.3716    0.3335   0.0382   0.2888           1000
>>>>  Intercept  0.          0.        0.0821  -0.0821   0.0821           1000
>>>>  Slope      1.          1.        1.0548  -0.0548   1.0548           1000
>>>>  Emax       0.          0.        0.0263   0.0263   0.0263           1000
>>>>
>>>> validate(full.approx.lrm, bw=F, B=1000)
>>>>
>>>>             index.orig  training  test    optimism  index.corrected  n
>>>>  Dxy        0.6446      0.6891    0.6265   0.0626   0.5820           1000
>>>>  R2         0.3245      0.3592    0.3428   0.0164   0.3081           1000
>>>>  Intercept  0.          0.
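The comparison of the formula interval and the percentile interval for a sample mean, as quoted above, can be checked numerically. This is a small sketch on simulated data; the sample size of 20, the seed and the 5000 resamples are arbitrary choices made for the illustration.

```r
set.seed(42)
x <- rnorm(20)            # simulated sample, an assumption for the sketch
n <- length(x)
xbar <- mean(x)
alpha <- 0.05

# Half-width of the usual t interval: t quantile, divisor n - 1 inside sd().
half_t <- qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)

# Half-width of the rough approximation to the percentile interval:
# z quantile, divisor n in the variance.
half_z <- qnorm(1 - alpha / 2) * sqrt(mean((x - xbar)^2)) / sqrt(n)

# The bootstrap percentile interval itself, for comparison.
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))
pct <- quantile(boot_means, c(alpha / 2, 1 - alpha / 2))

c(t_halfwidth = half_t, z_halfwidth = half_z, pct_width = unname(diff(pct)))
```

Both narrowing effects act in the same direction, so half_z is always smaller than half_t; the resampled interval varies from run to run.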
Re: [R] Smooth contour of a map
I've practically solved my problem (the code is below), but one last thing is not perfect: when I call plot() and then polygon(), there is a margin between my raster and the window. I think it comes from the axes of plot(), but I have not found how to remove it. Does someone have a solution, please?

Pierre Bruyer

## smooth contour
contours <- contourLines(V2b, levels = paliers)
par(mar = c(0,0,0,0))
plot(1, col = "white", main = "polygon()", asp = 1, axes = FALSE, ann = FALSE, xlim = c(0,1), ylim = c(0,1), type = "n", method = c("image"))
for (i in seq_along(contours)) {
    x <- contours[[i]]$x
    y <- contours[[i]]$y
    c <- contours[[i]]$level
    j <- 1
    tmp <- 0
    while (j < length(level[,1]) && tmp == 0) {
        if (level[j,1] == c) { tmp <- j }
        j <- j + 1
    }
    polygon(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y, col = colgraph[tmp+1], border = NA)
}

On 17 May 2011 at 16:44, Pierre Bruyer wrote:
> The result is good, thanks a lot, but how can I fill my raster with color using this method?
>
> On 17 May 2011 at 15:43, Duncan Murdoch wrote:
>> I don't think filled.contour gives you access to the contour lines. If you use contourLines() to compute them, then you can draw them using code like this:
>>
>> contours <- contourLines(V2b, levels = paliers)
>> for (i in seq_along(contours)) {
>>     x <- contours[[i]]$x
>>     y <- contours[[i]]$y
>>     lines(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y)
>> }
>>
>> but as I said, you won't get great results. A better way is to use a finer grid, e.g. by fitting a smooth surface to your set of points and using predictions from the model to interpolate.
>>
>> Duncan Murdoch
>>
>> On 17/05/2011 9:35 AM, Pierre Bruyer wrote:
>>> I work with large datasets (1 points) so I can't post them, but my function is:
>>>
>>> create_map <- function(grd, level, map_output, format = c("jpeg"), width_map = 150, height_map = 150, ...) {
>>>     ## sp <- spline(x = grd[,1], y = grd[,2])
>>>     grd2 <- matrix(grd[,3], nrow = sqrt(length(grd[,3])), ncol = sqrt(length(grd[,3])), byrow = FALSE)
>>>     V2b <- grd2
>>>     ## creation of breaks for colors
>>>     i <- 1
>>>     paliers <- c(-1.0E300)
>>>     while (i <= length(level[,1])) {
>>>         paliers <- c(paliers, level[i,1])
>>>         i <- i + 1
>>>     }
>>>     paliers <- c(paliers, 1.0E300)
>>>     ## scale color creation
>>>     i <- 1
>>>     colgraph <- c(rgb(255,255,255, maxColorValue = 255))
>>>     while (i <= length(level[,2])) {
>>>         colgraph <- c(colgraph, rgb(level[i,2], level[i,3], level[i,4], maxColorValue = 255))
>>>         i <- i + 1
>>>     }
>>>     ## user can choose the output format (default is jpeg)
>>>     switch(format,
>>>         png = png(map_output, width = width_map, height = height_map),
>>>         jpeg = jpeg(map_output, width = width_map, height = height_map, quality = 100),
>>>         bmp = bmp(map_output, width = width_map, height = height_map),
>>>         tiff = tiff(map_output, width = width_map, height = height_map),
>>>         jpeg(map_output, width = width_map, height = height_map))
>>>     ## drawing map
>>>     ## remove margins
>>>     par(mar = c(0,0,0,0))
>>>     filled.contour(V2b, col = colgraph, levels = paliers, asp = 1, axes = FALSE, ann = FALSE)
>>>     dev.off()
>>> }
>>>
>>> where grd is an xyz data frame, map_output is the path+name of the output image file, and level is a matrix like this:
>>>
>>> level <- matrix(0,10,4)
>>> level[1,1] <- 1.E+00; level[2,1] <- 3.E+00; level[3,1] <- 5.E+00; level[4,1] <- 1.E+01; level[5,1] <- 1.5000E+01
>>> level[6,1] <- 2.E+01; level[7,1] <- 3.E+01; level[8,1] <- 4.E+01; level[9,1] <- 5.E+01; level[10,1] <- 7.5000E+01
>>> level[1,2] <- 102; level[2,2] <- 102; level[3,2] <- 102; level[4,2] <- 93; level[5,2] <- 204
>>> level[6,2] <- 248; level[7,2] <- 241; level[8,2] <- 239; level[9,2] <- 224; level[10,2] <- 153
>>> level[1,3] <- 153; level[2,3] <- 204; level[3,3] <- 204; level[4,3] <- 241; level[5,3] <- 255
>>> level[6,3] <- 243; level[7,3] <- 189; level[8,3] <- 126; level[9,3] <- 14; level[10,3] <- 0
>>> level[1,4] <- 153; level[2,4] <- 204; level[3,4] <- 153; level[4,4] <- 107; level[5,4] <- 102
>>> level[6,4] <- 33; level[7,4] <- 59; level[8,4] <- 63; level[9,4] <- 14; level[10,4] <- 51
>>>
>>> On 17 May 2011 at 15:17, Duncan Murdoch wrote:
>>>> On 17/05/2011 8:24 AM, Pierre Bruyer wrote:
>>>>> Thank you for your answer, but the function spline() (and a lot of other functions in R) can't take as its parameters the original contours, which are defined by a vector, i.e.:
>>>>
>>>> If you post some reproducible code to generate the contours, someone will show you how to use splines to interpolate them.
>>>>
>>>> Duncan Murdoch
>>>>
>>>>> ## creation of breaks for colors
>>>>> i <- 1
>>>>> paliers <- c(-1.0E300)
>>>>> while (i <= length(level[,1])) {
Re: [R] Integral Symbol
Thanks. I was just reading the section on mathematics in the manual "Writing R Extensions", which describes the basic LaTeX-style support. However, it seems that it does not support the LaTeX integral symbol \int, although it does support, e.g., the summation symbol \sum. Has anyone had this experience when documenting R packages? Does anyone know of an R package where the integral symbol appears in the help files?

Regards,
Javier Hidalgo Carrio

Date: Wed, 18 May 2011 13:14:54 +0200
From: lig...@statistik.tu-dortmund.de
To: havyhida...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Integral Symbol

> See the section on writing mathematics in Rd files in the manual "Writing R Extensions". It shows how to produce high-quality formulas in LaTeX-generated output and ASCII versions otherwise. If you want to provide an excellent HTML version as well, the section on conditional text is also worth reading.
> Uwe Ligges
[snip]
Re: [R] How to make array of regression objects
Thank you, Ista, that is exactly what I was looking for :)

Regards,
Dmitrij Kudriavcev

2011/5/18 Ista Zahn iz...@psych.rochester.edu
> Hi Dmitrij,
> I think the usual way is to store the results in a list:
> o <- list()
> o[[1]] <- lrm(...)
> o[[2]] <- lrm(...)
> o[[...]] <- lrm(...)
> then you can access the results like o[[1]], o[[2]] ...
> Best,
> Ista
[snip]
Re: [R] subsetting a list of dataframes
On May 17, 2011, at 7:13 PM, Lara Poplarski wrote:
> Thank you all, this is exactly what I had in mind, except that I still have to get my head around apply et al. Back to the books for me then!

Read the lapply( ...) call as: For every element in the object named `data`, send that element to a function that returns TRUE if its first dimension is greater than one, returns FALSE if its first dimension is one, and returns nothing (actually a vector with zero elements) if it doesn't have a (first) dim attribute, and finally return the ordered collection of those values as a list which is assigned the name 'entries.with.nrows'.

> Lara
>
> On Tue, May 17, 2011 at 2:41 PM, Jannis bt_jan...@yahoo.de wrote:
>> Have a look at lapply(). Something like:
>> entries.with.nrows=lapply(data,function(x)dim(x)[1]>1)
>> should give you a vector with the elements of the list that you seek marked with TRUE. This vector can then be used to extract a subset from your list by:
>> data.reduced=data[entries.with.nrows]
>> Or similar
>> HTH
>> Jannis
>>
>> --- Lara Poplarski larapoplar...@gmail.com wrote on Tue, 17.5.2011:
>>> From: Lara Poplarski larapoplar...@gmail.com
>>> Subject: [R] subsetting a list of dataframes
>>> To: r-help@r-project.org
>>> Date: Tuesday, May 17, 2011, 20:24
>>>
>>> Hello All, I have a list of dataframes, and I need to subset it by keeping only those dataframes in the list that meet a certain criterion. Specifically, I need to generate a second list which only includes those dataframes whose number of rows is > 1. Could someone suggest how to do this? I have come close to what I need with loops and such, but there must be a less clumsy way...
>>> Many thanks, Lara

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
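A runnable sketch of the subsetting discussed in this thread, on a made-up list of three data frames. It uses sapply() rather than lapply(), on the assumption that a plain logical vector is wanted for indexing (a list cannot be used as a subscript), and shows Filter() as an equivalent one-liner.

```r
# Toy input, invented for the illustration.
data <- list(
  a = data.frame(x = 1:3),
  b = data.frame(x = 1),
  c = data.frame(x = 1:2)
)

# Logical vector: TRUE for data frames with more than one row.
keep <- sapply(data, function(d) nrow(d) > 1)
data.reduced <- data[keep]

# Equivalent formulation with Filter().
data.reduced2 <- Filter(function(d) nrow(d) > 1, data)

names(data.reduced)
```

Only the elements with more than one row remain in the reduced list.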
Re: [R] Smooth contour of a map
You may be looking for the par settings xaxs="i" and yaxs="i", which, if you add them to the plot call, will prevent the default behavior of adding 4% padding to the axis ranges. See ?par

-- David.

On May 18, 2011, at 8:27 AM, Pierre Bruyer wrote:
> I've practically solved my problem (the code is below), but one last thing is not perfect: when I call plot() and then polygon(), there is a margin between my raster and the window. I think it comes from the axes of plot(), but I have not found how to remove it. Does someone have a solution, please?
>
> Pierre Bruyer
>
> ## smooth contour
> contours <- contourLines(V2b, levels = paliers)
> par(mar = c(0,0,0,0))
> plot(1, col = "white", main = "polygon()", asp = 1, axes = FALSE, ann = FALSE, xlim = c(0,1), ylim = c(0,1), type = "n", method = c("image"))
> for (i in seq_along(contours)) {
>     x <- contours[[i]]$x
>     y <- contours[[i]]$y
>     c <- contours[[i]]$level
>     j <- 1
>     tmp <- 0
>     while (j < length(level[,1]) && tmp == 0) {
>         if (level[j,1] == c) { tmp <- j }
>         j <- j + 1
>     }
>     polygon(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y, col = colgraph[tmp+1], border = NA)
> }
[snip]
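The effect of the xaxs and yaxs settings suggested at the top of this reply can be seen by comparing par("usr") with and without them. This is a small sketch on a null PDF device so that no file is written; it leaves out asp = 1 from the original call, because a fixed aspect ratio can widen one of the ranges again.

```r
pdf(NULL)                       # off-screen device, nothing is written to disk
par(mar = c(0, 0, 0, 0))

# Default axis style "r": the plot region is extended by 4% at each end.
plot(1, type = "n", axes = FALSE, ann = FALSE, xlim = c(0, 1), ylim = c(0, 1))
usr_default <- par("usr")

# Style "i": the plot region matches xlim and ylim exactly.
plot(1, type = "n", axes = FALSE, ann = FALSE, xlim = c(0, 1), ylim = c(0, 1),
     xaxs = "i", yaxs = "i")
usr_tight <- par("usr")

invisible(dev.off())
rbind(usr_default, usr_tight)
```

The second row shows the user coordinates running exactly from 0 to 1 in both directions.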
[R] matrix help (first occurrence of variable in column)
Dear R help, Apologies for the less than informative subject line. I will do my best to describe my problem. Consider the following matrix:

mdat <- matrix(c(1,0,1,1,1,0), nrow = 2, ncol = 3, byrow = TRUE, dimnames = list(c("T1", "T2"), c("sp.1", "sp.2", "sp.3")))
mdat

In my actual data I have time (rows) and species occurrences (0/1 values, columns). I want to count the number of new species that occur at a given time sample. For the matrix above the answer would be 1. Is there a simple way to figure out if the species has never occurred before and then sum them up?

Thanks in advance,
Michael

--
Michael Denslow
I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology
Appalachian State University
Boone, North Carolina U.S.A.
-- AND --
Communications Manager
Southeast Regional Network of Expertise and Collections
sernec.org
36.214177, -81.681480 +/- 3103 meters
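One possible way to approach the question, offered as a sketch rather than a definitive answer: find the first row in which each species occurs, then count how many species have their first occurrence in each row. The names first_seen and new_per_time are invented for the example.

```r
mdat <- matrix(c(1, 0, 1,
                 1, 1, 0),
               nrow = 2, ncol = 3, byrow = TRUE,
               dimnames = list(c("T1", "T2"), c("sp.1", "sp.2", "sp.3")))

# Row index of the first occurrence of each species (NA if it never occurs).
first_seen <- apply(mdat, 2, function(col) match(1, col))

# Number of species first seen at each time sample.
new_per_time <- tabulate(first_seen, nbins = nrow(mdat))
names(new_per_time) <- rownames(mdat)
new_per_time
```

For the example matrix this counts two new species at T1 and one at T2, which agrees with the answer of 1 stated in the question for the second sample.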
[R] Change pattern in histograms in ggplot2
Hi, I am wondering if there is a way to change the pattern of the fill in a histogram in ggplot2? By default the fill is solid and I'd like to add some sort of pattern to make it more visible that these are different levels of a factor. Thanks! Chris
Re: [R] Integral Symbol
On 18/05/2011 9:09 AM, Javi Hidalgo wrote:
> Thanks. I was just reading the section on mathematics in the manual "Writing R Extensions", which describes the basic LaTeX-style support. However, it seems that it does not support the LaTeX integral symbol \int, although it does support, e.g., the summation symbol \sum. Has anyone had this experience when documenting R packages?

It appears in the topic shown by ?Special, "Special Functions of Mathematics", but only in the LaTeX version. You can see that in the PDF version of the Reference Manual, or (if you have things set up correctly) by saying

options(help_type="pdf")
?Special

I don't know of any examples where someone has shown an integral sign in text or HTML versions. It's not a symbol supported by the R help system; it would depend on hand coding the right thing.

Duncan Murdoch

> Does anyone know of an R package where the integral symbol appears in the help files?
> Regards,
> Javier Hidalgo Carrio
[snip]
[R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb
Thanks Bill. Do you and others think that a link to this guide (or another) should be included in the Posting Guide and/or R FAQ?

-- Bert

On Tue, May 17, 2011 at 4:07 PM, bill.venab...@csiro.au wrote:
> Amen to all of that, Bert. Nicely put. The google style guide (not
> perfect, but a thoughtful contribution on these kinds of issues) has
> avoiding attach() as its very first line. See
> http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html
>
> I would add, though, that not enough people seem yet to be aware of
> within(...), a companion of with(...) in a way, but used for modifying
> data frames or other kinds of list objects. It should be seen as a
> more flexible replacement for transform() (well, almost).
>
> The difference between with() and within() is as follows:
>
> with(data, expr, ...) allows you to evaluate 'expr' with 'data'
> providing the primary source for variables, and returns *the evaluated
> expression* as the result. By contrast
>
> within(data, expr, ...) again uses 'data' as the primary source for
> variables when evaluating 'expr', but now 'expr' is used to modify the
> variables in 'data' and returns *the modified data set* as the result.
>
> I use this a lot in the data preparation phase of a project,
> especially, which is usually the longest, trickiest, most important,
> but least discussed aspect of any data analysis project.
>
> Here is a simple example using within() for something you cannot do in
> one step with transform():
>
>     polyData <- within(data.frame(x = runif(500)), {
>       x2 <- x^2
>       x3 <- x*x2
>       b <- runif(4)
>       eta <- cbind(1,x,x2,x3) %*% b
>       y <- eta + rnorm(x, sd = 0.5)
>       rm(b)
>     })
>
> check:
>
>     str(polyData)
>     'data.frame': 500 obs. of 5 variables:
>      $ x  : num 0.5185 0.185 0.5566 0.2467 0.0178 ...
>      $ y  : num [1:500, 1] 1.343 0.888 0.583 0.187 0.855 ...
>      $ eta: num [1:500, 1] 1.258 0.788 1.331 0.856 0.63 ...
>      $ x3 : num 1.39e-01 6.33e-03 1.72e-01 1.50e-02 5.60e-06 ...
>      $ x2 : num 0.268811 0.034224 0.309802 0.060844 0.000315 ...
>
> Bill Venables.
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
> Sent: Wednesday, 18 May 2011 12:08 AM
> To: Peter Ehlers
> Cc: R list
> Subject: Re: [R] Post-hoc tests in MASS using glm.nb
>
> Folks:
>
>> Only if the user hasn't yet been introduced to the with() function,
>> which is linked to on the ?attach page. Note also this sentence from
>> the ?attach page: "attach can lead to confusion." I can't remember
>> the last time I needed attach().
>> Peter Ehlers
>
> Yes. But perhaps it might be useful to flesh this out with a bit of
> commentary. To this end, I invite others to correct or clarify the
> following.
>
> The potential confusion comes from requiring R to search for the data.
> There is a rigorous process by which this is done, of course, but it
> requires that the runtime environment be consistent with that process,
> and the programmer who wrote the code may not have control over that
> environment. The usual example is that one has an object named, say,
> "a" in the formula and in the attached data and another "a" also in
> the global environment. Then the wrong "a" would be found. The same
> thing can happen if another data set gets attached in a position
> before the one of interest. (Like Peter, I haven't used attach() in so
> long that I don't know whether any warning messages are issued in such
> cases).
>
> Using the "data =" argument when available or the with() function when
> not avoids this potential confusion and tightly couples the data to be
> analyzed with the analysis.
>
> I hope this clarifies the previous posters' comments.
>
> Cheers, Bert
>
> [... non-germane material snipped ...]

--
Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions.
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
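To make the with() / within() distinction described in Bill's message concrete, here is a small sketch in base R. The data frame `d` and the column names `x2`, `x3` and `tmp` are made up for illustration; only with(), within() and rm() come from the thread.

```r
# Hypothetical toy data, not from the thread.
set.seed(1)
d <- data.frame(x = runif(5))

# with(): evaluates the expression using the columns of d and returns
# the value of that expression (here a single number).
m <- with(d, mean(x^2))

# within(): evaluates the expression the same way, but returns a
# modified copy of d. Later assignments can use earlier ones (x3 uses
# x2), which a single transform() call cannot do.
d2 <- within(d, {
  x2  <- x^2
  x3  <- x * x2
  tmp <- 1      # scratch variable ...
  rm(tmp)       # ... removed so it does not become a column
})

print(m)
str(d2)
```

Note that `d` itself is unchanged; within() returns a new data frame, so the result has to be assigned.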
Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb
This is the first time I've seen an R Style Guide. I will admit that I haven't looked for one previously, but nevertheless I still haven't seen one. My code style simply evolved (perhaps, chugged along) by reading posts from other users who post to the r-help community.

I regularly program with a colleague who is a Java software development specialist, hacking together code that we both develop. Since his coding style differs substantially from mine and from the conventions described for R, we end up modifying my code to follow his convention. For example, he typically likes to name variables in this form: variable_ , which the guide frowns on.

I think this guide will be very helpful. First, for me to become more proficient and conventional in following R stylistics. Secondly, he will see why R users do things the way they do.

I appreciate you posting the link to the guide. Much appreciated.

Steve

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

From: Bert Gunter gunter.berton@gene.com
Sent by: r-help-bounces@r-project.org
Date: 05/18/2011 09:47 AM
To: bill.venab...@csiro.au
cc: r-help@r-project.org
Subject: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

> Thanks Bill. Do you and others think that a link to this guide (or
> another) should be included in the Posting Guide and/or R FAQ?
>
> -- Bert
[R] Changing order of facet grid in ggplot2
Hi

I am running the following code:

    sym <- c(sym1,sym2,sym4)
    lifedxm <- c("O-BD","O-WELL","O-UNI")
    life <- c(lifedxm,lifedxm,lifedxm)
    tp <- c("TP-ANY","TP-ANY","TP-ANY","TP-SUB","TP-SUB","TP-SUB","TP-CLIN","TP-CLIN","TP-CLIN")
    data <- data.frame(sym,life,tp)
    qplot(life,geom="bar",weight=sym,ylim=c(0,1),legend=F,data=data) + facet_grid(. ~ tp)

This creates a facet grid where TP-ANY is followed by TP-CLIN and then TP-SUB. I'd like to create a grid where TP-ANY is followed by TP-SUB then TP-CLIN. Is this possible?

Thanks,
Chris
Re: [R] subsetting a list of dataframes
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lara Poplarski
> Sent: Tuesday, May 17, 2011 4:14 PM
> To: r-help@r-project.org
> Subject: Re: [R] subsetting a list of dataframes
>
> Thank you all, this is exactly what I had in mind, except that I still
> have to get my head around apply et al. Back to the books for me then!
>
> Lara
>
> On Tue, May 17, 2011 at 2:41 PM, Jannis bt_jan...@yahoo.de wrote:
>> Have a look at lapply(). Something like:
>>
>>     entries.with.nrows=lapply(data,function(x)dim(x)[1]>1)

Note that the above suggestion does not work in R 2.13.0:

    listOfDataFrames <- list(three=data.frame(x=11:13,y=101:103),
                             one=data.frame(x=1,y=2),
                             five=data.frame(x=1:5,y=11:15))
    listOfDataFrames[lapply(listOfDataFrames,function(x)nrow(x)>1)]
    Error in listOfDataFrames[lapply(listOfDataFrames, function(x) nrow(x) : invalid subscript type 'list'

lapply(...) always returns a list and lists are not acceptable as subscripts. Instead, make the subscript one of the following:

    as.logical(lapply(...))
    sapply(...)   # and hope that FUN always returns TRUE or FALSE and length(list)>0
    vapply(..., FUN.VALUE=FALSE)

It may be a bit quicker to do the '>0' outside of the loop, as in

    as.integer(lapply(listOfDataFrames, FUN=nrow)) > 0

or

    vapply(listOfDataFrames, FUN=nrow, FUN.VALUE=0L) > 0

but you need a pretty long list to notice.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>> should give you a vector with the elements of the list that you seek
>> marked with TRUE. This vector can then be used to extract a subset
>> from your list by:
>>
>>     data.reduced=data[entries.with.nrows]
>>
>> Or similar
>>
>> HTH
>> Jannis
>>
>> --- Lara Poplarski larapoplar...@gmail.com wrote on Tue, 17.5.2011:
>>> From: Lara Poplarski larapoplar...@gmail.com
>>> Subject: [R] subsetting a list of dataframes
>>> To: r-help@r-project.org
>>> Date: Tuesday, 17 May 2011, 20:24
>>>
>>> Hello All, I have a list of dataframes, and I need to subset it by
>>> keeping only those dataframes in the list that meet a certain
>>> criterion. Specifically, I need to generate a second list which only
>>> includes those dataframes whose number of rows is >1. Could someone
>>> suggest how to do this? I have come close to what I need with loops
>>> and such, but there must be a less clumsy way...
>>>
>>> Many thanks,
>>> Lara
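Putting the pieces of this thread together, a minimal runnable sketch of the subsetting might look like the following. It reuses the `listOfDataFrames` example from the reply above; the names `keep`, `subsetList` and `subsetList2`, and the use of Filter() as an alternative, are my own additions rather than anything suggested by the posters.

```r
listOfDataFrames <- list(three = data.frame(x = 11:13, y = 101:103),
                         one   = data.frame(x = 1, y = 2),
                         five  = data.frame(x = 1:5, y = 11:15))

# vapply() returns a plain logical vector (never a list), so it is a
# valid subscript, and FUN.VALUE makes it fail loudly if the function
# ever returns something other than a single TRUE/FALSE.
keep <- vapply(listOfDataFrames, function(d) nrow(d) > 1,
               FUN.VALUE = logical(1))
subsetList <- listOfDataFrames[keep]

# Base R's Filter() expresses the same idea in one call.
subsetList2 <- Filter(function(d) nrow(d) > 1, listOfDataFrames)

print(names(subsetList))
```

Single brackets are used for the subscript so that the result is still a list of data frames.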
[R] Help with 2-D plot of k-mean clustering analysis
Hi, all

I would like to use R to perform k-means clustering on my data which included 33 samples measured with ~1000 variables. I have already used kmeans package for this analysis, and showed that there are 4 clusters in my data. However, it's really difficult to plot this cluster in 2-D format since the huge number of variables. One possible way is to project the multidimensional space into 2-D platform, but I could not find any good way to do that. Any suggestions or comments will be really helpful!

Thanks,
Meng
Re: [R] Changing order of facet grid in ggplot2
    data$tp <- factor(data$tp, levels = c("TP-ANY","TP-SUB","TP-CLIN"))
    qplot(life,geom="bar",weight=sym,ylim=c(0,1),legend=F,data=data) + facet_grid(. ~ tp)

On Wednesday, May 18, 2011 at 9:14 AM, Christopher Desjardins wrote:
> This creates a facet grid where TP-ANY is followed by TP-CLIN and then
> TP-SUB. I'd like to create a grid where TP-ANY is followed by TP-SUB
> then TP-CLIN. Is this possible?
>
> Thanks,
> Chris
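The reason the answer above works can be seen without ggplot2 at all: facets follow the order of the factor levels, and factor() sorts levels alphabetically unless told otherwise. A short base-R sketch (the variable names `tp_default` and `tp_ordered` are mine):

```r
tp <- c("TP-ANY", "TP-ANY", "TP-ANY",
        "TP-SUB", "TP-SUB", "TP-SUB",
        "TP-CLIN", "TP-CLIN", "TP-CLIN")

# Default: levels are sorted, so TP-CLIN comes before TP-SUB.
tp_default <- factor(tp)

# Explicit levels: the order given here is the order that is kept.
tp_ordered <- factor(tp, levels = c("TP-ANY", "TP-SUB", "TP-CLIN"))

print(levels(tp_default))
print(levels(tp_ordered))
```

Only the level order changes; the values stored in the vector are the same in both cases.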
Re: [R] Change pattern in histograms in ggplot2
There are a number of discussion threads on the google groups ggplot2 page: here are two of them.

http://groups.google.com/group/ggplot2/browse_thread/thread/ca546f7f4d636deb/e0763a54b7735c35?lnk=gstq=fill+pattern#e0763a54b7735c35
http://groups.google.com/group/ggplot2/browse_thread/thread/9a9c081d235efc24/d319d4500174cdd7?lnk=gstq=fill+pattern#d319d4500174cdd7

Scott

On Wednesday, May 18, 2011 at 8:39 AM, Christopher Desjardins wrote:
> Hi, I am wondering if there is a way to change the pattern of the fill
> in histogram in ggplot2? By default the fill is solid and I'd like to
> add some sort of pattern to make it more visible that these are
> different levels of a factor.
>
> Thanks!
> Chris
[R] Convolution confusion:
Hi,

I'm new to R, and I'm a bit confused with the convolve() function. If I do:

    x <- c(1, 2, 3)
    convolve(x, rev(x), TRUE, "open")
    = 9 12 10 4 1

But I expected: 3 8 14 8 3 (like in Octave/MATLAB - conv(x, reverse(x)) )

    3 2 1 x 1 2 3 =
    3 2 1
    0 6 4 2
    0 0 9 6 3
    = 3 8 14 8 3

The thing is, that convolve(x, x, TRUE, "open") works. For me it feels very confusing, that convolution does the reverse itself but the help suggest to reverse it again. The help file says: Note that the usual definition of convolution of two sequences x and y is given by convolve(x, rev(y), type = "o").

Thanks for your help,
Alex
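One way to check the help-file statement quoted above is to compare convolve() against a direct, loop-based convolution. The sketch below is my own illustration: `conv_direct` is a hypothetical helper written for this note (it plays the role of Octave's conv), and the interpretation rests on my reading of ?convolve rather than on anything confirmed in the thread.

```r
# Hypothetical helper: textbook convolution, z[k] = sum of a[i] * b[j]
# over all pairs with i + j - 1 == k.
conv_direct <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    for (j in seq_along(b)) {
      out[i + j - 1] <- out[i + j - 1] + a[i] * b[j]
    }
  }
  out
}

x <- c(1, 2, 3)
y <- rev(x)                      # the second sequence from the post

usual        <- conv_direct(x, y)                    # like conv(x, y)
via_convolve <- convolve(x, rev(y), type = "open")   # help-file recipe

print(usual)
print(round(via_convolve, 10))   # FFT-based, so round off tiny errors
```

Under that reading, the second argument passed to convolve() is rev(y), where y is the sequence you actually want to convolve with. Since y is already rev(x) here, rev(y) is just x, which would explain why convolve(x, x, type = "open") gave the poster the expected result.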
Re: [R] R example code of Split-plot Manova
Hi,

I'm a PhD student in Milan (Italy). I read the OBrienKaiser example in ?Anova in the car package. I think that this is not a MANOVA split-plot design. I need to know if someone knows of R code for a split-plot MANOVA. Is there someone who can help me?

Thank you for your kindness,
Riccardo

-- View this message in context: http://r.789695.n4.nabble.com/R-example-code-of-Split-plot-Manova-tp1593985p3532630.html
[R] retrieving gbif data
Hi,

I am trying to use the gbif function in the dismo package and it does not seem to work. I get the same error message every time:

    Error in if (sp) geo <- TRUE : argument is not interpretable as logical

It does not matter whether I query for my species of interest or if I copy and paste those included in the help for the function or the vignette of the package. I am not sure whether it is because I am doing something wrong, although I am inclined to think there is a bug in the gbif code. Any help will be appreciated.

Thank you in advance,
Rafa

--
National Evolutionary Synthesis Center (NESCent)
http://www.nescent.org/
2024 W. Main Street, Suite A200
Durham, NC 27705
r...@nescent.org
919.668.9107
Re: [R] Integral Symbol
Thanks! This is what I was looking for. Apparently the symbol is not supported in the text and HTML help, so there it should be written out as "integral", but in the PDF version the integral symbol does appear.

Regards,
Javier.

> Date: Wed, 18 May 2011 09:39:28 -0400
> From: murdoch.dun...@gmail.com
> To: havyhida...@hotmail.com
> CC: r-help@r-project.org; lig...@statistik.tu-dortmund.de
> Subject: Re: [R] Integral Symbol
>
> It appears in the topic shown by ?Special, "Special Functions of
> Mathematics", but only in the LaTeX version. [...] It's not a symbol
> supported by the R help system, it would depend on hand coding the
> right thing.
>
> Duncan Murdoch
Re: [R] Help with 2-D plot of k-mean clustering analysis
I wonder if it makes sense to reduce the dimensionality of the variables somehow?

David Cross
d.cr...@tcu.edu
www.davidcross.us

On May 18, 2011, at 9:41 AM, Meng Wu wrote:
> Hi, all I would like to use R to perform k-means clustering on my data
> which included 33 samples measured with ~1000 variables. I have
> already used kmeans package for this analysis, and showed that there
> are 4 clusters in my data. However, it's really difficult to plot this
> cluster in 2-D format since the huge number of variables. One possible
> way is to project the multidimensional space into 2-D platform, but I
> could not find any good way to do that. Any suggestions or comments
> will be really helpful!
>
> Thanks,
> Meng
[R] Loop stopping after 1 iteration
Hi all,

This is a very basic question, but I just can't figure out why R is handling a loop I'm writing the way it is. Here is the script I have written:

    grid_2_series<-function(gage_handle,data_type,filename)
    series_name<-paste(gage_handle,data_type,sep="_")
    data_grid<-read.table(file=paste(filename,".txt",sep=""))
    num_rows_data<-nrow(data_grid)-1
    num_cols_data<-ncol(data_grid)-4
    num_obs<-num_rows_data*num_cols_data
    time_series<-matrix(nrow=0,ncol=2)
    for(i in 1:length(num_obs)){
      rownum<-ceiling(i/31)+1
      colnum<-if(i%%31==0){
        35
      }else{
        (i%%31)+4
      }
      year<-data_grid[rownum,2]
      month<-data_grid[rownum,3]
      day<-colnum-4
      date_string<-paste(month,day,year,sep="/")
      date<-as.Date(date_string,format='%m/%d/%Y')
      value<-as.character(data_grid[rownum,colnum])
      time_series<-rbind(time_series,c(date,value))
    }

The script is working as I intended it to (goes through a matrix of data where column 2 is the year, column 3 is the month, and row 1 columns 5-35 are the day of the month the observation was recorded [I have included a screenshot below to help visualize what I'm talking about] and converts the grid into a 2 column time series where column 1 is the date and column 2 is the value of the observation), but it is stopping after only 1 iteration.

Two questions:

1.) Does anyone know of an existing function to accomplish this task?
2.) Why is the loop stopping after 1 iteration? I have it written to iterate up to the total number of observations (20,615 in one case).

Thank you for your help and sorry for this question which I'm sure has a very simple answer.

Thanks again,
Billy

-- View this message in context: http://r.789695.n4.nabble.com/Loop-stopping-after-1-iteration-tp3532988p3532988.html
Re: [R] Integral Symbol
Can’t you just embed it in the html as a symbol? &#x222b; or &int;

I’d have thought you could also just put it straight into the document as a character (∫), as long as the html is stored as unicode.

U+222B http://en.wikipedia.org/wiki/Integral_symbol

On 18 May 2011, at 4:04 PM, Javi Hidalgo wrote:
> Thanks! This is what I was looking for. Apparently, it is not
> supported then it should be as integral, but in the pdf version
> appears the integral symbol.
>
> Regards,
> Javier.
Re: [R] Help with 2-D plot of k-mean clustering analysis
On Wed, May 18, 2011 at 7:41 AM, Meng Wu mengwu1...@gmail.com wrote:
> Hi, all I would like to use R to perform k-means clustering on my data
> which included 33 samples measured with ~1000 variables. I have
> already used kmeans package for this analysis, and showed that there
> are 4 clusters in my data. However, it's really difficult to plot this
> cluster in 2-D format since the huge number of variables. One possible
> way is to project the multidimensional space into 2-D platform, but I
> could not find any good way to do that. Any suggestions or comments
> will be really helpful!

You could use multidimensional scaling, function cmdscale(), to produce a 2-dimensional representation of your data, then plot it using colors that correspond to the clusters. For example, suppose your data is stored in matrix X (1000x33). I assume you clustered the samples, not the variables, so you have a vector label[] with length 33 that has values between 1 and 4. Since k-means uses Euclidean distance, you would re-create the distance

    dst = dist(t(X))

then feed it into cmdscale()

    mds = cmdscale(dst);

then plot it:

    plot(mds, col = label)

HTH,
Peter
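For anyone who wants to try Peter's recipe end to end, here is a self-contained sketch. The matrix `X` is filled with random numbers purely as a stand-in for the real 1000 x 33 data (which is not available in the thread), so the resulting picture is meaningless; the point is only the sequence of calls. The seed, the plotting options and the `km` object are my own choices.

```r
set.seed(42)

# Stand-in data: 1000 variables (rows) measured on 33 samples (columns).
X <- matrix(rnorm(1000 * 33), nrow = 1000, ncol = 33)

# Cluster the samples, so samples must be the rows passed to kmeans().
km    <- kmeans(t(X), centers = 4)
label <- km$cluster              # length 33, values in 1..4

# Euclidean distances between samples, then classical MDS down to 2-D.
dst <- dist(t(X))
mds <- cmdscale(dst, k = 2)      # 33 x 2 matrix of coordinates

# One point per sample, coloured by its k-means cluster.
plot(mds, col = label, pch = 19,
     xlab = "MDS coordinate 1", ylab = "MDS coordinate 2")
```

Classical MDS on Euclidean distances gives the same configuration as the first two principal components, so prcomp() would be another route to a similar plot.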
Re: [R] Loop stopping after 1 iteration
Hi,

The answer to (2) is that num_obs is a scalar, so length(num_obs) is 1. You probably wanted to do

    for (i in 1:num_obs)

instead.

Best wishes
Martyn

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of armstrwa
Sent: 18 May 2011 16:18
To: r-help@r-project.org
Subject: [R] Loop stopping after 1 iteration

Hi all,

This is a very basic question, but I just can't figure out why R is handling a loop I'm writing the way it is. Here is the script I have written:

    grid_2_series <- function(gage_handle, data_type, filename)
    series_name <- paste(gage_handle, data_type, sep = "_")
    data_grid <- read.table(file = paste(filename, ".txt", sep = ""))
    num_rows_data <- nrow(data_grid) - 1
    num_cols_data <- ncol(data_grid) - 4
    num_obs <- num_rows_data * num_cols_data
    time_series <- matrix(nrow = 0, ncol = 2)
    for (i in 1:length(num_obs)) {
      rownum <- ceiling(i / 31) + 1
      colnum <- if (i %% 31 == 0) {
        35
      } else {
        (i %% 31) + 4
      }
      year <- data_grid[rownum, 2]
      month <- data_grid[rownum, 3]
      day <- colnum - 4
      date_string <- paste(month, day, year, sep = "/")
      date <- as.Date(date_string, format = '%m/%d/%Y')
      value <- as.character(data_grid[rownum, colnum])
      time_series <- rbind(time_series, c(date, value))
    }

The script is working as I intended it to (goes through a matrix of data where column 2 is the year, column 3 is the month, and row 1 columns 5-35 are the day of the month the observation was recorded [I have included a screenshot below to help visualize what I'm talking about] and converts the grid into a 2 column time series where column 1 is the date and column 2 is the value of the observation), but it is stopping after only 1 iteration.

[screenshot: matrix_screenshot.jpg]

Two questions:

1.) Does anyone know of an existing function to accomplish this task?

2.) Why is the loop stopping after 1 iteration? I have it written to iterate up to the total number of observations (20,615 in one case).

Thank you for your help and sorry for this question which I'm sure has a very simple answer.

Thanks again,
Billy
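The difference between the two loop headers can be checked directly. This is a minimal sketch with a made-up num_obs; seq_len() is used instead of 1:num_obs only because it also behaves sensibly when the count is zero.

```r
num_obs <- 5 * 4                 # hypothetical: 5 rows x 4 columns of observations
length(num_obs)                  # num_obs is a single number, so its length is 1

# The loop as posted: 1:length(num_obs) is 1:1, so the body runs once.
count_wrong <- 0
for (i in 1:length(num_obs)) count_wrong <- count_wrong + 1

# The intended loop: one pass per observation.
count_right <- 0
for (i in seq_len(num_obs)) count_right <- count_right + 1

c(wrong = count_wrong, right = count_right)
```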
Re: [R] Loop stopping after 1 iteration
I knew it would be something simple. Thanks for catching that, Martyn.

Billy
Re: [R] Loop stopping after 1 iteration
On May 18, 2011, at 11:18 AM, armstrwa wrote:

> Hi all,
>
> This is a very basic question, but I just can't figure out why R is handling a loop I'm writing the way it is. Here is the script I have written:
>
> [...]
>
> The script is working as I intended it to (goes through a matrix of data where column 2 is the year, column 3 is the month, and row 1 columns 5-35 are the day of the month the observation was recorded [I have included a screenshot below to help visualize what I'm talking about] and converts the grid into a 2 column time series where column 1 is the date and column 2 is the value of the observation), but it is stopping after only 1 iteration.

The jpg file will not be seen by most readers of this list.

> [screenshot: matrix_screenshot.jpg]
>
> Two questions:
>
> 1.) Does anyone know of an existing function to accomplish this task?

Since you have only defined the task in terms of a loop that is not working properly and a picture that is not attached, and included no test data, I have only a vague understanding of the task. Perhaps you want stack, or melt from the reshape2 package. You might consider explaining more completely what you want in natural language. (And reading the Posting Guide with attention to acceptable attachment formats.)

> 2.) Why is the loop stopping after 1 iteration? I have it written to iterate up to the total number of observations (20,615 in one case).

Most likely the cause is a misunderstanding of how length is interpreted for a vector. You probably want 1:num_obs rather than 1:length(num_obs), since length(num_obs) is most probably 1 in this case.

> Thank you for your help and sorry for this question which I'm sure has a very simple answer.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
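To make the stack suggestion concrete, here is a small sketch of turning a year/month/day-column grid into a date/value series. The data frame grid and its column names d1 to d3 are invented for illustration; the real file layout (31 day columns, a header row) was only described, not posted.

```r
# Hypothetical grid: one row per month, one column per day of the month.
grid <- data.frame(year  = c(2010, 2010),
                   month = c(1, 2),
                   d1 = c(1.1, 2.1), d2 = c(1.2, 2.2), d3 = c(1.3, 2.3))
day_cols <- c("d1", "d2", "d3")

# stack() puts the day columns on top of each other: 'values' holds the data,
# 'ind' records which column each value came from.
long <- stack(grid[day_cols])

# stack() goes column by column, so year and month repeat once per day column.
long$year  <- rep(grid$year,  times = length(day_cols))
long$month <- rep(grid$month, times = length(day_cols))
long$day   <- as.integer(sub("d", "", long$ind))

# An explicit format makes impossible dates (e.g. 31 February) come out as NA.
long$date <- as.Date(sprintf("%d-%02d-%02d", long$year, long$month, long$day),
                     format = "%Y-%m-%d")

series <- long[order(long$date), c("date", "values")]
series
```

This avoids growing time_series with rbind() inside a loop, which becomes slow for tens of thousands of observations.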
Re: [R] Loop stopping after 1 iteration
William,

num_obs is a single number, not a vector of observations, so length(num_obs) will evaluate to one. Hence your for loop control part will expand to

    for (i in 1:1)

while it should probably read:

    for (i in 1:num_obs)

Best
Hugo

On Wednesday 18 May 2011 17:18:15 armstrwa wrote:

> [...]
>
>     for(i in 1:length(num_obs)){
>
> [...]
>
> 2.) Why is the loop stopping after 1 iteration? I have it written to iterate up to the total number of observations (20,615 in one case).
[R] Dataset Quasi Poisson
Hello,

I'm looking for a dataset suitable for quasi-Poisson regression. The result must be significantly different from that of classic Poisson regression. Can you help me, please? It is for my last university exam.

Thanks a lot
Re: [R] Email out of R (code)
How does this compare to create.post()?

Kevin

On Tue, May 17, 2011 at 3:44 PM, Daniel Malter dan...@umd.edu wrote:

> Hi all,
>
> I thought I would post code to send an email out of R. The code uses Grothendieck and Bellosta's interface package rJython for executing Python from R. The code itself provides basic email functionality for email servers requiring authentication. It should be easy to extend it (e.g., for sending attachments). I hope it's useful.
>
>     require(rJython)
>     rJython <- rJython()
>     rJython$exec("import smtplib")
>     rJython$exec("from email.MIMEText import MIMEText")
>     rJython$exec("import email.utils")
>
>     mail <- c(
>       # Email settings
>       "fromaddr = 'sender email address'",
>       "toaddrs = 'recipient email address'",
>       "msg = MIMEText('This is the body of the message.')",
>       "msg['From'] = email.utils.formataddr(('sender name', fromaddr))",
>       "msg['To'] = email.utils.formataddr(('recipient name', toaddrs))",
>       "msg['Subject'] = 'Simple test message'",
>
>       # SMTP server credentials
>       "username = 'sender login'",
>       "password = 'sender password'",
>
>       # Set SMTP server and send email, e.g., google mail SMTP server
>       "server = smtplib.SMTP('smtp.gmail.com:587')",
>       "server.ehlo()",
>       "server.starttls()",
>       "server.ehlo()",
>       "server.login(username,password)",
>       "server.sendmail(fromaddr, toaddrs, msg.as_string())",
>       "server.quit()")
>
>     jython.exec(rJython, mail)
>
> Best,
> Daniel
Re: [R] Loop stopping after 1 iteration
Didn't mean to snub you guys, Hugo and David. I didn't see your posts before. Thanks for the advice.
[R] logistic regression lrm() output
Hi,

I am trying to run a simple logistic regression using lrm() to calculate an odds ratio. I found a confusing output when I use summary() on the fit object, which gave an OR that is totally different from simply taking exp(coefficient); see below:

    library(rms)
    dat <- read.table("dat.txt", sep = '\t', header = T, row.names = NULL)
    d <- datadist(dat)
    options(datadist = 'd')
    (fit <- lrm(response ~ x, data = dat, x = T, y = T))

    Logistic Regression Model

    lrm(formula = response ~ x, data = dat, x = T, y = T)

                          Model Likelihood     Discrimination    Rank Discrim.
                             Ratio Test            Indexes          Indexes
    Obs           150    LR chi2      17.11    R2       0.191    C       0.763
     0            128    d.f.             1    g        1.209    Dxy     0.526
     1             22    Pr(> chi2) <0.0001    gr       3.350    gamma   0.528
    max |deriv| 1e-11                          gp       0.129    tau-a   0.132
                                               Brier    0.111

              Coef    S.E.   Wald Z Pr(>|Z|)
    Intercept -5.0059 0.9813 -5.10  <0.0001
    x          0.5647 0.1525  3.70   0.0002

As you can see, the odds ratio for x is exp(0.5647)=1.75892. But if I run the following using summary():

    summary(fit)

                 Effects              Response : response

     Factor      Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
     x           3.9003 6.2314 2.3311 1.32   0.36 0.62       2.01
      Odds Ratio 3.9003 6.2314 2.3311 3.73     NA 1.86       7.49

What is this output? None of the numbers is the odds ratio (1.75892) that I calculated by using exp(). Can anyone explain?

Thanks
John
[R] strucchange package Linux help
When I run the code below on Macintosh and Windows, the plot comes out fine. However, on Linux, the png generated from the R console is invalid, and loading strucchange crashes rkward. Is this a known issue on Linux and, if so, is there a workaround? Many thanks!

    require(strucchange)
    data(RealInt)
    bp.ri <- breakpoints(RealInt ~ 1, h = 15)
    summary(bp.ri)
    fac.ri <- breakfactor(bp.ri, breaks = 3, label = 'seg')
    fm.ri <- lm(RealInt ~ 0 + fac.ri)
    summary(fm.ri)
    vcov.ri <- function(x, ...) kernHAC(x, kernel = 'Quadratic Spectral', prewhite = 1, approx = 'AR(1)', ...)
    coef(bp.ri, breaks = 3)
    sapply(vcov(bp.ri, breaks = 3, vcov = vcov.ri), sqrt)
    confint(bp.ri, breaks = 3, vcov = vcov.ri)
    png('SCC2.png')
    plot(RealInt)
    lines(as.vector(time(RealInt)), fitted(fm.ri), col = 4)
    lines(confint(bp.ri, breaks = 3, vcov = vcov.ri))
    dev.off()
    print(paste('Plot in SCC2.png in', getwd()))
Re: [R] Email out of R (code)
I do not know. I was not aware of create.post() and could hardly find any information on it. From what I have seen at first glance, it seems that create.post() opens either your standard email program or your web browser, which the Python code does not. Instead, the Python code needs the R library that interfaces with Python. I also do not know how create.post() handles server authentication (though my blind guess would be with the settings of your email program or browser mail). To stop guessing: if you want a solid comparison, I am afraid you have to do it yourself.

Best,
Daniel
Re: [R] Help with 2-D plot of k-mean clustering analysis
Hi Meng,

> I would like to use R to perform k-means clustering on my data which included 33 samples measured with ~1000 variables. I have already used kmeans package for this analysis, and showed that there are 4 clusters in my data. However, it's really difficult to plot this cluster in 2-D format since the huge number of variables. One possible way is to project the multidimensional space into 2-D platform, but I could not find any good way to do that. Any suggestions or comments will be really helpful!

For suggestions it would be extremely helpful to tell us what kind of variables your 1000 variables are.

Parallel coordinate plots plot values over (many) variables. Whether this is useful depends very much on your variables: e.g. I have spectral channels; they have an intrinsic order and the values have physically the same meaning (and almost the same range), so the parallel coordinate plot comes naturally (it in fact produces the spectra).

Claudia

> Thanks,
> Meng

-- 
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.belei...@ipht-jena.de
phone: +49 3641 206-133
fax: +49 2641 206-399
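A parallel coordinate plot of the kind Claudia describes can be drawn with base graphics. The sketch below uses invented "spectra" (33 samples, 200 ordered channels, four groups with peaks at different positions); none of these numbers come from the thread, and the plot is only meaningful when the variables share an order and a scale, as she points out.

```r
set.seed(1)
n_samples <- 33
n_vars    <- 200                      # fewer than ~1000 channels, to keep the sketch light
channel   <- seq_len(n_vars)
label     <- rep(1:4, length.out = n_samples)   # stand-in for the k-means cluster labels

# Noise plus one group-specific peak per sample.
spectra <- matrix(rnorm(n_samples * n_vars, sd = 0.1), nrow = n_samples)
for (g in 1:4) {
  bump <- dnorm(channel, mean = 40 * g, sd = 10) * 25
  rows <- label == g
  spectra[rows, ] <- sweep(spectra[rows, , drop = FALSE], 2, bump, "+")
}

# matplot() draws one line per column, so samples go into columns via t().
matplot(channel, t(spectra), type = "l", lty = 1, col = label,
        xlab = "variable (channel)", ylab = "value")
```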
Re: [R] Help with 2-D plot of k-mean clustering analysis
One idea: pick the three largest clusters; their centers determine a plane. Project your data into that plane.

albyn

On Wed, May 18, 2011 at 06:55:39PM +0200, Claudia Beleites wrote:

> Hi Meng,
>
> For suggestions it would be extremely helpful to tell us what kind of variables your 1000 variables are.
>
> Parallel coordinate plots plot values over (many) variables. Whether this is useful, depends very much on your variables [...]
>
> Claudia

-- 
Albyn Jones
Reed College
jo...@reed.edu
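One way to carry out this projection is sketched below on simulated data. The data matrix, the group sizes and the use of a QR decomposition to get an orthonormal basis of the plane are all assumptions made for illustration, not something specified in the thread.

```r
set.seed(7)
n_vars <- 50
grp <- rep(1:4, times = c(12, 10, 7, 4))          # 33 hypothetical samples in 4 groups
X <- matrix(rnorm(33 * n_vars), nrow = 33)        # samples in rows here
for (g in 1:4) {
  cols <- ((g - 1) * 5 + 1):(g * 5)
  X[grp == g, cols] <- X[grp == g, cols] + 5
}

km <- kmeans(X, centers = 4, nstart = 20)

# Centers of the three largest clusters.
top3 <- order(km$size, decreasing = TRUE)[1:3]
ctr  <- km$centers[top3, , drop = FALSE]

# Two vectors spanning the plane through the three centers,
# turned into an orthonormal basis with a QR decomposition.
B <- cbind(ctr[2, ] - ctr[1, ], ctr[3, ] - ctr[1, ])
Q <- qr.Q(qr(B))

# Shift the data so the first center is the origin, then take coordinates in the plane.
proj <- sweep(X, 2, ctr[1, ]) %*% Q

plot(proj, col = km$cluster, pch = 19,
     xlab = "plane axis 1", ylab = "plane axis 2")
```

Points from the fourth cluster are projected too, so they may overlap the others if their center lies far from the plane.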
Re: [R] logistic regression lrm() output
Why is a one-unit change in x an interesting range for the purpose of estimating an odds ratio? The default in summary() is the inter-quartile-range odds ratio, as clearly stated in the rms documentation.

Frank

array chip wrote:

> Hi, I am trying to run a simple logistic regression using lrm() to calculate a odds ratio. I found a confusing output when I use summary() on the fit object which gave some OR that is totally different from simply taking exp(coefficient), see below:
>
> [...]
>
> As you can see, the odds ratio for x is exp(0.5647)=1.75892. But if I run the following using summary():
>
> [...]
>
> What are these output? none of the numbers is the odds ratio (1.75892) that I calculated by using exp(). Can any explain?
>
> Thanks
> John

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
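The two odds ratios can be reconciled with a few lines of arithmetic, using only the numbers printed in the question (coefficient 0.5647, Low 3.9003, High 6.2314). The variable names below are made up for the sketch.

```r
beta   <- 0.5647          # coefficient of x from the lrm() printout
x_low  <- 3.9003          # "Low" column of summary(fit): lower quartile of x
x_high <- 6.2314          # "High" column: upper quartile of x

# Odds ratio for a one-unit increase in x.
or_unit <- exp(beta)

# Effect (log odds) and odds ratio for moving x from the lower to the upper quartile.
effect_iqr <- beta * (x_high - x_low)
or_iqr     <- exp(effect_iqr)

round(c(or_unit = or_unit, effect_iqr = effect_iqr, or_iqr = or_iqr), 2)
```

So the "Effect" of 1.32 is the coefficient times the inter-quartile range 2.3311, and 3.73 is its exponential; both summaries describe the same fitted model.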
Re: [R] Smooth contour of a map
It's perfect, thank you!

I would like to post the final code in case someone needs help with this subject, but I am trying to fix one last problem: how can I force the contourLines() function to include the corner points of the map in its result? It does not treat these points as contour points.

On 18 May 2011, at 15:18, David Winsemius wrote:

> You may be looking for the par settings of xaxs="i", yaxs="i", which, if you add them to the plot call, will prevent the regular behavior of adding 4% padding to the axis widths.
>
> ?par
>
> -- 
> David.
>
> On May 18, 2011, at 8:27 AM, Pierre Bruyer wrote:
>
>> I've practically resolved my problem (the code is below), but one last thing is not perfect: when I call the function plot() before the function polygon(), there is a margin between my raster and the window. I think it comes from the axes of plot(), but I have not found how to remove it. Does anyone have a solution, please?
>>
>> Pierre Bruyer
>>
>>     ## smooth contour
>>     contours <- contourLines(V2b, levels = paliers)
>>     par(mar = c(0, 0, 0, 0))
>>     plot(1, col = "white", main = "polygon()", asp = 1, axes = FALSE, ann = FALSE,
>>          xlim = c(0, 1), ylim = c(0, 1), type = "n", method = c("image"))
>>     for (i in seq_along(contours)) {
>>       x <- contours[[i]]$x
>>       y <- contours[[i]]$y
>>       c <- contours[[i]]$level
>>       j <- 1
>>       tmp <- 0
>>       while (j < length(level[,1]) && tmp == 0) {
>>         if (level[j,1] == c) {
>>           tmp <- j
>>         }
>>         j <- j + 1
>>       }
>>       polygon(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y,
>>               col = colgraph[tmp + 1], border = NA)
>>     }
>>
>> On 17 May 2011, at 16:44, Pierre Bruyer wrote:
>>
>>> The result is good, thanks a lot, but how can I fill my raster with color using this method?
>>>
>>> On 17 May 2011, at 15:43, Duncan Murdoch wrote:
>>>
>>>> I don't think filled.contour gives you access to the contour lines. If you use contourLines() to compute them, then you can draw them using code like this:
>>>>
>>>>     contours <- contourLines(V2b, levels = paliers)
>>>>     for (i in seq_along(contours)) {
>>>>       x <- contours[[i]]$x
>>>>       y <- contours[[i]]$y
>>>>       lines(spline(seq_along(x), x)$y, spline(seq_along(y), y)$y)
>>>>     }
>>>>
>>>> but as I said, you won't get great results. A better way is to use a finer grid, e.g. by fitting a smooth surface to your set of points and using predictions from the model to interpolate.
>>>>
>>>> Duncan Murdoch
>>>>
>>>> On 17/05/2011 9:35 AM, Pierre Bruyer wrote:
>>>>
>>>>> I work with large datasets (1 points) so I can't post them, but my function is:
>>>>>
>>>>>     create_map <- function(grd, level, map_output, format = c("jpeg"),
>>>>>                            width_map = 150, height_map = 150, ...)
>>>>>     {
>>>>>       ## sp <- spline(x = grd[,1], y = grd[,2])
>>>>>       grd2 <- matrix(grd[,3], nrow = sqrt(length(grd[,3])),
>>>>>                      ncol = sqrt(length(grd[,3])), byrow = FALSE)
>>>>>       V2b <- grd2
>>>>>
>>>>>       ## creation of breaks for colors
>>>>>       i <- 1
>>>>>       paliers <- c(-1.0E300)
>>>>>       while (i <= length(level[,1])) {
>>>>>         paliers <- c(paliers, level[i,1])
>>>>>         i <- i + 1
>>>>>       }
>>>>>       paliers <- c(paliers, 1.0E300)
>>>>>
>>>>>       ## scale color creation
>>>>>       i <- 1
>>>>>       colgraph <- c(rgb(255, 255, 255, maxColorValue = 255))
>>>>>       while (i <= length(level[,2])) {
>>>>>         colgraph <- c(colgraph, rgb(level[i,2], level[i,3], level[i,4], maxColorValue = 255))
>>>>>         i <- i + 1
>>>>>       }
>>>>>
>>>>>       ## user can choose the output format (default is jpeg)
>>>>>       switch(format,
>>>>>              png = png(map_output, width = width_map, height = height_map),
>>>>>              jpeg = jpeg(map_output, width = width_map, height = height_map, quality = 100),
>>>>>              bmp = bmp(map_output, width = width_map, height = height_map),
>>>>>              tiff = tiff(map_output, width = width_map, height = height_map),
>>>>>              jpeg(map_output, width = width_map, height = height_map))
>>>>>
>>>>>       ## drawing map
>>>>>       ## remove margins
>>>>>       par(mar = c(0, 0, 0, 0))
>>>>>       filled.contour(V2b, col = colgraph, levels = paliers, asp = 1, axes = FALSE, ann = FALSE)
>>>>>       dev.off()
>>>>>     }
>>>>>
>>>>> where grd is an xyz data frame, map_output is the path+name of the output image file, and level is a matrix like this:
>>>>>
>>>>>     level <- matrix(0, 10, 4)
>>>>>     level[1,1] <- 1.E+00
>>>>>     level[2,1] <- 3.E+00
>>>>>     level[3,1] <- 5.E+00
>>>>>     level[4,1] <- 1.E+01
>>>>>     level[5,1] <- 1.5000E+01
>>>>>     level[6,1] <- 2.E+01
>>>>>     level[7,1] <- 3.E+01
>>>>>     level[8,1] <- 4.E+01
>>>>>     level[9,1] <- 5.E+01
>>>>>     level[10,1] <- 7.5000E+01
>>>>>     level[1,2] <- 102
>>>>>     level[2,2] <- 102
>>>>>     level[3,2] <- 102
>>>>>     level[4,2] <- 93
>>>>>     level[5,2] <- 204
>>>>>     level[6,2] <- 248
>>>>>     level[7,2] <- 241
>>>>>     level[8,2] <- 239
>>>>>     level[9,2] <- 224
>>>>>     level[10,2] <- 153
>>>>>     level[1,3] <- 153
>>>>>     level[2,3] <- 204
>>>>>     level[3,3] <- 204
>>>>>     level[4,3] <- 241
>>>>>     level[5,3] <- 255
>>>>>     level[6,3] <- 243
>>>>>     level[7,3] <- 189
>>>>>     level[8,3] <- 126
>>>>>     level[9,3] <- 14
>>>>>     level[10,3] <- 0
>>>>>     level[1,4] <- 153
>>>>>     level[2,4] <- 204
>>>>>     level[3,4] <- 153
>>>>>     level[4,4] <- 107
>>>>>     level[5,4] <- 102
>>>>>     level[6,4] <- 33
>>>>>     level[7,4] <- 59
>>>>>     level[8,4] <- 63
>>>>>     level[9,4] <- 14
>>>>>     level[10,4] <- 51
>>>>>
>>>>> On 17 May 2011, at 15:17, Duncan Murdoch wrote
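The two suggestions in this thread (spline-smoothing the output of contourLines(), and xaxs="i", yaxs="i" to drop the axis padding) can be combined in a small self-contained sketch. It uses R's built-in volcano matrix in place of the poster's V2b, and the helper smooth_contours() and the chosen levels are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: compute contour lines and smooth each one with spline(),
# treating the point index along the line as the interpolation variable.
smooth_contours <- function(z, levels, n = 200) {
  lapply(contourLines(z, levels = levels), function(cl) {
    idx <- seq_along(cl$x)
    list(level = cl$level,
         x = spline(idx, cl$x, n = n)$y,
         y = spline(idx, cl$y, n = n)$y)
  })
}

# 'volcano' ships with R; contourLines() maps its rows and columns onto [0, 1].
sm <- smooth_contours(volcano, levels = c(120, 150, 180))

# No margins and no 4% axis padding, so the unit square fills the whole device.
par(mar = c(0, 0, 0, 0))
plot(NA, xlim = c(0, 1), ylim = c(0, 1), xaxs = "i", yaxs = "i",
     axes = FALSE, ann = FALSE)
for (s in sm) lines(s$x, s$y)
```

The sketch draws lines only. Filling with polygon(), as in the posted code, additionally requires closed outlines, which is presumably why contours that run off the edge of the map cause the corner-point problem raised at the top of the thread.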
[R] assign $y of predict() function output to variable
Hello R-help,

Below is the output from the predict() function. How can I assign $y to a variable?

    predict(function, df2)
    $x
           V1
    1   36.28
    2   34.73
    3   33.74
    4   69.87
    5   58.88
    6   89.44
    7   43.97
    8   41.94
    9   33.34
    10  38.47
    11  35.16
    12  42.94
    13  46.76
    14  53.24
    15  52.43
    16  50.40
    17  34.42
    18  33.22
    19  33.24
    20  39.60
    21  39.32
    22  44.71
    23  54.03
    24  47.48
    25  35.42
    26  34.78
    27  34.31
    28  78.60
    29  74.43
    30 120.80
    31  48.35
    32  45.40
    33  33.95
    34  38.27
    35  35.16
    36  47.10
    37  48.10
    38  51.79
    39  62.10
    40  50.95
    41  35.75
    42  34.62
    43  57.99
    44  45.09
    45  43.93
    46  60.98
    47  66.64
    48  59.84
    49  64.81
    50  77.52
    51 113.40
    52  88.12
    53  80.36
    54 118.80
    55 113.00
    56 169.50
    57  53.04
    58  63.39
    59  96.04
    60 109.80
    61  83.74
    62 133.10
    63 122.30
    64 168.30
    65  61.89
    66  58.58
    67  75.98
    68  87.66
    69  84.01
    70 132.80
    71 135.60
    72 127.70

    $y
              V1
    1   2.676489
    2   2.070236
    3   1.682677
    4  15.853686
    5  11.523969
    6  23.030727
    7   5.678122
    8   4.886343
    9   1.526004
    10  3.532138
    11  2.238484
    12  5.276394
    13  6.766605
    14  9.301601
    15  8.983873
    16  8.188838
    17  1.948910
    18  1.478992
    19  1.486828
    20  3.973300
    21  3.864004
    22  5.966758
    23  9.611797
    24  7.047672
    25  2.340192
    26  2.089803
    27  1.905852
    28 19.180440
    29 17.611545
    30 32.357421
    31  7.387438
    32  6.235922
    33  1.764910
    34  3.454035
    35  2.238484
    36  6.899319
    37  7.289786
    38  8.733040
    39 12.798111
    40  8.404081
    41  2.469258
    42  2.027188
    43 11.172123
    44  6.114989
    45  5.662521
    46 12.354907
    47 14.589716
    48 11.903750
    49 13.869002
    50 18.778316
    51 30.387489
    52 22.579871
    53 19.828694
    54 31.838458
    55 30.277085
    56 42.186268
    57  9.223121
    58 13.308252
    59 25.211889
    60 29.379133
    61 21.048335
    62 35.336657
    63 32.740194
    64 42.003454
    65 12.715023
    66 11.405337
    67 18.199665
    68 22.421596
    69 21.144335
    70 35.268242
    71 35.898715
    72 34.073050
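The printout has the shape of a list with elements x and y, so the usual approach is to store the result and pick out the element by name. The sketch below assumes a smooth.spline() fit, which is one model whose predict() method returns such a list; the poster did not say which model was used, and the data here are simulated.

```r
set.seed(3)
x <- seq(30, 170, length.out = 60)
y <- 0.3 * x - 8 + rnorm(60)

# Assumption: a smoothing-spline fit, whose predict() method returns list(x = ..., y = ...).
fit <- smooth.spline(x, y)

new_x <- c(36.28, 34.73, 33.74)      # first three x values from the post
pred  <- predict(fit, new_x)

# Store the result first, then extract the element by name.
y_hat <- pred$y
y_hat
```

If the element is itself a one-column table, as the V1 header in the post suggests, something like pred$y[, 1] would give the plain numeric vector.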
[R] Overlaying maps
I'm having difficulty overlaying maps when writing to a file graphics device. My command sequence has the structure

    plot(map1)
    par(new = T)
    plot(map2)

On the screen device, it works fine. When I attempt something like

    png(file = "map.png")
    plot(map1)
    par(new = T)
    plot(map2)
    dev.off()

only the last map appears, the previous ones having been cleared. Can someone clarify?

Thanks,
Michael Laviolette PhD MPH
New Hampshire Department of Health and Human Services
[R] Email out of R (code)
In case you're using Unix/Linux, have a look at www.r-project.org/doc/Rnews/Rnews_2007-1.pdf (pages 30-32).

Wolfgang

Wolfgang Raffelsberger, PhD
IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300
Fax (+33) 388 65 3276
wolfgang.raffelsberger (a t) igbmc.fr

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of Kevin Wright [kw.s...@gmail.com]
Sent: Wednesday, 18 May 2011 17:57
To: Daniel Malter
Cc: r-help@r-project.org
Subject: Re: [R] Email out of R (code)

> How does this compare to create.post()?
>
> Kevin
>
> On Tue, May 17, 2011 at 3:44 PM, Daniel Malter dan...@umd.edu wrote:
>
>> Hi all,
>>
>> I thought I would post code to send an email out of R. The code uses Grothendieck and Bellosta's interface package rJython for executing Python from R. The code itself provides basic email functionality for email servers requiring authentication. It should be easy to extend it (e.g., for sending attachments). I hope it's useful.
>>
>> [...]
>>
>> Best,
>> Daniel
[R] text mining problem using TM package
Hi,

I'm using R (tm package) for text mining and I'm having problems filtering articles out of my data set by local meta data. Here is the code:

    data <- ("C:/ /19970331")

    rs <- ReutersSource(data, encoding = "UTF-8")
    RC <- VCorpus(DirSource(data),
                  readerControl = list(reader = readRCV1asPlain,
                                       language = "en_US",
                                       load = TRUE),
                  dbControl = list(useDb = TRUE,
                                   dbName = "texts.db",
                                   dbType = "DB1"))

    tm_index(RC, FUN = sFilter, doclevel = F, useMeta = T, "Topics == 'MCAT'")

When I use sFilter, I can only filter fields in yellow, I want to filter fields in red, what am I doing wrong?

Thanks,
Andy

This is the meta data that is attached to each article:

    Available meta data pairs are:
      Author       :
      DateTimeStamp: 1997-03-31
      Description  :
      Heading      : USA: WHX begins tender offer for Dynamics Corp.
      ID           : 476871
      Language     : en_US
      Origin       : Reuters Corpus Volume 1
    User-defined local meta data pairs are:
    $Publisher
    [1] "Reuters Holdings Plc"

    $Topics
    [1] "C18"  "C181" "CCAT"

    $Industries
    [1] "I22100" "I34000"

    $Countries
    [1] "USA"
Re: [R] strucchange package Linux help
On Wed, 18 May 2011, Hasan Diwan wrote:

> When I run the code below on Macintosh and Windows, the plot comes out fine. However, on Linux, the png generated is invalid from R console, and loading strucchange crashes rkward.

I cannot replicate any of this. I ran the script both in plain R 2.13.0 and in RKWard 0.5.5 on a Debian GNU/Linux machine. In both cases, the script below yielded the correct outcome and the PNG showed the correct graphic. (And neither R nor RKWard crashed.)

> Is this a known issue on Linux and, if so, is there a workaround?

This is almost certainly neither a general Linux problem with PNG graphics nor a problem with the specific strucchange example. It's more likely that something else goes wrong in your specific setup.

Z

> Many thanks!
>
>     require(strucchange)
>     data(RealInt)
>     bp.ri <- breakpoints(RealInt ~ 1, h = 15)
>     summary(bp.ri)
>     fac.ri <- breakfactor(bp.ri, breaks = 3, label = 'seg')
>     fm.ri <- lm(RealInt ~ 0 + fac.ri)
>     summary(fm.ri)
>     vcov.ri <- function(x, ...) kernHAC(x, kernel = 'Quadratic Spectral', prewhite = 1, approx = 'AR(1)', ...)
>     coef(bp.ri, breaks = 3)
>     sapply(vcov(bp.ri, breaks = 3, vcov = vcov.ri), sqrt)
>     confint(bp.ri, breaks = 3, vcov = vcov.ri)
>     png('SCC2.png')
>     plot(RealInt)
>     lines(as.vector(time(RealInt)), fitted(fm.ri), col = 4)
>     lines(confint(bp.ri, breaks = 3, vcov = vcov.ri))
>     dev.off()
>     print(paste('Plot in SCC2.png in', getwd()))
[R] Help with Memory Problems (cannot allocate vector of size)
While doing PLS I found the following problem:

    BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
                   jackknife = FALSE, validation = "LOO")

With jackknife disabled the command works fine, but when I enable jackknife I get the following error:

    BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
                   jackknife = TRUE, validation = "LOO")
    Error: cannot allocate vector of size 289.1 Mb

I am dealing with a very large dataset:

    > str(PLSdata)
    'data.frame':   40 obs. of  2 variables:
     $ GroupingList: int  1 1 1 1 1 1 1 1 1 1 ...
     $ PCIList     : AsIs [1:40, 1:94727] 0 0 0 0 0 0 0 0 0 0 ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : chr  "X" "X.1" "X.12" "X.13" ...
      .. ..$ : NULL
    > object.size(PLSdata)/1048600
    28.9113560938394 bytes

How can I get around this memory shortage?
Re: [R] R-code in R-file documentation
I guess what you want is

    cat(readLines("file.r"), sep = "\n")

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Wed, May 18, 2011 at 4:03 AM, Brian Oney zenli...@gmail.com wrote:
> Hello List,
>
> I would like to insert code from .r files into a LaTeX appendix (possibly
> using Sweave). I was considering:
>
> <<results=tex,eval=true,echo=true>>=
> source("file.r")
> @
>
> but I would just like to echo the code and not evaluate the code within
> the file. Maybe:
>
> <<results=tex,eval=true,echo=false>>=
> cat("\\begin{verbatim}")
> readLines("file.r")
> cat("\\end{verbatim}")
> @
>
> The above works well other than the line numbers which are included
> (which isn't so bad).
>
> Thanks for the help and ideas!
> Brian
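A minimal sketch of the difference between the two approaches in this thread, assuming a throwaway script written to a temporary file (the file name and its two lines are made up for illustration). readLines() on its own auto-prints a character vector with [1]-style index prefixes, which is most likely where the unwanted "line numbers" came from; cat() writes the raw lines instead.

```r
# Hypothetical script file, created here only so the example is self-contained.
script <- tempfile(fileext = ".r")
writeLines(c("x <- 1:10", "mean(x)"), script)

# readLines() returns the source as a character vector; nothing is evaluated.
code <- readLines(script)

# Auto-printing the vector adds index prefixes and quotes.
print(code)

# cat() with sep = "\n" emits the lines verbatim, ready for a verbatim block.
cat("\\begin{verbatim}\n")
cat(code, sep = "\n")
cat("\n\\end{verbatim}\n")
```

Inside a Sweave chunk with results=tex and echo=false, the three cat() calls would be the chunk body; that placement is an assumption based on the chunk options quoted above.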
Re: [R] hierarchical clustering within a size limit
Hi Peter,

Thanks for your help. A second simple question that I cannot solve is the following.

    labels = cutree(hc, h=500)
    # members of cluster 1:
    x[labels==1]
    # members of cluster 2:
    x[labels==2]

When the length of x is >= 8, the index numbers appear in the output:

    [['[1]', '180066408', '180066464', '180066465', '180066483', '180066486', '180066518', '180066525', '[8]', '180066554', '180066623', '180066638', '180066652', '180066819', '180066884']]

As opposed to when it is less than 8:

    [['150329963', '150329989', '150330179', '150330299', '150330375', '150330460']]

Is there a simple way to make these index numbers disappear?

Thanks

On Wed, May 11, 2011 at 10:53 AM, Peter Langfelder peter.langfel...@gmail.com wrote:
> On Wed, May 11, 2011 at 10:12 AM, rna seq rna.see...@gmail.com wrote:
>> Hello List,
>>
>> I am trying to implement a hierarchical cluster using the hclust method
>> (agglomerative single linkage) with a small wrinkle. I would like to
>> cluster a set of numbers on a number line only if they are within a
>> distance of 500. I would then like to print out the members of this list.
>>
>> So far I can put a vector:
>>
>>     x <- c(2,10,200,300,600,700)
>>
>> into a distance matrix:
>>
>>     dist(x, method="manhattan")
>>         1   2   3   4   5
>>     2   8
>>     3 198 190
>>     4 298 290 100
>>     5 598 590 400 300
>>     6 698 690 500 400 100
>>
>> I can then cluster these distances using:
>>
>>     hc <- hclust(v, method = "complete")
>>
>> Next, I believe I set my distance limit in the cluster using the command
>>
>>     cutree(hc, h=500)
>>     [1] 1 1 1 1 2 2
>>
>> This seems to produce the correct result; however, what I am unable to do
>> is go back and extract and print out the members of each cluster. Any
>> help would be greatly appreciated.
>
> Very simple.
>
>     labels = cutree(hc, h=500)
>     # members of cluster 1:
>     x[labels==1]
>     # members of cluster 2:
>     x[labels==2]
>
> HTH,
> Peter
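The exchange above can be put together as a small runnable sketch, using the six numbers from the original post. Two things here are assumptions rather than anything stated in the thread: the distance object is passed straight to hclust() (the post's `v` is never defined), and cat() is offered as one way to print members without the [1], [8] index prefixes that print() adds to long vectors.

```r
x <- c(2, 10, 200, 300, 600, 700)

# Complete linkage on Manhattan distances, then cut the tree at height 500.
hc <- hclust(dist(x, method = "manhattan"), method = "complete")
labels <- cutree(hc, h = 500)

# split() collects the members of every cluster in one step.
members <- split(x, labels)

# cat() writes the values only, so no index prefixes appear in the output.
for (k in names(members)) {
  cat("cluster", k, ":", members[[k]], "\n")
}
```

If the output is parsed by another program, writing one cluster per line like this tends to be easier to read back than the printed form of a vector.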
[R] Grouped bar plot
Hi,

I am trying to produce a grouped bar plot from a data.frame and I'm having difficulties figuring out how to do so. My data is 500 rows by 4 columns and basically looks like so:

    > head(x)
        V1        V2        V3        V4
    1  XOM 0.2317915 0.1610068 1.6941637
    2 AAPL 0.6735488 0.7433611 0.1594102
    3   GE 1.2554160 0.9237384 1.6767711
    4  IBM 1.6296938 0.3730387 0.5858115
    5  CVX 0.9194169 0.4785705 0.1803601
    6   PG 0.7768241 1.7622060 0.7640163
    . . .

I would like to produce something similar to what is found at:

    http://www.statmethods.net/graphs/bar.html   # the grouped barplot example
or
    http://had.co.nz/ggplot2/geom_bar.html       # the Dodged bar charts example

Across the X-axis, for each set (row) of 3 data points (V2, V3, V4) associated with a symbol (V1), I would like to create a group of 3 bars reflecting their values. So the Y-axis will represent the magnitude of the values in the columns (V2, V3, V4), and the X-axis will have 500 groups of 3 bars, for a total of 1500 bars. I would like the color of each bar to reflect the column of data it represents, and to label each group of 3 with the corresponding symbol in column V1.

I was trying to get this to work using ggplot, but the y-axis in the example is the count, which is not what I'm after. Any suggestions to get me started down the right path would be appreciated. Thank you.

James
[R] data network format and grouping analysis
Hi everyone,

I have a dataset of friendship with this format:

          ego alter
    4746    1     2
    9742    1     3
    14738   1    NA
    4747    2    NA
    9743    2     3
    14739   2     1
    4748    3    13
    9744    3     5
    14740   3    14
    4749    4    NA
    9745    4    NA
    14741   4    NA
    4750    5    NA
    9746    5    13
    14742   5    10
    4751    6    12
    9747    6     7
    ...

NA means that the individual doesn't select any friend. Does anyone know how to format this dataset for use with the sna or igraph packages? I don't know how to convert it into a matrix or an edgelist in R without losing isolated individuals.

Next question: does anyone know if there is a package to perform a Moody's Crowds routine to identify groups using R, or other algorithms designed to search for groups by maximizing modularity scores?

Thank you in advance!

--
Sebastián Daza
sebastian.d...@gmail.com
Re: [R] Dataset Quasi Poisson
1) This mailing list is not for homework.
2) I would recommend reading the introduction to R that comes with every installation of R, since your answer is in there. Alternatively, you could google "R" and "quasi poisson".

On Wed, May 18, 2011 at 11:42 AM, ilpoeta84 antonioperfe...@gmail.com wrote:
> Hello, I'm looking for a dataset for Quasipoisson regression. The result
> must be significantly different from the classic poisson regression. You
> can help me? Please. It is for my last university exam.
> Thanks a lot
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Dataset-Quasi-Poisson-tp3533060p3533060.html
> Sent from the R help mailing list archive at Nabble.com.

--
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.
[R] Simple ordering or sorting question
Greetings,

I'm trying to simply reorder a data frame on the row numbers. So, for example, instead of getting 1,2,3,4,5,6,7,8,9,10,11, ... 100 ..., I get instead 1, 10, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 11, ...

I've tried commands such as

    df <- df[order(rownames(df)),]

and have substituted the order command with sort and sort.list, to no avail. Any advice would be appreciated.

Thanks in advance.

David

--
===
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room, 1082B
1025 W. Johnson Street
Madison, WI 53706
email: dkap...@education.wisc.edu
homepage: http://www.education.wisc.edu/edpsych/default.aspx?content=kaplan.html
Phone: 608-262-0836
===
Re: [R] Simple ordering or sorting question
It looks like your row numbers are characters, because that is the sort sequence you are getting. Try

    df <- df[order(as.numeric(rownames(df))), ]

On Wed, May 18, 2011 at 2:42 PM, David Kaplan dkap...@education.wisc.edu wrote:
> Greetings,
>
> I'm trying to simply reorder a data frame on the row numbers. So, for
> example, instead of getting 1,2,3,4,5,6,7,8,9,10,11, ... 100 ..., I get
> instead 1, 10, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 11, ...
>
> I've tried commands such as
>
>     df <- df[order(rownames(df)),]
>
> and have substituted the order command with sort and sort.list, to no
> avail. Any advice would be appreciated.
>
> Thanks in advance.
>
> David

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
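To see why the conversion matters, here is a small sketch with a made-up 12-row data frame (the data and the names by_char and by_num are invented for illustration). Row names are always returned as character strings, so ordering on them directly sorts "10" before "2".

```r
# Hypothetical data frame with row names "1" .. "12", shuffled.
set.seed(42)
df <- data.frame(value = seq_len(12))
rownames(df) <- as.character(seq_len(12))
df <- df[sample(nrow(df)), , drop = FALSE]

# Character ordering: "1", "10", "11", "12", "2", ...
by_char <- df[order(rownames(df)), , drop = FALSE]

# Numeric ordering: 1, 2, 3, ..., 12
by_num <- df[order(as.numeric(rownames(df))), , drop = FALSE]

print(rownames(by_char))
print(rownames(by_num))
```

The drop = FALSE argument is only needed here because the example frame has a single column; with several columns the call in the reply above works as written.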
Re: [R] Simple ordering or sorting question
That did it. Thanks!!

David

===
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room, 1082B
1025 W. Johnson Street
Madison, WI 53706
email: dkap...@education.wisc.edu
homepage: http://www.education.wisc.edu/edpsych/default.aspx?content=kaplan.html
Phone: 608-262-0836
===

On 5/18/11 1:50 PM, jim holtman wrote:
> It looks like your row numbers are characters, because that is the sort
> sequence you are getting. Try
>
>     df <- df[order(as.numeric(rownames(df))), ]
[R] Use paste function to select column of data
Hello,

I want to build a function to call up a column of a data.frame by the names of the columns. I have column names that are sequentially named (col1, col2, etc.). How do I change a character expression into something that will be understood as a data.frame column?

For example:

    example <- data.frame(cbind(col1=1:10, col2=21:30, col3=41:50))

    call.fun <- function(t){
      x <- paste("col", t, sep="")
      ## Change this so that it is the data, not a character expression
      example$x}

    call.fun(t=2)

Within the real function, I will continue to do calculations on the column of data. My problem is that I am getting either a character expression or NULL from my function.

Thanks for your help on what is probably a very simple question.

John
Re: [R] assign $y of predict() function output to variable
On May 18, 2011, at 1:35 PM, Asan Ramzan wrote:

> Hello R-help
>
> Below is the output from the predict() function. How can I assign $y to a
> variable?

    Newvar <- predict(fit, df2)$y   # 'fit' stands for your fitted model object

--
David.

> $x
>        V1
> 1   36.28
> 2   34.73
> 3   33.74
> 4   69.87
> 5   58.88
> 6   89.44
> 7   43.97
> 8   41.94
> 9   33.34
> 10  38.47
> 11  35.16
> 12  42.94
> 13  46.76
> 14  53.24
> 15  52.43
> 16  50.40
> 17  34.42
> 18  33.22
> 19  33.24
> 20  39.60
> 21  39.32
> 22  44.71
> 23  54.03
> 24  47.48
> 25  35.42
> 26  34.78
> 27  34.31
> 28  78.60
> 29  74.43
> 30 120.80
> 31  48.35
> 32  45.40
> 33  33.95
> 34  38.27
> 35  35.16
> 36  47.10
> 37  48.10
> 38  51.79
> 39  62.10
> 40  50.95
> 41  35.75
> 42  34.62
> 43  57.99
> 44  45.09
> 45  43.93
> 46  60.98
> 47  66.64
> 48  59.84
> 49  64.81
> 50  77.52
> 51 113.40
> 52  88.12
> 53  80.36
> 54 118.80
> 55 113.00
> 56 169.50
> 57  53.04
> 58  63.39
> 59  96.04
> 60 109.80
> 61  83.74
> 62 133.10
> 63 122.30
> 64 168.30
> 65  61.89
> 66  58.58
> 67  75.98
> 68  87.66
> 69  84.01
> 70 132.80
> 71 135.60
> 72 127.70
>
> $y
>           V1
> 1   2.676489
> 2   2.070236
> 3   1.682677
> 4  15.853686
> 5  11.523969
> 6  23.030727
> 7   5.678122
> 8   4.886343
> 9   1.526004
> 10  3.532138
> 11  2.238484
> 12  5.276394
> 13  6.766605
> 14  9.301601
> 15  8.983873
> 16  8.188838
> 17  1.948910
> 18  1.478992
> 19  1.486828
> 20  3.973300
> 21  3.864004
> 22  5.966758
> 23  9.611797
> 24  7.047672
> 25  2.340192
> 26  2.089803
> 27  1.905852
> 28 19.180440
> 29 17.611545
> 30 32.357421
> 31  7.387438
> 32  6.235922
> 33  1.764910
> 34  3.454035
> 35  2.238484
> 36  6.899319
> 37  7.289786
> 38  8.733040
> 39 12.798111
> 40  8.404081
> 41  2.469258
> 42  2.027188
> 43 11.172123
> 44  6.114989
> 45  5.662521
> 46 12.354907
> 47 14.589716
> 48 11.903750
> 49 13.869002
> 50 18.778316
> 51 30.387489
> 52 22.579871
> 53 19.828694
> 54 31.838458
> 55 30.277085
> 56 42.186268
> 57  9.223121
> 58 13.308252
> 59 25.211889
> 60 29.379133
> 61 21.048335
> 62 35.336657
> 63 32.740194
> 64 42.003454
> 65 12.715023
> 66 11.405337
> 67 18.199665
> 68 22.421596
> 69 21.144335
> 70 35.268242
> 71 35.898715
> 72 34.073050

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
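A runnable sketch of the same idea. The thread never says which model produced the output; a smooth.spline() fit is used here purely as an assumption, because its predict() method is one that returns a list with components x and y. The data are simulated and the names fit, pred and newvar are invented for the example.

```r
# Simulated data; the model type is an assumption, not taken from the post.
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.1)
fit <- smooth.spline(x, y)

# predict() on this kind of fit returns a list with components $x and $y.
pred <- predict(fit, x = c(2.5, 5, 7.5))
str(pred)

# Either keep the whole list and pick $y later, or extract it in one step.
newvar <- pred$y
same   <- predict(fit, x = c(2.5, 5, 7.5))$y
```

Because the result is an ordinary list, pred[["y"]] would work equally well and is convenient when the component name is held in a variable.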
Re: [R] Use paste function to select column of data
On May 18, 2011, at 3:12 PM, John Poulsen wrote:

> Hello,
>
> I want to build a function to call up a column of a data.frame by the
> names of the columns. I have column names that are sequentially named
> (col1, col2, etc.). How do I change a character expression into something
> that will be understood as a data.frame column?
>
> For example:
>
>     example <- data.frame(cbind(col1=1:10, col2=21:30, col3=41:50))
>
>     call.fun <- function(t){
>       x <- paste("col", t, sep="")
>       ## Change this so that it is the data, not a character expression

Right: don't use the $ operator; instead use [[

      example[[x]]}

>     call.fun(t=2)
>
> Within the real function, I will continue to do calculations on the
> column of data. My problem is that I am getting either a character
> expression or NULL from my function.
>
> Thanks for your help on what is probably a very simple question.
>
> John

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Use paste function to select column of data
On 2011-05-18 12:12, John Poulsen wrote:

> Hello,
>
> I want to build a function to call up a column of a data.frame by the
> names of the columns. I have column names that are sequentially named
> (col1, col2, etc.). How do I change a character expression into something
> that will be understood as a data.frame column?
>
> For example:
>
>     example <- data.frame(cbind(col1=1:10, col2=21:30, col3=41:50))
>
>     call.fun <- function(t){
>       x <- paste("col", t, sep="")
>       ## Change this so that it is the data, not a character expression
>       example$x}
>
>     call.fun(t=2)

Get out of the dollar habit. Replace your

    example$x

with

    example[[x]]

or with

    example[, x]

Peter Ehlers

> Within the real function, I will continue to do calculations on the
> column of data. My problem is that I am getting either a character
> expression or NULL from my function.
>
> Thanks for your help on what is probably a very simple question.
>
> John
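Both replies come down to the same point, sketched below with the poster's own example: `$` takes the name literally (it looks for a column called "x"), whereas `[[` and `[, ]` evaluate their argument first. The variable names via_dollar and via_bracket are added for illustration only.

```r
example <- data.frame(col1 = 1:10, col2 = 21:30, col3 = 41:50)

call.fun <- function(t) {
  x <- paste("col", t, sep = "")   # builds the string "col2" for t = 2
  example[[x]]                     # [[ looks up the column named by x's value
}

call.fun(t = 2)

# For comparison: `$` searches for a column literally named "x" and finds none.
x <- "col2"
via_dollar  <- example$x       # NULL
via_bracket <- example[, x]    # same column as example[[x]]
```

One small difference between the two suggested forms: example[[x]] accepts only a single name, while example[, x] also accepts several names and then returns a data frame.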
Re: [R] Grouped bar plot
On 2011-05-18 11:13, jctoll wrote:

> Hi,
>
> I am trying to produce a grouped bar plot from a data.frame and I'm
> having difficulties figuring out how to do so. My data is 500 rows by 4
> columns and basically looks like so:
>
>     > head(x)
>         V1        V2        V3        V4
>     1  XOM 0.2317915 0.1610068 1.6941637
>     2 AAPL 0.6735488 0.7433611 0.1594102
>     3   GE 1.2554160 0.9237384 1.6767711
>     4  IBM 1.6296938 0.3730387 0.5858115
>     5  CVX 0.9194169 0.4785705 0.1803601
>     6   PG 0.7768241 1.7622060 0.7640163
>     . . .
>
> I would like to produce something similar to what is found at:
>     http://www.statmethods.net/graphs/bar.html   # the grouped barplot example
> or
>     http://had.co.nz/ggplot2/geom_bar.html       # the Dodged bar charts example
>
> Across the X-axis, for each set (row) of 3 data points (V2, V3, V4)
> associated with a symbol (V1), I would like to create a group of 3 bars
> reflecting their values. So the Y-axis will represent the magnitude of
> the values in the columns (V2, V3, V4), and the X-axis will have 500
> groups of 3 bars, for a total of 1500 bars. I would like the color of
> each bar to reflect the column of data it represents, and to label each
> group of 3 with the corresponding symbol in column V1.
>
> I was trying to get this to work using ggplot, but the y-axis in the
> example is the count, which is not what I'm after. Any suggestions to get
> me started down the right path would be appreciated. Thank you.

Using base barplot() and calling your 6 lines of data 'd':

    barplot(t(d[-1]), names.arg=d[,1], beside=TRUE)

Give a careful reading to the definition of the 'height' argument on the help page.

Peter Ehlers

> James
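A self-contained sketch of the barplot() suggestion, using the six rows shown in the question. The colours, the legend and the temporary PDF file are additions for illustration and not part of the reply; drop the pdf()/dev.off() pair to draw on the screen device instead.

```r
d <- data.frame(
  V1 = c("XOM", "AAPL", "GE", "IBM", "CVX", "PG"),
  V2 = c(0.2317915, 0.6735488, 1.2554160, 1.6296938, 0.9194169, 0.7768241),
  V3 = c(0.1610068, 0.7433611, 0.9237384, 0.3730387, 0.4785705, 1.7622060),
  V4 = c(1.6941637, 0.1594102, 1.6767711, 0.5858115, 0.1803601, 0.7640163)
)

# barplot() wants a matrix for 'height': with beside = TRUE each COLUMN
# becomes one group of bars, so the numeric part is transposed to 3 x 6.
heights <- t(as.matrix(d[-1]))

out_file <- file.path(tempdir(), "grouped_bars.pdf")
pdf(out_file)
mids <- barplot(heights,
                names.arg = d$V1,          # one label per group (symbol)
                beside = TRUE,             # side-by-side bars, not stacked
                col = c("grey20", "grey55", "grey85"),
                legend.text = rownames(heights))
dev.off()
```

With all 500 symbols the group labels will overlap; arguments such as las = 2 and cex.names, both documented for barplot(), are the usual first things to try.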
Re: [R] Grouped bar plot
On Wed, May 18, 2011 at 2:38 PM, Peter Ehlers ehl...@ucalgary.ca wrote:
> Using base barplot() and calling your 6 lines of data 'd':
>
>     barplot(t(d[-1]), names.arg=d[,1], beside=TRUE)
>
> Give a careful reading to the definition of the 'height' argument on the
> help page.
>
> Peter Ehlers

Thank you, that's what I was looking for and it gets me started in the right direction. I can now work on refining the layout. Thanks again.

James
Re: [R] Overlaying maps
Using my mind-reading skills: plot.Map(), while deprecated, does provide an add= option.

If this doesn't help, you'll need to read the posting guide and provide (a lot) more information.

Ray Brownrigg

On Thu, 19 May 2011, michael.laviole...@dhhs.state.nh.us wrote:
> I'm having difficulty overlaying maps when writing to a file graphics
> device. My command sequence has the structure
>
>     plot(map1)
>     par(new = T)
>     plot(map2)
>
> On the screen device, it works fine. When I attempt something like
>
>     png(file = "map.png")
>     plot(map1)
>     par(new = T)
>     plot(map2)
>     dev.off()
>
> only the last map appears, the previous ones having been cleared. Can
> someone clarify?
>
> Thanks,
> Michael Laviolette PhD MPH
> New Hampshire Department of Health and Human Services
Re: [R] data network format and grouping analysis
The following works to get an igraph object from a matrix edgelist:

    dat2 <- matrix(rep(seq(1,5,1), 4), nrow=10, ncol=2)
    graph.edgelist(dat2)

I tried with NAs, but graph.edgelist did not allow NAs. Wouldn't you just leave out those rows with NAs in them? An NA means there is no edge, right?

Scott

On Wednesday, May 18, 2011 at 1:23 PM, Sebastián Daza wrote:
> Hi everyone,
>
> I have a dataset of friendship with this format:
>
>           ego alter
>     4746    1     2
>     9742    1     3
>     14738   1    NA
>     4747    2    NA
>     9743    2     3
>     14739   2     1
>     4748    3    13
>     9744    3     5
>     14740   3    14
>     4749    4    NA
>     9745    4    NA
>     14741   4    NA
>     4750    5    NA
>     9746    5    13
>     14742   5    10
>     4751    6    12
>     9747    6     7
>     ...
>
> NA means that the individual doesn't select any friend. Does anyone know
> how to format this dataset for use with the sna or igraph packages? I
> don't know how to convert it into a matrix or an edgelist in R without
> losing isolated individuals.
>
> Next question: does anyone know if there is a package to perform a
> Moody's Crowds routine to identify groups using R, or other algorithms
> designed to search for groups by maximizing modularity scores?
>
> Thank you in advance!
>
> --
> Sebastián Daza
> sebastian.d...@gmail.com
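Dropping the NA rows does lose individuals who named nobody and were named by nobody, which is the concern raised in the question. One way around it, sketched here in base R only, is to build a square adjacency matrix over every id that appears in either column, so isolates keep an all-zero row and column. The data frame is the first twelve rows of the poster's example; the names friends, ids, edges and adj are invented for illustration.

```r
friends <- data.frame(
  ego   = c(1, 1, 1,  2,  2, 2, 3,  3, 3,  4,  4,  4),
  alter = c(2, 3, NA, NA, 3, 1, 13, 5, 14, NA, NA, NA)
)

# Every id seen as ego or alter; sort() drops the NA entries by default.
ids <- sort(unique(c(friends$ego, friends$alter)))

# Real edges are the rows with a non-missing alter.
edges <- friends[!is.na(friends$alter), ]

# Square 0/1 matrix: rows are senders (ego), columns are receivers (alter).
adj <- matrix(0L, nrow = length(ids), ncol = length(ids),
              dimnames = list(as.character(ids), as.character(ids)))
adj[cbind(match(edges$ego, ids), match(edges$alter, ids))] <- 1L

adj
```

Here individual 4 names no one and is named by no one, yet remains in the matrix. An adjacency matrix like this is a form that network packages generally accept as input; which constructor to call in sna or igraph is left to the respective package documentation.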
[R] Covariable Logistic Regression In R
Hello,

I would like some help figuring out how to run covariable logistic regression in R. I've been searching for a while for how to get this done in R (I previously had the luck of using a software package that just does it) and I am coming up empty-handed. Any experience or insights would be greatly appreciated.

There is a package, called mbmdr, that does exactly what I want, except that it requires very specific data input.

My input consists of various clinical variables measured from patients, as well as expression data from various genes. What I would like to do is identify significant genes while considering their interactions with the various clinical variables.

Thank you,
Ana
Re: [R] Convolution confusion:
On May 18, 2011, at 15:47, Alex Hofmann wrote:

> Hi,
>
> I'm new to R, and I'm a bit confused with the convolve() function.
>
> If I do:
>
>     x <- c(1, 2, 3)
>     convolve(x, rev(x), TRUE, "open")
>     => 9 12 10 4 1
>
> But I expected: 3 8 14 8 3 (like in Octave/MATLAB - conv(x, reverse(x)))
>
>     3 2 1 x 1 2 3 =
>     3 2 1
>     0 6 4 2
>     0 0 9 6 3
>     = 3 8 14 8 3
>
> The thing is that convolve(x, x, TRUE, "open") works. To me it feels very
> confusing that convolution does the reverse itself, but the help suggests
> reversing it again. The help file says: "Note that the usual definition
> of convolution of two sequences x and y is given by
> convolve(x, rev(y), type = "o")."
>
> Thanks for your help,

This confuses me every time as well. One way of putting it is that R's convolve is really what others call correlate: the product-sum between x and y shifted by k, for k = (1-n):(n-1) (adding appropriate padding):

    > z <- 1:3
    > crossprod(z,z)
         [,1]
    [1,]   14
    > crossprod(c(z,0),c(0,z))
         [,1]
    [1,]    8
    > crossprod(c(z,0,0),c(0,0,z))
         [,1]
    [1,]    3

Notice that this always comes out symmetric if x == y. However, in convolution you want the sum over j of x_j * y_(k-j), so y is used in reverse order.

One way of spotting the issue is that if x represents the distribution of a binary random variable X, then the convolution of x with itself should be the distribution of the sum of two independent such variables.

    > x
    [1] 0.05 0.95
    > convolve(x,x,type="o")
    [1] 0.0475 0.9050 0.0475
    > convolve(x,rev(x),type="o")
    [1] 0.0025 0.0950 0.9025

... and it is pretty obviously not the case that the sum of two highly skewed distributions is symmetric, so the 2nd line is right.

    > dbinom(0:2,p=.95,size=2)
    [1] 0.0025 0.0950 0.9025

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
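The point of the reply can be checked numerically. The sketch below compares convolve() against a hand-written product-sum; direct_conv() is a helper invented for this example, not part of base R. With x = c(1, 2, 3), reversing the second argument gives the usual convolution of x with itself, and leaving it unreversed gives the symmetric "correlate" result that the poster expected from conv(x, reverse(x)).

```r
x <- c(1, 2, 3)

# convolve(..., type = "open") correlates x with y, so rev(y) is needed
# to obtain the usual (polynomial-multiplication) convolution.
usual_conv <- convolve(x, rev(x), type = "open")   # x convolved with x
auto_corr  <- convolve(x, x, type = "open")        # x correlated with x

# Hypothetical reference implementation: out[k] = sum of a[i] * b[j]
# over all pairs with i + j - 1 == k. No FFT involved.
direct_conv <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    for (j in seq_along(b)) {
      out[i + j - 1] <- out[i + j - 1] + a[i] * b[j]
    }
  }
  out
}

print(round(usual_conv, 10))    # 1 4 10 12 9
print(direct_conv(x, x))        # 1 4 10 12 9
print(round(auto_corr, 10))     # 3 8 14 8 3
print(direct_conv(x, rev(x)))   # 3 8 14 8 3
```

convolve() works through the FFT, so its results carry tiny floating-point error; that is why the output is rounded before printing and why comparisons are better made with all.equal() than with ==.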
[R] ggplot geom_boxplot vertical margins
If you plot:

    df <- data.frame(x=factor(1:100), y=rnorm(1000))
    ggplot(df, aes(x=x, y=y)) + geom_boxplot()

How do I remove those pesky margins on the sides of the plot area? Or maybe just reduce their size to something more like the spacing of the boxes?

Thanks,
Justin
Re: [R] ggplot geom_boxplot vertical margins
Is this what you want? You can control how much space you want to see on the sides of the plot:

    df <- data.frame(x=factor(1:100), y=rnorm(1000))
    ggplot(df, aes(x=x, y=y)) + geom_boxplot() + scale_x_discrete(expand=c(0,0))

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx

----- Original Message ----
From: Justin Haynes jto...@gmail.com
To: r-help@r-project.org
Sent: Wed, May 18, 2011 1:51:19 PM
Subject: [R] ggplot geom_boxplot vertical margins

> If you plot:
>
>     df <- data.frame(x=factor(1:100), y=rnorm(1000))
>     ggplot(df, aes(x=x, y=y)) + geom_boxplot()
>
> How do I remove those pesky margins on the sides of the plot area? Or
> maybe just reduce their size to something more like the spacing of the
> boxes?
>
> Thanks,
> Justin
Re: [R] ggplot geom_boxplot vertical margins
Exactly! Thanks, I couldn't find that anywhere!

On Wed, May 18, 2011 at 1:59 PM, Felipe Carrillo mazatlanmex...@yahoo.com wrote:
> Is this what you want? You can control how much space you want to see on
> the sides of the plot:
>
>     df <- data.frame(x=factor(1:100), y=rnorm(1000))
>     ggplot(df, aes(x=x, y=y)) + geom_boxplot() + scale_x_discrete(expand=c(0,0))
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
> California, USA
> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
[R] June ** R ** /S+ Courses: Nationwide Back2back (1) R/S+ Fundamentals and (2) R/S-Plus Advanced Programming. in San Francisco, New York City, Washington DC
XLSolutions has scheduled the first 2011 back2back courses in New York City, San Francisco, and Washington DC. Taught by top R/S+ gurus!

West Coast ---back2back--- East Coast

(1) R/S-PLUS Fundamentals and Programming Techniques
http://www.xlsolutions-corp.com/coursedetail.asp?id=30
* San Francisco: June 20-21, 2011
* New York: June 9-10, 2011
* Washington, DC: June 16-17, 2011

(2) R/S+ System: Advanced Programming
http://www.xlsolutions-corp.com/coursedetail.asp?id=16
* San Francisco: June 22-23, 2011
* Washington, DC: June 14-15, 2011
* New York: June 13-14, 2011

Ask for the group discount and reserve your seat now at earlybird rates. Payment is due after the class!

Email Sue Turner: sue@xlsolutions-corp.com
http://www.xlsolutions-corp.com/rplus.asp

Please let us know if you and your colleagues are interested in this class to take advantage of the group discount. Register now to secure your seat!

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
el...@xlsolutions-corp.com
Re: [R] Help with Memory Problems (cannot allocate vector of size)
How about reading the posting guide and afterwards searching the list archive: http://r.789695.n4.nabble.com/R-help-f789696.html

Searching for "Error: cannot allocate vector of size" will give you hundreds of results, as this question is asked VERY frequently!

Jannis

On 05/18/2011 07:55 PM, Amit Patel wrote:

While doing pls I found the following problem:

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, jackknife = FALSE, validation = "LOO")

When not enabling jackknife the command works fine, but when trying to enable jackknife I get the following error:

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata, jackknife = TRUE, validation = "LOO")
Error: cannot allocate vector of size 289.1 Mb

I am dealing with a very large dataset:

str(PLSdata)
'data.frame': 40 obs. of 2 variables:
 $ GroupingList: int 1 1 1 1 1 1 1 1 1 1 ...
 $ PCIList : AsIs [1:40, 1:94727] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr "X" "X.1" "X.12" "X.13" ...
  .. ..$ : NULL

object.size(PLSdata)/1048600
28.9113560938394 bytes

How can I get around this memory shortage?
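As a rough check on where the 289.1 Mb request in the question could come from, the arithmetic below uses only the dimensions shown in the quoted str() output. It rests on an assumption that is not confirmed in this thread: that the jackknife keeps one regression coefficient per predictor, response, component, and leave-one-out segment, each stored as an 8-byte double. The variable names are made up for this sketch.

```r
# Dimensions taken from the plsr() call and the str() output in the question.
n_obs  <- 40       # rows of PLSdata
n_pred <- 94727    # columns of PCIList
n_resp <- 1        # single response, GroupingList
n_comp <- 10       # ncomp in the plsr() call
n_seg  <- n_obs    # leave-one-out: one segment per observation
bytes_per_double <- 8

# Size of the predictor matrix itself, in Mb (2^20 bytes).
predictor_mb <- n_obs * n_pred * bytes_per_double / 2^20

# ASSUMPTION: the jackknife stores a coefficient for every
# predictor x response x component x segment combination.
jackknife_mb <- n_pred * n_resp * n_comp * n_seg * bytes_per_double / 2^20

round(predictor_mb, 1)   # 28.9
round(jackknife_mb, 1)   # 289.1
```

Under that assumption the single array is ten times the size of the data, which matches the figure in the error message. If so, fewer components or fewer cross-validation segments would shrink the request proportionally.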
Re: [R] Plots: I've deleted axes, now to delete space
On 05/16/2011 06:18 PM, adele_thomp...@cargill.com wrote:

Re-sizing within the dev command works well. I'm not sure why I would need the dev.off(). I have the plot commands run. Then I have the dev.copy2pdf command. Thanks again for your help.

Well, you really should get into the habit of reading the documentation of each of the commands you use. To get familiar with the concept of graphics in R (and to answer your question regarding dev.off()) I would recommend having a look at some basic textbook about R or one of the many tutorials on R available on the web. (Googling "graphics R" gives you a really helpful link with the third entry!)

The answer to your question basically is that you need to tell R that your figure is finished (by running dev.off()). Then R can create the file. Usually figures are created by a sequence of calls, so R itself can never know whether it would be necessary to add some elements to the plot later or not.

Sorry for being a bit harsh, but quite often many of the questions here on the list can be easily answered by searching documentation, the web or getting familiar with the basic R concepts!

HTH
Jannis

-----Original Message-----
From: greg.s...@imail.org [mailto:greg.s...@imail.org]
Sent: Monday, May 16, 2011 11:11 AM
To: Thompson, Adele - adele_thomp...@cargill.com; r-help@r-project.org
Subject: RE: [R] Plots: I've deleted axes, now to delete space

If your goal is to end up with a pdf file, then I would suggest creating the pdf file directly using the pdf function (you can specify height and width in the function), then run your commands to create the plot and use dev.off() to finish. You often get different results when writing directly to a file vs doing one of the dev.copy calls because of some different settings. In general the dev.copy approach can be a quick and easy solution for a simple graph, but plotting directly to the file tends to work better if you want a quality graph in the file.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Schatzi
Sent: Monday, May 16, 2011 8:41 AM
To: r-help@r-project.org
Subject: Re: [R] Plots: I've deleted axes, now to delete space

I am outputting the plot to a pdf file using the code:

dev.copy2pdf(file="testing.pdf")

The plots are too small though unless I first manually increase the size in R and then use the dev.copy command. Is there a way to automatically increase the window size? I tried fin and din, but those do not seem to work or they only increase the size to a certain degree, even though I can manually increase it to fill my screen.

Schatzi wrote:

Thanks all for the replies. I am getting better slowly but surely. I imagine that I will get better at figuring out things as well so I don't have to post as many questions. I do lots of searches, but still cannot figure out how to do everything that I need. The new code is as such:

par(mfrow=c(4,7), mar=c(2, 2, 2, 1.5), oma=c(1, 1, 4, 0))
for (i in 1:28) {
  a <- seq(1,3,1)
  plot(a, a, ann=FALSE, main="plot of a vs a")
}
mtext("Plot of a vs a", side=3, outer=TRUE)

-----Original Message-----
From: murdoch.dun...@gmail.com [mailto:murdoch.dun...@gmail.com]
Sent: Friday, May 13, 2011 03:25 PM
To: Thompson, Adele - adele_thomp...@cargill.com
Cc: greg.s...@imail.org; r-help@r-project.org
Subject: Re: [R] Plots: I've deleted axes, now to delete space

On 11-05-13 4:21 PM, adele_thomp...@cargill.com wrote:

Easy fix. Under ?par, I don't see where I can enter an overall title. Should I add a text command or something?

mtext() writes text in the margins; argument outer puts it in the outer margins.

Duncan Murdoch

-----Original Message-----
From: greg.s...@imail.org [mailto:greg.s...@imail.org]
Sent: Friday, May 13, 2011 03:17 PM
To: Thompson, Adele - adele_thomp...@cargill.com; r-help@r-project.org
Subject: RE: [R] Plots: I've deleted axes, now to delete space

Look at the help for par, specifically the section on 'mar' to set the per plot margins smaller and the section on 'oma' to leave room for the overall title.

-----
In theory, practice and theory are the same. In practice, they are not - Albert Einstein
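The advice collected in this thread (open the pdf device directly with an explicit size, shrink the per-plot margins with 'mar', reserve an outer margin with 'oma', title with mtext(), and close with dev.off()) can be put together roughly as follows. This is only a sketch: the 14 x 8 inch page size and the temporary file location are arbitrary choices made for the example, not values from the thread.

```r
# Write straight to a pdf device instead of copying from the screen.
out_file <- file.path(tempdir(), "testing.pdf")   # illustrative location
pdf(out_file, width = 14, height = 8)             # size in inches, chosen arbitrarily

# 4 x 7 grid, small per-plot margins, room at the top for an overall title.
par(mfrow = c(4, 7), mar = c(2, 2, 2, 1.5), oma = c(1, 1, 4, 0))

a <- seq(1, 3, 1)
for (i in 1:28) {
  plot(a, a, ann = FALSE)   # ann = FALSE suppresses per-plot titles and axis labels
}

# Overall title in the outer margin.
mtext("Plot of a vs a", side = 3, outer = TRUE)

# Tell R the figure is finished; only now is the file complete.
dev.off()
```

Because the page size is fixed in the pdf() call, the result does not depend on how large the on-screen window happens to be.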
Re: [R] Covariable Logistic Regression In R
First of all, use the correct terminology for the statistical method. It is not productive to identify 'significant' genes unless you use quite complex methods. This kind of endeavor would take at least 6 statistics courses in order to be successful.

Frank

Anamaria Crisan wrote:

Hello, I would like some help figuring out how to run Covariable Logistic Regression in R. I've been searching for a while on how to get this done in R (I have had the luck previously of using a software package that just does it) and I am coming up empty handed. Any experience or insights would be greatly appreciated.

There is a package that does do exactly what I want, with the exception that it requires very specific data input; it is called mbmdr.

My input consists of various clinical variables measured from patients as well as expression data from various genes. What I would like to do is identify significant genes while considering their interactions with the various clinical variables.

Thank you,
Ana

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
Re: [R] Plots: I've deleted axes, now to delete space
I've never used an R command without reading the documentation first. I think that would be impractical. I am not an expert at deciphering the documentation though, and I post here because I did in fact read the documentation, do extensive google searches, ask friends/colleagues and still find no answer. This forum is not a first resort for me.

R forums have a very bad reputation for being notoriously harsh on newcomers. I myself do not like when people do not try. In order to avoid this harshness (I have not found this forum any more harsh than your everyday educated R user), I do my own research and ask a question when I get stuck. I realize that people still are upset by my ignorance and that is their choice. I will get better and better at R and less and less likely to run into any of these issues. I am not a programmer or statistician by trade, but I am a learner and thus will continue to improve.

You have not offended me at all, as few people have this ability. Thank you for your help.

Adele

-----Original Message-----
From: bt_jan...@yahoo.de [mailto:bt_jan...@yahoo.de]
Sent: Wednesday, May 18, 2011 04:53 PM
To: Thompson, Adele - adele_thomp...@cargill.com
Cc: r-help@r-project.org
Subject: Re: [R] Plots: I've deleted axes, now to delete space
Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb
Hi Bert,

I think people should know about the Google Style Guide for R because, as I said, it represents a thoughtful contribution to the debate. Most of its advice is very good (meaning I agree with it!) but some is a bit too much (for example, the blanket advice never to use S4 classes and methods - that's just resisting progress, in my view). The advice on using <- for the (normal) assignment operator rather than = is also good advice (according to me), but people who have to program in both C and R about equally often may find it a bit tedious. We can argue over that one.

I suggest it has a place in the R FAQ but with a suitable warning that this is just one view, albeit a thoughtful one. I don't think it need be included in the posting guide, though. It would take away some of the fun. :-)

Bill Venables.

-----Original Message-----
From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Wednesday, 18 May 2011 11:47 PM
To: Venables, Bill (CMIS, Dutton Park)
Cc: r-help@r-project.org
Subject: R Style Guide -- Was Post-hoc tests in MASS using glm.nb

Thanks Bill. Do you and others think that a link to this guide (or another) should be included in the Posting Guide and/or R FAQ?

-- Bert

On Tue, May 17, 2011 at 4:07 PM, bill.venab...@csiro.au wrote:

Amen to all of that, Bert. Nicely put. The google style guide (not perfect, but a thoughtful contribution on these kinds of issues) has avoiding attach() as its very first line. See http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html

I would add, though, that not enough people seem yet to be aware of within(...), a companion of with(...) in a way, but used for modifying data frames or other kinds of list objects. It should be seen as a more flexible replacement for transform() (well, almost).

The difference between with() and within() is as follows:

with(data, expr, ...) allows you to evaluate 'expr' with 'data' providing the primary source for variables, and returns *the evaluated expression* as the result.

By contrast, within(data, expr, ...) again uses 'data' as the primary source for variables when evaluating 'expr', but now 'expr' is used to modify the variables in 'data' and returns *the modified data set* as the result.

I use this a lot in the data preparation phase of a project, especially, which is usually the longest, trickiest, most important, but least discussed aspect of any data analysis project.

Here is a simple example using within() for something you cannot do in one step with transform():

polyData <- within(data.frame(x = runif(500)), {
  x2 <- x^2
  x3 <- x*x2
  b <- runif(4)
  eta <- cbind(1,x,x2,x3) %*% b
  y <- eta + rnorm(x, sd = 0.5)
  rm(b)
})

check:

str(polyData)
'data.frame': 500 obs. of 5 variables:
 $ x  : num 0.5185 0.185 0.5566 0.2467 0.0178 ...
 $ y  : num [1:500, 1] 1.343 0.888 0.583 0.187 0.855 ...
 $ eta: num [1:500, 1] 1.258 0.788 1.331 0.856 0.63 ...
 $ x3 : num 1.39e-01 6.33e-03 1.72e-01 1.50e-02 5.60e-06 ...
 $ x2 : num 0.268811 0.034224 0.309802 0.060844 0.000315 ...

Bill Venables.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
Sent: Wednesday, 18 May 2011 12:08 AM
To: Peter Ehlers
Cc: R list
Subject: Re: [R] Post-hoc tests in MASS using glm.nb

Folks:

Only if the user hasn't yet been introduced to the with() function, which is linked to on the ?attach page. Note also this sentence from the ?attach page: "attach can lead to confusion." I can't remember the last time I needed attach().

Peter Ehlers

Yes. But perhaps it might be useful to flesh this out with a bit of commentary. To this end, I invite others to correct or clarify the following.

The potential confusion comes from requiring R to search for the data. There is a rigorous process by which this is done, of course, but it requires that the runtime environment be consistent with that process, and the programmer who wrote the code may not have control over that environment.

The usual example is that one has an object named, say, "a" in the formula and in the attached data, and another "a" also in the global environment. Then the wrong "a" would be found. The same thing can happen if another data set gets attached in a position before the one of interest. (Like Peter, I haven't used attach() in so long that I don't know whether any warning messages are issued in such cases.)

Using the "data = " argument when available or the with() function when not avoids this potential confusion and tightly couples the data to be analyzed with the analysis.

I hope this clarifies the previous posters' comments.

Cheers,
Bert

[... non-germane material snipped ...]
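To make the with()/within() distinction and the masking problem described above concrete, here is a small sketch. The data frame and object names (dat, a, dat2) are invented for the illustration and do not come from the thread.

```r
set.seed(42)

# An object in the global environment that shares a name with a column.
a   <- c(100, 200, 300)
dat <- data.frame(a = c(1, 2, 3), x = runif(3))

# with(): 'dat' is searched first, so the column 'a' is used,
# and the *value of the expression* is returned.
col_mean    <- with(dat, mean(a))   # mean of the column: 2
global_mean <- mean(a)              # mean of the global object: 200

# within(): the expression modifies a copy of 'dat',
# and the *modified data frame* is returned.
dat2 <- within(dat, {
  a2 <- a^2
  ax <- a * x
})

col_mean
global_mean
names(dat2)   # includes the new columns a2 and ax
names(dat)    # the original data frame is unchanged
```

Nothing is attached here, so which "a" is meant is always determined by the call itself rather than by the state of the search path.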
[R] using hglm to fit a gamma GLMM with nested random effects?
Apologies for continuing to ask about this, but in my quest to fit a gamma GLMM to my data (see partial copy of thread below), I'm exploring using hglm today. The question of the day has to do with the errors I'm currently getting from the hglm package.

Can hglm handle a model with nested random effects? I don't see an example of one of those in the package documentation. If it can, can anyone tell me what these errors are trying to tell me? If not, I promise I'll let this rest and just take B. Bolker's advice to go with a nice safe modified log transform of the data.

Best

test.gamma <- hglm(fixed=post.f.crwn.length~lg.shigo.av+dbh+leaf.area+bark.thick.bh+ht.any, random=~1|site/transect/plot, family=Gamma(link=log), data=rws30.BL)

Error in `contrasts<-`(`*tmp*`, value = contr.treatment) : contrasts can be applied only to factors with 2 or more levels
In addition: Warning messages:
1: In Ops.factor(site, transect) : / not meaningful for factors
2: In Ops.factor(site/transect, plot) : / not meaningful for factors

test.gamma <- hglm(fixed=post.f.crwn.length~lg.shigo.av+dbh+leaf.area+bark.thick.bh+ht.any, random=~1|site, family=Gamma(link=log), data=rws30.BL)

Error in hglm.default(X = X, y = Y, Z = z, family = family, rand.family = rand.family, : Length of X and Z differ.

Dennis Murphy djmu...@gmail.com
Tue, May 17, 2011 at 6:18 PM
To: Benjamin Caldwell btcaldw...@berkeley.edu

Hi: Someone else (Wayne Zhang, CNA) asked a similar question re hierarchical Gamma models on R-help today and responded to suggestions as follows:

Hglm does the work! Thanks! Also, I find that the developing version of lme4, called lme4a, has the capability to fit Gamma models. And both lme4a and hglm produce results consistent with the published ones. Problems solved!

Perhaps you might find success following his lead?

Dennis

Ben Bolker bbol...@gmail.com
Tue, May 17, 2011 at 4:50 PM
To: Benjamin Caldwell btcaldw...@berkeley.edu
Cc: r-sig-mixed-mod...@r-project.org

[forwarding to r-sig-mixed-models list ...]

As of today, Gamma models are (still) not feasible in lme4 -- they are somewhat more numerically challenging than the other families, so Doug Bates is having to do some re-engineering. There is a *possibility* that I can get Gamma fitting to work in the 'alpha'/bleeding-edge development version of glmmADMB, but it will definitely be bleeding-edge ... if you are interested in trying that, please contact me off-list.

In the meantime, my standing advice is to try a LMM on the log-transformed data (zero values in the response are problematic, but they would be problematic in a Gamma GLMM in any case if the shape parameter is ever 1 ...)

Ben Bolker

On 11-05-17 07:32 PM, Benjamin Caldwell wrote:

Addendum: I tried a gamma fit in glmmPQL and got the same errors.

Ben Caldwell
PhD Candidate
University of California, Berkeley

On Tue, May 17, 2011 at 3:51 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote:

Hello

After seeing this email (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/005213.html) I thought I would check whether the issue with a gamma family in lme4 had been fixed; can I fit a hierarchical gamma model in lme4 at this time? There doesn't seem to be another package capable of it at this time.

My thought process:

1. took a look at the response variable and some subsets to see what it looked like (bppfcl and transformed response var), attached
2. took a look at a gamma and gaussian fit to the response variable.
3. ran hierarchical gaussian model in nlme to look at residuals (more familiar with graphs from that package) (qqnorm and residuals)

Given the residual output for the gaussian model it looks like I could remove the values at the end of the distribution and get a decent fit. I'd still like to try a gamma model though, if that's possible. Is it possible in lme4 or another package I don't know about?

---This is the code I'm running---

rws30.BL$site <- factor(rws30.BL$site)
rws30.BL$transect <- interaction(rws30.BL$site, rws30.BL$transect, drop = TRUE)
rws30.BL$plot <- interaction(rws30.BL$site, rws30.BL$transect, rws30.BL$plot, drop = TRUE)
hist(rws30.BL$post.f.crwn.length)
rws30.BL$gpost.f.crwn.length
library(nlme)
burnedmodel1.3 <- lme(post.f.crwn.length~lg.shigo.av+dbh+leaf.area+bark.thick.bh+ht.any+ht.alive, random=(~1|site/transect/plot), na.action=na.omit, data=rws30.BL)

Error: no valid set of coefficients has been found: please supply starting values
In addition: Warning message:
In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
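The fallback suggested in the thread, a linear mixed model on the log-transformed response with the site/transect/plot nesting, might look roughly like the sketch below. It runs on simulated data, because the poster's rws30.BL data set is not available here; the data frame d, the effect sizes, and the single predictor dbh are all invented for the illustration. It also assumes a strictly positive response, since zeros would need separate handling before taking logs.

```r
library(nlme)   # recommended package shipped with standard R installations

set.seed(1)

# Simulated nested layout: 6 sites, 3 transects per site,
# 3 plots per transect, 5 trees per plot (all numbers are made up).
d <- expand.grid(tree = 1:5, plot = 1:3, transect = 1:3, site = 1:6)
d$site     <- factor(d$site)
d$transect <- interaction(d$site, d$transect, drop = TRUE)   # unique within site
d$plot     <- interaction(d$transect, d$plot, drop = TRUE)   # unique within transect

# Random effects on the log scale at each level of the hierarchy.
site_eff     <- rnorm(nlevels(d$site),     sd = 0.5)
transect_eff <- rnorm(nlevels(d$transect), sd = 0.4)
plot_eff     <- rnorm(nlevels(d$plot),     sd = 0.3)

d$dbh <- runif(nrow(d), 10, 60)
mu <- exp(1 + 0.02 * d$dbh +
          site_eff[as.integer(d$site)] +
          transect_eff[as.integer(d$transect)] +
          plot_eff[as.integer(d$plot)])

# Gamma-distributed, strictly positive response with mean mu.
d$y <- rgamma(nrow(d), shape = 5, rate = 5 / mu)

# LMM on the log scale with nested random intercepts.
fit <- lme(log(y) ~ dbh, random = ~ 1 | site/transect/plot, data = d)

fixef(fit)     # intercept and slope on the log scale
VarCorr(fit)   # variance components for site, transect, plot, residual
```

The coefficients are on the log scale, so exp(slope) is a multiplicative effect on the response, broadly comparable to a Gamma model with a log link. The residual plots mentioned in the thread (qqnorm, residuals) are available for fit in the usual nlme way.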
Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb
On 19/05/11 10:26, bill.venab...@csiro.au wrote:

SNIP

Most of [the Google style guide's] advice is very good (meaning I agree with it!) but some is a bit too much (for example, the blanket advice never to use S4 classes and methods - that's just resisting progress, in my view).

SNIP

I must respectfully disagree with this view, and concur heartily with the style guide. S4 classes and methods are a ball-and-chain that one has to drag along. See also fortune("S4 methods"). :-)

cheers,
Rolf