[R] question about mixtools package
Hello all,

This may be a silly question, but what exactly is the beta parameter in functions like regmixEM from the mixtools package? I mean, how do I determine this beta if I have a set of metrics for each case? Is there a function for that? I have tried passing NULL for this parameter, but the function just does not work in that case.

Cheers, Dima

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error message: object of type 'closure' is not subsettable
Newbie wrote:

Dear R-users, I need to calibrate kappa, rho, eta, theta, v0 in the following code, see below. However when I run it, I get:

y <- function(kappahat, rhohat, etahat, thetahat, v0hat) {sum(difference(k, t, S0, X, r, implvol, q, kappahat, rhohat, etahat, thetahat, v0hat)^2)}
nlminb(start=list(kappa, rho, eta, theta, v0), objective = y, lower =lb, upper =ub)
Error in dots[[1L]][[1L]] : object of type 'closure' is not subsettable

And I don't know what this means or what I am doing wrong. Can anyone help me? Here is my code and data set. Best Rikke

You haven't given all your data. Spot csv is missing. You are using nlminb incorrectly. It expects the objective function to take a numeric vector as argument, as clearly stated in the documentation. Which should have been clear after your first post. This would possibly help (NOT tested because of lack of data):

y <- function(par) {
  kappahat <- par[1]; rhohat <- par[2]; etahat <- par[3]; thetahat <- par[4]; v0hat <- par[5]
  sum(difference(k, t, S0, X, r, implvol, q, kappahat, rhohat, etahat, thetahat, v0hat)^2)
}
nlminb(start=c(kappa, rho, eta, theta, v0), objective = y, lower =lb, upper =ub)

Berend

-- View this message in context: http://r.789695.n4.nabble.com/Error-message-object-of-type-closure-is-not-subsettable-tp3752886p3754511.html Sent from the R help mailing list archive at Nabble.com.
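Berend's point, that nlminb hands all parameters to the objective as one numeric vector, can be sketched on a toy problem. The data, the model y = a * exp(b * x) and the names x, y_obs and obj are made up for illustration; this is not Rikke's calibration code, and difference() from the thread is not used.

```r
# Toy data generated from y = a * exp(b * x) with a = 2, b = -0.5
# (illustrative values only, not taken from the thread)
x     <- seq(0, 5, by = 0.5)
y_obs <- 2 * exp(-0.5 * x)

# The objective takes ONE numeric vector and unpacks it inside
obj <- function(par) {
  a <- par[1]
  b <- par[2]
  sum((y_obs - a * exp(b * x))^2)
}

# start, lower and upper are plain numeric vectors of the same length
fit <- nlminb(start = c(1, -1), objective = obj,
              lower = c(0, -5), upper = c(10, 0))
fit$par      # estimates of a and b
fit$message  # convergence message reported by the optimizer
```

The same pattern applies to five parameters: pass start = c(kappa, rho, eta, theta, v0) and index par[1] to par[5] inside the objective.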
Re: [R] Coding question for behavioral data analysis
Hello, as far as I understood your problem, this function might do the trick:

CountNextBehavior <- function (data.source, interest.behavior, lev.ignore, interest.timeframe) {
  ## --
  ## Returns the number of occurring behaviors in a given timeframe
  ##
  ## Args:
  ##   data.source is the source data frame, with columns Behavior and
  ##     Time. Behavior is assumed to be a factor and Time an integer
  ##   interest.behavior is the sought level of the behavior
  ##   lev.ignore is a vector of behavior levels to ignore
  ##   interest.timeframe fixes the time frame for observation count
  ##
  ## Returns:
  ##   a matrix named according to the behaviors

  # First, get rid of unwanted behavioral levels
  data.source <- with(data.source[!data.source$Behavior %in% lev.ignore, ],
                      data.frame(Time = Time, Behavior = factor(Behavior)))
  # Create the return matrix
  seeked.blevels <- levels(data.source$Behavior)
  count.behavior <- matrix(rep(0, length(seeked.blevels)), nrow = 1,
                           dimnames = list("Count", seeked.blevels))
  # Look when the behavior occurs
  seeked.behavior <- data.source$Behavior == interest.behavior
  occuring.time <- data.source$Time[seeked.behavior]
  # Iterate over occurring times
  for (obs.time in occuring.time) {
    # Get all the observed behavior in the given timeframe
    this.timeframe <- data.source$Time > obs.time &
      data.source$Time <= obs.time + interest.timeframe
    this.behavior <- data.source$Behavior[this.timeframe]
    # Get the level of the first observed behavior
    first.behavior <- this.behavior[1]
    # Count the number of occurrences
    this.count <- sum(this.behavior == first.behavior)
    # Add the count to the given behavior
    count.behavior[first.behavior] <- count.behavior[first.behavior] + this.count
  }
  return(count.behavior)
}

On 18.08.2011 19:29, jabroesch wrote:

Hello all, I have a question which I have been struggling with for several weeks now, that I think might be easy for more proficient coders than myself.

I have a large behavioral dataset, with behaviors and the times (milliseconds) that they occurred. Each subject has a separate file, and a sample subject file can be generated using the following syntax:

Time <- c(1000, 1050, 1100, 1500, 2500, 5000, 6500, 6600, 7000)
Behavior <- c("g", "a", "s", "5", "z", "g", "z", "g", "a")
mydata <- data.frame(Time, Behavior)

My basic goal is to be able to extract some details about what behaviors follow another specific behavior within a time window (say 1000 milliseconds). I figured out how to determine if one specific behavior follows another specific behavior within that window with the following syntax.

TimeG = mydata$Time[mydata$Behavior == "g"]
TimeA = mydata$Time[mydata$Behavior == "a"]
out = rep(NA, length(TimeG))
for (i in 1:length(TimeG)){
  tmp = TimeA - TimeG[i]
  out[i] = (sum(0 < tmp & tmp <= 1000) > 0)
}
number_of_behaviors <- length(TimeG)
number_of_affectmirroring <- sum(out)

This generates 2 values: the number of times that the target behavior g occurred, and the number of times that it was followed by the behavior a within 1000 milliseconds.

Question: What I can't seem to figure out is how to generate a count of the number of times that multiple different types of behaviors immediately follow a specific behavior within 1000 milliseconds. So say the behavior of interest is "g" as it is in the example above. I want to determine what was the next behavior (from a specified list of possible behaviors below) that followed it within 1000 milliseconds. Ideally the output would be 1 row with 13 columns. The first column would be the number of times that the target behavior, g in this example, occurs. The next 12 columns would be the number of times that one of the specific behaviors was the next behavior that followed within 1000 milliseconds. So one column for each of these behaviors: a s d z x c v q w e r t.

The two complicating factors are: 1) there might be multiple behaviors that followed within 1000 milliseconds, and I only want to count the first one; and 2) there are additional behaviors that I would like to ignore (like the 5 in the example above).

Any help or suggestions are appreciated. Thank you, James Broesch

-- View this message in context: http://r.789695.n4.nabble.com/Coding-question-for-behavioral-data-analysis-tp3753151p3753151.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] A question about using getSrcDirectory() with R/Rscript
Dear Uwe, Thanks for your suggestion. I should have spotted that (not thinking about the order of commands properly). I still don't know why it failed to work in RGui for me. It works now. I suspect it must be another case of PEBCAK (Problem Exists Between Chair And Keyboard). Best wishes, Cormac.

2011/8/18 Uwe Ligges lig...@statistik.tu-dortmund.de:

Works for me in R-patched: I guess your problem is that you have to set the options() before source()ing. Best, Uwe Ligges

On 17.08.2011 10:31, Cormac Long wrote:

Good morning R-help, I have an idiot question: I would like to use getSrcDirectory() and friends to allow me to identify where an R file has been called from when invoked using Rscript. If I understand the documentation correctly, the following example should work. In file test.R:

options(keep.source=T)
fn <- function(x){x <- x+1}
srcDir <- getSrcDirectory(fn)
print(srcDir)

I attempted the following invocations of Rscript:
+ Rscript test.R
+ Rscript <full_path>/test.R

I attempted the following invocations using R:
+ source("test.R")
+ Manually entering the function

In both attempts, the variable srcDir is a zero-length character vector. Digging into the documentation, I notice that getSrcDirectory() looks for a srcref attribute in the function body. In neither R nor Rscript is this attribute set when declaring the function. So: what am I missing?

Comments:
+ I have the 'keep.source' option set to TRUE in both R and Rscript (irritatingly, its default is TRUE in R and FALSE in Rscript - why is this?)
+ I have tested this with:
  o R 2.13.1 on Ubuntu 10.10 (server)
  o R 2.13.0 on Windows 7

Best wishes, Cormac.
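Uwe's remark about ordering can be sketched as follows: source references have to be switched on before the file is parsed. The temporary file and the names script, srcDir and srcFile are illustrative; passing keep.source directly to source() is one way to avoid depending on the session default, which is FALSE under Rscript.

```r
# Write a one-function script to a temporary file (file name is arbitrary)
script <- tempfile(fileext = ".R")
writeLines("fn <- function(x) { x <- x + 1 }", script)

# keep.source must be TRUE while the file is parsed, so it is passed to
# source() itself rather than being set inside the sourced file
source(script, keep.source = TRUE)

srcDir  <- getSrcDirectory(fn)  # directory recorded in the srcref of fn
srcFile <- getSrcFilename(fn)   # file name recorded in the srcref of fn
print(srcDir)
print(srcFile)
```

Calling options(keep.source = TRUE) inside test.R itself comes too late for the functions in that same file, which matches the behaviour described in the original question.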
Re: [R] UNC Windows path beginning with backslashes
Thanks Henrik, but I have 2 reasons for not using that approach: A) If I don't map the drive until after R starts the UNC path is already present in several places I know about and probably some I don't, leading to the problems I started with. So reason 'B' doesn't really matter to me, but as author of R.utils you may be interested that... B) On my system those calls don't seem to work. Details here... -- library(R.utils) Loading required package: R.oo Loading required package: R.methodsS3 R.methodsS3 v1.2.1 (2010-09-18) successfully loaded. See ?R.methodsS3 for help. R.oo v1.8.1 (2011-07-10) successfully loaded. See ?R.oo for help. R.utils v1.7.8 (2011-07-24) successfully loaded. See ?R.utils for help. sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] datasets grDevices splines graphics stats utils tcltk [8] tools methods base other attached packages: [1] R.utils_1.7.8 R.oo_1.8.1 R.methodsS3_1.2.1 RODBC_1.3-3 [5] tree_1.0-29nlme_3.1-102 MASS_7.3-14 xlsReadWrite_1.5.4 [9] svSocket_0.9-51TinnR_1.0.3R2HTML_2.2 Hmisc_3.8-3 [13] survival_2.36-9 loaded via a namespace (and not attached): [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-31 svMisc_0.9-61 # It seems to think I have no mapped drives System$getMappedDrivesOnWindows() named character(0) # Although I clearly have (in fact I'm running R from Z:), so I can't # find a 'spare' drive letter system(net use) New connections will not be remembered. 
Status Local RemoteNetwork --- OK F:\\server10\microbiology Microsoft Windows Network OK L:\\server23\Stats Microsoft Windows Network OK M:\\server10\jewell Microsoft Windows Network OK Q:\\server04\pccommon (not backed up) Microsoft Windows Network OK R:\\server23\Template Microsoft Windows Network Z:\\campden\shares\Workgroup\Stats Microsoft Windows Network \\TSCLIENT\C Microsoft Terminal Services \\TSCLIENT\D Microsoft Terminal Services \\TSCLIENT\E Microsoft Terminal Services \\TSCLIENT\F Microsoft Terminal Services \\TSCLIENT\G Microsoft Terminal Services \\TSCLIENT\H Microsoft Terminal Services \\TSCLIENT\I Microsoft Terminal Services \\TSCLIENT\L Microsoft Terminal Services \\TSCLIENT\M Microsoft Terminal Services \\TSCLIENT\Q Microsoft Terminal Services \\TSCLIENT\R Microsoft Terminal Services The command completed successfully. # The commands you cited throw errors... System$mapDriveOnWindows(K, campden\\shares\\Workgroup\\Stats) Error in list(`System$mapDriveOnWindows(K, campden\\shares\\Workgroup\\Stats)` = environment, : [2011-08-19 09:16:28] Exception: Argument 'drive' is not a valid drive (e.g. 'Y:'): K at throw(Exception(...)) at throw.default(Argument 'drive' is not a valid drive (e.g. 'Y:'): , drive) at throw(Argument 'drive' is not a valid drive (e.g. 'Y:'): , drive) at method(static, ...) at System$mapDriveOnWindows(K, campden\\shares\\Workgroup\\Stats) driveLetters - System$getMappedDrivesOnWindows() driveLetters named character(0) System$unmapDriveOnWindows(K) Error in list(`System$unmapDriveOnWindows(K)` = environment, `method(static, ...)` = environment, : [2011-08-19 09:29:09] Exception: Argument 'drive' is not a valid drive (e.g. 'Y:'): K at throw(Exception(...)) at throw.default(Argument 'drive' is not a valid drive (e.g. 'Y:'): , drive) at throw(Argument 'drive' is not a valid drive (e.g. 'Y:'): , drive) at method(static, ...) 
at System$unmapDriveOnWindows(K) Thanks for your interest, Keith Jewell - Henrik Bengtsson h...@biostat.ucsf.edu wrote in message news:cafdcvcqe3uukmmqsjj0fpevfjgrabrgbt1g8drcxgpnsjeb...@mail.gmail.com... I think you can also do this from within R (e.g. in your .Rprofile) using the R.utils package; library(R.utils) System$mapDriveOnWindows(K, campden\\shares\\Workgroup\\Stats) driveLetters - System$getMappedDrivesOnWindows() System$unmapDriveOnWindows(K) These methods utilize
Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction
Dear Mark, Thank you very much for your kind advice. Actually, I already performed penalized logistic regression by pentrace and lrm in package rms. The reason why I wanted to reduce dimensionality of those 9 variables was that these variables were not so important according to the subject matter knowledge and that I wanted to avoid events per variable problem. Your answer about dudi.mix$l1 helped me a lot. I finally was able to perform penalized logistic regression for data consisting of 4 important variables and x18.dudi.mix$l1[, 1]. Thanks a lot again. One more question, I investigated homals package too. I found it has ndim option. mydata is followings; head(x10homals.df) age sex symptom HT DM IHD smoking hyperlipidemia Statin Response 1 62 M asymptomatic positive negative negative positive positive positive negative 2 82 M symptomatic positive negative negative negative positive positive negative 3 64 M asymptomatic negative positive negative negative positive positive negative 4 55 M symptomatic positive positive positive negative positive positive negative 5 67 M symptomatic positive negative negative negative negative positive negative 6 79 M asymptomatic positive positive negative negative positive positive negative age is continuous variable, and Response should not be active for computation, so, ... x10.homals4 - homals(x10homals.df, active = c(rep(TRUE, 9), FALSE), level=c(numerical, rep(nominal, 9)), ndim=4) I did it with ndim from 2 to 9, compared Classification rate of Response by predict(x10.homals). p.x10.homals4 Classification rate: Variable Cl. Rate %Cl. Rate 1 age 0.4712 47.12 2 sex 0.9808 98.08 3 symptom 0.8269 82.69 4 HT 0.9135 91.35 5 DM 0.8558 85.58 6 IHD 0.8750 87.50 7 smoking 0.9423 94.23 8 hyperlipidemia 0.9519 95.19 9 Statin 0.8942 89.42 10 Response 0.6154 61.54 This is the best for classification of Response, so, I selected ndim=4. Then, I found objscores. 
head(x10.homals4$objscores) D1 D2 D3 D4 1 -0.002395321 -0.034032230 -0.008140378 0.02369123 2 0.036788626 -0.010308707 0.005725984 -0.02751958 3 0.014363031 0.049594466 -0.025627467 0.06254055 4 0.083092285 0.065147519 0.045903394 -0.03751551 5 -0.013692504 0.005106661 -0.007656776 -0.04107009 6 0.002320747 0.024375393 -0.017785415 -0.01752556 I used x10.homals4$objscores[, 1] as a predictor for logistic regression as in the same way as PC1 in PCA. Am I going the right way? Thanks a lot for your help in advance. Best regards -- Kohkichi Hosoda (11/08/19 4:21), Mark Difford wrote: On Aug 18, 2011 khosoda wrote: I'm trying to do model reduction for logistic regression. Hi Kohkichi, My general advice to you would be to do this by fitting a penalized logistic model (see lrm in package rms and glmnet in package glmnet; there are several others). Other points are that the amount of variance explained by mixed PCA and MCA are not comparable. Furthermore, homals() is a much better choice than MCA because it handles different types of variables whereas MCA is for categorical variables. On the more specific question of whether you should use dudi.mix$l1 or dudi.mix$li, it doesn't matter: the former is a scaled version of the latter. Same for dudi.acm. To see this do the following: ## plot(x18.dudi.mix$li[, 1], x18.dudi.mix$l1[, 1]) Regards, Mark. - Mark Difford (Ph.D.) Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa -- View this message in context: http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3753437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Kobe University Graduate School of Medicine, Department of Neurosurgery, Kohkichi Hosoda, 7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017 Phone: 078-382-5966 Fax : 078-382-5979 E-mail address Office: khos...@med.kobe-u.ac.jp Home : khos...@venus.dti.ne.jp
[R] retain class after merge
Dear All, is there a simple way to retain the class attribute of a column when merging two data.frames? When merging the example data.frames from help(merge) I am unable to keep the class attribute as set before merging (see below). Two columns are assigned new classes before the merge (myclass1, myclass2), but after the merge the resulting column has class character.

best regards, Heinz

## use character columns of names to get sensible sort order
authors <- data.frame(
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    nationality = c("US", "Australia", "US", "UK", "Australia"),
    deceased = c("yes", rep("no", 4)))
books <- data.frame(
    name = I(c("Tukey", "Venables", "Tierney", "Ripley", "Ripley", "McNeil", "R Core")),
    title = c("Exploratory Data Analysis", "Modern Applied Statistics ...", "LISP-STAT",
              "Spatial Statistics", "Stochastic Simulation", "Interactive Data Analysis",
              "An Introduction to R"),
    other.author = c(NA, "Ripley", NA, NA, NA, NA, "Venables & Smith"))
class(authors$surname) <- 'myclass1'
class(books$name) <- 'myclass2'
(m1 <- merge(authors, books, by.x = "surname", by.y = "name"))
class(m1$surname)
[1] "character"

sessionInfo()
R version 2.13.1 Patched (2011-08-08 r56671)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
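One possible workaround, offered only as a sketch and not as the way merge() is meant to be used, is to remember the class of the key column before merging and re-apply it afterwards. The shortened data and the variable key_class are illustrative; whether re-applying the class is appropriate depends on what methods myclass1 is supposed to carry.

```r
# Reduced version of the help(merge) example (fewer rows, for brevity)
authors <- data.frame(surname = c("Tukey", "Venables", "Ripley"),
                      nationality = c("US", "Australia", "UK"),
                      stringsAsFactors = FALSE)
books <- data.frame(name = c("Tukey", "Venables", "Ripley", "Ripley"),
                    title = c("Exploratory Data Analysis",
                              "Modern Applied Statistics ...",
                              "Spatial Statistics", "Stochastic Simulation"),
                    stringsAsFactors = FALSE)
class(authors$surname) <- "myclass1"
class(books$name) <- "myclass2"

key_class <- class(authors$surname)  # remember the class before merging

m1 <- merge(authors, books, by.x = "surname", by.y = "name")
class(m1$surname)                    # inspect what merge() returned

class(m1$surname) <- key_class       # re-apply the remembered class
class(m1$surname)
```

A class that should survive subsetting on its own would need its own `[` method, which is how classes such as Date or AsIs keep their attributes.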
[R] Windows 7 issues with installing packages and setting library paths
Dear all, I am forced to work in an environment without administrator rights. When using R2.13.1 on Windows 7 (64-Bit), I found that I can´t install or update any packages due to missing writing permissions. I managed to get full access to a directory on my C:\ drive now - but how do I specify that all libraries shall be installed into this directory? In Rcmd_environ I have the following entries: ## from R.sh R_SHARE_DIR=C:\\Program Files\\R\\R-2.13.1\share R_INCLUDE_DIR=C:\\Program Files\\R\\R-2.13.1\share\include R_DOC_DIR=C:\\Program Files\\R\\R-2.13.1\share\doc R_ARCH= R_LIBS_USER=C:\\Program Files\\R\\R-2.13.1\\library R_LIBS=C:\\Program Files\\R\\R-2.13.1\\library In Rprofile.site I have the following entries: .Library.site=C:\\Program Files\\R\\R-2.13.1\\library .Library=C:\\Program Files\\R\\R-2.13.1\\library .libPaths=C:\\Program Files\\R\\R-2.13.1\\library What else do I need to change? When I start up R, I get the following error message: Error: cannot change value of locked binding for '.Library' When calling .libPaths, I still get the wrong path: winfs-uni.top.gwdg.de/cscherb1$/R/R-2.13.1/library R has been installed at C:\\Program Files\\R but for some reason it still uses winfs-uni.top.gwdg.de/cscherb1$/R as the default directory for libraries (where I don´t have write permissions for some unknown reasons) What can I do to change the default library installation location? Any help would be greatly appreciated! Many thanks and best wishes Christoph -- Dr. rer.nat. Christoph Scherber University of Goettingen DNPW, Agroecology Grisebachstr. 6 D-37077 Goettingen Germany phone +49 (0)551 39 8807 fax +49 (0)551 39 8806 Homepage http://www.gwdg.de/~cscherb1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Coding question for behavioral data analysis
You might try using outer to create a matrix that will help out:

Time <- c(1000, 1050, 1100, 1500, 2500, 5000, 6500, 6600, 7000)
Time
[1] 1000 1050 1100 1500 2500 5000 6500 6600 7000
?outer
starting httpd help server ... done
x <- outer(Time, Time, FUN = function(a, b){d <- b-a; (d >= 0) & (d <= 1000)})
x
       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]
 [1,]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
 [2,] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
 [3,] FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
 [4,] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE
 [5,] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
 [6,] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
 [7,] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
 [8,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
 [9,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE

This says, reading down the columns, that event 4 occurs after 1, 2 & 3 within the window; event 9 occurs after 7 & 8 within the window; etc.

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
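Building on the outer() idea, the count James asked for could be sketched like this. It assumes that Time is sorted in ascending order, that only behaviors in the wanted list may count as a "next" behavior, and that a target with no wanted follower inside the window contributes nothing; the names target, wanted, window and result are illustrative.

```r
Time     <- c(1000, 1050, 1100, 1500, 2500, 5000, 6500, 6600, 7000)
Behavior <- c("g", "a", "s", "5", "z", "g", "z", "g", "a")

target <- "g"
wanted <- c("a", "s", "d", "z", "x", "c", "v", "q", "w", "e", "r", "t")
window <- 1000

tg   <- Time[Behavior == target]   # times of the target behavior
cand <- Behavior %in% wanted       # drops ignored codes such as "5"

# Rows = target occurrences, columns = candidate followers
d      <- outer(tg, Time[cand], function(a, b) b - a)
in_win <- d > 0 & d <= window

# First candidate inside the window for each target (NA if there is none)
first_idx <- apply(in_win, 1, function(r) if (any(r)) which(r)[1] else NA_integer_)
nxt       <- Behavior[cand][first_idx]

counts <- table(factor(nxt, levels = wanted))
result <- setNames(c(length(tg), as.vector(counts)), c(target, wanted))
result
```

With the sample data, g occurs three times; the occurrences at 1000 and 6600 are each followed first by a, and the one at 5000 has no wanted follower within 1000 ms.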
Re: [R] Convert week value to date
Folkes, Michael:

I now realize I could write code to evaluate which of the first 7 days in the year is a Monday and then I'd know the start of week 1 in each year, and multiply from there.

But note that

library(surveillance)
# ISO week
isoWeekYear(as.Date("2010-01-01"))$ISOWeek
[1] 53

so that 2010-01-01 is actually in week 53 of the year 2009. January 4th is always in week 1 of the same year

isoWeekYear(as.Date("2010-01-04"))$ISOWeek
[1] 1

and every now and then there are 53 weeks in a year, not 52.

Heikki Kaskelma Munkkiniemi
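The same check can be sketched in base R, without the surveillance package, using the output-only format codes %V (ISO 8601 week) and %G (ISO week-based year). This is an alternative to the isoWeekYear() call above, not a description of what that function does internally, and the data frame name iso is illustrative.

```r
d <- as.Date(c("2010-01-01", "2010-01-04"))

# %G = ISO week-based year, %V = ISO week number (both for output only)
iso <- data.frame(date     = d,
                  iso_year = format(d, "%G"),
                  iso_week = format(d, "%V"),
                  stringsAsFactors = FALSE)
print(iso)
```

The first row illustrates the point made in the message: 1 January 2010 belongs to the last ISO week of 2009, so the calendar year and the ISO year differ.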
Re: [R] postscript( does not save the plot
Dear Marc, I would like to thank you for your answer. Unfortunately still setEPS() postscript(file=exponcoverapprox.eps) boxplot(test[30,1:500],exponper[90,1:500],test[150,1:500],test[210,1:500],test[270,1:500],test[330,1:500],test[390,1:500],names=c(1,3,5,8,10,13,15),outline=FALSE,ylim=c(0.01,50),log=y, xlab = Xlabel,ylab=Ylabel,boxwex=0.5,pars=list(whisklwd=0,staplelwd=0) ) dev.off() still does not work, though the setEPS() postscript(file=exponcoverapprox.eps) boxplot(test[30,1:500],exponper[90,1:500],test[150,1:500],test[210,1:500],test[270,1:500],test[330,1:500],test[390,1:500],names=c(1,3,5,8,10,13,15),outline=FALSE,ylim=c(0.01,50),log=y, xlab = Xlabel,ylab=Ylabel,boxwex=0.5) dev.off() works fine! It seems that the problem is with the pars=list(. Just to make it more clear. The dev.off() returns 1 and the file is created. The problem is that this file can not be open with any program, while all the other .eps files I have and were created by R, with the above methodology work really nice. B.R Alex From: Marc Schwartz marc_schwa...@me.com Cc: R-help@r-project.org R-help@r-project.org Sent: Wednesday, August 17, 2011 5:48 PM Subject: Re: [R] postscript( does not save the plot Not sure what output you get in the first case. You don't need: ps.options=setEPS() just: setEPS() Using: set.seed(1) test - matrix(runif(500*500), 500) setEPS() postscript(file = exponcoverapprox.eps) boxplot(test[30, 1:500], test[90, 1:500], test[150, 1:500], test[210, 1:500], test[270, 1:500], test[330, 1:500], test[390, 1:500], names = c(1, 3, 5, 8, 10, 13, 1), outline = FALSE, ylim=c(0.01, 50), log = y, xlab = xvalue, ylab = yvalue, boxwex=0.5, pars = list(whisklwd = 0, staplelwd = 0)) dev.off() I get the attached output which seems to be OK. Marc On Aug 17, 2011, at 10:02 AM, Alaios wrote: The problem is a bit weird. 
This does not work: ps.options=setEPS() postscript(file=exponcoverapprox.eps) boxplot(test[30,1:500],test[90,1:500],test[150,1:500],test[210,1:500],test[270,1:500],test[330,1:500],test[390,1:500],names=c(1,3,5,8,10,13,1),outline=FALSE,ylim=c(0.01,50),log=y, xlab = xvalue,ylab=yvalue,boxwex=0.5, pars =list(whisklwd=0,staplelwd=0)) dev.off() This works postscript(file=exponcoverapprox.eps) boxplot(test[30,1:500],test[90,1:500],test[150,1:500],test[210,1:500],test[270,1:500],test[330,1:500],test[390,1:500],names=c(1,3,5,8,10,13,1),outline=FALSE,ylim=c(0.01,50),log=y, xlab = xvalue,ylab=yvalue,boxwex=0.5) dev.off() To not bother you with the details, the only difference is the pars =list(whisklwd=0,staplelwd=0) at the end of the boxplot , which I use to remove the whiskers fromt he blot. B.R From: Marc Schwartz marc_schwa...@me.com Cc: R-help@r-project.org R-help@r-project.org Sent: Tuesday, August 16, 2011 7:38 PM Subject: Re: [R] postscript( does not save the plot On Aug 16, 2011, at 12:32 PM, Alaios wrote: Dear all, I am using the following code to write the plot to an eps format postscript(file=test.eps,horizontal=FALSE) boxplot(test[30,1:500],test[90,1:500],test[150,1:500],test[210,1:500],test[270,1:500],test[330,1:500],test[390,1:500],names=c(1,3,5,8,10,13,1),outline=FALSE,ylim=c(0.01,50),log=y, xlab = xvalue,ylab=yvalue,boxwex=0.5, pars =list(whisklwd=0,staplelwd=0)) dev.off() This creates a 6kb eps file, that can not be opened by any program. I tired with photoshop gimp, acrobat reader. This is the normal process I follow to save my plots. dev.off always returns 1. and the boxplot function succesfullu does the plot in the screen. What might be the problem? I would like to thank you in advance for your help B.R Alex You did not create an EPS file. 
See ?postscript and pay attention to the fourth paragraph under Details:

The postscript produced for a single R plot is EPS (Encapsulated PostScript) compatible, and can be included into other documents, e.g., into LaTeX, using \includegraphics{filename}. For use in this way you will probably want to use setEPS() to set the defaults as horizontal = FALSE, onefile = FALSE, paper = "special". Note that the bounding box is for the device region: if you find the white space around the plot region excessive, reduce the margins of the figure region via par(mar=).

HTH, Marc Schwartz
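The setEPS() workflow described above can be sketched end to end as follows. The random data mirror Marc's test matrix; the temporary file name is illustrative. Note one deliberate swap: instead of whisklwd = 0 and staplelwd = 0 from the thread, this sketch hides whiskers and staples with line type 0 (blank), on the assumption that hiding them is the goal. It is not a diagnosis of why the original file would not open.

```r
set.seed(1)
test <- matrix(runif(500 * 500), 500)

eps_file <- file.path(tempdir(), "boxplot_sketch.eps")

setEPS()                     # horizontal = FALSE, onefile = FALSE, paper = "special"
postscript(file = eps_file)
boxplot(test[30, ], test[90, ], test[150, ],
        names = c(1, 3, 5), outline = FALSE,
        ylim = c(0.01, 50), log = "y",
        xlab = "xvalue", ylab = "yvalue", boxwex = 0.5,
        pars = list(whisklty = 0, staplelty = 0))  # lty 0 draws nothing
dev.off()                    # the file is only complete after dev.off()

readLines(eps_file, n = 1)   # first header line of the written file
```

Inspecting the first line of the file is a quick way to see whether a PostScript header was written before trying to open the file in another program.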
Re: [R] Dates - week and year not day.
Tack on a day of the week (6 as the last day) for a point of reference for the conversion: dates - paste('6.', 0:53, '.2011', sep = '') dates [1] 6.0.2011 6.1.2011 6.2.2011 6.3.2011 6.4.2011 6.5.2011 6.6.2011 [8] 6.7.2011 6.8.2011 6.9.2011 6.10.2011 6.11.2011 6.12.2011 6.13.2011 [15] 6.14.2011 6.15.2011 6.16.2011 6.17.2011 6.18.2011 6.19.2011 6.20.2011 [22] 6.21.2011 6.22.2011 6.23.2011 6.24.2011 6.25.2011 6.26.2011 6.27.2011 [29] 6.28.2011 6.29.2011 6.30.2011 6.31.2011 6.32.2011 6.33.2011 6.34.2011 [36] 6.35.2011 6.36.2011 6.37.2011 6.38.2011 6.39.2011 6.40.2011 6.41.2011 [43] 6.42.2011 6.43.2011 6.44.2011 6.45.2011 6.46.2011 6.47.2011 6.48.2011 [50] 6.49.2011 6.50.2011 6.51.2011 6.52.2011 6.53.2011 as.POSIXct(dates, format = %w.%W.%Y) [1] 2011-01-01 EST 2011-01-08 EST 2011-01-15 EST 2011-01-22 EST 2011-01-29 EST [6] 2011-02-05 EST 2011-02-12 EST 2011-02-19 EST 2011-02-26 EST 2011-03-05 EST [11] 2011-03-12 EST 2011-03-19 EDT 2011-03-26 EDT 2011-04-02 EDT 2011-04-09 EDT [16] 2011-04-16 EDT 2011-04-23 EDT 2011-04-30 EDT 2011-05-07 EDT 2011-05-14 EDT [21] 2011-05-21 EDT 2011-05-28 EDT 2011-06-04 EDT 2011-06-11 EDT 2011-06-18 EDT [26] 2011-06-25 EDT 2011-07-02 EDT 2011-07-09 EDT 2011-07-16 EDT 2011-07-23 EDT [31] 2011-07-30 EDT 2011-08-06 EDT 2011-08-13 EDT 2011-08-20 EDT 2011-08-27 EDT [36] 2011-09-03 EDT 2011-09-10 EDT 2011-09-17 EDT 2011-09-24 EDT 2011-10-01 EDT [41] 2011-10-08 EDT 2011-10-15 EDT 2011-10-22 EDT 2011-10-29 EDT 2011-11-05 EDT [46] 2011-11-12 EST 2011-11-19 EST 2011-11-26 EST 2011-12-03 EST 2011-12-10 EST [51] 2011-12-17 EST 2011-12-24 EST 2011-12-31 EST NA On Tue, Aug 16, 2011 at 4:01 AM, holdnatalie osp...@bangor.ac.uk wrote: Hi, I would be very grateful for some advice. I have read the help pages for Date, strptime, etc. All examples seem to use some version of day month year as date format. However I have Weekly composite data so ONLY want to input the dates as Week.Year (eg 35.2011). 
strptime seems to show this is possible using %W for week (UK convention) and %Y for year. My data is in a df called chlorophyll and has a date column. I tried the following (after converting to character with as.character):

    chlorophyll$date <- strptime(chlorophyll$date, "%W.%Y")

It recognised the year but replaced the week part with today's date (16th August). Any advice?

Thanks
Natalie

--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
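The trick in the reply above can be wrapped in a small helper. This is only a sketch: the function name week_year_to_date and the default of weekday 6 (Saturday) as the reference day are illustrative assumptions, and as.Date is used instead of as.POSIXct so that the result does not depend on the local time zone.

```r
# Hypothetical helper: convert "Week.Year" strings (e.g. "35.2011") to Date.
# %W counts weeks starting on Monday; a weekday has to be supplied because
# a week number plus a year does not identify a single day.
week_year_to_date <- function(x, weekday = 6) {
  as.Date(paste(weekday, x, sep = "."), format = "%w.%W.%Y")
}

wk <- c("0.2011", "35.2011", "52.2011")
week_year_to_date(wk)
```

For a data frame like the one in the question, something along the lines of chlorophyll$date <- week_year_to_date(as.character(chlorophyll$date)) should then give one date per week; the results are expected to agree with the corresponding entries of the as.POSIXct output above.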
Re: [R] postscript( does not save the plot
On Aug 19, 2011, at 8:06 AM, Alaios wrote:

Dear Marc, I would like to thank you for your answer. Unfortunately still

    setEPS()
    postscript(file="exponcoverapprox.eps")
    boxplot(test[30,1:500], exponper[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,15), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="Xlabel", ylab="Ylabel, boxwex=0.5,
            pars=list(whisklwd=0,staplelwd=0))
    dev.off()

still does not work, though the

The code you posted does not parse. There is a missing closing quote in the ylab argument.

-- David.

    setEPS()
    postscript(file="exponcoverapprox.eps")
    boxplot(test[30,1:500], exponper[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,15), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="Xlabel", ylab="Ylabel", boxwex=0.5)
    dev.off()

works fine! It seems that the problem is with the pars=list(. Just to make it more clear: dev.off() returns 1 and the file is created. The problem is that this file cannot be opened with any program, while all the other .eps files I have that were created by R with the above method work really nicely.

B.R
Alex

From: Marc Schwartz marc_schwa...@me.com
Cc: R-help@r-project.org R-help@r-project.org
Sent: Wednesday, August 17, 2011 5:48 PM
Subject: Re: [R] postscript( does not save the plot

Not sure what output you get in the first case. You don't need:

    ps.options=setEPS()

just:

    setEPS()

Using:

    set.seed(1)
    test <- matrix(runif(500*500), 500)
    setEPS()
    postscript(file = "exponcoverapprox.eps")
    boxplot(test[30, 1:500], test[90, 1:500], test[150, 1:500], test[210, 1:500],
            test[270, 1:500], test[330, 1:500], test[390, 1:500],
            names = c(1, 3, 5, 8, 10, 13, 1), outline = FALSE,
            ylim = c(0.01, 50), log = "y", xlab = "xvalue", ylab = "yvalue",
            boxwex = 0.5, pars = list(whisklwd = 0, staplelwd = 0))
    dev.off()

I get the attached output which seems to be OK.
Marc

On Aug 17, 2011, at 10:02 AM, Alaios wrote:

The problem is a bit weird. This does not work:

    ps.options=setEPS()
    postscript(file="exponcoverapprox.eps")
    boxplot(test[30,1:500], test[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,1), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="xvalue", ylab="yvalue", boxwex=0.5,
            pars=list(whisklwd=0,staplelwd=0))
    dev.off()

This works:

    postscript(file="exponcoverapprox.eps")
    boxplot(test[30,1:500], test[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,1), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="xvalue", ylab="yvalue", boxwex=0.5)
    dev.off()

To not bother you with the details, the only difference is the pars=list(whisklwd=0,staplelwd=0) at the end of the boxplot call, which I use to remove the whiskers from the plot.

B.R

From: Marc Schwartz marc_schwa...@me.com
Cc: R-help@r-project.org R-help@r-project.org
Sent: Tuesday, August 16, 2011 7:38 PM
Subject: Re: [R] postscript( does not save the plot

On Aug 16, 2011, at 12:32 PM, Alaios wrote:

Dear all, I am using the following code to write the plot to an eps format:

    postscript(file="test.eps",horizontal=FALSE)
    boxplot(test[30,1:500], test[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,1), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="xvalue", ylab="yvalue", boxwex=0.5,
            pars=list(whisklwd=0,staplelwd=0))
    dev.off()

This creates a 6kb eps file that cannot be opened by any program. I tried with Photoshop, GIMP and Acrobat Reader. This is the normal process I follow to save my plots. dev.off always returns 1, and the boxplot function successfully draws the plot on the screen. What might be the problem? I would like to thank you in advance for your help.

B.R
Alex

You did not create an EPS file.
Re: [R] postscript( does not save the plot
Sometimes when I have a script that does not close out a graphics device correctly (using PDF), I sometimes have problems opening up the file. I use the following command to make sure all graphics devices are closed before generating plots after a script has not terminated correctly:

    graphics.off()

Try this command before generating your plots.
Re: [R] postscript( does not save the plot
Dear all, I would like to thank you for your replies. I have tried with graphics.off() but it did not help either. I am also sorry that my example was not reproducible.

So this one

    setEPS()
    postscript(file="mytest.eps")
    boxplot(test[30,1:500], test[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,15), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="Number of Sensors/km^2",
            ylab="Percentage of Coverage Error Estimations (log scale)",
            boxwex=0.5)
    dev.off()

always saves to an eps file that works, while this one

    setEPS()
    postscript(file="mytest.eps")
    boxplot(test[30,1:500], test[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,15), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="Number of Sensors/km^2",
            ylab="Percentage of Coverage Error Estimations (log scale)",
            boxwex=0.5, pars=list(whisklwd=0,staplelwd=0))
    dev.off()

will create an eps file that is not saved. I think it is the last part of the boxplot call, pars=list(whisklwd=0,staplelwd=0), that causes this. What should I do to debug this?

B.R
Alex

    # Trashes test<-exponper
    setEPS()
    postscript(file="mytest.eps")
    boxplot(test[30,1:500], test[90,1:500], test[150,1:500], test[210,1:500],
            test[270,1:500], test[330,1:500], test[390,1:500],
            names=c(1,3,5,8,10,13,15), outline=FALSE, ylim=c(0.01,50), log="y",
            xlab="Number of Sensors/km^2",
            ylab="Percentage of Coverage Error Estimations (log scale)",
            boxwex=0.5)
    # legend("topright",c("Exponential"))
    dev.off()
Re: [R] postscript( does not save the plot
On Aug 19, 2011, at 9:08 AM, Alaios wrote:

Dear all, I would like to thank you for your replies. I have tried with graphics.off() but it did not help either. I am also sorry that my example was not reproducible

It has never been reproducible because you have ignored the request 3 days ago to supply the 'test' object. You have also ignored the request that you post sessionInfo().

--
David Winsemius, MD
West Hartford, CT
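For what it is worth, a reproducible version of the report could look roughly like the sketch below. It borrows the simulated 'test' matrix from Marc's reply; the temporary file name and the reduced number of boxes are assumptions made only to keep the example short. It writes the plot with the suspect pars argument, then checks that the file is non-empty and begins with a PostScript header, and prints the session details that were asked for.

```r
# Simulated stand-in for the poster's 'test' object (as in Marc's reply).
set.seed(1)
test <- matrix(runif(500 * 500), 500)

eps_file <- tempfile(fileext = ".eps")  # hypothetical output path
graphics.off()                          # close any devices left open earlier
setEPS()
postscript(file = eps_file)
boxplot(test[30, ], test[90, ], test[150, ],
        names = c(1, 3, 5), outline = FALSE,
        ylim = c(0.01, 50), log = "y",
        xlab = "xvalue", ylab = "yvalue", boxwex = 0.5,
        pars = list(whisklwd = 0, staplelwd = 0))
invisible(dev.off())

file.info(eps_file)$size      # size in bytes, should be greater than zero
readLines(eps_file, n = 1)    # a PostScript file starts with "%!PS"
sessionInfo()                 # R version, platform and loaded packages
```

If the file passes these checks but a viewer still refuses it, posting exactly this script together with the sessionInfo() output gives the list something it can run.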
Re: [R] Concatenate two strings in one in a string matrix
Hello

Many thanks.

"*" is not a typo. The output is a description of a nonlinear system, so terms such as y(k-1)*y(k-2) are allowed. I wonder whether "" could be ignored so that outputs such as "y_{01}(k-003)*" would not show up.

Cheers

Ed

On Thu, Aug 18, 2011 at 3:40 PM, David Winsemius dwinsem...@comcast.net wrote:

On Aug 18, 2011, at 2:35 PM, Eduardo Mendes wrote:

Dear R-Users

I have the following matrix

    out$desc
          [,1]            [,2]
     [1,] ""              ""
     [2,] "y_{01}(k-001)" ""
     [3,] "y_{01}(k-002)" ""
     [4,] "y_{01}(k-003)" ""
     [5,] "u_{01}(k-001)" ""
     [6,] "u_{01}(k-002)" ""
     [7,] "u_{01}(k-003)" ""
     [8,] "y_{01}(k-001)" "y_{01}(k-001)"
     [9,] "y_{01}(k-001)" "y_{01}(k-002)"
    [10,] "y_{01}(k-001)" "y_{01}(k-003)"
    [11,] "y_{01}(k-001)" "u_{01}(k-001)"

and need to concatenate each line to a single string. Something like

     [2,] "y_{01}(k-001)" ""              ->  [2,] "y_{01}(k-001)"
    [11,] "y_{01}(k-001)" "u_{01}(k-001)" -> [11,] "y_{01}(k-001)*u_{01}(k-001)"

Is there a way to do it without going through every column?

    apply(out$desc, 1, paste, collapse="")

It is ambiguous what you want for a delimiter. In one case you used "", and in another you used "*". I used "".

--
David Winsemius, MD
West Hartford, CT
Re: [R] More efficient option to append()?
On 08/17/2011 10:53 PM, Alex Ruiz Euler wrote:

Dear R community,

I have a 2 million by 2 matrix that looks like this:

    x<-sample(1:15,200, replace=T)
    y<-sample(1:10*1000, 200, replace=T)

          x    y
    [1,] 10 4000
    [2,]  3 1000
    [3,]  3 4000
    [4,]  8 6000
    [5,]  2 9000
    [6,]  3 8000
    [7,]  2 1
    (...)

The first column is a population expansion factor for the number in the second column (household income). I want to expand the second column with the first so that I end up with a vector beginning with 10 observations of 4000, then 3 observations of 1000 and so on. In my mind the natural approach would be to create a NULL vector and append the expansions:

    myvar<-NULL
    myvar<-append(myvar, replicate(x[1],y[1]), 1)
    for (i in 2:length(x)) {
        myvar<-append(myvar,replicate(x[i],y[i]),sum(x[1:i])+1)
    }

to end with a vector of length sum(x), which in my real database corresponds to 22 million observations.

This works fine if I only run it for the first, say, 1000 observations. If I try to perform this on all 2 million observations it takes long, way too long for this to be useful (I left it running 11 hours yesterday to no avail).

I know R performs well with operations on relatively large vectors. Why is this so inefficient? And what would be the smart way to do this?

Hi Alex,

The other reply already gave you the R way of doing this while avoiding the for loop. However, there is a more general reason why your for loop is terribly inefficient. A small set of examples:

    largeVector = runif(10e4)
    outputVector = NULL
    system.time(for(i in 1:length(largeVector)) {
        outputVector = append(outputVector, largeVector[i] + 1)
    })
    #    user  system elapsed
    #   6.591   0.168   6.786

The problem in this code is that outputVector keeps on growing and growing. The operating system needs to allocate more and more space as the object grows. This process is really slow. Several (much) faster alternatives exist:

    # Pre-allocating the outputVector
    outputVector = rep(0,length(largeVector))
    system.time(for(i in 1:length(largeVector)) {
        outputVector[i] = largeVector[i] + 1
    })
    #    user  system elapsed
    #   0.178   0.000   0.178
    # speed up of 37 times, this will only increase for large
    # lengths of largeVector

    # Using apply functions
    system.time(outputVector <- sapply(largeVector, function(x) return(x + 1)))
    #    user  system elapsed
    #   0.124   0.000   0.125
    # Even a bit faster

    # Using vectorisation
    system.time(outputVector <- largeVector + 1)
    #    user  system elapsed
    #   0.000   0.000   0.001
    # Practically instant, 6780 times faster than the first example

It is not always clear which method is most suitable and which performs best. At least they all perform much, much better than the naive option of letting outputVector grow.

cheers,
Paul

Thanks in advance.

Alex

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
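The loop-free reply Paul refers to is not part of this digest, so the following is only a guess at what it may have suggested. rep() with a times vector repeats y[i] exactly x[i] times, which is the expansion the question asks for. The sample size n is an arbitrary small value chosen for illustration.

```r
# Simulated data in the shape described in the question:
# x = expansion factor, y = household income.
set.seed(42)
n <- 1000                                  # illustrative size, not the real data
x <- sample(1:15, n, replace = TRUE)
y <- sample((1:10) * 1000, n, replace = TRUE)

# Vectorised expansion: each y[i] is repeated x[i] times, in order.
expanded <- rep(y, times = x)

length(expanded) == sum(x)
head(expanded, x[1])                       # the first x[1] values all equal y[1]
```

Because the result is allocated once at its final length, nothing is copied repeatedly, which is the same point the pre-allocation timing above makes.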
Re: [R] More efficient option to append()?
On 08/18/2011 07:46 AM, Timothy Bates wrote:

This takes a few seconds to do 1 million lines, and remains in explicit for-loop form:

    numberofSalaryBands = 100 # 200
    x = sample(1:15, numberofSalaryBands, replace=T)
    y = sample((1:10)*1000, numberofSalaryBands, replace=T)
    df = data.frame(x, y)
    finalN = sum(df$x)
    myVar = rep(NA, finalN)
    outIndex = 1
    i = 1
    for (i in 1:numberofSalaryBands) {
        kount = df$x[i]
        myVar[outIndex:(outIndex+kount-1)] = rep(df$y[i], kount) # Make x[i] copies of value y[i]
        outIndex = outIndex+kount
    }
    head(myVar)
    plyr::count(myVar)

For posterity, the problem in the code of the OP was that myVar was continuously growing. This required the operating system to continuously create more space for myVar, which is a very slow process. In this example you preallocate the space needed for myVar by creating an object of the appropriate length before the for loop. So, in my opinion, for loops and append should be avoided like the plague!

my 2cts :)

Paul
[R] strange convention for time zone names
Hi,

My time zone in Montreal is "Standard time zone: UTC/GMT -5 hours" (see http://www.timeanddate.com/worldclock/city.html?n=165). Yet, in R (POSIXct objects) I must specify the opposite, i.e. UTC+5:

    dateMontreal = as.POSIXct("2011-01-15 05:00:00", tz="EST")
    dateMontreal2 = as.POSIXct("2011-01-15 05:00:00", tz="UTC+5")
    wrongdateMontreal = as.POSIXct("2011-01-15 05:00:00", tz="UTC-5")
    dateLondon = as.POSIXct("2011-01-15 10:00:00", tz="UTC0")

    difftime(dateMontreal, dateLondon)
    Time difference of 0 secs
    difftime(dateMontreal2, dateLondon)
    Time difference of 0 secs
    difftime(wrongdateMontreal, dateLondon)
    Time difference of -10 hours

Is there a reason for this counter-intuitive convention?

Denis

R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base
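One possible explanation, offered as background rather than as an answer given on the list: a name such as "UTC+5" is read as a POSIX-style TZ string, in which the offset is the number of hours to add to local time to reach UTC, so positive values lie west of Greenwich. The sketch below sidesteps the ambiguity by using Olson names. It assumes the system's time zone database provides "America/Toronto" (Eastern time, as used in Montreal) and "Etc/GMT+5" (which, following the same reversed sign, is 5 hours behind UTC).

```r
london   <- as.POSIXct("2011-01-15 10:00:00", tz = "UTC")
montreal <- as.POSIXct("2011-01-15 05:00:00", tz = "America/Toronto")
fixed    <- as.POSIXct("2011-01-15 05:00:00", tz = "Etc/GMT+5")

# Both local times should denote the same instant as 10:00 UTC.
difftime(montreal, london, units = "hours")
difftime(fixed, london, units = "hours")

# Display the London instant as Eastern local time.
format(london, tz = "America/Toronto", usetz = TRUE)
```

A location name like "America/Toronto" also follows daylight saving time, which a fixed offset such as "EST" or "Etc/GMT+5" does not.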
Re: [R] Concatenate two strings in one in a string matrix
On Aug 19, 2011, at 9:45 AM, Eduardo Mendes wrote:

Hello Many thanks. "*" is not a typo. The output is a description of a nonlinear system, so terms such as y(k-1)*y(k-2) are allowed. I wonder whether "" could be ignored so that outputs such as "y_{01}(k-003)*" would not show up.

Paste with "*" then remove trailing *'s:

    ifelse( grepl("*$", result), sub("*","", result), result)

David Winsemius, MD
West Hartford, CT
Re: [R] Concatenate two strings in one in a string matrix
On Aug 19, 2011, at 10:21 AM, David Winsemius wrote:

Paste with "*" then remove trailing *'s:

    ifelse( grepl("*$", result), sub("*","", result), result)

Rather (but still untested)

    ifelse( grepl("\\*$", result), sub("\\*","", result), result)

-- david.

David Winsemius, MD
West Hartford, CT
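A small worked version of the suggestion, as a sketch on made-up data: the three-row matrix desc below is a hypothetical stand-in for out$desc, with empty strings where a row has fewer than two terms. Anchoring the pattern with $ inside sub() removes only a trailing "*", so the ifelse()/grepl() wrapper is not needed.

```r
# Hypothetical stand-in for out$desc (filled column by column).
desc <- matrix(c("", "y_{01}(k-001)", "y_{01}(k-001)",   # column 1
                 "", "",              "u_{01}(k-001)"),  # column 2
               ncol = 2)

# Join each row with "*"; rows with an empty second term end in "*".
joined <- apply(desc, 1, paste, collapse = "*")

# "\\*$" matches a literal asterisk at the end of the string only.
result <- sub("\\*$", "", joined)
result
```

Note that sub("\\*", "", x) without the $ removes the first asterisk it finds, which would matter if a term could be missing from a product of three or more factors.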
[R] Help with format()
R Users:

Can anyone please help me with the following? I'm unclear as to how to get format() to do what I want. I've tried the following and get unexpected results.

Input:

val <- 321.6
format(val, digits=1)
format(val, digits=2)
format(val, digits=3)
format(val, digits=4)
format(val, digits=5)

Output:

[1] 322
[1] 322
[1] 322
[1] 321.6
[1] 321.6

Whereas I would expect to get:

[1] 300
[1] 320
[1] 322
[1] 321.6
[1] 321.60

since digits is defined as the number of significant digits. The number 321.6 shown to 1 significant digit should be 300, not 322, which is 3 significant digits! Likewise for the other cases. Can anyone explain what format() is doing?

Regards,
Michael
Re: [R] Help with format()
On Aug 19, 2011, at 10:23 AM, Michael Karol wrote:

> I would expect to get [1] 300 [1] 320 [1] 322 [1] 321.6 [1] 321.60, since digits is defined as the number of significant digits. The number 321.6 shown to 1 significant digit should be 300, not 322, which is 3 significant digits! Likewise for the other cases. Can anyone explain what format() is doing?

As you are observing, the significant-digits setting only starts to take effect to the right of the decimal point, and it does not add or extend significance for numbers that don't have the specified extent. (I agree the help page is not clear on these points.) If you want to force a particular size, then you would want sprintf or formatC (which the help page for format links to).

David Winsemius, MD
West Hartford, CT
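A short sketch of how the expected values in the question could be produced, using only base R. `signif()` does the rounding to a given number of significant digits, while `formatC()` or `sprintf()` force a fixed number of decimals. This illustrates the suggestion above and is not the only way to do it:

```r
val <- 321.6

# Round to 1, 2 and 3 significant digits before printing.
rounded <- sapply(1:3, function(d) signif(val, d))
print(rounded)

# Force exactly two decimals, even though the trailing zero is not significant.
fixed_formatC <- formatC(val, format = "f", digits = 2)
fixed_sprintf <- sprintf("%.2f", val)
print(c(fixed_formatC, fixed_sprintf))
```

Note that `signif()` returns numbers whereas `formatC()` and `sprintf()` return character strings, so the rounding step should come before the formatting step.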
[R] Leading zeros
Hello,

I have a dataset with an Id column like: 4/3003 55/333 66/22

I want to add leading zeros to get: 0004/3003 00055/333 66/22

How can I solve this?

Thanks
Vasco Cadavez
Re: [R] Leading zeros
On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote:

> Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this?

?sprintf
?formatC

--
David Winsemius, MD
West Hartford, CT
Re: [R] Leading zeros
On Aug 19, 2011, at 11:17 AM, David Winsemius wrote:

> On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote:
>
>> Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this?
>
> ?sprintf
> ?formatC

I may have been too quick. Padding with leading zeros using sprintf is described for numeric but not for character types. There are several character padding functions when you search:

http://search.r-project.org/cgi-bin/namazu.cgi?query=pad+character++max=100result=normalsort=scoreidxname=functionsidxname=Rhelp08idxname=Rhelp10idxname=Rhelp02

--
David Winsemius, MD
West Hartford, CT
[R] Error in read.dcf(file = tmpf) : Line starting 'head ...' is malformed!
Dear R-Users,

I'm trying to set up a personal repository for a few packages I'm working on. I am on R-Forge, but I still need to have various versions of my package that R-Forge does not build (for R 2.8.1, for example). So I followed the instructions in this document:

http://cran.r-project.org/doc/manuals/R-admin.html#Setting-up-a-package-repository

and used this function as recommended:

write_PACKAGES()

Now, when I try to install or update my package, I get:

install.packages("bmisc", repos="http://www.benoitr.comze.com/R")
Error in read.dcf(file = tmpf) : Line starting 'head ...' is malformed!

Can someone help me with this?

Benoit

---
R: 2.13.1
[R] AFT model time-dependent with weibull distribution
Dear R-community,

I have tried to estimate accelerated failure time (AFT) and proportional hazards (PH) parametric survival models with time-independent and time-dependent covariates. For that purpose, I have used the eha package. Please consider this example:

weibullph <- phreg(Surv(sta,time,S) ~ TDC1 + TIC1, dist="weibull", data.frame=Data)
weibullaft <- aftreg(Surv(sta,time,S) ~ TDC1 + TIC1, dist="weibull", data.frame=Data)

## aftreg gives an error when I add the ID argument...
Error in aftreg.fit(X, Y, dist, strats, offset, init, shape, id, control, : Overlapping intervals for id 2

From help(aftreg):

id   If there are more than one spell per individual, it is essential to keep spells together by the id argument. This allows for time-varying covariates.

Data table: Data

   S  sta  time  TDC1   total_time  TIC1  ID
A  1  0    1     48.50  1           1     1
B  0  0    1     65.96  2           1     2
B  1  1    2     65.08  2           1     2
C  0  0    1      0.00  2           4     3
C  1  1    2      0.00  2           4     3
D  0  0    1     72.74  2           5     4
D  1  1    2     72.52  2           5     4
E  0  0    1     61.84  2           3     5
E  0  1    2     60.56  2           3     5
F  0  0    1     35.04  4           2     6
F  0  1    2     36.97  4           2     6
F  0  2    3     37.92  4           2     6
F  1  3    4     39.01  4           2     6

time - time to event
sta - starting time
TDC - time-dependent covariate
TIC - time-independent covariate
total_time - total time at risk
ID - ID

1. What happens if the ID is not included?
2. How can I solve this error?
3. Why does the phreg function not need an ID?

Thanks,
Javier

--
View this message in context: http://r.789695.n4.nabble.com/AFT-model-time-dependent-with-weibull-distribution-tp3755079p3755079.html
[R] ATSP to TSP reformulation
Greetings,

I am having trouble getting the function reformulate_ATSP_as_TSP to work for me. I have provided a simple example of some of the code I've been using. In particular, I'm not sure why I'm getting the error

Error in dimnames(tsp) <- list(lab, lab) : length of 'dimnames' [1] not equal to array extent

since I created the object ATSP with a valid square matrix. Is there something simple I'm missing? The code is below. Thank you for your time.

x = array(0, dim=c(4,4))
fix(x)
x
col1 col2 col3 col4 [1,]0123 [2,]10 115 [3,]2406 [4,]3560

library(TSP)
example = ATSP(x)
example2 = reformulate_ATSP_as_TSP(example, infeasible = 1000, cheap = .0001)
Error in dimnames(tsp) <- list(lab, lab) : length of 'dimnames' [1] not equal to array extent

--
View this message in context: http://r.789695.n4.nabble.com/ATSP-to-TSP-reformulation-tp3755143p3755143.html
[R] How to generate piecewise cubic spline with many knots?
Hi all,

I have a series of intra-day data. The variables exhibit a typical daily pattern over the day. I need to diurnally adjust the data. The adjustment takes the following form:

1. Regress y on a piecewise cubic spline of x with knots (a1, a2, a3, a4, ...), where x is the time of day.
2. Divide the original series by the spline forecast.

I know there are some functions for fitting splines, like bs and splint, but I do not know which one is exactly what I need. Can anyone give me some suggestions? Thanks.

--
View this message in context: http://r.789695.n4.nabble.com/How-to-generate-piecewise-cubic-spline-with-many-knots-tp3755419p3755419.html
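The two steps described in the question can be sketched with `bs()` from the splines package (shipped with R) inside `lm()`. This is one possible reading of the request, not a reply from the list. The data are simulated, and the variable names and the knot positions (6, 12 and 18 hours) are assumptions chosen only for illustration:

```r
library(splines)

# Simulated intra-day data: ten days of half-hourly observations with a
# smooth daily pattern and multiplicative noise (illustrative only).
set.seed(42)
hours   <- rep(seq(0, 23.5, by = 0.5), times = 10)
pattern <- 2 + cos(2 * pi * hours / 24)
y       <- pattern * exp(rnorm(length(hours), sd = 0.1))

# Step 1: regress y on a piecewise cubic spline of time of day.
# The interior knots are assumed values; use your own a1, a2, ...
fit <- lm(y ~ bs(hours, knots = c(6, 12, 18), degree = 3))

# Step 2: divide the original series by the spline fit.
adjusted <- y / fitted(fit)

summary(adjusted)
```

More knots give a more flexible daily pattern; `bs()` accepts a vector of any length for `knots`, as long as the knots lie inside the range of the data.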
[R] Multiple Traveling Salesperson Problem
While R has library TSP to help solve traveling salesperson problems, does anyone know if it has any libraries to help solve multiple traveling salesperson problems? For instance, suppose one is planning school bus routes and one has multiple buses. Thank you for your time.

--
View this message in context: http://r.789695.n4.nabble.com/Multiple-Traveling-Salesperson-Problem-tp3755151p3755151.html
Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction
On Aug 19, 2011 khosoda wrote:

> I used x10.homals4$objscores[, 1] as a predictor for logistic regression in the same way as PC1 in PCA. Am I going the right way?

Hi Kohkichi,

Yes, but maybe explore the sets= argument (set Response as the target variable and the others as the predictor variables). Then use the Dim1 scores. Also think about fitting a rank-1 restricted model, combined with the sets= option.

See the vignette to the package and look at

@ARTICLE{MIC98,
  author = {Michailidis, G. and de Leeuw, J.},
  title = {The {G}ifi system of descriptive multivariate analysis},
  journal = {Statistical Science},
  year = {1998},
  volume = {13},
  pages = {307--336},
  abstract = {}
}

Regards,
Mark.

-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa

--
View this message in context: http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3755163.html
[R] help Dxy and C-index calculation
Dear professor,

I am currently using the Design package and the cph function for multivariable analysis. I am trying to get the C-index for my survival model based on the Dxy coefficient. I am confused because the value is negative. Do I need to use the absolute value of Dxy?

       index.orig    training      test          optimism      index.corrected  n
Dxy    -0.341357727  -0.344002740  -0.341357727  -0.002645013  -0.338712715     40
R2      0.084694141   0.095440176   0.079077594   0.016362582   0.068331560     40
Slope   1.0           1.0           0.897999711   0.102000289   0.897999711     40
D       0.033368588   0.038458457   0.030983429   0.007475027   0.025893561     40
U      -0.002890981  -0.002968495   0.003412045  -0.006380539   0.003489558     40
Q       0.036259569   0.041426951   0.027571385   0.013855566   0.022404003     40

Many thanks

Dr MAZOUNI
Institut Gustave Roussy
Department of breast surgical oncology
Villejuif, France
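For reference, Somers' Dxy and the concordance index C are linked by Dxy = 2(C - 0.5), so C = 0.5 + Dxy/2. The sketch below applies that identity to the optimism-corrected Dxy from the table. The helper name `dxy_to_c` is hypothetical and not part of the Design package. Whether to flip the sign depends on the convention of the package version in use: for a Cox model a larger linear predictor means a higher hazard and hence a shorter survival time, so a negative Dxy against survival time is what a predictive model would be expected to show. Treat the second value as an interpretation to check against the package documentation, not as a definitive answer:

```r
# Hypothetical helper: convert Somers' Dxy to the concordance index C.
dxy_to_c <- function(dxy) 0.5 + dxy / 2

# Optimism-corrected Dxy copied from the validation table.
dxy_corrected <- -0.338712715

# C taken literally from the reported (negative) Dxy.
c_as_reported <- dxy_to_c(dxy_corrected)

# C for predicted risk, assuming the negative sign only reflects the
# "higher risk score, shorter survival" direction.
c_for_risk <- dxy_to_c(abs(dxy_corrected))

print(c(as_reported = c_as_reported, for_risk = c_for_risk))
```

The two values always sum to 1, which is why a C below 0.5 from a fitted Cox model usually signals a sign convention and not a model that predicts worse than chance.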
Re: [R] splitting sample names
Thanks to all of you for the suggestions and corrections.

Sharad

--
View this message in context: http://r.789695.n4.nabble.com/splitting-sample-names-tp3753712p3755297.html
Re: [R] Labelling all variables at once (using Hmisc label)
Indeed, as David pointed out, all the portion that used courier font (all the good stuff) was absent from the email posting. Thanks for your answers.
[R] require(dataset) for example.
Dear R Users,

Any idea whether any one-dimensional Cox process datasets exist in R? 'Spatstat' is very comprehensive but doesn't seem to have any examples of 1D (time series) doubly stochastic Poisson process data. (I am aware it can be simulated.)

Thank you,
Ken
Re: [R] Leading zeros
Copying the list on what was sent in reply. Does anybody have a better solution?

On Aug 19, 2011, at 11:57 AM, Vasco Cadavez wrote:

> Thanks,
>
> A solution could be to use substring to remove the "/"; then numeric will be OK! What do you think? How can I remove the "/"?

With sub or gsub:

sprintf("%010.0f", as.integer(gsub("/", "", c("4/3003","55/333","66/22")) ))
[1] "0000043003" "0000055333" "0000006622"

--
David.

> Thanks
> Vasco Cadavez
>
> - Original Message -
> From: David Winsemius dwinsem...@comcast.net
> To: David Winsemius dwinsem...@comcast.net
> Cc: Vasco Cadavez vcadavez@ipbpt, r-help@r-project.org
> Sent: Fri, 19 Aug 2011 11:51:08 -0400
> Subject: Re: [R] Leading zeros
>
> On Aug 19, 2011, at 11:17 AM, David Winsemius wrote:
>
>> On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote:
>>
>>> Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this?
>>
>> ?sprintf
>> ?formatC
>
> I may have been too quick. Padding with leading zeros using sprintf is described for numeric but not for character types. There are severa
Re: [R] Leading zeros
On Fri, Aug 19, 2011 at 9:19 AM, David Winsemius dwinsem...@comcast.net wrote:

> Copying list one what was sent in reply. Anybody have a better solution?

Not sure my solution is better, but it avoids the integer conversion and retains the "/". I wrote a function that pads the entries of an input character vector with zeros to a final length 'length'.

padZeros = function(x, length)
{
  n = length(x);
  out = rep("", n);
  for (i in 1:length(x))
  {
    # number of characters already present in this entry
    l = nchar(x[i]);
    # prepend (length - l) zeros
    out[i] = paste( paste(rep("0", length-l), collapse = ""), x[i], sep = "");
  }
  out
}

padZeros(c("33/22", "4/50005", "6644/2233"), length = 12)
[1] "000000033/22" "000004/50005" "0006644/2233"

HTH,
Peter
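The same padding can be written without an explicit loop. The sketch below is an alternative and not Peter's code: the name `pad_zeros` is made up, and it relies on `strrep()`, which is only available in R 3.3.0 and later (so it would not have run on the R 2.13 discussed in this thread). Entries already longer than `width` are left unchanged:

```r
# Hypothetical vectorized variant: left-pad character strings with zeros.
pad_zeros <- function(x, width) {
  # Number of zeros each entry needs; never negative.
  n_pad <- pmax(width - nchar(x), 0)
  paste0(strrep("0", n_pad), x)
}

ids <- c("33/22", "4/50005", "6644/2233")
padded <- pad_zeros(ids, width = 12)
print(padded)
```

Because everything is treated as text, the "/" is retained and no integer conversion is involved.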
[R] some questions about stepwise cox regression model and heterogeneity in survival analysis
Hello, all users here:

Recently I have been working on a survival analysis project. I collected the characteristics of patients and have found some factors that are related to the cancer. When I tried to assess the relations between genotypes (SNPs) and survival time via stratified analysis, I ran into the problem of how to use coxph (survival package) to perform multivariate Cox regression analysis with adjustment for some potential confounders. I have not found any suggestions in the survival package manual.

Also, when I want to test for heterogeneity between genotypes and other stratification factors, I cannot find a command for it (there is only survdiff for the log-rank test, but no command for a heterogeneity test).

Finally, the stepwise Cox regression needs to use a significance level of 0.05 for entering and 0.10 for removing explanatory variables. First, how can I do stepwise Cox regression analysis using the survival package? And how can I set the significance levels?

Could anyone here offer me a solution? Thanks a lot; your kind answers will be much appreciated!

--
View this message in context: http://r.789695.n4.nabble.com/some-questions-about-stepwise-cox-regression-model-and-heterogeneity-in-survival-analysis-tp3755591p3755591.html
[R] sign of the y axis in partialPlot for randomForest regression
Hi everybody,

I used randomForest to regress invertebrate abundances in least-impaired river reaches on some environmental parameters. Then I used these models to predict invertebrate abundances in impaired reaches. Now I would like to model the deviation (observation - prediction) with a set of chemical parameters, to see whether the deviations from the predictions can be explained by water chemistry.

I built the model, and I used partialPlot to depict the patterns between individual water chemistry parameters and the deviation from prediction. I know that the range and the values indicated on the y-axis do not correspond to the 'raw' deviation. However, since my deviations can be positive (i.e. greater abundance than expected) or negative (i.e. lower abundance than expected), I would like to know whether the signs on the y-axis correspond to those of my deviation values: does a negative value on the y-axis really correspond to a negative deviation?

Thanks for your help

Cédric

--
View this message in context: http://r.789695.n4.nabble.com/sign-of-the-y-axis-in-partialPlot-for-randomForest-regression-tp3755583p3755583.html
[R] adding text to a plot created with strat.plot() from package rioja
I have a plot created with strat.plot() from package rioja. When the plot is created with scale.percent=FALSE, each x axis is labeled at 0 and at its maximum. However, when scale.percent=TRUE, the x axes are not labeled. I need to use scale.percent=TRUE and I need labels for the x axes. I have been able to add labels to the x axes with mtext, but it is very tedious to find the correct position. Is there a better way to do this, or a better way to find the desired coordinates than trial and error?

Jason
[R] installing packages systemwide
I installed some downloaded packages in R. I always do

$ sudo R CMD INSTALL anRpackage.tar.gz

By default, this stores the packages in my directory /home/mary/R/x86_64-pc-linux-gnu-library/2.13/. However, I want them installed systemwide, in the /usr/local/lib/R/site-library/ folder. I tried

$ sudo R
> install.packages("anRpackage", dep=TRUE)

but I did not succeed in getting them installed in the required folder. Any idea?

--
Mary Kindall
Yorktown Heights, NY USA
[R] Writing non-graphic (text) output to PDF
Hi, friends. I keep coming to you because I'm so new to R and can't seem to figure out some simple things. Sorry.

Consider the following code. I want to load a table and write out the structure to a PDF document. I just can't seem to manage writing non-graphic output to PDF. Any help? I've tried several functions, but nothing worked. All I get is the title.

# **
# Load the DEBT table.
debt <- readRDS("T:/R.Data/Debt.rData")
dim(debt)
# Open the debt.pdf file for graphics output.
pdf(
  file=paste(
    "R:/DAS/DMS/FedDebt"
    ,"DataDiscovery"
    ,"DistributionAnalysis"
    ,"Report"
    ,"Debt.pdf"
    ,sep="/"
  )
)
# ==
# Write the debt structure to the output PDF.
plot.new()
title("DEBT")
str(debt)
# ==
dev.off() # Turn off the PDF device.
# ** End of Program

Ed

Ed Heaton
Project Manager, Sr. SAS Developer
Data and Analytic Solutions, Inc.
10318 Yearling Drive
Rockville, MD 20850
Office: 301-520-7414
ehea...@dasconsultants.com
www.dasconsultants.com
CMMI ML-2, SBA 8(a) SDB, WBE (WBENC), MBE (VA MD)
e...@heaton.name (Re: http://www.r-project.org/posting-guide.html)
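In the program above, `str(debt)` prints to the console, not to the graphics device, which would explain why only the title reaches the PDF. One possible approach, sketched here with base R only, is to capture the printed lines with `capture.output()` and draw them on the page with `text()`. The small `debt` data frame, the temporary output file and the line spacing of 0.05 are stand-ins chosen for illustration:

```r
# Stand-in for the real table (the original is read with readRDS()).
debt <- data.frame(id = 1:3, amount = c(10.5, 20, 7.25))

# Capture what str() would print, one element per output line.
str_lines <- capture.output(str(debt))

# Write to a temporary file here; replace with the real report path.
out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)
plot.new()
title("DEBT")

# Draw the captured lines top-down in a monospaced font.
# plot.new() sets up a 0..1 coordinate system on both axes.
y_pos <- 1 - 0.05 * (seq_along(str_lines) - 1)
text(x = 0, y = y_pos, labels = str_lines,
     adj = c(0, 1), family = "mono", cex = 0.8)

invisible(dev.off())
```

With this spacing roughly twenty lines fit on a page; a table with many columns would need a smaller `cex`, tighter spacing, or a new `plot.new()` call per block of lines.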
[R] how to merge distance data based on location
Hi all,

I have two data frames, two columns each, 1000s of rows. Each row represents a segment of the genome where a deletion has occurred. The first column is the start position of the deletion in genomic distance; the second is the end position.

So, e.g., the first 3 rows of data frame A are:

1003 1023
5932 6120
12348 12689

and the first rows of data frame B are:

852 5305
1010 1015
8500 9500
1 13000

I want to merge based on distance, such that each row will be deletions that overlap. So I'd like:

1003   1023    852    5305
               1010   1015
5932   6120
               8500   9500
12348  12689   1      13000

Does anyone have ideas about how to accomplish this?

Thank you,

Matthew Keller

--
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com
Re: [R] 3D surface plot
On 11-08-16 9:50 PM, Eric Heupel wrote:

> I have what is probably a noob question, but I am trying to create a 3d plot to illustrate the range of values for the following simple function:
>
> A = B*(C/D)
>
> B, C, and D are independent variables whose ranges are equal (e.g. 1 to 3 inclusive).
>
> I figure it's not possible to map the surface of A on the 3d space defined by B, C and D, but I would like to create a surface defined by the lower and upper limits of the variables, that is to say a rectangle with corners at (1,1,1), (2,3,2), (3,3,3) and (3,2,2), with a color map displayed on it corresponding to the values of A and a color key to the side of that.
>
> I have been able to wrap my head part way around persp and wireframe and can create a surface for A~B*(C/D) in either, but have not managed to create either the

The contour3d function in misc3d might do what you want. See the examples.

Duncan Murdoch
Re: [R] installing packages systemwide
Take a look at:

R CMD INSTALL --help

and you will see that you need to specify the library path, e.g.

R CMD INSTALL anRpackage --library=/usr/local/...

or take a look at ?install.packages and use the second argument, e.g.

install.packages('anRpackage', lib = '/usr/local/...')

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Fri, Aug 19, 2011 at 12:10 PM, Mary Kindall mary.kind...@gmail.com wrote:

> I installed some downloaded packages in R. I always do $sudo R CMD INSTALL anRpackage.tar.gz By default it is storing these packages into my directory /home/mary/R/x86_64-pc-linux-gnu-library/2.13/. However I want them to be systemwide into /usr/local/lib/R/site-library/ folder. I tried $sudo R R install.packages(anRpackage, dep=TRUE) I did not succeed into getting them install in req folder. Any idea?
Re: [R] [R-sig-Debian] installing packages systemwide
At 18:10 19/08/2011, Mary Kindall wrote:

> I installed some downloaded packages in R. I always do $sudo R CMD INSTALL anRpackage.tar.gz By default it is storing these packages into my directory /home/mary/R/x86_64-pc-linux-gnu-library/2.13/. However I want them to be systemwide into /usr/local/lib/R/site-library/ folder. I tried $sudo R R install.packages(anRpackage, dep=TRUE) I did not succeed into getting them install in req folder.

There is a parameter lib to install.packages. Does that do what you would like?

> Any idea?
>
> --
> Mary Kindall
> Yorktown Heights, NY USA

Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html
Re: [R] Concatenate two strings in one in a string matrix
Many thanks. The "still untested" version worked.

Cheers

Ed

On Fri, Aug 19, 2011 at 11:23 AM, David Winsemius dwinsem...@comcast.net wrote:

> On Aug 19, 2011, at 10:21 AM, David Winsemius wrote:
>
>> Paste with "*", then remove the trailing "*"s:
>>
>> ifelse( grepl("*$", result), sub("*", "", result), result)
>
> Rather (but still untested):
>
> ifelse( grepl("\\*$", result), sub("\\*", "", result), result)
>
> --
> david.
[R] gsub for numeric characters in string
Dear all,

I have what is a bit of a confusing question, so I hope that I can explain it clearly. Thank you for your help in advance. I would like to do a replacement procedure on several strings, but the way that I am currently going about it is not working.

I have defined len, which is a series comprising the lengths of different items, all preceded by a colon.

len
[1] :328 :154 :135
[4] :147 :30 :50
[7] :252 :45 ;

lenplusstate is a series that is comprised of an attribute of each item preceding the colon, followed by the length (as defined in len).

lenplusstate
[1] 1:328 1:154 4:135
[4] NA:147 3:30 2:50
[7] NA:252 NA:45 NA;

tree is a string that gives the specific relationship (via parenthetical notation) among the different items. Note that the lengths are included in this tree (following the colon), and the name of each item (1-5) precedes a colon. However, not every colon is preceded by a name (because there are internal nodes in the tree structure).

tree
[1] (1:328,((5:154,2:135):147,(3:30,4:50):252):45);

I would like to replace the length with the lenplusstate in the tree, while removing the names, so that it looks like this:

theoreticalnewtree
[1] (1:328,((1:154,4:135)NA:147,(3:30,2:50)NA:252)NA:45);

I am using this code:

for (j in all) newtree <- gsub(ln[j], lnplusstate[j], tree)

However, I end up with this:

newtree
[1] (11:328,((51:154,24:135)NA:147,(33:30,42:50)NA:252)NA:45);

That is, I have not removed the names from the string, so the state information is now wrong. If anyone can help with the proper code to get rid of the names in tree while replacing the length with lenplusstate, I would appreciate it. I apologize if this was unclear.

All the best,
Rebecca
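One way to approach the replacement described above, sketched with base R and the tree string written as plain text. It first strips every name (digits directly before a colon) and then swaps each ":length" for the corresponding state-plus-length entry by position. This assumes the entries of `lenplusstate` are listed in the same left-to-right order as the lengths appear in the tree, which holds for the example but is an assumption in general. Replacing by position also avoids a pitfall of looping over `gsub()`, where a short pattern such as ":30" would match inside a longer length such as ":300":

```r
tree <- "(1:328,((5:154,2:135):147,(3:30,4:50):252):45);"
lenplusstate <- c("1:328", "1:154", "4:135", "NA:147",
                  "3:30", "2:50", "NA:252", "NA:45")

# Step 1: remove the item names, i.e. any run of digits right before a colon.
stripped <- gsub("[0-9]+:", ":", tree)

# Step 2: locate every ":<digits>" in order of appearance ...
matches <- gregexpr(":[0-9]+", stripped)

# ... and replace the matches, in that order, with state + length.
newtree <- stripped
regmatches(newtree, matches) <- list(lenplusstate)

print(newtree)
```

It is worth checking that the number of matches equals `length(lenplusstate)` before the replacement, since a mismatch would mean the ordering assumption does not hold for that tree.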
Re: [R] how to merge distance data based on location
r-help-boun...@r-project.org wrote on 08/19/2011 12:15:39 PM:

> [R] how to merge distance data based on location
> Matthew Keller to: r help, 08/19/2011 12:18 PM
>
> Hi all,
>
> I have two data frames, two columns each, 1000s of rows. Each row represents a segment of the genome where a deletion has occurred. First column is start position of the deletion in genomic distance, second is end position.
>
> So, e.g., first 3 rows of data frame A is:
> 1003 1023
> 5932 6120
> 12348 12689
>
> first 3 rows of data frame B is:
> 852 5305
> 1010 1015
> 8500 9500
> 1 13000

The first row of data frame B describes a deletion that fully envelops the deletion described in the second row. Does this make sense?

> I want to merge based on distance, such that each row will be deletions that overlap. So I'd like:
> 1003 1023852 5305 1010 1015 5932 6120 8500 9500 12348 126891 13000

Would you mind describing what you plan to do with the resulting merged data frame? I ask because there may be some approach (other than data frame merging) that might serve your needs better.

What if the second row of data frame B was

1025 1038

It would still overlap with the first row of B, but it wouldn't overlap with the first row of A. How would you want your merged data frame to look?

> Does anyone have ideas about how to accomplish this?
>
> Thank you,
>
> Matthew Keller
>
> --
> Matthew C Keller
> Asst. Professor of Psychology
> University of Colorado at Boulder
> www.matthewckeller.com

Jean

`·.,, (((º `·.,, (((º `·.,, (((º

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA
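Pending answers to those questions, one simple reading of "merge rows that overlap" is to list every (A, B) pair whose intervals intersect. Two closed intervals overlap when each one starts no later than the other ends. The base R sketch below builds all pairs with a cross join and filters them. The column names are made up, and the start of the last B row is garbled in the post, so 12000 is used purely as a placeholder:

```r
# Deletions from data frame A (values from the post).
A <- data.frame(a_start = c(1003, 5932, 12348),
                a_end   = c(1023, 6120, 12689))

# Deletions from data frame B; the start of the last row is a placeholder.
B <- data.frame(b_start = c(852, 1010, 8500, 12000),
                b_end   = c(5305, 1015, 9500, 13000))

# Cross join: merge() with by = NULL returns every combination of rows.
pairs <- merge(A, B, by = NULL)

# Keep the pairs whose intervals intersect.
overlaps <- pairs[pairs$a_start <= pairs$b_end &
                  pairs$b_start <= pairs$a_end, ]
overlaps <- overlaps[order(overlaps$a_start, overlaps$b_start), ]
rownames(overlaps) <- NULL

print(overlaps)
```

Rows of A or B with no partner (such as 5932-6120 here) simply do not appear in the result. The cross join grows as the product of the two row counts, so for much larger inputs a dedicated interval tool such as `findOverlaps()` from the Bioconductor IRanges package may be a better fit.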
Re: [R] Leading zeros
-- Forwarded message -- From: Ole Peter Smith ole@gmail.com Date: Fri, Aug 19, 2011 at 1:40 PM Subject: Re: [R] Leading zeros To: David Winsemius dwinsem...@comcast.net I'm all new to R, assisting the last days of topics from the sideline. I am, however, a longterm programmer. This question makes me ask myself - and now here - is there any string split-function, ex some thing like: split('/',2000/3000/4) -- (2000,3000,4) After this patting with zero is a breeze, sprintf('%010d',... 0le On Fri, Aug 19, 2011 at 1:19 PM, David Winsemius dwinsem...@comcast.net wrote: Copying list one what was sent in reply. Anybody have a better solution? On Aug 19, 2011, at 11:57 AM, Vasco Cadavez wrote: Thanks, A solution can be by substring to remove the / then numeric will be ok! What you think? How can I remove the / with sub or gsub: sprintf(%010.0f, as.integer(gsub(/,, c(4/3003,55/333,66/22)) )) [1] 043003 055333 006622 -- David. Thanks Vasco Cadavez - Menssagem Original - De: David Winsemius dwinsem...@comcast.net Para: David Winsemius dwinsem...@comcast.net Cópia: Vasco Cadavez vcadavez@ipbpt, r-help@r-project.org Enviado: Fri, 19 Aug 2011 11:51:08 -0400 Assunto: Re: [R] Leading zeros On Aug 19, 2011, at 11:17 AM, David Winsemius wrote: On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote: Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this? ?sprintf ?formatC I may have been too quick. Padding with leading zeros using sprintf is described for numeric but not for character types. There are severa __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- / ( O O ) =oOO==(_)==OOo= God does not care about our mathematical difficulties. He integrates empirically - Einstein .oooO Oooo. 
[R] chisq.test(): standardized (adjusted) Pearson residuals
I'm using chisq.test() on a matrix of categorical data, and I see that the residuals component of the returned object gives me the Pearson residuals. That's cool. However, what I'd really like is the standardized (adjusted) Pearson residuals, which have an asymptotic N(0,1) distribution. Is there a way to get those in R (other than by programming it myself)?
[R] R and Sweave
Hi everybody. I'm trying to use R with Sweave, but I have a problem, perhaps with the directory path of Sweave in R. The Windows path is: C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave
When I run the LaTeX file with R, the program works well, without any errors, but when I create the PDF file, the Scode doesn't work. I think the problem is with the path. Can anybody suggest what to do? Thanks very much. Cheers
-- View this message in context: http://r.789695.n4.nabble.com/R-and-Sweave-tp3755837p3755837.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Leading zeros
There is a strsplit() function (syntax is (stringToBeSplit, splitAt) ) that may be of use. I haven't followed the thread so I don't know how well it handles the original problem. Michael Weylandt On Fri, Aug 19, 2011 at 12:41 PM, Ole Peter Smith ole@gmail.com wrote: -- Forwarded message -- From: Ole Peter Smith ole@gmail.com Date: Fri, Aug 19, 2011 at 1:40 PM Subject: Re: [R] Leading zeros To: David Winsemius dwinsem...@comcast.net I'm all new to R, assisting the last days of topics from the sideline. I am, however, a longterm programmer. This question makes me ask myself - and now here - is there any string split-function, ex some thing like: split('/',2000/3000/4) -- (2000,3000,4) After this patting with zero is a breeze, sprintf('%010d',... 0le On Fri, Aug 19, 2011 at 1:19 PM, David Winsemius dwinsem...@comcast.net wrote: Copying list one what was sent in reply. Anybody have a better solution? On Aug 19, 2011, at 11:57 AM, Vasco Cadavez wrote: Thanks, A solution can be by substring to remove the / then numeric will be ok! What you think? How can I remove the / with sub or gsub: sprintf(%010.0f, as.integer(gsub(/,, c(4/3003,55/333,66/22)) )) [1] 043003 055333 006622 -- David. Thanks Vasco Cadavez - Menssagem Original - De: David Winsemius dwinsem...@comcast.net Para: David Winsemius dwinsem...@comcast.net Cópia: Vasco Cadavez vcadavez@ipbpt, r-help@r-project.org Enviado: Fri, 19 Aug 2011 11:51:08 -0400 Assunto: Re: [R] Leading zeros On Aug 19, 2011, at 11:17 AM, David Winsemius wrote: On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote: Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this? ?sprintf ?formatC I may have been too quick. Padding with leading zeros using sprintf is described for numeric but not for character types. 
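To make the strsplit() suggestion concrete, here is a minimal sketch that splits each ID at the slash, zero-pads the numeric part before it, and pastes the pieces back together. The target width of 4 is an assumption chosen for illustration; the width the original poster wanted is not clear from the thread.

```r
ids <- c("4/3003", "55/333", "66/22")   # IDs from the original question

# fixed = TRUE treats "/" as a literal string rather than a regular expression.
parts <- strsplit(ids, "/", fixed = TRUE)

# Pad the part before the slash to 4 digits (assumed width) and rejoin.
padded <- vapply(parts, function(p) {
  paste(sprintf("%04d", as.integer(p[1])), p[2], sep = "/")
}, character(1))
print(padded)
```

Converting to integer first is what lets sprintf() do the zero padding, since the "0" flag is documented for numbers rather than for character strings.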
Re: [R] COXPH TIME-DEPENDENT
B is the specification for time-varying covariates. Otherwise, your model will think that each row is one independent observation that either had an event or was censored at time or total_time. HTH, Daniel javier palacios wrote: Dear R-community, which of the following two formats is correct? Are both correct? Please, consider this example: data table: Data S sta time TDC1 total_time A 1 0 1 48.50 1 B 0 0 1 65.96 2 B 1 1 2 65.08 2 C 0 0 1 0.002 C 1 1 2 0.002 D 0 0 1 72.742 D 1 1 2 72.522 E 0 0 1 61.84 2 E 0 1 2 60.562 F 0 0 1 35.044 F 0 1 2 36.974 F 0 2 3 37.924 F 1 3 4 39.014 time - time to event sta - starting time TDC - time dependent covariates total_time - total time at risk option A coxph(Surv(time,S) ~ time_dependent_covariates, data=data.frame(Data)) option B coxph(Surv(sta,time,S) ~ time_dependent_covariates, data=data.frame(Data)) option C coxph(Surv(total_time,S) ~ time_dependent_covariates, data=data.frame(Data)) How can time at risk be visualized in the coxph output? Best regards, Javier -- View this message in context: http://r.789695.n4.nabble.com/COXPH-TIME-DEPENDENT-tp3754837p3755852.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
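As an illustration of the counting-process form in option B, here is a sketch that uses the heart data set shipped with the survival package instead of the poster's table, since heart is already laid out with one row per (start, stop] interval. The choice of covariates is arbitrary and only meant to show the call.

```r
library(survival)

# Each row of 'heart' covers the interval (start, stop]; 'event' marks whether
# the subject died at the end of that interval. 'transplant' changes over time.
fit <- coxph(Surv(start, stop, event) ~ age + transplant, data = heart)
print(fit)
```

A subject with several rows contributes to the risk set only during the intervals those rows describe, which is why the two-time form is needed once covariates change within a subject.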
Re: [R] chisq.test(): standardized (adjusted) Pearson residuals
On Aug 19, 2011, at 1:28 PM, Stephen Davies wrote:
I'm using chisq.test() on a matrix of categorical data, and I see that the residuals attribute of the returned object will give me the Pearson residuals. That's cool. However, what I'd really like is the standardized (adjusted) Pearson residuals, which have a N(0,1) distribution. Is there a way to do that in R (other than by me programming it myself?)

?scale

-- David Winsemius, MD West Hartford, CT
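A possible alternative, sketched here on the assumption that a reasonably recent version of R is in use: the object returned by chisq.test() also has a stdres component holding the standardized residuals, (observed - expected) / sqrt(V). The table below is made up purely for illustration; the hand computation is included only to show what the component contains.

```r
# Hypothetical 2 x 3 contingency table; the counts are made up.
tab <- matrix(c(30, 20, 25, 15, 35, 25), nrow = 2,
              dimnames = list(group = c("a", "b"), answer = c("x", "y", "z")))
fit <- chisq.test(tab)

fit$residuals   # Pearson residuals: (observed - expected) / sqrt(expected)
fit$stdres      # standardized (adjusted) residuals

# The same adjustment by hand: divide each Pearson residual by
# sqrt((1 - row proportion) * (1 - column proportion)).
n <- sum(tab)
v <- outer(1 - rowSums(tab) / n, 1 - colSums(tab) / n)
adj <- fit$residuals / sqrt(v)
```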
[R] Hmisc::rcorr on a 'data.frame'?
Dear all ?Hmisc::rcorr states that it takes as main argument a numeric matrix. But is it normal that it fails in such an ugly way on a data frame? (See below.) If the function didn't attempt any conversion to a matrix, I would have expected it to state that in the error message that it didn't accept 'data.frame' objects in its input. Also, I vaguely remember having used in the past rcorr() on data frames. Regards Liviu require(Hmisc) rcorr(mtcars[ , 1:4]) Error in storage.mode(x) - if (.R.) double else single : (list) object cannot be coerced to type 'double' rcorr(as.matrix(mtcars[ , 1:4])) mpg cyl disphp mpg 1.00 -0.85 -0.85 -0.78 cyl -0.85 1.00 0.90 0.83 disp -0.85 0.90 1.00 0.79 hp -0.78 0.83 0.79 1.00 n= 32 P mpg cyl disp hp mpg 0 00 cyl 0 00 disp 0 00 hp0 0 0 -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] display only the top-right half of a correlation matrix?
Dear all Is there an easy way to display only one half (top-right or bottom-left) of a correlation matrix? require(Hmisc) rcorr(as.matrix(mtcars[ , 1:4])) mpg cyl disphp mpg 1.00 -0.85 -0.85 -0.78 cyl -0.85 1.00 0.90 0.83 disp -0.85 0.90 1.00 0.79 hp -0.78 0.83 0.79 1.00 n= 32 P mpg cyl disp hp mpg 0 00 cyl 0 00 disp 0 00 hp0 0 0 Since the two sides are identical, there is little value in having both displayed at the same time. Moreover, it considerably slows down the inspection of the results. Thank you Liviu -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub for numeric characters in string
On Fri, Aug 19, 2011 at 11:11 AM, Rebecca Gray atlas...@gmail.com wrote: Dear all, I have what is a bit of a confusing question, so I hope that I can explain clearly. Thank you for your help in advance. I would like to do a replacement procedure on several strings, but the way that I am currently going about it is not working. I have defined len, which is a series comprising the lengths of different items, all preceded by a colon. len [1] :328 :154 :135 [4] :147 :30 :50 [7] :252 :45 ; 'lenplustate is a series that is comprised of an attribute of each item preceding the colon, followed by the length (as defined in len). lenplusstate [1] 1:328 1:154 4:135 [4] NA:147 3:30 2:50 [7] NA:252 NA:45 NA; tree is a string that gives the specific relationship (via parenthetical notation) among the different items. Note that the lengths are included in this tree (following the colon), and the name of each item (1-5) precedes a colon. However, not every colon is preceded with a name (because there are internal nodes in the tree structure). tree [1] (*1*:328,((*5*:154,*2*:135):147,(*3*:30,*4*:50):252):45); I would like to replace the length with the lengthplusstate in the tree, while removing the names, so that it looks like this: theoreticalnewtree [1] (*1*:328,((*1*:154,*4*:135)NA:147,(*3*:30,*2*:50)NA:252)NA:45); I can help you, but what is the name of each item? I thought it was the index of the item in the len and lenplusstate variables, but that apparently is not the case. You have to specify the names as well. Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] display only the top-right half of a correlation matrix?
r-help-boun...@r-project.org wrote on 08/19/2011 01:50:48 PM: [image removed] [R] display only the top-right half of a correlation matrix? Liviu Andronic to: r-help@r-project.org Help 08/19/2011 01:55 PM Sent by: r-help-boun...@r-project.org Dear all Is there an easy way to display only one half (top-right or bottom-left) of a correlation matrix? See ?lower.tri require(Hmisc) rcorr(as.matrix(mtcars[ , 1:4])) mpg cyl disphp mpg 1.00 -0.85 -0.85 -0.78 cyl -0.85 1.00 0.90 0.83 disp -0.85 0.90 1.00 0.79 hp -0.78 0.83 0.79 1.00 n= 32 P mpg cyl disp hp mpg 0 00 cyl 0 00 disp 0 00 hp0 0 0 Since the two sides are identical, there is little value in having both displayed at the same time. Moreover, it considerably slows down the inspection of the results. Thank you Liviu Jean `·.,, (((º `·.,, (((º `·.,, (((º Jean V. Adams Statistician U.S. Geological Survey Great Lakes Science Center 223 East Steinfest Road Antigo, WI 54409 USA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] display only the top-right half of a correlation matrix?
On Fri, Aug 19, 2011 at 11:50 AM, Liviu Andronic landronim...@gmail.com wrote: Dear all Is there an easy way to display only one half (top-right or bottom-left) of a correlation matrix? require(Hmisc) rcorr(as.matrix(mtcars[ , 1:4])) mpg cyl disp hp mpg 1.00 -0.85 -0.85 -0.78 cyl -0.85 1.00 0.90 0.83 disp -0.85 0.90 1.00 0.79 hp -0.78 0.83 0.79 1.00 Use as.dist: here's an example. x = matrix(rnorm(5*100), 100, 5) as.dist(cor(x)) 1 2 3 4 2 -2.892981e-06 3 2.873711e-02 1.002969e-02 4 -5.803705e-02 4.022733e-02 -6.154211e-02 5 1.137083e-01 -8.065676e-02 -9.279316e-02 -8.201583e-02 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rms:fastbw variable selection differences with AIC .vs. p value methods
I want to employ a parsimonious model to draw nomograms, as the full model is too complex to draw nomograms readily (several interactions of continuous variables). However, one interesting variable stays or leaves depending on whether I choose the p value or the AIC option of fastbw(). My question boils down to this: Is there a theoretical reason to prefer one over the other? Consider:

fastbw(model94c, aic=1e10)

 Deleted             Chi-Sq  d.f.  P       Residual  d.f.  P       AIC
 ToD                   0.11   3    0.9903     0.11    3    0.9903    -5.89
 Experience * ToD      2.56   3    0.4646     2.67    6    0.8487    -9.33
 Experience * Assoc    0.45   2    0.7970     3.13    8    0.9262   -12.87
 RatePressure          2.99   3    0.3939     6.11   11    0.8658   -15.89
 DW_height_t           2.92   3    0.4047     9.03   14    0.8293   -18.97
 TBV * Experience      3.46   3    0.3260    12.49   17    0.7698   -21.51
 Experience * Sex      0.05   1    0.8153    12.54   18    0.8181   -23.46
 Experience            0.18   1    0.6721    12.72   19    0.8526   -25.28
 Sex                   1.19   1    0.2745    13.91   20    0.8348   -26.09
 Assoc                 6.09   2    0.0475    20.01   22    0.5826   -23.99
 Experience * Pulse   10.53   3    0.0146    30.53   25    0.2049   -19.47
 Sex * ToD            18.24   3    0.0004    48.77   28    0.0088    -7.23
 PulsePressure        21.15   3    0.0001    69.92   31    0.0001     7.92
 Race                 19.87   2    0.0000    89.79   33    0.0000    23.79
 Pulse                25.31   3    0.0000   115.09   36    0.0000    43.09
 Age * Experience    202.80   3    0.0000   317.89   39    0.0000   239.89
 TBV                 282.41   3    0.0000   600.30   42    0.0000   516.30
 Location            310.19  14    0.0000   910.50   56    0.0000   798.50
 Age                 809.64   3    0.0000  1720.13   59    0.0000  1602.13

The ordering of variables is as expected, and is consistent with the substantive knowledge I have about the outcome. The problematic variable is Sex * ToD. When I use the p value as the rule, with an SLS of 0.01, the variable is retained, but when I use AIC, Sex * ToD is not retained. This reflects the fact that while the Sex * ToD interaction is theoretically interesting, the AIC value is negative and relatively small in magnitude, even as the p value skirts below 0.01.

Is this judgement territory, or are there statistical considerations that should be invoked? Caveats? Is there a theoretical reason to choose AIC over p value methods, or is either acceptable?
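The disagreement between the two rules can be reproduced from the Residual columns of the table alone. The sketch below assumes that fastbw() was run with its default type = "residual", so that both stopping rules look at the pooled residual chi-square, and that the AIC column is the residual chi-square minus twice its degrees of freedom (which matches every row of the table).

```r
# Residual chi-square and d.f. at the step that deletes Sex * ToD (from the table).
chisq <- 48.77
df    <- 28

aic <- chisq - 2 * df                          # -7.23: negative, so an AIC rule keeps deleting
p   <- pchisq(chisq, df, lower.tail = FALSE)   # residual P; the table reports 0.0088

c(AIC = aic, P = p)
```

On this scale an AIC cutoff of zero asks whether the chi-square exceeds 2 * d.f., whereas the p value rule asks whether it exceeds a quantile of the chi-square distribution. With 28 d.f. those two thresholds are far apart, so one rule can stop a step earlier than the other without either being miscomputed.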
Re: [R] More efficient option to append()?
Thanks for the code corrections. I see how for loops, append and naively populating a NULL vector can be so resource consuming. I tried the codes with 20 million observations in the following machine: processor : 7 cpu family : 6 model name : Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz cpu MHz : 933.000 cache size : 6144 KB First I tried Timothy's code and left it running for half an hour and I had to interrupt the command at Timing stopped at: 1033.516 829.147 1845.648 Then Dennis' option: user system elapsed 25.793 0.224 25.784 And for Paul's option, using a vector of length 20 million I had to stop at: Timing stopped at: 850.577 8.868 851.464 Not very efficient for relatively large vectors. I have also read that using {} instead of () to wrap for example {x+1} works faster, as do working directly with matrices instead of dataframes. Thanks for your input. Alex On Fri, 19 Aug 2011 13:58:09 + Paul Hiemstra paul.hiems...@knmi.nl wrote: As I already stated in my reply to your earlier post: resending the answer for the archives of the mailing list... Hi Alex, The other reply already gave you the R way of doing this while avoiding the for loop. However, there is a more general reason why your for loop is terribly inefficient. A small set of examples: largeVector = runif(10e4) outputVector = NULL system.time(for(i in 1:length(largeVector)) { outputVector = append(outputVector, largeVector[i] + 1) }) # user system elapsed # 6.591 0.168 6.786 The problem in this code is that outputVector keeps on growing and growing. The operating system needs to allocate more and more space as the object grows. This process is really slow. 
Several (much) faster alternatives exist: # Pre-allocating the outputVector outputVector = rep(0,length(largeVector)) system.time(for(i in 1:length(largeVector)) { outputVector[i] = largeVector[i] + 1 }) # user system elapsed # 0.178 0.000 0.178 # speed up of 37 times, this will only increase for large # lengths of largeVector # Using apply functions system.time(outputVector - sapply(largeVector, function(x) return(x + 1))) # user system elapsed # 0.124 0.000 0.125 # Even a bit faster # Using vectorisation system.time(outputVector - largeVector + 1) # user system elapsed # 0.000 0.000 0.001 # Practically instant, 6780 times faster than the first example It is not always clear which method is most suitable and which performs best. At least they all perform much, much better than the naive option of letting outputVector grow. cheers, Paul On 08/17/2011 11:17 PM, Alex Ruiz Euler wrote: Dear R community, I have a 2 million by 2 matrix that looks like this: x-sample(1:15,200, replace=T) y-sample(1:10*1000, 200, replace=T) x y [1,] 10 4000 [2,] 3 1000 [3,] 3 4000 [4,] 8 6000 [5,] 2 9000 [6,] 3 8000 [7,] 2 1 (...) The first column is a population expansion factor for the number in the second column (household income). I want to expand the second column with the first so that I end up with a vector beginning with 10 observations of 4000, then 3 observations of 1000 and so on. In my mind the natural approach would be to create a NULL vector and append the expansions: myvar-NULL myvar-append(myvar, replicate(x[1],y[1]), 1) for (i in 2:length(x)) { myvar-append(myvar,replicate(x[i],y[i]),sum(x[1:i])+1) } to end with a vector of sum(x), which in my real database corresponds to 22 million observations. This works fine --if I only run it for the first, say, 1000 observations. If I try to perform this on all 2 million observations it takes long, way too long for this to be useful (I left it running 11 hours yesterday to no avail). 
I know R performs well with operations on relatively large vectors. Why is this so inefficient? And what would be the smart way to do this? Thanks in advance. Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
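For the expansion itself, a vectorised sketch with rep() is shown below; whether this is the same code as the earlier reply in the thread is not known here. The values are the first rows printed in the original post (the last income is reconstructed as 10000, which is an assumption about a garbled figure).

```r
x <- c(10, 3, 3, 8, 2, 3, 2)                        # expansion factors
y <- c(4000, 1000, 4000, 6000, 9000, 8000, 10000)   # household incomes

# times = x repeats y[i] exactly x[i] times, with no loop and no append().
myvar <- rep(y, times = x)

length(myvar)   # equals sum(x)
head(myvar, 12)
```

Because the result is allocated once at its final length, this avoids the repeated copying that makes the append() loop slow.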
Re: [R] R and Sweave
On Fri, Aug 19, 2011 at 2:23 PM, danielepippo dan...@hotmail.it wrote: Hi everybody. I'm trying to use R with Sweave but I have a problem perhaps with the directory path of sweave in R. The windows path is this: C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave When I run the latex file with R, the program works well, without any errors, but when I create the pdf file, the Scode doesn't work. I think the problem is about the path. Can anybody suggest to me anything to do? Please post the type of LaTeX installation you are using (MikTeX?) and the version, along with the error you get when you try to compile to document. It might be as easy as adding C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave to the MikTeX roots directory, but more information is needed. Also update to the latest R version if possible. Version 2.9 is two years old by now, and two years is a long time in R-land. Best, Ista Thanks very much Cheers -- View this message in context: http://r.789695.n4.nabble.com/R-and-Sweave-tp3755837p3755837.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Leading zeros
Here is yet another way of prepending leading zeros on a string:

  x <- c('123/1234', '234/12', '21342342134/34', '99')
  n <- 10  # up to 10 leading zeros (max length of string)
  leading <- paste(rep('0', n), collapse = '')
  # add up to 10 zeros and then truncate to max length, but at least keep original
  substring(paste(leading, x, sep = '')
    , pmin(11, nchar(x) + 1)  # starting location
    , n + nchar(x)            # ending location
  )
  [1] "00123/1234" "0000234/12" "21342342134/34" "0000000099"

On Fri, Aug 19, 2011 at 12:41 PM, Ole Peter Smith ole@gmail.com wrote:
-- Forwarded message -- From: Ole Peter Smith ole@gmail.com Date: Fri, Aug 19, 2011 at 1:40 PM Subject: Re: [R] Leading zeros To: David Winsemius dwinsem...@comcast.net
I'm all new to R, assisting the last days of topics from the sideline. I am, however, a longterm programmer. This question makes me ask myself - and now here - is there any string split-function, ex some thing like: split('/', '2000/3000/4') --> (2000,3000,4) After this padding with zero is a breeze, sprintf('%010d',... 0le

On Fri, Aug 19, 2011 at 1:19 PM, David Winsemius dwinsem...@comcast.net wrote:
Copying list on what was sent in reply. Anybody have a better solution?

On Aug 19, 2011, at 11:57 AM, Vasco Cadavez wrote:
Thanks, A solution can be by substring to remove the / then numeric will be ok! What you think?

How can I remove the / with sub or gsub:
  sprintf("%010.0f", as.integer(gsub("/", "", c("4/3003", "55/333", "66/22"))))
  [1] "0000043003" "0000055333" "0000006622"
-- David.

Thanks Vasco Cadavez
- Menssagem Original - De: David Winsemius dwinsem...@comcast.net Para: David Winsemius dwinsem...@comcast.net Cópia: Vasco Cadavez vcadavez@ipbpt, r-help@r-project.org Enviado: Fri, 19 Aug 2011 11:51:08 -0400 Assunto: Re: [R] Leading zeros
On Aug 19, 2011, at 11:17 AM, David Winsemius wrote:
On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote:
Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this?
?sprintf ?formatC I may have been too quick. Padding with leading zeros using sprintf is described for numeric but not for character types. There are severa

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
Re: [R] display only the top-right half of a correlation matrix?
On Fri, Aug 19, 2011 at 9:02 PM, Peter Langfelder peter.langfel...@gmail.com wrote: Use as.dist: here's an example. Seems promising, but for one issue: I would like to keep the diagonal and thus specify 'diag=T', but then as.dist() replaces the diagonal values with zero. (See below.) Is there a way to prevent it from doing that? Either keep the original values, or not display anything in the diagonal (as for the upper part)? Regards Liviu (xb - rcorr(as.matrix(mtcars[ , 1:4]))) mpg cyl disphp mpg 1.00 -0.85 -0.85 -0.78 cyl -0.85 1.00 0.90 0.83 disp -0.85 0.90 1.00 0.79 hp -0.78 0.83 0.79 1.00 n= 32 P mpg cyl disp hp mpg 0 00 cyl 0 00 disp 0 00 hp0 0 0 round(as.dist(xb$r, T), 2) mpg cyl disphp mpg 0.00 cyl -0.85 0.00 disp -0.85 0.90 0.00 hp -0.78 0.83 0.79 0.00 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] display only the top-right half of a correlation matrix?
On Fri, Aug 19, 2011 at 12:32 PM, Liviu Andronic landronim...@gmail.com wrote:
On Fri, Aug 19, 2011 at 9:02 PM, Peter Langfelder peter.langfel...@gmail.com wrote: Use as.dist: here's an example.
Seems promising, but for one issue: I would like to keep the diagonal and thus specify 'diag=T', but then as.dist() replaces the diagonal values with zero. (See below.) Is there a way to prevent it from doing that? Either keep the original values, or not display anything in the diagonal (as for the upper part)?

If as.dist doesn't work, use brute force:

  x = matrix(rnorm(5*100), 100, 5)
  mat = signif(cor(x), 2); mat[lower.tri(mat)] = ""; data.frame(mat)

Note that assigning "" turns mat into a character matrix, so this is for display only.

Peter
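A variant of the brute-force idea that keeps the matrix numeric, sketched on the mtcars columns used earlier in the thread: set the lower triangle to NA and ask print() to show NA as an empty string. The diagonal is left untouched because lower.tri() excludes it by default.

```r
r <- round(cor(mtcars[, 1:4]), 2)

# Blank out everything below the diagonal; the diagonal itself is kept.
r[lower.tri(r)] <- NA

# na.print controls how NA entries are displayed; the values stay numeric.
print(r, na.print = "")
```

Using upper.tri() instead would keep the bottom-left half, and the same trick should apply to the r component returned by Hmisc::rcorr(), which is an ordinary matrix.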
Re: [R] Windows 7 issues with installing packages and setting library paths
On 19.08.2011 13:32, Christoph Scherber wrote: Dear all, I am forced to work in an environment without administrator rights. When using R2.13.1 on Windows 7 (64-Bit), I found that I can´t install or update any packages due to missing writing permissions. I managed to get full access to a directory on my C:\ drive now - but how do I specify that all libraries shall be installed into this directory? In Rcmd_environ I have the following entries: ## from R.sh R_SHARE_DIR=C:\\Program Files\\R\\R-2.13.1\share R_INCLUDE_DIR=C:\\Program Files\\R\\R-2.13.1\share\include R_DOC_DIR=C:\\Program Files\\R\\R-2.13.1\share\doc R_ARCH= R_LIBS_USER=C:\\Program Files\\R\\R-2.13.1\\library R_LIBS=C:\\Program Files\\R\\R-2.13.1\\library In Rprofile.site I have the following entries: .Library.site=C:\\Program Files\\R\\R-2.13.1\\library .Library=C:\\Program Files\\R\\R-2.13.1\\library .libPaths=C:\\Program Files\\R\\R-2.13.1\\library Forget about anything your wrote above and delete what you entered. A clean installation should provide you with a personal library for packages anyway. If that is not the case, you can also set an environment variable R_LIBS_USER whjere you can point to any directory (=library) where you'd like to install packages. See ?Startup and ?.libPaths and of course the R Installation and Administration manual for more help. Best, Uwe Ligges What else do I need to change? When I start up R, I get the following error message: Error: cannot change value of locked binding for '.Library' When calling .libPaths, I still get the wrong path: winfs-uni.top.gwdg.de/cscherb1$/R/R-2.13.1/library R has been installed at C:\\Program Files\\R but for some reason it still uses winfs-uni.top.gwdg.de/cscherb1$/R as the default directory for libraries (where I don´t have write permissions for some unknown reasons) What can I do to change the default library installation location? Any help would be greatly appreciated! 
Many thanks and best wishes Christoph __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] More efficient option to append()?
On 19.08.2011 15:50, Paul Hiemstra wrote:

On 08/17/2011 10:53 PM, Alex Ruiz Euler wrote:

Dear R community, I have a 2 million by 2 matrix that looks like this:

    x <- sample(1:15, 200, replace=T)
    y <- sample(1:10*1000, 200, replace=T)

          x    y
    [1,] 10 4000
    [2,]  3 1000
    [3,]  3 4000
    [4,]  8 6000
    [5,]  2 9000
    [6,]  3 8000
    [7,]  2 1
    (...)

The first column is a population expansion factor for the number in the second column (household income). I want to expand the second column with the first so that I end up with a vector beginning with 10 observations of 4000, then 3 observations of 1000 and so on. In my mind the natural approach would be to create a NULL vector and append the expansions:

    myvar <- NULL
    myvar <- append(myvar, replicate(x[1],y[1]), 1)
    for (i in 2:length(x)) {
      myvar <- append(myvar,replicate(x[i],y[i]),sum(x[1:i])+1)
    }

to end with a vector of sum(x), which in my real database corresponds to 22 million observations. This works fine --if I only run it for the first, say, 1000 observations. If I try to perform this on all 2 million observations it takes long, way too long for this to be useful (I left it running 11 hours yesterday to no avail). I know R performs well with operations on relatively large vectors. Why is this so inefficient? And what would be the smart way to do this?

Hi Alex, The other reply already gave you the R way of doing this while avoiding the for loop. However, there is a more general reason why your for loop is terribly inefficient. A small set of examples:

    largeVector = runif(10e4)
    outputVector = NULL
    system.time(for(i in 1:length(largeVector)) {

Please do teach people to use seq_along(largeVector) rather than 1:length(largeVector) (the latter is not safe in case of length 0 objects). Uwe Ligges

      outputVector = append(outputVector, largeVector[i] + 1)
    })
    #   user  system elapsed
    #  6.591   0.168   6.786

The problem in this code is that outputVector keeps on growing and growing. The operating system needs to allocate more and more space as the object grows. This process is really slow. Several (much) faster alternatives exist:

    # Pre-allocating the outputVector
    outputVector = rep(0,length(largeVector))
    system.time(for(i in 1:length(largeVector)) {
      outputVector[i] = largeVector[i] + 1
    })
    #   user  system elapsed
    #  0.178   0.000   0.178
    # speed up of 37 times, this will only increase for large
    # lengths of largeVector

    # Using apply functions
    system.time(outputVector <- sapply(largeVector, function(x) return(x + 1)))
    #   user  system elapsed
    #  0.124   0.000   0.125
    # Even a bit faster

    # Using vectorisation
    system.time(outputVector <- largeVector + 1)
    #   user  system elapsed
    #  0.000   0.000   0.001
    # Practically instant, 6780 times faster than the first example

It is not always clear which method is most suitable and which performs best. At least they all perform much, much better than the naive option of letting outputVector grow. cheers, Paul

Thanks in advance. Alex
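One vectorised way to do the expansion Alex describes, sketched on a few made-up rows (this is not necessarily what the other reply proposed). It also shows the seq_along() idiom recommended above for the cases where a loop is still wanted.

```r
# Small made-up stand-in for the 2-million-row data.
x <- c(10, 3, 3, 8)             # expansion factors
y <- c(4000, 1000, 4000, 6000)  # household incomes

# rep() repeats each y[i] exactly x[i] times in one vectorised call,
# so no vector is grown inside a loop.
expanded <- rep(y, times = x)

# The result has one entry per expanded observation.
length(expanded) == sum(x)

# If a loop is really needed: pre-allocate and index with seq_along(),
# which yields an empty sequence (not 1:0) when the input has length 0.
out <- numeric(length(y))
for (i in seq_along(y)) out[i] <- y[i] + 1
```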
Re: [R] R and Sweave
On 19.08.2011 21:15, Ista Zahn wrote: On Fri, Aug 19, 2011 at 2:23 PM, danielepippodan...@hotmail.it wrote: Hi everybody. I'm trying to use R with Sweave but I have a problem perhaps with the directory path of sweave in R. The windows path is this: C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave When I run the latex file with R, the program works well, without any errors, but when I create the pdf file, the Scode doesn't work. I think the problem is about the path. Can anybody suggest to me anything to do? Please post the type of LaTeX installation you are using (MikTeX?) and the version, along with the error you get when you try to compile to document. It might be as easy as adding C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave to the MikTeX roots directory, but more information is needed. Also update to the latest R version if possible. Version 2.9 is two years old by now, and two years is a long time in R-land. The actual problem is that LaTeX has still problems with paths containing spaces. Uwe Ligges Best, Ista Thanks very much Cheers -- View this message in context: http://r.789695.n4.nabble.com/R-and-Sweave-tp3755837p3755837.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help: how to generate counts from generalized poisson distribution
Dear All, Is there a simulator that can generate observations from a generalized poisson distribution? Thanks and regards, Chee
Re: [R] display only the top-right half of a correlation matrix?
On Fri, Aug 19, 2011 at 9:38 PM, Peter Langfelder peter.langfel...@gmail.com wrote:

if as.dist doesn't work, use brute force:

    x = matrix(rnorm(5*100), 100, 5)
    mat = signif(cor(x), 2); mat[lower.tri(mat)] = ""
    data.frame(mat)

Yes, brute force works. This isn't quite how I wanted to do this, but the following seems to work for me. Thanks all Liviu

    require(Hmisc)
    print.rcorr <- function (x, upper=FALSE, ...)
    {
        r <- format(round(x$r, 2))
        if(!is.null(upper))
            r[if(!upper) upper.tri(r) else lower.tri(r)] <- ''
        print(data.frame(r))
        n <- x$n
        if (all(n == n[1, 1]))
            cat("\nn=", n[1, 1], "\n\n")
        else {
            cat("\nn\n")
            print(n)
        }
        cat("\nP\n")
        P <- x$P
        P <- ifelse(P < 0.0001, 0, P)
        p <- format(round(P, 4))
        p[is.na(P)] <- ""
        if(!is.null(upper))
            p[if(!upper) upper.tri(p) else lower.tri(p)] <- ''
        print(p, quote = FALSE)
        invisible()
    }

    (xb <- rcorr(as.matrix(mtcars[ , 1:4])))

           mpg   cyl  disp    hp
    mpg   1.00
    cyl  -0.85  1.00
    disp -0.85  0.90  1.00
    hp   -0.78  0.83  0.79  1.00

    n= 32

    P
         mpg cyl disp hp
    mpg
    cyl  0
    disp 0   0
    hp   0   0   0
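For a plain correlation matrix, a base-R sketch of the same idea that needs no Hmisc and keeps the matrix numeric. It assumes the goal is only to hide one triangle when printing; swap lower.tri() for upper.tri() to keep the other half.

```r
# Correlations of the first four mtcars columns, rounded for display.
m <- round(cor(mtcars[, 1:4]), 2)

# Set the lower-left triangle to NA (the diagonal is kept).
m[lower.tri(m)] <- NA

# na.print = "" makes print() show the NA cells as blanks,
# so only the top-right half and the diagonal are visible.
print(m, na.print = "")
```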
Re: [R] R and Sweave
On Fri, Aug 19, 2011 at 4:25 PM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote: On 19.08.2011 21:15, Ista Zahn wrote: On Fri, Aug 19, 2011 at 2:23 PM, danielepippodan...@hotmail.it wrote: Hi everybody. I'm trying to use R with Sweave but I have a problem perhaps with the directory path of sweave in R. The windows path is this: C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave When I run the latex file with R, the program works well, without any errors, but when I create the pdf file, the Scode doesn't work. I think the problem is about the path. Can anybody suggest to me anything to do? Please post the type of LaTeX installation you are using (MikTeX?) and the version, along with the error you get when you try to compile to document. It might be as easy as adding C:\Program Files (x86)\R\R-2.9.2\share\texmf\Sweave to the MikTeX roots directory, but more information is needed. Also update to the latest R version if possible. Version 2.9 is two years old by now, and two years is a long time in R-land. The actual problem is that LaTeX has still problems with paths containing spaces. That may be the problem, but I don't think it's a certainty; Sweave works on my system (with MikTeX 2.9) even though the R texmf directory path contains spaces. All I had to do was add C:\Program Files\R\R-2.13.1\share\texmf to the list of MikTeX root directories. Best, Ista Uwe Ligges Best, Ista Thanks very much Cheers -- View this message in context: http://r.789695.n4.nabble.com/R-and-Sweave-tp3755837p3755837.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Build a package - check error
Dear R-users I am slowly migrating my mex files (MATLAB - Fortran and C) to R. To get my own functions available on R section I have decided to learn how to build a R package. I choose a simple example with a few Fortran and R functions (wrapper). The fortran sources are located at src and the R functions at R (as recommended). The building process went ok but R CMD check did not. The error mgs was Error in dyn.load(fortran.so) : unable to load shared object '/home/eduardo/R_packages/test.Rcheck/fortran.so': Although I can see that R cannot find the compiled fortran code I do not know what to do. I believe it is something to do with the following lines in the R-wrapper file if (!is.loaded('calnpr')) dyn.load(fortran.so) How to add the path so that once the package is installed the compiled fortran code can be found? Many thanks Cheers Ed [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in read.dcf(file = tmpf) : Line starting 'head ...' is malformed!
On 19.08.2011 16:45, Benoit Bruneau wrote: Dear R-Users, I'm trying to setup a personal repository for a few packages I'm working on. I am on R-Forge but I still need to have various versions of my package that R-Forge does not build (for R 2.8.1 for example). So I followed the instructions in this document: Hhttp:// cran.r-project.org/doc/manuals/R-admin.html#Setting-up-a-package-repository and used this function as recommended: *write_PACKAGES()* Now, when I try to install or update my package, I get: *install.packages(bmisc, repos=http://www.benoitr.comze.com/R;)** Error in read.dcf(file = tmpf) : Line starting 'head ...' is malformed!* Can someone help me with this ? Yes: R-2.8.1 under Windows will look into http://www.benoitr.comze.com/R/bin/windows/contrib/2.8/... for packages - and that does not exist. You just provide a binary directory For R-2.13.x You have to build a binary with R-2.8.x for the /2.8/ repository. It is cumbersome to support so many binary versions since you have to build a binary of your package for each of the R versions you want to support. Best, Uwe Ligges Benoit --- R: 2.13.1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Build a package - check error
On 19.08.2011 22:53, Eduardo Mendes wrote: Dear R-users I am slowly migrating my mex files (MATLAB - Fortran and C) to R. To get my own functions available on R section I have decided to learn how to build a R package. I choose a simple example with a few Fortran and R functions (wrapper). The fortran sources are located at src and the R functions at R (as recommended). The building process went ok but R CMD check did not. The error mgs was Error in dyn.load(fortran.so) : unable to load shared object '/home/eduardo/R_packages/test.Rcheck/fortran.so': Although I can see that R cannot find the compiled fortran code I do not know what to do. I believe it is something to do with the following lines in the R-wrapper file if (!is.loaded('calnpr')) dyn.load(fortran.so) 1. If the package is called calnpr, the shared library is also called that way. 2. you have to provide the path to the shared library. See ?.First.lib for how to do it in a package without NAMESPACE (and note that NAMESPACES are forced for the next R release). Best, Uwe Ligges How to add the path so that once the package is installed the compiled fortran code can be found? Many thanks Cheers Ed [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
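A minimal sketch of the two usual ways to load a package's compiled code, along the lines of the advice above. The package name "mypkg" is a hypothetical placeholder; the real name is whatever the DESCRIPTION file declares.

```r
# File R/zzz.R in a hypothetical package called "mypkg" (name is an assumption).
# R CMD INSTALL compiles everything in src/ into one shared object named
# after the package, so it is loaded by package name, not by a file path.
.onLoad <- function(libname, pkgname) {
  # library.dynam() looks in the installed package's libs/ directory.
  library.dynam("mypkg", pkgname, libname)
}

# Alternative: put the single directive
#   useDynLib(mypkg)
# in the NAMESPACE file and drop the .onLoad() hook entirely.
```

Either way, the wrapper functions no longer need their own dyn.load() call, and no hard-coded path to a .so file is required.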
Re: [R] AFT model time-dependent with weibull distribution
On Fri, Aug 19, 2011 at 2:55 PM, javier palacios xpfen...@gmail.com wrote: Dear R-community, I have tried to estimate an accelerated failure time(AFT) and proportional hazard (PH) parametric survival model with time-independent and time-dependent covariates. For that purpose, I have used the eha package. Please, consider this example: weibullph - phreg(Surv(sta,time,S) ~ TDC1 + TIC1, dist=weibull, data.frame=Data) weibullaft-aftreg(Surv(sta,time,S) ~ TDC1 + TIC1, dist=weibull, data.frame=Data) ## aftreg gives error when I add ID argument... Error in aftreg.fit(X, Y, dist, strats, offset, init, shape, id, control, : Overlapping intervals for id 2 Does not happen to me. From help(aftreg): id If there are more than one spell per individual, it is essential to keep spells together by the id argument. This allows for time-varying covariates. data table: Data S sta time TDC1 total_time TIC1 ID A 1 0 1 48.50 1 1 1 B 0 0 1 65.96 2 1 2 B 1 1 2 65.08 2 1 2 C 0 0 1 0.00 2 4 3 C 1 1 2 0.00 2 4 3 D 0 0 1 72.74 2 5 4 D 1 1 2 72.52 2 5 4 E 0 0 1 61.84 2 3 5 E 0 1 2 60.56 2 3 5 F 0 0 1 35.04 4 2 6 F 0 1 2 36.97 4 2 6 F 0 2 3 37.92 4 2 6 F 1 3 4 39.01 4 2 6 Duplicated rownames are no allowed. time - time to event sta - starting time TDC - time dependent covariates TIC - time independent covariate total_time - total time at risk ID - ID 1- What happens if the ID is not included? Read the documentation 2- How can I solve this error? Read the documentation 3- Why the phreg function does not need an ID? Read the documentation (and a good text book on survival analysis) And please read the posting guide; the elementary information is missing. Thanks, Javier -- View this message in context: http://r.789695.n4.nabble.com/AFT-model-time-dependent-with-weibull-distribution-tp3755079p3755079.html Sent from the R help mailing list archive at Nabble.com. 
-- Göran Broström
[R] Plot label symbols and superscript
I was unable to find an answer to my problem. I would like to label the y axis of a plot with a rate and would like to use a dot (•) rather than a multiplication sign (x). ylab = quote(Speed~(cmxsec^2)) Thanks in advance. keith -- M. Keith Cox, Ph.D. Alaska NOAA Fisheries, National Marine Fisheries Service Auke Bay Laboratories 17109 Pt. Lena Loop Rd. Juneau, AK 99801 keith@noaa.gov marlink...@gmail.com U.S. (907) 789-6603 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Build a package - check error
Hi I have modified the path to dyn.load(paste(Sys.getenv(R_LIBS_USER),/fortran/src/fortran.so,sep=)) and the package could installed, loaded and the lines with dyn.load worked. It does not look like a pretty solution but works on my linux (I am not sure if it works on my mac or windows). I am not sure if this is what you meant but as I have no clue what .First.lib does or NAMESPACES means this is the best I come up with. Please correct me if I am wrong. Many thanks Ed On Fri, Aug 19, 2011 at 6:03 PM, Uwe Ligges lig...@statistik.tu-dortmund.de wrote: On 19.08.2011 22:53, Eduardo Mendes wrote: Dear R-users I am slowly migrating my mex files (MATLAB - Fortran and C) to R. To get my own functions available on R section I have decided to learn how to build a R package. I choose a simple example with a few Fortran and R functions (wrapper). The fortran sources are located at src and the R functions at R (as recommended). The building process went ok but R CMD check did not. The error mgs was Error in dyn.load(fortran.so) : unable to load shared object '/home/eduardo/R_packages/**test.Rcheck/fortran.so': Although I can see that R cannot find the compiled fortran code I do not know what to do. I believe it is something to do with the following lines in the R-wrapper file if (!is.loaded('calnpr')) dyn.load(fortran.so) 1. If the package is called calnpr, the shared library is also called that way. 2. you have to provide the path to the shared library. See ?.First.lib for how to do it in a package without NAMESPACE (and note that NAMESPACES are forced for the next R release). Best, Uwe Ligges How to add the path so that once the package is installed the compiled fortran code can be found? 
Many thanks Cheers Ed [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to add horizontal lines above bar graph to display p-values?
Hi, I would like to draw horizontal lines above a bar graph, in order to display the p-values of a Fisher test. Here is an examplehttp://thejns.org/action/showPopup?citid=citart1id=f3-1060501doi=10.3171%2Fped.2007.106.6.501of the type of display I would like to have. Is there a way to draw the horizontal lines and write their associated p-values in R? Thanks for you help! Sebastien Vigneau [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
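One way to do this in base graphics, sketched with made-up counts: barplot() returns the bar midpoints, which can be passed to segments() and text(). The data, labels and vertical offsets below are illustrative assumptions.

```r
# Made-up 2x2 counts, purely for illustration.
counts <- matrix(c(12, 30, 28, 10), nrow = 2,
                 dimnames = list(outcome = c("yes", "no"),
                                 group = c("A", "B")))
p_val <- fisher.test(counts)$p.value

# Proportion of "yes" in each group; barplot() returns the bar midpoints.
heights <- counts["yes", ] / colSums(counts)
mids <- barplot(heights, ylim = c(0, 1), ylab = "Proportion yes")

# Horizontal line a bit above the taller bar, with the p-value on top of it.
y_line <- max(heights) + 0.1
segments(mids[1], y_line, mids[2], y_line)
text(mean(mids), y_line, labels = paste("p =", signif(p_val, 2)), pos = 3)
```

With more than two bars, the same segments()/text() pair can be repeated at increasing heights for each comparison.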
[R] Calculating p-value for 1-tailed test in a linear model
Hello, I'm having trouble figuring out how to calculate a p-value for a 1-tailed test of beta_1 in a linear model fit using command lm. My model has only 1 continuous, predictor variable. I want to test the null hypothesis beta_1 is = 0. I can calculate the p-value for a 2-tailed test using the code 2*pt(-abs(t-value), df=degrees.freedom), where t-value and degrees.freedom are values provided in the summary of the lm. The resulting p-value is the same as provided by the summary of the lm for beta_1. I'm unsure how to change my calculation of the p-value for a 1-tailed test. Thanks for your assistance, Andy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
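A sketch of the one-tailed calculation, using the built-in mtcars data as a stand-in for the poster's model. Which tail applies depends on the direction of the alternative hypothesis, which is not fully legible in the question, so both are shown.

```r
# Stand-in model with one continuous predictor.
fit <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit)$coefficients

t_val <- coefs["wt", "t value"]
df    <- fit$df.residual

# Two-tailed p-value, as reported by summary(fit).
p_two <- 2 * pt(-abs(t_val), df)

# One-tailed p-values: drop the factor 2 and the abs(), and pick the tail.
p_greater <- pt(t_val, df, lower.tail = FALSE)  # alternative: beta_1 > 0
p_less    <- pt(t_val, df, lower.tail = TRUE)   # alternative: beta_1 < 0
```

When the estimate lies in the direction of the alternative, the one-tailed p-value is half the two-tailed one; otherwise it is one minus that half.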
Re: [R] COXPH TIME-DEPENDENT
Thanks, Javier -- View this message in context: http://r.789695.n4.nabble.com/COXPH-TIME-DEPENDENT-tp3754837p3756128.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] help: how to generate counts from generalized poisson distribution
On Fri, Aug 19, 2011 at 04:43:35PM -0400, Chee Chen wrote: Dear All, Is there a simulator that can generate observations from a generalized poisson distribution? Thanks and regards, Chee You can use rzigp() in the ZIGP package. rzigp(n,mu,phi,omega=0) for generalized poisson. Michael
[R] Auto key legend does not match plot
Dear R-help members. I am an 'R-learner' (about 6 hours so far) using the lattice library to create a ranked dotplot and am colour coding the dots by a variable called Commodity. However when i use autokey to make a legend the size (cex) and symbol (pch) do not match what is on the dotplot. Code is below and image attached library(lattice) Cal_dat - read.table(Calibration2.dat,header = TRUE,sep = \t,) dotplot(reorder(Label.yr, Resc_Gt)~ Resc_Gt,groups=Commodity, data=Cal_dat,cex=1.5, pch=19,aspect=xy, auto.key=list(space=right,title=Commodity)) Any assistance appreciated http://r.789695.n4.nabble.com/file/n3756245/Ranked_boxplot_by_commodity.png -- View this message in context: http://r.789695.n4.nabble.com/Auto-key-legend-does-not-match-plot-tp3756245p3756245.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lattice help: Dotplot
With Dotplot, I'm trying to make a figure that will ultimately have the same x-axis (which will be my response variable and the error bars), but the y-axis will consist of a different label for every point. Here's my code: Dotplot(fTaxonGrouped ~ Cbind(normSlope,normLwr,normUpr)|fGroup, groups=fEpoch, pch=c(17,15,19), col=c(3:1), scales = list( y = list(relation = sliced,axs = r, alternating = 0,tick.number=10), x = list(tick.number = 6) )) It ALMOST does exactly what I. There are six small graphs, and each datapoint has a unique value on the y-axis (integers), but once I try to change the y-axis tick labels to the list containing the names of each datapoint (characters), it lists only the first 10 names for all 6 graphs -- that is, it repeats the names, so they are no longer unique and are also mislabeled. If I change the relation to same, the correct datapoints are plotted for their appropriate group and hence appropriate plot, and every point is labeled correctly, but because only about 10 data points apply to each group, I have 6 plots with about 10 points each but all 40 names listed in every one. Essentially, I need the graph the results when relation is set to same, but with blank labels removed (like when relation is set to sliced) and thus scaled correctly, but the labels cannot be changed into integers, which is what happens when relation is set to slice. Thanks -- View this message in context: http://r.789695.n4.nabble.com/Lattice-help-Dotplot-tp3756134p3756134.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Writing non-graphic (text) output to PDF
Hi, Try this, library(gridExtra) example(grid.table) or addtable2plot() in plotrix, or textplot() in gplots, or Hmisc using latex, or Sweave, ... HTH, baptiste PS: please read the posting guide On 20 August 2011 05:14, Ed Heaton heat...@comcast.net wrote: Hi, friends. I keep coming to you because I'm so new to R and can't seem to figure out some simple things. Sorry. Consider the following code. I want to load a table and write out the structure to a PDF document. I just can't seem to manage writing non-graphic output to PDF. Any help? I've tried several functions, but nothing worked. All I get is the title. # ** # Load the DEBT table. debt - readRDS(T:/R.Data/Debt.rData) dim(debt) # Open the debt.pdf file for graphics output. pdf( file=paste( R:/DAS/DMS/FedDebt ,DataDiscovery ,DistributionAnalysis ,Report ,Debt.pdf ,sep=/ ) ) # == # Write the debt structucture to the output PDF. plot.new() title(DEBT) str(debt) # == dev.off() # Turn off the PDF device. # ** End of Program Ed Ed Heaton Project Manager, Sr. SAS Developer Data and Analytic Solutions, Inc. 10318 Yearling Drive Rockville, MD 20850 Office: 301-520-7414 ehea...@dasconsultants.com www.dasconsultants.com http://www.dasconsultants.com/ CMMI ML-2, SBA 8(a) SDB, WBE (WBENC), MBE (VA MD) e...@heaton.name (Re: http://www.r-project.org/posting-guide.html) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
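A base-R alternative to the packages listed above, as a rough sketch. In the original code only the title reaches the PDF because str() prints to the console, not to the graphics device; capturing the output and drawing it with text() gets around that. mtcars and the temporary output file are stand-ins for the DEBT table and the real path.

```r
# Capture what str() would print, one string per line.
txt <- capture.output(str(mtcars))   # mtcars stands in for the DEBT table

# Hypothetical output location; replace with the real file path.
pdf_file <- file.path(tempdir(), "structure.pdf")

pdf(pdf_file, width = 8.5, height = 11)
par(mar = c(1, 1, 3, 1))
plot.new()                            # empty page with 0..1 coordinates
title("mtcars")
# Draw the captured lines top-left aligned in a monospaced font.
text(0, 1, paste(txt, collapse = "\n"),
     adj = c(0, 1), family = "mono", cex = 0.8)
dev.off()
```

Long output will run off the page; in that case the lines need to be split over several plot.new() pages, or one of the table-oriented packages mentioned above is the better fit.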
Re: [R] Auto key legend does not match plot
It is most likely due to your ordering of y values. You need to write key manually to reflect the change. Without providing reproduciable data, you may not get specific help. Weidong Gu On Fri, Aug 19, 2011 at 6:23 PM, markm0705 markm0...@gmail.com wrote: Dear R-help members. I am an 'R-learner' (about 6 hours so far) using the lattice library to create a ranked dotplot and am colour coding the dots by a variable called Commodity. However when i use autokey to make a legend the size (cex) and symbol (pch) do not match what is on the dotplot. Code is below and image attached library(lattice) Cal_dat - read.table(Calibration2.dat,header = TRUE,sep = \t,) dotplot(reorder(Label.yr, Resc_Gt)~ Resc_Gt,groups=Commodity, data=Cal_dat,cex=1.5, pch=19,aspect=xy, auto.key=list(space=right,title=Commodity)) Any assistance appreciated http://r.789695.n4.nabble.com/file/n3756245/Ranked_boxplot_by_commodity.png -- View this message in context: http://r.789695.n4.nabble.com/Auto-key-legend-does-not-match-plot-tp3756245p3756245.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
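Another possible cause worth checking: in lattice, pch and cex passed directly to dotplot() change the plotted points but not the legend, because auto.key draws its symbols from the superpose.symbol settings. A sketch using mtcars as a stand-in for the poster's data (variable names are assumptions):

```r
library(lattice)

# Stand-in data: one labelled row per car, grouped by cylinder count.
d <- data.frame(label = rownames(mtcars),
                mpg   = mtcars$mpg,
                cyl   = factor(mtcars$cyl))

# Setting pch and cex through par.settings applies them to both the
# panel symbols and the auto.key legend, so the two stay in sync.
p <- dotplot(reorder(label, mpg) ~ mpg, data = d, groups = cyl,
             par.settings = list(superpose.symbol = list(pch = 19, cex = 1.5)),
             auto.key = list(space = "right", title = "cyl"))
print(p)
```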
Re: [R] Plot label symbols and superscript
On Aug 19, 2011, at 6:36 PM, Marlin Keith Cox wrote: I was unable to find an answer to my problem. I would like to label the y axis of a plot with a rate and would like to use a dot (•) rather than a multiplication sign (x). ylab = quote(Speed~(cmxsec^2))

    ?plotmath # seemed like the logical place to look
    # yep, there it is.
    plot(1,1, ylab = quote(Speed~(cm%.%sec^2)) )

-- David Winsemius, MD West Hartford, CT
Re: [R] Leading zeros
x - rep(00,2) y - c(23/45,67/8) substr(x,1+nchar(x)-nchar(y), nchar(x)) - y x --- Jeff Newmiller The . . Go Live... DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. David Winsemius dwinsem...@comcast.net wrote: Copying list one what was sent in reply. Anybody have a better solution? On Aug 19, 2011, at 11:57 AM, Vasco Cadavez wrote: Thanks, A solution can be by substring to remove the / then numeric will be ok! What you think? How can I remove the / with sub or gsub: sprintf(%010.0f, as.integer(gsub(/,, c(4/3003,55/333,66/22)) )) [1] 043003 055333 006622 -- David. Thanks Vasco Cadavez - Menssagem Original - De: David Winsemius dwinsem...@comcast.net Para: David Winsemius dwinsem...@comcast.net Cópia: Vasco Cadavez vcadavez@ipbpt, r-help@r-project.org Enviado: Fri, 19 Aug 2011 11:51:08 -0400 Assunto: Re: [R] Leading zeros On Aug 19, 2011, at 11:17 AM, David Winsemius wrote: On Aug 19, 2011, at 11:12 AM, Vasco Cadavez wrote: Hello, I have a dataset with an Id columns like: 4/3003 55/333 66/22 I want to put leading zeros to get: 0004/3003 00055/333 66/22 How can I solve this? ?sprintf ?formatC I may have been too quick. Padding with leading zeros using sprintf is described for numeric but not for character types. There are severa _ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
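Since the code in this thread did not survive the archive intact, here is a self-contained sketch of one way to pad such IDs. It assumes the goal is a fixed total width of 9 characters, which is inferred from the first two examples in the question and may not be what the poster intended.

```r
ids <- c("4/3003", "55/333", "66/22")

# Assumed target width, inferred from "0004/3003" and "00055/333".
width <- 9

# Number of zeros each ID needs (never negative), then prepend them.
n_pad  <- pmax(width - nchar(ids), 0)
padded <- paste0(strrep("0", n_pad), ids)

padded
# [1] "0004/3003" "00055/333" "000066/22"
```

This treats the IDs purely as character strings, so the "/" never has to be removed or converted to a number.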
[R] a Question regarding glm for linear regression
Hello All, I have a question about glm in R. I would like to fit a model with glm function, I have a vector y (size n) which is my response variable and I have matrix X which is by size (n*f) where f is the number of features or columns. I have about 80 features, and when I fit a model using the following formula, glmfit = glm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + x21 + x22 + x23 + x24 + x25 + x26 + x27 + x28 + x29 + x30 + x31 + x32 + x33 + x34 + x35 + x36 + x37 + x38 + x39 + x40 + x41 + x42 + x43 + x44 + x45 + x46 + x47 + x48 + x49 + x50 + x51 + x52 + x53 + x54 + x55 + x56 + x57 + x58 + x59 + x60 + x61 + x62 + x63 + x64 + x65 + x66 + x67 + x68 + x69 + x70 + x71 + x72 + x73 + x74 + x75 + x76 + x77 + x78 + x79 + x80) it gives me an error Error in eval(expr, envir, enclos) : object 'x3' not found which I dont know why I am given those errors. The other thing is that when I use the glm.fit, I can get coefficients without any errors. So, I am not sure what is going on and if glm.fit is the same as glm, can I use glm.fit instead of glm? Thanks a lot, Andra __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a Question regarding glm for linear regression
Convert your matrix to a data frame: df <- as.data.frame(mymatrix). Then you can simplify your formula and specify where the data is coming from: glm.fit <- glm(y ~ ., data = df). The "." in the formula means all columns in your data frame (except y, if it is in df).
On Sat, Aug 20, 2011 at 10:43 AM, Andra Isan andra_i...@yahoo.com wrote: [snip]
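The suggestion above can be sketched on simulated data. This is only an illustration under stated assumptions: the matrix X, the response y, the three columns (instead of about 80) and the names fit_dot and fit_long are all made up here, not taken from the poster's session.

```r
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 3), nrow = n)          # stand-in for the n x f feature matrix
y <- rnorm(n)                                # stand-in response

df <- as.data.frame(X)                       # columns get default names V1, V2, V3
names(df) <- paste0("x", seq_len(ncol(X)))   # rename to x1, x2, x3
df$y <- y                                    # keep the response in the same data frame

fit_dot  <- glm(y ~ ., data = df)            # "." expands to every column except y
fit_long <- glm(y ~ x1 + x2 + x3, data = df) # the spelled-out equivalent

all.equal(coef(fit_dot), coef(fit_long))
```

Because every variable is looked up in df, a missing column shows up as a problem with the data frame rather than as an "object not found" error from the global workspace.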
Re: [R] Lattice help: Dotplot
On Aug 19, 2011, at 5:07 PM, sw1 wrote: With Dotplot,
Are you sure that you are using lattice? Maybe you ought to look more closely at: ?Dotplot
I'm trying to make a figure that will ultimately have the same x-axis (which will be my response variable and the error bars), but the y-axis will consist of a different label for every point. Here's my code: Dotplot(fTaxonGrouped ~ Cbind(normSlope, normLwr, normUpr) | fGroup, groups = fEpoch, pch = c(17, 15, 19), col = c(3:1), scales = list(y = list(relation = "sliced", axs = "r", alternating = 0, tick.number = 10), x = list(tick.number = 6)))
It ALMOST does exactly what I want. There are six small graphs, and each data point has a unique value on the y-axis (integers), but once I try to change the y-axis tick labels to the list containing the names of each data point (characters), it lists only the first 10 names for all 6 graphs. That is, it repeats the names, so they are no longer unique and are also mislabeled. If I change the relation to "same", the correct data points are plotted for their appropriate group and hence appropriate plot, and every point is labeled correctly, but because only about 10 data points apply to each group, I have 6 plots with about 10 points each but all 40 names listed in every one. Essentially, I need the graph that results when relation is set to "same", but with blank labels removed (as when relation is set to "sliced") and thus scaled correctly, and the labels must not be changed into integers, which is what happens when relation is set to "sliced".
-- David Winsemius, MD West Hartford, CT
Re: [R] Labelling all variables at once (using Hmisc label)
Sorry about the nabble problem. At any rate, do require(Hmisc) then ?label to see how to associate a vector of labels with all the variables in a data frame at once. Frank
do999 wrote: Indeed, as David pointed out, all the portion that used courier font (all the good stuff) was absent from the email posting. Thanks for your answers.
- Frank Harrell Department of Biostatistics, Vanderbilt University
Re: [R] Hmisc::rcorr on a 'data.frame'?
I don't see anything wrong with using as.matrix. The documentation doesn't say it will support a data frame. Frank
Liviu Andronic wrote: Dear all, ?Hmisc::rcorr states that it takes as main argument a numeric matrix. But is it normal that it fails in such an ugly way on a data frame? (See below.) If the function didn't attempt any conversion to a matrix, I would have expected the error message to state that it doesn't accept 'data.frame' objects as input. Also, I vaguely remember having used rcorr() on data frames in the past. Regards, Liviu
require(Hmisc)
rcorr(mtcars[ , 1:4])
Error in storage.mode(x) <- if (.R.) "double" else "single" : (list) object cannot be coerced to type 'double'
rcorr(as.matrix(mtcars[ , 1:4]))
       mpg   cyl  disp    hp
mpg   1.00 -0.85 -0.85 -0.78
cyl  -0.85  1.00  0.90  0.83
disp -0.85  0.90  1.00  0.79
hp   -0.78  0.83  0.79  1.00
n= 32
P
     mpg cyl disp hp
mpg      0   0    0
cyl  0       0    0
disp 0   0        0
hp   0   0   0
- Frank Harrell Department of Biostatistics, Vanderbilt University
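The error in the post can be reproduced without Hmisc, which may help explain why it looks the way it does. This is a sketch using base R only; it assumes that the failing step is the storage.mode assignment shown in the error message, and the names x, res and m are illustrative.

```r
x <- mtcars[, 1:4]                      # a data frame, i.e. a list of columns

# Setting the storage mode of a list of length-32 columns cannot work;
# capture the error message instead of stopping
res <- tryCatch({
  storage.mode(x) <- "double"
  "ok"
}, error = function(e) conditionMessage(e))
res

# The same assignment is fine once the data are a numeric matrix
m <- as.matrix(x)
storage.mode(m) <- "double"
round(cor(m), 2)                        # Pearson correlations, base R
```

Converting explicitly with as.matrix() before the call, as suggested above, also makes it obvious when a data frame contains non-numeric columns.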
Re: [R] help Dxy and C-index calculation
Replace Design with rms (for general reasons not related to your question; Design is about to become obsolete). Negate Dxy. The linear predictor for the Cox model is the relative log hazard, and a higher hazard means a shorter survival time. For other survival models, the model is stated in terms of survival time. Frank
On Fri, 19 Aug 2011 08:30:50 -0500, chafika.mazo...@igr.fr wrote: Dear professor, I am currently using the Design package and the cph function for multivariable analysis. I am trying to get the C-index for my survival model based on the Dxy coefficient. I am confused since the value is negative. Do I need to use the absolute value of Dxy?
        index.orig     training         test     optimism index.corrected  n
Dxy   -0.341357727 -0.344002740 -0.341357727 -0.002645013    -0.338712715 40
R2     0.084694141  0.095440176  0.079077594  0.016362582     0.068331560 40
Slope  1.000000000  1.000000000  0.897999711  0.102000289     0.897999711 40
D      0.033368588  0.038458457  0.030983429  0.007475027     0.025893561 40
U     -0.002890981 -0.002968495  0.003412045 -0.006380539     0.003489558 40
Q      0.036259569  0.041426951  0.027571385  0.013855566     0.022404003 40
Many thanks, Dr MAZOUNI, Institut Gustave Roussy, Department of breast surgical oncology, Villejuif, France
-- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University
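As a worked illustration of "negate Dxy", the C-index can be recovered from Somers' Dxy through the standard relation Dxy = 2 * (C - 0.5). The sketch below plugs in the two Dxy values from the table in the question; the variable names are made up for this example.

```r
# Dxy values as reported for the Cox model (index.orig and index.corrected)
dxy_orig      <- -0.341357727
dxy_corrected <- -0.338712715

# For a Cox model the linear predictor is a log hazard, so flip the sign
# before converting: C = 0.5 + Dxy / 2
c_orig      <- 0.5 + (-dxy_orig) / 2
c_corrected <- 0.5 + (-dxy_corrected) / 2

c(original = c_orig, optimism_corrected = c_corrected)
```

Both values come out a little under 0.67, that is, moderate discrimination once the direction of the predictor is accounted for.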
Re: [R] a Question regarding glm for linear regression
On Aug 19, 2011, at 8:43 PM, Andra Isan wrote: Hello All, I have a question about glm in R. I would like to fit a model with glm function, I have a vector y (size n) which is my response variable and I have matrix X which is by size (n*f) where f is the number of features or columns. I have about 80 features, and when I fit a model using the following formula, glmfit = glm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + x21 + x22 + x23 + x24 + x25 + x26 + x27 + x28 + x29 + x30 + x31 + x32 + x33 + x34 + x35 + x36 + x37 + x38 + x39 + x40 + x41 + x42 + x43 + x44 + x45 + x46 + x47 + x48 + x49 + x50 + x51 + x52 + x53 + x54 + x55 + x56 + x57 + x58 + x59 + x60 + x61 + x62 + x63 + x64 + x65 + x66 + x67 + x68 + x69 + x70 + x71 + x72 + x73 + x74 + x75 + x76 + x77 + x78 + x79 + x80)
If X is a matrix, then you cannot attach it, and none of its column names would be accessible by functions. What does ls() return? I'm guessing you constructed a data frame (forgot to include x3), attach()-ed it, and are (incorrectly) calling it a matrix.
it gives me an error Error in eval(expr, envir, enclos) : object 'x3' not found which I don't know why I am given those errors. The other thing is that when I use the glm.fit, I can get coefficients without any errors.
Really? What was the R code that produced results?
So, I am not sure what is going on
Yes. Neither are we.
and if glm.fit is the same as glm, can I use glm.fit instead of glm?
I do not see a formula method for glm.fit in the help page. Nor do I see a mechanism for handling formulas in the code of glm.fit. Perhaps the question is, "Do you care about errors?"
-- David Winsemius, MD West Hartford, CT
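To make the difference between the two functions concrete, here is a small sketch on simulated data. Everything in it (X, y, the three column names, raw_fit, formula_fit) is assumed for illustration and is not the poster's code. glm.fit() takes a design matrix and a response directly, so the intercept column has to be supplied by hand, while glm() builds the design matrix from a formula.

```r
set.seed(42)
X <- matrix(rnorm(40 * 3), nrow = 40,
            dimnames = list(NULL, c("x1", "x2", "x3")))
y <- rnorm(40)

# glm.fit(): no formula, so add the column of ones for the intercept yourself
raw_fit <- glm.fit(x = cbind(Intercept = 1, X), y = y)

# glm(): a matrix may appear as a single term; its columns are still not
# visible as separate variables x1, x2, x3
formula_fit <- glm(y ~ X)

all.equal(unname(raw_fit$coefficients), unname(coef(formula_fit)))
```

The coefficients agree, but only the glm() result carries the formula, terms and class information that summary(), predict() and similar methods rely on.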
Re: [R] Calculating p-value for 1-tailed test in a linear model
On 20/08/11 10:20, Andrew Campomizzi wrote: Hello, I'm having trouble figuring out how to calculate a p-value for a 1-tailed test of beta_1 in a linear model fit using the lm command. My model has only one continuous predictor variable. I want to test the null hypothesis beta_1 is= 0. I can calculate the p-value for a 2-tailed test using the code 2*pt(-abs(t-value), df=degrees.freedom), where t-value and degrees.freedom are values provided in the summary of the lm. The resulting p-value is the same as the one provided by the summary of the lm for beta_1. I'm unsure how to change my calculation of the p-value for a 1-tailed test. Thanks for your assistance, Andy
The r-help mailing list is *not* for giving assistance with homework. cheers, Rolf Turner
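For reference, the two-tailed calculation described in the question and its one-tailed counterparts can be sketched as follows. This uses the built-in mtcars data purely as a stand-in; the model mpg ~ wt and the names t_val, df_res, p_two, p_less and p_greater are assumptions made for the example.

```r
fit <- lm(mpg ~ wt, data = mtcars)     # stand-in model with one continuous predictor
tab <- coef(summary(fit))              # estimate, std. error, t value, Pr(>|t|)

t_val  <- tab["wt", "t value"]
df_res <- fit$df.residual

p_two     <- 2 * pt(-abs(t_val), df_res)             # two-tailed, as in summary(lm)
p_less    <- pt(t_val, df_res)                       # alternative: beta_1 < 0
p_greater <- pt(t_val, df_res, lower.tail = FALSE)   # alternative: beta_1 > 0

c(two_sided = p_two, less = p_less, greater = p_greater)
```

Which one-tailed value applies depends on the direction of the alternative hypothesis, which has to be fixed before looking at the data.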
Re: [R] AFT model time-dependent with weibull distribution
Dear Prof. Broström, I have searched the reference manual of the eha package, which was updated recently. I did not find any description of how to pass id to the aftreg function beyond the description of the argument itself. Can you refer me to a specific part of the manual? Or do you mean some other documentation? I had also read the posting guide beforehand (I included an example and all the details I could). What else can I add? I have also read a book introducing survival analysis in R, but my doubt is very specific to computation in R. I ask the questions again: I have a doubt about how to perform a Weibull parametric survival analysis with time-dependent covariates in R, using both accelerated failure time (AFT) and proportional hazards (PH) models. I would highly appreciate a reply to my questions. I provide an example similar to the data I use. (Here there are only two covariates, one time-dependent and the other time-independent.)
Data
   S sta time  TDC1 total_time TIC1 ID
A  1   0    1 48.50          1    1  1
B  0   0    1 65.96          2    1  2
B  1   1    2 65.08          2    1  2
C  0   0    1  0.00          2    4  3
C  1   1    2  0.00          2    4  3
D  0   0    1 72.74          2    5  4
D  1   1    2 72.52          2    5  4
E  0   0    1 61.84          2    3  5
E  0   1    2 60.56          2    3  5
F  0   0    1 35.04          4    2  6
F  0   1    2 36.97          4    2  6
F  0   2    3 37.92          4    2  6
F  1   3    4 39.01          4    2  6
time - time to event; sta - starting time; TDC - time-dependent covariate; TIC - time-independent covariate; total_time - total time at risk; ID - ID
Doubts: AFT and PH, time-dependent, Weibull distribution. I have tried to estimate an accelerated failure time (AFT) and a proportional hazards (PH) parametric survival model with time-independent and time-dependent covariates and a Weibull distribution. For that purpose, I have used the eha package. To my knowledge, the survival package does not provide a solution for estimating models with time-dependent covariates.
weibullph <- phreg(Surv(sta, time, S) ~ TDC1 + TIC1, dist = "weibull", data.frame = Data)
weibullaft <- aftreg(Surv(sta, time, S) ~ TDC1 + TIC1, dist = "weibull", data.frame = Data)
## aftreg gives an error when I add an ID argument, which should be used for controlling for time-varying variables.
Error in aftreg.fit(X, Y, dist, strats, offset, init, shape, id, control, : *Overlapping intervals for id 2 *
From help(aftreg): id: If there are more than one spell per individual, it is essential to keep spells together by the id argument. This allows for time-varying covariates.
Here are the questions I have: 1- How can I solve this error? 2- Does the phreg function need an ID? Can I use it to estimate a model with time-dependent covariates? 3- How can time-dependent covariates be estimated with phreg or aftreg, or another function in R?
Thank you very much, J