Re: [R] Correct Localized Numbers on Plots, related to glibc!
> You need to read ?Sys.setlocale (surely part of the homework the R posting guide asked of you).

I have read that, and I have used the standard way of changing the locale to fa_IR.utf8, i.e.

    Sys.setlocale(category = "LC_ALL", locale = "fa_IR.utf8")

and also

    Sys.setlocale(category = "LC_NUMERIC", locale = "fa_IR.utf8")

The output of Sys.localeconv at the end of my email should have made this clear.

> It is vital that you give the 'at a minimum' information requested in the posting guide when asking such questions, and we also absolutely need to know what graphics device you are trying to use.

The result is the same on RStudioGD, X11, pdf, ps, cairo_pdf, cairo_ps, svg, png, jpeg and bmp. (I do not know if this is what you are asking for. If not, please let me know what exactly you are asking for and how I can find it.) My sessionInfo is as follows:

    R version 2.14.1 (2011-12-22)
    Platform: i486-pc-linux-gnu (32-bit)

    locale:
     [1] LC_CTYPE=fa_IR.utf8        LC_NUMERIC=fa_IR.utf8
     [3] LC_TIME=fa_IR.utf8         LC_COLLATE=fa_IR.utf8
     [5] LC_MONETARY=fa_IR.utf8     LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    loaded via a namespace (and not attached):
    [1] tools_2.14.1

I have read the posting guide. I did not post the sessionInfo because the information on the system and the graphics device seemed irrelevant: the result does not change between devices, and on Microsoft Windows the locale Persian_Iran.1256 does not change anything either. I have not talked about Windows, as I don't expect Windows to do anything correctly with Persian.

> On 15/01/2012 08:26, Majid Einian wrote:
>> Dear R Helpers,
>>
>> I want to localize my plots, i.e. have the numbers on the x and y axes in Persian, using Persian numerals and the Persian decimal separator. I change the locale to fa_IR.utf8, but nothing on the plots changes. I can
>
> How, precisely? Please show us exactly what you did, with a reproducible example.

Well, I assume that changing the locale should do the trick, but it does not. As I mentioned earlier:

    X11()
    Sys.setlocale(category = "LC_ALL", locale = "fa_IR.utf8")
    Sys.setlocale(category = "LC_NUMERIC", locale = "fa_IR.utf8")
    plot(gdpr)

>> change the numeral shapes to Persian ones (۱۲۳۴ instead of 1234) using some non-standard fonts, but the decimal point is a problem. I asked about that on the Persian-Computing mailing list and got the answer that follows. I don't know how I should use the I flag mentioned in the answer in R plots (I'm using simple R plots, no special library). Has anybody had a similar problem in any language (maybe Arabic; I'm not sure which other languages use different numeral characters)? Also, I don't have e.g. a French locale on my system to see if the decimal separator changes according to the locale there.
>
> It will if they followed the documentation.
>
>> Thanks in advance.
>>
>> - Roozbeh Pournader rooz...@gmail.com, Tue, Dec 6, 2011 at 3:47 AM
>> To: Majid Einian einia...@gmail.com
>> Cc: persian-comput...@googlegroups.com
>>
>> The glibc model for generating numbers is kind of complex. For using native digits, one is supposed to use the I flag. For example, in order to get ۱۲٫۳, you should do printf("%I.1f", 12.3). This is to make sure applications have a way to output both ASCII numbers and native numbers.
>>
>> Roozbeh
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
> __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
--
Majid Einian,
PhD Candidate in Economics,
Graduate School of Management and Economics,
Sharif University of Technology, Tehran, IRAN
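One way to get localized tick labels without relying on the locale at all is to suppress the default axes and pass hand-made labels to axis(). The following is only a sketch under that assumption: to_persian() is a hypothetical helper (not part of R), the data are made up, and whether the glyphs actually render depends on the graphics device and font.

```r
# Hypothetical helper (not part of R): format numbers with ASCII digits,
# then map digits to U+06F0..U+06F9 and "." to U+066B (Arabic decimal separator).
to_persian <- function(x, digits = 1) {
  s <- formatC(x, format = "f", digits = digits)
  vapply(s, function(one) {
    codes <- utf8ToInt(one)
    is_digit <- codes >= 48 & codes <= 57        # ASCII "0".."9"
    codes[is_digit] <- codes[is_digit] - 48 + 0x06F0
    codes[codes == 46] <- 0x066B                 # ASCII "."
    intToUtf8(codes)
  }, character(1), USE.NAMES = FALSE)
}

y <- c(12.3, 15.8, 11.1, 19.4)                   # made-up data
plot(seq_along(y), y, type = "b", xaxt = "n", yaxt = "n", xlab = "", ylab = "")
ticks_x <- axTicks(1)
ticks_y <- axTicks(2)
axis(1, at = ticks_x, labels = to_persian(ticks_x))
axis(2, at = ticks_y, labels = to_persian(ticks_y))
```

The minus sign and any thousands separator are left untouched here; they would need their own mapping if the data call for it.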
[R] MuMIn package, problem using model selection table from manually created list of models
The subject says it all really.

Question 1. Here is some code created to illustrate my problem; can anyone spot where I'm going wrong?

Question 2. The reason I'm following a manual specification of models is that in reality I am using mgcv::gam, and I'm not aware that dredge is able to separate individual smooth terms out of, say, s(a,b). Hence an additional request: if anyone has example code for using gam in a multimodel inference framework, especially with bivariate smooths, I'd be most grateful.

Cheers and Thanks in Advance
Mike

    require(MuMIn)
    data(Cement)

    # option 1, create model.selection object using dredge
    fm0 <- lm(y ~ ., data = Cement)
    print(dd <- dredge(fm0))

    fm1 <- lm(formula = y ~ X1 + X2, data = Cement)
    fm2 <- lm(formula = y ~ X1 + X2 + X4, data = Cement)
    fm3 <- lm(formula = y ~ X1 + X2 + X3, data = Cement)
    fm4 <- lm(formula = y ~ X1 + X4, data = Cement)
    fm5 <- lm(formula = y ~ X1 + X3 + X4, data = Cement)

    # ranked with AICc by default
    # obviously this works
    model.avg(get.models(dd, delta < 4))

    # option 2: the aim is to produce a model selection object comparable
    # to that from get.models(dd, delta < 4)
    # but from a manually-specified list of models
    my.manual.selection <- mod.sel(list(fm1, fm2, fm3, fm4, fm5))

    # works
    model.avg(list(fm1, fm2, fm3, fm4, fm5))
    # or just
    model.avg(fm1, fm2, fm3, fm4, fm5)

    # doesn't work
    model.avg(my.manual.selection)

    # hence this doesn't work
    get.models(my.manual.selection, delta < 4)
Re: [R] MuMIn package, problem using model selection table from manually created list of models
On 2012-01-17 10:51, Dunbar, Michael J. wrote:
> The subject says it all really.
>
> Question 1. Here is some code created to illustrate my problem; can anyone spot where I'm going wrong?
>
> Question 2. The reason I'm following a manual specification of models is that in reality I am using mgcv::gam, and I'm not aware that dredge is able to separate individual smooth terms out of, say, s(a,b). Hence an additional request: if anyone has example code for using gam in a multimodel inference framework, especially with bivariate smooths, I'd be most grateful.

You can model average the coefficients, but not the terms.

> Cheers and Thanks in Advance
> Mike
>
>     require(MuMIn)
>     data(Cement)
>
>     # option 1, create model.selection object using dredge
>     fm0 <- lm(y ~ ., data = Cement)
>     print(dd <- dredge(fm0))
>
>     fm1 <- lm(formula = y ~ X1 + X2, data = Cement)
>     fm2 <- lm(formula = y ~ X1 + X2 + X4, data = Cement)
>     fm3 <- lm(formula = y ~ X1 + X2 + X3, data = Cement)
>     fm4 <- lm(formula = y ~ X1 + X4, data = Cement)
>     fm5 <- lm(formula = y ~ X1 + X3 + X4, data = Cement)
>
>     # ranked with AICc by default
>     # obviously this works
>     model.avg(get.models(dd, delta < 4))
>
>     # option 2: the aim is to produce a model selection object comparable
>     # to that from get.models(dd, delta < 4)
>     # but from a manually-specified list of models
>     my.manual.selection <- mod.sel(list(fm1, fm2, fm3, fm4, fm5))
>
>     # works
>     model.avg(list(fm1, fm2, fm3, fm4, fm5))
>     # or just
>     model.avg(fm1, fm2, fm3, fm4, fm5)
>
>     # doesn't work
>     model.avg(my.manual.selection)
>
>     # hence this doesn't work
>     get.models(my.manual.selection, delta < 4)

There is no need to recreate the models (which is what get.models does) once you already have them as a list.

    models <- list(fm1, fm2, fm3, fm4, fm5)
    my.manual.selection <- mod.sel(models)
    model.avg(models[ my.manual.selection$delta < 4 ])
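The idea of ranking a hand-made list of models and then subsetting that same list can also be sketched in base R. This is not MuMIn's implementation: the aicc() helper, the mtcars data and the formulas are illustrative assumptions, and the list is subset by model name so that the ordering of the ranking table cannot get out of step with the list.

```r
# Hypothetical helper: small-sample corrected AIC for a fitted model
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")          # number of estimated parameters
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

# Illustrative candidate set (built-in data, arbitrary formulas)
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars),
  m3 = lm(mpg ~ wt + hp + qsec, data = mtcars),
  m4 = lm(mpg ~ disp, data = mtcars)
)

ic <- sapply(models, aicc)
sel <- data.frame(model = names(models), AICc = ic, delta = ic - min(ic),
                  stringsAsFactors = FALSE)
sel$weight <- exp(-sel$delta / 2) / sum(exp(-sel$delta / 2))   # Akaike weights
sel <- sel[order(sel$delta), ]
print(sel)

# Keep the already fitted models with delta < 4; nothing is refitted
best <- models[sel$model[sel$delta < 4]]
names(best)
```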
Re: [R] New PLYR issue
Hello everyone,

I have got the same problem, with the same error message. I am using R 2.14.1, plyr 1.7.1, RStudio 0.94.110 and Windows XP. The plyr mailing list has not provided any help so far.

    require(plyr)
    c(sample(c(1:100), 50, replace = TRUE)) -> V1
    c(rep(1:5, 10)) -> f1   # variable to group V1
    data.frame(cbind(V1, f1)) -> DF
    str(DF)
    ddply(DF$V1, DF$f1, sd)
    ddply(.(DF$V1), .(DF$f1), sd)

    Error in if (empty(.data)) return(.data) :
      missing value where TRUE/FALSE needed

Thanks everyone,
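For what it is worth, ddply() takes the whole data frame as its first argument and the grouping variable(s) as its second, so passing the columns DF$V1 and DF$f1 is the likely trigger of the error above. A base-R sketch of the same grouped standard deviation follows; the seed is an arbitrary choice, and the plyr call in the final comment is the usual documented form rather than something tested against plyr 1.7.1.

```r
set.seed(42)                               # arbitrary, for reproducibility
DF <- data.frame(V1 = sample(1:100, 50, replace = TRUE),
                 f1 = rep(1:5, 10))        # f1 groups V1

# Base R: one standard deviation of V1 per level of f1
sd_by_group <- aggregate(V1 ~ f1, data = DF, FUN = sd)
print(sd_by_group)

# With plyr, the data frame goes first and the grouping column is named:
# ddply(DF, .(f1), summarise, sd = sd(V1))
```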
[R] Display numbers on map
I have a text file with states and numbers. I would like to display each number that corresponds to a state on a map. I am trying to use the maps package, but it doesn't show Alaska or Hawaii. Do you have suggestions on how to do this?

Jeffrey
Re: [R] Averaging within a range of values
Thank you all for your help. It's working now. I chose to use the sqldf method and fn$ in the gsubfn package. I used fn$ so that I could put variables into the sqldf statement. This let me increase or decrease the window size in the Gene dataframe if I wanted to include values in the average both before and after the gene. For anyone else interested, this is what it looked like.

    Avg_func <- function(chr, begin, finish) {
      fn$sqldf(paste("select d1C", chr, ".'ORF', avg(C0), avg(C1) from d1C", chr,
                     ", d2C", chr, " where d2C", chr, ".Pos between d1C", chr,
                     ".Start-", begin, " and d1C", chr, ".End+", finish,
                     " group by d1C", chr, ".'ORF'", sep = ""))
    }

I used d1C1 etc. because I had a different file for each chromosome, which I read into separate dataframes in R. The dataframes also had ORF (Open Reading Frame) instead of Group.
Re: [R] meta-analysis normal quantile plot metafor
At 12:15 16/01/2012, Ricc wrote:
> Hello, I used the default parameters:
> - envelope: default is TRUE
> - level: the default is to take the value from the object (I do not understand this very well)

When you specified your original model to rma.uni you either specified the level or let it default (to 95). That value is stored in the object returned by rma.uni and used by qqnorm.rma.uni unless you override it. You can see what is in the object returned by rma.uni by using str.

> - bonferroni: no
> - reps: default is 1000
> - smooth: default is TRUE
> - bass: default is 0
>
> I used no other arguments.
>
> 2012/1/13 Michael Dewey i...@aghmed.fsnet.co.uk:
>> At 15:53 11/01/2012, Ricc wrote:
>>> Hello, I once used the metawin software to perform a meta-analysis (see metawinsoft, Rosenberg et al.) and produced a normal qqplot to test for a potential bias in the dataset. I now want to re-use the same dataset with the package metafor by W. Viechtbauer (great package btw). I run the qqnorm.rma.uni function. I use standardized effect sizes as in metawin.
>>
>> I think it would help if you said which parameters you used to control the envelope. Did you smooth it? Did you use the Bonferroni correction?
>>
>>> The QQplot generated with metafor differs from the plot obtained with metawin: most of the data points fall outside the confidence envelope (using the same confidence level). I don't understand very well how the pseudo confidence envelope was created in metafor. Is it more conservative than that from metawin, or created using the package envelope? Unfortunately I do not have access to metawin's code, so I cannot compare implementations, but the manual leads me to think that metawin prints a classical confidence interval...
>>>
>>> Thanks for input!
>>> Ricc
>>>
>>> More details:
>>> R version 2.13.1 (2011-07-08)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>> metafor_1.6-0
>>
>> Michael Dewey
>> i...@aghmed.fsnet.co.uk
>> http://www.aghmed.fsnet.co.uk/home.html

Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html
Re: [R] Checking dates for entry errors
On 1/11/2012 11:07 PM, Paul Miller wrote:
> Hello Everyone, I have a question about how best to check dates for entry errors.

Try using regular expression matching and the functions grep, strsplit, regexpr etc. If you are not familiar with regex: it is a bit of a bumpy road getting into it, but in the long term it is definitely worth the effort!

cheers
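As a rough illustration of that advice, a regular expression can check the shape of each entry and as.Date() can then check that the entry is a real calendar date. The sample entries and the assumed YYYY-MM-DD format are made up for this sketch; the original question did not state the format.

```r
# Made-up entries, assumed to be intended as YYYY-MM-DD
dates <- c("2012-01-11", "2012-13-01", "2012-02-30", "12/01/2011", "2011-11-05")

# Step 1: does the string have the expected shape?
shape_ok <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", dates)

# Step 2: is it a real calendar date? With an explicit format,
# as.Date() returns NA for entries it cannot parse.
parsed <- as.Date(dates, format = "%Y-%m-%d")
valid <- shape_ok & !is.na(parsed)

print(data.frame(dates, shape_ok, valid))
```

The regex alone would accept month 13 or 30 February, which is why the second step is there.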
Re: [R] meta-analysis normal quantile plot metafor
@ Michael: thanks, I understand now.
@ Wolfgang: apparently, using the DL estimator solved my issue and led to a result with only a slight difference from metawin. Thanks!

2012/1/17 Michael Dewey i...@aghmed.fsnet.co.uk:
> At 12:15 16/01/2012, Ricc wrote:
>> Hello, I used the default parameters:
>> - envelope: default is TRUE
>> - level: the default is to take the value from the object (I do not understand this very well)
>
> When you specified your original model to rma.uni you either specified the level or let it default (to 95). That value is stored in the object returned by rma.uni and used by qqnorm.rma.uni unless you override it. You can see what is in the object returned by rma.uni by using str.
>
>> - bonferroni: no
>> - reps: default is 1000
>> - smooth: default is TRUE
>> - bass: default is 0
>>
>> I used no other arguments.
>>
>> 2012/1/13 Michael Dewey i...@aghmed.fsnet.co.uk:
>>> At 15:53 11/01/2012, Ricc wrote:
>>>> Hello, I once used the metawin software to perform a meta-analysis (see metawinsoft, Rosenberg et al.) and produced a normal qqplot to test for a potential bias in the dataset. I now want to re-use the same dataset with the package metafor by W. Viechtbauer (great package btw). I run the qqnorm.rma.uni function. I use standardized effect sizes as in metawin.
>>>
>>> I think it would help if you said which parameters you used to control the envelope. Did you smooth it? Did you use the Bonferroni correction?
>>>
>>>> The QQplot generated with metafor differs from the plot obtained with metawin: most of the data points fall outside the confidence envelope (using the same confidence level). I don't understand very well how the pseudo confidence envelope was created in metafor. Is it more conservative than that from metawin, or created using the package envelope? Unfortunately I do not have access to metawin's code, so I cannot compare implementations, but the manual leads me to think that metawin prints a classical confidence interval...
>>>>
>>>> Thanks for input!
[R] Error predict with lda and cross validation
Hi,

I use the lda function from the MASS package to classify some samples according to some chemical properties. If I run lda without cross validation all is OK, but if I run lda with cross validation, the R console says:

    resLDA <- ldaRedOx <- lda(Activity ~ TRedOx[,1:6], CV=TRUE, data=dfDataRedOx, subset=train)
    predLDA <- predict(resLDA, newdata=dfDataRedOx[-train,])$class
    Error in UseMethod("predict") :
      no applicable method for 'predict' applied to an object of class "list"

How should I use the predict function with lda and cross validation?

Best
Riccardo
[R] Mean of simulation runs given in a table
Hi,

I have simulation results of the following structure:

    run  par  measured
      1   10        12
      2   10        14
      1   20        20
      2   20        26

where run is the simulation run number, par is the parameter of the simulation, and measured is the value measured in the simulation. This is only a simple example of my results. There are many values measured and many parameters, but the basic structure stays the same: there are many runs (identified by the run number) for the same values of the parameters with various measured values -- they constitute a sample.

I would like to calculate the mean of the measured value for a sample, and so I would like to obtain the following output:

    par  mean
     10    13
     20    23

I would appreciate it if someone could tell me how to do it.

Thank you,
Irek
[R] r help
    df <- data.frame(
      group = rep(c("a", "b"), c(7, 6)),
      sales = c(3, 4, 6, 5, 1, 30, 8, 4, 6, 9, 10, 27, 9),
      turn_over = c(1.5, 2.9, 1.9, 20.5, 2.3, 1.65, 0.06, 3.4, 3.5, 2.23, 0.1, 9.8, 1.4)
    )

Hello all,

In this data set I need to replace the outliers, using the 1.5*IQR rule, for each group and for each variable, so the final data set should look like

    sales = c(3, 4, 6, 5, 1, 8, 8, 4, 6, 9, 10, 10, 9),
    turn_over = c(1.5, 2.9, 1.9, 2.9, 2.3, 1.65, 0.06, 3.4, 3.5, 2.23, 0.1, 3.5, 1.4)

So far I have tried

    k = boxplot(df[, c("sales", "turn_over")], df[, "groupID"])

Any help will be appreciated.
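The desired output above is what one gets by pulling each outlier back to the most extreme value that is not an outlier, i.e. to the end of the boxplot whisker within its group. A minimal sketch under that reading follows; clip_to_whiskers() is a hypothetical helper name, and boxplot.stats() applies the default 1.5*IQR rule based on hinges.

```r
df <- data.frame(
  group = rep(c("a", "b"), c(7, 6)),
  sales = c(3, 4, 6, 5, 1, 30, 8, 4, 6, 9, 10, 27, 9),
  turn_over = c(1.5, 2.9, 1.9, 20.5, 2.3, 1.65, 0.06, 3.4, 3.5, 2.23, 0.1, 9.8, 1.4)
)

# Hypothetical helper: clamp values to the ends of the boxplot whiskers
clip_to_whiskers <- function(x) {
  w <- boxplot.stats(x)$stats[c(1, 5)]   # lower and upper whisker ends
  pmin(pmax(x, w[1]), w[2])
}

# Apply per group (ave) and per variable (lapply)
cols <- c("sales", "turn_over")
df[cols] <- lapply(df[cols], function(v) ave(v, df$group, FUN = clip_to_whiskers))
print(df)
```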
[R] result numeric(0) when using variable1[which(variable2=max(variable2)]
Dear all,

I have a question about finding the row that holds the max value of one of my variables. I calculated the R squared for different columns and made a list to gather them. I unlisted this list to create a vector with these values. I want to know for which column I have the max value of R squared. The columns were always named in the same way: they always start with results4$depth_ followed by the number. The numbers are constructed as seq(1,10,0.1). But now that the R squared values are in one column, I don't know for which column they were calculated. So I made a new data frame with both columns:

    R2 <- unlist(LIST)
    Cvalue <- c(seq(1, 10, 0.1))
    results5 <- data.frame(Cvalue, R2)

    # I know I can calculate the max value of Rsquared this way:
    max(results5$R2)

    # Now I want to know to which Cvalue this belongs. I would write it like this:
    results5$Cvalue[which(results5$R2 == max(results5$R2))]
    # But I always get the result:
    numeric(0)

I don't know if these R squared values are in some kind of format for which this doesn't work? (I have used this before for similar things, and I experienced that, for example, it cannot work if R recognizes the values as dates.) Maybe it is because of the many decimals (e.g. 2.907530e-01)? I know that max(results5$R2) is 0.6081547 in this example, and I can see that it belongs to Cvalue == 1.8. It works the other way round:

    results5$R2[which(results5$Cvalue == 1.8)]

But neither

    results5$Cvalue[which(results5$R2 == 0.6081547)]

nor

    results5$Cvalue[which(results5$R2 == max(results5$R2))]

works…

I hope someone can help me with this problem.

Kind regards
Nerak
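Without the actual data it is hard to say why the comparison with max() comes back empty, but two things are worth checking: a value typed from the printed output (0.6081547) is a rounded copy and will generally not be exactly equal to the stored number, and an NA in the vector makes max() itself NA. The sketch below uses made-up numbers to show both effects, together with which.max(), which avoids the equality test altogether.

```r
# Made-up stand-in for results5
results5 <- data.frame(Cvalue = c(1.7, 1.8, 1.9),
                       R2 = c(0.290753, 0.60815471234, 0.31))

# which.max() returns the position of the maximum directly
best_c <- results5$Cvalue[which.max(results5$R2)]
print(best_c)

# The printed value 0.6081547 is rounded, so exact equality finds nothing
exact_hit <- which(results5$R2 == 0.6081547)
print(exact_hit)

# A tolerance-based comparison does find it
near_hit <- which(abs(results5$R2 - 0.6081547) < 1e-6)
print(near_hit)

# With a missing value, max() is NA unless na.rm = TRUE is given
r2_na <- c(results5$R2, NA)
print(max(r2_na))
print(max(r2_na, na.rm = TRUE))
```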
[R] visualization for k-mean clustering
Hello,

I want a visualization of k-means clustering. Which method would be best for visualization?

Thanks
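There is no single best method, but one common choice is to plot the observations on their first two principal components and colour them by cluster. A minimal base-R sketch follows; the iris data, three clusters and the seed are arbitrary choices for illustration.

```r
set.seed(1)                                   # arbitrary, k-means starts are random
x <- scale(iris[, 1:4])                       # standardise the four measurements
km <- kmeans(x, centers = 3, nstart = 20)

# Project the data and the cluster centres onto the principal components
pc <- prcomp(x)
centers_pc <- predict(pc, newdata = km$centers)

plot(pc$x[, 1:2], col = km$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "k-means clusters on the first two principal components")
points(centers_pc[, 1:2], pch = 8, cex = 2)   # cluster centres
```

With only two variables a plain scatterplot coloured by km$cluster does the same job without the projection.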
[R] Scoring using cox model: probability of survival before time t
Dear Members,

I need to score the probability of survival before a specified time, using a fitted Cox model, on a scoring dataset. On the training sample I am able to get the probability of survival before time point t, but on the scoring dataset, which has only predictor information, I am facing some issues. It would be a great help if you could tell me where I am going wrong! Here is the sample script:

    library(survival)
    n = 100
    beta1 = 3; beta2 = -2
    lambdaT = .01
    lambdaC = .6
    x1 = rnorm(n, 0)
    x2 = rnorm(n, 0)
    T = rweibull(n, shape=1, scale=lambdaT*exp(-beta1*x1-beta2*x2))
    C = rweibull(n, shape=1, scale=lambdaC)
    time = pmin(T, C)
    event = time == T
    train_sample = data.frame(time, event, x1, x2)
    rm(time, event, x1, x2)
    fit_coxph <- coxph(Surv(time, event) ~ x1 + x2, data = train_sample, method = "breslow")

    # Save model to some directory
    save(fit_coxph, file = file.path("C:/Desktop", "fit_coxph.RData"))

    # I can get probabilities on train_sample as below:
    library(peperr)
    pred_train <- predictProb.coxph(fit_coxph, Surv(train_sample$time, train_sample$event), train_sample, 0.4)
    head(pred_train)
    #              [,1]
    # [1,] 5.126281e-03
    # [2,] 4.324882e-01
    # [3,] 4.444506e-61
    # [4,] 0.00e+00
    # [5,] 0.00e+00
    # [6,] 3.249947e-01

In the same way, I need probabilities on scoring_data. Now close the earlier session and run the script below in a new R session; it gives an error.

    library(survival)
    library(peperr)
    load(file = file.path("C:/Desktop", "fit_coxph.RData"))
    n = 1000
    set.seed(1)
    x1 = rnorm(n, 0)
    x2 = rnorm(n, 0)
    score_data <- data.frame(x1, x2)
    pred_score <- predictProb.coxph(fit_coxph, Surv(time, event), score_data, 0.04)
    # Error in Surv(time, event) : Time variable is not numeric

    # After creating a dummy placeholder for Surv(time, event), it gives another error:
    time <- rep(2, n)
    event <- rep(1, n)
    pred_score <- predictProb.coxph(fit_coxph, Surv(time, event), score_data, 0.04)
    # Error in inherits(x, "data.frame") : object 'train_sample' not found

I would appreciate your help. Is there any other way to get these probabilities on new data?

Thanks in advance
~Aher
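One alternative that stays within the survival package is to ask survfit() for predicted survival curves on the new data and read them off at the time of interest. The sketch below is not the peperr approach from the post: the simulated data, the coefficients and the time point 0.4 are made up, and it assumes the training data are still available in the session (the second error above suggests the fitted object looks for train_sample when it is used for prediction).

```r
library(survival)

set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
event_time <- rexp(n, rate = 0.5 * exp(1.0 * x1 - 0.5 * x2))   # made-up hazard model
cens_time <- rexp(n, rate = 0.2)
train_sample <- data.frame(time = pmin(event_time, cens_time),
                           event = as.numeric(event_time <= cens_time),
                           x1 = x1, x2 = x2)

fit_coxph <- coxph(Surv(time, event) ~ x1 + x2, data = train_sample,
                   method = "breslow")

# New data with predictors only
score_data <- data.frame(x1 = rnorm(5), x2 = rnorm(5))

# One predicted survival curve per row of score_data, evaluated at t = 0.4
sf <- survfit(fit_coxph, newdata = score_data)
surv_at_t <- as.vector(summary(sf, times = 0.4, extend = TRUE)$surv)

# Probability of an event before t is the complement of survival at t
prob_event_before_t <- 1 - surv_at_t
print(round(prob_event_before_t, 3))
```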
Re: [R] How to loop on file names
Dear Hélène Genet,

Re:

> Dear all,
>
> I need to do the same procedure on several files, but I don't know how to refer to the file name. Here is an example of what I am trying to do.
>
> List of files: file1(A,B,C,D1,...,Dn), file2(A,B,C,E1,...,En), file3(A,B,C,F1,...,Fn)
>
> Procedure I want to apply to each file:
>
>     dft <- melt(df, id=c('A','B','C'))
>     dft$X <- substr(dft$variable, 1, 3)
>     dft$Y <- substr(dft$variable, 4, 8)
>     dft1 <- cast(dft, A+B+C+X ~ Y, value="response")
>
> As you can see, all the files contain the same 3 variables A, B, C that I use in the procedure. So I want to apply the procedure to all the files in a loop. Something like:
>
>     filelist <- c('file1', 'file2', 'file3')
>     for (i in 1:3) {
>       filename <- filelist[i]
>       ...
>     }
>
> Any suggestion on how to refer to these files in this loop?
>
> Thank you in advance,
> Helene
>
> --
> Hélène Genet, PhD
> Institute of Arctic Biology
> University of Alaska Fairbanks
> Irving I Blg, Room 402
> Fairbanks AK 99775
> Phone: 907-474-5472
> Cell: 907-699-4340
> Email: hge...@alaska.edu

I use this procedure to point to a folder (of WAV files in this example) and process all those files in a loop:

    library(tuneR)   # provides readWave() and writeWave()

    # select FOLDER w. WAV-files
    fnam = dirname(file.choose())   # choose any one file from the folder (= directory)
    filist = list.files(fnam, recursive=TRUE, pattern="wav")
    filist1 = paste(fnam, "/", filist, sep="")
    nfiles = length(filist1)

    # filenames loop
    for (i in 1:nfiles) {
      inname = filist1[i]
      ywave = readWave(inname)   # read the i'th file (wav as an example, can be any filetype you need)
      ywave2 = ywave             # output is the same as input (yet)
      ###
      ### DO ANY PROCESSING YOU WANT HERE to change ywave into ywave2
      ###
      outname = paste(dirname(inname), "/*", basename(inname), sep="")
      writeWave(ywave2, outname)
    }

I took these few lines out of a larger program; the wave file processing itself is not relevant here.

Hope I didn't forget something, and that this works for your case.

Best wishes,

Franklin Bretschneider
Utrecht University
--
Dept Biologie
Kruytgebouw W711
Padualaan 8
3584 CH Utrecht
The Netherlands
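The same pattern works for data files: list the files once, loop over the full paths, and keep each result in a list named after its file. In this sketch the folder, the CSV files and the processing step are made up so that the example is self-contained; in practice the body of the loop would hold the melt/cast steps from the question.

```r
# Create three small demo files in a temporary folder (illustration only)
demo_dir <- file.path(tempdir(), "loop_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (nm in c("file1", "file2", "file3")) {
  write.csv(data.frame(A = 1:3, B = 4:6, C = 7:9, D1 = c(0.1, 0.2, 0.3)),
            file.path(demo_dir, paste0(nm, ".csv")), row.names = FALSE)
}

# List the files, then apply the same procedure to each one
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
results <- list()
for (f in files) {
  df <- read.csv(f)
  df$total <- df$A + df$B + df$C          # placeholder for the real processing
  results[[basename(f)]] <- df
}

names(results)
```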
Re: [R] Saving WinBugs log file when using bugs()
On 16.01.2012 18:00, chaps31 wrote:
> Hi
>
>> The log file (not as .odc and as .txt file) is in R's tempdir() after WinBUGS returns to R.

Was this my statement? I actually meant to write *both* rather than not above.

Best,
Uwe

> Is there any way to save the .odc file?
>
> Ruth
Re: [R] net classification improvement?
Thanks for the reply. I think the issue is more whether it can be applied to cross-sectional data. This I'm not sure about. This method is heavily cited in the New England Journal of Medicine, but thus far I've only seen it used with longitudinal data.

On 1/16/12 10:23 PM, Kevin E. Thorpe kevin.tho...@utoronto.ca wrote:
> On 01/16/2012 08:10 PM, Essers, Jonah wrote:
>> Greetings,
>>
>> I have generated several ROC curves and would like to compare the AUCs. The data are cross-sectional and the outcomes are binary. I am testing which of several models provides the best discrimination. Would it be most appropriate to report AUC with 95% CIs? I have been looking into the net reclassification improvement (see below for reference), but thus far I can only find a version in the Hmisc package, which requires survival data. Any idea what the best approach is for cross-sectional data?
>
> I believe that the function in Hmisc that does this will also work on binary data.
>
>> Thanks
>>
>> Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 2008;27:157-172
>
> --
> Kevin E. Thorpe
> Biostatistician/Trialist, Applied Health Research Centre (AHRC)
> Li Ka Shing Knowledge Institute of St. Michael's
> Assistant Professor, Dalla Lana School of Public Health
> University of Toronto
> email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016
[R] formula in function as text?
Hello all,

It might be a simple question, but I cannot find the solution, as I do not know which terms I should search for. So many thanks to whoever can help me.

I am creating a function and would like to place a formula in the function without it being executed immediately, like saving it temporarily as 'text'. Simplified version of what I would like to be able to do:

    test <- function(a, x) {
      if (a < 5) { b <- 3 + x[i] }
      if (a > 5) { b <- 6 + x[i] }
      y <- 1:10
      for (i in 1:10) { y[i] <- 4 + b }
      return(y)
    }

In my perfect world, R will replace b in the formula y = 4 + b by the appropriate b, indicated by the condition (value of a). It now takes for 'b' only the first element of x (+3 or 6). I know I can solve the problem by also looping over b and turning it into a vector, but I would like to know if it is also possible in the way stated above. If I put "3+x[i]" in quotes to make it a character, it will still be a character at y <- 4 + b, and when I use as.numeric it will create NA.

Thanks in advance,
Julia
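Two base-R ways of postponing the evaluation are sketched below: storing the rule as a small function, or storing it as an unevaluated expression with quote() and evaluating it inside the loop with eval(). The direction of the comparison (a < 5 selects the "3 +" rule) and the function names are assumptions made for the illustration.

```r
# Variant 1: choose a function once, call it later for each element
test_fun <- function(a, x) {
  b_fun <- if (a < 5) function(xi) 3 + xi else function(xi) 6 + xi
  y <- numeric(length(x))
  for (i in seq_along(x)) y[i] <- 4 + b_fun(x[i])
  y
}

# Variant 2: keep the formula unevaluated and evaluate it inside the loop,
# where i is defined
test_expr <- function(a, x) {
  b <- if (a < 5) quote(3 + x[i]) else quote(6 + x[i])
  y <- numeric(length(x))
  for (i in seq_along(x)) y[i] <- 4 + eval(b)
  y
}

res_fun <- test_fun(2, 1:10)
res_expr <- test_expr(7, 1:10)
print(res_fun)
print(res_expr)
```

In this particular case the loop is not needed at all, since 4 + 3 + x is already vectorised, but the two variants show how to carry an unevaluated rule around.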
[R] BLAS
I'm setting up an Ubuntu virtual machine that will use 4 Intel Xeon X5650 CPUs. I'd like to compile R with a BLAS, but the question is which one. It seems the only free ones are GotoBLAS, which I'm not sure is being maintained for newer CPUs, and OpenBLAS for Loongson CPUs. I saw a favorable report on OpenBLAS (http://www.rochester.edu/college/gradstudents/jolmsted/files/computing/BLAS_Comparison.pdf), but I'm not sure it's the right thing for my CPUs. The webpage for OpenBLAS says, "On X86 box, compile this library for loongson3a CPU." Any opinions on whether this will work? If not, any suggestions for another free BLAS?
[R] Plotting probability density and cumulative distribution function
Hi!

I want to plot the probability density function and the cumulative distribution function for the gamma, lognormal, exponential and Pareto distributions. I want to vary the parameters and have a plot with 2-3 different parameter settings in the same figure. It should look like this (example: Weibull distribution):

http://en.wikipedia.org/wiki/File:Weibull_PDF.svg

    library(Cairo)
    CairoFonts(regular="DejaVu Sans:style=Regular")
    CairoSVG("Weibull PDF.svg")
    par(mar=c(3, 3, 1, 1))
    x <- seq(0, 2.5, length.out=1000)
    plot(x, dweibull(x, .5), type="l", col="blue", xlab="", ylab="",
         xlim=c(0, 2.5), ylim=c(0, 2.5), xaxs="i", yaxs="i")
    lines(x, dweibull(x, 1), type="l", col="red")
    lines(x, dweibull(x, 1.5), type="l", col="magenta")
    lines(x, dweibull(x, 5), type="l", col="green")
    legend("topright", legend=paste("\u03bb = 1, k =", c(.5, 1, 1.5, 5)),
           lwd=1, col=c("blue", "red", "magenta", "green"))
    dev.off()

and

http://en.wikipedia.org/wiki/File:Weibull_CDF.svg

    library(Cairo)
    CairoFonts(regular="DejaVu Sans:style=Regular")
    CairoSVG("Weibull CDF.svg")
    par(mar=c(3, 3, 1, 1))
    x <- seq(0, 2.5, length.out=1000)
    plot(x, pweibull(x, .5), type="l", col="blue", xlab="", ylab="",
         xlim=c(0, 2.5), ylim=c(0, 1), xaxs="i", yaxs="i")
    lines(x, pweibull(x, 1), type="l", col="red")
    lines(x, pweibull(x, 1.5), type="l", col="magenta")
    lines(x, pweibull(x, 5), type="l", col="green")
    legend("bottomright", legend=paste("\u03bb = 1, k =", c(.5, 1, 1.5, 5)),
           lwd=1, col=c("blue", "red", "magenta", "green"))
    dev.off()

How do I adapt the syntax to use it for the other distributions?
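The code carries over almost unchanged: replace dweibull/pweibull by the density and distribution functions of the other families, which in base R are dgamma/pgamma, dlnorm/plnorm and dexp/pexp. Base R has no Pareto functions, so the sketch below defines dpareto() and ppareto() by hand; those two names, the parameter values and the plotting ranges are assumptions made for illustration, and the plot goes to the default device rather than Cairo.

```r
# Hand-written Pareto density and CDF (scale xm, shape alpha); not part of base R
dpareto <- function(x, xm = 1, alpha = 1) ifelse(x >= xm, alpha * xm^alpha / x^(alpha + 1), 0)
ppareto <- function(x, xm = 1, alpha = 1) ifelse(x >= xm, 1 - (xm / x)^alpha, 0)

x <- seq(0.01, 5, length.out = 1000)
shapes <- c(1, 2, 5)
cols <- c("blue", "red", "green")

# Gamma densities for three shape values, rate fixed at 1
plot(x, dgamma(x, shape = shapes[1], rate = 1), type = "l", col = cols[1],
     xlab = "", ylab = "", ylim = c(0, 1), xaxs = "i", yaxs = "i")
for (j in 2:3) lines(x, dgamma(x, shape = shapes[j], rate = 1), col = cols[j])
legend("topright", legend = paste("shape =", shapes, ", rate = 1"), lwd = 1, col = cols)

# Pareto CDFs for three values of alpha, xm fixed at 1
plot(x, ppareto(x, xm = 1, alpha = shapes[1]), type = "l", col = cols[1],
     xlab = "", ylab = "", ylim = c(0, 1), xaxs = "i", yaxs = "i")
for (j in 2:3) lines(x, ppareto(x, xm = 1, alpha = shapes[j]), col = cols[j])
legend("bottomright", legend = paste("xm = 1, alpha =", shapes), lwd = 1, col = cols)
```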
Re: [R] Mean of simulation runs given in a table
On 17.01.2012 12:31, Irek Szczesniak wrote:
> Hi,
>
> I have the simulation results of the following structure:
>
> run par measured
> 1   10  12
> 2   10  14
> 1   20  20
> 2   20  26
>
> Where run is the simulation run number, par is the parameter of the simulation, and measured is the value measured in the simulation.
>
> This is only a simple example of my results. There are many values measured and many parameters. But the basic structure stays the same: there are many runs (identified by the run number) for the same values of the parameters with various measured values -- they constitute a sample.
>
> I would like to calculate the mean of the measured value for a sample, and so I would like to obtain the output as follows:
>
> par mean
> 10  13
> 20  23
>
> I would appreciate it if someone could write me how to do it.

For your data in a data.frame called dat:

aggregate(measured ~ par, dat, mean)

Uwe Ligges

> Thank you,
> Irek
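To make the suggestion above concrete, here is a minimal sketch that rebuilds the four example rows from the question and applies the same aggregate() call. The data frame name dat comes from the reply; renaming the result column to "mean" is only an illustration so the output matches the layout asked for.

```r
# Example rows copied from the question
dat <- data.frame(
  run      = c(1, 2, 1, 2),
  par      = c(10, 10, 20, 20),
  measured = c(12, 14, 20, 26)
)

# One mean of 'measured' per distinct value of 'par'
res <- aggregate(measured ~ par, data = dat, FUN = mean)

# Optional: rename the column to match the desired output
names(res)[2] <- "mean"
print(res)
```

With more than one parameter column, the right-hand side of the formula can list them all, e.g. measured ~ par1 + par2 (hypothetical column names).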
Re: [R] Error predict with lda and cross validation
On 17.01.2012 14:54, Riccardo Romoli wrote:
> Hi,
> I use the lda function from the MASS package to classify some samples according to some chemical properties. If I run lda without cross validation all is OK, but if I run lda with cross validation, the R console says:
>
> resLDA <- ldaRedOx <- lda(Activity ~ TRedOx[,1:6], CV=TRUE, data=dfDataRedOx, subset=train)
> predLDA <- predict(resLDA, newdata=dfDataRedOx[-train,])$class
> Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "list"
>
> How should I use the predict function with lda and cross validation?

See ?lda: it has a very different output if CV=TRUE is used. Hence you have to fit it with CV=FALSE in order to make predictions. It does not make sense to ask LDA for a cross validation and run another test later on, does it?

Uwe Ligges

> Best
> Riccardo
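The difference between the two kinds of output can be seen in a small sketch. It uses the built-in iris data instead of the poster's dfDataRedOx (which is not available here), and the 100/50 split is an arbitrary assumption for illustration.

```r
library(MASS)  # lda() lives in MASS, which ships with R

set.seed(1)
train <- sample(nrow(iris), 100)  # arbitrary training subset

# CV = TRUE: leave-one-out results come back as a plain list
# (components such as $class and $posterior), so predict() has no method for it.
cvfit <- lda(Species ~ ., data = iris, subset = train, CV = TRUE)
cv_accuracy <- mean(cvfit$class == iris$Species[train])

# CV = FALSE (the default): an object of class "lda" that predict() understands.
fit <- lda(Species ~ ., data = iris, subset = train)
pred <- predict(fit, newdata = iris[-train, ])$class
test_accuracy <- mean(pred == iris$Species[-train])
```

So the cross-validated classes are read directly from cvfit$class, while predictions for held-out rows need the ordinary fit.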
Re: [R] result numeric(0) when using variable1[which(variable2=max(variable2)]
On 17/01/2012 5:35 AM, Nerak wrote:
> Dear all,
> I have a question about finding the row that holds the max value of one of my variables. I calculated the R-squared for different columns and made a list to gather them. I unlisted this list to create a vector with these values. I want to know for which column I have the max value of R-squared. The columns were always named in the same way: they always start with results4$depth_ followed by the number. The numbers are constructed as seq(1,10,0.1). But if the R-squared values are now in one column, I don’t know for which column they were calculated. So I made a new data frame with both columns:
>
> R2 <- unlist(LIST)
> Cvalue <- c(seq(1,10,0.1))
> results5 <- data.frame(Cvalue,R2)
>
> # I know I can calculate the max value of Rsquared this way:
> max(results5$R2)
> # now I want to know to which Cvalue this belongs. I would write it like this:
> results5$Cvalue[which(results5$R2 == "max(results5$R2)")]

Don't use quotes on the expression. None of your R2 values are the string "max(results5$R2)", which is why you're getting numeric(0).

You can also make it simpler by using the which.max() function.

Duncan Murdoch

> # But I always get the solution: numeric(0)
> # I don’t know if these Rsquared values are in some format for which this doesn’t work. (I used this before for similar things, and I found that, for example, it does not work if R recognizes the values as a date.) Maybe because they have a lot of decimals? (e.g. 2.907530e-01) I know that max(results5$R2) is in this example 0.6081547 and I can see that it belongs to Cvalue == 1.8. It works in the opposite way:
> results5$R2[which(results5$Cvalue == 1.8)]
> # But neither
> results5$Cvalue[which(results5$R2 == 0.6081547)]
> # nor
> results5$Cvalue[which(results5$R2 == "max(results5$R2)")]
> # works…
>
> I hope someone can help me with this problem.
> Kind regards
> Nerak
Re: [R] result numeric(0) when using variable1[which(variable2=max(variable2)]
On 17-01-2012, at 11:35, Nerak wrote:
> Dear all,
> I have a question about finding the row that holds the max value of one of my variables. I calculated the R-squared for different columns and made a list to gather them. I unlisted this list to create a vector with these values. I want to know for which column I have the max value of R-squared. The columns were always named in the same way: they always start with results4$depth_ followed by the number. The numbers are constructed as seq(1,10,0.1). But if the R-squared values are now in one column, I don’t know for which column they were calculated. So I made a new data frame with both columns:
>
> R2 <- unlist(LIST)
> Cvalue <- c(seq(1,10,0.1))
> results5 <- data.frame(Cvalue,R2)
>
> # I know I can calculate the max value of Rsquared this way:
> max(results5$R2)
> # now I want to know to which Cvalue this belongs. I would write it like this:
> results5$Cvalue[which(results5$R2 == "max(results5$R2)")]
> # But I always get the solution: numeric(0)

You haven't provided a reproducible example. So I tried this:

set.seed(1)
x <- round(runif(10),3)
x
which.max(x)
which(x==max(x))
x
which(x==0.945)
which(x==max(x))
which(x=="max(x)")
x[which(x=="max(x)")]

If you run this you will see that the last line results in numeric(0). So: why are you using quotes in the which expression? Is results5$R2 a character string?

This should work:

results5$Cvalue[which(results5$R2 == max(results5$R2))]

But this is shorter:

results5$Cvalue[which.max(results5$R2)]

Berend
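Both replies can be checked with a small self-contained sketch. The R2 values below are random stand-ins (the poster's LIST is not available), and the tolerance of 1e-6 in the last step is an assumption chosen only to illustrate why comparing against a rounded, printed value such as 0.6081547 with == can also fail.

```r
set.seed(1)
# Stand-in for the poster's data: 91 Cvalues with made-up R2 values
results5 <- data.frame(Cvalue = seq(1, 10, 0.1), R2 = runif(91))

# Unquoted comparison and which.max() agree
best_a <- results5$Cvalue[which(results5$R2 == max(results5$R2))]
best_b <- results5$Cvalue[which.max(results5$R2)]

# Quoted expression: compared as a string, matches nothing
none <- results5$Cvalue[which(results5$R2 == "max(results5$R2)")]

# A value copied from printed output is rounded, so compare with a tolerance
target <- round(max(results5$R2), 7)
near <- results5$Cvalue[which(abs(results5$R2 - target) < 1e-6)]
```

which.max() returns only the first maximum; which(x == max(x)) returns all ties.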
Re: [R] formula in function as text?
On 17-01-2012, at 13:36, Julia Burggraaf wrote:
> Hello all,
> It might be a simple question, but I cannot find the solution, as I do not know which subjects I should search on. So, many thanks to whoever can help me.
> I am creating a function and would like to place a formula in the function, without it being executed immediately. Like saving it temporarily as 'text'.
> Simplified version of what I would like to be able to do:
>
> test<-function(a,x){
> if(a<5){ b<-3+ x[i]}
> if(a>5){ b<- 6 + x[i]}
> y<-1:10
> for (i in 1:10){y[i]<-4 + b}
> return(y)
> }
>
> In my perfect world, R will replace b in the formula y=4+b by the appropriate b, indicated by the condition (value of a).
> It now takes for 'b' only the first argument of x (+3 or 6). I know I can solve the problem by also looping over b and turning it into a vector, but I would like to know if it is also possible in the way stated above.
> If I put "3+x[i]" in to make it a character, it will still be character at y<-4+b, or when I use as.numeric, it will create NA.

You are complicating matters.

test <- function(a,x){
  if(a<5){ b <- 3+ x} else if(a>5){ b<- 6 + x}
  # b is now a vector
  y <- 4 + b
  return(y)
}

Puzzle for you to solve: what happens when a is identical to 5?

Berend
Re: [R] formula in function as text?
Hi

> Hello all,
> It might be a simple question, but I cannot find the solution, as I do not know which subjects I should search on. So, many thanks to whoever can help me.
> I am creating a function and would like to place a formula in the function, without it being executed immediately. Like saving it temporarily as 'text'.
> Simplified version of what I would like to be able to do:
>
> test<-function(a,x){
> if(a<5){ b<-3+ x[i]}

What is i?

> if(a>5){ b<- 6 + x[i]}
> y<-1:10
> for (i in 1:10){y[i]<-4 + b}
> return(y)
> }
>
> In my perfect world, R will replace b in the formula y=4+b by the appropriate b, indicated by the condition (value of a).

If you want a perfect R world you should use the R approach. Let's suppose you have a vector a with values below and above 5, and a vector x which you want to use for computing b:

set.seed(111)
a <- sample(1:10, 10)
x <- runif(10)

You can compute vector b according to your condition:

b <- (((a>5)+1)*3) + x # I included number 5 in computing 3

and based on this you can compute y:

y <- 4+b

You can put it in a function if you want.

test<-function(a,x) {
b <- (((a>5)+1)*3) + x
y <- 4+b
return(y)
}

Regards
Petr

> It now takes for 'b' only the first argument of x (+3 or 6). I know I can solve the problem by also looping over b and turning it into a vector, but I would like to know if it is also possible in the way stated above.
> If I put "3+x[i]" in to make it a character, it will still be character at y<-4+b, or when I use as.numeric, it will create NA.
>
> Thanks in advance,
> Julia
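The same idea can be written with ifelse(), which some readers may find easier to follow than the arithmetic on a logical. This is only a sketch: the comparison operators were lost in the archived text, so the rule used here (offset 6 when a > 5, otherwise 3, with a equal to 5 falling into the 3 branch) is an assumption based on the comment in the reply above.

```r
# Assumed rule: offset is 6 when a > 5, otherwise 3 (a == 5 gives 3)
test <- function(a, x) {
  b <- ifelse(a > 5, 6, 3) + x  # vectorised over both a and x
  4 + b
}

test(3, c(1, 2, 3))  # offset 3 applied to every element of x
test(7, c(1, 2, 3))  # offset 6 applied to every element of x
```

Because the arithmetic is vectorised, no loop over i and no "formula stored as text" is needed.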
Re: [R] Display numbers on map
On Jan 17, 2012, at 5:37 AM, Jeffrey Joh wrote:
> I have a text file with states and numbers. I would like to display each number that corresponds to a state on a map. I am trying to use the maps package, but it doesn't show Alaska or Hawaii. Do you have suggestions on how to do this?

This question suggests you are not yet aware of the search facility built into R:

RSiteSearch("maps Hawaii Alaska")
A search query has been submitted to http://search.r-project.org
The results page should open in your browser shortly

The third hit was back to the map function help page. (The examples should be read and executed.)

?example

> [[alternative HTML version deleted]]

And the above suggests you still need to read the following:

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

--
David Winsemius, MD
West Hartford, CT
Re: [R] question: how to select a column from a dataframe in a function
Hi

> Hi,
> I am creating a function and ran into the problem of selecting a column from a dataset. It seems as though the $ function (as in data$columnname) does not apply in the function.
> In simplified version:
> This works:
> testf2<-function(data,columnnumber){print(data[,columnnumber])}
> But this doesn't:
> testf<-function(data,column){print(data$column)}
>
> Even though the first solution works, I would like to be able to pass the column name to the function, instead of the column number. How do I do that?

Not sure if you got any answer yet.

testf2<-function(data,columnname){print(data[,columnname])}

Petr

> Thank you in advance,
> Julia
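A short sketch of why the $ version fails and the bracket version works: data$column looks for a column literally called "column", whereas [ and [[ evaluate their argument. The data frame and its column names below are made up for illustration.

```r
# Hypothetical example data
df <- data.frame(height = c(1.6, 1.8), weight = c(60, 80))

# [[ evaluates 'column', so a character string holding the name works
testf <- function(data, column) {
  data[[column]]
}

by_name <- testf(df, "weight")

# $ does not evaluate its right-hand side: this looks for a column named "column"
literal <- df$column
```

data[, column] as in the reply works the same way for data frames; [[ ]] always returns the column itself rather than possibly a one-column data frame.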
Re: [R] Checking dates for entry errors
On Jan 17, 2012, at 8:02 AM, Thomas Mang wrote:
> On 1/11/2012 11:07 PM, Paul Miller wrote:
>> Hello Everyone,
>> I have a question about how best to check dates for entry errors.
>
> Try using regular expression matching and the functions grep, strsplit, regexpr etc. If you are not familiar with regex: it is a bit of a bumpy road getting into it, but in the long term definitely worth the effort!

Agree on all points, but also simply converting to the Date class and using the ?Comparison operators is informative. And there may be NAs that will signal impossible month-day combinations, which would be tedious to identify with regex.

--
David Winsemius, MD
West Hartford, CT
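A minimal sketch of the Date-class approach described above. The entered strings, the %Y-%m-%d format and the plausibility cutoff of 2000-01-01 are all assumptions for illustration; the original poster's data format is not shown in this part of the thread.

```r
# Hypothetical entries: one valid, two impossible, one implausibly early
entered <- c("2012-01-11", "2012-02-30", "2011-13-01", "1912-01-11")

# With an explicit format, impossible dates become NA instead of raising an error
parsed <- as.Date(entered, format = "%Y-%m-%d")

# Entries that cannot be real calendar dates
impossible <- entered[is.na(parsed)]

# Comparison operators flag dates outside a plausible range (cutoff is made up)
too_early <- entered[!is.na(parsed) & parsed < as.Date("2000-01-01")]
```

The same comparison works between two date columns, for example to find records where an end date precedes a start date.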
Re: [R] BLAS
On Tue, Jan 17, 2012 at 6:06 AM, Scott Raynaud scott.rayn...@yahoo.com wrote:
> I'm setting up an Ubuntu virtual machine that will use 4-Intel Xeon CPU x5650. I'd like to compile R with a BLAS, but the question is which one. It seems the only free ones are GotoBLAS, which I'm not sure is being maintained for newer CPUs, and OpenBLAS for Loongson CPUs. I saw a favorable report on OpenBLAS (http://www.rochester.edu/college/gradstudents/jolmsted/files/computing/BLAS_Comparison.pdf), but I'm not sure it's the right thing for my CPUs. The webpage for OpenBLAS says, "On X86 box, compile this library for loongson3a CPU."
>
> Any opinions on whether this will work? If not, any suggestions on another free BLAS?

I've been using ATLAS (http://math-atlas.sourceforge.net/) with good success.

Peter
Re: [R] ggplot- using geom_point and geom_line at the same time
On Mon, Jan 16, 2012 at 6:05 PM, Mary Kindall mary.kind...@gmail.com wrote:
> Thanks for the reply.
> I wanted to have a legend name with spaces. Right now I am using the following code, but it produces two legends. I have to use Gimp to cut the redundant legend.

Your basic problem is that you're using the fill and colour aesthetics, but you only need colour.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
Re: [R] visualization for k-mean clustering
Will depend heavily on the structure of your data (you haven't even told us the number of dimensions or the metric in question), but I'd suggest something like a scatterplot color coded by cluster with an additional marker for cluster means.

Michael Weylandt

On Jan 17, 2012, at 3:48 AM, mukul purva mukul.pu...@gmail.com wrote:
> hello,
> I want a visualization of k-means clustering. Which method will be best for visualization?
> thanks
>
> [[alternative HTML version deleted]]
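For two-dimensional data, the suggested scatterplot could look like the sketch below. The data are simulated (two made-up Gaussian blobs), and the choice of two clusters is an assumption; with more than two dimensions one would first have to pick a pair of variables or a projection.

```r
set.seed(42)
# Simulated 2-D data: two groups of 50 points each
pts <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 4), ncol = 2)
)

km <- kmeans(pts, centers = 2)

# Points colour coded by assigned cluster
plot(pts, col = km$cluster, pch = 19, xlab = "x", ylab = "y")

# Cluster means as large star markers in the matching colours
points(km$centers, pch = 8, cex = 2, col = 1:2)
```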
Re: [R] Plotting probability density and cumulative distribution function
Perhaps I misunderstand you, but it sounds like all you need to do is change weibull to the name of another distribution.

Michael
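As a sketch of that substitution for the gamma case: dweibull/pweibull become dgamma/pgamma, and the same pattern applies to dlnorm/plnorm and dexp/pexp. The shape values and x range are arbitrary choices, not taken from the thread. Base R has no Pareto functions, so dpareto below is a hypothetical helper written from the textbook density; add-on packages provide alternatives.

```r
x <- seq(0, 10, length.out = 1000)
shapes <- c(1, 2, 5)                 # arbitrary illustration values
cols <- c("blue", "red", "green")

# One column of density / CDF values per shape parameter (rate fixed at 1)
dens <- sapply(shapes, function(k) dgamma(x, shape = k, rate = 1))
cdf  <- sapply(shapes, function(k) pgamma(x, shape = k, rate = 1))

# Hypothetical helper: Pareto density with scale xm and shape alpha
dpareto <- function(x, xm, alpha) {
  ifelse(x >= xm, alpha * xm^alpha / x^(alpha + 1), 0)
}

# All density curves in one figure, with a legend as in the Weibull example
matplot(x, dens, type = "l", lty = 1, col = cols, xlab = "", ylab = "")
legend("topright", legend = paste("shape =", shapes), lwd = 1, col = cols)
```

Replacing dens by cdf in the matplot() call (and moving the legend to "bottomright") gives the corresponding CDF figure.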
Re: [R] visualization for k-mean clustering
You mean the process of clustering (the algorithm)? Have you looked at kmeans.ani() in the animation package?

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Tue, Jan 17, 2012 at 2:48 AM, mukul purva mukul.pu...@gmail.com wrote:
> hello,
> i want a visualization of the k-mean clustering.which one method will be best for visualization??
> thnkx
>
> [[alternative HTML version deleted]]
Re: [R] boxplot with diamond shape
On 16/1/2012 08:07, David martin wrote:
> Hi,
> I haven't found in R a possibility to draw a boxplot with a diamond shape (means and CI).

David,

Perhaps I am prejudiced, as I cannot see any advantage in the diamond shape for displaying just two dimensions, but I would recommend you check whether plotCI from the plotrix package, or even better plotmeans from the gplots package (which does it more automatically), suffices for your needs.

HTH,
--
Cesar Rabak
Re: [R] How to loop on file names
Inline:

Michael

On Jan 16, 2012, at 10:26 PM, Hélène Genet hge...@alaska.edu wrote:
> Dear all,
>
> I need to do the same procedure on several files. But I don't know how to refer to the file name. Here is an example of what I am trying to do.
>
> List of files:
> file1(A,B,C,D1,...,Dn), file2(A,B,C,E1,...,En), file3(A,B,C,F1,...,Fn)
>
> Procedure I want to apply on each file:

filelist <- c('file1' , 'file2' , 'file3')
for(i in filelist){
  df <- read.csv(i)
  dft <- melt(df,id=c('A','B','C'))
  dft$X <- substr(dft$variable,1,3)
  dft$Y <- substr(dft$variable,4,8)
  dft1 <- cast(dft, A+B+C ~ Y,value="response")
  write.csv(dft1, file = paste("done-", i, sep = ""))
}

> As you see, all the files contain the same 3 variables A,B,C that I use in the procedure. So I want to apply the procedure on all the files in a loop. Something like:
>
> filelist <- c('file1' , 'file2' , 'file3')
> for (i in 1:3) {
> filename <- filelist[i]
> ...
> }
>
> Any suggestion to refer to these files in this loop?
>
> Thank you in advance,
> Helene
>
> --
> Hélène Genet, PhD
> Institute of Arctic Biology
> University of Alaska Fairbanks
> Irving I Blg, Room 402
> Fairbanks AK 99775
> Phone: 907-474-5472
> Cell: 907-699-4340
> Email: hge...@alaska.edu
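The looping pattern itself can be tried without the reshape package. The sketch below writes two small made-up CSV files to a temporary directory, then loops over them by file name. The column names, the "total" computation (a stand-in for the melt/cast procedure) and the "done-" prefix are assumptions for illustration.

```r
# Create a scratch directory with two hypothetical input files
dir <- tempfile("loopdemo")
dir.create(dir)
for (nm in c("file1.csv", "file2.csv")) {
  write.csv(data.frame(A = 1:2, B = 3:4), file.path(dir, nm), row.names = FALSE)
}

# Collect the file names once, before any output files are written
filelist <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

for (f in filelist) {
  df <- read.csv(f)
  df$total <- df$A + df$B  # stand-in for the real procedure
  out <- file.path(dirname(f), paste0("done-", basename(f)))
  write.csv(df, out, row.names = FALSE)
}
```

Looping directly over the names (for (f in filelist)) avoids the extra index variable; list.files() saves typing the names by hand when there are many files.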
[R] Reference for dataset colon (package survival)
Dear R team, dear Prof. Therneau,

library(survival)
data(colon)
?colon

gives me only a very rudimentary source (only a name). Is there a possibility to get a reference to the clinical trial these data are taken from?

Many thanks in advance. With best wishes,

Matthias Gondan
[R] breakpoints and nonlinear regression
Dear Forum,

I have been racking my brain over this problem for the past few days. I have a dataset of (x,y). I have been able to obtain a nonlinear regression line using nls. However, we would like to do some statistical analysis. I would like to obtain a confidence interval for the curve. We thought we could divide up the curve into piecewise linear regressions and compute CIs from those portions. There is a package called strucchange that seems helpful, but I am thoroughly confused.

'breakpoints' is used to calculate the number of breaks in the data for linear regressions. I have the following in my script:

bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3, data=pavludata)
plot(bp.pavlu)
breakpoints(bp.pavlu)

But I am confused as to how to graph the piecewise functions that make up the curve. I am not even sure if I am using breakpoints correctly. Do I just give it a linear relationship (Na ~ yield), instead of what I have? Is there an easier way to calculate the confidence interval for a non-linear regression?

I am new to R (as I've read in many questions), but I have most certainly tried many things and am just getting frustrated with the lack of examples for what I'd like to do with my data... I'd appreciate any insight. I can also provide more information if I am not clear. Thanks in advance.

Julian
Re: [R] breakpoints and nonlinear regression
Hi, Julian-

I'm not sure if this will be what you want, but you could start by taking a look at:

?predict.nls

Ken
Re: [R] breakpoints and nonlinear regression
Sorry, that wasn't too helpful... I see that the interval and se.fit arguments are currently ignored.
Re: [R] boxplot with diamond shape
On Jan 17, 2012, at 10:22 AM, csrabak wrote:
> On 16/1/2012 08:07, David martin wrote:
>> Hi,
>> I haven't found in R a possibility to draw a boxplot with a diamond shape (means and CI).
>
> David,
>
> Perhaps I am prejudiced, as I cannot see any advantage in the diamond shape for displaying just two dimensions, but I would recommend you check whether plotCI from the plotrix package, or even better plotmeans from the gplots package (which does it more automatically), suffices for your needs.

Along the same lines of criticizing the premise of changing boxes to diamonds, I thought that a violin plot might be more informative. The lattice implementation is quite good.

--
David Winsemius, MD
West Hartford, CT
[R] unable to find an inherited method for function make.db.names, for signature character, missing
Hi!

I am new to R and I am using RStudio on Linux. Have I missed some library()? And if so, does anyone have the time to write which one? I am trying to create some PostgreSQL tables from comma-separated files. I am a bit surprised that character is a problem. The missing value could be NULL.

Thanks
Poul
Re: [R] Reference for dataset colon (package survival)
On Jan 17, 2012, at 10:48 AM, Matthias Gondan wrote:
> Dear R team, dear Prof. Therneau,
>
> library(survival)
> data(colon)
> ?colon
>
> gives me only a very rudimentary source (only a name). Is there a possibility to get a reference to the clinical trial these data are taken from?

Wouldn't this seem to be the most promising place to look?

http://www.ncbi.nlm.nih.gov/pubmed?term=lin%20d[au]%205-fu%20levamisole%20colon%20cancer

--
David Winsemius, MD
West Hartford, CT
Re: [R] breakpoints and nonlinear regression
Hi Ken,

Thanks for that advice. I took a brief look at it. I already have my curve, drawn with the curve() function using the parameters a and b given by nls. Would se.fit and interval have computed the CI? Maybe where I'm confused is how I can break up my curve into pieces of linear regressions, and then compute CIs from there?

Thanks.
Julian
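One alternative to piecewise linear fits, not proposed in the thread, is an approximate pointwise confidence band from the delta method: the standard error of the fitted curve is obtained from the gradient of the model with respect to its parameters and vcov() of the nls fit. The sketch below is an illustration only. The model y = a*x/(b + x), the simulated data and the starting values are all assumptions, since the poster's function f(yield, a, b) is not shown.

```r
set.seed(1)
# Simulated data from an assumed saturating curve
x <- seq(1, 20, length.out = 40)
y <- 10 * x / (3 + x) + rnorm(40, sd = 0.3)

fit <- nls(y ~ a * x / (b + x), start = list(a = 8, b = 2))
est <- coef(fit)

# Gradient of the fitted curve with respect to (a, b) at each x
G <- cbind(
  a = x / (est["b"] + x),
  b = -est["a"] * x / (est["b"] + x)^2
)

# Delta method: variance of the fitted value is g' V g for each row g of G
se <- sqrt(rowSums((G %*% vcov(fit)) * G))

# Approximate 95% pointwise band around the fitted curve
tcrit <- qt(0.975, df.residual(fit))
band <- cbind(
  fit   = predict(fit),
  lower = predict(fit) - tcrit * se,
  upper = predict(fit) + tcrit * se
)
```

The band relies on a linear approximation and so may be poor for strongly nonlinear models or small samples; lines(x, band[, "lower"]) and lines(x, band[, "upper"]) would add it to an existing plot.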
[R] Prediciting sports team scores
I am working on predicting the scores for a day's worth of matches of team sports. I have already collected data for the teams for the season we are concentrating on. I have been fitting Poisson models for football games and have worked out which model is best and which predictor variables are most important. We would now like to predict the probability distribution for the scores for each team, e.g. what is the probability of Manchester United vs Chelsea ending 1-1?
[R] An unsubsettable object in a mixed model
I am having problems using the lme command to fit mixed models. I have a data set similar to longitudinal data, except the hypothesised correlation is between observations taken from different individuals in the same family rather than from the same individual at different times. As soon as I try to specify any correlation structure other than independent, I get the error message: Error in x$formula : object of type 'closure' is not subsettable. Have you any idea what R is objecting to and how I might fix it, and if not, can you suggest any other way I can get R to fit a linear mixed model? The data set is VERY unbalanced with almost 2/3 of the families being single individuals, so maybe R doesn't like being told to find correlation in univariate data subsets. ANY advice is welcome!
Re: [R] breakpoints and nonlinear regression
On Tue, Jan 17, 2012 at 8:06 AM, Kenneth Frost kfr...@wisc.edu wrote: Sorry, that wasn't to helpful...I see that the intervals and se.fit argument are currently ignored. Yes, because the fitted values are nonlinear in the parameters, which makes finding exact confidence regions impossible. I think the usual approach (subject to correction by experts) is to use a delta method approximation for the fitted variances from the varcov matrix of the parameters at the converged optimum (itself an approximation) and then a standard t-interval based on that. However, this approximation can be quite bad, because degrees of freedom don't mean much for nonlinear models -- in fact, that's the essential (and huge!) difference between linear and nonlinear models -- and the likelihood surface may not be close enough to quadratic. So one may do better with, e.g. a bootstrap approximation, although this can be problematic, too, due to convergence and other issues. What I think can be said with some certainty is that the idea of approximating by a segmented regression and then using CI's for each linear part in the usual way is a particularly bad one -- the CI's will be underestimated because they don't take into account the uncertainty in the location of the fitted breakpoints, which are nonlinear **and** non-smooth functions of the data. So if confidence intervals for the fitted values are really important, I suggest that Julian work with his local statistician to come up with the best approach for his particular situation. It's tricky. Cheers, Bert On 01/17/12, crimsonengineer87 julianjonre...@gmail.com wrote: Dear Forum, I have been wracking my head over this problem for the past few days. I have a dataset of (x,y). I have been able to obtain a nonlinear regression line using nls. However, we would like to do some statistical analysis. I would like to obtain a confidence interval for the curve. 
We thought we could divide up the curve into piecewise linear regressions and compute CIs from those portions. There is a package called strucchange that seems helpful, but I am thoroughly confused. 'breakpoints' is used to calculate the number of breaks in the data for linear regressions. I have the following in my script:

  bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3, data=pavludata)
  plot(bp.pavlu)
  breakpoints(bp.pavlu)

But I am confused as to how to graph the piecewise functions that make up the curve. I am not even sure if I am using breakpoints correctly. Do I just give it a linear relationship (Na ~ yield), instead of what I have? Is there an easier way to calculate the confidence interval for a non-linear regression? I am new to R (as I've read in many questions), but I have most certainly tried many things and am just getting frustrated with the lack of examples for what I'd like to do with my data... I'd appreciate any insight. I can also provide more information if I am not clear. Thanks in advance. Julian
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
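As a rough illustration of the delta-method approximation described in the reply above, here is a minimal sketch in base R. The model form, the simulated data, and the starting values are all invented for the example (they are not Julian's model or data), and the caveats in the reply about how poor this approximation can be still apply.

```r
# Simulated data from an assumed saturating curve y = a * (1 - exp(-b * x))
set.seed(1)
x <- seq(1, 10, length.out = 40)
y <- 5 * (1 - exp(-0.4 * x)) + rnorm(length(x), sd = 0.2)

fit <- nls(y ~ a * (1 - exp(-b * x)), start = list(a = 4, b = 0.5))
a_hat <- coef(fit)[["a"]]
b_hat <- coef(fit)[["b"]]

# Gradient of the fitted curve with respect to (a, b), one row per x value
grad <- cbind(a = 1 - exp(-b_hat * x),
              b = a_hat * x * exp(-b_hat * x))

# Delta method: var(fitted) is approximately diag(grad %*% vcov %*% t(grad))
se_fit <- sqrt(rowSums((grad %*% vcov(fit)) * grad))

# Approximate pointwise 95% interval using a t quantile
tcrit <- qt(0.975, df.residual(fit))
lower <- fitted(fit) - tcrit * se_fit
upper <- fitted(fit) + tcrit * se_fit
```

The interval is pointwise, not simultaneous, and rests on the quadratic approximation to the likelihood at the converged estimates.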
Re: [R] Prediciting sports team scores
On Jan 17, 2012, at 10:55 AM, kerry1912 wrote: I am working on predicting the scores for a day's worth of matches of team sports. I have already collected data for the teams for the season we are concentrating on. I have been fitting Poisson models for football games and have worked out which model is best and which predictor variables are most important. We would now like to predict the probability distribution for the scores for each team, e.g. what is the probability of Manchester United vs Chelsea ending 1-1? This certainly sounds like homework. Please read the relevant section in the Posting Guide. David Winsemius, MD West Hartford, CT
Re: [R] Prediciting sports team scores
Robin Lock at St Lawrence has done this for hockey, see http://it.stlawu.edu/~chodr/faq.html As I recall, he has a Poisson regression model with parameters for offense and defense, and perhaps home 'field' advantage. I confess I am skeptical that this is the right approach for football: teams adjust their strategy and tactics as a function of the opponent and the current match score. Teams are trying to maximize the probability of getting a result, not the probability of scoring goals. The Poisson model corresponds to a constant rate for scoring. albyn Quoting kerry1912 kerry1...@hotmail.com: I am working on predicting the scores for a day's worth of matches of team sports. I have already collected data for the teams for the season we are concentrating on. I have been fitting Poisson models for football games and have worked out which model is best and which predictor variables are most important. We would now like to predict the probability distribution for the scores for each team, e.g. what is the probability of Manchester United vs Chelsea ending 1-1?
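For what it is worth, once a Poisson model supplies an expected goal count for each side, a scoreline probability can be sketched as below. This assumes the two teams' goal counts are independent Poisson variables, which is exactly the assumption questioned above; the two rates are made-up numbers standing in for model predictions.

```r
# Hypothetical expected goals, e.g. from predict(fit, type = "response")
lambda_home <- 1.6
lambda_away <- 1.1

# Joint distribution of (home goals, away goals) under independence
goals <- 0:10
score_prob <- outer(dpois(goals, lambda_home), dpois(goals, lambda_away))
dimnames(score_prob) <- list(home = goals, away = goals)

p_1_1  <- score_prob["1", "1"]      # probability of a 1-1 result
p_draw <- sum(diag(score_prob))     # probability of any draw (up to 10-10)
```

Truncating at 10 goals per side leaves a negligible amount of probability outside the table for rates of this size.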
Re: [R] ggplot2 stacked bar - sum of values rather than count
On Mon, Jan 16, 2012 at 1:13 AM, Paul p...@paulhurley.co.uk wrote: On 16/01/12 02:08, J Toll wrote: Hi, I'm trying to create a stacked bar plot using ggplot2. Rather than plotting the count of each of the 13 Bar factors on the Y axis, I would like to represent the sum of the Values associated with each of the 13 Bar factors. Is there a way to do that? Given the following data, that would obviously mean that there would be some negative sums represented. Here's a bit of example data along with the command I've been using.

  library(ggplot2)
  x
           Value Bar Segment
  1   1.10020075   1       1
  2  -1.37734577   2       1
  3   2.50702876   3       1
  4   0.58737028   3       2
  5   0.21106851   3       3
  6  -2.50119261   4       1
  7   1.34984831   5       1
  8  -0.27556149   6       1
  9  -1.54401647   6       2
  10 -2.75975562   6       3
  11 -0.09527123   6       4
  12  1.36331646   7       1
  13 -0.36051429   8       1
  14  1.36790999   9       1
  15  0.15064633   9       2
  16  0.34022421   9       3
  17 -0.64512970  10       1
  18  0.83268199  11       1
  19 -1.50117728  12       1
  20  1.09004959  13       1

  qplot(factor(Bar), data = x, geom = "bar", fill = factor(Segment))

Thanks for any suggestions you might have. James

I'm not at my usual computer, but in ggplot2 the geom_bar geom is designed for stats, so by default it reports the count. You can change the stat method (i.e. to "identity" to just use the values in the column) but the easiest (IIRC) is to give a weight:

  # Gives count
  qplot(color, data=diamonds, geom="bar")
  # Gives sum of carat variable
  qplot(color, data=diamonds, geom="bar", weight=carat, ylab="carat")
  # Just gives raw values from meanprice column
  qplot(cut, meanprice, geom="bar", stat="identity")

Check out the ggplot2 help page (http://had.co.nz/ggplot2/geom_bar.html) for more info. Regards, Paul.

Paul, Thank you for the help.
Using your second example, I added weight = Value to my previous command to get the chart I wanted.

  qplot(factor(Bar), data=x, geom="bar", weight = Value, fill=factor(Segment))

ggplot2 issues a warning message with my data because it has negative values: Warning message: Stacking not well defined when ymin != 0 But that's not a real concern. Anyway, thanks again for your help. James
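To check what the weighted bars represent, the sums can be computed directly in base R, without ggplot2. The sketch below re-enters the example data from the thread; the object names bar_totals and by_segment are only illustrative.

```r
# Example data as posted in the thread
x <- data.frame(
  Value = c(1.10020075, -1.37734577, 2.50702876, 0.58737028, 0.21106851,
            -2.50119261, 1.34984831, -0.27556149, -1.54401647, -2.75975562,
            -0.09527123, 1.36331646, -0.36051429, 1.36790999, 0.15064633,
            0.34022421, -0.64512970, 0.83268199, -1.50117728, 1.09004959),
  Bar     = c(1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 6, 7, 8, 9, 9, 9, 10, 11, 12, 13),
  Segment = c(1, 1, 1, 2, 3, 1, 1, 1, 2, 3, 4, 1, 1, 1, 2, 3, 1, 1, 1, 1)
)

# Total of Value within each Bar: the overall height of each stacked bar
bar_totals <- tapply(x$Value, x$Bar, sum)

# Sum of Value for each Bar/Segment combination: the individual stacked pieces
by_segment <- xtabs(Value ~ Bar + Segment, data = x)
```

Bars such as 6, where every segment is negative, are the ones that trigger the stacking warning quoted above.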
Re: [R] net classification improvement?
On 01/17/2012 07:16 AM, Essers, Jonah wrote: Thanks for the reply. I think more the issue is whether it can be applied to cross-sectional data. This I'm not sure. This method is heavily cited in the New England Journal of Medicine, but thus far I've only seen it used with longitudinal data. As I recall, the Pencina et al paper does not suggest it cannot be used outside of longitudinal data. In fact, I don't remember them using longitudinal data at all. So, unless I'm misunderstanding your question, I think the function in Hmisc (whose name I always forget) should be fine. On 1/16/12 10:23 PM, Kevin E. Thorpe kevin.tho...@utoronto.ca wrote: On 01/16/2012 08:10 PM, Essers, Jonah wrote: Greetings, I have generated several ROC curves and would like to compare the AUCs. The data are cross-sectional and the outcomes are binary. I am testing which of several models provides the best discrimination. Would it be most appropriate to report AUC with 95% CIs? I have been looking into the net reclassification improvement (see below for reference) but thus far I can only find a version in the Hmisc package which requires survival data. Any idea what the best approach is for cross-sectional data? I believe that the function in Hmisc that does this will also work on binary data. Thanks Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 2008;27:157-172 -- Kevin E. Thorpe Biostatistician/Trialist, Applied Health Research Centre (AHRC) Li Ka Shing Knowledge Institute of St. Michael's Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016
Re: [R] net classification improvement?
Actually, I don't think I made myself clear, and I wrote this late last night. Sorry. More the issue is that the raw model predictions (from 0 to 1) have no inherent clinical value to them, i.e. they aren't risk of disease or risk of outcome. They are raw scores that are specific to each model and are meant to discriminate one disease from another disease. Trying to compare models is impossible because the NRI requires cutoff values, and the cutoffs are different for each model. So, as I've done more reading, it appears that the IDI (Integrated Discrimination Improvement index), which does not depend on cutoff values, may be more what I'm looking for. Does this make sense? I guess I just need a sanity check. I have been toying with the PredictABEL package and this seems to like my data inputs just fine and relies on Hmisc and ROCR, both packages I know well. Thanks jonah
[R] Which date format to choose?
R offers a bewildering array of options when it comes to representing dates and times (e.g, as.Date, chron, strptime, zoo, etc). Can anybody recommend a document that compares the relative merit of each method? I'm not looking for help with any one method, but rather a guide that describes which method is best for a particular data analysis/plotting goal. Thanks, Jake
Re: [R] net classification improvement?
On 01/17/2012 11:55 AM, Essers, Jonah wrote: Actually, I don't think I made myself clear, and I wrote this late last night. Sorry. More the issue is that the raw model predictions (from 0 to 1) have no inherent clinical value to them, i.e. they aren't risk of disease or risk of outcome. They are raw scores that are specific to each model and are meant to discriminate one disease from another disease. Trying to compare models is impossible because the NRI requires cutoff values, and the cutoffs are different for each model. So, as I've done more reading, it appears that the IDI (Integrated Discrimination Improvement index), which does not depend on cutoff values, may be more what I'm looking for. Does this make sense? I guess I just need a sanity check. Yes, the IDI makes sense to me. I have been toying with the PredictABEL package and this seems to like my data inputs just fine and relies on Hmisc and ROCR, both packages I know well. Thanks jonah -- Kevin E. Thorpe Biostatistician/Trialist, Applied Health Research Centre (AHRC) Li Ka Shing Knowledge Institute of St. Michael's Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016
Re: [R] New PLYR issue
Replying to old messages without including context (particularly old ones) is rather bad netiquette. Thank you for at least providing a reproducible example. Now if you can figure out how to read the documentation we will really make some progress. Further responses below. On Tue, 17 Jan 2012, Gunnar Oehmichen wrote: Hello everyone, I have got the same problem, with the same error message. I wasn't able to draw a comparison between the problems, though the error messages were the same. Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP. The plyr mailing list has not provided any help so far.

  require(plyr)
  c(sample(c(1:100), 50, replace=TRUE)) -> V1

Much better to use <- than -> for clarity of code (spaces and direction of assignment make a difference for readability)

  c(rep( 1:5, 10)) -> f1 #variable to group V1
  data.frame(cbind(V1, f1)) -> DF
  str(DF)
  ddply(DF$V1, DF$f1, "sd")
  ddply(.(DF$V1), .(DF$f1), "sd")
  Error in if (empty(.data)) return(.data) : missing value where TRUE/FALSE needed

Thanks everyone,

If you hand a toothpick to a mechanic you should not be surprised when he tells you he cannot change a tire from your car. You are giving a vector where a data frame is needed, another vector where a name or vector of names is required, and the name of a function where an actual function is needed, and the function is complaining. In the face of such confusion, it is not surprising that people were unable to figure out where to start setting you straight. However, in return for your reproducible example I will give it a go. A basic unifying concept for the plyr package is that the name of the function tells you something about what needs to go in, and what will come out. ddply starts with a d so it expects a data frame as input, and because the second letter is also a d it will yield a data frame result when it is done. Argument 1: DF$V1 is a vector. It happens to be the column named V1 in the data frame DF.
To specify a data frame, don't apply operators to it, just write the name of the data frame DF. Argument 2: This argument tells ddply what the names of the grouping columns are. Do not actually give the grouping columns to ddply (which $ does). While the .() function seems cleaner, I find it clearer to use a vector of strings ... in this case, there is only one grouping column, so I would forego the usual c() concatenator and just give it "f1". Argument 3: This argument is supposed to be a function that will take a data frame (first d) and yield a data frame (second d) for one group of rows. ddply will take care of stacking them as a single data frame for the final result. You have given ddply the name (first error) of a function that takes a vector and returns a scalar (wrong type of function is error two). The correct documentation for all of these arguments can be found by typing ?ddply at the R command line (after you have loaded plyr). It looks like you have been reading the documentation for ?aggregate or ?summaryBy (doBy package) and trying to use that to inform your use of ddply. So the actual call should be:

  ddply(DF, "f1", function(df){data.frame(sdV1=sd(df$V1))})
    f1     sdV1
  1  1 19.93016
  2  2 35.96356
  3  3 33.30349
  4  4 26.62831
  5  5 25.03087

In general, to add more simultaneous calculations, you add more columns to the data frame produced by your function that does the calculations. If you want to give it a function name, don't put it in quotes:

  myfunction <- function(df){
    data.frame(sdV1=sd(df$V1), meanV1=mean(df$V1))
  }
  ddply(DF, "f1", myfunction)
    f1     sdV1 meanV1
  1  1 19.93016   49.1
  2  2 35.96356   45.6
  3  3 33.30349   44.7
  4  4 26.62831   72.2
  5  5 25.03087   30.1

Note that although ddply does a lot for you, it doesn't reproduce all of your calculations on all of the data columns like summaryBy does... you have to explicitly create every calculated column in your function.
--- Jeff Newmiller DCN: jdnew...@dcn.davis.ca.us Research Engineer (Solar/Batteries/Software/Embedded Controllers)
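The "data frame in, one small data frame per group out, stacked at the end" pattern described above can also be sketched in base R, which may help in seeing what ddply does with its third argument. This is only an analogue under assumed names (summarise_piece, pieces, result); it is not how plyr is implemented.

```r
# Same shape of data as in the thread: 50 values, 5 groups
set.seed(42)
DF <- data.frame(V1 = sample(1:100, 50, replace = TRUE),
                 f1 = rep(1:5, 10))

# A function that takes the rows of ONE group (a data frame) and
# returns a one-row data frame of summaries
summarise_piece <- function(df) {
  data.frame(f1 = df$f1[1], sdV1 = sd(df$V1), meanV1 = mean(df$V1))
}

pieces <- split(DF, DF$f1)                          # split by the grouping column
result <- do.call(rbind, lapply(pieces, summarise_piece))  # apply, then stack
```

With plyr loaded, the call ddply(DF, "f1", summarise_piece) should give the same columns, with f1 supplied by ddply itself.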
[R] Using Aggregate() with FUN arguments, which require more than one input variables
Dear all, I am trying to apply the aggregate() function to calculate correlations for subsets of a dataframe. My argument x is supposed to consist of 2 numerical vectors, which represent x and y for the cor() function. The following error results when calling the aggregate function: Error in FUN(X[[1L]], ...) : supply both 'x' and 'y' or a matrix-like 'x'. I think the subsets aggregate puts into cor() are sort of list types and therefore can't be handled by cor(). Can anyone provide me with a solution? Regards, RNoob
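A note on the likely cause: aggregate() applies FUN to each column of x separately within each group, so cor() receives a single vector and complains that y is missing. One possible workaround, sketched below with invented data and column names (group, x, y), is to split the data frame by group and call cor() on the two columns of each piece.

```r
# Invented example data: three groups, two numeric columns
set.seed(3)
dat <- data.frame(group = rep(c("a", "b", "c"), each = 20),
                  x = rnorm(60))
dat$y <- 0.5 * dat$x + rnorm(60)

# Split into one data frame per group, then correlate x and y within each
cors <- sapply(split(dat, dat$group), function(d) cor(d$x, d$y))
```

The result is a named numeric vector with one correlation per group; by() or the plyr function ddply would be alternative routes to the same numbers.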
Re: [R] Which date format to choose?
On 17/01/2012 12:14 PM, Jake Beaulieu wrote: R offers a bewildering array of options when it comes to representing dates and times (e.g, as.Date, chron, strptime, zoo, etc). Can anybody recommend a document that compares the relative merit of each method? I'm not looking for help with any one method, but rather a guide that describes which method is best for a particular data analysis/plotting goal. You could try "R Help Desk: Date and Time Classes in R" by Gabor Grothendieck and Thomas Petzoldt in R News 4(1), 29-32. Here's the link: http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf. Duncan Murdoch
Re: [R] Which date format to choose?
R offers a bewildering array of options when it comes to representing dates and times Yes and no: read www.r-project.org/doc/Rnews/Rnews_2004-1.pdf (the help desk section). Brief summary: 3 major ways to deal with dates/times in R:
i) the Date class from the base distribution -- no time support, but very easy
ii) the chron package -- no support for time zones but can do times of day
iii) POSIXt (in two forms) -- the most general -- handles time zones and daylight saving, but has a few quirks
General rule: use the simplest one you can but no simpler. Where things get more complicated is in the time series object itself: zoo is very general (doesn't actually even require a time object for the index) and its derivative xts is my personal workhorse. I can only speak from a quant-finance perspective but, in that domain, the decision comes down to xts from quantmod (actually standalone but tightly integrated) vs timeDate from Rmetrics. They are both very good -- one is S3 and one is S4 so they have different virtues; I'm not an S4 guy myself so that drives me to the xts choice. xts is pretty much impossible to beat for speed though if that's a factor (it uses POSIXt objects for the index and all sorts of great C routines). If speed isn't a concern, I'd suggest you see whichever one has better support for what you are trying to do and make your decision based on that. Rmetrics is an extensive platform for analysis and econometrics, but the quantmod/quantstrat toolkit is more geared towards a trader (at least, in my impression). Others will hopefully chime in but it's going to be a lot easier if you can say a little more about your problem domain and what sort of analysis you want to run. It also might behoove you to look at the TimeSeries CRAN task view.
Hope this helps, Michael On Tue, Jan 17, 2012 at 12:14 PM, Jake Beaulieu beaulieu.j...@epamail.epa.gov wrote: R offers a bewildering array of options when it comes to representing dates and times (e.g, as.Date, chron, strptime, zoo, etc). Can anybody recommend a document that compares the relative merit of each method? I'm not looking for help with any one method, but rather a guide that describes which method is best for a particular data analysis/plotting goal. Thanks, Jake
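To make the first and third options in the summary above concrete, here is a small base-R sketch contrasting Date (whole days only) with POSIXct (instants with a time zone). The particular dates and zones are arbitrary; chron is left out because it is a separate package.

```r
# Date: calendar days, no time of day
d <- as.Date("2012-01-17")
d_plus_30 <- d + 30                       # arithmetic is in whole days
days_between <- as.numeric(as.Date("2012-03-01") - d)

# POSIXct: an instant in time, stored with an explicit time zone
t_utc <- as.POSIXct("2012-01-17 12:14:00", tz = "UTC")
t_later <- t_utc + 90 * 60                # arithmetic is in seconds
elapsed_hours <- as.numeric(difftime(t_later, t_utc, units = "hours"))

# The same instant displayed in another zone (the instant itself is unchanged)
shown_in_tokyo <- format(t_utc, tz = "Asia/Tokyo", usetz = TRUE)
```

If only dates are needed, Date keeps things simplest; time zones only enter once times of day matter.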
[R] Separate ablines in lattice panels
Searched archives and found some old email threads on the topic, but not exactly what I think I need. Suppose I have a data frame such as tmp.

  tmp <- data.frame(var1 = c(rnorm(1000), rnorm(1000, 1, 1)), var2 = gl(2, 1000))

I'd like a plot similar to the one below, but with an abline of v=0 in the lower panel and v=1 in the upper panel. The code below creates two lines in each panel; I am not quite sure how to separate them by panel.

  densityplot(~ var1|var2, tmp, type = c('g', 'l'), layout = c(1,2),
    panel = function(x, ...){
      panel.densityplot(x, ...)
      panel.abline(v = c(0,1))
    }
  )

Thank you Harold
Re: [R] New PLYR issue
Note that although ddply does a lot for you, it doesn't reproduce all of your calculations on all of the data columns like summaryBy does... you have to explicitly create every calculated column in your function. Well, ddply doesn't, but colwise will. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [R] Separate ablines in lattice panels
?panel.number This tells you what panel you're in and you can use that to determine which line to draw. -- Bert On Tue, Jan 17, 2012 at 9:59 AM, Doran, Harold hdo...@air.org wrote: Searched archives and found some old email threads on the topic, but not exactly what I think I need. Suppose I have a data frame such as tmp.

  tmp <- data.frame(var1 = c(rnorm(1000), rnorm(1000, 1, 1)), var2 = gl(2, 1000))

I'd like a plot similar to the one below, but with an abline of v=0 in the lower panel and v=1 in the upper panel. The code below creates two lines in each panel; I am not quite sure how to separate them by panel.

  densityplot(~ var1|var2, tmp, type = c('g', 'l'), layout = c(1,2),
    panel = function(x, ...){
      panel.densityplot(x, ...)
      panel.abline(v = c(0,1))
    }
  )

Thank you Harold
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
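One way the panel.number() hint above might be applied to Harold's example is sketched here. The vector vlines is an illustrative name, and plot.points = FALSE is used in place of the original type argument simply to keep the sketch minimal.

```r
library(lattice)

set.seed(1)
tmp <- data.frame(var1 = c(rnorm(1000), rnorm(1000, 1, 1)),
                  var2 = gl(2, 1000))

# One reference line per panel, in panel order (panel 1 is the lower one here)
vlines <- c(0, 1)

p <- densityplot(~ var1 | var2, data = tmp, layout = c(1, 2),
                 plot.points = FALSE,
                 panel = function(x, ...) {
                   panel.densityplot(x, ...)
                   # panel.number() is the index of the panel being drawn
                   panel.abline(v = vlines[panel.number()])
                 })
print(p)
```

With more conditioning variables, which.packet() may be the more robust index, since panel order can change with the layout.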
Re: [R] breakpoints and nonlinear regression
On Tue, 17 Jan 2012, crimsonengineer87 wrote:

> Dear Forum, I have been racking my brain over this problem for the past few days. I have a dataset of (x,y). I have been able to obtain a nonlinear regression line using nls. However, we would like to do some statistical analysis. I would like to obtain a confidence interval for the curve. We thought we could divide up the curve into piecewise linear regressions and compute CIs from those portions. There is a package called strucchange that seems helpful, but I am thoroughly confused. 'breakpoints' is used to calculate the number of breaks in the data for linear regressions. I have the following in my script:
>
>     bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3, data=pavludata)
>     plot(bp.pavlu)
>     breakpoints(bp.pavlu)
>
> But I am confused as to how to graph the piecewise functions that make up the curve. I am not even sure if I am using breakpoints correctly. Do I just give it a linear relationship (Na ~ yield), instead of what I have?

breakpoints() can currently handle only regressions that are linear in the parameters. So unless f(., a, b) is either known or can be written as a linear predictor, breakpoints() cannot estimate breaks in the model of interest. If you want to approximate f(., a, b) by a piecewise linear function, then you would use breakpoints(Na ~ yield). The result, however, will typically not be continuous. To see the result, fitted() can be used. See the references in ?breakpoints for some examples. However, I doubt that this is a route worth pursuing given your problem description...

> Is there an easier way to calculate the confidence interval for a non-linear regression?

If you want to use nls(), you could use simulation techniques to obtain confidence intervals. Another possible alternative would be to use a GAM formulation. See e.g. gam() in package mgcv.

hth, Z

> I am new to R (as I've read in many questions), but I have most certainly tried many things and am just getting frustrated with the lack of examples for what I'd like to do with my data... I'd appreciate any insight. I can also provide more information if I am not clear. Thanks in advance. Julian
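The "simulation techniques" suggested above could look roughly like the residual bootstrap below. This is only a sketch under loud assumptions: the poster's pavludata and model f(yield, a, b) are not shown, so the data are simulated and an exponential curve stands in for the real model; the names dat, fit, boot_curves and band are made up for illustration. The resulting band is pointwise and approximate, and it inherits the caveats raised elsewhere in this thread about convergence problems in nonlinear models.

```r
set.seed(123)

# Simulated stand-in data (assumption: the real data and model are not shown)
x <- seq(1, 10, length.out = 40)
y <- 5 * exp(-0.3 * x) + rnorm(length(x), sd = 0.1)
dat <- data.frame(x = x, y = y)

# Fit the nonlinear model once on the observed data
fit <- nls(y ~ a * exp(b * x), data = dat, start = list(a = 4, b = -0.2))
fitted_y <- as.numeric(fitted(fit))
res <- as.numeric(residuals(fit))

# Residual bootstrap: resample residuals, refit, store each fitted curve
B <- 200
boot_curves <- matrix(NA_real_, nrow = B, ncol = nrow(dat))
for (i in seq_len(B)) {
  boot_dat <- data.frame(x = dat$x, y = fitted_y + sample(res, replace = TRUE))
  boot_fit <- try(nls(y ~ a * exp(b * x), data = boot_dat,
                      start = as.list(coef(fit))), silent = TRUE)
  # Refits that fail to converge are skipped and stay NA
  if (!inherits(boot_fit, "try-error")) {
    boot_curves[i, ] <- predict(boot_fit, newdata = dat)
  }
}

# Pointwise 95% percentile band for the fitted curve
band <- apply(boot_curves, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
```

The two rows of band can then be added to a plot of the data with lines(dat$x, band[1, ]) and lines(dat$x, band[2, ]).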
Re: [R] Using Aggregate() with FUN arguments, which require more than one input variables
On 17.01.2012 18:10, RNoob wrote:

> Dear all, I am trying to apply the aggregate() function to calculate correlations for subsets of a data frame. My argument x is supposed to consist of 2 numerical vectors, which represent x and y for the cor() function. The following error results when calling the aggregate function: Error in FUN(X[[1L]], ...) : supply both 'x' and 'y' or a matrix-like 'x'. I think the subsets aggregate puts into cor() are sort of list types and therefore can't be handled by cor().

as.matrix() will probably help, but since you have not provided reproducible code, we cannot show how to change that.

Uwe Ligges

> Can anyone provide me with a solution? Regards, RNoob
Re: [R] breakpoints and nonlinear regression
On Tue, 17 Jan 2012, Bert Gunter wrote:

> On Tue, Jan 17, 2012 at 8:06 AM, Kenneth Frost kfr...@wisc.edu wrote:
>
>> Sorry, that wasn't too helpful... I see that the intervals and se.fit arguments are currently ignored.
>
> Yes, because the fitted values are nonlinear in the parameters, which makes finding exact confidence regions impossible. I think the usual approach (subject to correction by experts) is to use a delta method approximation for the fitted variances from the varcov matrix of the parameters at the converged optimum (itself an approximation) and then a standard t-interval based on that. However, this approximation can be quite bad, because degrees of freedom don't mean much for nonlinear models -- in fact, that's the essential (and huge!) difference between linear and nonlinear models -- and the likelihood surface may not be close enough to quadratic. So one may do better with, e.g., a bootstrap approximation, although this can be problematic, too, due to convergence and other issues.
>
> What I think can be said with some certainty is that the idea of approximating by a segmented regression and then using CIs for each linear part in the usual way is a particularly bad one -- the CIs will be underestimated because they don't take into account the uncertainty in the location of the fitted breakpoints, which are nonlinear **and** non-smooth functions of the data. So if confidence intervals for the fitted values are really important, I suggest that Julian work with his local statistician to come up with the best approach for his particular situation. It's tricky.

I fully agree with Bert that, in this case, segmented regression does not seem to be a fruitful approach and that it's best to consult a local statistician. However, I just wanted to clarify a theoretical detail about what breakpoints() does. The breakpoints converge at the faster rate of n while the parameter estimates just converge with sqrt(n).
This is why in principle, it is possible to get the usual inference from segmented regressions. The price for this is to assume that the true model is in fact a segmented regression (with only breakpoints/coefficients unknown). Hence, segmented regression will be useful (in the Tukey sense) if there are few relatively abrupt changes in a regression relationship. On the other hand, for approximating smooth changes there are typically better techniques available.

Best, Z

> Cheers, Bert
Re: [R] Separate ablines in lattice panels
Thank you, Bert. The help page doesn't have a usage example and I can't seem to find one via Google. Do you, or anyone else, have sample code?

-----Original Message-----
From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Tuesday, January 17, 2012 1:07 PM
To: Doran, Harold
Cc: r-help@r-project.org
Subject: Re: [R] Separate ablines in lattice panels

> ?panel.number This tells you what panel you're in and you can use that to determine which line to draw. -- Bert
Re: [R] An unsubsettable object in a mixed model
Sarah Jervis sj414 at medschl.cam.ac.uk writes:

> I am having problems using the lme command to fit mixed models. I have a data set similar to longitudinal data, except the hypothesised correlation is between observations taken from different individuals in the same family rather than from the same individual at different times. As soon as I try to specify any correlation structure other than independent, I get the error message "Error in x$formula : object of type 'closure' is not subsettable". Have you any idea what R is objecting to and how I might fix it, and if not, can you suggest any other way I can get R to fit a linear mixed model? The data set is VERY unbalanced, with almost 2/3 of the families being single individuals, so maybe R doesn't like being told to find correlation in univariate data subsets. ANY advice is welcome!

It would help if you could provide a reproducible example, or a test case. You're also probably better off posting this to the r-sig-mixed-mo...@r-project.org mailing list. I don't know if lme will choke when a correlation model is fitted to a data set where some groups have only one individual, but you could (e.g.) very easily test if this is the problem by (1) fitting the model with only groups with n>1 and (2) adding one group with n=1 to the data set and seeing if it chokes at that point. However, I can also imagine that you're mis-specifying the command at some point, and from that point of view it would be best to have some more detail about what you're trying to do. See http://tinyurl.com/reproducible-000 ...
[R] bayesian mixed logit
Dear all, I am writing R code to fit a Bayesian mixed logit (BML) via MCMC / MH algorithms following Train (2009, ch. 12). Unfortunately, after many draws the covariance matrix of the correlated random parameters tends to become a matrix with almost perfect correlation, so I think there is a bug in the code I wrote, but I do not seem to be able to find it... dull, I know. Has anybody written code for BML in R and would like to share it with me, or even take a quick look at my code? I would be extremely grateful for any help. Many thanks to everybody!

Carlo
Senior Research Associate Centre for Social and Economic Research on the Global Environment (CSERGE),
Re: [R] Separate ablines in lattice panels
On Jan 17, 2012, at 1:34 PM, Doran, Harold wrote:

> Thank you, Bert. The help page doesn't have a usage example and I can't seem to find one via Google. Do you, or anyone else, have sample code?

It did not seem particularly daring or complex when I tried this (which does appear to produce what was requested):

    tmp <- data.frame(var1 = c(rnorm(1000), rnorm(1000, 1, 1)), var2 = gl(2, 1000))
    densityplot(~ var1 | var2, tmp, type = c('g', 'l'), layout = c(1, 2),
        panel = function(x, ...) {
            panel.densityplot(x, ...)
            panel.abline(v = c(0, 1)[panel.number()])
        })

-- David.

David Winsemius, MD West Hartford, CT
Re: [R] Separate ablines in lattice panels
It does indeed produce what I'm expecting. The input to panel.number seems to require a character string, but here the function is called with no argument. I am not entirely clear _why_ it works, but it does seem to.

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Tuesday, January 17, 2012 1:46 PM
To: Doran, Harold
Cc: Bert Gunter; r-help@r-project.org
Subject: Re: [R] Separate ablines in lattice panels

> It did not seem particularly daring or complex when I tried this (which does appear to produce what was requested): [...] panel.abline(v = c(0,1) [ panel.number() ]) [...] -- David.
Re: [R] Separate ablines in lattice panels
On Jan 17, 2012, at 1:52 PM, Doran, Harold wrote:

> It does indeed produce what I'm expecting. The input to panel.number seems to require a character string, but here the function is called with no argument. I am not entirely clear _why_ it works, but it does seem to.

?panel.number

says that there is a default ... the last object printed, i.e., the one for which you would want its number used as an index into your vector of candidate values.

-- David.

David Winsemius, MD West Hartford, CT
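To make the "why does it work with no argument" point concrete, here is a small sketch, not part of the original thread. As far as I understand the lattice help page, panel.number() and packet.number() take an optional prefix that defaults to the plot currently being drawn, so inside a panel function they can be called bare. The sketch uses packet.number(), which indexes the level of the conditioning variable rather than the drawing order; the names vlines and seen are made up for illustration, and the plot is printed to a null PDF device only so that the panel function actually runs.

```r
library(lattice)

set.seed(1)
tmp <- data.frame(var1 = c(rnorm(1000), rnorm(1000, 1, 1)),
                  var2 = gl(2, 1000))

vlines <- c(0, 1)   # one reference line per level of var2
seen <- c()         # records which packets the panel function visited

p <- densityplot(~ var1 | var2, data = tmp, layout = c(1, 2),
                 plot.points = FALSE,
                 panel = function(x, ...) {
                   panel.densityplot(x, ...)
                   i <- packet.number()    # no argument: refers to the current plot
                   seen <<- c(seen, i)
                   panel.abline(v = vlines[i])
                 })

# The panel function is only called when the trellis object is printed
pdf(NULL)
print(p)
invisible(dev.off())
```

With a single conditioning variable and the default panel order, panel.number() and packet.number() should agree, which is why the version posted in this thread works; they can differ once panels are reordered or subset with index.cond or similar.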
[R] pretty(range(data$z),10) Error
Hello, R-List,

I'm getting error messages when adding geom_density2d() [package ggplot2] (messages translated from German):

    Error in pretty(range(data$z), 10) :
      NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning messages:
    1: Removed 1 rows containing missing values (stat_contour).
    2: In min(x) : no non-missing arguments to min; returning Inf
    3: In max(x) : no non-missing arguments to max; returning -Inf
    4: In min(x) : no non-missing arguments to min; returning Inf
    5: In max(x) : no non-missing arguments to max; returning -Inf

I installed 2.11.1 Patched in the hope of overcoming this obstacle, but in vain. I tried reading SPSS and CSV formats, with no effect.

While this one works:

    ggplot(sp2, aes(x=Qq04, y=Qq01)) + geom_point() + facet_wrap(~ segment, ncol=3)

this one fails:

    ggplot(sp2, aes(x=Qq04, y=Qq01)) + geom_point() + facet_wrap(~ segment, ncol=3) + geom_density2d()

But it only fails for 2 out of 10 x-variables, although they do not differ in structure. The data structure looks like this:

    segment Qq01 Qq02 Qq02a Qq04 Qq05 Qq07 Qq08 Qq10 Qq11 Qq15 Qq17a
    A          7    5     5    5    5    4    5    5    3    7     7
    A          5    4     4    3    4    3    4    4    4    5     6
    B          3    3     3    3    2    3    4    2    4    3     2
    B          6    4     5    3    3    3    4    4    3    6     6
    C          3    3     3    3    3    4    2    3    3    4     2
    C          2    1     4    3    1    4    1    1    3    4     2
    D          5    3     4    3    3    3    4    3    3    6     3
    D          5    3     4    2    2    4    4    3    3    4     4
    E          3    3     3    2    3    4    4    3    2    3     4
    E          7    5     5    5    5    4    4    5    1    7     7
    F          6    5     4    3    3    3    4    3    3    6     6
    F          5    3     4    3    3    4    4    3    3    4     4
    G          4    3     3    3    2    1    3    3    3    4     4
    G          6    5     5    5    5    4    5    5    3    7     7
    H          6    5     5    4    4    4    5    4    3    5     6
    H          7    5     5    5    4    4    5    5    3    7     7
    I          6    5     5    4    4    4    5    4    3    6     6
    I          6    3     4    2    3    4    4    3    4    6     4
    J          5    3     4    1    3    4    4    3    3    5     5
    J          4    2     4    3    2    2    3    2    3    4     4
    K          4    4     4    2    3    4    4    4    3    5     5
    K          6    4     3    3    2    3    4    3    3    6     5
    L          6    4     5    3    3    4    5    4    3    6     6
    L          3    1     2    2    1    3    3    1    3    4     4
    ...

About 1,800 cases (more than 100 per segment). Did anybody experience similar problems?

Thanks for all comments,
Mario
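One possible cause, offered only as a guess because the full data are not shown: the 2d density in ggplot2 is computed with MASS::kde2d(), whose default bandwidth comes from MASS::bandwidth.nrd(). That rule is based on the interquartile range, so for coarse integer ratings it can return exactly zero in a segment where the first and third quartiles coincide, and a zero bandwidth yields NaN densities. The sketch below checks the bandwidth per segment on a made-up data frame; sp2 here is a hypothetical stand-in, not the poster's data.

```r
library(MASS)

# Hypothetical stand-in for 'sp2': segment A has nearly constant ratings
sp2 <- data.frame(
  segment = rep(c("A", "B"), each = 8),
  Qq04    = c(3, 3, 3, 3, 3, 3, 3, 5,   1, 2, 3, 4, 5, 2, 3, 4)
)

# Default kde2d bandwidth per segment; a value of 0 flags a problem facet
bw <- tapply(sp2$Qq04, sp2$segment, MASS::bandwidth.nrd)
print(bw)

# Segments whose default bandwidth collapses to zero
bad_segments <- names(bw)[bw == 0]
```

If a check like this does flag some segments in the real data, options worth trying include jittering the ratings before estimating the density or supplying an explicit bandwidth; whether that matches the error seen here would need to be confirmed on the actual data.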
[R] RTisean generating multivariate surrogates;
I have a question on generating multivariate time series surrogates using the surrogates function in the RTisean library. The surrogate data matrices are always much shorter than the input matrices.

FYI, I'm using:
R version 2.12.2 on Windows XP
RTisean library v 3.0.14
Tisean algorithms v 3.0.13

Creating a surrogate univariate time series returns a time series with the length of the original vector:

    V1 = as.numeric(na.omit(filter(rnorm(103), filter=c(0.25, 0.5, 0.5, 0.25), sides=2)))
    S = surrogates(series=V1); nrow(S)
    [1] 100

However, multivariate surrogates are always shorter than the input matrices:

    V1 = as.numeric(na.omit(filter(rnorm(103), filter=c(0.5, 0.5), sides=2)))
    V2 = as.numeric(na.omit(filter(rnorm(103), filter=c(0.5, 0.5), sides=2)))
    V = cbind(V1, V2)
    S = surrogates(series=V, m=2, c=1:2); dim(S)
    [1] 16 2

One can cheat a little and repeat the datasets, returning a longer output:

    V1 = as.numeric(na.omit(filter(rnorm(103), filter=c(0.5, 0.5), sides=2)))
    V2 = as.numeric(na.omit(filter(rnorm(103), filter=c(0.5, 0.5), sides=2)))
    V = cbind(V1, V2, V1, V2)
    S = surrogates(series=V, m=4, c=1:4); dim(S)
    [1] 25 4

but the limit appears to be ~50% of the length of the input matrix. There is probably a legitimate reason behind this, but I can't find mention of it either in the RTisean documentation or on the TISEAN website. Similarly, examples in the literature from the authors (Schreiber and Schmitz. 2000. Surrogate time series. Physica D 142:346-382) do not appear to suffer from this truncation.

Varying the arguments of the surrogates function does not fix the truncation, and some of the arguments don't seem to be functional:

    surrogates {RTisean}    R Documentation

    surrogates(series, n = 1, i, S = FALSE, I, l, x = 0, m, c)

    Arguments:
    series  a vector or a matrix.
    n       number of surrogates.
    i       number of iterations.
    S       make spectrum exact rather than distribution.
    I       seed for random numbers. (capital i)
    l       number of points. (lowercase L)
    x       number of values to be skipped.
    m       number of columns to be read.
    c       columns to be read.

In particular, varying n changes the processing time but does not change the function output, making me think that this is an issue with the way the R function links to the TISEAN executables. Suggestions from anyone familiar with this are greatly appreciated.

Thanks,
Burton Shank
Re: [R] Display numbers on map
David: That doesn't quite answer the question about Alaska and Hawaii.

Jeffrey: help(state, maps) states:

"This database produces a map of the states of the United States mainland ..."

so you have to use:

    map("world", "USA")
    map("state", add = TRUE)

to get the whole lot (which I am willing to bet is not exactly what is wanted, even though it is what was stated). This is not the end of the story, but if you'll state exactly what you want for Alaska and Hawaii, then perhaps we can supply a solution.

Ray Brownrigg

On Wed, 18 Jan 2012, David Winsemius wrote:

> On Jan 17, 2012, at 5:37 AM, Jeffrey Joh wrote:
>
>> I have a text file with states and numbers. I would like to display each number that corresponds to a state on a map. I am trying to use the maps package, but it doesn't show Alaska or Hawaii. Do you have suggestions on how to do this?
>
> This question suggests you are not yet aware of the search facility built into R:
>
>     RSiteSearch("maps Hawaii Alaska")
>     A search query has been submitted to http://search.r-project.org
>     The results page should open in your browser shortly
>
> The third hit was back to the map function help page. (The examples should be read and executed.)
>
>     ?example
>
>> [[alternative HTML version deleted]]
>
> And the above suggests you still need to read the following: PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
Re: [R] Change state names to abbreviations in an irregular, list of names, abbreviations, null values, and foreign provinces
Thanks! Both solutions work well for the problem that I described, although I failed to mention that there are also pre-abbreviated names in the list that I'm working with, and the second solution returns NAs for these (but there are plenty of ways around this). Sample data might be:

    State.Province <- c("Cusco", "TX", "unknown", "Texas", NA, "Louisiana")

David
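One of the "ways around this" could be sketched as follows, using only the state.name and state.abb vectors that ship with R. This is an illustration rather than either of the solutions referred to above (which are not shown here): full state names are replaced by their abbreviations, and anything that does not match (existing abbreviations, foreign provinces, "unknown", NA) is passed through unchanged. The names idx and abbr are made up for the example.

```r
State.Province <- c("Cusco", "TX", "unknown", "Texas", NA, "Louisiana")

# Position of each entry in the built-in vector of full state names (NA if absent)
idx <- match(State.Province, state.name)

# Use the abbreviation where a full name matched; otherwise keep the original entry
abbr <- ifelse(is.na(idx), State.Province, state.abb[idx])
print(abbr)
```

Note that match() is case-sensitive and exact, so entries such as "texas" or "Texas " would be left as they are unless they are cleaned up first.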
Re: [R] unable to find an inherited method for function make.db.names, for signature character, missing
I think the number of people on this list who understand your question is roughly zero. We cannot see how the subject is related to the body of your function, nor do we see any indication that you followed the posting guide.

Uwe Ligges

On 17.01.2012 17:06, Poul Kristensen wrote:

> Hi! I am new to R and I am using RStudio on Linux. Have I missed some library()? And if so, does anyone have the time to write which? I am trying to create some PostgreSQL tables from comma-separated files. I am a bit surprised that character is a problem. The missing value could be NULL. Thanks, Poul
Re: [R] Using Aggregate() with FUN arguments, which require more than one input variables
Hello, RNoob wrote Dear all, I am trying to apply the aggregate() function to calculate correlations for subsets of a dataframe. My argument x is supposed to consist of 2 numerical vectors, which represent x and y for the cor() function. The following error results when calling the aggregate function: Error in FUN(X[[1L]], ...) : supply both 'x' and 'y' or a matrix-like 'x'. I think the subsets aggregate puts into cor() are sort of list types and therefore can't be handled by cor(). Can anyone provide me with a solution? Regards, RNoob I don't know if I'm understanding it well but it seems you're trying to compute a correlation matrix for each group of a data.frame. The data.frame is divided into groups by one or more factor columns. If this is what you want, try the function below. It doesn't use 'aggregate', it uses 'split' and 'lapply'. cor.groups - function(x, vars){ cols - if(is.character(vars)) names(x) else 1:ncol(x) cols - cols %in% vars cols - cols | sapply(x, is.factor) | sapply(x, is.character) # transform logical to numeric index cols - which(cols) lapply(split(x, x[, vars]), function(grp) cor(grp[, -cols])) } # Sample data N - 100 DF - data.frame(U=as.factor(sample(LETTERS[1:3], N, T)), V=as.factor(sample(0:1, N, T)), W=sample(letters[1:6], N, T), x=1:N, y=sample(10, N, T), z=rnorm(N), stringsAsFactors=FALSE) # And test it. Note the argument 'stringsAsFactors' cor.groups(DF, U) cor.groups(DF, c(U, V)) cor.groups(DF, 1:3) cor.groups(DF, c(U, x)) # look out, right result, wrong function call I hope it helps. (if not, be more explicit) Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/Using-Aggregate-with-FUN-arguments-which-require-more-than-one-input-variables-tp4303936p4304535.html Sent from the R help mailing list archive at Nabble.com. 
Re: [R] unable to find an inherited method for function make.db.names, for signature character, missing
2012/1/17 Uwe Ligges lig...@statistik.tu-dortmund.de:
> I think the amount of people on this list who understand your question is roughly zero.

A Fortunes candidate?

Cheers,
Bert

> We cannot see how the subject is related to the body of your function nor do we see any incidence that you followed the posting guide.
>
> Uwe Ligges
>
> On 17.01.2012 17:06, Poul Kristensen wrote:
>> Hi! I am new to R and I am using RStudio on Linux. Have I missed some library()? And if so, does anyone have the time to write which? I am trying to create some PostgreSQL tables from comma-separated files. I am a bit surprised that character is a problem. The missing value could be NULL.
>> Thanks
>> Poul

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
[R] Using !is.na() in a HAVING clause in sqldf() XXXX
Hi everyone,

I have the following:

    sqldf("select Premie, count(tpounds) N, avg(tpounds) Avg_Weight,
           stddev_samp(tpounds) StdDev
           from children group by Premie having !is.na(Premie)")

sqldf() does not like the !is.na(Premie) specification. How does one exclude a missing group in an aggregated query using sqldf()?

Thanks!
Dan
Re: [R] Using !is.na() in a HAVING clause in sqldf() XXXX
Did you try a where statement?

    where Premie is not null

On Tue, Jan 17, 2012 at 3:03 PM, Dan Abner dan.abne...@gmail.com wrote:
> Hi everyone,
> I have the following:
>     sqldf("select Premie, count(tpounds) N, avg(tpounds) Avg_Weight,
>            stddev_samp(tpounds) StdDev
>            from children group by Premie having !is.na(Premie)")
> sqldf() does not like the !is.na(Premie) specification. How does one exclude a missing group in an aggregated query using sqldf()?
> Thanks!
> Dan

--
Joseph C. Magagnoli
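A minimal sketch of the WHERE suggestion, assuming the sqldf package is installed with its default SQLite backend, where an R NA arrives in the query as SQL NULL. The children data frame here is invented for illustration, and stddev_samp() is left out because it may not be available in every backend.

```r
library(sqldf)

# Hypothetical stand-in for the poster's data; one row has a missing group.
children <- data.frame(
  Premie  = c("Y", "N", NA, "Y", "N"),
  tpounds = c(4, 7.5, 6, 5, 8)
)

# Filter out the NULL group before aggregating.
res <- sqldf("select Premie, count(tpounds) as N, avg(tpounds) as Avg_Weight
              from children
              where Premie is not null
              group by Premie")
res
```

Filtering in WHERE removes the rows before grouping; a HAVING clause with the same SQL condition filters the groups afterwards. R expressions such as !is.na() are not understood inside the query string.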
[R] error when extracting from a data frame
(As a noob to R, this is my first posting - yes yes, groans all around...)

I'm trying to extract certain rows from a data frame. I used the following to import data from a CSV txt file:

    data <- read.table(file="data.txt", header=TRUE)

When I do this, my attempt to extract only the data rows where the Station value equals 1...

    data.station1 <- data[data$Station == 1]

...is giving me the following error message:

    Error in `[.data.frame`(data, data$Station == 1) : undefined columns selected

Bah. If I use names(data) I can see Station as a column name. And if I use str(data), the variable Station is coming up as integers including the value 1. And if I use data$Station, I see all station values, including the 1s. And if I use data[,"Station"] I do see all the Station values. And if I instead treat the Station values as characters, by using "1", I still get the undefined error.

Could someone please correct me on my syntax? Or advise if perhaps I imported the data the wrong way? I'm working out of A Beginner's Guide to R and also looked through the R manual, and even tried this from a Google search:

    data.station1 <- data[,("Station" == 1) ]

But that gave me an unwanted output:

    data frame with 0 columns and 789 rows

Almost, but not quite. Please help?

Thank you,
- Suzanne
suzanne.mert...@gmail.com 404-337-1533
Re: [R] Using !is.na() in a HAVING clause in sqldf() XXXX
Dan -
Try using

    having Premie not null

instead of

    having !is.na(Premie)

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Tue, 17 Jan 2012, Dan Abner wrote:
> Hi everyone,
> I have the following:
>     sqldf("select Premie, count(tpounds) N, avg(tpounds) Avg_Weight,
>            stddev_samp(tpounds) StdDev
>            from children group by Premie having !is.na(Premie)")
> sqldf() does not like the !is.na(Premie) specification. How does one exclude a missing group in an aggregated query using sqldf()?
> Thanks!
> Dan
Re: [R] problem in import and export
mukul purva wrote on 01/16/2012 01:35:00 AM:
> Hello, I have a problem in R. I want to compute correlations for a data frame of 22810 rows (genes) and 1436 columns (experiments). When I convert this file to csv or xls format, it reads only 1024 columns. And when I compute the correlation, which gives a 22810 x 22810 matrix in the terminal, and then export it to csv or xls with write.table, it gives me only 1024 columns rather than 22810 columns. What should I do? Please help me. What should the syntax for import and export be for this file?

It's difficult to help you without an example of your code. I created a very simple example of a data frame with 1,500 columns and had no difficulty saving it to a csv file and viewing the file with both a text editor and with MS Excel (2007) for Windows.

    df <- data.frame(matrix(1:3000, nrow=2))
    write.csv(df, "C:\\junk.csv", quote=FALSE)

You may also want to investigate the write.matrix() function in the MASS package.

    ?write.matrix

Jean
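One way to check whether R itself is dropping columns is to write the file and read it straight back, without opening it in any other program. The sketch below does this with the same 1,500-column toy data; the temporary file path is an assumption made so that the example does not depend on a C: drive.

```r
# Same toy data as above: 2 rows, 1500 columns.
df <- data.frame(matrix(1:3000, nrow = 2))

# Write to a temporary file and read it back in.
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE, quote = FALSE)
back <- read.csv(path)

# Compare dimensions before and after the round trip.
dim(df)
dim(back)
```

If both calls to dim() agree, the file on disk holds every column, and a cap at 1,024 columns is more likely to come from the program used to view the file than from write.csv() or write.table().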
Re: [R] Display numbers on map
On Jan 17, 2012, at 2:21 PM, Ray Brownrigg wrote:
> David: That doesn't quite answer the question about Alaska and Hawaii.

Agreed. I misinterpreted the text of the help(map) example to mean that HI and AK boundaries could be found in unemp. However, it was also advice to Jeffrey that searching might suggest solutions. I found several citations on Baron's search site, including this one:

http://finzi.psych.upenn.edu/R/Rhelp02/archive/44481.html

This is what that example led me to try (after a bit of digging for the right names to use for Alaska):

    grep("USA:Alas", map("world", plot=FALSE)$names, value=TRUE)
    quartz()  # or whatever is appropriate in your OS to open a new screen device
    map("state", xlim=c(-170,-60), ylim=c(15,85))
    map("world", "Hawaii", add=TRUE)
    map("world", "USA:Alaska", add=TRUE)

--
David.

> Jeffrey:
>     help(state, maps)
> states: This database produces a map of the states of the United States mainland ...
> so you have to use:
>     map("world", "USA")
>     map("state", add=T)
> to get the whole lot (which I am willing to bet is not exactly what is wanted, even though it is what was stated). This is not the end of the story, but if you'll state exactly what you want for Alaska and Hawaii, then perhaps we can supply a solution.
> Ray Brownrigg
>
> On Wed, 18 Jan 2012, David Winsemius wrote:
>> On Jan 17, 2012, at 5:37 AM, Jeffrey Joh wrote:
>>> I have a text file with states and numbers. I would like to display each number that corresponds to a state on a map. I am trying to use the maps package, but it doesn't show Alaska or Hawaii. Do you have suggestions on how to do this?
>> This question suggests you are not yet aware of the search facility built into R:
>>     RSiteSearch("maps Hawaii Alaska")
>>     A search query has been submitted to http://search.r-project.org
>>     The results page should open in your browser shortly
>> The third hit was back to the map function help page. (The examples should be read and executed.)
>>     ?example
>>> [[alternative HTML version deleted]]
>> And the above suggests you still need to read the following:
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

David Winsemius, MD
West Hartford, CT
Re: [R] error when extracting from a data frame
Read the help file on how to extract from a data frame:

    ?"[.data.frame"

Then, try adding a comma inside the brackets.

    data.station1 <- data[data$Station==1, ]

Before the comma, the data$Station==1 identifies what rows to select. After the comma, the lack of specification indicates that all columns should be selected.

Jean

Suzanne.mertens wrote on 01/17/2012 03:17:41 PM:
> (As a noob to R, this is my first posting - yes yes, groans all around...)
> I'm trying to extract certain rows from a data frame. I used the following to import data from a CSV txt file:
>     data <- read.table(file="data.txt", header=TRUE)
> When I do this, my attempt to extract only the data rows where the Station value equals 1...
>     data.station1 <- data[data$Station == 1]
> ...is giving me the following error message:
>     Error in `[.data.frame`(data, data$Station == 1) : undefined columns selected
> Bah. If I use names(data) I can see Station as a column name. And if I use str(data), the variable Station is coming up as integers including the value 1. And if I use data$Station, I see all station values, including the 1s. And if I use data[,"Station"] I do see all the Station values. And if I instead treat the Station values as characters, by using "1", I still get the undefined error.
> Could someone please correct me on my syntax? Or advise if perhaps I imported the data the wrong way? I'm working out of A Beginner's Guide to R and also looked through the R manual, and even tried this from a Google search:
>     data.station1 <- data[,("Station" == 1) ]
> But that gave me an unwanted output:
>     data frame with 0 columns and 789 rows
> Almost, but not quite. Please help?
> Thank you,
> - Suzanne
> suzanne.mert...@gmail.com 404-337-1533
Re: [R] bayesian mixed logit
Carlo Fezzi (ENV) <C.Fezzi at uea.ac.uk> writes:
> Dear all,
> I am writing an R code to fit a Bayesian mixed logit (BML) via MCMC / MH algorithms following Train (2009, ch. 12). Unfortunately, after many draws the covariance matrix of the correlated random parameters tends to become a matrix with almost perfect correlation, so I think there is a bug in the code I wrote but I do not seem to be able to find it. Dull, I know. Has anybody written a code for BML with R and would like to share it with me or even take a quick look at my code? I would be extremely grateful for any help.

(1) This is maybe better asked at r-sig-mixed-mod...@r-project.org.

(2) Are you trying this on real or on simulated data? The collapse of the covariance matrix in this way is a very common symptom of overfitting/underidentification in mixed models. I wouldn't say it necessarily constitutes a bug in your code. In principle you should be able to get an uncorrelated answer if you use a big enough, sufficiently well-behaved simulated data set, but not necessarily for real data ...

(3) Have you tried the MCMCglmm package, which is a very fast and flexible MCMC-based approach to GLMMs?
Re: [R] Mean of simulation runs given in a table
Thank you, Uwe, for your help!

I have more measurements (m1, m2) and more parameters (par1, par2). I can calculate the means of m1 and m2 this way:

    aggregate(cbind(m1, m2) ~ par1 + par2, dat, mean)

However, I also need to calculate the standard error of the mean and the variance for the sample, and I would like to have them output as extra columns next to the column with means.

Again, I would appreciate any help!

On 17.01.2012 15:09, Uwe Ligges wrote:
> On 17.01.2012 12:31, Irek Szczesniak wrote:
>> Hi,
>> I have the simulation results of the following structure:
>>     run par measured
>>       1  10       12
>>       2  10       14
>>       1  20       20
>>       2  20       26
>> Where run is the simulation run number, par is the parameter of the simulation, and measured is the value measured in the simulation. This is only a simple example of my results. There are many values measured and many parameters. But the basic structure stays the same: there are many runs (identified by the run number) for the same values of the parameters with various measured values -- they constitute a sample.
>> I would like to calculate the mean of the measured value for a sample, and so I would like to obtain the output as follows:
>>     par mean
>>      10   13
>>      20   23
>> I would appreciate it if someone could write me how to do it.
> For your data in a data.frame called dat:
>     aggregate(measured ~ par, dat, mean)
> Uwe Ligges
>> Thank you,
>> Irek

--
Ireneusz (Irek) Szczesniak
http://www.irkos.org
Re: [R] breakpoints and nonlinear regression
Thanks for the comments everyone. I was hoping to not have to find someone in the stats department ... well, we'll see.

So in response to Z's comment ... I have tried breakpoints(Na ~ yield) and I did expect to get something continuous. The idea was to get two or three linear functions making up the curve. And then from there, get a CI from these lines. Of course, it wouldn't be good. (This is coming from a non-stats guy ... I'm a civil engineer by degree and am now learning to be a modeler as a grad student!)

Do you know of any more examples of breakpoints? The examples in the references are great, but I can't seem to get it right.

Thanks again.
Re: [R] error when extracting from a data frame
On Jan 17, 2012, at 4:56 PM, Jean V Adams wrote:
> Read the help file on how to extract from a data frame:
>     ?"[.data.frame"
> Then, try adding a comma inside the brackets.
>     data.station1 <- data[data$Station==1, ]
> Before the comma, the data$Station==1 identifies what rows to select. After the comma, the lack of specification indicates that all columns should be selected.

And if you want to avoid getting all the rows where data$Station is NA, then use either of these alternatives, which give what I expect and generally want to see as a result:

    data.station1 <- subset( data, Station==1 )
    data.station1 <- data[ which(data$Station==1) , ]

--
David.

> Jean
>
> Suzanne.mertens wrote on 01/17/2012 03:17:41 PM:
>> (As a noob to R, this is my first posting - yes yes, groans all around...)
>> I'm trying to extract certain rows from a data frame. I used the following to import data from a CSV txt file:
>>     data <- read.table(file="data.txt", header=TRUE)
>> When I do this, my attempt to extract only the data rows where the Station value equals 1...
>>     data.station1 <- data[data$Station == 1]
>> ...is giving me the following error message:
>>     Error in `[.data.frame`(data, data$Station == 1) : undefined columns selected
>> Bah. If I use names(data) I can see Station as a column name. And if I use str(data), the variable Station is coming up as integers including the value 1. And if I use data$Station, I see all station values, including the 1s. And if I use data[,"Station"] I do see all the Station values. And if I instead treat the Station values as characters, by using "1", I still get the undefined error.
>> Could someone please correct me on my syntax? Or advise if perhaps I imported the data the wrong way? I'm working out of A Beginner's Guide to R and also looked through the R manual, and even tried this from a Google search:
>>     data.station1 <- data[,("Station" == 1) ]
>> But that gave me an unwanted output:
>>     data frame with 0 columns and 789 rows
>> Almost, but not quite. Please help?
>> Thank you,
>> - Suzanne
>> suzanne.mert...@gmail.com 404-337-1533

David Winsemius, MD
West Hartford, CT
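The difference between the three forms of row selection discussed in this thread only shows up when the column contains NA. A small sketch on made-up data (the data frame dat and its value column are hypothetical, not the poster's file):

```r
# Hypothetical data with one missing station number.
dat <- data.frame(Station = c(1L, 2L, NA, 1L),
                  value   = c(10, 20, 30, 40))

# Logical index: the NA comparison yields an extra row filled with NA.
by_logical <- dat[dat$Station == 1, ]

# which() keeps only the positions where the comparison is TRUE.
by_which <- dat[which(dat$Station == 1), ]

# subset() also drops rows where the condition is NA.
by_subset <- subset(dat, Station == 1)

nrow(by_logical)
nrow(by_which)
nrow(by_subset)
```

Leaving out the comma, as in dat[dat$Station == 1], asks for columns rather than rows, which is where the "undefined columns selected" error comes from.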
Re: [R] Mean of simulation runs given in a table
Try using the ddply() function in the plyr package. E.g.,

    z <- data.frame( # your toy dataset
        run = c(1, 2, 1, 2),
        par = c(10, 10, 20, 20),
        measured = c(12, 14, 20, 26))
    library(plyr)
    ddply(z, .(par), summarize, meanMeasured=mean(measured), sdMeasured=sd(measured))
      par meanMeasured sdMeasured
    1  10           13   1.414214
    2  20           23   4.242641

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ireneusz Szczesniak
Sent: Tuesday, January 17, 2012 2:43 PM
To: r-help@r-project.org
Subject: Re: [R] Mean of simulation runs given in a table

> Thank you, Uwe, for your help!
> I have more measurements (m1, m2) and more parameters (par1, par2). I can calculate the means of m1 and m2 this way:
>     aggregate(cbind(m1, m2) ~ par1 + par2, dat, mean)
> However, I also need to calculate the standard error of the mean and the variance for the sample, and I would like to have them output as extra columns next to the column with means.
> Again, I would appreciate any help!
>
> On 17.01.2012 15:09, Uwe Ligges wrote:
>> On 17.01.2012 12:31, Irek Szczesniak wrote:
>>> Hi,
>>> I have the simulation results of the following structure:
>>>     run par measured
>>>       1  10       12
>>>       2  10       14
>>>       1  20       20
>>>       2  20       26
>>> Where run is the simulation run number, par is the parameter of the simulation, and measured is the value measured in the simulation. This is only a simple example of my results. There are many values measured and many parameters. But the basic structure stays the same: there are many runs (identified by the run number) for the same values of the parameters with various measured values -- they constitute a sample.
>>> I would like to calculate the mean of the measured value for a sample, and so I would like to obtain the output as follows:
>>>     par mean
>>>      10   13
>>>      20   23
>>> I would appreciate it if someone could write me how to do it.
>> For your data in a data.frame called dat:
>>     aggregate(measured ~ par, dat, mean)
>> Uwe Ligges
>>> Thank you,
>>> Irek
>
> --
> Ireneusz (Irek) Szczesniak
> http://www.irkos.org
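The request for mean, variance and standard error as adjacent columns can also be sketched in base R, without plyr. This is only one possible approach: the helper name stats is made up, and the standard error is taken as sd/sqrt(n), which assumes the runs are independent.

```r
# The toy data from the thread.
dat <- data.frame(run = c(1, 2, 1, 2),
                  par = c(10, 10, 20, 20),
                  measured = c(12, 14, 20, 26))

# Hypothetical helper returning several named statistics at once.
stats <- function(v) c(mean = mean(v),
                       var  = var(v),
                       sem  = sd(v) / sqrt(length(v)))

# aggregate() stores the three statistics in a single matrix column ...
res <- aggregate(measured ~ par, dat, stats)

# ... which do.call(data.frame, ...) spreads into ordinary columns.
res <- do.call(data.frame, res)
res
```

The columns come out as measured.mean, measured.var and measured.sem. With several measurements, cbind(m1, m2) on the left of the formula gives one such set of columns per measurement.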
[R] pscl package and hurdle model marginal effects
This request is related to the following post from last year:

https://stat.ethz.ch/pipermail/r-help/2011-June/279752.html

After reading the thread, the idea is still not clear. I have fitted a model using hurdle() from the pscl package. I am trying to get marginal effects / slopes by multiplying the coefficients by the mean of the marginal effects (I think this is right). To my understanding, this will require a mean for the binary probability model and a mean for the truncated Poisson count model. My guess is that I would use

    mean(predict(MODELNAME, type = "XXX"))

where MODELNAME is the hurdle model and XXX is either "response", "count", or "zero".

Assuming the above is right (correct me if it isn't), my questions are:

1. What XXX gives me the mean of the marginal effects for the binomial probability model?
2. What XXX gives me the mean of the marginal effects for the count model?

Judging from my results, I would guess the answer to question 1 is "count", except max(predict(MODELNAME, type = "count")) returns 4.5 and I expected it to be less than 1. I would also have expected "count" to match up with the truncated Poisson count model. What is the intuition here?

Also, when I try XXX = "prob", I get the following error:

    Error in matrix(NA, nrow = length(mu), ncol = nUnique) : too many elements specified

So maybe there are other problems.
[R] arules killed
Hi,

I recently got a bizarre message when running arules. It just said "Killed" and quit. Does anyone know why this might have happened? I am running R on an AWS quad XL Ubuntu instance. Here is some information, including dataset size and the parameters:

    parameter specification:
       confidence minval smax arem  aval originalSupport      support minlen maxlen
     0.0003581251    0.1    1 none FALSE            TRUE 3.581251e-05      2      4
     target   ext
      rules FALSE

    algorithmic control:
     filter tree heap memopt load sort verbose
        0.1 TRUE TRUE  FALSE TRUE    2    TRUE

    apriori - find association rules with the apriori algorithm
    version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
    set item appearances ...[1712 item(s)] done [0.00s].
    set transactions ...[1712 item(s), 837696 transaction(s)] done [3.99s].
    sorting and recoding items ... [1561 item(s)] done [1.83s].
    creating transaction tree ... done [1.65s].
    checking subsets of size 1 2 3Killed

Thanks,
Patrick McCann
Re: [R] ggplot- using geom_point and geom_line at the same time
Thanks Hadley for your input. The following code works fine now. Thanks again.

    con = textConnection("inputs var1 var2 var3
    100 10 5 2
    1000 20 10 4
    5000 30 15 8
    1 40 20 16
    3 50 25 32")
    data = read.table(con, header=TRUE)
    data
    data = melt(data, id="inputs")
    g <- ggplot(data, aes(x=inputs, value, colour=variable, shape=variable))
    g <- g + geom_line(lwd=0.8)
    g <- g + geom_point()
    g <- g + scale_colour_discrete('my Custom Legend')
    g <- g + scale_shape_discrete("my Custom Legend")
    g

On Tue, Jan 17, 2012 at 10:07 AM, Hadley Wickham had...@rice.edu wrote:
> On Mon, Jan 16, 2012 at 6:05 PM, Mary Kindall mary.kind...@gmail.com wrote:
>> Thanks for the reply. I wanted to have a legend name with spaces. Right now I am using the following code but it produces two legends. I have to use Gimp to cut the redundant legend.
> Your basic problem is that you're using the fill and colour aesthetics, but you only need colour.
> Hadley
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/

--
Mary Kindall
Yorktown Heights, NY USA
Re: [R] howto test a package without installation
> Look at Hadley Wickham's devtools package. It is designed with this sort of thing. That said, it really is not too difficult to install as long as you have a working tool chain (which you will need to test it anyway).

I could not find it with Google and found no devtools on this page: http://had.co.nz/
Can you give more details please?

> R CMD INSTALL /tmp/sitools
> R
> require(sitools)

This seems not to be what I want:

    $ R CMD INSTALL sitools
    * installing to library ‘/usr/local/lib/R/site-library’
    Error: ERROR: no permission to install to directory ‘/usr/local/lib/R/site-library’

R tries to install something in my system. That may confuse my Debian package management.

kind regards,
--
Jonas Stein n...@jonasstein.de
Re: [R] howto test a package without installation
> I don't believe you can. However, you need not install it into a system-wide library directory... your personal library (e.g. /home/jonas/R/x86_64-pc-linux-gnu-library/2.14) should be sufficient.

Finally I created a new test user to install the library locally as you wrote. It works. Thank you.

How can I get my R clean again afterwards to test the next version?

kind regards,
--
Jonas Stein n...@jonasstein.de
Re: [R] howto test a package without installation
http://cran.r-project.org/web/packages/devtools/index.html

Michael

On Tue, Jan 17, 2012 at 6:51 PM, Jonas Stein n...@jonasstein.de wrote:
>> Look at Hadley Wickham's devtools package. It is designed with this sort of thing. That said, it really is not too difficult to install as long as you have a working tool chain (which you will need to test it anyway).
> I could not find it with Google and found no devtools on this page: http://had.co.nz/
> Can you give more details please?
>> R CMD INSTALL /tmp/sitools
>> R
>> require(sitools)
> This seems not to be what I want:
>     $ R CMD INSTALL sitools
>     * installing to library ‘/usr/local/lib/R/site-library’
>     Error: ERROR: no permission to install to directory ‘/usr/local/lib/R/site-library’
> R tries to install something in my system. That may confuse my Debian package management.
> kind regards,
> --
> Jonas Stein n...@jonasstein.de
Re: [R] breakpoints and nonlinear regression
In respect of fitting piecewise linear regressions, have you looked at the segmented package?

cheers,
Rolf Turner

On 18/01/12 04:30, crimsonengineer87 wrote:
> Dear Forum,
> I have been racking my brain over this problem for the past few days. I have a dataset of (x,y). I have been able to obtain a nonlinear regression line using nls. However, we would like to do some statistical analysis. I would like to obtain a confidence interval for the curve. We thought we could divide up the curve into piecewise linear regressions and compute CIs from those portions. There is a package called strucchange that seems helpful, but I am thoroughly confused. 'breakpoints' is used to calculate the number of breaks in the data for linear regressions. I have the following in my script:
>     bp.pavlu <- breakpoints(Na ~ f(yield, a, b), h=0.15, breaks=3, data=pavludata)
>     plot(bp.pavlu)
>     breakpoints(bp.pavlu)
> But I am confused as to how to graph the piecewise functions that make up the curve. I am not even sure if I am using breakpoints correctly. Do I just give it a linear relationship (Na ~ yield), instead of what I have? Is there an easier way to calculate the confidence interval for a non-linear regression?
> I am new to R (as I've read in many questions), but I have most certainly tried many things and am just getting frustrated with the lack of examples for what I'd like to do with my data... I'd appreciate any insight. I can also provide more information if I am not clear. Thanks in advance.
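As a rough illustration of the piecewise-linear idea in base R only: if the breakpoint is taken as known, a continuous two-segment line can be fitted with lm() and a hinge term, and predict() then gives a pointwise confidence band. Everything below is an assumption made for the sketch (simulated data, a breakpoint fixed at 4); the segmented package mentioned above is designed to estimate the breakpoint instead, and this is not its API.

```r
set.seed(42)

# Simulated data: slope 2 up to x = 4, slope 0.5 afterwards, plus noise.
x <- seq(0, 10, length.out = 50)
y <- ifelse(x < 4, 2 * x, 8 + 0.5 * (x - 4)) + rnorm(50, sd = 0.3)

bp <- 4  # breakpoint assumed known for this sketch

# pmax(x - bp, 0) is zero left of the break and linear right of it,
# so the fitted line stays continuous at the breakpoint.
fit <- lm(y ~ x + pmax(x - bp, 0))

# Pointwise 95% confidence band for the fitted mean.
band <- predict(fit, newdata = data.frame(x = x), interval = "confidence")
head(band)
```

The band is conditional on the chosen breakpoint, so it understates the uncertainty when the break location is itself estimated from the data.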
[R] R package dev: how to export constant?
Hi,

I create two constants, kilo and milli, in [1]. These should be available after loading library(sitools).

How should I export them, and what have I done wrong? (Other suggestions for improving the package are welcome too.)

The ready-to-use .tar.gz and the source can be found on GitHub [2,3].

kind regards,

[1] https://github.com/jonasstein/sitools/blob/master/init.R
[2] https://github.com/jonasstein/sitools/downloads
[3] https://github.com/jonasstein/sitools

--
Jonas Stein n...@jonasstein.de
Re: [R] R package dev: how to export constant?
Try adding

    LazyData: yes

to the DESCRIPTION file.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jonas Stein
Sent: Tuesday, January 17, 2012 4:41 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] R package dev: how to export constant?

> Hi,
> I create two constants, kilo and milli, in [1]. These should be available after loading library(sitools).
> How should I export them, and what have I done wrong? (Other suggestions for improving the package are welcome too.)
> The ready-to-use .tar.gz and the source can be found on GitHub [2,3].
> kind regards,
> [1] https://github.com/jonasstein/sitools/blob/master/init.R
> [2] https://github.com/jonasstein/sitools/downloads
> [3] https://github.com/jonasstein/sitools
> --
> Jonas Stein n...@jonasstein.de
Re: [R] howto test a package without installation
Just install the updated library over the old one and start a new R session.

Devtools can be helpful for some things, but when I last looked at it I was having more difficulty with getting documentation right than debugging code, which I can do using normal function development processes, so I went back to the edit-compile-reload cycle to test the library in its final form.

---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Jonas Stein n...@jonasstein.de wrote:
>> I don't believe you can. However, you need not install it into a system-wide library directory... your personal library (e.g. /home/jonas/R/x86_64-pc-linux-gnu-library/2.14) should be sufficient.
> Finally I created a new test user to install the library locally as you wrote. It works. Thank you.
> How can I get my R clean again afterwards to test the next version?
> kind regards,
> --
> Jonas Stein n...@jonasstein.de
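For testing a package without touching the system library or creating a separate user, a throwaway library directory is one option. The sketch below sets one up and removes it again; the directory name testlib and the tarball name in the commented-out lines are hypothetical, and the install step is left commented because it needs the actual package file.

```r
# A private, disposable library directory (name is an assumption).
lib <- file.path(tempdir(), "testlib")
dir.create(lib, showWarnings = FALSE)

# Put it first on the library search path for this session only.
old <- .libPaths()
.libPaths(c(lib, old))
lib_is_first <- basename(.libPaths()[1]) == "testlib"

# Install and load from the private library (hypothetical tarball name):
# install.packages("sitools_x.y.tar.gz", repos = NULL, type = "source", lib = lib)
# library(sitools, lib.loc = lib)

# Clean up: restore the search path and delete the private library.
.libPaths(old)
unlink(lib, recursive = TRUE)
```

From the shell, R CMD INSTALL accepts a --library=DIR option for the same purpose, and deleting that directory (or calling remove.packages() with the lib argument) returns R to a clean state before the next version is tested.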