Re: [R] r package to solve for Nash equilibrium
On 11/12/13 02:49, Dereje Fentie wrote: Is there an r package out there that solves for pure strategy* Nash equilibrium of a two-person game*? A search for Nash equilibrium in r provides a link to the *GNE* package which solves for the Generalized Nash equilibrium. But what I would like to solve is a pure strategy Nash equilibrium. Please be aware that R is spelt with an upper case R!!! I can't answer your question, but you should note two things (of which you may not be aware). (1) There may not *be* a pure strategy Nash equilibrium. (2) There may be more than one Nash equilibrium. If the GNE package finds *all* Nash equilibria for a given game (I don't know whether it does, or if it stops with the first equilibrium it comes to) and *if* there is a pure strategy equilibrium, then this will be amongst the solutions found, and you just have to pick it out. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bar Graph
The serialno represents each individual in the data set.The total count of the serialno will represent the whole sample I want to do a graph to compare the total data (serialno) vs each of the remaining five variables yearly. i.e to show the total data (serialno) vs available data for one of the variables in the above case temp axilla. The above code plot count of temp_axilla yearly,how can I include serialno to be part of the plot? I have used library(ggplot2) library(foreign) utils package On Tue, Nov 12, 2013 at 11:53 AM, jwd j...@surewest.net wrote: On Mon, 11 Nov 2013 18:02:19 +0300 Keniajin Wambui kiang...@gmail.com wrote: I am using R 3.0.2 on a 64 bit machine I have a data set from 1989-2002. The data has four variables serialno, date, admission ward, temperature and bcg scar. serialno admin_ward date_admn bcg_scar temp_axilla yr 70162Ward2 11-Oct-89 y 38.9 1989 70163 Ward111-Oct-91 y 37.2 1991 70164 Ward2 11-Oct-92n 37.3 1992 70165 Ward111-Oct-93y38.9 1993 70166 Ward1 11-Oct-94 y 37.7 1994 70167 Ward1 11-Oct-95 y 40 1995 I want to do a bar graph of total data (serialno) vs *(data of one of the variables) to show the available data vs total data over the years i am using gplot(dta, aes(temp_axilla, fill=admin_ward)) + geom_bar() + facet_grid(. ~ yr, scales = free,margins=F) + geom_histogram(binwidth=300) But can include the serialno which shows the data. how can I achieve this You really need to pay better attention somewhere. There are six variables in your example table for instance, not four. You state you want to do something with the serialno variable, but it is not used in your bit of code. Also, given that serialno appears to be unique in your example, a bar graph would be singularly uninformative. You need to provide a better example of the data, and an actual example of the code you are trying to use, including the libraries you've loaded. jwdougherty __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Mega Six Solutions Web Designer and Research Consultant Kennedy Mwai 25475211786 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Setting x-axis of a plot on each loop
I have a block of code: window-5 start-3 n-1 seq1 - seq(1:40) mat-matrix(seq1,40) while(1+window=length(mat[,1])) { kd-matrix(as.integer(mat[n:(n+window-1),1])) Sys.sleep(0.2) plot(kd,col=blue,xlab=Rohdaten,ylab=values,xlim=c(start+n,start+n+window-1) ) n-n+1 } I have this expectation, that on each loop two x-axis and y-axis are changed and see the values on the plot. but I cant see the value.What should I do to have values too?. If I am changing this my code to plot(kd,col=blue,xlab=Rohdaten,ylab=values) I can see the values but on the x-axis I have no the correct values [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Different output from lm() and lmPerm lmp() if categorical variables are included in the analysis
RTFM!!! :-) The help explicitly says The default contrasts are set internally to (contr.sum, contr.poly) . Set options(contrasts=c(contr.sum,contr.poly)) before your call to lm() and atest will agree with aptest all down the line. cheers, Rolf Turner On 11/08/13 21:35, Agustin Lobo wrote: I've found a problem when using categorical variables in lmp() from package lmPerm According to help(lmp): This function will behave identically to lm() if the following parameters are set: perm=, seq=TRUE, center=FALSE.) But not in the case of including categorical variables: require(lmPerm) set.seed(42) testx1 - rnorm(100,10,5) testx2 - c(rep(a,50),rep(b,50)) testy - 5*testx1 + 3 + runif(100,-20,20) test - data.frame(x1=testx1,x2= testx2,y=testy) atest - lm(y ~ x1*x2,data=test) aptest - lmp(y ~ x1*x2,data=test,perm = , seqs = TRUE, center = FALSE) summary(atest) Call: lm(formula = y ~ x1 * x2, data = test) Residuals: Min 1Q Median 3Q Max -17.1777 -9.5306 -0.9733 7.6840 22.2728 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) -2.0036 3.2488 -0.6170.539 x15.3346 0.2861 18.646 2e-16 *** x2b 2.4952 5.2160 0.4780.633 x1:x2b -0.3833 0.4568 -0.8390.404 summary(aptest) Call: lmp(formula = y ~ x1 * x2, data = test, perm = , seqs = TRUE, center = FALSE) Residuals: Min 1Q Median 3Q Max -17.1777 -9.5306 -0.9733 7.6840 22.2728 Coefficients: Estimate Std. Error t value Pr(|t|) x1 5.1429 0.2284 22.516 2e-16 *** x21 -1.2476 2.6080 -0.4780.633 x1:x21 0.1917 0.2284 0.8390.404 It looks like lmp() is internally coding dummy variables in a different way, so lmp results are for a (named 1 by lmp) while lm results are for b ? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bar Graph
On 11/13/2013 07:20 PM, Keniajin Wambui wrote: The serialno represents each individual in the data set.The total count of the serialno will represent the whole sample I want to do a graph to compare the total data (serialno) vs each of the remaining five variables yearly. i.e to show the total data (serialno) vs available data for one of the variables in the above case temp axilla. The above code plot count of temp_axilla yearly,how can I include serialno to be part of the plot? I have used library(ggplot2) library(foreign) utils package Hi Keniajim, It is not easy to work out what you want. If each serial number is an individual and you want the total number of individuals (dataset is mydata): length(unique(mydata$serialno)) If you want the total number of individuals per year: nind-function(x) return(length(unique(x))) year-as.numeric(format(as.Date(mydata$date_admn,%d-%b-%y),%Y)) by(mydata$serialno,mydata$year,nind) Perhaps this will give you a start. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Generalized Additive Models - gamma against overfitting
Dear listeners, in order to analyze a dataset, which includes more than 30k records, I'm using a general GAMM like y~s(x)+s(y)+z+e with a underlying normal distribution. Unfortunately, I have some problems with over- and underfittings in smooth-functions at the same time. In general there is the possibility to set up a gamma like gamma=1.4 in order to avoid overfitting in the smooth-functions. In my case I would need different gammas for different smooth-functions, because s(x) is overfitted and s(y) underfitted currently. So, is there any possibility to set up a gamma for each smooth-function separately? I really would appreciate your responses! Thanks a lot! Best Mike -- View this message in context: http://r.789695.n4.nabble.com/Generalized-Additive-Models-gamma-against-overfitting-tp4680364.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Negative binomial parameterisation in mgcv
Dear R-help, The negative binomial distribution has several different parameterisations, but I can't seem to figure out what the exact one used in mgcv's negbin family is? negbin() requires a theta argument, but its not clear anywhere in the documentation (that I can find), how this parameter should be interpreted - for example, can it be less than 1? Best wishes, MArk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Negative binomial parameterisation in mgcv
On 13/11/2013 09:38, Mark Payne wrote: Dear R-help, The negative binomial distribution has several different parameterisations, but I can't seem to figure out what the exact one used in mgcv's negbin family is? negbin() requires a theta argument, but its not clear anywhere in the documentation (that I can find), how this parameter should be interpreted - for example, can it be less than 1? I believe it is the same as MASS (the package) explained in MASS (the book). As its help says Best wishes, MArk -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Grid type of sampling in geodata
Hi all, I have spatial field created with grf and I was wondering if I can sample in lines, something that can resemble sampling outdoors at the streets. Random sampling looks to far of what I want to have. there is in geodata package the sample.geodata but this looks like to be random samples. I would like to thank you in advance for your help Regards A. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] making a barplot with table of experimental conditions underneath (preferably ggplot2)
Dear all, my data looks the following: df - data.frame (experiment=c(E1,E2,E3,E4), mean = c(3,4,5,6), stdev=c(0.1,0.1,0.05,0.2), method = c(STD,STD, FP, FP), enzyme =c (T,T/L,T,T/L), denaturation=c(U,U,0.05%RG, 0.1%RG)) I would like to make a bar plot with standard deviation which I solved the following way: x - df$experiment y - df$mean sd - df$stdev df.plot - qplot(x,y,xlab=, ylab=# peptides identified)+ geom_bar(colour=black, fill=darkgrey)+ geom_errorbar(aes(x=x, ymin=y-sd, ymax=y+sd), width=0.25) df.plot However, as the labels for the x-axis (the bars) I do not want the experiment number, as now, but instead a table containing the other columns of my data.frame (method, enzyme, denaturation) with the description in the front and the certain 'value' below the bars. I am looking forward to your suggestions! With best wishes, Nina __ Dr. Nina C. Hubner scientist quantitative proteomics Dep. of Molecular Biology, Radboud University Nijmegen, The Netherlands e-mail: n.hub...@ncmls.ru.nl tel: +31-24-3613655 Visiting address: NCMLS, Dept of Molecular Biology (route 274) Geert Grooteplein 26/28 6525 GA Nijmegen The Netherlands The Radboud University Medical Centre is listed in the Commercial Register of the Chamber of Commerce under file number 41055629. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] issues with calling predict.coxph.penal (survival) inside a function - subset-vector not found. Because of NextMethod?
Hello everyone, I got an issue with calling predict.coxph.penal inside a function. Regarding the context: My original problem is that I wrote a function that uses predict.coxph and survfit(model) to predict a lot of survival-curves using only the basis-curves for the strata (as delivered by survfit(model) ) and then adapts them with the predicted risk-scores. Because there are cases where my new data has strata which didn't exist in the original model I exclude them, using a Boolean vector inside the function. I end up with a call like this: predict (coxph_model, newdata[subscript_vector,] ) This works fine for coxph.model, but when I fit a model with a spline (class coxph.penal), I get an error: Error in `[.data.frame`(newdata, [subscript_vector, ) : object '[subscript_vector ' not found I suppose this is because of NextMethod, but I am not sure how to work around it. I also read a little bit about all those matching-and-frame-issues, But must confess I am not really into it. I attach a reproducible example. Any help or suggestions of work-arounds will be appreciated. Thanks Julian version _ platform x86_64-w64-mingw32 arch x86_64 os mingw32 system x86_64, mingw32 status major 3 minor 0.1 year 2013 month 05 day16 svn rev62743 language R version.string R version 3.0.1 (2013-05-16) nickname Good Sport ##TEST-DATA # Create the simplest test data set test1 - data.frame(time=c(4,3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x=c(0,2,1,1,1,0,0), sex=c(0,0,0,0,1,1,1)) # Fit a stratified model fit1 - coxph(Surv(time, status) ~ x + strata(sex), test1) summary(fit1) #fit stratified wih spline fit2 - coxph(Surv(time, status) ~ pspline(x, df=2) + strata(sex), test1) summary(fit2) #function to predict within predicting_function - function(model, newdata){ subs -vector(mode='logical', length=nrow(newdata)) subs[1:length(subs)]- TRUE ret - predict (model, newdata=newdata[subs,]) return(ret) } predicting_function(fit1, test1) # works predicting_function(fit2,test1) #doesnt work - Error in `[.data.frame`(newdata, subs, ) : object 'subs' not found # probably because of NextMethod # traceback() #12: `[.data.frame`(newdata, subs, ) #11: newdata[subs, ] #10: is.data.frame(data) #9: model.frame.default(data = newdata[subs, ], formula = ~pspline(x, # df = 2) + strata(sex), na.action = function (object, ...) # object) #8: model.frame(data = newdata[subs, ], formula = ~pspline(x, df = 2) + # strata(sex), na.action = function (object, ...) # object) #7: eval(expr, envir, enclos) #6: eval(tcall, parent.frame()) #5: predict.coxph(model, newdata = newdata[subs, ]) #4: NextMethod(predict, object, ...) #3: predict.coxph.penal(model, newdata = newdata[subs, ]) #2: predict(model, newdata = newdata[subs, ]) at #5 #1: predicting_function(fit2, test1) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] issues with calling predict.coxph.penal (survival) inside a function - subset-vector not found. Because of NextMethod?
Works for me: predicting_function(fit2,test1) 1 2 3 4 5 6 7 -1.0481141 0.1495946 0.4492597 0.4492597 0.9982492 -0.4991246 -0.4991246 Best Simon On 13 Nov 2013, at 15:46, julian.bo...@elitepartner.de wrote: Hello everyone, I got an issue with calling predict.coxph.penal inside a function. Regarding the context: My original problem is that I wrote a function that uses predict.coxph and survfit(model) to predict a lot of survival-curves using only the basis-curves for the strata (as delivered by survfit(model) ) and then adapts them with the predicted risk-scores. Because there are cases where my new data has strata which didn't exist in the original model I exclude them, using a Boolean vector inside the function. I end up with a call like this: predict (coxph_model, newdata[subscript_vector,] ) This works fine for coxph.model, but when I fit a model with a spline (class coxph.penal), I get an error: Error in `[.data.frame`(newdata, [subscript_vector, ) : object '[subscript_vector ' not found I suppose this is because of NextMethod, but I am not sure how to work around it. I also read a little bit about all those matching-and-frame-issues, But must confess I am not really into it. I attach a reproducible example. Any help or suggestions of work-arounds will be appreciated. Thanks Julian version _ platform x86_64-w64-mingw32 arch x86_64 os mingw32 system x86_64, mingw32 status major 3 minor 0.1 year 2013 month 05 day16 svn rev62743 language R version.string R version 3.0.1 (2013-05-16) nickname Good Sport ##TEST-DATA # Create the simplest test data set test1 - data.frame(time=c(4,3,1,1,2,2,3), status=c(1,1,1,0,1,1,0), x=c(0,2,1,1,1,0,0), sex=c(0,0,0,0,1,1,1)) # Fit a stratified model fit1 - coxph(Surv(time, status) ~ x + strata(sex), test1) summary(fit1) #fit stratified wih spline fit2 - coxph(Surv(time, status) ~ pspline(x, df=2) + strata(sex), test1) summary(fit2) #function to predict within predicting_function - function(model, newdata){ subs -vector(mode='logical', length=nrow(newdata)) subs[1:length(subs)]- TRUE ret - predict (model, newdata=newdata[subs,]) return(ret) } predicting_function(fit1, test1) # works predicting_function(fit2,test1) #doesnt work - Error in `[.data.frame`(newdata, subs, ) : object 'subs' not found # probably because of NextMethod # traceback() #12: `[.data.frame`(newdata, subs, ) #11: newdata[subs, ] #10: is.data.frame(data) #9: model.frame.default(data = newdata[subs, ], formula = ~pspline(x, # df = 2) + strata(sex), na.action = function (object, ...) # object) #8: model.frame(data = newdata[subs, ], formula = ~pspline(x, df = 2) + # strata(sex), na.action = function (object, ...) # object) #7: eval(expr, envir, enclos) #6: eval(tcall, parent.frame()) #5: predict.coxph(model, newdata = newdata[subs, ]) #4: NextMethod(predict, object, ...) #3: predict.coxph.penal(model, newdata = newdata[subs, ]) #2: predict(model, newdata = newdata[subs, ]) at #5 #1: predicting_function(fit2, test1) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data transformation to list for event occurence
Or, f3 - function (dat1) { i - dat1$Event_Occurence == 1 split(dat1$Week[i], dat1$ID[i]) } in addition to the previously mentioned f1 - function(dat1) { with(dat1,tapply(as.logical(Event_Occurence),ID,FUN=which )) } f2 - function(dat1){ lapply(split(dat1,dat1$ID),function(x) which(!!x[,3])) } Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Tuesday, November 12, 2013 2:13 PM To: R help Subject: Re: [R] Data transformation to list for event occurence Hi Anindya, You may try: dat1 - read.table(text=ID Week Event_Occurence A 1 0 A 2 0 A 3 1 A 4 0 B 1 1 B 2 0 B 3 0 B 4 1,sep=,header=TRUE,stringsAsFactors=FALSE) with(dat1,tapply(as.logical(Event_Occurence),ID,FUN=which )) #or lapply(split(dat1,dat1$ID),function(x) which(!!x[,3])) A.K. On Tuesday, November 12, 2013 4:58 PM, Anindya Sankar Dey anindy...@gmail.com wrote: Hi, Say I have a following data ID Week Event_Occurence A 1 0 A 2 0 A 3 1 A 4 0 B 1 1 B 2 0 B 3 0 B 4 1 that whether an individual experienced an event in a particular week. I wish to create list such as the first element of the list will be a vector listing the week number when the event has occurred for A, followed by that of B. Can you help creating this? -- Anindya Sankar Dey [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Negative binomial parameterisation in mgcv
var(y) = mu + mu^2/theta where E(y) = mu. best, Simon On 13/11/13 10:38, Mark Payne wrote: Dear R-help, The negative binomial distribution has several different parameterisations, but I can't seem to figure out what the exact one used in mgcv's negbin family is? negbin() requires a theta argument, but its not clear anywhere in the documentation (that I can find), how this parameter should be interpreted - for example, can it be less than 1? Best wishes, MArk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK +44 (0)1225 386603 http://people.bath.ac.uk/sw283 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Double Pareto Log Normal Distribution DPLN
Hi I found this paper http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf that models the DPLN distribution as in equation 8. I implemented this in R but cannot get the same curve as in Figure 4. can you please check if my code below is correct: e.g. is the use of pnorm() correct here? ddlpn - function(x){ a=2.8 b=0.01 v=0.45 t=6.5 j - (a * b /(a+b)) norm1-pnorm((log(x)-v-(a*t^2))/t) expo1- a*v+(a^2*t^2/2) z-exp(expo1)*(x^(-a-1))*(norm1) norm2-pnorm((log(x)-v+(b*t^2))/t) expo2- -b*t+(b^2*t^2/2) y- x^(b-1)*exp(expo2)*(1-norm2) # 1-norm is the complementary CDF of N(0,1) j*(z+y) } ** Bander Alzahrani, Teacher Assistant Information Systems Department Faculty of Computing Information Technology King Abdulaziz University * From: d...@vims.edu To: cs_2...@hotmail.com CC: r-h...@stat.math.ethz.ch Subject: Re: [R] Double Pareto Log Normal Distribution Date: Tue, 12 Nov 2013 16:51:22 + http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf is the original ref and has the equations. library(VGAM) for *pareto() and library(stats) for *lnorm() should get you most of the way there. On Nov 12, 2013, at 10:47 AM, b. alzahrani cs_2...@hotmail.com wrote: Hi guys I would like to generate random number Double Pareto Log Normal Distribution (DPLN). does anyone know how to do this in R or if there is any built-in function. Thanks ** Bander * [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. David Forrest d...@vims.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Setting x-axis of a plot on each loop
On Nov 13, 2013, at 12:24 AM, Baro wrote: I have a block of code: window-5 start-3 n-1 seq1 - seq(1:40) mat-matrix(seq1,40) while(1+window=length(mat[,1])) { kd-matrix(as.integer(mat[n:(n+window-1),1])) Sys.sleep(0.2) plot(kd,col=blue,xlab=Rohdaten,ylab=values,xlim=c(start+n,start+n+window-1) ) n-n+1 } I have this expectation, that on each loop two x-axis and y-axis are changed and see the values on the plot. but I cant see the value.What should I do to have values too?. If I am changing this my code to plot(kd,col=blue,xlab=Rohdaten,ylab=values) I can see the values but on the x-axis I have no the correct values You see nothing (except perhaps in the first few plots where I suspect points are being plotted within the plot area) because by offdering only one unnamed numeric argument to plot you implicitly use 1:5 as your x value in all plots. You say you have qn expectation that the and y values are changing but you don't change the x-values in your code. You have not described what you are trying to do so giving further advice is not possible. I take that back; further advice: It's always a good idea to name your arguments. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model?
Hi R Expert Community, My question: What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model? I ran a Random Forest classification model where the classifier is binary. I stored the model in object FOREST_model. I than ran importance(FOREST_model) and FOREST_model$importance. I usually use the prior but decided to learn more about what is in summary(randomForest ) so I ran the latter. I expected both to produce identical output. Mean Decrease Gini is the only thing that is identical in both. I looked at ? Random Forest and Package 'randomForest' documentation and didn't find any info explaining this difference. I am not including a reproducible example because this is most likely something, perhaps simple, such as one is divided by something (if so, what?), that I am just not aware of. importance(FOREST_model) HC TER MeanDecreaseAccuracy MeanDecreaseGini APPT_TYP_CD_LL0.16025157 -0.521041660 0.1567029712.793624 ORG_NAM_LL0.20886631 -0.952057325 0.20208393 107.137049 NEW_DISCIPLINE0.20685079 -0.960719435 0.2007676286.495063 FOREST_model$importance HC TER MeanDecreaseAccuracy MeanDecreaseGini APPT_TYP_CD_LL0.0049473962 -3.727629e-03 0.0045949805 12.793624 ORG_NAM_LL0.0090715845 -2.401016e-02 0.0077298067 107.137049 NEW_DISCIPLINE0.0130672572 -2.656671e-02 0.0114583178 86.495063 Dan Lopez LLNL, HRIM, Workforce Analytics Metrics [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] codenls
The expression has b[1] and b[2] while start has b[2] and b[3]. The expression needs a different form, for example: # fit-nlrob(y ~ x1 / (1+ b[1]*x2^b[2]),data = xx, start = # list(b[2],b[3])) fit-nlrob(y ~ x1 / (1+ b1*x2^b2),data = xx, start = list(b1=b[2],b2=b[3])) This works, though I have no idea if the results make sense. JN On 13-11-13 06:00 AM, r-help-requ...@r-project.org wrote: Message: 6 Date: Tue, 12 Nov 2013 06:03:02 -0800 (PST) From: IZHAK shabsogh ishaqb...@yahoo.com To: r-help@r-project.org r-help@r-project.org Subject: [R] codenls Message-ID: 1384264982.50617.yahoomail...@web142502.mail.bf1.yahoo.com Content-Type: text/plain kindly help correct this program as given below i run and run is given me some error rm(list=ls()) require(stats) require(robustbase) x1-as.matrix(c(5.548,4.896,1.964,3.586,3.824,3.111,3.607,3.557,2.989)) y-as.matrix(c(2.590,3.770,1.270,1.445,3.290,0.930,1.600,1.250,3.450)) x2-as.matrix(c(0.137,2.499,0.419,1.699,0.605,0.677,0.159,1.699,0.340)) k-rep(1,9) x-data.frame(k,x1,x2) xx-data.frame(y,x1,x2) freg-function(y,x1,x2){ reg- lm(y ~ x1 + x2 , data=x) return(reg) } fit-freg(y,x1,x2) b-as.matrix((coef(fit))) f-function(b,x1,x2){ fit-nlrob(y ~ x1 / (1+ b[1]*x2^b[2]),data = xx, start = list(b[2],b[3])) return(fit) } fit1-f(b,x1,x2) Error in nlrob(y ~ x1/(1 + b[1] * x2^b[2]), data = xx, start = list(b[2], : 'start' must be fully named (list or numeric vector) [[alternative HTML version deleted]] -- Message: 7 Date: Tue, 12 Nov 2013 06:34:44 -0800 (PST) From: Alaios ala...@yahoo.com To: R-help@r-project.org R-help@r-project.org Subject: [R] Handle Gps coordinates Message-ID: 1384266884.5102.yahoomail...@web125305.mail.ne1.yahoo.com Content-Type: text/plain Dear all, I would like to ask you if there are any gps libraries. I would like to be able to handle them, -like calculate distances in meters between gps locations, -or find which gps location is closer to a list of gps locations. Is there something like that in R? I would like to tthank you in advance for your reply Regards Alex [[alternative HTML version deleted]] -- Message: 8 Date: Tue, 12 Nov 2013 06:59:53 -0800 From: Baro babak...@gmail.com To: R help r-help@r-project.org Subject: [R] Having a relative x-axis in a plot Message-ID: CAF-JZQupPPoyGfd-1k0ztn7bHmFTxdXUjtMGHub9D3TYLh=2...@mail.gmail.com Content-Type: text/plain I would like to have a relative x-axis in r. I am reading time seris from an excel file and I want to show in a plot and also I want to have a window which moves over the values. My aim ist to see which point belongs to which time(row number in excel file). i.e I am reading from 401th row in 1100th column in excel file and my window length is 256. here is my code: #which column in excel filespalte-1100 #start row and end row start-401end-600 window-256 n-1 wb - loadWorkbook(D:\\MA\\excel_mix_1100.xls) dat -readWorksheet(wb, sheet=getSheets(wb)[1], startRow=start, endRow=end, startCol=spalte, endCol=spalte,header=FALSE) datalist-dat[seq(1, length(dat[,1]), by = 2),] datalist-matrix(datalist)while(1+window=length(datalist)){ kd-matrix(as.integer(datalist[n:(n+window-1),1])) Sys.sleep(0.2) plot(kd,type=l,col=blue,xlab=Rohdaten,ylab=values,xlim=c(start+n,start+n+window-1)) n-n+1} [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ggplot2: geom_boxplot. Mapping aes factor but with different color scale and hatching
Dear all, I have a geom_boxplot where I have the boxes colored according to a factor. The color scale given is very lovely and colorful, but I need it in grayscale. how do I change this? I also am worried that the grayscale may be too similar (black/white is also not an option as in my real data I have much more than the two dummy factors I have put in the dummy data). Therefore, I would like to hatch the boxes also mapped according to the factors (see script and data below). How can I accomplish this? Also, I have a problem that I though would be easy to solve over Stack Overflow, but turn out to be more complex. i would like to have one of my legend titles in italics. Sounds basic, but can't solve it somehow. I have marked in the script below where this is. Many thank's for your help! Dummydata: mydata - data.frame( D15N = c(runif(100, min = -2), runif(100), runif(100, max = 2), runif(100)), Year = rep(c('2007', '2008'), each = 100), SampleType = rep(c('cyano', 'seston'), each = 200), Station = sample(c('B1', 'H2', 'H3', 'H4'), 400, replace = TRUE), Week = sample(c('19', '21', '23', '25'), 400, replace = TRUE)) library(ggplot2) Summdata - ddply(mydata, .(Week, SampleType, Year, Station), summarise, median = median(D15N)) ggplot(Summdata, aes(x = Station, y = median)) + (mapping=aes(group=interaction(Station, SampleType)))+ theme_bw() + geom_boxplot(mapping=aes(fill=SampleType), #I would like to have grayscale and hatched - how? outlier.shape=1, outlier.size=3)+ theme(strip.background = element_blank())+ ylab(expression(paste(,delta^{15}, N)))+ xlab(Station) + scale_shape(solid = FALSE)+ geom_hline(yintercept=0, linetype=3)+ scale_fill_discrete(guide = guide_legend(my stuff), labels=c(cyanobacteria, seston)) #if I wanted to have seston in italics - how would I do that? with kind regards Anna Zakrisson PhD student Department of Ecology, Environment and Plant Sciences Stockholm University Svante Arrheniusv. 21A SE-106 91 Stockholm Sweden/Sverige Lives in Berlin. For paper mail: Katzbachstr. 21 D-10965, Berlin - Kreuzberg Germany/Deutschland E-mail: anna.zakris...@su.se Tel work: +49-(0)3091541281 Mobile: +49-(0)15777374888 LinkedIn: http://se.linkedin.com/pub/anna-zakrisson-braeunlich/33/5a2/51b º`. . `. . `. . º`. . `. . `. .º`. . `. . `. .º [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] .First in R.app (on Mac) vs R in Terminal
Hello, I am noticing different behavior with respect to .First when working in Terminal or working with the R gui on Mac OSX, and I am wondering if someone might clarify for me the intention behind this. In both cases, I am running 3.0.2 on the same machine. In my ~/.Rprofile I have a function .First( ) I then create a new function .First in my workspace, and save the image on exit. If I do this in Terminal, the .First from my saved image will execute, while the .First from my .Rprofile will not. If I do this in the R gui, BOTH .First functions execute. (obviously, if the two functions are identical, the same function executes twice.) It appears that there is a function .First in tools:RGUI (see below). The one saved here is the .First loaded from .Rprofile. Inspecting the two reveals that the one in .GlobalEnv is the function manually created. My questions are: (1) Is the intended behavior that a .First in a restored image will take precedence over a .First in .Rprofile. (I assume so, since according to ?Startup the image loads after the .Rprofile is sourced) (2) Why is the .First function from .Rprofile saved to tools:RGUI? (3) If a package contains a function .First, will both that function and the user's .First (assuming it exists) be executed at startup? Below are sessionInfo and some additional details: Thank you, Rick Mac OSX 10.9 I created a new user. In that users ~/.Rprofile, I have only the following function ## .Rprofile .First - function() { cat(\n\nThis 'First' is from .Rprofile\n\n) } In my workspace, I then create a different function .First ## .Rprofile .First - function() { cat(\n\nThis 'First' is manually created in the workspace\n\n) } quit(yes) Then re open R. Between execution in R.app R in Terminal, I rm ~/.Rdata # --- THIS IS FROM R.app for (i in seq(search())) + cat(sprintf(%20s : %17s, search()[[i]], paste(ls(pattern=\\.First, all=TRUE, pos=i), collapse=, )), \n) .GlobalEnv :.First tools:RGUI : .__RGUI__..First package:stats : package:graphics : package:grDevices : package:utils : package:datasets : package:methods : Autoloads : package:base :.First.sys THIS IS FROM TERMINAL for (i in seq(search())) + cat(sprintf(%20s : %17s, search()[[i]], paste(ls(pattern=\\.First, all=TRUE, pos=i), collapse=, )), \n) .GlobalEnv :.First package:stats : package:graphics : package:grDevices : package:utils : package:datasets : package:methods : Autoloads : package:base :.First.sys # -- SESSION INFO --- # ## TERMINAL sessionInfo() R version 3.0.2 (2013-09-25) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base ## R.app R version 3.0.2 (2013-09-25) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base # -- END --- # Ricardo Saporta Graduate Student, Data Analytics Rutgers University, New Jersey e: sapo...@rutgers.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fitting arbitrary curve to 1D data with error bars
Dear R experts, I was wondering how I could do a fit (say minimum chi2) to a 1D data with error bars. I searched quite a lot on the web, found out about the fitdistr() function in MASS, etc., but none of the things I found gives what I am really looking for. Perhaps I do not exactly know the proper terminology and this is why I cannot find the solution. We could also probably describe it as parameter estimation using the minimum chi2 method using data with error bars and an arbitrary fitting function. Anyway, in order not to confuse people with possibly incorrect terminology, let me show you what I am really looking for. Here is a page that shows what I want in Matlab (see the section on weighted fit): http://www.mathworks.com/matlabcentral/fileexchange/10176-ezyfit-2-41/content/ezyfit/demo/html/efdemo.html#19 Could you please help? Thanks in advance, e. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Linux Java and R not working together
Dear community, I almost tried for 3 days now to install rJava and package FSelector on my Linux machine but couldn't do so. I searched all the forums and tried some tricks ... but unfortunately couldn't find any solution. First of all I did try to install rJava in my rkward without any changes , that gave me the following error : configure: error: One or more JNI types differ from the corresponding native type. You may need to use non-standard compiler flags or a different compiler in order to fix this. ERROR: configuration failed for package ‘rJava’ * removing ‘/home/uwe/.rkward/library/rJava’ Warnmeldung: In install.packages(pkgs = c(rJava), lib = /home/uwe/.rkward/library, : Installation des Pakets ‘rJava’ hatte Exit-Status ungleich 0 So then I started searching and found a tip. I did run the following; uwe@linux-k2a8:~ sudo R CMD javareconf Java interpreter : /usr/bin/java Java version : 1.7.0_45 Java home path : /usr/lib/jdk1.7.0_45/jre Java compiler: /usr/bin/javac Java headers gen.: /usr/bin/javah Java archive tool: /usr/bin/jar NOTE: Your JVM has a bogus java.library.path system property! Trying a heuristic via sun.boot.library.path to find jvm library... Java library path: $(JAVA_HOME)/lib/i386/client JNI linker flags : -L$(JAVA_HOME)/lib/i386/client -ljvm JNI cpp flags: -I$(JAVA_HOME)/../include -I$(JAVA_HOME)/../include/linux Updating Java configuration in /usr/lib/R Done. And also did check uwe@linux-k2a8:~ java -version java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) Server VM (build 24.45-b08, mixed mode) uwe@linux-k2a8:~ sudo /usr/sbin/update-alternatives --config java There are 4 choices for the alternative java (providing /usr/bin/java). SelectionPath Priority Status 0/usr/lib/jvm/jre-1.7.0-openjdk/bin/java 17147 auto mode 1/usr/java/latest/bin/java 1 manual mode * 2/usr/lib/jdk_Oracle/bin/java 3 manual mode 3/usr/lib/jvm/jre-1.5.0-gcj/bin/java 1500 manual mode 4/usr/lib/jvm/jre-1.7.0-openjdk/bin/java 17147 manual mode Press enter to keep the current choice[*], or type selection number: I absolutely don't know what is wrong with that setup since my Java seems to have JDK and all the other things needed for R. For your Information here my system: uwe@linux-k2a8:~ uname -rm 3.4.63-2.44-desktop i686 uwe@linux-k2a8:~ cat /proc/version Linux version 3.4.63-2.44-desktop (geeko@buildhost) (gcc version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (S Linux) ) #1 SMP PREEMPT Wed Oct 2 11:18:32 UTC 2013 (d91a619) I would be really thankful for any advise to get rJava (and FSelector) running soon. Best wishes Uwe __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Survival analysis with truncated data.
Hi, I would like to know how to handle truncated data. My intend is to have the survival curve of a software fault in order to have some information about fault lifespan. I have some observations of a software system between 2004 and 2010. The system was first released in 1994. The event considered is the disappearance of a software fault. The faults can have been introduced at any time, between 1994 and 2010. But for fault introduced before 2004, there is not mean to know their age. I used the Surv and survfit functions with type interval2. For the faults that are first observed in 2004, I set the lower bound to the lifespan observed between 2004 and 2010. How could I set the upper bound ? Using 1994 as a starting point to not seems to be meaningful. Neither is using only the lower bound. Should I consider another survival estimator ? Thanks in advance. -- Nicolas Palix Tel: +33 4 76 51 46 27 http://membres-liglab.imag.fr/palix/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] simplex
Hi Sven, May be this helps: If you look into ?simplex.object solved This indicates whether the problem was solved. A value of ‘-1’ indicates that no feasible solution could be found. A value of ‘0’ that the maximum number of iterations was reached without termination of the second stage. This may indicate an unbounded function or simply that more iterations are needed. A value of ‘1’ indicates that an optimal solution has been found. s1 - simplex(a=price * -1, A1=A, b1=b, A2=A, b2=b2, maxi=T) # This cant find a solution s1$solved #[1] -1 A.K. Dear All, I do have a problem with a linear optimisation I am trying to achieve: library(boot) price - c(26.93, 26.23, 25.64, 25.97, 26.12, 26.18, 26.49, 27, 27.32, 27.93, 27.72, 27.23) A - matrix(0, nrow=length(price)+1, ncol=length(price)) A[1,] - 1 for(i in 1:length(price)) A[i+1,i] - 1 b - c(864000.01, 288000, 288000, 288000, 288000, 288000, 288000, 288000, 288000, 288000, 288000, 288000, 288000) b2 - c(864000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 216000, 216000) simplex(a=price, A1=A, b1=b) # This works simplex(a=price, A1=A, b1=b, maxi=T) # This is the maxing function and works as well simplex(a=price * -1, A1=A, b1=b, A2=A, b2=b2, maxi=T) # This cant find a solution The result I would expect is that it picks the 4 highest(note I multiplied price with -1) numbers and multiplies them with constraint in b2. For example this works: library(boot) price - c(1, 2, 3 , 4) A - matrix(0, nrow=length(price)+1, ncol=length(price)) A[1,] - 1 for(i in 1:length(price)) A[i+1,i] - 1 b - c(8.01, 2.5, 2.5, 2.5, 2.5) b2 - c(8, 2, 2, 2, 2) simplex(a=price, A1=A, b1=b) simplex(a=price, A1=A, b1=b, maxi=T) simplex(a=price * -1, A1=A, b1=b, A2=A, b2=b2, maxi=T) I cant see a logical difference between the two, why would it not find a solution for the first problem? Thank you. Sven __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Double Pareto Log Normal Distribution DPLN
Looks like there are typos in equation 8 of http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf (expo2 doesn't depend on 'v') or equation 9 of http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf ('a' is not specified). Dave On Nov 13, 2013, at 11:43 AM, b. alzahrani cs_2...@hotmail.com wrote: Hi I found this paper http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf that models the DPLN distribution as in equation 8. I implemented this in R but cannot get the same curve as in Figure 4. can you please check if my code below is correct: e.g. is the use of pnorm() correct here? ddlpn - function(x){ a=2.8 b=0.01 v=0.45 t=6.5 j - (a * b /(a+b)) norm1-pnorm((log(x)-v-(a*t^2))/t) expo1- a*v+(a^2*t^2/2) z-exp(expo1)*(x^(-a-1))*(norm1) norm2-pnorm((log(x)-v+(b*t^2))/t) expo2- -b*t+(b^2*t^2/2) y- x^(b-1)*exp(expo2)*(1-norm2) # 1-norm is the complementary CDF of N(0,1) j*(z+y) } ** Bander Alzahrani, Teacher Assistant Information Systems Department Faculty of Computing Information Technology King Abdulaziz University * From: d...@vims.edu To: cs_2...@hotmail.com CC: r-h...@stat.math.ethz.ch Subject: Re: [R] Double Pareto Log Normal Distribution Date: Tue, 12 Nov 2013 16:51:22 + http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf is the original ref and has the equations. library(VGAM) for *pareto() and library(stats) for *lnorm() should get you most of the way there. On Nov 12, 2013, at 10:47 AM, b. alzahrani cs_2...@hotmail.com wrote: Hi guys I would like to generate random number Double Pareto Log Normal Distribution (DPLN). does anyone know how to do this in R or if there is any built-in function. Thanks ** Bander * [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. David Forrest d...@vims.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. David Forrest d...@vims.edu signature.asc Description: Message signed with OpenPGP using GPGMail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R Beginner - Need Perhaps 5 - 10 Minutes of R User Time to Learn Few Basics
I have finally decided that I will learn R and learn it very well. For now I am using a program that a friend of mine developed to do some advanced statistical analyses. I downloaded RStudio to my machine. [Perhaps RStudio is not the best platform to work from - I have heard that Rattle is sort of the new standard.] I have so far been able to highlight the rows of the code that I wish to run, but then I somehow turned off seeing the output. I also cannot find where I would locate the output window. Yes, frustrated. Would any kind soul be interested in helping kickstart my R learning? I have JoinMe installed on my machine so I figure we can do it interactively. It should not take more than a few minutes. I am already very experienced with both C and VBA languages as well as SPSS syntax so there is not much need to worry about me being too much of a novice. Thank you very much in advance. Zach Feinstein zfeinst...@isgmn.commailto:zfeinst...@isgmn.com (952) 277-0162 (612) 590-4813 (mobile) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] kernlab multicore usage
I am hoping there are other users of the kernlab package in R who will be able to solve a puzzle. In the past, I've used the relevance vector machine engine (rvm) in kernlab and was pleased to see it use all four cores on my PC (running Windows 8). But now it only runs on one core and I can't figure out what changed. I've tried switching R versions from 3.0.2 to 2.15.3 and back and installing packages relating to multicore use, but haven't managed to get more than one core running any more. Anyone know what is necessary? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Readjusting frequencies
Hi Katherine, Check if this works: fraud_data = data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10), frequency = c(3, 1, 7, 11, 13, 1, 4)) fraud_data1 - rbind(fraud_data,c(11,10)) fun1 - function(dat) { sum1 - 0 for(i in 1:nrow(dat)){ sum1 - sum1 + dat[i, frequency] dat[i,frequency] - sum1 if(sum1 5){ sum1 - 0 } } indx - which(dat$frequency 5) if(max(indx)==nrow(dat)) {res - dat[dat$frequency 5,] } else{ dat[max(indx),frequency] - sum(dat[c(max(indx),nrow(dat)),frequency]) res - dat[dat$frequency 5,] } res } fun1(fraud_data) # no_of_frauds frequency #3 4 11 #4 6 11 #5 7 18 fun1(fraud_data1) # no_of_frauds frequency #3 4 11 #4 6 11 #5 7 13 #8 11 15 A.K. On Tuesday, November 12, 2013 5:12 AM, Katherine Gobin katherine_go...@yahoo.com wrote: Dear Mr Arun, Hi! Sorry to bother you. Yesterday I had raised one query to the forum but haven't received any response. If the time permits, will it be possible for you to at least have a look at it? I have following data.frame as fraud_data = data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10), frequency = c(3, 1, 7, 11, 13, 1, 4)) fraud_data no_of_frauds frequency 1 1 3 2 2 1 3 4 7 4 6 11 5 7 13 6 9 1 7 10 4 I need to regroup the data in such a way that if the frequency is less than 5, the corresponding class data gets merged to next class (or at times with previous class too) i.e. the frequencies get added till the added frequencies exceed 5. Thus, in above data.frame since frequencies pertaining to no_of_frauds 1 and 2 are 3 and 1 respectively, these get added to class 4 and the frequency of this class now becomes 3+1+7 = 11. Likewise, frequency of classes 9 and 10 are 1 and 4 and when these are added still it is 5 i.e. doesn't exceed 5. Thus, these should get added to the previous class i.e. 7. Thus I need to have no_of_frauds frequency 4 11 # ( 3 + 1 + 7) 6 11 7 18 # (13 + 1 + 4) If possible, pl go through it. Sorry to bother you. Regards Katherine __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R Beginner - Need Perhaps 5 - 10 Minutes of R User Time to Learn Few Basics
Type the following into your console window: install.packages(fortunes) library(fortunes) fortune(brain surgery) (It's not quite apposite, but close enough). Cheers, Bert On Wed, Nov 13, 2013 at 5:57 AM, Zach Feinstein zfeinst...@isgmn.com wrote: I have finally decided that I will learn R and learn it very well. For now I am using a program that a friend of mine developed to do some advanced statistical analyses. I downloaded RStudio to my machine. [Perhaps RStudio is not the best platform to work from - I have heard that Rattle is sort of the new standard.] I have so far been able to highlight the rows of the code that I wish to run, but then I somehow turned off seeing the output. I also cannot find where I would locate the output window. Yes, frustrated. Would any kind soul be interested in helping kickstart my R learning? I have JoinMe installed on my machine so I figure we can do it interactively. It should not take more than a few minutes. I am already very experienced with both C and VBA languages as well as SPSS syntax so there is not much need to worry about me being too much of a novice. Thank you very much in advance. Zach Feinstein zfeinst...@isgmn.commailto:zfeinst...@isgmn.com (952) 277-0162 (612) 590-4813 (mobile) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R Beginner - Need Perhaps 5 - 10 Minutes of R User Time to Learn Few Basics
Bert: Yet another reason I'm a fan of Frank Harrell. Does anyone know when I get to buy the next edition of Regression Modeling Strategies? Zach: Check www.coursera.org. They have some nice R-centric classes. I signed up myself since my own R skills are self-taught. Also consider investing in some of the excellent R texts available such as The R Book or The R Primer Ersatzistician and Chutzpahthologist I can answer any question. I don't know is an answer. I don't know yet is a better answer. On Wed, Nov 13, 2013 at 2:08 PM, Bert Gunter gunter.ber...@gene.com wrote: Type the following into your console window: install.packages(fortunes) library(fortunes) fortune(brain surgery) (It's not quite apposite, but close enough). Cheers, Bert On Wed, Nov 13, 2013 at 5:57 AM, Zach Feinstein zfeinst...@isgmn.com wrote: I have finally decided that I will learn R and learn it very well. For now I am using a program that a friend of mine developed to do some advanced statistical analyses. I downloaded RStudio to my machine. [Perhaps RStudio is not the best platform to work from - I have heard that Rattle is sort of the new standard.] I have so far been able to highlight the rows of the code that I wish to run, but then I somehow turned off seeing the output. I also cannot find where I would locate the output window. Yes, frustrated. Would any kind soul be interested in helping kickstart my R learning? I have JoinMe installed on my machine so I figure we can do it interactively. It should not take more than a few minutes. I am already very experienced with both C and VBA languages as well as SPSS syntax so there is not much need to worry about me being too much of a novice. Thank you very much in advance. Zach Feinstein zfeinst...@isgmn.commailto:zfeinst...@isgmn.com (952) 277-0162 (612) 590-4813 (mobile) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Double Pareto Log Normal Distribution DPLN
...Additionally...your set of parameters match none of the curves in figure 4. I think the ordering of the parameters as listed on the graphs is different than in the text of the article. The 'v' parameter controls the location of the 'elbow' and should be near log(x) in each graph, while the 't' parameter is the sharpness of the elbow. So eyeballing the elbows: log(c(80,150,300))= 4.382027 5.010635 5.703782 These appear to match positional parameter #4 in the legends, which you apply as parameter t in your function. # editing your function a bit: ddpln - function(x, a=2.5, b=0.01, v=log(100), t=v/10){ # Density of Double Pareto LogNormal distribution # from b. alzahrani cs_2...@hotmail.com email of 2013-11-13 # per formula 8 from http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf c - (a * b /(a+b)) norm1-pnorm((log(x)-v-(a*t^2))/t) norm2-pnorm((log(x)-v+(b*t^2))/t) expo1- a*v+(a^2*t^2/2) expo2- -b*v+(b^2*t^2/2) # edited from the paper's eqn 8 with: s/t/v/ z- (x^(-a-1)) * exp(expo1)*(norm1) y- (x^(b-1)) * exp(expo2)*(1-norm2) # 1-norm is the complementary CDF of N(0,1) c*(z+y) } x-10^seq(0,5,by=0.1) # 0-10k plot(x,ddpln(x,a=2.8,b=.01,v=3.8,t=0.35),log='xy',type='l') # Similar to fig 4 left. plot(x,ddpln(x,a=2.5,b=.01,v=log(2)),log='xy',type='l') plot(x,ddpln(x,a=2.5,b=.01,v=log(10)),log='xy',type='l') plot(x,ddpln(x,a=2.5,b=.01,v=log(100)),log='xy',type='l') plot(x,ddpln(x,a=2.5,b=.01,v=log(1000)),log='xy',type='l') plot(x,ddpln(x,a=2.5,b=.01,v=log(1)),log='xy',type='l') Dave On Nov 13, 2013, at 11:43 AM, b. alzahrani cs_2...@hotmail.com wrote: Hi I found this paper http://cs.stanford.edu/people/jure/pubs/dpln-kdd08.pdf that models the DPLN distribution as in equation 8. I implemented this in R but cannot get the same curve as in Figure 4. can you please check if my code below is correct: e.g. is the use of pnorm() correct here? ddlpn - function(x){ a=2.8 b=0.01 v=0.45 t=6.5 j - (a * b /(a+b)) norm1-pnorm((log(x)-v-(a*t^2))/t) expo1- a*v+(a^2*t^2/2) z-exp(expo1)*(x^(-a-1))*(norm1) norm2-pnorm((log(x)-v+(b*t^2))/t) expo2- -b*t+(b^2*t^2/2) y- x^(b-1)*exp(expo2)*(1-norm2) # 1-norm is the complementary CDF of N(0,1) j*(z+y) } ** Bander Alzahrani, Teacher Assistant Information Systems Department Faculty of Computing Information Technology King Abdulaziz University * From: d...@vims.edu To: cs_2...@hotmail.com CC: r-h...@stat.math.ethz.ch Subject: Re: [R] Double Pareto Log Normal Distribution Date: Tue, 12 Nov 2013 16:51:22 + http://www.math.uvic.ca/faculty/reed/dPlN.3.pdf is the original ref and has the equations. library(VGAM) for *pareto() and library(stats) for *lnorm() should get you most of the way there. On Nov 12, 2013, at 10:47 AM, b. alzahrani cs_2...@hotmail.com wrote: Hi guys I would like to generate random number Double Pareto Log Normal Distribution (DPLN). does anyone know how to do this in R or if there is any built-in function. Thanks ** Bander * [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. David Forrest d...@vims.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. David Forrest d...@vims.edu signature.asc Description: Message signed with OpenPGP using GPGMail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to sum a function over a specific range in R?
On Tue, Nov 12, 2013 at 11:45 AM, umair durrani umairdurr...@outlook.comwrote: I am new to R and have already posted this question on stack overflow. The problem is that I did not understand the answers as the R documentation about the discussed functions (e.g. 'convolve') is quite complicated for a newbie like me. Here's the question: I have a big text file with more than 3 million rows. The following is the example of the three columns I want to use: indxvehID LocalY 1 2 35.381 2 2 39.381 3 2 43.381 4 2 47.38 5 2 51.381 6 2 55.381 7 2 59.381 8 2 63.379 9 2 67.383 10 2 71.398 where,indx = IndexvehID = Vehicle ID (Here only '2' is shown but infact there are 2169 vehicle IDs and each one repeats several times because the data was collected at every 0.1 seconds)LocalY = The y coordinate of the vehicle at a particular time (The time column is not shown here) What I want to do is to create a new column of 'SmoothedY' using the following formula: SmoothedY = 1/Z * Summation from (i-15) to (i+15) (LocalY * exp(-abs(i-k))/5)) where,i = indxZ = Summation from (k =i-15) to (k = i+15) ( exp(-abs(i-k))/5)) How can I apply this formula to create the new column 'SmoothedY'? This is actually a data smoothing problem but default smoothing algorithms in R are not suitable for my data and I have to use this custom formula. Thanks in advance. Umair Durrani I have never tried this myself, but it appears as if you can define your own smoothing function using Simon Wood's mgcv package. Check out http://www.maths.bath.ac.uk/~sw283/talks/snw-R-talk.pdf for more information. Jean [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size
This is a great solution! Love the conciseness of your solution. And easy to understand. Thanks again. Dan -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Monday, November 11, 2013 6:31 PM To: Lopez, Dan Subject: Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size Hi, You could use: H_DF[match(with(T_DF,paste(FY,ID,sep=_)), with(H_DF,paste(FY,ID,sep=_))),3]- TER A.K. On Monday, November 11, 2013 8:51 PM, Lopez, Dan lopez...@llnl.gov wrote: Below is how I am currently doing this. Is there a more efficient way to do this? The scenario is that I have two dataframes of different sizes. I need to update one binary factor variable in one of those dataframes by matching on two variables. If there is no match keep as is otherwise update. Also the variable being update, TT in this case should remain a binary factor variable (levels='HC','TER') HTDF2-merge(H_DF,T_DF,by=c(FY,ID),all.x=T) HTDF2$TT-factor(ifelse(is.na(HTDF2$TT.y),HTDF2$TT.x,HTDF2$TT.y),labels=c(HC,TER)) HTDF2-HTDF2[,-(3:4)] # REPRODUCIBLE EXAMPLE DATA FOR ABOVE.. dput(H_DF) structure(list(FY = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class = factor), ID = c(1, 1, 1, 1, 2, 2, 2, 2, 2), TT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), class = data.frame, row.names = c(1L, 2L, 3L, 4L, 6L, 7L, 9L, 10L, 11L)) dput(T_DF) structure(list(FY = structure(c(4L, 2L, 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class = factor), ID = c(1, 2, 2), TT = structure(c(2L, 2L, 2L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), row.names = c(5L, 8L, 12L), class = data.frame) Dan Lopez LLNL, HRIM - Workforce Analytics Metrics [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Real frequencies in a 'set' problem
I have three media channels where I can push public health information (tv, radio and newspaper). Any given citizen can be touched by the information from one, two or three channels (let's ignore for the moment that citizens might miss the information all together). Hence these sets: library(sets) tv - set(1, 4, 7, 6) radio - set(6, 7, 5, 2) newspaper - set(4, 7, 5, 3) Now I want to know how many citizens that newspapers bring to the complete body of informed citizens. E.g. how many people gets the information ONLY from a newspaper: (tv | radio | t) - (tv | radio) But how do I solve this if I know that 220 people is reached via tv 349 people is reached vi radio 384 people is reached via newspaper 97 people is reached via tv and radio 173 people is reached via radio and newspaper 134 people is reached via newspaper and tv It's the grey area of the (dummy) plot below that I would like to put a frequency in library(VennDiagram) draw.triple.venn(area1 = 10, area2 = 10, area3 = 10, n12 = 3, n13 = 3, n23 = 3, n123 = 1, category = c('TV', 'Radio', 'Newspaper'), fill = c('white', 'white', 'grey'), alpha = c(1, 1, 0) ) I don't understand how the frequencies that I know is supposed to be used in the 'sets' notation. Or am I completely misunderstanding something here? Wouldn't be the first time if I did... TIA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] making a barplot with table of experimental conditions underneath (preferably ggplot2)
On 11/14/2013 01:24 AM, n.hub...@ncmls.ru.nl wrote: Dear all, my data looks the following: df- data.frame (experiment=c(E1,E2,E3,E4), mean = c(3,4,5,6), stdev=c(0.1,0.1,0.05,0.2), method = c(STD,STD, FP, FP), enzyme =c (T,T/L,T,T/L), denaturation=c(U,U,0.05%RG, 0.1%RG)) I would like to make a bar plot with standard deviation which I solved the following way: x- df$experiment y- df$mean sd- df$stdev df.plot- qplot(x,y,xlab=, ylab=# peptides identified)+ geom_bar(colour=black, fill=darkgrey)+ geom_errorbar(aes(x=x, ymin=y-sd, ymax=y+sd), width=0.25) df.plot However, as the labels for the x-axis (the bars) I do not want the experiment number, as now, but instead a table containing the other columns of my data.frame (method, enzyme, denaturation) with the description in the front and the certain 'value' below the bars. I am looking forward to your suggestions! Hi Nina, This isn't in ggplot2, but it might help: library(plotrix) plotbg-rect(0,0,5,7,col=\lightgray\);grid(col=\white\,lty=1) exp_con-paste(df$method,df$enzyme,df$denaturation,sep=\n) barp(df$mean,names.arg=rep(,4),col=darkgray, ylab=# peptides identified,do.first=plotbg) mtext(exp_con,side=1,at=1:4,line=3) dispersion(1:4,df$mean,df$stdev) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Real frequencies in a 'set' problem
On Nov 13, 2013, at 1:35 PM, Stefan Petersson wrote: I have three media channels where I can push public health information (tv, radio and newspaper). Any given citizen can be touched by the information from one, two or three channels (let's ignore for the moment that citizens might miss the information all together). Hence these sets: library(sets) tv - set(1, 4, 7, 6) radio - set(6, 7, 5, 2) newspaper - set(4, 7, 5, 3) Now I want to know how many citizens that newspapers bring to the complete body of informed citizens. E.g. how many people gets the information ONLY from a newspaper: (tv | radio | t) - (tv | radio) But how do I solve this if I know that 220 people is reached via tv 349 people is reached vi radio 384 people is reached via newspaper 97 people is reached via tv and radio 173 people is reached via radio and newspaper 134 people is reached via newspaper and tv It's the grey area of the (dummy) plot below that I would like to put a frequency in library(VennDiagram) draw.triple.venn(area1 = 10, area2 = 10, area3 = 10, n12 = 3, n13 = 3, n23 = 3, n123 = 1, category = c('TV', 'Radio', 'Newspaper'), fill = c('white', 'white', 'grey'), alpha = c(1, 1, 0) ) I don't understand how the frequencies that I know is supposed to be used in the 'sets' notation. Or am I completely misunderstanding something here? Wouldn't be the first time if I did... Is there any reason for us to think this isn't homework? Have you read the Posting Guide? TIA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to sum a function over a specific range in R?
?filter perhaps. -- Bert On Wed, Nov 13, 2013 at 1:10 PM, Adams, Jean jvad...@usgs.gov wrote: On Tue, Nov 12, 2013 at 11:45 AM, umair durrani umairdurr...@outlook.comwrote: I am new to R and have already posted this question on stack overflow. The problem is that I did not understand the answers as the R documentation about the discussed functions (e.g. 'convolve') is quite complicated for a newbie like me. Here's the question: I have a big text file with more than 3 million rows. The following is the example of the three columns I want to use: indxvehID LocalY 1 2 35.381 2 2 39.381 3 2 43.381 4 2 47.38 5 2 51.381 6 2 55.381 7 2 59.381 8 2 63.379 9 2 67.383 10 2 71.398 where,indx = IndexvehID = Vehicle ID (Here only '2' is shown but infact there are 2169 vehicle IDs and each one repeats several times because the data was collected at every 0.1 seconds)LocalY = The y coordinate of the vehicle at a particular time (The time column is not shown here) What I want to do is to create a new column of 'SmoothedY' using the following formula: SmoothedY = 1/Z * Summation from (i-15) to (i+15) (LocalY * exp(-abs(i-k))/5)) where,i = indxZ = Summation from (k =i-15) to (k = i+15) ( exp(-abs(i-k))/5)) How can I apply this formula to create the new column 'SmoothedY'? This is actually a data smoothing problem but default smoothing algorithms in R are not suitable for my data and I have to use this custom formula. Thanks in advance. Umair Durrani I have never tried this myself, but it appears as if you can define your own smoothing function using Simon Wood's mgcv package. Check out http://www.maths.bath.ac.uk/~sw283/talks/snw-R-talk.pdf for more information. Jean [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] On ^ returning a matrix when operated on a data.frame
Dear R-users, I am wondering why ^ operator alone returns a matrix, when operated on a data.frame (as opposed to all other arithmetic operators). Here's an example: DF - data.frame(x=1:5, y=6:10) class(DF*DF) # [1] data.frame class(DF^2) # [1] matrix I posted here on SO: http://stackoverflow.com/questions/19964897/why-does-on-a-data-frame-return-a-matrix-instead-of-a-data-frame-like-do and got a very nice answer - it happens because a matrix is returned (obvious by looking at `Ops.data.frame`). However, what I'd like to understand is, *why* a matrix is returned for ^ alone? Here's an excerpt from Ops.data.frame (Thanks to Neal Fultz): if (.Generic %in% c(+, -, *, /, %%, %/%)) { names(value) - cn data.frame(value, row.names = rn, check.names = FALSE, check.rows = FALSE) } else matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, dimnames = list(rn, cn)) It's clear that a matrix will be returned unless `.Generic` is one of those arithmetic operators. My question therefore is, is there any particular reason why ^ operator is being missed in the if-statement here? I can't think of a reason where this would break. Also ?`^` doesn't seem to mention anything about this coercion. Please let me know if I should be posting this to R-devel list instead. Thank you very much, Arun [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (sin asunto)
Hi! I need to print a graph that I have in a window. Previously I used: try(win.print(), silent = TRUE) if (geterrmessage() != Error in win.print() : unable to start device devWindows\n) { plotFunctiond(screen = FALSE) dev.off() but now it is not possible. Can you help me? -- Elisa. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] volume of ellipsoid
Hi, everyone! I have a matrix X(n*p) which is n samples from a p-dimensional normal distribution. Now I want to make the ellipsoid containing (1-α)% of the probability in the distribution based on Mahalanobis distance: μ:(x-μ)'Σ^(-1)(x-μ)≤χ2p(α) where x and Σ is the mean and variance-covariance matrix of X. Are there some functions in R which can plot the ellipsoid and calculate the volume (area for p=2, hypervolume for p3) of the ellipsoid? I only find the ellipsoidhull function in cluster package, but it deal with ellipsoid hull containing X, which obvious not the one I want. After that, a more difficult is the following. If we have a series of matrix X1, X2, X3,… Xm which follow different p-dimensional normal distributions. And we make m ellipsoids (E1, E2, … Em) for each matrix like the before. How can we calculate the volume of union of the m ellipsoids? One possible solution for this problem is the inclusion-exclusion principle: V(⋃Ei)(1≤i≤m)= V(Ei)(1≤i≤m)-∑V(Ei⋂Ej)(1≤ij≤m)+∑V(Ei⋂Ej⋂Ek)(1≤ijk≤m)+(-1)^(m-1)V(E1⋂…⋂Em) But the problem is how to calculate the volume of intersection between 2, 3 or more ellipsoids. Are there some functions which can calculate the volume of intersection between two region or functions which directly calculate the volume of a union of two region(the region here is ellipsoid). OR yo you have any good ideas solving this problem in R? Thank you all in advance! Yuanzhi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] On ^ returning a matrix when operated on a data.frame
On 13-11-13 6:00 PM, Arunkumar Srinivasan wrote: Dear R-users, I am wondering why ^ operator alone returns a matrix, when operated on a data.frame (as opposed to all other arithmetic operators). Here's an example: DF - data.frame(x=1:5, y=6:10) class(DF*DF) # [1] data.frame class(DF^2) # [1] matrix I posted here on SO: http://stackoverflow.com/questions/19964897/why-does-on-a-data-frame-return-a-matrix-instead-of-a-data-frame-like-do and got a very nice answer - it happens because a matrix is returned (obvious by looking at `Ops.data.frame`). However, what I'd like to understand is, *why* a matrix is returned for ^ alone? Here's an excerpt from Ops.data.frame (Thanks to Neal Fultz): if (.Generic %in% c(+, -, *, /, %%, %/%)) { names(value) - cn data.frame(value, row.names = rn, check.names = FALSE, check.rows = FALSE) } else matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, dimnames = list(rn, cn)) It's clear that a matrix will be returned unless `.Generic` is one of those arithmetic operators. My question therefore is, is there any particular reason why ^ operator is being missed in the if-statement here? I can't think of a reason where this would break. Also ?`^` doesn't seem to mention anything about this coercion. It's not just ^ that is missing, the logical relations like , ==, etc also return matrices. This is very old code (I think from 1999), but I would guess that the reason is that the ^ and operators always return values of a single type (numeric and logical respectively), whereas the other operators can take mixed type inputs and return mixed type outputs. Duncan Murdoch Please let me know if I should be posting this to R-devel list instead. Thank you very much, Arun [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] On ^ returning a matrix when operated on a data.frame
Duncan, Thank you. What I meant was that ^ is the only *arithmetic operator* to result in a matrix on operating in a data.frame. I understand it's quite old code. Also, your explanation makes sense, with the exception of / operator, I suppose (I could be wrong here). Arun On Thursday, November 14, 2013 at 12:32 AM, Duncan Murdoch wrote: It's not just ^ that is missing, the logical relations like , ==, etc also return matrices. This is very old code (I think from 1999), but I would guess that the reason is that the ^ and operators always return values of a single type (numeric and logical respectively), whereas the other operators can take mixed type inputs and return mixed type outputs. Duncan Murdoch Please let me know if I should be posting this to R-devel list instead. Thank you very much, Arun [[alternative HTML version deleted]] __ R-help@r-project.org (mailto:R-help@r-project.org) mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] On ^ returning a matrix when operated on a data.frame
On 13-11-13 6:52 PM, Arunkumar Srinivasan wrote: Duncan, Thank you. What I meant was that ^ is the only *arithmetic operator* to result in a matrix on operating in a data.frame. I understand it's quite old code. Also, your explanation makes sense, with the exception of / operator, I suppose (I could be wrong here). You're right, /, %/% and %% also return consistent types. So my explanation is wrong. The NEWS entry for this change appears to be in the 0.63 release, o Ops.data.frame : things like d.fr anow return a matrix That doesn't give much of a hint for why ^ is handled differently than /. Duncan Murdoch Arun On Thursday, November 14, 2013 at 12:32 AM, Duncan Murdoch wrote: It's not just ^ that is missing, the logical relations like , ==, etc also return matrices. This is very old code (I think from 1999), but I would guess that the reason is that the ^ and operators always return values of a single type (numeric and logical respectively), whereas the other operators can take mixed type inputs and return mixed type outputs. Duncan Murdoch Please let me know if I should be posting this to R-devel list instead. Thank you very much, Arun [[alternative HTML version deleted]] __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size
Dan, Gabor's solution is of course good, but here's a solution that uses only base R capabilities, and doesn't sort as merge() does. Essentially the same as A.K.'s, but slightly more general. tmp1 - match( paste(T_DF$FY,T_DF$ID) , paste(H_DF$FY,H_DF$ID) ) H_DF$TT[tmp1] - T_DF$TT gg - sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join T_DF t using(FY, ID)) for (nm in names(H_DF)) print(all.equal(H_DF[[nm]], gg[[nm]])) [1] TRUE [1] TRUE [1] TRUE It could be made into a one-liner.It would probably break if TT doesn't have the same factor levels in both H_DF and T_DF. As an aside, I suspect that nowadays match() is generally under-appreciated among R users as a whole. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 11/11/13 5:20 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Mon, Nov 11, 2013 at 8:04 PM, Lopez, Dan lopez...@llnl.gov wrote: Below is how I am currently doing this. Is there a more efficient way to do this? The scenario is that I have two dataframes of different sizes. I need to update one binary factor variable in one of those dataframes by matching on two variables. If there is no match keep as is otherwise update. Also the variable being update, TT in this case should remain a binary factor variable (levels='HC','TER') HTDF2-merge(H_DF,T_DF,by=c(FY,ID),all.x=T) HTDF2$TT-factor(ifelse(is.na(HTDF2$TT.y),HTDF2$TT.x,HTDF2$TT.y),labels=c (HC,TER)) HTDF2-HTDF2[,-(3:4)] # REPRODUCIBLE EXAMPLE DATA FOR ABOVE.. dput(H_DF) structure(list(FY = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class = factor), ID = c(1, 1, 1, 1, 2, 2, 2, 2, 2), TT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), class = data.frame, row.names = c(1L, 2L, 3L, 4L, 6L, 7L, 9L, 10L, 11L)) dput(T_DF) structure(list(FY = structure(c(4L, 2L, 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class = factor), ID = c(1, 2, 2), TT = structure(c(2L, 2L, 2L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), row.names = c(5L, 8L, 12L), class = data.frame) Here is an sqldf solution: library(sqldf) sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join T_DF t using(FY, ID)) FY ID TT 1 FY09 1 HC 2 FY10 1 HC 3 FY11 1 HC 4 FY12 1 TER 5 FY09 2 HC 6 FY10 2 TER 7 FY11 2 HC 8 FY12 2 HC 9 FY13 2 TER -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Analysis
Hi, I have 187 urine cultures which were subjected to culture and microscopy methods. Video were used to 'verify' the findings. Culture is considered the gold method. But Microscopy is another method which may be cheaper. I checked the videos to determine whether bacteria was growing on both the culture and microscopy readings to ' verify' what was really happening. My question is this...How can I show that Microscopy is superior? Is there a special test available? *Microscopy**Positive* *Negative**Culture Positive* *Video Positive* *47* *5* *52* Video Negative 0 *2* *2* *Culture Negative* Video Positive 18 *0* *18* *Video Negative* *5* *110* *115* *70* *117* *187* -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] oblique.scores argument for fa function in psych package
Dear R help, I'm using version 1.3.10.12 of the psych package. The help page for fa says that when calculating factor scores, oblique.scores=FALSE causes the pattern matrix to be used, and oblique.scores=TRUE causes the structure matrix to be used. However, when I try it, it seems to be the other way around. Example: ### library(psych) set.seed(1) n - 100 Y - data.frame(y1 = rnorm(n), y2 = rnorm(n), y3 = rnorm(n), y4 = rnorm(n), y5 = rnorm(n), y6 = rnorm(n)) fa1 - fa(Y, nfactors=2, rotate=oblimin, fm=minres, oblique.scores=FALSE) fa2 - fa(Y, nfactors=2, rotate=oblimin, fm=minres, oblique.scores=TRUE) fa1$scores[1:3, ] fa2$scores[1:3, ] (scale(Y) %*% solve(cor(Y)) %*% fa1$loadings)[1:3, ] # not the same as fa1$scores[1:3, ] (scale(Y) %*% solve(cor(Y)) %*% fa1$Structure)[1:3, ] # same as fa1$scores[1:3, ] (scale(Y) %*% solve(cor(Y)) %*% fa2$loadings)[1:3, ] # same as fa2$scores[1:3, ] (scale(Y) %*% solve(cor(Y)) %*% fa2$Structure)[1:3, ] # not the same as fa2$scores[1:3, ] ### Have I misunderstood something? Thanks, Mark Seeto __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting arbitrary curve to 1D data with error bars
If you are after adding error bars in a scatter plot; one example is given below : #some example data set.seed(42) df - data.frame(x = rep(1:10,each=5), y = rnorm(50)) #calculate mean, min and max for each x-value library(plyr) df2 - ddply(df,.(x),function(df) c(mean=mean(df$y),min=min(df$y),max=max(df$y))) #plot error bars library(Hmisc) with(df2,errbar(x,mean,max,min)) grid(nx=NA,ny=NULL) (From: http://stackoverflow.com/questions/13032777/scatter-plot-with-error-bars) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Analysis
Don't know if the data will support the result that you want but the only way to get some sort of answer is to hire a statistician. ranjan On Wed, 13 Nov 2013 20:58:30 -0400 Jim Silverton jim.silver...@gmail.com wrote: Hi, I have 187 urine cultures which were subjected to culture and microscopy methods. Video were used to 'verify' the findings. Culture is considered the gold method. But Microscopy is another method which may be cheaper. I checked the videos to determine whether bacteria was growing on both the culture and microscopy readings to ' verify' what was really happening. My question is this...How can I show that Microscopy is superior? Is there a special test available? *Microscopy**Positive* *Negative**Culture Positive* *Video Positive* *47* *5* *52* Video Negative 0 *2* *2* *Culture Negative* Video Positive 18 *0* *18* *Video Negative* *5* *110* *115* *70* *117* *187* -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Important Notice: This mailbox is ignored: e-mails are set to be deleted on receipt. Please respond to the mailing list if appropriate. For those needing to send personal or professional e-mail, please use appropriate addresses. GET FREE SMILEYS FOR YOUR IM EMAIL - Learn more at http://www.inbox.com/smileys Works with AIM®, MSN® Messenger, Yahoo!® Messenger, ICQ®, Google Talk™ and most webmails __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Update a variable in a dataframe based on variables in another dataframe of a different size
Hi Don, In cases like: H_DF - H_DF[-4,] tmp1 - match( paste(T_DF$FY,T_DF$ID) , paste(H_DF$FY,H_DF$ID) ) H_DF$TT[tmp1] - T_DF$TT #Error in `[-.factor`(`*tmp*`, tmp1, value = c(2L, 2L, 2L)) : Probably, this works: H_DF$TT[tmp1[!is.na(tmp1)]] - unique(T_DF$TT) A.K. On Wednesday, November 13, 2013 7:59 PM, MacQueen, Don macque...@llnl.gov wrote: Dan, Gabor's solution is of course good, but here's a solution that uses only base R capabilities, and doesn't sort as merge() does. Essentially the same as A.K.'s, but slightly more general. tmp1 - match( paste(T_DF$FY,T_DF$ID) , paste(H_DF$FY,H_DF$ID) ) H_DF$TT[tmp1] - T_DF$TT gg - sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join T_DF t using(FY, ID)) for (nm in names(H_DF)) print(all.equal(H_DF[[nm]], gg[[nm]])) [1] TRUE [1] TRUE [1] TRUE It could be made into a one-liner.It would probably break if TT doesn't have the same factor levels in both H_DF and T_DF. As an aside, I suspect that nowadays match() is generally under-appreciated among R users as a whole. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 11/11/13 5:20 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Mon, Nov 11, 2013 at 8:04 PM, Lopez, Dan lopez...@llnl.gov wrote: Below is how I am currently doing this. Is there a more efficient way to do this? The scenario is that I have two dataframes of different sizes. I need to update one binary factor variable in one of those dataframes by matching on two variables. If there is no match keep as is otherwise update. Also the variable being update, TT in this case should remain a binary factor variable (levels='HC','TER') HTDF2-merge(H_DF,T_DF,by=c(FY,ID),all.x=T) HTDF2$TT-factor(ifelse(is.na(HTDF2$TT.y),HTDF2$TT.x,HTDF2$TT.y),labels=c (HC,TER)) HTDF2-HTDF2[,-(3:4)] # REPRODUCIBLE EXAMPLE DATA FOR ABOVE.. dput(H_DF) structure(list(FY = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class = factor), ID = c(1, 1, 1, 1, 2, 2, 2, 2, 2), TT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), class = data.frame, row.names = c(1L, 2L, 3L, 4L, 6L, 7L, 9L, 10L, 11L)) dput(T_DF) structure(list(FY = structure(c(4L, 2L, 5L), .Label = c(FY09, FY10, FY11, FY12, FY13), class = factor), ID = c(1, 2, 2), TT = structure(c(2L, 2L, 2L), .Label = c(HC, TER), class = factor)), .Names = c(FY, ID, TT), row.names = c(5L, 8L, 12L), class = data.frame) Here is an sqldf solution: library(sqldf) sqldf(select FY, ID, coalesce(t.TT, h.TT) TT from H_DF h left join T_DF t using(FY, ID)) FY ID TT 1 FY09 1 HC 2 FY10 1 HC 3 FY11 1 HC 4 FY12 1 TER 5 FY09 2 HC 6 FY10 2 TER 7 FY11 2 HC 8 FY12 2 HC 9 FY13 2 TER -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] MM estmator
hi I have a nonlinear regression model with two parameter, i have estimated using nls in R and i want to also find the estimate using MM, someone refer me to this function nlrob but this is main for only M estimate. can you please help me findout a function in R for MM nonlinear regression thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Replace NA's with value in the next row
Hi all, I have a data set which treat missing value as NA and now I need to replace all these NA's by using number in the same row but different column. Here is the part of my data: V1 V2 V3 V4 V5 V6 V7 0 0 0 1.2 0 0 0.259 0 0 12.8 0 23.7 0 8.495 6 0 81.7 0.2 0 20 19.937 0 1.5 60.9 0 0 15.5 13.900 1 13 56.8 17.5 32.8 6.4 27.654 4 3 66.4 2 0.3 NA 17.145 I want to replace (V6, 6) with (V7, 6). I have about 1000 NA's in V6 which I want to replace with the same row in V7. The other values in V6, I want to keep remain the same. How to achieve this? Assuming my data is called Targetstation, I have tried this: Targetstation - within(Targetstation, v6 - replace(v6, is.na(v6), v7[is.na (v6)])) But R gives me this: Warning messages: 1: In is.na(v6) : is.na() applied to non-(list or vector) of type 'NULL' 2: In is.na(v6) : is.na() applied to non-(list or vector) of type 'NULL' How to solve this? Thank you in advance. Regards, Dila. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MM estmator
Check the gmm package with a weighting matrix equal to the identity. Best Simon On 14 Nov 2013, at 04:37, IZHAK shabsogh ishaqb...@yahoo.com wrote: hi I have a nonlinear regression model with two parameter, i have estimated using nls in R and i want to also find the estimate using MM, someone refer me to this function nlrob but this is main for only M estimate. can you please help me findout a function in R for MM nonlinear regression thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.