Re: [R] URL error when trying to use help function in R [Sec: UNOFFICIAL]
On 09/10/2010 01:03 AM, David Winsemius wrote:
> On Sep 9, 2010, at 6:34 PM, Gosse, Michelle wrote:
>> Greetings, I am using R version 2.11.1 on a Dell computer, via a VMware connection to a remote server. My browser version is IE 8.0.6001.18702 and the OS is some corporate version of Microsoft XP. I'm trying to learn more about the tapply function, so I typed ?tapply into the command line. This opened a browser window with the URL http://127.0.0.1:28138/library/base/html/tapply.html, which gives me an error message. I get the same problem when asking for help on other commands, e.g. ?table (http://127.0.0.1:28138/library/base/html/table.html) and ?log (http://127.0.0.1:28138/library/base/html/Log.html). I did a whois on 127.0.0.1
> That should always be your own computer. The browser is trying to reach a server on itself over port 28138, and either the port is blocked or you don't have the documentation at that location.

The _real_ problem is likely that the server is actually running on the remote computer. Substituting the remote server name for 127.0.0.1 may well make things work (notwithstanding firewalls and the like). It is a generic weakness of our current dynamic HTML setup, or of current browser technology if you like. The same thing happens with file:// URLs: if you try to view one in a browser that is running on a different machine than the one holding the file, you get "file not found". So when R on machine B wants to display a help page, it tells the browser to connect to R's own server on B by specifying a port on localhost (127.0.0.1); but if this request gets forwarded to a browser on machine A, that browser goes looking for a server on _its_ localhost, i.e. machine A, and it isn't there... I suppose we could do somewhat better, but I don't feel too confident about the various platform issues.
As far as I can see, we currently hardcode "http://127.0.0.1:" inside the help print method in utils:::print.help_file_with_topics(), and I suspect we could make that a user option, or try to be more intelligent about finding the machine's own IP address. A pragmatic way out is always options(help_type="text").

-- Peter Dalgaard, Center for Statistics, Copenhagen Business School. Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
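Peter's pragmatic workaround can also be made permanent. A minimal sketch (my addition; assumes a standard ~/.Rprofile setup):

```r
# Switch help rendering to plain text for the current session,
# bypassing the localhost HTTP help server entirely.
options(help_type = "text")

# To make this the default for every session, put the same line
# in ~/.Rprofile. Verify the setting:
getOption("help_type")  # "text"
```

With this set, ?tapply renders in the console instead of launching a browser, so the remote-display problem never arises.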
[R] covariance matrix structure for random effect in glmmPQL
Dear all, I'm using the function glmmPQL in the MASS package to fit a generalized linear mixed model with temporal correlation in the random effect. There are 1825 observations in my data; the random effect is called Date, with five levels, each repeated 365 times. When I tried

fit.model1 <- glmmPQL(y~f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15+f16+f17+f18+f19+f20+f21+f22+f23+f24, family=poisson, random=~1|Date, data=mydata, correlation=corCompSymm(value=0.2, form=~1|Date))

the model fitted well. But because of my particular interest, I need to specify the correlation structure myself, so I tried the following code,

fit.model2 <- glmmPQL(y~f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15+f16+f17+f18+f19+f20+f21+f22+f23+f24, family=poisson, random=~1|Date, data=mydata, correlation=corSymm(value=B[lower.tri(B)], form=~1|Date))

where B is a 365*365 correlation matrix that I specify. This gives the error

Error in vector("double", length) : vector size specified is too large

Even when I write B exactly the same as the structure used in model1, i.e. diagonal elements 1, off-diagonal elements 0.2, the same error message shows. Is this error inherent to the glmmPQL function and beyond my control, or did I do something wrong that I could fix? Thanks so much in advance for any kind help! -- Best, Vicky
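For context (my note, not part of the original thread): corSymm treats every lower-triangle entry as a free parameter to estimate, so a general 365 x 365 correlation matrix implies an enormous optimization problem, which is the likely source of the size error. A quick sketch of the arithmetic:

```r
# Number of free correlation parameters corSymm would have to handle
# for a 365 x 365 unstructured correlation matrix:
n <- 365
n_params <- choose(n, 2)  # lower-triangle entries
n_params                  # 66430 parameters

# corCompSymm, by contrast, has a single parameter regardless of the
# dimension, which is why fit.model1 runs without trouble.
```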
Re: [R] Help on simple problem with optim
It is indeed a negative value for sigma that causes the issue. You can check this by inserting this line

if(sigma <= 0) cat("Negative sigma =", sigma, "\n")

after the line mu <- x %*% beta in the function llk.mar. Negative values for sigma can be avoided with a transformation that forces sigma to be always positive: make optim use log(sigma) as the parameter and transform it back to sigma with sigma <- exp(parm[l]) in llk.mar. Like this:

# define the log likelihood to be maximized over
llk.mar <- function(parm, y, x){
  # parm is the vector of parameters; the last element is log(sigma)
  # y is the response
  # x is the design matrix
  l <- length(parm)
  beta <- parm[-l]
  sigma <- exp(parm[l])   # <=== transform
  x <- as.matrix(x)
  mu <- x %*% beta
  if(sigma <= 0) cat("Negative sigma =", sigma, "\n")
  llk <- sum(dnorm(y, mu, sigma, log=TRUE))
  return(llk)
}

# initial values: use log(sigma) as the independent parameter
parm <- c(as.vector(coef(fit)), log(summary(fit)$sigma))

Caveat: transformations often help in situations like this but can lead to badly scaled problems and are not a universal remedy. /Berend

-- View this message in context: http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533939.html Sent from the R help mailing list archive at Nabble.com.
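A self-contained sketch of the same reparameterization (the simulated data and object names here are mine, not from the thread):

```r
set.seed(1)
# Simulated regression data: y = 2 + 3*x1 + noise with sd = 1.5
n <- 200
x <- cbind(1, rnorm(n))
y <- drop(x %*% c(2, 3)) + rnorm(n, sd = 1.5)

# Log-likelihood with sigma parameterized on the log scale,
# so optim searches over an unconstrained parameter.
llk <- function(parm, y, x) {
  l     <- length(parm)
  beta  <- parm[-l]
  sigma <- exp(parm[l])   # always positive by construction
  mu    <- drop(x %*% beta)
  sum(dnorm(y, mu, sigma, log = TRUE))
}

fit <- optim(c(0, 0, 0), llk, y = y, x = x,
             method = "BFGS", control = list(fnscale = -1))
est <- c(fit$par[1:2], exp(fit$par[3]))  # back-transform sigma
est  # should be close to c(2, 3, 1.5)
```

fnscale = -1 turns optim's minimizer into a maximizer, so llk can be written as a plain log-likelihood.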
Re: [R] Simulation
I have two questions: (1) How do you 'create' a 2 x 2 table in R using, say, an odds ratio of 3 or even 0.5? (2) If I have several 2 x 2 tables, how can I 'implement' dependence in the tables with, say, 25 of the tables having an odds ratio of 1 and 75 of the tables having an odds ratio of 4? Jim
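No answer to question (1) appears in this digest. One possible approach (my sketch, not from the thread): fix the success probability in row 1, derive row 2's probability from the target odds ratio, and simulate counts with rbinom:

```r
set.seed(42)
# Simulate a 2 x 2 table with a target odds ratio.
# p1 is the success probability in row 1 (an arbitrary choice here);
# row 2's odds are row 1's odds divided by the odds ratio.
sim_2x2 <- function(n_per_row, odds_ratio, p1 = 0.5) {
  odds2 <- (p1 / (1 - p1)) / odds_ratio
  p2 <- odds2 / (1 + odds2)
  a <- rbinom(1, n_per_row, p1)  # row 1 successes
  c <- rbinom(1, n_per_row, p2)  # row 2 successes
  matrix(c(a, n_per_row - a, c, n_per_row - c),
         nrow = 2, byrow = TRUE,
         dimnames = list(group = c("g1", "g2"),
                         outcome = c("yes", "no")))
}

tab <- sim_2x2(1000, odds_ratio = 3)
# The empirical odds ratio should be near 3 for large n:
(tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
```

For question (2), the same function could be called 25 times with odds_ratio = 1 and 75 times with odds_ratio = 4.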
Re: [R] Help request: highlighting R code on WordPress.com blogs
Hello D, Thanks for sharing your technique, nice work :) I hope the solution the people here are helping with will make it both cheaper and simpler for people with less CSS experience. p.s: thank you for the kind words regarding R-bloggers.com Best, Tal

Contact details: tal.gal...@gmail.com | 972-52-7275845 | www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Fri, Sep 10, 2010 at 6:40 AM, D Kelly O'Day ko...@processtrends.com wrote:
> Tal, I am one of your estimated 29 Wordpress bloggers. Thanks for your R-Bloggers site!! I use Wordpress.com's site for my blog, and a simple method to highlight my R script in Wordpress (example: http://chartsgraphs.wordpress.com/2010/07/17/time-series-regression-of-temperature-anomaly-data-1-%E2%80%93-don%E2%80%99t-use-ols/#more-3390 ). I wrap my R script blocks in <pre> ... </pre> tags. I purchased Wordpress' CSS service and customized the pre tags to add a text box and a pale yellow color scheme. I use SnagIt to make images of the console results.
> -- View this message in context: http://r.789695.n4.nabble.com/Help-request-highlighting-R-code-on-WordPress-com-blogs-tp2532433p2533842.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] URL error when trying to use help function in R [Sec: UNOFFICIAL]
Hi, I had a similar issue; it was solved by setting the proxy configuration in the browser preferences to "do not use a proxy server for 127.0.0.1". Regards, Petr

r-help-boun...@r-project.org wrote on 10.09.2010 08:18:53:
> [...]
[R] Data.frames: difference between x$a and x[, a]? - How to set new values on x$a with a as a variable?
Hi, I have two questions.

1st question:

a <- "S"
b <- data.frame(S=3)
do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))

=> How can I put new values in the S column when I have the column name in a variable?

2nd question:

a <- "S"
b <- data.frame(S=3)
b[, "S"] <- list(1:10)  # doesn't work
b$S <- list(1:10)       # works

=> Isn't this the same thing? What is the difference between these two? Thanks,
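A common idiom for the first question (my note, not part of the thread): [[ accepts a character index, whereas $ requires a literal name, so the column name can be held in a variable:

```r
a <- "S"
b <- data.frame(S = 3)

# Assign by a name held in a variable:
b[[a]] <- 5
b$S  # now 5

# The same works through do.call if desired:
b <- do.call(`[[<-`, list(b, a, 7))
b$S  # now 7
```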
[R] importing third party hierarchical clustering data into R
hi, I have created a hierarchical clustering program which reads in data and clusters it. The next part of the problem is to visualize the results. I would like to know how I can import my cluster results into an R hclust object so that I can visualise them using the ggobi library. If there is another, easier way, that would be helpful too. My program is able to generate the point labels in each cluster. Thanks a bunch! -- View this message in context: http://r.789695.n4.nabble.com/importing-third-party-hierarchical-clustering-data-into-R-tp2533961p2533961.html Sent from the R help mailing list archive at Nabble.com.
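No answer appears in this digest. For reference, a hand-built hclust object needs at least merge, height, order, and labels components; a sketch under the assumption of three points merged in two steps (see ?hclust for the encoding):

```r
# Minimal hand-built hclust object for 3 points.
# Negative numbers in 'merge' are singletons, positive numbers
# refer to clusters formed in earlier rows.
h <- list(
  merge  = matrix(c(-1, -2,   # step 1: join points 1 and 2
                     1, -3),  # step 2: join that cluster with point 3
                  ncol = 2, byrow = TRUE),
  height = c(0.5, 1.2),  # merge heights, must be nondecreasing
  order  = c(1, 2, 3),   # leaf order for plotting (no crossings)
  labels = c("a", "b", "c")
)
class(h) <- "hclust"

plot(h)           # draws the dendrogram
cutree(h, k = 2)  # cluster memberships at 2 clusters
```

An external program's merge sequence and heights can be mapped into these components and then passed to any function expecting an hclust fit.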
[R] Odp: Data.frames: difference between x$a and x[, a]? - How to set new values on x$a with a as a variable?
Hi

r-help-boun...@r-project.org wrote on 10.09.2010 10:05:37:
> [...]

Did you look at b, e.g. with str(b)? I believe you expected something different. Maybe you wanted rbind:

rbind(b[, "S"], data.frame(S=1:10))

Regards Petr
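To make the difference behind the second question concrete (my sketch, not from the thread): with a one-row data frame, $<- treats the list as a single element, while [<- unwraps it into replacement columns and fails:

```r
b <- data.frame(S = 3)

# $<- treats list(1:10) as one element, so S becomes a list column
# whose single cell holds the whole vector 1:10:
b$S <- list(1:10)
str(b)  # 1 obs. of 1 variable; S is a list

# [<- unwraps the list into columns of replacement values, so it
# tries to put 10 values into a 1-row data frame and errors
# ("replacement has 10 rows, data has 1"):
b2 <- data.frame(S = 3)
res <- try(b2[, "S"] <- list(1:10), silent = TRUE)
inherits(res, "try-error")  # TRUE
```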
Re: [R] lmer fixed effects, SE, t . . . and p
On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote:
> Bert, I appreciate your comments, and I have read Doug Bates' writing about p values in mixed-effects regression. It is precisely because I read Doug's material that I asked how we are to interpret the estimates, rather than how we can compute a p value. My question is a simple question whose answer is undoubtedly complex, but one that needs an answer. Without p values or confidence intervals, I am not certain what to make of the results of my analysis. Does my analysis suggest, or does it not suggest, that there is a relation between time and y? If I can't answer this question after running the analysis, I don't have any more information than I did before I ran it, and a fair question would be why I ran the analysis at all. I am asking for help not in calculating a p value or a CI, but rather in knowing what I can and can't say about the results of the analysis. If this basic question cannot be answered, I am at a loss to interpret my results. Thank you, John

Doug talks quite a lot about profiling lmer fits using the 'profile deviance' to investigate variability in fixed effects. For example, see section 1.5 in the draft of chapter 1 of Doug's book on mixed models: http://lme4.r-forge.r-project.org/book/ HTH, G

> John David Sorkin M.D., Ph.D., Chief, Biostatistics and Informatics, University of Maryland School of Medicine, Division of Gerontology, Baltimore VA Medical Center, 10 North Greene Street GRECC (BT/18/GR), Baltimore, MD 21201-1524. (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

> Bert Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM
> John: Search on this issue in the list archives. Doug Bates has addressed it at length. Basically, he does not calculate CIs or p-values because he does not know how to reliably do so. However, the key remark in your query was: (2) lmer does not give p values or confidence intervals for the fixed effects.
> How are we to interpret the estimates given that no p value or CI is given for them?

Think about it. A statistical analysis -- ANY statistical analysis -- treats the data in isolation: it is not informed by physics, thermodynamics, biology, other similar data, prior experience, or, indeed, any part of the body of relevant scientific knowledge. Do you really think that any such analysis, especially when predicated upon often tenuous or even (necessarily) unverifiable assumptions and simplifications, should be considered authoritative? Classical statistical inference is just another piece of the puzzle, and not even particularly useful when, as is typically the case, hypotheses are formulated AFTER seeing the data (this invalidates the probability calculations -- hypotheses must be formulated before seeing the data to be meaningfully assessed). Leo Breiman called this statistics' quiet scandal something like 20 years ago, and he was no dummy. It is comforting, perhaps, but illusory to believe that statistical inference can be relied on to give sound, objective scientific results. True, without such a framework, science seems rather subjective, perhaps closer to religion and arbitrary cultural archetypes than we care to admit. But see Thomas Kuhn and Paul Feyerabend for why this is neither surprising nor necessarily a bad thing. Cheers, Bert Gunter

On Thu, Sep 9, 2010 at 8:00 PM, John Sorkin jsor...@grecc.umaryland.edu wrote:
> Windows Vista, R 2.10.1
> (1) How can I get the complete table for the fixed effects from lmer? As can be seen from the example below, fixef(fit3) only gives the estimates and not the SE or t value.
>
> fit3 <- lmer(y ~ time + (1|Subject) + (time|Subject), data=data.frame(data))
> summary(fit3)
> Linear mixed model fit by REML
> Formula: y ~ time + (1 | Subject) + (time | Subject)
>    Data: data.frame(data)
>     AIC    BIC logLik deviance REMLdev
>  -126.2 -116.4   70.1   -152.5  -140.2
> Random effects:
>  Groups   Name        Variance    Std.Dev.    Corr
>  Subject  (Intercept) 2.9311e+01  5.41396385
>  Subject  (Intercept) 0.0000e+00  0.00000000
>           time        0.0000e+00  0.00000000  NaN
>  Residual             8.1591e-07  0.00090328
> Number of obs: 30, groups: Subject, 10
>
> Fixed effects:
>              Estimate Std. Error t value
> (Intercept) 14.998216   1.712046       9
> time        -0.999779   0.000202   -4950
>
> Correlation of Fixed Effects:
>      (Intr)
> time -0.001
>
> fixef(fit3)
> (Intercept)        time
>  14.9982158  -0.9997793
>
> (2) lmer does not give p values or confidence intervals for the fixed effects. How are we to interpret the estimates given that no p value or CI is given for the estimates?
> John David Sorkin M.D., Ph.D., Chief, Biostatistics and Informatics, University of Maryland School of Medicine, Division of Gerontology, Baltimore VA Medical Center, 10 North Greene Street GRECC (BT/18/GR), Baltimore,
[R] (no subject)
Hello, I'm trying to do a bar plot where 'sex' will be the category axis, 'occupation' will represent the bars, and the clusters will represent the mean 'income'.

   sex    occupation income
1  female j              12
2  male   b              34
3  male   j              22
4  female j              54
5  male   b              33
6  female b              67
7  male   j              89
8  male   b              65
9  female j              45
10 male   j              32

I can do a bar plot where sex is the category axis and the clusters represent 'occupation'. The code is:

t <- table(data$sex, data$occupation)
barplot(t)

and a barplot where the category axis is 'sex' and the clusters represent the mean and median income. The code is:

mean <- tapply(data$income, data$sex, mean)
mean
female   male
  38.7   46.5
median <- tapply(data$income, data$sex, median)
median
female   male
  22.5   49.5
r <- rbind(mean, median)
r
       female  male
mean     38.7  46.5
median   22.5  49.5
par(fg='red', cex=1.2)
barplot(r, col=c('green','yellow'), cex.axis=1.2, col.axis='red', ylim=c(0,120))

But how can I make 'occupation' nest inside 'sex' and then the clusters represent the mean income? For example, I am attaching a pdf plot produced by SPSS. Thank you. -- Tanvir Khan, MS Student, Institute Of Statistical Research & Training, University Of Dhaka. tkh...@isrt.ac.bd khan.tanvir_...@ymail.com
Re: [R] rgl and lighting
Yes. A white cube and all lights off to start with give me what I want. Many thanks! J

Dr James Foadi PhD, Membrane Protein Laboratory (MPL), Diamond Light Source Ltd, Diamond House, Harwell Science and Innovation Campus, Chilton, Didcot, Oxfordshire OX11 0DE. Email: james.fo...@diamond.ac.uk Alt Email: j.fo...@imperial.ac.uk

-----Original Message----- From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com] Sent: Thu 09/09/2010 18:00 To: Foadi, James (Imperial Coll.,RAL,DIA) Cc: r-help@r-project.org Subject: Re: [R] rgl and lighting

On 09/09/2010 12:02 PM, james.fo...@diamond.ac.uk wrote:
> Dear R community (and Duncan more specifically), I can't work out how to make additional light sources work in rgl. Here is the example. First I create a cube and visualize it:
> cubo <- cube3d(col="black")
> shade3d(cubo)
> Next I position the viewpoint at theta=0 and phi=30:
> view3d(theta=0, phi=30)
> Next, I want to create a 2nd light source which diffuses red light from the front face. I thought I could do:
> light3d(diffuse="red", theta=0, phi=0)
> but... the front side doesn't show any red-ness. The same goes for specular and ambient. What am I doing wrong here? How can the front side be shown in red?

Black doesn't reflect anything, so that's why you're not seeing the red. Colour the cube white, and you'll see it turn pink when you turn the red light on, or red if you turn off the default light first (using rgl.pop("lights")). Be aware that OpenGL (underlying rgl) has a fairly complicated lighting model. When you say col="black", you're only setting the ambient colour, i.e. the colour that appears the same in all directions. (It won't be the same on all faces of the cube, because the intensity depends on the incoming light.) There is also a specular component, which makes things appear shiny, because it's brighter from some viewpoints than others. It is normally white. Finally there's an emission component, which doesn't care about lighting, but is normally turned off.
Lights also have 3 components: ambient (non-directional), diffuse (somewhat directional), and specular (highly directional). Duncan Murdoch

> J
> Dr James Foadi PhD, Membrane Protein Laboratory (MPL), Diamond Light Source Ltd, Diamond House, Harwell Science and Innovation Campus, Chilton, Didcot, Oxfordshire OX11 0DE. Email: james.fo...@diamond.ac.uk Alt Email: j.fo...@imperial.ac.uk
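Duncan's suggestion, gathered into one runnable sketch (my addition; assumes the rgl package is installed and an OpenGL display is available, using the rgl API as of this thread):

```r
library(rgl)

# A white cube reflects incoming light, so a coloured light is visible.
cubo <- cube3d(col = "white")
shade3d(cubo)
view3d(theta = 0, phi = 30)

# Remove the default white light, then add a red diffuse light in
# front of the cube: the cube should now appear red rather than pink.
rgl.pop("lights")
light3d(diffuse = "red", theta = 0, phi = 0)
```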
Re: [R] faster unlist,strsplit,gsub,for
Hi, You can leverage read.table using a textConnection:

txt <- "x,y,z,a,b,c,dda,b,c,d,e,f,gd"
con <- textConnection( gsub( "d", "\n", txt ) )
read.table( con, sep = "," )
  V1 V2 V3 V4 V5 V6 V7
1  x  y  z  a  b  c  d
2  a  b  c  d  e  f  g
close( con )

Romain

Le 10/09/10 06:41, rajesh j a écrit :
> Ok. These operations are on a string and the result is added to a data.frame. I have strings of the form x,y,z,a,b,c,dda,b,c,d,e,f,gd, essentially comma-separated values delimited by a d. I first do a unlist(strsplit(string, split="d")) and then a strsplit(string, split=","). The list of vectors I end up with is added row by row to a preallocated data.frame, like:
> df[i,] <- list[[i]]
> All of this is in a for loop that runs at least 1000 times, and the strings are 7000 to 8000 characters in length.
> On Fri, Sep 10, 2010 at 9:14 AM, jim holtman jholt...@gmail.com wrote:
>> First thing to do is to use Rprof to profile your code to see where the time is being spent; then you can make a decision as to what to change. Are you carrying out the operations on a dataframe? If so, can you change it to a matrix for some of the operations? You have provided no idea of what your code or data looks like, or how often each operation is done. There are probably many ways of speeding up the code, but with no idea of what the code is, no solutions can be specified.
>> On Thu, Sep 9, 2010 at 11:09 PM, rajesh j akshay.raj...@gmail.com wrote:
>>> Hi, I perform the operations unlist, strsplit, gsub and a for loop on a lot of strings and it's heavily slowing down the overall system. Is there some way for me to speed up these operations... maybe alternate versions that exist which use multiprocessors etc.
-- Rajesh.J

-- Romain Francois, Professional R Enthusiast, +33(0) 6 28 91 30 30, http://romainfrancois.blog.free.fr |- http://bit.ly/bzoWrs : Rcpp svn revision 2000 |- http://bit.ly/b8VNE2 : Rcpp at LondonR, oct 5th `- http://bit.ly/aAyra4 : highlight 0.2-2
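A further speed-up beyond the textConnection trick (my sketch; the record delimiter "d" and the sample data are hypothetical stand-ins for the thread's strings): split all strings at once with fixed = TRUE and bind the rows in one step, instead of growing a data.frame with df[i,] <- ... inside a loop.

```r
# Hypothetical data shaped like the thread's: records delimited by "d",
# fields separated by commas.
strings <- rep("x,y,z,a,b,c,1d1,2,3,4,5,6,7d", 100)

# fixed = TRUE skips the regex machinery, which is noticeably faster
records <- unlist(strsplit(strings, split = "d", fixed = TRUE))
fields  <- strsplit(records, split = ",", fixed = TRUE)

# One rbind at the end instead of growing a data.frame row by row,
# which avoids repeated copying of the whole object.
df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
dim(df)  # 200 rows, 7 columns
```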
Re: [R] (no subject)
Have a look at the ggplot2 package:

install.packages("ggplot2")
library(ggplot2)
ggplot(data, aes(x = sex, y = income, fill = occupation)) + geom_bar(position = "dodge")

Have a look at http://had.co.nz/ggplot2/ for more information and examples. HTH, Thierry

ir. Thierry Onkelinx, Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest, team Biometrics & Quality Assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium. tel. +32 54/436 185, thierry.onkel...@inbo.be, www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Tanvir Khan Sent: Friday 10 September 2010 11:39 To: r-help@r-project.org Subject: [R] (no subject)
> [...]
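With current ggplot2 (3.3 or later; my note, not part of the thread), the mean-per-group bars the poster asked for can be drawn directly with a summary stat. A sketch using the thread's small data set:

```r
library(ggplot2)

# The thread's data, reconstructed as a data frame
data <- data.frame(
  sex        = c("female","male","male","female","male",
                 "female","male","male","female","male"),
  occupation = c("j","b","j","j","b","b","j","b","j","j"),
  income     = c(12, 34, 22, 54, 33, 67, 89, 65, 45, 32)
)

# One bar per sex x occupation cell, with bar height = mean income
p <- ggplot(data, aes(x = sex, y = income, fill = occupation)) +
  geom_bar(stat = "summary", fun = "mean", position = "dodge")
p
```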
Re: [R] Alignment of lines within barplot bars
On 09/10/2010 01:35 AM, Steve Murray wrote:
> Dear all, I have a barplot upon which I hope to superimpose horizontal lines extending across the width of each bar. I am able to partly achieve this through the following set of commands:
> positions <- barplot(bar_values, col="grey")
> par(new=TRUE)
> plot(positions, horiz_values, col="red", pch="_", ylim=c(min(bar_values), max(bar_values)))
> ...however this results in small, off-centred lines, which don't extend across the width of each bar. I've tried using 'cex' to increase the width, but of course this also increases the height of the line and results in it spanning a large range of y-axis values. I'm sure this shouldn't be too tricky to achieve, nor that uncommon a problem! It may be that I'm taking the wrong approach.

Hi Steve, The barp function in the plotrix package centers the bars on integer values and allows you to control the width of the bars (0.4 on each side is the default). This would make it easier to calculate the values for the segments function to superimpose the lines. Jim
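Base graphics can also do this without plotrix (my sketch; bar_values and horiz_values are hypothetical stand-ins for Steve's data): barplot() invisibly returns the bar centres, and with the default bar width of 1 each bar spans centre ± 0.5, so segments() can draw a line across each bar.

```r
bar_values   <- c(3, 5, 2, 4)  # hypothetical bar heights
horiz_values <- c(2, 4, 3, 1)  # hypothetical line heights

# barplot() returns the x positions of the bar centres; the default
# bar width is 1, so each bar spans centre +/- 0.5.
positions <- barplot(bar_values, col = "grey")
segments(x0 = positions - 0.5, y0 = horiz_values,
         x1 = positions + 0.5, y1 = horiz_values,
         col = "red", lwd = 2)
```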
[R] Barplots, was Re: (no subject)
On Sep 10, 2010, at 11:38, Tanvir Khan wrote:
> [...]

(Please use a sensible Subject:; those of us with threading mail programs see this mixed in with oodles of old and irrelevant posts.) I think you are just looking for some of this:

inc.by.sex.occ <- with(data, tapply(income, list(sex, occupation), mean))
barplot(inc.by.sex.occ)
barplot(inc.by.sex.occ, beside=T)
barplot(t(inc.by.sex.occ))
barplot(t(inc.by.sex.occ), beside=T)

(or, of course, reverse the order in the list() rather than transpose the result)

-- Peter Dalgaard, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark. Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Axis break with gap.plot()
On 09/10/2010 01:07 AM, Filoche wrote:
> Hi everyone. I'm trying to break the y axis on a plot. For instance, I have 2 series (points and a loess). Since the loess is a continuous set of points, it passes through the break section. However, with gap.plot I can't plot the loess because of this (I get the message "some values of y will not be displayed"). Here's my code:
> library(plotrix);
> #generate some data
> x = seq(-pi,pi,0.1);
> sinx = sin(x);
> #add leverage value
> sinx = c(sinx,10);
> xx = c(x,max(x) + 0.1);
> #Loess
> yy = loess(sinx ~ xx, span = 0.1);
> yy = predict(yy);
> #Add break between 2 and 8
> gap.plot(xx,sinx,c(2,8));            # this line works fine
> gap.plot(xx,yy,c(2,8), add = TRUE);  # this won't plot the loess
> I did the graphic I would like to produce in Sigmaplot: http://img830.imageshack.us/img830/5206/breakaxis.jpg

Hi Phil, The loess is being displayed, but because it is just reproducing the points already there, except for one or two, you don't see it. If you try this:

gap.plot(xx, yy, c(2,8), add = TRUE, type="l");

you'll see the line, although you won't get the uptick at the end because it passes through the gap. It would require a bit of manual labor to get the same plot as your example. If you have to do just one of these, I would probably recalculate the loess fit to account for the gap and display it with lines. If you have to do lots, I would think about writing a function to do this that you could call instead of the second call to gap.plot. Jim
Re: [R] (no subject)
On 09/10/2010 07:38 PM, Tanvir Khan wrote: Hello, I'm trying to do a bar plot where 'sex' will be the category axis, 'occupation' will represent the bars, and the bars will represent the mean 'income'.

   sex occupation income
1  female  j  12
2  male    b  34
3  male    j  22
4  female  j  54
5  male    b  33
6  female  b  67
7  male    j  89
8  male    b  65
9  female  j  45
10 male    j  32

I can do a bar plot where sex is the category axis and the clusters represent 'occupation'; the code is:

t <- table(data$sex, data$occupation)
barplot(t)

and the bar plot where the category axis is 'sex' and the clusters represent the mean income and median income. The code is:

mean <- tapply(data$income, data$sex, mean)
median <- tapply(data$income, data$sex, median)
r <- rbind(mean, median)
par(fg = 'red', cex = 1.2)
barplot(r, col = c('green', 'yellow'), cex.axis = 1.2, col.axis = 'red', ylim = c(0, 120))

But how can I make 'occupation' nest inside 'sex' and then the clusters represent the mean income? For example, I am attaching a pdf plot that is produced by SPSS.

Hi Tanvir, I think you may be looking for barNest (in the plotrix package):

examp <- data.frame(sex = sample(c("Female", "Male"), 100, TRUE),
                    position = sample(c("Clerical", "Custodial", "Manager"), 100, TRUE),
                    income = rnorm(100, 3, 5000))
# just show the final bars
barNest(income ~ sex + position, examp,
        col = list("gray", c("pink", "lightblue"), c("blue", "green", "tan")))
# show the entire nest of bars
barNest(income ~ sex + position, examp, showall = TRUE,
        col = list("gray", c("pink", "lightblue"), c("blue", "green", "tan")))

Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Standardized logistic regression coefficients
Dear all, I am looking for ways to compute standardized logistic regression coefficients. I found papers describing at least 6 different ways to standardize logistic regression coefficients. I also found a very old (Thu May 12 21:50:36 CEST 2005) suggestion by Frank E Harrell (one of the colleagues who frequently contributes on this list) saying: "Design doesn't implement those because they have terrible properties. Instead consider interquartile-range odds ratios (done by summary.Design by typing summary(. . .))." 1. Is this still the case, or is there any package today in R which computes some sort of standardized logistic regression coefficients widely accepted by the community? 2. Also, if anyone knows, how can I implement these interquartile-range odds ratios of the Design package? I checked the manual and found no reference to interquartile-range odds ratios. Thanks Dr. Iasonas Lamprianou Assistant Professor (Educational Research and Evaluation) Department of Education Sciences European University-Cyprus P.O. Box 22006 1516 Nicosia Cyprus Tel.: +357-22-713178 Fax: +357-22-590539 Honorary Research Fellow Department of Education The University of Manchester Oxford Road, Manchester M13 9PL, UK Tel. 0044 161 275 3485 iasonas.lampria...@manchester.ac.uk __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
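For question 2, the interquartile-range odds ratio Harrell refers to can be computed by hand from any fitted glm: scale the coefficient by the predictor's IQR and exponentiate. A minimal sketch on simulated data (this is an illustration of the idea, not the Design/summary.Design implementation itself):

```r
set.seed(1)
# Simulated continuous predictor and binary outcome
x <- rnorm(200, mean = 50, sd = 10)
p <- plogis(-3 + 0.06 * x)       # true log-odds linear in x
y <- rbinom(200, 1, p)

fit <- glm(y ~ x, family = binomial)

# Odds ratio for moving from the 25th to the 75th percentile of x:
# exp(beta * IQR(x)), rather than the per-unit exp(beta)
iqr    <- IQR(x)
or.iqr <- exp(coef(fit)["x"] * iqr)
or.iqr
```

Unlike "standardized" coefficients, this stays on the odds-ratio scale and compares two meaningful values of the predictor, which is the property Harrell's suggestion is built on.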
Re: [R] gee p values
There are two z-scores reported in the summary: Naive z and Robust z. For either one: pvalue = 2 * min(pnorm(z), 1 - pnorm(z)) # two-sided test -- View this message in context: http://r.789695.n4.nabble.com/gee-p-values-tp2533835p2534302.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
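The 2*min(...) recipe is equivalent to the usual two-sided formula 2*pnorm(-abs(z)); a quick base-R check (the z value here is just an illustration):

```r
z <- 2.31  # example z-score (the Naive or Robust z from the gee summary)

p1 <- 2 * min(pnorm(z), 1 - pnorm(z))  # recipe quoted above
p2 <- 2 * pnorm(-abs(z))               # standard two-sided form

all.equal(p1, p2)  # TRUE
```

The second form is numerically preferable for large |z|, because 1 - pnorm(z) loses precision by cancellation where pnorm(-abs(z)) computes the tail directly.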
[R] lmer output
Hi, I have a question regarding the output of a binomial lmer model. The model is as follows:

lmer(y ~ diet * day * female + (day | female), family = binomial)

The corresponding output is:

Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ diet * day * female + (day | female)
  AIC  BIC logLik deviance
 1084 1136 -531.1     1062
Random effects:
 Groups Name        Variance Std.Dev. Corr
 female (Intercept) 1.403060 1.18451
        day         0.012044 0.10975  -0.674
Number of obs: 809, groups: female, 53
Fixed effects:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)         0.996444   0.713703   1.396   0.1627
dietNAA             1.194581   0.862294   1.385   0.1659
day                 0.142026   0.074270   1.912   0.0558 .
female              0.015629   0.019156   0.816   0.4146
dietNAA:day        -0.124755   0.088684  -1.407   0.1595
dietNAA:female     -0.024733   0.026947  -0.918   0.3587
day:female         -0.001535   0.001966  -0.781   0.4348
dietNAA:day:female  0.001543   0.002720   0.568   0.5704

Now from my understanding, the estimates represent differences in slopes and intercepts between different levels of diet and so on. My questions: 1. Is there a way to display the coefficients for all levels of variables (e.g., dietAA and dietNAA)? Because it is quite hard to calculate the slopes and intercepts for all levels of each variable. 2. Is there a way to get the degrees of freedom? Thanks for your help. Regards, Denis Aydin -- This email and any files transmitted with it are confide...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gee p values
Peng, If the answer were as simple as you suggest, I would expect that gee would automatically produce the p values. Since gee does not produce the values, I fear that the computation may be more complex, or perhaps computing p values from gee may be controversial. Do you know which, if either, of my speculations is true? Thank you, John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Peng, C cpeng@gmail.com 9/10/2010 8:06 AM There are two z-scores reported in the summary: Naive z and Robust z. pvalue = 2 * min(pnorm(z), 1 - pnorm(z)) # two-sided test -- View this message in context: http://r.789695.n4.nabble.com/gee-p-values-tp2533835p2534302.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lme vs. lmer, how do they differ?
Windows Vista, R 2.10.1. What is the difference (or differences) between lme and lmer? Both appear to perform mixed effects regression analyses. Thanks John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lme, groupedData, random intercept and slope
Windows Vista, R 2.10.1. Does the following use of groupedData and lme produce an analysis with both a random intercept and slope, or only a random slope?

zz <- groupedData(y ~ time | Subject, data = data.frame(data),
                  labels = list(x = "Time", y = "y"),
                  units = list(x = "(yr)", y = "(mm)"))
plot(zz)
fit10 <- lme(zz)
summary(fit10)

Linear mixed-effects model fit by REML
 Data: zz
        AIC       BIC  logLik
  -123.1942 -115.2010 67.5971
Random effects:
 Formula: ~time | Subject
 Structure: General positive-definite
            StdDev       Corr
(Intercept) 6.054897e+00 (Intr)
time        4.160662e-05 1
Residual    9.775954e-04
Fixed effects: y ~ time
                Value Std.Error DF   t-value p-value
(Intercept) 15.000217  1.914727 19     7.834       0
time        -1.000151  0.000219 19 -4566.598       0
 Correlation:
     (Intr)
time  0.059
Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-1.73706837 -0.36289558  0.06892484  0.59777067  1.69095476
Number of Observations: 30
Number of Groups: 10

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:6}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
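The printed Random effects block answers the question: `Formula: ~time | Subject` with StdDev rows for both `(Intercept)` and `time` means both a random intercept and a random slope were fitted. Writing the model out explicitly, rather than relying on the groupedData formula, makes this visible; a sketch with simulated data (variable names mirror the post; the data themselves are invented):

```r
library(nlme)  # ships with R

set.seed(42)
# 10 subjects, 3 time points each, with subject-specific
# intercepts (b0) and slopes (b1)
d  <- data.frame(Subject = factor(rep(1:10, each = 3)),
                 time    = rep(0:2, times = 10))
b0 <- rnorm(10, sd = 4)
b1 <- rnorm(10, sd = 0.5)
d$y <- 15 + b0[d$Subject] + (-1 + b1[d$Subject]) * d$time + rnorm(30, sd = 0.2)

# Equivalent to lme(zz) on a groupedData object: random = ~ time | Subject
# requests BOTH a random intercept and a random slope for time
fit <- lme(y ~ time, random = ~ time | Subject, data = d)

VarCorr(fit)   # rows for (Intercept), time, and Residual

# For a random intercept only, one would write: random = ~ 1 | Subject
```

In the posted output, the slope StdDev (4.16e-05) is essentially zero relative to the intercept StdDev (6.05), so the fit is effectively random-intercept-only, even though both terms were estimated.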
[R] [R-pkgs] plyr: version 1.2
plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model to each patient subset of a data frame * quickly calculate summary statistics for each group * perform group-wise transformations like scaling or standardising It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with: * totally consistent names, arguments and outputs * convenient parallelisation through the foreach package * input from and output to data.frames, matrices and lists * progress bars to keep track of long running operations * built-in error recovery, and informative error messages * labels that are maintained across all transformations Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in functions. You can find out more at http://had.co.nz/plyr/, including a 20 page introductory guide, http://had.co.nz/plyr/plyr-intro.pdf. You can ask questions about plyr (and data-manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr Version 1.2 (2010-09-09) -- NEW FEATURES * l*ply, d*ply, a*ply and m*ply all gain a .parallel argument that, when TRUE, applies functions in parallel using a parallel backend registered with the foreach package:

x <- seq_len(20)
wait <- function(i) Sys.sleep(0.1)
system.time(llply(x, wait))
#   user  system elapsed
#  0.007   0.005   2.005
library(doMC)
registerDoMC(2)
system.time(llply(x, wait, .parallel = TRUE))
#   user  system elapsed
#  0.020   0.011   1.038

This work has been generously supported by BD (Becton Dickinson).
MINOR CHANGES * a*ply and m*ply gain an .expand argument that controls whether data frames produce a single output dimension (one element for each row), or an output dimension for each variable. * new vaggregate (vector aggregate) function, which is equivalent to tapply, but much faster (~10x), since it avoids copying the data. * llply: for simple lists and vectors, with no progress bar, no extra info, and no parallelisation, llply calls lapply directly to avoid all the overhead associated with those unused extra features. * llply: in the serial case, the for loop has been replaced with a custom C function that takes about 40% less time (or about 20% less time than lapply). Note that as a whole, llply still has much more overhead than lapply. * round_any now lives in plyr instead of reshape BUG FIXES * list_to_array works correctly even when there are missing values in the array. This is particularly important for daply. -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
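The split-apply-combine pattern the announcement describes can be sketched in base R for comparison (column and group names here are invented for illustration; ddply collapses the three steps into one call):

```r
# Toy data: several measurements per patient
df <- data.frame(patient = rep(c("a", "b", "c"), each = 4),
                 value   = c(1:4, 2:5, 3:6))

# split: one data frame per patient
pieces <- split(df, df$patient)

# apply: compute a summary statistic for each piece
means <- lapply(pieces, function(p)
  data.frame(patient = p$patient[1], mean = mean(p$value)))

# combine: bind the per-group results back into one data frame
result <- do.call(rbind, means)
result
```

plyr's contribution is not making this possible, which base R already does, but making the naming, output types, progress reporting, and (as of 1.2) parallelisation consistent across all the split-apply-combine variants.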
[R] [R-pkgs] reshape2: a reboot of the reshape package
Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focussed and much much faster. This version improves speed at the cost of functionality, so I have renamed it to `reshape2` to avoid causing problems for existing users. Based on user feedback I may reintroduce some of these features. What's new in `reshape2`: * considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data. * cast is replaced by two functions depending on the output type: `dcast` produces data frames, and `acast` produces matrices/arrays. * multidimensional margins are now possible: `grand_row` and `grand_col` have been dropped: now the name of the margin refers to the variable that has its value set to (all). * some features have been removed such as the `|` cast operator, and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr. * a new cast syntax which allows you to reshape based on functions of variables (based on the same underlying syntax as plyr). * better development practices like namespaces and tests. This work has been generously supported by BD (Becton Dickinson). -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Over lay 2 scale in same plot
Hi Josh, Thanks for your reply. I gave a reply yesterday but found that it was not posted. I managed to plot the bar plot and overlay points. The problem I am facing now is the spread of the Y scale. The values I am plotting on the Y scale are very close, so they look pretty flat (lowest value 7.5 and highest value 8.5; if the Y-scale ranges are 6-8, 8-10, the values look pretty flat). How can I make the spread of the Y scale finer, i.e. 6.2 - 6.4 - 6.6 - ... - 8.8 - 10, so that the values do not look flat? I have added an image below. http://r.789695.n4.nabble.com/file/n2534370/_GE_and_CN_Combined_Plot_mod.png Thanks in advance. regards, Mamun -- View this message in context: http://r.789695.n4.nabble.com/Over-lay-2-scale-in-same-plot-tp2528661p2534370.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme vs. lmer, how do they differ?
John Sorkin jsor...@grecc.umaryland.edu 10/09/2010 13:21:09 What is the difference (or differences) between lme and lmer? Both appear to perform mixed effects regression analyses. From a user's point of view: - lme only accepts nested random effects; lmer handles crossed random effects - lme has a convenient method of handling heteroscedasticity; lmer doesn't. - lme gives you p-values; lmer doesn't (there is an explanation of why at https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html which, I guess, should not be all that reassuring for lme users either) Steve Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme vs. lmer, how do they differ?
John Sorkin jsorkin at grecc.umaryland.edu writes: windows Vista R 2.10.1 What is the difference (or differences) between lme and lmer? Both appear to perform mixed effects regression analyses. Thanks John in a nutshell: lmer is newer, much faster, handles crossed random effects well (and generalized linear mixed models), has some support for producing likelihood profiles (in the development version), and is under rapid development. It does not attempt to estimate residual degrees of freedom and hence does not give p-values for significance of effects. lme is older, better documented (Pinheiro and Bates 2000), more stable, and handles 'R-side' structures (heteroscedasticity, within-group correlations). r-sig-mixed-models is a good place for questions about these packages. Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
Hi, On Fri, Sep 10, 2010 at 4:05 AM, omerle ome...@laposte.net wrote: Hi, I got two questions: 1st Question

a = "S"
b = data.frame(S = 3)
do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))

I think there is some confusion here. Why are you setting a equal to "S" but then never using it? = How can I put new values on the S column having the column name as a variable? I'm having trouble parsing this. What exactly do you want to do? 2nd Question

a = "S"
b = data.frame(S = 3)
b[, "S"] = list(1:10)  # doesn't work
b$S = list(1:10)       # works

= Isn't that the same thing? What is the difference between these two things? I believe b[["S"]] is the same as b$S; b[, "S"] is different. But I have to question your assertion that b$S = list(1:10) "works". This is a very odd construction (putting a list as an element of a data.frame) and is almost certainly not what you want. Thanks, Une messagerie gratuite, garantie à vie et des services en plus, ça vous tente ? Je crée ma boîte mail www.laposte.net [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
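The distinction Ista points to can be demonstrated directly; for a single column of a plain data frame, `$name`, `[["name"]]`, and `[, "name"]` all return the same vector, but only `[[` accepts a variable holding the column name (a base-R sketch with the same names as the thread):

```r
b <- data.frame(S = 1:3)

# All three return the same column vector for a single column
identical(b$S, b[["S"]])   # TRUE
identical(b$S, b[, "S"])   # TRUE here; but [ , ] uses matrix-style
                           # indexing and returns a data frame when
                           # more than one column is selected

# Assignment with a *variable* holding the column name: use [[ ]],
# since b$a would look for a column literally named "a"
a <- "S"
b[[a]] <- 4:6
b$S                        # 4 5 6
```

This is why `b[, "S"] <- list(1:10)` and `b$S <- list(1:10)` behave differently in the original post: `[<-` on a data frame enforces column-length recycling rules that `$<-`/`[[<-` do not.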
Re: [R] Which language is faster for numerical computation?
Dirk E. has properly focussed the discussion on measurement rather than opinion. I'll add the issue of the human time taken to convert, and more importantly debug, interfaced code. That too could be measured, but we rarely see human hours to code/debug/test reported. Moreover, I'll mention the cat among the pigeons of Rcgmin, which I wrote to allow me to play with an optimization code more easily to discover where the algorithm might be improved. The resulting package on some problems outperforms C equivalents. Now the code is quite vectorized, but this has still been a very nice surprise. In fact, I've decided to avoid playing around with the interfaces if I can run things well-enough entirely in R. JN __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
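JN's point about vectorized R holding its own can be illustrated with system.time(): the same computation written as an interpreted element-by-element loop versus a single vectorized expression (timings vary by machine; the function here is invented for illustration):

```r
n <- 1e5
x <- runif(n)

# Loop version: one R-level iteration per element
f_loop <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- sqrt(x[i]) + 1
  out
}

# Vectorized version: the loop runs in compiled code inside R itself
f_vec <- function(x) sqrt(x) + 1

t1 <- system.time(r1 <- f_loop(x))
t2 <- system.time(r2 <- f_vec(x))

all.equal(r1, r2)  # TRUE; the vectorized form is typically far faster
```

Measurements like these, rather than opinion, are what the thread argues should drive the "R versus compiled code" decision, and they say nothing about the human hours saved by staying in one language.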
[R] SSOAP complex request types
Hi, I'm having a bit of trouble using SSOAP to send requests containing complex types. It seems as though processWSDL does not generate functions for converting the generated types into SOAP requests, but it does generate them for converting **from** SOAP requests to the complex types. The error that I get when trying to pass an object of type LoginReq (as generated by processWSDL) is that LoginReq does not inherit method toSOAP with signature (obj=LoginReq, con=XMLInternalElementNode, type=NULL). Does anyone with experience of this library know if there's something I'm doing wrong, or is it the case that sending complex request types isn't supported? Thanks, Jonny [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
Message du 10/09/10 14:53 De : Ista Zahn A : omerle Copie à : r-help@r-project.org Objet : Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

Hi, On Fri, Sep 10, 2010 at 4:05 AM, omerle wrote: Hi, I got two questions: 1st Question

a = "S"
b = data.frame(S = 3)
do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))

I think there is some confusion here. Why are you setting a equal to "S" but then never using it? = How can I put new values on the S column having the column name as a variable? I'm having trouble parsing this. What exactly do you want to do?

1 - Put a list as an element of a data.frame. That's quite convenient for my pricing function.

2nd Question

a = "S"
b = data.frame(S = 3)
b[, "S"] = list(1:10)  # doesn't work
b$S = list(1:10)       # works

= Isn't that the same thing? What is the difference between these two things? I believe b[["S"]] is the same as b$S; b[, "S"] is different. But I have to question your assertion that b$S = list(1:10) "works". This is a very odd construction (putting a list as an element of a data.frame) and is almost certainly not what you want.

2 - That's what I want. I figured out just five minutes ago that b[["S"]] works because it's the same thing as b$S. But I still don't know what b[, "S"] is compared to b[["S"]].

Thanks, -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org

Une messagerie gratuite, garantie à vie et des services en plus, ça vous tente ? Je crée ma boîte mail www.laposte.net [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] package gbm C++ code as separate module
Dear all, I would like to separate the gbm C++ code from any R dependencies, so that it could be compiled into a standalone module. I am wondering if anyone has already done this and could provide me with some pointers/help ? Thanks! Markus [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating with tolerances (error propagation)
On Sep 9, 2010, at 10:57 AM, David Winsemius wrote: On Sep 9, 2010, at 6:50 AM, Jan private wrote: Hello Bernardo, - If I understood your problem, this script solves your problem:

q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q*h
# [1]  2.475  7.500 12.625

- OK, this solves the simple example. But what if the example is not that simple? E.g. P = 5 * q/h. Here, to get the maximum tolerances for P, we need to divide the maximum value for q by the minimum value for h, and vice versa. Is there any way to do this automatically, without thinking about every single step? There is a thing called interval arithmetic (I saw it as an Octave package) which would do something like this. (Sorry for the blank reply posting. Serum caffeine has not yet reached optimal levels.) Is it possible that interval arithmetic would produce statistically incorrect tolerance calculations, and that this is why it has not been added to R? Those tolerance intervals are presumably some sort of (unspecified) prediction intervals (i.e. contain 95% or 63% or some fraction of a large sample), and combinations under mathematical operations are not going to be properly derived by c(min(XY), max(XY)), since those are not calculated with any understanding of combining variances of functions on random variables. There is a function, propagate, in the qpcR package that does incorporate statistical principles in handling error propagation. Thanks to the author, Dr. rer. nat. Andrej-Nikolai Spiess, for drawing it to my attention (and of course for writing it). It appears that it should handle the data situation offered with only minor modifications. The first example is very similar to your (more difficult) ratio problem.

install.packages(pkgs = "qpcR", type = "source")
# binary install did not succeed on my Mac, but installing from source
# produced no errors or warnings
require(qpcR)
q <- c(0.15, .1)   # (mean, sd)
h <- c(10, .1)     # (mean, sd)
EXPR <- expression(5*q/h)
DF <- cbind(q, h)
res <- propagate(expr = EXPR, data = DF, type = "stat",
                 do.sim = TRUE, verbose = TRUE)
res$summary
#                    Sim Perm       Prop
# Mean        0.07500751  NaN     0.0750
# s.d.        0.05001970   NA 0.05000562
# Median      0.07445724   NA         NA
# MAD         0.04922935   NA         NA
# Conf.lower -0.02332498   NA -0.02300922
# Conf.upper  0.17475818   NA  0.17300922

(My only suggestion for enhancement would be a print or summary method that did not output every single simulated value.) Three methods for error propagation estimation are included: a) Monte-Carlo simulation using sampling from the data, b) Permutation, and c) Gaussian errors calculated via a Taylor series expansion. There is a rich set of worked examples. It appears to be capable of meeting a wide variety of challenges. -- David. I would have thought that tracking how a (measuring) error propagates through a complex calculation would be a standard problem of statistics?? In probability theory, anyway. In other words, I am looking for a data type which is a number with a deviation +- somehow attached to it, with binary operators that automatically know how to handle the deviation. There is a suite of packages that represent theoretic random variables and support mathematical operations on them. See distrDoc and the rest of that suite. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
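The Monte-Carlo approach (method (a) above) can be sketched in base R without the qpcR package, for the same 5*q/h example; means and standard deviations are taken from the message, but the simulation itself is an illustration of the idea, not qpcR's implementation:

```r
set.seed(123)
n <- 100000

# q = 0.15 +/- 0.1 and h = 10 +/- 0.1 (mean, sd), treated as Gaussian
q <- rnorm(n, mean = 0.15, sd = 0.1)
h <- rnorm(n, mean = 10,   sd = 0.1)

# Push the simulated values through the expression of interest
P <- 5 * q / h

# Propagated mean and sd of P, plus an empirical 95% interval
mean(P)                       # ~ 0.075
sd(P)                         # ~ 0.050
quantile(P, c(0.025, 0.975))
```

These agree with the Sim column of the propagate() output quoted above (mean 0.075, s.d. 0.050), which is the point of the statistical approach: the spread of P follows from the variances of q and h, not from plugging in worst-case endpoints as interval arithmetic would.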
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
On Fri, Sep 10, 2010 at 9:22 AM, omerle ome...@laposte.net wrote: [earlier exchange quoted] 1 - Put a list as an element of a data.frame. That's quite convenient for my pricing function.

I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists?

2 - That's what I want. I figured out just five minutes ago that b[["S"]] works because it's the same thing as b$S. But I still don't know what b[, "S"] is compared to b[["S"]].

see ?"["

Thanks, -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
I'm having trouble parsing this. What exactly do you want to do? 1 - Put a list as an element of a data.frame. That's quite convenient for my pricing function. I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists?

It can be very convenient, but I suspect the original poster is confused about the difference between vectors and lists.

Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
On 9/10/2010 15:37, Ista Zahn wrote:

I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists?

Since data.frames are lists, why would it be a bad practice?

I believe b[["S"]] is the same as b$S; b[, "S"] is different. [...] see ?"["

From the help page, there is not much distinction. Maybe I haven't understood all the details...

Ivan -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt.
Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
[R] pairwise.t.test vs t.test
Dear all, I am perplexed when trying to get the same results using pairwise.t.test and t.test. I'm using examples in the ISwR library:

attach(red.cell.folate)

I can get the same result for pairwise.t.test and t.test when I set the variances to be non-equal, but not when they are assumed to be equal. Can anyone explain the differences, or what I'm doing wrong? Here's an example where I compare the first two ventilations with pairwise.t.test and t.test:

pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=FALSE)

Pairwise comparisons using t tests with non-pooled SD
data: folate and ventilation
          N2O+O2,24h N2O+O2,op
N2O+O2,op 0.029      -
O2,24h    0.161      0.298
P value adjustment method: none

t.test(folate[1:8], folate[9:17], var.equal=FALSE)

Welch Two Sample t-test
data: folate[1:8] and folate[9:17]
t = 2.4901, df = 11.579, p-value = 0.02906
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: 7.310453 113.050658
sample estimates: mean of x mean of y 316.6250 256.

So 0.029 and 0.02906 are identical, but if I do the same with pool.sd and var.equal = TRUE, I get different results:

pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=TRUE)

Pairwise comparisons using t tests with pooled SD
data: folate and ventilation
          N2O+O2,24h N2O+O2,op
N2O+O2,op 0.014      -
O2,24h    0.155      0.408
P value adjustment method: none

t.test(folate[1:8], folate[9:17], var.equal=TRUE)

Two Sample t-test
data: folate[1:8] and folate[9:17]
t = 2.5582, df = 15, p-value = 0.02184
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: 10.03871 110.32240
sample estimates: mean of x mean of y 316.6250 256.

So 0.014 and 0.02184 are not the same.
[R] Problem importing square character
Dear, When I try to execute the following command, R doesn't read all the lines (it reads only 57658 lines when the file has 814125 lines):

dados2 <- read.table("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt",
    header=FALSE, sep="^",
    colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL","character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL","NULL","NULL",NA,"NULL","NULL"),
    quote="", comment.char="", skip=1, fill=TRUE)

If I exclude fill=TRUE, R gives the message:

Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : número de itens não é múltiplo do número de colunas (the number of items is not a multiple of the number of columns)

I identified that the problem is the following line of my data (line 57659 of my file):

13850074571^01/01/1940^000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE

As you can observe, my data contain a square-looking character (I don't know if you can see the character, but it looks like a white square). It seems that R treats this character as the end of the file. I opened my data in Notepad and copied the character. When I paste this character into R, it tries to close, asking if I want to save my work. What is happening? Thanks very much. Marcelo Estácio
Re: [R] Over lay 2 scale in same plot
Hi Mamun, You can look at the ylim argument to plot(). It lets you control the limits; however, in your example graph, part of the issue is that the bars have a much higher value, so you could not change ylim too much (it looks like in your example graph you could set it to something like ylim = c(0, 7)?). Josh

On Fri, Sep 10, 2010 at 5:43 AM, mamunbabu2001 mrashi...@hotmail.com wrote: Hi Josh, Thanks for your reply. I gave a reply yesterday but found that it was not posted. I managed to draw the bar plot and overlay points. The problem I am facing now is the spread of the Y scale. The values I am plotting on the Y scale are very close together, so they look pretty flat (lowest value 7.5 and highest value 8.5, so if the ranges on the Y scale are 6-8, 8-10, the values look pretty flat). How can I make the spread of the Y scale finer, i.e. 6.2 - 6.4 - 6.6 - ... - 8.8 - 10, so that the values do not look flat? I have added an image below. http://r.789695.n4.nabble.com/file/n2534370/_GE_and_CN_Combined_Plot_mod.png Thanks in advance. regards, Mamun

-- View this message in context: http://r.789695.n4.nabble.com/Over-lay-2-scale-in-same-plot-tp2528661p2534370.html Sent from the R help mailing list archive at Nabble.com.

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
On Sep 10, 2010, at 9:42 AM, Hadley Wickham wrote: I'm having trouble parsing this. What exactly do you want to do? 1 - Put a list as an element of a data.frame. That's quite convenient for my pricing function. I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists? It can be very convenient, but I suspect the original poster is confused about the difference between vectors and lists.

I wouldn't be surprised if someone were confused, since my reading of some (but not all) of the help documents has led me to think that lists _were_ vectors, just not vectors of atomic mode. And one oft-illustrated method for creating a list is: alist <- vector(mode="list", length=10). I am perhaps less confused than I was two years ago, but my confusion about all the possible permutations of mode, typeof, expression, formula, and class, and the extraction methods therefrom, definitely persists. I think the authors of the documentation are of divided opinion or usage on this topic.

Best; David.

Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/

David Winsemius, MD West Hartford, CT
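Since the confusion above turns on what "vector" and "list" mean in R, here is a small illustrative sketch (not from the original posts) of how is.vector(), is.atomic(), mode() and typeof() relate:

```r
# Lists are generic vectors: is.vector() is TRUE, but they are not atomic.
x <- vector(mode = "list", length = 3)
is.vector(x)   # TRUE
is.atomic(x)   # FALSE
mode(x)        # "list"

# For atomic vectors, mode() and typeof() differ in granularity:
typeof(1:3)    # "integer"
mode(1:3)      # "numeric"
```

So the help documents are consistent in calling lists vectors; it is atomicity, not vector-ness, that separates them from numeric or character vectors.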
[R] Can I save my console contents automatically?
Dear All, I'm using the R for Mac OS X Cocoa GUI, R version 2.11.1. I can save the contents of my console window by using command + s, but I would like to do the same thing using R commands. My question is: can I save the console contents automatically with some R commands? Thank you. Nobu
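One possible approach (a sketch, not specific to the Mac GUI; the file names are illustrative) is to divert console output to a file with sink() from within R:

```r
# sink() diverts subsequent console output to a file;
# split = TRUE keeps a copy on screen as well.
sink("console_log.txt", split = TRUE)
print(summary(1:10))
sink()  # stop diverting

# The command history (not the output) can be saved separately,
# where the front-end supports it:
# savehistory("commands.Rhistory")
```

Note that sink() captures only the output, not the commands that produced it, so it is a partial substitute for the GUI's "save console" command.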
[R] modifying axis labels in lattice panels
Dear all, I am struggling to modify the axis labels/ticks in a panel provided to xyplot. To begin with, I do not know the equivalent of the xaxt="n" directive for panels that would set the stage for no default x axis being drawn. My goal is to draw ticks and custom formatted labels at certain hours of the week. When I execute the code below, I get an error message in the plot window that suggests a problem with some "args" parameter. A second problem concerns the shaded rectangles I try to draw. Clearly, the range returned by par('usr') does not correspond to the true y ranges. Any help would be greatly appreciated. Thanks, Markus

PS I am using R version 2.10.0 on MACOS and the lattice package version 0.18-3 (latest)

library(lattice)

# multivariate time series, one row for each hour of the week:
Xwide <- cbind.data.frame(time = as.POSIXct("2010-09-06 00:00:00 EDT") + (0:167)*3600,
                          Comp.1 = sin(2*pi*7*(0:167)/168),
                          Comp.2 = cos(2*pi*7*(0:167)/168))

# to pass this to xyplot, first need to reshape:
Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction = "long", timevar = "PC")

# get descriptive shingle labels
Xlong[, "PC"] <- factor(Xlong[, "PC"], labels = paste("PC", 1:2))

xyplot(Comp ~ time | PC, data = Xlong, panel = WeekhourPanel,
       scales = list(x = list(at = Hours24 - 4*3600,
                              labels = as.character(format(Hours24 - 4*3600, "%H")))))

WeekhourPanel <- function(x, y, ...){
  r <- range(x)
  #print(r)
  Hours8 <- seq(r[1], r[2], by = 8*3600)
  Hours24 <- seq(r[1] + 12*3600, r[2], by = 24*3600)
  #axis.POSIXct(1, at = Hours8, format = "%H")
  panel.xyplot(x, y, type = "l", ...)
  panel.grid(0, 3)
  panel.abline(v = Hours24 - 4*3600, lty = 2, col = rgb(0,0,1,0.5))
  panel.abline(v = Hours24 + 6*3600, lty = 2, col = rgb(0,1,0,0.5))
  bb <- par('usr')
  y0 <- bb[3]
  for (i in seq(r[1], r[2], by = 48*3600))
    panel.rect(xleft = i, ybottom = y0, xright = i + 24*3600 - 1, ytop = bb[4],
               col = rgb(0.75,0.75,0.75,0.3), border = NA)
  panel.axis(1, at = Hours24 - 4*3600, labels = as.character(format(Hours24 - 4*3600, "%H")))
  #panel.axis(1, at = Hours24 + 6*3600, labels = format(x, "%H"))
  #panel.axis(3, at = Hours24, labels = format(x, "%a"), line = -1, tick = FALSE)
}
[R] Save R-Part Tree
Hello, I have some data that I analyse using a regression tree (by using rpart). Now I would like to save this tree to a file for later use, and load it (not necessarily in the same workspace). It would be most preferable if I could just save some parameters (later probably to a database) and load them again to re-form the tree. Is this possible? And if yes, how? Thanks in advance --Paul--
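A minimal sketch of one way to do this with base R's save()/load() (the file name and model formula here are illustrative; this serializes the whole fitted rpart object rather than individual parameters):

```r
library(rpart)

# Fit a small regression tree on a built-in data set (illustrative only):
fit <- rpart(mpg ~ wt + hp, data = mtcars)

# Serialize the fitted tree to a file...
save(fit, file = "tree.RData")

# ...and restore it later, possibly in a fresh workspace:
rm(fit)
load("tree.RData")
pred <- predict(fit, newdata = mtcars[1:3, ])
```

Rebuilding a tree by hand from stored parameters is much harder than serializing the object; the serialized file (or its raw bytes) can also be stored in a database blob column if needed.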
Re: [R] pairwise.t.test vs t.test
On Sep 10, 2010, at 16:01, Jabez Wilson wrote: Dear all, I am perplexed when trying to get the same results using pairwise.t.test and t.test. [...] So 0.029 and 0.02906 are identical, but if I do the same with pool.sd and var.equal = TRUE, I get different results. [...] So 0.014 and 0.02184 are not the same.

The help page says: "The pool.SD switch calculates a common SD for all groups" (NB: all). So the denominator is not the same as when testing each pair separately.
You can in fact do

pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=FALSE, var.equal=TRUE)

-- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] pairwise.t.test vs t.test
On Sep 10, 2010, at 9:01 AM, Jabez Wilson wrote: Dear all, I am perplexed when trying to get the same results using pairwise.t.test and t.test. [...] So 0.014 and 0.02184 are not the same.
require(ISwR)
with(red.cell.folate[1:17, ], pairwise.t.test(folate, ventilation, pool.sd = TRUE))$p.value
 N2O+O2,24h N2O+O2,op 0.02184081

NB: "The pool.SD switch calculates a common SD for all groups" and uses that for all comparisons. See the Details in ?pairwise.t.test. HTH, Marc Schwartz
Re: [R] Problem importing square character
On 10/09/2010 10:03 AM, Marcelo Estácio wrote: Dear, When I try to execute the following command, R doesn't read all the lines (it reads only 57658 lines when the file has 814125 lines). [...] As you can observe, my data contain a square-looking character. It seems that R treats this character as the end of the file. [...] When I paste this character into R, it tries to close, asking if I want to save my work. What is happening?

That symbol is the way some systems display the hex 1A character, which in DOS marked the end of file. By the pathname it looks as though you're working on Windows, which has inherited that behaviour. The best way to get around it would be to correct those bad characters: they are almost certainly errors in the data file. If you want to keep them, then you could try reading the file in binary mode rather than text mode. You do this using

con <- file(filename, open="rb")
read.table(con, header=FALSE, ...)
close(con)

You could also try reading it on a different OS; I don't think Linux cares about 1A characters.
Duncan Murdoch
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists? It can be very convenient, but I suspect the original poster is confused about the difference between vectors and lists. I wouldn't be surprised if someone were confused, since my reading of some (but not all) of the help documents has led me to think that lists _were_ vectors, just not vectors of atomic mode. And one oft-illustrated method for creating a list is: alist <- vector(mode="list", length=10). I am perhaps less confused than I was two years ago but my confusion about all the possible permutations of mode, typeof, expression, formula, and class and the extraction methods therefrom definitely persists. I think the authors of the documentation are of divided opinion or usage on this topic.

I probably should have said "atomic vectors" - it's easy to get confused when you fail to be specific. Data frames add even more confusion:

is.vector(as.vector(mtcars))
[1] FALSE

(That behaviour matches the documentation, but it's still confusing!)

Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?
Well, let's see if the following helps or just adds to the confusion.

First, lists are vectors of mode "list". But they are general recursive structures (in fact, completely general). Second, data frames are lists: each "column" of a data frame is a component (member) of the list, with the additional requirement that all the components must be the same length. The reason for the scare quotes around "column" will become clear shortly. Now, for examples:

x <- data.frame(a=1:3, b=list(c=4:6))
x
  a c
1 1 4
2 2 5
3 3 6
## This is as documented in ?data.frame.

## Now compare:
y <- data.frame(a=1:3)
y$b <- list(c=4:6)
y
  a       b
1 1 4, 5, 6
2 2 4, 5, 6
3 3 4, 5, 6
## A different result that one might think should be the same as the previous one.

What's going on is: y is a data frame, so all components must have the same length. Because the length of b, a list, is just 1, it is replicated to be the proper length:

y$b
$c
[1] 4 5 6
$c
[1] 4 5 6
$c
[1] 4 5 6

## The b component is still a list:
mode(y$b)
[1] "list"

## of course:
y$c <- 7:9
y
  a       b c
1 1 4, 5, 6 7
2 2 4, 5, 6 8
3 3 4, 5, 6 9
## This is correct, since the c component is a vector of length 3.

Note also:

mode(y[,3])
[1] "numeric"
mode(y[[3]])
[1] "numeric"
mode(y[3])
[1] "list"
## All these are correct, i.e. they agree with the documented behavior of "[" and "[[", because a data.frame IS a list.

Cheers, Bert

On Fri, Sep 10, 2010 at 7:13 AM, David Winsemius dwinsem...@comcast.net wrote: [...]
Re: [R] Which language is faster for numerical computation?
Don't underestimate the importance of the choice of the algorithm you use. That often makes a huge difference. Also, vectorization is key in R, and when you use that you're really up there among the top performing languages. Here is an example from the official R wiki illustrating my points: http://rwiki.sciviews.org/doku.php?id=tips:programming:code_optim2 My rule of thumb is: any piece of code can be made twice as fast. /Henrik

On Fri, Sep 10, 2010 at 6:06 AM, Prof. John C Nash nas...@uottawa.ca wrote: Dirk E. has properly focussed the discussion on measurement rather than opinion. I'll add the issue of the human time taken to convert, and more importantly debug, interfaced code. That too could be measured, but we rarely see human hours to code/debug/test reported. Moreover, I'll mention the cat among the pigeons of Rcgmin, which I wrote to allow me to play with an optimization code more easily to discover where the algorithm might be improved. The resulting package on some problems outperforms C equivalents. Now the code is quite vectorized, but this has still been a very nice surprise. In fact, I've decided to avoid playing around with the interfaces if I can run things well enough entirely in R. JN
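To illustrate the vectorization point with a self-contained sketch (the toy function here is not from the thread), compare a loop with its vectorised equivalent:

```r
n <- 1e5
x <- runif(n)

# Element-by-element loop:
slow <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- x[i]^2 + 1
  out
}

# Vectorised equivalent: one expression, evaluated in compiled code.
fast <- function(x) x^2 + 1

system.time(s <- slow(x))
system.time(f <- fast(x))
all.equal(s, f)  # identical result, typically far faster
```

The two give the same answer; the vectorised form just pushes the loop down into C, which is exactly why well-vectorised R can compete with lower-level languages.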
Re: [R] Which language is faster for numerical computation?
On Fri, Sep 10, 2010 at 10:23 AM, Henrik Bengtsson h...@stat.berkeley.edu wrote: Don't underestimate the importance of the choice of the algorithm you use. That often makes a huge difference. Also, vectorization is key in R. [...] My rule of thumb is: any piece of code can be made twice as fast.

I second this point - the latest version of reshape is 100x faster in some situations because I came up with a better vectorised algorithm - it's still all in R.

Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
Re: [R] Regression using mapply?
This really depends on why you want to do this and what results you want. If your main goal is to look at some basic tests and goodness of fit, then the add1 function may do everything you need. If you just want coefficient estimates, then some basic matrix algebra will give those to you. Another option would be to reshape the data to long format and use lmList from the nlme package (the above will be quicker if you do not need everything that lm gives you).

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Philipp Kunze Sent: Wednesday, September 08, 2010 5:35 AM To: R-help@r-project.org Subject: [R] Regression using mapply?

Hi, I have huge matrices in which the response variable is in the first column and the regressors are in the other columns. What I wanted to do now is something like this:

# this is just to get an example matrix
DataMatrix <- rep(1, 1000)
Disturbance <- rnorm(900)
DataMatrix[101:1000] <- DataMatrix[101:1000] + Disturbance
DataMatrix <- matrix(DataMatrix, ncol=10, nrow=100)

# estimate a univariate linear model with each regressor column,
# response in the first column
for(i in 2:10){
  result <- lm(DataMatrix[,1] ~ DataMatrix[,i])
}

Is there any way to get rid of the for-loop using mapply (or some other function)? Thanks! Philipp
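A hedged sketch of the loop-free route (variable names follow the poster's example; lapply, rather than mapply, is the natural fit since only one argument varies):

```r
set.seed(1)
# Example matrix: response in column 1, regressors in columns 2:10.
DataMatrix <- matrix(rnorm(1000), ncol = 10, nrow = 100)

# One lm() fit per regressor column, without an explicit for-loop;
# note the for-loop in the original post overwrote 'result' each time,
# whereas this keeps all nine fits.
fits <- lapply(2:ncol(DataMatrix), function(i)
  lm(DataMatrix[, 1] ~ DataMatrix[, i]))

# Intercept and slope of each univariate fit, side by side:
coefs <- sapply(fits, coef)
```

If only the coefficients are wanted, extracting them like this from the list of fits is simple, though as Greg notes, plain matrix algebra would be faster still.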
Re: [R] pairwise.t.test vs t.test
Thanks a lot, Peter. Excellent book btw. Jab --- On Fri, 10/9/10, peter dalgaard pda...@gmail.com wrote: From: peter dalgaard pda...@gmail.com Subject: Re: [R] pairwise.t.test vs t.test To: Jabez Wilson jabez...@yahoo.co.uk Cc: R-Help r-h...@stat.math.ethz.ch Date: Friday, 10 September, 2010, 15:20 On Sep 10, 2010, at 16:01 , Jabez Wilson wrote: Dear all, I am perplexed when trying to get the same results using pairwise.t.test and t.test. I'm using examples in the ISwR library, attach(red.cell.folate) I can get the same result for pairwise.t.test and t.test when I set the variances to be non-equal, but not when they are assumed to be equal. Can anyone explain the differences, or what I'm doing wrong? Here's an example where I compare the first two ventilations with pairwise.t.test and t.test pairwise.t.test(folate, ventilation, p.adj=none, pool.sd=F) Pairwise comparisons using t tests with non-pooled SD data: folate and ventilation N2O+O2,24h N2O+O2,op N2O+O2,op 0.029 - O2,24h 0.161 0.298 P value adjustment method: none t.test(folate[1:8], folate[9:17], var.equal=F) Welch Two Sample t-test data: folate[1:8] and folate[9:17] t = 2.4901, df = 11.579, p-value = 0.02906 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 7.310453 113.050658 sample estimates: mean of x mean of y 316.6250 256. 
So 0.029 and 0.02906 are identical, but if I do the same with pool.sd and var.equal = TRUE, I get different results pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=TRUE) Pairwise comparisons using t tests with pooled SD data: folate and ventilation N2O+O2,24h N2O+O2,op N2O+O2,op 0.014 - O2,24h 0.155 0.408 P value adjustment method: none t.test(folate[1:8], folate[9:17], var.equal=TRUE) Two Sample t-test data: folate[1:8] and folate[9:17] t = 2.5582, df = 15, p-value = 0.02184 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: 10.03871 110.32240 sample estimates: mean of x mean of y 316.6250 256. So 0.014 and 0.02184 are not the same. The help page says: The pool.SD switch calculates a common SD for all groups (NB: all). So the denominator is not the same as when testing each pair separately. You can in fact do pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=FALSE, var.equal=TRUE) -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
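A short sketch of Peter's point, using the red.cell.folate data from the ISwR package (assuming it is installed): with pool.sd = FALSE and var.equal = TRUE, pairwise.t.test() runs each comparison as an ordinary two-sample t test, so its unadjusted p-values match t.test(..., var.equal = TRUE) pair by pair.

```r
library(ISwR)
data(red.cell.folate)

# Default pooled SD: the denominator uses ALL groups, not just the pair,
# so p-values differ from pairwise t.test() calls
with(red.cell.folate,
     pairwise.t.test(folate, ventilation, p.adjust.method = "none",
                     pool.sd = TRUE))

# Per-pair equal-variance t tests: extra arguments are passed through
# to t.test(), so these p-values match t.test(var.equal = TRUE) per pair
with(red.cell.folate,
     pairwise.t.test(folate, ventilation, p.adjust.method = "none",
                     pool.sd = FALSE, var.equal = TRUE))
```

The first call reproduces the 0.014 result from the thread; the second reproduces the 0.02184 from the pairwise t.test() call.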
[R] adding labels above bars in a barplot
Hello, I want to make a general routine to draw barplots with numbers plotted above each bar. See the example below. I could not place the numbers in the middle of each bar because I could not calculate the right position of each x-axis tick. axTicks(1) indicated a unitary step, but it does not seem to work. I appreciate any help or suggestions. Best regards, Antonio Olinto == CAT VAR1 VAR2 Category 01 17.5 9 Category 02 15.2 20 Category 03 10.3 500 Category 04 8.4 150 Category 05 20.3 5000 # Copying data from a spreadsheet dat.data <- read.delim("clipboard", header=TRUE) summary(dat.data) CAT VAR1 VAR2 Category 01:1 Min. : 8.40 Min. : 9 Category 02:1 1st Qu.:10.30 1st Qu.: 20 Category 03:1 Median :15.20 Median : 150 Category 04:1 Mean :14.34 Mean :1136 Category 05:1 3rd Qu.:17.50 3rd Qu.: 500 Max. :20.30 Max. :5000 dat.bar <- data.frame(dat.data[,c(2)]) row.names(dat.bar) <- dat.data[,1] names(dat.bar) <- c("VAR1") dat.bar VAR1 Category 01 17.5 Category 02 15.2 Category 03 10.3 Category 04 8.4 Category 05 20.3 par(mar=c(12,6,3,2),cex.axis=1.2,cex.lab=1.4) barplot(t(as.matrix(dat.bar)),ylim=c(0,max(dat.data[,2]*1.1)),las=2,ylab="Y label text",col="orange") box() up <- max(dat.data$VAR1)*0.1 for (i in c(0:nrow(dat.data))) { legend(0.25+i,dat.bar[1+i,1]+up,dat.data[i+1,3],col="blue",bty="n") } Webmail - iBCMG Internet http://www.ibcmg.com.br
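A common way to do this (not from the thread, just a sketch with stand-in data mirroring the poster's VAR1/VAR2 columns): barplot() invisibly returns the x positions of the bar midpoints, so those can be passed to text() instead of guessing tick positions.

```r
# Stand-in for the poster's VAR1 (heights) and VAR2 (labels) columns
heights <- c(17.5, 15.2, 10.3, 8.4, 20.3)
labels  <- c(9, 20, 500, 150, 5000)
names(heights) <- paste("Category", sprintf("%02d", 1:5))

# barplot() returns the x coordinates of the bar midpoints (invisibly)
mids <- barplot(heights, ylim = c(0, max(heights) * 1.2), las = 2,
                col = "orange")

# pos = 3 places each label just above the top of its bar
text(mids, heights, labels = labels, pos = 3, col = "blue")
```

This avoids both axTicks() and the legend()-per-bar loop, since the midpoints come straight from barplot() itself.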
[R] Can't even start 'rattle' under Ubuntu 64bit
Dear all, I've wanted to give a try to the rattle GUI for R. After long struggle with dependencies I've finally managed to install rattle and all of its dependencies, although for some of them I've been forced to use cran2deb. And now all I get is: library(rattle) Loading required package: pmml Loading required package: XML Loading required package: RGtk2 Loading required package: colorspace Rattle: Graphical interface for data mining using R. Version 2.5.39 Copyright (c) 2006-2010 Togaware Pty Ltd. Type 'rattle()' to shake, rattle, and roll your data. rattle() Error: attempt to apply non-function In addition: Warning message: In method(obj, ...) : Invalid object type `GtkFrame' Rattle timestamp (for the error above): 2010-09-10 17:24:14 As an addition a blank window appears. And I'm stuck. Can anyone - please - point me in the right direction? Best, Kamil
Re: [R] Axis break with gap.plot()
Thanks Jim for your answer. I actually wrote my own function to plot with the loess. I just calculated the intersection between the first and second horizontal gap lines and the line formed by the 2 points before and after the gap. So I can now plot the loess on both sides of the gap section. Thanks again for your help, Phil -- View this message in context: http://r.789695.n4.nabble.com/Axis-break-with-gap-plot-tp2533027p2534658.html Sent from the R help mailing list archive at Nabble.com.
[R] Data Manipulation
Hi, I just started using R and need some guidance. I need to create a time series chart in R, but the problem is the data is not numeric. The data is in the following format Study A A B B B A C C D Then there is also another column with dates. How can I manipulate this in order to have something that will count the number of unique entries and group them? Say A = 3, B = 3, C = 2, D = 1 Thanks -- View this message in context: http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534662.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Problem importing square character
Duncan, thanks for your answer. I tried this: con <- file("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\200701_02_03_04.txt", open="rb") dados2 <- read.table(con, header=FALSE, sep="^", colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL","character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL","NULL","NULL",NA,"NULL","NULL"), quote="", comment.char="", skip=1) Error in pushBack(c(lines, lines), file) : can only push back on text-mode connections My file is 800 MB. The best way to correct this is to import to Access and export to a txt file. Thanks again! Marcelo Estácio Date: Fri, 10 Sep 2010 10:35:06 -0400 From: murdoch.dun...@gmail.com To: mes...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] Problem importing square character On 10/09/2010 10:03 AM, Marcelo Estácio wrote: Dear, When I try to execute the following command, R doesn't read all the lines (it reads only 57658 lines when the file has 814125 lines): dados2 <- read.table("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt", header=FALSE, sep="^", colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL","character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL","NULL","NULL",NA,"NULL","NULL"), quote="", comment.char="", skip=1, fill=TRUE) If I exclude fill=TRUE, R gives the message Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : número de itens não é múltiplo do número de colunas (the number of items is not a multiple of the number of columns) I identified that the problem is the following line of my data (line 57659 of my file): 13850074571^01/01/1940^000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE As you can observe, my data has a square string like this (I don't know if you can see the character, but it looks like a white square). It looks like R understands this character as the end of the file. I opened my data in Notepad and copied the character.
When I paste this character into R, it tries to close, asking if I want to save my work. What is happening? That symbol is the way some systems display the hex 1A character, which in DOS marked the end of file. From the pathname it looks as though you're working on Windows, which has inherited that behaviour. The best way to get around it would be to correct those bad characters: they are almost certainly errors in the data file. If you want to keep them, then you could try reading the file in binary mode rather than text mode. You do this using con <- file(filename, open="rb") read.table(con, header=FALSE, ...) close(con) You could also try reading it on a different OS; I don't think Linux cares about 1A characters. Duncan Murdoch Thanks very much. Marcelo Estácio
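Following Duncan's advice to correct the bad characters, one workaround (not from the thread; the filename `SegurosClube.txt` is taken from Marcelo's message, everything else is an assumption) is to read the file as raw bytes, drop the stray 0x1A end-of-file markers, and write a clean copy that read.table() can parse in text mode:

```r
# Read the whole file as raw bytes, so the 0x1A DOS end-of-file
# character cannot terminate the read early
fname <- "SegurosClube.txt"
raw_bytes <- readBin(fname, what = "raw", n = file.info(fname)$size)

# Drop every 0x1A byte, then write a cleaned copy
clean <- raw_bytes[raw_bytes != as.raw(0x1A)]
writeBin(clean, "SegurosClube_clean.txt")

# The cleaned file can now be read normally
dados <- read.table("SegurosClube_clean.txt", header = FALSE, sep = "^",
                    quote = "", comment.char = "", skip = 1)
```

For an 800 MB file this needs enough memory to hold the bytes once; chunked reading would be the next step if that is a problem.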
Re: [R] Data Manipulation
Hi, Look at the table() function. Here is an example with your data: dat <- read.table(textConnection("Study A A B B B A C C D"), header = TRUE) closeAllConnections() table(dat) Hope that helps, Josh On Fri, Sep 10, 2010 at 8:53 AM, dfong df...@medicine.umaryland.edu wrote: Hi, I just started using R and need some guidance. I need to create a time series chart in R, but the problem is the data is not numeric. The data is in the following format Study A A B B B A C C D Then there is also another column with dates. How can I manipulate this in order to have something that will count the number of unique entries and group them? Say A = 3, B = 3, C = 2, D = 1 Thanks -- View this message in context: http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534662.html Sent from the R help mailing list archive at Nabble.com. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] Data Manipulation
I'm actually importing it from a CSV, so I already have that in a table. But I can't make a graph with text. I assume I need to do some counting in order to draw the graph? Any example of this? thanks -- View this message in context: http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534690.html Sent from the R help mailing list archive at Nabble.com.
[R] for loop help please!
Hi Everyone, I have a 2-dim data.matrix (e.g., table1) in which row 1 specifies a range of values. Rows 2 through n specify the number of times I want to replicate each corresponding value in row 1. I can do this with the following function: rep(c(table1[1,]), c(table1[X,])) # where X would go from 2 to n. Now, I can do this manually by changing the values of X and saving each resulting array/vector in an object, or write a for loop that will iterate through the rows and output a new data.matrix in which rows 1 through n correspond to the vectors generated by replicating the values of row 1 rows 2 through n independent times from the original data.matrix with the rep function shown above. So far I have been unable to get the for loop right. Any help will be most appreciated! Thanks beforehand for your help. Best, A -- View this message in context: http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Data Manipulation
Hi, Yes, the table() function is not to read the data, but to do the frequency counts of each level. I just included the read.table() part so that you could copy and paste my code, but I did not include the R output from table(dat). table(dat) dat A B C D 3 3 2 1 It nicely tallies for you. Also, you can look at a simple plot: # you will have to run this in your R # because I do not know an easy way to include graphs plot(table(dat)) You can also save the results in a new variable and then access portions of it: my.table <- table(dat) my.table # the full table dat A B C D 3 3 2 1 my.table[2] # just extract the second element B 3 Cheers, Josh On Fri, Sep 10, 2010 at 9:11 AM, dfong df...@medicine.umaryland.edu wrote: I'm actually importing it from a CSV, so I already have that in a table. But I can't make a graph with text. I assume I need to do some counting in order to draw the graph? Any example of this? thanks -- View this message in context: http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534690.html Sent from the R help mailing list archive at Nabble.com. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] Data Manipulation
Hello, This is definitely possible with R; there are lots of packages to make good graphics. However, the easiest way for us to help you is if you give us a small reproducible example, as you started to with your initial post. If you have an object in your R session that you'd like help with you can use ?dput to create a text version of it to share with the list. The table class in R is separate from a data.frame, which is probably what you have now... dfong wrote: I'm actually importing it from a CSV, so I already have that in a table. But I can't make a graph with text. I assume I need to do some counting in order to draw the graph? Any example of this? thanks
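A sketch of the counting and plotting steps for data like the poster describes (the column names `Study` and `Date` and the toy values are assumptions; the poster's actual CSV layout was never shown):

```r
# Toy stand-in for the poster's CSV: one study code per row, plus a date
dat <- data.frame(Study = c("A","A","B","B","B","A","C","C","D"),
                  Date  = as.Date("2010-09-01") + 0:8)

# Frequency of each unique study code
counts <- table(dat$Study)
counts  # A=3, B=3, C=2, D=1

# A simple bar chart of the counts
barplot(counts, ylab = "Number of entries")

# Counts per date per study, for a time-series-style view
table(dat$Date, dat$Study)
```

Once the text column has been turned into counts this way, any of the standard plotting functions can take it from there.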
Re: [R] for loop help please!
Hi A, Here is a little example that I believe does what you want. I am not quite sure how you want all the output in a new matrix, because as you repeat each value in the first row a varying number of times, you will not get rows of equal length. Although perhaps your data is set up so that you can. # Sample data table1 <- matrix(1:16, ncol = 4) table1 # look at it # Using your code to create the first example rep(table1[1,], table1[2,]) # Using apply() to go through every row of the matrix 'table1' # except the first ([-1, ]) apply(table1[-1, ], 1, function(x) {rep(table1[1,], x)}) # Using a for loop to do the same for(i in 2:nrow(table1)) { print(rep(table1[1, ], table1[i, ])) } # In general, I think apply() is prettier # It is also easier to assign all the output to a list # Compared to using a for loop Best regards, Josh On Fri, Sep 10, 2010 at 8:54 AM, alfredo alfredote...@gmail.com wrote: Hi Everyone, I have a 2-dim data.matrix (e.g., table1) in which row 1 specifies a range of values. Rows 2 through n specify the number of times I want to replicate each corresponding value in row 1. I can do this with the following function: rep(c(table1[1,]), c(table1[X,])) # where X would go from 2 to n. Now, I can do this manually by changing the values of X and saving each resulting array/vector in an object, or write a for loop that will iterate through the rows and output a new data.matrix in which rows 1 through n correspond to the vectors generated by replicating the values of row 1 rows 2 through n independent times from the original data.matrix with the rep function shown above. So far I have been unable to get the for loop right. Any help will be most appreciated! Thanks beforehand for your help. Best, A -- View this message in context: http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html Sent from the R help mailing list archive at Nabble.com.
-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] for loop help please!
I do not follow. Could you please provide a small reproducible example of what table1 might look like, and what you want as a result? Surely you don't need a for loop. alfredo wrote: Hi Everyone, I have a 2-dim data.matrix(e.g., table1) in which row1 specifies a range of values. row2 - rown specify the number of times I want to replicate each corresponding value in row1. I can do this with the following function: rep(c(table1[1,]),c(table1[X,])) #where X would go from 2 - n. Now, I can do this manually by changing the values of X and save each resulting array/vector in an object, or write a for loop that will iterate through the rows and output a new data.matrix in which row1 - rown will correspond to the vectors generated by replicating the values of row1 row2 - rown independent times from the original data.matrix with the rep function shown above. So far I have been unable to get the for loop right. Any help will be most appreciated! Thanks beforehand for your help. Best, A
Re: [R] for loop help please!
try this: x <- matrix(sample(25,25), 5) x [,1] [,2] [,3] [,4] [,5] [1,] 12 24 14 3 20 [2,] 21 7 5 15 17 [3,] 11 10 22 16 9 [4,] 6 25 4 1 23 [5,] 2 19 8 13 18 # save result in a list result <- lapply(2:nrow(x), function(.row){ + rep(x[1,], times=x[.row,]) + }) result [[1]] [1] 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 24 24 24 24 24 24 24 14 14 14 14 14 3 3 [36] 3 3 3 3 3 3 3 3 3 3 3 3 3 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 [[2]] [1] 12 12 12 12 12 12 12 12 12 12 12 24 24 24 24 24 24 24 24 24 24 14 14 14 14 14 14 14 14 14 14 14 14 14 14 [36] 14 14 14 14 14 14 14 14 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 20 20 20 20 20 20 20 20 20 [[3]] [1] 12 12 12 12 12 12 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 14 14 14 14 [36] 3 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 [[4]] [1] 12 12 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 14 14 14 14 14 14 14 14 3 3 3 3 3 3 [36] 3 3 3 3 3 3 3 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 On Fri, Sep 10, 2010 at 11:54 AM, alfredo alfredote...@gmail.com wrote: Hi Everyone, I have a 2-dim data.matrix(e.g., table1) in which row1 specifies a range of values. row2 - rown specify the number of times I want to replicate each corresponding value in row1. I can do this with the following function: rep(c(table1[1,]),c(table1[X,])) #where X would go from 2 - n. Now, I can do this manually by changing the values of X and save each resulting array/vector in an object, or write a for loop that will iterate through the rows and output a new data.matrix in which row1 - rown will correspond to the vectors generated by replicating the values of row1 row2 - rown independent times from the original data.matrix with the rep function shown above. So far I have been unable to get the for loop right. Any help will be most appreciated! Thanks beforehand for your help.
Best, A -- View this message in context: http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html Sent from the R help mailing list archive at Nabble.com. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] lme, groupedData, random intercept and slope
from the output, I think it's both. ----- Original Message ----- From: John Sorkin jsor...@grecc.umaryland.edu To: r-help@r-project.org Sent: Fri, September 10, 2010 5:25:44 AM Subject: [R] lme, groupedData, random intercept and slope Windows Vista R 2.10.1 Does the following use of groupedData and lme produce an analysis with both random intercept and slope, or only random slope? zz <- groupedData(y~time | Subject, data=data.frame(data), labels = list( x = "Time", y = "y" ), units = list( x = "(yr)", y = "(mm)") ) plot(zz) fit10 <- lme(zz) summary(fit10) Linear mixed-effects model fit by REML Data: zz AIC BIC logLik -123.1942 -115.2010 67.5971 Random effects: Formula: ~time | Subject Structure: General positive-definite StdDev Corr (Intercept) 6.054897e+00 (Intr) time 4.160662e-05 1 Residual 9.775954e-04 Fixed effects: y ~ time Value Std.Error DF t-value p-value (Intercept) 15.000217 1.914727 19 7.834 0 time -1.000051 0.000219 19 -4566.598 0 Correlation: (Intr) time 0.059 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -1.73706837 -0.36289558 0.06892484 0.59777067 1.69095476 Number of Observations: 30 Number of Groups: 10 John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Confidentiality Statement: This email message, including any attachments, is for th...{{dropped:12}}
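The reason it's both: the groupedData formula y ~ time | Subject makes lme() default to random = ~ time | Subject, which includes both a random intercept and a random slope for time. A sketch with the random argument spelled out explicitly (Orthodont is just a stand-in dataset shipped with nlme, not the poster's data):

```r
library(nlme)

# Random intercept AND random slope for age within Subject
# (this is what lme() defaults to under a groupedData object)
fit_both <- lme(distance ~ age, random = ~ age | Subject, data = Orthodont)

# Random intercept only, for comparison
fit_int <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# VarCorr() lists one StdDev per random term: an (Intercept) row plus
# an age row means both intercept and slope are random
VarCorr(fit_both)
VarCorr(fit_int)
```

In the poster's output, the "Random effects:" block shows StdDev lines for both (Intercept) and time, which is the same signature.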
Re: [R] modifying axis labels in lattice panels
Hi: On Fri, Sep 10, 2010 at 7:16 AM, Markus Loecher markus.loec...@gmail.com wrote: Dear all, I am struggling to modify the axis labels/ticks in a panel provided to xyplot. To begin with, I do not know the equivalent of the xaxt="n" directive for panels that would set the stage for no default x axis being drawn. My goal is to draw ticks and custom formatted labels at certain hours of the week. When I execute the code below, I get an error message in the plot window that suggests a problem with some args parameter. A second problem concerns the shaded rectangles I try to draw. Clearly, the range returned by par('usr') does not correspond to the true y ranges. Any help would be greatly appreciated, Thanks, Markus PS I am using R version 2.10.0 on MacOS and the lattice package version 0.18-3 (latest) library(lattice); #multivariate time series, one row for each hour of the week: Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") + (0:167)*3600, Comp.1= sin(2*pi*7*(0:167)/168), Comp.2 = cos(2*pi*7*(0:167)/168)); #to pass this to xyplot, first need to reshape: Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction="long", timevar = "PC"); #get descriptive shingle labels Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC",1:2)); A less mentally taxing approach :) library(reshape) xlong <- melt(Xwide, id = 'time') names(xlong)[2:3] <- c('PC', 'Comp') xyplot(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel, scales = list(x=list(at = Hours24-4*3600, labels=as.character(format(Hours24-4*3600,"%H"))))); When attempting to run this, I got Error in xyplot.formula(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel, : object 'Hours24' not found Attempting to pull Hours24 out of the function didn't work...
Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600) Error in seq(r[1] + 12 * 3600, r[2], by = 24 * 3600) : object 'r' not found One problem is that to use Hours24 in scales, it has to be defined in the calling environment of xyplot() - in other words, it has to be defined outside the panel function and outside of xyplot() if your present code is to have any hope of working. I think I got that part figured out: in the console, type r <- range(Xwide$time) Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600) I at least get a plot now by running your xyplot() function with the panel function, but all the labels are 08 on the x-axis. Here's why: format(Hours24-4*3600, "%H") [1] "08" "08" "08" "08" "08" "08" "08" This comes from the labels = part of your panel function. I got the same plot with this code (apart from adding the lines): xyplot(Comp ~ time | PC, data = Xlong, type = 'l', scales = list(x = list(at = Hours24-4*3600, labels = as.character(format(Hours24-4*3600, "%H"))))) which indicates that something in your panel function is awry. I'd suggest starting out simply. Put both plots on the same panel using PC as a grouping variable in the long data frame. It will automatically use different colors for groups, but you can control the line color with the col.lines = argument; e.g., col.lines = c('red', 'blue'). Next, I'd work on getting the axis ticks and labels the way you want with scales. It also appears that you want to set a custom grid - my suggestion would be to do that last, after you've controlled the axis ticks and labels. Once you have that figured out, you have the kernel of your panel function. In most applications I've seen in lattice, the idea is to keep the panel function as simple as possible and pass the 'global' stuff to the function call. There's something broken in your panel function, but it's a run-time error rather than a compile-time error, so you can either debug it or try simplifying the problem (and the panel function) as much as possible.
HTH, Dennis WeekhourPanel <- function(x,y,...){ r <- range(x); #print(r) Hours8 <- seq(r[1], r[2], by=8*3600); Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600) #axis.POSIXct(1, at= Hours8, format="%H"); panel.xyplot(x,y, type="l", ...); panel.grid(0,3); panel.abline(v= Hours24-4*3600, lty=2, col = rgb(0,0,1,0.5)); panel.abline(v=Hours24+6*3600, lty=2, col = rgb(0,1,0,0.5)); bb <- par('usr') y0 <- bb[3]; for (i in seq(r[1], r[2], by=48*3600)) panel.rect(xleft=i, ybottom=y0, xright=i+24*3600-1, ytop=bb[4], col = rgb(0.75,0.75,0.75,0.3), border = NA); panel.axis(1, at= Hours24-4*3600, labels=as.character(format(Hours24-4*3600,"%H"))); #panel.axis(1, at= Hours24+6*3600, labels=format(x,"%H")); #panel.axis(3, at= Hours24, labels=format(x,"%a"), line = -1, tick = FALSE); }
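Dennis's main point, reduced to a runnable sketch (object names follow the thread; the tick positions are computed in the global environment so that scales can see them, and weekday labels are used here so that the daily ticks do not all print the same hour):

```r
library(lattice)

# One row per hour of the week, as in Markus's example
time <- as.POSIXct("2010-09-06 00:00:00", tz = "EST5EDT") + (0:167) * 3600
Xwide <- data.frame(time   = time,
                    Comp.1 = sin(2 * pi * 7 * (0:167) / 168),
                    Comp.2 = cos(2 * pi * 7 * (0:167) / 168))

# Reshape to long format; the common "Comp" prefix becomes the value column
Xlong <- reshape(Xwide, varying = 2:3, idvar = "time",
                 direction = "long", timevar = "PC")
Xlong$PC <- factor(Xlong$PC, labels = paste("PC", 1:2))

# Define tick positions OUTSIDE the panel function and outside xyplot(),
# so the scales argument can find them
r <- range(Xwide$time)
Hours24 <- seq(r[1] + 12 * 3600, r[2], by = 24 * 3600)

xyplot(Comp ~ time | PC, data = Xlong, type = "l",
       scales = list(x = list(at = Hours24,
                              labels = format(Hours24, "%a"))))
```

The grid lines and shaded rectangles from the original panel function can then be layered back in one piece at a time, once the ticks and labels behave.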
Re: [R] lmer fixed effects, SE, t . . . and p
But as far as I know, profile() seems to be de-activated in the lme4 package. - Original Message From: Gavin Simpson gavin.simp...@ucl.ac.uk To: John Sorkin jsor...@grecc.umaryland.edu Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com Sent: Fri, September 10, 2010 2:05:37 AM Subject: Re: [R] lmer fixed effects, SE, t . . . and p On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote: Bert, I appreciate you comments, and I have read Doug Bates writing about p values in mixed effects regression. It is precisely because I read Doug's material that I asked how are we to interpret the estimates rather than how can we compute a p value. My question is a simple question whose answer is undoubtedly complex, but one that needs an answer. Without p values, or confidence intervals, I am not certain what to make of the results of my analysis. Does my analysis suggest, or does it not suggest that there is a relation between time and y? If I can't answer this question after running the analysis, I don't have any more information than I did before I ran the analysis, and a fair question would be why did I run the analysis? I am asking for help not in calculation a p value or a CI, but rather to know what I can and can't say about the results of the analysis. If this basic question can not be answered, I am at a loss to interpret my results. Thank you, John Doug talks quite a lot about profiling lmer fits using 'profile deviance' to investigate variability in fixed effects. For example, see section 1.5 in the draft of chapter 1 of Doug's book on mixed models: http://lme4.r-forge.r-project.org/book/ HTH G John David Sorkin M.D., Ph.D. 
Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Bert Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM John: Search on this issue in the list archives. Doug Bates has addressed it at length. Basically, he does not calculate CI's or p-values because he does not know how to reliably do so. However, the key remark in your query was: (2) lmer does not give p values or confidence intervals for the fixed effects. How we are to interpret the estimates given that no p value or CI is given for the estimates? Think about it. A statistical analysis -- ANY statistical analysis -- treats the data in isolation: it is not informed by physics, thermodynamics, biology, other similar data, prior experience, or, indeed, any part of the body of relevant scientific knowledge. Do you really think that any such analysis, especially when predicated upon often tenuous or even (necessarily) unverifiable assumptions and simplifications should be considered authoritative? Classical statistical inference is just another piece of the puzzle, and not even particularly useful when, as if typically the case, hypotheses are formulated AFTER seeing the data (this invalidates the probability calculations -- hypotheses must be formulated before seeing the data to be meaningfully assessed). Leo Breiman called this statistics' quiet scandal something like 20 years ago, and he was no dummy. It is comforting, perhaps, but illusory to believe that statistical inference can be relied on to give sound, objective scientific results. True, without such a framework, science seems rather subjective, perhaps closer to religion and arbitrary cultural archetypes than we care to admit. 
But see Thomas Kuhn and Paul Feyerabend for why this is neither surprising nor necessarily a bad thing. Cheers, Bert Gunter On Thu, Sep 9, 2010 at 8:00 PM, John Sorkin jsor...@grecc.umaryland.edu wrote: windows Vista R 2.10.1 (1) How can I get the complete table for the fixed effects from lmer? As can be seen from the example below, fixef(fit2) only gives the estimates and not the SE or t value fit3 <- lmer(y~time + (1|Subject) + (time|Subject), data=data.frame(data)) summary(fit3) Linear mixed model fit by REML Formula: y ~ time + (1 | Subject) + (time | Subject) Data: data.frame(data) AIC BIC logLik deviance REMLdev -126.2 -116.4 70.1 -152.5 -140.2 Random effects: Groups Name Variance Std.Dev. Corr Subject (Intercept) 2.9311e+01 5.41396385 Subject (Intercept) 0.0000e+00 0.0000000 time 0.0000e+00 0.0000000 NaN Residual 8.1591e-07 0.00090328 Number of obs: 30, groups: Subject, 10 Fixed effects: Estimate Std. Error t value (Intercept) 14.998216 1.712046 9 time -0.999779 0.000202 -4950 Correlation of Fixed Effects: (Intr) time -0.001 fixef(fit3) (Intercept) time 14.9982158 -0.9997793 (2) lmer does not give p values or confidence intervals for the
Re: [R] lmer fixed effects, SE, t . . . and p
On Fri, 2010-09-10 at 09:51 -0700, array chip wrote: But as far as I know, profile() seems to be de-activated in the lme4 package. It is beta software. The lme4a version of the lme4 package might have had profile re-enabled, IIRC. G - Original Message From: Gavin Simpson gavin.simp...@ucl.ac.uk To: John Sorkin jsor...@grecc.umaryland.edu Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com Sent: Fri, September 10, 2010 2:05:37 AM Subject: Re: [R] lmer fixed effects, SE, t . . . and p On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote: Bert, I appreciate your comments, and I have read Doug Bates' writing about p values in mixed effects regression. It is precisely because I read Doug's material that I asked how we are to interpret the estimates rather than how we can compute a p value. My question is a simple question whose answer is undoubtedly complex, but one that needs an answer. Without p values or confidence intervals, I am not certain what to make of the results of my analysis. Does my analysis suggest, or does it not suggest, that there is a relation between time and y? If I can't answer this question after running the analysis, I don't have any more information than I did before I ran the analysis, and a fair question would be why I ran the analysis at all. I am asking for help not in calculating a p value or a CI, but rather in knowing what I can and can't say about the results of the analysis. If this basic question cannot be answered, I am at a loss to interpret my results. Thank you, John Doug talks quite a lot about profiling lmer fits using 'profile deviance' to investigate variability in fixed effects. For example, see section 1.5 in the draft of chapter 1 of Doug's book on mixed models: http://lme4.r-forge.r-project.org/book/ HTH G John David Sorkin M.D., Ph.D. 
[R] difference of two RData files/environments
I just wrote up some code for differencing two .RData files or environments (or one of each). Available from source here: http://www.maths.lancs.ac.uk/~rowlings/R/Ediff/ In its handiest form, running: ediff() will tell you the difference between your working environment and the .RData file that it probably started from. Useful for those 'What have I done here?' moments when you discover an R session in a long-lost terminal window. It can take two arguments which can be paths to files or environments. It tells you which objects are in one or the other or both, and does an identical() check on the ones in both. If anyone maintains a package that this could go in, get in touch. Barry -- blog: http://geospaced.blogspot.com/ web: http://www.maths.lancs.ac.uk/~rowlings web: http://www.rowlingson.com/ twitter: http://twitter.com/geospacedman pics: http://www.flickr.com/photos/spacedman __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
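For readers who want the flavor of what such a diff does without fetching Barry's Ediff code, here is a minimal base-R sketch of the same idea; the function name `env_diff` and its return structure are invented for illustration and are not Barry's actual API.

```r
# Minimal sketch of an environment diff in the spirit of ediff():
# report objects only in one environment, and run identical() on the
# objects present in both. (Names here are made up, not the Ediff API.)
env_diff <- function(e1, e2) {
  n1 <- ls(e1); n2 <- ls(e2)
  common <- intersect(n1, n2)
  changed <- common[!vapply(common,
                            function(nm) identical(get(nm, envir = e1),
                                                   get(nm, envir = e2)),
                            logical(1))]
  list(only_in_1 = setdiff(n1, n2),
       only_in_2 = setdiff(n2, n1),
       changed   = changed)
}
```

A .RData file can be compared the same way by load()ing it into a fresh environment first, e.g. `e <- new.env(); load(".RData", envir = e)`.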
[R] lme4a package loading error
Thanks for the reminder. So I found the lme4a package in Doug's UseR!2010 presentation folder: http://lme4.r-forge.r-project.org/slides/2010-07-20-Gaithersburg/pkg/ However, after installation, I got the following error message when trying to load the library:

library(Matrix)
library(Rcpp)
library(minqa)
library(lme4a)
Error : classes modelMatrix, denseModelMatrix, sparseModelMatrix, ddenseModelMatrix, dsparseModelMatrix, predModule, dPredModule, sPredModule, respModule, glmRespMod, nlsRespMod are not exported by 'namespace:Matrix'
Error: package/namespace load failed for 'lme4a'

Here is my sessionInfo():

sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] minqa_1.1.9 Rcpp_0.8.6 Matrix_0.999375-43 lattice_0.18-8
loaded via a namespace (and not attached):
[1] grid_2.11.1 nlme_3.1-96 splines_2.11.1 stats4_2.11.1 tools_2.11.1

Any suggestions would be appreciated. John

- Original Message From: Gavin Simpson gavin.simp...@ucl.ac.uk To: array chip arrayprof...@yahoo.com Cc: John Sorkin jsor...@grecc.umaryland.edu; r-help@r-project.org; Bert Gunter gunter.ber...@gene.com Sent: Fri, September 10, 2010 10:00:15 AM Subject: Re: [R] lmer fixed effects, SE, t . . . and p On Fri, 2010-09-10 at 09:51 -0700, array chip wrote: But as far as I know, profile() seems to be de-activated in the lme4 package. It is beta software. The lme4a version of the lme4 package might have had profile re-enabled, IIRC. G - Original Message From: Gavin Simpson gavin.simp...@ucl.ac.uk To: John Sorkin jsor...@grecc.umaryland.edu Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com Sent: Fri, September 10, 2010 2:05:37 AM Subject: Re: [R] lmer fixed effects, SE, t . . . 
[R] filter a tab delimited text file
Hi all, I have to filter a tab-delimited text file like below:

GeneNames value1 value2 log2(Fold_change) log2(Fold_change) normalized Signature(abs(log2(Fold_change) normalized) 4)
ENSG0209350435-3.81131293562629-4.14357714689656TRUE
ENSG017713314225.467717200823365.13545298955309FALSE
ENSG01162851151669-4.54130810709955 -4.87357231836982TRUE
ENSG000972410162-4.69995182667858 -5.03221603794886FALSE
ENSG0162460331-4.05126372834704-4.38352793961731TRUE

based on the last column (TRUE), and then write to a new text file, meaning I should get something like below:

GeneNames value1 value2 log2(Fold_change) log2(Fold_change) normalized Signature(abs(log2(Fold_change) normalized) 4)
ENSG0209350435-3.81131293562629-4.14357714689656TRUE
ENSG01162851151669-4.54130810709955 -4.87357231836982TRUE
ENSG0162460331-4.05126372834704-4.38352793961731TRUE

I used read.table and write.table, but I am still not very satisfied with the results. Here is what I did:

expFC <- read.table("test.txt", header=TRUE, sep="\t")
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t")

Result:

GeneNames value1 value2 log2.Fold_change. log2.Fold_change..normalized Signature.abs.log2.Fold_change..normalized4.
ENSG0209350435-3.81131293562629 -4.14357714689656TRUE
ENSG01162851151669-4.54130810709955 -4.87357231836982TRUE
ENSG0162460331-4.05126372834704 -4.38352793961731TRUE

As you can see, there are two problems: 1. The headers were altered: all the special characters were converted to dots (.). 2. The gene names (first column) were quoted (they were not quoted in the original file). The second point is not very annoying, but the first one is. How do I keep the headers exactly as in the original file? Thanks, D.
Re: [R] for loop help please!
Hi: It's not immediately clear what you have in mind, as others have noted, but here are a couple of ideas that may apply to your problem, as dangerous as it is to play clairvoyant. I'm using vectors instead of a matrix: the first vector, val, contains the values, whereas the second, reps, holds the number of desired repetitions, randomly generated to take values from 1:5.

val <- 1:10
reps <- sample(1:5, 10, replace = TRUE)

(1) Output is a vector:

rep(val, reps)
 [1]  1  1  1  2  2  3  3  3  4  4  4  5  5  5  5  5  6  6  6  6  6  7  7  8  9
[26]  9 10 10 10 10 10

(2) Output is a list:

l <- mapply(rep, val, reps)
l
[[1]]
[1] 1 1 1
[[2]]
[1] 2 2
[[3]]
[1] 3 3 3
[[4]]
[1] 4 4 4
[[5]]
[1] 5 5 5 5 5
[[6]]
[1] 6 6 6 6 6
[[7]]
[1] 7 7
[[8]]
[1] 8
[[9]]
[1] 9 9
[[10]]
[1] 10 10 10 10 10

Hopefully, one of these answers your question. If not, you'll need to explain your problem in more detail. The easiest way to do this is to come up with a toy example, what you tried that didn't work, and the expected outcome. Dennis

On Fri, Sep 10, 2010 at 8:54 AM, alfredo alfredote...@gmail.com wrote: Hi Everyone, I have a 2-dim data.matrix (e.g., table1) in which row1 specifies a range of values. row2 - rown specify the number of times I want to replicate each corresponding value in row1. I can do this with the following function: rep(c(table1[1,]), c(table1[X,])) # where X would go from 2 to n. Now, I can do this manually by changing the values of X and saving each resulting array/vector in an object, or write a for loop that will iterate through the rows and output a new data.matrix in which row1 - rown will correspond to the vectors generated by replicating the values of row1 row2 - rown independent times from the original data.matrix with the rep function shown above. So far I have been unable to get the for loop right. Any help will be most appreciated! Thanks beforehand for your help. 
Best, A -- View this message in context: http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html Sent from the R help mailing list archive at Nabble.com.
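For the matrix form alfredo described (row 1 holding values, rows 2 to n holding repetition counts), Dennis's mapply idea can be run over the count rows with lapply instead of a for loop. A small sketch; the matrix `table1` below is invented toy data, not alfredo's actual table:

```r
# Toy stand-in for alfredo's matrix: row 1 holds the values,
# rows 2..n hold per-row repetition counts.
table1 <- rbind(val   = 1:4,
                repsA = c(1, 2, 0, 1),
                repsB = c(2, 1, 1, 0))

# One replicated vector per count row; a list is the natural container,
# since the results generally have different lengths.
out <- lapply(2:nrow(table1), function(i) rep(table1[1, ], table1[i, ]))
out[[1]]  # row 1 values replicated according to row 2's counts
```

Because the per-row results differ in length, forcing them back into a single data.matrix would require padding; a list is usually the better return type here.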
Re: [R] adding labels above bars in a barplot
See this message and the replies to it (and the replies to the replies, etc.): http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html In there is a discussion of why you don't really want to do that, along with better alternatives and examples of the improved plots. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Antonio Olinto Sent: Friday, September 10, 2010 8:59 AM To: R-help Subject: [R] adding labels above bars in a barplot

Hello, I want to make a general routine to draw barplots with numbers plotted above each bar. See the example below. I could not place the numbers in the middle of each bar because I could not calculate the right position of each x-axis tick. axTicks(1) indicated a unitary step, but it does not seem to work. I appreciate any help or suggestions. Best regards, Antonio Olinto

==
CAT          VAR1  VAR2
Category 01  17.5     9
Category 02  15.2    20
Category 03  10.3   500
Category 04   8.4   150
Category 05  20.3  5000

# Copying data from a spreadsheet
dat.data <- read.delim("clipboard", header=TRUE)
summary(dat.data)
          CAT         VAR1            VAR2
 Category 01:1   Min.   : 8.40   Min.   :   9
 Category 02:1   1st Qu.:10.30   1st Qu.:  20
 Category 03:1   Median :15.20   Median : 150
 Category 04:1   Mean   :14.34   Mean   :1136
 Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
                 Max.   :20.30   Max.   :5000
dat.bar <- data.frame(dat.data[,c(2)])
row.names(dat.bar) <- dat.data[,1]
names(dat.bar) <- c("VAR1")
dat.bar
            VAR1
Category 01 17.5
Category 02 15.2
Category 03 10.3
Category 04  8.4
Category 05 20.3
par(mar=c(12,6,3,2), cex.axis=1.2, cex.lab=1.4)
barplot(t(as.matrix(dat.bar)), ylim=c(0, max(dat.data[,2]*1.1)), las=2, ylab="Y label text", col="orange")
box()
up <- max(dat.data$VAR1)*0.1
for (i in c(0:nrow(dat.data))) {
  legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
}

Webmail - iBCMG Internet http://www.ibcmg.com.br
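The piece Antonio was missing is that barplot() invisibly returns the x positions of the bar midpoints, so there is no need to guess tick positions from axTicks(). A minimal sketch with made-up numbers modeled on his VAR1/VAR2 columns:

```r
# barplot() returns the bar-centre x coordinates; text() can then
# place a label directly above each bar. Values are toy data.
heights <- c(17.5, 15.2, 10.3, 8.4, 20.3)   # bar heights (VAR1-style)
labels  <- c(9, 20, 500, 150, 5000)         # numbers to print (VAR2-style)

mids <- barplot(heights, ylim = c(0, max(heights) * 1.2),
                names.arg = paste("Cat", 1:5), las = 2)
# pos = 3 puts the label just above the (x, y) point, i.e. above the bar top
text(mids, heights, labels = labels, pos = 3)
```

This is one of the "better alternatives" style fixes: the midpoints come from the plotting function itself rather than from legend() positioned by trial and error.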
Re: [R] filter a tab delimited text file
Duke - One possibility is to check the help files for the functions involved to see if there are options to control this behaviour. For example, the check.names= argument to read.table, or the quote= argument to write.table. How about

expFC <- read.table("test.txt", header=TRUE, sep="\t", check.names=FALSE)
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t", quote=FALSE)

- Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu
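Phil's two options can be seen end to end on a tiny self-contained file; the column names below (with parentheses, like Duke's log2(Fold_change) headers) are toy data written to a temp file:

```r
# Round-trip demo of check.names= and quote= on a toy tab-delimited file.
tf <- tempfile(fileext = ".txt")
writeLines(c("Gene\tlog2(FC)\tSig",
             "g1\t-3.8\tTRUE",
             "g2\t5.5\tFALSE"), tf)

# check.names = FALSE keeps "log2(FC)" instead of mangling it to "log2.FC."
d <- read.table(tf, header = TRUE, sep = "\t", check.names = FALSE)
keep <- d[d[[ncol(d)]] == TRUE, ]   # filter on the last (logical) column

out <- tempfile(fileext = ".txt")
# quote = FALSE writes the header and character columns without quotes
write.table(keep, file = out, row.names = FALSE, sep = "\t", quote = FALSE)
readLines(out)
```

With the defaults instead (check.names = TRUE, quote = TRUE) the header would come back as `log2.FC.` and the gene names would be written in double quotes, which is exactly the pair of symptoms Duke reported.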
Re: [R] lme4a package loading error
On Fri, 2010-09-10 at 10:23 -0700, array chip wrote: Thanks for reminding this. So I found lme4a package from Doug's UserR!2010 presentation folder: http://lme4.r-forge.r-project.org/slides/2010-07-20-Gaithersburg/pkg/ What is wrong with the one on the packages tab of the lme4 project page: https://r-forge.r-project.org/R/?group_id=60 ? You might need to make sure you have the latest Matrix as well to run lme4a. Update Matrix via update.packages() or install the latest version from r-forge and see if that helps. Also, try not to cross-post to multiple lists. Stick with one, or move the thread onto the new list. HTH G However, after installation, I got the following error message when trying to load the library: library(Matrix) library(Rcpp) library(minqa) library(lme4a) Error : classes modelMatrix, denseModelMatrix, sparseModelMatrix, ddenseModelMatrix, dsparseModelMatrix, predModule, dPredModule, sPredModule, respModule, glmRespMod, nlsRespMod are not exported by 'namespace:Matrix' Error: package/namespace load failed for 'lme4a' Here is my sessionInfo() sessionInfo() R version 2.11.1 (2010-05-31) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] minqa_1.1.9Rcpp_0.8.6 Matrix_0.999375-43 lattice_0.18-8 loaded via a namespace (and not attached): [1] grid_2.11.1nlme_3.1-96splines_2.11.1 stats4_2.11.1 tools_2.11.1 Any suggestions would be appreciated. John - Original Message From: Gavin Simpson gavin.simp...@ucl.ac.uk To: array chip arrayprof...@yahoo.com Cc: John Sorkin jsor...@grecc.umaryland.edu; r-help@r-project.org; Bert Gunter gunter.ber...@gene.com Sent: Fri, September 10, 2010 10:00:15 AM Subject: Re: [R] lmer fixed effects, SE, t . . . 
Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value
x = 1:16
S = summary(x)
S
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    4.75    8.50    8.50   12.25   16.00
S[-4]
   Min. 1st Qu.  Median 3rd Qu.    Max.
   1.00    4.75    8.50   12.25   16.00
par(mfrow=c(1,2))
boxplot(S[-4]) # based on the summarized stats
boxplot(x)     # based on the raw data

-- View this message in context: http://r.789695.n4.nabble.com/boxplot-knowing-Q1-Q3-median-upper-and-lower-whisker-value-tp2528571p2534818.html Sent from the R help mailing list archive at Nabble.com.
[R] adding zeroes after old zeroes in a vector ??
Hello. Imagine I have a vector of ones and zeroes, written compactly: 011100101. I need to get a new vector in which the N ones following the zeroes are replaced with new zeroes. For example, for N = 3, 011100101 becomes 00010. I can do it with a for loop, but I've read that's not good practice. How can I do it instead? Cheers. My vector is a zoo series, actually, but I guess that doesn't make any difference. -- View this message in context: http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534824.html Sent from the R help mailing list archive at Nabble.com.
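The question can be solved without an explicit for loop. On one reading of the request — every position in the N slots after a zero is forced to zero — a sketch follows; the interpretation is an assumption, since the worked example in the post did not survive intact, and the function name `zero_after` is invented:

```r
# Force the N positions following every zero to zero, with no for loop.
# (The interpretation of the original request is assumed; its example
# digits were garbled in transit.)
zero_after <- function(x, N) {
  # positions within N steps after any zero
  idx <- unlist(lapply(which(x == 0), function(i) i + seq_len(N)))
  idx <- idx[idx <= length(x)]
  x[idx] <- 0
  x
}
zero_after(c(1, 1, 0, 1, 1, 1, 1), N = 2)  # -> 1 1 0 0 0 1 1
```

Since the poster's data is a zoo series, the same function applies via `coredata(z) <- zero_after(coredata(z), N)`, leaving the index untouched.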
Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value
For base graphics the bxp function does the actual plotting given the statistics. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Brian Diggs Sent: Thursday, September 09, 2010 1:41 PM To: David A. Cc: R-help Subject: Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value

On 9/6/2010 8:46 AM, David A. wrote: Dear list, I am using an external program that outputs Q1, Q3, median, upper and lower whisker values for various datasets simultaneously in a tab-delimited format. After importing this text file into R, I would like to plot a boxplot using these given values and not the original series of data points, i.e. not using something like boxplot(mydata). Is there an easy way of doing this? If I am not wrong, boxplot() does not accept these values as parameters. Cheers, Dave

If you use ggplot2, you can specify the aesthetics lower, upper, middle, ymin, and ymax directly to variables in geom_boxplot. Just be sure to set stat = "identity" so that it does not try to summarize your data again. -- Brian Diggs Senior Research Associate, Department of Surgery Oregon Health & Science University
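To make Greg's bxp() suggestion concrete: bxp() takes a list whose stats component is a 5-row matrix (lower whisker, Q1, median, Q3, upper whisker), one column per box, which is exactly the set of numbers David's external program emits. A sketch with made-up summary values:

```r
# bxp() draws boxplots from precomputed statistics, no raw data needed.
# stats rows: lower whisker, Q1, median, Q3, upper whisker (per column/box).
# The numbers below are invented for illustration.
z <- list(stats = matrix(c(1, 4.75, 8.5, 12.25, 16,    # box "A"
                           2, 5.00, 9.0, 13.00, 18),   # box "B"
                         nrow = 5),
          n     = c(16, 20),        # per-box sample sizes (used for notches)
          conf  = NULL,
          out   = numeric(0),       # no outlier points
          names = c("A", "B"))
bxp(z)
```

In practice one would fill z$stats column by column from the imported tab-delimited table, e.g. `z$stats <- t(as.matrix(mytable[, c("lw", "q1", "med", "q3", "uw")]))`, where those column names stand in for whatever the external program calls them.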
Re: [R] modifying axis labels in lattice panels
Thanks a lot for this incredibly helpful and thorough reply. I had actually meant to cut out the scales part before sending the email, very sorry about the confusion, so I was actually executing just xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel) The scales part was a later attempt to control the axis directly which I eventually abandoned. (partly because I actually wanted the HOUR variables to be local to the panel function) and yes, in this simplified version I asked for labels only at 8am which formats to 08. My intention was to add more hours and weekly labels once I figure out this simple axis first. I had hoped to define a panel function that draws only one PC at a time since I envision that grouping variable to have many levels (two were just an example). Might you know how to disable the axis drawing in panel.xyplot ? Thanks ! Markus On Fri, Sep 10, 2010 at 12:45 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: On Fri, Sep 10, 2010 at 7:16 AM, Markus Loecher markus.loec...@gmail.comwrote: Dear all, I am struggling to modify the axis labels/ticks in a panel provided to xyplot. To begin with, I do not know the equivalent of the xaxt=n directive for panels that would set the stage for no default x axis being drawn. My goal is to draw ticks and custom formatted labels at certain hours of the week. When I execute the code below, I get an error message in the plot window that suggests a problem with some args parameter. A second problem concerns the shaded rectangles I try to draw. Clearly, the range returned by par('usr') does not correspond to the true y ranges. 
Any help would be greatly appreciated. Thanks, Markus. PS I am using R version 2.10.0 on Mac OS and the lattice package version 0.18-3 (latest).

library(lattice)
# multivariate time series, one row for each hour of the week:
Xwide <- cbind.data.frame(time = as.POSIXct("2010-09-06 00:00:00 EDT") + (0:167)*3600,
                          Comp.1 = sin(2*pi*7*(0:167)/168),
                          Comp.2 = cos(2*pi*7*(0:167)/168))
# to pass this to xyplot, first need to reshape:
Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction = "long", timevar = "PC")
# get descriptive shingle labels
Xlong[, "PC"] <- factor(Xlong[, "PC"], labels = paste("PC", 1:2))

A less mentally taxing approach :)

library(reshape)
xlong <- melt(Xwide, id = 'time')
names(xlong)[2:3] <- c('PC', 'Comp')

xyplot(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel,
       scales = list(x = list(at = Hours24 - 4*3600,
                              labels = as.character(format(Hours24 - 4*3600, "%H")))))

When attempting to run this, I got: Error in xyplot.formula(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel, : object 'Hours24' not found. Attempting to pull Hours24 out of the function didn't work...

Hours24 <- seq(r[1] + 12*3600, r[2], by = 24*3600)
Error in seq(r[1] + 12 * 3600, r[2], by = 24 * 3600) : object 'r' not found

One problem is that to use Hours24 in scales, it has to be defined in the calling environment of xyplot() - in other words, it has to be defined outside the panel function and outside of xyplot() if your present code is to have any hope of working. I think I got that part figured out: in the console, type

r <- range(Xwide$time)
Hours24 <- seq(r[1] + 12*3600, r[2], by = 24*3600)

I at least get a plot now by running your xyplot() function with the panel function, but all the labels are 08 on the x-axis. Here's why:

format(Hours24 - 4*3600, "%H")
[1] "08" "08" "08" "08" "08" "08" "08"

This comes from the labels = part of your panel function. I got the same plot with this code (apart from adding the lines):

xyplot(Comp ~ time | PC, data = Xlong, type = 'l',
       scales = list(x = list(at = Hours24 - 4*3600,
                              labels = as.character(format(Hours24 - 4*3600, "%H")))))

which indicates that something in your panel function is awry. I'd suggest starting out simply. Put both plots on the same panel using PC as a grouping variable in the long data frame. It will automatically use different colors for groups, but you can control the line color with the col.lines = argument; e.g., col.lines = c('red', 'blue'). Next, I'd work on getting the axis ticks and labels the way you want with scales. It also appears that you want to set a custom grid - my suggestion would be to do that last, after you've controlled the axis ticks and labels. Once you have that figured out, you have the kernel of your panel function. In most applications I've seen in lattice, the idea is to keep the panel function as simple as possible and pass the 'global' stuff to the function call. There's something broken in your panel function, but it's a run-time error rather than a compile-time error, so you can either debug it or try simplifying the problem (and the panel function) as much as possible. HTH, Dennis

WeekhourPanel <- function(x, y, ...){
  r <- range(x)
  # print(r)
  Hours8 <- seq(r[1], r[2], by = 8*3600)
  Hours24 <-
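Dennis's central point — tick positions and labels computed in the calling environment and handed to scales, with no custom panel function at all — can be shown in a self-contained toy. The data and labels below are invented to mirror the thread's one-week hourly series:

```r
library(lattice)

# Toy hourly series over one week; 'at' and 'labels' are computed in the
# calling environment, as Dennis recommends, not inside a panel function.
Xwide <- data.frame(time = as.POSIXct("2010-09-06 00:00:00", tz = "UTC") +
                      (0:167) * 3600,
                    Comp = sin(2 * pi * 7 * (0:167) / 168))
r <- range(Xwide$time)
Hours24 <- seq(r[1] + 12 * 3600, r[2], by = 24 * 3600)  # one tick per day, at noon

p <- xyplot(Comp ~ time, data = Xwide, type = "l",
            scales = list(x = list(at = Hours24,
                                   labels = format(Hours24, "%a"))))
# lattice objects are drawn only when printed: print(p)
```

Using a varying label format such as "%a" (weekday) rather than a fixed hour also avoids the "all labels are 08" symptom from the thread, since each tick then formats to a different string.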
Re: [R] filter a tab delimited text file
Hi Phil,

On 9/10/10 1:45 PM, Phil Spector wrote: Duke - One possibility is to check the help files for the functions involved to see if there are options to control this behaviour. For example, the check.names= argument to read.table, or the quote= argument to write.table. How about

Yes, I did before posting the question to the list. But somehow I missed (or misunderstood) the check.names option. As for the quote=FALSE option for write.table, it does not work as I want, since all the headers are unquoted too.

expFC <- read.table("test.txt", header=TRUE, sep="\t", check.names=FALSE)
expFC.TRUE <- expFC[expFC[dim(expFC)[2]] == TRUE, ]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t", quote=FALSE)

This works perfectly and solves the first issue. Thanks so much Phil. D.

- Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu

On Fri, 10 Sep 2010, Duke wrote: Hi all, I have to filter a tab-delimited text file like below:

GeneNames value1 value2 log2(Fold_change) log2(Fold_change) normalized Signature(abs(log2(Fold_change) normalized) > 4)
ENSG0209350 4 35 -3.81131293562629 -4.14357714689656 TRUE
ENSG0177133 142 2 5.46771720082336 5.13545298955309 FALSE
ENSG0116285 115 1669 -4.54130810709955 -4.87357231836982 TRUE
ENSG0009724 10 162 -4.69995182667858 -5.03221603794886 FALSE
ENSG0162460 3 31 -4.05126372834704 -4.38352793961731 TRUE

based on the last column (TRUE), and then write to a new text file, meaning I should get something like below:

GeneNames value1 value2 log2(Fold_change) log2(Fold_change) normalized Signature(abs(log2(Fold_change) normalized) > 4)
ENSG0209350 4 35 -3.81131293562629 -4.14357714689656 TRUE
ENSG0116285 115 1669 -4.54130810709955 -4.87357231836982 TRUE
ENSG0162460 3 31 -4.05126372834704 -4.38352793961731 TRUE

I used read.table and write.table but I am still not very satisfied with the results.
Here is what I did:

expFC <- read.table( "test.txt", header=T, sep="\t" )
expFC.TRUE <- expFC[expFC[dim(expFC)[2]] == TRUE, ]
write.table (expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t" )

Result:

GeneNames value1 value2 log2.Fold_change. log2.Fold_change..normalized Signature.abs.log2.Fold_change..normalized4.
ENSG0209350 4 35 -3.81131293562629 -4.14357714689656 TRUE
ENSG0116285 115 1669 -4.54130810709955 -4.87357231836982 TRUE
ENSG0162460 3 31 -4.05126372834704 -4.38352793961731 TRUE

As you can see, there are two points: 1. The headers were altered. All the special characters were converted to dots (.). 2. The gene names (first column) were quoted (which they were not in the original file). The second point is not very annoying, but the first one is. How do I get exactly the same headers as in the original file? Thanks, D.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
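The round trip Duke describes can be sketched end to end on a small invented file (the file contents and names below are made up for illustration):

```r
# Sketch: keep special-character headers intact across a
# read / filter / write round trip on a tiny tab-delimited file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("GeneNames\tlog2(FC) normalized\tSignature",
             "g1\t-4.1\tTRUE",
             "g2\t5.1\tFALSE",
             "g3\t-4.4\tTRUE"), tmp)

# check.names = FALSE stops read.table from turning "(", ")" etc. into dots
expFC <- read.table(tmp, header = TRUE, sep = "\t", check.names = FALSE)
expFC.TRUE <- expFC[expFC[[ncol(expFC)]] == TRUE, ]   # keep rows flagged TRUE

out <- tempfile(fileext = ".txt")
write.table(expFC.TRUE, file = out, row.names = FALSE,
            sep = "\t", quote = FALSE)                # no quoting anywhere
readLines(out)
```

Note quote = FALSE suppresses quoting of headers and data alike, which is exactly the behaviour Duke observed.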
Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Brian Diggs Sent: Thursday, September 09, 2010 12:41 PM To: David A. Cc: R-help Subject: Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value

On 9/6/2010 8:46 AM, David A. wrote: Dear list, I am using an external program that outputs Q1, Q3, median, upper and lower whisker values for various datasets simultaneously in a tab-delimited format. After importing this text file into R, I would like to plot a boxplot using these given values and not the original series of data points, i.e. not using something like boxplot(mydata). Is there an easy way of doing this? If I am not wrong, boxplot() does not accept these values as parameters.

I believe boxplot(x,y,z) computes the required statistics and passes them to the bxp() function for plotting. If you have the statistics you can pass them to bxp() yourself. You might call trace(bxp) followed by a call to boxplot() to see how boxplot uses bxp. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

Cheers, Dave

If you use ggplot2, you can specify the aesthetics lower, upper, middle, ymin, and ymax directly to variables in geom_boxplot. Just be sure to set stat = "identity" so that it does not try to summarize your data again. -- Brian Diggs Senior Research Associate, Department of Surgery Oregon Health & Science University

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
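Bill's bxp() suggestion can be sketched as follows. The five numbers per group are invented, and the list mimics the structure that boxplot() itself hands to bxp() (see ?bxp for the full description of the stats matrix):

```r
# Sketch: draw boxplots from precomputed statistics via bxp().
# Rows of `stats`: lower whisker, Q1, median, Q3, upper whisker.
stats <- matrix(c(1.2, 2.0, 3.1, 4.4, 5.9,
                  0.8, 1.5, 2.2, 3.0, 4.1),
                nrow = 5)
z <- list(stats = stats,
          n     = c(30, 25),      # group sizes (used for notches/widths)
          conf  = NULL,           # no notch confidence intervals supplied
          out   = numeric(0),     # no outlier points supplied
          group = numeric(0),
          names = c("set A", "set B"))
bxp(z)
```

This is the same list a call like boxplot(x, y, plot = FALSE) returns, so the imported Q1/Q3/median/whisker columns only need to be arranged into the 5-row stats matrix.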
[R] ggplot bar geom: control the filling in the colour legend
Hi all, Is it possible to change the filling of the squares used to represent the colour legend in a bar plot with ggplot? In this example, the fillings are raven black; I'd like them white.

ggplot(diamonds, aes(clarity, colour = cut)) + geom_bar()

Regards -- - Benoit Boulinguiez Ph.D. student Ecole de Chimie de Rennes (ENSCR) Bureau 1.20 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes Avenue du Général Leclerc CS 50837 35708 Rennes CEDEX 7 Tel 33 (0)2 23 23 80 83 Fax 33 (0)2 23 23 81 20 http://www.ensc-rennes.fr/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] adding labels above bars in a barplot
Hi: To add to Greg's sound advice, if you want to put the numbers on top of the bars, why bother with the numerical scale? The entire point of a scale is to provide a reference for comparing different (sets of) values. \begin{rant} And when I see things like this:

dat.bar
            VAR1
Category 01 17.5
Category 02 15.2
Category 03 10.3
Category 04  8.4
Category 05 20.3

I get doubly annoyed, because it is yet another attempt to use a bar chart to plot quantitative values by factor level. As I mentioned in a private response today, one of the problems with a bar chart is that it forces the numerical scale to have origin zero, and this is often neither necessary nor desirable. A simple line plot that connects the quantitative values between categories is sufficient, and takes *far* less ink to produce. The purpose of a statistical graphic is to convey information in a simple, clean, concise fashion - it is not meant to be a rococo art form. If you intend to write a function to automate a graphic, please think carefully about what is meant to be conveyed and the *visually* simplest means by which to convey it. \end{rant} The purpose of a bar chart is to visualize a (joint) discrete distribution. There are better ways to plot quantitative variables by group; in addition to the line plot mentioned above, the Cleveland dot chart can be very effective with many groups or multiple grouping factors. With two factors and a quantitative response, another option is the interaction plot. If this weren't the third such example/request I've seen today, I probably wouldn't be so apoplectic... Dennis On Fri, Sep 10, 2010 at 10:44 AM, Greg Snow greg.s...@imail.org wrote: See this message and the replies to it (and the replies to the replies, etc.): http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html In there is a discussion of why you don't really want to do that along with better alternatives and examples of the improved plots. -- Gregory (Greg) L. Snow Ph.D.
Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Antonio Olinto Sent: Friday, September 10, 2010 8:59 AM To: R-help Subject: [R] adding labels above bars in a barplot

Hello, I want to make a general routine to draw barplots with numbers plotted above each bar. See the example below. I could not place the numbers on the middle of each bar because I could not calculate the right position of each x-axis tick. axTicks(1) indicated a unitary step, but it does not seem to work. I appreciate any help or suggestions. Best regards, Antonio Olinto

==

CAT          VAR1  VAR2
Category 01  17.5     9
Category 02  15.2    20
Category 03  10.3   500
Category 04   8.4   150
Category 05  20.3  5000

# Copying data from a spreadsheet
dat.data <- read.delim("clipboard", header=T)
summary(dat.data)
          CAT         VAR1            VAR2
 Category 01:1   Min.   : 8.40   Min.   :   9
 Category 02:1   1st Qu.:10.30   1st Qu.:  20
 Category 03:1   Median :15.20   Median : 150
 Category 04:1   Mean   :14.34   Mean   :1136
 Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
                 Max.   :20.30   Max.   :5000

dat.bar <- data.frame(dat.data[, c(2)])
row.names(dat.bar) <- dat.data[, 1]
names(dat.bar) <- c("VAR1")
dat.bar
            VAR1
Category 01 17.5
Category 02 15.2
Category 03 10.3
Category 04  8.4
Category 05 20.3

par(mar=c(12,6,3,2), cex.axis=1.2, cex.lab=1.4)
barplot(t(as.matrix(dat.bar)), ylim=c(0, max(dat.data[,2]*1.1)), las=2,
        ylab="Y label text", col="orange")
box()
up <- max(dat.data$VAR1)*0.1
for (i in c(0:nrow(dat.data))) {
  legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
}

Webmail - iBCMG Internet http://www.ibcmg.com.br

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
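On Antonio's x-position problem specifically: barplot() returns the bar midpoints invisibly, so there is no need to guess the tick positions from axTicks() — capture them and hand them to text(). A sketch using his VAR1 values:

```r
# Sketch: label bars using the midpoints that barplot() itself returns.
heights <- c(17.5, 15.2, 10.3, 8.4, 20.3)
names(heights) <- paste("Category", sprintf("%02d", 1:5))

mids <- barplot(heights, ylim = c(0, max(heights) * 1.2),
                las = 2, col = "orange", ylab = "Y label text")
text(mids, heights, labels = heights, pos = 3)  # pos = 3: just above each bar
box()
```

This replaces the legend()-in-a-loop trick entirely; text() is vectorized, so one call labels every bar.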
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] adding labels above bars in a barplot
Are you trying to say that you don't really like barplots? At least the OP did not ask for error bars as well. :) --- On Fri, 9/10/10, Dennis Murphy djmu...@gmail.com wrote: From: Dennis Murphy djmu...@gmail.com Subject: Re: [R] adding labels above bars in a barplot To: Antonio Olinto aolint...@bignet.com.br Cc: R-help r-h...@stat.math.ethz.ch Received: Friday, September 10, 2010, 2:31 PM Hi: To add to Greg's sound advice, if you want to put the numbers on top of the bars, why bother with the numerical scale? The entire point of a scale is to provide a reference for comparing different (sets of) values. \begin{rant} And when I see things like this:

dat.bar
            VAR1
Category 01 17.5
Category 02 15.2
Category 03 10.3
Category 04  8.4
Category 05 20.3

I get doubly annoyed, because it is yet another attempt to use a bar chart to plot quantitative values by factor level. As I mentioned in a private response today, one of the problems with a bar chart is that it forces the numerical scale to have origin zero, and this is often neither necessary nor desirable. A simple line plot that connects the quantitative values between categories is sufficient, and takes *far* less ink to produce. The purpose of a statistical graphic is to convey information in a simple, clean, concise fashion - it is not meant to be a rococo art form. If you intend to write a function to automate a graphic, please think carefully about what is meant to be conveyed and the *visually* simplest means by which to convey it. \end{rant} The purpose of a bar chart is to visualize a (joint) discrete distribution. There are better ways to plot quantitative variables by group; in addition to the line plot mentioned above, the Cleveland dot chart can be very effective with many groups or multiple grouping factors. With two factors and a quantitative response, another option is the interaction plot. If this weren't the third such example/request I've seen today, I probably wouldn't be so apoplectic...
Dennis

On Fri, Sep 10, 2010 at 10:44 AM, Greg Snow greg.s...@imail.org wrote: See this message and the replies to it (and the replies to the replies, etc.): http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html In there is a discussion of why you don't really want to do that along with better alternatives and examples of the improved plots. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Antonio Olinto Sent: Friday, September 10, 2010 8:59 AM To: R-help Subject: [R] adding labels above bars in a barplot

Hello, I want to make a general routine to draw barplots with numbers plotted above each bar. See the example below. I could not place the numbers on the middle of each bar because I could not calculate the right position of each x-axis tick. axTicks(1) indicated a unitary step, but it does not seem to work. I appreciate any help or suggestions. Best regards, Antonio Olinto

==

CAT          VAR1  VAR2
Category 01  17.5     9
Category 02  15.2    20
Category 03  10.3   500
Category 04   8.4   150
Category 05  20.3  5000

# Copying data from a spreadsheet
dat.data <- read.delim("clipboard", header=T)
summary(dat.data)
          CAT         VAR1            VAR2
 Category 01:1   Min.   : 8.40   Min.   :   9
 Category 02:1   1st Qu.:10.30   1st Qu.:  20
 Category 03:1   Median :15.20   Median : 150
 Category 04:1   Mean   :14.34   Mean   :1136
 Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
                 Max.   :20.30   Max.   :5000

dat.bar <- data.frame(dat.data[, c(2)])
row.names(dat.bar) <- dat.data[, 1]
names(dat.bar) <- c("VAR1")
dat.bar
            VAR1
Category 01 17.5
Category 02 15.2
Category 03 10.3
Category 04  8.4
Category 05 20.3

par(mar=c(12,6,3,2), cex.axis=1.2, cex.lab=1.4)
barplot(t(as.matrix(dat.bar)), ylim=c(0, max(dat.data[,2]*1.1)), las=2,
        ylab="Y label text", col="orange")
box()
up <- max(dat.data$VAR1)*0.1
for (i in c(0:nrow(dat.data))) {
  legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
}

Webmail - iBCMG Internet http://www.ibcmg.com.br

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailing list
Re: [R] ggplot bar geom: control the filling in the colour legend
Sure, just change the color of the fill.

ggplot(diamonds, aes(clarity, colour = cut)) + geom_bar(fill = "white")

-Ista

On Fri, Sep 10, 2010 at 2:24 PM, Benoit Boulinguiez benoit.boulingu...@ensc-rennes.fr wrote: Hi all, Is it possible to change the filling of the squares used to represent the colour legend in a bar plot with ggplot? In this example, the fillings are raven black; I'd like them white. ggplot(diamonds, aes(clarity, colour = cut)) + geom_bar() Regards -- - Benoit Boulinguiez Ph.D. student Ecole de Chimie de Rennes (ENSCR) Bureau 1.20 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes Avenue du Général Leclerc CS 50837 35708 Rennes CEDEX 7 Tel 33 (0)2 23 23 80 83 Fax 33 (0)2 23 23 81 20 http://www.ensc-rennes.fr/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
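Note that fill = "white" inside geom_bar() whitens the bars themselves as well as the legend squares. If only the legend keys should change, ggplot2 versions 0.9 and later (i.e. newer than the version current in this thread) can override the legend aesthetics alone — a sketch, assuming a recent ggplot2:

```r
library(ggplot2)

# Sketch: white legend key fill, while the bars keep their default fill.
p <- ggplot(diamonds, aes(clarity, colour = cut)) +
  geom_bar() +
  guides(colour = guide_legend(override.aes = list(fill = "white")))
print(p)
```

override.aes changes only how the legend keys are drawn, not the layer itself, which is usually what is wanted here.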
[R] Maximum log likelihood estimates of the parameters of a nonlinear model.
Dear all, Is it possible to generate AIC or something equivalent for a nonlinear model estimated by maximum log likelihood in R? I used nls, which is based on least squares, and therefore I cannot assess the quality of models with AIC. nlme seems good only for mixed models, and mine is not a mixed model.

res <- nls(y ~ d*(x)^3 + a*(x)^2 + b*x + c, start=list(a=2, b=1, c=1, d=1), data=d)

If anybody knows an R function to estimate a nonlinear model based on maximum log likelihood, please let me know. Thanks for your help in advance! Odette __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filter a tab delimited text file
On Fri, Sep 10, 2010 at 1:24 PM, Duke duke.li...@gmx.com wrote: Hi all, I have to filter a tab-delimited text file like below:

GeneNames value1 value2 log2(Fold_change) log2(Fold_change) normalized Signature(abs(log2(Fold_change) normalized) > 4)
ENSG0209350 4 35 -3.81131293562629 -4.14357714689656 TRUE
ENSG0177133 142 2 5.46771720082336 5.13545298955309 FALSE
ENSG0116285 115 1669 -4.54130810709955 -4.87357231836982 TRUE
ENSG0009724 10 162 -4.69995182667858 -5.03221603794886 FALSE
ENSG0162460 3 31 -4.05126372834704 -4.38352793961731 TRUE

based on the last column (TRUE), and then write to a new text file, meaning I should get something like below:

GeneNames value1 value2 log2(Fold_change) log2(Fold_change) normalized Signature(abs(log2(Fold_change) normalized) > 4)
ENSG0209350 4 35 -3.81131293562629 -4.14357714689656 TRUE
ENSG0116285 115 1669 -4.54130810709955 -4.87357231836982 TRUE
ENSG0162460 3 31 -4.05126372834704 -4.38352793961731 TRUE

I used read.table and write.table but I am still not very satisfied with the results. Here is what I did:

expFC <- read.table( "test.txt", header=T, sep="\t" )
expFC.TRUE <- expFC[expFC[dim(expFC)[2]] == TRUE, ]
write.table (expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t" )

Result:

GeneNames value1 value2 log2.Fold_change. log2.Fold_change..normalized Signature.abs.log2.Fold_change..normalized4.
ENSG0209350 4 35 -3.81131293562629 -4.14357714689656 TRUE
ENSG0116285 115 1669 -4.54130810709955 -4.87357231836982 TRUE
ENSG0162460 3 31 -4.05126372834704 -4.38352793961731 TRUE

As you can see, there are two points: 1. The headers were altered. All the special characters were converted to dots (.). 2. The gene names (first column) were quoted (which they were not in the original file).

This will copy input lines matching pattern as well as the header to the output verbatim, preserving all quotes, spacing, etc.
myFilter <- function(infile, outfile, pattern = "TRUE$") {
  L <- readLines(infile)
  cat(L[1], "\n", file = outfile)
  L2 <- grep(pattern, L[-1], value = TRUE)
  for(el in L2) cat(el, "\n", file = outfile, append = TRUE)
}

# e.g.
myFilter("infile.txt", "outfile.txt")

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simulation
Do you need a table with an odds ratio exactly equal to 3 (or other value), or a realistic sample from a population with odds ratio 3, where the sample table will have a different OR (but the various tables will cluster around the true value)? -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Silverton Sent: Friday, September 10, 2010 1:03 AM To: r-help@r-project.org Subject: Re: [R] Simulation I have two questions: (1) How do you 'create' a 2 x 2 table in R using, say, an odds ratio of 3 or even 0.5? (2) If I have several 2 x 2 tables, how can I 'implement' dependence in the tables with, say, 25 of the tables having an odds ratio of 1 and 75 of the tables having an odds ratio of 4? Jim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
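For Greg's second reading (a realistic sample from a population with a given odds ratio), one possible sketch, not from the thread: fix the group sizes and a baseline probability (both invented here), solve for the second group's probability from the target OR, and draw binomial counts:

```r
# Sketch: sample a 2x2 table from a population with odds ratio `or`.
# n and p1 are arbitrary choices; the sampled table's OR varies around
# the target rather than equaling it exactly.
sim2x2 <- function(n = 100, p1 = 0.3, or = 3) {
  odds2 <- or * p1 / (1 - p1)      # odds in group 2 = OR * odds in group 1
  p2 <- odds2 / (1 + odds2)
  s1 <- rbinom(1, n, p1)
  s2 <- rbinom(1, n, p2)
  matrix(c(s1, n - s1, s2, n - s2), nrow = 2,
         dimnames = list(c("yes", "no"), c("grp1", "grp2")))
}

set.seed(1)
tab <- sim2x2()
(tab[1, 2] / tab[2, 2]) / (tab[1, 1] / tab[2, 1])  # sample OR, scattered around 3
```

For Jim's second question, a mixture is one route: draw each table from sim2x2() with or = 1 or or = 4 according to the desired proportions.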
Re: [R] adding zeroes after old zeroes in a vector ??
Not sure how you handle the ending sequence of '0101'; here is one approach:

x <- "011100101"
gsub("0111", "0001", x)
[1] "000100101"
x
[1] "011100101"

For the final one, you could do:

gsub("01..", "0000", x)
[1] "00010"

On Fri, Sep 10, 2010 at 1:51 PM, skan juanp...@gmail.com wrote: Hello Imagine I have a vector with ones and zeroes. I write it compactly: 011100101 I need to get a new vector replacing the N ones following the zeroes with new zeroes. For example, for N = 3, 011100101 becomes 00010. I can do it with a for loop but I've read that is not good practice. How can I do it then? cheers My vector is a zoo series, indeed, but I guess it doesn't make any difference. -- View this message in context: http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534824.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can't event start 'rattle' under Ubuntu 64bit
Error: attempt to apply non-function In addition: Warning message: In method(obj, ...) : Invalid object type `GtkFrame' Rattle timestamp (for the error above): 2010-09-10 17:24:14 Ditto in Debian testing i386 32 bit. Johan Nyberg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT: model diagnostics in the published literature
My experience is that most medical journals (and probably others as well, but I am most familiar with the medical journals) have word or page limits on articles. Diagnostic plots and tests that just show you what you expected and say that it is ok to use your model are not exciting enough to include. And some of those plots/tests tend to confuse non-statisticians rather than help, have you ever given a QQ-plot of the residuals to a client to show that the normal approximation is OK? I made this mistake a few times and ended up having to explain it over and over again. Most papers that I have been involved with end up including less than half of the analyses that I actually did, just the most interesting results make it. Sometimes a reviewer will ask about the tests on the assumptions and we will send them the results so they can see the model is reasonable, but rarely does it make it into the paper itself. Though I think it would be better if more reviewers asked. The drawback is that when you read a paper it is difficult (or impossible) to tell if they did all the tests and the results were as expected, or they did not do the tests and there could be major problems. One bright spot for the future is that more journals are now allowing for online supplements where all the details that don't make it into the main paper can be provided for the few who are interested. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Christopher W. Ryan Sent: Thursday, September 09, 2010 8:34 PM To: r-help@r-project.org Subject: [R] OT: model diagnostics in the published literature This is a more general statiscal question, not specific to R: As I move through my masters curriculum in statistics, I am becoming more and more attuned to issues of model fit and diagnostics (graphical methods, AIC, BIC, deviance, etc.) 
As my regression professor always likes to say, only draw substantive conclusions from valid models. Yet in published articles in my field (medicine), I rarely see any explicit description of whether, and if so how, model fit was assessed and assumptions checked. Mostly the results sections are all about hypothesis testing on model coefficients. Is this common in other disciplines? Are there fields of study in which it is customary to provide a discussion of model adequacy, either in the text or perhaps in an online appendix? And if that discussion is not provided, what, if anything, can one conclude about whether, and how well, it was done? Is it sort of taken as a given that those diagnostic checks were carried out? Do journal editors often ask? Thanks for your thoughts. --Chris Ryan Clinical Associate Professor of Family Medicine SUNY Upstate Medical University Clinical Campus at Binghamton __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] adding zeroes after old zeroes in a vector ??
Hi: Try the following:

f <- function(x, n) {
  r <- rle(x)
  t1 <- cumsum(r$lengths)[r$values == 0L] + 1
  repl <- as.vector(mapply(seq, t1, t1 + n - 1))
  replace(x, repl, 0)
}
f(x, 3)

HTH, Dennis

On Fri, Sep 10, 2010 at 10:51 AM, skan juanp...@gmail.com wrote: Hello Imagine I have a vector with ones and zeroes. I write it compactly: 011100101 I need to get a new vector replacing the N ones following the zeroes with new zeroes. For example, for N = 3, 011100101 becomes 00010. I can do it with a for loop but I've read that is not good practice. How can I do it then? cheers My vector is a zoo series, indeed, but I guess it doesn't make any difference. -- View this message in context: http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534824.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum log likelihood estimates of the parameters of a nonlinear model.
Odette Gaston odette.gaston at gmail.com writes: Dear all, Is it possible to generate AIC or something equivalent for a nonlinear model estimated by maximum log likelihood in R? I used nls, which is based on least squares, and therefore I cannot assess the quality of models with AIC. nlme seems good only for mixed models, and mine is not a mixed model.

res <- nls(y ~ d*(x)^3 + a*(x)^2 + b*x + c, start=list(a=2, b=1, c=1, d=1), data=d)

If anybody knows an R function to estimate a nonlinear model based on maximum log likelihood, please let me know.

AIC(res) should work just fine. Ordinary least-squares fitting is equivalent to assuming that the residuals are independent and normally distributed with a homogeneous variance. If you're willing to make those assumptions you're set. If not, there are various options for relaxing them: gnls in the nlme package for correlation and heteroscedasticity, mle (stats4) or mle2 (bbmle) for non-normality. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
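To make the point concrete, AIC() and logLik() can be called directly on this kind of nls fit. The data below are invented, and the cubic and constant parameters are renamed d0/c0 purely to avoid clashing with the data frame named d in the original post:

```r
# Sketch: AIC on an nls fit, using simulated cubic data.
set.seed(42)
d <- data.frame(x = seq(-2, 2, length = 50))
d$y <- 0.5 * d$x^3 + 2 * d$x^2 + d$x + 1 + rnorm(50, sd = 0.3)

res <- nls(y ~ d0 * x^3 + a * x^2 + b * x + c0,
           start = list(a = 2, b = 1, c0 = 1, d0 = 1), data = d)

AIC(res)     # based on the Gaussian log-likelihood implied by least squares
logLik(res)  # the log-likelihood itself, with its degrees of freedom
```

Comparing AIC(res) across candidate nls models fitted to the same response is then the usual workflow.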
Re: [R] adding zeroes after old zeroes in a vector ??
Hi I'll study your answers. I could also try gsub("01", "00", x) N times, but that could be very slow if N is large. In fact, when I wrote 10011 I meant a vector 1 1 1 1 1 0 0 1 1, not a string, but I wrote it more compactly. I could also do it by shifting the elements of the vector one position and ANDing the result with the original, and again shifting 2 positions and so on up to N. But that's very slow. -- View this message in context: http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534982.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
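Since the data really is a 0/1 vector, here is one fully vectorized sketch (not from the thread) that sets the N positions after every zero to zero in a single pass, with no string conversion and no loop over N:

```r
# Sketch: set the N elements following each 0 to 0, fully vectorized.
zero_after <- function(x, n) {
  idx  <- which(x == 0)                             # positions of the zeroes
  repl <- unique(as.vector(outer(idx, seq_len(n), `+`)))
  repl <- repl[repl <= length(x)]                   # stay inside the vector
  x[repl] <- 0
  x
}

x <- c(0, 1, 1, 1, 1, 0, 0, 1, 1)
zero_after(x, 3)   # -> 0 0 0 0 1 0 0 0 0
```

Zeroing a position that is already zero is harmless, which keeps the index arithmetic simple; for a zoo series the same indices can be applied to coredata().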
[R] Counting occurances of a letter by a factor
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string, grouped by the factor levels in another column of my data frame. Ex.

DF <- data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L", "U", "L", NA))
colnames(DF) <- c("X", "Y")
DF
     X    Y
1   CC    L
2   CC    U
3 <NA>    L
4   CG    U
5   GG    L
6   GC <NA>

I have an ugly solution, which works if you know the factor levels of Y in advance.

ans <- rbind(table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'L', 1]), ""))),
             table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'U', 1]), ""))))
rownames(ans) <- c("L", "U")
ans
  C G
L 2 2
U 3 1

I've played with table, xtabs, tabulate, aggregate, tapply, etc. but haven't found a combination that gives a more general solution to this problem. Any ideas? Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
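One more general sketch (not from the thread): split every string into characters, repeat each row's Y value once per character, and cross-tabulate. No level of Y needs to be hard-coded, and NA rows drop out because table() omits NA by default:

```r
# Sketch: count alleles per factor level without hard-coding the levels.
DF <- data.frame(X = c("CC", "CC", NA, "CG", "GG", "GC"),
                 Y = c("L", "U", "L", "U", "L", NA),
                 stringsAsFactors = FALSE)

sp  <- strsplit(DF$X, "")          # one character vector per row; NA stays NA
len <- sapply(sp, length)          # an NA entry splits to a single NA
ans <- table(rep(DF$Y, times = len), unlist(sp))
ans
#    C G
#  L 2 2
#  U 3 1
```

The same rep()/unlist() pairing generalizes to strings of unequal length, which the rbind-of-tables approach does not.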
Re: [R] lmer output
Denis.Aydin at unibas.ch writes: I have a question regarding the output of a binomial lmer model. The model is as follows:

lmer(y ~ diet * day * female + (day|female), family=binomial)

A reproducible example would always be nice. The corresponding output is:

Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ diet * day * female + (day | female)
  AIC  BIC logLik deviance
 1084 1136 -531.1     1062

[ snip ]

Fixed effects:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)         0.996444   0.713703   1.396   0.1627
dietNAA             1.194581   0.862294   1.385   0.1659
day                 0.142026   0.074270   1.912   0.0558 .
female              0.015629   0.019156   0.816   0.4146
dietNAA:day        -0.124755   0.088684  -1.407   0.1595
dietNAA:female     -0.024733   0.026947  -0.918   0.3587
day:female         -0.001535   0.001966  -0.781   0.4348
dietNAA:day:female  0.001543   0.002720   0.568   0.5704

Now from my understanding, the estimates represent differences in slopes and intercepts between different levels of diet and so on. My questions: 1. Is there a way to display the coefficients for all levels of variables (e.g., dietAA and dietNAA)? Because it is quite hard to calculate the slopes and intercepts for all levels of each variable.

See if

lmer(y ~ (diet-1) * (day-1) * (female-1) + (day|female), family=binomial)

helps, or see if you can use predict() with an appropriately constructed prediction data frame -- although I'm not sure predict() works with GLMMs in the current version of lme4.

2. Is there a way to get the degrees of freedom?

Giant can of worms, I'm afraid. See http://glmm.wikidot.com/faq for relevant links and alternatives. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.