Re: [R] Problems using "lm" in combination with "predict"
> > My modelmatrix is of dimension 28x4 > > Then I want to make use of the function predict because there > confidence.intervals are include. > > My idea was: > > mod <- lm(y~Worktime+Vacation+Illnes+Bankholidays) > > newdate=data.frame(x=c(324,123,0.9,0.1)) Perhaps you want to specify newdate like this newdate=data.frame(Worktime = 324, Vacation = 123, Illness = 0.9, Bankholidays = 0.1))? Best regards Frede Aakmann Tøgersen Scientist UNIVERSITY OF AARHUS Faculty of Agricultural Sciences Dept. of Genetics and Biotechnology Blichers Allé 20, P.O. BOX 50 DK-8830 Tjele Phone: +45 8999 1900 Direct: +45 8999 1878 E-mail: [EMAIL PROTECTED] Web: http://www.agrsci.org This email may contain information that is confidential. Any use or publication of this email without written permission from Faculty of Agricultural Sciences is not allowed. If you are not the intended recipient, please notify Faculty of Agricultural Sciences immediately and delete this email. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Access an entry after reading a table
read.table will convert you character columns to factors. You are seeing a single value returned ("A"), but it is also reporting the levels for the factors. One way is to read the data in without conversion to factors: Model=read.table("ModelMat.txt", header=TRUE, as.is=TRUE) or you can convert the factor to character for output: as.character(Model[1,1]) On 8/2/07, Gang Chen <[EMAIL PROTECTED]> wrote: > Sorry about this basic question. After reading a table, > > Model=read.table("ModelMat.txt", header=T) > > I want to get access to each entry in the table Model. However, if I do > > > Model[1,1] > > I get the following, > > [1] A > Levels: A B C > > My question is, how can I just get the entry "A" without the 2nd line > ("Levels: A B C")? > > Thanks, > Gang > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sample size for logit model
Hi, I'm hoping someone has some insight about sample size and logit estimation that could help me. I inherited a logit model from a client in the direct marketing area. The previous consultant used approximately 143,800 observations in the training data set, of which only 50 (0.03%) were the target ( = 1) value for the dependent variable. The literature I could find gives very little guidance on sample sizes (Hosmer & Lemeshow have some material, but they basically say that little has been done). Does anyone know of some literature or even rules-of-thumb about sample sizes and/or ratio of target to non-target values of the dependent variable? The use of 143,800 observations is excessive. Does this do anything to the significance of the estimates (e.g., am I always guaranteed very small p-values?)? Is oversampling of the target value the key and if so, how do I calculate weights for the estimations? Any guidance or suggestions in this area are definitely welcome. Walt Paczkowski _ Walter R. Paczkowski, Ph.D. Data Analytics Corp. 44 Hamilton Lane Plainsboro, NJ 08536 (V) 609-936-8999 (F) 609-936-3733 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using loops to create multiple images
On 8/4/07, Donatas G. <[EMAIL PROTECTED]> wrote: > I have a data.frame with ~100 columns and I need a barplot for each column > produced and saved in some directory. > > I am not sure it is possible - so please help me. > > this is my loop that does not work... > > vars <- list (substitute (G01_01), substitute (G01_02), substitute (G01_03), > substitute (G01_04)) > results <- data.frame ('Variable Name'=rep (NA, length (vars)), > check.names=FALSE) > for (i in 1:length (vars)) { > barplot(table(i),xlab=i,ylab="Nuomonės") > dev.copy(png, filename="/my/dir/barplot.i.png", height=600, width=600) > dev.off() > } > > questions: > > Is it possible to use the i somewhere _within_ a file name? (like it is > possible in other programming or scripting languages?) (Yes, but) Why are you using dev.copy? See ?png, in particular how the ``page number'' can be encoded in the 'filename' argument for multi-page output. -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using loops to create multiple images
Not sure exactly what 'results' is doing there or 'barplot(table(i),...)' does [see ?table] but I think this is sort of what you want to do? ## Variable assignment G01_01 <- 1:10 G01_02 <- 2:6 ## Combine to list* varnames <- paste("G01_",substring(100+1:2,2),sep="") vars <- lapply(`names<-`(as.list(varnames),varnames), function(x) eval(parse(text=x))) print(vars) ## Plotting for( i in 1:length(vars) ) { filenm <- paste("/my/dir/barplot",i,".png",sep="") barplot(...) dev.copy(png,filename=filenm,...) dev.off() } ## **Combining to list, step-by-step ## does the same thing as above digits <- substring(100+1:2,2) varnames <- paste("G01_",digits,sep="") vars <- as.list(varnames) names(varlist) <- vars # convert character string of variable names to # expressions via parse() and evaluate by eval() vars <- lapply(varlist,function(x) eval(parse(text=x))) print(vars) I think in many cases paste() is your answer... --- "Donatas G." <[EMAIL PROTECTED]> wrote: > I have a data.frame with ~100 columns and I need a barplot for each column > produced and saved in some directory. > > I am not sure it is possible - so please help me. > > this is my loop that does not work... > > vars <- list (substitute (G01_01), substitute (G01_02), substitute > (G01_03), > substitute (G01_04)) > results <- data.frame ('Variable Name'=rep (NA, length (vars)), > check.names=FALSE) > for (i in 1:length (vars)) { > barplot(table(i),xlab=i,ylab="NuomonÄs") > dev.copy(png, filename="/my/dir/barplot.i.png", height=600, width=600) > dev.off() > } > > questions: > > Is it possible to use the i somewhere _within_ a file name? (like it is > possible in other programming or scripting languages?) > > Since I hate to type in all the variables (they go from G01_01 to G01_10 > and > then from G02_01 to G02_10 and so on), is it possible to shorten this list > by > putting there another loop, applying some programming thing or so? > > -- > Donatas Glodenis > http://dg.lapas.info > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > that gives answers, not web links. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot in log scale
What appears to be happening is that you are plotting all the data sets but all the lines () are being plotted outside the original plotting frame. If you just plot out$a with axes=TRUE you will see what the y coordinates are. You need to explicitly set the ylim values. There is a mockup of what I think you want to do. -- a <- 5:7 b <- c(.005,.009,.01) c <- c(.05,.08,.12 ) out <- data.frame(a,b,c); out plot_colors <- c("blue","red","forestgreen","yellow") plot(out$a, log="y", type="l", col=plot_colors[1], ylim=c(.0001, 8), xlab="tittle_x_axe", ylab="title_y_axe", cex.lab=0.8, lwd=2, xaxt='n') axis(1, at=1:3) lines(out$b, type="l", pch=1, lty=1, col=plot_colors[2]) lines(out$c, type="l", pch=2, lty=2, col=plot_colors[3]) - --- akki <[EMAIL PROTECTED]> wrote: > Hi, > Forgive me, because I am new in R. > I need a graph, where: > - y axe has log scale > - graph would have 4 set values: set first (out$a) > has values over 5.5, set > second (out$b) and third (out$c) has values over > 0.005 and the last set > (out$d) has values over 0.0005 of y axe. For this > reason, I need have log > scale on y axe. > I've look for web, but I can't do the graph that I > need. > > I try to do something as: > > out <- read.table("/my_path/data.dat", header=T, > sep="\t") > plot_colors <- > c("blue","red","forestgreen","yellow") > plot(out$a, log="y", type="l", col=plot_colors[1], > axes=FALSE, ann=T, > xlab="tittle_x_axe", ylab="title_y_axe", > cex.lab=0.8, lwd=2) > axis(1, at=1:200) > lines(out$b, type="l", pch=1, lty=1, > col=plot_colors[2]) > lines(out$c, type="l", pch=2, lty=2, > col=plot_colors[3]) > lines(out$d, type="l", pch=3, lty=3, > col=plot_colors[4]) > > But, I have to only plotted the first set of > values (out$a). How can I > draw all my values? > > Thanks in advance. > > [[alternative HTML version deleted]] > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mixture of Normals with Large Data
Another possibility is to use "data squashing" methods. Relevant papers are: (1) DuMouchel et al. (1999), (2) Madigan et al. (2002), and (3) Owen (1999). Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: [EMAIL PROTECTED] - Original Message - From: "Charles C. Berry" <[EMAIL PROTECTED]> Date: Saturday, August 4, 2007 8:01 pm Subject: Re: [R] Mixture of Normals with Large Data To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch > On Sat, 4 Aug 2007, Tim Victor wrote: > > > All: > > > > I am trying to fit a mixture of 2 normals with > 110 million > observations. I > > am running R 2.5.1 on a box with 1gb RAM running 32-bit windows and > I > > continue to run out of memory. Does anyone have any suggestions. > > > If the first few million observations can be regarded as a SRS of the > > rest, then just use them. Or read in blocks of a convenient size and > > sample some observations from each block. You can repeat this process > a > few times to see if the results are sufficiently accurate. > > Otherwise, read in blocks of a convenient size (perhaps 1 million > observations at a time), quantize the data to a manageable number of > > intervals - maybe a few thousand - and tabulate it. Add the counts > over > all the blocks. > > Then use mle() to fit a multinomial likelihood whose probabilities > are the > masses associated with each bin under a mixture of normals law. > > Chuck > > > > > Thanks so much, > > > > Tim > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@stat.math.ethz.ch mailing list > > > > PLEASE do read the posting guide > > and provide commented, minimal, self-contained, reproducible code. > > > > Charles C. Berry(858) 534-2098 > Dept of > Family/Preventive Medicine > E UC San Diego >La Jolla, San Diego 92093-0901 > > __ > R-help@stat.math.ethz.ch mailing list > > PLEASE do read the posting guide > and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Understanding of Johansen test.
Dear Mark, Thanks for your response. What is your opinion (other stat-gurus as well)? is there anything wrong in my understanding here? Regards, "Leeds, Mark (IED)" <[EMAIL PROTECTED]> wrote: hi megh : that material is extremely difficult for me but I would recommend reading one more of the following to get a better idea of cointegration in the multivariate case. Johansen or Hamilton text ( very difficult for me ) Hayashi( less difficult ) Enders ( not difficult ) I think you're A[0] is the pi matrix in the econometric literature. Assuming this is the case ( if not, then Everything below is incorrect and disregard it ) If there are n equations in the system and the rank of pi is n, then there are no cointegrating vectors. If there are n equations and the rank of pi is r, then there are n-r cointegrating vectors. If there are n equations and the and the rank of pi is 1, then there are n-1 cointegrating relationships which is the maximum that there can be. I don't think the rank of pi can be less Than 1 but the determinant of pi can be zero if all it's rows ( columns ) are not linearly independent. This probably doesn't help a heck of a lot but if you want to some other book references, Let me know. A really nice readable explanation of cointegration using the matreix results ( Which I always find difficult ) Is by Lehmann and the title has the work "desiderata" in it. I don't have it in front of me But if you google "lehmann desiderata", I'm sure it will pop up. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Megh Dal Sent: Sunday, August 05, 2007 6:33 AM To: r-help@stat.math.ethz.ch Subject: [R] Understanding of Johansen test. Dear all, I am struggling to understand the johansen test procedure in the context of co-integration in time series. Yes I understand that this forum is not directly statistics related but still I am posting here hoping that I would get som help. The error correction representation of a VAR[p] model can be written as: Delta y[t] = A[0]*y[t-1] + A[1]*Delta y[t-1] +.. where, y[t] is a vector of n variables. It is said that "if the variables in system are all co-integrated, then Rank of A[0] will be different from zero" My understanding is following : suppose, y[t] is of order 3 and p = 1 Then Delta y[t] = A[0]*y[t-1] + epsilon[t] Hence : Delta y1[t] = a[11]*y1[t-1] + a[12]*y1[t-1] +a[13]*y1[t-1] + epsilon1[t] Delta y2[t] = a[12]*y1[t-1] + a[22]*y1[t-1] +a[23]*y1[t-1] + epsilon2[t] Delta y3[t] = a[31]*y1[t-1] + a[32]*y1[t-1] +a[33]*y1[t-1] + epsilon3[t] But is rank of A[0] is 0 then it is possible to find non-zero coef for all of above three equations such that : a[11]*y1[t-1] + a[12]*y1[t-1] +a[13]*y1[t-1] = 0 a[12]*y1[t-1] + a[22]*y1[t-1] +a[23]*y1[t-1] = 0 a[12]*y1[t-1] + a[22]*y1[t-1] +a[23]*y1[t-1] = 0 therefore number of co-integrating relationship is 3 am I correct? Therefore in my understanding : if variables in a system show some co-integrating relationship thenrank should be close to zero. Am I making any mistakes? Can anyone here clarify me? Regards, Megh - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question regarding QT device
On 8/5/07, Saptarshi Guha <[EMAIL PROTECTED]> wrote: > Hi, > After a few modifications in the makefiles, I successfully compiled > the Qt device (written by Deepayan Sirkar) for OS X 10.4.9 on a > Powerbook. Cool, can you send me the modifications? I haven't managed to compile qtutils on OS X yet (not that I've tried too hard). > However when loading into R > > If i remove this line from zzz.R in qtutils/R > > grDevices::deviceIsInteractive("QT") > > and then install > >library(qtutils) > > loads fine and the QT() calls returns a QT window, however, if i > switch to another application and then switch back to the R GUI, the > menubar has disappeared. I can't test this, but Qt is designed to run as the main application, so it's possible that it is overriding whatever the GUI is doing. > If I do not remove the line > > grDevices::deviceIsInteractive("QT") > > the following error appears an qtutils does not load > Error : 'deviceIsInteractive' is not an exported object from > 'namespace:grDevices' > Error : .onLoad failed in 'loadNamespace' for 'qtutils' > Error: package/namespace load failed for 'qtutils' > > Could anyone provide some pointers to get that deviceIsInteractive > to work? What's your R version? Do you see this in 2.5.1? -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question regarding QT device
grDevices::deviceIsInteractive is only in the unreleased R-devel version of R: which version are you using? Please do study the R posting guide: we do ask for basic information for a good reason, and do ask for questions on packages (especially unreleased packages) to be sent to the maintainer. On Sun, 5 Aug 2007, Saptarshi Guha wrote: > Hi, > After a few modifications in the makefiles, I successfully compiled > the Qt device (written by Deepayan Sirkar) for OS X 10.4.9 on a > Powerbook. > However when loading into R > > If i remove this line from zzz.R in qtutils/R > > grDevices::deviceIsInteractive("QT") > > and then install > >library(qtutils) > > loads fine and the QT() calls returns a QT window, however, if i > switch to another application and then switch back to the R GUI, the > menubar has disappeared. > > If I do not remove the line > > grDevices::deviceIsInteractive("QT") > > the following error appears an qtutils does not load > Error : 'deviceIsInteractive' is not an exported object from > 'namespace:grDevices' > Error : .onLoad failed in 'loadNamespace' for 'qtutils' > Error: package/namespace load failed for 'qtutils' > > Could anyone provide some pointers to get that deviceIsInteractive > to work? > > Thanks for your time > Saptarshi > > Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha > > > [[alternative HTML version deleted]] > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setup trellis.device to color=F inside the xyplot function
Ronaldo Reis Junior said the following on 8/5/2007 6:18 AM: > Hi, > > it is possible to setup trellis.device(color=F) inside teh function xyplot? > > I try to use > >> xyplot(ocup~tempo| > nitro+estacao,col="white",ylim=c(0,0.7),par.settings=list(color=F)) > > But dont work, the only way that work for me is call the function > >> trellis.device(color=F) > > before the xyplot, but in this way it open a new device for each run. I like > that is use the same device every time, but without colors. > > Thanks > Ronaldo Try this: library(lattice) x <- 1:10 y <- 1:10 g <- rep(1:2, 5) # no color xyplot(y ~ x | g, par.settings = standard.theme(color = FALSE)) xyplot(y ~ x, groups = g, par.settings = standard.theme(color = FALSE)) # with color xyplot(y ~ x | g) xyplot(y ~ x, groups = g) HTH, --sundar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] "super" class
Hi R Gurus How would a combine several classes into one super class, please For example, I want to combine ts, zoo, and its into one big overall class, please. I've tried several variations on the themes of setOldClass, but no luck. Thank you for any help. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Question regarding QT device
Hi, After a few modifications in the makefiles, I successfully compiled the Qt device (written by Deepayan Sirkar) for OS X 10.4.9 on a Powerbook. However when loading into R If i remove this line from zzz.R in qtutils/R grDevices::deviceIsInteractive("QT") and then install >library(qtutils) loads fine and the QT() calls returns a QT window, however, if i switch to another application and then switch back to the R GUI, the menubar has disappeared. If I do not remove the line grDevices::deviceIsInteractive("QT") the following error appears an qtutils does not load Error : 'deviceIsInteractive' is not an exported object from 'namespace:grDevices' Error : .onLoad failed in 'loadNamespace' for 'qtutils' Error: package/namespace load failed for 'qtutils' Could anyone provide some pointers to get that deviceIsInteractive to work? Thanks for your time Saptarshi Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Plot in log scale
Hi, Forgive me, because I am new in R. I need a graph, where: - y axe has log scale - graph would have 4 set values: set first (out$a) has values over 5.5, set second (out$b) and third (out$c) has values over 0.005 and the last set (out$d) has values over 0.0005 of y axe. For this reason, I need have log scale on y axe. I've look for web, but I can't do the graph that I need. I try to do something as: out <- read.table("/my_path/data.dat", header=T, sep="\t") plot_colors <- c("blue","red","forestgreen","yellow") plot(out$a, log="y", type="l", col=plot_colors[1], axes=FALSE, ann=T, xlab="tittle_x_axe", ylab="title_y_axe", cex.lab=0.8, lwd=2) axis(1, at=1:200) lines(out$b, type="l", pch=1, lty=1, col=plot_colors[2]) lines(out$c, type="l", pch=2, lty=2, col=plot_colors[3]) lines(out$d, type="l", pch=3, lty=3, col=plot_colors[4]) But, I have to only plotted the first set of values (out$a). How can I draw all my values? Thanks in advance. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] t-test
On 05/08/07, sigalit mangut-leiba <[EMAIL PROTECTED]> wrote: > Hi all, > i'm trying to run a two-sample t-test. > i have a longitudinal data that looks like this: > idnageclass > 1 22 1 > 1 22 1 > 1 22 1 > 1 22 1 > 2 63 3 > 2 63 3 > 2 63 3 > 3 43 2 > 3 43 2 > 3 43 2 > 3 43 2 > 3 43 2 > 3 43 2 > 4 37 1 > 4 37 1 > 4 37 1 > . > . > . > i want to compare between class 1 and 3, to see if there are any age > differences, and want to count every id once (not all id numbers have the > same number of rows). > i tried: > > with( > > subset(x1, !duplicated(idn)), > > t.test(age[class=="1"],age[hidclass=="3"] > > ) > > but it didn't work. any suggestions? > If you had given us the error message that you (presumably) had received, then the solution is much easier: You use the code: age[hidclass=="3"] Did you try running this before expected t.test() to function on it? Try going through this: # it is always best to generate some test data # often doing this will illustrate the best way of doing things! d1 <- data.frame(idn =sort(rep(1:100,each=4)), age = rep(round(rnorm(100, mean=50, sd=15)), each=4), class =rep(round(rnorm(100, mean=3, sd=1)), each=4)) # exclude duplicated d1.s <- d1[!duplicated(d1$id),] t.test(d1.s[class==3,], d1.s[class==1,]) Mark -- Dr. Mark Wardle Clinical research fellow and specialist registrar, Neurology Cardiff, UK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bars' values on barplot
Much better. Thanks! Mark On 05/08/07, Adrian Dusa <[EMAIL PROTECTED]> wrote: > On Sunday 05 August 2007, Mark Wardle wrote: > > [...] > > So, try this for starters: > > my.values=1:5 > > x <- barplot(my.values, ylim=c(0,7)) > > text(x, 0.4+my.values, "wibble") > > Mark, you could use the "pos" argument from ?par: > > my.values=10:15 > x <- barplot(my.values, ylim=c(0,11)) > > text(x, my.values, "wibble", pos=3) # always does what you want, whereas: > > text(x, 0.4+my.values, "wibble") # doesn't look very nice > > HTH, > Adrian > > > -- > Adrian Dusa > Romanian Social Data Archive > 1, Schitu Magureanu Bd > 050025 Bucharest sector 5 > Romania > Tel./Fax: +40 21 3126618 \ > +40 21 3120210 / int.101 > > > __ > This email has been scanned by the MessageLabs Email Security System. > For more information please visit http://www.messagelabs.com/email > __ > -- Dr. Mark Wardle Clinical research fellow and specialist registrar, Neurology Cardiff, UK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bars' values on barplot
On Sunday 05 August 2007, Mark Wardle wrote: > [...] > So, try this for starters: > my.values=1:5 > x <- barplot(my.values, ylim=c(0,7)) > text(x, 0.4+my.values, "wibble") Mark, you could use the "pos" argument from ?par: my.values=10:15 x <- barplot(my.values, ylim=c(0,11)) text(x, my.values, "wibble", pos=3) # always does what you want, whereas: text(x, 0.4+my.values, "wibble") # doesn't look very nice HTH, Adrian -- Adrian Dusa Romanian Social Data Archive 1, Schitu Magureanu Bd 050025 Bucharest sector 5 Romania Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] setup trellis.device to color=F inside the xyplot function
Hi, it is possible to setup trellis.device(color=F) inside teh function xyplot? I try to use > xyplot(ocup~tempo| nitro+estacao,col="white",ylim=c(0,0.7),par.settings=list(color=F)) But dont work, the only way that work for me is call the function > trellis.device(color=F) before the xyplot, but in this way it open a new device for each run. I like that is use the same device every time, but without colors. Thanks Ronaldo -- Our similarities are different. -Dale Berra, son of Yogi -- > Prof. Ronaldo Reis Júnior | .''`. UNIMONTES/Depto. Biologia Geral/Lab. de Ecologia | : :' : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia | `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil | `- Fone: (38) 3229-8187 | [EMAIL PROTECTED] | [EMAIL PROTECTED] | http://www.ppgcb.unimontes.br/ | ICQ#: 5692561 | LinuxUser#: 205366 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Any "special interest" in R/pic interface?
Hi Folks, I'm wondering if there are people out there who would be interested in what would be involved in developing an interface between R graphics and the 'pic' language. Explanation; 'pic' has been part of the Unix 'troff' typesetting suite since very early days (1970s), and also of the GNU troff: 'groff'. Its function is to act as a preprocessor, translating textual descriptions of graphical displays into the formatting language used by troff. Example: .PS ## Need x- and y-scale factors to exist before referring to them xsc=1.0 ; ysc=1.0 ## Define the basic graphics object: the histogram bar ## uses positional parameters $1, 42, $3, $4 in the data line define bar { box width ($2 - $1)*xsc height $3*ysc \ with .sw at ($1*xsc,0) fill $4 } ## Draw the basic histogram xsc=1.0 ; ysc=0.75 copy thru bar until "EOT" -2.5 2.5 31.0 0.25 2.5 7.5 69.0 0.25 7.5 12.5 50.0 0.25 12.5 17.5 0.0 0.25 17.5 22.5 0.0 0.25 22.5 27.5 0.0 0.25 27.5 32.5 0.0 0.25 32.5 37.5 1.0 0.25 37.5 42.5 1.0 0.25 42.5 47.5 0.0 0.25 EOT .PE which will draw the histogram bars, each with a black border, and grey-filled at "grey level" 0.25. The above is readily entered by hand (with the data copied from R or imported from a file). But it is a very simple example. Each bar is a 'pic' "box" object, with width equal to the difference between its upper and lower breakpoints (so variable-width can be accomodated), scaled by factor 'xsc'. The placement of the box is set by specifying that its SouthWest corner ".sw" is at (x,0) where 'x' is the lower of the two breakpoints for the box. When a histogram (one of my histograms, "MH", in this case) has been constructed by MH <- hist(..., plot=FALSE) the first two columns above are available as MH$breaks[1:10] and MH$breaks[2:11]. The height of each box is set to the value of the 3rd column, scaled by factor 'ysc'; the values in the 3rd column are available in MH$counts. The 4th column (grey-level) is whatever you like. The whole array can be constructed in R using a simple cbind(), and it is easy to see how to write an R routine which would output the whole of the above code block to a file, which could then be copied into a troff source document (which is ASCII text throughout). The above example is of course the bare basics. You would need to add extra 'pic' statements to draw the axes, with coordinate values as annotations (this is a straightforward loop in 'pic'). You could amend the code to cause the count-values (when non-zero) to be placed on the tops of the bars as follows: define bar { box width ($2 - $1)*xsc height $3*ysc \ with .sw at ($1*xsc,0) fill $4 if($3>0) then { sprintf("%.0f",$3) "" at top of last box } } The "top" of the box is the midpoint of its top side, and the function sprintf("%.0f",$3) does what R users would expect, producing a text object which is then stacked on top of the empty text object "", the whole being vertically and horizontally centred at the "top of the box" (this ensures that the visible text is just above the top side; otherwise it would be vertically centred on, i.e. cut by, this line). You can readily add further annotations; and adjust features of annotations at pleasure-- e.g. to set them in 2-point smaller size than the point-size for the main text, and in italic type, you could modify the sprintf() to read: sprintf("\s-2\fI%.0f\\fP\s0",$3) (\s-2 makes the type 2 points smaller; \s0 restores the previous point size; \fI switches to Italic style, \fP switches back to the Previous style). You can easily do other things, e.g. rotated text. Such a segment of 'pic' language can either be incorporated into a troff document, within the main text of the document (so would appear as a Figure in the text), or could be a self-contained troff document which could be used to generate an EPS (Encapsulate PostScript) file which could then be imported into any document (or supplied as a stand-alone Figure to a journal publisher who accepts stand-alone Figures in this format; etc). I've given a simple example, to illustrate (a) getting data from R into 'pic' and (b) some of the detailed control one has over the layout and appearance of the result. However, my real interest here is in more complicated graphics which can be generated by R, such as the Trellis Plots in lattice graphics. For instance, the example for xyplot(): xyplot( Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species, data = iris, allow.multiple = TRUE, scales = "free", layout = c(2, 2), auto.key = list(x = .6, y = .7, corner = c(0, 0)) ) lies well within the capabilitis of 'pic'. What one would need, to make this kind of thing generic, would be a) 'pic' code to specify the basic primitives of the plot (analagous to the "define bar{...}" in my histogram example) b) Layout information c) Acess to the numerical (and any textual) information which determines the position and value of plotted objects c) Export to a fil
[R] t-test
Hi all, i'm trying to run a two-sample t-test. i have a longitudinal data that looks like this: idnageclass 1 22 1 1 22 1 1 22 1 1 22 1 2 63 3 2 63 3 2 63 3 3 43 2 3 43 2 3 43 2 3 43 2 3 43 2 3 43 2 4 37 1 4 37 1 4 37 1 . . . i want to compare between class 1 and 3, to see if there are any age differences, and want to count every id once (not all id numbers have the same number of rows). i tried: with( subset(x1, !duplicated(idn)), t.test(age[class=="1"],age[hidclass=="3"] ) but it didn't work. any suggestions? thank you, sigalit. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] null hypothesis for two-way anova
Dear R community, Confused by some of my lab results I ask for the definition of the null hypothesis of a two-way analysis of variance in R (anova() and aov()). Starting with the following model y = a_i + b_j , i in A and j in B is the tested null hypothesis H_0: a_i = 0 for all i in A or H_0: a_m = a_n for any m and n in A? Consequently the same questions for interaction effects. Starting with the model y = a_i + b_j + f_ij , i in A and j in B is the tested null hypothesis H_0: f_ij = 0 for all i in A and j in B or H_0: f_ij = f_mn for any i and m in A and j and n in B? More specific to R the questions is formulated as R code in the appendix. Unfortunately I did not find this in the documentation. Thanks in advance, Axel Appendix: = > A = factor(c(rep(1, 10), rep(0, 5))) > B = factor(c(rep(0, 5), rep(1, 10))) > y = c(rnorm(n = 5, mean = 2, sd = 1), rnorm(n = 5, mean = 6, sd = 1), rnorm(n = 5, mean = 4, sd = 1)) > anova(lm(formula = y ~ 1 + A + B, data = data.frame(y, A, B))) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) A 1 5.041 5.041 5.5601 0.03618 * B 1 58.643 58.643 64.6851 3.56e-06 *** Residuals 12 10.879 0.907 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Which null hypothesis has been tested for the P-value in the fifth column of the output ("Pr(>F)")? For example the P-value 0.03618 for factor A. > y = c(rnorm(n = 5, mean = 2, sd = 1), rnorm(n = 5, mean = 10, sd = 1), rnorm(n = 5, mean = 4, sd = 1)) > anova(lm(formula = y ~ 1 + A + B + A:B, data = data.frame(y, A, B))) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F valuePr(>F) A 1 9.235 9.235 5.3243 0.03966 * B 1 129.623 129.623 74.7321 1.686e-06 *** Residuals 12 20.814 1.734 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 How do I draw a P-value for the interaction effect for the anova? > sessionInfo() Version 2.3.1 (2006-06-01) i386-pc-mingw32 attached base packages: [1] "methods" "stats" "graphics" "grDevices" "utils" "datasets" "base" -- *** Dipl. Math. ETH Axel Rasche Max-Planck-Institute for Molecular Genetics Department Lehrach (Vertebrate Genomics) Ihnestrasse 63-73 D-14195 Berlin-Dahlem GERMANY Tel. ++49-30-8413-1647 Fax ++49-30-8413-1128 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Understanding of Johansen test.
Dear all, I am struggling to understand the johansen test procedure in the context of co-integration in time series. Yes I understand that this forum is not directly statistics related but still I am posting here hoping that I would get som help. The error correction representation of a VAR[p] model can be written as: Delta y[t] = A[0]*y[t-1] + A[1]*Delta y[t-1] +.. where, y[t] is a vector of n variables. It is said that "if the variables in system are all co-integrated, then Rank of A[0] will be different from zero" My understanding is following : suppose, y[t] is of order 3 and p = 1 Then Delta y[t] = A[0]*y[t-1] + epsilon[t] Hence : Delta y1[t] = a[11]*y1[t-1] + a[12]*y1[t-1] +a[13]*y1[t-1] + epsilon1[t] Delta y2[t] = a[12]*y1[t-1] + a[22]*y1[t-1] +a[23]*y1[t-1] + epsilon2[t] Delta y3[t] = a[31]*y1[t-1] + a[32]*y1[t-1] +a[33]*y1[t-1] + epsilon3[t] But is rank of A[0] is 0 then it is possible to find non-zero coef for all of above three equations such that : a[11]*y1[t-1] + a[12]*y1[t-1] +a[13]*y1[t-1] = 0 a[12]*y1[t-1] + a[22]*y1[t-1] +a[23]*y1[t-1] = 0 a[12]*y1[t-1] + a[22]*y1[t-1] +a[23]*y1[t-1] = 0 therefore number of co-integrating relationship is 3 am I correct? Therefore in my understanding : if variables in a system show some co-integrating relationship thenrank should be close to zero. Am I making any mistakes? Can anyone here clarify me? Regards, Megh - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Invert Likert-Scale Values
(Ted Harding) wrote: > On 04-Aug-07 22:02:33, William Revelle wrote: > >> Alexis and John, >> >> To reverse a Likert like item, subtract the item from the maximum >> acceptable value + the minimum acceptable value, >> That is, if >> x <- 1:8 >> xreverse <- 9-x >> >> Bill >> > > A few of us have suggested this, but Alexis's welcome for the > recode() suggestion indicates that by the time he gets round to > this his Likert scale values have already become levels of a factor. > > Levels "1", "2", ... of a factor may look like integers, but they're > not; and R will not let you do arithmetic on them: > > >> x<-factor(c(1,1,1,2,2,2)) >> x >> > [1] 1 1 1 2 2 2 > Levels: 1 2 > >> y<-(3-x) >> > Warning message: > "-" not meaningful for factors in: Ops.factor(3, x) > >> y >> > [1] NA NA NA NA NA NA > > However, you can turn them back into integers, reverse, and then > turn the results back into a factor: > > >> y <- factor(3 - as.integer(x)) >> y >> > [1] 2 2 2 1 1 1 > Levels: 1 2 > > So, even for factors, the insight undelying our suggestion of "-" > is still valid! :) > Er, wouldn't y <- factor(x, levels=2:1, labels=1:2) be more to the point? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bars' values on barplot
Akki, 1. Have a look at the result from a barplot() call. >From the help: A numeric vector (or matrix, when beside = TRUE), say mp, giving the coordinates of all the bar midpoints drawn, useful for adding to the graph. 2. Have a look at the text() function that will annotate plots! So, try this for starters: my.values=1:5 x <- barplot(my.values, ylim=c(0,7)) text(x, 0.4+my.values, "wibble") Hope this helps, Best wishes, Mark On 04/08/07, akki <[EMAIL PROTECTED]> wrote: > Hi, > I need bars' values on barplot, and I don't know how I can put it. I do my > barplot as: > > data<-read.table("/my_path/file.dat",header=T, sep="\t") > barplot(as.matrix(data),log="y",beside=TRUE,main="my_title", xlab="x name", > ylab="y name"). > > How can I add the values on each bar? > > Thanks.. > > [[alternative HTML version deleted]] > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > __ > This email has been scanned by the MessageLabs Email Security System. > For more information please visit http://www.messagelabs.com/email > __ > -- Dr. Mark Wardle Clinical research fellow and specialist registrar, Neurology Cardiff, UK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] - round() strange behaviour
I use round() to prepare numbers for display in Sweave documents. While I had appreciated the "round to even" issue before, I had not given it much thought when automatically writing my output functions/scripts/Sexprs... On 04/08/07, Ted Harding <[EMAIL PROTECTED]> wrote: > On 04-Aug-07 21:57:28, John Logsdon wrote: > > I must admit I had never realised this so thanks to Monica for > > raising it. > > > > Round-to-even is used for statistical reasons and there is some > > point as it obviously reduces bias. But why should the decision > > be to round to the nearest even number? rounding to the nearest > > odd number would be equally valid. (signif(x,0) is the same as > > round(). ) > > A fair point! But see below. > > > There is a debate and sometimes you need symmetric rounding. > > Perhaps there should be an option in round that defaults to > > round-to-even for compatibility but includes the various other > > rounding approaches as seen for example in > > http://en.wikipedia.org/wiki/Rounding > > As wikipedia says: "Round-to-even is used rather than round-to-odd > as the latter rule would prevent rounding to a result of zero." > > And there's also stochastic rounding -- toss a penny. But you wouldn't > get the same answer next time. > > And I can recall -- man years ago -- being in an environment where > the practice was to round alternately up and down. (These were > engineers). > > John's comparisons between FORTRAN, octave and (implicitly) R > are interesting. > > As a general reflection: any method of rounding involves throwing > away some of the information in the data, and this will have > consequences. So the question is: what consequences are you > happiest with? > > One consequence of rounding to nearest even (or nearest odd) is > that this can give the worst possible outcome in terms of > > (rounded X - rounded Y) compared with (unrounded X - unrounded Y). > > For example, rounding to even integer X =1.5 --> 2.0 and > Y = 0.5 --> 0.0 gives > > X - Y = 1.0 , rounded X - rounded Y = 2.0 > > whereas the difference is 1.0 if you always round up, or always down, > and this is also exactly the difference between the unrounded values.. > > Rounding to nearest odd would give X = 1.5 --> 1.0, Y = 0.5 --> 1.0 > thus difference 0.0 in this case, but again difference 2.0 for > X = 2.5 --> 3.0, Y = 1.5 --> 1.0 > > And alternate rounding would give either 2.0 or 0.0 depending > on the phase of X and Y. > > Thus "always up" or "always down" means that the difference between > rounded numbers is always the same as between the unrounded numbers > which for other methods the differences can differ by 2.0. > This may or may not matter, but it is something to think about > when choosing a rounding method. > > > [...] > > Best wishes to all, > Ted. > > > E-Mail: (Ted Harding) <[EMAIL PROTECTED]> > Fax-to-email: +44 (0)870 094 0861 > Date: 04-Aug-07 Time: 23:53:50 > -- XFMail -- > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > __ > This email has been scanned by the MessageLabs Email Security System. > For more information please visit http://www.messagelabs.com/email > __ > -- Dr. Mark Wardle Clinical research fellow and specialist registrar, Neurology Cardiff, UK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.