Re: [R] Dataset on Baltimore home energy costs
elect: Total cost of electricity (including delivery and commodity charges) With Maryland's BGE/PSC/State Legislature rate control, does anybody _really_ know what the total cost is? That may be what you paid then, but was that the total cost? The cost data may be difficult to compare with non rate controlled utility cost data. Richard Graham __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] binom.test [Broadcast]
To quote one of the previous answers you've got: The formula you're using is the TV. The one binom.test() uses is the ballpark. Take your pick. Andy From: Ethan Johnsons Thank you for the info. It helps. After all, it would be: 0.1304348-1.96*(sqrt((0.1304348*(1-0.1304348))/46)) [1] 0.03310968 0.1304348+1.96*(sqrt((0.1304348*(1-0.1304348))/46)) [1] 0.2277599 Does R have a function for the calculation above? ej On 10/20/06, Francisco J. Zagmutt [EMAIL PROTECTED] wrote: Ethan, You need to explain why you think this is not the right function to use. R is doing exactly what you are asking it to do. Now is up to you to choose the methodology you feel is correct. For a good discussion on your particular issue I recommend you the following reference: A. Agresti and B. A. Coull, Approximate is better than exact for interval estimation of binomial proportions, The American Statistician, vol. 52, no. 2, pp. 119-126, 1998. Once you figure out the right function to use see if the function is available in R. If not readily available, and if after searching through R's documentation and the forum archives you still can't find a way to perform the calculation, then is time to get back to this forum. Regards, Francisco Dr. Francisco J. Zagmutt College of Veterinary Medicine and Biomedical Sciences Colorado State University From: Ethan Johnsons [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Subject: [R] binom.test Date: Fri, 20 Oct 2006 17:18:02 -0400 A quick question, please. 46 e coli lab samples are tested, 6 of them returned positive. So, the best point estimate for p is 6/46 = 0.1304348. For a 95% CI for p, I thought binom.test would give me the correct result, but it seems it is not the right function to use. What is the R function for this? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot.mca in MASS
Hello All, Is it possible to plot each factor in a different color on the multiple correspondence plot? What is the implication and statistical meaning behind the coordinates for the rows on the plot? Could I just plot a symbol instead of a number for the coordinates? Thanks! marco __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot.POSIXct plot.POSIXlt
Hi, Just to signal that when I want to plot POSIXct variable on x using format within plot(), I get what I want on the plot but with a number of warnings: plot(y~x,format=%y-%m) Warning messages: 1: format is not a graphical parameter in: plot.window(xlim, ylim, log, asp, ...) 2: format is not a graphical parameter in: plot.xy(xy, type, pch, lty, col, bg, cex, lwd, ...) 3: format is not a graphical parameter in: axis(side, at, labels, tick, line, pos, outer, font, lty, lwd, 4: format is not a graphical parameter in: box(which = which, lty = lty, ...) 5: format is not a graphical parameter in: title(main, sub, xlab, ylab, line, outer, ...) I suppose that format may not be at the right place in plot() or/and not handled by the functions called from there, however the documentation (?plot.POSIXct) seems to allow passing this argument: ...: Further arguments to be passed from or to other methods, typically graphical parameters or arguments of 'plot.default'. For the 'plot' methods, also 'format'. Any comment? Patrick __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] one is not one
Folks, I have got a strange behaviour when testing this: sum(x) != 1 let us set x-c(70,134,1,5,0) and transform it in a vector of probabilities x-x/sum(x) One expect sum(x) should be equal to 1, which is apparently the case sum(x) [1] 1 However, when I try to test it I get: if(sum(x) !=1) print(lost) else (OK) [1] lost Which means that actually sum(x) is NOT considered equal to 1... Any idea about what is going wrong? Patrick __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] one is not one
Patrick, the problem arises because computers can't exactly represent real numbers. Try this for your test: if(!all.equal(sum(x),1)) print(lost) else (OK) see FAQ http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f I hope that this helps Andrew On Sat, Oct 21, 2006 at 11:04:13AM +0200, Patrick Giraudoux wrote: Folks, I have got a strange behaviour when testing this: sum(x) != 1 let us set x-c(70,134,1,5,0) and transform it in a vector of probabilities x-x/sum(x) One expect sum(x) should be equal to 1, which is apparently the case sum(x) [1] 1 However, when I try to test it I get: if(sum(x) !=1) print(lost) else (OK) [1] lost Which means that actually sum(x) is NOT considered equal to 1... Any idea about what is going wrong? Patrick __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Andrew Robinson Department of Mathematics and StatisticsTel: +61-3-8344-9763 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599 http://www.ms.unimelb.edu.au/~andrewpr http://blogs.mbs.edu/fishing-in-the-bay/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] one is not one
On Sat, 21 Oct 2006, Patrick Giraudoux wrote: Folks, I have got a strange behaviour when testing this: sum(x) != 1 FAQ 7.31 Why doesn't R think these numbers are equal?: print(sum(x), digits=20) identical(sum(x), 1) all.equal(sum(x), 1) let us set x-c(70,134,1,5,0) and transform it in a vector of probabilities x-x/sum(x) One expect sum(x) should be equal to 1, which is apparently the case sum(x) [1] 1 However, when I try to test it I get: if(sum(x) !=1) print(lost) else (OK) [1] lost Which means that actually sum(x) is NOT considered equal to 1... Any idea about what is going wrong? Patrick __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cardinality constraint
To solve the problem you have in general, you will need something more powerful than 'constrOptim'. This is called a mixed integer problem, and arises in several fields including financial portfolio optimization. If you really only have 3 variables, then there are only 3 combinations of 2 variables. So just solve the problem with each set and pick the best one. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and A Guide for the Unwilling S User) Neuro LeSuperHéros wrote: Hello, How do I implement a cardinality constraint with constrOptim? I want to minimize (least square) a%*%x = 4 subject to x12 x21 x34 count(x1, x2, x3)= 2 (cardinality constraint) Is there a way to specify binary integer variables with constrOptim? Here's my code so far: a -matrix(1:3,1,3) fr - function(x) { (a%*%x-4)^2 } constrOptim(c(1,0.5,3),fr,grad=NULL,ui=-diag(3), ci=c(-2,-1,-4)) I need the optimization to give me one variable that is zero to satisfy the maximum cardinality of 2. Thanks _ Voyez vos amis en faisant un appel vidèo dans Windows Live Messenger __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Constructing predictions from HPDinterval() after lmer()
Dear r-helpers, Following up on http://finzi.psych.upenn.edu/R/Rhelp02a/archive/ 81159.html where Douglas Bates gives a helpful application of lmer() to data(sleepstudy, package = 'lme4'), I need a bit more help in order to plot the correct confidence intervals of a designed experiment such as: data(ratdrink, package = 'faraway') I follow the steps Douglas took in his example: summary( rd.er - lmer(wt ~ weeks*treat + (1 | subject), data = ratdrink) ) Linear mixed-effects model fit by REML Formula: wt ~ weeks * treat + (1 | subject) Data: ratdrink AIC BIC logLik MLdeviance REMLdeviance 962.4 982.8 -474.2 964.3948.4 Random effects: Groups NameVariance Std.Dev. subject (Intercept) 71.206 8.4384 Residual 51.222 7.1570 number of obs: 135, groups: subject, 27 Fixed effects: Estimate Std. Error t value (Intercept)52.8800 3.1928 16.56 weeks 26.4800 0.7157 37.00 treatthiouracil 4.7800 4.51531.06 treatthyroxine -0.7943 4.9756 -0.16 weeks:treatthiouracil -9.3700 1.0121 -9.26 weeks:treatthyroxine0.6629 1.11530.59 Correlation of Fixed Effects: (Intr) weeks trtthr trtthy wks:trtthr weeks -0.448 treatthircl -0.707 0.317 treatthyrxn -0.642 0.288 0.454 wks:trtthrc 0.317 -0.707 -0.448 -0.203 wks:trtthyr 0.288 -0.642 -0.203 -0.448 0.454 rd.mc - mcmcsamp(rd.er, 5) library(coda) HPDinterval(rd.mc) lower upper (Intercept)46.420404 59.406398 weeks 25.070131 27.930363 treatthiouracil-4.420942 14.009291 treatthyroxine-10.758369 9.435761 weeks:treatthiouracil -11.404620 -7.337025 weeks:treatthyroxine -1.603858 2.842704 log(sigma^2)3.683085 4.226153 log(sbjc.(In)) 3.633853 4.955867 deviance 965.750351 980.825613 attr(,Probability) [1] 0.95 ***To make sure my request is clear*** What I need is the analog of what is produced by all.effects() in the effects package: summary(rd - lm(wt ~ weeks*treat, data = ratdrink)) Call: lm(formula = wt ~ weeks * treat, data = ratdrink) Residuals: Min 1Q Median 3Q Max -23.514 -6.660 0.230 6.914 28.343 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept)52.8800 2.6547 19.919 2e-16 *** weeks 26.4800 1.0838 24.433 2e-16 *** treatthiouracil 4.7800 3.7544 1.2730.205 treatthyroxine -0.7943 4.1371 -0.1920.848 weeks:treatthiouracil -9.3700 1.5327 -6.113 1.08e-08 *** weeks:treatthyroxine0.6629 1.6890 0.3920.695 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 10.84 on 129 degrees of freedom Multiple R-Squared: 0.9121, Adjusted R-squared: 0.9087 F-statistic: 267.8 on 5 and 129 DF, p-value: 2.2e-16 rd.eff - all.effects(rd) rd.ci - data.frame(y = rd.eff[[1]]$fit, Lower = rd.eff[[1]] $lower, Upper = rd.eff[[1]]$upper) _ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O.Box 400400Charlottesville, VA 22904-4400 Parcels:Room 102Gilmer Hall McCormick RoadCharlottesville, VA 22903 Office:B011+1-434-982-4729 Lab:B019+1-434-982-4751 Fax:+1-434-982-4766 WWW:http://www.people.virginia.edu/~mk9y/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] one is not one
On Sat, Oct 21, 2006 at 11:04:13AM +0200, Patrick Giraudoux wrote: Which means that actually sum(x) is NOT considered equal to 1... Any idea about what is going wrong? Others have already pointed out the problem and I would like to add a reference to a classical paper on this topic: David Goldberg What Every Computer Scientist Should Know About Floating-Point Arithmetic Computing Surveys, 1991 Reprints can be found in many places online - e.g. here: http://www.physics.ohio-state.edu/~dws/grouplinks/floating_point_math.pdf cu Philipp -- Dr. Philipp PagelTel. +49-8161-71 2131 Dept. of Genome Oriented Bioinformatics Fax. +49-8161-71 2186 Technical University of Munich Science Center Weihenstephan 85350 Freising, Germany and Institute for Bioinformatics / MIPS Tel. +49-89-3187 3675 GSF - National Research Center Fax. +49-89-3187 3585 for Environment and Health Ingolstädter Landstrasse 1 85764 Neuherberg, Germany http://mips.gsf.de/staff/pagel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems running IsoMDS using vegdist with pres-abs data and two sites with zero distance
Hi I have just (finally) started to poke around in R and wanted to analyse a stream fish dataset with 28 sites and 18 species. When trying to follow the Vegan manual to run nmds from distance measures calculated by the vegdist function it turns out that I have two sites (streams) with the exactly the same four species (I have used pres-abs data in this case). When I try to run isoMDS I get an error message saying that: Error in isoMDS(vare.dis) : zero or negative distance between objects 14 and 15 i.e. the two sites with the same fish species. I can run the decorana function with sensible results so the dataset seems to be ok. My next step would be to compare the results of the fish ordination with some other datasets from the same sites (i.e. through procrustes rotation) - so I wouldn´t like to remove one of the twin sites. Any suggestion on how to circumvent this or am I simply using the wrong method? Best wishes, Leonard -- Dr. Leonard Sandin Department of Environmental Assessment / Miljöanalys Swedish University of Agricultural Sciences / SLU P.O. Box 7050 SE-750 07 Uppsala Sweden Work phone +46-(0)18-673813 fax+46-(0)18-673156 http://www.ma.slu.se -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ReadLines question
I'm getting the following error: headerinfo=readLines(met_station_file,n=8) headerinfo [1] Plot Title: tahoe met validation ,,, [2]Error: invalid multibyte string met_station_file's first 8 lines are as follows: Plot Title: tahoe met validation ,,, #,Time, GMT-07:00,Temp, F,Coupler Attached,Host Connected,Coupler Detached,Stopped,End Of File 34,10/1/2005 0:00,49.937, 35,10/1/2005 0:30,47.266, 36,10/1/2005 1:00,47.446, 37,10/1/2005 1:30,47.982, 38,10/1/2005 2:00,48.517, 39,10/1/2005 2:30,49.228, Why am I getting this error? Are those quotation marks causing the hiccup? If so, how do I get around this programmatically? --j -- Jonathan A. Greenberg, PhD NRC Research Associate NASA Ames Research Center MS 242-4 Moffett Field, CA 94035-1000 Office: 650-604-5896 Cell: 415-794-5043 AIM: jgrn307 MSN: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ReadLines question
On Sat, 21 Oct 2006, Jonathan Greenberg wrote: I'm getting the following error: headerinfo=readLines(met_station_file,n=8) headerinfo [1] Plot Title: tahoe met validation ,,, [2]Error: invalid multibyte string met_station_file's first 8 lines are as follows: Plot Title: tahoe met validation ,,, #,Time, GMT-07:00,Temp, F,Coupler Attached,Host Connected,Coupler ^^^ or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a multibyte string. Is there a mismatch between the encoding (see ?locales) of the file and the machine into which you are reading? Detached,Stopped,End Of File 34,10/1/2005 0:00,49.937, 35,10/1/2005 0:30,47.266, 36,10/1/2005 1:00,47.446, 37,10/1/2005 1:30,47.982, 38,10/1/2005 2:00,48.517, 39,10/1/2005 2:30,49.228, Why am I getting this error? Are those quotation marks causing the hiccup? If so, how do I get around this programmatically? --j -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems running IsoMDS using vegdist with pres-abs data and two sites with zero distance
On Sat, 2006-10-21 at 13:19 +0200, Leonard Sandin wrote: Hi I have just (finally) started to poke around in R and wanted to analyse a stream fish dataset with 28 sites and 18 species. When trying to follow the Vegan manual to run nmds from distance measures calculated by the vegdist function it turns out that I have two sites (streams) with the exactly the same four species (I have used pres-abs data in this case). When I try to run isoMDS I get an error message saying that: Error in isoMDS(vare.dis) : zero or negative distance between objects 14 and 15 i.e. the two sites with the same fish species. I can run the decorana function with sensible results so the dataset seems to be ok. My next step would be to compare the results of the fish ordination with some other datasets from the same sites (i.e. through procrustes rotation) - so I wouldn´t like to remove one of the twin sites. Any suggestion on how to circumvent this or am I simply using the wrong method? Best wishes, Leonard Hi Leonard, I assume you are running isoMDS (package MASS) directly? If so, take a look at metaMDS function in vegan which has an argument zerodist which if you set it to add adds a small distance so you don't get this error. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC ENSIS, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/cv/ London, UK. WC1E 6BT. [w] http://www.ucl.ac.uk/~ucfagls/ %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dataset on Baltimore home energy costs
Sorry, dumb typo. Should be 18-Dec-2005. -Kevin From: Richard M. Heiberger [mailto:[EMAIL PROTECTED] Sent: Fri 10/20/2006 5:43 PM To: Zembower, Kevin Subject: Re: [R] Dataset on Baltimore home energy costs Thanks for the data. Can you clarify the date here. 18-Dec-2006: Brought home son; spouse and son home during the day, setback thermostat no longer used. Interesting question: What's the cost of adding a child? It's still October, so I am not at all sure which year this should be. Rich [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Filling in a series
Colleagues After reading in some clinical data, I discovered that the subject ID column contains entries only for the first record for each individual; subsequent rows are recorded as NA. For example: 1 NA NA NA NA 2 NA NA NA NA 3 NA NA ... I can think of various approaches to replace the NA values with appropriate entries. I could loop through each row - if the value is NA, I replace it with the entry from the row above. Or, I could find the positions of the non-NA values with match, then replace groups of entries (e.g., positions 2-5) with appropriate entries, again with a loop. But, I expect that R allows some more clever approach to the problem. Any thoughts? Dennis Dennis Fisher MD P (The P Less Than Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-415-564-2220 www.PLessThan.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Filling in a series
The zoo package has na.locf (last occurrernce carried foward) for this purpose: x [1] 1 NA NA NA NA 2 NA NA NA NA 3 NA NA library(zoo) na.locf(x) [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 On 10/21/06, Dennis Fisher [EMAIL PROTECTED] wrote: Colleagues After reading in some clinical data, I discovered that the subject ID column contains entries only for the first record for each individual; subsequent rows are recorded as NA. For example: 1 NA NA NA NA 2 NA NA NA NA 3 NA NA ... I can think of various approaches to replace the NA values with appropriate entries. I could loop through each row - if the value is NA, I replace it with the entry from the row above. Or, I could find the positions of the non-NA values with match, then replace groups of entries (e.g., positions 2-5) with appropriate entries, again with a loop. But, I expect that R allows some more clever approach to the problem. Any thoughts? Dennis Dennis Fisher MD P (The P Less Than Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-415-564-2220 www.PLessThan.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Filling in a series
Actually I think its last _observation_ carried forward. On 10/21/06, Gabor Grothendieck [EMAIL PROTECTED] wrote: The zoo package has na.locf (last occurrernce carried foward) for this purpose: x [1] 1 NA NA NA NA 2 NA NA NA NA 3 NA NA library(zoo) na.locf(x) [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 On 10/21/06, Dennis Fisher [EMAIL PROTECTED] wrote: Colleagues After reading in some clinical data, I discovered that the subject ID column contains entries only for the first record for each individual; subsequent rows are recorded as NA. For example: 1 NA NA NA NA 2 NA NA NA NA 3 NA NA ... I can think of various approaches to replace the NA values with appropriate entries. I could loop through each row - if the value is NA, I replace it with the entry from the row above. Or, I could find the positions of the non-NA values with match, then replace groups of entries (e.g., positions 2-5) with appropriate entries, again with a loop. But, I expect that R allows some more clever approach to the problem. Any thoughts? Dennis Dennis Fisher MD P (The P Less Than Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-415-564-2220 www.PLessThan.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] newbie question
Sorry - this must be obvious, but i haven't been able to find the answer in the guides i've searched. The examples seem to assume you always want to look at all the data. I want to be able to filter data in a dataframe before analyzing it. For example, I'd like to plot(a,b) but only include values where b 1000. I'd also like to be able to do similar filtering before doing other statistical functions. Thanks for your help. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] newbie question
Dear Larry, Data can be filtered in the following manner: a = c(1,2,3,4) b = c(1,2,3,4) b = b[a3] b = b[b3] a b [1] 4 Or if the data is in a data frame: b = data.frame(seq(1:10),seq(1:10) names(b) = c(a,b) b b = b[b$a 5,] b The trailing comma at the end is important. Hopes this helps, Paul Larry White schreef: Sorry - this must be obvious, but i haven't been able to find the answer in the guides i've searched. The examples seem to assume you always want to look at all the data. I want to be able to filter data in a dataframe before analyzing it. For example, I'd like to plot(a,b) but only include values where b 1000. I'd also like to be able to do similar filtering before doing other statistical functions. Thanks for your help. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] logistic regression with a sample missing subjects with a value of an independent variable
Dear R-help, I am trying to make logistic regression analysis using the R function glm, with the parameter family set to binomial, in order to use a logistic regression model. I have 70 samples. The dependent variables has two levels (0 and 1) and one of the independent variables has too two levels (0 and 1). The variables associate in the way shown in the table: Dependent 0 1 Independent 0 55 10 1 0 5 This gives a strong association evaluated by the fisher test (p-value = 0.0002481), but with the logistic regression it gives a p-value of 0.99 with very high values of estimate and standard error (respectively and -19.27 and 1769.26). Is there any way (other function, different setting of a parameter) to perform logistic regression analysis with these data with R? Thank you. Gabriele Stocco University of Trieste __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with mode of marginal distriubtion of rdirichlet{gtools}
Hi all, I have a problem using rdirichlet{gtools}. For Dir( a1, a2, ..., a_n), its mode can be found at $( a_i -1)/ ( \sum_{i}a_i - n)$; The means are $a_i / (\sum_{i} a_i ) $; I tried to study the above properties using rdirichlet from gtools. The code are: ## library(gtools) alpha = c(1,3,9) #totoal=13 mean.expect = c(1/13, 3/13, 9/13) mode.expect = c(0,2/10, 8/10) # this is for the overall mode. mode1 = numeric(3); mode2 = numeric(3); theta = data.frame( rdirichlet( 1, alpha) ) m1 = mean(theta) for( i in 1:3) { h = hist(theta[,i], breaks=20); mode1[i]= h$mid[ h$density == max(h$density) ] } theta = data.frame( rdirichlet( 1, alpha) ) m2 = mean(theta) for( i in 1:3) { h = hist(theta[,i], breaks=20); mode2[i]= h$mid[ h$density == max(h$density) ] } rbind(m1,m2,mean.expect) #the means are consistent rbind(mode1, mode2, mode.expect);#the marginal modes are quite different from the global mode. An example output is: rbind(m1,m2,mean.expect) #the means are consistent X1X2X3 m1 0.07609384 0.2301840 0.6937222 m2 0.07716923 0.2300336 0.6927971 mean.expect 0.07692308 0.2307692 0.6923077 rbind(mode1, mode2, mode.expect); [,1] [,2] [,3] mode1 0.025 0.175 0.725 mode2 0.010 0.175 0.725 mode.expect 0.000 0.200 0.800 So, what is the problem with the mode? Many thanks, HQ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] newbie question
Larry White wrote: Sorry - this must be obvious, but i haven't been able to find the answer in the guides i've searched. The examples seem to assume you always want to look at all the data. I want to be able to filter data in a dataframe before analyzing it. For example, I'd like to plot(a,b) but only include values where b 1000. I'd also like to be able to do similar filtering before doing other statistical functions. In addition to what Paul Hiemstra mentioned, you should look at the help page for subset(). For example: df - data.frame(a = runif(20), b = 1:20) plot(a ~ b, data = subset(df, b 5)) Thanks for your help. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] logistic regression with a sample missing subjects with a value of an independent variable
On Sat, 21 Oct 2006, Gabriele Stocco wrote: Dear R-help, I am trying to make logistic regression analysis using the R function glm, with the parameter family set to binomial, in order to use a logistic regression model. I have 70 samples. The dependent variables has two levels (0 and 1) and one of the independent variables has too two levels (0 and 1). The variables associate in the way shown in the table: Dependent 0 1 Independent 0 55 10 1 0 5 This gives a strong association evaluated by the fisher test (p-value = 0.0002481), but with the logistic regression it gives a p-value of 0.99 with very high values of estimate and standard error (respectively and -19.27 and 1769.26). Please see the comment at the bottom of this message, as your claims are not supported by any code. Is there any way (other function, different setting of a parameter) to perform logistic regression analysis with these data with R? fit - glm(matrix(c(55,0,10,5), 2, 2) ~ factor(c(0,1)), binomial()) fit0 - glm(matrix(c(55,0,10,5), 2, 2) ~ 1, binomial()) anova(fit0, fit, test=Chisq) Resid. Df Resid. Dev Df Deviance P(|Chi|) 1 1 16.929 2 0 2.208e-10 1 16.929 3.880e-05 is a reasonable way to do this. Beware the Hauck-Donner phenomenon (see e.g. MASS, the book) for t-tests of coefficients, although I do not get the values you quote. Since the expected values are low, you should not take the p value too seriously. Thank you. Gabriele Stocco University of Trieste __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Box's M test
Dear List I am looking for a script that will calculate the Box M test to test the homogeneity of the variance/covariance matrix between two matrices. If anyone could send me the script I would appreciate it. I am aware of the scepticism about this test, where due to extreme sensitivity a p value of 0.01 is recommended. Despite this however Box's M test is the established method for identification of stable strategic time periods within the strategic management literature and I would like the opportunity to use this method within either R or S plus. Any help would be gratefully received. Kind regards Graham [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Box M test
Dear List I am looking for a script that will calculate the Box M test to test the homogeneity of the variance/covariance matrix between two matrices. If anyone could send me the script I would appreciate it. I am aware of the scepticism about this test, where due to extreme sensitivity a p value of 0.01 is recommended. Despite this however Box's M test is the established method for identification of stable strategic time periods within the strategic management literature and I would like the opportunity to use this method within either R or S plus. Any help would be gratefully received. Kind regards Graham Kind regards Dr Graham Leask Economics and Strategy Group Aston Business School Aston University Aston Triangle Birmingham B4 7ET Tel: Direct line 0121 204 3150 email [EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how do I find the row index number, or row name, of a given value in a vector?
Hi, I noticed that max(x) returns the maximum value of a vector, but the function doesn't give the user the option of retrieving the row index number instead. If I used max(x) to find the maximum value of vector x, then is there a function I can use to find the index number, or row name, of the maximum value? Thanks, Tom [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] newbie question
Larry White wrote: Sorry - this must be obvious, Yes, it is :-) I want to be able to filter data in a dataframe before analyzing it. For example, I'd like to plot(a,b) but only include values where b 1000. If a and b are vectors, then b 1000 is another vector of logical values. You can use logical vectors to select pieces of other vectors, like this: x - c(1, 2, 3) y - c(TRUE, FALSE, FALSE) x[y] # returns only the first element, 1 So, this plot can be done this way: filter - (b 1000) # filter is a logical vector a1 - a[filter] b1 - b[filter] # Now, a1 and b1 are, resp, a and b filtered plot(a1, b1) Alberto Monteiro __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how do I find the row index number, or row name, of a given value in a vector?
--- tom soyer [EMAIL PROTECTED] wrote: Hi, I noticed that max(x) returns the maximum value of a vector, but the function doesn't give the user the option of retrieving the row index number instead. If I used max(x) to find the maximum value of vector x, then is there a function I can use to find the index number, or row name, of the maximum value? Thanks, Tom ?which Is this what you want to do? xx - c(1,2,3,4,9,6,7,5) bb - max(xx) bb which(xx==bb) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how do I find the row index number, or row name, of a given value in a vector?
?which.max On 10/21/06, tom soyer [EMAIL PROTECTED] wrote: Hi, I noticed that max(x) returns the maximum value of a vector, but the function doesn't give the user the option of retrieving the row index number instead. If I used max(x) to find the maximum value of vector x, then is there a function I can use to find the index number, or row name, of the maximum value? Thanks, Tom [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how do I find the row index number, or row name, of a given value in a vector?
Thanks John! I got it. On 10/21/06, John Kane [EMAIL PROTECTED] wrote: --- tom soyer [EMAIL PROTECTED] wrote: Hi, I noticed that max(x) returns the maximum value of a vector, but the function doesn't give the user the option of retrieving the row index number instead. If I used max(x) to find the maximum value of vector x, then is there a function I can use to find the index number, or row name, of the maximum value? Thanks, Tom ?which Is this what you want to do? xx - c(1,2,3,4,9,6,7,5) bb - max(xx) bb which(xx==bb) __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] I really don't understand functions in R :-)
Patrick Burns ([EMAIL PROTECTED]) wrote: You may be interested in S Poetry. Hi, I do find your book very helpful. Thank you. Best regards, Marcin Jaworski __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [Fwd: [AGDG-LIST:405] R Computing Contest]
Original Message Subject: [AGDG-LIST:405] R Computing Contest Date: Sat, 21 Oct 2006 12:08:13 -0400 From: Larry Schaeffer [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] To: Animal Geneticist's Discussion [EMAIL PROTECTED] For those that are interested only: R Computer Programming Challenge Given: y = Factor A + Factor B + b1(Covariate1) + b2(Covariate2) + ... + bp(Covariate p) + animal + e where y is a single trait - vector of observations, Factors A and B are fixed with ma and mb levels respectively, Covariates 1 to p are fixed regressions and p is a general number but less than 20. Heritability is a variable and should be requested by the program. You are given a list of the animal IDs of animals with records, and their sires and dams. Sire and dam IDs are smaller than the IDs of their progeny, and all IDs go from 1 to n where n can be up to 10 million animals. Tasks: 1) Estimate the EBVs of all animals using BLUP and mixed model equations and list the top 10 largest EBVs with associated animal IDs, first ten levels of Factor A with solutions, first ten levels of Factor B with solutions, and all estimated regression coefficients. Submit your R program to Bill Szkotnicki ([EMAIL PROTECTED]) by January 5, 2007. Submissions will be dated by their arrival date. Provide any special functions or details that might be needed to run your program. All R programs will be run on the same test data sets, computer (Linux), and timed. Winner: Earliest submitted R program that runs the fastest and gives the correct answers. Prize: Free registration for the Kennedy Conference in 2008 in Guelph. -- Lep pozdrav / With regards, Gregor Gorjanc -- University of Ljubljana PhD student Biotechnical Faculty Zootechnical Department URI: http://www.bfro.uni-lj.si/MR/ggorjan Groblje 3 mail: gregor.gorjanc at bfro.uni-lj.si SI-1230 Domzale tel: +386 (0)1 72 17 861 Slovenia, Europefax: +386 (0)1 72 17 888 -- One must learn by doing the thing; for though you think you know it, you have no certainty until you try. Sophocles ~ 450 B.C. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ReadLines question
That looks to me like an infinity sign (I have no idea why that is part of the header of this file, but it is there). How do I modify the encoding to read this in? --j On 10/21/06 4:33 AM, Roger Bivand [EMAIL PROTECTED] wrote: On Sat, 21 Oct 2006, Jonathan Greenberg wrote: I'm getting the following error: headerinfo=readLines(met_station_file,n=8) headerinfo [1] Plot Title: tahoe met validation ,,, [2]Error: invalid multibyte string met_station_file's first 8 lines are as follows: Plot Title: tahoe met validation ,,, #,Time, GMT-07:00,Temp, F,Coupler Attached,Host Connected,Coupler ^^^ or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a multibyte string. Is there a mismatch between the encoding (see ?locales) of the file and the machine into which you are reading? Detached,Stopped,End Of File 34,10/1/2005 0:00,49.937, 35,10/1/2005 0:30,47.266, 36,10/1/2005 1:00,47.446, 37,10/1/2005 1:30,47.982, 38,10/1/2005 2:00,48.517, 39,10/1/2005 2:30,49.228, Why am I getting this error? Are those quotation marks causing the hiccup? If so, how do I get around this programmatically? --j -- Jonathan A. Greenberg, PhD NRC Research Associate NASA Ames Research Center MS 242-4 Moffett Field, CA 94035-1000 Office: 650-604-5896 Cell: 415-794-5043 AIM: jgrn307 MSN: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Box M test [Broadcast]
See http://finzi.psych.upenn.edu/R/Rhelp02a/archive/0.html Andy From: GRAHAM LEASK Dear List I am looking for a script that will calculate the Box M test to test the homogeneity of the variance/covariance matrix between two matrices. If anyone could send me the script I would appreciate it. I am aware of the scepticism about this test, where due to extreme sensitivity a p value of 0.01 is recommended. Despite this however Box's M test is the established method for identification of stable strategic time periods within the strategic management literature and I would like the opportunity to use this method within either R or S plus. Any help would be gratefully received. Kind regards Graham Kind regards Dr Graham Leask Economics and Strategy Group Aston Business School Aston University Aston Triangle Birmingham B4 7ET Tel: Direct line 0121 204 3150 email [EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ReadLines question
On Sat, 21 Oct 2006, Jonathan Greenberg wrote: That looks to me like an infinity sign (I have no idea why that is part of the header of this file, but it is there). How do I modify the encoding to read this in? The problem is the degree sign. Under linux: $ file tmp/Marlette_lake_snotel.csv tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators so probably the conversion to multibyte is happening on your reading platform. Reading the file into 2.4.0 on Windows with a Norwegian 1252 setting (Sys.getlocale()), I see the degree sign. If you know the column names anyway, jump over the header and insert them yourself. Alternatively filter the non-ASCII character out before reading, it looks predictably like a degree sign. In any case, the character is not very practical in a column name. Roger --j On 10/21/06 4:33 AM, Roger Bivand [EMAIL PROTECTED] wrote: On Sat, 21 Oct 2006, Jonathan Greenberg wrote: I'm getting the following error: headerinfo=readLines(met_station_file,n=8) headerinfo [1] Plot Title: tahoe met validation ,,, [2]Error: invalid multibyte string met_station_file's first 8 lines are as follows: Plot Title: tahoe met validation ,,, #,Time, GMT-07:00,Temp, F,Coupler Attached,Host Connected,Coupler ^^^ or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a multibyte string. Is there a mismatch between the encoding (see ?locales) of the file and the machine into which you are reading? Detached,Stopped,End Of File 34,10/1/2005 0:00,49.937, 35,10/1/2005 0:30,47.266, 36,10/1/2005 1:00,47.446, 37,10/1/2005 1:30,47.982, 38,10/1/2005 2:00,48.517, 39,10/1/2005 2:30,49.228, Why am I getting this error? Are those quotation marks causing the hiccup? If so, how do I get around this programmatically? --j -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ReadLines question
Roger Bivand [EMAIL PROTECTED] writes: On Sat, 21 Oct 2006, Jonathan Greenberg wrote: That looks to me like an infinity sign (I have no idea why that is part of the header of this file, but it is there). How do I modify the encoding to read this in? The problem is the degree sign. Under linux: $ file tmp/Marlette_lake_snotel.csv tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators so probably the conversion to multibyte is happening on your reading platform. Reading the file into 2.4.0 on Windows with a Norwegian 1252 setting (Sys.getlocale()), I see the degree sign. Nono, there is no conversion. R is _expecting_ a multibyte sequence (utf8 most likely) and finding something that isn't part of one. The fix should be something close to read.csv(file(tmp/Marlette_lake_snotel.csv), encoding=iso-8859-1) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Possible bugs in 'seek' and 'readBin'
I found that seek(..., origin = 'current', ...) and readBin(..., what = 'integer', ...) or 'int' do not work correctly. Did anyone have the same experience? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] pie
Hi, I would like to draw a pie chart. I've already tried out the standard pie-function in the GRAPH-package. My question: is there any 'better' function or package to draw a pie chart. For example I would like to draw a 3D pie chart. Dietrich Tissen __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] tcltk: multiple listboxes, selection
Dear list, I have multiple (BWidget) listboxes in the same toplevel window. The problem is, if I select (by left clicking) on one of those listbox elements, the current selection in the *other* listboxes is cleared! Anybody knows how I can prevent this? Here's my code (sorry not complete): (E.g. If I select an X value, I'd lose the Y value I selected before) gui.create.tab.general - function(tt) { # Network topology frame.1 - tkframe(tt) combo_topology - tkwidget(frame.1, ComboBox, values = infras, editable = FALSE) tcl(combo_topology, setvalue, first) tkgrid(tklabel(frame.1, text=Network topology), combo_topology) tkgrid(frame.1, sticky = w) # X values frame.x - tkwidget(tt, labelframe, text = X value) tl.x - tklistbox(frame.x, height = length(xvals) / 2, width = 50, selectmode = browse, background = white) for(i in seq(from=1, to=length(xvals), by=2)) { tkinsert(tl.x, end, xvals[[i]]) } tkselection.set(tl.x, 0) tkgrid(tl.x, sticky=news) # Y values frame.y - tkwidget(tt, labelframe, text = Y value(s)) tl.y - tklistbox(frame.y, height = 20, #length(yvals) / yvals.field.count, yscrollcommand=function(...) tkset(scr.y,...), width = 50, selectmode = extended, background = white) scr.y - tkscrollbar(frame.y, repeatinterval = 5, command = function(...) tkyview(tl.y, ...)) for(i in seq(from=1, to=length(yvals), by=yvals.field.count)) { tkinsert(tl.y, end, yvals[[i]]) } tkselection.set(tl.y, 0) tkgrid(tl.y, scr.y, sticky=news) # Packing tkgrid(frame.x, frame.y, sticky=news) #side = left, fill = both, expand = TRUE) } __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pie
On Sat, 2006-10-21 at 22:46 +0200, Dietrich Tissen wrote: Hi, I would like to draw a pie chart. I've already tried out the standard pie-function in the GRAPH-package. My question: is there any 'better' function or package to draw a pie chart. For example I would like to draw a 3D pie chart. Dietrich Tissen Take a look at the plotrix package on CRAN. That being said, just say NO to 3D pie charts: When a graphic is taken over by decorative forms of computer debris, when the data measures and structures become Design Elements, when the overall design purveys Graphical Style rather than quantitative information, then the graphic may be called a duck in honor of the duck-form store... — Edward Tufte The primary source of extraneous lines in charting graphics today are the 3-D options offered by conventional spreadsheet graphics. These 3-D options serve no useful purpose; they add only ink to the chart, and more often than not make it more difficult to estimate the values represented. Even worse are the spreadsheet options that allow one to rotate the perspective. For those who would take bad graphical display to even higher levels, the Excel spreadsheet program offers the option of doughnut, radar, cylinder, cone, bubble charts. — Gary Klass Consider a ?dotchart or even ?barplot instead. For more information: http://lilt.ilstu.edu/gmklass/pos138/datadisplay/badchart.htm http://lilt.ilstu.edu/gmklass/pos138/datadisplay/sections/goodcharts.htm http://www.edwardtufte.com/tufte/books_vdqi and the two books by Cleveland: http://stat.bell-labs.com/wsc/index.html HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Latex editor recommendations
Richard M. Heiberger [EMAIL PROTECTED] writes: I recommend emacs http://ftp.gnu.org/pub/gnu/emacs It has modes for TeX and LaTeX and automatically chooses the right one. As a side benefit once you have emacs you can then run R through ESS, the package that provides the modes for handling statistical languages http://ess.r-project.org/ In addition you get support for Sweave (mixing R and LaTeX). Recommended packages for Editing LaTeX / R are: AucTeX + reftex ESS Jens __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using the ROracle package with Oracle 10 ?
Ram Dessau [EMAIL PROTECTED] writes: [...] Window binary for ROracle 10 ? Is there anyone who could help with a version of ROracle for windows which is compatible with Oracle 10 or more general just uses the default-home The available (not officially released) version only supports Oracle 9 Any used oracle 9 driver should be able to connect to a 10g instance and work reasonably well with it. Did you try connecting to a 10g instance? Do you have a specific problem or thing not working? Do you have errors ora-600 or ora-03113? Jens __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ROracle error in Windows. Memory could not be read.
Adrian Dragulescu [EMAIL PROTECTED] writes: I've seen from earlier posts that other people had problems installing ROracle under Windows. I run R-2.3.1. I got the Windows binaries for ROracle from http://stat.bell-labs.com/RS-DBI/download/index.html Here is my session: require(ROracle) Loading required package: ROracle Loading required package: DBI [1] TRUE drv - dbDriver(Oracle) drv OraDriver:(1648) con - dbConnect(drv, user=USER, password=PASS, dbname=TEST) A window pops up, with the message: The instruction at 0x6260c621 referenced memory at 0x4a280ae2. The memory could not be read. And R freezes. Switch on debugging in the oracle client (sqlnet.ora). Maybe you can get a glimpse on what's wrong. Please provide version information (Oracle Client libs, Oracle DB). Is there any network traffic going to or coming from the DB? Something is working, because I get to establish an Oracle driver. What do you mean by that? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ReadLines question
On 21 Oct 2006, Peter Dalgaard wrote: Roger Bivand [EMAIL PROTECTED] writes: On Sat, 21 Oct 2006, Jonathan Greenberg wrote: That looks to me like an infinity sign (I have no idea why that is part of the header of this file, but it is there). How do I modify the encoding to read this in? The problem is the degree sign. Under linux: $ file tmp/Marlette_lake_snotel.csv tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators so probably the conversion to multibyte is happening on your reading platform. Reading the file into 2.4.0 on Windows with a Norwegian 1252 setting (Sys.getlocale()), I see the degree sign. Nono, there is no conversion. R is _expecting_ a multibyte sequence (utf8 most likely) and finding something that isn't part of one. The fix should be something close to read.csv(file(tmp/Marlette_lake_snotel.csv), encoding=iso-8859-1) Yes, thanks: read.csv(file(tmp/Marlette_lake_snotel.csv, encoding=iso-8859-1), skip=1, check.names=FALSE) gives usable results on Linux/en_GB and Win/Norwegian (Bokmål) 1252. -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with mode of marginal distriubtion of rdiri chlet{gtools}
hong qin alongway at gmail.com writes: Hi all, I have a problem using rdirichlet{gtools}. For Dir( a1, a2, ..., a_n), its mode can be found at $( a_i -1)/ ( \sum_{i}a_i - n)$; The means are $a_i / (\sum_{i} a_i ) $; I believe the problem is that the marginal modes are not equal to the full posterior modes. (Think about the two-parameter normal distribution for example; for a sample {x_i}, the mode of the posterior variance is near sum((x_i-\bar x)^2)/n while the marginal mode is near sum((x_i-\bar x)^2)/(n-1) ...) [haven't checked this at all carefully.] Ben Bolker __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] object(s) are masked from package - what does it mean?
Hi, Sometime when I attach a dataset, R gives me the following message/warning:The following object(s) are masked from package:datasets: column_name. Does anyone know what it means? Since it seems that the dataset was attached and I could manipulate the data from the dataset without problems, I am wondering what was R trying to tell me. Thanks, Tom [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] glm function question
I am creating a model attempting to predict the probability someone will reoffend after being caught for a crime. There are seven total inputs and I planned on using a logistic regression. I started with a null deviance of 182.91 and ended up with a residual deviance of 83.40 after accounting for different interactions and such. However, I realized after that my code is different from that in my book. And I can't figure out what I need to put in it's place. Here's my code: library(foreign) library(car) foo = read.table(C:/Documents and Settings/Chris/Desktop/4330/criminals.dat, header=TRUE) reoff = foo[ ,1] race = foo[ ,2] age = foo[ ,3] gender = foo[ ,4] educ = foo[ ,5] subst = foo[ ,6] prior = foo[ ,7] violence = foo[ ,8] fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior + factor(violence):factor(subst) + factor(violence):factor(educ) + factor(violence):factor(age) + factor(violence):factor(prior)) summary(fit1h) If you noticed, there's no part of my code that looks like: family=binomial(link=logit)) If I code like my book has done, it would look like: fit1i = glm(reoff ~ factor(subst) + factor(violence) + prior + factor(violence):factor(subst) + factor(violence):factor(educ) + factor(violence):factor(age) + factor(violence):factor(prior), family=binomial(link=logit)) summary(fit1i) However, when I do this, my null deviance is 1104 and my residual deviance is 23460. THIS IS A HUGE DIFFERENCE IN MODEL FIT! I'm not sure if I have to redo my model or if my book was simply doing the family=binomial(link=logit) for a specific problem/reason. So, to my question: Do I need to include family=binomial(link=logit) in my code? Do I need to include any type of family? Thanks for your help, -chris [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm function question
On Sat, 2006-10-21 at 20:02 -0400, Chris Linton wrote: I am creating a model attempting to predict the probability someone will reoffend after being caught for a crime. There are seven total inputs and I planned on using a logistic regression. I started with a null deviance of 182.91 and ended up with a residual deviance of 83.40 after accounting for different interactions and such. However, I realized after that my code is different from that in my book. And I can't figure out what I need to put in it's place. Here's my code: library(foreign) library(car) foo = read.table(C:/Documents and Settings/Chris/Desktop/4330/criminals.dat, header=TRUE) reoff = foo[ ,1] race = foo[ ,2] age = foo[ ,3] gender = foo[ ,4] educ = foo[ ,5] subst = foo[ ,6] prior = foo[ ,7] violence = foo[ ,8] fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior + factor(violence):factor(subst) + factor(violence):factor(educ) + factor(violence):factor(age) + factor(violence):factor(prior)) summary(fit1h) If you noticed, there's no part of my code that looks like: family=binomial(link=logit)) If I code like my book has done, it would look like: fit1i = glm(reoff ~ factor(subst) + factor(violence) + prior + factor(violence):factor(subst) + factor(violence):factor(educ) + factor(violence):factor(age) + factor(violence):factor(prior), family=binomial(link=logit)) summary(fit1i) However, when I do this, my null deviance is 1104 and my residual deviance is 23460. THIS IS A HUGE DIFFERENCE IN MODEL FIT! I'm not sure if I have to redo my model or if my book was simply doing the family=binomial(link=logit) for a specific problem/reason. So, to my question: Do I need to include family=binomial(link=logit) in my code? Yes, though you could do with just 'family = binomial' since logit is the default link function. Do I need to include any type of family? If you don't want to use the default Gaussian family, then yes. Whatever book it is you are working from (which you fail to identify) ought to clearly explain the background on the use of the distribution families in GLM's. There is a reason the author has included these instructions and you need to pay attention to them. If you look carefully at the output of summary(fit1h), you will likely see: (Dispersion parameter for gaussian family taken to be ) and you will also notice that the tests being applied (3rd and 4th columns in the coefficient summary table) are t tests and not z tests. These should be a big hint that you are not working with the proper family and are therefore not fitting a logistic regression model, which is presumably the intent of this section of the book. See ?glm and pay careful attention to the function defaults. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object(s) are masked from package - what does it mean?
tom soyer tom.soyer at gmail.com writes: Hi, Sometime when I attach a dataset, R gives me the following message/warning:The following object(s) are masked from package:datasets: column_name. Does anyone know what it means? Since it seems that the dataset was attached and I could manipulate the data from the dataset without problems, I am wondering what was R trying to tell me. As far as I understand it: Imagine there is a package A with object tmp and you load it. When you load another package B with object of the same name i.e. tmp then you will access tmp from package B and not A. You might work though with namespaces, :: and ::: Gregor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with installing Hmisc and Design: gfrotran: error
cannot exec 'f951': No such file or directory Reply-To: Hi all, I'm using R on an Suse Linux system. Since Hmisc and Design need both fortran, I installed gfortran (through rpm). However, I still get an error message, namely: gfrotran: error cannot exec 'f951': No such file or directory And the installation is cancelled. I tried solving the problem through googling it, and looking at the fortran and Hmisc page, but couldn't find any valuable information. It would be wonderful if anyone could help me. Cheers, Stephan -- --- Stephan Lindner, Dipl.Vw., MA PhD Candidate University of Michigan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to convert multiple dummy variables to 1 factor variable?
Dear Listers, I am wondering how to convert multiple dummy variables to 1 factor variable. Thanks. wensui __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm function question
Chris Linton connect.chris at gmail.com writes: I am creating a model attempting to predict the probability someone will reoffend after being caught for a crime. There are seven total inputs and I planned on using a logistic regression. I started with a null deviance of 182.91 and ended up with a residual deviance of 83.40 after accounting for different interactions and such. However, I realized after that my code is different from that in my book. And I can't figure out what I need to put in it's place. Here's my code: ... fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior + factor(violence):factor(subst) + factor(violence):factor(educ) + factor(violence):factor(age) + factor(violence):factor(prior)) summary(fit1h) If you noticed, there's no part of my code that looks like: family=binomial(link=logit)) ... However, when I do this, my null deviance is 1104 and my residual deviance is 23460. THIS IS A HUGE DIFFERENCE IN MODEL FIT! I'm not sure if I have to redo my model or if my book was simply doing the family=binomial(link=logit) for a specific problem/reason. You state that you model the *probability* that ... Then family=gaussian, which is the default data generation model in glm is not appropriate. Yes, you need to use family=binomial(link=logit) or family=binomial(link=probit), but you also need to take care in proper specification of your y in the glm call. Gregor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object(s) are masked from package - what does it mean?
On Sat, 2006-10-21 at 18:43 -0500, tom soyer wrote: Hi, Sometime when I attach a dataset, R gives me the following message/warning:The following object(s) are masked from package:datasets: column_name. Does anyone know what it means? Since it seems that the dataset was attached and I could manipulate the data from the dataset without problems, I am wondering what was R trying to tell me. Thanks, Tom This has to do with R's searchpath. When you attach() a dataset, it loads the dataset into R's current searchpath. The same occurs with packages when you use library(). This can be seen with: search() [1] .GlobalEnvpackage:Designpackage:survival [4] package:splines package:Hmisc package:chron [7] package:xtablepackage:gplotspackage:gtools [10] package:gmodels package:gdata package:methods [13] package:stats package:graphics package:grDevices [16] package:utils package:datasets Autoloads [19] package:base or with a bit more detail: searchpaths() [1] .GlobalEnv [2] /usr/local/lib/R/library/Design [3] /usr/local/lib/R/library/survival [4] /usr/local/lib/R/library/splines [5] /usr/local/lib/R/library/Hmisc [6] /usr/local/lib/R/library/chron [7] /usr/local/lib/R/library/xtable [8] /usr/local/lib/R/library/gplots [9] /usr/local/lib/R/library/gtools [10] /usr/local/lib/R/library/gmodels [11] /usr/local/lib/R/library/gdata [12] /usr/local/lib/R/library/methods [13] /usr/local/lib/R/library/stats [14] /usr/local/lib/R/library/graphics [15] /usr/local/lib/R/library/grDevices [16] /usr/local/lib/R/library/utils [17] /usr/local/lib/R/library/datasets [18] Autoloads [19] /usr/local/lib/R/library/base You can see that the 'datasets' package is number 17 in the searchpath. For example, if I now attach() the iris dataset (a widely used example dataset in R): attach(iris) search() [1] .GlobalEnviris package:Design [4] package:survival package:splines package:Hmisc [7] package:chron package:xtablepackage:gplots [10] package:gtoolspackage:gmodels package:gdata [13] package:methods package:stats package:graphics [16] package:grDevices package:utils package:datasets [19] Autoloads package:base You can see that 'iris' is now number 2 in the searchpath. This means that I can now access columns in the iris dataset without needing to use the typical DataframeName$ColumnName syntax: Species [1] setosa setosa setosa setosa setosa setosa [7] setosa setosa setosa setosa setosa setosa ... However: detach(iris) Species Error: object Species not found iris$Species [1] setosa setosa setosa setosa setosa setosa [7] setosa setosa setosa setosa setosa setosa ... In your case, the warning is telling you that you have attached a data frame that presumably contains a column, whose name is 'column_name'. If you use: column_name in your R session, the object with that name in your data frame will be seen before another object with the same name that is lower in the searchpath, since your object is higher in the searchpath. Thus, your object is 'masking' the other. I did not see another object called column_name in my version of the datasets package, so perhaps you have done some manipulation on those objects during your R session. As a general comment, as has been discussed previously in other threads, this is a potentially significant danger in the habit of using attach() and should generally be avoided. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multilevel model (lme) question
Dear list, I'm trying to fit a multilevel (mixed-effects) model using the lme function (package nlme) in R 2.4.0. As a mixed-effects newbie I'm neither sure about the modeling nor the correct R syntax. My data is structured as follows: For each subject, a quantity Y is measured at a number (= 2) of time points. Moreover, at time point 0 (baseline), a quantity X is measured for each subject (I am interested to see whether X, or log(X), is a predictor of Y). I saw in some examples that time-invariant predictors should be repeated for all rows of a subject, which is why I copied the baseline value of X also to time points 0. The resulting data frame looks like this: Grouped Data: Y ~ TIME | Subject Y TIMESubject X.Baseline 9 0.0 11084 7 3.7 11084 11 0.0 27150 8 9.2 27150 Intra-subject trajectories of Y very close to linear. I'd like to check whether slope (and maybe also offset) of this line are (in part) predicted by X.baseline. Thus, I think the multilevel model specification should be as follows (i = subject, j=measurement): y_ij = \beta_i + b_i * TIME_ij + \epsilon_ij, with b_i = \zeta_i0 + \zeta_i1 * X.Baseline Is this correct? Now, I am completely unsure how to translate this into the syntax needed by lme. Is there any standard procedure on how to get from e.g. the LairdWare'82 matrix model notation to the lme input? And, in my case, is the correct model as follows, or am I wrong? my.model - lme( Y ~ TIME, my.data.frame, random=~X.Baseline | Subject ) [in case this is correct, it is by accident, since I made it up from some examples I found without really understanding the translation from the model formulation] I'd greatly appreciate some advice or help. Best, Lukas [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to convert multiple dummy variables to 1 factor variable?
On Sat, 2006-10-21 at 21:04 -0400, Wensui Liu wrote: Dear Listers, I am wondering how to convert multiple dummy variables to 1 factor variable. Thanks. wensui I was thinking of a function that is essentially the reverse of model.matrix() which is used by R modeling functions. I did not see one, though it is possible that I missed it. However, I suppose that something along the lines of the following would work. Say we have a matrix as follows, where the columns represent the presence or absence of the factor levels, as one would see in a model matrix. There should be a single '1' in each row as each row corresponds to a single observation. mat Level1 Level2 Level3 Level4 Level5 [1,] 0 1 0 0 0 [2,] 1 0 0 0 0 [3,] 0 0 0 1 0 [4,] 0 0 1 0 0 [5,] 0 0 0 0 1 # Create a new factor based upon the index of each 1 in each row # Use the matrix column names as the labels for each level NewFactor - factor(apply(mat, 1, function(x) which(x == 1)), labels = colnames(mat)) NewFactor [1] Level2 Level1 Level4 Level3 Level5 Levels: Level1 Level2 Level3 Level4 Level5 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cross tabs with percents?
-- apologies if this is a dup - i got a bounce saying the message was unprocessed. Is there a straightforward way to get a table with percents in the cells rather than counts? I've looked at table, ftable, xtabs, and ctab, which did the conversion but returned the results in a single row without labels. any suggestions are appreciated. thank you. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cross tabs with percents?
Maybe _CrossTable_ in gmodels package is what you want. 2006/10/22, Larry White [EMAIL PROTECTED]: -- apologies if this is a dup - i got a bounce saying the message was unprocessed. Is there a straightforward way to get a table with percents in the cells rather than counts? I've looked at table, ftable, xtabs, and ctab, which did the conversion but returned the results in a single row without labels. any suggestions are appreciated. thank you. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- »ÆÈÙ¹ó Department of Sociology Fudan University [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] object(s) are masked from package - what does it mean?
Thanks Marc and Gregor for the detailed explanation. You are right, the masking is potentially dangerous. Since R is object oriented, I am surprised that this is an issue. Does this mean that encapsulation does not exist in R? On 10/21/06, Marc Schwartz [EMAIL PROTECTED] wrote: On Sat, 2006-10-21 at 18:43 -0500, tom soyer wrote: Hi, Sometime when I attach a dataset, R gives me the following message/warning:The following object(s) are masked from package:datasets: column_name. Does anyone know what it means? Since it seems that the dataset was attached and I could manipulate the data from the dataset without problems, I am wondering what was R trying to tell me. Thanks, Tom This has to do with R's searchpath. When you attach() a dataset, it loads the dataset into R's current searchpath. The same occurs with packages when you use library(). This can be seen with: search() [1] .GlobalEnvpackage:Designpackage:survival [4] package:splines package:Hmisc package:chron [7] package:xtablepackage:gplotspackage:gtools [10] package:gmodels package:gdata package:methods [13] package:stats package:graphics package:grDevices [16] package:utils package:datasets Autoloads [19] package:base or with a bit more detail: searchpaths() [1] .GlobalEnv [2] /usr/local/lib/R/library/Design [3] /usr/local/lib/R/library/survival [4] /usr/local/lib/R/library/splines [5] /usr/local/lib/R/library/Hmisc [6] /usr/local/lib/R/library/chron [7] /usr/local/lib/R/library/xtable [8] /usr/local/lib/R/library/gplots [9] /usr/local/lib/R/library/gtools [10] /usr/local/lib/R/library/gmodels [11] /usr/local/lib/R/library/gdata [12] /usr/local/lib/R/library/methods [13] /usr/local/lib/R/library/stats [14] /usr/local/lib/R/library/graphics [15] /usr/local/lib/R/library/grDevices [16] /usr/local/lib/R/library/utils [17] /usr/local/lib/R/library/datasets [18] Autoloads [19] /usr/local/lib/R/library/base You can see that the 'datasets' package is number 17 in the searchpath. For example, if I now attach() the iris dataset (a widely used example dataset in R): attach(iris) search() [1] .GlobalEnviris package:Design [4] package:survival package:splines package:Hmisc [7] package:chron package:xtablepackage:gplots [10] package:gtoolspackage:gmodels package:gdata [13] package:methods package:stats package:graphics [16] package:grDevices package:utils package:datasets [19] Autoloads package:base You can see that 'iris' is now number 2 in the searchpath. This means that I can now access columns in the iris dataset without needing to use the typical DataframeName$ColumnName syntax: Species [1] setosa setosa setosa setosa setosa setosa [7] setosa setosa setosa setosa setosa setosa ... However: detach(iris) Species Error: object Species not found iris$Species [1] setosa setosa setosa setosa setosa setosa [7] setosa setosa setosa setosa setosa setosa ... In your case, the warning is telling you that you have attached a data frame that presumably contains a column, whose name is 'column_name'. If you use: column_name in your R session, the object with that name in your data frame will be seen before another object with the same name that is lower in the searchpath, since your object is higher in the searchpath. Thus, your object is 'masking' the other. I did not see another object called column_name in my version of the datasets package, so perhaps you have done some manipulation on those objects during your R session. As a general comment, as has been discussed previously in other threads, this is a potentially significant danger in the habit of using attach() and should generally be avoided. HTH, Marc Schwartz [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cross tabs with percents?
try prop.table() On 10/21/06, ronggui [EMAIL PROTECTED] wrote: Maybe _CrossTable_ in gmodels package is what you want. 2006/10/22, Larry White [EMAIL PROTECTED]: -- apologies if this is a dup - i got a bounce saying the message was unprocessed. Is there a straightforward way to get a table with percents in the cells rather than counts? I've looked at table, ftable, xtabs, and ctab, which did the conversion but returned the results in a single row without labels. any suggestions are appreciated. thank you. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- »ÆÈÙ¹ó Department of Sociology Fudan University [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Question:shardsplot (package:klaR)
Dear all, I have a question on the shardsplot package:klaR(see the below Example). Plese tell me the meanings of logstand - t((t(logcount) / sdlogcount) * c(1,2,6,5,5,3)), much more. Why does this example use c(1,2,6,5,5,3) ? Examples: # Compute clusters and an Eight Directions Arranged Map for the # country data. Plotting the result. data(countries) logcount - log(countries[,2:7]) sdlogcount - apply(logcount, 2, sd) logstand - t((t(logcount) / sdlogcount) * c(1,2,6,5,5,3)) cclasses - cutree(hclust(dist(logstand)), k = 6) countryEDAM - EDAM(logstand, classes = cclasses, sa = FALSE, iter.max = 10, random = FALSE) plot(countryEDAM, vertices = FALSE, label = TRUE, stck = FALSE) Sincerely yours, Mikio [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cross tabs with percents?
Larry White wrote: -- apologies if this is a dup - i got a bounce saying the message was unprocessed. Is there a straightforward way to get a table with percents in the cells rather than counts? I've looked at table, ftable, xtabs, and ctab, which did the conversion but returned the results in a single row without labels. xtab in the prettyR package. Jim __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] least median squares
Does anyone can provide a code to implement least median squares regression in R (not using the lqs function or calling C functions)? Reason: teaching/learning purposes Thanks PM __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] least median squares
Try this using builtin data set BOD: medsq - function(p, DF) median((p[1] + p[2] * DF[[1]] - DF[[2]])^2) init - coef(lm(demand ~ Time, BOD)) optim(init, medsq, DF = BOD) library(MASS) lqs(demand ~ Time, BOD, method = lms) On 10/22/06, Pedro Mardones [EMAIL PROTECTED] wrote: Does anyone can provide a code to implement least median squares regression in R (not using the lqs function or calling C functions)? Reason: teaching/learning purposes Thanks PM __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] currency or stock trading strategy
Thanks to everyone for many useful tips to more information.I think this thread is worth reading, if only for that reason alone (despite some disparaging comments about my research abilitites). I appreciate the advice from those who have more experience with these finance tools than me (that is not hard!), as your advice could save me many, many hours of googling and reading materials, trying to form my own impressions and evaluations of the strengths and weaknesses in various packages, with the aim of finding something easy to work with at a relatively high level of function and performance (without the price tag!). So, thanks! I wonder if I might pose a small coding problem to the list, with the intention that list members might offer some code in this thread to illustrate the utilities available? I would hope to see some alternatives and some further commentary on preferred methods. Take for instance, the charts plotted on this page: http://invest.kleinnet.com/bmw1/intro.html I think it should be easy to keep a daily catalogue of all these charts, and the associated metrics, for all stocks that have quote data readily available. What is the most efficient, succint methods for doing this with open-source software? Best, Darren On 9/19/06, BBands [EMAIL PROTECTED] wrote: On 9/17/06, Darren Weber [EMAIL PROTECTED] wrote: Hi, are there any good charting and analysis tools for use with currencies, stocks, etc. in R? I have some tools to download currency data from the NYFRB using python and XML. Can we get and parse an XML download using R? Can we have interaction in R plots? Does anyone use R for back-testing trading strategies? Are there any forums for discussion of using R for this specific purpose (apart from this general list)? Is anyone aware of any general open-source developments for these purposes (I don't see any from GNU or google searches)? Take care, Darren I hardly know where to start either. I use R, Python, rpy, gnuplot-py and gnuplot together and am very happy with the result. I find that each piece has it own strengths and together they provide a very powerful solution. I am currently working on integrating TA-lib into this mix via a project started by Andrea Malagoli--see the r-sig-finance archives. http://rpy.sourceforge.net/ http://www.gnuplot.info/ http://gnuplot-py.sourceforge.net/ http://ta-lib.org/ jab -- John Bollinger, CFA, CMT www.BollingerBands.com If you advance far enough, you arrive at the beginning. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.