[R] cph/nomogram Design/RMS package hazard ratio: interquartile vs per unit
Hello,

I am constructing a nomogram using the cph and nomogram functions in Dr. Harrell's Design/RMS package. The HRs that I obtain for dichotomous and categorical variables are identical to those that I obtain using Stata's stcox. However, the inter-quartile HR I obtain for continuous variables is obviously different, since Stata gives me the HR per unit (year, centimeter, etc.), as coxph would. My question is whether this will affect the output of the nomogram. I'm assuming that the nomogram is constructed using the hazard per unit rather than between quartiles - is this true?

Also, I've found that I do not need to create indicator variables for my categorical variables when I use cph. Is this also correct?

I appreciate your feedback. Thank you.
~Renee

-- View this message in context: http://r.789695.n4.nabble.com/cph-nomogram-Design-RMS-package-hazard-ratio-interquartile-vs-per-unit-tp3923896p3923896.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
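For a continuous predictor that enters the model as a single linear term, the per-unit and inter-quartile hazard ratios are the same coefficient expressed over different contrasts. The sketch below shows this with plain arithmetic. The coefficient and quartiles are made-up numbers, not output from cph or stcox, and the relationship holds as written only for a linear term (no splines or interactions).

```r
# Hypothetical values for illustration only (not from any fitted model)
beta_per_year <- log(1.05)   # assumed log hazard ratio per 1-year increase in age
q1 <- 50                     # assumed lower quartile of age
q3 <- 65                     # assumed upper quartile of age

hr_per_unit <- exp(beta_per_year)              # what a per-unit table reports
hr_iqr      <- exp(beta_per_year * (q3 - q1))  # Q3-versus-Q1 contrast for the same term

# Same coefficient, different contrast
stopifnot(isTRUE(all.equal(hr_iqr, hr_per_unit^(q3 - q1))))
round(c(per_unit = hr_per_unit, interquartile = hr_iqr), 3)
```

Because both summaries come from one fitted coefficient, a nomogram drawn from the fit should not depend on which contrast a summary table happens to display.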
Re: [R] Foreach (doMC)
Dear list members, dear Jay,

Well, I personally do not care about Revolution Analytics selling their products, as this is also included in the idea of many open source licences. Especially as Revolution provides their packages to the community, and it is everybody's personal choice whether to buy their special R version.

I was just wondering about this issue because most questions on r-help are usually answered pretty soon and by many different people, and I had the impression that this is not the case for posts regarding the foreach/doMC/doSMP etc. packages. This may, however, also be due to the probably limited use of these packages, since most users do not need these high-performance computing things. Or it was just my personal perception or pure chance.

Thanks, however, to the authors of such packages! They were of great help to me on several occasions, and I have deep respect for everybody devoting their time to open source software!

Jannis

On 10/19/2011 01:26 PM, Jay Emerson wrote:

P.S. Is there any particular reason why there are so seldom answers to posts regarding foreach and all these doMC/doSMP packages? Do so few people use these packages, or does this have anything to do with the commercial origin of these packages?

Jannis,

An interesting question. I'm a huge fan of foreach and the parallel backends, and have used foreach in some of my packages. It leaves the choice of backend to the user, rather than forcing some environment. If you like multicore, great -- the package doesn't care. Someone else may use doSNOW. No problem.

To answer your question, foreach was originally written by (primarily, at least) Steve Weston, previously of REvolution Computing. It, along with some of the parallel backends (perhaps all at this point, I'm out of touch), is available open-source. Hence, I'd argue that the commercial origin is a moot point -- it doesn't matter, it will always be available, and it's really useful.

Steve is no longer with REvolution, however, and I can't speak for the responsiveness/interest of current REvolution folks on this point. Scanning R-help daily for things relating to my own packages is something I try to do, but it doesn't always happen.

I would like to think foreach is widely used -- it does have a growing list of reverse depends/suggests. And it was updated as recently as last May, I just noticed. http://cran.r-project.org/web/packages/foreach/index.html

Jay
Re: [R] identifying groups in xyplot
There's a great tutorial online that helped me out a lot - Lattice and Other Graphics in R, by J H Maindonald at the Centre for Mathematics and Its Applications at Australian National University. http://maths.anu.edu.au/~johnm/r-book/2edn/xtras/rgraphics.pdf

I gave my lattice object a name, and then I was able to superimpose changes via the update function. I'm sure there are many other ways to do this, but this was very simple to follow and delivered results quickly.

Fieldplots = xyplot(Hill.s.diversity ~ Year | Field, groups=Management, layout=c(2,3), data=summer_pr_avg, auto.key=TRUE)
Fieldplots
update(Fieldplots, main="Hill's evenness by Field, June 09-11", par.settings = simpleTheme(pch=c(1, 3, 4)))
Re: [R] Scatterplot with the 3rd dimension = color?
If it would help get any assistance with my issue, here's another method I'm trying (using R sample data):

ggplot(mtcars, aes(disp)) +
  geom_point(aes(y = mpg, colour = qsec)) +
  scale_colour_gradient(low="yellow", high="green") +
  geom_point(aes(y = cyl, colour = qsec)) +
  scale_colour_gradient(low="red", high="blue")

What I want is the var mpg to be colored by the var qsec from yellow to green and then the var cyl to be colored by the var qsec from red to blue. Instead, both colors end up being from red to blue.

Thanks again,
kb
Re: [R] Aggregating data help
check this out http://www.r-bloggers.com/pivot-tables-in-r/
Re: [R] Vegan: Anova.CCA accessing original data using option by=
Dear Jari,

Thank you for your quick reply and the time you have spent assisting with this problem. Indeed, the alias tool identifies one variable that, when removed from the capscale model, solves the problem. Once again, I greatly appreciate your assistance.

Regards
Steve Pawson
Scientist (Entomology)
Scion
Forestry Rd, P.O. Box 29-237, Christchurch, New Zealand
DDI +64 (0)3 3642987 Ext 4832
Cell +64 (0)27 4400727
www.scionresearch.com

From: Jari Oksanen [via R] [mailto:ml-node+s789695n3921456...@n4.nabble.com]
Sent: Thursday, 20 October 2011 11:18 p.m.
To: Steve Pawson
Subject: Re: Vegan: Anova.CCA accessing original data using option by=

Steve Pawson Steve.Pawson at scionresearch.com writes:

My apologies for the delay in responding to your request for further information. I have been travelling for work since you replied and have only just returned to email contact. The output from the traceback is as follows.

# This is the capscale model that I called
beetlecap <- capscale(log(beetles+1) ~ size + Clearfell + Absolute.Distance + Distance_from_edge + clearfell.harvest_area + Canopy.Cover + X500mnative + Litter3 + X500mexotic + X5000exotic + Condition(AdjLong + AdjLat + AdjLat.2 + AdjLat.2.long + AdjLong.3), environ, distance = "bray")

This is the ANOVA by margin option with the error:

anova(beetlecap, by="margin")
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

Corresponding traceback:

traceback()
9: `colnames<-`(`*tmp*`, value = c("CAP1", "CAP0"))
8: capscale(formula = log(beetles + 1) ~ size + Clearfell + Absolute.Distance + Distance_from_edge + clearfell.harvest_area + Canopy.Cover + X500mnative + Litter3 + X500mexotic + X5000exotic + Condition(AdjLong + AdjLat + AdjLat.2 + AdjLat.2.long + AdjLong.3) + Condition(size + Clearfell + Absolute.Distance + Distance_from_edge + clearfell.harvest_area + Canopy.Cover + Litter3 + X500mexotic + X5000exotic + AdjLong + AdjLat + AdjLat.2 + AdjLat.2.long + AdjLong.3), data = environ, distance = "bray")
[...clip...]

Dear Steve Pawson,

With the help of this message I was able to construct an example that gives the same error message -- this does not prove that the cause of the problem is the same, but it is possible. It may be that your *huge* model has redundant variables that cannot be analysed in a marginal test: the other variables explain everything, and the marginal effect of some variables is zero. With as high a number of variables as you have, this is very likely. It seems that capscale() cannot cope with this case.

I fixed capscale in http://vegan.r-forge.r-project.org and now it handles these redundant variables smoothly (skips them in the permutation test, and reports df=0). From your point of view it may be unfortunate that I released a new version of vegan a couple of hours before checking R-News mail, and therefore this fix is not yet in the next release, and as we just had a release we probably (hopefully) will not have a new revision very soon.

So your choices are either to use the vegan version in R-Forge (which must be at least r1958) or to simplify your model so that you don't have redundant variables. One way of achieving this is to use the command

alias(beetlecap, names = TRUE)

which will list the names of the variables that cannot be analysed. You can remove these variables without influencing your fitted model, because they really are redundant variables.

Cheers,
Jari Oksanen
[R] Collapse UP a dendrogram
I created a dendrogram (ddg0) using hclust in the usual way. I want to collapse UP the tree in various ways, that is, from the leaves up to the root. Optimally, I would give the id of a member of a final split in ddg0, and get back a new ddg1 with that split collapsed. Alternatively, I could give a depth to collapse up to (such that ddg1 would have n fewer levels than ddg0). That sort of thing.

I could, of course, program this myself, but it seems like something that is so obviously needed that there is very likely to be a package that does it already. Is there?

TIA
'Jeff
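One possible starting point, offered as a sketch rather than a complete answer: base R's cut() method for dendrograms collapses everything below a chosen height, which covers the "collapse up to a given level" variant but not collapsing one named split. The data set and the cut height below are arbitrary choices for illustration.

```r
# Toy data: cluster the first 10 rows of the built-in USArrests data set
hc   <- hclust(dist(USArrests[1:10, ]))
ddg0 <- as.dendrogram(hc)

# Cut height halfway between the lowest and highest merge (an arbitrary choice)
h <- mean(range(hc$height))

parts <- cut(ddg0, h = h)
ddg1  <- parts$upper   # tree above h; each cut-off subtree is reduced to one leaf
sub   <- parts$lower   # list holding the subtrees that were collapsed

length(sub)            # number of collapsed branches
plot(ddg1)
```

The collapsed subtrees stay available in parts$lower, so nothing is lost by cutting.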
[R] 'Apply' giving me errors
So I have a simple function:

bmass=function(y){
  weight=y$WT*y$MSTR
  return(bio)
}

And want to apply it to a whole bunch of rows in my data.frame:

final1=apply(final,1,yldbu)

BUT... I receive the following error:

Error in y$WT : $ operator is invalid for atomic vectors

However, when I try:

final[1,]$WT*final[1,]$MSTR
[1] 156.3

it gives me the correct answer... What is apply not liking in my code?

Thanks
[R] plotting average effects.
hi... i am a phd student using r. i am having difficulty plotting average effects. admittedly, i do not really understand what each of the commands means, so when i get the error i am not sure where the issue is. here is my code... i will include the points at which there are errors

dat2 <- dat3 <- dat
dat2$popc100 <- dat2$popc100 + 1000
dat2$popc100[which(dat2$popc100 > max(dat$popc100))] <- max(dat$popc100)
dat3$popc100 <- dat$popc100 - 1000
dat3$popc100[which(dat3$popc100 < min(dat$popc100))] <- min(dat$popc100)
pred1 <- predict(mod, type="response")
pred2 <- predict(mod, newdata=dat2, type="response")
pred3 <- predict(mod, newdata=dat3, type="response")
pop.group <- cut(dat$popc100kpc, breaks=quantile(dat$popc100kpc, seq(0,1,by=.3)), include.lowest=T)
means <- by(cbind(pred1, pred2, pred3), list(pop.group), apply, 2, mean)
means <- do.call(rbind, means)
par(mar=c(7,4,4,2))
plot(c(1,10), range(c(means)), type="n", xlab="", ylab="Predicted Probability", axes=F)
plot(c(1,10), range(c(means)), type="n", xlab="pop pc by 100k", ylab="Predicted Probability", axes=F)
arrows(1:10, means[,1], 1:10, means[,2], code=2, length=.1)
arrows(1:10, means[,1], 1:10, means[,3], code=2, length=.1, col="red")
points(1:10, means[,1], pch=16)
Error in xy.coords(x, y) : 'x' and 'y' lengths differ

as i understand it, i need to change the means[,1]. i have tried a few combos and i am not getting anywhere... further, my arrows are huge and points are not appearing in my plot. is there anywhere i can find a breakdown of each of these commands and what each part means? i understand the lengths, colors, xlab, ylab, etc.

thanks in advance for any insight you can give me.

http://r.789695.n4.nabble.com/file/n3923982/effplot_copy.jpg
Re: [R] plotting average effects.
let me clarify, i understand what differing x, y lengths mean. i understand the concept of average effects, etc. i just don't understand how one would fix it. thanks.
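One possible cause, assuming the code in the original post is what was run: seq(0, 1, by = .3) yields only four break points, so cut() creates three groups and means has three rows, while the plotting calls use 1:10 as x positions. The sketch below uses made-up data in place of dat$popc100kpc to show the group counts and one way to derive the x positions from the object being plotted.

```r
x <- 1:100   # stand-in for dat$popc100kpc (hypothetical data)

# Breaks at the 0th, 30th, 60th and 90th percentiles: only 3 intervals
g3 <- cut(x, breaks = quantile(x, seq(0, 1, by = 0.3)), include.lowest = TRUE)
nlevels(g3)
sum(is.na(g3))   # values above the 90th percentile fall outside every interval

# Breaks at every 10th percentile: 10 intervals covering the whole range
g10 <- cut(x, breaks = quantile(x, seq(0, 1, by = 0.1)), include.lowest = TRUE)
nlevels(g10)

# Safer x positions: take them from the matrix that will be plotted
means <- cbind(tapply(x, g10, mean))   # one row per group
xpos  <- seq_len(nrow(means))
length(xpos) == nrow(means)
```

With by = .1 in the quantile call, or with seq_len(nrow(means)) in place of 1:10, the x and y vectors passed to arrows() and points() would have matching lengths.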
[R] geom_tile rendering problems
Hi, I'm trying to overlay a geographical map with a heat map by following the directions on http://pages.stern.nyu.edu/~achinco/programming_examples/Example__PlotGeographicDensity.html. However, the smaller my zoom level (the farther I zoom out), the more white horizontal lines I have interspersed in the tiled data after calling geom_tile. Is there any way around this? I tried setting the image to panel.background, but found it impossible to scale back to the original heat map matrix.
Re: [R] Ordered probit model -marginal effects and relative importance of each predictor-
late to the game but maybe this will help: https://docs.google.com/viewer?a=vq=cache:8PhCkZxP9zQJ:www.quantoid.net/Effects_package_4up.pdf+plot+effects+of+variables+in+R+nonlinear+arrowshl=engl=uspid=blsrcid=ADGEEShUKOEcifzuWGxWvakh0yD4KtnLgBhLFvX5cCAkwewyQ75uznTw1OYybx6vrGuJflgMw6QYKGwuXQViNGZCh_lt8H4DAqKNPxI8y2hYTQJTyaMD4tZ0DKrYMNGtIY3B34qp-LBXsig=AHIEtbTdt10fNCOPXtg5nPSs85NqPFlAvA
[R] re coercing data frame rows to character: Am I right that this is a bug?
Dear Folks--

All this seems to me to behave the way you expect, recognising that column b is a factor:

AA <- data.frame(a=3:4, b=c('x', 'y'))
AA[1,]
  a b
1 3 x
as.numeric(AA[1,])
[1] 3 1
AA[,2]
[1] x y
Levels: x y
as.numeric(AA[,2])
[1] 1 2
as.character(AA[,2])
[1] "x" "y"

But this seems to me to be wrong:

as.character(AA[1,])
[1] "3" "1"

Shouldn't it be:

[1] "3" "x"

to be consistent with the normal pattern of coercing factors to character values? If it is a bug, is this the right place to post it?

sincerely, andrewH
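A possible workaround, sketched under the assumption that the goal is one character value per column: a one-row data frame is still a list of columns, so each column can be converted on its own instead of coercing the whole row at once. Note that recent R versions no longer create factors by default, so the example asks for them explicitly to reproduce the situation above.

```r
# Since R 4.0.0 strings are not converted to factors by default, so request it
AA <- data.frame(a = 3:4, b = c("x", "y"), stringsAsFactors = TRUE)

# A one-row data frame is a list of columns; convert each column separately
row_chr <- vapply(AA[1, ], as.character, character(1))
row_chr
```

The result is a named character vector holding the factor's label rather than its integer code.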
Re: [R] stop R from rounding
David Winsemius dwinsem...@comcast.net on Thu, 20 Oct 2011 01:51:28 -0400 writes:

On Oct 19, 2011, at 11:29 PM, Alyse wrote:

Hello, I have a column in a data frame that needs to be 10 digits long. As such:

  Decimal.Year
1 1994.25997
2 1994.26020

However, R keeps rounding the digits. As such:

  Decimal.Year
1 1994.260
2 1994.260

*Is there any way to stop this from happening?* Here is how I created the data frame:

x <- read.table('bats_1994_CTD.txt')
colnames(x) <- c('Cruise ','Dec.Year','Lat.N','Long.W','Press','Depth','Temp','Sal','Oxy')
date <- subset(x, select=c(Dec.Year), (Depth<201) & (Depth>199))
datelist <- list(date$Dec.Year)
temp <- subset(x, select=c(Temp), (Depth<201) & (Depth>199))
tempmean <- aggregate(temp, by=datelist, FUN=mean)
tempframe <- data.frame(tempmean)
# the first column of this data frame is the one that I don't want R to round

R is not rounding. It is displaying with less than full precision. You can control that with format or sprintf or formatC.

Well, or more simply in such situations by

options(digits = 10) # if it's 10 (significant) digits you want
# uses 10 (sig..) digits *FROM NOW ON*

or, if it's just for this one printing, instead of saying

x

which is *equivalent* to print(x), use

print(x, digits = 10)
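The display options mentioned above can be compared side by side in a short sketch. The two values are taken from the question; everything else uses base R only.

```r
x <- c(1994.25997, 1994.26020)

print(x, digits = 10)                 # one-off: show 10 significant digits
sprintf("%.5f", x)                    # fixed 5 decimals, returns character strings
formatC(x, digits = 5, format = "f")  # same idea via formatC

# The stored values are not changed by how they are displayed
x[1] == 1994.25997
```

sprintf() and formatC() return character vectors, so they suit output and labels, while print(x, digits = 10) or options(digits = 10) only change how numbers are shown.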
[R] use of segments in PLS
How do I use segments in PLS?

fit1 <- mvr(formula=Y~X1+X2+X3+X4+x5++x27, data=Dataset, comp=5, segment=7)

When I use segments here, the error is:

Error in mvrCv(X, Y, ncomp, method = method, scale = sdscale, ...) : argument 7 matches multiple formal arguments

Please help
Re: [R] bar plot issues
On 20.10.2011 22:29, Henri-Paul Indiogine wrote:

Hi Uwe!

2011/10/20 Uwe Ligges lig...@statistik.tu-dortmund.de:

arrange it outside by, e.g. increasing the size of margins (see argument mar in ?par) and place a separate legend (see ?legend) into the margins (see xps argument in ?par).

I could not find 'xps', do you mean 'xpd'?

Yes, sorry, a typo.

Best, Uwe

This is what I have so far:

par(mar=c(5.1,4.1,4.1,12.1))
barplot(t(file.codes), beside = FALSE, legend = FALSE, main="test stacked bar plot", xlab="documents", ylab="number of codes", col=rainbow(ncol(file.codes)), names.arg = rep(NA, nrow(file.codes)))

danke,
Henri-Paul
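The advice in this exchange (a wider right margin, xpd, and a separate legend) might be put together as follows. This is a sketch with made-up counts shaped like file.codes; the legend position is one choice among many.

```r
# Made-up counts shaped like the poster's file.codes (rows = files, columns = codes)
file.codes <- matrix(c(2, 0, 0, 5, 4,
                       3, 18, 1, 0, 2),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("file.1", "file.2"), paste0("code.", 1:5)))
cols <- rainbow(ncol(file.codes))

# Wide right margin; xpd = TRUE allows drawing outside the plot region
op <- par(mar = c(5.1, 4.1, 4.1, 12.1), xpd = TRUE)
mids <- barplot(t(file.codes), beside = FALSE, col = cols,
                main = "test stacked bar plot",
                xlab = "documents", ylab = "number of codes")

# Anchor the legend at the top-right corner of the plot region (user coordinates),
# so it extends into the right margin
legend(x = par("usr")[2], y = par("usr")[4],
       legend = colnames(file.codes), fill = cols, bty = "n")
par(op)   # restore the previous graphics settings
```

Saving and restoring par() keeps the wide margin from leaking into later plots.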
[R] R square and F - stats in PLS
In the lm function, summary(lmobject) gives the adjusted R-squared and the F statistic. Is there something similar in the pls package, and how do I get it?
[R] POT package
Hi Sir,

Could you please tell me why the range c(20945, 20947) is used in this function:

npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"], na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))

Please explain the logic. Looking forward to a quick response.

Regards
-- *Amina Shahzadi*
Re: [R] stacked plot
It appears that your object is currently a matrix. Here's a toy example to illustrate how to get a stacked bar chart in ggplot2:

library('ggplot2')
m <- matrix(1:9, ncol = 3, dimnames = list(letters[1:3], LETTERS[1:3]))
(d <- as.data.frame(as.table(m)))
  Var1 Var2 Freq
1    a    A    1
2    b    A    2
3    c    A    3
4    a    B    4
5    b    B    5
6    c    B    6
7    a    C    7
8    b    C    8
9    c    C    9

ggplot(d, aes(x = Var1, y = Freq, fill = Var2)) +
  geom_bar(position = 'stack', stat = 'identity') +
  labs(x = 'Variable 1', y = 'Frequency', fill = 'Group') +
  scale_fill_manual(values = c('A' = 'red', 'B' = 'blue', 'C' = 'green'))

This plot uses Var1 as the x-variable, Freq as the response and Var2 as the variable whose frequencies are to be stacked, distinguished by fill color. position = 'stack' designates the stacking, while stat = 'identity' indicates that the y variable Freq should be used to represent the counts. labs() designates the labels for each axis; the fill = label indicates the legend title for the fill colors. Finally, the scale_fill_manual() function is used to manually assign specific colors to levels of the fill variable Var2. The scale_fill_manual() code could also have been written as

... + scale_fill_manual(breaks = levels(d$Var2), values = c('red', 'blue', 'green'))

with the same result.

HTH,
Dennis

On Thu, Oct 20, 2011 at 10:08 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote:

Hi!

I am trying to use ggplot2 to create a stacked bar plot. Previously I tried using barplot() but gave up because of problems with the positioning of the legend and other appearance problems. I am now trying to learn ggplot2 and use it for all the plots that I need to create for my dissertation. I am able to create normal bar plots using ggplot2, but I am stumped by the stacked bar plots.

This works:

barplot(t(file.codes), beside = FALSE)

The data.frame file.codes looks like this:

       code.1 code.2 code.3 code.4 code.5
file.1      2      0      0      5      4
file.2      3     18      1      0      2

I would like each file to be a bar and then each code stacked for each file. By transposing the file.codes data.frame, barplot() will allow me to do so. I am trying to obtain the same result in ggplot2, but I think that qplot wants the data to be like this:

file.1 code.1 2
file.1 code.2 0
file.1 code.3 0
file.1 code.4 5
file.1 code.5 4
file.2 code.1 3
file.2 code.2 18

I think that I need to use the package reshape, but I am not sure whether to use cast(), melt(), or recast() and how to set up the function.

Thanks,
Henri-Paul

-- Henri-Paul Indiogine
Curriculum & Instruction
Texas A&M University
TutorFind Learning Centre
Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine
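The as.data.frame(as.table()) idiom from the reply can be applied to data shaped like file.codes without the reshape package. This is a sketch with the counts from the question typed in by hand; the column names file, code and count are chosen here for illustration.

```r
# Counts copied from the question; object layout is an assumption
file.codes <- data.frame(code.1 = c(2, 3), code.2 = c(0, 18), code.3 = c(0, 1),
                         code.4 = c(5, 0), code.5 = c(4, 2),
                         row.names = c("file.1", "file.2"))

# Wide -> long: one row per (file, code) pair
long <- as.data.frame(as.table(as.matrix(file.codes)))
names(long) <- c("file", "code", "count")   # illustrative names
long[order(long$file), ]
```

The long data frame could then be passed to ggplot() with x = file, y = count and fill = code, following the pattern in the reply.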
Re: [R] use of segments in PLS
arunkumar akpbond...@gmail.com writes:

How to use the segments in the PLS

fit1 <- mvr(formula=Y~X1+X2+X3+X4+x5++x27, data=Dataset, comp=5, segment=7)

here when i use segments, the error was like this

Error in mvrCv(X, Y, ncomp, method = method, scale = sdscale, ...) : argument 7 matches multiple formal arguments

This cannot be true. mvr does not call mvrCv unless you give it the argument validation = "CV" or validation = "LOO".

Anyway, the argument is segments, not segment, which - as the error message says - matches multiple arguments, in this case also segment.type.

-- Regards, Bjørn-Helge Mevik
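The error comes from R's partial matching of argument names, which can be reproduced without the pls package. The function below is a toy whose two argument names mimic segments and segment.type; it is not the real mvrCv code.

```r
# Toy function with two formals that share the prefix "segment" (not pls code)
cv_toy <- function(segments = 10, segment.type = "random", ...) {
  paste(segments, segment.type)
}

cv_toy(segments = 7)   # exact name: fine

# "segment" is a prefix of both formals, so R cannot tell which one is meant
msg <- tryCatch(cv_toy(segment = 7), error = function(e) conditionMessage(e))
msg
```

Spelling the argument name out in full avoids the ambiguity.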
Re: [R] R square and F - stats in PLS
arunkumar akpbond...@gmail.com writes:

In the lm function the summary(lmobject) we have adjusted.r square and f statistics. Do we have similar to the pls package and how to get it

No. Both of these require theory about the model that doesn't exist for PLSR. (I should note that a couple of generalisations of the degrees of freedom to general regression models have been published, and these could be used to calculate an adjusted R^2. However, they have not been implemented in the pls package.)

It seems you would like to use PLSR the way you use OLS, with classical hypothesis tests and performance statistics. This is not how PLSR is usually applied, and there are few such tools. The traditional/typical focus amongst PLSR practitioners is much more on prediction performance (RMSEP) and on interpretation by plotting scores and loadings.

-- Regards, Bjørn-Helge Mevik
Re: [R] 'Apply' giving me errors
On 21.10.2011 02:09, kickout wrote:

So i have a simple function:

bmass=function(y){
  weight=y$WT*y$MSTR
  return(bio)
}

And want to apply to a whole bunch of rows in my data.frame:

final1=apply(final,1,yldbu)

BUT... receive the following error:

Error in y$WT : $ operator is invalid for atomic vectors

However when i try:

final[1,]$WT*final[1,]$MSTR
[1] 156.3

It gives me the correct answer... what is apply not liking in my code?

Because apply passes the rows into your function as vectors, not as a data.frame of 1 row. I wonder why you need apply() at all, since

final$WT * final$MSTR

should do.

Uwe Ligges

Thanks
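Both points in the reply can be seen in a small sketch. The values are made up, with the first row chosen to reproduce the 156.3 from the question.

```r
final <- data.frame(WT = c(100, 200), MSTR = c(1.563, 0.5))  # made-up values

# Vectorised: works on whole columns at once
bio <- final$WT * final$MSTR

# apply() converts the data frame to a matrix and hands each row over as a named
# atomic vector, so elements are reached with [ ] and a name, not with $
bio_apply <- apply(final, 1, function(y) y["WT"] * y["MSTR"])

all.equal(bio, unname(bio_apply))
```

The vectorised form is shorter and avoids the matrix conversion, which would also turn every column into character if the data frame held any non-numeric column.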
Re: [R] R code Error : Hybrid Censored Weibull Distribution
On Oct 20, 2011, at 21:25, ritwi...@isical.ac.in wrote:

Dear Sir/madam,

I'm having a problem with R code which calculates the Fisher Information Matrix for the Hybrid Censored Weibull Distribution. My problem is this: when I take weibull(scale=1,shape=2) {i.e. shape>1} I get my desired result, but when I take weibull(scale=1,shape=0.5) {i.e. shape<1} it gives an error:

Error in integrate(int2, lower = 0, upper = t) : the integral is probably divergent.

I could not find any theoretical interpretation of it. I'm sending the code:

The code doesn't work...

output=f3(5,10)
Error in f(x, ...) : object 'p' not found

Furthermore, if I guess p=.5, lamda=1, n=10, the code doesn't even break:

output=f3(5,10)
output
[1] 1.155917

So what do you expect _us_ to do about it? I strongly suspect that actually testing the code (in a clean R session) would have revealed issues causing you not to have to submit the post at all...

-pd

-- Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] replicating SAS's proc rank procedure
Hi,

Try this function I've written. It should be self-explanatory, but let me know if you have any problems. I've only been using R for a few weeks, so apologies if it's not the most efficient!

rankit2 <- function(rankvar, cuts, data, factor) {
  ranker <- rankvar
  ranker <- 0
  range <- c(1:cuts)
  range2 <- range/cuts
  range3 <- quantile(factor, range2)   # upper bound of each group
  over <- length(factor)
  for (i in 1:over) {
    for (j in 1:cuts) {
      # assign the first group whose upper bound covers the value
      if (data[[i,1]] <= range3[[j]]) {
        data[[i,3]] <- j
        ##test <- j
        ##print(j)
      }
      if (data[[i,3]] > 0) break
    }
  }
  out2 <- data
  return(out2)
}

cars$rank <- 0
try2 <- rankit2(rank, 15, cars, cars$speed)
try2

All the best,
Leigh
RCalc partner
www.RCalc.co.uk
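A loop-free alternative might look like the sketch below. It assumes the grouping rule that SAS documents for PROC RANK with GROUPS=k, namely floor(rank * k / (n + 1)) with groups numbered from 0, and it assumes ties are handled by averaging ranks; both assumptions should be checked against the SAS output being replicated. The function name rank_groups is made up for this example.

```r
# Quantile groups in the style of PROC RANK GROUPS=k (assumed rule, see above):
# group = floor(rank * k / (n + 1)), numbered from 0
rank_groups <- function(x, k) {
  r <- rank(x, ties.method = "average", na.last = "keep")  # NA stays NA
  n <- sum(!is.na(x))
  floor(r * k / (n + 1))
}

cars$grp <- rank_groups(cars$speed, 5)   # built-in cars data, 5 groups
table(cars$grp)
```

Unlike quantile-based cutting, this works from ranks alone, so group sizes differ by at most the effect of ties.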
Re: [R] R-help Digest, Vol 104, Issue 19
On Oct 21, 2011, at 09:01, Martin Maechler wrote:

ARE == Alex Ruiz Euler rruizeu...@ucsd.edu on Wed, 19 Oct 2011 14:05:16 -0700 writes:

ARE Motion supported. Very.

ARE On Wed, 19 Oct 2011 15:40:14 +0200 peter dalgaard pda...@gmail.com wrote:

Argh! Someone please unsubscribe this guy? He did this over Summer too and still hasn't learned that the recipients of R-help do not care whether he is out of office! -pd

Well, there are hundreds like him. The only difference being that he speaks Hungarian..

You might filter on the Subject line being Re: [R] R-help Digest.*, with no attention to content. That has an obvious side effect, but maybe not a harmful one... -pd

Why? I (as R-* mailing list site maintainer) have had (procmail) filters that automatically catch such 'out of office' messages, so the 10'000 readers don't have to get them. The current set of filters catches a set of English, French, German,.. (and I don't know) messages. So I have (many!!) filters like this:

:0
* ^Subject: (Re|Holiday|Vacation): .*[-A-za-z]+ Digest, Vol [1-9][0-9]*, Issue [1-9][0-9]*
{
  :0B
  * I( will not be reading.*\e?[-]?mail|.* away .* attend to your message when I get)
  mlist-bounced.spool
}

--- but can't start doing that for Hungarian or Chinese or ...

Martin

-- Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Foreach (doMC)
Jay, sorry if my post was not precise enough. I simply wanted to point out that I personally have no problem at all with commercial R products as I have the free choice to use them or their open source alternatives. In addition Revolutions is supplying their packages for free to the R community which is great! I was purely curios whether other R users may have different opinions but as you are the only one replying I would imagine that this is no problem for most users. I will browse the list archive as you suggeted to get some impression on this. So, it is probably time to close this post for not beating the dead horse? Thanks anyway, Jay, for your detailed explanations of the origin of these R packages! Best Jannis On 10/21/2011 02:34 AM, Jay Emerson wrote: Jannis, I'm not complete sure I understand your first point, but maybe someone from REvolution will weigh in. Nobody is forcing anyone to purchase any products, and there are attractive alternatives such as the CRAN R and R Studio (to name two). This issue has arisen many times of the various lists and you are welcome to search the archives and read many very intelligent, thoughtful opinions. As for foreach, etc... if you have fairly focused questions (preferably with a reproducible example if there is a problem) and if you have done reading on examples available on using it, then you might try joining the r-sig-...@r-project.org group. Clearly there are far more users of core R and hence mainstream questions on r-help are likely to be answered more quickly (on average) than specialized questions. Regards, Jay On Thu, Oct 20, 2011 at 4:27 PM, Jannisbt_jan...@yahoo.de wrote: Dear list members, dear Jay, Well, I personally do not care about Revolutions Analytics selling their products as this is also included into the idea of many open source licences. Especially as Revolutions provide their packages to the community and its is everybodies personal choice to buy their special R version. 
I was just wondering about this issue as usually most questions on r-help are answered pretty soon and by many different people and I had the impression that this is not the case for posts regarding the foreach/doMC/doSMP etc packages. This may, however, be also due to the probably limited use of these packages for most users who do not need these high performance computing things. Or it was just my personal perception or pure chance. Thanks however, to the authors of such packages! They were of great help to me on several ocasions and I have deep respect for everybody devoting his time to open source software! Jannis On 10/19/2011 01:26 PM, Jay Emerson wrote: P.S. Is there any particular reason why there are so seldom answers to posts regarding foreach and all these doMC/doSMP packages ? Do so few people use these packages or does this have anything to do with the commercial origin of these packages? Jannis, An interesting question. I'm a huge fan of foreach and the parallel backends, and have used foreach in some of my packages. It leaves the choice of backend to the user, rather than forcing some environment. If you like multicore, great -- the package doesn't care. Someone else may use doSNOW. No problem. To answer your question, foreach was originally written by (primarily, at least) Steve Weston, previously of REvolution Computing. It, along with some of the parallel backends (perhaps all at this point, I'm out of touch) are available open-source. Hence, I'd argue that the commercial origin is a moot point -- it doesn't matter, it will always be available, and it's really useful. Steve is no longer with REvolution, however, and I can't speak for the responsiveness/interest of current REvolution folks on this point. Scanning R-help daily for things relating to my own packages is something I try to do, but it doesn't always happen. I would like to think foreach is widely used -- it does have a growing list of reverse depends/suggests. 
And was updated as recently as last May, I just noticed. http://cran.r-project.org/web/packages/foreach/index.html Jay __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multiple factorial comparison LSD
Please help. I really like R and I have been looking at how to do LSD multiple comparison test with data that has more than one factor. So far, I am unsuccessful. Please help! Me __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cullen and Frey graph in fitdistrplus
Hi,

I've come across something that I can't explain, and I would appreciate it if anyone could have a go at it. In the library "fitdistrplus" there is a function "descdist" that helps with choosing a distribution to fit. The same function also allows bootstrapping, to take into account the uncertainty of the calculated values.

If you run the line below a few times you will find quite a big variation from run to run; in fact, that is what made me suspicious. If I'm always bootstrapping from the same distribution curve, why do I get such variation?

A second question comes to mind, and it is probably due to my lack of knowledge on the subject. If the Cullen and Frey graph is meant to help decide which distribution to choose, is the line below just giving me the uncertainty of the estimates of kurtosis and skewness, and should I ignore all the lines in the graph since I have already fitted a Weibull distribution to the original data?

  library(fitdistrplus)
  descdist(rweibull(1000, shape=13.74286, scale=38.07489), boot=1000)

Cheers
Patrao

-- View this message in context: http://r.789695.n4.nabble.com/cullen-and-Frey-graph-in-fitdistrplus-tp3924732p3924732.html
Re: [R] 'Apply' giving me errors
On Fri, Oct 21, 2011 at 3:09 AM, kickout kyle.ko...@gmail.com wrote:

> So i have a simple function:
> bmass=function(y){ weight=y$WT*y$MSTR return(bio) }

But this just returns bio, and since no object with that name is defined in the function, it will be looked up in the global environment (workspace). If it's not there either, you will get Error: object 'bio' not found. So even if you could fix the apply issue, it would still not work.

But Uwe Ligges showed that you don't need apply to do what you seem to intend here; final$WT*final$MSTR should work.

If you do insist on apply for whatever reason: apply converts X (the first argument) to a matrix, so you can't use the $ operator any more. The column names are preserved though, so what you could do is

  bmass <- function(y) y["WT"] * y["MSTR"]
  apply(final, 1, bmass)

> And want to apply to a whole bunch of rows in my data.frame:
> final1=apply(final,1,yldbu)

What is yldbu? I suppose you meant the function you defined above?

Kenn

> BUT...recieve the following error: Error in y$WT : $ operator is invalid for atomic vectors
> However when i try: final[1,]$WT*final[1,]$MSTR [1] 156.3
> It gives me the correct answerwhat is apply not liking in my code? Thanks
> -- View this message in context: http://r.789695.n4.nabble.com/Apply-giving-me-errors-tp3923880p3923880.html
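To make the two routes concrete, here is a minimal sketch on made-up data. The data frame contents are invented for illustration; only the column names WT and MSTR come from the thread.

```r
# Invented example data; only the column names come from the thread
final <- data.frame(WT = c(10, 20, 30), MSTR = c(0.5, 0.25, 0.1))

# Vectorised route: no apply needed
final$bmass <- final$WT * final$MSTR

# apply route: each row arrives as a named atomic vector, so index by
# name with [[ ]] rather than $
bmass <- function(y) y[["WT"]] * y[["MSTR"]]
row_bmass <- apply(final[, c("WT", "MSTR")], 1, bmass)

# Both routes should agree
all.equal(unname(row_bmass), final$bmass)
```

One caveat worth keeping in mind: if the data frame also holds character or factor columns, apply turns every row into character strings, which is one more reason to prefer the vectorised form.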
Re: [R] Scatterplot with the 3rd dimension = color?
On 10/21/2011 06:25 AM, Kerry wrote:

> Can someone please help me out with this? The ggplot2 suggestion works great but I've spent a few days trying to figure out how to plot 2 variables with it and I'm stuck. Here's my example code: ...

Hi Kerry,

This isn't ggplot2, but it may do what you want.

  library(plotrix)
  # x, y, z and x1, y2, z3 are the vectors from your example code
  oldmar <- par(mar=c(5,4,4,4))
  plot(x,y,type="n")
  # light gray panel with a white grid
  plotlim <- par("usr")
  rect(plotlim[1],plotlim[3],plotlim[2],plotlim[4],col="lightgray")
  grid(col="white")
  box()
  # third dimension shown as point color, one color ramp per data set
  points(x,y,col=color.scale(z,c(1,0),0,c(0,1)),pch=19)
  points(x1,y2,col=color.scale(z3,1,c(0,1),0),pch=19)
  # one color legend per data set, drawn in the right margin
  legendval1 <- seq(min(z),max(z),length.out=5)
  color.legend(2.9,0.5,3.1,1.5,round(legendval1,1),align="rb",gradient="y",
    rect.col=color.scale(legendval1,c(1,0),0,c(0,1)))
  legendval2 <- seq(min(z3),max(z3),length.out=5)
  color.legend(2.9,-1.5,3.1,-0.5,round(legendval2,1),align="rb",gradient="y",
    rect.col=color.scale(legendval2,c(1,1),c(0,1),0))
  par(xpd=TRUE)
  text(3,1.6,"z")
  text(3,-0.4,"z3")
  # restore the graphics settings
  par(xpd=FALSE)
  par(oldmar)

Jim
Re: [R] What does \Sexpr[results=rd]{} exactly mean in Rd?
On 11-10-17 9:53 AM, Yihui Xie wrote: Thanks a lot! Sorry for cross-posting, but I did it intentially because I tend to believe Barry Rowlingson (Why R-help Must Die!), and I will summarize the answers here later to StackOverflow. Another user also told me this worked for 2.13.1, but not later versions. This should now be fixed. Could you please test a version of 2.14.0 beta or R-devel, after r57531? Thanks. Duncan Murdoch Regards, Yihui -- Yihui Xiexieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Mon, Oct 17, 2011 at 8:45 AM, Gavin Simpsongavin.simp...@ucl.ac.uk wrote: On Sun, 2011-10-16 at 19:36 -0500, Yihui Xie wrote: Hi, I have spent a few hours on the R-exts manual and the documentation of parse_Rd() (as well as the PDF document in the references), but I still have not figured out what results=rd means. I thought I could use an R code fragment to create an Rd fragment dynamically. Here is an example, in which I was expected the output to be a describe list DL in HTML, but it turns out not to be true. Perhaps best not to cross post to several internet resources at once. I replied to the same Q on StackOverflow: http://stackoverflow.com/q/7788628/429846 Suffice it to say that your example works for me with 2.13.1 (still need to compile 2.13.2 on my workstation). I left some additional comments and examples, which might help understand this. I had trouble when I first started playing this and didn't pursue further, but I think I am starting to understand how to use this now after taking a look when I tried to answer your Q. G (I was actually building a package with Rd's containing \Sexpr{} instead of really using Rd2HTML(); the content was not rendered after I run R CMD build.) 
des- \\describe{\\item{def}{ghi}} con- textConnection(c(\\title{abc}\\name{abc}, \\details{\\Sexpr[results=rd,stage=build]{des}})) z- parse_Rd(con) Rd2HTML(z, stages = build) close(con) !DOCTYPE html PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN htmlheadtitleR: abc/title meta http-equiv=Content-Type content=text/html; charset=utf-8 link rel=stylesheet type=text/css href=R.css /headbody table width=100% summary=page for abctrtdabc/tdtd align=rightR Documentation/td/tr/table h2abc/h2 h3Details/h3 pdefghi/p /body/html sessionInfo() R version 2.13.2 (2011-09-30) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] tools stats graphics grDevices utils datasets methods [8] base other attached packages: [1] devtools_0.4 loaded via a namespace (and not attached): [1] RCurl_1.6-10 Thanks! Regards, Yihui -- Yihui Xiexieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. 
[w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting average effects.
Hi Denins, and thanks for your reply. I understand x,y are not lining up. I just don't know how to fix it in the code. There is only a small group of us at my university using R (4 people of which I am one). 2 are not even touching the average effects plot option, however myself and my study partner feel it is best. So, we really don't have anyone to ask. Everyone else in our class was taught on STATA. The reasons why are sort of complicated and I don't wish to bore you with details. Basically, we were the first group to be trained using R. This is our 3rd semester using it. Is there an online guide anywhere that will describe exactly what is going on in the plot function for average effects? I have been googling and have not come across anything useful, except this site. - Ph.D. Candidate -- View this message in context: http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925293.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Change column/row-name
I'm not sure I follow: the matrix Iske doesn't have row or column names. Though if you perhaps mean that you want to use the pasted-together rows as names on the distance matrix rather than the converted characters, this will do it:

  Iske.rows <- apply(Iske, 1, paste, collapse = "")  # Perhaps subtract out the 33 you added in
  dimnames(Iske.levens) <- list(Iske.rows, Iske.rows)

On Fri, Oct 21, 2011 at 1:57 AM, Jörg Reuter jo...@reuter.at wrote:

> Hi, I am very happy. My problems are solved without one little thing:
>
> (Iske <- matrix(c(1, 1, 1, 2, 2, 2, 1, 1, 1, 5, 1, 2, 2, 2, 1, 1, 1, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 4, 4, 4, 4, 4, 4, 2, 2, 2, 2, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2), ncol = 5)) #My Matrix
> Iske <- Iske+33 #I want see the letters
> (Iske.char <- apply(Iske, 1, function(x) rawToChar(as.raw(x)))) #Numbers to Char
> LD <- function(s1, s2){
>   require(vwr)
>   s1 = as.character(s1)
>   s2 = as.character(s2)
>   t(sapply(s1, levenshtein.distance, s2))
> }
> Iske.levens <- (LD(Iske.char,Iske.char)) #Calculate the Levenshtein-Distanz
>
> The result:
> !#$% !#$% !#$% !#$% !#$% 0 0 0 !#$% 0 0 0 !#$% 0 0 0 . . .
>
> It is all beautiful. But is there a simple way to change the column/row-name to the original from the Matrix Iske? Thanks a lot for the help yesterday. It was a big step in my life :-)
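The relabelling step can be tried on a small self-contained case. Note the substitution in this sketch: it uses adist() from base R's utils package as the edit-distance function rather than levenshtein.distance from vwr, and the 3 x 3 matrix is made up for illustration.

```r
# Small made-up matrix; each row is one "word" of digits
Iske <- matrix(c(1, 1, 2,
                 1, 2, 2,
                 3, 3, 3), ncol = 3)

# Paste each row together to get one label per row: "113", "123", "223"
labels <- apply(Iske, 1, paste, collapse = "")

# Pairwise edit distances between the labels (base R adist, standing in
# for vwr::levenshtein.distance here)
d <- adist(labels)

# Put the original digit strings on the rows and columns
dimnames(d) <- list(labels, labels)
d
```

The same dimnames assignment applies to a distance matrix computed any other way, as long as the row order matches the order of the labels.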
Re: [R] plotting average effects.
Bart, I apologize. I posted the code I was using in my first comment, to include the error and the plot that is coming up. I was unaware that was not enough. I am not looking for anyone to give me the actual answer to my specific issue, only looking to be pointed in a direction for an online guide that i can read to understand how average effects are plotted in R. Our text book doesn't cover any topics for using R, only theory. - Ph.D. Candidate -- View this message in context: http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925350.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] POT package
Perhaps you could tell us what function you are talking about. npy is not part of the POT package in the version on my machine, and those letters don't seem to show up anywhere consecutively at all on my system, according to ??npy. Similarly, this trick produces no results:

  sapply(ls("package:POT"), function(f) sum(grepl("20945", deparse(get(f)))))

Michael

On Fri, Oct 21, 2011 at 3:35 AM, Amina Shahzadi aminashahz...@gmail.com wrote:

> Hi Sir
> It is requested to please tell the reason why the range of c(20945, 209547) is used in this function
>
> npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"], na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))
>
> Please tell logic. Looking for quick response.
> Regards
> -- *Amina Shahzadi*
[R] windows limits
Hello, Using the rgl package, I can set the device window to any dimension (that I have tested): par3d(windowRect=c(1,1,700,700)) With windows I can't get the window to span from the top to the bottom of the monitor. In the following, no matter how large the ypinch value gets it stops, leaving about 2 inches of space at the bottom of my screen: windows(record=TRUE, ypinch=1100, xpinch=10, xpos=1,ypos=1, rescale='fixed') ...I've read about some limits on windows. Is there any way around these limits? Any way I can get windows to perform like the rgl package 3d device? Thanks for your help! ben [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot() or splom()?: two factors from same data frame
On Fri, 21 Oct 2011, Duncan Mackay wrote: Without a dataset I am not sure what you need. Duncan, Part of the problems I'm trying to resolve come from changing priorities from my client and the regulators. I end up stopping one process and starting on a different one. But, that's life in the real world of environmental consulting. :-) What I need now is to compare TDS (total dissolved solids) with specific conductivity and the ions that are normally comprise TDS. Before running any regression models I need to look at these data from three points of view: all data from all sites collected during the past 30 years; average (or total) concentrations (not yet decided on what makes the most ecological sense) within a stream having multiple collection sites; and by site within certain streams. I think that I need to subset the data frame to create distinct analytical data frames for each comparison, then rm() them until needed again (or I'd have a very large number of files in the directory). If I have a subset, for example, of TDS and conductivity regardless of sample date or location I will have two columns of numbers that will fit the xyplot() formula; e.g., xyplot(TDS ~ Cond). This is the broad picture. I can then use the hydrographic basins (2 of 'em) or streams (24 of 'em) as factors to condition the analysis. Repeat for other parameter pairs (TDS vs. Ca, TDS vs, Mg, etc.). Another part of the issue, perhaps, is that the data are in a single data frame: str(chemdata) 'data.frame': 47244 obs. of 6 variables: $ site: Factor w/ 143 levels BC-0.5,BC-1,..: 134 134 134 127 127 $ sampdate: Date, format: 2006-12-06 2006-12-06 ... $ param : Factor w/ 66 levels AGP,ANP,ANP/AGP,..: 58 66 12 24 59 66 $ quant : num 1.08e+04 7.95 1.80e-02 2.80e+02 1.90e+01 8.44 1.62e+03 $ stream : Factor w/ 24 levels BCrk,CCrk,..: 4 4 4 21 21 21 4 $ basin : Factor w/ 2 levels BasinEast,BasinWest: 1 1 1 1 1 1 1 1 1 2 ... while all the data sets used in the books I've read are simpler. 
What I've not read is guidance on how complex data sets could (or should) be partitioned into smaller but still related data sets to facilitate analyses. I hope this clarifies my initial request. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] windows limits
On 21/10/2011 9:36 AM, Ben qant wrote:

> Hello, Using the rgl package, I can set the device window to any dimension (that I have tested): par3d(windowRect=c(1,1,700,700))
> With windows I can't get the window to span from the top to the bottom of the monitor. In the following, no matter how large the ypinch value gets it stops, leaving about 2 inches of space at the bottom of my screen:
> windows(record=TRUE, ypinch=1100, xpinch=10, xpos=1,ypos=1, rescale='fixed')

This doesn't affect your question, but you shouldn't be using xpinch and ypinch for this: they describe the pixels per inch. You should be using width and height.

The real problem is that you aren't allowed to open windows that are more than 85% of the available size in either direction. This is documented in ?windows. I don't know the reason for this restriction, but it may be so that you can't inadvertently lose the controls on the window (something that happens in rgl in your example code).

You can manually resize the window after creating it (using the mouse); you could write a function to do that if you know Windows API programming, but I don't believe there's one in base R.

Duncan Murdoch

> ...I've read about some limits on windows. Is there any way around these limits? Any way I can get windows to perform like the rgl package 3d device? Thanks for your help! ben
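As a rough sketch of the width/height suggestion: open_plot_window is a hypothetical wrapper name, the sizes are arbitrary values in inches, and the call only does anything on Windows, where the windows() device exists. Requests larger than the documented 85% limit would still be cut down by the device.

```r
# Hypothetical wrapper: ask for the device size in inches via width and
# height instead of xpinch/ypinch. Returns TRUE if a device was opened.
open_plot_window <- function(width = 7, height = 9) {
  if (.Platform$OS.type != "windows") {
    message("windows() is only available on Windows; no device opened")
    return(invisible(FALSE))
  }
  grDevices::windows(width = width, height = height,
                     xpos = 1, ypos = 1, rescale = "fixed", record = TRUE)
  invisible(TRUE)
}

opened <- open_plot_window(width = 7, height = 9)
```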
[R] How to create time series objects combining two vectors
I am new to R and trying to understand time series objects. I have 2 vectors, one containing rainfall values (let's call the vector rain) and the other the time/date in seconds (let's call it time). Is there a method to create a time series object simply by giving the rain and time vectors as inputs? Something along the lines of:

  time_series_object <- time_series_function(rain, time)

Most of the packages I have investigated don't have a simple option like this. Can this be done, however?

-- View this message in context: http://r.789695.n4.nabble.com/How-to-create-time-series-objects-combining-two-vectors-tp3924883p3924883.html
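One possible starting point in base R, offered as a sketch rather than a definitive answer: convert the seconds to date-times and keep them next to the rainfall values. The numbers are invented, and the origin of 1970-01-01 UTC is an assumption, since the post does not say what the seconds are counted from. Packages for irregular series such as zoo accept a values vector and a time index in a similar way.

```r
# Invented example data: seconds since an assumed origin, and rainfall
time <- c(0, 3600, 7200, 10800)
rain <- c(0.0, 1.2, 0.4, 0.0)

# Assumption: seconds are counted from 1970-01-01 00:00:00 UTC
stamp <- as.POSIXct(time, origin = "1970-01-01", tz = "UTC")

# Keep the two vectors together, ordered by time
rain_series <- data.frame(time = stamp, rain = rain)
rain_series <- rain_series[order(rain_series$time), ]
rain_series
```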
[R] varying coefficients model
Dear members,

I'm trying to estimate a varying coefficients model using the local polynomial estimation method in two cases (univariate and bivariate), each with two smooth functions. In the univariate case I use:

  m1 <- smooth.lf(x=lp(z1,by=x1,deg=1,h=0.7,ev=z1)+lp(z3,by=expl2,deg=1,h=0.7,ev=z1)-1,y=dep,kern="gauss",kt="prod")

while in the bivariate case I use:

  m1 <- smooth.lf(x=lp(z1,z2,by=x1,deg=1,h=0.7,ev=z1)+lp(z3,z4,by=expl2,deg=1,h=0.7,ev=z1)-1,y=dep,kern="gauss",kt="prod")

where the varying coefficient variable (x1) is a random normal variable and (z1,z2,z3,z4) are also random normal variables.

My problem is that when I increase the sample size the bivariate case doesn't work, because a "newsplit" error appears. Please, can somebody help me? I don't know whether I am using the "by" term correctly to specify the varying coefficient. I'd appreciate any suggestions.

Thanks a lot
Re: [R] How to remove multiple outliers
Hi Michael,

Thanks for the help. Yes, I have gone through the documentation for ?outlier. As it removes one outlier at a time, and being new to R, I was wondering whether there is any function available for removing multiple outliers without calling, say, rm.outlier n times, because n is not known in advance here.

On the second point, I am using the piece of code below, because I get an error when rm.outlier with the fill = FALSE option is applied to the same dataset.

  outlier_tf1 = outlier(x1,logical=TRUE)
  find_outlier1 = which(outlier_tf1==TRUE, arr.ind=TRUE)
  beh_input_ro1 = x1[-find_outlier1]

  library(outliers)
  beh_input_ro <- rm.outlier(beh_input_dr, fill = FALSE, median = FALSE, opposite = FALSE)
  Error in data.frame(X1 = c(28.7812, 24.8923, 31.3987, 25.774, 27.1798, : arguments imply differing number of rows: 2398, 2390, 2399

Regards,
-Ajit

-- View this message in context: http://r.789695.n4.nabble.com/How-to-remove-multiple-outliers-tp3921689p3924904.html
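One way to drop several flagged values in a single pass, sketched in base R. This swaps in a different rule from the one in the outliers package: it uses the boxplot convention (values more than coef times the interquartile range beyond the hinges), and drop_outliers is a hypothetical helper name. Whether any automatic rule is appropriate for the data is a separate question.

```r
# Hypothetical helper: remove every value that boxplot.stats() flags,
# i.e. points beyond coef * IQR from the hinges, in one call
drop_outliers <- function(x, coef = 1.5) {
  flagged <- boxplot.stats(x, coef = coef)$out
  x[!x %in% flagged]
}

# Made-up data with two obvious extremes
x1 <- c(10, 11, 12, 11, 10, 12, 11, 100, -50)
drop_outliers(x1)
```

For a data frame, the helper could be applied column by column with lapply; the columns will generally end up with different lengths, which would also explain an error like the one quoted above when the results are forced back into one data frame.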
[R] stair-step plot
Hello Is it possible to map a plot with horizontal lines like in the step-plot, but without the vertical lines? Thanks, knut -- View this message in context: http://r.789695.n4.nabble.com/stair-step-plot-tp3924903p3924903.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
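A possible approach, sketched with invented data: set up an empty plot and draw each horizontal piece with segments(), so the vertical connectors that type = "s" would add are never drawn.

```r
# Invented example data
x <- 1:6
y <- c(2, 3, 1, 4, 4, 2)
n <- length(x)

# One horizontal piece per interval: from x[i] to x[i + 1] at height y[i]
steps <- data.frame(x0 = x[-n], y0 = y[-n], x1 = x[-1], y1 = y[-n])

plot(x, y, type = "n")                            # empty frame, axes only
segments(steps$x0, steps$y0, steps$x1, steps$y1)  # horizontal lines only
```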
[R] (no subject)
can i be taken off of this mailing list please? is there another way that you can access this without having to get all the emails?? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to use gev.fit (package ismev) under box constraints?
[R] glm-poisson fitting 400.000 records
Hi,

I am trying to fit a glm-poisson model to 400.000 records. I have tried biglm and glmulti but I have problems... Can it really be the case that 400.000 records are too many? I am thinking of using random samples of my dataset.

Many thanks,

-- View this message in context: http://r.789695.n4.nabble.com/glm-poisson-fitting-400-000-records-tp3925100p3925100.html
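As a quick sanity check on the size question, here is a sketch that fits a Poisson glm to 400,000 simulated records with plain glm(). The data and coefficients are invented, with a single numeric predictor; a real model with many factor levels or interactions can need far more memory, so this says nothing about what went wrong in the case above.

```r
set.seed(1)

# Simulated data: 400,000 records, one numeric predictor
n <- 400000
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))

# Ordinary glm fit; with a model this small the row count alone is
# not a problem on typical hardware
fit <- glm(y ~ x, family = poisson)
coef(fit)
```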
Re: [R] R-help Digest, Vol 104, Issue 19
On Oct 21, 2011, at 09:01 , Martin Maechler wrote: ARE == Alex Ruiz Euler rruizeu...@ucsd.edu on Wed, 19 Oct 2011 14:05:16 -0700 writes: ARE Motion supported. Very. ARE On Wed, 19 Oct 2011 15:40:14 +0200 ARE peter dalgaard pda...@gmail.com wrote: Argh! Someone please unsubscribe this guy? He did this over Summer too and still hasn't learned that 1 recipients of R-help do not care whether he is out of office! -pd Well, there are hundreds like him. The only difference being that he speaks Hungarian.. You might filter on the Subject line being Re: [R] R-help Digest.*, with no attention to content. That has an obvious side effect, but maybe not a harmful one... -pd Why? I (as R-* mailing list site maintainer) have had (procmail) filters that automatically catch such 'out of office' messages, so the 10'000 readers don't have to get them. The current set of filters catches a set of English, French, German,.. (and I don't know) messages So I have (many!!) filters like this: :0 * ^Subject: (Re|Holiday|Vacation): .*[-A-za-z]+ Digest, Vol [1-9][0-9]*, Issue [1-9][0-9]* { :0B * I( will not be reading.*\e?[-]?mail|.* away .* attend to your message when I get) mlist-bounced.spool } --- but can't start doing that for Hungarian or Chinese or ... FYI: holiday=szabadság on holiday=szabadságon out of office=nem tartózkodom az irodában OR irodán kívül vagyok Expressions like nem tartózkodom az irodában OR irodán kívül vagyok will never occur in a real post sent to the R-list, so could be used for filtering. HTH, Dénes Martin -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[R] add=TRUE or similar in spplot?
Dear Helper, I have a spatial lines data frame object 'spRiverDf'. The data frame consists of numbers {0,1,...,5}. And I have a vector 'colorS' of length 6 with different colours. If I make a plot with spplot I get a plot of the lines - colours depending on there number in the data frame column: spplot(spRiverDf['data.col.1'], zcol=..., names.attr=..., col.regions=colorS, lwd=10) # (A) My problem: - I'd like to overlay narrow lines (lwd=5) with the appropriate colors of a second data column. My tests: 1) I tried it with the spplot options: lwd=10, sp.layout=list(list(spRiverDf['data.col.2'], col=colorS, lwd=5)) 2) Second try: result - spplot(entries_see_(A)); result + layer(sp.lines(spRiverDf['data.col.2'], col=colorS, lwd=5)) My results: - In both tests I get with spplot the desired but with the additional things only single-coloured (takes the first entry of 'colorS') lines. My questions: - Is there a possebility to overlay two spplots? Something like option add=TRUE for the usual 'plot' command. - Or is there an easy way to select desired lines of a spatial plot data frame (e.g. with a special colour) that I can use 'sp.layout' or 'layer'? - Of course I can create for additional data frame columns own spatial lines data frames for each color (depending on the number entry in the column) but this is very time-consuming - and realy not elegant. Thank you for your help! Thomas -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R-help Digest, Vol 104, Issue 21
Október 19-től 21-ig irodán kívül vagyok, és az emailjeimet nem érem el. Sürgős esetben kérem forduljon Kárpáti Edithez (karpati.e...@gyemszi.hu). Üdvözlettel, Mihalicza Péter I will be out of the office from 19 till 21 October with no access to my emails. In urgent cases please contact Ms. Edit Kárpáti (karpati.e...@gyemszi.hu). With regards, Peter Mihalicza __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 'Apply' giving me errors
Thanks for the tips/advice. I actually used a different solution to circumvent this, but Uwe's solutions would also work.

-- View this message in context: http://r.789695.n4.nabble.com/Apply-giving-me-errors-tp3923880p3925377.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] plotting average effects.
How about posting a reproducible sample, so that we can see what is going on? Read the posting guide!!! -- View this message in context: http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925324.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] POT package
    npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"],
        na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))

This line is from the manual "A user's Guide to POT Approach". I am just trying to ask why the values 20945, 20947 are used?

Regards

On Fri, Oct 21, 2011 at 4:19 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> Perhaps you could tell us what function you are talking about. npy is not part of the POT package in the version on my machine, and those letters don't seem to show up anywhere consecutively at all on my system, according to ??npy.
>
> Similarly, this trick produces no results:
>
>     sapply(ls("package:POT"), function(f) sum(grepl("20945", deparse(get(f)))))
>
> Michael
>
> On Fri, Oct 21, 2011 at 3:35 AM, Amina Shahzadi aminashahz...@gmail.com wrote:
>> Hi Sir
>> It is requested to please tell the reason why the range of c(20945, 20947) is used in this function:
>>
>>     npy <- length(events1[, "obs"])/(diff(range(ardieres[, "time"],
>>         na.rm = TRUE)) - diff(ardieres[c(20945, 20947), "time"]))
>>
>> Please tell logic. Looking for quick response.
>> Regards
>> --
>> *Amina Shahzadi*

--
*Amina Shahzadi*
Re: [R] How to create time series objects combining two vectors
On Fri, Oct 21, 2011 at 6:24 AM, sarelseerower sarelseero...@gmail.com wrote:
> I am new to R and trying to understand time series objects. I have 2 vectors, one containing rainfall values (let's call the vector rain) and the other the time/date in seconds (let's call it time). Is there a method to create a time series object simply by giving the rain and time vectors as inputs? Something in the line of:
>
>     time_series_object <- time_series_function(rain, time)
>
> Most of the packages I have investigated don't have a simple option like this. Can this be done however?

See: http://rwiki.sciviews.org/doku.php?id=guides:tutorials:hydrological_data_analysis:miscellaneous_data_import

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] stair-step plot
On 21/10/2011 6:39 AM, knut-o wrote:
> Hello
> Is it possible to map a plot with horizontal lines like in the step-plot, but without the vertical lines?

Not in the basic plot function, but you can write your own fairly easily, using segments(). For example:

    x <- y <- 1:10
    plot(x, y, type='n')
    segments(x[-10], y[-10], x[-1], y[-10])

Duncan Murdoch
Re: [R] (no subject)
On 21.10.2011 13:02, Lisa Henault wrote: can i be taken off of this mailing list please? is there another way that you can access this without having to get all the emails?? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. For both questions (unsubscribe, digest form, archived) see the bottom of each mail you got so far: https://stat.ethz.ch/mailman/listinfo/r-help Uwe Ligges __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm-poisson fitting 400.000 records
D_Tomas tomasmeca at hotmail.com writes:
> Hi, I am trying to fit a glm-poisson model to 400.000 records. I have tried biglm and glmulti but I have problems... can it really be the case that 400.000 are too many records??? I am thinking of using random samples of my dataset.

"I have problems" isn't enough for us to diagnose. I tried this trivial example in base R:

    d <- data.frame(x=runif(4e5), y=rpois(4e5, 5))
    system.time(glm(y~x, family=poisson, data=d, trace=TRUE))

    Deviance = 438614.6 Iterations - 1
    Deviance = 417968.2 Iterations - 2
    Deviance = 417921.2 Iterations - 3
    Deviance = 417921.2 Iterations - 4
       user  system elapsed
      5.444  12.952  18.429

Can you give us a hint about what went wrong??
[R] question about aggregate
Hi list I am discovering R, and -- PhD candidate in Computer Science Address 3 avenue lamine, cité ezzahra, Sousse 4000 Tunisia tel: +216 97 246 706 (+33640302046 jusqu'au 15/6) fax: +216 71 391 166 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] question about aggregate
Hello,

I am discovering R and I find it is really very powerful. However, I have run into some newbie difficulties. I have a data frame with many values, and I want to calculate the frequency (the number of lines) matching some criteria. For example, here I want it to print the number of occurrences where sci[,2]=0 and sci[,1]=L. In my example, it is printing the number of the line in the result data frame. However, I have at least 90 lines with sci[,2]=0 and sci[,1]=L. Thank you in advance for any input.

    aggregate(sci[,5], list(sci[,2], sci[,1]), frequency)
      Group.1 Group.2 x
    1     0.0       L 1
    2     0.2       L 1

--
PhD candidate in Computer Science
Address 3 avenue lamine, cité ezzahra, Sousse 4000 Tunisia
tel: +216 97 246 706 (+33640302046 jusqu'au 15/6)
fax: +216 71 391 166
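One possible reading of the output above, sketched on made-up data: `frequency()` is a time-series helper that returns 1 for any plain vector, which would explain the column of 1s, whereas `length` counts the rows in each group. The data frame `sci` below is hypothetical and only mimics the column positions mentioned in the post (label in column 1, numeric setting in column 2, value in column 5).

```r
# Hypothetical stand-in for the poster's data frame
sci <- data.frame(
  type  = c("L", "L", "L", "L", "L"),
  level = c(0, 0, 0, 0.2, 0.2),
  a     = 1:5,
  b     = 1:5,
  value = c(10, 11, 12, 13, 14)
)

# frequency() returns 1 for a plain (non-ts) vector, hence x = 1 in every row
bad <- aggregate(sci[, 5], list(sci[, 2], sci[, 1]), frequency)

# length() counts how many rows fall into each group
counts <- aggregate(sci[, 5], list(level = sci[, 2], type = sci[, 1]), length)
print(counts)

# Counting one specific combination directly
n_zero_L <- sum(sci[, 2] == 0 & sci[, 1] == "L")
print(n_zero_L)
```

If only the counts are wanted, `table(sci[, 2], sci[, 1])` is another base R option.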
Re: [R] Change column/row-name
On Oct 21, 2011, at 1:57 AM, Jörg Reuter wrote: Hi, I am very happy. My problems are solved without one little thing: (Iske - matrix(c(1, 1, 1, 2, 2, 2, 1, 1, 1, 5, 1, 2, 2, 2, 1, 1, 1, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 5, 4, 4, 4, 4, 4, 4, 2, 2, 2, 2, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 2), ncol = 5)) #My Matrix First transform to characters. (Using raw type seems failure-prone): Iltrs - c(letters, LETTERS)[Iske] Then work with that. (No testing in absence of the levenshtein.distance function definition or package location.) Iske- Iske+33 #I want see the letters (Iske.char-apply(Iske, 1, function(x) rawToChar(as.raw(x #Numbers to Char LD - function(s1, s2){ require(vwr) s1 = as.character(s1) s2 = as.character(s2) t(sapply(s1, levenshtein.distance, s2)) } Iske.levens-(LD(Iske.char,Iske.char)) #Calculate the Levenshtein- Distanz The result: !#$% !#$% !#$% !#$% !#$% 0 0 0 !#$% 0 0 0 !#$% 0 0 0 . . . It is all beautiful. But is there a simple way to change the column/row-name to the original from the Matrix Iske? Thanks a lot for the help yesterday. It was a big step in my life :-) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth
In the HistData package, I have a data frame, PearsonLee, containing observations on heights of parent and child, in weighted form:

    library(HistData)
    str(PearsonLee)
    'data.frame':   746 obs. of  6 variables:
     $ child    : num  59.5 59.5 59.5 60.5 60.5 61.5 61.5 61.5 61.5 61.5 ...
     $ parent   : num  62.5 63.5 64.5 62.5 66.5 59.5 60.5 62.5 63.5 64.5 ...
     $ frequency: num  0.5 0.5 1 0.5 1 0.25 0.25 0.5 1 0.25 ...
     $ gp       : Factor w/ 4 levels "fd","fs","md",..: 2 2 2 2 2 2 2 2 2 2 ...
     $ par      : Factor w/ 2 levels "Father","Mother": 1 1 1 1 1 1 1 1 1 1 ...
     $ chl      : Factor w/ 2 levels "Daughter","Son": 2 2 2 2 2 2 2 2 2 2 ...

I want to make a 2x2 set of plots of child ~ parent | par+chl, with regression lines and loess smooths, that incorporate weights=frequency. The frequencies are not integers, so I can't simply expand the data frame. I'd also like to use different colors for the regression and smoothed lines.

Here's what I've tried using xyplot, all unsuccessful. I suppose I could also use ggplot2, if I could do what I want.

    xyplot(child ~ parent|par+chl, data=PearsonLee, weights=frequency, type=c("p", "r", "smooth"))
    xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"))

panel.lmline and panel.smooth don't have a weights= argument, though lm() and loess() do.

    # Try to control line colors: unsuccessfully -- only one value of col.line is used
    xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"),
           col.line=c("red", "blue"))

    ## try to use panel functions ... unsuccessfully
    xyplot(child ~ parent|par+chl, data=PearsonLee, type="p",
           panel = function(x, y, ...) {
               panel.xyplot(x, y, ...)
               panel.lmline(x, y, col="blue", ...)
               panel.smooth(x, y, col="red", ...)
           }
    )

The following, using base graphics, illustrates the difference between the weighted and unweighted lines, for the total data frame:

    with(PearsonLee, {
        lim <- c(55,80)
        xv <- seq(55,80, .5)
        sunflowerplot(parent, child, number=frequency, xlim=lim, ylim=lim, seg.col="gray", size=.1)
        # unweighted
        abline(lm(child ~ parent), col="green", lwd=2)
        lines(xv, predict(loess(child ~ parent), data.frame(parent=xv)), col="green", lwd=2)
        # weighted
        abline(lm(child ~ parent, weights=frequency), col="blue", lwd=2)
        lines(xv, predict(loess(child ~ parent, weights=frequency), data.frame(parent=xv)), col="blue", lwd=2)
    })

thanks,
-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
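Since lm() and loess() accept weights while the lattice panel helpers do not, one way this might be approached is to hand the weights to xyplot() as an extra argument and fit the models inside a custom panel function. The sketch below is only that, a sketch: it uses simulated data rather than PearsonLee, and the argument name `wt` and the data frame `d` are invented for illustration. It relies on lattice passing unrecognised arguments on to the panel function, and on `subscripts` giving the row indices that belong to each panel.

```r
library(lattice)

# Simulated stand-in for a weighted data frame (not the PearsonLee data)
set.seed(1)
d <- data.frame(
  x = rep(1:10, 2),
  g = factor(rep(c("a", "b"), each = 10)),
  w = runif(20, 0.25, 2)
)
d$y <- 2 + 0.5 * d$x + rnorm(20)

p <- xyplot(y ~ x | g, data = d, wt = d$w,
  panel = function(x, y, wt, subscripts, ...) {
    w <- wt[subscripts]                 # weights for the rows shown in this panel
    panel.xyplot(x, y, ...)
    # weighted regression line, drawn in its own colour
    panel.abline(lm(y ~ x, weights = w), col = "blue")
    # weighted loess smooth, evaluated on a grid and drawn in another colour
    xs  <- seq(min(x), max(x), length.out = 50)
    fit <- loess(y ~ x, weights = w)
    panel.lines(xs, predict(fit, data.frame(x = xs)), col = "red")
  })
```

Calling `print(p)` draws the plot. Because each line is drawn by its own call, the two colours can be set independently.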
[R] Specifying Greek Character in Lattice Plot Label
For an axis label I want to include the Greek letter mu within the string. I've not found the proper way of including that expression within the string.

What I want is "Conductivity (uS/cm)" with the 'u' replaced by mu. When I try "Conductivity (" expression(paste(mu)) "S/cm)" I get an error. If I don't separate "Conductivity" and "S/cm" with parentheses, the string 'expression(paste(mu))' displays in the label.

What am I doing incorrectly?

Rich
Re: [R] stair-step plot
On Oct 21, 2011, at 6:39 AM, knut-o wrote: Hello Is it possible to map a plot with horizontal lines like in the step- plot, but without the vertical lines? There is no function named 'step-plot'. If you are talking about the plot.stepfun function then look at the verticals argument. Thanks, knut -- View this message in context: http://r.789695.n4.nabble.com/stair-step-plot-tp3924903p3924903.html Sent from the R help mailing list archive at Nabble.com. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
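The `verticals` argument mentioned above can be tried on a small step function. This is a minimal sketch with made-up knots and heights; `x`, `y` and `sf` are illustrative names only.

```r
# stepfun() needs one more height than it has knots
x  <- 1:10
y  <- 0:10
sf <- stepfun(x, y)

# Use a null device when run as a script, so no file is written
if (!interactive()) pdf(NULL)

# verticals = FALSE draws only the horizontal pieces
plot(sf, verticals = FALSE, do.points = FALSE,
     main = "Steps without vertical lines")
```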
Re: [R] Specifying Greek Character in Lattice Plot Label
On Oct 21, 2011, at 11:27 AM, Rich Shepard wrote:
> For an axis label I want to include the Greek letter mu within the string. I've not found the proper way of including that expression within the string.
>
> What I want is "Conductivity (uS/cm)" with the 'u' replaced by mu. When I try "Conductivity (" expression(paste(mu)) "S/cm)" I get an error. If I don't separate "Conductivity" and "S/cm" with parentheses, the string 'expression(paste(mu))' displays in the label.

try:

    plot(1,1, xlab=expression(Conductivity~"("*mu*S/cm*")") )

Parens are the only characters that need to be quoted, and you do need to use proper plotmath connectors, ~ and *, depending on whether you want a space to appear or not. I don't think you can join character values and expression values in the manner that you offer, but I admit I never tried it, so I don't know for sure. Generally language and expression objects have their own special set of functions and syntax.

> What am I doing incorrectly?
>
> Rich

David Winsemius, MD
West Hartford, CT
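Since the thread is about a lattice label, here is a small sketch of the two plotmath connectors in that setting: `~` joins two pieces with a space, `*` joins them with none. The data frame `d` and the label names are made up for illustration.

```r
library(lattice)

# "~" inserts a space between Conductivity and the bracket;
# "*" glues the bracket, mu and the unit together without spaces
lab_x <- expression(Conductivity ~ "(" * mu * S/cm * ")")
lab_y <- expression(TDS ~ "(" * mg/L * ")")

# Made-up values, only so that there is something to plot
d <- data.frame(cond = c(120, 180, 250, 310), tds = c(80, 115, 160, 205))

p <- xyplot(tds ~ cond, data = d, xlab = lab_x, ylab = lab_y)
```

`print(p)` draws the plot with the Greek letter in the axis label.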
Re: [R] cph/nomogram Design/RMS package hazard ratio: interquartile vs per unit
On Oct 20, 2011, at 8:22 PM, renee wrote:
> Hello, I am constructing a nomogram using cph and nomogram commands in Dr. Harrell's Design/RMS package. The HR that I obtain for dichotomous and categorical variables are identical to those that I obtain using STATA stcox.

When posting to r-help it is advised to produce the code used. One might guess that you were talking about output from summary(fit), but that would be a guess.

> However, the inter-quartile HR I obtain for continuous variables is obviously different, since STATA gives me HR for each unit (year, centimeter, etc) like coxph would give.

If you want HR's for single unit difference on the scale of the measured units then this should produce those:

    exp(coef(fit))

> My question is if this will affect the output of the nomogram. I'm assuming that nomogram is constructed using hazard between each unit rather than quartiles - is this true?

Yes.

> Also, I've found that I do not need to create indicator variables for my categorical variables when I use cph. Is this also correct?

If they are factor classed variables, then that is correct.

> I appreciate your feedback. Thank you. ~Renee
> -- View this message in context: http://r.789695.n4.nabble.com/cph-nomogram-Design-RMS-package-hazard-ratio-interquartile-vs-per-unit-tp3923896p3923896.html

(At least until Nabble deletes it in a year or two.)

--
David Winsemius, MD
West Hartford, CT
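For a predictor that enters the model as a single linear term, the per-unit and interquartile hazard ratios are two summaries of the same coefficient: the per-unit HR is exp(beta), and the interquartile HR is exp(beta * IQR). The sketch below illustrates that arithmetic with survival::coxph and the `lung` data that ships with the survival package, used here as a stand-in for the poster's cph fit (an assumption, since the original code was not posted). It does not apply to spline or other nonlinear terms, where no single per-unit HR exists.

```r
library(survival)

# lung ships with the survival package; age is measured in years
fit <- coxph(Surv(time, status) ~ age, data = lung)

beta <- coef(fit)[["age"]]

# Hazard ratio for a one-year difference in age
hr_per_unit <- exp(beta)

# Hazard ratio comparing the 75th with the 25th percentile of age
q      <- quantile(lung$age, c(0.25, 0.75))
iqr    <- q[[2]] - q[[1]]
hr_iqr <- exp(beta * iqr)

print(c(per_unit = hr_per_unit, interquartile = hr_iqr))
```

The two numbers differ only by the scale of the comparison, so the fitted model, and anything built from it, is the same either way.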
Re: [R] plotting average effects.
I will include the data to read in, if you so choose.

    dat <- read.dta("http://quantoid.net/hw1_2011.dta")

Model in question:

    mod99 <- glm(democracy ~ popc100kpc + ngrevpc, data=dat, family=binomial)

-- looking for average effects code, with error on mod99. popckpc is coded in 1k per capita.

    dat2 <- dat3 <- dat
    dat2$popc100kpc <- dat2$popc100kpc + 100
    dat2$popc100kpc[which(dat2$popc100kpc > max(dat$popc100kpc))] <- max(dat$popc100kpc)
    dat3$popc100kpc <- dat$popc100kpc - 100
    dat3$popc100kpc[which(dat3$popc100kpc < min(dat$popc100kpc))] <- min(dat$popc100kpc)
    pred1 <- predict(mod99, type="response")
    pred2 <- predict(mod99, newdata=dat2, type="response")
    pred3 <- predict(mod99, newdata=dat3, type="response")

Breaking the variable into groups:

    pop1.group <- cut(dat$popc100kpc, breaks=quantile(dat$popc100kpc, seq(0,1,by=.25)), include.lowest=T)
    means <- by(cbind(pred1, pred2, pred3), list(pop1.group), apply, 2, mean)
    means <- do.call(rbind, means)

And finally attempting to plot:

    par(mar=c(7,4,4,2))
    plot(c(1,10), range(c(means)), type="n", xlab="", ylab="Predicted Probability", axes=F)
    arrows(1:10, means[,1], 1:10, means[,2], code=2, length=.1)
    arrows(1:10, means[,1], 1:10, means[,3], code=2, length=.1, col="red")
    points(1:10, means[,1], pch=16)
    Error in xy.coords(x, y) : 'x' and 'y' lengths differ
    axis(1, at=1:10, labels=rownames(means), las=2)
    Error in axis(1, at = 1:10, labels = rownames(means), las = 2) : 'at' and 'labels' lengths differ, 10 != 4

I am not sure how to fix the error. Thank you for your time.

- Ph.D. Candidate

-- View this message in context: http://r.789695.n4.nabble.com/plotting-average-effects-tp3923982p3925945.html Sent from the R help mailing list archive at Nabble.com.
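The second error message reports "10 != 4", which suggests that the quartile grouping produces four rows of means while the plotting calls assume ten x positions. A possible fix, sketched here on a made-up 4 x 3 matrix standing in for `means` (the values and row names are invented), is to derive the positions from the number of rows instead of hard-coding 1:10.

```r
# Made-up matrix with one row per quartile group and one column per prediction
means <- matrix(c(0.30, 0.45, 0.60, 0.75,
                  0.35, 0.50, 0.66, 0.80,
                  0.25, 0.40, 0.55, 0.70),
                ncol = 3,
                dimnames = list(c("Q1", "Q2", "Q3", "Q4"), NULL))

# x positions follow the number of groups: 1:4 here
pos <- seq_len(nrow(means))

# Use a null device when run as a script, so no file is written
if (!interactive()) pdf(NULL)

par(mar = c(7, 4, 4, 2))
plot(range(pos), range(means), type = "n", xlab = "",
     ylab = "Predicted Probability", axes = FALSE)
arrows(pos, means[, 1], pos, means[, 2], code = 2, length = 0.1)
arrows(pos, means[, 1], pos, means[, 3], code = 2, length = 0.1, col = "red")
points(pos, means[, 1], pch = 16)
axis(1, at = pos, labels = rownames(means), las = 2)
axis(2)
box()
```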
Re: [R] Multiple factorial comparison LSD
Vera,

The glht function in the multcomp package provides the capability you are looking for. The MMC functions in the HH package build on the glht function. There are examples in ?MMC on data with more than one factor.

Rich

On Fri, Oct 21, 2011 at 4:55 AM, Vera Marjorie E. Velasco velas...@univmail.cis.mcmaster.ca wrote:
> Please help. I really like R and I have been looking at how to do LSD multiple comparison test with data that has more than one factor. So far, I am unsuccessful. Please help!
>
> Me
Re: [R] Specifying Greek Character in Lattice Plot Label
On Fri, 21 Oct 2011, David Winsemius wrote:
>     plot(1,1, xlab=expression(Conductivity~"("*mu*S/cm*")") )

Thank you, David. It did not occur to me to look for a help page. I'll read that now that I looked and found it.

Rich
Re: [R] replicating SAS's proc rank procedure
You can get the same results with the cut() function in R:

    cut(cars$speed, breaks=quantile(cars$speed, probs=c(0:15/15)), labels=1:15, include.lowest=TRUE)

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of riskcalc
Sent: Friday, October 21, 2011 4:06 AM
To: r-help@r-project.org
Subject: Re: [R] replicating SAS's proc rank procedure

Hi, try this function I've written. It should be self-explanatory, but let me know if you have any problems. I've only been using R for a few weeks, so apologies if it's not the most efficient!

    rankit2 <- function(rankvar, cuts, data, factor) {
      ranker <- rankvar
      ranker <- 0
      range <- c(1:cuts)
      range2 <- range/cuts
      range3 <- quantile(factor, range2)
      over <- length(factor)
      for (i in 1:over) {
        for (j in 1:cuts) {
          if (data[[i,1]] <= range3[[j]]) {
            data[[i,3]] <- j
            ##test <- j
            ##print(j)
          }
          if (data[[i,3]] > 0) break
        }
      }
      out2 <- data
      return(out2)
    }

    cars$rank <- 0
    try2 <- rankit2(rank, 15, cars, cars$speed)
    try2

all the best
Leigh
RCalc partner
www.RCalc.co.uk

-- View this message in context: http://r.789695.n4.nabble.com/replicating-SAS-s-proc-rank-procedure-tp820510p3924739.html Sent from the R help mailing list archive at Nabble.com.
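The cut()/quantile() idiom above can be checked by hand on a small vector. This is a minimal sketch with made-up data, where `x`, `groups` and `rank_group` are illustrative names. Note that with heavily tied data the quantile breaks may not be unique, in which case cut() stops with an error, so the idiom is not guaranteed to work on every input.

```r
x      <- 1:20
groups <- 4

# Breaks at the 0%, 25%, 50%, 75% and 100% quantiles
breaks <- quantile(x, probs = (0:groups) / groups)

# include.lowest = TRUE keeps the minimum value in the first group
rank_group <- cut(x, breaks = breaks, labels = 1:groups, include.lowest = TRUE)

# Number of observations per group
print(table(rank_group))
```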
[R] Working With Variables Having Different Lengths
Because of regulatory requirement changes over several decades and weather conditions preventing site access the variables in my data set have different lengths. I'd like guidance on how to perform linear regressions and other models with these variables. For example, there are 2206 rows for the parameter TDS but only 1191 rows for the parameter Cond. Such discrepancies are common in these data. Is there a reference I can read to learn how to analyze such data? Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth
Hi Michael:

Here's one way to get it from ggplot2. To avoid possible overplotting, I jittered the points horizontally by +/- 0.2. I also reduced the point size from the default 2 and increased the line thickness to 1.5 for both fitted curves. In ggplot2, the term faceting is synonymous with conditioning (by groups).

    library('HistData')
    library('ggplot2')
    ggplot(PearsonLee, aes(x = parent, y = child)) +
        geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
        geom_smooth(method = lm, aes(weight = frequency), colour = 'green', se = FALSE, size = 1.5) +
        geom_smooth(aes(weight = frequency), colour = 'red', se = FALSE, size = 1.5) +
        facet_grid(chl ~ par)

    # If you prefer a legend, here's one take, pulling the legend inside
    # to the upper left corner. This requires a bit more 'trickery', but
    # the tricks are found in the ggplot2 book.
    ggplot(PearsonLee, aes(x = parent, y = child)) +
        geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
        geom_smooth(method = lm, aes(weight = frequency, colour = 'Linear'), se = FALSE, size = 1.5) +
        geom_smooth(aes(weight = frequency, colour = 'Loess'), se = FALSE, size = 1.5) +
        facet_grid(chl ~ par) +
        scale_colour_manual(breaks = c('Linear', 'Loess'), values = c('green', 'red')) +
        opts(legend.position = c(0.14, 0.885), legend.background = theme_rect(fill = 'white'))

HTH,
Dennis

On Fri, Oct 21, 2011 at 8:22 AM, Michael Friendly frien...@yorku.ca wrote:
> In the HistData package, I have a data frame, PearsonLee, containing observations on heights of parent and child, in weighted form:
>
>     library(HistData)
>     str(PearsonLee)
>     'data.frame':   746 obs. of  6 variables:
>      $ child    : num  59.5 59.5 59.5 60.5 60.5 61.5 61.5 61.5 61.5 61.5 ...
>      $ parent   : num  62.5 63.5 64.5 62.5 66.5 59.5 60.5 62.5 63.5 64.5 ...
>      $ frequency: num  0.5 0.5 1 0.5 1 0.25 0.25 0.5 1 0.25 ...
>      $ gp       : Factor w/ 4 levels "fd","fs","md",..: 2 2 2 2 2 2 2 2 2 2 ...
>      $ par      : Factor w/ 2 levels "Father","Mother": 1 1 1 1 1 1 1 1 1 1 ...
>      $ chl      : Factor w/ 2 levels "Daughter","Son": 2 2 2 2 2 2 2 2 2 2 ...
>
> I want to make a 2x2 set of plots of child ~ parent | par+chl, with regression lines and loess smooths, that incorporate weights=frequency. The frequencies are not integers, so I can't simply expand the data frame. I'd also like to use different colors for the regression and smoothed lines.
>
> Here's what I've tried using xyplot, all unsuccessful. I suppose I could also use ggplot2, if I could do what I want.
>
>     xyplot(child ~ parent|par+chl, data=PearsonLee, weights=frequency, type=c("p", "r", "smooth"))
>     xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"))
>
> panel.lmline and panel.smooth don't have a weights= argument, though lm() and loess() do.
>
>     # Try to control line colors: unsuccessfully -- only one value of col.line is used
>     xyplot(child ~ parent|par+chl, data=PearsonLee, type=c("p", "r", "smooth"),
>            col.line=c("red", "blue"))
>
>     ## try to use panel functions ... unsuccessfully
>     xyplot(child ~ parent|par+chl, data=PearsonLee, type="p",
>            panel = function(x, y, ...) {
>                panel.xyplot(x, y, ...)
>                panel.lmline(x, y, col="blue", ...)
>                panel.smooth(x, y, col="red", ...)
>            }
>     )
>
> The following, using base graphics, illustrates the difference between the weighted and unweighted lines, for the total data frame:
>
>     with(PearsonLee, {
>         lim <- c(55,80)
>         xv <- seq(55,80, .5)
>         sunflowerplot(parent, child, number=frequency, xlim=lim, ylim=lim, seg.col="gray", size=.1)
>         # unweighted
>         abline(lm(child ~ parent), col="green", lwd=2)
>         lines(xv, predict(loess(child ~ parent), data.frame(parent=xv)), col="green", lwd=2)
>         # weighted
>         abline(lm(child ~ parent, weights=frequency), col="blue", lwd=2)
>         lines(xv, predict(loess(child ~ parent, weights=frequency), data.frame(parent=xv)), col="blue", lwd=2)
>     })
>
> thanks,
> -Michael
>
> --
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web: http://www.datavis.ca
> Toronto, ONT M3J 1P3 CANADA
Re: [R] (no subject)
Alternatively, since you are on gmail you can set up a folder and filter so all r-help emails bypass your inbox and go right to an r-help folder (or something). I find it very useful for just browsing during down time, so I can offer my assistance or move the little gems that I would like to use later to an 'r-keepers' folder.

On Fri, Oct 21, 2011 at 4:02 AM, Lisa Henault lisa.hena...@gmail.com wrote:
> can i be taken off of this mailing list please? is there another way that you can access this without having to get all the emails??
Re: [R] Specifying Greek Character in Lattice Plot Label
The following produces something very similar to David's method:

    plot(1,1, xlab = expression(paste("Conductivity (", mu, "S / cm)")))

but with a slightly different slash character. I think David's method is more correct, but I've used the above method in the past with some success.

On Fri, Oct 21, 2011 at 11:45 AM, David Winsemius dwinsem...@comcast.net wrote:
> On Oct 21, 2011, at 11:27 AM, Rich Shepard wrote:
>> For an axis label I want to include the Greek letter mu within the string. I've not found the proper way of including that expression within the string.
>>
>> What I want is "Conductivity (uS/cm)" with the 'u' replaced by mu. When I try "Conductivity (" expression(paste(mu)) "S/cm)" I get an error. If I don't separate "Conductivity" and "S/cm" with parentheses, the string 'expression(paste(mu))' displays in the label.
>
> try:
>
>     plot(1,1, xlab=expression(Conductivity~"("*mu*S/cm*")") )
>
> Parens are the only characters that need to be quoted, and you do need to use proper plotmath connectors, ~ and *, depending on whether you want a space to appear or not. I don't think you can join character values and expression values in the manner that you offer, but I admit I never tried it, so I don't know for sure. Generally language and expression objects have their own special set of functions and syntax.
>
>> What am I doing incorrectly?
>>
>> Rich
>
> David Winsemius, MD
> West Hartford, CT

--
Luke Miller
Postdoctoral Researcher
Marine Science Center
Northeastern University
Nahant, MA
(781) 581-7370 x318
Re: [R] Working With Variables Having Different Lengths
Sounds like you are dealing with a missing data problem. By default, lm or glm only keeps observations with complete records (complete case analysis). This can be problematic if you have many variables with missing values and the values are not missing completely at random (i.e., missingness depends on other (un)measured variables or on the missing values themselves).

Imputation is a common tool for analysing incomplete data. In R, you can find packages which conduct single or multiple imputation, e.g. randomForest, norm, mice, mi, etc.

There is no easy way out with missing data problems; all imputations are based on some strong and untestable assumptions.

Weidong Gu

On Fri, Oct 21, 2011 at 12:13 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
> Because of regulatory requirement changes over several decades and weather conditions preventing site access the variables in my data set have different lengths. I'd like guidance on how to perform linear regressions and other models with these variables.
>
> For example, there are 2206 rows for the parameter TDS but only 1191 rows for the parameter Cond. Such discrepancies are common in these data.
>
> Is there a reference I can read to learn how to analyze such data?
>
> Rich
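The complete-case behaviour described above can be seen on a toy data frame. This is a minimal sketch with invented numbers; the column names TDS and Cond are borrowed from the question, but the values are made up. It assumes the default setting `options(na.action = "na.omit")`.

```r
# Six sampling dates; Cond was not measured on two of them
d <- data.frame(
  TDS  = c(100, 120, 140, 160, 180, 200),
  Cond = c(152,  NA, 208,  NA, 275, 296)
)

# By default lm() silently drops rows with any NA in the model variables
fit <- lm(TDS ~ Cond, data = d)
n_used     <- nobs(fit)
n_complete <- sum(complete.cases(d))
print(c(used = n_used, complete = n_complete, total = nrow(d)))

# na.exclude fits the same model but pads residuals back to the full length
fit2  <- lm(TDS ~ Cond, data = d, na.action = na.exclude)
resid <- residuals(fit2)
print(resid)
```

Checking `nobs()` against `nrow()` is a quick way to see how many rows a model has dropped before deciding whether imputation is worth the extra assumptions.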
Re: [R] Working With Variables Having Different Lengths
I know in my experience Cond (conductivity??) doesn't vary much within a stream except for during high flow events, and I would imagine the same is true for TDS. If these are all low flow values, you could possibly determine a mean/median value to use for the missing data points. Obviously this is going to be different if you are sampling storm events. If you have stage data and lots of data points, you may be able to model the parameters as a function of stage. HTH
Rich Shepard wrote: Because of regulatory requirement changes over several decades and weather conditions preventing site access the variables in my data set have different lengths. I'd like guidance on how to perform linear regressions and other models with these variables. For example, there are 2206 rows for the parameter TDS but only 1191 rows for the parameter Cond. Such discrepancies are common in these data. Is there a reference I can read to learn how to analyze such data? Rich
Re: [R] Specifying Greek Character in Lattice Plot Label
On Fri, 21 Oct 2011, Luke Miller wrote: The following produces something very similar to David's method:
  plot(1, 1, xlab = expression(paste("Conductivity (", mu, "S / cm)")))
but with a slightly different slash character. I think David's method is more correct, but I've used the above method in the past with some success.
Thanks, Luke. Rich
Re: [R] Working With Variables Having Different Lengths
On Fri, 21 Oct 2011, Weidong Gu wrote: No easy way out with missing data problems, all imputations are based on some strong and untestable assumptions.
Thanks for the insights. Let me rephrase my question in a way that should work: is there a way to subset my comprehensive data frame ('chemdata') to select only those rows that have values for two different parameters (i.e., in the same column)? I suspect not. But, I can select from the database table on those criteria and read in a new R data frame. Rich
Re: [R] (no subject)
I believe you could also set your subscription to NOMAIL and then read the posts from the R-help archive. This would also allow you to post to R-help since you are still subscribed. Hope this is helpful, Dan
Daniel Nordlund Bothell, WA USA
From: r-help-boun...@r-project.org On Behalf Of Trevor Davies Sent: Friday, October 21, 2011 9:15 AM To: Lisa Henault Cc: R-help@r-project.org Subject: Re: [R] (no subject)
Alternatively, since you are on gmail you can set up a folder and filter so all r-help emails bypass your inbox and go right to an r-help folder (or something). I find it very useful for just browsing during down time so I can offer my assistance, or move the little gems that I would like to use later to an 'r-keepers' folder.
On Fri, Oct 21, 2011 at 4:02 AM, Lisa Henault lisa.hena...@gmail.com wrote: can i be taken off of this mailing list please? is there another way that you can access this without having to get all the emails??
Re: [R] Working With Variables Having Different Lengths
On Fri, 21 Oct 2011, B77S wrote: I know in my experience Cond (conductivity??) doesn't vary much within a stream except for during high flow events, and I would imagine the same is true for TDS.
This is generally true, but not in the streams with which we're working. TDS values, for example, vary by orders of magnitude between sampling locations on the same stream, and not with any pattern. Also, specific conductance/conductivity (Cond) varies within a stream. These variations may well be on different dates, but this first look needs to ignore time. I'll eventually get to that aspect. Thanks, Rich
Re: [R] plotting average effects.
Hi: Your approach to computing the means is not efficient; a better way would be to use the aggregate() function. I would start by combining the grouping variable and the three prediction variables into a data frame. To get the groupwise mean for all three prediction variables, you can use the formula interface for aggregate() if you have R-2.11.0 or later: cbind the three prediction variables into a matrix on the LHS of the model formula, put the grouping variable on the RHS, and follow with the data frame name and the summary function. See ?aggregate for details; in particular, study the examples with a formula interface. Save the result to an object. Since this is homework, the details are left to you.
As far as the base graphics plot goes, I suggest the following:
- use arrows() to produce the lines with arrows
- plot the means by group as points with the points() function.
The arrows() function can take vector arguments; read its help page carefully.
The ggplot2 version of the plot I think you're trying to generate is given below:
  library('ggplot2')
  ggplot(pmeans) +
    geom_point(aes(x = grp, y = pred1), colour = 'red') +
    geom_segment(aes(x = grp, xend = grp, y = pred3, yend = pred2),
                 arrow = arrow(length = unit(0.4, 'cm')),
                 colour = 'red', size = 1)
pmeans is the name I gave to the averaged predictions by group, with grp representing the grouping variable and pred1-pred3 per your definitions. In addition to aggregate() and the apply family of functions, the packages doBy, plyr and data.table are well designed for groupwise data summarization and processing. HTH, Dennis
On Fri, Oct 21, 2011 at 8:51 AM, gradstudent nmf...@uwm.edu wrote: I will include the data to read in, if you so choose.
  dat <- read.dta("http://quantoid.net/hw1_2011.dta")
Model in question:
  mod99 <- glm(democracy ~ popc100kpc + ngrevpc, data = dat, family = binomial)
Looking for average effects code, with error on mod99. popckpc is coded in 1k per capita.
  dat2 <- dat3 <- dat
  dat2$popc100kpc <- dat2$popc100kpc + 100
  dat2$popc100kpc[which(dat2$popc100kpc > max(dat$popc100kpc))] <- max(dat$popc100kpc)
  dat3$popc100kpc <- dat$popc100kpc - 100
  dat3$popc100kpc[which(dat3$popc100kpc < min(dat$popc100kpc))] <- min(dat$popc100kpc)
  pred1 <- predict(mod99, type = "response")
  pred2 <- predict(mod99, newdata = dat2, type = "response")
  pred3 <- predict(mod99, newdata = dat3, type = "response")
Breaking the variable into groups:
  pop1.group <- cut(dat$popc100kpc,
                    breaks = quantile(dat$popc100kpc, seq(0, 1, by = .25)),
                    include.lowest = T)
  means <- by(cbind(pred1, pred2, pred3), list(pop1.group), apply, 2, mean)
  means <- do.call(rbind, means)
And finally attempting to plot:
  par(mar = c(7, 4, 4, 2))
  plot(c(1, 10), range(c(means)), type = "n", xlab = "",
       ylab = "Predicted Probability", axes = F)
  arrows(1:10, means[,1], 1:10, means[,2], code = 2, length = .1)
  arrows(1:10, means[,1], 1:10, means[,3], code = 2, length = .1, col = "red")
  points(1:10, means[,1], pch = 16)
  Error in xy.coords(x, y) : 'x' and 'y' lengths differ
  axis(1, at = 1:10, labels = rownames(means), las = 2)
  Error in axis(1, at = 1:10, labels = rownames(means), las = 2) : 'at' and 'labels' lengths differ, 10 != 4
I am not sure how to fix the error. Thank you for your time. - Ph.D. Candidate
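The aggregate() approach described in the reply can be sketched on made-up data as follows. This is not the poster's data set: the data frame `preds`, its values, and the four group labels are hypothetical, and only the names grp, pred1-pred3 and pmeans follow the reply.

```r
# Hypothetical predictions for 40 observations falling into 4 groups
set.seed(1)
preds <- data.frame(
  grp   = factor(rep(c("Q1", "Q2", "Q3", "Q4"), each = 10)),
  pred1 = runif(40),
  pred2 = runif(40),
  pred3 = runif(40)
)

# Groupwise means of all three columns in one call:
# cbind() on the left-hand side, grouping variable on the right
pmeans <- aggregate(cbind(pred1, pred2, pred3) ~ grp, data = preds, FUN = mean)

# x positions derived from the number of groups rather than hard-coded
xpos <- seq_len(nrow(pmeans))
```

Because `xpos` has one entry per group, vectors passed to arrows(), points() and axis() built from `xpos` and the columns of `pmeans` have matching lengths, which is the condition the two error messages in the quoted question complain about.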
Re: [R] Working With Variables Having Different Lengths
On Oct 21, 2011, at 1:04 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, Weidong Gu wrote: No easy way out with missing data problems, all imputations are based on some strong and untestable assumptions. Thanks for the insights. Let me rephrase my question in a way that should work: is there a way to subset my comprehensive data frame ('chemdata') to select only those rows that have values for two different parameters (i.e., in the same column)?
The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column. Assuming these are R NA's then logical indexing:
  with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])
If you are talking about extracting different text features from a single character field then look at `grepl`.
  patt1 <- "S2"  # any appearance of that string
  patt2 <- "E5"  # any appearance of that string
  with( chemdata, chemdata[ grepl(patt1, param1) & grepl(patt2, param1) , ])
I suspect not. But, I can select from the database table on those criteria and read in a new R data frame.
That too should be possible. Specifics are sorely lacking at this point, however.
David Winsemius, MD West Hartford, CT
[R] nls making R not responding
Here is the code I am running:
  library(nls2)
  modeltest <- function(A, mu, l, b, thour){
    out <- vector(length = length(thour))
    for (i in 1:length(thour)) {
      out[i] <- b + A/(1 + exp(4*mu/A*(l - thour[i]) + 2))
    }
    return(out)
  }
  A = 1.3
  mu = .22
  l = 15
  b = .07
  thour = 1:25
  Yvals <- modeltest(A, mu, l, b, thour) - .125 + runif(25)/4
  st2 <- expand.grid(A = seq(0.1, 1.6, .5), mu = seq(0.01, .41, .1), l = 1, b = seq(0, .6, .3))
  lower.bound <- list(A = .01, mu = 0, l = 0, b = 0)
  try( invisible(capture.output(mod1 <- nls2(Yvals ~ modeltest(A, mu, l, b, thour),
    #start = list(b = 5, k = 2, l = 0),
    start = st2, lower = lower.bound, algorithm = "brute-force" ))) )
  try(nmodel <- nls(Yvals ~ modeltest(A, mu, l, b, thour), start = coef(mod1),
    #start = list(A = 1.8, mu = .2, l = .5, b = .15),
    lower = lower.bound, algorithm = "port") )
My problem seems to be with initial parameter estimates. I am running through a couple hundred treatments, so I used nls2 to pick the best starting values from some options, then I run through nls with those values. If I have too many options (st2) in nls2, the run takes too long. When I cut down options, there are either errors, or in some cases R stops responding completely and I have to shut it down and start over. I do not know why it shows the not responding. Is there a better way (well, I'm sure there's always a better way) to do this where I can run through 200+ datasets with robust enough starting values. Any ideas would be greatly appreciated. Adele
- In theory, practice and theory are the same. In practice, they are not - Albert Einstein
Re: [R] Working With Variables Having Different Lengths
On Fri, 21 Oct 2011, David Winsemius wrote: The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column.
David, That's what I meant: two values from the 'param' column.
Assuming these are R NA's then logical indexing:
  with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])
I'll read the with() help page again. And, I'll try the above with TDS replacing param1 and Cond replacing param2. Thanks, Rich
[R] Arima Models - Error and jump error
Hi people, I'm trying to develop a simple routine to run many ARIMA models resulting from combinations of some parameters. My test data cover one year at daily resolution. Part of the routine is:
  for ( d in 0:1 ) { for ( p in 0:3 ) { for ( q in 0:3 ) {
  for ( sd in 0:1 ) { for ( sp in 0:3 ) { for ( sq in 0:3 ) {
    Yfit=arima(Yst[,2],order=c(p,d,q),seasonal=list(order=c(sp,sd,sq),period=7),include.mean=TRUE,xreg=DU0)
  }}}}}}
Up to step 187 it runs normally, but at step 187 it returns an error and the program stops.
  Yfit=arima(Yst[,2],order=c(1,0,1),seasonal=list(order=c(2,1,2),period=7),include.mean=TRUE,xreg=DU0)
  Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control = optim.control, : non-finite finite-difference value [1]
My questions are: 1. What does this error mean and why did it occur? 2. How can I make the program disregard any error and continue running until the end of the loop? 3. Does anyone know of an existing routine that does this? Thanks Flávio
Re: [R] Working With Variables Having Different Lengths
On Fri, 21 Oct 2011, David Winsemius wrote: The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column. Assuming these are R NA's then logical indexing:
  with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])
David, I asked the question improperly. What I should have asked is how to specify only non-missing values of a parameter to create a new subset. Example (this includes NA rows):
  tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant, basin), drop = TRUE)
When I try to add '!is.na' with the param selection I get errors. To be as specific as I should have been in my original message, how should I write the above expression to exclude rows where TDS is missing? Rich
Re: [R] stacked plot
Hi Dennis! Fantastic, great, wonderful, beautiful. I slightly changed your code to adapt it to my situation:
  ggplot(DF.2, aes(x=file.name, y=value, fill=codes)) +
    geom_histogram(position="stack", stat="identity") +
    labs(x="document", y="number of codings")
The data look like this:
  file.name  codes   value
  file.1     code.1  2
  file.1     code.2  0
  file.1     code.3  0
  file.1     code.4  5
  file.1     code.5  4
  file.2     code.1  3
  file.2     code.2  18
There are 126 bars (file.1 - file.126), so I should do the following: (1) convert to a histogram with no gaps between the bars, and (2) remove the labels at the bottom of each bar and just have xlab="documents". However, even with changing geom_bar to geom_histogram there are small gaps between the bars. Thanks for your help, Henri-Paul
-- Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiog...@gmail.com Skype: hindiogine Website: http://people.cehd.tamu.edu/~sindiogine
Re: [R] Working With Variables Having Different Lengths
On Oct 21, 2011, at 2:09 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column. Assuming these are R NA's then logical indexing:
  with( chemdata, chemdata[!is.na(param1) & !is.na(param2) , ])
David, I asked the question improperly. What I should have asked is how to specify only non-missing values of a parameter to create a new subset. Example (this includes NA rows):
  tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant, basin), drop = TRUE)
When I try to add '!is.na' with the param selection I get errors.
If you do not offer both the code and the verbatim copy of the error there will be very little that we can do to diagnose your problem.
To be as specific as I should have been in my original message, how should I write the above expression to exclude rows where TDS is missing?
First you need to clarify whether TDS is the name of a column or a possible value in a column named param. This whole painful multi-question process would be greatly accelerated if you offered str(chemdata).
-- David Winsemius, MD West Hartford, CT
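Since the thread leaves open whether TDS is a column name or a value in the param column, here is a minimal sketch of both readings. The layout of chemdata is not shown anywhere in the thread, so the data frames below (including the `site` column and the `wide` frame) are hypothetical, and the `drop = TRUE` argument from the question is left out.

```r
# Reading 1 (assumed): long format, TDS is a value in the 'param' column
chemdata <- data.frame(
  site  = c("A", "A", "B", "B", "C"),
  param = c("TDS", "Cond", "TDS", "Cond", "TDS"),
  quant = c(310, 480, NA, 455, 290),
  basin = c("north", "north", "south", "south", "south"),
  stringsAsFactors = FALSE
)

# Keep TDS rows whose measured quantity is not missing
tds.basin <- subset(chemdata, param == "TDS" & !is.na(quant),
                    select = c(param, quant, basin))

# Reading 2 (assumed): wide format, TDS and Cond are separate columns
wide <- data.frame(TDS = c(310, NA, 290), Cond = c(480, 455, NA))

# Keep rows where both parameters were measured
both <- wide[!is.na(wide$TDS) & !is.na(wide$Cond), ]
```

In the long-format reading the missing-value test applies to the measurement column (quant here), not to param itself, which may be why adding !is.na to the param condition did not behave as expected.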
Re: [R] Arima Models - Error and jump error
Perhaps:
  require(forecast)
  ?auto.arima
  # Or look into package FitAR.
The first performs seasonal optimization so it is likely better for your application. Ken Hutchison
On Oct 21, 2554 BE, at 1:59 PM, Flávio Fagundes flavi...@gmail.com wrote: Hi people, I'm trying to develop a simple routine to run many ARIMA models resulting from combinations of some parameters. My test data cover one year at daily resolution. Part of the routine is:
  for ( d in 0:1 ) { for ( p in 0:3 ) { for ( q in 0:3 ) {
  for ( sd in 0:1 ) { for ( sp in 0:3 ) { for ( sq in 0:3 ) {
    Yfit=arima(Yst[,2],order=c(p,d,q),seasonal=list(order=c(sp,sd,sq),period=7),include.mean=TRUE,xreg=DU0)
  }}}}}}
Up to step 187 it runs normally, but at step 187 it returns an error and the program stops.
  Yfit=arima(Yst[,2],order=c(1,0,1),seasonal=list(order=c(2,1,2),period=7),include.mean=TRUE,xreg=DU0)
  Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control = optim.control, : non-finite finite-difference value [1]
My questions are: 1. What does this error mean and why did it occur? 2. How can I make the program disregard any error and continue running until the end of the loop? 3. Does anyone know of an existing routine that does this? Thanks Flávio
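On question 2 in the quoted message (continuing the loop after a failed fit), one common base-R pattern is to wrap each arima() call in tryCatch(). The sketch below is an illustration on a simulated, non-seasonal series with a much smaller grid than the poster's; the series `y`, the `grid` data frame and the AIC comparison are assumptions, not part of the original routine.

```r
# Simulated stand-in for the poster's series (hypothetical data)
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))

# Small grid of (p, d, q) orders; one row per candidate model
grid <- expand.grid(p = 0:1, d = 0:1, q = 0:1)
grid$aic <- NA_real_

for (i in seq_len(nrow(grid))) {
  fit <- tryCatch(
    arima(y, order = c(grid$p[i], grid$d[i], grid$q[i])),
    error = function(e) {
      # Report the failure and carry on with the next combination
      message("model ", i, " failed: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(fit)) grid$aic[i] <- AIC(fit)
}

# Candidate with the smallest AIC among the fits that succeeded
best <- grid[which.min(grid$aic), ]
```

Failed combinations simply keep an NA in the aic column, so they can be inspected afterwards instead of stopping the run.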
[R] Serialization help.
I have the following code:
  c <- file("c:/temp/r/SkuSalesModel.br", "rb")
  s <- unserialize(c)
  close(c)
  rm(c)
It worked as recently as yesterday. Today when I came in I got the following error:
  Error in .Call("R_unserialize", connection, refhook, PACKAGE = "base") : negative length vectors are not allowed
I have not upgraded or changed any installation and the file has not changed. Any ideas on how I can get more info or solve this error? Thank you. Kevin
Re: [R] Scatterplot with the 3rd dimension = color?
Awesome, thank you so much for this! I plan to play around with this more next week with my actual data, but it provides a lot more options than I had before I posted. The link will help too. kb
On Oct 20, 8:18 pm, Dennis Murphy djmu...@gmail.com wrote: AFAIK, you can't 'add' two ggplot2 graphs together; the problem in this case is that the two color scales would clash. If you're willing to discretize the z values, then you could pull it off. Here's an example:
  d <- data.frame(x = rnorm(100), y = rnorm(100), z = factor(1 + (rnorm(100) > 0)))
  d1 <- data.frame(x = rnorm(100), y = rnorm(100), z = factor(3 + (rnorm(100) > 0)))
  dd <- rbind(d, d1)
In each data frame, I'm assigning two factor levels depending on whether z > 0 or not. The factor levels are 1, 2 in d and 3, 4 in d1; when rbinded together, z has four distinct levels. Now call ggplot():
  ggplot(dd, aes(x = x, y = y, colour = z)) + geom_point() +
    scale_colour_manual(values = c('1' = 'red', '2' = 'blue', '3' = 'green', '4' = 'yellow'))
This may be coarser than you like, so you could always use the cut() function to discretize z in each data frame; you'll want to assign the levels so that they are distinct in the combined data frame. Example:
  d3 <- data.frame(x = rnorm(100), y = rnorm(100),
                   z = cut(rnorm(100), breaks = c(-Inf, -0.5, 0.5, Inf), labels = 1:3))
  d4 <- data.frame(x = rnorm(100), y = rnorm(100),
                   z = cut(rnorm(100), breaks = c(-Inf, -0.5, 0.5, Inf), labels = 4:6))
  dd2 <- rbind(d3, d4)
  mycols <- c('red', 'maroon', 'blue', 'green', 'cyan', 'yellow')
  ggplot(dd2, aes(x = x, y = y, colour = z)) + geom_point() +
    scale_colour_manual(breaks = levels(dd2$z), values = mycols)
You can always use the labels = argument of scale_colour_manual() to assign more evocative legend values, or equivalently, you can assign the labels in the cut() function within d3 and d4 to those you want in the legend and leave the plot code as is.
BTW, there is a dedicated ggplot2 list to which you can subscribe through http://had.co.nz/ggplot2/ (look for the ggplot2 mailing list near the top of the page). The list archives are accessible through the same link. HTH, Dennis
On Thu, Oct 20, 2011 at 12:25 PM, Kerry kbro...@gmail.com wrote: Can someone please help me out with this? The ggplot2 suggestion works great but I've spent a few days trying to figure out how to plot 2 variables with it and I'm stuck. Here's my example code:
  library(ggplot2)
  #Here's the 1st plot
  x <- rnorm(100)
  y <- rnorm(100)
  z <- rnorm(100)
  d <- data.frame(x,y,z)
  dg <- qplot(x,y,colour=z,data=d)
  dg + scale_colour_gradient(low="red", high="blue")
  #Here's the 2nd plot which will delete the 1st plot above but I'd like them to be plotted together
  x1 <- rnorm(100)
  y2 <- rnorm(100)
  z3 <- rnorm(100)
  d1 <- data.frame(x1,y1,z1)
  dg1 <- qplot(x1,y1,colour=z1,data=d1)
  dg1 + scale_colour_gradient(low="green", high="yellow")
I've been trying to get long format working but it just doesn't make any sense to me. Thanks, kb
On Oct 17, 3:10 pm, Kerry kbro...@gmail.com wrote: Yes, the qplot works great, but do you know how to allow for multiple plots? I want one variable to be plotted say from blue to red and another say from yellow to green but in the same graph, each having their own separate legends. I've tried print() and arrange() but no luck. Thanks again, kb
On Oct 2, 10:42 pm, Ben Bolker bbol...@gmail.com wrote: Duncan Murdoch murdoch.duncan at gmail.com writes: On 11-10-02 1:11 PM, Kerry wrote: I have 3 columns of data and want to plot each row as a point in a scatter plot and want one column to be represented as a color gradient (e.g. larger values being more red). Anyone know the command or package for this?
It's not a particularly effective display, but here's how to do it. Use rainbow(101) in place of rev(heat.colors(101)) if you like.
  x <- rnorm(10)
  y <- rnorm(10)
  z <- rnorm(10)
  colors <- rev(heat.colors(101))
  zcolor <- colors[(z - min(z))/diff(range(z))*100 + 1]
  plot(x,y,col=zcolor)
or
  d <- data.frame(x,y,z)
  library(ggplot2)
  qplot(x,y,colour=z,data=d)
I agree about the "not particularly effective display" comment, but if you have two continuous predictors and a continuous response you've got a tough display problem -- your choices are:
1. use color, size, or some other graphical characteristic (pretty far down on the Cleveland hierarchy)
2. use a perspective plot (hard to get the right viewing angle, often confusing)
3. use coplots/small multiples/faceting (requires discretizing one dimension)
[R] PCA and Regression with complex categorical variables
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to use gev.fit (package ismev) under box constraints?
Hello, it seems as if something did not work with my first email. I would like to estimate the parameters of a generalized extreme value (GEV) distribution using maximum likelihood as implemented in the gev.fit function of package ismev. If I do the following:
  y.training <- c(22, 22, 18, 19, 18, 18, 22, 27, 25, 19, 18, 21, 18, 20, 18, 19, 18, 21, 29, 18, 22, 19, 19, 24, 18, 21, 22, 20, 20, 27, 18, 20, 20, 18, 18, 18, 21, 18, 18, 21, 26, 19, 18, 19, 19, 18, 19, 18, 20, 20, 25, 21, 26, 22, 20, 19, 22, 21, 21, 20, 20, 19, 18, 22, 22, 27, 19, 20, 26, 29, 18, 20, 19, 22, 23, 18, 20, 20, 22, 18, 23, 18, 20, 19, 27, 21, 22, 18, 18, 19, 18, 21, 18, 23, 18, 18, 20, 20, 24, 19, 18, 19, 19, 23, 19, 18, 25, 18, 24, 19)
  fit <- gev.fit(xdat=y.training, show=F)
  round(fit$mle,2)
  # 18.00 , 0.00 , 3.96
  # The estimated shape parameter is 3.96.
I would like to perform the estimation under the constraint that the shape parameter is smaller than 1, but the following does not work:
  fit <- gev.fit(xdat=y.training, show=F, method="L-BFGS-B", lower=c(0,0,-2), upper=c(50,10,1))
  round(fit$mle,2)
  # 18.09 , 0.27 , 3.05
It seems as if the lower and upper values are not passed to the optim function in the way they should be. A warning says that they are only passed to the control part of optim. Therefore my question: (How) is it possible to use the gev.fit function to perform the ML estimation under the constraint that the shape parameter is smaller than 1? Or, more generally: is it possible to use the gev.fit function under box constraints, as should be possible for optim? Thanks in advance.
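One possible workaround, sketched under stated assumptions, is to bypass gev.fit and hand a GEV negative log-likelihood directly to optim(), where method = "L-BFGS-B" does accept lower and upper bounds. The function `gev_nll`, the penalty value 1e6 for infeasible parameters, the simulated data and the starting values are illustrative choices and are not taken from ismev; the poster's own data (integers with many ties at the minimum) may still behave poorly.

```r
# Negative log-likelihood of the GEV distribution
# par = c(mu, sigma, xi): location, scale, shape
gev_nll <- function(par, x) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]
  if (sigma <= 0) return(1e6)                 # infeasible scale
  z <- (x - mu) / sigma
  if (abs(xi) < 1e-8) {                       # Gumbel limit as xi -> 0
    return(length(x) * log(sigma) + sum(z) + sum(exp(-z)))
  }
  y <- 1 + xi * z
  if (any(y <= 0)) return(1e6)                # outside the support
  length(x) * log(sigma) + sum(y^(-1 / xi)) + (1 / xi + 1) * sum(log(y))
}

# Simulated GEV sample (mu = 20, sigma = 2, xi = 0.2) by inverse transform
set.seed(123)
u <- runif(200)
x <- 20 + 2 * ((-log(u))^(-0.2) - 1) / 0.2

# Moment-based starting values, shape started at 0.1
sig0  <- sqrt(6 * var(x)) / pi
start <- c(mean(x) - 0.57722 * sig0, sig0, 0.1)

# Box constraints: scale strictly positive, shape restricted to [-2, 1]
fit <- optim(start, gev_nll, x = x, method = "L-BFGS-B",
             lower = c(-Inf, 1e-6, -2), upper = c(Inf, Inf, 1))
fit$par
```

Because the bounds are enforced by the optimizer itself, the returned shape estimate cannot exceed the upper limit; whether a solution sitting on the boundary is statistically meaningful is a separate question.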
Re: [R] quantmod package
Thanks for the help. With that code it is possible to save the current quotes in a text file, but the date-time in the first column is not preserved. When I used read.table and tried to convert the result into an xts object, it gave an error because it cannot take the indices as a time object. The same thing happens if I only save the quotes in a data frame using rbind (I guess in the latter case that happens because the symbol, say TCS.NS, gets attached to the date). Please suggest a solution to this problem.
Re: [R] Scatterplot with the 3rd dimension = color?
Beautiful! It works perfectly, thanks! kb
On Oct 21, 7:42 am, Jim Lemon j...@bitwrit.com.au wrote: On 10/21/2011 06:25 AM, Kerry wrote: Can someone please help me out with this? The ggplot2 suggestion works great but I've spent a few days trying to figure out how to plot 2 variables with it and I'm stuck. Here's my example code: ...
Hi Kerry, This isn't ggplot2, but it may do what you want.
  library(plotrix)
  oldmar <- par(mar=c(5,4,4,4))
  plot(x,y,type="n")
  plotlim <- par("usr")
  rect(plotlim[1],plotlim[3],plotlim[2],plotlim[4],col="lightgray")
  grid(col="white")
  box()
  points(x,y,col=color.scale(z,c(1,0),0,c(0,1)),pch=19)
  points(x1,y2,col=color.scale(z3,1,c(0,1),0),pch=19)
  legendval1 <- seq(min(z),max(z),length.out=5)
  color.legend(2.9,0.5,3.1,1.5,round(legendval1,1),align="rb",gradient="y",
    rect.col=color.scale(legendval1,c(1,0),0,c(0,1)))
  legendval2 <- seq(min(z3),max(z3),length.out=5)
  color.legend(2.9,-1.5,3.1,-0.5,round(legendval2,1),align="rb",gradient="y",
    rect.col=color.scale(legendval2,c(1,1),c(0,1),0))
  par(xpd=TRUE)
  text(3,1.6,"z")
  text(3,-0.4,"z3")
  par(xpd=FALSE,oldmar)
Jim
Re: [R] PCA and Regression with complex categorical variables
Did you perhaps send an HTML message? As detailed in the Posting Guide, those get scrubbed by the mail-server. On Oct 21, 2011, at 10:48 AM, seanstcl...@verizon.net wrote: nothing -- David Winsemius, MD West Hartford, CT
Re: [R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth
Thanks very much, Dennis. See below for something I don't understand. On 10/21/2011 12:15 PM, Dennis Murphy wrote: Hi Michael: Here's one way to get it from ggplot2. To avoid possible overplotting, I jittered the points horizontally by ± 0.2. I also reduced the point size from the default 2 and increased the line thickness to 1.5 for both fitted curves. In ggplot2, the term faceting is synonymous with conditioning (by groups). library('HistData') library('ggplot2') ggplot(PearsonLee, aes(x = parent, y = child)) + geom_point(size = 1.5, position = position_jitter(width = 0.2)) + geom_smooth(method = lm, aes(weights = PearsonLee$weight), colour = 'green', se = FALSE, size = 1.5) + geom_smooth(aes(weights = PearsonLee$weight), colour = 'red', se = FALSE, size = 1.5) + facet_grid(chl ~ par) This seems to work, but I don't understand *why*, since the weight variable is PearsonLee$frequency, not PearsonLee$weight. PearsonLee$weight NULL I get an error if I try to use PearsonLee$frequency as the weights= variable. 
    > ggplot(PearsonLee, aes(x = parent, y = child)) +
    +   geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
    +   geom_smooth(method = lm, aes(weights = PearsonLee$frequency),
    +               colour = 'green', se = FALSE, size = 1.5) +
    +   geom_smooth(aes(weights = PearsonLee$frequency),
    +               colour = 'red', se = FALSE, size = 1.5) +
    +   facet_grid(chl ~ par)
    Error in eval(expr, envir, enclos) : object 'weight' not found

In the form below, it makes sense to me and does work, using weight=frequency in the initial aes(), and weight= in geom_smooth:

    ggplot(PearsonLee, aes(x = parent, y = child, weight=frequency)) +
      geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
      geom_smooth(method = lm, aes(weight = PearsonLee$frequency),
                  colour = 'green', se = FALSE, size = 1.5) +
      geom_smooth(aes(weight = PearsonLee$frequency),
                  colour = 'red', se = FALSE, size = 1.5) +
      facet_grid(chl ~ par)

    # If you prefer a legend, here's one take, pulling the legend inside
    # to the upper left corner. This requires a bit more 'trickery', but
    # the tricks are found in the ggplot2 book.
    ggplot(PearsonLee, aes(x = parent, y = child)) +
      geom_point(size = 1.5, position = position_jitter(width = 0.2)) +
      geom_smooth(method = lm, aes(weights = PearsonLee$weight, colour = 'Linear'),
                  se = FALSE, size = 1.5) +
      geom_smooth(aes(weights = PearsonLee$weight, colour = 'Loess'),
                  se = FALSE, size = 1.5) +
      facet_grid(chl ~ par) +
      scale_colour_manual(breaks = c('Linear', 'Loess'),
                          values = c('green', 'red')) +
      opts(legend.position = c(0.14, 0.885),
           legend.background = theme_rect(fill = 'white'))

HTH,
Dennis

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
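For readers following the weight question in this thread, here is a minimal base-R sketch of what a frequency weight means for the fitted line. It uses a small invented frequency table (not the real PearsonLee data; the names `freq_tab`, `fit_weighted` and `fit_expanded` are made up for illustration) and shows that a weighted fit gives the same coefficients as a fit on the data expanded to one row per observation. The commented ggplot2 call assumes the aesthetic is named `weight` and is mapped to the bare column name, as in the form Michael reports as working.

```r
# Hypothetical frequency table standing in for PearsonLee (values invented).
freq_tab <- data.frame(
  parent    = c(60, 62, 64, 66, 68, 70),
  child     = c(61, 62.5, 63, 66.5, 67, 69.5),
  frequency = c(2, 5, 9, 7, 4, 1)
)

# Weighted fit: each row counts 'frequency' times.
fit_weighted <- lm(child ~ parent, data = freq_tab, weights = frequency)

# Same model on the expanded data, one row per underlying observation.
expanded <- freq_tab[rep(seq_len(nrow(freq_tab)), freq_tab$frequency), ]
fit_expanded <- lm(child ~ parent, data = expanded)

# The two sets of coefficients agree.
print(coef(fit_weighted))
print(coef(fit_expanded))

# ggplot2 counterpart (not run here; assumes ggplot2 is installed):
# ggplot(freq_tab, aes(x = parent, y = child, weight = frequency)) +
#   geom_point() +
#   geom_smooth(method = "lm", se = FALSE)
```

Only the coefficients are compared here; standard errors differ between the two fits because the residual degrees of freedom are not the same.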
Re: [R] Working With Variables Having Different Lengths
On Fri, 21 Oct 2011, David Winsemius wrote:

First you need to clarify whether TDS is the name of a column or a possible value in a column named param. This whole painful multi-question process would be greatly accelerated if you offered str(chemdata).

Yes, I did on a different thread, but not on this one.

    str(chemdata)
    'data.frame': 47244 obs. of 6 variables:
     $ site    : Factor w/ 143 levels "BC-0.5","BC-1",..: 134 134 134 127 127
     $ sampdate: Date, format: "2006-12-06" "2006-12-06" ...
     $ param   : Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 66 12 24 59 66
     $ quant   : num 1.08e+04 7.95 1.80e-02 2.80e+02 1.90e+01 8.44 1.62e+03
     $ stream  : Factor w/ 24 levels "B","C",..: 4 4 4 21 21 21 4
     $ basin   : Factor w/ 2 levels "Basin1","Basin2": 1 1 1 1 1 1 1 1 1 2 ...

What I need to do is examine the relationships between the parameter TDS and other parameters associated with it; e.g., Cond and SO4. I started by subsetting the main data frame (chemdata)

    tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant,
                        basin), na.rm = TRUE, drop = TRUE)
    cond.basin <- subset(chemdata, param == "Cond", select = c(param, quant,
                         basin), na.rm = TRUE, drop = TRUE)

However, these left the NA rows in the new data frames. I can produce an xyplot() using tds.basin$quant and cond.basin$quant, but it's obvious there are many points where one or the other have NA values. When I tried a linear regression it failed because of an unequal number of rows in both data frames.

What I need to learn are: 1) how to write the subset() to remove the NA rows for each one and 2) how to perform linear regression (and further analyses) on these pairs of data frames.

If you do not offer both the code and the verbatim copy of the error there will be very little that we can do to diagnose your problem.

    str(tds.basin)
    'data.frame': 2206 obs. of 3 variables:
     $ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 58 58 58 58 58 58
     $ quant: num 10800 530 3838 3658 3756 ...
     $ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

    str(cond.basin)
    'data.frame': 1191 obs. of 3 variables:
     $ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 24 24 24 24 24 24 24
     $ quant: num 280 3170 4220 3420 3700 ...
     $ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

then,

    m1 <- lm(tds.basin$quant ~ cond.basin$quant)
    Error in model.frame.default(formula = tds.basin$quant ~ cond.basin$quant, :
      variable lengths differ (found for 'cond.basin$quant')

Rich
[R] rgl device on web
Hello, I'm looking for help putting an interactive rgl package 3d device on the web so that it maintains full functionality. Where should I start? Is it possible? Is there an example I can see? (Note: I'm also looking at putting other normal plots on the web.) I'd like to stay within R as much as possible... I didn't find much online regarding rgl 3d plots.

Thanks,
Ben
Re: [R] Working With Variables Having Different Lengths
On Oct 21, 2011, at 3:02 PM, Rich Shepard wrote:

On Fri, 21 Oct 2011, David Winsemius wrote:

First you need to clarify whether TDS is the name of a column or a possible value in a column named param. This whole painful multi-question process would be greatly accelerated if you offered str(chemdata).

Yes, I did on a different thread, but not on this one.

    str(chemdata)
    'data.frame': 47244 obs. of 6 variables:
     $ site    : Factor w/ 143 levels "BC-0.5","BC-1",..: 134 134 134 127 127
     $ sampdate: Date, format: "2006-12-06" "2006-12-06" ...
     $ param   : Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 66 12 24 59 66
     $ quant   : num 1.08e+04 7.95 1.80e-02 2.80e+02 1.90e+01 8.44 1.62e+03
     $ stream  : Factor w/ 24 levels "B","C",..: 4 4 4 21 21 21 4
     $ basin   : Factor w/ 2 levels "Basin1","Basin2": 1 1 1 1 1 1 1 1 1 2 ...

What I need to do is examine the relationships between the parameter TDS and other parameters associated with it; e.g., Cond and SO4.

How are we to determine which lines contain information about the relationships of param==TDS with whatever cases or variable has values of Cond and SO4? Are you really trying to compare two disjoint groups on some statistic like the means and std-dev of quant? (This would be a job for `aggregate`.)

I started by subsetting the main data frame (chemdata)

    tds.basin <- subset(chemdata, param == "TDS", select = c(param, quant,
                        basin), na.rm = TRUE, drop = TRUE)
    cond.basin <- subset(chemdata, param == "Cond", select = c(param, quant,
                         basin), na.rm = TRUE, drop = TRUE)

So now you have two disjoint subsets. Why should we think they can be analyzed with regression methods?

However, these left the NA rows in the new data frames.

Not for the param column I hope. And the na.rm= arguments should get ignored by subset.

I can produce an xyplot() using tds.basin$quant and cond.basin$quant, but it's obvious there are many points where one or the other have NA values. When I tried a linear regression it failed because of an unequal number of rows in both data frames.

What I need to learn are: 1) how to write the subset() to remove the NA rows for each one and 2) how to perform linear regression (and further analyses) on these pairs of data frames.

If you do not offer both the code and the verbatim copy of the error there will be very little that we can do to diagnose your problem.

    str(tds.basin)
    'data.frame': 2206 obs. of 3 variables:
     $ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 58 58 58 58 58 58 58
     $ quant: num 10800 530 3838 3658 3756 ...
     $ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

    str(cond.basin)
    'data.frame': 1191 obs. of 3 variables:
     $ param: Factor w/ 66 levels "AGP","ANP","ANP/AGP",..: 24 24 24 24 24 24 24
     $ quant: num 280 3170 4220 3420 3700 ...
     $ basin: Factor w/ 2 levels "Basin1","Basin2": 1 2 2 2 2 2 2 2 2 2 ...

then,

    m1 <- lm(tds.basin$quant ~ cond.basin$quant)
    Error in model.frame.default(formula = tds.basin$quant ~ cond.basin$quant, :
      variable lengths differ (found for 'cond.basin$quant')

In regression calls it is almost always better to construct them with a data argument:

    m1 <- lm(tds.basin$quant ~ cond.basin$quant)

Rich

David Winsemius, MD
West Hartford, CT
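One way to act on the advice about a data argument is to pair the TDS and Cond measurements in a single data frame before fitting. The sketch below is only an illustration under stated assumptions: the data are invented, and it assumes that a TDS value and a Cond value belong together when they share the same site and sampdate, which the thread does not confirm. The names `tds`, `cond` and `paired` are hypothetical.

```r
# Hypothetical long-format data shaped like chemdata; all values are invented.
chemdata <- data.frame(
  site     = c("S1", "S1", "S2", "S2", "S3", "S3", "S4", "S5", "S5"),
  sampdate = as.Date(c("2006-12-06", "2006-12-06", "2007-01-10", "2007-01-10",
                       "2007-02-01", "2007-02-01", "2007-03-05",
                       "2007-04-02", "2007-04-02")),
  param    = c("TDS", "Cond", "TDS", "Cond", "TDS", "Cond", "TDS", "TDS", "Cond"),
  quant    = c(530, 820, 3838, 4220, NA, 3420, 3658, 3756, 3700)
)

# Drop missing measurements explicitly; subset() has no na.rm argument.
tds  <- subset(chemdata, param == "TDS"  & !is.na(quant),
               select = c(site, sampdate, quant))
cond <- subset(chemdata, param == "Cond" & !is.na(quant),
               select = c(site, sampdate, quant))

# Keep only samples that have both measurements (assumed pairing key:
# same site and same sampling date).
paired <- merge(tds, cond, by = c("site", "sampdate"),
                suffixes = c(".tds", ".cond"))

# Both variables now live in one data frame, so their lengths match.
m1 <- lm(quant.tds ~ quant.cond, data = paired)
print(paired)
print(coef(m1))
```

Because merge() keeps only rows present in both subsets, samples with a missing or unmatched value drop out before lm() sees them, which avoids the "variable lengths differ" error.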
[R] Kleinberg's burst detection algorithm
Has anyone here implemented Jon Kleinberg's burst detection algorithm (Bursty and Hierarchical Structure in Streams http://www.cs.cornell.edu/home/kleinber/bhs.pdf)? I'd rather not reimplement if there's already running code available.

Thanks,
-s
Re: [R] glm-poisson fitting 400.000 records
My apologies for my vague comment. My data comprises 400.000 x 21 (17 explanatory variables, plus response variable, plus two offsets). If I build the full model (only linear) I get:

    Error: cannot allocate vector of size 112.3 Mb

I have a 4GB RAM laptop... Would I get any improvement on an 8GB computer?

Many thanks,

-- View this message in context: http://r.789695.n4.nabble.com/glm-poisson-fitting-400-000-records-tp3925100p3925968.html
Re: [R] Find a particular point on a curve
The most important thing is the point termed C (on the image): http://r.789695.n4.nabble.com/file/n3926631/courbe_temp%C3%A9rature.png which is the first point (time, temperature) where temperature stabilizes after the temperature drop (end of feeding). The definition of that particular point is: the point where temperature stabilizes within 1 sd (calculated from the prefeeding temperature) over a minimum of 10 minutes. Can someone help me write the code for it?

Thank you very much,
Joanie

-- View this message in context: http://r.789695.n4.nabble.com/Find-a-particular-point-on-a-curve-tp3882721p3926631.html
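A minimal base-R sketch of one possible reading of that definition follows. The interpretation is an assumption, not something stated in the post: a point counts as stable when every reading in the following 10 minutes lies within ±1 prefeeding standard deviation of the reading at that point. The function name `find_stable_point`, its arguments, and the temperature series are all invented for illustration; times are assumed to be in minutes.

```r
# Return the first (time, temp) at or after 'search_from' whose next
# 'min_minutes' of readings all stay within 1 prefeeding sd of it.
# 'prefeed' is a logical or integer index of the prefeeding readings.
find_stable_point <- function(time, temp, prefeed, search_from,
                              min_minutes = 10) {
  tol <- sd(temp[prefeed])  # 1 sd of the prefeeding temperature
  for (i in which(time >= search_from)) {
    in_window <- time >= time[i] & time <= time[i] + min_minutes
    # Skip points too close to the end to cover a full window.
    if (max(time[in_window]) < time[i] + min_minutes) next
    if (all(abs(temp[in_window] - temp[i]) <= tol)) {
      return(data.frame(time = time[i], temp = temp[i]))
    }
  }
  NULL  # no stable point found
}

# Invented series, one reading per minute: steady prefeeding phase,
# a drop during feeding, a recovery, then a stable phase.
time <- 0:59
temp <- c(38 + rep(c(0.1, -0.1), 5),          # minutes 0-9, prefeeding
          seq(38, 35, length.out = 20),       # minutes 10-29, drop
          seq(35.2, 37, length.out = 10),     # minutes 30-39, recovery
          37 + rep(c(0.05, -0.05), 10))       # minutes 40-59, stable

stable <- find_stable_point(time, temp, prefeed = time < 10, search_from = 30)
print(stable)
```

If "within 1 sd" is instead meant relative to the prefeeding mean, the comparison `temp[i]` inside `abs()` would be replaced by `mean(temp[prefeed])`; the rest of the search stays the same.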