Re: [R] Unsubscribe please
Hello,

Don't reply only to me. 1) Filter the unwanted mails. 2) It takes a few days to unsubscribe you.

Regards, Pascal

On 04/17/2013 02:59 PM, Bert Verleysen (beverconsult) wrote: I did this, but still I receive too many mails. Bert Verleysen, 00 32 (0)477 874 272, searching together for generative organizing.

-Original message- From: Pascal Oettli [mailto:kri...@ymail.com] Sent: Wednesday 17 April 2013 6:33 To: Bert Verleysen (beverconsult) CC: R-help@r-project.org Subject: Re: [R] Unsubscribe please

Hi, Do it yourself: https://stat.ethz.ch/mailman/listinfo/r-help Hint: bottom of the page ("To unsubscribe from R-help"). Regards, Pascal

On 04/17/2013 06:33 AM, Bert Verleysen (beverconsult) wrote: Sent from my iPad. Bert Verleysen, 00 32 (0)477 874 272, www.beverconsult.be

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Transformation of a variable in a dataframe
On Apr 16, 2013, at 10:33 PM, jpm miao wrote:

Hi, I have a dataframe with two variables A, B. I transform the two variables, name them C and D, and save them in a dataframe dfcd. However, why can't I call them by dfcd$C and dfcd$D?

Because you didn't assign them to dfcd$C: single-bracket subsetting with dfab["A"] returns a one-column data frame that keeps the name "A". It's going to be more successful if you use dfab[["A"]]*2, which returns a plain vector.

Thanks, Miao

A = c(1, 2, 3)
B = c(4, 6, 7)
dfab <- data.frame(A, B)
C = dfab["A"] * 2
D = dfab["B"] * 3
dfcd <- data.frame(C, D)
dfcd
  A  B
1 2 12
2 4 18
3 6 21
dfcd$C
NULL
dfcd$A
[1] 2 4 6

David Winsemius, MD
Alameda, CA, USA
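A short sketch (not from the thread) making David's point concrete: single-bracket subsetting keeps the old column name, so data.frame() ignores the name you bound the result to, while double-bracket extraction returns a bare vector and the new name sticks.

```r
# Hypothetical illustration of why dfcd$C came out NULL above.
A <- c(1, 2, 3)
B <- c(4, 6, 7)
dfab <- data.frame(A, B)

C <- dfab["A"] * 2          # one-column data.frame, still named "A"
names(data.frame(C))        # "A" -- the name C is ignored

C2 <- dfab[["A"]] * 2       # plain numeric vector
D2 <- dfab[["B"]] * 3
dfcd <- data.frame(C = C2, D = D2)
dfcd$C                      # 2 4 6
dfcd$D                      # 12 18 21
```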
[R] Merging big data.frame
Hi all, I am trying to merge 2 big data.frames. The problem is that merge() is memory intensive, so R runs out of memory: "Error: cannot allocate vector of size 360.1 Mb". To overcome this, I am exploring the data.table package, but it is not helping in terms of memory: merge in data.table is fast but not memory efficient, and a similar error occurs. My inputs are:

inp1
  V1 V2
1  a i1
2  a i2
3  a i3
4  a i4
5  b i5
6  c i6

inp2
  V1 V2
1  a  x
2  b  x
3  a  y
4  c  z

I want merge(x=inp1, y=inp2, by.x="V1", by.y="V1"), so the output is:

   V1 V2.x V2.y
1   a   i1    x
2   a   i1    y
3   a   i2    x
4   a   i2    y
5   a   i3    x
6   a   i3    y
7   a   i4    x
8   a   i4    y
9   b   i5    x
10  c   i6    z

Is there a way to do this without using merge in data.table? Or is there any other solution that is more efficient and uses less memory?

thanks, avi
Re: [R] Merging big data.frame
check out the sqldf package

--- Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers), DCN: jdnew...@dcn.davis.ca.us. Sent from my phone. Please excuse my brevity.

avinash sahu avinash.s...@gmail.com wrote: [original message quoted above]
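A sketch of the sqldf suggestion (an assumption on my part; Jeff named only the package). sqldf runs the join inside SQLite, and with a dbname argument the database lives on disk rather than in RAM, which is the relevant point for a merge that exhausts memory.

```r
# Hypothetical sketch, assuming the sqldf package is installed.
library(sqldf)

inp1 <- data.frame(V1 = c("a", "a", "a", "a", "b", "c"),
                   V2 = c("i1", "i2", "i3", "i4", "i5", "i6"))
inp2 <- data.frame(V1 = c("a", "b", "a", "c"),
                   V2 = c("x", "x", "y", "z"))

# dbname = tempfile() makes SQLite use an on-disk database,
# so the join is not limited by available RAM.
out <- sqldf("SELECT inp1.V1, inp1.V2 AS 'V2.x', inp2.V2 AS 'V2.y'
              FROM inp1 JOIN inp2 ON inp1.V1 = inp2.V1",
             dbname = tempfile())
out
```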
Re: [R] Merge
Thanks a lot :)

Sent from my iPad

On Apr 16, 2013, at 10:15 PM, arun smartpink...@yahoo.com wrote:

Hi Farnoosh,
You can use either ?merge() or ?join():

DataA <- read.table(text="
ID v1
1 10
2 1
3 22
4 15
5 3
6 6
7 8
", sep="", header=TRUE)

DataB <- read.table(text="
ID v2
2 yes
5 no
7 yes
", sep="", header=TRUE, stringsAsFactors=FALSE)

merge(DataA, DataB, by="ID", all.x=TRUE)
#  ID v1  v2
#1  1 10  NA
#2  2  1 yes
#3  3 22  NA
#4  4 15  NA
#5  5  3  no
#6  6  6  NA
#7  7  8 yes

library(plyr)
join(DataA, DataB, by="ID", type="left")
#  ID v1  v2
#1  1 10  NA
#2  2  1 yes
#3  3 22  NA
#4  4 15  NA
#5  5  3  no
#6  6  6  NA
#7  7  8 yes

A.K.

From: farnoosh sheikhi farnoosh...@yahoo.com
To: smartpink...@yahoo.com
Sent: Wednesday, April 17, 2013 12:52 AM
Subject: Merge

Hi Arun, I want to merge a data set with another data frame with 2 columns and keep the sample size of DataA.

DataA      DataB      DataCombine
ID v1      ID v2      ID v1  v2
1  10      2  yes     1  10  NA
2  1       5  no      2  1   yes
3  22      7  yes     3  22  NA
4  15                 4  15  NA
5  3                  5  3   no
6  6                  6  6   NA
7  8                  7  8   yes

Thanks a lot for your help and time.
Re: [R] the joy of spreadsheets (off-topic)
Can you resend this link please? Thanks

On Tue, Apr 16, 2013 at 10:33 PM, Jim Lemon j...@bitwrit.com.au wrote:

On 04/17/2013 03:25 AM, Sarah Goslee wrote: ... Ouch. (Note: I know nothing about the site, the author of the article, or the study in question. I was pointed to it by someone else. But if true: highly problematic.) Sarah

There seem to be three major problems described here, and only one is marginally related to Excel (and similar spreadsheets). Cherry-picking data is all too common: almost anyone who reviews papers for publication will have encountered it, and there are excellent books describing examples that have had great influence on public policy. Similarly, applying obscure and sometimes inappropriate statistical methods that produce the desired results when nothing else will appears with depressing frequency. The final point does relate to Excel and any application that hides what is going on from the casual observer. I will treasure this URL to give to anyone who chastises my moaning when I have to perform some task in Excel. It is not an error in the application (although these certainly exist) but a salutary caution to those who think that if a reasonable-looking number appears in a cell, it must be the correct answer. I have found not one but two such errors in the simple calculation of an age from the date of birth and date of death.

Jim

-- Shane
Re: [R] I don't understand the 'order' function
There is a blog post about this: http://www.portfolioprobe.com/2012/07/26/r-inferno-ism-order-is-not-rank/

And proof that it is possible to confuse them even when you know the difference.

Pat

On 16/04/2013 19:10, Julio Sergio wrote: Julio Sergio juliosergio at gmail.com writes: I thought I had understood the 'order' function, using simple examples like: ... Thanks to you all! As Sarah said, what was damaged was my understanding ( ;-) ), and as Duncan said, I was confusing 'order' with 'rank'. Thanks! Now I understand the 'order' function. -Sergio

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R', 'The R Inferno', 'Tao Te Programming')
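A tiny illustration (not from the thread) of the distinction the blog post makes: order() says which positions to take to sort the data, rank() says where each element would land.

```r
# Hypothetical example; values chosen so order() and rank() differ.
x <- c(30, 10, 20)
order(x)   # 2 3 1  (take x[2], x[3], x[1] to get the sorted vector)
rank(x)    # 3 1 2  (30 is largest, 10 smallest, 20 in the middle)
identical(x[order(x)], sort(x))   # TRUE
```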
Re: [R] the joy of spreadsheets (off-topic)
On Apr 17, 2013, at 10:16, Shane Carey wrote: Can you resend this link please?

Psst: https://stat.ethz.ch/pipermail/r-help/2013-April/351669.html

-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark. Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
[R] Creating a vector with repeating dates
Dear R forum,

I have a data.frame:

df = data.frame(dates = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"),
                values = c(47, 38, 56, 92))

I need to create a vector by repeating the dates as "Current_date", "4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013", three times over, i.e. I need to create a new vector as given below, which I need to use for some other purpose:

Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013

Is it possible to construct such a column?

Regards, Katherine
Re: [R] Creating a vector with repeating dates
?rep

On Wed, Apr 17, 2013 at 11:11 AM, Katherine Gobin katherine_go...@yahoo.com wrote: [original question quoted above]
Re: [R] Understanding why a GAM can't suppress an intercept
hi Andrew,

gam does suppress the intercept; it's just that this doesn't force the smooth through the intercept in the way that you would like. Basically, for the parametric component of the model, '-1' behaves exactly as it does in 'lm' (it's using the same code). The smooths are 'added on' to the parametric component of the model, with sum-to-zero constraints to force identifiability. There is a solution for forcing a spline through a particular point at http://r.789695.n4.nabble.com/Use-pcls-in-quot-mgcv-quot-package-to-achieve-constrained-cubic-spline-td4660966.html (i.e. the R-help thread "Re: [R] Use pcls in mgcv package to achieve constrained cubic spline").

best, Simon

On 16/04/13 22:36, Andrew Crane-Droesch wrote:

Dear List,

I've just tried to specify a GAM without an intercept -- I've got one of the (rare) cases where it is appropriate for E(y) -> 0 as X -> 0. Naively running a GAM with -1 appended to the formula and then calling predict.gam, I see that the model isn't behaving as expected. I don't understand why this would be. Google turns up this old R-help thread: http://r.789695.n4.nabble.com/GAM-without-intercept-td4645786.html

Simon writes: "Smooth terms are constrained to sum to zero over the covariate values. This is an identifiability constraint designed to avoid confounding with the intercept (particularly important if you have more than one smooth). If you remove the intercept from your model altogether (m2) then the smooth will still sum to zero over the covariate values, which in your case will mean that the smooth is quite a long way from the data. When you include the intercept (m1) then the intercept is effectively shifting the constrained curve up towards the data, and you get a nice fit."

Why? I haven't read Simon's book in great detail, though I have read Ruppert et al.'s Semiparametric Regression. I don't see a reason why a penalized spline model shouldn't equal the intercept (or zero) when all of the regressors equal zero. Is anyone able to help with a bit of intuition? Or relevant passages from a good description of why this would be the case? Furthermore, why does the -1 formula specification "work" if it doesn't work as intended by, for example, lm?

Many thanks, Andrew

-- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK +44 (0)1225 386603 http://people.bath.ac.uk/sw283
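A minimal sketch of the behaviour being discussed (my own construction, assuming the mgcv package; the data are made up): dropping the intercept does not pull the smooth through the origin, because the smooth keeps its sum-to-zero identifiability constraint either way.

```r
# Hypothetical demonstration, not code from the thread.
library(mgcv)

set.seed(1)
x <- runif(200)
y <- 2 * x + rnorm(200, sd = 0.1)   # truth passes through the origin

m1 <- gam(y ~ s(x))        # with intercept: intercept shifts the
                           # constrained curve up to the data
m2 <- gam(y ~ s(x) - 1)    # intercept removed: the smooth still sums
                           # to zero over the observed x values

# The fit at x = 0 from m2 is generally NOT forced to zero:
predict(m2, newdata = data.frame(x = 0))
```

Forcing the fit through a specific point needs the constrained-fitting route (pcls) that Simon links above, not the -1 formula device.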
Re: [R] Creating a vector with repeating dates
Dear Andrija Djurovic,

Thanks for the suggestion. I am aware of rep(). However, here I need to repeat not only dates but also the string "Current_date". Thus, I need to create a vector (to be included in some other data.frame), with the name say dt, which will contain:

dt
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013

So this is a combination of dates and a string. Hence, I am just wondering whether it is possible to create such a vector or not?

Regards, Katherine

--- On Wed, 17/4/13, andrija djurovic djandr...@gmail.com wrote: [?rep, and the original question quoted above]
Re: [R] Creating a vector with repeating dates
Hello,

Try the following:

rep(c("Current_date", as.character(df$dates)), 3)

Hope this helps,

Rui Barradas

On 17-04-2013 10:11, Katherine Gobin wrote: [original question quoted above]
Re: [R] Creating a vector with repeating dates
Hi. Here are some examples that can maybe help you:

a <- "Current date"
b <- Sys.Date() - 1:5
a
b
class(a)
class(b)
c(a, b)
mode(b)
as.numeric(b)
class(c(a, b))
c(a, as.character(b))
class(c(a, b))
class(c(a, as.character(b)))

Hope this helps.

On Wed, Apr 17, 2013 at 11:21 AM, Katherine Gobin katherine_go...@yahoo.com wrote: [previous message quoted above]
Re: [R] Creating a vector with repeating dates
On 04/17/2013 07:11 PM, Katherine Gobin wrote: [original question quoted above]

Hi Katherine,

How about:

rep(c("Current date", paste(4, 15:12, 2013, sep="/")), 3)

Jim
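For completeness, a sketch (my own, not from the thread) of how the repeated vector might be built from the data frame itself and what type it ends up with: the mixed column is necessarily character, since a Date vector cannot hold the literal string "Current_date".

```r
# Hypothetical sketch of assembling the requested column.
df <- data.frame(dates  = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"),
                 values = c(47, 38, 56, 92),
                 stringsAsFactors = FALSE)

dt <- rep(c("Current_date", df$dates), 3)
class(dt)    # "character" -- strings and date strings coexist only as text
length(dt)   # 15, i.e. (1 header + 4 dates) repeated 3 times
```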
Re: [R] I don't understand the 'order' function
On Apr 17, 2013, at 10:41, Patrick Burns wrote: There is a blog post about this: http://www.portfolioprobe.com/2012/07/26/r-inferno-ism-order-is-not-rank/ And proof that it is possible to confuse them even when you know the difference.

It usually helps to remember that x[order(x)] is sort(x) (and that x[rank(x)] is nothing of the sort). It's somewhat elusive, but not impossible, to realize that the two are inverses (if there are no ties). Duncan M. indicated it nicely earlier in the thread: rank() is how to permute ordered data to get those observed; order() is how to permute the data to put them in order. They are inverses in terms of composition of permutations, not as transformations of sets of integers: rank(order(x)) and order(rank(x)) are both equal to order(x), whereas

x <- rnorm(5)
rank(x)
[1] 4 3 5 2 1
order(x)
[1] 5 4 2 1 3

## permutation matrix
P <- matrix(0, 5, 5); diag(P[, order(x)]) <- 1
P %*% 1:5
     [,1]
[1,]    5
[2,]    4
[3,]    2
[4,]    1
[5,]    3

P2 <- matrix(0, 5, 5); diag(P2[, rank(x)]) <- 1
P2 %*% 1:5
     [,1]
[1,]    4
[2,]    3
[3,]    5
[4,]    2
[5,]    1

P %*% P2
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    1    0    0    0
[3,]    0    0    1    0    0
[4,]    0    0    0    1    0
[5,]    0    0    0    0    1

Or, as Duncan put it: rank(x)[order(x)] and order(x)[rank(x)] are 1:length(x). The thing that tends to blow my mind is that order(order(x)) == rank(x). I can't seem to get my intuition to fathom it, although there's a fairly easy proof in that

1:N == sort(order(x)) == order(x)[order(order(x))] == order(x)[rank(x)]

-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark. Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Spatial Ananlysis: zero.policy=TRUE doesn't work for no neighbour regions??
Molo,

kurz_m at uni-hohenheim.de writes: ... As there are some regions without neighbours in my data, I use the following code to create the weights matrix:

W_Matrix <- nb2listw(location_nbq, style="W", zero.policy=TRUE)
W_Matrix

and get this output:

Error in print.listw(list(style = "W", neighbours = list(c(23L, 31L, 42L : regions with no neighbours found, use zero.policy=TRUE

As I use zero.policy=TRUE, I just don't understand what I'm doing wrong. My question would be: how could I create a weights matrix allowing for no-neighbour areas? Thanks, Michael

You have not grasped the fact that your object W_Matrix has been created correctly, but that spdep:::print.listw also needs zero.policy=TRUE, so:

print(W_Matrix, zero.policy=TRUE)

will work. If you want to set this globally for all subsequent function calls in your current session, use set.ZeroPolicyOption(TRUE).

Hope this clarifies,
Roger

PS. Consider posting questions of this kind to R-sig-geo.
[R] Problem with DateVisit-gives wrong year?
Hi, I have the following factor of dates that I want to convert to the Date class so I can extract the month:

test.date
 [1] 14/05/2012 14/05/2012 14/05/2012 14/05/2012 14/05/2012 14/05/2012
 [7] 14/05/2012 14/05/2012 14/05/2012 14/05/2012
201 Levels: 01/10/2012 01/11/2012 01/12/2012 02/07/2012 ... 28/09/2012

I use the code below:

ntest.date <- as.Date(test.date, '%d/%m/%y')

but the output has the wrong year, and the reverse order:

ntest.date
[1] 2020-05-14 2020-05-14 2020-05-14 2020-05-14 2020-05-14
[6] 2020-05-14 2020-05-14 2020-05-14 2020-05-14 2020-05-14

What am I doing wrong? I dare not say the word 'bug'.

Thanks,
Pancho Mulongeni, Research Assistant, PharmAccess Foundation, 1 Fouché Street, Windhoek West, Windhoek, Namibia. Tel: +264 61 419 000 Fax: +264 61 419 001/2 Mob: +264 81 4456 286
[R] Regularized Regressions
Hi all, I would greatly appreciate it if someone would be so kind as to share a package or method that uses a regularized regression approach, balancing model performance against model complexity. That said, I would be most grateful if there is an R package that combines ridge (sum of squared coefficients), lasso (sum of absolute coefficients), and best subsets (number of coefficients) as methods of regularized regression.

Sincerely, Christos Giannoulis
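One possible answer (my suggestion, not from the thread): the glmnet package fits the elastic net, which blends exactly the first two penalties mentioned -- alpha = 0 is ridge, alpha = 1 is lasso, and intermediate values mix them. Best-subset selection (a penalty on the number of coefficients) is not part of glmnet; the leaps package does exhaustive subset search for linear models.

```r
# Hypothetical sketch, assuming the glmnet package is installed.
library(glmnet)

x <- as.matrix(mtcars[, c("cyl", "disp", "hp", "drat", "wt")])
y <- mtcars$mpg

# Cross-validated elastic net: alpha = 0.5 gives an even ridge/lasso blend.
cvfit <- cv.glmnet(x, y, alpha = 0.5)

# Coefficients at the lambda with minimum cross-validation error;
# some may be exactly zero (the lasso part of the penalty).
coef(cvfit, s = "lambda.min")
```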
Re: [R] Problem with DateVisit-gives wrong year?
On 04/17/2013 09:18 PM, Pancho Mulongeni wrote: [question quoted above]

Hey Pancho,

It is not a bug; it is the format string. Try:

ntest.date <- as.Date(test.date, "%d/%m/%Y")

Jim
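To spell out what went wrong (my gloss on Jim's fix): lowercase %y matches a two-digit year, so it consumes only the "20" of "2012" (interpreted as 2020) and silently ignores the trailing "12"; uppercase %Y matches the full four-digit year.

```r
# Minimal demonstration of %y vs %Y on the poster's date format.
as.Date("14/05/2012", "%d/%m/%y")   # "2020-05-14" -- year read as "20"
as.Date("14/05/2012", "%d/%m/%Y")   # "2012-05-14" -- full year parsed
```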
Re: [R] Problem with DateVisit-gives wrong year?
Hi,

test.date <- rep(c("14/05/2012", "01/10/2012", "28/09/2012"), each=6)
as.Date(test.date, "%d/%m/%Y")
# [1] 2012-05-14 2012-05-14 2012-05-14 2012-05-14 2012-05-14
# [6] 2012-05-14 2012-10-01 2012-10-01 2012-10-01 2012-10-01
#[11] 2012-10-01 2012-10-01 2012-09-28 2012-09-28 2012-09-28
#[16] 2012-09-28 2012-09-28 2012-09-28

test.date1 <- factor(test.date)
as.Date(test.date1, "%d/%m/%Y")
# [1] 2012-05-14 2012-05-14 2012-05-14 2012-05-14 2012-05-14
# [6] 2012-05-14 2012-10-01 2012-10-01 2012-10-01 2012-10-01
#[11] 2012-10-01 2012-10-01 2012-09-28 2012-09-28 2012-09-28
#[16] 2012-09-28 2012-09-28 2012-09-28

A.K.

----- Original Message -----
From: Pancho Mulongeni p.mulong...@namibia.pharmaccess.org
To: r-help@r-project.org
Sent: Wednesday, April 17, 2013 7:18 AM
Subject: [R] Problem with DateVisit-gives wrong year?
[question quoted above]
Re: [R] Transformation of a variable in a dataframe
Hi,

You may also try:

dfab <- data.frame(A, B)
library(plyr)
dfcd <- subset(mutate(dfab, C=A*2, D=B*3), select=-c(A, B))
# or
dfcd1 <- subset(within(dfab, {D <- B*3; C <- A*2}), select=-c(A, B))
dfcd$C
#[1] 2 4 6
dfcd$D
#[1] 12 18 21

A.K.

----- Original Message -----
From: jpm miao miao...@gmail.com
To: r-help r-help@r-project.org
Sent: Wednesday, April 17, 2013 1:33 AM
Subject: [R] Transformation of a variable in a dataframe
[question quoted above]
Re: [R] the joy of spreadsheets (off-topic)
On Tue, Apr 16, 2013 at 4:33 PM, Jim Lemon j...@bitwrit.com.au wrote: [Jim's message quoted above, earlier in this thread]

So there (maybe) was a bug in Excel. Maybe hidden from the casual observer. And since Excel is not R, and we are R snobs, Excel is evil, right? But wait: is it easier for a casual observer to detect a flaw in a formula in Excel, or to find an incorrect array index in an R script? All ye who want to cast stones upon the interface of Excel should ask yourselves if you have ever had a bug in R code.

Kevin (no fan of Excel either)

-- Kevin Wright
[R] remove higher order interaction terms
Dear all, Consider the model below:

x <- lm(mpg ~ cyl * disp * hp * drat, mtcars)
summary(x)

Call:
lm(formula = mpg ~ cyl * disp * hp * drat, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-3.5725 -0.6603  0.0108  1.1017  2.6956

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)       1.070e+03  3.856e+02   2.776  0.01350 *
cyl              -2.084e+02  7.196e+01  -2.896  0.01052 *
disp             -6.760e+00  3.700e+00  -1.827  0.08642 .
hp               -9.302e+00  3.295e+00  -2.823  0.01225 *
drat             -2.824e+02  1.073e+02  -2.633  0.01809 *
cyl:disp          1.065e+00  5.034e-01   2.116  0.05038 .
cyl:hp            1.587e+00  5.296e-01   2.996  0.00855 **
disp:hp           7.422e-02  3.461e-02   2.145  0.04769 *
cyl:drat          5.652e+01  2.036e+01   2.776  0.01350 *
disp:drat         1.824e+00  1.011e+00   1.805  0.08990 .
hp:drat           2.600e+00  9.226e-01   2.819  0.01236 *
cyl:disp:hp      -1.050e-02  4.518e-03  -2.323  0.03368 *
cyl:disp:drat    -2.884e-01  1.392e-01  -2.071  0.05484 .
cyl:hp:drat      -4.428e-01  1.504e-01  -2.945  0.00950 **
disp:hp:drat     -2.070e-02  9.568e-03  -2.163  0.04600 *
cyl:disp:hp:drat  2.923e-03  1.254e-03   2.331  0.03317 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.245 on 16 degrees of freedom
Multiple R-squared: 0.9284, Adjusted R-squared: 0.8612
F-statistic: 13.83 on 15 and 16 DF, p-value: 2.007e-06

Is there a straightforward way to remove the highest-order interaction terms? Say: cyl:disp:hp, cyl:disp:drat, cyl:hp:drat, disp:hp:drat, cyl:disp:hp:drat. I know I could do this:

x <- lm(mpg ~ cyl * disp * hp * drat - cyl:disp:hp - cyl:disp:drat - cyl:hp:drat - disp:hp:drat - cyl:disp:hp:drat, mtcars)

But I was hoping for a more elegant solution. Regards, Liviu

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] remove higher order interaction terms
On Apr 17, 2013, at 7:23 AM, Liviu Andronic landronim...@gmail.com wrote: [Liviu's question and full model summary, quoted in the previous message; snipped]

If you only want up to, say, second-order interactions:

summary(lm(mpg ~ (cyl + disp + hp + drat)^2, data = mtcars))

Call:
lm(formula = mpg ~ (cyl + disp + hp + drat)^2, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-3.5487 -1.6998  0.0894  1.2366  4.6138

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  9.816e+01  4.199e+01   2.338   0.0294 *
cyl         -1.656e+01  1.226e+01  -1.351   0.1910
disp         1.333e-03  1.634e-01   0.008   0.9936
hp          -1.936e-01  2.260e-01  -0.857   0.4014
drat        -8.913e+00  8.745e+00  -1.019   0.3197
cyl:disp     2.134e-02  1.071e-02   1.992   0.0595 .
cyl:hp       3.074e-02  1.970e-02   1.560   0.1337
cyl:drat     2.590e+00  2.601e+00   0.996   0.3307
disp:hp     -3.846e-04  3.906e-04  -0.985   0.3359
disp:drat   -3.518e-02  3.951e-02  -0.890   0.3834
hp:drat      1.210e-02  5.432e-02   0.223   0.8259
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.717 on 21 degrees of freedom
Multiple R-squared: 0.8623, Adjusted R-squared: 0.7967
F-statistic: 13.15 on 10 and 21 DF, p-value: 6.237e-07

This is covered in ?formula.

Regards, Marc Schwartz

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
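A further option (my addition, not from the thread): when the full-factorial fit already exists, update() can refit with the truncated formula without retyping the whole call:

```r
x <- lm(mpg ~ cyl * disp * hp * drat, data = mtcars)

# "." keeps the old response; the new right-hand side stops at 2-way terms.
x2 <- update(x, . ~ (cyl + disp + hp + drat)^2)
length(coef(x2))   # intercept + 4 main effects + 6 pairwise interactions
#[1] 11
```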
Re: [R] compare two different data frames
Hi,

dat1 <- read.table(text="
V1 V2
A 1
B 2
C 1
D 3
", sep = "", header = TRUE, stringsAsFactors = FALSE)

dat2 <- read.table(text="
V3 V2
AAA 1
BBB 2
CCC 3
", sep = "", header = TRUE, stringsAsFactors = FALSE)

library(plyr)
join(dat1, dat2, by = "V2", type = "full")
#  V1 V2  V3
#1  A  1 AAA
#2  B  2 BBB
#3  C  1 AAA
#4  D  3 CCC

merge(dat1, dat2)
#  V2 V1  V3
#1  1  A AAA
#2  1  C AAA
#3  2  B BBB
#4  3  D CCC

A.K.

Dear R-users, I have the following 2 files:

A
V1 V2
A 1
B 2
C 1
D 3

B
V1 V2
AAA 1
BBB 2
CCC 3

I want to get this output:

C
V1 V2 V3
A 1 AAA
B 2 BBB
C 1 AAA
D 3 CCC

I want to compare A$V2 with B$V2; if they are the same, then append B$V1 to A. How can I do this? Please help.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
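A base-R alternative (my addition, not part of the thread) that keeps table A's original row order, which merge() does not guarantee:

```r
A <- data.frame(V1 = c("A", "B", "C", "D"), V2 = c(1, 2, 1, 3))
B <- data.frame(V3 = c("AAA", "BBB", "CCC"), V2 = c(1, 2, 3))

# match() gives, for each A$V2, the index of the matching row in B,
# so the lookup preserves A's row order.
A$V3 <- B$V3[match(A$V2, B$V2)]
A
#  V1 V2  V3
#1  A  1 AAA
#2  B  2 BBB
#3  C  1 AAA
#4  D  3 CCC
```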
Re: [R] remove higher order interaction terms
On Wed, Apr 17, 2013 at 2:33 PM, Marc Schwartz marc_schwa...@me.com wrote: If you only want up to say second order interactions: summary(lm(mpg ~ (cyl + disp + hp + drat) ^ 2, data = mtcars)) This is what I was looking for. Thank you so much. This is covered in ?formula Indeed. I have tried to parse ?formula on several occasions in the past few years, but never quite grasped it fully. Thanks again, Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] On matrix calculation
Hello again, Let's say I have a matrix:

Mat <- matrix(1:12, 4, 3)

And a vector:

Vec <- 5:8

Now I want to do the following: each element of row i in 'Mat' should be divided by the i-th element of Vec. Is there any direct way of doing that? Thanks for your help. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regularized Regressions
Merhaba, Hello to you too Mehmet (Yasu ki sena). Thank you for your email and especially for sharing this package; I appreciate it. However, my feeling is that this package does not have the third component, Best Subsets (please correct me if I am wrong). It uses only a combination of Ridge and Lasso. If you happen to know any other packages that use all of them, I would greatly welcome it and appreciate it if you were so kind as to share it. I tried to search the CRAN lists but I am not sure I can find something like that. That's why I was asking the R community. Thank you again for your prompt response! Cheers, Christos

On Wed, Apr 17, 2013 at 8:16 AM, Suzen, Mehmet msu...@gmail.com wrote: Yasu, Try Elastic nets: http://cran.r-project.org/web/packages/pensim/index.html There are some other packages supporting elastic nets: just search CRAN. Cheers, Mehmet

On 17 April 2013 13:19, Christos Giannoulis cgiann...@gmail.com wrote: Hi all, I would greatly appreciate it if someone was so kind as to share with us a package or method that uses a regularized regression approach that balances regression model performance and model complexity. That said, I would be most grateful if there is an R package that combines Ridge (sum of squared coefficients), Lasso (sum of absolute coefficients) and Best Subsets (number of coefficients) as methods of regularized regression. Sincerely, Christos Giannoulis

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] On matrix calculation
Hi, Try:

sweep(Mat, 1, Vec, "/")
#          [,1] [,2]     [,3]
#[1,] 0.2000000    1 1.800000
#[2,] 0.3333333    1 1.666667
#[3,] 0.4285714    1 1.571429
#[4,] 0.5000000    1 1.500000

do.call(rbind, lapply(seq_len(nrow(Mat)), function(i) Mat[i, ]/Vec[i]))
# (same result as above)

A.K.

- Original Message - From: Christofer Bogaso bogaso.christo...@gmail.com To: r-help r-help@r-project.org Sent: Wednesday, April 17, 2013 8:39 AM Subject: [R] On matrix calculation [question quoted above; snipped]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to change the date into an interval of date?
Hi, Try:

evt_c.1 <- read.table(text="
patient_id responsed_at
1 2010-5
1 2010-7
1 2010-8
1 2010-9
2 2010-5
2 2010-6
2 2010-7
", sep = "", header = TRUE, stringsAsFactors = FALSE)

lst1 <- split(evt_c.1, evt_c.1$patient_id)
res <- do.call(rbind, lapply(lst1, function(x) {
  x1 <- as.numeric(gsub(".*\\-", "", x[, 2]))
  x$t <- c(0, cumsum(diff(x1)))
  x
}))
row.names(res) <- 1:nrow(res)
res
#  patient_id responsed_at t
#1          1       2010-5 0
#2          1       2010-7 2
#3          1       2010-8 3
#4          1       2010-9 4
#5          2       2010-5 0
#6          2       2010-6 1
#7          2       2010-7 2

# or
library(plyr)
res2 <- mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
res2
#  patient_id responsed_at t
#1          1       2010-5 0
#2          1       2010-7 2
#3          1       2010-8 3
#4          1       2010-9 4
#5          2       2010-5 0
#6          2       2010-6 1
#7          2       2010-7 2

identical(res, res2)
#[1] TRUE

A.K.

From: GUANGUAN LUO guanguan...@gmail.com To: arun smartpink...@yahoo.com Sent: Wednesday, April 17, 2013 8:32 AM Subject: Re: how to change the date into an interval of date?

Thank you. Now I've got a table like this:

dput(head(evt_c.1, 5))
structure(list(responsed_at = c("2010-05", "2010-07", "2010-08", "2010-10", "2010-11"), patient_id = c(2L, 2L, 2L, 2L, 2L), number = c(1, 2, 3, 4, 5), response_id = c(77L, 1258L, 2743L, 4499L, 6224L), session_id = c(2L, 61L, 307L, 562L, 809L), login = c(3002, 3002, 3002, 3002, 3002), clinique_basdai.fatigue = c(4, 5, 5, 6, 4),

What I want is to add a column t. For example, now my table is like this:

patient_id responsed_at
1 2010-5
1 2010-7
1 2010-8
1 2010-9
2 2010-5
2 2010-6
2 2010-7

After adding the column t:

patient_id responsed_at t
1 2010-5 0
1 2010-7 2
1 2010-8 3
1 2010-9 4
2 2010-5 0
2 2010-6 1
2 2010-7 2

On 17 April 2013 14:23, arun smartpink...@yahoo.com wrote: Hi, format() is one way.

library(zoo)
as.yearmon(dat1$responsed_at)
# [1] May 2010 Jul 2010 Aug 2010 Oct 2010 Nov 2010 Dec 2010
# [7] Jan 2011 Feb 2011 Mar 2011 Apr 2011 Jun 2011 Jul 2011
#[13] Aug 2011 Sep 2011 Oct 2011 Nov 2011 Dec 2011 Jan 2012
#[19] Mar 2012 May 2010

A.K.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] On matrix calculation
On 17-04-2013, at 14:39, Christofer Bogaso bogaso.christo...@gmail.com wrote: [question quoted above; snipped] What have you tried? Because of recycling, the column-wise storage of matrices, and the fact that length(Vec) == nrow(Mat),

Mat / Vec

will do. Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
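A small sketch (my addition) of why the plain division works, plus a caution about the case where recycling would silently give the wrong answer:

```r
Mat <- matrix(1:12, 4, 3)
Vec <- 5:8

# Matrices are stored column by column; Vec (length 4 == nrow(Mat)) is
# recycled down each column, so element [i, j] is divided by Vec[i].
a <- Mat / Vec
b <- sweep(Mat, 1, Vec, "/")   # the explicit row-wise form
all.equal(a, b)
#[1] TRUE

# Caution: to divide each *column* j by some w[j] instead, plain division
# recycles the wrong way; use sweep(Mat, 2, w, "/") or t(t(Mat) / w).
```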
Re: [R] the joy of spreadsheets (off-topic)
On Tue, Apr 16, 2013 at 1:25 PM, Sarah Goslee sarah.gos...@gmail.com wrote: Given that we occasionally run into problems with comparing Excel results to R results, and other spreadsheet-induced errors, I thought this might be of interest. http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems The punchline: If this error turns out to be an actual mistake Reinhart-Rogoff made, well, all I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel. Ouch. (Note: I know nothing about the site, the author of the article, or the study in question. I was pointed to it by someone else. But if true: highly problematic.)

Herndon, Ash and Pollin (HAP), the authors of the critique, found that in the highest debt category the Excel error in Reinhart and Rogoff (RR) was -0.3 percentage points, compared to a total error (from that plus RR's other two mistakes) of -2.3 percentage points. See Figure 1 of HAP. Thus, aside from the dubiousness of attributing the coding error in Excel to Excel itself, it was not the main source of the discrepancy. Also, even if one backs out all three errors that they found, the key conclusion that GDP growth declines with debt still holds (but to a lesser extent), as pointed out by RR in an initial responding email reported by Bloomberg News. The key takeaway here is really unrelated to Excel; rather, it is that until data and analyses are shared or made public so that the analysis can be reproduced, one cannot have any real confidence in research results.
RR http://www.nber.org/papers/w15639.pdf HAP http://www.peri.umass.edu/fileadmin/pdf/working_papers/working_papers_301-350/WP322.pdf Bloomberg News http://www.bloomberg.com/news/2013-04-16/reinhart-rogoff-paper-cited-by-ryan-faulted-for-serious-errors-.html -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with DateVisit-gives wrong year?
Thank you, I see this was due to me using %y instead of %Y. See ?strptime.

-Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: 17 April 2013 12:31 To: Pancho Mulongeni Cc: R help Subject: Re: [R] Problem with DateVisit-gives wrong year?

Hi,

test.date <- rep(c("14/05/2012", "01/10/2012", "28/09/2012"), each = 6)
as.Date(test.date, "%d/%m/%Y")
# [1] "2012-05-14" "2012-05-14" "2012-05-14" "2012-05-14" "2012-05-14"
# [6] "2012-05-14" "2012-10-01" "2012-10-01" "2012-10-01" "2012-10-01"
#[11] "2012-10-01" "2012-10-01" "2012-09-28" "2012-09-28" "2012-09-28"
#[16] "2012-09-28" "2012-09-28" "2012-09-28"

test.date1 <- factor(test.date)
as.Date(test.date1, "%d/%m/%Y")
# (same output: the conversion also works on a factor)

A.K.

- Original Message - From: Pancho Mulongeni p.mulong...@namibia.pharmaccess.org To: r-help@r-project.org Sent: Wednesday, April 17, 2013 7:18 AM Subject: [R] Problem with DateVisit-gives wrong year?

Hi, I have the following factor of dates that I want to convert to Date class so I can extract the month:

test.date
 [1] 14/05/2012 14/05/2012 14/05/2012 14/05/2012 14/05/2012 14/05/2012
 [7] 14/05/2012 14/05/2012 14/05/2012 14/05/2012
201 Levels: 01/10/2012 01/11/2012 01/12/2012 02/07/2012 ... 28/09/2012

I use the code below:

ntest.date <- as.Date(test.date, '%d/%m/%y')

but the output has the wrong year:

ntest.date
 [1] "2020-05-14" "2020-05-14" "2020-05-14" "2020-05-14" "2020-05-14"
 [6] "2020-05-14" "2020-05-14" "2020-05-14" "2020-05-14" "2020-05-14"

What am I doing wrong?
I dare not say the word 'bug'. Thanks, Pancho Mulongeni Research Assistant PharmAccess Foundation 1 Fouché Street Windhoek West Windhoek Namibia Tel: +264 61 419 000 Fax: +264 61 419 001/2 Mob: +264 81 4456 286 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
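To spell out the fix (my addition): "%y" consumes only two digits of the year, so "2012" is read as "20" (i.e. 2020) and the trailing "12" is silently ignored; "%Y" reads the full four-digit year.

```r
as.Date("14/05/2012", "%d/%m/%y")   # "%y" grabs "20" -> year 2020
#[1] "2020-05-14"

as.Date("14/05/2012", "%d/%m/%Y")   # "%Y" reads all four digits
#[1] "2012-05-14"

# Extracting the month afterwards:
format(as.Date("14/05/2012", "%d/%m/%Y"), "%m")
#[1] "05"
```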
Re: [R] how to change the date into an interval of date?
Hi, I hope this is what you are looking for:

library(plyr)
mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id, gsub("-.*", "", responsed_at), FUN = function(x) c(0, cumsum(diff(x)))))
#   patient_id responsed_at t
#1           1       2010-5 0
#2           1       2010-7 2
#3           1       2010-8 3
#4           1       2010-9 4
#5           1      2010-12 7
#6           1       2011-1 0
#7           1       2011-2 1
#8           2       2010-5 0
#9           2       2010-6 1
#10          2       2010-7 2
#11          3       2010-1 0
#12          3       2010-2 1
#13          3       2010-4 3
#14          3       2010-5 4
#15          4      2011-01 0
#16          4      2011-03 2
#17          5      2012-04 0
#18          5      2012-06 2

A.K.

From: GUANGUAN LUO guanguan...@gmail.com To: arun smartpink...@yahoo.com Sent: Wednesday, April 17, 2013 9:21 AM Subject: Re: how to change the date into an interval of date?

evt_c.1 <- read.table(text="
patient_id responsed_at
1 2010-5
1 2010-7
1 2010-8
1 2010-9
1 2010-12
1 2011-1
1 2011-2
2 2010-5
2 2010-6
2 2010-7
3 2010-1
3 2010-2
3 2010-4
3 2010-5
4 2011-01
4 2011-03
5 2012-04
5 2012-06
", sep = "", header = TRUE, stringsAsFactors = FALSE)

mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
#6           1       2011-1 -4
#7           1       2011-2 -3
# (the month-only difference goes negative across the year boundary)

This is my problem. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Package VIF
Hi, Could you perhaps help me. I would like to use the package VIF but cannot get results. I attach the .csv file and my R code. What do I have to do? Any help is greatly appreciated.

library(VIF)
coal <- read.csv("e:/freekvif/cqa1.csv", header = TRUE)
y <- as.numeric(coal$AI)
x <- as.matrix(cbind(coal$Gyp, coal$Pyrite, coal$Sid, coal$Calcite, coal$Dol, coal$Apatite, coal$Kaol, coal$Quartz, coal$Mica, coal$Micro, coal$Rutile))
myd <- list(y = y, x = x)
vif.sel <- vif(myd$y, myd$x, subsize = 11, trace = FALSE)
vif.sel$select

Response from R: logical(0)

Thank you so much, Jacob __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regularized Regressions
Perhaps I am wrong, but I think there are only a few packages supporting Elastic Net, and none of them also performs Best Subsets. On Wed, Apr 17, 2013 at 8:46 AM, Christos Giannoulis cgiann...@gmail.com wrote: [earlier messages in this thread quoted in full; snipped]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
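For future readers (my addition; the thread itself does not settle on a package): glmnet fits ridge, lasso, and elastic-net penalties through a single alpha parameter, although best-subsets selection is still not included.

```r
library(glmnet)  # assumed installed; not mentioned in the original thread
set.seed(1)
x <- matrix(rnorm(100 * 10), 100, 10)
y <- drop(x %*% c(3, 2, rep(0, 8))) + rnorm(100)

# alpha = 0 is ridge, alpha = 1 is lasso, values in between are elastic net.
fit <- cv.glmnet(x, y, alpha = 0.5)
coef(fit, s = "lambda.min")   # coefficients at the CV-chosen penalty
```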
Re: [R] odfWeave: Some questions about potential formatting options
Paul, #1: I've never tried it, but you might be able to escape the required tags in your text (e.g., in HTML you could write out the <b> tag in your text). #3: Which output? Is this in text? #2: It may be possible, and maybe easy, to implement. So if you want to dig into it, have at it. For me, I'm completely buried for the foreseeable future and won't be able to pay much attention to it. To be honest, odfWeave has been fairly neglected by me, and lately I've had thoughts of orphaning the package :-/ Thanks, Max

On Tue, Apr 16, 2013 at 1:15 PM, Paul Miller pjmiller...@yahoo.com wrote: Hi Milan and Max, Thanks to each of you for your reply to my post. Thus far, I've managed to find answers to some of the questions I asked initially. I am now able to control the justification of the leftmost column in my tables, as well as to add borders to the top and bottom. I also downloaded Milan's revised version of odfWeave at the link below, and found that it does a nice job of controlling column widths. http://nalimilan.perso.neuf.fr/transfert/odfWeave.tar.gz There are some other things I'm still struggling with though. 1. Is it possible to get odfTableCaption and odfFigureCaption to make the titles they produce bold? I understand it might be possible to accomplish this by changing something in the styles but am not sure what. If someone can give me a hint, I can likely do the rest. 2. Is there any way to get odfFigureCaption to put titles at the top of the figure instead of the bottom? I've noticed that odfTableCaption is able to do this but apparently not odfFigureCaption. 3. Is it possible to add special characters to the output? Below is a sample Kaplan-Meier analysis. There's a footnote in there that reads Note: X2(1) = xx.xx, p = .. Is there any way to make the X a lowercase chi and to superscript the 2? I did quite a bit of digging on this topic. It sounds like it might be difficult, especially if one is using Windows as I am.
Thanks, Paul

## Get data

## Load packages
require(survival)
require(MASS)

## Sample analysis
attach(gehan)
gehan.surv <- survfit(Surv(time, cens) ~ treat, data = gehan, conf.type = "log-log")
print(gehan.surv)
survTable <- summary(gehan.surv)$table
survTable <- data.frame(Treatment = rownames(survTable), survTable, row.names = NULL)
survTable <- subset(survTable, select = -c(records, n.max))

## odfWeave

## Load odfWeave
require(odfWeave)

## Modify StyleDefs
currentDefs <- getStyleDefs()
currentDefs$firstColumn$type <- "Table Column"
currentDefs$firstColumn$columnWidth <- "5 cm"
currentDefs$secondColumn$type <- "Table Column"
currentDefs$secondColumn$columnWidth <- "3 cm"
currentDefs$ArialCenteredBold$fontSize <- "10pt"
currentDefs$ArialNormal$fontSize <- "10pt"
currentDefs$ArialCentered$fontSize <- "10pt"
currentDefs$ArialHighlight$fontSize <- "10pt"
currentDefs$ArialLeftBold <- currentDefs$ArialCenteredBold
currentDefs$ArialLeftBold$textAlign <- "left"
currentDefs$cgroupBorder <- currentDefs$lowerBorder
currentDefs$cgroupBorder$topBorder <- "0.0007in solid #00"
setStyleDefs(currentDefs)

## Modify ImageDefs
imageDefs <- getImageDefs()
imageDefs$dispWidth <- 5.5
imageDefs$dispHeight <- 5.5
setImageDefs(imageDefs)

## Modify Styles
currentStyles <- getStyles()
currentStyles$figureFrame <- "frameWithBorders"
setStyles(currentStyles)

## Set odt table styles
tableStyles <- tableStyles(survTable, useRowNames = FALSE, header = "")
tableStyles$headerCell[1,] <- "cgroupBorder"
tableStyles$header[,1] <- "ArialLeftBold"
tableStyles$text[,1] <- "ArialNormal"
tableStyles$cell[2,] <- "lowerBorder"

## Weave odt source file
fp <- "N:/Studies/HCRPC1211/Report/odfWeaveTest/"
inFile <- paste(fp, "testWeaveIn.odt", sep = "")
outFile <- paste(fp, "testWeaveOut.odt", sep = "")
odfWeave(inFile, outFile)

## Contents of .odt source file

Here is a sample Kaplan-Meier table.

<<testKMTable, echo=FALSE, results=xml>>=
odfTableCaption("A Sample Kaplan-Meier Analysis Table")
odfTable(survTable, useRowNames = FALSE, digits = 3,
         colnames = c("Treatment", "Number", "Events", "Median", "95% LCL", "95% UCL"),
         colStyles = c("firstColumn", "secondColumn", "secondColumn", "secondColumn", "secondColumn", "secondColumn"),
         styles = tableStyles)
odfCat("Note: X2(1) = xx.xx, p = .")
@

Here is a sample Kaplan-Meier graph.

<<testKMFig, echo=FALSE, fig=TRUE>>=
odfFigureCaption("A Sample Kaplan-Meier Analysis Graph", label = "Figure")
plot(gehan.surv, xlab = "Time", ylab = "Survivorship")
@

-- Max

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to change the date into an interval of date?
Hi, Try this:

library(plyr)
library(mondate)
mutate(evt_c.1, t = ave(round(as.numeric(mondate(paste(evt_c.1[,2], "-01", sep = "")))), patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
#   patient_id responsed_at  t
#1           1       2010-5  0
#2           1       2010-7  2
#3           1       2010-8  3
#4           1       2010-9  4
#5           1      2010-12  7
#6           1       2011-1  8
#7           1       2011-2  9
#8           2       2010-5  0
#9           2       2010-6  1
#10          2       2010-7  2
#11          3       2010-1  0
#12          3       2010-2  1
#13          3       2010-4  3
#14          3       2010-5  4
#15          4      2011-01  0
#16          4      2011-03  2
#17          5      2012-04  0
#18          5      2012-06  2

If the dates change:

evt_c.1$responsed_at[6:7] <- c("2011-05", "2011-07")
mutate(evt_c.1, t = ave(round(as.numeric(mondate(paste(evt_c.1[,2], "-01", sep = "")))), patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
#   patient_id responsed_at  t
#1           1       2010-5  0
#2           1       2010-7  2
#3           1       2010-8  3
#4           1       2010-9  4
#5           1      2010-12  7
#6           1      2011-05 12
#7           1      2011-07 14
#8           2       2010-5  0
#9           2       2010-6  1
#10          2       2010-7  2
#11          3       2010-1  0
#12          3       2010-2  1
#13          3       2010-4  3
#14          3       2010-5  4
#15          4      2011-01  0
#16          4      2011-03  2
#17          5      2012-04  0
#18          5      2012-06  2

A.K.

From: GUANGUAN LUO guanguan...@gmail.com To: arun smartpink...@yahoo.com Sent: Wednesday, April 17, 2013 9:25 AM Subject: Re: how to change the date into an interval of date?

[GG shows the desired result, with the counter continuing across the year boundary:]
#6           1       2011-1  8
#7           1       2011-2  9

This is the order I want. You are so kind-hearted. GG

2013/4/17 arun smartpink...@yahoo.com: Alright, sorry, I misunderstood. So, what do you want your result to be at 2011-1? Is it 0?

From: GUANGUAN LUO guanguan...@gmail.com Sent: Wednesday, April 17, 2013 9:21 AM

evt_c.1 <- read.table(text="
patient_id responsed_at
1 2010-5
1 2010-7
1 2010-8
1 2010-9
1 2010-12
1 2011-1
1 2011-2
2 2010-5
2 2010-6
2 2010-7
3 2010-1
3 2010-2
3 2010-4
3 2010-5
4 2011-01
4 2011-03
5 2012-04
5 2012-06
", sep = "", header = TRUE, stringsAsFactors = FALSE)

mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
#6           1       2011-1 -4
#7           1       2011-2 -3
# (the month-only counter goes negative across the year boundary)

This is my problem.
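For readers without the mondate package, a base-R sketch (my addition, assuming the same "YYYY-M" strings) of the months-elapsed computation that also handles year boundaries:

```r
responsed_at <- c("2010-5", "2010-7", "2010-12", "2011-1")

# Append a day so the strings parse as dates, then count calendar months
# elapsed since the first visit: 12 * year difference + month difference.
d  <- as.Date(paste0(responsed_at, "-01"), "%Y-%m-%d")
yr <- as.numeric(format(d, "%Y"))
mo <- as.numeric(format(d, "%m"))
t  <- (yr - yr[1]) * 12 + (mo - mo[1])
t
#[1] 0 2 7 8
```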
[R] Bug in VGAM z value and coefficient ?
Dear all, When I multiply the y of a regression by 10, I would expect the coefficient to be multiplied by 10 and the z value to stay constant. Here is some reproducible code to support the case.

Ex 1

library(mvtnorm)
library(VGAM)
set.seed(1)
x = rmvnorm(1000, sigma = matrix(c(1, 0.75, 0.75, 1), 2, 2))
summary(vglm(y ~ x, family = studentt2, data = data.frame(y = x[,1], x = x[,2])))
summary(vglm(y ~ x, family = studentt2, data = data.frame(y = x[,1]*10, x = x[,2])))
summary(vglm(y ~ x, family = cauchy1, data = data.frame(y = x[,1], x = x[,2])))
summary(vglm(y ~ x, family = cauchy1, data = data.frame(y = x[,1]*10, x = x[,2])))
summary(vglm(y ~ x, family = hypersecant, data = data.frame(y = x[,1], x = x[,2])))
summary(vglm(y ~ x, family = hypersecant, data = data.frame(y = x[,1]*10, x = x[,2])))

Ex 2

library(VGAM)
tdata <- data.frame(x2 = runif(nn <- 1000))
tdata <- transform(tdata, y1 = rt(nn, df = exp(exp(0.5 - x2))), y2 = rt(nn, df = exp(exp(0.5 - x2))))
fit1 <- vglm(y1 ~ x2, studentt, tdata, trace = TRUE)
tdata$y1 = tdata$y1 * 100
fit2 <- vglm(y1 ~ x2, studentt, tdata, trace = TRUE)
coef(fit1, matrix = TRUE)
coef(fit2, matrix = TRUE)

I also feel that the VGAM package (the vglm function) often doesn't converge and just stops. Do you know a reliable package with a lot of available distributions? Thanks, Olivier

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
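As a sanity check on the expectation stated above (my addition, using ordinary least squares rather than vglm): scaling y by 10 scales the estimates and their standard errors by 10, leaving the t/z statistics unchanged.

```r
set.seed(1)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)

f1 <- summary(lm(y ~ x))
f2 <- summary(lm(I(10 * y) ~ x))

# Estimates and standard errors both scale by 10, so the t values match.
all.equal(coef(f2)[, "Estimate"], 10 * coef(f1)[, "Estimate"])
#[1] TRUE
all.equal(coef(f2)[, "t value"], coef(f1)[, "t value"])
#[1] TRUE
```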
[R] Error: could not find function invlogit and bayesglm
I have installed the arm package and its dependencies (e.g. Matrix, etc.), but cannot use the functions invlogit and bayesglm because R gives me the error message 'Error: could not find function "invlogit"' or 'Error: could not find function "bayesglm"'. What could be the problem?

Regards
Carrington
Re: [R] Creating a vector with repeating dates
Dear Sir,

Thanks a lot for your valuable suggestions and help.

Regards
Katherine

--- On Wed, 17/4/13, Jim Lemon j...@bitwrit.com.au wrote:

From: Jim Lemon j...@bitwrit.com.au
Subject: Re: [R] Creating a vector with repeating dates
To: Katherine Gobin katherine_go...@yahoo.com
Cc: r-help@r-project.org
Date: Wednesday, 17 April, 2013, 10:35 AM

On 04/17/2013 07:11 PM, Katherine Gobin wrote:

Dear R forum

I have a data.frame

df = data.frame(dates = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"),
                values = c(47, 38, 56, 92))

I need to create a vector by repeating the dates as "Current_date", "4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013", "Current_date", "4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013", "Current_date", "4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013", i.e. I need to create a new vector as given below which I need to use for some other purpose.

Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013

Is it possible to construct such a column?

Hi Katherine,
How about:

rep(c("Current date", paste(4, 15:12, 2013, sep = "/")), 3)

Jim
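Jim's one-liner hard-codes the dates; a slightly more general sketch builds the same vector from the data frame itself (column names as in the question):

```r
df <- data.frame(dates = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"),
                 values = c(47, 38, 56, 92))
# prepend the placeholder, then repeat the whole block three times
new_dates <- rep(c("Current_date", as.character(df$dates)), times = 3)
new_dates  # 15 elements, with "Current_date" at positions 1, 6 and 11
```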
[R] Anova unbalanced
Hello everybody,

I have got a data set with about 400 companies. Each company has a score between 0 and 100 for its environmental conduct. These companies belong to about 15 different countries. I have e.g. 70 companies from the UK and 5 from Luxembourg, so the data set is pretty unbalanced, and I want to do an ANOVA, something like aov(enviromentscore ~ country). But the aov function is just for a balanced design. So I wonder if I can use

fit <- lm(enviromentscore ~ country)
anova(fit)

instead? Would this be okay, or can it also only be used with balanced data?

Thanking you in anticipation, best regards
Claudia
[R] Pathological case for the function 'brunch' of CAPER?
Dear R-enthusiasts,

While using the regression function 'brunch' of CAPER (with R v2.15.4) in a simple case (binary variable Yes/No vs. a continuous variable), I ended up with an unexplained error:

Error in if (any(stRes > robust)) { : missing value where TRUE/FALSE needed

I simplified my code so that you can run it; just copy everything into a directory and run source("analysis.R").

Code: http://iktp.tu-dresden.de/~prudent/Divers/R/analysis.R
Tree: http://iktp.tu-dresden.de/~prudent/Divers/R/vertebrates.tree
Data: http://iktp.tu-dresden.de/~prudent/Divers/R/data.txt

The source of the error is the pruning (particularly for these tips: cavPor3, myoLuc1), but after searching around I still have no clue what is happening. Any hint is welcome!

Thanks in advance,
Xavier
Re: [R] Error: could not find function invlogit and bayesglm
Have you loaded it?

library(arm)

John Kane
Kingston ON Canada

-----Original Message-----
From: masan...@uniswa.sz
Sent: Wed, 17 Apr 2013 10:08:39 +0200
To: r-help@r-project.org
Subject: [R] Error: could not find function invlogit and bayesglm

I have installed the arm package and its dependencies (e.g. Matrix, etc.), but cannot use the functions invlogit and bayesglm because it gives me the error message Error: could not find function "invlogit". What could be the problem?

Regards
Carrington
Re: [R] Error: could not find function invlogit and bayesglm
Hi Carrington, You also need the boot package (see http://stat.ethz.ch/R-manual/R-patched/library/boot/html/inv.logit.html ) As for the other function, please load the arm package, e.g., require(arm) require(boot) and then you will be able to use the functions mentioned below. HTH, Jorge.- On Wed, Apr 17, 2013 at 6:08 PM, S'dumo Masango masan...@uniswa.sz wrote: I have installed the arm package and its dependents (e.g MATRIX, etc), but cannot use the functions invlogit and bayesglm because it gives me the error message Error: could not find function invlogit or Error: could not find function invlogit. What could be the problem. Regards Carrington [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: could not find function invlogit and bayesglm
On 17-04-2013, at 10:08, S'dumo Masango masan...@uniswa.sz wrote: I have installed the arm package and its dependents (e.g MATRIX, etc), but cannot use the functions invlogit and bayesglm because it gives me the error message Error: could not find function invlogit or Error: could not find function invlogit. What could be the problem. Have you done library(arm) etc? Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
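To summarise the thread: installing a package only puts it on disk; its functions become visible once the package is attached in each new session. A minimal sketch (assuming arm and its dependencies are installed):

```r
library(arm)   # attaches arm; its dependencies are loaded automatically
invlogit(0)    # the inverse-logit at zero, i.e. 1/(1 + exp(0)) = 0.5
# bayesglm() is now visible too, e.g. bayesglm(y ~ x, family = binomial)
```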
Re: [R] Overlay two stat_ecdf() plots
Hi

Your files have only one column, so the melted data are virtually the same. When I read them as test1 and test2 I can do

plot(ecdf(test1$Down))
plot(ecdf(test2$Up), add = TRUE, col = 2)

Or, using the previously unstated ggplot2 package,

test <- rbind(melt(test1), melt(test2))
p <- ggplot(test, aes(x = value, colour = variable))
p + stat_ecdf()

gives me 2 curves. What is your problem?

Petr

BTW, please preferably use dput(your.data) for providing data for us.

From: Robin Mjelle [mailto:robinmje...@gmail.com]
Sent: Tuesday, April 16, 2013 11:09 PM
To: PIKAL Petr
Subject: Re: [R] Overlay two stat_ecdf() plots

Dear Petr,

I have attached the two tables that I want to plot using stat_ecdf(). To plot one of the tables I use:

Down <- read.table("FC_For_top100Down_RegulatedMiRNATargetsClean.csv", sep = ",", header = TRUE)
Down.m <- melt(Down, variable.name = "DownFC")
ggplot(Down.m, aes(value)) + stat_ecdf()

This works fine, but how do I plot both files in one plot?

On Tue, Apr 16, 2013 at 9:45 AM, PIKAL Petr petr.pi...@precheza.cz wrote:

Hi

Do you mean ecdf? If yes, just use the add option in plot.

plot(ecdf(rnorm(100, 1, 2)))
plot(ecdf(rnorm(100, 2, 2)), add = TRUE, col = 2)

If not, please specify where ecdf_stat or stat_ecdf (which, as you indicate, are the same function) comes from.

Regards
Petr

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Robin Mjelle
Sent: Monday, April 15, 2013 1:10 PM
To: r-help@r-project.org
Subject: [R] Overlay two stat_ecdf() plots

I want to plot two ecdf plots in the same graph.
I have two input tables with one column each:

Targets <- read.table("/media/", sep = "", header = TRUE)
NonTargets <- read.table("/media/...", sep = "", header = TRUE)

head(Targets)
         V1
1  3.160514
2  6.701948
3  4.093844
4  1.992014
5  1.604751
6  2.076802

head(NonTargets)
          V1
1   3.895934
2   1.990506
3  -1.746919
4  -3.451477
5   5.156554
6   1.195109

Targets.m <- melt(Targets)
head(Targets.m)
  variable    value
1       V1 3.160514
2       V1 6.701948
3       V1 4.093844
4       V1 1.992014
5       V1 1.604751
6       V1 2.076802

NonTargets.m <- melt(NonTargets)
head(NonTargets.m)
  variable     value
1       V1  3.895934
2       V1  1.990506
3       V1 -1.746919
4       V1 -3.451477
5       V1  5.156554
6       V1  1.195109

How do I proceed to plot them in one graph using ecdf_stat()?
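Petr's ggplot2 approach can be made self-contained by stacking the two melted tables with an explicit grouping column, so stat_ecdf() draws one curve per set (a sketch with simulated stand-ins for the two files):

```r
library(ggplot2)
library(reshape2)

# hypothetical stand-ins for the Targets / NonTargets files on disk
Targets    <- data.frame(V1 = rnorm(100, mean = 2))
NonTargets <- data.frame(V1 = rnorm(100, mean = 0))

# add a grouping column, then map it to colour so each set gets a curve
both <- rbind(cbind(melt(Targets),    set = "Targets"),
              cbind(melt(NonTargets), set = "NonTargets"))
ggplot(both, aes(x = value, colour = set)) + stat_ecdf()
```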
[R] Mancova with R
Dear all,

I'm trying to compare two sets of variables; the first set is composed exclusively of numerical variables, and the second groups factors and numerical variables. I can't use a MANOVA because of this inclusion of numerical variables in the second set. The solution should be to perform a MANCOVA, but I didn't find any package that allows this type of test. I've already looked in this forum and on the net to find answers, but the only thing I've found is the following:

lm(as.matrix(Y) ~ x + z)

where x and z can be numerical and factors. The problem with that is it actually only performs a succession of lm (or glm) fits, one for each numerical variable contained in the Y matrix. It is not a true MANCOVA that does a significance test (most often a Wald test) for the overall comparison of the two sets. Such a test is available in SPSS and SAS, but I really want to stay in R! Does anyone have any idea?

Thanks in advance for your help!

Rémi Lesmerises, biol. M.Sc.,
Candidat Ph.D. en Biologie
Université du Québec à Rimouski
remilesmeri...@yahoo.ca
Re: [R] plot 2 y axis
Excel is hardly the epitome of good graphing. There are a couple of ways to do what you want, and Jim has shown you one, but... what about using two panels to present the data, as in

opar <- par(mfrow = c(2, 1))
plot(dat1$Date, dat1$Weight, col = "red", xlab = "", ylab = "Weight")
plot(dat1$Date, dat1$Height, col = "blue", xlab = "Date", ylab = "Height")
par(opar)

John Kane
Kingston ON Canada

-----Original Message-----
From: ye...@lbl.gov
Sent: Tue, 16 Apr 2013 15:35:29 -0700
To: r-help@r-project.org
Subject: [R] plot 2 y axis

Hi,

I want to plot two variables on the same graph but with two y axes, just like what you can do in Excel. I searched online and it seems you cannot achieve that in ggplot. So is there any way I can do it nicely with the basic plot functions? Suppose my data looks like this:

Weight  Height  Date
0.1     0.3     1
0.2     0.4     2
0.3     0.8     3
0.6     1       4

I want to have Date as the x axis, Weight as the left y axis and Height as the right y axis. Thanks.
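For completeness, the two-y-axis layout the poster asked about can also be done in base graphics with par(new = TRUE) (a sketch; dat1 mirrors the structure of the posted data):

```r
dat1 <- data.frame(Date = 1:4,
                   Weight = c(0.1, 0.2, 0.3, 0.6),
                   Height = c(0.3, 0.4, 0.8, 1.0))
par(mar = c(5, 4, 4, 4))                       # leave room for the right axis
plot(dat1$Date, dat1$Weight, type = "b", col = "red",
     xlab = "Date", ylab = "Weight")
par(new = TRUE)                                # overlay a second plot
plot(dat1$Date, dat1$Height, type = "b", col = "blue",
     axes = FALSE, xlab = "", ylab = "")
axis(4)                                        # right-hand axis for Height
mtext("Height", side = 4, line = 2.5)
```

Note the two series then have independent scales, which is exactly why two-axis plots are often discouraged.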
[R] spatial graph and its boundary
Dear r-helpers,

I have a graph created using the library 'spatgraphs'.

library(spatstat)
library(spatgraphs)
xy <- rbind(c(28.39, -16.27), c(30.62, -20.13), c(32.25, -28.7),
            c(22.43, -27.22), c(27.5, -21.17), c(31.22, -24.52),
            c(17.93, -26.92), c(18.72, -17.95), c(24.15, -17.82),
            c(29.23, -22.85))
ow <- owin(xrange = range(xy[,1]), yrange = range(xy[,2]))
pp <- ppp(x = xy[,1], y = xy[,2], n = nrow(xy), window = ow)
gg <- spatgraph(pp = pp, type = "gabriel")
plot(gg, asp = 1, pp = pp, add = FALSE)
lines(xy[c(1, 2, 10, 6, 3, 4, 7, 8, 9, 1), ], col = 2, lwd = 2)

and now I need to automatically extract the polygon corresponding to its outer boundary, highlighted here in red. Are you aware of such a function in R? I looked for it hard but with no success. Thank you in advance for any hint.

Ondřej

--
Ondřej Mikula
Institute of Animal Physiology and Genetics
Academy of Sciences of the Czech Republic
Veveri 97, 60200 Brno, Czech Republic
Institute of Vertebrate Biology
Academy of Sciences of the Czech Republic
Studenec 122, 67502 Konesin, Czech Republic
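If the outer boundary happens to be convex, base R's chull() recovers it directly; the highlighted boundary of a Gabriel graph can be non-convex, in which case a concave-hull approach would be needed instead. A sketch of the convex case, reusing the xy coordinates from the question:

```r
xy <- rbind(c(28.39, -16.27), c(30.62, -20.13), c(32.25, -28.7),
            c(22.43, -27.22), c(27.5, -21.17), c(31.22, -24.52),
            c(17.93, -26.92), c(18.72, -17.95), c(24.15, -17.82),
            c(29.23, -22.85))
hull <- chull(xy)                    # row indices of the convex hull, in order
boundary <- xy[c(hull, hull[1]), ]   # repeat the first point to close the polygon
plot(xy, asp = 1)
lines(boundary, col = 2, lwd = 2)
```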
[R] use of names() within lapply()
Dear all,

List g has 2 elements

names(g)
[1] "2009-10-07" "2012-02-29"

and the list plot

lapply(g, plot, main = names(g))

results in equal plot titles with both list names, whereas the distinct titles names(g[1]) and names(g[2]) are sought. Clearly, lapply is passing 'g' instead of consecutively passing g[1] and then g[2] when processing the additional 'main' argument to plot. help(lapply) is mute as to how to pass parameters element-wise.

Any suggestion would be appreciated.

Kind regards,
Ivan
Re: [R] Mancova with R
Dear Remi, Take a look at the Anova() function in the car package. In your case, you could use Anova(lm(as.matrix(Y) ~ x + z)) or, for more detail, summary(Anova(lm(as.matrix(Y) ~ x + z))) I hope this helps, John John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ On Wed, 17 Apr 2013 07:47:27 -0700 (PDT) Rémi Lesmerises remilesmeri...@yahoo.ca wrote: Dear all, I'm trying to compare two sets of variables, the first set is composed exclusively of numerical variables and the second regroups factors and numerical variables. I can't use a Manova because of this inclusion of numerical variables in the second set. The solution should be to perform a Mancova, but I didn't find any package that allow this type of test. I've already looked in this forum and on the net to find answers, but the only thing I've found is the following: lm(as.matrix(Y) ~ x+z) x and z could be numerical and factors. The problem with that is it actually only perform a succession of lm (or glm), one for each numerical variable contained in the Y matrix. It is not a true MANCOVA that do a significance test (most often a Wald test) for the overall two sets comparison. Such a test is available in SPSS and SAS, but I really want to stay in R! Someone have any idea? Thanks in advance for your help! Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski remilesmeri...@yahoo.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
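A self-contained illustration of John's suggestion (hypothetical data; requires the car package), fitting a multivariate linear model with both a numeric covariate and a factor on the right-hand side:

```r
library(car)
set.seed(1)
dat <- data.frame(x = rnorm(60),                         # numeric covariate
                  z = factor(rep(c("a", "b", "c"), 20))) # factor
Y <- cbind(y1 = rnorm(60), y2 = rnorm(60))               # two responses
fit <- lm(Y ~ x + z, data = dat)
Anova(fit)           # multivariate (Pillai) test for each term
summary(Anova(fit))  # per-term detail with all four test statistics
```

The multivariate tests here are the overall significance tests the poster is missing, as opposed to the per-response ANOVA tables.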
Re: [R] use of names() within lapply()
On 17/04/2013 11:04 AM, Ivan Alves wrote: Dear all, List g has 2 elements names(g) [1] 2009-10-07 2012-02-29 and the list plot lapply(g, plot, main=names(g)) results in equal plot titles with both list names, whereas distinct titles names(g[1]) and names(g[2]) are sought. Clearly, lapply is passing 'g' in stead of consecutively passing g[1] and then g[2] to process the additional 'main' argument to plot. help(lapply) is mute as to what to element-wise pass parameters. Any suggestion would be appreciated. I think you want mapply rather than lapply, or you could do lapply on a vector of indices. For example, mapply(plot, g, main=names) or lapply(1:2, function(i) plot(g[[i]], main=names(g)[i])) Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
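A minimal sketch of the element-wise pattern (hypothetical data). Note that `main` must be given the evaluated character vector `names(g)`; passing the function `names` itself produces an "object of type 'builtin' is not subsettable" error:

```r
g <- list(`2009-10-07` = rnorm(10), `2012-02-29` = rnorm(10))

# Map()/mapply() walk 'g' and 'names(g)' in parallel, one element at a time
invisible(Map(function(x, nm) plot(x, main = nm), g, names(g)))

# the index-based lapply() variant does the same thing
invisible(lapply(seq_along(g), function(i) plot(g[[i]], main = names(g)[i])))
```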
Re: [R] use of names() within lapply()
Hi, Try: set.seed(25) g- list(sample(1:40,20,replace=TRUE),sample(40:60,20,replace=TRUE)) names(g)- c(2009-10-07,2012-02-29) pdf(Trialnew.pdf) lapply(seq_along(g),function(i) plot(g[[i]],main=names(g)[i])) dev.off() A.K. - Original Message - From: Ivan Alves pap: u...@me.com To: R-help@r-project.org R-help@r-project.org Cc: Sent: Wednesday, April 17, 2013 11:04 AM Subject: [R] use of names() within lapply() Dear all, List g has 2 elements names(g) [1] 2009-10-07 2012-02-29 and the list plot lapply(g, plot, main=names(g)) results in equal plot titles with both list names, whereas distinct titles names(g[1]) and names(g[2]) are sought. Clearly, lapply is passing 'g' in stead of consecutively passing g[1] and then g[2] to process the additional 'main' argument to plot. help(lapply) is mute as to what to element-wise pass parameters. Any suggestion would be appreciated. Kind regards, Ivan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] proper way to handle obsolete function names
Dear R community, what would be the proper R way to handle obsolete function names? I have created several packages with functions and sometimes would like to change the name of a function but would like to create a mechanism that other scripts of functions using the old name still work. Cheers Jannis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
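A common pattern, and the one base R itself uses, is to keep the old name as a thin wrapper that calls .Deprecated() and forwards its arguments to the new function; a release or two later the wrapper can be switched to .Defunct(). A sketch with hypothetical function names:

```r
newName <- function(x) {
  x * 2                    # the real implementation lives under the new name
}

oldName <- function(...) {
  .Deprecated("newName")   # warns: 'oldName' is deprecated, use 'newName'
  newName(...)             # existing calls keep working
}

oldName(21)  # 42, with a deprecation warning
```

In a package, documenting the old name in a `pkg-deprecated` help topic keeps `R CMD check` happy.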
Re: [R] how to change the date into an interval of date?
Hi. No problem. CC'ing to R-help.
A.K.

From: GUANGUAN LUO guanguan...@gmail.com
Sent: Wednesday, April 17, 2013 10:25 AM
Subject: Re: how to change the date into an interval of date?

Thank you so much. That is exactly the thing I want.
GG

Hi,
Try this:

library(mondate)
mutate(evt_c.1, t = ave(round(as.numeric(mondate(paste(evt_c.1[,2], "-01", sep = "")))),
                        patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
#   patient_id responsed_at  t
#1           1       2010-5  0
#2           1       2010-7  2
#3           1       2010-8  3
#4           1       2010-9  4
#5           1      2010-12  7
#6           1       2011-1  8
#7           1       2011-2  9
#8           2       2010-5  0
#9           2       2010-6  1
#10          2       2010-7  2
#11          3       2010-1  0
#12          3       2010-2  1
#13          3       2010-4  3
#14          3       2010-5  4
#15          4      2011-01  0
#16          4      2011-03  2
#17          5      2012-04  0
#18          5      2012-06  2

If it changes:

evt_c.1$responsed_at[6:7] <- c("2011-05", "2011-07")
mutate(evt_c.1, t = ave(round(as.numeric(mondate(paste(evt_c.1[,2], "-01", sep = "")))),
                        patient_id, FUN = function(x) c(0, cumsum(diff(x)))))
#   patient_id responsed_at  t
#1           1       2010-5  0
#2           1       2010-7  2
#3           1       2010-8  3
#4           1       2010-9  4
#5           1      2010-12  7
#6           1      2011-05 12
#7           1      2011-07 14
#8           2       2010-5  0
#9           2       2010-6  1
#10          2       2010-7  2
#11          3       2010-1  0
#12          3       2010-2  1
#13          3       2010-4  3
#14          3       2010-5  4
#15          4      2011-01  0
#16          4      2011-03  2
#17          5      2012-04  0
#18          5      2012-06  2

A.K.

From: GUANGUAN LUO guanguan...@gmail.com
Sent: Wednesday, April 17, 2013 9:25 AM
Subject: Re: how to change the date into an interval of date?

mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id,
                        FUN = function(x) c(0, cumsum(diff(x)))))
   patient_id responsed_at t
1           1       2010-5 0
2           1       2010-7 2
3           1       2010-8 3
4           1       2010-9 4
5           1      2010-12 7
6           1       2011-1 8
7           1       2011-2 9
8           2       2010-5 0
9           2       2010-6 1
10          2       2010-7 2
11          3       2010-1 0
12          3       2010-2 1
13          3       2010-4 3
14          3       2010-5 4
15          4      2011-01 0
16          4      2011-03 2
17          5      2012-04 0
18          5      2012-06 2

This is the order I want. You are so kind-hearted.
GG

Alright, sorry, I misunderstood. So, what do you want your result to be at 2011-1? Is it 0?

From: GUANGUAN LUO guanguan...@gmail.com
Sent: Wednesday, April 17, 2013 9:21 AM
Subject: Re: how to change the date into an interval of date?
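The month arithmetic can also be done without the mondate package (a sketch; evt_c.1 is recreated here with a few of the rows from the thread). Converting "YYYY-M" to a running month count makes year boundaries safe, which is exactly where the gsub-on-the-month-only version went wrong:

```r
evt_c.1 <- read.table(text = "
patient_id responsed_at
1 2010-9
1 2010-12
1 2011-1
2 2010-5
2 2010-7
", header = TRUE, stringsAsFactors = FALSE)

# convert 'YYYY-M' to months since year 0, so differences span years correctly
months_since <- function(ym) {
  parts <- do.call(rbind, strsplit(ym, "-"))
  as.numeric(parts[, 1]) * 12 + as.numeric(parts[, 2])
}

# months elapsed since each patient's first response
evt_c.1$t <- ave(months_since(evt_c.1$responsed_at), evt_c.1$patient_id,
                 FUN = function(x) x - x[1])
evt_c.1$t  # 0 3 4 0 2
```

Here `x - x[1]` is equivalent to `c(0, cumsum(diff(x)))` but a little simpler.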
evt_c.1 <- read.table(text="
patient_id responsed_at
1 2010-5
1 2010-7
1 2010-8
1 2010-9
1 2010-12
1 2011-1
1 2011-2
2 2010-5
2 2010-6
2 2010-7
3 2010-1
3 2010-2
3 2010-4
3 2010-5
4 2011-01
4 2011-03
5 2012-04
5 2012-06
", sep = "", header = TRUE, stringsAsFactors = FALSE)

mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id,
                        FUN = function(x) c(0, cumsum(diff(x)))))
   patient_id responsed_at  t
1           1       2010-5  0
2           1       2010-7  2
3           1       2010-8  3
4           1       2010-9  4
5           1      2010-12  7
6           1       2011-1 -4
7           1       2011-2 -3
8           2       2010-5  0
9           2       2010-6  1
10          2       2010-7  2
11          3       2010-1  0
12          3       2010-2  1
13          3       2010-4  3
14          3       2010-5  4
15          4      2011-01  0
16          4      2011-03  2
17          5      2012-04  0
18          5      2012-06  2

This is my problem.

If this is not your problem, please provide a dataset like the one below and explain where the problem is.

----- Original Message -----
To: GUANGUAN LUO guanguan...@gmail.com
Cc:
Sent: Wednesday, April 17, 2013 9:17 AM
Subject: Re: how to change the date into an interval of date?

Hi, I am not sure I understand your question:

evt_c.1 <- read.table(text="
patient_id responsed_at
1 2010-5
1 2010-7
1 2010-8
1 2010-9
2 2010-5
2 2010-6
2 2010-7
3 2010-1
3 2010-2
3 2010-4
3 2010-5
4 2011-01
4 2011-03
5 2012-04
5 2012-06
", sep = "", header = TRUE, stringsAsFactors = FALSE)

mutate(evt_c.1, t = ave(as.numeric(gsub(".*\\-", "", responsed_at)), patient_id, FUN = function(x)
Re: [R] Mancova with R
Dear John,

Thanks for your comments! But when I tried your suggestion, the output was the following:

Response Dist_arbre :
          Df     Sum Sq    Mean Sq F value    Pr(>F)
Poids      1 0.00010398 0.00010398  6.2910 0.0364733 *
Age        1     0.5202     0.5202  3.1476 0.1139652
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

I have the P-value but not the direction of the relationship, information that I had with lm(as.matrix(Y) ~ x+z). I could combine the results of these two tests, but it seems inelegant to me. Moreover, I didn't get an overall significance test as with a true MANCOVA. Any idea?!

Rémi Lesmerises, biol. M.Sc.,
Candidat Ph.D. en Biologie
Université du Québec à Rimouski
300, allée des Ursulines
remilesmeri...@yahoo.ca

From: John Fox j...@mcmaster.ca
To: Rémi Lesmerises remilesmeri...@yahoo.ca
Cc: r-help@r-project.org r-help@r-project.org
Sent: Wednesday, 17 April 2013, 10:54 AM
Subject: Re: [R] Mancova with R

Dear Remi,

Take a look at the Anova() function in the car package. In your case, you could use

Anova(lm(as.matrix(Y) ~ x + z))

or, for more detail,

summary(Anova(lm(as.matrix(Y) ~ x + z)))

I hope this helps,
John

John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Wed, 17 Apr 2013 07:47:27 -0700 (PDT) Rémi Lesmerises remilesmeri...@yahoo.ca wrote:

Dear all, I'm trying to compare two sets of variables, the first set is composed exclusively of numerical variables and the second regroups factors and numerical variables. I can't use a Manova because of this inclusion of numerical variables in the second set. The solution should be to perform a Mancova, but I didn't find any package that allow this type of test. I've already looked in this forum and on the net to find answers, but the only thing I've found is the following: lm(as.matrix(Y) ~ x+z) x and z could be numerical and factors.
The problem with that is it actually only perform a succession of lm (or glm), one for each numerical variable contained in the Y matrix. It is not a true MANCOVA that do a significance test (most often a Wald test) for the overall two sets comparison. Such a test is available in SPSS and SAS, but I really want to stay in R! Someone have any idea? Thanks in advance for your help!  Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski remilesmeri...@yahoo.ca    [[alternative HTML version deleted]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use of names() within lapply()
Dear Duncan and A.K.,

Many thanks for your super quick help. The modified lapply did the trick; mapply died with an error: Error in dots[[2L]][[1L]] : object of type 'builtin' is not subsettable.

Kind regards,
Ivan

On 17 Apr 2013, at 17:12, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 17/04/2013 11:04 AM, Ivan Alves wrote:

Dear all, List g has 2 elements names(g) [1] "2009-10-07" "2012-02-29" and the list plot lapply(g, plot, main=names(g)) results in equal plot titles with both list names, whereas distinct titles names(g[1]) and names(g[2]) are sought. Clearly, lapply is passing 'g' instead of consecutively passing g[1] and then g[2] to process the additional 'main' argument to plot. help(lapply) is mute as to how to pass parameters element-wise. Any suggestion would be appreciated.

I think you want mapply rather than lapply, or you could do lapply on a vector of indices. For example,

mapply(plot, g, main=names)

or

lapply(1:2, function(i) plot(g[[i]], main=names(g)[i]))

Duncan Murdoch
Re: [R] Mancova with R
Dear Remi, On Wed, 17 Apr 2013 08:23:07 -0700 (PDT) Rémi Lesmerises remilesmeri...@yahoo.ca wrote: Dear John, Thanks for your comments! But when I tried your suggestion, the output was as the following: Response Dist_arbre : Df Sum Sq Mean Sq F value Pr(F) Poids 1 0.00010398 0.00010398 6.2910 0.0364733 * Age 1 0.5202 0.5202 3.1476 0.1139652 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 I have the P-value but not the direction of the relationship, information that I had with lm(as.matrix(Y) ~ x+z). I could combine the results of these two tests, but it seems inelegant to me. Moreover I didn't have a total significance test as with a true MANCOVA. An idea?! Yes, try what I suggested. The output that you show here isn't from the the Anova() function in the car package. As well, you might find it useful to read the on-line appendix on multivariate linear models, at http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix/Appendix-Multivariate-Linear-Models.pdf, from the book with which the car package is associated. Best, John Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski 300, allée des Ursulines remilesmeri...@yahoo.ca De : John Fox j...@mcmaster.ca À : Rémi Lesmerises remilesmeri...@yahoo.ca Cc : r-help@r-project.org r-help@r-project.org Envoyé le : mercredi 17 avril 2013 10h54 Objet : Re: [R] Mancova with R Dear Remi, Take a look at the Anova() function in the car package. In your case, you could use Anova(lm(as.matrix(Y) ~ x + z)) or, for more detail, summary(Anova(lm(as.matrix(Y) ~ x + z))) I hope this helps, John John Fox Sen. William McMaster Prof. 
of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ On Wed, 17 Apr 2013 07:47:27 -0700 (PDT) Rémi Lesmerises remilesmeri...@yahoo.ca wrote: Dear all, I'm trying to compare two sets of variables, the first set is composed exclusively of numerical variables and the second regroups factors and numerical variables. I can't use a Manova because of this inclusion of numerical variables in the second set. The solution should be to perform a Mancova, but I didn't find any package that allow this type of test. I've already looked in this forum and on the net to find answers, but the only thing I've found is the following: lm(as.matrix(Y) ~ x+z) x and z could be numerical and factors. The problem with that is it actually only perform a succession of lm (or glm), one for each numerical variable contained in the Y matrix. It is not a true MANCOVA that do a significance test (most often a Wald test) for the overall two sets comparison. Such a test is available in SPSS and SAS, but I really want to stay in R! Someone have any idea? Thanks in advance for your help! Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski remilesmeri...@yahoo.ca [[alternative HTML version deleted]] John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Anova unbalanced
Dear Claudia,

Your question has been posed on many previous occasions. The (short) answer has always been the same: have a look at the Anova() function in the car package, but before doing that, get a copy of John Fox's Applied Regression Analysis and Generalized Linear Models book.

Best,
José

José Iparraguirre
Chief Economist
Age UK
T 020 303 31482
E jose.iparragui...@ageuk.org.uk
Twitter @jose.iparraguirre@ageuk
Tavis House, 1-6 Tavistock Square, London, WC1H 9NB
www.ageuk.org.uk | ageukblog.org.uk | @ageukcampaigns

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of paladini
Sent: 17 April 2013 10:47
To: r-help@r-project.org
Subject: [R] Anova unbalanced

Hello everybody, I have got a data set with about 400 companies. Each company has a score between 0 and 100 for its environmental conduct. These companies belong to about 15 different countries. I have e.g. 70 companies from the UK and 5 from Luxembourg, so the data set is pretty unbalanced and I want to do an ANOVA, something like aov(enviromentscore ~ country). But the aov function is just for a balanced design. So I wonder if I can use fit <- lm(enviromentscore ~ country); anova(fit) instead? Would this be okay or can it also only be used with balanced data? Thanking you in anticipation, best regards Claudia
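In fact lm() + anova() handles an unbalanced one-way design directly, and with a single factor there is no Type I/II/III ambiguity to worry about. A sketch with simulated data shaped like the question's (hypothetical country labels and scores):

```r
set.seed(1)
# deliberately unequal group sizes, as in the question
country <- factor(rep(c("UK", "LU", "DE"), times = c(70, 5, 25)))
enviromentscore <- rnorm(100, mean = 50, sd = 15)   # hypothetical scores
fit <- lm(enviromentscore ~ country)
anova(fit)   # one-way ANOVA table; valid with unequal group sizes
```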
Re: [R] Regularized Regressions
Hi Levi,

Thank you very much for your reply and concern. There was a package until 2011, or even 2012, entitled Generalized Path Seeking Regression in R, which used a combination of all of these methods as a way to regularize regression. However, I presently can't find it. Thus, I was wondering if someone in the R community could tell us whether there is an upgraded version of it under a different name, or another package that does the same thing.

Thank you again very much, and my apologies for populating the mail messages of the community.

Cheers,
Christos

On Wed, Apr 17, 2013 at 9:50 AM, Levi Waldron lwaldron.resea...@gmail.com wrote:

Perhaps I am wrong, but I think there are only a few packages supporting Elastic Net, and none of them also perform Best Subsets.

On Wed, Apr 17, 2013 at 8:46 AM, Christos Giannoulis cgiann...@gmail.com wrote:

Merhaba, hello to you too Mehmet,

Thank you for your email and especially for sharing this package. I appreciate it. However, my feeling is that this package does not have the third component, Best Subsets (please correct me if I am wrong). It uses only a combination of Ridge and Lasso. If you happen to know any other package that uses all of them, I would greatly appreciate it if you were so kind as to share it. I tried to search the CRAN lists, but I am not sure I can find something like that. That's why I was asking the R community.

Thank you again for your prompt response!

Cheers,
Christos

On Wed, Apr 17, 2013 at 8:16 AM, Suzen, Mehmet msu...@gmail.com wrote:

Yasu,

Try elastic nets: http://cran.r-project.org/web/packages/pensim/index.html

There are some other packages supporting elastic nets: just search CRAN.

Cheers,
Mehmet

On 17 April 2013 13:19, Christos Giannoulis cgiann...@gmail.com wrote:

Hi all,

I would greatly appreciate it if someone was so kind as to share with us a package or method that uses a regularized regression approach that balances regression model performance and model complexity.
That said, I would be most grateful if there is an R package that combines Ridge (sum of squared coefficients), Lasso (sum of absolute coefficients), and Best Subsets (number of coefficients) as methods of regularized regression.

Sincerely,
Christos Giannoulis
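For the ridge/lasso part specifically, glmnet is the package most often pointed to (my suggestion; it is not named in this thread): its alpha argument blends the two penalties, although it does not cover the best-subsets component, which lives in separate packages such as leaps.

```r
# Hedged sketch with simulated data: alpha = 0 gives ridge, alpha = 1
# gives lasso, and intermediate values give the elastic net blend.
library(glmnet)

set.seed(42)
x <- matrix(rnorm(100 * 10), nrow = 100)   # 100 obs, 10 predictors
y <- x[, 1] - 2 * x[, 2] + rnorm(100)      # only two true signals

cvfit <- cv.glmnet(x, y, alpha = 0.5)      # cross-validated elastic net
coef(cvfit, s = "lambda.min")              # coefficients at the best lambda
```

Best subsets remains a separate, combinatorial method; no single mainstream package fuses all three penalties the way the (apparently withdrawn) Generalized Path Seeking package did.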
Re: [R] odfWeave: Some questions about potential formatting options
On Tuesday 16 April 2013 at 10:15 -0700, Paul Miller wrote:

Hi Milan and Max,

Thanks to each of you for your reply to my post. Thus far, I've managed to find answers to some of the questions I asked initially. I am now able to control the justification of the leftmost column in my tables, as well as to add borders to the top and bottom. I also downloaded Milan's revised version of odfWeave at the link below and found that it does a nice job of controlling column widths.

http://nalimilan.perso.neuf.fr/transfert/odfWeave.tar.gz

There are some other things I'm still struggling with, though.

1. Is it possible to get odfTableCaption and odfFigureCaption to make the titles they produce bold? I understand it might be possible to accomplish this by changing something in the styles, but I am not sure what. If someone can give me a hint, I can likely do the rest.

Just right-click on a caption and choose "Edit paragraph style..." (in the template document). Or edit the styles called "Table" and "Illustration".

2. Is there any way to get odfFigureCaption to put titles at the top of the figure instead of the bottom? I've noticed that odfTableCaption is able to do this, but apparently not odfFigureCaption.

No idea.

3. Is it possible to add special characters to the output? Below is a sample Kaplan-Meier analysis. There's a footnote in there that reads "Note: X2(1) = xx.xx, p = ..". Is there any way to make the X a lowercase chi and to superscript the 2? I did quite a bit of digging on this topic. It sounds like it might be difficult, especially if one is using Windows as I am.

For the chi, you can copy the Unicode character χ from e.g. LibreOffice and use it in the string passed to odfCat() and friends. If that does not work on Windows, you can also use the escape code \u03C7. For the ², you can either use the Unicode character (code \u00B2) or try to insert ODF markup to put a 2 as an exponent (I did not test the second option).
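The escape-code suggestion can be checked at the R prompt without any odfWeave machinery (a minimal sketch; the xx.xx placeholders are from the thread):

```r
# Build the footnote with Unicode escapes: \u03C7 is the Greek chi and
# \u00B2 the superscript two; the resulting string can then be handed
# to odfCat() inside the odt source document.
note <- "Note: \u03C7\u00B2(1) = xx.xx, p = .xx"
cat(note, "\n")
```

Whether the characters render correctly in the final document also depends on the font used by the ODF styles, so this is worth testing in the target template.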
Regards

Thanks, Paul

## Get data

## Load packages
require(survival)
require(MASS)

## Sample analysis
attach(gehan)
gehan.surv <- survfit(Surv(time, cens) ~ treat, data = gehan, conf.type = "log-log")
print(gehan.surv)
survTable <- summary(gehan.surv)$table
survTable <- data.frame(Treatment = rownames(survTable), survTable, row.names = NULL)
survTable <- subset(survTable, select = -c(records, n.max))

## odfWeave

## Load odfWeave
require(odfWeave)

## Modify StyleDefs
currentDefs <- getStyleDefs()
currentDefs$firstColumn$type <- "Table Column"
currentDefs$firstColumn$columnWidth <- "5 cm"
currentDefs$secondColumn$type <- "Table Column"
currentDefs$secondColumn$columnWidth <- "3 cm"
currentDefs$ArialCenteredBold$fontSize <- "10pt"
currentDefs$ArialNormal$fontSize <- "10pt"
currentDefs$ArialCentered$fontSize <- "10pt"
currentDefs$ArialHighlight$fontSize <- "10pt"
currentDefs$ArialLeftBold <- currentDefs$ArialCenteredBold
currentDefs$ArialLeftBold$textAlign <- "left"
currentDefs$cgroupBorder <- currentDefs$lowerBorder
currentDefs$cgroupBorder$topBorder <- "0.0007in solid #00"
setStyleDefs(currentDefs)

## Modify ImageDefs
imageDefs <- getImageDefs()
imageDefs$dispWidth <- 5.5
imageDefs$dispHeight <- 5.5
setImageDefs(imageDefs)

## Modify Styles
currentStyles <- getStyles()
currentStyles$figureFrame <- "frameWithBorders"
setStyles(currentStyles)

## Set odt table styles
tableStyles <- tableStyles(survTable, useRowNames = FALSE, header = "")
tableStyles$headerCell[1,] <- "cgroupBorder"
tableStyles$header[,1] <- "ArialLeftBold"
tableStyles$text[,1] <- "ArialNormal"
tableStyles$cell[2,] <- "lowerBorder"

## Weave odt source file
fp <- "N:/Studies/HCRPC1211/Report/odfWeaveTest/"
inFile <- paste(fp, "testWeaveIn.odt", sep = "")
outFile <- paste(fp, "testWeaveOut.odt", sep = "")
odfWeave(inFile, outFile)

## Contents of .odt source file

Here is a sample Kaplan-Meier table.
<<testKMTable, echo=FALSE, results=xml>>=
odfTableCaption("A Sample Kaplan-Meier Analysis Table")
odfTable(survTable, useRowNames = FALSE, digits = 3,
         colnames = c("Treatment", "Number", "Events", "Median", "95% LCL", "95% UCL"),
         colStyles = c("firstColumn", "secondColumn", "secondColumn", "secondColumn", "secondColumn", "secondColumn"),
         styles = tableStyles)
odfCat("Note: X2(1) = xx.xx, p = .")
@

Here is a sample Kaplan-Meier graph.

<<testKMFig, echo=FALSE, fig=TRUE>>=
odfFigureCaption("A Sample Kaplan-Meier Analysis Graph", label = "Figure")
plot(gehan.surv, xlab = "Time", ylab = "Survivorship")
@
[R] Best way to calculate averages of Blocks in an matrix?
Folks,

I recently was given a simulated data set like the following subset:

sim_sub <- structure(list(V11 = c(0.01, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), V12 = c(0, 0, 0, 0.01, 0.03, 0, 0, 0, 0, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0.04), V13 = c(0, 0, 0, 0.01, 0, 0, 0, 0, 0, 0.01, 0, 0, 0, 0, 0.01, 0, 0, 0, 0, 0.01), V14 = c(0, 0.01, 0.01, 0.01, 0.01, 0, 0, 0, 0, 0.03, 0, 0, 0.01, 0.01, 0.04, 0.01, 0.02, 0, 0.01, 0.03), V15 = c(0, 0.01, 0, 0, 0.01, 0, 0, 0, 0.01, 0.02, 0.01, 0, 0, 0.01, 0, 0, 0, 0.01, 0.01, 0.04), V16 = c(0, 0, 0, 0.03, 0.02, 0.01, 0, 0, 0.02, 0.02, 0, 0.02, 0.02, 0, 0.01, 0.01, 0, 0, 0.03, 0.01), V17 = c(0, 0.01, 0, 0.01, 0, 0, 0, 0.01, 0.05, 0.03, 0, 0.01, 0, 0.02, 0.02, 0, 0, 0.01, 0.02, 0.04), V18 = c(0, 0.01, 0, 0.03, 0.03, 0, 0, 0, 0.02, 0.01, 0, 0.02, 0.01, 0.02, 0.03, 0.02, 0, 0, 0.04, 0.04), V19 = c(0, 0.01, 0.01, 0.02, 0.07, 0, 0, 0, 0.04, 0.01, 0.02, 0, 0, 0, 0.04, 0, 0, 0, 0, 0.05), V20 = c(0, 0, 0, 0.01, 0.04, 0.01, 0, 0, 0.02, 0.04, 0.01, 0, 0.02, 0, 0.03, 0, 0.02, 0.01, 0.03, 0.03)), .Names = c("V11", "V12", "V13", "V14", "V15", "V16", "V17", "V18", "V19", "V20"), row.names = c(NA, 20L), class = "data.frame")

> sim_sub
    V11  V12  V13  V14  V15  V16  V17  V18  V19  V20
1  0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2  0.00 0.00 0.00 0.01 0.01 0.00 0.01 0.01 0.01 0.00
3  0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.01 0.00
4  0.01 0.01 0.01 0.01 0.00 0.03 0.01 0.03 0.02 0.01
5  0.00 0.03 0.00 0.01 0.01 0.02 0.00 0.03 0.07 0.04
6  0.01 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.01
7  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
8  0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
9  0.00 0.00 0.00 0.00 0.01 0.02 0.05 0.02 0.04 0.02
10 0.00 0.00 0.01 0.03 0.02 0.02 0.03 0.01 0.01 0.04
11 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.02 0.01
12 0.00 0.01 0.00 0.00 0.00 0.02 0.01 0.02 0.00 0.00
13 0.00 0.00 0.00 0.01 0.00 0.02 0.00 0.01 0.00 0.02
14 0.00 0.01 0.00 0.01 0.01 0.00 0.02 0.02 0.00 0.00
15 0.00 0.00 0.01 0.04 0.00 0.01 0.02 0.03 0.04 0.03
16 0.00 0.00 0.00 0.01 0.00 0.01 0.00 0.02 0.00 0.00
17 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.02
18 0.00 0.00 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.01
19 0.00 0.00 0.00 0.01 0.01 0.03 0.02 0.04 0.00 0.03
20 0.00 0.04 0.01 0.03 0.04 0.01 0.04 0.04 0.05 0.03

Every 5 rows represents one block of simulated data. What would be the best way to average the blocks? My way was to reshape sim_sub, average over the columns, and then reshape back, like so:

matrix(colSums(matrix(t(sim_sub), byrow = TRUE, ncol = 50)), byrow = TRUE, ncol = 10) / 4
       [,1]   [,2]   [,3]   [,4]   [,5]  [,6]   [,7]   [,8]   [,9]  [,10]
[1,] 0.0050 0.0000 0.0000 0.0025 0.0025 0.005 0.0000 0.0050 0.0050 0.0050
[2,] 0.0000 0.0025 0.0000 0.0075 0.0025 0.005 0.0050 0.0075 0.0025 0.0050
[3,] 0.0000 0.0000 0.0000 0.0050 0.0025 0.005 0.0050 0.0025 0.0025 0.0075
[4,] 0.0025 0.0050 0.0025 0.0075 0.0075 0.020 0.0250 0.0275 0.0150 0.0150
[5,] 0.0000 0.0175 0.0075 0.0275 0.0175 0.015 0.0225 0.0275 0.0425 0.0350

How bad is t(sim_sub) in the above?

Thanks for your time,
KW
Re: [R] use of names() within lapply()
Dear Ivan,

No problem. If you want it in a single plot:

matplot(do.call(cbind, g), ylab = "value", pch = 1:2, main = "Some plot", col = c("red", "orange"), type = "o")
legend("topleft", inset = .01, lty = c(1, 1), title = "Plot", col = c("red", "orange"), names(g), horiz = TRUE)

A.K.

From: Ivan Alves papu...@me.com
To: Duncan Murdoch murdoch.dun...@gmail.com; arun smartpink...@yahoo.com
Cc: R-help@r-project.org
Sent: Wednesday, April 17, 2013 11:33 AM
Subject: Re: [R] use of names() within lapply()

Dear Duncan and A.K.,

Many thanks for your super-quick help. The modified lapply did the trick; mapply died with an error: Error in dots[[2L]][[1L]] : object of type 'builtin' is not subsettable.

Kind regards,
Ivan

On 17 Apr 2013, at 17:12, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 17/04/2013 11:04 AM, Ivan Alves wrote:

Dear all,

List g has 2 elements:

> names(g)
[1] "2009-10-07" "2012-02-29"

and the list plot

lapply(g, plot, main = names(g))

results in equal plot titles carrying both list names, whereas the distinct titles names(g[1]) and names(g[2]) are sought. Clearly, lapply is passing 'g' instead of consecutively passing g[1] and then g[2] to process the additional 'main' argument to plot. help(lapply) is mute as to how to pass parameters element-wise. Any suggestion would be appreciated.

I think you want mapply rather than lapply, or you could do lapply on a vector of indices. For example,

mapply(plot, g, main=names)

or

lapply(1:2, function(i) plot(g[[i]], main = names(g)[i]))

Duncan Murdoch
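A self-contained reproduction of the two working fixes (the dates are just example names; plots go to a null device so the snippet runs non-interactively). Note that the mapply call as quoted passes the builtin function names rather than names(g), which is what produced Ivan's "not subsettable" error; Map() with names(g) avoids it:

```r
# g: a two-element named list, as in the question
g <- list("2009-10-07" = rnorm(10), "2012-02-29" = rnorm(10))

pdf(NULL)  # null graphics device; drop this line for on-screen plots
# Map() pairs each element with its own name, so each title differs:
Map(function(x, title) plot(x, main = title), g, names(g))
# equivalent index-based form from Duncan's reply:
lapply(seq_along(g), function(i) plot(g[[i]], main = names(g)[i]))
dev.off()
```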
Re: [R] Best way to calculate averages of Blocks in an matrix?
do.call(rbind, lapply(split(sim_sub, ((seq_len(nrow(sim_sub)) - 1) %/% 5) + 1), colMeans))
#    V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
#1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
#2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
#3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
#4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018

A.K.

- Original Message -
From: Keith S Weintraub kw1...@gmail.com
To: r-help@r-project.org
Sent: Wednesday, April 17, 2013 12:54 PM
Subject: [R] Best way to calculate averages of Blocks in an matrix?

[snip]
Re: [R] Merge
Hi,

I have a quick question here. Let's say he has three data frames and he needs to combine those three data frames using merge. Can we simply use merge to join three data frames? I remember I had some problems using merge for more than two data frames.

Thanks.

On Wed, Apr 17, 2013 at 1:05 AM, Farnoosh farnoosh...@yahoo.com wrote:

Thanks a lot :)

Sent from my iPad

On Apr 16, 2013, at 10:15 PM, arun smartpink...@yahoo.com wrote:

Hi Farnoosh,

You can use either ?merge() or ?join()

DataA <- read.table(text = "ID v1
1 10
2 1
3 22
4 15
5 3
6 6
7 8", sep = "", header = TRUE)
DataB <- read.table(text = "ID v2
2 yes
5 no
7 yes", sep = "", header = TRUE, stringsAsFactors = FALSE)

merge(DataA, DataB, by = "ID", all.x = TRUE)
# ID v1   v2
#1  1 10   NA
#2  2  1  yes
#3  3 22   NA
#4  4 15   NA
#5  5  3   no
#6  6  6   NA
#7  7  8  yes

library(plyr)
join(DataA, DataB, by = "ID", type = "left")
# ID v1   v2
#1  1 10   NA
#2  2  1  yes
#3  3 22   NA
#4  4 15   NA
#5  5  3   no
#6  6  6   NA
#7  7  8  yes

A.K.

From: farnoosh sheikhi farnoosh...@yahoo.com
To: smartpink...@yahoo.com smartpink...@yahoo.com
Sent: Wednesday, April 17, 2013 12:52 AM
Subject: Merge

Hi Arun,

I want to merge a data set with another data frame with 2 columns and keep the sample size of DataA.

DataA:        DataB:       DataCombine:
ID v1         ID V2        ID v1  v2
1  10         2  yes       1  10  NA
2   1         5  no        2   1  yes
3  22         7  yes       3  22  NA
4  15                      4  15  NA
5   3                      5   3  no
6   6                      6   6  NA
7   8                      7   8  yes

Thanks a lot for your help and time.
Re: [R] Best way to calculate averages of Blocks in an matrix?
Also,

do.call(rbind, lapply(split(sim_sub, rep(1:(1 + nrow(sim_sub) / 5), each = 5)[seq_len(nrow(sim_sub))]), colMeans))
#    V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
#1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
#2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
#3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
#4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018

A.K.

- Original Message -
From: arun smartpink...@yahoo.com
To: Keith S Weintraub kw1...@gmail.com
Cc: R help r-help@r-project.org
Sent: Wednesday, April 17, 2013 1:04 PM
Subject: Re: [R] Best way to calculate averages of Blocks in an matrix?

[snip]
Re: [R] Best way to calculate averages of Blocks in an matrix?
Hello,

Try the following.

blocks <- rep(1:(1 + nrow(sim_sub) %/% 5), each = 5)[seq_len(nrow(sim_sub))]
aggregate(sim_sub, list(blocks), FUN = mean)

Hope this helps,

Rui Barradas

On 17-04-2013 18:04, arun wrote:

[snip]
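Not posted in the thread, but for completeness: base R's rowsum() sums rows group-wise in a single pass, so dividing by the block size gives the per-block column means directly (a sketch with stand-in data, since the thread's sim_sub is defined above):

```r
# Stand-in for the thread's sim_sub: 20 rows, 10 columns
set.seed(7)
sim_sub <- as.data.frame(matrix(runif(200), nrow = 20))

blocks <- (seq_len(nrow(sim_sub)) - 1) %/% 5 + 1  # 1,1,1,1,1,2,2,...
rowsum(as.matrix(sim_sub), blocks) / 5            # 4 x 10 matrix of block means
```

This matches the split()/colMeans and aggregate() answers but avoids splitting the data frame, which can matter for large simulations.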
Re: [R] Mancova with R
On Apr 17, 2013, at 16:47, Rémi Lesmerises wrote:

Dear all,

I'm trying to compare two sets of variables: the first set is composed exclusively of numerical variables, and the second mixes factors and numerical variables. I can't use a MANOVA because of the numerical variables in the second set. The solution should be to perform a MANCOVA, but I didn't find any package that allows this type of test. I've already looked in this forum and on the net for answers, but the only thing I've found is the following:

lm(as.matrix(Y) ~ x + z)

where x and z can be numerical or factors. The problem with that is that it actually only performs a succession of lm (or glm) fits, one for each numerical variable contained in the Y matrix. It is not a true MANCOVA, which does a significance test (most often a Wald test) for the overall comparison of the two sets. Such a test is available in SPSS and SAS, but I really want to stay in R! Does anyone have an idea?

You can fit two models and compare them with (say)

fit1 <- lm(as.matrix(Y) ~ x + z)
fit2 <- lm(as.matrix(Y) ~ x)
anova(fit1, fit2, test = "Wilks")

or, removing terms sequentially:

anova(fit1, test = "Wilks")

Thanks in advance for your help!

Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie
Université du Québec à Rimouski
remilesmeri...@yahoo.ca

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
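A self-contained check of this approach, using the built-in iris data instead of the poster's variables (two numeric responses; one factor plus one numeric covariate on the right-hand side, i.e. a MANCOVA-style model):

```r
# Multivariate lm: anova() on "mlm" fits gives an overall multivariate
# test (Wilks' lambda here) rather than one lm per response column.
Y <- as.matrix(iris[, c("Sepal.Length", "Sepal.Width")])
fit1 <- lm(Y ~ Species + Petal.Length, data = iris)
fit2 <- lm(Y ~ Petal.Length, data = iris)
anova(fit1, fit2, test = "Wilks")  # multivariate test for the Species term
```

The same table is available with test = "Pillai", "Hotelling-Lawley", or "Roy"; car::Anova() on fit1 is another route to the same multivariate tests.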
Re: [R] Merge
Hi Janesh, You can use:

library(plyr)
?join_all

# From the help page:
dfs <- list(
  a = data.frame(x = 1:10, a = runif(10)),
  b = data.frame(x = 1:10, b = runif(10)),
  c = data.frame(x = 1:10, c = runif(10))
)
join_all(dfs, "x")
#     x         a         b         c
# 1   1 0.7113766 0.1348978 0.1153703
# 2   2 0.2520057 0.7249154 0.2362936
# 3   3 0.5670157 0.8166805 0.3049683
# 4   4 0.7441726 0.4929165 0.6779029
# 5   5 0.5616914 0.5272339 0.6202915
# 6   6 0.2858429 0.1203205 0.8399356
# 7   7 0.9910520 0.1251815 0.4729418
# 8   8 0.7079778 0.5465055 0.8951371
# 9   9 0.0564100 0.1837211 0.6451289
# 10 10 0.7169663 0.1328287 0.2467554

Reduce(function(...) merge(..., by = "x"), dfs)
#     x         a         b         c
# 1   1 0.7113766 0.1348978 0.1153703
# 2   2 0.2520057 0.7249154 0.2362936
# 3   3 0.5670157 0.8166805 0.3049683
# 4   4 0.7441726 0.4929165 0.6779029
# 5   5 0.5616914 0.5272339 0.6202915
# 6   6 0.2858429 0.1203205 0.8399356
# 7   7 0.9910520 0.1251815 0.4729418
# 8   8 0.7079778 0.5465055 0.8951371
# 9   9 0.0564100 0.1837211 0.6451289
# 10 10 0.7169663 0.1328287 0.2467554

A.K.

From: Janesh Devkota janesh.devk...@gmail.com
To: Farnoosh farnoosh...@yahoo.com
Cc: arun smartpink...@yahoo.com; R help r-help@r-project.org
Sent: Wednesday, April 17, 2013 1:05 PM
Subject: Re: [R] Merge

Hi, I have a quick question here. Let's say he has three data frames and he needs to combine those three data frames using merge. Can we simply use merge to join three data frames? I remember I had some problem using merge for more than two data frames. Thanks.
On Wed, Apr 17, 2013 at 1:05 AM, Farnoosh farnoosh...@yahoo.com wrote: Thanks a lot :) Sent from my iPad

On Apr 16, 2013, at 10:15 PM, arun smartpink...@yahoo.com wrote:

Hi Farnoosh, You can use either ?merge or ?join:

DataA <- read.table(text = "
ID v1
1 10
2 1
3 22
4 15
5 3
6 6
7 8
", sep = "", header = TRUE)

DataB <- read.table(text = "
ID v2
2 yes
5 no
7 yes
", sep = "", header = TRUE, stringsAsFactors = FALSE)

merge(DataA, DataB, by = "ID", all.x = TRUE)
#   ID v1   v2
# 1  1 10   NA
# 2  2  1  yes
# 3  3 22   NA
# 4  4 15   NA
# 5  5  3   no
# 6  6  6   NA
# 7  7  8  yes

library(plyr)
join(DataA, DataB, by = "ID", type = "left")
#   ID v1   v2
# 1  1 10   NA
# 2  2  1  yes
# 3  3 22   NA
# 4  4 15   NA
# 5  5  3   no
# 6  6  6   NA
# 7  7  8  yes

A.K.

From: farnoosh sheikhi farnoosh...@yahoo.com
To: smartpink...@yahoo.com
Sent: Wednesday, April 17, 2013 12:52 AM
Subject: Merge

Hi Arun, I want to merge a data set with another data frame with 2 columns and keep the sample size of DataA.

DataA      DataB      DataCombine
ID v1      ID V2      ID v1 v2
1  10      2  yes     1  10 NA
2  1       5  no      2  1  yes
3  22      7  yes     3  22 NA
4  15                 4  15 NA
5  3                  5  3  no
6  6                  6  6  NA
7  8                  7  8  yes

Thanks a lot for your help and time.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mgcv: how select significant predictor vars when using gam(...select=TRUE) using automatic optimization
I have 11 possible predictor variables and use them to model quite a few target variables. In search of a consistent, and possibly non-manual, way to identify the significant predictor variables out of the eleven, I thought the option select=TRUE might do. Example (here only 4 predictors); first is vanilla with select=FALSE:

fit1 <- gam(target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax),
            family = quasi(link = "log"), data = wspe1, select = FALSE)
summary(fit1)

Family: quasi
Link function: log

Formula: target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -34.57      20.47  -1.689   0.0913 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
            edf Ref.df      F  p-value
s(mgs)    2.335  2.623  0.260    0.829
s(gsd)    6.868  7.506 13.955   <2e-16 ***
s(mud)    8.990  9.000 11.727   <2e-16 ***
s(ssCmax) 6.770  6.978  6.664 7.68e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.402   Deviance explained = 40.4%
GCV score = 8.8563e+05   Scale est. = 8.8053e+05   n = 4511

Then turn select=TRUE:

fit2 <- gam(target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax),
            family = quasi(link = "log"), data = wspe1, select = TRUE)
summary(fit2)

Family: quasi
Link function: log

Formula: target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.1585     1.7439   0.091    0.928

Approximate significance of smooth terms:
            edf Ref.df     F p-value
s(mgs)    2.456      8 24.50  <2e-16 ***
s(gsd)    7.272      9 14.33  <2e-16 ***
s(mud)    7.678      9 20.38  <2e-16 ***
s(ssCmax) 6.556      9 14.36  <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.397   Deviance explained = 40%
GCV score = 8.9209e+05   Scale est. = 8.8715e+05   n = 4511

I seem to not fully understand how to work with select. The predictor mgs is obviously not significant, as seen from fit1 (above), yet here it appears as significant. Why was it not dropped? How are non-significant predictors identified?
Re: [R] Merge
Hello Arun, thank you so much for the prompt reply. I have one simple question here. Does the three dots (...) in the Reduce function mean we are applying it to three data frames here? So, if we were to combine four, would that be four dots? Thanks.

On Wed, Apr 17, 2013 at 12:16 PM, arun smartpink...@yahoo.com wrote:
[arun's join_all/Reduce reply quoted in full -- snipped]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge
No, you don't have to use four dots. Please check these links for further details:

http://stackoverflow.com/questions/5890576/usage-of-three-dots-or-dot-dot-dot-in-functions
http://cran.r-project.org/doc/manuals/R-lang.pdf

A.K.

From: Janesh Devkota janesh.devk...@gmail.com
To: arun smartpink...@yahoo.com
Cc: R help r-help@r-project.org; Farnoosh farnoosh...@yahoo.com
Sent: Wednesday, April 17, 2013 1:23 PM
Subject: Re: [R] Merge
[earlier messages in this thread quoted in full -- snipped]
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
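To make the `...` point concrete: `...` is a placeholder that forwards however many arguments the caller supplies, and `Reduce()` only ever calls the supplied function with two data frames at a time, folding pairwise through the list. So the same code handles three, four, or more frames unchanged. A small sketch (hypothetical data, not from the thread):

```r
## Build four data frames sharing a key column "x".
dfs4 <- lapply(1:4, function(i) {
  d <- data.frame(x = 1:5, v = (1:5) * i)
  names(d)[2] <- paste0("v", i)   # v1, v2, v3, v4
  d
})

## Reduce() merges the first two frames, then merges that result with
## the third, and so on; "..." simply passes each pair through to merge().
merged <- Reduce(function(...) merge(..., by = "x"), dfs4)
dim(merged)   # 5 rows, 5 columns: x plus v1..v4
```

The dots never change with the number of frames; only the length of the list does.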
Re: [R] proper way to handle obsolete function names
On Apr 17, 2013, at 10:17 AM, Jannis bt_jan...@yahoo.de wrote:

Dear R community, what would be the proper R way to handle obsolete function names? I have created several packages of functions and sometimes would like to change the name of a function, but would like a mechanism so that other scripts or functions using the old name still work.

It sounds like you want .Deprecate

?.Deprecate

Michael

Cheers, Jannis

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mgcv: how select significant predictor vars when using gam(...select=TRUE) using automatic optimization
Jan,

What mgcv version are you using, please? (Older versions have a poor p-value approximation when select=TRUE, but of course it's possible that you've managed to break the newer approximation as well.)

The 'select=TRUE' option adds a penalty to each smooth, to allow it to be penalized out of the model altogether via optimization of the smoothing parameter selection criterion. Usually it is better to use REML for smoothing parameter selection in this case, via 'method="REML"' as an option to gam. This is because REML is less prone to undersmoothing than GCV.

So 'select=TRUE' is not selecting on the basis of the p-values themselves, but obviously this sort of discrepancy should not be happening.

best, Simon

On 17/04/13 15:50, Jan Holstein wrote:
[original question quoted in full -- snipped]

-- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK, +44 (0)1225 386603, http://people.bath.ac.uk/sw283

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
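A runnable sketch of the advice above, using mgcv's built-in simulator rather than the poster's data (in `gamSim` example 1, x3 has no true effect, so it is a candidate to be penalized away):

```r
## Hedged sketch: double-penalty term selection with REML smoothing
## parameter estimation, as recommended in the reply.
library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)

fit <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3),
           data = dat, method = "REML", select = TRUE)
summary(fit)
```

With `select = TRUE`, a term that the penalty shrinks out of the model shows an effective degrees of freedom (edf) near zero in the summary; that edf, rather than the p-value, is what the selection acts on.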
Re: [R] proper way to handle obsolete function names
On Wed, Apr 17, 2013 at 10:36 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

It sounds like you want .Deprecate

?.Deprecate

Perhaps you meant Deprecated?

?Deprecated

Best, Peter

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
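A minimal sketch of the pattern the thread points at (function names here are illustrative): the old name becomes a thin wrapper that calls base R's `.Deprecated()` to warn, then forwards to the new name, so existing scripts keep working.

```r
## New name: the real implementation.
new_name <- function(x) x^2

## Old name: warn that it is deprecated, then forward.
old_name <- function(x) {
  .Deprecated("new_name")   # issues a deprecation warning naming the replacement
  new_name(x)
}

old_name(3)   # returns 9, with a warning pointing users at new_name()
```

Once a release cycle or two has passed, the wrapper body is typically switched to `.Defunct("new_name")`, which turns the warning into an error.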
[R] simulation\bootstrap of list factors
Dear R experts,

I am trying to simulate a list containing data matrices. Unfortunately, I don't manage to get it to work. A small example:

n <- 5
nbootstrap <- 2
subsets <- list()
for (i in 1:n) {
  subsets[[i]] <- rnorm(5, mean = 80, sd = 1)
  for (j in 1:nbootstrap) {
    test <- list()
    test[[j]] <- subsets[[i]]
  }
}

How can I get test to be 2 simulation rounds, each with 5 matrices?

Kind regards, Tobias

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
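A hedged sketch of one way to get the structure the question asks for (an outer list of `nbootstrap` rounds, each holding `n` draws; the draws here are vectors, as in the example). The core problem in the posted loop is that `test <- list()` sits inside the inner loop, so `test` is wiped on every iteration:

```r
n <- 5
nbootstrap <- 2

## Pre-allocate the outer list once, then fill each round with n draws.
test <- vector("list", nbootstrap)
for (j in seq_len(nbootstrap)) {
  test[[j]] <- lapply(seq_len(n), function(i) rnorm(5, mean = 80, sd = 1))
}

length(test)        # 2 rounds
length(test[[1]])   # 5 draws per round
```

The same nesting can be written in one line with `replicate(nbootstrap, lapply(seq_len(n), function(i) rnorm(5, 80, 1)), simplify = FALSE)`.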
Re: [R] Unsubscribe please
On Apr 16, 2013, at 11:02 PM, Pascal Oettli wrote:

Hello, Don't reply only to me. 1) Filter the unwanted mails,

There is also an option to get R-help postings in digest format. -- David.

2) It takes a few days to unsubscribe you. Regards, Pascal

[rest of the quoted thread -- snipped]

David Winsemius, Alameda, CA, USA

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use of names() within lapply()
On 17/04/2013 11:33 AM, Ivan Alves wrote:

Dear Duncan and A.K., many thanks for your super quick help. The modified lapply did the trick; mapply died with an error: Error in dots[[2L]][[1L]] : object of type 'builtin' is not subsettable.

That's due to a typo: I should have said

mapply(plot, g, main = names(g))

Duncan Murdoch

Kind regards, Ivan

On 17 Apr 2013, at 17:12, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 17/04/2013 11:04 AM, Ivan Alves wrote:

Dear all, list g has 2 elements:

names(g)
[1] "2009-10-07" "2012-02-29"

and plotting the list with

lapply(g, plot, main = names(g))

results in every plot being titled with both list names, whereas distinct titles names(g[1]) and names(g[2]) are sought. Clearly, lapply is passing 'g' instead of consecutively passing g[1] and then g[2] to process the additional 'main' argument to plot. help(lapply) is mute as to how to pass parameters element-wise. Any suggestion would be appreciated.

I think you want mapply rather than lapply, or you could do lapply on a vector of indices. For example,

mapply(plot, g, main = names)

or

lapply(1:2, function(i) plot(g[[i]], main = names(g)[i]))

Duncan Murdoch

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
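The element-wise pairing that `mapply(plot, g, main = names(g))` performs can be seen without opening a graphics device by swapping `plot` for a function that returns text (the data here are illustrative):

```r
g <- list("2009-10-07" = 1:3, "2012-02-29" = 4:6)

## mapply walks g and names(g) in lockstep: element 1 with name 1,
## element 2 with name 2 -- exactly what lapply alone cannot do.
mapply(function(x, title) paste(title, "has", length(x), "points"),
       g, names(g))
```

Each call receives one list element and its matching name, which is why every plot gets its own distinct title.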
Re: [R] Best way to calculate averages of Blocks in an matrix?
On Apr 17, 2013, at 9:54 AM, Keith S Weintraub wrote:

Folks, I was recently given a simulated data set like the following subset:

sim_sub <- structure(list(
  V11 = c(0.01, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  V12 = c(0, 0, 0, 0.01, 0.03, 0, 0, 0, 0, 0, 0, 0.01, 0, 0.01, 0, 0, 0, 0, 0, 0.04),
  V13 = c(0, 0, 0, 0.01, 0, 0, 0, 0, 0, 0.01, 0, 0, 0, 0, 0.01, 0, 0, 0, 0, 0.01),
  V14 = c(0, 0.01, 0.01, 0.01, 0.01, 0, 0, 0, 0, 0.03, 0, 0, 0.01, 0.01, 0.04, 0.01, 0.02, 0, 0.01, 0.03),
  V15 = c(0, 0.01, 0, 0, 0.01, 0, 0, 0, 0.01, 0.02, 0.01, 0, 0, 0.01, 0, 0, 0, 0.01, 0.01, 0.04),
  V16 = c(0, 0, 0, 0.03, 0.02, 0.01, 0, 0, 0.02, 0.02, 0, 0.02, 0.02, 0, 0.01, 0.01, 0, 0, 0.03, 0.01),
  V17 = c(0, 0.01, 0, 0.01, 0, 0, 0, 0.01, 0.05, 0.03, 0, 0.01, 0, 0.02, 0.02, 0, 0, 0.01, 0.02, 0.04),
  V18 = c(0, 0.01, 0, 0.03, 0.03, 0, 0, 0, 0.02, 0.01, 0, 0.02, 0.01, 0.02, 0.03, 0.02, 0, 0, 0.04, 0.04),
  V19 = c(0, 0.01, 0.01, 0.02, 0.07, 0, 0, 0, 0.04, 0.01, 0.02, 0, 0, 0, 0.04, 0, 0, 0, 0, 0.05),
  V20 = c(0, 0, 0, 0.01, 0.04, 0.01, 0, 0, 0.02, 0.04, 0.01, 0, 0.02, 0, 0.03, 0, 0.02, 0.01, 0.03, 0.03)),
  .Names = c("V11", "V12", "V13", "V14", "V15", "V16", "V17", "V18", "V19", "V20"),
  row.names = c(NA, 20L), class = "data.frame")

sim_sub
    V11  V12  V13  V14  V15  V16  V17  V18  V19  V20
1  0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2  0.00 0.00 0.00 0.01 0.01 0.00 0.01 0.01 0.01 0.00
3  0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.00 0.01 0.00
4  0.01 0.01 0.01 0.01 0.00 0.03 0.01 0.03 0.02 0.01
5  0.00 0.03 0.00 0.01 0.01 0.02 0.00 0.03 0.07 0.04
6  0.01 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.01
7  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
8  0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00
9  0.00 0.00 0.00 0.00 0.01 0.02 0.05 0.02 0.04 0.02
10 0.00 0.00 0.01 0.03 0.02 0.02 0.03 0.01 0.01 0.04
11 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.02 0.01
12 0.00 0.01 0.00 0.00 0.00 0.02 0.01 0.02 0.00 0.00
13 0.00 0.00 0.00 0.01 0.00 0.02 0.00 0.01 0.00 0.02
14 0.00 0.01 0.00 0.01 0.01 0.00 0.02 0.02 0.00 0.00
15 0.00 0.00 0.01 0.04 0.00 0.01 0.02 0.03 0.04 0.03
16 0.00 0.00 0.00 0.01 0.00 0.01 0.00 0.02 0.00 0.00
17 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 0.00 0.02
18 0.00 0.00 0.00 0.00 0.01 0.00 0.01 0.00 0.00 0.01
19 0.00 0.00 0.00 0.01 0.01 0.03 0.02 0.04 0.00 0.03
20 0.00 0.04 0.01 0.03 0.04 0.01 0.04 0.04 0.05 0.03

Every 5 rows represents one block of simulated data. What would be the best way to average the blocks?

This answers the posed question:

tapply(data.matrix(sim_sub), rep(rep(1:4, each = 5), each = 10), mean)
#      1      2      3      4
# 0.0030 0.0070 0.0106 0.0144

Your code following suggests that you do not want the average values within blocks but within blocks AND ALSO within columns (although how you get 5 rows of 5 blocks from a 20-row input object is unclear to me):

data.frame(lapply(sim_sub, function(col) tapply(col, rep(1:4, each = 5), mean)))
#     V11   V12   V13   V14   V15  V16   V17   V18   V19   V20
# 1 0.004 0.008 0.002 0.008 0.004 0.01 0.004 0.014 0.022 0.010
# 2 0.002 0.000 0.002 0.006 0.006 0.01 0.018 0.006 0.010 0.014
# 3 0.000 0.004 0.002 0.012 0.004 0.01 0.010 0.016 0.012 0.012
# 4 0.000 0.008 0.002 0.014 0.012 0.01 0.014 0.020 0.010 0.018

From your code I am guessing a typo of 5 for 4?

My way was to reshape sim_sub, average over the columns and then reshape back like so:

matrix(colSums(matrix(t(sim_sub), byrow = TRUE, ncol = 50)), byrow = TRUE, ncol = 10)/4
#        [,1]   [,2]   [,3]   [,4]   [,5]  [,6]   [,7]   [,8]   [,9]  [,10]
# [1,] 0.0050 0.0000 0.0000 0.0025 0.0025 0.005 0.0000 0.0050 0.0050 0.0050
# [2,] 0.0000 0.0025 0.0000 0.0075 0.0025 0.005 0.0050 0.0075 0.0025 0.0050
# [3,] 0.0000 0.0000 0.0000 0.0050 0.0025 0.005 0.0050 0.0025 0.0025 0.0075
# [4,] 0.0025 0.0050 0.0025 0.0075 0.0075 0.020 0.0250 0.0275 0.0150 0.0150
# [5,] 0.0000 0.0175 0.0075 0.0275 0.0175 0.015 0.0225 0.0275 0.0425 0.0350

How bad is t(sim_sub) in the above?

The whole matrix(matrix(t(.), ...)) approach seems kind of tortured, but to your question, t() is a fairly efficient function.
-- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Best way to calculate averages of Blocks in an matrix?
tapply(t(data.matrix(sim_sub)), rep(rep(1:4, each = 5), each = 10), mean)
#      1      2      3      4
# 0.0086 0.0074 0.0082 0.0108

unlist(lapply(split(sim_sub, ((seq_len(nrow(sim_sub)) - 1) %/% 5) + 1), function(x) mean(unlist(x))))
#      1      2      3      4
# 0.0086 0.0074 0.0082 0.0108

A.K.

- Original Message -
From: David Winsemius dwinsem...@comcast.net
To: Keith S Weintraub kw1...@gmail.com
Cc: r-help@r-project.org
Sent: Wednesday, April 17, 2013 4:05 PM
Subject: Re: [R] Best way to calculate averages of Blocks in an matrix?
[David's reply quoted in full -- snipped]
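For completeness, another base-R idiom for the per-column block means (a sketch assuming the same 20 x 10 `sim_sub` above, with 4 blocks of 5 rows): `rowsum()` sums rows within groups in one pass, so dividing by the block size gives the means without any transposing.

```r
## Group label for each row: 1,1,1,1,1, 2,2,2,2,2, ...
blocks <- rep(1:4, each = 5)

## Column-wise sums within each block, divided by block size = block means.
rowsum(sim_sub, blocks) / 5     # a 4 x 10 result, one row per block

## The same result via aggregate(), which also labels the blocks:
aggregate(sim_sub, by = list(block = blocks), FUN = mean)
```

Compared with the `matrix(t(...))` reshaping, this keeps the data in its original orientation, which makes it harder to get the block layout wrong.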
[R] Full Information Maximum Likelihood estimation method for multivariate sample selection problem
Dear R experts/users,

The Full Information Maximum Likelihood (FIML) estimation approach is considered more robust than the Seemingly Unrelated Regression (SUR) approach for analysing data from a multivariate sample selection problem. The zero cases in my dependent variables result from three sources: irrelevant options, not choosing due to negative utility, and not being used in the reported time. FIML can address the estimation problem associated with cross-equation correlation of the errors. I am interested in learning and applying the FIML method of estimation. I searched R resources on the internet but could not find materials that address my questions, so I request R experts/users to address the following queries:

Q.1. Which R package (e.g. lavaan, mvnmle, stats4, sem) is appropriate for analysing the multivariate sample selection problem using the FIML estimation method?
Q.2. How should the code be formulated to execute the FIML method?
Q.3. What is the right method, similar to the log-likelihood ratio, to determine the stability of variables in the model?
Q.4. My original dependent-variable data are measured as percentages. Do I need to change them into any other specific functional form?

I attempted to formulate the data in the following structure.

Selection equation:
ws = c(w1, w2, w3)  # dependent variables in the selection equations are binary (1 and 0)
zs = c(z1, z2, z3, z4, z5)  # z1, z2, z3 continuous and z4, z5 dummy explanatory variables in the selection equation

Level equation (extent of particular option use):
ys = c(y1, y2, y3)  # dependent variables are percentages with some zero cases
xs = c(x1, x2, x3, x4, x5)  # x1, x2, x3 continuous and x4, x5 dummy explanatory variables

Note: The variables in both selection and level equations are mostly the same.

Advance thanks for helping me.
Champak Ishram
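Not an answer to Q.1-Q.4, but the data structure described above can be written down concretely. A minimal simulation sketch of one selection/level equation pair with cross-equation error correlation (all variable names hypothetical; this generates the kind of data FIML is meant to handle, it is not an FIML fit):

```r
set.seed(1)
n  <- 500
z1 <- rnorm(n)  # explanatory variable in the selection equation
x1 <- rnorm(n)  # explanatory variable in the level equation

# correlated errors across the two equations -- the cross-equation
# correlation that FIML addresses
e1 <- rnorm(n)
e2 <- 0.5 * e1 + sqrt(1 - 0.5^2) * rnorm(n)

w1 <- as.integer(0.5 + z1 + e1 > 0)                       # selection: binary
y1 <- ifelse(w1 == 1, pmax(0, 20 + 5 * x1 + 10 * e2), 0)  # level: percentage, with zeros

mean(y1 == 0)  # share of zero cases from non-selection or censoring
```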
[R] Multi-core processing in glmulti
Dear list,

I am trying to do an automated model selection of a GLMM (function glmer; package lme4) containing a large number of predictors. As far as I understand, glmulti is able to divide the process into chunks and proceed by parallel processing on multiple cores. Unfortunately this does not seem to work, and I could not really find any advice on the matter in other forums. Specifically, I have the following questions: 1) Does parallel processing only work for the exhaustive search (glmulti(..., method = "h", ...)) or also for the genetic algorithm (glmulti(..., method = "g", ...))? 2) Do I need to invoke another package designed for parallel processing (e.g. parallel or snow) to set up the necessary computational clusters before calling glmulti, or can glmulti address the different cores of my PC on its own? Any help would be greatly appreciated!

cheers, marc

-- View this message in context: http://r.789695.n4.nabble.com/Multi-core-processing-in-glmulti-tp4664546.html Sent from the R help mailing list archive at Nabble.com.
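As far as I know, glmulti does not create worker processes itself; a pattern sometimes suggested (assuming glmulti's chunk/chunks arguments, which split the candidate set across independent runs) is to set up a cluster with the parallel package and run one chunk per worker. A minimal sketch of that pattern, with a stand-in function in place of the real glmulti call:

```r
library(parallel)

cl <- makeCluster(2)  # two workers

fit_chunk <- function(i) {
  # in real use, something like (hypothetical call):
  #   glmulti(y ~ ., data = dat, method = "h", chunk = i, chunks = 2)
  sum((1:1000)^i)  # stand-in work so the sketch runs anywhere
}

res <- parLapply(cl, 1:2, fit_chunk)
stopCluster(cl)

length(res)  # one result per chunk
```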
[R] t-statistic for independent samples
Hi,

Typical things you read when new to stats are cautions about using a t-statistic when comparing independent samples. You are steered toward a pooled test or Welch's approximation of the degrees of freedom in order to make the distribution a t-distribution. However, most texts give no information on why you have to do this. So I thought I'd try a little experiment, which is outlined here:

Distribution of differences of independent samples
http://msemac.redwoods.edu/~darnold/math15/R/chapter11/DistributionForTwoIndependentSamplesPartII.html

As you can see at the above link, I see no evidence in these images for why you need a pooled test or Welch's. Anyone care to comment? Or should I put this on Stack Exchange?

D.
Re: [R] t-statistic for independent samples
On 04/17/2013 06:24 PM, David Arnold wrote: [quoted message snipped]

Admittedly, I just skimmed the page, but one thing stands out: your standard deviations are really quite close to each other. Try your simulations again with variance ratios exceeding 2 and see what happens.

--
Kevin E. Thorpe
Head of Biostatistics, Applied Health Research Centre (AHRC)
Li Ka Shing Knowledge Institute of St. Michael's
Assistant Professor, Dalla Lana School of Public Health, University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
Re: [R] t-statistic for independent samples
Dear David,

On Wed, Apr 17, 2013 at 6:24 PM, David Arnold dwarnol...@suddenlink.net wrote: [snip]

Before posting to Stack Exchange, check out the Wikipedia entry for the Behrens-Fisher problem.

Cheers, Jay

--
G. Jay Kerns, Ph.D.
Youngstown State University
http://people.ysu.edu/~gkerns/
Re: [R] t-statistic for independent samples
OK. Although the variance ratio was already 2.25 to 1, I tried sigma1 = 10, sigma2 = 25, which makes the ratio of the variances 6.25 to 1. Still no change. See:

http://msemac.redwoods.edu/~darnold/math15/R/chapter11/DistributionForTwoIndependentSamplesPartII.html

D.
Re: [R] Understanding why a GAM can't suppress an intercept
Simon,

Many thanks as always for your help. I see and appreciate the example that you cited, but I'm having a hard time generalizing it to a multivariate case. A bit about my context: my dataset is response ratios, the log of a treatment over a control. One of my explanatory variables is treatment intensity. When intensity goes to zero, the expectation of the response ratio should also go to zero. Here is the model that I would like to fit:

model <- ResponseRatio ~
    s(as.factor(study), bs = "re", by = intensity) +
    s(intensity) +
    s(x1, by = intensity) +
    s(x2, by = intensity) +
    te(x1, x2, by = intensity) +
    te(x1, intensity)

Here is the example that you gave:

library(mgcv)
set.seed(0)
n <- 100
x <- runif(n)*4 - 1; x <- sort(x)
f <- exp(4*x)/(1 + exp(4*x)); y <- f + rnorm(100)*0.1; plot(x, y)
dat <- data.frame(x = x, y = y)
## Create a spline basis and penalty, making sure there is a knot
## at the constraint point (0 here, but could be anywhere)
knots <- data.frame(x = seq(-1, 3, length = 9)) ## create knots
## set up smoother...
sm <- smoothCon(s(x, k = 9, bs = "cr"), dat, knots = knots)[[1]]
## 3rd parameter is value of spline at knot location 0,
## set it to 0 by dropping...
X <- sm$X[, -3]        ## spline basis
S <- sm$S[[1]][-3, -3] ## spline penalty
off <- y*0 + .6        ## offset term to force curve through (0, .6)
## fit spline constrained through (0, .6)...
b <- gam(y ~ X - 1 + offset(off), paraPen = list(X = list(S)))
lines(x, predict(b))
## compare to unconstrained fit...
b.u <- gam(y ~ s(x, k = 9), data = dat, knots = knots)
lines(x, predict(b.u))

*My question*: how can I extend your example to more than one smooth term, and several smooth interactions, in the context of a model where E[ResponseRatio] must equal 0 when intensity equals zero? I am not sure that I understand exactly what is going on when you call smoothCon to specify `sm`.
I've written my own penalized spline code from scratch, but it is far less sophisticated than mgcv, and is basically a ridge regression where I optimize to get a lambda after specifying a model with a lot of knots. mgcv clearly has a lot more going on, and is far preferable to my rudimentary code for its handling of tensors and random effects. (Also, for prediction, how can I do by = dum AND by = intensity at the same time?)

Many thanks,
Andrew

PS: I am aware that interacting my model with an intensity variable makes my model quite heteroskedastic. I am thinking of using a cluster wild bootstrap to construct confidence intervals. If a better way forward immediately comes to your mind, especially if it's computationally cheaper, I'd greatly appreciate it if you could share it.

On 04/17/2013 02:16 AM, Simon Wood wrote: hi Andrew. gam does suppress the intercept; it's just that this doesn't force the smooth through the intercept in the way that you would like. Basically, for the parametric component of the model, '-1' behaves exactly like it does in 'lm' (it's using the same code). The smooths are 'added on' to the parametric component of the model, with sum-to-zero constraints to force identifiability. There is a solution to forcing a spline through a particular point at http://r.789695.n4.nabble.com/Use-pcls-in-quot-mgcv-quot-package-to-achieve-constrained-cubic-spline-td4660966.html (i.e. the R-help thread Re: [R] Use pcls in mgcv package to achieve constrained cubic spline). best, Simon

On 16/04/13 22:36, Andrew Crane-Droesch wrote: Dear List, I've just tried to specify a GAM without an intercept -- I've got one of the (rare) cases where it is appropriate for E(y) -> 0 as X -> 0. Naively running a GAM with -1 appended to the formula and then calling predict.gam, I see that the model isn't behaving as expected. I don't understand why this would be.
Google turns up this old R-help thread: http://r.789695.n4.nabble.com/GAM-without-intercept-td4645786.html Simon writes: "Smooth terms are constrained to sum to zero over the covariate values. This is an identifiability constraint designed to avoid confounding with the intercept (particularly important if you have more than one smooth). If you remove the intercept from your model altogether (m2) then the smooth will still sum to zero over the covariate values, which in your case will mean that the smooth is quite a long way from the data. When you include the intercept (m1) then the intercept is effectively shifting the constrained curve up towards the data, and you get a nice fit."

Why? I haven't read Simon's book in great detail, though I have read Ruppert et al.'s Semiparametric Regression. I don't see a reason why a penalized spline model shouldn't equal the intercept (or zero) when all of the regressors equal zero. Is anyone able to help with a bit of intuition? Or relevant passages from a good description of why this would be the case? Furthermore, why does the -1
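The sum-to-zero constraint Simon describes is easy to see directly. A small check (mgcv is a recommended package that ships with standard R installations): the estimated smooth's contributions sum to zero over the observed covariate values, so the overall level of the response lives entirely in the intercept.

```r
library(mgcv)
set.seed(0)
x <- runif(100)
y <- 2 + sin(2 * pi * x) + rnorm(100, sd = 0.2)

b <- gam(y ~ s(x))

# per-observation contributions of the centered smooth:
sm <- predict(b, type = "terms")[, "s(x)"]
sum(sm)      # effectively zero (numerical noise only)
coef(b)[1]   # the intercept carries the mean level of y
```

This is why dropping the intercept with -1 pushes the (still centered) smooth away from the data rather than forcing the fit through the origin.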
[R] Memory usage reported by gc() differs from 'top'
In help(gc) I read: "...the primary purpose of calling 'gc' is for the report on memory usage." What memory usage does gc() report? And, more importantly, which memory use does it NOT report? Because I see one answer from gc():

           used  (Mb) gc trigger   (Mb) max used  (Mb)
Ncells 14875922 794.5   21754962 1161.9 17854776 953.6
Vcells 59905567 457.1   84428913  644.2 72715009 554.8

(That's about 1.5g max used, 1.8g trigger.) And a different answer from an OS utility, 'top':

  PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 6210 brech 20  0 18.2g 7.2g 2612 S    1 93.4 16:26.73 R

So the R process is holding on to 18.2g of memory, but gc() only seems to account for 1.5g or so. Where is the rest? I tried searching the archives, and found answers like "just buy more RAM", which doesn't exactly answer my question. And come on, 18g is pretty big; sure, it doesn't fit in my RAM (only 7.2g are resident), but that's beside the point. The huge memory demand is specific to R version 2.15.3 Patched (2013-03-13 r62500) -- "Security Blanket". The same test runs without issues under R version 2.15.1 beta (2012-06-11 r59557) -- "Roasted Marshmallows". I appreciate any insights you can share into R's memory management, and gc() in particular.

/Christian
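For reference, gc() reports only the memory managed by R's own allocator: Ncells counts cons cells and Vcells counts the vector heap. Memory allocated by compiled code outside that heap, and pages R has freed but the OS has not reclaimed, do not appear in its report, which is one reason 'top' can show a much larger figure. A small sketch of reading its return value:

```r
g <- gc()
g   # matrix: Ncells/Vcells rows; used, gc trigger, and max used,
    # each in cells and in Mb

# current R-heap usage in Mb (column 2 holds "used" expressed in Mb):
used_mb <- sum(g[, 2])
used_mb

# peak usage in Mb since the last gc(reset = TRUE):
max_mb <- sum(g[, 6])
```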
Re: [R] t-statistic for independent samples
I just looked more carefully at your code. You are computing the unequal-variance (Welch) version of the t-test, so that's why there isn't a problem. Compare it with the equal-variance t-test, using the pooled variance estimate, which does have a problem, as below.

 -thomas

tstat4 <- function() {
    n1 <- 7;  mu1 <- 100; sigma1 <- 25
    n2 <- 14; mu2 <- 100; sigma2 <- 10
    x1 <- rnorm(n1, mu1, sigma1); x1bar <- mean(x1); s1 <- sd(x1)
    x2 <- rnorm(n2, mu2, sigma2); x2bar <- mean(x2); s2 <- sd(x2)
    ## Welch (unpooled) statistic
    t  <- ((x1bar - x2bar) - (mu1 - mu2)) / sqrt(s1^2/n1 + s2^2/n2)
    ## pooled-variance statistic
    t2 <- ((x1bar - x2bar) - (mu1 - mu2)) /
          sqrt(((n1 - 1)*s1^2 + (n2 - 1)*s2^2)/(n1 + n2 - 2) * (1/n1 + 1/n2))
    c(t, t2)
}

tstats4 <- replicate(10000, tstat4())

hist(tstats4[1, ], breaks = "scott", prob = TRUE, xlim = c(-4, 4), ylim = c(0, 0.4))
x <- seq(-4, 4, length = 200)
y <- dt(x, df = 48)
lines(x, y, type = "l", col = "red")

hist(tstats4[2, ], breaks = "scott", prob = TRUE, xlim = c(-4, 4), ylim = c(0, 0.4))
x <- seq(-4, 4, length = 200)
y <- dt(x, df = 48)
lines(x, y, type = "l", col = "red")

On Thu, Apr 18, 2013 at 12:28 PM, David Arnold dwarnol...@suddenlink.net wrote: [quoted message snipped]
--
Thomas Lumley
Professor of Biostatistics
University of Auckland
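The same point can be checked with t.test() directly: a quick sketch comparing empirical type I error rates of the pooled and Welch tests under the setup in this thread (n1 = 7 with sd 25, n2 = 14 with sd 10, equal means):

```r
set.seed(1)
pvals <- replicate(2000, {
    x1 <- rnorm(7, 100, 25)
    x2 <- rnorm(14, 100, 10)
    c(pooled = t.test(x1, x2, var.equal = TRUE)$p.value,
      welch  = t.test(x1, x2)$p.value)
})

# empirical rejection rates at the nominal 5% level:
rowMeans(pvals < 0.05)
# with the smaller sample having the larger variance, the pooled test
# rejects far too often, while the Welch test stays near 0.05
```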
[R] Using different function (parameters) with apply
Hi All,

I have the following problem (read the commented bits below):

a <- matrix(1:9, nrow = 3)
a
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
div <- 1:3
apply(a, 2, function(x) x/div)  ## want to divide each column by div -- instead each row is divided ##
     [,1] [,2] [,3]
[1,]    1  4.0    7
[2,]    1  2.5    4
[3,]    1  2.0    3
apply(a, 1, function(x) x/div)  ## changing MARGIN from 2 to 1 does something completely weird ##
     [,1] [,2] [,3]
[1,] 1.00 2.00    3
[2,] 2.00 2.50    3
[3,] 2.33 2.67    3

Any thoughts?

Thanks, Sachin
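For what it's worth, the division described here doesn't need apply() at all; a sketch using base R recycling and sweep(). (Note also that apply(a, 1, ...) returns each row's result as a *column*, which explains the "weird" second output above.)

```r
a   <- matrix(1:9, nrow = 3)
div <- 1:3

# divide each column by div (element i of every column by div[i]):
a / div                # vector recycling runs down the columns
sweep(a, 1, div, "/")  # the same thing, stated explicitly

# divide each row by div (element j of every row by div[j]):
sweep(a, 2, div, "/")

# apply(a, 1, ...) puts each row's result into a column,
# so it needs a transpose to line up again:
t(apply(a, 1, function(x) x / div))
```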