Re: [R] best way to apply a list of functions to a dataset ?
Hi: On Tue, Jul 20, 2010 at 5:37 PM, Glen Barnett glnbr...@gmail.com wrote: Hi Dennis, Thanks for the reply. Yes, that's easier, but the conversion to a matrix with rbind has converted the output of that final function to a numeric. If you look at the output of lapply(attitude, f), you'll see that the conversion from logical to numeric has already taken place. Different components of lists can have different types, but within a component, all of the elements must have the same class. You can patch up the result as follows: x - data.frame(do.call(rbind, lapply(attitude, f))) x[, 5] - as.logical(x[, 5]) x meansdskewness median mean.gt.med rating 64.6 12.172562 -0.35792491 65.5 FALSE complaints 66.6 13.314757 -0.21541749 65.0TRUE privileges 53.1 12.235430 0.37912287 51.5TRUE learning 56.36667 11.737013 -0.05403354 56.5 FALSE raises 64.6 10.397226 0.19754317 63.5TRUE critical 74.76667 9.894908 -0.86577893 77.5 FALSE advance42.9 10.288706 0.85039799 41.0TRUE but if you're doing this sort of thing over a large data frame such fixes may be impractical. I included that last function in the example secifically to preclude people assuming that functions would always return the same type. There is a plyr solution, although it's a little more long-winded than I'd prefer in the end: library(ggplot2) # melt the data frame so that the variables become factor levels ma - melt(attitude) Using as id variables dim(ma) [1] 210 2 # Use ddply to get the set of summaries by variable: ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value), skewness = skewness(value), median = median(value), mean.gt.med = mean.gt.med(value)) variable meansdskewness median mean.gt.med 1 rating 64.6 12.172562 -0.35792491 65.5 FALSE 2 complaints 66.6 13.314757 -0.21541749 65.0TRUE 3 privileges 53.1 12.235430 0.37912287 51.5TRUE 4 learning 56.36667 11.737013 -0.05403354 56.5 FALSE 5 raises 64.6 10.397226 0.19754317 63.5TRUE 6 critical 74.76667 9.894908 -0.86577893 77.5 FALSE 7advance 42.9 10.288706 0.85039799 41.0TRUE Notice that now the logical class of mean.gt.med is preserved. The trick with ddply() in package plyr is that, in this case, it is convenient to melt the data frame first before doing the summarizations. This is because ddply() requires a grouping variable - in this example, the groups are the variables themselves. HTH, Dennis I guess this doesn't matter too much for a logical, but what if instead the function returned a character (say mean, median, or equal - indicating which one was larger, or equal which could easily happen with discrete data). This precludes using rbind (which I also used at first, before I noticed that sometimes I could have functions that don't return numerics). Glen On Tue, Jul 20, 2010 at 6:55 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: This might be a little easier (?): library(datasets) skewness - function(x) mean(scale(x)^3) mean.gt.med - function(x) mean(x)median(x) # -- # construct the function to apply to each variable in the data frame f - function(x) c(mean = mean(x), sd = sd(x), skewness = skewness(x), median = median(x), mean.gt.med = mean.gt.med(x)) # map function to each variable with lapply and combine with do.call(): do.call(rbind, lapply(attitude, f)) meansdskewness median mean.gt.med rating 64.6 12.172562 -0.35792491 65.5 0 complaints 66.6 13.314757 -0.21541749 65.0 1 privileges 53.1 12.235430 0.37912287 51.5 1 learning 56.36667 11.737013 -0.05403354 56.5 0 raises 64.6 10.397226 0.19754317 63.5 1 critical 74.76667 9.894908 -0.86577893 77.5 0 advance42.9 10.288706 0.85039799 41.0 1 HTH, Dennis On Mon, Jul 19, 2010 at 10:51 PM, Glen Barnett glnbr...@gmail.com wrote: Assuming I have a matrix of data (or under some restrictions that will become obvious, possibly a data frame), I want to be able to apply a list of functions (initially producing a single number from a vector) to the data and produce a data frame (for compact output) with column 1 being the function results for the first function, column 2 being the results for the second function and so on - with each row being the columns of the original data. The obvious application of this is to produce summaries of data sets (a bit like summary() does on numeric matrices), but with user supplied functions. I am content for the moment to leave it to the user to supply functions that work with the data they supply so as to produce results that will actually be data-frame-able, though I'd
[R] xtable with ifelse statement
Hi there, I'm very new on R and I hope someone can help me to solve the problem in using the ifelse statement with the xtable function(library xtable). I'm trying to get the printing of the elements of two lists in a sorted way. These two list have in common the their names. I will try to give an example: the first list looks like this: $code1 Code code1 Nation Usa $code2 Code code2 Nation Spain the second one is like this $code 1 var187 var2 73 $code2 var134 var223 As you can see the two list have the same names.I tried to implement the following ifelse statement but it doesn't work orde2=function(x,y){ ifelse(names(x)==names(y),list(print(xtable(x)),print(xtable(y))),NULL) What I want is the xtable of this first: $code1 Code code1 Nation Usa and after this: $code 1 var187 var2 73 And as the first case, the xtable of this; $code2 Code code2 Nation Spain and after this: $code2 var134 var223 Someone Knows if it possible to do this and how?? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/xtable-with-ifelse-statement-tp2296694p2296694.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can RMySQL run on Windows using a remote MySQL server?
Change of user, was neatgadgets but that seems to not work very well. I managed to get it working by copying the 2 dll files into the R bin folder. So I am up and running now. Thanks for your help. -- View this message in context: http://r.789695.n4.nabble.com/Can-RMySQL-run-on-Windows-using-a-remote-MySQL-server-tp2293533p2296597.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lattice Panel Print Order
Hi, I have done a bit of searching and have not found a good answer for my question. Although I have not difficulty ordering the panels, Lattice prints them from bottom to top and left to right for each page. Is it possible to make it print from top to bottom for each page? I've tinkered with index.cond and tried reordering the panels, but I don't see a way to do it on a page by page basis. Example: tmp-data.frame(vals=sample(1:100,150,replace=T),concentration=sample(0:15,15),group=letters[1:10]) xyplot(vals~concentration|group,data=tmp,layout=c(3,3)) As you can see my layout is a 3x3 matrix for each page. It would be logical for Lattice to print the panels as: abc def ghi new page j However, it prints the following order: ghi def abc new page blank row blank row j Your help is much appreciated. ME __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nls with some coefficients fixed
Gabor Grothendieck ggrothendi...@gmail.com wrote in message news:aanlktilszaicycu3lz2f5d_bxq1g8m8f7jsjsbj2l...@mail.gmail.com... On Tue, Jul 20, 2010 at 9:58 AM, nas...@uottawa.ca wrote: For nls, the fixing (or masking) of parameters is not, to my knowledge, possible. This is something I've been trying to get in such routines for over 2 decades. Masks are available, but not yet well documented, in Rcgmin and Rvmmin packages. However, these use an optim() style approach, which is quite different from nls(). If there's sufficient interest and some collaboration, I'd be willing to have a go at providing such functionality, but it would take quite a bit of work to provide the full capability of nls(). You can optimize over b while fixing m like this: m - 1 nls(demand ~ m + b * Time, BOD, start = c(b = 1)) Thanks Gabor, I'd recognised that approach in my original post starting this thread (19 July) where I'd used the example: # # fix parameter and use explicit start values works fine nls(density ~ SSgompertz(log(conc), Asym, b2, b3=0.8), data = DNase.1, start=list(Asym=3, b2=2)) #--- I'm fitting many different models so I need a more generic approach. I'd appreciate any comments on the suggestion in my 20 July post? (repeated here with some typo's corrected) - nls - function(formula, data=parent.frame(), start, ...){ if (missing(start)) start - getInitial(formula, data) stats:::nls(formula, data, start=start[names(start) %in% all.vars(formula)], ...) } -- I see it breaks nls for non-selfStart functions; e.g. omitting the explicit start from your example nls(demand ~ m + b * Time, BOD) works fine with vanilla nls (albeit with a warning) # No starting values specified for some parameters. # Intializing 'b' to '1.'. # Consider specifying 'start' or using a selfStart model but my wrapper breaks it altogether. # Error in getInitial.default(func, data, mCall = as.list(match.call(func, : # no 'getInitial' method found for function objects I think that's OK in my application - I either use selfStart functions or explicit start lists - but I don't really want to remove existing functionality. I guess I could (!?!) dive into nls and implement the same kind of approach internally, but that's very deep water for me :-{ Best regards, Keith J __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice Panel Print Order
On Wed, Jul 21, 2010 at 12:13 AM, Mark Ebbert mark.ebb...@hci.utah.edu wrote: Hi, I have done a bit of searching and have not found a good answer for my question. Although I have not difficulty ordering the panels, Lattice prints them from bottom to top and left to right for each page. Is it possible to make it print from top to bottom for each page? Search for as.table in ?xyplot. Is this what you were looking for? -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nls with some coefficients fixed
Gabor Grothendieck ggrothendi...@gmail.com wrote in message news:aanlktilszaicycu3lz2f5d_bxq1g8m8f7jsjsbj2l...@mail.gmail.com... On Tue, Jul 20, 2010 at 9:58 AM, nas...@uottawa.ca wrote: For nls, the fixing (or masking) of parameters is not, to my knowledge, possible. This is something I've been trying to get in such routines for over 2 decades. Masks are available, but not yet well documented, in Rcgmin and Rvmmin packages. However, these use an optim() style approach, which is quite different from nls(). If there's sufficient interest and some collaboration, I'd be willing to have a go at providing such functionality, but it would take quite a bit of work to provide the full capability of nls(). You can optimize over b while fixing m like this: m - 1 nls(demand ~ m + b * Time, BOD, start = c(b = 1)) Thanks Gabor, I'd recognised that approach in my original post starting this thread (19 July) where I'd used the example: # # fix parameter and use explicit start values works fine nls(density ~ SSgompertz(log(conc), Asym, b2, b3=0.8), data = DNase.1, start=list(Asym=3, b2=2)) #--- I'm fitting many different models so I need a more generic approach. I'd appreciate any comments on the suggestion in my 20 July post? (repeated here with some typo's corrected) - nls - function(formula, data=parent.frame(), start, ...){ if (missing(start)) start - getInitial(formula, data) stats:::nls(formula, data, start=start[names(start) %in% all.vars(formula)], ...) } -- I see it breaks nls for non-selfStart functions; e.g. omitting the explicit start from your example nls(demand ~ m + b * Time, BOD) works fine with vanilla nls (albeit with a warning) # No starting values specified for some parameters. # Intializing 'b' to '1.'. # Consider specifying 'start' or using a selfStart model but my wrapper breaks it altogether. # Error in getInitial.default(func, data, mCall = as.list(match.call(func, : # no 'getInitial' method found for function objects I think that's OK in my application - I either use selfStart functions or explicit start lists - but I don't really want to remove existing functionality. I guess I could (!?!) dive into nls and implement the same kind of approach internally, but that's very deep water for me :-{ Best regards, Keith J __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] square brackets in expression in plots
Dear Duncan, dear David, thanks for the reply! I knew about Duncans solution, but could not figure out how to combine it that the 'm*s^-1' can be read from a string. Is there a way to tweak Davids suggestion in a way that this is possible? E.g.: a='m*s^-1' b='speed~bgroup([,a, ])' plot(1:10,main=parse(text=b)) Cheers Jannis --- David Winsemius dwinsem...@comcast.net schrieb am Di, 20.7.2010: Von: David Winsemius dwinsem...@comcast.net Betreff: Re: [R] square brackets in expression in plots An: Jannis bt_jan...@yahoo.de CC: r-h...@stat.math.ethz.ch Datum: Dienstag, 20. Juli, 2010 18:06 Uhr On Jul 20, 2010, at 12:42 PM, Jannis wrote: Dears, do you know whether it is possible to include any square parantheses (brackets) in an expression to use it as an axis label? e.g. I would like to have a label like (with the sub and superscripts correct): speed [m s^-1] Not exactly clear what you mean by correct. I know how to combine an expression with text via paste, but as I run soimething like: a='m*s^{-1}' plot(1:10,main=parse(text=a)) ?plotmath This seemed to work fine with main, sub and xlab: a='speed [m*s^-1]' plot(1:10,main=parse(text=a)) But maybe you wanted the square-brackets? a='speed~bgroup([, m*s^-1, ])' plot(1:10,main=parse(text=a)) I found now way of doing this. I use the parse thing as I have all these units stored as strings that represent expressions. Cheers for any help! Jannis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: If help
Hi I would make an interaction of those two vectors. interaction(sample(letters[1:2], 20, replace=T),sample(letters[1:2], 20, replace=T)) [1] a.a a.b b.b b.a b.b b.b a.a a.a b.b b.a a.a a.a b.b b.a a.a a.b b.a a.a b.b [20] a.b Levels: a.a b.a a.b b.b You will get factor and you can change easily labels and change them to numeric vector. See, ?levels, ?factor Regards Petr r-help-boun...@r-project.org napsal dne 20.07.2010 19:14:28: Hi Y'all, I have some data in a table with 2 columns. There are two values: Reduction and No Reduction. I am trying to make a new variable change which recode the combinations from column 1 and 2 into a single number. Here is a snippet from the table: [1,] NoReduction NoReduction [2,] ReductionReduction [3,] NoReduction NoReduction [4,] NoReduction NoReduction [5,] ReductionReduction [6,] ReductionReduction [7,] ReductionReduction [8,] NoReduction NoReduction [9,] NoReduction NoReduction [10,] NoReduction NoReduction This is the code that I have written so far.. for (i in 1:nrow(change20082009)) if(change20082009[i,1]=='No Reduction' change20082009[i,2]=='No Reduction') ){change20082009[i,3] - 1} else if(change20082009[i,1]=='No Reduction' change20082009[i,2]=='Reduction') {change20082009[i,3] - -1} else if(change20082009[i,1]=='Reduction' change20082009[i,2]=='No Reduction') {change20082009[i,3] - 2} else if(change20082009[i,1]=='Reduction' change20082009[i,2]=='Reduction') {change20082009[i,3] - 0} ) I can't seem to get the code above to work..Any suggestions (I am sure it is really basic)? Is there a better way to do this? Sincerely, tom [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to unsubscribe
Hello, I need to unsubcribe this email to the R help mailing list because I need to create an email only for this mailing list. My email account doesn't have that much space anymore and I don't have access to this every day. How can I do that? Thanks a lot -- Bárbara H. Costa Marine Biologist Researcher SCIAENA - Marine Sciences and Cooperation www.sciaena.org ISPA | http://www.ispa.pt/ui/uie/index.asp BIOMARES | http://www.ccmar.ualg.pt/biomares/ MSI-UCSB | http://www.msi.ucsb.edu/ bco...@ispa.pt barbarahco...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ACCTGMX to 1223400 in R?
Dear all, Thank you for the help. Yes I should have posted in the Bioconductor's forum. Next time I sure will. The chartr function helped and now the code is significantly faster. I have also used the Bstrings as suggested by Martin. Thank you again. -- View this message in context: http://r.789695.n4.nabble.com/ACCTGMX-to-1223400-in-R-tp2294636p2296884.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] KALMAN
Hello, I am looking for some very simple, step by step, hands on application/examples/notes etc. on setting up a multivariate time series Kalman filter model in R. Any help/pointers much appreciated. Best regards, Costas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Limited output
The details of my problem are as follows: I have an sql that returns 2192 rows in sqlite. In R, I typed the following: library(RSQLite) con - dbConnect(dbDriver(SQLite), dbname = C:\\sqlite\\... .sqlite) dbListTables(con) #[1] tbl_n... tbl_s...# cur - dbSendQuery(con,select ... from tbl_n... where... ) bru - fetch(cur) bru This returns the first 500 of the 2192 rows of sqlite. If I then type options()$max.print I get #[1] 9# And from str(bru) I get #'data.frame': 500 obs. of 2 variables: $ vbl1 : chr... $ vbl2 : num...# By the way, I appreciate your help! -- View this message in context: http://r.789695.n4.nabble.com/Limited-output-tp2295882p2296795.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Limited output
I just found the solution... I used dbGetQuery instead of dbSendQuery. That's a more direct approach and it works. Thanks again for your help. -- View this message in context: http://r.789695.n4.nabble.com/Limited-output-tp2295882p2296822.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Objects within environment
Hi all, I have following environments loaded with my current R session: search() [1] .GlobalEnvpackage:stats package:graphics package:grDevices [5] package:utils package:datasets package:methods Autoloads [9] package:base How can I find the objects under a specific environment? Here I tries following: objects(envir=package:base) Error in objects(envir = package:base) : invalid 'envir' argument It would be great if somebody would point me the correct arguments for object() function to find the onjects associated with it. In help file it is written that: envir: an alternative argument to ‘name’ for specifying the environment evaluation environment. Mostly there for back compatibility What is the wrong in my code? Thanks, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Obtaining the unmerged cases from one of the two data set
Dear R Gurus, I am having two dummy csv data sets A and B containing 19 and 15 cases/observations respectively. From the two data set 13 cases are intersection. From one of the two (any) data set, How do I then retrieve the unmerged data ? let's take A for example, six cases must appear in our results. See the R codes below. Please assist. Looking forward to hearing from the group sooner and thanking you in advance Kind regards Mangalani Peter Makananisa Statistical Analyst South African Revenue Service (SARS) Segmentation and Research : Data Modelling Tel: +27 12 422 7357, Cell: +27 82 456 4669, Fax:+27 12 422 6579 E-Mail : pmakanan...@sars.gov.za mailto:pmakanan...@sars.gov.za A = read.csv(C:/Documents and Settings/S1033067/Desktop/A.csv, header = TRUE, dec =,, sep = ,) names(A) [1] NAME SALARY dim(A)[1] [1] 19 B = read.csv(C:/Documents and Settings/S1033067/Desktop/B.csv, header = TRUE, dec =,, sep = ,) names(B) [1] NAME B.SALARY dim(B)[1] [1] 15 common = merge(A,B) names(common) [1] NAME SALARY B.SALARY dim(common)[1] [1] 13 Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Email_Disclaimer.pdf __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Brute-force clustering
Hi all, I use heatmap.2 for clustering and as you know you can choose between several distance functions and clustering algorithms. Trying all combinations isn't too bad but I have two further parameters and whenever I change one of these I need do redo all cluster combinations. It probably takes weeks to test all possible combinations that's why I would like to use a kind of brute-force solution to test all combinations. One major issue is that I have 20 different cases c1-c20 and I need c1-c5 to be in one cluster and c6-c8 in a different one and so on. So after trying a certain combination I somehow need to check whether the cases fall into the same clusters. Cheers -- View this message in context: http://r.789695.n4.nabble.com/Brute-force-clustering-tp2296897p2296897.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Forest - Strata
Coll, An alternative approach is to do that subsetting yourself before sending it to RF and treat each group as an external validation group, as follows: - extract Site A, build a RF model (Model 1) on sites B and C - validate this model by running a predict on site A using the model, use ROCR or other evaluation metrics to look at the effectiveness of Model 1. - extract Site B, build a RF model (Model 2) on sites A and C. - validate this model by trying to predict presence in site B using model 2. - continue through all your sites. This is called 'leave-one-out' and is used in some fields for model validation. You final accuracy estimates of your model could be based on the averages of values obtained for each model. Hope that Helps. Tim -- Message: 44 Date: Tue, 20 Jul 2010 08:48:04 -0700 (PDT) From: Coll gbco...@gmail.com To: r-help@r-project.org Subject: [R] Random Forest - Strata Message-ID: 1279640884553-2295731.p...@n4.nabble.com Content-Type: text/plain; charset=us-ascii Hi all, Had struggled in getting Strata in randomForest to work on this. Can I get randomForest for each of its TREE, to get ALL sample from some strata to build tree, while leaving some strata TOTALLY untouched as oob? e.g. in below, how I can tell RF to, - for tree 1 in the forest, to use only Site A and B to build the tree, while using the WHOLE Site C data for the oob error rate, - for tree 2, use only site A and C to build tree, while using whole site B data for oob - for tree 3, use Site B and C, A as oob...? My command does not work as it would use some sample in all of the sites: rforest.obj - randomForest(Presence.f ~., data=dataset.subset, strata = site.factor) while the setting the corresponding sampsize argument seems would only screen out the Site in all tree building... SitePresence Length Sulphur A Yes3.50 19.42 A No 3.9051.09 A No 3.6026.75 B Yes2.60 9.71 B No 2.209.77 B No 2.608.60 B No 3.0035.59 C Yes3.50 16.07 C No 3.4049.96 C No 3.1035.35 Any idea / comments are welcomed. Thanks in advance. Coll -- View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Strata-tp2295731p2295731.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Objects within environment
On 21/07/2010 5:57 AM, Megh Dal wrote: Hi all, I have following environments loaded with my current R session: search() [1] .GlobalEnvpackage:stats package:graphics package:grDevices [5] package:utils package:datasets package:methods Autoloads [9] package:base How can I find the objects under a specific environment? Here I tries following: objects(envir=package:base) Error in objects(envir = package:base) : invalid 'envir' argument It would be great if somebody would point me the correct arguments for object() function to find the onjects associated with it. In help file it is written that: envir: an alternative argument to ‘name’ for specifying the environment evaluation environment. Mostly there for back compatibility What is the wrong in my code? The easiest way to pick an item from the search list is by number: objects(3) or ls(3) will give you the objects in the graphics package, with the search list as above. You can also specify the name as the name argument, e.g. objects(package:base) If you want to use the envir argument (why?), you need to give an environment, not the name of one. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Chi-square distribution probability density function:
Hi to all I found an formular of an ** ***p-Value Calculator for the Chi-Square test* *http://www.danielsoper.com/statcalc/calc11.aspx* *with the formula* *http://www.danielsoper.com/statkb/topic11.aspx* *what's the gamma function of this formula in r?* *df=5* *ch2=25.50878* *the following code does not give the result 0.001 for the values above * *p= ((0.5^(df/2))/gamma(df))*(ch2^((df/2)-1))* (2.718281828459^(-ch2/2)) or is there any other error? Regards Knut *** [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Obtaining the unmerged cases from one of the two data set
On Jul 21, 2010, at 5:33 AM, Mangalani Peter Makananisa wrote: Dear R Gurus, I saw no reason to copy Rob Hyndman. I did not see that this involves any of the packages he maintains. I am having two dummy csv data sets A and B containing 19 and 15 cases/observations respectively. From the two data set 13 cases are intersection. From one of the two (any) data set, How do I then retrieve the unmerged data ? let's take A for example, six cases must appear in our results. See the R codes below. ?setdiff Perhaps: setdiff( (NAME(A), NAME(B) ) You can also do a merge that is an outer join that includes all the NAME information and then extract the records with SALARY and .B.SALARY data. Untested in absence of working example: ?merge mer - merge(A,B, all=TRUE) mer[ mer$NAME %in% setdiff(NAME(A), NAME(B) ), ] -- David. A = read.csv(C:/Documents and Settings/S1033067/Desktop/A.csv, header = TRUE, dec =,, sep = ,) names(A) [1] NAME SALARY dim(A)[1] [1] 19 B = read.csv(C:/Documents and Settings/S1033067/Desktop/B.csv, header = TRUE, dec =,, sep = ,) names(B) [1] NAME B.SALARY dim(B)[1] [1] 15 common = merge(A,B) names(common) [1] NAME SALARY B.SALARY dim(common)[1] [1] 13 David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Validation in R for GAMM
My GAMM model is to find drivers of species richness in forests is gamm1- gamm(Total Species Richness ~ fROT + s(PH) + s(LOI) + ASP + s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) + s(Annprec), random=list (fSITE =~1), family = poisson, data = BIOFOR3) My issues are that the validation graphs are using methods I'm unfamiliar with e.g. square rooting values and Pearson's residuals. I'm following as best I can recommendations in Wood 2006 and Zuur 2009 but have some (I think) basic questions on graphical validation: 1. plot sqrt transformed fitted values vs square root transformed observed values - should be straight line. I presume the observed values mentioned above refer only to observed RESPONSE variable 2. Pearson residuals vs. sqrt fitted values (should form band with no patterns) I'm totally unfamilar with using Pearson residuals. Until I can see example of a band with no patterns I can't really say if my output is OK! Could this band also be described as a cloud of points with no pattern? I have 2 factors and 2 unsmoothed explanatory variables in my model – what residual graphs are suitable for these? Is it the traditional ones of normalized residuals vs raw explanatory variables? Also for a lot of other model checking residuals are plotted against explanatory variables..what’s appropriate residuals to plot vs.smoothed explanatory variables for a GAMM? 3. Raw residuals (observed vs fitted) vs. sqrt fitted values (should show clear cone) But my output doesn't show clear cone and I don't know what action to take to rectify this issue Infact graphs 2. and 3. are the exact same. Some help with this would be much appreciated. Thanks, Karen The only examples I have for graph output is Wood 2006 Fig 6.11 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Chi-square distribution probability density function:
On Jul 21, 2010, at 6:52 AM, Knut Krueger wrote: Hi to all I found an formular of an ** ***p-Value Calculator for the Chi-Square test* *http://www.danielsoper.com/statcalc/calc11.aspx* *with the formula* *http://www.danielsoper.com/statkb/topic11.aspx* *what's the gamma function of this formula in r?* *df=5* *ch2=25.50878* *the following code does not give the result 0.001 for the values above * *p= ((0.5^(df/2))/gamma(df))*(ch2^((df/2)-1))* (2.718281828459^(- ch2/2)) or is there any other error? Check your implementation of that formula. You made an error in the gamma argument. See also: ?Chisquare ?gamma -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xtable with ifelse statement
What exactly do you want to do? To me it looks more like a problem with the arrangement of your lists as with xtable (or ifelse) (so you are better next time of with a more precise subject line). Please read the posting guide and provide some reproducable code so we can help you (and include your lists etc. via dump('x',file=stdout())). As first suggestion I would rearrange your lists in a way that the data you want to have together ends up in one list, but there are several alternatives, depending on what you want to achieve. Jannis --- vivi84 cipullu...@hotmail.com schrieb am Mi, 21.7.2010: Von: vivi84 cipullu...@hotmail.com Betreff: [R] xtable with ifelse statement An: r-help@r-project.org Datum: Mittwoch, 21. Juli, 2010 07:23 Uhr Hi there, I'm very new on R and I hope someone can help me to solve the problem in using the ifelse statement with the xtable function(library xtable). I'm trying to get the printing of the elements of two lists in a sorted way. These two list have in common the their names. I will try to give an example: the first list looks like this: $code1 Code code1 Nation Usa $code2 Code code2 Nation Spain the second one is like this $code 1 var1 87 var2 73 $code2 var1 34 var2 23 As you can see the two list have the same names.I tried to implement the following ifelse statement but it doesn't work orde2=function(x,y){ ifelse(names(x)==names(y),list(print(xtable(x)),print(xtable(y))),NULL) What I want is the xtable of this first: $code1 Code code1 Nation Usa and after this: $code 1 var1 87 var2 73 And as the first case, the xtable of this; $code2 Code code2 Nation Spain and after this: $code2 var1 34 var2 23 Someone Knows if it possible to do this and how?? Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/xtable-with-ifelse-statement-tp2296694p2296694.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Contingency Table Analysis - specific cell to specific cell comparisons
Dear R users, I have a question of how to do some specific cell to cell comparisons on a R x C contingency table. The table is a 3 x 5 table with frequency / count data. langcons.table - table(lang, cons) langcons.table[cbind(lang,cons)] - freq langcons.table Adj Int Oth Pas Tra C 69 221 17 3 198 E 56 214 33 31 174 J 36 291 8 9 164 I know how to do an independent model test using Poisson in glm glm.out1 - glm(freq~lang+cons, family=poisson, data=langcons.data) summary(glm.out1) And then fit the saturated model glm.out2 - glm(freq~lang*cons, family=poisson, data=langcons.data) summary(glm.out2) However, the results are difficult to interpret: C and Adj are used to as a baseline. And I can only see main effects and interactions and *always according to the baseline*. Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) lang1 lang2 cons1 cons2 cons3 cons4 lang1:cons1 lang2:cons1 lang1:cons2 lang2:cons2 lang1:cons3 lang2:cons3 lang1:cons4 lang2:cons4 If anyone know, please suggest me some way to do specific cell to cell comparison on such a contingency table. Say, to compare pairs of cells: along a column: 3 vs 31, 9 vs 31, 3 vs 9 along a row: 36 vs 9 or even across column and row: 36 vs 31, and 36 vs 3 Thanks for your help in advance. Best, John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Contingency Table Analysis - specific cell to specific cell comparisons
On Jul 21, 2010, at 8:07 AM, Tsunhin John Wong wrote: Dear R users, I have a question of how to do some specific cell to cell comparisons on a R x C contingency table. The table is a 3 x 5 table with frequency / count data. langcons.table - table(lang, cons) langcons.table[cbind(lang,cons)] - freq langcons.table Adj Int Oth Pas Tra C 69 221 17 3 198 E 56 214 33 31 174 J 36 291 8 9 164 I know how to do an independent model test using Poisson in glm glm.out1 - glm(freq~lang+cons, family=poisson, data=langcons.data) summary(glm.out1) And then fit the saturated model glm.out2 - glm(freq~lang*cons, family=poisson, data=langcons.data) summary(glm.out2) However, the results are difficult to interpret: C and Adj are used to as a baseline. And I can only see main effects and interactions and *always according to the baseline*. Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) lang1 lang2 cons1 cons2 cons3 cons4 lang1:cons1 lang2:cons1 lang1:cons2 lang2:cons2 lang1:cons3 lang2:cons3 lang1:cons4 lang2:cons4 If anyone know, please suggest me some way to do specific cell to cell comparison on such a contingency table. Even if you are daunted by the task of plugging the covariates into the formula, exp(intercept+sum(beta_N*var_n)), you can always use the predict function to create an estimate for all (or a specific set) of the covariates. They come out on the log(rate) scale so would need to be exponentiated. Consult your instructor for further help. Say, to compare pairs of cells: along a column: 3 vs 31, 9 vs 31, 3 vs 9 along a row: 36 vs 9 or even across column and row: 36 vs 31, and 36 vs 3 David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to unsubscribe
As in nearly every e-mail list, you just need to click on one of the links at the end of each mail (its the first link in this case) and follow the instructions. Jannis --- barbara horta e costa barbarahco...@gmail.com schrieb am Mi, 21.7.2010: Von: barbara horta e costa barbarahco...@gmail.com Betreff: [R] how to unsubscribe An: r-help@r-project.org Datum: Mittwoch, 21. Juli, 2010 09:50 Uhr Hello, I need to unsubcribe this email to the R help mailing list because I need to create an email only for this mailing list. My email account doesn't have that much space anymore and I don't have access to this every day. How can I do that? Thanks a lot -- Bárbara H. Costa Marine Biologist Researcher SCIAENA - Marine Sciences and Cooperation www.sciaena.org ISPA | http://www.ispa.pt/ui/uie/index.asp BIOMARES | http://www.ccmar.ualg.pt/biomares/ MSI-UCSB | http://www.msi.ucsb.edu/ bco...@ispa.pt barbarahco...@gmail.com [[alternative HTML version deleted]] -Integrierter Anhang folgt- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Forest - Strata
If you use the index argument of the trainControl() function in the caret package, the train() function can be used for this type of resampling (and you'll get some decent summaries and visualizations to boot) Max On Jul 21, 2010, at 7:11 AM, Tim Howard tghow...@gw.dec.state.ny.us wrote: Coll, An alternative approach is to do that subsetting yourself before sending it to RF and treat each group as an external validation group, as follows: - extract Site A, build a RF model (Model 1) on sites B and C - validate this model by running a predict on site A using the model, use ROCR or other evaluation metrics to look at the effectiveness of Model 1. - extract Site B, build a RF model (Model 2) on sites A and C. - validate this model by trying to predict presence in site B using model 2. - continue through all your sites. This is called 'leave-one-out' and is used in some fields for model validation. You final accuracy estimates of your model could be based on the averages of values obtained for each model. Hope that Helps. Tim -- Message: 44 Date: Tue, 20 Jul 2010 08:48:04 -0700 (PDT) From: Coll gbco...@gmail.com To: r-help@r-project.org Subject: [R] Random Forest - Strata Message-ID: 1279640884553-2295731.p...@n4.nabble.com Content-Type: text/plain; charset=us-ascii Hi all, Had struggled in getting Strata in randomForest to work on this. Can I get randomForest for each of its TREE, to get ALL sample from some strata to build tree, while leaving some strata TOTALLY untouched as oob? e.g. in below, how I can tell RF to, - for tree 1 in the forest, to use only Site A and B to build the tree, while using the WHOLE Site C data for the oob error rate, - for tree 2, use only site A and C to build tree, while using whole site B data for oob - for tree 3, use Site B and C, A as oob...? My command does not work as it would use some sample in all of the sites: rforest.obj - randomForest(Presence.f ~., data=dataset.subset, strata = site.factor) while the setting the corresponding sampsize argument seems would only screen out the Site in all tree building... SitePresence Length Sulphur AYes 3.5019.42 ANo3.9051.09 ANo3.6026.75 BYes 2.609.71 BNo2.209.77 BNo2.608.60 BNo3.0035.59 CYes 3.5016.07 CNo3.4049.96 CNo3.1035.35 Any idea / comments are welcomed. Thanks in advance. Coll -- View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Strata-tp2295731p2295731.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Saving R plots as image in oracle database tables
Dear All, I want to know how to save R plots as image/object in a oracle database?? I am using RODBC package for connecting R to oracle Please explain by giving example.. Thanks in advance Cheers, Vikrant -- View this message in context: http://r.789695.n4.nabble.com/Saving-R-plots-as-image-in-oracle-database-tables-tp2297009p2297009.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] fix()ing an S4 method
Hi R experts, The fix() function canbe used to edit normal functions. I would like to know whether it's also possible to use something similar to edit a method of an S4 class. In other words, is there a fix-like function that allows me to edit method definitions without having to go back to the source code? setGeneric (name=doStuff,def = function(object){standardGeneric(doStuff)}) setClass(SomeClass, representation(text = character)) setMethod(f = doStuff, signature (SomeClass), definition = function(object){ return(print(paste(***, obj...@text))) }) fix(doStuff) # Does NOT give the intended result. How can I make R show the method definition? Cheers!! Albert-Jan ~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to unsubscribe
I already changed my email account. Thanks a lot On 21 July 2010 12:50, Jannis bt_jan...@yahoo.de wrote: As in nearly every e-mail list, you just need to click on one of the links at the end of each mail (its the first link in this case) and follow the instructions. Jannis --- barbara horta e costa barbarahco...@gmail.com schrieb am Mi, 21.7.2010: Von: barbara horta e costa barbarahco...@gmail.com Betreff: [R] how to unsubscribe An: r-help@r-project.org Datum: Mittwoch, 21. Juli, 2010 09:50 Uhr Hello, I need to unsubcribe this email to the R help mailing list because I need to create an email only for this mailing list. My email account doesn't have that much space anymore and I don't have access to this every day. How can I do that? Thanks a lot -- Bárbara H. Costa Marine Biologist Researcher SCIAENA - Marine Sciences and Cooperation www.sciaena.org ISPA | http://www.ispa.pt/ui/uie/index.asp BIOMARES | http://www.ccmar.ualg.pt/biomares/ MSI-UCSB | http://www.msi.ucsb.edu/ bco...@ispa.pt barbarahco...@gmail.com [[alternative HTML version deleted]] -Integrierter Anhang folgt- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bárbara H. Costa Marine Biologist Researcher SCIAENA - Marine Sciences and Cooperation www.sciaena.org ISPA | http://www.ispa.pt/ui/uie/index.asp BIOMARES | http://www.ccmar.ualg.pt/biomares/ MSI-UCSB | http://www.msi.ucsb.edu/ bco...@ispa.pt barbarahco...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] DIF Analysis starting from a gpcm class object
Dear useRs, does any of you have suggestions on how to conduct a proper DIF analysis starting from a model of class gpcm (from the wonderful package ltm by prof. Rizopoulos)? difR will handle only dichotomous items, and I have a mix of dicho- and polytomous ones (that's why I chose the partial credit model). I also found the package lordif, but I'm not really sure if that's what I need. I basically want to regress item difficulties on some other covariates (sex, age class,...).. Is there a package you know of that handle such models for a DIF analysis or should I code such a thing by myself? and if so, how can I extract what I need from the model object? thanks to everybody /federico - Dr. Federico Andreis Università degli Studi di Milano-Bicocca, PhD Student MEB Department, Karolinska Institutet, Stockholm, Visiting PhD Student - Problem: To Catch a Lion in the Sahara Desert - The Dirac Method We observe that wild lions are, ipso facto, not observable in the Sahara Desert. Consequently, if there are any lions in the Sahara, they are tame. The capture of a tame lion may be left as an exercise for the reader. -- View this message in context: http://r.789695.n4.nabble.com/DIF-Analysis-starting-from-a-gpcm-class-object-tp2297051p2297051.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fractional Polynomials - Hazard Ratios and Relative Hazard Plots
Dear All, I am using Windows and R version 2.9.2 with libraries cmprsk, mfp and Design. I have a dataset with approximately 1700 patients (1 row per patient) and I have 12 outcomes, three of which are continuous. I have performed univariate analyses to see if any factors are associated with a higher likelihood of the event of interest (achieving 12 month remission from epileptic seizures) and also an analysis adjusted for multiple variables. I have tried log and fractional polynomial (FP) transformations of each continuous variable. In the case of age, used for the example below, the FP transformation led to the best model fit according to the AIC. I have therefore applied this transformation for all analyses. To begin with I have fitted a Cox model stratified by randomisation period, rpa, (either before or after a certain date). fit1 - mfp(Surv(Remtime,Rcens) ~ fp(age) + strata(rpa), family=cox, data=nearma, select=0.05, verbose=TRUE) I would like two things from this model, hazard ratios and and an associated hazard ratio plot. I am aware that the hazard ratios produced from a fractional polynomial transformation are not to be used directly (i.e. those obtained from summary(coxfitf1)). Instead the derived functional form of the variable should be used to estimate hazard ratios post hoc. I have attached a word document explaining how hazard ratios and confidence interval can be derived and given a worked example for the variable, age. The univariate results are: Age (years) ≤10 (10 to 25] (25 to 37] (37 to 50] (50 to 71] 71 1.00 1.00 (1.00 to 1.00) 0.99 (0.99 to 1.00) 0.99 (0.99 to 0.99) 0.99 (0.98 to 0.99) 0.98 (0.97 to 0.99) To create a plot of the relative hazard I have used the code: plot(nearma$age,predict(fit1,type=risk),xlab=Age,ylab=Relative Hazard) The produced plot is attached. As you can clearly see, the hazard ratios above and the relative hazard plot do not agree. This is also the case for the other two continuous variables that have been transformed via the FP approach. The hazard ratios for age using the model adjusted for multiple variables are as follows, which do coincide with the plot: Age (years) ≤10 (10 to 25] (25 to 37] (37 to 50] (50 to 71] 71 1.00 0.86 (0.77 to 0.95) 0.78 (0.65 to 0.94) 0.80 (0.64 to 1.00) 1.01 (0.81 to 1.25) 1.61 (1.27 to 2.05) Can anyone therefore explain why the univariate hazard ratios do not coincide with the relative hazard plot and yet the hazard ratios from the multivariable model do? I know that the calculations are correct for both sets of hazard ratios. Thank you for any help you can provide as I am at a loss to explain the difference in the plot fo the calculations - they should, after all, be saying the same thing! Thank you, Laura __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R Cheat Sheets and Reference Cards
FYI - There is a new collection of free cheat sheets and reference cards for R on DevCheatSheet - http://devcheatsheet.com/tag/r/ If anyone knows of any other R cheat sheets that are not listed, please use the 'Suggest a Missing Cheat Sheet' link on the site. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Piecewise regression using lme()
Hi everyone, I'm trying to fit a of piecewise regression model on a time series. The idea is to divide the series into segments and then to apply linear regression models on each segment but in a global way and considering heteroskedasticity between the segments. For example, I build a time series y with 3 segments: segment1=1:20+rnorm(20,0,2) segment2=20-2*1:30+rnorm(30,0,5) segment3=-40+0.5*1:15+rnorm(15,0,1) group=c(rep(1,20),rep(2,30),rep(3,15)) y=c(segment1,segment2,segment3) Data=data.frame(y,t=1:65,group=as.factor(group)) the model I'd like to fit is: y_t= (beta_01+beta_11*t+error_1)*(group==1)+(beta_02+beta_12*t+error_2)*(group==2)+(beta_03+beta_13*t+error_3)*(group==3) It looks like a mixed effects model were the fixed effect are the piecewise linear regression parts (beta_0i+beta_1i*t) and the random effects are the variance components error_i. The problem is that I can't find the way the write this model correctly using the lme() function of the package nlme. I can't remove the global intercept and the global variance component. Here's what I tried: #1 Using a prior piecewise linear regression lm.list=lmList(y~t|group,Data) model.lme=lme(lm.list,weights=varIdent(form=~1|group)) #2 Trying to estimate the whole model directly and considering the different lines as random effects model.lme=lme(y~1,random=~t|group,data=Data) but the intercept remains... I read a lot of R-help messages before posting this one (my first!) and I'm not getting any closer. It looks like no one tried to estimate the exact same model. I'll be very grateful if someone could help me on this. Thanks Goulven -- View this message in context: http://r.789695.n4.nabble.com/Piecewise-regression-using-lme-tp2297118p2297118.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error message: attempt to set rownames on object with no dimensions
Hi there, I am trying to analyze some data using the lme package. In my case there is a variable called Newmarker that can only take 2 values (number 1 or number 2). I have used the as.factor to remark this fact but an error message appears stating: attempt to set rownames on object with no dimensions. Here is the code I have used: *Dataset - read.table(Data2.txt, header=TRUE, sep=, na.strings=NA, dec=., strip.white=TRUE) Dataset$Newmarker - as.factor(Dataset$Newmarker) attach(Dataset) lme.1 - lme(Newmarker ~ Age, data=Dataset, random= ~1|ID) summary(lme.1)* If I don't use the as.factor I have no problems but I don't think this is the correct way to proceed... The other variables I use can have any value. Can anybody help me please? Thanking you in advance, Regards, Igor Blanco [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error message: attempt to set rownames on object with no dimensions
?lme (at least the one in package nlme) seems not to mention that the response in 'fixed' must be a numeric variable, but it must (and a factor is not numeric). If you want to use a binary response, you need to use other methods, and it would be better to ask on R-sig-mixed-models making clear what your aims actually are. On Wed, 21 Jul 2010, Igor Blanco wrote: Hi there, I am trying to analyze some data using the lme package. Perhaps you mean the nlme package? In my case there is a variable called Newmarker that can only take 2 values (number 1 or number 2). I have used the as.factor to remark this fact but an error message appears stating: attempt to set rownames on object with no dimensions. Here is the code I have used: *Dataset - read.table(Data2.txt, header=TRUE, sep=, na.strings=NA, dec=., strip.white=TRUE) Dataset$Newmarker - as.factor(Dataset$Newmarker) attach(Dataset) lme.1 - lme(Newmarker ~ Age, data=Dataset, random= ~1|ID) summary(lme.1)* If I don't use the as.factor I have no problems but I don't think this is the correct way to proceed... The other variables I use can have any value. Can anybody help me please? Thanking you in advance, Regards, Igor Blanco [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error message: attempt to set rownames on object with no dimensions
On Jul 21, 2010, at 9:39 AM, Igor Blanco wrote: Hi there, I am trying to analyze some data using the lme package. In my case there is a variable called Newmarker that can only take 2 values (number 1 or number 2). I have used the as.factor to remark this fact but an error message appears stating: attempt to set rownames on object with no dimensions. Here is the code I have used: *Dataset - read.table(Data2.txt, header=TRUE, sep=, na.strings=NA, dec=., strip.white=TRUE) Dataset$Newmarker - as.factor(Dataset$Newmarker) Why do you use as.factor? It's probably already a factor. What does str(Dataset) show? attach(Dataset) You do not need to do this, and it creates potential for confusion. The data= argument will allow the column names to be accessed by the lme() function. lme.1 - lme(Newmarker ~ Age, data=Dataset, random= ~1|ID) summary(lme.1)* If I don't use the as.factor I have no problems but I don't think this is the correct way to proceed... The other variables I use can have any value. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] The opposite of lag
Hello! I have a data frame A (below) with a grouping factor (group). I take my DV and create the new, lagged DV by applying the function lag.it (below). It works fine. A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged Now, I am trying to create the oppostive of lag for DV (should I call it lead?) I tried exactly the same as above, but with a different number under lag function (below), but it's not working. I am clearly doing something wrong. Any advice? Thanks a lot! lead.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, 2))[idx,] out-cbind(x, DVs[,2]) names(out)[length(out)]-DV.lead return(out) } A A.lead - do.call(rbind, by(A, A$group, lead.it)) A.lead -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How do you calculate DIC from coda files (R2WinBUGS)?
I tried calculating pD as 1/2 variance of the deviance, but I got hugely inflated numbers. Does anybody know how to calculate pD from the coda files output from R2WinBUGS? By the way, 'set DIC' is greyed out for some reason within WinBUGS, so I can't monitor DIC. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] best way to apply a list of functions to a dataset ?
Dennis' ddply solution would be my choice. Here is a small variation that makes it easy to modify what list of functions is applied: # ma- melt(attitude) f - function(x,v) summarise(x, mean = mean(v), sd = sd(v), skewness = skewness(v), mean.gt.med = mean.gt.med(v) ) ddply(ma, .(variable), function(x) f(x, v = x[[value]])) # Another option is to use data.frame in place of summarise: # f - function(x,v) data.frame( mean = mean(v), sd = sd(v), skewness = skewness(v), mean.gt.med = mean.gt.med(v) ) # -Peter Ehlers On 2010-07-21 0:41, Dennis Murphy wrote: Hi: On Tue, Jul 20, 2010 at 5:37 PM, Glen Barnettglnbr...@gmail.com wrote: Hi Dennis, Thanks for the reply. Yes, that's easier, but the conversion to a matrix with rbind has converted the output of that final function to a numeric. If you look at the output of lapply(attitude, f), you'll see that the conversion from logical to numeric has already taken place. Different components of lists can have different types, but within a component, all of the elements must have the same class. You can patch up the result as follows: x- data.frame(do.call(rbind, lapply(attitude, f))) x[, 5]- as.logical(x[, 5]) x meansdskewness median mean.gt.med rating 64.6 12.172562 -0.35792491 65.5 FALSE complaints 66.6 13.314757 -0.21541749 65.0TRUE privileges 53.1 12.235430 0.37912287 51.5TRUE learning 56.36667 11.737013 -0.05403354 56.5 FALSE raises 64.6 10.397226 0.19754317 63.5TRUE critical 74.76667 9.894908 -0.86577893 77.5 FALSE advance42.9 10.288706 0.85039799 41.0TRUE but if you're doing this sort of thing over a large data frame such fixes may be impractical. I included that last function in the example secifically to preclude people assuming that functions would always return the same type. There is a plyr solution, although it's a little more long-winded than I'd prefer in the end: library(ggplot2) # melt the data frame so that the variables become factor levels ma- melt(attitude) Using as id variables dim(ma) [1] 210 2 # Use ddply to get the set of summaries by variable: ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value), skewness = skewness(value), median = median(value), mean.gt.med = mean.gt.med(value)) variable meansdskewness median mean.gt.med 1 rating 64.6 12.172562 -0.35792491 65.5 FALSE 2 complaints 66.6 13.314757 -0.21541749 65.0TRUE 3 privileges 53.1 12.235430 0.37912287 51.5TRUE 4 learning 56.36667 11.737013 -0.05403354 56.5 FALSE 5 raises 64.6 10.397226 0.19754317 63.5TRUE 6 critical 74.76667 9.894908 -0.86577893 77.5 FALSE 7advance 42.9 10.288706 0.85039799 41.0TRUE Notice that now the logical class of mean.gt.med is preserved. The trick with ddply() in package plyr is that, in this case, it is convenient to melt the data frame first before doing the summarizations. This is because ddply() requires a grouping variable - in this example, the groups are the variables themselves. HTH, Dennis I guess this doesn't matter too much for a logical, but what if instead the function returned a character (say mean, median, or equal - indicating which one was larger, or equal which could easily happen with discrete data). This precludes using rbind (which I also used at first, before I noticed that sometimes I could have functions that don't return numerics). Glen On Tue, Jul 20, 2010 at 6:55 PM, Dennis Murphydjmu...@gmail.com wrote: Hi: This might be a little easier (?): library(datasets) skewness- function(x) mean(scale(x)^3) mean.gt.med- function(x) mean(x)median(x) # -- # construct the function to apply to each variable in the data frame f- function(x) c(mean = mean(x), sd = sd(x), skewness = skewness(x), median = median(x), mean.gt.med = mean.gt.med(x)) # map function to each variable with lapply and combine with do.call(): do.call(rbind, lapply(attitude, f)) meansdskewness median mean.gt.med rating 64.6 12.172562 -0.35792491 65.5 0 complaints 66.6 13.314757 -0.21541749 65.0 1 privileges 53.1 12.235430 0.37912287 51.5 1 learning 56.36667 11.737013 -0.05403354 56.5 0 raises 64.6 10.397226 0.19754317 63.5 1 critical 74.76667 9.894908 -0.86577893 77.5 0 advance42.9 10.288706 0.85039799 41.0 1 HTH, Dennis On Mon, Jul 19, 2010 at 10:51 PM, Glen Barnettglnbr...@gmail.com wrote: Assuming I have a matrix of data (or under some restrictions that will become obvious, possibly a data frame), I want to be able to apply a list of functions (initially producing a single number from a vector) to the data and produce a data
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday
Harlan Harris harlan at harris.name writes: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. - Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace https with http in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, This is an Omegahat package (took me a little while to find it). Perhaps you should write to the package maintainer? library(RGoogleDocs) help(package=RGoogleDocs) or, more obscurely: help(package=RGoogleDocs)$info[[1]][9] (there may be a better way to deal with objects of type packageInfo but I can't figure it out right at the moment). It looks as though one might be able to fix this by hacking the hard-coded URLs in the code, but as you suggest that might be above your head. good luck ... Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday
On Wed, Jul 21, 2010 at 11:24 AM, Ben Bolker bbol...@gmail.com wrote: Harlan Harris harlan at harris.name writes: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. - Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace https with http in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, This is an Omegahat package (took me a little while to find it). Perhaps you should write to the package maintainer? library(RGoogleDocs) help(package=RGoogleDocs) or, more obscurely: help(package=RGoogleDocs)$info[[1]][9] (there may be a better way to deal with objects of type packageInfo but I can't figure it out right at the moment). Maybe: packageDescription('RGoogleDocs', fields = 'Author') It looks as though one might be able to fix this by hacking the hard-coded URLs in the code, but as you suggest that might be above your head. good luck ... Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Clustering groups
Hi, is there a way in R to identify those cluster methods / distance measures which best reflect predefined cluster groups. Given 10 observations O1...O10. Optimally, these 10 observations cluster as follows: cluster1: O1, O2, O3, O4 cluster2: O5, O6 cluster3: O7, O8, O9, O10. What I want is a method which identifies that cluster method / distance measure which best reflect my predefined groups. Is that somehow possible to do? Cheers -- View this message in context: http://r.789695.n4.nabble.com/Clustering-groups-tp2297210p2297210.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame A (below) with a grouping factor (group). I take my DV and create the new, lagged DV by applying the function lag.it (below). It works fine. A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged Now, I am trying to create the oppostive of lag for DV (should I call it lead?) I tried exactly the same as above, but with a different number under lag function (below), but it's not working. I am clearly doing something wrong. Any advice? Thanks a lot! lead.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, 2))[idx,] out-cbind(x, DVs[,2]) names(out)[length(out)]-DV.lead return(out) } A A.lead - do.call(rbind, by(A, A$group, lead.it)) A.lead Try this: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) lag(z, c(-1, 0, 1)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] best way to apply a list of functions to a dataset ?
ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value), skewness = skewness(value), median = median(value), mean.gt.med = mean.gt.med(value)) In principle, you should be able to do: ddply(ma, .(variable), colwise(each(mean, sd, skewness, median, mean.gt.med))) but currently colwise and each don't work together that well. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with time in R
Ms. Chisholm, If you could tell us how you plan to use the variables, we will have a better understanding of what you are looking for and will be able to help you. Are you looking for the time in seconds? In that case, do as Mr. Holfman says. He just skipped the part about converting the factors to characters. You can do that by: y - as.character(x) where x is the vector of factors. Are you looking to have a list of hours, minutes and seconds? That can be done too...Although it would be much easier to just have hours and min.sec On Tue, Jul 20, 2010 at 7:33 AM, Sarah Chisholm sarah.chisholm...@ucl.ac.uk wrote: Hi, I have a problem with the time formatting in R. I have entered time in the format MM:SS.xyz and R has automatically classified this as a factor, but I need it numerically. However when I use as.numeric() it gives me totally different numbers. Is there any way I can tell R to read thes input as a number? Thank you very much [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Aaditya Nanduri aaditya.nand...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Interactions in GAMMs
Hi, I've an issue adding an interaction to a GAMM: My model was of form: gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) + s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE), family = poisson, data = BIOFOR2) with interaction of s(OLDWDLD:DISTWOOD). However I got this error message below but don't know what it means? (my data is composed of info for 150 plots) #I Warning messages: #2: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used #3: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used Can anyone offer advice on how to include this interaction in GAMM model? Thanks Karen Moore [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX
I unfortunately haven't received any responses about this problem. We (the company I work for) are willing to discuss payment to someone who is willing to quickly contribute a fix to the RGoogleDocs/RCurl toolchain that will restore write access. Please contact me directly if you're interested. Thank you, -Harlan Harris On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name wrote: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. - Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace https with http in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, -Harlan Harris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interactions in GAMMs
Hi Karen, I think you should decide what you mean for interaction. s(x:y) is meaningless If you want to fit a surface you should use s(x,y). If you want to fit a varying coefficient model (interaction between a linear and a smooth term) you should use the argument by in s(). The help files of the mgcv package by S. Wood are quite clear and they also include a lot of examples. Hope this helps you vito Karen Moore ha scritto: Hi, I've an issue adding an interaction to a GAMM: My model was of form: gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) + s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE), family = poisson, data = BIOFOR2) with interaction of s(OLDWDLD:DISTWOOD). However I got this error message below but don't know what it means? (my data is composed of info for 150 plots) #I Warning messages: #2: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used #3: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used Can anyone offer advice on how to include this interaction in GAMM model? Thanks Karen Moore [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726/485612 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lm: order of dropped columns
Hi all, If presented with a singular design matrix, lm drops columns to make the design matrix non-singular. What algorithm is used to select which (and how many) column(s) to drop? Particularly, given a factor, how does lm choose levels of the factor to discard? Thanks for the help. Best, Anirban [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interactions in GAMMs
On Jul 21, 2010, at 11:17 AM, Karen Moore wrote: Hi, I've an issue adding an interaction to a GAMM: My model was of form: # Package? Probably: require(mgcv) gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) + s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE), family = poisson, data = BIOFOR2) with interaction of s(OLDWDLD:DISTWOOD). However I got this error message below but don't know what it means? The documentation for s() says the unnamed arguments must be a list of covariates and I do not think OLDWDLD:DISTWOOD constitutes a list. (On the other hand it appears that the term list is being used a bit loosely here, since the examples do not explicitly enclose the covariates in the list() function and when I did so with the example on the te help page I get an error.) (my data is composed of info for 150 plots) #I Warning messages: #2: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used #3: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used Can anyone offer advice on how to include this interaction in GAMM model? Do you have Wood's book? Ch/sect 6.7 has worked examples using the te() function which I believe is the GAM(M) counterpart to linear interactions. He also makes the point that random arguments to gamm() need to be in the list form. (The documentation suggests this has not changes since the book was published.) -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Default For list.len Argument (list output truncated)
Hello How can I set a default for the 'list.len' argument in 'str'? x - as.data.frame(matrix(1:1000, ncol=1000)) str(x) formals(str) - alist(object=, ...=, list.len=1000) args(str); formals(str) str(x) Does not display errors, but does not work either. Thanks for help Sören __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame A (below) with a grouping factor (group). I take my DV and create the new, lagged DV by applying the function lag.it (below). It works fine. A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged Now, I am trying to create the oppostive of lag for DV (should I call it lead?) I tried exactly the same as above, but with a different number under lag function (below), but it's not working. I am clearly doing something wrong. Any advice? Thanks a lot! lead.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, 2))[idx,] out-cbind(x, DVs[,2]) names(out)[length(out)]-DV.lead return(out) } A A.lead - do.call(rbind, by(A, A$group, lead.it)) A.lead Try this: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) lag(z, c(-1, 0, 1)) The ### line was missing: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### lag(z, c(-1, 0, 1)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
Never mind- I figured it out: A library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### result-lag(z, c(-1, 0, 1)) A result Thank you very much! On Wed, Jul 21, 2010 at 12:08 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Thank you, Gabor, but sorry - what is the exact order of those rows again? Thank you! Dimitri On Wed, Jul 21, 2010 at 11:59 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame A (below) with a grouping factor (group). I take my DV and create the new, lagged DV by applying the function lag.it (below). It works fine. A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged Now, I am trying to create the oppostive of lag for DV (should I call it lead?) I tried exactly the same as above, but with a different number under lag function (below), but it's not working. I am clearly doing something wrong. Any advice? Thanks a lot! lead.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, 2))[idx,] out-cbind(x, DVs[,2]) names(out)[length(out)]-DV.lead return(out) } A A.lead - do.call(rbind, by(A, A$group, lead.it)) A.lead Try this: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) lag(z, c(-1, 0, 1)) The ### line was missing: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### lag(z, c(-1, 0, 1)) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
Sorry, I don't think it's working. the last 3 columns (on the right) of result contain the original data of each group. But there is no shift at all. I am trying to reach the following result for each group: The first number disappears and at the bottom an NA appears. Is it possible? On Wed, Jul 21, 2010 at 12:12 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Never mind- I figured it out: A library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### result-lag(z, c(-1, 0, 1)) A result Thank you very much! On Wed, Jul 21, 2010 at 12:08 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Thank you, Gabor, but sorry - what is the exact order of those rows again? Thank you! Dimitri On Wed, Jul 21, 2010 at 11:59 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame A (below) with a grouping factor (group). I take my DV and create the new, lagged DV by applying the function lag.it (below). It works fine. A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged Now, I am trying to create the oppostive of lag for DV (should I call it lead?) I tried exactly the same as above, but with a different number under lag function (below), but it's not working. I am clearly doing something wrong. Any advice? Thanks a lot! lead.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, 2))[idx,] out-cbind(x, DVs[,2]) names(out)[length(out)]-DV.lead return(out) } A A.lead - do.call(rbind, by(A, A$group, lead.it)) A.lead Try this: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) lag(z, c(-1, 0, 1)) The ### line was missing: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### lag(z, c(-1, 0, 1)) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday
On Wed, Jul 21, 2010 at 10:38 AM, Henrique Dallazuanna www...@gmail.com wrote: Maybe: packageDescription('RGoogleDocs', fields = 'Author') or rather packageDescription('RgoogleDocs', fields='Maintainer') (usually but not always the same person) thanks Ben __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
Basically, if you look at the A lagged below, then you see how in each group the values of A move downwards: and NA is added on top and the last value in each group disappears. I am trying to get the opposite: the first value in each group disappears and NA is added at the bottom. D. set.seed(1234) A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged On Wed, Jul 21, 2010 at 12:18 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Sorry, I don't think it's working. the last 3 columns (on the right) of result contain the original data of each group. But there is no shift at all. I am trying to reach the following result for each group: The first number disappears and at the bottom an NA appears. Is it possible? On Wed, Jul 21, 2010 at 12:12 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Never mind- I figured it out: A library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### result-lag(z, c(-1, 0, 1)) A result Thank you very much! On Wed, Jul 21, 2010 at 12:08 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Thank you, Gabor, but sorry - what is the exact order of those rows again? Thank you! Dimitri On Wed, Jul 21, 2010 at 11:59 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:50 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 10:14 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! I have a data frame A (below) with a grouping factor (group). I take my DV and create the new, lagged DV by applying the function lag.it (below). It works fine. A - data.frame(year=rep(c(1980:1984),3), group= factor(sort(rep(1:3,5))), DV=c(rnorm(15))) lag.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, -1))[idx,] out-cbind(x, DVs[,2]) # wages[,2] names(out)[length(out)]-DV.lag return(out) } A A.lagged - do.call(rbind, by(A, A$group, lag.it)) A.lagged Now, I am trying to create the oppostive of lag for DV (should I call it lead?) I tried exactly the same as above, but with a different number under lag function (below), but it's not working. I am clearly doing something wrong. Any advice? Thanks a lot! lead.it - function(x) { DV - ts(x$DV, start = x$year[1]) idx - seq(length = length(DV)) DVs - cbind(DV, lag(DV, 2))[idx,] out-cbind(x, DVs[,2]) names(out)[length(out)]-DV.lead return(out) } A A.lead - do.call(rbind, by(A, A$group, lead.it)) A.lead Try this: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) lag(z, c(-1, 0, 1)) The ### line was missing: library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### lag(z, c(-1, 0, 1)) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrain density to 0 at 0?
This is the boundary problem in density estimation. One simple trick is to use the idea of reflection. If your data is (y1, y2, ...,yn), you create a reflected data by appending `n' negative values to your data, call this y*. Estimate the kernel density for this as fhat(y*). Redefine this estimate so that it is zero for all y 0, and is equal to 2 * fhat(y*). Here is an example: y - rexp(100, rate=10) yref - c(-y, y) fref - density(yref) sel - fref$x = 0 fref$y[sel] - 2 * fref$y[sel] par(mfrow=c(2,1)) plot(density(y)) plot(fref$x[sel], fref$y[sel], type=l) This approach assumes that the density has 0 gradient at the boundary (gradient defined from the right hand side). There are other (better) approaches as well. Look at a book on density estimation. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Farley, Robert farl...@metro.net Date: Tuesday, July 20, 2010 2:47 pm Subject: Re: [R] Constrain density to 0 at 0? To: r-help@r-project.org r-help@r-project.org Read the help page for density more carefully. Especially the bw and from arguments. Yes, it was my inability to make sense of the help page that motivated my email. My distances range from 0.4 to 7.6 but the density plot ranges from -2 to 10. from=0 seems to hide the negative portion of the density without setting it to 0. I've tried adjust, but it requires a value of 0.2, which results in a very, very spiky plot. I was hoping for a process that could forbid impossible (negative) values, yet allow the density plot to blend the individual measurements. Is there a variable bw that could be set small at the extrema, and larger in the range of the data? Robert Farley Metro www.Metro.net -Original Message- From: David Winsemius [ Sent: Monday, July 19, 2010 19:31 To: Farley, Robert Cc: r-help@r-project.org Subject: Re: [R] Constrain density to 0 at 0? On Jul 19, 2010, at 9:57 PM, Farley, Robert wrote: I'm plotting some trip length frequencies using the following code: plot( density(zTestData$Distance, weights=zTestData$Actual), xlim=c(0,10), main=Test TLFD, xlab=Distance, col=6 ) lines(density(zTestData$Distance, weights=zTestData$FlatWeight), col=2) lines(density(zTestData$Distance, weights=zTestData$BrdWeight ), col=3) which works fine except the distances are all positive, but the densities don't drop to 0 until around -2 or -3. Is there a way for me to force the density plot to 0 at 0? Yes. (Assuming it can be zero, given the data.) Read the help page for density more carefully. Especially the bw and from arguments. -- David. __ R-help@r-project.org mailing list PLEASE do read the posting guide and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Objects within environment
On 21/07/2010 12:27 PM, Megh Dal wrote: Hi Duncan, thanks for your clarification. However I do not think I could really understand the envir argument in objects() function. It is saying an alternative argument for name Is the alternative means the alternative of, let say, package:graphics (which is the name of an environment?). Can you give me an example of an alternative argument of that particular environment? as.environment(package:graphics) is an environment. package:graphics is its name. This is specifying the environment evaluation environment. What does the phase environment evaluation environment mean? Can you give me an example? That looks like a typo to me. Environment evaluation environment is meaningless. Mostly there for back compatibility: again totally in dark, what does it mean for back compatibility? An example would definitely be great. Presumably some older release of R used envir, and we still have it so that old code will still work. But that's a signal that new code should never need to use it. Toy said you need to give an environment, not the name of one. If I call someone I need to call with his name, right? Then if I need to give an environment then, without its name how can I do so? This is a computer language, not a conversation. Words have technical meanings that aren't always perfect matches to English meanings of the same words. Here the name of an environment is what you get when you ask for the name attribute of it. There are lots of different ways to refer to objects other than by the name in that technical sense. For example, you could say x - as.environment(package:graphics) # uses the environment's name ls(envir=x) # refers to it by a variable holding it. Duncan Murdoch Can you please explain me in simple english? I think R help file should use more non-technical simple english language so that student like me can understand R in more easier way! --- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote: From: Duncan Murdoch murdoch.dun...@gmail.com Subject: Re: [R] Objects within environment To: Megh Dal megh700...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Wednesday, July 21, 2010, 4:48 PM On 21/07/2010 5:57 AM, Megh Dal wrote: Hi all, I have following environments loaded with my current R session: search() [1] .GlobalEnv package:stats package:graphics package:grDevices [5] package:utils package:datasets package:methods Autoloads [9] package:base How can I find the objects under a specific environment? Here I tries following: objects(envir=package:base) Error in objects(envir = package:base) : invalid 'envir' argument It would be great if somebody would point me the correct arguments for object() function to find the onjects associated with it. In help file it is written that: envir: an alternative argument to ‘name’ for specifying the environment evaluation environment. Mostly there for back compatibility What is the wrong in my code? The easiest way to pick an item from the search list is by number: objects(3) or ls(3) will give you the objects in the graphics package, with the search list as above. You can also specify the name as the name argument, e.g. objects(package:base) If you want to use the envir argument (why?), you need to give an environment, not the name of one. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
On Wed, Jul 21, 2010 at 12:18 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Sorry, I don't think it's working. the last 3 columns (on the right) of result contain the original data of each group. But there is no shift at all. I am trying to reach the following result for each group: The first number disappears and at the bottom an NA appears. Is it possible? It works for me. Here is the result. As we see - the first 3 columns are lagged ahead so that they start at 1981 which is one year later than the original columns start, - the second set of 3 columns are the originals so they start at 1980 - the last set of 3 columns are lagged so they start at 1979 which is one year before the original columns start. set.seed(123) A - data.frame(year=rep(c(1980:1984),3), group= + factor(sort(rep(1:3,5))), DV=c(rnorm(15))) library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### lag(z, c(-1, 0, 1)) X1.lag.1 X2.lag.1 X3.lag.1 X1.lag0X2.lag0X3.lag0 1979 NA NA NA NA NA NA 1980 NA NA NA -0.56047565 1.7150650 1.2240818 1981 -0.56047565 1.7150650 1.2240818 -0.23017749 0.4609162 0.3598138 1982 -0.23017749 0.4609162 0.3598138 1.55870831 -1.2650612 0.4007715 1983 1.55870831 -1.2650612 0.4007715 0.07050839 -0.6868529 0.1106827 1984 0.07050839 -0.6868529 0.1106827 0.12928774 -0.4456620 -0.5558411 1985 0.12928774 -0.4456620 -0.5558411 NA NA NA X1.lag1X2.lag1X3.lag1 1979 -0.56047565 1.7150650 1.2240818 1980 -0.23017749 0.4609162 0.3598138 1981 1.55870831 -1.2650612 0.4007715 1982 0.07050839 -0.6868529 0.1106827 1983 0.12928774 -0.4456620 -0.5558411 1984 NA NA NA 1985 NA NA NA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX
Hi Harlan Can you send some code so that we can reproduce the problem. That will enable me to fix the problem quicker. D. On 7/21/10 8:26 AM, Harlan Harris wrote: I unfortunately haven't received any responses about this problem. We (the company I work for) are willing to discuss payment to someone who is willing to quickly contribute a fix to the RGoogleDocs/RCurl toolchain that will restore write access. Please contact me directly if you're interested. Thank you, -Harlan Harris On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name mailto:har...@harris.name wrote: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. * Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace |https| with |http| in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, -Harlan Harris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with invoking R from the command line (Windows Vista)
Dear all I am running in a problem when trying to run R from Windows command line. I am runnning Windows Vista, R-2.10.1. I have a script I would like to run remotely from another program. As it did not work, I thought I would test the script from the Windows command line which highlighted a problem. When I open the command line and type R at the prompt (or any other variant (R CMD BATCH, Rgui.exe or Rterm.exe), I get the error message: 'Rscript' is not recognized as an internal or external command, operable program or batch file. if I move to the R directory (cd usr\R\R-2.10.1\bin), and try running the script ( R myscript.R --vanilla) it works fine. I thought the problem may come from the environment variables (tried specifying explicit location of R.exe), but this did not solve the problem. I also re-install R but that did not help either. I cannot think of anything else. I have read the R installation and rwfaq notes, and look through the posts but I cannot find any suggestion on how to fix the bug. If anyone can suggest anything, that will be much appreciated. Many thanks Camille PS: I am also running R on another machine (Windows XP) and I don't have a similar problem! -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-invoking-R-from-the-command-line-Windows-Vista-tp2297429p2297429.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [SPAM](Aktien) Re: Chi-square distribution probability density function:
On Jul 21, 2010, at 12:36 PM, Knut Krueger wrote: David Winsemius schrieb: On Jul 21, 2010, at 6:52 AM, Knut Krueger wrote: Hi to all I found an formular of an ** ***p-Value Calculator for the Chi-Square test* *http://www.danielsoper.com/statcalc/calc11.aspx* *with the formula* http://www.danielsoper.com/statkb/topic11.aspx *what's the gamma function of this formula in r?* *df=5* *ch2=25.50878* *the following code does not give the result 0.001 for the values above * *p= ((0.5^(df/2))/gamma(df))*(ch2^((df/2)-1))* (2.718281828459^(- ch2/2)) or is there any other error? Check your implementation of that formula. You made an error in the gamma argument. Sorry a copy and paste error but gamma(df/2) does not solve the problem I am not able to get the implementation of the function working :-( See also: ?Chisquare ?gamma there is also a different result using this function from the helpf files and the function /f_n(x) = 1 / (2^(n/2) Gamma(n/2)) x^(n/2-1) e^(-x/2) That is (perhaps) because you are confusing a density function with a cumulative probability function. The expression above should give the same result as a (proper) R transcription of the formula on the page you offered above and should be what dchisq referenced on the ? Chisquare page returns as well. Sure enough... dchisq(25.50878, 5) [1] 4.951e-05 p= ((0.5^(df/2))/gamma(df/2))*(ch2^((df/2)-1))* (2.718281828459^(- ch2/2)) p [1] 4.951e-05 Those formulae only differ in how they use grouping of the constant terms. (In R the function is gamma, not Gamma.) You should be able to get the P(X^2|x25.508, df=5) from an integration of that function. Knut / David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] String processing - is there a better way
I have a two part question Part 1) I am trying to remove characters in a string based on the position of a key character in another string. I have a solution that works but it requires a for-loop. A vectorized way of doing this has alluded me. CleanRead-function(x,y) { if (!is.character(x)) x - as.character(x) if (!is.character(y)) y - as.character(y) idx-grep(\\*, x, value=FALSE) starpos-gregexpr(\\*, x[idx]) ysplit-strsplit(y[idx], '') n-length(idx) for(i in 1:n) { ysplit[[i]][starpos[[i]]] = } y[idx]-unlist(lapply(ysplit, paste, sep='', collapse='')) return(y) } x-c(AA*.*A **a.a*,,,A, C*c.., **aA) y-c(abcdefghi, abcdefghij, abcde, abcd) CleanRead(x,y) [1] abdfghi cdeghij acde cd Is there a better way to do this? Part 2) My next step in the string processing is to take the characters in the output of CleanRead and subtract 33 from the ascii value of the character to obtain an integer. Again I have a solution that works, involving splitting the string into characters then converting them to factors (starting at ascii 34) and using unclass to get the integer value. (kindof a atoi(x)-33 all in one step) I looked for the C equivalent of atoi, but the only help I could find (R-help 2003) suggested using as.numeric. However, the help file (and testing) shows you get 'NA'. Am I missing an easier way to do this? Thanks in advance, Brian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with invoking R from the command line (Windows Vista)
On 21/07/2010 12:55 PM, Camster wrote: Dear all I am running in a problem when trying to run R from Windows command line. I am runnning Windows Vista, R-2.10.1. I have a script I would like to run remotely from another program. As it did not work, I thought I would test the script from the Windows command line which highlighted a problem. When I open the command line and type R at the prompt (or any other variant (R CMD BATCH, Rgui.exe or Rterm.exe), I get the error message: 'Rscript' is not recognized as an internal or external command, operable program or batch file. This is a sign that your PATH environment variable doesn't include the directory RHOME\bin. if I move to the R directory (cd usr\R\R-2.10.1\bin), and try running the script ( R myscript.R --vanilla) it works fine. I thought the problem may come from the environment variables (tried specifying explicit location of R.exe), but this did not solve the problem. I also re-install R but that did not help either. You don't say exactly what you tried, but if you tried changing your PATH, you should try again, and verify that the change is in place in the session where you're trying to run R. Duncan Murdoch I cannot think of anything else. I have read the R installation and rwfaq notes, and look through the posts but I cannot find any suggestion on how to fix the bug. If anyone can suggest anything, that will be much appreciated. Many thanks Camille PS: I am also running R on another machine (Windows XP) and I don't have a similar problem! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
I reinstalled zoo and now I can see the years on left (raw names). Before - I could not see them. Thank you! On Wed, Jul 21, 2010 at 12:51 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 12:18 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Sorry, I don't think it's working. the last 3 columns (on the right) of result contain the original data of each group. But there is no shift at all. I am trying to reach the following result for each group: The first number disappears and at the bottom an NA appears. Is it possible? It works for me. Here is the result. As we see - the first 3 columns are lagged ahead so that they start at 1981 which is one year later than the original columns start, - the second set of 3 columns are the originals so they start at 1980 - the last set of 3 columns are lagged so they start at 1979 which is one year before the original columns start. set.seed(123) A - data.frame(year=rep(c(1980:1984),3), group= + factor(sort(rep(1:3,5))), DV=c(rnorm(15))) library(zoo) z - read.zoo(A, index = 1, split = group, frequency = 1) z - as.zooreg(z) ### lag(z, c(-1, 0, 1)) X1.lag.1 X2.lag.1 X3.lag.1 X1.lag0 X2.lag0 X3.lag0 1979 NA NA NA NA NA NA 1980 NA NA NA -0.56047565 1.7150650 1.2240818 1981 -0.56047565 1.7150650 1.2240818 -0.23017749 0.4609162 0.3598138 1982 -0.23017749 0.4609162 0.3598138 1.55870831 -1.2650612 0.4007715 1983 1.55870831 -1.2650612 0.4007715 0.07050839 -0.6868529 0.1106827 1984 0.07050839 -0.6868529 0.1106827 0.12928774 -0.4456620 -0.5558411 1985 0.12928774 -0.4456620 -0.5558411 NA NA NA X1.lag1 X2.lag1 X3.lag1 1979 -0.56047565 1.7150650 1.2240818 1980 -0.23017749 0.4609162 0.3598138 1981 1.55870831 -1.2650612 0.4007715 1982 0.07050839 -0.6868529 0.1106827 1983 0.12928774 -0.4456620 -0.5558411 1984 NA NA NA 1985 NA NA NA -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with invoking R from the command line (Windows Vista)
On Wed, Jul 21, 2010 at 12:55 PM, Camster c.szmar...@googlemail.com wrote: Dear all I am running in a problem when trying to run R from Windows command line. I am runnning Windows Vista, R-2.10.1. I have a script I would like to run remotely from another program. As it did not work, I thought I would test the script from the Windows command line which highlighted a problem. When I open the command line and type R at the prompt (or any other variant (R CMD BATCH, Rgui.exe or Rterm.exe), I get the error message: 'Rscript' is not recognized as an internal or external command, operable program or batch file. if I move to the R directory (cd usr\R\R-2.10.1\bin), and try running the script ( R myscript.R --vanilla) it works fine. I thought the problem may come from the environment variables (tried specifying explicit location of R.exe), but this did not solve the problem. I also re-install R but that did not help either. I cannot think of anything else. I have read the R installation and rwfaq notes, and look through the posts but I cannot find any suggestion on how to fix the bug. If anyone can suggest anything, that will be much appreciated. Many thanks Camille PS: I am also running R on another machine (Windows XP) and I don't have a similar problem! If you don't want to change your path you can use Rgui.bat, Rterm.bat, etc. in the batchfiles distribution at http://batchfiles.googlecode.com Just put any of them that you wish to use on your path (they are self contained Windows batch files, i.e. with no dependencies) and then when you invoke any of them they will look up R in the registry and run the corresponding command. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
On Wed, Jul 21, 2010 at 1:16 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: I reinstalled zoo and now I can see the years on left (raw names). Before - I could not see them. Thank you! Note that the result is a zoo object. A zoo object consists of data (which is everything except the years, here) and the time index which is shown along the left hand side. If lg is the zoo object then coredata(lg) and index(lg) give the two components while as.data.frame(lg) does give a data frame with the years as rownames. lg - lag(z, c(-1, 0, 1)) coredata(lg) index(lg) as.data.frame(lg) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Chi-square distribution probability density function:
On Jul 21, 2010, at 1:18 PM, Knut Krueger wrote: David Winsemius schrieb: And exactly why did you think I offered ?Chisquare I was completely on the wrong way, and tried to find a solution with the formula instead to substitute the formula. So I tried to implement pchisq into the formula - and of course I got wrong values ... p - function(x) ((0.5^(5/2))/gamma(5/2))*(x^((5/2)-1))* (2.718281828459^(-x/2)) integrate(p, 25.508, Inf) 0.000 with absolute error 2.7e-05 Check result pchisq(25.508,5) [1] 0. 1-pchisq(25.508,5) [1] 0.000 Thank's Knut David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The opposite of lag
Thanks a lot, it's very helpful! On Wed, Jul 21, 2010 at 1:21 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jul 21, 2010 at 1:16 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: I reinstalled zoo and now I can see the years on left (raw names). Before - I could not see them. Thank you! Note that the result is a zoo object. A zoo object consists of data (which is everything except the years, here) and the time index which is shown along the left hand side. If lg is the zoo object then coredata(lg) and index(lg) give the two components while as.data.frame(lg) does give a data frame with the years as rownames. lg - lag(z, c(-1, 0, 1)) coredata(lg) index(lg) as.data.frame(lg) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] String processing - is there a better way
On Wed, Jul 21, 2010 at 1:02 PM, Davis, Brian brian.da...@uth.tmc.edu wrote: I have a two part question Part 1) I am trying to remove characters in a string based on the position of a key character in another string. I have a solution that works but it requires a for-loop. A vectorized way of doing this has alluded me. CleanRead-function(x,y) { if (!is.character(x)) x - as.character(x) if (!is.character(y)) y - as.character(y) idx-grep(\\*, x, value=FALSE) starpos-gregexpr(\\*, x[idx]) ysplit-strsplit(y[idx], '') n-length(idx) for(i in 1:n) { ysplit[[i]][starpos[[i]]] = } y[idx]-unlist(lapply(ysplit, paste, sep='', collapse='')) return(y) } x-c(AA*.*A **a.a*,,,A, C*c.., **aA) y-c(abcdefghi, abcdefghij, abcde, abcd) CleanRead(x,y) [1] abdfghi cdeghij acde cd Is there a better way to do this? Part 2) My next step in the string processing is to take the characters in the output of CleanRead and subtract 33 from the ascii value of the character to obtain an integer. Again I have a solution that works, involving splitting the string into characters then converting them to factors (starting at ascii 34) and using unclass to get the integer value. (kindof a atoi(x)-33 all in one step) I looked for the C equivalent of atoi, but the only help I could find (R-help 2003) suggested using as.numeric. However, the help file (and testing) shows you get 'NA'. This splits x and y into vectors of single characters, extracts those from y for which x is not * and then matches the result to letters to return a number. f - function(x, y) match(y[x != *], letters) mapply(f, strsplit(x, ), strsplit(y, )) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX
Hi Harlan If you install the latest version of RCurl from source via install.packages(RCurl, repos = http://www.omegahat.org/R;) and that should solve the problem, assuming I have been reproducing the same problem you mentioned. You haven't mentioned what operating system your are on. If you are on Windows, that will pick up the binary version. If you are on the mac, you will have to build it from source. D. On 7/21/10 8:26 AM, Harlan Harris wrote: I unfortunately haven't received any responses about this problem. We (the company I work for) are willing to discuss payment to someone who is willing to quickly contribute a fix to the RGoogleDocs/RCurl toolchain that will restore write access. Please contact me directly if you're interested. Thank you, -Harlan Harris On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name mailto:har...@harris.name wrote: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. * Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace |https| with |http| in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, -Harlan Harris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Objects within environment
Can you please explain me in simple english? I think R help file should use more non-technical simple english language so that student like me can understand R in more easier way! -- Megh (and others): Without judging this particular issue, writing good technical documentation -- whether it be plumbing, refrigerator repair, or computer code -- is difficult. I think relatively few people do it well. IMHO, for example, Brian Ripley excels at this, because I find him complete, accurate, and terse; I, however, do not. But others may disagree. It's hard to hit a sweet spot that is at the same time, complete, correct, as concise as possible, but still comprehensible to a broad range of users. So while I do not wish to censure -- after all, if you had trouble understanding, your complaint is justified -- I do want to urge a bit more understanding and effort on your part. Before you complain, try writing some documentation yourself. It's damn hard! ;) Cheers, Bert Bert Gunter Genentech Nonclinical Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interactions in GAMMs
On Wed, 2010-07-21 at 16:17 +0100, Karen Moore wrote: Hi, I've an issue adding an interaction to a GAMM: My model was of form: gamm1 - gamm(TOTSR ~ fROT + s(PH) + s(LOI) + s(ASP) + s(SQRT_ELEV) + CANCOV + s(SQRT_TOTCWD) + s(WELLF) + s(WELLN) + s(OLDWDLD) + s(DISTWOOD) + s(Annprec) + s(OLDWDLD:DISTWOOD) + (1|fSITE), family = poisson, data = BIOFOR2) with interaction of s(OLDWDLD:DISTWOOD). However I got this error message below but don't know what it means? (my data is composed of info for 150 plots) An interaction is formed in a GAMM using a smoother in two or more dimensions, such as s(OLDWDLD, DISTWOOD) or te(OLDWDLD, DISTWOOD), the latter allows for different basis function within the two smoothers and for different inherent scales for the two variables. IIRC the latter allows you to produce strictly nested models for likelihood ratio testing: te(OLDWDLD) + te(DISTWOOD) is nested within te(OLDWDLD, DISTWOOD) HTH G #I Warning messages: #2: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used #3: In OLDWDLD:DISTWOOD : # numerical expression has 150 elements: only the first used Can anyone offer advice on how to include this interaction in GAMM model? Thanks Karen Moore [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with invoking R from the command line (Windows Vista) - - SORTED!
Thanks for suggestion. It was indeed a path problem. I move R into a directory without spaces and added the directory RHOME\bin in the Path environment variable. All sorted. -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-invoking-R-from-the-command-line-Windows-Vista-tp2297429p2297552.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday - CAN PAY FOR FIX
*Thank you*! Yes, version 1.4-3 of RCurl solves this problem! If you put a tip jar https://www.paypal.com/cgi-bin/webscr?cmd=_donate-intro-outside on OmegaHat, Duncan, I will seriously kick a few bucks your way. Getting my code back running will save me a lot of time and anxiety! Thanks for all your efforts, -Harlan On Wed, Jul 21, 2010 at 1:45 PM, Duncan Temple Lang dun...@wald.ucdavis.edu wrote: Hi Harlan If you install the latest version of RCurl from source via install.packages(RCurl, repos = http://www.omegahat.org/R;) and that should solve the problem, assuming I have been reproducing the same problem you mentioned. You haven't mentioned what operating system your are on. If you are on Windows, that will pick up the binary version. If you are on the mac, you will have to build it from source. D. On 7/21/10 8:26 AM, Harlan Harris wrote: I unfortunately haven't received any responses about this problem. We (the company I work for) are willing to discuss payment to someone who is willing to quickly contribute a fix to the RGoogleDocs/RCurl toolchain that will restore write access. Please contact me directly if you're interested. Thank you, -Harlan Harris On Tue, Jul 20, 2010 at 10:19 AM, Harlan Harris har...@harris.name mailto:har...@harris.name wrote: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. * Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace |https| with |http| in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, -Harlan Harris [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] prediction from a logistic mixed effects model
Hi, Is there any similar command to predict which can be used with a logistic random effects model? I have run a random effects model using lme(), and then use predict.lme() with no problems. However, I would also like to run a logistic random effects model, and then also run a predict command on the logistic random effects model. If I use lme(), I can not include a family option to indicate I am running a logistic model. I have instead successfully used lmer(), with the family=binomial(link=logit) option. However, I then can not find an equivalent command to predict to use with lmer. Any suggestions/options? Thanks very much for the help. -K -- View this message in context: http://r.789695.n4.nabble.com/prediction-from-a-logistic-mixed-effects-model-tp2297668p2297668.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Converting Between Character and Numeric Objects
Hello, I'm trying to convert a vector of string objects to a numeric object by initializing the variables with values. I use the function below to scan through a matrix and create mass action flux relationships: makeMassActionFluxes = function(sMatrix) { #Allocate a matrix with identical dimensions as the inputted stoichiometric matrix temp = matrix(nrow = dim(sMatrix)[1], ncol = dim(sMatrix)[2]) velocities = vector(length = dim(sMatrix)[2]) ## This series of 'for' loops scans the stoichiometrix matrix looking for negative integers. ## Once it identifies a negative integer it pulls the row name and exponentiates it to the ## appropriate stoichiometry, creating the flux expression. for (i in 1:dim(sMatrix)[1]) { for (j in 1:dim(sMatrix)[2]) { if (sMatrix[i, j] = 0) (temp[i, j] = NA) else temp[i, j] = paste(rownames(sMatrix)[i], '^', -sMatrix[i, j], sep = '') } } ## Scan temp by column. Ignore strings that begin with '1.' Use paste() to multiply strings. for (k in 1:dim(temp)[2]) {#Loop through columns velocities[k] = paste(c(temp[ , k]), sep = '', collapse = '*') velocities[k] = gsub('*NA', '', velocities[k], fixed = TRUE) velocities[k] = gsub('NA*', '', velocities[k], fixed = TRUE) } return(velocities) } If I pass in the following example matrix, I get a list of flux expressions ('v') just like I want: testS = matrix(c(-1, 0, 2, -1, -1, 0, 0, -1, -1, 1, 0, 0, 0, 1, 0), nrow = 5, ncol = 3, byrow = TRUE) rownames(testS) = c(met1, met2, met3, met4, met5) colnames(testS) = c('rxn1', 'rxn2', 'rxn3') v = makeMassActionFluxes(testS) v [1] met1^1*met2^1 met2^1*met3^1 met3^1 My next step is to use an ODE generating function to initialize values for the met variables. state = c(met1 = 10, met2 = 10, met3 = 10, met4 = 10, met5 = 10) parameters = 1 ODE = function(t, state, parameters, S, velocity) { with(as.list(c(state, parameters, S, velocity)), { des = S %*% v #The products should look like this: #dmet1 = (-1)*met1^1*met2^1 + (0)*met2^1*met3^1 + (2)*met3^1 #Return list(des) }) } When I call 'makeODE' I get the following error (which I expected): Error in S %*% v : requires numeric/complex matrix/vector arguments I'm wondering if there is an easy way to remove the quotes on the elements on 'v' or make 'v' into a numeric vector so that I can initialize them with values? Or, is there an easier way to generate the flux expressions in my 'makeMassActionFluxes' function so that eventually passing in initial values for the met variables will be easier? Thanks in advance! Erin -- Erin Rachael Shellman The University of Michigan Bioinformatics PhD Candidate http://www.erinshellman.com shell...@umich.edu (937) 321.1129 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm: order of dropped columns
Hi: (1) lm() drops columns in a rank-deficient model matrix X to make X'X nonsingular - this is called a full-rank reparameterization of the linear model. (2) How many columns of X are dropped depends on its rank, which in turn depends on the number of constraints in the model matrix. This is related to the degrees of freedom associated with each term in the corresponding linear model.. The default contrasts are options()$contrasts unordered ordered contr.treatment contr.poly Other choices include contr.helmert() and contr.sum(). See the help page ?contrasts for further details. See also section 6.2 of Venables and Ripley's _Modern Applied Statistics with S, 4th ed._ for further information on the connection between the columns of the model matrix in ANOVA and the defined sets of contrasts. Under the default contrasts, the first column is dropped for the main effect terms. Here's a simple example of a balanced 2-way ANOVA with interaction: d - data.frame(a = factor(rep(letters[1:3], each = 4)), b = factor(rep(rep(1:2, each = 2), 3)), c = rnorm(12)) d a b c 1 a 1 -0.77367688 2 a 1 -0.79069791 3 a 2 0.69257133 4 a 2 2.46788204 5 b 1 0.38892289 6 b 1 -0.03521033 7 b 2 -0.01071611 8 b 2 -0.74209425 9 c 1 1.36974281 10 c 1 -1.22775441 11 c 2 0.29621976 12 c 2 0.28208192 m - aov(c ~ a * b, data = d) model.matrix(m) (Intercept) ab ac b2 ab:b2 ac:b2 11 0 0 0 0 0 21 0 0 0 0 0 31 0 0 1 0 0 41 0 0 1 0 0 51 1 0 0 0 0 61 1 0 0 0 0 71 1 0 1 1 0 81 1 0 1 1 0 91 0 1 0 0 0 10 1 0 1 0 0 0 11 1 0 1 1 0 1 12 1 0 1 1 0 1 attr(,assign) [1] 0 1 1 2 3 3 attr(,contrasts) attr(,contrasts)$a [1] contr.treatment attr(,contrasts)$b [1] contr.treatment Notice that the first column of each main effect is dropped, and that the interaction columns are the products of the retained a columns with the retained b columns. The attr() components verify that the contrast method used for this matrix is the default contr.treatment. anova(m) Analysis of Variance Table Response: c Df Sum Sq Mean Sq F value Pr(F) a 2 0.5001 0.25003 0.2827 0.7633 b 1 1.3700 1.36999 1.5489 0.2597 a:b2 4.5647 2.28235 2.5804 0.1554 Residuals 6 5.3070 0.88450 Examination of the degrees of freedom tells us that there are two independent contrasts for a, one independent contrast for b and two independent contrasts for the interaction of a and b, which are shown in model.matrix(m) above. If you want to reset the baselline factor level, see ?relevel. Also look into the contrast package on CRAN. HTH, Dennis On Wed, Jul 21, 2010 at 8:40 AM, Anirban Mukherjee am...@cornell.eduwrote: Hi all, If presented with a singular design matrix, lm drops columns to make the design matrix non-singular. What algorithm is used to select which (and how many) column(s) to drop? Particularly, given a factor, how does lm choose levels of the factor to discard? Thanks for the help. Best, Anirban [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Get distribution of positive/negative examples for each cluster
Dear R experts, I have a labeled data set. Each data is assigned a binary label 0 or 1. Assume that I use some clustering algorithm to group the data by clusters (using some features of the data). Now I want to know how many data are labeled as 0/1 in each cluster. For example, assume that I have 9 labeled data grouped into three clusters. The ids of the clusters are 1, 2, and 3. The dataset is represented by the following matrix: membershipLabel d110 d210 d311 d420 d521 d621 d731 d831 d931 Now I want to get the following output, telling me how many data are labeled as 0 and 1 in each cluster cluster_id0-data1-data 121 212 303 The output does not have to be a matrix, it could be a summary of the statistics. How should I approach this problem? What R functions should I use to get such information? Thanks so much! Boya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help me with prediction in linear model
Hi R-community, I have the code as follows,i Fitted model as follows lbeer-log(beer_monthly) t-seq(1956,1995.2,length=length(beer_monthly)) #beer_monthly contains 400+ entries t2=t^2 beer_fit_parabola=lm(lbeer~t+t2) Below is not working for me. Please help me in preparing the new data set for the below prediction predict(beer_fit_parabola,newdata=data.frame(t=seq(1995,1998,length=20),t2=seq(1995,1998,length=20)) #it is listing all 400+ entries ,but not 20 ahead prediction. Thanks In advance for your help -- View this message in context: http://r.789695.n4.nabble.com/Help-me-with-prediction-in-linear-model-tp2297313p2297313.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lmomRFA package: error bounds/confidence intervals
Dear List I’m using the ”lmomRFA” package to fit different distributions to my data sample. To calculate the error bounds I used: regsimq(…) and sitequantbounds(…) So my questions are: Are error bounds and confidence intervals the same thing? And: Does regsimq(… boundprob = c(0.05, 0.95)) calculate the 90 or the 95% confidence interval? If error bounds and confidence intervals are not equal: Is there a way to calcu late confidence intervals for my fitted distributions (gev, weibull, gumbel…)? Thank you in advance, Tonja Neu: WEB.DE De-Mail - Einfach wie E-Mail, sicher wie ein Brief! Jetzt De-Mail-Adresse reservieren: https://produkte.web.de/go/demail02 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sem by variable x
Hi Jarret, Thank you for your answer. I get the following message: Error in cov(a.df[, 2:7], nrow(a.df)) : incompatible dimensions The function seems to run for some countries but then appears to stop when a country has incomplete data (1 var is missing, for example). How to force the function to continue or skip those countries? Thank you, Daniel On Tue, Jul 20, 2010 at 7:24 PM, Jarrett Byrnes byr...@msi.ucsb.edu wrote: You may want to take a look at the lavaan package and use the multigroup analysis there (and see if you even need to group by country as well). Otherwise, you could do something like library(sem) library(plyr) cfa_func-function(a.df){ cfa-sem(ses.model, cov(a.df[,2:7], nrow(a.df))) print(summary(cfa)) } d_ply(data, idcntry, cfa_func) -Jarrett On Jul 20, 2010, at 6:22 AM, Daniel Caro wrote: Hi R users, I am new in R. I would like to perform confirmatory factor analysis for a data set of countries. My data are: data - read.csv(ses.raw, header = TRUE) attach(data) names(data) [1] idcntry momed daded dadocu momocu hompos finan The country id is idcntry, my model is ses.model, and variables to be included in the analysis are momed daded dadocu momocu hompos finan . How can I run cfa-sem(ses.model, cov(data[,2:7], nrow(data))) summary(cfa) by country? I am able to perform sem on all data by not by country. I tried by(data[,2:7], idcntry, function(x) sem(ses.model, cov(data[,2:7]), nrow(data))) but the output is the same for all countries. Thank you, Daniel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice Panel Print Order
For pete's sake! That's just embarrassing. I don't know how I missed that. Thanks for your help Deepayan. I would also like to say thank you for supporting Lattice so well. On Jul 21, 2010, at 2:39 AM, Deepayan Sarkar wrote: On Wed, Jul 21, 2010 at 12:13 AM, Mark Ebbert mark.ebb...@hci.utah.edu wrote: Hi, I have done a bit of searching and have not found a good answer for my question. Although I have not difficulty ordering the panels, Lattice prints them from bottom to top and left to right for each page. Is it possible to make it print from top to bottom for each page? Search for as.table in ?xyplot. Is this what you were looking for? -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Objects within environment
Hi Duncan, thanks for your clarification. However I do not think I could really understand the envir argument in objects() function. It is saying an alternative argument for name Is the alternative means the alternative of, let say, package:graphics (which is the name of an environment?). Can you give me an example of an alternative argument of that particular environment? This is specifying the environment evaluation environment. What does the phase environment evaluation environment mean? Can you give me an example? Mostly there for back compatibility: again totally in dark, what does it mean for back compatibility? An example would definitely be great. Toy said you need to give an environment, not the name of one. If I call someone I need to call with his name, right? Then if I need to give an environment then, without its name how can I do so? Can you please explain me in simple english? I think R help file should use more non-technical simple english language so that student like me can understand R in more easier way! --- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote: From: Duncan Murdoch murdoch.dun...@gmail.com Subject: Re: [R] Objects within environment To: Megh Dal megh700...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Wednesday, July 21, 2010, 4:48 PM On 21/07/2010 5:57 AM, Megh Dal wrote: Hi all, I have following environments loaded with my current R session: search() [1] .GlobalEnv package:stats package:graphics package:grDevices [5] package:utils package:datasets package:methods Autoloads [9] package:base How can I find the objects under a specific environment? Here I tries following: objects(envir=package:base) Error in objects(envir = package:base) : invalid 'envir' argument It would be great if somebody would point me the correct arguments for object() function to find the onjects associated with it. In help file it is written that: envir: an alternative argument to ‘name’ for specifying the environment evaluation environment. Mostly there for back compatibility What is the wrong in my code? The easiest way to pick an item from the search list is by number: objects(3) or ls(3) will give you the objects in the graphics package, with the search list as above. You can also specify the name as the name argument, e.g. objects(package:base) If you want to use the envir argument (why?), you need to give an environment, not the name of one. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Objects within environment
Thanks Duncan, I understood. Your explanation is really great. Thank you so much for your time. --- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote: From: Duncan Murdoch murdoch.dun...@gmail.com Subject: Re: [R] Objects within environment To: Megh Dal megh700...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Wednesday, July 21, 2010, 10:15 PM On 21/07/2010 12:27 PM, Megh Dal wrote: Hi Duncan, thanks for your clarification. However I do not think I could really understand the envir argument in objects() function. It is saying an alternative argument for name Is the alternative means the alternative of, let say, package:graphics (which is the name of an environment?). Can you give me an example of an alternative argument of that particular environment? as.environment(package:graphics) is an environment. package:graphics is its name. This is specifying the environment evaluation environment. What does the phase environment evaluation environment mean? Can you give me an example? That looks like a typo to me. Environment evaluation environment is meaningless. Mostly there for back compatibility: again totally in dark, what does it mean for back compatibility? An example would definitely be great. Presumably some older release of R used envir, and we still have it so that old code will still work. But that's a signal that new code should never need to use it. Toy said you need to give an environment, not the name of one. If I call someone I need to call with his name, right? Then if I need to give an environment then, without its name how can I do so? This is a computer language, not a conversation. Words have technical meanings that aren't always perfect matches to English meanings of the same words. Here the name of an environment is what you get when you ask for the name attribute of it. There are lots of different ways to refer to objects other than by the name in that technical sense. For example, you could say x - as.environment(package:graphics) # uses the environment's name ls(envir=x) # refers to it by a variable holding it. Duncan Murdoch Can you please explain me in simple english? I think R help file should use more non-technical simple english language so that student like me can understand R in more easier way! --- On Wed, 7/21/10, Duncan Murdoch murdoch.dun...@gmail.com wrote: From: Duncan Murdoch murdoch.dun...@gmail.com Subject: Re: [R] Objects within environment To: Megh Dal megh700...@yahoo.com Cc: r-h...@stat.math.ethz.ch Date: Wednesday, July 21, 2010, 4:48 PM On 21/07/2010 5:57 AM, Megh Dal wrote: Hi all, I have following environments loaded with my current R session: search() [1] .GlobalEnv package:stats package:graphics package:grDevices [5] package:utils package:datasets package:methods Autoloads [9] package:base How can I find the objects under a specific environment? Here I tries following: objects(envir=package:base) Error in objects(envir = package:base) : invalid 'envir' argument It would be great if somebody would point me the correct arguments for object() function to find the onjects associated with it. In help file it is written that: envir: an alternative argument to ‘name’ for specifying the environment evaluation environment. Mostly there for back compatibility What is the wrong in my code? The easiest way to pick an item from the search list is by number: objects(3) or ls(3) will give you the objects in the graphics package, with the search list as above. You can also specify the name as the name argument, e.g. objects(package:base) If you want to use the envir argument (why?), you need to give an environment, not the name of one. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Chi-square distribution probability density function:
David Winsemius schrieb: And exactly why did you think I offered ?Chisquare I was completely on the wrong way, and tried to find a solution with the formula instead to substitute the formula. So I tried to implement pchisq into the formula - and of course I got wrong values ... Thank's Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] .jinit and .jnew
i have a jar under z:/downloads/AA, i use .jinit(classpath=z:/downloads/AA) and then .jnew(Myclass) but .jnew throes an error: Error in .jnew(MyClass) : java.lang.NoClassDefFoundError: MyClass tried .jinit(classpath=z:/downloads/AA/myClass.jar) and got the same error. any idea? -- View this message in context: http://r.789695.n4.nabble.com/jinit-and-jnew-tp2297486p2297486.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGoogleDocs ability to write to spreadsheets broken as of yesterday
Henrique Dallazuanna wrote: On Wed, Jul 21, 2010 at 11:24 AM, Ben Bolker bbol...@gmail.com wrote: Harlan Harris harlan at harris.name writes: Hi, I'm using RGoogleDocs/RCurl to update a Google Spreadsheet. Everything worked OK until this morning, when my ability to write into spreadsheet cells went away. I get the following weird error: Error in els[[type + 1]] : subscript out of bounds Looking at the Google Docs API changelog, I see the following: http://code.google.com/apis/spreadsheets/changelog.html Release 2010-01 (July 14, 2010) This is an advanced notice about an upcoming change. - Starting July 19, 2010, all links returned by all Spreadsheets API feeds will use HTTPS. This is being done in the interests of increased security. If you require the use of HTTP, we recommend that you remove the replace https with http in these links. Another announcement will be made on July 19, 2010, when this change goes to production. I suspect this is the problem. Fixing it is above my head, I'm afraid. Could anyone help? This is urgent. Thank you, This is an Omegahat package (took me a little while to find it). Perhaps you should write to the package maintainer? library(RGoogleDocs) help(package=RGoogleDocs) or, more obscurely: help(package=RGoogleDocs)$info[[1]][9] (there may be a better way to deal with objects of type packageInfo but I can't figure it out right at the moment). Maybe: packageDescription('RGoogleDocs', fields = 'Author') From the News file for 2.11.0: o maintainer() has been added, to give convenient access to the name of the maintainer of a package It looks as though one might be able to fix this by hacking the hard-coded URLs in the code, but as you suggest that might be above your head. good luck ... Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Question about allocMatrix error message
Dear R family, I faced a technical problem in r coding. #s=t(dev)%*%dev/(nr-1) # dev (100,000 by 2) stands for deviation from the mean #sinv=solve(s) #t2=diag(dev%*%sinv%*%t(dev)) I got an error message at t2 statement: Error in diag(dev %*% si %*% t(dev)) : allocMatrix: too many elements specified Please let me know if there is a way to overcome this problem. best moohwan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] calling R from within Java, using jri
I've gotten the same problems when running the examples, the programs just exit after creating Rengine. After trying all the possibilities, I found that I set the environment variable R_HOME wrong. After changing it from C:\Program Files\R\R-2.11.1\bin to C:\Program Files\R\R-2.11.1, the programs worked just fine!!! Hope this helps! -- View this message in context: http://r.789695.n4.nabble.com/R-calling-R-from-within-Java-using-jri-tp808810p2297558.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] xtable
Hi, How do I build a table from a regression model adjusted using xtable? Commands are: modelo1 = lm(Y~X1 + X2) influencia = influence.measures(modelo1) require(xtable) xtable(influencia) but it isn't work. Thanks, -- Silvano Cesar da Costa Departamento de Estatística Universidade Estadual de Londrina Fone: 3371-4346 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Get distribution of positive/negative examples for each cluster
Boya- table() is the function that does what you want: cdat = data.frame(membership=rep(1:3,rep(3,3)), + label=as.character(c(0,0,1,0,1,1,1,1,1))) table(cdat) label membership 0 1 1 2 1 2 1 2 3 0 3 From there, you can rearrange it in a variety of ways: as.data.frame(table(cdat)) membership label Freq 1 1 02 2 2 01 3 3 00 4 1 11 5 2 12 6 3 13 Or, to conform with your request reshape(as.data.frame(table(cdat)),idvar='membership', + v.names='Freq',timevar='label',direction='wide') membership Freq.0 Freq.1 1 1 2 1 2 2 1 2 3 3 0 3 - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Wed, 21 Jul 2010, Boya Sun wrote: Dear R experts, I have a labeled data set. Each data is assigned a binary label 0 or 1. Assume that I use some clustering algorithm to group the data by clusters (using some features of the data). Now I want to know how many data are labeled as 0/1 in each cluster. For example, assume that I have 9 labeled data grouped into three clusters. The ids of the clusters are 1, 2, and 3. The dataset is represented by the following matrix: membershipLabel d110 d210 d311 d420 d521 d621 d731 d831 d931 Now I want to get the following output, telling me how many data are labeled as 0 and 1 in each cluster cluster_id0-data1-data 121 212 303 The output does not have to be a matrix, it could be a summary of the statistics. How should I approach this problem? What R functions should I use to get such information? Thanks so much! Boya [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Variance of the prediction in the linear regression model (Theory and programming)
Hi, folks, Here are the codes: ## y=1:10 x=c(1:9,1) lin=lm(log(y)~x) ### log(y) is following Normal distribution x=5:14 prediction=predict(lin,newdata=x) ##prediction=predict(lin) ### 1. The codes do not work, and give the error message: Error in eval(predvars, data, env) : numeric 'envir' arg not of length one. But if I use the code after the pound sign, it works. I mean the name of the newdata is x, why it does not work though? 2. Because the prediction is conducted for log(y). I need to get the expected value of y, which is LN distribution, for the new data sets. I need to know the expectation of log(y) and variance of log(y). # mean=mean(prediction) sd=sd(prediction) mean_y=exp(mean+0.5*sd^2) ### formula from Normal to LN ## Is sd(prediction) the correct why to calculate the sigma of the prediction? Or should I just use the value of Residual standard error from summary(lin)? Answer to either question will be appreciated! Thanks Yi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about allocMatrix error message
Moohwan - It appears that you are trying to calculate a 10 by 10 matrix when all you want are the diagonal elements. Here's one approach that might work: s = t(dev)%*%dev/(nr-1) sinv = solve(s) part = sinv%*%t(dev) t2 = dev[,1]*part[1,] + dev[,2]*part[2,] - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Wed, 21 Jul 2010, Moohwan Kim wrote: Dear R family, I faced a technical problem in r coding. #s=t(dev)%*%dev/(nr-1) # dev (100,000 by 2) stands for deviation from the mean #sinv=solve(s) #t2=diag(dev%*%sinv%*%t(dev)) I got an error message at t2 statement: Error in diag(dev %*% si %*% t(dev)) : allocMatrix: too many elements specified Please let me know if there is a way to overcome this problem. best moohwan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Variance of the prediction in the linear regression model (Theory and programming)
Sorry, for the second question. I stated in a wrong way. My aim is the mean and sd of each new observation. # mean=fitted(prediction) ## But I do not know how to get sd for each new observation. Any tips? Thanks Yi On Wed, Jul 21, 2010 at 2:29 PM, Yi liuyi.fe...@gmail.com wrote: Hi, folks, Here are the codes: ## y=1:10 x=c(1:9,1) lin=lm(log(y)~x) ### log(y) is following Normal distribution x=5:14 prediction=predict(lin,newdata=x) ##prediction=predict(lin) ### 1. The codes do not work, and give the error message: Error in eval(predvars, data, env) : numeric 'envir' arg not of length one. But if I use the code after the pound sign, it works. I mean the name of the newdata is x, why it does not work though? 2. Because the prediction is conducted for log(y). I need to get the expected value of y, which is LN distribution, for the new data sets. I need to know the expectation of log(y) and variance of log(y). # mean=mean(prediction) sd=sd(prediction) mean_y=exp(mean+0.5*sd^2) ### formula from Normal to LN ## Is sd(prediction) the correct why to calculate the sigma of the prediction? Or should I just use the value of Residual standard error from summary(lin)? Answer to either question will be appreciated! Thanks Yi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Issues reshaping data
Hello All, I'm having some trouble reshaping my data from wide to long format. I have tried using both the reshape function and package. Although I haven't worked much with the reshape function, I have found the reshape package useful and intuitive for reshaping data from long to wide format. However, going the other way has me stumped with this type of data. My data is set up, roughly, as follows: SUBJECT TRIALWORD1 WORD1.RT WORD2 WORD2.RT ... WORD25.RT 1 1My 100 friend200 ... ... 1 2John's 250 dog 320 ... ... 1 3His 120 waiter400 ... ... 2 1My 100 friend200 ... ... 2 2John's 250 dog 320 ... ... 2 3His 120 waiter400 ... ... What I would like to do is transform it roughly to: Subject TRIAL WORDRT 1 1 My 100 1 1 friend 200 . 2 3 His120 2 3 Waiter 400 The furthest I've gotten with the package is melting the data such that RT is in its own column by using each WORD#.RT as a measure.var. However, I can't get the words to go with them (either in the melted frame or trying to cast the melted data). I tried the reshape function as well, but kept getting an error about duplicate row names. I lost the exact code I was using after closing R, but roughly, I would use Subject as an ID variable, varying = list(wordRTs,words) [where wordRTs are the set of WORD#.RT and words are the set of WORD# columns] , v.names=c(RT,WORD). If anyone has any help, it would be much appreciated! Thanks, Jason -- View this message in context: http://r.789695.n4.nabble.com/Issues-reshaping-data-tp2297853p2297853.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help me with prediction in linear model
Hi: On Wed, Jul 21, 2010 at 8:46 AM, vijaysheegi vijay.she...@gmail.com wrote: Hi R-community, I have the code as follows,i Fitted model as follows lbeer-log(beer_monthly) t-seq(1956,1995.2,length=length(beer_monthly)) #beer_monthly contains 400+ entries This is unnecessary: t2=t^2 beer_fit_parabola=lm(lbeer~t+t2) Try beer_fit_parabola - lm(lbeer ~ poly(t, 2)) instead. poly() is a safer function to use for polynomial regression. See ?poly for the reasons why. Below is not working for me. Please help me in preparing the new data set for the below prediction tpred - data.frame(t = seq(1995, 1998, length = 20)) predict(beer_fit_parabola, newdata = tpred) If that doesn't work, try putting lbeer and t in a data frame and use it as the data = argument of lm() [same model]. Then try predict() again with the same code as above. This shouldn't be necessary, though. Here's a quick made up example: x - 1:10 y - 3 + 2 * x - x^2 + rnorm(10) plot(x, y) mm - lm(y ~ poly(x, 2)) predict(mm, newdata = data.frame(x = c(11, 12))) 1 2 -96.10648 -117.04534 If you look at summary(mm), you'll see why polynomial models are not very useful for prediction, particularly when extrapolating outside the range of the observed data. (Compare the parameter estimates to the function that produced y as a crude check.) A better choice might be a natural or restricted cubic spline to reduce the standard error of prediction. HTH, Dennis predict(beer_fit_parabola,newdata=data.frame(t=seq(1995,1998,length=20),t2=seq(1995,1998,length=20)) #it is listing all 400+ entries ,but not 20 ahead prediction. Thanks In advance for your help -- View this message in context: http://r.789695.n4.nabble.com/Help-me-with-prediction-in-linear-model-tp2297313p2297313.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Command that is conditional upon file retrieval: is it possible?
Hi all, I'm currently working on an R program where I have to access an FTP server to download some of the data I need. However, the people who post up the files I access are at times inconsistent with regards to time posted, if they post at all, etc Here's some of the code I use: library(RCurl) url1 = paste(ftp://user:passw...@a.great.website.com/;, file, num1, .csv, sep = ) data1 = getURL(url1) write(data1, file = paste(inMyFolder, num1, .csv, sep = )) Sometimes this process works perfectly, and sometimes I get an error message like this attached to data1 = getURL(url1): Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) : RETR response: 550 That's because that particular file isn't on the FTP server yet. Now... let's just imagine that there's another way for me to access the file elsewhere, and I can drag it into the working directory with the same name as the file I'm telling R to write immediately after finding it on the FTP server. So here's my question: is it possible to write a command that will write the file if there isn't an error message going along with data1 = getURL(url1), but won't write the file if it can't find it As of right now, if I got the error message, dragged the file into the working directory and ran the program again, R will overwrite my good file with an empty one because in all cases, I'm telling it to write a file with that name that includes the information in data1. Thanks in advance, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Command-that-is-conditional-upon-file-retrieval-is-it-possible-tp2297811p2297811.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.