Re: [R] windows vs. linux code
(Ted Harding) wrote:

There is one MAJOR issue you will have to watch out for, far more likely to turn up than calls like system(). This is that, if you want to have two or more plotting windows in use at the same time, while the first one is automatically opened by the plot() command, you will have to open additional ones explicitly. In Linux, the command is X11() [possibly with parameters, though usually you don't need to bother]. In Windows, it is windows() [ditto].

as far as i understand, you can just use dev.new() instead, if not in all then in most circumstances; it will do the system-dependent stuff for you, making your code less system-dependent.

vQ

I run R on Linux, so use the X11() command. However, if I write a script which would also be run on a Windows system, I write using windows() in the first instance, but with a conditional alias to X11():

  if(length(grep("linux",R.Version()$os))){ windows <- function( ... ) X11( ... ) }

and put this at the beginning of the code file. Then, if the code is run on a Windows machine, the function call windows() does the Windows thing; but if the code is run on Linux then the above test detects that, and defines a function windows() which does the same as X11().

Ted.

On 26-Feb-09 01:25:36, Sherri Heck wrote:

i am asking if, in general, r code can be written on a linux-based system and then run on a windows-based system.

Rolf Turner wrote:

On 26/02/2009, at 2:08 PM, Sherri Heck wrote:

Dear All- I have been given some R code that was written using a Linux OS, but I use Windows-based R. The person that is giving it to me said that it needs to run on a Linux system. Does anyone have any insight and/or can verify this? I haven't yet obtained the code, so I haven't been able to try it yet.

Despite the knowledge, wisdom, insight, skill, good looks, and other admirable characteristics of the members of the R-help list, few of us are skilled in telepathy or clairvoyance.
cheers, Rolf Turner

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 26-Feb-09 Time: 03:58:35 -- XFMail --

-- Wacek Kusnierczyk, MD PhD Email: w...@idi.ntnu.no Phone: +47 73591875, +47 72574609 Department of Computer and Information Science (IDI) Faculty of Information Technology, Mathematics and Electrical Engineering (IME) Norwegian University of Science and Technology (NTNU) Sem Saelands vei 7, 7491 Trondheim, Norway Room itv303 Bioinformatics Gene Regulation Group Department of Cancer Research and Molecular Medicine (IKM) Faculty of Medicine (DMF) Norwegian University of Science and Technology (NTNU) Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway Room 231.05.060

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
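The dev.new() suggestion in this thread can be sketched as a small wrapper. This is only a minimal sketch, not code from any of the posters: open_plot_window is a hypothetical helper name, and the fallback branches simply mirror Ted's X11()/windows() distinction for R versions that do not provide dev.new().

```r
# Hypothetical helper: open an additional plotting window in a portable way.
open_plot_window <- function(...) {
  if (exists("dev.new", mode = "function")) {
    # dev.new() picks the platform's default device itself
    dev.new(...)
  } else if (.Platform$OS.type == "windows") {
    # fallback for old R versions on Windows
    windows(...)
  } else {
    # fallback for old R versions on Unix-alikes
    X11(...)
  }
}
```

A script would then call open_plot_window() wherever it previously called X11() or windows(), so no operating-system test is needed at the top of the file.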
[R] AsciiGridPredict returns error in unionDataJoin
Dear list,

I am using AsciiGridPredict from the yaImpute package to make a spatial prediction of a randomForest model. When I call AsciiGridPredict I get an error message that I do not understand:

  AsciiGridPredict(st.rf.fit, xlist, ylist, xtypes=NULL, rows=NULL, myPredFunc=NULL)
  Rows per dot: 8 Rows to do: 884 ToDo: .. Done:
  Error in unionDataJoin(m1, m2) : row names are requried within all input matrices
  In addition: Warning message:
  In AsciiGridImpute(object, xfiles, outfiles, xtypes = xtypes, lon = lon, : NA's generated due to illegal level(s).

Does someone know what the error means and how I can solve this problem?

Thank you very much in advance,
Frauke
[R] Incorporating cumsum in for loop
Hello,

I have this DF:

c CHR_NR diffdato1 x
 1022 10395 1994-05-30 11.0 39 0 1994 1 1 0 0 24 1 1  24
29100 11377 2003-05-22 17.0 24 0 2003 1 1 0 0 15 1 1  29
29409 11377 2003-06-06 18.8 15 0 2003 1 0 0 0 14 0 1  29
28964 11377 2003-07-04 20.1 14 0 2003 1 1 0 0 17 1 1  59
29422 11377 2003-07-21 32.9 17 0 2003 1 0 0 0 14 0 1  59
28605 11377 2003-08-04 19.9 14 0 2003 1 0 0 0 14 0 1  59
28996 11377 2003-08-18 10.5 14 0 2003 1 0 0 0 14 0 1  59
29932 11377 2003-12-08 20.5 27 0 2003 1 1 0 0 29 1 1  78
30393 11377 2004-01-06 28.6 29 0 2004 1 0 0 0  8 0 1  78
36100 11377 2004-01-14 28.6  8 0 2004 1 0 0 0  8 0 1  78
30847 11377 2004-01-22 19.0  8 0 2004 1 0 0 0  7 0 1  78
34549 11377 2004-01-29 19.0  7 0 2004 1 0 0 0  7 0 1  78
30035 11377 2004-02-05 14.4  7 0 2004 1 0 0 0  7 0 1  78
34550 11377 2004-02-12 14.4  7 0 2004 1 0 0 0 12 0 1  78
42629 11493 2007-05-31 11.8 31 0 2007 1 1 0 0 25 1 1  25
20900 12558 2000-06-30 49.8 38 0 2000 1 1 0 0 27 1 1 118
21618 12558 2000-07-27 51.0 27 0 2000 1 0 0 0 14 0 1 118
22014 12558 2000-08-10 36.6 14 0 2000 1 0 0 0 14 0 1 118
21405 12558 2000-08-24 18.6 14 0 2000 1 0 0 0 14 0 1 118
21790 12558 2000-09-07 21.9 14 0 2000 1 0 0 0 14 0 1 118
22185 12558 2000-09-21 14.4 14 0 2000 1 0 0 0 35 0 1 118
16695 13018 1999-02-09 19.0 33 0 1999 1 1 0 0 14 1 1  14
22315 13018 2000-08-18 21.8 36 0 2000 1 1 0 0 28 1 1  81
22029 13018 2000-09-15 26.2 28 0 2000 1 0 0 0 14 0 1  81
21280 13018 2000-09-29 20.2 14 0 2000 1 0 0 0 14 0 1  81
21738 13018 2000-10-13 10.1 14 0 2000 1 0 0 0 25 0 1  81
24610 13018 2001-09-17 31.3 27 0 2001 1 1 0 0 17 1 1  32
23958 13018 2001-10-04 15.2 17 0 2001 1 0 0 0 15 0 1  32
43479 13168 2007-10-03 15.6 33 0 2007 1 1 0 0 35 1 1  35
44755 13168 2008-04-04 18.3 23 0 2008 1 1 0 0 25 1 1  25
45355 13168 2008-07-04 15.2 36 0 2008 1 1 0 0 32 1 1  66
45540 13168 2008-08-05 10.3 32 0 2008 1 0 0 0 34 0 1  66

And I want it like this for every x-value, illustrated with x=118:

a CHR_NR diffdato diffCHR aar ind10 diffind10 dato63 diffdato63 diffdato1 x akkusum
12558 2000-06-30 49.8 38 0 2000 1 1 0 0 27 1 1 118  27
12558 2000-07-27 51.0 27 0 2000 1 0 0 0 14 0 1 118  41
12558 2000-08-10 36.6 14 0 2000 1 0 0 0 14 0 1 118  55
12558 2000-08-24 18.6 14 0 2000 1 0 0 0 14 0 1 118  69
12558 2000-09-07 21.9 14 0 2000 1 0 0 0 14 0 1 118  83
12558 2000-09-21 14.4 14 0 2000 1 0 0 0 35 0 1 118 118

So I want the vector akkusum, but for the whole dataset. I thought of using a for loop with cumsum (which I have used in the example with x=118
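A cumulative sum within each x group does not need an explicit loop; base R's ave() applies a function group by group. The following is a minimal sketch on a toy subset of the posted rows (the x=118 and x=81 groups). It assumes that the column being accumulated is the one the post labels diffdato1; the header in the post is abbreviated, so that mapping is a guess.

```r
# Toy subset of the posted data: grouping value x and the increment column.
# Assumption: the accumulated column is the one labelled diffdato1.
df <- data.frame(
  x         = c(118, 118, 118, 118, 118, 118, 81, 81, 81, 81),
  diffdato1 = c(27, 14, 14, 14, 14, 35, 28, 14, 14, 25)
)

# ave() splits diffdato1 by x, applies cumsum to each piece,
# and returns the results in the original row order.
df$akkusum <- ave(df$diffdato1, df$x, FUN = cumsum)
print(df)
```

In the posted data the value x=25 occurs under two different CHR_NR values, so on the full data frame it may be safer to group by both, for example ave(df$diffdato1, df$CHR_NR, df$x, FUN = cumsum).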
Re: [R] Problems with ARIMA models?
that's interesting, marie. on his webpage [1], david s. stoffer provides concrete arguments for why the r arima function in the stats package should be considered to produce misleading, if not just wrong results. david (cc:ed for reference) reported the issue in october 2005, bug report #8231 [2], and then again in april 2006, same bug report id [3].

a search through the mailing list's archives reveals that there are only two messages matching the pattern '8231', namely the two mentioned above. that is, there has been *no public response* posted.

after a few lessons, i have learned that you can't demand anything from r developers (which isn't quite accurate: you can demand, but they're likely to ignore you), so it's hard to demand that this issue be appropriately addressed. however, one can, and should, expect that such bug reports, correct or not, are addressed in public, leaving no doubt as to the accuracy of the design and implementation of any functionality in r. responding to bug reports is an important part of the open source culture (in this particular case, it's not really about sources) which r folks do not hesitate to refer to. it is surprising that david's concrete and polite (please prove me wrong) post hasn't received the treatment it deserved. it hasn't received any!

a brief look at some of the respective sources:

  svn log http://svn.r-project.org/R/trunk/src/library/stats/R/arima.R
  svn log http://svn.r-project.org/R/trunk/src/library/stats/src/arima.c

hints that they've been edited mostly by one single developer. if the responsibility, as it seems, is not diffused among many, why wouldn't the author provide a clear and convincing answer to david's comments? as far as i can see, david has sufficient expertise in both the particular field and r in general, not to be treated as a naive user asking dumb questions (see fortune('dumb'), for example). is ignoring concrete critique part of the official strategy? can't see this documented on r's website.
vQ

[1] http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm
[2] http://tolstoy.newcastle.edu.au/R/devel/05/10/2885.html
[3] http://tolstoy.newcastle.edu.au/R/devel/06/04/4756.html

Marie Sivertsen wrote:

Dear R,

I have found a website where they report problems with ARIMA models in R. I ran the examples there and they give the results shown on the website. Does this mean that nothing has been corrected in R? Maybe you have not seen the page, but the author said he contacted you. Here is the URL: http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm

I would like to know your opinion.

Mvh. Marie
Re: [R] Hi, Coding problem
Ssophia wrote:

Hi there,

Below is my code for one homework question. I couldn't come up with a reasonable answer.

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

Could you please help me figure out what the problem with my code is? Thank you.

The question is: code P{X=j} = (1/2)^(j+1) + (1/2)*2^(j-1)/3^j

My code is:

  sim <- function(n.gen){
    urandom <- runif(n.gen)
    sim.vector <- rep(0,n.gen)
    for(j in 1:n.gen){
      i <- 1
      p <- 5/12
      F <- p
      while(urandom[j] >= F){
        p <- p*((1/2)^(i+1)+1/3*(2/3)^i)/((1/2)^i+(1/2)*(2/3)^i)
        F <- F+p
        i <- i+1
      }
      sim.vector[j] <- i
    }
    # output
    sim.vector
  }

The result is:

     1    2    3    4    5    6    7    8   11
  0.37 0.22 0.16 0.13 0.05 0.02 0.03 0.01 0.01

There are always some numbers missing; it should be continuous. Why are 9 and 10 missing?

Thank you,
sophia
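One way to judge whether the gaps at 9 and 10 point to a bug is to evaluate the target probabilities directly and ask how likely it is that a sample contains no 9 (or no 10) at all. The sketch below does only that. It assumes a sample size of 100, which the two-decimal proportions in the post suggest but the post does not state; pmf is a hypothetical helper name.

```r
# Target probability mass function from the question.
pmf <- function(j) (1/2)^(j + 1) + (1/2) * 2^(j - 1) / 3^j

# Exact probabilities for the values 1 to 11.
p <- pmf(1:11)
print(round(p, 4))

# Assumed sample size (not stated in the post).
n <- 100

# Probability that n independent draws contain no 9, and no 10, at all.
prob_no_9  <- (1 - pmf(9))^n
prob_no_10 <- (1 - pmf(10))^n
print(c(no_9 = prob_no_9, no_10 = prob_no_10))
```

If these probabilities are not small, missing values in the tail of a table built from a sample of that size are what one would expect from sampling variability, and a larger n.gen should fill the gaps.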
Re: [R] Strange behavior of savePlot
Christophe Genolini wrote:

Hi all,

I am using savePlot in a loop to save several graphs, but I get some graphs in 553x552 and some others in 1920x1119. How come? My data are almost all the same (same label, same xlim / ylim, almost the same data; only the color changes). I save them in bmp.

Thanks for your help.

Rather than using a Windows device and copying the contents, just use the bmp() device directly if you really want bitmaps.

Uwe Ligges

Christophe
Re: [R] Place independent labels between values on x-axis
T.D.Rudolph wrote:

I conducted a frequency averaging procedure which left me with the data frame below (Bin is an artifact of a cut() procedure and can be either as.character or as.factor):

             Bin     Freq
  1  (-180,-160] 7.904032
  2  (-160,-140] 5.547901
  3  (-140,-120] 4.522542
  4  (-120,-100] 4.784184
  5   (-100,-80] 4.490083
  6    (-80,-60] 4.754268
  7    (-60,-40] 5.597407
  8    (-40,-20] 5.964031
  9      (-20,0] 7.266519
  10      (0,20] 6.947202
  11     (20,40] 6.168730
  12     (40,60] 4.918232
  13     (60,80] 4.638589
  14    (80,100] 4.288087
  15   (100,120] 4.091052
  16   (120,140] 4.451199
  17   (140,160] 5.869740
  18   (160,180] 7.796204

I am able to present the data as a histogram using barplot(), but on the x-axis I would like to see the values that separate (i.e. that are located between) my various bins (i.e. seq(-180,180,20)). Thus 0 would appear directly under the centre line that separates (-20,0] and (0,20], etc.

  barplot(angledist$Freq, xlab="Turning Angle", ylab="Average frequency (%)", ylim=c(0,10))

?names.arg, ?xlim don't seem to do it but I could be wrong.

Tyler

Two comments:

1. In fact you probably want a histogram on your un-cut()-ted data and specify breaks. A barplot seems to be misleading for originally continuous data.

2. You can suppress x-axis labels by specifying the argument xaxt="n" and add your own x-axis later on by a call to axis(), which allows you to specify the tick mark positions.

Uwe Ligges
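Uwe's second suggestion can be sketched as follows. This is a minimal sketch rather than the poster's code: it uses the 18 Freq values from the table above, and it additionally sets space = 0 (an assumption of this sketch, not part of the original call) so that bar i spans exactly the interval from i-1 to i and the bin boundaries fall on the integers 0 to 18.

```r
# Frequencies from the posted table, one value per 20-degree bin.
freq <- c(7.904032, 5.547901, 4.522542, 4.784184, 4.490083, 4.754268,
          5.597407, 5.964031, 7.266519, 6.947202, 6.168730, 4.918232,
          4.638589, 4.288087, 4.091052, 4.451199, 5.869740, 7.796204)

# Bin boundaries that should label the gaps between bars.
breaks <- seq(-180, 180, by = 20)

# With space = 0 and the default width of 1, bar i covers [i - 1, i].
mids <- barplot(freq, space = 0, xaxt = "n",
                xlab = "Turning Angle", ylab = "Average frequency (%)",
                ylim = c(0, 10))

# Put one tick at every bar edge and label it with the bin boundary.
axis(1, at = 0:length(freq), labels = breaks)
```

barplot() returns the bar midpoints (here 0.5, 1.5, and so on), which is a convenient check that the edges really sit on the integers. With the default spacing the edges would have to be computed from the midpoints instead.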
Re: [R] survival::survfit,plot.survfit
Jeff Xu wrote:

I am confused when trying the function survfit. My question is: what does the survival curve given by plot.survfit mean? Is it the survival curve with different covariates at different points, or just the baseline survival curve? For example, I run the following code and get the survival curve:

  library(survival)
  fit <- coxph(Surv(futime,fustat)~resid.ds+rx+ecog.ps,data=ovarian)
  plot(survfit(fit,type="breslow"))
  summary(survfit(fit,type="breslow"))

For the first two failure points, we have s(59|x1)=0.971, s(115|x2)=0.942. How can we guarantee that s(59|x1) is always greater than s(115|x2)? Since s(59|x1)=s_0(59)^exp(\beta'x1) and s(115|x2)=s_0(115)^exp(\beta'x2), we can manipulate covariates to make s(59|x1) < s(115|x2), right? Do I miss anything?

In advance: I'm a beginner in survival analysis, too. But I think I can help you with this.

As far as I understand, plot(survfit(fit)) plots a single survival function evaluated at one fixed set of covariate values (by default the means of the covariates used in the fit), not at each subject's own covariates. Because the covariates are the same at every time point, the curve cannot increase. If you want to see the impact of residual-status=2 you could add something like:

  attach(ovarian)
  ovarian_new <- data.frame(resid.ds=2, rx=mean(rx), ecog.ps=mean(ecog.ps))
  detach()
  plot(survfit(fit, newdata=ovarian_new))

This should give you the survival function for an average patient with residual-status 2.

Regards
Bernhard
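The point about fixed covariates can be checked directly. The sketch below is a minimal example under the assumption that the survival package and its ovarian data set are installed; new_patient is a hypothetical name. It fits the model from the question, asks survfit() for the curve at one explicit covariate pattern, and confirms that the estimated survival never increases over time.

```r
library(survival)

# Cox model from the question, fitted to the ovarian data shipped with survival.
fit <- coxph(Surv(futime, fustat) ~ resid.ds + rx + ecog.ps, data = ovarian)

# One fixed covariate pattern: residual disease status 2, other covariates at their means.
new_patient <- data.frame(resid.ds = 2,
                          rx       = mean(ovarian$rx),
                          ecog.ps  = mean(ovarian$ecog.ps))

# newdata is an argument of survfit(), not of plot().
sf <- survfit(fit, newdata = new_patient)
plot(sf)

# The curve is evaluated at the same covariates at every event time,
# so successive survival estimates cannot go up.
print(all(diff(sf$surv) <= 0))
```

Comparing s(59|x1) with s(115|x2) for two different covariate vectors mixes two different curves, which is why that comparison carries no ordering guarantee.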
Re: [R] Strange behavior of savePlot
Uwe Ligges wrote:

Christophe Genolini wrote:

Hi all,

I am using savePlot in a loop to save several graphs, but I get some graphs in 553x552 and some others in 1920x1119. How come? My data are almost all the same (same label, same xlim / ylim, almost the same data; only the color changes). I save them in bmp.

Thanks for your help.

Rather than using a Windows device and copying the contents, just use the bmp() device directly if you really want bitmaps.

Uwe Ligges

I do not specifically want bitmaps. I use savePlot in a library, so I let the user decide which export format he wants.

Christophe
Re: [R] Importing zoo object (index contains NAs)
On Wed, 25 Feb 2009, Rob Denniker wrote:

Dear list,

I have an irregular time series saved and exported as a zoo object. What is the trick to force zoo to ignore the missing dates when reading it back in? Thanks.

Almost certainly there are no NAs in your index (although we can't say for sure, as Gabor pointed out).

  write.zoo(g, file = "gdata.txt", index.name = "date", append = F, quote = T, sep = ",")
  h <- read.zoo("gdata.txt", sep = ",", format = "Y-%m-%d")
                                                 ^^^

First problem: This should be %Y. Second problem: You need header = TRUE. Thus, I would guess that

  h <- read.zoo("gdata.txt", sep = ",", format = "%Y-%m-%d", header = TRUE)

should do what you want.

Z
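The effect of the missing percent sign can be reproduced with base R alone. This is a minimal sketch with made-up example dates; it assumes that the date strings are ultimately parsed with as.Date() and the given format, which is how a date format string is normally applied.

```r
# Made-up example dates in the layout written to the file.
dates <- c("2009-02-24", "2009-02-25")

# Without the percent sign, "Y" is a literal character that the input
# must contain, so parsing fails and every date becomes NA.
bad <- as.Date(dates, format = "Y-%m-%d")

# With "%Y" the four-digit year is parsed as intended.
good <- as.Date(dates, format = "%Y-%m-%d")

print(bad)
print(good)
```

That would explain an index that appears to be full of NAs after reading, even though the written file is complete.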
Re: [R] Using package ROCR
Thank you very much for the response! The plot(1,1) helped to resolve the first problem. But I am still getting a second error message when running demo(ROCR):

  Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double'

It seems it has something to do with compatibility of S4 objects. My versions of R and the ROCR package are the same as you listed, but it seems something else is missing in my installation.

William Doane wrote:

Responding to question 1... it seems the demo assumes you already have a plot window open.

  library(ROCR)
  plot(1,1)
  demo(ROCR)

seems to work.

For question 2, my environment produces the expected results... plot doesn't generate an error:

* R 2.8.1 GUI 1.27 Tiger build 32-bit (5301)
* OS X 10.5.6
* ROCR 1.0-2

-Wil

wiener30 wrote:

I am trying to use package ROCR to analyze classification accuracy; unfortunately there are some problems right at the beginning.

Question 1) When I try to run the demo I am getting the following error message:

  library(ROCR)
  demo(ROCR)
  if(dev.cur() <= 1) [TRUNCATED]
  Error in get(getOption("device")) : wrong first argument

When I issue the command dev.cur() it returns

  null device
            1

It seems something is wrong with my R environment? Could somebody provide a hint as to what is wrong?

Question 2) When I run the example commands from the manual

  library(ROCR)
  data(ROCR.simple)
  pred <- prediction( ROCR.simple$predictions, ROCR.simple$labels )
  perf <- performance( pred, "tpr", "fpr" )
  plot( perf )

the plot command issues the following error message:

  Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double'

How could this be fixed? Thanks for the support.
[R] (no subject)
Hello,

I'm trying to fit a generalized linear mixed model to estimate diabetes prevalence at US county level. To do this I'm using the glmer() function in package lme4. I can fit relatively simple models (i.e. few covariates) but when expanding the number of covariates I usually encounter the following error message.

  gm8 <- glmer(DIAB05F~AGE+as.factor(SEX)+poolt+poolx+poverty+fastfood+(1|as.factor(diab$fips)), family = binomial(link="logit"), data = diab, doFit=TRUE)
  Error in validObject(.Object) : invalid class "mer" object: Slot Zt must by dims['q'] by dims['n']*dims['s']

In the above, the response is person-level diabetes status as a function of AGE=age, SEX=sex, poolt=average county diabetes prevalence for previous years, poolx=pooled county diabetes prevalence for counties with similar age, sex, race, and income structure, poverty=county poverty rate, fastfood=number of fastfood places per 100,000 people in the county, and a county random effect.

If I leave out fastfood, the model at least gets fitted, although it doesn't converge (yet):

  Warning message:
  In mer_finalize(ans) : false convergence (8)

I would be grateful for any advice on what the problem could be and how to resolve it.

Thanks,
Tanja
Re: [R] Using package ROCR
Responding to question 1... it seems the demo assumes you already have a plot window open.

  library(ROCR)
  plot(1,1)
  demo(ROCR)

seems to work.

For question 2, my environment produces the expected results... plot doesn't generate an error:

* R 2.8.1 GUI 1.27 Tiger build 32-bit (5301)
* OS X 10.5.6
* ROCR 1.0-2

-Wil

wiener30 wrote:

I am trying to use package ROCR to analyze classification accuracy; unfortunately there are some problems right at the beginning.

Question 1) When I try to run the demo I am getting the following error message:

  library(ROCR)
  demo(ROCR)
  if(dev.cur() <= 1) [TRUNCATED]
  Error in get(getOption("device")) : wrong first argument

When I issue the command dev.cur() it returns

  null device
            1

It seems something is wrong with my R environment? Could somebody provide a hint as to what is wrong?

Question 2) When I run the example commands from the manual

  library(ROCR)
  data(ROCR.simple)
  pred <- prediction( ROCR.simple$predictions, ROCR.simple$labels )
  perf <- performance( pred, "tpr", "fpr" )
  plot( perf )

the plot command issues the following error message:

  Error in as.double(y) : cannot coerce type 'S4' to vector of type 'double'

How could this be fixed? Thanks for the support.
Re: [R] Strange behavior of savePlot
Christophe Genolini wrote:

Uwe Ligges wrote:

Christophe Genolini wrote:

Hi all,

I am using savePlot in a loop to save several graphs, but I get some graphs in 553x552 and some others in 1920x1119. How come? My data are almost all the same (same label, same xlim / ylim, almost the same data; only the color changes). I save them in bmp.

Thanks for your help.

Rather than using a Windows device and copying the contents, just use the bmp() device directly if you really want bitmaps.

Uwe Ligges

I do not specifically want bitmaps. I use savePlot in a library, so I let the user decide which export format he wants.

Then, why not let the user choose a proper device? Anyway, the resulting bitmap is copied from the screen device, which means the size is also copied. So if you have a 1600x1200 screen and your windows device is resized to fullscreen, then you will get almost 1600x1200 pixels in your bitmap...

Uwe Ligges

Christophe
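Uwe's advice to write to a file device directly, so that the output size no longer depends on the on-screen window, could look roughly like the sketch below. It is not code from the thread: export_plots and its arguments are hypothetical names, each plot is assumed to be available as a function that draws it, and whether bmp() is usable depends on the capabilities of the local R build.

```r
# Hypothetical helper: write each plot to its own file with a fixed size.
# 'device' is any file device function taking a file name, width and height.
export_plots <- function(plot_funs, dir, device = grDevices::bmp, ext = "bmp",
                         width = 800, height = 600) {
  files <- file.path(dir, sprintf("plot%03d.%s", seq_along(plot_funs), ext))
  for (i in seq_along(plot_funs)) {
    # Open the file device with an explicit size instead of copying the screen.
    device(files[i], width = width, height = height)
    # Draw the plot, and close the device even if drawing fails.
    tryCatch(plot_funs[[i]](), finally = dev.off())
  }
  invisible(files)
}
```

A call such as export_plots(list(function() plot(1:10)), tempdir()) would then produce files of the requested pixel size regardless of how the user has resized any window, and the user's choice of export format maps onto the device and ext arguments.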
[R] R communication
Dear all,

Imagine that you have a small LAN with two computers. I would like to run R on both, and if possible to run computations from one computer on the other one. The TCP/IP protocol would be preferable. Which package would you use for that? I would be very glad if you could also provide me with some lines of code, e.g., create a matrix X on computer 1, transfer its value to the second computer, make some calculation, and get the value back.

Thanks for your help,
David
Re: [R] Strange behavior of savePlot
Uwe Ligges wrote:

Christophe Genolini wrote:

Uwe Ligges wrote:

Christophe Genolini wrote:

Hi all,

I am using savePlot in a loop to save several graphs, but I get some graphs in 553x552 and some others in 1920x1119. How come? My data are almost all the same (same label, same xlim / ylim, almost the same data; only the color changes). I save them in bmp.

Thanks for your help.

Rather than using a Windows device and copying the contents, just use the bmp() device directly if you really want bitmaps.

Uwe Ligges

I do not specifically want bitmaps. I use savePlot in a library, so I let the user decide which export format he wants.

Then, why not let the user choose a proper device?

Because I am programming a graphical interface that lets the user choose the graphs he wants to export (that can be one but also 50). Then he dynamically sets the parameters for all the graphs he wants to export. Then all the graphs are exported at once.

Anyway, the resulting bitmap is copied from the screen device, which means the size is also copied. So if you have a 1600x1200 screen and your windows device is resized to fullscreen, then you will get almost 1600x1200 pixels in your bitmap...

Ok, the strange behavior was occurring because I changed the size of the window during the export...

Thanks a lot
Christophe

Uwe Ligges

Christophe
Re: [R] Modifying Names from (x,y] into x
Hi,

I got this problem once, and Prof. Ripley kindly added an example in the help page of ?cut,

  aaa <- c(1,2,3,4,5,2,3,4,5,6,7)
  ## one way to extract the breakpoints
  labs <- levels(cut(aaa, 3))
  cbind(lower = as.numeric( sub("\\((.+),.*", "\\1", labs) ),
        upper = as.numeric( sub("[^,]*,([^]]*)\\]", "\\1", labs) ))

       lower upper
  [1,] 0.994  3.00
  [2,] 3.000  5.00
  [3,] 5.000  7.01

Hope this helps,

baptiste

On 26 Feb 2009, at 02:55, Gundala Viswanath wrote:

Hi,

I have the following data that looks like this:

  names(dat)
  [1] (-2329,-2319] (-1399,-1389] (-669.4,-659.4]

How can I modify those names into just this?

  [1] -2329 -1399 -669.4

- Gundala Viswanath
Jakarta - Indonesia

_ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
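Applied to the names from the question, the same pattern from ?cut gives the requested lower bounds directly. This is a minimal sketch; the vector labs simply retypes the three names shown in the post.

```r
# The interval labels from the question.
labs <- c("(-2329,-2319]", "(-1399,-1389]", "(-669.4,-659.4]")

# Keep what follows the opening parenthesis up to the comma.
lower <- as.numeric(sub("\\((.+),.*", "\\1", labs))

# Keep what lies between the comma and the closing bracket.
upper <- as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs))

print(lower)
print(upper)
```

Assigning names(dat) <- lower would then rename the object, assuming dat has exactly these three names.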
Re: [R] T-test by groups
Ingrid Tohver wrote:

I would like to run a t-test within a by group function. My dataset, error, is organized as follows (I have 133 Sites):

  Site week Dataset Region  lat_map  long_map mean_tsim diff20 diff40 diff80
  ALFI   15   USACE     UC 48.15625 -117.0938      8.87   1.34   1.90   2.98
  ALFI   16   USACE     UC 48.15625 -117.0938     10.28   0.57   1.08   2.27
  ALFI   17   USACE     UC 48.15625 -117.0938     11.08   0.74   1.30   2.52
  ALFI   18   USACE     UC 48.15625 -117.0938     12.23   0.42   1.11   2.42
  ALFI   19   USACE     UC 48.15625 -117.0938     13.19   1.00   1.73   3.14
  ALFI   20   USACE     UC 48.15625 -117.0938     14.31   1.77   2.62   3.78

I am interested in running the t-test by the Site index. My code looks like this:

  t_test <- by(error, error['Site'], function(dat) t.test(subset(error$diff20), subset(error$diff80), data=dat))

This code runs the t-test, but over the whole dataset without discriminating by Site, so each Site's result is the same. Could someone help determine a better approach or why mine is not working?

I guess you want

  by(error, error['Site'], function(dat) t.test(dat$diff20, dat$diff40))

Uwe Ligges

Thank you,
Ingrid
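The difference between the two calls is that the function must use its argument dat, the per-Site subset, rather than the full data frame error. A minimal sketch follows. The ALFI values are taken from the table above; the second site, SITE2, is invented purely so that there are two groups to compare; and the sketch tests diff20 against diff80 as in the original question.

```r
# ALFI rows are from the post; SITE2 is made-up illustration data.
error <- data.frame(
  Site   = rep(c("ALFI", "SITE2"), each = 6),
  diff20 = c(1.34, 0.57, 0.74, 0.42, 1.00, 1.77,
             1.0, 1.2, 0.9, 1.1, 1.3, 0.8),
  diff80 = c(2.98, 2.27, 2.52, 2.42, 3.14, 3.78,
             1.1, 1.0, 1.2, 0.9, 1.4, 1.0)
)

# 'dat' is the subset of rows for one Site; using it is what makes
# the test differ from site to site.
t_tests <- by(error, error["Site"], function(dat) t.test(dat$diff20, dat$diff80))

# Collect one p-value per site.
p_values <- sapply(t_tests, function(tt) tt$p.value)
print(p_values)
```

If the two columns are measurements on the same weeks, t.test(..., paired = TRUE) may be the more appropriate call, but that is a modelling decision the thread does not address.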
Re: [R] (no subject)
Dear Tanja,

R-Sig-Mixed-models is a better list for questions about lme4 and nlme. There you are much more likely to get an answer from the mixed models specialists.

First of all I would recommend that you write the random effect as (1|fips) instead of (1|as.factor(diab$fips)). You will run into trouble when you change the dataset, as only the random effect explicitly refers to the dataset.

I can think of two things that may cause the errors: a lack of data points or an overspecified model. If you have a lot of data points then you should have a look at the correlations between the covariates. Highly correlated covariates can lead to unstable models, with false convergences as a result.

HTH,

Thierry

PS An informative subject line is recommended by the posting guide.

ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Tanja Srebotnjak
Sent: Thursday 26 February 2009 9:17
To: r-help@r-project.org
Subject: [R] (no subject)

Hello,

I'm trying to fit a generalized linear mixed model to estimate diabetes prevalence at US county level. To do this I'm using the glmer() function in package lme4. I can fit relatively simple models (i.e. few covariates) but when expanding the number of covariates I usually encounter the following error message.

  gm8 <- glmer(DIAB05F~AGE+as.factor(SEX)+poolt+poolx+poverty+fastfood+(1|as.factor(diab$fips)), family = binomial(link="logit"), data = diab, doFit=TRUE)
  Error in validObject(.Object) : invalid class "mer" object: Slot Zt must by dims['q'] by dims['n']*dims['s']

In the above, the response is person-level diabetes status as a function of AGE=age, SEX=sex, poolt=average county diabetes prevalence for previous years, poolx=pooled county diabetes prevalence for counties with similar age, sex, race, and income structure, poverty=county poverty rate, fastfood=number of fastfood places per 100,000 people in the county, and a county random effect.

If I leave out fastfood, the model at least gets fitted, although it doesn't converge (yet):

  Warning message:
  In mer_finalize(ans) : false convergence (8)

I would be grateful for any advice on what the problem could be and how to resolve it.

Thanks,
Tanja
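Both of Thierry's suggestions can be tried before refitting the model. The sketch below uses synthetic toy data only, because the diab data are not available here: the column names follow the post, but every value is simulated, and fastfood is deliberately constructed to be almost collinear with poverty so that the check has something to find.

```r
set.seed(1)
n <- 200

# Synthetic stand-in for the county covariates (all values are made up).
poverty <- rnorm(n, mean = 15, sd = 5)
toy <- data.frame(
  fips     = sample(sprintf("%05d", 1001:1010), n, replace = TRUE),
  poolt    = rnorm(n),
  poolx    = rnorm(n),
  poverty  = poverty,
  fastfood = 2 * poverty + rnorm(n, sd = 1)  # nearly collinear by construction
)

# Make the grouping variable a factor inside the data frame, so the
# model formula can simply say (1 | fips).
toy$fips <- factor(toy$fips)

# Pairwise correlations between the continuous covariates.
cor_matrix <- cor(toy[, c("poolt", "poolx", "poverty", "fastfood")])
print(round(cor_matrix, 2))
```

A correlation close to 1 or -1 between two covariates, as between poverty and fastfood in this toy example, would be a reason to drop or combine one of them before fitting.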
Re: [R] Incorporating cumsum in for loop
joe1985 wrote: Hello Hello i have this DF: c CHR_NR diffdato1 x 1022 10395 1994-05-3011.0 39 0 1994 1 1 0 024 11 24 29100 11377 2003-05-2217.0 24 0 2003 1 1 0 015 11 29 29409 11377 2003-06-0618.8 15 0 2003 1 0 0 014 01 29 28964 11377 2003-07-0420.1 14 0 2003 1 1 0 017 11 59 29422 11377 2003-07-2132.9 17 0 2003 1 0 0 014 01 59 28605 11377 2003-08-0419.9 14 0 2003 1 0 0 014 01 59 28996 11377 2003-08-1810.5 14 0 2003 1 0 0 014 01 59 29932 11377 2003-12-0820.5 27 0 2003 1 1 0 029 11 78 30393 11377 2004-01-0628.6 29 0 2004 1 0 0 0 8 01 78 36100 11377 2004-01-1428.68 0 2004 1 0 0 0 8 01 78 30847 11377 2004-01-2219.08 0 2004 1 0 0 0 7 01 78 34549 11377 2004-01-2919.07 0 2004 1 0 0 0 7 01 78 30035 11377 2004-02-0514.47 0 2004 1 0 0 0 7 01 78 34550 11377 2004-02-1214.47 0 2004 1 0 0 012 01 78 42629 11493 2007-05-3111.8 31 0 2007 1 1 0 025 11 25 20900 12558 2000-06-3049.8 38 0 2000 1 1 0 027 11 118 21618 12558 2000-07-2751.0 27 0 2000 1 0 0 014 01 118 22014 12558 2000-08-1036.6 14 0 2000 1 0 0 014 01 118 21405 12558 2000-08-2418.6 14 0 2000 1 0 0 014 01 118 21790 12558 2000-09-0721.9 14 0 2000 1 0 0 014 01 118 22185 12558 2000-09-2114.4 14 0 2000 1 0 0 035 01 118 16695 13018 1999-02-0919.0 33 0 1999 1 1 0 014 11 14 22315 13018 2000-08-1821.8 36 0 2000 1 1 0 028 11 81 22029 13018 2000-09-1526.2 28 0 2000 1 0 0 014 01 81 21280 13018 2000-09-2920.2 14 0 2000 1 0 0 014 01 81 21738 13018 2000-10-1310.1 14 0 2000 1 0 0 025 01 81 24610 13018 2001-09-1731.3 27 0 2001 1 1 0 017 11 32 23958 13018 2001-10-0415.2 17 0 2001 1 0 0 015 01 32 43479 13168 2007-10-0315.6 33 0 2007 1 1 0 035 11 35 44755 13168 2008-04-0418.3 23 0 2008 1 1 0 025 11 25 45355 13168 2008-07-0415.2 36 0 2008 1 1 0 032 11 66 45540 13168 2008-08-0510.3 32 0 2008 1 0 0 034 01 66 And I want like this for every x-value, illustrated with x=118: a CHR_NR diffdato diffCHR aar ind10 diffind10 dato63 diffdato63 diffdato1 x akkusum 12558 2000-06-3049.8 38 0 2000 1 1 0 027 11 118 27 12558 2000-07-2751.0 27 0 2000 1 0 0 014 01 
118 41 12558 2000-08-1036.6 14 0 2000 1 0 0 014 01 118 55 12558 2000-08-2418.6 14 0 2000 1 0 0 014 01 118 69 12558 2000-09-0721.9 14 0 2000 1 0 0 014 01 118 83 12558 2000-09-2114.4 14 0 2000 1 0 0 035 01 118 118 So I want the vector akkusum just for the whole dataset, i thought of using af for loop with cumsum (which i have used in the example
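A running total within groups does not need an explicit for loop. The sketch below assumes that akkusum is the cumulative sum of diffdato1 within each combination of CHR_NR and x; that reading is inferred from the x = 118 example (27, 41, 55, 69, 83, 118) and is not stated in the post. The small data frame is a hypothetical excerpt with only the three columns that matter here.

```r
# Hypothetical excerpt of the poster's data: only the grouping columns
# (CHR_NR, x) and the increment column (diffdato1) are kept.
df <- data.frame(
  CHR_NR    = c(12558, 12558, 12558, 12558, 12558, 12558, 13018, 13018),
  x         = c(118, 118, 118, 118, 118, 118, 32, 32),
  diffdato1 = c(27, 14, 14, 14, 14, 35, 17, 15)
)

# ave() applies FUN within each group and returns a vector in the
# original row order, so it can be assigned straight back as a column.
df$akkusum <- ave(df$diffdato1, df$CHR_NR, df$x, FUN = cumsum)

print(df)
```

Because ave() keeps the row order, the rows do not have to be sorted by group first, only ordered correctly within each group.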
Re: [R] T-test by groups
Uwe Ligges wrote:

Ingrid Tohver wrote:

I would like to run a t-test within a by-group function. My dataset, error, is organized as follows (I have 133 Sites):

    Site week Dataset Region  lat_map  long_map mean_tsim diff20 diff40 diff80
    ALFI   15   USACE     UC 48.15625 -117.0938      8.87   1.34   1.90   2.98
    ALFI   16   USACE     UC 48.15625 -117.0938     10.28   0.57   1.08   2.27
    ALFI   17   USACE     UC 48.15625 -117.0938     11.08   0.74   1.30   2.52
    ALFI   18   USACE     UC 48.15625 -117.0938     12.23   0.42   1.11   2.42
    ALFI   19   USACE     UC 48.15625 -117.0938     13.19   1.00   1.73   3.14
    ALFI   20   USACE     UC 48.15625 -117.0938     14.31   1.77   2.62   3.78

I am interested in running the t-test by the Site index. My code looks like this:

    t_test <- by(error, error['Site'], function(dat) t.test(subset(error$diff20), subset(error$diff80), data=dat))

This code runs the t-test, but over the whole dataset without discriminating by Site, so each Site's result is the same. Could someone help determine a better approach, or explain why mine is not working?

I guess you want

    by(error, error['Site'], function(dat) t.test(dat$diff20, dat$diff40))

...and most likely also paired=TRUE. Notice that the data= argument only works for the formula interface, which is not (yet) defined for paired data, but only for the y~group type two-sample case.

Uwe Ligges

Thank you,
Ingrid

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
(p.dalga...@biostat.ku.dk)
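Putting the two suggestions in this thread together, a minimal sketch might look like the following. The data frame is simulated and only mimics the column names from the question (Site, week, diff20, diff80); the site labels other than ALFI are made up. The key point is that the function uses its own argument dat, not the full data frame, so each site is tested separately.

```r
set.seed(1)

# Simulated stand-in for the poster's 'error' data frame:
# three sites with six weekly observations each.
error <- data.frame(
  Site   = rep(c("ALFI", "BETA", "GAMM"), each = 6),
  week   = rep(15:20, times = 3),
  diff20 = rnorm(18, mean = 1),
  diff80 = rnorm(18, mean = 3)
)

# One paired t-test per site; 'dat' is the subset of rows for that site.
t_test <- by(error, error["Site"], function(dat)
  t.test(dat$diff20, dat$diff80, paired = TRUE))

# Collect the per-site p-values into a plain vector.
p_values <- sapply(t_test, function(tt) tt$p.value)
print(p_values)
```

Whether a paired test is appropriate depends on whether diff20 and diff80 are measured on the same site-weeks, as they appear to be in the posted excerpt.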
Re: [R] biplot.princomp - changing score labels
Thanks for the hints! They seem to help me.

Axel

Héctor Villalobos wrote:

Perhaps this may help you. Regards

    data(iris3)
    ir <- rbind(iris3[ , , 1], iris3[ , , 2], iris3[ , , 3])
    ir.pca <- princomp(ir)
    biplot(ir.pca)

    # Redo the biplot
    # Compute the factor used to rescale scores and eigenvectors
    lambda <- ir.pca$sdev[1:2] * sqrt(ir.pca$n.obs)
    scores <- t( t(ir.pca$scores[ , c(1, 2)]) / lambda )
    variables <- t( t(ir.pca$loadings[ , c(1, 2)]) * lambda )

    x11()
    plot(scores, type="p", xlim=c(-0.22, 0.24), ylim=c(-0.2, 0.2),
         pch=c(rep(16, 50), rep(21, 50), rep(14, 50)),
         col=c(rep("red", 50), rep("blue", 50), rep("green", 50)))
    abline(v=0, h=0, lty=3)
    par(new=TRUE)
    plot(variables, type="n", xaxt="n", yaxt="n", xlim=c(-22, 24), ylim=c(-20, 20))
    arrows(0, 0, variables[ , 1], variables[ , 2], len=0.1, col="red")
    text(2*variables, rownames(variables), col="red", xpd=TRUE)
    axis(3); axis(4)

On 25 Feb 2009 at 9:52, Axel Strauß wrote:

Prof Brian Ripley wrote:

On Tue, 24 Feb 2009, Axel Strauß wrote:

OK, the one thing I figured out: it should be like

    biplot(test.pca, cex=c(2,1), col=c("red","green")...

to change size, colours etc. separately. But I still don't know how to change the labels of the observations to symbols properly.

That's not part of the design of the function, so just make a copy and edit to meet your fancies.

The designer of biplot.princomp.

The idea behind my question was actually not a styling one but to provide additional information in the graph. My observations have different species richness, and I wanted to adapt the symbol size to the number of species to show the change of species richness along the PCs. Anyway, thanks for the comment, and for designing biplot.princomp.

Axel

--
Héctor Villalobos hvill...@ipn.mx
CICIMAR - IPN
A.P. 592. Col. Centro, La Paz, Baja California Sur, MÉXICO. 23000
Tels. (+52 612) 122 53 44; 123 46 58; 123 47 34 ext. 82425
Fax. (+52 612) 122 53 22

--
Gravity is a habit that is hard to shake off.
Terry Pratchett
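For the stated goal of scaling symbol size by species richness, one possible sketch is to plot the scores directly and pass a per-observation cex vector. The richness values below are random placeholders, and the linear mapping to cex between 0.5 and 2.5 is an arbitrary choice for illustration, not something from the thread.

```r
ir <- rbind(iris3[, , 1], iris3[, , 2], iris3[, , 3])
ir.pca <- princomp(ir)

# Hypothetical species richness, one value per observation.
set.seed(42)
richness <- sample(1:20, nrow(ir), replace = TRUE)

# Map richness linearly onto symbol sizes between 0.5 and 2.5.
cex_pts <- 0.5 + 2 * (richness - min(richness)) / diff(range(richness))

# Plot the first two principal component scores; plot() accepts a
# vector for cex, so each point gets its own size.
plot(ir.pca$scores[, 1:2], pch = 21, cex = cex_pts,
     bg = rep(c("red", "blue", "green"), each = 50),
     xlab = "PC1", ylab = "PC2")
abline(v = 0, h = 0, lty = 3)
```

The loading arrows from the rescaling code above could be overlaid on this plot in the same way as in that example.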
[R] survival::predict.coxph
Hi,

I just came across another question concerning predict.coxph. Terry Therneau states in "A Package for Survival Analysis in S" that

    term <- predict(fit, type="terms")

yields predicted values for the individual components of the linear predictor X*beta.

My Cox model looks like:

           coef exp(coef) se(coef)      z      p
    S0   -3.106  4.48e-02     2.88 -1.080 0.2800
    S1    6.365  5.81e+02     5.20  1.224 0.2200
    S2  -14.009  8.24e-07     5.32 -2.636 0.0084
    [..]

The first line of my input data looks like:

           S0      S1      S2      S3
    1  -1.030 -0.9500 -1.0950 -1.0700

So I thought the first line of term should be calculated as

    -1.030*-3.106, -0.9500*6.365, -1.0950*-14.009 [..]

which is 3.20, -6.04, 15.34.

Actually the first line of term contains:

               S0          S1          S2
    1  3.36737346 -6.36032595 15.73846097

which is quite similar but not the same. Can anyone shed some light on this?

I guess there must be tons of literature on this topic, but I find it quite hard to find the good one. I'd also appreciate literature on how to choose the appropriate number of covariates for a Cox model and on overfitting.

Regards
Bernhard
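One plausible explanation for the small discrepancy, offered here as an assumption and not confirmed anywhere in this thread, is that predict(fit, type="terms") returns centred terms, that is coef * (x - reference value) instead of coef * x. The sketch below checks this on the lung data shipped with the survival package. It assumes that fit$means holds the centring values, which is the case in the versions of survival I am aware of.

```r
library(survival)

# Use complete cases so that the model matrix and predictions line up.
dat <- na.omit(lung[, c("time", "status", "age", "ph.karno")])
fit <- coxph(Surv(time, status) ~ age + ph.karno, data = dat)

terms_pred <- predict(fit, type = "terms")

# Rebuild the terms by hand: subtract the stored reference values
# from each covariate column, then multiply by the coefficients.
x        <- model.matrix(fit)
centred  <- sweep(x, 2, fit$means)
manual   <- sweep(centred, 2, coef(fit), "*")

# Uncentred products coef * x, which is what the question computed.
uncentred <- sweep(x, 2, coef(fit), "*")

head(cbind(terms_pred, manual, uncentred), 3)
```

If the assumption holds, the first two pairs of columns agree and the uncentred products differ from them by a constant per covariate.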
Re: [R] Moving Average
See rollapply in zoo, or filter or embed in the core of R.

On Thu, Feb 26, 2009 at 7:07 AM, mau...@alice.it wrote:

I am looking for some help with removing low-frequency components from a signal, through a Moving Average on a sliding window. I understand this is a smoothing procedure, which I have never done in my life before .. sigh.

I searched the R archives and found rollmean, MovingAverages {TTR}, SymmetricMA. None of the above-mentioned functions seems to accept the smoothing polynomial order and the sliding window width as input parameters. Maybe I am missing something.

I wonder whether there are some building blocks in R, if not even a function which does it all (I do not expect that much, though). Even some literature references and/or tutorials are very welcome.

Thank you so much,
Maura
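As a small illustration of the two base R functions named in the reply, the sketch below computes a 3-point centred moving average with stats::filter and with embed. The input vector and the window length are made up for the example. zoo::rollapply(x, k, mean) is the zoo counterpart, but it is not called here so that the example needs no extra package.

```r
x <- c(2, 4, 6, 8, 10, 12, 14)   # toy signal
k <- 3                           # window length (odd, so it can be centred)

# stats::filter with equal weights and sides = 2 gives a centred moving
# average; the first and last (k - 1) / 2 values are NA.
ma_filter <- stats::filter(x, rep(1 / k, k), sides = 2)

# embed() lays out each window of length k as one row, so the row means
# are the moving averages (without the NA padding).
ma_embed <- rowMeans(embed(x, k))

print(as.numeric(ma_filter))
print(ma_embed)
```

Unequal weights can be passed to stats::filter in the same way, which is how a weighted or polynomial-based window would be applied.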
[R] ggplot2: labels points with different colors or idnumbers
Dear list,

Using ggplot2 I could produce both a boxplot and points in the same plot, but instead of points I would like to label the different subjects with different colors or their id numbers. Is there a way to do it?

Also, how can I put three plots on the same graph with ggplot2? mfrow=c(3,1) did not do the job.

    dat
       group time   id  freq
    1      1   00 0018  5.21
    2      1   00 3026  3.13
    3      1   00 5030  5.04
    4      1   00 5108  3.23
    5      1   00 5152  3.97
    6      1   00 6080  0.16
    7      1   01 0018  4.89
    8      1   01 3026  6.58
    9      1   01 5030  7.42
    10     1   01 5108 10.10
    11     1   01 5152  3.74
    12     1   01 6080  0.81

    library(ggplot2)
    qplot(factor(dat$time), dat$freq, dat, geom=c("boxplot","jitter"), ylab=names(dat[,4]), xlab="time")
Re: [R] ggplot2: labels points with different colors or idnumbers
Dear Tom,

I think you had better switch to ggplot() instead of qplot().

    library(ggplot2)
    dat <- expand.grid(id = 1:10, time = 1:3, group = 1:3)
    dat$Freq <- rnorm(nrow(dat), mean = dat$id * dat$time)
    ggplot(dat, aes(x = factor(time), y = Freq)) + geom_boxplot() + geom_jitter(aes(colour = factor(id)))

But you can do it with qplot():

    qplot(factor(dat$time), dat$Freq, dat, geom = "boxplot") + geom_jitter(aes(colour = factor(dat$id)))

IMO the ggplot version gives more readable code, especially when you are doing complex plots.

Splitting plots according to a factor is easy with facet_grid() or facet_wrap().

    dat <- expand.grid(id = 1:10, time = 1:3, Group = 1:3)
    dat$Freq <- rnorm(nrow(dat), mean = dat$id * dat$time)
    ggplot(dat, aes(x = factor(time), y = Freq)) + geom_boxplot() + geom_jitter(aes(colour = factor(id))) + facet_grid(. ~ Group)

If you want three different plots, then you will need to do some reading on viewports, grobs and so on. The ggplot2 book on Hadley's website gives a good intro.

HTH,

Thierry
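The reply above covers colouring the points. For the other half of the question, labelling each subject with its id number, one possible sketch replaces the jittered points with geom_text. The data are made up to resemble the posted structure (id, time, freq), and the jitter width is an arbitrary choice.

```r
library(ggplot2)

# Made-up data resembling the question: three subjects, two time points.
set.seed(5)
dat <- expand.grid(id = c("0018", "3026", "5030"), time = c("00", "01"))
dat$freq <- round(runif(nrow(dat), 0, 10), 2)

# Boxplot per time point, with each observation drawn as its id number,
# coloured by subject and jittered horizontally to reduce overlap.
p <- ggplot(dat, aes(x = time, y = freq)) +
  geom_boxplot(outlier.shape = NA) +
  geom_text(aes(label = id, colour = id),
            position = position_jitter(width = 0.2, height = 0))

print(p)
```

Setting outlier.shape = NA hides the boxplot's own outlier points so that each observation appears only once, as its label.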
[R] kmeans: invalid length argument
Dear R Experts,

I am running a cluster analysis using kmeans and have come across an error to which I am unable to find a solution. First, let me describe the problem.

THE R CODE IS:

    # NRM is a 100 x 100 numerical matrix
    infile = 't:\\NRM\\NRM'
    groups = 7
    outfile = 't:\\NRM\\cluster.groups'
    print( paste(infile, groups, outfile, sep=' '))
    pairs <- read.table(file=infile, header=TRUE, sep='')
    names(pairs) <- row.names(pairs)
    dist.pairs <- dist(pairs)
    clust <- kmeans(dist.pairs, groups)
    write.table(clust$cluster, file=outfile, quote=FALSE)

THE ERROR IS:

    Error in vector("integer", length) : invalid 'length' argument
    Calls: kmeans -> do_one -> switch -> integer -> vector
    Execution halted

I ran this code on a WindowsXP R6.2.6 host and it ran fine with acceptable results. However, the error occurs when I run it on CentOS (redhat 4.1.1-52) R2.7.2.

The following search queries yielded no pertinent results from the web (Google), Google groups, the R-help archives (Nabble and Namazu), and the R-FAQ:

    1. kmeans
    2. invalid length argument
    3. vector

?vector says that the length argument must be a non-negative integer. But I don't know how to access that call to vector to see what is actually being used as an argument.

Any help on this matter will be greatly appreciated. Thank you for your time.

Lowell

---
Lowell Gould, Ph.D.
Smithfield Premium Genetics
316 W. Charity Rd.
Rose Hill, NC 28458
v. 910.282.4292
f. 910.289.6466
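The cause of the error is not established in this message, so the following is only a sketch of two conventional ways to obtain 7 groups, not a diagnosis. kmeans is documented as taking a numeric matrix of data, whereas a "dist" object is the natural input for hierarchical clustering with hclust. The matrix below is random and merely stands in for NRM.

```r
set.seed(7)

# Random 100 x 100 matrix as a hypothetical stand-in for NRM.
nrm <- matrix(rnorm(100 * 100), nrow = 100)
groups <- 7

# Option 1: k-means on the data matrix itself (rows are the objects).
km <- kmeans(nrm, centers = groups, nstart = 5)

# Option 2: hierarchical clustering on the distance object, then cut
# the tree into the requested number of groups.
hc <- hclust(dist(nrm))
hc_groups <- cutree(hc, k = groups)

table(km$cluster)
table(hc_groups)
```

To see which value reaches vector() inside kmeans, running traceback() right after the error in an interactive session, or setting options(error = recover), lets you inspect the call frames.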
Re: [R] statistical significance of accuracy increase in classification
Do you know about any good reference that discusses kappa for classification and maybe CI for kappa???

I don't, but googling on kappa and confusion matrix etc. should get you there.

Kappa works very well when the true classes are skewed. For example, if 10% of your samples are class A and 90% class B, you can optimize accuracy by calling everything B.

Max
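The skewed-class example can be worked through numerically. The sketch below builds the confusion matrix for 100 samples (10 of class A, 90 of class B) with every sample predicted as B, then computes accuracy and Cohen's kappa from the usual formula kappa = (p_obs - p_exp) / (1 - p_exp). The sample size of 100 is an arbitrary choice for illustration.

```r
# 10% class A, 90% class B; the classifier calls everything "B".
truth <- factor(rep(c("A", "B"), times = c(10, 90)), levels = c("A", "B"))
pred  <- factor(rep("B", 100), levels = c("A", "B"))

tab <- table(pred, truth)
n   <- sum(tab)

# Observed agreement (accuracy) and agreement expected by chance
# from the row and column margins.
p_obs <- sum(diag(tab)) / n
p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2

cohen_kappa <- (p_obs - p_exp) / (1 - p_exp)

print(tab)
cat("accuracy:", p_obs, " kappa:", cohen_kappa, "\n")
```

Here accuracy is 0.9 while kappa is 0, because all of the observed agreement is what the class margins alone would produce.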
Re: [R] Moving Average
I saw Gabor's reply but have a clarification to request. You say you want to remove low-frequency components, but then you request smoothing functions. The term "smoothing" implies removal of the high-frequency components of a series.

If smoothing really is your goal, then additional R resources would be smooth.spline, loess (or lowess), ksmooth, or smoothing terms in regressions. Venables and Ripley have quite a few worked examples of these in MASS.

--
David Winsemius
Re: [R] C-index for models fitted using start, stop in Surv?
One of my colleagues has written a technical report on how to do this, but I have not yet implemented it in the survival package.

Terry Therneau

http://mayoresearch.mayo.edu/mayo/research/biostat/techreports.cfm

#80 Concordance for Survival Time Data: Fixed and Time-Dependent Covariates and Possible Ties in Predictor and Time

Concordance, or synonymously the C-statistic, is a valuable measure of model discrimination in analyses involving survival time data. This report provides a definition of concordance in the case of survival data, allowing for time-dependent covariates with the counting process data representation and accounting for ties in the covariates and times.

Walter K Kremers and The William J. von Liebig Transplant Center [April 2007]
Re: [R] survival::predict.coxph
You are mostly correct. Because of the censoring issue, there is no good estimate of the mean survival time. The survival curve either does not go to zero, or gets very noisy near the right-hand tail (large standard error); a smooth parametric estimate is what is really needed to deal with this.

For this reason the mean survival, though computed (but see the survfit.print.mean option, help(print.survfit)), is not highly regarded. It is not an option in predict.coxph.

Terry T.

---- begin included message ----

Hi,

If I got it right, then the survival time we expect for a subject is the integral over the subject-specific survival function from 0 to t_max.

If I have a trained Cox model and want to make a prediction of the survival time for a new subject, I could use survfit(coxmodel, newdata=newSubject) to estimate a new survival function, which I have to integrate thereafter.

Actually I thought predict(coxmodel, newSubject) would do this for me, but I'm confused about which type I have to declare. If I understand the little pieces of documentation right, then none of the available types is exactly the predicted survival time. I think I have to use the mean survival time of the baseline function times exp(the result of type linear predictor). Am I right?
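The integration described in the included message can be sketched as follows, bearing in mind the caveat above that this quantity is a restricted mean (the area under the curve up to the last observed time) and not a reliable mean survival time. The model, the lung data and the covariate values of the new subject are all chosen only for illustration.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Hypothetical new subject.
new_subject <- data.frame(age = 60, sex = 1)
sf <- survfit(fit, newdata = new_subject)

times <- sf$time
surv  <- as.numeric(sf$surv)

# The estimated curve is a step function: it equals 1 before the first
# time point and surv[i] from times[i] up to times[i + 1]. The area
# under it from 0 to max(times) is therefore a sum of rectangles.
restricted_mean <- sum(diff(c(0, times)) * c(1, head(surv, -1)))

cat("restricted mean up to t =", max(times), ":", restricted_mean, "\n")
```

Because the curve usually has not reached zero at the last time point, this value depends on the length of follow-up, which is one reason the reply treats the mean with caution.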
Re: [R] plot.survfit
For a fitted Cox model, one can either produce the predicted survival curve for a particular hypothetical subject (survfit), or the predicted curve for a particular cohort of subjects (survexp). See chapter 10 of Therneau and Grambsch for a long discussion of the differences between these, and the various pitfalls.

By default, survfit produces the curve for a hypothetical "average" subject whose covariate values are the respective means of the data set. I'm not very keen on this estimate: what is sex=.453, a hermaphrodite? But it is the historical default.

Terry Therneau

---- begin included message ----

I am confused when trying the function survfit. My question is: what does the survival curve given by plot.survfit mean? Is it the survival curve with different covariates at different points, or just the baseline survival curve?

For example, I run the following code and get the survival curve:

    library(survival)
    fit <- coxph(Surv(futime,fustat)~resid.ds+rx+ecog.ps, data=ovarian)
    plot(survfit(fit, type="breslow"))
    summary(survfit(fit, type="breslow"))

For the first two failure points, we have s(59|x1)=0.971, s(115|x2)=0.942. How can we guarantee that s(59|x1) is always greater than s(115|x2)? Since s(59|x1)=s_0(59)^exp(\beta'x1) and s(115|x2)=s_0(115)^exp(\beta'x2), we can manipulate covariates to make s(59|x1) < s(115|x2), right? Do I miss anything?

Thanks in advance
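To make the distinction concrete, the sketch below fits the model from the question and produces both the default curve and curves for two explicitly specified subjects via newdata. The covariate values of the two subjects are arbitrary. Each curve belongs to one fixed covariate vector, so it is non-increasing in time; the curve does not switch covariates from one failure time to the next.

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ resid.ds + rx + ecog.ps, data = ovarian)

# Default: one curve for a hypothetical subject at the reference
# (mean) covariate values.
avg_curve <- survfit(fit)

# Curves for two specific, hypothetical subjects.
subjects <- data.frame(resid.ds = c(1, 2), rx = c(1, 2), ecog.ps = c(1, 2))
subj_curves <- survfit(fit, newdata = subjects)

# One column of survival probabilities per subject.
surv_mat <- as.matrix(subj_curves$surv)
print(head(surv_mat))

plot(subj_curves, lty = 1:2, xlab = "Days", ylab = "Survival")
```

Comparing S(59 | x1) with S(115 | x2) for different x1 and x2 is a comparison across two different curves, and nothing forces an ordering there.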
Re: [R] statistical significance of accuracy increase in classification
On 26 Feb 2009, at 14:14, Max Kuhn wrote:

Do you know about any good reference that discusses kappa for classification and maybe CI for kappa???

You might also want to take a look at this survey article on kappa and its alternatives:

Artstein, Ron and Poesio, Massimo (2008). Survey article: Inter-coder agreement for computational linguistics. Computational Linguistics, 34(4), 555–596.

which you can download from http://www.aclweb.org/anthology-new/J/J08/

Alternatives to the standard Fleiss-Cohen asymptotic confidence intervals in the elementary 2x2 case are discussed in

Lee, J.J., Tu, Z. N. (1994). A Better Confidence Interval for Kappa on Measuring Agreement Between Two Raters with Binary Outcomes. Journal of Computational and Graphical Statistics, 3, 301-321.

which is available from JSTOR: http://www.jstor.org/stable/1390914

An S implementation of their approximations can be downloaded here: http://lib.stat.cmu.edu/S/kappa

I started to evaluate the accuracy of these approximations with simulation experiments some time ago, but haven't found the time to follow up on it.

Hope this helps,
Stefan
Re: [R] survival::survfit,plot.survfit
plot(survfit(fit)) should plot the survival function for x=0 or equivalently beta'=0. This curve is independent of any covariates.

This is not correct. It plots the curve for a hypothetical subject with x = mean of each covariate. This is NOT the average survival of the data set. Imagine a cohort made up of 60-year-old men and their 10-year-old grandsons: the expected survival of this cohort does not look like that for a 35-year-old male.

Terry T
[R] R: Moving Average
I apologize for my messy post, which stems from my own confusion ... and depression as well. In fact I thought I was done with a big chunk of a project, and to my dismay I found out there is more to do.

I am trying to adapt an algorithm, based on advanced wavelet analysis, to my respiration signals. The original algorithm was implemented in Fortran by the mathematician who authored the underlying theory. I implemented it in the R language with some minor changes due to the nature of the phenomenon I am studying. Since my results and the mathematician's results mostly disagree on the same sample signal, I was advised to remove the low-frequency components before starting the wavelet analysis.

Upon rereading his suggestion I got more and more confused. As far as I know, Moving Average is one of the simplest DSP low-pass filters. Therefore I cannot understand how MA can be used to remove low-frequency components. Nor can I understand his suggestion, which I paste in the following.

You have a rather short signal - 120 samples only. I avoid regarding wavelet coefficients on senior detail levels because their main support interval is of the same order as the whole time interval and the circular effect of discrete finite wavelet transform is too strong for them. Thus, if the length of time series equals N=2^k then I work with detail levels from 1 to (k-3). It means that for this time series k=7 and the working detail levels are 1,2,3. Besides that you use periodic extension of the signal whereas I use zero padding till the length nearest N=2^k and I do not include into analysis zero wavelet coefficients which arise due to zero padding. Moreover, the SpAn removes automatically before wavelet analysis low-frequency components from the signal (which are the main source of circular effect) by moving average within time window of the radius 2^(k-3). I advise you to remove low-frequency components as well, for example by local polynomials of the 2-nd order within moving time window of the radius 8 samples (the length of moving window equals 17, i.e. slightly more than 16 - maximum scale for the 3-rd detail level).

Thank you so much,
Maura
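One way to read the quoted advice is as a Savitzky-Golay style detrending: fit a second-order polynomial by least squares in each window of radius 8 (17 samples), take the fitted value at the window centre as the local trend, and subtract that trend from the signal. The sketch below implements that reading in base R. The helper name local_poly_weights and the test signal are made up for the example, and this is only my interpretation of the advice, not the SpAn program's actual procedure.

```r
# Weights that turn a window of 2 * radius + 1 samples into the value,
# at the window centre, of the least-squares polynomial fitted to them.
local_poly_weights <- function(radius, degree = 2) {
  offsets <- -radius:radius
  X <- outer(offsets, 0:degree, "^")   # columns: 1, t, t^2, ...
  # (X'X)^{-1} X' maps the samples to polynomial coefficients; its first
  # row gives the intercept, which is the fitted value at offset 0.
  solve(crossprod(X), t(X))[1, ]
}

w <- local_poly_weights(radius = 8, degree = 2)   # 17 weights

# Hypothetical 120-sample signal: slow quadratic drift plus a faster
# oscillation.
n  <- 120
tt <- seq_len(n)
signal <- 0.001 * (tt - 60)^2 + sin(2 * pi * tt / 6)

# Local quadratic trend (NA in the first and last 8 samples), then the
# detrended series with the low-frequency part removed.
trend     <- stats::filter(signal, w, sides = 2)
detrended <- signal - trend

plot(tt, signal, type = "l", col = "grey")
lines(tt, as.numeric(trend), col = "red")
lines(tt, as.numeric(detrended), col = "blue")
```

A plain moving average is the degree 0 case of the same helper: local_poly_weights(8, 0) returns 17 equal weights of 1/17.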
[R] Singularity in a regression?
R friends,

In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?

    lm(formula = activity ~ metaF + metaCl + metaBr + metaI + metaMe + paraF + paraCl + paraBr + paraI + paraMe)

    Residuals:
           Min         1Q     Median         3Q        Max
    -4.573e-01 -7.884e-02  3.469e-17  6.616e-02  2.427e-01

    Coefficients: (1 not defined because of singularities)
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   7.9173     0.1129  70.135  < 2e-16 ***
    metaF        -0.3973     0.2339  -1.698 0.115172
    metaCl            NA         NA      NA       NA
    metaBr        0.3454     0.1149   3.007 0.010929 *
    metaI         0.4827     0.2339   2.063 0.061404 .
    metaMe        0.3654     0.1149   3.181 0.007909 **
    paraF         0.7675     0.1449   5.298 0.000189 ***
    paraCl        0.3400     0.1449   2.347 0.036925 *
    paraBr        1.0200     0.1449   7.040 1.36e-05 ***
    paraI         1.3327     0.2339   5.697 9.96e-05 ***
    paraMe        1.2191     0.1573   7.751 5.19e-06 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.2049 on 12 degrees of freedom
    Multiple R-squared: 0.9257, Adjusted R-squared: 0.8699
    F-statistic: 16.61 on 9 and 12 DF, p-value: 1.811e-05
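An NA coefficient with the note "not defined because of singularities" means that the corresponding column of the design matrix is an exact linear combination of other columns (including the intercept). The poster's data are not shown, so the sketch below only reproduces the symptom on made-up indicator columns and shows how alias() reports the dependency. Here every row has exactly one of the three indicators set, so the three columns sum to the intercept column.

```r
set.seed(11)

# Made-up 0/1 indicators: each row has exactly one of the three set,
# so metaF + metaCl + metaBr equals the intercept column of 1s.
d <- data.frame(
  metaF  = rep(c(1, 0, 0), times = 4),
  metaCl = rep(c(0, 1, 0), times = 4),
  metaBr = rep(c(0, 0, 1), times = 4)
)
d$activity <- 8 + 0.4 * d$metaCl - 0.3 * d$metaBr + rnorm(12, sd = 0.1)

fit <- lm(activity ~ metaF + metaCl + metaBr, data = d)

# The last column in the dependency gets an NA coefficient.
print(coef(fit))

# alias() shows how the dropped column is expressed by the others.
print(alias(fit))
```

In the posted output the affected term is metaCl, so a first check would be whether metaCl can be written in terms of the intercept and the other indicator columns, for instance with alias() on that fit.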
Re: [R] Moving Average
On 26-Feb-09 13:54:51, David Winsemius wrote:
> I saw Gabor's reply but have a clarification to request. You say you want to remove low frequency components but then you request smoothing functions. The term smoothing implies removal of high-frequency components of a series.

If you produce a smoothed series, your result of course contains the low-frequency components, with the high-frequency components removed. But if you then subtract that from the original series, your result contains the high-frequency components, with the low-frequency components removed.

Moving-average is one way of smoothing (but can introduce periodic components which were not there to start with).

Filtering a time-series is a very open-ended activity! In many cases a useful start is exploration of the spectral properties of the series, for which R has several functions. 'spectrum()' in the stats package (loaded by default) is one basic function. help.search("time series") will throw up a lot of functions. You might want to look at package 'ltsa' (linear time series analysis).

Alternatively, if you already have good information about the frequency-structure of the series, or (for instance) know that it has a well-defined seasonal component, then you could embark on designing a transfer function specifically tuned to the job. Have a look at RSiteSearch("{transfer function}")

Hoping this helps,
Ted.

> If smoothing really is your goal then additional R resources would be smooth.spline, loess (or lowess), ksmooth, or using smoothing terms in regressions. Venables and Ripley have quite a few worked examples of such in MASS.
> -- David Winsemius
>
> On Feb 26, 2009, at 7:07 AM, mau...@alice.it wrote:
>> I am looking for some help at removing low-frequency components from a signal, through Moving Average on a sliding window. I understand this is a smoothing procedure, which I have never done in my life before .. sigh. I searched R archives and found rollmean, MovingAverages {TTR}, SymmetricMA. None of the above-mentioned functions seems to accept the smoothing polynomial order and the sliding window width as input parameters. Maybe I am missing something. I wonder whether there are some building blocks in R, if not even a function which does it all (I do not expect that much, though). Even some literature references and/or tutorials are very welcome.
>> Thank you so much, Maura

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 26-Feb-09 Time: 14:54:43
-- XFMail --
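Ted's point about subtracting a smoothed series from the original can be sketched in base R. This is only a minimal illustration on synthetic data, assuming a centred moving average of odd width `k`; the series, the window width and the variable names are made up for the example and are not from Maura's data.

```r
# Synthetic series: a slow linear trend plus a faster oscillation
# (illustrative data only, not from the thread).
n <- 200
t <- seq_len(n)
trend  <- 0.05 * t
wiggle <- sin(2 * pi * t / 10)
x <- trend + wiggle

# Centred moving average over a sliding window of odd width k.
k <- 21
smoothed <- stats::filter(x, rep(1 / k, k), sides = 2)

# Subtracting the smoothed series removes the low-frequency part.
detrended <- x - smoothed

# The first and last (k - 1) / 2 values are NA because the
# window is incomplete at the ends of the series.
ok <- !is.na(detrended)
```

The window width is the only tuning parameter here; `stats::filter` has no polynomial-order argument, so anything beyond a plain weighted average would need a different smoother (for example `loess`, which David mentions).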
Re: [R] Moving Average
I wrote a little code using Fourier filtering if you would like to take a look at this:

    library(StreamMetabolism)
    library(mFilter)
    library(zoo)   # zoo(), coredata() and index() come from zoo

    x <- read.production(file.choose())
    #contiguous.zoo(data.frame(x[,"RM202DO.Conc"], coredata(x[,"RM202DO.Conc"])))
    #contiguous.zoo(data.frame(x[,"RM61DO.Conc"], coredata(x[,"RM61DO.Conc"])))
    short <- x[42685:48535, "RM202DO.Conc"]
    #short <- x[53909:59957, "RM61DO.Conc"]
    short.ts <- ts(coredata(short), frequency=96)

    #fourier filtering
    short.fft <- fft(short.ts)
    plot(Re(short.fft), xlim=c(0,10), ylim=c(-1000, 1000))
    # zero the chosen block of Fourier coefficients, then invert
    short.fft[789:5563] = 0+0i
    short.ifft = fft(short.fft, inverse = TRUE)/length(short.fft)

    #zoo series
    filt <- zoo(coredata(Re(short.ifft)), index(short))
    par(mfrow=c(2,1))
    plot(short)
    plot(filt)

    # plot the same time window of the raw and the filtered series
    window.plot <- function(x, y, a, b, s, d){
      par(mfrow=c(2,1))
      plot(window.chron(x, a, b, s, d))
      plot(window.chron(y, a, b, s, d))
    }
    window.plot(short, filt, "04/17/2007", "00:01:00", "04/17/2007", "23:46:00")

    # plot a window of the series with +/- 2% bands
    plot.e <- function(b, w, x, y, z){
      a <- window.chron(b, w, x, y, z)
      plot(a, ylim=range(a)+0.06*c(-1, 1))
      lines(a*0.98, col="blue")
      lines(a*1.02, col="red")
    }

It may not be exactly what you want, but you will have a handle on which spectral properties you have removed.

-- Stephen Sefick Let's not spend our time and resources
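For readers without the StreamMetabolism data, the same idea (zero some Fourier coefficients, then invert) can be tried on a synthetic signal with base R only. This is a sketch under stated assumptions: the signal, the length `n` and the `cutoff` are invented for illustration, and the coefficients are zeroed symmetrically so that the inverse transform is real up to rounding.

```r
# Synthetic signal: a slow and a fast sine, both on exact FFT bins
# (illustrative values only).
n <- 512
t <- 0:(n - 1)
slow <- sin(2 * pi * 4 * t / n)         # 4 cycles over the record
fast <- 0.5 * sin(2 * pi * 60 * t / n)  # 60 cycles over the record
x <- slow + fast

X <- fft(x)

# Frequency k sits at index k + 1 and its mirror at index n - k + 1.
# Keep frequencies 0..cutoff (and their mirrors), zero everything between.
cutoff <- 10
X[(cutoff + 2):(n - cutoff)] <- 0

# R's inverse FFT is unnormalised, so divide by n.
low <- Re(fft(X, inverse = TRUE)) / n

# High-pass counterpart: what was removed from x.
high <- x - low
```

Because the index range is symmetric, `low` recovers the slow sine and `high` the fast one; with an asymmetric range the inverse has a non-zero imaginary part that `Re()` silently discards.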
Re: [R] Singularity in a regression?
On 26-Feb-09 12:58:49, Bob Gotwals wrote:
> R friends, In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?

From the degrees of freedom in your output (12 residual degrees of freedom plus 10 estimated coefficients), it seems you are fitting 10 binary variables to a total of 22 observations. In such circumstances, it is not unlikely that the matrix of 0s and 1s representing the binary variables would have at least 1 column which can be represented as a linear combination of the others (which is what the "1 not defined because of singularities" means).

Get more data, or use fewer variables!

Or, also worth considering, check whether there are relationships in the real world between your 10 variables which would tend to generate such linear dependence.

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 26-Feb-09 Time: 15:07:40
-- XFMail --
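The linear dependence Ted describes is easy to reproduce on toy data. The sketch below is an assumption-laden illustration, not Bob's data: three made-up 0/1 indicator columns that always sum to 1, so that together with the intercept one of them is redundant.

```r
set.seed(1)

# Toy data: each observation belongs to exactly one of three groups,
# so x1 + x2 + x3 == 1 for every row (same as the intercept column).
group <- rep(c("a", "b", "c"), each = 4)
d <- data.frame(
  y  = rnorm(12),                  # synthetic response
  x1 = as.numeric(group == "a"),
  x2 = as.numeric(group == "b"),
  x3 = as.numeric(group == "c")
)

fit <- lm(y ~ x1 + x2 + x3, data = d)
coef(fit)        # the coefficient for x3 is NA

# The design matrix has 4 columns but only rank 3.
X <- model.matrix(fit)
qr(X)$rank

# alias() reports which term is a linear combination of the others.
alias(fit)
```

Running `alias()` on the real fit would show which combination of the other columns reproduces metaCl, which is usually the quickest way to see where the dependence comes from.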
Re: [R] Moving Average
On Feb 26, 2009, at 9:54 AM, (Ted Harding) wrote:
> If you produce a smoothed series, your result of course contains the low-frequency components, with the high-frequency components removed. But if you then subtract that from the original series, your result contains the high-frequency components, with the low-frequency components removed.

Yes. The time series term would be detrending or de-trending.

> Alternatively, if you already have good information about the frequency-structure of the series, or (for instance) know that it has a well-defined seasonal component, then you could embark on designing a transfer function specifically tuned to the job. Have a look at RSiteSearch("{transfer function}")

As the OP's reply indicates, she is already using wavelet analysis. My question at this point is whether she should just be advised to ignore the low frequency components and concentrate on the middle and high frequency components. If you already have some sort of spectral decomposition, there should be no necessity of a subtraction or de-trending step.

-- David Winsemius
Re: [R] Singularity in a regression?
It looks like your data does not have enough information to estimate the parameter for metaCl, maybe because metaCl is identical to one of the other variables or is a constant.

HTH,
Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Bob Gotwals
Sent: Thursday 26 February 2009 13:59
To: r-help@r-project.org
Subject: [R] Singularity in a regression?

> R friends, In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?
[R] layout of igraph
Dear R users,

I am trying to draw a network using the igraph package. I intend to place the hub nodes (the ones with relatively more connections to other nodes) in the center of the graph. Also, the graph needs to be laid out so that the higher the correlation between two nodes is, the closer the two nodes will be. Is there any layout that can help, or any other way to do this?

Thanks in advance.
Shukai
Re: [R] Singularity in a regression?
I saw Ted's reply and it is certainly sensible. I would wonder whether the model ought to be recast so that the scientific question is more clear? You are obviously studying the effect of different substitutions (F, Cl, Br, I, Me) and different positions around an aromatic ring (meta, para). Why not consider the order of electrophilicity (or possibly size) and the position as two different variables, one ordered and the other binomial?

After recoding, your formula might then look like activity ~ electro + position ... or possibly activity ~ electro + size + position, and you would be less likely to run into difficulties with collinearity. You would also have some science in your model rather than casting aimlessly about in the data. If your ordering is sensible, you end up testing with 2 or 3 degrees of freedom.

-- David Winsemius

On Feb 26, 2009, at 7:58 AM, Bob Gotwals wrote:
> R friends, In a matrix of 1s and 0s, I'm getting a singularity error. Any helpful ideas?
Re: [R] layout of igraph
Shukai,

the force-based layout algorithms (layout.drl, layout.fruchterman.reingold, layout.graphopt, layout.kamada.kawai) are likely to do this; although they are not explicitly required to place hubs in the center, usually they do.

I am not sure what you mean by the correlation between two nodes. Do you mean that the graph is weighted?

If you have a small graph, then you can refine the layout interactively by using 'tkplot'.

Gabor

-- Gabor Csardi gabor.csa...@unil.ch UNIL DGM
Re: [R] Using very large matrix
Corrado,

Package bigmemory has undergone a major re-engineering and will be available soon (available now in Beta version upon request). The version currently on CRAN is probably of limited use unless you're in Linux.

bigmemory may be useful to you for data management, at the very least, where x <- filebacked.big.matrix(8, 8, init=n, type="double") would accomplish what you want using filebacking (disk space) to hold the object. But even this requires 64-bit R (Linux or Mac, or perhaps a Beta version of Windows 64-bit R that REvolution Computing is working on).

Subsequent operations (e.g. extraction of a small portion for analysis) are then easy enough: y <- x[1,] would give you the first row of x as an object y in R. Note that x is not itself an R matrix, and most existing R analytics can't work on x directly (and would max out the RAM if they tried, anyway).

Feel free to email me for more information (and this invitation applies to anyone who is interested in this).

Cheers,
Jay

#Dear friends,
#
#I have to use a very large matrix. Something of the sort of
#matrix(8,8,n) where n is something numeric of the sort 0.xx
#
#I have not found a way of doing it. I keep getting the error
#
#Error in matrix(nrow = 8, ncol = 8, 0.2) : too many elements specified
#
#Any suggestions? I have searched the mailing list, but to no avail.
#
#Best,
#--
#Corrado Topi
#
#Global Climate Change Biodiversity Indicators
#Area 18,Department of Biology
#University of York, York, YO10 5YW, UK
#Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk

-- John W. Emerson (Jay)
Assistant Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay
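A quick back-of-the-envelope check shows why a dense matrix of this kind cannot live in RAM and why file backing is suggested. The dimensions in the thread are garbled, so the sketch below uses a purely hypothetical side length (`n_side`); the only facts relied on are that a double takes 8 bytes and that R's integer indexing tops out at `.Machine$integer.max` elements, which as far as I know is the limit that R versions of that era enforced with the "too many elements specified" error.

```r
# Approximate memory needed for a dense numeric (double) matrix.
matrix_bytes <- function(nrow, ncol, bytes_per_element = 8) {
  as.numeric(nrow) * as.numeric(ncol) * bytes_per_element
}

n_side <- 80000                      # hypothetical dimension, not from the thread
elements <- as.numeric(n_side)^2     # as.numeric avoids integer overflow
gib <- matrix_bytes(n_side, n_side) / 2^30

# TRUE when the element count exceeds what integer indexing can address.
exceeds_int_index <- elements > .Machine$integer.max

cat(sprintf("%.0f elements, about %.1f GiB\n", elements, gib))
```

With numbers of this size the object is tens of gibibytes, so extracting small pieces from a disk-backed structure, as Jay describes, is the practical route.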
Re: [R] layout of igraph
Thanks for the fast reply, Gabor.

In my research, every node has its own vector of scores, so I can compute the correlation between every pair of nodes. I used the width and color of the edge for this purpose, but visualizing the correlations by distance may be clearer.

Best,
Shukai
Re: [R] layout of igraph
Shukai,

layout.drl supports edge weights, so you could try that. An alternative is doing MDS, see ?cmdscale and maybe help.search("MDS").

But both these methods are approximate: obviously, most often you cannot embed an n-dimensional (n > 2) graph into the 2-dimensional plane and keep all the distances between the vertices.

Best,
Gabor

-- Gabor Csardi gabor.csa...@unil.ch UNIL DGM
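The MDS route can be sketched with base R alone. This is a minimal illustration under assumptions: the score vectors are random stand-ins for Shukai's data, and `1 - correlation` is just one plausible way to turn correlations into dissimilarities; neither comes from the thread.

```r
set.seed(42)

# Hypothetical score vectors: one column per node, 20 scores each.
scores <- matrix(rnorm(6 * 20), nrow = 20, ncol = 6,
                 dimnames = list(NULL, paste0("node", 1:6)))

# Make node2 strongly correlated with node1 so the effect is visible.
scores[, "node2"] <- scores[, "node1"] + rnorm(20, sd = 0.1)

# Turn correlations into dissimilarities: high correlation, small distance.
r <- cor(scores)
d <- as.dist(1 - r)

# Classical multidimensional scaling into the plane.
coords <- cmdscale(d, k = 2)

# Pairwise distances in the resulting 2-D layout.
D <- as.matrix(dist(coords))
D["node1", "node2"]
```

The resulting two-column matrix `coords` has one row per node and can be supplied as the layout matrix when plotting the graph in igraph. As Gabor notes, the planar distances only approximate the original dissimilarities.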
[R] which zip file is the emulator package?
Hi. I'm trying to run the elliptic package on my computer (windows platform, version 2.7.2). I downloaded the elliptic package zip file from http://lib.stat.cmu.edu/R/CRAN/ and installed it, but it says that it needs the emulator package. Can you tell me where to download this? The only package with a similar name is 'emu_4.0.zip', which is not 'emulator.' I also can't have R automatically install this by clicking on 'packages' and 'install package' because it gives me an error message.

Thank you.
eric

version 2.7.2 on a windows platform
Re: [R] survival::survfit,plot.survfit
Thanks very much, Bernhard and Terry. It clarifies my confusion and really helps a lot.

Regards,
Jeff Xu

Terry Therneau wrote:
>> plot(survfit(fit)) should plot the survival-function for x=0 or equivalently beta'=0. This curve is independent of any covariates.
>
> This is not correct. It plots the curve for a hypothetical subject with x = mean of each covariate. This is NOT the average survival of the data set. Imagine a cohort made up of 60 year old men and their 10 year old grandsons: the expected survival of this cohort does not look like that for a 35 year old male.
>
> Terry T
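Terry's distinction between "the curve at the mean covariate" and "the mean of the curves" can be checked numerically. The sketch below is an illustration only: it assumes the survival package and its bundled `lung` data, fits a deliberately simple one-covariate Cox model, and picks three ages (45, 60, 75) purely for the example.

```r
library(survival)

# Simple Cox model on the lung data shipped with the survival package.
fit <- coxph(Surv(time, status) ~ age, data = lung)

# Predicted survival curves for three hypothetical subjects.
ages <- data.frame(age = c(45, 60, 75))
sf <- survfit(fit, newdata = ages)
S <- sf$surv                      # one column per row of `ages`

# Average of the curves for the 45 and 75 year olds ...
avg_of_curves <- rowMeans(S[, c(1, 3)])
# ... versus the curve for a subject at their mean age, 60.
curve_at_mean <- S[, 2]

# Largest gap between the two; it is not zero.
max(abs(avg_of_curves - curve_at_mean))
```

Passing `newdata` explicitly, rather than relying on the default pseudo-subject, makes it clear which covariate values a plotted curve refers to.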
Re: [R] which zip file is the emulator package?
On 2/26/2009 11:08 AM, eric lee wrote:
> I also can't have R automatically install this by clicking on 'packages' and 'install package' because it gives me an error message.

The current release gives no error if you ask it to install elliptic, and it automatically finds that emulator is in the BACCO bundle. You didn't say what the error message was that you saw, so I can't tell you whether it was due to a problem on CRAN or with what you asked for.

Generally speaking it's easier and less error-prone to just ask R to install a package, rather than manually downloading and installing one.

Duncan Murdoch
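Duncan's advice to let R resolve dependencies itself can be written as a small helper. This is a sketch, not anything from the thread: `install_if_missing` is a hypothetical name, and the actual install call needs a working connection to a CRAN mirror, so it is left commented out.

```r
# Install a package (and the packages it depends on) only if it is
# not already available. Returns TRUE when the package can be loaded.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  # dependencies = TRUE asks install.packages() to fetch required
  # packages as well, instead of failing at load time.
  install.packages(pkg, dependencies = TRUE)
  requireNamespace(pkg, quietly = TRUE)
}

# Needs network access to a CRAN mirror:
# install_if_missing("elliptic")
```

If no mirror is reachable, the fallback is the manual route from the original question, in which case every dependency has to be downloaded and installed by hand as well.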
Re: [R] which zip file is the emulator package?
Hi, Duncan. I chose 'package' then 'install package' and tried 6 different U.S. mirrors. The message I always get is:

Warning: unable to access index for repository http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/2.7
Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.7
Error in install.packages(NULL, .libPaths()[1], dependencies = NA, type = type) : no packages were specified
In addition: Warning message:
In open.connection(con, "r") : unable to connect to 'cran.r-project.org' on port 80.

I assume it's something on my end because I've been having connectivity issues here. Thanks again for your help.

eric
Re: [R] which zip file is the emulator package?
Reading this article, announcing the elliptic package (which was a distinct pleasure) and also looking at the R-FAQ, my guess is that you need the BACCO bundle. http://www.jstatsoft.org/v15/i07/paper -- David Winsemius On Feb 26, 2009, at 11:08 AM, eric lee wrote: Hi. I'm trying to run the elliptic package on my computer (windows platform, version 2.7.2). I downloaded the elliptic package zip file from http://lib.stat.cmu.edu/R/CRAN/ and installed it, but it says that it needs the emulator package. Can you tell me where to download this? The only similar package by name is 'emu_4.0.zip' which is not 'emulator.' I also can't have R automatically install this by clicking on 'packages' and 'install package' because it gives me an error message. Thank you. eric version 2.7.2 on a windows platform __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] which zip file is the emulator package?
Thanks, Duncan. The BACCO package worked. On Thu, Feb 26, 2009 at 11:22 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 2/26/2009 11:08 AM, eric lee wrote: Hi. I'm trying to run the elliptic package on my computer (windows platform, version 2.7.2). I downloaded the elliptic package zip file from http://lib.stat.cmu.edu/R/CRAN/ and installed it, but it says that it needs the emulator package. Can you tell me where to download this? The only similar package by name is 'emu_4.0.zip' which is not 'emulator.' I also can't have R automatically install this by clicking on 'packages' and 'install package' because it gives me an error message. The current release gives no error if you ask it to install elliptic, and it automatically finds that emulator is in the BACCO bundle. You didn't say what the error message was that you saw, so I can't tell you whether it was due to a problem on CRAN or with what you asked for. Generally speaking it's easier and less error-prone to just ask R to install a package, rather than manually downloading and installing one. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] windows vs. linux code
Gabor Grothendieck wrote: Try if (.Platform$OS.type == "windows") ... else ...

Gabor has suggested what I think is the best way to do this check, but in my experience, if you are doing this check then you are almost certainly missing some feature of R that will let you avoid doing it. For graphs, dev.new() has been mentioned. I would be curious to know when people find it necessary to do this check, other than to issue a system() command for some very specialized local system reason (i.e. something that would never be used in a package and rarely in code run on different computers - not just different OSs).

Paul
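The two approaches discussed above can be put side by side in a short sketch. The helper names open_plot_window() and null_device() are made up for illustration and are not part of R; the second one only shows the shape of an OS test for a case where the answer really does differ by platform.

```r
# Portable: dev.new() opens whatever the default graphics device is
# on the current platform, so no OS test is needed for extra windows.
open_plot_window <- function() {
  dev.new()
}

# Last resort: branch on the OS family when the answer truly differs.
# .Platform$OS.type is "windows" on Windows and "unix" elsewhere.
null_device <- function() {
  if (.Platform$OS.type == "windows") "NUL" else "/dev/null"
}

null_device()
```

The point of the sketch is that the OS test is confined to one small function, so the rest of a script stays platform-neutral.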
Re: [R] R communication
This could depend somewhat on which OS you have on the computers as to which packages will work or work best for you. A couple of packages to look at include, Rmpi, nws, and snow (no relation), and the other packages in the suggests field for snow. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of ARDIA David Sent: Thursday, February 26, 2009 3:05 AM To: r-help@r-project.org Subject: [R] R communication Dear all, Imagine that you have a small LAN with two computers. I would like to run R on both, and possible to run computations from one computer to the other one. TCP IP protocole would be preferable. Which package would you use for that? I would be very glad if you could also provide me with some lines of code, e.g., create a matrix X in computer 1, transfer its value to the second computer, make some calculation, and get the value back. Thanks for your help, David [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ftp fetch using RCurl?
Hi everyone, I have to fetch about 300 to 500 zipped archives from a remote ftp server. Each archive is about 1Mb. I know I can get it done by using download.file() in R, but I am curious whether there is a faster way to do this using RCurl. For example, are there some parameters that I can set so that the connection does not need to be rebuilt, etc.? An even simpler question is, how can I fetch an archive from the server and place it somewhere locally? I have spent a lot of time reading the RCurl documents and curl web pages, but in vain. Can someone show me an example of the syntax? Pardon me if this is trivial to you. Thanks Stanley -- View this message in context: http://www.nabble.com/ftp-fetch-using-RCurl--tp8067p8067.html Sent from the R help mailing list archive at Nabble.com.
[R] error message and convergence issues in fitting glmer in package lme4
I'm resending this message because I did not include a subject line in my first posting. Apologies for the inconvenience! Tanja

Hello, I'm trying to fit a generalized linear mixed model to estimate diabetes prevalence at US county level. To do this I'm using the glmer() function in package lme4. I can fit relatively simple models (i.e. few covariates) but when expanding the number of covariates I usually encounter the following error message.

    gm8 <- glmer(DIAB05F~AGE+as.factor(SEX)+poolt+poolx+poverty+fastfood+(1|as.factor(diab$fips)), family = binomial(link="logit"), data = diab, doFit=TRUE)
    Error in validObject(.Object) : invalid class "mer" object: Slot Zt must by dims['q'] by dims['n']*dims['s']

In the above, the response is person-level diabetes status as a function of AGE=age, SEX=sex, poolt=average county diabetes prevalence for previous years, poolx=pooled county diabetes prevalence for counties with similar age, sex, race, and income structure, poverty=county poverty rate, fastfood=number of fastfood places per 100,000 people in the county, and a county random effect. If I leave out fastfood, the model gets at least fitted - although it doesn't converge (yet):

    Warning message: In mer_finalize(ans) : false convergence (8)

I would be grateful for any advice on what the problem could be and how to resolve it.

Thanks, Tanja

Tanja Srebotnjak, PhD, MSc, Dipl. Stat. Postgraduate Fellow Institute for Health Metrics and Evaluation University of Washington 2301 5th Ave, Suite 600 Seattle, WA 98121 Email: tan...@u.washington.edu Tel: +1-206-897-2866 www.healthmetricsandevaluation.org
Re: [R] Random Forest confusion matrix
randomForest output is based on predict(iris.rf) whereas the code shown below uses predict(iris.rf, iris). See ?predict.randomForest for an explanation.

On Thu, Feb 26, 2009 at 11:10 AM, Li GUO guol...@yahoo.com wrote:

Dear R users, I have a question on the confusion matrix generated by function randomForest. I used the entire data set to generate the forest, for example:

    print(iris.rf)
    Call: randomForest(formula = Species ~ ., data = iris, importance = TRUE, keep.forest = TRUE)
    confusion
               setosa versicolor virginica class.error
    setosa         50          0         0        0.00
    versicolor      0         47         3        0.06
    virginica       0          3        47        0.06

then I classified the same data set with this forest:

    iris.pred <- predict(iris.rf, iris)
    table(observed = iris[,"Species"], predicted = iris.pred)
                predicted
    observed     setosa versicolor virginica
      setosa         50          0         0
      versicolor      0         50         0
      virginica       0          0        50

Why are the two matrices different?

Thanks, Li
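The distinction in the reply above can be reproduced with a short sketch. It assumes the randomForest package is installed; the seed and the table names are arbitrary choices for illustration. Calling predict() without newdata returns the out-of-bag predictions, where each observation is classified only by trees that did not see it during training, while passing the training data back in lets every tree vote.

```r
library(randomForest)

set.seed(1)  # arbitrary seed so the forest is reproducible
iris.rf <- randomForest(Species ~ ., data = iris)

# No newdata: out-of-bag predictions (what print(iris.rf) summarises)
oob_pred <- predict(iris.rf)

# Training data passed as newdata: resubstitution predictions
resub_pred <- predict(iris.rf, iris)

oob_tab   <- table(observed = iris$Species, predicted = oob_pred)
resub_tab <- table(observed = iris$Species, predicted = resub_pred)

oob_tab
resub_tab
```

The out-of-bag table is the honest error estimate; the resubstitution table is optimistic because the trees have already seen those rows.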
Re: [R] Hi, Coding problem
It looks like what you did was something like:

    tmp <- sim(100)
    table(tmp)/100

to get your results (it is easier for us to help if you tell us what you actually did; setting a seed and telling us that seed also helps us reproduce exactly what you did). The reason that you did not see 9 and 10 is just chance: the probabilities for those values are small, and it is not surprising that you did not see a couple of unlikely values. You can either run the sim function more times or tell the computer exactly which values you want counted so it knows to include them, for example:

    tmp <- factor(tmp, levels = 1:max(tmp))
    table(tmp)/100

Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ssophia Sent: Wednesday, February 25, 2009 7:28 PM To: r-help@r-project.org Subject: [R] Hi, Coding problem

Hi, there. Below is my code for one homework question. I couldn't come up with a reasonable answer. Could you please help me figure out what the problem with my code is? Thank you.

The question is: code P{X=j} = (1/2)^(j+1) + (1/2)*2^(j-1)/3^j

My code is:

    sim <- function(n.gen){
      urandom <- runif(n.gen)
      sim.vector <- rep(0,n.gen)
      for(j in 1:n.gen){
        i <- 1
        p <- 5/12
        F <- p
        while(urandom[j] >= F){
          p <- p*((1/2)^(i+1)+1/3*(2/3)^i)/((1/2)^i+(1/2)*(2/3)^i)
          F <- F+p
          i <- i+1
        }
        sim.vector[j] <- i
      }
      # output
      sim.vector
    }

The result is:

       1    2    3    4    5    6    7    8   11
    0.37 0.22 0.16 0.13 0.05 0.02 0.03 0.01 0.01

There are always some numbers missing; it should be continuous. Why are 9 and 10 missing? Thank you, sophia
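The factor() suggestion above can be seen on a tiny made-up sample. The vector draws is invented for illustration and is not output from the poster's sim() function; the only point is that table() counts just the values it sees unless the full set of levels is declared first.

```r
# Made-up sample in which the values 4 and 5 never occur
draws <- c(1, 1, 2, 3, 3, 3, 6)

# table() on the raw vector only lists observed values
raw_props <- table(draws) / length(draws)

# Declaring every possible value as a level keeps the empty cells
draws_f <- factor(draws, levels = 1:max(draws))
full_props <- table(draws_f) / length(draws)

raw_props
full_props
```

In full_props the entries for 4 and 5 are present with proportion zero, which is the behaviour wanted for the missing 9 and 10 in the question.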
[R] use R Group SFBA March meeting reminder
All San Francisco Bay Area useRs, On March 11th, Spencer Graves and Sundar Dorai-Raj will talk about "Creating R Packages" and related issues. This is the first of our (now) regular monthly meetings; details are here: http://www.meetup.com/R-Users/calendar/9718957/ Last Wednesday, over 60 people attended our 2009 kick-off meeting in cooperation with Predictive Analytics World (see the Group link below). Based on that response, we are now scheduling the rest of the year. PRESENTERS needed! Please contact Mike or myself if you are interested in presenting. Best, Jim Porzak TGN.com San Francisco, CA http://www.linkedin.com/in/jimporzak use R! Group SF: http://www.meetup.com/R-Users/
Re: [R] which zip file is the emulator package?
Are you behind a firewall? See the help for download.file for details on setting proxy information if this is the case. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of eric lee Sent: Thursday, February 26, 2009 9:33 AM To: Duncan Murdoch Cc: r-h...@stat.math.ethz.ch Subject: Re: [R] which zip file is the emulator package? Hi, Duncan. I choose 'package' then ''install package' and tried 6 different U.S. mirrors. The message I always get is: Warning: unable to access index for repository http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/2.7 Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.7 Error in install.packages(NULL, .libPaths()[1], dependencies = NA, type = type) : no packages were specified In addition: Warning message: In open.connection(con, r) : unable to connect to 'cran.r-project.org' on port 80. I assume it's something on my end because I've been having connectivity issues here. Thanks again for your help. eric On Thu, Feb 26, 2009 at 11:22 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 2/26/2009 11:08 AM, eric lee wrote: Hi. I'm trying to run the elliptic package on my computer (windows platform, version 2.7.2). I downloaded the elliptic package zip file from http://lib.stat.cmu.edu/R/CRAN/ and installed it, but it says that it needs the emulator package. Can you tell me where to download this? The only similar package by name is 'emu_4.0.zip' which is not 'emulator.' I also can't have R automatically install this by clicking on 'packages' and 'install package' because it gives me an error message. The current release gives no error if you ask it to install elliptic, and it automatically finds that emulator is in the BACCO bundle. 
You didn't say what the error message was that you saw, so I can't tell you whether it was due to a problem on CRAN or with what you asked for. Generally speaking it's easier and less error-prone to just ask R to install a package, rather than manually downloading and installing one. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Merge question
Hi: I am a new R user. I have the following question and would appreciate your input.

Data1 (data frame 1)
p1,d1,d2 (p1 is text and d1 and d2 are numeric)
xyz,10,25

Data2 (data frame 2)
p1,d1,d2
xyz,11,15

Now I want to create a new data frame that looks like the one below. The fields d1 and d2 are summed by the product key.

Data3
p1,d1,d2
xyz,21 (sum of 10 from Data1 and 11 from Data2),40 (sum of 25 from Data1 and 15 from Data2)

Any other examples of merge you may have will be appreciated. Thanks. Satish
[R] generalized linear mixed models with a beta distribution
Has there been any follow up to this question? I have found myself wondering the same thing: How then does SAS fit a beta distributed GLMM? It also fits the negative binomial distribution. Both of these would be useful in glmer/lmer if they aren't 'illegal' as Brian suggested. Especially as SAS indicates a favorable delta BIC of over 1000 when I fit the beta to my data (could be the beginning of a great song..) versus my original binomial fit. Jeff Evans Michigan State University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] glm with large datasets
Hi all, I have to run a logit regression over a large dataset and I am not sure about the best option to do it. The dataset is about 20x2000 and R runs out of memory when creating it. After going over the help archives and the mailing lists, I think there are two main options, though I am not sure which one will be better. Of course, any alternative will be welcome as well. Actually, I am not quite sure whether either of these options will work but, before getting into it, I would like to get some advice.

- A first option is to use the package ff, which allows working with the dataset without loading it into RAM. This, combined with the bigglm function, should do the job.

- The dataset contains a lot of sparse variables, so I was wondering whether creating the model matrix as a sparse matrix might deliver good results. In this case, I am not sure about the capabilities of glm or some extension of it to deal with sparse matrices (I could not find any documentation about this). If possible, this second option seems more efficient, since R might be able to use the sparsity of the matrices to speed up the computations.

Thanks in advance. All the best! Julio.
Re: [R] windows vs. linux code
On Thu, 26 Feb 2009, Paul Gilbert wrote:

Gabor Grothendieck wrote: Try if (.Platform$OS.type == "windows") ... else ...

Gabor has suggested what I think is the best way to do this check, but in my experience, if you are doing this check then you are almost certainly missing some feature of R that will let you avoid doing it. For graphs, dev.new() has been mentioned. I would be curious to know when people find it necessary to do this check, other than to issue a system() command for some very specialized local system reason (i.e. something that would never be used in a package and rarely in code run on different computers - not just different OSs).

Grep-ing the R and package sources is informative. One place that such a test is needed is for system/shell differences, and the fact that some Windows commands need \ as the file separator (and some need / or shell-specific quoting). Quite a few packages are calling e.g. gs (where the name is platform-dependent). Some packages seem to treat tk differently by platform, but that seems to me to be largely a legacy of Tk 8.4: 8.5 is more portable. But many uses in packages are no longer necessary, e.g. flush.console exists everywhere, and R 2.9.0 will have an unzip() function.

-- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] two colors and putting lines on bar plots
For the first question you can use %in% rather than ==, for example:

    ifelse( dataframe$vector_o_numbers %in% c('00','01'), 'red', 'black')

For the reference line, the abline function will draw a line the full width/height of a graph as a general reference; if you want separate lines for each group of bars, then the segments function will draw the shorter lines. Note that the return value of the barplot function gives the x coordinates of the bars, which can be useful in drawing the lines (see the examples for barplot).

Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Malone Sent: Wednesday, February 25, 2009 7:05 PM To: r-help@r-project.org Subject: [R] two colors and putting lines on bar plots

Hi all, I want to represent two categories with one color and have the other categories in a different color on a bar plot. I can do this for one category/number using the ifelse call in col, but how do I extend it to two categories/numbers?

    barplot(dataframe$vector_o_numbers, col=ifelse(dataframe$vector_o_numbers == "00", "red", "black"), names.arg=dataframe$labels)

For example, if I wanted to color 00 and 01 red and have the rest of the categories colored black, how would I do this? Finally, I'd like to put a line on the bar plot representing an expected number to compare to the observed number for a particular category. In the past I've done this by drawing a line manually on a graph using photoshop or something similar. Is there a way to do this in R? Many thanks all in advance for your help!! John
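Both suggestions in the reply above fit in one short sketch. The counts, category codes and expected values are invented for illustration, and the half-width of 0.5 relies on barplot()'s default bar width of 1.

```r
# Invented data: observed counts per category and an expected count for each
codes    <- c("00", "01", "02", "03")
counts   <- c(12, 7, 9, 4)
expected <- c(10, 8, 8, 5)

# Two categories share one colour, all others get a second colour
bar_cols <- ifelse(codes %in% c("00", "01"), "red", "black")

# barplot() returns the x coordinates of the bar midpoints
mids <- barplot(counts, col = bar_cols, names.arg = codes)

# One reference line across the whole plot
abline(h = mean(expected), lty = 2)

# One short line per bar at that bar's expected value
segments(mids - 0.5, expected, mids + 0.5, expected, col = "blue", lwd = 2)
```

abline() suits a single overall reference, while segments() with the returned midpoints suits a separate expected value per bar.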
Re: [R] Merge question
Hi there, something like this?

    Data1 <- read.table(stdin(), head=T, sep=",")
    p1,d1,d2
    xyz,10,25
    kmz,100,250

    Data2 <- read.table(stdin(), head=T, sep=",")
    p1,d1,d2
    xyz,11,15
    kmz,110,150

    Data1
    Data2
    Data3 <- data.frame(rbind(Data1,Data2))
    Data3
    Data3.sum <- aggregate(Data3[,c("d1","d2")], list(Data3$p1), sum)
    Data3.sum

Best wishes, miltinho astronauta, brazil

On Thu, Feb 26, 2009 at 2:52 PM, Vadlamani, Subrahmanyam {FLNA} subrahmanyam.vadlam...@fritolay.com wrote:

Hi: I am a new R user. I have the following question and would appreciate your input.

Data1 (data frame 1)
p1,d1,d2 (p1 is text and d1 and d2 are numeric)
xyz,10,25

Data2 (data frame 2)
p1,d1,d2
xyz,11,15

Now I want to create a new data frame that looks like the one below. The fields d1 and d2 are summed by the product key.

Data3
p1,d1,d2
xyz,21 (sum of 10 from Data1 and 11 from Data2),40 (sum of 25 from Data1 and 15 from Data2)

Any other examples of merge you may have will be appreciated. Thanks. Satish
Re: [R] Bug in predict function for naiveBayes?
Dear all, I sent a mail about this last week already. I've been trying to get this predict function to work, but somehow I keep getting the same error. I have read the help files and searched the internet, but I don't see what I'm doing wrong. Does anybody have experience with this function? It's contained in the package e1071. Thank you in advance.

On Thu, Feb 19, 2009 at 2:16 PM, joris meys jorism...@gmail.com wrote:

Dear all, I tried a simple naive Bayes classification on an artificial dataset, but I have trouble getting the predict function to work with the type="class" specification. With type="raw" it works perfectly, but with type="class" I get the following error:

    Error in as.vector(x, mode) : invalid 'mode' argument

Data: mixture.train is a training set with 100 points originating from 2 multivariate gaussian distributions (class 0 and class 1), with X1 and X2 as coordinates in a 2-dimensional space. mixture.test is a grid going from -15 to +15 in both dimensions. Stupid data, but it's just to test.

Code:

    Sigma <- matrix(c(10,3,3,2),2,2)
    mixture.train <- cbind(mvrnorm(n=50, c(0, 2), Sigma),rep(0,50))
    mixture.train <- as.data.frame(rbind(mixture.train,cbind(mvrnorm(n=50, c(2, 0), Sigma),rep(1,50))))
    names(mixture.train) <- c("X1","X2","Class")
    X1 <- rep(seq(-15,15,by=1),31)
    X2 <- rep(seq(-15,15,by=1),each = 31)
    mixture.test <- data.frame(X1,X2)
    Bayes.res <- naiveBayes(Class ~ X1 + X2, data=mixture.train)
    pred.bayes <- predict(Bayes.res, cbind(mixture.test$X1, mixture.test$X2),type="class")

I also tried pred.bayes <- predict(Bayes.res, mixture.test,type="class"), but that gives the same error. Is this a bug or am I missing something?

Kind regards, Joris Meys, University Ghent
Re: [R] Merge question
on 02/26/2009 11:52 AM Vadlamani, Subrahmanyam {FLNA} wrote:

Hi: I am a new R user. I have the following question and would appreciate your input.

Data1 (data frame 1)
p1,d1,d2 (p1 is text and d1 and d2 are numeric)
xyz,10,25

Data2 (data frame 2)
p1,d1,d2
xyz,11,15

Now I want to create a new data frame that looks like the one below. The fields d1 and d2 are summed by the product key.

Data3
p1,d1,d2
xyz,21 (sum of 10 from Data1 and 11 from Data2),40 (sum of 25 from Data1 and 15 from Data2)

Any other examples of merge you may have will be appreciated. Thanks. Satish

Given the nature of your data, having the same column structure with repeated keys, I would not use merge(), but rbind() the two data frames together and then use aggregate():

    > DF <- rbind(Data1, Data2)
    > DF
       p1 d1 d2
    1 xyz 10 25
    2 xyz 11 15
    > aggregate(DF[-1], list(p1 = DF$p1), sum)
       p1 d1 d2
    1 xyz 21 40

See ?rbind and ?aggregate

If you search the list archives with RSiteSearch("merge"), you will find hundreds of posts showing the use of that particular function.

HTH, Marc Schwartz
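Since the original question also asked for merge() examples, here is a self-contained sketch that runs the rbind()/aggregate() approach from the reply above next to a merge()-based alternative. The second product key "kmz" and the names stacked, totals and wide are additions for illustration only.

```r
Data1 <- data.frame(p1 = c("xyz", "kmz"), d1 = c(10, 100), d2 = c(25, 250))
Data2 <- data.frame(p1 = c("xyz", "kmz"), d1 = c(11, 110), d2 = c(15, 150))

# Approach from the reply: stack the rows, then sum within each key
stacked <- rbind(Data1, Data2)
totals  <- aggregate(stacked[c("d1", "d2")], by = list(p1 = stacked$p1), FUN = sum)

# merge() alternative: put matching keys side by side, then add columns.
# The suffixes distinguish the same-named columns from the two inputs.
wide <- merge(Data1, Data2, by = "p1", suffixes = c(".1", ".2"))
wide$d1 <- wide$d1.1 + wide$d1.2
wide$d2 <- wide$d2.1 + wide$d2.2

totals
wide[c("p1", "d1", "d2")]
```

The aggregate() route extends naturally to more than two data frames and to keys that appear in only one of them; the merge() route drops unmatched keys unless all = TRUE is given.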
[R] bottom legends in ggplot2 ?
Has anyone had success with producing a legend for a qplot graph such that the legend is placed at the bottom, under the abscissa, rather than on the right-hand side? The following doesn't move the legend:

    library(ggplot2)
    qplot(mpg, wt, data=mtcars, colour=cyl, gpar(legend.position="bottom") )

I am using ggplot2_0.8.2. Thanks in advance, Avram
Re: [R] Bug in predict function for naiveBayes?
You need to set Class as a factor before you call naiveBayes(); i.e.,

    mixture.train$Class <- factor(mixture.train$Class)

Then you can just do:

    pred.bayes <- predict(Bayes.res, mixture.test, type="class")

Andy

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of joris meys Sent: Thursday, February 26, 2009 1:36 PM To: r-help@r-project.org Subject: Re: [R] Bug in predict function for naiveBayes?

Dear all, I sent a mail about this last week already. I've been trying to get this predict function to work, but somehow I keep getting the same error. I have read the help files and searched the internet, but I don't see what I'm doing wrong. Does anybody have experience with this function? It's contained in the package e1071. Thank you in advance.

On Thu, Feb 19, 2009 at 2:16 PM, joris meys jorism...@gmail.com wrote:

Dear all, I tried a simple naive Bayes classification on an artificial dataset, but I have trouble getting the predict function to work with the type="class" specification. With type="raw" it works perfectly, but with type="class" I get the following error:

    Error in as.vector(x, mode) : invalid 'mode' argument

Data: mixture.train is a training set with 100 points originating from 2 multivariate gaussian distributions (class 0 and class 1), with X1 and X2 as coordinates in a 2-dimensional space. mixture.test is a grid going from -15 to +15 in both dimensions. Stupid data, but it's just to test.

Code:

    Sigma <- matrix(c(10,3,3,2),2,2)
    mixture.train <- cbind(mvrnorm(n=50, c(0, 2), Sigma),rep(0,50))
    mixture.train <- as.data.frame(rbind(mixture.train,cbind(mvrnorm(n=50, c(2, 0), Sigma),rep(1,50))))
    names(mixture.train) <- c("X1","X2","Class")
    X1 <- rep(seq(-15,15,by=1),31)
    X2 <- rep(seq(-15,15,by=1),each = 31)
    mixture.test <- data.frame(X1,X2)
    Bayes.res <- naiveBayes(Class ~ X1 + X2, data=mixture.train)
    pred.bayes <- predict(Bayes.res, cbind(mixture.test$X1, mixture.test$X2),type="class")

I also tried pred.bayes <- predict(Bayes.res, mixture.test,type="class"), but that gives the same error. Is this a bug or am I missing something?

Kind regards, Joris Meys, University Ghent
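Putting Andy's fix together with the posted code gives the following end-to-end sketch. It assumes the MASS and e1071 packages are installed; the seed, the shortened object names (train, grid, fit, pred) and the use of expand.grid() for the test grid are choices made here for illustration, not part of the original post.

```r
library(MASS)    # mvrnorm()
library(e1071)   # naiveBayes()

set.seed(42)     # arbitrary seed for reproducibility
Sigma <- matrix(c(10, 3, 3, 2), 2, 2)

# 50 points per class from two bivariate normals, labelled 0 and 1
train <- as.data.frame(rbind(
  cbind(mvrnorm(n = 50, c(0, 2), Sigma), 0),
  cbind(mvrnorm(n = 50, c(2, 0), Sigma), 1)
))
names(train) <- c("X1", "X2", "Class")

# The fix: the response must be a factor for classification
train$Class <- factor(train$Class)

# 31 x 31 grid from -15 to 15 in both dimensions
grid <- expand.grid(X1 = seq(-15, 15, by = 1), X2 = seq(-15, 15, by = 1))

fit  <- naiveBayes(Class ~ X1 + X2, data = train)
pred <- predict(fit, grid, type = "class")

table(pred)
```

Passing the test grid as a data frame with the same column names as the training predictors, rather than as an unnamed matrix from cbind(), also keeps the predictors matched by name.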
Re: [R] windows vs. linux code
Rolf Turner r.tur...@auckland.ac.nz writes:

> Despite the knowledge, wisdom, insight, skill, good looks, and other admirable characteristics of the members of the R-help list, few of us are skilled in telepathy or clairvoyance.

Oh, yeah? Then how did I know you were going to say that, huh?

-- Jeff
[R] plot with pairwise joined points
I would like to do as follows:

  plot(a,b)
  points(c,d,pch=19)

Now join with a line segment point a[1], b[1] to c[1], d[1]; a[2], b[2] to c[2], d[2]; ...; a[n], b[n] to c[n], d[n]. All corresponding points from the two data sets are joined by line segments.

Thanks very much for any tips on how to do this.

Bill
Re: [R] error message and convergence issues in fitting glmer in package lme4
On Thu, Feb 26, 2009 at 10:58 AM, Tanja Srebotnjak tan...@u.washington.edu wrote:

> I'm resending this message because I did not include a subject line in my first posting.

Also, it is generally more effective to send questions about lmer/glmer to the R-SIG-Mixed-Models list, which I am cc:ing on this reply.

> Hello,
>
> I'm trying to fit a generalized linear mixed model to estimate diabetes prevalence at US county level. To do this I'm using the glmer() function in package lme4. I can fit relatively simple models (i.e. few covariates) but when expanding the number of covariates I usually encounter the following error message.
>
>   gm8 <- glmer(DIAB05F~AGE+as.factor(SEX)+poolt+poolx+poverty+fastfood+(1|as.factor(diab$fips)), family = binomial(link="logit"), data = diab, doFit=TRUE)
>   Error in validObject(.Object) : invalid class "mer" object: Slot Zt must by dims['q'] by dims['n']*dims['s']

Getting that error message from this model is peculiar. I couldn't actually say what might be happening without trying the fit myself. I would suggest setting doFit = FALSE but I think that this error would be encountered even with doFit = FALSE. Again, it would be hard to say exactly what is happening here.

> In the above, the response is person-level diabetes status as a function of AGE=age, SEX=sex, poolt=average county diabetes prevalence for previous years, poolx=pooled county diabetes prevalence for counties with similar age, sex, race, and income structure, poverty=county poverty rate, fastfood=number of fastfood places per 100,000 people in the county, and a county random effect.
>
> If I leave out fastfood, the model gets at least fitted - although it doesn't converge (yet):

The version of lmer currently under development tries to address that problem. The optimization of the parameter estimates is performed in a slightly different way that will, I hope, provide smoother convergence.

If your data are not restricted and you would be willing to send me a copy of the diab data frame, I could check what happens on that version (or you could install the development version yourself, but that is a non-trivial undertaking). If you can send the data, the best way is to create an R data file as

  save(diab, file = "diab.rda")

and send the file diab.rda.

> Warning message: In mer_finalize(ans) : false convergence (8)

Frequently that is a sign of an overspecified model.

> I would be grateful for any advice on what the problem could be and how to resolve it.
>
> Thanks,
> Tanja
>
> Tanja Srebotnjak, PhD, MSc, Dipl. Stat.
> Postgraduate Fellow
> Institute for Health Metrics and Evaluation
> University of Washington
> 2301 5th Ave, Suite 600
> Seattle, WA 98121
> Email: tan...@u.washington.edu
> Tel: +1-206-897-2866
> www.healthmetricsandevaluation.org
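The save() step suggested in the reply can be checked locally before mailing the file. The sketch below uses a small invented data frame (diab_example, with made-up values) in place of the real diab data, and writes to a temporary directory; only save() and load() themselves come from the thread.

```r
# 'diab_example' is a made-up stand-in for the real 'diab' data frame
diab_example <- data.frame(fips = c("01001", "01003"), DIAB05F = c(0, 1))

# Write the object to an .rda file, as suggested in the reply
rda_path <- file.path(tempdir(), "diab_example.rda")
save(diab_example, file = rda_path)

# On the receiving side, load() restores the object under its original name;
# loading into a fresh environment avoids overwriting anything in the workspace
restored <- new.env()
loaded_names <- load(rda_path, envir = restored)
```

Here loaded_names holds the names of the restored objects, so the recipient can see what the file contained before using it.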
[R] Download daily weather data
I'm writing a program that will tell me whether I should wear a coat, so I'd like to be able to download daily weather forecasts and daily reports of recent past weather conditions.

The NOAA has very promising tabular forecasts (http://forecast.weather.gov/MapClick.php?CityName=Ithaca&state=NY&site=BGM&textField1=42.4422&textField2=-76.5002&e=0&FcstType=digital), but I can't figure out how to import them.

Someone must have needed to do this before. Suggestions?

Thomas Levine!
Re: [R] generalized linear mixed models with a beta distribution
On Thu, Feb 26, 2009 at 12:04 PM, Jeff Evans evans...@msu.edu wrote:

> Has there been any follow up to this question? I have found myself wondering the same thing: How then does SAS fit a beta distributed GLMM? It also fits the negative binomial distribution.

When SAS decides to open-source their code we'll be able to find out.

> Both of these would be useful in glmer/lmer if they aren't 'illegal' as Brian suggested. Especially as SAS indicates a favorable delta BIC of over 1000 when I fit the beta to my data (could be the beginning of a great song..) versus my original binomial fit.

Definitions of generalized linear mixed models are not entirely straightforward, at least for me. I'm making some progress but, as always, it is slower than one would like it to be.
Re: [R] plot with pairwise joined points
Try:

  segments(a,b,c,d)

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
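As a fuller sketch of the one-liner above, the function below draws both point sets and joins corresponding points. plot_pairs() and its argument names are hypothetical, introduced only for this illustration; plot(), points() and segments() are the base graphics calls already named in the thread.

```r
# plot_pairs() is a made-up helper; (x0, y0) and (x1, y1) are the two point sets
plot_pairs <- function(x0, y0, x1, y1) {
  stopifnot(length(x0) == length(y0),
            length(x0) == length(x1),
            length(x0) == length(y1))

  # Set up axes wide enough for both point sets without drawing anything yet
  plot(range(x0, x1), range(y0, y1), type = "n", xlab = "x", ylab = "y")

  points(x0, y0)              # first data set, open circles
  points(x1, y1, pch = 19)    # second data set, filled circles

  # segments() is vectorised: the i-th segment runs from (x0[i], y0[i]) to (x1[i], y1[i])
  segments(x0, y0, x1, y1)

  invisible(length(x0))       # number of pairs joined
}
```

For example, plot_pairs(rnorm(5), rnorm(5), rnorm(5), rnorm(5)) draws five joined pairs. Setting up the axes from the combined ranges first avoids clipping the second point set, which a plain plot(a, b) could do.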
Re: [R] plot with pairwise joined points
have a look at segments(), e.g.,

  x <- rnorm(5)
  y <- rnorm(5)
  z <- rnorm(5)
  w <- rnorm(5)
  r1 <- range(x, z)
  r2 <- range(y, w)
  plot(r1, r2, type = "n")
  points(x, y)
  points(z, w, pch = 19)
  segments(x, y, z, w)

I hope it helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Re: [R] plot with pairwise joined points
On 27/02/2009, at 9:46 AM, William Simpson wrote:

> I would like to do as follows
>
>   plot(a,b)
>   points(c,d,pch=19)
>
> Now join with a line segment point a[1], b[1] to c[1], d[1]; a[2], b[2] to c[2], d[2] ... a[n], b[n] to c[n], d[n]. All corresponding points from the two data sets are joined by line segments.

?segments

cheers,

Rolf Turner
Re: [R] Download daily weather data
Thomas,

Have a look at the source code for the webpage (ctrl-u in Firefox; I don't know the shortcut in Internet Explorer, etc.). That is what you'd have to parse in order to get the forecast from this page. Typically when I parse webpages such as this I use regular expressions (and I would never downplay the usefulness of regular expressions, but they take a little getting used to).

There are two parts to the task: find patterns that allow you to pull out the data you're after, and then write a program to pull them out. Also, of course, download the webpage (but that's no issue).

I bet you'd be able to find a comma separated value (CSV) file containing the weather report somewhere, which would probably involve a little less labor in order to produce your automatic wardrobe advice.

James
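The two-part approach described above (find a pattern, then pull out the matches) can be sketched with base R regular expressions. The HTML row below is invented for illustration and is not the actual markup of the NOAA page; extract_cell_numbers() is likewise a hypothetical helper name.

```r
# Hypothetical helper: pull every purely numeric <td> cell out of an HTML string
extract_cell_numbers <- function(html) {
  # Step 1: find the pattern, i.e. table cells that hold only an integer
  cells <- regmatches(html, gregexpr("<td>-?[0-9]+</td>", html))[[1]]
  # Step 2: strip the tags and convert what is left to numbers
  as.numeric(gsub("</?td>", "", cells))
}

# Made-up sample row; a real page would first be read with readLines(url)
sample_row <- "<tr><td>Temperature</td><td>31</td><td>29</td><td>-2</td></tr>"
temps <- extract_cell_numbers(sample_row)
```

A pattern this simple breaks as soon as the cells carry attributes or nested tags, which is the usual argument for a real parser once the page gets complicated.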
Re: [R] plot with pairwise joined points
Thanks very much Rolf, Dimitris, Greg!

Bill
Re: [R] generalized linear mixed models with a beta distribution
Thanks for responding, Doug. I'm sure SAS just hasn't gotten around to releasing their code yet.

lme4 does have a leg up on GLIMMIX in other areas, though. The latest SAS release (9.2) is now able to compute the Laplace approximation of the likelihood, but it will only fit an overdispersion parameter using pseudo-likelihoods, which can't be used for model selection. I'm not sure what lme4 is doing differently through the quasi-distributions that allows this, but it's enormously useful.

Jeff
[R] Substituting in a variable file name in a Windows system command
I am running R version 2.8.1 on Windows XP. I generate and write a .csv file from my R script. The following command then works to upload it to a remote server, using a Windows batch file that carries out the ftp (among other things):

  system("C:/upload_data/uploadq8.bat C:/upload_data/out_2009-02-26.csv", wait=FALSE)

I want to set this up to run daily and create a unique filename each day it runs. I write the .csv file with a unique filename by

  fname <- paste("out_", Sys.Date(), ".csv", sep="")
  write.table(config_all5, file=fname, row.names=FALSE, quote=FALSE, sep=",")

I can build the string (including quotes) that is the first argument in the system command:

  com <- paste('"C:/upload_data/uploadq8table.bat C:/upload_data/', fname, '"', sep="")
  com
  [1] "\"C:/upload_data/uploadq8table.bat C:/upload_data/out_2009-02-26.csv\""

But when I substitute it into the system command I get an error:

  system(com, wait=FALSE)
  Warning in system(com, wait = FALSE) :
    "C:/upload_data/uploadq8table.bat C:/upload_data/out_2009-02-26.csv" not found

Any suggestions for how to resolve this are appreciated!

Elaine McGovern Jones
ISC Tape and DASD Storage Products
Characterization and Failure Analysis Engineering
Phone: 408 284 4853
Internal: 3-4853
jon...@us.ibm.com
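Since the post says the string was built "including quotes", one likely reading (an assumption, not confirmed in the thread) is that the quote characters became part of the command itself, so Windows looked for a program whose name is the whole quoted string. A minimal sketch that builds the command without embedded quotes follows; build_upload_command() is a hypothetical helper, and the paths are the ones quoted in the message.

```r
# Hypothetical helper: build the command string for a given date
build_upload_command <- function(date = Sys.Date(),
                                 script = "C:/upload_data/uploadq8table.bat",
                                 dir = "C:/upload_data") {
  fname <- paste("out_", format(date, "%Y-%m-%d"), ".csv", sep = "")
  # The R string delimiters are not part of the value, so no literal
  # quote characters end up in the command that system() receives
  paste(script, file.path(dir, fname))
}

com <- build_upload_command(as.Date("2009-02-26"))

# On the Windows machine that has the batch file one would then run:
# system(com, wait = FALSE)
```

If a path ever contains spaces, wrapping that single argument in shQuote() is the usual approach, rather than quoting the whole command.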
[R] User manual or tutorial for GAM package
Hi all,

I would like to run the gam package, but cannot quite seem to get the syntax correct for my data format, as simple as it is. The gamlss package has a user manual, but other than the help pages for the gam package I cannot find a user manual per se.

I did install the gamair package, which has loads of sample data, so perhaps with trial and error on my data I can get something to work.

Can anyone point me to a link for a downloadable tutorial?

Climbing the steep learning curve and spending an hour or so per day is beginning to pay off with most areas of R.

Tnx
Re: [R] User manual or tutorial for GAM package
On Feb 26, 2009, at 4:44 PM, Neotropical bat risk assessments wrote:

> Hi all, I would like to run the gam package, but can not quite seem to get the syntax correct for my data format as simple as it is. The gamlss package has a user manual but other than the help with gam package I can not find a user manual per se.

Maybe you need to buy the book:

http://www.amazon.com/Generalized-Additive-Models-Introduction-Statistical/dp/1584884746

-- David Winsemius

> I did install the gamair that has loads of sample data so perhaps trail and error with my data I can get something to work. Anyone who can point me to a link for a download tutorial? Climbing the steep learning curve and spending a hour or so per day is beginning to pay off with most areas of R
>
> Tnx
Re: [R] Download daily weather data
Looks like you can sign up to get XML feed data from Weather.com:

http://www.weather.com/services/xmloap.html

Hope it works out!
[R] When download.file fails...
Hi everyone,

When the remote file does not exist, download.file fails, yet it still creates the file named in the destfile argument. I tried to delete this bad file but got the message that it is still being used by another program, which I assume is R.

Does anyone know how to avoid this problem? The potential solutions I can think of are checking that the file exists on the remote server before calling download.file, changing download.file's behavior, or using other functions. Any suggestions are appreciated.

Stanley
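One way to sketch the "change the behavior" option is a small wrapper that catches the failure and removes any leftover file. safe_download() is a hypothetical name, and this is only a sketch: whether unlink() succeeds while the file is reported as in use on Windows is not something the thread settles.

```r
# Hypothetical wrapper: TRUE on success, FALSE on failure, no leftover file
safe_download <- function(url, destfile, ...) {
  ok <- tryCatch(
    # download.file() returns 0 (invisibly) on success
    download.file(url, destfile, quiet = TRUE, ...) == 0,
    warning = function(w) FALSE,   # e.g. an HTTP 404 reported as a warning
    error   = function(e) FALSE    # e.g. "cannot open URL"
  )
  # Remove a partial or empty file left behind by a failed attempt
  if (!ok && file.exists(destfile)) unlink(destfile)
  ok
}
```

A caller can then branch on the return value, for example if (safe_download(u, f)) x <- read.csv(f), instead of checking afterwards whether the file on disk is usable.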
Re: [R] User manual or tutorial for GAM package
On Feb 26, 2009, at 4:54 PM, David Winsemius wrote:

> On Feb 26, 2009, at 4:44 PM, Neotropical bat risk assessments wrote:
>
>> Hi all, I would like to run the gam package, but can not quite seem to get the syntax correct for my data format as simple as it is. The gamlss package has a user manual but other than the help with gam package I can not find a user manual per se.
>
> Maybe you need to buy the book:
> http://www.amazon.com/Generalized-Additive-Models-Introduction-Statistical/dp/1584884746

OR Statistical Models in S, chapter 7, by Hastie:

http://www.amazon.com/Statistical-Models-Chapman-Computer-Science/dp/0412052911

-- David Winsemius
Re: [R] Download daily weather data
Scillieri, John wrote:

> Looks like you can sign up to get XML feed data from Weather.com
> http://www.weather.com/services/xmloap.html

... and use the excellent R package XML by Duncan Temple Lang to parse the document and easily access the data with, e.g., XPath rather than regular expressions.

vQ

> Hope it works out!
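A minimal sketch of the XPath route, assuming the XML package is installed. The feed layout below is invented for illustration and does not reflect the real Weather.com schema; extract_temps() is a hypothetical helper name.

```r
# Sketch only: assumes the XML package is installed.
# extract_temps() and the <forecast>/<day>/<temp> layout are made up.
extract_temps <- function(xml_text) {
  doc <- XML::xmlParse(xml_text, asText = TRUE)
  # The XPath expression selects every <temp> element directly under a <day>
  as.numeric(XML::xpathSApply(doc, "//day/temp", XML::xmlValue))
}

sample_feed <- paste0(
  "<forecast>",
  "<day><temp>31</temp></day>",
  "<day><temp>27</temp></day>",
  "</forecast>"
)
```

With a real feed, only the XPath string would need to change to match the element names the service actually uses.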
Re: [R] Download daily weather data
Yes, as a general thing go to regular expressions if you don't have an existing library available to do the same thing (or you're lazy like me :).

James

On Thu, Feb 26, 2009 at 5:16 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:

> ... and use the excellent R package XML by Duncan Temple Lang to parse the document and easily access the data with, e.g., XPath rather than regular expressions.
>
> vQ
Re: [R] Download daily weather data
James Muller wrote:

> Yes, as a general thing go to regular expressions if you don't have an existing library available to do the same thing (or you're lazy like me:).

many things are simply *much* easier with xpath than with regexes, and with the XML package you get it for free.

vQ
Re: [R] User manual or tutorial for GAM package
-- or perhaps even the section on Additive Models in the latest edition of V&R's MASS (still a useful book to have in one's library, IMHO, although, like me, it's getting grayer)

Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374
Re: [R] Download daily weather data
2009/2/26 Thomas Levine thomas.lev...@gmail.com:

> I'm writing a program that will tell me whether I should wear a coat, so I'd like to be able to download daily weather forecasts and daily reports of recent past weather conditions.
>
> The NOAA has very promising tabular forecasts (http://forecast.weather.gov/MapClick.php?CityName=Ithaca&state=NY&site=BGM&textField1=42.4422&textField2=-76.5002&e=0&FcstType=digital), but I can't figure out how to import them.
>
> Someone must have needed to do this before. Suggestions?

You could use my geonames package, which uses the GeoNames query service. There are some sample queries here:

http://geonames.r-forge.r-project.org/

Easiest is probably to use GNfindNearByWeather:

  as.data.frame(GNfindNearByWeather(57,-2))
           clouds weatherCondition
  1 broken clouds              n/a
                                                 observation windDirection ICAO
  1 EGPD 262120Z 25003KT 9000 -RA BKN018 06/05 Q1012 NOSIG           250 EGPD
    elevation countryCode       lng temperature dewPoint windSpeed humidity
  1        65          GB -2.216667           6        5        03       93
        stationName            datetime  lat hectoPascAltimeter
  1 Aberdeen / Dyce 2009-02-26 21:20:00 57.2               1012

The package is on CRAN.

There is of course an easier way to decide if you need to wear a coat, and that is to look out the window :)

Barry
[R] Inefficiency of SAS Programming
If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

Frank
-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
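For readers who have not seen the R side of that comparison, here is a minimal sketch of evaluating logistic-regression predictions with a single matrix product. The coefficients, covariate names, and patient rows are invented for illustration; they are not taken from the AHRQ programs.

```r
# Hypothetical fitted coefficients: intercept, age (years), comorbidity flag.
beta <- c(-3.0, 0.04, 0.8)

# Design matrix for three made-up patients; the first column is the intercept.
X <- cbind(intercept = 1,
           age       = c(50, 70, 85),
           comorbid  = c(0, 1, 1))

# Linear predictor as one matrix-vector product, then the inverse logit.
lp <- drop(X %*% beta)
p  <- plogis(lp)

print(round(p, 3))
```

With a fitted glm object one would normally call predict(fit, newdata, type = "response") instead; the explicit product is shown only because it is the operation the post refers to.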
Re: [R] left truncated data survival analysis package
On Thu, Feb 26, 2009 at 7:02 AM, phguard...@aol.com wrote:
Hello, I'd like to run a survival analysis with left-truncated data. Could you recommend a package to do this, please?

The 'eha' package, if you want parametric or discrete-time models.

Göran

Thanks,
Philippe Guardiola

-- Göran Broström
Re: [R] Download daily weather data
See also http://umbrellatoday.com/

Hadley

On Thu, Feb 26, 2009 at 2:47 PM, Thomas Levine thomas.lev...@gmail.com wrote:
I'm writing a program that will tell me whether I should wear a coat, so I'd like to be able to download daily weather forecasts and daily reports of recent past weather conditions. The NOAA has very promising tabular forecasts (http://forecast.weather.gov/MapClick.php?CityName=Ithaca&state=NY&site=BGM&textField1=42.4422&textField2=-76.5002&e=0&FcstType=digital), but I can't figure out how to import them. Someone must have needed to do this before. Suggestions? Thomas Levine!

-- http://had.co.nz/
[R] User manual or tutorial for GAM package
Hi all, Tnx for the number of suggestions to buy one of the books. Unfortunately, this is a non-profit conservation project that has been w/o funding since all Central American projects were shut down 2 years ago for lack of donor funding, so I have nothing at all for books, or even for my own support or to cover expenses. I am trying to wrap up several risk assessments w/o funding. Tnx again; everyone on the list is very helpful, as are those on the ggplot list. Cheers from the jungles of Belize, Bruce
[R] Something similar to data.frame, but allows variables with unique lengths?
I have a tightly coupled collection of variables with different lengths and types (some character and others numeric). I looked at the documentation for data.frame, but it indicates that all variables are expected to have the same length, i.e. the same number of elements. I was hoping the See Also section of the data.frame documentation would help, but it did not have any good leads.

Here is a list of data I would like to store together: (FileName (character string), HighValue (numeric), LowValue (numeric), Units (character), OthersValues (array of numerics), Descriptor (character string))

By any chance is there a generic object type within R to store such data together in one variable? Thank you for any details and leads you can provide.
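One possible answer, offered as a sketch and not as the reply the poster actually received: R's built-in list type holds named components of different types and lengths. The field names follow the post; every value below is made up for illustration.

```r
# One record: components may differ in type and in length.
rec <- list(
  FileName     = "run01.dat",             # hypothetical file name
  HighValue    = 12.5,
  LowValue     = 3.1,
  Units        = "mV",
  OthersValues = c(4.2, 5.0, 7.7, 9.3),   # numeric vector of any length
  Descriptor   = "baseline measurement"
)

# Components are reached by name.
rec$HighValue
length(rec$OthersValues)

# Several records can be kept together as a list of lists.
rec2 <- modifyList(rec, list(FileName = "run02.dat", OthersValues = c(1, 2)))
recs <- list(rec, rec2)
file_names <- sapply(recs, function(r) r$FileName)
print(file_names)
```

A list of lists is enough when records are handled one at a time; if the fixed-length fields need tabular treatment, a data.frame with a list column for OthersValues is another option.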
[R] complex data summary
Dear All, I am interested in using R to summarize data so that I can THEN do some additional analyses using behavioral data. I study birds, mostly, and often work with behavioral data that include timing. So, for example, each line may be the occurrence of some behavior, and it would include identifiers for species, individual, date, time of observation and the various behavioral observations that may be important. Thus, if observing one individual, the difference between the time on one line and the time on the subsequent line could be the time it took to carry out a behavior. Or, perhaps, several lines together include all the behaviors of one bout.

For example, I have a project on hummingbirds, in which an individual can be observed for a time interval, during which it may do several things. Each thing becomes a line identifying what it was. Then, I would like to count behaviors from the start to the end of the bout. Basically, I want to calculate a few things: the difference between the times of the behaviors on one line and the subsequent line, or some number of subsequent lines; the number of other things in a time interval; and so on.

In JMP, to calculate the interval from one line to the next, I would generate a new variable that basically subtracts the time in line x from the time in line x+1. I can't figure out how to do that in R. Next, in JMP (or SAS), I would do a summary based on date, species, individual, etc., which took a Maximum and Minimum for time, the sum of, say, variable X, the Maximum of variable Y and so on.

Example:

    Date        Begin     End       Species  Individual  Feeding  Interaction  Interactsp
    2008-09-11  11:15:21  11:15:58  CHAU     1           3
    2008-09-11  11:16:01  11:16:12  CHAU     1                    Chase        CHAU
    2008-09-11  11:16:17  11:16:28  CHAU     1           5
    2008-09-11  11:16:32  11:16:49  CHAU     1                    Flee         THGL
    2008-09-11  11:16:53  11:17:15  THGL     1           8

Clearly, making another variable by just subtracting one variable from another, like time begin - time end, is no problem. Also, since the species changed, the individual number can remain. What I would like to do, for example, is count the number of feedings, interactions, and so on in an interval, as well as the number of species, and calculate the total time the individual was at the food source.

Well, all that is very simple in JMP and I just can't figure out how to do it in R. In fact, just figuring out how to make time and date into TIME and DATE was a bit complex, but I have managed to do that. I hope I have explained my problem well enough, because all the other problems I have hinge upon summarizing my data in a way that the summary can then be used in analysis. If I could figure out this one part, I would probably switch to R completely! If you have any helpful suggestions, I am sure that there are other behavioral biologists out there who would be able to benefit from this kind of information. I know my future students will!

Sincerely, Jim

-- James J. Roper
Smithsonian Tropical Research Institute
Bocas del Toro Marine Research Station
MRC 0580-03 Unit 9100, Box 0948 DPO AA 34002-9998
Skype-in (USA): +1 706 5501064
Skype-in (Brazil): 41 39415715
E-mail - personal: jjro...@gmail.com
E-mail - consulting: arsart...@gmail.com
9 21.122' N, and 82 15.390' W
In Google Earth, copy and paste - 9 21.122' N, 82 15.390' W
Ecologia e Conservação na UFPR http://www.bio.ufpr.br/ecologia/
Personal Pages http://jjroper.googlespages.com http://arsartium.googlepages.com
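The two operations described above (a line-to-next-line time difference, then a grouped summary) can be sketched in base R as follows. This is one possible approach, not a reply from the thread. The data frame re-enters the example rows from the post, with empty cells read as NA and the Interactsp column left out; the derived column names (BeginT, EndT, Duration, ToNext, and those in summ) are invented here.

```r
# Example rows from the post; blank cells become NA.
obs <- data.frame(
  Date        = "2008-09-11",
  Begin       = c("11:15:21", "11:16:01", "11:16:17", "11:16:32", "11:16:53"),
  End         = c("11:15:58", "11:16:12", "11:16:28", "11:16:49", "11:17:15"),
  Species     = c("CHAU", "CHAU", "CHAU", "CHAU", "THGL"),
  Individual  = 1,
  Feeding     = c(3, NA, 5, NA, 8),
  Interaction = c(NA, "Chase", NA, "Flee", NA),
  stringsAsFactors = FALSE
)

# Combine date and clock time into POSIXct so that arithmetic works.
obs$BeginT <- as.POSIXct(paste(obs$Date, obs$Begin), tz = "UTC")
obs$EndT   <- as.POSIXct(paste(obs$Date, obs$End),   tz = "UTC")

# Length of each line, in seconds.
obs$Duration <- as.numeric(difftime(obs$EndT, obs$BeginT, units = "secs"))

# Time in line x+1 minus time in line x; the last line has no successor.
obs$ToNext <- c(diff(as.numeric(obs$BeginT)), NA)

# Summary by date, species and individual: min/max time, sums and counts.
key  <- interaction(obs$Date, obs$Species, obs$Individual, drop = TRUE)
summ <- do.call(rbind, lapply(split(obs, key), function(d) {
  data.frame(
    Date          = d$Date[1],
    Species       = d$Species[1],
    Individual    = d$Individual[1],
    FirstBegin    = min(d$BeginT),
    LastEnd       = max(d$EndT),
    TotalSecs     = sum(d$Duration),
    TotalFeeding  = sum(d$Feeding, na.rm = TRUE),
    NInteractions = sum(!is.na(d$Interaction))
  )
}))
print(summ)
```

The split/lapply/rbind pattern is used because each summary column needs a different function; when a single function is enough, aggregate() or tapply() is shorter.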
Re: [R] Substituting in a variable file name in a Windows system command
Elaine Jones wrote:
I am running R version 2.8.1 on Windows XP. I generate and write a .csv file from my R script. Then the following command works to upload it to a remote server using a Windows batch file that carries out the ftp (among other things):

    system("C:/upload_data/uploadq8.bat C:/upload_data/out_2009-02-26.csv", wait=FALSE)

I want to set this up to run daily and create a unique filename each day it runs. I write the .csv file with a unique filename by

    fname <- paste("out_",Sys.Date(),".csv",sep="")
    write.table(config_all5,file=fname,row.names=FALSE, quote=FALSE,sep=",")

I can build the string (including quotes) that is the first argument in the system command:

Don't include the quotes.

    com <- paste('"C:/upload_data/uploadq8table.bat C:/upload_data/',fname,'"', sep="")

Should be

    com <- paste('C:/upload_data/uploadq8table.bat C:/upload_data/',fname, sep="")

Duncan Murdoch

    com
    [1] "\"C:/upload_data/uploadq8table.bat C:/upload_data/out_2009-02-26.csv\""

But when I substitute it into the system command I get an error:

    system(com, wait=FALSE)
    Warning in system(com, wait = FALSE) : "C:/upload_data/uploadq8table.bat C:/upload_data/out_2009-02-26.csv" not found

Any suggestions for how to resolve are appreciated!

Elaine McGovern Jones
ISC Tape and DASD Storage Products Characterization and Failure Analysis Engineering
Phone: 408 284 4853 Internal: 3-4853
jon...@us.ibm.com
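To make the difference between the two versions of com concrete, the following sketch builds both strings and checks for embedded quote characters. The paths are the ones from the post; the names day, com_bad and com_ok are introduced here for illustration, and the system() call is left commented out because it only makes sense on the poster's Windows machine.

```r
# Build today's file name, as in the original script.
day   <- Sys.Date()
fname <- paste("out_", day, ".csv", sep = "")

# Version from the question: the pasted pieces include literal double quotes,
# so the command handed to system() starts and ends with a quote character.
com_bad <- paste('"C:/upload_data/uploadq8table.bat C:/upload_data/', fname, '"',
                 sep = "")

# Corrected version: no embedded quotes; the R string already delimits it.
com_ok <- paste("C:/upload_data/uploadq8table.bat C:/upload_data/", fname,
                sep = "")

# cat() shows the raw characters without R's own escaping.
cat(com_bad, "\n")
cat(com_ok, "\n")

# Windows only, so not run here:
# system(com_ok, wait = FALSE)
```

The quotes typed around a string literal in R source are not part of the string, which is why the first system() call in the post worked while the pasted version, carrying its own quote characters, was reported as not found.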