Re: [R] truncating values into separate categories
I must apologize, as I wasn't clear about what I wanted to occur. I don't want to count the occurrences but rather recode them. I am trying to replace all of the values with the new coded values in Person_CAT. So for

  NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
          3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
          4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
          2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

Person_CAT would be: 1, 1, 2, 1, 1, 2, 2, 1, 4, 1, NA, 4, and so on. This task would easily be done in SPSS, but I am trying to automate it using R. I hope this is more clear.

Bill.Venables wrote:

Here is a suggestion:

  Per <- c(NA, 1, 2, 3, 4)
  NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
          3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
          4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
          2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
  Person_CAT <- cut(NP, breaks = c(0:4, Inf) - 0.5, labels = Per)
  table(Person_CAT)
  Person_CAT
  NA  1  2  3  4
   1 19 15  6  9

You should be aware, though, that items corresponding to the level NA will NOT be treated as missing.

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of PDXRugger
Sent: Friday, 31 July 2009 9:54 AM
To: r-help@r-project.org
Subject: [R] truncating values into separate categories

Hi all,

Simple question which I thought I had the answer to, but it isn't so simple for some reason. I am sure someone can easily help. I would like to categorize the values in NP into one of the five values in Per, with the last category (4) representing values >= 4 (hence 4:max(NP)). The problem is that R is reading 4:max(NP) as multiple values instead of a range, so the lengths of the labels and the breaks do not match. Suggestions?

  Per <- c(NA, 1, 2, 3, 4)
  NP = c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
         4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2, 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
  Person_CAT <- cut(NP, breaks=c(0,1,2,3,4:max(NP)), labels=Per)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
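For readers who want a genuinely missing value rather than a factor level labelled NA, the recoding asked for in this thread can also be sketched without cut(). This is only a minimal sketch and not the approach proposed in the thread: it assumes that 0 should become NA and that every value of 4 or more should collapse to 4, and it uses base R's pmin() and ifelse().

```r
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

# Assumption: 0 means "missing" and anything above 4 is capped at 4
Person_CAT <- ifelse(NP == 0, NA, pmin(NP, 4))

head(Person_CAT, 12)
# [1]  1  1  2  1  1  2  2  1  4  1 NA  4

# Counts per recoded value, with the real NA shown separately
table(Person_CAT, useNA = "ifany")
```

Because the result is a plain numeric vector, is.na(Person_CAT) flags the recoded zero, which is not the case for a factor level that merely prints as NA.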
[R] superpose 2 time series with different time intervals
I could use some advice.

I've got 2 time series. Both cover approximately the same period of time (i.e., 1940 to 2009). But one series has annual data and the other has monthly data. One refers to university enrollment; the other to unemployment rates. Both are currently in the same data frame.

I'd like to use the monthly time series as a light grayscale background for a plot of the annual time series, showing both series as type "l" (line). Naturally, with all the NAs in the annual series, that plot disappears because points are not connected across missing values.

I suppose I could make both series annual, but a lot of interesting detail would get lost this way. Or I guess I could interpolate values in the annual series with monthly approximations, but this means 11 out of every 12 values would be an approximation. Or I suppose I could plot each series separately and then print them with position information, which I'm reluctant to do because panel.superpose so nicely handles the alignment of the 2 panels.

What I'd really like to do is plot each independently but still superposed. Effectively this seems to mean monthly data intervals but line connections across the NAs in the series with annual intervals.

Any suggestions would be appreciated. Thanks.

Gary Lewis
[R] how to calculate time interval between dates
Dear R users:

I have a vector of dates as follows:

  t <- c("2007-01-05", "2007-05-14", "2007-12-28", "2008-01-09", "2008-04-24", "2009-02-14")

I'd like to calculate the number of days between those dates (time interval). How can I do it?

Thank you,
Julia
Re: [R] another automation question
your name is annoying.

On Fri, Jul 31, 2009 at 2:01 PM, RR! cwal...@usgs.gov wrote:

This code works:

  x <- letters[1:6]
  ycols <- 23:28
  xcols <- rep(c(3,4,5,8), each=length(ycols))
  somertime <- function(i,j) somers2(Pred_pres_a_indpdt[,i,,], population[,j])
  results <- mapply(somertime, xcols, ycols)

How can I make variable h work?

  x <- letters[1:6]
  ycols <- 23:28
  xcols <- rep(c(3,4,5,8), each=length(ycols))
  somertime <- function(h,i,j) somers2(Pred_pres_h_indpdt[,i,,], population[,j])
  results <- mapply(somertime, x, xcols, ycols)

-R
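As background to the question quoted above: inside the function body, Pred_pres_h_indpdt is a literal object name, so R does not substitute the value of h into it. One common way around this is to build the name as a string and look the object up with get(). The sketch below is not from the thread; the small matrices and the helper col_total() are hypothetical stand-ins for the poster's arrays and for somers2().

```r
# Hypothetical stand-ins for objects named like the poster's
Pred_pres_a_indpdt <- matrix(1:6, nrow = 2)
Pred_pres_b_indpdt <- matrix(7:12, nrow = 2)

# h is a letter, i is a column index
col_total <- function(h, i) {
  # Build the object name from h, then fetch the object by that name
  obj <- get(paste0("Pred_pres_", h, "_indpdt"))
  sum(obj[, i])
}

# mapply() walks the letters and the column indices in parallel
results <- mapply(col_total, c("a", "b"), c(1, 3))
results
#  a  b
#  3 23
```

Keeping the objects in a named list and indexing it with preds[[h]] would avoid get() altogether, at the cost of restructuring the data first.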
[R] re moving intial numerals
I would like to recreate data so that only the last 5 digits of the data below are included, so 200502019 would become 02019. Any ideas?

  data = c(200500735, 200502019, 200504131, 200504217, 200504629, 200504822,
           200510115, 200511605, 200514477, 200515314, 200515438, 200519040,
           200519603, 200522735, 200522853, 200523415, 200524227, 200524423)
Re: [R] Compare lm() to glm(family=poisson)
Mark Na mtb954 at gmail.com writes:

Dear R-helpers,

I would like to compare the fit of two models, one of which I fit using lm() and the other using glm(family=poisson). The latter doesn't provide r-squared, so I wonder how to go about comparing these models (they have the same formula).

Thanks very much, Mark Na

I'm not sure what you are trying to do, but it might be informative to compare the diagnostic plots from the fits. Remember that Poisson-distributed data are heteroscedastic (mean = variance), which isn't the default assumption when fitting with lm. Also, the default link function with the poisson family is log. So, these are things to take into account in any potential comparison.

Ken

--
Ken Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html
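The points made in the reply above (compare diagnostic plots, remember the log link and the mean-variance relationship) can be tried out on simulated data. This is only an illustrative sketch: the data, the coefficients 0.5 and 1.2, and the object names fit_lm and fit_glm are assumptions made for the example and have nothing to do with the poster's data.

```r
set.seed(1)
x <- runif(200, 0, 2)
y <- rpois(200, lambda = exp(0.5 + 1.2 * x))  # simulated counts, for illustration only

fit_lm  <- lm(y ~ x)                     # identity link, constant variance assumed
fit_glm <- glm(y ~ x, family = poisson)  # log link, variance equal to the mean

family(fit_glm)$link                     # the default link for the poisson family
# [1] "log"

# Residuals-versus-fitted plots for both fits, side by side
op <- par(mfrow = c(1, 2))
plot(fit_lm,  which = 1)
plot(fit_glm, which = 1)
par(op)
```

With count data like these, the lm panel typically fans out as the fitted values grow, which is the heteroscedasticity mentioned in the reply.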
Re: [R] superpose 2 time series with different time intervals
Try this:

  # two simulated series
  set.seed(123)
  ts.sim <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 70)
  ts.sim <- ts(c(ts.sim), start = 1940)
  ts.sim2 <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 12*70)
  ts.sim2 <- ts(c(ts.sim2), start = 1940, freq = 12)

  # plot
  plot(ts.sim2, type = "l", col = grey(0.5))
  lines(ts.sim)
  axis(1, time(ts.sim), lab = FALSE)
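When both series sit in one data frame, as described in the original question, another option is to draw the annual line through the non-missing rows only. The sketch below is an assumption about how such a data frame might look (one row per month, the annual value present only in January); the column names time, monthly and annual are hypothetical.

```r
set.seed(42)

# Hypothetical stand-in for the poster's data frame
dat <- data.frame(
  time    = seq(1940, by = 1/12, length.out = 12 * 70),
  monthly = cumsum(rnorm(12 * 70))
)
dat$annual <- NA
dat$annual[seq(1, nrow(dat), by = 12)] <- cumsum(rnorm(70))

ok <- !is.na(dat$annual)  # rows that actually carry an annual value

# Monthly series as a light grey background
plot(dat$time, dat$monthly, type = "l", col = grey(0.7),
     ylim = range(dat$monthly, dat$annual, na.rm = TRUE),
     xlab = "Year", ylab = "Value")

# Annual series drawn only through its observed points, so no gaps appear
lines(dat$time[ok], dat$annual[ok])
```

Subsetting before calling lines() connects consecutive annual observations directly, without interpolating any stored values.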
Re: [R] re moving intial numerals
Try this,

  # d holds the values from the original post
  formatC(d %% 1e5, width=5, flag = "0", mode="integer")

   [1] "00735" "02019" "04131" "04217" "04629" "04822" "10115" "11605" "14477"
  [10] "15314" "15438" "19040" "19603" "22735" "22853" "23415" "24227" "24423"

HTH,

baptiste

2009/7/31 PDXRugger j_r...@hotmail.com:

I would like to recreate data so that only the last 5 digits of the data below are included, so 200502019 would become 02019. Any ideas?

  data = c(200500735, 200502019, 200504131, 200504217, 200504629, 200504822,
           200510115, 200511605, 200514477, 200515314, 200515438, 200519040,
           200519603, 200522735, 200522853, 200523415, 200524227, 200524423)

--
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
http://newton.ex.ac.uk/research/emag
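Two base R alternatives to the formatC() call above are sketched here for comparison. They are not from the thread; the names last5 and last5_chr are made up for the example, and the second variant assumes every value has exactly nine digits.

```r
data <- c(200500735, 200502019, 200504131, 200504217, 200504629, 200504822,
          200510115, 200511605, 200514477, 200515314, 200515438, 200519040,
          200519603, 200522735, 200522853, 200523415, 200524227, 200524423)

# Arithmetic route: remainder after dividing by 100000, zero-padded to width 5
last5 <- sprintf("%05d", as.integer(data %% 1e5))

# String route (assumes nine-digit values): keep characters 5 to 9
last5_chr <- substr(sprintf("%.0f", data), 5, 9)

last5[1:3]
# [1] "00735" "02019" "04131"
```

Either way the result has to be a character vector, since a number cannot carry leading zeros.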
Re: [R] How to stop an R script when running JGR on a Linux/SuSE system
mau...@alice.it wrote:

I wonder whether there is a more gentle way to stop an R script running on top of JGR other than ... unplugging the power cord.

There must be a bug in JGR on Linux. Clicking the stop button should stop the script, but clicking it here on my Linux machine immediately crashes R together with JGR. Unfortunately, JGR still seems to be the best (and only) shell/editor combination available.

When running R in an ordinary terminal window, execution can be terminated with CTRL-C.

I hope they will soon fix this (and a small handful of other bugs/flaws, mostly missing/wrong keyboard shortcuts); JGR would then make a huge step from very good to excellent.

Bernd
Re: [R] How to stop an R script when running JGR on a Linux/SuSE system
Sorry for the possible double posting, but I got a strange error from a versatel(???) server about not enough quota when replying to the message.
Re: [R] R User Group listings
On Fri, 31 Jul 2009 06:45:38 -0400, Prof John C Nash (PJCN) wrote:

Further to my posting about R UG mailing lists etc., and David Smith's post about the list he is maintaining (I was aware of his blog, but not that he was updating -- good show), I'm in communication with him to try to ensure we get appropriate information out to useRs. Already there has been a posting asking if there is any group in Germany, and asking is the first step to getting a group going. I suspect we need to expand from just a listing to also include "Desperately seeking R users..." entries. Will see what we can do.

I briefly talked with Jip Porzak about it at useR (because he previously approached me with the question about having such a list on the official R web pages).

My personal opinion is that such a listing really belongs in the R Wiki, so that user groups can add themselves. Of course it would be great if somebody could act as an editor and keep an eye on the page, and we could have a prominent link to it from the R homepage.

Just my 2c,
Fritz
Re: [R] Problem with RGtk2 Rattle
Hi,

Thanks for all the advice. Unfortunately, I am unable to install the suggested fix. The error message is as follows:

  Error in gzfile(file, "r") : cannot open the connection
  In addition: Warning message:
  In gzfile(file, "r") :
    cannot open compressed file 'glade-3.4.3-win32-1/DESCRIPTION', probable reason 'No such file or directory'

Sorry, but nothing seems to work.

Regards
Wayne

Felix Andrews wrote:

This error comes from using an old version of the GTK+ libraries. Download the latest version for Windows from http://gladewin32.sourceforge.net/

-Felix

2009/7/31 Graham Williams graham.willi...@togaware.com:

Hi Wayne - but what version of the other tools have you installed?

Regards,
Graham

2009/7/30 Wayne Murray wayne.mur...@medicareaustralia.gov.au

Hi Graham,

Thanks for responding so promptly. Unfortunately, downloading and running this new version of Rattle did not alter the outcome. I am, however, running on Windows XP.

Regards
Wayne

Wayne Murray wrote:

Hi,

Apologies for previously trying to post this question onto the Dev forum. I have recently updated my versions of R and related packages. When I try to use rattle the following message appears:

  Error in .RGtkCall("R_setGObjectProps", obj, value, PACKAGE = "RGtk2") :
    Invalid property tooltip-text!

I have downloaded and installed the latest available version of RGtk2, so I am at a loss to explain this error or, more importantly, what I need to do to overcome it.

Thanks for any suggestions

Regards
Wayne

-----
Dr D. W. Murray
Canberra, Australia

--
Felix Andrews / 安福立
Postdoctoral Fellow
Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University
Canberra ACT 0200 Australia
M: +61 410 400 963
T: + 61 2 6125 1670
E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
--
http://www.neurofractal.org/felix/

-----
Dr D. W. Murray
Canberra, Australia
Re: [R] Using R with Hadoop/Hive for Big Data
Hi,

The document helps a lot, thanks. I need to know how to work with Hadoop and R in a parallel cluster environment. HIVE is a new system on top of Hadoop that uses a SQL derivative to query it.

http://hadoop.apache.org/hive/

Regards,
Ajay

On Fri, Jul 31, 2009 at 7:23 PM, Avram Aelony aav...@mac.com wrote:

I am not sure if I understood your question, but you may want to look at http://cran.r-project.org/web/packages/HadoopStreaming/HadoopStreaming.pdf

Regards,
Avram

On Friday, July 31, 2009, at 02:39PM, Ajay ohri ohri2...@gmail.com wrote:

Hive http://hadoop.apache.org/hive/ is a data warehouse infrastructure built on top of Hadoop that provides tools to enable easy data summarization, ad hoc querying and analysis of large datasets stored in Hadoop files. It provides a mechanism to put structure on this data, and it also provides a simple query language called QL, which is based on SQL and which enables users familiar with SQL to query this data. At the same time, this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers to do more sophisticated analysis which may not be supported by the built-in capabilities of the language.

Is there any package currently out or in development that is looking into using R-like matrix capabilities with HIVE-like big data abilities on a remote/parallel HPC?

Regards,
Ajay
Re: [R] how to calculate time interval between dates
On Jul 31, 2009, at 4:46 PM, liujb wrote:

Dear R users:

I have a vector of dates as follows:

  t <- c("2007-01-05", "2007-05-14", "2007-12-28", "2008-01-09", "2008-04-24", "2009-02-14")

I'd like to calculate the number of days between those dates (time interval). How can I do it?

That is not a vector of dates, but rather a character vector. Observe:

  t <- c("2007-01-05", "2007-05-14", "2007-12-28", "2008-01-09", "2008-04-24", "2009-02-14")
  class(t)
  # [1] "character"
  diff(t)
  # Error in r[i1] - r[-length(r):-(length(r) - lag + 1)] :
  #   non-numeric argument to binary operator

Try:

  t2 <- as.Date(t)
  class(t2)
  # [1] "Date"
  t2
  # [1] "2007-01-05" "2007-05-14" "2007-12-28" "2008-01-09" "2008-04-24" "2009-02-14"
  ?diff
  diff(t2)
  # Time differences in days
  # [1] 129 228  12 106 296

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
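As a small extension of the answer above, the difftime result can be converted to plain numbers, and difftime() can give the span between any two dates. This is a minimal sketch in base R; the names d_chr, dates, gaps and total are chosen for the example and do not appear in the thread.

```r
d_chr <- c("2007-01-05", "2007-05-14", "2007-12-28",
           "2008-01-09", "2008-04-24", "2009-02-14")
dates <- as.Date(d_chr)

# Days between consecutive dates, as an ordinary numeric vector
gaps <- as.numeric(diff(dates))
gaps
# [1] 129 228  12 106 296

# Days elapsed between the first and the last date
total <- as.numeric(difftime(dates[length(dates)], dates[1], units = "days"))
total
# [1] 771
```

Dropping the difftime class with as.numeric() is handy when the intervals feed into further arithmetic or a data frame column.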
Re: [R] R User Group listings
Why not! Looks like there were several conversations going on independently at UseR about this. I'll put up a page and then ask Martin to adjust the link.

JN
Re: [R] How to stop an R script when running JGR on a Linux/SuSE system
Bernd Kreuss wrote:

Sorry for the possible double posting, but I got a strange error from a versatel(???) server about not enough quota when replying to the message.

Yes, I had one of those too (see below), but notice that the error occurs after the mail has left the mailing list server at ETHZ; i.e., it involves one recipient rather than all. AFAICS, there's a misconfigured mailer en route to one of our subscribers (it disobeys Errors-To: and mails the sender instead). The logical consequence would seem to be to unsubscribe mailingli...@versanet.de.

-p

--

Hi. This is the qmail-send program at maildo.versatel.de.
I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

mailingli...@versanet.de:
maildrop: Filtering through xfilter /usr/local/bin/reformail -a X-VT-Original-To: mailingli...@versanet.de
maildrop: Filtering through xfilter /usr/local/bin/spamc -d 89.245.129.196
maildrop: Filtering through `$MAILFILTER -u $LOGNAME`
maildrop: maildir over quota.

--- Below this line is a copy of the message.

Return-Path: p.dalga...@biostat.ku.dk
Received: (qmail 29582 invoked from network); 30 Jul 2009 09:14:20 -
Received: from avir03do.versatel-west.de ([89.245.129.71]) (envelope-sender p.dalga...@biostat.ku.dk) by mail02do.versatel.de (qmail-ldap-1.03) with SMTP for mailingli...@versanet.de; 30 Jul 2009 09:14:20 -
Received: from avir03do.versatel-west.de (localhost.localdomain [127.0.0.1]) by avir03do.versatel-west.de (Postfix) with ESMTP id 99D7773CBAC for mailingli...@versanet.de; Thu, 30 Jul 2009 11:14:18 +0200 (CEST)
Received: from mail01do.versatel.de (mail01do.versatel.de [89.245.129.21]) by avir03do.versatel-west.de (Postfix) with SMTP id 5D79C73CBA1 for mailingli...@versanet.de; Thu, 30 Jul 2009 11:14:18 +0200 (CEST)
Received: by mail01do.versatel.de (sSMTP sendmail emulation); Thu, 30 Jul 2009 11:14:19 +0200
Received: (qmail 22716 invoked from network); 30 Jul 2009 09:14:18 -
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on spamkill08do.versatel-west.de
X-Spam-Status: No, score=0.0 required=7.7 tests=none
Received: from hypatia.math.ethz.ch ([129.132.145.15]) (envelope-sender r-devel-boun...@r-project.org) by mail01do.versatel.de (qmail-ldap-1.03) with SMTP for mailingli...@versanet.de; 30 Jul 2009 09:14:16 -
Received: from hypatia.math.ethz.ch (hypatia [129.132.145.15]) by hypatia.math.ethz.ch (8.14.1/8.14.1) with ESMTP id n6U9D4HU025616; Thu, 30 Jul 2009 11:13:23 +0200
Received: from phil2.ethz.ch (phil2.ethz.ch [129.132.202.240]) by hypatia.math.ethz.ch (8.14.1/8.14.1) with ESMTP id n6U9CHbG024699 for r-de...@stat.math.ethz.ch; Thu, 30 Jul 2009 11:12:57 +0200
Received: from mail.kubism.ku.dk ([192.38.18.21] helo=mail.pubhealth.ku.dk) by phil2.ethz.ch with esmtp (Exim 4.66) (envelope-from p.dalga...@biostat.ku.dk) id 1MWRgL-00057m-7h for r-de...@stat.math.ethz.ch; Thu, 30 Jul 2009 11:12:17 +0200
Received: from titmouse2.kubism.ku.dk (0x50c633f5.boanxx12.dynamic.dsl.tele.dk [80.198.51.245]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.pubhealth.ku.dk (Postfix) with ESMTP id 0DDE8282BE88; Thu, 30 Jul 2009 11:11:52 +0200 (CEST)
Message-ID: 4a71641a.8080...@biostat.ku.dk
Date: Thu, 30 Jul 2009 11:12:58 +0200
From: Peter Dalgaard p.dalga...@biostat.ku.dk
User-Agent: Thunderbird 2.0.0.21 (X11/20090320)
MIME-Version: 1.0
To: kb...@andrew.cmu.edu
References: 20090729175016.731f8282c...@mail.pubhealth.ku.dk
In-Reply-To: 20090729175016.731f8282c...@mail.pubhealth.ku.dk
X-Tag-Only: YES
X-Filter-Node: phil2.ethz.ch
X-USF-Spam-Level: --
X-USF-Spam-Status: hits=-2.5 tests=BAYES_00,FORGED_RCVD_HELO,SPF_PASS
X-USF-Spam-Flag: NO
X-Virus-Scanned: by amavisd-new at stat.math.ethz.ch
Cc: r-b...@r-project.org, r-de...@stat.math.ethz.ch
Subject: Re: [Rd] Strange Interaction Between Promises and Closures (PR#13861)
X-BeenThere: r-de...@r-project.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id: R development and technical/programmer topics r-devel.r-project.org
List-Unsubscribe: https://stat.ethz.ch/mailman/options/r-devel, mailto:r-devel-requ...@r-project.org?subject=unsubscribe
List-Archive: https://stat.ethz.ch/pipermail/r-devel
List-Post: mailto:r-de...@r-project.org
List-Help: mailto:r-devel-requ...@r-project.org?subject=help
List-Subscribe: https://stat.ethz.ch/mailman/listinfo/r-devel, mailto:r-devel-requ...@r-project.org?subject=subscribe
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=iso-8859-1; Format=flowed
Sender: r-devel-boun...@r-project.org
Errors-To: r-devel-boun...@r-project.org
X-VT-Original-To: mailingli...@versanet.de
X-Anti-Virus: Kaspersky Anti-Virus
[R] R book for economists
Dear Group,

I am an economics student starting PhD work in London. As preparation I would like to get to know R a little bit better. For Stata there are tons of books; can you recommend a book for R? I have a solid grounding in econometrics, so it should be more of a how-to book.

Best regards
Thiemo

---
Thiemo Fetzer, Economist
http://freigeist.devmag.net
http://www.devmag.net
Re: [R] R User Group listings
Better ideas should prevail. There is now a wiki page at http://wiki.r-project.org/rwiki/doku.php?id=rugs:r_user_groups. It is not yet fully populated. (David Smith's blog at REvolution Computing mentions more groups.)

JN
Re: [R] about the summary(cph.object)
On Jul 31, 2009, at 11:24 PM, zhu yao wrote:

Could someone explain the summary(cph.object)? The example is in the help file of cph.

  n <- 1000
  set.seed(731)
  age <- 50 + 12*rnorm(n)
  label(age) <- "Age"
  sex <- factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4)))
  cens <- 15*runif(n)
  h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
  dt <- -log(runif(n))/h
  label(dt) <- 'Follow-up Time'
  e <- ifelse(dt <= cens,1,0)
  dt <- pmin(dt, cens)
  units(dt) <- "Year"
  dd <- datadist(age, sex)
  options(datadist='dd')

This is the process for setting the range for the display of effects in Design regression objects. See:

?datadist

  q.effect: set of two quantiles for computing the range of continuous variables to use in estimating regression effects. Defaults are c(.25,.75), which yields inter-quartile-range odds ratios, etc.

?summary.Design

  #---
  By default, inter-quartile range effects (odds ratios, hazards ratios, etc.) are printed for continuous factors, ...
  #---
  Value: For summary.Design, a matrix of class summary.Design with rows corresponding to factors in the model and columns containing the low and high values for the effects, the range for the effects, the effect point estimates (difference in predicted values for high and low factor values), the standard error of this effect estimate, and the lower and upper confidence limits.
  #---

  Srv <- Surv(dt,e)
  f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
  summary(f)

  Effects   Response : Srv

   Factor         Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
   age            40.872 57.385 16.513 1.21   0.21 0.80       1.62
    Hazard Ratio  40.872 57.385 16.513 3.35     NA 2.22       5.06

In this case, with a 4 df regression spline, you need to look at the effect across the range of the variable. You ought to plot the age effect and examine anova(f). In the untransformed situation the plot is on the log hazards scale for cph. So the effect for age in this case should be the difference in log hazard at ages 40.872 and 57.385. S.E. is the standard error of that estimate, and the Lower and Upper numbers are the confidence bounds on the effect estimate. The Hazard Ratio row gives you exponentiated results, so a difference in log hazards becomes a hazard ratio. {exp(1.21) = 3.35}

   sex - Female:Male 2.000 1.000 NA 0.64 0.15 0.35 0.94
    Hazard Ratio     2.000 1.000 NA 1.91   NA 1.42 2.55

What's the meaning of Effect, S.E., Lower, Upper?

You probably ought to read a bit more basic material. If you are asking this question, Harrell's Regression Modeling Strategies might be over your head, but it would probably be a good investment anyway. Venables and Ripley's Modern Applied Statistics has a chapter on survival analysis. Also consider Kalbfleisch and Prentice, Statistical Analysis of Failure Time Data. I'm sure there are others; those are the ones I have on my shelf.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
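The relationship between the age row and its Hazard Ratio row can be checked with a few lines of arithmetic. This sketch uses only the numbers printed in the summary output quoted in this message and base R; it assumes the limits are ordinary normal-approximation (Wald) intervals, which the printed figures are consistent with but which the thread does not state.

```r
effect <- 1.21  # difference in log hazard, age 57.385 versus 40.872 (from the output)
se     <- 0.21  # standard error of that difference (from the output)

# Assumed Wald limits on the log hazard scale
z      <- qnorm(0.975)
log_ci <- effect + c(-1, 1) * z * se
round(log_ci, 2)
# [1] 0.80 1.62

# Exponentiating moves everything to the hazard ratio scale
round(exp(effect), 2)
# [1] 3.35
round(exp(log_ci), 2)
# [1] 2.22 5.06
```

The Hazard Ratio row has no standard error because the interval is built on the log scale and then transformed, which is why that cell prints as NA.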
Re: [R] Problem with RGtk2 Rattle
Um, it sounds as if you are trying to install glade-3.4.3-win32.zip into R as a package... but it is not an R package! GTK+/Glade is a system library to be installed into Windows. You should download the .exe (not the .zip), http://downloads.sourceforge.net/gladewin32/gtk-dev-2.12.9-win32-2.exe, and run it to install it.

By the way, problems with rattle are best sent to the rattle-users mailing list: http://groups.google.com/group/rattle-users

-Felix

2009/8/1 Wayne Murray wayne.mur...@medicareaustralia.gov.au:

Hi, thanks for all the advice. Unfortunately I am unable to install the suggested fix. The error message is as follows:

Error in gzfile(file, "r") : cannot open the connection
In addition: Warning message:
In gzfile(file, "r") : cannot open compressed file 'glade-3.4.3-win32-1/DESCRIPTION', probable reason 'No such file or directory'

Sorry, but nothing seems to work.

Regards, Wayne

Felix Andrews wrote:

This error comes from using an old version of the GTK+ libraries. Download the latest version for Windows from http://gladewin32.sourceforge.net/

-Felix

2009/7/31 Graham Williams graham.willi...@togaware.com:

Hi Wayne - but what version of the other tools have you installed?

Regards, Graham

2009/7/30 Wayne Murray wayne.mur...@medicareaustralia.gov.au:

Hi Graham, thanks for responding so promptly. Unfortunately downloading and running this new version of Rattle did not alter the outcome. I am, however, running on Windows XP.

Regards, Wayne

Wayne Murray wrote:

Hi, apologies for previously trying to post this question on the Dev forum. I have recently updated my versions of R and related packages. When I try to use rattle the following message appears:

Error in .RGtkCall("R_setGObjectProps", obj, value, PACKAGE = "RGtk2") : Invalid property tooltip-text!

I have downloaded and installed the latest available version of RGtk2, so I am at a loss to explain this error or, more importantly, what I need to do to overcome it. Thanks for any suggestions.

Regards, Wayne

Dr D. W. Murray, Canberra, Australia

--
Felix Andrews / 安福立
Postdoctoral Fellow, Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University, Canberra ACT 0200, Australia
M: +61 410 400 963 T: + 61 2 6125 1670 E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
http://www.neurofractal.org/felix/
Re: [R] R book for economists
How about Kleiber, C. & Zeileis, A., Applied Econometrics with R, Springer, 2008?

Ronggui

2009/8/1 Thiemo Fetzer t...@devmag.net:

Dear Group, I am an economics student starting with PhD work in London. As preparation I would like to get to know R a little bit better. For Stata there are tons of books, however, can you recommend a book for R? I have some substantiated econometrics knowledge, so it should be more a how-to book.

Best regards, Thiemo

Thiemo Fetzer, Economist
http://freigeist.devmag.net
http://www.devmag.net

--
HUANG Ronggui, Wincent
PhD Candidate, Dept of Public and Social Administration, City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html
Re: [R] truncating values into separate categories
On Jul 31, 2009, at 2:55 PM, PDXRugger wrote:

I must apologize, as I wasn't clear about what I wanted to occur. I don't want to count the occurrences but rather recode them. I am trying to replace all of the values with the new coded values in Person_CAT. So

NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

and Person_CAT: 1, 1, 2, 1, 1, 2, 2, 1, 4, 1, NA, 4, and so on. This task would easily be done in SPSS but I am trying to automate it using R. I hope this is more clear.

Perhaps:

?cut   # with special attention to the right parameter, which is set to TRUE by default

per_Cat <- cut(NP, breaks = c(1:4, Inf), right = FALSE)
per_Cat
 [1] [1,2) [1,2) [2,3) [1,2) [1,2) [2,3) [2,3) [1,2) [4,Inf) [1,2) <NA> [4,Inf)
[13] [3,4) [3,4) [1,2) [4,Inf) [3,4) [4,Inf) [1,2) [4,Inf) [1,2) [2,3) [2,3) [2,3)
[25] [4,Inf) [4,Inf) [1,2) [2,3) [1,2) [3,4) [3,4) [1,2) [2,3) [2,3) [1,2) [2,3)
[37] [1,2) [2,3) [2,3) [3,4) [1,2) [1,2) [4,Inf) [4,Inf) [1,2) [1,2) [1,2) [2,3)
[49] [2,3) [2,3)
Levels: [1,2) [2,3) [3,4) [4,Inf)

Per <- c(1, 2, 3, 4)
levels(per_Cat) <- Per
per_Cat
 [1] 1 1 2 1 1 2 2 1 4 1 <NA> 4 3 3 1 4 3 4 1 4
[21] 1 2 2 2 4 4 1 2 1 3 3 1 2 2 1 2 1 2 2 3
[41] 1 1 4 4 1 1 1 2 2 2
Levels: 1 2 3 4

Bill.Venables wrote:

Here is a suggestion:

Per <- c(NA, 1, 2, 3, 4)
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
        3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
        2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
Person_CAT <- cut(NP, breaks = c(0:4, Inf) - 0.5, labels = Per)
table(Person_CAT)
Person_CAT
<NA>    1    2    3    4
   1   19   15    6    9

You should be aware, though, that items corresponding to the level NA will NOT be treated as missing.

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of PDXRugger
Sent: Friday, 31 July 2009 9:54 AM
To: r-help@r-project.org
Subject: [R] truncating values into separate categories

Hi all, Simple question which I thought I had the answer to, but it isn't so simple for some reason. I am sure someone can easily help. I would like to categorize the values in NP into one of the five values in Per, with the last category (4) representing values >=4 (hence 4:max(NP)). The problem is that R is reading max(NP) as multiple values instead of a range, so the lengths of the labels and the breaks are not matching. Suggestions?

Per <- c(NA, 1, 2, 3, 4)
NP = c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2, 4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2, 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
Person_CAT <- cut(NP, breaks = c(0, 1, 2, 3, 4:max(NP)), labels = Per)

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
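As a cross-check on the cut() approach in this thread, here is a minimal sketch of the same recoding done arithmetically. It assumes, as the poster describes, that 0 should become NA and that everything at or above 4 collapses into category 4. The helper name recode_np and its top argument are made up for illustration and do not come from the thread.

```r
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
        4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2, 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)

# Hypothetical helper: cap values at `top` and turn values below 1 into NA.
recode_np <- function(x, top = 4) {
  out <- pmin(x, top)   # everything above `top` becomes `top`
  out[x < 1] <- NA      # zeros (and anything smaller) become missing
  out
}

Person_CAT <- recode_np(NP)

# The same categories via cut(); as.integer() extracts the level codes 1..4.
via_cut <- as.integer(cut(NP, breaks = c(1:4, Inf), right = FALSE))
```

The arithmetic version returns a plain numeric vector with a real NA, which may be closer to the SPSS-style recode the poster asked for; the cut() version returns a factor.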
Re: [R] R book for economists
On Sat, 1 Aug 2009, Thiemo Fetzer wrote:

Dear Group, I am an economics student starting with PhD work in London. As preparation I would like to get to know R a little bit better. For Stata there are tons of books, however, can you recommend a book for R?

Of course, I have to recommend our book:

Kleiber & Zeileis, Applied Econometrics with R, Springer.
http://www.springer.com/978-0-387-77316-2
http://CRAN.R-project.org/package=AER

You can grab the preface and intro chapters in the Sample pages on Springer's page to get an impression.

There is also Rick Vinod's book:

Vinod, Hands-On Intermediate Econometrics Using R, World Scientific.
http://www.worldscibooks.com/economics/6895.html

And somewhat more specialized is Bernhard Pfaff's:

Pfaff, Analysis of Integrated and Cointegrated Time Series with R, Springer.
http://www.springer.com/978-0-387-75966-1

You might find further useful information on the econometrics task view: http://CRAN.R-project.org/view=Econometrics

And finally there was also a JSS special volume on Econometrics in R last year: http://www.jstatsoft.org/v27/

Best, Z
Re: [R] SVG output on Windows OS
On Jul 31, 2009, at 6:41 PM, Michael Roessler wrote:

How may one save a graphic as svg on Windows? The svg() command is recognized and functions well on Linux, etc., but not on Windows, it seems. I'm trying to use Hadley Wickham's ggplot2 and I would like to be able to save created charts as svg for later input into Illustrator. I am able to accomplish this workflow under Linux, but I don't know how to get R to recognize the svg() command under Windows. I have loaded RsvgDevice, Cairo, and cairoDevice in my attempts. The problem seems to me to be directly related to enabling R to produce svg output on Windows, rather than related to ggplot2.

What does capabilities() return?

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
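To make the capabilities() question concrete, here is a minimal sketch that checks for cairo support and only then opens the svg() device from grDevices. Whether svg() is available on a particular Windows build depends on the R version, so treat this as an assumption to verify rather than a guaranteed fix; the output path is illustrative.

```r
# capabilities() reports which optional features this R build was compiled with.
has_cairo <- isTRUE(unname(capabilities("cairo")))

out <- file.path(tempdir(), "example.svg")  # illustrative output path
if (has_cairo) {
  svg(out, width = 6, height = 4)  # open the SVG device
  plot(1:10, main = "SVG test")    # any plot; a ggplot object would need print()
  dev.off()                        # close the device so the file is written
}
```

If has_cairo is FALSE, the svg() call is skipped and a package-provided device would be needed instead.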
Re: [R] Problem with RGtk2 Rattle
2009/8/1 Felix Andrews fe...@nfrac.org:

Um, it sounds as if you are trying to install glade-3.4.3-win32.zip into R as a package... but it is not an R package! GTK+/Glade is a system library to be installed into Windows. You should download the .exe (not the .zip), http://downloads.sourceforge.net/gladewin32/gtk-dev-2.12.9-win32-2.exe, and run it to install it.

By the way, problems with rattle are best sent to the rattle-users mailing list: http://groups.google.com/group/rattle-users

-Felix

Also Wayne, I hope you are following the instructions at http://datamining.togaware.com/survivor/Install_MS_Windows.html

If there is any ambiguity there please let me know so I can make it clearer. I know of many who have installed Rattle following these, so they should work. (And thanks for the help Felix.)

Regards, Graham
[R] Add columns in a dataframe and fill them from another table according to a criteria
Dear R users,

I am new to R. What I want to do is explained below. I have a table called States.Prob, which is given below. This table gives the probabilities of the changes in the swap curve depending on the state of the swap curve. I want to put these probabilities in my dataframe mydata (given after the prob table).

Prob of States

Changes  State1  State2  State3  State4
a        Pa1     Pa2     Pa3     Pa4
b        Pb1     Pb2     Pb3     Pb4
c        Pc1     Pc2     Pc3     Pc4
d        Pd1     Pd2     Pd3     Pd4

I also have a dataframe (with 93 rows) called mydata, part of which (6 rows) is given below, where I want to fill in the last four columns with probabilities taken from States.Prob according to the change and state in mydata4:

   Change  State   PState1  PState2  PState3  PState4
1  b       State1  Pb1
2  a       State4                             Pa4
3  b       State2           Pb2
4  c       State3                    Pc3
5  d       State1  Pd1
6  a       State3                    Pa3

What I want to do is highlighted in red. How can I do this easily? Many thanks for your time.

Kind regards, Meenu

P.S. Thanks for your reply John. I've tried to put only the relevant columns of the dataframe. Hope it's more clear now.
[R] xyplot: superpose 2 time series with different time intervals
I could use some advice regarding xyplot. I've got 2 time series. Both cover approximately the same period of time (i.e., 1940 to 2009). But one series has annual data and the other has monthly data. One refers to university enrollment; the other to unemployment rates. Both are currently in the same data frame.

I'd like to use the monthly time series as a light grayscale background for a plot of the annual time series, showing both series as type "l" (line). Naturally, with all the NA's in the annual series, that plot disappears because points are not connected across missing values.

I suppose I could make both series annual, but a lot of interesting detail would get lost this way. Or I guess I could interpolate values in the annual series with monthly approximations, but this means 11 out of every 12 values is an approximation. Or I suppose I could plot each series separately and then print them with position information, which I'm reluctant to do because panel.superpose so nicely handles the alignment of the 2 panels.

What I'd really like to do is plot each independently but still superposed. Effectively this seems to mean monthly data intervals but line connections across the NA's in the series with annual intervals. Any suggestions would be appreciated. Thanks.

Gary Lewis
Re: [R] xyplot: superpose 2 time series with different time intervals
Try this using the same ts.sim and ts.sim2 from my previous post: https://stat.ethz.ch/pipermail/r-help/2009-August/206697.html

library(zoo)
library(lattice)
plot(na.approx(cbind(as.zoo(ts.sim), as.zoo(ts.sim2))), screen = 1, col = c("black", grey(0.5)))

On Sat, Aug 1, 2009 at 10:26 AM, Gary Lewis gary.m.le...@gmail.com wrote:

I could use some advice regarding xyplot. I've got 2 time series. Both cover approximately the same period of time (i.e., 1940 to 2009). But one series has annual data and the other has monthly data. One refers to university enrollment; the other to unemployment rates. Both are currently in the same data frame.

I'd like to use the monthly time series as a light grayscale background for a plot of the annual time series, showing both series as type "l" (line). Naturally, with all the NA's in the annual series, that plot disappears because points are not connected across missing values.

I suppose I could make both series annual, but a lot of interesting detail would get lost this way. Or I guess I could interpolate values in the annual series with monthly approximations, but this means 11 out of every 12 values is an approximation. Or I suppose I could plot each series separately and then print them with position information, which I'm reluctant to do because panel.superpose so nicely handles the alignment of the 2 panels.

What I'd really like to do is plot each independently but still superposed. Effectively this seems to mean monthly data intervals but line connections across the NA's in the series with annual intervals. Any suggestions would be appreciated. Thanks.

Gary Lewis
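For readers without the zoo package, the idea of connecting the annual series across its NA's can be sketched in base graphics by simply dropping the missing rows before drawing the line. This is a different technique from the zoo/na.approx answer above (no interpolation, no lattice), and the data below are made up purely for illustration.

```r
# Made-up data: 5 years of monthly values, plus one annual value each January.
set.seed(1)
months  <- seq(as.Date("2000-01-01"), by = "month", length.out = 60)
monthly <- cumsum(rnorm(60))
annual  <- rep(NA_real_, 60)
annual[seq(1, 60, by = 12)] <- c(10, 12, 11, 13, 14)

ok <- !is.na(annual)  # rows that actually carry an annual observation

pdf(NULL)  # draw to a null device so the sketch runs without a display
# Monthly series as a light grey background line.
plot(months, monthly, type = "l", col = "grey70", xlab = "", ylab = "monthly")
par(new = TRUE)  # overlay a second plot on its own y scale
# Annual series: only the non-missing points, so the line is unbroken.
plot(months[ok], annual[ok], type = "l", axes = FALSE, xlab = "", ylab = "",
     xlim = range(months))
axis(4)  # right-hand axis for the annual scale
dev.off()
```

Subsetting with ok keeps only the observed annual points, so lines are drawn between consecutive years without inventing monthly values.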
Re: [R] Determine the dimension-names of an element in an array in R
Hi Christian:

Many thanks for the code. But I am afraid that your code still has a problem in terms of providing the correct correlation. For example, if you look at the correlation between DataArray_1["A2","B1","D1",] and DataArray_2["A2","C1","D1",] after running your code, you will notice that this is actually the correlation between DataArray_1["A2","B1","D1",] and DataArray_2["A1","C1","D1",], and so on. The code gives the correct result only in the case where elements corresponding to A1 and D1 are involved in DataArray_1 and DataArray_2. The problem is in

Correl <- Correl[1:length(c),,,]

We need to select elements of Correl more carefully to reach a proper solution.

Thanks, Sauvik

On Wed, Jul 29, 2009 at 11:41 PM, Poersching poerschin...@web.de wrote:

Hey, I have forgotten to generalize the code, so

Correl <- Correl[1:4,,,]

must be

Correl <- Correl[1:length(c),,,]

It's because of the comparison levels. I think you don't want the correlation between A1, B1, D1 and A2, C1, D1, but between A1, B1, D1 and A1, C1, D1 or between A1, B1, D1 and A1, C2, D1. So the 1:length(c) writes only the correlation between the B and C out of the whole correlation array. That's also why the sequence in the second apply function is changed.

Regards, Christian

Poersching wrote:

Hey, I think I have a solution for your problem:

Correl <- apply(DataArray_1, 1:3, function(d1) apply(DataArray_2, c(2,1,3), function(d) cor(d1, d)))
Correl <- Correl[1:4,,,]
dimnames(Correl)[[1]] <- c
Correl <- aperm(Correl, c(2,3,1,4))

This one should work. :-)

Best Regards, Christian

Sauvik De wrote:

Hi there, Thanks again for your reply. I know a for-loop is always a solution to my problem and I had already coded it using a for-loop. But the number of levels for each dimension is large in the actual problem and hence it was time-consuming. So I was just wondering if there are any alternative ways of solving my problem. That's why I tried apply functions (sapply), assuming that this might work out faster, even fractionally, compared to a for-loop.

Cheers, Sauvik

On Mon, Jul 27, 2009 at 12:28 AM, Poersching poerschin...@web.de wrote:

Sauvik De wrote:

Hi: Lots of thanks for your valuable time! But I am not sure how you would like to use the function in this situation. As I had mentioned, the first element of my output array should be like:

cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],use="pairwise.complete.obs")

in my code below, and the output array of correlations I wish to get using sapply is as follows:

Correl = sapply(Correl, function(d) cor(DataArray_1[...], DataArray_2[...], use="pairwise.complete.obs"))

So it would be of great help if you could kindly specify how to utilise your function findIndex in ... Apologies for all this!

Thanks and Regards, Sauvik

Hey, sorry, I hadn't understood your problem last time, but now this solution should solve your problem, so I hope. :-) It's only a for loop, but an apply function may work too. I will think about this, but for now... ;-)

la <- length(a)
lb <- length(b)
lc <- length(c)
ld <- length(d)
for (ia in 1:la) {
  for (ib in 1:lb) {
    for (ic in 1:lc) {
      for (id in 1:ld) {
        Correl[ia,ib,ic,id] <- cor(
          DataArray_1[dimnames(Correl)[[1]][ia], dimnames(Correl)[[2]][ib], dimnames(Correl)[[4]][id],],
          DataArray_2[dimnames(Correl)[[1]][ia], dimnames(Correl)[[3]][ic], dimnames(Correl)[[4]][id],],
          use="pairwise.complete.obs")
      }
    }
  }
}
## with function findIndex you can find the dimensions with
## i.e. cor values greater 0.5 or smaller -0.5, like:
findIndex(Correl, Correl[Correl>0.5])
findIndex(Correl, Correl[Correl<(-0.5)])

I have changed the code of the function findIndex in the line which contains:

el[j] <- which(is.element(data, element[j]))

Regards, Christian

On Sun, Jul 26, 2009 at 3:54 PM, Poersching poerschin...@web.de wrote:

Sauvik De wrote:

Hi Gabor: Many thanks for your prompt reply! The code is fine. But I need it in more general form as I had mentioned that I need to input any 0 to find its dimension-names. Actually, I was using sapply to calculate correlation and this idea was required in the middle of the correlation calculation. I am providing the way I tried my calculation.

a = c("A1","A2","A3","A4","A5")
b = c("B1","B2","B3")
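The question in the subject line, finding the dimension names of particular elements of an array, can also be approached with base R's which(..., arr.ind = TRUE). The sketch below is not the findIndex function from the thread (its source is not shown here); the small array and its values are made up for illustration.

```r
# A small made-up 3-d array with named dimensions.
arr <- array(0, dim = c(2, 3, 2),
             dimnames = list(c("A1", "A2"), c("B1", "B2", "B3"), c("D1", "D2")))
arr["A2", "B3", "D1"] <- 0.9
arr["A1", "B2", "D2"] <- -0.7

# arr.ind = TRUE returns one row per hit, with one column per dimension.
idx <- which(abs(arr) > 0.5, arr.ind = TRUE)

# Translate the integer positions back into dimension names.
hits <- data.frame(
  a = dimnames(arr)[[1]][idx[, 1]],
  b = dimnames(arr)[[2]][idx[, 2]],
  d = dimnames(arr)[[3]][idx[, 3]],
  value = arr[idx],          # a numeric index matrix picks out the matching cells
  stringsAsFactors = FALSE
)
hits
```

The same pattern extends to a four-dimensional array such as Correl by adding one more column lookup.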
Re: [R] xyplot: superpose 2 time series with different time intervals
In the last statement you can replace plot with xyplot (although both work).

On Sat, Aug 1, 2009 at 10:37 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

Try this using the same ts.sim and ts.sim2 from my previous post: https://stat.ethz.ch/pipermail/r-help/2009-August/206697.html

library(zoo)
library(lattice)
plot(na.approx(cbind(as.zoo(ts.sim), as.zoo(ts.sim2))), screen = 1, col = c("black", grey(0.5)))

On Sat, Aug 1, 2009 at 10:26 AM, Gary Lewis gary.m.le...@gmail.com wrote:

I could use some advice regarding xyplot. I've got 2 time series. Both cover approximately the same period of time (i.e., 1940 to 2009). But one series has annual data and the other has monthly data. One refers to university enrollment; the other to unemployment rates. Both are currently in the same data frame.

I'd like to use the monthly time series as a light grayscale background for a plot of the annual time series, showing both series as type "l" (line). Naturally, with all the NA's in the annual series, that plot disappears because points are not connected across missing values.

I suppose I could make both series annual, but a lot of interesting detail would get lost this way. Or I guess I could interpolate values in the annual series with monthly approximations, but this means 11 out of every 12 values is an approximation. Or I suppose I could plot each series separately and then print them with position information, which I'm reluctant to do because panel.superpose so nicely handles the alignment of the 2 panels.

What I'd really like to do is plot each independently but still superposed. Effectively this seems to mean monthly data intervals but line connections across the NA's in the series with annual intervals. Any suggestions would be appreciated. Thanks.

Gary Lewis
[R] Variable alias
Hi Everyone,

is there the possibility in R to assign a variable to be an alias of another one? Example:

x <- 17
# assign y to be an alias of x
y   # returns 17
x <- 4
y   # returns 4

Daniel
Re: [R] Variable alias
You can assign the x variable like this:

y <- x <- 17
y <- x <- 4

On Sat, Aug 1, 2009 at 11:55 AM, Daniel Haase d...@haase-zm.de wrote:

Hi Everyone,

is there the possibility in R to assign a variable to be an alias of another one? Example:

x <- 17
# assign y to be an alias of x
y   # returns 17
x <- 4
y   # returns 4

Daniel

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
[R] zoo plot warning messages - I don't know what they mean or how to inspect the data to figure this out
I have a time series from 1933-2005 of precipitation at Fayetteville NC. I get the following warning messages when I plot the zoo series. Any help would be appreciated. If you need the data I can dput it or send the csv. I didn't include it here because I didn't want to clog up anybody's email account. I know that this is not reproducible, and I will send along the file if needed.

Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In lines.times(x.index, y[, i], col = col[[i]], pch = pch[[i]], : NAs introduced by coercion

--
Stephen Sefick

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
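The warning "NAs introduced by coercion" generally means that something non-numeric was converted to a number. Without the data the actual cause here is unknown, but one way to inspect a column for such entries is sketched below. The sample values, including "T" for trace and "M" for missing, are assumptions for illustration and not taken from the poster's file.

```r
# Made-up precipitation column as it might arrive from a CSV: text, not numbers.
precip_raw <- c("0.10", "0.00", "T", "1.25", "", "M")

# as.numeric() turns anything it cannot parse into NA; the warning is
# silenced here on purpose because the failures are inspected explicitly.
precip_num <- suppressWarnings(as.numeric(precip_raw))

# Entries that were present as text but did not convert to a number.
bad <- is.na(precip_num) & !is.na(precip_raw)
unique(precip_raw[bad])
```

Running the same check on the index (the dates) as well as the values can help narrow down which part of the zoo object triggers the coercion.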
Re: [R] Add columns in a dataframe and fill them from another table according to a criteria
On Aug 1, 2009, at 9:52 AM, Meenu Sahi wrote:

Dear R users,

I am new to R. What I want to do is explained below. I have a table called States.Prob, which is given below. This table gives the probabilities of the changes in the swap curve depending on the state of the swap curve. I want to put these probabilities in my dataframe mydata (given after the prob table).

Prob of States

Changes  State1  State2  State3  State4
a        Pa1     Pa2     Pa3     Pa4
b        Pb1     Pb2     Pb3     Pb4
c        Pc1     Pc2     Pc3     Pc4
d        Pd1     Pd2     Pd3     Pd4

I also have a dataframe (with 93 rows) called mydata, part of which (6 rows) is given below, where I want to fill in the last four columns with probabilities taken from States.Prob according to the change and state in mydata4:

   Change  State   PState1  PState2  PState3  PState4
1  b       State1  Pb1
2  a       State4                             Pa4
3  b       State2           Pb2
4  c       State3                    Pc3
5  d       State1  Pd1
6  a       State3                    Pa3

What I want to do is highlighted in red. How can I do this easily?

You may have seen it in red, but we don't, and I, at least, cannot figure out what you intend. (Per the Posting Guide, which you have obviously not yet read, you need to compose your question in plain old monochromatic text and change your mail client so it posts in plain text.)

If looking at the help pages for stack() and reshape() does not offer useful information and worked examples that meet your needs, then an approach that would make you more popular in these parts would be to make a simpler example, composed in syntactically correct R, that is complete in itself and can be pasted into an R session. Indicate what you intend as output from this simpler input.

Perhaps:

pstate <- read.table(textConnection("Changes State1 State2 State3 State4
a Pa1 Pa2 Pa3 Pa4
b Pb1 Pb2 Pb3 Pb4
c Pc1 Pc2 Pc3 Pc4
d Pd1 Pd2 Pd3 Pd4"), header=TRUE, as.is=TRUE)

?stack

data.frame(Change = pstate[,1],
           prstate = stack(pstate[2:5])$values,
           state = stack(pstate[2:5])$ind)
# first column is only 4 elements long, but will get recycled
# second retrieves the probabilities and may need to have as.numeric() wrapped around it if they really are numeric
# third returns what started out as column names

   Change prstate  state
1       a     Pa1 State1
2       b     Pb1 State1
3       c     Pc1 State1
4       d     Pd1 State1
5       a     Pa2 State2
6       b     Pb2 State2
7       c     Pc2 State2
8       d     Pd2 State2
9       a     Pa3 State3
10      b     Pb3 State3
11      c     Pc3 State3
12      d     Pd3 State3
13      a     Pa4 State4
14      b     Pb4 State4
15      c     Pc4 State4
16      d     Pd4 State4

Many thanks for your time.

Kind regards, Meenu

P.S. Thanks for your reply John. I've tried to put only the relevant columns of the dataframe. Hope it's more clear now.

[[alternative HTML version deleted]]
^^ Note ^^

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
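One possible reading of the original request is a lookup of one probability per row of mydata by (Change, State). Under that assumption, matrix indexing with a two-column character matrix does the lookup directly. The sketch below uses the placeholder labels from the question rather than real probabilities, and the Prob column name is made up for illustration.

```r
# States.Prob as a matrix with row and column names (placeholder labels).
States.Prob <- matrix(
  c("Pa1", "Pa2", "Pa3", "Pa4",
    "Pb1", "Pb2", "Pb3", "Pb4",
    "Pc1", "Pc2", "Pc3", "Pc4",
    "Pd1", "Pd2", "Pd3", "Pd4"),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("a", "b", "c", "d"), paste0("State", 1:4)))

mydata <- data.frame(
  Change = c("b", "a", "b", "c", "d", "a"),
  State  = c("State1", "State4", "State2", "State3", "State1", "State3"),
  stringsAsFactors = FALSE)

# A two-column character matrix indexes by (row name, column name) pairs.
mydata$Prob <- States.Prob[cbind(mydata$Change, mydata$State)]

# Spread the looked-up value into PState1..PState4, leaving NA elsewhere.
for (s in colnames(States.Prob)) {
  mydata[[paste0("P", s)]] <- ifelse(mydata$State == s, mydata$Prob, NA)
}
mydata
```

With real numeric probabilities the matrix would be numeric and the same indexing applies unchanged.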
Re: [R] Variable alias
If you only have to use y once then you can use delayedAssign. This will assign a promise, and the promise will not be evaluated until it's used:

x
Error: object 'x' not found
y
Error: object 'y' not found
x - y
x <- 1
delayedAssign("y", x)
x <- 2
y
[1] 2

If that's not good enough you can use makeActiveBinding:

x <- 1
makeActiveBinding("y", function() x, .GlobalEnv)
y
[1] 1
x <- 2
y
[1] 2

On Sat, Aug 1, 2009 at 10:55 AM, Daniel Haase d...@haase-zm.de wrote:

Hi Everyone,

is there the possibility in R to assign a variable to be an alias of another one? Example:

x <- 17
# assign y to be an alias of x
y   # returns 17
x <- 4
y   # returns 4

Daniel
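To see the difference between copying a value and an active binding side by side, here is a minimal sketch. It uses a separate environment e instead of .GlobalEnv so it can be run without touching existing variables; the names e and y_copy are made up for illustration.

```r
# Plain assignment copies the value; later changes to x do not reach the copy.
x <- 17
y_copy <- x
x <- 4
y_copy   # still 17

# An active binding re-evaluates its function on every access,
# so it keeps following the variable it reads.
e <- new.env()
e$x <- 17
makeActiveBinding("y", function() e$x, e)
e$x <- 4
e$y      # now 4
```

The active binding behaves like the alias asked for in the original question, at the cost of a function call on each read.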
Re: [R] Displaying function arguments using a Windows R console
On Jul 31, 2009, at 2:35 PM, Laura S. wrote:

I am relatively new to R, and would appreciate any suggestions you may have. I noticed on a Mac the functions' arguments are listed at the bottom of the R console. Is it possible to add such a feature to a Windows R console? I have Windows XP if that helps. I know function arguments can be found using args(...), but I wanted to have something more automatic, like what I saw on the Mac computer. I went into GUI preferences, but was not sure what to do. I noticed for graphics, the playwith package can be used. I was wondering if there was something similar to this for the arguments of functions in R.

I believe the feature to which you are referring is the one called function hints. This is a fairly recent answer to this question from a source that is generally authoritative: http://finzi.psych.upenn.edu/Rhelp08/2009-July/203189.html

The follow-up exchange also appears worth reading for Windows users.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
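Until a console offers hints, the argument list of a function can be inspected from the prompt on any platform. A minimal sketch using base R, with sd() chosen only as a small example:

```r
args(sd)                         # prints the argument list of sd()

arg_names <- names(formals(sd))  # the argument names as a character vector
arg_names

formals(sd)$na.rm                # default value of one argument
```

args() is convenient for a quick look, while formals() returns the arguments as a list that can be used programmatically.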
Re: [R] Determine the dimension-names of an element in an array in R
Hey, oh yes, but now I really have the ultimate solution... ;-) Here it comes:

a = c("A1","A2","A3","A4","A5")
b = c("B1","B2","B3")
c = c("C1","C2","C3","C4")
d = c("D1","D2")
e = c("E1","E2","E3","E4","E5","E6","E7","E8")
DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),length(d),length(e)), dimnames=list(a,b,d,e))
DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),length(d),length(e)), dimnames=list(a,c,d,e))
z <- apply(as.matrix(a), c(1,2), function(f1)
  apply(as.matrix(d), c(1,2), function(f2)
    apply(DataArray_1[dimnames(DataArray_1)[[1]]==f1,,dimnames(DataArray_1)[[3]]==f2,], 1, function(d1)
      apply(DataArray_2[dimnames(DataArray_2)[[1]]==f1,,dimnames(DataArray_2)[[3]]==f2,], 1, function(d2)
        cor(d1,d2)))))
Correl = array(z, dim=c(length(c),length(b),length(d),length(a)), dimnames=list(c,b,d,a))
Correl <- aperm(Correl, c(4,2,1,3))

So, best Regards, Christian

Sauvik De wrote:

Hi Christian:

Many thanks for the code. But I am afraid that your code still has a problem in terms of providing the correct correlation. For example, if you look at the correlation between DataArray_1["A2","B1","D1",] and DataArray_2["A2","C1","D1",] after running your code, you will notice that this is actually the correlation between DataArray_1["A2","B1","D1",] and DataArray_2["A1","C1","D1",], and so on. The code gives the correct result only in the case where elements corresponding to A1 and D1 are involved in DataArray_1 and DataArray_2. The problem is in

Correl <- Correl[1:length(c),,,]

We need to select elements of Correl more carefully to reach a proper solution.

Thanks, Sauvik

On Wed, Jul 29, 2009 at 11:41 PM, Poersching poerschin...@web.de wrote:

Hey, I have forgotten to generalize the code, so

Correl <- Correl[1:4,,,]

must be

Correl <- Correl[1:length(c),,,]

It's because of the comparison levels. I think you don't want the correlation between A1, B1, D1 and A2, C1, D1, but between A1, B1, D1 and A1, C1, D1 or between A1, B1, D1 and A1, C2, D1. So the 1:length(c) writes only the correlation between the B and C out of the whole correlation array.
That's also why the sequence in the second apply function is changed. Regards Christian. Poersching schrieb: Hey, I think I have a solution for your problem: Correl-apply(DataArray_1,1:3, function(d1) apply(DataArray_2,c(2,1,3), function(d) cor(d1,d)) ) Correl-Correl[1:4,,,] dimnames(Correl)[[1]]-c Correl-aperm(Correl,c(2,3,1,4)) This one should work. :-) Best Regards, Christian Sauvik De schrieb: Hi there, Thanks again for your reply. I know for-loop is always a solution to my problem and I had already coded using for-loop. But the number of levels for each dimension is large enough in actual problem and hence it was time-consuming. So, I was just wondering if there are any other alternative way-outs to solving my problem. That's why I tried with apply functions (sapply)assuming that this might work out faster even fractionally as compared to for-loop. Cheers, Sauvik On Mon, Jul 27, 2009 at 12:28 AM, Poersching poerschin...@web.de mailto:poerschin...@web.de mailto:poerschin...@web.de mailto:poerschin...@web.de wrote: Sauvik De schrieb: Hi: Lots of thanks for your valuable time! But I am not sure how you would like to use the function in this situation. As I had mentioned that the first element of my output array should be like: cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],use=pairwise.complete.obs) in my below code. and the output array of correlation I wish to get using sapply as follows: Correl = sapply(Correl,function(d) cor(DataArray_1[...],DataArray_2[...], use=pairwise.complete.obs)) So it would be of great help if you could kindly specify how to utilise your function findIndex in ... Apologies for all this! Thanks Regards, Sauvik Hey, sorry, I haven't understood your problem last time, but now this solution should solve your problem, so I hope. :-) It's only a for to loop, but an apply function may work too. 
I will think about this, but for now... ;-) la-length(a) lb-length(b) lc-length(c) ld-length(d) for (ia in 1:la) { for (ib in 1:lb) { for (ic in 1:lc) { for (id in 1:ld) {
Re: [R] write matrix M including names(dimnames(M))
On Jul 30, 2009, at 11:50 PM, Steve Jaffe wrote:

I can do this by writing (and reading) the file according to some format of my own devising, but I'm wondering if there is a built-in way to write and then restore a matrix with not only the dimnames (which write.table/read.table can preserve) but also the names(dimnames)? Example:

M <- matrix(1:4, 2, 2)
dimnames(M) <- list(xdim=c("a", "b"), ydim=c("u", "v"))
M
    ydim
xdim u v
   a 1 3
   b 2 4

There are two such matched combinations for saving R objects complete with attributes:

dput/dget   # will be more readable with a text editor than the next option
save/load   # not very readable

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
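A minimal sketch of both round trips mentioned in this reply, assuming temporary files as the storage location (the file names are illustrative, not from the thread). It only checks that names(dimnames(M)) survives writing and reading.

```r
# Matrix with named dimnames, as in the question
M <- matrix(1:4, 2, 2)
dimnames(M) <- list(xdim = c("a", "b"), ydim = c("u", "v"))

# Option 1: dput/dget writes a text representation that R can parse back
f_txt <- tempfile(fileext = ".R")      # illustrative temporary file
dput(M, file = f_txt)
M_dget <- dget(f_txt)

# Option 2: save/load writes a binary image of the named object
f_bin <- tempfile(fileext = ".RData")  # illustrative temporary file
save(M, file = f_bin)
env <- new.env()
load(f_bin, envir = env)               # restores an object called "M" into env
M_load <- env$M

names(dimnames(M_dget))
names(dimnames(M_load))
```

The dput file can be opened in a text editor, which is the readability difference the reply refers to.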
Re: [R] xyplot: superpose 2 time series with different time intervals
On Sat, Aug 1, 2009 at 7:26 AM, Gary Lewis <gary.m.le...@gmail.com> wrote:

I could use some advice regarding xyplot. I've got 2 time series. Both cover approximately the same period of time (i.e., 1940 to 2009). But one series has annual data and the other has monthly data. One refers to university enrollment; the other to unemployment rates. Both are currently in the same data frame. I'd like to use the monthly time series as a light grayscale background for a plot of the annual time series, showing both series as type "l" (line). Naturally, with all the NAs in the annual series, that plot disappears because points are not connected across missing values.

You could define a small wrapper function that discards NAs before drawing lines:

my.panel.lines <- function(x, y, ...) {
    keep <- !is.na(y)
    panel.lines(x[keep], y[keep], ...)
}

and use it as a custom panel.groups function:

xyplot(whatever you had before, panel = panel.superpose, panel.groups = my.panel.lines)

-Deepayan

I suppose I could make both series annual, but a lot of interesting detail would get lost this way. Or I guess I could interpolate values in the annual series with monthly approximations, but this means 11 out of every 12 values is an approximation. Or I suppose I could plot each series separately and then print them with position information, which I'm reluctant to do because panel.superpose so nicely handles the alignment of the 2 panels. What I'd really like to do is plot each independently but still superposed. Effectively this seems to mean monthly data intervals but line connections across the NAs in the series with annual intervals. Any suggestions would be appreciated.

Thanks.
Gary Lewis
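To make the suggestion concrete, here is a small self-contained sketch. The data frame, column names, and colours are made up for illustration (the original poster's data were not shown), and type = "l" is passed explicitly on the assumption that the original call already asked for lines.

```r
library(lattice)

# Wrapper from the reply: drop NA points so the remaining ones get connected
my.panel.lines <- function(x, y, ...) {
    keep <- !is.na(y)
    panel.lines(x[keep], y[keep], ...)
}

# Made-up data: 60 monthly values, plus an "annual" series observed each January
set.seed(42)
dates  <- seq(as.Date("2000-01-01"), by = "month", length.out = 60)
unemp  <- 5 + cumsum(rnorm(60, sd = 0.2))
enroll <- rep(NA_real_, 60)
enroll[seq(1, 60, by = 12)] <- c(4.0, 4.5, 5.1, 5.0, 5.6)

# Long format with a grouping factor, so both series share one panel
long <- data.frame(
    date   = rep(dates, 2),
    value  = c(unemp, enroll),
    series = factor(rep(c("unemployment", "enrollment"), each = 60),
                    levels = c("unemployment", "enrollment"))
)

p <- xyplot(value ~ date, data = long, groups = series,
            type = "l",
            panel = panel.superpose, panel.groups = my.panel.lines,
            col = c("grey70", "black"), lwd = c(1, 2))
print(p)
```

Without the wrapper, the annual series would draw nothing, because each observed value is surrounded by NAs and there is no pair of adjacent points to join.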
Re: [R] Parameters of Logistic Distribution and (3 Parameter) Log Logistic Distribution
On Jul 31, 2009, at 3:33 AM, Madhavi Bhave wrote:

Dear R Helpers,

Please guide me on how one can estimate the parameters of the Logistic Distribution and the 3 Parameter Log-logistic distribution for given data.

data <- c(2987.43,2990.12,3023.52,2964.79,3019.60,3051.07,3080.16,2944.15,3035.19,3023.46,
          2985.05,2970.95,3192.36,3084.39,2926.23,2952.15,3064.15,3003.20,2980.35,2980.45,
          3043.12,3115.53,3006.90,2946.03,3039.97,3064.01,3000.56,3049.57,3042.54,3037.63,
          2982.03,2889.74,3043.83,2930.95,3020.65,3009.21,3084.16,2954.05,2991.04,3083.10,
          3007.26,2949.58,2995.65,3078.36,3031.64,3001.28,3103.32,3015.04,2994.45,2963.71,
          2932.90,3021.31,3074.72,2980.15,3002.29,3088.18,2991.39,2942.90,3057.91,3023.25,
          3192.67,2966.49,3049.31,2915.38,3045.27,2852.72,2999.25,2978.52,3040.07,2945.50,
          3047.47,2915.95,3012.24,2985.80,2971.04,3035.72,3025.40,3014.76,2979.62,3029.20,
          2938.38,2966.47,3017.81,3016.43,2989.60,2941.22,3038.30,3033.44,3003.77,2950.02,
          3053.19,3011.69,2916.34,2918.10,3049.98,3062.46,2948.55,3072.90,3113.52,2987.61)

require(MASS)
?fitdistr

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
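As a rough illustration of where the pointer to ?fitdistr leads, the sketch below fits a logistic distribution by maximum likelihood. It uses simulated values as a stand-in for the posted data; the location and scale used for simulation are arbitrary assumptions. The three-parameter log-logistic is not one of fitdistr's named distributions, so it would need a user-supplied density function and starting values, which are not attempted here.

```r
library(MASS)

# Simulated stand-in for the posted data (true location 3000, scale 40)
set.seed(123)
x <- rlogis(500, location = 3000, scale = 40)

# "logistic" is a named distribution in fitdistr, so no start values are needed
fit <- fitdistr(x, "logistic")

fit$estimate   # maximum-likelihood estimates of location and scale
fit$sd         # approximate standard errors
```

The same call applied to the vector from the question would be fitdistr(data, "logistic").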
Re: [R] Add columns in a dataframe and fill them from another table according to a criteria
Dear R users,

My apologies for not writing in the correct format due to my ignorance. In the future I will write more clearly. I hope to contribute to the R community in the process of picking up the language professionally.

I have now written the R code, which is attached in a notepad file. I've simplified my problem to an example with a table pstate, which contains the probabilities of getting certain changes in the four different states, and a data frame mydata4, which contains all the changes connected to the four different states. I would like to add the probabilities to mydata4 after matching on the change and the state. Everything before #OUTPUT can be copied and pasted into the R window. The desired output is written after #OUTPUT. Must I write an if/else, or can I do it in an easier way?

Your help is greatly appreciated! Many thanks for your patience.

Regards
Meenu

On Sat, Aug 1, 2009 at 9:43 PM, David Winsemius <dwinsem...@comcast.net> wrote:

On Aug 1, 2009, at 9:52 AM, Meenu Sahi wrote:

Dear R users,

I am new to R. What I want to do is explained below. I have a table called States.Prob, which is given below. This table gives the probabilities of the changes in the swap curve depending on the state of the swap curve. I want to put these probabilities in my data frame mydata (given after the prob table).

Prob of States
Changes State1 State2 State3 State4
a       Pa1    Pa2    Pa3    Pa4
b       Pb1    Pb2    Pb3    Pb4
c       Pc1    Pc2    Pc3    Pc4
d       Pd1    Pd2    Pd3    Pd4

and I have a data frame (with 93 rows) called mydata, part of which (6 rows) is given below, where I want to fill in the last four columns with probabilities taken from States.Prob according to the change and state in mydata4:

  Change State  PState1 PState2 PState3 PState4
1 b      State1 Pb1
2 a      State4                         Pa4
3 b      State2         Pb2
4 c      State3                 Pc3
5 d      State1 Pd1
6 a      State3                 Pa3

What I want to do is highlighted in Red. How can I do this easily?

You may have seen it in red, but we don't, and I, at least, cannot figure out what you intend. (Per the Posting Guide, which you have obviously not yet read, you need to compose your question in plain old monochromatic text and change your mail client so it posts in plain text.)

If looking at the help pages for stack() and reshape() does not offer useful information and worked examples that meet your needs, then an approach that would make you more popular in these parts would be to make a simpler example, composed in syntactically correct R, that is complete in itself and can be pasted into an R session. Indicate what you intend as output from this simpler input. Perhaps:

pstate <- read.table(textConnection("Changes State1 State2 State3 State4
a Pa1 Pa2 Pa3 Pa4
b Pb1 Pb2 Pb3 Pb4
c Pc1 Pc2 Pc3 Pc4
d Pd1 Pd2 Pd3 Pd4"), header=TRUE, as.is=TRUE)

?stack

data.frame(Change=pstate[,1], prstate=stack(pstate[2:5])$values, state=stack(pstate[2:5])$ind)
# first column is only 4 elements long, but will get recycled
# second retrieves the probabilities and may need to have as.numeric() wrapped around it if they really are numeric
# third returns what started out as column names

   Change prstate  state
1       a     Pa1 State1
2       b     Pb1 State1
3       c     Pc1 State1
4       d     Pd1 State1
5       a     Pa2 State2
6       b     Pb2 State2
7       c     Pc2 State2
8       d     Pd2 State2
9       a     Pa3 State3
10      b     Pb3 State3
11      c     Pc3 State3
12      d     Pd3 State3
13      a     Pa4 State4
14      b     Pb4 State4
15      c     Pc4 State4
16      d     Pd4 State4

Many thanks for your time.
kind regards
Meenu

P.S. Thanks for your reply John. I've tried to put only the relevant columns of the data frame. Hope it's more clear now.

[[alternative HTML version deleted]]
^^Note: ^^

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

pstate <- read.table(textConnection("Changes PState1 PState2 PState3 PState4
a Pa1 Pa2 Pa3 Pa4
b Pb1 Pb2 Pb3 Pb4
c Pc1 Pc2 Pc3 Pc4
d Pd1 Pd2 Pd3 Pd4"), header=TRUE, as.is=TRUE)
Change <- c("b","a","b","c","d","a")
State <- c("State1","State4","State2","State3","State1","State3")
mydata4 <- data.frame(Change, State)
mydata4 <- within(mydata4, {
    PState1 <- NA
    PState2 <- NA
    PState3 <- NA
    PState4 <- NA
})
#OUTPUT
# I would like to see my output of mydata4 with NA in the last 4 columns replaced by matching probabilities
# from table pstate in whichever of the 4 columns are applicable depending on the State and Change. e.g. Row1
# of mydata4 has
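One way to do the lookup asked about in this thread without any if/else is to index the probability table by (Change, State) pairs. This is only a sketch under the assumption that the probability columns are named State1 to State4, as in David's version of the table; the Prob column and the paste0("P", ...) naming are choices made here for illustration.

```r
# Probability table, with columns named as in the first version in the thread
pstate <- read.table(text = "Changes State1 State2 State3 State4
a Pa1 Pa2 Pa3 Pa4
b Pb1 Pb2 Pb3 Pb4
c Pc1 Pc2 Pc3 Pc4
d Pd1 Pd2 Pd3 Pd4", header = TRUE, as.is = TRUE)

mydata4 <- data.frame(
    Change = c("b", "a", "b", "c", "d", "a"),
    State  = c("State1", "State4", "State2", "State3", "State1", "State3"),
    stringsAsFactors = FALSE
)

# Turn the table into a matrix whose row and column names are the lookup keys
probs <- as.matrix(pstate[, -1])
rownames(probs) <- pstate$Changes

# A two-column character matrix picks one (row, column) cell per data row
mydata4$Prob <- probs[cbind(mydata4$Change, mydata4$State)]

# Spread the looked-up value into the column that matches each row's state
for (s in colnames(probs)) {
    mydata4[[paste0("P", s)]] <- ifelse(mydata4$State == s, mydata4$Prob, NA)
}

mydata4
```

If the probabilities are really numeric, the same indexing works on a numeric matrix and the filled columns stay numeric.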
Re: [R] Determine the dimension-names of an element in an array in R
Hi Christian:

Thanks a lot for your continuous help. This time you got the code right! That's what I wanted :) Great job!

Thanks and regards,
Sauvik

On Sat, Aug 1, 2009 at 10:30 PM, Poersching <poerschin...@web.de> wrote:

Hey,

oh yes, but now I really have the ultimate solution... ;-) Here it comes:

a = c("A1","A2","A3","A4","A5")
b = c("B1","B2","B3")
c = c("C1","C2","C3","C4")
d = c("D1","D2")
e = c("E1","E2","E3","E4","E5","E6","E7","E8")

DataArray_1 = array(c(rnorm(240)), dim=c(length(a),length(b),length(d),length(e)), dimnames=list(a,b,d,e))
DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),length(d),length(e)), dimnames=list(a,c,d,e))

z <- apply(as.matrix(a), c(1,2), function(f1)
       apply(as.matrix(d), c(1,2), function(f2)
         apply(DataArray_1[dimnames(DataArray_1)[[1]]==f1,,dimnames(DataArray_1)[[3]]==f2,], 1, function(d1)
           apply(DataArray_2[dimnames(DataArray_2)[[1]]==f1,,dimnames(DataArray_2)[[3]]==f2,], 1, function(d2)
             cor(d1,d2)))))

Correl = array(z, dim=c(length(c),length(b),length(d),length(a)), dimnames=list(c,b,d,a))
Correl <- aperm(Correl, c(4,2,1,3))

So, best regards,
Christian
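The target of this thread is an array with Correl[a, b, c, d] = cor(DataArray_1[a, b, d, ], DataArray_2[a, c, d, ]). The sketch below is an alternative formulation, not the code posted in the thread: it loops only over the a and d levels and lets cor() compute the whole B-by-C block at once from two matrices. The name cc is used instead of c for the third set of labels, purely to avoid shadowing; seeds and sizes are arbitrary.

```r
set.seed(1)
a  <- paste0("A", 1:5)
b  <- paste0("B", 1:3)
cc <- paste0("C", 1:4)   # called c in the thread
d  <- paste0("D", 1:2)
e  <- paste0("E", 1:8)

DataArray_1 <- array(rnorm(240), dim = c(5, 3, 2, 8), dimnames = list(a, b, d, e))
DataArray_2 <- array(rnorm(320), dim = c(5, 4, 2, 8), dimnames = list(a, cc, d, e))

Correl <- array(NA_real_, dim = c(5, 3, 4, 2), dimnames = list(a, b, cc, d))

for (ia in a) {
    for (id in d) {
        x <- t(DataArray_1[ia, , id, ])   # 8 x 3: one column per B level
        y <- t(DataArray_2[ia, , id, ])   # 8 x 4: one column per C level
        # cor() on two matrices returns all column-by-column correlations (3 x 4)
        Correl[ia, , , id] <- cor(x, y)
    }
}

# Spot check against the definition for one cell
Correl["A2", "B1", "C1", "D1"]
cor(DataArray_1["A2", "B1", "D1", ], DataArray_2["A2", "C1", "D1", ])
```

With missing values in the data, use = "pairwise.complete.obs" can be added to the cor() call, as in the original question.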
Re: [R] zoo plot warning messages - I don't know what they mean or how to inspect the data to figure this out
So as not to leave this thread dangling: the problem was character fields where numeric fields had been expected.

On Sat, Aug 1, 2009 at 11:32 AM, stephen sefick <ssef...@gmail.com> wrote:

I have a time series from 1933-2005 of precipitation at Fayetteville, NC. I get the following warning messages when I plot the zoo series. Any help would be appreciated. If you need the data I can dput it or send the csv. I didn't include it here because I didn't want to clog up anybody's email account. I know that this is not reproducible, and I will send along the file if needed.

Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In lines.times(x.index, y[, i], col = col[[i]], pch = pch[[i]], : NAs introduced by coercion

--
Stephen Sefick

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
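For readers who hit the same warning, a small sketch of how one might find the offending values before plotting. The data frame and the "T" entry (standing in for something like a trace amount) are invented for illustration; the actual file from the thread was never posted.

```r
# Invented example: a precipitation column read in as character
raw <- data.frame(
    date   = c("1933-01-01", "1933-01-02", "1933-01-03"),
    precip = c("0.10", "T", "0.25"),
    stringsAsFactors = FALSE
)

str(raw)   # precip shows up as chr rather than num

# Values that cannot be converted become NA; these are the ones to inspect
converted <- suppressWarnings(as.numeric(raw$precip))
bad <- is.na(converted) & !is.na(raw$precip)

raw[bad, ]   # rows whose precip entry is not a number
```

Once the non-numeric entries are recoded or set to NA deliberately, the series can be converted with as.numeric() and plotted without the coercion warning.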
Re: [R] Automatic datasets creation from multiple data sheets in a single excel file
If you have RExcel (and the necessary infrastructure, i.e. statconnDCOM and possibly rcom) installed, the following VBA macro will do the trick.

-=-=-=-=-=
Option Explicit

Sub TransferAllSheetsAsDataframes(wb As Workbook)
    Dim ws As Worksheet
    RInterface.StartRServer
    For Each ws In wb.Sheets
        RInterface.PutDataframe ws.Name, ws.Cells(1, 1).CurrentRegion
    Next ws
    RInterface.StopRServer
End Sub

Sub TransferSheetsInThisWorkbook()
    TransferAllSheetsAsDataframes ThisWorkbook
End Sub
-=-=-=-=-

You have to establish a reference to RExcelVBALib in your workbook. The names of the sheets will be used as the names of the dataframes.

Dieter Menne wrote:

rajclinasia wrote:

Please let us know how to create automatic datasets from multiple data sheets in a single excel file... For example, if there are 10 sheets in a single excel file, automatically 10 datasets need to be created at a time when I read an excel file as a whole at once.

The critical part is getting the names of the worksheets.

http://tolstoy.newcastle.edu.au/R/e6/help/09/03/7736.html

For reading individual worksheets, there is lots of code around. And a site search for "read excel worksheet" returns quite a few references.

Dieter

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
[R] diagonal lda and qda
Hi, all, I am wondering if there is any package doing lda and qda which allows assuming diagonal covariance matrices. I checked the lda function in MASS, and it seems it does not support this. Thanks, Cindy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about rpart decision trees (being used to predict customer churn)
Hello,

If you do

my.tree <- rpart(cancel ~ experience)

and then check my.tree$frame, you will note that the complexity parameter there is 0. Check ?rpart.object to get a description of what this output means. But essentially, you will not be able to break the leaf unless you set a complexity parameter below that value, that is, never.

You may need to go into the internals of the function (and the C code) in order to understand how this parameter is calculated. It looks to me like an oddity, and it is worth trying to understand why.

Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com

P.S.: Note that there is a bug in your submitted code that requires some hand fixing.

On Sun, 2009-07-26 at 11:37 -0700, Robert Smith wrote:

Hi,

I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating the tree.

experience <- as.factor(c(rep("good",90), rep("bad",10)))
cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5), rep("yes",5)))
table(experience, cancel)
          cancel
experience no yes
      bad   5   5
      good 85   5

rpart(cancel ~ experience)
n= 100

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 100 10 no (0.900 0.100) *

I tried the following commands with no success.

rpart(cancel ~ experience, control=rpart.control(cp=.0001))
rpart(cancel ~ experience, parms=list(split='information'))
rpart(cancel ~ experience, parms=list(split='information'), control=rpart.control(cp=.0001))
rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2, ncol=2)))

Thanks a lot for your help.

Best regards,
Robert
[R] Transparency and trellis device
Dear R-users,

I am trying to produce trellis (png or jpeg) graphs with a transparent background, but I cannot manage to make that happen. I tried to play around with themes, but to no avail. Any advice on the following example will be greatly appreciated.

Thank you
Sebastien

library(lattice)
df <- data.frame(a=rep(1:4,4), b=rep(1:4,4), c=rep(1:4,each=4))
settings <- standard.theme()
settings <- modifyList(settings, list(background=list(alpha=1, col="transparent")))
str(settings)
trellis.device("png", file="test.png", theme=settings)
myplot <- xyplot(b~a|c, data=df)
print(myplot)
dev.off()
Re: [R] How to stop an R script when running JGR on a Linux/SuSE system
Hello,

On 7/31/09, mau...@alice.it <mau...@alice.it> wrote:

When I need to stop a running R script on Windows or Mac I just use the Esc key, which kills the current script and returns control to the R interpreter. But when I run R from JGR, Esc is useless, as are the other available keyboard keys.

This issue was addressed in a recent discussion [1].

Liviu

[1] http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/2009q2/001106.html
Re: [R] SVG output on Windows OS
Thank you, David.

Cairo is installed and loaded. cairoDevice is installed and loaded. RGtk2 is installed and loaded (which installed GTK+), yet I still get FALSE for cairo:

capabilities()
    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
    TRUE     TRUE     TRUE     TRUE    FALSE    FALSE     TRUE     TRUE
  libxml     fifo   cledit    iconv      NLS  profmem    cairo
    TRUE    FALSE     TRUE     TRUE     TRUE    FALSE    FALSE

Cairo.capabilities()
  png  jpeg  tiff   pdf   svg    ps   x11   win
 TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE

ggsave(file="chart.svg")
Saving 6 x 6 image
Error: 'svg' is not an exported object from 'namespace:grDevices'

I'm lost as to how to produce the svg output on Windows. All works suitably on Linux.

Michael Roessler, CFA
michael.roes...@keyevent.com

On Sat, Aug 1, 2009 at 6:15 AM, David Winsemius <dwinsem...@comcast.net> wrote:

On Jul 31, 2009, at 6:41 PM, Michael Roessler wrote:

How may one save a graphic as svg on Windows? The svg() command is recognized and functions well on Linux, etc., but not on Windows, it seems. I'm trying to use Hadley Wickham's ggplot2, and I would like to be able to save created charts as svg for later input into Illustrator. I am able to accomplish this workflow under Linux, but I don't know how to get R to recognize the svg() command under Windows. I have loaded RSvgDevice, Cairo, and cairoDevice in my attempts. The problem seems to me to be directly related to enabling R to produce svg output on Windows, rather than related to ggplot2.

What does capabilities() return?

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] Cox ridge regression
Hello,

I have questions regarding penalized Cox regression using the survival package (functions coxph() and ridge()). I am using R 2.8.0 on Ubuntu Linux and survival package version 2.35-4.

Question 1. Consider the following example from help(ridge):

fit1 <- coxph(Surv(futime, fustat) ~ rx + ridge(age, ecog.ps, theta=1), ovarian)

As I understand it, this builds a model in which `rx' is the predictor, whereas the ridge penalty term contains the variables `age' and `ecog.ps'. Could someone explain what it means to regularize on parameters which are not part of the model? Based on the definition of Cox ridge regression (see for example [1]), or any other regularized regression, the penalty term is a function of the coefficients corresponding to the predictor variables, and nothing else.

Question 2. Consider a similar example:

library(survival)
lfit2 <- coxph(Surv(time, status) ~ age + ph.ecog + ridge(age, ph.ecog, theta=1), cancer)
print(lfit2)

Call:
coxph(formula = Surv(time, status) ~ age + ph.ecog + ridge(age, ph.ecog, theta = 1), data = cancer)

               coef     se(coef) se2      Chisq DF p
age            1.13e-02 0.111    9.32e-03 0.01  1  0.92
ph.ecog        4.43e-01 1.398    1.16e-01 0.10  1  0.75
ridge(age)     2.60e-21 0.110    4.85e-17 0.00  1  1.00
ridge(ph.ecog) 5.14e-22 1.393             0.00  1  1.00

Iterations: 1 outer, 3 Newton-Raphson
Degrees of freedom for terms= 0 0 0
Likelihood ratio test=19.1 on 0.01 df, p=3.54e-08
n=227 (1 observation deleted due to missingness)
Warning message:
In sqrt((diag(x$var2))[kk]) : NaNs produced

What is the meaning of the ridge(age) and ridge(ph.ecog) coefficients? Again, based on the definition of Cox ridge regression, it simply adds a penalty term to the standard Cox regression function and doesn't introduce any new predictors. What to make of the ridge(age) and ridge(ph.ecog) rows in the output?

Question 3. What is the origin and significance of the warning in the previous example:

Warning message:
In sqrt((diag(x$var2))[kk]) : NaNs produced

Thank you very much for your help,
Ljubomir

[1] Bovelstad et al., Predicting survival from microarray data - a comparative study (Bioinformatics, Vol. 23, no. 16, 2007, pp. 2080-2087).
[R] odfWeave : sudden and unexplained error
Dear list, dear Max,

I am currently working on a report. I'm writing it with OpenOffice.org and odfWeave. I'm working incrementally: I write a bit, test (interactively) some ideas, and cut and paste code into the OOo report when satisfied with it. In the process, I tend to recompile the .odt source a *lot*.

Suddenly, odfWeave started to give me an incomprehensible error even before starting the compilation itself (InFile is my input .odt file, OutFile is the resultant .odt file):

odfWeave(InFile, OutFile)
Copying SrcAnalyse1.odt
Setting wd to /tmp/RtmphCUkSf/odfWeave01225949667
Unzipping ODF file using unzip -o SrcAnalyse1.odt
Archive: SrcAnalyse1.odt
 extracting: mimetype
  inflating: content.xml
  inflating: layout-cache
  inflating: styles.xml
 extracting: meta.xml
  inflating: Thumbnails/thumbnail.png
  inflating: Configurations2/accelerator/current.xml
   creating: Configurations2/progressbar/
   creating: Configurations2/floater/
   creating: Configurations2/popupmenu/
   creating: Configurations2/menubar/
   creating: Configurations2/toolbar/
   creating: Configurations2/images/Bitmaps/
   creating: Configurations2/statusbar/
  inflating: settings.xml
  inflating: META-INF/manifest.xml
Removing SrcAnalyse1.odt
Creating a Pictures directory
Pre-processing the contents
Error: cc$parentId == parentId is not TRUE

Perusing the documentation and the r-help list archives didn't turn up anything relevant. This error survived restarting OOo, restarting R, restarting its enclosing Emacs session and even rebooting the damn hardware...

Any idea?

Emmanuel Charpentier
Re: [R] R book for economists
Thiemo Fetzer wrote:

Dear Group,

I am an economics student starting with PhD work in London. As preparation I would like to get to know R a little bit better. For Stata there are tons of books; can you recommend a book for R? I have some substantial econometrics knowledge, so it should be more of a how-to book.

Best regards
Thiemo

---
Thiemo Fetzer, Economist
http://freigeist.devmag.net
http://www.devmag.net

Besides the econometrics references already mentioned: if you are willing to read a book with life science data, then try:

A Beginner's Guide to R (2009). Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3

Alain

-
Dr. Alain F. Zuur
First author of:
1. Analysing Ecological Data (2007). Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
2. Mixed effects models and extensions in ecology with R. (2009). Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
3. A Beginner's Guide to R (2009). Zuur, AF, Ieno, EN, Meesters, EHWG. Springer

Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Email: highs...@highstat.com
URL: www.highstat.com

--
View this message in context: http://www.nabble.com/R-book-for-economists-tp24768682p24772774.html
Re: [R] Getting file name from pdf device?
On Fri, Jul 31, 2009 at 8:49 AM, Rainer M Krug <r.m.k...@gmail.com> wrote:

My question: how can I get the filename of the pdf from the device before it is closed?

I've also looked for this and couldn't find a way. I had a similar use, where I wanted to get an R transcript with embedded plots in emacs (see prettyR for another transcript-with-plots option). What I did was use dev2bitmap to write out a PNG file. You could do something similar with dev.copy2pdf to create the pdf after you do the plotting. You could also use dev2bitmap in this manner to drive ghostscript to create pdfs for you (I don't know if it'll compress like you want).

Here's what I did:

show <- function(file = paste(tempfile(), ".png", sep = "")) {
    dev2bitmap(file)
    cat("[[", file, "]]\n", sep = "")  # I do some post-processing in emacs to see the embedded graphic
}

My use case was that plots would be inserted where I used show, as follows:

plot(sin)
show()           # plot inserted into transcript here
plot(cos)
show("cos.png")  # this time, a named local file instead of a temp file

- Tom
Re: [R] Compare lm() to glm(family=poisson)
Mark Na wrote:
> Dear R-helpers,
>
> I would like to compare the fit of two models, one of which I fit using lm() and the other using glm(family=poisson). The latter doesn't provide r-squared, so I wonder how to go about comparing these models (they have the same formula).
>
> Thanks very much,
> Mark Na

The decision about which distribution to use (Normal versus Poisson) should be an a priori choice. If you really want to compare them, then inspect the residuals of both models and see which model shows no residual patterns.

Alain

-
Dr. Alain F. Zuur
Highland Statistics Ltd.
Email: highs...@highstat.com
URL: www.highstat.com
--
View this message in context: http://www.nabble.com/Compare-lm%28%29-to-glm%28family%3Dpoisson%29-tp24764558p24772802.html
Sent from the R help mailing list archive at Nabble.com.
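The residual inspection suggested above could look roughly like the following. This is only a sketch: the simulated count data, the variable names and the choice of Pearson residuals for the Poisson model are assumptions for illustration, not part of the thread.

```r
# Simulated count data (an assumption, for illustration only).
set.seed(1)
x <- runif(200, 0, 3)
y <- rpois(200, lambda = exp(0.3 + 0.8 * x))

# Same formula, two different model families.
fit_lm  <- lm(y ~ x)
fit_glm <- glm(y ~ x, family = poisson)

# Residuals against fitted values, side by side; look for curvature
# or a fan shape that would indicate a poorly chosen model.
op <- par(mfrow = c(1, 2))
plot(fitted(fit_lm), resid(fit_lm),
     xlab = "Fitted values", ylab = "Residuals", main = "lm()")
abline(h = 0, lty = 2)
plot(fitted(fit_glm), resid(fit_glm, type = "pearson"),
     xlab = "Fitted values", ylab = "Pearson residuals",
     main = "glm(family = poisson)")
abline(h = 0, lty = 2)
par(op)
```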
Re: [R] A hiccup when using anova on gam() fits.
Thank you. That clarified a great many things.

cheers,

Rolf
Re: [R] Question about rpart decision trees (being used to predict customer churn)
2009/7/27 Robert Smith <robertpsmith2...@gmail.com>:
> Hi,
>
> I am using rpart decision trees to analyze customer churn. I am finding that the decision trees created are not effective because they are not able to recognize factors that influence churn. I have created an example situation below. What do I need to do for rpart to build a tree with the variable experience? My guess is that this would happen if rpart used the loss matrix while creating the tree.
>
>     experience <- as.factor(c(rep("good",90), rep("bad",10)))
>     cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5), rep("yes",5)))
>     table(experience, cancel)
>               cancel
>     experience no yes
>           bad   5   5
>           good 85   5
>
>     rpart(cancel ~ experience)
>     n= 100
>
>     node), split, n, loss, yval, (yprob)
>           * denotes terminal node
>
>     1) root 100 10 no (0.900 0.100) *
>
> I tried the following commands with no success.
>
>     rpart(cancel ~ experience, control=rpart.control(cp=.0001))
>     rpart(cancel ~ experience, parms=list(split='information'))
>     rpart(cancel ~ experience, parms=list(split='information'), control=rpart.control(cp=.0001))
>     rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2, ncol=2)))
>
> Thanks a lot for your help.
>
> Best regards,
> Robert

Hi Robert,

Perhaps try a less extreme loss matrix:

    rpart(cancel ~ experience, parms=list(loss=matrix(c(0,5,1,0), byrow=TRUE, nrow=2)))

Output from Rattle:

    Summary of the Tree model for Classification (built using rpart):

    n= 100

    node), split, n, loss, yval, (yprob)
          * denotes terminal node

    1) root 100 50 no (0.9000 0.1000)
      2) experience=good 90 25 no (0.9444 0.0556) *
      3) experience=bad 10 5 yes (0.5000 0.5000) *

    Classification tree:
    rpart(formula = cancel ~ ., data = crs$dataset, method = "class",
        parms = list(loss = matrix(c(0, 5, 1, 0), byrow = TRUE, nrow = 2)),
        control = rpart.control(cp = 0.0001, usesurrogate = 0, maxsurrogate = 0))

    Variables actually used in tree construction:
    [1] experience

    Root node error: 50/100 = 0.5

    n= 100

          CP nsplit rel error xerror xstd
    1 0.4000      0       1.0    1.0 0.30
    2 0.0001      1       0.6    0.6 0.22

    TRAINING DATA Error Matrix - Counts

          Predicted
    Actual no yes
       no  85   5
       yes  5   5

    TRAINING DATA Error Matrix - Percentages

          Predicted
    Actual no yes
       no  85   5
       yes  5   5

    Time taken: 0.01 secs

    Generated by Rattle 2009-08-02 08:24:50 gjw
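For readers who want to rerun the exchange outside Rattle, the two fits can be reproduced with plain rpart calls. This is a minimal sketch: the data frame name churn and the object names are assumptions, and no output is claimed for the loss-matrix fit. Which cell of the loss matrix penalises which kind of misclassification is worth checking against ?rpart before relying on it.

```r
library(rpart)

# Data from the original post: 90 good and 10 bad experiences.
experience <- factor(c(rep("good", 90), rep("bad", 10)))
cancel <- factor(c(rep("no", 85), rep("yes", 5), rep("no", 5), rep("yes", 5)))
churn <- data.frame(experience, cancel)  # 'churn' is a hypothetical name

# Default fit: the post reports a root-only tree for this call.
fit_default <- rpart(cancel ~ experience, data = churn)

# Fit with the asymmetric loss matrix suggested in the reply.
loss <- matrix(c(0, 5, 1, 0), byrow = TRUE, nrow = 2)
fit_loss <- rpart(cancel ~ experience, data = churn,
                  parms = list(loss = loss))

print(fit_default)
print(fit_loss)
```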
Re: [R] odfWeave : sudden and unexplained error
Sending me a reproducible example and the results of sessionInfo() would help.

Max

On Sat, Aug 1, 2009 at 5:13 PM, Emmanuel Charpentier <charp...@bacbuc.dyndns.org> wrote:
> Dear list, dear Max,
>
> I am currently working on a report. I'm writing it with OpenOffice.org and odfWeave. I'm working incrementally: I write a bit, test (interactively) some ideas, and cut-and-paste code into the OOo report when satisfied with it. In the process, I tend to recompile the .odt source a *lot*.
>
> Suddenly, odfWeave started to give me an incomprehensible error even before starting the compilation itself (InFile is my input .odt file, OutFile is the resultant .odt file):
>
>     odfWeave(InFile, OutFile)
>       Copying SrcAnalyse1.odt
>       Setting wd to /tmp/RtmphCUkSf/odfWeave01225949667
>       Unzipping ODF file using unzip -o SrcAnalyse1.odt
>     Archive:  SrcAnalyse1.odt
>      extracting: mimetype
>       inflating: content.xml
>       inflating: layout-cache
>       inflating: styles.xml
>      extracting: meta.xml
>       inflating: Thumbnails/thumbnail.png
>       inflating: Configurations2/accelerator/current.xml
>        creating: Configurations2/progressbar/
>        creating: Configurations2/floater/
>        creating: Configurations2/popupmenu/
>        creating: Configurations2/menubar/
>        creating: Configurations2/toolbar/
>        creating: Configurations2/images/Bitmaps/
>        creating: Configurations2/statusbar/
>       inflating: settings.xml
>       inflating: META-INF/manifest.xml
>       Removing SrcAnalyse1.odt
>       Creating a Pictures directory
>       Pre-processing the contents
>     Erreur : cc$parentId == parentId is not TRUE
>
> Perusing the documentation and the r-help list archives didn't turn up anything relevant. This error survived restarting OOo, restarting R, restarting its enclosing Emacs session and even rebooting the damn hardware...
>
> Any idea?
>
> Emmanuel Charpentier

--
Max
[R] R Package That Contains International Geomagnetic Reference Field (IGRF)
By any chance is anyone aware of an R package that contains a representation of the International Geomagnetic Reference Field (IGRF)?

http://www.ngdc.noaa.gov/IAGA/vmod/igrf.html

I've tracked down some Fortran and C code for the IGRF-10, and possibly IGRF-11, and was hoping to avoid an awkward port.

Thanks again for any feedback and leads provided.
[R] vglm{VGAM} output to Latex ?
Hi all,

I am trying to put the summary output of vglm{VGAM} into a LaTeX table using mtable{memisc}. mtable does not accept the vglm object that vglm produces by default; I think I solved that part by defining a getSummary.vglm function. However, for the multinomial model summary.vglm appends ":" followed by the factor level to the coefficient names, so they look like (Intercept):1, (Intercept):2 etc. This creates a problem in mtable:

    Error in strsplit(coefnames, ":", fixed = TRUE) : non-character argument

Any suggestions? Or are there any other general suggestions for putting vglm summary output into a LaTeX table using another method?

All help is greatly appreciated.

Ugur

Microsoft gives you windows, Linux gives you the whole house.
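The error message itself is easy to reproduce in base R, which may help narrow things down. The sketch below is an assumption about the cause, not a diagnosis: it only shows that strsplit() fails this way when it is handed something that is not a character vector (a factor here), and that coefficient names of the form shown above split cleanly once they are plain character. What mtable actually receives from a custom getSummary.vglm is not known from the post.

```r
# Coefficient names in the form described in the post.
coefnames <- c("(Intercept):1", "(Intercept):2", "x:1", "x:2")

# Character input splits without complaint.
parts <- strsplit(coefnames, ":", fixed = TRUE)

# A factor is not a character vector, so strsplit() raises an error;
# capture the message instead of stopping.
msg <- tryCatch(
  strsplit(factor(coefnames), ":", fixed = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)

# Coercing to character first avoids the error.
fixed <- strsplit(as.character(factor(coefnames)), ":", fixed = TRUE)
print(identical(parts, fixed))
```

If this is indeed the cause, checking that the coefficient table returned by the getSummary method carries character row names would be a reasonable first step.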
Re: [R] SVG output on Windows OS
It says cairo is not available. (Once you read this in a proper monospaced font, anyway.)

On Aug 1, 2009, at 4:27 PM, Michael Roessler wrote:
> Thank you, David.
>
> Cairo is installed and loaded. cairoDevice is installed and loaded. RGtk2 is installed and loaded (which installed GTK+), yet I still get FALSE for cairo:
>
>     capabilities()
>         jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
>         TRUE     TRUE     TRUE     TRUE    FALSE    FALSE     TRUE     TRUE
>       libxml     fifo   cledit    iconv      NLS  profmem    cairo
>         TRUE    FALSE     TRUE     TRUE     TRUE    FALSE    FALSE
>                                                               ^^^

You need to investigate why your Windows cairo installation is not available.

>     Cairo.capabilities()
>       png  jpeg  tiff   pdf   svg    ps   x11   win
>      TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE
>
>     ggsave(file="chart.svg")
>     Saving 6" x 6" image
>     Error: 'svg' is not an exported object from 'namespace:grDevices'
>
> I'm lost as to how to produce the svg output on Windows. All works suitably on Linux.
>
> Michael Roessler, CFA
> michael.roes...@keyevent.com
>
> On Sat, Aug 1, 2009 at 6:15 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>> On Jul 31, 2009, at 6:41 PM, Michael Roessler wrote:
>>> How may one save a graphic as svg on Windows? The svg() command is recognized and functions well on Linux, etc., but not on Windows, it seems.
>>>
>>> I'm trying to use Hadley Wickham's ggplot2 and I would like to be able to save created charts as svg for later input into Illustrator. I am able to accomplish this workflow under Linux, but I don't know how to get R to recognize the svg() command under Windows. I have loaded RsvgDevice, Cairo, and cairoDevice in my attempts.
>>>
>>> The problem seems to me to be directly related to enabling R to produce svg output on Windows, rather than related to ggplot2.
>>
>> What does capabilities() return?
>>
>> --
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
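Since Cairo.capabilities() in the message above reports svg as TRUE even though base cairo support is missing, one possible workaround is to open the device explicitly instead of going through ggsave(). The sketch below is an assumption, not something tested in the thread: save_svg is a hypothetical helper, it uses grDevices::svg() only when capabilities("cairo") is TRUE, and otherwise falls back to CairoSVG() from the Cairo package if that package is installed.

```r
# Hypothetical helper: draw with plot_fun() into an SVG file, choosing
# whichever SVG device is available in this R installation.
save_svg <- function(file, plot_fun, width = 6, height = 6) {
  if (isTRUE(unname(capabilities("cairo")))) {
    grDevices::svg(file, width = width, height = height)
  } else if (requireNamespace("Cairo", quietly = TRUE)) {
    Cairo::CairoSVG(file, width = width, height = height)
  } else {
    stop("no SVG-capable graphics device found")
  }
  on.exit(dev.off(), add = TRUE)  # always close the device we opened
  plot_fun()
  invisible(file)
}

# Usage with base graphics; for ggplot2, print() the plot object
# inside plot_fun instead.
out <- save_svg(tempfile(fileext = ".svg"), function() plot(1:10))
print(out)
```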
Re: [R] Add columns in a dataframe and fill them from another table according to a criteria
Apologies to list: Should have replied to all.

-- DW

Begin forwarded message:

From: David Winsemius <dwinsem...@comcast.net>
Date: August 1, 2009 3:02:58 PM EDT
To: Meenu Sahi <meenus...@gmail.com>
Subject: Re: [R] Add columns in a dataframe and fill them from another table according to a criteria

On Aug 1, 2009, at 1:43 PM, Meenu Sahi wrote:
> Dear R users
>
> My apologies for not writing in the correct format due to my ignorance. In the future I will write more clearly. I hope to contribute to the R community in the process of picking up the language professionally.
>
> I have now written the R code, which is attached in a notepad file. I've simplified my problem to an example with a table pstate, which contains the probabilities of getting certain changes in the four different states, and a dataframe mydata4, which contains all the changes connected to the four different states. I would like to add the probabilities into mydata4 after matching for the change and the state. Everything before # output can be copy pasted in the R window. The desired output is written after ## OUTPUT
>
> Must I write an if else or can I do it in an easier way? Your help is greatly appreciated!
>
> Many thanks for your patience.

You need to figure out how to send mail to the list with plain text. But I suspect you did successfully get the attachment through to the audience. I did not like the ordering of the PStates in your new target dataframe, so I changed it to fit my (and your) purposes.

    Change <- c("b","a","b","c","d","a")
    State <- c("State1","State4","State2","State3","State1","State3")
    mydata4 <- data.frame(Change, State)
    mydata4 <- data.frame(mydata4,
                          PState1=NA,
                          PState2=NA,
                          PState3=NA,
                          PState4=NA
                          )
    mydata4
      Change  State PState1 PState2 PState3 PState4
    1      b State1      NA      NA      NA      NA
    2      a State4      NA      NA      NA      NA
    3      b State2      NA      NA      NA      NA
    4      c State3      NA      NA      NA      NA
    5      d State1      NA      NA      NA      NA
    6      a State3      NA      NA      NA      NA

Note that str(pstate) shows that State is a factor, which becomes important. This now effects the desired transformation:

    for (i in seq_len(nrow(mydata4))) {
      # assign to the i-th row, State + 2 column in mydata4 ...
      mydata4[i, as.numeric(mydata4[i, "State"]) + 2] <-
        # ... the value of the matching Change row, State + 1 column of pstate
        pstate[mydata4[i, "Change"], as.numeric(mydata4[i, "State"]) + 1]
    }
    mydata4
      Change  State PState1 PState2 PState3 PState4
    1      b State1     Pb1      NA      NA      NA
    2      a State4      NA      NA      NA     Pa4
    3      b State2      NA     Pb2      NA      NA
    4      c State3      NA      NA     Pc3      NA
    5      d State1     Pd1      NA      NA      NA
    6      a State3      NA      NA     Pa3      NA

The main non-obvious trick is the as.numeric(mydata4[i, "State"]) bit. as.numeric() applied to a factor returns the integer codes of the factor rather than the level names, and those codes serve as a column offset here. I suppose I could have left the PStaten's in the original order, but then I would have been subtracting them from 7 to get the proper column number. That seemed even less understandable.

> Regards
> Meenu
>
> On Sat, Aug 1, 2009 at 9:43 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>> On Aug 1, 2009, at 9:52 AM, Meenu Sahi wrote:
>>> Dear R users
>>>
>>> I am new to R. What I want to do is explained below.
>>>
>>> I have a table called States.Prob, which is given below. This table gives the probabilities of the changes in the swap curve depending on the state of the swap curve. I want to put these probabilities in my dataframe mydata (given after the prob table).
>>>
>>> Prob of States
>>>     Changes State1 State2 State3 State4
>>>     a       Pa1    Pa2    Pa3    Pa4
>>>     b       Pb1    Pb2    Pb3    Pb4
>>>     c       Pc1    Pc2    Pc3    Pc4
>>>     d       Pd1    Pd2    Pd3    Pd4
>>>
>>> and I have a dataframe (with 93 rows) called mydata, part of which (6 rows) is given below, where I want to fill in the last four columns with probabilities taken from States.Prob according to the change and state in mydata4:
>>>
>>>       Change  State PState1 PState2 PState3 PState4
>>>     1      b State1     Pb1
>>>     2      a State4                         Pa4
>>>     3      b State2             Pb2
>>>     4      c State3                     Pc3
>>>     5      d State1     Pd1
>>>     6      a State3                     Pa3
>>>
>>> What I want to do is highlighted in Red. How can I do this easily?
>>
>> You may have seen it in red, but we don't, and I, at least, cannot figure out what you intend. (Per the Posting Guide, which you have obviously not yet read, you need to compose your question in plain old monochromatic text and change your mail client so it posts in plain text.) If looking at the help pages for stack() and reshape() does not offer useful information
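The same lookup can also be done without a loop and without relying on factor codes, by indexing a matrix with a two-column matrix of row and column names. This is a sketch under assumptions: the object names prob, picked, filled and result are hypothetical, and the probability table is rebuilt here from the placeholder labels (Pa1, Pb2, ...) used in the thread rather than from real numbers.

```r
# Probability table with rows a-d and columns State1-State4, filled
# column by column with the placeholder labels from the thread.
prob <- matrix(
  paste0("P", rep(c("a", "b", "c", "d"), times = 4), rep(1:4, each = 4)),
  nrow = 4,
  dimnames = list(c("a", "b", "c", "d"), paste0("State", 1:4))
)

mydata4 <- data.frame(
  Change = c("b", "a", "b", "c", "d", "a"),
  State  = c("State1", "State4", "State2", "State3", "State1", "State3"),
  stringsAsFactors = FALSE
)

# One (row name, column name) pair per observation picks one cell each.
picked <- prob[cbind(mydata4$Change, mydata4$State)]

# Place each picked value in the column that matches its State.
filled <- matrix(NA_character_, nrow = nrow(mydata4), ncol = ncol(prob),
                 dimnames = list(NULL, paste0("P", colnames(prob))))
filled[cbind(seq_len(nrow(mydata4)),
             match(mydata4$State, colnames(prob)))] <- picked

result <- cbind(mydata4, filled, stringsAsFactors = FALSE)
print(result)
```

Because the lookup is by name, it does not depend on the order of the State columns or on how many rows the data frame has.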
Re: [R] about the summary(cph.object)
Thanks for your reply. In this example, age was transformed with rcs, so the output differs between f and summary(f). If I need to publish the results, how do I explain the hazard ratio of age?

2009/8/1 David Winsemius <dwinsem...@comcast.net>:
> On Jul 31, 2009, at 11:24 PM, zhu yao wrote:
>> Could someone explain the summary(cph.object)?
>>
>> The example is in the help file of cph.
>>
>>     n <- 1000
>>     set.seed(731)
>>     age <- 50 + 12*rnorm(n)
>>     label(age) <- "Age"
>>     sex <- factor(sample(c('Male','Female'), n, rep=TRUE, prob=c(.6, .4)))
>>     cens <- 15*runif(n)
>>     h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
>>     dt <- -log(runif(n))/h
>>     label(dt) <- 'Follow-up Time'
>>     e <- ifelse(dt <= cens,1,0)
>>     dt <- pmin(dt, cens)
>>     units(dt) <- "Year"
>>     dd <- datadist(age, sex)
>>     options(datadist='dd')
>
> This is the process for setting the range for the display of effects in Design regression objects. See:
>
> ?datadist
>
>     q.effect: set of two quantiles for computing the range of continuous variables to use in estimating regression effects. Defaults are c(.25,.75), which yields inter-quartile-range odds ratios, etc.
>
> ?summary.Design
>
>     #---
>     By default, inter-quartile range effects (odds ratios, hazards ratios, etc.) are printed for continuous factors, ...
>     #---
>     Value: For summary.Design, a matrix of class summary.Design with rows corresponding to factors in the model and columns containing the low and high values for the effects, the range for the effects, the effect point estimates (difference in predicted values for high and low factor values), the standard error of this effect estimate, and the lower and upper confidence limits.
>     #---
>
>>     Srv <- Surv(dt,e)
>>     f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
>>     summary(f)
>>                  Effects              Response : Srv
>>
>>      Factor        Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
>>      age           40.872 57.385 16.513 1.21   0.21 0.80       1.62
>>       Hazard Ratio 40.872 57.385 16.513 3.35     NA 2.22       5.06
>
> In this case, with a 4 df regression spline, you need to look at the effect across the range of the variable. You ought to plot the age effect and examine anova(f). In the untransformed situation the plot is on the log hazards scale for cph. So the effect for age in this case should be the difference in log hazard at ages 40.872 and 57.385. S.E. is the standard error of that estimate, and the Upper and Lower numbers are the confidence bounds on the effect estimate. The Hazard Ratio row gives you exponentiated results, so a difference in log hazards becomes a hazard ratio. {exp(1.21) = 3.35}
>
>>      sex - Female:Male 2.000 1.000 NA 0.64 0.15 0.35 0.94
>>       Hazard Ratio     2.000 1.000 NA 1.91   NA 1.42 2.55
>>
>> What's the meaning of Effect, S.E., Lower, Upper?
>
> You probably ought to read a bit more basic material. If you are asking this question, Harrell's Regression Modeling Strategies might be over your head, but it would probably be a good investment anyway. Venables and Ripley's Modern Applied Statistics has a chapter on survival analysis. Also consider Kalbfleisch and Prentice, Statistical Analysis of Failure Time Data. I'm sure there are others; those are the ones I have on my shelf.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
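The relationship between the Effect rows and the Hazard Ratio rows described above can be checked with a few lines of arithmetic. The numbers below are copied from the summary(f) output quoted in this message; the object names are hypothetical. Because the printed log-scale values are already rounded, exponentiating them may differ from the printed hazard ratios in the last digit.

```r
# Log-hazard effects and confidence limits as printed by summary(f).
effect <- c(age = 1.21, sex = 0.64)
lower  <- c(age = 0.80, sex = 0.35)
upper  <- c(age = 1.62, sex = 0.94)

# Exponentiating a difference in log hazards gives a hazard ratio,
# and the same transformation applies to the confidence limits.
hr <- round(exp(cbind(HR = effect, lower = lower, upper = upper)), 2)
print(hr)
```

For age this reads as: the hazard at age 57.385 is estimated to be about 3.35 times the hazard at age 40.872, with sex held fixed.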
Re: [R] about the summary(cph.object)
zhu yao wrote:
> Thanks for your reply. In this example, age was transformed with rcs, so the output differs between f and summary(f). If I need to publish the results, how do I explain the hazard ratio of age?

David explained this. Nonlinearity in age does not complicate the explanation. The estimate is the ratio of the hazard rate at the upper quartile of age to the hazard rate at the lower quartile, with the ages corresponding to these 2 points shown in the output.

The output of f is not very useful for publication. The output of summary, Function, and latex are.

Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
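The two ages being compared (40.872 and 57.385 in the summary output quoted earlier in the thread) are the lower and upper quartiles of age. A rough check needs only base R, as sketched below. It assumes that the first lines of the cph help example are run as quoted in the thread and that quantile() with its default settings is close to what datadist computes; no exact output is claimed.

```r
# Regenerate the age variable exactly as in the example from the thread.
n <- 1000
set.seed(731)
age <- 50 + 12 * rnorm(n)

# Lower and upper quartiles: compare with the Low and High columns
# that summary(f) prints for age.
q <- quantile(age, probs = c(0.25, 0.75))
print(q)

# Inter-quartile range: compare with the Diff. column.
print(unname(diff(q)))
```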
[R] What does .[foo] really mean?
Hi R users,

I really want to know what exactly .[foo] means.

Thanks in advance.

-Simon