Re: [R] manova: R vs SAS...need some clarification
Please see the footer of this message: we need to know what you did. Also, SAS may have made some assumptions for you without telling you (for example, used a numerically ill-conditioned covariance matrix), and we don't know what you did in SAS, either.

On Tue, 12 Aug 2008, Pedro Mardones wrote:

> Dear all;
>
> Working with a 'fat' data set (700 variables / 50 samples) and trying to run a manova test on it (I'm aware that it's not the best option for this kind of data set), I got the error in the summary.manova function about the rank of the residuals (rank < # variables). OK. The thing that I don't understand is why I don't get the same type of error in SAS. There seems to be no problem with rank deficiency and the fit statistics in SAS (no negative DF or something like that...). I'm sure there must be some differences in the way the manova test is calculated, but I don't know what they are, so I'll appreciate any comments...
>
> Thanks
> PM

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
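The rank error described in this thread can be reproduced on simulated data. The sketch below is an illustration only: the sizes `n` and `p`, the two-level factor `g`, and the random responses are made up, and it says nothing about what SAS does internally.

```r
# Hypothetical 'fat' data: more response variables (p) than samples (n).
set.seed(1)
n <- 10
p <- 20
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)
g <- gl(2, n / 2)                  # two groups of five samples each

fit <- manova(Y ~ g)

# The residuals live in a space of dimension n - 2 (intercept plus one
# group contrast), so their rank cannot reach p when p > n - 2.
resid_rank <- qr(residuals(fit))$rank
resid_rank < p                     # TRUE

# summary() refuses to compute test statistics in this situation;
# capture the error message instead of stopping the script.
msg <- tryCatch(summary(fit), error = function(e) conditionMessage(e))
print(msg)
```

With 700 variables and 50 samples the same thing happens on a larger scale: the residual covariance matrix is singular, so the usual MANOVA statistics are not defined without some further assumption or reduction.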
Re: [R] Update Package on CRAN
stephen sefick wrote:

> To update a package on CRAN I just update all of the version information stuff etc. and then upload it to the ftp site?
>
> Stephen Sefick

Yes, just build the package and submit it to CRAN as before, but with an increased version number.

Uwe Ligges
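The "increased version number" lives in the Version field of the package's DESCRIPTION file. As a rough sketch only: the package name `mypkg`, the version numbers, and the temporary file below are hypothetical, and how the built package is then submitted is whatever CRAN currently documents.

```r
# Hypothetical DESCRIPTION written to a temporary file for illustration.
desc_file <- tempfile("DESCRIPTION")
writeLines(c("Package: mypkg", "Version: 0.1-2", "Title: Demo Package"), desc_file)

desc <- read.dcf(desc_file)
old_version <- desc[1, "Version"]

# Increase the last component of the version number by one.
parts <- as.integer(strsplit(old_version, "[.-]")[[1]])
parts[length(parts)] <- parts[length(parts)] + 1L
new_version <- paste(parts, collapse = ".")

desc[1, "Version"] <- new_version
write.dcf(desc, desc_file)

new_version                                                   # "0.1.3"
package_version(new_version) > package_version(old_version)   # TRUE
```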
Re: [R] Installing R in Ubuntu
On Tue, Aug 12, 2008 at 9:24 PM, Shreyasee Pradhan [EMAIL PROTECTED] wrote:

> Hi,
>
> I am running Ubuntu on my Windows OS through VMware. I am trying to install R in Ubuntu, but am not getting anywhere with the commands given on the site. Can anyone please tell me how to install it, step by step, with the commands to be used? As I'm new to Ubuntu as well, I am not very familiar with the commands.
>
> [snipped]

Hi,

What commands did you try? What worked and what didn't? Which site did you refer to? Please read the posting guidelines here:
http://www.r-project.org/posting-guide.html

In the Ubuntu command line, try:

    sudo aptitude install r-base

And for a list of R packages that you can install from the Ubuntu repositories:

    aptitude search r- | grep '[^A-Za-z0-9] r-'

Install them like this:

    sudo aptitude install r-cran-package-name

HTH,
Senthil

-- 
You see, but you do not observe. The distinction is clear.
Sir Arthur Conan Doyle in, The Memoirs of Sherlock Holmes
Re: [R] aligned memory allocation in C
On Tue, 12 Aug 2008, Jeffrey Horner wrote:

> Christophe Dutang1 wrote:
>> Hi,
>>
>> I'm currently porting to R the SF Mersenne Twister algorithm of Matsumoto and Saito. To get the full power of their code, I want to use their function fill_array32, which needs aligned memory. That is to say, I need to use the C function memalign on Windows, posix_memalign on Linux and classic malloc on Mac OS.
>>
>> In 'Writing R Extensions', they recommend using the R_alloc function to allocate memory in C. Does R_alloc return a pointer to aligned memory? If not, how can I do this?
>>
>> Probably not, because R crashes when I successively call R_alloc and fill_array32 (cf. below) on my MacBook with R 2.7.1.
>
> You can still do this. Just take the address returned from R_alloc and test for alignment. If it's not aligned, then just use an aligned address beyond the one returned.

We haven't been told what the desired alignment is (and those functions need to be told). On 32-bit Mac OS X, R_alloc is definitely aligned on 4-byte boundaries (on 64-bit OSes it is usually 8-byte aligned).

> (But then the question is, which direction beyond the one returned? How does one test for that?)

Addresses always go upwards. So if you want 64-byte alignment you need to allocate a block at least 64 bytes longer than required, and go up to the nearest multiple of 64.

BTW, this is clearly an R-devel question -- see the posting guide.

> Jeff
>
>> Thanks in advance
>>
>> Kind regards
>>
>> Christophe
>>
>> PS: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/howto-compile.html provides an example of memalign.
>>
>> PPS: mac os report
>> [removed]

-- 
Brian D. Ripley
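The "go up to the nearest multiple of 64" step is plain integer arithmetic. The sketch below shows it in R on ordinary numbers standing in for addresses; `align_up` and the sample values are hypothetical, and in C the same computation would be applied to the pointer value returned by R_alloc.

```r
# Round a (stand-in) address up to the next multiple of `alignment`.
align_up <- function(addr, alignment) {
  ((addr + alignment - 1) %/% alignment) * alignment
}

addr <- 1000                      # hypothetical unaligned start address
aligned <- align_up(addr, 64)
aligned                           # 1024
aligned %% 64 == 0                # TRUE
aligned - addr                    # 24 bytes skipped, always less than 64
```

Because at most `alignment - 1` bytes are skipped, over-allocating by the alignment (64 bytes here) always leaves enough room for the requested block after the aligned start.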
Re: [R] Installing R in Ubuntu
Hi,

Thanks for that. The way I tried is as follows:

1) Downloaded the r-base package
2) Went, from the command line, to the directory where the r-base package was downloaded
3) Entered the command sudo apt-get install r-base

But I got the error: Couldn't find r-base command. I don't understand where I went wrong. I will definitely try the commands you suggested.

Thanks,
Shreyasee
Re: [R] Installing R in Ubuntu
Shreyasee Pradhan [EMAIL PROTECTED] writes:

> Hi,
>
> Thanks for that. The way I tried is as follows:
>
> 1) Downloaded the r-base package
> 2) Went, from the command line, to the directory where the r-base package was downloaded
> 3) Entered the command sudo apt-get install r-base
>
> But I got the error: Couldn't find r-base command. I don't understand where I went wrong. I will definitely try the following commands.

I think you probably have no universe repository in your /etc/apt/sources.list. All R-related packages are in universe. Try googling something like "sources.list generator" if you are new to apt's sources.list.

Cheers
poppyer
Re: [R] sqlQuery with date attribute
Hi Many

>     GetReturn <- function(code, date) {
>       db <- "C:/Test.mdb"
>       channel <- odbcConnectAccess(db)
>       ssql <- paste("select * from tblCalendarDate Where CalendarID =", code, "and DateRebal =", date)
>       print(ssql)  # so that I can see what ssql contains
>       mydata <- sqlQuery(channel, ssql)
>       mydata
>     }
>
> [snip]
>
> This is the content of my table tblCalendarDate:
>
>     CalendarID  DateRebal
>     1           29/09/2006
>     1           10/10/2006
>     1           20/10/2006
>     1           31/10/2006
>     1           10/11/2006
>     1           20/11/2006
>
> Actually, the channel is open, but the query on the table was not performed correctly. Here is the result of the function when I run GetReturn(1, "2007-03-01"), for example:

I think something goes wrong with the formatting of the date. In the table tblCalendarDate you have it like *29/09/2006*, but in your function you have it as *2007-03-01*. Dig deeper by experimenting with the date format. You can experiment in Access itself to see what kind of dates Access accepts.

s.
Re: [R] sqlQuery with date attribute
Thank you for your answer. Actually, I've tried this function, where I added the # symbol around the date:

    GetReturn <- function(code, date) {
      db <- "C:/Test.mdb"
      channel <- odbcConnectAccess(db)
      ssql <- paste("select * from tblCalendarDate Where CalendarID =", code, "and DateRebal = #", date, "#")
      print(ssql)  # so that I can see what ssql contains
      mydata <- sqlQuery(channel, ssql)
      mydata
    }
    GetReturn(1, "2007-01-10")

And it works when I simply run the command GetReturn(1, "2007-03-01").

Samuel Bächler [EMAIL PROTECTED] wrote:

> I think something goes wrong with the formatting of the date. In the table tblCalendarDate you have it like *29/09/2006*, but in your function you have it as *2007-03-01*. Dig deeper by experimenting with the date format. You can experiment in Access itself to see what kind of dates Access accepts.
>
> s.
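The fix reported in this thread is to wrap the date in # delimiters. A minimal sketch of building such a query string follows; `build_query` is a hypothetical helper, it only assembles the SQL text (no database connection is opened), and the table and column names are the ones quoted in the thread.

```r
# Build the SQL text with the date wrapped in # delimiters.
# as.Date() fails early on input that is not a recognisable date.
build_query <- function(code, date) {
  date <- as.Date(date)
  sprintf(
    "select * from tblCalendarDate where CalendarID = %d and DateRebal = #%s#",
    as.integer(code),
    format(date, "%Y-%m-%d")
  )
}

ssql <- build_query(1, "2007-03-01")
print(ssql)
```

The resulting string could then be passed to sqlQuery(channel, ssql) as in the function above. Using sprintf instead of paste avoids the extra spaces that paste inserts between the # signs and the date.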
Re: [R] summary.manova rank deficiency error + data
Pedro Mardones wrote:

> Dear R-users;
>
> Previously I posted a question about the problem of rank deficiency in summary.manova. As somebody suggested, I'm attaching a small part of the data set.
>
>     #***
>     test <- structure(.Data = list(
>       structure(.Data = c(rep(1,3), rep(2,18), rep(3,10)), levels = c("1", "2", "3"), class = "factor"),
>       c(0.181829,0.090159,0.115824,0.112804,0.134650,0.249136,0.163144,0.122012,0.157554,0.126283,
>         0.105344,0.125125,0.126232,0.084317,0.092836,0.108546,0.159165,0.121620,0.142326,0.122770,
>         0.117480,0.153762,0.156551,0.185058,0.161651,0.182331,0.139531,0.188101,0.103196,0.116877,0.113733),
>       c(0.181445,0.090254,0.115840,0.112863,0.134610,0.249003,0.163116,0.122135,0.157206,0.126129,
>         0.105302,0.124917,0.126243,0.084455,0.092818,0.108458,0.158769,0.121244,0.141981,0.122595,
>         0.117556,0.153507,0.156308,0.184644,0.161421,0.181999,0.139376,0.187708,0.103126,0.116615,0.113746),
>       c(0.181058,0.090426,0.115926,0.113022,0.134632,0.248845,0.163140,0.122331,0.156871,0.126023,
>         0.105335,0.124757,0.126325,0.084690,0.092885,0.108455,0.158386,0.120913,0.141676,0.122492,
>         0.117707,0.153293,0.156095,0.184242,0.161214,0.181670,0.139271,0.187318,0.103129,0.116421,0.113826),
>       c(0.180692,0.090704,0.116110,0.113319,0.134745,0.248678,0.163256,0.122637,0.156581,0.125998,
>         0.105479,0.124686,0.126514,0.085066,0.093088,0.108587,0.158040,0.120674,0.141446,0.122488,
>         0.117972,0.153150,0.155954,0.183885,0.161063,0.181383,0.139251,0.186956,0.103232,0.116351,0.114001),
>       c(0.180353,0.091088,0.116392,0.113753,0.134965,0.248520,0.163475,0.123046,0.156354,0.126067,
>         0.105726,0.124713,0.126821,0.085584,0.093432,0.108858,0.157742,0.120533,0.141309,0.122595,
>         0.118340,0.153088,0.155897,0.183582,0.160975,0.181143,0.139314,0.186636,0.103449,0.116415,0.114275)
>       ),
>       names = c("GROUP", "Y1", "Y2", "Y3", "Y4", "Y5"),
>       row.names = seq(1:31),
>       class = "data.frame"
>     )
>
>     summary(manova(cbind(Y1,Y2,Y3,Y4,Y5) ~ GROUP, test), test = "Wilks")
>
>     # Error in summary.manova(manova(cbind(Y1, Y2, Y3, Y4, Y5) ~ GROUP, test), :
>     #   residuals have rank 3 < 5
>     #***
>
> What I don't understand is why SAS returns no errors using PROC GLM for the same data set. Is it because PROC GLM doesn't take into account problems of rank deficiency? So, should I trust manova instead of PROC GLM output? I know it can be a touchy question but I would like to receive some insights.
>
> Thanks
> PM

What you have here is extremely correlated data:

    (V <- estVar(lm(cbind(Y1,Y2,Y3,Y4,Y5) ~ GROUP, test)))
                Y1          Y2          Y3          Y4          Y5
    Y1 0.001262567 0.001259177 0.001254746 0.001249106 0.001242385
    Y2 0.001259177 0.001255814 0.001251416 0.001245812 0.001239132
    Y3 0.001254746 0.001251416 0.001247055 0.001241494 0.001234861
    Y4 0.001249106 0.001245812 0.001241494 0.001235983 0.001229405
    Y5 0.001242385 0.001239132 0.001234861 0.001229405 0.001222889

    eigen(V)
    $values
    [1] 6.224077e-03 2.313066e-07 3.499837e-10 4.259125e-12 1.334146e-12

    $vectors
              [,1]        [,2]       [,3]       [,4]       [,5]
    [1,] 0.4503756  0.61213579  0.5204920 -0.3485941  0.1732681
    [2,] 0.4491807  0.32333236 -0.1873653  0.5929444 -0.5540795
    [3,] 0.4476157  0.01442094 -0.5498688  0.1272921  0.6934503
    [4,] 0.4456201 -0.31202109 -0.3198606 -0.6557557 -0.4144143
    [5,] 0.4432397 -0.65052351  0.5378809  0.2840428  0.1017918

Notice the more than 9 orders of magnitude between the eigenvalues.

I think that what is happening is that what SAS calls MANOVA is actually looking at within-row contrasts, which effectively removes the largest eigenvalue. In R, the equivalent would be

    anova(lm(cbind(Y1,Y2,Y3,Y4,Y5) ~ GROUP, test), X = ~1, test = "Wilks")
    Analysis of Variance Table

    Contrasts orthogonal to
    ~1

                Df  Wilks approx F num Df den Df Pr(>F)
    (Intercept)  1  0.037  164.873      4     25 <2e-16 ***
    GROUP        2  0.701    1.215      8     50 0.3098
    Residuals   28
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

or (this could be computationally more precise, but in fact it gives the same result)

    anova(lm(cbind(Y2,Y3,Y4,Y5) - Y1 ~ GROUP, test), test = "Wilks")

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])              FAX: (+45) 35327907
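The "more than 9 orders of magnitude" remark can be checked directly from the eigenvalues printed in the reply. The sketch below just redoes that arithmetic; the vector `ev` is copied from the eigen(V) output quoted in this thread.

```r
# Eigenvalues of the estimated residual covariance matrix, as printed above.
ev <- c(6.224077e-03, 2.313066e-07, 3.499837e-10, 4.259125e-12, 1.334146e-12)

# Ratio of largest to smallest eigenvalue, on a log10 scale.
spread <- log10(max(ev) / min(ev))
spread > 9                         # TRUE

# Share of the total variance carried by the first eigenvector.
ev[1] / sum(ev)
```

Almost all of the variance lies along the first eigenvector, whose entries are nearly equal. That is why removing the common level of Y1 to Y5 (contrasts orthogonal to ~1) leaves a problem of much lower effective scale.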
Re: [R] Senging commands to the GUI in Windows through a script
Cool, many thanks Henrik.

Tolga

Henrik Bengtsson [EMAIL PROTECTED] wrote on 13/08/2008 02:03:

> With AutoIt [http://www.autoitscript.com/] you can set up scripts that send keyboard and mouse events, wait for windows to open, and more. It is quite powerful.
>
> /Henrik
>
> On Tue, Aug 12, 2008 at 4:51 AM, [EMAIL PROTECTED] wrote:
>> OK thanks,
>> Tolga
>>
>> Prof Brian Ripley [EMAIL PROTECTED] wrote on 12/08/2008 12:46:
>>> On Tue, 12 Aug 2008, [EMAIL PROTECTED] wrote:
>>>> Dear R Users,
>>>>
>>>> How can I send commands to the R GUI from within an R script in Microsoft Windows? I am trying to get the windows within the R GUI to tile after I draw a graph.
>>>
>>> Not directly (it's possible you can via COM, but there is no R function to do so).
>>>
>>>> Thanks in advance,
>>>> Tolga
>>>
>>> -- 
>>> Brian D. Ripley
Re: [R] Installing R in Ubuntu
Hi,

If you download a package to your hard drive for installation, you need to use the dpkg command, like this:

1) Download the package (foo.deb)
2) Go to the directory
3) dpkg -i foo.deb

But I would advise against this, because it is better to use repositories so that R gets updated automatically. The standard Ubuntu repositories have old versions of R; see http://cran.r-project.org/bin/linux/ubuntu/ for a description of how to add the CRAN repositories for the latest version of R. You can also install a lot of R packages from this repository; doing this also ensures that they are automatically updated.

cheers and hth,
Paul

Shreyasee Pradhan wrote:

> Hi,
>
> Thanks for that. The way I tried is as follows:
>
> 1) Downloaded the r-base package
> 2) Went, from the command line, to the directory where the r-base package was downloaded
> 3) Entered the command sudo apt-get install r-base
>
> But I got the error: Couldn't find r-base command. I don't understand where I went wrong. I will definitely try the following commands.
>
> Thanks,
> Shreyasee

-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +31302535773
Fax: +31302531145
http://intamap.geo.uu.nl/~paul
[R] mob(party) formula question
I am trying to use mob() with my data frame ('data.frame': 288 obs. of 81 variables; factors, numerics and ordered factors). My response is a binary variable, so I should model it with a logistic regression (family = binomial). I read in the MOB vignette that I could use a formula like this if I would like to have only partitioning variables apart from the response:

    Test.mob <- mob(Resp ~ 1 | Var1 + Var2 +, data = dataframe, model = glinearModel, family = binomial())

but this gives me back an error message:

    Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected

But Var1, Var2 and Resp are in my data frame. Why do I get this error?

I am also wondering how I can find out which variables I should use for partitioning and which for modelling. There are correlations between some variables in my data frame. Would it be possible to always use one variable of each correlated pair for partitioning and the other for modelling?

I would be very happy if somebody could give me some hints or answers to my questions. Many thanks in advance.

B.

-----
The art of living is more like wrestling than dancing. (Marcus Aurelius)
--
View this message in context: http://www.nabble.com/mob%28party%29-formula-question-tp18959898p18959898.html
Sent from the R help mailing list archive at Nabble.com.
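One generic way to chase an "undefined columns selected" error is to compare the variable names used in the formula with the column names of the data frame. The sketch below uses base R only; the toy data frame `df` and the formula `f` are hypothetical stand-ins, and it is not a diagnosis of what mob() does internally.

```r
# Hypothetical data frame in which one column name differs only by case.
df <- data.frame(Resp = c(0, 1, 1, 0), Var1 = 1:4, var2 = c(2.5, 3.1, 0.7, 1.9))

# A formula of the same shape as the one passed to mob().
f <- Resp ~ 1 | Var1 + Var2

# Names used in the formula that are not columns of the data frame.
missing_vars <- setdiff(all.vars(f), names(df))
missing_vars                       # "Var2"
```

If this returns character(0) for the real data, every name in the formula matches a column exactly, and the cause has to be looked for elsewhere.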
[R] Calculating an appropriate error from the NNET package for a continuous target
I have been using the nnet package and have successfully run neural networks on both continuous and binary targets. I searched the internet and found out how to capture the error resulting from a binary model with no problems.

My problem now is that I am trying to find out how to calculate an appropriate error when modelling a continuous target. As a statistician I immediately think of the RMSE. Is it possible to calculate such a statistic from the model? This would require knowledge of the degrees of freedom in the model (if there is an equivalent in a neural network!?).

Ideally I would like the proportion of the RMSE to total error. For example, one criterion of model fit may be an error rate no worse than 10%. It is this sort of statistic that I am desperate to calculate. Any help greatly appreciated.
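For a continuous target, an RMSE can be computed from observed and predicted values alone, without a degrees-of-freedom correction. The sketch below is one possible reading of the question, not an answer given in the thread: `rmse`, the toy vectors, and the choice to scale by the mean of the observed values are all assumptions. With a fitted network, the predictions would typically come from predict() on the model object.

```r
# Root mean squared error: divides by n, with no degrees-of-freedom adjustment.
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

# Hypothetical observed targets and model predictions.
observed  <- c(10, 12, 15, 11)
predicted <- c(11, 12, 14, 13)

err <- rmse(observed, predicted)
err                                # sqrt(1.5)

# One possible relative measure: RMSE as a fraction of the mean target value.
relative_err <- err / mean(observed)
relative_err < 0.11                # TRUE for these toy numbers
```

Whether 10% of the mean (or of the range, or of the standard deviation) is the right yardstick depends on the application; the division above is only one convention.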
Re: [R] aligned memory allocation in C
Yes, it seems a good idea but your two questions are also good questions!

On 13 Aug 08 at 04:52, Jeffrey Horner wrote:

> Christophe Dutang1 wrote:
>> Hi,
>>
>> I'm currently porting to R the SF Mersenne Twister algorithm of Matsumoto and Saito. To get the full power of their code, I want to use their function fill_array32, which needs aligned memory. That is to say, I need to use the C function memalign on Windows, posix_memalign on Linux and classic malloc on Mac OS.
>>
>> In 'Writing R Extensions', they recommend using the R_alloc function to allocate memory in C. Does R_alloc return a pointer to aligned memory? If not, how can I do this?
>>
>> Probably not, because R crashes when I successively call R_alloc and fill_array32 (cf. below) on my MacBook with R 2.7.1.
>
> You can still do this. Just take the address returned from R_alloc and test for alignment. If it's not aligned, then just use an aligned address beyond the one returned.
>
> (But then the question is, which direction beyond the one returned? How does one test for that?)
>
> Jeff
>
>> Thanks in advance
>>
>> Kind regards
>>
>> Christophe
>>
>> PS: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/howto-compile.html provides an example of memalign.
>>
>> PPS: mac os report
>>
>> Thread 0 Crashed:
>> 0   libSystem.B.dylib  0x9341bb9e  __kill + 10
>> 1   libSystem.B.dylib  0x93492ec2  raise + 26
>> 2   libSystem.B.dylib  0x934a247f  abort + 73
>> 3   randtoolbox.so     0x15e65f1d  0x15e5d000 + 36637
>> 4   randtoolbox.so     0x15e614ef  fill_array32 + 4038
>> 5   randtoolbox.so     0x15e6513d  SFmersennetwister + 335
>> 6   randtoolbox.so     0x15e652c6  doSFMersenneTwister + 255
>> 7   libR.dylib         0x00367a52  do_dotcall + 1394
>> 8   libR.dylib         0x0038b5a2  Rf_eval + 1754
>> 9   libR.dylib         0x0038f9a2  do_set + 592
>> 10  libR.dylib         0x0038b366  Rf_eval + 1182
>> 11  libR.dylib         0x0038b366  Rf_eval + 1182
>> 12  libR.dylib         0x0038c140  do_begin + 58
>> 13  libR.dylib         0x0038b366  Rf_eval + 1182
>> 14  libR.dylib         0x0038b366  Rf_eval + 1182
>> 15  libR.dylib         0x0038c140  do_begin + 58
>> 16  libR.dylib         0x0038b366  Rf_eval + 1182
>> 17  libR.dylib         0x0038d9a6  Rf_applyClosure + 663
>> 18  libR.dylib         0x0038b25d  Rf_eval + 917
>> 19  org.R-project.R    0x000189c3  run_REngineRmainloop + 569 (Rinit.m:442)
>> 20  org.R-project.R    0x0001142a  -[REngine runREPL] + 260 (REngine.m:181)
>> 21  org.R-project.R    0x2e91      main + 795 (main.m:126)
>> 22  org.R-project.R    0x2b5a      _start + 216
>> 23  org.R-project.R    0x2a81      start + 41
>
> -- 
> http://biostat.mc.vanderbilt.edu/JeffreyHorner
Re: [R] dixon test
Hi,

thank you very much for your helpful reply =).

Just one question: I don't know the distribution of my data (normal, t, etc.), so how should I set the type parameter? Is there a type value to use for a distribution-free statistical test?

Thank you so much!

Fernando Marmolejo-Ramos wrote:

Hi giov,

About the Dixon test: I just ran a simple test with a sample of 40 and got:

    Error in dixon.test(x) : Sample size must be in range 3-30

So it seems that most of the tests in the outliers package are designed for small samples. See also the R News article published in May 2006 (vol. 6/2), "Processing data for outliers", by Lukasz Komsta (the developer of the package).

However, that package has a function called scores() which works for big samples. You can also see the p-values and z scores for your observations and determine which values are considered outliers. Try this simple syntax:

    library(outliers)
    library(gamlss.dist)
    # this produces an exponential+Gaussian distribution (which usually has heaps of outliers!)
    x <- rexGAUS(100, 2000, 3000, 5000)
    # this confirms that Dixon only works for samples between 3 and 30!
    dixon.test(x)
    # just to see what the data set looks like and visually confirm the outliers
    boxplot(x, notch=TRUE)
    # sort the scores in ascending order
    sort(x)
    # returns the probability of each score (using z scores) being an outlier, in order
    sort(scores(x, type="z", prob=1))
    # determines which scores are considered outliers with 95% confidence
    sort(scores(x, prob=0.95))

The author's notes on the prob argument:

    prob  If set, the corresponding p-values instead of scores are given. If value is set to 1, p-value are returned. Otherwise, a logical vector is formed, indicating which values are exceeding specified probability. In z and mad types, there is also possibility to set this value to zero, and then scores are confirmed to (n-1)/sqrt(n) value, according to Shiffler (1998). The iqr type does not support probabilities, but lim value can be specified.

The Shiffler reference is not the one that appears in the help page (which gives 1998). It is this one:

Shiffler, R.E. (1988). Maximum Z scores and outliers. Am. Stat. 42(1), 79-80.

I hope this helps,
Fernando

--
View this message in context: http://www.nabble.com/dixon-test-tp18940260p18960162.html
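To make the z-score idea above a bit more concrete, here is a minimal base-R sketch. It does not call the outliers package, the data are made up, and the two-sided 1.96 cutoff is an assumption of this sketch (it is not claimed to be what scores() does internally). It also shows the (n-1)/sqrt(n) bound on z scores mentioned in the help text.

```r
# Hypothetical data: 40 "typical" values plus one extreme point.
set.seed(1)
x <- c(rnorm(40, mean = 50, sd = 5), 95)
n <- length(x)

# z scores computed by hand (sd() uses the n - 1 denominator)
z <- (x - mean(x)) / sd(x)

# Shiffler's bound: no |z| in a sample of size n can exceed (n - 1) / sqrt(n)
max_z <- (n - 1) / sqrt(n)

# Flag points beyond a two-sided 95% normal cutoff.
# This assumes the bulk of the data is roughly normal.
suspect <- x[abs(z) > qnorm(0.975)]
suspect
```

Note that the flagging step only makes sense under the normality assumption; with a different assumed distribution the cutoff would change.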
Re: [R] fPortfolio constraints, maxsumW
JPB == John P. Burkett [EMAIL PROTECTED]
    on Tue, 12 Aug 2008 10:46:28 -0400

JPB> Running R version 2.6.1 under Gentoo Linux and using the fPortfolio
JPB> package, I am having trouble specifying a sector constraint. One of the
JPB> constraints to be imposed is that assets 1 and 2 together account for no
JPB> more than 13.63% of the portfolio. My attempt at coding that
JPB> constraint, "maxsumW[1:2Assets]=13.63", fails. The relevant section of
JPB> my code file and the resulting error message are pasted below.
JPB> Suggestions about how to correct my coding would be most welcome.
JPB>
JPB> ***** Code begins here *****
JPB> Data = as.timeSeries(Jdata)
JPB> Spec = portfolioSpec()
JPB> setNFrontierPoints(Spec) = 150
JPB> Spec
JPB> Constraint = c("minW[1:nAssets]=0", "maxsumW[1:2Assets]=13.63")
JPB> frontier = portfolioFrontier(Data, Spec, Constraint)
JPB> ***** Error message begins here *****
JPB> Error in parse(text = constraints[i]) :
JPB>   unexpected symbol in "maxsumW[1:2Assets"
JPB> ***** Error message ends here *****
JPB>
JPB> -John

Hi John,

you should use 0.1363 instead of 13.63...

hope this helps,
yohan

--
PhD student
Swiss Federal Institute of Technology Zurich
www.ethz.ch
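A side note on the error message itself, as a minimal sketch: the message comes from parse(), so it can be reproduced in base R without fPortfolio. The string "maxsumW[1:2]=0.1363" below is a hypothetical spelling that combines a plain index range with the 0.1363 suggested above; whether fPortfolio accepts exactly this form has not been checked here.

```r
# Base R only; fPortfolio is not needed to reproduce the parse error.
bad  <- "maxsumW[1:2Assets]=13.63"   # constraint string from the original post
good <- "maxsumW[1:2]=0.1363"        # hypothetical corrected spelling (assumption)

# try() captures the error instead of stopping the script
bad_result  <- try(parse(text = bad),  silent = TRUE)
good_result <- try(parse(text = good), silent = TRUE)

inherits(bad_result, "try-error")    # TRUE: "2Assets" is not valid R syntax
inherits(good_result, "try-error")   # FALSE: the string parses as an R expression
```

Checking constraint strings with parse(text = ...) before handing them to the package may be a cheap way to separate syntax problems from problems with the values.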
[R] problems with packages tseries and robustbase
Dear R Users,

Is there a known problem with downloading packages robustbase and tseries from the UK CRAN website?

Thanks in advance,
Tolga

R version 2.7.1 (2008-06-23)

> utils:::menuInstallPkgs()
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/robustbase_0.2-8.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) :
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/robustbase_0.2-8.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available, :
  download of package 'robustbase' failed
> utils:::menuInstallPkgs()
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/tseries_0.10-15.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) :
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.7/tseries_0.10-15.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available, :
  download of package 'tseries' failed
[R] Help me: nls and try function
Dear All,

I have these problems:

1) How can I use the function try() with an nls model: try(nls(...))?

2) I have 100 columns of data (Time, A1, A2, A3, A4, ..., AN) and I want to prepare 99 files, each pairing the first column with one of the others:
   a) Time and A1
   b) Time and A2
   ...
   n) Time and AN

Thanks for any help,
M
Re: [R] problems with *downloading* packages tseries and robustbase
It works from my home ISP (Virgin Media) and from .ox.ac.uk, so I think the problem is local to you, perhaps your DNS server. Ask your IT support for erm ... support.

Please do try to use an accurate subject line (see the posting guide).

(Why don't people just try a different mirror or some IP/TCP debugging tools instead of asking here about problems with mirrors? At most a handful of readers can do anything about this.)

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] problems with *downloading* packages tseries and robustbase
Hi,

It may be a problem with the mirror and my location. However, I should have added that the following packages had no problems installing from that mirror and location, which is why I suspected it was something more:

RBloomberg strucchange car lmtest nlme corrgram RODBC MSBVAR xtable vars tseries PerformanceAnalytics fArma

In addition to tseries and robustbase, I am also having problems with numDeriv. Strange. Nothing dire, as one can just download the zip files locally and install, which is what I did. I thought I would bring it to the list's attention in case it is something package specific, and if so, the fix would benefit others. If there is another list to which one should report suspected package/mirror issues, please let me know and I can use that in the future.

The commands used are below.

Regards,
Tolga

install.packages("RDCOMClient", repos="http://www.omegahat.org/R")
install.packages("RBloomberg", repos="http://cran.uk.r-project.org")
install.packages("strucchange", repos="http://cran.uk.r-project.org")
install.packages("car", repos="http://cran.uk.r-project.org")
install.packages("lmtest", repos="http://cran.uk.r-project.org")
install.packages("nlme", repos="http://cran.uk.r-project.org")
install.packages("corrgram", repos="http://cran.uk.r-project.org")
install.packages("RODBC", repos="http://cran.uk.r-project.org")
install.packages("MSBVAR", repos="http://cran.uk.r-project.org")
install.packages("xtable", repos="http://cran.uk.r-project.org")
install.packages("vars", repos="http://cran.uk.r-project.org")
install.packages("tseries", repos="http://cran.uk.r-project.org")
install.packages("PerformanceAnalytics", repos="http://cran.uk.r-project.org")
install.packages("fArma", repos="http://cran.uk.r-project.org")
install.packages("numDeriv", repos="http://cran.uk.r-project.org")
install.packages("nortest", repos="http://cran.uk.r-project.org")
install.packages("chron", repos="http://cran.uk.r-project.org")
Re: [R] Help me: nls and try function
For the first question, you have provided the answer: try(nls(...)). Was there something else you wanted?

For part 2, this should work:

    for (i in names(myData)[-1]) {  # skip first column with Time
        write.table(myData[, c("Time", i)], file = i)
    }

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
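For what it's worth, both parts can be combined in one loop. The following is only a sketch under stated assumptions: the data frame, the exponential-decay model, the starting values and the file names are all made up for illustration. The point is that try() returns either the fitted model or an object of class "try-error", so one bad column does not stop the loop.

```r
set.seed(42)
# Hypothetical data: a Time column plus two response columns
myData <- data.frame(Time = 1:20)
myData$A1 <- 5 * exp(-0.2 * myData$Time) + rnorm(20, sd = 0.05)  # follows the model
myData$A2 <- rnorm(20)                                           # pure noise; the fit may fail

out_dir <- tempdir()   # write to a temporary directory in this sketch
fits <- list()
for (nm in names(myData)[-1]) {   # skip the Time column
  # Part 2: one two-column file per response (Time plus the current column)
  write.table(myData[, c("Time", nm)],
              file = file.path(out_dir, paste0(nm, ".txt")),
              row.names = FALSE)
  # Part 1: wrap nls() in try() so a failed fit is stored, not fatal
  d <- data.frame(Time = myData$Time, y = myData[[nm]])
  fits[[nm]] <- try(nls(y ~ a * exp(-b * Time), data = d,
                        start = list(a = 4, b = 0.15)),
                    silent = TRUE)
}

# TRUE for every column whose fit failed
failed <- vapply(fits, inherits, logical(1), what = "try-error")
failed
```

Afterwards, fits[!failed] holds the usable models and names(fits)[failed] lists the columns that need a closer look (different starting values, for example).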
Re: [R] Location of HTML help files [on Firefox 3]
I've added a couple of workarounds for this in R-patched and hence the upcoming R 2.7.2.

1) There is a new menu item on Rgui to go directly to SearchEngine.html.
2) help.start() has a new argument searchEngine=TRUE to do the same.

R 2.7.2 is 12 days away, so it would be appreciated if Firefox 3 users would do some testing (especially on platforms other than Windows, which is all I have tested with Firefox 3). Hopefully these changes will appear in R-patched in tonight's tarball and Windows binary build (from the CRAN master).

On Thu, 31 Jul 2008, Keith Ponting wrote:

On Wed, Jul 16, 2008 at 7:27 PM, Jan Smit smit1 at un.org wrote:

I am using R 2.7.1 under Windows XP, installed at C:/R/R-2.7.1. The location of the HTML SearchEngine is file:///C:/R/R-2.7.1/doc/html/search/SearchEngine.html.

Now, when I type a phrase, say "reshape", in the search text field, the Search Results page suggests that the location of the reshape HTML help file is file:///C:/R/library/stats/html/reshape.html, while in reality it is file:///C:/R/R-2.7.1/library/stats/html/reshape.html.

Is there an easy way in which I can fix this?

I too had this problem with Firefox 3.0 and 3.0.1 (but not with 3.0 RC1, by the way). A workaround which works for me is to go directly to the search engine page (enter URI file:///C:/Program%20Files%20(x86)/R/R-2.7.1/doc/html/search/SearchEngine.html on my Windows Vista installation), rather than going there by following the link on the R documentation page (file:///C:/Program%20Files%20(x86)/R/R-2.7.1/doc/html/index.html).

Keith Ponting
Aurix Ltd, Malvern WR14 3SZ UK

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] dixon test
giov [EMAIL PROTECTED] 13/08/2008 10:59:32

> just a question...I don't know what is the distribution of my data
> (normal, T, etc...). So, how can I set the type parameter?

You must assume an underlying distribution or you can't do an outlier test.

Outliers are just unusually extreme data points. They can only be considered 'unusual' if there is some basis - a distribution assumption - for deciding what is 'usual'. The assumed underlying distribution describes what is expected to be 'usual'. With no distribution assumption, there is no basis for considering any data point unusual, so the idea of an outlier really has no meaning.

Steve E
Re: [R] Threshold vector error correction models
Hello,

I worked on threshold cointegration for my master's thesis and wrote code for it in R. This code will be published in a future release of the package tsDyn, when I have time to finish it (there is a difference between using your own code and making it available in R; I didn't realize there was so much more work).

You can download the current code at http://code.google.com/p/tsdyn/source/checkout and compile it yourself (I hope you are on Linux, otherwise it will take some time).

The code currently available includes:

In good shape:
- OLS grid search estimator for TVECM and TVAR, with methods (especially nice LaTeX export)

Not yet wrapped in methods:
- Hansen and Seo test of linear against threshold cointegration, and their estimator (MLE)
- Seo test: no cointegration against threshold cointegration
- Hansen linearity test for TAR, and its extension to the multivariate case by Lo and Zivot
- simple Wald test for the coefficients
- functions to simulate and bootstrap TAR and TVAR

So see the (provisional) doc pages, explore it, and let me know if you need more information (German is also fine).

Matthieu

Message: 7
Date: Tue, 12 Aug 2008 11:08:27 + (GMT)
From: Werner Wernersen [EMAIL PROTECTED]
Subject: [R] Threshold vector error correction models
To: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Content-Type: text/plain; charset=iso-8859-1

Hi,

is anyone aware of estimation functions for threshold vector error correction / threshold cointegration models? I didn't find anything for R using RSeek or Google.

Thanks a lot for any pointers,
Werner
[R] rgl/compiz problem
I have just encountered the problem with rgl where plot3d figures don't interact with the mouse. My plots zoom in and out with the mouse wheel but the mouse buttons do nothing. I can't rotate the plot.

This has been mentioned and discussed here and in other lists before, and the solution is to turn off Ubuntu's fancy graphics. Back in March, Ben Bolker said:

    unfortunately rgl and compiz/etc. both try to use the same OpenGL interface, so you can't use both at the same time.

This has echoes of when TCP/IP was in its infancy back in the days of DOS, and only one program could access the network interface at a time (until TCP/IP software got its act together). Is OpenGL really in the same position now? Or is Compiz being greedy in some sense? Surely two OpenGL applications can run at the same time? Or is it because rgl is running 'within' another OpenGL window already, so there's some nesting problem going on? Google Earth works fine, and I think that uses OpenGL.

Anyone had any ideas since March? I'm on Ubuntu 8.04 and R 2.7.1.

Barry
Re: [R] mob(party) formula question (example)
Here is an example that produces the same error: Read in the following as textfile (save as DFExample.txt): 1 2 3 4 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 25 27 28 29 30 31 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 AX 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 25 5 9 1 8.5 2.5 3 5 2 2 3 3 1 1 1 2 1 2 BX 1 1 0 0 1 0 0 1 NA NA NA 0 0 0 0 1 0 0 1 0 0 1 0 NA NA NA NA NA NA NA NA 0 0 0 1 0 NA NA NA NA NA NA NA 0 0 0 0 1 1 0 0 0 1 1 NA NA 6 1 3.252.255 5 2 2 3 3 1 1 1 1 1 1 CX 1 1 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 15 3.5 6 1 5.5 5.5 5 5 2 2 1 2 1 1 1 1 2 2 DX 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 50 17.57.5 2.5 8.5 5 5 5 2 2 2 3 1 1 1 1 3 3 EX 1 0 1 0 1 0 0 1 NA NA NA 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 NA NA 14.530 13 2.5 3 3 1 1 4 4 1 1 1 1 1 1 FX 1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 1 1 1 0 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 1 0 1 0 1 1 0 0 165 25 11.5 15 12 6.5 5 5 1 1 3 3 1 1 1 1 4 5 GX 1 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 1 0 0 1 1 0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 40 20 14.5 9.5 11 10 3 3 1 1 1 3 1 1 3 4 1 3 HX 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 NA NA NA NA NA NA NA NA NA NA NA NA NA 1
Re: [R] rgl/compiz problem
Two days ago I installed compiz on my Debian laptop. It plays fine with the OpenGL games that I also have on that computer. (My son plays the games in the brief interludes between my intense R hacking sessions. I, of course, have no time for such frivolity. The production cycle is sleep - eat - R - eat - R - eat - sleep.) I can rotate and zoom using the touchpad when both rgl and compiz are running.

Not sure if any of this is of help,

Simon.

Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
T: +61 7 3365 2506
email: S.Blomberg1_at_uq.edu.au
http://www.uq.edu.au/~uqsblomb/

Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
Re: [R] mob(party) formula question
On Wed, 13 Aug 2008, Birgitle wrote:

> I try to use mob() with my data.frame ('data.frame': 288 obs. of 81
> variables; factors, numerics and ordered factors).
>
> My response is a binary variable and I should use a logistic regression
> for modelling (family=binomial).
>
> I read in the MOB vignette that I could use a formula like this if I would
> like to have only partitioning variables apart from the response.
>
> Test.mob <- mob(Resp~1|Var1+Var2+, data=dataframe, model=glinearModel, family=binomial())

This works for me. Considering an example that is easily reproducible: classifying just two (out of three) species in the iris data.

    iris2 <- iris[-(1:50),]
    iris2$Species <- factor(iris2$Species)
    mb <- mob(Species ~ 1 | Petal.Length + Petal.Width + Sepal.Length + Sepal.Width,
      data = iris2, model = glinearModel, family = binomial())

and this runs fine, just selecting a single split

    R> mb
    1) Petal.Width <= 1.7; criterion = 1, statistic = 81.818
      2)* weights = 54
    Terminal node model
    Binomial GLM with coefficients:
    (Intercept)
         -2.282

    1) Petal.Width > 1.7
      3)* weights = 46
    Terminal node model
    Binomial GLM with coefficients:
    (Intercept)
          3.807

> but this gives me back an error-message:
>
> Error in `[.data.frame`(x, r, vars, drop = drop) :
>   undefined columns selected
>
> But Var1, Var2 and Resp are in my dataframe.
>
> Why do I get this error?

More importantly, when do you get this error? My guess is that this is during plotting, right? If so, then the problem is that the plot() method for mob objects by default calls node_bivplot() in each terminal node, which is designed for generating partial regressor plots. In this situation this does not make sense because you don't have regressors in the terminal nodes.

We haven't got a panel function for the type of model you are looking at, but I've just hacked a simple one that should be sufficient for your purposes. It is essentially like node_barplot() but exploits the binomial model. It is attached below.
With this you can do

    plot(mb, terminal_panel = myplot, tnex = 2)

> I am also wondering how I can find out which variables I should use for
> partitioning and which for modelling?

For the variables for which a linear specification makes sense (at least in each component), you should include them for modeling. Those variables for which it is not clear a priori what a useful parametric specification would be should be used as partitioning variables.

> There are correlations between some variables in my dataframe. Would it be
> a possibility to use always one variable of the correlated variable-pairs
> for partitioning and one for modelling?

You can do that, but you could also do other combinations. That probably depends on your application.

hth,
Z

    myplot <- function(ctreeobj, col = "black", fill = NULL, beside = NULL,
      ymax = NULL, ylines = NULL, widths = 1, gap = NULL, reverse = NULL, id = TRUE)
    {
      getMaxPred <- function(x) {
        mp <- max(x$prediction)
        mpl <- ifelse(x$terminal, 0, getMaxPred(x$left))
        mpr <- ifelse(x$terminal, 0, getMaxPred(x$right))
        return(max(c(mp, mpl, mpr)))
      }

      y <- response(ctreeobj)[[1]]

      if(is.factor(y) || class(y) == "was_ordered") {
        ylevels <- levels(y)
        if(is.null(beside)) beside <- if(length(ylevels) < 3) FALSE else TRUE
        if(is.null(ymax)) ymax <- if(beside) 1.1 else 1
        if(is.null(gap)) gap <- if(beside) 0.1 else 0
      } else {
        if(is.null(beside)) beside <- FALSE
        if(is.null(ymax)) ymax <- getMaxPred([EMAIL PROTECTED]) * 1.1
        ylevels <- seq(along = [EMAIL PROTECTED])
        if(length(ylevels) < 2) ylevels <- ""
        if(is.null(gap)) gap <- 1
      }
      if(is.null(reverse)) reverse <- !beside
      if(is.null(fill)) fill <- gray.colors(length(ylevels))
      if(is.null(ylines)) ylines <- if(beside) c(3, 2) else c(1.5, 2.5)

      ### panel function for barplots in nodes
      rval <- function(node) {

        ## parameter setup
        fm <- node$model
        pred <- fm$family$linkinv(coef(fm))
        if(reverse) {
          pred <- rev(pred)
          ylevels <- rev(ylevels)
        }
        np <- length(pred)
        nc <- if(beside) np else 1

        fill <- rep(fill, length.out = np)
        widths <- rep(widths, length.out = nc)
        col <- rep(col, length.out = nc)
        ylines <- rep(ylines, length.out = 2)

        gap <- gap * sum(widths)
        yscale <- c(0, ymax)
        xscale <- c(0, sum(widths) + (nc+1)*gap)

        top_vp <- viewport(layout = grid.layout(nrow = 2, ncol = 3,
            widths = unit(c(ylines[1], 1, ylines[2]), c("lines", "null", "lines")),
            heights = unit(c(1, 1), c("lines", "null"))),
          width = unit(1, "npc"),
Re: [R] aligned memory allocation in C
On Wed, 13 Aug 2008, Christophe Dutang1 wrote:

> Hi,
>
> I'm currently porting the SF Mersenne Twister algorithm of Matsumoto and
> Saito to R. To get the full power of their code, I want to use their
> function fill_array32, which needs aligned memory. That is to say, I need
> to use the C function memalign on Windows, posix_memalign on Linux and
> classic malloc on Mac OS.
>
> In 'Writing R Extensions', they recommend using the R_alloc function to
> allocate memory in C. Does R_alloc return a pointer to aligned memory? If
> not, how can I do this? Probably not, because R crashes when I
> successively call R_alloc and fill_array32 (cf. below) on my MacBook with
> R 2.7.1.

R_alloc's alignment will be appropriate for holding any data type. It will be offset from a value returned by malloc by a multiple of 8 bytes. My recollection, which may be wrong, is that on both Intel and PPC unaligned access to all basic data types is permitted but may be inefficient (in particular on Intel), so the reason for your crash is probably elsewhere.

Best,

luke

> Thanks in advance
>
> Kind regards
>
> Christophe
>
> PS: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/howto-compile.html
> provides an example of memalign.
>
> PPS: Mac OS crash report
>
> Thread 0 Crashed:
>  0  libSystem.B.dylib  0x9341bb9e  __kill + 10
>  1  libSystem.B.dylib  0x93492ec2  raise + 26
>  2  libSystem.B.dylib  0x934a247f  abort + 73
>  3  randtoolbox.so     0x15e65f1d  0x15e5d000 + 36637
>  4  randtoolbox.so     0x15e614ef  fill_array32 + 4038
>  5  randtoolbox.so     0x15e6513d  SFmersennetwister + 335
>  6  randtoolbox.so     0x15e652c6  doSFMersenneTwister + 255
>  7  libR.dylib         0x00367a52  do_dotcall + 1394
>  8  libR.dylib         0x0038b5a2  Rf_eval + 1754
>  9  libR.dylib         0x0038f9a2  do_set + 592
> 10  libR.dylib         0x0038b366  Rf_eval + 1182
> 11  libR.dylib         0x0038b366  Rf_eval + 1182
> 12  libR.dylib         0x0038c140  do_begin + 58
> 13  libR.dylib         0x0038b366  Rf_eval + 1182
> 14  libR.dylib         0x0038b366  Rf_eval + 1182
> 15  libR.dylib         0x0038c140  do_begin + 58
> 16  libR.dylib         0x0038b366  Rf_eval + 1182
> 17  libR.dylib         0x0038d9a6  Rf_applyClosure + 663
> 18  libR.dylib         0x0038b25d  Rf_eval + 917
> 19  org.R-project.R    0x000189c3  run_REngineRmainloop + 569 (Rinit.m:442)
> 20  org.R-project.R    0x0001142a  -[REngine runREPL] + 260 (REngine.m:181)
> 21  org.R-project.R    0x2e91      main + 795 (main.m:126)
> 22  org.R-project.R    0x2b5a      _start + 216
> 23  org.R-project.R    0x2a81      start + 41

--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
[R] R freezes on Xeon multiprocessor and Win 64-bit
Hi everybody, I am running stepAIC on a glm.nb object, using a database of more than 10,000 records and about 50 independent variables, on a 64-bit workstation with two Intel Xeon 3.20 GHz processors (with the HyperThreading option disabled in the BIOS), using 4 of the 7 GB of available RAM, under Windows XP Professional 64-bit. The system always freezes and it is no longer possible to use the keyboard or mouse. By contrast, the same analysis always works on older 32-bit computers (Pentium IV with 1 GB RAM) under Windows XP Professional 32-bit, although it takes about 10 hours. Does anyone have a suggestion for solving this problem? Thank you in advance, Valerio Orioli

Valerio Orioli Biodiversity Conservation Unit - http://www.disat.unimib.it/Biodiversity/index.htm Department of Environmental and Landscape Sciences University of Milano-Bicocca Piazza della Scienza 1 I-20126 - Milano ITALY Phone: +39.02.6448.2918 E-mail: [EMAIL PROTECTED]
[R] merging data sets to match data to date
Hi everyone, I want to extract data from a data set according to dates specified in a vector. I have created a blank matrix whose row names are the dates that I want to extract from the full data set. I then performed a merge to try to output the rows corresponding to common dates to a results matrix, but the operation did not fill the results matrix. Could anyone offer any advice to assist with this operation? Thanks, rcoder -- View this message in context: http://www.nabble.com/merging-data-sets-to-match-data-to-date-tp18962197p18962197.html Sent from the R help mailing list archive at Nabble.com.
[R] Comination of two barcharts and one xyplot
Hi Rhelpers, I would like to have some help with a plot which is beyond my capabilities. The plot that I am seeking involves an overlay of two different barcharts and one xyplot. The code that I have used is the following:

#save(df1,file="M:\\KBR\\df1.RData")
load(file="M:\\KBR\\df1.RData")
# df1$Year.ord created to obtain the right order i.e. 2015M < 2015K
Year.ord<-ordered(Year,levels=c('2003','2005','2007','2009','2011','2013','2015M','2015K'))
# Use reshape package to melt the data frame
library(reshape)
df1m<-melt(df1,id=c("Year","Year.ord"))
library(lattice)
attach(df1m)
barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",ylab="No. of Tests *1000",col="blue")

This plot works just fine. But I want to go beyond this. My first data frame (df1) is:

Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K

My second data frame (L1) is:

Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
1,2009,20,1,1,30,2,10,10,35
2,2011,24,1,1.5,35,3,12,13,38
3,2013,30,1,2,40,4,14,16,45

What I want, in each panel of the lattice barchart, is to plot histograms of the relevant variable (KI, G48 etc.) in one colour for the years 2003 to 2007, and in another colour for the other years. On top of this, I want to have a line plot in each panel with the limits for different years given in the second data frame L1 (as bold lines). I would like to have information on the following points:

1. How can I get a combination of these plots in every panel (two histograms and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.

Will appreciate any help that I can get. Thanking You, Ravi
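For a single variable, the overlay asked about above might be sketched as follows. Note that this uses base graphics rather than lattice, so it is a swapped-in technique and not an answer about lattice panel functions. The KI and KIL values are copied from df1 and L1 in the message; the colours and the names bar_col, mids and lim_x are arbitrary choices for illustration.

```r
# Data for one panel (KI), copied from df1 and L1 in the message above.
years   <- c("2003", "2005", "2007", "2009", "2011", "2013", "2015M", "2015K")
KI      <- c(15.53, 15.64, 16.18, 17.09, 22.39, 33.83, 44.91, 52.22)
lim_yrs <- c("2009", "2011", "2013")
KIL     <- c(20, 24, 30)

# One colour for 2003-2007, another for the remaining years.
early   <- years %in% c("2003", "2005", "2007")
bar_col <- ifelse(early, "grey60", "blue")

# barplot() returns the x positions of the bar midpoints, so the limit
# line can be drawn over the bars of the matching years.
mids <- barplot(KI, names.arg = years, col = bar_col, las = 2,
                ylab = "No. of Tests *1000", main = "KI")
lim_x <- mids[match(lim_yrs, years)]
lines(lim_x, KIL, lwd = 3, col = "red")

# A combined legend: two filled boxes for the bars, one line for the limits.
legend("topleft", legend = c("2003-2007", "2009 onwards", "Limit (L1)"),
       fill = c("grey60", "blue", NA), border = c("black", "black", NA),
       lty = c(NA, NA, 1), lwd = c(NA, NA, 3), col = c(NA, NA, "red"),
       bty = "n")
```

Repeating this for each variable inside par(mfrow = ...) would give one panel per variable, though without lattice's shared layout.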
Re: [R] merging data sets to match data to date
rcoder wrote: Hi everyone, I want to extract data from a data set according to dates specified in a vector. I have created a blank matrix with row names (dates) that I want to extract from the full data set. I have then performed a merge to try to o/p rows corresponding to common dates to a results matrix, but the operation did not fill the results matrix. Could anyone offer any advice to assist with this operation?

Yes, follow the posting guide and provide commented, minimal, self-contained, reproducible code of your problem.
Re: [R] merging data sets to match data to date
Try this:

x <- data.frame(Dates = seq(as.Date('2008-01-01'), as.Date('2008-01-31'), by = 'days'), Values = sample(31))
subset(x, Dates %in% as.Date(c('2008-01-05', '2008-01-20')))

On 8/13/08, rcoder [EMAIL PROTECTED] wrote: Hi everyone, I want to extract data from a data set according to dates specified in a vector. I have created a blank matrix with row names (dates) that I want to extract from the full data set. I have then performed a merge to try to o/p rows corresponding to common dates to a results matrix, but the operation did not fill the results matrix. Could anyone offer any advice to assist with this operation? Thanks, rcoder -- View this message in context: http://www.nabble.com/merging-data-sets-to-match-data-to-date-tp18962197p18962197.html Sent from the R help mailing list archive at Nabble.com.

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
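The merge-based approach described in the original question can also be made to work; a minimal sketch follows, assuming the wanted dates are held in a data frame column of class Date rather than in matrix row names. The names full and wanted and the values are made up for illustration.

```r
# Full data set: one value per day in January 2008 (made-up values).
full <- data.frame(Dates  = seq(as.Date("2008-01-01"), as.Date("2008-01-31"),
                                by = "days"),
                   Values = 1:31)

# Dates to extract, held in a data frame so merge() can match on the column.
wanted <- data.frame(Dates = as.Date(c("2008-01-05", "2008-01-20")))

# merge() keeps only the rows whose Dates occur in both data frames.
res <- merge(wanted, full, by = "Dates")
print(res)
```

A merge that returns nothing is often a sign that the two key columns have different types (for example character row names on one side and Date values on the other), but without the original code that is only a guess.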
Re: [R] merging data sets to match data to date
See zoo and merge.zoo, and read the help files. Use chron to generate a list of dates that corresponds to the ones that you want, and then merge away. This should get you started. Stephen Sefick

On Wed, Aug 13, 2008 at 9:08 AM, Erik Iverson [EMAIL PROTECTED] wrote: rcoder wrote: Hi everyone, I want to extract data from a data set according to dates specified in a vector. I have created a blank matrix with row names (dates) that I want to extract from the full data set. I have then performed a merge to try to o/p rows corresponding to common dates to a results matrix, but the operation did not fill the results matrix. Could anyone offer any advice to assist with this operation? Yes, follow the posting guide and provide commented, minimal, self-contained, reproducible code of your problem.

-- Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
Re: [R] Comination of two barcharts and one xyplot
not reproducible

On Wed, Aug 13, 2008 at 9:07 AM, ravi [EMAIL PROTECTED] wrote: Hi Rhelpers, I would like to have some help with a plot which is beyond my capabilities. The plot that I am seeking involves an overlay of two different barcharts and one xyplot. The code that I have used is the following:

#save(df1,file="M:\\KBR\\df1.RData")
load(file="M:\\KBR\\df1.RData")
# df1$Year.ord created to obtain the right order i.e. 2015M < 2015K
Year.ord<-ordered(Year,levels=c('2003','2005','2007','2009','2011','2013','2015M','2015K'))
# Use reshape package to melt the data frame
library(reshape)
df1m<-melt(df1,id=c("Year","Year.ord"))
library(lattice)
attach(df1m)
barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",ylab="No. of Tests *1000",col="blue")

This plot works just fine. But I want to go beyond this. My first data frame (df1) is:

Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K

My second data frame (L1) is:

Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
1,2009,20,1,1,30,2,10,10,35
2,2011,24,1,1.5,35,3,12,13,38
3,2013,30,1,2,40,4,14,16,45

What I want, in each panel of the lattice barchart, is to plot histograms of the relevant variable (KI, G48 etc.) in one colour for the years 2003 to 2007, and in another colour for the other years. On top of this, I want to have a line plot in each panel with the limits for different years given in the second data frame L1 (as bold lines). I would like to have information on the following points:

1. How can I get a combination of these plots in every panel (two histograms and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.

Will appreciate any help that I can get. Thanking You, Ravi

-- Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
Re: [R] rgl/compiz problem
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes: I have just encountered the problem with rgl where plot3d figures don't interact with the mouse. My plots zoom in and out with the mouse wheel but the mouse buttons do nothing. I can't rotate the plot. This has been mentioned and discussed here and in other lists before, and the solution is to turn off Ubuntu's fancy graphics. Back in March, Ben Bolker said: "unfortunately rgl and compiz/etc. both try to use the same OpenGL interface, so you can't use both at the same time." This has echoes of when TCP/IP was in its infancy back in the days of DOS, and only one program could access the network interface at a time (until TCP/IP software got its act together). Is OpenGL really in the same position now? Or is Compiz being greedy in some sense? Surely two OpenGL applications can run at the same time? Or is it because rgl is running 'within' another OpenGL window already, so there's some nesting problem going on? Google Earth works fine, and I think that uses OpenGL. Anyone had any ideas since March? I'm on Ubuntu 8.04 and R 2.7.1 Barry

Unfortunately, an apparently knowledgeable compiz person said: "This is a limitation of DRI, DRI2 should fix this, and should hopefully be in most drivers by Xorg 7.5(maybe 7.6), nvidia has there on implementation, that's why it works on it" http://forum.compiz-fusion.org/showthread.php?t=8462

And poking around, http://www.phoronix.com/scan.php?page=news_item&px=NjYzNw "sometime in 2009" is the closest I could get to finding an expected date when this would be available ... Ben Bolker
Re: [R] mob(party) formula question
Many thanks for your answer and the code that you offered me. I get this error message after calling mob (look at my given example). I guess it has something to do with the missing values? The iris example also works fine for me. Sorry that I am not enough into statistics to really understand the following:

Achim Zeileis wrote: ... For the variables for which a linear specification makes sense (at least in each component) then you should include them for modeling. And those variables for which it is not clear a priori what a useful parametric specification would be should be used as partitioning variables. ...

What do you mean by linear specification? I would be very happy if you could explain. Thanks again B.

Achim Zeileis wrote: On Wed, 13 Aug 2008, Birgitle wrote: I try to use mob() with my data.frame ('data.frame': 288 obs. of 81 variables; factors, numerics and ordered factors). My response is a binary variable and I should use a logistic regression for modelling (family=binomial). I read in the MOB vignette that I could use a formula like this if I would like to have only partitioning variables apart from the response.

Test.mob<-mob(Resp~1|Var1+Var2+..., data=dataframe, model=glinearModel, family=binomial())

This works for me. Considering an example that is easily reproducible: classifying just two (out of three) species in the iris data.

iris2 <- iris[-(1:50),]
iris2$Species <- factor(iris2$Species)
mb <- mob(Species ~ 1 | Petal.Length + Petal.Width + Sepal.Length + Sepal.Width, data = iris2, model = glinearModel, family = binomial())

- The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/mob%28party%29-formula-question-tp18959898p18962866.html Sent from the R help mailing list archive at Nabble.com.
[R] Tiny help for tiny function
I have just started to write tiny functions, so I apologise in advance if I am asking a stupid question. I wrote a tiny function to give me back, from the original matrix, a matrix showing only the values smaller than -0.8 and bigger than 0.8.

y<-c(0.1,0.2,0.3,-0.8,-0.4,0.9)
x<-c(0.5,0.3,0.9,-0.9,-0.7,0.3)
XY<-rbind(x,y)
extract.values<-function (x) { if(x>=0.8|x<=-0.8)x else("low corr.") }

This works:

Test<-sapply(XY,extract.values,simplify=FALSE)

But now I am trying to solve the problem of having NA in the matrix. I tried it like this:

extract.values<-function (x) { if(x>=0.8|x<=-0.8|x==NA)x else("low corr.") }

This does not work:

x<-c(0.5,0.3,0.9,-0.9,-0.7,0.3)
y<-c(0.1,0.2,NA,-0.8,-0.4,0.9)
XY<-rbind(x,y)
Testi<-sapply(XY,extract.values,simplify=FALSE)

Fehler in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : Fehlender Wert, wo TRUE/FALSE nötig ist
Error in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : Missing value, where TRUE/FALSE is needed

How can I do this right? Thanks for help B.

- The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/Tiny-help-for-tiny-function-tp18963310p18963310.html Sent from the R help mailing list archive at Nabble.com.
[R] The standard deviation of measurement 1 with respect to measurement 2
Hi, I have two (different types of) measurements, say X and Y, resulting from the same set of experiments. So X and Y are paired: (x_1, y_1), (x_2, y_2), ... I am trying to calculate the standard deviation of Y with respect to X. In other words, in terms of the scatter plot of X and Y, I would like to divide it into bins along the X-axis and, for each bin, calculate the standard deviation of the Y values in that bin. (Though I am not totally sure, this reminds me of the conditional expectation of Y given X; maybe it is called the conditional deviation?) Is there a built-in procedure in R for calculating the above? Otherwise, what would be the easiest way to achieve it? (Factors maybe?) Thankful for the help, Firas.
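One way the binning described above might be done with base R is to turn X into a factor with cut() and apply sd() per bin with tapply(). This is a sketch on simulated data; the bin edges (five equal-width bins on 0 to 10) and the variable names are assumptions for illustration only.

```r
set.seed(1)                      # made-up paired data, for illustration only
x <- runif(200, 0, 10)
y <- rnorm(200, mean = x, sd = 1 + x / 5)

# cut() assigns each x to one of five equal-width bins (a factor);
# tapply() then applies sd() to the y values that fall in each bin.
bins <- cut(x, breaks = seq(0, 10, by = 2), include.lowest = TRUE)
sd_by_bin <- tapply(y, bins, sd)
print(sd_by_bin)
```

The choice of bin width matters: bins with very few points give unstable estimates, and a bin with a single point gives NA.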
Re: [R] Tiny help for tiny function
You can do this:

ifelse(XY >= 0.8 | XY <= -0.8 | is.na(XY), XY, "low corr")

On 8/13/08, Birgitle [EMAIL PROTECTED] wrote: I have just started to write tiny functions, so I apologise in advance if I am asking a stupid question. I wrote a tiny function to give me back, from the original matrix, a matrix showing only the values smaller than -0.8 and bigger than 0.8.

y<-c(0.1,0.2,0.3,-0.8,-0.4,0.9)
x<-c(0.5,0.3,0.9,-0.9,-0.7,0.3)
XY<-rbind(x,y)
extract.values<-function (x) { if(x>=0.8|x<=-0.8)x else("low corr.") }

This works:

Test<-sapply(XY,extract.values,simplify=FALSE)

But now I am trying to solve the problem of having NA in the matrix. I tried it like this:

extract.values<-function (x) { if(x>=0.8|x<=-0.8|x==NA)x else("low corr.") }

This does not work:

x<-c(0.5,0.3,0.9,-0.9,-0.7,0.3)
y<-c(0.1,0.2,NA,-0.8,-0.4,0.9)
XY<-rbind(x,y)
Testi<-sapply(XY,extract.values,simplify=FALSE)

Fehler in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : Fehlender Wert, wo TRUE/FALSE nötig ist
Error in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : Missing value, where TRUE/FALSE is needed

How can I do this right? Thanks for help B.

- The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/Tiny-help-for-tiny-function-tp18963310p18963310.html Sent from the R help mailing list archive at Nabble.com.

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
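Applied to the data from the question, the ifelse() suggestion can be run as follows. The sketch only restates the call above; the name res is introduced for illustration.

```r
x <- c(0.5, 0.3, 0.9, -0.9, -0.7, 0.3)
y <- c(0.1, 0.2, NA, -0.8, -0.4, 0.9)
XY <- rbind(x, y)

# ifelse() is vectorised, so it handles the whole matrix at once.
# For the NA cell the two comparisons give NA, but is.na() gives TRUE,
# and NA | TRUE is TRUE, so the condition is never missing.
res <- ifelse(XY >= 0.8 | XY <= -0.8 | is.na(XY), XY, "low corr")
print(res)
```

Because strings and numbers end up in the same matrix, the result is a character matrix; using NA instead of "low corr" as the third argument would keep it numeric.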
Re: [R] Tiny help for tiny function
Many thanks. Much easier than my solution B.

Birgitle wrote: I have just started to write tiny functions, so I apologise in advance if I am asking a stupid question. I wrote a tiny function to give me back, from the original matrix, a matrix showing only the values smaller than -0.8 and bigger than 0.8.

y<-c(0.1,0.2,0.3,-0.8,-0.4,0.9)
x<-c(0.5,0.3,0.9,-0.9,-0.7,0.3)
XY<-rbind(x,y)
extract.values<-function (x) { if(x>=0.8|x<=-0.8)x else("low corr.") }

This works:

Test<-sapply(XY,extract.values,simplify=FALSE)

But now I am trying to solve the problem of having NA in the matrix. I tried it like this:

extract.values<-function (x) { if(x>=0.8|x<=-0.8|x==NA)x else("low corr.") }

This does not work:

x<-c(0.5,0.3,0.9,-0.9,-0.7,0.3)
y<-c(0.1,0.2,NA,-0.8,-0.4,0.9)
XY<-rbind(x,y)
Testi<-sapply(XY,extract.values,simplify=FALSE)

Fehler in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : Fehlender Wert, wo TRUE/FALSE nötig ist
Error in if (x >= 0.8 | x <= -0.8 | x == NA) x else ("low corr.") : Missing value, where TRUE/FALSE is needed

How can I do this right? Thanks for help B.

- The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/Tiny-help-for-tiny-function-tp18963310p18963906.html Sent from the R help mailing list archive at Nabble.com.
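For completeness, the element-by-element function from the question can also be repaired. The error arises because any comparison with NA, including x == NA, yields NA, and if() needs TRUE or FALSE. A minimal sketch, with extract_value as a hypothetical replacement name:

```r
x <- NA
x == NA      # NA, not TRUE: comparing anything with NA gives NA
is.na(x)     # TRUE: the test to use instead

# Scalar helper that checks for NA first; || stops at the first TRUE,
# so the comparisons are never reached for a missing value.
extract_value <- function(x) {
  if (is.na(x) || x >= 0.8 || x <= -0.8) x else "low corr."
}

vals <- c(0.5, NA, 0.9, -0.9, 0.3)
out <- sapply(vals, extract_value, simplify = FALSE)
str(out)
```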
Re: [R] mob(party) formula question
On Wed, 13 Aug 2008, Birgitle wrote: Many thanks for your answer and the code that you offered me. I get this error message after calling mob (look at my given example). I guess it has something to do with the missings? Yes, you have to handle NAs in advance if you want to fit that model. We'll try to fix that in future versions. The iris example works also fine for me. Sorry that I am not enough into statistics to really understand the following: Achim Zeileis wrote: . For the variables for which a linear specification makes sense (at least in each component) then you should include them for modeling. And those variables for which it is not clear a priori what a useful parametric specification would be should be used as partitioning variables. ... What do you mean with linear specification? I would be very happy if you could explain. Well, in each node you fit a logistic regression model. This is a (generalized) linear model, hence the variables included have a linear influence (on the link scale) within each node. The partitioning variables on the other hand capture step-shaped influences (if they are selected by the algorithm). See the references on ?mob for further details. Best, Z __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
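The distinction between a linear influence within each node and a step-shaped influence of a partitioning variable can be illustrated by hand with base R's glm(). This is only a sketch on simulated data: the split point 0.5 is fixed in advance here, whereas mob() would search for it, and none of the names below come from the party package.

```r
set.seed(42)
n <- 400
x <- rnorm(n)     # modelling variable: linear effect on the logit scale
z <- runif(n)     # partitioning variable: the model changes in a step at 0.5

# True linear predictor: a different intercept and slope on each side of z = 0.5.
eta <- ifelse(z <= 0.5, -1 + 2 * x, 1 - 1 * x)
d <- data.frame(y = rbinom(n, 1, plogis(eta)), x = x, z = z)

# One logistic regression per "node".
fit_left  <- glm(y ~ x, family = binomial(), data = subset(d, z <= 0.5))
fit_right <- glm(y ~ x, family = binomial(), data = subset(d, z >  0.5))

coefs <- rbind(left = coef(fit_left), right = coef(fit_right))
print(coefs)
```

Within each node the effect of x is a straight line on the link scale; the effect of z shows up only as a jump from one set of coefficients to the other.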
Re: [R] summary.manova rank deficiency error + data
Thanks for the reply. The SAS output is attached, but it seems to me that it doesn't correspond to the within-row contrasts as you suggested. By the way, yes, the data are highly correlated; in fact each row corresponds to the first part of a signal vector. Thanks anyway PM

The GLM Procedure
Multivariate Analysis of Variance

E = Error SSCP Matrix

      y1            y2            y3            y4            y5
y1    0.0353518799  0.035256904   0.0351327804  0.0349749601  0.0347868018
y2    0.035256904   0.0351627227  0.0350395053  0.0348827098  0.0346956744
y3    0.0351327804  0.0350395053  0.0349173343  0.0347617352  0.0345760232
y4    0.0349749601  0.0348827098  0.0347617352  0.0346075203  0.0344233531
y5    0.0347868018  0.0346956744  0.0345760232  0.0344233531  0.0342409225

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|

DF = 28   y1        y2      y3      y4      y5
y1        1.00      0.92    0.67    0.21    0.999852
                    <.0001  <.0001  <.0001  <.0001
y2        0.92      1.00    0.91    0.63    0.11
          <.0001            <.0001  <.0001  <.0001
y3        0.67      0.91    1.00    0.90    0.58
          <.0001    <.0001          <.0001  <.0001
y4        0.21      0.63    0.90    1.00    0.89
          <.0001    <.0001  <.0001          <.0001
y5        0.999852  0.11    0.58    0.89    1.00
          <.0001    <.0001  <.0001  <.0001

H = Type III SSCP Matrix for group

      y1            y2            y3            y4            y5
y1    0.0023822408  0.002365848   0.0023471328  0.0023261249  0.0023030993
y2    0.002365848   0.0023495679  0.0023309816  0.0023101183  0.0022872511
y3    0.0023471328  0.0023309816  0.0023125426  0.0022918453  0.0022691608
y4    0.0023261249  0.0023101183  0.0022918453  0.0022713359  0.0022488593
y5    0.0023030993  0.0022872511  0.0022691608  0.0022488593  0.0022266141

Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for group
E = Error SSCP Matrix

Characteristic             Characteristic Vector V'EV=1
Root         Percent   y1           y2           y3          y4          y5
0.41840103   71.72     -7542.628    17131.814    5347.394    -31627.317  16700.100
0.16496011   28.28     -4180.854    -4413.446    32096.035   -35545.204  12040.697
0.0001       0.00      -41004.875   107291.004   -95905.664  32641.189   -3028.470
0.           0.00      -416.226     -111.206     410.721     295.193     -171.953
0.           0.00      -14678.651   5787.997     54718.250   -69055.249  23218.580

MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall group Effect
H = Type III SSCP Matrix for group
E = Error SSCP Matrix
S=2 M=1 N=11

Statistic                Value        F Value  Num DF  Den DF  Pr > F
Wilks' Lambda            0.60518744   1.37     10      48      0.2227
Pillai's Trace           0.43658228   1.40     10      50      0.2095
Hotelling-Lawley Trace   0.58336114   1.37     10      33.362  0.2385
Roy's Greatest Root      0.41840103   2.09     5       25      0.1000

On Wed, Aug 13, 2008 at 4:34 AM, Peter Dalgaard [EMAIL PROTECTED] wrote: Pedro Mardones wrote: Dear R-users; Previously I posted a question about the problem of rank deficiency in summary.manova. As somebody suggested, I'm attaching a small part of the data set.

#***
test <- structure(.Data = list(structure(.Data = c(rep(1,3),rep(2,18),rep(3,10)), levels = c("1", "2", "3"), class = "factor")
,c(0.181829,0.090159,0.115824,0.112804,0.134650,0.249136,0.163144,0.122012,0.157554,0.126283,
0.105344,0.125125,0.126232,0.084317,0.092836,0.108546,0.159165,0.121620,0.142326,0.122770,
0.117480,0.153762,0.156551,0.185058,0.161651,0.182331,0.139531,0.188101,0.103196,0.116877,0.113733)
,c(0.181445,0.090254,0.115840,0.112863,0.134610,0.249003,0.163116,0.122135,0.157206,0.126129,
Re: [R] mob(party) formula question
Thanks again. Unfortunately I always have this missing values problem. But the missing values also have a meaning, and it's impossible to code them differently or impute them. Also thanks for the explanation. Now I understand. B.

Achim Zeileis wrote: On Wed, 13 Aug 2008, Birgitle wrote: Many thanks for your answer and the code that you offered me. I get this error message after calling mob (look at my given example). I guess it has something to do with the missings? Yes, you have to handle NAs in advance if you want to fit that model. We'll try to fix that in future versions. The iris example works also fine for me. Sorry that I am not enough into statistics to really understand the following: Achim Zeileis wrote: ... For the variables for which a linear specification makes sense (at least in each component) then you should include them for modeling. And those variables for which it is not clear a priori what a useful parametric specification would be should be used as partitioning variables. ... What do you mean with linear specification? I would be very happy if you could explain. Well, in each node you fit a logistic regression model. This is a (generalized) linear model, hence the variables included have a linear influence (on the link scale) within each node. The partitioning variables on the other hand capture step-shaped influences (if they are selected by the algorithm). See the references on ?mob for further details. Best, Z

- The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/mob%28party%29-formula-question-tp18959898p18964864.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] mob(party) formula question
On Wed, 13 Aug 2008, Birgitle wrote: Thanks again. Unfortunately I have always this missing values problem. But the missings have also a meaning and its impossible to code it differently or impute.

That's ok. Just to clarify: NAs are not allowed in the response or the modeling variables. In principle, it would be possible to have NAs in the partitioning variables and try to handle them with surrogate splits. Surrogates are not yet implemented in mob(), but we are currently working on infrastructure for this. So the only work-around easily available at the moment is to call na.omit() (on the relevant variables only). Best, Z

Also thanks for the explanation. Now I understand. B.

Achim Zeileis wrote: On Wed, 13 Aug 2008, Birgitle wrote: Many thanks for your answer and the code that you offered me. I get this error message after calling mob (look at my given example). I guess it has something to do with the missings? Yes, you have to handle NAs in advance if you want to fit that model. We'll try to fix that in future versions. The iris example works also fine for me. Sorry that I am not enough into statistics to really understand the following: Achim Zeileis wrote: ... For the variables for which a linear specification makes sense (at least in each component) then you should include them for modeling. And those variables for which it is not clear a priori what a useful parametric specification would be should be used as partitioning variables. ... What do you mean with linear specification? I would be very happy if you could explain. Well, in each node you fit a logistic regression model. This is a (generalized) linear model, hence the variables included have a linear influence (on the link scale) within each node. The partitioning variables on the other hand capture step-shaped influences (if they are selected by the algorithm). See the references on ?mob for further details. Best, Z

- The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/mob%28party%29-formula-question-tp18959898p18964864.html Sent from the R help mailing list archive at Nabble.com.
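The work-around of dropping incomplete cases "on the relevant variables only" might look like this in base R. The data frame and its column names (Resp, Var1, Other) are hypothetical and stand in for the real data.

```r
# Hypothetical data: a response, one partitioning variable, and an
# unrelated column that also contains NAs.
d <- data.frame(Resp  = factor(c("a", "b", NA, "a", "b")),
                Var1  = c(1.2, NA, 0.7, 3.1, 2.2),
                Other = c(NA, NA, 1, 2, NA))

# Keep the rows that are complete in the variables entering the model,
# so that NAs in 'Other' do not cost any observations.
vars <- c("Resp", "Var1")
d_complete <- d[complete.cases(d[, vars]), ]

# na.omit() on the selected columns keeps the same rows;
# na.omit(d) on the whole data frame would discard more.
d_model <- na.omit(d[, vars])
print(nrow(d_complete))
```

Restricting the check to the model variables matters with many columns: with 81 variables, calling na.omit() on the whole data frame could remove most of the rows.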
Re: [R] Comination of two barcharts and one xyplot
Hi Rhelpers, Thanks a lot, Stephen, for showing me the way to get a data frame into a pasteable format with the dput command. My code is given below with the new correction. This should work, as Stephen says, right off the bat :-)

## df1 is the first data frame
df1 <- structure(list(Year = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 8L, 7L), .Label = c("2003", "2005", "2007", "2009", "2011", "2013", "2015K", "2015M"), class = "factor"),
KI = c(15.53, 15.64, 16.18, 17.09, 22.39, 33.83, 44.91, 52.22),
G48 = c(0.3, 0.29, 0.49, 0.67, 0.93, 1.29, 1.83, 2.14),
AvCell = c(0.24, 0.33, 0.59, 0.91, 1.24, 1.87, 2.71, 3.15),
HB = c(37.45, 34.64, 30.32, 29.47, 38.03, 58.37, 75.54, 87.71),
Htens = c(0.76, 1.12, 1.63, 2.27, 3.11, 4.43, 6.28, 7.34),
Impact = c(1.16, 1.78, 4.23, 6.76, 9.17, 14.06, 20.57, 23.88),
Struct = c(3.02, 4.2, 6.67, 9.68, 13.18, 19.41, 27.51, 31.98),
Tens = c(34.05, 32.88, 30.06, 29.25, 37.84, 57.6, 74.5, 86.57),
Year.ord = structure(1:8, .Label = c("2003", "2005", "2007", "2009", "2011", "2013", "2015M", "2015K"), class = c("ordered", "factor"))),
.Names = c("Year", "KI", "G48", "AvCell", "HB", "Htens", "Impact", "Struct", "Tens", "Year.ord"), row.names = c(NA, -8L), class = "data.frame")

## L1 is the second data frame
L1 <- structure(list(Year = c(2009L, 2011L, 2013L), KIL = c(20, 24, 30), G48L = c(1, 1, 1), AvCellL = c(1, 1.5, 2), HBL = c(30, 35, 40), HtensL = c(2, 3, 4), ImpactL = c(10, 12, 14), StructL = c(10, 13, 16), TensL = c(35, 38, 45)),
.Names = c("Year", "KIL", "G48L", "AvCellL", "HBL", "HtensL", "ImpactL", "StructL", "TensL"), class = "data.frame", row.names = c(NA, -3L))

## Use the reshape package to melt the data frame
library(reshape)
df1m <- melt(df1, id = c("Year", "Year.ord"))

## Use the lattice package to plot the barchart
library(lattice)
attach(df1m)
barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",ylab="No. of Tests *1000",col="blue")

This plot works just fine.
But I want to go beyond this. What I want, in each panel of the lattice barchart, is to plot histograms of the relevant variable (KI, G48, etc.) in one colour for the years 2003 to 2007, and in another colour for the other years. On top of this, I want a line plot in each panel showing the limits for the different years given in the second data frame L1 (as bold lines). I would like information on the following points:

1. How can I get a combination of these plots in every panel (two histograms and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.

I will appreciate any help that I can get.

Thanking you,
Ravi

- Original Message
From: stephen sefick [EMAIL PROTECTED]
To: ravi [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Wednesday, 13 August, 2008 3:14:54 PM
Subject: Re: [R] Comination of two barcharts and one xyplot

not reproducible

On Wed, Aug 13, 2008 at 9:07 AM, ravi [EMAIL PROTECTED] wrote:

Hi Rhelpers,

I would like to have some help with a plot which is beyond my capabilities. The plot that I am seeking involves an overlay of two different barcharts and one xyplot. The code that I have used is the following:

#save(df1, file = "M:\\KBR\\df1.RData")
load(file = "M:\\KBR\\df1.RData")
# df1$Year.ord created to obtain the right order, i.e. 2015M before 2015K
Year.ord <- ordered(Year, levels = c('2003','2005','2007','2009','2011','2013','2015M','2015K'))
# Use reshape package to melt the data frame
library(reshape)
df1m <- melt(df1, id = c("Year", "Year.ord"))
library(lattice)
attach(df1m)
barchart(value ~ Year.ord | variable, scales = list(y = "free", x = list(rot = 90)),
         xlab = "Year", ylab = "No. of Tests *1000", col = "blue")

This plot works just fine. But I want to go beyond this.

My first data frame (df1) is:

Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K

My second data frame (L1) is:

Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
1,2009,20,1,1,30,2,10,10,35
2,2011,24,1,1.5,35,3,12,13,38
3,2013,30,1,2,40,4,14,16,45

What I want, in each panel of the lattice barchart, is to plot histograms of the relevant variable (KI, G48 etc) in one colour for the years 2003 to 2007, and in another colour for the other years. On top of this, I want to have a line plot in each panel with the limits for different years given in the second data frame L1 (as bold lines). I would like to have information on the following points:

1. How can I get a combination of these plots in every panel (two histograms and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend
Re: [R] ignoring zeros or converting to NA
Hi,

since many suggestions follow the form x[x==0] (or similar), I would like to ask whether this is really recommended. What I have learned (the hard way) is that one should not test for equality of floating point numbers (which is the default for R's numeric values, right?), since the binary representation of these (decimal) floating point numbers is not necessarily exact (the classic example being decimal 0.1). Is it okay in this case for the value zero, where all binary elements are zero? Or does R somehow recognize that it is an integer?

Just some questions out of curiosity.

Thank you,
Roland

rcoder wrote:

Hi everyone,

I have a matrix that has a combination of zeros and NAs. When I perform certain calculations on the matrix, the zeros generate Inf values. Is there a way to either convert the zeros in the matrix to NAs, or only perform the calculations if not zero (i.e. using something similar to an !all(is.na()) construct)?

Thanks,
rcoder
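As a side note on rcoder's original question, here is a minimal base-R sketch of turning zeros into NA in a matrix that already holds NAs. The matrix `m` is made up for illustration; wrapping the comparison in which() is one way to keep the existing NAs out of the subscript.

```r
# Hypothetical example matrix containing both zeros and NAs
m <- matrix(c(1, 0, NA, 4, 0, 6), nrow = 2)

# which() drops the NA results of the comparison, so only true zeros are hit
m[which(m == 0)] <- NA

# Division no longer yields Inf; the former zeros simply propagate as NA
r <- 1 / m
```

The same idea works on a data frame column; whether exact comparison with zero is safe depends on how the zeros were produced, which is the point raised in the message above.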
Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?
Dear Peter and Henrik,

Thanks for your replies - this helps speed things up a bit, but I thought there would be something much faster. What I mean is that I thought that a particular value of a level could be accessed instantly, similarly to a hash key. Since I've got about 6000 levels in that data frame, it means that making a list L of the form

L[[1]] = values of name 1
L[[2]] = values of name 2
L[[3]] = values of name 3
...

would take ~1 hour.

Best,
Emmanuel

2008/8/12 Henrik Bengtsson [EMAIL PROTECTED]:

To simplify:

n <- 2.7e6;
x <- factor(c(rep("A", n/2), rep("B", n/2)));

# Identify 'A':s
t1 <- system.time(res <- which(x == "A"));

# To compare a factor to a string, the factor is in practice
# coerced to a character vector.
t2 <- system.time(res <- which(as.character(x) == "A"));

# Interestingly enough, this seems to be faster (repeated many times)
# Don't know why.
print(t2/t1);
    user   system  elapsed
0.632653 1.600000 0.754717

# Avoid coercing the factor, but instead coerce the level compared to
t3 <- system.time(res <- which(x == match("A", levels(x))));

# ...but gives no speed up
print(t3/t1);
    user   system  elapsed
1.041667 1.000000 1.018182

# But coercing the factor to integers does
t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x))));
print(t4/t1);
     user    system   elapsed
0.4166667 0.0000000 0.3636364

So, the latter seems to be the fastest way to identify those elements.

My $.02

/Henrik

On Tue, Aug 12, 2008 at 7:31 PM, Peter Cowan [EMAIL PROTECTED] wrote:

Emmanuel,

On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy [EMAIL PROTECTED] wrote:

Dear All,

I have a large data frame (~2700000 lines and 14 columns), and I would like to extract the information in a particular way illustrated below.

Given a data frame df:

col1 = sample(c(0,1), 10, rep=T)
names = factor(c(rep("A",5), rep("B",5)))
df = data.frame(names, col1)
df
   names col1
1      A    1
2      A    0
3      A    1
4      A    0
5      A    1
6      B    0
7      B    0
8      B    1
9      B    0
10     B    0

I would like to transform it into the form:

index = c("A","B")
col1[[1]] = df$col1[which(df$name=="A")]
col1[[2]] = df$col1[which(df$name=="B")]

I'm not sure I fully understand your problem; your example would not run for me. You could get a small speedup by omitting which(), since you can also subset by a logical vector.

n <- 2700000
foo <- data.frame(
    one = sample(c(0,1), n, rep = T),
    two = factor(c(rep("A", n/2), rep("B", n/2)))
)
system.time(out <- which(foo$two == "A"))
   user  system elapsed
  0.566   0.146   0.761
system.time(out <- foo$two == "A")
   user  system elapsed
  0.429   0.075   0.588

You might also find use for unstack(), though I didn't see a speedup.

system.time(out <- unstack(foo))
   user  system elapsed
  1.068   0.697   2.004

HTH
Peter

My problem is that the command:

*** which(df$name=="A") ***

takes about 1 second because df is so big. I was thinking that a level could maybe be accessed instantly, but I am not sure how to do it. I would be very grateful for any advice that would allow me to speed this up.

Best wishes,
Emmanuel
[R] bmp header
Hi R users,

I have an XML file. The value of one of its nodes is a BMP image (raw format) encoded in base64, and I would like to read this image into R. I think I should do the following steps:

1. Decode it from base64 to binary.
2. Remove the header of the image file.
3. Build the matrix.

So I wonder if you know how to do the following using R functions:

1. Decode from base64 to binary. base64decode does not decode to binary. The binary file is an openable BMP file.
2. Remove the header of the BMP image and produce a matrix which has the color values.

My main goal is to produce the matrix of color values; if the steps above don't look plausible, what steps would you suggest?

Right now I produce the matrix using the following steps, but I wonder if I can avoid using ImageMagick and Python:

1. Decode from base64 to binary using a Python function. After decoding I have an openable image file.
2. Convert the BMP format to PNM using ImageMagick.
3. Read the PNM format using the pixmap library in R. The function read.pnm produces a pixmap object.
4. Produce matrices from the pixmap object.

Thanks for your help,
Rostam
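For steps 2 and 3 (skipping the header and building the colour matrices), a rough base-R sketch is given below. It assumes the base64 text has already been decoded into a raw vector (base R has no decoder of its own; a package such as base64enc is one option, which is an assumption on my part and not something tested against your file). It also assumes an uncompressed, 24-bit, bottom-up BMP with a standard 54-byte header layout. The helper `read_bmp24` and the tiny 2x2 test image are made up for illustration.

```r
# Helpers to write little-endian integers into a raw vector (test image only)
le32 <- function(v) writeBin(as.integer(v), raw(), size = 4, endian = "little")
le16 <- function(v) writeBin(as.integer(v), raw(), size = 2, endian = "little")

# A made-up 2x2 BMP: top row red, green; bottom row blue, white.
# Rows are stored bottom-up, pixels as B,G,R, each row padded to 4 bytes.
pixels <- as.raw(c(255, 0, 0,  255, 255, 255,  0, 0,    # bottom row + padding
                   0, 0, 255,  0, 255, 0,      0, 0))   # top row + padding
bmp <- c(as.raw(c(0x42, 0x4d)), le32(54 + 16), le32(0), le32(54),  # file header
         le32(40), le32(2), le32(2), le16(1), le16(24), le32(0),   # info header
         le32(16), le32(2835), le32(2835), le32(0), le32(0),
         pixels)

# Hypothetical parser: raw vector in, list of red/green/blue matrices out
read_bmp24 <- function(bytes) {
  u32 <- function(pos) readBin(bytes[pos:(pos + 3)], "integer", size = 4,
                               endian = "little")
  u16 <- function(pos) readBin(bytes[pos:(pos + 1)], "integer", size = 2,
                               signed = FALSE, endian = "little")
  stopifnot(rawToChar(bytes[1:2]) == "BM")
  offset <- u32(11)   # where the pixel data starts (0-based file offset)
  width  <- u32(19)
  height <- u32(23)
  # Only uncompressed 24-bit, bottom-up images are handled in this sketch
  stopifnot(u16(29) == 24, u32(31) == 0, height > 0)
  row_bytes <- ceiling(width * 3 / 4) * 4          # rows are padded to 4 bytes
  px <- as.integer(bytes[(offset + 1):(offset + row_bytes * height)])
  # One matrix row per image row; flip so that row 1 is the top of the image
  rows <- matrix(px, nrow = height, byrow = TRUE)[height:1, 1:(width * 3),
                                                  drop = FALSE]
  list(red   = rows[, seq(3, width * 3, by = 3), drop = FALSE],
       green = rows[, seq(2, width * 3, by = 3), drop = FALSE],
       blue  = rows[, seq(1, width * 3, by = 3), drop = FALSE])
}

img <- read_bmp24(bmp)
```

Each element of `img` is a height x width integer matrix with values 0 to 255. Palette-based, compressed, or top-down BMPs would need extra handling that is not covered here.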
Re: [R] rgl/compiz problem
2008/8/13 Ben Bolker [EMAIL PROTECTED]:

Barry Rowlingson b.rowlingson at lancaster.ac.uk writes:

I have just encountered the problem with rgl where plot3d figures don't interact with the mouse. My plots zoom in and out with the mouse wheel but the mouse buttons do nothing. I can't rotate the plot.

I just showed this problem to a colleague who also uses fancy wobbly windows on his Ubuntu box, and he had the same problem. But then with some random frustrated clicking his scatterplot moved! It rotated slightly! What? How did he do that? He didn't know!

So he tried mouse buttons in combination. Holding B1 and then B3 and moving the mouse resulted in a zoom operation. Holding first B3 and then B1 resulted in rotation functionality. B3 and then B2 resulted in the field-of-view change operation. These three operations were what should have happened with B3, B1 and B2 presses on their own. Seemingly the mouse presses didn't get through to rgl unless another mouse button was held down.

It's quite general. Hold Bx and then hold By and you get the functionality of By.

Back in my office, the same things worked for me too. So as long as I do that, everything is fine for me. Unfortunately my colleague has the problem of not having any window decorations and having the rgl window go invisible when moved... Oh well, he can't win them all!

Barry
Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?
I still don't understand what you are doing. Can you make a small example that shows what you have and what you want? Is ?split what you are after?
Re: [R] ignoring zeros or converting to NA
Integers (up to a fairly high limit) are represented exactly, as are fractions whose denominator is a power of two (again up to a fairly high limit), so x==0 is fine in that sense.

If x is computed by floating point operations you do have to worry whether these are exact, e.g., with x <- seq(-1, 1, length=7) it is not clear that the fourth element will be exactly zero.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
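The distinction between exactly representable values and computed ones can be illustrated with a short sketch. The tolerance `eps` is an arbitrary choice for illustration, not a recommendation from the thread.

```r
# Whole numbers stored as doubles compare exactly
whole <- c(0, 1, 2) == 0

# Decimal fractions generally do not: the classic example
classic <- (0.1 + 0.2) == 0.3                       # FALSE on IEEE doubles
classic_tol <- isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE, compares with tolerance

# A computed sequence: the middle element may or may not be exactly zero,
# so a tolerance-based test is the safer way to find it
x <- seq(-1, 1, length.out = 7)
eps <- sqrt(.Machine$double.eps)
near_zero <- abs(x) < eps
```

For a matrix whose zeros were read from a file or entered as literals, `x == 0` is fine; the tolerance version matters only when the zeros are the result of arithmetic.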
Re: [R] rgl/compiz problem
Interesting. I can confirm that this works for me too, although since I'm emulating B3 with a two-button mouse that obviously doesn't work ... also, the window behavior is extremely erratic (no decorations, window doesn't always come to front when it should, doesn't disappear immediately when closed, etc.)

Since my machine tends to lock up on suspend when the fancy graphics are on, I'll probably continue without them for the time being ... (I suppose I could write a script to toggle the graphics mode on suspend/wake, but ugh)

Ben
Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?
Sorry for being unclear, I thought the example above was clear enough.

I have a data frame of the form:

      name info
1  YAL001C    1
2  YAL001C    1
3  YAL001C    1
4  YAL001C    1
5  YAL001C    0
6  YAL001C    1
7  YAL001C    1
8  YAL001C    1
9  YAL001C    1
10 YAL001C    1
...    ...

with ~2700000 lines and ~6000 different names. It corresponds to yeast proteins plus some info, so there are about 6000 names like YAL001C.

I would like to transform this data frame into the following form:

1/ a list, where each protein corresponds to an index, and the info is the vector

L[[1]]
 [1] 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1
L[[2]]
 [1] 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0
etc.

2/ an index, which gives me the position of each protein in the list:

index
[1] YAL001C YAL002W YAL003W YAL005C YAL007C ...

I hope this will be clearer! I'll have a look right now at the split and hash.mat functions.

Thanks for your help,
Emmanuel
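The list-plus-index layout described in this message can be sketched with split() on a small made-up data frame. The protein names follow the pattern in the message, but the values are invented for illustration.

```r
# Small made-up stand-in for the large name/info data frame
df <- data.frame(
  name = c("YAL001C", "YAL001C", "YAL002W", "YAL001C", "YAL002W", "YAL003W"),
  info = c(1, 1, 0, 0, 1, 1),
  stringsAsFactors = TRUE
)

# One pass over the data: a list holding one vector of info values per name
L <- split(df$info, df$name)

# The list names serve as the index, in factor-level order
index <- names(L)

# Entries can be reached by position or directly by name
first_protein <- L[[1]]
by_name <- L[["YAL002W"]]
```

Because the result is a named list, the separate `index` vector is mostly a convenience; `L[["YAL002W"]]` gives the hash-like lookup asked about earlier in the thread.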
[R] Flag variable
Hi All,

I have 4000 cases which have string variables in them. I want to do some fuzzy matching and create a new variable of the same length containing 0s and 1s. If I use the code

test <- agrep("web Klick", ETC$Exposure.Type, max = 2, ignore.case = TRUE)

it works, but I get

length(test)
[1] 3127

This returns the indices of the cases that do match. Can someone tell me how to map this onto the dataset (ETC) that I have, as 1 and 0?
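Since agrep() returns positions rather than a full-length vector, one possible way to get the 0/1 variable is to test every row position for membership in the result. The vector `exposure` below is a made-up stand-in for ETC$Exposure.Type.

```r
# Made-up stand-in for ETC$Exposure.Type
exposure <- c("Web Klick", "web click", "TV spot", "webklick", "radio")

# agrep() returns the positions of the approximate matches
hits <- agrep("web Klick", exposure, max.distance = 2, ignore.case = TRUE)

# Turn the positions into a 0/1 flag of the same length as the input
flag <- as.integer(seq_along(exposure) %in% hits)
```

On the real data the last line would become something like `ETC$flag <- as.integer(seq_len(nrow(ETC)) %in% test)`, where the column name `flag` is only a suggestion.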
Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?
Wow great! Split was exactly what was needed. It takes about 1 second for the whole operation :D

Thanks again - I can't believe I never used this function in the past.

All the best,
Emmanuel
Re: [R] summary.manova rank deficiency error + data
Pedro Mardones wrote:

Thanks for the reply. The SAS output is attached, but it seems to me that it doesn't correspond to the within-row contrasts as you suggested. By the way, yes, the data are highly correlated; in fact each row corresponds to the first part of a signal vector.

Thanks anyway
PM

Agreed. I tried disabling the check that causes R to protest, and then it gives similar DF but not quite the same statistics, quite possibly due to numerical instabilities in one or both systems. (You can easily try yourself, just do anova.mlm <- stats:::anova.mlm and edit the qr() call inside.)

anova(lm(cbind(Y1,Y2,Y3,Y4,Y5)~GROUP, test), test = "Wilks")
Analysis of Variance Table

            Df    Wilks approx F num Df den Df Pr(>F)
(Intercept)  1 0.002537  1887.24      5     24 <2e-16 ***
GROUP        2 0.62         1.29     10     48 0.2616
Residuals   28
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The GLM Procedure
Multivariate Analysis of Variance

E = Error SSCP Matrix

              y1            y2            y3            y4            y5
y1  0.0353518799   0.035256904  0.0351327804  0.0349749601  0.0347868018
y2   0.035256904  0.0351627227  0.0350395053  0.0348827098  0.0346956744
y3  0.0351327804  0.0350395053  0.0349173343  0.0347617352  0.0345760232
y4  0.0349749601  0.0348827098  0.0347617352  0.0346075203  0.0344233531
y5  0.0347868018  0.0346956744  0.0345760232  0.0344233531  0.0342409225

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|
DF = 28

          y1        y2        y3        y4        y5
y1  1.000000  0.999992  0.999967  0.999921  0.999852
                <.0001    <.0001    <.0001    <.0001
y2  0.999992  1.000000  0.999991  0.999963  0.999911
      <.0001              <.0001    <.0001    <.0001
y3  0.999967  0.999991  1.000000  0.999990  0.999958
      <.0001    <.0001              <.0001    <.0001
y4  0.999921  0.999963  0.999990  1.000000  0.999989
      <.0001    <.0001    <.0001              <.0001
y5  0.999852  0.999911  0.999958  0.999989  1.000000
      <.0001    <.0001    <.0001    <.0001

The SAS System    10:33 Wednesday, August 13, 2008

The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for group

              y1            y2            y3            y4            y5
y1  0.0023822408   0.002365848  0.0023471328  0.0023261249  0.0023030993
y2   0.002365848  0.0023495679  0.0023309816  0.0023101183  0.0022872511
y3  0.0023471328  0.0023309816  0.0023125426  0.0022918453  0.0022691608
y4  0.0023261249  0.0023101183  0.0022918453  0.0022713359  0.0022488593
y5  0.0023030993  0.0022872511  0.0022691608  0.0022488593  0.0022266141

Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for group
E = Error SSCP Matrix

Characteristic            Characteristic Vector V'EV=1
          Root  Percent          y1          y2          y3          y4          y5
    0.41840103    71.72   -7542.628   17131.814    5347.394  -31627.317   16700.100
    0.16496011    28.28   -4180.854   -4413.446   32096.035  -35545.204   12040.697
    0.00000001     0.00  -41004.875  107291.004  -95905.664   32641.189   -3028.470
    0.00000000     0.00    -416.226    -111.206     410.721     295.193    -171.953
    0.00000000     0.00  -14678.651    5787.997   54718.250  -69055.249   23218.580

MANOVA Test Criteria and F Approximations for the Hypothesis of No Overall group Effect
H = Type III SSCP Matrix for group
E = Error SSCP Matrix
S=2  M=1  N=11

Statistic                     Value  F Value  Num DF  Den DF  Pr > F
Wilks' Lambda            0.60518744     1.37      10      48  0.2227
Pillai's Trace           0.43658228     1.40      10      50  0.2095
Hotelling-Lawley Trace   0.58336114     1.37      10  33.362  0.2385
Roy's Greatest Root      0.41840103     2.09       5      25  0.1000

On Wed, Aug 13, 2008 at 4:34 AM, Peter Dalgaard [EMAIL PROTECTED] wrote: Pedro Mardones wrote: Dear R-users;
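As a cross-check on how the four SAS test statistics relate to the characteristic roots of E^{-1}H, the sketch below recomputes them from the two non-zero roots reported in the SAS output (0.41840103 and 0.16496011). The formulas are the standard textbook definitions, not something taken from the SAS or R source code.

```r
# Non-zero characteristic roots of E^{-1} H, copied from the SAS output
roots <- c(0.41840103, 0.16496011)

wilks     <- prod(1 / (1 + roots))     # Wilks' Lambda
pillai    <- sum(roots / (1 + roots))  # Pillai's Trace
hotelling <- sum(roots)                # Hotelling-Lawley Trace
roy       <- max(roots)                # Roy's Greatest Root, as SAS reports it

round(c(wilks = wilks, pillai = pillai, hotelling = hotelling, roy = roy), 8)
```

The recomputed values agree with the "Value" column of the SAS table to the printed precision, which suggests that any difference from the R result comes from how the roots themselves are obtained from the nearly singular E matrix.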
Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?
split if probably what you are after. Here is an example: n - 270 x - data.frame(name=sample(1:6000,n,TRUE), value=runif(n)) # split it into 6000 lists system.time(y - split(x$value, x$name)) user system elapsed 0.800.201.07 str(y[1:10]) List of 10 $ 1 : num [1:454] 0.270 0.380 0.238 0.048 0.715 ... $ 2 : num [1:440] 0.769 0.822 0.832 0.527 0.808 ... $ 3 : num [1:444] 0.626 0.324 0.918 0.916 0.743 ... $ 4 : num [1:455] 0.341 0.482 0.134 0.237 0.324 ... $ 5 : num [1:430] 0.610 0.217 0.245 0.716 0.600 ... $ 6 : num [1:443] 0.460 0.335 0.503 0.798 0.181 ... $ 7 : num [1:424] 0.4417 0.4759 0.7436 0.0863 0.1770 ... $ 8 : num [1:480] 0.0712 0.6774 0.2995 0.8378 0.1902 ... $ 9 : num [1:431] 0.892 0.836 0.397 0.612 0.395 ... $ 10: num [1:448] 0.984 0.601 0.793 0.363 0.898 ... Takes less that 1 second to split into 6000 lists. On Wed, Aug 13, 2008 at 9:03 AM, Emmanuel Levy [EMAIL PROTECTED] wrote: Wow great! Split was exactly what was needed. It takes about 1 second for the whole operation :D Thanks again - I can't believe I never used this function in the past. All the best, Emmanuel 2008/8/13 Erik Iverson [EMAIL PROTECTED]: I still don't understand what you are doing. Can you make a small example that shows what you have and what you want? Is ?split what you are after? Emmanuel Levy wrote: Dear Peter and Henrik, Thanks for your replies - this helps speed up a bit, but I thought there would be something much faster. What I mean is that I thought that a particular value of a level could be accessed instantly, similarly to a hash key. Since I've got about 6000 levels in that data frame, it means that making a list L of the form L[[1]] = values of name 1 L[[2]] = values of name 2 L[[3]] = values of name 3 ... would take ~1hour. 
Best, Emmanuel

2008/8/12 Henrik Bengtsson [EMAIL PROTECTED]:
To simplify:

    n <- 2.7e6;
    x <- factor(c(rep("A", n/2), rep("B", n/2)));

    # Identify 'A':s
    t1 <- system.time(res <- which(x == "A"));

    # To compare a factor to a string, the factor is in practice
    # coerced to a character vector.
    t2 <- system.time(res <- which(as.character(x) == "A"));

    # Interestingly enough, this seems to be faster (repeated many times)
    # Don't know why.
    print(t2/t1);
        user   system  elapsed
    0.632653     1.60 0.754717

    # Avoid coercing the factor, but instead coerce the level compared to
    t3 <- system.time(res <- which(x == match("A", levels(x))));

    # ...but gives no speed up
    print(t3/t1);
        user   system  elapsed
    1.041667     1.00 1.018182

    # But coercing the factor to integers does
    t4 <- system.time(res <- which(as.integer(x) == match("A", levels(x))));
    print(t4/t1);
         user    system   elapsed
        0.417     0.000 0.3636364

So, the latter seems to be the fastest way to identify those elements. My $.02 /Henrik

On Tue, Aug 12, 2008 at 7:31 PM, Peter Cowan [EMAIL PROTECTED] wrote:
Emmanuel,

On Tue, Aug 12, 2008 at 4:35 PM, Emmanuel Levy [EMAIL PROTECTED] wrote:
Dear All, I have a large data frame (2700000 lines and 14 columns), and I would like to extract the information in a particular way illustrated below. Given a data frame df:

    col1=sample(c(0,1),10, rep=T)
    names = factor(c(rep("A",5),rep("B",5)))
    df = data.frame(names,col1)
    df
       names col1
    1      A    1
    2      A    0
    3      A    1
    4      A    0
    5      A    1
    6      B    0
    7      B    0
    8      B    1
    9      B    0
    10     B    0

I would like to transform it into the form:

    index = c("A","B")
    col1[[1]]=df$col1[which(df$name=="A")]
    col1[[2]]=df$col1[which(df$name=="B")]

[Peter Cowan:] I'm not sure I fully understand your problem; your example would not run for me. You could get a small speedup by omitting which(), since you can also subset by a logical vector.
    > n <- 2700000
    > foo <- data.frame(
    +   one = sample(c(0,1), n, rep = T),
    +   two = factor(c(rep("A", n/2 ),rep("B", n/2 )))
    + )
    > system.time(out <- which(foo$two=="A"))
       user  system elapsed
      0.566   0.146   0.761
    > system.time(out <- foo$two=="A")
       user  system elapsed
      0.429   0.075   0.588

You might also find use for unstack(), though I didn't see a speedup.

    > system.time(out <- unstack(foo))
       user  system elapsed
      1.068   0.697   2.004

HTH Peter

[Emmanuel Levy, original message continued:] My problem is that the command:

    which(df$name=="A")

takes about 1 second because df is so big. I was thinking that a level could maybe be accessed instantly but I am not sure about how to do it. I would be very grateful for any advice that would allow me to speed this up. Best wishes, Emmanuel
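A minimal sketch of the comparison Henrik describes, on a smaller made-up factor so it runs quickly. It only checks that comparing the integer codes against the position of the level selects the same elements as the string comparison; whether it is actually faster will depend on the R version and the data, so no timings are claimed here.

```r
# Small illustrative factor (the size and the levels are assumptions).
n <- 1e5
x <- factor(rep(c("A", "B"), each = n / 2))

# String comparison: the factor is compared via its labels.
idx_chr <- which(x == "A")

# Integer comparison: look up the code of level "A" once,
# then compare the underlying integer codes.
code_A  <- match("A", levels(x))
idx_int <- which(as.integer(x) == code_A)

# Both approaches should select exactly the same positions.
print(identical(idx_chr, idx_int))
```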
Re: [R] which(df$name==A) takes ~1 second! (df is very large), but can it be speeded up?
If you want the index, then use:

    system.time(y <- split(seq(nrow(x)), x$name))
       user  system elapsed
       0.81    0.06    0.88
    str(y[1:10])
    List of 10
     $ 1 : int [1:454] 6924 17503 26880 39197 42881 50835 57896 62624 65767 75359 ...
     $ 2 : int [1:440] 9954 25619 25761 33776 56651 60372 61042 63134 64414 64491 ...
     $ 3 : int [1:444] 5413 6831 15780 21652 29423 37000 38661 60977 72267 74839 ...
     $ 4 : int [1:455] 23859 24748 27221 34886 40538 41326 45065 79769 81783 83951 ...
     $ 5 : int [1:430] 2572 3514 9934 24969 33844 35409 38122 38161 40113 45593 ...
     $ 6 : int [1:443] 7145 25184 26348 31182 39965 44191 49114 52791 69855 74272 ...
     $ 7 : int [1:424] 4596 11762 24949 30324 57906 59043 64833 70769 88878 90594 ...
     $ 8 : int [1:480] 14809 17604 18958 28436 31449 45339 51829 57725 65243 73260 ...
     $ 9 : int [1:431] 10748 14579 27153 27685 31930 32593 34605 35680 35828 50490 ...
     $ 10: int [1:448] 5292 13049 21132 22673 22983 28324 40099 43709 55505 70957 ...

On Wed, Aug 13, 2008 at 9:09 AM, jim holtman [EMAIL PROTECTED] wrote:
split is probably what you are after.
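The two uses of split() in this thread can be sketched side by side on a small made-up data frame (the column names mirror the example above; the sizes are arbitrary). Splitting the values gives one vector per level, while splitting the row numbers gives an index that can be reused for any other column.

```r
set.seed(42)
n <- 1000
x <- data.frame(name = sample(1:6, n, TRUE), value = runif(n))

# One pass over the data: a list of value vectors, one per level of 'name'.
vals <- split(x$value, x$name)

# Same grouping, but keeping the row indices instead of the values.
rows <- split(seq_len(nrow(x)), x$name)

# The indices recover the same values as the direct split.
print(isTRUE(all.equal(x$value[rows[["3"]]], vals[["3"]])))

# Group sizes add up to the number of rows.
print(sum(sapply(rows, length)) == n)
```

Accessing `vals[["3"]]` afterwards is a plain list lookup by name, which is close to the hash-key style of access asked about in the original question.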
Re: [R] ignoring zeros or converting to NA
The help page on the comparison operators (see ?"==") confirms that the inexact binary representation of fractional values is not catered for, and points to all.equal as a more suitable test in those cases. Steve E

Thomas Lumley [EMAIL PROTECTED] 13/08/2008 16:47:
Integers (up to a fairly high limit) are represented exactly, as are fractions whose denominator is a power of two (again up to a fairly high limit), so x==0 is fine in that sense. If x is computed by floating point operations you do have to worry whether these are exact, e.g., with x <- seq(-1,1,length=7) it is not clear that the fourth element will be exactly zero. -thomas

On Wed, 13 Aug 2008, Roland Rau wrote:
Hi, since many suggestions are following the form of x[x==0] (or similar) I would like to ask if this is really recommended? What I have learned (the hard way) is that one should not test for equality of floating point numbers (which is the default for R's numeric values, right?) since the binary representation of these (decimal) floating point numbers is not necessarily exact (with the classic example of decimal 0.1). Is it okay in this case for the value zero where all binary elements are zero? Or does R somehow recognize that it is an integer? Just some questions out of curiosity. Thank you, Roland

rcoder wrote:
Hi everyone, I have a matrix that has a combination of zeros and NAs. When I perform certain calculations on the matrix, the zeros generate Inf values. Is there a way to either convert the zeros in the matrix to NAs, or only perform the calculations if not zero (i.e. like using something similar to an !all(is.na() construct)? Thanks, rcoder

Thomas Lumley, Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
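The distinction drawn above can be illustrated with a short sketch. The specific numbers are only examples: 0.1 and 0.2 have no exact binary representation, whereas 0.5 and 0.25 are powers of two and do. The result of the exact test on the computed sequence is deliberately not asserted, since it may depend on how the values were calculated.

```r
# Values that are not exactly representable in binary floating point.
a <- 0.1 + 0.2
print(a == 0.3)                     # exact comparison
print(isTRUE(all.equal(a, 0.3)))    # comparison within a numerical tolerance

# Fractions whose denominator is a power of two are stored exactly.
b <- 0.5 + 0.25
print(b == 0.75)

# A value computed by floating point operations: test with a tolerance
# rather than relying on x[4] == 0.
x <- seq(-1, 1, length.out = 7)
print(isTRUE(all.equal(x[4], 0)))
```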
[R] tcl/tk example in batch
The example for learning tcl/tk under R at http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/OKtoplevel.html suggests running it from batch - but when I do, the window flashes by and the example ends. I'm under XP pro. Is there a workaround? Should I create a modal window instead so it persists? Thanks.
[R] Arguments to lm() within a function - object not found
Hi all, I'm having some difficulty passing arguments into lm() from within a function, and I was hoping someone wiser in the ways of R could tell me what I'm doing wrong. I have the following:

    lmwrap <- function(...) {
      wts <- somefunction()
      print(wts)  # This works, wts has the values I expect
      fit <- lm(weights=wts,...)
      return(fit)
    }

If I call my function lmwrap, I get the following error:

    lmwrap(a~b)
    Error in eval(expr, envir, enclos) : object "wts" not found

A traceback gives me the following:

    8: eval(expr, envir, enclos)
    7: eval(extras, data, env)
    6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels = TRUE)
    5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
    4: eval(expr, envir, enclos)
    3: eval(mf, parent.frame())
    2: lm(weights = wts, ...)
    1: wraplm(a ~ b)

It seems like whatever environment lm is trying to eval wts in doesn't have it defined. Could anyone tell me what I'm doing wrong? As a sidenote, I do have a workaround, but this strikes me as really the wrong thing to do. I replace the call to lm with:

    eval(substitute(lm(weights = dummy,...),list(dummy=wts)))

which works. Thanks Pete
[R] reverse orientation of text in plot margins
Dear R users, I am trying to reverse the orientation of axis labels and title in the right margin of a plot, so that they read from top to bottom. I know that this can be done using text() as follows:

    par(mar=c(5,4,4,4)+.1)
    plot(1:4,las=0)
    par(new=T)
    y <- rnorm(4)
    plot(y,axes=FALSE,ann=FALSE,pch=17)
    axis(4,labels=FALSE)
    par(xpd=TRUE)
    text(x=par("usr")[2]+.25,y=axTicks(4),labels=axTicks(4),srt=-90)
    text(x=par("usr")[2]+.5,y=sum(par("usr")[3:4])/2,labels="titel",srt=-90)
    par(xpd=FALSE)

The problem is that I have to manually reset the x and y coordinates of the text whenever the plot is resized. This is problematic if I want to automate the production of a number of plots (or produce different output formats), or to make sure that the labels and title on the right axis are at the same distance from the plot as the labels and title on the left axis. At the moment I can only judge it by eye. mtext() allows me to set the distance, but not to reverse the orientation of the text. I could use text() to also produce the left axis, so that labels on both sides are at exactly the same distance from the plot, but then I would want to determine the plot margins relative to the plot dimensions. Does anyone see a solution to my problem that doesn't involve trial and error for the x coordinate? thanks Karel
[R] conditional IF with AND
Hi everyone, I'm trying to create an if conditional statement with two conditions, whereby the statement is true when condition 1 AND condition 2 are met: code structure: if ?AND? (a[x,y] condition1, a[x,y] condition2) I've trawled through the help files, but I cannot find an example of the syntax for incorporating an AND in a conditional IF statement. Thanks, rcoder
Re: [R] dixon test
Thank you so much. I do not have much experience with outliers =), and I thought that there were nonparametric, distribution-free outlier tests =(. What is the most general distribution I can use? I plotted histograms of my data set, and sometimes it looks like a normal distribution and sometimes like a uniform distribution. So I cannot tell what distribution I can use for my whole data set.

S Ellison wrote:
giov [EMAIL PROTECTED] 13/08/2008 10:59:32: just a question...I don't know what is the distribution of my data (normal, T, etc...). So, how can I set the type parameter?

You must assume an underlying distribution or you can't do an outlier test. Outliers are just unusually extreme data points. They can only be considered 'unusual' if there is some basis - a distribution assumption - for deciding what is 'usual'. The assumed underlying distribution describes what is expected to be 'usual'. With no distribution assumption, there is no basis for considering any data point unusual, so the idea of an outlier really has no meaning. Steve E
Re: [R] subsetting matrix according to columns with character index
Try this:

    x
      V1 V2 V3
    1 a1 c1  1
    2 a1 c1  2
    3 a2 c1  1
    4 a1 c2  1
    5 a1 c2  2
    lis <- split(x, list(x$V1, x$V2), drop = TRUE)
    do.call(rbind, unname(lis[sapply(lis, function(x)all(1:2 %in% x[,3]))]))

On Wed, Aug 13, 2008 at 3:00 PM, Ralph S. [EMAIL PROTECTED] wrote:
Hi, I have a long matrix of the following form which I would like to subset according to the third column:

    [x y z]:
    a1 c1 1
    a1 c1 2
    a2 c1 1
    a1 c2 1
    a1 c2 2
    . . .

The first two columns are characters ai and cj. I would like to keep all the rows where there are two entries for z, 1 and 2. That is, I want:

    a1 c1 1
    a1 c1 2
    a1 c2 1
    a1 c2 2
    . . .

I tried to use something like df[by(df,c(df$x,df$y),sum(z)==3),] but that only gives me one line of data per x y combination. Is there an easy way of coding to keep all rows for a and c combinations where z has entries both 1 and 2? Many thanks, Ralph

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
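As an alternative sketch of the same filter (not the method posted above), ave() can flag, for each row, whether its (V1, V2) group contains both 1 and 2 in the third column. The data frame is rebuilt from the five rows shown in the thread; the variable name `keep` is just illustrative.

```r
# The five example rows from the thread.
x <- data.frame(V1 = c("a1", "a1", "a2", "a1", "a1"),
                V2 = c("c1", "c1", "c1", "c2", "c2"),
                V3 = c(1, 2, 1, 1, 2),
                stringsAsFactors = FALSE)

# For every row, compute a group-level flag (1 if the group defined by
# V1 and V2 contains both z = 1 and z = 2, otherwise 0).
flag <- ave(x$V3, x$V1, x$V2, FUN = function(z) all(1:2 %in% z))
keep <- flag == 1

# Rows are kept in their original order, unlike the split/rbind approach,
# which reorders rows by group.
print(x[keep, ])
```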
Re: [R] conditional IF with AND
if (cond1 && cond2) { ... }

rcoder wrote:
Hi everyone, I'm trying to create an if conditional statement with two conditions, whereby the statement is true when condition 1 AND condition 2 are met: code structure: if ?AND? (a[x,y] condition1, a[x,y] condition2) I've trawled through the help files, but I cannot find an example of the syntax for incorporating an AND in a conditional IF statement. Thanks, rcoder
Re: [R] conditional IF with AND
See: ?`&&`

On Wed, Aug 13, 2008 at 1:45 PM, rcoder [EMAIL PROTECTED] wrote:
Hi everyone, I'm trying to create an if conditional statement with two conditions, whereby the statement is true when condition 1 AND condition 2 are met: code structure: if ?AND? (a[x,y] condition1, a[x,y] condition2) I've trawled through the help files, but I cannot find an example of the syntax for incorporating an AND in a conditional IF statement. Thanks, rcoder

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
[R] How to test for a random effect in a repeated measures analysis using anova.mlm ?
Dear “R” masters, I am trying to conduct an ANOVA with repeated measures using the command anova.mlm for data structured according to a Randomized Block Design. I would like to account for a random effect but cannot find a way to incorporate it in the analysis. NB: when I tried using the argument “M” to define the outer projection (block), I got the message that the lengths differ.

“response” is the dependent variable (3 years of height measurements, merged with cbind). “estab” is a factor with 3 levels (whether trees were planted, seeded or naturally established). I would like to include “block” as a random effect. I would like to keep the structure of the response variable (so I don’t get an output with a test for each year: this is what happens when I use “lme” or “aov”).

This is the code I am using. First, fit linear models:

    estabfit <- lm(response~estab)
    timefit <- lm(response~1)

Then test the effect of the factor “estab”:

    anova.mlm(timefit,estabfit,M=~1)

How do I integrate “block”? I was inspired by: http://tolstoy.newcastle.edu.au/R/help/05/11/15744.html Thank you so much for your help! Cheers Marie-lou Lefrancois
Re: [R] replacing default labels in lattice
You can see the source code of the demo script:

    file.show(system.file("demo/labels.R", package = "lattice"))

On Wed, Aug 13, 2008 at 11:20 AM, Andrewjohnclose [EMAIL PROTECTED] wrote:
Dear all, I am having a little trouble deciphering how to change the default x-axis labels in a lattice xyplot (or any type of lattice plot for that matter). I have tried using the demo(labels) function but the code is truncated at precisely the wrong moment! All I am trying to do is to add superscripts to two of the labels, for which I tried using the expression function. It partly works, but it prints only the first replacement label inside the plotting region and forgets the rest...what am I missing? Thank you

    xyplot(resid(mod4)~factor(distance),aspect=1.0,cex=1.0,xlab="Distance",ylab="Residuals",data=meanAG, span=1,
      panel=function(x,y,span){
        panel.grid(h=0, v=-1)
        panel.xyplot(x,y,cex=1.0,points=jitter)
        panel.loess(x,y, span)
        panel.axis(side="bottom",at=TRUE,
          labels=c(expression(Bray-Curtis^{1}),expression(Bray-Curtis^{2}),expression(Canberra),expression(Gower),expression(Hellinger),expression(Kulczynski)))
      })

http://www.nabble.com/file/p18964008/meanAG.csv meanAG.csv

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
Re: [R] ignoring zeros or converting to NA
FYI, there is an isZero() in the R.utils package that allows you to specify the precision. It looks like this:

    isZero <- function (x, neps=1, eps=.Machine$double.eps, ...) {
      (abs(x) < neps*eps);
    }

/Henrik

On Wed, Aug 13, 2008 at 8:23 AM, Roland Rau [EMAIL PROTECTED] wrote:
Hi, since many suggestions are following the form of x[x==0] (or similar) I would like to ask if this is really recommended? What I have learned (the hard way) is that one should not test for equality of floating point numbers (which is the default for R's numeric values, right?) since the binary representation of these (decimal) floating point numbers is not necessarily exact (with the classic example of decimal 0.1). Is it okay in this case for the value zero where all binary elements are zero? Or does R somehow recognize that it is an integer? Just some questions out of curiosity. Thank you, Roland

rcoder wrote:
Hi everyone, I have a matrix that has a combination of zeros and NAs. When I perform certain calculations on the matrix, the zeros generate Inf values. Is there a way to either convert the zeros in the matrix to NAs, or only perform the calculations if not zero (i.e. like using something similar to an !all(is.na() construct)? Thanks, rcoder
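Applied to the original question, a tolerance-based test can be used to turn zeros (and near-zeros) into NA before dividing. This is only a sketch: `near_zero` is a hypothetical helper written here for illustration, not the R.utils function, and the default tolerance is an assumption that should be chosen to suit the scale of the data.

```r
# Hypothetical helper: TRUE where x is within 'tol' of zero; NA entries
# are reported as FALSE so that they are left untouched.
near_zero <- function(x, tol = sqrt(.Machine$double.eps)) {
  !is.na(x) & abs(x) < tol
}

# A small made-up matrix containing exact zeros, a near-zero and an NA.
m <- matrix(c(0, 2, NA, 1e-12, 5, 0), nrow = 2)

# Replace (near-)zeros by NA, then do the calculation that used to give Inf.
m[near_zero(m)] <- NA
r <- 1 / m
print(r)
```

If only exact zeros should be replaced, `m[!is.na(m) & m == 0] <- NA` does the same job without a tolerance.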
Re: [R] conditional IF with AND
On 13-Aug-08 16:45:27, rcoder wrote:
Hi everyone, I'm trying to create an if conditional statement with two conditions, whereby the statement is true when condition 1 AND condition 2 are met: code structure: if ?AND? (a[x,y] condition1, a[x,y] condition2) I've trawled through the help files, but I cannot find an example of the syntax for incorporating an AND in a conditional IF statement. Thanks, rcoder

The basic structure of an 'if' statement (from ?"if" -- don't forget the "..." for certain keywords such as "if") is:

    if(cond) expr

What is not explained in the ?"if" help is that 'cond' may be any expression that evaluates to a logical TRUE or FALSE. Hence you can build 'cond' to suit your purpose. Therefore:

    if( (condition 1 on a[x,y]) && (condition 2 on a[x,y]) ) {
      whatever you want to do if (cond1 AND cond2) is TRUE
    }

Example:

    if( (a[x,y] > 1.0) && (a[x,y] < 2.0) ){
      print("Between 1 and 2")
    }

Hoping this helps, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 13-Aug-08 Time: 19:33:53 -- XFMail --
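A runnable version of the example above, with a small made-up matrix standing in for a and arbitrary indices x and y. It also shows the element-wise operator & for contrast, since && is meant for a single TRUE/FALSE condition inside if().

```r
# Made-up data: a 2 x 2 matrix filled column by column.
a <- matrix(c(0.5, 1.5, 2.5, 1.2), nrow = 2)
x <- 2
y <- 1

# '&&' combines two single logical values, as required by if().
msg <- if ((a[x, y] > 1.0) && (a[x, y] < 2.0)) {
  "Between 1 and 2"
} else {
  "Outside"
}
print(msg)

# '&' works element by element and returns a logical matrix,
# which is useful for subsetting rather than for if().
in_range <- (a > 1.0) & (a < 2.0)
print(a[in_range])
```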
Re: [R] which alternative tests instead of AIC/BIC for choosing models
Many thanks John, appreciate the advice, Tolga

John C Frain [EMAIL PROTECTED] wrote on 13/08/2008 18:51:
My initial idea would be to forget about AIC and BIC, ask the question what would one expect to get in the regression and then regress y on x1 and x2 and use a simple t-test to determine what should be included. Remember that omitted variables will bias your coefficients but if you include redundant variables your results will remain consistent. I presume that you do not have any problems with non-stationary variables. Best Regards John

2008/8/13 [EMAIL PROTECTED]:
Dear R Users, I am looking for an alternative to AIC or BIC to choose model parameters. This is somewhat of a general statistics question, but I ask it in this forum as I am looking for an R solution. Suppose I have one dependent variable, y, and two independent variables, x1 and x2. I can perform three regressions:

    reg1: y~x1
    reg2: y~x2
    reg3: y~x1+x2

The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would, presumably, conclude that one should use both x1 and x2. However, the R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is 95.25%. Knowing that, I would actually conclude that x1 adds little and should probably not be used.

There is the overall question of what potentially explains this outcome, i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does not materially improve with the addition of x1 to reg2 (to get to reg3). But that is more of a generic statistics issue and not my question here. The question I do have is: is there a package in R which implements a test and provides some diagnostic information I can use to rule out the use of x1 in a systematic way, as its addition to the equation adds little in terms of explaining the variability of y?
Thanks in advance, Tolga

-- John C Frain Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html
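John's suggestion (fit the full model and look at a t-test for x1) can be sketched with base R alone. The data below are simulated purely for illustration, with a deliberately weak x1 effect; none of the numbers correspond to the AIC or R^2 values quoted in the thread. For a single added regressor, the partial F-test from anova() is equivalent to the t-test on that coefficient.

```r
set.seed(123)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
# Simulated response: x2 matters a lot, x1 only marginally (assumed values).
y  <- 0.05 * x1 + 2 * x2 + rnorm(n, sd = 0.5)

reg2 <- lm(y ~ x2)
reg3 <- lm(y ~ x1 + x2)

# t-test for x1 in the full model.
t_x1 <- summary(reg3)$coefficients["x1", ]
print(t_x1)

# Partial F-test comparing the nested models reg2 and reg3.
cmp <- anova(reg2, reg3)
print(cmp)

# Information criteria and R^2 side by side, for comparison.
print(AIC(reg2, reg3))
print(c(reg2 = summary(reg2)$r.squared, reg3 = summary(reg3)$r.squared))
```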
Re: [R] Arguments to lm() within a function - object not found
On Wed, 13 Aug 2008, Pete Berlin wrote:

> Hi all,
>
> I'm having some difficulty passing arguments into lm() from within a function, and I was hoping someone wiser in the ways of R could tell me what I'm doing wrong. I have the following:
>
> lmwrap <- function(...) {
>   wts <- somefunction()
>   print(wts)  # This works, wts has the values I expect
>   fit <- lm(weights = wts, ...)
>   return(fit)
> }
>
> If I call my function lmwrap, I get the following error:
>
> lmwrap(a ~ b)
> Error in eval(expr, envir, enclos) : object "wts" not found

Correct. The help (?lm) says

  All of 'weights', 'subset' and 'offset' are evaluated in the same way as variables in 'formula', that is first in 'data' and then in the environment of 'formula'.

> A traceback gives me the following:
>
> 8: eval(expr, envir, enclos)
> 7: eval(extras, data, env)
> 6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels = TRUE)
> 5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
> 4: eval(expr, envir, enclos)
> 3: eval(mf, parent.frame())
> 2: lm(weights = wts, ...)
> 1: wraplm(a ~ b)
>
> It seems like whatever environment lm is trying to eval wts in doesn't have it defined. Could anyone tell me what I'm doing wrong?
>
> As a sidenote, I do have a workaround, but this strikes me as really the wrong thing to do. I replace the call to lm with:
>
> eval(substitute(lm(weights = dummy, ...), list(dummy = wts)))
>
> which works.

It's one workaround, but working with the scoping rules is better. Hint: use the 'data' argument to lm.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
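Following the hint about the 'data' argument, one way to work with the scoping rules is to put the weights into the data frame that lm() receives. The following is only a minimal sketch: compute_weights() is a hypothetical stand-in for somefunction(), and the wrapper takes formula and data explicitly, neither of which is from the original post.

```r
# Hypothetical stand-in for somefunction(): one weight per row.
compute_weights <- function(n) seq_len(n) / n

lmwrap <- function(formula, data) {
  # Store the weights as a column so that lm() finds 'wts' in 'data',
  # which is the first place the 'weights' argument is looked up.
  data$wts <- compute_weights(nrow(data))
  lm(formula, data = data, weights = wts)
}

d <- data.frame(a = c(1.1, 2.3, 2.9, 4.2, 5.1), b = 1:5)
fit <- lmwrap(a ~ b, d)
coef(fit)
```

The wrapper overwrites any existing column called wts in its local copy of data, so a less common column name may be safer in practice.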
[R] need help with stat functions(like adaboost, random forests and glm)
Ok, so basically I have a data frame named data_frame. It contains these columns:

  startdate
  startprice
  endpricethreshold1
  endpricethreshold2
  endpricethreshold3

All of the endpricethreshold columns are TRUE/FALSE vectors. They are true or false depending on whether the end price was above or below the corresponding threshold.

Now I want to use, let's say, a generalized linear model to predict which end-price thresholds will be true or false depending on startdate and startprice. So I have a call like:

  glm(endpricethreshold1 ~ ., data=data_frame[,c(1,2,3)], family=binomial(logit));

But for the response term endpricethreshold1 (since I really have tons of endpricethresholds and would like to make this a loop) I don't want to refer to it by its name but by its column index, like this:

  glm(data_frame[[3]] ~ ., data=data_frame[,c(1,2,3)], family=binomial(logit));

However, when I do this I get completely different results and I have no idea why. If anyone could help it would be greatly appreciated.

Thanks,
Paul Fisch
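One likely explanation, offered as a sketch rather than a confirmed diagnosis of the poster's data: in a formula, `.` stands for every column of `data` that is not already named in the formula. When the response is written as `data_frame[[3]]`, no column name appears on the left-hand side, so endpricethreshold1 stays among the predictors and the response ends up predicting itself. Building the formula from the column name avoids this. The data below are simulated purely for illustration, and the variable names predictors, responses and fits are assumptions.

```r
set.seed(1)
# Simulated stand-in for the poster's data (illustrative only).
data_frame <- data.frame(startdate = 1:40, startprice = rnorm(40))
data_frame$endpricethreshold1 <- data_frame$startprice + rnorm(40) > 0
data_frame$endpricethreshold2 <- data_frame$startprice + rnorm(40) > 0.5

predictors <- c("startdate", "startprice")
responses  <- names(data_frame)[3:4]   # select response columns by index

# One logistic regression per response column; reformulate() builds
# e.g. endpricethreshold1 ~ startdate + startprice from character strings.
fits <- lapply(responses, function(resp) {
  glm(reformulate(predictors, response = resp),
      data = data_frame, family = binomial(link = "logit"))
})
names(fits) <- responses

lapply(fits, coef)
```

Each element of fits uses the same two predictors, so the first fit should agree with the named-column call from the post.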
Re: [R] subsetting matrix according to columns with character index
I tried this - I get an empty set:

  0 rows (or 0-length row.names)

I guess this happens because the z variable takes only one value per row? What works is:

  DFsub <- DF[DF$z == 1 | DF$z == 2, ]

but then I do not eliminate the entries where there is only one entry for z given an a and c combination. Any idea what to do?

-Ralph

> Date: Wed, 13 Aug 2008 13:05:25 -0500
> From: [EMAIL PROTECTED]
> Subject: RE: [R] subsetting matrix according to columns with character index
> To: [EMAIL PROTECTED]
>
> it must be a dataframe so, if it was DF, then, assuming i understand what you want then either of the following should work:
>
> DFsub <- DF[DF$z == 1 & DF$z == 2, ]
>
> or
>
> DFsub <- subset(DF, z == 1 & z == 2)
>
> On Wed, Aug 13, 2008 at 2:00 PM, Ralph S. wrote:
>> Hi,
>>
>> I have a long matrix of the following form which I would like to subset according to the third column:
>>
>> [x y z]:
>> a1 c1 1
>> a1 c1 2
>> a2 c1 1
>> a1 c2 1
>> a1 c2 2
>> . . .
>>
>> The first two columns are characters ai and cj. I would like to keep all the rows where there are two entries for z, 1 and 2. That is, I want:
>>
>> a1 c1 1
>> a1 c1 2
>> a1 c2 1
>> a1 c2 2
>> . . .
>>
>> I try to use something like
>>
>> df[by(df,c(df$x,df$y),sum(z)==3),]
>>
>> but that only gives me one line of data per x y combination. Is there an easy way of coding to keep all rows for a and c combinations where z has entries both 1 and 2?
>>
>> Many thanks,
>> Ralph
Re: [R] subsetting matrix according to columns with character index
I don't think I understood what you were trying to do, at least based on Henrique's solution, which I haven't cut and pasted yet in order to understand it. Did Henrique's solution do what you wanted?
Re: [R] which alternative tests instead of AIC/BIC for choosing models
By way of partial follow-up to my own question, and on the odd chance anyone else wonders about this issue, some alternatives appear to be in the leaps package, which implements the leaps routine (Mallows' Cp) and regsubsets. In my case Mallows' Cp does not work either (see below), so I have implemented the following.

  regr  # holds a zoo object with the 1st column being the dependent variable
  r2test <- (result$lm.r2 > Rsqr) &
    (all(unlist(lapply(2:(dim(regr)[2]), function(i) summary(lm(regr[,1]~regr[,i]))$adj.r.squared)) > 0.1)) &
    which.min(leaps(as.matrix(regr[,-1]),regr[,1])$Cp)==dim(regr)[2]

leaps on the same problem below
===

  leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("adjr2"))
  $which
        1     2
  1 FALSE  TRUE
  1  TRUE FALSE
  2  TRUE  TRUE

  $label
  [1] "(Intercept)" "1" "2"

  $size
  [1] 2 2 3

  $adjr2
  [1] 0.950757134 0.001681389 0.954859493

  leaps(as.matrix(regr3[,-1]),regr3[,1],method=c("Cp"))
  $which
        1     2
  1 FALSE  TRUE
  1  TRUE FALSE
  2  TRUE  TRUE

  $label
  [1] "(Intercept)" "1" "2"

  $size
  [1] 2 2 3

  $Cp
  [1] 38.53367 8490.55327 3.0

Tolga I Uzuner/JPMCHASE
13/08/2008 17:33
To: r-help@r-project.org
cc:
Subject: which alternative tests instead of AIC/BIC for choosing models

Dear R Users,

I am looking for an alternative to AIC or BIC to choose model parameters. This is somewhat of a general statistics question, but I ask it in this forum as I am looking for an R solution.

Suppose I have one dependent variable, y, and two independent variables, x1 and x2. I can perform three regressions:

  reg1: y~x1
  reg2: y~x2
  reg3: y~x1+x2

The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would, presumably, conclude that one should use both x1 and x2. However, the R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is 95.25%. Knowing that, I would actually conclude that x1 adds little and should probably not be used.

There is the overall question of what potentially explains this outcome, i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does not materially improve with the addition of x1 to reg2 (to get to reg3). But that is more of a generic statistics issue and not my question here.

The question I do have is: is there a package in R which implements a test and provides some diagnostic information I can use to rule out the use of x1 in a systematic way, as its addition to the equation adds little in terms of explaining the variability of y?

Thanks in advance,
Tolga
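On the side question of how AIC can drop by 50 while R^2 barely moves: for two Gaussian lm() fits to the same n observations that differ by one coefficient, the AIC difference equals n * log((1 - R2_small) / (1 - R2_big)) - 2, so with a large n even a small gain in R^2 produces a sizeable AIC change. The sketch below checks that identity on simulated data; the coefficients and sample size are arbitrary assumptions, not the poster's data.

```r
set.seed(42)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.25 * x1 + 4.4 * x2 + rnorm(n)   # x1 contributes only a little

reg2 <- lm(y ~ x2)
reg3 <- lm(y ~ x1 + x2)
r2 <- function(m) summary(m)$r.squared

# AIC difference computed directly and via the R^2 identity.
delta_aic     <- AIC(reg2) - AIC(reg3)
delta_from_r2 <- n * log((1 - r2(reg2)) / (1 - r2(reg3))) - 2
c(delta_aic = delta_aic, delta_from_r2 = delta_from_r2)

# Sample size implied by the figures in the post (AIC 1000 vs 950,
# R^2 95% vs 95.25%), assuming ordinary Gaussian linear models.
n_implied <- (50 + 2) / log((1 - 0.95) / (1 - 0.9525))
n_implied
```

Under those assumptions the quoted figures would correspond to roughly 1,000 observations, which is consistent with a small but statistically detectable contribution from x1.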
Re: [R] subsetting matrix according to columns with character index
Sorry Ralph, I meant the OR instead of the AND, so that was my mistake. The subset function should also work with the OR.

I think I understand better what you want now. The approach below assumes that, if there are 2 rows associated with the values in the first 2 columns, then their z values will be 1 and 2. If they are 1,1 or 2,2, then it won't work. So Henrique's solution could be better and more general.

Assume your dataframe is called DF.

  tempres <- split(DF, DF$y)
  onlytwo <- lapply(tempres, function(.df) if (nrow(.df) == 2) { return(.df) } else { return(NULL) })
  onlytwo <- onlytwo[!sapply(onlytwo, is.null)]
  result <- do.call(rbind, onlytwo)
Re: [R] subsetting matrix according to columns with character index
Ralph: I looked at Henrique's solution and he does 2 things which make it better than mine.

1) He splits based on the first two columns, where I just split based on the second. So my split assumes that the matching rows are next to each other, which is an unnecessary assumption.

2) He actually checks that 1 and 2 are both in the third column of the dataframes that split returns. I assumed that, if a dataframe had 2 rows, then the latter would be true automatically.

So, even though mine worked for what you needed, in the spirit of generality and minimal assumptions, it is better to use Henrique's solution. Also, make sure you understand it, because you can learn a lot from it (this is also true of his solutions in general).

On Wed, Aug 13, 2008 at 3:37 PM, Ralph S. wrote:
> yes this work, very elegant thank you. I didn't get Henriques message in my mailbox immediately for some reason -
> -Ralph
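The idea described in this thread (group on the first two columns, then check that both 1 and 2 occur in the third) can be sketched without split(). This is not Henrique's code, which is not reproduced here; it is an illustration using ave() from base R on the sample rows from the original question, and the name has_both is an assumption.

```r
# Sample rows from the original question.
DF <- data.frame(
  x = c("a1", "a1", "a2", "a1", "a1"),
  y = c("c1", "c1", "c1", "c2", "c2"),
  z = c(1, 2, 1, 1, 2)
)

# For every (x, y) group, flag whether z contains both 1 and 2.
# ave() recycles the single group-level result to each row of the group.
has_both <- ave(DF$z, DF$x, DF$y,
                FUN = function(v) all(c(1, 2) %in% v)) == 1

DFsub <- DF[has_both, ]
DFsub
```

Because the check is made per group, the rows do not need to be adjacent, and groups such as 1,1 or 2,2 are dropped as well.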
Re: [R] merging data sets to match data to date
Dear Henrique,

This is exactly what I need. Thank you very much for your help!

rcoder

Henrique Dallazuanna wrote:
> Try this:
>
> x <- data.frame(Dates = seq(as.Date('2008-01-01'), as.Date('2008-01-31'), by = 'days'), Values = sample(31))
> subset(x, Dates %in% as.Date(c('2008-01-05', '2008-01-20')))
>
> On 8/13/08, rcoder [EMAIL PROTECTED] wrote:
>> Hi everyone,
>>
>> I want to extract data from a data set according to dates specified in a vector. I have created a blank matrix with row names (dates) that I want to extract from the full data set. I have then performed a merge to try to output rows corresponding to common dates to a results matrix, but the operation did not fill the results matrix. Could anyone offer any advice to assist with this operation?
>>
>> Thanks,
>> rcoder
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
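For readers who do want the merge() route from the original question, a small sketch follows. It assumes the dates are stored in an ordinary column rather than in row names, which is one way merge() can find the common key; the object names full, wanted and res are illustrative.

```r
# Full data set: one value per day in January 2008.
full <- data.frame(
  Dates  = seq(as.Date("2008-01-01"), as.Date("2008-01-31"), by = "day"),
  Values = 1:31
)

# Dates to extract, held in a data frame with the same key column.
wanted <- data.frame(Dates = as.Date(c("2008-01-05", "2008-01-20")))

# Inner join on the shared column keeps only the requested dates.
res <- merge(wanted, full, by = "Dates")
res
```

If the dates live in row names instead, merge() also accepts by = "row.names" or by = 0, but both tables then need matching row names.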
Re: [R] Arguments to lm() within a function - object not found
Thanks very much for the quick reply. I had looked at the help for lm, but I clearly skimmed over the critical part explaining where weights is evaluated.

Thanks,
Pete
Re: [R] conditional IF with AND
Thank you all for your replies. This is all very useful information for me! Ted, thank you very much for the extra explanation and example.

Many thanks,
rcoder

Ted.Harding-2 wrote:
> On 13-Aug-08 16:45:27, rcoder wrote:
>> Hi everyone,
>>
>> I'm trying to create an if conditional statement with two conditions, whereby the statement is true when condition 1 AND condition 2 are met:
>>
>> code structure:
>> if ?AND? (a[x,y] condition1, a[x,y] condition2)
>>
>> I've trawled through the help files, but I cannot find an example of the syntax for incorporating an AND in a conditional IF statement.
>>
>> Thanks,
>> rcoder
>
> The basic structure of an 'if' statement (from ?"if" -- don't forget the quotation marks for certain keywords such as "if") is:
>
>   if(cond) expr
>
> What is not explained in the ?"if" help is that 'cond' may be any expression that evaluates to a logical TRUE or FALSE. Hence you can build 'cond' to suit your purpose. Therefore:
>
>   if( (condition 1 on a[x,y]) && (condition 2 on a[x,y]) ) {
>     whatever you want to do if (cond1 AND cond2) is TRUE
>   }
>
> Example:
>
>   if( (a[x,y] > 1.0) && (a[x,y] < 2.0) ){
>     print("Between 1 and 2")
>   }
>
> Hoping this helps,
> Ted.
>
> E-Mail: (Ted Harding) [EMAIL PROTECTED]
> Fax-to-email: +44 (0)870 094 0861
> Date: 13-Aug-08 Time: 19:33:53
> -- XFMail --
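A runnable version of the example in this thread is sketched below, with a small made-up matrix. It also shows the distinction that matters in practice: `&&` compares single values and is the usual choice inside if(), while `&` works element by element on whole vectors or matrices. The matrix values and the names label and in_range are illustrative assumptions.

```r
a <- matrix(c(0.5, 1.5, 2.5, 1.2), nrow = 2)   # filled column by column
x <- 2
y <- 1

# Single element: both conditions must hold, so use && inside if().
label <- if ((a[x, y] > 1.0) && (a[x, y] < 2.0)) {
  "Between 1 and 2"
} else {
  "Outside"
}
print(label)

# Whole matrix at once: the vectorised & returns one TRUE/FALSE per cell.
in_range <- (a > 1.0) & (a < 2.0)
print(in_range)
```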
Re: [R] which alternative tests instead of AIC/BIC for choosingmodels
Your model 3 is the unrestricted model and your models 1 and 2 are restricted models. You can test models 1 and 2 against model 3 using the anova function, e.g. anova(model2, model3); for OLS fits the two models are compared with an F-test. If the test is insignificant, the simpler model should be preferred (and, of course, the fuller model if the test is significant). But if the variable is theoretically important (e.g. a theoretically important control), then it should be included regardless of its significance in the estimation for your specific data.

best,
Daniel

- cuncta stricte discussurus -
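The nested-model comparison described in this reply can be sketched as follows. The data are simulated, with x1 given no true effect, purely to show the mechanics; model2 and model3 follow the naming used in the reply, everything else is an assumption.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 2 * x2 + rnorm(n)      # x1 has no true effect in this simulation

model2 <- lm(y ~ x2)         # restricted model
model3 <- lm(y ~ x1 + x2)    # unrestricted model

# F-test of the restriction that the coefficient on x1 is zero.
cmp <- anova(model2, model3)
print(cmp)

# The p-value sits in the second row of the "Pr(>F)" column.
p_value <- cmp[["Pr(>F)"]][2]
```

A large p_value means the data give no evidence that x1 improves on model2; as the reply notes, that alone does not settle whether a theoretically motivated variable should stay in the model.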
Re: [R] reverse orientation of text in plot margins
On Wed, 13-Aug-2008 at 06:32PM +0200, Karel Van den Meersche wrote:

|
| Dear R users,
|
| I am trying to reverse the orientation of axis labels and title in
| the right margin of a plot, so that they read from top to bottom. I
| know that this can be done using text() as follows:
| par(mar=c(5,4,4,4)+.1)
| plot(1:4,las=0)
| par(new=T)
| y <- rnorm(4)
| plot(y,axes=FALSE,ann=FALSE,pch=17)
| axis(4,labels=FALSE)

I think it would be easiest to work out values for at and labels in this statement. ?axis.

HTH

--
Patrick Connolly
  Great minds discuss ideas
  Middle minds discuss events
  Small minds discuss people
  ..... Anon
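A sketch of one way to get top-to-bottom text in the right margin, along the text() route mentioned in the question: draw the right-hand axis without labels, then place the tick labels and a title with text(), using srt = 270 for the rotation and xpd = NA so drawing is allowed outside the plot region. The offsets (0.08 and 0.2 of the plot width), the y values and the title string are arbitrary assumptions, and pdf(NULL) is used only so the code runs without opening a window.

```r
pdf(NULL)                               # null device, no file is written
par(mar = c(5, 4, 4, 4) + 0.1)          # leave room in the right margin

y <- c(0.2, -1.1, 0.7, 1.5)
plot(y, pch = 17, las = 0)

at <- axTicks(4)                        # tick positions for the right axis
axis(4, at = at, labels = FALSE)        # ticks only, no labels

usr   <- par("usr")                     # plot region: x1, x2, y1, y2
width <- diff(usr[1:2])

# Tick labels and a margin title, rotated to read from top to bottom.
text(x = usr[2] + 0.08 * width, y = at, labels = at, srt = 270, xpd = NA)
text(x = usr[2] + 0.20 * width, y = mean(usr[3:4]),
     labels = "Right-hand title", srt = 270, xpd = NA)

invisible(dev.off())
```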
Re: [R] which alternative tests instead of AIC/BIC for choosing models
Cp is either the same thing as AIC, or an approximation to it. So it is not an 'alternative'. See e.g. the discussion in MASS or ?add1. On Wed, 13 Aug 2008, [EMAIL PROTECTED] wrote: By way of partial follow-up to my own question, and on the odd chance anyone else wonders about this issue, some alternatives to this appear to be in the leaps package, which implements the leaps routine (Mallows Cp) and regsubsets. In my case Mallows' Cp does not work either (see below), so I have implemented the following. regr # - holds a zoo object with the 1st column being the dependent variable r2test- (result$lm.r2Rsqr) (all(unlist(lapply(2:(dim(regr)[2]),function(i) summary(lm(regr[,1]~regr[,i]))$adj.r.squared ))0.1)) which.min(leaps(as.matrix(regr[,-1]),regr[,1])$Cp)==dim(regr)[2] leaps on the same problem below === leaps(as.matrix(regr3[,-1]),regr3[,1],method=c(adjr2)) $which 1 2 1 FALSE TRUE 1 TRUE FALSE 2 TRUE TRUE $label [1] (Intercept) 1 2 $size [1] 2 2 3 $adjr2 [1] 0.950757134 0.001681389 0.954859493 leaps(as.matrix(regr3[,-1]),regr3[,1],method=c(Cp)) $which 1 2 1 FALSE TRUE 1 TRUE FALSE 2 TRUE TRUE $label [1] (Intercept) 1 2 $size [1] 2 2 3 $Cp [1] 38.53367 8490.553273.0 Tolga I Uzuner/JPMCHASE 13/08/2008 17:33 To r-help@r-project.org cc Subject which alternative tests instead of AIC/BIC for choosing models Dear R Users, I am looking for an alternative to AIC or BIC to choose model parameters. This is somewhat of a general statistics question, but I ask it in this forum as I am looking for a R solution. Suppose I have one dependent variable, y, and two independent variables, x1 an x2. I can perform three regressions: reg1: y~x1 reg2: y~x2 reg3: y~x1+x2 The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would, presumably, conclude that one should use both x1 and x2. However, the R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is 95.25%. Knowing that, I would actually conclude that x1 adds litte and should probably not be used. 
> There is the overall question of what potentially explains this outcome, i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does not materially improve with the addition of x1 to reg2 (to get to reg3). But that is more of a generic statistics issue and not my question here. The question I do have is: is there a package in R which implements a test and provides some diagnostic information I can use to rule out the use of x1 in a systematic way, as its addition to the equation adds little in terms of explaining the variability of y?
>
> Thanks in advance, Tolga
-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South
Re: [R] The standard deviation of measurement 1 with respect to measurement 2
Firas Swidan <frsswdn at gmail.com> writes:

> Hi, I have two (different types of) measurements, say X and Y, resulting from the same set of experiments. So X and Y are paired: (x_1, y_1), (x_2, y_2), ... I am trying to calculate the standard deviation of Y with respect to X. In other words, in terms of the scatter plot of X and Y, I would like to divide it into bins along the X-axis and for each bin calculate the standard deviation of the Y results in that bin. (Though I am not totally sure, this seems to remind me of the conditional expectation of Y given X - maybe it is called the conditional deviation?) Is there a built-in procedure in R for calculating the above? Otherwise, what would be the easiest way to achieve it? (factors maybe?)
>
> Thankful for the help, Firas.

Something like the following should give you what you want:

    x <- rnorm(50)
    y <- rnorm(50)
    tapply(y, cut(x, 10, include.lowest=TRUE), sd)

     [-2.19,-1.75]   (-1.75,-1.3]   (-1.3,-0.86] (-0.86,-0.415] (-0.415,0.029]
         0.7569111      0.1671267      0.5620591      1.1280510      0.7772356
     (0.029,0.473]  (0.473,0.918]   (0.918,1.36]    (1.36,1.81]    (1.81,2.25]
         0.5600363      0.7681090      0.9754286      0.3184307      0.2410181
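A variation on the binning idea above, offered only as a sketch: equal-width bins can leave very few points in the tails, so one option is to place the bin edges at quantiles of x so that each bin holds a similar number of points. The helper name `binned_sd`, its `n_bins` argument and the simulated data are assumptions made for illustration; they do not come from the thread.

```r
# Hypothetical helper: standard deviation of y within quantile-based bins of x.
# Assumes x has no heavy ties; duplicated quantiles would make cut() fail.
binned_sd <- function(x, y, n_bins = 5) {
  # Bin edges at equally spaced quantiles of x, so bins hold similar counts.
  edges <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1))
  bins <- cut(x, breaks = edges, include.lowest = TRUE)
  data.frame(
    bin = levels(bins),
    n   = as.vector(table(bins)),          # points per bin
    sd  = as.vector(tapply(y, bins, sd))   # SD of y per bin
  )
}

set.seed(1)
x <- rnorm(200)
y <- rnorm(200, sd = 1 + abs(x))  # made-up data: spread of y grows with |x|
res_bins <- binned_sd(x, y)
res_bins
```

Reporting the count next to each standard deviation makes it easier to judge how much weight a given bin deserves.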
Re: [R] which alternative tests instead of AIC/BIC for choosing models
> Dear R Users, I am looking for an alternative to AIC or BIC to choose model parameters. This is somewhat of a general statistics question, but I ask it in this forum as I am looking for an R solution. Suppose I have one dependent variable, y, and two independent variables, x1 and x2. I can perform three regressions:
> reg1: y~x1
> reg2: y~x2
> reg3: y~x1+x2
> The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would, presumably, conclude that one should use both x1 and x2. However, the R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3 is 95.25%. Knowing that, I would actually conclude that x1 adds little and should probably not be used. There is the overall question of what potentially explains this outcome, i.e. the reduction in AIC in going from reg2 to reg3 even though R^2 does not materially improve with the addition of x1 to reg2 (to get to reg3). But that is more of a generic statistics issue and not my question here.

I know you didn't ask the generic statistics question, but I think it's fairly important. I suspect the reason that you're getting (what you consider to be) a spurious result that includes x1, or equivalently that your delta-AICs are so big, is that you have a huge data set. Lindsey (p. 15) talks a bit about calibration that changes with the size of the data set. Model 3 will very probably give you better predictive power than model 2. If you want to select on the basis of improvement in R^2, why not just do that?

Ben Bolker

Lindsey, J. K. 1999. Some Statistical Heresies. The Statistician 48, no. 1: 1-40.
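The sample-size effect described above can be made concrete for least-squares fits with Gaussian errors. For two nested models with an intercept fitted to the same n observations, AIC(small) - AIC(big) = n * log(RSS_small / RSS_big) - 2 * (number of extra parameters), and the RSS ratio equals (1 - R^2_small) / (1 - R^2_big). The sketch below is only an illustration: `delta_aic` is a hypothetical helper, and the sample sizes and simulated data are assumptions, since the original post does not state n.

```r
# Hypothetical helper: AIC(small) - AIC(big) for nested Gaussian linear
# models (both with an intercept) fitted to the same n observations.
delta_aic <- function(n, r2_small, r2_big, extra_params = 1) {
  n * log((1 - r2_small) / (1 - r2_big)) - 2 * extra_params
}

# R^2 values from the post (95% vs 95.25%); the n values are assumptions.
n <- c(50, 100, 1000, 10000)
data.frame(n = n, delta_aic = round(delta_aic(n, 0.95, 0.9525), 2))

# Cross-check of the formula against AIC() on simulated (made-up) data.
set.seed(42)
sim <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
sim$y <- 0.2 * sim$x1 + 3 * sim$x2 + rnorm(500)
fit2 <- lm(y ~ x2, data = sim)
fit3 <- lm(y ~ x1 + x2, data = sim)
c(from_AIC = AIC(fit2) - AIC(fit3),
  from_R2  = delta_aic(500, summary(fit2)$r.squared, summary(fit3)$r.squared))
```

With these R^2 values the formula gives an AIC difference of about 49 at n = 1000, close to the gap of 50 between reg2 and reg3 in the post, but less than 1 at n = 50. The same small gain in R^2 therefore looks decisive or negligible on the AIC scale depending on how much data there is.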
[R] change 3x3 cell size in 1x1 cell size
Hi All, I wish to change the pixel size of my grid from 3x3 to 1x1. I have this function:

    dem.area <- ([EMAIL PROTECTED],[EMAIL PROTECTED],1])*([EMAIL PROTECTED],[EMAIL PROTECTED],1])
    dem.pixelsize <- round(5*sqrt(dem.area/length(ground$Z)),0)
    dem.pixelsize

Where is the input to change?

Thanks, Ale
[R] Conditional statement used in sapply()
Hi, I have data stored in a list that I would like to aggregate and perform some basic stats on. However, I would like to apply conditional statements so that not all the data are used. Basically, I want to get a specific variable and apply some basic functions (such as a mean), but only to the data in each element that match the condition. The code I used is below:

    result <- sapply(res, function(.df) {   # res is the list containing file data
        if (.df$Volume > 0) mean(.df$Volume)   # only have the mean function calculate on values greater than 0
    })

I did get a numeric output; however, when I checked the output value the conditional was ignored (i.e. it did not do anything to the calculation). I also obtained these warning statements:

    Warning messages:
    1: In if (.df$Volume > 0) mean(.df$Volume) : the condition has length > 1 and only the first element will be used
    2: In if (.df$Volume > 0) mean(.df$Volume) : the condition has length > 1 and only the first element will be used

Please let me know what I am doing wrong and how I can apply a conditional statement to the sapply function.

Thanks, Mark
Re: [R] Conditional statement used in sapply()
Hi Mark,

How about this?

    result <- sapply(split(res, res$Volume>0)$`TRUE`, mean)

There is one thing I'm not sure about: is res$Volume a vector or a single numeric?

-gary

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Altaweel, Mark R.
Sent: Wednesday, August 13, 2008 6:03 PM
To: r-help@r-project.org
Subject: [R] Conditional statement used in sapply()
Re: [R] Conditional statement used in sapply()
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Altaweel, Mark R.
> Sent: Wednesday, August 13, 2008 3:03 PM
> To: r-help@r-project.org
> Subject: [R] Conditional statement used in sapply()
>
> Hi, I have data stored in a list that I would like to aggregate and perform some basic stats. However, I would like to apply conditional statements so that not all the data are used. Basically, I want to get a specific variable, do some basic functions (such as a mean), but only get the data in each element's data that match the condition. The code I used is below:
>
> result <- sapply(res, function(.df) {   # res is the list containing file data
>     if (.df$Volume > 0) mean(.df$Volume)   # only have the mean function calculate on values greater than 0
> })

You probably want something such as

    result <- sapply(res, function(.df) {
        mean(.df$Volume[.df$Volume > 0])
    })

HTH
Steve McKinney
Re: [R] Conditional statement used in sapply()
Hi, Yes, that's it. I got the correct results. Thanks everyone for their help once again. This is a great help board.

Mark

-----Original Message-----
From: Steven McKinney [mailto:[EMAIL PROTECTED]
Sent: Wed 8/13/2008 5:29 PM
To: Altaweel, Mark R.; r-help@r-project.org
Subject: RE: [R] Conditional statement used in sapply()
Re: [R] rgl/compiz problem
Barry Rowlingson wrote:

> I have just encountered the problem with rgl where plot3d figures don't interact with the mouse. My plots zoom in and out with the mouse wheel but the mouse buttons do nothing. I can't rotate the plot. This has been mentioned and discussed here and in other lists before, and the solution is to turn off Ubuntu's fancy graphics. Back in March, Ben Bolker said:
>
>> unfortunately rgl and compiz/etc. both try to use the same OpenGL interface, so you can't use both at the same time.
>
> This has echoes of when TCP/IP was in its infancy back in the days of DOS, and only one program could access the network interface at a time (until TCP/IP software got its act together). Is OpenGL really in the same position now? Or is Compiz being greedy in some sense? Surely two OpenGL applications can run at the same time? Or is it because rgl is running 'within' another OpenGL window already, so there's some nesting problem going on?

I think it's an Ubuntu bug, because nothing like it occurs anywhere else. So I'd suggest you turn off compiz or switch to a reliable OS like Windows ;-).

Duncan Murdoch

> Google Earth works fine, and I think that uses OpenGL. Anyone had any ideas since March? I'm on Ubuntu 8.04 and R 2.7.1
>
> Barry
Re: [R] rgl/compiz problem
on 08/13/2008 06:03 PM Duncan Murdoch wrote:

> I think it's an Ubuntu bug, because nothing like it occurs anywhere else. So I'd suggest you turn off compiz or switch to a reliable OS like Windows ;-).

Gack... ;-)

>> Google Earth works fine, and I think that uses OpenGL. Anyone had any ideas since March? I'm on Ubuntu 8.04 and R 2.7.1

Baz, what kind of graphics chipset do you have? ATI, nVidia or Intel?

nVidia is terrible right now and they are being deservedly flamed left and right on the nVidia Linux fora. Their Linux support has deteriorated notably over the past year or so, and the deterioration is more pronounced with the new version of Xorg. Even the 2D support under Linux is worse than what I have seen on co-workers' Linux systems with Intel chipsets that use shared system memory.

I agree with Duncan in that you should disable any of the compiz/compiz-fusion features, which add significant overhead and put a strain on the graphics drivers. Worse if it is nVidia in their current state.

Regards, Marc Schwartz
Re: [R] Conditional statement used in sapply()
Hello -

Altaweel, Mark R. wrote:

> The code I used is below:
>
> result <- sapply(res, function(.df) {   # res is the list containing file data
>     if (.df$Volume > 0) mean(.df$Volume)   # only have the mean function calculate on values greater than 0
> })
>
> I did get a numeric output; however, when I checked the output value the conditional was ignored (i.e. it did not do anything to the calculation). I also obtained these warning statements:
>
> Warning messages:
> 1: In if (.df$Volume > 0) mean(.df$Volume) : the condition has length > 1 and only the first element will be used
>
> Please let me know what am I doing wrong and how can I apply a conditional statement to the sapply function.

Before you think about sapply, what would you do if you had one element of this list? Write a function to do that. You wouldn't do

    if (x$Volume > 0) mean(x$Volume)

because x$Volume > 0 will create a logical vector of length greater than 1 (assuming x$Volume has length greater than 1), and then `if` will issue the warning. You might do

    mean(x$Volume[x$Volume > 0])

and turn it into a function. Then use sapply. Hopefully that gets you started!

Erik
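The advice above (write the function for one element first, then hand it to sapply) can be sketched on a toy list. The data and the helper name `mean_positive` are made up for illustration; only the `Volume` column name comes from the thread. Note that in R 4.2.0 and later, an `if` condition of length greater than one is an error rather than a warning, so the original code would now stop instead of silently using the first element.

```r
# Toy stand-in for the poster's list of data frames (values are made up).
res <- list(
  a = data.frame(Volume = c(-1, 2, 4)),
  b = data.frame(Volume = c(0, 3, 0, 9))
)

# Hypothetical helper: mean of the strictly positive Volume values
# in a single data frame.
mean_positive <- function(.df) {
  keep <- .df$Volume > 0      # logical vector, one entry per row
  mean(.df$Volume[keep])      # NaN if no value passes the filter
}

# Apply the one-element function to every element of the list.
result <- sapply(res, mean_positive)
result
```

The condition is used for subsetting, which is vectorised, rather than inside `if`, which expects a single TRUE or FALSE.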
Re: [R] Comination of two barcharts and one xyplot
At 01:17 14/08/2008, you wrote:

Hi Rhelpers, Thanks a lot, Stephen, for showing me the way to get a data frame into a pasteable format with the dput command. My code is given below with the new correction. This should work, as Stephen says, right off the bat :-)

    ## df1 is the first data frame
    df1 <- structure(list(Year = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 8L, 7L), .Label = c("2003", "2005", "2007", "2009", "2011", "2013", "2015K", "2015M"), class = "factor"),
        KI = c(15.53, 15.64, 16.18, 17.09, 22.39, 33.83, 44.91, 52.22),
        G48 = c(0.3, 0.29, 0.49, 0.67, 0.93, 1.29, 1.83, 2.14),
        AvCell = c(0.24, 0.33, 0.59, 0.91, 1.24, 1.87, 2.71, 3.15),
        HB = c(37.45, 34.64, 30.32, 29.47, 38.03, 58.37, 75.54, 87.71),
        Htens = c(0.76, 1.12, 1.63, 2.27, 3.11, 4.43, 6.28, 7.34),
        Impact = c(1.16, 1.78, 4.23, 6.76, 9.17, 14.06, 20.57, 23.88),
        Struct = c(3.02, 4.2, 6.67, 9.68, 13.18, 19.41, 27.51, 31.98),
        Tens = c(34.05, 32.88, 30.06, 29.25, 37.84, 57.6, 74.5, 86.57),
        Year.ord = structure(1:8, .Label = c("2003", "2005", "2007", "2009", "2011", "2013", "2015M", "2015K"), class = c("ordered", "factor"))),
        .Names = c("Year", "KI", "G48", "AvCell", "HB", "Htens", "Impact", "Struct", "Tens", "Year.ord"),
        row.names = c(NA, -8L), class = "data.frame")

    ## L1 is the second data frame
    L1 <- structure(list(Year = c(2009L, 2011L, 2013L), KIL = c(20, 24, 30), G48L = c(1, 1, 1), AvCellL = c(1, 1.5, 2), HBL = c(30, 35, 40), HtensL = c(2, 3, 4), ImpactL = c(10, 12, 14), StructL = c(10, 13, 16), TensL = c(35, 38, 45)),
        .Names = c("Year", "KIL", "G48L", "AvCellL", "HBL", "HtensL", "ImpactL", "StructL", "TensL"),
        class = "data.frame", row.names = c(NA, -3L))

    ## Use the reshape package to melt the data frame
    library(reshape)
    df1m <- melt(df1, id=c("Year","Year.ord"))

    ## Use the lattice package to plot the barchart
    library(lattice)
    attach(df1m)
    barchart(value~Year.ord|variable, scales=list(y="free",x=list(rot=90)), xlab="Year", ylab="No. of Tests *1000", col="blue")

This plot works just fine.
But I want to go beyond this. What I want, in each panel of the lattice barchart, is to plot histograms of the relevant variable (KI, G48 etc) in one colour for the years 2003 to 2007, and in another colour for the other years. On top of this, I want to have a line plot in each panel with the limits for different years given in the second data frame L1 (as bold lines). I would like to have information on the following points:

1. How can I get a combination of these plots in every panel (two histograms and one line plot)? Is it possible?
2. Is it easier to do this with ggplot?
3. I would like to know how I can present the legend also.

Will appreciate any help that I can get. Thanking You, Ravi

----- Original Message ----
From: stephen sefick [EMAIL PROTECTED]
To: ravi [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Wednesday, 13 August, 2008 3:14:54 PM
Subject: Re: [R] Comination of two barcharts and one xyplot

not reproducible

On Wed, Aug 13, 2008 at 9:07 AM, ravi [EMAIL PROTECTED] wrote:

> Hi Rhelpers, I would like to have some help with a plot which is beyond my capabilities. This plot that I am seeking involves an overlay of two different barcharts and one xyplot. The code that I have used is the following:
>
> #save(df1,file="M:\\KBR\\df1.RData")
> load(file="M:\\KBR\\df1.RData")
> # df1$Year.ord created to obtain the right order i.e. 2015M 2015K
> Year.ord <- ordered(Year,levels=c('2003','2005','2007','2009','20011','2013','2015M','2015K'))
> # Use reshape package to melt the data frame
> library(reshape)
> df1m <- melt(df1,id=c("Year","Year.ord"))
> library(lattice)
> attach(df1m)
> barchart(value~Year.ord|variable,scales=list(y="free",x=list(rot=90)),xlab="Year",ylab="No. of Tests *1000",col="blue")
>
> This plot works just fine. But I want to go beyond this.
> My first data frame (df1) is:
>
> Year,KI,G48,AvCell,HB,Htens,Impact,Struct,Tens,Year.ord
> 1,2003,15.53,0.3,0.24,37.45,0.76,1.16,3.02,34.05,2003
> 2,2005,15.64,0.29,0.33,34.64,1.12,1.78,4.2,32.88,2005
> 3,2007,16.18,0.49,0.59,30.32,1.63,4.23,6.67,30.06,2007
> 4,2009,17.09,0.67,0.91,29.47,2.27,6.76,9.68,29.25,2009
> 5,2011,22.39,0.93,1.24,38.03,3.11,9.17,13.18,37.84,2011
> 6,2013,33.83,1.29,1.87,58.37,4.43,14.06,19.41,57.6,2013
> 7,2015M,44.91,1.83,2.71,75.54,6.28,20.57,27.51,74.5,2015M
> 8,2015K,52.22,2.14,3.15,87.71,7.34,23.88,31.98,86.57,2015K
>
> My second data frame (L1) is:
>
> Year,KIL,G48L,AvCellL,HBL,HtensL,ImpactL,StructL,TensL
> 1,2009,20,1,1,30,2,10,10,35
> 2,2011,24,1,1.5,35,3,12,13,38
> 3,2013,30,1,2,40,4,14,16,45
>
> What I want, in each panel of the lattice barchart, is to plot histograms of the relevant variable (KI, G48 etc) in one colour for the years 2003 to 2007, and in another colour for the other years. On top of this, I want to have a line plot in each panel with the limits for different years given in the second data frame L1 (as bold lines). I would like to have information on the following points:
>
> 1. How can I get a combination of these plots in every panel (two histograms and one line plot)? Is it possible?
> 2. Is it easier to do this with ggplot?
> 3. I would like to know how I can
Re: [R] rgl/compiz problem
My laptop has an nVidia card. Maybe that's why it works?

Simon.

On Wed, 2008-08-13 at 13:17 +, Ben Bolker wrote:

> Barry Rowlingson <b.rowlingson at lancaster.ac.uk> writes:
>
>> Google Earth works fine, and I think that uses OpenGL. Anyone had any ideas since March? I'm on Ubuntu 8.04 and R 2.7.1
>>
>> Barry
>
> Unfortunately, an apparently knowledgeable compiz person said:
>
>     This is a limitation of DRI, DRI2 should fix this, and should hopefully be in most drivers by Xorg 7.5 (maybe 7.6), nvidia has their own implementation, that's why it works on it
>
> http://forum.compiz-fusion.org/showthread.php?t=8462
>
> And poking around, http://www.phoronix.com/scan.php?page=news_item&px=NjYzNw "sometime in 2009" is the closest I could get to finding an expected date when this would be available ...
>
> Ben Bolker

-- Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician Faculty of Biological and Chemical Sciences The University of Queensland St. Lucia Queensland 4072 Australia Room 320 Goddard Building (8) T: +61 7 3365 2506 http://www.uq.edu.au/~uqsblomb email: S.Blomberg1_at_uq.edu.au

Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem.

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.