Re: [R] Pipelining programs in R
You can use system() or pipe() and friends. This is covered in Section 5.1 of 'Writing R Extensions'. Perhaps the simplest way is something like

tmp <- tempfile()
zz <- file(tmp, "w")
# write the input script to zz, e.g. via cat
close(zz)
res <- system(paste("myprog", tmp), intern = TRUE)

and then parse the output from the character vector 'res'.

If you have a Unix-like OS and understand pipes, fifos and sockets you can also use those: see ?pipe. (With a higher degree of understanding you can use some of these on Windows NT.)

It would have been very helpful to have known your OS, a piece of information the posting guide asked for. (Since you talk about 'open'ing an executable, I surmise you are not familiar with traditional OSes which 'run' programs.)

On Thu, 18 May 2006, Dan Rabosky wrote:

Hello... I would like to use R for 'pipelining' data among several programs. I'm wondering how I can use R to call another program, feed that program a set of parameters, and retrieve the output. E.g., I have an executable that, when opened, prompts the user to enter a set of parameters. The program then executes and prompts the user for the name of an output file. I need to run this program on a large batch of parameters, so it would clearly be desirable to automate the process. Is there a straightforward way to do this? I can't find any online documentation addressing this topic, but perhaps I've not been looking in the right place.

In pseudocode, supposing I have a large array of parameters in R:

For each set of parameters:
- Open program.
- Enter parameters.
- Cause program to execute (typically done by simply entering \n after manually entering parameters).
- Enter name of output file.
- Close program.

Any advice will be greatly appreciated!
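The tempfile/system() pattern above can be sketched end-to-end; a hedged sketch in which the Unix `sort` utility stands in for the user's interactive "myprog" (an assumption — any batch-capable command works the same way):

```r
# Write the "input script" to a temporary file, run the program on it,
# and capture its standard output as a character vector.
tmp <- tempfile()
zz <- file(tmp, "w")
cat("banana", "apple", "cherry", sep = "\n", file = zz)
close(zz)
res <- system(paste("sort", tmp), intern = TRUE)  # intern = TRUE captures stdout
unlink(tmp)  # clean up the temporary file
res
```

The same skeleton applies to any program that reads a file named on its command line; parsing 'res' afterwards is ordinary string handling.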
Thanks, Dan Rabosky

Dan Rabosky
Department of Ecology and Evolutionary Biology
237 Corson Hall, Cornell University
Ithaca, NY 14853-2701 USA
web: http://www.birds.cornell.edu/evb/Graduates_Dan.htm

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Converting character strings to numeric
After replies off the list which indicated the code should work, I tried a variety of approaches: rebooting, using the --vanilla option, and then removing the whole lot and reinstalling. It now works. I guess it's another of those Windows things? Thanks to those that helped.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mulholland, Tom
Sent: Friday, 19 May 2006 11:48 AM
To: R-Help (E-mail)
Subject: [R] Converting character strings to numeric

I assume that I have missed something fundamental and that it is there in front of me in An Introduction to R, but I need someone to point me in the right direction.

> x1 <- "1159 1129 1124 -5 -0.44 -1.52"
> x2 <- c("1159", "1129", "1124", "-5", "-0.44", "-1.52")
> x3 <- unlist(strsplit(x1, " "))
> str(x2)
 chr [1:6] "1159" "1129" "1124" "-5" "-0.44" "-1.52"
> str(x3)
 chr [1:6] "1159" "1129" "1124" "-5" "-0.44" "-1.52"
> as.numeric(x2)
[1] 1159.00 1129.00 1124.00   -5.00   -0.44   -1.52
> as.numeric(x3)
[1] 1159 1129 1124   NA   NA   NA
Warning message:
NAs introduced by coercion

What do I have to do to get x3 to be the same as x2?

Tom
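When the string really does contain plain ASCII spaces and hyphen-minus signs, the split-and-coerce approach behaves as expected; a minimal sketch:

```r
x1 <- "1159 1129 1124 -5 -0.44 -1.52"
x3 <- as.numeric(unlist(strsplit(x1, " ")))  # split on single spaces, then coerce
x3
```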
Re: [R] How can you buy R?
Hello,

imho you could buy Quantian, and find several third-party resellers on Dirk's page stated below:

http://dirk.eddelbuettel.com/quantian.html

regards,
christian

Hi all, This may seem like a dumb question, but I work for an entity that is soon converting to XP across the board, and I will lose the ability to install software on my own. The entity has a policy of only using software that has been purchased and properly licensed (whatever that means). This means I will soon lose the ability to use R at work - something I can't do without at this point. HOWEVER, I might be able to work around this policy if I can find a licensed software vendor, preferably in Canada, that sells R. I tried googling R vendors but was unsuccessful. Any ideas?

Thanks, Damien
Re: [R] Converting character strings to numeric
On Fri, 19 May 2006, Mulholland, Tom wrote:

After replies off the list which indicated the code should work, I tried a variety of approaches: rebooting, using the --vanilla option, and then removing the whole lot and reinstalling. It now works. I guess it's another of those Windows things?

No, it works under Windows. What you have not shown us is x3:

> x3
[1] "1159"  "1129"  "1124"  "-5"    "-0.44" "-1.52"

My guess is that you have something invisible in x1, e.g. a nbspace not a space (although that does not fully explain the results). What does

> charToRaw(x1)
 [1] 31 31 35 39 20 31 31 32 39 20 31 31 32 34 20 2d 35 20 2d 30 2e 34 34 20 2d
[26] 31 2e 35 32

give for you?

[...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
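charToRaw() makes such invisible characters visible; a sketch of the diagnostic, where the non-breaking-space byte 0xa0 is an illustrative assumption for the kind of impostor involved:

```r
good <- "1159 1129"
bad  <- "1159\xa01129"   # 0xa0 = latin-1 non-breaking space; looks like a space on screen
charToRaw(good)          # every separator shows up as 20 (ASCII space)
charToRaw(bad)           # the impostor byte shows up as a0
any(charToRaw(bad) == as.raw(0xa0))
```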
Re: [R] Converting character strings to numeric
Hi,

maybe change your R version? Works for me, R 2.3.0pat, W 2000:

> x1 <- "1159 1129 1124 -5 -0.44 -1.52"
> x2 <- c("1159", "1129", "1124", "-5", "-0.44", "-1.52")
> x3 <- unlist(strsplit(x1, " "))
> str(x2)
 chr [1:6] "1159" "1129" "1124" "-5" "-0.44" "-1.52"
> str(x3)
 chr [1:6] "1159" "1129" "1124" "-5" "-0.44" "-1.52"
> as.numeric(x2)
[1] 1159.00 1129.00 1124.00   -5.00   -0.44   -1.52
> as.numeric(x3)
[1] 1159.00 1129.00 1124.00   -5.00   -0.44   -1.52

HTH
Petr

On 19 May 2006 at 11:47, Mulholland, Tom wrote:

Date sent: Fri, 19 May 2006 11:47:54 +0800
From: Mulholland, Tom [EMAIL PROTECTED]
To: R-Help (E-mail) r-help@stat.math.ethz.ch
Subject: [R] Converting character strings to numeric

[...]

Petr Pikal
[EMAIL PROTECTED]
Re: [R] How can you buy R?
Surely the entity is saying you will only be able to use software for which you have a valid licence. They are (rightly) worried about employees installing pirate copies of software which, if audited, could lead to huge fines. While there is plenty of software for which one has to pay for such a licence, R's licence is the GNU GPL - a completely valid and proper licence that gives you a legal right to use it. (If you 'buy R', my understanding is that, under its licence, all you'd be allowed to pay for is the medium it is carried on, not the program itself.)

Stuart

----- Original Message -----
From: Damien Joly [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Thursday, May 18, 2006 10:51 PM
Subject: [R] How can you buy R?

[...]
Re: [R] Converting character strings to numeric
I think you are correct (as expected). I don't know where in the original data the string is, but there is other data doing the same thing.

> strsplit(test, " ")[[1]]
[1] "5159"  "3336"  "3657"  "559"   "3042"  "55"    "307"   "–8"    "16104"
> as.numeric(strsplit(test, " ")[[1]])
[1]  5159  3336  3657   559  3042    55   307    NA 16104
Warning message:
NAs introduced by coercion
> charToRaw(test)
 [1] 35 31 35 39 20 33 33 33 36 20 33 36 35 37 20 35 35 39 20 33 30 34 32 20 35 35 20 33 30 37 20 96 38 20 31 36 31 30 34
> test
[1] "5159 3336 3657 559 3042 55 307 –8 16104"
> x1 <- "5159 3336 3657 559 3042 55 307 -8 16104"
> charToRaw(x1)
 [1] 35 31 35 39 20 33 33 33 36 20 33 36 35 37 20 35 35 39 20 33 30 34 32 20 35 35 20 33 30 37 20 2d 38 20 31 36 31 30 34
> as.numeric(strsplit(x1, " ")[[1]])
[1]  5159  3336  3657   559  3042    55   307    -8 16104

So it looks as if the 96 is throwing it out. I'll dig deeper; I guess there's a bit more pre-processing to do. The only thing that seems slightly strange is that the small example I made up did not use the original data source, but was typed in the same way I did x1 above. However I can't reproduce the error, so it may still be a case of finger trouble on my part.

Tom

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]]
Sent: Friday, 19 May 2006 3:03 PM
To: Mulholland, Tom
Cc: R-Help (E-mail)
Subject: Re: [R] Converting character strings to numeric

[...] My guess is that you have something invisible in x1, e.g. a nbspace not a space (although that does not fully explain the results). What does charToRaw(x1) give for you?
[...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
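Once the culprit byte is known, it can be normalised before coercion. A hedged sketch (the 0x96 byte is the Windows-1252 en dash; useBytes = TRUE makes gsub work bytewise, so the malformed string is not rejected as invalid in a UTF-8 locale):

```r
test  <- "5159 3336 3657 559 3042 55 307 \x968 16104"  # 0x96 where "-8" should begin
clean <- gsub("\x96", "-", test, fixed = TRUE, useBytes = TRUE)
as.numeric(strsplit(clean, " ")[[1]])
```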
[R] factor analysis - discrepancy in results from R vs. Stata
Hi, I found a discrepancy between results in R and Stata for a factor analysis with a promax rotation.

For Stata:

. rotate, factor(2) promax
(promax rotation)

Rotated Factor Loadings
    Variable  |        1         2   Uniqueness
--------------+--------------------------------
 pfq_amanag~y | -0.17802   0.64161     0.70698
 pfq_bwalk_~ø |  0.72569   0.05570     0.41706
 pfq_cwalk_~s |  0.78938  -0.03497     0.41200
 pfq_dkneel~g |  0.80165  -0.04188     0.39979
 pfq_elifting |  0.58700   0.19396     0.46795
 pfq_fhouse~e |  0.50086   0.38770     0.34323
 pfq_gmeals   |  0.03516   0.75884     0.38781
 pfq_hwalki~s |  0.15942   0.52766     0.58543
 pfq_istand~r |  0.46516   0.29058     0.52127
 pfq_jget_i~d |  0.31819   0.43345     0.52934
 pfq_kfork    |  0.02458   0.48797     0.74549
 pfq_ldress~g |  0.11193   0.63987     0.48377
 pfq_mstand~s |  0.73177   0.07817     0.38311
 pfq_nsitti~g |  0.49535   0.16943     0.61545
 pfq_oreach~d |  0.34980   0.27156     0.67887
 pfq_pgrasp~l |  0.26975   0.21778     0.80248
 pfq_qgo_mo~s |  0.25753   0.65296     0.28598
 pfq_rsocia~t |  0.14482   0.72348     0.31770
 pfq_sleisu~e | -0.06316   0.69822     0.56654

For R:

> factanal(x = matrix, factors = 2, rotation = "promax")

Loadings (Factor1 / Factor2; small loadings suppressed in the printout):
pfq_amanage_money    0.769
pfq_bwalk_mileø      0.925
pfq_cwalk_steps      0.977
pfq_dkneeling        0.802  0.152
pfq_elifting         0.812  0.114
pfq_fhouse_chore     0.884
pfq_gmeals           0.920
pfq_hwalking_rooms   0.963
pfq_istand_chair     0.927
pfq_jget_in_out_bed  0.951
pfq_kfork            0.846
pfq_ldressing        0.947
pfq_mstanding_hours  0.844
pfq_nsitting_long    0.795
pfq_oreach_over_head 0.856
pfq_pgrasp_small     0.814
pfq_qgo_movies       0.971
pfq_rsocial_event    0.930
pfq_sleisure_home    0.811

This is just one example -- all other comparisons with a different number of factors, with and without rotation, generated different numbers. Any thoughts from the list members on the reasons for the discrepancy?

thanks,

Ricardo Pietrobon, MD, PhD
Duke University Health System
Re: [R] Converting character strings to numeric
Your minus eight is an en-dash eight, and the two will print much the same in a monospaced font. As to how you get an en dash into a string, it depends how you do it, but I presume this was not entered at an R console.

On Fri, 19 May 2006, Mulholland, Tom wrote:

So it looks as if the 96 is throwing it out. I'll dig deeper; I guess there's a bit more pre-processing to do. [...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
Re: [R] How can you buy R?
On Thursday 18 May 2006 14:51, Damien Joly wrote:

[...]

Well, first, have you pointed out to whatever limited neurons came up with that specification that this will mean part of your job can no longer be done, because their specifications appear to rule out a key tool?

Second, R is available for Windows and works quite well. While there is no charge for R, it IS properly licensed under the GPL. Theoretically, if system security is the actual issue, then the individual in charge of software acquisition can download and install it for you. All of that should be clear and above board, and shouldn't compromise anything unless the entity you work for has become contractually constrained to avoid using open-source software for some obscure and irrational reason. What do they actually expect to gain from this policy?

The _expensive_ alternative is to have them purchase S-PLUS for you. If you present them with an estimated cost, I imagine they might think having the BOFH download R for Windows for you might be the cost-effective way to go.

JD
Re: [R] microarray-like graph
Eric Hu wrote:

Hi, I am beginning to learn R and have a data table for which I would like to produce a microarray-like plot. The table looks like this:

3 0 0 3 -377.61 1.94
3 0 0 3 -444.80 2.36
2 1 0 3 -519.60 2.39
1 1 1 3 -54.88 2.49
2 1 1 4 -536.55 2.53
1 0 1 2 108.29 2.62
2 0 0 2 39.56 2.62
3 0 1 4 108.32 2.63
2 0 0 2 -455.23 2.84
1 0 0 1 -432.30 2.98
...

I would like to assign colors to the first three columns and plot the last column against the fourth column, which is the sum of the first three at each row. Can anyone point me to how to approach this? Thanks for your suggestions.

Hi Eric,
Here is an initial try at your plot. It doesn't look great, but you may be able to use some of the ideas.

> hu.df <- read.table("hu.dat")
> hu.df
   V1 V2 V3 V4      V5   V6
1   3  0  0  3 -377.61 1.94
2   3  0  0  3 -444.80 2.36
3   2  1  0  3 -519.60 2.39
4   1  1  1  3  -54.88 2.49
5   2  1  1  4 -536.55 2.53
6   1  0  1  2  108.29 2.62
7   2  0  0  2   39.56 2.62
8   3  0  1  4  108.32 2.63
9   2  0  0  2 -455.23 2.84
10  1  0  0  1 -432.30 2.98
> hu.col <- rgb(hu.df$V1, hu.df$V2, hu.df$V3, maxColorValue = 3)
> hu.col
 [1] "#BF0000" "#BF0000" "#804000" "#404040" "#804040" "#400040" "#800000"
 [8] "#BF0040" "#800000" "#400000"
> plot(hu.df$V6, hu.df$V4, col = hu.col, pch = 15, cex = 3)

Jim
Re: [R] factor analysis - discrepancy in results from R vs. Stata
I don't believe promax is uniquely defined. Not only are there differences in the criterion (R allows a choice), it is an optimization problem with multiple local optima. In fact the same is true of factanal, and the first thing to check would be to see whether the same FA solution has been found.

On Fri, 19 May 2006, Ricardo Pietrobon wrote:

Hi, I found a discrepancy between results in R and Stata for a factor analysis with a promax rotation. For Stata: [...] This is just one example -- all other comparisons with a different number of factors, with and without rotation, generated different numbers. Any thoughts from the list members on the reasons for the discrepancy? thanks, Ricardo Pietrobon, MD, PhD, Duke University Health System

PLEASE don't send HTML code but properly formatted ASCII text.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
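The point about rotation criteria can be seen directly in R: stats::promax() is a varimax rotation followed by an oblique power ("target") rotation, and its result depends on the chosen power m. A hedged sketch (mtcars is used purely as illustrative data, not the poster's dataset):

```r
fa <- factanal(mtcars, factors = 2, rotation = "none")  # unrotated solution
pm <- promax(loadings(fa), m = 4)  # m is the promax power; 4 is a common default
pm$loadings                        # oblique loadings; compare with varimax(loadings(fa))
```

Different software may use a different power, a different normalization, or land in a different local optimum, so numerically different loadings are not surprising.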
Re: [R] Incomplete Output from lmer{lme4}
Lusk, Jeffrey J (jjlusk at purdue.edu) writes:

... using this approach, but the output for the fixed effects doesn't report a p-value or the degrees of freedom (unlike the examples listed in Faraway's book, which I tried and got the same incomplete output). Any idea how I can get the complete output?

This is a temporary state of lmer (lme4): currently p-values are not printed because Douglas Bates is re-evaluating degrees-of-freedom estimation in his model, and as long as he is not sure, he takes the "better missing than wrong" stand. Hope I quoted him right; the original citation is so well hidden that I remember it but cannot find it any more.

Dieter
Re: [R] Negative value on ternaryplot
Poizot Emmanuel wrote:

Dear all, I found a wonderful package (vcd) able to plot ternary diagrams, i.e. ternaryplot (thanks D. Meyer). The problem is that one of the three variables has negative values. If I use the ternaryplot function, some points fall outside the triangle because the value is negative. Is it possible to make the ternary diagram fit exactly the cloud of points? Regards

Hi Emmanuel,
As Christos has already pointed out, triangle/ternary plots are intended to display triplets of values that add up to a constant. Negative numbers stretch the definition a bit. One could take triax.plot from plotrix and rejig the code to have the axes running from -0.5 to 1.5. This might appear to work, but the interpretation of such a plot would be difficult at best. Perhaps you want a 3D plot of some type, like scatterplot3d?

Jim
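One common workaround, offered here as a hedged sketch rather than anything from the thread, is to shift each variable so that no value is negative and then renormalise each row to sum to 1 before handing the triplets to ternaryplot:

```r
m <- rbind(c(0.5, 0.7, -0.2),
           c(0.2, 0.5,  0.3))        # made-up triplets; one component negative
shift <- pmin(apply(m, 2, min), 0)   # per-column shift, only for columns with minima < 0
shifted <- sweep(m, 2, shift)        # subtract the (negative) minima
props <- shifted / rowSums(shifted)  # each row now sums to 1
props
```

Whether the shifted proportions remain interpretable is, as noted above, the real question.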
Re: [R] uniform and clumped point plots
Beutel, Terry S wrote:

I am trying to generate two-dimensional random coordinates. For randomly distributed data I have simply used

xy <- cbind(runif(100), runif(100))

However I also want to generate coordinates that are more uniformly distributed, and coordinates that are more contagiously distributed, than the above.

Hi Terry,
Not sure exactly what you are trying to do, but if you want to space out overlying points, you might find cluster.overplot in the plotrix package useful. On the other hand, if you want coordinate pairs that are more evenly spaced, maybe something like this:

xy <- cbind(sample(seq(0, 1, length = 101), 100, TRUE),
            sample(seq(0, 1, length = 101), 100, TRUE))

Jim

PS Did you mean contiguously?
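For the clumped ("contagious") case, a simple parent-offspring sketch often suffices: scatter a few parent points, then place offspring around each with small Gaussian jitter. All the numbers below are illustrative assumptions:

```r
set.seed(42)                                   # reproducible example
nparents <- 10; per_clump <- 10
px <- runif(nparents); py <- runif(nparents)   # clump centres
cx <- rep(px, each = per_clump) + rnorm(nparents * per_clump, sd = 0.02)
cy <- rep(py, each = per_clump) + rnorm(nparents * per_clump, sd = 0.02)
xy <- cbind(pmin(pmax(cx, 0), 1),              # clamp to the unit square
            pmin(pmax(cy, 0), 1))
```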
[R] help
Dear Sir,

I am a French student and I am a new user of the R software. After using the command x <- read.delim("clipboard") to read a spreadsheet from Excel, I want to run the BDS test and calculate the Lyapunov exponent. I have loaded the packages tseries and tseriesChaos. When I run

bds.test(x, m = 2)

unfortunately R displays

Error in as.vector(x, mode = "double") : the object (list) cannot be automatically converted to double

So what shall I do to run these two tests (Lyapunov and BDS)? And what is my mistake? I thank you in advance and I am looking forward to your e-mail. This is my e-mail: [EMAIL PROTECTED]
[R] determination of number of entries in list elements
Hi,

is there some elegant way to determine the number of components stored in each list element? Example: the list

> list
$Elem1
[1] "A" "B" "C"

$Elem2
[1] "D"

$Elem3
[1] "E" "F"

Then the normal command length(list) would return 3. But I would like some command to return the array of the single element lengths, like

[1] 3 1 2

so I can afterwards get my list subset with only entries which have a number of components bigger or lower than a certain threshold.

regards
Benjamin
Re: [R] help
karim99.karim wrote:

[...]

It seems that the data import failed: probably the exported Excel data are not in a format accepted correctly by read.delim. Check this. Alternative (maybe): there is an additional package 'gdata' on CRAN which you can download and install. It contains a function read.xls which can read the Excel binary format directly (with some limitations).
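A quick way to check whether an import produced numeric columns, with textConnection standing in for the clipboard (the data here are an illustrative assumption):

```r
txt <- "a\tb\n1\t2.5\n3\t4.5"            # tab-delimited text, as Excel would paste
x <- read.delim(textConnection(txt))     # same reader the poster used
sapply(x, is.numeric)                    # every column should come back numeric
```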
Re: [R] Pipelining programs in R
On Thu, May 18, 2006 at 11:50:24PM -0400, Dan Rabosky wrote:

Hello... I would like to use R for 'pipelining' data among several programs. [...] In pseudocode, supposing I have a large array of parameters in R: for each set of parameters, open program; enter parameters; cause program to execute; enter name of output file; close program.

If your program gets all the input it requires from the command line, you might use the pipe function, as in

f <- pipe("ls -l"); l <- readLines(f)

However, your executable seems to expect its input via its standard input, and you want to read the output it writes to its standard output. To my knowledge, this is not possible with R. I've written some stuff to add such functionality to R; this is available from

http://www2.cmp.uea.ac.uk/~jtk/software/

The filterpipe patch is against an older version of R, though.

Best regards, Jan

+- Jan T. Kim ---+
| email: [EMAIL PROTECTED] |
| WWW: http://www.cmp.uea.ac.uk/people/jtk |
*-= hierarchical systems are for files, not for humans =-*
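In the one-directional cases pipe() does handle, it can also be opened for writing; a small sketch that feeds lines to an external `sort`, assuming a Unix-like shell (the output file name is generated, not fixed):

```r
out <- tempfile()
p <- pipe(paste("sort >", shQuote(out)), open = "w")  # write end of the pipeline
writeLines(c("pear", "fig", "plum"), p)
close(p)             # closing waits for sort to finish writing
readLines(out)
```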
Re: [R] help
x is a data frame, since that is what read.table reads. bds.test is expecting (from the help file)

x: a numeric vector or time series.

so you probably want to pass x[[1]].

On Thu, 18 May 2006, karim99.karim wrote:

[...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
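The distinction can be reproduced without the tseries package; a minimal sketch:

```r
x <- data.frame(v = c(1.2, 3.4, 5.6))  # what read.delim() returns: a data frame (a list)
is.numeric(x)        # FALSE: the whole data frame is not a numeric vector
is.numeric(x[[1]])   # TRUE: the first column is, so pass x[[1]] to bds.test()
```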
[R] R: determination of number of entries in list elements
You need lapply or sapply, for example:

    sapply(yourlist, length)

then you can do

    subset(yourlist, sapply(yourlist, length) > yourlength)

Stefano

-----Original message-----
From: Benjamin Otto
Sent: 19 May 2006 12:10
To: R-Help
Subject: [R] determination of number of entries in list elements

Hi, is there some elegant way to determine the number of components stored in each list element? Example, the list:

    > list
    $Elem1
    [1] "A" "B" "C"
    $Elem1
    [1] "D"
    $Elem1
    [1] "E" "F"

Then the normal command length(list) would return 3. But I would like some command to return the array of the single element lengths, like

    [1] 3 1 2

so I can afterwards get a subset of my list with only the entries which have a number of components bigger or lower than a certain threshold.

regards Benjamin
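[A worked version of the suggestion above; the element names are made up for illustration.]

```r
## Per-element lengths of a list, then subsetting by a length threshold.
yourlist <- list(Elem1 = c("A", "B", "C"), Elem2 = "D", Elem3 = c("E", "F"))
len <- sapply(yourlist, length)   # named integer vector: 3 1 2
yourlist[len > 1]                 # keeps only Elem1 and Elem3
```

Plain `[` subsetting on the list works just as well as subset() here, and avoids computing the lengths twice.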
Re: [R] Converting character strings to numeric
On 5/18/2006 11:47 PM, Mulholland, Tom wrote:

> I assume that I have missed something fundamental and that it is there in front of me in An Introduction to R, but I need someone to point me in the right direction.
>
>     x1 <- "1159 1129 1124 -5 -0.44 -1.52"
>     x2 <- c("1159", "1129", "1124", "-5", "-0.44", "-1.52")
>     x3 <- unlist(strsplit(x1, " "))
>     str(x2)
>      chr [1:6] "1159" "1129" "1124" "-5" "-0.44" "-1.52"
>     str(x3)
>      chr [1:6] "1159" "1129" "1124" "-5" "-0.44" "-1.52"
>     as.numeric(x2)
>     [1] 1159.00 1129.00 1124.00 -5.00 -0.44 -1.52
>     as.numeric(x3)
>     [1] 1159 1129 1124 NA NA NA
>     Warning message: NAs introduced by coercion
>
> What do I have to do to get x3 to be the same as x2?

They should be, and are on my system in 2.2.1 and 2.3.0. Which version/platform are you using?

Duncan Murdoch
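[A reconstruction of the example with the quoting the plain-text mail lost; the gsub cleanup at the end is my own suggestion, not from the thread.]

```r
## Split a whitespace-separated string and coerce the pieces to numeric.
x1 <- "1159 1129 1124 -5 -0.44 -1.52"
x3 <- unlist(strsplit(x1, " "))
as.numeric(x3)
# [1] 1159.00 1129.00 1124.00   -5.00   -0.44   -1.52
## If NAs appear on some platform, a common culprit is invisible
## non-ASCII characters in the pieces; stripping everything that
## cannot be part of a number often cures it:
as.numeric(gsub("[^0-9eE.+-]", "", x3))
```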
[R] iraq statistics - OT
I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?
Re: [R] iraq statistics - OT
Gabor Grothendieck wrote:

> I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?

Well, I don't. But I remain very skeptical anyhow. I had heard earlier that the violent death rate for *American military personnel* in Iraq is lower than the violent death rate in American cities --- which seems more plausible. But still not very plausible. Or maybe major American cities are even worse than we had been led to believe.

cheers, Rolf Turner
Re: [R] iraq statistics - Realllly OT
On 19.05.2006 13:54, Gabor Grothendieck wrote:

> I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?

Is Mr Bush collecting the data? BTW: How about a Parisian jaunt? N. Too many riots. Come on! That's ridiculous!

-- visit the R Graph Gallery: http://addictedtor.free.fr/graphiques mixmod 1.7 is released: http://www-math.univ-fcomte.fr/mixmod/index.php
| Romain FRANCOIS - http://francoisromain.free.fr |
| Doctorant INRIA Futurs / EDF |
Re: [R] iraq statistics - OT
On 5/19/2006 7:54 AM, Gabor Grothendieck wrote:

> I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?

The New York Sun is not a reliable newspaper. It may be completely fabricated, and it seems likely that it is: the population is 26 million. The violent death rate quoted there is 25.71 per 100,000, implying about 6700 deaths per year. There were about 846 American deaths in Iraq in 2005. It doesn't seem credible that there were only 8 deaths (from any violent cause) for each American death. There's a web site at http://www.iraqbodycount.net/press/pr13.php (biased in the opposite direction from the Sun) that claims there were 14000 civilians violently killed in 2005. This probably doesn't include police or members of the armed forces.

Duncan Murdoch
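[The arithmetic behind the figures above, reading the quoted rate as per 100,000:]

```r
## Back-of-envelope check of the implied annual death toll.
rate <- 25.71 / 1e5   # violent deaths per person per year
pop  <- 26e6          # approximate population of Iraq
rate * pop            # implied deaths per year: 6684.6, i.e. roughly 6700
```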
[R] Precision in estimating log
Hi R-users, I have the following code:

    f <- function(x, p) sqrt(-(x^2) - 2*log(1-p))
    r1 <- sqrt(-2*log(1-0.95))
    r2 <- sqrt(-2*log(0.05))

On executing I get the following results:

    > f(r1, 0.95)
    [1] 0
    > f(r2, 0.95)
    [1] NaN
    Warning message: NaNs produced in: sqrt(-(x^2) - 2 * log(1 - p))

I tried to track down the problem and found that the answer to log(0.05) is different from the answer to log(1-0.95), which is of course not true mathematically, and hence it causes problems in the code:

    > print(log(0.05), digits=22)
    [1] -2.9957322735539909
    > print(log(1-0.95), digits=22)
    [1] -2.99573227355399

Any possible explanation?

Regards, Anthony
Re: [R] Precision in estimating log
Anthony, in the same way that we're not allowed to say if(x == 0) if x is a real number, we can't say that 0.05 == 1-0.95: neither value is stored as an exact base-10 number on the computer, but as a binary floating-point approximation, and that representation is not necessarily exact. By analogy, one third (1/3) isn't representable exactly as a decimal number, though it is in base 3. I know it only answers part of your question... but perhaps that helps?

cheers, Sean

On 19/05/06, Gichangi, Anthony wrote:

> I tried to track down the problem and found that the answer to log(0.05) is different from the answer to log(1-0.95), which is of course not true mathematically, and hence it causes problems in the code:
>
>     > print(log(0.05), digits=22)
>     [1] -2.9957322735539909
>     > print(log(1-0.95), digits=22)
>     [1] -2.99573227355399
Re: [R] Precision in estimating log
On 5/19/2006 8:25 AM, Gichangi, Anthony wrote:

> I tried to track down the problem and found that the answer to log(0.05) is different from the answer to log(1-0.95), which is of course not true mathematically, and hence it causes problems in the code:
>
>     > print(log(0.05), digits=22)
>     [1] -2.9957322735539909
>     > print(log(1-0.95), digits=22)
>     [1] -2.99573227355399

R uses finite-precision arithmetic, usually giving around 15-16 digits of accuracy. Your results agree in the first 15 digits.

Duncan Murdoch
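[The standard remedies, as a sketch: compare floating-point results with a tolerance rather than exactly, and clamp tiny negative round-off before taking a square root. The pmax() guard is my own suggestion, not from the thread.]

```r
a <- log(0.05)
b <- log(1 - 0.95)
a == b                     # may be FALSE: the two roundings can differ
isTRUE(all.equal(a, b))    # TRUE within the default tolerance

## A guarded version of the original function: clamp round-off noise
## slightly below zero so sqrt() never sees a tiny negative argument.
f <- function(x, p) sqrt(pmax(0, -(x^2) - 2*log(1 - p)))
f(sqrt(-2*log(0.05)), 0.95)   # essentially 0 instead of NaN
```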
Re: [R] iraq statistics - OT
Rolf Turner [EMAIL PROTECTED] writes:

> Well, I don't. But I remain very skeptical anyhow. I had heard earlier that the violent death rate for *American military personnel* in Iraq is lower than the violent death rate in American cities --- which seems more plausible. But still not very plausible. Or maybe major American cities are even worse than we had been led to believe.

They are... Figures like the ones quoted for South Africa, Colombia, New Orleans &c generally represent the existence of neighbourhoods with total social and law-enforcement breakdown. However, numbers can easily be misleading. I notice that the crude death rate is substantially lower in Iraq than in Canada! The fact that 40% of the Iraqis are less than 15 years of age may have something to do with that... (and with the denominator of the violent death rate too).

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
Re: [R] iraq statistics - OT
I guess it all depends on what you include in the category of violent death. This study is the only one I'm aware of that attempts to address this: http://www.thelancet.com/journals/lancet/article/PIIS0140673604174412/fulltext (there's a registration but I think it's free, can't remember).

-roger

Gabor Grothendieck wrote:

> I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?
Re: [R] iraq statistics - OT
Peter Dalgaard [EMAIL PROTECTED] wrote on 5/19/2006 8:41 am:

> They are... Figures like the ones quoted for South Africa, Colombia, New Orleans &c generally represent the existence of neighbourhoods with total social and law-enforcement breakdown. However, numbers can easily be misleading. I notice that the crude death rate is substantially lower in Iraq than in Canada! The fact that 40% of the Iraqis are less than 15 years of age may have something to do with that... (and with the denominator of the violent death rate too).

Another way this is misleading is that, even if you accept the numbers as given, they are comparing apples and oranges. As Peter points out, there are some very bad neighborhoods in the US cities cited. Tourists don't go to those neighborhoods. Most parts of those cities are much safer. In Iraq, the violence finds everyone.

Peter

Peter L. Flom, PhD Assistant Director, Statistics and Data Analysis Core Center for Drug Use and HIV Research National Development and Research Institutes 71 W. 23rd St http://cduhr.ndri.org www.peterflom.com New York, NY 10010 (212) 845-4485 (voice) (917) 438-0894 (fax)
[R] How to deal with missing data?
Hi All,

This is a question not directly related to R itself; it's about how to deal with missing data. I want to build wind roses, i.e. circular histograms of wind directions and associated speeds, to look for trends or changes in the wind patterns over several decades for some meteo stations. The database I have contains hourly records of wind direction and speed over the past 50 years... obviously that's a huge database! Of course there are a lot of missing data and they are causing problems. Two major problems arise from the temporal distribution of wind records:

1) Data are missing because of station shutdowns (consecutive missing data over days, weeks, months and even years for some stations!!!)
2) In the past, wind records were made only during daytime, while recently they cover both day and night.

On top of these situations, data can also be missing at random. The analysis is complicated by the fact that wind direction is a circular variable, so specific tools must be used to handle it. I know there are different ways to deal with missing data, such as multiple imputation, but most assume Gaussianity of the variables. Moreover, when a record is missing in the database, it is missing for all variables, so it is apparently not possible to use other variables to produce estimates of missing wind records. For now I'm considering the following:

- Look at copula functions to build a bivariate distribution of wind direction and speed, and simulate values out of it to fill in missing data. Produce several estimates of each missing datum to assess the variability of the final results. The bivariate distribution should be modelled for every 5- or 10-year interval to accommodate a possible trend in the data.
- Time series approach: it seems that wind direction and wind speed are autocorrelated over time. But this seems to be due to non-stationarity, since computing the autocorrelation on the first derivative destroys everything (correlation of wind direction is performed using the circular-circular correlation coefficient as defined by Mardia 1976).
- Correlate with other meteo stations: this is a problem because wind patterns are affected by topography, for instance, and even nearby stations may have different wind patterns. Also, the correlation between meteo stations is questionable, since a N wind will first affect northern stations while a S wind will first affect southern stations, so the lagged correlation between stations may appear lower than it should be, I guess.
- Neural networks: a data-driven approach, but since missing data are missing for all variables, I do not have many inputs to feed into the network.
- Data weighting: this sounds stupid, but I tried to give a weight to data according to the time difference between records. Data next to a missing value receive more weight than others, and the weight grows with the number of missing data between two observations. I thought about that because I remember using Voronoi polygons in spatial statistics to weight data according to the monitoring network density. However, I'm not confident in this approach because I don't like the idea of giving a higher weight to a datum simply because it is surrounded by missing values.
- Do nothing! Sometimes it's better to consider raw data rather than applying questionable techniques. Computing wind roses with raw data surely produces artefacts, but...

Well, now you know more or less that I do not know a lot on the topic of missing data and desperately need your help :) If you have some hints on what techniques I may use, or general advice, please let me know.

Thanks a lot, Aziz
Re: [R] Precision in estimating log
Googling for "What Every Computer Scientist Should Know About Floating-Point Arithmetic" gets you to a very enlightening PDF about these issues.

Hth, Ingmar

From: Gichangi, Anthony [EMAIL PROTECTED]
Date: Fri, 19 May 2006 14:25:51 +0200
To: R-help r-help@stat.math.ethz.ch
Subject: [R] Precision in estimating log

> I tried to track down the problem and found that the answer to log(0.05) is different from the answer to log(1-0.95), which is of course not true mathematically, and hence it causes problems in the code. Any possible explanation?
[R] R-OT list needed?
The Iraq Statistics thread is a very interesting topic, and no doubt a lot of us would like to chip in with our views and comments -- indeed we are likely to bring a more discriminating view to such discussion than might be the case on many other lists. So if this took on a life of its own then it could become extensive. At which point the R-help admins would quite rightly call time on us! Yet I can see a good case, exemplified by this thread, for an arrangement where, rather than each of us severally and independently taking the matter off to wherever we respectively discuss such things, we could keep it going and preserve our R-community spirit. Hence I'd like to suggest an addition to the various lists hanging off R, where we could take off-topic subjects. Hence my suggestion of R-OT (though I can see for myself, thank you, what that seems to spell out -- maybe R-Social might be better). What do people think?

Best wishes to all, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 19-May-06 Time: 14:20:46 -- XFMail --
[R] Tick marks in lines.survfit
I posted several months ago about the problem with adding tick marks to curves using lines.survfit. This occurs when lines.survfit is used to add a curve to survival curves plotted with plot.survfit. The help for this function implies that, with mark.time=TRUE, the following:

    plot(pfsfit, conf.int=FALSE, xscale=365.25, yscale=100, xlab="Years",
         ylab="% surviving", lty=2, mark=3)
    lines(osfit, mark=3, col=1, lty=1, xscale=365.24, mark.time=TRUE)

will add tick marks to the added curve in the same way that they appear on the first curve, but no permutation of mark, col, lty, etc. seems to produce tick marks. Has anyone found a solution to this problem?

Rachel
British Society of Blood and Marrow Transplantation
Re: [R] Tick marks in lines.survfit
What did the maintainer say? (This is in the contributed package survival; see the posting guide.) The help page says

    fit <- survfit(Surv(time, status) ~ sex, pbc, subset=1:312)
    plot(fit, mark.time=FALSE, xscale=365.24, xlab='Years', ylab='Survival')
    lines(fit[1], lwd=2, xscale=365.24)  # darken the first curve and add marks

and no marks appear. It looks to me as if the problem is that the code has e.g.

    deaths <- c(-1, ssurv$n.event[who])

and ssurv is a vector. I think that should be x$n.event (in two places).

On Fri, 19 May 2006, Rachel Pearce wrote:

> I posted several months ago about the problem with adding tick marks to curves using lines.survfit. This occurs when lines.survfit is used to add a curve to survival curves plotted with plot.survfit. [...] No permutation of mark, col, lty, etc. seems to produce tick marks. Has anyone found a solution to this problem?

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] iraq statistics - OT
From what the article says, every country should have a war in order to get a lower violent death rate!!

Gabor Grothendieck wrote:

> I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?

-- Angelo M. Mineo Dipartimento di Scienze Statistiche e Matematiche S. Vianelli Università degli Studi di Palermo Viale delle Scienze 90128 Palermo url: http://dssm.unipa.it/elio
Re: [R] R-OT list needed?
On 19 May 2006 at 14:20, (Ted Harding) wrote:

| than you, what that seems to spell out -- maybe R-Social
| might be better).

Perfect! Those with bruises from asking silly or uninformed questions on r-help can refer to that list as ... R-AntiSocial. Just kidding. I'd be up for an off-topic list with a more discerning look at publicly fudged and quoted numbers. Carl Bialik does something related in his Numbers Guy column at the on-line Wall Street Journal, but that requires a subscription.

Dirk

-- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison
Re: [R] iraq statistics - OT
Though I agree that the violent death rate in US cities is sad, I would also guess that those estimates are relatively accurate. I would also say that the experimental design assumed in the article is potentially badly flawed, with tremendous underreporting in Iraq and meticulous reporting in US cities.

Hank

On May 19, 2006, at 10:04 AM, Elio Mineo wrote:

> From what the article says, every country should have a war in order to get a lower violent death rate!!

Dr. M. Hank H. Stevens, Assistant Professor 338 Pearson Hall Botany Department Miami University Oxford, OH 45056 Office: (513) 529-4206 Lab: (513) 529-4262 FAX: (513) 529-4243 http://www.cas.muohio.edu/~stevenmh/ http://www.muohio.edu/ecology/ http://www.muohio.edu/botany/ E Pluribus Unum
Re: [R] R-OT list needed?
From: Dirk Eddelbuettel

> On 19 May 2006 at 14:20, (Ted Harding) wrote:
> | than you, what that seems to spell out -- maybe R-Social
> | might be better).

A ROT-SIG list?

> Perfect! Those with bruises from asking silly or uninformed questions on r-help can refer to that list as ... R-AntiSocial. Just kidding. I'd be up for an off-topic list with a more discerning look at publicly fudged and quoted numbers. Carl Bialik does something related in his Numbers Guy column at the on-line Wall Street Journal, but that requires a subscription.

For more such statistics (and if you've got 24 minutes to spare), see http://video.google.com/videoplay?docid=-869183917758574879.

Andy
Re: [R] How can you buy R?
These beliefs are very prevalent. The IT person for my group doesn't believe in the concept of _free_ software and actually expects me to be arrested some day for using R at work! All I can say is keep the faith.

On 5/19/06, J Dougherty [EMAIL PROTECTED] wrote:

> On Thursday 18 May 2006 14:51, Damien Joly wrote:
>
>> Hi all, This may seem like a dumb question, but I work for an entity that is soon converting to XP across the board, and I will lose the ability to install software on my own. The entity has a policy of only using software that has been purchased and properly licensed (whatever that means). This means I will soon lose the ability to use R at work - something I can't do without at this point. HOWEVER, I might be able to work around this policy if I can find a licensed software vendor, preferably in Canada, that sells R. I tried googling "R vendors" but was unsuccessful. Any ideas?
>
> Well, first, have you pointed out to whatever limited neurons came up with that specification that this will mean part of your job can no longer be done, because their specifications appear to rule out a key tool? Second, R is available for Windows and works quite well. While there is no charge for R, it IS properly licensed under the GPL. Theoretically, if system security is the actual issue, then the individual in charge of software acquisition can download and install it for you. All of that should be clear and above board, and shouldn't compromise anything unless the entity you work for has become contractually constrained to avoid using open-source software for some obscure and irrational reason. What do they actually expect to gain from this policy? The _expensive_ alternative is to have them purchase S-Plus for you. If you present them with an estimated cost, I imagine they might think having the BOFH download R for Windows for you is the cost-effective way to go.
>
> JD
Re: [R] iraq statistics - OT
It seems that, as time goes by, people, including statisticians, forget the past and must re-invent it. Anyone interested should read Richardson's "The Statistics of Deadly Quarrels", in Volume 2 of The World of Mathematics. The book used to be given out as sort of a cracker-jack prize by book clubs everywhere.

Gabor Grothendieck wrote:

> I came across this one: http://www.nysun.com/article/32787 which says that the violent death rate in Iraq (which presumably includes violent deaths from the war) is lower than the violent death rate in major American cities. Does anyone have any insights from statistics on how to interpret this?

-- Bob Wheeler --- http://www.bobwheeler.com/ ECHIP, Inc. --- Randomness comes in bunches.
[R] Fast update of a lot of records in a database?
We have a PostgreSQL table with about 40 records in it. Using either RODBC or RdbiPgSQL, what is the fastest way to update one (or a few) column(s) in a large collection of records? Currently we're sending SQL like

    BEGIN;
    UPDATE table SET col1=value WHERE id=id;  -- repeated thousands of times for different ids
    COMMIT;

and this takes hours to complete. Surely there must be a quicker way?

Duncan Murdoch
[R] trouble with plotrix package
Hello list,

I wrote a simple program to plot data on polar axes, taking advantage of the plotrix package and its function radial.plot. The basic plot works fine, but I am having difficulties with the formatting. There are three problems, but I thought I would attack them one at a time. Here is the first:

If my data set contains values with all vector lengths between 0 and 100 (and various angles), and I set rp.type="s", I get a nice bullseye-type plot with the data shown on a background of concentric circles labeled appropriately. On the other hand, if my data only contain vector-length values from, say, 80 to 90, then the first concentric ring is at 80, and five more rings are scrunched between 80 and 100. It looks like there is an autoscaling feature turned on that makes a fixed number of rings, starting at a "nice" value below the user's lowest data value and extending to a "nice" value above. There is a switch to set the upper bound on rings (radial.lim), but I don't see a way to specify the lower bound. What I want is a bullseye plot that goes from my start value (first ring at 10) to my end value (last ring at 100), independent of the data range.

Thanks!! =Randy=

R. Zelick email: [EMAIL PROTECTED] Department of Biology voice: 503-725-3086 Portland State University fax: 503-725-3888 mailing: P.O. Box 751 Portland, OR 97207 shipping: 1719 SW 10th Ave, Room 246 Portland, OR 97201
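[An untested sketch: in some plotrix versions radial.lim accepts a vector giving both ends of the radial scale (or the full set of ring positions), which would fix both bounds; whether your installed version supports this is an assumption here, so check ?radial.plot.]

```r
## Assumed API: radial.lim as c(lower, upper) pins both ends of the scale.
library(plotrix)
set.seed(1)
lengths <- runif(20, 80, 90)          # data confined to a narrow band
angles  <- runif(20, 0, 2 * pi)
radial.plot(lengths, angles, rp.type = "s",
            radial.lim = c(10, 100))  # rings from 10 out to 100
```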
[R] extending family() objects
Dear R-help, The family collection of objects is very useful since I can perform some of the calculations involved when fitting glms easily in a vectorized manner. I would like to extend them in the following manner: I want to supply a vector of family names such as c("poisson", "gamma", "beta"), and an indexing vector such as c(1,1,1,1,1,1,2,2,2,2,3,3,3,3) (integer elements that go up to the length of the vector of family names). I then want to construct a new type of family object, new.family, such that when calling any of its functions, e.g. new.family$variance(mu), it will apply a different function according to the indexing vector. I could do this quick-and-dirty using loops but I would really like not to lose much of the speed obtained through vectorised calculations. Any ideas? Thanks Simon Bond. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
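One way to keep the per-observation work vectorized is to loop over the (few) families rather than the (many) observations. A rough sketch, under the assumption that the families are supplied as R family constructors (base R provides poisson() and Gamma(); a beta family would have to come from elsewhere) -- the function and argument names here are made up for illustration:

```r
## build a composite "family-like" object from a list of families and an index vector
composite_family <- function(families, index) {
  variance <- function(mu) {
    out <- numeric(length(mu))
    for (k in seq_along(families)) {       # short loop over families only
      sel <- index == k
      out[sel] <- families[[k]]$variance(mu[sel])  # vectorized within each family
    }
    out
  }
  ## other components (linkfun, linkinv, dev.resids, ...) could be built the same way
  list(variance = variance)
}

fam <- composite_family(list(poisson(), Gamma()), index = c(1, 1, 1, 2, 2))
fam$variance(c(1, 2, 3, 4, 5))   # poisson variance is mu; Gamma variance is mu^2
```

The loop cost is proportional to the number of families (3 here), not the number of observations, so almost all of the vectorized speed should be retained.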
Re: [R] Fast update of a lot of records in a database?
I was going to suggest sqlUpdate in RODBC, but it looks like that function also uses the UPDATE command repeated nrow times. A second strategy that I generally prefer because it does not require RODBC (as much) and better supports transaction control is to first create a temporary table with the new columns in it and an identifier column (perhaps using sqlSave). Then you can join the two tables on the identifier column and set the old column to the new column en masse using UPDATE. Often the bottleneck in doing row-by-row updates is searching for the index of the id each time, whereas doing the entire join up front and then updating often speeds this up considerably. In general, if you are ever doing something that resembles a FOR loop in SQL, there's a faster way. Something like this is what I have in mind, although you might need to tweak for PostgreSQL syntax: UPDATE table SET col1 = (SELECT new.col1 FROM table AS old JOIN tempTable AS new ON old.idCol = new.idCol) You should also make sure that your table is indexed well to optimize for updates. HTH, Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
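From R, the temporary-table strategy Robert describes might look roughly like this with RODBC (sqlSave and sqlQuery are real RODBC functions; the DSN, table, and column names are placeholders, and the set-based UPDATE syntax varies by backend):

```r
library(RODBC)

ch <- odbcConnect("mydsn")   # hypothetical DSN

## 1. push the new values into a temporary table in one shot
updates <- data.frame(idCol = ids, col1 = newvals)  # ids/newvals assumed to exist
sqlSave(ch, updates, tablename = "tempTable", rownames = FALSE)

## 2. a single set-based UPDATE joining on the identifier column
##    (PostgreSQL-style syntax; tweak for your database)
sqlQuery(ch, "UPDATE bigtable SET col1 = new.col1
              FROM tempTable AS new
              WHERE bigtable.idCol = new.idCol")

sqlQuery(ch, "DROP TABLE tempTable")
close(ch)
```

The point of the design is that the database performs one join and one bulk write, rather than thousands of separate index lookups.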
Re: [R] Fast update of a lot of records in a database?
Your approach seems very inefficient - it looks like you're executing thousands of update statements. Try something like this instead: #---build a table 'updates' (id and value) ... #---do all updates via a single left join UPDATE bigtable a LEFT JOIN updates b ON a.id = b.id SET a.col1 = b.value; You may need to adjust the syntax. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] optim with specific constraints on possible values
Searching over a finite set of possible values sounds like a variant of integer programming. If the finite number is small, then the most efficient algorithm may be just to compute them all. However, if it's a number like 13^13, then that's not feasible. If the objective function is sufficiently smooth and well defined over all real numbers (at least in some convex set, preferably rectangular), then a good integer programming algorithm may be to find the optimum ignoring the integer constraint first, then search all integer combinations in an appropriate region of the non-integer optimum. Have you tried RSiteSearch("integer programming")? This just produced 205 hits for me. Some were relevant, some weren't. Google also seemed to produce some potentially useful hits. What problem are you trying to solve? If a grid search is used just to simplify the problem, I think that's wise if the objective function is known to have multiple local optima, discontinuities, etc. If, however, the objective function is smooth, then I think 'optim' or 'nlminb' might work better for you, especially if your finite set is merely an attempt to reduce the compute time. Hope this helps, Spencer Graves Camarda, Carlo Giovanni wrote: Dear R-users, I am working with some grid-search optimization over 13 parameters of an objective function. At a glance one may compute the objective function for each possible combination of the sets of parameter values, but in my case that would not be feasible: taking for example 13 possible values for each parameter, like, in logarithm scale, 10^seq(-3,3,.5), will lead to 13^13 combinations of results. As a second trial I used the general-purpose function optim both without and with constraints from the candidate values (of course the latter is fast and fine with me), but I am wondering whether there is a kind of optim-function which optimizes using only a series of values given as additional arguments.
Just inventing, something like: possible1 <- 10^seq(-3,3,0.5) possible2 <- 10^seq(-3,3,0.5) ... possible13 <- 10^seq(-3,3,0.5) new.optim(par=rep(median(possible1), 13), fn=my.object.function, from=cbind(possible1, possible2, ..., possible13)) Instead of just: optim(par=rep(median(possible1), 13), fn=my.object.function, method="L-BFGS-B", lower=rep(min(possible1), 13), upper=rep(max(possible1), 13)) Thanks in advance, Carlo Giovanni Camarda === Camarda Carlo Giovanni PhD-Student Max Planck Institute for Demographic Research Konrad-Zuse-Strasse 1 18057 Rostock, Germany Tel: +49 (0)381 2081 172 Fax: +49 (0)381 2081 472 [EMAIL PROTECTED] === -- This mail has been sent through the MPI for Demographic Rese...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
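Spencer's two-stage idea (find the continuous optimum first, then snap to the nearest allowed values) can be sketched as follows; the objective function here is a made-up placeholder, and a more careful version would also check the neighbouring grid points of the snapped solution:

```r
grid <- 10^seq(-3, 3, 0.5)                 # candidate values for every parameter
npar <- 3                                   # 3 instead of 13, for a quick demo
f <- function(p) sum((log10(p) - 0.7)^2)   # hypothetical smooth objective

## stage 1: continuous optimum, constrained to the grid's bounding box
cont <- optim(rep(1, npar), f, method = "L-BFGS-B",
              lower = rep(min(grid), npar), upper = rep(max(grid), npar))

## stage 2: snap each parameter to the nearest grid value and re-evaluate
snap <- vapply(cont$par, function(p) grid[which.min(abs(grid - p))], numeric(1))
f(snap)   # objective at the best nearby grid point
```

This evaluates the objective only a handful of times instead of 13^13 times, at the cost of assuming the snapped point is near the constrained optimum.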
[R] How to use lm.predict to obtain fitted values?
I am writing a function to assess the out of sample predictive capabilities of a time series regression model. However lm.predict isn't behaving as I expect it to. What I am trying to do is give it a set of explanatory variables and have it give me a single predicted value using the lm fitted model. model = lm(y~x) newdata=matrix(1,1,6) pred = predict.lm(model,data.frame(x=newData)); Warning message: 'newdata' had 6 rows but variable(s) found have 51 rows. pred = predict.lm(model,data.frame(newData)); Warning message: 'newdata' had 6 rows but variable(s) found have 51 rows. y is a vector of length 51. x is a 6x51 matrix newdata is a matrix of the explanatory variables I'd like a prediction for. The predict.lm function is giving me 51 (=number of observations I had) numbers, rather than the one number I do want - the predicted value of y, given the values of x I have supplied it. Many thanks, R __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Fast update of a lot of records in a database?
Thanks to Robert McGehee and Bogdan Romocea for their responses. Putting them together, I think the following will do what I want: put the updates into a temporary table called updates, then UPDATE bigtable AS a SET col1 = b.col1 FROM updates AS b WHERE a.id = b.id The FROM clause is a PostgreSQL extension. This is not portable, but MySQL does it with different syntax: UPDATE bigtable AS a, updates AS b SET a.col1 = b.col1 WHERE a.id = b.id I don't think SQLite supports updating one table from another. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to use lm.predict to obtain fitted values?
data.frame(x=newData) will not have any entries called x: you supplied newData, so assuming you meant newdata,

> data.frame(x=newdata)
  x.1 x.2 x.3 x.4 x.5 x.6
1   1   1   1   1   1   1

has 6 columns, none of which is labelled x. If you read the help for lm, it does not mention having a matrix on the rhs of a formula, and the help for data.frame does explain how it works. predict(model, data.frame(x=I(newData))) might work. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
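A self-contained version of the suggested fix; note the predictors here are laid out as 51 rows by 6 columns (the transpose of the 6x51 matrix in the original post), since lm() wants one row per observation, and the data are simulated only for illustration:

```r
set.seed(1)
x <- matrix(rnorm(51 * 6), 51, 6)   # 51 observations, 6 explanatory variables
y <- rnorm(51)
model <- lm(y ~ x)                  # model frame holds one matrix variable "x"

newData <- matrix(1, 1, 6)          # a single new case

## wrap the matrix in I() so the new data frame has one column named "x",
## matching the variable name in the fitted model
pred <- predict(model, data.frame(x = I(newData)))
length(pred)                        # a single predicted value, as wanted
```

Without I(), data.frame() splits the matrix into columns x.1, ..., x.6, none of which matches the model variable, which is exactly why predict falls back to the 51 fitted values.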
Re: [R] How to use lm.predict to obtain fitted values?
I have had a similar issue recently, and looking at the archives of this list, I see other cases of it as well. It took me a while to figure out that the variable name in the data frame must be identical to the variable name in the model. I don't see this mentioned in the documentation of predict.lm, and R issues no warning in this case. How would I go about officially requesting that this is mentioned, either in the documentation, or as a warning? Sincerely, Larry Howe __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Weird LM behaviour
Dear R users, experimenting with the lm function in R, I've encountered some behaviour I don't understand with my limited knowledge of regression. I made a data-'set' of three measurements (see syntax below). Using lm (linear model) to fit the regression-line, I expected to find an intercept of 2.0 and a slope of 0, but in fact the slope is slightly below zero. Amazed by this behaviour, I decided to shift the measurements towards each other, create a model using lm and analyse the resulting slopes. Plotting these, I found that they are increasingly deviant from zero when the difference between the three values is smaller. More fascinating, some of the slopes seem to be relatively large in comparison with other resulting slopes (sort of outliers). How is this possible? Is this a rounding-problem or is it the way R creates the linear model? Using SPSS results in b-values of zero. I'm using R 2.3 on MacOS 10.4.6, on a G4 processor. The syntax I used is printed below. Any explanations of this behaviour are really appreciated. Rense Nieuwenhuis

#--
x <- c(0,1,1)
y <- c(2,1,3)
plot(x,y)
b <- 1
length <- 100
for(i in 1:length) {
  x[1] <- i/length
  model <- lm(y~x)
  b[i] <- model$coefficients[2]
}
plot(b)
#

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Fast update of a lot of records in a database?
put the updates into a temporary table called updates UPDATE bigtable AS a FROM updates AS b WHERE a.id = b.id SET a.col1 = b.col1 I don't think this will be any faster - why would creating a new table be faster than updating existing rows? I've never had a problem with using large numbers of SQL update statements (in the order of hundreds of thousands) to update a table and having them complete in a reasonable time (a few minutes). How heavily indexed is the field you are updating? You may be able to get some speed improvements by turning off indices before the update and back on again afterwards (highly dependent on your database system though). I would strongly suspect your bottleneck lies elsewhere (e.g. when generating the statements in R, or using ODBC to send them). Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How can you buy R?
On 5/19/06, J Dougherty [EMAIL PROTECTED] wrote: While there is no charge for R, it IS properly licensed under the GPL. At one company I was working for, I had to run all the licenses of all the software I had on my machine through the legal department. When they read GNU Public License (GPL) their only comment was: We have no idea what that license means. Do not touch any software using it. Jarek On 5/19/06, J Dougherty [EMAIL PROTECTED] wrote: On Thursday 18 May 2006 14:51, Damien Joly wrote: Hi all, This may seem like a dumb question, but I work for an entity that is soon converting to XP across the board, and I will lose the ability to install software on my own. The entity has a policy of only using software that has been purchased and properly licensed (whatever that means). This means I will soon lose the ability to use R at work - something I can't do without at this point. HOWEVER, I might be able to work around this policy if I can find a licensed software vendor, preferably in Canada, that sells R. I tried googling R vendors but was unsuccessful. Any ideas? Well, first, have you pointed out to whatever limited neurons came up with that specification, that this will mean that part of your job can no longer be done because their specifications appear to rule out a key tool? Second, R is available for Windows and works quite well. While there is no charge for R, it IS properly licensed under the GPL. Theoretically, if system security is the actual issue, then the individual in charge of software acquisition can download and install it for you. All of that should be clear and above board and shouldn't compromise anything unless the entity you work for has become contractually constrained to avoid using OSS for some obscure and irrational reason. What do they actually expect to gain from this policy? The _expensive_ alternative is to have them purchase S-Plus for you.
If you present them with an estimated cost, I imagine they might think having the BOFH download R for Windows for you might be the cost-effective way to go. JD __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] make error for R-2.3.0
Hello, I'm trying to install R on a linux machine running Red Hat 8. I ran ./configure make and get the following error. I've installed several versions of R (2.2.1 most recently) on this machine and haven't had any problems until now. I wondered if the outdated compiler (gcc version 3.2) was the problem and attempted to install my own, more recent version. I tried gcc versions 4.1.0 and 3.4.6, but still have problems. The output below is using gcc 3.4.6 (my best attempt, but still ending with the same error as gcc 3.2). Any pointers would be appreciated. Best, Randy system: i686-pc-linux-gnu gcc version: 3.4.6 everything seems to be fine up to this point: . . . make[4]: Entering directory `/home/lgd/rjohnson/bin/R-2.3.0/src/modules/internet' gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fpic -g -O2 -std=gnu99 -c Rsock.c -o Rsock.o gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fpic -g -O2 -std=gnu99 -c internet.c -o internet.o gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fpic -g -O2 -std=gnu99 -c nanoftp.c -o nanoftp.o gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fpic -g -O2 -std=gnu99 -c nanohttp.c -o nanohttp.o gcc -I. -I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fpic -g -O2 -std=gnu99 -c sock.c -o sock.o gcc -I. 
-I../../../src/include -I../../../src/include -I/usr/local/include -DHAVE_CONFIG_H -fpic -g -O2 -std=gnu99 -c sockconn.c -o sockconn.o In file included from sockconn.c:34: sock.h:38: error: syntax error before Sock_read sock.h:38: warning: type defaults to `int' in declaration of `Sock_read' sock.h:38: warning: data definition has no type or storage class sock.h:39: error: syntax error before Sock_write sock.h:39: warning: type defaults to `int' in declaration of `Sock_write' sock.h:39: warning: data definition has no type or storage class make[4]: *** [sockconn.o] Error 1 make[4]: Leaving directory `/home/lgd/rjohnson/bin/R-2.3.0/src/modules/internet' make[3]: *** [R] Error 2 make[3]: Leaving directory `/home/lgd/rjohnson/bin/R-2.3.0/src/modules/internet' make[2]: *** [R] Error 1 make[2]: Leaving directory `/home/lgd/rjohnson/bin/R-2.3.0/src/modules' make[1]: *** [R] Error 1 make[1]: Leaving directory `/home/lgd/rjohnson/bin/R-2.3.0/src' make: *** [R] Error 1 ~~ Randall C Johnson Bioinformatics Analyst SAIC-Frederick, Inc (Contractor) Laboratory of Genomic Diversity NCI-Frederick, P.O. Box B Bldg 560, Rm 11-85 Frederick, MD 21702 Phone: (301) 846-1304 Fax: (301) 846-1686 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How can you buy R?
On Thursday 18 May 2006 14:51, Damien Joly wrote: Hi all, HOWEVER, I might be able to work around this policy if I can find a licensed software vendor, preferably in Canada, that sells R. I tried googling R vendors but was unsuccessful. Any ideas? Would cheapbytes ( http://www.cheapbytes.com/ ) work? http://shop.cheapbytes.com/cgi-bin/cart/0070010796.html although this page looks awfully old ... Quantian may be better/more recent http://finzi.psych.upenn.edu/R/Rhelp02a/archive/38930.html although you might have to convince your legal people that this Linux software would also be legal on Windows ... what is a licensed software vendor? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Weird LM behaviour
No, not weird. Think of it this way. As you move point (0,2) to (1,2), the slope, which was 0, is moving towards infinity. Eventually the 3 points are perfectly vertical and so must have infinite slope. Your delta-x is not sufficiently granular to show the slope change for x-values very close to 1 but not yet 1, like 0.9. Note lm returns NA when x=1. As for the localized variation, my guess is these are an artifact of having 3 data points. -jason - Original Message - From: Rense Nieuwenhuis [EMAIL PROTECTED] To: R-help@stat.math.ethz.ch Sent: Friday, May 19, 2006 11:51 AM Subject: [R] Weird LM behaviour __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
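The effect is easy to see from the closed-form slope of a simple regression, cov(x, y) / var(x): as x[1] approaches 1, the denominator goes to zero, and for these particular y values the numerator is exactly zero in exact arithmetic, so the reported slope becomes floating-point rounding error divided by a vanishing variance. A short sketch (not from the original thread):

```r
x <- c(0.99, 1, 1)
y <- c(2, 1, 3)
cov(x, y) / var(x)     # same value as coef(lm(y ~ x))[2]; ~0 up to rounding

x[1] <- 0.9999999      # nearly vertical data: var(x) is ~ 1e-15
cov(x, y) / var(x)     # tiny rounding errors in cov() are hugely amplified
```

SPSS reporting exactly zero simply means its arithmetic (or display rounding) happened to cancel the numerator exactly; neither answer is "wrong" so much as ill-conditioned.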
[R] hello, everyone
Hello, R people: I have a question in using the fSeries package--the functions garchFit and garchOxFit. If adding a regression to the mean formula, how does one estimate the model in R, using garchFit or garchOxFit? For example, the observations are {x,y}_t, and there may be some relation between x and y. The model is y_t = gamma0 + gamma1*x_t + psi*e_{t-1} + e_t, where gamma1*x_t is the regression term, e_t = sqrt(h_t)*N(0,1), and h_t = alpha0 + alpha1*e_{t-1}^2 + beta*h_{t-1} ~~~ GARCH(1,1). I don't know how to estimate the model using garchFit or garchOxFit or other functions, because the argument in garchFit/garchOxFit is formula.mean = ~arma(1,1). Do you have some instructions? Thank you very much for your help. Best wishes Ma Yuchao __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] make error for R-2.3.0
On Fri, 2006-05-19 at 15:37 -0400, Randall C Johnson [Contr.] wrote: SNIP of sock.h related errors Randy, This looks like the same issue that was reported on r-devel back at the end of April for RH 9. Download the latest r-patched tarball and you should be OK. Prof. Ripley made some changes to sock.h that should get around these issues. Unfortunately, they were not reported until after the release of 2.3.0. Download from here: ftp://ftp.stat.math.ethz.ch/Software/R/R-patched.tar.gz It might be time to consider updating your system, since RH 8.0 is not even supported by the Fedora Legacy folks any longer. That means no functional or security updates. Best regards, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] trouble with plotrix package
Randy Zelick zelickr at pdx.edu writes: There is a switch to set the upper bound on rings (radial.lim) but I don't see a way to specify the lower bound. What I want is a bullseye plot that goes from my start value (first ring at 10) to my end value (last ring at 100) independent of the data range. Thanks!! =Randy=

I would suggest dumping radial.plot to a file and making the following hacks:
- change the default value of radial.lim from NA to range(lengths)
- delete the second and third lines of code that set radial.lim to max(lengths) if it is NA
- change grid.pos to be set to pretty(radial.lim)

this behavior seems more sensible to me -- if one wanted to make it backward compatible one could hack it with something like

if (!missing(radial.lim) && length(radial.lim) == 1)
    radial.lim <- c(min(lengths), radial.lim)

perhaps Jim Lemon will want to incorporate these changes

the following verbose diff file says the same thing.

***************
*** 2,12 ****
      label.prop = 1.1, main = "", xlab = "", ylab = "",
      line.col = par("fg"), mar = c(2, 2, 3, 2), show.grid = TRUE,
      grid.col = "gray", grid.bg = par("bg"), point.symbols = NULL,
      point.col = NULL,
!     show.centroid = FALSE, radial.lim = NA, ...)
  {
      length.dim <- dim(lengths)
-     if (is.na(radial.lim))
-         radial.lim <- max(lengths)
      if (is.null(length.dim)) {
          npoints <- length(lengths)
          nsets <- 1
--- 2,10 ----
      label.prop = 1.1, main = "", xlab = "", ylab = "",
      line.col = par("fg"), mar = c(2, 2, 3, 2), show.grid = TRUE,
      grid.col = "gray", grid.bg = par("bg"), point.symbols = NULL,
      point.col = NULL,
!     show.centroid = FALSE, radial.lim = range(lengths), ...)
  {
      length.dim <- dim(lengths)
      if (is.null(length.dim)) {
          npoints <- length(lengths)
          nsets <- 1
***************
*** 23,29 ****
      radial.pos <- matrix(rep(radial.pos, nsets), nrow = nsets, byrow = TRUE)
      if (show.grid) {
!         grid.pos <- pretty(c(lengths, radial.lim))
          if (grid.pos[1] <= 0)
              grid.pos <- grid.pos[-1]
          maxlength <- max(grid.pos)
--- 21,27 ----
      radial.pos <- matrix(rep(radial.pos, nsets), nrow = nsets, byrow = TRUE)
      if (show.grid) {
!         grid.pos <- pretty(radial.lim)
          if (grid.pos[1] <= 0)
              grid.pos <- grid.pos[-1]
          maxlength <- max(grid.pos)
***************
*** 96,98 ****
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] e1071 probplot -grouping
Hello, I am currently using the probplot function in the e1071 package to do cumulative probability plots. I want to be able to do multiple cumulative probability plots (based on a grouping of data) on a single plot. Any help with this would be greatly appreciated. Thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
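As far as I know, e1071::probplot does not take a grouping argument, so one workaround is to draw one empirical cumulative-distribution curve per group on a shared plot with base R's ecdf() instead. A sketch with simulated data (the data frame and group labels are made up for illustration):

```r
set.seed(1)
dat <- data.frame(value = c(rnorm(50, mean = 0), rnorm(50, mean = 1)),
                  grp   = rep(c("a", "b"), each = 50))

grps <- split(dat$value, dat$grp)             # one vector per group

## first group sets up the axes; later groups are overlaid with add = TRUE
plot(ecdf(grps[[1]]), main = "Cumulative probability by group",
     xlab = "value", ylab = "F(value)")
for (k in seq_along(grps)[-1])
    plot(ecdf(grps[[k]]), add = TRUE, col = k)

legend("bottomright", legend = names(grps), col = seq_along(grps), pch = 16)
```

If the probability-paper scaling of probplot is specifically needed (rather than a plain ECDF), the same overlay idea applies but the y-values would be transformed with the appropriate quantile function first.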
Re: [R] Fast update of a lot of records in a database?
Hadley, There are several reasons that running one large load and one large update would be significantly faster than thousands of individual updates. First, the time it takes to execute a query does not grow linearly with the size of the query. That is, the statement SELECT TOP 100 * FROM table takes about 1.8 times as long as SELECT TOP 10 * FROM table, not 10 times longer (using an estimated query cost on tables in my database using MS-SQL). The reason is that SQL is optimized to perform well for large queries, and many of the steps used in a SQL operation are needlessly repeated when multiple UPDATE/SELECT statements are given rather than one large UPDATE/SELECT. For instance, on most SQL UPDATEs, most of the time is spent on 1) sorting the input, 2) performing a clustered index seek, and 3) performing a clustered index update. In a toy example using UPDATE, the physical operation of sorting 2000 rows takes only 6 times longer than sorting a little over 100 rows. The clustered index seek and update take about 10 times longer. This, however, is far less than the 20x longer we would expect from doing a linear row-by-row update. So even if it takes an additional 50% longer to first load the data into a temporary table, we still see the opportunity for large speed increases. A second reason we would expect one large query to run faster is that it is much easier to process in parallel on multiple processors. That is, one processor can be joining the tables while a second simultaneously performs the clustered indexing. For a bunch of single UPDATE statements, we are forced to run the operation in serial. Thus, if the above examples were more complicated, we should expect an even larger cost saving per row.
From your example, a third reason is that in multiple updates, the SQL server (at least my MS-SQL server) updates the transaction log after every query (unless you wisely run in batch mode as Duncan did with his BEGIN/COMMIT syntax), and thus significantly more I/O time is spent between each UPDATE statement. For instance, the RODBC sqlUpdate function does not take advantage of transaction control, so to speed up long queries, I've often resorted to sending over temporary tables (as I suggested here), stored procedures, or even data stored as XML tables. Lastly, removing your indices before the update would likely only slow down the query. If the table is not indexed on the id key, then the SQL server has to search through the entire table to find the matching id before it can be updated. It would be like searching through a dictionary that wasn't in alphabetical order. That said, indices can slow down queries when a significant number of rows are being added, as you then have to reindex the table when the insert completes. However, Duncan isn't doing that here. Best, Robert -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of hadley wickham Sent: Friday, May 19, 2006 3:20 PM To: Duncan Murdoch Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch; [EMAIL PROTECTED] Subject: Re: [R] Fast update of a lot of records in a database? put the updates into a temporary table called updates UPDATE bigtable AS a FROM updates AS b WHERE a.id = b.id SET a.col1 = b.col1 I don't think this will be any faster - why would creating a new table be faster than updating existing rows? I've never had a problem with using large numbers of SQL update statements (in the order of hundreds of thousands) to update a table and having them complete in a reasonable time (a few minutes). How heavily indexed is the field you are updating?
You may be able to get some speed improvements by turning off indices before the update and turning them back on again afterwards (highly dependent on your database system though). I would strongly suspect your bottleneck lies elsewhere (e.g. when generating the statements in R, or using ODBC to send them). Hadley
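The load-then-one-update pattern Robert describes has a direct analogue on the R side of such a pipeline: match the keys once and assign in a single vectorized step, rather than looping row by row (the table and column names here are invented):

```r
set.seed(1)
bigtable <- data.frame(id = 1:100000, col1 = 0)
updates  <- data.frame(id = sample(1:100000, 5000), col1 = 1)

i <- match(updates$id, bigtable$id)  # one "join" on the id key
bigtable$col1[i] <- updates$col1     # one vectorized update, not 5000 UPDATEs
```

The same principle applies when generating SQL: one statement against a loaded temporary table instead of one statement per row.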
Re: [R] make error for R-2.3.0
That's exactly what I needed. thanks to all! Randy On 5/19/06 4:02 PM, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote: On Fri, 2006-05-19 at 15:37 -0400, Randall C Johnson [Contr.] wrote: Hello, I'm trying to install R on a linux machine running Red Hat 8. I ran ./configure make and get the following error. I've installed several versions of R (2.2.1 most recently) on this machine and haven't had any problems until now. I wondered if the outdated compiler (gcc version 3.2) was the problem and attempted to install my own, more recent version. I tried gcc versions 4.1.0 and 3.4.6, but still have problems. The output below is using gcc 3.4.6 (my best attempt, but still ending with the same error as gcc 3.2). Any pointers would be appreciated. Best, Randy SNIP of sock.h related errors Randy, This looks like the same issue that was reported on r-devel back at the end of April for RH 9. Download the latest r-patched tarball and you should be OK. Prof. Ripley made some changes to sock.h that should get around these issues. Unfortunately, they were not reported until after the release of 2.3.0. Download from here: ftp://ftp.stat.math.ethz.ch/Software/R/R-patched.tar.gz It might be time to consider updating your system, since RH 8.0 is not even supported by the Fedora Legacy folks any longer. That means no functional or security updates. Best regards, Marc Schwartz ~~ Randall C Johnson Bioinformatics Analyst SAIC-Frederick, Inc (Contractor) Laboratory of Genomic Diversity NCI-Frederick, P.O. Box B Bldg 560, Rm 11-85 Frederick, MD 21702 Phone: (301) 846-1304 Fax: (301) 846-1686 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] make error for R-2.3.0
Marc Schwartz (via MN) [EMAIL PROTECTED] writes: Download the latest r-patched tarball and you should be OK. Prof. Ripley made some changes to sock.h that should get around these issues. Unfortunately, they were not reported until after the release of 2.3.0. Download from here: ftp://ftp.stat.math.ethz.ch/Software/R/R-patched.tar.gz It might be time to consider updating your system, since RH 8.0 is not even supported by the Fedora Legacy folks any longer. That means no functional or security updates. Even better, try R-2.3.1beta available from http://cran.r-project.org/src/base-prerelease/ -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] lmer, p-values and all that
Users are often surprised and alarmed that the summary of a linear mixed model fit by lmer provides estimates of the fixed-effects parameters, standard errors for these parameters and a t-ratio but no p-values. Similarly the output from anova applied to a single lmer model provides the sequential sums of squares for the terms in the fixed-effects specification and the corresponding numerator degrees of freedom but no denominator degrees of freedom and, again, no p-values. Because they feel that the denominator degrees of freedom and the corresponding p-values can easily be calculated they conclude that failure to do this is a sign of inattention or, worse, incompetence on the part of the person who wrote lmer (i.e. me). Perhaps I can try again to explain why I don't quote p-values or, more to the point, why I do not take the obviously correct approach of attempting to reproduce the results provided by SAS. Let me just say that, although there are those who feel that the purpose of the R Project - indeed the purpose of any statistical computing whatsoever - is to reproduce the p-values provided by SAS, I am not a member of that group. If those people feel that I am a heretic for even suggesting that a p-value provided by SAS could be other than absolute truth and that I should be made to suffer a slow, painful death by being burned at the stake for my heresy, then I suppose that we will be able to look forward to an exciting finale to the conference dinner at UseR!2006 next month. (Well, I won't be looking forward to such a finale but the rest of you can.) As most of you know the t-statistic for a coefficient in the fixed-effects model matrix is the square root of an F statistic with 1 numerator degree of freedom so we can, without loss of generality, concentrate on the F statistics that were present in the anova output. 
Those who long ago took courses in analysis of variance or experimental design that concentrated on designs for agricultural experiments would have learned methods for estimating variance components based on observed and expected mean squares and methods of testing based on error strata. (If you weren't forced to learn this, consider yourself lucky.) It is therefore natural to expect that the F statistics created from an lmer model (and also those created by SAS PROC MIXED) are based on error strata, but that is not the case. The parameter estimates calculated by lmer are the maximum likelihood or the REML (residual maximum likelihood) estimates and they are not based on observed and expected mean squares or on error strata. And that's a good thing because lmer can handle unbalanced designs with multiple nested or fully crossed or partially crossed grouping factors for the random effects. This is important for analyzing data from large observational studies such as occur in psychometrics. There are many aspects of the formulation of the model and the calculation of the parameter estimates that are very interesting to me and have occupied my attention for several years, but let's assume that the model has been specified, the data given and the parameter estimates obtained. How are the F statistics calculated? The sums of squares and degrees of freedom for the numerators are calculated as in a linear model. There is a slot in an lmer model that is similar to the effects component in an lm model and that, along with the assign attribute for the model matrix, provides the numerator of the F ratio. The denominator is the penalized residual sum of squares divided by the REML degrees of freedom, which is n-p where n is the number of observations and p is the column rank of the model matrix for the fixed effects. Now read that last sentence again and pay particular attention to the word 'the' in the phrase 'the penalized residual sum of squares'.
All the F ratios use the same denominator. Let me repeat that - all the F ratios use the *same* denominator. This is why I have a problem with the assumption (sometimes stated as more than just an assumption - something on the order of absolute truth again) that the reference distribution for these F statistics should be an F distribution with a known numerator degrees of freedom but a variable denominator degrees of freedom and we can answer the question of how to calculate a p-value by coming up with a formula to assign different denominator degrees of freedom for each test. The denominator doesn't change. Why should the degrees of freedom for the denominator change? Most of the research on tests for the fixed-effects specification in a mixed model begins with the assumption that these statistics will have an F distribution with a known numerator degrees of freedom and the only purpose of the research is to decide how to obtain an approximate denominator degrees of freedom. I don't agree. There is one approach that I think may be fruitful and that I am currently pursuing. The penalized least squares formulation of a mixed-effects model
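The "same denominator" point is easy to verify in the fixed-effects-only case, where anova() on an lm fit divides every mean square by the one residual mean square. This only illustrates the analogy - it is not lmer's computation:

```r
# In anova() on an lm fit, every F ratio shares a single denominator:
# the residual mean square, RSS / (n - p).
fit <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)
a <- anova(fit)

denom <- a[["Mean Sq"]][nrow(a)]                 # residual mean square
F.by.hand <- a[["Mean Sq"]][-nrow(a)] / denom    # recompute each F ratio
all.equal(F.by.hand, a[["F value"]][-nrow(a)])
```

In the mixed-model case the analogous single denominator is the penalized residual sum of squares divided by n - p, which is exactly why assigning a different denominator degrees of freedom to each test sits oddly with the computation.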
Re: [R] microarray-like graph
Thanks Jim. This is almost what I want. One problem is that the colors are not mixing at the same position and the latter color simply overwrites the previous one if their V4 and V6 are similar. I am wondering if 1) somehow the colors can be mixed at the same position or 2) V6 can be binned at some interval to get an average single value for that range. What do you think? Thanks. Eric

On 5/19/06, Jim Lemon [EMAIL PROTECTED] wrote: Eric Hu wrote: Hi, I am beginning to learn R and have a data table from which I would like to produce a microarray-like plot. The table looks like this:

3 0 0 3 -377.61 1.94
3 0 0 3 -444.80 2.36
2 1 0 3 -519.60 2.39
1 1 1 3 -54.88 2.49
2 1 1 4 -536.55 2.53
1 0 1 2 108.29 2.62
2 0 0 2 39.56 2.62
3 0 1 4 108.32 2.63
2 0 0 2 -455.23 2.84
1 0 0 1 -432.30 2.98
...

I would like to assign colors to the first three columns and plot the last column against the fourth column, which is the sum of the first three at each row. Can anyone point me to how to approach this? Thanks for your suggestions.

Hi Eric, Here is an initial try at your plot. It doesn't look great, but you may be able to use some of the ideas.

hu.df <- read.table("hu.dat")
hu.df
   V1 V2 V3 V4      V5   V6
1   3  0  0  3 -377.61 1.94
2   3  0  0  3 -444.80 2.36
3   2  1  0  3 -519.60 2.39
4   1  1  1  3  -54.88 2.49
5   2  1  1  4 -536.55 2.53
6   1  0  1  2  108.29 2.62
7   2  0  0  2   39.56 2.62
8   3  0  1  4  108.32 2.63
9   2  0  0  2 -455.23 2.84
10  1  0  0  1 -432.30 2.98
hu.col <- rgb(hu.df$V1, hu.df$V2, hu.df$V3, maxColorValue = 3)
hu.col
 [1] #BF #BF #804000 #404040 #804040 #400040 #80
 [8] #BF0040 #80 #40
plot(hu.df$V6, hu.df$V4, col = hu.col, pch = 15, cex = 3)

Jim
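For option 2), one base-R possibility is to bin V6 with cut() and average the three channel values within each bin before calling rgb(). The bin width of 0.1 is an arbitrary choice; the data frame is re-entered from Jim's example:

```r
hu.df <- data.frame(V1 = c(3, 3, 2, 1, 2, 1, 2, 3, 2, 1),
                    V2 = c(0, 0, 1, 1, 1, 0, 0, 0, 0, 0),
                    V3 = c(0, 0, 0, 1, 1, 1, 0, 1, 0, 0),
                    V4 = c(3, 3, 3, 3, 4, 2, 2, 4, 2, 1),
                    V5 = c(-377.61, -444.80, -519.60, -54.88, -536.55,
                           108.29, 39.56, 108.32, -455.23, -432.30),
                    V6 = c(1.94, 2.36, 2.39, 2.49, 2.53,
                           2.62, 2.62, 2.63, 2.84, 2.98))

# bin V6 in steps of 0.1, then average each colour channel within a bin
bin <- cut(hu.df$V6, breaks = seq(1.9, 3.0, by = 0.1), include.lowest = TRUE)
mix <- aggregate(hu.df[, 1:3], by = list(bin = bin), FUN = mean)
mix$col <- with(mix, rgb(V1, V2, V3, maxColorValue = 3))
```

The rows sharing a bin (here the three with V6 around 2.62-2.63) collapse to one averaged, mixed colour.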
[R] Innovative Enterprise Microarray Software
3rd Millennium is announcing the release of its award winning Array Repository and Data Analysis System (ARDAS) version 2. ARDAS is a web-enabled enterprise software system that provides a complete and fully integrated solution to microarray data acquisition, management, and analysis. ARDAS includes three main modules: 1- A Laboratory Information Management System (LIMS) 2- A repository and data warehouse 3- An Analysis Information Management System (AIMS) based on bioconductor and R ARDAS is a robust and scalable enterprise system based on an Oracle relational database and is offered at desktop prices ($1900-$2900). To learn more or request a trial, please visit our web site at http://www.3rdmill.com/em1 Thank you. 3rd Millennium, Inc. 391 Totten Pond Rd. Suite 104 Waltham, MA 02451 [EMAIL PROTECTED] 781-890-4440 www.3rdmill.com - If you'd rather not receive emails from 3rd Millennium, please reply to this email with the word REMOVE in the body of your message. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How can you buy R?
While reading the various answers, I remembered that the legal side can't be quite that simple. If I'm not forgetting something, there are some packages in R that have a more restrictive licence than the GPL. HTH, Rogerio.

- Original Message - From: Damien Joly [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Thursday, May 18, 2006 6:51 PM Subject: [R] How can you buy R? Hi all, This may seem like a dumb question, but I work for an entity that is soon converting to XP across the board, and I will lose the ability to install software on my own. The entity has a policy of only using software that has been purchased and properly licensed (whatever that means). This means I will soon lose the ability to use R at work - something I can't do without at this point. HOWEVER, I might be able to work around this policy if I can find a licensed software vendor, preferably in Canada, that sells R. I tried googling R vendors but was unsuccessful. Any ideas? Thanks, Damien
Re: [R] Tick marks in lines.survfit
On Fri, 19 May 2006, Rachel Pearce wrote: I posted several months about the problem with adding tick marks to curves using lines.survfit. Fixed in survival 2.26, which will be in R 2.3.1 -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Cross correlation/ bivariate/ mantel
Sam, maybe you'd like to read about some environmetrics packages at http://cran.r-project.org/src/contrib/Views/Environmetrics.html or about more specific spatial analysis packages at http://cran.r-project.org/src/contrib/Views/Spatial.html Have a good reading and... good luck! Rogerio.

- Original Message - From: McClatchie, Sam (PIRSA-SARDI) [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Friday, May 19, 2006 12:16 AM Subject: [R] Cross correlation/ bivariate/ mantel Background: OS: Linux Ubuntu Dapper; release: R 2.3.0; editor: GNU Emacs 21.4.1; front-end: ESS 5.2.3 - Colleagues, I have two spatial datasets (latitude, longitude, fish eggs) and (latitude, longitude, fish larvae) at the same 280 stations (i.e. 280 cases). I want to determine if the 2 datasets are spatially correlated. In other words, do high numbers of larvae occur where there are high numbers of eggs? I would like to calculate the cross correlation for these bivariate data and calculate a Mantel statistic as described on pg. 147 of Fortin and Dale 2005 Spatial Analysis. My search of R packages came up with the acf and ccf functions but I don't think these are what I want. Does anyone know in which spatial package I might find the appropriate test, please? Best fishes, Sam Sam McClatchie, Biological oceanography, South Australian Aquatic Sciences Centre, PO Box 120, Henley Beach 5022, Adelaide, South Australia email [EMAIL PROTECTED] Cellular: 0431 304 497 Telephone: (61-8) 8207 5448 FAX: (61-8) 8207 5481 Research home page http://www.members.iinet.net.au/~s.mcclatchie/
Re: [R] How to use lm.predict to obtain fitted values?
Larry Howe wrote: I have had a similar issue recently, and looking at the archives of this list, I see other cases of it as well. It took me a while to figure out that the variable name in the data frame must be identical to the variable name in the model. I don't see this mentioned in the documentation of predict.lm, and R issues no warning in this case. How would I go about officially requesting that this is mentioned, either in the documentation, or as a warning? Sincerely, Larry Howe

Here's what I would do before officially requesting anything: read 'An Introduction to R', especially section 11.3, on predict(object, newdata=data.frame): "The data frame supplied must have variables specified with the same labels as the original." Seems pretty explicit. As well, the predict.lm help page has, under Arguments: newdata: "An optional data frame in which to look for variables with which to predict." That, too, seems unambiguous; i.e. you can't predict with values of z when z is not in your formula. Peter Ehlers

On Friday May 19 2006 13:26, Prof Brian Ripley wrote: data.frame(x=newData) will not have any entries called x. You supplied newdata, so assuming you mean newdata:

data.frame(x=newdata)
  x.1 x.2 x.3 x.4 x.5 x.6
1   1   1   1   1   1   1

has 6 columns, none of which is labelled x. If you read the help for lm, it does not mention having a matrix on the rhs of a formula, and the help for data.frame does explain how it works. predict(model, data.frame(x=I(newData))) might work.

On Fri, 19 May 2006, Richard Lawn wrote: I am writing a function to assess the out-of-sample predictive capabilities of a time series regression model. However lm.predict isn't behaving as I expect it to. What I am trying to do is give it a set of explanatory variables and have it give me a single predicted value using the lm fitted model.

model = lm(y~x)
newdata = matrix(1,1,6)
pred = predict.lm(model, data.frame(x=newData))
Warning message: 'newdata' had 6 rows but variable(s) found have 51 rows.
pred = predict.lm(model, data.frame(newData))
Warning message: 'newdata' had 6 rows but variable(s) found have 51 rows.

y is a vector of length 51. x is a 6x51 matrix. newdata is a matrix of the explanatory variables I'd like a prediction for. The predict.lm function is giving me 51 (= the number of observations I had) numbers, rather than the one number I do want - the predicted value of y, given the values of x I have supplied. Many thanks, R
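A minimal self-contained illustration of the naming rule (made-up data):

```r
set.seed(1)
d <- data.frame(x = 1:10, y = 2 * (1:10) + rnorm(10))
model <- lm(y ~ x, data = d)

# 'newdata' must contain a column named exactly like the formula variable:
p <- predict(model, newdata = data.frame(x = 11))  # one prediction, as intended

# A mis-named column is ignored and you get the fitted values for the
# original 10 observations back instead (with a warning):
# predict(model, newdata = data.frame(z = 11))
```

For Richard's matrix-on-the-rhs case, wrapping the matrix in I(), as Prof Ripley suggests, keeps it as a single matrix column of the data frame rather than splitting it into x.1, x.2, and so on.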
Re: [R] How can you buy R?
On Fri, 2006-05-19 at 17:59 -0300, Rogerio Porto wrote: While reading the various answers, I remembered that the legal side can't be quite that simple. If I'm not forgetting something, there are some packages in R that have a more restrictive licence than the GPL. HTH, Rogerio.

Any CRAN packages (or other R packages not on CRAN) that have non-commercial use restrictions likely could not be used by the OP anyway, even prior to this new policy. So I suspect that this would be a non-issue. If Damien's employer is willing to accept the GPL license (probably the most significant issue) and feels the need to pay for something, they could make an appropriate donation to the R Foundation. Perhaps even secure a little PR benefit for having done so. Is Damien's employer allowing the use of Firefox instead of IE? If so, the precedent within the confines of the policy has been set already. Firefox is GPL, free and no CD. There is an awful lot of commercial software out there that can be purchased online, properly licensed and downloaded, without the need for a physical CD. Anti-virus software is perhaps the most notable example. So:

License: GPL
CD: Don't need one
Purchase: Donation to the R Foundation
Being able to use R: Priceless :-)

HTH, Marc Schwartz
Re: [R] Weird LM behaviour
On Fri, 19 May 2006, Jason Barnhart wrote: No, not weird. Think of it this way. As you move point (0,2) to (1,2) the slope which was 0 is moving towards infinity. Eventually the 3 points are perfectly vertical and so must have infinite slope. Your delta-x is not sufficiently granular to show the slope change for x-values very close to 1 but not yet 1, like 0.9. Note lm returns NA when x=1.

This turns out not to be the case. Worked to infinite precision, the mean of y is 2 at x and at 1, so the infinite-precision slope is exactly zero for all x!=1 and undefined for x=1. Now, we are working to finite precision, and the slope is obtained by solving a linear system that gets increasingly poorly conditioned as x approaches 1. This means that for x not close to 1 the answer should be 0 to within a small multiple of machine epsilon (and it is), and that for x close to 1 the answer should be zero to within an increasingly large multiple of machine epsilon (and it is). Without a detailed error analysis of the actual algorithm being used, you can't really predict whether the answer will follow a more-or-less consistent trend or oscillate violently. You can estimate a bound for the error: it should be a small multiple of the condition number of the design matrix times machine epsilon. As an example of how hard it is to predict exactly what answer you get: if R used the textbook formula for linear regression the bound would be a lot worse, but in this example the answer comes out slightly closer to zero that way. Unless you really need to know, trying to understand why the fourteenth decimal place of a result has the value it does is not worth the effort. -thomas
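A reconstruction of the kind of example under discussion (three points whose y-mean is 2 at both x-values, so the exact slope is 0 whenever a != 1; the exact data from the original thread are not shown here):

```r
# Slope of a 3-point regression where two points sit at x = 1 and
# one slides from x = 0 towards x = 1.
slope <- function(a) {
  d <- data.frame(x = c(a, 1, 1), y = c(2, 1, 3))
  coef(lm(y ~ x, data = d))[["x"]]
}

slope(0)       # zero up to rounding error
slope(0.9999)  # still tiny, but the error bound grows as a -> 1
slope(1)       # NA: the design matrix is rank-deficient
```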
Re: [R] How can you buy R?
Thanks for this (and everyone else's!) responses! I really appreciate it. You've all given me a lot of potential workarounds. Damien p.s., I suspect this will apply to Firefox, GIMP, OOo.org, and all the other great OS tools I use on a daily basis. On 5/19/06, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote: On Fri, 2006-05-19 at 17:59 -0300, Rogerio Porto wrote: While reading the various answers, I've remembered that the juridic part can't be that so simple. If I'm not fogeting something, there are some packages in R that has a more restrictive licence than GPL. HTH, Rogerio. Any CRAN packages (or other R packages not on CRAN) that have non-commercial use restrictions, likely would not be able to be used by the OP anyway, even prior to this new policy. So I suspect that this would be a non-issue. If Damien's employer is willing to accept the GPL license (probably the most significant issue) and feels the need to pay for something, they could make an appropriate donation to the R Foundation. Perhaps even secure a little PR benefit for having done so. Is Damien's employer allowing the use of Firefox instead of IE? If so, the precedent within the confines of the policy has been set already. Firefox is GPL, free and no CD. There is an awful lot of commercial software out there than can be purchased online, properly licensed and downloaded, without the need for a physical CD. Anti-virus software perhaps being the most notable example. So: License: GPL CD: Don't need one Purchase:Donation to the R Foundation Being able to use R: Priceless :-) HTH, Marc Schwartz [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] my first R program
Hello, This is my first attempt at using R. Still trying to figure out and understand how to work with data frames. I am trying to plot the following data (an example from some experimental data).

1) I have 2 files.

2) First file:

Number  Position
1       120
2       134
3       156
4       169
5       203

3) Second file:

Col1  Col2  p-val
1     2     0.45
1     2     0.56
2     3     0.56
2     3     0.68
2     3     0.88
3     4     0.76
3     5     0.79
3     5     0.92

I am trying to plot this with position as the x-axis and p-val as the y-axis. The Col1 and Col2 in the second file correspond to the Number column in the first file. I am having trouble figuring out how to associate Col1 and Col2 with their corresponding position values. The x-axis should start with 120, as that is the minimum value, and the next values should be spaced proportionally away from the first. I tried using the percentage method to place them but couldn't completely get it correct, so it would look like:

|    |    |    |        |
120  134  156  169      203

Hopefully I explained it correctly. I would like to plot the p-values as horizontal lines drawn between the Col1 and Col2 values (i.e. positions), so the plot will have as many horizontal lines as there are rows in the second file, and ONE reference horizontal line passing through the plot at p-val = 0.5, to see what values lie below it and what lie above it. I have made some progress in plotting the horizontal axis, but I am having trouble bringing all the data together. Not sure yet how to manipulate them using data frames :( Any suggestions and tips will be greatly appreciated. Thank you -Kiran
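One base-R sketch of the requested plot: look up each pair's positions in the first table with match(), then draw one horizontal segment per row of the second table (the two data frames below stand in for reading the files, whose names are not given):

```r
pos <- data.frame(Number = 1:5, Position = c(120, 134, 156, 169, 203))
pv  <- data.frame(Col1 = c(1, 1, 2, 2, 2, 3, 3, 3),
                  Col2 = c(2, 2, 3, 3, 3, 4, 5, 5),
                  pval = c(0.45, 0.56, 0.56, 0.68, 0.88, 0.76, 0.79, 0.92))

x0 <- pos$Position[match(pv$Col1, pos$Number)]  # start of each segment
x1 <- pos$Position[match(pv$Col2, pos$Number)]  # end of each segment

f <- tempfile(fileext = ".pdf")
pdf(f)
plot(range(pos$Position), c(0, 1), type = "n",
     xlab = "Position", ylab = "p-value")
segments(x0, pv$pval, x1, pv$pval)                # one line per row
abline(h = 0.5, lty = 2)                          # reference line at p = 0.5
axis(3, at = pos$Position, labels = pos$Number)   # marker numbers on top
dev.off()
```

Because the x-axis is the actual Position values, the tick spacing is automatically proportional, which is the layout the sketch in the question asks for.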
Re: [R] lmer, p-values and all that
Douglas Bates wrote: Users are often surprised and alarmed that the summary of a linear . . . . Doug, I have been needing this kind of explanation. That is very helpful. Thank you. I do a lot with penalized MLEs for ordinary regression and logistic models and know that getting sensible P-values is not straightforward even in that far simpler situation. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Weird LM behaviour
I see what you mean. Thanks for the correction. -jason - Original Message - From: Thomas Lumley [EMAIL PROTECTED] To: Jason Barnhart [EMAIL PROTECTED] Cc: R-help@stat.math.ethz.ch; Rense Nieuwenhuis [EMAIL PROTECTED] Sent: Friday, May 19, 2006 2:39 PM Subject: Re: [R] Weird LM behaviour On Fri, 19 May 2006, Jason Barnhart wrote: No, not weird. Think of it this way. As you move point (0,2) to (1,2) the slope which was 0 is moving towards infinity. Eventually the 3 points are perfectly vertical and so must have infinite slope. Your delta-x is not sufficiently granular to show the slope change for x-values very close to 1 but not yet 1, like 0.9. Note lm returns NA when x=1. This turns out not to be the case. Worked to infinite precision the mean of y is 2 at x and at 1, so the infinite-precision slope is exactly zero for all x!=1 and undefined for x=1. Now, we are working to finite precision and the slope is obtained by solving a linear system that gets increasingly poorly conditioned as x approaches 1. This means that for x not close to 1 the answer should be 0 to withing a small multiple of machine epsilon (and it is) and that for x close to 1 the answer should be zero to within an increasingly large multiple of machine epsilon (and it is). Without a detailed error analysis of the actual algorithm being used, you can't really predict whether the answer will follow a more-or-less consistent trend or oscillate violently. You can estimate a bound for the error: it should be a small multiple of the condition number of the design matrix times machine epsilon. As an example of how hard it is to predict exactly what answer you get, if R used the textbook formula for linear regression the bound would be a lot worse, but in this example the answer is slightly closer to zero done that way. Unless you really need to know, trying to understand why the fourteenth decimal place of a result has the value it does is not worth the effort. 
-thomas
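Thomas's point about ill-conditioning can be seen directly. A minimal sketch, assuming the data are the three points the thread discusses (y = 2 at a movable x-value, and y = 1 and y = 3 at x = 1); the function name and data reconstruction are my own:

```r
# Three points: (x0, 2), (1, 1), (1, 3). The exact slope is 0 for any
# x0 != 1 and undefined at x0 = 1 (the design matrix becomes singular).
slope_at <- function(x0) {
  x <- c(x0, 1, 1)
  y <- c(2, 1, 3)
  unname(coef(lm(y ~ x))["x"])
}
slope_at(0)       # ~0, to within a small multiple of machine epsilon
slope_at(0.9999)  # still ~0, but only to within a much larger multiple
slope_at(1)       # NA: the fit is singular
kappa(cbind(1, c(0.9999, 1, 1)))  # condition number blows up near x0 = 1
```

The condition number printed on the last line is the quantity Thomas's error bound is built from.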
[R] problem with pdf() R 2.2.1, os 10.4.6
Hi, I'm trying to write a histogram to a pdf: pdf(); plot <- hist(c, xlim=c(0.69, 0.84), ylim=c(0,100)). When I try to open the pdf I can't open it; there is always some error. Is there something I should add to make it run under this operating system? I had problems upgrading to 2.3 (problems downloading packages), so I'm not sure an upgrade will work out for me. I just want a publication-quality histogram... thank you, betty -- Betty Gilbert [EMAIL PROTECTED] Taylor Lab Plant and Microbial Biology 321 Koshland Hall U.C. Berkeley Berkeley, Ca 94720
Re: [R] problem with pdf() R 2.2.1, os 10.4.6
On Fri, 2006-05-19 at 16:36 -0700, Betty Gilbert wrote: Hi, I'm trying to write a histogram to a pdf: pdf(); plot <- hist(c, xlim=c(0.69, 0.84), ylim=c(0,100)). When I try to open the pdf I can't open it; there is always some error. Is there something I should add to make it run under this operating system? I had problems upgrading to 2.3 (problems downloading packages), so I'm not sure an upgrade will work out for me. I just want a publication-quality histogram... thank you, betty Betty, You need to explicitly close the pdf device with dev.off() once the plotting-related code has completed. See the example in ?pdf for more information. Otherwise, the output of hist() is not flushed to the disk file and the file is not properly closed. HTH, Marc Schwartz
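A minimal complete version of what Marc describes; the data here are made up, since Betty's object `c` is unknown:

```r
x <- rnorm(100, mean = 0.76, sd = 0.03)  # stand-in for the real data
pdf("hist.pdf")                          # open the device
hist(x, xlim = c(0.69, 0.84), ylim = c(0, 100))
dev.off()  # flush and close the file: without this the PDF won't open
```

Incidentally, naming a data object `c` masks the base function `c()`, which is a separate source of confusing errors worth avoiding.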
Re: [R] lmer, p-values and all that
On Fri, 2006-05-19 at 17:44 -0500, Frank E Harrell Jr wrote: Douglas Bates wrote: Users are often surprised and alarmed that the summary of a linear . . . . Doug, I have been needing this kind of explanation. That is very helpful. Thank you. I do a lot with penalized MLEs for ordinary regression and logistic models and know that getting sensible P-values is not straightforward even in that far simpler situation. Frank I would like to echo Frank's comments and say Thanks to Doug for taking the time to provide this post. I would also like to suggest that this issue has indeed become a FAQ and would like to recommend that an addition to the main R FAQ be made (wording TBD) but along the lines of: Why are p values not displayed when using lmer()? The response could be: Doug Bates has kindly provided an extensive response in a post to the r-help list, which can be reviewed at: https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html This might save Doug, Harold and Spencer (I am probably missing some others here) keystrokes in the future... :-) Best regards, Marc Schwartz
Re: [R] How can you buy R?
Jarek wrote: At one company I was working for, I had to run all the licenses of all the software I had on my machine through the legal department. When they read the GNU General Public License (GPL), their only comment was: We have no idea what that license means. Do not touch any software using it. This is typical of lawyers' minds. If something is clear, rational, lucid, straightforward, unambiguous, means what it says, they can't understand it. cheers, Rolf
[R] ANCOVA, Ops.factor, singular fit???
I'm trying to perform ANCOVAs in R 1.14, on Mac OS X, but I can't figure out what I am doing wrong. Essentially, I'm testing whether a number of quantitative dental measurements (the response variables in each ANCOVA) show sexual dimorphism (the sexes are the groups) independently of the animal's size (the concomitant variable). I have attached a 13-column matrix as a data frame (so far, so good). But then I tried this: model <- lm(ln2 ~ sex*ln1) or this: model <- lm(ln2 ~ sex+ln1) and got this: Warning message: - not meaningful for factors in: Ops.factor(y, z$residuals) which I don't understand. (In my matrix, ln2 is the name of the second column, a response variable, and ln1 is the name of the first column, a concomitant variable. Sex is the rightmost column, indicating sex. The first 14 rows are measurements for male individuals, and the next 13 rows are measurements for female individuals.) The data output is bizarre, too -- it's just so long, and everything begins with ln 11 or ln 12. How can I fix this? I have another question: if possible I would like to use a robust fit algorithm to fit the data. When I attempt to do this, substituting rlm() for lm(), the program returns another message: Error in rlm.default(x, y, weights, method = method, wt.method = wt.method, : 'x' is singular: singular fits are not implemented in rlm What is a singular fit, and why is it apparently undesirable? I am new to R and any help would be greatly appreciated. Thanks so much, Rebecca Sealfon
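An "Ops.factor" warning like the one above usually means a variable in the model is a factor where a numeric vector was expected -- often a numeric column that was read in as a factor because of a stray non-numeric entry. A hedged sketch, with made-up data standing in for Rebecca's columns:

```r
# Made-up stand-ins for columns ln1 (size), ln2 (a measurement), and sex.
# Here ln2 has accidentally been read in as a factor.
dat <- data.frame(
  ln1 = c(2.1, 2.3, 2.2, 2.5, 2.0, 2.4),
  ln2 = factor(c("1.8", "1.9", "1.7", "2.0", "1.6", "1.9")),
  sex = factor(c("M", "M", "M", "F", "F", "F"))
)
str(dat)  # inspect the column types: ln2 shows up as a factor
# Convert via as.character first; as.numeric() on a factor returns the
# internal level codes, not the printed values.
dat$ln2 <- as.numeric(as.character(dat$ln2))
fit <- lm(ln2 ~ sex + ln1, data = dat)  # ANCOVA, parallel-slopes model
summary(fit)
```

Checking `str()` on the imported data frame is the quickest way to confirm whether this is what happened.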
[R] Function as.Date leading to error implying that strptime requires 3 arguments
I'm using R V 2.2.1. When I try an example from the as.Date help page, I get an error. x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960") z <- as.Date(x, "%d%b%Y") Error in strptime(x, format) : 2 arguments passed to 'strptime' which requires 3 Any suggestions would be appreciated. Thanks, Rob
Re: [R] Fast update of a lot of records in a database?
On 5/19/2006 3:19 PM, hadley wickham wrote: put the updates into a temporary table called updates: UPDATE bigtable AS a FROM updates AS b WHERE a.id = b.id SET a.col1 = b.col1 I don't think this will be any faster - why would creating a new table be faster than updating existing rows? I don't know, but I assumed each SQL statement resulted in some sort of turnaround delay with the database. Creating the new table requires just a few statements, one of which is a huge INSERT statement. I haven't had a chance yet to do the timing, but creating a test update table with 40 new values hadn't finished in an hour. I've never had a problem with using large numbers of SQL update statements (on the order of hundreds of thousands) to update a table and having them complete in a reasonable time (a few minutes). How heavily indexed is the field you are updating? The field being updated isn't indexed; the id field is. You may be able to get some speed improvement by turning off indices before the update and back on again afterwards (highly dependent on your database system, though). I would strongly suspect your bottleneck lies elsewhere (e.g. when generating the statements in R, or using ODBC to send them). It's not in R (that goes quickly), but I don't know how to distinguish ODBC slowdowns from database slowdowns. On my machine ODBC is the only option. Part of the problem is probably that I am remote from the server, using ODBC over an SSH tunnel. But the original thousands of UPDATEs were taking hours running on a machine much closer to the server. It was using RdbiPgsql; I'm using RODBC. Doing the SELECT takes me a couple of minutes, and is faster locally; why is UPDATE so slow? Duncan
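One thing worth trying before restructuring the query: send all the UPDATEs inside a single transaction, so the per-statement commit and round-trip cost is paid once. A hedged sketch using RODBC; the DSN, table, and column names are assumptions, and the database calls are shown but not run since they need a live connection:

```r
# Toy stand-ins; real ids and values would come from the data to write.
updates <- data.frame(id = 1:3, col1 = c(10.5, 20.1, 30.7))
stmts <- sprintf("UPDATE bigtable SET col1 = %g WHERE id = %d",
                 updates$col1, updates$id)
stmts[1]  # "UPDATE bigtable SET col1 = 10.5 WHERE id = 1"
## Not run (needs a live connection; the DSN name is hypothetical):
## library(RODBC)
## ch <- odbcConnect("mydsn")
## sqlQuery(ch, "BEGIN")             # one transaction for all statements
## for (s in stmts) sqlQuery(ch, s)
## sqlQuery(ch, "COMMIT")
## odbcClose(ch)
```

Whether BEGIN/COMMIT is the right pair (versus an autocommit switch) depends on the backend, but over an SSH tunnel the saving from batching round trips can be substantial.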
Re: [R] can Box test the Ljung Box test say which ARIMA model is better?
First, have you made normal probability plots (e.g., with function qqnorm) of the data and of the whitened residuals from the fits of the different models? If you've got outliers, they could drive strange results; you should refit after setting the few outliers to NA. You didn't tell us what software you used, but the following seemed to work for me: qqnorm(as.numeric(lh), datax=TRUE) lh100 <- arima(lh, order = c(1,0,0)) qqnorm(as.numeric(resid(lh100)), datax=TRUE) lh.tst <- lh length(lh) # Remove observation 20, pretending it was an outlier lh.tst[20] <- NA lh.tst100 <- arima(lh.tst, order=c(1,0,0)) resid(lh.tst100) # NOTE: resid(...)[20] is NA What are the AIC values? I think most experts would suggest making the choice based on the AIC, especially if both models passed the Box-Ljung test. In my opinion, the best work I've seen relevant to your question talks about Bayesian Model Averaging. RSiteSearch("Bayesian model averaging for time series") led me to the following: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/70293.html Hope this helps. Spencer Graves Michael wrote: two ARIMA models, both have several bars significant in ACF and PACF plots of their residuals, but when I run Ljung-Box tests, neither shows any significant correlations... however, one model has a p-value that is larger than the other model's. Based on the p-values, can I say the model with the larger p-values should be better than the model with the smaller p-values?
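Spencer's AIC suggestion in runnable form, using the built-in lh series since Michael's data and model orders are unknown (a sketch, not his actual models):

```r
fit1 <- arima(lh, order = c(1, 0, 0))
fit2 <- arima(lh, order = c(2, 0, 0))
AIC(fit1)  # the smaller AIC is preferred
AIC(fit2)
# Ljung-Box on the residuals: a large p-value means "no evidence of
# remaining autocorrelation", not "this model is better".
Box.test(resid(fit1), lag = 10, type = "Ljung-Box")$p.value
Box.test(resid(fit2), lag = 10, type = "Ljung-Box")$p.value
```

This separates the two roles cleanly: the Ljung-Box test is a pass/fail adequacy check on each model's residuals, while AIC is the comparison criterion between adequate models.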
Re: [R] Function as.Date leading to error implying that strptime requires 3 arguments
I can't reproduce this on my XP system: x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960") z <- as.Date(x, "%d%b%Y") z [1] "1960-01-01" "1960-01-02" "1960-03-31" "1960-07-30" R.version.string # XP [1] "R version 2.2.1, 2005-12-20" I also tried it on 2.3.0 patched and could not reproduce it there either. On 5/19/06, Rob Balshaw [EMAIL PROTECTED] wrote: I'm using R V 2.2.1. When I try an example from the as.Date help page, I get an error. x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960") z <- as.Date(x, "%d%b%Y") Error in strptime(x, format) : 2 arguments passed to 'strptime' which requires 3 Any suggestions would be appreciated. Thanks, Rob
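Since the example works in a clean session, one guess (mine, not Gabor's) is that in Rob's session the base strptime has been masked by another package or a stray user-defined function. Two quick checks, plus an explicit call to the base version:

```r
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
conflicts(detail = TRUE)  # look for 'strptime' among the masked names
find("strptime")          # should list only "package:base"
z <- as.Date(base::strptime(x, "%d%b%Y"))  # bypass any masking copy
z
```

If `find()` turns up more than one `strptime`, detaching the offending package (or restarting with a clean workspace) should restore the documented behaviour.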