Re: [R] distance method in kmeans
Chandra,

you might want to have a look at the package flexclust.

Best,
Bettina

Ranga Chandra Gudivada wrote:
> I am trying to cluster some binary data using k-means. The regular kmeans available from the stats package in R doesn't provide an option to change the distance method. I was wondering whether there is any package that lets me specify the type of distance measure used in k-means clustering in R, especially distances like Jaccard, which is good for binary data.
>
> Thanks,
> chandra

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting Interval of Time in seconds in R
?system.time might be what you're looking for.

Ted.

Mohammad Ehsanul Karim wrote on 04/23/2007 03:53 PM:
> Dear List,
>
> I want R to calculate the run time of a self-written simulation function. I tried the following, which lets me see the starting and finishing time points:
>
>   sim.result <- function(nsim, ...) {
>     Starting <- date()
>     ...  # calculations
>     final.result <- ...  # output for display
>     cat("# of Iterations used =", nsim, "\n")
>     Ending <- date()
>     cat("Start of Program at", Starting, "\n")
>     cat("End of Program at", Ending, "\n")
>     return(print(final.result, quote = FALSE))
>   }
>
> But what if I want the difference between the starting and ending times in seconds only (say, the function should report that the program took 597 seconds to run the whole simulation), rather than in a format like format(Sys.time(), "%a %b %d %H:%M:%S %Y")?
>
> I also tried the following, but it does not seem to do the trick:
>
>   starting <- Sys.time()
>   ending <- Sys.time()
>   format(diff(ending, starting), "%H:%M:%S")
>   Error in Ops.POSIXt(lag, differences) : * not defined for POSIXt objects
>   as.numeric(format(ending, "%H:%M:%S"))
>   [1] NA
>
> How should I proceed? Thanks in advance.
>
> Mohammad Ehsanul Karim
> Using Windows XP, R 2.3.1
> Institute of Statistical Research and Training, University of Dhaka

--
Dr E.A. Catchpole
Visiting Fellow, Univ of New South Wales at ADFA, Canberra, Australia, and University of Kent, Canterbury, England
www.pems.adfa.edu.au/~ecatchpole
fax: +61 2 6268 8786
ph: +61 2 6268 8895
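A minimal sketch of the system.time() suggestion. The function sim.result and its body are placeholders made up for illustration, not the poster's actual simulation.

```r
# Hypothetical stand-in for the poster's simulation function.
sim.result <- function(nsim) {
  mean(replicate(nsim, mean(rnorm(100))))
}

# system.time() returns user, system and elapsed times, all in seconds.
timing <- system.time(result <- sim.result(1000))
elapsed.secs <- timing[["elapsed"]]
cat("Run time:", elapsed.secs, "secs.\n")
```

The "elapsed" entry is wall-clock time, which is usually the figure wanted when reporting how long a simulation took.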
Re: [R] Problem with dgamma ?
Tong Wang wrote:
> Hi All,
>
> Here's what I got using the dgamma function:
>
>   nu <- .2
>   nu*log(nu)-log(gamma(nu))+(nu-1)*log(1)-nu*(1)
>   [1] -2.045951
>   dgamma(1,nu,nu,1)
>   [1] 0.0801333
>   dgamma(1,nu,nu,0)
>   [1] NaN
>   Warning message: NaNs produced in: dgamma(x, shape, scale, log)
>
> Could anyone tell me what is wrong here?

Did you intend the 4th argument to match the log formal argument? It doesn't:

  args(dgamma)
  function (x, shape, rate = 1, scale = 1/rate, log = FALSE)
  NULL

> I am using R-2.4.1 on Windows XP. Thanks a lot.
Re: [R] Extracting Interval of Time in seconds in R
The following seems to work:

  begin.time <- Sys.time()
  begin.times <- format(begin.time, "%a %b %d, %Y at %X")
  end.time <- Sys.time()
  end.times <- format(end.time, "%a %b %d, %Y at %X")
  run.time <- difftime(end.time, begin.time, units = "secs")
  cat("Start time:", begin.times, "\n", "Finish time:", end.times, "\n", "Run time:", run.time, "secs.\n")

Thanks.
Mohammad Ehsanul Karim
Re: [R] Open source community help-desks
On 22-Apr-07 20:51:09, Jeffrey Miller wrote:
> I read the reply earlier in which Nima was naughty-naughty'd for calling us a "Help Desk". And, I have to admit that I agree.

Well, we're not a Help Desk of the kind that puts you on hold, listening to the Free Software Song[1] round and round ... but ...

> But then, an open-source listserv is, in essence, a help desk ... well, maybe not a desk, but we do help each other.

... well, exactly! And I can accept "help desk" as a metaphor.

> My conflicting feelings about this are better left for a therapist but, in the meanwhile, perhaps we need a revision of Ayn Rand's "Virtue of Selfishness" and how it may or may not extend to the open-source community.

Ayn Rand's concept of selfishness is of course not the standard one (gratifying oneself in disregard for others), and can (if I have it right) well embrace ensuring that the self is well looked after while extending one's resulting strength, vigour and survival to the benefit of others. And I think this may be a good analogy of the way the Open Source community works.

> All in fun,

Hmmm, slightly serious fun ... ?? !!

Best wishes to all,
Ted.

[1] http://www.gnu.org/music/free-software-song.au or other variants which may be found at http://www.gnu.org/music/free-software-song.html

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 23-Apr-07 Time: 00:38:23
-- XFMail --
Re: [R] Suggestions for statistical computing course
On Fri, 2007-04-20 at 12:13 -0400, Fred Bacon wrote:
> Ideally, it would work like this: The free VMware player is installed on each of the lab computers. The lab manager uses a licensed copy of VMware Workstation to create a clean image of a computer.

You can use the open source QEMU program to create VMware machines: http://fabrice.bellard.free.fr/qemu/ After installing QEMU, the following command creates a machine with 20 GB of disk space, onto which you can load a (licensed!) copy of Windows (or better, Linux :-) ):

  qemu-img.exe create -f vmdk VMmachine.vmdk 20G

> The instructor makes a copy of the clean image and installs the necessary software and instructional materials. The instructor can use either the free player or the paid workstation version to do this. After the virtual machine is completed, the image is sent back to the lab where it is made available to the lab computers. If you use the paid workstation version rather than the free player version on the lab computers, then you can use the Snapshot feature to create a consistent image for every student. Every time the virtual machine is shut down, the system can revert back to the snapshot for the next student. It all depends on your budget.

Again, you can do this for free with QEMU, using the -snapshot option.

> How you handle the OS licensing issue for the guest operating system is up to you. I personally would recommend using Linux, but some of our customers are terrified of anything that doesn't look like a Microsoft OS. The only caveat is the disk space utilization. Having a complete OS image for every student for every class could eat up terabytes of space. But heck, terabyte RAID arrays are readily available these days.
>
> Fred

--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072 Australia
Room 320, Goddard Building (8)
T: +61 7 3365 2506
email: S.Blomberg1_at_uq.edu.au

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
Re: [R] Estimates at each iteration of optim()?
DEEPANKAR BASU wrote:
> I am trying to maximise a complicated loglikelihood function with the optim command. Is there some way to get to know the estimates at each iteration? When I put control=list(trace=TRUE) as an option in optim, I just got the initial and final values of the loglikelihood, the number of iterations and whether the routine has converged or not. I need to know the estimate values at each iteration.

It might help if you actually _read_ the description of the trace control parameter (hint: it is not an on/off switch) in ?optim... And, as it says, this is method dependent, so you may have to study the source code.

> Deepankar
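As the reply notes, trace is an integer level whose effect depends on the method, so what it prints may not include the parameter values. A different, method-independent technique is to wrap the objective so that it stores every parameter vector it is called with. The sketch below uses a made-up quadratic objective (negloglik) as a stand-in for the poster's likelihood; note that it records every function evaluation, including those used for numerical gradients, not only the accepted iterates.

```r
# Hypothetical objective: stand-in for a negative log-likelihood, minimum at (1, 2).
negloglik <- function(p) (p[1] - 1)^2 + (p[2] - 2)^2

# Wrap the objective so that every parameter vector passed to it is stored.
evaluations <- list()
traced <- function(p) {
  evaluations[[length(evaluations) + 1]] <<- p
  negloglik(p)
}

fit <- optim(c(0, 0), traced, method = "BFGS")
path <- do.call(rbind, evaluations)  # one row per function evaluation
head(path)
fit$par
```

The rows of path can then be plotted or inspected to see how the estimates move towards the optimum.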
Re: [R] Suggestions for statistical computing course
> 2. I do most of my work in R using Emacs and ESS. That means that I keep a file in an emacs window and I submit it to R one line at a time or one region at a time, making corrections and iterating as needed. When I am done, I just save the file with the last, working, correct (hopefully!) version of my code. Is there a way of doing something like that, or in the same spirit, without using Emacs/ESS? What approach would you use to polish and save your code in this case? For my course I will be working in a Windows environment.

I do this with kate on Linux. Kate has a konsole window in which I run R, and I then pipe the lines from the editor to konsole. You can easily define a shortcut key to pipe the lines/regions to konsole.

Vikas
[R] Do *NOT* repost ! {Re: about R square value}
Dear Nitish,

Please do *NOT* resend your message several times to the R-help mailing list. This is considered very impolite.

"Nitish" == Nitish Kumar Mishra [EMAIL PROTECTED] on Sun, 22 Apr 2007 16:39:03 +0530 (IST) writes:

    Nitish> Hi, I am simply asking about the coefficient of determination (R square): is a value of more than 1 also possible? It ranges from 0 to 1, so I want to know whether R square may be more than one. If yes, what is its interpretation? Thanks to all of you (R help group).

    [...]

    Nitish> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Please, DEFINITELY do read the posting guide before sending another message to R-help.

Martin Maechler, ETH Zurich
R- mailing lists maintainer
Re: [R] extracting the mode of a vector
Benoît Lété wrote:
> Hello,
>
> I have an elementary question (for which I couldn't find the answer on the web or in the help): how can I extract the mode (modal score) of a vector?

Assuming that your vector contains only integers:

  v <- sample(1:5, size=20, replace=TRUE)
  v
  [1] 1 1 1 1 2 3 5 1 1 5 2 4 1 3 1 1 5 4 1 5
  vt <- table(v)
  as.numeric(names(vt[vt == max(vt)]))
  [1] 1

Cheers,
Gad

--
Gad Abraham
Department of Mathematics and Statistics
The University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.ms.unimelb.edu.au/~gabraham
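The table() approach above can be wrapped in a small helper. This is a sketch only; the name modal.values is made up here. It returns the values as character strings so that it also works for non-numeric vectors, and it returns every tied value when there is more than one mode.

```r
# Return the most frequent value(s) of a vector; ties give several values.
modal.values <- function(v) {
  vt <- table(v)
  names(vt)[vt == max(vt)]
}

v <- c(1, 1, 1, 1, 2, 3, 5, 1, 1, 5, 2, 4, 1, 3, 1, 1, 5, 4, 1, 5)
as.numeric(modal.values(v))               # numeric input, converted back
modal.values(c("a", "b", "b", "a", "c"))  # character input with a tie
```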
[R] Does any package do stepwise using p-value criterion?
Dear all,

I found that most R packages do stepwise model selection with the AIC criterion. I am doing a study comparing several popular model selection methods, including stepwise selection using a p-value criterion. We know that stepwise selection in SAS uses a p-value criterion, so this could be a widely used method. Does anyone know which R package offers stepwise selection using a p-value criterion?

Thanks,
Li Junjie
[R] extract from a data frame
Hello,

I'd like to know how to extract data from a data frame. For example, how can I extract only the rows where Variety is "Victory" or "Golden Rain"? Thanks.

Oats
   Block     Variety nitro yield
1      I     Victory   0.0   111
2      I     Victory   0.2   130
3      I     Victory   0.4   157
4      I     Victory   0.6   174
5      I Golden Rain   0.0   117
6      I Golden Rain   0.2   114
7      I Golden Rain   0.4   161
8      I Golden Rain   0.6   141
9      I  Marvellous   0.0   105
10     I  Marvellous   0.2   140
11     I  Marvellous   0.4   118
12     I  Marvellous   0.6   156
13    II     Victory   0.0    61
14    II     Victory   0.2    91
15    II     Victory   0.4    97
16    II     Victory   0.6   100
17    II Golden Rain   0.0    70
18    II Golden Rain   0.2   108
19    II Golden Rain   0.4   126
20    II Golden Rain   0.6   149
21    II  Marvellous   0.0    96
22    II  Marvellous   0.2   124
23    II  Marvellous   0.4   121
[R] Random Forest
Hi R-wizards,

I ran a random forest on a dataset where the response variable had two possible values. It returned a warning telling me that it did regression and asking whether that was really what I wanted. Does anybody know what the algorithm is doing when it does a regression? (The random forest is used as a regression; how does that work?)

Thanks for your time!
Ruben
[R] R: extract from a data frame
  Oats[Oats$Variety %in% c("Victory", "Golden Rain"), ]

or

  subset(Oats, Variety %in% c("Victory", "Golden Rain"))

Stefano

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of elyakhlifi mustapha
Sent: Monday, 23 April 2007 9.56
To: R-help@stat.math.ethz.ch
Subject: [R] extract from a data frame

> Hello,
>
> I'd like to know how to extract data from a data frame. For example, how can I extract only the rows where Variety is "Victory" or "Golden Rain"? Thanks.
>
> Oats
>    Block     Variety nitro yield
> 1      I     Victory   0.0   111
> 2      I     Victory   0.2   130
> 3      I     Victory   0.4   157
> 4      I     Victory   0.6   174
> 5      I Golden Rain   0.0   117
> 6      I Golden Rain   0.2   114
> 7      I Golden Rain   0.4   161
> 8      I Golden Rain   0.6   141
> 9      I  Marvellous   0.0   105
> 10     I  Marvellous   0.2   140
> 11     I  Marvellous   0.4   118
> 12     I  Marvellous   0.6   156
> 13    II     Victory   0.0    61
> 14    II     Victory   0.2    91
> 15    II     Victory   0.4    97
> 16    II     Victory   0.6   100
> 17    II Golden Rain   0.0    70
> 18    II Golden Rain   0.2   108
> 19    II Golden Rain   0.4   126
> 20    II Golden Rain   0.6   149
> 21    II  Marvellous   0.0    96
> 22    II  Marvellous   0.2   124
> 23    II  Marvellous   0.4   121
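A self-contained sketch of the two forms above, using six rows taken from the posted data so that it can be run without the full Oats data frame. The names keep, by.index and by.subset are introduced here for illustration.

```r
# Six rows copied from the posted Oats data.
Oats <- data.frame(
  Block   = c("I", "I", "I", "II", "II", "II"),
  Variety = c("Victory", "Golden Rain", "Marvellous",
              "Victory", "Golden Rain", "Marvellous"),
  nitro   = c(0.0, 0.0, 0.0, 0.2, 0.2, 0.2),
  yield   = c(111, 117, 105, 91, 108, 124)
)

keep <- c("Victory", "Golden Rain")

# %in% tests each element of Variety against the set of wanted values.
by.index  <- Oats[Oats$Variety %in% keep, ]
by.subset <- subset(Oats, Variety %in% keep)
by.index
```

Both forms select the same rows; subset() evaluates the condition inside the data frame, so the column can be named without the Oats$ prefix.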
[R] data recoding problem
Hi R experts,

I have a data recoding problem I can't get my head around; I am not that great at the subsetting syntax. I have a dataset of longitudinal toxicity data (for multistate modelling) for which I also want to do a simple Kaplan-Meier curve of the time to first toxic event. The data for 2 cases presently look like this (one with an event, the other without), with id representing each person on study, plus follow-up time and status:

tox
      id      t event
  PMC011  0.000     0
  PMC011  3.154     0
  PMC011  5.914     0
  PMC011 12.353     0
  PMC011 18.103     1
  PMC011 24.312     0
  PMC011 30.029     0
  PMC011 47.967     0
  PMC011 96.953     0
  PMC016  0.000     0
  PMC016  3.943     0
  PMC016  5.782     0
  PMC016 11.762     0
  PMC016 17.741     0
  PMC016 23.951     0
  PMC016 28.353     0
  PMC016 44.747     0
  PMC016 89.692     0

So what I need is an output in the same column format, containing one row for each unique value of id:

  PMC011 18.103     1
  PMC016 89.692     0

In my head, I would do this by looking at each unique value of id (each unique case) and looking down the event data of each of these cases. If there is no event (event==0), I would go to the time column (t), find the max value and paste this time along with a 0 for event. If there were an event, I would need to find the minimum time associated with an event and paste it across with the event marker. I am sure someone out there can point me in the right direction to do this without tedious and slow loops. Any help greatly appreciated.

Cheers
Scott

_
Dr. Scott Williams MBBS BScMed FRANZCR
Radiation Oncologist
Peter MacCallum Cancer Centre
Melbourne, Australia
ph +61 3 9656 fax +61 3 9656 1424
[EMAIL PROTECTED]
Re: [R] extract from a data frame
Hi,

> I'd like to know how to extract data from a data frame. For example, how can I extract only the rows where Variety is "Victory" or "Golden Rain"? Thanks.
>
> Oats
>    Block Variety nitro yield
> 1      I Victory   0.0   111
> 2      I Victory   0.2   130

You can try:

  Oats[Oats$Variety == "Victory" | Oats$Variety == "Golden Rain", ]

See also the help on the subset function.

HTH,
Julien

--
Julien Barnier
Groupe de recherche sur la socialisation
ENS-LSH - Lyon, France
Re: [R] data recoding problem
One option is the following:

  do.call(rbind, lapply(split(tox, tox$id), function (x) {
      if (any(ind <- x$event == 1)) x[which(ind)[1], ] else x[nrow(x), ]
  }))

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre, School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899 Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Williams Scott [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Monday, April 23, 2007 10:14 AM
Subject: [R] data recoding problem

> Hi R experts,
>
> I have a data recoding problem I can't get my head around; I am not that great at the subsetting syntax. I have a dataset of longitudinal toxicity data (for multistate modelling) for which I also want to do a simple Kaplan-Meier curve of the time to first toxic event. The data for 2 cases presently look like this (one with an event, the other without), with id representing each person on study, plus follow-up time and status:
>
> tox
>       id      t event
>   PMC011  0.000     0
>   PMC011  3.154     0
>   PMC011  5.914     0
>   PMC011 12.353     0
>   PMC011 18.103     1
>   PMC011 24.312     0
>   PMC011 30.029     0
>   PMC011 47.967     0
>   PMC011 96.953     0
>   PMC016  0.000     0
>   PMC016  3.943     0
>   PMC016  5.782     0
>   PMC016 11.762     0
>   PMC016 17.741     0
>   PMC016 23.951     0
>   PMC016 28.353     0
>   PMC016 44.747     0
>   PMC016 89.692     0
>
> So what I need is an output in the same column format, containing one row for each unique value of id:
>
>   PMC011 18.103     1
>   PMC016 89.692     0
>
> In my head, I would do this by looking at each unique value of id (each unique case) and looking down the event data of each of these cases. If there is no event (event==0), I would go to the time column (t), find the max value and paste this time along with a 0 for event. If there were an event, I would need to find the minimum time associated with an event and paste it across with the event marker. I am sure someone out there can point me in the right direction to do this without tedious and slow loops. Any help greatly appreciated.
>
> Cheers
> Scott
>
> _
> Dr. Scott Williams MBBS BScMed FRANZCR
> Radiation Oncologist
> Peter MacCallum Cancer Centre
> Melbourne, Australia
> ph +61 3 9656 fax +61 3 9656 1424
> [EMAIL PROTECTED]

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
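To try the split/lapply suggestion, the posted example data can be typed in as below. This sketch assumes, as the suggested code does, that the rows are already sorted by time within each id; the name first.event is introduced here for illustration.

```r
# The two cases from the original post.
tox <- data.frame(
  id = rep(c("PMC011", "PMC016"), each = 9),
  t = c(0.000, 3.154, 5.914, 12.353, 18.103, 24.312, 30.029, 47.967, 96.953,
        0.000, 3.943, 5.782, 11.762, 17.741, 23.951, 28.353, 44.747, 89.692),
  event = c(0, 0, 0, 0, 1, 0, 0, 0, 0, rep(0, 9))
)

# Per id: take the first row with an event, otherwise the last row.
first.event <- do.call(rbind, lapply(split(tox, tox$id), function(x) {
  ind <- x$event == 1
  if (any(ind)) x[which(ind)[1], ] else x[nrow(x), ]
}))
first.event
```

If the rows might not be in time order, sorting with tox[order(tox$id, tox$t), ] beforehand would make the "first" and "last" rows meaningful.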
[R] subset
Hi,

OK, I understand how to use the subset function, but sometimes I need to use it to extract data by date, and with the date format that isn't so easy (see below). As in SQL, I thought it was possible to write "%/2004", but it doesn't run. Can you please help me with this?

  subset(don, Date_O in "%/2004", select = c(Annee_O, Date_O))
Re: [R] Problem with dgamma ?
On 23-Apr-07 04:41:03, ecatchpole wrote:
>   dgamma(x=1, shape=nu, rate=nu, log=TRUE)
>   [1] -2.045951
>
> This is a good example of why you should call parameters by name.
>
> Ted.

True, up to the point that the log parameter is in the 5th position in the list of dgamma() parameters, so if its value is given in any other position (here the 4th) then it needs to be named. The other arguments are given in the positions where dgamma() expects to find them:

  dgamma(x, shape, rate = 1, scale = 1/rate, log = FALSE)

Hence:

  dgamma(1,nu,nu,log=TRUE)
  [1] -2.045951
  dgamma(1,nu,nu,log=FALSE)
  [1] 0.1292572
  dgamma(1,nu,nu,log=1)
  [1] -2.045951
  dgamma(1,nu,nu,log=0)
  [1] 0.1292572

all work as expected. Tong Wang was in fact setting scale.

Ted (Harding)

> Tong Wang wrote on 04/23/2007 01:59 PM:
>> Hi All,
>>
>> Here's what I got using the dgamma function:
>>
>>   nu <- .2
>>   nu*log(nu)-log(gamma(nu))+(nu-1)*log(1)-nu*(1)
>>   [1] -2.045951
>>   dgamma(1,nu,nu,1)
>>   [1] 0.0801333
>>   dgamma(1,nu,nu,0)
>>   [1] NaN
>>   Warning message: NaNs produced in: dgamma(x, shape, scale, log)
>>
>> Could anyone tell me what is wrong here? I am using R-2.4.1 on Windows XP. Thanks a lot.
>
> --
> Dr E.A. Catchpole
> Visiting Fellow, Univ of New South Wales at ADFA, Canberra, Australia, and University of Kent, Canterbury, England
> www.pems.adfa.edu.au/~ecatchpole
> fax: +61 2 6268 8786
> ph: +61 2 6268 8895

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 23-Apr-07 Time: 10:26:01
-- XFMail --
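The point about argument matching can be checked directly. The sketch below compares the hand-written log-density from the original post with dgamma() called with named arguments; the variable names manual and by.name are introduced here for illustration.

```r
nu <- 0.2

# Log-density of a gamma distribution with shape = nu and rate = nu at x = 1,
# written out by hand as in the original post (lgamma(nu) is log(gamma(nu))).
manual <- nu * log(nu) - lgamma(nu) + (nu - 1) * log(1) - nu * 1

# Naming the arguments removes any doubt about which formal each value matches.
by.name <- dgamma(1, shape = nu, rate = nu, log = TRUE)

all.equal(manual, by.name)
exp(by.name)  # the same density on the natural scale
```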
[R] colored shading lines
Hi all,

is there any possibility to draw colored shading lines in a polygon plot? E.g.

  plot(polygon_object, col="red", density=10, angle=45)

produces only black shading lines within the polygon.

With many thanks for any hint,
Albrecht
Re: [R] how to convert the lower triangle of a matrix to a symmetric matrix
Sorry if this answer was already given, or if I miss the point, but did you have a look at the lowerTriangle and upperTriangle functions in the gdata package?

  # example
  A <- matrix(rnorm(9),3,3)
  B <- matrix(NA,dim(A)[1],dim(A)[2])
  lowerTriangle(B) <- lowerTriangle(A)
  upperTriangle(B) <- lowerTriangle(A)
  diag(B) <- diag(A)

Hope this helps,
Olivier

--
Olivier ETERRADOSSI
Maître-Assistant
CMGD / Equipe Propriétés Psycho-Sensorielles des Matériaux
Ecole des Mines d'Alès
Hélioparc, 2 av. P. Angot, F-64053 PAU CEDEX 9
tel std: +33 (0)5.59.30.54.25
tel direct: +33 (0)5.59.30.90.35
fax: +33 (0)5.59.30.63.68
http://www.ema.fr
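For comparison, a base-R sketch that needs no extra package. It is a different technique from the gdata calls above: it uses upper.tri() together with the transpose, so that each upper-triangle position [i, j] receives A[j, i] whatever the matrix size. A 4 x 4 matrix is used on purpose, since copying triangle elements in storage order only lines up by coincidence for 3 x 3.

```r
set.seed(1)
A <- matrix(rnorm(16), 4, 4)

# Copy the lower triangle of A into the upper triangle.
# t(A) holds A[j, i] at position [i, j], so the elements line up for any size.
B <- A
B[upper.tri(B)] <- t(A)[upper.tri(A)]

isSymmetric(B)
```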
[R] Help about princomp
Hello,

I have a problem with the princomp method; it seems stupid, but I don't know how to handle it. I have a dataset with some regular data and some outliers. I want to calculate a PCA on the regular data and get the scores for all data, including the outliers. Is this possible in R?

Thank you for helping!!!
Re: [R] Help about princomp
On Mon, 23 Apr 2007, annina wrote:
> Hello,
>
> I have a problem with the princomp method; it seems stupid, but I don't know how to handle it. I have a dataset with some regular data and some outliers. I want to calculate a PCA on the regular data and get the scores for all data, including the outliers. Is this possible in R?

Yes. Do you know which are the outliers? You can either fit to the 'regular data' with princomp and use predict() to get the 'scores' for all the data, or use a robust method to find the 'covmat' argument (as the help page says, you could use cov.mcd from MASS) and call princomp() on all the data.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
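A minimal sketch of the first option (fit on the regular rows, then predict() for all rows). The data are simulated, and the column names and the is.outlier indicator are assumptions made for illustration; in practice the outliers would have to be identified first.

```r
set.seed(42)
vars <- c("x1", "x2", "x3")

# Hypothetical data: 50 regular observations plus 3 outliers.
regular  <- matrix(rnorm(150), ncol = 3, dimnames = list(NULL, vars))
outliers <- matrix(rnorm(9, mean = 10), ncol = 3, dimnames = list(NULL, vars))
all.data <- as.data.frame(rbind(regular, outliers))
is.outlier <- c(rep(FALSE, 50), rep(TRUE, 3))

# Fit the PCA on the regular rows only ...
pca <- princomp(all.data[!is.outlier, ])

# ... then project every row, outliers included, onto those components.
scores <- predict(pca, newdata = all.data)
dim(scores)
```

The scores of the regular rows are the same as pca$scores; the outlier rows are centred and rotated with the quantities estimated from the regular data only.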
[R] For Help in libm_c32.so for fortran
Dear Sir,

I am running a script file for Fortran, but it shows the following error:

  .sgb.x : /sbin/loader: Fatal Error : cannot map libm_c32.so

Here sgb.x is the output file that I got after running the make file. I have searched for this library file inside the /usr/shlib directory, but it is not there. Please tell me the solution to this problem, and if you have a link from which I can download this file, kindly send that as well.

Thanking you.

Regards,
Savita Rai
CAS, IIT Delhi
Re: [R] Help about princomp
Hi Annina,

You may use the dudi.pca function in the ade4 package.

  PCA <- dudi.pca(your_data, scale = FALSE, scannf = FALSE)
  # to get scores
  PCA$li

and you're done. Maybe have a look at ?dudi.pca

Christophe

annina wrote:
> Hello,
>
> I have a problem with the princomp method; it seems stupid, but I don't know how to handle it. I have a dataset with some regular data and some outliers. I want to calculate a PCA on the regular data and get the scores for all data, including the outliers. Is this possible in R?
>
> Thank you for helping!!!

--
Christophe BONENFANT
UMR CNRS 5558, Laboratoire de Biométrie et Biologie Evolutive
Université Claude Bernard Lyon 1
43, Boulevard du 11 Novembre 1918
F-69622 Villeurbanne cedex, FRANCE
Phone: (+33) 8 72 20 98 73
E-mail: bonenfan[at]biomserv[dot]univ-lyon1[dot]fr
Re: [R] Random Forest
Ruben,

Maybe your binary response is a numeric vector; try converting it into a factor with two levels. You probably want classification rather than regression (for regression, the dependent variable should be numeric and continuous)!

Arne

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ruben Feldman
Sent: Monday, April 23, 2007 10:28 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Random Forest

> Hi R-wizards,
>
> I ran a random forest on a dataset where the response variable had two possible values. It returned a warning telling me that it did regression and asking whether that was really what I wanted. Does anybody know what the algorithm is doing when it does a regression? (The random forest is used as a regression; how does that work?)
>
> Thanks for your time!
> Ruben
Re: [R] about R squared value
Hi Nitish,

R^2 cannot take values greater than 1. By definition (see http://en.wikipedia.org/wiki/Coefficient_of_determination)

  R^2 := 1 - SSE/SST

where

  SSE = sum of squared errors
  SST = total sum of squares

R^2 > 1 would require SSE/SST < 0. Since SSE and SST are non-negative (check the formulas: they are sums of squared differences, which are necessarily non-negative), SSE/SST < 0 is impossible.

Bernd
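The definition above can be checked numerically. The sketch below uses simulated data (an assumption made for illustration) and compares the hand-computed value with the one reported by summary() for a linear model with an intercept.

```r
set.seed(7)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30)
fit <- lm(y ~ x)

sse <- sum(residuals(fit)^2)   # sum of squared errors
sst <- sum((y - mean(y))^2)    # total sum of squares
r2  <- 1 - sse / sst

c(manual = r2, from.summary = summary(fit)$r.squared)
```

Because sse and sst are both sums of squares, the ratio sse / sst cannot be negative, so r2 cannot exceed 1.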
[R] aggregate function
Hello,

is there a way to use the aggregate function to calculate monthly means when I have one column in the data frame that holds the date in the form yyyy-mm-dd? I know that it works for daily means. I would also like to do it for monthly and yearly means. Maybe there is something like aggregate(x, list(Date[%m]), mean)?

The data frame looks like:

    Date        Time   z
    2006-01-01  21:00  6,2
    2006-01-01  22:00  5,7
    2006-01-01  23:00  3,2
    2006-01-02  00:00  7,8
    2006-01-02  01:00  6,8
    2006-01-02  02:00  5,6
    . . .
    2007-03-30  22:00  5,2
    2007-03-30  23:00  8,3
    2007-03-31  00:00  6,4
    2007-03-31  01:00  7,4

Thanks for help!

--
Michél Schnitz
[EMAIL PROTECTED]
[R] a function with output from lmer
Hello, is there someone who knows this?

I'm having some trouble writing a function that returns what I want. I have these vectors:

    x1 <- factor(rep(1:2500, 3*10))
    x2 <- factor(rep(1:25000, 3))
    x3 <- factor(1:75000)
    y <- rep(rnorm(2500, mean=0, sd=2), 10*3) +
         rep(rnorm(25000, mean=0, sd=sqrt(2)), 3) +
         rnorm(75000, mean=0, sd=1)

It's a simulation of data, and I want to run it e.g. 1000 times and, in every run, produce output from

    lmer1 <- lmer(y ~ 1 + (1|x1) + (1|x2))

with the variances for x1 and x2 and the residual (e.g. in 3 vectors or a data frame or something). Is this possible?

Thanks in advance,
Rina
[R] automating merging operations from multiple dataframes
Hi,

I have a set of dataframes named AINDSLIM, BINDSLIM, CINDSLIM ... NINDSLIM. From each dataframe I want to extract two variables, pid and {w}region, where {w} means a, b, c, ... n.

At the moment my code looks like:

    PidRegion <- data.frame(pid=XWAVEID$pid)
    this.region <- AINDSLIM[,c("pid", "aregion")]
    PidRegion <- merge(PidRegion, this.region, by="pid", all=T)
    this.region <- BINDSLIM[,c("pid", "bregion")]
    PidRegion <- merge(PidRegion, this.region, by="pid", all=T)
    this.region <- CINDSLIM[,c("pid", "cregion")]
    ...
    this.region <- NINDSLIM[,c("pid", "nregion")]
    PidRegion <- merge(PidRegion, this.region, by="pid", all=T)

But surely there's a way to automate this? Any suggestions?

Jon Minton
Re: [R] limmaGUI
Quoting [EMAIL PROTECTED]:

Dear all, I have a question about limmaGUI, which is usually run in the R environment. My problem is loading data into the program. I have 6 gpr files that apparently are not compatible with limma. Every time I try to load the data (including an RNA targets file), an error appears: "Error reading files." I'm not sure, but it seems to have something to do with the format of my files (gpr). Is that the problem? Does anyone have any idea what it could be? I was wondering whether the problem would be solved if I tried GAL files. I still don't have access to the GAL files, which is why I haven't tried it yet. Thanks, Solmaz

Hi Solmaz,

for questions about BioConductor packages (such as limma or limmaGUI) you're probably better off asking in the BioConductor forum. I recommend you subscribe to it; it's very much microarray-oriented and I've learnt a lot there.

Now, about your question... I haven't used limmaGUI for a long time (I switched to the command-line limma instead, which is more flexible), but I've dealt with problems similar to the one you describe, both my own and others'. I can't tell you what your problem is, but I'll give you a couple of ideas to check; hopefully one of them gives you a solution.

First of all, I'm concerned that you say you have no GAL file. The GAL file describes what's on the array, the locations of the spots etc., and limmaGUI *requires* a GAL file, as far as I remember, so you won't get far without one! However, you say you have GenePix files (GPR), so you could make your own, as each GPR file contains the minimal info that you need in a GAL file: the columns with the headers Block, Row, Column, ID and Name.

The error reading the GPR files was usually (when I encountered it) due to non-standard headers in the files. If you select GenePix as the type of data file to load, by default it expects to find columns with the headers "F532 Median" and "F635 Median" as the source of raw intensity readings for the Cy3 and the Cy5 channels respectively. But depending on the GenePix settings when the scan was made and when the quantitation was calculated, you may have a different wavelength (maybe 685 instead of 635, etc.), so limmaGUI cannot find the right columns and returns an error.

You should open all your GPR files (in Excel, for instance; GPR is just a standard text file in tab-delimited format) and check what the headers say. Take note of them. If the headers are not the same in all files, you'll get errors.

To read GPR files with headers different from the default, you can simply use the generic option ("Other", from the menu of file formats). This opens a little window where you specify the actual column names to load from the files. Rf means red foreground (usually the Cy5 channel), Rb is red background, and similarly you get to specify the column names for Gf and Gb. So, say, you could write "F650 Median" in the Rf slot, and this tells limmaGUI to take the column you specified as the source of the signals for the Cy5 channel. "B650 Median" in the Rb slot then specifies that column from the GPR files as the background for the Cy5 channel, etc.

I'd say the reason above is the most likely cause of the problem. However, I have also seen instances where some GPR files had been modified to omit any empty spots, for example. This results in files containing different numbers of genes, and that returns errors. So make sure all the files have the right number of rows, and that the GAL file has the same number of rows too, in the same order!

I hope this helps a bit... and do check the BioConductor forum; it is a great place to ask this sort of question, especially as you are more likely to get answers directly from the people who developed the packages.

Good luck!
Jose

--
Dr. Jose I. de las Heras
Email: [EMAIL PROTECTED]
The Wellcome Trust Centre for Cell Biology
Phone: +44 (0)131 6513374
Institute for Cell Molecular Biology
Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR UK
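The advice above to open every GPR file and compare the column headers can also be done from R. What follows is only a rough sketch in base R, not part of limma or limmaGUI: gpr_headers() is a hypothetical helper, and it assumes that the column-header line is the first line containing both "Block" and "Column". The two tiny files are made-up stand-ins for real GPR files, which carry many more columns.

```r
# Hypothetical helper: return the column headers of a GPR file.
# Assumption: the header line is the first line containing "Block" and "Column".
gpr_headers <- function(path) {
  lines <- readLines(path, warn = FALSE)
  hit <- which(grepl("Block", lines, fixed = TRUE) &
               grepl("Column", lines, fixed = TRUE))[1]
  if (is.na(hit)) stop("no header line found in ", path)
  fields <- strsplit(lines[hit], "\t", fixed = TRUE)[[1]]
  gsub('"', "", fields, fixed = TRUE)   # drop quotes around header names
}

# Two made-up stand-in files with different red-channel wavelengths
f1 <- tempfile(fileext = ".gpr")
f2 <- tempfile(fileext = ".gpr")
writeLines(c("ATF\t1.0",
             "Block\tColumn\tRow\tName\tID\tF635 Median\tF532 Median"), f1)
writeLines(c("ATF\t1.0",
             "Block\tColumn\tRow\tName\tID\tF650 Median\tF532 Median"), f2)

h1 <- gpr_headers(f1)
h2 <- gpr_headers(f2)

# Headers that appear in one file but not in the other
setdiff(h1, h2)   # "F635 Median"
setdiff(h2, h1)   # "F650 Median"
```

Any non-empty difference points at the kind of header mismatch described in the message. In command-line limma, read.maimages() has a columns argument for naming the foreground and background columns explicitly; its help page gives the exact form.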
[R] Changing working directory
Good morning,

I keep copies of my .RData file in different directories for different projects on Windows XP. There is an icon on my desktop for each project, so all I have to do is click on the icon to open R for a specific project, i.e. a specific .RData file. How do I change to another .RData file from within R without first closing R?

Thanks,
Walt Paczkowski
Re: [R] Changing working directory
Walter Paczkowski wrote:

[...] How do I change to another .RData file from within R without first closing R?

Best practice is to avoid using Workspaces at all. If you really want to do what you are asking for:

    # save current workspace, if you really ...
    save.image()
    # clean up current Workspace:
    rm(list=ls(all=TRUE))
    setwd("/path/to/other")   # the directory that holds the other .RData
    load(".RData")

Uwe Ligges
Re: [R] Changing working directory
?setwd

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Walter Paczkowski
Sent: Monday, April 23, 2007 7:41 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Changing working directory

[...] How do I change to another .RData file from within R without first closing R?
[R] help on xyplot and curves
Hi,

I need to add different curves to each panel in an xyplot. I have an old function that does this using panel.number, like this:

    panel=function(x,y,panel.number,...){
      panel.xyplot(x,y,...)
      if(panel.number==1){
        panel.curve(-655.8689+769.1589*log(5)+64.7981*log(x)-206.4475*log(5)^2)
      }
      if(panel.number==2){
        panel.curve(-655.8689+769.1589*log(6)+64.7981*log(x)-206.4475*log(6)^2)
      }
    }

But now panel.number doesn't work anymore. I tried to find the new substitute, but without success.

Thanks
Ronaldo

--
Prof. Ronaldo Reis Júnior
UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva
Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
Fone: (38) 3229-8190
Re: [R] aggregate function
Try this. The first group of lines recreates your data frame, DF, and the last line is the aggregate:

    Input <- "Date Time z
    2006-01-01 21:00 6,2
    2006-01-01 22:00 5,7
    2006-01-01 23:00 3,2
    2006-01-02 00:00 7,8
    2006-01-02 01:00 6,8
    2006-01-02 02:00 5,6
    2007-03-30 22:00 5,2
    2007-03-30 23:00 8,3
    2007-03-31 00:00 6,4
    2007-03-31 01:00 7,4"
    DF <- read.table(textConnection(Input), header = TRUE, as.is = TRUE)
    DF$z <- as.numeric(sub(",", ".", DF$z))
    DF$Date <- as.Date(DF$Date)
    aggregate(DF["z"], list(yearmon = format(DF$Date, "%Y-%m")), mean)

On 4/23/07, Michel Schnitz [EMAIL PROTECTED] wrote:

Hello, is there a way to use the aggregate function to calculate monthly means [...] I would also like to do it for monthly and yearly means. Maybe there is something like aggregate(x, list(Date[%m]), mean)? [...]
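The question also asked about yearly means, which the reply above does not spell out. A possible extension, shown only as a sketch on a four-row subset of the posted data: the same aggregate() call with a different format string for the grouping key. Using read.table(text = ..., dec = ",") to parse the decimal commas is a substitution for the sub() step in the reply, not the original poster's code.

```r
# Four rows in the same layout as the posted data (decimal commas in z)
Input <- "Date Time z
2006-01-01 21:00 6,2
2006-01-02 00:00 7,8
2007-03-30 22:00 5,2
2007-03-31 00:00 6,4"

# dec = "," lets read.table parse 6,2 as 6.2 directly
DF <- read.table(text = Input, header = TRUE, as.is = TRUE, dec = ",")
DF$Date <- as.Date(DF$Date)

# Group by year-month, then by year alone
monthly <- aggregate(DF["z"], list(yearmon = format(DF$Date, "%Y-%m")), mean)
yearly  <- aggregate(DF["z"], list(year    = format(DF$Date, "%Y")),    mean)

monthly
yearly
```

The only thing that changes between the two calls is the format string: "%Y-%m" keeps year and month, "%Y" keeps the year only.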
Re: [R] automating merging operations from multiple dataframes
Consider sapply and get. There might be something like the following (untested):

    fn <- function(l){
      # l is supposed to be a letter. Errors will occur otherwise.
      # constructing names
      dfr.name <- paste(toupper(l), "INDSLIM", sep="")
      column.name <- paste(tolower(l), "region", sep="")
      # retrieving data from the environment
      this.reg <- get(dfr.name)[,c("pid", column.name)]
      # merging data frames.
      # please note <<-. This assigns the value to the variable in this
      # function environment's parent frame
      PidRegion <<- merge(PidRegion, this.reg, by="pid", all=TRUE)
      # this should help avoiding too much output
      invisible(PidRegion)
    }
    PidRegion <- data.frame(pid=XWAVEID$pid)
    sapply(letters[1:14], FUN=fn)

Jon Minton wrote:

Hi, I have a set of dataframes named AINDSLIM, BINDSLIM, CINDSLIM ... NINDSLIM. From each dataframe I want to extract two variables, pid and {w}region, where {w} means a, b, c, ... n. [...] But surely there's a way to automate this? Any suggestions?
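An alternative to the global assignment used in the reply above is to collect the two-column pieces in a list and fold merge() over it with Reduce(). This is only a sketch under stated assumptions: the data frame names follow the thread, but their contents here are made up, and only two waves (a and b) are built so the example stays small.

```r
# Toy stand-ins for the real data frames (contents are made up)
XWAVEID  <- data.frame(pid = 1:4)
AINDSLIM <- data.frame(pid = 1:3, aregion = c("N", "S", "E"), other = 1:3)
BINDSLIM <- data.frame(pid = 2:4, bregion = c("S", "E", "W"), other = 4:6)

waves <- letters[1:2]   # would be letters[1:14] for a..n in the real case

# One two-column data frame per wave: pid plus {w}region
pieces <- lapply(waves, function(w) {
  df <- get(paste0(toupper(w), "INDSLIM"))
  df[, c("pid", paste0(w, "region"))]
})

# Fold merge() over the list, starting from the frame of all pids
PidRegion <- Reduce(
  function(acc, piece) merge(acc, piece, by = "pid", all = TRUE),
  pieces,
  data.frame(pid = XWAVEID$pid)
)
PidRegion
```

Because each step is a full outer merge on pid, a pid that is missing from a wave gets NA in that wave's region column.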
Re: [R] Changing working directory
Have you looked on the 'File' menu? 'Load Workspace...' is what you need, possibly after 'Change dir...'

On Mon, 23 Apr 2007, Walter Paczkowski wrote:

[...] How do I change to another .RData file from within R without first closing R?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Changing working directory
Hi,

you seem to have mixed up two different things:

1) changing the working directory - see ?setwd, ?getwd
   However, this will NOT load another .RData file.

2) loading data - see ?load and ?save, ?save.image
   Loading a new data image will overwrite any current objects that have the same names as objects in the image; other objects are left alone.

Petr

Walter Paczkowski wrote:

[...] How do I change to another .RData file from within R without first closing R?

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
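The distinction drawn above between setwd() and load() can be seen in a short session. This is a minimal sketch, assuming a throwaway project directory under tempdir(); the directory name and the object projB_result are made up for illustration.

```r
# A throwaway "project" directory holding its own .RData (made-up path)
proj <- file.path(tempdir(), "projectB")
dir.create(proj, showWarnings = FALSE)
projB_result <- 42
save(projB_result, file = file.path(proj, ".RData"))
rm(projB_result)

# 1) Changing the working directory does not load anything
old_wd <- setwd(proj)
after_setwd <- exists("projB_result")   # the object is still absent

# 2) load() restores the saved objects and returns their names.
#    Loading into a fresh environment avoids touching same-named objects;
#    a plain load(".RData") would put them straight into the workspace.
ws <- new.env()
loaded <- load(".RData", envir = ws)
setwd(old_wd)

loaded            # names of the restored objects
ws$projB_result   # the restored value
```

Loading into a separate environment is a design choice for the sketch only; for the workflow asked about in the thread, load(".RData") after setwd() is the direct equivalent of starting R in that directory.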
Re: [R] Help on manipulating a data frame
Thank you for your reply, Gabor. Your code helped me get started in creating an id for each week of the month. What I'm really looking for, though, is a more general application where I can extract each final week of the month conditional on the pattern of values (simply plus or minus signs) of the weeks preceding it in that month (I didn't really explain that in my previous post). For example, with the data below:

                Date        Value        Sign  Week
    2005-02.4   2005-02-04   1.67427972   1    1
    2005-02.5   2005-02-11   0.02
    2005-02.6   2005-02-18   0.14221382   1    3
    2005-02.7   2005-02-25  -0.85633254  -1    4
    2005-03.8   2005-03-04   2.22073856   1    1
    2005-03.9   2005-03-11  -0.07011803  -1    2
    2005-03.10  2005-03-18   1.00035730   1    3
    2005-03.11  2005-03-25  -2.48430869  -1    4
    2005-04.12  2005-04-01  -0.04747211  -1    1
    2005-04.13  2005-04-08   0.18975338   1    2
    2005-04.14  2005-04-15  -3.54552994  -1    3
    2005-04.15  2005-04-22   0.51426586   1    4
    2005-04.16  2005-04-29  -1.52599565  -1    5

if I wanted to show the last week of any month where the pattern of the signs of the three preceding weeks was 1,-1,1, then the following would be returned:

                Date        Value        Sign  Week
    2005-03.11  2005-03-25  -2.48430869  -1    4
    2005-04.16  2005-04-29  -1.52599565  -1    5

I would greatly appreciate any hint or example which might lead me the right way on this.

Thank you,
Alf Sammassimo

----- Original Message -----
From: Gabor Grothendieck [EMAIL PROTECTED]
To: Alfonso Sammassimo [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Monday, April 23, 2007 2:31 PM
Subject: Re: [R] Help on manipulating a data frame

Do you mean you want to return the first row for each month for which the value > 0? In that case try this. The first group of lines recreates your data frame and calls it DF. by causes f to operate on a subset of rows comprising one month, extracting the first row for which the value is positive and also adding in the week number. do.call rbinds the results from each month together.

    Input <- "2007-01-05 -1.52377151
    2007-01-12 1.04787390
    2007-01-19 0.61647047
    2007-01-26 1.87864283
    2007-02-02 0.54992405
    2007-02-09 1.96850069
    2007-02-16 0.26850159
    2007-02-23 1.56305144
    2007-03-02 -4.19500573
    2007-03-09 0.77127814
    2007-03-16 0.32387312
    2007-03-23 2.02163219
    2007-03-30 0.63175605
    2007-04-06 1.33346284
    2007-04-13 0.96021569"
    DF <- read.table(textConnection(Input), col.names = c("Date", "Value"),
                     colClasses = c("Date", "numeric"))
    f <- function(x) cbind(x, Week = seq_len(nrow(x)))[which.max(x$Value > 0),]
    do.call(rbind, by(DF, format(DF$Date, "%Y-%m"), f))

On 4/22/07, Alfonso Sammassimo [EMAIL PROTECTED] wrote:

Hi R-experts, I have a large set of weekly data in this format:

    2007-01-05 -1.52377151
    2007-01-12 1.04787390
    [...]
    2007-04-13 0.96021569

How might this data be sorted to something like this?

    Date        Week of Month  Value
    2007-01-05  1              -1.52377151
    2007-01-12  2               1.04787390
    2007-01-19  3               0.61647047
    2007-01-26  4               1.87864283
    2007-02-02  1               0.54992405

My aim is to return the last value of every month where the previous values in that month were negative values, hence the need to split the data by month. Any guide as to how this might be possible without a loop? Any help would be much appreciated.

Thanks,
Alf Sammassimo
Melbourne, Australia
Re: [R] stringsAsFactor global option (was character coerced to a factor)
--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

Just one caveat. I personally would try to avoid using global options since it can cause conflicts when two different programs assume two different settings of the same global option and need to interact.

I see this argument often, and don't buy it. In any case, for this particular option, the Mayo biostatistics group (~120 users) has had stringsAsFactors=F as a global default for 15+ years now with no ill effects. It is much less confusing for both new and old users.

John Kane asked: Any idea what the rationale was for setting the option to TRUE?

When factors were first introduced, there was no option to turn them off. Reading between the lines of the white book (Statistical Models in S) that introduced them, this is my guess: they made perfect sense for the particular data sets that were being analysed by the authors at the time. Many of the defaults in the survival package, which I wrote, have exactly the same rationale --- so let us not be too harsh on an author for not foreseeing all the future consequences of a default!

A place where factors really are a pain is when the patient id is a character string. When, for instance, you subset the data to do an analysis of only the females, having the data set `remember' all of the male id's (the original levels) is non-productive in dozens of ways. For other variables factors work well and have some nice properties. In general, I've found in my work (medical research) that factors are beneficial for about 1/5 of the character variables, a PITA for 1/4, and a wash for the rest; so I prefer to do any transformations myself.

For the historically curious: in Splus, one originally fixed this with an override of the function

    as.data.frame.character <- as.data.frame.vector

before they added the global option. In R, unfortunately, this override didn't work due to namespaces, and we had to wait for the option to be added. (Another damned-if-you-do, damned-if-you-don't issue. Normally you don't want users to be able to override a base function, because 9 times out of 10 they did it by accident and don't want it either. But when a user really does want to do so ...)

Terry Therneau
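The patient-id complaint above is easy to reproduce. The sketch below uses a made-up four-patient data frame (pat, its columns and values are assumptions for illustration) and passes stringsAsFactors explicitly rather than relying on any global default, since the default has differed across R versions.

```r
# Made-up patient data for illustration
pat <- data.frame(id  = c("p01", "p02", "p03", "p04"),
                  sex = c("F", "M", "F", "M"),
                  stringsAsFactors = TRUE)

females <- pat[pat$sex == "F", ]

# The subset still "remembers" the male ids as factor levels
levels(females$id)   # "p01" "p02" "p03" "p04"
table(females$id)    # the male ids appear with a count of 0

# Two ways out: drop the unused levels afterwards ...
levels(droplevels(females)$id)   # "p01" "p03"

# ... or never create the factor in the first place
pat_chr <- data.frame(id = c("p01", "p02"), stringsAsFactors = FALSE)
class(pat_chr$id)    # "character"
```

droplevels() was added to R after this thread was written; at the time, re-applying factor() to the column achieved the same thing.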
Re: [R] Approaches of Frailty estimation: coxme vs coxph(...frailty(id, dist='gauss'))
M Karim asked about the difference between coxme(..., random= ~1|id) and coxph(... frailty(id, dist='gauss')).

1. coxme is the later routine, with more sophisticated and reliable optimization, and a wider range of models. If I get the abstract done in time, there will be a presentation at the R conference about a next release of the survival package which folds in coxme, improvements in coxme, and a suggestion of deprecated status for the frailty() function. There are data sets where frailty() gets lost in searching for the optimum and coxme does not.

2. McGilchrist suggested an REML estimator for Cox models with a Gaussian frailty; but it was motivated by analogy with linear models and not by a direct EM argument. Later work by Cortinas (PhD thesis, 2004) showed cases where it performed more poorly than the ML estimate, which does have a formal derivation due to Ripatti and Palmgren. The coxme function uses the ML estimate; frailty(, dist='gauss') uses the proposed 'reml' estimate.

I don't have answers for Karim's further questions about the existence of a routine for the positive stable distribution, or comparisons to the nltm() or frailtypack routines.

Terry Therneau
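For readers who want to try the two routines side by side, here is a hedged sketch using the kidney data set that ships with the survival package; the choice of data set and covariates is an assumption for illustration, not something from the thread. The coxme call uses the (1 | id) formula syntax of current coxme releases rather than the random= argument quoted in the message, and it only runs if the separate coxme package is installed.

```r
library(survival)   # recommended package, normally shipped with R

# kidney: infection times in dialysis patients, two rows per patient (id)
fit_frailty <- coxph(Surv(time, status) ~ age + sex +
                       frailty(id, distribution = "gaussian"),
                     data = kidney)
print(fit_frailty)

# coxme lives in its own package; skip the comparison if it is not installed
if (requireNamespace("coxme", quietly = TRUE)) {
  fit_coxme <- coxme::coxme(Surv(time, status) ~ age + sex + (1 | id),
                            data = kidney)
  print(fit_coxme)
}
```

The two fits need not report the same frailty variance, which is consistent with the message's point that the routines use different estimators.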
Re: [R] Help about princomp
annina wossona at yahoo.co.uk writes:

Hello, I have a problem with the princomp method, it seems stupid but I don't know how to handle it. I have a dataset with some regular data and some outliers. I want to calculate a PCA on the regular data and get the scores for all data, including the outliers. Is this possible in R? Thank you for helping!!!

Dear Annina,

Yes, this is possible in R. Both 'prcomp' (which I recommend) and 'princomp' have a 'predict' method which can take 'newdata' as an argument. The following assumes that 'keep' is a vector which is TRUE for the cases you keep and FALSE for those ignored:

    pc <- prcomp(x[keep,])
    predict(pc, newdata=x)

cheers,
Jari Oksanen
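The two-line recipe above can be tried on a small example. This is a minimal sketch on made-up data (the matrix x, its column names and the single artificial outlier are assumptions for illustration): the PCA is fitted on the kept rows only, and predict() then projects every row, the outlier included, onto those components.

```r
# Made-up data: 20 observations of 2 variables, the last one an outlier
set.seed(42)
x <- matrix(rnorm(40), ncol = 2, dimnames = list(NULL, c("v1", "v2")))
x[20, ] <- c(10, -10)

keep <- rep(TRUE, nrow(x))
keep[20] <- FALSE                    # exclude the outlier from the fit

pc <- prcomp(x[keep, ])              # PCA on the regular data only
scores <- predict(pc, newdata = x)   # scores for all rows, outlier included

dim(scores)      # one row per original observation
scores[20, ]     # where the outlier falls on the fitted components
```

For the kept rows, the predicted scores are the same as pc$x, because predict() applies the centring and rotation estimated from exactly those rows.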
Re: [R] Random Forest
Ruben,

I'm assuming you really do want to do a classification? Check out ?factor

I'm guessing you have coded MMS_ENABLED_HANDSET as 0, 1, or some such numeric coding. I suggest you do:

    dat$MMS_ENABLED_HANDSET <- factor(dat$MMS_ENABLED_HANDSET)

to force your response variable to be a factor (AKA categorical). And, perhaps, label your levels with something like:

    levels(dat$MMS_ENABLED_HANDSET) <- c("Not Enabled", "MMS Enabled")

On 4/23/07, Ruben Feldman [EMAIL PROTECTED] wrote:

Hi R-wizards, I ran a random forest on a dataset where the response variable had two possible values. It returned a warning telling me that it did regression and asking whether that was really what I wanted. Does anybody know what is being done in terms of the algorithm when it does a regression? (The random forest is used as a regression; how does that work?) Thanks for your time! Ruben

--
HTH,
Jim Porzak
San Francisco, CA
http://www.linkedin.com/in/jimporzak
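The factor conversion suggested above can be sketched end to end. Everything in the example is made up (the data frame dat, the column resp and the predictors x1 and x2 are assumptions for illustration), and the randomForest call only runs if that package is installed. As for the question about regression mode: with a numeric 0/1 response the forest averages the response within terminal nodes and across trees, so its predictions are numbers between 0 and 1 rather than class labels.

```r
# Made-up data: a 0/1 response stored as numbers, plus two predictors
set.seed(1)
dat <- data.frame(resp = rep(c(0, 1), each = 25),
                  x1 = rnorm(50), x2 = rnorm(50))

is.numeric(dat$resp)   # TRUE: a numeric response is what triggers regression

# Convert to a two-level factor with readable labels
dat$resp <- factor(dat$resp, levels = c(0, 1), labels = c("no", "yes"))
table(dat$resp)

# Only attempt the fit if the randomForest package is available
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(resp ~ x1 + x2, data = dat)
  print(rf$type)       # "classification" once the response is a factor
}
```

Setting levels and labels in one factor() call avoids relying on the order in which the levels happen to be sorted.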
Re: [R] Open source community help-desks
Ted Harding wrote:

but, in the meanwhile, perhaps we need a revision of Ayn Rand's Virtue of Selfishness and how it may or may not extend to the open-source community.

Ayn Rand's concept of selfishness is of course not the standard one (gratifying oneself in disregard for others), and can (if I have it right) well embrace ensuring that the self is well looked after while extending one's resulting strength, vigour and survival to the benefit of others. And I think this may be a good analogy of the way the Open Source community works.

Ayn Rand claimed to base her philosophical system on three ``axioms'':

(1) Existence exists.
(2) Consciousness is conscious.
(3) [I can't --- thank God --- remember.]

When I was in graduate school, lo these many years ago, I had a friend who was a very right-wing person from the South of the U. S. and who might have been expected to have some sympathy with Ayn Rand's views. This friend had a gift for succinct and pithy aphorisms. He remarked: ``Anyone who can say `Existence exists' with a straight face is either a fool or a charlatan.''

cheers,
Rolf Turner
[EMAIL PROTECTED]
Re: [R] Help on manipulating a data frame
Just change f appropriately, e.g. f - function(x) { v - embed(x$Sign, 4) %*% c(0, 1, -1, 1) == 3 if (any(v)) x[which.max(v) + 3, ] } On 4/23/07, Alfonso Sammassimo [EMAIL PROTECTED] wrote: Thankyou for your reply Gabor. Your code helped me get started in creating id for each week of month. What I'm really looking for though is a more general application where I can extract each final week of the month conditional on the pattern of values (simply plus or minus signs) of the weeks preceding it in that month (i didnt really explain that in my previous post). For example, with the data below: Date Value Sign Week 2005-02.4 2005-02-04 1.6742797211 2005-02.5 2005-02-11 0.02 2005-02.6 2005-02-18 0.1422138213 2005-02.7 2005-02-25 -0.85633254 -14 2005-03.8 2005-03-04 2.2207385611 2005-03.9 2005-03-11 -0.07011803 -12 2005-03.10 2005-03-18 1.0003573013 2005-03.11 2005-03-25 -2.48430869 -14 2005-04.12 2005-04-01 -0.04747211 -11 2005-04.13 2005-04-08 0.1897533812 2005-04.14 2005-04-15 -3.54552994 -13 2005-04.15 2005-04-22 0.5142658614 2005-04.16 2005-04-29 -1.52599565 -15 if I wanted to show the last week of any months where the pattern of the signs of the three preceding weeks was 1,-1,1 , then the following would be returned: Date Value Sign Week 2005-03.11 2005-03-25 -2.48430869 -1 4 2005-04.16 2005-04-29 -1.52599565 -1 5 I would greatly appreciate any hint or example which might lead me the right way on this. Thankyou, Alf Sammassimo - Original Message - From: Gabor Grothendieck [EMAIL PROTECTED] To: Alfonso Sammassimo [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Monday, April 23, 2007 2:31 PM Subject: Re: [R] Help on manipulating a data frame Do you mean you want to return the first row for each mont for which the value 0? In that case try this. The first group of lines recreates your data frame and calls it DF. 
by causes f to operate on a subset of rows comprising one month extracting the first row for which the value is positive and also adding in the week number. do.call rbinds the results from each month altogether. Input - 2007-01-05 -1.52377151 2007-01-12 1.04787390 2007-01-19 0.61647047 2007-01-26 1.87864283 2007-02-02 0.54992405 2007-02-09 1.96850069 2007-02-16 0.26850159 2007-02-23 1.56305144 2007-03-02 -4.19500573 2007-03-09 0.77127814 2007-03-16 0.32387312 2007-03-23 2.02163219 2007-03-30 0.63175605 2007-04-06 1.33346284 2007-04-13 0.96021569 DF - read.table(textConnection(Input), col.names = c(Date, Value), colClasses = c(Date, numeric)) f - function(x) cbind(x, Week = seq_len(nrow(x)))[which.max(x$Value 0),] do.call(rbind, by(DF, format(DF$Date, %Y-%m), f)) On 4/22/07, Alfonso Sammassimo [EMAIL PROTECTED] wrote: Hi R-experts, I have a large set of weekly data in this format: 2007-01-05 -1.52377151 2007-01-12 1.04787390 2007-01-19 0.61647047 2007-01-26 1.87864283 2007-02-02 0.54992405 2007-02-09 1.96850069 2007-02-16 0.26850159 2007-02-23 1.56305144 2007-03-02 -4.19500573 2007-03-09 0.77127814 2007-03-16 0.32387312 2007-03-23 2.02163219 2007-03-30 0.63175605 2007-04-06 1.33346284 2007-04-13 0.96021569 How might this data be sorted to something like this?: Date Week of Month Value 2007-01-05 1 -1.52377151 2007-01-12 2 1.04787390 2007-01-19 3 0.61647047 2007-01-26 4 1.87864283 2007-02-02 1 0.54992405 My aim is to return the last value of every month where the previous values in that month were negative values, hence the need to split the data by month. Any guide as to how this might this be possible without a loop? Any help would be much appreciated. Thanks, Alf Sammassimo Melbourne, Australia __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] stringsAsFactor global option (was character coerced to a factor)
A place where factors really are a pain is when the patient id is a character string. When, for instance, you subset the data to do an analysis of only the females, having the data set `remember' all of the male id's (the original levels) is non-productive in dozens of ways. For other variables factors work well and have some nice properties. In general, I've found in my work (medical research) that factors are beneficial for about 1/5 of the character variables, a PITA for 1/4, and a wash for the rest; so prefer to do any transformations myself.

It seems to me that the most important difference between factors and character vectors is that factors also store the range of the variable. You could imagine doing something similar for continuous variables. This would have the interesting property that plots of subsets would have the same range as plots of the original data. I'd imagine, just as with factors, this would be useful and frustrating in equal parts.

In terms of which should be the default, I can imagine two arguments:

* keep to the original format of the data as closely as possible: character vectors should be the default
* maintain as much information about the original data as possible: factors should be the default.

Hadley
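To make the point about subsets concrete, here is a minimal sketch of a factor "remembering" its original levels after subsetting. The data frame and its column names are made up for illustration; only base R is used.

```r
# Hypothetical patient data; the names and values are invented for illustration.
patients <- data.frame(
  id  = factor(c("p01", "p02", "p03", "p04")),
  sex = c("F", "M", "F", "M")
)

females <- subset(patients, sex == "F")

# The subset has two rows, but the factor still carries all four levels.
levels_before <- nlevels(females$id)   # 4
counts_before <- table(females$id)     # p02 and p04 appear with count 0

# Re-creating the factor drops the unused levels
# (droplevels(females$id) does the same in more recent versions of R).
females$id <- factor(females$id)
levels_after <- nlevels(females$id)    # 2
```

Storing the id as a character vector from the start avoids the issue entirely, which is the trade-off discussed above.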
Re: [R] help on xyplot and curves
See: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/90294.html

On 4/23/07, Ronaldo Reis Junior [EMAIL PROTECTED] wrote:

Hi, I need to add a different curve to each panel in an xyplot. I have an old function that does this using panel.number, like this:

    panel=function(x,y,panel.number,...){
        panel.xyplot(x,y,...)
        if(panel.number==1){
            panel.curve(-655.8689+769.1589*log(5)+64.7981*log(x)-206.4475*log(5)^2)
        }
        if(panel.number==2){
            panel.curve(-655.8689+769.1589*log(6)+64.7981*log(x)-206.4475*log(6)^2)
        }
    }

But now panel.number doesn't work anymore. I have tried to find the new substitute, but without success.

Thanks
Ronaldo
--
Prof. Ronaldo Reis Júnior
UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva
Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
Fone: (38) 3229-8190 | [EMAIL PROTECTED] | [EMAIL PROTECTED]
ICQ#: 5692561 | LinuxUser#: 205366
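For readers without access to the linked post: in current lattice releases the panel index is obtained by calling panel.number() inside the panel function instead of receiving it as an argument. The sketch below assumes that is the substitute being referred to (I have not checked the linked message), and the data frame d is simulated purely for illustration.

```r
library(lattice)

# Simulated data: two panels, one per value of g (5 and 6).
set.seed(1)
d <- data.frame(x = rep(1:20, 2), g = rep(c(5, 6), each = 20))
d$y <- -655.8689 + 769.1589 * log(d$g) + 64.7981 * log(d$x) -
  206.4475 * log(d$g)^2 + rnorm(40, sd = 5)

p <- xyplot(y ~ x | factor(g), data = d,
  panel = function(x, y, ...) {
    panel.xyplot(x, y, ...)
    # panel.number() is called here; it is no longer a panel argument.
    k <- c(5, 6)[panel.number()]
    # from/to are given explicitly so log(x) is never evaluated at x <= 0.
    panel.curve(-655.8689 + 769.1589 * log(k) + 64.7981 * log(x) -
                  206.4475 * log(k)^2,
                from = min(x), to = max(x))
  })
```

Printing p (or just typing it at the prompt) draws the two panels, each with its own curve.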
[R] data frame
Hello, I want to print something like this:

Class    Levels  Values
Id_TrT1  1       2
Id_Geno  7       64208 64209 64210 64211 64212 64213 64214
Id_Rep   2       1 2

Is it possible? I have some problems. I think that I should use data.frame with matrix, but I'm not sure and perhaps that's wrong.
Re: [R] stringsAsFactor global option (was character coerced to a factor)
On Mon, 23 Apr 2007, Terry Therneau wrote:

--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

Just one caveat. I personally would try to avoid using global options since it can cause conflicts when two different programs assume two different settings of the same global option and need to interact.

I see this argument often, and don't buy it. In any case, for this particular option, the Mayo biostatistics group (~120 users) has had stringsAsFactors=F as a global default for 15+ years now with no ill effects. It is much less confusing for both new and old users.

John Kane asked

Any idea what the rationale was for setting the option to TRUE?

When factors were first introduced, there was no option to turn them off. Reading between the lines of the white book (Statistical Models in S) that introduced them, this is my guess: they made perfect sense for the particular data sets that were being analysed by the authors at the time. Many of the defaults in the survival package, which I wrote, have exactly the same rationale --- so let us not be too harsh on an author for not foreseeing all the future consequences of a default!

A place where factors really are a pain is when the patient id is a character string. When, for instance, you subset the data to do an analysis of only the females, having the data set `remember' all of the male id's (the original levels) is non-productive in dozens of ways. For other variables factors work well and have some nice properties. In general, I've found in my work (medical research) that factors are beneficial for about 1/5 of the character variables, a PITA for 1/4, and a wash for the rest; so prefer to do any transformations myself.

For the historically curious: In Splus, one originally fixed this with an override of the function

    as.data.frame.character <- as.data.frame.vector

before they added the global option. In R, unfortunately, this override didn't work due to namespaces, and we had to wait for the option to be added. (Another damned-if-you-do damned-if-you-don't issue. Normally you don't want users to be able to override a base function, because 9 times out of 10 they did it by accident and don't want it either. But when a user really does want to do so ...)

That is what 'assignInNamespace' is for (and it came in with namespaces).

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Package installed, functional but not available
Hello, when I run packageStatus(), I get the following results:

> packageStatus()
Number of installed packages:

                            ok upgrade unavailable
  /home/fernando/my_library 38       0           1
  /usr/local/lib/R/library  28       0           0

Number of available packages (each package/bundle counted only once):

                                          installed not installed
  http://cran-r.c3sl.ufpr.br/src/contrib         51           957

i.e., there is an unavailable package in my personal library. With summary(packageStatus()) I see that the unavailable package is one that I wrote myself and installed via R CMD INSTALL. Although it says it is unavailable, I can load the package with library() and use its functions in the usual way. There is no problem at all here since I can use the functions. I was just curious about what this "unavailable" classification really means. My guess is that this is a package that is not on CRAN (?).

This is

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R

Thanks for any explanations,
---
Fernando Mayer
Fisheries Study Group
Technology, Earth and Ocean Sciences Center
University of Vale do Itajaí
Itajaí - SC - Brazil
Re: [R] aggregate function
It works. Thanks a lot.

Gabor Grothendieck wrote:

Try this. The first group of lines recreates your data frame, DF, and the last line is the aggregate:

    Input <- "Date Time z
    2006-01-01 21:00 6,2
    2006-01-01 22:00 5,7
    2006-01-01 23:00 3,2
    2006-01-02 00:00 7,8
    2006-01-02 01:00 6,8
    2006-01-02 02:00 5,6
    2007-03-30 22:00 5,2
    2007-03-30 23:00 8,3
    2007-03-31 00:00 6,4
    2007-03-31 01:00 7,4"
    DF <- read.table(textConnection(Input), header = TRUE, as.is = TRUE)
    DF$z <- as.numeric(sub(",", ".", DF$z))
    DF$Date <- as.Date(DF$Date)
    aggregate(DF["z"], list(yearmon = format(DF$Date, "%Y-%m")), mean)

On 4/23/07, Michel Schnitz [EMAIL PROTECTED] wrote:

Hello, is there a way to use the aggregate function to calculate a monthly mean when one column of my data frame holds the date as yyyy-mm-dd? I know that it works for daily means. I would also like to do it for monthly and yearly means. Maybe there is something like aggregate(x, list(Date[%m]), mean)? The data frame looks like:

Date       Time  z
2006-01-01 21:00 6,2
2006-01-01 22:00 5,7
2006-01-01 23:00 3,2
2006-01-02 00:00 7,8
2006-01-02 01:00 6,8
2006-01-02 02:00 5,6
. . .
2007-03-30 22:00 5,2
2007-03-30 23:00 8,3
2007-03-31 00:00 6,4
2007-03-31 01:00 7,4

Thanks for help!
--
Michél Schnitz [EMAIL PROTECTED]

--
Michél Schnitz [EMAIL PROTECTED]
Scharrenstrasse 07
06108 Halle-Saale
phone: +0049-(0)345- 290 85 24
mobile: +0049-(0)176- 239 000 64
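The original question also asked about yearly means. The same approach should carry over by changing only the format string used for grouping; the sketch below rebuilds the sample data from the thread and computes both. The names monthly and yearly are introduced here for illustration.

```r
Input <- "Date Time z
2006-01-01 21:00 6,2
2006-01-01 22:00 5,7
2006-01-01 23:00 3,2
2006-01-02 00:00 7,8
2006-01-02 01:00 6,8
2006-01-02 02:00 5,6
2007-03-30 22:00 5,2
2007-03-30 23:00 8,3
2007-03-31 00:00 6,4
2007-03-31 01:00 7,4"

DF <- read.table(textConnection(Input), header = TRUE, as.is = TRUE)
DF$z <- as.numeric(sub(",", ".", DF$z))   # decimal comma -> decimal point
DF$Date <- as.Date(DF$Date)

# Group by "year-month" for monthly means ...
monthly <- aggregate(DF["z"], list(yearmon = format(DF$Date, "%Y-%m")), mean)

# ... and by year alone for yearly means.
yearly <- aggregate(DF["z"], list(year = format(DF$Date, "%Y")), mean)
```

Any strftime-style format accepted by format() on a Date can serve as the grouping key in the same way.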
Re: [R] help on xyplot and curves
On Monday 23 April 2007 10:38, Gabor Grothendieck wrote:

See: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/90294.html

Thanks, it works.

Bye
Ronaldo
--
Small debts are as annoying as flies. Large ones, logically, ought to be as terrible as lions, and yet they are very tame. --Machado de Assis
--
Prof. Ronaldo Reis Júnior
UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva
Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
Fone: (38) 3229-8190 | [EMAIL PROTECTED] | [EMAIL PROTECTED]
ICQ#: 5692561 | LinuxUser#: 205366
Re: [R] data frame
How about:

    ?str

Ever considered reading an introductory text? Find some here: http://cran.r-project.org/other-docs.html

Stefan

elyakhlifi mustapha wrote:

Hello, I want to print something like this:

Class    Levels  Values
Id_TrT1  1       2
Id_Geno  7       64208 64209 64210 64211 64212 64213 64214
Id_Rep   2       1 2

Is it possible? I have some problems. I think that I should use data.frame with matrix, but I'm not sure and perhaps that's wrong.
Re: [R] Package installed, functional but not available
If you read the help page it says

Description:

     Summarize information about installed packages and packages available at various repositories, and automatically upgrade outdated packages.

...

   avail: a data frame with columns as the _matrix_ returned by 'available.packages' plus 'Status', a factor with levels 'c("installed", "not installed", "unavailable")'.

so this is 'available' as in 'available.packages' (qv).

On Mon, 23 Apr 2007, Fernando Mayer wrote:

Hello, when I run packageStatus(), I get the following results:

> packageStatus()
Number of installed packages:

                            ok upgrade unavailable
  /home/fernando/my_library 38       0           1
  /usr/local/lib/R/library  28       0           0

Number of available packages (each package/bundle counted only once):

                                          installed not installed
  http://cran-r.c3sl.ufpr.br/src/contrib         51           957

i.e., there is an unavailable package in my personal library. With summary(packageStatus()) I see that the unavailable package is one that I wrote myself and installed via R CMD INSTALL. Although it says it is unavailable, I can load the package with library() and use its functions in the usual way. There is no problem at all here since I can use the functions. I was just curious about what this "unavailable" classification really means. My guess is that this is a package that is not on CRAN (?).

No, not on the repositories you specified.

This is

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R

Thanks for any explanations,
---
Fernando Mayer
Fisheries Study Group
Technology, Earth and Ocean Sciences Center
University of Vale do Itajaí
Itajaí - SC - Brazil

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

PLEASE do.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] subset
What format does your date have? This is essential here. However, it must be something like

    subset(yourdata, year %in% 2004)

How to extract the year from your date you must find out yourself (it depends on the date's format).

Ever considered reading an introductory text? Find some here: http://cran.r-project.org/other-docs.html

Stefan

elyakhlifi mustapha wrote:

Hi, OK, I understand how to use the subset function, but sometimes I need to use it to extract data by date, and with its format that isn't so easy (like below). For example, as in SQL, I thought that it was possible to write "%/2004", but it doesn't run. Can you help me with this, please?

    subset(don, Date_O in "%/2004", select = c(Annee_O, Date_O))
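As a rough illustration of the "extract the year first" advice, here is a sketch under the assumption that Date_O is stored as day/month/year text (suggested by the "%/2004" attempt, but not confirmed in the thread). The data frame below is invented; only the column names come from the question.

```r
# Invented example data; the dd/mm/yyyy text format is an assumption.
don <- data.frame(
  Annee_O = c(2003, 2004, 2004),
  Date_O  = c("15/11/2003", "02/03/2004", "21/09/2004"),
  stringsAsFactors = FALSE
)

# Option 1: convert to Date, then compare the formatted year.
don$Date_O <- as.Date(don$Date_O, format = "%d/%m/%Y")
res <- subset(don, format(Date_O, "%Y") %in% "2004",
              select = c(Annee_O, Date_O))

# Option 2: the closest analogue of SQL's LIKE '%/2004' on the raw text,
# here applied to the dates formatted back as dd/mm/yyyy.
res_text <- don[grepl("/2004$", format(don$Date_O, "%d/%m/%Y")), ]
```

If the dates use another layout, only the format string passed to as.Date() needs to change.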
[R] chi square problem
Dear all, I have a problem that I could solve neither with SPSS nor with R. Please excuse me if it is a trivial question, but I did not find any solution. I have the following practical problem: we have a product that has effects A, B, C, ... on health (we differentiated about 30) and we want to know whether these effects are associated with its physical properties (shape, size, color). We listed all 30 effects that were important for us, and the subjects received a list of photos in which the product was shown in different sizes, colors, etc. There were 20 possibilities. The subjects had to assign one of these photos to each effect in the list. If I run a chi-square test, it says only that there is a significant effect. I also want to know which property or properties caused this significant deviation from the expected frequencies. I would appreciate your help.

Daniel
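One common way to see which cells drive a significant chi-square result is to look at the residuals returned by chisq.test(). This is only a sketch of that general technique, not an answer given in the thread, and the 3 x 3 table below is invented (the real table would be about 30 effects by 20 photos).

```r
# Invented counts: rows are effects, columns are photos.
tab <- matrix(c(30, 10, 10,
                10, 20, 20,
                10, 20, 20),
              nrow = 3, byrow = TRUE,
              dimnames = list(effect = c("A", "B", "C"),
                              photo  = c("p1", "p2", "p3")))

res <- chisq.test(tab)

# Standardized residuals: large absolute values (roughly beyond 2) mark
# cells that deviate most from the expected frequencies.
std_res <- res$stdres
biggest <- which(abs(std_res) == max(abs(std_res)), arr.ind = TRUE)
```

With many cells, some large residuals will appear by chance, so they are better read as a guide to where to look than as separate tests.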
[R] regarding redirecting output of GARCH to a file
Hi All R Experts,

I wrote this code so that all the summaries are stored in one file, so that I can see which of them fits best. But the results.txt file just contains

    * ESTIMATION WITH ANALYTICAL GRADIENT *

many times, i.e. 25x25 = 625. Please help in sending the summaries into one file :-)

    sink("C:/results.txt")
    for ( i in 1:25 )
      for ( j in 1:25 ) {
        x1.garch <- garch(ts(x1[,2]), c(i,j))
        summary(x1)
      }
    sink()

Thanks in advance
-gaurav
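Two things in the loop above probably explain the empty file: results are not auto-printed inside a for loop, so summary() needs an explicit print(), and summary(x1) summarises the data rather than the fitted object x1.garch. The sketch below shows the pattern with lm() on the built-in cars data as a stand-in model, so that it runs without tseries; the file name is a temporary file rather than C:/results.txt.

```r
out_file <- tempfile(fileext = ".txt")   # stand-in for C:/results.txt

sink(out_file)
for (i in 1:2) for (j in 1:2) {
  # Stand-in model; with tseries this line would be
  #   fit <- garch(ts(x1[, 2]), c(i, j))
  fit <- lm(dist ~ poly(speed, i + j), data = cars)
  cat("order:", i, j, "\n")
  print(summary(fit))   # explicit print(): no auto-printing inside loops
}
sink()

n_blocks <- length(grep("^order:", readLines(out_file)))
```

As far as I know, the repeated "ESTIMATION WITH ANALYTICAL GRADIENT" banner comes from the optimiser's trace output, and the tseries documentation lists a trace setting in garch.control() that can be switched off, e.g. garch(..., trace = FALSE); please check ?garch.control before relying on that.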
Re: [R] data frame
You can build the data frame with:

    dat <- data.frame(Class=I("Id_TrT1"), Levels=I(1), Values=I(2))
    new.info <- c(Class="Id_Geno", Levels=7, Values="64208 64209 64210 64211 64212 64213 64214")
    dat <- rbind(dat, new.info)
    dat
    new.info <- c(Class="Id_Rep", Levels=2, Values="1 2")
    dat <- rbind(dat, new.info)
    dat

It works. The R console result can be seen in the attachment.

CU, Corinna

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of elyakhlifi mustapha
Sent: Monday, 23 April 2007 16:02
To: R-help@stat.math.ethz.ch
Subject: [R] data frame

Hello, I want to print something like this:

Class    Levels  Values
Id_TrT1  1       2
Id_Geno  7       64208 64209 64210 64211 64212 64213 64214
Id_Rep   2       1 2

Is it possible? I have some problems. I think that I should use data.frame with matrix, but I'm not sure and perhaps that's wrong.
[R] Bug in R 2.4.1 ?
Hello everybody,

I'm using hdf5 files to store results from intermediate calculations. These are usually part of a list, called res. As I want the hdf files to contain all the members of res in their top directory, I used to do

    attach(res)
    do.call(hdf5save, args=c(fileout=file.path(dir, ofile), as.list(names(res))))
    detach(res)

which did what I wanted (R version 2.3.1 under Ubuntu Edgy). Since the upgrade to Ubuntu Feisty Fawn, which ships with R 2.4.1, the code above causes a crash:

     *** caught segfault ***
    address 0x11, cause 'memory not mapped'

    Traceback:
     1: .External(do_hdf5save, call, sys.frame(sys.parent()), fileout, ..., PACKAGE = "hdf5")
     2: hdf5save(fileout = "tex/ABpattern_pub/data/knnTest/gTest_annAB.1.statsAll.hdf5", "newman", "hist", "graphProp", "graphBins")
     3: do.call(hdf5save, args = c(fileout = file.path(dir, ofile), as.list(names(res))))
     4: avgGraphData(dir = "tex/ABpattern_pub/data/knnTest")

Any ideas on how to fix this or what is wrong? To me it seems to be a bug introduced in R 2.4.1.

Greetings,
Sebastian
Re: [R] Bug in R 2.4.1 ?
On 4/23/2007 10:56 AM, Sebastian Weber wrote:

Hello everybody,

I'm using hdf5 files to store results from intermediate calculations. These are usually part of a list, called res. As I want the hdf files to contain all the members of res in their top directory, I used to do

    attach(res)
    do.call(hdf5save, args=c(fileout=file.path(dir, ofile), as.list(names(res))))
    detach(res)

which did what I wanted (R version 2.3.1 under Ubuntu Edgy). Since the upgrade to Ubuntu Feisty Fawn, which ships with R 2.4.1, the code above causes a crash:

     *** caught segfault ***
    address 0x11, cause 'memory not mapped'

    Traceback:
     1: .External(do_hdf5save, call, sys.frame(sys.parent()), fileout, ..., PACKAGE = "hdf5")
     2: hdf5save(fileout = "tex/ABpattern_pub/data/knnTest/gTest_annAB.1.statsAll.hdf5", "newman", "hist", "graphProp", "graphBins")
     3: do.call(hdf5save, args = c(fileout = file.path(dir, ofile), as.list(names(res))))
     4: avgGraphData(dir = "tex/ABpattern_pub/data/knnTest")

Any ideas on how to fix this or what is wrong? To me it seems to be a bug introduced in R 2.4.1.

hdf5save is a function in the hdf5 contributed package, so you should start with its maintainer, Marcus G. Daniels [EMAIL PROTECTED]. But before you bother him, make sure you're using the latest release of it. If you still have problems, give him the usual details requested in the posting guide.

Duncan Murdoch
Re: [R] Random Forest
Hi, Ruben:

    fit$confusion

If you provide your test data, then you can also access the confusion matrix of the test data with

    fit$test$confusion

Details of how to use randomForest are in

    ?randomForest

HTH, Weiwei

On 4/22/07, Ruben Feldman [EMAIL PROTECTED] wrote:

Hi, I am trying to print out my confusion matrix after having created my random forest. I have put in this command:

    fit <- randomForest(MMS_ENABLED_HANDSET~., data=dat, ntree=500, mtry=14, na.action=na.omit, confusion=TRUE)

but I can't get it to give me the confusion matrix. Does anyone know how this works?

Thanks!
Ruben

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?" "No, I did not. But I believed..." ---Matrix III
[R] Dominance in qtl model
Hi, I'm using R for a QTL analysis of SNP data. I was wondering if anyone had any advice on fitting a dominance effect into the following function:

    myfun4 <- function (x) {
        x <- scan(con, nmax=169)
        y <- unique(x[which(!is.na(x))])
        if (length(y) > 1) {
            summary(lme(Ad ~ x, random= ~1|sire, na.action=na.omit))
        } else {
            print("no.infomation")
        }
    }

con is the connection to a file of the genotypes for each SNP. It is set up as a continuous string of genotypes (0, 1, 2), the first 169 for the first SNP, the second 169 for the second SNP and so on. I need a way of determining whether the mean of genotype 1 deviates significantly from the mean of genotypes 0 and 2. Any help would be greatly appreciated.

Cheers, Joseph
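One common parameterisation (not something proposed in the thread) is to split the genotype into an additive covariate and a dominance covariate; the dominance coefficient then estimates how far the heterozygote mean lies from the midpoint of the two homozygote means, and its t-test addresses the question asked. The sketch below uses simulated data; the sire structure, effect sizes and the names add and dom are all assumptions.

```r
library(nlme)

# Simulated stand-in for one SNP: 13 sires with 13 offspring each (169 animals).
set.seed(42)
n_sire <- 13
n_per  <- 13
dat <- data.frame(sire = factor(rep(seq_len(n_sire), each = n_per)),
                  x    = sample(0:2, n_sire * n_per, replace = TRUE))

dat$add <- dat$x - 1                 # additive covariate: -1, 0, 1
dat$dom <- as.numeric(dat$x == 1)    # dominance covariate: 1 for heterozygotes

sire_eff <- rnorm(n_sire, sd = 1)
dat$Ad <- 10 + 1.5 * dat$add + 1.0 * dat$dom +
  sire_eff[dat$sire] + rnorm(nrow(dat))

fit <- lme(Ad ~ add + dom, random = ~ 1 | sire, data = dat,
           na.action = na.omit)

# Row "dom": estimate, standard error, DF, t-value and p-value for the
# deviation of genotype 1 from the midpoint of genotypes 0 and 2.
dom_test <- summary(fit)$tTable["dom", ]
```

Inside myfun4 the same two covariates could be built from x right after the scan() call, keeping the rest of the function unchanged.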
[R] summary and min max
Hi, I came across a case where there's a discrepancy between minimum and maximum values reported by 'summary' and the 'min' and 'max' functions:

--cut here---start--
R> str(tt)
 num [1:1397] 1952 1970 1976 1967 1946 ...
R> summary(tt)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1920    1960    1970    1970    1980    2000
R> min(tt)
[1] 1918
R> max(tt)
[1] 2001
R> sessionInfo()
R version 2.5.0 beta (2007-04-12 r41139)
x86_64-pc-linux-gnu

locale:
LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=en_CA.UTF-8;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
  lattice
"0.14-17"
--cut here---end--

So this is a simple numeric vector, without any NA's, so I'm not sure what's causing the discrepancy between these functions. Any suggestions as to what to look for are welcome. Thanks.

Cheers,
--
Seb
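The numbers shown are consistent with summary() reporting values rounded to a few significant digits rather than with any difference in the data: 1918 and 2001 rounded to three significant digits give exactly the printed 1920 and 2000. This is a guess at the cause based on the output above, not something confirmed in the thread, and the short vector below stands in for the real tt.

```r
tt_range <- c(1918, 2001)        # the true min and max reported in the post

rounded <- signif(tt_range, 3)   # 1920 2000, matching the summary() output

# Asking summary() for more significant digits avoids the rounding.
s <- summary(tt_range, digits = 7)
```

How many digits summary() uses by default depends on the R version and on options("digits"), so the exact display may differ between installations.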
Re: [R] data frame
It's not usual to represent structures in this form in R, but you can do it if you really want:

    data.frame(A = letters[1:3], B = 1:3, C = I(list(2, 1:6, 9)))

Note the I (capital i) to make sure the list gets passed in as is.

On 4/23/07, elyakhlifi mustapha [EMAIL PROTECTED] wrote:

Hello, I want to print something like this:

Class    Levels  Values
Id_TrT1  1       2
Id_Geno  7       64208 64209 64210 64211 64212 64213 64214
Id_Rep   2       1 2

Is it possible? I have some problems. I think that I should use data.frame with matrix, but I'm not sure and perhaps that's wrong.
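Applied to the table in the question, the list-column idea might look like the sketch below. The split of the garbled original into Class, Levels and Values is my reading of the post, so treat the cell contents as assumptions.

```r
# One row per class; Values is a list column, so each cell can hold
# a vector of a different length.
tab <- data.frame(
  Class  = c("Id_TrT1", "Id_Geno", "Id_Rep"),
  Levels = c(1, 7, 2),
  Values = I(list(2, 64208:64214, 1:2))
)

print(tab)
```

Individual cells are then reached with, for example, tab$Values[[2]], which returns the seven genotype ids as an ordinary vector.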
Re: [R] Estimates at each iteration of optim()?
I read the description of the trace control parameter in ?optim and then also looked at the examples given at the end. In one of the examples I found that they had used trace=TRUE with the method SANN. I am using the method BFGS and I tried using trace=TRUE too, but I did not get the parameter estimates at each iteration. As you say, it might be method dependent. I tried reading the source code for optim but could not find what I was looking for. Hence, I was wondering if anyone could tell me what option to use with the method BFGS to get the parameter estimates at each iteration of the optimization.

Deepankar

----- Original Message -----
From: Peter Dalgaard [EMAIL PROTECTED]
Date: Monday, April 23, 2007 2:46 am
Subject: Re: [R] Estimates at each iteration of optim()?

DEEPANKAR BASU wrote:

I am trying to maximise a complicated loglikelihood function with the optim command. Is there some way to get to know the estimates at each iteration? When I put control=list(trace=TRUE) as an option in optim, I just got the initial and final values of the loglikelihood, the number of iterations and whether the routine has converged or not. I need to know the estimate values at each iteration.

It might help if you actually _read_ the description of the trace control parameter (hint: it is not an on/off switch) in ?optim... And, as it says, this is method dependent, so you may have to study the source code.

Deepankar
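Since the BFGS trace output does not seem to include the parameter values, one workaround is to wrap the objective function so that it records every parameter vector it is called with. The sketch below uses the Rosenbrock function from the ?optim examples as a stand-in for the log-likelihood; history, fr_logged and evaluations are names introduced here.

```r
# Stand-in objective: the Rosenbrock function used in the ?optim examples.
fr <- function(p) 100 * (p[2] - p[1]^2)^2 + (1 - p[1])^2

history <- list()
fr_logged <- function(p) {
  # Record the parameter vector each time the objective is evaluated.
  history[[length(history) + 1]] <<- p
  fr(p)
}

# trace = 1 with REPORT = 1 asks BFGS to report the function value
# at every iteration instead of every tenth.
fit <- optim(c(-1.2, 1), fr_logged, method = "BFGS",
             control = list(trace = 1, REPORT = 1))

evaluations <- do.call(rbind, history)   # one row per function evaluation
```

Note that this logs every evaluation, including those made for the finite-difference gradient and the line search, so there are more rows than iterations; supplying an analytic gradient would reduce the extra calls.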
Re: [R] colored shading lines
Here are a few options (others may have better ones):

1. Don't use shading lines. These were mainly used when color/grayscale was not available and are less needed now. Also, shading lines sometimes cause a moiré vibration (the combination of the lines and the physiology of the eye gives the illusion of movement), which makes the graph harder to read (see Tufte).

2. polygon (and friends) generally do either shading or color, but not both. When doing shading they use the color specified by par('fg'), so you can set par(fg='red'), plot the polygon, then change the foreground color back to black (or whatever it was). If you want the outline of the polygon black, just plot another polygon (without density or color arguments) over the top of the red one.

3. Create your plot the way you do now (with the wrong color), then use the plot2script function in the TeachingDemos package to get the low-level commands that were used to create the plot. Find the segments functions and change the color from red to black (you will also need to remove the extra arguments from the box and polygon functions), then rerun the entire created script. This option is inferior to the 2 above unless you want really detailed control of the final plot, or really want to see the details that go into creating the plot.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Albrecht Kauffmann
Sent: Monday, April 23, 2007 3:27 AM
To: r-help@stat.math.ethz.ch
Subject: [R] colored shading lines

> Hi all,
>
> is there any possibility to draw colored shading lines of a polygon plot? E.g.
>
>     plot(polygon_object, col="red", density=10, angle=45)
>
> produces only black shading lines within the polygon.
>
> With many thanks for any hint,
> Albrecht
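
The par(fg=...) suggestion in option 2 of this thread can be sketched in base graphics roughly as follows. This is only a minimal sketch: it assumes plain coordinate vectors rather than the poster's polygon_object (whose class is not given), the helper name hatched_polygon is hypothetical, and pdf(NULL) is used just so the code runs without opening a window.

```r
# Hypothetical helper: shading lines in one colour, outline in another,
# following the par(fg = ...) suggestion from the thread.
hatched_polygon <- function(x, y, hatch_col = "red", border_col = "black",
                            density = 10, angle = 45) {
  old <- par(fg = hatch_col)   # shading lines fall back to par("fg")
  on.exit(par(old))            # restore the previous foreground colour
  polygon(x, y, density = density, angle = angle, border = NA)
  # second pass: outline only, no density or fill arguments
  polygon(x, y, border = border_col)
  invisible(NULL)
}

pdf(NULL)                      # null device so the sketch runs headless
plot(c(0, 4), c(0, 4), type = "n", xlab = "x", ylab = "y")
fg_before <- par("fg")
hatched_polygon(c(1, 3, 3, 1), c(1, 1, 3, 3))
fg_after <- par("fg")          # unchanged thanks to on.exit()
dev.off()
```

In current versions of base R, ?polygon states that when density is positive the col argument gives the colour of the shading lines, so polygon(x, y, density = 10, col = "red") may be enough there; the detour through par("fg") matters mostly for plot methods that do not pass col through.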
Re: [R] importing sas datasets
Hi John and Daniel,

Thanks for your suggestions. I updated line 127 of the sas.get function, but after submitting the following command:

    c <- sas.get(lib="c:\\ghan", mem="mkds0001", var=(" "), format.library="d:\\R\\R-2.4.1", sasprog='C:\\Programmi\\SAS\\SAS 9.1\\sas.exe')

(also trying with sasprog="C:\\Programmi\\SAS\\SAS 9.1\\sas.exe") the log signaled the following error:

    Error in system(paste(shQuote(sasprog), shQuote(sasin), "-log", shQuote(log.file)), :
      unused argument(s) (output = FALSE)

which concerns the modified line 127. I have also tried export files with the sasxport.get function; it works well, but only if the variable names are at most 8 bytes long.

Regards,
Anna Emilia Martino

----- Original Message -----
From: John Kane [EMAIL PROTECTED]
To: Daniel Nordlund [EMAIL PROTECTED], [EMAIL PROTECTED], r-help@stat.math.ethz.ch
Subject: Re: [R] importing sas datasets
Date: Fri, 20 Apr 2007 20:11:21 -0400 (EDT)

Hi Anna,

I'm the sas.get problem man. I still have not gotten it to work, but I think that is because I have some slightly dodgy SAS files. Assuming that the sas.get problem is what was described in the earlier thread, it appears to have been fixed. You might want to do an update to R to get the most recent Hmisc.

An alternative in Hmisc that Frank Harrell pointed out is to do a SAS export file and a special version of the format file. It is described in the Hmisc reference manual; see sasxport.get. It worked just fine for me on a couple of test files. I don't remember exactly, but I think you're stuck with the 8-character variable names though.

--- Daniel Nordlund [EMAIL PROTECTED] wrote:

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
> Sent: Friday, April 20, 2007 6:36 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] importing sas datasets
>
> > Hello, I wanted to ask for help with importing SAS datasets.
> >
> > 1) I tried some functions such as read.ssd (foreign package), but it doesn't import the file if the variable names are longer than 8 bytes (it has to conform to version 6).
> >
> > 2) I then tried the sas.get function (Hmisc package), but with the command:
> >
> >     c <- sas.get(lib="c:\\ghan", mem="mkds0001", var=(" "), format.library="d:\\R\\R-2.4.1", sasprog="C:\\Programmi\\SAS\\SAS 9.1\\sas.exe")
> >
> > R can't launch sas.exe because there is a space in the directory name SAS 9.1.
>
> [snip]
>
> Anna,
>
> There has been a thread on this problem recently. You could check the archives for posts with the subject "sas.get problem". I can't comment about point 1, but the problem in point 2 has nothing to do with the space in the name. (Well, it kind of does, because it has to do with trying to get around the problem of spaces in path names.) The problem you are having with sas.get is that the function is broken in the Windows version of Hmisc. There is a fix which you can apply, and when that is done sas.get is a very nice function (I have heard that the problem will be fixed in an upcoming version of Hmisc). Here is a solution that works for me and others.
>
> 1. Start up R interactively.
> 2. I will assume you have appropriately installed the Hmisc package.
> 3. Load Hmisc by typing library(Hmisc) at the R prompt.
> 4. Type 'sas.get' (without the quotes) at the R prompt. This will print the source code for the sas.get function definition.
> 5. Cut and paste the source code into the text editor of your choice and correct line 127 (change 'sys' to 'system'), i.e. change line 127 from
>
>        status <- sys(paste(shQuote(sasprog), shQuote(sasin), "-log",
>
>    to
>
>        status <- system(paste(shQuote(sasprog), shQuote(sasin), "-log",
>
> 6. Next, sas.get needs to be redefined with the corrected code. In your text editor, add 'sas.get <-' to the first line so that it reads
>
>        sas.get <- function (library, member, variables = character(0), ifs = character(0),
>
> 7. Save this corrected function definition as a text file (I chose Hmisc_sas_get_correction.R as the file name).
>
> Now, any time you want to use sas.get from Hmisc you can take the following steps:
>
> 1. Start R.
> 2. Load Hmisc using library(Hmisc).
> 3. Source the corrected sas.get definition:
>
>        source("your_path/Hmisc_sas_get_correction.R")
>
> Now you are set to go. Let us know if this works for you.
>
> Hope this is helpful,
>
> Dan
>
> Daniel Nordlund
> Bothell, WA
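
The line being patched in this thread builds a command string in which every path is quoted, which is what lets a directory such as "SAS 9.1" survive the trip to the shell. The following is a small sketch of that idea only, not the Hmisc source: sasin and log.file are hypothetical stand-ins for the temporary files sas.get creates, and type = "cmd" is chosen here so the quoting is the same on every platform.

```r
# sasprog is the path from the thread; sasin and log.file are
# hypothetical placeholders for files that sas.get would generate.
sasprog  <- "C:\\Programmi\\SAS\\SAS 9.1\\sas.exe"
sasin    <- "C:\\temp\\sas input.sas"
log.file <- "C:\\temp\\sas input.log"

# shQuote() wraps each path in quotes so embedded spaces do not split it
cmd <- paste(shQuote(sasprog,  type = "cmd"),
             shQuote(sasin,    type = "cmd"),
             "-log",
             shQuote(log.file, type = "cmd"))
cat(cmd, "\n")

# status <- system(cmd)   # not run here: requires a SAS installation
```

Printing the command before handing it to system() is a cheap way to check whether a failure comes from the quoting or, as in the error reported above, from the arguments passed to system() itself.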
Re: [R] Estimates at each iteration of optim()?
DEEPANKAR BASU wrote:
> I read the description of the trace control parameter in ?optim and then also looked at the examples given at the end. In one of the examples I found that they had used trace=TRUE with the method SANN. I am using the method BFGS and I tried using trace=TRUE too but I did not get the parameter estimates at each iteration. As you say, it might be method dependent. I tried reading the source code for optim but could not find out what I was looking for. Hence, I was wondering if anyone could tell me what option to use with the method BFGS to get the parameter estimates at each iteration of the optimization.
>
> Deepankar

Well, ?optim has:

    The 'control' argument is a list that can supply any of the following components:

    'trace' Non-negative integer. If positive, tracing information on the progress of the optimization is produced. Higher values may produce more tracing information: for method 'L-BFGS-B' there are six levels of tracing. (To understand exactly what these do see the source code: higher levels give more detail.)

which I can only infer that you still haven't read...

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])
[R] Documentation for namespaces
Brian Ripley recently replied to a comment of mine by referring to a function 'assignInNamespace', which I had not heard of. Is there a good write-up on name spaces in R? There are little tidbits in the manuals on the R site, but I found nothing substantive. I'd like to understand these better.

Terry Therneau
[R] [R-pkgs] New version of actuar
UseRs,

actuar is a package for Actuarial Science. A rather preliminary version (0.1-3) of the package has been available on CRAN since February 2006. We now announce the immediate availability of version 0.9-2, sporting a large number of new features.

Non actuaries behold! There can be some features of interest for you, especially those related to new probability distributions and to the manipulation of grouped data.

Since I took the time to write a fairly detailed NEWS file, I'll let it speak for itself:

=== actuar: an R package for Actuarial Science ===

Version 0.9-2
=============

Major official update. This version is not backward compatible with the 0.1-x series. Features of the package can be split into the following categories: loss distributions modeling, risk theory, credibility theory.

NEW FEATURES -- LOSS DISTRIBUTIONS

o Functions {d,p,q,r}foo to compute the density function, cumulative distribution function, quantile function of, and to generate variates from, all probability distributions of Appendix A of Klugman et al. (2004), Loss Models, Second Edition (except the inverse gaussian) not already in R. Namely, this adds the following distributions (the root is what follows the 'd', 'p', 'q' or 'r' in function names):

      Distribution name         Root
      -----------------------   ------------
      Burr                      burr
      Generalized beta          genbeta
      Generalized Pareto        genpareto
      Inverse Burr              invburr
      Inverse exponential       invexp
      Inverse Pareto            invpareto
      Inverse paralogistic      invparalogis
      Inverse Weibull           invweibull
      Loggamma                  loggamma
      Loglogistic               llogis
      Paralogistic              paralogis
      Pareto                    pareto
      Single parameter Pareto   pareto1
      Transformed beta          trbeta
      Transformed gamma         trgamma

  All functions are coded in C for efficiency purposes and should behave exactly like the functions in base R. For all distributions that have a scale parameter, the corresponding functions have 'rate = 1' and 'scale = 1/rate' arguments.

o Functions {m,lev}foo to compute the k-th raw (non-central) moment and k-th limited moment for all the probability distributions mentioned above, plus the following ones of base R: beta, exponential, gamma, lognormal and Weibull.

o Facilities to store and manipulate grouped data (stored in an interval-frequency fashion). Function grouped.data() creates a grouped data object similar to a data frame. Methods of [, [<-, mean() and hist() created for objects of class grouped.data.

o Function ogive() --- with appropriate methods of knots(), plot(), print() and summary() --- to compute the ogive of grouped data. Usage is in every respect similar to ecdf().

o Function elev() to compute the empirical limited expected value of a sample of individual or grouped data.

o Function emm() to compute the k-th empirical raw (non-central) moment of a sample of individual or grouped data.

o Function mde() to compute minimum distance estimators from a sample of individual or grouped data using one of three distance measures: Cramer-von Mises (CvM), chi-square, layer average severity (LAS). Usage is similar to fitdistr() of package 'MASS'.

o Function coverage() to obtain the pdf or cdf of the payment per payment or payment per loss random variable under any combination of the following coverage modifications: ordinary or franchise deductible, policy limit, coinsurance, inflation. The result is a function that can be used in fitting models to data subject to such coverage modifications.

o Individual dental claims data set 'dental' and grouped dental claims data set 'gdental' of Klugman et al. (2004), Loss Models, Second Edition.

NEW FEATURES -- RISK THEORY

o Function aggregateDist() returns a function to compute the cumulative distribution function of the total amount of claims random variable for an insurance portfolio using any of the following five methods:

    1. exact calculation by convolutions (using function convolve() of package 'stats');
    2. recursive calculation using Panjer's algorithm;
    3. normal approximation;
    4. normal power approximation;
    5. simulation.

  The modular conception of aggregateDist() allows for easy inclusion of additional methods. There are special methods of print(), summary(), quantile() and mean() for objects of class aggregateDist. The objects otherwise inherit from classes ecdf (for methods 1, 2 and 3) and function. See also the "Deprecated, defunct or no backward compatibility" section below.

o Function discretize() to discretize a continuous distribution using any
Re: [R] exemple pour l'AFD
> Good day, sirs. I am a fourth-year computer science student at Djilali Liabes University, SBA, Algeria. I am preparing a presentation on AFD and I need an example in R to present my work well.

I am not sure I understand your question, but if by AFD you mean CDA, that is, Correspondence Discriminant Analysis, there is an example in R here:

http://pbil.univ-lyon1.fr/R/fichestd/tdr624.pdf

PEA (Please Expand Acronyms),

--
Jean R. Lobry ([EMAIL PROTECTED])
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
tel: +33 472 43 27 56, fax: +33 472 43 13 88
http://pbil.univ-lyon1.fr/members/lobry/
Re: [R] Documentation for namespaces
See the article

Tierney, L. (2003): Name Space Management for R, R News 3 (1), 2-6.

Uwe Ligges
Re: [R] summary and min max
Sebastian P. Luque spluque at gmail.com writes:
> Hi, I came across a case where there's a discrepancy between minimum and maximum values reported by 'summary' and the 'min' and 'max' functions:

By default summary only lists max(3, getOption("digits") - 3) significant digits (4 with the default options) ... see ?summary

Ben Bolker

(is this a candidate for a FAQ?)
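
The discrepancy discussed in this thread is easy to reproduce. The sketch below uses made-up values (the poster's data are not shown) and relies on the digits argument documented in ?summary; how many digits appear by default depends on options("digits").

```r
# Made-up values with more significant digits than summary() shows by default
x <- c(123456.789, 234567.891, 345678.912)

min(x)
max(x)

s_default <- summary(x)
print(s_default)                 # printed with only a few significant digits

s_full <- summary(x, digits = 10)
print(s_full)                    # enough digits to match min() and max()
```

The underlying data are untouched in either case; only the reported summary is shortened, so min() and max() remain the values to trust when exact extremes matter.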
Re: [R] Random Forest
Dear all R gurus,

I am really sorry if my query embarrasses anyone. Can anyone give me some introductory papers or suggestions about what Random Forest is?

Thanks and regards,

----- Original Message ----
From: Weiwei Shi [EMAIL PROTECTED]
To: Ruben Feldman [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Monday, April 23, 2007 8:56:29 PM
Subject: Re: [R] Random Forest

Hi, Ruben:

    fit$confusion

If you provide your test data, then you can also access the confusion matrix of the test data with

    fit$test$confusion

There are details of how to use randomForest in ?randomForest.

HTH,
Weiwei

On 4/22/07, Ruben Feldman [EMAIL PROTECTED] wrote:
> Hi,
>
> I am trying to print out my confusion matrix after having created my random forest. I have put in this command:
>
>     fit <- randomForest(MMS_ENABLED_HANDSET~., data=dat, ntree=500, mtry=14, na.action=na.omit, confusion=TRUE)
>
> but I can't get it to give me the confusion matrix. Does anyone know how this works?
>
> Thanks!
> Ruben

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?" "No, I did not. But I believed..." ---Matrix III
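
For readers without the poster's data, the fit$confusion advice in this thread can be tried on a built-in data set. This is a hedged sketch: it uses iris rather than the poster's dat, it only runs the fit if the randomForest package happens to be installed, and conf_train is a name introduced here for illustration.

```r
conf_train <- NULL   # stays NULL if randomForest is not installed

if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  # Classification forest on a built-in data set (not the poster's data)
  fit <- randomForest::randomForest(Species ~ ., data = iris, ntree = 100)

  # Confusion matrix of the out-of-bag predictions, stored on the fitted
  # object; it includes a class.error column
  conf_train <- fit$confusion
  print(conf_train)
}
```

As far as ?randomForest documents, the confusion matrix is a component of the returned object for classification forests rather than something switched on through an argument, which would explain why confusion=TRUE in the original call had no visible effect.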
Re: [R] Estimates at each iteration of optim()?
Deepankar,

Here is an example using BFGS:

    > fr <- function(x) {   ## Rosenbrock Banana function
    +     x1 <- x[1]
    +     x2 <- x[2]
    +     100 * (x2 - x1 * x1)^2 + (1 - x1)^2
    + }
    > grr <- function(x) { ## Gradient of 'fr'
    +     x1 <- x[1]
    +     x2 <- x[2]
    +     c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
    +       200 * (x2 - x1 * x1))
    + }
    > optim(c(-1.2,1), fr, grr, method = "BFGS", control=list(trace=TRUE))
    initial  value 24.20
    iter  10 value 1.367383
    iter  20 value 0.134560
    iter  30 value 0.001978
    iter  40 value 0.00
    final  value 0.00
    converged
    $par
    [1] 1 1

    $value
    [1] 9.594955e-18

    $counts
    function gradient
         110       43

    $convergence
    [1] 0

    $message
    NULL

This example shows that progress (the function value) is printed out every 10 iterations. However, trying different integer values for trace from 2 to 10 (trace = 1 behaves the same as trace=TRUE) did not change anything. If you want to get estimates at every iteration, look at the source code for BFGS (which I assume is in FORTRAN). You may have to modify the source code and recompile it yourself to get a more detailed trace for BFGS.

However, you can get parameter iterates at every step for L-BFGS-B using trace=6, although this gives a lot more information than just the parameter estimates. Alternatively, you can use the CG methods with trace=TRUE or trace=1, which are generally a lot slower than BFGS or L-BFGS-B.

Why do you want to look at parameter estimates for each step, anyway?

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: [EMAIL PROTECTED]
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
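
One way to see the parameter values without touching the optimiser's source is to record them inside the objective function. The sketch below is an assumption-laden illustration, not something proposed in the thread: the wrapper fr_logged and the environment par_log are hypothetical names, and the log contains every function evaluation (line-search trial points included), not one row per BFGS iteration. It also sets the REPORT control, which ?optim documents as the reporting frequency for BFGS, L-BFGS-B and SANN when trace is positive.

```r
# Rosenbrock function and its gradient, as in the example from ?optim
fr <- function(x) {
  x1 <- x[1]; x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) {
  x1 <- x[1]; x2 <- x[2]
  c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
    200 * (x2 - x1 * x1))
}

# Hypothetical wrapper: store every parameter vector the optimiser tries
par_log <- new.env()
par_log$values <- list()
fr_logged <- function(x) {
  par_log$values[[length(par_log$values) + 1]] <- x
  fr(x)
}

# REPORT = 1 asks for a progress line at every iteration instead of every 10
res <- optim(c(-1.2, 1), fr_logged, grr, method = "BFGS",
             control = list(trace = 1, REPORT = 1))

# One row per function evaluation, one column per parameter
par_history <- do.call(rbind, par_log$values)
head(par_history)
```

If the objective fails at some parameter value, the last row of par_history shows where, which is the diagnostic the original poster was after.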
[R] residuals and predict
Hi,

when using the glm function, is it possible to extract residuals and predicted values?
Re: [R] summary and min max
Sebastian P. Luque [EMAIL PROTECTED] wrote:
> I came across a case where there's a discrepancy between minimum and maximum values reported by 'summary' and the 'min' and 'max' functions:

summary() rounds by default. Thus its reporting oddball values is considered a feature, not a bug.

--
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.
Re: [R] Estimates at each iteration of optim()?
Ravi,

Thanks a lot for your detailed reply. It clears up much of my confusion. I want to look at the parameter estimates at each iteration because the full model that I am trying to estimate is not converging; a smaller version of the model converges but the results are quite meaningless.

The problem in the estimation of the full model is the following: my likelihood function contains the elements of a (bivariate normal) covariance matrix as parameters. To compute the likelihood, I have to draw random samples from the bivariate normal distribution. But no matter what starting values I give, I cannot ensure that the covariance matrix remains positive definite at each iteration of the optimization exercise. Moreover, as soon as the covariance matrix fails to be positive definite, I get an error message (because I can no longer draw from the bivariate normal distribution) and the program stops.

Faced with this problem, I wanted to see exactly at which parameter estimates the covariance matrix fails to remain positive definite. From that I would think of devising a method to get around the problem, or at least I would try to.

Probably there is some other way to solve this problem. I would like your opinion on the following question: is there some way I can transform the three parameters of my (2 by 2) covariance matrix (the two standard deviations and the correlation coefficient) to ensure that the covariance matrix remains positive definite at each iteration of the optimization? Is there any method other than transforming the parameters to ensure this?

Deepankar
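
The transformation asked about in this message can be sketched as follows. This is one common choice and only an illustration, not an answer given in the thread: the optimiser works on unconstrained values (the logs of the two standard deviations and the inverse hyperbolic tangent of the correlation), and the hypothetical helper make_sigma maps them back to a covariance matrix.

```r
# Hypothetical helper: unconstrained theta -> 2 x 2 covariance matrix.
# theta = (log sd1, log sd2, atanh(rho)); any finite theta of moderate size
# gives sd1 > 0, sd2 > 0 and -1 < rho < 1.
make_sigma <- function(theta) {
  s1  <- exp(theta[1])
  s2  <- exp(theta[2])
  rho <- tanh(theta[3])
  matrix(c(s1^2,          rho * s1 * s2,
           rho * s1 * s2, s2^2), nrow = 2)
}

# Example: sd1 = 2, sd2 = 0.5, rho = 0.9 expressed on the unconstrained scale
theta <- c(log(2), log(0.5), atanh(0.9))
Sigma <- make_sigma(theta)
Sigma
eigen(Sigma, symmetric = TRUE)$values   # both positive when |rho| < 1

# Inside a likelihood one would write something like
#   negloglik <- function(theta, ...) { Sigma <- make_sigma(theta[1:3]); ... }
# and let optim() search over theta instead of the raw covariance entries.
```

For very large values of theta[3], tanh() rounds to exactly 1 or -1 in floating point and the matrix becomes numerically singular, so sensible starting values still matter; a Cholesky-based parametrisation is another option with the same purpose.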
Re: [R] aggregate function
If monthly should aggregate per yyyy-mm combination, you could try something like

    aggregate(x$z, list(cut(as.Date(x$Date), "m")), mean)

for monthly aggregation and

    aggregate(x$z, list(cut(as.Date(x$Date), "y")), mean)

for yearly means. If monthly aggregation should aggregate over different years (and produce only 12 numbers), maybe

    aggregate(x$z, list(format(as.Date(x$Date), "%m")), mean)

works (everything untested). Be sure to use R 2.4.1 patched or 2.5.0, since there was a bug in cut.Date which prevents the yearly aggregation from working properly before R 2.4.1 patched!

Regards,
Martin

Michel Schnitz wrote:
> Hello,
>
> is there a way to use the aggregate function to calculate monthly means in case I have one column in the data frame that holds the date like yyyy-mm-dd? I know that it works for daily means. I would also like to do it for monthly and yearly means. Maybe there is something like aggregate(x, list(Date[%m]), mean)? The data frame looks like:
>
>     Date        Time   z
>     2006-01-01  21:00  6,2
>     2006-01-01  22:00  5,7
>     2006-01-01  23:00  3,2
>     2006-01-02  00:00  7,8
>     2006-01-02  01:00  6,8
>     2006-01-02  02:00  5,6
>     . . .
>     2007-03-30  22:00  5,2
>     2007-03-30  23:00  8,3
>     2007-03-31  00:00  6,4
>     2007-03-31  01:00  7,4
>
> Thanks for help!
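
The three aggregations suggested in this reply can be tried on a small made-up data frame. This sketch assumes, as the posted data suggest, that z was read in as text with a decimal comma, so it is converted first; the four rows and the result names are invented for illustration.

```r
# Made-up data in the same shape as the poster's (decimal commas in z)
x <- data.frame(
  Date = c("2006-01-01", "2006-01-02", "2006-02-01", "2007-01-15"),
  z    = c("6,2", "5,8", "3,0", "7,0"),
  stringsAsFactors = FALSE
)
x$z <- as.numeric(sub(",", ".", x$z, fixed = TRUE))  # "6,2" -> 6.2
d <- as.Date(x$Date)

# One mean per year-month combination
monthly <- aggregate(x$z, list(month = cut(d, "month")), mean)

# One mean per year
yearly <- aggregate(x$z, list(year = cut(d, "year")), mean)

# One mean per calendar month, pooled over years (at most 12 rows)
calendar <- aggregate(x$z, list(month = format(d, "%m")), mean)

monthly
yearly
calendar
```

The means end up in a column named x because aggregate() was given a bare vector; passing a one-column data frame such as x["z"] keeps the original name.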
Re: [R] residuals and predict
?fitted
?residuals
?glm, section 'Value'

Please be so kind and read the available documentation before posting...

Petr

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
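
The help pages pointed to in this reply boil down to a handful of extractor functions. The following sketch uses the built-in mtcars data and a logistic model chosen purely for illustration; none of it comes from the original poster's analysis.

```r
# Illustrative model: transmission type as a function of car weight
fit <- glm(am ~ wt, family = binomial, data = mtcars)

res_dev   <- residuals(fit)                    # deviance residuals (default)
res_pear  <- residuals(fit, type = "pearson")  # Pearson residuals
fit_vals  <- fitted(fit)                       # fitted probabilities
pred_resp <- predict(fit, type = "response")   # the same values via predict()

# Predictions for new observations need a data frame with the same
# variable names as the model formula
pred_new <- predict(fit, newdata = data.frame(wt = c(2.5, 3.5)),
                    type = "response")
pred_new
```

Without type = "response", predict() returns values on the scale of the linear predictor (log-odds for this model), which is a frequent source of confusion.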
Re: [R] extracting the mode of a vector
Benoît L?t? wrote:
> Hello,
>
> I have an elementary question (for which I couldn't find the answer on the web or in the help): how can I extract the mode (modal score) of a vector?

Assuming that your vector contains only integers:

    > v <- sample(1:5, size=20, replace=T)
    > v
     [1] 1 1 1 1 2 3 5 1 1 5 2 4 1 3 1 1 5 4 1 5
    > vt <- table(v)
    > as.numeric(names(vt[vt == max(vt)]))
    [1] 1

Cheers,
Gad

# or more succinctly,

    > names(vt[which.max(vt)])
    [1] "1"

John
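
The two answers in this thread differ when several values tie for the highest count: the max(vt) comparison returns all of them, while which.max() returns only the first. A small sketch of a reusable helper, with the hypothetical name stat_mode, that keeps the type of the input and returns every tied value:

```r
# Hypothetical helper: all most-frequent values of a vector, in sorted order
stat_mode <- function(v) {
  u <- unique(v)
  counts <- tabulate(match(v, u))   # how often each unique value occurs
  sort(u[counts == max(counts)])
}

# The vector shown in the thread
v <- c(1, 1, 1, 1, 2, 3, 5, 1, 1, 5, 2, 4, 1, 3, 1, 1, 5, 4, 1, 5)
stat_mode(v)

# Ties are all reported
stat_mode(c(1, 1, 2, 2, 3))

# Works for character data as well
stat_mode(c("a", "b", "b", "c"))
```

Because the helper never goes through names(), numeric input comes back as numbers rather than as character strings.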
[R] test deviation from a binomial distribution - lack of 50:50
Dear R-users,

I have a data set where each observation consists of a number of trials (n.trials) that varies between 5 and 7, 6 being most common. Each trial can take either of two outcomes, success or failure. A dummy data set:

    n.trials <- sample(5:7, 50, replace=T, prob=c(0.2, 0.6, 0.2))
    success <- rbinom(50, n.trials, p=0.5)
    failure <- n.trials - success

I know I could test for a deviation from 50:50 success:failure in one or the other direction using a glm with binomial errors. However, I suspect that in my 'real' data set the outcome 50:50 is underrepresented, not due to a skew in one particular direction, but rather because within each observation there are either many successes or many failures.

Although I did not manage to create a dummy data set with these properties, which would be the proper way in R to test for a 'lack of 50:50 outcome' using the simple dummy data above as a starting point?

Thanks in advance!

Henrik
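
A dummy data set with the property described in this message (many successes or many failures within an observation, but no overall skew) can be generated by letting the success probability itself vary between observations. The sketch below is one possible construction under that assumption, using a U-shaped beta distribution; the seed, the sample size of 200 and the dispersion check via a quasibinomial fit are illustrative choices, not a recommendation made in the thread.

```r
set.seed(42)
n.obs    <- 200
n.trials <- sample(5:7, n.obs, replace = TRUE, prob = c(0.2, 0.6, 0.2))

# Each observation gets its own success probability from a U-shaped
# Beta(0.5, 0.5): mostly close to 0 or 1, but 0.5 on average
p.obs   <- rbeta(n.obs, 0.5, 0.5)
success <- rbinom(n.obs, n.trials, p.obs)
failure <- n.trials - success

# Intercept-only fit; the estimated dispersion compares the observed
# variation with what a plain binomial with a common p would produce
fit <- glm(cbind(success, failure) ~ 1, family = quasibinomial)
dispersion <- summary(fit)$dispersion
dispersion

table(success)   # outcomes pile up at the extremes rather than near 50:50
```

A dispersion estimate well above 1 indicates more extreme outcomes than a common success probability allows, which is the pattern suspected in the real data; running the same fit on the plain rbinom() dummy data from the message gives a baseline for comparison.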
Re: [R] residuals and predict
Hi, try assigning the output of glm to an object:
g <- glm(model)
names(g)
-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O http://maps.google.com/maps?f=qhl=enq=Curitiba,+Brazillayer=ie=UTF8z=18ll=-25.448315,-49.276916spn=0.002054,0.005407t=kom=1 On 4/23/07, elyakhlifi mustapha [EMAIL PROTECTED] wrote: hi, in using glm function is it possible to extract residuals and predict values ?
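As a concrete illustration of the advice above, here is a small sketch on simulated data (the data frame d and its columns are invented for the example). It stores the fit in an object and then uses the standard extractor functions rather than reaching into the list components directly.

```r
# Simulated logistic-regression data; the variable names are illustrative only.
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, 1, plogis(0.5 + d$x))

g <- glm(y ~ x, family = binomial, data = d)

names(g)                                  # components stored in the fitted object
fit_vals  <- fitted(g)                    # fitted probabilities for the training data
dev_resid <- residuals(g, type = "deviance")
new_pred  <- predict(g, newdata = data.frame(x = c(-1, 0, 1)),
                     type = "response")   # predictions for new x values
```

The type argument matters for glm objects: predict() defaults to the link scale, and residuals() offers several kinds (deviance, pearson, working, response).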
Re: [R] Random Forest
Google "random forests"; see Leo Breiman's site and Wikipedia, especially the link at the bottom of the Wikipedia page to Andy & Matt's article in R News. I did a DMA/AC webinar in January. Slides are at: http://www.porzak.com/JimArchive/JimPorzak_RFwithR_DMAAC_Jan07_webinar.pdf On 4/23/07, Ron Michael [EMAIL PROTECTED] wrote: Dear all R gurus, I am really sorry if my query embarrasses anyone. Can anyone give me some introductory papers or suggestions about what Random Forest is? Thanks and regards, - Original Message From: Weiwei Shi [EMAIL PROTECTED] To: Ruben Feldman [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Monday, April 23, 2007 8:56:29 PM Subject: Re: [R] Random Forest Hi, Ruben: fit$confusion If you provide your test data, then you can also access the confusion matrix of the test data by fit$test$confusion There are details of how to use randomForest in: ?randomForest HTH, Weiwei On 4/22/07, Ruben Feldman [EMAIL PROTECTED] wrote: Hi, I am trying to print out my confusion matrix after having created my random forest. I have put in this command:
fit <- randomForest(MMS_ENABLED_HANDSET~., data=dat, ntree=500, mtry=14, na.action=na.omit, confusion=TRUE)
but I can't get it to give me the confusion matrix, anyone know how this works? Thanks! Ruben -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. Did you always know? No, I did not. But I believed... ---Matrix III
-- HTH, Jim Porzak San Francisco, CA http://www.linkedin.com/in/jimporzak
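To make Weiwei's pointers concrete, here is a hedged sketch using the built-in iris data in place of the poster's data set, which is not available. It assumes the randomForest package is installed and does nothing otherwise. The test-set confusion matrix is only produced when test data are supplied through the xtest and ytest arguments.

```r
# Runs only if the randomForest package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  library(randomForest)
  set.seed(1)
  test_rows <- sample(nrow(iris), 50)
  train <- iris[-test_rows, ]
  test  <- iris[test_rows, ]

  # Out-of-bag confusion matrix: stored in the fit, no extra argument needed
  fit_rf <- randomForest(Species ~ ., data = train, ntree = 500)
  print(fit_rf$confusion)

  # Test-set confusion matrix: requires xtest/ytest at fitting time
  fit_rf_test <- randomForest(x = train[, -5], y = train$Species,
                              xtest = test[, -5], ytest = test$Species,
                              ntree = 500)
  print(fit_rf_test$test$confusion)
}
```

The confusion component is a matrix of observed versus predicted classes with an extra class.error column.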
Re: [R] Estimates at each iteration of optim()?
Without knowing much about your problem, it is hard to suggest good strategies. However, if you are having trouble with the estimates of covariance matrix not being positive-definite, you can force them to be positive-definite after each iteration, before moving on to the next iteration. Look at the make.positive.definite function from corpcor package. This is just one approach. There may be better approaches - perhaps, an EM-like approach might be applicable that would automatically satisfy all parameter constraints. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: DEEPANKAR BASU [mailto:[EMAIL PROTECTED] Sent: Monday, April 23, 2007 1:09 PM To: Ravi Varadhan Cc: 'Peter Dalgaard'; r-help@stat.math.ethz.ch Subject: Re: RE: [R] Estimates at each iteration of optim()? Ravi, Thanks a lot for your detailed reply. It clarifies many of the confusions in my mind. I want to look at the parameter estimates at each iteration because the full model that I am trying to estimate is not converging; a smaller version of the model converges but the results are quite meaningless. The problem in the estimation of the full model is the following: my likelihood function contains the elements of a (bivariate normal) covariance matrix as parameters. To compute the likelihood, I have to draw random samples from the bivariate normal distribution. But no matter what starting values I give, I cannot ensure that the covariance matrix remains positive definite at each iteration of the optimization exercise. Moreover, as soon as the covariance matrix fails to be positive definite, I get an error message (because I can no longer draw from the bivariate normal distribution) and the program stops. 
Faced with this problem, I wanted to see exactly at which parameter estimates the covariance matrix fails to remain positive definite. From that I would think of devising a method to get around the problem, at least I would try to. Probably there is some other way to solve this problem. I would like your opinion on the following question: is there some way I can transform the three parameters of my (2 by 2) covariance matrix (the two standard deviations and the correlation coefficient) to ensure that the covariance matrix remains positive definite at each iteration of the optimization? Is there any method other than transforming the parameters to ensure this? Deepankar
- Original Message - From: Ravi Varadhan [EMAIL PROTECTED] Date: Monday, April 23, 2007 12:21 pm Subject: RE: [R] Estimates at each iteration of optim()?
Deepankar, Here is an example using BFGS:
> fr <- function(x) { ## Rosenbrock Banana function
+     x1 <- x[1]
+     x2 <- x[2]
+     100 * (x2 - x1 * x1)^2 + (1 - x1)^2
+ }
> grr <- function(x) { ## Gradient of 'fr'
+     x1 <- x[1]
+     x2 <- x[2]
+     c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
+        200 * (x2 - x1 * x1))
+ }
> optim(c(-1.2,1), fr, grr, method = "BFGS", control=list(trace=TRUE))
initial value 24.20
iter 10 value 1.367383
iter 20 value 0.134560
iter 30 value 0.001978
iter 40 value 0.00
final value 0.00
converged
$par
[1] 1 1
$value
[1] 9.594955e-18
$counts
function gradient
     110       43
$convergence
[1] 0
$message
NULL
This example shows that the parameter estimates are printed out every 10 iterations. However, trying different integer values for trace from 2 to 10 (trace = 1 behaves the same as trace=TRUE) did not change anything. If you want to get estimates at every iteration, look at the source code for BFGS (which I assume is in FORTRAN). You may have to modify the source code and recompile it yourself to get a more detailed trace for BFGS.
However, you can get parameter iterates at every step for L-BFGS-B using trace=6, although this gives a lot more information than just the parameter estimates. Alternatively, you can use the CG methods with trace=TRUE or trace=1, which are generally a lot slower than BFGS or L-BFGS-B. Why do you want to look at parameter estimates for each step, anyway? Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
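An alternative to recompiling optim() is to record the parameter vector inside the objective function itself. The sketch below wraps the Rosenbrock example in a function that appends every evaluated point to a list; the names trace_env, fr_traced and history are invented for this illustration. Note that this logs every function evaluation, including trial points from the line search, so it is a superset of the accepted iterates rather than exactly one row per iteration.

```r
fr <- function(x) {   # Rosenbrock Banana function
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}
grr <- function(x) {  # Gradient of 'fr'
  c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
    200 * (x[2] - x[1]^2))
}

# Environment used as a mutable log (hypothetical name)
trace_env <- new.env()
trace_env$pars <- list()

# Wrapper that stores each point before evaluating the objective
fr_traced <- function(x) {
  trace_env$pars[[length(trace_env$pars) + 1]] <- x
  fr(x)
}

res <- optim(c(-1.2, 1), fr_traced, grr, method = "BFGS")

# One row per function evaluation, one column per parameter
history <- do.call(rbind, trace_env$pars)
head(history)
```

The same wrapper is a natural place to check a condition such as positive definiteness and print the offending parameter values before the error occurs. The control list of optim() also has a REPORT entry that sets how often the trace is printed for BFGS, although that trace shows objective values, not parameters.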
[R] fitting mixed models to censored data?
Hi, I'm trying to figure out if there are any packages allowing one to fit mixed models (or non-linear mixed models) to data that includes censoring. I've done some searching already on CRAN and through the mailing list archives, but haven't discovered anything. Since I may well have done a poor job searching I thought I'd ask here prior to giving up. I understand that SAS's proc nlmixed can accommodate censoring (though proc mixed apparently can't), so if I can't find something available in R, I'll have to break down and use that. Please, save me from having to use SAS! Thanks much, Doug
Re: [R] Estimates at each iteration of optim()?
You can of course print out the values in your objective function, as that is where you want the information. In any case, using R's debugging facilities (e.g. dump.frames, debugger) would have enabled you to find the input values at which your function was failing. Please see the chapter in `Writing R Extensions'. One way out is to return NA if the objective cannot be evaluated: that often works (not with L-BFGS-B). Another would be to work with log(standard deviation) and atanh(correlation).
-Original Message- From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of DEEPANKAR BASU Sent: Monday, April 23, 2007 11:34 AM To: Peter Dalgaard Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Estimates at each iteration of optim()? I read the description of the trace control parameter in ?optim and then also looked at the examples given at the end. In one of the examples I found that they had used trace=TRUE with the method SANN. I am using the method BFGS and I tried using trace=TRUE too but I did not get the parameter estimates at each iteration. As you say, it might be method
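The log(standard deviation) and atanh(correlation) suggestion can be sketched as follows. The optimizer works on an unconstrained vector theta, and every value of theta maps to a valid 2 by 2 covariance matrix, so positive definiteness never has to be checked during the iterations. The function names, the simulated data and the zero-mean likelihood are assumptions made for the illustration; they are not the poster's model.

```r
# theta = (log sd1, log sd2, atanh(rho)), all unconstrained.
theta_to_sigma <- function(theta) {
  s1  <- exp(theta[1])      # always > 0
  s2  <- exp(theta[2])      # always > 0
  rho <- tanh(theta[3])     # inside (-1, 1) for moderate theta[3]
  matrix(c(s1^2, rho * s1 * s2,
           rho * s1 * s2, s2^2), nrow = 2)
}

# Negative log-likelihood of zero-mean bivariate normal data y (n x 2 matrix)
negloglik <- function(theta, y) {
  sigma <- theta_to_sigma(theta)
  q <- rowSums((y %*% solve(sigma)) * y)   # quadratic forms y_i' Sigma^-1 y_i
  nrow(y) * log(2 * pi) + 0.5 * nrow(y) * log(det(sigma)) + 0.5 * sum(q)
}

set.seed(123)
true_sigma <- matrix(c(1, 1.2, 1.2, 4), nrow = 2)   # sd 1 and 2, correlation 0.6
y <- matrix(rnorm(1000), ncol = 2) %*% chol(true_sigma)

fit <- optim(c(0, 0, 0), negloglik, y = y, method = "BFGS")
sigma_hat <- theta_to_sigma(fit$par)
sigma_hat
```

In floating point, tanh() returns exactly 1 or -1 for very large arguments, which would make the matrix singular; bounding theta[3] or scaling the correlation slightly below 1 is one way to guard against that.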
Re: [R] Documentation for namespaces
http://www.stat.uiowa.edu/~luke/R/namespaces/morenames.html http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Tierney.pdf http://cran.r-project.org/doc/Rnews/Rnews_2003-1.pdf may all help, but there is as yet nothing (AFAIK) like a comprehensive user-level manual. On Mon, 23 Apr 2007, Terry Therneau wrote: Brian Ripley recently replied to a comment of mine by referring to a function 'assignInNamespace', which I had not heard of. Is there a good write-up on namespaces in R? There are little tidbits in the manuals on the R site, but I found nothing substantive. I'd like to understand these better. Terry Therneau -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
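For readers new to the topic, a few read-only calls show what a namespace is in practice. This is a small sketch using the stats package as the example; it deliberately avoids assignInNamespace(), which lives in the utils package next to getFromNamespace() and modifies a namespace in place.

```r
# Names a package makes visible to users
exported <- getNamespaceExports("stats")
"median" %in% exported

# Three ways to reach the same exported function
f1 <- stats::median
f2 <- utils::getFromNamespace("median", "stats")
f3 <- get("median", envir = asNamespace("stats"))
identical(f1, f2) && identical(f2, f3)

# The namespace is an environment; the function's enclosure points back to it
environmentName(environment(stats::median))
```

getFromNamespace() and the ::: operator can also reach objects that a package does not export, which is mainly useful for inspection and debugging.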
Re: [R] fitting mixed models to censored data?
Douglas: AFAIK, this is a subject area of active current research. Diggle, Heagerty, Liang, and Zeger, 2002, (ANALYSIS OF LONGITUDINAL DATA) say on p.316: "An emerging consensus is that analysis of data with potentially informative dropouts necessarily involves assumptions which are difficult, or even impossible, to check from the observed data." This was ca 1994, I believe, so I don't know whether this view is still held among experts (which I am not). But if it is, you may do well to be careful of whatever SAS does even if you do have to go running off to it. Cheers, Bert Gunter Genentech Nonclinical Statistics
Re: [R] fitting mixed models to censored data?
Hi Bert, Yes, I am always wary when one software package offers something that others do not. The censoring I'm faced with (at present) isn't as complicated as with much 'survival' data. I'm trying to analyze assay data and have a lower limit of detection (LLD) to contend with. Once the level of the analyte gets low enough it can't be accurately quantitated, hence all that is reported is that the level is less than some value (the LLD). So I'm not worried about all the complex assumptions that go along with censoring in clinical trials, etc. Thanks, Doug
Re: [R] Estimates at each iteration of optim()?
Thanks a lot for all your suggestions; they have been extremely helpful. I will work through each (starting with Ravi's suggestions) and get back with other questions if they arise. Deepankar
Re: [R] importing sas datasets
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Monday, April 23, 2007 8:45 AM To: John Kane; Daniel Nordlund; r-help@stat.math.ethz.ch Subject: Re: [R] importing sas datasets Hi John and Daniel, Thanks for your suggestions, I updated line 127 of the sas.get function but after submitting the following command:
c <- sas.get(lib="c:\\ghan", mem="mkds0001", var=( ),
Why are you using parentheses in the line above for the var parameter? If you want all variables, just leave the var parameter out of the call (it defaults to all variables). But if you want to include it, the function call could be:
c <- sas.get(lib="c:\\ghan", mem="mkds0001", var="", format.library="d:\\R\\R-2.4.1", sasprog="C:\\Programmi\\SAS\\SAS 9.1\\sas.exe")
Hope this is helpful, Dan Daniel Nordlund Bothell, WA USA
[R] .RData saved on exit, but not .Rhistory?
Apologies in advance if I've misunderstood something or this is a stupid question. When using R on my Mac (2.4.1 and 2.3.0), if I exit and ask to save the workspace, .RData is updated but .Rhistory is not. Introduction to R makes it sound like both should be saved, and it clearly happened for me at some point in the past: I have some (very old) commands in my .Rhistory, and these are loaded at startup, but new entries are not added. Is this a bug, or am I doing something wrong? Much thanks, Ian
[R] BlueGene - Compile R for BGL?
Howdy, I was just wondering if anyone out there has compiled R to run on the IBM BlueGene, and if so, could they share their compilation options / configuration? Thanks, Mike
Re: [R] fitting mixed models to censored data?
Doug, In perhaps similar situations where there are clusters of measurements due to repeated time or space on an individual subject or experimental unit, I have used the survreg() function from the survival library. You can specify left, right, and/or interval censoring within a data set through Surv(), and so I have used left censoring for the LOD observations. I was just focused on marginal or population-averaged estimation, so the use of cluster() in the formula for survreg() and the robust option in survreg() to get sandwich error estimates was sufficient for me. Depending on your needs to evaluate random effects, frailty() in the survival package (which can be used with survreg() or coxph()) is another alternative to explore, I believe. Hope that helps, Bill Nonclinical Statistics, Centocor R&D
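Bill's approach can be sketched on simulated assay data. Everything in the example (the data, the limit of detection of 0.5, the variable names) is invented for illustration, and it assumes the survival package is installed. Values below the limit are recorded at the limit and flagged as left-censored; cluster(subject) in the formula requests the robust (sandwich) variance that treats repeated measurements on a subject as correlated.

```r
library(survival)

set.seed(42)
n_subj <- 30
n_rep  <- 4
assay <- data.frame(subject = rep(seq_len(n_subj), each = n_rep))

# Subject-level shift plus measurement noise around a true mean of 1
subj_effect <- rnorm(n_subj, sd = 0.5)
true_level  <- 1 + subj_effect[assay$subject] + rnorm(nrow(assay), sd = 0.7)

lld <- 0.5                                          # hypothetical lower limit of detection
assay$level    <- pmax(true_level, lld)             # censored values reported at the LLD
assay$detected <- as.integer(true_level >= lld)     # 1 = measured, 0 = left-censored

# Gaussian model for the level with left censoring and cluster-robust errors
fit <- survreg(Surv(level, detected, type = "left") ~ 1 + cluster(subject),
               data = assay, dist = "gaussian")
summary(fit)
```

This gives marginal (population-averaged) estimates only. As the message notes, a frailty() term is the route to explore if the subject-level random effect itself is of interest.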