[R] Interfacing C++, MySQL and R
Hello! After a presentation of some statistical analysis of process data (where the few R possibilities I was able to show made quite a big impression), I was asked if it was possible to program a statistical application which could be used directly by the end user. Such an application would include a user-friendly interface (developed in C++), a database, a core statistical program, and standard output; the necessary queries and statistical procedures would be interactively generated from the user input by the C++ program. As I do not intend to reprogram the necessary statistical functions if I can help it, I'm interested to know: a) is it possible to integrate R in such a way? b) naturally, as I would sell the end product, what are the royalty arrangements? c) does anybody on the list have experience with such a project? Thanks for the help! Anne __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] how to print a plot
You did use dev.off() to finish the plots before trying to look at them? The symptoms you report are what happens if you did not. There is no ps() function in R: the postscript device is postscript(), not ps(). If you want to print a plot, try dev.print(). If you want to copy to a file, try dev.copy2eps(). (You are on Linux, where EPS is more widely acceptable than PDF.) On 15 Sep 2003, Weiming Zhang wrote: Hi, Thank both of you. I tried everything. pdf(file="out.pdf") gave me a damaged pdf file. ps() did not print. ps("out.ps") gave me a ps file with a badly drawn graph that could not be printed. I am using RH Linux 7.2. Thanks again. Weiming Zhang On Mon, 2003-09-15 at 16:06, Jason Turner wrote: On Tue, 2003-09-16 at 08:56, Weiming Zhang wrote: Hi, I am using R-1.7.1 on Linux. I integrated XEmacs with R. Could anybody tell me how to print a plot? I used the plot function to make some graphs and then I wanted to print them or to save them to files. But I could not find out how to do it. Have you tried: help(Devices) help(pdf) What I do: pdf(file="myplots.pdf") plot(...) dev.off() Use Acrobat or gv to view the pdf files. Postscript is also good, but not as universally understood; I have many colleagues who work in very standard Windows environments, where ghostscript is unknown. PDF is a very sensible choice for e-mailing graphs. -- Indigo Industrial Controls Ltd. http://www.indigoindustrial.co.nz +64-(0)21-343-545 -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
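To make the advice above concrete, here is a minimal sketch of the two routes mentioned (screen-copy versus plotting straight into a file); the file names are illustrative:

```r
x <- rnorm(100)

## Route 1: draw on the screen device, then copy or print it
plot(x)
dev.copy2eps(file = "myplot.eps")  # copy current plot to an EPS file
dev.print()                        # or send it to the default printer

## Route 2: plot directly into a PDF file
pdf(file = "myplots.pdf")
plot(x)
dev.off()  # essential: closes the device and finishes the file
```

Forgetting the final dev.off() is exactly what produces the "damaged" PDF described above: the file trailer is never written.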
Re: [R] POSIX and identify
Hi On 15 Sep 2003 at 20:09, Troels Ring wrote: Thanks a lot, but something else may be awry? - at least as.POSIXct(Dato), although now of length 84, still elicits a report of different argument lengths, even though the length is now 84 for both arguments. try plot(as.POSIXct(Dato), Crea) and then identify(as.POSIXct(Dato), Crea) identify(as.POSIXct(Dato), Crea, 5, plot=TRUE) Error in identify(x, y, as.character(labels), n, plot, offset) : different argument lengths length(as.POSIXct(Dato)) [1] 84 length(Crea) [1] 84 length(Dato) [1] 9 Best wishes Troels Ring Aalborg At 18:43 9/15/03, you wrote: You need to convert to POSIXct before using Dato in identify(). This will work as you expected in R 1.8.0. On Mon, 15 Sep 2003, Troels Ring wrote: Dear Friends, I'm using WinXP and R 1.7.1, plotting some data using dates on the x-axis, and wanted to use identify to show some points, but was told by identify that the x and y vectors producing a fine graph with 84 points were not equal in length. Below is Dato for the dates - length(Dato) finds 9 but str finds 84, as known. Will identify not work in this context? Best wishes Troels Ring Aalborg, Denmark Dato [1] 2000-01-04 2000-01-07 2000-01-10 2000-01-13 2000-01-17 ... [81] 2003-04-23 2003-05-14 2003-07-30 2003-08-14 length(Dato) [1] 9 str(Dato) `POSIXlt', format: chr [1:84] 2000-01-04 2000-01-07 2000-01-10 2000-01-13 ... -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 Cheers Petr Pikal [EMAIL PROTECTED]
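For anyone puzzled by the length(Dato) result above: a POSIXlt object is internally a list of nine components (sec, min, hour, mday, mon, year, wday, yday, isdst), and in the R versions discussed here length() counted those components rather than the dates (later R versions report the number of dates instead). A small sketch of the conversion being recommended:

```r
## Dates stored as POSIXlt, the class that caused the confusion above
Dato <- as.POSIXlt(c("2000-01-04", "2000-01-07", "2000-01-10"))

## Under the R version in this thread, length(Dato) counted the nine
## internal list components, not the three dates.
unclass(Dato)  # shows the underlying list structure

## Converting to POSIXct gives one element per date, which is what
## plot() and identify() need:
Dato.ct <- as.POSIXct(Dato)
length(Dato.ct)  # 3, matching the y vector
```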
[R] Fourth R Mailing List : R-packages
We (mainly the R core team) have been discussing the creation of another R mailing list, with the goal to fill the gap between R-help (very high volume, with its great merits) and R-announce (only for important R announcements (mostly R-core), hence __MODERATED__ and *very* low volume, and hence highly recommended for almost all users of R; *** all messages are forwarded to R-help ***). In the past, several CRAN package authors have rightly felt that they would like the announcement of a major update of a package to be a bit more prominent than the flood of messages on R-help, but (most of the time) they still weren't supposed nor granted to use R-announce for this. This has been one main motivation for this new mailing list: R-packages o all messages forwarded to R-help o moderated (i.e. not accepting posts by anyone), but CRAN package authors (and others, similarly qualified) can freely post without moderator interaction {unless there's abuse}. The corresponding (new) web page, http://www.stat.math.ethz.ch/mailman/listinfo/r-packages/ now has TITLE: R Packages / Extensions Announcements DESCRIPTION: A moderated board for announcements about contributed R packages and similar R project extensions. All messages are forwarded to R-help automatically, so please do not subscribe to this list if you are subscribed to R-help. For major announcements on the R project, see the R-announce mailing list instead. And R-project.org's Mailing Lists web page will describe it from tomorrow as: R-packages -- This list is for announcements as well, usually on the availability of new or enhanced contributed packages (on CRAN, typically). Note that the list is moderated. However, CRAN package authors (and others, similarly qualified) can freely post. As with R-announce, all messages to R-packages are automatically forwarded to the main R-help mailing list; hence you should only subscribe to R-packages if you are not subscribed to R-help. Use the web interface for information, subscription, archives, etc. 
Amount of mail to expect: Of course, we don't know yet, but I'd expect to see only a few messages per week. Finally, just re-iterating the obvious: o This is *NOT* a list for discussion, just announcements of extensions to R. o Only subscribe if you are *NOT* subscribed to R-help (but then, strongly consider doing it)! For more info, subscription, etc., please use the URL above. Your R mailing list maintainer, Martin Maechler [EMAIL PROTECTED] http://stat.ethz.ch/~maechler/ Seminar fuer Statistik, ETH-Zentrum LEO C16, Leonhardstr. 27 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-3408 fax: ...-1228
[R] simplifying randomForest(s)
Dear All, I have been using the randomForest package for a couple of difficult prediction problems (which also share p >> n). The performance is good, but since all the variables in the data set are used, interpretation of what is going on is not easy, even after looking at the variable importance as produced by the randomForest run. I have tried a simple variable selection scheme, and it does seem to perform well (as judged by leave-one-out), but I am not sure if it makes any sense. The idea is, in a kind of backwards elimination, to eliminate one by one the variables with the smallest importance (or all the ones with negative importance in one go) until the out-of-bag estimate of the classification error becomes larger than that of the previous model (or of the initial model). So nothing really new. But I haven't been able to find any comments in the literature about simplification of random forests. Any suggestions/comments? Best, Ramón -- Ramón Díaz-Uriarte Bioinformatics Unit Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +34-91-224-6972 Phone: +34-91-224-6900 http://bioinfo.cnio.es/~rdiaz
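The backwards-elimination scheme described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the stopping rule and the importance column used ("MeanDecreaseAccuracy", and the "OOB" column of err.rate, as named in the randomForest package) are assumptions, and the caveat in the replies below about the OOB error becoming biased under this kind of selection applies in full.

```r
library(randomForest)

## Illustrative backwards elimination by variable importance.
## x: predictor data frame/matrix; y: factor response.
backward.rf <- function(x, y, ntree = 500) {
  rf <- randomForest(x, y, ntree = ntree, importance = TRUE)
  best.err <- rf$err.rate[ntree, "OOB"]
  while (ncol(x) > 2) {
    imp  <- importance(rf)[, "MeanDecreaseAccuracy"]
    drop <- names(which.min(imp))          # least important variable
    x.new  <- x[, colnames(x) != drop, drop = FALSE]
    rf.new <- randomForest(x.new, y, ntree = ntree, importance = TRUE)
    err.new <- rf.new$err.rate[ntree, "OOB"]
    if (err.new > best.err) break          # stop when OOB error worsens
    x <- x.new; rf <- rf.new; best.err <- err.new
  }
  rf
}
```

As the discussion later in this thread makes clear, the OOB error of the final forest is no longer an honest error estimate; an outer cross-validation loop around the whole procedure is needed for that.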
Re: [R] Persp and color
Hi, If you run the demo for persp (I have R 1.7), you will see that there is a good example of 'colouring' a volcano according to different heights; just try demo(persp) and check out the code. You probably will find it too complicated, as I did; I was trying to do the same and honestly I wasn't able to. However, there is a way around, and it is to use the function wireframe from the lattice package: library(lattice) ?wireframe If you run through the help examples you'll see that it is a lot easier to colour the surfaces the way you want using this function. However, wireframe is extremely slow, so if you have a big matrix it might be a pain in the behind. Also, the way you feed the data to wireframe is different to the way you do it with the persp function. I hope this is of some help. M.
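A minimal sketch of the wireframe route being suggested, with the surface coloured by height via drape (the test surface here is made up; drape and col.regions are arguments documented on the wireframe help page):

```r
library(lattice)

## A small test surface: a 2-D Gaussian bump on a 30 x 30 grid
g <- seq(-3, 3, length = 30)
z <- outer(g, g, function(x, y) exp(-(x^2 + y^2) / 2))

## drape = TRUE colours the facets by height; col.regions sets the palette
wireframe(z, drape = TRUE, col.regions = terrain.colors(100))
```

Unlike persp(), no hand-built colour vector is needed; the height-to-colour mapping is done for you, which is the "built-in" convenience referred to above.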
[R] package documentation
Dear all, I am writing my first package and everything seems to work (at least up to now). However, when I try to build the documentation (.dvi or .pdf) using Rcmd Rd2dvi.sh --pdf mypack.Rd I get a mypack.pdf whose title is R documentation of mypack.Rd instead of The mypack package as it should be. Is that right? Also Version, Title, License, namely the info from the DESCRIPTION file, are missing from the first page. Where is the problem? Many thanks, vito
Re: [R] package documentation
On Tue, 16 Sep 2003, Vito Muggeo wrote: I am writing my first package and everything seems to work (at least up to now). However, when I try to build the documentation (.dvi or .pdf) using Rcmd Rd2dvi.sh --pdf mypack.Rd I get a mypack.pdf whose title is R documentation of mypack.Rd instead of The mypack package as it should be. Is that right? It is right, rather than you: it did as you asked and not as you wanted. Also Version, Title, License, namely the info from the DESCRIPTION file, are missing from the first page. Where is the problem? gannet% R CMD Rd2dvi --help Usage: R CMD Rd2dvi [options] files Generate DVI (or PDF) output from the Rd sources specified by files, by either giving the paths to the files, or the path to a directory with the sources of a package. You haven't called this with the option to give what you expected. Note what follows the `or': give the path to the package directory, not to a single .Rd file. That will give Package 'mypack' as the title. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
RE: [R] simplifying randomForest(s)
Ramon, From: Ramon Diaz-Uriarte [mailto:[EMAIL PROTECTED]] Dear All, I have been using the randomForest package for a couple of difficult prediction problems (which also share p >> n). The performance is good, but since all the variables in the data set are used, interpretation of what is going on is not easy, even after looking at the variable importance as produced by the randomForest run. I have tried a simple variable selection scheme, and it does seem to perform well (as judged by leave-one-out), but I am not sure if it makes any sense. The idea is, in a kind of backwards elimination, to eliminate one by one the variables with the smallest importance (or all the ones with negative importance in one go) until the out-of-bag estimate of the classification error becomes larger than that of the previous model (or of the initial model). So nothing really new. But I haven't been able to find any comments in the literature about simplification of random forests. This is quite a hazardous game. We've been burned by this ourselves. I'll send you a paper we submitted on variable selection for random forests off-line. (Those who are interested, let me know.) The basic problem is that when you select important variables by RF and then re-run RF with those variables, the OOB error rate becomes biased downward. As you iterate more times, the overfitting becomes more and more severe (in the sense that the OOB error rate will keep decreasing while the error rate on an independent test set will be flat or increase). I was naïve enough to ask Breiman about this, and his reply was something like any competent statistician would know that you need something like cross-validation to do that... In the upcoming version 5 of Breiman's Fortran code, he offers an option to run RF twice, the first time with all variables, and the second with the k (selected by the user) most important variables from the 1st run. 
The OOB error rate from the 2nd run is no longer unbiased, but the bias is probably not too severe with only one iteration. Best, Andy Any suggestions/comments? Best, Ramón -- Ramón Díaz-Uriarte Bioinformatics Unit Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +34-91-224-6972 Phone: +34-91-224-6900 http://bioinfo.cnio.es/~rdiaz
[R] RSPython crashes (using R 1.7.1 under Solaris 5.9)
Hello. I tried to install RSPython on Solaris 5.9. After compiling R with --enable-R-shlib, I tried to install RSPython using R INSTALL --clean RSPython. This led to an error complaining about a missing libutil, which seems not to exist on Solaris. Therefore I just removed the -lutil entry in configure and tried to install again. The installation worked without problems, but after calling python and importing RS the python interpreter immediately crashes with a segmentation fault. Thanks for any hints. Michael
RE: [R] Interfacing C++ , MysQL and R
From: Anne Piotet [mailto:[EMAIL PROTECTED]] Hello! After a presentation of some statistical analysis of process data (where the few R possibilities I was able to show made quite a big impression), I was asked if it was possible to program a statistical application which could be used directly by the end user. Such an application would include a user-friendly interface (developed in C++), a database, a core statistical program, and standard output; the necessary queries and statistical procedures would be interactively generated from the user input by the C++ program. As I do not intend to reprogram the necessary statistical functions if I can help it, I'm interested to know if a) it is possible to integrate R in such a way? b) naturally, as I would sell the end product, what the royalty arrangements are Others will know more about this, but that never stopped me from tossing in my $0.02... As R is licensed under the GPL, if you distribute (e.g., sell) your code, it will have to be GPL'ed as well. I believe that means while you can sell it for money, 1. You have to make it clear to whoever gets the code that it's GPL'ed. 2. You have to distribute the source code, or allow a way for people to get the source code. 3. You cannot restrict further distribution of the code, free or otherwise. My understanding of how RedHat deals with this (at least in their enterprise server product) is by tacking onto the GPL a term that whoever installs their software agrees to purchase a service/support contract from them. Another company that has software linked to R does a similar thing, by not selling the software but the service (installation and training). HTH, Andy c) has anybody on the list experience with such a project? Thanks for the help! 
Anne
Re[2]: [R] Persp and color and adding a color vector
Following Prof. Uwe's idea, and after checking up some docs, I was able to build a colour vector with the correct colours and then call it from persp using the col = option... Nevertheless I still have a small problem... using something like: colorvect <- rainbow(length(mat3), start=0.1, end=0.8) persp(mat3, col=colorvect, box=FALSE, theta=30) works something like I need... But... if I try to visualize a specific part like mat3[1:900,2:78]... persp(mat3[1:900,2:78], col=colorvect, box=FALSE, theta=30) What I get is a bleached result with only part of the colours... I know that I was expecting this, but how can I avoid it without remaking the colorvect vector each time I call persp? The other idea is making an equivalent matrix with each cell carrying the colour info... but how can I automate that kind of procedure? Thanks Mark Marques
[R] gam and concurvity
Hello, in the paper Avoiding the effects of concurvity in GAM's of Figueiras et al. (2003) it is mentioned that in GLM collinearity is taken into account in the calculation of the standard errors, but not in GAM (resulting in confidence intervals that are too narrow and understated p-values; GAM S-Plus version). I haven't found any references to GAM and concurvity or collinearity on the R page, and I wonder if the R version of GAM differs on this point. Another question would be, what is the best manual way of doing variable selection, given the lack of a stepwise procedure for GAM? Include the first variables, add var1, and if the GCV improves (what would be considered an improvement?) or the P-value is significant, keep it, otherwise drop it - then add var2, and so on? thanks in advance, cheers Martin
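The manual forward-selection loop described here can be sketched with mgcv as follows. This is only one possible reading of the procedure, not an endorsed method: the data set and variable names are made up, and the gcv.ubre component holding the GCV score is the name used in mgcv's documentation.

```r
library(mgcv)

## Compare nested smooth models by GCV: keep var2 only if adding
## s(var2) lowers the GCV score (mydata, y, var1, var2 are illustrative).
fit1 <- gam(y ~ s(var1),           data = mydata)
fit2 <- gam(y ~ s(var1) + s(var2), data = mydata)

if (fit2$gcv.ubre < fit1$gcv.ubre) {
  fit <- fit2   # var2 improved the GCV score; keep it
} else {
  fit <- fit1   # no improvement; drop var2 and try the next candidate
}
```

What counts as a meaningful GCV improvement (the question raised above) is a judgment call; repeated greedy comparisons like this inherit the usual risks of stepwise selection.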
Re: [R] Persp and color
Hi, first of all I would like to say that honestly the persp demo is quite impressive, I won't take that away from you. The only problem I had was that the code that actually builds the matrix of topo-colours used in the demo is quite complicated (at least for me), and that code is poorly commented. So I was left with a series of help() calls to try to see what each function would do, etc., etc. While I was in that process I remembered the wireframe function, and after checking its documentation I found out that it has 'built-in' the ability to create these topo colours, which I think is a great advantage. Maybe a good idea would be to insert the procedure you used to create the colours into the persp function itself, so humble neophyte users can easily plot striking volcano surfaces. This is actually the bit of code I couldn't work out; I know I would if I could just invest more of my precious time in it: fcol <- fill zi <- volcano[-1, -1] + volcano[-1, -61] + volcano[-87, -1] + volcano[-87, -61] fcol[-i1, -i2] <- terrain.colors(20)[cut(zi, quantile(zi, seq(0, 1, len = 21)), include.lowest = TRUE)] persp(x, y, 2 * z, theta = 110, phi = 40, col = fcol, scale = FALSE, ltheta = -120, shade = 0.4, border = NA, box = FALSE) Just another thing: I have realised that the demo runs from beginning to end without stopping (not always); that is not very nice because the plots are displayed too quickly to appreciate, so the user is left to 'run' the demo manually, i.e. copying and pasting each bit of code in order to see each plot in detail. I am aware that R is the product of the cooperation of many people, contributing part of their work-time into making it better; I think your demo is fine, and perhaps you won't have time to improve on it, don't worry about that (no bad feelings). You correctly pointed out that a better way around was to ask R-help directly. Certainly, that is what I intended to do with my own problem, but first I wanted to write my code properly. 
Tomorrow, I will post a thread about making persp representations of fractals in R, and maybe you will be able to help me in showing how to correctly apply the colours to the surface. Maybe you will find this interesting, and who knows, perhaps you will put it in your demo! By the way, I did also check help(persp), and how the colours are assigned to the surface facets is not well specified, not even in the examples (as far as I am aware). Thanks, Mario. At 14:41 16/09/03 +0200, you wrote: ucgamdo == ucgamdo [EMAIL PROTECTED] on Tue, 16 Sep 2003 11:46:18 +0100 writes: ucgamdo Hi, If you run the demo for persp (I have R 1.7), ucgamdo you will see that there is a good example of ucgamdo 'colouring' a volcano according to different ucgamdo heights, just try demo(persp) ucgamdo and check out the code. You probably will find it ucgamdo too complicated as I did, I was trying to do the ucgamdo same and honestly I wasn't able to. Thank you for your honesty. As a main author of that part of demo(persp) I'm quite interested to find out what the problem was. I assume you have also looked at help(persp)? ucgamdo However, there is a way around [ another way around would be to ask on R-help or ask someone who knows R better ... ] ucgamdo and it is to use the ucgamdo function wireframe from the lattice package library(lattice) ?wireframe ucgamdo If you run through the help examples you'll see ucgamdo that it is a lot easier to colour the surfaces the ucgamdo way you want using this function. However, ucgamdo wireframe is extremely slow, so, if you have a big ucgamdo matrix it might be a pain in the behind. Also, the ucgamdo way you feed the data to wireframe is different to ucgamdo the way you do it with the persp function. I hope ucgamdo this is of some help. ucgamdo M.
Re: [R] Persp and color and adding a color vector
Mark Marques wrote: Following Prof. Uwe's Who's that? If you mean me, I am not a Professor... idea, and after checking up some docs, I was able to build a colour vector with the correct colours and then call it from persp using the col = option... Nevertheless I still have a small problem... using something like: colorvect <- rainbow(length(mat3), start=0.1, end=0.8) Attention: ?persp tells you there are (nx-1)*(ny-1) facets given you have a matrix of dimension nx x ny. Additionally, this was not my idea; at the least you have to select the colors by height, if I understood your question correctly. persp(mat3, col=colorvect, box=FALSE, theta=30) works something like I need... But... if I try to visualize a specific part like mat3[1:900,2:78]... persp(mat3[1:900,2:78], col=colorvect, box=FALSE, theta=30) What I get is a bleached result with only part of the colours... What about writing a little function along the lines of foo <- function(M){ colvect <- ...(M)... persp(M, ...) } and calling it with foo(mat3[1:900,2:78]) I know that I was expecting this, but how can I avoid it without remaking the colorvect vector each time I call persp? As already mentioned, generate some colors and then take one color for a certain range of values of your matrix. Anyway, wireframe() in package lattice might have some features that are much more convenient for performing your task, as mentioned by someone else (too lazy to look into the archives). Uwe Ligges The other idea is making an equivalent matrix with each cell carrying the colour info... but how can I automate that kind of procedure? Thanks Mark Marques
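Filling in the foo() outline above, here is one hedged way to write it: recompute the height-based facet colours for whatever submatrix is passed in, approximating each facet's height by the mean of its four corners (as demo(persp) does). The function name, palette choice, and number of colour bins are all illustrative.

```r
foo <- function(M, n.col = 100, ...) {
  nx <- nrow(M); ny <- ncol(M)
  ## one height per facet: average the four corners of each of the
  ## (nx-1) * (ny-1) facets, as noted in ?persp
  zfacet <- (M[-1, -1] + M[-1, -ny] + M[-nx, -1] + M[-nx, -ny]) / 4
  pal <- rainbow(n.col, start = 0.1, end = 0.8)
  ## cut() bins the facet heights; the bin codes index the palette
  colvect <- pal[cut(zfacet, n.col)]
  persp(M, col = colvect, box = FALSE, theta = 30, ...)
}

## colours now follow the heights of the submatrix itself:
foo(volcano[1:50, 2:50])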
[R] Retrieve ... argument values
Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim plot(x, ...) } x <- rnorm(100) myfunc(x, ylim=c(-0.5, 0.5)) Error in myfunc(x, ylim = c(-0.5, 0.5)) : Object ylim not found I need to retrieve the values of ylim (if it is defined when the function is called) for later use in the function. Can anybody give me some hint? Thanks a lot. Huan
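One way around the error above (hasArg() only tells you whether the argument is present; it does not bind a variable called ylim): capture the dots with list(...) and pull out ylim by name. A minimal sketch:

```r
myfunc <- function(x, ...) {
  dots <- list(...)            # capture all ... arguments, named
  if (!is.null(dots$ylim)) {
    a <- dots$ylim             # the ylim value, available for later use
  }
  plot(x, ...)
}

x <- rnorm(100)
myfunc(x, ylim = c(-0.5, 0.5))  # no error; inside, a is c(-0.5, 0.5)
```

Note that list(...) evaluates the dotted arguments; if lazy evaluation matters, match.call() is an alternative worth looking at.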
R: [R] gam and concurvity
As someone (Simon Wood, for instance) could explain much better, and as is stressed in the help files of the mgcv package (the package including the gam() function), gam in R is not a clone of gam in S+. S+ uses backfitting while R uses penalized splines (see the references inside the gam() function). The approaches are quite different and can lead to substantial differences in particular cases, for instance with concurvity. best, vito PS Can you point out the exact reference for Figueiras et al. (2003)? - Original Message - From: Martin Wegmann [EMAIL PROTECTED] To: R-list [EMAIL PROTECTED] Sent: Tuesday, September 16, 2003 3:47 PM Subject: [R] gam and concurvity Hello, in the paper Avoiding the effects of concurvity in GAM's of Figueiras et al. (2003) it is mentioned that in GLM collinearity is taken into account in the calculation of the standard errors, but not in GAM (resulting in confidence intervals that are too narrow and understated p-values; GAM S-Plus version). I haven't found any references to GAM and concurvity or collinearity on the R page, and I wonder if the R version of GAM differs on this point. Another question would be, what is the best manual way of doing variable selection, given the lack of a stepwise procedure for GAM? Include the first variables, add var1, and if the GCV improves (what would be considered an improvement?) or the P-value is significant, keep it, otherwise drop it - then add var2, and so on? thanks in advance, cheers Martin
Re: [R] Persp and color
On Tue, 16 Sep 2003 [EMAIL PROTECTED] wrote: Just another thing: I have realised that the demo runs from beginning to end without stopping (not always); that is not very nice because the plots are displayed too quickly to appreciate, so the user is left to 'run' the demo manually, i.e. copying and pasting each bit of code in order to see each plot in detail. I am aware that R is the product of the cooperation of many people, contributing part of their work-time into making it better; I think your demo is fine, and perhaps you won't have time to improve on it, don't worry about that (no bad feelings). If you type par(ask=TRUE) you will always be prompted before a new graph is drawn. -thomas
Re: [R] simplifying randomForest(s)
Dear Andy, Thanks a lot for your message. This is quite a hazardous game. We've been burned by this ourselves. I'll send you a paper we submitted on variable selection for random forests off-line. (Those who are interested, let me know.) Thanks! The basic problem is that when you select important variables by RF and then re-run RF with those variables, the OOB error rate becomes biased downward. As you iterate more times, the overfitting becomes more and more severe (in the sense that the OOB error rate will keep decreasing while the error rate on an independent test set will be flat or increase). I was naïve enough to ask Breiman about this, and his reply was something like any competent statistician would know that you need something like cross-validation to do that... Yes, I understand the points you are making. However, I have tried to achieve protection against this problem by assessing the leave-one-out cross-validation error (LOOCVE) of the complete selection process. And the LOOCVE suggests this is working. Within the variable selection routine the OOB error rate is biased, but I guess that does not concern me that much, because I only use it to guide the selection. However, my final estimate of error comes from the LOOCVE. This is the skeleton of the algorithm: n <- length(y) for(i in 1:n) { the.simple.rf <- simplify.the.rf(data = data[-i, ]) prediction[i] <- predict(the.simple.rf, newdata = data[i, ]) } loocve <- sum(y != prediction) / n Thus, the LOOCVE is computed with observations that were never used for the simplification of the tree that is predicting them. [I'll be glad to send my code to anyone interested]. And the interesting thing with the data set I have tried is that it seems to perform reasonably (actually, the LOOCVE of a tree with the reduced set of variables is smaller than the LOOCVE of the original tree). (This is a first shot. 
I have a small sample size (29) so LOOCV is not that bad in terms of computation, although I am aware it can have high variance. I guess I could try the .632+ bootstrap method). Best, Ramón Best, Andy -- Ramón Díaz-Uriarte Bioinformatics Unit Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +34-91-224-6972 Phone: +34-91-224-6900 http://bioinfo.cnio.es/~rdiaz
Re: [R] Persp and color
[EMAIL PROTECTED] wrote: Hi, first of all I would like to say that honestly the persp demo is quite impressive, I won't take that away from you. The only problem I had was that the code that actually builds the matrix of topo-colours used in the demo is quite complicated (at least for me), and that code is poorly commented. So I was left with a series of help() calls to try to see what each function would do, etc., etc. While I was in that process I remembered the wireframe function, and after checking its documentation I found out that it has 'built-in' the ability to create these topo colours, which I think is a great advantage. Maybe a good idea would be to insert the procedure you used to create the colours into the persp function itself, so humble neophyte users can easily plot striking volcano surfaces. This is actually the bit of code I couldn't work out; I know I would if I could just invest more of my precious time in it: fcol <- fill zi <- volcano[-1, -1] + volcano[-1, -61] + volcano[-87, -1] + volcano[-87, -61] Since dim(volcano) [1] 87 61 you have to throw away some points at the margins, because you need (nx-1)*(ny-1) facets' colors. And you want the color to be specified for the middle of the facets, not one of the 4 corners, so you average the matrices of those 4 corners. fcol[-i1, -i2] <- terrain.colors(20)[cut(zi, quantile(zi, seq(0, 1, len = 21)), include.lowest = TRUE)] You use 20 different colors, chosen (indexed) by quantiles of the matrix calculated above. That's the obvious idea (nicely implemented here, though). persp(x, y, 2 * z, theta = 110, phi = 40, col = fcol, scale = FALSE, ltheta = -120, shade = 0.4, border = NA, box = FALSE) Just another thing: I have realised that the demo runs from beginning to end without stopping (not always); that is not very nice because the plots are displayed too quickly to appreciate, so the user is left to 'run' the demo manually, i.e. 
copying and pasting each bit of code in order to see each plot in detail. I am aware that R is the product of the cooperation of many people contributing part of their work-time to making it better. I think your demo is fine, and perhaps you won't have time to improve on it; don't worry about that (no bad feelings). The improvement seems to be: par(ask = TRUE); demo(persp) Uwe Ligges You correctly pointed out that a better way around was to ask R-help directly. Certainly, that is what I intended to do with my own problem, but first I wanted to write my code properly. Tomorrow I will post a thread about making persp representations of fractals in R, and maybe you will be able to help me by showing how to correctly apply the colours to the surface. Maybe you will find it interesting, and who knows, perhaps you will put it in your demo! By the way, I did also check help(persp), and how the colours are assigned to the surface facets is not well specified, not even in the examples (as far as I am aware). Thanks, Mario. At 14:41 16/09/03 +0200, you wrote: ucgamdo == ucgamdo [EMAIL PROTECTED] on Tue, 16 Sep 2003 11:46:18 +0100 writes: ucgamdo Hi, if you run the demo for persp (I have R 1.7), ucgamdo you will see that there is a good example of ucgamdo 'colouring' a volcano according to different ucgamdo heights; just try demo(persp) ucgamdo and check out the code. You probably will find it ucgamdo too complicated, as I did; I was trying to do the ucgamdo same and honestly I wasn't able to. Thank you for your honesty. As a main author of that part of demo(persp) I'm quite interested to find out what the problem was. I assume you have also looked at help(persp)? ucgamdo However, there is a way around [ another way around would be to ask on R-help or ask someone who knows R better ... ... 
] ucgamdo and it is to use the ucgamdo function wireframe from the lattice package: library(lattice); ?wireframe ucgamdo If you run through the help examples you'll see ucgamdo that it is a lot easier to colour the surfaces the ucgamdo way you want using this function. However, ucgamdo wireframe is EXTREMELY slow, so if you have a big ucgamdo matrix it might be a pain in the behind. Also, the ucgamdo way you feed the data to wireframe is different from ucgamdo the way you do it with the persp function. I hope ucgamdo this is of some help. ucgamdo M. ucgamdo __ ucgamdo [EMAIL PROTECTED] mailing list ucgamdo https://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
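The colouring trick explained above can be written out compactly. A minimal, self-contained sketch using only the built-in volcano matrix and base graphics (the theta/phi viewpoint and the choice of 20 colours are illustrative, not from the demo):

```r
z  <- volcano
nx <- nrow(z)   # 87
ny <- ncol(z)   # 61
## each facet's colour is driven by the mean height of its four corners,
## so drop one row/column from each margin and average the four shifts
zfacet <- (z[-1, -1] + z[-1, -ny] + z[-nx, -1] + z[-nx, -ny]) / 4
## cut the facet heights into 20 quantile bins and index the terrain palette
fcol <- terrain.colors(20)[cut(zfacet,
                               quantile(zfacet, seq(0, 1, len = 21)),
                               include.lowest = TRUE)]
persp(z, theta = 135, phi = 30, col = fcol, border = NA)
```

The key constraint is that persp() wants one colour per facet, i.e. a vector of length (nx-1)*(ny-1), which is exactly what the averaged corner matrix provides.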
Re: [R] Retrieve ... argument values
On Tue, 16 Sep 2003 [EMAIL PROTECTED] wrote: Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim; plot(x, ...) } One solution is dots <- substitute(list(...)); a <- dots$ylim which sets a to NULL if there is no ylim argument and to the ylim argument if it exists. -thomas __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Retrieve ... argument values
For most purposes a more useful technique is to write the function with a default NULL argument, myfunc <- function(x, ylim = NULL), so that it can be called as myfunc(x) or myfunc(x, y). Inside the function you test for !is.null(ylim) and take appropriate action. Alternatively, and maybe more commonly, you give ylim a sensible default so the caller has to be explicit about setting ylim to NULL if required. As it happens, in your specific case you can write it as you did except for the line if (hasArg(ylim)) a <- ylim, because all arguments will get passed to plot, which will itself recognise and test for an argument called ylim within its own code. HTH -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: 16 September 2003 15:14 To: [EMAIL PROTECTED] Subject: [R] Retrieve ... argument values Security Warning: If you are not sure an attachment is safe to open please contact Andy on x234. There are 0 attachments with this message. Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim; plot(x, ...) } x <- rnorm(100) myfunc(x, ylim=c(-0.5, 0.5)) Error in myfunc(x, ylim = c(-0.5, 0.5)) : Object "ylim" not found I need to retrieve the values of ylim (if it is defined when the function is called) for later use in the function. Can anybody give me some hint? Thanks a lot. Huan This message and any attachments (the message) is\ intende...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help Simon Fear Senior Statistician Syne qua non Ltd Tel: +44 (0) 1379 69 Fax: +44 (0) 1379 65 email: [EMAIL PROTECTED] web: http://www.synequanon.com Number of attachments included with this message: 0 This message (and any associated files) is confidential and\...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
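The default-NULL pattern described above, as a runnable sketch (the range(x) fallback is an illustrative choice of "appropriate action", not from the original post):

```r
myfunc <- function(x, ylim = NULL) {
  if (is.null(ylim))
    ylim <- range(x)            # take appropriate action when not supplied
  plot(x, ylim = ylim)
  invisible(ylim)               # return the limits actually used
}

x <- rnorm(100)
myfunc(x)                       # default: full range of the data
myfunc(x, ylim = c(-0.5, 0.5))  # caller-supplied limits
```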
Re: [R] Retrieve ... argument values
[EMAIL PROTECTED] writes: Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim; plot(x, ...) } x <- rnorm(100) myfunc(x, ylim=c(-0.5, 0.5)) Error in myfunc(x, ylim = c(-0.5, 0.5)) : Object "ylim" not found I need to retrieve the values of ylim (if it is defined when the function is called) for later use in the function. Can anybody give me some hint? Yes, several: "ylim" %in% names(match.call(expand.dots=FALSE)$...) or "ylim" %in% names(list(...)) (use the former if it is somehow important not to evaluate the arguments). Or even a <- list(...)$ylim and then check for is.null(a). -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
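The hints above, in runnable form (the helper names are illustrative; each probes the ... list for an argument called ylim):

```r
## unevaluated: inspect the call itself, without forcing the arguments
has_ylim_unevaluated <- function(...)
  "ylim" %in% names(match.call(expand.dots = FALSE)$...)

## evaluated: capture the dots as a list and look at its names
has_ylim_evaluated <- function(...)
  "ylim" %in% names(list(...))

## or just extract it: NULL when ylim is absent
get_ylim <- function(...) list(...)$ylim
```

Inside a real plotting wrapper one would call these idioms directly rather than through helpers; the helpers only make the three variants easy to compare.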
Re: R: [R] gam and concurvity
On Tuesday 16 September 2003 16:28, Vito Muggeo wrote: As someone (Simon Wood, for instance) could explain much better, and as is stressed in the help files of the mgcv package (the package including the gam() function), gam in R is not a clone of gam in S+. S+ uses backfitting while R uses penalized splines (see the references in the gam() help). The approaches are quite different and can lead to substantial differences in particular cases, for instance with concurvity. best, vito PS Can you point out the exact reference for Figueiras et al. (2003)? I haven't found a journal name, but the PDF download is http://isi-eh.usc.es/trabajos/110_70_fullpaper.pdf - Original Message - From: Martin Wegmann [EMAIL PROTECTED] To: R-list [EMAIL PROTECTED] Sent: Tuesday, September 16, 2003 3:47 PM Subject: [R] gam and concurvity Hello, in the paper "Avoiding the effects of concurvity in GAM's" of Figueiras et al. (2003) it is mentioned that in GLM collinearity is taken into account in the calculation of the standard errors but not in GAM (this results in confidence intervals that are too narrow and understated p-values; S-Plus version of GAM). I haven't found any references to GAM and concurvity or collinearity on the R page, and I wonder if the R version of gam differs on this point. Another question would be: what is the best manual way to do variable selection, given the lack of a stepwise procedure for GAM? Include the first variables, then add var1; if GCV improves (what would be considered an improvement?) or the p-value is significant, keep it, otherwise drop it; then add var2, and so on? thanks in advance, cheers Martin __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Retrieve ... argument values
Huan - Look at the function code for order(). To show the function definition, type just order at the command line (no quotes, no parentheses). This example is what I found most useful when I had a similar question. The green book is also useful. - tom blackwell - u michigan medical school - ann arbor - On Tue, 16 Sep 2003 [EMAIL PROTECTED] wrote: Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim; plot(x, ...) } I need to retrieve the values of ylim (if it is defined when the function is called) for later use in the function. Can anybody give me some hint? Thanks a lot. Huan __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Retrieve ... argument values
Try: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ...$ylim; plot(x, ...) } HTH, Andy -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 16, 2003 10:14 AM To: [EMAIL PROTECTED] Subject: [R] Retrieve ... argument values Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim; plot(x, ...) } x <- rnorm(100) myfunc(x, ylim=c(-0.5, 0.5)) Error in myfunc(x, ylim = c(-0.5, 0.5)) : Object "ylim" not found I need to retrieve the values of ylim (if it is defined when the function is called) for later use in the function. Can anybody give me some hint? Thanks a lot. Huan This message and any attachments (the message) is\ intende...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] ASA Stat. Computing and Stat. Graphics 2004 Student Paper competition
The Statistical Computing and Statistical Graphics Sections of the ASA are co-sponsoring a student paper competition on the topics of Statistical Computing and Statistical Graphics. Students are encouraged to submit a paper in one of these areas, which might be original methodological research, some novel computing or graphical application in statistics, or any other suitable contribution (for example, a software-related project). The selected winners will present their papers in a topic-contributed session at the 2004 Joint Statistical Meetings. The Sections will pay registration fees for the winners as well as a substantial allowance for transportation to the meetings and lodging. Enclosed below is the full text of the award announcement. More details can be found at the Stat. Computing Section website at http://www.statcomputing.org. Best Regards, --José Pinheiro Awards Chair ASA Statistical Computing Section Statistical Computing and Statistical Graphics Sections American Statistical Association Student Paper Competition 2004 The Statistical Computing and Statistical Graphics Sections of the ASA are co-sponsoring a student paper competition on the topics of Statistical Computing and Statistical Graphics. Students are encouraged to submit a paper in one of these areas, which might be original methodological research, some novel computing or graphical application in statistics, or any other suitable contribution (for example, a software-related project). The selected winners will present their papers in a topic-contributed session at the 2004 Joint Statistical Meetings. The Sections will pay registration fees for the winners as well as a substantial allowance for transportation to the meetings and lodging (which in most cases covers these expenses completely). Anyone who is a student (graduate or undergraduate) on or after September 1, 2003 is eligible to participate. 
An entry must include an abstract, a six page manuscript (including figures, tables and references), a C.V., and a letter from a faculty member familiar with the student's work. The applicant must be the first author of the paper. The faculty letter must include a verification of the applicant's student status and, in the case of joint authorship, should indicate what fraction of the contribution is attributable to the applicant. We prefer that electronic submissions of papers be in Postscript or PDF. All materials must be in English. All application materials MUST BE RECEIVED by 5:00 PM EST, Monday, January 5, 2004 at the address below. They will be reviewed by the Student Paper Competition Award committee of the Statistical Computing and Graphics Sections. The selection criteria used by the committee will include innovation and significance of the contribution. Award announcements will be made in late January, 2004. Additional important information on the competition can be accessed on the website of the Statistical Computing Section, www.statcomputing.org. A current pointer to the website is available from the ASA website at www.amstat.org. Inquiries and application materials should be emailed or mailed to: Student Paper Competition c/o Dr. José Pinheiro Biostatistics, Novartis Pharmaceuticals One Health Plaza, Room 419/2115 East Hanover, NJ 07936 [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Old libraries with new R?
Hi Folks, I'm currently installing R-1.7.1 off CRAN. As it happens, I have a CD (kindly made for me by Linux Emporium) containing all the libraries which were on CRAN early this year when I installed R-1.6.1. This is highly convenient, since the alternative would be several hours on-line. While a recent library will on installation announce the fact should it need a newer version of R than the one which is installed, presumably this is not likely to be the case for an old library if a newer version of R is incompatible with it. So is there a way of finding out whether a library dating from some time back is compatible with a recent R, other than simply trying it out to see if it works OK? With thanks, and best wishes to all, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 167 1972 Date: 16-Sep-03 Time: 15:54:16 -- XFMail -- __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Retrieve ... argument values
Yes, and I was wrong to say ylim=NULL was more useful; I should have said much easier to understand and much easier to read, debug and maintain. Of course for certain applications it IS worth getting to grips with ..., and other people's posts have been extremely useful in that regard. -Original Message- From: Ben Bolker [mailto:[EMAIL PROTECTED] Sent: 16 September 2003 16:18 To: Simon Fear Cc: [EMAIL PROTECTED]; R help list Subject: RE: [R] Retrieve ... argument values Yes, although this becomes tedious if (e.g.) you have a function that calls two different functions, each of which has many arguments (e.g. plot() and barplot()); then you have to set up a whole lot of arguments that default to NULL and, more annoyingly, you have to document them all in any .Rd file you create -- rather than just having a ... argument which you can say should contain arguments for either of the subfunctions (as long as the arguments don't overlap, of course). Ben __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Retrieve ... argument values
Yes, although this becomes tedious if (e.g.) you have a function that calls two different functions, each of which has many arguments (e.g. plot() and barplot()); then you have to set up a whole lot of arguments that default to NULL and, more annoyingly, you have to document them all in any .Rd file you create -- rather than just having a ... argument which you can say should contain arguments for either of the subfunctions (as long as the arguments don't overlap, of course). Ben On Tue, 16 Sep 2003, Simon Fear wrote: For most purposes a more useful technique is to write the function with a default NULL argument, myfunc <- function(x, ylim = NULL), so that it can be called as myfunc(x) or myfunc(x, y). Inside the function you test for !is.null(ylim) and take appropriate action. Alternatively, and maybe more commonly, you give ylim a sensible default so the caller has to be explicit about setting ylim to NULL if required. As it happens, in your specific case you can write it as you did except for the line if (hasArg(ylim)) a <- ylim, because all arguments will get passed to plot, which will itself recognise and test for an argument called ylim within its own code. HTH -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: 16 September 2003 15:14 To: [EMAIL PROTECTED] Subject: [R] Retrieve ... argument values Dear R users, I want to retrieve ... argument values within a function. Here is a small example: myfunc <- function(x, ...) { if (hasArg(ylim)) a <- ylim; plot(x, ...) } x <- rnorm(100) myfunc(x, ylim=c(-0.5, 0.5)) Error in myfunc(x, ylim = c(-0.5, 0.5)) : Object "ylim" not found I need to retrieve the values of ylim (if it is defined when the function is called) for later use in the function. Can anybody give me some hint? Thanks a lot. 
Huan This message and any attachments (the message) is\ intende...{{dropped}} __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- 620B Bartram Hall [EMAIL PROTECTED] Zoology Department, University of Florida http://www.zoo.ufl.edu/bolker Box 118525 (ph) 352-392-5697 Gainesville, FL 32611-8525 (fax) 352-392-3704 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Old libraries with new R?
(Ted Harding) wrote: Hi Folks, I'm currently installing R-1.7.1 off CRAN. As it happens, I have a CD (kindly made for me by Linux Emporium) containing all the libraries which were on CRAN early this year when I installed R-1.6.1. This is highly convenient, since the alternative would be several hours on-line. While a recent library will on installation announce the fact should it need a newer version of R than the one which is installed, presumably this is not likely to be the case for an old library if a newer version of R is incompatible with it. So is there a way of finding out whether a library dating from some time back is compatible with a recent R, other than simply trying it out to see if it works OK? With thanks, and best wishes to all, Ted. There is a Depends field in a package's DESCRIPTION file. There the package author *might* give information on a minimal required R version. But this is not checked, and the author does not always know about such dependencies, because he/she is probably developing on recent versions. Uwe Ligges __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
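For concreteness, a hypothetical DESCRIPTION excerpt showing the kind of declaration meant here (the package name, version, and dependencies are invented):

```
Package: somepkg
Version: 0.1-1
Depends: R (>= 1.7.0), nlme
```

As noted above, at this time such a minimal-R-version declaration is advisory rather than enforced, and it says nothing about whether an old package will keep working under a newer R.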
[R] gnls( ) question
Last week (Wed 9/10/2003, "regression questions") I posted a question regarding the use of gnls() and its dissimilarity to the syntax that nls() will accept. No one replied, so I partly answered my own question by constructing indicator variables for use in gnls(). The code I used to construct the indicators is at the end of this email. I do have a nagging, unanswered question: what exactly does Warning message: Step halving factor reduced below minimum in NLS step in: gnls(model = y ~ 5 + ...) mean? I have tried to address this by specifying control = list(maxIter = 1000, pnlsMaxIter = 200, msMaxIter = 1000, tolerance = 1e-06, pnlsTol = 1e-04, msTol = 1e-07, minScale = 1e-10, returnObject = TRUE) in my model calls, but this does not entirely eliminate the problem (I am running gnls() 24 separate times on separate data sets). Much thanks in advance, david paul #Constructing Indicator Variables indicator <- paste("foo$X <- sapply(foo$subject.id, FUN = function(x) if (x == X) 1 else 0)") indicator <- parse(text = indicator)[[1]] subjectID.foo <- as.factor(as.character(unique(foo$animal.id))) for (i in subjectID.foo) { INDICATOR <- do.call("substitute", list(indicator, list(i = i, X = as.character(subjectID.foo[i])))) eval(INDICATOR) } foo$Overall.Effect <- rep(1, length(foo$dose.group)) __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] how to print a plot
Thank you all very much! I did forget to use dev.off(). Everything works great now. Many thanks. Weiming Zhang On Tue, 2003-09-16 at 00:38, Prof Brian Ripley wrote: You did use dev.off() to finish the plots before trying to look at them? The symptoms you report are what happens if you did not. There is no ps() function in R: the postscript device is postscript(), not ps(). If you want to print a plot, try dev.print(). If you want to copy to a file, try dev.copy2eps(). (You are on Linux, where EPS is more widely acceptable than PDF.) On 15 Sep 2003, Weiming Zhang wrote: Hi, Thank both of you. I tried everything. pdf(file="out.pdf") gave me a damaged pdf file. ps() did not print. ps("out.ps") gave me a ps file with a badly drawn graph that could not be printed. I am using RH Linux 7.2. Thanks again. Weiming Zhang On Mon, 2003-09-15 at 16:06, Jason Turner wrote: On Tue, 2003-09-16 at 08:56, Weiming Zhang wrote: Hi, I am using R-1.7.1 on Linux. I integrated XEmacs with R. Could anybody tell me how to print a plot? I used the plot function to make some graphs and then I wanted to print them or save them to files. But I could not find out how to do it. Have you tried: help(Devices) help(pdf) What I do: pdf(file="myplots.pdf") plot(...) dev.off() Use Acrobat or gv to view the pdf files. Postscript is also good, but not as universally understood; I have many colleagues who work in very standard Windows environments, where ghostscript is unknown. PDF is a very sensible choice for e-mailing graphs. -- Indigo Industrial Controls Ltd. http://www.indigoindustrial.co.nz +64-(0)21-343-545 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
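The working recipe from this thread, in one self-contained sketch (the filename is illustrative):

```r
pdf(file = "myplots.pdf")  # open the file device *before* plotting
plot(rnorm(100))
dev.off()                  # close the device; skipping this step is what
                           # leaves the file incomplete ("damaged")
```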
[R] Question in Using sink function
Could anyone please explain to me why the following writes nothing into the all.Rout file? If the for loop is removed, the t.test output can be written into all.Rout. Thanks in advance. Minghua Yao .. zz <- file("all.Rout", open="wt") sink(zz) for(i in 1:n) { Cy3 <- X[, 2*i-1]; Cy5 <- X[, 2*i]; t.test(Cy3, Cy5) } sink() close(zz) .. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Question in Using sink function
Dear Minghua Yao, If you throw in a print() or two you'll get some output in your file. You could try print(t.test(Cy3, Cy5)) or whatever you actually want. Regards, Andrew C. Ward CAPE Centre Department of Chemical Engineering The University of Queensland Brisbane Qld 4072 Australia [EMAIL PROTECTED] Quoting Yao, Minghua [EMAIL PROTECTED]: Could anyone please explain to me why the following writes nothing into the all.Rout file? If the for loop is removed, the t.test output can be written into all.Rout. Thanks in advance. Minghua Yao .. zz <- file("all.Rout", open="wt") sink(zz) for(i in 1:n) { Cy3 <- X[, 2*i-1]; Cy5 <- X[, 2*i]; t.test(Cy3, Cy5) } sink() close(zz) .. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] Question in Using sink function
Autoprinting does not work inside a for() {} loop, and you did not print anything. Try for(i in 1:10) {i} Did you try your problem without sink()? On Tue, 16 Sep 2003, Yao, Minghua wrote: Could anyone please explain to me why the following writes nothing into the all.Rout file? If the for loop is removed, the t.test output can be written into all.Rout. Thanks in advance. Minghua Yao .. zz <- file("all.Rout", open="wt") sink(zz) for(i in 1:n) { Cy3 <- X[, 2*i-1]; Cy5 <- X[, 2*i]; t.test(Cy3, Cy5) } sink() close(zz) .. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Question in Using sink function
Thanks, Prof. Ripley. Right. I saw nothing, either, when I tried without the for loop. Is it mentioned anywhere in the documents that autoprinting does not work inside a for() {} loop? Minghua -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 16, 2003 11:35 AM To: Yao, Minghua Cc: R Help (E-mail) Subject: Re: [R] Question in Using sink function Autoprinting does not work inside a for() {} loop, and you did not print anything. Try for(i in 1:10) {i} Did you try your problem without sink()? On Tue, 16 Sep 2003, Yao, Minghua wrote: Could anyone please explain to me why the following writes nothing into the all.Rout file? If the for loop is removed, the t.test output can be written into all.Rout. Thanks in advance. Minghua Yao .. zz <- file("all.Rout", open="wt") sink(zz) for(i in 1:n) { Cy3 <- X[, 2*i-1]; Cy5 <- X[, 2*i]; t.test(Cy3, Cy5) } sink() close(zz) .. __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
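Putting this thread's advice together, a minimal corrected sketch (a toy matrix stands in for the poster's X):

```r
X <- matrix(rnorm(80), ncol = 4)  # toy data: two pairs of columns
n <- ncol(X) / 2

zz <- file("all.Rout", open = "wt")
sink(zz)
for (i in 1:n) {
  Cy3 <- X[, 2*i - 1]
  Cy5 <- X[, 2*i]
  print(t.test(Cy3, Cy5))  # explicit print(): autoprinting is off inside for()
}
sink()
close(zz)
```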
Re: [R] Old libraries with new R?
On Tue, 16 Sep 2003, Uwe Ligges wrote: (Ted Harding) wrote: Hi Folks, I'm currently installing R-1.7.1 off CRAN. As it happens, I have a CD (kindly made for me by Linux Emporium) containing all the libraries which were on CRAN early this year when I installed R-1.6.1. This is highly convenient, since the alternative would be several hours on-line. While a recent library will on installation announce the fact should it need a newer version of R than the one which is installed, presumably this is not likely to be the case for an old library if a newer version of R is incompatible with it. So is there a way of finding out whether a library dating from some time back is compatible with a recent R, other than simply trying it out to see if it works OK? With thanks, and best wishes to all, Ted. There is a Depends field in a package's DESCRIPTION file. There the package author *might* give information on a minimal required R version. But this is not checked, and the author does not always know about such dependencies, because he/she is probably developing on recent versions. I think Ted wants the reverse: will an old package source work with current R? That would need prescience beyond most package authors to know at the time the package was bundled. The best thing to do, I believe, is to first check if the version you have is the same as that on CRAN, and if not, try R CMD check on the old package. (If yes, you could check http://cran.r-project.org/src/contrib/checkSummary.html to see if it works on the CRAN machines.) -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] can predict ignore rows with insufficient info
I need predict to ignore rows that contain levels not in the model. Consider a data frame, const, that has columns for the number of days required to construct a site and the city and state the site was constructed in. g <- lm(days ~ city, data = const) Some of the sites in const have not yet been completed, and therefore they have days == NA. I want to predict how many days these sites will take to complete (I've simplified the above discussion to remove many of the other factors involved.) nconst <- subset(const, is.na(const$days)) x <- predict(g, nconst) Error in model.frame.default(object, data, xlev = xlev) : factor city has new level(s) ALBANY This is because we haven't yet completed a site in Albany. If I just had one to worry about I could easily fix it (choose a nearby market with similar characteristics) but I am dealing with several hundred cities. Instead, for the cities not modeled by g I'd simply like to use the state, even though I don't expect it to be as good: g <- lm(days ~ state, data = const) x <- predict(g, nconst) I'm not sure how to identify the cities in nconst that are not modeled by g (my actual model has many more predictors in the formula). Is there a way to instruct predict to only predict the rows for which it has enough information and not complain about the others? g <- lm(days ~ city, data = const) x <- predict(g, nconst) ## the rows of x with city == ALBANY will be NA g <- lm(days ~ state, data = const) y <- predict(g, nconst) x[is.na(x)] <- y[is.na(x)] thanks, pete __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
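A sketch of the fallback idea sketched at the end of the post, on a toy data frame (column names follow the post, the data are invented). lm() drops the NA-days rows when fitting, so the city model never sees ALBANY, and its xlevels component identifies the cities it can handle:

```r
## toy stand-in for the poster's data: days is NA for unfinished sites
const <- data.frame(
  days  = c(10, 12, 11, 14, NA, NA),
  city  = c("A", "A", "B", "C", "B", "ALBANY"),
  state = c("NY", "NY", "NY", "TX", "NY", "NY"),
  stringsAsFactors = FALSE
)

g.city  <- lm(days ~ city,  data = const)    # NA-days rows dropped in fitting
g.state <- lm(days ~ state, data = const)
nconst  <- subset(const, is.na(days))

ok <- nconst$city %in% g.city$xlevels$city   # rows the city model covers
x  <- rep(NA_real_, nrow(nconst))
x[ok]  <- predict(g.city,  nconst[ok,  , drop = FALSE])
x[!ok] <- predict(g.state, nconst[!ok, , drop = FALSE])  # state fallback
```

With many factor predictors the `ok` test would be repeated per factor (or combined with `&` across factors), which is essentially the approach worked out later in this thread.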
[R] help(print) seems truncated
Dear r-help - I just noticed that in my R-1.7.1 on i386-pc-linux-gnu, the page displayed by help(print) ends with the line ## Printing of factors illustrated for ex and then no more. It looks as though something got truncated here. I think this is an R that I compiled from source off of CRAN, but I can't quite remember. - tom blackwell - u michigan medical school - ann arbor - __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Question in Using sink function
On Tue, 16 Sep 2003, Yao, Minghua wrote: Thanks, Prof. Ripley. Right. I saw nothing, either, when I tried without for loop. Does anywhere in the documents mention that Autoprinting does not work inside a for() {} loop? It is in `An Introduction to R', albeit in a rather sophisticated way, and of course in all good books on R/S. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
RE: [R] Question in Using sink function
On Tue, 16 Sep 2003, Yao, Minghua wrote: Thanks, Prof. Ripley. Right. I saw nothing, either, when I tried without for loop. Does anywhere in the documents mention that Autoprinting does not work inside a for() {} loop? It's a FAQ. -thomas __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] help(print) seems truncated
Thomas W Blackwell wrote: Dear r-help - I just noticed that in my R-1.7.1 on i386-pc-linux-gnu, the page displayed by help(print) ends with the line ## Printing of factors illustrated for ex and then no more. It looks as though something got truncated here. I think this is an R that I compiled from source off of CRAN, but I can't quite remember. - tom blackwell - u michigan medical school - ann arbor - It's still in the R-1.8.0 alpha sources from yesterday and was introduced between R-1.5.1 and R-1.6.2. It might be fixed before this message comes through, hence this is not sent as a bug report ... Uwe Ligges __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] path analysis
There is a library sem for structural equation models. Best, Christian Hennig On Mon, 15 Sep 2003, Catherine Stein wrote: Can anyone help me find an R script that does path analysis with family data (like a Beta model)? A script that takes the variance-covariance matrix in as input would be ideal. Thanks! Please email me with any ideas! Cathy Stein [EMAIL PROTECTED] -- *** Christian Hennig Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently) and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg [EMAIL PROTECTED], http://stat.ethz.ch/~hennig/ [EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/ ### I recommend www.boag-online.de
Re: [R] can predict ignore rows with insufficient info
On Tue, Sep 16, 2003 at 11:44:02AM -0500, Peter Whiting wrote: I'm not sure how to identify the cities in nconst that are not modeled by g (my actual model has many more predictors in the formula). I guess I could use some form of subset(const, const$city %in% g$xlevels$city) over and over again for each factor... as usual, there has to be a better way. pete

Is there a way to instruct predict to only predict the rows for which it has enough information and not complain about the others?

g <- lm(days ~ city, data=const)
x <- predict(g, nconst)  ## the rows of x with city == "ALBANY" will be NA
g <- lm(days ~ state, data=const)
y <- predict(g, nconst)
x[is.na(x)] <- y[is.na(x)]

thanks, pete
Re: [R] can predict ignore rows with insufficient info
Peter - Your subsequent email seems just right. You have to determine ahead of time which rows can be estimated. Here's a strategy, and possibly some code to implement it. Let supported(i,y,d) be a user-written function which returns a logical vector indicating rows which should be omitted from the prediction on account of a non-covered covariate in column i of data frame d with outcome variable y. Apply this function to all columns in your data frame using lapply(). Then take the OR of all the logical vectors by calculating the row sums of the numeric (0 or 1) equivalents. Last, convert back to logical, and subscript your data frame with this in the call to predict(). Here's some rough code:

supported <- function(i, y, d) {
  result <- rep(F, dim(d)[1])  # default return value when
  if (is.factor(d[[i]]))       # d[[i]] is not a factor
    result <- d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ])
  result
}
tmp.1 <- lapply(seq(along=const), supported, "days", const)
tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

This code uses a few arcane maneuvers. Look at help pages for the relevant functions to dope out what it is doing. Particularly for lapply(), seq(), rep(), unlist(), unique(), %*%, %in%. (The last two must be quoted in order to see the help). However, the code might work for you right out of the box ! - tom blackwell - u michigan medical school - ann arbor - On Tue, 16 Sep 2003, Peter Whiting wrote: I need predict to ignore rows that contain levels not in the model. Consider a data frame, const, that has columns for the number of days required to construct a site and the city and state the site was constructed in.

g <- lm(days ~ city, data=const)

Some of the sites in const have not yet been completed, and therefore they have days == NA.
I want to predict how many days these sites will take to complete (I've simplified the above discussion to remove many of the other factors involved.)

nconst <- subset(const, is.na(const$days))
x <- predict(g, nconst)
Error in model.frame.default(object, data, xlev = xlev) :
        factor city has new level(s) ALBANY

This is because we haven't yet completed a site in Albany. If I just had one to worry about I could easily fix it (choose a nearby market with similar characteristics) but I am dealing with several hundred cities. Instead, for the cities not modeled by g I'd simply like to use the state, even though I don't expect it to be as good:

g <- lm(days ~ state, data=const)
x <- predict(g, nconst)

I'm not sure how to identify the cities in nconst that are not modeled by g (my actual model has many more predictors in the formula). Is there a way to instruct predict to only predict the rows for which it has enough information and not complain about the others?

g <- lm(days ~ city, data=const)
x <- predict(g, nconst)  ## the rows of x with city == "ALBANY" will be NA
g <- lm(days ~ state, data=const)
y <- predict(g, nconst)
x[is.na(x)] <- y[is.na(x)]

thanks, pete
Re: [R] can predict ignore rows with insufficient info
Peter - Error !! I forgot a not in the third line inside the function supported(). And, my mail editor doesn't balance parentheses, so I don't guarantee that my code is even syntactically correct. Corrected and re-named version of function:

unsupported <- function(i, y, d) {
  result <- rep(F, dim(d)[1])  # default return value when
  if (is.factor(d[[i]]))       # d[[i]] is not a factor
    result <- !(d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ]))
  result
}
tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

- tom blackwell - u michigan medical school - ann arbor -
[R] Re: Number of R users
I have been asked again about the numbers of R users. The following is part of my answer, probably of some interest to some. In preparing some notes, I'd like to give some approximate baseline estimate of how many people are using R nowadays. Of course a very interesting question. It came up on R-help in June 2000, with a small heated debate : start at --- http://www.r-project.org/nocvs/mail/r-help/2000/1493.html but first, read on. perhaps the size of the R-help list would be a decent starting point. Any chance you could give me the approximate number of users? Well, as with practical statistics, at first it's trivial, but if you start thinking it becomes quite interesting.. At the moment,
 o R-help has 2005 unique e-mail addresses subscribed
 o All R-lists have 2659 for R-* alone, i.e. w/o bioconductor
 o ALL R-lists have 3189 unique (all R-* lists + bioconductor combined, then uniqued) addresses,
But from the mailman logs, for R-help e.g., this noon,
 1023 got r-help directly
 780 got r-help as digest
which leaves about 200 (~ 10%) who seem to have mail delivery disabled for some reason {explicitly, by bouncing, delivery not disabled but not successful on first try, ..?..} Then I also guess (from the address) that some groups deliver R-help to an `internal mailing list' ((something we pretty strongly discourage, particularly since it complicates unsubscription)). Now you should probably read the R-help discussion thread from two years ago (URL above). Quite interesting. People's guesses then varied wildly, from about 10'000 to 400'000 -- based on about a third of the current number of mailing-list subscribers. The multiplication factor `f' in R_users = f * R-help_readers was conservatively estimated in the range of 10-20 (rather the latter). This would lead to a guess of about 50'000 users (with a wildly estimated [logarithmic] standard error of a factor of 2). I think most would agree that this would *not* count students who only use R during their classes.
(Please, before you comment on this, do read the June 2000 thread ..) -- Martin Maechler [EMAIL PROTECTED] http://stat.ethz.ch/~maechler/ Seminar fuer Statistik, ETH-Zentrum LEO C16, Leonhardstr. 27 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-3408 fax: ...-1228
Re: [R] can predict ignore rows with insufficient info
On Tue, Sep 16, 2003 at 04:17:59PM -0400, Thomas W Blackwell wrote: Peter - Your subsequent email seems just right. You have to determine ahead of time which rows can be estimated.

It seems that predict removes rows with insufficient information (ie, if I replace ALBANY with NA and refactor everything works) - I wonder why it doesn't exhibit the same behavior when it encounters a new level - just eliminate the row and go on... Somewhat related: I had been assuming (incorrectly) that length(x) would equal length(const$days) after x <- predict(g, const) - this isn't the case if any of the rows of const don't contain enough info for the model. Those rows are eliminated - I'd have expected them to just be NAs in the result. I'll go back and look through the documents to see if there is a straightforward way to convert:

> x
  1   3   4
1.5 1.5 1.5

to

> x
  1   2   3   4   5
1.5  NA 1.5 1.5  NA

slowly learning, pete

Here's a strategy, and possibly some code to implement it. Let supported(i,y,d) be a user-written function which returns a logical vector indicating rows which should be omitted from the prediction on account of a non-covered covariate in column i of data frame d with outcome variable y. Apply this function to all columns in your data frame using lapply(). Then take the OR of all the logical vectors by calculating the row sums of the numeric (0 or 1) equivalents. Last, convert back to logical, and subscript your data frame with this in the call to predict(). Here's some rough code:

supported <- function(i, y, d) {
  result <- rep(F, dim(d)[1])  # default return value when
  if (is.factor(d[[i]]))       # d[[i]] is not a factor
    result <- d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ])
  result
}
tmp.1 <- lapply(seq(along=const), supported, "days", const)
tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

This code uses a few arcane maneuvers.
Look at help pages for the relevant functions to dope out what it is doing. Particularly for lapply(), seq(), rep(), unlist(), unique(), %*%, %in%. (The last two must be quoted in order to see the help). However, the code might work for you right out of the box ! - tom blackwell - u michigan medical school - ann arbor - On Tue, 16 Sep 2003, Peter Whiting wrote: I need predict to ignore rows that contain levels not in the model. Consider a data frame, const, that has columns for the number of days required to construct a site and the city and state the site was constructed in.

g <- lm(days ~ city, data=const)

Some of the sites in const have not yet been completed, and therefore they have days == NA. I want to predict how many days these sites will take to complete (I've simplified the above discussion to remove many of the other factors involved.)

nconst <- subset(const, is.na(const$days))
x <- predict(g, nconst)
Error in model.frame.default(object, data, xlev = xlev) :
        factor city has new level(s) ALBANY

This is because we haven't yet completed a site in Albany. If I just had one to worry about I could easily fix it (choose a nearby market with similar characteristics) but I am dealing with several hundred cities. Instead, for the cities not modeled by g I'd simply like to use the state, even though I don't expect it to be as good:

g <- lm(days ~ state, data=const)
x <- predict(g, nconst)

I'm not sure how to identify the cities in nconst that are not modeled by g (my actual model has many more predictors in the formula). Is there a way to instruct predict to only predict the rows for which it has enough information and not complain about the others?

g <- lm(days ~ city, data=const)
x <- predict(g, nconst)  ## the rows of x with city == "ALBANY" will be NA
g <- lm(days ~ state, data=const)
y <- predict(g, nconst)
x[is.na(x)] <- y[is.na(x)]

thanks, pete
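For the conversion Pete asks about (a short vector named by surviving row numbers, back to a full-length vector with NAs), one sketch not given in the thread is to use the names of the returned vector as indices into a vector of NAs:

```r
x <- c("1" = 1.5, "3" = 1.5, "4" = 1.5)  # predictions, named by surviving row
n <- 5                                   # number of rows in the original data

full <- rep(NA, n)
full[as.integer(names(x))] <- x
names(full) <- 1:n
full
##   1   2   3   4   5
## 1.5  NA 1.5 1.5  NA
```

This relies only on predict() keeping the original row names on its result, which it does for data-frame newdata.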
Re: [R] Re: Number of R users
MM == Martin Maechler [EMAIL PROTECTED] on Tue, 16 Sep 2003 22:45:24 +0200 writes: ^ (too late in the evening !) .. MM Well, as with practical statistics, at first it's trivial, but MM if you start thinking it becomes quite interesting.. MM At the moment, MM o R-help has 2005 unique e-mail addresses subscribed MM o All R-lists have 2659 for R-* alone, i.e. w/o bioconductor MM o ALL R-lists have 3189 unique (all R-* lists + bioconductor MM combined, then uniqued) addresses, As Jeff Gentry has noted (from the size of bioconductor) this seems pretty (too!) astonishing. I have checked, and from the 530 bioconductor subscribers, 112 are on R-help as well. The bug in the above counting: I got the last number manually -- with a mistake -- where the 2659 comes from a reliable perl script. 3189 must be corrected down to 3055. (as it says Never trust a statistic, unless :-) okay, definitely getting late today..) Martin __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] can predict ignore rows with insufficient info
On Tue, 16 Sep 2003, Peter Whiting wrote: It seems that predict removes rows with insufficient information (ie, if I replace ALBANY with NA and refactor everything works) - I wonder why it doesn't exhibit the same behavior when it encounters a new level - just eliminate the row and go on... Somewhat related: I had been assuming (incorrectly) that length(x) would equal length(const$days) after x <- predict(g, const) - this isn't the case if any of the rows of const don't contain enough info for the model. Those rows are eliminated - I'd have expected them to just be NAs in the result. I'll go back and look through the documents to see if there is a straightforward way to convert:

> x
  1   3   4
1.5 1.5 1.5

to

> x
  1   2   3   4   5
1.5  NA 1.5 1.5  NA

slowly learning, pete

Before running predict(...), do options(na.action=na.exclude). This will give the equal-length behavior that you may want ... as long as you have replaced unsupported factor levels with NA. See help(na.omit) and help(options) to see what this is doing. (It won't have any effect, of course, if you subscript the newdata argument to predict() using my strategy.) And, DO use a simple strategy that you cooked up yourself, in preference to anything canned. It's much easier to maintain. - tom blackwell - u michigan medical school - ann arbor -
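The effect Tom describes can be sketched with toy data (not from the thread): under na.exclude, rows dropped during fitting are padded back into the results as NA, so lengths line up with the original data.

```r
d <- data.frame(y = c(1, 2, NA, 4), x = 1:4)

options(na.action = na.exclude)
g <- lm(y ~ x, data = d)

fitted(g)   # length 4: element 3 is NA rather than being dropped
```

Under the default na.omit, fitted(g) would instead have length 3, which is exactly the surprise Pete ran into.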
Re: [R] can predict ignore rows with insufficient info
On Tue, Sep 16, 2003 at 04:31:29PM -0400, Thomas W Blackwell wrote: Peter - Error !! I forgot a not in the third line inside the function supported(). And, my mail editor doesn't balance parentheses, so I don't guarantee that my code is even syntactically correct. Corrected and re-named version of function:

unsupported <- function(i, y, d) {
  result <- rep(F, dim(d)[1])  # default return value when
  if (is.factor(d[[i]]))       # d[[i]] is not a factor
    result <- !(d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ]))
  result
}
tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

this still suffers from the fact that the factor for city still has ALBANY in it (even though it doesn't occur in the subset). It can be fixed by creating yet another tmp variable and refactoring... Kinda painful with multiple predictors in addition to city, but it is workable.

> const
  state city days
1    s1   c1    1
2    s1   c1   NA
3    s2   c2    1
4    s2   c2    1
5    s1   c3   NA

tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
x <- predict(g, const[ is.na(const$days) & !tmp.3, ])
Error in model.frame.default(object, data, xlev = xlev) :
        factor city has new level(s) c3
tmp.4 <- subset(const, is.na(const$days) & !tmp.3)
x <- predict(g, tmp.4)
Error in model.frame.default(object, data, xlev = xlev) :
        factor city has new level(s) c3
tmp.4$city <- factor(tmp.4$city)
x <- predict(g, tmp.4)

pete
Re: [R] can predict ignore rows with insufficient info
On Tue, Sep 16, 2003 at 04:31:29PM -0400, Thomas W Blackwell wrote: Corrected and re-named version of function:

unsupported <- function(i, y, d) {
  result <- rep(F, dim(d)[1])  # default return value when
  if (is.factor(d[[i]]))       # d[[i]] is not a factor
    result <- !(d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ]))
  result
}
tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

Here is an approach I came up with that appears to work:

predict2 <- function(g, data, ...) {
  for (nm in names(g$xlevels)) {
    cat(paste(nm, "\n"))
    data[[nm]] <- factor(data[[nm]], levels=g$xlevels[[nm]])
  }
  predict(g, data, ...)
}

It bases its operation on refactoring each predictor using the factor's levels= argument. Any element having a level not in g$xlevels ends up as NA, which predict correctly handles. I'm not sure why predict doesn't do something like this by default, but I am just a newbie. pete
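The refactoring trick predict2() relies on can be seen in isolation (a minimal sketch with made-up level names): supplying levels= to factor() turns any value not in that list into NA.

```r
city  <- c("c1", "c2", "c3")   # c3 was never seen by the model
known <- c("c1", "c2")         # e.g. g$xlevels$city

factor(city, levels = known)
## [1] c1   c2   <NA>
## Levels: c1 c2
```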
[R] Quit asking me if I want to save the workspace!
How do you stop R from putting up a dialog box when you quit Rgui? (I use Windows and I never save workspaces that way) Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED] Fax 7 838 4155 Phone +64 7 838 4773 (wk), +64 7 849 6486 (home), Mobile 021 1395 862
Re: [R] Quit asking me if I want to save the workspace!
Rafael A. Irizarry wrote: you can type this: q("no") see the help file for q Still more work than two mouse clicks. -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED] Fax 7 838 4155 Phone +64 7 838 4773 (wk), +64 7 849 6486 (home), Mobile 021 1395 862
Re: [R] Quit asking me if I want to save the workspace!
you can type this: q("no") see the help file for q On Wed, 17 Sep 2003, Murray Jorgensen wrote: How do you stop R from putting up a dialog box when you quit Rgui? (I use Windows and I never save workspaces that way) Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED] Fax 7 838 4155 Phone +64 7 838 4773 (wk), +64 7 849 6486 (home), Mobile 021 1395 862
Re: [R] Quit asking me if I want to save the workspace!
Consider Q <- function(x) q("no") With R 1.7.1 under Windows, Q() caused R to close without asking for confirmation. This does not solve the whole problem, but it might provide a piece of the puzzle. hope this helps. spencer graves Rafael A. Irizarry wrote: you can type this: q("no") see the help file for q On Wed, 17 Sep 2003, Murray Jorgensen wrote: How do you stop R from putting up a dialog box when you quit Rgui? (I use Windows and I never save workspaces that way) Murray -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED] Fax 7 838 4155 Phone +64 7 838 4773 (wk), +64 7 849 6486 (home), Mobile 021 1395 862
Re: [R] Quit asking me if I want to save the workspace!
On Tuesday 16 September 2003 21:26, Murray Jorgensen wrote: Rafael A. Irizarry wrote: you can type this: q("no") see the help file for q Still more work than two mouse clicks. Start R with --no-save (not sure how/whether this will work on Windows).
[R] Date on x-axis of xyplot
xyplot doesn't seem to want to label my x-axis with dates but instead puts the day number for each date. begdate is the number of days since January 1, 1960 and was initially created by

library(date)
...
polls$begdate <- mdy.date(begmm, begdd, begyy)

I create a new dataframe (pollstack) which includes begdate. In the process begdate seems to lose its date attribute so I redo it as:

pollstack$begdate <- as.date(pollstack$begdate)

after which

attach(pollstack)
summary(pollstack)
     begdate             pct              names
 First :15Nov2002   Min.   : 0.000   Clark   : 54
 Last  :10Sep2003   1st Qu.: 2.000   Dean    : 54
                    Median : 5.000   Edwards : 54
                    Mean   : 6.991   Gephardt: 54
                    3rd Qu.:12.000   Graham  : 54
                    Max.   :29.000   Kerry   : 54
                                     (Other) :216

And all seems well. But xyplot continues to use day number on the x-axis. My plots are created by

print(xyplot(pct ~ begdate | names, pch=2, cex=.2,
      prepanel = function(x, y) prepanel.loess(x, y, span = 1),
      main="2004 Democratic Primary Race",
      xlab = "Date of Survey", ylab = "Percent Support",
      panel = function(x, y) {
        panel.grid(h=-1, v=-1)
        panel.xyplot(x, y, pch=1, col=2, cex=.7)
        panel.loess(x, y, span=.65, lwd=2, col=4)
      },
))

What am I missing? Thanks! Charles /** ** Charles H. Franklin ** Professor, Political Science ** University of Wisconsin, Madison ** 1050 Bascom Mall ** Madison, WI 53706 ** 608-263-2022 Office ** 608-265-2663 Fax ** mailto:[EMAIL PROTECTED] (best) ** mailto:[EMAIL PROTECTED] (alt) ** http://www.polisci.wisc.edu/~franklin **/
Re: [R] Quit asking me if I want to save the workspace!
On Wed, 2003-09-17 at 14:26, Murray Jorgensen wrote: Rafael A. Irizarry wrote: you can type this: q("no") see the help file for q Still more work than two mouse clicks. Two clicks! How awful! ;) Actually, it bugs me too, so my desktop shortcut (under Win XP) has this for Target. ### "C:\Program Files\R\rw1071\bin\Rgui.exe" --no-save ### (my mail client might've line-wrapped that by the time you see it. Everything between the ### marks is one line. *Include* the quotes. There is a space between Rgui.exe and --no-save) See Appendix B of An Introduction to R if you need more info. Hope that helps. Jason -- Indigo Industrial Controls Ltd. http://www.indigoindustrial.co.nz +64-(0)21-343-545
Re: [R] Quit asking me if I want to save the workspace!
In a message dated 9/16/03 7:20:08 PM Pacific Daylight Time, [EMAIL PROTECTED] writes: How do you stop R from putting up a dialog box when you quit Rgui? (I use Windows and I never save workspaces that way) On Windows-98, an easy solution that works: * Right-click the icon with which you start R. * Go to Properties -- Shortcut -- Target. * In Target add --no-save (I tried without quotation marks). * Click OK, and try. Hope it works if you are using something other than Windows-98. --Anupam.
Re: [R] Quit asking me if I want to save the workspace!
Ah! now that tells me what I want to know. I was trying to type C:\Program Files\R\rw1071\bin\Rgui.exe --no-save instead of "C:\Program Files\R\rw1071\bin\Rgui.exe" --no-save into the Target box. Silly me! Jason Turner wrote: On Wed, 2003-09-17 at 14:26, Murray Jorgensen wrote: Rafael A. Irizarry wrote: you can type this: q("no") see the help file for q Still more work than two mouse clicks. Two clicks! How awful! ;) Actually, it bugs me too, so my desktop shortcut (under Win XP) has this for Target. ### "C:\Program Files\R\rw1071\bin\Rgui.exe" --no-save ### (my mail client might've line-wrapped that by the time you see it. Everything between the ### marks is one line. *Include* the quotes. There is a space between Rgui.exe and --no-save) See Appendix B of An Introduction to R if you need more info. Hope that helps. Jason -- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: [EMAIL PROTECTED] Fax 7 838 4155 Phone +64 7 838 4773 (wk), +64 7 849 6486 (home), Mobile 021 1395 862
[R] Help with glmmML package
Dear R users, I have been using the package glmmML to fit a logistic-normal mixed model to clustered binary data. Along with parameter estimates I would also like to obtain estimates of the random effects. I have noticed that a fitted glmmML object contains a component called frail, a vector, which looks to be an estimate of the random effects. Can anyone confirm this? And if so, how are these estimates obtained from the fitted model? Are they the empirical Bayes estimates? Any reference would also be great. Thanks very much for your help. Farouk
Re: [R] Date on x-axis of xyplot
On Tuesday 16 September 2003 22:00, Charles H. Franklin wrote: xyplot doesn't seem to want to label my x-axis with dates but instead puts the day number for each date. begdate is the number of days since January 1, 1960 and was initially created by

library(date)
...
polls$begdate <- mdy.date(begmm, begdd, begyy)

I create a new dataframe (pollstack) which includes begdate. In the process begdate seems to lose its date attribute so I redo it as:

pollstack$begdate <- as.date(pollstack$begdate)

after which

attach(pollstack)
summary(pollstack)
     begdate             pct              names
 First :15Nov2002   Min.   : 0.000   Clark   : 54
 Last  :10Sep2003   1st Qu.: 2.000   Dean    : 54
                    Median : 5.000   Edwards : 54
                    Mean   : 6.991   Gephardt: 54
                    3rd Qu.:12.000   Graham  : 54
                    Max.   :29.000   Kerry   : 54
                                     (Other) :216

And all seems well. But xyplot continues to use day number on the x-axis. My plots are created by

print(xyplot(pct ~ begdate | names, pch=2, cex=.2,
      prepanel = function(x, y) prepanel.loess(x, y, span = 1),
      main="2004 Democratic Primary Race",
      xlab = "Date of Survey", ylab = "Percent Support",
      panel = function(x, y) {
        panel.grid(h=-1, v=-1)
        panel.xyplot(x, y, pch=1, col=2, cex=.7)
        panel.loess(x, y, span=.65, lwd=2, col=4)
      },
))

What am I missing?

The fact that xyplot doesn't know anything about the date class. I'm not familiar with the date package, but the docs and a few experiments seem to indicate that an object of class date is simply a numeric/integer vector with the class attribute set to "date". xyplot interprets it as plain numeric data. You may be able to get what you want by

print(xyplot(pct ~ factor(as.character(begdate)) | names, pch=2, cex=.2,
      prepanel = function(x, y) prepanel.loess(x, y, span = 1),
      ...

(but this will try to label all unique dates, which may not be good). Is the date class standard enough to warrant including a check for it in lattice ? Deepayan
Re: [R] Date on x-axis of xyplot
On Wed, 2003-09-17 at 16:31, Deepayan Sarkar wrote: ... Is the date class standard enough to warrant including a check for it in lattice ? I've never used it myself, but the lack of POSIXct support in the lattice graphics axes has often caused me to think up new ways around the plot. Unless I'm missing an obvious way to apply that... Cheers Jason -- Indigo Industrial Controls Ltd. http://www.indigoindustrial.co.nz +64-(0)21-343-545
RE: [R] Date on x-axis of xyplot
On Tuesday 16 September 2003 22:00, Charles H. Franklin wrote: xyplot doesn't seem to want to label my x-axis with dates but instead puts the day number for each date. ... What am I missing? Deepayan Sarkar replies: The fact that xyplot doesn't know anything about the date class. I'm not familiar with the date package, but the docs and a few experiments seem to indicate that an object of class date is simply a numeric/integer vector with the class attribute set to "date". xyplot interprets it as plain numeric data. You may be able to get what you want by

print(xyplot(pct ~ factor(as.character(begdate)) | names, pch=2, cex=.2,
      prepanel = function(x, y) prepanel.loess(x, y, span = 1),
      ...

(but this will try to label all unique dates, which may not be good). Is the date class standard enough to warrant including a check for it in lattice ? Deepayan

OK. I was afraid of that. I'm not sure how standard the date class is, or whether there is a better alternative. But I DO think that being able to label dates in some way on the x-axis is a common enough problem to be worth solving. This is especially an issue when data are irregularly spaced, so time series plots are not appropriate. As it stands, my graphs are labeled 15650 to 15950, which is surely not intuitive to anyone! Many thanks to Deepayan for the Lattice package and his support of it. It is so great that I just want one more thing...! Charles
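One workaround along the lines Charles wants, not given in the thread, is to compute tick positions from the underlying day counts and hand xyplot pre-formatted labels through the scales argument. This is a sketch; it assumes the date package's date.ddmmmyy() formatter and that begdate is a "date" object (days since 1 Jan 1960 underneath):

```r
library(lattice)
library(date)

## choose a few tick positions on the numeric (day-count) scale
ticks <- pretty(range(as.numeric(pollstack$begdate)))

print(xyplot(pct ~ begdate | names, data = pollstack,
             scales = list(x = list(at = ticks,
                                    labels = date.ddmmmyy(as.date(ticks))))))
```

This keeps the x variable numeric (so panel functions like panel.loess still work) and only changes how the axis is labeled.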
Re: [R] Date on x-axis of xyplot
On Tuesday 16 September 2003 23:51, Jason Turner wrote: On Wed, 2003-09-17 at 16:31, Deepayan Sarkar wrote: ... Is the date class standard enough to warrant including a check for it in lattice ? I've never used it myself, but the lack of POSIXct support in the lattice graphics axes has often caused me to think up new ways around the plot. Unless I'm missing an obvious way to apply that... Actually, lattice has supported POSIXct for some time now, although the quality of that support (in terms of control over tick locations and labels) is not very good. But it might not be too difficult to add support for date objects as well. Deepayan