Re: [R] Reproducible research
FYI: If you use LaTeX, you can use a tool that combines R and LaTeX.

-- View this message in context: http://r.789695.n4.nabble.com/Reproducible-research-tp2532353p2532361.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
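The "something between R and LaTeX" presumably refers to Sweave, the literate-programming tool that ships with R: R code chunks embedded in a LaTeX document are executed and replaced by their results. A minimal sketch of an .Rnw file (contents illustrative):

```latex
\documentclass{article}
\begin{document}
Summary of fuel efficiency in the built-in mtcars data:
<<mpg-summary, echo=TRUE>>=
summary(mtcars$mpg)
@
\end{document}
```

Running Sweave("doc.Rnw") from R then produces doc.tex, which can be compiled with LaTeX as usual.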
Re: [R] lattice: layout and number of pages
On Wed, Sep 8, 2010 at 5:49 AM, Philipp Pagel p.pa...@wzw.tum.de wrote:

Dear expeRts,

?xyplot says: "In general, giving a high value of 'layout[3]' is not wasteful because blank pages are never created." But the following example does generate blank pages - well, except for the ylab:

data(barley)
require(lattice)
stripplot(yield ~ year | site, barley, layout = c(2, 1, 5))

Did I misinterpret the sentence from the help page, or is this a bug?

The statement used to be true at some point. Unfortunately, it no longer seems possible to (easily) determine with 100% accuracy whether a page will be blank. I will remove that sentence from the documentation, but add a warning when lattice detects likely blank pages.

-Deepayan

Yes - I know that this works fine:

stripplot(yield ~ year | site, barley, layout = c(2, 1))

Just curious...

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
[R] plot symbol +, but with variable bar lengths
Hi,

does anybody know of a plotting function or an easy way to generate "+" symbols with individually settable bar lengths? I tried combining "|" and "-" as pch and setting the size via cex, but that doesn't really work, since the two symbols have different default lengths. Is there a horizontal "|" or a longer "-" available?

Thanks,
Rainer
Re: [R] How to project a vector on a subspace?
Hi Peng, Gabor, Peter,

Thank you very much for replying to me so soon. I will try it right now!
Re: [R] Reproducible research
Hello David,

You could also have a look at the ascii package: http://eusebe.github.com/ascii/

With asciidoc (http://www.methods.co.nz/asciidoc/), or one of the other supported markup languages (reStructuredText, txt2tags, or textile), you can obtain good results. For example, the vignettes of the book Analysis and Interpretation of Freshwater Fisheries Data (http://www.ncfaculty.net/dogle/fishR/bookex/AIFFD/AIFFD.html) are made with asciidoc and the ascii package.

If you are an emacs user, you might also be interested in org-mode and org-babel: http://orgmode.org/worg/org-contrib/babel/

Best,
David

2010/9/9 David Scott d.sc...@auckland.ac.nz:

I am investigating some approaches to reproducible research. I need in the end to produce .html or .doc or .docx. I have used hwriter in the past but have had some problems with verbatim output from R. Tables are also not particularly convenient. I am interested in R2HTML and R2wd in particular, and possibly odfWeave. Does anyone have sample documents using any of these approaches which they could let me have?

David Scott

_
David Scott
Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018
Director of Consulting, Department of Statistics
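For completeness, a minimal sketch of the ascii package in use (assuming its documented asciiType option and default print method; the model is illustrative):

```r
library(ascii)
options(asciiType = "asciidoc")          # target markup: asciidoc
fit <- lm(dist ~ speed, data = cars)
ascii(summary(fit)$coefficients)         # coefficient table as asciidoc markup
```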
[R] See what is inside a matrix
Hello everyone,

Is there any graphical tool to help me see what is inside a matrix? I have a 100x100 matrix and, since it does not fit on my screen, R splits it into pieces when printing. I would like to thank you in advance for your help.

Best Regards,
Alex
Re: [R] two questions
This sounds interesting, thank you. I'll have a look.

Jason

Dr. Iasonas Lamprianou
Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus
P.O. Box 22006
1516 Nicosia
Cyprus
Tel.: +357-22-713178
Fax: +357-22-590539

Honorary Research Fellow
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044 161 275 3485
iasonas.lampria...@manchester.ac.uk

--- On Wed, 8/9/10, Greg Snow greg.s...@imail.org wrote:

From: Greg Snow greg.s...@imail.org
Subject: RE: [R] two questions
To: Iasonas Lamprianou lampria...@yahoo.com, juan xiong xiongjuan2...@gmail.com, Dennis Murphy djmu...@gmail.com
Cc: r-help@r-project.org
Date: Wednesday, 8 September, 2010, 17:41

Have you considered doing a permutation test on the interaction? Here is an article that gives the general procedure for a couple of algorithms and a comparison of how well they do:

Anderson, Marti J. and Legendre, Pierre; An Empirical Comparison of Permutation Methods for Tests of Partial Regression Coefficients in a Linear Model. J. Statist. Comput. Simul., 1999, vol. 62, pp. 271-303.

--
Gregory (Greg) L. Snow, Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Iasonas Lamprianou
Sent: Tuesday, September 07, 2010 12:25 AM
To: juan xiong; Dennis Murphy
Cc: r-help@r-project.org
Subject: Re: [R] two questions

By the way, ordinal regression would require huge datasets because my dependent variable has around 20 different responses... but again, one might say that with so many ordinal responses, it is as if we have a linear/interval variable, right? I just hoped that there would be a two-way Kruskal-Wallis or something like that.
On the other hand, what is going to happen if I (1) bootstrap data from all cells of my design and average the rank ordering of the data of every cell, and then (2) do the same but using data from a uniform/normal distribution, so that I assume that there is no difference between the cells? From point (1) I will find the statistical value and from point (2) the expectation, and then with a third step (3) I can run a chi-square on the observed/expected values. Would this be reasonable? But again, how can I distinguish between main and interaction effects?

Dr. Iasonas Lamprianou
Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus

--- On Tue, 7/9/10, Dennis Murphy djmu...@gmail.com wrote:

From: Dennis Murphy djmu...@gmail.com
Subject: Re: [R] two questions
To: juan xiong xiongjuan2...@gmail.com
Cc: David Winsemius dwinsem...@comcast.net, r-help@r-project.org, Iasonas Lamprianou lampria...@yahoo.com
Date: Tuesday, 7 September, 2010, 4:47

Hi:

On Mon, Sep 6, 2010 at 5:26 PM, juan xiong xiongjuan2...@gmail.com wrote: Maybe Friedman test

The Friedman test corresponds to randomized complete block designs, not general two-way classifications. David's advice is sound, but also investigate proportional odds models (e.g., lrm in Prof. Harrell's rms package) in case the 'usual' approach comes up short. It would be helpful to know the number of response categories and some idea of the number of cities-of-birth under study, though...

HTH,
Dennis

On Mon, Sep 6, 2010 at 4:47 PM, David Winsemius dwinsem...@comcast.net wrote:

The usual least-squares methods are fairly robust to departures from normality.
Furthermore, it is the residuals that are assumed to be normally distributed (not the marginal distributions that you are probably looking at), so it does not sound as though you have yet examined the data properly. Tell us what the descriptive stats (say the means, variance, 10th and 90th percentiles) are on the residuals within cells cross-classified by the gender and city-of-birth variables.

On Sep 6, 2010, at 4:34 PM, Iasonas Lamprianou wrote:

Dear friends, two questions: (1) does anyone know if there are any non-parametric equivalents of the two-way ANOVA in R? I have an ordinal, non-normally distributed dependent variable and two factors (gender and city of birth). Normally, one would try a two-way ANOVA,
Re: [R] plot symbol +, but with variable bar lengths
On 09-Sep-10 06:41:34, Rainer Machne wrote:

> does anybody know of a plotting function or an easy way to generate
> "+" symbols with individually settable bar lengths? [...]

I tried this using pch="_" for the horizontal bar, but it is only about half the length of the pch="|" bar. However, compared with the same plot using pch="+", at least the resulting cross went through the centre of the "+" cross. To increase the length of the "_" to equal that of the "|" would require some empirical fiddling with 'cex=...', and would increase the thickness of the "_".

I don't know of any way to create a symbol using drawing commands and assign the result to a character which could be invoked using 'pch=...', which would seem to be the sort of thing you would like to be able to do. This could be a useful extension to the plot() function and friends.

You can of course define an auxiliary function, say mycross(), on the lines of

mycross <- function(x, y, L, U, R, D) {
  lines(c(x, x - L), c(y, y))  # left arm
  lines(c(x, x), c(y, y + U))  # upper arm
  lines(c(x, x + R), c(y, y))  # right arm
  lines(c(x, x), c(y, y - D))  # lower arm
}

but then you would have to explicitly apply this to the data, rather than delegate it to the plot() function's pch option.

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 09-Sep-10 Time: 08:36:36
-- XFMail --
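Since pch glyphs cannot be resized arm-by-arm, an alternative to mycross() worth noting (not from the thread, just a sketch with illustrative data) is segments(), which is vectorized, so every point gets its own bar lengths in one call:

```r
# Illustrative data: each point gets its own horizontal/vertical half-length
set.seed(1)
x  <- runif(10, 1, 9);  y <- runif(10, 1, 9)
hx <- runif(10, 0.2, 0.8)   # horizontal half-lengths
hy <- runif(10, 0.2, 0.8)   # vertical half-lengths
plot(x, y, type = "n", xlim = c(0, 10), ylim = c(0, 10), xlab = "X", ylab = "Y")
segments(x - hx, y, x + hx, y)   # horizontal bars
segments(x, y - hy, x, y + hy)   # vertical bars
```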
[R] Help request: highlighting R code on WordPress.com blogs
Hello dear R-help members (and also Yihui and Romain),

There are currently 28 R bloggers (out of the 117 R-bloggers I know of: http://www.r-bloggers.com/) that are using wordpress.com for publishing their R code (and I suspect this number will increase with time). WordPress.com doesn't support R syntax highlighting, nor can it be embedded from other services (like gist: http://gettinggeneticsdone.blogspot.com/2010/09/embed-rstats-code-with-syntax.html).

After contacting the WordPress.com VIP manager, he informed me that they will add R support if a relevant "brush" is created according to this document: http://alexgorbatchev.com/SyntaxHighlighter/manual/brushes/custom.html, since this is what they use on wordpress.com (see: http://en.support.wordpress.com/code/posting-source-code/).

Creating this brush is beyond my ability at this point, so I am writing to ask if any of you can, and wishes to, make this brush for the community. Something I thought might be relevant is the code Yihui Xie recently wrote for creating an NppToR code brush (http://yihui.name/en/2010/08/auto-completion-in-notepad-for-r-script/ and http://yihui.name/en/wp-content/uploads/2010/08/Npp_R_Auto_Completion.r).

If such a brush is created, I'll push to have it included in wordpress.com and try to inform the current R bloggers using it.

Best,
Tal

Contact Details:
---
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
[R] Emacs function argument hints
Hi,

I've recently started using Emacs as my text editor for writing R scripts. I am looking for a feature which I have seen in the standard R text editor for Mac OS. In the Mac OS editor, when you start typing a function, the possible arguments for that function appear at the bottom of the window. E.g. if you type "table(", before you finish typing you can see at the bottom of the window:

table(..., exclude = if (useNA == "no") c(NA, NaN), useNA = c("no", "ifany", "always"), dnn = list.names(...), deparse.level = 1)

I think this feature may be called "function argument hints", but I'm not sure, and searching the archive with that term has not produced anything useful. Is this feature available in Emacs or any other Windows text editor for R?

Thanks very much,
Tim

(Using Windows XP, R 2.11.1, GNU Emacs 23.2.1)
Re: [R] Reproducible research
On 09/09/10 07:47, David Scott wrote:

> I am investigating some approaches to reproducible research. I need in
> the end to produce .html or .doc or .docx. [...] Does anyone have sample
> documents using any of these approaches which they could let me have?

Hi David

I am using emacs + org-mode (http://orgmode.org/) for exactly this (see http://orgmode.org/worg/org-contrib/babel/intro.php#reproducable-research for a reproducible-research example, and http://orgmode.org/worg/org-contrib/babel/languages/org-babel-doc-R.php about R in emacs + org-mode + ESS). It is literate programming at its best.

Concerning reproducible research and report generation, org-babel has one HUGE advantage: you can combine different programming languages easily in the report. So, e.g., you can do your analysis in R, some data preparation in python, and some final file manipulations in bash - and everything is in one file and reproducible (see http://orgmode.org/worg/org-contrib/babel/intro.php#meta-programming-language and http://orgmode.org/worg/org-contrib/babel/examples/data-collection-analysis.php).

I think that would be the best tool for the job (see http://orgmode.org/worg/org-contrib/babel/uses.php for examples of what it can be used for - some will be relevant for your intended application). Although emacs has a steep learning curve, it is definitely worth it (and it works on Linux, Windows and Mac) - and the mailing list is also really good.
Cheers,
Rainer

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Natural Sciences Building, Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch, South Africa
Tel: +33 - (0)9 53 10 27 44
Cell: +27 - (0)8 39 47 90 42
Fax (SA): +27 - (0)8 65 16 27 82
Fax (D): +49 - (0)3 21 21 25 22 44
Fax (FR): +33 - (0)9 58 10 27 44
email: rai...@krugs.de
Skype: RMkrug
Re: [R] optimized value worse than starting Value
On Wed, Sep 8, 2010 at 6:26 PM, Michael Bernsteiner dethl...@hotmail.com wrote:

> @Barry: Yes, it is the Rosenbrock function. I'm trying out something I
> found here: http://math.fullerton.edu/mathews/n2003/PowellMethodMod.html
> @Ravi: Thanks for your help. I will have a closer look at the BB package.
> Am I right that the optimx package is offline at the moment? (Windows)

It looks like the Windows build of optimx failed the R CMD check when running the examples: http://www.r-project.org/nosvn/R.check/r-devel-windows-ix86+x86_64/optimx-00check.html

Barry
Re: [R] plot symbol +, but with variable bar lengths
Hi,

The TeachingDemos package has a my.symbols() function that you could use with your own glyph.

HTH,
baptiste

On Sep 9, 2010, at 9:36 AM, (Ted Harding) wrote:

> I tried this using pch="_" for the horizontal bar, but it is only about
> half the length of the pch="|" bar. [...]
Re: [R] Reproducible research
Dear David,

I have tried odfWeave and I find it quite useful for the purpose. I would recommend you give it a try. It comes with simple example files along with the installation. You might have some difficulty in getting the zip files and path configurations set, which is a pre-requisite, but I am sure it is worth the effort. Best of luck.

Regards,
Vijayan Padmanabhan
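For reference, odfWeave follows the Sweave model: Sweave-style chunks embedded in an .odt document are executed and replaced with their output. A minimal sketch (file names are illustrative):

```r
library(odfWeave)
# 'report_in.odt' contains Sweave-style chunks such as:
#   <<cars-summary, echo = TRUE>>=
#   summary(cars)
#   @
odfWeave("report_in.odt", "report_out.odt")   # source document, processed result
```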
[R] Maxdiff Analysis in R
Dear Group,

Does anybody have example data and an R script for the analysis of a MaxDiff study in R?

Thanks
Regards
Vijayan Padmanabhan

"What is expressed without proof can be denied without proof" - Euclid.

Can you avoid printing this? Think of the environment before printing the email.

---
Please visit us at www.itcportal.com
**
This Communication is for the exclusive use of the intended recipient(s) and shall not attach any liability on the originator or ITC Ltd./its Subsidiaries/its Group Companies. If you are the addressee, the contents of this email are intended for your use only and it shall not be forwarded to any third party, without first obtaining written authorisation from the originator or ITC Ltd./its Subsidiaries/its Group Companies. It may contain information which is confidential and legally privileged and the same shall not be used or dealt with by any third party in any manner whatsoever without the specific consent of ITC Ltd./its Subsidiaries/its Group Companies.
Re: [R] Correlation question
Did you try taking out P7, which is text?

Moreover, if you get a message saying 'the standard deviation is zero', it means that an entire column is constant. By definition, the covariance of a constant with a random variable is 0, but the correlation then involves dividing by a zero standard deviation, so cor() understandably throws a warning that one or more of your columns are constant. Applying the following to your data (which I named expd instead), we get:

> sapply(expd[, -12], var)
      P1       P2       P3       P4       P5       P6
5.43e-01 1.08e+00 5.77e-01 1.08e+00 6.43e-01 5.57e-01
      P8       P9      P10      P11      P12     SITE
5.73e-01 3.19e+00 5.07e-01 2.50e-01 5.50e+00 2.49e+00
      Errors     warnings       Manual        Total        H_tot        HP1.1
9.072840e+03 2.081334e+04 7.430000e-01 3.823500e+04 3.880250e+03 2.676667e+00
       HP1.2        HP1.3        HP1.4       HP_tot        HO1.1        HO1.2
0.000000e+00 2.008440e+03 3.057067e+02 3.827250e+03 8.400000e-01 0.000000e+00
   HO1.3    HO1.4   HO_tot    HU1.1    HU1.2    HU1.3
0.00e+00 0.00e+00 8.40e-01 0.00e+00 2.10e-01 2.27e-01
      HU_tot           HR        L_tot        LP1.1        LP1.2        LP1.3
6.230000e-01 7.430000e-01 3.754610e+03 3.209333e+01 0.000000e+00 2.065010e+03
       LP1.4       LP_tot        LO1.1        LO1.2        LO1.3        LO1.4
2.246233e+02 3.590040e+03 3.684000e+01 0.000000e+00 0.000000e+00 2.840000e+00
      LO_tot        LU1.1        LU1.2        LU1.3       LU_tot       LR_tot
6.000000e+01 0.000000e+00 1.440000e+00 3.626667e+00 8.370000e+00 4.940000e+00
      SP_tot        SP1.1        SP1.2        SP1.3        SP1.4     SP_tot.1
6.911067e+02 4.225000e+01 0.000000e+00 1.009600e+02 4.161600e+02 3.071600e+02
   SO1.1    SO1.2    SO1.3    SO1.4   SO_tot    SU1.1
4.54e+00 2.50e-01 0.00e+00 2.10e-01 5.25e+00 0.00e+00
       SU1.2        SU1.3       SU_tot           SR
1.556667e+00 4.225000e+01 3.504000e+01 4.225000e+01

Which columns are constant?

> which(sapply(expd[, -12], var) < .Machine$double.eps)
HP1.2 HO1.2 HO1.3 HO1.4 HU1.1 LP1.2 LO1.2 LO1.3 LU1.1 SP1.2 SO1.3 SU1.1
   19    24    25    26    28    35    40    41    44    51    57    60

I suspect that in your real data set there aren't so many constant columns, but this is one way to check.

HTH,
Dennis

On Wed, Sep 8, 2010 at 12:35 PM, Stephane Vaucher vauch...@iro.umontreal.ca wrote:

Hi everyone,

I'm observing what I believe is weird behaviour when attempting to do something very simple.
I want a correlation matrix, but my matrix seems to contain correlation values that are not found when executed on pairs:

> test2$P2
 [1] 2 2 4 4 1 3 2 4 3 3 2 3 4 1 2 2 4 3 4 1 2 3 2 1 3
> test2$HP_tot
 [1]  10  10  10  10  10  10  10  10 136 136 136 136 136 136 136 136 136 136  15
[20]  15  15  15  15  15  15
> c = cor(test2$P3, test2$HP_tot, method = 'spearman')
> c
[1] -0.2182876
> c = cor(test2, method = 'spearman')
Warning message:
In cor(test2, method = "spearman") : the standard deviation is zero
> write(c, file = 'out.csv')

from my spreadsheet: -0.25028783918741

Most cells are correct, but not that one. If this is expected behaviour, I apologise for bothering you; I read the documentation, but I do not know if the calculation of matrices and pairs is done using the same function (e.g., with respect to equal-value observations). If this is not a desired behaviour, I noticed that it only occurs with a relatively large matrix (I couldn't reproduce it on a simple 2-column data set). There might be a naming error.

> names(test2)
 [1] ID                   NOMBRE               MAIL
 [4] Age                  SEXO                 Studies
 [7] Hours_Internet       Vision.Disabilities  Other.disabilities
[10] Technology_Knowledge Start_Time           End_Time
[13] Duration             P1                   P1Book
[16] P1DVD                P2                   P3
[19] P4                   P5                   P6
[22] P8                   P9                   P10
[25] P11                  P12                  P7
[28] SITE                 Errors               warnings
[31] Manual               Total                H_tot
[34] HP1.1                HP1.2                HP1.3
[37] HP1.4                HP_tot               HO1.1
[40] HO1.2                HO1.3                HO1.4
[43] HO_tot               HU1.1                HU1.2
[46] HU1.3                HU_tot               HR
[49] L_tot                LP1.1                LP1.2
[52] LP1.3                LP1.4                LP_tot
[55] LO1.1                LO1.2                LO1.3
[58] LO1.4                LO_tot               LU1.1
[61] LU1.2                LU1.3                LU_tot
[64] LR_tot               SP_tot               SP1.1
[67] SP1.2                SP1.3                SP1.4
[70] SP_tot.1             SO1.1                SO1.2
[73] SO1.3                SO1.4
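Dennis's diagnosis can be reproduced on a toy data set: a constant column has zero standard deviation, so its Spearman correlation with anything is NA, and cor() on the whole data frame warns, while a single pairwise call on non-constant columns is unaffected. A minimal sketch (hypothetical data):

```r
d <- data.frame(a = c(1, 2, 3, 4), b = c(2, 1, 4, 3), k = rep(5, 4))
cor(d$a, d$b, method = "spearman")           # pairwise call: a valid value
m <- cor(d, method = "spearman")             # warns: the standard deviation is zero
m["a", "k"]                                  # NA, because 'k' is constant
which(sapply(d, var) < .Machine$double.eps)  # flags the constant column
```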
[R] advise on operations speed with Rcpp,Boost::ipc Shared Memory
Hi,

I have an implementation where I transfer data records via shared memory to an R program. If anyone has prior experience, I'd like to find out which would be faster:

1) storing data records in shared memory as they are (in a matrix) and then using Rcpp::wrap to convert them to R datatypes, or
2) merging the records into a string, storing the records as strings, and then using R functions like strsplit, lapply, etc. to convert them back to their original matrix form.

Any help is appreciated.
[R] Help with HB analysis in R for a conjoint study Data
Dear Group,

I was referring to a conjoint analysis scenario using R from the paper referred to below:

Agricultural Information Research 17(2), 2008, 86-94, available online at www.jstage.jst.go.jp/

This paper describes the data modelling of a conjoint study design based on the conditional logit procedure. I understand that Hierarchical Bayes (HB) is asymptotically equivalent to conditional logit. However, it would be of interest if somebody is willing to share a script to fit this data using HB in R. (I understand that the bayesm package supports HB, but I am not able to figure out exactly how to model this example data and interpret it.)

Thanks in advance.

Regards
Vijayan Padmanabhan
Re: [R] Feature selection via glmnet package (LASSO)
Hi:

When you need to search for a function in R, rely on our good friend, the sos package:

library(sos)
findFn('elastic net')
found 23 matches; retrieving 2 pages

HTH,
Dennis

On Wed, Sep 8, 2010 at 6:58 PM, jjenkner jjenk...@web.de wrote:

Hello Lai!

You can try the elastic net, which is a mixture of lasso and ridge regression. Setting the parameter alpha to less than one will provide you with more coefficients different from zero. I am not sure about the R implementation; you will have to search for it on your own.

Johannes
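The R implementation Johannes mentions is in the glmnet package itself: its alpha argument mixes the ridge (alpha = 0) and lasso (alpha = 1) penalties. A minimal sketch with simulated data (all variable names are illustrative):

```r
library(glmnet)
set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100)   # 20 candidate predictors
y <- x[, 1] - 2 * x[, 2] + rnorm(100)
fit <- glmnet(x, y, alpha = 0.5)           # elastic net penalty
cv  <- cv.glmnet(x, y, alpha = 0.5)        # choose lambda by cross-validation
coef(cv, s = "lambda.min")                 # typically retains more coefficients than alpha = 1
```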
Re: [R] Calculating with tolerances
On Thu, 2010-09-09 at 09:16 +0430, Jan private wrote:

> Dear list,
> I am from an engineering background, accustomed to working with
> tolerances. For example, I have measured
>   Q = 0.15 +- 0.01 m^3/s
>   H = 10 +- 0.1 m
> and now I want to calculate
>   P = 5 * Q * H
> and get a value with a tolerance +-. What is the elegant way of
> doing this in R?
> Thank you, Jan

Hi Jan,

If I understood your problem, this script solves it:

q <- 0.15 + c(-.01, 0, .01)
h <- 10 + c(-.1, 0, .1)
5 * q * h
[1] 6.93 7.50 8.08

--
Bernardo Rangel Tura, M.D, MPH, Ph.D
National Institute of Cardiology
Brazil
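Interval arithmetic of this kind gives worst-case bounds. If the tolerances are instead treated as independent uncertainties, standard first-order error propagation (a textbook engineering formula, not from the thread) adds the relative errors in quadrature:

```r
Q <- 0.15; dQ <- 0.01        # measured value and tolerance
H <- 10;   dH <- 0.1
P  <- 5 * Q * H
dP <- P * sqrt((dQ / Q)^2 + (dH / H)^2)   # relative errors add in quadrature
c(P = P, tolerance = dP)
```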
Re: [R] 'par mfrow' and not filling horizontally
Hi! I think you've got already all useful solutions, but I usually just change mfrow to c(2,2). There is then free space left, but I usually edit my graphs in Illustrator anyway. Ivan Le 9/8/2010 21:01, (Ted Harding) a écrit : Greetings, Folks. I'd appreciate being shown the way out of this one! I've been round the documentation in ever-drecreasing circles, and along other paths, without stumbling on the answer. The background to the question can be exemplified by the example (no graphics window open to start with): set.seed(54321) X0- rnorm(50) ; Y0- rnorm(50) par(mfrow=c(2,1),mfg=c(1,1),cex=0.5) plot(X0,Y0,pch=+,col=blue,xlim=c(-3,3),ylim=c(-3,3), xlab=X,ylab=Y,main=My Plot,asp=1) par(mfg=c(2,1)) plot(X0,Y0,pch=+,col=blue,xlim=c(-3,3),ylim=c(-3,3), xlab=X,ylab=Y,main=My Plot,asp=1) As you will see, both plots have been extended laterally to fill the plotting area horizontally, hence extend from approx X = -8 to approx X = +8 (on my X11 display), despite the xlim=c(-3,3); however, the ylim=c(-3,3) has been respected, as has asp=1. What I would like to see, independently of the shape of the graphics window, is a pair of square plots, each with X and Y ranging from -3 to 3, even if this leaves empty space in the graphics window on either side. Hints? With thanks, Ted. E-Mail: (Ted Harding)ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 08-Sep-10 Time: 20:01:19 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. 
Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
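Following up on Ted's question: base graphics has a parameter for exactly this. Setting par(pty = "s") requests a square plotting region regardless of the device shape, so the xlim is no longer stretched to fill the width. A minimal sketch (not tested against Ted's exact X11 setup):

```r
set.seed(54321)
X0 <- rnorm(50); Y0 <- rnorm(50)
# pty = "s" forces a square plotting region; empty space is left at the sides
par(mfrow = c(2, 1), pty = "s", cex = 0.5)
for (i in 1:2)
  plot(X0, Y0, pch = "+", col = "blue", xlim = c(-3, 3), ylim = c(-3, 3),
       xlab = "X", ylab = "Y", main = "My Plot", asp = 1)
```

With pty = "s" the square region is honoured even when the device is wide.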
Re: [R] See what is inside a matrix
Hi: One possibility is a heatmap, although there are other approaches.

x <- matrix(sample(1:100, 10000, replace = TRUE), nrow = 100)
image(x)
xx <- apply(x, 1, sort)   # sorts the rows of x (note the result is transposed)
image(xx)

# ggplot2 version:
library(ggplot2)
ggplot(melt(x), aes(x = X1, y = X2, fill = value)) + geom_tile() +
  scale_fill_gradientn(colour = terrain.colors(10))

See the online help page http://had.co.nz/ggplot2/scale_gradientn.html for several examples of choosing color ranges in scale_fill_gradientn(). To get similar control over image, change the col = argument according to the description on the help page of image - ?image. Another alternative is an enhanced heatmap function in package gplots. I'll leave that to you to investigate... HTH, Dennis

On Thu, Sep 9, 2010 at 12:22 AM, Alaios ala...@yahoo.com wrote: Hello everyone. Is there any graphical tool to help me see what is inside a matrix? I have a 100x100 matrix and, as you already know, since it does not fit on my screen R splits it into pieces. I would like to thank you in advance for your help. Best Regards, Alex
Re: [R] problem with outer
thank you for your answers, but my problem is that I want to plot the function guete for the variables p_11 and p_12 between zero and one. That means I also want to plot p_11=0.7 and p_12=0.3, but with a=0.4 and b=0.6 and p_11=seq(0,a,0.05*a) and p_12=seq(0,b,0.05*b) I cannot do that. I hope you have another idea. tuggi -- View this message in context: http://r.789695.n4.nabble.com/problem-with-outer-tp2532074p2532550.html Sent from the R help mailing list archive at Nabble.com.
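If the goal is simply to evaluate and plot over the full unit square, the sequences can run from 0 to 1 directly and outer() fills the grid. A sketch with a hypothetical stand-in for guete (substitute the real power function; it must be vectorized for outer() to work):

```r
guete <- function(p_11, p_12) p_11 * (1 - p_12)  # placeholder, NOT the real guete

p_11 <- seq(0, 1, by = 0.05)
p_12 <- seq(0, 1, by = 0.05)
z <- outer(p_11, p_12, guete)        # 21 x 21 grid covering (0,1) x (0,1)

# the point tuggi asked about is on the grid (avoid == on seq() values,
# which are subject to floating-point rounding):
z[which.min(abs(p_11 - 0.7)), which.min(abs(p_12 - 0.3))]
persp(p_11, p_12, z, theta = 30, phi = 30, xlab = "p_11", ylab = "p_12")
```

A finer grid is just a smaller `by`; the function itself no longer depends on a and b for its domain.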
[R] Strange output daply with empty strata
Dear list, I get some strange results with daply from the plyr package. In the example below, the average age per municipality for employed and unemployed is calculated. If I do this using tapply (see code below) I get the following result:

        no      yes
A       NA 36.94931
B 51.22505 34.24887
C 48.05759 51.00198

If I do this using daply:

municipality       no      yes
A            36.94931 48.05759
B            51.22505 51.00198
C            34.24887       NA

daply generates the same numbers. However, these are not in the correct cells. For example, in municipality A everybody is employed. Therefore, the NA should be in the cell for unemployed in municipality A. Am I using daply incorrectly or is there indeed something wrong with the output of daply? Regards, Jan

I am using version 1.1 of the plyr package.

# Generate some test data
data.test <- data.frame(
  municipality = rep(LETTERS[1:3], each = 10),
  employed = sample(c("yes", "no"), 30, replace = TRUE),
  age = runif(30, 20, 70))
# Make sure everybody is employed in municipality A
data.test$employed[data.test$municipality == "A"] <- "yes"
# Compare the output of tapply:
tapply(data.test$age, list(data.test$municipality, data.test$employed), mean)
# to that of daply:
daply(data.test, .(municipality, employed), function(d){mean(d$age)})
# results of ddply are the same as tapply
ddply(data.test, .(municipality, employed), function(d){mean(d$age)})
[R] markov model
Dear all, I would like some help with writing the likelihood function for the continuous-time Markov model. Even though it can be calculated with the msm package, I need to know how it is calculated. Thank you, Luis
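As far as I understand what msm maximizes: for panel data the likelihood is the product, over consecutive observations, of the transition probability P(t)[from, to], where P(t) = exp(Q t) and Q is the intensity (generator) matrix. For two states P(t) has a closed form, which makes the structure easy to see in a base-R sketch (intensities and data below are made up; with more states you would replace ctmc_P by a matrix exponential):

```r
# Generator Q = [[-a, a], [b, -b]] with a = q12, b = q21.
# For two states, P(t) = exp(Q t) has a closed form (s = a + b):
ctmc_P <- function(a, b, t) {
  s <- a + b
  e <- exp(-s * t)
  matrix(c(b/s + a/s * e, a/s * (1 - e),
           b/s * (1 - e), a/s + b/s * e), 2, 2, byrow = TRUE)
}

# Log-likelihood of a state sequence observed at the given times:
# sum over consecutive pairs of log P(t_{k+1} - t_k)[from, to]
ctmc_loglik <- function(a, b, states, times) {
  ll <- 0
  for (k in seq_len(length(states) - 1)) {
    P <- ctmc_P(a, b, times[k + 1] - times[k])
    ll <- ll + log(P[states[k], states[k + 1]])
  }
  ll
}

# toy data: states 1,2,2,1 observed at times 0,1,2,4
ctmc_loglik(0.5, 0.3, states = c(1, 2, 2, 1), times = c(0, 1, 2, 4))
```

Maximizing this over (a, b), e.g. with optim() on the log scale, recovers the intensity estimates.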
[R] Which language is faster for numerical computation?
Dear all, R offers integration mechanisms with different programming languages like C, C++, Fortran, .NET etc. I am therefore curious: for heavy numerical computation, which language is the fastest? Is there any study? I especially want to know because, if some study says that C is the fastest language for numerical computation, I would change some of my R code into C. Thanks for your time.
[R] Making R lazy
Dear All, I hope this is not too off-topic. I am wondering if there is any possibility to make R code lazy, i.e. to prevent it from calculating quantities which are not used in the code. As an example: you are in a rush to modify your code and it ends up with dead branches, let's say a sequence which is calculated but not used in any following calculations, not printed on screen, not stored in a file, etc. It would be nice to teach R to automagically skip its calculation when I run the script (at least non-interactively). I know that such a situation is probably the result of bad programming habits, but it may arise all the same. If I understand correctly, what I am asking for is something different from any kind of garbage collection, which would take place, if ever, only after the array has been calculated. Any suggestions (or clarifications if I am on the wrong track) are appreciated. Cheers, Lorenzo
Re: [R] Strange output daply with empty strata
Hi: Here's what I tried:

# data frame versions (aggregate, ddply):
aggregate(age ~ municipality + employed, data = data.test, FUN = mean)
  municipality employed      age
1            B       no 55.57407
2            C       no 44.67463
3            A      yes 41.58759
4            B      yes 43.59330
5            C      yes 43.82545

ddply(data.test, .(municipality, employed), summarise, mean = mean(age))
  municipality employed     mean
1            A      yes 41.58759
2            B       no 55.57407
3            B      yes 43.59330
4            C       no 44.67463
5            C      yes 43.82545

It appears that aggregate() silently removes groups where no observations are present, but ddply() has an option .drop which, when set to FALSE, returns NaN for the not-employed group in municipality A:

ddply(data.test, .(municipality, employed), summarise, avgage = mean(age), .drop = FALSE)
  municipality employed   avgage
1            A       no      NaN
2            A      yes 41.58759
3            B       no 55.57407
4            B      yes 43.59330
5            C       no 44.67463
6            C      yes 43.82545

# tapply/daply
with(data.test, tapply(age, list(municipality, employed), mean))
        no      yes
A       NA 41.58759
B 55.57407 43.59330
C 44.67463 43.82545

daply(data.test, .(municipality, employed), function(d){mean(d$age)})
            employed
municipality       no      yes
           A 41.58759 44.67463
           B 55.57407 43.82545
           C 43.59330       NA

The .drop argument has a different meaning in daply. Some R functions have an na.last argument, and it may be that somewhere in daply there is a function call that moves all NAs to the end. The means are in the right order except for the first, where the NA is supposed to be, so everything in the table is offset by 1. I've cc'ed Hadley on this. HTH, Dennis

On Thu, Sep 9, 2010 at 2:43 AM, Jan van der Laan rh...@eoos.dds.nl wrote: Dear list, I get some strange results with daply from the plyr package. In the example below, the average age per municipality for employed and unemployed is calculated. If I do this using tapply (see code below) I get the following result:

        no      yes
A       NA 36.94931
B 51.22505 34.24887
C 48.05759 51.00198

If I do this using daply:

municipality       no      yes
A            36.94931 48.05759
B            51.22505 51.00198
C            34.24887       NA

daply generates the same numbers. 
However, these are not in the correct cells. For example, in municipality A everybody is employed. Therefore, the NA should be in the cell for unemployed in municipality A. Am I using daply incorrectly or is there indeed something wrong with the output of daply? Regards, Jan

I am using version 1.1 of the plyr package.

# Generate some test data
data.test <- data.frame(
  municipality = rep(LETTERS[1:3], each = 10),
  employed = sample(c("yes", "no"), 30, replace = TRUE),
  age = runif(30, 20, 70))
# Make sure everybody is employed in municipality A
data.test$employed[data.test$municipality == "A"] <- "yes"
# Compare the output of tapply:
tapply(data.test$age, list(data.test$municipality, data.test$employed), mean)
# to that of daply:
daply(data.test, .(municipality, employed), function(d){mean(d$age)})
# results of ddply are the same as tapply
ddply(data.test, .(municipality, employed), function(d){mean(d$age)})
[R] confidence intervals around p-values
Dear all, I wonder if anyone has heard of confidence intervals around p-values... Any pointer would be highly appreciated. Best, Fer
Re: [R] Calculating with tolerances (error propagation)
Hello Bernardo,

-----
If I understood your problem this script solves your problem:
q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q*h
[1]  2.475  7.500 12.625
-----

OK, this solves the simple example. But what if the example is not that simple? E.g.

P = 5 * q/h

Here, to get the maximum tolerances for P, we need to divide the maximum value for q by the minimum value for h, and vice versa. Is there any way to do this automatically, without thinking about every single step? There is a thing called interval arithmetic (I saw it as an Octave package) which would do something like this. I would have thought that tracking how a (measuring) error propagates through a complex calculation would be a standard problem of statistics. In other words, I am looking for a data type which is a number with a deviation +- somehow attached to it, with binary operators that automatically know how to handle the deviation. Thank you, Jan
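There is no such type in base R, but S3 operator overloading makes a bare-bones version easy to sketch. The 'interval' class below is hypothetical (not a released package), handles only the four basic operations, and ignores correlation between inputs, so it gives worst-case bounds only:

```r
# hypothetical minimal interval type -- NOT a released package
interval <- function(lo, hi) structure(list(lo = lo, hi = hi), class = "interval")

Ops.interval <- function(e1, e2) {
  as_iv <- function(x) if (inherits(x, "interval")) x else interval(x, x)
  e1 <- as_iv(e1); e2 <- as_iv(e2)
  # evaluate the operation at all endpoint combinations and keep the extremes
  cand <- switch(.Generic,
    "+" = c(e1$lo + e2$lo, e1$hi + e2$hi),
    "-" = c(e1$lo - e2$hi, e1$hi - e2$lo),
    "*" = c(e1$lo * e2$lo, e1$lo * e2$hi, e1$hi * e2$lo, e1$hi * e2$hi),
    "/" = c(e1$lo / e2$lo, e1$lo / e2$hi, e1$hi / e2$lo, e1$hi / e2$hi),
    stop("operator not implemented for intervals"))
  interval(min(cand), max(cand))
}

print.interval <- function(x, ...) cat("[", x$lo, ",", x$hi, "]\n")

q <- interval(0.14, 0.16)   # Q = 0.15 +- 0.01
h <- interval(9.9, 10.1)    # H = 10   +- 0.1
print(5 * q * h)            # P with worst-case bounds
print(5 * q / h)            # division picks max q / min h automatically
</code>
```

The division case works without thinking about which extreme goes where, which is exactly Jan's request; a serious implementation would also handle intervals containing zero in the divisor.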
[R] UseR groups: NewJersey R - LondonR - BaselR
NewJerseyR Mango Solutions is pleased to announce the inaugural meeting of NewJerseyR, a networking and social event for all local and regional R users. Thank you to those of you already registered to attend the first NewJerseyR meeting on Thursday 16th September and to those of you who have already joined our mailing list for future NewJerseyR events. Please note that we are announcing a CHANGE OF VENUE for the meeting next week. The new venue is adjacent to the originally advertised venue, so we are confident that this late change should cause no inconvenience. Date: Thursday 16th September 2010 Venue: Kona Grill, 511 Route One South, Iselin, New Jersey 08830 Time: 6:30pm - 9:30pm (talks start at 7pm) Presentations at the event will be: - Richard Pugh, Mango Solutions: MSToolkit 3.0: Clinical Trial Simulation using R - Max Kuhn, Pfizer: The Caret Package: A Unified Interface for Predictive Models - Mani Subramaniam, AT&T Labs: tsX: An R package for the exploratory analysis of a large collection of time-series http://user2010.org/abstracts/Subramaniam+Varadhan+Urbanek+Epstein_3.pdf - Brian McHugh, Bristol Myers Squibb: How R can orchestrate bootstrapping Free drinks and snacks will be available. Mailing List - To ensure you receive details of all future NewJerseyR meetings, please ask to join our mailing list by emailing us at: newjers...@mango-solutions.com LondonR Thank you to everyone who attended the July LondonR meeting and big thanks to Chris Campbell, Matthew Dowle and Andy Nicholls for presenting. Past presentations are available at http://londonr.org/LondonR-20090331/Agenda.html As ever, we need volunteers to present at all future meetings. If you feel you have something to input into this meeting or can recommend someone else, we would be delighted to hear from you. The next LondonR meeting will be held on the 5th October 2010 Venue: Counting House - 50 Cornhill, London, EC3V 3PD Tel: 020 7283 7123 (Nearest tube is Bank exit 4 or 5. 
Opposite Starbucks) Time: 6pm - 9pm Agenda: to be confirmed The following LondonR meetings will be held: * 8th December 2010 * 9th March 2011 (Agenda and venue to be confirmed) To register, for more information, or to speak at the next LondonR meeting please email us at lond...@mango-solutions.com BaselR Mango Solutions would like to thank all who came to the BaselR meeting on Wednesday 29th July. We were delighted to see so many in attendance. Our particular thanks go to the following for their extremely interesting presentations: Andrew Ellis, ETH Zurich - Desktop Publishing with Sweave Dominik Locher, THETA AG - Professional Reporting with RExcel Sebastian Pérez Saaibi, ETH Zurich - R Generator Tool for Google Motion Charts These presentations are available at http://www.baselr.org/Presentations.html The next BaselR meeting will be on Wednesday 13th October 2010. Time: 6:30pm - 9:30pm (talks start at 7pm) Venue: transBARent, Viaduktstrasse 3 CH-4051 Basel Full details of presentations for this next meeting will be published in due course. If you would like to attend the next BaselR meeting we would ask you to please register ahead of the meeting in order to help us with our planning. Please register by emailing: bas...@mango-solutions.com Mailing List - To ensure you receive details of all future BaselR meetings, please ask to join our mailing list by emailing us at: bas...@mango-solutions.com All of our UseR group meetings are free. All we ask is for attendees to register prior to the event so that we can cater for everyone. Mango Solutions run public R courses as well as private, customised R courses. Please visit http://mango-solutions.com/training.html For more information about Mango Solutions please contact us at i...@mango-solutions.com or visit our website www.mango-solutions.com Sarah Lewis Hadley Wickham, Creator of ggplot2 - first time teaching in the UK. 1st - 2nd November 2010. 
To book your seat please go to http://mango-solutions.com/news.html T: +44 (0)1249 767700 Ext: 200 F: +44 (0)1249 767707 M: +44 (0)7746 224226 www.mango-solutions.com Unit 2 Greenways Business Park Bellinger Close Chippenham Wilts SN15 1BN UK LEGAL NOTICE This message is intended for the use o...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] try-error can not be test. Why?
On 08/09/2010 11:46 PM, Philippe Grosjean wrote: On 08/09/10 19:25, David Winsemius wrote: On Sep 8, 2010, at 1:18 PM, telm8 wrote: Hi, I am having some strange problem with detecting try-error. From what I have read so far, the following statement:

try( log("a") ) == "try-error"

should yield TRUE; however, it yields FALSE. I cannot figure out why. Can someone help?

class(try( log("a"), silent=TRUE )) == "try-error"
[1] TRUE

This is perfectly correct in this case, but while we are mentioning a test on the class of an object, the better syntax is:

inherits(try(log("a")), "try-error")

In a more general context, class may be defined with multiple strings (R's way of subclassing S3 objects). For instance, this does not work:

if (class(Sys.time()) == "POSIXct") "ok" else "not ok"

... because the class of a 'POSIXct' object is defined as c("POSIXt", "POSIXct"). This works: Getting even further off track: another advantage of inherits() is that the class can change. For example, in the upcoming 2.12.0 release, the class of Sys.time() will be

class(Sys.time())
[1] "POSIXct" "POSIXt"

Putting the names in the reverse order was a relic from ancient times that will soon be corrected. The tests below won't care about this change, but some more fragile tests might. Duncan Murdoch

if (inherits(Sys.time(), "POSIXct")) "ok" else "not ok"

Alternate valid tests would be (but a little bit less readable):

if (any(class(Sys.time()) == "POSIXct")) "ok" else "not ok"

or, by installing the operators package, a less conventional but cleaner code:

install.packages("operators")
library(operators)
if (Sys.time() %of% "POSIXct") "ok" else "not ok"

Best, Philippe Grosjean Many thanks -- View this message in context: http://r.789695.n4.nabble.com/try-error-can-not-be-test-Why-tp2531675p2531675.html Sent from the R help mailing list archive at Nabble.com. 
David Winsemius, MD West Hartford, CT
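To summarize the thread, here is a minimal demonstration of why inherits() is the robust test (log("a") is just a convenient way to produce a try-error object):

```r
x <- try(log("a"), silent = TRUE)   # non-numeric argument -> an error object

class(x)                            # "try-error"
class(x) == "try-error"             # TRUE here, but breaks for multi-class objects
inherits(x, "try-error")            # TRUE, and robust to subclassing

# class(Sys.time()) has two elements, so == is the wrong tool:
inherits(Sys.time(), "POSIXct")     # TRUE whatever the order of the class vector
```

The original poster's expression failed because it compared the error *message string* returned by try() against "try-error", rather than testing the object's class.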
Re: [R] Which language is faster for numerical computation?
On 09/09/10 12:26, Christofer Bogaso wrote: Dear all, R offers integration mechanisms with different programming languages like C, C++, Fortran, .NET etc. I am therefore curious: for heavy numerical computation, which language is the fastest? Is there any study? I especially want to know because, if some study says that C is the fastest language for numerical computation, I would change some of my R code into C.

As far as I am aware, the two main choices are C and Fortran; which one is faster depends on the calculations. Cheers, Rainer

Thanks for your time.

-- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D): +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: rai...@krugs.de Skype: RMkrug
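Short of a formal study, it is easy to see informally why moving hot loops into compiled code pays off: a vectorized call like sum() already runs in C inside R, and the gap between it and an explicit interpreted loop gives a rough feel for what compiled code buys. A small sketch (timings are machine-dependent; this measures interpreter overhead, not C vs Fortran):

```r
x <- runif(1e6)

# interpreted R loop
t_loop <- system.time({ s1 <- 0; for (v in x) s1 <- s1 + v })["elapsed"]
# vectorized call, which runs in compiled C code
t_vec  <- system.time(s2 <- sum(x))["elapsed"]

all.equal(s1, s2)                 # same answer, up to summation order
c(loop = t_loop, vectorized = t_vec)
```

The practical advice that usually follows on this list: vectorize first, and only reach for .C/.Call (or, nowadays, Rcpp) once profiling shows an irreducible loop.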
[R] modulo operation
Dear R-users, maybe there is something that I am not understanding or have missed... Why do these operations yield these results?

25 %/% 0.2
[1] 124
25 %% 0.2
[1] 0.2

I would expect (although I know that what I expect and what is really intended in the code may be different things):

25/0.2
[1] 125
25 - floor(25/0.25)*0.25
[1] 0

(At least this second one is what I would expect from the code in arithmetic.c, lines 168 to 178.)

-- --- José M. Blanco-Moreno Dept. de Biologia Vegetal (Botànica) Facultat de Biologia Universitat de Barcelona Av. Diagonal 645 08028 Barcelona SPAIN --- phone: (+34) 934 039 863 fax: (+34) 934 112 842
[R] createDataPartition
Dear all, does anyone know how to define the structure of the required samples using function createDataPartition, meaning the proportions of the different classes in the partition? Something like this for the iris data:

createDataPartition(y = c(setosa = .5, virginica = .3, versicolor = .2),
                    times = 10, p = .7, list = FALSE)

Thanks a lot for your help. Regards, Trafim
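As far as I know, createDataPartition() stratifies on the *observed* distribution of y rather than on user-chosen proportions, so the call above is not supported by caret directly. If explicit per-class proportions are really needed, here is a base-R sketch (strat_sample is a hypothetical helper, not part of caret):

```r
# draw round(size * props[cl]) row indices from each class cl of y
strat_sample <- function(y, props, size) {
  idx <- unlist(lapply(names(props), function(cl) {
    pool <- which(y == cl)
    sample(pool, min(length(pool), round(size * props[[cl]])))
  }))
  sort(idx)
}

set.seed(1)
idx <- strat_sample(iris$Species,
                    c(setosa = .5, virginica = .3, versicolor = .2),
                    size = 100)
table(iris$Species[idx])   # 50 setosa, 30 virginica, 20 versicolor
```

Wrapping this in replicate() would reproduce the times = 10 behaviour of createDataPartition.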
Re: [R] Error in normalizePath(path) : with McAfee
On 09/09/2010 12:01 AM, Erin Hodgess wrote: Dear R People: I keep getting the Error in normalizePath(path) : while trying to obtain the necessary packages to use with the Applied Spatial Statistics with R book. I turned off the Firewall (from McAfee) but am still getting the same message. Does anyone have any idea on a solution please? I think you need to show us your code and the error in context. Duncan Murdoch sessionInfo() R version 2.11.1 (2010-05-31) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ctv_0.6-0 loaded via a namespace (and not attached): [1] tools_2.11.1 Thanks, Erin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] modulo operation
2010/9/9 José M. Blanco Moreno jmbla...@ub.edu: Dear R-users, maybe there is something that I am not understanding or have missed... Why do these operations yield these results?

25 %/% 0.2
[1] 124
25 %% 0.2
[1] 0.2

I would expect (although I know that what I expect and what is really intended in the code may be different things):

25/0.2
[1] 125
25 - floor(25/0.25)*0.25
[1] 0

(At least this second one is what I would expect from the code in arithmetic.c, lines 168 to 178.)

Did you read the documentation before you read the code?

‘%%’ and ‘x %/% y’ can be used for non-integer ‘y’, e.g. ‘1 %/% 0.2’, but the results are subject to rounding error and so may be platform-dependent. Because the IEC 60059 representation of ‘0.2’ is a binary fraction slightly larger than ‘0.2’, the answer to ‘1 %/% 0.2’ should be ‘4’ but most platforms give ‘5’.

I suspect that is relevant to your interests. Barry
Re: [R] Making R lazy
On 09/09/2010 6:27 AM, Lorenzo Isella wrote: Dear All, I hope this is not too off-topic. I am wondering if there is any possibility to make R code lazy, i.e. to prevent it from calculating quantities which are not used in the code. As an example: you are in a rush to modify your code and it ends up with dead branches, let's say a sequence which is calculated but not used in any following calculations, not printed on screen, not stored in a file, etc. It would be nice to teach R to automagically skip its calculation when I run the script (at least non-interactively). I know that such a situation is probably the result of bad programming habits, but it may arise all the same. If I understand correctly, what I am asking for is something different from any kind of garbage collection, which would take place, if ever, only after the array has been calculated. Any suggestions (or clarifications if I am on the wrong track) are appreciated.

R does lazy evaluation of function arguments, so an ugly version of what you're asking for is to put all your code into arguments, either as default values or as actual argument values. For example:

f <- function(a = slow1, b = slow2, c = slow3) {
  a
  c
}
f()

will never calculate slow2, but it will calculate slow1 and slow3. The other version of this is

f <- function(a, b, c) {
  a
  c
}
f(slow1, slow2, slow3)

The big difference between the two versions is in scoping: the first one evaluates the expressions in the local scope of f, the second one evaluates them in the scope of the caller. Duncan Murdoch
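Besides lazy function arguments, base R also has delayedAssign(), which binds a variable to an unevaluated promise; the expression is computed only if and when the variable is first used, so a dead branch that only feeds such a variable costs nothing. A small sketch:

```r
# 'cheap' and 'dead' are promises; neither is evaluated yet
delayedAssign("cheap", { cat("computing cheap\n"); 1:10 })
delayedAssign("dead",  { cat("computing dead\n");  Sys.sleep(60); 0 })

res <- sum(cheap)   # forces 'cheap' only; 'dead' is never evaluated
res
```

If the script never touches `dead`, the 60-second sleep never happens, which is the behaviour Lorenzo asked for, at the price of wrapping each expensive expression explicitly.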
Re: [R] average columns of data frame corresponding to replicates
try this:

> myData
   sample1.id1 sample1.id2 sample2.id1 sample1.id3 sample3.id1 sample1.id4 sample2.id2
1            1           2           2           1           1           1           1
2            1           2           2           2           1           2           1
3            1           2           2           3           1           3           1
4            1           2           2           4           1           4           1
5            1           2           2           5           1           5           1
6            1           2           2           6           1           6           1
7            1           2           2           7           1           7           1
8            1           2           2           8           1           8           1
9            1           2           2           9           1           9           1
10           1           2           2          10           1          10           1

newData <- NULL
for (i in repeat_ids){
  # determine the columns to use
  colIndx <- grep(paste(i, "$", sep=''), colnames(myData))
  if (length(colIndx) == 0) next  # make sure it exists
  # create the average of the columns
  newData <- cbind(newData, rowMeans(myData[, colIndx], na.rm=TRUE))
  colnames(newData)[ncol(newData)] <- i  # add the name
}

> newData
       id1 id2
 [1,] 1.33 1.5
 [2,] 1.33 1.5
 [3,] 1.33 1.5
 [4,] 1.33 1.5
 [5,] 1.33 1.5
 [6,] 1.33 1.5
 [7,] 1.33 1.5
 [8,] 1.33 1.5
 [9,] 1.33 1.5
[10,] 1.33 1.5

On Tue, Sep 7, 2010 at 12:00 PM, Juliet Hannah juliet.han...@gmail.com wrote: Hi Group, I have a data frame below. Within this data frame there are samples (columns) that are measured more than once. Samples are indicated by idx. So id1 is present in columns 1, 3, and 5. Not every id is repeated. I would like to create a new data frame so that the repeated ids are averaged. For example, in the new data frame, columns 1, 3, and 5 of the original will be replaced by 1 new column that is the mean of these three. Thanks for any suggestions. Juliet

myData <- data.frame(sample1.id1 = rep(1,10), sample1.id2 = rep(2,10),
  sample2.id1 = rep(2,10), sample1.id3 = 1:10, sample3.id1 = rep(1,10),
  sample1.id4 = 1:10, sample2.id2 = rep(1,10))
repeat_ids <- c("id1", "id2")

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
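The loop above can also be written as a single sapply() over the ids, with the same grep-based column matching and the same rowMeans; a sketch, reusing Juliet's example data:

```r
myData <- data.frame(sample1.id1 = rep(1, 10), sample1.id2 = rep(2, 10),
                     sample2.id1 = rep(2, 10), sample1.id3 = 1:10,
                     sample3.id1 = rep(1, 10), sample1.id4 = 1:10,
                     sample2.id2 = rep(1, 10))
repeat_ids <- c("id1", "id2")

# one column of means per id; drop = FALSE keeps rowMeans working
# even when an id matches a single column
newData2 <- sapply(repeat_ids, function(i)
  rowMeans(myData[, grep(paste(i, "$", sep = ""), colnames(myData)),
                  drop = FALSE],
           na.rm = TRUE))
head(newData2)
```

sapply() returns a matrix whose column names are the ids, so the explicit colnames bookkeeping from the loop version is no longer needed.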
Re: [R] modulo operation
On 09/09/2010 7:56 AM, Barry Rowlingson wrote: 2010/9/9 José M. Blanco Moreno jmbla...@ub.edu: Dear R-users, maybe there is something that I am not understanding or have missed... Why do these operations yield these results?

25 %/% 0.2
[1] 124
25 %% 0.2
[1] 0.2

I would expect (although I know that what I expect and what is really intended in the code may be different things):

25/0.2
[1] 125
25 - floor(25/0.25)*0.25
[1] 0

(At least this second one is what I would expect from the code in arithmetic.c, lines 168 to 178.)

Did you read the documentation before you read the code?

‘%%’ and ‘x %/% y’ can be used for non-integer ‘y’, e.g. ‘1 %/% 0.2’, but the results are subject to rounding error and so may be platform-dependent. Because the IEC 60059 representation of ‘0.2’ is a binary fraction slightly larger than ‘0.2’, the answer to ‘1 %/% 0.2’ should be ‘4’ but most platforms give ‘5’.

I suspect that is relevant to your interests

Yes. I think José is assuming that 25 %/% 0.2 and floor(25/0.2) are equal, but they are not, because rounding affects them differently. (The first is a single operation with no rounding except in the representation of 0.2; the second is two operations and is subject to another set of rounding.) Duncan Murdoch
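When an exact quotient of "nice" decimals is needed, one workaround is to scale both operands to near-integers first, where %/% is exact. div_dec below is a hypothetical helper, and it assumes the inputs really are decimals with at most `digits` decimal places:

```r
div_dec <- function(x, y, digits = 10) {
  f <- 10^digits
  # after round(), both operands are whole numbers stored exactly as doubles,
  # so the integer division is no longer affected by the representation of y
  round(x * f) %/% round(y * f)
}

25 %/% 0.2        # 124 on many platforms, per the representation of 0.2
div_dec(25, 0.2)  # 125
div_dec(1, 0.2)   # 5
```

This sidesteps the rounding issue discussed above at the cost of assuming a known decimal precision.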
Re: [R] Reproducible research
Another vote for org-mode here. In addition to the advantages the other posts mentioned, you get multiple export engines (html, latex, ...) all built in.

On 09/09/2010 12:47 AM, David Scott wrote: I am investigating some approaches to reproducible research. I need in the end to produce .html or .doc or .docx. I have used hwriter in the past but have had some problems with verbatim output from R. Tables are also not particularly convenient. I am interested in R2HTML and R2wd in particular, and possibly odfWeave. Does anyone have sample documents using any of these approaches which they could let me have? David Scott _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics
Re: [R] Failure to aggregate
On Wed, Sep 8, 2010 at 4:48 AM, Dimitri Shvorob dimitri.shvo...@gmail.com wrote: I was able to aggregate (with sqldf, at least) after saving and re-loading the dataframe. My first guess was that h (and/or price?) now being a factor - stringsAsFactors = T by default - made the difference, and I tried to convert x$h to factor, but received an error.

Please provide enough of x to reproduce your problem, e.g.

x <- head(x)
dput(x)
# repeat code and ensure it still shows the problem

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] modulo operation
Did you read the documentation before you read the code? ‘%%’ and ‘x %/% y’ can be used for non-integer ‘y’, e.g. ‘1 %/% 0.2’, but the results are subject to rounding error and so may be platform-dependent. Because the IEC 60559 representation of ‘0.2’ is a binary fraction slightly larger than ‘0.2’, the answer to ‘1 %/% 0.2’ should be ‘4’ but most platforms give ‘5’. I suspect that is relevant to your interests. Yes. I think José is assuming that 25 %/% 0.2 and floor(25/0.2) are equal, but they are not, because rounding affects them differently. (The first is a single operation with no rounding except in the representation of 0.2; the second is two operations and is subject to another set of rounding.) Duncan Murdoch Thank you (both) very much for the info. Indeed I wasn't aware of that piece of documentation and of the implications of rounding. Excuse me for my hasty question when facing this behaviour. -- --- José M. Blanco-Moreno Dept. de Biologia Vegetal (Botànica) Facultat de Biologia Universitat de Barcelona Av. Diagonal 645 08028 Barcelona SPAIN --- phone: (+34) 934 039 863 fax: (+34) 934 112 842
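[Editor's note] The distinction Duncan describes can be checked directly at the console. Since the documentation warns that the exact results are platform-dependent, no particular output is claimed here:

```r
25 %/% 0.2        # one operation; only 0.2's binary representation rounds
floor(25 / 0.2)   # two operations, each subject to its own rounding
1 %/% 0.2         # the documented example: should be 4, most platforms give 5
```

If the two forms disagree on your machine, that is the rounding effect described above, not a bug.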
Re: [R] Emacs function argument hints
Hi Tim, This works out of the box for me, with ESS 5.11 and Emacs 23.1. -Ista On Thu, Sep 9, 2010 at 4:07 AM, Tim Elwell-Sutton tesut...@hku.hk wrote: Hi I've recently started using Emacs as my text editor for writing R scripts. I am looking for a feature which I have seen in the standard R text editor for Mac OS. In the Mac OS editor, when you start typing a function, the possible arguments for that function appear at the bottom of the window. E.g. if you type table( then before you finish typing you can see at the bottom of the window: table(..., exclude = if (useNA == "no") c(NA, NaN), useNA = c("no", "ifany", "always"), dnn = list.names(...), deparse.level = 1) I think this feature may be called "function argument hints" but I'm not sure, and searching the archive with that term has not produced anything useful. Is this feature available in Emacs or any other Windows text editor for R? Thanks very much Tim (Using Windows XP, R 2.11.1, GNU Emacs 23.2.1) [[alternative HTML version deleted]] -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
Re: [R] confidence intervals around p-values
Fernando Marmolejo Ramos fernando.marmolejoramos at adelaide.edu.au writes: Dear all I wonder if anyone has heard of confidence intervals around p-values... Any pointer would be highly appreciated. No, and my reflex is that it seems like a bad idea. If you are using p-values as an index of effect size (e.g. translating a t- or Z-score into a p-value), why not calculate the confidence interval on the effect size? This is off-topic for the group (not an R question), but if you gave a sense of the problem you were trying to solve you might get some answers. Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] confidence intervals around p-values
On 09/09/2010 6:44 AM, Fernando Marmolejo Ramos wrote: Dear all I wonder if anyone has heard of confidence intervals around p-values... That doesn't really make sense. p-values are statistics, not parameters. You would compute a confidence interval around a population mean because that's a parameter, but you wouldn't compute a confidence interval around the sample mean: you've observed it exactly. Duncan Murdoch Any pointer would be highly appreciated. Best Fer
[R] [R-pkgs] New package for medical image registration: RNiftyReg
The first release of RNiftyReg, an R package for registration (alignment and resampling) of medical images, is now available on CRAN [1]. It may also be useful for other 3D array-like data sets. RNiftyReg is built on top of the NiftyReg library [2], and is written in a mixture of C, C++ and R. It currently supports 3D rigid-body and affine registration, and support for 2D and nonlinear registration is planned for a future release. NIfTI-format files can be read in and passed to the registration algorithm using the oro.nifti package. In testing I've found that a standard 12 degree-of-freedom affine registration typically takes less than a minute, but timings will depend on the dimensions of the images. Feedback on the package would be very welcome at this stage. Over the last few years a number of R packages for medical image analysis have been produced, and R is gaining momentum as a platform in this field [3]. I hope that this package will be a useful addition. All the best, Jon -- [1] http://cran.r-project.org/web/packages/RNiftyReg/index.html [2] http://sourceforge.net/projects/niftyreg/ [3] http://cran.r-project.org/web/views/MedicalImaging.html ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Invitation to the ICANNGA'11 Conference
Dear Colleague, The 10th ICANNGA conference, to be held April 14-16 2011 in Ljubljana, Slovenia, is fast approaching and with it the paper submission deadline, which is October 1st, 2010. Let us kindly invite you to visit our web page: www.icannga.com, where all the details on ICANNGA, its history, program topics, the keynote speakers, and registration details can be found. Moreover, it offers information on Slovenia and Ljubljana, its capital. We want to remind you also that the accepted papers will be published in the Springer's Lecture Notes in Computer Science, and that the best selected papers will appear in the Springer's Computing journal with SCI impact factor. We are looking forward to hosting you in our beautiful country and hope you will be able to feel Slovenia and enjoy the conference and your stay. With best wishes, ICANNGA Organizing Committee [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Strange output daply with empty strata
daply(data.test, .(municipality, employed), function(d){mean(d$age)} )
            employed
municipality       no      yes
           A 41.58759 44.67463
           B 55.57407 43.82545
           C 43.59330       NA
The .drop argument has a different meaning in daply. Some R functions have an na.last argument, and it may be that somewhere in daply there is a function call that moves all NAs to the end. The means are in the right order except for the first, where the NA is supposed to be, so everything is offset in the table by 1. I've cc'ed Hadley on this. This is a bug, which I've fixed in the development version (hopefully to be released next week). In plyr 1.2:
daply(data.test, .(municipality, employed), function(d){mean(d$age)} )
            employed
municipality       no      yes
           A       NA 39.49980
           B 44.69291 51.63733
           C 57.38072 45.28978
Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/
[R] Highlighting a few bars in a barplot
Hello, I have a bar plot where I am already using colour to distinguish one set of samples from another. I would also like to highlight a few of these bars as ones that should be looked at in detail. I was thinking of using hatching, but I can't work out how or if you can have a background colour and hatching which is different between bars. Any suggestions on how I should do this? Thanks Dan -- ** Daniel Brewer, Ph.D. Institute of Cancer Research Molecular Carcinogenesis Email: daniel.bre...@icr.ac.uk ** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the a...{{dropped:2}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot symbol +, but with variable bar lengths
Look at my.symbols in the TeachingDemos package. -Original Message- From: Rainer Machne r...@tbi.univie.ac.at Sent: Thursday, September 09, 2010 12:42 AM To: R-help@r-project.org Subject: [R] plot symbol +, but with variable bar lengths Hi, does anybody know of some plotting function or an easy way to generate + symbols with individually settable bar lengths? I tried just combining | and - as pch and setting the size via cex, but that doesn't really work since the two symbols have different default lengths. Is there a horizontal | or a longer - available? Thanks, Rainer
Re: [R] Saving/loading custom R scripts
Josh, I liked your idea of setting the repo in the .Rprofile file, so I tried it:
r <- getOption("repos")
r["CRAN"] <- "http://cran.stat.ucla.edu"
options(repos = r)
rm(r)
And now when I open R I get an error:
Error in r["CRAN"] <- "http://cran.stat.ucla.edu" : cannot do complex assignments in base namespace
I am using R 2.11.1 patched on Windows. Thanks, Roger -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Joshua Wiley Sent: Wednesday, September 08, 2010 11:20 AM To: DrCJones Cc: r-help@r-project.org Subject: Re: [R] Saving/loading custom R scripts Hi, Just create a file called .Rprofile located in your working directory (this means you could actually have different ones in each working directory). In that file, you can put code just like any other code that would be source()d in. For instance, all my .Rprofile files start with:
r <- getOption("repos")
r["CRAN"] <- "http://cran.stat.ucla.edu"
options(repos = r)
rm(r)
so that I do not have to pick my CRAN mirror. Similarly you could merely add this line to the file:
source(file = "http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt")
and R would go online, download that file and source it in (not that I am recommending re-downloading every time you start R). Then whatever names they used to define the functions would be in your workspace. Note that in general you will not get any output alerting you that it has worked; however, if you type ls() you should see those functions' names. Cheers, Josh On Wed, Sep 8, 2010 at 12:25 AM, DrCJones matthias.godd...@gmail.com wrote: Hi, How does R automatically load functions so that they are available from the workspace? Is it anything like Matlab - you just specify a directory path and it finds it?
The reason I ask is because I found a really nice script that I would like to use on a regular basis, and it would be nice not to have to copy and paste it into R on every startup: http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt This would be for Ubuntu, if that makes any difference. Cheers -- View this message in context: http://r.789695.n4.nabble.com/Saving-loading-custom-R-scripts-tp2530924p2530924.html Sent from the R help mailing list archive at Nabble.com. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ *** This message is for the named person's use only. It may\...{{dropped:20}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
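[Editor's note] As a variation on Josh's snippet (an assumption on my part, not something posted in the thread), wrapping the startup code in local() keeps helper objects such as `r` out of the workspace and may also sidestep assignment errors like the one Roger reports, since everything then evaluates in an ordinary local environment:

```r
## Hypothetical ~/.Rprofile: run the repository setup inside local() so
## no leftover objects (like `r`) appear in the workspace at startup.
local({
  r <- getOption("repos")
  r["CRAN"] <- "http://cran.stat.ucla.edu"   # mirror URL from the thread
  options(repos = r)
})
```

After this, rm(r) is unnecessary because `r` never existed outside the local() block.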
[R] multi-class for BRT
Hi, I want to fit a boosted regression (classification) tree for a categorical response with 7 levels. Can I do this with the gbm package? Please help me. Thanks a lot [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Determine Bounds of Current Graph
I'm having trouble determining the bounds of my current graph. I know how to set the bounds up front (ylim xlim in most cases), but I would rather be able to dynamically see what was chosen to use in later code. Example: library(maps) map('state','Indiana') map.axes() ??Something that lets me know the y-axis is from ~38 to ~42 and store this information into a vector Is there some way to query what the bounds of the current graph are? Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie cross tabulation issue
It would help if you included a bit of sample data. See ?dput as a way of doing this. Also a good place to start is by looking at the package reshape. Have a look at http://had.co.nz/reshape/ for some information on the package. --- On Wed, 9/8/10, Jonathan Finlay jmfinl...@gmail.com wrote: From: Jonathan Finlay jmfinl...@gmail.com Subject: [R] Newbie cross tabulation issue To: r-help@r-project.org Received: Wednesday, September 8, 2010, 6:40 PM hi, i'm new in R and i need some help. Please, do you know a function that can compute cross tables for many variables and show the result in one table that looks like this?:
+------------+---------------------+
|            |      X variable     |
|            | Xop1 | Xop2 | Xop3 |
+------------+---------------------+
| Yvar1  Op1 | Total  | %row.. |
|        Op2 |        | %row.. |
+------------+---------------------+
| Yvar2  Op1 |        | %row.. |
|        Op2 |        | %row.. |
+------------+---------------------+
| Yvar3  Op1 |        | %row.. |
|        Op2 |        | %row.. |
|        Op3 |        | %row.. |
+------------+---------------------+
Like a pivot table! thanks a lot. -- Jonathan. [[alternative HTML version deleted]]
Re: [R] Which language is faster for numerical computation?
For the compiled languages, it depends heavily on the compiler. This sort of comparison is rendered moot by the huge variety of compiler and hardware specific optimizations. My suggestion is to use C, or possibly C++ in conjunction with Rcpp, as these are most compatible with R. Also, C and C++ are consistently rated highly (often in the top 3) in popularity and use. Fortran is not. This would make a difference if you want to collaborate or ask for help. -Matt On Thu, 2010-09-09 at 06:26 -0400, Christofer Bogaso wrote: Dear all, R offers integration mechanism with different programming languages like C, C++, Fortran, .NET etc. Therefore I am curious on, for heavy numerical computation which language is the fastest? Is there any study? I specially want to know because, if there is some study saying that C is the fastest language for numerical computation then I would change some of my R code into C. Thanks for your time. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Which language is faster for numerical computation?
On 9 September 2010 at 13:26, Rainer M Krug wrote: | -BEGIN PGP SIGNED MESSAGE- | Hash: SHA1 | | On 09/09/10 12:26, Christofer Bogaso wrote: | Dear all, R offers integration mechanisms with different programming | languages like C, C++, Fortran, .NET etc. Therefore I am curious: | for heavy numerical computation, which language is the fastest? Is | there any study? I specially want to know because, if there is some | study saying that C is the fastest language for numerical computation, | then I would change some of my R code into C. | | As far as I am aware, the two choices are C and Fortran - where it | depends on the calculations, which one is faster. Could it get any more un-scientific and un-empirical? Maybe we should debate whether it is faster on Thursdays than on Wednesdays too? FWIW the Rcpp package contains this benchmark example where (R and) C++ is faster than (R and) C. So it really all depends. If someone wants to contribute a Fortran version I'll gladly commit it too.
                    test replications elapsed relative user.self
5     Rcpp_New_ptr(a, b)            1   0.213   1.0000     0.210
1  R_API_optimised(a, b)            1   0.233   1.0939     0.230
4     Rcpp_New_std(a, b)            1   0.258   1.2113     0.260
3     Rcpp_Classic(a, b)            1   0.445   2.0892     0.450
2      R_API_naive(a, b)            1   1.179   5.5352     1.170
6   Rcpp_New_sugar(a, b)            1   1.260   5.9155     1.260
All results are equal
e...@max:~/svn/rcpp/pkg/Rcpp/inst/examples/ConvolveBenchmarks$
(That is a slightly reworked version from SVN, to be on CRAN soon. What is on CRAN looks a little different as it doesn't use the rbenchmark package.) Benchmark results are far from conclusive proofs, but a carefully set-up study can highlight and illuminate differences and/or lack thereof. If Christofer has a particular problem in mind he should probably test and benchmark approaches to that problem. Lastly, the time the code runs is just one measure. For Rcpp we also aim to minimise the time it takes to _write_ the code to solve the problem.
Dirk -- Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Determine Bounds of Current Graph
On Sep 9, 2010, at 10:07 AM, Isamoor wrote: I'm having trouble determining the bounds of my current graph. I know how to set the bounds up front (ylim and xlim in most cases), but I would rather be able to dynamically see what was chosen, for use in later code. Example: library(maps) map('state','Indiana') map.axes() See ?par:
bounds <- par("usr")
bounds
[1] -88.12964 -84.77184  37.74583  41.82082
??Something that lets me know the y-axis is from ~38 to ~42 and stores this information into a vector. Is there some way to query what the bounds of the current graph are? Thanks! David Winsemius, MD West Hartford, CT
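[Editor's note] To store the limits separately, as the original question asked, the four values from par("usr") can be split into x and y ranges. This is a generic sketch (plain plot rather than the maps example); note that with the default xaxs/yaxs settings, par("usr") extends roughly 4% beyond the data range:

```r
plot(1:10)
usr  <- par("usr")   # c(xmin, xmax, ymin, ymax) of the current plot
xlim <- usr[1:2]     # x-axis bounds as a vector
ylim <- usr[3:4]     # y-axis bounds as a vector
```

These values are valid only while the plot is the active device, so capture them right after drawing.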
Re: [R] confidence intervals around p-values
On 09-Sep-10 13:21:07, Duncan Murdoch wrote: On 09/09/2010 6:44 AM, Fernando Marmolejo Ramos wrote: Dear all I wonder if anyone has heard of confidence intervals around p-values... That doesn't really make sense. p-values are statistics, not parameters. You would compute a confidence interval around a population mean because that's a parameter, but you wouldn't compute a confidence interval around the sample mean: you've observed it exactly. Duncan Murdoch Duncan has succinctly stated the essential point in the standard interpretation. The P-value is calculated from the sample in hand, a definite null hypothesis, and the distribution of the test statistic given the null hypothesis, so (given all of these) there is no scope for any other answer. However, there are circumstances in which the notion of a confidence interval for a P-value makes some sense. One such might be the Mann-Whitney test for identity of distribution of two samples of continuous variables, where (because of discretisation of the values when they were recorded) there are ties. Then you know in theory that the underlying values are all different, but because you don't know where these lie in the discretisation intervals you don't know which way a tie may split. So it would make sense to simulate by splitting ties at random (e.g. uniformly distribute each 1.5 value over the interval (1.5,1.6) or (1.45,1.55)). For each such simulated tie-broken sample, calculate the P-value. Then you get a distribution of exact P-values calculated from samples without ties which are consistent with the recorded data. The central 95% of this distribution could be interpreted as a 95% confidence interval for the true P-value.
To bring this closer to on-topic, here is an example in R (rounding to intervals of 0.2):
set.seed(51324)
X <- sort(2*round(0.5*rnorm(12), 1))
Y <- sort(2*round(0.5*rnorm(12) + 0.25, 1))
rbind(X, Y)
#   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
# X -1.8 -1.2 -0.8 -0.6 0.00  0.2  0.2  1.2  1.8     2   2.2
# Y -1.2 -0.4 -0.2  0.4 0.41  1.0  1.0  1.2  1.8     2   2.6
# So several ties (-1.2, 1.2, 1.8, 2.0), as well as 0.0, 0.4, 1.0, which don't matter.
wilcox.test(X, Y, alternative="less", exact=TRUE, correct=FALSE)
# data: X and Y
# W = 54, p-value = 0.1488
Ps <- numeric(1000)
for (i in 1:1000) {
  Xr <- (X - 0.1) + 0.2*runif(12)
  Yr <- (Y - 0.1) + 0.2*runif(12)
  Ps[i] <- wilcox.test(Xr, Yr, alternative="less", exact=TRUE, correct=FALSE)$p.value
}
hist(Ps)
table(round(Ps, 4))
# 0.1328 0.1457 0.1593 0.1737 0.1888
#     81    267    336    226     90
So this gives you a picture of the uncertainty in the P-value (0.1488, calculated from the rounded data) relative to what it really should have been (if calculated from unrounded data). Since each possible true (tie-broken) sample can be viewed as a hypothesis about unobserved truth, it does make a certain sense to view these results as a kind of confidence distribution for the P-value you should have got. However, this is more of a Bayesian argument, since the above calculation has assigned equal prior probability to the tie-breaks! One could also, I suppose, consider the question of what distribution of P-values might arise if the/an alternative hypothesis were true, and where in this does the P-value that we actually got lie? But these are murkier waters ... Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 09-Sep-10 Time: 15:24:29 -- XFMail --
Re: [R] optimized value worse than starting Value
Yes, Barry, we are aware of this issue. It is caused by printing to the console from FORTRAN in one of the optimization codes, ucminf. If we set trace=FALSE in optimx, this problem goes away. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Barry Rowlingson b.rowling...@lancaster.ac.uk Date: Thursday, September 9, 2010 4:13 am Subject: Re: [R] optimized value worse than starting Value To: Michael Bernsteiner dethl...@hotmail.com Cc: rvarad...@jhmi.edu, r-help@r-project.org On Wed, Sep 8, 2010 at 6:26 PM, Michael Bernsteiner dethl...@hotmail.com wrote: @Barry: Yes, it is the Rosenbrock function. I'm trying out some things I found here: @Ravi: Thanks for your help. I will have a closer look at the BB package. Am I right that the optimx package is offline atm? (Windows) It looks like the Windows build of optimx failed the R CMD check when running the examples: Barry
Re: [R] Calculating with tolerances (error propagation)
On Sep 9, 2010, at 6:50 AM, Jan private wrote: Hello Bernardo, - If I understood your problem, this script solves it:
q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q*h
[1]  2.475  7.500 12.625
- OK, this solves the simple example. But what if the example is not that simple, e.g. P = 5 * q/h. Here, to get the maximum tolerances for P, we need to divide the maximum value for q by the minimum value for h, and vice versa. Is there any way to do this automatically, without thinking about every single step? There is a thing called interval arithmetic (I saw it as an Octave package) which would do something like this. I would have thought that tracking how a (measuring) error propagates through a complex calculation would be a standard problem of statistics?? In other words, I am looking for a data type which is a number with a deviation +- somehow attached to it, with binary operators that automatically know how to handle the deviation. Thank you, Jan David Winsemius, MD West Hartford, CT
Re: [R] Calculating with tolerances (error propagation)
q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q/h[3:1]
[1] 0.02475248 0.07500000 0.12626263
-- View this message in context: http://r.789695.n4.nabble.com/Re-Calculating-with-tolerances-error-propagation-tp2532640p2532991.html Sent from the R help mailing list archive at Nabble.com.
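[Editor's note] The reversed-index trick above works because 5*q/h is monotone in each variable. A small helper (hypothetical, not from the thread; the name `tol_range` is my own) generalises this by evaluating an expression at every corner of the input intervals, which is valid exactly when the extremes occur at corners, i.e. for expressions monotone in each argument. It does not handle division by zero or non-monotone cases:

```r
## Propagate tolerances by brute-force corner evaluation.
## f is a function of the toleranced quantities; each ... argument is
## a length-2 vector c(lower, upper) for one input.
tol_range <- function(f, ...) {
  corners <- expand.grid(...)                  # all corner combinations
  vals <- do.call(mapply, c(list(f), corners)) # evaluate f at each corner
  c(min = min(vals), max = max(vals))
}

tol_range(function(q, h) 5 * q / h,
          q = c(0.05, 0.25), h = c(9.9, 10.1))
# min and max of 5*q/h over the corners, matching the endpoints above
```

For statistically meaningful intervals (rather than worst-case bounds), variance-based propagation or simulation is more appropriate, as the rest of the thread discusses.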
Re: [R] Correlation question
Thank you Dennis, You identified a factor (text column) that I was concerned with. I simplified my example to try to factor out possible causes. I eliminated the recurring values in columns (which were not the columns that caused problems). I produced three examples with simple data sets. 1. Correct output, 2 columns only:
test.notext = read.csv('test-notext.csv')
cor(test.notext, method='spearman')
               P3     HP_tot
P3      1.0000000 -0.2182876
HP_tot -0.2182876  1.0000000
dput(test.notext)
structure(list(P3 = c(2L, 2L, 2L, 4L, 2L, 3L, 2L, 1L, 3L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L), HP_tot = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 15L, 15L, 15L, 15L, 15L, 15L, 15L)), .Names = c("P3", "HP_tot"), class = "data.frame", row.names = c(NA, -25L))
2. Incorrect output where I introduced my P7 column containing only the text character 'a':
test = read.csv('test.csv')
cor(test, method='spearman')
               P3 P7     HP_tot
P3      1.0000000 NA -0.2502878
P7             NA  1         NA
HP_tot -0.2502878 NA  1.0000000
Warning message:
In cor(test, method = "spearman") : the standard deviation is zero
dput(test)
structure(list(P3 = c(2L, 2L, 2L, 4L, 2L, 3L, 2L, 1L, 3L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L), P7 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "a", class = "factor"), HP_tot = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 15L, 15L, 15L, 15L, 15L, 15L, 15L)), .Names = c("P3", "P7", "HP_tot"), class = "data.frame", row.names = c(NA, -25L))
3. Incorrect output with P7 containing a variety of alphanumeric (ASCII) characters, to factor out the equal-valued column issue. Notice that the text column is interpreted as a numeric value.
test.number = read.csv('test-alpha.csv')
cor(test.number, method='spearman')
               P3         P7     HP_tot
P3      1.0000000  0.4093108 -0.2502878
P7      0.4093108  1.0000000 -0.3807193
HP_tot -0.2502878 -0.3807193  1.0000000
dput(test.number)
structure(list(P3 = c(2L, 2L, 2L, 4L, 2L, 3L, 2L, 1L, 3L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L), P7 = structure(c(11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), .Label = c("0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o"), class = "factor"), HP_tot = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 15L, 15L, 15L, 15L, 15L, 15L, 15L)), .Names = c("P3", "P7", "HP_tot"), class = "data.frame", row.names = c(NA, -25L))
Correct output is obtained by avoiding matrix computation of the correlation:
cor(test.number$P3, test.number$HP_tot, method='spearman')
[1] -0.2182876
It seems that a text column corrupts my correlation calculation (only in a matrix calculation). I assumed that text columns would not influence the results of the calculations. Is this correct behaviour? If not, can I submit a bug report? If it is, is there a known workaround? cheers, Stephane Vaucher On Thu, 9 Sep 2010, Dennis Murphy wrote: Did you try taking out P7, which is text? Moreover, if you get a message saying 'the standard deviation is zero', it means that the entire column is constant. By definition, the covariance of a constant with a random variable is 0, but your data consists of values, so cor() understandably throws a warning that one or more of your columns are constant.
Applying the following to your data (which I named expd instead), we get
sapply(expd[, -12], var)
(per-column variance output truncated in the archive)
Re: [R] Highlighting a few bars in a barplot
Hello Daniel, something like this might work:
x <- runif(6)
marker1 <- rep(c("red", "blue"), 3)
marker2 <- c(rep(0, 5), 10)
barplot(x, col = marker1)
barplot(x, density = marker2, add = TRUE)
But I'd be interested if you learn about other solutions... -Heinrich.
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniel Brewer Sent: Thursday, 09 September 2010 16:03 To: r-h...@stat.math.ethz.ch Subject: [R] Highlighting a few bars in a barplot
Hello, I have a bar plot where I am already using colour to distinguish one set of samples from another. I would also like to highlight a few of these bars as ones that should be looked at in detail. I was thinking of using hatching, but I can't work out how, or whether you can have a background colour and hatching that differs between bars. Any suggestions on how I should do this? Thanks Dan -- ** Daniel Brewer, Ph.D. Institute of Cancer Research Molecular Carcinogenesis Email: daniel.bre...@icr.ac.uk ** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the a...{{dropped:2}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating with tolerances (error propagation)
On Sep 9, 2010, at 6:50 AM, Jan private wrote: Hello Bernardo, - If I understood your problem, this script solves it:
q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q*h
[1] 2.475 7.500 12.625
- OK, this solves the simple example. But what if the example is not that simple? E.g. P = 5 * q/h. Here, to get the maximum tolerances for P, we need to divide the maximum value for q by the minimum value for h, and vice versa.
Have you considered the division-by-zero problems?
Is there any way to do this automatically, without thinking about every single step? There is a thing called interval arithmetic (I saw it as an Octave package) which would do something like this.
(Sorry for the blank reply posting. Serum caffeine has not yet reached optimal levels.) Is it possible that interval arithmetic would produce statistically incorrect tolerance calculations, and could that be why it has not been added to R? Those tolerance intervals are presumably some sort of (unspecified) prediction intervals (i.e. they contain 95% or 63% or some fraction of a large sample), and their combinations under mathematical operations are not going to be properly derived by c(min(XY), max(XY)), since those are not calculated with any understanding of combining variances of functions of random variables. -- David.
I would have thought that tracking how a (measuring) error propagates through a complex calculation would be a standard problem of statistics?? In probability theory, anyway. In other words, I am looking for a data type which is a number with a deviation +- somehow attached to it, with binary operators that automatically know how to handle the deviation. There is a suite of packages that represent theoretical random variables and support mathematical operations on them. See distrDoc and the rest of that suite. -- David.
Thank you, Jan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
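David's point that worst-case bounds require looking at the whole input box, not just c(min, max) of one combination, can be illustrated with a corner sweep for P = 5*q/h (a sketch; it assumes P is monotone in each input over these intervals, which holds here since h stays well away from zero):

```r
# Worst-case bounds for P = 5*q/h: evaluate every corner of the
# tolerance box and take the range.
q <- 0.15 + c(-0.1, 0.1)       # interval for q
h <- 10   + c(-0.1, 0.1)       # interval for h
corners <- expand.grid(q = q, h = h)
P <- 5 * corners$q / corners$h
range(P)                       # worst-case interval for P
```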
Re: [R] Reproducible research
I have a little package I've been using to write template blog posts (in HTML) with embedded R code. It's quite small but very flexible and extensible, and aims to do something similar to Sweave and brew. In fact, the package is heavily influenced by the brew package, though implemented quite differently. It depends on the evaluate package, available in the CRAN. The tentatively titled 'markup' package is attached. After it's installed, see ?markup and the few examples in the inst/ directory, or just example(markup). -Matt On Thu, 2010-09-09 at 01:47 -0400, David Scott wrote: I am investigating some approaches to reproducible research. I need in the end to produce .html or .doc or .docx. I have used hwriter in the past but have had some problems with verbatim output from R. Tables are also not particularly convenient. I am interested in R2HTML and R2wd in particular, and possibly odfWeave. Does anyone have sample documents using any of these approaches which they could let me have? David Scott _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email:d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Failure to aggregate
Hi, you have to provide some more info about x, e.g. str(x). Here I used:
x <- data.frame(price=1, h=Sys.time())
r-help-boun...@r-project.org napsal dne 08.09.2010 10:18:52: Many thanks for suggestions. I am afraid this is one tough dataframe...
t = sqldf("select h, count(*) from x group by h")
Error in sqliteExecStatement(con, statement, bind.data) : RS-DBI driver: (error in statement: no such table: x) In addition: Warning message: In value[[3L]](cond) : RAW() can only be applied to a 'raw', not a 'double'
did not test
t = aggregate(x["price"], by = x["h"], FUN = NROW)
Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list?
works:
aggregate(x["price"], by = x["h"], FUN = NROW)
                    h price
1 2010-09-09 16:58:04     1
t = aggregate(x["price"], by = x["h"], FUN = length)
Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list?
works:
aggregate(x["price"], by = x["h"], FUN = length)
                    h price
1 2010-09-09 16:58:04     1
t = tapply(x$price, by = x$h, FUN = length)
Error in is.list(INDEX) : 'INDEX' is missing
works; use INDEX instead of by:
tapply(x$price, by = list(x$h), FUN = length)
Error in is.list(INDEX) : 'INDEX' is missing
tapply(x$price, x$h, FUN = length)
2010-09-09 16:58:04
                  1
Regards, Petr
class(x) [1] "data.frame" class(x$h) [1] "POSIXt" "POSIXlt" class(x$price) [1] "integer" -- View this message in context: http://r.789695.n4.nabble.com/Failure-to-aggregate-tp2528613p2530963.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating with tolerances (error propagation)
Jan private jrheinlaen...@gmx.de wrote in message news:1284029454.2740.361.ca...@localhost.localdomain... Hello Bernardo, - If I understood your problem, this script solves it:
q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q*h
[1] 2.475 7.500 12.625
- OK, this solves the simple example. But what if the example is not that simple? E.g. P = 5 * q/h. Here, to get the maximum tolerances for P, we need to divide the maximum value for q by the minimum value for h, and vice versa. Is there any way to do this automatically, without thinking about every single step? There is a thing called interval arithmetic (I saw it as an Octave package) which would do something like this. I would have thought that tracking how a (measuring) error propagates through a complex calculation would be a standard problem of statistics?? In other words, I am looking for a data type which is a number with a deviation +- somehow attached to it, with binary operators that automatically know how to handle the deviation. Thank you, Jan
Ahhh! "tracking how a (measuring) error propagates through a complex calculation" - that doesn't depend only on values+errors, it also depends on the calculations, so - as you imply - you'd have to define a new data type and appropriate methods for all the mathematical operators (not just the binary ones!). Not a trivial task! If you don't already know it, you should look at "Evaluation of measurement data - Guide to the expression of uncertainty in measurement" http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf especially section 5. Hope that helps, Keith J __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
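Section 5 of the GUM referenced above describes first-order (delta-method) propagation of uncertainty. A minimal sketch for P = 5*q/h, treating the stated tolerances as standard uncertainties (an assumption; the thread never says what the tolerances represent statistically):

```r
# First-order uncertainty propagation for P = 5*q/h:
# u(P)^2 = (dP/dq)^2 * u(q)^2 + (dP/dh)^2 * u(h)^2
q <- 0.15; u_q <- 0.1
h <- 10;   u_h <- 0.1
P   <- 5 * q / h
u_P <- sqrt((5 / h)^2 * u_q^2 + (5 * q / h^2)^2 * u_h^2)
c(P = P, u = u_P)
```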
Re: [R] Newbie cross tabulation issue
2010/9/8 David Winsemius dwinsem...@comcast.net: I hope you mean only two factors and an n x m table. Yes David, I meant to say "factor", but I am new here. -- Jonathan. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in normalizePath(path) : with McAfee
On Sep 9, 2010, at 13:52 , Duncan Murdoch wrote: On 09/09/2010 12:01 AM, Erin Hodgess wrote: Dear R People: I keep getting the Error in normalizePath(path) : while trying to obtain the necessary packages to use with the Applied Spatial Statistics with R book. I turned off the Firewall (from McAfee) but am still getting the same message. Does anyone have any idea on a solution please? I think you need to show us your code and the error in context. Maybe not... We have been here before, and if I remember correctly, the issue is not the firewall, but antivirus software messing with temp directories. (As I understand it, it goes I see you have created a new directory, now let me put that in a safe place while I check it for malware. Oh, you were still using it? Too bad. Sort of like grabbing someone's new cup while they are trying to pour milk into it) Check back issues of R-help for details of the earlier incidents. Duncan Murdoch sessionInfo() R version 2.11.1 (2010-05-31) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ctv_0.6-0 loaded via a namespace (and not attached): [1] tools_2.11.1 Thanks, Erin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with outer
Can you set the multinomial probability to zero for p1+p2+p3 != 1, if you have to use the multinomial distribution in guete()? Otherwise, I would say the problem/guete() itself is problematic. -- View this message in context: http://r.789695.n4.nabble.com/problem-with-outer-tp2532074p2533050.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie cross tabulation issue
On Sep 8, 2010, at 7:32 PM, Jonathan Finlay wrote: Thanks David, gmodels::CrossTable only partially works because it can show only a 1 x 1 table: CrossTable(x, y, ...). I need something that can process at least 1 variable in X and 10 in Y. A further thought (despite a lack of clarification on what your data situation really is): the strong tendency in R is not to attempt replication of SAS formats that were developed in an era of dot-matrix printers, but to target modern output devices. As such, most of the table output facilities with any degree of sophistication have LaTeX or HTML as targets. RSiteSearch("html tables") produces over 1000 links, although many of those are not for multiway tables where "multi" is greater than R x C. RSiteSearch("latex tables") produces many fewer. You may want to look at xtable, Sweave, odfWeave, the various HTML utilities, and Harrell's Hmisc::summary.formula -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Prediction confidence intervals for a Poisson GLM
I am following up on an old post. Please comment: it appears that predict(glm.model, type="response", se.fit=TRUE) will do all the conversions and give the se on the scale of the response. This only takes into account the error in parameter estimation. A prediction interval is usually meant to capture the error due to both parameter estimation and sampling variation, i.e. it encompasses the actual realizations. For a given parameter there is sampling variation, and that is not included in the output of predict. The discreteness of these models makes it quite difficult to estimate a percentile interval, though. For binary outcomes, I think it does not make sense. For Poisson and binomial (grouped binary) I think it is possible to get approximations at least, and this is what the original poster needed, I think. So, let's say we have plow and pup for an observation from predict, and size=100 for that observation. Then
predlow = qbinom(.025, 100, plow)
predup  = qbinom(.975, 100, pup)
will give the prediction bounds. This, I think, partly ignores possible overdispersion. Please suggest a better way of taking overdispersion into account (in the qbinom part). Thanks everybody. Stephen B. -- View this message in context: http://r.789695.n4.nabble.com/Prediction-confidence-intervals-for-a-Poisson-GLM-tp841577p2533070.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
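The two-stage construction Stephen describes (confidence bounds for the mean, then quantiles of the sampling distribution at those bounds) can be sketched for a Poisson GLM on simulated data; this is an illustration of the thread's idea, not an exact prediction interval, and it ignores overdispersion as noted:

```r
# Approximate prediction bounds for a Poisson GLM: compute a Wald CI
# for the mean on the link scale, then take Poisson quantiles at the
# back-transformed CI endpoints.
set.seed(1)
d   <- data.frame(x = rnorm(50))
d$y <- rpois(50, exp(0.5 + 0.3 * d$x))
fit <- glm(y ~ x, family = poisson, data = d)
pr  <- predict(fit, type = "link", se.fit = TRUE)
mu_lo <- exp(pr$fit - 1.96 * pr$se.fit)   # lower CI for the mean
mu_hi <- exp(pr$fit + 1.96 * pr$se.fit)   # upper CI for the mean
lower <- qpois(0.025, mu_lo)
upper <- qpois(0.975, mu_hi)
head(cbind(lower, upper))
```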
Re: [R] Failure to aggregate
g = head(x) dput(g) structure(list(price = c(500L, 500L, 501L, 501L, 500L, 501L), size = c(221000L, 2000L, 1000L, 13000L, 3000L, 3000L), src = c(R, R, R, R, R, R), t = structure(list(sec = c(24.133, 47.096, 12.139, 18.142, 10.721, 28.713), min = c(0L, 0L, 1L, 1L, 2L, 2L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), d = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L )), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), h = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), m = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 1L, 1L, 2L, 2L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L )), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), s = structure(list(sec = c(24, 47, 12, 18, 10, 28), min = c(0L, 0L, 1L, 1L, 2L, 2L), hour = c(9L, 9L, 9L, 9L, 
9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt))), .Names = c(price, size, src, t, d, h, m, s), row.names = c(NA, 6L), class = data.frame) n = sqldf(select distinct h, src, count(*) from g group by h, src) Loading required package: tcltk Loading Tcl/Tk interface ... done Error in sqliteExecStatement(con, statement, bind.data) : RS-DBI driver: (error in statement: no such table: g) In addition: Warning message: In value[[3L]](cond) : RAW() can only be applied to a 'raw', not a 'double' -- View this message in context: http://r.789695.n4.nabble.com/Failure-to-aggregate-tp2528613p2533051.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] regression function for categorical predictor data
Hi, thank you very much for the help. One more quick question: should my predictor variable be coded as a 'factor' when using either 'lm' or 'glm'? sincerely, karena -- View this message in context: http://r.789695.n4.nabble.com/regression-function-for-categorical-predictor-data-tp2532045p2533035.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a question about replacing the value in the data.frame
Thanks a lot! -- View this message in context: http://r.789695.n4.nabble.com/a-question-about-replacing-the-value-in-the-data-frame-tp2532010p2533036.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reproducible research
Well, the attachment was a dud. Try this: http://biostatmatt.com/R/markup_0.0.tar.gz -Matt On Thu, 2010-09-09 at 10:54 -0400, Matt Shotwell wrote: I have a little package I've been using to write template blog posts (in HTML) with embedded R code. It's quite small but very flexible and extensible, and aims to do something similar to Sweave and brew. In fact, the package is heavily influenced by the brew package, though implemented quite differently. It depends on the evaluate package, available in the CRAN. The tentatively titled 'markup' package is attached. After it's installed, see ?markup and the few examples in the inst/ directory, or just example(markup). -Matt On Thu, 2010-09-09 at 01:47 -0400, David Scott wrote: I am investigating some approaches to reproducible research. I need in the end to produce .html or .doc or .docx. I have used hwriter in the past but have had some problems with verbatim output from R. Tables are also not particularly convenient. I am interested in R2HTML and R2wd in particular, and possibly odfWeave. Does anyone have sample documents using any of these approaches which they could let me have? David Scott _ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142,NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Alignment of lines within barplot bars
Dear all, I have a barplot upon which I hope to superimpose horizontal lines extending across the width of each bar. I am able to partly achieve this through the following set of commands:
positions <- barplot(bar_values, col="grey")
par(new=TRUE)
plot(positions, horiz_values, col="red", pch="_", ylim=c(min(bar_values), max(bar_values)))
...however this results in small, off-centred lines which don't extend across the width of each bar. I've tried using 'cex' to increase the width, but of course this also increases the height of the line and results in it spanning a large range of y-axis values. I'm sure this shouldn't be too tricky to achieve, nor that uncommon a problem! It may be that I'm taking the wrong approach. Any help offered would be gratefully received. Many thanks, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
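One common approach (a sketch with made-up data, since the poster's data are not shown): barplot() returns the bar midpoints, and with the default width of 1 per bar, segments() can draw lines spanning the full bar width:

```r
# Horizontal lines across the full width of each bar, using the
# midpoints returned by barplot() (default bar width is 1, so the
# half-width is 0.5).
bar_values   <- c(3, 5, 2, 6)         # made-up data
horiz_values <- c(2.5, 4.0, 2.2, 5.5) # made-up line heights
positions <- barplot(bar_values, col = "grey")
segments(positions - 0.5, horiz_values,
         positions + 0.5, horiz_values, col = "red", lwd = 2)
```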
[R] Axis break with gap.plot()
Hi everyone. I'm trying to break the y axis on a plot. For instance, I have 2 series (points and a loess). Since the loess is a continuous set of points, it passes through the break section. However, with gap.plot I can't plot the loess because of this (I get the message "some values of y will not be displayed"). Here's my code:
library(plotrix)
# generate some data
x = seq(-pi, pi, 0.1)
sinx = sin(x)
# add leverage value
sinx = c(sinx, 10)
xx = c(x, max(x) + 0.1)
# Loess
yy = loess(sinx ~ xx, span = 0.1)
yy = predict(yy)
# Add break between 2 and 8
gap.plot(xx, sinx, c(2,8))            # This line works fine
gap.plot(xx, yy, c(2,8), add = TRUE)  # This won't plot the loess
I did the graphic I would like to produce in Sigmaplot: http://img830.imageshack.us/img830/5206/breakaxis.jpg Can it be done in R? With regards, Phil -- View this message in context: http://r.789695.n4.nabble.com/Axis-break-with-gap-plot-tp2533027p2533027.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scalable delimiters in plotmath
Dear list, I read in ?plotmath that I can use bgroup to draw scalable delimiters such as [ ] and ( ). The same technique fails with however, and I cannot find a workaround, grid.text(expression(bgroup(,atop(x,y),))) Error in bgroup(, atop(x, y), ) : invalid group delimiter Regards, baptiste sessionInfo() R version 2.11.1 (2010-05-31) x86_64-apple-darwin9.8.0 locale: [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] TeachingDemos_2.7 loaded via a namespace (and not attached): [1] tools_2.11.1 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] See what is inside a matrix
The image function will create a plot with the values transformed to colors. Or the View function (note the capital V) will let you look at it in a spreadsheet-like window with scrollbars. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alaios Sent: Thursday, September 09, 2010 1:23 AM To: Rhelp Subject: [R] See what is inside a matrix Hello everyone. Is there any graphical tool to help me see what is inside a matrix? I have a 100x100-dimension matrix, and as you already know, since it does not fit on my screen R splits it into pieces. I would like to thank you in advance for your help. Best Regards, Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
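Both suggestions in one minimal sketch, on a random 100 x 100 matrix standing in for the poster's data:

```r
# Inspect a large matrix graphically or in a spreadsheet-like viewer.
m <- matrix(rnorm(100 * 100), nrow = 100)
image(m)      # values mapped to colours
# View(m)     # scrollable data viewer (interactive sessions only)
```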
Re: [R] Failure to aggregate
I think your main problem is that you have your time as POSIXlt which is a multiple valued vector. I converted the 't' to POSIXct, removed the other POSIXlt value and created a 'h' as the character for the hour and it works fine: str(g) 'data.frame': 6 obs. of 5 variables: $ price: int 500 500 501 501 500 501 $ size : int 221000 2000 1000 13000 3000 3000 $ src : chr R R R R ... $ time : POSIXct, format: 2005-01-04 09:00:24 2005-01-04 09:00:47 2005-01-04 09:01:12 2005-01-04 09:01:18 ... $ h: chr 09 09 09 09 ... sqldf(select distinct h, src, count(*) from g group by h,src) h src count(*) 1 09 R6 On Thu, Sep 9, 2010 at 11:16 AM, Dimitri Shvorob dimitri.shvo...@gmail.com wrote: g = head(x) dput(g) structure(list(price = c(500L, 500L, 501L, 501L, 500L, 501L), size = c(221000L, 2000L, 1000L, 13000L, 3000L, 3000L), src = c(R, R, R, R, R, R), t = structure(list(sec = c(24.133, 47.096, 12.139, 18.142, 10.721, 28.713), min = c(0L, 0L, 1L, 1L, 2L, 2L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), d = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L, 0L, 0L, 0L, 0L, 0L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L )), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), h = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 
3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), m = structure(list(sec = c(0, 0, 0, 0, 0, 0), min = c(0L, 0L, 1L, 1L, 2L, 2L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L )), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt)), s = structure(list(sec = c(24, 47, 12, 18, 10, 28), min = c(0L, 0L, 1L, 1L, 2L, 2L), hour = c(9L, 9L, 9L, 9L, 9L, 9L), mday = c(4L, 4L, 4L, 4L, 4L, 4L), mon = c(0L, 0L, 0L, 0L, 0L, 0L), year = c(105L, 105L, 105L, 105L, 105L, 105L), wday = c(2L, 2L, 2L, 2L, 2L, 2L), yday = c(3L, 3L, 3L, 3L, 3L, 3L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt))), .Names = c(price, size, src, t, d, h, m, s), row.names = c(NA, 6L), class = data.frame) n = sqldf(select distinct h, src, count(*) from g group by h, src) Loading required package: tcltk Loading Tcl/Tk interface ... done Error in sqliteExecStatement(con, statement, bind.data) : RS-DBI driver: (error in statement: no such table: g) In addition: Warning message: In value[[3L]](cond) : RAW() can only be applied to a 'raw', not a 'double' -- View this message in context: http://r.789695.n4.nabble.com/Failure-to-aggregate-tp2528613p2533051.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
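The conversion Jim describes above, replacing the POSIXlt columns with POSIXct plus a character hour, can be sketched with made-up data (the original frame is abbreviated here):

```r
# POSIXlt is internally a list of components (sec, min, hour, ...),
# which confuses data.frame tools such as sqldf and aggregate;
# convert to POSIXct and extract the hour as character first.
x <- data.frame(price = c(500L, 501L, 500L),
                size  = c(1000L, 2000L, 3000L),
                src   = "R",
                t = as.POSIXct(c("2005-01-04 09:00:24",
                                 "2005-01-04 09:01:12",
                                 "2005-01-04 09:02:10")))
x$h <- format(x$t, "%H")
aggregate(x["price"], by = x["h"], FUN = length)
```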
[R] Bug on chron
hello, I think I've found a bug - I don't know if it's a chron bug or an R one. (05/12/05 23:00:00) + 1/24 gives (05/12/05 24:00:00) instead of (05/13/05 00:00:00). It looks the same, but it's not, because when you ask for the date of this datetime it says day 12 instead of 13. Please forward it to the place where these bugs are supposed to be posted. cheers -- View this message in context: http://r.789695.n4.nabble.com/Bug-on-chron-tp2533135p2533135.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rgl and lighting
Dear R community (and Duncan more specifically), I can't work out how to make additional light sources work in rgl. Here is the example. First I create a cube and visualize it:
cubo <- cube3d(col="black")
shade3d(cubo)
Next I position the viewpoint at theta=0 and phi=30:
view3d(theta=0, phi=30)
Next, I want to create a 2nd light source which diffuses red light from the front face. I thought I could do:
light3d(diffuse="red", theta=0, phi=0)
but... the front side doesn't show any red-ness. Same goes for specular and ambient. What am I doing wrong here? How should the front side be shown in red colour? J Dr James Foadi PhD Membrane Protein Laboratory (MPL) Diamond Light Source Ltd Diamond House Harewell Science and Innovation Campus Chilton, Didcot Oxfordshire OX11 0DE Email: james.fo...@diamond.ac.uk Alt Email: j.fo...@imperial.ac.uk -- This e-mail and any attachments may contain confidential...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] coxph and ordinal variables?
On Wed, 8 Sep 2010, Paul Johnson wrote: run it with factor() instead of ordered(). You don't want the orthogonal polynomial contrasts that result from ordered() if you need to compare against Stata. If you don't want polynomial contrasts for ordered factors, you can just tell R not to use them:

options(contrasts = c("contr.treatment", "contr.treatment"))

It's like the Good Old Days when you had to use options() to tell S-PLUS not to use Helmert contrasts. -thomas Thomas Lumley, Professor of Biostatistics, University of Washington, Seattle
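The difference is easy to see from the model matrices. A minimal sketch (the toy factor `o` and column names are illustrative; names follow the usual model.matrix conventions):

```r
# Compare default contrasts for unordered vs ordered factors
f <- factor(c("a", "b", "c"))
o <- ordered(c("a", "b", "c"))

colnames(model.matrix(~ f))  # "(Intercept)" "fb" "fc"   -- treatment contrasts
colnames(model.matrix(~ o))  # "(Intercept)" "o.L" "o.Q" -- polynomial contrasts

# Tell R to use treatment contrasts for ordered factors too
options(contrasts = c("contr.treatment", "contr.treatment"))
colnames(model.matrix(~ o))  # "(Intercept)" "ob" "oc"
```

With the option set, coefficients for the ordered predictor are level-vs-baseline comparisons, matching Stata's default coding.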
Re: [R] Saving/loading custom R scripts
On Thu, Sep 9, 2010 at 7:05 AM, Bos, Roger roger@rothschild.com wrote: Josh, I liked your idea of setting the repo in the .Rprofile file, so I tried it:

r <- getOption("repos")
r["CRAN"] <- "http://cran.stat.ucla.edu"
options(repos = r)
rm(r)

And now when I open R I get an error: Error in r["CRAN"] <- "http://cran.stat.ucla.edu" : cannot do complex assignments in base namespace. I have been using that for several months now. I use a text editor to create ~/.Rprofile (where ~ represents the path to my working directory), and add those four lines of code. I don't know why it would not work for you, and I cannot replicate the error myself, so it is hard to offer any suggestions. I am using R 2.11.1 patched on Windows. Thanks, Roger -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Joshua Wiley Sent: Wednesday, September 08, 2010 11:20 AM To: DrCJones Cc: r-help@r-project.org Subject: Re: [R] Saving/loading custom R scripts Hi, Just create a file called .Rprofile located in your working directory (this means you could actually have different ones in each working directory). In that file, you can put code just like any other code that would be source()d in. For instance, all my .Rprofile files start with:

r <- getOption("repos")
r["CRAN"] <- "http://cran.stat.ucla.edu"
options(repos = r)
rm(r)

so that I do not have to pick my CRAN mirror. Similarly, you could merely add this line to the file:

source(file = "http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt")

and R would go online, download that file and source it in (not that I am recommending re-downloading every time you start R). Then whatever names they used to define the functions would be in your workspace. Note that in general you will not get any output alerting you that it has worked; however, if you type ls() you should see those functions' names.
Cheers, Josh On Wed, Sep 8, 2010 at 12:25 AM, DrCJones matthias.godd...@gmail.com wrote: Hi, How does R automatically load functions so that they are available from the workspace? Is it anything like Matlab, where you just specify a directory path and it finds them? The reason I ask is that I found a really nice script that I would like to use on a regular basis, and it would be nice not to have to copy and paste it into R on every startup: http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt This would be for Ubuntu, if that makes any difference. Cheers
-- Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles, http://www.joshuawiley.com/
Re: [R] Bug on chron
On Thu, Sep 9, 2010 at 11:59 AM, skan juanp...@gmail.com wrote: hello I think I've found a bug... (05/12/05 23:00:00) + 1/24 gives (05/12/05 24:00:00) instead of (05/13/05 00:00:00)... it says day 12 instead of 13. I can't reproduce such behavior:

library(chron)
x <- chron("05/12/05", "23:00:00") + 1/24; x
[1] (05/13/05 00:00:00)
month.day.year(x)$day
[1] 13
packageDescription("chron")$Version
[1] "2.3-36"
R.version.string
[1] "R version 2.11.1 Patched (2010-05-31 r52167)"
win.version()
[1] "Windows Vista (build 6002) Service Pack 2"

-- Statistics Software Consulting, GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] regression function for categorical predictor data
Hi, If your predictor variable is categorical, then it should be converted to a factor. If it is continuous or being treated as such, you do not need to. It is generally quite easy to do:

varname <- factor(varname)

or, if it is in a data frame:

yourdf$varname <- factor(yourdf$varname)

Cheers, Josh On Thu, Sep 9, 2010 at 8:09 AM, karena dr.jz...@gmail.com wrote: Hi, thank you very much for the help. One more quick question: should my predictor variable be coded as 'factor' when using either 'lm' or 'glm'? sincerely, karena -- Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles, http://www.joshuawiley.com/
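As a small illustration (hypothetical data), note how the conversion changes what lm() fits:

```r
# Hypothetical data: a 3-level group coded numerically as 1, 2, 3
set.seed(1)
d <- data.frame(group = rep(1:3, each = 10), y = rnorm(30))

fit.num <- lm(y ~ group, data = d)  # one slope: group treated as numeric
d$group <- factor(d$group)          # convert to a factor
fit.fac <- lm(y ~ group, data = d)  # two dummy contrasts: group2, group3

length(coef(fit.num))  # 2: intercept + slope
length(coef(fit.fac))  # 3: intercept + two level contrasts
```

Left numeric, the 1/2/3 codes impose a linear trend; as a factor, each level gets its own mean.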
[R] Uncertainty analysis
Dear all, I would like to run an uncertainty/sensitivity analysis in R; I know that these two are performed together. I have a geochemical model where I have the inputs, the water variables (e.g. pH, temperature, oxygen, etc.), as well as an output of different variables. What I would like to do is estimate the uncertainty that the output variables have, considering the uncertainty of the input variables, as well as which variations of my inputs contribute most to the variations of my output (probably through the sensitivity analysis). I was thinking perhaps of Monte Carlo analysis. Is there a way to do that? Thanks a lot. Maria
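A minimal Monte Carlo sketch in base R: the model function `f`, the input means, and the standard deviations below are all hypothetical stand-ins; a real analysis would use the actual geochemical model and a proper sampling design:

```r
# Propagate input uncertainty through a hypothetical model by simulation
set.seed(42)
n <- 10000
f <- function(pH, temp) 2 * pH + 0.5 * temp  # stand-in for the real model

pH   <- rnorm(n, mean = 7,  sd = 0.2)  # assumed uncertainty on pH
temp <- rnorm(n, mean = 15, sd = 1)    # assumed uncertainty on temperature
out  <- f(pH, temp)

c(mean = mean(out), sd = sd(out))               # propagated output uncertainty
cor(cbind(pH, temp), out, method = "spearman")  # crude sensitivity ranking
```

The rank correlations give a first idea of which input drives the output most; dedicated designs (e.g. variance-based indices) refine this.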
Re: [R] Saving/loading custom R scripts
On Thu, Sep 9, 2010 at 1:14 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: On Thu, Sep 9, 2010 at 7:05 AM, Bos, Roger roger@rothschild.com wrote: Josh, I liked your idea of setting the repo in the .Rprofile file, so I tried it:

r <- getOption("repos")
r["CRAN"] <- "http://cran.stat.ucla.edu"
options(repos = r)
rm(r)

I couldn't understand why to use four lines of code... You could try this:

options(repos = "http://cran.stat.ucla.edu")
Re: [R] confidence intervals around p-values
One other case where a confidence interval on a p-value may make sense is permutation (or other resampling) tests. The population parameter p-value would be the p-value that would be obtained from the distribution of all possible permutations, but in practice we just sample from that population and estimate a p-value. The confidence interval would then be based on the number of sample permutations and could give an idea of whether that number was big enough. If the full confidence interval is less than alpha, then you can be confident that the true p-value would give significance; if it is completely above alpha, then it is not significant. The real problem comes when the confidence interval includes alpha; that would indicate that B (the number of resamples/permutations) was not large enough. Be careful: doing a small number of permutations and then deciding to do more based on the CI would likely introduce bias (how much is another question). The nice thing is that in this case the p-value is a simple proportion, and the confidence interval can be computed using binom.test. But I fully agree that in most cases the idea of a CI for a p-value is not meaningful; you need some case where your p-value is an estimate of a population parameter p-value that has some meaning. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ted Harding Sent: Thursday, September 09, 2010 8:25 AM To: r-help@r-project.org Cc: Fernando Marmolejo Ramos Subject: Re: [R] confidence intervals around p-values On 09-Sep-10 13:21:07, Duncan Murdoch wrote: On 09/09/2010 6:44 AM, Fernando Marmolejo Ramos wrote: Dear all, I wonder if anyone has heard of confidence intervals around p-values... That doesn't really make sense. p-values are statistics, not parameters.
You would compute a confidence interval around a population mean because that's a parameter, but you wouldn't compute a confidence interval around the sample mean: you've observed it exactly. Duncan Murdoch Duncan has succinctly stated the essential point in the standard interpretation. The P-value is calculated from the sample in hand, a definite null hypothesis, and the distribution of the test statistic given the null hypothesis, so (given all of these) there is no scope for any other answer. However, there are circumstances in which the notion of a confidence interval for a P-value makes some sense. One such might be the Mann-Whitney test for identity of distribution of two samples of continuous variables, where (because of discretisation of the values when they were recorded) there are ties. Then you know in theory that the underlying values are all different, but because you don't know where these lie in the discretisation intervals you don't know which way a tie may split. So it would make sense to simulate by splitting ties at random (e.g. uniformly distribute each 1.5 value over the interval (1.5,1.6) or (1.45,1.55)). For each such simulated tie-broken sample, calculate the P-value. Then you get a distribution of exact P-values calculated from samples without ties which are consistent with the recorded data. The central 95% of this distribution could be interpreted as a 95% confidence interval for the true P-value. To bring this closer to on-topic, here is an example in R (rounding to intervals of 0.2):

set.seed(51324)
X <- sort(2*round(0.5*rnorm(12),1))
Y <- sort(2*round(0.5*rnorm(12)+0.25,1))
rbind(X,Y)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
# X -1.8 -1.2 -0.8 -0.6 0.00 0.2 0.2 1.2 1.8 2 2.2
# Y -1.2 -0.4 -0.2 0.4 0.41 1.0 1.0 1.2 1.8 2 2.6
# So several ties (-1.2, 1.2, 1.8, 2.0), as well as 0.0, 0.4, 1.0,
# which don't matter.
wilcox.test(X, Y, alternative="less", exact=TRUE, correct=FALSE)
# data: X and Y W = 54, p-value = 0.1488

Ps <- numeric(1000)
for(i in (1:1000)){
  Xr <- (X-0.1) + 0.2*runif(12)
  Yr <- (Y-0.1) + 0.2*runif(12)
  Ps[i] <- wilcox.test(Xr, Yr, alternative="less",
                       exact=TRUE, correct=FALSE)$p.value
}
hist(Ps)
table(round(Ps,4))
# 0.1328 0.1457 0.1593 0.1737 0.1888
#     81    267    336    226     90

So this gives you a picture of the uncertainty in the P-value (0.1488, calculated from the rounded data) relative to what it really should have been (if calculated from unrounded data). Since each possible true (tie-broken) sample can be viewed as a hypothesis about unobserved truth, it does make a certain sense to view these results as a kind of confidence distribution for the P-value you should have got. However, this is more of a Bayesian argument, since the above calculation has assigned equal prior probability to the tie-breaks! One could also, I suppose, consider
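The binomial CI Greg mentions above can be computed directly. A small sketch with hypothetical counts, assuming k of B resampled statistics were at least as extreme as the observed one:

```r
# Hypothetical: 149 of B = 1000 permutation statistics as extreme as observed
B <- 1000
k <- 149
k / B                            # estimated permutation p-value: 0.149
ci <- binom.test(k, B)$conf.int  # exact 95% CI for the true p-value
ci

# If both CI endpoints sit below alpha the result is clearly significant;
# if the CI straddles alpha, B was not large enough.
```

binom.test uses the Clopper-Pearson ("exact") interval, so it is conservative but never relies on a normal approximation.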
Re: [R] Bug on chron
Something strange. Your example works, but... I have a zoo object. I extract its element 21:

index(test[21])
[1] (05/12/05 23:00:00)
index(test[21]) + 1/24
[1] (05/12/05 24:00:00)

Why 24:00?

packageDescription("chron")$Version
[1] "2.3-35"
R.version.string
[1] "R version 2.11.1 (2010-05-31)"
packageDescription("zoo")$Version
[1] "1.7-0"

cheers
Re: [R] Correlation question
Hi Stephane, When I use your sample data (e.g., test, test.number), cor() throws an error that x must be numeric (because of the factor or character data). Are you not getting any errors when trying to calculate the correlation on these data? If you are not, I wonder what version of R you are using? The quickest way to find out is sessionInfo(). As for a workaround, it would be relatively simple to find out which columns of your data frame are not numeric or integer and exclude those (I'm happy to provide that code if you want). Best regards, Josh On Thu, Sep 9, 2010 at 7:50 AM, Stephane Vaucher vauch...@iro.umontreal.ca wrote: Thank you Dennis, You identified a factor (text column) that I was concerned with. I simplified my example to try to factor out possible causes. I eliminated the recurring values in columns (which were not the columns that caused problems). I produced three examples with simple data sets. 1. Correct output, 2 columns only:

test.notext = read.csv('test-notext.csv')
cor(test.notext, method='spearman')
               P3     HP_tot
P3      1.0000000 -0.2182876
HP_tot -0.2182876  1.0000000

dput(test.notext)
structure(list(P3 = c(2L, 2L, 2L, 4L, 2L, 3L, 2L, 1L, 3L, 2L,
2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L),
HP_tot = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 136L, 136L,
136L, 136L, 136L, 136L, 136L, 136L, 136L, 136L, 15L, 15L, 15L,
15L, 15L, 15L, 15L)), .Names = c("P3", "HP_tot"),
class = "data.frame", row.names = c(NA, -25L))

2.
Incorrect output where I introduced my P7 column containing only the text character 'a':

test = read.csv('test.csv')
cor(test, method='spearman')
               P3 P7     HP_tot
P3      1.0000000 NA -0.2502878
P7             NA  1         NA
HP_tot -0.2502878 NA  1.0000000
Warning message:
In cor(test, method = "spearman") : the standard deviation is zero

dput(test)
structure(list(P3 = c(2L, 2L, 2L, 4L, 2L, 3L, 2L, 1L, 3L, 2L,
2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L),
P7 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
.Label = "a", class = "factor"), HP_tot = c(10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 136L, 136L, 136L, 136L, 136L, 136L, 136L,
136L, 136L, 136L, 15L, 15L, 15L, 15L, 15L, 15L, 15L)),
.Names = c("P3", "P7", "HP_tot"), class = "data.frame",
row.names = c(NA, -25L))

3. Incorrect output with P7 containing a variety of alphanumeric (ASCII) characters, to factor out the equal-valued column issue. Notice that the text column is interpreted as a numeric value:

test.number = read.csv('test-alpha.csv')
cor(test.number, method='spearman')
               P3         P7     HP_tot
P3      1.0000000  0.4093108 -0.2502878
P7      0.4093108  1.0000000 -0.3807193
HP_tot -0.2502878 -0.3807193  1.0000000

dput(test.number)
structure(list(P3 = c(2L, 2L, 2L, 4L, 2L, 3L, 2L, 1L, 3L, 2L,
2L, 2L, 3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L),
P7 = structure(c(11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L,
20L, 21L, 22L, 23L, 24L, 25L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L), .Label = c("0", "1", "2", "3", "4", "5", "6", "7",
"8", "9", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
"k", "l", "m", "n", "o"), class = "factor"), HP_tot = c(10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 136L, 136L, 136L, 136L,
136L, 136L, 136L, 136L, 136L, 136L, 15L, 15L, 15L, 15L, 15L,
15L, 15L)), .Names = c("P3", "P7", "HP_tot"),
class = "data.frame", row.names = c(NA, -25L))

Correct output is obtained by avoiding matrix computation of the correlation:

cor(test.number$P3, test.number$HP_tot, method='spearman')
[1] -0.2182876

It seems that a text column corrupts my correlation calculation (only
in a matrix calculation). I assumed that text columns would not influence the result of the calculations. Is this correct behaviour? If not, can I submit a bug report? If it is, is there a known workaround? cheers, Stephane Vaucher On Thu, 9 Sep 2010, Dennis Murphy wrote: Did you try taking out P7, which is text? Moreover, if you get a message saying 'the standard deviation is zero', it means that the entire column is constant. By definition, the covariance of a constant with a random variable is 0, but your data consists of values, so cor() understandably throws a warning that one or more of your columns are constant. Applying the following to your data (which I named expd instead), we get sapply(expd[, -12], var) P1 P2 P3 P4 P5 P6 5.43e-01 1.08e+00 5.77e-01 1.08e+00 6.43e-01 5.57e-01 P8 P9 P10 P11 P12 SITE 5.73e-01 3.19e+00 5.07e-01 2.50e-01 5.50e+00 2.49e+00 Errors warnings Manual Total H_tot HP1.1 9.072840e+03 2.081334e+04 7.43e-01 3.823500e+04 3.880250e+03 2.676667e+00 HP1.2 HP1.3 HP1.4 HP_tot HO1.1
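The workaround Josh alludes to, dropping non-numeric columns before calling cor(), can be sketched as follows (using a toy data frame in place of the poster's CSV files):

```r
# Toy data frame with one factor column, mimicking the poster's 'test'
test <- data.frame(P3     = c(2, 2, 4, 1, 3),
                   P7     = factor(rep("a", 5)),
                   HP_tot = c(10, 10, 136, 15, 15))

num.cols <- sapply(test, is.numeric)        # FALSE for the factor column
cor(test[, num.cols], method = "spearman")  # numeric columns only
```

This avoids both the NA-filled rows and the "standard deviation is zero" warning, since the constant factor column never reaches cor().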
Re: [R] Saving/loading custom R scripts
On Thu, Sep 9, 2010 at 9:28 AM, Jakson A. Aquino jaksonaqu...@gmail.com wrote: I couldn't understand why to use four lines of code... You could try this:

options(repos = "http://cran.stat.ucla.edu")

You can have more than one repository; using options(repos = url) will overwrite all of them. For instance, I believe it is standard on Windows to have CRAN and CRANextra. The one-line option probably would be fine often. In any case, reading through the documentation, the code used there is:

local({r <- getOption("repos"); r["CRAN"] <- "http://my.local.cran"; options(repos = r)})

Perhaps wrapping it in local() will take care of your problem, Roger.
Re: [R] Bug on chron
Could this be a case of FAQ 7.31, where rounding error means that you are seeing a time that is slightly before midnight (but printing shows it as midnight)? -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of skan Sent: Thursday, September 09, 2010 10:00 AM To: r-help@r-project.org Subject: [R] Bug on chron hello I think I've found a bug... (05/12/05 23:00:00) + 1/24 gives (05/12/05 24:00:00) instead of (05/13/05 00:00:00)... it says day 12 instead of 13.
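The FAQ 7.31 effect is easy to demonstrate in base R. chron stores a datetime as a fraction of a day, and most such fractions are inexact in binary floating point; the sketch below uses generic floating-point examples rather than chron itself:

```r
# Binary floating point cannot represent most decimal fractions exactly
x <- 0.1 + 0.2
x == 0.3               # FALSE
print(x, digits = 17)  # not exactly 0.3

# A time stored as a fraction of a day can land a hair short of midnight,
# so truncation keeps it on the previous day even though it prints as 24:00
trunc(1 - 1e-12)       # 0, not 1
```

This is why the zoo-indexed time can print as 24:00:00 on day 12: the stored value is infinitesimally below the next midnight.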
Re: [R] Bug on chron
I don't know. You can look at the file; it's very short. http://r.789695.n4.nabble.com/file/n2533223/test test
Re: [R] optimized value worse than starting Value
Barry Rowlingson b.rowlingson at lancaster.ac.uk writes: On Wed, Sep 8, 2010 at 1:35 PM, Michael Bernsteiner dethlef1 at hotmail.com wrote: Dear all, I'm optimizing a relatively simple function. Using optimize, the optimized parameter value is worse than the starting one. Why? I would like to stress here that finding a global minimum is not as much sorcery as this thread seems to suggest. A widely accepted procedure to provably identify a global minimum goes roughly as follows (see Chapt. 4 in [1]):

- Make sure the global minimum does not lie 'infinitely' far out.
- Provide estimates for the derivatives/gradients.
- Define a grid fine enough to capture or exclude minima.
- Search grid cells coming into consideration and compare.

This can be applied to two- and higher-dimensional problems, but of course may require enormous effort. In science and engineering applications it is at times necessary to really execute this approach. Hans Werner [1] F. Bornemann et al., The SIAM 100-Digit Challenge, 2004, p. 79: "In fact, a slightly finer grid search will succeed in locating the proper minimum; several teams used such a search together with estimates based on the partial derivatives of f to show that the search was fine enough to guarantee capture of the answer." This looks familiar. Is this some 1-d version of the Rosenbrock Banana Function? http://en.wikipedia.org/wiki/Rosenbrock_function It's designed to be hard to find the minimum. In the real world one would hope that things would not have such pathological behaviour. Numerical optimisations are best done using as many methods as possible; see optimise, nlm, optim, nlminb, and the whole shelf of library books devoted to it. Barry
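For a one-dimensional problem, the grid idea above is cheap to apply: run optimize() over each cell of a coarse grid and keep the best result. A minimal sketch on a made-up function with two local minima (the function and grid are illustrative, not from the thread):

```r
# A polynomial with two local minima; the global one is near x = -1
f <- function(x) (x + 1)^2 * (x - 2)^2 + x

# Search each cell of a coarse grid with optimize() and keep the best
grid <- seq(-3, 3, by = 1)
fits <- lapply(seq_len(length(grid) - 1), function(i)
  optimize(f, lower = grid[i], upper = grid[i + 1]))
best <- fits[[which.min(sapply(fits, `[[`, "objective"))]]
best$minimum   # close to -1, the global minimizer
```

A single optimize() call over the whole interval may settle into the shallower basin near x = 2; searching the cells separately guards against that, at the cost of a few extra calls.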
Re: [R] confidence intervals around p-values
A confidence interval around the p-value makes no sense because there is no parameter being estimated, but the sampling distribution of the p-value makes a lot of sense. The pre-observational P-value is a random variable that is a function of the underlying random variable being tested. That is, P_X(t) = Pr(X > t) is itself a random variable with a density, distribution, and moments. Thus, one can compute the 95% sampling distribution around the expectation of P. See: Hung, H. M. J.; O'Neill, R. T.; Bauer, P.; Kohne, K. The behavior of the P-value when the alternative hypothesis is true. Biometrics, 1997, 53, 1-22. Donahue, R. M. J. A note on information seldom reported via the p value. The American Statistician, 1999, 53, 303-306. Greg Snow greg.s...@imail.org wrote on 09/09/2010 12:29 PM: One other case where a confidence interval on a p-value may make sense is permutation (or other resampling) tests. The population parameter p-value would be the p-value that would be obtained from the distribution of all possible permutations, but in practice we just sample from that population and estimate a p-value. The confidence interval would then be based on the number of sample permutations and could give an idea of whether that number was big enough. If the full confidence interval is less than alpha, then you can be confident that the true p-value would give significance; if it is completely above alpha, then it is not significant. The real problem comes when the confidence interval includes alpha; that would indicate that B (the number of resamples/permutations) was not large enough.
Re: [R] rgl and lighting
On 09/09/2010 12:02 PM, james.fo...@diamond.ac.uk wrote: Dear R community (and Duncan more specifically), I can't work out how to make additional light sources work in rgl. Here is the example. First I create a cube and visualize it: cubo <- cube3d(col="black"); shade3d(cubo). Next I position the viewpoint at theta=0 and phi=30: view3d(theta=0, phi=30). Next, I want to create a second light source which diffuses red light from the front face. I thought I could do: light3d(diffuse="red", theta=0, phi=0) but... the front side doesn't show any redness. The same goes for specular and ambient. What am I doing wrong here? How should the front side show in red colour? Black doesn't reflect anything, so that's why you're not seeing the red. Colour the cube white, and you'll see it turn pink when you turn the red light on, or red if you turn off the default light first (using rgl.pop("lights")). Be aware that OpenGL (underlying rgl) has a fairly complicated lighting model. When you say col="black", you're only setting the ambient colour, i.e. the colour that appears the same in all directions. (It won't be the same on all faces of the cube, because the intensity depends on the incoming light.) There is also a specular component, which makes things appear shiny, because it's brighter from some viewpoints than others. It is normally white. Finally, there's an emission component, which doesn't care about lighting, but is normally turned off. Lights also have 3 components: ambient (non-directional), diffuse (somewhat directional), and specular (highly directional).
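Putting that advice together, a sketch of the corrected example (untested here; rgl needs an interactive OpenGL device, and the rgl.pop("lights") call assumes the default light is the only one on the light stack):

```r
library(rgl)  # requires an interactive OpenGL display

cubo <- cube3d(col = "white")  # white reflects the coloured light
shade3d(cubo)
view3d(theta = 0, phi = 30)

rgl.pop("lights")                             # remove the default white light
light3d(diffuse = "red", theta = 0, phi = 0)  # red light from the front
```

With the default light left in place, the front face appears pink (white plus red) instead of pure red.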
Duncan Murdoch