Re: [R] Problems with graphical devices, e.g., png(), pdf(): blurry graphical output
Your PDF problems indicate a broken viewer. How were you viewing PDF?

You have also not told us how you view PNG, but you would expect anti-aliased output to be blurry when viewed at 100% (or more). You need to be careful not to have anti-aliasing turned on in the viewer as well as in the file producer.

Note that R does give you lots of options to tune the output to your intended use of it, so you have no cause to complain if you use an inappropriate set.

Do remember that you cannot say 'the graphical output is blurry': it is just a binary file. Far too often, useRs blame R for issues in the viewers they use.

On Tue, 16 Dec 2008, Y-H Chen wrote:

> On my current home system, I am getting undesirable output from graphical devices such as png() and pdf(). The graphical output is blurry. I haven't experienced the problem on other systems. As you will see from the attached text file (more information on this file below), the problem does not occur when type='Xlib' is forced. The blurriness is more severe with bitmap output (yes, I am viewing the bitmap files at 100%), but occurs with pdf output as well.
>
> Software details: Fedora 10, with at least the following packages:
> -- R, R-core, R-devel
> -- cairo, cairo-devel
> -- pixman, pixman-devel
> -- libpng, libpng-devel
> -- poppler
> Everything is current and updated via Fedora's repository. R was installed via Fedora's repository.
>
> I've attached some commands and output in a text file. This file includes:
> (1) hardware information
> (2) information about my R installation
> (3) code for simple R graphics, with comments re output, plus URLs for the corresponding graphical output
>
> Any advice would be really appreciated.

-- 
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Construct All Possible Strings from 4 Bases (ATCG)
Dear all,

Is there an efficient way in R to construct all strings from 4 bases (ATCG)? For a string of length L there are 4^L possible strings. E.g., with L = 2 we have AA, AT, AC, AG, ..., GC, GA, GT, GG, i.e. 4^2 = 16 strings; with L = 3 we have 4^3 = 64 strings.

- Gundala Viswanath
Jakarta - Indonesia
Re: [R] Help - java.library.path
Stefan Gengenbach wrote:

> Hello all,
>
> I'm a computer science student from Frankfurt/Main (Germany). Our student team (10 students) is working on an R interface project (Java and R) and we have a problem with R and JRI. We are working on 10 computers with Windows XP SP2, R 2.8.0, jdk1.6.0_11 and the same classpath settings, but our application works on only two computers. We cannot explain this.
>
> This is the error message:
>
> Cannot find JRI native library!
> Please make sure that the JRI native library is in a directory listed in java.library.path.
> java.lang.UnsatisfiedLinkError: no jri in java.library.path
>     at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1682)
>     at java.lang.Runtime.loadLibrary0(Runtime.java:823)
>     at java.lang.System.loadLibrary(System.java:1030)
>     at org.rosuda.JRI.Rengine.<clinit>(Rengine.java:9)
>     at net.datenanalyse.iedae.rinterface.RInterface.<init>(RInterface.java:20)
>     at net.datenanalyse.iedae.rinterface.Start.main(Start.java:9)
> Java Result: 1
>
> OK, we know that we have some .dll reference problems. As a first step we copied all necessary .dlls into the system32 folder and checked that all reference problems were fixed. No change. After this step we set the classpaths to \Java\jdk1.6.0_11\bin and \R\R-2.8.0\bin. No change, the error message is the same.
>
> Do you know of an FAQ, an idea, or a guide for installing JRI? Sorry for my bad English, but I hope you understand me.
>
> Thank you, Stefan

Not the solution, but a few hints. The library path is not the same as the classpath. You can get the library path with:

System.out.println("JLP = " + System.getProperty("java.library.path"));

Maybe have a look at java.lang.System, especially load, loadLibrary and setProperty.

Greetings,
Christian
Re: [R] Construct All Possible Strings from 4 Bases (ATCG)
Gundala,

f <- function(n){expand.grid(rep(list(seq_len(4)),n))}

HTH
Robin

-- 
Robin K. S. Hankin
Uncertainty Analyst
University of Cambridge
19 Silver Street
Cambridge CB3 9EP
01223-764877
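The expand.grid() idea above returns a data frame of integer codes rather than strings. A minimal sketch of one way to get the actual strings, assuming the alphabet order A, T, C, G from the question (the function name all_strings is made up for illustration):

```r
# All length-L strings over a small alphabet (4^L of them for four bases).
all_strings <- function(L, bases = c("A", "T", "C", "G")) {
  grid <- expand.grid(rep(list(bases), L), stringsAsFactors = FALSE)
  # expand.grid() varies the first column fastest; reversing the columns
  # makes the last position vary fastest (AA, AT, AC, AG, TA, ...).
  do.call(paste0, unname(as.list(rev(grid))))
}

kmers <- all_strings(2)
length(kmers)
head(kmers, 4)
```

Memory grows as 4^L, so this is only practical for modest L.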
[R] repeated measures aov with weights
Dear R-help,

I'm facing a problem with defining a repeated measures anova with weighted data. Here's the code to reproduce the problem:

# generate some data
seed=11
rtrep <- data.frame(rt=rnorm(100), ti=rep(1:5,20), subj=gl(20,5,100), we=runif(100))

# model with within factor for subjects/repeated measurements, no problem
aov(rt~ti + Error(subj/ti), data=rtrep)

# model with weights and subj as between factor, ie ignoring repeated measures,
# again, no problem
aov(rt~ti+subj, data=rtrep, weights=we)

# combination of above two: repeated measures AND weights
aov(rt~ti + Error(subj/ti), data=rtrep, weights=we)

The latter model gives an error (see report below), but only after fitting it, ie the error is produced by the print and summary methods of the aov objects.

Any guidance is appreciated,
best, Ingmar

Error produced in printing the fitted aov object:

Call:
aov(formula = rt ~ ti + Error(subj/ti), data = rtrep, weights = we)
Note: The results below are on the weighted scale

Grand Mean: 0.112081

Stratum 1: subj

Terms:
                      ti Residuals
Sum of Squares  2.002826 10.869940
Deg. of Freedom        1        18

Residual standard error: 0.7771007
Estimated effects are balanced

Stratum 2: subj:ti

Terms:
                      ti Residuals
Sum of Squares  0.382535  5.540047
Deg. of Freedom        1        19

Residual standard error: 0.5399828
Estimated effects are balanced

Stratum 3: Within

Error in print.aov(xi, ...) :
  dims [product 60] do not match the length of object [100]
In addition: Warning message:
In resid * wt^0.5 :
  longer object length is not a multiple of shorter object length

Ingmar Visser
Department of Psychology, University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
t: +31-20-5256723
Re: [R] Problems with graphical devices, e.g., png(), pdf(): blurry graphical output
On Wed, Dec 17, 2008 at 12:54 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

> Your PDF problems indicate a broken viewer. How were you viewing PDF?

You are absolutely correct about the PDF files; I've since checked the PDF files in other viewers and have not been able to reproduce the problem. There was definitely something wrong with my default viewer. I am most certainly embarrassed.

As for the PNG files: I've viewed them on two different systems, in various viewers. On a Fedora system: Eye of Gnome, Firefox, Epiphany, GIMP, and gThumb. On a Windows system: Paint, Firefox, Internet Explorer, and GIMP. I see blurriness in all of these viewers for all of the files I originally claimed difficulty with (as mentioned in my last message, I don't see blurriness in the images produced via type='Xlib'). I'll do more tests on other systems once I get the chance, but that's what I see at the moment.

The files are linked to in the text file I provided, and you are all invited to check those out if you are interested.
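For readers comparing device types as in this thread, here is a small sketch of how anti-aliasing can be switched off at the producer side with a cairo-based png() device. It assumes an R build with cairo support (checked with capabilities()); the file name and plot are made up for illustration, and whether this removes the blurriness reported above is not something the thread confirms.

```r
# Write a PNG with anti-aliasing disabled, if cairo is available.
out <- file.path(tempdir(), "aa-none.png")   # hypothetical output file
made <- FALSE
if (capabilities("cairo")) {
  png(out, width = 400, height = 300, type = "cairo", antialias = "none")
  plot(1:10, type = "l", main = "antialias = \"none\"")
  dev.off()
  made <- file.exists(out)
}
made
```

Viewing the result at exactly 100% in a viewer that does no smoothing of its own keeps the comparison fair.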
[R] replacing elements of a zoo object
Dear R Users,

I am trying to do something quite simple: replace the elements of a zoo object. For some reason, the following code does not seem to work. How can I replace the value for the 14th of Dec 2008 in the zoo object x below with 1 (it is currently NA)?

> x
2008-12-11 2008-12-12 2008-12-13 2008-12-14 2008-12-15 2008-12-16
   361.667    389.875         NA         NA    397.822    395.667
> class(x)
[1] "zoo"
> class(index(x))
[1] "Date"
> x[as.Date("2008-12-14"),]
2008-12-14
        NA
> x[as.Date("2008-12-14"),] <- 1
Error in x[as.Date("2008-12-14"), ] <- 1 :
  incorrect number of subscripts on matrix

Thanks in advance,
Tolga
Re: [R] replacing elements of a zoo object
If this were copy and paste-able then I could probably give you a solution, but I would have a look at ?coredata

-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
Re: [R] win.graph question
On 12/17/2008 12:27 PM, Kirk Wythers wrote:

> I have what I hope is a ridiculously simple question. I am trying to follow an example that uses the function win.graph(), but my machine does not recognize win.graph(). I am running R 2.8.0 on OS X. Is there some OS X specific function that replaces win.graph or am I missing some package?

win.graph() is the name of the Windows graphics device. The platform-independent way to do the same thing is dev.new().

Duncan Murdoch
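A short sketch of the dev.new() suggestion, assuming a default device that accepts width and height (dev.new() passes on only the arguments the chosen device understands). In a non-interactive session the default device is normally pdf(), so this writes a file instead of opening a window.

```r
# Open a new device in a platform-independent way, draw, and close it.
n_before <- length(dev.list())
dev.new(width = 7, height = 5)   # uses whatever getOption("device") names
plot(1:10, main = "Opened with dev.new()")
n_after <- length(dev.list())
dev.off()
```

Code written for Windows that calls win.graph() can usually be adapted by substituting dev.new() in this way.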
Re: [R] Plotting regression lines and points from lm using lattice
On 12/17/08, Javier PB j.perez-barbe...@macaulay.ac.uk wrote:

> Dear R-users,
>
> Sorry if someone came out with a similar question, but after one day of searching I am giving up: does anyone know how to plot the original points used in a lm model and the set of resulting regression lines generated by the model? This is how I do it using the plot and lines functions, but I would like to do it using lattice and I cannot find a way to plot in the same panel points and lines that come from different datasets.
>
> Note: I would like to use this sort of predict approach rather than using the regression coefficients of the model, as in complex models I get confused when I have to combine the coefficients to build up each regression line.

Take a look at http://www.r-project.org/conferences/useR-2007/program/presentations/sarkar.pdf

-Deepayan
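One possible way to get observed points and model-based lines into the same lattice panels, sketched here on the built-in mtcars data. This is not necessarily the approach in the linked slides. It puts the predictions in the same data frame as the observations and relies on the extended formula interface plus distribute.type; the model and the column name fitted are chosen only for illustration.

```r
library(lattice)

# Fit a model and store its predictions next to the observations.
fit <- lm(mpg ~ wt * factor(cyl), data = mtcars)
d <- transform(mtcars, fitted = predict(fit))
d <- d[order(d$cyl, d$wt), ]          # sort so lines are drawn left to right

# 'mpg + fitted' creates two groups; with distribute.type = TRUE the first
# group is drawn as points ("p") and the second as lines ("l").
p <- xyplot(mpg + fitted ~ wt | factor(cyl), data = d,
            type = c("p", "l"), distribute.type = TRUE,
            ylab = "mpg")
print(p)
```

For predictions on a finer grid than the observed x values, the same idea works after stacking the observed data and the output of predict(fit, newdata = ...) into one data frame.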
Re: [R] OT: (quasi-?) separation in a logistic GLM
Gavin:

I think it's important to point out two probably obvious things:

(1) The dataset is very imbalanced: we have an overabundance of 'analogs == FALSE', roughly 94% of the data. If we think of a purely non-parametric test of equality of the underlying CDFs, then we have a lot of confidence in F0 and not much in F1.

(2) This aside, it does appear that the Dij value 0.209 is the optimum from the standpoint of maximizing Youden's index, Se + Sp - 1, which is the expected utility assigning utilities +/- 1/pi and +/- 1/(1-pi) to true/false positives and negatives. In this case, that means a true/false positive is worth 0.94/0.06 the value of a true/false negative, which seems reasonable given the imbalance of the dataset and the expectation that the measurements are equally precise in the two populations.

x <- 0.209; with(dat, c(sp <- mean(Dij[!analogs] > x), se <- mean(Dij[analogs] <= x), sp + se - 1))
[1] 0.9443561 0.9269231 0.8712792

So it appears that the dataset is quite well separated into two samples at the cutpoint 0.209.

Grant Izmirlian
NCI

On 15 Dec 2008, at 18:03, Gavin Simpson wrote:

> Dear List,
>
> Apologies for this off-topic post, but it is R-related in the sense that I am trying to understand what R is telling me with the data to hand.
>
> ROC curves have recently been used to determine a dissimilarity threshold for identifying whether two samples are from the same type or not. Given the bashing that ROC curves get whenever anyone asks about them on this list (and having implemented the ROC methodology in my analogue package), I wanted to try directly modelling the probability that two sites are analogues for one another for a given dissimilarity using glm().
>
> The data I have are a logical vector ('analogs') indicating whether the two sites come from the same vegetation and a vector of the dissimilarity between the two sites ('Dij'). These are in a csv file currently in my university web space. Each 'row' in this file corresponds to a single comparison between 2 sites.
>
> When I analyse these data using glm() I get the familiar "fitted probabilities numerically 0 or 1 occurred" warning. The data do not look linearly separable when plotted (code for which is below). I have read Venables and Ripley's discussion of this in MASS4 and other sources that discuss this warning and R (Faraway's Extending the Linear Model with R and John Fox's new Applied Regression, Generalized Linear Models, and Related Methods, 2nd Ed) as well as some of the literature on Firth's bias reduction method. But I am still somewhat unsure what (quasi-)separation is and whether it is the reason for the warnings in this case.
>
> My question then is: is this a separation issue with my data, or is it quasi-separation that I have read a bit about whilst researching this problem? Or is this something completely different?
>
> Code to reproduce my problem with the actual data is given below. I'd appreciate any comments or thoughts on this.
>
> ## Begin code snippet
> ## note data file is ~93Kb in size
> dat <- read.csv(url("http://www.homepages.ucl.ac.uk/~ucfagls/dat.csv"))
> head(dat)
> ## fit model --- produces warning
> mod <- glm(analogs ~ Dij, data = dat, family = binomial)
> ## plot the data
> plot(analogs ~ Dij, data = dat)
> fit.mod <- fitted(mod)
> ord <- with(dat, order(Dij))
> with(dat, lines(Dij[ord], fit.mod[ord], col = "red", lwd = 2))
> ## End code snippet
>
> Thanks in advance
> Gavin
>
> --
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
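The cutpoint search described in this reply can be sketched as follows. The data here are simulated stand-ins, not the dat.csv from the thread, with roughly the same 6%/94% imbalance; the distribution parameters and the helper name youden are assumptions made only for illustration.

```r
set.seed(1)

# Simulated stand-in data: analogues tend to have small dissimilarity.
n <- 1000
dat <- data.frame(analogs = rep(c(TRUE, FALSE), c(60, 940)))
dat$Dij <- ifelse(dat$analogs,
                  rnorm(n, mean = 0.15, sd = 0.04),
                  rnorm(n, mean = 0.35, sd = 0.08))

# Specificity, sensitivity and Youden's index J = Se + Sp - 1 at cutpoint x.
youden <- function(x, d) {
  se <- mean(d$Dij[d$analogs] <= x)    # analogues correctly below the cut
  sp <- mean(d$Dij[!d$analogs] > x)    # non-analogues correctly above it
  c(sp = sp, se = se, J = sp + se - 1)
}

# Evaluate J at every observed dissimilarity and keep the maximiser.
cuts <- sort(unique(dat$Dij))
J <- vapply(cuts, function(x) youden(x, dat)[["J"]], numeric(1))
best <- cuts[which.max(J)]
youden(best, dat)
```

On real data the maximiser of J is itself an estimate, so it is worth looking at how flat J is around the chosen cutpoint before reading much into the third decimal.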
[R] Plot multiple lines, same plot, different axes?
Dear list,

I would like to plot 2 series of numbers with very different ranges/scales as lines on the same plot. I assumed this is commonly done and easy, but I have not found any help files (e.g. axis() or matplot()) that show how. I've searched many old posts to no avail. I'll be very grateful for any suggestions on how this is done.

Best,
Zack
[R] Shrink Trellis margins settings (when printed to png file)
Dear R-experts,

I have two problems.

PROBLEM (1)

I want to produce a very small png file (35 x 18 px) that contains a histogram without a figure region or margins, only the pure heights. In the base graphics system this is simple:

png(filename = "hist.png", res = 72, width=35, height=18)
par(mar=c(0,0,0,0), oma=c(0,0,0,0))
hist(rnorm(100), main="")
dev.off()

Now I want a grid graphics output, as I need the graphic as an object. I tried several trellis.par settings but I was not able to figure it out. Up to now it looks like this:

library(lattice)
histogram(rnorm(100), xlab="", ylab="",
    par.settings=list(
        axis.line=list(col="transparent"),
        xlab.text=list(col="transparent"),
        ylab.text=list(col="transparent"),
        axis.text=list(col="transparent")) )

This looks acceptable, although I would like smaller margins, that is to say no margins at all. How can I achieve that?

PROBLEM (2)

When the lattice plot is printed to the png file, the graphic consists almost entirely of margins. The main part of the plot shrinks to some tiny points. I don't know how to change the settings so that I get the same result as in the base system. Does anyone know?

TIA
Mark
Re: [R] Problem with ggplot2
I'm also getting this error with ggplot2_0.8.1 on WinXP with R 2.8.0 and R 2.7.1.

hadley wrote:

> Hi David,
>
> I inadvertently introduced a bug in ggplot in the last release. I uploaded a fix to CRAN this morning and it should be available in the near future. Sorry for the inconvenience.
>
> Regards,
> Hadley
>
> On Thu, Nov 20, 2008 at 2:49 PM, David Hajage dhajag...@gmail.com wrote:
>
>> Hello R users,
>>
>> I have an error with package ggplot2 under Linux (Ubuntu 8.10), R 2.8.0 and ggplot2 0.7, everything up to date:
>>
>> > library(ggplot2)
>> Loading required package: grid
>> Loading required package: reshape
>> Loading required package: plyr
>> Loading required package: proto
>> Loading required package: RColorBrewer
>> Loading required package: splines
>> Loading required package: MASS
>> Attaching package: 'ggplot2'
>> The following object(s) are masked from package:grid : nullGrob
>> > qplot(mpg, wt, data=mtcars)
>> Error in gList(...) : Only 'grobs' allowed in 'gList'
>>
>> I don't understand: this code works on my work computer (Windows 2000). Did I forget something?
>>
>> Thank you very much.
>> david
>
> -- 
> http://had.co.nz/
[R] R function to calculate number of data points of each level
I have a dataframe with two columns:

11600 238'4
12000 218'0
12200 209'0
12600 192'0
13000 176'4
14000 145'0
15000 119'0
16000 103'0
18000 80'0
19000 68'3
2 59'0
11600 208'1
12000 189'2
12200 180'3

There are repetitions in the 1st column and I want to use it as a level. Next I want to report, for each level, how many data points there are, together with the corresponding level. Is there any direct R function?
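A minimal sketch of counting rows per level with table(), using a few of the values from the question. The column names strike and price are made up here, since the post does not name its columns.

```r
# A small data frame with repeated values in the first column.
d <- data.frame(
  strike = c(11600, 12000, 12200, 12600, 11600, 12000, 12200),
  price  = c("238'4", "218'0", "209'0", "192'0", "208'1", "189'2", "180'3")
)

# table() counts how many rows fall in each distinct value of 'strike'.
counts <- table(d$strike)
counts

# The same information as a data frame: one row per level and its count.
as.data.frame(counts, responseName = "n")
```

tapply(d$price, d$strike, length) gives the same counts as a named vector.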
Re: [R] replacing elements of a zoo object
Try this:

x[index(x) == as.Date("2008-12-14")] <- 1
x

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
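To make the suggestion above reproducible, here is a self-contained sketch that rebuilds a series like the one in the question and applies the replacement. It assumes the zoo package is installed (the guard skips the example otherwise); the values are copied from the printed output in the question.

```r
if (requireNamespace("zoo", quietly = TRUE)) {
  library(zoo)

  # Rebuild the six-day series from the question.
  days <- as.Date("2008-12-11") + 0:5
  x <- zoo(c(361.667, 389.875, NA, NA, 397.822, 395.667), order.by = days)

  # x is a univariate (vector-like) series, so use a single subscript:
  # a logical index built by comparing the dates.
  x[index(x) == as.Date("2008-12-14")] <- 1
  print(x)
}
```

The original error came from the trailing comma in x[as.Date("2008-12-14"), ], which asks for a row of a matrix that a univariate series does not have.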
Re: [R] repeated measures aov with weights
On Wed, 17 Dec 2008, Ingmar Visser wrote:

> I see. I was afraid of an answer along these lines, as my problem now turns into a stats problem (-; Any suggestions on how to analyze such data as given below with weights AND at the same time taking into account that there are repeated measurements?

It matters what you mean by 'weights'. If they are case weights, repeat the cases. If they are inverse-variance weights, you can use package lme4. The R-sig-mixed-models list might be more appropriate if you want more detailed help.

> Best, Ingmar
>
> On 17 Dec 2008, at 11:02, Prof Brian Ripley wrote:
>
>> Weights are not supported: multistratum aov is designed for balanced designs and uses projection for which weighting is inappropriate.

-- 
Brian D. Ripley
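A small sketch of what "repeat the cases" can look like for integer case weights. The toy data and the column names y, g, w are invented for illustration, and the comparison uses lm() rather than a multistratum aov; it only shows that row replication reproduces the weighted coefficient estimates, not that the resulting inference is right for any particular design.

```r
set.seed(11)

# Toy data with integer case weights (w = how many cases each row stands for).
d <- data.frame(y = rnorm(6), g = gl(2, 3), w = c(1, 2, 1, 3, 1, 2))

# Repeat each row w times.
expanded <- d[rep(seq_len(nrow(d)), times = d$w), ]

# Coefficients from the weighted fit and from the expanded data agree.
coef_weighted <- coef(lm(y ~ g, data = d, weights = w))
coef_expanded <- coef(lm(y ~ g, data = expanded))
all.equal(coef_weighted, coef_expanded)
```

The expanded data frame could then be passed to aov() with an Error() term, though replication generally destroys the balance that multistratum aov assumes.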
Re: [R] Plot multiple lines, same plot, different axes?
zackfire wrote:

Dear list, I would like to plot 2 series of numbers with very different ranges/scales as lines on the same plot. I assumed this is commonly done and easy, but I have not found any help files (e.g. axis() or matplot()) that show how. I've searched many old posts to no avail.

http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-base:2yaxes
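For readers who cannot reach the wiki page, one common base-graphics approach is sketched below. It is not copied from that page; the data and labels are made up for illustration. The idea is to draw the first series normally, overlay the second with par(new = TRUE) and no axes, and then add a right-hand axis with axis(4).

```r
set.seed(1)
x  <- 1:20
y1 <- cumsum(rnorm(20))        # hypothetical series on a small scale
y2 <- 1000 * runif(20)         # hypothetical series on a much larger scale

op <- par(mar = c(5, 4, 4, 4) + 0.1)   # leave room for the right-hand axis
plot(x, y1, type = "l", col = "blue", ylab = "series 1")
par(new = TRUE)                         # draw the next plot on top of this one
plot(x, y2, type = "l", col = "red", axes = FALSE, xlab = "", ylab = "")
axis(side = 4)                          # right-hand axis in the scale of y2
mtext("series 2", side = 4, line = 3)
par(op)                                 # restore the previous margins
```

The two scales are unrelated, so the apparent crossings and relative slopes of the two lines carry no meaning; that is the usual argument against this kind of plot.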
Re: [R] Yet another set of codes to optimize
[R] Yet another set of codes to optimize
Daren Tan daren76 at hotmail.com
Fri Dec 5 03:41:23 CET 2008

I have problems converting my dataset from long to wide format. Previous attempts using the reshape package and the aggregate function were unsuccessful as they took too long. Apparently, my simplified solution takes just as long. My complete code is given below. When sample.size = 10000, the execution takes about 20 seconds. But sample.size = 100000 seems to take an eternity. My actual sample.size is 15000000, i.e. 15 million.

    sample.size <- 10000
    m <- data.frame(Name=sample(1:10, sample.size, T), Type=sample(1:1000, sample.size, T), Predictor=sample(LETTERS[1:10], sample.size, T))
    res <- function(m) {
      m.12.unique <- unique(m[,1:2])
      m.12.unique <- m.12.unique[order(m.12.unique[,1], m.12.unique[,2]),]
      v1 <- paste(m.12.unique[,1], m.12.unique[,2], sep=".")
      v2 <- c(sort(unique(m[,3])))
      res <- matrix(0, nr=length(v1), nc=length(v2), dimnames=list(v1, v2))
      m.ids <- paste(m[,1], m[,2], sep=".")
      for(i in 1:nrow(m)) {
        x <- m.ids[i]
        y <- m[i,3]
        res[x, y] <- res[x, y] + 1
      }
      res <- data.frame(m.12.unique[,1], m.12.unique[,2], res, row.names=NULL)
      colnames(res) <- c("Name", "Type", v2)
      return(res)
    }
    res(m)

Your for loop is tabulating the items in m.ids and m[,3], so think of using table(). E.g., replace

    res <- matrix(0, nr=length(v1), nc=length(v2), dimnames=list(v1, v2))
    for(i in 1:nrow(m)) {
      x <- m.ids[i]
      y <- m[i,3]
      res[x, y] <- res[x, y] + 1
    }

with

    res <- table(factor(m.ids, levels=v1), factor(m[,3]))

There is a bit of trickiness in putting this table into the data.frame. Since as.data.frame(tableObject) works very differently from as.data.frame(matrixObject), the naive

    data.frame(m.12.unique[,1], m.12.unique[,2], res, row.names=NULL)

fails. You need to convert the table res into a matrix with the same data, dimensions, and dimnames.

    data.frame(m.12.unique[,1], m.12.unique[,2], as.matrix(res), row.names=NULL)

also fails, because a table object is a matrix object, so as.matrix(tableObject) returns its input unchanged. as(res, "matrix") seems to work, as does the wordier but more explicit array(res, dim(res), dimnames(res)).

    res1 <- function(m) {
      m.12.unique <- unique(m[,1:2])
      m.12.unique <- m.12.unique[order(m.12.unique[,1], m.12.unique[,2]),]
      v1 <- paste(m.12.unique[,1], m.12.unique[,2], sep=".")
      v2 <- c(sort(unique(m[,3])))
      res <- matrix(0, nr=length(v1), nc=length(v2), dimnames=list(v1, v2))
      m.ids <- paste(m[,1], m[,2], sep=".")
      res <- table(factor(m.ids, levels=v1), factor(m[,3]))
      res <- data.frame(m.12.unique[,1], m.12.unique[,2], as(res, "matrix"), row.names=NULL)
      colnames(res) <- c("Name", "Type", v2)
      return(res)
    }

Here is a table of times for your original function, time0, and this modified one, time1. It looks like res1 eventually becomes worse than linear, but at a much larger size than your original. sort() and unique() cannot have linear time, so they may be becoming factors at size=1e6.

          size   time0  time1
    1       10   0.012  0.012
    2      100   0.032  0.014
    3      200   0.061  0.016
    4      400   0.126  0.020
    5      800   0.286  0.028
    6     1000   0.383  0.033
    7     2000   2.337  0.054
    8     4000   8.578  0.100
    9     8000  39.955  0.214
    10   10000  68.767  0.318
    11   20000 327.973  1.057
    12  100000      NA  3.021
    13 1000000      NA 89.881

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
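The replacement of the counting loop by table() can be checked on a small made-up data set. The sketch below is only an illustration of that one step (the data, sizes and variable names are assumptions, not the poster's): it builds the counts both ways and confirms they agree, then turns the table into a plain matrix with array() as suggested above.

```r
set.seed(1)
m <- data.frame(Name = sample(1:3, 50, TRUE),
                Type = sample(1:2, 50, TRUE),
                Predictor = sample(LETTERS[1:4], 50, TRUE))
ids <- paste(m$Name, m$Type, sep = ".")

# Vectorised tabulation: one row per Name.Type pair, one column per Predictor
tab <- table(ids, m$Predictor)
counts <- array(tab, dim(tab), dimnames(tab))   # plain matrix, no "table" class

# The same counts built one row at a time, as in the original loop
loop <- matrix(0, nrow(tab), ncol(tab), dimnames = dimnames(tab))
for (i in seq_len(nrow(m))) {
  p <- as.character(m$Predictor[i])   # index by name, never by factor code
  loop[ids[i], p] <- loop[ids[i], p] + 1
}

all(counts == loop)
```

Indexing with as.character() matters when the column is a factor: a factor used as a matrix index is treated as its integer codes, which only matches the column names by coincidence.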
Re: [R] using dvi with latex object: directory not correctly set, maybe due to error in shQuote()
Hello Marco,

It might not be evident at first sight, but have you set the environment variable R_SHELL? If you look at the dvi method for latex you will find a call to sys(), which will call shell(), and if the argument shell is unset then the contents of R_SHELL will be used. Hence, what does

    Sys.getenv("R_SHELL")

yield on your machine? I reckon "" will be returned. Therefore, as a first step:

    Sys.setenv(R_SHELL = "cmd.exe")

or set it permanently in your R environment file. Having done so, running dvi(latex.obj) at least no longer produces the warning that everything beyond cd is skipped, and the command built via paste is parsed. The next problem is the path of the randomly generated file, which latex cannot handle. The following alternative might work for you too:

    dvi.latex2 <- function (object, prlog = FALSE, nomargins = TRUE, width = 5.5, height = 7, ...) {
      fi <- object$file
      sty <- object$style
      if (length(sty)) sty <- paste("\\usepackage{", sty, "}", sep = "")
      if (nomargins) sty <- c(sty, paste("\\usepackage[paperwidth=", width, "in,paperheight=", height, "in,noheadfoot,margin=0in]{geometry}", sep = ""))
      tmp <- tempfile(tmpdir = tempdir())
      tmptex <- paste(tmp, "tex", sep = ".")
      infi <- readLines(fi, n = -1)
      cat("\\documentclass{report}", sty, "\\begin{document}\\pagestyle{empty}", infi, "\\end{document}\n", file = tmptex, sep = "\n")
      sc <- if (under.unix) { ";" } else { "&" }
      shell(paste("cd", shQuote(tempdir()), sc, optionsCmds("latex"), "-interaction=scrollmode", shQuote(tmp)), translate = TRUE)
      if (prlog) cat(scan(paste(tmp, "log", sep = "."), list(""), sep = "\n")[[1]], sep = "\n")
      fi <- paste(tmp, "dvi", sep = ".")
      structure(list(file = fi), class = "dvi")
    }

And therefore:

    tbl.loc <- matrix(1:4, ncol=2)
    latex.obj <- latex(tbl.loc)
    tempdir <- function(){"H:/PROJECTS/data"}
    Sys.getenv("R_SHELL")
    Sys.setenv(R_SHELL = "cmd.exe")
    Sys.getenv("R_SHELL")
    ## options(xdvicmd='dviout') set appropriately; I use TeXLive and do not have yap installed;
    ## working with MikTeX there should be no need to change the default viewer
    dvi.latex2(latex.obj)
    ## It might be the case that the dvi file is not displayed immediately after production but can be opened
    ## manually

Does this work for you? It is probably also a good idea to address this problem directly to the package maintainer (already cc'ed).

Best, Bernhard

Dear friends of R,

I want to produce a pdf file with the contents of a matrix. I employ the latex command in combination with dvi, both contained in the Hmisc package. It seems to me that the function does not correctly set the directory.

    tbl.loc <- matrix(1:4, nc=2)
    latex.obj <- latex(tbl.loc)
    dvi(latex.obj)
    warning: extra args ignored after 'cd' H:\PROJECTS\data
    warning: extra args ignored after 'yap'

When I have a look at the function dvi.latex I find the following line which, I guess, is meant to set the new directory and to run latex:

    sys(paste("cd", shQuote(tempdir()), sc, optionsCmds("latex"), "-interaction=scrollmode", shQuote(tmp)), output = FALSE)

Running just the piece shQuote(tempdir()) returns

    shQuote(tempdir())
    [1] "\"C:\\DOKUME~1\\ferimawi\\LOKALE~1\\Temp\\Rtmpr4CG3A\""
    tempdir()
    [1] "C:\\DOKUME~1\\ferimawi\\LOKALE~1\\Temp\\Rtmpr4CG3A"

Is the leading \ causing the problem? How can I fix the problem? R-help dealt with a related problem some while ago, but I do not think that it resolves my problem: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/62975.html

I am using Windows XP, R version 2.7.2 (2008-08-25) and Hmisc version 3.4-4. Thanks in advance for your help.
Regards, Marco Willner
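On the question of the leading backslash: it is most likely only an artefact of how print() displays a string, not part of what is sent to the shell. The short sketch below illustrates this with a hypothetical path (not the poster's); shQuote() with type = "cmd" wraps the string in double quotes, and print() then shows those quotes escaped.

```r
p <- "C:\\Temp\\Rtmp1"           # hypothetical path; each \\ is a single backslash in memory
q <- shQuote(p, type = "cmd")    # Windows cmd-style quoting: wrap in double quotes

print(q)       # display form: quotes and backslashes are shown escaped
cat(q, "\n")   # the characters that would actually reach the shell

nchar(p)       # length of the raw path
nchar(q)       # two more characters: the surrounding double quotes
```

Comparing the print() and cat() output shows that the quoted path contains no extra backslash, which is consistent with the diagnosis in the reply that the unset R_SHELL, not shQuote(), was the cause.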
Re: [R] using dvi with latex object: directory not correctly set, maybe due to error in shQuote()
Dear Bernhard,

You were right, pushing my knowledge about R once again to the next level! Sys.getenv("R_SHELL") indeed returns "". The fix you are proposing works for me. If I am right, your function dvi.latex2 makes use of shell rather than sys as in the original function dvi.latex, and stores results in the directory specified for tempdir(). On my system the original function dvi.latex also works up to the point when I want to produce a ps file (see output below: the dvi file can't be opened).

However, the next problem is the output (either when opening the dvi file with yap or the ps file with ghostscript). The page size is not the one specified by the parameters width and height in dvi.latex/dvi.latex2. It looks like DIN A4. I apologize if, at this point, it becomes a problem caused by MiKTeX 2.7 and the geometry package. Ultimately, I want to embed the pdf file into another file and I cannot use the large format.

Cheers, Marco

    tbl.loc <- matrix(1:4, ncol=2)
    latex.obj <- latex(tbl.loc)
    dvi.obj <- dvi(latex.obj)
    Das System kann den angegebenen Pfad nicht finden. [The system cannot find the path specified.]
    This is pdfTeX, Version 3.141592-1.40.4 (MiKTeX 2.7 Beta 4)
    entering extended mode
    (C:/DOKUME~1/ferimawi/LOKALE~1/Temp/Rtmpr4CG3A/file33ea5db2.tex
    LaTeX2e <2005/12/01>
    Babel <v3.8g> and hyphenation patterns for english, dumylang, nohyphenation, german, ngerman, french, loaded.
    (C:\Programme\MiKTeX 2.7\tex\latex\base\report.cls
    Document Class: report 2005/09/16 v1.4f Standard LaTeX document class
    (C:\Programme\MiKTeX 2.7\tex\latex\base\size10.clo))
    (C:\Programme\MiKTeX 2.7\tex\latex\geometry\geometry.sty
    (C:\Programme\MiKTeX 2.7\tex\latex\graphics\keyval.sty)
    (C:\Programme\MiKTeX 2.7\tex\latex\geometry\geometry.cfg))
    No file file33ea5db2.aux.
    [1] (file33ea5db2.aux) )
    Output written on file33ea5db2.dvi (1 page, 300 bytes).
    Transcript written on file33ea5db2.log.

    dvips(object=dvi.obj, "tst.ps")
    This is dvips(k) 5.96 Copyright 2007 Radical Eye Software (www.radicaleye.com)
    dvips: ! DVI file can't be opened.

    dvi.obj2 <- dvi.latex2(latex.obj)
    This is pdfTeX, Version 3.141592-1.40.4 (MiKTeX 2.7 Beta 4)
    entering extended mode
    (H:/Asset Allocation/PROJECTS/MAM/latex/plot_tbl/file48cc23c9.tex
    LaTeX2e <2005/12/01>
    Babel <v3.8g> and hyphenation patterns for english, dumylang, nohyphenation, german, ngerman, french, loaded.
    (C:\Programme\MiKTeX 2.7\tex\latex\base\report.cls
    Document Class: report 2005/09/16 v1.4f Standard LaTeX document class
    (C:\Programme\MiKTeX 2.7\tex\latex\base\size10.clo))
    (C:\Programme\MiKTeX 2.7\tex\latex\geometry\geometry.sty
    (C:\Programme\MiKTeX 2.7\tex\latex\graphics\keyval.sty)
    (C:\Programme\MiKTeX 2.7\tex\latex\geometry\geometry.cfg))
    No file file48cc23c9.aux.
    [1] (file48cc23c9.aux) )
    Output written on file48cc23c9.dvi (1 page, 300 bytes).
    Transcript written on file48cc23c9.log.

    dvips(object=dvi.obj2, "tst.ps")
    This is dvips(k) 5.96 Copyright 2007 Radical Eye Software (www.radicaleye.com)
    ' TeX output 2008.12.17:1739' -> tst.ps
    <C:/Programme/MiKTeX 2.7/dvips/base/tex.pro><C:/Programme/MiKTeX 2.7/dvips/base/texps.pro>. <C:/Programme/MiKTeX 2.7/fonts/type1/bluesky/cm/cmr10.pfb>[1]
Re: [R] Plot multiple lines, same plot, different axes?
On 18/12/2008, at 4:52 AM, zack holden wrote:

Dear list, I would like to plot 2 series of numbers with very different ranges/scales as lines on the same plot. I assumed this is commonly done and easy, but I have not found any help files (e.g. axis() or matplot()) that show how. I've searched many old posts to no avail. I'll be very grateful for any suggestions on how this is done.

***DON'T***!!!

cheers, Rolf Turner
[R] passing arguments to subset from a function
Hello R-helpers,

I'm writing a long function in which I manipulate a certain number of datasets. I want the arguments of said function to allow me to adapt the way I do this. Among other things, I want my function to have an argument which I will pass on to subset() somewhere inside my function. Here is a quick and simplified example with the iris dataset.

    myfunction <- function(table, extraction) {
      table2 <- subset(table, extraction)
      return(table2)
    }
    myfunction(iris, extraction = Species=="setosa")
    ## end

What I would like is for this function to return exactly the same thing as:

    subset(iris, Species=="setosa")

Thanks for your help.

Regards, David Gouache
Re: [R] Confidence intervals of log transformed data
Hello, I was wondering if you could tell me how to calculate 95% confidence intervals for lambda for a Box-Cox power transformation.

Best wishes, Eoin
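One common way to get such an interval is from the profile log-likelihood that boxcox() in the MASS package computes: the approximate 95% interval is the set of lambda values whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum. The sketch below assumes that approach and uses the trees data and model from the boxcox() help page purely as a stand-in for the poster's data; the grid of lambda values is also an arbitrary choice.

```r
library(MASS)

# Profile log-likelihood of lambda on a fine grid (no plot, no interpolation)
bc <- boxcox(Volume ~ log(Height) + log(Girth), data = trees,
             lambda = seq(-1, 1, by = 0.01), plotit = FALSE)

lambda_hat <- bc$x[which.max(bc$y)]            # grid value maximising the likelihood
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2 # likelihood-ratio cutoff
ci <- range(bc$x[bc$y >= cutoff])              # approximate 95% interval for lambda

lambda_hat
ci
```

If an end of the interval coincides with an end of the grid, the grid is too narrow and should be widened. Calling boxcox() with plotit = TRUE draws the same cutoff as a horizontal line.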
Re: [R] R function to calculate number of data points of each level
On Wed, Dec 17, 2008 at 03:36:44AM -0800, RON70 wrote:

I have a dataframe with two columns:

    11600 238'4
    12000 218'0
    [...]

There are repetitions in the 1st column and I want to use this as Level. Next I want to report for each level how many data points there are, with the corresponding Level number. Is there any direct R function?

    table(foo$V1)

    11600 12000 12200 12600 13000 14000 15000 16000 18000 19000 20000
        2     2     2     1     1     2     2     2     2     2     1

or maybe

    aggregate(foo$V1, by=list(foo$V1), FUN=length)
       Group.1 x
    1    11600 2
    2    12000 2
    3    12200 2
    4    12600 1
    5    13000 1
    6    14000 2
    7    15000 2
    8    16000 2
    9    18000 2
    10   19000 2
    11   20000 1

cu Philipp

-- Dr. Philipp Pagel, Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany, http://mips.gsf.de/staff/pagel
Re: [R] Prediction intervals for zero inflated Poisson regression
Dear Achim,

Thanks for the script. It works fine, except that it sometimes yields extremely wide confidence intervals. That happens for a factor level with only a few replications or a level with all zeros. I noticed that the se for those predictions was NaN. Therefore I've added two lines (marked with #% at the end) that set the lower and upper bound to NA when is.na(se). In my opinion, no confidence interval makes more sense in those cases than a c.i. like [1e-200, 1e200].

Best regards, Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: Achim Zeileis [mailto:achim.zeil...@wu-wien.ac.at]
Sent: Tuesday 16 December 2008 16:45
To: ONKELINX, Thierry
CC: r-help@r-project.org
Subject: Re: [R] Prediction intervals for zero inflated Poisson regression

Thierry,

Simon had written some code for this but we never got round to fully integrating it into the pscl package. A file pb.R is attached, but as a disclaimer: I haven't looked at this code for a while. It still seems to work (an example is included at the end) but please check.

hth, Z

On Tue, 16 Dec 2008, ONKELINX, Thierry wrote:

Dear all,

I'm using zeroinfl() from the pscl package for zero inflated Poisson regression. I would like to calculate (approximate) prediction intervals for the fitted values. The package itself does not provide them. Can this be calculated analytically? Or do I have to use the bootstrap? What I have tried until now is to use the bootstrap to estimate these intervals. Any comments on the code are welcome. The data and the model are based on the examples in zeroinfl().

    #approximate prediction intervals with Poisson regression
    fm_pois <- glm(art ~ fem, data = bioChemists, family = poisson)
    newdata <- na.omit(unique(bioChemists[, "fem", drop = FALSE]))
    prediction <- predict(fm_pois, newdata = newdata, se.fit = TRUE)
    ci <- data.frame(exp(prediction$fit + matrix(prediction$se.fit, ncol = 1) %*% c(-1.96, 1.96)))
    newdata$fit <- exp(prediction$fit)
    newdata <- cbind(newdata, ci)
    newdata$model <- "Poisson"

    library(pscl)
    #approximate prediction intervals with zero inflated poisson regression
    fm_zip <- zeroinfl(art ~ fem | 1, data = bioChemists)
    fit <- predict(fm_zip)
    Pearson <- resid(fm_zip, type = "pearson")
    VarComp <- resid(fm_zip, type = "response") / Pearson
    fem <- bioChemists$fem
    bootstrap <- replicate(999, {
      yStar <- pmax(round(fit + sample(Pearson) * VarComp, 0), 0)
      predict(zeroinfl(yStar ~ fem | 1), newdata = newdata)
    })
    newdata0 <- newdata
    newdata0$fit <- predict(fm_zip, newdata = newdata, type = "response")
    newdata0[, 3:4] <- t(apply(bootstrap, 1, quantile, c(0.025, 0.975)))
    newdata0$model <- "Zero inflated"

    #compare the intervals in a nice plot.
    newdata <- rbind(newdata, newdata0)
    library(ggplot2)
    ggplot(newdata, aes(x = fem, y = fit, min = X1, max = X2, colour = model)) +
      geom_point(position = position_dodge(width = 0.4)) +
      geom_errorbar(position = position_dodge(width = 0.4))

Best regards, Thierry
[R] R2winbugs : vectorization
Many thanks to all responders. It turns out that there is a winbugs update available, which defines a new version of inprod, inprod2. Using inprod2 the vectorized code runs in about the same time as the scalar version.

On the print problem: the issue here turns out to be that the arm package (or one of the packages it loads) resets digits to 2. So the solution is to reset digits ( options(digits=5) or whatever) after loading your packages.

Thanks again!

Philip A. Viton
City Planning, Ohio State University
275 West Woodruff Avenue, Columbus OH 43210
vito...@osu.edu
Re: [R] passing arguments to subset from a function
Available free for the typing are the functions for the default and the dataframe methods of subset:

    subset.default
    function (x, subset, ...)
    {
        if (!is.logical(subset))
            stop("'subset' must be logical")
        x[subset & !is.na(subset)]
    }

    subset.data.frame
    function (x, subset, select, drop = FALSE, ...)
    {
        if (missing(subset))
            r <- TRUE
        else {
            e <- substitute(subset)
            r <- eval(e, x, parent.frame())
            if (!is.logical(r))
                stop("'subset' must evaluate to logical")
            r <- r & !is.na(r)
        }
        if (missing(select))
            vars <- TRUE
        else {
            nl <- as.list(1:ncol(x))
            names(nl) <- names(x)
            vars <- eval(substitute(select), nl, parent.frame())
        }
        x[r, vars, drop = drop]
    }

(There is also a matrix method.)

-- David Winsemius

On Dec 17, 2008, at 2:07 PM, GOUACHE David wrote:

Hello R-helpers, I'm writing a long function in which I manipulate a certain number of datasets. I want the arguments of said function to allow me to adapt the way I do this. Among other things, I want my function to have an argument which I will pass on to subset() somewhere inside my function. Here is a quick and simplified example with the iris dataset.

    myfunction <- function(table, extraction) {
      table2 <- subset(table, extraction)
      return(table2)
    }
    myfunction(iris, extraction = Species=="setosa")
    ## end

What I would like is for this function to return exactly the same thing as:

    subset(iris, Species=="setosa")

Thanks for your help. Regards, David Gouache
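The listing above shows why the original wrapper fails: subset.data.frame() captures its subset argument with substitute(), so inside myfunction it sees the symbol extraction rather than Species == "setosa". One possible fix, sketched here as an illustration and not as the reply's own recommendation, is to do the same capture-and-evaluate step in the wrapper. The names cond and rows are made up for this example.

```r
myfunction <- function(table, extraction) {
  cond <- substitute(extraction)            # capture the unevaluated condition
  rows <- eval(cond, table, parent.frame()) # evaluate it with the data's columns in scope
  if (!is.logical(rows)) stop("'extraction' must evaluate to logical")
  table[rows & !is.na(rows), , drop = FALSE]
}

res <- myfunction(iris, extraction = Species == "setosa")
nrow(res)
all.equal(res, subset(iris, Species == "setosa"))
```

Evaluating in parent.frame() means that variables used in the condition but not found in the data frame are looked up where myfunction was called, which mirrors what subset() does itself.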
Re: [R] Problems with graphical devices, e.g., png(), pdf(): blurry graphical output
The artefacts that you see are a normal result of using bitmap graphics devices. I have tried to explain these below. I have looked at your figures in Eye of Gnome, with anti-aliasing turned off (Menu Edit/Preferences; Tab Image View; option Smooth images when zoomed). I recommend that you do the same.

    png('test.png',antialias='none') # type is 'cairo'
    plot(1:10)
    dev.off()
    ## result: no fuzziness at all but the box is missing
    ## the top and left border lines
    ## http://www.piccdrop.com/images/1229495388.png

Cairo works in real (double precision) coordinates. But the line must be converted to a bitmap to be displayed. When this is done without anti-aliasing, it is quite possible for a thin horizontal or vertical line to pass in between the points on the grid that are sampled to form the bitmap image, and hence disappear.

    png('test.png') # type is 'cairo'
    plot(1:10)
    dev.off()
    ## result: box lines fuzzy at top and left, and appears
    ## darker and thicker where the axes are overplotted
    ## http://www.piccdrop.com/images/1229495327.png

With anti-aliasing, a horizontal or vertical line may appear as a 1 pixel wide black line, but is more likely to appear as a 2 pixel wide grey line. When two such grey lines are over-plotted, they will create a darker grey line. The overplotted line also appears thicker, but this is an optical illusion.

    png('test.png',type='Xlib')
    plot(1:10)
    dev.off()
    ## result: no fuzziness at all and no lines missing
    ## http://www.piccdrop.com/images/1229495428.png

Xlib works in integer coordinates. When a line is plotted in Xlib, the start and end coordinates are cast to integer before plotting. Hence horizontal/vertical lines will always appear, with a width of 1 pixel, and overplotting does not change the appearance of a line.

Martyn

On Wed, 2008-12-17 at 01:36 -0800, Y-H Chen wrote:

On Wed, Dec 17, 2008 at 12:54 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

Your PDF problems indicate a broken viewer. How were you viewing PDF?

You are absolutely correct about the PDF files; I've since checked the PDF files in other viewers and have not been able to reproduce the problem. There was definitely something wrong with my default viewer. I am most certainly embarrassed.

As for the PNG files: I've viewed the PNG files on two different systems, in various viewers. On a Fedora system in Eye of Gnome, Firefox, Epiphany, GIMP, and gThumb. And, on a Windows system in Paint, Firefox, Internet Explorer, and GIMP. I see blurriness in all of these viewers for all of the files I originally claimed difficulty with (as mentioned in my last message, I don't see blurriness in the images produced via type='Xlib'). I'll do more tests on other systems once I get the chance, but that's what I see at the moment. The files are linked to in the text file I provided, and you are all invited to check those out if you are interested.
Re: [R] Noobie question, regression across levels
AllenL wrote:

Much thanks! This helped a lot. Another quick one: In using the lmList function in the nlme package, is it possible to subset my data according to the number of observations in each level? (ie. I obviously want to include only those levels in which the observations are of sufficient size for regression). What is the best way to exclude factors of insufficient size? Can I do it inside the lmList function? I've read the requisite help files etc. and two hours later am still confused. Thanks in advance, Allen

Don't know if you can do it directly in lmList, but:

    splitdat <- split(mydata, splitfactor)
    lengths <- sapply(splitdat, nrow) ## NOT sapply(splitdat,length)
    splitdat <- splitdat[lengths > minlength]
    lmfun <- function(d) { lm(y~x, data=d) }
    myLmList <- lapply(splitdat, lmfun)

OR

    lengths <- sapply(split(mydata, splitfactor), nrow)
    badlevels <- levels(splitfactor)[lengths < minlength]
    subdata <- subset(mydata, !splitfactor %in% badlevels)

and then proceed with lmList.

Of course, I didn't test any of this ...

Ben Bolker
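Since the reply was posted untested, here is a small self-contained run of the same idea on made-up data. The data frame, the grouping column g, and the threshold are assumptions for illustration; the sketch counts observations per level, keeps levels with at least minlength rows, and fits one lm() per remaining level.

```r
set.seed(42)
# Hypothetical data: level "b" has only 2 observations
mydata <- data.frame(g = factor(rep(c("a", "b", "c"), times = c(10, 2, 8))),
                     x = rnorm(20))
mydata$y <- 1 + 2 * mydata$x + rnorm(20)
minlength <- 3

n_per_level <- table(mydata$g)                       # observations per level
keep <- names(n_per_level)[n_per_level >= minlength] # levels large enough to fit
subdata <- droplevels(mydata[mydata$g %in% keep, ])  # drop rows and unused levels

# One regression per remaining level
fits <- lapply(split(subdata, subdata$g), function(d) lm(y ~ x, data = d))
coefs <- t(sapply(fits, coef))                       # one row of coefficients per level
coefs
```

The filtered subdata can equally be passed to lmList() from nlme, for example with a formula of the form y ~ x | g; droplevels() avoids carrying the empty level along.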
[R] Trouble pulling data from a messy ASCII file...
Hi all,

I am a new graduate student who is also new to R. I am ok with the basics, but the problem I am having right now seems beyond what I can do, so I am looking for advice. I am trying to pull data from flat ASCII files, but they do not have a nice structure, so a simple read.table doesn't work. An example first half of a data file is below:

    --
    19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt
    10 s name of program that wrote this file trkplt name of program that wrote this file
    10 GORDON machine that generated this file machine that generated this file
    10 3.7 version of program
    10 3.6 version of this data file
    105.81 version of Universal Library
    10 20081121.145730 when this file was written
    10 Windows_XP operating system used operating system used
    * * radar characteristics
    11 WF-100
    11 2000 A/D rate, samples/second
    11 7.5 bin width, m
    11 800 nominal PRF, Hz
    11 0.25 nominal pulse width, microsec
    11 0 tuning, volts
    11 3.19779 nominal wave length, cm
    ---

...the file goes on from there...

How would I go about getting this data into some kind of useful format? This is one of about 1000 files I will need to go through. I would ideally like to get these into a format with each data file as a row, with columns for the various values and the description text removed (version of program, file version, tuning volts, etc...). I'm not looking for a cut and paste answer, but perhaps some direction on where I should start. I have only done basic .csv, table, and line inputs up until now.

Thanks for any advice
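One possible starting point is to read each file with readLines() and pick the fields apart with regular expressions. The sketch below is only that, under a stated assumption about the format: each useful line is taken to be "record code, value, optional description", separated by whitespace. The function name parse_header, the fallback column names such as field2, and the sample lines are all made up for illustration; real files may need extra rules.

```r
parse_header <- function(lines) {
  lines <- trimws(lines)
  # Keep only "<code> <value> [description]" records; skip separators such as "* *"
  lines <- lines[grepl("^[0-9]+[[:space:]]+[^[:space:]]", lines)]
  parts <- strsplit(lines, "[[:space:]]+")
  value <- vapply(parts, function(p) p[2], character(1))
  desc  <- vapply(parts, function(p) paste(p[-(1:2)], collapse = " "), character(1))
  empty <- desc == ""
  desc[empty] <- paste0("field", which(empty))   # placeholder name when no description
  names(value) <- make.names(desc, unique = TRUE)
  as.data.frame(as.list(value), stringsAsFactors = FALSE)  # one row per file
}

example <- c("10 3.7 version of program",
             "* * radar characteristics",
             "11 WF-100",
             "11 2000 A/D rate, samples/second",
             "11 7.5 bin width, m")
hdr <- parse_header(example)
hdr
```

For many files, something like do.call(rbind, lapply(files, function(f) parse_header(readLines(f)))) would stack the rows, provided every file yields the same columns; values come back as character and can be converted afterwards with as.numeric() or type.convert().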
Re: [R] Kolmogorow-Smirnow-Test to check if Data comes from Subbotin distribution
Dear Sara,

You could also use the Subbotools package by Giulio Bottazzi and test the goodness of fit of your data to this distribution. See: http://www.lem.sssup.it/WPLem/files/2004-14.pdf
Also, http://cafim.sssup.it/~giulio/software/subbotools/install_cygwin.html

Regards, José

Mr José Luis Iparraguirre
Senior Research Economist
Economic Research Institute of Northern Ireland
2 -14 East Bridge Street, Belfast BT1 3NQ, Northern Ireland, United Kingdom
Tel: +44 (0)28 9072 7365
Re: [R] repeated measures aov with weights
On Wed, 17 Dec 2008, Ingmar Visser wrote:

> I see, I was afraid for an answer along these lines as my problem now turns
> into a stat problem (-; Any suggestions on how to analyze such data as given

You mean, you were afraid that the help page was correct?

    Weights can be specified by a 'weights' argument, but should not be
    used with an 'Error' term, and are incompletely supported (e.g., not
    by 'model.tables').

RTFM time

> below with weights AND at the same time taking into account that there are
> repeated measurements?
>
> Best, Ingmar
>
> On 17 Dec 2008, at 11:02, Prof Brian Ripley wrote:
>
>> Weights are not supported: multistratum aov is designed for balanced
>> designs and uses projection for which weighting is inappropriate.
>>
>> On Wed, 17 Dec 2008, Ingmar Visser wrote:
>>
>>> Dear R-help,
>>>
>>> I'm facing a problem with defining a repeated measures anova with
>>> weighted data. Here's the code to reproduce the problem:
>>>
>>> # generate some data
>>> seed=11
>>> rtrep <- data.frame(rt=rnorm(100), ti=rep(1:5,20), subj=gl(20,5,100), we=runif(100))
>>>
>>> # model with within factor for subjects/repeated measurements, no problem
>>> aov(rt~ti + Error(subj/ti), data=rtrep)
>>>
>>> # model with weights and subj as between factor, ie ignoring repeated
>>> # measures, again, no problem
>>> aov(rt~ti+subj, data=rtrep, weights=we)
>>>
>>> # combination of above two: repeated measures AND weights
>>> aov(rt~ti + Error(subj/ti), data=rtrep, weights=we)
>>>
>>> The latter model gives an error (see report below), but only after
>>> fitting it, ie the error is produced by the print and summary methods of
>>> the aov objects.
>>>
>>> Any guidance is appreciated, best, Ingmar
>>>
>>> Error produced in printing the fitted aov object:
>>>
>>> Call:
>>> aov(formula = rt ~ ti + Error(subj/ti), data = rtrep, weights = we)
>>>
>>> Note: The results below are on the weighted scale
>>>
>>> Grand Mean: 0.112081
>>>
>>> Stratum 1: subj
>>>
>>> Terms:
>>>                       ti Residuals
>>> Sum of Squares  2.002826 10.869940
>>> Deg. of Freedom        1        18
>>>
>>> Residual standard error: 0.7771007
>>> Estimated effects are balanced
>>>
>>> Stratum 2: subj:ti
>>>
>>> Terms:
>>>                       ti Residuals
>>> Sum of Squares  0.382535  5.540047
>>> Deg. of Freedom        1        19
>>>
>>> Residual standard error: 0.5399828
>>> Estimated effects are balanced
>>>
>>> Stratum 3: Within
>>>
>>> Error in print.aov(xi, ...) :
>>>   dims [product 60] do not match the length of object [100]
>>> In addition: Warning message:
>>> In resid * wt^0.5 :
>>>   longer object length is not a multiple of shorter object length
>>>
>>> Ingmar Visser
>>> Department of Psychology, University of Amsterdam
>>> Roetersstraat 15
>>> 1018 WB Amsterdam
>>> The Netherlands
>>> t: +31-20-5256723

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] passing arguments to subset from a function
I wrote a dirty hack last time I faced this problem; I'll be curious to see what the proper way of dealing with the scoping and evaluation rules is.

library(datasets)

myfunction <- function(table, extraction) {
  table2 <- subset(table, extraction)
  return(table2)
}

condition1 <- quote(iris$Species == "setosa")
# I'm not sure how to evaluate within the environment; perhaps you could
# use with(table, subset(table, extraction)) in your function

# say, if you want to concatenate several conditions together
# (the comparison operator was lost in the archive; ">" is assumed here)
condition2 <- bquote(.(condition1) & iris$Sepal.Width > 3.5)

myfunction(iris, extraction = eval(condition1))
myfunction(iris, extraction = eval(condition2))

Best wishes,
baptiste

On 17 Dec 2008, at 19:07, GOUACHE David wrote:

> Hello R-helpers,
>
> I'm writing a long function in which I manipulate a certain number of
> datasets. I want the arguments of said function to allow me to adapt the
> way I do this. Among other things, I want my function to have an argument
> which I will pass on to subset() somewhere inside my function. Here is a
> quick and simplified example with the iris dataset.
>
> myfunction <- function(table, extraction) {
>   table2 <- subset(table, extraction)
>   return(table2)
> }
> myfunction(iris, extraction = Species == "setosa")
> ## end
>
> What I would like is for this function to return exactly the same thing as:
> subset(iris, Species == "setosa")
>
> Thanks for your help.
> Regards,
> David Gouache

_
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
Re: [R] surface contour plot help
Look at rotate.wireframe in the TeachingDemos package.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Brad B
> Sent: Tuesday, December 16, 2008 3:05 PM
> To: r-help@r-project.org
> Subject: Re: [R] surface contour plot help
>
> I was able to get a surface plot with wireframe; however, I can't rotate it
> around like you can with the plot3d function. Is there a way to do this in R?
>
> To: r-help@r-project.org
> Date: Tuesday, December 16, 2008, 9:13 AM
>
> I am trying to do a surface profile plot. The data is
>
> X         Y(1)       Z(1)
> 1-jan-02  2002       number
> 2-jan-02  2002       number
> . . .
> 1-jan-03  2003 (Y2)  number Z(2)
> 2-jan-03  2003 (Y2)  number Z(2)
> . . .
> until dec 31 2007.
>
> I used the plot3d function to build a scatter point plot.
>
> Call rinterface.rrun("library(rgl)")
> Call rinterface.rrun("plot3d(x,y1,z1,xlab='Date',ylab='Year',zlab='Vol',ylim=c(2001,2008))")
> Call rinterface.rrun("plot3d(x,y2,z2,add=TRUE)")
> Call rinterface.rrun("plot3d(x,y3,z3,add=TRUE)")
> Call rinterface.rrun("plot3d(x,y4,z4,add=TRUE)")
> Call rinterface.rrun("plot3d(x,y5,z5,add=TRUE)")
> Call rinterface.rrun("plot3d(x,y6,z6,add=TRUE)")
>
> Is there a way to lay a surface to this?
Re: [R] R function to calculate number of data points of each level
On Wed, Dec 17, 2008 at 02:09:57PM +0100, Philipp Pagel wrote:

> or maybe
>
> ggregate(foo$V1, by=list(foo$V1), FUN=length)

Oops, that was supposed to be 'aggregate'...

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel
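For reference, a minimal sketch of the corrected call next to table(), which is the more common way to count rows per level. The data frame foo and its column V1 follow the names used in the message above; the values are made up for illustration.

```r
# Made-up data: one column whose levels we want to count.
foo <- data.frame(V1 = c("a", "b", "a", "c", "a", "b"))

# table() counts the occurrences of each distinct value.
counts_table <- table(foo$V1)

# aggregate() gives the same counts as a data frame:
# one row per level, with the count in column 'x'.
counts_agg <- aggregate(foo$V1, by = list(level = foo$V1), FUN = length)

print(counts_table)
print(counts_agg)
```

table() returns a named table object, while aggregate() is handier when the counts are needed as a data frame for merging or plotting.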
Re: [R] Trouble pulling data from a messy ASCII file...
I usually use Unix tools (sed, awk) to process really messy data beforehand, but if you want a pure R solution it is usually possible to kludge something together with scan(), working line by line:

# read a line
# if it contains stuff you aren't interested in, go on to the next line
# if it contains one kind of interesting stuff, do X
# if it contains another kind of interesting stuff, do Y

and so on. I've done this when it was easier than alternative processing (though slower), and found that it worked best for me to read the entire line in as a string, then split it apart later and convert to numeric if appropriate.

Sarah

On Wed, Dec 17, 2008 at 2:37 PM, Titan8883 <jpla...@gmail.com> wrote:

> I am trying to pull data from flat ASCII files, but they do not have a nice
> structure so a simple read.table doesn't work.

--
Sarah Goslee
http://www.functionaldiversity.org
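The line-by-line idea can be sketched in R roughly as follows. This is only an illustration under stated assumptions: it assumes every header line has the layout "record code, value, description" seen in the posted sample, and the helper name parse_header, the lookup vector wanted, and the output column names are all made up here.

```r
# Hypothetical sample mimicking the header layout quoted in this thread.
sample_lines <- c(
  "19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt",
  "10 3.7 version of program",
  "10 3.6 version of this data file",
  "10 20081121.145730 when this file was written",
  "* * radar characteristics",
  "11 7.5 bin width, m",
  "11 0 tuning, volts"
)

# Descriptions to look for (names) and the column each one feeds (values).
wanted <- c(
  "version of program"         = "program_ver",
  "version of this data file"  = "data_file_ver",
  "when this file was written" = "date_written",
  "bin width, m"               = "bin_width",
  "tuning, volts"              = "tuning_volts"
)

parse_header <- function(lines) {
  # Split each line into: record code, value, trailing description.
  parts <- regmatches(lines, regexec("^(\\S+)\\s+(\\S+)\\s*(.*)$", lines))
  out <- setNames(rep(NA_character_, length(wanted)), unname(wanted))
  for (p in parts) {
    if (length(p) != 4) next                 # line did not match the layout
    if (p[4] %in% names(wanted)) out[wanted[[p[4]]]] <- p[3]
  }
  # The first line holds the path of the original data file.
  out["file_name"] <- basename(sub("^\\S+\\s+", "", lines[1]))
  as.data.frame(as.list(out), stringsAsFactors = FALSE)
}

# Write the sample to a temporary file, then read and parse it the same
# way a real directory of files would be handled.
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)
result <- do.call(rbind, lapply(tmp, function(f) parse_header(readLines(f))))
print(result)
```

For the real data, tmp would be replaced by a vector of paths, for example from list.files(). All values come back as character strings and can be converted with as.numeric() once the table is assembled.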
Re: [R] passing arguments to subset from a function
This thread may help?

https://stat.ethz.ch/pipermail/r-help/2007-November/145345.html

On Wed, 17 Dec 2008 20:07:08 +0100, GOUACHE David <d.goua...@arvalisinstitutduvegetal.fr> wrote:

> Among other things, I want my function to have an argument which I will
> pass on to subset() somewhere inside my function.
Re: [R] Trouble pulling data from a messy ASCII file...
It would be helpful if you could show what the output would be for the example given. Exactly what are the 'values' and what would be the 'headings'? As mentioned before, you can use readLines and then parse out the data you want. Something like Perl might be easier, but it is hard to tell from the mail.

On Wed, Dec 17, 2008 at 2:37 PM, Titan8883 <jpla...@gmail.com> wrote:

> I would ideally like to get these into a format with each data file as a
> row with columns for the various values with the description text removed
> (version of program, file version, tuning volts, etc...).

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
[R] pruning trees using rpart
Hi,

I am using the packages tree and rpart to build a classification tree to predict a 0/1 outcome. The package rpart has the advantage that the function plotcp gives a visual representation of the cross-validation results, with a horizontal line indicating the 1 standard error rule, i.e. the recommendation to select the most parsimonious model (the smallest tree) whose error is not more than one standard error above the error of the best model.

However, in the rpart package I am not getting trees of all sizes: in one example I am working with, the three sizes are 1, 2 and 5, while cv.tree in package tree gives 1, 2, 3, 4, 5, as I would guess it should (weakest link pruning successively collapses the internal nodes that contribute the least). What is the reason for this?

A second problem I am having in both packages is that the cross-validation results are highly variable between different runs of the programs. This is not unexpected, as cross-validation means that the dataset is randomly divided into 10 equal subsets, which can be done in a lot of different ways. One then hopes that the results do not depend on this very much, but I observed that they often do. Should one then do this many times, e.g. 100, each time select the model using the 1 standard error rule, and in the end count which model got selected most often? Or rather do it many times and average the means and standard errors of the prediction error? Or does a very high variability in cross-validation results mean that the dataset is too small to reach conclusions?

Kind regards and thanks for your help,
Tom
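To make the 1 standard error rule described above concrete, here is a minimal sketch that applies it to the cptable of an rpart fit. It uses the kyphosis data shipped with rpart rather than the poster's data, and the cp value and seed are arbitrary choices for illustration.

```r
library(rpart)

# Cross-validation folds are random, so fix the seed for a repeatable run.
set.seed(1)

# Grow a deliberately large tree; kyphosis ships with rpart.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", cp = 0.001)

# cptable has one row per candidate tree size, ordered from smallest to
# largest, with cross-validated error (xerror) and its std. error (xstd).
cp_tab <- fit$cptable

# Best tree by cross-validated error, and the 1-SE threshold around it.
best      <- which.min(cp_tab[, "xerror"])
threshold <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]

# Smallest tree whose cross-validated error is within the threshold.
chosen <- min(which(cp_tab[, "xerror"] <= threshold))

pruned <- prune(fit, cp = cp_tab[chosen, "CP"])
print(cp_tab)
print(pruned)
```

Re-running this with different seeds shows the run-to-run variability mentioned in the message, since both xerror and the selected row can change between runs.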
Re: [R] Confidence intervals of log transformed data
The boxcox function in the MASS package computes these intervals. If you want to see how it computes the interval you can look at the source code. MASS the book also describes the general process.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Eoin Barry
> Sent: Wednesday, December 17, 2008 7:24 AM
> To: r-help@r-project.org
> Subject: Re: [R] Confidence intervals of log transformed data
>
> Hello,
>
> I was wondering if you could tell me how to calculate 95% confidence
> intervals for lambda for a box-cox power transformation.
>
> Best wishes
> Eoin
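As a rough illustration of the general process, the sketch below takes the profile log-likelihood returned by boxcox() and reads off the usual likelihood-ratio interval: all lambda values whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum. The trees data set and the lambda grid are arbitrary choices, and the interval is only as fine as the grid.

```r
library(MASS)

# Profile log-likelihood for lambda on a fine grid (no plot, no interpolation).
bc <- boxcox(Volume ~ log(Height) + log(Girth), data = trees,
             lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Maximum-likelihood estimate of lambda on the grid.
lambda_hat <- bc$x[which.max(bc$y)]

# 95% interval: lambdas whose log-likelihood is within
# qchisq(0.95, 1) / 2 of the maximum.
cutoff <- max(bc$y) - qchisq(0.95, df = 1) / 2
ci <- range(bc$x[bc$y >= cutoff])

cat("lambda_hat:", lambda_hat, "\n")
cat("95% CI:", ci[1], "to", ci[2], "\n")
```

With plotit = TRUE the same interval is drawn on the plot as the dotted vertical lines.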
[R] Add a string to each string of an array
Dear all,

I have an array of strings

a <- c("2008q3", "2005q1", "2004q3")

I would like to add to each a[i], with i = 1,2,3, the string "IMT", such that in the end I get

b <- c("IMT2008q3", "IMT2005q1", "IMT2004q3")

Is it possible to accomplish this without a loop command? Thank you.

Best,
Bori
Re: [R] Trouble pulling data from a messy ASCII file...
The output I would be looking for would be one row for each data file with columns for each variable, so using a .csv example with a few variables would be:

-
File_name,date_written,program_ver,data_file_ver,bin_width
20080911.013115.007.17.txt,20081121.145730,3.7,3.6,7.5
--

My plan is to create a table with all the data files listed. This would allow me to find mean/min/max values for different variables, sort by a certain variable, etc. I am not limiting myself to R; I have seen awk mentioned before, so that sounds like it is worth looking at to prep the data.

Hope that helps.

jholtman wrote:

> It would be helpful if you could show what the output would be for the
> example given. Exactly what are 'values' and what would be the 'headings'.

--
View this message in context: http://www.nabble.com/Trouble-pulling-data-from-a-messy-ASCII-file...-tp21059239p21060639.html
Re: [R] Add a string to each string of an array
Hi Boris,

Yes. Try this:

paste('IMT', a, sep = '')

See ?paste.

HTH,
Jorge

On Wed, Dec 17, 2008 at 10:25 AM, Boriss <bor...@gmx.net> wrote:

> Is it possible to accomplish this without a loop command?
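A short runnable version of the suggestion, using the vector from the original question. paste() is vectorised, so the prefix is recycled over every element; paste0() is a shorthand for sep = "" that was added to R after this thread was written.

```r
a <- c("2008q3", "2005q1", "2004q3")

# paste() recycles the single prefix across all elements of 'a'.
b <- paste("IMT", a, sep = "")

# Equivalent shorthand in newer versions of R.
b2 <- paste0("IMT", a)

print(b)
```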
[R] Model building using lmer
Dear R-experts,

Quite new to R on this end, but learning fast (I hope). I am running version 2.7.1 on Windows Vista. I have a small dataset which consists of:

# NestID: nest indicator for each chicken. Siblings sharing the same nest have the same nest indicator.
# Chick: chick indicator consisting of a unique ID for each single chick.
# Year: 1, 2.
# ClutchSize: 1-, 2-, 3-eggs.
# HO: hatching order within each clutch (1, 2, 3 [first, second and third-hatched chick]).
# SibComp: sibling competence: present/absent (0, 1)
# Death2: death at two days post-hatch (0, 1)
# Death10: death at ten days post-hatch (0, 1)

So a subset of my dataset looks something like this:

NestID Chick Year ClutchSize HO Hatching SibComp Death2 Death10
     1     1    1          1  1        1       1      1       1
     2     2    1          1  1        1       1      0       0
     3     3    1          1  1        0       0      0       0
     4     4    1          1  1        1       0      1       0
     4     5    1          2  2        0       1      0       1
     5     6    1          2  1        1       0      0       0
     5     7    1          2  2        0       0      0       0
     6     8    2          3  1        1       1      0       0
     6     9    2          3  2        1       0      1       0
     6    10    2          3  3        0       1      0       0
     7    11    2          3  1        0       0      0       1
     7    11    2          3  2        0       0      0       0
     7    11    2          3  3        1       1      1       1

In order to account for lack of independence at the nest level (many chicks are siblings), I'd like to run a GLMM with random slopes and intercepts for nests. Using lmer, my model for survival at 10 days, for example, would read as follows (or not!):

model <- lmer(Death10 ~ HO + ClutchSize + SibComp + Year + (1|NestID), family=binomial, 1)
summary(model)

From what I understand, the model above includes only random intercepts for NestID. So at this point my question is: how do I make this model into one which includes both random intercepts and slopes for NestID?

Look forward to receiving your input. Thank you all for your time!

Luciano
Re: [R] passing arguments to subset from a function
On Wed, 17 Dec 2008 20:07:08 +0100, GOUACHE David <d.goua...@arvalisinstitutduvegetal.fr> wrote:

> argument which I will pass on to subset() somewhere inside my function.

I would use the example of the .() function from the plyr package in this case:

. <- function(...) {
  structure(as.list(match.call()[-1]), class = "quoted")
}

myfunction <- function(table, extraction) {
  table2 <- subset(table, eval(extraction[[1]]))
  return(table2)
}

myfunction(iris, extraction = .(Species == "setosa"))

You can pass as many arguments in .() as you wish and index correspondingly in myfunction afterwards.

HTH.
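For comparison, a base-R sketch of the same idea without a quoting helper: capture the unevaluated condition with substitute() and evaluate it with the data frame's columns in scope. The function name subset_by is made up for illustration; this is one possible approach, not the method any of the posters used.

```r
# Hypothetical wrapper: 'condition' is captured unevaluated, then evaluated
# with the columns of 'data' visible and the caller's environment as fallback.
subset_by <- function(data, condition) {
  cond_expr <- substitute(condition)
  keep <- eval(cond_expr, data, parent.frame())
  # Drop rows where the condition is NA, as subset() does.
  data[!is.na(keep) & keep, , drop = FALSE]
}

res <- subset_by(iris, Species == "setosa")
print(nrow(res))
```

The call reads just like subset(iris, Species == "setosa"). The usual caveat applies: if subset_by is itself called from another function that receives the condition as an argument, the expression has to be captured at the outermost level and passed down already quoted.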
Re: [R] Timing Portion of R Code
see ?system.time
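A small usage sketch: wrap the portion of code to be timed in braces and pass it to system.time(). The sorting task here is just a placeholder workload.

```r
# Time a block of code; the braces group several statements into one expression.
timing <- system.time({
  x <- sort(runif(1e5))
  y <- cumsum(x)
})

# 'timing' holds user, system and elapsed times in seconds.
print(timing)
print(timing[["elapsed"]])
```

Assignments made inside the braces (x and y here) remain available afterwards, so the timed code does not need to be run a second time.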
[R] using dvi with latex object: directory not correctly set, maybe due to error in shQuote()
Dear friends of R,

I want to produce a pdf file with the contents of a matrix. I employ the latex command in combination with dvi, both contained in the Hmisc package. It seems to me that the function does not correctly set the directory.

tbl.loc <- matrix(1:4, nc=2)
latex.obj <- latex(tbl.loc)
dvi(latex.obj)

warning: extra args ignored after 'cd'
H:\PROJECTS\data
warning: extra args ignored after 'yap'

When I have a look at the function dvi.latex I find the following line which, I guess, is meant to set the new directory and to run latex.

sys(paste("cd", shQuote(tempdir()), sc, optionsCmds("latex"), "-interaction=scrollmode", shQuote(tmp)), output = FALSE)

Running just the piece shQuote(tempdir()) returns

> shQuote(tempdir())
[1] "\"C:\\DOKUME~1\\ferimawi\\LOKALE~1\\Temp\\Rtmpr4CG3A\""
> tempdir()
[1] "C:\\DOKUME~1\\ferimawi\\LOKALE~1\\Temp\\Rtmpr4CG3A"

Is the leading \" causing the problem? How can I fix the problem?

The R-help list dealt with a related problem some while ago, but I do not think that it resolves my problem:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/62975.html

I am using Windows XP, R version 2.7.2 (2008-08-25) and Hmisc version 3.4-4.

Thanks in advance for your help.

Regards
Marco

Marco Willner
Senior Analyst Quantitative Asset Allocation
Feri Finance AG
Haus am Park
Rathausplatz 8-10
D-61348 Bad Homburg v.d.H
Tel: +49 (6172) 916 3037
Fax: +49 (6172) 916 1037
E-Mail: marco.will...@feri.de
Internet: www.feri.de
[R] bug (?!) in pam() clustering from fpc package ?
Hello all.

I wish to run k-means with Manhattan distance. Since this is not supported by the function kmeans, I turned to the pam function in the fpc package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps on starting the algorithm from the same starting points (medoids).

My questions:
1) Is there a bug in the code or in the way I am using it?
2) Is there a way to either fix the code, or is there another function in some package that can run kmeans with Manhattan distance (Manhattan distances are the sum of absolute differences)?

Here is a sample code:

require(fpc)
x <- rbind(cbind(rnorm(10,0,0.5), rnorm(10,0,0.5)), cbind(rnorm(15,5,0.5), rnorm(15,5,0.5)))
pam(x, 2, medoids = c(1,16))

output:

Medoids:
     ID
[1,]  3 -0.1406026 0.1131493
[2,] 17  4.9564839 4.6480520
...

So the initial medoids were 3 and 17, not 1 and 16 as I asked.

Thanks,
Tal

--
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs: www.talgalili.com
www.biostatistics.co.il
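A hedged sketch that may help separate the two issues raised above. As far as I can tell from ?pam, the function is provided by the cluster package (which fpc loads), it accepts metric = "manhattan", and its medoids argument only sets the starting medoids; the "Medoids" shown in the printed result are the final ones after the swap phase. The do.swap = FALSE call below skips that phase so the starting medoids can be seen unchanged. The seed and data are arbitrary.

```r
library(cluster)

set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

# Manhattan distances, starting from observations 1 and 16;
# the swap phase may move the medoids to better observations.
fit_swap <- pam(x, 2, metric = "manhattan", medoids = c(1, 16))

# Same start, but with the swap phase switched off:
# the reported medoids stay at the supplied starting points.
fit_fixed <- pam(x, 2, metric = "manhattan", medoids = c(1, 16),
                 do.swap = FALSE)

print(fit_swap$id.med)
print(fit_fixed$id.med)
```

If the two results differ, pam did use the supplied start and then improved on it, which would explain why different starting points end at the same medoids on well-separated data like this.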
[R] Plotting regression lines and points from lm using lattice
Dear R-users, Sorry if someone came out with a similar question but after one day of searching I am giving up: Does anyone know how to plot the original points used in a lm model and the set of resulting regression lines generated by the model? This is how I do it using the plot and lines functions but I would like to do it using lattice and I cannot find a way to plot in the same panel points and lines that come from different datasets. Note: I would like to use this sort of predict approach rather than using the regression coefficients of the model as in complex models I get confused when I have to combine the coefficients to build up each regression line. Many thanks in advance and sorry for bothering you. Javier fm-lm(c4~SEX+AREA+c3+SEX:AREA+SEX:c3+AREA:c3,data=mydata) mygrid-expand.grid(SEX=c(male,female),AREA=c(area1,area2,area3),c3=seq(-2.5,2.5,length.out=20)) pred_fm-predict(fm,newdata=mygrid) plot(mydata$c3, mydata$c4, col = ifelse(mydata$SEX== male, blue, red), pch = ifelse(mydata$SEX==male,2,1),xlab=c3,ylab=c4) lines(seq(-2.5, 2.5, length.out = 20), pred_fm[mygrid$SEX == female mygrid$AREA == area3] , col = red, lwd = 2, lty = 3) lines(seq(-2.5, 2.5, length.out = 20), pred_fm[mygrid$SEX == female mygrid$AREA == area1] , col = red, lwd = 2, lty = 1) lines(seq(-2.5, 2.5, length.out = 20), pred_fm[mygrid$SEX == female mygrid$AREA == area2] , col = red, lwd = 2, lty = 2) lines(seq(-2.5, 2.5, length.out = 20), pred_fm[mygrid$SEX == male mygrid$AREA == area3] , col = blue, lwd = 2, lty = 3) lines(seq(-2.5, 2.5, length.out = 20), pred_fm[mygrid$SEX == male mygrid$AREA == area1] , col = blue, lwd = 2, lty = 1) lines(seq(-2.5, 2.5, length.out = 20), pred_fm[mygrid$SEX == male mygrid$AREA == area2] , col = blue, lwd = 2, lty = 2) -- View this message in context: http://www.nabble.com/Plotting-regression-lines-and-points-from-lm-using-lattice-tp21052486p21052486.html Sent from the R help mailing list archive at Nabble.com. 
[R] win.graph question
I have what I hope is a ridiculously simple question. I am trying to follow an example that uses the function win.graph(), but my machine does not recognize win.graph(). I am running R 2.8.0 on OS X. Is there some OS X specific function that replaces win.graph or am I missing some package? Thanks in advance
Re: [R] Check if data frame column is numeric
On Tue, 16 Dec 2008 16:25:07 +0100, Mark Heckmann mark.heckm...@gmx.de wrote:

Hi R-users, I want to apply a function to each column of a data frame that is numeric.

colwise(), numcolwise() and catcolwise() in the plyr package turn a function that operates on vectors into one that operates on the columns of a data frame. In your case it would be:

    numcolwise(your.fun)(your.data.frame)

Vitalie.
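For readers without plyr installed, roughly the same effect can be had in base R by first picking out the numeric columns. This is a minimal sketch, not the plyr implementation; the data frame df and its column names are invented for the example.

```r
# Toy data frame with one non-numeric column (names are made up for illustration).
df <- data.frame(a = c(1, 2, 3),
                 b = c("x", "y", "z"),
                 c = c(10.5, 20.5, 30.5))

# Logical vector: TRUE for every numeric column.
is_num <- sapply(df, is.numeric)

# Apply the function (here mean) to the numeric columns only.
col_means <- sapply(df[is_num], mean)

# With plyr installed, numcolwise(mean)(df) would return the same numbers as a one-row data frame.
```

Replacing mean with any function that takes a vector gives the general pattern.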
[R] PREDICT NEW VALUES FROM REGRESSION MODEL, EST. ST.ERROR, AND CI
Greetings,

I'd be grateful if a good Samaritan could help me approach this problem with my data. I've created the following model:

    lm(formula = OUTCOME ~ VAR1 + VAR2)
    summary(model)

    Call:
    lm(formula = OUTCOME ~ VAR1 + VAR2)

    Residuals:
        Min      1Q  Median      3Q     Max
    -1.4341 -0.3621  0.1879  0.4994  0.7696

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  1.89020    0.26826   7.046 5.92e-07 ***
    VAR1         0.04725    0.06001   0.787    0.440
    VAR2         0.04139    0.05655   0.732    0.472

    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.6618 on 21 degrees of freedom
    Multiple R-squared: 0.9474, Adjusted R-squared: 0.9424
    F-statistic: 189.2 on 2 and 21 DF, p-value: 3.696e-14

But now I need to predict OUTCOME (Y) when VAR1 = 8 and VAR2 = 64, estimate the standard error of the predicted value, and construct a 95% CI.

Your help is much appreciated.

RG

Ricardo L Gomez
Center for International Education
University of Massachusetts-Amherst
Telephone: (413)545-0465 | Fax: (413)545-1263
Web Address http://www.umass.edu/cie
E-mail: c...@educ.umass.edu
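The three quantities asked for (prediction, its standard error, and a 95% confidence interval) can all be obtained from predict() on an lm fit. The sketch below uses simulated data, since the poster's data were not included; the numbers it produces are therefore not the poster's, and dat and new are names chosen for the example.

```r
# Simulated stand-in data with the same variable names as in the question (assumption).
set.seed(123)
dat <- data.frame(VAR1 = runif(24, 0, 10), VAR2 = runif(24, 0, 100))
dat$OUTCOME <- 1.9 + 0.05 * dat$VAR1 + 0.04 * dat$VAR2 + rnorm(24, sd = 0.66)

model <- lm(OUTCOME ~ VAR1 + VAR2, data = dat)

# The point at which a prediction is wanted.
new <- data.frame(VAR1 = 8, VAR2 = 64)

# Fitted value with a 95% confidence interval for the mean response.
ci <- predict(model, newdata = new, interval = "confidence", level = 0.95)

# Standard error of the fitted value.
se <- predict(model, newdata = new, se.fit = TRUE)$se.fit

# The interval is fit +/- t quantile * se, with residual degrees of freedom.
by_hand_upper <- ci[, "fit"] + qt(0.975, df.residual(model)) * se
```

Using interval = "prediction" instead gives the wider interval for a single new observation rather than for the mean response; which one is wanted depends on the question being asked.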
Re: [R] Problems with graphical devices, e.g., png(), pdf(): blurry graphical output
On Wed, Dec 17, 2008 at 3:06 AM, Martyn Plummer plum...@iarc.fr wrote:

The artefacts that you see are a normal result of using bitmap graphics devices. I have tried to explain these below:

Thanks very much for your explanations, MP; they were quite informative!!

I recognize that others may feel differently, but to me, the default PNG being produced by my system -- e.g., http://www.piccdrop.com/images/1229495327.png -- is not ideal for presentation on the web. That image is blurry at 100% in every web browser and image viewer I've tried (even when I turn off anti-aliasing in Eye of Gnome, as MP suggests).

Given my feelings about the blurriness, my question is: how can I produce a non-blurry image via png()? I recognize that I can do so via

    png('plot.png', type='Xlib')

but I am wondering if there are solutions that don't involve Xlib.

I guess my second question (if anybody has patience for it) is: what is the philosophy behind the current behavior? I was able to find some of that here: http://www.cairographics.org/FAQ/#sharp_lines ... but I am wondering if somebody could elaborate on this philosophy in relation to R and statistical graphics, or point me to some links that do so.

The link above does suggest that sharp single-pixel lines are possible via Cairo. So: are sharp single-pixel lines possible via Cairo in R, and if so, how (this is of course the question I've asked above)? And if not, why?

(Just to be clear: I'm not saying the current implementation is necessarily bad. At present, I'm just trying to understand more about why it is the way it is.)
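One option that may be worth trying, assuming an R build with cairo support, is the antialias argument of png(), which the cairo-based device types accept. The sketch below is an illustration rather than a guaranteed fix: whether the result looks sharp still depends on the viewer, and the file name and plot are made up for the example.

```r
# Write to a temporary file so the sketch does not touch the working directory.
out <- file.path(tempdir(), "sharp.png")

if (capabilities("cairo")) {
  # antialias = "none" asks the cairo-based device not to smooth lines and text.
  png(out, width = 480, height = 480, type = "cairo", antialias = "none")
} else {
  # No cairo support in this build: fall back to whatever png() uses by default.
  png(out, width = 480, height = 480)
}
plot(1:10, type = "l", main = "png() without anti-aliasing")
dev.off()
```

Comparing this file with one written using the default settings, both viewed at 100%, shows whether anti-aliasing in the file producer is the source of the perceived blur.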
Re: [R] Shrink Trellis margins settings (when printed to png file)
On Wed, Dec 17, 2008 at 8:11 AM, Mark Heckmann mark.heckm...@gmx.de wrote:

Dear R-experts, I have two problems:

PROBLEM (1)

I want to produce a very small png file (35 x 18 px) that contains a histogram without a figure region or margins, only the pure heights. In the base graphics system this is simple:

    png(filename = "hist.png", res = 72, width = 35, height = 18)
    par(mar = c(0, 0, 0, 0), oma = c(0, 0, 0, 0))
    hist(rnorm(100), main = "")
    dev.off()

Now I want a grid graphics output as I need the graphic as an object. I tried several trellis.par settings but I was not able to figure it out

You could do it this way if you really want, but a better approach would be to directly plot the pieces you want (rather than start with a high-level solution and get rid of the pieces you don't want). For example:

    library(lattice)
    library(grid)
    x <- rnorm(100)
    limits <- prepanel.default.histogram(x, breaks = NULL)
    ## grid.newpage() # to start a new page
    pushViewport(viewport(xscale = extendrange(limits$xlim),
                          yscale = extendrange(limits$ylim)))
    panel.histogram(x, breaks = NULL)

-Deepayan

(PROBLEM (1)). Up to now it looks like this:

    library(lattice)
    histogram(rnorm(100), xlab = "", ylab = "",
              par.settings = list(
                  axis.line = list(col = "transparent"),
                  xlab.text = list(col = "transparent"),
                  ylab.text = list(col = "transparent"),
                  axis.text = list(col = "transparent")))

This looks acceptable, although I would like smaller margins, that is to say no margins at all. How can I achieve that?

PROBLEM (2)

The second problem is that when it is printed to the png file, the graphic consists almost entirely of margins. The main part of the plot shrinks to some tiny points. I don't know how to change the settings so that I get the same as in the base system. Does anyone know?

TIA
Mark
[R] Timing Portion of R Code
Dear all,

What's the common R idiom to do this task? For example, I want to compute the running time of one function in R:

    mylong_running_func(100)

How can I compute the running time of the function above?

- Gundala Viswanath
Jakarta - Indonesia
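Base R's usual tools for this are system.time() for a single expression and differences of proc.time() for a longer stretch of code. The sketch below is illustrative only: mylong_running_func is a stand-in defined here so the example runs, not the poster's function.

```r
# Hypothetical stand-in for the function in the question: it just sleeps a little.
mylong_running_func <- function(n) {
  for (i in seq_len(n)) Sys.sleep(0.001)
  invisible(n)
}

# Time a single call; the result holds user, system and elapsed times in seconds.
timing <- system.time(mylong_running_func(100))
elapsed <- timing[["elapsed"]]

# Time a larger block by differencing proc.time() before and after it.
t0 <- proc.time()
mylong_running_func(50)
mylong_running_func(50)
elapsed2 <- (proc.time() - t0)[["elapsed"]]
```

The elapsed entry is wall-clock time, while the user and system entries are CPU time, so for code that mostly waits (as here) the elapsed value is the informative one.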
Re: [R] pruning trees using rpart
On Wed, 17 Dec 2008, Tom Cattaert wrote:

Hi, I am using the packages tree and rpart to build a classification tree to predict a 0/1 outcome. The package rpart has the advantage that the function plotcp gives a visual representation of the cross-validation results, with a horizontal line indicating the 1 standard error rule, i.e. the recommendation to select the most parsimonious model (the smallest tree) whose error is not more than one standard error above the error of the best model. However, in the rpart package I am not getting trees of all sizes: for example, the three sizes are 1, 2, 5 in one example I am working with, while cv.tree in package tree gives 1, 2, 3, 4, 5, as I would guess it should (weakest link pruning successively collapses the internal nodes that contribute the least). What is the reason for this?

How are we to know without the reproducible example you were asked for? The pruning sequence need not cover all sizes, but it depends on the inputs and the tuning parameters.

A second problem I am having in both packages is that the cross-validation results are highly variable between different runs of the programs. This is not unexpected, as cross-validation means that the dataset is randomly divided into 10 equal subsets, which can be done in a lot of different ways. One then hopes that the results do not depend on this very much, but I observed that they often do. Should one then do this many times, e.g. 100, each time select the model using the 1 standard error rule, and in the end count which model got selected most often? Or rather do it many times and average the means and standard errors of the prediction error? Or does a very high variability in cross-validation results mean that the dataset is too small to reach conclusions?

MASS (the book) covers this.

Kind regards and thanks for your help, Tom

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
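The tree sizes that rpart actually considers, and the 1 standard error rule described in the question, can both be read off the cptable component of a fitted rpart object. The following is a minimal sketch using the kyphosis data shipped with rpart, chosen only because the poster's data were not available; the variable names best, thresh and pick are invented for the example.

```r
library(rpart)

# Cross-validation is random, so fix the seed to make one run repeatable.
set.seed(2008)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# One row per tree in the pruning sequence; nsplit need not increase in steps of 1.
cp <- fit$cptable

# 1-SE rule: smallest tree whose cross-validated error is within one
# standard error of the minimum cross-validated error.
best   <- which.min(cp[, "xerror"])
thresh <- cp[best, "xerror"] + cp[best, "xstd"]
pick   <- min(which(cp[, "xerror"] <= thresh))

pruned <- prune(fit, cp = cp[pick, "CP"])
```

Printing cp shows which sizes appear in the sequence for a given data set and set of control parameters. Rerunning the block with different seeds gives a feel for how much the selected tree varies between cross-validation runs.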
[R] glmnet : Error in validObject(.Object) :
Could anyone help? I have started to learn the glmnet package. I tried the example in the manual:

    x=matrix(rnorm(100*20),100,20)
    y=rnorm(100)
    fit1=glmnet(x,y)

When I tried to fit the model, I received the error message:

    Error in validObject(.Object) :
      invalid class dgCMatrix object: row indices are not sorted within columns

Thank you very much!
[R] OFF topic testing for positive coeffs
Dear all,

This is off-topic; however, I hope someone can give me a useful suggestion.

Given the regression model

    y = b0 + b1*x + e

I am interested in testing for positive coefficients, namely

    H0: b0 > 0 AND b1 > 0
    H1: b0, b1 unconstrained

It is simple to estimate the model under H0 and H1 (there are several suggestions on the R list about estimation, but nothing about testing) and to perform a likelihood ratio test by comparing the logLik under the constrained and the unconstrained models. However, I do not know how many degrees of freedom to use. The model under H0 uses two df, but it is reasonable to believe that the real dimension is <= 2.

Can anyone give me advice or suggest references?

Many thanks,
vito

-- 
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo
Re: [R] Trouble pulling data from a messy ACII file...
On Wed, 17 Dec 2008, Titan8883 wrote:

Hi all, I am a new graduate student who is also new to R. I am ok with the basics, but the problem I am having right now seems beyond what I can do, so I am looking for advice.

Advice? OK. Here goes.

I would suggest you pull one of the data files into a character vector using readLines(). From there you can try out different methods of finding the data elements in the file that you want to extract.

If it is guaranteed that 'nominal pulse width' ALWAYS shows up on the same line in every file, you can use the line numbers to figure out where to look for data elements. If not, you will probably want to get familiar with grep() and regular expressions; see ?regex and use RSiteSearch("regexpr") and the like to turn up the many useful discussions of them on this list. From there sub(), gsub(), strsplit(), and friends will help you. They may take a good deal of fiddling to get them to digest your data.

If parts of your file can be read using read.csv() or scan() or something, you can use a textConnection() to pass some lines that readLines() has stored for you to read.csv().

Once you get so that one data file can be processed, rolling up your code as a function should not be too hard. Put the function in a loop using

    res <- list()
    for (ifile in your.file.list) res[[ifile]] <- your.function(ifile)

or

    res <- sapply(your.file.list, your.function)

or

    res <- lapply(your.file.list, your.function)

and you are ready to chomp away at your files.

HTH, Chuck

I am trying to pull data from flat ASCII files, but they do not have a nice structure so a simple read.table doesn't work. An example first half of a data file is below:

    --
    19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt
    10 s name of program that wrote this file trkplt name of program that wrote this file
    10 GORDON machine that generated this file machine that generated this file
    10 3.7 version of program
    10 3.6 version of this data file
    105.81 version of Universal Library
    10 20081121.145730 when this file was written
    10 Windows_XP operating system used operating system used
    * * radar characteristics
    11 WF-100
    11 2000 A/D rate, samples/second
    11 7.5 bin width, m
    11 800 nominal PRF, Hz
    11 0.25 nominal pulse width, microsec
    11 0 tuning, volts
    11 3.19779 nominal wave length, cm
    ---

...the file goes on from there...

How would I go about getting this data into some kind of useful format? This is one of about 1000 files I will need to go through. I would ideally like to get these into a format with each data file as a row, with columns for the various values and the description text removed (version of program, file version, tuning volts, etc.).

I'm not looking for a cut and paste answer, but perhaps some direction on where I should start. I have only done basic .csv, table, and line inputs up until now.

Thanks for any advice

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu
UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
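The readLines() and grep() route suggested above can be sketched as follows. This is a minimal illustration under assumptions, not a parser for the real files: it assumes each record has the shape "code value description" as in the sample, and the function names extract_value and read_one_file, as well as the choice of three fields, are invented for the example.

```r
# A few lines shaped like the sample in the question.
sample_lines <- c(
  "10 3.7 version of program",
  "11 2000 A/D rate, samples/second",
  "11 7.5 bin width, m",
  "11 800 nominal PRF, Hz",
  "11 0.25 nominal pulse width, microsec"
)

# Find the first line containing 'description' and return its second field as a number.
extract_value <- function(lines, description) {
  hit <- grep(description, lines, fixed = TRUE, value = TRUE)
  if (length(hit) == 0) return(NA_real_)
  fields <- strsplit(hit[1], " +")[[1]]
  as.numeric(fields[2])
}

# Read one file (or connection) and return a named vector, i.e. one row of the final table.
read_one_file <- function(con) {
  lines <- readLines(con)
  c(prf         = extract_value(lines, "nominal PRF"),
    pulse_width = extract_value(lines, "nominal pulse width"),
    bin_width   = extract_value(lines, "bin width"))
}

# Demonstrate on the sample lines via a textConnection instead of a real file.
con <- textConnection(sample_lines)
one_row <- read_one_file(con)
close(con)
```

For many files, something like res <- t(sapply(your.file.list, read_one_file)) would give one row per file. Searching by description rather than by line number is the more robust of the two strategies mentioned in the reply when the layout is not guaranteed to be fixed.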
Re: [R] glmnet : Error in validObject(.Object) :
Dear Hao,

It works for me. Here is my sessionInfo():

    sessionInfo()
    R version 2.8.0 Patched (2008-11-08 r46864)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] grid stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] glmnet_1.1-1 Matrix_0.999375-16 lattice_0.17-15 plotrix_2.5

Which version of R are you using?

HTH,
Jorge

On Wed, Dec 17, 2008 at 3:50 PM, Hao haotang1...@gmail.com wrote:

Could any one help ? I start to learn the glmnet package. I tried with the example in the manual.

    x=matrix(rnorm(100*20),100,20)
    y=rnorm(100)
    fit1=glmnet(x,y)

When I tried to fit the model, I received the error message:

    Error in validObject(.Object) :
      invalid class dgCMatrix object: row indices are not sorted within columns

Thank you very much!
Re: [R] Find all numbers in a certain interval
Hi,

If you can formulate your question in terms of the actual problem you have with the data.frame, it would be easier to answer. For the time being, check subset() to see if it is what you want.

SV.

On Tue, 16 Dec 2008 11:09:19 +0100, Antje niederlein-rs...@yahoo.de wrote:

Hi all, I'd like to know if I can solve this with a shorter command:

    a <- rnorm(100)
    which(a > -0.5 & a < 0.5)
    # would give me all indices of numbers greater than -0.5 and smaller than +0.5

I have something similar with a dataframe and it sometimes produces quite long commands... I'd like to have something like:

    which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?

Antje
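A helper of the kind asked for is a one-liner in base R, and it combines naturally with the subset() suggestion above. The sketch below is one possible definition, with open interval ends as in the question; within.interval is the name used in the question, not an existing R function, and the data are made up.

```r
# Hypothetical helper: TRUE where lower < x < upper (both ends excluded).
within.interval <- function(x, lower, upper) x > lower & x < upper

a <- c(-1, -0.2, 0, 0.4, 0.5, 2)

# Indices of the values strictly between -0.5 and 0.5.
idx <- which(within.interval(a, -0.5, 0.5))

# The same test used to pick rows of a data frame.
d <- data.frame(id = seq_along(a), a = a)
inside <- subset(d, within.interval(a, -0.5, 0.5))
```

If closed ends are wanted instead, replacing > and < by >= and <= in the helper is enough.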
Re: [R] Find all numbers in a certain interval
Thanks a lot for every answer I got! I could solve my problem! Greg, your proposal seems to be quite useful for me :-) Thank you.

Ciao,
Antje

Antje wrote:

Hi all, I'd like to know if I can solve this with a shorter command:

    a <- rnorm(100)
    which(a > -0.5 & a < 0.5)
    # would give me all indices of numbers greater than -0.5 and smaller than +0.5

I have something similar with a dataframe and it sometimes produces quite long commands... I'd like to have something like:

    which(within.interval(a, -0.5, 0.5))

Is there anything I could use for this purpose?

Antje
Re: [R] repeated measures aov with weights
Weights are not supported: multistratum aov is designed for balanced designs and uses projection, for which weighting is inappropriate.

On Wed, 17 Dec 2008, Ingmar Visser wrote:

Dear R-help, I'm facing a problem with defining a repeated measures anova with weighted data. Here's the code to reproduce the problem:

    # generate some data
    seed=11
    rtrep <- data.frame(rt=rnorm(100), ti=rep(1:5,20), subj=gl(20,5,100), we=runif(100))

    # model with within factor for subjects/repeated measurements, no problem
    aov(rt~ti + Error(subj/ti), data=rtrep)

    # model with weights and subj as between factor, ie ignoring repeated measures,
    # again, no problem
    aov(rt~ti+subj, data=rtrep, weights=we)

    # combination of above two: repeated measures AND weights
    aov(rt~ti + Error(subj/ti), data=rtrep, weights=we)

The latter model gives an error (see report below), but only after fitting it, i.e. the error is produced by the print and summary methods of the aov objects.

Any guidance is appreciated, best, Ingmar

Error produced in printing the fitted aov object:

    Call:
    aov(formula = rt ~ ti + Error(subj/ti), data = rtrep, weights = we)

    Note: The results below are on the weighted scale

    Grand Mean: 0.112081

    Stratum 1: subj

    Terms:
                          ti Residuals
    Sum of Squares  2.002826 10.869940
    Deg. of Freedom        1        18

    Residual standard error: 0.7771007
    Estimated effects are balanced

    Stratum 2: subj:ti

    Terms:
                          ti Residuals
    Sum of Squares  0.382535  5.540047
    Deg. of Freedom        1        19

    Residual standard error: 0.5399828
    Estimated effects are balanced

    Stratum 3: Within

    Error in print.aov(xi, ...) :
      dims [product 60] do not match the length of object [100]
    In addition: Warning message:
    In resid * wt^0.5 :
      longer object length is not a multiple of shorter object length

Ingmar Visser
Department of Psychology, University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
t: +31-20-5256723

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
Re: [R] repeated measures aov with weights
I see. I was afraid of an answer along these lines, as my problem now turns into a stats problem (-;

Any suggestions on how to analyze data such as those in my original message, with weights AND at the same time taking into account that there are repeated measurements?

Best, Ingmar

On 17 Dec 2008, at 11:02, Prof Brian Ripley wrote:

Weights are not supported: multistratum aov is designed for balanced designs and uses projection, for which weighting is inappropriate.

Ingmar Visser
Department of Psychology, University of Amsterdam
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
t: +31-20-5256723
Re: [R] Model building using lmer
Dear Luciano,

The 1 in (1|NestID) indicates only a random intercept. Note that in most models in R, a 1 on the right-hand side of the formula indicates the intercept, while -1 or 0 indicates no intercept. ~X, which is equivalent to ~X + 1, indicates a slope along X and an intercept. Hence a random slope and intercept is written as (X|NestID). If you only want the random slope, then write (X + 0|NestID). Note that (X|NestID) implies that the random slope and the random intercept can be correlated. If you need them to be independent, you will have to write (X + 0|NestID) + (1|NestID).

HTH,

Thierry

PS Next time try to send questions about lmer to the R-sig-mixed-models mailing list.

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Luciano La Sala
Sent: Wednesday 17 December 2008 15:47
To: r help
Subject: [R] Model building using lmer

Dear R-experts,

Quite new to R on this end, but learning fast (I hope). I am running version 2.7.1 on Windows Vista. I have a small dataset which consists of:

# NestID: nest indicator for each chicken. Siblings sharing the same nest have the same nest indicator.
# Chick: chick indicator consisting of a unique ID for each single chick.
# Year: 1, 2.
# ClutchSize: 1-, 2-, 3-eggs.
# HO: hatching order within each clutch (1, 2, 3 [first, second and third-hatched chick]).
# SibComp: sibling competence: present/absent (0, 1)
# Death2: death at two days post-hatch (0, 1)
# Death10: death at ten days post-hatch (0, 1)

So a subset of my dataset looks something like this:

    NestID Chick Year ClutchSize HO Hatching SibComp Death2 Death10
         1     1    1          1  1        1       1      1       1
         2     2    1          1  1        1       1      0       0
         3     3    1          1  1        0       0      0       0
         4     4    1          1  1        1       0      1       0
         4     5    1          2  2        0       1      0       1
         5     6    1          2  1        1       0      0       0
         5     7    1          2  2        0       0      0       0
         6     8    2          3  1        1       1      0       0
         6     9    2          3  2        1       0      1       0
         6    10    2          3  3        0       1      0       0
         7    11    2          3  1        0       0      0       1
         7    11    2          3  2        0       0      0       0
         7    11    2          3  3        1       1      1       1

In order to account for lack of independence at the nest level (many chicks are siblings), I'd like to run a GLMM with random slopes and intercepts for nests. Using lmer, my model for survival at 10 days, for example, would read as follows (or not!):

    model <- lmer(Death10 ~ HO + ClutchSize + SibComp + Year + (1|NestID), family=binomial, 1)
    summary(model)

From what I understand, the model above includes only random intercepts for NestID. So at this point my question is: how do I make this model into one which includes both random intercepts and slopes for NestID?

Look forward to receiving your input. Thank you all for your time!

Luciano
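The random-effects notation described in the reply can be written out as ordinary R formula objects, which makes the four variants easy to compare side by side. This is only a sketch of the syntax: the choice of HO as the variable with a random slope is an assumption made for the example, and nothing is fitted here because that requires the lme4 package and the full data.

```r
# Random intercept per nest only (what the model in the question specifies).
f_intercept <- Death10 ~ HO + ClutchSize + SibComp + Year + (1 | NestID)

# Random intercept and random slope for HO, allowed to be correlated.
f_both <- Death10 ~ HO + ClutchSize + SibComp + Year + (HO | NestID)

# Random slope for HO only, no random intercept.
f_slope_only <- Death10 ~ HO + ClutchSize + SibComp + Year + (0 + HO | NestID)

# Random intercept and random slope forced to be independent.
f_uncorrelated <- Death10 ~ HO + ClutchSize + SibComp + Year +
  (1 | NestID) + (0 + HO | NestID)

# With lme4 installed, any of these would be passed to a fitting function,
# e.g. in current lme4: glmer(f_both, data = chicks, family = binomial)
# where 'chicks' is a hypothetical name for the poster's data frame.
```

With only one to three chicks per nest, a random slope per nest may be weakly identified, so comparing the intercept-only and slope models on the real data would be a sensible check.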
Re: [R] bug (?!) in pam() clustering from fpc package ?
Dear Tal,

pam is not in the fpc package but in the cluster package. Look at ?pam and ?pam.object to find out what it does. As far as I can see, the medoids in the output object are the final cluster medoids, not the initial ones, which presumably explains the observed behaviour.

Best regards,
Christian

On Wed, 17 Dec 2008, Tal Galili wrote:

Hello all. I wish to run k-means with manhattan distance. Since this is not supported by the function kmeans, I turned to the pam function in the fpc package. Yet, when I tried to have the algorithm run with different starting points, I found that pam ignores them and keeps on starting the algorithm from the same starting points (medoids).

My questions:
1) Is there a bug in the code or in the way I am using it?
2) Is there a way to either fix the code or to use another function in some package that can run kmeans with manhattan distance (manhattan distances are the sum of absolute differences)?

Here is some sample code:

    require(fpc)
    x <- rbind(cbind(rnorm(10,0,0.5), rnorm(10,0,0.5)),
               cbind(rnorm(15,5,0.5), rnorm(15,5,0.5)))
    pam(x, 2, medoids = c(1,16))

Output:

    Medoids:
         ID
    [1,]  3 -0.1406026 0.1131493
    [2,] 17  4.9564839 4.6480520
    ...

So the initial medoids were 3 and 17, not 1 and 16 as I asked.

Thanks, Tal

-- 
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs: www.talgalili.com www.biostatistics.co.il

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
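The point that the reported medoids are the final ones, not the starting ones, can be checked with the cluster package directly. The sketch below is illustrative: the data are simulated as in the question, the seed is arbitrary, and metric = "manhattan" is added because that was the distance the poster wanted.

```r
library(cluster)

# Two well-separated groups: rows 1-10 around (0, 0), rows 11-25 around (5, 5).
set.seed(42)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

# Start from observations 1 and 16; the swap phase may then move the medoids.
fit <- pam(x, 2, medoids = c(1, 16), metric = "manhattan")
final_medoids <- fit$id.med   # indices of the medoids AFTER optimisation

# Skipping the swap phase leaves the supplied starting medoids in place.
fixed <- pam(x, 2, medoids = c(1, 16), metric = "manhattan", do.swap = FALSE)
start_medoids <- fixed$id.med
```

Comparing final_medoids with start_medoids shows how far the swap phase moved away from the requested starting points. Because the swap phase only accepts changes that reduce the objective, different starting medoids can legitimately end at the same final pair.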
[R] pcnm - scaling
Dear listers,

I am using the pcnm function (spacemakeR) to obtain eigenvectors for a spatial grid of sampling sites. These pcnm eigenvectors are then used in multivariate ordination to test whether community composition follows local environment or rather shows spatial autocorrelation, and I get support for spatial autocorrelation. In an RDA analysis, I can see which eigenvectors correlate most with my data, which can be interpreted as the spatial scale being rather 'fine' or 'broad'.

However, is there a way to read off the spatial scale on which the data correlate best, i.e. is it possible to extract a spatial scale describing the spatial autocorrelation (e.g. from the eigenvalues)?

My data:
- irregularly spaced sampling sites, total scale ca. 2000 km, most sites a few tens of km apart
- I use the function rdist.earth to obtain a distance matrix of all sites prior to using pcnm

Thanks!
Robert

Robert Ptacnik
Norwegian Institute for Water Research (NIVA)
Gaustadalléen 21
NO-0349 Oslo
FON +47 982 277 81
FAX +47 221 852 00
Re: [R] Principal Component Analysis - Selecting components? + right choice?
Hi,

I have been testing some of the alternative approaches suggested. The best PC set may not be the best predictor subset, but is it true that this is generally not the case? If you have to explore data patterns and (potential) relationships between a response variable and a large set of candidate predictors, PCs still seem to be the best candidate for a relatively quick test. I think sometimes you have to trade off against time (for example, computing time), and if any pattern emerges from the response vs. the first k PCs, then you investigate further. Am I completely wrong there? What alternative do you have that reduces the computational demand so drastically for exploratory purposes? Furthermore, is it really generally not the case that the best PC set, say the top k PCs, contains the best predictor subset in linear regression, or does that happen only in specific situations (that is, generally the best PC set is actually a good set of predictors, but in some specific cases it is not)?

Best,

On Thursday 11 December 2008 17:30:51 you wrote:

Hi, It is generally not the case that the best PC set, say, the top k PCs (where k < p, p being the number of predictors) contains the best predictor subset in linear regression. Hadi and Ling (Amer Stat, 1998) show that it is even possible to have an extreme situation where the first (p-1) PCs contribute nothing towards explaining the variation in the response, yet the last PC alone contributes everything. Their theorem is that if the true vector of regression coefficients is in the direction of the j-th eigenvector (of the correlation matrix), then the j-th PC alone will contribute everything to the model fit, while the remaining PCs will contribute zilch. They illustrate this phenomenon with a real data set from a classic text on regression, Draper and Smith.

Ravi.

Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of S Ellison Sent: Thursday, December 11, 2008 9:37 AM To: r-help@r-project.org; Corrado Subject: Re: [R] Principal Component Analysis - Selecting components? + right choice?

If you're intending to create a model using PCs as predictors, select the PCs based on whether they contribute significantly to the model fit. In chemometrics (multivariate stats in chemistry, among other things), if we're expecting 3 or 4 PCs to be useful in a principal component regression, we'd generally start with at least the first half-dozen or so and let the model fit sort them out. The reason for not preselecting too rigorously early on is that there's no guarantee at all that the first couple of PCs are good predictors for what you're interested in. They're properties of the predictor set, not of the response set. Mind you, there used to be something of a gap between chemometrics and proper statistics; I'm sure chemometricians used to do things with data that would turn a statistician pale. You could also look for a PLS model, which (if I recall correctly) actually uses the response data to select the latent variables used for prediction.

S

Corrado ct...@york.ac.uk 11/12/2008 11:46:37

Dear R gurus, I have some climatic data for a region of the world. They are monthly averages 1950-2000 of precipitation (12 months), minimum temperature (12 months) and maximum temperature (12 months). I have scaled them to 2 km x 2 km cells, and I have around 75,000 cells. I need to feed them into a statistical model as covariates, to use them to predict a response variable. The climatic data are obviously correlated: precipitation for January is correlated with precipitation for February and so on, and even precipitation and temperature are heavily correlated. I did some correlation analysis and they are all strongly correlated. I thought of running PCA on them in order to reduce the number of covariates I feed into the model. I ran the PCA using prcomp, quite successfully. Now I need a criterion to select the right number of PCs (that is: is it 1, 2, 3, 4?). What criterion would you suggest? At the moment I am using a criterion based on a threshold, but that is highly subjective, even if there are some rules of thumb (Jolliffe, Principal Component Analysis, II Edition, Springer Verlag, 2002). Could you suggest something more rigorous? By the way, do you think I would have been better off using something different from PCA?

Best, -- Corrado Topi Global Climate
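The point made by Ravi and S Ellison, that the leading PCs describe the predictors and need not predict the response, can be sketched with simulated data. Everything below is an illustrative assumption rather than Corrado's climate data: three strongly correlated predictors and a response constructed to follow the last PC.

```r
set.seed(1)
n <- 200
z <- rnorm(n)
# Three strongly correlated, hypothetical predictors.
X <- cbind(x1 = z + rnorm(n, sd = 0.1),
           x2 = z + rnorm(n, sd = 0.1),
           x3 = z + rnorm(n, sd = 0.1))

pc <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- pc$x                       # columns PC1, PC2, PC3

# Response built to lie along the LAST component, plus a little noise.
y <- scores[, 3] + rnorm(n, sd = 0.01)

# Share of variance in the predictors explained by each PC.
var_explained <- pc$sdev^2 / sum(pc$sdev^2)

# R-squared from regressing y on each PC separately.
r2 <- sapply(1:3, function(j) summary(lm(y ~ scores[, j]))$r.squared)

round(var_explained, 3)
round(r2, 3)
```

Here PC1 carries almost all the variance of the predictors yet explains almost none of y, while PC3 does the opposite, which is why selecting components by their contribution to the model fit (or using PLS) can differ from selecting by variance explained.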
Re: [R] odfWeave learning resources
In general I try not to post questions to forums until I've tried my best to read about them in the available documentation. I recently undertook a project that used odfWeave and have been very pleased with the package. But, the R help documentation suggests that there are more sophisticated things I can do - for example, with conditionally formatted tables. Can anyone point me to resources I could review to educate myself about the full capabilities of this lovely package? The package directory has a sub-directory called examples with a few different odt files in it. One is called formatting.odt and has examples for tables figures and other things. See the end of the document for the code to use odfWeave on it. If you would like to contribute example files, please let me know. -- Max __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] non numeric argument to binary operator
hi, I have a huge matrix and want to split it into 2 with the command %%2, but I get this warning message:

Error in mmat%%2 : non-numeric argument to binary operator

Here is a bit of my matrix:

     V1                                                   V2
[1,] "Affymetrix:CompositeSequence:ATH1-121501:244901_at" "2.653"
[2,] "Affymetrix:CompositeSequence:ATH1-121501:244902_at" "1.753"

-- all the best, Baur
Re: [R] non numeric argument to binary operator
I'm not sure what you are trying to do: modulo arithmetic isn't exactly splitting a matrix into 2. Or even remotely. Regardless of the goal, when numbers are shown in quotes, it's a sure sign that R thinks they are strings, and thus non-numeric, just as the error message (not a warning) states. To help further, we'd need a clear statement of your goals and a reproducible example, as in the posting guide.

Sarah

On Wed, Dec 17, 2008 at 4:57 PM, Baurzhan Aituov ait...@gmail.com wrote:

hi i have a huge matrix and want to split it into 2. with the command %%2. but i get this warning message: Error in mmat%%2 : non-numeric argument to binary operator here's the bit from my matrix:

     V1                                                   V2
[1,] "Affymetrix:CompositeSequence:ATH1-121501:244901_at" "2.653"
[2,] "Affymetrix:CompositeSequence:ATH1-121501:244902_at" "1.753"

-- Sarah Goslee http://www.functionaldiversity.org
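A small sketch of what Sarah describes, using a hypothetical two-row stand-in for the poster's matrix. The reading of "split into 2" as "first half and second half of the rows" is only a guess at the goal.

```r
# Hypothetical two-row stand-in for the poster's matrix: cbind() of a
# character column and a numeric column yields an all-character matrix.
mmat <- cbind(V1 = c("Affymetrix:CompositeSequence:ATH1-121501:244901_at",
                     "Affymetrix:CompositeSequence:ATH1-121501:244902_at"),
              V2 = c(2.653, 1.753))
is.character(mmat)   # TRUE

# Arithmetic on strings fails; capture the error message instead of stopping.
res <- tryCatch(mmat %% 2, error = function(e) conditionMessage(e))

# Convert the value column to numeric, keeping the probe IDs as names.
vals <- as.numeric(mmat[, "V2"])
names(vals) <- mmat[, "V1"]

# One reading of "split into 2": first half and second half of the rows.
half <- ceiling(length(vals) / 2)
parts <- list(first = vals[seq_len(half)], second = vals[-seq_len(half)])
```

Keeping identifiers and measurements in a data frame (one character column, one numeric column) avoids the silent coercion to character in the first place.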
[R] Error: bad value problem
Dear R-help,

Like several other subscribers, I have recently encountered a problem whereby R will execute code apparently correctly and without error, but any subsequent command will yield "Error: bad value", so that R has to be killed and restarted. We have checked this out with a few different operating systems (Windows XP/Vista and Linux) and with different versions of R. We have established the following:

1. The error occurs with versions of R from 2.5.0 onwards on all the OSs we have tried, but not with earlier versions of R.
2. The error is not reproducible between machines - identical code will fail at different points on two different machines.
3. The error is not related to contributed packages, because our code doesn't use any. The code *does*, however, use repeated calls to optim() and nlm(), and passes several arguments through a sequence of functions using
4. Occasionally, we get an error relating to subset replacement instead of "Error: bad value". For example:
   x <- rnorm(10)
   x[1] <- 3
   Error in x[1] <- 3 : could not find function "[<-"
5. The error behaviour changes as a result of minor modifications to print() statements in the code, e.g. by inserting a line that prints the value of a well-defined variable.

I suspect this really *is* a bug (particularly since it only started happening with version 2.5.0), but I figure it would be worth giving people a chance to tell me I'm an idiot before reporting it as such. In case anybody would like to try and see the error for themselves, I have uploaded some files to http://www.homepages.ucl.ac.uk/~ucakarc/Rtest/. The file ErrorDemo.r is the main script - see the file header there for more details. Files momfit.r and elmstats.dat are also required for this example to work. I'm sorry that it isn't a very simple example, but I haven't seen a simple illustration of the problem (and I couldn't find any examples in the list archives either).

With best wishes to all, Richard

Richard E. Chandler Room 135, Dept of Statistical Science, University College London, 1-19 Torrington Place, London WC1E 6BT, UK Tel: +44 (0)20 7679 1880 Fax: +44 (0)20 7383 4703 Internet: http://www.ucl.ac.uk/Stats (department) http://www.homepages.ucl.ac.uk/~ucakarc (personal) email: rich...@stats.ucl.ac.uk
Re: [R] model.tables error from aov
In addition, your model statement is odd. Note that within-S factor Type is tested with both the type I and the type II residuals, whereas only the latter should be used. Try this model instead:

aov.errs.ae <- aov(TrainErrs ~ idio*Type + Error(Subject/Type), data=learnDat.ae)

or, for more clarity:

aov.errs.ae <- aov(TrainErrs ~ idio*Type + Error(Subject + Subject:Type), data=learnDat.ae)

which explicitly denotes the two error strata.

On 17-Dec-08, at 4:00 AM, r-help-requ...@r-project.org wrote:

Your design seems to be unbalanced: multistratum aov is intended for balanced designs. My guess is that one idio subject has two Type=1 observations, in which case try removing one of them.

On Tue, 16 Dec 2008, Harlan Harris wrote:

Hi, I'm a new R user, coming from SPSS, and without a particularly strong stats background. I've got a data set that I'd like to do a mixed-design ANOVA with. No missing values. Here's the summary:

summary(learnDat.ae)
 Type    Subject       idio      struct     TrainErrs        cond
 0:20   11     : 3   idio   :28   ae  :58   Min.   : 0.00   idioae   :28
 2:19   12     : 3   nonidio:30   fact: 0   1st Qu.: 6.25   idiofact : 0
 3:19   14     : 3                          Median :11.50   nonidioae:30
        15     : 3                          Mean   :13.40
        18     : 3                          3rd Qu.:16.00
        2      : 3                          Max.   :59.00
        (Other):40

Note that the TrainErrs column is the only numeric column, and I forced everything else to be a factor. (Is that correct?) I then do the following:

aov.errs.ae <- aov(TrainErrs ~ (idio*Type) + Error(Subject/Type) + (idio), learnDat.ae)

So, idio is between-subjects and Type is within-subjects. This is based on examples I've found elsewhere.

summary(aov.errs.ae)

This seems to work fine:

Error: Subject
          Df Sum Sq Mean Sq F value Pr(>F)
idio       1    179     179    0.89   0.36
Type       1    210     210    1.05   0.32
Residuals 17   3401     200

Error: Subject:Type
          Df Sum Sq Mean Sq F value Pr(>F)
Type       2    515     258    2.44  0.103
idio:Type  2    680     340    3.22  0.053 .
Residuals 34   3595     106
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

-- Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html -Dr. John R. Vokey
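A minimal sketch of the suggested model on simulated data, including a quick check for the kind of imbalance suspected above. The data frame learnDat and its values are hypothetical stand-ins, not the poster's learnDat.ae.

```r
set.seed(42)
# Hypothetical balanced stand-in for learnDat.ae: 20 subjects, each seen
# once under each of the three Type levels; idio varies between subjects.
learnDat <- expand.grid(Subject = factor(1:20), Type = factor(c(0, 2, 3)))
learnDat$idio <- factor(ifelse(as.integer(learnDat$Subject) <= 10,
                               "idio", "nonidio"))
learnDat$TrainErrs <- rpois(nrow(learnDat), lambda = 12)

# Balance check: every Subject x Type cell should hold exactly one row.
# A cell with 0 or 2 rows signals the imbalance suspected in the thread.
cells <- with(learnDat, table(Subject, Type))
all(cells == 1)

# Between-subjects factor idio, within-subjects factor Type.
fit <- aov(TrainErrs ~ idio * Type + Error(Subject / Type), data = learnDat)
summary(fit)
names(fit)   # one component per error stratum
```

With balanced data, idio appears only in the Subject stratum and Type only in the Subject:Type stratum, unlike the output quoted above where Type shows up in both.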
Re: [R] Error: bad value problem
On 18/12/2008, at 11:34 AM, Richard E. Chandler wrote: snip

I can confirm that the error occurs. I downloaded the files and sourced "ErrorDemo.R". Doing x <- rnorm(10) after doing so triggered the error. Subsequently attempting traceback() (or anything else) simply triggered the error again. Good luck to R Core in tracking this down!

cheers, Rolf Turner
Re: [R] Error: bad value problem
On 17/12/2008 5:34 PM, Richard E. Chandler wrote:

Dear R-help, Like several other subscribers, I have recently encountered a problem whereby R will execute code apparently correctly and without error, but any subsequent command will yield "Error: bad value", so that R has to be killed and restarted. We have checked this out with a few different operating systems (Windows XP/Vista and Linux) and with different versions of R. We have established the following: snip

The symptoms you describe suggest an out-of-bounds read or write. Unfortunately, I don't see the error in R-devel or R 2.8.0. Tracking this sort of bug down on Windows is pretty hard because, as far as I know, valgrind doesn't run on Windows, and I don't know of any Windows equivalent that supports gcc. So I'm not going to be able to help, other than to suggest reducing the example to something that runs quickly and then running it after calling gctorture(). That often flushes out hard-to-reproduce bugs, but it goes very slowly...

Duncan Murdoch
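For readers unfamiliar with gctorture(), the suggestion can be sketched as follows. The function f is a hypothetical stand-in for a reduced test case; the real candidate would be a cut-down version of ErrorDemo.r.

```r
# Hypothetical reduced test case standing in for the failing code.
f <- function(n) sum(seq_len(n)^2)

# gctorture(TRUE) provokes a garbage collection at (nearly) every
# allocation, so memory that was wrongly left unprotected tends to be
# reused immediately and the fault shows up close to its cause.
gctorture(TRUE)
res <- tryCatch(f(50), finally = gctorture(FALSE))  # always switch it off again

res
```

Everything runs far more slowly while torture mode is on, which is why the example should be shrunk first and the mode switched off in a finally clause.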
Re: [R] Error: bad value problem
On 18/12/2008, at 1:09 PM, Peter Dalgaard wrote:

Rolf Turner wrote: snip I can confirm that the error occurs. I downloaded the files and sourced "ErrorDemo.R". Doing x <- rnorm(10) after doing so triggered the error. Subsequently attempting traceback() (or anything else) simply triggered the error again. Good luck to R Core in tracking this down!

Unfortunately, there's at least one R Core member for which it does NOT happen... Fedora 9 i686, R 2.8.0 and 2.8.1 RC (2008-12-15 r47214)

I should've included my session info:

R version 2.8.0 (2008-10-20)
i386-apple-darwin8.11.1
locale: C
attached base packages:
[1] datasets utils stats graphics grDevices methods base
other attached packages:
[1] misc_0.0-9 fortunes_1.3-5 MASS_7.2-44

So it's happening to me with R 2.8.0, but under Mac OS X rather than Fedora.

cheers, Rolf
Re: [R] Error: bad value problem
Peter Dalgaard wrote:

Rolf Turner wrote:

On 18/12/2008, at 11:34 AM, Richard E. Chandler wrote: snip

I can confirm that the error occurs. I downloaded the files and sourced "ErrorDemo.R". Doing x <- rnorm(10) after doing so triggered the error. Subsequently attempting traceback() (or anything else) simply triggered the error again. Good luck to R Core in tracking this down!

Unfortunately, there's at least one R Core member for which it does NOT happen... Fedora 9 i686, R 2.8.0 and 2.8.1 RC (2008-12-15 r47214)

Wrong, actually... It did happen when run with 2.8.0 and valgrind (but not without it). Unfortunately there were no errors from valgrind, yuck!!!

> a
Error: bad value
Save workspace image? [y/n/c]: n
==29050==
==29050== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 116 from 1)
==29050== malloc/free: in use at exit: 13,984,907 bytes in 10,061 blocks.
==29050== malloc/free: 117,023 allocs, 106,962 frees, 93,104,003 bytes allocated.
==29050== For counts of detected errors, rerun with: -v
==29050== searching for pointers to 10,061 not-freed blocks.
==29050== checked 14,502,208 bytes.
==29050==
==29050== LEAK SUMMARY:
==29050==    definitely lost: 0 bytes in 0 blocks.
==29050==      possibly lost: 0 bytes in 0 blocks.
==29050==    still reachable: 13,984,907 bytes in 10,061 blocks.
==29050==         suppressed: 0 bytes in 0 blocks.
==29050== Rerun with --leak-check=full to see details of leaked memory.

-- Peter Dalgaard, Øster Farimagsgade 5, Entr.B, Dept. of Biostatistics, PO Box 2099, 1014 Cph. K, University of Copenhagen, Denmark Ph: (+45) 35327918 (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] Construct All Possible Strings from 4 Bases (ATCG)
Dear Ivar,

How can I extend the limit on n? When I tried this function with n = 15, it fails:

f <- function(bases, n){apply(expand.grid(rep(list(bases),n)), 1, paste, collapse="")}
f(c("A", "T", "C", "G"), 15)
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : cannot allocate vector of length 1073741824
f(c("A", "T", "C", "G"), 30)
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : invalid 'times' value
In addition: Warning message: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : NAs introduced by coercion

- Gundala Viswanath Jakarta - Indonesia

On Wed, Dec 17, 2008 at 6:41 PM, Ivar Herfindal ivar.herfin...@bio.ntnu.no wrote:

To add to Robin Hankin's solution, if you want to generate the strings you can try:

f <- function(bases, n){apply(expand.grid(rep(list(bases),n)), 1, paste, collapse="")}
f(c("A", "T", "C", "G"), 2)
f(c("A", "T", "C", "G"), 4)

best Ivar

Robin Hankin wrote:

Gundala
f <- function(n){expand.grid(rep(list(seq_len(4)),n))}
HTH Robin

Gundala Viswanath wrote:

Dear all, Is there an efficient way in R to construct all strings from 4 bases (ATCG)? If we want a string of length L, there are 4^L possible strings, e.g. with L = 2 we have AA, AT, AC, AG, ..., GC, GA, GT, GG, as many as 4^2 = 16 strings; with L = 3 we have as many as 4^3 = 64 strings.

- Gundala Viswanath Jakarta - Indonesia
[R] Newbie if problem
I have got a general problem when applying a function to a dataframe using the function:

Apply(df,1,myfunct)

where myfunct has an IF statement in it along the lines of:

If (z == 0) X = 1
If (z ==0) Z = 1

i.e. two If statements. Is there something I am missing, or have I just formed the if statements wrongly? I am just checking there is not some trick you have to use when using apply with functions that have if statements in them. The function works fine in isolation, with df[1,.] say.

Thanks, Glenn
Re: [R] Newbie if problem
glenn roberts wrote:

I have got a general problem when applying a function to a dataframe using the function; Apply(df,1,myfunct)

Do you mean apply? Or does the Apply function come from a separate package?

Where myfunct has an IF statement in it along the lines of; If (z == 0) X = 1 If (z ==0) Z = 1 I.e. Two If statements Is there something I am missing or have a just formed the if statements wrong just checking there is not some trick you have to use when using apply with functions with if statements in. The function works fine in isolation with df[1,.] say

You are forgetting to include details here: specifically, a reproducible example of a function called 'myfunct' that exhibits the behavior you aren't expecting. 'myfunct' is going to have to take a parameter (say, 'x') that contains the data. If you call it 'x', you might try replacing z above by x[z] perhaps? It's difficult to say more until we have more details.
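As a hedged illustration of the advice above, the sketch below uses a made-up data frame with columns z and w; the poster's actual myfunct was never shown, so this is only one plausible shape for it.

```r
# Made-up data standing in for the poster's data frame.
df <- data.frame(z = c(0, 2, 0), w = c(5, 6, 7))

# apply() hands each row to the function as a named vector, so the
# function must take that vector as its argument and index into it.
myfunct <- function(row) {
  if (row[["z"]] == 0) row[["z"]] <- 1   # lower-case `if`; R is case-sensitive
  row
}

res <- t(apply(df, 1, myfunct))   # apply() returns rows as columns, hence t()

# Vectorised alternative that avoids apply() altogether.
z_fixed <- ifelse(df$z == 0, 1, df$z)
```

Note that apply() first converts the data frame to a matrix, so a data frame with any character column would reach the function as strings.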
Re: [R] Construct All Possible Strings from 4 Bases (ATCG)
Gundala Viswanath wrote:

Dear Ivar, How can I extend the limit on n? When I tried this function with n = 15, it fails:

f <- function(bases, n){apply(expand.grid(rep(list(bases),n)), 1, paste, collapse="")}
f(c("A", "T", "C", "G"), 15)
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : cannot allocate vector of length 1073741824
f(c("A", "T", "C", "G"), 30)
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : invalid 'times' value
In addition: Warning message: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : NAs introduced by coercion

Get more memory/move to a 64-bit machine?

4^15
[1] 1073741824
obj <- rep(0,4^15)
Error: cannot allocate vector of length 1073741824

From help("Memory-limits"):

There are also limits on individual objects. On all versions of R, the maximum length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9, as lengths are stored as signed integers. In addition, the storage space cannot exceed the address limit, and if you try to exceed that limit, the error message begins 'cannot allocate vector of length'. The number of characters in a character string is in theory only limited by the address space.

What were you going to do with these approx. 10^9 objects once you had a vector of them? See http://lucis.net/stuff/clarke/9billion_clarke.html

cheers Ben Bolker
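If the strings are only needed one at a time, a possible workaround (not proposed in the thread) is to compute the i-th string on demand from its base-4 digits instead of materialising all 4^n of them. The helper name kmer_at and its 0-based indexing are assumptions made for this sketch.

```r
bases <- c("A", "T", "C", "G")

# Hypothetical helper: return the i-th string (0-based) of length n, with
# the last position varying fastest, i.e. AA, AT, AC, AG, TA, ...
kmer_at <- function(i, n, bases) {
  k <- length(bases)
  digits <- integer(n)
  for (pos in n:1) {
    digits[pos] <- i %% k   # current base-k digit
    i <- i %/% k            # shift to the next digit
  }
  paste(bases[digits + 1], collapse = "")
}

# All 16 strings of length 2, small enough to build in full.
all2 <- vapply(0:15, kmer_at, character(1), n = 2, bases = bases)

# The last string of length 15, obtained without allocating 4^15 elements.
last15 <- kmer_at(4^15 - 1, 15, bases)
```

Because the index is held as a double, this is exact only while 4^n stays below 2^53, so it would not cover n = 30 without a different index representation.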
Re: [R] Error: bad value problem
I can get the errors to happen on Ubuntu 8.10 with R --vanilla (*without* valgrind), but editing momfit.r line 742 so that plot.progress=FALSE seems to make the problem go away. (This was a lucky guess; it looked like there was something odd going on with the plots.) Hope that helps someone ...

Ben Bolker

sessionInfo()
R version 2.8.0 (2008-10-20)
i486-pc-linux-gnu
locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
Re: [R] Error: bad value problem
Ben Bolker wrote:

I can get the errors to happen on Ubuntu 8.10 with R --vanilla (*without* valgrind), but editing momfit.r line 742 so that plot.progress=FALSE seems to make the problem go away. (This was a lucky guess; it looked like there was something odd going on with the plots.) Hope that helps someone ...

Probably not. The problem is to reproduce the error state in a way that lets us understand what is causing it. I can debug this to

(gdb) bt
#0 Rf_error (format=0x8220c65 "bad value") at ../../../R/src/main/errors.c:704
#1 0x0805a924 in SETCDR (x=0x8f89348, y=0x9b276e8) at ../../../R/src/main/memory.c:2728
#2 0x0819fa46 in GrowList (l=0x951e8f4, s=<value optimized out>) at gram.y:958
#3 0x081a2a7b in xxvalue (v=0x8f89348, k=4, lloc=<value optimized out>) at gram.y:440

and the problem in GrowList is that CAR(l) is R_NilValue (==0x8f89348), which supposedly cannot happen, and the thing that calls GrowList is something with srcrefs (DuncanM?). Digging deeper probably has to wait till the weekend for my part. (The natural next step is figuring out how the R_NilValue got into that location, but I should try to sleep off this cold.) I'm CCing r-devel on this. Can we move the discussion there?

-- Peter Dalgaard, Øster Farimagsgade 5, Entr.B, Dept. of Biostatistics, PO Box 2099, 1014 Cph. K, University of Copenhagen, Denmark Ph: (+45) 35327918 (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
Re: [R] [Rd] Error: bad value problem
On 17/12/2008 8:56 PM, Peter Dalgaard wrote:
> [...] and the thing that calls GrowList is something with srcrefs (DuncanM?). [...] I'm CCing r-devel on this. Can we move the discussion there?

I can probably take a look tomorrow. I wasn't getting an error, but maybe I'll see the same corruption if I watch it run.

Duncan Murdoch
Re: [R] [Rd] Error: bad value problem
On 17/12/2008 9:47 PM, Duncan Murdoch wrote:
> I can probably take a look tomorrow. I wasn't getting an error, but maybe I'll see the same corruption if I watch it run.

I had time to see if I was getting a NilValue there tonight, and the answer was no, with the Windows RC. I don't get the error in any version I've tried on Windows, though I can see it in 2.8.0 on MacOSX.

Duncan

__ r-de...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [R] Trouble pulling data from a messy ACII file...
It's not clear what the result would be but you may be able to use read.table. Try this:

Lines <- "19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt
10 s name of program that wrote this file trkplt name of program that wrote this file
10 GORDON machine that generated this file machine that generated this file
10 3.7 version of program
10 3.6 version of this data file
105.81 version of Universal Library
10 20081121.145730 when this file was written
10 Windows_XP operating system used operating system used
* * radar characteristics
11 WF-100
11 2000 A/D rate, samples/second
11 7.5 bin width, m
11 800 nominal PRF, Hz
11 0.25 nominal pulse width, microsec
11 0 tuning, volts
11 3.19779 nominal wave length, cm"

DF <- read.table(textConnection(Lines), fill = TRUE)
DF2 <- with(DF, na.omit(data.frame(V1, V2 = as.numeric(V2), V3 = do.call(paste, DF[-(1:2)]))))

You may need to remove the na.omit if you really do need those rows and make other changes, but that at least gives the idea.

On Wed, Dec 17, 2008 at 2:03 PM, Titan8883 jpla...@gmail.com wrote:
> Hi all, I am a new graduate student who is also new to R. I am ok with the basics, but the problem I am having right now seems beyond what I can do, so I am looking for advice. I am trying to pull data from flat ASCII files, but they do not have a nice structure so a simple read.table doesn't work. An example first half of a data file is below:
>
> 19 c:/data/WF-100/2008/20080911/trk/20080911.013115.007.17.txt
> 10 s name of program that wrote this file trkplt name of program that wrote this file
> 10 GORDON machine that generated this file machine that generated this file
> 10 3.7 version of program
> 10 3.6 version of this data file
> 105.81 version of Universal Library
> 10 20081121.145730 when this file was written
> 10 Windows_XP operating system used operating system used
> * * radar characteristics
> 11 WF-100
> 11 2000 A/D rate, samples/second
> 11 7.5 bin width, m
> 11 800 nominal PRF, Hz
> 11 0.25 nominal pulse width, microsec
> 11 0 tuning, volts
> 11 3.19779 nominal wave length, cm
>
> ..the file goes on from there...
>
> How would I go about getting this data into some kind of useful format? This is one of about 1000 files I will need to go through. I would ideally like to get these into a format with each data file as a row, with columns for the various values and the description text removed (version of program, file version, tuning volts, etc.). I'm not looking for a cut and paste answer, but perhaps some direction on where I should start. I have only done basic .csv, table, and line inputs up until now. Thanks for any advice
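As a rough sketch of the "one row per file" layout asked for above, and as an alternative to the read.table() suggestion rather than a replacement for it: the block below splits each record into a code, a value, and a description with a regular expression. It assumes every record looks like "code value description", which is only a guess from the sample; the five sample records and the names code, value, desc, and one_row are illustrative.

```r
# Hypothetical sample: a few header records copied from the post
Lines <- "10 3.7 version of program
10 3.6 version of this data file
11 2000 A/D rate, samples/second
11 7.5 bin width, m
11 800 nominal PRF, Hz"

raw <- strsplit(Lines, "\n", fixed = TRUE)[[1]]

# Capture: record code, value, and the free-text description that follows
parts <- regmatches(raw, regexec("^(\\S+)\\s+(\\S+)\\s*(.*)$", raw, perl = TRUE))

DF <- data.frame(
  code  = sapply(parts, `[`, 2),
  value = sapply(parts, `[`, 3),
  desc  = sapply(parts, `[`, 4),
  stringsAsFactors = FALSE
)

# One row for this file: each value becomes a column named after its description
one_row <- as.data.frame(as.list(setNames(DF$value, DF$desc)),
                         stringsAsFactors = FALSE)
```

For many files, the same steps could be wrapped in a function and the resulting one-row data frames stacked with do.call(rbind, ...), provided every file yields the same set of descriptions.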
Re: [R] replacing elements of a zoo object
Remove the comma in the line with the error.

On Wed, Dec 17, 2008 at 11:24 AM, tolga.i.uzu...@jpmorgan.com wrote:
> Dear R Users,
> I am trying to do something quite simple: replace the elements of a zoo object. For some reason, the following code does not seem to work. How can I replace the value for the 14th of Dec of 2008 in the zoo object x below with 1 (it is currently NA)?
>
> > x
> 2008-12-11 2008-12-12 2008-12-13 2008-12-14 2008-12-15 2008-12-16
>    361.667    389.875         NA         NA    397.822    395.667
> > class(x)
> [1] "zoo"
> > class(index(x))
> [1] "Date"
> > x[as.Date("2008-12-14"),]
> 2008-12-14
>         NA
> > x[as.Date("2008-12-14"),] <- 1
> Error in x[as.Date("2008-12-14"), ] <- 1 : incorrect number of subscripts on matrix
>
> Thanks in advance,
> Tolga
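A minimal sketch of the fix, assuming the zoo package is installed and that x is a univariate (vector-based) series like the one printed in the question. The series is rebuilt here from the printed values, so the construction step is an assumption rather than the poster's actual code.

```r
library(zoo)  # assumption: zoo is installed

# Rebuild the series shown in the question from its printed values
idx <- as.Date("2008-12-11") + 0:5
x <- zoo(c(361.667, 389.875, NA, NA, 397.822, 395.667), order.by = idx)

# A univariate zoo object takes a single subscript, so no trailing comma
x[as.Date("2008-12-14")] <- 1
```

The trailing comma asks for matrix-style indexing (rows, columns), which is why the assignment failed on a series that has no column dimension.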
Re: [R] Add a string to each string of an array
Your question has already been answered, but as an aside, note that the zoo package has a yearqtr class that can be helpful when dealing with quarterly dates.

On Wed, Dec 17, 2008 at 10:25 AM, Boriss bor...@gmx.net wrote:
> Dear all, I have an array of strings
>
> a <- c("2008q3","2005q1","2004q3")
>
> I would like to add to each a[i], with i = 1,2,3, the string "IMT", such that in the end I could get
>
> b <- c("IMT2008q3","IMT2005q1","IMT2004q3")
>
> Is it possible to accomplish this without a loop command? Thank you.
> Best, Bori
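For reference, since the earlier answer is not quoted in this message: one loop-free way to do this in base R is paste(), which is vectorized over its arguments. This is a sketch of the usual idiom, not necessarily the answer given earlier in the thread.

```r
a <- c("2008q3", "2005q1", "2004q3")

# paste() recycles the prefix across every element of a; sep = "" avoids a space
b <- paste("IMT", a, sep = "")
```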
[R] Parsing unusual date format
Hello,

If I have a character string like

d <- c("1990m3", "1992m8") # March 1990 and Aug 1992

what is the easiest way to convert it into any standard date form; for example,

d <- c("01/03/1990", "01/08/1992")

I looked at as.Date but it doesn't seem to address my problem, as I have an "m" stuck in the middle of my character string which R does not recognise. Would be very grateful for any help on this.

Shruthi
Re: [R] Parsing unusual date format
hi: i tried regular expressions and, as usual, failed. but below does do the job uglily using strsplit. i'd still be curious and appreciate if someone could do the regular expression method. thanks.

dts <- c("1990m12", "1992m8") # December 1990 and Aug 1992

# SPLIT IT USING m
temp <- strsplit(dts, "m")

# PASTE THE #'S BUT CHECK FOR GREATER THAN 9 UGLIFIES IT
moddates <- lapply(temp, function(.str) {
  if (as.numeric(.str[2]) > 9) {
    paste(.str[1], as.numeric(.str[2]), "01", sep = "-")
  } else {
    paste(.str[1], "-0", as.numeric(.str[2]), "-01", sep = "")
  }
})

# MAKE THE DATE
datelist <- lapply(moddates, as.Date, format = "%Y-%m-%d")

On Thu, Dec 18, 2008 at 1:14 AM, Shruthi Jayaram wrote:
> Hello, If I have a character string like
>
> d <- c("1990m3", "1992m8") # March 1990 and Aug 1992
>
> what is the easiest way to convert it into any standard date form; for example,
>
> d <- c("01/03/1990", "01/08/1992")
>
> I looked at as.Date but it doesn't seem to address my problem, as I have an "m" stuck in the middle of my character string which R does not recognise. Would be very grateful for any help on this.
> Shruthi
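A possible tidier variant of the strsplit approach, offered as a sketch: sprintf() with %02d zero-pads the month, so the check for months greater than 9 is not needed. The names parts, year, month, and dates are illustrative.

```r
dts <- c("1990m12", "1992m8")

# Split each string on the literal "m" into year and month pieces
parts <- strsplit(dts, "m", fixed = TRUE)
year  <- as.integer(sapply(parts, `[`, 1))
month <- as.integer(sapply(parts, `[`, 2))

# %02d pads single-digit months with a leading zero; the day is fixed at 01
dates <- as.Date(sprintf("%04d-%02d-01", year, month))
```

This returns a single Date vector rather than a list of one-element Dates, which tends to be easier to work with afterwards.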
Re: [R] PREDICT NEW VALUES FROM REGRESSION MODEL, EST. ST.ERROR, AND CI
First, package the new values of your predictor variables into a data frame:

data.frame(VAR1=8, VAR2=64)

Then, use the predict() method to apply the model to the predictor values:

predict(model, newdata=data.frame(VAR1=8, VAR2=64))

Finally, tell the predict() method that these are new observations, not original observations, and that you want the prediction interval:

predict(model, newdata=data.frame(VAR1=8, VAR2=64), interval="prediction", level=0.95)

That should print the fitted value for OUTCOME, the lower bound, and the upper bound. To see the standard error of the fitted value, include se.fit=TRUE in your call to predict().

Hope that helps.

Paul

Paul Teetor
Elgin, IL USA
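The steps above can be sketched end to end as follows. The data are simulated purely for illustration (the original data set was not posted), so the coefficients used to generate OUTCOME and the names dat, new, ci, and pred are assumptions.

```r
set.seed(1)

# Hypothetical data standing in for the poster's 24 observations
dat <- data.frame(VAR1 = runif(24, 0, 10), VAR2 = runif(24, 40, 80))
dat$OUTCOME <- 1.5 + 0.07 * dat$VAR1 + 0.01 * dat$VAR2 + rnorm(24, sd = 0.4)

model <- lm(OUTCOME ~ VAR1 + VAR2, data = dat)
new   <- data.frame(VAR1 = 8, VAR2 = 64)

# Interval for the mean response at the new predictor values
ci <- predict(model, newdata = new, interval = "confidence", level = 0.95)

# Interval for a single new observation, plus the standard error of the fit
pred <- predict(model, newdata = new, interval = "prediction",
                level = 0.95, se.fit = TRUE)
pred$fit     # one-row matrix with columns fit, lwr, upr
pred$se.fit  # standard error of the fitted mean, not of a new observation
```

With se.fit=TRUE the result is a list rather than a matrix, so the bounds are in pred$fit. If the model also contains dummy variables, newdata needs a column for each of them as well.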
[R] Predict
Dear Cristopher,

I have a question and hope you may help me. I fitted a model to a data set; the model is:

Call:
lm(formula = outcom ~ vari1 + vari2 + dummy1 + dummy2)

Residuals:
    Min      1Q  Median      3Q     Max
-0.7090 -0.2648  0.1466  0.2167  0.7513

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.528295   0.205280   7.445 4.79e-07 ***
vari1       0.067471   0.042248   1.597  0.12676
vari2       0.009433   0.040919   0.231  0.82015
dummy1      1.275861   0.256066   4.983 8.27e-05 ***
dummy2      1.326405   0.414477   3.200  0.00471 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.458 on 19 degrees of freedom
Multiple R-squared: 0.9772, Adjusted R-squared: 0.9724
F-statistic: 203.8 on 4 and 19 DF, p-value: 2.564e-15

Now, using this model, I need to calculate the predicted value of OUTCOM (only for one observation) when vari1 = 8 and vari2 = 64, and also the confidence interval (the data set has a total of 24 observations).

--
Ricardo Gómez
Center for International Education
University of Massachusetts-Amherst, School of Education
http://www.umass.edu/cie/
Re: [R] Parsing unusual date format
You can use regular expressions:

as.Date(sub('(\\d+)m(\\d+)','\\1-\\2-01',dts,perl=TRUE))

but as.Date isn't as inflexible as you think:

as.Date(paste(dts,'m01',sep=''),'%Ym%mm%d')

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Thu, 18 Dec 2008, markle...@verizon.net wrote:
> hi: i tried regular expressions and, as usual, failed. but below does do the job uglily using strsplit. i'd still be curious and appreciate if someone could do the regular expression method. thanks.
> [...]
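Both suggestions can be checked side by side with a short sketch. The final format() call, which produces the dd/mm/yyyy strings the original poster asked for, is an addition for illustration; the names d1, d2, and out are hypothetical.

```r
dts <- c("1990m12", "1992m8")

# Approach 1: rewrite "YYYYmM" as "YYYY-M-01", then parse with the default format
d1 <- as.Date(sub("(\\d+)m(\\d+)", "\\1-\\2-01", dts, perl = TRUE))

# Approach 2: append a literal "m01" and describe the whole string in the format
d2 <- as.Date(paste(dts, "m01", sep = ""), format = "%Ym%mm%d")

# Day/month/year strings, as in the original question
out <- format(d1, "%d/%m/%Y")
```

The second approach works because any character in the format string that is not part of a % conversion is matched literally, so the "m" separators need no special handling.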