[R] SNPRelate package error
Dear, I am using the R package SNPRelate but I found an error when I run the following command. Do you know what might be the problem? Thanks in advance. vcf.fn - system.file(extdata,str.vcf,package=SNPRelate) snpgdsVCF2GDS(vcf.fn,test.gds) Start snpgdsVCF2GDS ... Extracting bi-allelic and polymorhpic SNPs. Scanning ... file: D:/Program Files/R/R-2.14.2/library/SNPRelate/extdata/str.vcf Error in scan.vcf.marker(fn, method) : The file (D:/Program Files/R/R-2.14.2/library/SNPRelate/extdata/str.vcf) has different numbers of columns. Best regrads -- Dr. Ye SUN Key Laboratory of Plant Resources Conservation and Sustainable Utilization South China Botanical Garden, Chinese Academy of Sciences Xingke Road 723,Tianhe District, Guangzhou 510650, PR China [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] simple reshape
Dear friends - this is a very simple question - I have a data frame 'data.frame': 87 obs. of 3 variables: $ ID : int 1 1 1 2 2 2 3 3 3 4 ... $ prep : num 1.18 1.38 1.34 1.93 2.38 2.24 1.17 1.13 1.21 1.89 ... $ postp: num 0.63 0.71 0.75 1.01 1.12 1.07 0.87 0.64 0.7 0.8 ... - 29 persons (ID) each measured three times before and after an intervention: prep and postp - I need data rearranged like IDtimeval 11prep 12postp 11 12 11 12 I cannot make reshape or stack do the trick. I'm on windows 7 R version 2.15.2 (2012-10-26) Best wishes Troels Ring, Nephrology Aalborg, Denmark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SNPRelate package error
Hello, Why do you think it is a package error? The error message says that the file [...] has different numbers of columns. Please check that file first. Regards, Pascal Le 22/01/2013 16:05, sun...@scib.ac.cn a écrit : Dear, I am using the R package SNPRelate but I found an error when I run the following command. Do you know what might be the problem? Thanks in advance. vcf.fn - system.file(extdata,str.vcf,package=SNPRelate) snpgdsVCF2GDS(vcf.fn,test.gds) Start snpgdsVCF2GDS ... Extracting bi-allelic and polymorhpic SNPs. Scanning ... file: D:/Program Files/R/R-2.14.2/library/SNPRelate/extdata/str.vcf Error in scan.vcf.marker(fn, method) : The file (D:/Program Files/R/R-2.14.2/library/SNPRelate/extdata/str.vcf) has different numbers of columns. Best regrads -- Dr. Ye SUN Key Laboratory of Plant Resources Conservation and Sustainable Utilization South China Botanical Garden, Chinese Academy of Sciences Xingke Road 723,Tianhe District, Guangzhou 510650, PR China [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Simple use of dcast (reshape2 package)
Suppose I have a small dataframe aa Target Eaten ID 50 TPP 0 1 51 TPP 1 2 52 TPP 3 3 53 TPP 1 4 54 TPP 2 5 50.1GPA 9 1 51.1GPA11 2 52.1GPA 8 3 53.1GPA 8 4 54.1GPA10 5 And I want to reshape it into ID TPP GPA 1 1 0 9 2 2 1 11 3 3 3 8 4 4 1 8 5 5 2 10 I realise that dcast function in the reshape2 package can handle much more complicated tasks than that, but I can't make it do a simple one. If I simply tried dcast(aa, ... ~ Target) Using ID as value column: use value.var to override. Aggregation function missing: defaulting to length Eaten GPA TPP 1 0 0 1 2 1 0 2 3 2 0 1 4 3 0 1 5 8 2 0 6 9 1 0 710 1 0 811 1 0 As per the help file, it's giving counts of the numbers in the Eaten column since that's the default fun.aggregate value. My questions are: what fun.aggregate would work? Alternatively, can value.var be set to something useful? TIA -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to align group based on the common values of two columns in r
HI, I met this problem: I have the feature data frame: Feature OS 4 2 4 1 4 3 1 2 4 1 what I want to do is to autimatically create one more column called group: Feature OS Group 4 2 1 4 1 2 4 3 3 1 2 4 4 1 2 I don't want Ifelse, because I have so many combination of feature and OS, I even can not account. I just want to have sth to autimatically create group indicator based on the difference combination of feature and OS. Thanks for your help. Kind regards, Tammy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ellipse in PCA with parameters a and bdefined.
Ok... so, in my model my a is built using the standard deviation of the first principal component and b with the second, so my x and Y should be : PCA $ scores [, 1], PCA $ scores [, 2] but in this way I do not get out a confidence interval set on my parameters but many ellipses. Thanks Mary -- View this message in context: http://r.789695.n4.nabble.com/Ellipse-in-PCA-with-parameters-a-and-b-defined-tp4656215p4656242.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to align group based on the common values of two columnsinr
Hi, Tammy, maybe you find something interesting looking at ?interaction and/or try (with df being your data frame) df$Group - as.integer( with( df, interaction( Feature, OS)[, drop = TRUE])) HtH -- Gerrit On Tue, 22 Jan 2013, Tammy Ma wrote: HI, I met this problem: I have the feature data frame: Feature OS 4 2 4 1 4 3 1 2 4 1 what I want to do is to autimatically create one more column called group: Feature OS Group 4 2 1 4 1 2 4 3 3 1 2 4 4 1 2 I don't want Ifelse, because I have so many combination of feature and OS, I even can not account. I just want to have sth to autimatically create group indicator based on the difference combination of feature and OS. Thanks for your help. Kind regards, Tammy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple use of dcast (reshape2 package)
you could try the following: DF - read.table(textConnection( Target Eaten ID 50 TPP 0 1 51 TPP 1 2 52 TPP 3 3 53 TPP 1 4 54 TPP 2 5 50.1GPA 9 1 51.1GPA11 2 52.1GPA 8 3 53.1GPA 8 4 54.1GPA10 5), header = TRUE) newDF - as.data.frame(with(DF, tapply(Eaten, list(ID, Target), c))) newDF$ID - unique(DF$ID) newDF I hope it helps. Best, Dimitris On 1/22/2013 10:23 AM, Patrick Connolly wrote: Suppose I have a small dataframe aa Target Eaten ID 50 TPP 0 1 51 TPP 1 2 52 TPP 3 3 53 TPP 1 4 54 TPP 2 5 50.1GPA 9 1 51.1GPA11 2 52.1GPA 8 3 53.1GPA 8 4 54.1GPA10 5 And I want to reshape it into ID TPP GPA 1 1 0 9 2 2 1 11 3 3 3 8 4 4 1 8 5 5 2 10 I realise that dcast function in the reshape2 package can handle much more complicated tasks than that, but I can't make it do a simple one. If I simply tried dcast(aa, ... ~ Target) Using ID as value column: use value.var to override. Aggregation function missing: defaulting to length Eaten GPA TPP 1 0 0 1 2 1 0 2 3 2 0 1 4 3 0 1 5 8 2 0 6 9 1 0 710 1 0 811 1 0 As per the help file, it's giving counts of the numbers in the Eaten column since that's the default fun.aggregate value. My questions are: what fun.aggregate would work? Alternatively, can value.var be set to something useful? TIA -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] simple reshape
On 01/22/2013 07:19 PM, Troels Ring wrote: Dear friends - this is a very simple question - I have a data frame 'data.frame': 87 obs. of 3 variables: $ ID : int 1 1 1 2 2 2 3 3 3 4 ... $ prep : num 1.18 1.38 1.34 1.93 2.38 2.24 1.17 1.13 1.21 1.89 ... $ postp: num 0.63 0.71 0.75 1.01 1.12 1.07 0.87 0.64 0.7 0.8 ... - 29 persons (ID) each measured three times before and after an intervention: prep and postp - I need data rearranged like ID time val 1 1 prep 1 2 postp 1 1 1 2 1 1 1 2 I cannot make reshape or stack do the trick. Hi Troels, With a bit of extra processing I think rep_n_stack (prettyR) will do what you want: # fake some data tr.df-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5)) # add a repeat number tr.df$repno-rep(1:3,29) # get the reshaped data frame trlong.df-rep_n_stack(tr.df,to.stack=2:3, stack.names=c(prepost,value)) # reorder it trlong.df[order(trlong.df$ID,trlong.df$repno),] ID repno prepost value 11 1prep 2.9158693 88 1 1 postp 0.9932342 21 2prep 1.2852817 89 1 2 postp 0.8187234 31 3prep 2.5771902 90 1 3 postp 1.0033936 42 1prep 2.2969320 91 2 1 postp 0.6837140 52 2prep 1.3083553 92 2 2 postp 1.4537096 62 3prep 2.8654184 93 2 3 postp 1.0880881 ... Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple use of dcast (reshape2 package)
Hi, Patrick, I think (with reshape from the stats package) reshape( aa, idvar = ID, v.names = Eaten, timevar = Target, direction = wide) does the trick (followed by renaming the columns of the resulting data frame). Hth -- Gerrit On Tue, 22 Jan 2013, Patrick Connolly wrote: Suppose I have a small dataframe aa Target Eaten ID 50 TPP 0 1 51 TPP 1 2 52 TPP 3 3 53 TPP 1 4 54 TPP 2 5 50.1GPA 9 1 51.1GPA11 2 52.1GPA 8 3 53.1GPA 8 4 54.1GPA10 5 And I want to reshape it into ID TPP GPA 1 1 0 9 2 2 1 11 3 3 3 8 4 4 1 8 5 5 2 10 I realise that dcast function in the reshape2 package can handle much more complicated tasks than that, but I can't make it do a simple one. If I simply tried dcast(aa, ... ~ Target) Using ID as value column: use value.var to override. Aggregation function missing: defaulting to length Eaten GPA TPP 1 0 0 1 2 1 0 2 3 2 0 1 4 3 0 1 5 8 2 0 6 9 1 0 710 1 0 811 1 0 As per the help file, it's giving counts of the numbers in the Eaten column since that's the default fun.aggregate value. My questions are: what fun.aggregate would work? Alternatively, can value.var be set to something useful? TIA -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Concatenate two lists, list by list
Dear all, I would like to concatenate the lists below str(Part2$dataset) List of 3 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... str(Part1$dataset) List of 3 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... I tried concatenating those with: str(cbind(Part1$datase,Part2$dataset)) List of 6 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... - attr(*, dim)= int [1:2] 3 2 but I want something different. To concatenate those into a list by list operation so I will end up with something looking like that str(concatenatedLists) List of 3 $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ... - attr(*, dim)= int [1:2] 3 2 Is there anything that can do that in R? Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] FactoMineR
Dear Users, I installed R Commander and the FactoMineR plug-in. Everything is fine, I can see the new menu, I can import datasets, but if I want to use any of the items in the FactoMineR menu, i get the following error: Error in get(.activeDataSet) : object '.activeDataSet' not found even if there is an active dataset (if there is none, all the menu items are grey of course). I have R version 2.15.2 using Windows 7 but experienced the same on other machines. Please let me know if you have any idea! Thanks a lot daniel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Concatenate two lists, list by list
Hi Maybe you could use mapply mapply(c, Part1$dataset,Part2$dataset) Regards Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Alaios Sent: Tuesday, January 22, 2013 11:26 AM To: R help Subject: [R] Concatenate two lists, list by list Dear all, I would like to concatenate the lists below str(Part2$dataset) List of 3 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... str(Part1$dataset) List of 3 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... I tried concatenating those with: str(cbind(Part1$datase,Part2$dataset)) List of 6 $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:16001] 0 0 0 0 0 0 0 0 0 0 ... - attr(*, dim)= int [1:2] 3 2 but I want something different. To concatenate those into a list by list operation so I will end up with something looking like that str(concatenatedLists) List of 3 $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:32002] 0 0 0 0 0 0 0 0 0 0 ... - attr(*, dim)= int [1:2] 3 2 Is there anything that can do that in R? Regards Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] New book announcement: R and Data Mining - Examples and Case Studies
R and Data Mining: Examples and Case Studies Author: Yanchang Zhao Publisher: Academic Press, Elsevier Publish date: December 2012 ISBN: 978-0-12-396963-7 Length: 256 pages URL: http://www.rdatamining.com/books/rdm This book introduces into using R for data mining with examples and case studies. It contains 1) examples on decision trees, random forest, regression, clustering, outlier detection, time series analysis, association rules, text mining and social network analysis; and 2) three real-world case studies. Table of Contents and Abstracts: http://www.rdatamining.com/books/rdm/toc R Code and Data for the book: http://www.rdatamining.com/books/rdm/code Sample pages on Google Books: http://books.google.com.au/books?id=FEOh08LBD9UCprintsec=frontcoversource=gbs_ge_summary_rcad=0#v=onepageqf=false Buy the book on Amazon: http://www.amazon.com/Data-Mining-Examples-Case-Studies/dp/0123969638 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regex for ^ (the caret symbol)?
-Original Message- So what is the special behavior of the ^ symbol when not at the beginning of the string that occurs when it is not escaped? I think it retains its meaning as an assertion that it occurs at the beginning of the line, and so a pattern like a^b could never match anything. ... unless a or b are newlines and you are matching multi-line expressions, when ^ and $ match before and after line breaks as well as beginning and end of string. S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Approximating discrete distribution by continuous distribution
Dear all, I have a discrete distribution showing how age is distributed across a population using a certain set of bands: Age - matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1, dimnames=list(c(18, 18-34, 35-64, 65+),c())) Age_dist - Age/sum(Age) For example I know that 23.94% of all people are between 0-18 years, 23.28% between 18-34 years and so forth. I would like to find a continuous approximation of this discrete distribution in order to estimate the probability that a person is for example 16 years old. Is there some automatic way in R through which this can be done? I tried a Kernel density estimation of the histogram but this does not seem to provide what I'm looking for. Thanks very much for your help, Michael Michael Haenlein Associate Professor of Marketing ESCP Europe Paris, France [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Approximating discrete distribution by continuous distribution
On Tue, Jan 22, 2013 at 11:49 AM, Michael Haenlein haenl...@escpeurope.eu wrote: I would like to find a continuous approximation of this discrete distribution in order to estimate the probability that a person is for example 16 years old. Given that people age continuously (and continually...), you sound like you are trying to replace one discrete distribution with another (discretised by year). A continuous distribution would give you, for example, the probability that a person is between 16.0 and 16.1 years old. Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Approximating discrete distribution by continuous distribution
On 22/01/2013 11:49, Michael Haenlein wrote: Dear all, I have a discrete distribution showing how age is distributed across a population using a certain set of bands: Age - matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1, dimnames=list(c(18, 18-34, 35-64, 65+),c())) Age_dist - Age/sum(Age) For example I know that 23.94% of all people are between 0-18 years, 23.28% between 18-34 years and so forth. I would like to find a continuous approximation of this discrete distribution in order to estimate the probability that a person is for example 16 years old. Is there some automatic way in R through which this can be done? I tried a Kernel density estimation of the histogram but this does not seem to provide what I'm looking for. This is not really an R question, but a statistics one. It is almost guesswork: if for example these were drivers in the UK, the answer is 0. So you need to supply some information about the shape of the distribution of 18 year olds. You have estimates of the cumulative distribution function at c(0, 18, 35, 65, Inf) (or some better upper limit). You want to interpolate it. You could use linear interpolation (approx[fun]) or a monotone spline interpolation (spline[fun]) or any other interpolation method which meets your needs. But whatever you use, you will supplying a lot of information not actually in your data. Thanks very much for your help, Michael Michael Haenlein Associate Professor of Marketing ESCP Europe Paris, France [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FactoMineR
Dear Daniel, There were changes to the new version 1.9-3 of the Rcmdr so that it conforms to CRAN policies. These changes can break plug-ins that haven't been modified for compatibility. One change is that the environment in which the Rcmdr stores state information is no longer put on the search path. That's apparently preventing the FactoMineR plug-in from finding the active data set. The solution is for the author to replace get(.activeDataSet) with something like get(getRcmdr(.activeDataSet)). I'll correspond with the package author to suggest this. I apologize for the difficulties introduced by these changes. John John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ On Tue, 22 Jan 2013 10:34:28 + Dániel Kehl ke...@ktk.pte.hu wrote: Dear Users, I installed R Commander and the FactoMineR plug-in. Everything is fine, I can see the new menu, I can import datasets, but if I want to use any of the items in the FactoMineR menu, i get the following error: Error in get(.activeDataSet) : object '.activeDataSet' not found even if there is an active dataset (if there is none, all the menu items are grey of course). I have R version 2.15.2 using Windows 7 but experienced the same on other machines. Please let me know if you have any idea! Thanks a lot daniel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FactoMineR
Dear John, great news, thank you for your kind answer and quick response. I am sure that the author is going to do his best as well. An other good experience why I love R! :) Have a nice day, daniel Feladó: John Fox [j...@mcmaster.ca] Küldve: 2013. január 22. 13:39 To: Dániel Kehl Cc: R-help Tárgy: Re: [R] FactoMineR Dear Daniel, There were changes to the new version 1.9-3 of the Rcmdr so that it conforms to CRAN policies. These changes can break plug-ins that haven't been modified for compatibility. One change is that the environment in which the Rcmdr stores state information is no longer put on the search path. That's apparently preventing the FactoMineR plug-in from finding the active data set. The solution is for the author to replace get(.activeDataSet) with something like get(getRcmdr(.activeDataSet)). I'll correspond with the package author to suggest this. I apologize for the difficulties introduced by these changes. John John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ On Tue, 22 Jan 2013 10:34:28 + Dániel Kehl ke...@ktk.pte.hu wrote: Dear Users, I installed R Commander and the FactoMineR plug-in. Everything is fine, I can see the new menu, I can import datasets, but if I want to use any of the items in the FactoMineR menu, i get the following error: Error in get(.activeDataSet) : object '.activeDataSet' not found even if there is an active dataset (if there is none, all the menu items are grey of course). I have R version 2.15.2 using Windows 7 but experienced the same on other machines. Please let me know if you have any idea! Thanks a lot daniel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple use of dcast (reshape2 package)
Hi, ID is not the value column. Your casting call should be dcast(aa, ... ~ Target, value.var = Eaten) Best, Ista On Tue, Jan 22, 2013 at 4:23 AM, Patrick Connolly p_conno...@slingshot.co.nz wrote: Suppose I have a small dataframe aa Target Eaten ID 50 TPP 0 1 51 TPP 1 2 52 TPP 3 3 53 TPP 1 4 54 TPP 2 5 50.1GPA 9 1 51.1GPA11 2 52.1GPA 8 3 53.1GPA 8 4 54.1GPA10 5 And I want to reshape it into ID TPP GPA 1 1 0 9 2 2 1 11 3 3 3 8 4 4 1 8 5 5 2 10 I realise that dcast function in the reshape2 package can handle much more complicated tasks than that, but I can't make it do a simple one. If I simply tried dcast(aa, ... ~ Target) Using ID as value column: use value.var to override. Aggregation function missing: defaulting to length Eaten GPA TPP 1 0 0 1 2 1 0 2 3 2 0 1 4 3 0 1 5 8 2 0 6 9 1 0 710 1 0 811 1 0 As per the help file, it's giving counts of the numbers in the Eaten column since that's the default fun.aggregate value. My questions are: what fun.aggregate would work? Alternatively, can value.var be set to something useful? TIA -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] applying a formula from text
Dear Arun, Thank you very much. Yours, Ilya On Sun, Jan 20, 2013 at 9:18 PM, arun kirshna [via R] ml-node+s789695n4656104...@n4.nabble.com wrote: Dear Ilya, Please check these links. http://stackoverflow.com/questions/4556524/whats-the-way-to-learn-r http://www.r-bloggers.com/learn-to-use-r-for-free-with-coursera/ http://stackoverflow.com/questions/192369/books-for-learning-the-r-language You may also benefit from An introduction to R from the http://cran.r-project.org/manuals.html. A.K. - Original Message - From: IlyaNovikov [hidden email]http://user/SendEmail.jtp?type=nodenode=4656104i=0 To: [hidden email] http://user/SendEmail.jtp?type=nodenode=4656104i=1 Cc: Sent: Sunday, January 20, 2013 1:21 AM Subject: Re: [R] applying a formula from text Dear Arun, I am a novice in R bu some my friends that use R for a long time were not able to help me. Thank you really. Concerning your question why I need it, I think that it can be situations where the condition, that I have to apply, depends on the data. May be you can advice me a good text to learn programming in R. Thank you again. Ilya Novikov Sat, Jan 19, 2013 at 8:02 PM, arun kirshna [via R] [hidden email] http://user/SendEmail.jtp?type=nodenode=4656104i=2 wrote: HI, Not sure why you need to do this: s- x5 h(1,eval(parse(text=s))) #[1] FALSE A.K. -- If you reply to this email, your message will be added to the discussion below: . NAML http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml -- Sincerely, Ilya Novikov -- View this message in context: http://r.789695.n4.nabble.com/applying-a-formula-from-text-tp4656045p4656084.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ [hidden email] http://user/SendEmail.jtp?type=nodenode=4656104i=3mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ [hidden email] http://user/SendEmail.jtp?type=nodenode=4656104i=4mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/applying-a-formula-from-text-tp4656045p4656104.html To unsubscribe from applying a formula from text, click herehttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4656045code=aW5vdmlrb3ZAZ21haWwuY29tfDQ2NTYwNDV8MTU4NTI3OTk5MA== . NAMLhttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml -- Sincerely, Ilya Novikov -- View this message in context: http://r.789695.n4.nabble.com/applying-a-formula-from-text-tp4656045p4656238.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to remove the vertical space between two graps
Hi, I have created a barplot using the following code. a-c(11,23,15,34,42,31) m-matrix(a,nrow=2) m[2,]-(-1)*m[2,] par(mar=c(4,4,4,0)) barplot(m[2,],horiz=T) par(mar=c(4,0,4,2)) barplot(m[1,],horiz=T,col=black) and the plot obtained is shown in plot1.tiff. I was not willing to see the gap (vertical space) between two graphs. How can I achieve it? Further I tried to achieve my goal in a single plot, for which I tried this code: a-c(11,23,15,34,42,31) m-matrix(a,nrow=2) m[2,]-(-1)*m[2,] barplot(m,horiz=T,beside=T) and the plot obtained is showed in plot2.tiff in the second attempt I'm able to place the bars next to each other using beside=T argument. However, I fail when I use beside=F argument (obtained plot3.tiff with this). Can you suggest me in achieving my goal (similar to plot2 with no vertical space)? Regards, Purna __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Erro message in glmmADMB
Hello everybody, I am using glmmADMB and when I run some models, I recieve the following message: Erro em glmmadmb(eumencells ~ 1 + (1 | owners), data = pred3, family = nbinom, : The function maximizer failed (couldn't find STD file) Furthermore: Lost warning messages: Command execution 'C:\Windows\system32\cmd.exe /c C:/Users/helenametal/Documents/R/win-library/2.15/glmmADMB/bin/windows32/glmmadmb.exe -maxfn 500 -maxph 5 -noinit -shess' teve status 1 : Mensagens de aviso perdidas: execução do comando 'C:\Windows\system32\cmd.exe /c C:/Users/helenametal/Documents/R/win-library/2.15/glmmADMB/bin/windows32/glmmadmb.exe -maxfn 500 -maxph 5 -noinit -shess' teve status 1 Does anyone know what is this and why does it happen? Thanks a lot! Maria -- View this message in context: http://r.789695.n4.nabble.com/Erro-message-in-glmmADMB-tp4656253.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Remove my adress from mailing list
Hello! I wish, that my email-adress is removed from the R-help mailing list. Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] solve equations
Hello ! I have a rather mathematical than statistical question. I have a formula: P=R*T/(v-b) -a/(sqrt(T)*V*(V+b)) and I want to solve the equation for V , in terms of V= . Is this possible with R or have I to use another program perhaps octave? thanking you in anticipation Claudia __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] solve equations
On 22-01-2013, at 14:20, paladini palad...@beuth-hochschule.de wrote: Hello ! I have a rather mathematical than statistical question. I have a formula: P=R*T/(v-b) -a/(sqrt(T)*V*(V+b)) and I want to solve the equation for V , in terms of V= . Is this possible with R or have I to use another program perhaps octave? Have a look at uniroot. Since this is a single equation there is probably no need to look at more high powered alternatives such as packages nleqslv or BB. And since you can rewrite your formula a quadratic you can also solve it with a simple function. Berend thanking you in anticipation Claudia __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles with R for a big data.frame
Hey Duncan, Neither me do imagine what formula OpenOffice uses for quantiles. I have checked a data string, 24 values, to calculate a quantiles with OpenOffice and R. The result is identical. The problem arises when I try to implement quantile calculation in this form: dat2-with(dat1,aggregate(cbind(dat1[,1:71]),by=list(newID),quantiles,0.1,type=4)) . This code does not generate an error, but I guess neither a right result. So my question would be: How I could calculate quantiles for a big data.frame in R (71 columns and 288 rows). I need to take 24 rows, calculate quantiles, then take another 24 rows etc..for 71 columns. Thanks in advance. 2013/1/22 Duncan Murdoch murdoch.dun...@gmail.com On 13-01-21 6:41 PM, Simonas Kecorius wrote: Dear R users, I came up to a problem dealing with percentiles in R. From my previous questions: I do have a big data.frame, with lots of columns and rows. The following command enables me to calculate means for all data frame. dat1$newID-rep(1:(nrow(dat1)/**12),each=12) #if nrow(dat1)/12 is integer dat2-with(dat1,aggregate(**cbind(dat1[,1:71]),by=list(**newID),mean)) What I need is to calculate percentiles for each group (there are 12 values in a group). I tried the following: duomenai-with(dat1,aggregate(**cbind(dat1[,1:71]),by=list(** newID),quantiles,0.1,type=4)) You didn't define quantiles, so that won't work. Assuming that's a typo, and you meant quantile... First, is the following syntax is right? Secondly, I tried to calculate percentiles using OpenOffice and there is disagreement between values. If I do calculation for some number row, than R and OpenOffice numbers coincide, but for a data.frame it seams that something goes wrong. There are lots of different formulas for empirical quantiles. The ones available in R are described in the ?quantile help topic. What formula does OpenOffice use? Duncan Murdoch -- Simonas Kecorius ** [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ellipse in PCA with parameters a and bdefined.
Try x - mean(PCA$scores[,1]) and y - mean(PCA$scores[,2]) which should be the same as x - 0, y - 0 within rounding error. -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of mary Sent: Tuesday, January 22, 2013 3:40 AM To: r-help@r-project.org Subject: Re: [R] Ellipse in PCA with parameters a and bdefined. Ok... so, in my model my a is built using the standard deviation of the first principal component and b with the second, so my x and Y should be : PCA $ scores [, 1], PCA $ scores [, 2] but in this way I do not get out a confidence interval set on my parameters but many ellipses. Thanks Mary -- View this message in context: http://r.789695.n4.nabble.com/Ellipse-in- PCA-with-parameters-a-and-b-defined-tp4656215p4656242.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Approximating discrete distribution by continuous distribution
On Jan 22, 2013, at 13:45 , Prof Brian Ripley wrote: On 22/01/2013 11:49, Michael Haenlein wrote: Dear all, I have a discrete distribution showing how age is distributed across a population using a certain set of bands: Age - matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1, dimnames=list(c(18, 18-34, 35-64, 65+),c())) Age_dist - Age/sum(Age) For example I know that 23.94% of all people are between 0-18 years, 23.28% between 18-34 years and so forth. I would like to find a continuous approximation of this discrete distribution in order to estimate the probability that a person is for example 16 years old. Is there some automatic way in R through which this can be done? I tried a Kernel density estimation of the histogram but this does not seem to provide what I'm looking for. This is not really an R question, but a statistics one. It is almost guesswork: if for example these were drivers in the UK, the answer is 0. So you need to supply some information about the shape of the distribution of 18 year olds. You have estimates of the cumulative distribution function at c(0, 18, 35, 65, Inf) (or some better upper limit). You want to interpolate it. You could use linear interpolation (approx[fun]) or a monotone spline interpolation (spline[fun]) or any other interpolation method which meets your needs. But whatever you use, you will supplying a lot of information not actually in your data. Agreed. The linear interpolation method is sometimes described as the sum polygon, and sort of assumes that there is a uniform distribution of ages in each range. I.e., the number of 16 year olds would be 1/18 of the 0-17 y.o. However, I'd feel somewhat uneasy about doing this with such wide age-bands. There is also the option of fitting a standard distribution like the Weibull to the data and using that. The mle() function should do this if you write out the log-likelihood using something like dmultinom(Age, prob=diff(pweibull(c(0,18,15,65,Inf), shape, scale), log=TRUE) With a quarter of a billion observations, the fit might be less than perfect, but on the other hand, extracting more than two parameters from four data points sound a bit ominous. -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to remove the vertical space between two graps
Most attachments are automatically stripped from r-help so we cannot see your results. You may be able to get what you want with pyramid.plot in package plotrix by changing the default values (it is designed for population pyramids) or you may be able to get there using the layout() function in base graphics before your plotting commands. Alternatively you can set up a plot and then use the polygon() function to place the rectangles where you want them. -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Purna chander Sent: Tuesday, January 22, 2013 3:41 AM To: r-help Subject: [R] How to remove the vertical space between two graps Hi, I have created a barplot using the following code. a-c(11,23,15,34,42,31) m-matrix(a,nrow=2) m[2,]-(-1)*m[2,] par(mar=c(4,4,4,0)) barplot(m[2,],horiz=T) par(mar=c(4,0,4,2)) barplot(m[1,],horiz=T,col=black) and the plot obtained is shown in plot1.tiff. I was not willing to see the gap (vertical space) between two graphs. How can I achieve it? Further I tried to achieve my goal in a single plot, for which I tried this code: a-c(11,23,15,34,42,31) m-matrix(a,nrow=2) m[2,]-(-1)*m[2,] barplot(m,horiz=T,beside=T) and the plot obtained is showed in plot2.tiff in the second attempt I'm able to place the bars next to each other using beside=T argument. However, I fail when I use beside=F argument (obtained plot3.tiff with this). Can you suggest me in achieving my goal (similar to plot2 with no vertical space)? Regards, Purna __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with interpolation
Next time please provide sample data in a form we can easily read in (look at ?dput for example) If i understand this right: yourData-read.table(header=T,text= datedays rate 1996_01_02 155.74590 1996_01_02 505.67332 1996_01_02 785.60888 1996_01_02 1695.47376 1996_01_02 2605.35267 1996_01_02 3515.27619 1996_01_03 145.74740 1996_01_03 495.67226 1996_01_03 775.60371 1996_01_03 1685.47058 1996_01_03 2595.34662 1996_01_03 3505.26630 ) results-sapply(unique(yourData$date),function(thisDate){ subSet - yourData[yourData$date==thisDate,] appr-approx(subSet$days,subSet$rate,xout=seq(0,360, by=30)) rates-appr$y names(rates)-appr$x rates }) colnames(results)-unique(yourData$date) This gives 13 results per date though, and it can't interpolate the first and last value. If you need those values that are not in-between, try spline instead of approx (you never specified how you wanted to interpolate). On 17.01.2013, at 15:50, beanbandit wrote: hi guys I need to interpolate values for the zero coupon yield curve. Following data is given datedays rate 1996 01 02 155.74590 1996 01 02 505.67332 1996 01 02 785.60888 1996 01 02 1695.47376 1996 01 02 2605.35267 1996 01 02 3515.27619 1996 01 03 145.74740 1996 01 03 495.67226 1996 01 03 775.60371 1996 01 03 1685.47058 1996 01 03 2595.34662 1996 01 03 3505.26630 For every day i have to interpolate 10 values, for example for maturities of 30,60 or 90 days. I have interpolate data for a one year period, 10 interpolation values a day, so that equals 3600 values. what's the easiest way to implement this in R? please hlep! -- View this message in context: http://r.789695.n4.nabble.com/Help-with-interpolation-tp4655843.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ellipse in PCA with parameters a and bdefined.
thank you David, it was my first idea, but i don't know if it is right statistically speaking!!! -- View this message in context: http://r.789695.n4.nabble.com/Ellipse-in-PCA-with-parameters-a-and-b-defined-tp4656215p4656274.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles with R for a big data.frame
On Jan 22, 2013, at 5:58 AM, Simonas Kecorius wrote: Hey Duncan, Neither me do imagine what formula OpenOffice uses for quantiles. I have checked a data string, 24 values, to calculate a quantiles with OpenOffice and R. The result is identical. The problem arises when I try to implement quantile calculation in this form: dat2-with(dat1,aggregate(cbind(dat1[, 1:71]),by=list(newID),quantiles,0.1,type=4)) . This code does not generate an error, but I guess neither a right result. You guess? What result and what is right? So my question would be: How I could calculate quantiles for a big data.frame in R (71 columns and 288 rows). I need to take 24 rows, calculate quantiles, then take another 24 rows etc..for 71 columns. You have already been told that you are misspelling the name of the R function. The other open question in my mind is whether you were hoping for something other than a single quantile (in this case the 10th percentile, or perhaps wanted the quantiles that would divide your data into deciles? If you want to do the calculation within groups then the second argument to `aggregate` must specify the grouping. By design `aggregate` will apply the function on all columns. -- David. Thanks in advance. 2013/1/22 Duncan Murdoch murdoch.dun...@gmail.com On 13-01-21 6:41 PM, Simonas Kecorius wrote: Dear R users, I came up to a problem dealing with percentiles in R. From my previous questions: I do have a big data.frame, with lots of columns and rows. The following command enables me to calculate means for all data frame. dat1$newID-rep(1:(nrow(dat1)/**12),each=12) #if nrow(dat1)/12 is integer dat2-with(dat1,aggregate(**cbind(dat1[, 1:71]),by=list(**newID),mean)) What I need is to calculate percentiles for each group (there are 12 values in a group). I tried the following: duomenai-with(dat1,aggregate(**cbind(dat1[,1:71]),by=list(** newID),quantiles,0.1,type=4)) You didn't define quantiles, so that won't work. Assuming that's a typo, and you meant quantile... First, is the following syntax is right? Secondly, I tried to calculate percentiles using OpenOffice and there is disagreement between values. If I do calculation for some number row, than R and OpenOffice numbers coincide, but for a data.frame it seams that something goes wrong. There are lots of different formulas for empirical quantiles. The ones available in R are described in the ?quantile help topic. What formula does OpenOffice use? Duncan Murdoch -- Simonas Kecorius ** [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A smart way to use $ in data frame
Hello Greg, Thanks very much! This helps! Cheers, Rebecca From: Greg Snow [mailto:538...@gmail.com] Sent: Friday, January 18, 2013 5:17 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] A smart way to use $ in data frame The important thing to understand is that $ is a shortcut for [[ and you are moving into the realm where a shortcut is the longest distance between 2 points (see fortune(312)). So your code can be something like: state - 'oldstate' balance - 'oldbalance' dataa[[balance]][ dataa[[state]]=='AR' ] You may also benefit from learning to use tools like with and subset (though subset has its own complications when used inside of other functions) or grep and match to find the columns of interest. On Fri, Jan 18, 2013 at 12:40 PM, Yuan, Rebecca rebecca.y...@bankofamerica.commailto:rebecca.y...@bankofamerica.com wrote: Hello all, I have a data frame dataa: newdate newstate newid newbalance newaccounts 1 31DEC2001AR 1 1170 61 2 31DEC2001VA 2 4565 54 3 31DEC2001WA 3 2726 35 4 31DEC2001AR 3 2700 35 The following gives me the balance of state AR: dataa$newbalance[data$newstate == 'AR'] 1170 2700 Now, I have another different data frame datab, it is very similar to data, except that the name of the columns are different, and the order of the columns are different: oldstate olddate oldbalance oldid oldaccounts 1 AR 31DEC20121234 7 40 2 WA 31DEC2012 3 30 3 VA 31DEC20122345 5 23 3 AR 31DEC20125673 5 23 datab$oldbalance[datab$oldstate== 'AR' ] 1234 5673 Could I have a way to quote data$balance[data$state == 'AR'] in general, where balance=oldbalance, state=oldstate when data=dataa, and balance = newbalance, state = newstate when data=datab ? Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended r...{{dropped:20}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot two time series with different length and different starting point in one figure.
Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended r...{{dropped:5}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one figure. Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Merge those 2 series. ?merge Regards Petr Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended r...{{dropped:5}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hello Petr, As the time series have the same column names, I got the error message like: m1-merge(A, B, by.x = time, by.y = balance) Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s) To plot A and B in one plot is to compare the difference between them... Any other thoughts? Thanks, Rebecca -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 22, 2013 10:28 AM To: Yuan, Rebecca; R help Subject: RE: plot two time series with different length and different starting point in one figure. Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one figure. Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Merge those 2 series. ?merge Regards Petr Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Erro message in glmmADMB
peixotop peixotop at leuphana.de writes: I am using glmmADMB and when I run some models, I recieve the following message: Erro em glmmadmb(eumencells ~ 1 + (1 | owners), data = pred3, family = nbinom, : The function maximizer failed (couldn't find STD file) Furthermore: Lost warning messages: Command execution 'C:\Windows\system32\cmd.exe /c C:/Users/helenametal/Documents/R/win-library/2.15/ glmmADMB/bin/windows32/glmmadmb.exe -maxfn 500 -maxph 5 -noinit -shess' teve status 1 : Mensagens de aviso perdidas: execução do comando 'C:\Windows\system32\cmd.exe /c C:/Users/helenametal/Documents/R/win-library/2.15/ glmmADMB/bin/windows32/glmmadmb.exe -maxfn 500 -maxph 5 -noinit -shess' teve status 1 Sorry, this is not nearly enough information for diagnosis. This message just means that *something* went wrong during the optimization step (I do appreciate that it would be good to improve the error messages, although there may not be that much more information available). Please (1) follow-up to r-sig-mixed-mod...@r-project.org and (2) give more complete information on the full model you ran, contents of pred3, etc. (see e.g. http://tinyurl.com/reproducible-000) Here's a minimal example that shows that a model of the form you present *could* work: pred3 - data.frame(owners=rep(letters[1:20],each=20)) set.seed(1001) u - rnorm(20,sd=2) pred3$eumencells - rnbinom(nrow(pred3),mu=exp(1.5+u),size=2) library(glmmADMB) glmmadmb(eumencells ~ 1 + (1|owners),family=nbinom,data=pred3) -- although it doesn't work very well -- it essentially estimates the random effects as zero, lumps the among-owner variance into the NB variance, and mis-estimates the intercept. I don't blame glmmADMB for this, though, it's a small data set and a tough problem. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Assistant
Good-day Sir, I am R.Language users but am try to estimate parameter of beta distribution particular dataset but give this error, which is not clear to me: (Initial value in vmmin is not finite) beta.fit - fitdistr(data,densfun=dbeta,shape1=value , shape2=value) kindly assist. expecting your reply: __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple use of dcast (reshape2 package)
Hi, This could be done with ?aggregate() res-aggregate(aa$Eaten,by=list(ID=aa$ID),FUN=function(x) x) res1-data.frame(ID=res[,1],data.frame(res[[2]])) names(res1)[2:3]-unique(aa$Target) res1 # ID TPP GPA #1 1 0 9 #2 2 1 11 #3 3 3 8 #4 4 1 8 #5 5 2 10 A.K. - Original Message - From: Patrick Connolly p_conno...@slingshot.co.nz To: R-help r-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 4:23 AM Subject: [R] Simple use of dcast (reshape2 package) Suppose I have a small dataframe aa Target Eaten ID 50 TPP 0 1 51 TPP 1 2 52 TPP 3 3 53 TPP 1 4 54 TPP 2 5 50.1 GPA 9 1 51.1 GPA 11 2 52.1 GPA 8 3 53.1 GPA 8 4 54.1 GPA 10 5 And I want to reshape it into ID TPP GPA 1 1 0 9 2 2 1 11 3 3 3 8 4 4 1 8 5 5 2 10 I realise that dcast function in the reshape2 package can handle much more complicated tasks than that, but I can't make it do a simple one. If I simply tried dcast(aa, ... ~ Target) Using ID as value column: use value.var to override. Aggregation function missing: defaulting to length Eaten GPA TPP 1 0 0 1 2 1 0 2 3 2 0 1 4 3 0 1 5 8 2 0 6 9 1 0 7 10 1 0 8 11 1 0 As per the help file, it's giving counts of the numbers in the Eaten column since that's the default fun.aggregate value. My questions are: what fun.aggregate would work? Alternatively, can value.var be set to something useful? TIA -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___ Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Remove my adress from mailing list
HI, Please check the link: https://stat.ethz.ch/mailman/listinfo/r-help At the end, there is an option to unsubscribe: To unsubscribe from R-help, get a password reminder, or change your subscription options enter your subscription email address: Hope it helps: A.K. - Original Message - From: M. Maurice m.mauric...@yahoo.de To: R-help@r-project.org R-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 7:13 AM Subject: [R] Remove my adress from mailing list Hello! I wish, that my email-adress is removed from the R-help mailing list. Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to align group based on the common values of two columns in r
Hi, I am not sure about the logic behind creation of groups, especially, how do you want to assign the group number to a particular combination of Feature and OS. One possible way would be: dat1$Group-paste(dat1[,1],dat1[,2],sep=) dat1 # Feature OS Group #1 4 2 42 #2 4 1 41 #3 4 3 43 #4 1 2 12 #5 4 1 41 A.K. - Original Message - From: Tammy Ma metal_lical...@live.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 4:28 AM Subject: [R] How to align group based on the common values of two columns in r HI, I met this problem: I have the feature data frame: Feature OS 4 2 4 1 4 3 1 2 4 1 what I want to do is to autimatically create one more column called group: Feature OS Group 4 2 1 4 1 2 4 3 3 1 2 4 4 1 2 I don't want Ifelse, because I have so many combination of feature and OS, I even can not account. I just want to have sth to autimatically create group indicator based on the difference combination of feature and OS. Thanks for your help. Kind regards, Tammy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] simple reshape
Hi, You could also do this by: set.seed(15) tr.df-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5)) tr.df$time-1:87 res- reshape(tr.df, varying=2:3, v.name=value, times=c(prep,postp),idvar=time,timevar=prepost,direction=long) res-res[order(res$ID,res$time),] row.names(res)-1:nrow(res) head(res,4) # ID time prepost value #1 1 1 prep 2.2042281 #2 1 1 postp 1.3553657 #3 1 2 prep 1.3900879 #4 1 2 postp 0.8674933 A.K. - Original Message - From: Jim Lemon j...@bitwrit.com.au To: Troels Ring tr...@gvdnet.dk Cc: r-help@r-project.org Sent: Tuesday, January 22, 2013 4:46 AM Subject: Re: [R] simple reshape On 01/22/2013 07:19 PM, Troels Ring wrote: Dear friends - this is a very simple question - I have a data frame 'data.frame': 87 obs. of 3 variables: $ ID : int 1 1 1 2 2 2 3 3 3 4 ... $ prep : num 1.18 1.38 1.34 1.93 2.38 2.24 1.17 1.13 1.21 1.89 ... $ postp: num 0.63 0.71 0.75 1.01 1.12 1.07 0.87 0.64 0.7 0.8 ... - 29 persons (ID) each measured three times before and after an intervention: prep and postp - I need data rearranged like ID time val 1 1 prep 1 2 postp 1 1 1 2 1 1 1 2 I cannot make reshape or stack do the trick. Hi Troels, With a bit of extra processing I think rep_n_stack (prettyR) will do what you want: # fake some data tr.df-data.frame(ID=rep(1:29,each=3),prep=runif(87,1,3),postp=runif(87,0.5,1.5)) # add a repeat number tr.df$repno-rep(1:3,29) # get the reshaped data frame trlong.df-rep_n_stack(tr.df,to.stack=2:3, stack.names=c(prepost,value)) # reorder it trlong.df[order(trlong.df$ID,trlong.df$repno),] ID repno prepost value 1 1 1 prep 2.9158693 88 1 1 postp 0.9932342 2 1 2 prep 1.2852817 89 1 2 postp 0.8187234 3 1 3 prep 2.5771902 90 1 3 postp 1.0033936 4 2 1 prep 2.2969320 91 2 1 postp 0.6837140 5 2 2 prep 1.3083553 92 2 2 postp 1.4537096 6 2 3 prep 2.8654184 93 2 3 postp 1.0880881 ... Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Assistant
You're not giving people much to work with. I googled the error, and it seems to come from the call to optim and has likely to do with bad starting parameters. That said, the documentation of fitdistr doesn't suggest it even supports dbeta, there is only a beta mentioned. On 22.01.2013, at 17:07, Adelabu Ahmmed wrote: Good-day Sir, I am R.Language users but am try to estimate parameter of beta distribution particular dataset but give this error, which is not clear to me: (Initial value in vmmin is not finite) beta.fit - fitdistr(data,densfun=dbeta,shape1=value , shape2=value) kindly assist. expecting your reply: __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Assistant
Hello, You are calling the function in a wrong way. In the case of a beta fit, densfun should be the quoted string beta and the initial parameter values are elements of a named list. Like this: library(MASS) x - rbeta(1000, shape1 = 2, shape2 = 0.5) fitdistr(x, densfun = beta, start = list(shape1 = 1, shape2 = 1)) As for your error, I only got it if the data clearly can not fit a beta. y - rgamma(1000, shape = 2, rate = 0.5) fitdistr(y, densfun = beta, start = list(shape1 = 1, shape2 = 1)) Error in optim(x = c(6.19809706003757, 2.32632108817696, 3.60844436009277, : initial value in 'vmmin' is not finite So revise the way you call fitdistr and then, if the error persists, revise the parametric distribution to be fitted. Hope this helps, Rui Barradas Em 22-01-2013 16:07, Adelabu Ahmmed escreveu: Good-day Sir, I am R.Language users but am try to estimate parameter of beta distribution particular dataset but give this error, which is not clear to me: (Initial value in vmmin is not finite) beta.fit - fitdistr(data,densfun=dbeta,shape1=value , shape2=value) kindly assist. expecting your reply: __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] c(), rbind and cbind functions - why type of resulting object is double
Hello Everyone, I am using R 2.15.0 and I came across this behaviour and I was wondering why I don't get an integer vector or and integer matrix with the following code: z - c(1, 2:0, 3, 4:8) typeof(z) [1] double z - rbind(1, 2:0, 3, 4:8) Warning message: In rbind(1, 2:0, 3, 4:8) : number of columns of result is not a multiple of vector length (arg 2) typeof(z) [1] double z - matrix(c(1, 2:0, 3, 4:8), nrow = 5) typeof(z) [1] double Shouldn't be typeof integer? According to the online help if everything is integer the output should be integer. But if I do this, I get an integer matrix. z - matrix(1:20, nrow = 5) typeof(z) [1] integer Thanks! Lourdes [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] c(), rbind and cbind functions - why type of resulting object is double
I was wondering why I don't get an integer vector or and integer matrix with the following code: z - c(1, 2:0, 3, 4:8) typeof(z) [1] double It is because the literals 1 and 3 have type double. Append L to make them literal integers. typeof(c(1L, 2:0, 3L, 4:8)) [1] integer The colon function (:) returns an integer vector if it can do so without giving a numerically incorrect answer. typeof(1.0:3.0) [1] integer typeof(1.5:3.5) [1] double Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lourdes Peña Castillo Sent: Tuesday, January 22, 2013 9:26 AM To: r-help@r-project.org Subject: [R] c(), rbind and cbind functions - why type of resulting object is double Hello Everyone, I am using R 2.15.0 and I came across this behaviour and I was wondering why I don't get an integer vector or and integer matrix with the following code: z - c(1, 2:0, 3, 4:8) typeof(z) [1] double z - rbind(1, 2:0, 3, 4:8) Warning message: In rbind(1, 2:0, 3, 4:8) : number of columns of result is not a multiple of vector length (arg 2) typeof(z) [1] double z - matrix(c(1, 2:0, 3, 4:8), nrow = 5) typeof(z) [1] double Shouldn't be typeof integer? According to the online help if everything is integer the output should be integer. But if I do this, I get an integer matrix. z - matrix(1:20, nrow = 5) typeof(z) [1] integer Thanks! Lourdes [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] c(), rbind and cbind functions - why type of resulting object is double
One place that talks about what Bill says is: http://www.burns-stat.com/documents/tutorials/impatient-r/more-r-key-objects/more-r-numbers/ Pat On 22/01/2013 17:35, William Dunlap wrote: I was wondering why I don't get an integer vector or and integer matrix with the following code: z - c(1, 2:0, 3, 4:8) typeof(z) [1] double It is because the literals 1 and 3 have type double. Append L to make them literal integers. typeof(c(1L, 2:0, 3L, 4:8)) [1] integer The colon function (:) returns an integer vector if it can do so without giving a numerically incorrect answer. typeof(1.0:3.0) [1] integer typeof(1.5:3.5) [1] double Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lourdes Peña Castillo Sent: Tuesday, January 22, 2013 9:26 AM To: r-help@r-project.org Subject: [R] c(), rbind and cbind functions - why type of resulting object is double Hello Everyone, I am using R 2.15.0 and I came across this behaviour and I was wondering why I don't get an integer vector or and integer matrix with the following code: z - c(1, 2:0, 3, 4:8) typeof(z) [1] double z - rbind(1, 2:0, 3, 4:8) Warning message: In rbind(1, 2:0, 3, 4:8) : number of columns of result is not a multiple of vector length (arg 2) typeof(z) [1] double z - matrix(c(1, 2:0, 3, 4:8), nrow = 5) typeof(z) [1] double Shouldn't be typeof integer? According to the online help if everything is integer the output should be integer. But if I do this, I get an integer matrix. z - matrix(1:20, nrow = 5) typeof(z) [1] integer Thanks! Lourdes [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Impatient R' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] change confidence interval line length in barplot2 (plotrix package)
Hi, is there any way to change the width of the horizontal line of confidence intervals in the barplot2 function in the plotrix package (independent of the width of the bars)? example code: library(plotrix) # Example with confidence intervals and grid hh - t(VADeaths)[, 1] mybarcol - gray20 ci.l - hh * 0.85 ci.u - hh * 1.15 mp - barplot2(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u) thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote: Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) in the `plot` of either of the series and then use `lines` for the other series, perhaps with a different color argument. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hello Arun, This would help me to get the date type of data. A new question comes out that since the dates are not exactly the same on two date sets, there are some NA values in the merged data set, such as 2012-09-28 NA NA5400726 14861715970 2012-09-30 5035606 14832837436 NA NA Does R have a function to convert the date to some format of Sep,2012, therefore when I merge those two, they will not have those NA numbers... Thanks, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 2:15 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, Assuming that 'raw_data' is data.frame with first column as raw_time: You could convert the raw_time to date format by as.Date(28FEB2002,format=%d%B%Y) #[1] 2002-02-28 In your data, it should be: raw_data$raw_time- as.Date(raw_time,format=%d%B%Y) Could you just dput() a few lines of your dataset if this is not working? Tx. A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 2:08 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, My data shows that I do not have a date type of data: summary(raw_data) raw_time raw_acct raw_baln 28FEB2002: 1 Min. : 61714 Min. :117079835 28FEB2003: 1 1st Qu.: 75587 1st Qu.:158035150 28FEB2005: 1 Median :100234 Median :206906298 28FEB2006: 1 Mean : 96058 Mean :210550369 28FEB2007: 1 3rd Qu.:116908 3rd Qu.:263623782 28FEB2009: 1 Max. :121853 Max. :325290870 (Other) :127 How could I transfer the raw_time column to a date format, such as summary(dateA) Min. 1st Qu. Median Mean 3rd Qu. Max. 2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31 Thanks very much! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 12:39 PM To: Yuan, Rebecca Cc: R help; Petr PIKAL Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi, You could also try this: dateA-seq.Date(as.Date(1jan2012,format=%d%b%Y),as.Date(31Dec2012,format=%d%b%Y),by=day) dateB-seq.Date(as.Date(1Mar2012,format=%d%b%Y),as.Date(30Nov2012,format=%d%b%Y),by=day) set.seed(15) A-data.frame(dateA,value=sample(1:300,366,replace=TRUE)) set.seed(25) B-data.frame(dateB,value=sample(1:300,275,replace=TRUE)) library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) res1-res[complete.cases(res),] library(zoo) plot.zoo(res1) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'PIKAL Petr' petr.pi...@precheza.cz Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 10:36 AM Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hello Petr, As the time series have the same column names, I got the error message like: m1-merge(A, B, by.x = time, by.y = balance) Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s) To plot A and B in one plot is to compare the difference between them... Any other thoughts? Thanks, Rebecca -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 22, 2013 10:28 AM To: Yuan, Rebecca; R help Subject: RE: plot two time series with different length and different starting point in one figure. Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one figure. Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Merge those 2 series. ?merge Regards Petr Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the \ in...{{dropped:23}} __ R-help@r-project.org mailing list
Re: [R] change confidence interval line length in barplot2 (plotrix package)
There does not appear to be any such function as barplot2 in the current version (3.4-5) of the plotrix package. Moreover I can find no reference to such a function in the NEWS for plotrix. cheers, Rolf Turner On 01/23/2013 07:28 AM, Martin Batholdy wrote: Hi, is there any way to change the width of the horizontal line of confidence intervals in the barplot2 function in the plotrix package (independent of the width of the bars)? example code: library(plotrix) # Example with confidence intervals and grid hh - t(VADeaths)[, 1] mybarcol - gray20 ci.l - hh * 0.85 ci.u - hh * 1.15 mp - barplot2(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hello David, If I use plot with the following code: plot(A, type = o, col = plot_colors[plotcolor], axes = FALSE, ann = FALSE) par(new=TRUE) plot(B, type = o, col = plot_colors[plotcolor+1], axes = FALSE, ann = FALSE) box() I will have the two series in one plot, but they are only from March,2012 to Nov, 2012, the nonoverlapping months are dropped out... I know in Matlab that I can specify the x axis such as Plot(timeofA, A) Hold on; Plot(timeofB, B) to get them in the same figure, but in R, I do not know how to do it. Thanks, Rebecca -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Tuesday, January 22, 2013 2:34 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote: Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) in the `plot` of either of the series and then use `lines` for the other series, perhaps with a different color argument. -- David Winsemius Alameda, CA, USA -- This message, and any attachments, is for the intended r...{{dropped:2}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to assign time series to a vector with one leap year
Hello All, I am trying to do the time series analysis in R and I want to assign a vector as a time series. The data I provided is hourly. The data is from Jan 1 2008 to Dec 31 2009. How can I assign the data such that the first year is leap year and second is not ? airtemp - read.csv(airtemp.csv,header=T,sep=) aw - ts(airtemp,start=2008,frequency=8784,end=2009) I assigned frequency as 8784 because 2008 year will have 8784 hourly data points and 2009 has 8760 data points. The total data points are 17544 The data can be found on https://www.dropbox.com/s/03z74632v1f3g1e/airtemp.csv I apologize if this is very trivial to some of you. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
On Jan 22, 2013, at 11:42 AM, Yuan, Rebecca wrote: Hello David, If I use plot with the following code: plot(A, type = o, col = plot_colors[plotcolor], axes = FALSE, ann = FALSE) par(new=TRUE) plot(B, type = o, col = plot_colors[plotcolor+1], axes = FALSE, ann = FALSE) box() I will have the two series in one plot, but they are only from March,2012 to Nov, 2012, the nonoverlapping months are dropped out... I know in Matlab that I can specify the x axis such as Plot(timeofA, A) Hold on; Plot(timeofB, B) to get them in the same figure, but in R, I do not know how to do it. As I said before . You need to use the xlim argument to 'plot'. If you insist on using plot twice then you will need to use 'xlim=' twice, although I thought it would be easier to use `plo`t first and `lines` second. -- David. Thanks, Rebecca -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Tuesday, January 22, 2013 2:34 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. On Jan 22, 2013, at 7:07 AM, Yuan, Rebecca wrote: Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. You could set the xlim argument to c( min(timeA, timeB), max(timeA, timeB) ) in the `plot` of either of the series and then use `lines` for the other series, perhaps with a different color argument. -- David Winsemius Alameda, CA, USA -- This message, and any attachments, is for the intended...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating a Data Frame from an XML
Hello, I'm attempting to read information from an XML into a data frame in R using the XML package. I am unable to get the data into a data frame as I would like. I have some sample code below. *XML Code:* Header... Data I want in a data frame: data row BRAND=GMC NUM=1 YEAR=1999 VALUE=1 / row BRAND=FORD NUM=1 YEAR=2000 VALUE=12000 / row BRAND=GMC NUM=1 YEAR=2001 VALUE=12500 / row BRAND=FORD NUM=1 YEAR=2002 VALUE=13000 / row BRAND=GMC NUM=1 YEAR=2003 VALUE=14000 / row BRAND=FORD NUM=1 YEAR=2004 VALUE=17000 / row BRAND=GMC NUM=1 YEAR=2005 VALUE=15000 / row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / row BRAND=FORD NUM=1 YEAR=2007 VALUE=17500 / row BRAND=GMC NUM=1 YEAR=2008 VALUE=22000 / /data *R Code:* doc -xmlInternalTreeParse (Sample2.xml) top - xmlRoot (doc) xmlName (top) names (top) art - top [[row]] art ** *Output:* artrow BRAND=GMC NUM=1 YEAR=1999 VALUE=1/ This is where I am having difficulties. I am unable to access additional rows; ( i.e. row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / ) and I am unable to access the individual entries to actually create the data frame. The data frame I would like is as follows: BRANDNUMYEARVALUE GMC1 1999 1 FORD 2 2000 12000 GMC1 2001 12500 etc Any help or suggestions would be appreciated. Conversly, my eventual goal would be to take a data frame and write it into an XML in the previously shown format. Thank you AG [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Introduction and help request
Hello all I am a researcher in the field of tourism and have just recently installed R64 and RStudio onto my Mac (running latest OS). I am ran into some problems installing additional packages. I have looked through the General FAQs and Mac FAQS but haven't been able to find a solution. I have downloaded the various packages I need from CRAN sources and while some have successfully installed others have not. I have been following the instructions on the Mac FAQ to unzip and install the downloaded packages using the command line but the results seem to indicate an error (they are installed but then don't work properly and so are subsequently uninstalled). It happens on more than one so that's why I thought it might be something generic I am doing. Here is a copy of the command line results: Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/Hmisc_3.10-1.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ * installing *source* package ‘Hmisc’ ... ** package ‘Hmisc’ successfully unpacked and MD5 sums checked ** libs *** arch - i386 sh: make: command not found ERROR: compilation failed for package ‘Hmisc’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/Hmisc’ Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/acepack_1.3-3.2.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ * installing *source* package ‘acepack’ ... ** package ‘acepack’ successfully unpacked and MD5 sums checked ** libs *** arch - i386 sh: make: command not found ERROR: compilation failed for package ‘acepack’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/acepack’ Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/arm_1.6-01.02.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ ERROR: dependency ‘lme4’ is not available for package ‘arm’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/arm’ Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/chron_2.3-43.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ * installing *source* package ‘chron’ ... ** package ‘chron’ successfully unpacked and MD5 sums checked ** libs *** arch - i386 sh: make: command not found ERROR: compilation failed for package ‘chron’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/chron’ Thank you for any help Ross __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot.mob() fails with cut() error 'breaks' are not unique
DeaR all, I am using mob() for model based partitioning, with a dichotomous variable (participant's correct/incorrect response to a test item) regressed onto a continuous predictor related to a given property of the test item. Although this variable is continuous, the value of this variable for many items in this particular analysis is 0. The partitioning criterion is self-reported ability in a related area. mob1 - mob( correct ~ circular.mean | srp.dimension, control=mob_control(alpha=.001), model=glinearModel, family=binomial() ) plot(mob1) Error in cut.default(x, breaks = breaks, include.lowest = TRUE) : 'breaks' are not unique The same persists if I specify either a desired number of breaks, or explicit breakpoints (e.g. breaks=3 or breaks=c(-0.1,0.1,0.5)). I guess this is to do with the funny distribution of the predictor variable, but I'm not sure what to do about it. Many thanks and apologies if this doesn't fit the mailing list---it is my first posting! Jason Musil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to align group based on the common values of two columns in r
Hi, You could also try: dat1-read.table(text= Feature OS 4 2 4 1 4 3 1 2 4 1 ,sep=,header=TRUE) dat1$Group- as.numeric(factor(Reduce(paste0,dat1))) A.K. - Original Message - From: Tammy Ma metal_lical...@live.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 4:28 AM Subject: [R] How to align group based on the common values of two columns in r HI, I met this problem: I have the feature data frame: Feature OS 4 2 4 1 4 3 1 2 4 1 what I want to do is to autimatically create one more column called group: Feature OS Group 4 2 1 4 1 2 4 3 3 1 2 4 4 1 2 I don't want Ifelse, because I have so many combination of feature and OS, I even can not account. I just want to have sth to autimatically create group indicator based on the difference combination of feature and OS. Thanks for your help. Kind regards, Tammy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hi, dateA-seq.Date(as.Date(1jan2012,format=%d%b%Y),as.Date(31Dec2012,format=%d%b%Y),by=day) dateB-seq.Date(as.Date(1Mar2012,format=%d%b%Y),as.Date(30Nov2012,format=%d%b%Y),by=day) set.seed(15) A-data.frame(dateA,value=sample(1:300,366,replace=TRUE)) set.seed(25) B-data.frame(dateB,value=sample(1:300,275,replace=TRUE)) res-merge(A,B,by.x=dateA,by.y=dateB) #it works A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'PIKAL Petr' petr.pi...@precheza.cz Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 10:36 AM Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hello Petr, As the time series have the same column names, I got the error message like: m1-merge(A, B, by.x = time, by.y = balance) Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s) To plot A and B in one plot is to compare the difference between them... Any other thoughts? Thanks, Rebecca -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 22, 2013 10:28 AM To: Yuan, Rebecca; R help Subject: RE: plot two time series with different length and different starting point in one figure. Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one figure. Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Merge those 2 series. ?merge Regards Petr Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hi, You could also try this: dateA-seq.Date(as.Date(1jan2012,format=%d%b%Y),as.Date(31Dec2012,format=%d%b%Y),by=day) dateB-seq.Date(as.Date(1Mar2012,format=%d%b%Y),as.Date(30Nov2012,format=%d%b%Y),by=day) set.seed(15) A-data.frame(dateA,value=sample(1:300,366,replace=TRUE)) set.seed(25) B-data.frame(dateB,value=sample(1:300,275,replace=TRUE)) library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) res1-res[complete.cases(res),] library(zoo) plot.zoo(res1) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'PIKAL Petr' petr.pi...@precheza.cz Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 10:36 AM Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hello Petr, As the time series have the same column names, I got the error message like: m1-merge(A, B, by.x = time, by.y = balance) Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s) To plot A and B in one plot is to compare the difference between them... Any other thoughts? Thanks, Rebecca -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 22, 2013 10:28 AM To: Yuan, Rebecca; R help Subject: RE: plot two time series with different length and different starting point in one figure. Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one figure. Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Merge those 2 series. ?merge Regards Petr Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fdHess function
Your question is better addressed to the R-help@R-project.org mailing list, which I am copying on this reply. You are confusing a statistical concept, the Fisher Information matrix, with a numerical concept, the Hessian matrix of a scalar function of a vector argument. The Fisher information matrix is the Hessian matrix of a particular function at its optimum and I have forgotten whether that function is the log-likelihood or negative twice the log-likelihood or ... Rather than get it wrong I am sending a copy of this reply to the list where many of the readers will be able to answer you more reliably than I can. On Tue, Jan 22, 2013 at 1:22 PM, Marcos Coque Jr mcoqu...@yahoo.com.brwrote: Dear Bates, I am using the fdHess function for R language. And I have a question. What is the relationship with the Hessian and Fisher Information in your function? Because I think that Fisher Information=-Hessian, but I found the oposite in your function. Maybe I be something wrong... Thanks, Marcos [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hi Rebecca, Assuming that 'raw_data' is data.frame with first column as raw_time: You could convert the raw_time to date format by as.Date(28FEB2002,format=%d%B%Y) #[1] 2002-02-28 In your data, it should be: raw_data$raw_time- as.Date(raw_time,format=%d%B%Y) Could you just dput() a few lines of your dataset if this is not working? Tx. A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 2:08 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, My data shows that I do not have a date type of data: summary(raw_data) raw_time raw_acct raw_baln 28FEB2002: 1 Min. : 61714 Min. :117079835 28FEB2003: 1 1st Qu.: 75587 1st Qu.:158035150 28FEB2005: 1 Median :100234 Median :206906298 28FEB2006: 1 Mean : 96058 Mean :210550369 28FEB2007: 1 3rd Qu.:116908 3rd Qu.:263623782 28FEB2009: 1 Max. :121853 Max. :325290870 (Other) :127 How could I transfer the raw_time column to a date format, such as summary(dateA) Min. 1st Qu. Median Mean 3rd Qu. Max. 2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31 Thanks very much! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 12:39 PM To: Yuan, Rebecca Cc: R help; Petr PIKAL Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi, You could also try this: dateA-seq.Date(as.Date(1jan2012,format=%d%b%Y),as.Date(31Dec2012,format=%d%b%Y),by=day) dateB-seq.Date(as.Date(1Mar2012,format=%d%b%Y),as.Date(30Nov2012,format=%d%b%Y),by=day) set.seed(15) A-data.frame(dateA,value=sample(1:300,366,replace=TRUE)) set.seed(25) B-data.frame(dateB,value=sample(1:300,275,replace=TRUE)) library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) res1-res[complete.cases(res),] library(zoo) plot.zoo(res1) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'PIKAL Petr' petr.pi...@precheza.cz Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 10:36 AM Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hello Petr, As the time series have the same column names, I got the error message like: m1-merge(A, B, by.x = time, by.y = balance) Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s) To plot A and B in one plot is to compare the difference between them... Any other thoughts? Thanks, Rebecca -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 22, 2013 10:28 AM To: Yuan, Rebecca; R help Subject: RE: plot two time series with different length and different starting point in one figure. Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one figure. Hello, I do have two different time series A and B, they are different in length and starting point. A starts in Jan, 2012 and ends in Dec, 2012 and B starts in March, 2012 and ends in Nov, 2012. How can I plot those two series A and B in the same plot? I.E., from Jan. 2012 - Feb, 2012, it would have one data point from A and from Mar, 2012-Nov, 2012, it would have two data points from A and B, and in December 2012, it would have one data point from A. Merge those 2 series. ?merge Regards Petr Thanks very much! Cheers, Rebecca -- This message, and any attachments, is for the intended...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer. If you are not the intended recipient, please delete this message. __ R-help@r-project.org mailing list
Re: [R] change confidence interval line length in barplot2 (plotrix package)
On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote: Hi, is there any way to change the width of the horizontal line of confidence intervals in the barplot2 function in the plotrix package (independent of the width of the bars)? example code: library(plotrix) # Example with confidence intervals and grid hh - t(VADeaths)[, 1] mybarcol - gray20 ci.l - hh * 0.85 ci.u - hh * 1.15 mp - barplot2(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u) When I did an sos::findFn(barplot2) search to locate the real `barplot2` O alos noted in the same package (gplots) a function named `ooplot`. It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change confidence interval line length in barplot2 (plotrix package)
Ok, I have to apologize, I confused the packages. It's the function barplot2 from the gplots package! It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. Unfortunately ci.lwd controls the thickness of the line but not the horizontal width. On Jan 22, 2013, at 21:24 , David Winsemius dwinsem...@comcast.net wrote: On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote: Hi, is there any way to change the width of the horizontal line of confidence intervals in the barplot2 function in the plotrix package (independent of the width of the bars)? example code: library(plotrix) # Example with confidence intervals and grid hh - t(VADeaths)[, 1] mybarcol - gray20 ci.l - hh * 0.85 ci.u - hh * 1.15 mp - barplot2(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u) When I did an sos::findFn(barplot2) search to locate the real `barplot2` O alos noted in the same package (gplots) a function named `ooplot`. It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Introduction and help request
On 22-01-2013, at 19:20, Ross Tinsley rtins...@htmi.ch wrote: Hello all I am a researcher in the field of tourism and have just recently installed R64 and RStudio onto my Mac (running latest OS). I am ran into some problems installing additional packages. I have looked through the General FAQs and Mac FAQS but haven't been able to find a solution. I have downloaded the various packages I need from CRAN sources and while some have successfully installed others have not. I have been following the instructions on the Mac FAQ to unzip and install the downloaded packages using the command line but the results seem to indicate an error (they are installed but then don't work properly and so are subsequently uninstalled). It happens on more than one so that's why I thought it might be something generic I am doing. Here is a copy of the command line results: Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/Hmisc_3.10-1.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ * installing *source* package ‘Hmisc’ ... ** package ‘Hmisc’ successfully unpacked and MD5 sums checked ** libs *** arch - i386 sh: make: command not found ERROR: compilation failed for package ‘Hmisc’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/Hmisc’ Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/acepack_1.3-3.2.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ * installing *source* package ‘acepack’ ... ** package ‘acepack’ successfully unpacked and MD5 sums checked ** libs *** arch - i386 sh: make: command not found ERROR: compilation failed for package ‘acepack’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/acepack’ Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/arm_1.6-01.02.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ ERROR: dependency ‘lme4’ is not available for package ‘arm’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/arm’ Rosss-MacBook-Pro:~ rosstinsley$ R CMD INSTALL /private/var/folders/ld/3f2dl80154z47_864skpt2_8gn/T/Rtmp0ittcT/downloaded_packages/chron_2.3-43.tar.gz * installing to library ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’ * installing *source* package ‘chron’ ... ** package ‘chron’ successfully unpacked and MD5 sums checked ** libs *** arch - i386 sh: make: command not found ERROR: compilation failed for package ‘chron’ * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/chron’ 1. This belongs on the R-SIG-Mac mailing list 2. Why don't you use the R.app GUI to install the binary versions of the required packages? Much easier. 3. The message: sh: make: command not found means that you don't have make installed. Most likely you don't have other required tools installed. If you use the R.app GUI you don't really need tall those tools. Advice: use R.app to install and if needed get the Xcode tools but only if you intend to compile your own packages. Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fdHess function
Hi Doug: I was just looking at this coincidentally. When X is a vector, the Fisher Information I_{theta} = the negative expectation of the second derivatives of the log likelihood. So it's a matrix. In other words, I_theta = E(partial^2 /partial theta^2(log(X,theta).) where X is a vector. But, even though the the Fisher Information has a seemingly nice formula, ( and this is where my confusion arose when I was dealing with this and why I'm looking at it right now. I have short document that I wrote to myself explaining it so if anyone wants it, email me individually. It's nothing earth shattering ! ) in many cases taking the that expectation is not easy so the Fischer Information is approximated by its empirical counterpart which is obtained by summing each of the elements in the matrix given the n observations and then dividing each of the elements in the matrix by n. On Tue, Jan 22, 2013 at 3:27 PM, Douglas Bates ba...@stat.wisc.edu wrote: Your question is better addressed to the R-help@R-project.org mailing list, which I am copying on this reply. You are confusing a statistical concept, the Fisher Information matrix, with a numerical concept, the Hessian matrix of a scalar function of a vector argument. The Fisher information matrix is the Hessian matrix of a particular function at its optimum and I have forgotten whether that function is the log-likelihood or negative twice the log-likelihood or ... Rather than get it wrong I am sending a copy of this reply to the list where many of the readers will be able to answer you more reliably than I can. On Tue, Jan 22, 2013 at 1:22 PM, Marcos Coque Jr mcoqu...@yahoo.com.br wrote: Dear Bates, I am using the fdHess function for R language. And I have a question. What is the relationship with the Hessian and Fisher Information in your function? Because I think that Fisher Information=-Hessian, but I found the oposite in your function. Maybe I be something wrong... Thanks, Marcos [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fdHess function
I neglected to mention that, once you get either I_theta or some empirical estimate of it, you then invert it to get an estimate of the asymptotic covariance matrix of the MLE. On Tue, Jan 22, 2013 at 3:48 PM, Mark Leeds marklee...@gmail.com wrote: Hi Doug: I was just looking at this coincidentally. When X is a vector, the Fisher Information I_{theta} = the negative expectation of the second derivatives of the log likelihood. So it's a matrix. In other words, I_theta = E(partial^2 /partial theta^2(log(X,theta).) where X is a vector. But, even though the the Fisher Information has a seemingly nice formula, ( and this is where my confusion arose when I was dealing with this and why I'm looking at it right now. I have short document that I wrote to myself explaining it so if anyone wants it, email me individually. It's nothing earth shattering ! ) in many cases taking the that expectation is not easy so the Fischer Information is approximated by its empirical counterpart which is obtained by summing each of the elements in the matrix given the n observations and then dividing each of the elements in the matrix by n. On Tue, Jan 22, 2013 at 3:27 PM, Douglas Bates ba...@stat.wisc.eduwrote: Your question is better addressed to the R-help@R-project.org mailing list, which I am copying on this reply. You are confusing a statistical concept, the Fisher Information matrix, with a numerical concept, the Hessian matrix of a scalar function of a vector argument. The Fisher information matrix is the Hessian matrix of a particular function at its optimum and I have forgotten whether that function is the log-likelihood or negative twice the log-likelihood or ... Rather than get it wrong I am sending a copy of this reply to the list where many of the readers will be able to answer you more reliably than I can. On Tue, Jan 22, 2013 at 1:22 PM, Marcos Coque Jr mcoqu...@yahoo.com.br wrote: Dear Bates, I am using the fdHess function for R language. And I have a question. What is the relationship with the Hessian and Fisher Information in your function? Because I think that Fisher Information=-Hessian, but I found the oposite in your function. Maybe I be something wrong... Thanks, Marcos [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hi Rebecca, In the previous email, res-merge(Anew,Bnew) head(res) # Anew Bnew #2012-01-01 181 NA #2012-01-02 59 NA #2012-01-03 290 NA #2012-01-04 196 NA #2012-01-05 111 NA #2012-01-06 297 NA plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I guess, it would remove that from plotting) If you want to remove the NA rows: use, na.omit() or complete.cases()? #as I did in the previous email. Could you dput() an example dataset? A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 2:38 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, This would help me to get the date type of data. A new question comes out that since the dates are not exactly the same on two date sets, there are some NA values in the merged data set, such as 2012-09-28 NA NA 5400726 14861715970 2012-09-30 5035606 14832837436 NA NA Does R have a function to convert the date to some format of Sep,2012, therefore when I merge those two, they will not have those NA numbers... Thanks, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 2:15 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, Assuming that 'raw_data' is data.frame with first column as raw_time: You could convert the raw_time to date format by as.Date(28FEB2002,format=%d%B%Y) #[1] 2002-02-28 In your data, it should be: raw_data$raw_time- as.Date(raw_time,format=%d%B%Y) Could you just dput() a few lines of your dataset if this is not working? Tx. A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 2:08 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, My data shows that I do not have a date type of data: summary(raw_data) raw_time raw_acct raw_baln 28FEB2002: 1 Min. : 61714 Min. :117079835 28FEB2003: 1 1st Qu.: 75587 1st Qu.:158035150 28FEB2005: 1 Median :100234 Median :206906298 28FEB2006: 1 Mean : 96058 Mean :210550369 28FEB2007: 1 3rd Qu.:116908 3rd Qu.:263623782 28FEB2009: 1 Max. :121853 Max. :325290870 (Other) :127 How could I transfer the raw_time column to a date format, such as summary(dateA) Min. 1st Qu. Median Mean 3rd Qu. Max. 2012-01-01 2012-04-01 2012-07-01 2012-07-01 2012-09-30 2012-12-31 Thanks very much! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 12:39 PM To: Yuan, Rebecca Cc: R help; Petr PIKAL Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi, You could also try this: dateA-seq.Date(as.Date(1jan2012,format=%d%b%Y),as.Date(31Dec2012,format=%d%b%Y),by=day) dateB-seq.Date(as.Date(1Mar2012,format=%d%b%Y),as.Date(30Nov2012,format=%d%b%Y),by=day) set.seed(15) A-data.frame(dateA,value=sample(1:300,366,replace=TRUE)) set.seed(25) B-data.frame(dateB,value=sample(1:300,275,replace=TRUE)) library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) res1-res[complete.cases(res),] library(zoo) plot.zoo(res1) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'PIKAL Petr' petr.pi...@precheza.cz Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 10:36 AM Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hello Petr, As the time series have the same column names, I got the error message like: m1-merge(A, B, by.x = time, by.y = balance) Error in fix.by(by.x, x) : 'by' must specify uniquely valid column(s) To plot A and B in one plot is to compare the difference between them... Any other thoughts? Thanks, Rebecca -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Tuesday, January 22, 2013 10:28 AM To: Yuan, Rebecca; R help Subject: RE: plot two time series with different length and different starting point in one figure. Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Yuan, Rebecca Sent: Tuesday, January 22, 2013 4:07 PM To: R help Subject: [R] plot two time series with different length and different starting point in one
Re: [R] user units in plotrix
If you want to convert between different units using base graphics then look at the grconvertX and grconvertY functions (in the graphics package). These functions will convert from/to user coordinates, inches, device, figure, and plot coordinates. So you could use grconvertX to find out what user value on the x scale to give to draw.circle that would then generate a circle with a given size in inches, or relative to the device, figure, or plotting region. On Sun, Jan 20, 2013 at 2:59 PM, Murat Tasan mmu...@gmail.com wrote: hi all - i'm having some difficulty figuring out how to convert between user units (which i can't find a definition for in the plotrix package) and either (a) device units (e.g. inches with PDFs) or (b) user coordinates along any particular axis. as an example, suppose i set up a PDF device with inches, the device has both outer and inner magins, and the plot region has drastically different x and y coordinate ranges (e.g. xlim = c(0, 1), ylim = c(0, SOME_VERY_LARGE_NUMBER)). now i'd like to draw.circle(...) but i can't figure out what units the radius argument takes. user units doesn't appear to be inches in this case, and it it corresponds to user coordinates, i don't know which axis' scaling is to be used as the reference. ideally, one would be able to specify the radius in user coordinates while specifying _which_ axis to use as the standard (e.g. an axis = y or axis = x argument). getFigCtr(...) can help in figuring this out, but its argument takes the relative position of the figure region, rather than the plot region, which is more apt for properly placing shapes. i know the grid package has extensive unit conversion code, but i'm trying to update a series of figures using only base graphics... i can't seem to find a rigorous definition of user units anywhere in the plotrix package. anyone know of where i can find this info? cheers, -m __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to give a lengend in symbols functions
I don't see a symbols function in the gtools package, do you mean the symbols function in the graphics package? If so, there is not a simple legend or key function to create the legend (the number of possible options would make it more complicated than building the legend by hand). You will need to construct the legend by hand. You can use the symbols function to add the example symbols to the legend and the text function to add the explanatory text. The functions grconvertX, grconvertY, strheight, and strwidth will help with deciding where to place the text and symbols. On Mon, Jan 21, 2013 at 6:37 PM, Jie Tang totang...@gmail.com wrote: hi Rusers I am trying to use symbos in gtools package symbols(data1,data3,circle=data1/data3,inches=0.1,bg=lightgreen) Now I want to give a lengend to tell the reader the meaning or magnitude of these circle. How can I add these information in symbols plot just like legend in plot ? thank you . -- TANG Jie Email: totang...@gmail.com Tel: 0086-2154896104 Shanghai Typhoon Institute,China [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Create a Data Frame from an XML
Hello, I'm attempting to read information from an XML into a data frame in R using the XML package. I am unable to get the data into a data frame as I would like. I have some sample code below. *XML Code:* Header... Data I want in a data frame: data row BRAND=GMC NUM=1 YEAR=1999 VALUE=1 / row BRAND=FORD NUM=1 YEAR=2000 VALUE=12000 / row BRAND=GMC NUM=1 YEAR=2001 VALUE=12500 / row BRAND=FORD NUM=1 YEAR=2002 VALUE=13000 / row BRAND=GMC NUM=1 YEAR=2003 VALUE=14000 / row BRAND=FORD NUM=1 YEAR=2004 VALUE=17000 / row BRAND=GMC NUM=1 YEAR=2005 VALUE=15000 / row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / row BRAND=FORD NUM=1 YEAR=2007 VALUE=17500 / row BRAND=GMC NUM=1 YEAR=2008 VALUE=22000 / /data *R Code:* doc -xmlInternalTreeParse (Sample2.xml) top - xmlRoot (doc) xmlName (top) names (top) art - top [[row]] art ** *Output:* artrow BRAND=GMC NUM=1 YEAR=1999 VALUE=1/ * * This is where I am having difficulties. I am unable to access additional rows; ( i.e. row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / ) and I am unable to access the individual entries to actually create the data frame. The data frame I would like is as follows: BRANDNUMYEARVALUE GMC1 1999 1 FORD 2 2000 12000 GMC1 2001 12500 etc Any help or suggestions would be appreciated. Conversly, my eventual goal would be to take a data frame and write it into an XML in the previously shown format. Thank you AG [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change confidence interval line length in barplot2 (plotrix package)
Maybe a fortunate mistake. If you use the base graphics barplot(), you can use plotCI() in plotrix to add the confidence intervals with control over the width of the horizontal ends of the bars (if needed, the defaults are much narrower): out - barplot(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5) plotCI(out, hh, pch=, gap=0, ui=ci.u, li=ci.l, add=TRUE) -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Martin Batholdy Sent: Tuesday, January 22, 2013 2:42 PM To: r-help@r-project.org Subject: Re: [R] change confidence interval line length in barplot2 (plotrix package) Ok, I have to apologize, I confused the packages. It's the function barplot2 from the gplots package! It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. Unfortunately ci.lwd controls the thickness of the line but not the horizontal width. On Jan 22, 2013, at 21:24 , David Winsemius dwinsem...@comcast.net wrote: On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote: Hi, is there any way to change the width of the horizontal line of confidence intervals in the barplot2 function in the plotrix package (independent of the width of the bars)? example code: library(plotrix) # Example with confidence intervals and grid hh - t(VADeaths)[, 1] mybarcol - gray20 ci.l - hh * 0.85 ci.u - hh * 1.15 mp - barplot2(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u) When I did an sos::findFn(barplot2) search to locate the real `barplot2` O alos noted in the same package (gplots) a function named `ooplot`. It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] change confidence interval line length in barplot2 (plotrix package)
On Jan 22, 2013, at 2:41 PM, Martin Batholdy batho...@googlemail.com wrote: Ok, I have to apologize, I confused the packages. It's the function barplot2 from the gplots package! It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. Unfortunately ci.lwd controls the thickness of the line but not the horizontal width. barplot2() in gplots uses a hard coded width for the CI's, which is 50% of the bar width, so it is a consistent proportion. You could hack the code or simply use base graphics barplot() along with either ?segments or perhaps more easily, ?arrows, which would give you more flexibility. Compare: mp - barplot(1:5) arrows(mp, 1:5 + 0.5, mp, 1:5 - 0.5, code = 3, angle = 90, length = 0.1) with: mp - barplot(1:5) arrows(mp, 1:5 + 0.5, mp, 1:5 - 0.5, code = 3, angle = 90, length = 0.25) where the 'length' argument to arrows() defines the width of the upper and lower boundary lines. There are a fair number of other functions around that can add CI's to plots as well and a search of the archives should bear fruit. Regards, Marc Schwartz On Jan 22, 2013, at 21:24 , David Winsemius dwinsem...@comcast.net wrote: On Jan 22, 2013, at 10:28 AM, Martin Batholdy wrote: Hi, is there any way to change the width of the horizontal line of confidence intervals in the barplot2 function in the plotrix package (independent of the width of the bars)? example code: library(plotrix) # Example with confidence intervals and grid hh - t(VADeaths)[, 1] mybarcol - gray20 ci.l - hh * 0.85 ci.u - hh * 1.15 mp - barplot2(hh, beside = TRUE, col = c(lightblue, mistyrose, lightcyan, lavender), legend = colnames(VADeaths), ylim = c(0, 20), main = Death Rates in Virginia, font.main = 4, sub = Faked 95 percent error bars, col.sub = mybarcol, cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u) When I did an sos::findFn(barplot2) search to locate the real `barplot2` O alos noted in the same package (gplots) a function named `ooplot`. It calls itself an extenstion of barplot2 and has a ci.lwd argument. Might save you the time of doing what I thought might be needed, hacking te code. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
HI Rebecca, Try this: dateA-seq.Date(as.Date(28JAN2012,format=%d%B%Y),as.Date(28DEC2012,format=%d%B%Y),by=month) dateB-seq.Date(as.Date(30JAN2012,format=%d%B%Y),as.Date(30DEC2012,format=%d%B%Y),by=month) set.seed(15) A-data.frame(dateA,value=cumsum(sample(1:50,12,replace=TRUE))) set.seed(25) B-data.frame(dateB,value=cumsum(sample(1:72,12,replace=TRUE))) B[,1]-as.Date(gsub(\\d+$,28,B[,1])) B[,1][duplicated(B[,1],fromLast=TRUE)]-as.Date(gsub((.*-).*(-.*),\\102\\2,B[,1][duplicated(B[,1],fromLast=TRUE)])) #this step may not be needed in ur data. In the month of march, there were two values library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) library(zoo) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 3:53 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, I do not want to remove those NA values because they are the monthly data but recorded as the last calendar date in A and last business date in B. I tried to use raw_time - substr(raw_time,3,9) raw_time - as.Date(raw_time,format=%d%B%Y) to cutoff the date and leave the month and year in raw_time, and then convert it to a valid date type of data, but I failed. Is there a way that I can present 2012-09-28 NA NA 5400726 14861715970 2012-09-30 5035606 14832837436 NA NA into something like 2012-09-30 5035606 14832837436 5400726 14861715970 By converting 2012-09-28 to the last calendar date as of 2012-09-30 then B will be recorded at the last business date of the month, and will not have any NA values. Dput() gives me dput(tail(res)) structure(c(121, NA, 111, 111, 120, 119, 309, NA, 313, 307, 30, 313, 130, 130, NA, 130, 130, 130, 309, 313, NA, 309, 310, 315), class = c(xts, zoo), .indexCLASS = Date, .indexTZ = , tclass = Date, tzone = , index = structure(c(134, 134, 134, 135, 135, 135), tzone = , tclass = Date), .Dim = c(6L, 4L), .Dimnames = list(NULL, c(raw_acct, raw_baln, raw_acct.1, raw_baln.1))) Thanks very much! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 3:41 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, In the previous email, res-merge(Anew,Bnew) head(res) # Anew Bnew #2012-01-01 181 NA #2012-01-02 59 NA #2012-01-03 290 NA #2012-01-04 196 NA #2012-01-05 111 NA #2012-01-06 297 NA plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I guess, it would remove that from plotting) If you want to remove the NA rows: use, na.omit() or complete.cases()? #as I did in the previous email. Could you dput() an example dataset? A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 2:38 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, This would help me to get the date type of data. A new question comes out that since the dates are not exactly the same on two date sets, there are some NA values in the merged data set, such as 2012-09-28 NA NA 5400726 14861715970 2012-09-30 5035606 14832837436 NA NA Does R have a function to convert the date to some format of Sep,2012, therefore when I merge those two, they will not have those NA numbers... Thanks, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 2:15 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, Assuming that 'raw_data' is data.frame with first column as raw_time: You could convert the raw_time to date format by as.Date(28FEB2002,format=%d%B%Y) #[1] 2002-02-28 In your data, it should be: raw_data$raw_time- as.Date(raw_time,format=%d%B%Y) Could you just dput() a few lines of your dataset if this is not working? Tx. A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 2:08 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, My data shows that I do not have a date type of data: summary(raw_data) raw_time raw_acct raw_baln 28FEB2002: 1 Min. : 61714 Min. :117079835 28FEB2003: 1 1st Qu.: 75587 1st Qu.:158035150 28FEB2005: 1 Median :100234 Median :206906298 28FEB2006: 1 Mean : 96058 Mean :210550369 28FEB2007: 1 3rd
Re: [R] plot two time series with different length and different starting point in one figure.
Hello Arun, Thanks very much! In this way, it works! I convert both A and B to the same day of the month, and therefore there is no NA shown for different last business day and last calendar day of the month. You are very help! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 5:06 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. HI Rebecca, Try this: dateA-seq.Date(as.Date(28JAN2012,format=%d%B%Y),as.Date(28DEC2012,format=%d%B%Y),by=month) dateB-seq.Date(as.Date(30JAN2012,format=%d%B%Y),as.Date(30DEC2012,format=%d%B%Y),by=month) set.seed(15) A-data.frame(dateA,value=cumsum(sample(1:50,12,replace=TRUE))) set.seed(25) B-data.frame(dateB,value=cumsum(sample(1:72,12,replace=TRUE))) B[,1]-as.Date(gsub(\\d+$,28,B[,1])) B[,1][duplicated(B[,1],fromLast=TRUE)]-as.Date(gsub((.*-).*(-.*),\\102\\2,B[,1][duplicated(B[,1],fromLast=TRUE)])) #this step may not be needed in ur data. In the month of march, there were two values library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) library(zoo) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 3:53 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, I do not want to remove those NA values because they are the monthly data but recorded as the last calendar date in A and last business date in B. I tried to use raw_time - substr(raw_time,3,9) raw_time - as.Date(raw_time,format=%d%B%Y) to cutoff the date and leave the month and year in raw_time, and then convert it to a valid date type of data, but I failed. Is there a way that I can present 2012-09-28 NA NA 5400726 14861715970 2012-09-30 5035606 14832837436 NA NA into something like 2012-09-30 5035606 14832837436 5400726 14861715970 By converting 2012-09-28 to the last calendar date as of 2012-09-30 then B will be recorded at the last business date of the month, and will not have any NA values. Dput() gives me dput(tail(res)) structure(c(121, NA, 111, 111, 120, 119, 309, NA, 313, 307, 30, 313, 130, 130, NA, 130, 130, 130, 309, 313, NA, 309, 310, 315), class = c(xts, zoo), .indexCLASS = Date, .indexTZ = , tclass = Date, tzone = , index = structure(c(134, 134, 134, 135, 135, 135), tzone = , tclass = Date), .Dim = c(6L, 4L), .Dimnames = list(NULL, c(raw_acct, raw_baln, raw_acct.1, raw_baln.1))) Thanks very much! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 3:41 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, In the previous email, res-merge(Anew,Bnew) head(res) # Anew Bnew #2012-01-01 181 NA #2012-01-02 59 NA #2012-01-03 290 NA #2012-01-04 196 NA #2012-01-05 111 NA #2012-01-06 297 NA plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I guess, it would remove that from plotting) If you want to remove the NA rows: use, na.omit() or complete.cases()? #as I did in the previous email. Could you dput() an example dataset? A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 2:38 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, This would help me to get the date type of data. A new question comes out that since the dates are not exactly the same on two date sets, there are some NA values in the merged data set, such as 2012-09-28 NA NA 5400726 14861715970 2012-09-30 5035606 14832837436 NA NA Does R have a function to convert the date to some format of Sep,2012, therefore when I merge those two, they will not have those NA numbers... Thanks, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 2:15 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, Assuming that 'raw_data' is data.frame with first column as raw_time: You could convert the raw_time to date format by as.Date(28FEB2002,format=%d%B%Y) #[1] 2002-02-28 In your data, it should be: raw_data$raw_time- as.Date(raw_time,format=%d%B%Y) Could you just dput() a few lines of your dataset if this is not working? Tx. A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday,
[R] tapply and functions with more than one objects
Hello, How i can use a costum function in tapply which has more than one variable? I mean sum(x) only needs one object but what when i have a function function(x,y) with more, how i indicate where are the other variables to use?7 I hope someone can help me. Thank you!! Best regards, Dominic __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Adding a line to barchart
R-helpers: I need a quick help with the following graph (I'm a lattice newbie): require(lattice) npp=1:5 names(npp)=c(A,B,C,D,E) barchart(npp,origin=0,box.width=1) # What I want to do, is add a single vertical line positioned at x = 2 that lays over the bars (say, using a dotted line). How do I go about doing this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] density of hist(freq = FALSE) inversely affected by data magnitude
Hi, I have a couple of observations, a question or two, and perhaps a suggestion related to the plotting of density on the y-axis within the hist() function when freq=FALSE. I was using the function and trying to develop an intuitive understanding of what the density is telling me. After reading through this fairly helpful post: http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis I finally realized that in the case where freq = FALSE, the y-axis isn't really telling me the density. It's actually indicating the density multiplied by the bin size. I assume this is for the case where the bins may be of non-regular size. from hist.default: dens - counts/(n * diff(breaks)) So the count in each bin is divided by the total number of observations (n) multiplied by the size of the bin. The problem, as I see it, is that the density ends up being scaled by the size of the bins, which is inversely proportional to the magnitude of the data. Therefore the magnitude of the data is directly affecting the density, which seems problematic. For example*: set.seed() x - runif(100) y - x / 1000 par(mfrow = c(2, 1)) hist(x, prob = TRUE) hist(y, prob = TRUE) From this example, you see that the density for the y histogram is 1000 times larger, simply because the y data is 1000 times smaller. Again, that seems problematic. It seems to me, that the density should be unit-less, but here it's affected by the magnitude of the data. So, my question is, why is density calculated this way? For the case where all the bins are of the same size, I would think density should simply be calculated as: dens - counts / n Of course, that might be somewhat misleading for the case where the bin sizes vary. So then why not calculate density as: dens - counts / (n * diff(breaks) / min(diff(breaks))) Dividing diff(breaks) by min(diff(breaks)) removes the scaling effect of the magnitude of the data, and simply leaves the relative difference in bin size. For the case where all the bins are the same size, the calculation is equivalent to dens - counts / n For all other cases, the density is scaled by the size of the bin, but unaffected by the magnitude of the data. So, what am I misunderstanding? Why is density calculated as it is, and what does it mean? Thanks, James *example from http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r-with-a-relative-frequency-axis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot two time series with different length and different starting point in one figure.
Hi Rebecca, No problem. Just a doubt regarding the last calendar day and last business day. dateA-seq(as.Date(01FEB2012,format=%d%B%Y),length=15,by=1 month)-1 #gives the last calendar day/month dateB- seq.Date(as.Date(28MAR2012,format=%d%B%Y),as.Date(28DEC2012,format=%d%B%Y),by=month) #here I used day 28. If it didn't change #then this works. set.seed(15) A-data.frame(dateA,value=cumsum(sample(1:50,15,replace=TRUE))) set.seed(25) B-data.frame(dateB,value=cumsum(sample(1:72,10,replace=TRUE))) A[,1]-as.Date(gsub(\\d+$,28,A[,1])) library(xts) library(zoo) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) plot.zoo(res) From your reply, it seems like dateB day didn't change. A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 5:28 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, Thanks very much! In this way, it works! I convert both A and B to the same day of the month, and therefore there is no NA shown for different last business day and last calendar day of the month. You are very help! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 5:06 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. HI Rebecca, Try this: dateA-seq.Date(as.Date(28JAN2012,format=%d%B%Y),as.Date(28DEC2012,format=%d%B%Y),by=month) dateB-seq.Date(as.Date(30JAN2012,format=%d%B%Y),as.Date(30DEC2012,format=%d%B%Y),by=month) set.seed(15) A-data.frame(dateA,value=cumsum(sample(1:50,12,replace=TRUE))) set.seed(25) B-data.frame(dateB,value=cumsum(sample(1:72,12,replace=TRUE))) B[,1]-as.Date(gsub(\\d+$,28,B[,1])) B[,1][duplicated(B[,1],fromLast=TRUE)]-as.Date(gsub((.*-).*(-.*),\\102\\2,B[,1][duplicated(B[,1],fromLast=TRUE)])) #this step may not be needed in ur data. In the month of march, there were two values library(xts) Anew-as.xts(A[,-1],order.by=A[,1]) Bnew-as.xts(B[,-1],order.by=B[,1]) res-merge(Anew,Bnew) library(zoo) plot.zoo(res) A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: Sent: Tuesday, January 22, 2013 3:53 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, I do not want to remove those NA values because they are the monthly data but recorded as the last calendar date in A and last business date in B. I tried to use raw_time - substr(raw_time,3,9) raw_time - as.Date(raw_time,format=%d%B%Y) to cutoff the date and leave the month and year in raw_time, and then convert it to a valid date type of data, but I failed. Is there a way that I can present 2012-09-28 NA NA 5400726 14861715970 2012-09-30 5035606 14832837436 NA NA into something like 2012-09-30 5035606 14832837436 5400726 14861715970 By converting 2012-09-28 to the last calendar date as of 2012-09-30 then B will be recorded at the last business date of the month, and will not have any NA values. Dput() gives me dput(tail(res)) structure(c(121, NA, 111, 111, 120, 119, 309, NA, 313, 307, 30, 313, 130, 130, NA, 130, 130, 130, 309, 313, NA, 309, 310, 315), class = c(xts, zoo), .indexCLASS = Date, .indexTZ = , tclass = Date, tzone = , index = structure(c(134, 134, 134, 135, 135, 135), tzone = , tclass = Date), .Dim = c(6L, 4L), .Dimnames = list(NULL, c(raw_acct, raw_baln, raw_acct.1, raw_baln.1))) Thanks very much! Cheers, Rebecca -Original Message- From: arun [mailto:smartpink...@yahoo.com] Sent: Tuesday, January 22, 2013 3:41 PM To: Yuan, Rebecca Cc: R help Subject: Re: [R] plot two time series with different length and different starting point in one figure. Hi Rebecca, In the previous email, res-merge(Anew,Bnew) head(res) # Anew Bnew #2012-01-01 181 NA #2012-01-02 59 NA #2012-01-03 290 NA #2012-01-04 196 NA #2012-01-05 111 NA #2012-01-06 297 NA plot.zoo(res) # removes the NA values from Bnew.. (if NA was present in Anew, I guess, it would remove that from plotting) If you want to remove the NA rows: use, na.omit() or complete.cases()? #as I did in the previous email. Could you dput() an example dataset? A.K. - Original Message - From: Yuan, Rebecca rebecca.y...@bankofamerica.com To: 'arun' smartpink...@yahoo.com Cc: R help r-help@r-project.org Sent: Tuesday, January 22, 2013 2:38 PM Subject: RE: [R] plot two time series with different length and different starting point in one figure. Hello Arun, This would help me to get the date type of data. A new question comes out that since the dates are not exactly the same on two date sets, there are some NA values
[R] summarise subsets of a vector
Hello, I have vector called test. And now I wish to measure the mean of the first 10 number, the second 10 numbers etc How does it work? Thanks Wim dput (test) c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.71, 0.21875, 0, 0.27375, 0.26125, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.84125, 0.0575, 0.92625, 0.12, 0, 0) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to assign time series to a vector with one leap year
HI, You can check this link: http://r.789695.n4.nabble.com/leap-years-in-temporal-series-command-ts-td3309014.html Also, this may help you: library(lubridate), ?leap_year() leap_year(2008) #[1] TRUE ymd(2008-2-29) 1 parsed with %Y-%m-%d #[1] 2008-02-29 UTC A.K. - Original Message - From: Janesh Devkota janesh.devk...@gmail.com To: r-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 2:46 PM Subject: [R] How to assign time series to a vector with one leap year Hello All, I am trying to do the time series analysis in R and I want to assign a vector as a time series. The data I provided is hourly. The data is from Jan 1 2008 to Dec 31 2009. How can I assign the data such that the first year is leap year and second is not ? airtemp - read.csv(airtemp.csv,header=T,sep=) aw - ts(airtemp,start=2008,frequency=8784,end=2009) I assigned frequency as 8784 because 2008 year will have 8784 hourly data points and 2009 has 8760 data points. The total data points are 17544 The data can be found on https://www.dropbox.com/s/03z74632v1f3g1e/airtemp.csv I apologize if this is very trivial to some of you. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tapply and functions with more than one objects
On Jan 22, 2013, at 2:24 PM, Dominic Roye wrote: Hello, How i can use a costum function in tapply which has more than one variable? I mean sum(x) only needs one object but what when i have a function function(x,y) with more, how i indicate where are the other variables to use?7 You can use: lapply(split( multi_col_object, category_vec) , function(x,y){sum(x,y)} ) aggregate(dat, category, FUN=sum) Or: do.call(rbind, by( multi_col_object, category_vec, function(x,y){ } ) Sometimes `Reduce` is more compact. Other times `mapply` is needed. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] density of hist(freq = FALSE) inversely affected by data magnitude
The probability density function is not unitless - it is the derivative of the [cumulative] probability distribution function so it has units delta-probability-mass over delta-x. It must integrate to 1 (over the all possible x). hist(freq=FALSE,x) or hist(prob=TRUE,x) displays an estimate of the density function and the following example shows how the scale matches what you get from the presumed population density function. f function (n, sd) { x - rnorm(n, sd = sd) hist(x, freq = FALSE) # estimated density s - seq(min(x), max(x), len = 129) lines(s, dnorm(s, sd = sd), col = red) # overlay expected density for this sample } f(1e6, sd=1) f(100, sd=1) f(100, sd=0.0001) f(1e6, sd=0.0001) Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of J Toll Sent: Tuesday, January 22, 2013 2:48 PM To: r-help Subject: [R] density of hist(freq = FALSE) inversely affected by data magnitude Hi, I have a couple of observations, a question or two, and perhaps a suggestion related to the plotting of density on the y-axis within the hist() function when freq=FALSE. I was using the function and trying to develop an intuitive understanding of what the density is telling me. After reading through this fairly helpful post: http://stats.stackexchange.com/questions/17258/odd-problem-with-a-histogram-in-r- with-a-relative-frequency-axis I finally realized that in the case where freq = FALSE, the y-axis isn't really telling me the density. It's actually indicating the density multiplied by the bin size. I assume this is for the case where the bins may be of non-regular size. from hist.default: dens - counts/(n * diff(breaks)) So the count in each bin is divided by the total number of observations (n) multiplied by the size of the bin. The problem, as I see it, is that the density ends up being scaled by the size of the bins, which is inversely proportional to the magnitude of the data. Therefore the magnitude of the data is directly affecting the density, which seems problematic. For example*: set.seed() x - runif(100) y - x / 1000 par(mfrow = c(2, 1)) hist(x, prob = TRUE) hist(y, prob = TRUE) From this example, you see that the density for the y histogram is 1000 times larger, simply because the y data is 1000 times smaller. Again, that seems problematic. It seems to me, that the density should be unit-less, but here it's affected by the magnitude of the data. So, my question is, why is density calculated this way? For the case where all the bins are of the same size, I would think density should simply be calculated as: dens - counts / n Of course, that might be somewhat misleading for the case where the bin sizes vary. So then why not calculate density as: dens - counts / (n * diff(breaks) / min(diff(breaks))) Dividing diff(breaks) by min(diff(breaks)) removes the scaling effect of the magnitude of the data, and simply leaves the relative difference in bin size. For the case where all the bins are the same size, the calculation is equivalent to dens - counts / n For all other cases, the density is scaled by the size of the bin, but unaffected by the magnitude of the data. So, what am I misunderstanding? Why is density calculated as it is, and what does it mean? Thanks, James *example from http://stats.stackexchange.com/questions/17258/odd-problem-with-a- histogram-in-r-with-a-relative-frequency-axis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What is the convergence criterion for binomial logit in glm?
On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote: Dear R-ers, I am running logistics regression using package glm: glm(myDV ~ ., data=mydata, family=binomial(logit)) I have a general question: in glm (binary logit) - what convergence criterion is being used? You should look at the help page for `glm` (and follow the obvious links.) -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding a line to barchart
Hi, May be this helps: barchart(npp,origin=0,box.width=1, panel=function(x,y,...){ panel.barchart(x,y,...) panel.abline(v=2,col.line=red,lty=3)}) A.K. - Original Message - From: Jonathan Greenberg j...@illinois.edu To: r-help r-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 5:41 PM Subject: [R] Adding a line to barchart R-helpers: I need a quick help with the following graph (I'm a lattice newbie): require(lattice) npp=1:5 names(npp)=c(A,B,C,D,E) barchart(npp,origin=0,box.width=1) # What I want to do, is add a single vertical line positioned at x = 2 that lays over the bars (say, using a dotted line). How do I go about doing this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summarise subsets of a vector
Hi, try this: unlist(lapply(split(test,((seq_along(test)-1)%/% 10)+1),mean)) # 1 2 3 4 5 6 7 8 #0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.146375 # 9 10 11 #0.00 0.194500 0.00 A.K. - Original Message - From: Wim Kreinen wkrei...@gmail.com To: r-help r-help@r-project.org Cc: Sent: Tuesday, January 22, 2013 6:09 PM Subject: [R] summarise subsets of a vector Hello, I have vector called test. And now I wish to measure the mean of the first 10 number, the second 10 numbers etc How does it work? Thanks Wim dput (test) c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.71, 0.21875, 0, 0.27375, 0.26125, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.84125, 0.0575, 0.92625, 0.12, 0, 0) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What is the convergence criterion for binomial logit in glm?
I already looked. This help file for loglin ( http://127.0.0.1:12583/library/stats/html/loglin.html) says: The Iterative Proportional Fitting algorithm as presented in Haberman (1972) is used for fitting the model. At most iter iterations are performed, convergence is taken to occur when the maximum deviation between observed and fitted margins is less than eps. And the default eps is 0.1 So, is it then the convergence criterion used by glm when family=binomial(logit)? I just need to know for sure. Thanks for confirming! Dimitri On Tue, Jan 22, 2013 at 6:37 PM, David Winsemius dwinsem...@comcast.netwrote: On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote: Dear R-ers, I am running logistics regression using package glm: glm(myDV ~ ., data=mydata, family=binomial(logit)) I have a general question: in glm (binary logit) - what convergence criterion is being used? You should look at the help page for `glm` (and follow the obvious links.) -- David Winsemius Alameda, CA, USA -- Dimitri Liakhovitski gfk.com http://marketfusionanalytics.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What is the convergence criterion for binomial logit in glm?
On Jan 22, 2013, at 3:59 PM, Dimitri Liakhovitski wrote: I already looked. This help file for loglin (http://127.0.0.1:12583/library/stats/html/loglin.html) says: The Iterative Proportional Fitting algorithm as presented in Haberman (1972) is used for fitting the model. At most iter iterations are performed, convergence is taken to occur when the maximum deviation between observed and fitted margins is less than eps. And the default eps is 0.1 So, is it then the convergence criterion used by glm when family=binomial(logit)? I just need to know for sure. I assumed that you would follow the link on help(glm) to `glm.control` where the convergence criteria is described and can be altered. The link to that help page is at the end of the line that reads: control a list of parameters for controlling the fitting process. For glm.fit this is passed to glm.control. The default epsilon is NOT 0.1 -- David. Thanks for confirming! Dimitri On Tue, Jan 22, 2013 at 6:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Jan 22, 2013, at 2:55 PM, Dimitri Liakhovitski wrote: Dear R-ers, I am running logistics regression using package glm: glm(myDV ~ ., data=mydata, family=binomial(logit)) I have a general question: in glm (binary logit) - what convergence criterion is being used? You should look at the help page for `glm` (and follow the obvious links.) -- David Winsemius Alameda, CA, USA -- Dimitri Liakhovitski gfk.com David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to construct a valid seed for l'Ecuyer's method with given .Random.seed?
Dear expeRts, I struggle with the following problem using snow clusters for parallel computing: I would like to specify l'Ecuyer's random number generator. Base R creates a .Random.seed of length 7, the first value indicating the kind fo random number generator. I would thus like to use the components 2 to 7 as the seed for l'Ecuyer's random number generator. By doing so, I receive (see the minimal example below): , | Loading required package: Rmpi | Loading required package: grDevices | Loading required package: grDevices | Loading required package: grDevices | Loading required package: grDevices | 4 slaves are spawned successfully. 0 failed. | Loading required package: rlecuyer | Error in .lec.SetPackageSeed(seed) (from #11) : | Seed[0] = -930997252, Seed is not set. ` What's the problem? How can I construct a valid seed for l'Ecuyer's rng with just the information in .Random.seed? Thanks Cheers, Marius Here is the minimal example: require(doSNOW) require(foreach) doForeach - function(n, seed=1, type=MPI) { ## create cluster object cl - snow::makeCluster(parallel::detectCores(), type=type) on.exit(snow::stopCluster(cl)) ## shut down cluster and terminate execution environment registerDoSNOW(cl) ## register the cluster object with foreach ## seed if(seed==L'Ecuyer-CMRG) { if(!exists(.Random.seed)) stop(.Random.seed does not exist - in l'Ecuyer setting) .t - snow::clusterSetupRNG(cl, seed=.Random.seed[2:7]) # = fails! } ## actual work foreach(i=seq_len(n)) %dopar% { runif(1) } } ## standard (base) way of specifying l'Ecuyer RNGkind(L'Ecuyer-CMRG) # = .Random.seed is of length 7 res - doForeach(10, seed=L'Ecuyer-CMRG) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Create a Data Frame from an XML
Hi Adam [You seem to have sent the same message twice to the mailing list.] There are various strategies/approaches to creating the data frame from the XML. Perhaps the approach that most closely follows your approach is xmlRoot(doc)[ row ] which returns a list of XML nodes whose node name is row that are children of the root node data. So sapply(xmlRoot(doc) [ row ], xmlAttrs) yields a matrix with as many columns as there are row nodes and with 3 rows - one for each of the BRAND, YEAR and VALUE attributes. So d = t( sapply(xmlRoot(doc) [ row ], xmlAttrs) ) gives you a matrix with the correct rows and column orientation and now you can turn that into a data frame, converting the columns into numbers, etc. as you want with regular R commands (i.e. independently of the XML). D. On 1/22/13 1:43 PM, Adam Gabbert wrote: Hello, I'm attempting to read information from an XML into a data frame in R using the XML package. I am unable to get the data into a data frame as I would like. I have some sample code below. *XML Code:* Header... Data I want in a data frame: data row BRAND=GMC NUM=1 YEAR=1999 VALUE=1 / row BRAND=FORD NUM=1 YEAR=2000 VALUE=12000 / row BRAND=GMC NUM=1 YEAR=2001 VALUE=12500 / row BRAND=FORD NUM=1 YEAR=2002 VALUE=13000 / row BRAND=GMC NUM=1 YEAR=2003 VALUE=14000 / row BRAND=FORD NUM=1 YEAR=2004 VALUE=17000 / row BRAND=GMC NUM=1 YEAR=2005 VALUE=15000 / row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / row BRAND=FORD NUM=1 YEAR=2007 VALUE=17500 / row BRAND=GMC NUM=1 YEAR=2008 VALUE=22000 / /data *R Code:* doc -xmlInternalTreeParse (Sample2.xml) top - xmlRoot (doc) xmlName (top) names (top) art - top [[row]] art ** *Output:* artrow BRAND=GMC NUM=1 YEAR=1999 VALUE=1/ * * This is where I am having difficulties. I am unable to access additional rows; ( i.e. row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / ) and I am unable to access the individual entries to actually create the data frame. The data frame I would like is as follows: BRANDNUMYEARVALUE GMC1 1999 1 FORD 2 2000 12000 GMC1 2001 12500 etc Any help or suggestions would be appreciated. Conversly, my eventual goal would be to take a data frame and write it into an XML in the previously shown format. Thank you AG [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] csv mask order
I have imported a CSV file: rfishR - read.csv(file=rfishR.csv,stringsAsFactors = FALSE, strip.white = TRUE, na.strings = c(NA,) ) attach(rfishR) When I call it up in R, it starts with line 2066 rather than 1 and some of the headers (used Headers = TRUE, too) are masked? Sample data loc lat lon datum water date obs net species length mass other Dispos NS10 69.5 -156.8 NAD83 Chuc pt f fourhorn sculpin 225 na na id NS10 69.5 -156.4 NAD83 Chuc pt f fourhorn sculpin 293 na na id NS10 69.5 -156.2 NAD83 Chuc pt f fourhorn sculpin 243 na na id Please help. -TS [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] density of hist(freq = FALSE) inversely affected by data magnitude
Bill, Thank you. I got it. That can require a fair amount of work to interpret the density, especially with odd or irregular bin sizes. Thanks again, James On Tue, Jan 22, 2013 at 5:33 PM, William Dunlap wdun...@tibco.com wrote: The probability density function is not unitless - it is the derivative of the [cumulative] probability distribution function so it has units delta-probability-mass over delta-x. It must integrate to 1 (over the all possible x). hist(freq=FALSE,x) or hist(prob=TRUE,x) displays an estimate of the density function and the following example shows how the scale matches what you get from the presumed population density function. f function (n, sd) { x - rnorm(n, sd = sd) hist(x, freq = FALSE) # estimated density s - seq(min(x), max(x), len = 129) lines(s, dnorm(s, sd = sd), col = red) # overlay expected density for this sample } f(1e6, sd=1) f(100, sd=1) f(100, sd=0.0001) f(1e6, sd=0.0001) Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What is the convergence criterion for binomial logit in glm?
Thanks a lot, David. Yes, now I see it - it's 1e-8 Dimitri On Tue, Jan 22, 2013 at 7:08 PM, David Winsemius dwinsem...@comcast.netwrote: glm.control -- Dimitri Liakhovitski gfk.com http://marketfusionanalytics.com/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating a Data Frame from an XML
On Jan 22, 2013, at 3:11 PM, Adam Gabbert wrote: Hello, I'm attempting to read information from an XML into a data frame in R using the XML package. I am unable to get the data into a data frame as I would like. I have some sample code below. *XML Code:* Header... Data I want in a data frame: data row BRAND=GMC NUM=1 YEAR=1999 VALUE=1 / row BRAND=FORD NUM=1 YEAR=2000 VALUE=12000 / row BRAND=GMC NUM=1 YEAR=2001 VALUE=12500 / row BRAND=FORD NUM=1 YEAR=2002 VALUE=13000 / row BRAND=GMC NUM=1 YEAR=2003 VALUE=14000 / row BRAND=FORD NUM=1 YEAR=2004 VALUE=17000 / row BRAND=GMC NUM=1 YEAR=2005 VALUE=15000 / row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / row BRAND=FORD NUM=1 YEAR=2007 VALUE=17500 / row BRAND=GMC NUM=1 YEAR=2008 VALUE=22000 / /data *R Code:* doc -xmlInternalTreeParse (Sample2.xml) top - xmlRoot (doc) xmlName (top) names (top) art - top [[row]] art ** *Output:* artrow BRAND=GMC NUM=1 YEAR=1999 VALUE=1/ This is where I am having difficulties. I am unable to access additional rows; ( i.e. row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / ) and I am unable to access the individual entries to actually create the data frame. The data frame I would like is as follows: BRANDNUMYEARVALUE GMC1 1999 1 FORD 2 2000 12000 GMC1 2001 12500 etc Any help or suggestions would be appreciated. Conversly, my eventual goal would be to take a data frame and write it into an XML in the previously shown format. Hi, You are so close! You have a number of nodes with the name 'row'. The [[ function selects just one item from a list, and when there's a number that have that name it returns just the first. So you really want to use the [ function instead and then select by order index using [[ library(XML) s - c( data, row BRAND=\GMC\ NUM=\1\ YEAR=\1999\ VALUE=\1\ /, row BRAND=\FORD\ NUM=\1\ YEAR=\2000\ VALUE=\12000\ /, row BRAND=\GMC\ NUM=\1\ YEAR=\2001\ VALUE=\12500\ /, row BRAND=\FORD\ NUM=\1\ YEAR=\2002\ VALUE=\13000\ /, row BRAND=\GMC\ NUM=\1\ YEAR=\2003\ VALUE=\14000\ /, row BRAND=\FORD\ NUM=\1\ YEAR=\2004\ VALUE=\17000\ /, row BRAND=\GMC\ NUM=\1\ YEAR=\2005\ VALUE=\15000\ /, row BRAND=\GMC\ NUM=\1\ YEAR=\1967\ VALUE=\PRICLESS\ /, row BRAND=\FORD\ NUM=\1\ YEAR=\2007\ VALUE=\17500\ /, row BRAND=\GMC\ NUM=\1\ YEAR=\2008\ VALUE=\22000\ /, /data) x - xmlRoot(xmlTreeParse(s, asText = TRUE, useInternalNodes = TRUE)) x[row][[1]] row BRAND=GMC NUM=1 YEAR=1999 VALUE=1/ x[row][[2]] row BRAND=FORD NUM=1 YEAR=2000 VALUE=12000/ Your rows are set up so the attributes have the values you want - use xmlAttrs to retrieve them. xmlAttrs(x[row][[2]]) BRAND NUMYEAR VALUE FORD 1 2000 12000 You can use lapply to iterate through each row and apply the xmlAttrs function. You'll end up with a list if character vectors. y - lapply(x[row], xmlAttrs) str(y) List of 10 $ row: Named chr [1:4] GMC 1 1999 1 ..- attr(*, names)= chr [1:4] BRAND NUM YEAR VALUE $ row: Named chr [1:4] FORD 1 2000 12000 ..- attr(*, names)= chr [1:4] BRAND NUM YEAR VALUE $ row: Named chr [1:4] GMC 1 2001 12500 ..- attr(*, names)= chr [1:4] BRAND NUM YEAR VALUE . . . Next make a character matrix using do.call and rbind ... m - do.call(rbind, y) str(m) chr [1:10, 1:4] GMC FORD GMC FORD GMC FORD GMC GMC FORD ... - attr(*, dimnames)=List of 2 ..$ : chr [1:10] row row row row ... ..$ : chr [1:4] BRAND NUM YEAR VALUE And then on to a data.frame... d - as.data.frame(m) str(d) 'data.frame': 10 obs. of 4 variables: $ BRAND: chr GMC FORD GMC FORD ... $ NUM : chr 1 1 1 1 ... $ YEAR : chr 1999 2000 2001 2002 ... $ VALUE: chr 1 12000 12500 13000 ... Cheers, Ben Thank you AG [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Ben Tupper Bigelow Laboratory for Ocean Sciences 180 McKown Point Rd. P.O. Box 475 West Boothbay Harbor, Maine 04575-0475 http://www.bigelow.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] csv mask order
Do your lines start with the hash mark #? If so, they are considered comment. Set comment.char= in your call to read.csv. Another frequent culprit (personal experience) are apostrophes ('). If you have any in your file, use the argument quote = \ or, if you are sure the data are not quoted, use quote=. This is all described in detail in help(read.csv), you may want to study it carefully to see whether your file is misinterpreted in some subtle way. HTH Peter On Tue, Jan 22, 2013 at 4:49 PM, Todd Sformo todd.sfo...@north-slope.org wrote: I have imported a CSV file: rfishR - read.csv(file=rfishR.csv,stringsAsFactors = FALSE, strip.white = TRUE, na.strings = c(NA,) ) attach(rfishR) When I call it up in R, it starts with line 2066 rather than 1 and some of the headers (used Headers = TRUE, too) are masked? Sample data loc lat lon datum water date obs net species length mass other Dispos NS10 69.5 -156.8 NAD83 Chuc pt f fourhorn sculpin 225 na na id NS10 69.5 -156.4 NAD83 Chuc pt f fourhorn sculpin 293 na na id NS10 69.5 -156.2 NAD83 Chuc pt f fourhorn sculpin 243 na na id Please help. -TS [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating a Data Frame from an XML
On Tue, Jan 22, 2013 at 3:11 PM, Adam Gabbert adamjgabb...@gmail.com wrote: Hello, I'm attempting to read information from an XML into a data frame in R using the XML package. I am unable to get the data into a data frame as I would like. I have some sample code below. *XML Code:* Header... Data I want in a data frame: data row BRAND=GMC NUM=1 YEAR=1999 VALUE=1 / row BRAND=FORD NUM=1 YEAR=2000 VALUE=12000 / row BRAND=GMC NUM=1 YEAR=2001 VALUE=12500 / row BRAND=FORD NUM=1 YEAR=2002 VALUE=13000 / row BRAND=GMC NUM=1 YEAR=2003 VALUE=14000 / row BRAND=FORD NUM=1 YEAR=2004 VALUE=17000 / row BRAND=GMC NUM=1 YEAR=2005 VALUE=15000 / row BRAND=GMC NUM=1 YEAR=1967 VALUE=PRICLESS / row BRAND=FORD NUM=1 YEAR=2007 VALUE=17500 / row BRAND=GMC NUM=1 YEAR=2008 VALUE=22000 / /data *R Code:* doc -xmlInternalTreeParse (Sample2.xml) top - xmlRoot (doc) xmlName (top) names (top) art - top [[row]] art ** This will get a data frame of character columns as.data.frame(t(xpathSApply(doc, //row, xmlAttrs)), stringsAsFactors = FALSE) BRAND NUM YEARVALUE 1GMC 1 19991 2 FORD 1 200012000 3GMC 1 200112500 4 FORD 1 200213000 5GMC 1 200314000 6 FORD 1 200417000 7GMC 1 200515000 8GMC 1 1967 PRICLESS 9 FORD 1 200717500 10 GMC 1 200822000 -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] user units in plotrix
oo, sounds like exactly what i want! thanks! -m On Tue, Jan 22, 2013 at 4:13 PM, Greg Snow 538...@gmail.com wrote: If you want to convert between different units using base graphics then look at the grconvertX and grconvertY functions (in the graphics package). These functions will convert from/to user coordinates, inches, device, figure, and plot coordinates. So you could use grconvertX to find out what user value on the x scale to give to draw.circle that would then generate a circle with a given size in inches, or relative to the device, figure, or plotting region. On Sun, Jan 20, 2013 at 2:59 PM, Murat Tasan mmu...@gmail.com wrote: hi all - i'm having some difficulty figuring out how to convert between user units (which i can't find a definition for in the plotrix package) and either (a) device units (e.g. inches with PDFs) or (b) user coordinates along any particular axis. as an example, suppose i set up a PDF device with inches, the device has both outer and inner magins, and the plot region has drastically different x and y coordinate ranges (e.g. xlim = c(0, 1), ylim = c(0, SOME_VERY_LARGE_NUMBER)). now i'd like to draw.circle(...) but i can't figure out what units the radius argument takes. user units doesn't appear to be inches in this case, and it it corresponds to user coordinates, i don't know which axis' scaling is to be used as the reference. ideally, one would be able to specify the radius in user coordinates while specifying _which_ axis to use as the standard (e.g. an axis = y or axis = x argument). getFigCtr(...) can help in figuring this out, but its argument takes the relative position of the figure region, rather than the plot region, which is more apt for properly placing shapes. i know the grid package has extensive unit conversion code, but i'm trying to update a series of figures using only base graphics... i can't seem to find a rigorous definition of user units anywhere in the plotrix package. anyone know of where i can find this info? cheers, -m __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding a line to barchart
Hi This function adds line to each panel addLine - function (a = NULL, b = NULL, v = NULL, h = NULL, ..., once = F) { tcL - trellis.currentLayout() k - 0 for (i in 1:nrow(tcL)) for (j in 1:ncol(tcL)) if (tcL[i, j] 0) { k - k + 1 trellis.focus(panel, j, i, highlight = FALSE) if (once) panel.abline(a = a[k], b = b[k], v = v[k], h = h[k], ...) else panel.abline(a = a, b = b, v = v, h = h, ...) trellis.unfocus() } } addLine(v=2, col=2, lty=3) Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Jonathan Greenberg Sent: Tuesday, January 22, 2013 11:42 PM To: r-help Subject: [R] Adding a line to barchart R-helpers: I need a quick help with the following graph (I'm a lattice newbie): require(lattice) npp=1:5 names(npp)=c(A,B,C,D,E) barchart(npp,origin=0,box.width=1) # What I want to do, is add a single vertical line positioned at x = 2 that lays over the bars (say, using a dotted line). How do I go about doing this? --j -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.