Re: [R] Object attributes in R
This issue is discussed in much more detail in the manuals for 2.4.0, in particular in the new 'R Internals' manual. You will find the basic rules in the Blue Book (Becker, Chambers, Wilks, 1988) but they have not always been applied consistently. On Wed, 11 Oct 2006, Michael Toews wrote: Hi, I have questions about object attributes, and how they are handled when subsetted. My examples will use: tm - (1:10)/10 ds - (1:10)^2 attr(tm,units) - sec attr(ds,units) - cm dat - data.frame(tm=tm,ds=ds) attr(dat,id) - test1 When a primitive class object (numeric, character, etc.) is subsetted, the attributes are often stripped away, but the rules change for more complex classes, such as a data.frame, where they 'stick' for the data.frame, but attributes from the members are lost: tm[3:5]# lost ds[-3] # lost str(dat[1:3,]) # only kept for data.frame Is there any way of keeping the attributes when subsetted from primitive classes, like a fictional attr.drop option within the [ braces? The best alternative I have found is to make a new object, and copy the attributes: tm2 - tm[3:5] attributes(tm2) - attributes(tm) However, for the data.frame, how can I copy the attributes over (without using a for loop -- I've tried a few things using sapply but no success)? Also I don't see how this is consistent with an empty index, [], where attributes are always retained (as documented): tm[] I have other concerns about the evaluation of objects with attributes (e.g. ds/tm), where the attributes from the first object are retained for the output, but this evaluation of logic is a whole other can of worms I'd rather keep closed for now. +mt __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ts vs zoo
Schweitzer, Markus wrote: Hello, I have lots of data in zoo format and would like to do some time series analysis. (using library(zoo), library(ts) ) My data is usually from one year, and I try for example stl() to find some seasonalities or trends. I have now accepted, that I might have to convert my series into ts() but still I am not able to execute the comand since stl() is not satisfied x-zoo(rnorm(365), as.Date(2005-01-01):as.Date(2005-12-31)) x-as.ts(x) #x-as.ts(x, frequency=12) #this has no effect frequency is not taken stl(x) Fehler in stl(x) : series is not periodic or has less than two periods Please, read the error message carefully: ... has less than two periods... And you say you have series of one year. stl() cannot be used with so short series. Otherwise, you must take care to define the time unit as year for using stl(), since it assumes that the periodic signal it extracts is of frequence one. Best, Philippe Grosjean I googled for an answer but I couldn t find any. Is it really necessary to transform my zoo objects to ts? how can I fix the frequency-problem. I hope you can help me. Thank you very much in advance and best regards, Markus [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to Get Categorical Correlation Coefficient
Howdy Gurus ! I have a different correlation result from the same data. The corridor1 string variable is expressed as a number like the corridor2 number variable. -- levels(corridor1) [1] A B C D E F levels(as.factor(corridor2)) [1] 0 1 2 3 4 -- I have the correlation results followings using cor() function. -- cor(jh1_1, as.factor(corridor1)) [1] 0.01528538 cor(jh1_1, as.factor(corridor2)) [1] -0.4972571 -- I donot know why the above correlation coefficients used the same data are different. They are 0.015 from as.factor(corridor1), -0.497 from as,factor(corridor2). The string variable corridor1 is the same catergory data with the variable corridor2. The difference is that A is replaced with 0, B with 1, C with 2, . Could you tell me why they are different, and which correlation coefficient is correct? Thank in advance, -- Kum-Hoe Hwang, Ph.D.Phone : 82-31-250-3516Email : [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to Get Categorical Correlation Coefficient
Kum-Hoe Hwang [EMAIL PROTECTED] writes: Howdy Gurus ! I have a different correlation result from the same data. The corridor1 string variable is expressed as a number like the corridor2 number variable. -- levels(corridor1) [1] A B C D E F levels(as.factor(corridor2)) [1] 0 1 2 3 4 -- I have the correlation results followings using cor() function. -- cor(jh1_1, as.factor(corridor1)) [1] 0.01528538 cor(jh1_1, as.factor(corridor2)) [1] -0.4972571 -- I donot know why the above correlation coefficients used the same data are different. They are 0.015 from as.factor(corridor1), -0.497 from as,factor(corridor2). The string variable corridor1 is the same catergory data with the variable corridor2. The difference is that A is replaced with 0, B with 1, C with 2, . Could you tell me why they are different, and which correlation coefficient is correct? One thing that strikes me is that corridor1 has 6 levels and corridor2 has 5... In general correlations are not expected to work on factors so I'd be explicit about taking as.numeric(). A glance at table(corridor1,corridor2) should be informative too, as would a summary(as.numeric(as.factor(corridor1))-as.numeric(as.factor(corridor1))) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help copy.file()
Dear R user, I have a little truble using the command copy.file() because R copy the files but trasform all of them from their kb weight to 0 kb! I tried with different kind of file extension as .doc, .png, .pdf and the result is the same: all files are copied to ther right folder and path but every file have 0 kb of weight and result unavailable. Of course, I want to copy file, expecially charts in .png format, created by R from a folder in a PC to a folder in another PC that is a web server. Have you any ideas? Thanks a lot. MARCO __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bug in lowess
On Wed, 2006-10-11 at 22:29 -0500, Frank E Harrell Jr wrote: x - c(0,7,8,14,15,120,242) y - c(122,128,130,158,110,110,92) lowess(x,y) $x [1] 0 7 8 14 15 120 242 $y [1] 122. 128. 132.2857 158. 110. -4930. 110. Same behaviour here on a more recent R (below), and I recall a posting for a year or so back that reported similar behaviour in the then current version of R. G x - c(0,7,8,14,15,120,242) y - c(122,128,130,158,110,110,92) lowess(x,y) $x [1] 0 7 8 14 15 120 242 $y [1] 122. 128. 132.2857 158. 110. -4930. 110. sessionInfo() R version 2.4.0 Patched (2006-10-03 r39576) i686-pc-linux-gnu locale: LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] methods stats graphics grDevices utils datasets [7] base R version 2.2.1, 2005-12-20, i486-pc-linux-gnu attached base packages: [1] methods stats graphics grDevices utils datasets [7] base other attached packages: lattice Hmisc chron 0.12-11 3.1-1 2.3-8 -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC ENSIS, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [EMAIL PROTECTED] pixmap format
Dear list, Is there any command in R to plot a graph to pixmap object? Or any command to convert some other type of images (png, jpg ) to pixmap format? From the list, I read about the ImageMagick tool to convert the image format. But can I use it within R in a window system? Thanks! /Mike _ Share your special moments by uploading 500 photos per month to Windows Live Spaces __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate random numbers that sum up to 1
Thanks for the answers. I am sorry if the question is too simple to make everyone thought it is a homework, but it is not. I am setting up a simulation with expected utility theory which use EU = sum(Pi*Ui) form and I have to randomly generate these Pis, I do not really know which distribution these Pis has to follow but just know they have to be equally random hence the question. I thought Dirichlet(1,1, ..., 1) could do the trick for me. Sun - Original Message - From: Duncan Murdoch [EMAIL PROTECTED] To: Alberto Monteiro [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Wednesday, October 11, 2006 11:39 PM Subject: Re: [R] generate random numbers that sum up to 1 On 10/11/2006 5:04 PM, Alberto Monteiro wrote: I don't have the previous messages, but it seems to me that the solutions didn't quite get the requirements of the problem. For example, it's obvious that for n = 2, the random variables X1 and X2 should be X1 - runif(1) and X2 - 1 - X1; while the solution X - runif(2); X1 - X[1] / sum(X); X2 - X[2] / sum(X) will give different distributions, as shown in this test case: N - 1000 M - matrix(runif(2 * N), 2, N) X - M[1,] / (M[1,] + M[2,]) hist(X) It's obvious that for a generic n-th dimensional set of uniform variables X1, ... X_n subject to the restriction X1 + ... + X_n = 1, the solution is to take a uniform distribution in the (n-1)-th dimensional hyperpyramid generated by the above relation and the restrictions that each X_i = 0. For example, for n = 3, we should sample from the equilateral triangle with vertices c(1,0,0), c(0,1,0) and c(0,0,1). For n = 4, we should sample from the pyramid whose vertices are c(1,0,0,0), c(0,1,0,0), c(0,0,1,0) and c(0,0,0,1). I don't know if there is a simple formula to do this sampling. That's exactly what the Dirichlet(1,1, ..., 1) distribution does. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] copula
Dear R-helpers, is it some sort of R internal fault or am I doing something wrong when typing the following two lines in R, I get the surface ploted and trying it another time it gives the error message below: Error in ceiling(length.out) : Non-numeric argument to mathematical function Why is it that sometimes I try the same commands and i get the copula density plotted? I will appreciate any help. thanks, Dominique K. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to Get Categorical Correlation Coefficient
There was my mistake in the earlier email. I have corrected the error by dropping ns.omit from data.frame(). I added a new corrected correlation and output followings: -- # nrow(sdi) [1] 65613 print(corridor1[65600:65613]) [1] C C C C F [6] F F F B B [11] F F B B Levels: B C D E A F print(corridor2[65600:65613]) [1] 4 4 4 4 2 2 2 2 1 1 2 2 1 1 summary(corridor1) B CD E A F 1509213456 6652 1611 179627006 summary(corridor2) Min. 1st Qu. MedianMean 3rd Qu.Max. 0.0 1.0 2.0 2.3 3.0 5.0 summary(as.numeric(as.factor(corridor1))-as.numeric(as.factor(corridor1))) Min. 1st Qu. MedianMean 3rd Qu.Max. 0 0 0 0 0 0 table(corridor1,corridor2) corridor2 corridor1 0 1 2 3 4 5 B 0 15092 0 0 0 0 C 0 0 0 0 13456 0 D 0 0 0 6652 0 0 E 0 0 0 0 0 1611 A 1796 0 0 0 0 0 F 0 0 27006 0 0 0 --- There are different correlation coefficients from the following results: Are there any functions or packages for a categorical correlation? cor(jh1_1, corridor1) [1] 0.02753303 cor(jh1_1, as.factor(corridor2)) [1] -0.3682788 Thanks for your kindness, Kum On 12 Oct 2006 10:25:33 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote: Kum-Hoe Hwang [EMAIL PROTECTED] writes: Howdy Gurus ! I have a different correlation result from the same data. The corridor1 string variable is expressed as a number like the corridor2 number variable. -- levels(corridor1) [1] A B C D E F levels(as.factor(corridor2)) [1] 0 1 2 3 4 -- I have the correlation results followings using cor() function. -- cor(jh1_1, as.factor(corridor1)) [1] 0.01528538 cor(jh1_1, as.factor(corridor2)) [1] -0.4972571 -- I donot know why the above correlation coefficients used the same data are different. They are 0.015 from as.factor(corridor1), -0.497 from as,factor(corridor2). The string variable corridor1 is the same catergory data with the variable corridor2. The difference is that A is replaced with 0, B with 1, C with 2, . Could you tell me why they are different, and which correlation coefficient is correct? One thing that strikes me is that corridor1 has 6 levels and corridor2 has 5... In general correlations are not expected to work on factors so I'd be explicit about taking as.numeric(). A glance at table(corridor1,corridor2) should be informative too, as would a summary(as.numeric(as.factor(corridor1))-as.numeric(as.factor(corridor1))) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 -- Kum-Hoe Hwang, Ph.D.Phone : 82-31-250-3516Email : [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ts vs zoo
thank you very much for the information. I guess I should have been more clear here. I was looking for the monthly or weekly trends within this one year period. to get there I now only took the zoo object x and made x-as.ts(x) x-ts(x, frequency=7) #to get 52 weeks(Periods) with 7 days each - to get 12 periods e.g. months with 29,30 or 31 days, I guess I can only choose frequency=30 I then can run stl It is just a pitty, that the labeling (jan 2005, feb 2005 ..) has gone. So thank you for your hint with barplot and rollmean best regards, markus -Original Message- From: Achim Zeileis [mailto:[EMAIL PROTECTED] Sent: Donnerstag, 12. Oktober 2006 12:15 To: Schweitzer, Markus Cc: R-help@stat.math.ethz.ch Subject: Re: [R] ts vs zoo Markus, several comments: I have lots of data in zoo format and would like to do some time series analysis. (using library(zoo), library(ts) ) The ts package has been integrated into the stats package for a long time now... My data is usually from one year, and I try for example stl() to find some seasonalities or trends. As pointed out by Philippe, this is not what STL is made for. In STL you try to find seasonality patterns by loess smoothing the seasonality of subsequent years. If you have observations from just one year, there is just one seasonality pattern (at least if you look for monthly or quaterly patterns). I have now accepted, that I might have to convert my series into ts () but still I am not able to execute the comand since stl() is not satisfied And there are reasons for this: you need to have a regular time series with a certain frequency so that STL is applicable. (One could argue that ts is not the only format for regular time series but typically you can easily coerce back and forth between ts and zoo/zooreg. x-zoo(rnorm(365), as.Date(2005-01-01):as.Date(2005-12-31)) I don't think that this is what you want. Look at time(x). I guess you mean x - zoo(rnorm(365), seq(from = as.Date(2005-01-01), to = as.Date(2005-12-31), by = 1 day)) x-as.ts(x) #x-as.ts(x, frequency=12) #this has no effect frequency is not Here, it seems to me that you want to aggregate to monthly data, this can be done via x2 - aggregate(x, as.yearmon, mean) This is now (by default) a regular series with frequency 12 frequency(x2) and hence it can be easily coereced to ts and back (with almost no loss of information): as.zoo(as.ts(x2)) However, calling stl(as.ts(x2)) still complains that there are not enough periods because this is just a single year, i.e., only a single seasonality pattern. To look at this, you could do barplot(x2) For looking at the trend you could use a simple running mean plot(x) lines(rollmean(x, 14), 2) or you could also use loess() or some other smoother... For more details on the zoo package, see vignette(zoo, package = zoo) Best, Z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to Get Categorical Correlation Coefficient
Kum-Hoe Hwang [EMAIL PROTECTED] writes: There was my mistake in the earlier email. I have corrected the error by dropping ns.omit from data.frame(). I added a new corrected correlation and output followings: -- # nrow(sdi) [1] 65613 print(corridor1[65600:65613]) [1] C C C C F [6] F F F B B [11] F F B B Levels: B C D E A F print(corridor2[65600:65613]) [1] 4 4 4 4 2 2 2 2 1 1 2 2 1 1 summary(corridor1) B CD E A F 1509213456 6652 1611 179627006 summary(corridor2) Min. 1st Qu. MedianMean 3rd Qu.Max. 0.0 1.0 2.0 2.3 3.0 5.0 summary(as.numeric(as.factor(corridor1))-as.numeric(as.factor(corridor1))) Min. 1st Qu. MedianMean 3rd Qu.Max. 0 0 0 0 0 0 One term of course needs to have corridor2. (That's my typo, but...) table(corridor1,corridor2) corridor2 corridor1 0 1 2 3 4 5 B 0 15092 0 0 0 0 C 0 0 0 0 13456 0 D 0 0 0 6652 0 0 E 0 0 0 0 0 1611 A 1796 0 0 0 0 0 F 0 0 27006 0 0 0 Notice that they are not in the same order! as.numeric(corridor1) will have 1 for B, ..., 5 for A, 6 for F --- There are different correlation coefficients from the following results: Are there any functions or packages for a categorical correlation? cor(jh1_1, corridor1) [1] 0.02753303 cor(jh1_1, as.factor(corridor2)) [1] -0.3682788 Thanks for your kindness, Kum On 12 Oct 2006 10:25:33 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote: Kum-Hoe Hwang [EMAIL PROTECTED] writes: Howdy Gurus ! I have a different correlation result from the same data. The corridor1 string variable is expressed as a number like the corridor2 number variable. -- levels(corridor1) [1] A B C D E F levels(as.factor(corridor2)) [1] 0 1 2 3 4 -- I have the correlation results followings using cor() function. -- cor(jh1_1, as.factor(corridor1)) [1] 0.01528538 cor(jh1_1, as.factor(corridor2)) [1] -0.4972571 -- I donot know why the above correlation coefficients used the same data are different. They are 0.015 from as.factor(corridor1), -0.497 from as,factor(corridor2). The string variable corridor1 is the same catergory data with the variable corridor2. The difference is that A is replaced with 0, B with 1, C with 2, . Could you tell me why they are different, and which correlation coefficient is correct? One thing that strikes me is that corridor1 has 6 levels and corridor2 has 5... In general correlations are not expected to work on factors so I'd be explicit about taking as.numeric(). A glance at table(corridor1,corridor2) should be informative too, as would a summary(as.numeric(as.factor(corridor1))-as.numeric(as.factor(corridor1))) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 -- Kum-Hoe Hwang, Ph.D.Phone : 82-31-250-3516Email : [EMAIL PROTECTED] -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ts vs zoo
Schweitzer, Markus wrote: thank you very much for the information. I guess I should have been more clear here. I was looking for the monthly or weekly trends within this one year period. Then, always keep in mind that stl() is looking for periodic component of a frequency = 1. This means you have to define the time unit so that you catch it. For monthly periodic component, use month as time unit. For weekly periodic component, use week as time unit. You must convert your data accordingly. Also keep in mind that stl() decomposes your series into a general trend, a periodic trend of frenquency 1, and noise, using an ADDITIVE model. So, if the components are multiplicative, you should use a different model. Otherwise, there is much more than stl() in R! In particular, you could look at any significant periodic component in your series by using spectrum(), for instance... still considering that your series is long enough, which is probably the case for looking at weekly periodic signals on a one-year long series. Best, Philippe Grosjean to get there I now only took the zoo object x and made x-as.ts(x) x-ts(x, frequency=7) #to get 52 weeks(Periods) with 7 days each - to get 12 periods e.g. months with 29,30 or 31 days, I guess I can only choose frequency=30 I then can run stl It is just a pitty, that the labeling (jan 2005, feb 2005 ..) has gone. So thank you for your hint with barplot and rollmean best regards, markus -Original Message- From: Achim Zeileis [mailto:[EMAIL PROTECTED] Sent: Donnerstag, 12. Oktober 2006 12:15 To: Schweitzer, Markus Cc: R-help@stat.math.ethz.ch Subject: Re: [R] ts vs zoo Markus, several comments: I have lots of data in zoo format and would like to do some time series analysis. (using library(zoo), library(ts) ) The ts package has been integrated into the stats package for a long time now... My data is usually from one year, and I try for example stl() to find some seasonalities or trends. As pointed out by Philippe, this is not what STL is made for. In STL you try to find seasonality patterns by loess smoothing the seasonality of subsequent years. If you have observations from just one year, there is just one seasonality pattern (at least if you look for monthly or quaterly patterns). I have now accepted, that I might have to convert my series into ts () but still I am not able to execute the comand since stl() is not satisfied And there are reasons for this: you need to have a regular time series with a certain frequency so that STL is applicable. (One could argue that ts is not the only format for regular time series but typically you can easily coerce back and forth between ts and zoo/zooreg. x-zoo(rnorm(365), as.Date(2005-01-01):as.Date(2005-12-31)) I don't think that this is what you want. Look at time(x). I guess you mean x - zoo(rnorm(365), seq(from = as.Date(2005-01-01), to = as.Date(2005-12-31), by = 1 day)) x-as.ts(x) #x-as.ts(x, frequency=12) #this has no effect frequency is not Here, it seems to me that you want to aggregate to monthly data, this can be done via x2 - aggregate(x, as.yearmon, mean) This is now (by default) a regular series with frequency 12 frequency(x2) and hence it can be easily coereced to ts and back (with almost no loss of information): as.zoo(as.ts(x2)) However, calling stl(as.ts(x2)) still complains that there are not enough periods because this is just a single year, i.e., only a single seasonality pattern. To look at this, you could do barplot(x2) For looking at the trend you could use a simple running mean plot(x) lines(rollmean(x, 14), 2) or you could also use loess() or some other smoother... For more details on the zoo package, see vignette(zoo, package = zoo) Best, Z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ts vs zoo
On Thu, 12 Oct 2006 12:26:42 +0200 Schweitzer, Markus wrote: thank you very much for the information. I guess I should have been more clear here. I was looking for the monthly or weekly trends within this one year period. to get there I now only took the zoo object x and made x-as.ts(x) x-ts(x, frequency=7) #to get 52 weeks(Periods) with 7 days each - to get 12 periods e.g. months with 29,30 or 31 days, I guess I can only choose frequency=30 I then can run stl It is just a pitty, that the labeling (jan 2005, feb 2005 ..) has gone. But you can do the following x - zoo(rnorm(365), seq(from = as.Date(2005-01-01), to = as.Date(2005-12-31), by = 1 day)) xt - ts(as.ts(x), frequency = 7) xt_stl - stl(xt, s.window = 28, t.window = 28) xz_stl - as.zoo(xt_stl$time.series) time(xz_stl) - time(x) plot(xz_stl) i.e., convert the extracted series back to zoo and add the original index for plotting. Z So thank you for your hint with barplot and rollmean best regards, markus -Original Message- From: Achim Zeileis [mailto:[EMAIL PROTECTED] Sent: Donnerstag, 12. Oktober 2006 12:15 To: Schweitzer, Markus Cc: R-help@stat.math.ethz.ch Subject: Re: [R] ts vs zoo Markus, several comments: I have lots of data in zoo format and would like to do some time series analysis. (using library(zoo), library(ts) ) The ts package has been integrated into the stats package for a long time now... My data is usually from one year, and I try for example stl() to find some seasonalities or trends. As pointed out by Philippe, this is not what STL is made for. In STL you try to find seasonality patterns by loess smoothing the seasonality of subsequent years. If you have observations from just one year, there is just one seasonality pattern (at least if you look for monthly or quaterly patterns). I have now accepted, that I might have to convert my series into ts () but still I am not able to execute the comand since stl() is not satisfied And there are reasons for this: you need to have a regular time series with a certain frequency so that STL is applicable. (One could argue that ts is not the only format for regular time series but typically you can easily coerce back and forth between ts and zoo/zooreg. x-zoo(rnorm(365), as.Date(2005-01-01):as.Date(2005-12-31)) I don't think that this is what you want. Look at time(x). I guess you mean x - zoo(rnorm(365), seq(from = as.Date(2005-01-01), to = as.Date(2005-12-31), by = 1 day)) x-as.ts(x) #x-as.ts(x, frequency=12) #this has no effect frequency is not Here, it seems to me that you want to aggregate to monthly data, this can be done via x2 - aggregate(x, as.yearmon, mean) This is now (by default) a regular series with frequency 12 frequency(x2) and hence it can be easily coereced to ts and back (with almost no loss of information): as.zoo(as.ts(x2)) However, calling stl(as.ts(x2)) still complains that there are not enough periods because this is just a single year, i.e., only a single seasonality pattern. To look at this, you could do barplot(x2) For looking at the trend you could use a simple running mean plot(x) lines(rollmean(x, 14), 2) or you could also use loess() or some other smoother... For more details on the zoo package, see vignette(zoo, package = zoo) Best, Z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fwd: rarefy a matrix of counts
Hi On 11 Oct 2006 at 12:54, Tony Plate wrote: Date sent: Wed, 11 Oct 2006 12:54:44 -0600 From: Tony Plate [EMAIL PROTECTED] To: Brian Frappier [EMAIL PROTECTED] Copies to: Petr Pikal [EMAIL PROTECTED], r-help@stat.math.ethz.ch Subject:Re: [R] Fwd: rarefy a matrix of counts Two things to note: (1) rep() can be vectorized: rep(1:3, 2:4) [1] 1 1 2 2 2 3 3 3 3 (2) you will likely get much better performance if you work with integers and convert to strings after sampling (or use factors), e.g.: that is what I actually used in my suggestion (I hope). DF color sample1 sample2 sample3 1 red 400 3002500 2 green 100 0 200 3 black 3001000 500 notice that red, green, black is not **row names** but a column in data frame. That is why following code gives red, green, etc. x - data.frame(matrix(NA,100,3)) for (i in 2:ncol(DF)) x[,i-1] - sample(rep(DF[,1], DF[,i]),100) if you want result in data frame or x-vector(list, 3) for (i in 2:ncol(DF)) x[[,i-1]] - sample(rep(DF[,1], DF[,i]),100) c(red,green,blue)[sample(rep(1:3,c(400,100,300)), 5)] [1] red blue red red red -- Tony Plate snip is that this code still samples the rows, not the elements, i.e. No, see above. returns 100 or 300 in the matrix cells instead of red or a matrix of counts by color (object type) like: x1x2 x3 red 32 560 gr6895 40 sum 100 100 100 something like sapply(x,table) X1 X2 X3 black 36 79 15 green 14 0 9 red 50 21 76 HTH Petr It looks like Tony is right: sampling without replacement requires listing of all elements to be sampled. But, the code Petr provided x1 - sample(c(rep(red,400),rep(green, 100),rep(black,300)),100) did give me a clue of how to quickly make such a list using the 'rep' command. I will for-loop a rep statement using my original matrix to create a list of elements for each sample: Thanks Petr and Tony for your help! On 10/11/06, *Tony Plate* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Here's a way using apply(), and the prob= argument of sample(): df - data.frame(sample1=c(red=400,green=100,black=300), sample2=c(300,0,1000), sample3=c(2500,200,500)) df sample1 sample2 sample3 red 400 3002500 green 100 0 200 black 3001000 500 set.seed(1) apply(df, 2, function(counts) sample(seq(along=counts), rep=T, size=7, prob=counts)) sample1 sample2 sample3 [1,] 1 3 1 [2,] 1 3 1 [3,] 3 3 1 [4,] 2 3 2 [5,] 1 3 1 [6,] 2 3 1 [7,] 2 3 3 Note that this does sampling WITH replacement. AFAIK, sampling without replacement requires enumerating the entire population to be sampled from. I.e., you cannot do sample(1:3, prob=1:3, rep=F, size=4) instead of sample(c(1,2,2,3,3,3), rep=F, size=4) -- Tony Plate From reading ?sample, I was a little unclear on whether sampling without replacement could work Petr Pikal wrote: Hi a litle bit different story. But x1 - sample(c(rep(red,400),rep(green, 100), rep(black,300)),100) is maybe close. With data frame (if it is not big) DF color sample1 sample2 sample3 1 red 400 3002500 2 green 100 0 200 3 black 3001000 500 x - data.frame(matrix(NA,100,3)) for (i in 2:ncol(DF)) x[,i-1] - sample(rep(DF[,1], DF[,i]),100) if you want result in data frame or x-vector(list, 3) for (i in 2:ncol(DF)) x[[,i-1]] - sample(rep(DF[,1], DF[,i]),100) if you want it in list. Maybe somebody is clever enough to discard for loop but you said you have 80 columns which shall be no problem. HTH Petr On 11 Oct 2006 at 10:11, Brian Frappier wrote: Date sent:Wed, 11 Oct 2006 10:11:33 -0400 From: Brian Frappier [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] To: Petr Pikal [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] Subject: Fwd: [R] rarefy a matrix of counts -- Forwarded message -- From: Brian Frappier [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] Date: Oct 11, 2006 10:10 AM Subject: Re: [R] rarefy a matrix of counts To: r-help@stat.math.ethz.ch
[R] Problem loading SpareM package
Hi, I have just installed R 2.4.0 and when I try to load SpareseM, I get the following error message library(SparseM) Package SparseM (0.71) loaded. To cite, see citation(SparseM) Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) : in 'SparseM' methods specified for export, but none defined: as.matrix.csr, as.matrix.csc, as.matrix.ssr, as.matrix.ssc, as.matrix.coo, as.matrix, t, coerce, dim, diff, diag, diag-, det, norm, chol, backsolve, solve, model.matrix, model.response, %*%, %x%, image Error: package/namespace load failed for 'SparseM' I have contacted the package maintainers and they couldn't be of any help. I do not recall getting this error in older R versions. Regards Coomaren Send instant messages to your online friends http://uk.messenger.yahoo.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem loading SpareM package
I have been getting a similar error when I compile my own package. The message says the problem is in the methods package which is part of R. * checking S3 generic/method consistency ... WARNING Error in get(cname, where) : recursive default argument reference Error: .onLoad failed in 'loadNamespace' for 'methods' Execution halted See section 'Generic functions and methods' of the 'Writing R Extensions' manual. * checking replacement functions ... OK * checking foreign function calls ... WARNING Error in get(cname, where) : recursive default argument reference Error: .onLoad failed in 'loadNamespace' for 'methods' Execution halted See the chapter 'System and foreign language interfaces' of the 'Writing R Extensions' manual. * checking R code for possible problems ... OK * checking Rd files ... OK * checking Rd cross-references ... OK * checking for missing documentation entries ... OK * checking for code/documentation mismatches ... WARNING Error in get(cname, where) : recursive default argument reference Error: .onLoad failed in 'loadNamespace' for 'methods' Execution halted In a running R session, I tried to detach methods and then bring it back with the following message search() [1] .GlobalEnvpackage:HHpackage:multcomp [4] package:mvtnorm package:grid package:lattice [7] package:methods package:stats package:graphics [10] package:grDevices package:utils package:datasets [13] Autoloads package:base detach(7) library(methods) Error in identical(pkg, [EMAIL PROTECTED]) : formal classes cannot be used without the methods package Error: .onAttach failed in 'attachNamespace' Error: package/namespace load failed for 'methods' __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem loading SpareM package
On Thu, 12 Oct 2006, Coomaren Vencatasawmy wrote: Hi, I have just installed R 2.4.0 and when I try to load SpareseM, I get the following error message library(SparseM) Package SparseM (0.71) loaded. To cite, see citation(SparseM) Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) : in 'SparseM' methods specified for export, but none defined: as.matrix.csr, as.matrix.csc, as.matrix.ssr, as.matrix.ssc, as.matrix.coo, as.matrix, t, coerce, dim, diff, diag, diag-, det, norm, chol, backsolve, solve, model.matrix, model.response, %*%, %x%, image Error: package/namespace load failed for 'SparseM' Please re-install the package. All contributed packages using new-style classes need to be re-installed because the internal representation of such classes and methods has changed, see CHANGES TO S4 METHODS in NEWS. Doing: update.packages(checkBuilt = TRUE) will check your libraries for packages built under previous releases and replace them with ones built for the platform release. I have contacted the package maintainers and they couldn't be of any help. I do not recall getting this error in older R versions. Regards Coomaren Send instant messages to your online friends http://uk.messenger.yahoo.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem loading SpareM package
url:www.econ.uiuc.edu/~rogerRoger Koenker email[EMAIL PROTECTED]Department of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Oct 12, 2006, at 7:12 AM, Roger Bivand wrote: On Thu, 12 Oct 2006, Coomaren Vencatasawmy wrote: Hi, I have just installed R 2.4.0 and when I try to load SpareseM, I get the following error message library(SparseM) Package SparseM (0.71) loaded. To cite, see citation(SparseM) Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) : in 'SparseM' methods specified for export, but none defined: as.matrix.csr, as.matrix.csc, as.matrix.ssr, as.matrix.ssc, as.matrix.coo, as.matrix, t, coerce, dim, diff, diag, diag-, det, norm, chol, backsolve, solve, model.matrix, model.response, %*%, %x%, image Error: package/namespace load failed for 'SparseM' Please re-install the package. All contributed packages using new- style classes need to be re-installed because the internal representation of such classes and methods has changed, see CHANGES TO S4 METHODS in NEWS. Doing: update.packages(checkBuilt = TRUE) will check your libraries for packages built under previous releases and replace them with ones built for the platform release. I have contacted the package maintainers and they couldn't be of any help. I do not recall getting this error in older R versions. Regards Coomaren Send instant messages to your online friends http:// uk.messenger.yahoo.com [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Cross two dataframe
Dear r-users! I would like to cross two data frame which have the same row number but different in the number of column. Can anybody help me for this case ? Thanks a lot in advance Majid Iravani PhD Student Swiss Federal Research Institute WSL Research Group of Vegetation Ecology Zürcherstrasse 111 CH-8903 Birmensdorf Switzerland Phone: +41-1-739-2693 Fax: +41-1-739-2215 Email: [EMAIL PROTECTED] http://www.wsl.ch/staff/majid.iravani/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bug in stepAIC?
You sent this earlier to R-devel. Please do see the posting guide! Since you (incorrectly) thought this was a bug in MASS, you should have contacted the maintainer. On Wed, 11 Oct 2006, Martin C. Martin wrote: Hi, First of all, thanks for the great work on R in general, and MASS in particular. It's been a life saver for me many times. However, I think I've discovered a bug. It seems that, when I use weights during an initial least-squares regression fit, and later try to add terms using stepAIC(), it uses the weights when looking to remove terms, but not when looking to add them: hills.lm - lm(time ~ dist + climb, data = hills, weights = 1/dist2) Presumably dist^2? small.hills.lm - stepAIC(hills.lm) stepAIC(small.hills.lm, time ~ dist + climb) In the first stepAIC(), it says that the AIC for the full time ~ dist + climb is 94.41. Yet, during the second stepAIC, it says adding climb would produce an AIC of 212.1 (and an RSS of 12633.3). Is this a bug? Yes, but not in stepAIC. Consider drop1(hills.lm) Single term deletions Model: time ~ dist + climb Df Sum of SqRSSAIC none 437.64 94.41 dist1164.05 601.68 103.55 climb 1 8.66 446.29 93.10 add1(small.hills.lm, time ~ dist + climb) Single term additions Model: time ~ dist Df Sum of Sq RSS AIC none 15787.2 217.9 climb 13153.8 12633.3 212.1 stats:::add1.default(small.hills.lm, time ~ dist + climb) Single term additions Model: time ~ dist DfAIC none93.097 climb 1 94.411 so the bug is in add1.lm, part of R itself. Other code has been altered which then broke add1.lm and 'z' needs to be given class lm. Now fixed in r-devel and r-patched. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bug in lowess
Prof Brian Ripley wrote: On Thu, 12 Oct 2006, Gavin Simpson wrote: On Wed, 2006-10-11 at 22:29 -0500, Frank E Harrell Jr wrote: x - c(0,7,8,14,15,120,242) y - c(122,128,130,158,110,110,92) lowess(x,y) $x [1] 0 7 8 14 15 120 242 $y [1] 122. 128. 132.2857 158. 110. -4930. 110. Same behaviour here on a more recent R (below), and I recall a posting for a year or so back that reported similar behaviour in the then current version of R. Actually, it is system-dependent as I get (on x86_64) lowess(x,y, iter=3) lowess(): ns = 4 cmad = 0.25589 cmad =0 cmad = 0.00583385 $x [1] 0 7 8 14 15 120 242 $y [1] 128. 128. 132.2857 158. 110. 109.9990 110. having turned DEBUG_lowess on. So the issue is finite-precision arithmetic, once again. It seems rather a moot point as to what the right answer actually is here, and even if that found by Frank is indeed wrong, as lowess() is defined by an algorithm. Perhaps the best one can hope for is a good approximation to what the algorithm would give in infinite-precision arithmetic (having defined what should happen if cmod is zero). Thank you Brian. It seems that no matter what is the right answer, the answer currently returned on my system is clearly wrong. lowess()$y should be constrained to be within range(y). lowess(x,y,iter=0) provides a reasonable solution in this case; I just don't know how to automatically force iter=0. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cross two dataframe
Hi try to: 1. read posting.guide 2. look into some introduction documents and FAQs 3. consult help 4. look to ?merge, ?cbind, ?rbind HTH Petr On 12 Oct 2006 at 14:45, Majid Iravani wrote: Date sent: Thu, 12 Oct 2006 14:45:19 +0200 To: r-help@stat.math.ethz.ch From: Majid Iravani [EMAIL PROTECTED] Subject:[R] Cross two dataframe Dear r-users! I would like to cross two data frame which have the same row number but different in the number of column. Can anybody help me for this case ? Thanks a lot in advance -- -- Majid Iravani PhD Student Swiss Federal Research Institute WSL Research Group of Vegetation Ecology Zürcherstrasse 111 CH-8903 Birmensdorf Switzerland Phone: +41-1-739-2693 Fax: +41-1-739-2215 Email: [EMAIL PROTECTED] http://www.wsl.ch/staff/majid.iravani/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Petr Pikal [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extrapolated regression lines
Dear list members, When I create a simple scatterplot with a regression line (se below) the line is automatically extrapolated outside the range of data points. Why is this and how can I prevent R from extrapolating the regression line? Thank you in advance, Johan model-lm(Herb~Para) plot(Para,Herb) abline(model) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Variance Ratio test
Hello, I am looking for a code in R for the variance ratio test statistic (the Lo and Mackinlay version or any other versions). Does anybody have such a code they can share or know a library in which I can find this function? Basically I have a number of time series which I need to check for persistence. One other test I can use is the runs test in the tseries package. Any help will be greatly appreciated. Thanks a lot, Spyros __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bug in stepAIC?
Prof Brian Ripley wrote: You sent this earlier to R-devel. Please do see the posting guide! Since you (incorrectly) thought this was a bug in MASS, you should have contacted the maintainer. Thanks, but I did try emailing both you and Prof. Venables directly a month ago. After not receiving a response, I emailed R-devel last week. After not receiving a response there, I thought perhaps the code was correct after all, and I misunderstood how to call it - a perfect question for R-help. There can be a fine line between R-help and R-devel, which is even harder to find when you're new to R and you don't really know where the problem is. On Wed, 11 Oct 2006, Martin C. Martin wrote: Hi, First of all, thanks for the great work on R in general, and MASS in particular. It's been a life saver for me many times. However, I think I've discovered a bug. It seems that, when I use weights during an initial least-squares regression fit, and later try to add terms using stepAIC(), it uses the weights when looking to remove terms, but not when looking to add them: hills.lm - lm(time ~ dist + climb, data = hills, weights = 1/dist2) Presumably dist^2? Yes, sorry, a problem with Thunderbird being a little too smart for it's own good. :) small.hills.lm - stepAIC(hills.lm) stepAIC(small.hills.lm, time ~ dist + climb) In the first stepAIC(), it says that the AIC for the full time ~ dist + climb is 94.41. Yet, during the second stepAIC, it says adding climb would produce an AIC of 212.1 (and an RSS of 12633.3). Is this a bug? Yes, but not in stepAIC. Consider drop1(hills.lm) Single term deletions Model: time ~ dist + climb Df Sum of SqRSSAIC none 437.64 94.41 dist1164.05 601.68 103.55 climb 1 8.66 446.29 93.10 add1(small.hills.lm, time ~ dist + climb) Single term additions Model: time ~ dist Df Sum of Sq RSS AIC none 15787.2 217.9 climb 13153.8 12633.3 212.1 stats:::add1.default(small.hills.lm, time ~ dist + climb) Single term additions Model: time ~ dist DfAIC none93.097 climb 1 94.411 so the bug is in add1.lm, part of R itself. Other code has been altered which then broke add1.lm and 'z' needs to be given class lm. Now fixed in r-devel and r-patched. Great; thanks! Best, Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Variance Ratio test
I don't know if it's there but check Rmetrics : ( Fcalendar, Fseries ). Just an aside : I don't think the runs test is really a test you want top use for checking for the random walk nature of a series. There's some vague relation in terms of the fact that the runs test test whether the number of streaks of positive and negatives is random but it's not as specific of a test ( for testing the random walk ) as the variance ratio test. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mesomeris, Spyros [CIR] Sent: Thursday, October 12, 2006 9:42 AM To: r-help@stat.math.ethz.ch Subject: [R] Variance Ratio test Hello, I am looking for a code in R for the variance ratio test statistic (the Lo and Mackinlay version or any other versions). Does anybody have such a code they can share or know a library in which I can find this function? Basically I have a number of time series which I need to check for persistence. One other test I can use is the runs test in the tseries package. Any help will be greatly appreciated. Thanks a lot, Spyros __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to get the variance-covariance matrix/information of alpha and beta after fitting a GLMs?
Dear friends, After fitting a generalized linear models ,i hope to get the variance of alpha,variance of beta and their covariance, that is , the variance-covariance matrix/information of alpha and beta , suppose *B* is the object of GLMs, i use attributes(B) to look for the options ,but can't find it, anybody knows how to get it? attributes(B) $names [1] coefficients residuals fitted.values effects [5] R rank qr family [9] linear.predictors deviance aic null.deviance [13] iter weights prior.weights df.residual [17] df.null y converged boundary [21] model call formula terms [25] data offsetcontrol method [29] contrasts xlevels $class [1] glm lm I appreciate any help/suggestions. -- With Kind Regards, oooO: (..): :\.(:::Oooo:: ::\_)::(..):: :::)./::: ::(_/ : [***] Zhi Jie,Zhang ,PHD Tel:86-21-54237149 [EMAIL PROTECTED] Dept. of Epidemiology,school of public health,Fudan University Address:No. 138 Yi Xue Yuan Road,Shanghai,China Postcode:200032 [***] oooO: (..): :\.(:::Oooo:: ::\_)::(..):: :::)./::: ::(_/ : [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extrapolated regression lines
On 10/12/2006 9:41 AM, Johan A. Stenberg wrote: Dear list members, When I create a simple scatterplot with a regression line (se below) the line is automatically extrapolated outside the range of data points. Why is this and how can I prevent R from extrapolating the regression line? Thank you in advance, Johan model-lm(Herb~Para) plot(Para,Herb) abline(model) abline draws a line, not a line segment. If you want a segment that stays within the data, you'll need something like this: model - lm(Herb ~ Para) plot(Para, Herb) lines(Para, predict(model)) Note that if Para is not sorted, this may look a little strange, and it will look like a complete mess if you try to fit a polynomial. You can sort it (and Herb correspondingly) by o - order(Para) Para - Para[o] Herb - Herb[o] Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] unevaluated expression
Hello, x- something(a+b) + c is there any function F such that F(x) gives me the unevaluated value of x, i.e. something(a+b)+c I would appreciate any help on this thanks - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to get the variance-covariance matrix/information of alphaand beta after fitting a GLMs?
?vcov From: [EMAIL PROTECTED] on behalf of zhijie zhang Sent: Thu 10/12/2006 9:56 AM To: R-help@stat.math.ethz.ch Subject: [R] how to get the variance-covariance matrix/information of alphaand beta after fitting a GLMs? Dear friends, After fitting a generalized linear models ,i hope to get the variance of alpha,variance of beta and their covariance, that is , the variance-covariance matrix/information of alpha and beta , suppose *B* is the object of GLMs, i use attributes(B) to look for the options ,but can't find it, anybody knows how to get it? attributes(B) $names [1] coefficients residuals fitted.values effects [5] R rank qr family [9] linear.predictors deviance aic null.deviance [13] iter weights prior.weights df.residual [17] df.null y converged boundary [21] model call formula terms [25] data offsetcontrol method [29] contrasts xlevels $class [1] glm lm I appreciate any help/suggestions. -- With Kind Regards, oooO: (..): :\.(:::Oooo:: ::\_)::(..):: :::)./::: ::(_/ : [***] Zhi Jie,Zhang ,PHD Tel:86-21-54237149 [EMAIL PROTECTED] Dept. of Epidemiology,school of public health,Fudan University Address:No. 138 Yi Xue Yuan Road,Shanghai,China Postcode:200032 [***] oooO: (..): :\.(:::Oooo:: ::\_)::(..):: :::)./::: ::(_/ : [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extrapolated regression lines
On Thu, 2006-10-12 at 15:41 +0200, Johan A. Stenberg wrote: Dear list members, When I create a simple scatterplot with a regression line (se below) the line is automatically extrapolated outside the range of data points. Why is this and how can I prevent R from extrapolating the regression line? Thank you in advance, Johan model-lm(Herb~Para) plot(Para,Herb) abline(model) Hi Johan, Simply predict over the range of your predictor (Para). As I don't have those data, the example below uses dummy data X - rnorm(500) # dummy data Y - X + rnorm(500) mod - lm(Y ~ X) # fit model # regular sequence over range of predictor X newdat - seq(min(X), max(X), length = 100) # predict Y for each of these new data points pred - predict(mod, newdata = list(X = newdat)) # plot the results plot(Y ~ X) lines(pred ~ newdat, col = red) HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC ENSIS, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sending help output to a file
when i do ?whatever at the R prompt ( i use linux ), the help comes up but it comes up like a man page. i would prefer to send it to a file. i did a ?help and it says something about sending the output to a file but nothing specific enough that i can figure out what to do. the help page talks about a parameter called type but as far as i can tell, there is no type parameter in the call to the help function ? if someone could tell me how to send output to a file instead of the screen, i would really appreciate it. thanks. also , i am using linux but i haven't figured out what kind or what version #. This is not an offer (or solicitation of an offer) to buy/sell the securities/instruments mentioned or an official confirmation. Morgan Stanley may deal as principal in or own or act as market maker for securities/instruments mentioned or may advise the issuers. This is not research and is not from MS Research but it may refer to a research analyst/research report. Unless indicated, these views are the author's and may differ from those of Morgan Stanley research or others in the Firm. We do not represent this is accurate or complete and we may not update this. Past performance is not indicative of future returns. For additional information, research reports and important disclosures, contact me or see https://secure.ms.com/servlet/cls. You should not use e-mail to request, authorize or effect the purchase or sale of any security or instrument, to send transfer instructions, or to effect any other transactions. We cannot guarantee that any such requests received via ! e-mail will be processed in a timely manner. This communication is solely for the addressee(s) and may contain confidential information. We do not waive confidentiality by mistransmission. Contact me if you do not wish to receive these communications. In the UK, this communication is directed in the UK to those persons who are market counterparties or intermediate customers (as defined in the UK Financial Services Authority's rules). [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to get the variance-covariance matrix/information of alpha and beta after fitting a GLMs?
On Thu, 12 Oct 2006, zhijie zhang wrote: Dear friends, After fitting a generalized linear models ,i hope to get the variance of alpha,variance of beta and their covariance, that is , the variance-covariance matrix/information of alpha and beta , suppose *B* is the object of GLMs, i use attributes(B) to look for the options ,but can't find it, anybody knows how to get it? vcov(your.model) -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unevaluated expression
On 10/12/2006 10:03 AM, johan Faux wrote: Hello, x- something(a+b) + c is there any function F such that F(x) gives me the unevaluated value of x, i.e. something(a+b)+c I would appreciate any help on this parse(text=x) or parse(text=x)[[1]], depending whether you want an expression containing that expression, or the call that it actually corresponds to. quote(something(a+b) + c) would get you directly to the latter. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] import rda data in R
Hello I'm triying to import data to R What is the problem? es_rda - read.table(D:\\fdrtrees\\data\\es.rda,sep=,,header=T) Warning message: incomplete final line found by readTableHeader on 'D:\fdrtrees\data\es.rda' Thank you a lot Rita [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] C code for KalmnaLike
hi, i am looking for c code of kalman filtering please can you help me...thankyou bye... - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] import rda data in R
Because rda files are R objects. Use load, not read.table -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rita Gottloiber Sent: Thursday, October 12, 2006 10:36 AM To: r-help@stat.math.ethz.ch; r-help@stat.math.ethz.ch Subject: [R] import rda data in R Hello I'm triying to import data to R What is the problem? es_rda - read.table(D:\\fdrtrees\\data\\es.rda,sep=,,header=T) Warning message: incomplete final line found by readTableHeader on 'D:\fdrtrees\data\es.rda' Thank you a lot Rita [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] C code for KalmnaLike
you shouldn't need it. Kalmanlike() ( spelling ) I think is in the base package and there is atleast One constributed package and probably many others that do kalman filtering but I can't recall the names of them. Check out the list of packages at www.r-project.org. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Malini Subramanian Sent: Thursday, October 12, 2006 9:56 AM To: R-help@stat.math.ethz.ch Subject: Re: [R] C code for KalmnaLike hi, i am looking for c code of kalman filtering please can you help me...thankyou bye... - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sending help output to a file
On Thu, 2006-10-12 at 10:22 -0400, Leeds, Mark (IED) wrote: when i do ?whatever at the R prompt ( i use linux ), the help comes up but it comes up like a man page. i would prefer to send it to a file. i did a ?help and it says something about sending the output to a file but nothing specific enough that i can figure out what to do. the help page talks about a parameter called type but as far as i can tell, there is no type parameter in the call to the help function ? if someone could tell me how to send output to a file instead of the screen, i would really appreciate it. thanks. also , i am using linux but i haven't figured out what kind or what version #. Typically, R's help files are already available as text files. They are usually in: $R_HOME/library/PACKAGENAME/help where $R_HOME on Linux is usually: R.home() [1] /usr/local/lib/R Note that if you might prefer that help files come up in a browser window (ie. Firefox), you can set: options(htmlhelp=TRUE) in your ~/.Rprofile. In this way, they won't come up in the pager within the terminal console. See ?options and section 10.8 in the Intro to R Manual. WRT to the Linux version, most recent versions support the LSB command of: $ lsb_release -a LSB Version::core-3.0-ia32:core-3.0-noarch:graphics-3.0-ia32:graphics-3.0-noarch Distributor ID: FedoraCore Description:Fedora Core release 5 (Bordeaux) Release:5 Codename: Bordeaux You can also get kernel version information with: $ uname -a Linux horizons 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06 EDT 2006 i686 i686 i386 GNU/Linux HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unevaluated expression
parse(text=x)[[1]] is what I wanted, thank you Duncan Murdoch [EMAIL PROTECTED] wrote: On 10/12/2006 10:03 AM, johan Faux wrote: Hello, x- something(a+b) + c is there any function F such that F(x) gives me the unevaluated value of x, i.e. something(a+b)+c I would appreciate any help on this parse(text=x) or parse(text=x)[[1]], depending whether you want an expression containing that expression, or the call that it actually corresponds to. quote(something(a+b) + c) would get you directly to the latter. Duncan Murdoch __ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to get the variance-covariance matrix/information of alpha and beta after fitting a GLMs?
Dear friends, Both vcov(your.model) and summary(B)$cov.unscaled,summary(B)$cov.scaled works, and vcov is the function that i'm looking for. Thanks very much! - with kind regards zhijie [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] binom calculation with range
What is the way to calculate the probability that between 1 and 5? i.e. We can do: pbinom(q=1,size=1000,prob=0.0005,lower.tail=FALSE) [1] 0.09016608 pbinom(q=5,size=1000,prob=0.0005,lower.tail=FALSE) [1] 1.398798e-05 pbinom(q=1:5,size=1000,prob=0.0005,lower.tail=FALSE) [1] 9.016608e-02 1.435924e-02 1.743731e-03 1.707366e-04 1.398798e-05 But it is really about pr(1 = X = 5). I can think of: pbinom(q=5,size=1000,prob=0.0005,lower.tail=FALSE) - pbinom(q=1,size=1000,prob=0.0005,lower.tail=FALSE) but, I am unsure if we don't have any other options for calculating it in R. thx much. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Is there a function in R to evaluate the adjusted AIC or other statistc where overdispersion existed in GLMs?
Dear friends, As we all know, the usual model selection criteria(e.g.deviance,AIC...) in GLMs isn't very good for selecting the best model when overdispersion exist, so we need to adjust the corresponding statistic,see(Fitzmaurice,G.M. (1997) Model selection with overdispersed datafile:///D:/²©Ê¿¿ÎÌâ/Prediction%20model%20of%20Snails/1997/Model%20Selection%20with%20Overdispersed%20Data.pdf, The Statistician,46(1):81-91.). Is there a function in R to evaluate the adjusted AIC or other statistc where overdispersion existed in GLMs? How should i do in that case? Thanks in advance. -- With Kind Regards, oooO: (..): :\.(:::Oooo:: ::\_)::(..):: :::)./::: ::(_/ : [***] Zhi Jie,Zhang ,PHD Tel:86-21-54237149 [EMAIL PROTECTED] Dept. of Epidemiology,school of public health,Fudan University Address:No. 138 Yi Xue Yuan Road,Shanghai,China Postcode:200032 [***] oooO: (..): :\.(:::Oooo:: ::\_)::(..):: :::)./::: ::(_/ : [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] multithreading calling from the rpy Python package
Hello, I don't know if this question ought to go here, or rather on R-devel, so please bear with me. I'm interfacing to R via RPy (rpy.sf.net) and an embedded Python interpreter. This is really quite convenient. I use this approach to calculate the correlation coefficient of 1 independent dataset (vector) with 4 dependent vectors. It'd be nice if that could be done in 4 parallel threads, or even two. As long as I stick to pure Python code (using equivalents to R routines that can be found in Numpy and SciPy), this works fine. (Tested on a single-core machine.) However, when I call R functions through rpy, a crash will occur at some point, with the error *** caught segfault *** address 0x5164000, cause 'memory not mapped' (this is on Mac OS X 10.4.8), somewhere in Rf_eval: Thread 4 Crashed: 0 libR.dylib 0x03676af0 Rf_eval + 128 1 libR.dylib 0x03676e6c Rf_eval + 1020 2 libR.dylib 0x03677108 Rf_eval + 1688 3 libR.dylib 0x03676e6c Rf_eval + 1020 4 libR.dylib 0x03677108 Rf_eval + 1688 5 libR.dylib 0x03676e6c Rf_eval + 1020 6 libR.dylib 0x03677108 Rf_eval + 1688 7 libR.dylib 0x03678144 Rf_evalList + 148 8 libR.dylib 0x036bb5cc do_internal + 796 9 libR.dylib 0x03676fbc Rf_eval + 1356 10 libR.dylib 0x0367ad10 Rf_applyClosure + 1120 11 libR.dylib 0x03676e3c Rf_eval + 972 12 libR.dylib 0x0367ad10 Rf_applyClosure + 1120 13 libR.dylib 0x03676e3c Rf_eval + 972 14 libR.dylib 0x0367a110 do_if + 48 15 libR.dylib 0x03676fbc Rf_eval + 1356 16 libR.dylib 0x0367932c do_begin + 108 17 libR.dylib 0x03676fbc Rf_eval + 1356 18 libR.dylib 0x0367ad10 Rf_applyClosure + 1120 19 libR.dylib 0x03676e3c Rf_eval + 972 20 libR.dylib 0x0361b7c0 protectedEval + 64 21 libR.dylib 0x0361c170 R_ToplevelExec + 544 22 libR.dylib 0x0361c22c R_tryEval + 60 23 _rpy2031.so 0x032f0b8c do_eval_expr + 108 24 _rpy2031.so 0x032ef950 Robj_call + 688 25 Python2.5 0x023c6c08 PyObject_Call + 56 26 Python2.5 0x024a68ec PyEval_EvalFrameEx + 16844 27 Python2.5 0x024a8cf8 PyEval_EvalFrameEx + 26072 28 Python2.5 0x024aaef8 PyEval_EvalCodeEx + 3512 29 Python2.5 0x024a7ce0 PyEval_EvalFrameEx + 21952 30 Python2.5 0x024a8cf8 PyEval_EvalFrameEx + 26072 31 Python2.5 0x024aaef8 PyEval_EvalCodeEx + 3512 32 Python2.5 0x023fbb88 function_call + 472 33 Python2.5 0x023c6c08 PyObject_Call + 56 34 Python2.5 0x023d3294 instancemethod_call + 388 35 Python2.5 0x023c6c08 PyObject_Call + 56 36 Python2.5 0x024a0cf4 PyEval_CallObjectWithKeywords + 276 37 Python2.5 0x024f244c t_bootstrap + 60 38 libSystem.B.dylib 0x9002b508 _pthread_body + 96 Is this because R itself isn't thread-safe, or maybe the R code I'm calling? I've found discussions on why should we make R thread-safe and how on the website, but there appears to be no date on these documents. The R/Python wrapper functions I'm using: # a variance calculator that returns 0 for vectors that have only 1 non-NaN element: def vvar(a): v=rpy.r.var(a, na_rm=True) if isnan(v): return 0 return v # Calculate the Spearman Rho correlation between a and b and return the result # as scipy.stats.stats.spearmanr() does: R_spearmanr=rpy.r('function(a,b){ kk-cor.test(a,b,method=spearman); c( kk$estimate[[1]], kk$p.value) ; }') I'm taking care to make copies of the arrays I'm correlating when initialising the threads. (I can post more of the Python code, if required.) I'm using R 2.3.1 . thanks in advance, René (as always, please CC me on replies sent to the list, thanks!) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] binom calculation with range
dbinom(1:5,size=1000,prob=0.0005) [1] 0.3033791010 0.0758068339 0.0126155111 0.0015729946 0.0001567486 sum(dbinom(1:5,size=1000,prob=0.0005)) [1] 0.3935312 diff(pbinom(q=c(0,5),size=1000,prob=0.0005)) [1] 0.3935312 diff(pbinom(q=c(5,0),size=1000,prob=0.0005, lower=FALSE)) [1] 0.3935312 All the distribution functions have variants: p probability d density q quantile r random numbers See the help files, for example ?pbinom __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] avoiding a loop?
I have a vector, (not a list) repeated.measures.FACTOR.names [1] Insp1 Insp2 Insp3 Insp4 Insp5 Insp6 Insp7 Insp8 Insp9 and would like to convert this into a single string Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 I can do that with a loop, but isn't there a more elegant way? result - repeated.measures.FACTOR.names[[1]] for(i in 2:length(repeated.measures.FACTOR.names)) { result - paste(result, repeated.measures.FACTOR.names[[i]], sep=,) } result [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 Thanks. Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] adding error bars to lattice plots
Dear R users, About a year ago Deepayan offered a suggestion to incorporate error bars into a dotplot using the singer data as an example http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63875.html. When I try to utilize this code with a grouping variable, I get an error stating that the subscripts argument is missing. I have tried to insert them in various ways, but cannot figure out where they should go. Deepayan's original code follows, with additions from me for factor, grouping and by variables. (Note that I could use xYplot (Dotplot), but I need my response variable on the vertical axis.) Any suggestions would be greatly appreciated. Thanks, Dan prepanel.ci - function(x, y, lx, ux, subscripts, ...) { x - as.numeric(x) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) list(xlim = range(x, ux, lx, finite = TRUE)) } panel.ci - function(x, y, lx, ux, subscripts, pch = 16, ...) { x - as.numeric(x) y - as.numeric(y) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) panel.abline(h = unique(y), col = grey) panel.arrows(lx, y, ux, y, col = 'black', length = 0.25, unit = native, angle = 90, code = 3) panel.xyplot(x, y, pch = pch, ...) } singer.split - with(singer, split(height, voice.part)) singer.ucl - sapply(singer.split, function(x) { st - boxplot.stats(x) c(st$stats[3], st$conf) }) singer.ucl - as.data.frame(t(singer.ucl)) names(singer.ucl) - c(median, lower, upper) singer.ucl$voice.part - factor(rownames(singer.ucl), levels = rownames(singer.ucl)) # add factor, grouping and by variables singer.ucl$fac1=c(level1,level1, level2, level2) singer.ucl$by1=c(two,one) singer.ucl$group1=c(rep(letters[1],4),rep(letters[2],4)) ## show the data frame singer.ucl # Deepayan's original example with(singer.ucl, xyplot(voice.part ~ median, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci), horizontal=FALSE) # with by variable, works fine with(singer.ucl, xyplot(voice.part ~ median|by1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci)) # with groups, fails for lack of subscripts. with(singer.ucl, xyplot(voice.part ~ median, groups=group1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci)) # what I need, ultimately, is something like this, with error bars: with(singer.ucl, dotplot(median~fac1|by1, groups=group1)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding a loop?
On Thu, 2006-10-12 at 12:43 -0400, Charles Annis, P.E. wrote: I have a vector, (not a list) repeated.measures.FACTOR.names [1] Insp1 Insp2 Insp3 Insp4 Insp5 Insp6 Insp7 Insp8 Insp9 and would like to convert this into a single string Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 I can do that with a loop, but isn't there a more elegant way? result - repeated.measures.FACTOR.names[[1]] for(i in 2:length(repeated.measures.FACTOR.names)) { result - paste(result, repeated.measures.FACTOR.names[[i]], sep=,) } result [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 paste() is vectorized and note the use of 'collapse' in lieu of 'sep': paste(repeated.measures.FACTOR.names, collapse = ,) [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding a loop?
Le Jeudi 12 Octobre 2006 12:43, Charles Annis, P.E. a écrit : I have a vector, (not a list) repeated.measures.FACTOR.names [1] Insp1 Insp2 Insp3 Insp4 Insp5 Insp6 Insp7 Insp8 Insp9 and would like to convert this into a single string Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 I can do that with a loop, but isn't there a more elegant way? result - repeated.measures.FACTOR.names[[1]] for(i in 2:length(repeated.measures.FACTOR.names)) { result - paste(result, repeated.measures.FACTOR.names[[i]], sep=,) } result [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 Thanks. Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com paste() will do what you want. -- Vincent Goulet, Professeur agrégé École d'actuariat Université Laval, Québec [EMAIL PROTECTED] http://vgoulet.act.ulaval.ca __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding a loop?
On Thu, 2006-10-12 at 12:07 -0500, Marc Schwartz wrote: On Thu, 2006-10-12 at 12:43 -0400, Charles Annis, P.E. wrote: I have a vector, (not a list) repeated.measures.FACTOR.names [1] Insp1 Insp2 Insp3 Insp4 Insp5 Insp6 Insp7 Insp8 Insp9 and would like to convert this into a single string Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 I can do that with a loop, but isn't there a more elegant way? result - repeated.measures.FACTOR.names[[1]] for(i in 2:length(repeated.measures.FACTOR.names)) { result - paste(result, repeated.measures.FACTOR.names[[i]], sep=,) } result [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 paste() is vectorized and note the use of 'collapse' in lieu of 'sep': paste(repeated.measures.FACTOR.names, collapse = ,) [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 Before I forget, you can also do the following to reconstruct the initial sequence and the final result in a single step: paste(Insp, 1:9, sep = , collapse = ,) [1] Insp1,Insp2,Insp3,Insp4,Insp5,Insp6,Insp7,Insp8,Insp9 In this case, we use 'sep' to indicate that there should be no space between each occurrence of 'Insp' and the integers and then use 'collapse' to indicate (as above) that each alphanum construct is to be joined by a comma into a single element. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cross two dataframe
I am unsure what you need, but try ?merge. If this isn't what you need, try posting an example. hth, Mihai Nica 170 East Griffith St. G5 Jackson, MS 39201 601-914-0361 - Original Message From: Majid Iravani [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Thursday, October 12, 2006 7:45:19 AM Subject: [R] Cross two dataframe Dear r-users! I would like to cross two data frame which have the same row number but different in the number of column. Can anybody help me for this case ? Thanks a lot in advance Majid Iravani PhD Student Swiss Federal Research Institute WSL Research Group of Vegetation Ecology Zürcherstrasse 111 CH-8903 Birmensdorf Switzerland Phone: +41-1-739-2693 Fax: +41-1-739-2215 Email: [EMAIL PROTECTED] http://www.wsl.ch/staff/majid.iravani/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Draw a circle at the end of a line
I have a plot of cumulative distribution function which is a step function, I'd like to put a cycle at the right end of each line to indicate that the value here is not available in this line. How can I do that? Thank you. cdf-function(x){ do.call(rbind,lapply(1:nrow(as.matrix(x)), function(i){ a-x[i] if (a0.5){b=0.1} else if (a1){b=0.3} else if (a1.5){b=0.5} else if (a2.5){b=0.7} else if (a3){b=0.9} else b-1 }))} xx-seq(0,3.5,by=0.01) pp-cdf(xx) plot(xx,pp) Jue Wang, Biostatistician Contracted Position for Preclinical Research Biostatistics PrO Unlimited (908) 231-3022 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] adding error bars to lattice plots
Daniel E. Bunker said the following on 10/12/2006 11:48 AM: Dear R users, About a year ago Deepayan offered a suggestion to incorporate error bars into a dotplot using the singer data as an example http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63875.html. When I try to utilize this code with a grouping variable, I get an error stating that the subscripts argument is missing. I have tried to insert them in various ways, but cannot figure out where they should go. Deepayan's original code follows, with additions from me for factor, grouping and by variables. (Note that I could use xYplot (Dotplot), but I need my response variable on the vertical axis.) Any suggestions would be greatly appreciated. Thanks, Dan prepanel.ci - function(x, y, lx, ux, subscripts, ...) { x - as.numeric(x) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) list(xlim = range(x, ux, lx, finite = TRUE)) } panel.ci - function(x, y, lx, ux, subscripts, pch = 16, ...) { x - as.numeric(x) y - as.numeric(y) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) panel.abline(h = unique(y), col = grey) panel.arrows(lx, y, ux, y, col = 'black', length = 0.25, unit = native, angle = 90, code = 3) panel.xyplot(x, y, pch = pch, ...) } singer.split - with(singer, split(height, voice.part)) singer.ucl - sapply(singer.split, function(x) { st - boxplot.stats(x) c(st$stats[3], st$conf) }) singer.ucl - as.data.frame(t(singer.ucl)) names(singer.ucl) - c(median, lower, upper) singer.ucl$voice.part - factor(rownames(singer.ucl), levels = rownames(singer.ucl)) # add factor, grouping and by variables singer.ucl$fac1=c(level1,level1, level2, level2) singer.ucl$by1=c(two,one) singer.ucl$group1=c(rep(letters[1],4),rep(letters[2],4)) ## show the data frame singer.ucl # Deepayan's original example with(singer.ucl, xyplot(voice.part ~ median, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci), horizontal=FALSE) # with by variable, works fine with(singer.ucl, xyplot(voice.part ~ median|by1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci)) # with groups, fails for lack of subscripts. with(singer.ucl, xyplot(voice.part ~ median, groups=group1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci)) # what I need, ultimately, is something like this, with error bars: with(singer.ucl, dotplot(median~fac1|by1, groups=group1)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi, Daniel, Try this panel function: panel.ci - function(x, y, lx, ux, subscripts, groups = NULL, pch = 16, ...) { x - as.numeric(x) y - as.numeric(y) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) par - if(is.null(groups))plot.symbol else superpose.symbol sym - trellis.par.get(par) col - sym$col groups - if(!is.null(groups)) { groups[subscripts] } else { rep(1, along = x) } ug - unique(groups) for(i in seq(along = ug)) { subg - groups == ug[i] y.g - y[subg] x.g - x[subg] lx.g - lx[subg] ux.g - ux[subg] panel.abline(h = unique(y.g), col = grey) panel.arrows(lx.g, y.g, ux.g, y.g, col = 'black', length = 0.25, unit = native, angle = 90, code = 3) panel.xyplot(x.g, y.g, pch = pch, col = col[i], ...) } } __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R not responding for nested glm model
Hi, I'm trying to perform a glm model on count data (poisson distribution of the errors) where data are nested. glmmodel-glm(y~x/z,poisson) x and z are factors, z nested within x, y is count data. In that point the R just stuck and not respond anymore. I tried glmmodel-glm(y~x,poisson) and there were no problems. Trying glmmodel-glm(y~x/z) gave the same no response. Similar problem with continuous data (normal distribution of the errors). I am using R 2.3.1 on Windows XP Thanks Yuval -- Yuval Sapir, PhD Post-doctorate research associate Dept of Genetics University of Georgia Athens, GA 30602, USA __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] .RData and UNIX
Hi, I have been using R under Windows but have started to use R (ver. 2.3.0,) under Linux (linux-gnu). I have been saving my workspace and I notice that I sometimes end up with 2 version of the .RData file: .RData and .Rdata. Can someone explain what is happening? I am not sure which one is loaded so I have ended up losing data by deleting the wrong file. I think the saved file should be .RData. Thank you. Jeff __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R not responding for nested glm model
Yuval Sapir said the following on 10/12/2006 1:08 PM: Hi, I'm trying to perform a glm model on count data (poisson distribution of the errors) where data are nested. glmmodel-glm(y~x/z,poisson) x and z are factors, z nested within x, y is count data. In that point the R just stuck and not respond anymore. I tried glmmodel-glm(y~x,poisson) and there were no problems. Trying glmmodel-glm(y~x/z) gave the same no response. Similar problem with continuous data (normal distribution of the errors). I am using R 2.3.1 on Windows XP Thanks Yuval Please see the signature and provide commented, minimal, self-contained, reproducible code. Namely, what is 'x' and 'z'? My guess is ~x/z produces so large a model matrix you are approaching your limits of memory. But, there's no way to tell without more information. HTH, --sundar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] .RData and UNIX
On 10/12/2006 1:50 PM, J.M. Breiwick wrote: Hi, I have been using R under Windows but have started to use R (ver. 2.3.0,) under Linux (linux-gnu). I have been saving my workspace and I notice that I sometimes end up with 2 version of the .RData file: .RData and .Rdata. Can someone explain what is happening? I am not sure which one is loaded so I have ended up losing data by deleting the wrong file. I think the saved file should be .RData. Thank you. Unix (mostly) uses a case-sensitive file system, so .Rdata and .RData are different files. On Windows or MacOSX (mostly) they would refer to the same file. So at some point in your code or the code you're using, someone was a little sloppy and saved a file to .Rdata. As far as I know none of the R base packages or GUIs do this, so it's likely in something you wrote or in a contributed package. I'd keep a careful watch until you see what you're doing when this happens in order to identify where the problem is. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to assign a rank to a range of values..cumulative area distribution
Here's what I came up with, thanks to Alex Brown and Roger for the solution: #needs the maptools package to read ESRI grid require(maptools) #import the flow accumulation grid basin.map - readAsciiGrid(c:/temp/eno_fac.asc, colname=fac) #split on unique fac cell values areas -split(basin.map$fac,basin.map$fac) length(areas) #count each occurence of fac value cell_count-sapply(areas, length) #calculate each drainage area, original is 20 ft resolution, we want square meters basin_area - cell_count * 20 * 20 * 0.09290304 #read the area into a dataframe freqs-as.data.frame(table(basin_area)) #rank the frequencies based on each unique occerence, note, ranks from 1 to n r-rank(freqs$basin_area) n-length(r) #determing the probability, n+1 insures there is no 100%, 1- reverses the order so #low drainage area gets high probability of exceedence z-cbind(Rank = r, PRank = 1-(r/(n+1))) #attach the probability to the table, result is high prob of exceed is in row with low drainage #and low probabibility is in row with high drainage freqs$rank-z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Nov-Dec Courses: (1) Prof Frank Harrell *** Regression Modeling Strategies in R/Splus, (2) R/Splus Advanced Programming, (3) R/Splus Fundamentals and Programming Techniques
XLSolutions Corporation (www.xlsolutions-corp.com) is proud to announce November - December courses: (1) Regression Modeling Strategies in R/Splus --- by Prof Frank Harrell http://www.xlsolutions-corp.com/Rstats2.htm *** Chicago, November 16-17, 2006 *** *** San Francisco, November 30-Dec 1, 2006 *** *** Seattle, December 7-8, 2006 *** (2) R/Splus Advanced Programming --- by the R Development Core Team Guru! http://www.xlsolutions-corp.com/Radv.htm *** Seattle / TBD *** (3) R/Splus Fundamentals and Programming Techniques http://www.xlsolutions-corp.com/Rfund.htm *** San Francisco / December 14-15, 2006 *** *** Boston / December 18-19, 2006 Ask for group discount and reserve your seat Now - Earlybird Rates Payment due after the class! Email Sue Turner: [EMAIL PROTECTED] Email us for group discounts: [EMAIL PROTECTED] Phone: 206 686 1578 Visit us: www.xlsolutions-corp.com/training.htm Please let us know if you and your colleagues are interested in this class to take advantage of group discount. Register now to secure your seat! Cheers, Elvis Miller, PhD Manager Training XLSolutions Corporation 206 686 1578 www.xlsolutions-corp.com/training.htm [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fwd: rarefy a matrix of counts
Here's the script I wrote to randomly sample without replacement from a csv file of counts for various object classes (columns) in 76 samples (rows): The data file common_macro_raw.csv was: SiteID,Scaling_factor,Collembola,Hydrachnida,Nematomorpha,Oligochaeta,Turbellaria,Glossiphonidae,Hirudinidae,Gammaridae,Asellidae,Baetidae,Ephemerellidae,Ephemeridae,Heptageniidae,Leptophlebiidae,Siphlonuridae,Chloroperlidae,Leuctridae,Nemouridae,Peltoperlidae,Perlidae,Perlolidae,Pteronarcyidae,Brachycentridae,Glossosomatidae,Hydropsychidae,Hydroptilidae,Lepidostomatidae,Leptoceridae,Limnephilidae,Molannidae,Odontoceridae,Philopotamidae,Phryganeidae,Polycentropidae,Rhyacophilidae,Uenoidae,Corixidae,Corydalidae,Sialidae,Chrysolmelidae,Dytiscidae,Elmidae,Psephenidae,Athericidae,Blepharicidae,Ceratopogonidae,Chironomidae,Dixidae,Empididae,Psychodidae,Simuliidae,Strationyidae,Tabanidae,Tipulidae,Aeshnidae,Calopterygidae,Cordulegastridae,Gomphidae,Libellulidae,Pyralidae,Planorbidae,Sphaeridae 1100291,1,1,3,2,2,0,0,0,0,0,4,66,1,2,11,1,10,21,0,0,0,1,0,0,1,0,0,3,0,0,0,3,0,0,0,8,0,0,0,1,0,0,71,0,1,0,5,121,0,1,0,2,0,0,15,0,0,0,0,0,0,1,12 2400143,1.88 ,0,0,0,25,0,0,0,0,0,6,8,0,17,3,0,11,9,1,6,0,1,3,0,4,0,0,1,0,0,0,4,38,0,0,8,2,0,0,0,0,11,25,0,1,0,2,29,0,0,0,22,0,0,8,0,0,2,5,0,0,0,0 2500364,1,0,4,0,6,0,0,0,0,0,66,0,0,63,0,0,55,14,3,0,0,0,0,0,4,0,0,1,0,2,0,0,11,0,0,18,0,0,0,0,0,0,2,0,2,0,0,86,0,0,0,9,0,0,10,0,0,0,0,0,1,0,0 2600075,1,0,1,0,15,0,0,0,0,0...etc The program requires two loops, but took less than a second to run on my 1.8Ghz: #Reads matrix of raw macroinvertebrate counts from the subsampling prior to large-rare search #and scaling for sub-sampling effort rm(list=ls()) library(stats) master_data = read.csv(common_macro_raw.csv, row.names=1) data.frame(master_data) attach(master_data) counts = master_data[,2:ncol(master_data)] #These loops will extract a stream's assemblage, create a list of buggies identified, #take a random sample of 100 buggies without repalcement, and then re-combine the resulting #list into a vector of counts by taxa taxa_codes = c(1:ncol(counts)) #this creates a sequential integer for each taxon that will be the index for the subsequent lists rarified_samples = numeric() for (x in 1: nrow(counts)) { temp_counts = counts[x,] full_list = rep(taxa_codes, times=temp_counts) stream_rand = sum(temp_counts)/100*master_data[x,1] #puts new scaling factor in first column of stream_rand rare_list = sample(full_list, 100) for (i in 1:ncol(counts)) { temp_sum = sum(rare_list==i) stream_rand = c(stream_rand, temp_sum) } rarified_samples = rbind(rarified_samples, stream_rand) } rownames(rarified_samples)=SiteID colnames(rarified_samples)=colnames(master_data) data.frame(rarified_samples) write.csv(rarified_samples, file = rarified_samples.csv) You could add another for loop that appends as many iterations as needed to the output file. Thanks for all of your input, it helped tremendously. On 10/12/06, Petr Pikal [EMAIL PROTECTED] wrote: Hi On 11 Oct 2006 at 12:54, Tony Plate wrote: Date sent: Wed, 11 Oct 2006 12:54:44 -0600 From: Tony Plate [EMAIL PROTECTED] To: Brian Frappier [EMAIL PROTECTED] Copies to: Petr Pikal [EMAIL PROTECTED], r-help@stat.math.ethz.ch Subject:Re: [R] Fwd: rarefy a matrix of counts Two things to note: (1) rep() can be vectorized: rep(1:3, 2:4) [1] 1 1 2 2 2 3 3 3 3 (2) you will likely get much better performance if you work with integers and convert to strings after sampling (or use factors), e.g.: that is what I actually used in my suggestion (I hope). DF color sample1 sample2 sample3 1 red 400 3002500 2 green 100 0 200 3 black 3001000 500 notice that red, green, black is not **row names** but a column in data frame. That is why following code gives red, green, etc. x - data.frame(matrix(NA,100,3)) for (i in 2:ncol(DF)) x[,i-1] - sample(rep(DF[,1], DF[,i]),100) if you want result in data frame or x-vector(list, 3) for (i in 2:ncol(DF)) x[[,i-1]] - sample(rep(DF[,1], DF[,i]),100) c(red,green,blue)[sample(rep(1:3,c(400,100,300)), 5)] [1] red blue red red red -- Tony Plate snip is that this code still samples the rows, not the elements, i.e. No, see above. returns 100 or 300 in the matrix cells instead of red or a matrix of counts by color (object type) like: x1x2 x3 red 32 560 gr6895 40 sum 100 100 100 something like sapply(x,table) X1 X2 X3 black 36 79 15 green 14 0 9 red 50 21 76 HTH Petr It looks like Tony is right: sampling without replacement requires listing of all elements to be sampled. But, the code Petr provided x1 - sample(c(rep(red,400),rep(green, 100),rep(black,300)),100) did give me a clue of how to quickly make such a
Re: [R] adding error bars to lattice plots
On 10/12/06, Daniel E. Bunker [EMAIL PROTECTED] wrote: Dear R users, About a year ago Deepayan offered a suggestion to incorporate error bars into a dotplot using the singer data as an example http://finzi.psych.upenn.edu/R/Rhelp02a/archive/63875.html. When I try to utilize this code with a grouping variable, I get an error stating that the subscripts argument is missing. I have tried to insert them in various ways, but cannot figure out where they should go. Deepayan's original code follows, with additions from me for factor, grouping and by variables. (Note that I could use xYplot (Dotplot), but I need my response variable on the vertical axis.) Any suggestions would be greatly appreciated. Thanks, Dan prepanel.ci - function(x, y, lx, ux, subscripts, ...) { x - as.numeric(x) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) list(xlim = range(x, ux, lx, finite = TRUE)) } panel.ci - function(x, y, lx, ux, subscripts, pch = 16, ...) { x - as.numeric(x) y - as.numeric(y) lx - as.numeric(lx[subscripts]) ux - as.numeric(ux[subscripts]) panel.abline(h = unique(y), col = grey) panel.arrows(lx, y, ux, y, col = 'black', length = 0.25, unit = native, angle = 90, code = 3) panel.xyplot(x, y, pch = pch, ...) } singer.split - with(singer, split(height, voice.part)) singer.ucl - sapply(singer.split, function(x) { st - boxplot.stats(x) c(st$stats[3], st$conf) }) singer.ucl - as.data.frame(t(singer.ucl)) names(singer.ucl) - c(median, lower, upper) singer.ucl$voice.part - factor(rownames(singer.ucl), levels = rownames(singer.ucl)) # add factor, grouping and by variables singer.ucl$fac1=c(level1,level1, level2, level2) singer.ucl$by1=c(two,one) singer.ucl$group1=c(rep(letters[1],4),rep(letters[2],4)) ## show the data frame singer.ucl # Deepayan's original example with(singer.ucl, xyplot(voice.part ~ median, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci), horizontal=FALSE) # with by variable, works fine with(singer.ucl, xyplot(voice.part ~ median|by1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci)) # with groups, fails for lack of subscripts. with(singer.ucl, xyplot(voice.part ~ median, groups=group1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.ci)) Although that does seem to be the eventual error message, this fails not due to the lack of subscripts, but because 'panel.ci' does not know how to deal with groups. One solution to this is Sundar's approach, which is to change the panel function to handle groups. Another generic solution is to use 'panel.superpose', which _does_ know how to handle groups, and also accepts a custom panel function to be called for each group. Often (but not always), you can use a panel function designed for a non-groups aware display for this. In this case, the following gives results similar to Sundar's code: with(singer.ucl, xyplot(voice.part ~ median, groups=group1, lx = lower, ux = upper, prepanel = prepanel.ci, panel = panel.superpose, panel.groups = panel.ci, pch = 16)) Note the need for an explicit pch=16, since the default in panel.ci is overridden by panel.groups. # what I need, ultimately, is something like this, with error bars: with(singer.ucl, dotplot(median~fac1|by1, groups=group1)) If you have more than one interval (from different groups) for any level of your categorical variable - which seems to be the case in this example - you will encounter a problem. The problem is this: the grey reference lines are drawn once for every group, and will draw over intervals drawn by calls corresponding to earlier levels. This can be easily fixed by moving that part of the code from 'panel.groups' to 'panel', I'll leave that as an exercise. -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] debug package in R 2.4.0
Hi. I recently upgraded to R version 2.4.0, and I have found that the debug package no longer works. In particular, when I try to debug a function, I get the following error message: Error in attr(value, row.names) - rlabs : row names must be 'character' or 'integer', not 'double' I have, of course, taken all the necessary preceding steps, such as issuing the commands library(debug) source(test.r) mtrace(test) test() Does anyone know how to get around this problem? Thanks, Shlomo. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Draw a circle at the end of a line
try this: xx-seq(0,3.5,by=0.01) x.breaks - seq(0, max(ceiling(xx)), .5) # use 'cut' to split the data and create labels for your increments x.1 - cut(xx, breaks=x.breaks, labels=seq(.1, by=.2, length=length(x.breaks)-1), include.lowest=TRUE) plot(as.numeric(as.character(x.1))) # get indices of each step and use last point for the circle x.ind - split(seq(along=x.1), x.1) for (i in names(x.ind)){ if (length(x.ind[[i]]) 0) # make sure there is data points(tail(x.ind[[i]],n=1),as.numeric(i), pch=21, cex=5, col='red') } On 10/12/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I have a plot of cumulative distribution function which is a step function, I'd like to put a cycle at the right end of each line to indicate that the value here is not available in this line. How can I do that? Thank you. cdf-function(x){ do.call(rbind,lapply(1:nrow(as.matrix(x)), function(i){ a-x[i] if (a0.5){b=0.1} else if (a1){b=0.3} else if (a1.5){b=0.5} else if (a2.5){b=0.7} else if (a3){b=0.9} else b-1 }))} xx-seq(0,3.5,by=0.01) pp-cdf(xx) plot(xx,pp) Jue Wang, Biostatistician Contracted Position for Preclinical Research Biostatistics PrO Unlimited (908) 231-3022 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] debug package in R 2.4.0
Den 2006-10-12 21:39, Shlomo Katchmalik skrev: Hi. I recently upgraded to R version 2.4.0, and I have found that the debug package no longer works. In particular, when I try to debug a function, I get the following error message: Error in attr(value, row.names) - rlabs : row names must be 'character' or 'integer', not 'double' I have, of course, taken all the necessary preceding steps, such as issuing the commands library(debug) source(test.r) mtrace(test) test() I can confirm the above behaviour. Does anyone know how to get around this problem? Have you tried contacting the package maintainer, as suggested by the posting guide? (In this case Mark Bravington, CCed here.) Hopefully it's fixable. I'm deeply in love with `debug'... HTH, Henric Thanks, Shlomo. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fishers exact test in R
I was hoping for some advice regarding Fishers exact test in R. The help file indicates odds ratios are only available for 2 x 2 tables and that these odds ratios are the hypothesized odds ratio. I was uncertain as to what a hypothesised odds ratio is and whether it was possible to obtain odds ratios for a 3 x 2 table? Alternatively to trying to obtain an odds ratio for my five 3 x 2 tables, I was considering analysing them as a series of 2 x 2 tables as the odds ratio feature seemed a useful aid to determine the source of difference in a 3 x 2 table. I have, however, read conflicting accounts of whether subdividing larger tables into 2 x 2 tables is a justifiable practice. Any assistance with the above enquiry is much appreciated, regards Bob Green __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] debug package in R 2.4.0
Hi Henric et al. Thanks for the notification. The R development cycle always fills me with dread in case my packages stop working! I do plan to change debug (and no doubt mvbutils) to match the changes in R, hopefully in the next week or two... depends on spare time, sigh... bye Mark -Original Message- From: Henric Nilsson [mailto:[EMAIL PROTECTED] Sent: Thu 12/10/2006 22:05 To: Shlomo Katchmalik Cc: r-help@stat.math.ethz.ch; Bravington, Mark (CMIS, Hobart) Subject:Re: [R] debug package in R 2.4.0 Den 2006-10-12 21:39, Shlomo Katchmalik skrev: Hi. I recently upgraded to R version 2.4.0, and I have found that the debug package no longer works. In particular, when I try to debug a function, I get the following error message: Error in attr(value, row.names) - rlabs : row names must be 'character' or 'integer', not 'double' I have, of course, taken all the necessary preceding steps, such as issuing the commands library(debug) source(test.r) mtrace(test) test() I can confirm the above behaviour. Does anyone know how to get around this problem? Have you tried contacting the package maintainer, as suggested by the posting guide? (In this case Mark Bravington, CCed here.) Hopefully it's fixable. I'm deeply in love with `debug'... HTH, Henric Thanks, Shlomo. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Arrange histogram
The data set has a number of variables each of which is classified into two groups. For each variable of each group, I need to create a histogram. All the histograms are to be lined up into a file that looks like group1 group2 Variable 1 Histogram histogram Variable 2 Histogram histogram ... Can you give me a hint as to what package I'd look into for help? Thank you Jue Wang, Biostatistician Contracted Position for Preclinical Research Biostatistics PrO Unlimited (908) 231-3022 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Object attributes in R
Michael Toews [EMAIL PROTECTED] wrote: Is there any way of keeping the attributes when subsetted from primitive classes, like a fictional attr.drop option within the [ braces? The best alternative I have found is to make a new object, and copy the attributes: tm2 - tm[3:5] attributes(tm2) - attributes(tm) However, for the data.frame, how can I copy the attributes over (without using a for loop -- I've tried a few things using sapply but no success)? Also I don't see how this is consistent with an empty index, [], where attributes are always retained (as documented): tm[] What I've done is to define a subclass that keeps attributes, that can be added to any object, shown below. The keep.attr() function is supposed to return just user attributes but I'm not sure if my list of special ones is complete. I haven't looked at the performance impact of this sort of thing. -- David Hinds keep.attr - function(x) { a - attributes(x) a[c('names','row.names','class','dim','dimnames')] - NULL a } keep - function(x, ..., attr=NULL) { cl - union('keep', class(x)) do.call('structure', c(list(x, class=cl, ..., attr)) } '[.keep' - function(x, ...) keep(NextMethod(), keep.attr(x)) '[-.keep' - function(x, ...) keep(NextMethod(), keep.attr(x)) tm - keep((1:10)/10, units='sec') ds - keep((1:10)^2, units='cm') dat - data.frame(tm=tm,ds=ds) str(dat) tm[3:5] ds[-3] str(dat[1:3,]) str(dat[,1]) str(dat[1]) dat - keep(dat, parent='xyz') str(dat) str(dat[2,2]) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to force x AND y log scale in xyplot?
On 10/12/06, Thomas P. Colson [EMAIL PROTECTED] wrote: the following plots a log-log plot of cumulative drainage area, but the axis labels are 10^-1, ...10^5, so forth. xyplot(white$rank.PRank~white$basin_area,scales=list(log=TRUE),xlab=Drainag e Area m^2,ylab=P(AA*)) When I try this, I get the desired labels, sort of: the x axis contains 100, 1000,1 and then 1e+05, 1e+06 xyplot(white$rank.PRank~white$basin_area,scales=list(y=list(log=TRUE,at=c(.0 001,.001,.01,.1,1)),x=list(log=TRUE,at=c(10,100,1000,1,10,100))) ,xlab=Drainage Area m^2,ylab=P(AA*)) How to make those last two x-axis labels conform? Specify a 'labels' as well, e.g. x=list(log=TRUE, at=c(10,100,1000,1,10,100), labels = c(10,100,1000,1,10,100)) -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Object attributes in R
[EMAIL PROTECTED] wrote: What I've done is to define a subclass that keeps attributes, that can be added to any object, shown below. The keep.attr() function is supposed to return just user attributes but I'm not sure if my list of special ones is complete. Sorry, the following version should be better. -- David Hinds keep.attr - function(x) { a - attributes(x) a[c('names','row.names','class','dim','dimnames')] - NULL a } keep - function(.Data, ..., .Attr=NULL) { cl - union('keep', class(.Data)) do.call('structure', c(list(.Data, class=cl, ...), .Attr)) } '[.keep' - function(.Data, ...) keep(NextMethod(), .Attr=keep.attr(.Data)) '[-.keep' - function(.Data, ...) keep(NextMethod(), .Attr=keep.attr(.Data)) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 4PL algorithm
WinXP, Splus7 and R2.3.1. All, I have been using the SSfpl and SSlogis self-starting functions in the nlme library to fit 4PL and restricted 4PL models. I need to adapt these routines to fit the alternative model f(x) = A + (B-A)/(1 + abs(x/EC50)^C) My Question: How do I obtain good starting values for this alternative model? (The pseudo-code found on pages 517 - 520 of Mixed Effects Models in S and Splus is not applicable to my problem because it deals with f(x)=A+(B-A)/(1+exp(EC50-x)/C)) which is not the same function.) Many thanks, Greg __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Compiling R 2.4.0 in ubuntu/linux
use synaptic or apt-get to get readline and readline-dev, you also do not need f2c as there is a real fortran, so use synaptic/apt-get to get gfortran and use the following syntax when running ./configure: env F77=/usr/bin/gfortran-4.0 ./configure I have Ubuntu Edgy and the above works with almost any R, not only 2.4.0 Best, Oleg PS. This is not a question for R-devel :) T C wrote: I'm not sure if this is the place to post this question, but, I am having trouble compiling the source code. I do have a suitable C compiler and f2c but I get this error when I run ./configure configure: error: --with-readline=yes (default) and headers/libs are not available Any ideas? Thanks. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr Oleg Sklyar * EBI/EMBL, Cambridge CB10 1SD, England * +44-1223-494466 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plackett-Dale Model in R
I'm not familiar with the Plackett-Dale model, and RSiteSearch(Plackett-Dale) produced nothing. Google led me to an article on Plackett-Dale Inference to Study Inheritance (http://doclib.uhasselt.be/dspace/bitstream/1942/466/1/molg23.pdf#search=%22Plackett-Dale%22). This article discusses parameter estimation to maximize a pseudo-likelihood for a Plackett-Dale distribution, defined in other references. If you can write a function to compute the Plackett-Dale distribution, it should not be too difficult to maximize it (or minimize the negative of its logarithm). From that you should be able to get Wald approximate confidence intervals, likelihood ratio tests, etc. However, unless it's available under some other name, it looks to me like it's not included in any current CRAN package. I know this is not what you wanted, but I hope it helps. Spencer Graves Pryseley Assam wrote: Dear R users, Can someone inform me about a library/function in R that fits a Plackett-Dale model ? Thanks in advance Pryseley - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.