Re: [R] [R-SIG-Finance] AsOf join in R
A bit late, but here is what I always do: m = merge(bid, ask, tick) m- interpNA(m, method=before) intrepNA can also interpolate NAs in different ways, for example linearly. Hth, Ulrich Am 06.10.2011 03:08, schrieb Robert A'gata: Hi Roupell, Yes I am aware of RTAQ function matchTradesQuotes. But my time series does not follow the TAQ format like they suggest. So I gave it a try and find that it doesn't work. In particular, my time series contain full level 2 order book and trades. I want to do asof join of the book to the trade. Is there any better way? Or I misunderstand anything about the RTAQ's function? Thank you. Robert On Wed, Oct 5, 2011 at 1:59 AM, Roupell, Darkodarko.roup...@cba.com.au wrote: na.locf {zoo}, should do a job, also if you look in RTAQ they wrote a function that looks on previous tick and carries forward value in case its not available. __ Commonwealth Bank Darko Roupell Associate Quantitative Analyst Institutional Banking Markets Equities Research Darling Park Tower 1 Level 23, 201 Sussex Street Sydney, NSW 200 P: +61 2 9117 1254 F: +61 2 9118 1000 M: +61 400 170 515 E: darko.roup...@cba.com.au Our vision is to be Australia's finest financial services organisation through excelling in customer service. Email Security This email is sent solely for informational purposes. Hoax emails, commonly referred to as phishing, can appear to be from the Commonwealth Bank and ask you to update or confirm details such as client numbers, passwords, personal identification questions, contact details or account numbers. The Commonwealth Bank will never send you an email asking you to confirm, update or reveal your confidential banking information. Important Information Produced by Global Markets Research, a business unit of Commonwealth Bank of Australia ABN 48 123 123 124 - AFSL 234945 (Commonwealth Bank). This publication is based on information available at the time of publishing. We believe that the information in this communication is correct and any opinions, conclusions or recommendations are reasonably held or made as at the time of its compilation, but no warranty is made as to accuracy, reliability or completeness. To the extent permitted by law, neither Commonwealth Bank nor any of its subsidiaries accept liability to any person for loss or damage arising from the use of this communication. This communication does not purport to be a complete statement or summary. The information provided has been prepared without considering your objectives, financial situation or needs, and before acting on the information, you should consider its appropriateness to your circumstances. No person should act on the basis of this report without considering and if necessary taking appropriate professional advice upon their own particular circumstances. Commonwealth Bank of Australia, as a provider of investment, borrowing and other financial services undertakes financial transactions with many corporate entities in Australia. This may include any corporate issuer referred to in this communication. Commonwealth Bank and its subsidiaries have effected or may effect transactions for their own account in any investments or related investments referred to herein. In the case of certain securities Commonwealth Bank is or may be the only market maker. -Original Message- From: r-sig-finance-boun...@r-project.org [mailto:r-sig-finance-boun...@r-project.org] On Behalf Of Robert A'gata Sent: Wednesday, 5 October 2011 2:41 PM To: r-help@r-project.org; r-sig-fina...@stat.math.ethz.ch Subject: [R-SIG-Finance] AsOf join in R Hi, I tried to google for any solution for asof join operator in R. But I couldn't find one. The asof join operator AsOf(A,B) merges 2 time series by looking for latest available value of B prior to each time point in A. For example, A- xts(c(10,15,20,25), order.by=as.POSIXct(c(2011-09-01,2011-09-09,2011-09-10,2011-09-15)) B- xts(c(1.1,1.5,1.3,1.7), order.by=as.POSIXct(c(2011-08-31,2011-09-09,2011-09-11,2011-09-12)) AsOf(A,B) should return A B 2011-09-0110 1.1 2011-09-0915 1.1 # (because latest value B prior to 2011-09-09 is 1.1) 2011-09-1020 1.5 2011-09-1525 1.7 How do I write the above AsOf function in R? The merge function does not do what I want because it will align points that have the same time stamp together while what I want is actually latest value prior to timestamp in A. Any example would be greatly appreciated. Thank you. Cheers, Robert ___ r-sig-fina...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. If you want to post, subscribe first. -- Also note that this is not the r-help list where general R questions should go. ** IMPORTANT MESSAGE * This e-mail message is
Re: [R] any way to convert back to DateTime class when accidental conversion to numeric?
A more portable way (that function only works in some versions of R) is as.POSIXct(1317857320, origin=1970-01-01) possibly with a 'tz' argument if you need to restore the timezone. On Wed, 5 Oct 2011, jim holtman wrote: Here is what I use: unix2POSIXct(1317857320) [1] 2011-10-05 19:28:40 EDT unix2POSIXct - function (time) structure(time, class = c(POSIXt, POSIXct)) On Wed, Oct 5, 2011 at 7:38 PM, Mike Williamson this.is@gmail.com wrote: Hi, In short, I would like to know if there is any way to convert a numeric into a date, similar to how strptime() can convert a string to a date time class? There are some functions, etc. which don't work well with dates, and tend to force them into numerics. I understand that the number it spits back is the number of seconds since the beginning of 1970 (see the first few sentences of the Details portion of ?DateTimeClasses). However, it's a bit of a hassle to convert that by hand. I can create a function to do this, and it isn't so hard, but I found it hard to believe such a function didn't already exist, so I wanted to ask the community. As an example, today (Oct 5th 2011 at approximately 4:30pm, Pacific time) is approximately 1317857320 as a numeric, but I would like to know how to go from that number back to the 2011-10-05 16:28:39 PDT date time class which originally generated it. Thanks! Mike --- XKCD http://www.xkcd.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mean(s) from values in different row?
Hello: Is there a way to get a mean from values stored in different rows? The data looks like this: YEAR-1, JAN, FEB, ..., DEC YEAR-2, JAN, FEB, ..., DEC YEAR-3, JAN, FEB, ..., DEC What I want is the mean(s) for just the consecutive winter months: YEAR-1.DEC, YEAR-2.JAN, YEAR-2.FEB YEAR-2.DEC, YEAR-3.JAN, YEAR-3.FEB etc. Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Concecutive zeros and ones
Dear all, I have a data series (might be vector or matrix) which is composed only from zeros and ones like the following example 0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 0 I want to be able to return back the length of concecutive zeros and the length of concecutive ones. For that I want to have something like that returned: zeros= [3 1 2 3]; ones=[2 1 4]; How I can do that simply in R? I would like to thank you in advance for your help B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Concecutive zeros and ones
Check function rle(). I hope it helps. Best, Dimitris On 10/6/2011 9:18 AM, Alaios wrote: Dear all, I have a data series (might be vector or matrix) which is composed only from zeros and ones like the following example 0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 0 I want to be able to return back the length of concecutive zeros and the length of concecutive ones. For that I want to have something like that returned: zeros= [3 1 2 3]; ones=[2 1 4]; How I can do that simply in R? I would like to thank you in advance for your help B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Populate a matrix
This last solution is what I was looking for, I was trying to avoid loops. Thanks! -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rainer Schuermann Sent: 5. oktober 2011 18:29 To: r-help@r-project.org Subject: Re: [R] Populate a matrix m - matrix( rep( y, length( x ) ), length( y ), length( x ) ) On Wednesday 05 October 2011 18:11:18 fernando.cabr...@nordea.com wrote: Hi guys I have vectors x - c(1,2,3,4) and y - c(4,3,9) and would like to generate a matrix which has 3 rows (length(y)) and 4 columns (length(x)), and each row is the corresponding y element repeated length(x) times. 4,4,4,4 3,3,3,3 9,9,9,9 Thanks. Fernando Álvarez __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] kriging shapefiles
On 10/05/2011 09:07 AM, Leynnard Rey Matillano wrote: Hi! Im new to R and I need to interpolate a shapefile using kriging. I've been able to plot/read the shapefile using the package maptools or rgdal. I've searched the internet for sample codes but most of the kriging codes that I've found done in R is done using txtfiles or CSVs. An example could be of great help. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ...you could also take a look at the automap package (some blatant self promotion ;)). This builds on gstat... Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reporting multiple objects out of a function
On 10/05/2011 09:14 PM, andrewH wrote: Thanks, Sina! This is very helpful and informative, but still not quite what I want. So, here is the thing: When a function returns an object, that object is available in the calling environment. If it is returned inside a function, it is available in the function, but not outside of the function. What I want to do is simply to return more than one object in the usual sense in which functions return objects. Hi, As I understand it, you want to return multiple arguments without returning them explicitly as an object. This can probably be done, by I would advice against it because it makes your code harder to read. You dump something in the calling environment, and a new user (maybe yourself in a few months) has to do a lot of reasoning of what is happening under the hood, which object is dumped in which environment. I would just return a list. Alternatively, take a look at object oriented programming like Gabor suggested. This, however, still involves returning an object... Again, I would recommend doing this the standard R way cheers, Paul Here is a test to see if a function fun does this, at least to the depth of 1. obj1 - 1 obj2 - 2 cat(obj1 in global=, obj1) cat(obj2 in global=, obj2) wrapFun - function(fun) { obj1 - 3 obj2 - 4 cat(obj1 in calling=, obj1) cat(obj2 in calling=, obj2) fun() cat(obj in calling=, obj) cat(obj1 in calling=, obj1) cat(obj2 in calling=, obj2) } cat(obj1 in global=, obj1) cat(obj2 in global=, obj2) Suppose the function fun assigns the values 5 and 6 to obj1 and obj2. If the function does what I want, this code should print: obj1 in global= 1 obj2 in global= 2 obj1 in calling= 3 obj2 in calling= 4 obj1 in calling= 5 obj2 in calling= 6 obj1 in global= 1 obj2 in global= 2 I turned Paul’s and Sina’s code into functions as follows: paulFun - function() { obj1 - 5; obj2 - 6; } sinaFun - function() { attach(what = NULL, name = my_env) assign(obj1, 5, envir = as.environment(my_env)) assign(obj1, 5, envir = as.environment(my_env)) } Running these two functions in the code above yields: paulFun: obj1 in global= 1 obj2 in global= 2 obj1 in calling= 3 obj2 in calling= 4 obj1 in calling= 3 obj2 in calling= 4 obj1 in global= 5 obj2 in global= 6 So paulFun puts the objects in the global environment but not in the calling environment. Let’s try sinaFun: sinaFun: obj1 in global= 1 obj2 in global= 2 obj1 in calling= 3 obj2 in calling= 4 obj1 in calling= 3 obj2 in calling= 4 obj1 in global= 1 obj2 in global= 2 sinaFun puts the objects in the new environment it defines, but they are available in neither the calling nor the global environment. However, I was immediately convinced that Sina had given me the tool I was missing: the assign function. (Thanks, Sina!) But I was wrong (or used it wrong), and now I am even more deeply confused. Here is a function that I thought would do what I want: andrewFun - function() { assign(obj1, 5, pos = sys.parent(n = 1)) assign(obj2, 6, pos = sys.parent(n = 1)) NULL } However, when I tried it, my results were the same as paulFun: assigned in the global environment, but not in the calling environment. Setting n = 0 seemed to limit the assignment to the interior of andrewFun: none of the printed obj values were affected. Help? andrewH -- View this message in context: http://r.789695.n4.nabble.com/reporting-multiple-objects-out-of-a-function-tp3873380p3876201.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] apply and functions with many arguments
Dear all, I would like to use the following function fitdist(data, distr, method=c(mle, mme, qme, mge), start=NULL, fix.arg=NULL, ...) for many different distr values like distr=c(norm,lnorm,pois) (just a small example) and take back into a list the parameter name which is what is inside distr plus what the function fitdist returns (another list). How can I do that ? B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] anova.rq {quantreg) - Why do different level of nesting changes the P values?!
Hello dear R help members. I am trying to understand the anova.rq, and I am finding something which I can not explain (is it a bug?!): The example is for when we have 3 nested models. I run the anova once on the two models, and again on the three models. I expect that the p.value for the comparison of model 1 and model 2 would remain the same, whether or not I add a third model to be compared with. However, the P values change, and I do not understand why. Here is an example code (following with it's input): data(barro) fit0 - rq(y.net ~ lgdp2 + fse2 , data = barro) fit1 - rq(y.net ~ lgdp2 + fse2 + gedy2 , data = barro) fit2 - rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 , data = barro) anova(fit0,fit1,fit2, R = 1000) anova(fit0,fit1, R = 1000) Output: data(barro) fit0 - rq(y.net ~ lgdp2 + fse2 , data = barro) fit1 - rq(y.net ~ lgdp2 + fse2 + gedy2 , data = barro) fit2 - rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 , data = barro) anova(fit0,fit1,fit2, R = 1000) Quantile Regression Analysis of Deviance Table Model 1: y.net ~ lgdp2 + fse2 + gedy2 + Iy2 Model 2: y.net ~ lgdp2 + fse2 + gedy2 Model 3: y.net ~ lgdp2 + fse2 Df Resid Df F valuePr(F) 1 1 156 29.494 2.110e-07 *** 2 2 156 18.194 7.901e-08 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 anova(fit0,fit1, R = 1000) Quantile Regression Analysis of Deviance Table Model 1: y.net ~ lgdp2 + fse2 + gedy2 Model 2: y.net ~ lgdp2 + fse2 Df Resid Df F value Pr(F) 1 1 157 3.9532 0.04852 * --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 [3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C [5] LC_TIME=Hebrew_Israel.1255 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] rms_3.3-1Hmisc_3.8-3 survival_2.36-9 [4] colorspace_1.1-0 quantreg_4.71SparseM_0.89 [7] PerformanceAnalytics_1.0.3.2 xts_0.8-2zoo_1.7-4 [10] reporttools_1.0.6xtable_1.5-6 loaded via a namespace (and not attached): [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-33 tools_2.13.1 Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speed up this algorithm (apply-fuction / 4D array)
here's another one - which is easier to generalize: x - array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91)) y - x [,,,1:90] # decide yourself what to do with slice 91, but # 91 is not divisible by 3 system.time ({ dim (y) - c (50, 50, 50, 3, 90 %/% 3) y - aperm (y, c (4, 1:3, 5)) v2 - colMeans (y) }) User System verstrichen 0.320.080.40 (my computer is a bit slower than Bill's:) system.time (v1 - f1 (x)) User System verstrichen 0.360 0.030 0.396 Claudia Am 05.10.2011 20:24, schrieb William Dunlap: I corrected your code a bit and put it into a function, f0, to make testing easier. I also made a small dataset to make testing easier. Then I made a new function f1 which does what f0 does in a vectorized manner: x- array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91)) xsmall- array(log(seq_len(2 * 2 * 2 * 91)), dim=c(2, 2, 2, 91)) f0- function(x) { data_reduced- array(0, dim=c(dim(x)[1:3], trunc(dim(x)[4]/3))) reduce- seq(1, dim(x)[4]-1, by=3) for( i in 1:length(reduce) ) { data_reduced[ , , , i]- apply(x[ , , , reduce[i] : (reduce[i]+2) ], 1:3, mean) } data_reduced } f1- function(x) { reduce- seq(1, dim(x)[4]-1, by=3) data_reduced- (x[, , , reduce] + x[, , , reduce+1] + x[, , , reduce+2]) / 3 data_reduced } The results were: system.time(v1- f1(x)) user system elapsed 0.280 0.040 0.323 system.time(v0- f0(x)) user system elapsed 73.760 0.060 73.867 all.equal(v0, v1) [1] TRUE I thought apply would already vectorize, rather than loop over every coordinate. No, you have that backwards. Use *apply functions when you cannot figure out how to vectorize. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Martin Batholdy Sent: Wednesday, October 05, 2011 10:40 AM To: R Help Subject: [R] speed up this algorithm (apply-fuction / 4D array) Hi, I have this sample-code (see above) and I was wondering wether it is possible to speed things up. What this code does is the following: x is 4D array (you can imagine it as x, y, z-coordinates and a time-coordinate). So x contains 50x50x50 data-arrays for 91 time-points. Now I want to reduce the 91 time-points. I want to merge three consecutive time points to one time-points by calculating the mean of this three time-points for every x,y,z coordinate. The reduce-sequence defines which time-points should get merged. And the apply-function in the for-loop calculates the mean of the three 3D-Arrays and puts them into a new 4D array (data_reduced). The problem is that even in this example it takes really long. I thought apply would already vectorize, rather than loop over every coordinate. But for my actual data-set it takes a really long time ... So I would be really grateful for any suggestions how to speed this up. x- array(rnorm(50 * 50 * 50 * 90, 0, 2), dim=c(50, 50, 50, 91)) data_reduced- array(0, dim=c(50, 50, 50, 90/3)) reduce- seq(1,90, 3) for( i in 1:length(reduce) ) { data_reduced[ , , , i]-apply(x[ , , , reduce[i] : (reduce[i]+3) ], 1:3, mean) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Claudia Beleites Spectroscopy/Imaging Institute of Photonic Technology Albert-Einstein-Str. 9 07745 Jena Germany email: claudia.belei...@ipht-jena.de phone: +49 3641 206-133 fax: +49 2641 206-399 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply and functions with many arguments
On 10/06/2011 08:03 AM, Alaios wrote: Dear all, I would like to use the following function fitdist(data, distr, method=c(mle, mme, qme, mge), start=NULL, fix.arg=NULL, ...) for many different distr values like distr=c(norm,lnorm,pois) (just a small example) and take back into a list the parameter name which is what is inside distr plus what the function fitdist returns (another list). How can I do that ? B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi, Take a look at 'expand.grid' and 'ddply' from the plyr package or foreach from the foreach package. cheers, Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mean(s) from values in different row?
One way for doing it would be to combine the columns using paste and then use tapply to get the means. For example: set.seed(32341) a1 = sample(c(a,b), 100,replace = T) a2 = sample(c(a,b), 100,replace = T) y = rnorm(100) tapply(y,paste(a1,a2), mean) Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Thu, Oct 6, 2011 at 8:40 AM, SML paral...@lafn.org wrote: Hello: Is there a way to get a mean from values stored in different rows? The data looks like this: YEAR-1, JAN, FEB, ..., DEC YEAR-2, JAN, FEB, ..., DEC YEAR-3, JAN, FEB, ..., DEC What I want is the mean(s) for just the consecutive winter months: YEAR-1.DEC, YEAR-2.JAN, YEAR-2.FEB YEAR-2.DEC, YEAR-3.JAN, YEAR-3.FEB etc. Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] expression set (Bioconductor) problem
Hello R people, dim(exprs(estrogenrma) I have an expressionSet with 8 samples and 12695 features (genes) estrogenrma$estrogen present present absent absent present present absent absent estrogenrma$time.h 10 10 10 10 48 48 48 48 present - grep(present, as.character(estrogenrma$estrogen)) absent - grep(absent, as.character(estrogenrma$estrogen)) ten - grep(10, as.character(estrogenrma$time.h)) fortyeight - grep(48, as.character(estrogenrma$time.h)) present.10 - estrogenrma[, intersect(present, ten)] present.48 - estrogenrma[, intersect(present, fortyeight)] absent.10 - estrogenrma[, intersect(absent, ten)] absent.48 - estrogenrma[, intersect(absent, fortyeight)] present.10, present.48, absent.10, and absent.48 are four expression sets with two samples and 12695 features. How can I make a new 2 new expressionsets, each have 12695 features and one sample where expressionset1 = (present.10 + present.48) / 2 expressionset2 = (absent.10 + absent.48) / 2 ? Thanks, Clayton - Original Message - From: Tal Galili tal.gal...@gmail.com To: SML paral...@lafn.org Cc: r-help@r-project.org Sent: Thursday, October 6, 2011 4:09:43 AM Subject: Re: [R] Mean(s) from values in different row? One way for doing it would be to combine the columns using paste and then use tapply to get the means. For example: set.seed(32341) a1 = sample(c(a,b), 100,replace = T) a2 = sample(c(a,b), 100,replace = T) y = rnorm(100) tapply(y,paste(a1,a2), mean) Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Thu, Oct 6, 2011 at 8:40 AM, SML paral...@lafn.org wrote: Hello: Is there a way to get a mean from values stored in different rows? The data looks like this: YEAR-1, JAN, FEB, ..., DEC YEAR-2, JAN, FEB, ..., DEC YEAR-3, JAN, FEB, ..., DEC What I want is the mean(s) for just the consecutive winter months: YEAR-1.DEC, YEAR-2.JAN, YEAR-2.FEB YEAR-2.DEC, YEAR-3.JAN, YEAR-3.FEB etc. Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fitting parabolic function to data
Dear R users and experts, I want to fit a shifted parabolic function with the following functional form to my data: f(x)=a0*(x+a1)^2+a2 (a0, a1 and a2 are scaling factors.) What is standard approach to do this in R? I tried the lm function in R but I got problems getting the above functional form. Any help is welcome :) . Greetings, Henri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R CMD check
On 05.10.2011 23:56, Jeff Breiwick wrote: Richard M. Heibergerrmhat temple.edu writes: The next thing to check is this item from doc/manual/R-exts.html Quoted strings within R-like text are handled specially... My guess is that the problem is occuring in the .Rd file, not in the .R file. Remove the line, or double the \ characters. Rich Yes, the error appears to be in the .Rd file so I will modify that. Thanks. Then you have not deleted the function from its example section where backslashes have to be escaped once more. Uwe Ligges Jeff __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Does it exist a function for this?
On 10/05/2011 07:42 PM, lunarossa wrote: I have this kind of matrix, with thousands of cases. A 2 apple A 2 peach A 3 peach B 1 pear B 4 peach B 4 beef B 7 beef C 1 peach D 2 apple D 5 peach I have to distinguish, from the other rows, the rows with peach and this is not a problem. I also have to discriminate the rows with peach like the second one (associated with the same two cells A and 2 to apple, see first row) from the row like the 3rd or the 8th ones, when the first two cells are unique (A and 3 or C and 1). Hi lunarossa, I may be on the wrong track, but you could just stick the three components together: alphanumfruit[,4]-paste(alphanumfruit[,1], alphanumfruit[,2],alphanumfruit[,3],sep=) and the fourth column of your object (which I suspect is a data frame) will have elements that can be tested for matching or non-matching. More complicated conditions can be accommodated by pasting different combinations of the columns together. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting parabolic function to data
On 11-10-06 4:59 AM, Henri Mone wrote: Dear R users and experts, I want to fit a shifted parabolic function with the following functional form to my data: f(x)=a0*(x+a1)^2+a2 (a0, a1 and a2 are scaling factors.) What is standard approach to do this in R? I tried the lm function in R but I got problems getting the above functional form. Any help is welcome :) . That can be expanded into a regular quadratic: (a0*a1^2 + a2) + 2*a0*a1*x + a0*x^2 So fit a regular quadratic, and then solve for a0, a1, a2 from the resulting coefficients. The only tricky bit will be computing errors on the a's. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] break.axis all range of data
On 10/06/2011 12:31 AM, Heverkuhn Heverkuhn wrote: ..all the point from 8 to 13. On Wed, Oct 5, 2011 at 8:28 AM, Heverkuhn Heverkuhnheverk...@gmail.comwrote: The problem with that function is that it does not really separate the 2parts of the graph but it inserts , when style is gap, a blank strip that cover axis and points. So for example if a insert it at 8 and I set the gap of length 5 , it would cancel al the point from 8 to 10. ... I would like to increase the distance between x tick-marks 8 and 9, and not connect the points x=8 and x=9. Ah, I think I see what you want. You want an axis like this: axis(1,at=c(1:8,10:37),labels=1:36) and to get your points right, you would have to do something like: plot(c(1:8,10:37),1:36,xaxt=n) lines(1:8,1:8) lines(10:37,9:36) first. This is more or less the reverse of the gap.* functions in the plotrix package. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply and functions with many arguments
Hello Paul I have also tried this one distrList-c(norm,lnorm,pois,exp,gamma,nbinom,geom,logis); return (lapply(distrList, function(distrList) { fitdist(x1,distrList)})); which seems to work. I am not sure though if there is a strict performance penalty. B.R Alex From: Paul Hiemstra paul.hiems...@knmi.nl Cc: R-help@r-project.org R-help@r-project.org Sent: Thursday, October 6, 2011 10:07 AM Subject: Re: [R] apply and functions with many arguments On 10/06/2011 08:03 AM, Alaios wrote: Dear all, I would like to use the following function fitdist(data, distr, method=c(mle, mme, qme, mge), start=NULL, fix.arg=NULL, ...) for many different distr values like distr=c(norm,lnorm,pois) (just a small example) and take back into a list the parameter name which is what is inside distr plus what the function fitdist returns (another list). How can I do that ? B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi, Take a look at 'expand.grid' and 'ddply' from the plyr package or foreach from the foreach package. cheers, Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] anova.rq {quantreg) - Why do different level of nesting changes the P values?!
Hi Tal, you are comparing different things. The Details-section of ?anova.qr states test the hypothesis that smaller models are adequate relative to the largest specified model. So in your first anova you compare fit2 with fit1 and fit0, in your second attempt fit1 with fit0, so you have different base-models to compare with, and consequently different p-values. hth. Am 06.10.2011 10:05, schrieb Tal Galili: Hello dear R help members. I am trying to understand the anova.rq, and I am finding something which I can not explain (is it a bug?!): The example is for when we have 3 nested models. I run the anova once on the two models, and again on the three models. I expect that the p.value for the comparison of model 1 and model 2 would remain the same, whether or not I add a third model to be compared with. However, the P values change, and I do not understand why. Here is an example code (following with it's input): data(barro) fit0 - rq(y.net ~ lgdp2 + fse2 , data = barro) fit1 - rq(y.net ~ lgdp2 + fse2 + gedy2 , data = barro) fit2 - rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 , data = barro) anova(fit0,fit1,fit2, R = 1000) anova(fit0,fit1, R = 1000) Output: data(barro) fit0 - rq(y.net ~ lgdp2 + fse2 , data = barro) fit1 - rq(y.net ~ lgdp2 + fse2 + gedy2 , data = barro) fit2 - rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 , data = barro) anova(fit0,fit1,fit2, R = 1000) Quantile Regression Analysis of Deviance Table Model 1: y.net ~ lgdp2 + fse2 + gedy2 + Iy2 Model 2: y.net ~ lgdp2 + fse2 + gedy2 Model 3: y.net ~ lgdp2 + fse2 Df Resid Df F valuePr(F) 1 1 156 29.494 2.110e-07 *** 2 2 156 18.194 7.901e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 anova(fit0,fit1, R = 1000) Quantile Regression Analysis of Deviance Table Model 1: y.net ~ lgdp2 + fse2 + gedy2 Model 2: y.net ~ lgdp2 + fse2 Df Resid Df F value Pr(F) 1 1 157 3.9532 0.04852 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 [3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C [5] LC_TIME=Hebrew_Israel.1255 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] rms_3.3-1Hmisc_3.8-3 survival_2.36-9 [4] colorspace_1.1-0 quantreg_4.71SparseM_0.89 [7] PerformanceAnalytics_1.0.3.2 xts_0.8-2zoo_1.7-4 [10] reporttools_1.0.6xtable_1.5-6 loaded via a namespace (and not attached): [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-33 tools_2.13.1 Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subsetting a data frame with multiple values and exclusions.
Thanks. Such a short and sweet answer that does what it should. - Natalie Van Zuydam PhD Student University of Dundee nvanzuy...@dundee.ac.uk -- View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3877472.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] The Sets package and element equality test
Hi, I tried to use the sets package yesterday, which seems to be a very powerful package: thanks to the developers. I wanted to define sets of lists where 2 lists are considered equal if they have the same length. So, I implemented: match.2.list - function(l1,l2){ length(l1)==length(l2) } and then defined my cset as: s - set(list(1,2),list(3,4)) lset - cset(s,matchfun(match.2.list)) so if I now do: y - list(3,4) y %e% lset I get the correct answer, which is TRUE. But if I do: x - list(1,8) x %e% lset I now get FALSE, even though x is a list of length 2, and should thus match any of the 2 lists in lset. I must be doing something wrong; I checked with the doc, but I don't understand. Thanks to anyone who can help. Regards Johnny [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with .C
Hello, first thank you for your answers. I did not read the whole pdf Writing R Extension, but I read this strongly shortened introduction to this subject: http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf I get the same error with this C-function: void test(double * b, int l) { int i; for(i=0; i l ; i++) b[i] +=i; } I call it from R like this: parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3); .C(test, as.double(parameter), as.integer(11)) The programm crashes even in this simple case. Where can be the error? Thanks Grigory Alexandrovich Answer 1 Without knowing that C code, we cannot know. Have you read Writing R Extensions carefully? I.e. take care with memory allocation and printing as mentioned in the manual. Uwe Ligges Answer 2 This looks like a classic case of not reading the manual, and then compounding it by not reading the posting guide. The manual would be the Writing R Extensions pdf that comes with R or you can google it. The posting guide is referenced at the bottom of this and every other posting on this mailing list. There are nearly an infinite variety of errors that can lead to a crash, so it is really unreasonable of you to pose this question this way and expect constructive assistance. --- Jeff Newmiller The . . Go Live... DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Answer 3 It's impossible to say, with such minimal information, but a reasonable guess is that there is a problem with the declaration of x and y in foo.c. These would (I think) need to be declared as double *, not double, when foo is called from .C(). cheers, Rolf Turner Answer 4 Hi, As other have said, it's very difficult to help you without an example + code to know what you are talking about. That having been said, it seems as if you are just getting your feet wet in this R -- C bridge, and I'd recommend you checkout the Rcpp and inline package to help make your life a lot easier ... -steve On 04.10.2011 14:04, Grigory Alexandrovich wrote: Hello, I wrote a function in C, which works fine if called from the main-function in C. But as soon as I try to call this function from R like .C('foo', as.double(x), as.integer(y)), the programm crashes. I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and loaded it into R with dyn.load(). What can be the cause of such behaviour? Again, the C-funcion itself works, but not if called from R. Thanks Grigory Alexandrovich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] non-cumulative hazard in Cox model with time-dependent covariates
Dear all, Is there a way to calculate the non-cumulative hazard (instantaneous hazard), which is the product of baseline hazard and exp{beta*covariate} ? I knew in survfit, we can get the estimator of cumulative baseline hazard, but how can we get the non-cumulative one? Thank you very much! Koshihaku -- View this message in context: http://r.789695.n4.nabble.com/non-cumulative-hazard-in-Cox-model-with-time-dependent-covariates-tp3877744p3877744.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About stepwise regression problem
If I used mixed model ( linear + smoothing function) How can I used p-value to perform stepwise regression??? Is it similar of Pr(|t|) of linear term and P-value of smoothing term but T value of linear and F of smoothing term? please help! -- View this message in context: http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3877259.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] error messages
Im doing a project using Rcommander. I have a dataset with 5 variables, BMI being one of them. I would like to find the variance, standard deviation, range, median, etc of BMI. However, everytime I type a command into the script window, (such as var(BMI$BMI,na.rm=TRUE) )an error message pops up saying Object BMI not found. I can graph it, work it, do t-test, etc,. It's listed as one of the variables in the data set. So I don't understand why it cant find BMI. Thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About stepwise regression problem
Why not use AIC / BIC based model selection which in this case makes more sense than artificial P-Values, and eliminates the problems of stepwise regression. For background read Burnham and anderson - multimodel selection. Chris. On 6 Oct 2011, at 08:14, pigpigmeow wrote: If I used mixed model ( linear + smoothing function) How can I used p-value to perform stepwise regression??? Is it similar of Pr(|t|) of linear term and P-value of smoothing term but T value of linear and F of smoothing term? please help! -- View this message in context: http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3877259.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Concecutive zeros and ones
Great :) From: Dimitris Rizopoulos d.rizopou...@erasmusmc.nl Cc: R-help@r-project.org R-help@r-project.org Sent: Thursday, October 6, 2011 9:22 AM Subject: Re: [R] Concecutive zeros and ones Check function rle(). I hope it helps. Best, Dimitris On 10/6/2011 9:18 AM, Alaios wrote: Dear all, I have a data series (might be vector or matrix) which is composed only from zeros and ones like the following example 0 0 0 1 1 0 1 0 0 1 1 1 1 0 0 0 I want to be able to return back the length of concecutive zeros and the length of concecutive ones. For that I want to have something like that returned: zeros= [3 1 2 3]; ones=[2 1 4]; How I can do that simply in R? I would like to thank you in advance for your help B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 Web: http://www.erasmusmc.nl/biostatistiek/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reg : read missing values from database using RJDBC
On 05.10.2011 17:10, Raji wrote: Hi All, This seems to be a bug with RJDBC package and it has been fixed in the latest RJDBC_0.1-6 version. I would like to try out RJDBC_0.1-6. can you please guide me to a link where i can find the 64-bit RJDBC_0.1-6.zip . I could find the 32-bit version at http://cran.sixsigmaonline.org/bin/windows/contrib/2.11/ ? RJDBC_0.2-0 seesm to be recent and binaries are available for recent versions of R on all CRAN mirrors. If you need a binary for an ancient version of R, you have to compile it yourself. Uwe Ligges Thanks in advance. -- View this message in context: http://r.789695.n4.nabble.com/Reg-read-missing-values-from-database-using-RJDBC-tp3257766p3874809.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] anova.rq {quantreg) - Why do different level of nesting changes the P values?!
Thank you Eik - this makes much more sense now. Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Thu, Oct 6, 2011 at 1:21 PM, Eik Vettorazzi e.vettora...@uke.de wrote: Hi Tal, you are comparing different things. The Details-section of ?anova.qr states test the hypothesis that smaller models are adequate relative to the largest specified model. So in your first anova you compare fit2 with fit1 and fit0, in your second attempt fit1 with fit0, so you have different base-models to compare with, and consequently different p-values. hth. Am 06.10.2011 10:05, schrieb Tal Galili: Hello dear R help members. I am trying to understand the anova.rq, and I am finding something which I can not explain (is it a bug?!): The example is for when we have 3 nested models. I run the anova once on the two models, and again on the three models. I expect that the p.value for the comparison of model 1 and model 2 would remain the same, whether or not I add a third model to be compared with. However, the P values change, and I do not understand why. Here is an example code (following with it's input): data(barro) fit0 - rq(y.net ~ lgdp2 + fse2 , data = barro) fit1 - rq(y.net ~ lgdp2 + fse2 + gedy2 , data = barro) fit2 - rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 , data = barro) anova(fit0,fit1,fit2, R = 1000) anova(fit0,fit1, R = 1000) Output: data(barro) fit0 - rq(y.net ~ lgdp2 + fse2 , data = barro) fit1 - rq(y.net ~ lgdp2 + fse2 + gedy2 , data = barro) fit2 - rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 , data = barro) anova(fit0,fit1,fit2, R = 1000) Quantile Regression Analysis of Deviance Table Model 1: y.net ~ lgdp2 + fse2 + gedy2 + Iy2 Model 2: y.net ~ lgdp2 + fse2 + gedy2 Model 3: y.net ~ lgdp2 + fse2 Df Resid Df F valuePr(F) 1 1 156 29.494 2.110e-07 *** 2 2 156 18.194 7.901e-08 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 anova(fit0,fit1, R = 1000) Quantile Regression Analysis of Deviance Table Model 1: y.net ~ lgdp2 + fse2 + gedy2 Model 2: y.net ~ lgdp2 + fse2 Df Resid Df F value Pr(F) 1 1 157 3.9532 0.04852 * --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 [3] LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C [5] LC_TIME=Hebrew_Israel.1255 attached base packages: [1] splines stats graphics grDevices utils datasets methods base other attached packages: [1] rms_3.3-1Hmisc_3.8-3 survival_2.36-9 [4] colorspace_1.1-0 quantreg_4.71 SparseM_0.89 [7] PerformanceAnalytics_1.0.3.2 xts_0.8-2zoo_1.7-4 [10] reporttools_1.0.6xtable_1.5-6 loaded via a namespace (and not attached): [1] cluster_1.14.0 grid_2.13.1 lattice_0.19-33 tools_2.13.1 Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Eik Vettorazzi Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Titles changing when a plot is redrawn
Thank you for telling me a fix. But I still don't know if this behavior is what is intended. I used bquote(...) because the plotmath(...) help page refers to bquote and gives an example like this. I suspect most users will be baffled by this kind of behavior, especially since it does not occur when there is one plot. By this I mean that I can draw one plot and title it with the same string using bquote( ). If I change the value of i, and redraw the graph, the redrawn graph has the original value of i in the title, not the updated value. So in this case, an unevaluated expression is not re-evaluated at draw time? John -xieyi...@gmail.com wrote: - To: John Nolan jpno...@american.edu From: Yihui Xie Sent by: xieyi...@gmail.com Date: 10/05/2011 11:49PM Cc: r-help@r-project.org Subject: Re: [R] Titles changing when a plot is redrawn I think the problem is your str1 is an unevaluated expression and will change with the value of i. You should be able to get a fixed title by this: par(mfrow = c(2, 1)) for (i in 1:2) { x - 1:100 rmse - sin(x/5) # fake data plot(x, rmse, main = substitute(list(RMSE(theta), i == z), list(z = i))) } Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Oct 5, 2011 at 10:01 PM, John Nolan jpno...@american.edu wrote: I ran into a problem with titles on graphs. I wanted a graph with multiple subplots, with each having a title that involved both a Greek letter and an identifier for each graph. Below is a simplified version of code to do this. The graph appears fine, with the first graph having i=1 in the title, and the second graph having i=2 in the title. However, when I resize the graph, the plot titles change, with both showing i=2. The titles also change when I save the plot to a file using the File menu, then Save as in Windows. Is this what should happen? I always thought that titles are static once the graph is drawn, and couldn't change. The problem occurs on some version of R, but not on others. It does occur with the latest version of R: str(R.Version()) List of 13 $ platform : chr i386-pc-mingw32 $ arch : chr i386 $ os: chr mingw32 $ system: chr i386, mingw32 $ status: chr $ major : chr 2 $ minor : chr 13.2 $ year : chr 2011 $ month : chr 09 $ day : chr 30 $ svn rev : chr 57111 $ language : chr R $ version.string: chr R version 2.13.2 (2011-09-30) The problem also occurs on: R 2.13.0 on Win32 and Mac (R 2.12.0, x86_64-apple-darwin9.8.0) The problem DOES NOT occur under R 2.10.0 on Win32. If the code below is bracketed with pdf(test.pdf) and dev.off(), the correct labels appear in the file. This behavior doesn't seem to appear if there is only one plot. My guess is that the titles are being reevaluated when the plot is redrawn, and since the value of i is 2 when the redraw occurs, both labels get set to i=2. I guess Save as forces a redraw because a dialog box pops up? If could be that this behavior is what is intended, and that somewhere between R 2.10.0 and R 2.13.2 an old bug was fixed. Or this behavior is not what was intended, and a bug was introduced. If the former, this should be explained to the user somewhere. If the latter, can someone track it down and fix? John Nolan #- par(mfrow=c(2,1)) for (i in 1:2) { x - 1:100 rmse - sin(x/5) # fake data plot(x,rmse) str1 - bquote( paste(RMSE(,theta,), ,i==.(i) )) title( str1 ) } #- ... John P. Nolan Math/Stat Department 227 Gray Hall American University 4400 Massachusetts Avenue, NW Washington, DC 20016-8050 jpno...@american.edu 202.885.3140 voice 202.885.3155 fax http://academic2.american.edu/~jpnolan ... __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with .C
An obvious reason might be that your second argument should be a pointer to int. As others have mentioned, you might want to have a look at Rccp and/or inline. The documentation is good and I find it much easier to work with. For example, your example could be written as: library(Rcpp) library(inline) test - cxxfunction(signature(x = numeric ) , ' Rcpp::NumericVector v(x); Rcpp::NumericVector result(v.length()); for (int i = 0; i v.length(); ++i) { result[i] = v[i] + i; } return(result); ', plugin = Rcpp ) HTH, Jan Quoting Grigory Alexandrovich alexandrov...@mathematik.uni-marburg.de: Hello, first thank you for your answers. I did not read the whole pdf Writing R Extension, but I read this strongly shortened introduction to this subject: http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf I get the same error with this C-function: void test(double * b, int l) { int i; for(i=0; i l ; i++) b[i] +=i; } I call it from R like this: parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3); .C(test, as.double(parameter), as.integer(11)) The programm crashes even in this simple case. Where can be the error? Thanks Grigory Alexandrovich Answer 1 Without knowing that C code, we cannot know. Have you read Writing R Extensions carefully? I.e. take care with memory allocation and printing as mentioned in the manual. Uwe Ligges Answer 2 This looks like a classic case of not reading the manual, and then compounding it by not reading the posting guide. The manual would be the Writing R Extensions pdf that comes with R or you can google it. The posting guide is referenced at the bottom of this and every other posting on this mailing list. There are nearly an infinite variety of errors that can lead to a crash, so it is really unreasonable of you to pose this question this way and expect constructive assistance. --- Jeff Newmiller The . . Go Live... DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Answer 3 It's impossible to say, with such minimal information, but a reasonable guess is that there is a problem with the declaration of x and y in foo.c. These would (I think) need to be declared as double *, not double, when foo is called from .C(). cheers, Rolf Turner Answer 4 Hi, As other have said, it's very difficult to help you without an example + code to know what you are talking about. That having been said, it seems as if you are just getting your feet wet in this R -- C bridge, and I'd recommend you checkout the Rcpp and inline package to help make your life a lot easier ... -steve On 04.10.2011 14:04, Grigory Alexandrovich wrote: Hello, I wrote a function in C, which works fine if called from the main-function in C. But as soon as I try to call this function from R like .C('foo', as.double(x), as.integer(y)), the programm crashes. I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and loaded it into R with dyn.load(). What can be the cause of such behaviour? Again, the C-funcion itself works, but not if called from R. Thanks Grigory Alexandrovich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Running *slow*
Anyone got any hints on how to make this code more efficient? An early version (which to be fair did more than this one is) ran for 330 hours and produced no output. I have a two column table, Dat, with 12,000,000 rows and I want to produce a lookup table, ltable, in a 1 dimensional matrix with one copy of each of the values in Dat: for (i in 1:nrow(Dat)) { for (j in 1:2) { #If next value is already in ltable, do nothing if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])} } } but it takes forever to produce anything. Any advice gratefully received. Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fwd: Re: Party extract BinaryTree from cforest?
-- Forwarded message -- Date: Wed, 5 Oct 2011 21:09:41 + From: Chris christopher.a.h...@gmail.com To: r-h...@stat.math.ethz.ch Subject: Re: [R] Party extract BinaryTree from cforest? I found an internal workaround to this to support printing and plot type simple, tt-party:::prettytree(cf at ensemble[[1]], names(cf at data at get(input))) npt - new(BinaryTree) npt@tree-tt plot(npt) Error in terminal_panel(S4 object of class BinaryTree) : âctreeobjâ is not a regression tree library(party) cf - cforest(Species ~ ., data = iris) pt - party:::prettytree(cf@ensemble[[1]], names(cf@data@get(input))) pt nt - new(BinaryTree) nt@tree - pt nt@data - cf@data nt@responses - cf@responses nt plot(nt) will do. Torsten plot(npt, type=simple) Any additional help is appreciated. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error messages
Joyce Flynt dr.joyceflynt at yahoo.com writes: Im doing a project using Rcommander. I have a dataset with 5 variables, BMI being one of them. I would like to find the variance, standard deviation, range, median, etc of BMI. However, everytime I type a command into the script window, (such as var(BMI$BMI,na.rm=TRUE) ) an error message pops up saying Object BMI not found. I can graph it, work it, do t-test, etc,. It's listed as one of the variables in the data set. So I don't understand why it cant find BMI. Thanks It's hard to solve without a *reproducible* example. Is your data set really called 'BMI'? (i.e. should you be saying BMI$BMI, or just BMI (if your data frame is attached), or mydata$BMI? If your data set *is* called BMI, does it make a difference if you rename it to e.g. BMIdat? What are the results of ls() ? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with .C
On 06.10.2011 14:51, Jan van der Laan wrote: An obvious reason might be that your second argument should be a pointer to int. As others have mentioned, you might want to have a look at Rccp and/or inline. The documentation is good and I find it much easier to work with. For example, your example could be written as: library(Rcpp) library(inline) test - cxxfunction(signature(x = numeric ) , ' Rcpp::NumericVector v(x); Rcpp::NumericVector result(v.length()); for (int i = 0; i v.length(); ++i) { result[i] = v[i] + i; } return(result); ', plugin = Rcpp ) Oh, come on, this is now really too much of overkill. Just make the original source void test(double *b, int *l) { int i; for(i=0; i *l ; i++) b[i] += i; } which you would have know after reading the Wriiting R Extensions manual. Best, Uwe Ligges HTH, Jan Quoting Grigory Alexandrovich alexandrov...@mathematik.uni-marburg.de: Hello, first thank you for your answers. I did not read the whole pdf Writing R Extension, but I read this strongly shortened introduction to this subject: http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf I get the same error with this C-function: void test(double * b, int l) { int i; for(i=0; i l ; i++) b[i] +=i; } I call it from R like this: parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3); .C(test, as.double(parameter), as.integer(11)) The programm crashes even in this simple case. Where can be the error? Thanks Grigory Alexandrovich Answer 1 Without knowing that C code, we cannot know. Have you read Writing R Extensions carefully? I.e. take care with memory allocation and printing as mentioned in the manual. Uwe Ligges Answer 2 This looks like a classic case of not reading the manual, and then compounding it by not reading the posting guide. The manual would be the Writing R Extensions pdf that comes with R or you can google it. The posting guide is referenced at the bottom of this and every other posting on this mailing list. There are nearly an infinite variety of errors that can lead to a crash, so it is really unreasonable of you to pose this question this way and expect constructive assistance. --- Jeff Newmiller The . . Go Live... DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Answer 3 It's impossible to say, with such minimal information, but a reasonable guess is that there is a problem with the declaration of x and y in foo.c. These would (I think) need to be declared as double *, not double, when foo is called from .C(). cheers, Rolf Turner Answer 4 Hi, As other have said, it's very difficult to help you without an example + code to know what you are talking about. That having been said, it seems as if you are just getting your feet wet in this R -- C bridge, and I'd recommend you checkout the Rcpp and inline package to help make your life a lot easier ... -steve On 04.10.2011 14:04, Grigory Alexandrovich wrote: Hello, I wrote a function in C, which works fine if called from the main-function in C. But as soon as I try to call this function from R like .C('foo', as.double(x), as.integer(y)), the programm crashes. I created a dll with the cmd command R --arch x64 CMD SHLIB foo.c and loaded it into R with dyn.load(). What can be the cause of such behaviour? Again, the C-funcion itself works, but not if called from R. Thanks Grigory Alexandrovich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to install a locally built package
Is there a way to put all R code in a single file? I have too many small files now, and it is inconvenient to edit (I still have to put everything in one buffer) and when I add just one new func I have to go through the process of manually editing all help files one by one. When I put all the code in one file only the first func is loaded. Thanks everybody Stephen B -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Friday, September 16, 2011 3:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 20:38, Bond, Stephen wrote: I got it working by typing a string in getdfv.Rd after \title{ \title{ getdf %% ~~function to do ... ~~ } Strange why the skeleton would not do that given it did \name{getdfv} \alias{getdfv} One reason is that we try to force people to deal with the documentation which we feel is very important, even if the stuff is just used by yourself. Best, Uwe Ligges Anyway, I'm happy now. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 12:15 PM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:47, Bond, Stephen wrote: Uwe, That gave me the same error like CMD install install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) Loading required package: stats Loading required package: utils Loading required package: graphics Loading required package: splines * installing *source* package 'mypack' ... ** R ** preparing package for lazy loading ** help Warning: C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypac k/man/mypack-package.Rd:33: All text must be in a section *** installing help indices Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title. See chapter 'Writing R documentation' in manual 'Writing R Extensions'. * removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack' * restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack' Warning message: In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = source) : installation of package 'C:/Temp/mypack_1.0.tar.gz' had non-zero exit status is there a way to skip the Rd part? This is for private use only and there is no help or data files. Then you have to delete the Rd file that has been generated by package.skeleton. Please read the manuals Writing R Extensions and R Installation and Administration. Best, Uwe Ligges Thank you. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 11:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:32, Bond, Stephen wrote: Hello useRs, I am trying to put my funcs in a package to avoid clutter in my workspace as the funcs are now in .Rprofile. All plain R code no compilations. Using win XP with a full cygwin install and R2.12 I first did package.skeleton(mypack,list=getdfv, namespace=T) # just a single func which created a folder with the required stuff in it. Calling R CMD build creates a tar.gz file with warnings about DOS paths. Trying to install from the R GUI complains install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz) Warning: unable to access index for repository C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12 Warning message: In getDependencies(pkgs, dependencies, available, lib) : package 'mypack' is not available install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) seems to be what you want. Uwe Ligges putting the mypack folder in zip and then trying to install from the zip produces no error message but then library(mypack) Error in library(mypack) : 'mypack' is not a valid installed package Just putting the folder in the library folder of R did not work either Finally R CMD Install complains about missing title in the Rd file. I do not have a Rd file as the skeleton did not create it. Please, advise how to make simple R code available to be called without showing in ls() Thank you. Stephen Bond [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to install a locally built package
On 06.10.2011 15:10, Bond, Stephen wrote: Is there a way to put all R code in a single file? I have too many small files now, and it is inconvenient to edit (I still have to put everything in one buffer) and when I add just one new func I have to go through the process of manually editing all help files one by one. When I put all the code in one file only the first func is loaded. He? You really do not need to edit anything when you add a function. Just don't use package skeleton but just add the function. Uwe Ligges Thanks everybody Stephen B -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Friday, September 16, 2011 3:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 20:38, Bond, Stephen wrote: I got it working by typing a string in getdfv.Rd after \title{ \title{ getdf %% ~~function to do ... ~~ } Strange why the skeleton would not do that given it did \name{getdfv} \alias{getdfv} One reason is that we try to force people to deal with the documentation which we feel is very important, even if the stuff is just used by yourself. Best, Uwe Ligges Anyway, I'm happy now. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 12:15 PM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:47, Bond, Stephen wrote: Uwe, That gave me the same error like CMD install install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) Loading required package: stats Loading required package: utils Loading required package: graphics Loading required package: splines * installing *source* package 'mypack' ... ** R ** preparing package for lazy loading ** help Warning: C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypac k/man/mypack-package.Rd:33: All text must be in a section *** installing help indices Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title. See chapter 'Writing R documentation' in manual 'Writing R Extensions'. * removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack' * restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack' Warning message: In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = source) : installation of package 'C:/Temp/mypack_1.0.tar.gz' had non-zero exit status is there a way to skip the Rd part? This is for private use only and there is no help or data files. Then you have to delete the Rd file that has been generated by package.skeleton. Please read the manuals Writing R Extensions and R Installation and Administration. Best, Uwe Ligges Thank you. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 11:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:32, Bond, Stephen wrote: Hello useRs, I am trying to put my funcs in a package to avoid clutter in my workspace as the funcs are now in .Rprofile. All plain R code no compilations. Using win XP with a full cygwin install and R2.12 I first did package.skeleton(mypack,list=getdfv, namespace=T) # just a single func which created a folder with the required stuff in it. Calling R CMD build creates a tar.gz file with warnings about DOS paths. Trying to install from the R GUI complains install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz) Warning: unable to access index for repository C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12 Warning message: In getDependencies(pkgs, dependencies, available, lib) : package 'mypack' is not available install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) seems to be what you want. Uwe Ligges putting the mypack folder in zip and then trying to install from the zip produces no error message but then library(mypack) Error in library(mypack) : 'mypack' is not a valid installed package Just putting the folder in the library folder of R did not work either Finally R CMD Install complains about missing title in the Rd file. I do not have a Rd file as the skeleton did not create it. Please, advise how to make simple R code available to be called without showing in ls() Thank you. Stephen Bond [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] Running *slow*
?unique x - matrix(c(1:6, 6:1),ncol=2) x.temp - x dim(x.temp) - NULL unique(x.temp) Michael On Thu, Oct 6, 2011 at 8:37 AM, Thomas chesney@gmail.com wrote: Anyone got any hints on how to make this code more efficient? An early version (which to be fair did more than this one is) ran for 330 hours and produced no output. I have a two column table, Dat, with 12,000,000 rows and I want to produce a lookup table, ltable, in a 1 dimensional matrix with one copy of each of the values in Dat: for (i in 1:nrow(Dat)) { for (j in 1:2) { #If next value is already in ltable, do nothing if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])} } } but it takes forever to produce anything. Any advice gratefully received. Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] break.axis all range of data
Thank you Jim It seems exactly what I was looking for :) Claudio On Thu, Oct 6, 2011 at 5:11 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/06/2011 12:31 AM, Heverkuhn Heverkuhn wrote: ..all the point from 8 to 13. On Wed, Oct 5, 2011 at 8:28 AM, Heverkuhn Heverkuhnheverk...@gmail.com* *wrote: The problem with that function is that it does not really separate the 2parts of the graph but it inserts , when style is gap, a blank strip that cover axis and points. So for example if a insert it at 8 and I set the gap of length 5 , it would cancel al the point from 8 to 10. ... I would like to increase the distance between x tick-marks 8 and 9, and not connect the points x=8 and x=9. Ah, I think I see what you want. You want an axis like this: axis(1,at=c(1:8,10:37),labels=**1:36) and to get your points right, you would have to do something like: plot(c(1:8,10:37),1:36,xaxt=**n) lines(1:8,1:8) lines(10:37,9:36) first. This is more or less the reverse of the gap.* functions in the plotrix package. Jim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Usng MCMCpack, error is initial value in vmmin is not finite
On 05.10.2011 20:38, yiy83102 wrote: __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gamm: problems with corCAR1()
Karel V karel.viaene at ugent.be writes: Dear all, I’m analyzing this dataset containing biodiversity indices, measured over time (Week), and at various contaminant concentrations (Treatment). We have two replicates (Replicate) per treatment. I’m looking for the effects of time (Week) and contaminant concentration (Treatment) on diversity indices (e.g. richness). [snip snip snip snip] You might do better with this question on the r-sig-mixed-models mailing list (r-sig-mixed-mod...@r-project.org) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-SIG-Finance] AsOf join in R
On Thu, Oct 6, 2011 at 2:15 AM, Ulrich Staudinger ustaudin...@gmail.com wrote: A bit late, but here is what I always do: m = merge(bid, ask, tick) m- interpNA(m, method=before) intrepNA can also interpolate NAs in different ways, for example linearly. Its not clear precisely what sort of objects bid, ask and tick were intended to be in your example but none of the tries below produce 1.1 for Sep 9th as asked for: library(xts) library(timeSeries) A - xts(c(10,15,20,25), + order.by=as.POSIXct(c(2011-09-01,2011-09-09,2011-09-10,2011-09-15))) B - xts(c(1.1,1.5,1.3,1.7), + order.by=as.POSIXct(c(2011-08-31,2011-09-09,2011-09-11,2011-09-12))) #1 interpNA(merge(A, B), method = before) A B 2011-08-31 NA 1.1 2011-09-01 10 1.1 2011-09-09 15 1.5 2011-09-10 20 1.5 2011-09-11 20 1.3 2011-09-12 20 1.7 2011-09-15 25 NA #2 interpNA(merge(as.timeSeries(A), as.timeSeries(B)), method = before) GMT TS.1 2011-08-31 1.1 2011-09-01 10.0 2011-09-09 1.5 2011-09-09 15.0 2011-09-10 20.0 2011-09-11 1.3 2011-09-12 1.7 2011-09-15 25.0 #3 interpNA(as.timeSeries(merge(A, B)), method = before) GMT A B 2011-08-31 NA 1.1 2011-09-01 10 1.1 2011-09-09 15 1.5 2011-09-10 20 1.5 2011-09-11 20 1.3 2011-09-12 20 1.7 2011-09-15 25 NA interpNA seems to use approx underneath so its functionality is similar to na.approx in zoo/xts but with different defaults. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Wilcox Test / Mann Whitney U Test
Hello List, I'm trying to prepare some lecture notes on non parametric methods, and I can't manually reproduce the results of the wilcox.test function for ordinal data. The data I'm using are from David Howell's website, available here http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html If I run the wilcox.test function on the data I get a p-value of .0407, but when I do it myself I get a p-value of 0.0530. It's not so much the jump across 0.05, but the fact that I thought I knew what the function was doing. I know from the R help page that there is some controversy about how exactly to calculate the test statistic, but that's not what is causing the problem, as I can get the same W value. Am I calculating the test statistic incorrectly? Thanks, sample code below Sam Stewart #Ordinal example dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6)) remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3)) tab2 = rbind(table(dropouts),table(remain)) ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE) cumsum(colSums(tab2)) W = max(c(sum(rank(cbind(dropouts,remain))[1:length(dropouts)]),sum(rank(cbind(dropouts,remain))[-(1:length(dropouts))]))) n1 = length(dropouts) n2 = length(remain) testStat = (S-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12)) 2*(1-pnorm(testStat)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with .C
Hi, 2011/10/6 Uwe Ligges lig...@statistik.tu-dortmund.de: On 06.10.2011 14:51, Jan van der Laan wrote: An obvious reason might be that your second argument should be a pointer to int. As others have mentioned, you might want to have a look at Rccp and/or inline. The documentation is good and I find it much easier to work with. For example, your example could be written as: library(Rcpp) library(inline) test - cxxfunction(signature(x = numeric ) , ' Rcpp::NumericVector v(x); Rcpp::NumericVector result(v.length()); for (int i = 0; i v.length(); ++i) { result[i] = v[i] + i; } return(result); ', plugin = Rcpp ) Oh, come on, this is now really too much of overkill. I don't agree that it's overkill -- you get to sidestep the whole `R CMD SHLIB ...` and `dyn.load` dance this way while you experiment with C(++) code 'live using the inline package. It's really handy. Just make the original source void test(double *b, int *l) { int i; for(i=0; i *l ; i++) b[i] += i; } which you would have know after reading the Wriiting R Extensions manual. I agree that this step is unavoidable no matter which avenue (Rcpp or otherwise) one decides to take. -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to install a locally built package
Uwe, Are u saying 1) I can add the new func in one of the existing .R files in the R dir?? Or 2) add a new .R file in the same dir and ignore the lack of a matching .Rd file? Thank you. Stephen Bond | Senior Analyst | Treasury Analytics | 416-956-3092 -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, October 06, 2011 9:12 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 06.10.2011 15:10, Bond, Stephen wrote: Is there a way to put all R code in a single file? I have too many small files now, and it is inconvenient to edit (I still have to put everything in one buffer) and when I add just one new func I have to go through the process of manually editing all help files one by one. When I put all the code in one file only the first func is loaded. He? You really do not need to edit anything when you add a function. Just don't use package skeleton but just add the function. Uwe Ligges Thanks everybody Stephen B -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Friday, September 16, 2011 3:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 20:38, Bond, Stephen wrote: I got it working by typing a string in getdfv.Rd after \title{ \title{ getdf %% ~~function to do ... ~~ } Strange why the skeleton would not do that given it did \name{getdfv} \alias{getdfv} One reason is that we try to force people to deal with the documentation which we feel is very important, even if the stuff is just used by yourself. Best, Uwe Ligges Anyway, I'm happy now. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 12:15 PM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:47, Bond, Stephen wrote: Uwe, That gave me the same error like CMD install install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) Loading required package: stats Loading required package: utils Loading required package: graphics Loading required package: splines * installing *source* package 'mypack' ... ** R ** preparing package for lazy loading ** help Warning: C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypa c k/man/mypack-package.Rd:33: All text must be in a section *** installing help indices Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title. See chapter 'Writing R documentation' in manual 'Writing R Extensions'. * removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack' * restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack' Warning message: In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = source) : installation of package 'C:/Temp/mypack_1.0.tar.gz' had non-zero exit status is there a way to skip the Rd part? This is for private use only and there is no help or data files. Then you have to delete the Rd file that has been generated by package.skeleton. Please read the manuals Writing R Extensions and R Installation and Administration. Best, Uwe Ligges Thank you. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 11:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:32, Bond, Stephen wrote: Hello useRs, I am trying to put my funcs in a package to avoid clutter in my workspace as the funcs are now in .Rprofile. All plain R code no compilations. Using win XP with a full cygwin install and R2.12 I first did package.skeleton(mypack,list=getdfv, namespace=T) # just a single func which created a folder with the required stuff in it. Calling R CMD build creates a tar.gz file with warnings about DOS paths. Trying to install from the R GUI complains install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz) Warning: unable to access index for repository C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12 Warning message: In getDependencies(pkgs, dependencies, available, lib) : package 'mypack' is not available install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) seems to be what you want. Uwe Ligges putting the mypack folder in zip and then trying to install from the zip produces no error message but then library(mypack) Error in library(mypack) : 'mypack' is not a valid installed package Just putting the folder in the library folder of R did not work either Finally R CMD Install complains about missing title in the Rd file. I do not have a Rd file as the skeleton did not create it. Please, advise how to make simple R
Re: [R] how to install a locally built package
On 06.10.2011 15:45, Bond, Stephen wrote: Uwe, Are u saying 1) I can add the new func in one of the existing .R files in the R dir?? Yes, sure, or add an additional file. Or 2) add a new .R file in the same dir and ignore the lack of a matching .Rd file? Yes. You can also generate a single Rd file using prompt() on the function. Uwe Ligges Thank you. Stephen Bond | Senior Analyst | Treasury Analytics | 416-956-3092 -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, October 06, 2011 9:12 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 06.10.2011 15:10, Bond, Stephen wrote: Is there a way to put all R code in a single file? I have too many small files now, and it is inconvenient to edit (I still have to put everything in one buffer) and when I add just one new func I have to go through the process of manually editing all help files one by one. When I put all the code in one file only the first func is loaded. He? You really do not need to edit anything when you add a function. Just don't use package skeleton but just add the function. Uwe Ligges Thanks everybody Stephen B -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Friday, September 16, 2011 3:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 20:38, Bond, Stephen wrote: I got it working by typing a string in getdfv.Rd after \title{ \title{ getdf %% ~~function to do ... ~~ } Strange why the skeleton would not do that given it did \name{getdfv} \alias{getdfv} One reason is that we try to force people to deal with the documentation which we feel is very important, even if the stuff is just used by yourself. Best, Uwe Ligges Anyway, I'm happy now. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 12:15 PM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:47, Bond, Stephen wrote: Uwe, That gave me the same error like CMD install install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) Loading required package: stats Loading required package: utils Loading required package: graphics Loading required package: splines * installing *source* package 'mypack' ... ** R ** preparing package for lazy loading ** help Warning: C:/DOCUME~1/BondStep/LOCALS~1/Temp/Rtmpw5N7dm/R.INSTALL6ccc14e6/mypa c k/man/mypack-package.Rd:33: All text must be in a section *** installing help indices Error in Rd_info(db[[i]]) : Rd files must have a non-empty \title. See chapter 'Writing R documentation' in manual 'Writing R Extensions'. * removing 'c:/PROGRA~1/R/R-212~1.1/library/mypack' * restoring previous 'c:/PROGRA~1/R/R-212~1.1/library/mypack' Warning message: In install.packages(C:/Temp/mypack_1.0.tar.gz, repos = NULL, type = source) : installation of package 'C:/Temp/mypack_1.0.tar.gz' had non-zero exit status is there a way to skip the Rd part? This is for private use only and there is no help or data files. Then you have to delete the Rd file that has been generated by package.skeleton. Please read the manuals Writing R Extensions and R Installation and Administration. Best, Uwe Ligges Thank you. Stephen Bond -Original Message- From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] Sent: Thursday, September 15, 2011 11:43 AM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] how to install a locally built package On 15.09.2011 17:32, Bond, Stephen wrote: Hello useRs, I am trying to put my funcs in a package to avoid clutter in my workspace as the funcs are now in .Rprofile. All plain R code no compilations. Using win XP with a full cygwin install and R2.12 I first did package.skeleton(mypack,list=getdfv, namespace=T) # just a single func which created a folder with the required stuff in it. Calling R CMD build creates a tar.gz file with warnings about DOS paths. Trying to install from the R GUI complains install.packages(mypack,repos=C:/Temp/mypack_1.0.tar.gz) Warning: unable to access index for repository C:/Temp/mypack_1.0.tar.gz/bin/windows/contrib/2.12 Warning message: In getDependencies(pkgs, dependencies, available, lib) : package 'mypack' is not available install.packages(C:/Temp/mypack_1.0.tar.gz, repos=NULL, type=source) seems to be what you want. Uwe Ligges putting the mypack folder in zip and then trying to install from the zip produces no error message but then library(mypack) Error in library(mypack) : 'mypack' is not a valid installed package Just putting the folder in the library folder of R did not work either Finally R CMD Install complains about missing title in the Rd file. I do not have a Rd
Re: [R] Problem with .C
On 06.10.2011 15:41, Steve Lianoglou wrote: Hi, 2011/10/6 Uwe Liggeslig...@statistik.tu-dortmund.de: On 06.10.2011 14:51, Jan van der Laan wrote: An obvious reason might be that your second argument should be a pointer to int. As others have mentioned, you might want to have a look at Rccp and/or inline. The documentation is good and I find it much easier to work with. For example, your example could be written as: library(Rcpp) library(inline) test- cxxfunction(signature(x = numeric ) , ' Rcpp::NumericVector v(x); Rcpp::NumericVector result(v.length()); for (int i = 0; i v.length(); ++i) { result[i] = v[i] + i; } return(result); ', plugin = Rcpp ) Oh, come on, this is now really too much of overkill. I don't agree that it's overkill -- you get to sidestep the whole `R CMD SHLIB ...` and `dyn.load` dance this way while you experiment with C(++) code 'live using the inline package. You need two additional packages now where you have to rely on the fact those are available. Moreover, you have to get used to that syntax, and part of it seems to be C++ now? At least I do not know why the above should work at all, while I know the simple C function does. Uwe It's really handy. Just make the original source void test(double *b, int *l) { int i; for(i=0; i *l ; i++) b[i] += i; } which you would have know after reading the Wriiting R Extensions manual. I agree that this step is unavoidable no matter which avenue (Rcpp or otherwise) one decides to take. -steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wilcox Test / Mann Whitney U Test
So I checked it with the wilcox_test in the coin library, and got the same result. That makes me more confident that I made a mistake, but still doesn't help me find it d = data.frame(value=c(dropouts,remain),group=c(rep(dropout,length(dropouts)),rep(remain,length(remain wilcox_test(value~group,data=d) Sam On Thu, Oct 6, 2011 at 10:35 AM, Sam Stewart rhelp.st...@gmail.com wrote: Hello List, I'm trying to prepare some lecture notes on non parametric methods, and I can't manually reproduce the results of the wilcox.test function for ordinal data. The data I'm using are from David Howell's website, available here http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html If I run the wilcox.test function on the data I get a p-value of .0407, but when I do it myself I get a p-value of 0.0530. It's not so much the jump across 0.05, but the fact that I thought I knew what the function was doing. I know from the R help page that there is some controversy about how exactly to calculate the test statistic, but that's not what is causing the problem, as I can get the same W value. Am I calculating the test statistic incorrectly? Thanks, sample code below Sam Stewart #Ordinal example dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6)) remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3)) tab2 = rbind(table(dropouts),table(remain)) ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE) cumsum(colSums(tab2)) W = max(c(sum(rank(cbind(dropouts,remain))[1:length(dropouts)]),sum(rank(cbind(dropouts,remain))[-(1:length(dropouts))]))) n1 = length(dropouts) n2 = length(remain) testStat = (S-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12)) 2*(1-pnorm(testStat)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: break.axis all range of data
But in this way for moving also the point have to add 2 at all the x point after 8. axis(1,at=c(1:8,12:39),labels=1:36) plot(1:36,1:36,xaxt=n) I increased the gap, and if I don increase the x for the points at X 9, 10 they will appear in the gap. HC On Thu, Oct 6, 2011 at 8:20 AM, Heverkuhn Heverkuhn heverk...@gmail.comwrote: Thank you Jim It seems exactly what I was looking for :) Claudio On Thu, Oct 6, 2011 at 5:11 AM, Jim Lemon j...@bitwrit.com.au wrote: On 10/06/2011 12:31 AM, Heverkuhn Heverkuhn wrote: ..all the point from 8 to 13. On Wed, Oct 5, 2011 at 8:28 AM, Heverkuhn Heverkuhnheverk...@gmail.com **wrote: The problem with that function is that it does not really separate the 2parts of the graph but it inserts , when style is gap, a blank strip that cover axis and points. So for example if a insert it at 8 and I set the gap of length 5 , it would cancel al the point from 8 to 10. ... I would like to increase the distance between x tick-marks 8 and 9, and not connect the points x=8 and x=9. Ah, I think I see what you want. You want an axis like this: axis(1,at=c(1:8,10:37),labels=**1:36) and to get your points right, you would have to do something like: plot(c(1:8,10:37),1:36,xaxt=**n) lines(1:8,1:8) lines(10:37,9:36) first. This is more or less the reverse of the gap.* functions in the plotrix package. Jim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unique possible bug
On 05.10.2011 22:15, Patrick McCann wrote: Hi, I am trying to read in a rather large list of transactions using the arules library. You mean the arules package? It seems in the coerce method into the dgCmatrix, it somewhere calls unique. Unique.c throws an error when n 536870912; however, when 4*n was modified to 2*n in 2004, the overflow protection should have changed from 2^29 to 2^30, right? If so, how would I change it in my copy? Do I have to recompile everything? Yes. There is also the way to ask the maintainer to improve it, but it won't work without reinstallation of the changed package sources. Uwe Ligges Thanks, Patrick McCann Here is a simple to reproduce example: runif(2^29+5)-a sum(unique(a))-b Error in unique.default(a) : length 536870917 is too large for hashing traceback() 3: unique.default(a) 2: unique(a) 1: unique(a) unique.default function (x, incomparables = FALSE, fromLast = FALSE, ...) { z- .Internal(unique(x, incomparables, fromLast)) if (is.factor(x)) factor(z, levels = seq_len(nlevels(x)), labels = levels(x), ordered = is.ordered(x)) else if (inherits(x, POSIXct)) structure(z, class = class(x), tzone = attr(x, tzone)) else if (inherits(x, Date)) structure(z, class = class(x)) else z } environment: namespace:base From http://svn.r-project.org/R/trunk/src/main/unique.c I see: /* Choose M to be the smallest power of 2 not less than 2*n and set K = log2(M). Need K= 1 and hence M= 2, and 2^M= 2^31 -1, hence n= 2^29. Dec 2004: modified from 4*n to 2*n, since in the worst case we have a 50% full table, and that is still rather efficient -- see R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606. */ static void MKsetup(int n, HashData *d) { int n4 = 2 * n; if(n 0 || n 536870912) /* protect against overflow to -ve */ error(_(length %d is too large for hashing), n); d-M = 2; d-K = 1; while (d-M n4) { d-M *= 2; d-K += 1; } } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Wilcox Test / Mann Whitney U Test
And I figured it out, sorry to bother the list. The normal approximation I was using is not accurate in the presence of ties. Sam On Thu, Oct 6, 2011 at 10:56 AM, Sam Stewart rhelp.st...@gmail.com wrote: So I checked it with the wilcox_test in the coin library, and got the same result. That makes me more confident that I made a mistake, but still doesn't help me find it d = data.frame(value=c(dropouts,remain),group=c(rep(dropout,length(dropouts)),rep(remain,length(remain wilcox_test(value~group,data=d) Sam On Thu, Oct 6, 2011 at 10:35 AM, Sam Stewart rhelp.st...@gmail.com wrote: Hello List, I'm trying to prepare some lecture notes on non parametric methods, and I can't manually reproduce the results of the wilcox.test function for ordinal data. The data I'm using are from David Howell's website, available here http://www.uvm.edu/~dhowell/StatPages/More_Stuff/OrdinalChisq/OrdinalChiSq.html If I run the wilcox.test function on the data I get a p-value of .0407, but when I do it myself I get a p-value of 0.0530. It's not so much the jump across 0.05, but the fact that I thought I knew what the function was doing. I know from the R help page that there is some controversy about how exactly to calculate the test statistic, but that's not what is causing the problem, as I can get the same W value. Am I calculating the test statistic incorrectly? Thanks, sample code below Sam Stewart #Ordinal example dropouts = c(rep(0,25),rep(3,10),rep(2,9),rep(1,13),rep(4,6)) remain = c(rep(0,31),rep(3,2),rep(2,6),rep(1,21),rep(4,3)) tab2 = rbind(table(dropouts),table(remain)) ordTest = wilcox.test(x=dropouts,y=remain,correct=FALSE,exact=FALSE) cumsum(colSums(tab2)) W = max(c(sum(rank(cbind(dropouts,remain))[1:length(dropouts)]),sum(rank(cbind(dropouts,remain))[-(1:length(dropouts))]))) n1 = length(dropouts) n2 = length(remain) testStat = (S-n1*(n1+n2+1)/2)/(sqrt(n1*n2*(n1+n2+1)/12)) 2*(1-pnorm(testStat)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with wireframe graphics problem (newbie)
On 06.10.2011 00:50, David Wiley wrote: All, I've read several tutorials re: generating wireframes, but am clearly missing something. I have data along the lines of: tbl [1:10,] Visits Activity Course.Grade 1 17218.31 2 7 1120.67 3 9 1724.69 4 28 7138.72 5 43 10745.46 6 14547.77 7 25 5157.81 8 32 2759.46 9 60 4364.39 10 31 4666.26 Looks like you are after a 3D scatterplot rather than a wireframe - the latter only works on a regular grid of Vistits and Activity. Uwe ligges When I run the command: wireframe(Course.Grade ~ Visits * Activity, data = tbl) I get an appropriately labeled, but empty, cube. Can anyone tell me why there's no data in my cube? Thanks in advance, David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SPlus to R
I did some testing and I believe the program is operating properly. It takes some time to finish, especially as sample sizes get larger, but I seem to be able to reproduce the results from the original paper. Right now I'm most interested in method 3. I set nc=40 and d=.2 as in the paper. The results are below. The program iterates to find a solution and the more subjects (nc), the more iterations it needs. The initial pass uses estimates from the randomized method as starting values and this is plotted in black on a graph. The next pass uses the algorithm to generate sample size estimates which are plotted in gray on the first pass. Each new estimate is plotted in dark blue as the previous one turns black. This reults in several curves being plotted but one can get an idea of the relevant one by looking for the blue plot and examining the $ne vector below. Reading the graph is more troublesome as nc increases. It takes many more iterations to complete and the graph gets quite crowded. I'm not sure what the best solution is. Which would be easiest? 1) Plot ne against pc for the final iteration. The problem here is pc is not saved in a vector. 2) Cut and paste $ne into Excel and re-create the corresponding pc's. It's not clear to me how pc is incremented. Seems like it's in the code block below: ### for method 3 if (method==3) { if (tol1 tol2/10) tol1-tol2/10 ncstar-(1-d)*nc pc-(0:ncstar)/nc ne-rep(NA,ncstar + 1) for (i in (0:ncstar)) { ne[i+1]-ss.rand(i,nc,d,alpha=.05,power=.8,tol=.01) } plot(pc,ne,type='l',ylim=c(0,max(ne)*1.5)) ans-c.searchd(nc, d, ne, alpha, power, cc, tol1) sshc() [1] 0.8000 0.7889 0.7825 0.7786 0.7763 0.7751 0.7747 0.7750 0.7759 0.7772 [11] 0.7790 0.7815 0.7852 0.7899 0.7943 0.7968 0.7979 0.7989 0.8002 0.8021 [21] 0.8045 0.8072 0.8100 0.8123 0.8137 0.8137 0.8124 0.8105 0.8095 0.8103 [31] 0.8134 0.8180 0.8244 [1] 11.64 19.62 27.50 36.44 46.54 58.19 71.81 87.93 107.30 130.38 [11] 158.47 192.34 235.47 291.93 370.48 471.26 569.17 626.36 626.18 588.94 [21] 529.27 459.67 389.30 323.70 263.50 210.58 164.26 124.25 90.49 63.23 [31] 42.13 26.23 8.13 [1] old.abs.dev= 0.434571909427754 [1] abs.dev= 0.259398042835904 [1] 0.8000 0.8031 0.8063 0.8094 0.8120 0.8140 0.8152 0.8156 0.8152 0.8141 [11] 0.8122 0.8097 0.8073 0.8055 0.8045 0.8037 0.8025 0.8015 0.8006 0.7995 [21] 0.7982 0.7969 0.7956 0.7946 0.7939 0.7930 0.7916 0.7897 0.7879 0.7872 [31] 0.7876 0.7886 0.7887 [1] 11.64 20.91 30.51 42.13 56.19 73.71 95.59 123.29 159.20 204.83 [11] 263.66 337.90 424.41 520.76 623.55 727.31 804.98 810.60 733.27 621.58 [21] 504.83 397.60 310.92 242.78 190.93 151.72 120.81 95.24 72.67 52.84 [31] 36.01 22.42 6.32 [1] old.abs.dev= 0.259398042835904 [1] abs.dev= 0.160699574601494 [1] 0.8000 0.7985 0.7972 0.7958 0.7944 0.7930 0.7917 0.7907 0.7903 0.7907 [11] 0.7917 0.7929 0.7941 0.7951 0.7964 0.7977 0.7982 0.7984 0.7989 0.7995 [21] 0.8000 0.8006 0.8013 0.8023 0.8036 0.8048 0.8054 0.8055 0.8055 0.8063 [31] 0.8081 0.8103 0.8125 [1] 11.64 20.56 29.46 39.64 51.27 64.86 80.67 99.46 122.30 149.90 [11] 184.62 230.35 286.04 352.93 429.63 508.43 588.14 635.70 616.86 561.50 [21] 489.16 408.17 333.63 269.21 215.86 173.66 139.24 110.43 84.50 61.28 [31] 41.38 25.49 7.42 [1] old.abs.dev= 0.160699574601494 [1] abs.dev= 0.0893104146067729 [1] 0.8000 0.8006 0.8013 0.8021 0.8029 0.8036 0.8043 0.8048 0.8050 0.8051 [11] 0.8049 0.8045 0.8037 0.8024 0.8019 0.8019 0.8015 0.8012 0.8010 0.8009 [21] 0.8007 0.8002 0.7995 0.7990 0.7987 0.7985 0.7980 0.7970 0.7958 0.7950 [31] 0.7950 0.7950 0.7939 [1] 11.64 20.72 29.92 40.75 53.42 68.99 87.79 111.53 141.92 179.12 [11] 226.54 290.16 365.88 456.60 556.96 640.04 722.83 757.33 711.48 624.30 [21] 522.26 423.34 335.91 264.10 206.62 162.86 128.85 101.60 77.86 56.72 [31] 38.30 23.41 6.49 [1] old.abs.dev= 0.0893104146067729 [1] abs.dev= 0.0506515395975703 [1] 0.8000 0.7996 0.7994 0.7990 0.7986 0.7982 0.7978 0.7974 0.7970 0.7969 [11] 0.7970 0.7975 0.7981 0.7979 0.7982 0.7990 0.7991 0.7990 0.7990 0.7992 [21] 0.7994 0.7996 0.7996 0.7998 0.8002 0.8009 0.8015 0.8015 0.8014 0.8016 [31] 0.8025 0.8035 0.8041 [1] 11.64 20.65 29.71 40.22 52.31 66.93 84.02 104.94 131.25 161.98 [11] 200.06 251.89 310.71 386.59 476.45 545.33 622.06 664.45 636.92 570.09 [21] 486.12 402.82 326.45 262.37 208.03 165.78 131.88 104.87 81.05 59.38 [31] 40.18 24.54 6.99 $ne [1] 11.637555 20.651336 29.710258 40.223468 52.314568 66.928887 [7] 84.022673 104.936774 131.254491 161.981416 200.060785 251.890694 [13] 310.710449 386.591379 476.449601 545.327456 622.056417 664.447078 [19] 636.923065 570.091413 486.117837 402.817857 326.451672 262.370265 [25] 208.029592 165.777369 131.875077 104.873469 81.051410 59.379349 [31] 40.182140 24.544626 6.990464 $Ep [1] 0.741 0.7996387 0.7993605 0.7990384 0.7986411 0.7982104 0.7978227 [8] 0.7974117 0.7970255 0.7969174 0.7970263 0.7974569 0.7980592
Re: [R] distance coefficient for amatrix with ngative valus
Did you read any of the comments I made regarding working examples, meaningful question asking, or replying to the entire list? If you look at the code, you'll see pco is just a very elementary wrapper for cmdscale, the author of which is active on this list and could have seen your question and replied to it with a hundredfold more speed and knowledge than myself, ... had you replied to the entire list. Looking further into cmdscale, you can see that it is not designed to return variable loadings directly (to be honest, I'm not particularly familiar with PCO and I'm not sure that the method provides such loadings but I'm assuming you have reason to at least think they exist) so you'll have to calculate them directly. Read the documentation of cmdscale to find understand what the various things being returned are. If another function in the labsdv package seems to calculate them, perhaps you can pilfer some code from there. Michael On Tue, Oct 4, 2011 at 1:58 PM, dilshan benaragama benaraga...@yahoo.com wrote: Hi, As you mentioned I was able to run the pco ignoring the warning massege of negative values. The nest problem I have is how to get the loadings for each variable as it will not give the summary out put or loadings as we get for pca. Thanks. From: R. Michael Weylandt michael.weyla...@gmail.com To: dilshan benaragama benaraga...@yahoo.com; r-help r-help@r-project.org Sent: Monday, October 3, 2011 11:05:19 PM Subject: Re: [R] distance coefficient for amatrix with ngative valus Comments inline: On Mon, Oct 3, 2011 at 11:27 PM, dilshan benaragama benaraga...@yahoo.com wrote: Yes I think you did not get my problem. No, you did not state your problem. I have replied to everything you have actually included to this point. Admittedly, I have failed to reply to things you did not say... Actualy I want run PCO with (labdsv). To do that I I am trying to get the distance metrix using following fuctions with library (vegan). This is now the 7th email in this chain. You should mention the packages and functions you are using in the FIRST email of the chain. This is mentioned in the posting guide which you apparently have still not yet read. pca.gower- vegdist(envt[,2:9],method=gower) pca.eucl-vegdist(envt[,2:9],method=euclidean) pca.chi-vegdist(envt[,2:9],method=chi.square) pca.mahal-vegdist(envt[,2:9],method=mahal) pca.bray-vegdist(envt,method=bray) However none of the functions work They all work for any data I put in. This is perhaps when that minimal working example, which you also should have included, is necessary. The append at the end of each of the 7 emails in this chain that tells you to read the posting guide also asks for this, as did I explicitly. (gives an error saying that is not working due to negatve values) No, they each give warnings. Warnings are not errors. They are warnings and they say warning. Perhaps unsurprisingly, errors say error. If you are using an old version of vegan that throws an error, you should always update before seeking help.Not surprisingly, a certain document suggests this. except euclidean distance for the raw data set as the raw data has negative values for some variables. It is no point of using euclidean metrix with PCO as we can do the same thing from PCA. So I need to find a way I can run PCO with a different dissimilarity metrix for this data. It will be a great help if you can help me on this Actually read the warning message: it warns you that you have given negative data to an ecological function and suggests this might be a point you look into as this usually suggests a user-end problem. It does not fail to work in any sense of the word as evidence by the output of distances. If negative data is nonsense, you should heed this warning; if you know its not, disregard it. More importantly, as I said in my initial response, any distance metric worth its salt is translation invariant. To wit, x - matrix(rnorm(50),5) d1 = vegdist(x, method=gower) d2 = vegdist(x + abs(min(x))*3, method=gower) all.equal(as.numeric(d1), as.numeric(d2)) TRUE In fairness, I'll admit this does not seem to work for the bray distance. I am not an ecologist and I do not know why this would be -- it does leave me somewhat confused as to what sort of space motivates the bray metric, but that's a discussion for another time and place -- but the function still returns a valid dist object for both d1 and d2. Thanks, From: R. Michael Weylandt michael.weyla...@gmail.com To: dilshan benaragama benaraga...@yahoo.com; r-help r-help@r-project.org You will note that I include the r-help list on each email on this chain while you have not; this is mentioned in the posting guide. Sent: Monday, October 3, 2011 10:00:53 PM Subject: Re: [R] distance coefficient for amatrix with ngative valus You still haven't explained what's wrong with *almost every metric there is*, but if you want other distance
Re: [R] Dealing with proportions
On Oct 5, 2011, at 4:08 PM, Sam wrote: Dear list, I have very little experience in dealing with proportions, i am sure this is a very simple question Sometimes making that claim in a group of statisticians can provoke an extended discussion to which there will be no eventual agreement. Discussions about testing proportions in groups with small numbers is one such excellent example. but i could find no suitable answer beyond doing a chi-sq test and then using the Marascuilo procedure as a post-hoc analysis. I am simply wanting to know if the proportions ( i.e the number of Yes / No) significantly differ between the cases and if so which cases are significantly high or low? proportion - structure(list(Case = structure(1:11, .Label = c(A, B, C, D, E, F, G, H, I, J, K), class = factor), Yes = c(18L, 2L, 1L, 2L, 44L, 27L, 2L, 15L, 13L, 3L, 34L), No = c(171L, 11L, 5L, 8L, 146L, 80L, 5L, 30L, 22L, 5L, 42L), Num = c(189L, 13L, 6L, 10L, 190L, 107L, 7L, 45L, 35L, 8L, 76L)), .Names = c(Case, Yes, No, Num), class = data.frame, row.names = c(NA, -11L)) I would probably have turned to logistic regression and used p.adjust to deal with the multiple comparisons issues. By default, the Intercept p-value estimate for a simple model with a factor variable as the predictor tests for the Intercept = 0 tests for of a proportion to 0.50 (since that is the proportion at which the log(odds) = 0). It then tests for equality to that odds for all the other levels (the same as odds ratios being 1 or their coefficients=0). (The first way I set this up I ended up with your A- level being the reference and the hypotheses being tested didn't seem reasonable or interesting namely that the lowest proportion was equal to 50% and that all the rest of the proportions were equal to that lowest proportion.) as.matrix(proportion$Yes/proportion$Num) [,1] [1,] 0.0952381 [2,] 0.1538462 [3,] 0.167 [4,] 0.200 [5,] 0.2315789 [6,] 0.2523364 [7,] 0.2857143 [8,] 0.333 [9,] 0.3714286 [10,] 0.375 [11,] 0.4473684 So I took your data and made your K level the reference value since its value (0.44) is the one closest to 50% and then looked at the p- values from the others as information about their similarity to the K level. proportion$Case - relevel(proportion$Case, K) casefit - glm(cbind(Yes, Num) ~ Case, data= proportion, family=binomial) summary(casefit) summary(casefit)$coefficients p.adjust( summary(casefit)$coefficients[ , 4] ) That set of hypotheses gives very few significant results, but most of your groups are pretty sparse so that should be expected. Another (and I think better) approach resulting in even fewer significant results is to use logistic regression with an offset term equal to the log(odds) for the entire group: log( with(proportion, sum(Yes)/sum(No) ) ) # returns [1] -1.181994 This corresponds to an expected proportion of 0.2346939 casefit - glm(cbind(Yes, Num) ~ 0+ Case, data= proportion, offset=rep( log( with(proportion, sum(Yes)/ sum(No) ) ), 11), # each level gets the same offset. family=binomial) I used the 0+ on the RHS of the formula to force a labeled estimate for each level, since the p-values are now in reference to the overall proportion. Since only one level has a significant p-value and it is highly significant (due to the combination of higher numbers of events and distance from the NULL estimate) you will not see much change if you adjust its p-value. cbind( round( summary(casefit)$coefficients, 3), Prpns =round( (proportion$Yes/proportion$Num)[c(11,1:10)], 3), N_Yes=proportion$Yes[c(11,1:10)] ) Estimate Std. Error z value Pr(|z|) Prpns N_Yes CaseK0.378 0.206 1.8300.067 0.44734 CaseA -1.169 0.247 -4.7410.000 0.09518 CaseB -0.690 0.760 -0.9080.364 0.154 2 CaseC -0.610 1.080 -0.5650.572 0.167 1 CaseD -0.427 0.775 -0.5520.581 0.200 2 CaseE -0.281 0.167 -1.6790.093 0.23244 CaseF -0.195 0.215 -0.9050.365 0.25227 CaseG -0.071 0.802 -0.0880.930 0.286 2 CaseH0.083 0.298 0.2800.780 0.33315 CaseI0.192 0.325 0.5900.555 0.37113 CaseJ0.201 0.677 0.2970.766 0.375 3 Comparing the numbers and the observed proportions to the overall proportion (0.2346939) shows not many (only one actually) groups with strong evidence of being different. I think you need larger numbers. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Limitations of R
Dear all, I have a few binary files like 9Gb or even of 50Gb. I would like to ask you what are the known limits of the R for the data processing part, I have a system with a lot of RAM (500Gb) but I am not sure about the internal limitations of the R. How long for example a vector can be? Could you please inform me for these internal R limitations? I would like to thank you in advance for your help B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dealing with proportions
On Oct 6, 2011, at 10:33 AM, David Winsemius wrote: On Oct 5, 2011, at 4:08 PM, Sam wrote: Dear list, I have very little experience in dealing with proportions, i am sure this is a very simple question Sometimes making that claim in a group of statisticians can provoke an extended discussion to which there will be no eventual agreement. Discussions about testing proportions in groups with small numbers is one such excellent example. but i could find no suitable answer beyond doing a chi-sq test and then using the Marascuilo procedure as a post-hoc analysis. I am simply wanting to know if the proportions ( i.e the number of Yes / No) significantly differ between the cases and if so which cases are significantly high or low? proportion - structure(list(Case = structure(1:11, .Label = c(A, B, C, D, E, F, G, H, I, J, K), class = factor), Yes = c(18L, 2L, 1L, 2L, 44L, 27L, 2L, 15L, 13L, 3L, 34L), No = c(171L, 11L, 5L, 8L, 146L, 80L, 5L, 30L, 22L, 5L, 42L), Num = c(189L, 13L, 6L, 10L, 190L, 107L, 7L, 45L, 35L, 8L, 76L)), .Names = c(Case, Yes, No, Num), class = data.frame, row.names = c(NA, -11L)) I would probably have turned to logistic regression and used p.adjust to deal with the multiple comparisons issues. By default, the Intercept p-value estimate for a simple model with a factor variable as the predictor tests for the Intercept = 0 tests for of a proportion to 0.50 (since that is the proportion at which the log(odds) = 0). It then tests for equality to that odds for all the other levels (the same as odds ratios being 1 or their coefficients=0). (The first way I set this up I ended up with your A- level being the reference and the hypotheses being tested didn't seem reasonable or interesting namely that the lowest proportion was equal to 50% and that all the rest of the proportions were equal to that lowest proportion.) as.matrix(proportion$Yes/proportion$Num) [,1] [1,] 0.0952381 [2,] 0.1538462 [3,] 0.167 [4,] 0.200 [5,] 0.2315789 [6,] 0.2523364 [7,] 0.2857143 [8,] 0.333 [9,] 0.3714286 [10,] 0.375 [11,] 0.4473684 So I took your data and made your K level the reference value since its value (0.44) is the one closest to 50% and then looked at the p-values from the others as information about their similarity to the K level. proportion$Case - relevel(proportion$Case, K) casefit - glm(cbind(Yes, Num) ~ Case, data= proportion, family=binomial) Should have been : casefit - glm(cbind(Yes, No) ~ Case, data= proportion, family=binomial) summary(casefit) summary(casefit)$coefficients p.adjust( summary(casefit)$coefficients[ , 4] ) That set of hypotheses gives very few significant results, but most of your groups are pretty sparse so that should be expected. Another (and I think better) approach resulting in even fewer significant results is to use logistic regression with an offset term equal to the log(odds) for the entire group: log( with(proportion, sum(Yes)/sum(No) ) ) # returns [1] -1.181994 This corresponds to an expected proportion of 0.2346939 casefit - glm(cbind(Yes, Num) ~ 0+ Case, data= proportion, offset=rep( log( with(proportion, sum(Yes)/ sum(No) ) ), 11), # each level gets the same offset. family=binomial) This should have been: casefit - glm(cbind(Yes, No) ~ 0+ Case, data= proportion, offset=rep( log( with(proportion, sum(Yes)/sum(No) ) ), 11), family=binomial) I used the 0+ on the RHS of the formula to force a labeled estimate for each level, since the p-values are now in reference to the overall proportion. Since only one level has a significant p-value and it is highly significant (due to the combination of higher numbers of events and distance from the NULL estimate) you will not see much change if you adjust its p-value. New version of table : cbind(round( summary(casefit)$coefficients, 3), Prpns =round( (proportion$Yes/proportion$Num)[c(11,1:10)], 3), N_Yes=proportion$Yes[c(11,1:10)] ) Estimate Std. Error z value Pr(|z|) Prpns N_Yes CaseK0.971 0.231 4.2080.000 0.44734 CaseA -1.069 0.248 -4.3150.000 0.09518 CaseB -0.523 0.769 -0.6800.496 0.154 2 CaseC -0.427 1.095 -0.3900.696 0.167 1 CaseD -0.204 0.791 -0.2580.796 0.200 2 CaseE -0.017 0.172 -0.1010.919 0.23244 CaseF0.096 0.223 0.4300.667 0.25227 CaseG0.266 0.837 0.3180.751 0.286 2 CaseH0.489 0.316 1.5460.122 0.33315 CaseI0.656 0.350 1.8750.061 0.37113 CaseJ0.671 0.730 0.9190.358 0.375 3 Comparing the numbers and the observed proportions to the overall proportion (0.2346939) shows not many (only one actually) groups with strong evidence of being different. I think you need larger numbers. Now have two groups with significant differences. p.adjust(
Re: [R] unique possible bug
I see. For now: Yes, you need to change and recompile. I will take a look what was actually changed and will run some test cases. Best, Uwe On 06.10.2011 16:50, Patrick McCann wrote: The error I am referring to is in unique.c in Base R, it cannot accomodate greater than 2^29 values, even though it appears the overflow protection should be 2^30. The only relevance of the arules package is I was using it while I discovered this issue. Thanks, Patrick 2011/10/6 Uwe Liggeslig...@statistik.tu-dortmund.de: On 05.10.2011 22:15, Patrick McCann wrote: Hi, I am trying to read in a rather large list of transactions using the arules library. You mean the arules package? It seems in the coerce method into the dgCmatrix, it somewhere calls unique. Unique.c throws an error when n536870912; however, when 4*n was modified to 2*n in 2004, the overflow protection should have changed from 2^29 to 2^30, right? If so, how would I change it in my copy? Do I have to recompile everything? Yes. There is also the way to ask the maintainer to improve it, but it won't work without reinstallation of the changed package sources. Uwe Ligges Thanks, Patrick McCann Here is a simple to reproduce example: runif(2^29+5)-a sum(unique(a))-b Error in unique.default(a) : length 536870917 is too large for hashing traceback() 3: unique.default(a) 2: unique(a) 1: unique(a) unique.default function (x, incomparables = FALSE, fromLast = FALSE, ...) { z- .Internal(unique(x, incomparables, fromLast)) if (is.factor(x)) factor(z, levels = seq_len(nlevels(x)), labels = levels(x), ordered = is.ordered(x)) else if (inherits(x, POSIXct)) structure(z, class = class(x), tzone = attr(x, tzone)) else if (inherits(x, Date)) structure(z, class = class(x)) else z } environment: namespace:base From http://svn.r-project.org/R/trunk/src/main/unique.c I see: /* Choose M to be the smallest power of 2 not less than 2*n and set K = log2(M). Need K= 1 and hence M= 2, and 2^M= 2^31 -1, hence n= 2^29. Dec 2004: modified from 4*n to 2*n, since in the worst case we have a 50% full table, and that is still rather efficient -- see R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606. */ static void MKsetup(int n, HashData *d) { int n4 = 2 * n; if(n0 || n536870912) /* protect against overflow to -ve */ error(_(length %d is too large for hashing), n); d-M = 2; d-K = 1; while (d-Mn4) { d-M *= 2; d-K += 1; } } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] KS test and theoretical distribution
x - runif(100) y - runif(100) ks.test(x,y) Two-sample Kolmogorov-Smirnov test data: x and y D = 0.11, p-value = 0.5806 alternative hypothesis: two-sided ok I expected that, but: ks.test(runif(100), runif) One-sample Kolmogorov-Smirnov test data: runif(100) D = 0.9106, p-value 2.2e-16 alternative hypothesis: two-sided How come? -- View this message in context: http://r.789695.n4.nabble.com/KS-test-and-theoretical-distribution-tp3878640p3878640.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] KS test and theoretical distribution
On Thu, 6 Oct 2011, francogrex wrote: x - runif(100) y - runif(100) ks.test(x,y) Two-sample Kolmogorov-Smirnov test data: x and y D = 0.11, p-value = 0.5806 alternative hypothesis: two-sided ok I expected that, but: ks.test(runif(100), runif) One-sample Kolmogorov-Smirnov test data: runif(100) D = 0.9106, p-value 2.2e-16 alternative hypothesis: two-sided How come? Because you didn't read the docs, or didn't follow them, or mistyped. The docs say y: either a numeric vector of data values, or a character string naming a cumulative distribution function or an actual cumulative distribution function such as 'pnorm'. Only continuous CDFs are valid. So you would want ks.test(runif(100), punif) Best, Z -- View this message in context: http://r.789695.n4.nabble.com/KS-test-and-theoretical-distribution-tp3878640p3878640.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SPlus to R
I'm re-posting this since it did not appear at the end of the thread. Sorry for the inconvenience. Not sure why it's giving the message: An embedded and charset-unspecified text was scrubbed... As far as I know my replies are set up as text and not html. I did some testing and I believe the program is operating properly. It takes some time to finish, especially as sample sizes get larger, but I seem to be able to reproduce the results from the original paper. Right now I'm most interested in method 3. I set nc=40 and d=.2 as in the paper. The results are below. The program iterates to find a solution and the more subjects (nc), the more iterations it needs. The initial pass uses estimates from the randomized method as starting values and this is plotted in black on a graph. The next pass uses the algorithm to generate sample size estimates which are plotted in gray on the first pass. Each new estimate is plotted in dark blue as the previous one turns black. This reults in several curves being plotted but one can get an idea of the relevant one by looking for the blue plot and examining the $ne vector below. Reading the graph is more troublesome as nc increases. It takes many more iterations to complete and the graph gets quite crowded. I'm not sure what the best solution is. Which would be easiest? 1) Plot ne against pc for the final iteration. The problem here is pc is not saved in a vector. 2) Cut and paste $ne into Excel and re-create the corresponding pc's. It's not clear to me how pc is incremented. Seems like it's in the code block below: ### for method 3 if (method==3) { if (tol1 tol2/10) tol1-tol2/10 ncstar-(1-d)*nc pc-(0:ncstar)/nc ne-rep(NA,ncstar + 1) for (i in (0:ncstar)) { ne[i+1]-ss.rand(i,nc,d,alpha=.05,power=.8,tol=.01) } plot(pc,ne,type='l',ylim=c(0,max(ne)*1.5)) ans-c.searchd(nc, d, ne, alpha, power, cc, tol1) sshc() [1] 0.8000 0.7889 0.7825 0.7786 0.7763 0.7751 0.7747 0.7750 0.7759 0.7772 [11] 0.7790 0.7815 0.7852 0.7899 0.7943 0.7968 0.7979 0.7989 0.8002 0.8021 [21] 0.8045 0.8072 0.8100 0.8123 0.8137 0.8137 0.8124 0.8105 0.8095 0.8103 [31] 0.8134 0.8180 0.8244 [1] 11.64 19.62 27.50 36.44 46.54 58.19 71.81 87.93 107.30 130.38 [11] 158.47 192.34 235.47 291.93 370.48 471.26 569.17 626.36 626.18 588.94 [21] 529.27 459.67 389.30 323.70 263.50 210.58 164.26 124.25 90.49 63.23 [31] 42.13 26.23 8.13 [1] old.abs.dev= 0.434571909427754 [1] abs.dev= 0.259398042835904 [1] 0.8000 0.8031 0.8063 0.8094 0.8120 0.8140 0.8152 0.8156 0.8152 0.8141 [11] 0.8122 0.8097 0.8073 0.8055 0.8045 0.8037 0.8025 0.8015 0.8006 0.7995 [21] 0.7982 0.7969 0.7956 0.7946 0.7939 0.7930 0.7916 0.7897 0.7879 0.7872 [31] 0.7876 0.7886 0.7887 [1] 11.64 20.91 30.51 42.13 56.19 73.71 95.59 123.29 159.20 204.83 [11] 263.66 337.90 424.41 520.76 623.55 727.31 804.98 810.60 733.27 621.58 [21] 504.83 397.60 310.92 242.78 190.93 151.72 120.81 95.24 72.67 52.84 [31] 36.01 22.42 6.32 [1] old.abs.dev= 0.259398042835904 [1] abs.dev= 0.160699574601494 [1] 0.8000 0.7985 0.7972 0.7958 0.7944 0.7930 0.7917 0.7907 0.7903 0.7907 [11] 0.7917 0.7929 0.7941 0.7951 0.7964 0.7977 0.7982 0.7984 0.7989 0.7995 [21] 0.8000 0.8006 0.8013 0.8023 0.8036 0.8048 0.8054 0.8055 0.8055 0.8063 [31] 0.8081 0.8103 0.8125 [1] 11.64 20.56 29.46 39.64 51.27 64.86 80.67 99.46 122.30 149.90 [11] 184.62 230.35 286.04 352.93 429.63 508.43 588.14 635.70 616.86 561.50 [21] 489.16 408.17 333.63 269.21 215.86 173.66 139.24 110.43 84.50 61.28 [31] 41.38 25.49 7.42 [1] old.abs.dev= 0.160699574601494 [1] abs.dev= 0.0893104146067729 [1] 0.8000 0.8006 0.8013 0.8021 0.8029 0.8036 0.8043 0.8048 0.8050 0.8051 [11] 0.8049 0.8045 0.8037 0.8024 0.8019 0.8019 0.8015 0.8012 0.8010 0.8009 [21] 0.8007 0.8002 0.7995 0.7990 0.7987 0.7985 0.7980 0.7970 0.7958 0.7950 [31] 0.7950 0.7950 0.7939 [1] 11.64 20.72 29.92 40.75 53.42 68.99 87.79 111.53 141.92 179.12 [11] 226.54 290.16 365.88 456.60 556.96 640.04 722.83 757.33 711.48 624.30 [21] 522.26 423.34 335.91 264.10 206.62 162.86 128.85 101.60 77.86 56.72 [31] 38.30 23.41 6.49 [1] old.abs.dev= 0.0893104146067729 [1] abs.dev= 0.0506515395975703 [1] 0.8000 0.7996 0.7994 0.7990 0.7986 0.7982 0.7978 0.7974 0.7970 0.7969 [11] 0.7970 0.7975 0.7981 0.7979 0.7982 0.7990 0.7991 0.7990 0.7990 0.7992 [21] 0.7994 0.7996 0.7996 0.7998 0.8002 0.8009 0.8015 0.8015 0.8014 0.8016 [31] 0.8025 0.8035 0.8041 [1] 11.64 20.65 29.71 40.22 52.31 66.93 84.02 104.94 131.25 161.98 [11] 200.06 251.89 310.71 386.59 476.45 545.33 622.06 664.45 636.92 570.09 [21] 486.12 402.82 326.45 262.37 208.03 165.78 131.88 104.87 81.05 59.38 [31] 40.18 24.54 6.99 $ne [1] 11.637555 20.651336 29.710258 40.223468 52.314568 66.928887 [7] 84.022673 104.936774 131.254491 161.981416 200.060785 251.890694 [13] 310.710449 386.591379 476.449601 545.327456 622.056417 664.447078 [19] 636.923065 570.091413 486.117837 402.817857 326.451672
Re: [R] KS test and theoretical distribution
On Thu, 6 Oct 2011, francogrex wrote: x - runif(100) y - runif(100) ks.test(x,y) Two-sample Kolmogorov-Smirnov test data: x and y D = 0.11, p-value = 0.5806 alternative hypothesis: two-sided ok I expected that, but: ks.test(runif(100), runif) One-sample Kolmogorov-Smirnov test data: runif(100) D = 0.9106, p-value 2.2e-16 alternative hypothesis: two-sided How come? Pilot error. The help says y: either a numeric vector of data values, or a character string naming a cumulative distribution function or an actual cumulative distribution function such as ‘pnorm’. Only continuous CDFs are valid. runif is not 'an actual cumulative distribution function' -- View this message in context: http://r.789695.n4.nabble.com/KS-test-and-theoretical-distribution-tp3878640p3878640.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with .C
Quoting Uwe Ligges lig...@statistik.tu-dortmund.de: I don't agree that it's overkill -- you get to sidestep the whole `R CMD SHLIB ...` and `dyn.load` dance this way while you experiment with C(++) code 'live using the inline package. You need two additional packages now where you have to rely on the fact those are available. Moreover, you have to get used to that syntax, and part of it seems to be C++ now? At least I do not know why the above should work at all, while I know the simple C function does. OK, I agree that switching to Rcpp/C++ might be a bit of overkill in this example although in a lot of other example I find the Rcpp syntax much more readable than the c-code when dealing with .Call . The example could also have been writen in C using inline removing the need of Rcpp and looking more like the original example: library(inline) test - cfunction(signature(b = numeric, l = integer) , ' for(int i=0; i *l; i++) b[i] += i; ', convention=.C) I find that the advantage of using inline (especially in case of simple functions like this) is that 1. I no long need to compile and load the shared library manually, which can sometimes be frustrating when windows locks the dll. 2. Inline performs typechecking and casts variables to the right type. You can now type test(1:10,10) without needing as.numeric or as.integer. Reducing the amount of r code and the probabiliry of screwing things up by passing the wrong type. Jan Uwe It's really handy. Just make the original source void test(double *b, int *l) { int i; for(i=0; i *l ; i++) b[i] += i; } which you would have know after reading the Wriiting R Extensions manual. I agree that this step is unavoidable no matter which avenue (Rcpp or otherwise) one decides to take. -steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Limitations of R
On 06.10.2011 16:49, Alaios wrote: Dear all, I have a few binary files like 9Gb or even of 50Gb. I would like to ask you what are the known limits of the R for the data processing part, I have a system with a lot of RAM (500Gb) Really accessible from one core? Amazing. but I am not sure about the internal limitations of the R. How long for example a vector can be? See ?Memory-limits Uwe Ligges Could you please inform me for these internal R limitations? I would like to thank you in advance for your help B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] converting commas to points
Hello everyone! I work with a german excell version which uses commas instead of points for seperating decimal places. R work with points so in order to be able to save my excell tables without changing the commas to points, whenever I load a table I type in: read.table(..., dec = ,) only R puts the points into the wron places. For excample excell has a cell with the number: 0,09 so what R does, it writes the number as 0.9 which is wrong and makes my calculations become useless. Does anyone know this problem? Maby I made a mistake somewhere? I would be glad about your answers! Cheers, Anna -- Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu benachrichtigen und die E-Mail zu löschen. This e-mail is confidential. If you have received it in error, please notify me immediately and delete it from your system. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Urgent help needed for honours project - breaks between races in one year
Hi to anyone who is willing to help, I have a csv. file which has 1999 horses as the rows and the age(in years) of the horse at each race as columns. Ive read this file into R and called it 'horses'. Im trying to find the longest break between each race in the horse's first year of racing. I already have a numeric called 'age.first' which is the age at which each horse had its first race. I understand if i want to look at the breaks a year from the first race i have to add +1. But i am not sure how to write the code into R. Can anyone help me get the longest break in the first year of racing for each horse? Jana. -- View this message in context: http://r.789695.n4.nabble.com/Urgent-help-needed-for-honours-project-breaks-between-races-in-one-year-tp3878187p3878187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sum of functions
Dear all, I would like to create a code for semiparametric Klein and Spady's estimator. For that I created a function that provides the log-likelihood function for each observation (so it is a function of betas and i, where i denotes the observation). Now, in order to maximize the log-likelihood function, I have to sum these log-likelihood functions for each i and so to get another function that is a function only of betas and so to maximize it through maxLik for instance. Is that possible? In order to be more clear I give an example of how it could be: Prob1- function(b, i) g.yj(b,y=1,h.np,i)/(g.yj(b,y=1,h.np,i)+g.yj(b,y=0,h.np,i)) loglik.i- function(b, i) Y[i,]*log(Prob1(b,i))+(1-Y[i,])*log(1-Prob1(b,i)) where b denotes the betas, i the observations, Y is the response vector and g.yj(b,1,h.np,i) a function that I created previously, Prob1(b,i) is a function that gives the conditional probability for observation i and loglik.i(b,i) is function that gives the log-likelihood for observation i. How can I sum the loglik.i(b,i) for each i and remain as a function of b ONLY in order to maximize it??? For exemple this could be done manually by loglik- function(b) loglik.i(b,1)+loglik.i(b,2)+loglik.i(b,3)+….+loglik.i(b,N) but how can I do it automatically for all observations? Thank you Dimitris -- View this message in context: http://r.789695.n4.nabble.com/sum-of-functions-tp3878448p3878448.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About stepwise regression problem
using AIC/BIC, I'm not know too much about this. I just know using p-value to perform stepwise regression if I used p-value to perform multimodel stepwise regression, is it correct in the first message box? -- View this message in context: http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3878297.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] setting up the RPostgreSQL_0.2-0.tar
I am a new R user, and am looking to call Postgres through R. I am not so technical and need help in setting up. I have downloaded the tar file but cannot get passed this step. Is there documentation for dummies to get working with this? I don't even know where I should save this file? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] initial value in ComboBox tkwidget
Hello, I took me very long to find out how to set the initial value in the combobox widget in tk (see the example below). My question is now : Why does textvariable has to be a tclVar while values can be a normal vector? My next question is: How could I have known this much earlier? Is there a documentation for the tcl/tk usage in R? I know a lot of website with example. But the documentation is not very complet. Any tipps? require(tcltk) tclRequire(BWidget) tt - tktoplevel() tkgrid(tklabel(tt,text=What's your favorite fruit?)) fruits - c(Apple,Orange,Banana,Pear) comboBox - tkwidget(tt,ComboBox,editable=TRUE,values=fruits,textvariable=tclVar(Banana)) tkgrid(comboBox) OnOK - function() { fruitChoice - fruits[as.numeric(tclvalue(tcl(comboBox,getvalue)))] tkdestroy(tt) msg - paste(Good choice! ,fruitChoice,s are delicious!,sep=) tkmessageBox(title=Fruit Choice,message=msg) } OK.but -tkbutton(tt,text= OK ,command=OnOK) tkgrid(OK.but) tkfocus(tt) -- View this message in context: http://r.789695.n4.nabble.com/initial-value-in-ComboBox-tkwidget-tp3878164p3878164.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] converting commas to points
From ?read.csv: read.csv2( file, header = TRUE, sep = ;, quote=\, dec=,, fill = TRUE, comment.char=, ...) I think this is specifically set up for German decimal commas. Rgds, Rainer On Thursday 06 October 2011 17:39:46 Anna Lee wrote: Hello everyone! I work with a german excell version which uses commas instead of points for seperating decimal places. R work with points so in order to be able to save my excell tables without changing the commas to points, whenever I load a table I type in: read.table(..., dec = ,) only R puts the points into the wron places. For excample excell has a cell with the number: 0,09 so what R does, it writes the number as 0.9 which is wrong and makes my calculations become useless. Does anyone know this problem? Maby I made a mistake somewhere? I would be glad about your answers! Cheers, Anna __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unique possible bug
The error I am referring to is in unique.c in Base R, it cannot accomodate greater than 2^29 values, even though it appears the overflow protection should be 2^30. The only relevance of the arules package is I was using it while I discovered this issue. Thanks, Patrick 2011/10/6 Uwe Ligges lig...@statistik.tu-dortmund.de: On 05.10.2011 22:15, Patrick McCann wrote: Hi, I am trying to read in a rather large list of transactions using the arules library. You mean the arules package? It seems in the coerce method into the dgCmatrix, it somewhere calls unique. Unique.c throws an error when n 536870912; however, when 4*n was modified to 2*n in 2004, the overflow protection should have changed from 2^29 to 2^30, right? If so, how would I change it in my copy? Do I have to recompile everything? Yes. There is also the way to ask the maintainer to improve it, but it won't work without reinstallation of the changed package sources. Uwe Ligges Thanks, Patrick McCann Here is a simple to reproduce example: runif(2^29+5)-a sum(unique(a))-b Error in unique.default(a) : length 536870917 is too large for hashing traceback() 3: unique.default(a) 2: unique(a) 1: unique(a) unique.default function (x, incomparables = FALSE, fromLast = FALSE, ...) { z- .Internal(unique(x, incomparables, fromLast)) if (is.factor(x)) factor(z, levels = seq_len(nlevels(x)), labels = levels(x), ordered = is.ordered(x)) else if (inherits(x, POSIXct)) structure(z, class = class(x), tzone = attr(x, tzone)) else if (inherits(x, Date)) structure(z, class = class(x)) else z } environment: namespace:base From http://svn.r-project.org/R/trunk/src/main/unique.c I see: /* Choose M to be the smallest power of 2 not less than 2*n and set K = log2(M). Need K= 1 and hence M= 2, and 2^M= 2^31 -1, hence n= 2^29. Dec 2004: modified from 4*n to 2*n, since in the worst case we have a 50% full table, and that is still rather efficient -- see R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606. */ static void MKsetup(int n, HashData *d) { int n4 = 2 * n; if(n 0 || n 536870912) /* protect against overflow to -ve */ error(_(length %d is too large for hashing), n); d-M = 2; d-K = 1; while (d-M n4) { d-M *= 2; d-K += 1; } } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] converting commas to points
On 06.10.2011 17:39, Anna Lee wrote: Hello everyone! I work with a german excell version which uses commas instead of points for seperating decimal places. R work with points so in order to be able to save my excell tables without changing the commas to points, whenever I load a table I type in: read.table(..., dec = ,) only R puts the points into the wron places. For excample excell has a cell with the number: 0,09 so what R does, it writes the number as 0.9 Please specify an example with the excel output and what R reads. You will find that such an example does not exist and hence you made some mistake and compared different numbers or so. Uwe Ligges which is wrong and makes my calculations become useless. Does anyone know this problem? Maby I made a mistake somewhere? I would be glad about your answers! Cheers, Anna __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] converting commas to points
Sorry guys, I allready found the solution. Excell showed some of the numbers in the format: 1,90053-E05 and R interpreted it as 1.9... I changed that in Excel Cheers, Anna Am 6. Oktober 2011 17:48 schrieb Uwe Ligges lig...@statistik.tu-dortmund.de: On 06.10.2011 17:39, Anna Lee wrote: Hello everyone! I work with a german excell version which uses commas instead of points for seperating decimal places. R work with points so in order to be able to save my excell tables without changing the commas to points, whenever I load a table I type in: read.table(..., dec = ,) only R puts the points into the wron places. For excample excell has a cell with the number: 0,09 so what R does, it writes the number as 0.9 Please specify an example with the excel output and what R reads. You will find that such an example does not exist and hence you made some mistake and compared different numbers or so. Uwe Ligges which is wrong and makes my calculations become useless. Does anyone know this problem? Maby I made a mistake somewhere? I would be glad about your answers! Cheers, Anna -- Der Inhalt dieser E-Mail ist vertraulich. Sollte Ihnen die E-Mail irrtümlich zugesandt worden sein, bitte ich Sie, mich unverzüglich zu benachrichtigen und die E-Mail zu löschen. This e-mail is confidential. If you have received it in error, please notify me immediately and delete it from your system. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] undefined slot classes in package code
Dear all, I am facing a problem (warning message) when building a package that I am unable to fix: * checking whether package ‘myRcppTest’ can be installed ... WARNING Found the following significant warnings: Warning: undefined slot classes in definition of S4test: Rcpptest(class Rcpp_test) The package (originally generated with Rcpp.package.skeleton) contains a class named test, in module test (see the *.cpp file below). This class works fine (I can generate and use objects created with new(test$test)). To extend the functionality of this class, I want to wrap a S4 class (named S4test in the minimal example below) around the Rcpp class. The idea is to store an instance of the Rcpp-class in a slot (see R file below), and provide S4 methods that make use of this internal object. The problem seems to be that during the check/install process test$test is not known when this file is processed. However, I could not figure out so far how to make test$test known within the context of this file... Thanks for your help Pascal --- cpp file #include Rcpp.h using namespace Rcpp ; RCPP_MODULE(test) { class_test(test) .constructor() .field(value, test::val) ; } test::test() {} test::~test() {} -- R file - setClass(S4test, representation( Rcpptest=Rcpp_test) ); setMethod(f=initialize, signature=S4test, definition=function(.Object) { .Object@Rcpptest - new(test$test); .Object; } ); newS4test - function() { new(S4test); } - zzz.R - .NAMESPACE - environment() test - new( Module ) .onLoad - function(libname, pkgname) { unlockBinding( test , .NAMESPACE ) assign( test, Module( test ), .NAMESPACE ) lockBinding( test, .NAMESPACE ) } .onAttach - function(libname, pkgname) {} .onUnload - function(libname, pkgname) {} NAMESPACE useDynLib(myRcppTest) exportPattern(^[[:alpha:]]+) import( Rcpp ) -- Pascal A. Niklaus Institute of Evolutionary Biology and Environmental Studies University of Zurich CH-8057 Zurich / Switzerland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Limitations of R
On 10/06/2011 08:33 AM, Uwe Ligges wrote: On 06.10.2011 16:49, Alaios wrote: Dear all, I have a few binary files like 9Gb or even of 50Gb. Hi Alaios -- Maybe you have a particular domain you are interested in (e.g., high-throughput sequence analysis) and there are packages (e.g., at http://bioconductor.org) that make it easier to work with this size and format of data. Martin I would like to ask you what are the known limits of the R for the data processing part, I have a system with a lot of RAM (500Gb) Really accessible from one core? Amazing. but I am not sure about the internal limitations of the R. How long for example a vector can be? See ?Memory-limits Uwe Ligges Could you please inform me for these internal R limitations? I would like to thank you in advance for your help B.R Alex [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] converting commas to points
Hi Anna, have a look at ?write.csv2, which deals with the Excel conventions for CSV in German locale. cheers Am 06.10.2011 17:39, schrieb Anna Lee: Hello everyone! I work with a german excell version which uses commas instead of points for seperating decimal places. R work with points so in order to be able to save my excell tables without changing the commas to points, whenever I load a table I type in: read.table(..., dec = ,) only R puts the points into the wron places. For excample excell has a cell with the number: 0,09 so what R does, it writes the number as 0.9 which is wrong and makes my calculations become useless. Does anyone know this problem? Maby I made a mistake somewhere? I would be glad about your answers! Cheers, Anna -- Eik Vettorazzi Institut für Medizinische Biometrie und Epidemiologie Universitätsklinikum Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790 -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setting up the RPostgreSQL_0.2-0.tar
I got this message in a private message before and asked you to read the posting guide before posting here! I also asked you to specify R version and operating system assuming install.packages() does not work. And I still do not know what the error was you got. Uwe Ligges On 06.10.2011 17:13, Elinor Hartman wrote: I am a new R user, and am looking to call Postgres through R. I am not so technical and need help in setting up. I have downloaded the tar file but cannot get passed this step. Is there documentation for dummies to get working with this? I don't even know where I should save this file? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Running *slow*
Probably most of the time you're waiting for this you are in Circle 2 of 'The R Inferno'. If the values are numbers, you might also be in Circle 1. On 06/10/2011 13:37, Thomas wrote: Anyone got any hints on how to make this code more efficient? An early version (which to be fair did more than this one is) ran for 330 hours and produced no output. I have a two column table, Dat, with 12,000,000 rows and I want to produce a lookup table, ltable, in a 1 dimensional matrix with one copy of each of the values in Dat: for (i in 1:nrow(Dat)) { for (j in 1:2) { #If next value is already in ltable, do nothing if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])} } } but it takes forever to produce anything. Any advice gratefully received. Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting parabolic function to data
Hi Duncan, Thanks for your reply. You saved my Day :) Thanks, Henri On Thu, Oct 6, 2011 at 12:07 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 11-10-06 4:59 AM, Henri Mone wrote: Dear R users and experts, I want to fit a shifted parabolic function with the following functional form to my data: f(x)=a0*(x+a1)^2+a2 (a0, a1 and a2 are scaling factors.) What is standard approach to do this in R? I tried the lm function in R but I got problems getting the above functional form. Any help is welcome :) . That can be expanded into a regular quadratic: (a0*a1^2 + a2) + 2*a0*a1*x + a0*x^2 So fit a regular quadratic, and then solve for a0, a1, a2 from the resulting coefficients. The only tricky bit will be computing errors on the a's. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Running *slow*
Patrick is right, most of the time is probably taken up for the reasons documented in the (masterful) R Inferno, namely the rbind() calls. There is another problem though and it gets at the very core of R, and for that matter, all interpreted languages that I'm familiar with. I'll give a fairly elementary explanation and gloss over many of the subtleties that R core worries about so we mere mortals don't have to. At the end of the day, everything is looped, there's no way to get around it. However, from a code perspective we have a choice of looping in C or R. Whenever possible it is better to loop in C than R and most of the key built-in functions, like unique(), are designed to do just that. The reason for it is pretty straightforward: consider what has to happen to run a loop in R: Iterator is defined: a sequence of C calls start this first line of loop is hit - interpreted by R - sent to C code - executed - changed back into an R result - passed to the next line of the loop iterator is increased: C again second line of loop is hit - interpreted by R - sent to C code - executed - changed back into an R result - passed to the next line of the loop etc. Complicated and/or multiple lines of code only compound the problem because you have to go up and down multiple times at each iteration. Looping on the C level gets rid of all those translations between C/R, save 2, and thereby mightily increases efficiency. Hence, even if you are using the same (or heaven forbid a faster!) algorithm on the R level, it can look super slow because of all the moving up and down the ladder; I don't know how unique.C is implemented, but my guess is it's more or less like what you have now, with more efficient memory usage/preallocation, it just looks *much* faster because of the C architecture. DISCLAIMER: there are quite a few inaccuracies, most small, maybe a few large, in here, and I probably only am aware of a small fraction thereof, but this wasn't intended to be a super accurate explanation. On another note, I should explain my solution a little more clearly. A straight call to unique() would check for unique ROWS not values of x. I take x, make a copy so as not to harm the original object, strip if of its dimensionality (thereby converting it to a vector efficiently), and then apply unique() which will now find unique values. It's not a huge thing, but not immediately apparent from what I did. Hope this helps, Michael On Thu, Oct 6, 2011 at 11:59 AM, Patrick Burns pbu...@pburns.seanet.com wrote: Probably most of the time you're waiting for this you are in Circle 2 of 'The R Inferno'. If the values are numbers, you might also be in Circle 1. On 06/10/2011 13:37, Thomas wrote: Anyone got any hints on how to make this code more efficient? An early version (which to be fair did more than this one is) ran for 330 hours and produced no output. I have a two column table, Dat, with 12,000,000 rows and I want to produce a lookup table, ltable, in a 1 dimensional matrix with one copy of each of the values in Dat: for (i in 1:nrow(Dat)) { for (j in 1:2) { #If next value is already in ltable, do nothing if (is.na(match(Dat[i,j], ltable))){ltable - rbind(ltable,Dat[i,j])} } } but it takes forever to produce anything. Any advice gratefully received. Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] any way to convert back to DateTime class when accidental conversion to numeric?
Hi Dr. Ripley, All, Thanks for the succinct advice! Perfectly what I needed! Jeff, Absolutely I agree that this is a dangerous path, and I would never consider doing it for something that needs to be robust. But in 'R' type casting is a bit messed up, so I've come to accept that sometimes something I called a string might become a factor, a date became a numeric, etc. Then I stick in enough catches throughout my functions to deal with it, worst case. It is ugly, but I cannot figure out a way to have tight types and still use 'R'. Yet, 'R' has so many cool functions (and more added every day); I'd be silly to throw the baby out with the bathwater. So my typical best case or robust solution is to write a parent script in python (I simply know python best) that handles all data typing, etc., then call 'R' once I know that everything is clean. In this particular case above where I was asking, this is really for exploratory work. Once I get a solution, I will likely handle typing outside of 'R'. Thanks for the advice! Regards, Mike --- XKCD http://www.xkcd.com On Wed, Oct 5, 2011 at 11:41 PM, Prof Brian Ripley rip...@stats.ox.ac.ukwrote: A more portable way (that function only works in some versions of R) is as.POSIXct(1317857320, origin=1970-01-01) possibly with a 'tz' argument if you need to restore the timezone. On Wed, 5 Oct 2011, jim holtman wrote: Here is what I use: unix2POSIXct(1317857320) [1] 2011-10-05 19:28:40 EDT unix2POSIXct - function (time) structure(time, class = c(POSIXt, POSIXct)) On Wed, Oct 5, 2011 at 7:38 PM, Mike Williamson this.is@gmail.com wrote: Hi, In short, I would like to know if there is any way to convert a numeric into a date, similar to how strptime() can convert a string to a date time class? There are some functions, etc. which don't work well with dates, and tend to force them into numerics. I understand that the number it spits back is the number of seconds since the beginning of 1970 (see the first few sentences of the Details portion of ?DateTimeClasses). However, it's a bit of a hassle to convert that by hand. I can create a function to do this, and it isn't so hard, but I found it hard to believe such a function didn't already exist, so I wanted to ask the community. As an example, today (Oct 5th 2011 at approximately 4:30pm, Pacific time) is approximately 1317857320 as a numeric, but I would like to know how to go from that number back to the 2011-10-05 16:28:39 PDT date time class which originally generated it. Thanks! Mike --- XKCD http://www.xkcd.com [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~**ripley/http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SPlus to R
On Oct 6, 2011, at 11:17 AM, Scott Raynaud wrote: I'm re-posting this since it did not appear at the end of the thread. Sorry for the inconvenience. Not sure why it's giving the message: An embedded and charset-unspecified text was scrubbed... As far as I know my replies are set up as text and not html. As far as I can tell but have not been able to find this well- documented, the test being applied is not to the content of an attachment but rather to the extension of the attachment's name. If it is not something.txt or something_else.pdf, it's probably going to fail the server test and be scrubbed. Files with extensions .csv or .dat will generally fail, despite being text formats. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About stepwise regression problem
Hi: Please read this: http://www.stata.com/support/faqs/stat/stepwise.html Dennis On Thu, Oct 6, 2011 at 6:41 AM, pigpigmeow gloryk...@hotmail.com wrote: using AIC/BIC, I'm not know too much about this. I just know using p-value to perform stepwise regression if I used p-value to perform multimodel stepwise regression, is it correct in the first message box? -- View this message in context: http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3878297.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with regexp
Thanks to all who replied! With all these possible solutions it will be hard to find the best one :-). --- Gabor Grothendieck ggrothendi...@gmail.com schrieb am Mi, 5.10.2011: Von: Gabor Grothendieck ggrothendi...@gmail.com Betreff: Re: [R] help with regexp An: Jannis bt_jan...@yahoo.de CC: r-h...@stat.math.ethz.ch Datum: Mittwoch, 5. Oktober, 2011 15:13 Uhr On Wed, Oct 5, 2011 at 7:56 AM, Jannis bt_jan...@yahoo.de wrote: Dear list memebers, I am stuck with using regular expressions. Imagine I have a vector of character strings like: test - c('filename_1_def.pdf', 'filename_2_abc.pdf') How could I use regexpressions to extract only the 'def'/'abc' parts of these strings? Some try from my side yielded no results: testresults - grep('(?=filename_[[:digit:]]_).{1,3}(?=.pdf)', perl = TRUE, value = TRUE) Somehow I seem to miss some important concept here. Until now I always used nested sub expressions like: testresults - sub('.pdf$', '', sub('^filename_[[:digit:]]_', '' , test)) but this tends to become cumbersome and I was wondering whether there is a more elegant way to do this? Here are a couple of solutions: # remove everything up to _b as well as everything from . onwards gsub(.*_|[.].*, , test) # extract everything that is not a _ provided it is immediately followed by . library(gsubfn) strapply(test, ([^_]+)[.], simplify = TRUE) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Advice in model construction
Hi Chris, Linear regression model of categorical variables is equivalent to anova. If one of estimates of coefficients is significant in lm, it is interpreted as samples of Family did not come from the same population and the signficant factor level is compared to the reference level. If you want pair-wise comparison, you need post-hoc testing methods (?TukeyHSD). HTH Weidong Gu On Wed, Oct 5, 2011 at 1:55 PM, Chris Mcowen chrismco...@gmail.com wrote: Dear list, I am unsure how to structure my model, i have tried something and it makes sense but i am unsure if i am interpreting it correctly? i have a continuous response variable - the observed quantity of evolutionary history - EH Then i have a number of species which have a hierarchical structure ~ Genus, Family etc My research question is do certain families have significantly higher ( or lower) EH values than the others. Reproducible example: example - structure(list(Family = structure(c(2L, 1L, 1L, 5L, 7L, 7L, 3L, 4L, 6L, 6L, 1L, 3L), .Label = c(Araceae, Asphodelaceae, Bromeliaceae, Cyperaceae, Orchidaceae, Poaceae, Zingiberaceae), class = factor), Genus = structure(c(3L, 4L, 4L, 1L, 6L, 6L, 2L, 5L, 8L, 9L, 4L, 7L), .Label = c(Acianthera, Aechmea, Aloe, Anthurium, Bulbostylis, Hedychium, Lindmania, Psathyrostachys, Sesleria), class = factor), Species = structure(c(9L, 1L, 10L, 11L, 7L, 4L, 3L, 5L, 8L, 2L, 6L, 12L), .Label = c(bonplandii, coerulans, cymosopaniculata, elatum, emmerichiae, gehrigeri, glabrum, juncea, pubescens, sagittatum, scalpricaulis, sessilis), class = factor), EH = c(8.746525, 24.462699, 33.03942, 32.719489, 13.598201, 13.598201, 13.164928, 9.339228, 9.69705, 13.478372, 37.497137, 59.562911)), .Names = c(Family, Genus, Species, EH), class = data.frame, row.names = c(NA, -12L)) #My model test - lm(EH~Family, data = example) #in this small example no families are significant but if one was - would that mean they have significantly more EH than the others? Thanks Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About stepwise regression problem
I hadn't seen that page Dennis, that makes the case much more succinctly than my anti stepwise ramblings! Furthermore, pigpigmeow if you are using a random effects model i.e lmer - where are you getting your p-values from? And what do they mean in this context? I would strongly advise using information criteria in these scenarios. Chris P.S - Doug Bates discusses P-values in relation to mixed effect models here : https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html On 6 Oct 2011, at 18:17, Dennis Murphy wrote: Hi: Please read this: http://www.stata.com/support/faqs/stat/stepwise.html Dennis On Thu, Oct 6, 2011 at 6:41 AM, pigpigmeow gloryk...@hotmail.com wrote: using AIC/BIC, I'm not know too much about this. I just know using p-value to perform stepwise regression if I used p-value to perform multimodel stepwise regression, is it correct in the first message box? -- View this message in context: http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3878297.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a loop for a function
Well you said you wanted statistics from the test, but you didn't say which statistics you wanted: any of the following would work: sapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)$statistic) sapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)$parameter) sapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)$p.value) extractor wasn't quite the right word, but more or less that's the idea (if you happen to know it from other programming contexts). Michael On Thu, Oct 6, 2011 at 1:43 AM, upananda pani upananda.p...@gmail.com wrote: Respected Sir, I am grateful to you for your reply. This is working perfectly fine. I do not know how to add extractors. Please give me one example with this function. With regards, Upananda On Thu, Oct 6, 2011 at 1:04 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: lapply(1:10, function(i) Box.test (lfut, lag = i, type=Ljung)) Add extractors to get statistics as desired. Michael Weylandt On Wed, Oct 5, 2011 at 1:09 PM, upananda pani upananda.p...@gmail.com wrote: Dear All, I want to create a loop within a function r. The example follows: Box.test (lfut, lag = 1, type=Ljung) if i want to compute the Box.test for lag 1 to 10, I have to write manually change each time for different lag. So i wan to write a loop for the lag 1 to 10 and return the statistics for each lag. Is there any method to do this ? With regards, Upananda -- You may delay, but time will not. Research Scholar alternative mail id: up...@iitkgp.ac.in Department of HSS, IIT KGP KGP [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- You may delay, but time will not. Research Scholar alternative mail id: up...@iitkgp.ac.in Department of HSS, IIT KGP KGP __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gefp() boundaries?
Thank you very much for the answer. If I take Poisson model and follow Generalized M-fluctuation tests for parameter instability, A. Zeileis and K. Hornik, Statistica Neerlandica (2007) Vol. 61, N. 4, p. 500-501 (section 4.3): data(Boston) n - 506; my.X - as.matrix(cbind(1, Boston[crim], Boston[age])); my.model - glm(tax ~ crim + age, family = poisson, data = Boston); my.psi - estfun(my.model); my.mu - fitted(my.model); J - sum(my.mu*my.X%*%t(my.X))/n; my.process - apply(as.matrix(my.psi), 2, cumsum)/sqrt(J*n); gprocess - gefp(tax ~ crim + age, family = poisson, data=Boston); then my.process and gprocess$process have to be the same? Best regards, Julia -- View this message in context: http://r.789695.n4.nabble.com/gefp-boundaries-tp3872529p3878886.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] factors in probit regression
Hi to all of you, I'm fitting an full factorial probit model from an experiment, and I've the independent variables as factors. The model is as follows: fit16-glm(Sube ~ as.factor(CE)*as.factor(CEBO)*as.factor(Luz), family=binomial(link=probit), data=experimento) but, when I took a look to the results I've obtained the following: glm(formula = Sube ~ CE * CEBO * Luz, family = binomial(link = probit), data = experimento) Deviance Residuals: Min 1Q Median 3Q Max -1.651e-06 -1.651e-06 1.651e-06 1.651e-06 1.651e-06 Coefficients: (3 not defined because of singularities) Estimate Std. Error z value Pr(|z|) (Intercept)6.991e+00 3.699e+04 0 1 CEexperimental 5.357e-09 4.775e+04 0 1 CENO -1.398e+01 4.320e+04 0 1 CEBOcombinado 4.948e-26 4.637e+04 0 1 CEBOolor 1.183e-25 4.446e+04 0 1 CEBOvisual 7.842e-26 5.650e+04 0 1 Luzoscuridad 3.383e-26 4.637e+04 0 1 CEexperimental:CEBOcombinado -6.227e-26 6.656e+04 0 1 CENO:CEBOcombinado-3.758e-26 5.540e+04 0 1 CEexperimental:CEBOolor -2.611e-25 6.865e+04 0 1 CENO:CEBOolor -5.252e-26 5.620e+04 0 1 CEexperimental:CEBOvisual -2.786e-09 7.700e+04 0 1 CENO:CEBOvisual8.169e-15 6.334e+04 0 1 CEexperimental:Luzoscuridad -1.703e-25 6.304e+04 0 1 CENO:Luzoscuridad -1.672e-28 6.117e+04 0 1 CEBOcombinado:Luzoscuridad 1.028e-26 5.950e+04 0 1 CEBOolor:Luzoscuridad 9.212e-27 6.207e+04 0 1 CEBOvisual:Luzoscuridad NA NA NA NA CEexperimental:CEBOcombinado:Luzoscuridad 9.783e-26 8.744e+04 0 1 CENO:CEBOcombinado:Luzoscuridad -2.948e-26 7.959e+04 0 1 CEexperimental:CEBOolor:Luzoscuridad 1.573e-25 9.005e+04 0 1 CENO:CEBOolor:Luzoscuridad-2.111e-26 8.208e+04 0 1 CEexperimental:CEBOvisual:LuzoscuridadNA NA NA NA CENO:CEBOvisual:Luzoscuridad NA NA NA NA (Dispersion parameter for binomial family taken to be 1) Null deviance: 2.0853e+02 on 150 degrees of freedom Residual deviance: 4.1146e-10 on 130 degrees of freedom AIC: 42 Well, there are too many levels of the original factors lacking in this table. As an example, the factor CE has three levels (Undefined, Control, Experimental), but in the table there are only two of them (NO=undefined, Experimental=Experimental). I need to check the complete result, how can I obtain the effects for the remaining levels of the factors? Thanks, Pablo -- View this message in context: http://r.789695.n4.nabble.com/factors-in-probit-regression-tp3879176p3879176.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sweave and bibliographies
I added a bibliographic reference to one of my .Rnw documents, to wit: \begin{thebibliography}{9} \bibitem{Jose} Jos{\'{e}} C. Pinheiro and Douglas M. Bates, \emph{Mixed-Effects Models in S and S-PLUS}, Springer, 2000. \end{thebibliography} just before the \end{document}. When I run Sweave on the result these lines disappear without a trace. Is this intentional? Subquestion: is there a link to some Sweave documentation from the main CRAN or R page? Terry T. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ignore my last message
It was an editor interaction -- the changes weren't being saved. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Titles changing when a plot is redrawn
On Oct 6, 2011, at 2:29 PM, David Winsemius wrote: On Oct 6, 2011, at 8:25 AM, John Nolan wrote: Thank you for telling me a fix. But I still don't know if this behavior is what is intended. I used bquote(...) because the plotmath(...) help page refers to bquote and gives an example like this. I suspect most users will be baffled by this kind of behavior, especially since it does not occur when there is one plot. By this I mean that I can draw one plot and title it with the same string using bquote( ). It seems to be an infelicity that is not reproducible on Macs: RMSE.titles.pdf Apologies, I take it back. I read you original post incorrectly and thought you were having problems with the original plot appearing incorrectly. When I do the the resizing I do see the same unexpected change to both titles having i = 2 If I change the value of i, and redraw the graph, the redrawn graph has the original value of i in the title, not the updated value. So in this case, an unevaluated expression is not re-evaluated at draw time? I certainly would not have expected redrawing to call any R code again. I would have expected the graphics device to do the recalculations. As I said. I would not have expected this, either. I seem to remember this being brought up before, but I was unable to find it on a search. -- David. John -xieyi...@gmail.com wrote: - To: John Nolan jpno...@american.edu From: Yihui Xie Sent by: xieyi...@gmail.com Date: 10/05/2011 11:49PM Cc: r-help@r-project.org Subject: Re: [R] Titles changing when a plot is redrawn I think the problem is your str1 is an unevaluated expression and will change with the value of i. You should be able to get a fixed title by this: par(mfrow = c(2, 1)) for (i in 1:2) { x - 1:100 rmse - sin(x/5) # fake data plot(x, rmse, main = substitute(list(RMSE(theta), i == z), list(z = i))) } Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Oct 5, 2011 at 10:01 PM, John Nolan jpno...@american.edu wrote: I ran into a problem with titles on graphs. I wanted a graph with multiple subplots, with each having a title that involved both a Greek letter and an identifier for each graph. Below is a simplified version of code to do this. The graph appears fine, with the first graph having i=1 in the title, and the second graph having i=2 in the title. However, when I resize the graph, the plot titles change, with both showing i=2. The titles also change when I save the plot to a file using the File menu, then Save as in Windows. Is this what should happen? I always thought that titles are static once the graph is drawn, and couldn't change. The problem occurs on some version of R, but not on others. It does occur with the latest version of R: str(R.Version()) List of 13 $ platform : chr i386-pc-mingw32 $ arch : chr i386 $ os: chr mingw32 $ system: chr i386, mingw32 $ status: chr $ major : chr 2 $ minor : chr 13.2 $ year : chr 2011 $ month : chr 09 $ day : chr 30 $ svn rev : chr 57111 $ language : chr R $ version.string: chr R version 2.13.2 (2011-09-30) The problem also occurs on: R 2.13.0 on Win32 and Mac (R 2.12.0, x86_64-apple-darwin9.8.0) The problem DOES NOT occur under R 2.10.0 on Win32. If the code below is bracketed with pdf(test.pdf) and dev.off(), the correct labels appear in the file. This behavior doesn't seem to appear if there is only one plot. My guess is that the titles are being reevaluated when the plot is redrawn, and since the value of i is 2 when the redraw occurs, both labels get set to i=2. I guess Save as forces a redraw because a dialog box pops up? If could be that this behavior is what is intended, and that somewhere between R 2.10.0 and R 2.13.2 an old bug was fixed. Or this behavior is not what was intended, and a bug was introduced. If the former, this should be explained to the user somewhere. If the latter, can someone track it down and fix? John Nolan #- par(mfrow=c(2,1)) for (i in 1:2) { x - 1:100 rmse - sin(x/5) # fake data plot(x,rmse) str1 - bquote( paste(RMSE(,theta,), ,i==.(i) )) title( str1 ) } #- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot titles and labels XXXX
Hi everyone, Is it possible to to specify titles and axis labels on 2 separate lines when using the plot fn? What about tick mark labels? Example of main title: Figure 1: Side-by-side Boxplots of Y by X instead of Figure 1: Side-by-side Boxplots of Y by X [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] TCGA expression data: plotting ....
Hi, I am new to R. I am trying to figure out how to graph expression data from the TCGA database. If I understand correctly the expression data I have downloaded is from a microarray using the AgilentG4502A. I've had trouble reading into R in the level I, level II, and the gene expression analysis data using dat-read.table(C:\\file.txt, header=T, row.names=1) for example: dat1-read.table(C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt, header=T, row.names=1) dat-read.table(C:\\unc.edu__AgilentG4502A_07_3__TCGA-A6-2674-01A-02R-0821-07__gene_expression_analysis.txt, header=T, row.names=1) in all cases I get the error more columns than column names I have only been able to read in the level II data with the code: dat2-read.table(C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt_lmean.out.logratio.probe.tcga_level2.data.txt,header = TRUE, as.is = TRUE, sep=\t) So this is what I am working with. I can see that the dimensions of this data are dim(dat2) [1] 90798 2 When I print dat2 to the screen it looks like this: I assume that this is one patient with expression (intensity) data for a large number of genes, but don't know. 49995A_23_P67323 -0.427 49996A_23_P67330 -0.3275 49997A_23_P67332 -0.409 49998A_23_P67339 3.2955 4A_23_P67355 1.205 If I try to plot the data with the following below names(dat2) [1] Hybridization.REFTCGA.A6.2674.01A.02R.0821.07 x-c(Hybridization.REF) y-(TCGA.A6.2674.01A.02R.0821.07) plot(x,y,type='p',xlab='Hybridization.REF',ylab='TCGA.A6.2674.01A.02R.0821.07',main='plot') I get the error: Error in plot.window(...) : need finite 'xlim' values In addition: Warning messages: 1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion 2: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion 3: In min(x) : no non-missing arguments to min; returning Inf 4: In max(x) : no non-missing arguments to max; returning -Inf 5: In min(x) : no non-missing arguments to min; returning Inf 6: In max(x) : no non-missing arguments to max; returning -Inf I am really not sure how to plot this data, partly because I'm not sure what the level II data represents. Can anyone tell me what the level II data represents and what type of plotting functions I might use? Thanks in advance, MAB -- View this message in context: http://r.789695.n4.nabble.com/TCGA-expression-data-plotting-tp3879484p3879484.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot titles and labels XXXX
On Oct 6, 2011, at 2:51 PM, Dan Abner wrote: Hi everyone, Is it possible to to specify titles and axis labels on 2 separate lines when using the plot fn? What about tick mark labels? Example of main title: ?Quotes try: title(main=Figure 1: Side-by-side Boxplots\n of Y by X) instead of Figure 1: Side-by-side Boxplots of Y by X [[alternative HTML version deleted]] You should learn to post in plain text. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counts in quantiles in and from a matrix
Excellent! Thank you! ben On Wed, Oct 5, 2011 at 9:18 PM, Dennis Murphy djmu...@gmail.com wrote: Hi: Here's one way: m - matrix(rpois(100, 8), nrow = 5) f - function(x) { q - quantile(x, c(0.1, 0.9), na.rm = TRUE) c(sum(x q[1]), sum(x q[2])) } t(apply(m, 1, f)) HTH, Dennis On Wed, Oct 5, 2011 at 8:11 PM, Ben qant ccqu...@gmail.com wrote: Hello, I'm trying to get the count of values in each row that are above and below quantile thresholds. Thanks! Example: x = matrix(1:30,5,6) x [,1] [,2] [,3] [,4] [,5] [,6] [1,]16 11 16 21 26 [2,]27 12 17 22 27 [3,]38 13 18 23 28 [4,]49 14 19 24 29 [5,]5 10 15 20 25 30 qtl = t(apply(x, 1, quantile, probs = c(.1,.9),na.rm=T)) qtl 10% 90% [1,] 3.5 23.5 [2,] 4.5 24.5 [3,] 5.5 25.5 [4,] 6.5 26.5 [5,] 7.5 27.5 I would like counts like this for each row: cnts [,1] [,2] [1,] 11 [2,] 11 [3,] 11 [4,] 11 [5,] 11 ...because for the first row (x[1,]) only value 1 is less than 3.5 and only value 26 is greater 23.5 and so on for the other rows. I'm thinking its a apply(x,1,...some FUN here...), but still getting use to apply and I've been coding for too long... Also, if anyone knows how to change the background color of the r-Tinn editor my eyes would love you! Off to bed. I look forward to your answers! Thanks! Ben [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Duplicate elements of a vector
Hi, let's assume I have the following vector a: 1 5 23 How can I use R to duplicate the elements so that my new vector looks like: 1 1 5 5 23 23 Many thanks, Syrvn -- View this message in context: http://r.789695.n4.nabble.com/Duplicate-elements-of-a-vector-tp3879561p3879561.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.