[R] How to change the class of data?
Hi all,

I have some data x, which actually consist of numerical entries. But the class of this matrix was set to factor by someone else: class(x) returns "factor", so I cannot do calculations on them. How can I turn them into numerical data so that I can apply math operations on them?

Thanks a lot for your help.

Selina

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] The log function problem
Hi R,

Please see the commands below. I can see the value of log(2) before loading the package fCalendar in R, but after loading the package the 'log' function doesn't work. How can I solve this problem? Also note that the function code differs before and after loading the package.

  > log
  function (x, base = exp(1))  .Primitive("log")
  > log(2)
  [1] 0.6931472
  > library(fCalendar)
  Loading required package: fEcofin
  Rmetrics, (C) 1999-2006, Diethelm Wuertz, GPL
  fCalendar: Time, Date and Calendar Tools
  > library(fCalendar)
  > log
  function (x, base = exp(1))
  {
      UseMethod("log")
  }
  > log(2)
  Error in .Internal(log(x)) : no internal function "log"

Many thanks for your help,
Shubha
Re: [R] The log function problem
SVK == Shubha Vishwanath Karanth [EMAIL PROTECTED] on Thu, 12 Jun 2008 12:02:25 +0530

SVK> Hi R,
SVK> Please see the below commands. The question is I can see the value of
SVK> log(2) before loading the package fcalendar in R. But after loading the
SVK> package, the 'log' function doesn't work. How to solve this problem?
SVK> Also note that the function code differs before and after downloading
SVK> the packages.

It seems that you are using an old version of fCalendar. Please update all your Rmetrics packages (update.packages()) and use the latest R version.

regards,
Yohan

--
PhD student
Swiss Federal Institute of Technology Zurich
www.ethz.ch www.rmetrics.org
NOTE: Rmetrics Workshop: http://www.rmetrics.org/meielisalp.htm June 29th - July 3rd Meielisalp, Lake Thune, Switzerland
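Until the packages can be updated, one possible stopgap is to call the base function through its namespace. The sketch below only simulates the situation: the masking `log` defined here is a hypothetical stand-in, not the actual code of the old fCalendar release, and it assumes the base function itself is untouched.

```r
# Hypothetical stand-in for a package that masks log() with a broken definition.
log <- function(x, base = exp(1)) stop("no internal function \"log\"")

# The masked name now fails ...
masked <- tryCatch(log(2), error = function(e) conditionMessage(e))

# ... but the base version is still reachable via its namespace.
ok <- base::log(2)

# Removing the masking object restores the normal lookup.
rm(log)
restored <- log(2)
```

With a real package the masking object lives in the package environment rather than the workspace, so `rm()` would not apply; `base::log()` or detaching the package would be the things to try.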
Re: [R] How to change the class of data?
When you have data X with class factor, you can transform it to numeric with

  y <- as.numeric(X)

To transform it to a factor again, use

  y <- as.factor(X)
Re: [R] How to change the class of data?
On Jun 12, 2008, at 2:24 AM, Qman Fin wrote:

> Hi all, I have some data x, which actually consist of numerical entries. But the class of this matrix is set to be factor by someone else. I used class(x), it turns out to be factor. So I can not calculate them.

The typical approach is to do:

  as.numeric(as.character(x))

> How can I turn them into numerical data so that I can apply math operations on them? Thanks a lot for your help. Selina

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] How to change the class of data?
If x is a vector (one-dimensional) then as.numeric(levels(x)) works; I am not sure whether this is the best solution. If you have a matrix you can use apply, i.e.

  x1 <- apply(x, 2, function(a) as.numeric(levels(a)))
Re: [R] How to change the class of data?
Try:

  x <- factor(1:10)
  class(x)
  x + 1
  class(x) <- "numeric"
  x + 1
[R] help
I am a new user of the R language. I have a problem attaching an SPSS file. I have downloaded the foreign package but I still have the problem. I am typing 'data=read.spss(file name.choose())'. Is that right or not? Could you please tell me the way to attach the file?

Thanks
Re: [R] How to change the class of data?
Hi Selina,

try ?as.numeric. A small example:

  a=c(1,2,3,4,5)
  b=as.factor(a)
  class(b)
  c=as.numeric(b)
  class(c)

In the case of a matrix of factors, try

  apply(matrix, 1, as.numeric)

Cheers
A.
[R] ADaCGH package crashes at mpiInit()
I have successfully installed the ADaCGH package, and trying the example in SegmentPlotWrite did produce a lot of PNGs and HTML. I tried the same example again this morning (after a long night of installation), and ADaCGH crashes at mpiInit() showing the error:

  Loading required package: Rmpi
  ELAN_EXCEPTION @ --: 6 (Initialisation error)
  elan_init: Can't get capability from environment
  Aborted

I suspect the cluster administrator may have modified the MPI program or environment variables, but being new to MPI I am unsure what questions to pose to the administrator.
Re: [R] How to change the class of data?
Seeing how there have been three wrong answers so far, I should point out that:

1) This is an FAQ: http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

2) Most of the other methods suggested so far fail if the example x used is not of the form 1:n. The only reason they happen to work is that in that case the internal integer codes coincide with the level labels.

  > x <- factor(8:5)
  > as.numeric(levels(x))
  [1] 5 6 7 8
  > as.numeric(x)
  [1] 4 3 2 1
  > class(x) <- "numeric"
  > x + 1
  [1] 5 4 3 2
  attr(,"levels")
  [1] "5" "6" "7" "8"

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
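To make the difference concrete, here is a minimal sketch comparing the internal codes with the two conversions that go through the labels. The data frame `df` and its column names are made up for illustration.

```r
# A factor whose labels are numbers, but not of the form 1:n
x <- factor(c(8, 7, 6, 5))

codes  <- as.numeric(x)                # internal integer codes: 4 3 2 1
via_ch <- as.numeric(as.character(x))  # goes through the labels: 8 7 6 5
via_lv <- as.numeric(levels(x))[x]     # form given in the R FAQ, same result

# For a data frame whose columns are all factors, convert column by column
df <- data.frame(a = factor(c("10", "20")), b = factor(c("1.5", "0.5")))
df[] <- lapply(df, function(col) as.numeric(as.character(col)))
```

The `levels(x)` form converts each distinct label once and then indexes by the factor, which can be cheaper than converting every element when the factor is long and has few levels.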
Re: [R] How to change the class of data?
On Thu, Jun 12, 2008 at 03:42:23AM -0400, Charilaos Skiadas wrote:

> Seeing how there have been three wrong answers so far, I should point out that:
> 1) This is an FAQ: http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f

Going over the r-help archive, we have seen the misunderstanding you are pointing out quite often (it clearly is a FREQUENTLY asked question). In my experience, it is also among the top-five confusions among students I have introduced R to. And I clearly remember my own confusion about as.numeric(some_factor) returning the internal encoding rather than what I had expected, when I started with R.

I guess there is no way to change that behaviour without breaking existing code, but I feel it would have been much better to have as.numeric and as.integer do what people expect and have something like levelencoding(some_factor) for getting the integer representation. The problem is particularly frustrating as the result of type casting is inconsistent: as.character does exactly what people's intuition says (i.e. operate on the levels) while as.numeric does not.

So what is my point? I guess it's my message for all new R-users: You are not alone - this has confused all of us in the beginning.

Maybe as.integer and as.numeric should give a warning whenever applied to a factor?

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel
Re: [R] How to change the class of data?
Conversion to factor may happen (and often does) when you read in data with read.table(). So one solution may be reading in the same data again in a slightly different way:

  read.table(file="mydatafile", as.is=TRUE) # see also ?read.table

You can also specify a class for each column of the data you're about to read in:

  read.table(, colClasses=c("numeric", "factor", "character", "my.funny.class"))

And take a look at http://cran.r-project.org/doc/FAQ/R-FAQ.html p. 7.10 for the right answer -- in any case, don't use as.numeric(x)!

Kenn
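As a small illustration of the two read.table() options mentioned above, the sketch below reads the same inline text three ways. The column names and values are invented, and `text=` is used instead of a file only so that the example is self-contained; `stringsAsFactors = TRUE` is spelled out because the default differs between R versions.

```r
raw <- "id score grade
1 3.5 low
2 4.0 high
3 2.5 low"

# Strings converted to factors (the behaviour the original poster ran into)
d1 <- read.table(text = raw, header = TRUE, stringsAsFactors = TRUE)

# as.is = TRUE leaves character columns as character
d2 <- read.table(text = raw, header = TRUE, as.is = TRUE)

# colClasses fixes the class of every column up front
d3 <- read.table(text = raw, header = TRUE,
                 colClasses = c("integer", "numeric", "factor"))

classes <- sapply(d3, class)
```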
Re: [R] mgcv::gam error message for predict.gam
Wild extrapolation though it is... it works for me with mgcv 1.4-0 and R 2.7.0 on linux:

  > test
         1        2        3        4        5
  20.73032 16.83549 59.42120 29.07759 13.09754

What mgcv and R versions are you using, and on what OS? (btw `gam.method' isn't an argument of mgcv:gam for any recent version, not that it makes any difference for gaussian with identity link.)

best,
Simon

On Thursday 12 June 2008 00:16, David Katz wrote:
> Sometimes, for specific models, I get this error from predict.gam in library mgcv:
>
>   Error in complete.cases(object) : negative length vectors are not allowed
>
> Here's an example:
>
>   model.calibrate <- gam(meansalesw ~ s(tscore,bs="cs",k=4), data=toplot, weights=weight, gam.method="perf.magic")
>   test <- predict(model.calibrate,newdata)
>   Error in complete.cases(object) : negative length vectors are not allowed
>
> The data is shown below:
>
>   toplot[,c("meansalesw","tscore","weight")]
>      meansalesw      tscore weight
>   1   0.1275841 0.003446797  15224
>   2   0.1495748 0.004017158  15523
>   3   0.2245844 0.004375278  15520
>   4   0.2197668 0.004753941  15525
>   5   0.1317830 0.005049050  15524
>   6   0.2809621 0.005403199  15498
>   7   0.2933119 0.005764413  15529
>   8   0.4791150 0.006335145  15514
>   9   0.1833688 0.006617095  15528
>   10  0.3200599 0.007135850  15527
>   11  0.4931882 0.007781095  15529
>   12  0.4207684 0.008766088  15512
>   13  0.5928568 0.009731357  15514
>   14  0.8025296 0.010927579  15520
>   15  0.6286192 0.012004714  15513
>   16  0.7477922 0.014083143  15527
>   17  0.7251362 0.017382274  15531
>   18  1.1871948 0.025481173  15521
>   19  1.6495832 0.048264689  15524
>   20  5.1180227 0.131198022  15218
>
>   newdata
>        tscore
>   1 0.5059341
>   2 0.4125522
>   3 1.4335818
>   4 0.7060673
>   5 0.3229316
>
> Thanks!

--
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603 www.maths.bath.ac.uk/~sw283
Re: [R] How to change the class of data?
I have an additional question concerning this topic. I usually use something like this:

  read.table(, colClasses=c("numeric", "factor", "character", "my.funny.class"))

but why can I not implement ordered.factor in there?

Birgit

-
The art of living is more like wrestling than dancing. (Marcus Aurelius)
[R] How to extract rows from matrices consistently?
Hi,

How do I ensure that I always get a matrix back when I extract rows? The mickey-mouse example doesn't matter much, but if instead of 1:2 or 1, I have a vector which may have 1 or more values, then I'm in trouble. Any way to make this consistently return a matrix?

Thx in advance.

- Ken

  > x <- matrix( 1:10, nrow = 5 )
  > x
       [,1] [,2]
  [1,]    1    6
  [2,]    2    7
  [3,]    3    8
  [4,]    4    9
  [5,]    5   10
  > class( x[1:2,] )
  [1] "matrix"     # this is good
  > class( x[1,] )
  [1] "integer"    # this is EVIL
[R] XML parameters to Column Headers for importing into a dataset
Dear List,

Do you know any way I can convert XML parameters into column headers? My data is in a csv file with each row containing an XML form of data with multiple parameters ( <param1> data_val1 </param1> , <param2> data_val2 </param2> ). I want to convert it so that each row corresponds to one record and each parameter becomes a different column:

        param1     param2
  Row1  data_val1  data_val2

What is the most efficient way of doing this? Apologies for the duplicate email, but this is an emergency with loads of files for me!

Regards,
Ajay
www.decisionstats.com
Re: [R] How to extract rows from matrices consistently?
Feng, Ken wrote:
> Hi,
> How do I ensure that I always get a matrix back when I extract rows? The mickey-mouse example doesn't matter much, but if instead of 1:2 or 1, I have a vector which may have 1 or more values, then I'm in trouble. Any way to make this consistently return a matrix?
> Thx in advance.
> - Ken
>
>   > x <- matrix( 1:10, nrow = 5 )
>   > x
>        [,1] [,2]
>   [1,]    1    6
>   [2,]    2    7
>   [3,]    3    8
>   [4,]    4    9
>   [5,]    5   10
>   > class( x[1:2,] )
>   [1] "matrix"     # this is good
>   > class( x[1,] )
>   [1] "integer"    # this is EVIL

  > class(x[1,,drop=FALSE])
  [1] "matrix"     # this is good (should be the default, perhaps)

vQ
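The effect of drop=FALSE is easiest to see with an index vector whose length is not known in advance. A minimal sketch (the variable names `idx`, `r1` and `r2` are just for illustration):

```r
x <- matrix(1:10, nrow = 5)

idx <- 1                       # could equally be 1:2 or any longer vector

r1 <- x[idx, ]                 # one row selected: dimensions dropped, plain vector
r2 <- x[idx, , drop = FALSE]   # one row selected: still a 1 x 2 matrix

r3 <- x[1:2, , drop = FALSE]   # several rows: a matrix, same as without drop
```

Using `drop = FALSE` everywhere the index is computed keeps downstream code such as `nrow()` or `apply()` working regardless of how many rows were picked.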
[R] timereg and relative risks
Hi all,

I've been reading and using the information from the list for some time but this is my first question here. English is not my primary language, so sorry in advance for any language mistakes. :)

I'm working with the timereg package to analyze survival data. I want to perform a multivariate analysis of clinical information similar to the Cox regression but taking competing risks into account. I have taken a look at the documentation of the package and decided to use the Fine and Gray model, and I use the command:

  fg <- comp.risk(Surv(recact, statusRecaiguda>0) ~ const(font.PH) + const(earlyadvan) + const(mini) + const(edatmes41) + const(DONRS10925027dosgrups) + const(RECRS11665831dosgrups), dadesRec, dadesRec$statusRecaiguda, timesRec[-1], causeS=1, resample.iid=1, model="prop", detail=1)

The results are OK and coherent with my previous experiments, but I don't know how to get the equivalent of the relative risks I get when applying the Cox model. Is it possible to compute it? Can I get it using another command?

Thanks a lot for any help

Bernat
Re: [R] difference between nlm and nlminb
Thank you for those details, the only optimization routine I've come across outside of CRAN is: http://www.stat.umn.edu/geyer/trust/

Personally I only use nlminb for the estimation of Time Series models, which typically have well defined limits for the elements of the parameter vector - so in my post I guess as a high level explanation I was stressing that in reality you'd use nlm for unconstrained and nlminb for constrained (and, as you point out, box constraints) optimization as the take home point.

I notice the R group were taking part in the Google summer of code 2008 event - perhaps a useful project could be the implementation of numerous optimization routines in R?

Thanks
David

Douglas Bates-2 wrote:
> nlminb provides unconstrained optimization and optimization subject to box constraints (i.e. upper and/or lower constraints on individual elements of the parameter vector). The nlm function provides unconstrained optimization.
>
> I created the nlminb function because I was unable to get reliable convergence on some difficult optimization problems for the nlme and lme4 packages using nlm and optim. The nlme package was originally written for S from Bell Labs (the forerunner of S-PLUS) and the PORT package was the optimization code used. Even though it is very old style Fortran code I find it quite reliable as an optimizer. It allows for what is called reverse communication, which is convenient in an environment like R. It is a technical issue that has to do with what code is in control when your R expression needs to be evaluated.
>
> That said, I still don't feel that I have seen good, modern Open-Source optimization code. I would welcome suggestions of where one might find such code.
>
> On Wed, Jun 11, 2008 at 3:16 AM, DavidM.UK [EMAIL PROTECTED] wrote:
> > I believe nlminb() performs *constrained* optimization, whereas nlm() is for *unconstrained* optimization. So I guess nlm() is for solving min(f[a,b]), and nlminb() min(f[a,b]) given a+b = c
> >
> > FYI I think optim() also does constrained optimization, well I've used for min(f[a,b]) given a = a* and b = b*.
> >
> > David
> >
> > ae2356 wrote:
> > > Hi, I was wondering if someone could give a brief, big picture overview of the difference between the two optimization functions nlm and nlminb. I'm not familiar with PORT routines, so I was hoping someone could give an explanation.
> > > Thanks, Angelo

-
David Merritt
Postgrad [Statistics]
University of Bristol, UK
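The distinction drawn in this thread (nlm for unconstrained problems, nlminb for unconstrained or box-constrained ones) can be sketched with a toy objective. The function `f` and the bounds below are invented for illustration and are not taken from any of the posters' models.

```r
# Toy objective with its unconstrained minimum at (3, -1)
f <- function(p) (p[1] - 3)^2 + (p[2] + 1)^2

# nlm: unconstrained minimisation
u <- nlm(f, p = c(0, 0))

# nlminb: same objective, but each parameter restricted to the box [0, 2]
b <- nlminb(start = c(0, 0), objective = f,
            lower = c(0, 0), upper = c(2, 2))

u$estimate   # close to (3, -1)
b$par        # pushed to the edge of the box, close to (2, 0)
```

Note that box constraints only bound individual parameters; a constraint linking parameters, such as one on a + b, is not something either function handles directly.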
Re: [R] ADaCGH package crashes at mpiInit()
Dear Daren,

First, please note that since this problem concerns a particular package, you are supposed to contact the package maintainer (me) directly. (See the R-FAQ, 9.2).

Anyway, I've never seen that error message before. But I think it indicates a problem with your MPI setup, nothing related to R (and, thus, neither ADaCGH nor Rmpi). A few things to try:

1. When ADaCGH fails, could you do a traceback() to see where exactly it is failing?

2. (In a newly started R session) please try something like the following:

  library(Rmpi)
  mpi.universe.size()
  mpi.spawn.Rslaves()
  mpi.remote.exec(rnorm(2))

And let's see how far the example gets.

3. How is MPI started? Is it LAM-MPI or OpenMPI?

Best,
R.

--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
Phone: +34-91-224-6900 ext. 3019
Re: [R] How to change the class of data?
On Thu, 12 Jun 2008, Birgitle wrote:

> I have an additional question concerning this topic. I usually use something like this:
>
>   read.table(, colClasses=c("numeric", "factor", "character", "my.funny.class"))
>
> but why can I not implement ordered.factor in there?

Because the help page says

  colClasses: character. A vector of classes to be assumed for the columns. Recycled as necessary, or if the character vector is named, unspecified values are taken to be 'NA'.

  Possible values are 'NA' (when 'type.convert' is used), 'NULL' (when the column is skipped), one of the atomic vector classes (logical, integer, numeric, complex, character, raw), or 'factor', 'Date' or 'POSIXct'. Otherwise there needs to be an 'as' method (from package 'methods') for conversion from 'character' to the specified formal class.

There is no as.ordered.factor() nor as() method, and nothing in the data can specify the order of the levels unambiguously. So unless one accepts a guess (and in R we try not to do that for you, as in e.g. as.Date with numbers), there is no possibility to support ordered factors.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
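Since colClasses cannot carry the order of the levels, one common pattern is to read the column as character and impose the order afterwards. A minimal sketch, assuming a made-up column `dose` whose intended order is low < medium < high:

```r
raw <- "id dose
1 low
2 high
3 medium"

# Read the column as plain character first
d <- read.table(text = raw, header = TRUE,
                colClasses = c("integer", "character"))

# Then state the level order explicitly; nothing is guessed from the data
d$dose <- factor(d$dose, levels = c("low", "medium", "high"), ordered = TRUE)

lower_than <- d$dose[1] < d$dose[2]   # "low" < "high"
```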
Re: [R] applying a function recursively
Georg Otto wrote:
> Hi, I have a question about applying a function recursively through a list. Suppose I have a list where the different elements have different levels of recursion:
> ...
> I understand that with a fixed number of recursion levels one can use lapply() in a nested way, but what if the numbers of recursion levels is not fixed or is different between the list elements as it is in my example?

Hi Georg,

Have a look at listBuilder and listCrawler in the crank package.

Jim
[R] as.numeric(.) returns 0
In R version 2.7.0 (2008-04-22) as.numeric(".") returns zero.

> as.numeric(".")
[1] 0

This must be a bug. Splus and previous versions of R (<= 2.6.0) return NA, as you might expect. I'm running R version 2.7.0 (2008-04-22) on Windows XP.

Paul

Paul Johnson
Robertson Centre for Biostatistics
University of Glasgow
Glasgow G12 8QQ, UK
[EMAIL PROTECTED]
http://www.stats.gla.ac.uk/~paulj/index.html
http://www.rcb.gla.ac.uk/
Re: [R] applying a function recursively
Hi,

thanks a lot for your help. Somehow rapply had escaped my notice. I also have a follow-up question on that. I would like to flatten my output list to a list with only one level. Option how="unlist" in rapply returns a character vector, in my example:

> rapply(test.list, rev, how="unlist")
   I.A1    I.A2    I.A3    I.B1    I.B2    I.B3    I.C1    I.C2    I.C3 II.A.a1
    "c"     "b"     "a"     "f"     "e"     "d"     "i"     "h"     "g"     "c"
II.A.a2 II.A.a3 II.A.b1 II.A.b2 II.A.b3 II.A.c1 II.A.c2 II.A.c3   II.B1   II.B2
    "b"     "a"     "f"     "e"     "d"     "i"     "h"     "g"     "f"     "e"
  II.B3   II.C1   II.C2   II.C3
    "d"     "i"     "h"     "g"

What I rather would like to achieve is a list like this:

$I.A
[1] "c" "b" "a"
$I.B
[1] "f" "e" "d"
$I.C
[1] "i" "h" "g"
$II.A.a
[1] "c" "b" "a"
$II.A.b
[1] "f" "e" "d"
$II.A.c
[1] "i" "h" "g"
$II.B
[1] "f" "e" "d"
$II.C
[1] "i" "h" "g"

Any hint will be appreciated.

Best,
Georg

Prof Brian Ripley [EMAIL PROTECTED] writes:
> See ?rapply
>
> On Wed, 11 Jun 2008, Georg Otto wrote:
>> Hi,
>>
>> I have a question about applying a function recursively through a list.
>> Suppose I have a list where the different elements have different levels
>> of recursion:
>>
>> > test.list <- list(I=list(A=c("a", "b", "c"), B=c("d", "e", "f"), C=c("g", "h", "i")),
>> +                   II=list(A=list(a=c("a", "b", "c"), b=c("d", "e", "f"),
>> +                                  c=c("g", "h", "i")),
>> +                           B=c("d", "e", "f"), C=c("g", "h", "i")))
>> > test.list
>> $I
>> $I$A
>> [1] "a" "b" "c"
>> $I$B
>> [1] "d" "e" "f"
>> $I$C
>> [1] "g" "h" "i"
>> $II
>> $II$A
>> $II$A$a
>> [1] "a" "b" "c"
>> $II$A$b
>> [1] "d" "e" "f"
>> $II$A$c
>> [1] "g" "h" "i"
>> $II$B
>> [1] "d" "e" "f"
>> $II$C
>> [1] "g" "h" "i"
>>
>> I would like to apply a function recursively to that list, in a way that
>> the function does something with each vector (e.g. rev()) and returns a
>> list of modified vectors that has the same structure as the input list,
>> in my example:
>>
>> $I
>> $I$A
>> [1] "c" "b" "a"
>> $I$B
>> [1] "f" "e" "d"
>> $I$C
>> [1] "i" "h" "g"
>> $II
>> $II$A
>> $II$A$a
>> [1] "c" "b" "a"
>> $II$A$b
>> [1] "f" "e" "d"
>> $II$A$c
>> [1] "i" "h" "g"
>> $II$B
>> [1] "f" "e" "d"
>> $II$C
>> [1] "i" "h" "g"
>>
>> I understand that with a fixed number of recursion levels one can use
>> lapply() in a nested way, but what if the numbers of recursion levels is
>> not fixed or is different between the list elements as it is in my example?
>>
>> Any hint will be appreciated.
>>
>> Best,
>> Georg
Re: [R] applying a function recursively
Prof Brian Ripley wrote:
> See ?rapply

Golly, the things one learns when least expecting it.

Jim
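For readers following the pointer to ?rapply, a minimal sketch of what it does on the list from this thread. It assumes how = "list" is the mode wanted, since that is the one that keeps the nesting of the input.

```r
# The example list from the original question
test.list <- list(
  I  = list(A = c("a", "b", "c"), B = c("d", "e", "f"), C = c("g", "h", "i")),
  II = list(A = list(a = c("a", "b", "c"), b = c("d", "e", "f"),
                     c = c("g", "h", "i")),
            B = c("d", "e", "f"), C = c("g", "h", "i"))
)

# rev() is applied to every leaf vector; how = "list" preserves the structure,
# however deep each branch happens to be
res <- rapply(test.list, rev, how = "list")
str(res)
```

The same call with how = "unlist" collapses everything into one named vector, which is the behaviour discussed in the follow-up messages.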
Re: [R] as.numeric(.) returns 0
Paul Johnson:
> In R version 2.7.0 (2008-04-22) as.numeric(".") returns zero.
>
> > as.numeric(".")
> [1] 0

Seems to be fixed already. In R version 2.7.0 Patched (2008-06-12 r45898):

$ as.numeric(".")
[1] NA
Warning message:
NAs introduced by coercion

-- 
Karl Ove Hufthammer
[R] read and write stdout() to tktext window
I've been trying to write the console's output to a tktext window, but have not succeeded. Does anybody know if that works? Any help would be highly appreciated.

Thanks in advance,
Andreas Posch
Re: [R] as.numeric(.) returns 0
Paul Johnson wrote:
> In R version 2.7.0 (2008-04-22) as.numeric(".") returns zero.
>
> > as.numeric(".")
> [1] 0
>
> This must be a bug. Splus and previous versions of R (<= 2.6.0) return NA,
> as you might expect. I'm running R version 2.7.0 (2008-04-22) on Windows XP.

I suspect that this got fixed along with the lone sign issue. I have

R version 2.7.0 Patched (2008-06-12 r45900)
Copyright (C) 2008 The R Foundation for Statistical Computing

> as.numeric(".")
[1] NA
Warning message:
NAs introduced by coercion

-- 
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])
Re: [R] Close Window Button Problems
You have failed to provide the most basic of information as requested in the posting guide. As you mention 'i create a x11 window and plot' I will assume you mean that you open an X11() device and hence this is some Unix-alike OS.

This has come up several times before, so please search the archives. If you 'popen' R, its input is not from a tty and so it is not considered to be running interactively and hence is not expecting users to interact with it (like shutting down windows). R 2.7.0 allows a --interactive flag which may help.

Exactly what the circumstances are in which the event loop is blocked seems to be system-specific. But normally if non-interactive R is reading from stdin it is completely blocked until input is completed -- and that is not usually the case if R is interactive.

Should this be Windows (which does have an x11() device), the relevant flag is --ess.

On Wed, 11 Jun 2008, [EMAIL PROTECTED] wrote:
> i have created a wrapper C++ class that popen()s R. When i create a x11
> window and plot, i cannot close it using the window button. Also, when i
> minimize or maximize the window, the plot does not redraw. Can anybody
> tell me if there is a way i can get back this window functionality or if
> it is not possible?
>
> thank you kindly.
> -damon

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[R] Predicting from an nls model
I keep running up against the same error when I try to plot a line from a nls model. The data is fisheries length/weight data. Code follows:

require(graphics)
pow = nls(Weight~alpha*Length^beta, data=wae,
          start=list(alpha=0.001, beta=3.0), trace=TRUE)
predict(pow)
plot(Weight~Length, data = wae, pch=19, xlab="Length (mm)", ylab="Weight (g)",
     xlim = c(150,1000), ylim = c(0, 10050))
mod = seq(150, 1000)
lines(mod, predict(pow, list(Weight = mod)))

The error I get after I submit the final line is:

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

Like my last post, I'm certain there's something simple I'm overlooking. I've been able to get this to work on other data sets, but _how_ I've been able to get this to work, I'm unsure.

Thanks for your help,

SR

Steven H. Ranney
Graduate Research Assistant (Ph.D)
USGS Montana Cooperative Fishery Research Unit
Montana State University
PO Box 173460
Bozeman, MT 59717-3460
phone: (406) 994-6643
fax: (406) 994-7479
Re: [R] Predicting from an nls model
Ranney, Steven <steven.ranney at montana.edu> writes:
> plot(Weight~Length, data = wae, pch=19, xlab="Length (mm)", ylab="Weight (g)",
>      xlim = c(150,1000), ylim = c(0, 10050))
> mod = seq(150, 1000)
> lines(mod, predict(pow, list(Weight = mod)))
>
> The error I get after I submit the final line is:
>
> Error in xy.coords(x, y) : 'x' and 'y' lengths differ

Don't you need to specify Length (predictor variable) rather than Weight (response variable) to predict() in this case?

Ben Bolker
Re: [R] Predicting from an nls model
To predict from Weight~alpha*Length^beta you need to specify Length, not Weight. It is most likely finding Length from your workspace.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
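The point about naming the predictor can be sketched as follows. The data frame wae here is simulated for illustration only (the real fisheries data are not in the thread), and the start values are chosen to suit the simulated data rather than taken from the original post.

```r
# Simulated length/weight data standing in for the real 'wae' data frame
set.seed(1)
wae <- data.frame(Length = seq(150, 1000, by = 10))
wae$Weight <- 1e-5 * wae$Length^3 * exp(rnorm(nrow(wae), sd = 0.05))

# Power-law fit, same model formula as in the thread
pow <- nls(Weight ~ alpha * Length^beta, data = wae,
           start = list(alpha = 1e-5, beta = 3))

# newdata must name the predictor (Length), not the response (Weight)
mod  <- seq(150, 1000)
pred <- predict(pow, newdata = data.frame(Length = mod))

plot(Weight ~ Length, data = wae, pch = 19,
     xlab = "Length (mm)", ylab = "Weight (g)")
lines(mod, pred)
```

With Length supplied, predict() returns one value per element of mod, so the x and y passed to lines() have the same length.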
Re: [R] Predicting from an nls model
Thanks. As a (relatively) new user of R and programming in general, I tend to miss things like that. I appreciate your patience.

SR

Steven H. Ranney
Graduate Research Assistant (Ph.D)
USGS Montana Cooperative Fishery Research Unit
Montana State University
PO Box 173460
Bozeman, MT 59717-3460
phone: (406) 994-6643
fax: (406) 994-7479
Re: [R] applying a function recursively
Wrap each element in an environment, flatten that and then extract the element in each environment. (Be sure not to use an old version of R since sufficiently far back R had a bug when environments were stored in lists that was since fixed.)

L <- rapply(test.list, function(el) environment(), how = "unlist")
lapply(L, "[[", "el")

Alternately use proto objects (http://r-proto.googlecode.com):

library(proto)
L <- rapply(test.list, function(el) proto(, el = el), how = "unlist")
lapply(L, "[[", "el")

On Thu, Jun 12, 2008 at 7:11 AM, Georg Otto [EMAIL PROTECTED] wrote:
> I would like to flatten my output list to a list with only one level.
> Option how="unlist" in rapply returns a character vector
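As an alternative to the environment trick, a plain recursive helper can do the flattening. This is only a sketch: flatten() is a hypothetical function written for this example, not part of base R, and it assumes every list element is named.

```r
test.list <- list(
  I  = list(A = c("a", "b", "c"), B = c("d", "e", "f"), C = c("g", "h", "i")),
  II = list(A = list(a = c("a", "b", "c"), b = c("d", "e", "f"),
                     c = c("g", "h", "i")),
            B = c("d", "e", "f"), C = c("g", "h", "i"))
)

# Hypothetical helper: walk the list and collect each leaf under a
# dot-separated name built from the path to it
flatten <- function(x, prefix = NULL) {
  if (!is.list(x)) return(setNames(list(x), prefix))
  out <- list()
  for (nm in names(x)) {
    key <- if (is.null(prefix)) nm else paste(prefix, nm, sep = ".")
    out <- c(out, flatten(x[[nm]], key))
  }
  out
}

# Reverse every leaf first, then flatten to a one-level list
flat <- flatten(rapply(test.list, rev, how = "list"))
str(flat)
```

The result is a one-level list whose names follow the I.A, II.A.a pattern asked for in the follow-up question.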
Re: [R] cch function and time dependent covariates
----- begin included message -----
In case cohort study, we can fit proportional hazard regression model to case-cohort data. In R, the function is cch() in Survival package. Now I am working on case cohort analysis with time dependent covariates using cch() of Survival R package. I wonder if cch() provide this utility or not? The cch() manual does not say if time dependent covariate is allowed. I know coxph() in Survival package can estimate time dependent covariates.
----- end inclusion -----

The cch function was added to the package by Breslow and Lumley, neither of whom appears to be monitoring the list lately. Since it claims to implement the methods in Li and Therneau, and I don't know the cch code, let me suggest an alternate way to create your fit.

Assume that your data set has the usual coxph variables, including time-dependent covariates as multiple observations per subject using (start, stop) style, along with 2 other variables:

  id   = a unique identifier per subject
  case = 0 if the subject is a member of the random subcohort
         1 if the subject is a case (an event from outside the subcohort)

Then

  coxph(Surv(time1, time2, status) ~ x1 + x2 + offset(-100*case) + cluster(id), data=mydata)

will fit the case-cohort model. This correctly allows for time-dependent covariates. It corresponds to the Self method of cch.

Why -100? It causes the case to have a relative weight of approx 0 in a particular weighted mean; exp(-100) is small enough and doesn't cause trouble for the exp function.

Terry Therneau
[R] controlling location of labels in axis()
Here's a naive question about axis(). How do you control the location of the labels with the axis() command? In the following example:

foo <- data.frame(plot.x=seq(1:3), plot.y=seq(4:6))
plot(foo$plot.x, foo$plot.y, type='n', axes=FALSE)
points(foo$plot.x, foo$plot.y)
axis(1, at=foo$plot.x, labels=foo$plot.x)

I'd like to be able to control the y location of the labels (i.e. move them up or down in the graph).

Thanks,
Andrew
Re: [R] controlling location of labels in axis()
Hi Andrew,

Perhaps this example would help. You can add in spaces to the mtext text to move the text sideways.

par(mai=c(0.5,0.5,0.5,0.5),oma=c(2,2,2,2))
#mai units are INCHES, oma units are LINES
plot(runif(50),xlab="xlab",ylab="ylab",bty="l")
#n.b. these labels don't appear
mtext("First inner x axis label",side=1)
mtext("First inner y axis label",side=2)
mtext("Second inner x axis label",side=3)
mtext("Second inner y axis label",side=4)
mtext("First outer x axis label",side=1,outer=TRUE)
mtext("First outer y axis label",side=2,outer=TRUE)
mtext("Second outer x axis label",side=3,outer=TRUE)
mtext("Second outer y axis label",side=4,outer=TRUE)

Cheers,
Toby Marthews

On Thu, 12 June 2008 15:32, Andrew Yee wrote:
> Here's a naive question about axis(). How do you control the location of
> the labels with the axis() command?
> ...
> I'd like to be able the control the y location of the labels (i.e. move
> up or down in the graph).
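For the tick labels specifically, one possible approach is to draw the ticks without labels and then place the labels with mtext(), whose line argument sets the distance from the plot edge. A minimal sketch; the value line = 2 is an arbitrary choice for illustration, and the data frame uses 1:3 and 4:6 directly (seq(4:6) in the question actually yields 1, 2, 3).

```r
foo <- data.frame(plot.x = 1:3, plot.y = 4:6)

plot(foo$plot.x, foo$plot.y, axes = FALSE, xlab = "", ylab = "")

# Ticks only, no labels
axis(1, at = foo$plot.x, labels = FALSE)

# Labels placed by hand: a larger 'line' moves them further below the axis
label_line <- 2
mtext(foo$plot.x, side = 1, at = foo$plot.x, line = label_line)

# Alternative in a single call: the second element of mgp is the label line
# axis(1, at = foo$plot.x, labels = foo$plot.x, mgp = c(3, 2, 0))
box()
```

Both variants move the labels in units of margin lines, so the margin (see par("mar")) may need to be enlarged if the labels are pushed far out.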
[R] p-value
Dear R User,

say I have this sample of data (attached). What I'm going to do is to test whether this data is uniformly distributed or not by finding the p-value. I've tried using the punif command but it gave me the value of 1 for all the data. Any suggestion on an R command to find the p-value? Thanks in advance!

Cheers,
Anisah

132968364 135945080 156539568 157817896 162399496 168344072 173146584
176302744 182878168 183946152 185068720 190791232 84317660 93708872
106810172 12684 148519056 150945112 155771432 181069984 87104384
Re: [R] p-value
Here's a sample:

> unif_rand_1 <- runif(1000);
> unif_rand_2 <- runif(1000);
> ks.test(unif_rand_1,unif_rand_2);

        Two-sample Kolmogorov-Smirnov test

data:  unif_rand_1 and unif_rand_2
D = 0.021, p-value = 0.9802
alternative hypothesis: two-sided

So in your case:

ks.test( runif( length( your_data ) ), your_data );

- John

On Thu, Jun 12, 2008 at 9:55 AM, mohamed nur anisah [EMAIL PROTECTED] wrote:
> What i'm going to do is to test whether this data is uniformly distributed
> or not by finding the p-value. I've tried using the punif command but it
> gave me the value of 1 of all the data.
Re: [R] p-value
Something like. . .

> midpoint <- c(132968364, 135945080, 156539568, 157817896,
+               162399496, 168344072, 173146584, 176302744,
+               182878168, 183946152, 185068720, 190791232,
+               84317660, 93708872, 106810172, 12684,
+               148519056, 150945112, 155771432, 181069984,
+               87104384
+               )
> shapiro.test(midpoint)

HTH,
Patrick

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of mohamed nur anisah
Sent: Thursday, June 12, 2008 9:56 AM
To: r-help@r-project.org
Subject: [R] p-value

> What i'm going to do is to test whether this data is uniformly distributed
> or not by finding the p-value. I've tried using the punif command but it
> gave me the value of 1 of all the data.
[R] About Mcneil Hanley test for a portion of AUC!
Dear all,

I am trying to compare the performances of several methods using the AUC0.1 and not the whole AUC (meaning I wanted to compare two AUCs whose x axis only goes to 0.1, not 1). I came to know about the McNeil Hanley test from Bernardo Rangel Tura and I referred to the original paper for the calculation of r, which is an argument of the function cROC. I can only find the value of r for the whole AUCs.

seROC<-function(AUC,na,nn){
  a<-AUC
  q1<-a/(2-a)
  q2<-(2*a^2)/(1+a)
  se<-sqrt((a*(1-a)+(na-1)*(q1-a^2)+(nn-1)*(q2-a^2))/(nn*na))
  se
}

cROC<-function(AUC1,na1,nn1,AUC2,na2,nn2,r){
  se1<-seROC(AUC1,na1,nn1)
  se2<-seROC(AUC2,na2,nn2)
  sed<-sqrt(se1^2+se2^2-2*r*se1*se2)
  zad<-(AUC1-AUC2)/sed
  p<-dnorm(zad)
  a<-list(zad,p)
  a
}

Could somebody kindly suggest how to calculate the value of r, or some way to calculate a statistical significance measure for the difference of AUCs for a part of the curve like AUC0.1?

Thank You
--
Dukka KC
UNCC
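One detail in the code as posted: dnorm(zad) is the normal density at z, not a p-value. Below is a hedged sketch of the same two functions with a two-sided p-value taken from pnorm() instead. The correlation r is still supplied by the caller, the numbers in the example call are invented, and nothing here addresses the partial-AUC question itself.

```r
# Standard error of an AUC, same formula as in the post
# (na, nn: the two group sizes, as named in the post)
seROC <- function(AUC, na, nn) {
  q1 <- AUC / (2 - AUC)
  q2 <- 2 * AUC^2 / (1 + AUC)
  sqrt((AUC * (1 - AUC) + (na - 1) * (q1 - AUC^2) +
          (nn - 1) * (q2 - AUC^2)) / (na * nn))
}

# z statistic and two-sided p-value for the difference of two correlated AUCs;
# r is the correlation between the two AUC estimates, chosen by the caller
cROC <- function(AUC1, na1, nn1, AUC2, na2, nn2, r) {
  se1 <- seROC(AUC1, na1, nn1)
  se2 <- seROC(AUC2, na2, nn2)
  sed <- sqrt(se1^2 + se2^2 - 2 * r * se1 * se2)
  z   <- (AUC1 - AUC2) / sed
  list(z = z, p = 2 * pnorm(-abs(z)))   # two-sided tail probability
}

# Invented inputs, for illustration only
res <- cROC(0.85, 50, 60, 0.80, 50, 60, r = 0.4)
print(res)
```

Whether this standard-error formula is appropriate for a partial area such as AUC0.1 is exactly the open question of the post, so the sketch should only be read as a fix for the p-value line.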
Re: [R] p-value
Not sure if this is what you are looking for but you can get the p-value with something like this:

# Create a vector
mydata <- c(132968364, 135945080, 156539568, 157817896, 162399496,
            168344072, 173146584, 176302744, 182878168, 183946152,
            185068720, 190791232, 84317660, 93708872, 106810172,
            12684, 148519056, 150945112, 155771432, 181069984, 87104384)
plot(density(mydata))
shapiro.test(mydata)

From: mohamed nur anisah
> say I have this sample of data ( attach with). What i'm going to do is to
> test whether this data is uniformly distributed or not by finding the
> p-value. I've tried using the punif command but it gave me the value of 1
> of all the data. Any suggestion on R command to find the p-value??
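Note that shapiro.test() tests for normality, not uniformity. For the question as asked, one option is a one-sample Kolmogorov-Smirnov test against punif. A minimal sketch follows; the range 0 to 2e8 is an assumption made up for illustration, since the hypothesised interval is not given in the thread, and the data vector is copied as posted.

```r
mydata <- c(132968364, 135945080, 156539568, 157817896, 162399496,
            168344072, 173146584, 176302744, 182878168, 183946152,
            185068720, 190791232, 84317660, 93708872, 106810172,
            12684, 148519056, 150945112, 155771432, 181069984, 87104384)

# Assumed (hypothetical) range of the uniform distribution under test
lo <- 0
hi <- 2e8

# One-sample KS test: compares the data with the Uniform(lo, hi) CDF
res <- ks.test(mydata, "punif", min = lo, max = hi)
print(res)
```

The range should come from the problem, not from the data: plugging in min(mydata) and max(mydata) makes the reported p-value unreliable, and leaving the range at the default 0 to 1 is probably why punif() returned 1 for every value.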
[R] R/S course in *** New York City *** July 28-29 by XLSolutions Corp
Our July *** New York City *** R/S Fundamentals and Programming Techniques course is scheduled for: New York City / July 28-29, 2008

*** Please direct enquiries to Sue Turner: [EMAIL PROTECTED] Ask for Group Discount

Looking for the R Advanced course? It's coming up in Seattle on August 14-15, 2008 :) www.xlsolutions-corp.com/courselist.htm

Payment due AFTER the class. Please let us know if you and your colleagues are interested in this class to take advantage of the group discount. Register now to secure your seat!

Email Sue Turner: [EMAIL PROTECTED]
Phone: 206-686-1578
Visit us: www.xlsolutions-corp.com/courselist.htm

Cheers,
Elvis Miller, PhD
Manager Training
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
Re: [R] cch function and time dependent covariates
I tried your alternative method on the example in the cch() manual. The example data nwtco has no time-dependent covariates yet. I tested cch() and coxph() on the same data, but the estimation results are different. I don't know if I did anything wrong.

subcoh <- nwtco$in.subcohort
selccoh <- with(nwtco, rel==1|subcoh==1)
ccoh.data <- nwtco[selccoh,]
ccoh.data$subcohort <- subcoh[selccoh]
## central-lab histology
ccoh.data$histol <- factor(ccoh.data$histol,labels=c("FH","UH"))
## tumour stage
ccoh.data$stage <- factor(ccoh.data$stage,labels=c("I","II","III","IV"))
ccoh.data$age <- ccoh.data$age/12 # Age in years

fit.ccSP <- cch(Surv(edrel, rel) ~ stage + histol + age, data =ccoh.data,
                subcoh = ~subcohort, id=~seqno, cohort.size=4028, method="SelfPren")
fit2.ccP <- coxph(Surv(edrel, rel) ~ stage + histol + age +
                  offset(-100*subcohort)+cluster(seqno),data =ccoh.data)

> fit2.ccP
Call:
coxph(formula = Surv(edrel, rel) ~ stage + histol + age + offset(-100 *
    subcohort) + cluster(seqno), data = ccoh.data)

            coef exp(coef) se(coef) robust se      z      p
stageII  -0.1245     0.883   0.1236    0.1371 -0.908 0.3600
stageIII  0.0193     1.020   0.1252    0.1517  0.127 0.9000
stageIV   0.2997     1.350   0.1370    0.1509  1.986 0.0470
histolUH  0.3518     1.422   0.0920    0.1092  3.223 0.0013
age      -0.0281     0.972   0.0144    0.0168 -1.678 0.0930

Likelihood ratio test=34.5 on 5 df, p=1.89e-06  n= 1154

> summary(fit.ccSP)
Case-cohort analysis,x$method, SelfPrentice
 with subcohort of 668 from cohort of 4028

Call: cch(formula = Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
    subcoh = ~subcohort, id = ~seqno, cohort.size = 4028, method = "SelfPren")

Coefficients:
          Coef    HR  (95%   CI)     p
stageII  0.736 2.088 1.491 2.925 0.000
stageIII 0.597 1.818 1.285 2.571 0.001
stageIV  1.392 4.021 2.670 6.057 0.000
histolUH 1.506 4.507 3.274 6.203 0.000
age      0.043 1.044 0.996 1.095 0.069
Re: [R] model simplification using Crawley as a guide
Simon Blomberg wrote:
> Good points Ben. For now I'd recommend simply that the allergic reaction to insignificant statistical tests be treated with an antihistamine :-)
> A vote for Frank's comment to be added to the 'fortunes' package.
> Seconded! :-)

That'll be anti-hist()-amine, I presume?

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
([EMAIL PROTECTED]) Ph: (+45) 35327918 FAX: (+45) 35327907
Re: [R] model simplification using Crawley as a guide
on 06/12/2008 09:37 AM Peter Dalgaard wrote:
> Simon Blomberg wrote:
> Good points Ben. For now I'd recommend simply that the allergic reaction to insignificant statistical tests be treated with an antihistamine :-)
> A vote for Frank's comment to be added to the 'fortunes' package.
> Seconded! :-)
> That'll be anti-hist()-amine, I presume?

rimshot ;-)

Marc
Re: [R] problem with function rep
We need a reproducible example of this to tell you what is going on. Find a small example that exhibits the confusing behavior, and share it with the list.

Julien Hunt wrote:
> To whom it may concern,
> I am currently writing a program where I need to use function rep. The results I get are quite confusing. Given two vectors A and B, I want to replicate a[1] b[1] times, a[2] b[2] times and so on. All the entries of vector B are positive integers. My problem comes from the fact that if I sum up all the elements of B, I get a certain value x(for example 1). And if I calculate the length of the vector obtained after replication, I don't always get x(1) but sometimes I get x sometimes I get instead of 1. Has this problem been reported before? Do you need more information on my specific program?
> Thanks for your time and help,
> Best regards,
> Julien Hunt
>
> Julien Hunt, PhD student and teaching assistant, Institute of Statistics, Université Catholique de Louvain, Voie du Roman pays 20, B-1348 Louvain-La-Neuve, Belgium
> E-mail: [EMAIL PROTECTED] Tel: +32 10 / 47 94 01
Re: [R] controlling location of labels in axis()
Thanks for the reply. I think I've figured it out: you can set this with the mgp parameter. So I'd use the following statement instead:

    axis(1, at=foo$plot.x, labels=foo$plot.x, mgp=c(3,0.5,1))  # this brings the axis labels closer to the axis line

Andrew

On Thu, Jun 12, 2008 at 9:53 AM, Toby Marthews [EMAIL PROTECTED] wrote:
> Hi Andrew,
> Perhaps this example would help. You can add spaces to the mtext text to move the text sideways.
>
>     par(mai=c(0.5,0.5,0.5,0.5), oma=c(2,2,2,2))  # mai units are INCHES, oma units are LINES
>     plot(runif(50), xlab="xlab", ylab="ylab", bty="l")  # n.b. these labels don't appear
>     mtext("First inner x axis label", side=1)
>     mtext("First inner y axis label", side=2)
>     mtext("Second inner x axis label", side=3)
>     mtext("Second inner y axis label", side=4)
>     mtext("First outer x axis label", side=1, outer=TRUE)
>     mtext("First outer y axis label", side=2, outer=TRUE)
>     mtext("Second outer x axis label", side=3, outer=TRUE)
>     mtext("Second outer y axis label", side=4, outer=TRUE)
>
> Cheers,
> Toby Marthews
>
> On Thu, 12 June 2008 15:32, Andrew Yee wrote:
> > Here's a naive question about axis(). How do you control the location of the labels with the axis() command? In the following example:
> >
> >     foo <- data.frame(plot.x=seq(1:3), plot.y=seq(4:6))
> >     plot(foo$plot.x, foo$plot.y, type='n', axes=FALSE)
> >     points(foo$plot.x, foo$plot.y)
> >     axis(1, at=foo$plot.x, labels=foo$plot.x)
> >
> > I'd like to be able to control the y location of the labels (i.e. move them up or down in the graph).
> > Thanks,
> > Andrew
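As a minimal sketch of the mgp approach above (the data values and the particular mgp numbers are illustrative assumptions, not taken from the thread): the second element of mgp is the distance, in margin lines, between the axis line and the tick labels, so lowering it pulls the labels toward the axis and raising it pushes them away.

```r
# Toy data with the same column names as in the thread (values are assumed)
foo <- data.frame(plot.x = 1:3, plot.y = 4:6)

plot(foo$plot.x, foo$plot.y, type = "n", axes = FALSE)
points(foo$plot.x, foo$plot.y)

# mgp = c(axis title line, tick label line, axis line); the default is c(3, 1, 0).
# Lowering the second value from 1 to 0.5 moves the tick labels closer to the axis.
axis(1, at = foo$plot.x, labels = foo$plot.x, mgp = c(3, 0.5, 0))

# A larger second value moves the labels further away instead.
axis(2, mgp = c(3, 2, 0))
```

Passing mgp to axis() affects only that one axis, whereas par(mgp = ...) would change it for all subsequent plots on the device.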
Re: [R] problem with function rep
G'day Julien,

On Thu, 12 Jun 2008 16:48:43 +0200 Julien Hunt [EMAIL PROTECTED] wrote:
> I am currently writing a program where I need to use function rep. The results I get are quite confusing. Given two vectors A and B, I want to replicate a[1] b[1] times, a[2] b[2] times and so on. All the entries of vector B are positive integers. My problem comes from the fact that if I sum up all the elements of B, [...]

Others have already mentioned the need for a reproducible example. But my guess is that the elements in B are calculated. Recently, I was sent the following code by a colleague of mine:

---
Hi Berwin,
Try this in R 2.7.0:

    pai = c(.4,.1,.1,.4)
    s = .5
    p = diag(1-s, 4) + s * t(matrix(pai, 4, 4))
    f = diag(pai) %*% p
    z = 200*f  ### bug???
    z
    sum(z)
    length(rep(1:16, z))
    length(rep(1:16, round(z)))
---

I tested the code and my answer was:

---
Interesting variation on FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
Look at z-round(z) and where the negative residuals are.
---

My money is on you having the same problem, and on round(B) instead of B in the rep() command solving it.

HTH.
Cheers,
Berwin

Berwin A Turlach
Dept of Statistics and Applied Probability, Faculty of Science
National University of Singapore, 6 Science Drive 2, Blk S16, Level 7, Singapore 117546
Tel.: +65 6516 4416 (secr), +65 6516 6650 (self) FAX: +65 6872 3919
e-mail: [EMAIL PROTECTED] http://www.stat.nus.edu.sg/~statba
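The FAQ 7.31 effect on rep() can be shown with a much smaller case. This is only a sketch with made-up numbers, not the data from the thread: a count that is mathematically an integer but was computed in floating point can land just below that integer, and rep() truncates non-integer counts toward zero.

```r
# (0.1 + 0.7) * 10 is 8 on paper, but slightly below 8 in double precision.
n <- (0.1 + 0.7) * 10

n == 8                       # FALSE
isTRUE(all.equal(n, 8))      # TRUE: equal up to numerical tolerance

# rep() drops the fractional part of the count, so one replicate goes missing.
length(rep("a", n))          # 7
length(rep("a", round(n)))   # 8
```

Rounding the counts before passing them to rep() is the fix suggested in the message above; all.equal() is the usual way to compare computed doubles when exact equality is too strict.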
Re: [R] problem with function rep
on 06/12/2008 09:48 AM Julien Hunt wrote:
> To whom it may concern,
> I am currently writing a program where I need to use function rep. The results I get are quite confusing. Given two vectors A and B, I want to replicate a[1] b[1] times, a[2] b[2] times and so on. All the entries of vector B are positive integers. My problem comes from the fact that if I sum up all the elements of B, I get a certain value x(for example 1). And if I calculate the length of the vector obtained after replication, I don't always get x(1) but sometimes I get x sometimes I get instead of 1. Has this problem been reported before? Do you need more information on my specific program?
> Thanks for your time and help,
> Best regards,
> Julien Hunt

An example would be most helpful, but presuming that you are using something along the lines of:

    rep(a, each = b)

I would check to be sure that:

    length(a) == length(b)

lest you end up with the issue of recycling values.

HTH,
Marc Schwartz
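A small sketch of the element-wise replication being discussed, using made-up vectors a and b. One caveat, based on my reading of ?rep rather than on anything stated in the thread: it is the times argument that accepts one count per element, while each expects a single value and only uses the first element of a longer vector.

```r
a <- c("x", "y", "z")
b <- c(2, 1, 3)

# Guard against mismatched lengths before replicating.
stopifnot(length(a) == length(b))

# times = b repeats a[1] b[1] times, a[2] b[2] times, and so on.
res <- rep(a, times = b)
res                      # "x" "x" "y" "z" "z" "z"
length(res) == sum(b)    # TRUE
```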
Re: [R] cch function and time dependent covariates
For time-dependent covariates, the same subject id has to appear on multiple rows, one per time interval, as in the following format. Multiple records per id are not allowed in cch(), so it is difficult to use cch() for time-dependent covariates. coxph() may be an alternative, but that also seems difficult, because coxph() and cch() return different estimates for the same nwtco data even without time-dependent covariates.

    id start end event
     1     3   4
     1     4   5
     1     5   6     1
     2     2   3
     2     3   4
     2     4   5
     2     5   6     1

I used the time-dependent covariate data set Rossi from
http://socserv.mcmaster.ca/jfox/Books/Companion/appendix-cox-regression.pdf
and built a new case-cohort data set with a time-dependent variable from it:

    sc <- sample(c(TRUE,FALSE,FALSE,FALSE,FALSE,FALSE), 432, replace = TRUE)
    str(Rossi)
    Rossi1 <- cbind(Rossi, sc)
    Rossi2 <- cbind(seqno, Rossi1)
    subcoh1 <- Rossi2$sc
    selccoh1 <- with(Rossi2, arrest.time==1|subcoh1==1)
    ccoh1.data <- Rossi2[selccoh1,]
    ccoh1.data$subcohort <- subcoh1[selccoh1]
    str(ccoh1.data)
    ccoh1.data.fold <- fold(ccoh1.data, time='week', event='arrest', cov=12:63, cov.names='employed')
    str(ccoh1.data.fold)
    ccoh1.data.fold$sc <- as.logical(ccoh1.data.fold$sc)
    ccoh1.data.fold$subcohort <- as.logical(ccoh1.data.fold$subcohort)

    > fit1.allison.2 <- cch(Surv(start, stop, arrest.time) ~
    +     fin + age + race + wexp + mar + paro + prio + employed,
    +     data=ccoh1.data.fold, subcoh=~subcohort, id=~seqno, cohort.size=19809)
    Error in cch(Surv(start, stop, arrest.time) ~ fin + age + race + wexp + :
      Multiple records per id not allowed
[R] Generate Random Samples
Hi,

I am a newbie to R and I am working on a Mac. Is there any package that I can use to generate random samples from a user-defined distribution? That is, I define a distribution function (possibly multi-dimensional) and I want random samples generated from this distribution.

There is also a more specific problem. If I have a three-component mixture in which each component is a normal distribution (say, 3-dimensional), is there any package that I can use to generate random samples from this mixture? I know I can generate random samples from each individual component. However, can I just add them directly?

Thanks.

--
Peng Jiang 江鹏
Ph.D. Candidate
Antai College of Economics & Management 安泰经济管理学院
Department of Mathematics 数学系
Shanghai Jiaotong University (Minhang Campus)
800 Dongchuan Road, 200240 Shanghai, P. R. China
Re: [R] case-cohort
Jin Wang had an error. My original note specified a variable that was 1 for subjects NOT in the subcohort, so the correct coxph call is

    coxph(Surv(edrel, rel) ~ stage + histol + age + offset(-100*(subcohort==0)) + cluster(seqno), data=ccoh.data)

This gives the same coefficients as the cch example, along with the infinitesimal jackknife or robust variance estimate.

Terry Therneau
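For readers who want to try the comparison themselves, here is a sketch that puts the data preparation from the earlier message and the corrected call side by side. It assumes the survival package (which ships nwtco and cch()) is installed; the object names fit.cch and fit.cox are mine, and no output is shown because I have not verified the numbers beyond what the thread reports.

```r
library(survival)

# Data preparation as in the cch() help-page example quoted in this thread
subcoh  <- nwtco$in.subcohort
selccoh <- with(nwtco, rel == 1 | subcoh == 1)
ccoh.data <- nwtco[selccoh, ]
ccoh.data$subcohort <- subcoh[selccoh]
ccoh.data$histol <- factor(ccoh.data$histol, labels = c("FH", "UH"))
ccoh.data$stage  <- factor(ccoh.data$stage, labels = c("I", "II", "III", "IV"))
ccoh.data$age    <- ccoh.data$age / 12   # age in years

# Self-Prentice fit with cch()
fit.cch <- cch(Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
               subcoh = ~subcohort, id = ~seqno, cohort.size = 4028,
               method = "SelfPren")

# coxph() version: the offset is applied to subjects NOT in the subcohort,
# i.e. the cases sampled from outside it.
fit.cox <- coxph(Surv(edrel, rel) ~ stage + histol + age +
                   offset(-100 * (subcohort == 0)) + cluster(seqno),
                 data = ccoh.data)

# Put the two sets of coefficients next to each other for comparison.
cbind(cch = fit.cch$coef, coxph = coef(fit.cox))
```

The point of the design is that the offset only needs to distinguish "in subcohort" from "case outside subcohort", so the same call extends to (start, stop) data with several rows per subject, which cch() rejects.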
[R] How to increase the for() loop speed?
Dear R users,

I would like to know if there is a way to increase the speed of a for() loop, because the calculations in my routine are too slow.

Best regards,
Rafael Barros de Rezende
Cedeplar - Center for Development and Regional Planning, Face, UFMG (http://www.cedeplar.ufmg.br)
Re: [R] problem with function rep
Berwin appears to be correct here. After you do x <- x / 0.0001, I inserted a call to round(x) - x, and received

    > round(x) - x
    [1] 7.275958e-12 0.00e+00 0.00e+00

This is basically a case of FAQ 7.31.

Julien Hunt wrote:
> Hi,
> I believe this should provide an example of the confusing behavior. Run this with t=100 for example:
>
>     test = function(t) {
>       x = c()
>       ### I simply generate some numbers from an exponential until the sum of
>       ### these numbers gets to 100 (without loss of generality)
>       while (sum(x) <= t) {
>         x = c(x, round(rexp(1, 0.1), 4))
>       }
>       x = x/0.0001
>       y = rnorm(length(x), 0, 1)
>       t = rep(y, x)
>       return(c(sum(x), length(t)))
>     }
>
> The intuition is that sum(x) and length(t) should be the same. Furthermore, rounding x seems unnecessary, since everything is done for it to be an integer. Nevertheless, I will try Berwin Turlach's method.
> Regards,
> Julien
[R] Problem with Freq function {prettyR}
Dear list,

I have a problem with freq from prettyR. Please have a look at my code, with a little example:

    library(prettyR)

    # Version 1
    test.df <- data.frame(q1=sample(1:4,8,TRUE), gender=sample(c("f","m"),8,TRUE))
    test.df
    freq(test.df)   # no error message

    # Version 2
    test.df <- data.frame(gender=sample(c("f","m"),8,TRUE), q1=sample(1:4,8,TRUE))
    test.df
    freq(test.df)

Error message:

    Error in vector("integer", length) : Vector size can't be NA

Can someone tell me why an error message occurs in version 2? I am helpless...

Thanks in advance!

Udo König
Clinic for Child and Adolescent Psychiatry
Philipps University of Marburg / Germany
[R] shell command
Hi,

Can we execute a Unix shell command from within the R shell?

thanks,
Sam
Re: [R] problem with function rep
Hi,

I believe this should provide an example of the confusing behavior. Run this with t=100 for example:

    test = function(t) {
      x = c()
      ### I simply generate some numbers from an exponential until the sum of
      ### these numbers gets to 100 (without loss of generality)
      while (sum(x) <= t) {
        x = c(x, round(rexp(1, 0.1), 4))
      }
      x = x/0.0001
      y = rnorm(length(x), 0, 1)
      t = rep(y, x)
      return(c(sum(x), length(t)))
    }

The intuition is that sum(x) and length(t) should be the same. Furthermore, rounding x seems unnecessary, since everything is done for it to be an integer. Nevertheless, I will try Berwin Turlach's method.

Regards,
Julien

At 17:01 12/06/2008, Erik Iverson wrote:
> We need a reproducible example of this to tell you what is going on. Find a small example that exhibits the confusing behavior, and share it with the list.

Julien Hunt, PhD student and teaching assistant, Institute of Statistics, Université Catholique de Louvain, Voie du Roman pays 20, B-1348 Louvain-La-Neuve, Belgium
E-mail: [EMAIL PROTECTED] Tel: +32 10 / 47 94 01
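A sketch of how the example function could look with the rounding fix that Berwin Turlach suggested applied. The name test_rep, the counts variable, and the named return vector are my own choices for illustration; the simulation logic is otherwise the same as in the message above.

```r
test_rep <- function(t) {
  x <- c()
  # Draw exponential variates, rounded to 4 decimals, until their sum exceeds t.
  while (sum(x) <= t) {
    x <- c(x, round(rexp(1, 0.1), 4))
  }
  # x / 0.0001 is a whole number on paper, but floating-point division can
  # leave it a hair below the integer, so round before using it as a count.
  counts <- round(x / 0.0001)
  y <- rnorm(length(x), 0, 1)
  replicated <- rep(y, counts)
  c(sum_counts = sum(counts), length_rep = length(replicated))
}

set.seed(1)
res <- test_rep(100)
res[["sum_counts"]] == res[["length_rep"]]   # TRUE
```

Because counts now holds exact integers, the length of the replicated vector always equals their sum, whatever the random draws are.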
[R] overall title
I have a 2x2 plot set up using:

    par(mfrow=c(2,2))

I'd like to put an overall title on the page, but I cannot figure out how. Any ideas?
Re: [R] shell command
Yes, see ?system

samitj wrote:
> Hi,
> Can we execute a unix shell command from within R shell?
> thanks,
> Sam
Re: [R] shell command
samitj wrote:
> Hi,
> Can we execute a unix shell command from within R shell?
> thanks,
> Sam

?system

hth,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography, Faculty of Geosciences, University of Utrecht
Heidelberglaan 2, P.O. Box 80.115, 3508 TC Utrecht
Phone: +31302535773 Fax: +31302531145
http://intamap.geo.uu.nl/~paul
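A short sketch of what ?system describes, assuming a Unix-like system where echo is available (the commands themselves are just placeholders):

```r
# Run a command; its output goes to the console and the return value
# is the command's exit status (0 usually means success).
status <- system("echo hello from the shell")

# With intern = TRUE the output is captured instead, as a character
# vector with one element per line.
out <- system("echo hello", intern = TRUE)
out
```

Capturing with intern = TRUE is the form to use when the result of the shell command is needed for further processing in R.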
Re: [R] XML parameters to Column Headers for importing into a dataset
Hi Ajay --

ajay ohri [EMAIL PROTECTED] writes:
> Dear List,
> Do you know any way I can convert XML parameters into column headers. My

In R, the XML package will help you...

> data is in a csv file with each row containing a xml form of data, and multiple parameters ( <param1> data_val1 </param2> , <param2> data_val2 </param2> )

I guess that first closing tag is param1...

> I want to convert it so each row caters to one record and each parameter becomes a different column.
>
>          param1     param2
>     Row1 data_val1  data_val2
>
> What is the most efficient way for doing this. Apologize for the duplicate

Personally I like to use the xpath query language; the following relies a little on your data being regular (e.g., all rows having entries for all column values), but for some file 'fl' (perhaps accessible via a url)

    library(XML)
    xml = xmlTreeParse(fl, useInternalNodes=TRUE)
    data.frame(
        param1 = unlist(xpathApply(xml, "//param1", xmlValue)),
        param2 = unlist(xpathApply(xml, "//param2", xmlValue)))

does the trick. These are string values; you can convert them to numeric in the usual R way (as.numeric(unlist...)) or at the xpath level (along the lines of xpathApply(xml, "number(//param1)")). xpath help is available at http://www.w3.org/TR/xpath, especially http://www.w3.org/TR/xpath#path-abbrev

The above is with R 2.7.0 and XML 1.95-2.

Martin

> email, but this is an emergency with loads of files for me !!!
> Regards,
> Ajay
> www.decisionstats.com

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N., PO Box 19024, Seattle, WA 98109
Location: Arnold Building M2 B169 Phone: (206) 667-2793
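A self-contained sketch of the xpath approach above, assuming the XML package is installed. The XML text and the rows/row wrapper elements are hypothetical stand-ins for the real files, which are not shown in the thread.

```r
library(XML)

# Hypothetical input: one <row> per record, one element per parameter
txt <- "<rows>
  <row><param1>1.5</param1><param2>a</param2></row>
  <row><param1>2.5</param1><param2>b</param2></row>
</rows>"

# Parse the string into an internal document that xpath queries can run on.
doc <- xmlParse(txt, asText = TRUE)

# One column per parameter; xmlValue() extracts the text of each matched node.
df <- data.frame(
  param1 = as.numeric(xpathSApply(doc, "//param1", xmlValue)),
  param2 = xpathSApply(doc, "//param2", xmlValue),
  stringsAsFactors = FALSE
)
df
```

As noted in the message, this relies on every record having every parameter; if some are missing, the columns will have different lengths and data.frame() will fail or recycle.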
Re: [R] How to increase the for() loop speed?
13 minutes is a long time for a loop to simply send an email, what other calculations are going on?

Rafael Barros de Rezende wrote:
> Dear R users,
> I would like to know if there is a way to increase the for() loop speed because in my routine the calculations are too slow.
> Best regards.
> Rafael Barros de Rezende
> Cedeplar - Center for Development and Regional Planning, Face, UFMG (http://www.cedeplar.ufmg.br)
Re: [R] overall title
See mtext:

    mtext("Title", outer = TRUE, side = 3, line = -2)

On Thu, Jun 12, 2008 at 12:38 PM, [EMAIL PROTECTED] wrote:
> I have a 2x2 plot set up using: par(mfrow=c(2,2))
> I'd like to put an overall title on the page, but I cannot figure out how. Any ideas?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] overall title
Please try this:

    z1 <- rexp(100)
    z2 <- rexp(100)
    z3 <- rexp(100)
    z4 <- rexp(100)
    par(mfrow=c(2,2), oma=c(0, 0, 3, 0))
    curve(dexp, from=0, to=5)
    hist(z1, main="first")
    hist(z2, main="second")
    hist(z3, main="third")
    mtext("Densities", outer = TRUE, cex = 1.5)

Hope this helps.
Sincerely,
Erin

On Thu, Jun 12, 2008 at 10:38 AM, [EMAIL PROTECTED] wrote:
> I have a 2x2 plot set up using: par(mfrow=c(2,2))
> I'd like to put an overall title on the page, but I cannot figure out how. Any ideas?

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]
Re: [R] About Mcneil Hanley test for a portion of AUC!
Dukka k.c. wrote:
> Dear all,
> I am trying to compare the performance of several methods using AUC0.1 and not the whole AUC (meaning I want to compare two AUCs whose x axis only goes to 0.1, not 1). I came to know about the McNeil Hanley test from Bernardo Rangel Tura, and I referred to the original paper for the calculation of r, which is an argument of the function cROC. I can only find the value of r for whole AUCs.
>
>     seROC <- function(AUC, na, nn) {
>       a <- AUC
>       q1 <- a/(2-a)
>       q2 <- (2*a^2)/(1+a)
>       se <- sqrt((a*(1-a) + (na-1)*(q1-a^2) + (nn-1)*(q2-a^2))/(nn*na))
>       se
>     }
>
>     cROC <- function(AUC1, na1, nn1, AUC2, na2, nn2, r) {
>       se1 <- seROC(AUC1, na1, nn1)
>       se2 <- seROC(AUC2, na2, nn2)
>       sed <- sqrt(se1^2 + se2^2 - 2*r*se1*se2)
>       zad <- (AUC1-AUC2)/sed
>       p <- 2*pnorm(-abs(zad))   # two-sided p-value from the standard normal
>       a <- list(zad, p)
>       a
>     }
>
> Could somebody kindly suggest how to calculate the value of r, or some way to assess the statistical significance of the difference in AUC for a part of the curve, like AUC0.1?
> Thank You

The ROC area is not a sensitive enough measure for comparing two competing predictors. Its power is too low. See for example the following papers. Note that Pencina et al's approach is now in the Hmisc package (function improveProb; documentation to be coming soon). Likelihood ratio tests are even more powerful.

    @Article{pen08eva,
      author  = {Pencina, Michael J. and {D'Agostino Sr}, Ralph B. and {D'Agostino Jr}, Ralph B. and Vasan, Ramachandran S.},
      title   = {Evaluating the added predictive ability of a new marker: {From} area under the {ROC} curve to reclassification and beyond},
      journal = Stat in Med,
      year    = 2008,
      volume  = 27,
      pages   = {157-172},
      annote  = {discrimination;model performance;AUC;C-index;risk prediction;biomarker;small differences in ROC area can still be very meaningful;example of insignificant test for difference in ROC areas with very significant results from new method;Yates' discrimination slope;reclassification table;limiting version of this based on whether and amount by which probabilities rise for events and lower for non-events when compare new model to old;comparing two models}
    }

    @Article{coo07use,
      author  = {Cook, Nancy R.},
      title   = {Use and misuse of the receiver operating characteristic curve in risk prediction},
      journal = {Circulation},
      year    = 2007,
      volume  = 115,
      pages   = {928-935},
      annote  = {reclassification table;problems with c index;problems with ROC area;example of large change in predicted risk in cardiovascular disease with tiny change in ROC area;possible limits to c index when calibration is perfect;importance of calibration accuracy and changes in predicted risk when new variables are added}
    }

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
Re: [R] How to increase the for() loop speed?
My routine is on Financial Econometrics (yield curve modeling). It is very computationally intensive, and I have heard that the for() loop speed could be increased with a command. I want to know if there is a way to do it.

Best regards,
Rafael Barros de Rezende

-- Original Message ---
From: Erik Iverson [EMAIL PROTECTED]
To: Rafael Barros de Rezende [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Thu, 12 Jun 2008 11:06:09 -0500
Subject: Re: [R] How to increase the for() loop speed?

> 13 minutes is a long time for a loop to simply send an email, what other calculations are going on?
>
> Rafael Barros de Rezende wrote:
> > Dear R users,
> > I would like to know if there is a way to increase the for() loop speed because in my routine the calculations are too slow.
> > Best regards.
> > Rafael Barros de Rezende
> > Cedeplar - Center for Development and Regional Planning, Face, UFMG (http://www.cedeplar.ufmg.br)

--- End of Original Message ---
Re: [R] overall title
?title, see 'outer' (and you will need to make room for an outer margin). This is described in 'An Introduction to R' (and in all good books on R).

On Thu, 12 Jun 2008, [EMAIL PROTECTED] wrote:
> I have a 2x2 plot set up using: par(mfrow=c(2,2))
> I'd like to put an overall title on the page, but I cannot figure out how. Any ideas?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
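A minimal sketch of the 'outer' approach mentioned in this reply, using base graphics only. The plotted data, the panel titles, and the title text are made up for illustration; the size of the outer margin (2 lines at the top) is an assumption you may need to adjust.

```r
# Reserve an outer margin of 2 text lines at the top of the page (oma = bottom, left, top, right).
op <- par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))

# Four placeholder panels with made-up data.
set.seed(1)
for (i in 1:4) {
  plot(rnorm(20), main = paste("Panel", i), ylab = "value")
}

# outer = TRUE draws the title in the outer margin instead of above the last panel.
title("Overall title", outer = TRUE)

# Restore the previous graphics settings.
par(op)
```

mtext("Overall title", side = 3, outer = TRUE) is a common alternative when finer control over placement is needed.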
Re: [R] How to increase the for() loop speed?
We would certainly need more information about your function to offer any specific advice, so I'll fall back on the general.

First, there is no command that will increase the speed of a for loop; it is not as if loops are artificially slowed down. In general, you may be able to do whatever it is you are doing using vectorized calculations instead of looping, and/or by using a combination of the *apply functions.

You should read the latest issue of R News, as it has an article related to your question that explains the points above. Find it here: http://cran.r-project.org/doc/Rnews/

Finally, there is a special mailing list for using R for quantitative finance; perhaps you could ask a more detailed question there. Search for R-SIG-Finance.

Erik

Rafael Barros de Rezende wrote:
> My routine is on Financial Econometrics (Yield Curve Modeling). It is very intensive. And I have heard that the for() loop speed could be increased with a command. I want to know if there a way to do it.
> Best regards.
> Rafael Barros de Rezende
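To make the general advice about vectorizing concrete, here is a small sketch. The calculation (discount factors from a vector of yields) and all variable names are hypothetical stand-ins, since the poster's actual routine was never shown.

```r
# Made-up inputs: 100,000 hypothetical yields and a 5-year maturity.
set.seed(1)
yields   <- runif(1e5, min = 0.01, max = 0.10)
maturity <- 5

# Loop version: one discount factor per iteration, result preallocated.
df_loop <- numeric(length(yields))
for (i in seq_along(yields)) {
  df_loop[i] <- exp(-yields[i] * maturity)
}

# Vectorized version: exp() and * operate on the whole vector at once.
df_vec <- exp(-yields * maturity)

# Both approaches give the same numbers.
stopifnot(isTRUE(all.equal(df_loop, df_vec)))

# *apply version of a per-column calculation on a matrix of hypothetical curves,
# checked against the dedicated vectorized function colMeans().
curves    <- matrix(yields, ncol = 10)
col_means <- apply(curves, 2, mean)
stopifnot(isTRUE(all.equal(col_means, colMeans(curves))))
```

Wrapping each version in system.time() shows the difference on your own machine; how much is gained depends entirely on what the loop body does.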
[R] adding horizontal lines to a trellis plot
I would like to add two horizontal lines representing acceptable drug levels to a trellis plot. I tried using abline and I get an error that plot.new has not been called. See below.

> xyplot(FK~WEEK|Event1/MRN, data=FKdat.o1)
> abline(h=5)
> abline(h=10)
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
  plot.new has not been called yet

Any help that can be provided on how to do this would be appreciated.

Best, Suzette

Suzette Blanchard, Ph.D.
Assistant Professor, Dept. of Biostatistics
City of Hope
1500 East Duarte Rd
Duarte, CA 91010-3000
ph: (626) 256-4673 ext:64446
[EMAIL PROTECTED]
Re: [R] adding horizontal lines to a trellis plot
Please read about panel functions in ?xyplot and ?panel.abline.

In particular, you do this sort of thing in panel functions, where you must use grid graphics functions or the various lattice wrappers for them. The standard graphics constructions will not work (as you found out).

Suggested reference: Deepayan Sarkar's new book on lattice graphics.

-- Bert Gunter
Genentech

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Blanchard, Suzette
Sent: Thursday, June 12, 2008 10:52 AM
To: r-help@r-project.org
Subject: [R] adding horizontal lines to a trellis plot
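A minimal sketch of the panel-function approach described in this reply. The data frame below is made up to stand in for FKdat.o1 (which was not posted), and the conditioning variable 'group' is a hypothetical replacement for Event1/MRN; only xyplot(), panel.xyplot() and panel.abline() from lattice are used.

```r
library(lattice)

# Made-up data standing in for the poster's FKdat.o1.
set.seed(42)
FKdat <- data.frame(
  WEEK  = rep(1:10, times = 4),
  FK    = runif(40, min = 2, max = 14),
  group = rep(c("A", "B", "C", "D"), each = 10)
)

# The panel function draws the points, then adds the two reference lines
# (at 5 and 10, the levels used in the original post) to every panel.
p <- xyplot(FK ~ WEEK | group, data = FKdat,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.abline(h = c(5, 10), lty = 2)
            })

# Lattice plots are objects; print() is what actually draws them.
print(p)
```

Because the lines are part of the panel function, they are redrawn in each panel automatically, which is what base abline() cannot do on a lattice display.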
Re: [R] How to increase the for() loop speed?
The first thing to do is to run Rprof and determine where the time is being spent. It may be that one of the functions you are calling inside the loop is taking the majority of the time; if that is the case, there may not be any improvement available other than coming up with a different algorithm. 'for' loops themselves are not necessarily that slow; more often the problem is that you are not preallocating storage, or that something else is happening inside the loop. Rprof will help determine where the problem might be.

On Thu, Jun 12, 2008 at 11:52 AM, Rafael Barros de Rezende [EMAIL PROTECTED] wrote:
> Dear R users, I would like to know if there is a way to increase the for() loop speed because in my routine the calculations are too slow.
> Best regards.
> Rafael Barros de Rezende
> Cedeplar - Center for Development and Regional Planning
> Face, UFMG (http://www.cedeplar.ufmg.br)

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
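The following sketch illustrates both points from this reply: profiling with Rprof and the cost of not preallocating storage. The two functions grow() and prealloc() and the problem size are hypothetical examples, not the poster's code.

```r
# Hypothetical example: the same result computed with and without preallocation.
grow <- function(n) {
  out <- c()
  for (i in seq_len(n)) out <- c(out, sqrt(i))  # copies the whole vector each time
  out
}

prealloc <- function(n) {
  out <- numeric(n)                             # storage allocated once
  for (i in seq_len(n)) out[i] <- sqrt(i)
  out
}

n <- 30000

# Profile both calls; samples are written to a temporary file.
prof_file <- tempfile()
Rprof(prof_file)
a <- grow(n)
b <- prealloc(n)
Rprof(NULL)

# summaryRprof() can fail if the run was too short to record any samples,
# so the call is guarded here.
prof <- try(summaryRprof(prof_file), silent = TRUE)
if (!inherits(prof, "try-error")) print(head(prof$by.total))

# Same answer either way; only the time spent differs.
stopifnot(identical(a, b))
```

In the profile summary, the time attributed to c() inside grow() is the kind of hotspot Rprof is meant to reveal.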
[R] save workspace while running R on a cluster
Hello, I have a question about running R in a cluster environment. The shell script I am running looks like this:

#!/bin/bash
cd /nfs/apollo/2/c2b2/users/mb0001/Data
/nfs/apollo/1/shares/software/core_facility/local/x86_64_rocks/R/current/bin/R --save < calculate.R > script.out

I have used the --save option to save the R workspace (if I understand it correctly). However, I am unable to find the workspace file. Using find .RData did not help. I would be obliged if someone could help me figure out where the workspace file gets saved.

Thanks
manisha
Re: [R] save workspace while running R on a cluster
Hi Manisha,

How about including something like this in your script.R:

setwd("/your/full/working/directory")   # ?setwd
save.image()                             # or save.image("your_workspace.RDA")

By the way, I don't know if you meant the line below to run in the background:

R --save < calculate.R > script.out

Maybe the right thing is:

R --save < calculate.R > script.out &

Good luck.
miltinho
Brazil

On 6/12/08, Manisha Brahmachary [EMAIL PROTECTED] wrote:
> Hello, I have a question about running R in a cluster environment. [...]
> I would be obliged if someone can help me figure out where the workspace file gets saved?
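A small sketch of the save.image() suggestion, written so that the location of the workspace file is never in doubt. The directory (a temporary one here), the file name, and the placeholder calculation are all assumptions for illustration; on the cluster you would substitute your own data directory.

```r
# Stand-in for the job's data directory; replace with a real path on the cluster.
out_dir <- tempdir()
ws_file <- file.path(out_dir, "calculate_workspace.RData")

# Placeholder for the real calculations in calculate.R.
result <- cumsum(1:10)

# Saving to an explicit path avoids depending on the working directory.
save.image(file = ws_file)
cat("Workspace saved to", ws_file, "\n")

# In a later session the objects can be restored with load(ws_file).
```

Without an explicit file, R --save writes the workspace to a file named .RData in the working directory when the session ends. Note also that a search of the form find . -name .RData looks through subdirectories, whereas find .RData only checks for that name in the current directory.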
[R] alternative to matching/merge?
Greetings,

I am matching/merging a table (40919 x 3) against data held as a list of 1268 data.frames. Using lapply, this takes about 5 minutes. I know that the match/merge functions are time consuming, so is there an alternative way to accomplish this? Or is lapply itself the inefficient part?

Lana Schaffer
Biostatistics/Informatics
The Scripps Research Institute
DNA Array Core Facility
La Jolla, CA 92037
(858) 784-2263
(858) 784-2994
[EMAIL PROTECTED]
[R] Subset in cast: compact form?
Hi!

How can I subset several variables in cast? For example, I can do it for one (i.e., ph):

cast(am, organismo + arriba ~ variable, subset = variable == "ph", mean, na.rm = T)

For selecting ph, temperature and Ba I'm using:

cast(am, organismo + arriba ~ variable, subset = variable == "ph" | variable == "temperature" | variable == "Ba", mean, na.rm = T)

Is there a more compact form? Something like select = c("ph", "temperature", "Ba")?

Thanks

Dr. Agustin Lobo
Institut de Ciencies de la Terra Jaume Almera (CSIC)
LLuis Sole Sabaris s/n
08028 Barcelona Spain
Tel. 34 934095410
Fax. 34 934110012
email: [EMAIL PROTECTED]
http://www.ija.csic.es/gt/obster
Re: [R] Subset in cast: compact form?
On Thu, Jun 12, 2008 at 2:27 PM, Agustin Lobo [EMAIL PROTECTED] wrote:
> Hi! How can I subset several variables in cast? For example, I can do it for one (i.e., ph):
> cast(am, organismo + arriba ~ variable, subset = variable == "ph", mean, na.rm = T)
> For selecting ph, temperature and Ba I'm using:
> cast(am, organismo + arriba ~ variable, subset = variable == "ph" | variable == "temperature" | variable == "Ba", mean, na.rm = T)

Probably the most compact way is:

cast(am, organismo + arriba ~ variable, subset = variable %in% c("ph", "temperature", "Ba"), mean, na.rm = T)

(that's an R thing, not particular to reshape)

Hadley

-- http://had.co.nz/
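Since %in% is plain R rather than part of reshape, the equivalence can be checked without cast() at all. The small data frame below is made up for illustration; only the column name 'variable' and the three selected names come from the thread.

```r
# Made-up long-format data with a 'variable' column, as produced by melting.
am <- data.frame(
  variable = c("ph", "temperature", "Ba", "Ca", "ph", "O2"),
  value    = c(7.1, 18.5, 0.03, 41.0, 6.8, 9.2),
  stringsAsFactors = FALSE
)

keep <- c("ph", "temperature", "Ba")

# The chained comparison and the %in% test select the same rows.
long_form  <- am$variable == "ph" | am$variable == "temperature" | am$variable == "Ba"
short_form <- am$variable %in% keep
stopifnot(identical(long_form, short_form))

# The same logical expression works anywhere a subset condition is accepted.
selected <- subset(am, variable %in% keep)
print(selected)
```

One practical difference: %in% never returns NA, whereas == propagates NA when 'variable' has missing entries.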
Re: [R] overall title
Check out this previous post from years ago.

http://tolstoy.newcastle.edu.au/R/help/00a/2237.html

Bill

> Date: Thu, 12 Jun 2008 10:38:03 -0500
> From: [EMAIL PROTECTED]
> To: r-help@r-project.org
> Subject: [R] overall title
>
> I have a 2x2 plot set up using: par(mfrow=c(2,2))
> I'd like to put an overall title on the page, but I cannot figure out how. Any ideas?
[R] Rprintf and C stack usage is too close to the limit
Hi,

I would appreciate it if someone could comment on a problem I am experiencing. I am writing a C++ program to be called from R. In this program, there is a verbose switch that decides whether to print some debugging info using Rprintf. On Windows, things work fine. On Linux, things are fine in non-verbose mode, but in verbose mode I get an error saying "C stack usage is too close to the limit" after a few lines are printed.

Is Rprintf the right function to use for showing messages on the R console? If yes, what should I do about the error message?

Thank you very much in advance! This problem has been bugging me for a few days now.

Youyi

--
Youyi Fong, Graduate Student, Department of Biostatistics
University of Washington, Box 357232, Seattle, WA 98195
[R] Problem with rowMeans()
Hi all,

I have a matrix called 'data', which looks like:

> data[1:4,1:4]
      Probe_ID Gene_Symbol M1601 M1602
1 A_23_P105862    13CDNA73  -1.6  0.16
2  A_23_P76435      15E1.2  0.18  0.59
3 A_24_P402115      15E1.2  1.63 -0.62
4 A_32_P227764      15E1.2 -0.76 -0.42
> dim(data)
[1] 23963    85

What I want to do is to make a new matrix called 'data2', which would be transformed by subtracting the mean of each row from matrix 'data'. There are some 'NA's in the matrix and I do want to keep them.

I tried to take the means of each row first, but I got:

> a <- rowMeans(data[,3:85], na.rm = FALSE)
Error in rowMeans(data[, 3:85], na.rm = FALSE) :
  'x' must be numeric

Can anybody suggest how to get around this? Thank you very much!

Allen
[R] numbers as part of long character
Hi,

I'm looking for some way to pick up the numbers which are contained and buried in a long character string. For example,

outtree.new = "(((B:1204.25,E:1204.25):7581.11,F:8785.36):8353.85,C:17139.21);"
num.char = unlist(strsplit(unlist(strsplit(unlist(strsplit(unlist(strsplit(unlist(strsplit(outtree.new, ")", fixed=TRUE)), "(", fixed=TRUE)), ":", fixed=TRUE)), ",", fixed=TRUE)), ";", fixed=TRUE))
num.vec = as.numeric(num.char[1:(length(num.char)-1)])

num.char
# "B" "1204.25" "E" "1204.25" "7581.11" "F" "8785.36" "8353.85" "C" "17139.21"
num.vec
# NA 1204.25 NA 1204.25 7581.11 NA 8785.36 8353.85 NA 17139.21

This would help me get the numbers such as 1204.25, 7581.11, etc., but with a warning message which reads:

Warning message:
NAs introduced by coercion

Is there a way to get around this? Thanks!

Hua
Re: [R] Problem with rowMeans()
Hello -

ss wrote:
> Hi all, I have a matrix called 'data', which looks like:
> data[1:4,1:4]
> [...]
> dim(data)
> [1] 23963 85

Do you really have a matrix, or a data.frame? Try class(data)

> What I want to do is to make a new matrix called 'data2', which would be transformed by subtracting the mean of each row from matrix 'data'. There are some 'NA's in the matrix and I do want to keep it.

See ?scale

> I tried to take 'mean's from each row first by using:
> a <- rowMeans(data[,3:85], na.rm = FALSE)
> but I got:
> Error in rowMeans(data[, 3:85], na.rm = FALSE) : 'x' must be numeric
> Can anybody suggest me how to get around this?

Figure out what you are giving the rowMeans function. If you really have a matrix, then

all(apply(data[,3:85], 2, class) == "numeric")

should be TRUE.

> Thank you very much!
> Allen
Re: [R] Problem with rowMeans()
Dear Erik,

Thanks! 'data' is a matrix, but all(apply(data[,3:85], 2, class) == "numeric") is FALSE.

> class(data)
[1] "matrix"
> a <- rowMeans(data[,3:85], na.rm = TRUE)
Error in rowMeans(data[, 3:85], na.rm = TRUE) :
  'x' must be numeric
> all(apply(data[,3:85], 2, class) == "numeric")
[1] FALSE

What else should I do? I appreciate your help!

Allen

On Thu, Jun 12, 2008 at 4:55 PM, Erik Iverson [EMAIL PROTECTED] wrote:
> Figure out what you are giving the rowMeans function. If you really have a matrix, then all(apply(data[,3:85], 2, class) == "numeric") should be TRUE.
Re: [R] Problem with rowMeans()
ss wrote:
> Hi all, I have a matrix called 'data', which looks like:
> data[1:4,1:4]
> [...]
> I tried to take 'mean's from each row first by using:
> a <- rowMeans(data[,3:85], na.rm = FALSE)
> but I got:
> Error in rowMeans(data[, 3:85], na.rm = FALSE) : 'x' must be numeric

Sure, at least the first two columns are not numeric.

> Can anybody suggest me how to get around this?

You can compute row means based on only those columns which are numeric as follows:

a = rowMeans(data[sapply(data, is.numeric)])

What you do with NAs is another story.

vQ
Re: [R] numbers as part of long character
on 06/12/2008 03:46 PM Hua Li wrote:
> Hi, I'm looking for some way to pick up the numbers which are contained and buried in a long character. For example,
> outtree.new = "(((B:1204.25,E:1204.25):7581.11,F:8785.36):8353.85,C:17139.21);"
> [...]
> would help me get the numbers such as 1204.25, 7581.11, etc, but with a warning message which reads:
> Warning message: NAs introduced by coercion
> Is there a way to get around this? Thanks! Hua

Your code above is overly and needlessly complicated, which makes it difficult to debug.

I would take an approach whereby you use gsub() to strip non-numeric characters from the input character vector and then use scan() to read the remaining numbers:

> Vec <- scan(textConnection(gsub("[^0-9\\.]+", " ", outtree.new)))
Read 6 items
> Vec
[1]  1204.25  1204.25  7581.11  8785.36  8353.85 17139.21
> str(Vec)
 num [1:6] 1204 1204 7581 8785 8354 ...

The result of using gsub() above is:

> gsub("[^0-9\\.]+", " ", outtree.new)
[1] " 1204.25 1204.25 7581.11 8785.36 8353.85 17139.21 "

That gives you a character vector which can then be passed to scan() as a textConnection().

See ?gsub, ?regex, ?textConnection and ?scan for more information.

HTH,
Marc Schwartz
Re: [R] Problem with rowMeans()
ss wrote:
> Dear Erik, Thanks! The 'data' is matrix but all(apply(data[,3:85], 2, class) == "numeric") is false.
> class(data)
> [1] "matrix"
> a <- rowMeans(data[,3:85], na.rm = TRUE)
> Error in rowMeans(data[, 3:85], na.rm = TRUE) : 'x' must be numeric
> all(apply(data[,3:85], 2, class) == "numeric")
> [1] FALSE
> What else should I do?

Try str(data). Do you have a character matrix? That would explain the error message.

I would consider storing your data in a data.frame, as you appear not to have homogeneous types, and then your analysis should be trivial.

Erik

> I appreciate! Allen
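A sketch of one way to carry out the row-centering once the character-matrix diagnosis from this thread is confirmed. The matrix below is a made-up four-row stand-in built from the values shown in the original post, with one NA added for illustration; the assumption is that columns 3 onward hold numbers stored as text.

```r
# Made-up stand-in for 'data': everything is character because the ID columns are.
data <- cbind(
  Probe_ID    = c("A_23_P105862", "A_23_P76435", "A_24_P402115", "A_32_P227764"),
  Gene_Symbol = c("13CDNA73", "15E1.2", "15E1.2", "15E1.2"),
  M1601       = c("-1.6", "0.18", "1.63", "-0.76"),
  M1602       = c("0.16", "0.59", NA, "-0.42")      # NA added for illustration
)
stopifnot(is.character(data))

# Keep only the measurement columns and convert them to numbers.
expr <- data[, 3:ncol(data), drop = FALSE]
storage.mode(expr) <- "double"        # dimensions and column names are preserved
rownames(expr) <- data[, "Probe_ID"]

# Row means ignoring NAs, then subtract each row's mean from that row.
row_means <- rowMeans(expr, na.rm = TRUE)
data2 <- sweep(expr, 1, row_means, FUN = "-")

# NAs in the input stay NA in the centered matrix.
print(data2)
```

The same centering can be written with scale(), as suggested earlier in the thread, via t(scale(t(expr), scale = FALSE)), because scale() works on columns.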
[R] Problems with mars in R in the case of nonlinear functions
Hi,

I'm trying to use the mars function in R to interpolate nonlinear multivariate functions. However, it seems that mars gives me a fit which uses only very few basis functions and underfits very badly. For example, I have tried the following code to test mars:

require(mda)
f <- function(x,y) { x^2-y^2 };
#f <- function(x,y) { x+2*y };
# Grid
x <- seq(-1,1,length=10);
x <- outer(x*0,x,FUN="+");
y <- t(x);
X <- cbind(as.vector(x),as.vector(y));
# Data
z <- f(x,y);
fit <- mars(X,as.vector(z),nk=200,penalty=2,thresh=1e-3,degree=2);
# Plotting
par(mfrow=c(1,2),pty="s")
lims <- c(min(c(min(z),min(fit$fitted))),max(c(max(z),max(fit$fitted))))
persp(z=z,ticktype='detailed',col='lightblue',shade=.75,ltheta=50, xlab='x',ylab='y',zlab='z',main='true',phi=25,theta=55,zlim=lims)
persp(z=matrix(fit$fitted.values,nrow=nrow(x),byrow=F),ticktype='detailed', col='lightblue', xlab='x',ylab='y',zlab='z',shade=.75,ltheta=50,main='MARS', phi=25,theta=55,zlim=lims)

(The code is also here if someone wants to try it: http://venda.uku.fi/~jmhuttun/R/marstest.R)

The results are here: http://venda.uku.fi/~jmhuttun/R/R-10.pdf . The fitted model contains only 5 terms, which is not enough in this case. Adjusting parameters like nk, thresh, penalty and degree seems to have only a minor effect or no effect at all. It's also strange that when I increase the number of points in the grid, the results get even worse: see e.g. http://venda.uku.fi/~jmhuttun/R/R-20.pdf for a 20x20 grid. However, mars seems to work well with linear functions (e.g. with the function which is commented out in the above code).

Does anyone know what is wrong in this case? Am I missing something, or is there something wrong in my code?

This does not seem to be a problem with the MARS method in general. For example, Friedman's MARS implementation (run in Matlab) gives a rather good fit: see http://venda.uku.fi/~jmhuttun/R/Matlab.pdf .

Thank you
Janne

--
Janne Huttunen
University of California
Department of Statistics
367 Evans Hall
Berkeley, CA 94720-3860
email: [EMAIL PROTECTED]
phone: +1-510-502-5205
office room: 449 Evans Hall
Re: [R] numbers as part of long character
On Jun 12, 2008, at 5:06 PM, Marc Schwartz wrote:
> on 06/12/2008 03:46 PM Hua Li wrote:
>> Hi, I'm looking for some way to pick up the numbers which are contained and buried in a long character. [...]
>
> I would take an approach whereby you use gsub() to strip non-numeric characters from the input character vector and then use scan() to read the remaining numbers:
> Vec <- scan(textConnection(gsub("[^0-9\\.]+", " ", outtree.new)))
> [...]
> That gives you a character vector which can then be passed to scan() as a textConnection().

Another approach would be to split on sequences of characters that are neither digits nor dots:

as.numeric( strsplit(outtree.new, "[^\\d.]+", perl=TRUE)[[1]] )

Use "[^+-\\d.]+" if your numbers might be signed. This does assume that dots and +/- occur only as decimal points or signs.

Hua, did you want to keep the information of which number is B, which is C, etc.?

> See ?gsub, ?regex, ?textConnection and ?scan for more information.
> HTH, Marc Schwartz

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
[R] rJava classpath issues
I'm having some issues with getting my own jars to work properly with rJava. Bear with me as I explain my scenario.

I have a Java package called rjbridge, with the following classes:

RJBridge.class
ObjectInfo.class

Each of the classes has the following line on top:

package com.rjbridge

My directory structure is /home/Danish/java/com/rjbridge, and all the class files lie in that directory. I use the following commands to compile and jar:

cd /home/Danish/java
javac -d . ./com/rjbridge/*.java
jar -cvf rjbridge.jar ./com/rjbridge

So far, so good. Everything works as it should, and I can use the RJBridge class as follows:

java -cp /home/Danish/java/rjbridge.jar RJBridge [args]

and it runs fine. Now in R, when I try the following:

library(rJava)
.jinit()
.jaddClassPath("/home/Danish/java/rjbridge.jar")
r <- .jnew("ObjectInfo")

I get:

Exception in thread "main" java.lang.ClassNotFoundException
  at RJavaClassLoader.findClass(RJavaClassLoader.java:195)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
  at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:242)
Error in .jnew("ObjectInfo") : Failed to create object of class `ObjectInfo'
In addition: Warning message:
createObject.FindClass ObjectInfo failed in: .jnew("ObjectInfo")

Trying:

r <- .jnew("rjbridge.ObjectInfo")

gives the following:

Exception in thread "main" java.lang.ClassNotFoundException
  at RJavaClassLoader.findClass(RJavaClassLoader.java:195)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
  at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:242)
Error in .jnew("rjbridge.ObjectInfo") : Failed to create object of class `rjbridge/ObjectInfo'

I've spent a lot of time trying this in a number of different ways, but to no avail. Will some kind soul help me out here?

Thanks
Danish
Re: [R] numbers as part of long character
Thanks, Marc and Haris!

I didn't know the values of the numbers beforehand, so the scan method won't work, but [^+-\\d.]+ will do!

And Haris, I didn't intend to keep the information of which number is B, which is C, etc. when asking the question, as I had a tedious way to do it (using strsplit and unlist over and over again after I get the numbers). But if you have an easier way to do it, I'd like to know!

Hua

--- On Thu, 6/12/08, Charilaos Skiadas wrote:

> On Jun 12, 2008, at 5:06 PM, Marc Schwartz wrote:
>
>> on 06/12/2008 03:46 PM Hua Li wrote:
>>> Hi,
>>> I'm looking for some way to pick up the numbers which are contained and buried in a long character string. For example,
>>>
>>>     outtree.new = "(((B:1204.25,E:1204.25):7581.11,F:8785.36):8353.85,C:17139.21);"
>>>     num.char = unlist(strsplit(unlist(strsplit(unlist(strsplit(unlist(strsplit(unlist(strsplit(outtree.new,")",fixed=TRUE)),"(",fixed=TRUE)),":",fixed=TRUE)),",",fixed=TRUE)),";",fixed=TRUE))
>>>     num.vec = as.numeric(num.char[1:(length(num.char)-1)])
>>>     num.char
>>>     # B1204.25 E1204.25 7581.11 F8785.36 8353.85 C 17139.21
>>>     num.vec
>>>     # NA 1204.25 NA 1204.25 7581.11 NA 8785.36 8353.85 NA 17139.21
>>>
>>> would help me get the numbers such as 1204.25, 7581.11, etc., but with a warning message which reads:
>>>
>>>     Warning message: NAs introduced by coercion
>>>
>>> Is there a way to get around this? Thanks!
>>> Hua
>>
>> Your code above is needlessly complicated, which makes it difficult to debug. I would take an approach whereby you use gsub() to strip non-numeric characters from the input character vector and then use scan() to read the remaining numbers:
>>
>>     > Vec <- scan(textConnection(gsub("[^0-9\\.]+", " ", outtree.new)))
>>     Read 6 items
>>     > Vec
>>     [1]  1204.25  1204.25  7581.11  8785.36  8353.85 17139.21
>>     > str(Vec)
>>      num [1:6] 1204 1204 7581 8785 8354 ...
>>
>> The result of using gsub() above is:
>>
>>     > gsub("[^0-9\\.]+", " ", outtree.new)
>>     [1] " 1204.25 1204.25 7581.11 8785.36 8353.85 17139.21 "
>>
>> That gives you a character vector which can then be passed to scan() as a textConnection().
>
> Another approach would be to split on sequences of non-numeric characters:
>
>     as.numeric( strsplit(outtree.new, "[^\\d.]+", perl=TRUE)[[1]] )
>
> Use "[^+-\\d.]+" if your numbers might be signed. This does assume that dots and +/- occur only as part of numbers.
>
> Hua, did you want to keep the information of which number is B, which is C, etc.?
>
>> See ?gsub, ?regex, ?textConnection and ?scan for more information.
>>
>> HTH,
>> Marc Schwartz
>
> Haris Skiadas
> Department of Mathematics and Computer Science
> Hanover College
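As a complement to the gsub()/scan() and strsplit() suggestions quoted above, here is a small sketch that extracts the numbers directly with gregexpr() and regmatches(), so no empty strings or letters ever reach as.numeric(). The pattern is an illustration rather than anything proposed in the thread; it assumes plain decimal notation (no exponents) and an optional leading sign. Writing the hyphen first in the bracket expression keeps it unambiguously literal.

```r
outtree.new <- "(((B:1204.25,E:1204.25):7581.11,F:8785.36):8353.85,C:17139.21);"

# Optional sign, optional integer part, optional dot, at least one digit
num_pattern <- "[-+]?[0-9]*\\.?[0-9]+"

# gregexpr() locates every match; regmatches() returns the matched substrings
num.char <- regmatches(outtree.new, gregexpr(num_pattern, outtree.new))[[1]]
nums <- as.numeric(num.char)
print(nums)
```

Because only substrings that already look like numbers are converted, the "NAs introduced by coercion" warning from the original attempt does not arise.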
Re: [R] numbers as part of long character
Oh, sorry, Marc. The scan method does work!

Hua
[R] Getting Batch mode to continue running a script after running into errors
I'm invoking R in batch mode from a bash script as follows:

    R --no-restore --no-save --vanilla < $TARGET/$directory/o2sat-$VERSION.R > $TARGET/$directory/o2sat-$VERSION.Routput

When R comes across an error in the script, however, it seems to halt instead of running the subsequent lines:

    Error in file(file, "r") : cannot open the connection
    Calls: read.table -> file
    In addition: Warning message:
    In file(file, "r") : cannot open file '/datapool/experiments/ois/080502/petri': No such file or directory
    Execution halted

How can I get R to continue running the script even if it comes across errors?

Thanks in advance
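One common way to keep a non-interactive script going is to wrap the calls that may fail in tryCatch() (or try()), so the error is handled inside the script instead of halting R. The sketch below is only an illustration under assumptions: the file paths are hypothetical stand-ins (one is created on the fly so the example runs anywhere), and returning NULL for a failed read is a design choice, not something taken from the post.

```r
# Hypothetical inputs: the first file is created here, the second does not exist
files <- c(tempfile(fileext = ".txt"), "/no/such/dir/petri")
writeLines(c("a b", "1 2"), files[1])

results <- lapply(files, function(f) {
  tryCatch(
    read.table(f, header = TRUE),
    error = function(e) {
      # Report the problem and carry on with the next file
      message("Skipping ", f, ": ", conditionMessage(e))
      NULL
    }
  )
})

# The script reaches this line even though one read failed
message("Read ", sum(!sapply(results, is.null)), " of ", length(files), " files")
```

The "cannot open file" warning is still printed for the missing file, but execution continues past it.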
Re: [R] numbers as part of long character
On Jun 12, 2008, at 6:34 PM, Hua Li wrote:

> Thanks, Marc and Haris!
> I didn't know the values of the numbers beforehand, so the scan method won't work, but [^+-\\d.]+ will do!
> And Haris, I didn't intend to keep the information of which number is B, which is C, etc. when asking the question, as I had a tedious way to do it (using strsplit and unlist over and over again after I get the numbers). But if you have an easier way to do it, I'd like to know!
> Hua

Depending on what your real use case looks like, the following might work:

    vec1 <- strsplit(outtree.new, "[^+-\\d.:\\w]+", perl=TRUE)[[1]]
    nums <- as.numeric(gsub("\\w?:", "", vec1, perl=TRUE))
    names(nums) <- gsub(":[+-\\d.]+", "", vec1, perl=TRUE)

If it doesn't, then provide us with the example that fails.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
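A variation on the same idea, sketched here with base regular expressions only: match each "label:number" pair, then split the pair at the colon. This is an alternative illustration, not Haris's code; it assumes labels consist of letters only and that branch lengths without a label (such as ":7581.11") should get an empty name.

```r
outtree.new <- "(((B:1204.25,E:1204.25):7581.11,F:8785.36):8353.85,C:17139.21);"

# An optional run of letters, a colon, then an optionally signed number
pair_pattern <- "[A-Za-z]*:[-+]?[0-9.]+"
pairs <- regmatches(outtree.new, gregexpr(pair_pattern, outtree.new))[[1]]

# Everything after the colon is the number, everything before it is the label
nums <- as.numeric(sub("^[^:]*:", "", pairs))
names(nums) <- sub(":.*$", "", pairs)
print(nums)
```

Unlabelled internal branches end up with "" as their name, so nums[names(nums) != ""] keeps only the named tips.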
Re: [R] Problem with rowMeans()
ss wrote:
> Thank you very much, Wacek! It works very well. But there is a minor problem. I did the following:
>
>     > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
>     + row.names = NULL, header=TRUE, fill=TRUE)

looks like you have a data frame, not a matrix

>     > dim(data)
>     [1] 23963    85
>     > data[1:4,1:4]
>           Probe_ID Gene_Symbol M16012391010920 M16012391010525
>     1 A_23_P105862    13CDNA73            -1.6            0.16
>     2  A_23_P76435      15E1.2            0.18            0.59
>     3 A_24_P402115      15E1.2            1.63           -0.62
>     4 A_32_P227764      15E1.2           -0.76           -0.42
>     > data1 <- data[sapply(data, is.numeric)]
>     > dim(data1)
>     [1] 23963    82
>     > data1[1:4,1:4]
>       M16012391010525 M16012391010843 M16012391010531 M16012391010921
>     1            0.16           -0.23           -1.40            0.90
>     2            0.59            0.28           -0.30            0.08
>     3           -0.62           -0.62           -0.22           -0.18
>     4           -0.42            0.01            0.28           -0.79
>
> You will notice that, after using 'data[sapply(data, is.numeric)]' and getting data1, the first sample in data, called 'M16012391010920', was missed in data1. Any further suggestions?

surely there must be an entry in column 3 that makes it non-numeric. what does is.numeric(data[3]) say? (NAs should not make a column non-numeric, unless there are only NAs there, which is not the case here.) check your data for non-numeric entries in column 3, there can be a typo.

vQ
Re: [R] Problem with rowMeans()
Hi Wacek,

Yes, data is a data frame, not a matrix.

    > is.numeric(data[3])
    [1] FALSE

But I looked at column 3 and it looks okay. There are a few NAs and I didn't find anything strange. Any suggestions?

Thanks,
Allen
Re: [R] Generate Random Samples
To answer your specific question, you can use mvrnorm (from MASS, i.e. library(MASS)) to generate each component. To generate a mixture with three components (Prob(1st component) = p1, Prob(2nd component) = p2, Prob(3rd component) = p3, p1 + p2 + p3 = 1), you can generate a uniformly distributed variable X in (0,1) and then generate the 1st component if X < p1, the 2nd one if p1 <= X < p1 + p2, and the 3rd one if X >= p1 + p2.

--- On Fri, 13/6/08, Peng Jiang wrote:

> Subject: [R] Generate Random Samples
> Received: Friday, 13 June, 2008, 1:24 AM
>
> Hi,
> I am a newbie to R and I am working with a Mac. Is there any package that I can use to generate random samples from a user-defined distribution? That is, I define a distribution function (maybe multi-dimensional) and I want some random samples generated from this distribution.
>
> Or, there is a more specific problem. If I have a three-component mixture with each component being a normal distribution (say, 3-dimensional), is there any package that I can use to generate random samples from this mixture? I know I can generate random samples from each individual component. However, can I just add them directly?
>
> Thanks.
> --
> Peng Jiang 江鹏
> Ph.D. Candidate
> Antai College of Economics & Management 安泰经济管理学院
> Department of Mathematics 数学系
> Shanghai Jiaotong University (Minhang Campus)
> 800 Dongchuan Road, 200240 Shanghai, P. R. China
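The recipe above can be sketched in R as follows. All numbers here (mixing weights, means, covariance matrices, sample size) are hypothetical placeholders chosen for illustration. Note that adding draws from the three components would not give the mixture: a sum of normal vectors is again a single normal vector, whereas a mixture picks exactly one component per draw.

```r
library(MASS)   # provides mvrnorm()

set.seed(1)
n <- 1000
p <- c(0.5, 0.3, 0.2)                               # hypothetical weights, sum to 1
mus <- list(c(0, 0, 0), c(5, 5, 5), c(-5, 0, 5))    # hypothetical component means
Sigmas <- list(diag(3), 2 * diag(3), 0.5 * diag(3)) # hypothetical covariances

# Pick a component for each draw from a uniform X:
# X < p1 gives 1, p1 <= X < p1 + p2 gives 2, X >= p1 + p2 gives 3
u <- runif(n)
comp <- findInterval(u, cumsum(p)[-length(p)]) + 1L

# Fill in the rows belonging to each component with draws from that component
samples <- matrix(NA_real_, nrow = n, ncol = 3)
for (k in seq_along(p)) {
  idx <- which(comp == k)
  if (length(idx) > 0) {
    samples[idx, ] <- mvrnorm(length(idx), mu = mus[[k]], Sigma = Sigmas[[k]])
  }
}

print(table(comp) / n)   # empirical proportions, to compare with p
```

The same component labels could also be drawn with sample(1:3, n, replace = TRUE, prob = p); the uniform-variable version is used here because it mirrors the description in the reply.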
Re: [R] Problem with rowMeans()
ss wrote:
> Hi Wacek,
> Yes, data is data frame not a matrix.
>
>     > is.numeric(data[3])
>     [1] FALSE

what is class(data[3])

> But I looked at the column 3 and it looks okay though. There are few NAs and I did find anything strange. Any suggestions?
> Thanks,
> Allen
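The question about class(data[3]) points at a subtlety of data frame indexing: single brackets return a one-column data frame, for which is.numeric() is FALSE regardless of what the column holds, while double brackets return the column vector itself. A tiny sketch with a made-up data frame (the names id and x are illustrative only):

```r
# Made-up data frame: one text column and one numeric column containing an NA
data <- data.frame(id = c("a", "b"), x = c(1.5, NA), stringsAsFactors = FALSE)

class(data[2])          # "data.frame": single brackets keep the data frame
is.numeric(data[2])     # FALSE, even though the column holds numbers
is.numeric(data[[2]])   # TRUE: double brackets extract the column vector

# Class of every column at once
col_classes <- sapply(data, class)
print(col_classes)
```

So is.numeric(data[3]) being FALSE says nothing about column 3; is.numeric(data[[3]]) or sapply(data, class) is the informative check.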
Re: [R] Problem with rowMeans()
It is:

    > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt', row.names = NULL, header=TRUE, fill=TRUE)
    > class(data[3])
    [1] "data.frame"

And if I try to use as.matrix(read.table()), I get:

    > data <- as.matrix(read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt',
    + row.names = NULL, header=TRUE, fill=TRUE))
    > data[1:4,1:4]
         Probe_ID       Gene_Symbol M16012391010920 M16012391010525
    [1,] "A_23_P105862" "13CDNA73"  "-1.6"          "0.16"
    [2,] "A_23_P76435"  "15E1.2"    "0.18"          "0.59"
    [3,] "A_24_P402115" "15E1.2"    "1.63"          "-0.62"
    [4,] "A_32_P227764" "15E1.2"    "-0.76"         "-0.42"

You see, the values are surrounded by quotes. I don't see these if I just use read.table:

    > data <- read.table('E-TABM-1-processed-data-1342561271_log2_with_symbols.txt', row.names = NULL, header=TRUE, fill=TRUE)
    > data[1:4,1:4]
          Probe_ID Gene_Symbol M16012391010920 M16012391010525
    1 A_23_P105862    13CDNA73            -1.6            0.16
    2  A_23_P76435      15E1.2            0.18            0.59
    3 A_24_P402115      15E1.2            1.63           -0.62
    4 A_32_P227764      15E1.2           -0.76           -0.42

Thanks,
Allen

On Thu, Jun 12, 2008 at 7:34 PM, Erik Iverson wrote:
> what is class(data[3])
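Two things are worth separating here. The quotes appear because as.matrix() on a data frame with any text column (such as Probe_ID) produces a character matrix, so every cell is shown as a string; that by itself does not reveal which entry is wrong. To locate entries that stop a column from being read as numeric, one option is to convert the column and see which non-missing values turn into NA. The sketch below uses a small made-up data frame with hypothetical column names s1 and s2 and a deliberately bad entry "n/a"; it is not the poster's data.

```r
# Made-up stand-in for the real file: column s1 contains one stray text entry
data <- data.frame(
  Probe_ID = c("A_23_P105862", "A_23_P76435", "A_24_P402115", "A_32_P227764"),
  s1 = c("-1.6", "0.18", "n/a", "-0.76"),
  s2 = c(0.16, 0.59, -0.62, -0.42),
  stringsAsFactors = FALSE
)

# Rows whose s1 value is present but cannot be parsed as a number
col <- as.character(data$s1)
bad <- which(!is.na(col) & is.na(suppressWarnings(as.numeric(col))))
print(data[bad, c("Probe_ID", "s1")])

# After inspecting them, convert the column and compute row means over numeric columns
data$s1 <- suppressWarnings(as.numeric(col))
row_means <- rowMeans(data[sapply(data, is.numeric)], na.rm = TRUE)
print(row_means)
```

Going through as.character() first also covers the case where read.table() stored the column as a factor, since as.numeric() on a factor would return the internal level codes rather than the values.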