Re: [R] help get character output R in Delphi
mahendra mahfood mahendra025 at yahoo.com writes:

> I can't get the character output of R in Delphi. I have got the graphics
> of R in Delphi. I want to get the character output of R for my Delphi
> application. Does anybody know how? Example source, please?

I have written a little wrapper for Delphi and R (see http://www.menne-biomed.de/download/download.html). I never tried to capture the full output, because it's better to pick the items you want selectively. If you really want the text, I suggest redirecting selected output to a file (see ?sink) and reading it in from Delphi.

Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
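The sink() suggestion could look roughly like the sketch below on the R side. The file location and the fitted model are made up for illustration (they are not from the thread), and the Delphi side is assumed simply to read the text file afterwards.

```r
# Sketch only: write selected printed output to a text file that an
# external program (for example a Delphi application) can read.
out_file <- tempfile(fileext = ".txt")   # hypothetical location

fit <- lm(dist ~ speed, data = cars)     # any model will do; 'cars' ships with R

sink(out_file)        # start diverting printed output to the file
print(coef(fit))      # only the items we actually want
sink()                # stop diverting

captured <- readLines(out_file)          # what the other program would see
```

capture.output(print(coef(fit))) returns the same lines as a character vector without going through a file, which may be enough if the wrapper can fetch R objects directly.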
Re: [R] Systematic treatment of missing values
Thank you very much for your prompt reply and for adding the comments to the help pages for match and ==. I think the source of my confusion was that, looking at the current documentation (v 2.3.0), I did not realize that matching is different from equality testing. (Obviously in the case of regular expressions, etc., it is different, but I thought that with plain match and %in%, matching would be determined by ==.) Also, I did not mean for my first comment to sound like a criticism of R for treating NAs inconsistently.

Nonetheless I am still curious why the particular choice was made that match (and therefore %in%) acts differently from == with respect to NAs and NaNs (with the default, and only implemented, value of the incomparables parameter)?

Thank you,
David

On May 28, 2006, at 1:10 AM, Prof Brian Ripley wrote:

> You start with very general comments, but only use one specific function,
> match (see ?"%in%", a help page entitled `value matching'). Matching and
> equality are treated differently. By definition, NA matches NA and nothing
> else, and NaN matches NaN and nothing else. In comparisons, these values
> are not comparable.
>
> As you will have seen from the help page, match() has the expansion
> capacity for declaring values non-comparable. That has not been
> implemented for a decade and no one has supplied code to implement it, so
> it seems no one has much need of it.
>
> I have added notes to the help pages for match and == to say explicitly
> what matches and what is comparable. If the *Draft* R Language Definition
> were ever to be finished it would have such details: it already has a
> useful commentary.
>
> On Sat, 27 May 2006, David Soloveichik wrote:
>
> > I am wondering whether there is a well-accepted approach to handling
> > missing values (NAs) in a programming language such as R. For example,
> > most functions seem to propagate NA to the output when the value of the
> > missing entry could have mattered. In other words, most functions are
> > not willing to take a stand on what the missing value was. However,
> > some functions don't seem to do this. For example,
> >
> >   c(1,2,3,NA) %in% c(2,3)
> >   [1] FALSE  TRUE  TRUE FALSE
> >
> > rather than: FALSE TRUE TRUE NA
> >
> > Also, what is the logic of the following:
> >
> >   c(1,2,3,NA) %in% c(2,3,NA)
> >   [1] FALSE  TRUE  TRUE  TRUE
> >
> > Why is the last output value TRUE? Why does R claim that the NA on the
> > left hand side of %in% is the same as the NA on the right hand side of
> > %in%?
>
> It does not: it reports that it *matches*. Please do read the help page
> before posting, as the posting guide asked you to.
>
> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK  Fax: +44 1865 272595
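The distinction between matching and comparing discussed in this thread can be seen side by side in a few lines of base R. This is only an illustrative sketch; the variable names are made up.

```r
x <- c(1, 2, 3, NA)

# Value matching: NA is treated as a value that matches NA
in_result <- x %in% c(2, 3, NA)
pos       <- match(NA, c(2, 3, NA))   # position of the matching NA

# Comparison: NA is not comparable, so every comparison with it is NA
eq_result <- x == NA

# NaN matches NaN but not NA
nan_vs_na  <- NaN %in% NA
nan_vs_nan <- NaN %in% NaN
```

If the goal is an NA-propagating membership test, one option is to compute x %in% table and then set the result to NA wherever is.na(x).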
[R] Query: lme output
Dear R-Users

I have a problem accessing some values in the output from the summary of an lme fit. I fit the model below:

  ggg <- lme(ST ~ -1 + as.factor(endp):Z.sas + as.factor(endp),
             data = dat4a,
             random = ~ -1 + as.factor(endp) + as.factor(endp):Z.sas | as.factor(trials),
             correlation = corSymm(form = ~ 1 | as.factor(trials)/as.factor(id)),
             weights = varIdent(form = ~ 1 | endp))
  hh <- summary(ggg)
  hh

This is the part of the output that is of interest:

  Correlation Structure: General
   Formula: ~1 | as.factor(trials)/as.factor(id)
   Parameter estimate(s):
   Correlation:
    1
  2 0.785
  Variance function:
   Structure: Different standard deviations per stratum
   Formula: ~1 | endp
   Parameter estimates:
      -1         1
   1.000 0.9692405

I wish to access the value of the correlation (0.785) and the vector of the variance function estimates (1, 0.969). I know this can be done through the intervals function, but sometimes, when the estimated Hessian matrix is not positive definite or something like that (I am not quite sure), the intervals function delivers an error message. Thus, I would like to ask if there is another way to access these values.

I tried using the following code:

  hh$modelStruct$corStruct[1]
  hh$modelStruct$varStruct[1]

Instead, the output was:

  hh$modelStruct$corStruct[1]
  [1] -1.308580
  hh$modelStruct$varStruct[1]
  [1] -0.03124255

I presume there is a way to calculate the correlation and variance function coefficients from these values. Could someone tell me how to access those values (without using the intervals function) or, better still, how to calculate them from the last output values?

Kind regards
Pryseley
Re: [R] multiple comparisons of time series data (Stephan Moratti)
To account for the strong serial correlation you could try the lme() function of the nlme package. There you can apply different covariance structures in your linear model, such as a first-order autoregressive covariance structure (AR1).

Example:

  model.fit <- lme(response ~ condition * time,
                   data = time.series.data,
                   random = ~ 1 | case,
                   correlation = corCAR1())

This model uses an autoregressive process with a continuous time covariate. The random expression defines the intercept for each case (or observation, subject) as a random factor. Condition and time would be fixed factors in this case. See also help(lme) and help(corClasses).

Hope that helps,
Stephan

Stephan Moratti, PhD
Centro de Magnetoencefalografía Dr. Perez Modrego
Faculdad de Medicina
Universidad Complutense de Madrid
Pabellón 8
Avda. Complutense, s/n
28040 Madrid
Spain
email: [EMAIL PROTECTED]
Tel.: +34 91 394 2292
Fax.: +34 91 394 2294
Re: [R] Query: lme output
You can try dissecting the output of

  VarCorr(ggg)

but, once again, I can't guarantee that it's what you want, because we can't fit your model and you have not sent us a simple, reproducible example.

Andrew

On Tue, May 30, 2006 at 03:37:19AM -0700, Pryseley Assam wrote:

> Dear R-Users
>
> I have a problem accessing some values in the output from the summary of
> an lme fit. I fit the model below:
>
>   ggg <- lme(ST ~ -1 + as.factor(endp):Z.sas + as.factor(endp),
>              data = dat4a,
>              random = ~ -1 + as.factor(endp) + as.factor(endp):Z.sas | as.factor(trials),
>              correlation = corSymm(form = ~ 1 | as.factor(trials)/as.factor(id)),
>              weights = varIdent(form = ~ 1 | endp))
>   hh <- summary(ggg)
>   hh
>
> This is the part of the output that is of interest:
>
>   Correlation Structure: General
>    Formula: ~1 | as.factor(trials)/as.factor(id)
>    Parameter estimate(s):
>    Correlation:
>     1
>   2 0.785
>   Variance function:
>    Structure: Different standard deviations per stratum
>    Formula: ~1 | endp
>    Parameter estimates:
>       -1         1
>    1.000 0.9692405
>
> I wish to access the value of the correlation (0.785) and the vector of
> the variance function estimates (1, 0.969). I know this can be done
> through the intervals function, but sometimes, when the estimated Hessian
> matrix is not positive definite or something like that (I am not quite
> sure), the intervals function delivers an error message. Thus, I would
> like to ask if there is another way to access these values.
>
> I tried using the following code:
>
>   hh$modelStruct$corStruct[1]
>   hh$modelStruct$varStruct[1]
>
> Instead, the output was:
>
>   hh$modelStruct$corStruct[1]
>   [1] -1.308580
>   hh$modelStruct$varStruct[1]
>   [1] -0.03124255
>
> I presume there is a way to calculate the correlation and variance
> function coefficients from these values. Could someone tell me how to
> access those values (without using the intervals function) or, better
> still, how to calculate them from the last output values?
>
> Kind regards
> Pryseley

--
Andrew Robinson
Department of Mathematics and Statistics  Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia  Fax: +61-3-8344-4599
Email: [EMAIL PROTECTED]  http://www.ms.unimelb.edu.au
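As a rough check on the numbers in the question, the two unconstrained values can be transformed back by hand. The sketch below rests on two assumptions that are not stated in the thread: that a 2 x 2 corSymm block stores a single angle on a logit scale (the spherical parametrization), and that varIdent stores the log of the standard-deviation ratio. Both happen to reproduce the printed estimates.

```r
# Unconstrained values reported in the post
theta_cor <- -1.308580     # hh$modelStruct$corStruct[1]
theta_var <- -0.03124255   # hh$modelStruct$varStruct[1]

# Assumption: angle = pi * logistic(theta), correlation = cos(angle)
rho <- cos(pi * plogis(theta_cor))   # about 0.785

# Assumption: varIdent keeps log(sd ratio) relative to the reference stratum
sd_ratio <- exp(theta_var)           # about 0.9692

# With the fitted object at hand, the constrained values should also be
# available directly, without intervals():
#   coef(ggg$modelStruct$corStruct, unconstrained = FALSE)
#   coef(ggg$modelStruct$varStruct, unconstrained = FALSE)
```

The hand calculation only covers the single-correlation case shown here; for larger corSymm blocks the coef(..., unconstrained = FALSE) route is the safer one.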
[R] Automate concatenation?
I have this typical problem of joining a number of vectors with similar names - a1, a2,..., a10 - which should be concatenated into one. Using

  c(a1,a2,a3,a4,a5,a6,a,a8,a9,a10)

naturally works, but I would like to do it with less manual input. My attempts to use paste() give a vector of the vector names, see below. The question is how to do the concatenation? Any suggestions?

  paste("a", 1:10, sep = "")

Robert
[R] Time rather than dates?
Using strptime() and other functions for dates has been very helpful with the kind of data I often work with. However, I haven't found out how time as such should be specified. All my attempts result in time *and* date:

  treatment_time <- c("01:02:03", "02:03:04")  # hours:minutes:seconds
  time.2 <- strptime(treatment_time, format = "%H:%M:%S")
  time.2
  [1] "1900-01-01 01:02:03" "1900-01-01 02:03:04"

Why the 1900-...? I had hoped for some easy conversion from time to numeric data and possibly back. Assistance would be appreciated.

Robert
Re: [R] Time rather than dates?
On Tue, 30 May 2006, Robert Lundqvist wrote:

> Using strptime() and other functions for dates has been very helpful with
> the kind of data I often work with. However, I haven't found out how time
> as such should be specified. All my attempts result in time *and* date:
>
>   treatment_time <- c("01:02:03", "02:03:04")  # hours:minutes:seconds
>   time.2 <- strptime(treatment_time, format = "%H:%M:%S")
>   time.2
>   [1] "1900-01-01 01:02:03" "1900-01-01 02:03:04"
>
> Why the 1900-...? I had hoped for some easy conversion from time to
> numeric data and possibly back. Assistance would be appreciated.

You asked to print a datetime object of class POSIXlt, and unspecified fields are set to their earliest values. But printing is only part of the story. What do you actually want from this?

  3600*time.2$hour + 60*time.2$min + time.2$sec

gives you the number of seconds since midnight, for example. See ?DateTimeClasses

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
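The seconds-since-midnight idea can be carried through in both directions with base R alone. This is a minimal sketch; the tz = "UTC" argument and the sprintf() back-conversion are additions for illustration, not part of the reply above.

```r
treatment_time <- c("01:02:03", "02:03:04")   # hours:minutes:seconds

# Parse; a fixed time zone avoids surprises from local daylight-saving rules
lt <- strptime(treatment_time, format = "%H:%M:%S", tz = "UTC")

# Time of day as a plain number: seconds since midnight
secs <- 3600 * lt$hour + 60 * lt$min + lt$sec

# And back from seconds to an "HH:MM:SS" string
back <- sprintf("%02d:%02d:%02d",
                secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
```

Whatever date strptime() fills in never enters the calculation, because only the hour, minute and second fields are used.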
Re: [R] Automate concatenation?
You need get(); look also at FAQ 7.21.

  a1 <- 1:3
  a2 <- 4:5
  a3 <- 6:10
  a4 <- 11:20
  a5 <- 21:25
  #
  lapply(paste("a", 1:5, sep = ""), get)

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

- Original Message -
From: Robert Lundqvist [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, May 30, 2006 2:07 PM
Subject: [R] Automate concatenation?

> I have this typical problem of joining a number of vectors with similar
> names - a1, a2,..., a10 - which should be concatenated into one. Using
>
>   c(a1,a2,a3,a4,a5,a6,a,a8,a9,a10)
>
> naturally works, but I would like to do it with less manual input. My
> attempts to use paste() give a vector of the vector names, see below. The
> question is how to do the concatenation? Any suggestions?
>
>   paste("a", 1:10, sep = "")
>
> Robert
Re: [R] Time rather than dates?
Try using the times class in the chron package.

  library(chron)
  times(c("01:02:03", "02:03:04"))
  [1] 01:02:03 02:03:04

On 5/30/06, Robert Lundqvist [EMAIL PROTECTED] wrote:

> Using strptime() and other functions for dates has been very helpful with
> the kind of data I often work with. However, I haven't found out how time
> as such should be specified. All my attempts result in time *and* date:
>
>   treatment_time <- c("01:02:03", "02:03:04")  # hours:minutes:seconds
>   time.2 <- strptime(treatment_time, format = "%H:%M:%S")
>   time.2
>   [1] "1900-01-01 01:02:03" "1900-01-01 02:03:04"
>
> Why the 1900-...? I had hoped for some easy conversion from time to
> numeric data and possibly back. Assistance would be appreciated.
>
> Robert
Re: [R] Time rather than dates?
Hi

Which version of R do you use?

Version 2.3.1 beta (2006-05-23 r38179)

  treatment_time <- c("01:02:03", "02:03:04")
  strptime(treatment_time, format = "%H:%M:%S")
  [1] "2006-05-30 01:02:03" "2006-05-30 02:03:04"
  ttt <- strptime(treatment_time, format = "%H:%M:%S")

To change the format of POSIX variables, use format():

  format(ttt, "%H:%M")
  [1] "01:02" "02:03"

Maybe also check your locale settings.

HTH
Petr

On 30 May 2006 at 14:16, Robert Lundqvist wrote:

> Using strptime() and other functions for dates has been very helpful with
> the kind of data I often work with. However, I haven't found out how time
> as such should be specified. All my attempts result in time *and* date:
>
>   treatment_time <- c("01:02:03", "02:03:04")  # hours:minutes:seconds
>   time.2 <- strptime(treatment_time, format = "%H:%M:%S")
>   time.2
>   [1] "1900-01-01 01:02:03" "1900-01-01 02:03:04"
>
> Why the 1900-...? I had hoped for some easy conversion from time to
> numeric data and possibly back. Assistance would be appreciated.
>
> Robert

Petr Pikal
[EMAIL PROTECTED]
Re: [R] Automate concatenation?
Robert Lundqvist [EMAIL PROTECTED] writes:

> I have this typical problem of joining a number of vectors with similar
> names - a1, a2,..., a10 - which should be concatenated into one. Using
>
>   c(a1,a2,a3,a4,a5,a6,a,a8,a9,a10)
>
> naturally works, but I would like to do it with less manual input.

And less error-prone (where did the 7 go?)...

> My attempts to use paste() give a vector of the vector names, see below.
> The question is how to do the concatenation? Any suggestions?
>
>   paste("a", 1:10, sep = "")

I think this should work:

  unlist(lapply(paste("a", 1:10, sep = ""), get))

(And of course, the usual sermon applies: you're likely better off using a list of vectors rather than vectors with similar names.)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
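The suggestions in this thread can be tried on a small case. The sketch below uses three short vectors instead of ten; mget() is an extra base R alternative not mentioned in the replies, and the list at the end illustrates the "usual sermon".

```r
a1 <- 1:3
a2 <- 4:5
a3 <- 6:10

nms <- paste("a", 1:3, sep = "")            # "a1" "a2" "a3"

# Look each name up and concatenate the results
joined <- unlist(lapply(nms, get))

# Same idea with mget(), which fetches several objects at once
joined_mget <- unlist(mget(nms), use.names = FALSE)

# Keeping the vectors in a list from the start avoids the lookup entirely
a <- list(1:3, 4:5, 6:10)
joined_list <- unlist(a)
```

Both get() and mget() look the names up starting from the environment they are called in, so inside a function they may need an explicit envir argument.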
[R] Add text/numbers to x axis?
Is there any simple way to add text to the x axis in R? I have tried to add simple characters in R without any great success. An example of what I want to do is the added "C" in the following dumb plot (produced with MacAnova using the option dumb:T):

[ASCII "dumb" plot, flattened in the archive: y axis from 0 to 1, x axis from -2 to 2, with an extra label "C" on the x axis between 1 and 1.5]

My attempts to use locator() and text() with R's plot() have not been working as well as expected. I never really know where the added symbols end up...

BTW, does anyone know how such dumb plots could be achieved in R? This option is available in S, so a port to R shouldn't be impossible, should it? I don't know how to do it myself, however...

Robert
Re: [R] Add text/numbers to x axis?
Hi

Maybe something like

  plot(1:10, 1:10, axes = FALSE)
  axis(1, at = 1:10, labels = c((1:7)/10, "c", 9:10))
  axis(2, at = 1:10, labels = c(letters[1:5], 6:10))
  box()

HTH
Petr

On 30 May 2006 at 14:45, Robert Lundqvist wrote:

> Is there any simple way to add text to the x axis in R? I have tried to
> add simple characters in R without any great success. An example of what
> I want to do is the added "C" in the following dumb plot (produced with
> MacAnova using the option dumb:T):
>
> [ASCII "dumb" plot, flattened in the archive: y axis from 0 to 1, x axis
> from -2 to 2, with an extra label "C" on the x axis between 1 and 1.5]
>
> My attempts to use locator() and text() with R's plot() have not been
> working as well as expected. I never really know where the added symbols
> end up...
>
> BTW, does anyone know how such dumb plots could be achieved in R? This
> option is available in S, so a port to R shouldn't be impossible, should
> it? I don't know how to do it myself, however...
>
> Robert

Petr Pikal
[EMAIL PROTECTED]
Re: [R] Time rather than dates?
On Tue, 30 May 2006, Petr Pikal wrote:

> Which version of R do you use?
>
> Version 2.3.1 beta (2006-05-23 r38179)
>
>   treatment_time <- c("01:02:03", "02:03:04")
>   strptime(treatment_time, format = "%H:%M:%S")
>   [1] "2006-05-30 01:02:03" "2006-05-30 02:03:04"

From ?strptime:

  If the date string does not specify the date completely, the returned
  answer may be system-specific. The most common behaviour is to assume
  that unspecified seconds, minutes or hours are zero, and a missing year,
  month or day is the current one.

which explains the different answer. Neither of you told us your OS.

>   ttt <- strptime(treatment_time, format = "%H:%M:%S")
>
> To change the format of POSIX variables, use format():
>
>   format(ttt, "%H:%M")
>   [1] "01:02" "02:03"
>
> Maybe also check your locale settings.
>
> HTH
> Petr
>
> On 30 May 2006 at 14:16, Robert Lundqvist wrote:
>
> > Using strptime() and other functions for dates has been very helpful
> > with the kind of data I often work with. However, I haven't found out
> > how time as such should be specified. All my attempts result in time
> > *and* date:
> >
> >   treatment_time <- c("01:02:03", "02:03:04")  # hours:minutes:seconds
> >   time.2 <- strptime(treatment_time, format = "%H:%M:%S")
> >   time.2
> >   [1] "1900-01-01 01:02:03" "1900-01-01 02:03:04"
> >
> > Why the 1900-...? I had hoped for some easy conversion from time to
> > numeric data and possibly back. Assistance would be appreciated.
> >
> > Robert

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
[R] boosting - second posting
The family arg appears to be the problem. Either "bernoulli" or "adaboost" is appropriate for classification problems.

Max

> Perhaps by following the Posting Guide you're likely to get more helpful
> responses. You have not shown an example that others can reproduce, nor
> given version information for R or gbm. The output you showed does not
> use type="response", either.
>
> Andy
>
> From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> Sent: Sat 5/27/2006 4:02 PM
> To: 'R Help'
> Subject: [R] boosting - second posting [Broadcast]
>
> > Hi
> >
> > I am using boosting for a classification and prediction problem. For
> > some reason it is giving me an outcome that doesn't fall between 0 and
> > 1 for the predictions. I have tried type="response" but it made no
> > difference. Can anyone see what I am doing wrong?
> >
> > Screen output shown below:
> >
> >   boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
> >       data = train,              # dataset
> >                                  # +1: monotone increase,
> >                                  #  0: no monotone restrictions
> >       distribution = "gaussian", # bernoulli, adaboost, gaussian,
> >                                  # poisson, and coxph available
> >       n.trees = 3000,            # number of trees
> >       shrinkage = 0.005,         # shrinkage or learning rate,
> >                                  # 0.001 to 0.1 usually work
> >       interaction.depth = 3,     # 1: additive model, 2: two-way interactions, etc.
> >       bag.fraction = 0.5,        # subsampling fraction, 0.5 is probably best
> >       train.fraction = 0.5,      # fraction of data for training,
> >                                  # first train.fraction*N used for training
> >       n.minobsinnode = 10,       # minimum total weight needed in each node
> >       cv.folds = 5,              # do 5-fold cross-validation
> >       keep.data = TRUE,          # keep a copy of the dataset with the object
> >       verbose = FALSE)           # print out progress
> >
> >   best.iter = gbm.perf(boost.model, method = "cv")
> >   pred = predict.gbm(boost.model, test, best.iter)
> >   summary(pred)
> >      Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> >    0.4772  1.5140  1.6760  1.5100  1.7190  1.9420
Re: [R] Add text/numbers to x axis?
Robert Lundqvist [EMAIL PROTECTED] writes:

> Is there any simple way to add text to the x axis in R? I have tried to
> add simple characters in R without any great success. An example of what
> I want to do is the added "C" in the following dumb plot (produced with
> MacAnova using the option dumb:T):
>
> [ASCII "dumb" plot, flattened in the archive: y axis from 0 to 1, x axis
> from -2 to 2, with an extra label "C" on the x axis between 1 and 1.5]
>
> My attempts to use locator() and text() with R's plot() have not been
> working as well as expected. I never really know where the added symbols
> end up...

mtext() is your friend.

> BTW, does anyone know how such dumb plots could be achieved in R? This
> option is available in S, so a port to R shouldn't be impossible, should
> it? I don't know how to do it myself, however...

Presumably you just need someone to write a device driver for it. The structure of those is somewhat different from S, so it's not a straightforward port (even if the S code were publicly available). As I recall it (fortunately, it was decades ago), some cosmetic issues become tricky due to the coarse granularity of such line printer plots; things like alignment of strings and continuation of polylines. However, if someone wants to read up on Bresenham's line drawing algorithm and all that...

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
Re: [R] Add text/numbers to x axis?
On Tue, 30 May 2006, Robert Lundqvist wrote:

> Is there any simple way to add text to the x axis in R? I have tried to
> add simple characters in R without any great success. An example of what
> I want to do is the added "C" in the following dumb plot (produced with
> MacAnova using the option dumb:T):
>
> [ASCII "dumb" plot, flattened in the archive: y axis from 0 to 1, x axis
> from -2 to 2, with an extra label "C" on the x axis between 1 and 1.5]
>
> My attempts to use locator() and text() with R's plot() have not been
> working as well as expected. I never really know where the added symbols
> end up...

Well, they end up where you request them. In that particular case I would use axis() to add a label that matched the tick marks, but mtext() is often useful.

> BTW, does anyone know how such dumb plots could be achieved in R? This
> option is available in S, so a port to R shouldn't be impossible, should
> it? I don't know how to do it myself, however...

There is the little matter of having access to the S source code and permission to make use of it, as well as knowing how the S graphics device model works (which AFAIK has never been publicly documented but is believed to be somewhat different from R's). The devil is in the details: notice how the y axis on the plot shown is uneven, with a stretch in the middle? Should such things be allowed (I think not)?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
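The axis() and mtext() advice from this thread might be combined as in the sketch below. The curve, the tick positions and the location of the extra label (c_pos) are invented for illustration; pdf(NULL) is used only so that the sketch runs without opening a window or writing a file.

```r
pdf(NULL)                                  # null device: nothing is displayed or saved

x <- seq(-2, 2, by = 0.1)
plot(x, pnorm(x), type = "l", xaxt = "n")  # suppress the default x axis

ticks <- seq(-2, 2, by = 0.5)
axis(1, at = ticks)                        # the regular numeric ticks

c_pos <- 1.25                              # hypothetical position of the extra label
axis(1, at = c_pos, labels = "C", tick = FALSE)   # a label aligned with the tick labels

# mtext() places text in a margin; 'at' is given in x-axis user coordinates
mtext("C", side = 1, line = 2, at = c_pos)

invisible(dev.off())
```

Because both calls take the position in axis coordinates, there is no need for locator(); the label lands at the stated x value.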
[R] correlating values
I did a correlation for the values -

            EX4577      EX4599     EX4566     EX4522
  WL917     2.53528     0.79077    0.21499   -0.01084
  WP429S   -0.192723715
  WP819    -1.016997552
  WP977     1.378674   -0.07071    0.625089   0.4728363
  WI205S   -0.24443    -1.789526   0.648923  -0.775867

by using

  round(cor(t(person.data), use = "pairwise.complete.obs"))

I got the result as -

            WL917     WP429S   WP819   WP977     WI205S
  WL917     1         NA       NA      0.344    -0.11424
  WP429S    NA        NA       NA      NA        NA
  WP819     NA        NA       NA      NA        NA
  WP977     0.34461   NA       NA      1         .23294
  WI205S   -0.11424   NA       NA      0.23294   1

I notice that for the correlation WP429S x WP429S the value is given as NA, whereas it should be 1. The same is the case with WP819 x WP819.

Can someone please help me reason out why this is happening, and tell me any corrective measure that I need to take so that I can get the true value of 1 for them?

Thank you
- ash
Re: [R] boosting - second posting
As I remember, if you use distribution="bernoulli", then you don't have to use as.factor(your_response_variable) either.

Weiwei

On 5/30/06, Kuhn, Max [EMAIL PROTECTED] wrote:

> The family arg appears to be the problem. Either "bernoulli" or
> "adaboost" is appropriate for classification problems.
>
> Max
>
> > Perhaps by following the Posting Guide you're likely to get more
> > helpful responses. You have not shown an example that others can
> > reproduce, nor given version information for R or gbm. The output you
> > showed does not use type="response", either.
> >
> > Andy
> >
> > From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> > Sent: Sat 5/27/2006 4:02 PM
> > To: 'R Help'
> > Subject: [R] boosting - second posting [Broadcast]
> >
> > > Hi
> > >
> > > I am using boosting for a classification and prediction problem. For
> > > some reason it is giving me an outcome that doesn't fall between 0
> > > and 1 for the predictions. I have tried type="response" but it made
> > > no difference. Can anyone see what I am doing wrong?
> > >
> > > Screen output shown below:
> > >
> > >   boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
> > >       data = train,              # dataset
> > >                                  # +1: monotone increase,
> > >                                  #  0: no monotone restrictions
> > >       distribution = "gaussian", # bernoulli, adaboost, gaussian,
> > >                                  # poisson, and coxph available
> > >       n.trees = 3000,            # number of trees
> > >       shrinkage = 0.005,         # shrinkage or learning rate,
> > >                                  # 0.001 to 0.1 usually work
> > >       interaction.depth = 3,     # 1: additive model, 2: two-way interactions, etc.
> > >       bag.fraction = 0.5,        # subsampling fraction, 0.5 is probably best
> > >       train.fraction = 0.5,      # fraction of data for training,
> > >                                  # first train.fraction*N used for training
> > >       n.minobsinnode = 10,       # minimum total weight needed in each node
> > >       cv.folds = 5,              # do 5-fold cross-validation
> > >       keep.data = TRUE,          # keep a copy of the dataset with the object
> > >       verbose = FALSE)           # print out progress
> > >
> > >   best.iter = gbm.perf(boost.model, method = "cv")
> > >   pred = predict.gbm(boost.model, test, best.iter)
> > >   summary(pred)
> > >      Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> > >    0.4772  1.5140  1.6760  1.5100  1.7190  1.9420

--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
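One way to read the replies above: with a two-level factor as the response and a gaussian loss, the model may effectively be fitted to the factor's internal codes (1 and 2), which would be consistent with predictions spread between roughly 0.5 and 2. That interpretation is an assumption, not something confirmed in the thread. The base R sketch below only shows the recoding step; the level names are invented, and the gbm calls are left as comments because the data are not available.

```r
# A made-up two-level response
resp <- factor(c("no", "yes", "yes", "no"))

codes <- as.numeric(resp)           # internal codes of the factor: 1 2 2 1
y01   <- as.numeric(resp == "yes")  # 0/1 coding for a bernoulli-type loss

# Hypothetical follow-up, assuming gbm is installed and 'train01'/'test'
# hold the recoded data (both names are placeholders):
#   fit  <- gbm(y01 ~ ., data = train01, distribution = "bernoulli", n.trees = 3000)
#   pred <- predict(fit, test, n.trees = best.iter, type = "response")
```

With a 0/1 response and type = "response", predictions are probabilities and so stay between 0 and 1.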
Re: [R] Add text/numbers to x axis?
> BTW, does anyone know how such dumb plots could be achieved in R? This
> option is available in S, so a port to R shouldn't be impossible, should
> it? I don't know how to do it myself, however...

Use gnuplot or MacAnova.

  RSiteSearch("MacAnova")

has an example.
Re: [R] correlating values
Ahamarshan jn wrote:

> I did a correlation for the values -
>
>             EX4577      EX4599     EX4566     EX4522
>   WL917     2.53528     0.79077    0.21499   -0.01084
>   WP429S   -0.192723715
>   WP819    -1.016997552
>   WP977     1.378674   -0.07071    0.625089   0.4728363
>   WI205S   -0.24443    -1.789526   0.648923  -0.775867
>
> by using
>
>   round(cor(t(person.data), use = "pairwise.complete.obs"))
>
> I got the result as -
>
>             WL917     WP429S   WP819   WP977     WI205S
>   WL917     1         NA       NA      0.344    -0.11424
>   WP429S    NA        NA       NA      NA        NA
>   WP819     NA        NA       NA      NA        NA
>   WP977     0.34461   NA       NA      1         .23294
>   WI205S   -0.11424   NA       NA      0.23294   1
>
> I notice that for the correlation WP429S x WP429S the value is given as
> NA, whereas it should be 1. The same is the case with WP819 x WP819.
>
> Can someone please help me reason out why this is happening, and tell me
> any corrective measure that I need to take so that I can get the true
> value of 1 for them?
>
> Thank you
> - ash

Hi, Ash,

It's difficult to tell from the data you provided (maybe just poor email formatting), but it looks like both WP429S and WP819 have only one value. You cannot obtain a correlation with only one value (see any intro stats book for the definition of correlation, or just Google it). Try:

  x <- rnorm(1)
  y <- rnorm(1)
  cor(x, y)
  cor(matrix(x, 1, 1))

This is because a single observation gives no estimate of the variance of x or y. Forcing the NA to 1 as a corrective measure would be incorrect and misleading.

HTH,

--sundar
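The behaviour described in this reply can be reproduced with a small made-up matrix in which one row has a single non-missing value. The data and row names below are invented; only cor() and its use = "pairwise.complete.obs" option come from the thread.

```r
# Rows are the items to correlate (as in the post, hence the t() below)
m <- rbind(A = c(1, 2, 3, 4),
           B = c(NA, 5, NA, NA),   # only one observed value
           C = c(2, 1, 4, 3))

r <- cor(t(m), use = "pairwise.complete.obs")

# Row B has at most one complete pair with any row, including itself,
# so every correlation involving B is NA.
b_all_na <- all(is.na(r["B", ]))

# Rows with enough data behave as expected
r_AC <- r["A", "C"]
```

A practical alternative to forcing the diagonal to 1 is to drop rows with fewer than two observed values before calling cor(), for example with rowSums(!is.na(m)) >= 2.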
[R] when dimensionality is larger than the number of observations?
Hi, there:

Can anyone here kindly point me to some good references or links on this topic? Especially some solutions from BioConductor or R for dealing with microarray-like, "fat" data?

Thanks,

-- Weiwei Shi, Ph.D
Did you always know? No, I did not. But I believed... ---Matrix III
[R] executable file with R
Hi,

I wrote an R function, and I want to make an executable applet with it. Do you know how this is possible?

Thanks for your help.

Romain

-- Lorrillière Romain
UMR 8079 Laboratoire Ecologie, Systématique et Evolution
Bât. 362 Université Paris-Sud 91405 Orsay cedex France
tel : 01 69 15 56 85 fax : 01 69 15 56 96 mobile : 06 81 70 90 70
email : [EMAIL PROTECTED]
[R] average by group...
I have a dataframe with 700,000 rows and 2 vectors (columns): group and score. I wish to calculate a third vector of length 70: the average score by group. Even though the average value will repeat, I wish to return the average for that particular group for each row. (I know I can do this by calculating each group's average and then using the merge command, but as my calculations get more complex and my data set gets larger, the merge command seems to be fairly slow.)
Re: [R] when dimensionality is larger than the number of observations?
On 5/30/06, Weiwei Shi [EMAIL PROTECTED] wrote:

> Hi, there: Can anyone here kindly point some good reference or links on this topic? Esp. some solutions from BioConductor or R, when dealing with microarray-like, fat data?

In that case there will be an entire subspace of coefficient vectors that give the same fitted values. Let's take 3 rows of the iris data set and regress column 1 on the rest. There will be an entire subspace of coefficients that correspond to the same (unique) fitted values, and we can get one of those coefficient vectors using the generalized inverse:

> # test data
> iris3 <- iris[c(1, 51, 101),]
> y <- iris3[,1]
> y
[1] 5.1 7.0 6.3
> X <- model.matrix(~., iris3[,2:5])
> X
    (Intercept) Sepal.Width Petal.Length Petal.Width Speciesversicolor Speciesvirginica
1             1         3.5          1.4         0.2                 0                0
51            1         3.2          4.7         1.4                 1                0
101           1         3.3          6.0         2.5                 0                1
attr(,"assign")
[1] 0 1 2 3 4 4
attr(,"contrasts")
attr(,"contrasts")$Species
[1] "contr.treatment"

> library(MASS) # needed for ginv
> coefs <- c(ginv(crossprod(X)) %*% crossprod(X, y))
> names(coefs) <- colnames(X)
> coefs
      (Intercept)       Sepal.Width      Petal.Length       Petal.Width Speciesversicolor  Speciesvirginica
        0.3619361         1.1497417         0.5443438        -0.2405670         0.7372685        -0.5207289
> X %*% coefs # fitted values
    [,1]
1    5.1
51   7.0
101  6.3
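To make the "entire subspace" remark concrete, here is a small sketch that builds a second coefficient vector with the same fitted values. It reuses the iris example from the message; the vector `z` and the names `null_part` and `coefs2` are arbitrary choices for illustration. It relies on the standard fact that (I - X^+ X) projects any vector onto the null space of X, where X^+ is the generalized inverse.

```r
library(MASS)  # ginv(); MASS is a recommended package shipped with R

iris3 <- iris[c(1, 51, 101), ]
y <- iris3[, 1]
X <- model.matrix(~ ., iris3[, 2:5])   # 3 rows, 6 columns

# One least-squares solution via the generalized inverse, as in the message
coefs <- c(ginv(crossprod(X)) %*% crossprod(X, y))

# Adding any vector from the null space of X leaves X %*% coefs unchanged.
z <- c(1, -2, 3, 0.5, -1, 2)           # arbitrary, for illustration only
null_part <- c((diag(ncol(X)) - ginv(X) %*% X) %*% z)
coefs2 <- coefs + null_part

fit1 <- c(X %*% coefs)
fit2 <- c(X %*% coefs2)
all.equal(fit1, fit2)                  # same fitted values, different coefficients
```

Because the coefficients are not unique in this setting, only the fitted values (or a penalised estimate such as ridge regression) can be interpreted directly.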
[R] position of number at risk in survplot() graphs
Dear R-help,

How can one get survplot() to place the number at risk just below the survival curve, as opposed to the default, which is just above the x-axis? I tried the code below, but the result is not satisfactory: some numbers are repeated several times at different y coordinates, and the position of the n.risk numbers corresponds to the x-axis tick marks, not to the censoring times on the survival curve.

n <- 20
set.seed(731)
cens <- 15*runif(n)
h <- .02*exp(2)
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
S <- Surv(dt,e)
km <- survfit(S~1)
survplot(km,n.risk=T,conf='none', y.n.risk=unique(summary(km)$surv))

Any suggestion on addressing this problem would be appreciated. Also, is there a way to add a tick mark to the survival curve at times of censoring, similar to the mark.time=T argument in plot.survplot()?

Thanks,
Osman

-- Osman O. Al-Radi, MD, MSc, FRCSC
Fellow, Cardiovascular Surgery
The Hospital for Sick Children
University of Toronto, Canada
Re: [R] average by group...
?tapply

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of r user
Sent: Tuesday, May 30, 2006 12:27 PM
To: rhelp
Subject: [R] average by group...

I have a dataframe with 700,000 rows and 2 vectors (columns): group and score. I wish to calculate a third vector of length 70: the average score by group. Even though the average value will repeat, I wish to return the average for that particular group for each row. (I know I can do this by calculating each group's average and then using the merge command, but as my calculations get more complex and my data set gets larger, the merge command seems to be fairly slow.)
Re: [R] average by group...
Doran, Harold [EMAIL PROTECTED] writes:

> ?tapply

Nope. ?ave

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of r user
> Sent: Tuesday, May 30, 2006 12:27 PM
> To: rhelp
> Subject: [R] average by group...
>
> I have a dataframe with 700,000 rows and 2 vectors (columns): group and score. I wish to calculate a third vector of length 70: the average score by group. Even though the average value will repeat, I wish to return the average for that particular group for each row. (I know I can do this by calculating each group's average and then using the merge command, but as my calculations get more complex and my data set gets larger, the merge command seems to be fairly slow.)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
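A minimal sketch of what the ?ave pointer amounts to, on a toy data frame (the data and the column name `group.mean` are made up for illustration). ave() returns a vector as long as its input, with each element replaced by the summary of its group, which is the "repeat the group average on every row" behaviour asked for.

```r
dat <- data.frame(group = rep(c("a", "b", "c"), times = c(2, 3, 1)),
                  score = c(1, 3, 2, 4, 6, 10))

# One value per row: the mean score of that row's group (FUN defaults to mean)
dat$group.mean <- ave(dat$score, dat$group)
dat
```

Other summaries need FUN passed by name, for example ave(dat$score, dat$group, FUN = median), because the unnamed arguments after the first are all treated as grouping variables.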
Re: [R] combinatorial programming problem
Hi, Martin:

Thanks much. I don't know if it solved Kjetil's problem, but it helps educate me.

Best Wishes,
Spencer Graves

Martin Maechler wrote:

> "SpG" == Spencer Graves [EMAIL PROTECTED] on Sun, 28 May 2006 16:21:53 -0700 writes:
>
> SpG> I'm not sure I understand your question, but are you asking how to index choose(k, r) objects?
> SpG> Almost 3 years ago, I asked a question like this. Andy Liaw referred me to nchoosek(vsn) (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/12518.html).
> SpG> This produces a matrix of dimension (r, choose(k, r)). With this matrix, you could convert an integer between 1 and choose(k, r) into an r-vector by table look-up.
> SpG> Reading the code for nchoosek might help you further if this does not seem appropriate for you.
>
> Note that *if* the above is the answer, I'd rather recommend using combn() from package combinat, since a (slightly improved) version of combn() has been part of R-devel (to become R 2.4.0 in October) for a while. Also, combn() from combinat precedes nchoosek() historically and is also faster.
>
> Martin Maechler, ETH Zurich
>
> SpG> I found this just now using RSiteSearch("all subsets of a size"), which produced 102 hits. Another one that looked like it might help you is http://finzi.psych.upenn.edu/R/Rhelp02a/archive/1717.html.
> SpG> Hope this helps, Spencer Graves
>
> SpG> Kjetil Brinchmann Halvorsen wrote:
>
> Hola! I am programming a class (S3) symarray for storing the results of functions symmetric in their k arguments. Intended use is for association indices for more than two variables, for instance coresistivity against antibiotics. There is one programming problem I haven't solved: making an inverse of the index function indx() --- see code below. It could for instance return the original k indexes in strictly increasing order, to make indx() formally invertible. Any ideas?
>
> Kjetil
>
> Code:
>
> # Implementing an S3 class for symarrays with array rank r for dimension
> # [k, k, ..., k] with k=r repeated r times.
# We do not store the diagonal.
# Storage requirement is given by {r, k} = choose(k, r)
# where r = array rank, k = maximum index

symarray <- function(data=NA, dims=c(1,1)){
    r <- dims[1]
    k <- dims[2]
    if(r > k) stop("symarray needs dimension larger than array rank")
    len <- choose(k, r)
    out <- data[1:len]
    attr(out, "dims") <- dims
    class(out) <- "symarray"
    out
}

# Index calculation:
indx <- function(inds, k){
    r <- length(inds)
    if(r==1) return(inds) else {
        if(inds[1]==1) {
            return( indx(inds[-1]-1, k-1 ) )
        } else {
            return( indx(c(inds[1]-1, seq(from=k-r+2, by=1, to=k)), k) +
                    indx( inds[-1]-inds[1], k-inds[1] ))
        }
    }
} # end indx

# Methods for assignment and indexing:
"[.symarray" <- function(x, inds, drop=FALSE){
    dims <- attr(x, "dims")
    k <- dims[2]
    inds <- indx(inds, k)
    res <- NextMethod("[", x)
    res
}

"[<-.symarray" <- function(x, inds, value){
    dims <- attr(x, "dims")
    k <- dims[2]
    inds <- indx(inds, k)
    res <- NextMethod("[<-", x)
    res
}
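The table look-up idea mentioned in the thread can be sketched as follows. This assumes combn() is available (from package combinat at the time of the thread, and in base R's utils from 2.4.0 on); the helper names `unrank` and `rank_of` are made up for illustration. Note that the positions follow combn()'s own column ordering, which is not necessarily the ordering produced by indx() above.

```r
k <- 5
r <- 3
tab <- combn(k, r)   # r x choose(k, r) matrix, one combination per column

# position -> index vector: a plain column look-up
unrank <- function(i) tab[, i]

# index vector -> position: find the column equal to the sorted indices
rank_of <- function(inds) which(colSums(tab == sort(inds)) == r)

unrank(4)            # the 4th combination in combn()'s ordering
rank_of(unrank(4))   # round trip back to the position
```

The table needs r * choose(k, r) entries, so this is only practical for moderate k and r; for larger problems an arithmetic unranking scheme would be needed.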
[R] Help with adding minutes to time
Dear R Helpers,

I need to read time from a .csv file which is formatted as chartime (09:12:00) below. I need to add one minute (cf. chartime2). Then I need to output the value just as 09:13, without the seconds, for writing a csv file and input into another program. I get it with the following reproducible example, but I can't help thinking that there must be a less clumsy way to do that!

Thanks for any input, and possibly a pointer to an example of how to add minutes or seconds to a time without a date, which I think makes POSIX not relevant in this case.

Best regards,
Jean-louis

library(chron)
chartime <- "09:12:00"
chartime2 <- "00:01:00"
chrontime <- times(chartime)
chrontime2 <- times(chartime2)
test <- chrontime + chrontime2
test <- as.character(test)
test <- sub(":00", "", test)
[R] Weighting cluster variables in R?
Are there functions to weight variables for clustering in R? I can't seem to find anything, so apologies if there are. I am particularly interested in weighting variables (starting with kmeans) to optimise inter/intra-cluster distances. It seems to me that if certain variables do show a strong cluster structure, this would be a wise thing to do. Any advice welcome.

Quin
Re: [R] average by group...
I didn't know about ave(). What about this, though:

dat <- data.frame(score = rnorm(100), group = gl(10,10))
group.score <- with(dat, tapply(score, group, mean))
dat$group.score <- group.score[as.character(dat$group)]

Harold

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
Sent: Tuesday, May 30, 2006 1:09 PM
To: Doran, Harold
Cc: r user; rhelp
Subject: Re: [R] average by group...

Doran, Harold [EMAIL PROTECTED] writes:

> ?tapply

Nope. ?ave

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of r user
> Sent: Tuesday, May 30, 2006 12:27 PM
> To: rhelp
> Subject: [R] average by group...
>
> I have a dataframe with 700,000 rows and 2 vectors (columns): group and score. I wish to calculate a third vector of length 70: the average score by group. Even though the average value will repeat, I wish to return the average for that particular group for each row. (I know I can do this by calculating each group's average and then using the merge command, but as my calculations get more complex and my data set gets larger, the merge command seems to be fairly slow.)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
[R] na.pass
Hello...

What does na.pass do?

x = c(2.4, 2.4, 1.9, 2.5, 2.1)
xNA = replace(x, 3, NA)
p = c(acf(x, type=c("covariance"), plot=FALSE)$acf)
pNA = c(acf(xNA, type=c("covariance"), na.action=na.pass, plot=FALSE)$acf)

> p
[1]  0.05040 -0.03112  0.00816  0.00224 -0.00448
> pNA
[1]  0.02250 -0.01167  0.00250 -0.00100 -0.00250

The manual says na.pass returns the object unchanged...

Thanks,
Natalia
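A short sketch that may help separate the two things going on here. na.pass() itself really does return its argument unchanged; the difference in the numbers comes from acf(), which, as I read ?acf, computes the covariances from the complete cases when the na.action lets missing values through. The object names `res_default` and `res_pass` are made up for illustration.

```r
x   <- c(2.4, 2.4, 1.9, 2.5, 2.1)
xNA <- replace(x, 3, NA)

# na.pass hands the object back untouched, NA included
identical(na.pass(xNA), xNA)

# With the default na.action (na.fail), acf() refuses a series containing NA
res_default <- try(acf(xNA, type = "covariance", plot = FALSE), silent = TRUE)
inherits(res_default, "try-error")

# With na.pass the NA stays in the series and acf() works around it
res_pass <- acf(xNA, type = "covariance", na.action = na.pass, plot = FALSE)
c(res_pass$acf)
```

So pNA differs from p because one observation is missing from the calculation, not because na.pass altered the data.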
[R] max / pmax
Hello R users,

I am relatively new to R and cannot seem to crack a coding problem. I am working with substance abuse data, and I have a variable called primary.drug which is considered the drug of choice for each subject. I have just a few missing values on that variable. Instead of using a multiple imputation method like chained equations, I would prefer to derive these values from other survey responses. Specifically, I have a frequency of use (in days) for each of the major drugs, so I would like the missing values to be replaced by the drug with the highest level of use. I am starting with the ifelse and max statements, but I know it is wrong:

impute.primary.drug <- ifelse(is.na(primary.drug), max(marijuana, crack, cocaine, heroin), primary.drug)

Here are the problems. First, the max statement (should it be pmax?) returns the highest numeric quantity rather than the variable itself. In other words, I want to test which drug has the highest value, but return the variable name rather than the observed value. Second, if ties are observed, how can I specify the value to be NA? Or, how can I specify one of the values to be randomly selected?

Thanks in advance for your assistance.

Regards,
Brian
Re: [R] Help with adding minutes to time
Jean-Louis Abitbol [EMAIL PROTECTED] wrote:

> Dear R Helpers, I need to read time from a .csv file which is formatted as chartime (09:12:00) below. I need to add one minute (cf. chartime2). Then I need to output the value just as 09:13 without the seconds for writing a csv file and input in another program.

[...]

The numeric representation of chron objects is expressed in day units, and there are sensible arithmetic methods:

chrontime <- times("09:12:00") + 1/24/60

and then:

paste(hours(chrontime), minutes(chrontime), sep=":")

Cheers,

-- Seb
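For comparison, here is a sketch of the same operation in base R only, without chron. The function name `add_minutes` is made up for illustration; it assumes the input is always a well-formed "HH:MM:SS" string, drops the seconds, and wraps around at midnight.

```r
add_minutes <- function(hms, mins = 1) {
  # Split "HH:MM:SS" and convert hours and minutes to minutes since midnight
  parts <- strsplit(hms, ":", fixed = TRUE)
  total <- vapply(parts, function(p) {
    p <- as.integer(p)
    p[1] * 60L + p[2]
  }, integer(1))
  # Add the offset and wrap around after 24 hours
  total <- (total + mins) %% (24 * 60)
  # Zero-padded "HH:MM", seconds dropped
  sprintf("%02d:%02d", as.integer(total %/% 60), as.integer(total %% 60))
}

add_minutes("09:12:00")          # one minute later, without seconds
add_minutes(c("23:59:30", "00:00:00"), 1)
```

The zero-padding via sprintf() keeps the leading "0" in "09:13", which matters if the receiving program expects a fixed-width time field.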
[R] merging
Dear List,

Given,

y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
rownames(y) <- c("a","b","c")
colnames(y) <- c("1","2","3")
y
y2 <- y[2:3, ]
rownames(y2) <- c("x","z")
y2

how can I stop

merge(y, y2, all = TRUE, sort = FALSE)

squishing the extra rows? Ideally I want the same as rbind(y, y2) in this case. This is a specific example of a situation where two data matrices have the same column variables and all I want is to stick the two sets of rows together, but I have been using merge for cases such as the one below, where the second matrix has extra column(s):

y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
rownames(y3) <- c("d","e","f")
colnames(y3) <- c("1","2","3","4")
y3
merge(y, y3, all = TRUE, sort = FALSE)

We don't know beforehand if the columns will match. But I see now that even this doesn't work as I was expecting/thinking! So I'm looking for a general way to merge two matrices such that the number of rows in the merged matrix is nrow(mat1) + nrow(mat2) and the number of columns in the merged matrix is length(unique(c(colnames(mat1), colnames(mat2)))). Is there a function in R to do this, or can someone suggest a way to achieve this? My R version info is at the end.

Just to be clear, for the y, y3 example I want something like this returned:

  1 2 3  4
a 0 1 1 NA
b 1 0 0 NA
c 0 4 4 NA
d 0 1 1  1
e 0 0 0  4
f 4 5 6  7

and for the y, y2 example, I want something like this returned:

  1 2 3
a 0 1 1
b 1 0 0
c 0 4 4
x 1 0 0
z 0 4 4

Many thanks,
Gav

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status         Patched
major          2
minor          3.0
year           2006
month          05
day            03
svn rev        37978
language       R
version.string Version 2.3.0 Patched (2006-05-03 r37978)

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
* Note new Address, Telephone & Fax numbers from 6th April 2006 *
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson
ECRC & ENSIS                 [t] +44 (0)20 7679 0522
UCL Department of Geography  [f] +44 (0)20 7679 0565
Pearson Building             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street                 [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK.                  [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
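One way the requested behaviour could be sketched in base R is shown below. The function name `rbind_fill` is hypothetical, and the sketch assumes two numeric matrices that both carry column names: each matrix is first padded out to the union of the column names, with NA in the columns it lacks, and the padded matrices are then stacked with rbind().

```r
rbind_fill <- function(a, b) {
  cols <- union(colnames(a), colnames(b))
  # Expand a matrix to the full column set, NA where it has no data
  pad <- function(m) {
    out <- matrix(NA_real_, nrow = nrow(m), ncol = length(cols),
                  dimnames = list(rownames(m), cols))
    out[, colnames(m)] <- m
    out
  }
  rbind(pad(a), pad(b))
}

y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE,
            dimnames = list(c("a","b","c"), c("1","2","3")))
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE,
             dimnames = list(c("d","e","f"), c("1","2","3","4")))

res <- rbind_fill(y, y3)
res
```

Unlike merge(), this never matches rows on their values, so the result always has nrow(a) + nrow(b) rows.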
Re: [R] Compiling R to run natively on Windows x64
I am today where Alastair was back in August 2005. I am considering buying a 64-bit machine with 16GB of memory to run bigger objects in R, which brought me down the Linux path. The problem is that my company's IT staff doesn't support Linux and this path was getting very bumpy for me, so I am oscillating back towards 64-bit Windows. I thought the lack of a 64-bit Windows binary was due to there being no open-source gcc compiler for 64-bit Windows, but Prof. Ripley's reply seems to imply that I could not even buy a compiler and compile the source code myself, because there are none that treat a long as 64 bits. Is that the only remaining obstacle, and does that obstacle still exist?

Thanks,
Roger

On 8/19/05, Prof Brian Ripley [EMAIL PROTECTED] wrote:

> On Thu, 18 Aug 2005, Alastair Cooper wrote:
>
> > I am looking at getting a PC preinstalled with Windows XP x64. What I want to know is, has anyone successfully compiled a version of R for 64-bit Windows (Amd64 - not Itanium), and if so did they find any performance boost?
>
> Hmm, where do you get a reliable C99-compatible compiler for 64-bit Windows? We don't know of one, and the R sources are written assuming long is 64-bit on a 64-bit platform (and that is not the Win64 convention), so there would still be a lot of 32-bit restrictions until we change that (which is on my TODO list).
>
> (We don't support building R with VC++, and although there have been a number of attempts none has produced a version that passes make check: I recall finding VC++ thought -Inf > 3, for example.)
>
> Based on extensive experience on other platforms, I would expect a noticeable performance hit for a 64-bit build, but the ability to run bigger tasks: this is discussed with data and reasoning in the latest R-admin manual (in the R-devel version of R).
>
> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Piecewise (broken stick) models in R
Hi there,

I'm a newbie in R and I'm looking for some advice on how to test piecewise (broken stick) models.

I have another simple question: how can I compute descriptive statistics for data grouped by two or more variables? See below, please.

Year Site Repetition rainfall
1980 1    1          ...
1980 1    2          ...
1980 1    3          ...
1980 2    1          ...
1980 2    2          ...
1980 2    3          ...
1990 1    1          ...
1990 1    2          ...
1990 1    3          ...
1990 2    1          ...
1990 2    2          ...
1990 2    3          ...

I'd like to compute the mean rainfall for each YEAR * SITE combination.

Kind regards,
Miltinho
Re: [R] Compiling R to run natively on Windows x64
On 5/30/2006 2:18 PM, roger bos wrote:

> I am today where Alastair was back in August 2005. I am considering buying a 64-bit machine with 16GB of memory to run bigger objects in R, which brought me down the Linux path. The problem is that my company's IT staff doesn't support Linux and this path was getting very bumpy for me, so I am oscillating back towards 64-bit Windows. I thought the lack of a 64-bit Windows binary was due to there being no open-source gcc compiler for 64-bit Windows, but Prof. Ripley's reply seems to imply that I could not even buy a compiler and compile the source code myself, because there are none that treat a long as 64 bits. Is that the only remaining obstacle, and does that obstacle still exist?

I would somewhat seriously suggest that you give your support staff a copy of the R source, and get them to work out the details of compiling it in Win64. I'm sure lots of people would appreciate their work.

Duncan Murdoch
Re: [R] Help with adding minutes to time
Try this:

> hhmm <- function(x) sub(":..$", "", format(times(x)))
> y <- times("01:02:00", out.format = hhmm) + times("00:01:00")
> y # or format(y)
[1] 01:03

or

> # hhmm is from above
> x <- times("01:02:00")
> hhmm(x + times("00:01:00"))
[1] "01:03"

On 5/30/06, Jean-Louis Abitbol [EMAIL PROTECTED] wrote:

> Dear R Helpers,
>
> I need to read time from a .csv file which is formatted as chartime (09:12:00) below. I need to add one minute (cf. chartime2). Then I need to output the value just as 09:13 without the seconds for writing a csv file and input in another program. I get it with the following reproducible example but I can't help thinking that there must be a less clumsy way to do that!
>
> Thanks for any input and eventually a pointer to an example of how to add minutes or seconds to a time without a date, which I think makes POSIX not relevant in this case.
>
> Best regards, Jean-louis
>
> library(chron)
> chartime <- "09:12:00"
> chartime2 <- "00:01:00"
> chrontime <- times(chartime)
> chrontime2 <- times(chartime2)
> test <- chrontime + chrontime2
> test <- as.character(test)
> test <- sub(":00", "", test)
Re: [R] max / pmax
Here's an example of how I think you can do what you want. Play with the definition of the function highest.use() to get random selection of multiple maxima.

> drug.names <- c("marijuana", "crack", "cocaine", "heroin")
> drugs <- factor(drug.names, levels=drug.names)
> drugs
[1] marijuana crack     cocaine   heroin
Levels: marijuana crack cocaine heroin
> as.numeric(drugs)
[1] 1 2 3 4
> N <- 20
> set.seed(1)
> primary.drug <- sample(drugs, N, rep=T)
> primary.drug[sample(1:20, 10)] <- NA
> primary.drug
 [1] <NA>      crack     <NA>      <NA>      <NA>      <NA>      heroin
 [8] cocaine   cocaine   marijuana <NA>      <NA>      cocaine   crack
[15] heroin    <NA>      cocaine   heroin    <NA>      <NA>
Levels: marijuana crack cocaine heroin
> # usage frequencies
> marijuana <- sample(1:3, N, rep=T)
> crack <- sample(1:3, N, rep=T)
> cocaine <- sample(1:3, N, rep=T)
> heroin <- sample(1:3, N, rep=T)
> cbind(marijuana, crack, cocaine, heroin)
      marijuana crack cocaine heroin
 [1,]         2     2       2      1
 [2,]         2     3       3      1
 [3,]         2     2       2      2
 [4,]         1     1       2      3
 [5,]         3     1       2      3
 [6,]         3     1       3      3
 [7,]         3     1       3      2
 [8,]         1     2       2      2
 [9,]         3     2       3      3
[10,]         2     2       3      2
[11,]         3     3       2      2
[12,]         2     1       3      2
[13,]         3     2       2      1
[14,]         2     1       1      3
[15,]         2     2       3      2
[16,]         3     1       1      1
[17,]         1     2       3      1
[18,]         2     3       1      2
[19,]         3     1       1      3
[20,]         3     3       1      2
> highest.use <- function(x) {y <- which(x==max(x, na.rm=T)); if (length(y)==1) return(y) else return(NA)}
> apply(cbind(marijuana, crack, cocaine, heroin), 1, highest.use)
 [1] NA NA NA  4 NA NA NA NA NA  3 NA  3  1  4  3  1  3  2 NA NA
> impute.primary.drug <- drugs[ifelse(is.na(primary.drug), apply(cbind(marijuana, crack, cocaine, heroin), 1, highest.use), as.numeric(primary.drug))]
> data.frame(primary.drug, impute.primary.drug)
   primary.drug impute.primary.drug
1          <NA>                <NA>
2         crack               crack
3          <NA>                <NA>
4          <NA>              heroin
5          <NA>                <NA>
6          <NA>                <NA>
7        heroin              heroin
8       cocaine             cocaine
9       cocaine             cocaine
10    marijuana           marijuana
11         <NA>                <NA>
12         <NA>             cocaine
13      cocaine             cocaine
14        crack               crack
15       heroin              heroin
16         <NA>           marijuana
17      cocaine             cocaine
18       heroin              heroin
19         <NA>                <NA>
20         <NA>                <NA>

Brian Perron wrote:

> Hello R users,
>
> I am relatively new to R and cannot seem to crack a coding problem. I am working with substance abuse data, and I have a variable called primary.drug which is considered the drug of choice for each subject. I have just a few missing values on that variable. Instead of using a multiple imputation method like chained equations, I would prefer to derive these values from other survey responses. Specifically, I have a frequency of use (in days) for each of the major drugs, so I would like the missing values to be replaced by that drug with the highest level of use. I am starting with the ifelse and max statements, but I know it is wrong:
>
> impute.primary.drug <- ifelse(is.na(primary.drug), max(marijuana, crack, cocaine, heroin), primary.drug)
>
> Here are the problems. First, the max statement (should it be pmax?) returns the highest numeric quantity rather than the variable itself. In other words, I want to test which drug has the highest value, but return the variable name rather than the observed value. Second, if ties are observed, how can I specify the value to be NA? Or, how can I specify one of the values to be randomly selected?
>
> Thanks in advance for your assistance.
>
> Regards, Brian
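The reply invites experimenting with highest.use() to get random selection among tied maxima. One possible variant is sketched below; the name `highest.use.random` and the small `use` matrix are made up for illustration. The guard on length(y) matters because sample(y, 1) on a single number y would draw from 1:y rather than return y.

```r
highest.use.random <- function(x) {
  y <- which(x == max(x, na.rm = TRUE))
  if (length(y) == 0) return(NA_integer_)   # nothing usable in this row
  if (length(y) == 1) return(y)             # unique maximum
  y[sample.int(length(y), 1)]               # pick one of the tied maxima
}

drug.names <- c("marijuana", "crack", "cocaine", "heroin")
use <- rbind(c(1, 1, 2, 3),    # unique maximum: heroin
             c(3, 1, 2, 3))    # tie between marijuana and heroin

set.seed(42)
idx <- apply(use, 1, highest.use.random)
drug.names[idx]
```

Whether a random pick or NA is the better treatment of ties is a substantive question about the survey, not a coding one; the original highest.use() keeps ties as NA.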
Re: [R] Piecewise (broken stick) models in R
?aggregate

> y
   year site rep      rain
1  1980    1   1 2.6550866
7  1980    1   2 3.7212390
13 1980    1   3 5.7285336
4  1980    2   1 9.0820779
10 1980    2   2 2.0168193
16 1980    2   3 8.9838968
2  1981    1   1 9.4467527
8  1981    1   2 6.6079779
14 1981    1   3 6.2911404
5  1981    2   1 0.6178627
11 1981    2   2 2.0597457
17 1981    2   3 1.7655675
3  1982    1   1 6.8702285
9  1982    1   2 3.8410372
15 1982    1   3 7.6984142
6  1982    2   1 4.9769924
12 1982    2   2 7.1761851
18 1982    2   3 9.9190609
> aggregate(y, list(y$year, y$site), mean)
  Group.1 Group.2 year site rep     rain
1    1980       1 1980    1   2 4.034953
2    1981       1 1981    1   2 7.448624
3    1982       1 1982    1   2 6.136560
4    1980       2 1980    2   2 6.694265
5    1981       2 1981    2   2 1.481059
6    1982       2 1982    2   2 7.357413

On 5/30/06, Milton Cezar [EMAIL PROTECTED] wrote:

> Hi there, I'm a newbie in R and I'm looking for some advice on how to test piecewise (broken stick) models. I have another simple question: how can I compute descriptive statistics for data grouped by two or more variables? See below, please.
>
> Year Site Repetition rainfall
> 1980 1    1          ...
> 1980 1    2          ...
> 1980 1    3          ...
> 1980 2    1          ...
> 1980 2    2          ...
> 1980 2    3          ...
> 1990 1    1          ...
> 1990 1    2          ...
> 1990 1    3          ...
> 1990 2    1          ...
> 1990 2    2          ...
> 1990 2    3          ...
>
> I'd like to compute the mean rainfall for each YEAR * SITE combination.
>
> Kind regards, Miltinho

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
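The aggregate() call above averages every column, including year, site and rep. A sketch of a tidier variant follows, using the formula interface of aggregate(), which was added to R after this thread was written; the toy data frame `rainfall` is made up to mirror the layout in the question.

```r
# Toy data in the layout of the question: 2 years x 2 sites x 3 repetitions
rainfall <- expand.grid(rep = 1:3, site = 1:2, year = c(1980, 1990))
rainfall$rain <- seq_len(nrow(rainfall))   # illustrative values 1..12

# Mean of rain only, for each year * site combination
res <- aggregate(rain ~ year + site, data = rainfall, FUN = mean)
res
```

An equivalent two-way table can be had with tapply(rainfall$rain, list(rainfall$year, rainfall$site), mean) if a matrix is more convenient than a data frame.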
Re: [R] Compiling R to run natively on Windows x64
Duncan,

Thanks for the reply. I was hoping to hear that other people were successful, and then I would try doing so myself, but no one in the R community has gotten R to compile under 64-bit Windows. Unfortunately, I am not going to be the one to overcome this obstacle, since I only took a few C++ courses 7 years ago and haven't used it since. I have a better chance of learning how to use Linux.

Thanks,
Roger

On 5/30/06, Duncan Murdoch [EMAIL PROTECTED] wrote:

> On 5/30/2006 2:18 PM, roger bos wrote:
>
> > I am today where Alastair was back in August 2005. I am considering buying a 64-bit machine with 16GB of memory to run bigger objects in R, which brought me down the Linux path. The problem is that my company's IT staff doesn't support Linux and this path was getting very bumpy for me, so I am oscillating back towards 64-bit Windows. I thought the lack of a 64-bit Windows binary was due to there being no open-source gcc compiler for 64-bit Windows, but Prof. Ripley's reply seems to imply that I could not even buy a compiler and compile the source code myself, because there are none that treat a long as 64 bits. Is that the only remaining obstacle, and does that obstacle still exist?
>
> I would somewhat seriously suggest that you give your support staff a copy of the R source, and get them to work out the details of compiling it in Win64. I'm sure lots of people would appreciate their work.
>
> Duncan Murdoch
[R] Faster way to zero-pad a data frame...?
Hello List,

I am working on creating periodograms from IP network traffic logs using the Fast Fourier Transform. The FFT requires all the data points to be evenly spaced in the time domain (constant delta-T), so I have a step where I zero-pad the data. Lately I've been wondering if there is a faster way to do this. Here's what I've got:

* data1 is a data frame consisting of a timestamp, in seconds, from the beginning of the network log, and the number of network events that fell on that timestamp. Example:

time,events
0,1
1,30
5,14
10,4

* data2 is the zero-padded data frame. It has length equal to the greatest value of time in data2:

time,events
1,0
2,0
3,0
4,0
5,0
6,0
7,0
8,0
9,0
10,0

So I run this for loop:

for(i in 1:length(data1[,1])) {
  data2[data1[i,1],2] <- data1[i,2]
}

which goes to each row in data1, reads the timestamp, and writes the events to the corresponding row in data2. The result is:

time,events
0,1
1,30
2,0
3,0
4,0
5,14
6,0
7,0
8,0
9,0
10,4

For a 24-hour log (86,400 seconds) this can take a while... Any advice on how to speed it up would be appreciated.

Thanks,
Pete Cap
[R] weighted correlation coefficient
Dear R-listers,

Is there an R package that I can use to compute a weighted correlation coefficient along with its p-value? I want to be able to assign weights to the data points before computing a Pearson correlation coefficient.

Thanks in advance.
Young-Jin
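For what it's worth, a minimal sketch of the weighted Pearson correlation using cov.wt() from the stats package, which returns a weighted correlation matrix when called with cor = TRUE. The helper name weighted_cor and the simulated data are only illustrative assumptions. Note that cov.wt() does not return a p-value; that part of the question would need something extra (a permutation test is one option, not shown here).

```r
# Weighted Pearson correlation via stats::cov.wt().
# weighted_cor is a hypothetical helper name, not an existing R function.
weighted_cor <- function(x, y, w) {
  stopifnot(length(x) == length(y), length(x) == length(w), all(w >= 0))
  w <- w / sum(w)                          # normalise the weights to sum to 1
  cov.wt(cbind(x, y), wt = w, cor = TRUE)$cor[1, 2]
}

# Made-up data for illustration only
set.seed(1)
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)
w <- runif(50)

r_weighted <- weighted_cor(x, y, w)
r_equal    <- weighted_cor(x, y, rep(1, 50))   # equal weights reproduce cor(x, y)
```

With equal weights the result should agree with the ordinary cor(x, y), which is a quick sanity check before using real weights.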
Re: [R] merging
Gavin Simpson wrote:
> Dear List,
>
> Given,
>
> y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
> rownames(y) <- c("a","b","c")
> colnames(y) <- c("1","2","3")
> y
> y2 <- y[2:3, ]
> rownames(y2) <- c("x","z")
> y2
>
> how can I stop
>
> merge(y, y2, all = TRUE, sort = FALSE)
>
> squishing the extra rows? Ideally I want the same as:
>
> rbind(y, y2)
>
> in this case. This is a specific example of a situation where two data matrices have the same column variables and all I want is to stick the two sets of rows together, but I have been using merge for cases such as the one below, where the second matrix has extra column(s):
>
> y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
> rownames(y3) <- c("d","e","f")
> colnames(y3) <- c("1","2","3","4")
> y3
> merge(y, y3, all = TRUE, sort = FALSE)
>
> We don't know beforehand if the columns will match. But I see now that even this doesn't work as I was expecting/thinking!
>
> So I'm looking for a general way to merge two matrices such that the number of rows in the merged matrix is nrow(mat1) + nrow(mat2) and the number of columns in the merged matrix is length(unique(c(colnames(mat1), colnames(mat2)))).
>
> Is there a function in R to do this, or can someone suggest a way to achieve this? My R version info is at the end.
>
> Just to be clear, for the y, y3 example I want something like this returned:
>
>   1 2 3  4
> a 0 1 1 NA
> b 1 0 0 NA
> c 0 4 4 NA
> d 0 1 1  1
> e 0 0 0  4
> f 4 5 6  7
>
> and for the y, y2 example, I want something like this returned:
>
>   1 2 3
> a 0 1 1
> b 1 0 0
> c 0 4 4
> x 1 0 0
> z 0 4 4
>
> Many thanks,
> Gav
>
> version
>                _
> platform       i686-pc-linux-gnu
> arch           i686
> os             linux-gnu
> system         i686, linux-gnu
> status         Patched
> major          2
> minor          3.0
> year           2006
> month          05
> day            03
> svn rev        37978
> language       R
> version.string Version 2.3.0 Patched (2006-05-03 r37978)

Will this help:

rbind.all <- function(...) {
  x <- list(...)
  cn <- unique(unlist(lapply(x, colnames)))
  for(i in seq(along = x)) {
    if(any(m <- !cn %in% colnames(x[[i]]))) {
      na <- matrix(NA, nrow(x[[i]]), sum(m))
      dimnames(na) <- list(rownames(x[[i]]), cn[m])
      x[[i]] <- cbind(x[[i]], na)
    }
  }
  do.call(rbind, x)
}

y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
rownames(y) <- c("a","b","c")
colnames(y) <- c("1","2","3")
y2 <- y[2:3, 2:3]
rownames(y2) <- c("x","z")
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
rownames(y3) <- c("d","e","f")
colnames(y3) <- c("1","2","3","4")
rbind.all(y, y2, as.data.frame(y3))

It does very little error-checking, so be careful how you use it.

--sundar
Re: [R] Install R problem
I copied R-2.3.0.tar.gz and uncompressed it; the directory R-2.3.0 was created. I then ran:

./configure
make

and typed R in the directory /root/downloads/R-2.3.0/bin/exec:

-bash-2.05b# pwd
/root/downloads/R-2.3.0/bin/exec
-bash-2.05b# R
Fatal error: R home directory is not defined
Re: [R] Faster way to zero-pad a data frame...?
How about starting your time from 1 instead of 0 to make indexing easier (you can always subtract one later). If so:

> x
  time events
1    1      1
2    2     30
3    6     14
4   11      4
> y <- data.frame(time=seq(max(x$time)), events=rep(0, max(x$time)))
> y
   time events
1     1      0
2     2      0
3     3      0
4     4      0
5     5      0
6     6      0
7     7      0
8     8      0
9     9      0
10   10      0
11   11      0
> y$events[x$time] <- x$events
> y
   time events
1     1      1
2     2     30
3     3      0
4     4      0
5     5      0
6     6     14
7     7      0
8     8      0
9     9      0
10   10      0
11   11      4

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
[R] En: R: Piecewise (broken stick) models in R Stats for groups
Date: Tue, 30 May 2006 16:08:26 -0300 (ART)
From: Milton Cezar [EMAIL PROTECTED]
Subject: R: Piecewise (broken stick) models in R Stats for groups
To: [EMAIL PROTECTED]

Chris Barker and everyone,

Thanks for your fast reply. Regarding the descriptive stats, I can compute them for the full dataset, but I can't compute them for groups within it. I used mean(Rainfall[year==1980]) without problem. But how can I get the mean Rainfall for Year==1980 AND Site==1; Year==1980 AND Site==2; etc.?

Regarding my question about piecewise regression models: it can really be done with the linear models functions. In fact, the results are a set of models that each work better in a different range of an X variable (for X from 0 to 0.15 use model_1; for 0.16 to 0.46 use model_2; for X > 0.47 use model_3; etc.).

Sorry for my broken English.

Regards a lot,
Miltinho

From: Barker, Chris [SCIUS]
To: 'Milton Cezar'
Sent: Tuesday, May 30, 2006 3:47 PM
Subject: RE: [R] Piecewise (broken stick) models in R

> You can get descriptive statistics with mean(), var() or summary(). As to broken stick, I can only speculate that it's a variation on a regression. You may need to be more specific in your questions to the list, as I suspect it's something easily done with the linear models functions.
>
> Chris Barker
> Associate Director, Biostatistics
> Scios Inc.
> 6500 Paseo Padre Parkway
> Fremont, CA 94555
> Tel 510 248 2439
> Fax 510 248 2451

From: [EMAIL PROTECTED] On Behalf Of Milton Cezar
Sent: Tuesday, May 30, 2006 11:42 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Piecewise (broken stick) models in R

> Hi there,
>
> I'm a newbie in R and I'm looking for some advice on how to test piecewise (broken stick) models. I have another simple question: how can I compute descriptive statistics for data grouped by two or more variables? See below, please.
>
> Year Site Repetition rainfall
> 1980 1    1          ...
> 1980 1    2          ...
> 1980 1    3          ...
> 1980 2    1          ...
> 1980 2    2          ...
> 1980 2    3          ...
> 1990 1    1          ...
> 1990 1    2          ...
> 1990 1    3          ...
> 1990 2    1          ...
> 1990 2    2          ...
> 1990 2    3          ...
>
> I'd like to compute the mean rainfall for each YEAR * SITE combination.
>
> Kind regards,
> Miltinho
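The grouped-means part of the question can be sketched in base R as below. The rainfall numbers are invented purely for illustration (the original table elides them), and the object names are assumptions; aggregate() and tapply() are standard functions.

```r
# Made-up rainfall values laid out like the table in the question
rain <- data.frame(
  Year       = rep(c(1980, 1990), each = 6),
  Site       = rep(rep(1:2, each = 3), times = 2),
  Repetition = rep(1:3, times = 4),
  Rainfall   = c(10, 12, 14, 20, 22, 24, 30, 32, 34, 40, 42, 44)
)

# One row per Year x Site combination
means_long <- aggregate(Rainfall ~ Year + Site, data = rain, FUN = mean)

# The same means arranged as a Year-by-Site table
means_wide <- with(rain, tapply(Rainfall, list(Year = Year, Site = Site), mean))

# A single cell: combine the two conditions with &
m_1980_site1 <- with(rain, mean(Rainfall[Year == 1980 & Site == 1]))
```

The last line is the direct extension of mean(Rainfall[year==1980]): the logical conditions are joined with & rather than written as "AND".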
Re: [R] merging
On Tue, 2006-05-30 at 19:08 +0100, Gavin Simpson wrote:
> So I'm looking for a general way to merge two matrices such that the number of rows in the merged matrix is nrow(mat1) + nrow(mat2) and the number of columns in the merged matrix is length(unique(c(colnames(mat1), colnames(mat2)))).

Gavin,

Here is a possible solution, though not fully tested. It uses the row.names for the two matrices as part of the 'by' matching process. This is noted in the Details section in ?merge.

So for y and y2:

res <- merge(y, y2, by = c("row.names", intersect(colnames(y), colnames(y2))), all = TRUE)

# Note that the row names are now the first col
> res
  Row.names 1 2 3
1         a 0 1 1
2         b 1 0 0
3         c 0 4 4
4         x 1 0 0
5         z 0 4 4

# Subset res, leaving out the first col
mat <- res[, -1]

# Set the rownames from res
rownames(mat) <- res[, 1]

> mat
  1 2 3
a 0 1 1
b 1 0 0
c 0 4 4
x 1 0 0
z 0 4 4
Re: [R] merging
Ack...hit the wrong button. Sorry. Must be the long weekend...yeah, that's my story and I'm sticking to it... ;-)

Here is the solution for y and y3:

res2 <- merge(y, y3, by = c("row.names", intersect(colnames(y), colnames(y3))), all = TRUE)

> res2
  Row.names 1 2 3  4
1         a 0 1 1 NA
2         b 1 0 0 NA
3         c 0 4 4 NA
4         d 0 1 1  1
5         e 0 0 0  4
6         f 4 5 6  7

mat2 <- res2[, -1]
rownames(mat2) <- res2[, 1]

> mat2
  1 2 3  4
a 0 1 1 NA
b 1 0 0 NA
c 0 4 4 NA
d 0 1 1  1
e 0 0 0  4
f 4 5 6  7

HTH,
Marc Schwartz
Re: [R] Faster way to zero-pad a data frame...?
Try this:

Lines <- "time,events
0,1
1,30
5,14
10,4"

library(zoo)
data1 <- read.zoo(textConnection(Lines), header = TRUE, sep = ",")
data2 <- as.ts(data1)
data2[is.na(data2)] <- 0 # omit this line if NAs in the extra positions are ok
Re: [R] Install R problem
Pramod Anugu [EMAIL PROTECTED] writes:
> I have copied R-2.3.0.tar.gz and uncompressed it; the directory R-2.3.0 is created.
> ./configure
> make
> Typed R in the directory /root/downloads/R-2.3.0/bin/exec
> -bash-2.05b# pwd
> /root/downloads/R-2.3.0/bin/exec
> -bash-2.05b# R
> Fatal error: R home directory is not defined

Try:

.../R-2.3.0/bin/R

Or consider make install.

+ seth
Re: [R] position of number at risk in survplot() graphs
Osman Al-Radi wrote:
> Dear R-help,
>
> How can one get survplot() to place the number at risk just below the survival curve, as opposed to the default, which is just above the x-axis? I tried the code below, but the result is not satisfactory: some numbers are repeated several times at different y coordinates, and the position of the n.risk numbers corresponds to the x-axis tick marks, not to the censoring times on the survival curve.
>
> n <- 20
> set.seed(731)
> cens <- 15*runif(n)
> h <- .02*exp(2)
> dt <- -log(runif(n))/h
> label(dt) <- 'Follow-up Time'
> e <- ifelse(dt <= cens,1,0)
> dt <- pmin(dt, cens)
> units(dt) <- "Year"
> S <- Surv(dt,e)
> km <- survfit(S~1)
> survplot(km,n.risk=T,conf='none', y.n.risk=unique(summary(km)$surv))
>
> Any suggestion on addressing this problem would be appreciated. Also, is there a way to add a tick mark to the survival curve at times of censoring, similar to the mark.time=T argument in plot.survfit()?
>
> Thanks
> Osman

Osman,

y.n.risk has to be a scalar and gives the y-coordinate of the bottom line of number at risk. I take it that you want the numbers not all at the same height. This will require a customization of survplot or fetching information from the km object and using text( ).

Frank

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
Re: [R] Automate concatenation?
On Tue, 30 May 2006, Robert Lundqvist wrote:
> I have this typical problem of joining a number of vectors with similar names - a1, a2,..., a10 - which should be concatenated into one. Using c(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10) naturally works, but I would like to do it with less manual input. My attempts to use paste() give a vector of the vector names, see below. The question is how to do the concatenation? Any suggestions?
>
> paste("a",1:10,sep="")

a1 <- c(5, 4)
a2 <- 2
a3 <- 6:9
cmd <- sprintf("c(%s)", paste("a", 1:3, sep = "", collapse = ", "))
eval(parse(text = cmd))

--
SIGSIG -- signature too long (core dumped)
Re: [R] Faster way to zero-pad a data frame...?
Why not something simple like:

# Toy example:
data1 <- data.frame(time=c(0,1,5,10),events=c(1,30,14,4))
data2 <- rep(0,11)
# Or more generally data2 <- rep(0,1+max(data1$time))
# You don't need a for loop! Use the indexing capabilities of R!
data2[data1$time+1] <- data1$events # The ``+1'' is to allow for 0-origin.
data2 <- ts(data2,start=0)

???

cheers,

Rolf Turner
[EMAIL PROTECTED]
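To tie the indexing approach back to the stated goal (a periodogram via the FFT), here is a rough sketch under a few assumptions: one row per timestamp in data1 (duplicates would need summing first), a 1-second sampling interval, and a plain unsmoothed periodogram. The names padded, pgram and freq are illustrative only; stats::spec.pgram() would be the more complete tool.

```r
# Toy data from the original post
data1 <- data.frame(time = c(0, 1, 5, 10), events = c(1, 30, 14, 4))

# Zero-pad in one vectorised assignment; +1 because R indexes from 1
padded <- numeric(max(data1$time) + 1)
padded[data1$time + 1] <- data1$events

# Raw periodogram: squared modulus of the FFT, scaled by the series length
n     <- length(padded)
pgram <- Mod(fft(padded))^2 / n
freq  <- (0:(n - 1)) / n          # cycles per second, assuming delta-T = 1 s
```

Because the padding is a single indexed assignment rather than a row-by-row loop over a data frame, it should scale comfortably to an 86,400-element log.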
[R] sib TDT transmission/disequilibrium test
Does anyone know if the sib TDT has been implemented in R?

1. Spielman, R.S., and Ewens, W.J. (1998) A sibship test for linkage in the presence of association: the sib transmission/disequilibrium test. Am J Hum Genet 62, 450-458

--
Farrel Buchinsky, MD
Pediatric Otolaryngologist
Allegheny General Hospital
Pittsburgh, PA
Re: [R] Automate concatenation?
Well, I like do.call():

puddy.tat <- do.call(c,lapply(paste("a",1:10,sep=""),get))

cheers,

Rolf Turner
[EMAIL PROTECTED]
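Another variation on the same idea, offered only as a sketch: mget() fetches several objects by name in one call and returns them as a list, which unlist() then flattens. It assumes the vectors live in the environment the code is run from; the name joined is illustrative.

```r
a1 <- c(5, 4)
a2 <- 2
a3 <- 6:9

# mget() looks up all the named objects at once and returns a list;
# unlist() concatenates the list elements into a single vector.
joined <- unlist(mget(paste0("a", 1:3)), use.names = FALSE)
```

Like the do.call() version, this avoids building and parsing a command string.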
Re: [R] position of number at risk in survplot() graphs
Prof. Harrell,

Thank you for your response. Plotting the n.risk on the curve is probably only useful for small data sets with a few censored events. The present default method is more general and has more applications. For my purpose I figured out a way of editing the postscript file in WinEdt.

Regards,
Osman

--
Osman O. Al-Radi, MD, MSc, FRCSC
Fellow, Cardiovascular Surgery
The Hospital for Sick Children
University of Toronto, Canada
[R] changes in RSiteSearch() and http://finzi.psych.upenn.edu/search.html
This is about my searchable archive. The function RSiteSearch() now searches this archive. I'm considering the following changes. If you have comments, please write me. Try to avoid cc'ing the list.

The Rhelp02a directory, which now contains all list mail from 2002 (about 100 MB), is getting larger and larger. This probably cannot go on forever, and performance might even improve if it got smaller now. I've considered two changes.

One is to start a new archive. This would break RSiteSearch. Even if that were fixed, the fix would spread very slowly.

The second solution, which I plan to implement unless I hear a better one, is to make this major archive include a maximum four-year window, so that it would now start in 2003 rather than 2002, then (next year) would start in 2004, and so on. The main disadvantage is that references to current messages in my archive would be lost, because the message numbers would change. (I will try to save the old archive.) There aren't very many of these (244, including replies, in the archive itself). Although the archive would start in 2003, it would still be called Rhelp02a, so that RSiteSearch(), etc., still work.

Maybe someone experienced with hypermail or namazu will tell me that 100 MB is nothing and I should wait until I get a GB before getting nervous.

I also plan to change the default number of items per page from 20 to 100. This will not change in RSiteSearch() until the next version, but this is not so big a deal.

Jon
--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
[R] How to choose columns in data.frame by parts of columns' names?
Dear all,

I have a data.frame which has names as follows:

[1] XG1 YG1 XEST YEST
[2] XNOEMP1 XNOEMP2 YNOEMP1 YNOEMP2
[3] XBUS10 XBUS10A XBUS10B XBUS10C
[4] YBUS10 YBUS10A YBUS10B YBUS10C
[5] XOWNBUS XSELFEST YOWNBUS YSELFEST

Those columns have names beginning with X or Y. Each X is paired with a Y, e.g. XG1 and YG1, but they are not in the order X Y X Y ...

I want to combine X* and Y* like this:

data.new[,"G1"] <- (data.old[,"XG1"] + endata.use[,"YG1"])/2

How can I choose columns by parts of their names? For example, I can pick out XG1 and YG1 because they have the common part G1.

Thank you.
Wei-Wei
Re: [R] TsayData
I don't know how to get the TsayData in fSeries. However, the data for the first edition of Tsay (2002) Analysis of Financial Time Series (Wiley) can be downloaded from http://www.gsb.uchicago.edu/fac/ruey.tsay/teaching/fts/. For the second edition, try http://www.gsb.uchicago.edu/fac/ruey.tsay/teaching/fts2/.

If you'd like further help, you might include a brief, self-contained example, so someone else can try what you tried and get (presumably) the same response, as suggested in the posting guide (www.R-project.org/posting-guide.html). Posts more consistent with that style often receive quicker and more helpful answers.

Hope this helps.
Spencer Graves

SUMANTA BASAK wrote:
> Hi,
>
> I'm trying to work with TsayData in the fSeries package. How can I fetch any time series data from this package? Please advise.
>
> Thanks,
> Sumanta Basak.
Re: [R] How to choose columns in data.frame by parts of columns' names?
On 5/30/06, Guo Wei-Wei [EMAIL PROTECTED] wrote:
> How to choose columns by parts of names? For example, I can pick out XG1 and YG1 because they have the common part G1.

This gives all columns whose column name contains G1:

data.old[, regexpr("G1", colnames(data.old)) > 0]
Re: [R] How to choose columns in data.frame by parts of columns' names?
Wei-wei,

> How to choose columns by parts of names? For example, I can pick out XG1 and YG1 because they have the common part G1.

Not entirely sure what you mean, but one approach might be to re-order the columns so that each X sits next to its Y.

> yourNames
 [1] "XG1"      "YG1"      "XEST"     "YEST"     "XNOEMP1"  "XNOEMP2"
 [7] "YNOEMP1"  "YNOEMP2"  "XBUS10"   "XBUS10A"  "XBUS10B"  "XBUS10C"
[13] "YBUS10"   "YBUS10A"  "YBUS10B"  "YBUS10C"  "XOWNBUS"  "XSELFEST"
[19] "YOWNBUS"  "YSELFEST"
> yourNames[order(substring(yourNames,2), substring(yourNames, 1,1))]
 [1] "XBUS10"   "YBUS10"   "XBUS10A"  "YBUS10A"  "XBUS10B"  "YBUS10B"
 [7] "XBUS10C"  "YBUS10C"  "XEST"     "YEST"     "XG1"      "YG1"
[13] "XNOEMP1"  "YNOEMP1"  "XNOEMP2"  "YNOEMP2"  "XOWNBUS"  "YOWNBUS"
[19] "XSELFEST" "YSELFEST"

gives an idea of what I mean ...

Peter Alspach
Re: [R] na.pass
Check out the source code to na.pass. It just returns its first argument unchanged:

> na.pass
function (object, ...)
object
<environment: namespace:stats>

On 5/30/06, Bahamonde Natalia [EMAIL PROTECTED] wrote:
> Hello...
>
> What does na.pass do?
>
> x=c(2.4, 2.4, 1.9, 2.5, 2.1)
> xNA=replace(x, 3, NA)
> p=c(acf(x, type=c("covariance"), plot=FALSE)$acf)
> pNA=c(acf(xNA, type=c("covariance"), na.action=na.pass, plot=FALSE)$acf)
> p
> [1]  0.05040 -0.03112  0.00816  0.00224 -0.00448
> pNA
> [1]  0.02250 -0.01167  0.00250 -0.00100 -0.00250
>
> The manual says na.pass returns the object unchanged...
>
> Thanks,
> Natalia
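A short sketch that may help reconcile "returns the object unchanged" with the differing p and pNA above. It reuses the vectors from the question; the other object names are illustrative. The point is that na.pass() leaves the NA in place, and it is acf() itself that then works only from the non-missing values, whereas the default na.action (na.fail) stops with an error.

```r
x   <- c(2.4, 2.4, 1.9, 2.5, 2.1)
xNA <- replace(x, 3, NA)

# na.pass() hands its argument back untouched, NA included
same_object <- identical(na.pass(xNA), xNA)

# With the default na.action (na.fail), acf() refuses the series outright
default_result <- tryCatch(
  acf(xNA, type = "covariance", plot = FALSE),
  error = function(e) "error"
)

# With na.pass the NA reaches acf(), which uses the observed values only
pNA  <- c(acf(xNA, type = "covariance", na.action = na.pass, plot = FALSE)$acf)
obs  <- xNA[!is.na(xNA)]
lag0 <- mean((obs - mean(obs))^2)   # variance of the observed values, divisor n
```

Here lag0 matches the first element of pNA (0.0225 in the output quoted above), which is why pNA is not the same as p computed from the complete series.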
Re: [R] How to choose columns in data.frame by parts of columns' names?
Thank you. I made a mistake in my previous email. What I meant is:

data.new[,"G1"] <- (data.old[,"XG1"] + data.old[,"YG1"])/2

data.old[, regexpr("G1", colnames(data.old)) > 0] is a nice way, but there are about 100 X*s and Y*s. Can I do some comparison across all those column names and get the columns with matching parts?
Re: [R] How to choose columns in data.frame by parts of columns' names?
Peter,

Thank you. I made a mistake in my previous email. What I meant is:

data.new[,"G1"] <- (data.old[,"XG1"] + data.old[,"YG1"])/2

Does your way affect the data, or only the column names? I tried it on my data and got a list of numbers. Can I rearrange the order of the columns of a data.frame your way?
Re: [R] How to choose columns in data.frame by parts of columns' names?
This is not restricted to single matches:

  > colnames(iris)
  [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
  > regexpr("Sepal", colnames(iris)) > 0
  [1]  TRUE  TRUE FALSE FALSE FALSE

On 5/30/06, Guo Wei-Wei wrote:
> Thank you. I made a mistake in my previous email. What I mean is:
>
> data.new[,"G1"] <- (data.old[,"XG1"] + data.old[,"YG1"])/2
>
> data.old[, regexpr("G1", colnames(data.old)) > 0]
>
> is a nice way, but there are about 100 X*s and Y*s. Can I do some comparison on all those column names and get the columns with similar parts?
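The logical vector from the iris example can be used directly as a column index. The sketch below uses grepl(), which is not mentioned in the thread (it is a later addition to base R) and returns the same TRUE/FALSE vector as regexpr(...) > 0. The name "XG10" is a made-up example to show why anchoring the pattern may matter when one name is contained in another.

```r
# Select every column whose name contains "Sepal".
sepal.cols <- grepl("Sepal", colnames(iris))

# drop = FALSE keeps a data.frame even when only one column matches.
sepal.data <- iris[, sepal.cols, drop = FALSE]
colnames(sepal.data)

# An unanchored "G1" would also match the hypothetical name "XG10";
# anchoring with ^ and $ restricts the match to the X/Y pair itself.
nms <- c("XG1", "YG1", "XG10")
exact <- grepl("^[XY]G1$", nms)
exact
```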
Re: [R] How to choose columns in data.frame by parts of columns' names?
Gabor and Peter,

Thank you. Both of you gave me excellent solutions. I have a further problem: how can I use the common parts of the column names as the column names in a new data.frame? For example, I combine the data of XG1 and YG1 in data.old and get a new column in data.new named G1. Can it be done automatically?

data.new[,"G1"] <- (data.old[,"XG1"] + data.old[,"YG1"])/2

2006/5/31, Gabor Grothendieck:
> This is not restricted to single matches:
>
>   > colnames(iris)
>   [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
>   > regexpr("Sepal", colnames(iris)) > 0
>   [1]  TRUE  TRUE FALSE FALSE FALSE
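One possible way to automate this is to loop over the shared name parts and build each new column from its X and Y partners. This is a sketch under stated assumptions, not a tested solution for the real data: data.old here is a small made-up example, and the code assumes that every column name starts with X or Y and that each X column has a matching Y column.

```r
# Hypothetical stand-in for data.old; the column order is deliberately mixed.
data.old <- data.frame(XG1 = c(1, 2), YEST = c(10, 20),
                       YG1 = c(3, 4), XEST = c(30, 40))

# The shared part of each pair: the column name without its X or Y prefix.
stems <- unique(sub("^[XY]", "", colnames(data.old)))

# Start with an empty data.frame that has the same rows as data.old.
data.new <- data.frame(row.names = seq_len(nrow(data.old)))

# For each stem, average the X and Y columns and store the result
# under the stem itself (assumes both columns exist for every stem).
for (s in stems) {
  data.new[, s] <- (data.old[, paste0("X", s)] +
                    data.old[, paste0("Y", s)]) / 2
}

colnames(data.new)
```

Because the columns are looked up by name, the order of the X and Y columns in data.old does not matter.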