Re: [R] getting numeric [0..6] day of week from POSIXct?
Hi John,

The package lubridate is the easiest way to deal with dates.

    library(lubridate)
    # wday() returns Sun=1 .. Sat=7, so subtract 1 to get 0..6
    frame$groupByWeekNumber <- wday(frame$dt) - 1

On Mon, Jul 7, 2014 at 11:54 PM, John McKown john.archie.mck...@gmail.com wrote:
> I have a column, dt, in a data.frame. It is a list of POSIXct objects. I could use strftime(frame$dt, "%a") to get the day of week as [sun..sat]. But I need the numeric value in the range of [0..6]. I can't see a function to do this. I can get it by converting the POSIXct objects to POSIXlt objects, then extracting the $wday. I don't know why, but that just doesn't feel right to me.
>
> What I am actually trying to do is group my data by Gregorian week (Sunday..Saturday). To group the data, I am getting the ISO 8601 year and week number using strftime(dt) with the format of "%G-%V". But the ISO year-week numbers start on Monday, not Sunday. So what I do is:
>
>     dt <- as.POSIXlt(frame$dt)
>     dt <- dt - dt$wday * 86400  # 86400 is seconds in a day
>     frame$groupByWeekNumber <- strftime(dt, "%G-%V")
>
> Is there a better way? I have tried my best to find a simpler way.
>
> --
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
>
> Maranatha!
> John McKown

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
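
For anyone who would rather stay in base R, the "%w" conversion of format() gives the weekday as 0..6 with Sunday as 0. The sketch below uses made-up timestamps in UTC (an assumption, not data from the thread) and also derives the date of the Sunday that starts each week, which can serve as a grouping key.

```r
# Hypothetical sample timestamps; 2014-07-06 was a Sunday.
dt <- as.POSIXct(c("2014-07-05 12:00:00",
                   "2014-07-06 12:00:00",
                   "2014-07-07 12:00:00"), tz = "UTC")

# "%w" is the weekday as a decimal number, 0 = Sunday .. 6 = Saturday
dow <- as.integer(format(dt, "%w"))

# Date of the Sunday that starts each (Sunday..Saturday) week
week_start <- as.Date(dt) - dow

print(data.frame(dt, dow, week_start))
```

Grouping on week_start avoids the ISO week shift entirely, at the cost of a Date key instead of a "year-week" string.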
Re: [R] if, apply, ifelse
Hi Andrea,

A cleaner alternative to Jim's suggestion is something like

    a.df <- as.data.frame(a)
    # apply() over rows (MARGIN = 1) so that each group vector has one entry per row
    group1 <- (a.df$col1 == 1) & apply(a.df[, c("col2", "col3", "col4")], 1, function(x) any(x == 1 | is.na(x)))
    group2 <- (a.df$col1 == 1) & apply(a.df[, c("col2", "col3", "col4")], 1, function(x) all(x == 0 | is.na(x)))
    group3 <- (a.df$col1 != 1)

- Jon

On Thu, Nov 28, 2013 at 5:10 PM, Jim Lemon j...@bitwrit.com.au wrote:
> On 11/28/2013 04:33 AM, Andrea Lamont wrote:
>> Hello:
>>
>> This seems like an obvious question, but I am having trouble answering it. I am new to R, so I apologize if it's too simple to be posting. I have searched for solutions to no avail.
>>
>> I have data that I am trying to set up for further analysis (training data). What I need is 12 groups based on patterns of 4 variables. The complication comes in when missing data is present. Let me describe with an example - focusing on just 3 of the 12 groups:
>> ...
>> Any ideas on how to approach this efficiently?
>
> Hi Andrea,
>
> I would first convert the matrix a to a data frame:
>
>     a1 <- as.data.frame(a)
>
> Then I would start adding columns:
>
>     # group 1 is a 1 (logical TRUE) in col1 and at least one other 1
>     # here NAs are converted to zeros
>     a1$group1 <- a1$col1 &
>       (ifelse(is.na(a1$col2), 0, a1$col2) |
>        ifelse(is.na(a1$col3), 0, a1$col3) |
>        ifelse(is.na(a1$col4), 0, a1$col4))
>     # group 2 is a 1 in col1 and no other 1s
>     # here NAs are converted to 1s
>     a1$group2 <- a1$col1 &
>       !(ifelse(is.na(a1$col2), 1, a1$col2) |
>         ifelse(is.na(a1$col3), 1, a1$col3) |
>         ifelse(is.na(a1$col4), 1, a1$col4))
>     # here NAs are converted to 1s
>     a1$group3 <- !ifelse(is.na(a1$col1), 1, a1$col1)
>
> and so on. It is clunky, but then you've got a clunky problem.
>
> Jim
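
To make the row-wise grouping concrete, here is a small self-contained sketch. The matrix a is made up, and the rule that an NA in col2..col4 counts as "not a 1" is an assumption, since the original post's exact NA rules are elided above. It uses rowSums() rather than the apply() calls in the thread, which is a swapped-in technique and not what either poster wrote.

```r
# Hypothetical 0/1 matrix with missing values; column names follow the thread.
a <- matrix(c(1, 1,  0,  NA,
              1, 0,  0,  0,
              0, 1,  NA, 0,
              1, NA, 0,  0),
            ncol = 4, byrow = TRUE,
            dimnames = list(NULL, c("col1", "col2", "col3", "col4")))
a.df <- as.data.frame(a)
others <- a.df[, c("col2", "col3", "col4")]

# Assumption: NA in col2..col4 is treated as "not a 1"
any_one <- rowSums(others == 1, na.rm = TRUE) > 0

group1 <- a.df$col1 == 1 & any_one    # 1 in col1 and at least one other 1
group2 <- a.df$col1 == 1 & !any_one   # 1 in col1 and no other 1s
group3 <- a.df$col1 != 1              # no 1 in col1

print(cbind(a.df, group1, group2, group3))
```

If NA should instead count as a 1 for some groups (as in Jim's group 2), the na.rm = TRUE step is the place to change.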
Re: [R] compute p/t value from pearson r and n
Hi Martin,

See

    ?cor.test
    example(cor.test)

Regards,
- Jon

On Mon, Feb 25, 2013 at 5:06 AM, Martin Batholdy batho...@googlemail.com wrote:
> Hi,
>
> is there a predefined function that computes the p- or t-value based on a correlation coefficient and its sample size?
>
> thanks!
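
Note that cor.test() works from the raw data. If only r and n are at hand, the usual test for a Pearson correlation uses t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The helper below is a sketch with a hypothetical name (it is not a base R function), checked against cor.test() on the built-in mtcars data.

```r
# Hypothetical helper: t statistic and two-sided p-value from r and n
cor_to_t_p <- function(r, n) {
  dof <- n - 2
  t_stat <- r * sqrt(dof / (1 - r^2))
  p_value <- 2 * pt(-abs(t_stat), dof)   # two-sided
  c(t = t_stat, df = dof, p = p_value)
}

# Compare with cor.test() on a built-in data set
r   <- cor(mtcars$mpg, mtcars$wt)
res <- cor_to_t_p(r, nrow(mtcars))
ct  <- cor.test(mtcars$mpg, mtcars$wt)

print(res)
print(c(t = unname(ct$statistic), p = ct$p.value))
```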
Re: [R] R script .bat file from Python
Hi Fabio,

I cannot reproduce it, but this is probably some environment variable not being set, or some problem with the path to your R installation having whitespace in it. See ?.libPaths; if it is empty you might want to hard-code R_HOME somewhere.

Regards,

On Thu, Feb 14, 2013 at 10:58 PM, Fabio Veronesi f.veron...@gmail.com wrote:
> Hello,
>
> I would like to start running a script from Python with the Rscript command. I tested several ways of invoking R from Python and I finally succeeded. The problem is that the script starts but R does not recognize the installed packages.
>
> I tried simplifying the matter and I created a script.bat with the classic commands:
>
>     Rscript c:\test.R
>
> If I run it by double clicking on it, it works perfectly. However, if I try to run it from Python, with a command such as os.system("script.bat"), it says that it cannot recognize any of the packages that it needs to load.
>
> Has anyone had a similar problem?
>
> Many thanks,
> Fabio
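
One way to test the environment theory is to have the script report what it sees, then compare the output of a double-click run with a run launched from Python. This is only a diagnostic sketch; which variable (if any) differs on a given machine is not known from the thread.

```r
# Diagnostic lines to put at the top of the script being run (c:\test.R in the thread)
cat("R home:      ", R.home(), "\n")
cat("Library paths:\n", paste0("  ", .libPaths(), "\n"), sep = "")
cat("R_LIBS:      ", Sys.getenv("R_LIBS"), "\n")
cat("R_LIBS_USER: ", Sys.getenv("R_LIBS_USER"), "\n")
cat("Working dir: ", getwd(), "\n")
```

If the library paths differ between the two runs, the missing directory is where the packages live.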
Re: [R] Variables and greek letters in a plot title
Hi Dominik,

You can try

    x <- 5
    plot(rnorm(50), main=bquote(.(x) * mu * g/m^3 * substance))

Regards,
- Jon

On Thu, Aug 16, 2012 at 3:37 PM, Dominik Refardt dominik.refa...@gmail.com wrote:
> Hello
>
> This is a problem I encountered repeatedly and I found no answer that made me really happy. I hope it is not too trivial.
>
> I would like to give the concentration of a substance in a plot title:
>
>     5 ug/ml substance
>
> The '5' would be a variable and the ug should be micrograms (with the Greek letter mu). It is the mu that causes the problems for me. I failed using various combinations of paste, expression and bquote.
>
> I would be very grateful if someone could help me (or point me to the solution, which I might have overlooked).
>
> Thank you very much
> Dominik
>
> --
> Dominik Refardt
> Institute of Integrative Biology, ETH Zürich
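
A variant closer to the "5 ug/ml substance" wording asked for, as a sketch: in plotmath, ~ inserts a space while * juxtaposes without one, and .(x) is replaced by the value of x. The value 5 and the random data are placeholders.

```r
x <- 5  # hypothetical concentration value

# .(x) substitutes the value of x; ~ adds a space; * joins without a space
ttl <- bquote(.(x) ~ mu * g/ml ~ "substance")

plot(rnorm(50), main = ttl)
```

Quoting "substance" keeps it as literal text instead of a plotmath symbol name.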
Re: [R] Printing a variable in a loop
Hi Kat,

On Thu, Jun 28, 2012 at 8:22 AM, kat_the_great k...@hotmail.com wrote:
> Dear R Users:
>
> I'm a STATA user converting to R, and I'd like to be able to do the following.
>
>     #Assign var_1 and var_2 a value
>     10->var1
>     20->var2
>
> #Now I'd like to print the values of var_1 and var_2 by looping through var_1 and var_2 in such a manner:
>
>     while(y<3){
>       print(var_y)
>       y+1->y
>     }

The nearest you can get is

    y <- 1
    while (y < 3) {
      print(.GlobalEnv[[paste("var", y, sep="")]])
      y <- y + 1
    }

.GlobalEnv (a list, or strictly speaking an environment) contains all variables at the top level of the REPL.

But this is not how we do it in R.

1. If you want to display the variable, just type it:

    var1

2. In Stata, you are working with one (tabular) data set at any time. In R, you can work with multiple data sets (R construct: data frames) at the same time.

For example, using the built-in anscombe data set:

Stata:

    use anscombe
    di x1 y1 x2 y2  // display all
    di x1 y1 x2 y2 if _n <= 10

R:

    # data(anscombe)  # optional
    anscombe[, c("x1", "y1", "x2", "y2")]            # index by column
    anscombe[1:10, c("x1", "y1", "x2", "y2")]        # index by row & column
    head(anscombe[, c("x1", "y1", "x2", "y2")], n=10)  # same as above

Regards,
Jon
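
Two other common ways to get the same effect, sketched with the values from the question: get() looks a variable up by its name, and a named list keeps related values together so no name-pasting is needed. The list name vars is just an illustration.

```r
var1 <- 10
var2 <- 20

# Look up variables by constructed name
for (y in 1:2) {
  print(get(paste0("var", y)))
}

# Usually more idiomatic: keep the values in one named list and loop over it
vars <- list(var1 = 10, var2 = 20)
for (name in names(vars)) {
  cat(name, "=", vars[[name]], "\n")
}
```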
Re: [R] If statement - copying a factor variable to a new variable
Hi James,

On Thu, Jun 28, 2012 at 12:33 AM, James Holland holland.ag...@gmail.com wrote:
> I need to look through a dataset with two factor variables, and depending on certain criteria, create a new variable containing the data from one of those other variables.
>
> The problem is, R keeps making my new variable an integer and saving the data as a 1 or 2 (I believe the levels of the factor).
>
> I've tried using as.factor in the IF output statement, but that doesn't seem to work.
>
> Any help is appreciated.
>
>     #Sample code
>     rm(list=ls())
>     v1.factor <- c("S","S","D","D","D",NA)
>     v2.factor <- c("D","D","S","S","S","S")
>     test.data <- data.frame(v1.factor,v2.factor)

The vectorized way to do that would be

    # v1.factor if present, else v2.factor
    test.data$newvar <- ifelse(!is.na(v1.factor), v1.factor, v2.factor)

I suggest you work with the character levels first and then convert the result into a factor, e.g. if v1.factor and v2.factor are already factors, do:

    test.data$newvar <- as.factor(ifelse(!is.na(v1.factor), as.character(v1.factor), as.character(v2.factor)))

Regards,
Jon
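
A short demonstration of why the conversion to character matters, using the vectors from the question turned into factors. The names codes and newvar are illustrative only.

```r
v1.factor <- factor(c("S", "S", "D", "D", "D", NA))
v2.factor <- factor(c("D", "D", "S", "S", "S", "S"))

# ifelse() on factors drops the labels and returns the underlying integer codes
codes <- ifelse(!is.na(v1.factor), v1.factor, v2.factor)
print(codes)

# Going through the labels keeps the intended values
newvar <- factor(ifelse(!is.na(v1.factor),
                        as.character(v1.factor),
                        as.character(v2.factor)))
print(newvar)
```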
Re: [R] If statement - copying a factor variable to a new variable
On Thu, Jun 28, 2012 at 8:47 PM, James Holland holland.ag...@gmail.com wrote:
> With the multiple if statements I need to check for, I thought for statements with the if/else if conditional statement were better than nested ifelse functions.

for () gives you a lot of flexibility at the expense of being verbose and slow. ifelse() is a bit limited, but you get conciseness (== more elegant, IMO), and intuitively it should be faster since it is vectorized.

> For example
>
>     #example expanded on
>     rm(list=ls())
>     v1.factor <- c("S","S","D","D","D",NA)
>     v2.factor <- c("D","D","S","S","S","S")
>     v3 <- c(1,0,0,0,0,NA)
>     v4 <- c(0,0,1,1,0,0)
>     test.data <- data.frame(v1.factor,v2.factor, v3, v4)

Technically, since you will pick a value from one of v1.factor, v2.factor, v3, v4 into a new vector, they should have the same type (e.g. numeric, character, integer). So I'll assume

    v3 <- c("S","D","D","D","D",NA)
    v4 <- c("D","D","S","S","D","D")

If you prefer vectorizing, you can create an index

    # btw, is.na(v1.factor) is already logical (boolean),
    # is.na(v1.factor)==TRUE is redundant
    cond1 <- is.na(v1.factor) & is.na(v2.factor)
    cond2 <- is.na(v1.factor) & !is.na(v2.factor)
    ...
    # cond1, cond2, etc should be mutually exclusive for this to work,
    # i.e. for each row, one and only one of cond1, cond2, cond3 is TRUE
    # not the case in your example, but you can make it so like
    # cond2 <- !cond1 & (is.na(v1.factor) & !is.na(v2.factor))
    # cond3 <- !cond1 & !cond2 & (...)
    idx <- c(cond1, cond2, cond3, ...)
    # to make it intuitive, you can convert idx into a matrix
    # i.e. test.data[idx] will return elements of test.data corresponding to elements of
    # matrix idx which is TRUE
    # this is actually optional, R stores matrices in column-major order
    idx <- matrix(idx, nrow=length(cond1))
    cbind(NA, test.data)[idx]  # because your first condition should return NA!

Or you can use sapply(), which in essence is similar to a for loop.

> I'm not familiar with ifelse, but is there a way to use it in a nested format that would be better than my for loop structure?
>
> Or might I be better off finding a programming way of converting the new factor variables back to their factor values using the levels function?

I don't understand your second question, but when combining factors it is better to deal with their labels (i.e. as.character(my.factor)) and then convert the vector of strings to a factor (i.e. as.factor(my.result)). Internally a factor is a vector of (non-negative) integers, and levels(v1.factor) shows the mapping of these integers to their labels. So you'll have a problem, e.g., if the two factor vectors map the integer 1 to different labels.

Regards,
Jon
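
A runnable sketch of the index idea above, reduced to two made-up character vectors and three mutually exclusive conditions (all names and data here are assumptions). One detail to watch: indexing with a logical matrix returns values in column-major order, so the sketch transposes both matrices to get one result per row, in row order.

```r
# Hypothetical inputs
v1 <- c("S", "S", "D", NA, NA)
v2 <- c("D", "D", "S", "S", NA)

cond1 <- is.na(v1) & is.na(v2)     # both missing   -> result is NA
cond2 <- is.na(v1) & !is.na(v2)    # only v1 missing -> take v2
cond3 <- !cond1 & !cond2           # otherwise       -> take v1

candidates <- cbind(NA, v2, v1)    # one column of candidate values per condition
idx <- cbind(cond1, cond2, cond3)  # exactly one TRUE per row
stopifnot(all(rowSums(idx) == 1))

# Transpose so that the selected values come out in row order
newvar <- t(candidates)[t(idx)]
print(newvar)

# For this simple case a single ifelse() gives the same answer
print(ifelse(!is.na(v1), v1, v2))
```

With only a handful of conditions, nested ifelse() calls are usually easier to read; the matrix form pays off when there are many branches.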
Re: [R] Date formats
Hi Walt,

    as.Date("01OCT1928", "%d%b%Y")

works for me. See also ?strftime

Regards,
Jon

On Tue, Jun 19, 2012 at 8:00 PM, Data Analytics Corp. w...@dataanalyticscorp.com wrote:
> Hi,
>
> I imported an excel table (using read.csv) of Dow Jones monthly average closings where the first variable is a date as a character string such as 01OCT1928. How do I convert this to a date variable so I can plot monthly average closings against date using ggplot2?
>
> Thanks,
>
> Walt
>
> Walter R. Paczkowski, Ph.D.
> Data Analytics Corp.
> 44 Hamilton Lane
> Plainsboro, NJ 08536
> (V) 609-936-8999
> (F) 609-936-3733
> w...@dataanalyticscorp.com
> www.dataanalyticscorp.com
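
Applied to a whole column, the conversion might look like the sketch below. The data frame, its column names and the close values are placeholders (not real Dow Jones closings), and "%b" is matched against month abbreviations in the current locale, so this assumes an English or C locale.

```r
# Placeholder data frame; the close values are dummy numbers, not real closings
closings <- data.frame(date  = c("01OCT1928", "01NOV1928", "01DEC1928"),
                       close = c(1, 2, 3),
                       stringsAsFactors = FALSE)

# %d = day, %b = abbreviated month name (locale dependent), %Y = 4-digit year
closings$date <- as.Date(closings$date, "%d%b%Y")

str(closings)
plot(close ~ date, data = closings, type = "l")
```

Once the column is of class Date, ggplot2 will also treat it as a date axis.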
Re: [R] triangular matrix
Hi Lucia,

On Mon, Jun 18, 2012 at 6:11 PM, lucinka lucia.bohus...@gmail.com wrote:
> Hello,
>
> I got this matrix of genetic distances between my samples. It is 85x85, but only the lower half (without the diagonal) contains my distances. How can I compute the mean and standard deviation of these distances, please?

You can try something like

    dist.vec <- dist.matrix[lower.tri(dist.matrix)]
    mean(dist.vec)
    sd(dist.vec)

Regards,
- Jon
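
A tiny worked example of the same idea, with a made-up 4x4 matrix standing in for the 85x85 one. lower.tri() returns a logical matrix that is TRUE strictly below the diagonal, so the diagonal and the empty upper half are left out of the statistics.

```r
# Hypothetical 4x4 matrix: distances 1..6 below the diagonal, NA elsewhere
dist.matrix <- matrix(NA_real_, nrow = 4, ncol = 4)
dist.matrix[lower.tri(dist.matrix)] <- c(1, 2, 3, 4, 5, 6)
print(dist.matrix)

# Extract only the entries strictly below the diagonal
dist.vec <- dist.matrix[lower.tri(dist.matrix)]

mean(dist.vec)
sd(dist.vec)
```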
Re: [R] interpolation to montly data
Hi Ken, Stef,

We can make your script more elegant like below:

On Sun, Jun 17, 2012 at 12:52 AM, Ken katak...@bu.edu wrote:
> stef salvez loggyedy at googlemail.com writes:
> [snip]
>
>     #load library
>     library(plyr)
>     # utility function
>     mean.var = function(df, var){ mean(df[[var]], na.rm = T)};
>     # create example data
>     dat <- data.frame(country = c(rep(1,8), rep(2, 8)),
>                       date = c("23/11/08","28/12/08","25/01/09","22/02/09",
>                                "29/03/09","26/04/09","24/05/09","28/06/09",
>                                "26/10/08","23/11/08","21/12/08","18/01/09",
>                                "15/02/09","16/03/09","12/04/09","10/05/09"),
>                       price = c(2,3,4,5,6,32,23,32,45,46,90,54,65,77,7,6))
>     # add month column to df
>     dat$month = substr(dat$date, 4,5)

    dat <- transform(dat, date=as.Date(date, "%d/%m/%y"))
    dat <- transform(dat, month=as.numeric(format(date, "%m")))

>     #calculate average price by month across all countries and calculate monthly
>     #frequency and put output in one data frame
>     monthly.price = ddply(dat, .(month), mean.var, var = "price")
>     monthly.price = cbind(monthly.price, month.freq = as.vector(table(dat$month)))
>     names(monthly.price) = c("month", "average.price", "month.freq")

    # by country & month
    ddply(dat, .(country, month), function(x) c(avg.price=mean(x$price), freq=nrow(x)))

    # by country & year-month
    library(zoo)
    dat <- transform(dat, yearmon=as.yearmon(date))
    ddply(dat, .(country, yearmon), function(x) c(avg.price=mean(x$price), freq=nrow(x)))

Regards,
- Jon
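
If plyr and zoo are not available, much the same summary can be sketched in base R with aggregate(). This is an alternative technique, not what either poster used; it takes the first three rows per country from the example data above and averages across countries by year-month.

```r
# Subset of the thread's example data (first three observations per country)
dat <- data.frame(country = c(1, 1, 1, 2, 2, 2),
                  date    = c("23/11/08", "28/12/08", "25/01/09",
                              "26/10/08", "23/11/08", "21/12/08"),
                  price   = c(2, 3, 4, 45, 46, 90),
                  stringsAsFactors = FALSE)

dat$date    <- as.Date(dat$date, "%d/%m/%y")
dat$yearmon <- format(dat$date, "%Y-%m")   # e.g. "2008-11"

# Average price and number of observations per year-month, across countries
avg  <- aggregate(price ~ yearmon, data = dat, FUN = mean)
freq <- aggregate(price ~ yearmon, data = dat, FUN = length)
monthly <- data.frame(yearmon   = avg$yearmon,
                      avg.price = avg$price,
                      freq      = freq$price)
print(monthly)
```

Using price ~ country + yearmon in the formula gives the per-country breakdown instead.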
Re: [R] Replication of linear model/autoregressive model
Hi Al, Michael,

On Sat, Jun 16, 2012 at 11:01 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> On Fri, Jun 15, 2012 at 6:56 AM, Al Ehan aehan3...@gmail.com wrote:
>> Hi,
>>
>> I would like to make 10 replications of a linear, first-order autoregressive function, with respect to the replications of its innovation, e. For example:
>>
>> # where e is a random variable of innovations (from the GEV distribution, which explains the rgev)
>> # by using the arima.sim model from the TSA package, I try to produce Y replicates, with respect to every replicate of e,
>> # meaning for e[,1], I want to have say Y[,1].
>>
>> The code:
>>
>>     e=replicate(10,rgev(20,xi=0.2,mu= 931.1512,sigma= 168.2702 ))
>>     Y=replicate(10,ts(arima.sim(list(ar=0.775),n=20,innov=e,start.innov=e)))
>>
>> What I get is the same random variables for every replicate of Y.
>
> Well, what would you expect? You're passing the same values of e each time. What you probably want to do is to put the rgev call as the innov argument to arima.sim(). Take a look at the second example of ?arima.sim to see how it's done (change the rt to rgev and you're good to go).

More specifically, you'd do something like

    Y <- replicate(10, ts(arima.sim(list(ar=0.775), n=20, rand.gen=rgev, xi=0.2, mu=931.1512, sigma=168.2702)))

- Jon
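
The same pattern can be tried without a GEV package by swapping in a base R generator. The sketch below uses t-distributed innovations (as in the ?arima.sim example) purely as a stand-in for rgev; because rand.gen is called afresh inside every replicate() iteration, each column gets its own innovations.

```r
set.seed(1)  # only to make the sketch reproducible

# Stand-in innovation generator (assumption: t with 5 df instead of rgev)
innov_gen <- function(n, ...) rt(n, df = 5)

# 10 independent AR(1) series of length 20, one per column
Y <- replicate(10, arima.sim(list(ar = 0.775), n = 20, rand.gen = innov_gen))

dim(Y)
matplot(Y, type = "l", lty = 1, xlab = "time", ylab = "Y")
```

Replacing innov_gen with rgev and passing xi, mu and sigma through, as in the reply above, gives the GEV version.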