Re: [R] getting numeric [0..6] day of week from POSIXct?

2014-07-07 Thread Miguel Manese
Hi John,

The package lubridate is the easiest way to deal with dates.

library(lubridate)
frame$groupByWeekNumber <- wday(frame$dt) - 1   # wday() gives Sun=1..Sat=7, so this is Sun=0..Sat=6
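
If you would rather avoid an extra package, base R can do the same. A minimal sketch, assuming the timestamps are POSIXct values in UTC (the sample data below is made up for illustration):

```r
# Hypothetical sample data: one timestamp per day, starting on a Sunday (UTC)
dt <- as.POSIXct("2014-07-06 12:00:00", tz = "UTC") + (0:7) * 86400

# %w gives the weekday as a number, 0 (Sunday) through 6 (Saturday)
dow <- as.integer(format(dt, "%w", tz = "UTC"))

# Date of the Sunday that starts each week, usable directly as a grouping key
week_start <- as.Date(dt, tz = "UTC") - dow
```

Grouping on week_start gives Sunday..Saturday weeks without going through the ISO week number. If a Sunday-based week number within the year is enough, format(dt, "%U") is another option.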






On Mon, Jul 7, 2014 at 11:54 PM, John McKown john.archie.mck...@gmail.com
wrote:

 I have a column, dt, in a data.frame. It is a list of POSIXct objects.
 I could use strftime(frame$dt, "%a") to get the day of week as
 [sun..sat]. But I need the numeric value in the range of [0..6]. I
 can't see a function to do this. I can get it by converting the
 POSIXct objects to POSIXlt objects, then extracting the $wday. I don't
 know why, but that just doesn't feel right to me. What I am actually
 trying to do is group my data by Gregorian week (Sunday..Saturday). To
 group the data, I am getting the ISO 8601 year and week number using
 strftime(dt) with the format of "%G-%V". But the ISO year-week numbering
 starts on Monday, not Sunday. So what I do is:

 dt <- as.POSIXlt(frame$dt);
 dt <- dt - dt$wday*86400; # 86400 is seconds in a day
 frame$groupByWeekNumber <- strftime(dt, "%G-%V");

 is there a better way? I have tried my best to find a simpler way.

 --
 There is nothing more pleasant than traveling and meeting new people!
 Genghis Khan

 Maranatha! 
 John McKown

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] if, apply, ifelse

2013-11-28 Thread Miguel Manese
Hi Andrea,

A cleaner alternative to Jim's suggestion is something like

a.df <- as.data.frame(a)

# apply(..., 1, ...) works row by row, giving one TRUE/FALSE per row
group1 <- (a.df$col1 == 1) & apply(a.df[, c("col2", "col3", "col4")], 1,
function(x) any(x == 1 | is.na(x)))

group2 <- (a.df$col1 == 1) & apply(a.df[, c("col2", "col3", "col4")], 1,
function(x) all(x == 0 | is.na(x)))

group3 <- (a.df$col1 != 1)
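
To see the idea on concrete data, here is a minimal sketch on a made-up 3-row matrix. It uses rowSums() instead of apply(), and it assumes NAs in col2..col4 count as "not 1" (the convention Jim used for group 1); the data and the label column are hypothetical:

```r
# Hypothetical 0/1 data with missing values, filled column by column
a <- matrix(c(1, 1, 0,     # col1
              1, 0, 1,     # col2
              NA, 0, 0,    # col3
              0, NA, 1),   # col4
            nrow = 3,
            dimnames = list(NULL, c("col1", "col2", "col3", "col4")))
a.df <- as.data.frame(a)

others <- a.df[, c("col2", "col3", "col4")]

# TRUE when at least one of col2..col4 is 1; NAs are ignored (treated as not 1)
any_one <- rowSums(others == 1, na.rm = TRUE) > 0

group1 <- a.df$col1 == 1 & any_one
group2 <- a.df$col1 == 1 & !any_one
group3 <- a.df$col1 != 1

# Collapse the three flags into one label per row
a.df$group <- ifelse(group1, "group1", ifelse(group2, "group2", "group3"))
```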

- Jon



On Thu, Nov 28, 2013 at 5:10 PM, Jim Lemon j...@bitwrit.com.au wrote:

 On 11/28/2013 04:33 AM, Andrea Lamont wrote:

 Hello:

 This seems like an obvious question, but I am having trouble answering it.
 I am new to R, so I apologize if it's too simple to be posting. I have
 searched for solutions to no avail.

 I have data that I am trying to set up for further analysis (training
 data). What I need is 12 groups based on patterns of 4 variables. The
 complication comes in when missing data is present. Let  me describe with
 an example - focusing on just 3 of the 12 groups:
 ...

 Any ideas on how to approach this efficiently?

  Hi Andrea,
 I would first convert the matrix a to a data frame:

 a1<-as.data.frame(a)

 Then I would start adding columns:

 # group 1 is a 1 (logical TRUE) in col1 and at least one other 1
 # here NAs are converted to zeros
 a1$group1<-a1$col1 & (ifelse(is.na(a1$col2),0,a1$col2) |
  ifelse(is.na(a1$col3),0,a1$col3) |
  ifelse(is.na(a1$col4),0,a1$col4))
 # group 2 is a 1 in col1 and no other 1s
 # here NAs are converted to 1s
 a1$group2<-a1$col1 & !(ifelse(is.na(a1$col2),1,a1$col2) |
  ifelse(is.na(a1$col3),1,a1$col3) |
  ifelse(is.na(a1$col4),1,a1$col4))
 # here NAs are converted to 1s
 a1$group3<-!ifelse(is.na(a1$col1),1,a1$col1)

 and so on. It is clunky, but then you've got a clunky problem.

 Jim




Re: [R] compute p/t value from pearson r and n

2013-02-24 Thread Miguel Manese
Hi Martin,

See ?cor.test

example(cor.test)
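
cor.test() needs the raw data. If you only have r and n, the same t statistic can be computed by hand. A minimal sketch, assuming a two-sided test of zero correlation; the values of r and n are made up:

```r
r <- 0.5   # hypothetical Pearson correlation
n <- 30    # hypothetical sample size

# t statistic with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)
```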

Regards,
- Jon

On Mon, Feb 25, 2013 at 5:06 AM, Martin Batholdy
batho...@googlemail.com wrote:
 Hi,

 is there a predefined function that computes the p- or t-value
 based on a correlation coefficient and its sample size?


 thanks!



Re: [R] R script .bat file from Python

2013-02-20 Thread Miguel Manese
Hi Fabio,

I cannot reproduce it but this is probably some env var not set, or
some problem with the path to your R installation having whitespace in
it.

See ?.libPaths; if it is empty, you might want to hard-code R_HOME somewhere.
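
One way to check this is to print what R sees when it is launched from Python and compare it with a double-click run. A small diagnostic sketch that could go at the top of test.R (the variable names are only for illustration):

```r
# Where R will look for installed packages
lib_paths <- .libPaths()

# The R installation actually being used
r_home <- R.home()

# User library setting, empty if the environment variable is not set
r_libs_user <- Sys.getenv("R_LIBS_USER")

cat("Library paths:\n", paste(" ", lib_paths, collapse = "\n"), "\n")
cat("R home:", r_home, "\n")
cat("R_LIBS_USER:", r_libs_user, "\n")
```

If the library paths differ between the two runs, the packages are probably installed in a user library that the Python-launched process does not pick up.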

Regards,

On Thu, Feb 14, 2013 at 10:58 PM, Fabio Veronesi f.veron...@gmail.com wrote:
 Hello,
 I would like to start running a script from Python with the Rscript command.
 I tested several ways of invoking R from Python and I finally succeeded.

 The problem is that the script starts but R does not recognize the
 installed packages.
 I tried simplifying the matter and I created a script.bat with the classic
 commands: Rscript c:\test.R

 If I run it by double clicking on it it works perfectly. However, if I try
 to run it from Python, with a command such as os.system("script.bat"), it
 says that it cannot recognize any of the packages that it needs to load.

 Has anyone had  a similar problem?

 Many thanks,
 Fabio



Re: [R] Variables and greek letters in a plot title

2012-08-16 Thread Miguel Manese
Hi Dominik,

You can try

x <- 5
plot(rnorm(50), main=bquote(.(x) * mu * g/m^3 * " substance"))
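
A variant closer to the title in the question, as a sketch: in plotmath, ~ inserts a space and * juxtaposes without one, and .(x) is replaced by the value of x. The unit string "g/ml" and the name lbl are only for illustration:

```r
x <- 5

# .(x) is substituted by its value; mu is drawn as the Greek letter
lbl <- bquote(.(x) ~ mu * "g/ml" ~ "substance")

plot(rnorm(50), main = lbl)
```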

Regards,

- Jon

On Thu, Aug 16, 2012 at 3:37 PM, Dominik Refardt
dominik.refa...@gmail.com wrote:
 Hello

 This is a problem I encountered repeatedly and I found no answer that made
 me really happy. I hope it is not too trivial.

 I would like to give the concentration of a substance in a plot title:

 5 ug/ml substance

 the '5' would be a variable and the ug should be micrograms (with greek
 letter mu). It is the mu that causes the problems for me. I failed using
 various combinations of paste, expression and bquote. I would be very
 grateful if someone could help me (or point me to the solution, which I
 might have overlooked).

 Thank you very much

 Dominik

 --
 Dominik Refardt
 Institute of Integrative Biology, ETH Zürich



Re: [R] Printing a variable in a loop

2012-06-28 Thread Miguel Manese
Hi Kat,

On Thu, Jun 28, 2012 at 8:22 AM, kat_the_great k...@hotmail.com wrote:
 Dear R Users:

 I'm a STATA user converting to R, and I'd like to be to do the following.
 #Assign var_1 and var_2 a value
 10->var1
 20->var2

 #Now I'd like to print the values of var_1 and var_2 by looping through
 var_1 and var_2 in such a manner:

 while(y<3){
  print(var_y)
 y+1->y
 }

The nearest you can get is

y <- 1   # the counter has to be initialized first
while (y < 3) {
  print(.GlobalEnv[[paste("var", y, sep="")]])
  y <- y + 1
}

.GlobalEnv (a list, or strictly speaking an environment) contains all
variables at the top level of the REPL.
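
The same lookup can be written with get(), and a named list often removes the need for numbered variables altogether. A minimal sketch using the two values from the question (the list name vars is only for illustration):

```r
var1 <- 10
var2 <- 20

# Look each variable up by its constructed name
for (y in 1:2) {
  print(get(paste0("var", y)))
}

# Usually simpler: keep related values together in one named list
vars <- list(var1 = 10, var2 = 20)
for (name in names(vars)) {
  print(vars[[name]])
}
```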

But this is not how we do it in R.

1. If you want to display a variable, just type its name:

> var1

2. In Stata, you are working with one (tabular) data set at any time.
In R, you can work with multiple data sets (R construct: dataframes)
at the same time. For example using the builtin anscombe data set

Stata:

use anscombe
di x1 y1 x2 y2   // display all
di x1 y1 x2 y2 if _n <= 10


R:

# data(anscombe)   # optional
anscombe[, c("x1", "y1", "x2", "y2")]   # index by column
anscombe[1:10, c("x1", "y1", "x2", "y2")]  # index by row & column
head(anscombe[, c("x1","y1","x2","y2")], n=10)   # same as above




Regards,

Jon



Re: [R] If statement - copying a factor variable to a new variable

2012-06-28 Thread Miguel Manese
Hi James,

On Thu, Jun 28, 2012 at 12:33 AM, James Holland holland.ag...@gmail.com wrote:
 I need to look through a dataset with two factor variables, and depending
 on certain criteria, create a new variable containing the data from one of
 those other variables.

 The problem is, R keeps making my new variable an integer and saving the
 data as a 1 or 2 (I believe the levels of the factor).

 I've tried using as.factor in the IF output statement, but that doesn't
 seem to work.

 Any help is appreciated.



 #Sample code

 rm(list=ls())


 v1.factor <- c("S","S","D","D","D",NA)
 v2.factor <- c("D","D","S","S","S","S")

 test.data <- data.frame(v1.factor,v2.factor)

The vectorized way to do that would be

# v1.factor if present, else v2.factor
test.data$newvar <- ifelse(!is.na(v1.factor), v1.factor, v2.factor)

I suggest you work with the character levels first and then convert the
result into a factor, e.g. if v1.factor and v2.factor are already factors, do:

test.data$newvar <- as.factor(ifelse(!is.na(v1.factor),
as.character(v1.factor), as.character(v2.factor)))
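
To see why the conversion matters, here is a small sketch using the sample data from the question, with both inputs stored as factors (the names v1, v2, codes and newvar are only for illustration):

```r
v1 <- factor(c("S", "S", "D", "D", "D", NA))
v2 <- factor(c("D", "D", "S", "S", "S", "S"))

# ifelse() on factors drops the labels and returns the integer codes
codes <- ifelse(!is.na(v1), v1, v2)

# Going through character keeps the labels; convert back at the end
newvar <- factor(ifelse(!is.na(v1), as.character(v1), as.character(v2)))
```

Here codes holds 1s and 2s (the positions of the levels "D" and "S"), which matches the symptom described in the question, while newvar keeps the labels.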



Regards,

Jon



Re: [R] If statement - copying a factor variable to a new variable

2012-06-28 Thread Miguel Manese
On Thu, Jun 28, 2012 at 8:47 PM, James Holland holland.ag...@gmail.com wrote:
 With the multiple if statements I need to check for, I thought for statements
 with the if/else if conditional statement was better than nested ifelse
 functions.

for () gives you a lot of flexibility at the expense of being verbose
and slow; ifelse() is a bit limited, but you get conciseness (== more
elegant, IMO) and intuitively it should be faster since it is vectorized.

 For example

 #example expanded on

 rm(list=ls())

 v1.factor <- c("S","S","D","D","D",NA)
 v2.factor <- c("D","D","S","S","S","S")
 v3 <- c(1,0,0,0,0,NA)
 v4 <- c(0,0,1,1,0,0)

 test.data <- data.frame(v1.factor,v2.factor, v3, v4)

Technically since you will pick a value from one of v1.factor,
v2.factor, v3, v4 into a new vector, they should have the same type
(e.g. numeric, character, integer). So I'll assume

v3 <- c("S","D","D","D","D",NA)
v4 <- c("D","D","S","S","D","D")

If you prefer vectorizing, you can create an index

# btw, is.na(v1.factor) is already logical (boolean),
# is.na(v1.factor)==TRUE is redundant
cond1 <- is.na(v1.factor) & is.na(v2.factor)
cond2 <- is.na(v1.factor) & !is.na(v2.factor)
...

# cond1, cond2, etc should be mutually exclusive for this to work,
# i.e. for each row, one and only one of cond1, cond2, cond3 is TRUE
# not the case in your example, but you can make it so like
# cond2 <- !cond1 & (is.na(v1.factor) & !is.na(v2.factor))
# cond3 <- !cond1 & !cond2 & (...)
idx <- c(cond1, cond2, cond3, ...)

# to make it intuitive, you can convert idx into a matrix,
# i.e. test.data[idx] will return the elements of test.data corresponding
# to the elements of matrix idx which are TRUE
# this is actually optional, since R stores matrices in column-major order
idx <- matrix(idx, nrow=length(cond1))

cbind(NA, test.data)[idx]   # because your first condition should return NA!
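
A worked sketch of the same idea on made-up data. Indexing with a logical matrix returns the selected elements in column-major order, so this variant looks up one (row, column) pair per row instead, which keeps the result in row order. The vectors, the three conditions and the name candidates are only for illustration:

```r
v1 <- c("S", "S", "D", NA, NA)
v2 <- c("D", "D", "S", "S", NA)

# Mutually exclusive conditions: exactly one is TRUE for each row
cond1 <- is.na(v1) & is.na(v2)    # both missing -> result is NA
cond2 <- !is.na(v1)               # v1 present   -> take v1
cond3 <- is.na(v1) & !is.na(v2)   # only v2      -> take v2

idx <- cbind(cond1, cond2, cond3)

# Column position of the TRUE in each row (1, 2 or 3)
col <- max.col(idx * 1, ties.method = "first")

# One candidate column per condition, in the same order as idx
candidates <- cbind(NA, v1, v2)
picked <- candidates[cbind(seq_along(v1), col)]
```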

Or you can use sapply(), which in essence is similar to a for loop.


 I'm not familiar with ifelse, but is there a way to use it in a nested
 format that would be better than my for loop structure?  Or might I be
 better off finding a programming way of converting the new factor variables
 back to their factor values using the levels function?

I don't understand your second question, but when combining factors it
is better to deal with their labels (i.e. as.character(my.factor)),
then convert the vector of strings to a factor (i.e.
as.factor(my.result)). Internally a factor is a vector of
(non-negative) integers, and levels(v1.factor) shows the mapping of
these integers to its label. So you'll have a problem e.g. if the
two factor vectors map the integer 1 to different labels.

Regards,

Jon



Re: [R] Date formats

2012-06-19 Thread Miguel Manese
Hi Walt,

as.Date("01OCT1928", "%d%b%Y") works for me. See also ?strftime
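
The same call works on a whole column. A minimal sketch with made-up values; note that %b matches month abbreviations in the current locale, so this assumes an English locale:

```r
# Hypothetical date strings in the format from the question
closing_dates <- c("01OCT1928", "01NOV1928", "01DEC1928")

# %d = day, %b = abbreviated month name, %Y = 4-digit year
parsed <- as.Date(closing_dates, format = "%d%b%Y")
```

The result is a Date vector, which ggplot2 can put on a date axis directly.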

Regards,

Jon


On Tue, Jun 19, 2012 at 8:00 PM, Data Analytics Corp.
w...@dataanalyticscorp.com wrote:
 Hi,

 I imported an excel table (using read.csv) of Dow Jones monthly average
 closings where the first variable is a date as a character string such as
 01OCT1928.  How do I convert this to a date variable so I can plot monthly
 average closings against date using ggplot2?

 Thanks,

 Walt

 

 Walter R. Paczkowski, Ph.D.
 Data Analytics Corp.
 44 Hamilton Lane
 Plainsboro, NJ 08536
 
 (V) 609-936-8999
 (F) 609-936-3733
 w...@dataanalyticscorp.com
 www.dataanalyticscorp.com



Re: [R] triangular matrix

2012-06-18 Thread Miguel Manese
Hi Lucia,

On Mon, Jun 18, 2012 at 6:11 PM, lucinka lucia.bohus...@gmail.com wrote:
 Hello,

 I got this matrix of genetic distances between my samples. It is 85x85 but
 only the lower half (without diagonal) contains my distances. How can I compute the
 mean and standard deviation of these distances, please?


You can try something like

dist.vec <- dist.matrix[lower.tri(dist.matrix)]
mean(dist.vec)
sd(dist.vec)
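
A small self-contained sketch of the same steps, using a made-up 4x4 matrix in place of the 85x85 one:

```r
# Hypothetical distance matrix: only the lower triangle is filled in
dist.matrix <- matrix(NA_real_, nrow = 4, ncol = 4)
dist.matrix[lower.tri(dist.matrix)] <- c(1, 2, 3, 4, 5, 6)

# lower.tri() is TRUE strictly below the diagonal
dist.vec <- dist.matrix[lower.tri(dist.matrix)]

dist.mean <- mean(dist.vec)
dist.sd <- sd(dist.vec)
```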


Regards,

- Jon



Re: [R] interpolation to montly data

2012-06-16 Thread Miguel Manese
Hi Ken, Stef,

We can make your script more elegant like below:


On Sun, Jun 17, 2012 at 12:52 AM, Ken katak...@bu.edu wrote:

 stef salvez loggyedy at googlemail.com writes:

[snip]


 #load library
 library(plyr)

 # utility function
 mean.var = function(df, var){ mean(df[[var]], na.rm = T)};

 # create example data
 dat - data.frame(country = c(rep(1,8), rep(2, 8)),
                 date = c("23/11/08","28/12/08","25/01/09","22/02/09",
                          "29/03/09","26/04/09","24/05/09","28/06/09",
                          "26/10/08","23/11/08","21/12/08","18/01/09",
                          "15/02/09","16/03/09","12/04/09","10/05/09"),
                 price = c(2,3,4,5,6,32,23,32,45,46,90,54,65,77,7,6))
 # add month column to df
 dat$month = substr(dat$date, 4,5)

dat <- transform(dat, date=as.Date(date, "%d/%m/%y"))
dat <- transform(dat, month=as.numeric(format(date, "%m")))


 #calculate average price by month across all countries and calculate
 #monthly frequency and put output in one data frame
 monthly.price = ddply(dat, .(month), mean.var, var = "price")
 monthly.price = cbind(monthly.price, month.freq =
 as.vector(table(dat$month)))
 names(monthly.price) = c("month", "average.price", "month.freq")

# by country & month
ddply(dat, .(country, month), function(x) c(avg.price=mean(x$price),
freq=nrow(x)))

# by country & year-month
library(zoo)
dat <- transform(dat, yearmon=as.yearmon(date))
ddply(dat, .(country, yearmon), function(x) c(avg.price=mean(x$price),
freq=nrow(x)))
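
If plyr and zoo are not available, a similar year-month summary can be built with base R only. A minimal sketch on a shortened, made-up version of the example data (the names avg, freq and monthly are only for illustration):

```r
dat <- data.frame(
  country = c(1, 1, 1, 2, 2, 2),
  date    = c("23/11/08", "28/12/08", "25/01/09",
              "26/10/08", "23/11/08", "21/12/08"),
  price   = c(2, 3, 4, 45, 46, 90)
)

# Parse the dates, then build a sortable "YYYY-MM" key
dat$date <- as.Date(dat$date, "%d/%m/%y")
dat$yearmon <- format(dat$date, "%Y-%m")

# Average price and number of observations per year-month
avg <- tapply(dat$price, dat$yearmon, mean)
freq <- table(dat$yearmon)

monthly <- data.frame(yearmon   = names(avg),
                      avg.price = as.vector(avg),
                      freq      = as.vector(freq))
```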

Regards,
- Jon



Re: [R] Replication of linear model/autoregressive model

2012-06-16 Thread Miguel Manese
Hi Al, Michael,

On Sat, Jun 16, 2012 at 11:01 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 On Fri, Jun 15, 2012 at 6:56 AM, Al Ehan aehan3...@gmail.com wrote:
 Hi,

 I would like to make a replication of 10 of a linear, first order
 Autoregressive function, with respect to the replication of its innovation,
 e. for example:

 #where e is a random variables of innovation (from GEV distribution-that
 explains the rgev)
 #by using the arima.sim model from TSA package, I try to produce Y
 replicates, with respect to every replicates of e,
 #means for e[,1], I want to have say Y[,1].

 The code:

 e=replicate(10,rgev(20,xi=0.2,mu= 931.1512,sigma= 168.2702 ))
 Y=replicate(10,ts(arima.sim(list(ar=0.775),n=20,innov=e,start.innov=e)))

 what I get is the same random variables for every replicates of Y.

 Well, what would you expect? You're passing the same values of e each
 time. What you probably want to do is to put the rgev call as the
 innov argument to arima.sim(). Take a look at the second example of
 ?arima.sim to see how it's done (change the rt to rgev and you're good
 to go).


More specifically, you'd do something like

Y <- replicate(10, ts(arima.sim(list(ar=0.775), n=20, rand.gen=rgev,
xi=0.2, mu=931.1512, sigma=168.2702)))
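
A runnable sketch of the same pattern that needs no extra package: it uses t-distributed innovations (as in the ?arima.sim example) as a stand-in for rgev, so the innovation distribution here is an assumption, not the one from the question:

```r
set.seed(1)

# Each call to arima.sim() draws fresh innovations through rand.gen,
# so every replicate gets its own random series
Y <- replicate(10, arima.sim(list(ar = 0.775), n = 20,
                             rand.gen = function(n, ...) rt(n, df = 5)))
```

Y is a 20 x 10 matrix with one simulated AR(1) series per column.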


- Jon
