from:"Patrick Hausmann"

[R] Add a dim to an array

2012-06-03 Thread Patrick Hausmann


Dear list,

I'm trying to add a new dim to a multidimensional array. My array looks 
like this


a1 <- array(1:8, c(2, 2, 2))
dimnames(a1) <- list(A = c("A1", "A2"),
                     B = c("B1", "B2"),
                     D = c("D1", "D2"))

I would like to add a new dim 'group' with the value "low". Right now 
I'm using this, but I think there are better ways...


a2 <- as.data.frame(as.table(a1))
a2$group <- "low"
a2 <- xtabs(Freq ~ A + B + D + group, data = a2)
a2

Thanks for any help!
Patrick
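
One possible shortcut, sketched under the assumption that the new dimension has exactly one level: an array is a vector with 'dim' and 'dimnames' attributes, so a fourth dimension of length 1 can be appended directly. The name a3 is only illustrative, and the result is a plain array rather than the xtabs/table object produced above.

```r
a1 <- array(1:8, c(2, 2, 2),
            dimnames = list(A = c("A1", "A2"),
                            B = c("B1", "B2"),
                            D = c("D1", "D2")))

# Append a fourth dimension of length 1; the data vector itself is unchanged
a3 <- array(a1,
            dim = c(dim(a1), 1),
            dimnames = c(dimnames(a1), list(group = "low")))

dim(a3)
a3["A2", "B1", "D2", "low"]
```

If a table object is needed, as.table(a3) should convert it.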

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Create a new Vector based on two columns

2012-04-25 Thread Patrick Hausmann


Hello,

I am trying to get a new vector 'x1' based on the non-NA values in 
columns 'a' and 'b'. I found a way but I am sure this is not the best 
solution. So any ideas on how to optimize this would be great!


m <- factor(c("a1", "a1", "a2", "b1", "b2", "b3", "d1", "d1"),
            ordered = TRUE)

df <- data.frame(a = m, b = m)
df[1,1] <- NA
df[4,2] <- NA
df[2,2] <- NA
df[6,1] <- NA
df

# TRUE where a value is present
w <- !apply(df, 2, is.na)
# index of the first non-NA column in each row
v <- apply(w, 1, FUN=function(L) which(L == TRUE)[[1]])

g <- df$a   # initialise the result vector before filling it
for (i in 1:nrow(df) ) {
  g[i] <- df[i, v[i]]
}

df$x1 <- g

Thanks for any help
Patrick
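
A vectorised alternative might look like the sketch below. It assumes there are only the two columns 'a' and 'b' and that at least one of them is non-NA in every row; the intermediate name x1 is illustrative.

```r
m <- factor(c("a1", "a1", "a2", "b1", "b2", "b3", "d1", "d1"), ordered = TRUE)
df <- data.frame(a = m, b = m)
df[c(1, 6), "a"] <- NA
df[c(2, 4), "b"] <- NA

# Take 'a' where it is present, otherwise fall back to 'b'
x1 <- ifelse(is.na(df$a), as.character(df$b), as.character(df$a))

# Restore the ordered factor with the original levels
df$x1 <- factor(x1, levels = levels(m), ordered = TRUE)
df
```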


Re: [R] Calculate the difference using ave

2011-10-27 Thread Patrick Hausmann


Thanks Dimitris,

but I would like to bind the result on the dataframe, so the length 
should be equal to nrow(df1).


BTW, sorry for the example, it wasn't very clear, next try:

#

options(stringsAsFactors = FALSE)

set.seed(123)
df1 <- data.frame(id = rep(LETTERS[1:6], 3),
                  yr = rep(c(2009:2011), each=6),
                  water = sample(c(100:500), 18),
                  salt  = sample(c(10:40), 18))

CalcDiffPct <- function(xdf) {
  n <- length(unique(xdf[["id"]]))
  n.NA <- rep(NA, n)
  w <- seq_len(nrow(xdf) - n)
  diff_pct <- xdf$salt / c(n.NA, xdf$water[w]) * 100
  diff_pct
}

# The order is important
df1 <- df1[order(df1$yr, df1$id), ]

# This works, as long as each
# combination of yr / id exists
with(df1, table(id, yr))
df1$salt_pct <- CalcDiffPct(df1)
df1

# But if I drop any row the result will be wrong
# (or 'correct' as the function doesn't handle this case)
df2 <- df1
df2 <- df2[-15, ]
with(df2, table(id, yr))
df2$salt_pct2 <- CalcDiffPct(df2)
df2

##

Thanks for any help!
Patrick
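
One way this could be made robust to missing rows is to look up the previous year's value by key rather than by position. This is only a sketch: it assumes that 'id' plus 'yr' identifies a row uniquely, and the helper names key_now, key_prev and water_prev are made up for the illustration.

```r
set.seed(123)
df1 <- data.frame(id = rep(LETTERS[1:6], 3),
                  yr = rep(2009:2011, each = 6),
                  water = sample(100:500, 18),
                  salt  = sample(10:40, 18))
df2 <- df1[-15, ]   # drop one id / yr combination

# Look up last year's water for the same id by key instead of by position
key_now    <- paste(df2$id, df2$yr)
key_prev   <- paste(df2$id, df2$yr - 1)
water_prev <- df2$water[match(key_prev, key_now)]

# NA wherever no previous-year row exists for that id
df2$salt_pct <- df2$salt / water_prev * 100
df2
```

The result has one value per row of df2, so it can be bound to the data frame whatever the row order.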


Am 26.10.2011 14:00, schrieb Dimitris Rizopoulos:

Maybe one approach could be:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                  water = sample(c(100:200), 9),
                  tide = sample(c(-10:+10), 9))


100 * tail(df1$tide, -3) / head(df1$water, -3)


I hope it helps.

Best,
Dimitris


On 10/26/2011 12:02 PM, Patrick Hausmann wrote:

Dear R users,

It may be very simple but it is being difficult for me.
I'd like to calculate the difference in percent between two measures.
My data looks like this:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                  water = sample(c(100:200), 9),
                  tide = sample(c(-10:+10), 9))
df1

# What I want to calculate is:
# tide_[A2] / water_[A1],
# tide_[A3] / water_[A2]

# This 'works' for the example, but I am
# looking for a more general solution.

df1$tide_diff <- ave(df1$tide, FUN=function(L) L /
                     c(NA, NA, NA, df1$water)) * 100
df1

Thanks for any help!
Patrick


[R] Calculate the difference using ave

2011-10-26 Thread Patrick Hausmann


Dear R users,

It may be very simple but it is being difficult for me.
I'd like to calculate the difference in percent between two measures.
My data looks like this:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each=3),
                  water = sample(c(100:200), 9),
                  tide  = sample(c(-10:+10), 9))
df1

# What I want to calculate is:
# tide_[A2] / water_[A1],
# tide_[A3] / water_[A2]

# This 'works' for the example, but I am
# looking for a more general solution.

df1$tide_diff <- ave(df1$tide, FUN=function(L) L /
                     c(NA, NA, NA, df1$water)) * 100
df1

Thanks for any help!
Patrick


[R] MARGIN in sweep refers to a specific column in a second df

2011-08-15 Thread Patrick Hausmann


Dear R folks,

I am doing some calculations over an array using sweep and apply.

# Sample Data (from help 'addmargins')
Aye <- sample(c("Yes", "Si", "Oui"), 177, replace = TRUE)
Bee <- sample(c("Hum", "Buzz"), 177, replace = TRUE)
Sea <- sample(c("White", "Black", "Red", "Dead"), 177, replace = TRUE)
(A <- table(Aye, Bee, Sea))

apply(A, c(1, 2), sum )

## ok, sweep with fixed values (the STATS argument)
round( sweep( apply(A, c(1, 2), sum ), 1 , c(111, 333, 444), FUN = "/"), 2)

# DF with the values to sweep out
DF <- data.frame( answer = c(111, 333, 444), Aye = c("Oui", "Si", "Yes"))

## ok, values in the correct order
round( sweep( apply(A, c(1, 2), sum ), 1 , DF[['answer']], FUN = "/"), 2)

## But if I change the order in DF the result is not what I want...
DF.s <- DF[order(DF$Aye, decreasing = TRUE), ]
DF.s
round( sweep( apply(A, c(1, 2), sum ), 1 , DF.s[['answer']], FUN = "/"), 2)

So I would like to know how to make sweep pick up the values in DF 
according to the Aye column, whatever the row order of DF.


Thanks for any help
Patrick
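
A possible approach, sketched here on a two-way table for brevity: reorder the values with match() so that they follow the row names of the table before they are handed to sweep. The names tab, stats and res.s are illustrative, and the sketch assumes every row name of the table occurs exactly once in the Aye column.

```r
set.seed(1)
Aye <- sample(c("Yes", "Si", "Oui"), 177, replace = TRUE)
Bee <- sample(c("Hum", "Buzz"), 177, replace = TRUE)
tab <- table(Aye, Bee)

DF   <- data.frame(answer = c(111, 333, 444), Aye = c("Oui", "Si", "Yes"))
DF.s <- DF[order(DF$Aye, decreasing = TRUE), ]

# Reorder the values so they follow the row names of the table
stats <- DF.s$answer[match(rownames(tab), DF.s$Aye)]
res.s <- sweep(tab, 1, stats, FUN = "/")
round(res.s, 2)
```

Because the lookup is done by name, DF and DF.s give the same result.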


[R] Expand DF with all levels of a variable

2011-06-14 Thread Patrick Hausmann


Dear list,

I would like to expand a DF with all the missing levels of a variable.

a <- c(2,2,3,4,5,6,7,8,9)
a.cut <- cut(a, breaks=c(0,2,6,9,12), right=FALSE )
(x <- data.frame(a, a.cut))

# In 'x'  the level [0,2) is missing.

AddMissingLevel <- function(xdf) {

  xfac <- factor( c("[0,2)", "[2,6)", "[6,9)", "[9,12)") )
  xlevels <- levels(xfac)

  if(length(xlevels) != nlevels(factor(xdf$a.cut))) {
     v <- setdiff(xlevels, factor(xdf$a.cut))
     u <- data.frame(a = 0, a.cut = v)
     # work on the argument 'xdf', not on the global 'x'
     xdf <- rbind(u, xdf)
  }
  return(xdf)
}

AddMissingLevel(x)

Does a more general approach exist, e.g. using expand.grid?

Thanks for any help!!
Patrick
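
A more general route might be to merge against a data frame that holds every level once. In this sketch the names all_levels and x_full are invented, and the added rows get NA in 'a' rather than the 0 used above.

```r
a <- c(2, 2, 3, 4, 5, 6, 7, 8, 9)
a.cut <- cut(a, breaks = c(0, 2, 6, 9, 12), right = FALSE)
x <- data.frame(a, a.cut)

# One row per level, including levels without observations
all_levels <- data.frame(a.cut = factor(levels(x$a.cut),
                                        levels = levels(x$a.cut)))

# all.x = TRUE keeps the unmatched levels; their 'a' becomes NA
x_full <- merge(all_levels, x, by = "a.cut", all.x = TRUE)
x_full
```

With several factors, expand.grid() on their levels could build the left-hand table in the same way.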


[R] More flexible aggregate / eval

2011-04-30 Thread Patrick Hausmann


Dear list,

I would like to do some calculation using different grouping variables. 
My 'df' looks like this:


# Some data
set.seed(345)
id <- seq(200,400, by=10)
ids <- sample(substr(id,1,1))
group1 <- rep(1:3, each=7)
group2 <- rep(1:2, c(10,11))
group3 <- rep(1:4, c(5,5,5,6))
df <- data.frame(id, ids, group1, group2, group3)
df <- rbind(df, df, df)
# one year per stacked copy of df (seq() has no 'each' argument)
df$time <- rep(2009:2011, each=21)
df$x1 <- sample(0:100, 63)
df$x2 <- sample(44:234, 63)

head(df)

## For group1
d1 <- aggregate(cbind(x1, x2) ~
  group1 + ids + time, data = df, sum)

d1$l_pct <- with(d1, ave(x1, list(group1, time),
 FUN = function(x) round(prop.table(x) * 100, 1) ) )

op1 <- xtabs(l_pct ~ group1 + ids + time, data = d1)
ftable(op1, row.vars=c(1,3))

## For group2
d2 <- aggregate(cbind(x1, x2) ~
  group2 + ids + time, data = df, sum)

d2$l_pct <- with(d2, ave(x1, list(group2, time),
 FUN = function(x) round(prop.table(x) * 100, 1) ) )

op2 <- xtabs(l_pct ~ group2 + ids + time, data = d2)
ftable(op2, row.vars=c(1,3))

## and for group3...
## To have a more flexible solution I wrote this function:

myfun <- function(xdf, xvar) {

 # build the formula from character pieces
 fo1 <- "cbind(x1, x2) ~ "
 fo2 <- paste(fo1, xvar, " + ids + time", sep="")
 formular <- as.formula(fo2)

 d2 <- do.call(aggregate, list(formular, data = xdf, FUN = sum))

 d2$l_pct <- with(d2, ave(x1, list(eval(as.name(xvar)), time),
  FUN = function(x) round(prop.table(x) * 100, 1) ) )
 op2 <- xtabs(l_pct ~ eval(as.name(xvar)) + ids + time, data = d2)
 fop2 <- ftable(op2, row.vars=c(1,3))
 out <- list(d2, fop2)
 return(out)

}

( out_gr1 <- myfun(df, "group1") )
( out_gr2 <- myfun(df, "group2") )
( out_gr3 <- myfun(df, "group3") )

This seems to work ok, but I am not really familiar with 'as.formula', 
'eval' and 'as.name'. So I would like to know, if my solution is ok or 
if there are maybe better ways to solve this task.


Thanks for any help!!
Patrick
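
A variant without 'eval' and 'as.name' could be sketched as follows: reformulate() builds the formula from character vectors, and d[[xvar]] picks the grouping column by name. The toy data and the function name pct_by are assumptions made for the illustration, and only x1 is aggregated to keep it short.

```r
# Toy data; the column names follow the post, the values are made up
df <- data.frame(ids    = rep(c("2", "3"), times = 6),
                 group1 = rep(1:3, each = 4),
                 time   = rep(rep(2009:2010, each = 2), times = 3),
                 x1     = as.numeric(1:12))

pct_by <- function(xdf, xvar) {
  # Build x1 ~ <xvar> + ids + time from character pieces
  fo <- reformulate(c(xvar, "ids", "time"), response = "x1")
  d  <- aggregate(fo, data = xdf, FUN = sum)
  # d[[xvar]] selects the grouping column by name
  d$l_pct <- ave(d$x1, d[[xvar]], d$time,
                 FUN = function(x) round(prop.table(x) * 100, 1))
  d
}

out_gr1 <- pct_by(df, "group1")
out_gr1
```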


Re: [R] apply mean to a three-dimension data

2011-03-25 Thread Patrick Hausmann


Hi,
I think you could also do it this way (via an array, see 
http://r.789695.n4.nabble.com/apply-over-list-of-data-frames-td3057968.html)


b <- list()
b[[1]] = matrix(1:4, 2, 2)
b[[2]] = matrix(10:13, 2, 2)
b[[3]] = matrix(20:23, 2, 2)

b.a <- array(unlist(b), dim=c(2, 2, 3))

(b.mean <- apply(X = b.a, MARGIN = c(1, 2), FUN = mean))

(b.sum <- apply(X = b.a, MARGIN = c(1, 2), FUN = sum))


HTH
Patrick

Am 24.03.2011 16:07, schrieb Hui Du:

Hi All,

 Suppose I have data like

b[[1]] = matrix(1:4, 2, 2)
b[[2]] = matrix(10:13, 2, 2)
b[[3]] = matrix(20:23, 2, 2)

[[1]]
     [,1] [,2]
[1,]    1    3
[2,]    2    4

[[2]]
     [,1] [,2]
[1,]   10   12
[2,]   11   13

[[3]]
     [,1] [,2]
[1,]   20   22
[2,]   21   23

 Now I want to calculate the mean of each cell across the list. 
For example mean of (b[[1]][1,1], b[[2]][1,1], b[[3]][1,1]), mean of 
(b[[1]][1,2], b[[2]][1,2], b[[3]][1,2]) etc. e.g. mean of (1, 10, 20), mean 
of(3, 12, 22). Could somebody tell me how to do it? Thank you in advance.

HXD

[[alternative HTML version deleted]]


[R] fraction with timelag

2011-03-24 Thread Patrick Hausmann


Dear r-help,

I have this DF

df <- data.frame(id = 1:6,
                 xout = c(1234, 2134, 234, 456, 324, 345),
                 xin  = c(NA, 34, 67, 87, 34, NA))

and would like to calculate the fraction xin_t / xout_(t-1), in percent.
The result should be:
# NA, 2.76, 3.14, 37.18, 7.46, NA

I am sure there is a solution using zoo... but I don't know how...

Thanks for any help!
Patrick
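
In base R, the lag can be sketched by shifting 'xout' down by one position. The names xout_lag and frac are illustrative, and the rows are assumed to be in time order already.

```r
df <- data.frame(id = 1:6,
                 xout = c(1234, 2134, 234, 456, 324, 345),
                 xin  = c(NA, 34, 67, 87, 34, NA))

# Shift xout down by one row: position t now holds xout from t-1
xout_lag <- c(NA, head(df$xout, -1))

df$frac <- round(df$xin / xout_lag * 100, 2)
df
```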


[R] list multiplied by a factor / mapply

2011-02-23 Thread Patrick Hausmann


Dear list,

this works fine:

x <- split(iris, iris$Species)
x1 <- lapply(x, function(L) transform(L, g = L[,1:4] * 3))

but I would like to multiply each Species with another factor:
setosa by 2, versicolor by 3 and virginica by 4. I've tried mapply but 
without success.


Any thoughts? Thanks for any idea!
Patrick
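
One way this might work is Map() (a wrapper around mapply), which walks over the list and a vector of multipliers in parallel. The vector name mult is made up; indexing it by names(x) is meant to keep the pairing independent of the order in which it is written.

```r
x <- split(iris, iris$Species)

# One multiplier per species, looked up by name
mult <- c(setosa = 2, versicolor = 3, virginica = 4)

x2 <- Map(function(d, k) transform(d, g = d[, 1:4] * k),
          x, mult[names(x)])

head(x2$virginica, 3)
```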


Re: [R] manipulate dataframe

2011-02-06 Thread Patrick Hausmann


Hi André,

try this:

df1 <- data.frame(x1 = rep(1:3, each=3), x2=letters[1:9])

dfs <- split(df1, df1$x1)

df2 <- data.frame(sapply(dfs, FUN="[[", "x2"))
colnames(df2) <- paste("d", unique(df1$x1), sep="")
df2

HTH
Patrick


Am 06.02.2011 12:13, schrieb André de Boer:

Hello,

Can someone give me a hint on how to change a data.frame?
I want to split a column into several columns depending on the value of another
column.
Thanks for the reaction,
Andre

Example:

dat

   x1 x2
1  1  a
2  1  b
3  1  c
4  2  d
5  2  e
6  2  f
7  3  g
8  3  h
9  3  i

into


dur

   d1 d2 d3
1  a  d  g
2  b  e  h
3  c  f  i

[[alternative HTML version deleted]]


[R] transform a df with a condition

2011-01-16 Thread Patrick Hausmann


Dear all,

for each A == 3 in 'df' I would like to change the variables B and K.
My result should be the whole df and not the subset (A==3)...

df <- data.frame(A = c(1,1,3,2,2,3,3),
                 B = c(2,1,1,2,7,8,7),
                 K = c("a.1", "d.2", "f.3",
                       "a.1", "k.4", "f.9", "f.5"))

x1 <- within(df[df$A ==3, ], {
   B1 <- 5
   K1 <- gsub("f", "m", K)
   })

x2 <- transform(df[df$A==3, ], B1 = 5, K1 = gsub("f", "m", K))

Thanks for any help!
Patrick


Re: [R] transform a df with a condition

2011-01-16 Thread Patrick Hausmann

Arrg, sorry - of course I don't want *new* variables. So this is my 
correct example:


df <- data.frame(A = c(1,1,3,2,2,3,3),
                 B = c(2,1,1,2,7,8,7),
                 K = c("a.1", "d.2", "f.3",
                       "a.1", "k.4", "f.9", "f.5"))

x1 <- within(df[df$A ==3, ], {
  B <- 5
  K <- gsub("f", "m", K)
 })

x2 <- transform(df[df$A==3, ], B = 5, K = gsub("f", "m", K))

Thanks
Patrick
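
To keep the whole data frame, one option is to assign into the selected rows instead of subsetting first. This is a sketch; the name sel is illustrative, and K is kept as character (stringsAsFactors = FALSE) so that the replacement values are not restricted to existing factor levels.

```r
df <- data.frame(A = c(1, 1, 3, 2, 2, 3, 3),
                 B = c(2, 1, 1, 2, 7, 8, 7),
                 K = c("a.1", "d.2", "f.3", "a.1", "k.4", "f.9", "f.5"),
                 stringsAsFactors = FALSE)

# Logical index of the rows to change
sel <- df$A == 3

# Assign only into the selected rows; all other rows stay as they are
df$B[sel] <- 5
df$K[sel] <- gsub("f", "m", df$K[sel])
df
```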

Am 16.01.2011 15:13, schrieb Patrick Hausmann:

Dear all,

for each A == 3 in 'df' I would like to change the variables B and K.
My result should be the whole df and not the subset (A==3)...

df <- data.frame(A = c(1,1,3,2,2,3,3),
                 B = c(2,1,1,2,7,8,7),
                 K = c("a.1", "d.2", "f.3",
                       "a.1", "k.4", "f.9", "f.5"))

x1 <- within(df[df$A ==3, ], {
   B1 <- 5
   K1 <- gsub("f", "m", K)
   })

x2 <- transform(df[df$A==3, ], B1 = 5, K1 = gsub("f", "m", K))

Thanks for any help!
Patrick


[R] Using combn

2011-01-10 Thread Patrick Hausmann


Dear list,

I want to apply the table function to every pair of variables in df 
and the return should be a list.


set.seed(123)
asd <- data.frame(a1=sample(1:4, 20, replace=TRUE),
                  a2=sample(1:4, 20, replace=TRUE),
                  a3=sample(1:4, 20, replace=TRUE),
                  a4=sample(1:4, 20, replace=TRUE))

with(asd, table(a1, a2))
with(asd, table(a1, a3))
with(asd, table(a1, a4))
...

I'm sure there is a solution using combn - but I don't get it...

combn(colnames(asd), 2)
...

Thanks for any help!
Patrick
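
A sketch of how combn could be used here: with simplify = FALSE it returns the column pairs as a list, and lapply then builds one table per pair. The names pairs and tabs, and the "a1.a2" naming scheme, are assumptions for the illustration.

```r
set.seed(123)
asd <- data.frame(a1 = sample(1:4, 20, replace = TRUE),
                  a2 = sample(1:4, 20, replace = TRUE),
                  a3 = sample(1:4, 20, replace = TRUE),
                  a4 = sample(1:4, 20, replace = TRUE))

# All pairs of column names, as a list of character vectors of length 2
pairs <- combn(colnames(asd), 2, simplify = FALSE)

# One two-way table per pair
tabs <- lapply(pairs, function(p) table(asd[, p]))
names(tabs) <- vapply(pairs, paste, character(1), collapse = ".")

names(tabs)
tabs[["a1.a2"]]
```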


Re: [R] Projecting data on a world map using long/lat

2010-12-11 Thread Patrick Hausmann


Hi Mathijs,

this should work:

library(maptools)
library(ggplot2)
gpclibPermit()
theme_set(theme_bw())

# setwd("C:\\foo")  # point to your local dir
# Data: http://thematicmapping.org/downloads/world_borders.php
world.shp <- readShapeSpatial("TM_WORLD_BORDERS-0.3.shp")

# check for region-id - Use FIPS
head(world.shp@data)

## see licence, not GPL
world.shp.p <- fortify.SpatialPolygonsDataFrame(world.shp, region="FIPS")

world <- merge(world.shp.p, world.shp, by.x="id", by.y="FIPS")

head(world)
dim(world)

# only the worldmap
p <- ggplot(data=world, aes(x=long, y=lat, group=group)) +
  geom_polygon(fill="#63D1F4")

p <- p + geom_path(color="white") + coord_equal()
ggsave(p, width=11.69, height=8.27, file="world_map.jpg")

## Add some locations
cities <- read.table(textConnection("
long lat city pop
-58.381944 -34.599722 'Buenos Aires' 11548541
14.25 40.83 Neapel 962940"), header = TRUE)

p1 <- p + geom_point(data = cities, aes(group = NULL), shape=5,
                     color='black')
ggsave(p1, width=11.69, height=8.27, file="world_map_2.jpg")

Regards,
Patrick


Am 10.12.2010 17:53, schrieb mathijsdevaan:


Thanks for the suggestions, but I am not there yet (I'm a real novice). In
the code provided by Patrick (see below), I changed the shape input (from
sids to world) which I downloaded here:
http://thematicmapping.org/downloads/world_borders.php. As a result I also
need to change the CNTY_ID and id in the code, but I have no idea what
to put there. Could you please help me? Thanks!

Mathijs

library(maptools)
library(ggplot2)
gpclibPermit()

myshp <- readShapeSpatial(system.file("shapes/sids.shp",
package="maptools"))

## see licence, not GPL
myshp.points <- fortify.SpatialPolygonsDataFrame(myshp,
region="CNTY_ID")

shpm <- merge(myshp.points, myshp, by.x="id", by.y="CNTY_ID")

head(shpm)

p <- ggplot(shpm, aes(long, lat, group=group, fill=NWBIR74))
p <- p + geom_polygon() + geom_path(color="white") + coord_equal()

## Add some locations
cities <- read.table(textConnection("
long lat city val
-78.644722 35.818889 Raleigh 323
-80.84 35.226944 Charlotte 510
-82.555833 35.58 Asheville 400"), header = TRUE)

p <- p + geom_point(aes( fill=NULL, group = NULL, size=val),
  data = cities, color= 'black')
p



Re: [R] LaTeX, MiKTeX, LyX: A Guide for the Perplexed

2010-12-08 Thread Patrick Hausmann


Hi Paul,
I am using Sweave and MiKTeX and the results are really impressive, but 
it's often quite complicated (or impossible) to share the rnw-files with 
my colleagues/clients. So it depends with/for whom you are working. 
Perhaps as an alternative you could use a simpler markup format e.g. 
Markdown (with the ascii package). To convert between different 
markup languages Pandoc [1] looks very promising.


HTH
Patrick

[1] http://johnmacfarlane.net/pandoc

Am 08.12.2010 00:29, schrieb Paul Miller:

Hello Everyone,

Been learning R over the past several months. Read several books and have 
learned a great deal about data manipulation, statistical analysis, and 
graphics.

Now I want to learn how to make nice looking documents and about literate 
programming. My understanding is that R users normally do this using LaTeX, MiKTeX, 
LyX, etc. in conjunction with Sweave. An alternative might be to use the R2wd package to 
create Word documents.

So I guess I have four questions:

1. How do I choose between the various options? Why would someone decide to use 
LaTeX instead of MiKTeX or vice versa for example?

2. What are the best sources of information about LaTeX, MiKTeX, LyX, etc.?

3. What is the learning curve like for each of these? What do you get for the 
time you put in learning something that is more difficult?

4. How do people who use LaTeX, MiKTeX, LyX, etc. share documents with people 
who are just using Word? How difficult does using LaTeX, MiKTeX, LyX, etc. make 
it to collaborate on projects with others?

Thanks,

Paul


[[alternative HTML version deleted]]





Re: [R] Help summarizing R data frame

2010-12-02 Thread Patrick Hausmann


Here are some examples with tapply, aggregate, ddply:

x <- read.table("clipboard", head=TRUE)

with(x, tapply(quantity, identifier, sum))

aggregate(x$quantity, by=list(x$identifier), sum)

aggregate(quantity ~ identifier, data = x, sum)

library(plyr)
ddply(x, .(identifier), summarise, quantity=sum(quantity))

HTH
Patrick

Am 02.12.2010 17:24, schrieb chris99:


I am trying to aggregate data in column 2 to identifiers in col 1

eg..

take this

identifier   quantity
1            10
1            20
2            30
1            15
2            10
3            20

and make this

identifier   quantity
1            45
2            40
3            20


Thanks in advance for your help!



[R] more flexible ave

2010-11-30 Thread Patrick Hausmann


Hi all,

I would like to calculate the percent of the total per group for this 
data.frame:


df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213))
df

calcPercent <- function(df) {

  df <- transform(df, pct_val1 = ave(df[, -c(1:2)], df$gr,
                    FUN = function(x)
                    x/df[df$gr == "total", "value1"]) )
}

# This works as intended...
w <- lapply(split(df, df$site), calcPercent)
w <- do.call(rbind, w)
w

# ... but when I add a new column
df$value2 <- c(1546, 560, 543, 234, 654, 312)

# the result is not what I want...
w <- lapply(split(df, df$site), calcPercent)
w <- do.call(rbind, w)
w

Clearly I have to change the function (particularly the hard-coded 
'value1') - but how... I've also played around with apply but without any success.


Thanks for any help!
Patrick
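
A sketch that handles any number of value columns: take the "total" row of each site and divide every value column by it, matching on 'site'. It assumes each site has exactly one "total" row and that the value columns share the prefix "value"; the names val_cols, totals, idx and pct are made up.

```r
df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213),
                 value2 = c(1546, 560, 543, 234, 654, 312))

val_cols <- grep("^value", names(df), value = TRUE)

# The per-site totals, then aligned to the rows of df by site
totals <- df[df$gr == "total", c("site", val_cols)]
idx    <- match(df$site, totals$site)

# Column-wise division of each value by the total of its site
pct <- df[val_cols] / totals[idx, val_cols]
names(pct) <- paste0("pct_", val_cols)

w <- cbind(df, pct)
w
```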


Re: [R] stacking consecutive columns

2010-11-17 Thread Patrick Hausmann



Hi Gregory,

is this what you want? Ok, not the most elegant way...

# using 'melt' from the 'reshape' package

library(reshape)
Data <- data.frame(month = 1:12,
                   x2002 = runif(12),
                   x2003 = runif(12),
                   x2004 = runif(12),
                   x2005 = runif(12))

v <- list()

# one long data frame per pair of consecutive years
for(i in 2:4) {
 kk <- melt(Data[, c(1, i, i+1)], id.vars="month", variable_name="year")
 v[[i-1]] <- kk[order(kk$year, decreasing=TRUE),]
}

out <- do.call(cbind, v)

HTH
Patrick

Am 17.11.2010 15:03, schrieb Graves, Gregory:

I have a file, each column of which is a separate year, and each row of each 
column is mean precipitation for that month.  Looks like this (except it goes 
back to 1964).

month  X2000  X2001  X2002  X2003  X2004  X2005  X2006  X2007  X2008  X2009
1      1.600  1.010  4.320  2.110  0.925  3.275  3.460  0.675  1.315  2.920
2      2.960  3.905  3.230  2.380  2.720  1.880  2.430  1.380  2.480  2.380
3      1.240  1.815  1.755  1.785  1.250  3.940 10.025  0.420  2.845  2.460
4      3.775  1.350  2.745  0.170  0.710  2.570  0.255  0.425  4.470  1.250
5      4.050  1.385  5.650  1.515 12.005  6.895  7.020  4.060  7.725  2.775
6      8.635  8.900 15.715 12.680 16.270 12.605  7.095  7.025 10.465  7.345
7      5.475  7.955  7.880  6.670  7.955  7.355  5.475  5.650  7.255  7.985
8      8.435  5.525  7.120  6.250  7.150  7.610  5.525  6.500  6.275 10.405
9      5.855  7.830  7.250  7.355  9.715  7.850  6.385  7.960  4.485  7.250
10     7.965 11.915  6.735  8.125  7.855 10.465  4.340  6.165  2.400  3.240
11     1.705  1.525  0.905  1.670  1.840  2.100  0.255  2.830  4.425  1.645
12     2.335  0.840  0.795  1.890  0.145  1.700  0.260  2.160  2.300  2.220

What I want to do is to stack 2008 data underneath 2009 data, 2007 under 2008, 
2006 under 2007, etc.  I have so far figured out that I can do this with the 
following clumsy approach:

a=stack(yearmonth,select=c(X2009,X2008))
b=stack(yearmonth,select=c(X2008,X2007))
x=as.data.frame(c(a,b))
write.table(x,"clipboard",sep="\t",col.names=NA) # then paste this back into Excel to get this


    values  ind    values.1  ind.1
1   0.275   X2009  1.285     X2008
2   0.41    X2009  3.85      X2008
3   1.915   X2009  3.995     X2008
4   1.25    X2009  3.845     X2008
5   8.76    X2009  2.095     X2008
6   8.65    X2009  8.29      X2008
7   7.175   X2009  9.405     X2008
8   7.19    X2009  13.44     X2008
9   8.13    X2009  7.245     X2008
10  1.46    X2009  5.645     X2008
11  2.56    X2009  0.535     X2008
12  5.01    X2009  1.225     X2008
13  1.285   X2008  0.72      X2007
14  3.85    X2008  1.89      X2007
15  3.995   X2008  1.035     X2007
16  3.845   X2008  2.86      X2007
17  2.095   X2008  3.785     X2007
18  8.29    X2008  9.62      X2007
19  9.405   X2008  9.245     X2007
20  13.44   X2008  5.595     X2007
21  7.245   X2008  8.4       X2007
22  5.645   X2008  6.705     X2007
23  0.535   X2008  1.47      X2007
24  1.225   X2008  1.665     X2007


Is there an easier, cleaner way to do this?  Thanks.

Gregory A. Graves, Lead Scientist
Everglades REstoration COoordination and VERification (RECOVER)
Restoration Sciences Department
South Florida Water Management District
Phones:  DESK: 561 / 682 - 2429
CELL:  561 / 719 - 8157


[R] replace NA-values

2010-06-21 Thread Patrick Hausmann


Dear list,

I'm trying to replace NA-values with the preceding values in that column.
This code works, but I am sure there is a more elegant way...

# 'id' must be a factor for levels() below to work
df <- data.frame(id = c("A1", NA, NA, NA, "B1",
                        NA, NA, "C1", NA, NA, NA, NA),
                 value = c(1:12),
                 stringsAsFactors = TRUE)

rn <- c(rownames(df[!is.na(df$id),]), nrow(df)+1)
rn <- diff(as.numeric(rn))
df$id2 <- rep(levels(df$id), rn)

thanks for any help
Patrick
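
A compact base-R variant could use cummax() to find, for each position, the index of the last non-NA entry. This sketch assumes the first element is not NA (a leading NA would yield index 0 and drop elements); the names last_seen and id2 are illustrative.

```r
id <- c("A1", NA, NA, NA, "B1", NA, NA, "C1", NA, NA, NA, NA)

# Position of each non-NA entry, 0 elsewhere; the running maximum
# then gives the index of the most recent non-NA entry
last_seen <- cummax(seq_along(id) * !is.na(id))

id2 <- id[last_seen]
id2
```

The zoo package offers na.locf() for the same task, for anyone who prefers a ready-made function.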


Re: [R] apply a function on elements of a list two by two

2010-05-08 Thread Patrick Hausmann


Am 08.05.2010 15:43, schrieb Joris Meys:

Dear all,

I want to apply a function to list elements, two by two. I hoped that combn
would help me out, but I can't get it to work. A nested for-loop works, but
seems highly inefficient when you have large lists. Is there a more
efficient way of approaching this?

# Make some toy data
data(iris)
test <- vector("list", 3)
for (i in 1:3){
  x <- levels(iris$Species)[i]
  tmp <- dist(iris[iris$Species==x,-5])
  test[[i]] <- tmp
}
names(test) <- levels(iris$Species)


# Using 'lapply' and 'split' is a little bit more flexible:
test <- lapply(split(iris[, -5], iris$Species), function(x) dist(x))



# nested for loop works
for(i in 1:2){
 for(j in (i+1):3){
 print(all.equal(test[[i]],test[[j]]))
 }
}

# combn doesn't work
combn(test,2,all.equal)


Sorry, no answer

HTH
Patrick
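
For what it is worth, combn might be made to work by running it over the list indices rather than over the list itself. This is an untested-in-thread sketch; the name res is made up, and simplify = FALSE is used because all.equal returns either TRUE or a character vector of differences.

```r
test <- lapply(split(iris[, -5], iris$Species), dist)

# Iterate over index pairs (1,2), (1,3), (2,3) and compare the elements
res <- combn(seq_along(test), 2,
             FUN = function(ij) all.equal(test[[ij[1]]], test[[ij[2]]]),
             simplify = FALSE)

length(res)
```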



Cheers
Joris



Re: [R] Find the three best values in every row

2010-05-07 Thread Patrick Hausmann


Hello Alfred,

I found the solution from S. Ellison
(https://stat.ethz.ch/pipermail/r-help/2010-May/238158.html) really 
inspiring.

Here I am using tail and the library 'plyr':

 set.seed(17*11)
 d <- data.frame(africa=sample(50, 10),
                 europe= sample(50, 10),
                 n.america= sample(50, 10),
                 s.america= sample(50, 10),
                 antarctica= sample((1:50)/20, 10)
                 )

# using tail
t(apply(d, 1, function(x, n) tail(sort(x), n), n=3))

lapply(split(d, rownames(d)), function(x, n) sort(x)[n:ncol(x)], n=3)

# with plyr from Hadley Wickham
library(plyr)
ldply(split(d, rownames(d)), function(x, n) sort(x)[n : ncol(x)], n=3)

HTH
Patrick
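
If the country names are wanted rather than the values, a sketch along the same lines is to sort each row in decreasing order and keep the names of the first three entries. The name top3 is illustrative.

```r
set.seed(17 * 11)
d <- data.frame(africa     = sample(50, 10),
                europe     = sample(50, 10),
                n.america  = sample(50, 10),
                s.america  = sample(50, 10),
                antarctica = sample((1:50) / 20, 10))

# For each row (year): names of the three largest values, best first
top3 <- t(apply(d, 1, function(x) names(sort(x, decreasing = TRUE))[1:3]))
top3
```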

Am 07.05.2010 15:43, schrieb Alfred Schulze:


Hello,

I have a data frame with the GDP for different countries (in the columns) and
years (in the rows).

Now I want the three best values for every year, if possible with the names of
the countries (column names).

Getting the best value is no problem, but I don't know how to get the other two.

Thanks,

Alfred

Re: [R] Data frame pivoting

2010-05-06 Thread Patrick Hausmann


Hi Angelo,

try

x <- structure(list(ID = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
"A3", "A3", "A3", "A3", "A3"), YEAR = c(2007, 2007, 2007, 2008,
2008, 2007, 2008, 2007, 2007, 2008, 2008, 2008), PROPERTY = c("P1",
"P2", "P3", "P1", "P2", "P5", "P6", "P1", "P3", "P1", "P2", "P6"
), VALUE = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25)), .Names = c("ID",
"YEAR", "PROPERTY", "VALUE"), row.names = c(NA, 12L), class = "data.frame")

# package reshape
library(reshape)
xm <- melt(x, id.var=c("ID", "YEAR", "PROPERTY"))

# with cast (reshape)
cast(xm, ID ~ YEAR ~ PROPERTY)

ftable(cast(xm, ID ~ YEAR ~ PROPERTY))

# with xtabs - 0 != NA
xtabs(value ~ ID + YEAR + PROPERTY, data = xm)

ftable( xtabs(value ~ ID + YEAR + PROPERTY, data = xm) )

ftable(addmargins(xtabs(value ~ ID + YEAR + PROPERTY, data = xm)))

HTH
Patrick

Am 06.05.2010 09:06, schrieb angelo.lina...@bancaditalia.it:


Dear R experts,

I am trying to solve this problem, related to the possibility of
changing the shape of a data frame using a pivoting-like function.
I have a dataframe df of observations as follows:

ID  VALIDITY YEAR  PROPERTY  PROPERTY VALUE
A1  2007           P1        V1
A1  2007           P2        V2
A1  2007           P3        V3
A1  2008           P1        V10
A1  2008           P2        V20
A2  2007           P5        V50
A2  2008           P6        V20
A3  2007           P1        V1
A3  2007           P3        V30
A3  2008           P1        V10
A3  2008           P2        V4
A3  2008           P6        V25

(you can imagine that this data is collected every year from a sample of
people with several measures - weight, number of children, income...
It can happen that some properties could be missing from some IDs).
I have to obtain a data frame like this:


ID  VALIDITY YEAR  P1   P2   P3   P4   P5   P6
A1  2007           V1   V2   V3   -    -    -
A1  200            V10  V20  -    -    -    -
A2  2007           -    -    -    -    V50  -
A2  2008           -    -    -    -    -    V60
A3  2007           V1   -    V30  -    -    -
A3  2008           V10  V4   -    -    -    V25


I started by using the "by" operator to obtain the different slices of
data:

by(df,df$PROPERTY,list)

but what then?

I also tried using tapply:

tapply(df$CID,df$PROPERTY,list)

which gives me a list, but I do not know how to go on from there.

Can you help me?

Thank you in advance

Angelo Linardi




Re: [R] transpose? reshape? flipping? challenge with data frame

2010-04-23 Thread Patrick Hausmann


Hi David,

you could use a mix of plyr and reshape:

# Example datasets
# Input
propsum <- data.frame(coverClass=c("C", "G", "L", "O", "S"),
  R209120812=c(NA, 0.49, 0.38, 0.04, 0.09),
  R209122212=c(0.05, 0.35, 0.41, 0.09, 0.10))

library(plyr)
xpropsum <- melt(propsum, id.var="coverClass", variable_name = "Image")
tpropsum <- reshape(xpropsum, timevar="coverClass", idvar="Image",
direction="wide")

colnames(tpropsum) <- sub("value.", "", colnames(tpropsum))
tpropsum

Cheers
Patrick

On 23.04.2010 06:43, david.gobb...@csiro.au wrote:

Greetings all,

I am having difficulty transposing, reshaping, flipping (not sure which) a data
frame which is read from a DBF file.  I have tried using t(), reshape() and
other approaches without success. Can anyone please suggest a way (elegant or
not) of flipping this data around?

The initial data is like propsum (defined below), and I want it to look like 
tpropsum once reformed.

propsum

  coverClass R209120812 R209122212
1          C         NA       0.05
2          G       0.49       0.35
3          L       0.38       0.41
4          O       0.04       0.09
5          S       0.09       0.10


tpropsum

       Image    C    G    L    O  L.1
1 R209120812   NA 0.49 0.38 0.04 0.09
2 R209122212 0.05 0.35 0.41 0.09 0.10

# Example datasets
# Input
propsum <- data.frame(coverClass=c("C", "G", "L", "O", "S"),
   R209120812=c(NA, 0.49, 0.38, 0.04, 0.09),
   R209122212=c(0.05, 0.35, 0.41, 0.09, 0.10))

# Desired output
tpropsum <- data.frame(Image=c("R209120812", "R209122212"),
   C=c(NA, 0.05),
   G=c(0.49, 0.35),
   L=c(0.38, 0.41),
   O=c(0.04, 0.09),
   L=c(0.09, 0.10))

Thanks,
David


Re: [R] transpose? reshape? flipping? challenge with data frame

2010-04-23 Thread Patrick Hausmann


Oops, I meant library(reshape), not plyr, sorry

# Example datasets
# Input
propsum <- data.frame(coverClass=c("C", "G", "L", "O", "S"),
  R209120812=c(NA, 0.49, 0.38, 0.04, 0.09),
  R209122212=c(0.05, 0.35, 0.41, 0.09, 0.10))

library(reshape)
xpropsum <- melt(propsum, id.var="coverClass", variable_name = "Image")
tpropsum <- reshape(xpropsum, timevar="coverClass", idvar="Image",
direction="wide")

colnames(tpropsum) <- sub("value.", "", colnames(tpropsum))
tpropsum

HTH,
Patrick


Re: [R] How to use tapply for quantile

2010-04-09 Thread Patrick Hausmann


Hi James,

I don't know how to solve it with tapply (something with split I 
think..), but you could use plyr (from Hadley Wickham).


library(plyr)

# Generate some data
set.seed(321)
myD <- data.frame(
  Place = sample(c("AWQ","DFR", "WEQ"), 10, replace=T),
  Light = sample(LETTERS[1:2], 15, replace=T),
  value=rnorm(30)
)

myD[c(3,12,29), "value"] <- NA

# data.frame to data.frame
ddply(myD, .(Place, Light), summarise,
 quan_value = quantile(value, na.rm=TRUE))

# data.frame to list
quant - function(df) quantile(df$value, na.rm=TRUE)
dlply(myD, .(Place, Light), quant)

Cheers
Patrick
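
A base-R view of the same problem, as a rough sketch: tapply() with two factors and a function that returns five numbers gives back a matrix whose cells are list elements (printed as "Numeric,5", or NULL for empty cells), so the quantiles are there but have to be pulled out with [[ ]]. The balanced example data below is made up for illustration so that no cell is empty.

```r
set.seed(321)
# Made-up, balanced data: every Place/Light cell has 5 observations.
myD <- data.frame(
  Place = rep(c("AWQ", "DFR", "WEQ"), each = 10),
  Light = rep(c("A", "B"), times = 15),
  value = rnorm(30)
)

# A 3 x 2 matrix of mode "list"; each cell holds a numeric vector of length 5.
q <- tapply(myD$value, list(myD$Place, myD$Light), quantile, na.rm = TRUE)
q[["AWQ", "A"]]          # quantiles for one Place/Light combination

# aggregate() returns the same quantiles as a matrix column of a data frame.
agg <- aggregate(value ~ Place + Light, data = myD, FUN = quantile)
agg
```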


On 09.04.2010 03:24, James Rome wrote:

I am trying to calculate quantiles of a data frame column split up by
two factors:
# Calculate the quantiles
 quarts = tapply(gdf$tt, list(gdf$Runway, gdf$OnHour), FUN=quantile,
na.rm = TRUE)
This does not work:

quarts

   04L       04R       15R       22L       22R       27        32        33L       33R
0  NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL
1  NULL      Numeric,5 NULL      Numeric,5 NULL      NULL      NULL      Numeric,5 NULL
2  NULL      NULL      NULL      Numeric,5 NULL      NULL      NULL      NULL      NULL
3  NULL      NULL      NULL      NULL      NULL      NULL      NULL      Numeric,5 NULL
4  NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL
5  NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL
6  NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL      NULL
7  NULL      Numeric,5 NULL      NULL      NULL      Numeric,5 NULL      Numeric,5 NULL
8  NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL      Numeric,5 NULL
. . .

But if I leave out either of the two factors, it does work

quarts = tapply(gdf$tt, list(gdf$Runway), FUN=quantile, na.rm = TRUE)
quarts

$`04L`
  0%  25%  50%  75% 100%
   4    8    9   10   20

$`04R`
  0%  25%  50%  75% 100%
   0    9   10   11   28
. . . .

How can I get this to work?

Thanks,
Jim Rome


Re: [R] write.csv and header

2009-12-14 Thread Patrick Hausmann


Hi Alex,

try this

mfile <- "c:\\ex01.txt"
nperm <- 12

sDate <- paste("date: ", "2009-12-13", sep="")
sFile <- paste("filename: ", mfile, sep="")
sPerm <- paste("number of permutations: ", nperm, sep="")

mt <- matrix(1:10, 2)

sink(mfile)
  cat(sDate, "\n")
  cat(sFile, "\n")
  cat(sPerm, "\n")
  cat("-", "\n\n")
  print(mt)
sink()

Best
Patrick
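
If the data part should stay machine-readable (comma-separated rather than printed matrix output), one alternative is to write the header with writeLines() and then append the matrix with write.table(). This is only a sketch; it writes to a temporary file instead of c:\ex01.txt, and the header text is the same made-up example as above.

```r
mfile <- tempfile(fileext = ".txt")   # stand-in for the real output path
nperm <- 12
mt    <- matrix(1:10, 2)

# Header lines first, followed by one empty separator line.
header <- c(paste0("date: ", "2009-12-13"),
            paste0("filename: ", mfile),
            paste0("number of permutations: ", nperm),
            "")
writeLines(header, mfile)

# Append the matrix as comma-separated rows, without row or column names.
write.table(mt, mfile, append = TRUE, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Reading it back: skip the four header lines.
back <- read.csv(mfile, skip = 4, header = FALSE)
```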


Walther, Alexander wrote:

Dear list,

I would like to export a matrix to a TXT file by using write.csv (not
necessarily). Is there a way to add a header (with additional
information concerning the project) spanning multiple lines to this
file before the actual data are listed? It should look like this:



date:
filename:
number of permutations:



data (as a matrix)



Any suggestions? Thanks in advance.


cheers

Alex


[R] xtabs - missing combination

2009-12-12 Thread Patrick Hausmann


Dear list,

I am trying to make a contingency table with xtabs but I am getting
a 0 where I expect a 'NA'. Here is a simple example:

options(stringsAsFactors = FALSE)
rn <- LETTERS[1:4]
df1 <- data.frame(r07 = rep(rn, each=4),
  r08 = rep(rn, 4), value = 1:16)
xtabs(value ~ r07 + r08, df1)

# Delete the combination [A, C]
df1 <- df1[-3,]

# Set 'value' for an existing combination ([D, B]) to 0
df1[13, 3] <- 0

# This is the output I want
tapply(df1[, "value"], df1[, c("r07", "r08")], c)

# but using 'xtabs' I get a 0 for [A, C]
xtabs(value ~ r07 + r08, df1)

Hmm, what have I missed...

Thanks for any help!

Best,
Patrick
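
One way to think about this: xtabs() sums 'value' within each cell, and the sum over an empty cell is 0, so a missing combination and a genuine 0 look the same. A possible workaround, sketched here with the example data from the post, is to count the rows per cell as well and blank out the cells that were never observed. The names `tab` and `n_obs` are made up for this sketch.

```r
rn  <- LETTERS[1:4]
df1 <- data.frame(r07 = rep(rn, each = 4), r08 = rep(rn, 4), value = 1:16)
df1 <- df1[-3, ]          # drop the combination [A, C]
df1[13, "value"] <- 0     # a genuine zero, for [D, B]

tab   <- xtabs(value ~ r07 + r08, data = df1)   # sums per cell
n_obs <- xtabs(~ r07 + r08, data = df1)         # number of rows per cell

# Cells without any rows were never observed: mark them as NA.
tab[n_obs == 0] <- NA
tab
```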


[R] ave and grouping

2009-03-02 Thread Patrick Hausmann


Dear list,

# I have a DF like this:
sleep$b   <- c(rep(8,10), rep(9,10))
sleep$me  <- with(sleep, ave(extra, group, FUN = mean))
sleep

# I would like to create a new variable
# holding the b-th value of group 1 and 2.

# This is not what I want, it always takes the '8' from group '1'
# and never the '9'
sleep$gr  <- with(sleep, ave(extra, group, FUN = function(x) x[ b[1] ]))
sleep

Thanks for any help!
Patrick
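
The likely reason is that inside FUN only 'extra' is split by group, while 'b' is still the full column, so b[1] is always 8. A sketch of one workaround is to let ave() split the row indices instead and use them to subset both columns. The copy `sl` is introduced here only to avoid modifying the built-in dataset.

```r
sl   <- datasets::sleep
sl$b <- c(rep(8, 10), rep(9, 10))

# ave() hands FUN the row numbers of each group; within the group,
# take the element of 'extra' at the position given by that group's 'b'.
sl$gr <- with(sl, ave(seq_along(extra), group,
                      FUN = function(i) extra[i][b[i][1]]))
sl
```

With this data, group 1 gets its 8th value and group 2 its 9th value.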


[R] lapply and aggregate function

2009-02-03 Thread Patrick Hausmann


Dear list,

I am struggling with two things...

# First
set.seed(123)
myD <- data.frame( Light = sample(LETTERS[1:2], 10, replace=T),
Feed  = sample(letters[1:5], 20, replace=T),
value=rnorm(20) )

# Mean for Light
myD$meanLight <- unlist( lapply( myD$Light,
function(x) mean( myD$value[myD$Light == x]) ) )
# Mean for Feed
myD$meanFeed  <- unlist( lapply( myD$Feed,
function(x) mean( myD$value[myD$Feed == x]) ) )
myD

# I would like to get a new Var meanLightFeed
# holding the Group-Mean for each combination (eg. A:a = 0.821581)
# by(myD$value, list(myD$Light, myD$Feed), mean)[[1]]


# Second
set.seed(321)
myD <- data.frame( Light = sample(LETTERS[1:2], 10, replace=T),
value=rnorm(20) )

w1 <- tapply(myD$value, myD$Light, mean)
w1
#  w1
#          A          B
#  0.4753412 -0.2108387

myfun <- function(x) (myD$value > w1[x] & myD$value < w1[x] * 1.5)

I would like to have a TRUE/FALSE variable that depends on the constraint in
myfun for each level in Light...

As always - thanks for any help!!
Patrick
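
Both parts can probably be handled with ave(), which returns the group statistic aligned with the original rows. The sketch below uses a data frame like the first one in the post; the column names `meanLightFeed` and `inRange` are made up, and the comparison (greater than the group mean, less than 1.5 times the group mean) is an assumption about what myfun was meant to test.

```r
set.seed(123)
myD <- data.frame(Light = sample(LETTERS[1:2], 20, replace = TRUE),
                  Feed  = sample(letters[1:5], 20, replace = TRUE),
                  value = rnorm(20))

# First: group mean for each Light/Feed combination, one value per row.
myD$meanLightFeed <- ave(myD$value, myD$Light, myD$Feed)

# Second: compare each value with the mean of its own Light level.
meanLight   <- ave(myD$value, myD$Light)
myD$inRange <- myD$value > meanLight & myD$value < meanLight * 1.5
myD
```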


[R] NA-values and logical operation

2009-01-13 Thread Patrick Hausmann


Dear list,

as a result of a logical operation I want to assign
a new variable to a DF with NA-values.

z <- data.frame( x = c(5,6,5,NA,7,5,4,NA),
 y = c(1,2,2,2,2,2,2,2) )

p <- (z$x <= 5) & (z$y == 1)
p
z[p, "p1"] <- 5
z
# ok, this works fine

z <- z[,-3]

p <- (z$x <= 5) & (z$y == 2)
p
z[p, "p2"] <- 5
z
# this failed... - how can I assign the value '5' to the new
# var p2

Thanks for any help!!
Patrick
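
The second assignment fails because the logical index contains NA (NA & TRUE is NA), and NA is not allowed as a subscript in a data frame assignment. A minimal sketch of one fix is to index with which(), which keeps only the TRUE positions. The comparison operators here are an assumption, since the archived post lost them.

```r
z <- data.frame(x = c(5, 6, 5, NA, 7, 5, 4, NA),
                y = c(1, 2, 2, 2, 2, 2, 2, 2))

# Assumed condition; p is NA wherever x is NA and y == 2.
p <- (z$x <= 5) & (z$y == 2)

z$p2 <- NA_real_
z[which(p), "p2"] <- 5    # which() drops both FALSE and NA
z
```

An equivalent choice would be `p %in% TRUE`, which turns the NA entries into FALSE.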


[R] lattice: Color in Barchart legend

2008-09-24 Thread Patrick Hausmann


Dear list,

with the code below I produce the right graph, but the colours of the  
legend are different from the colours of the graph. The colours of the  
graph are the desired colours.


Thanks for any help.
Patrick

library(lattice)

pal1 <- rgb(196, 255, 255, max = 255)
pal2 <- rgb(  0,  35, 196, max = 255)

df <- data.frame( Gruppe = c("A", "B", "A", "B"),
Kat = c("x1", "x1", "w1", "w1"),
value= c(1,2, 4, 5))

barchart(value ~ Kat, group= Gruppe,
 panel = function(y,x,...){
 panel.barchart(x,y, ..., col=c(pal1, pal2))
 }, data = df,
 auto.key = list(points = FALSE, rectangles = TRUE,
columns = 2, space = "bottom")
)
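
A possible explanation: the legend built by auto.key takes its colours from the lattice theme settings (superpose.polygon), whereas 'col' passed inside the panel function only affects the bars. A sketch of one way to keep both in sync is to set the colours through par.settings instead of a custom panel function; the object name `p` is made up for this sketch.

```r
library(lattice)

pal1 <- rgb(196, 255, 255, maxColorValue = 255)
pal2 <- rgb(  0,  35, 196, maxColorValue = 255)

df <- data.frame(Gruppe = c("A", "B", "A", "B"),
                 Kat    = c("x1", "x1", "w1", "w1"),
                 value  = c(1, 2, 4, 5))

# Bars and legend rectangles both read their colours from superpose.polygon.
p <- barchart(value ~ Kat, groups = Gruppe, data = df,
              par.settings = list(superpose.polygon = list(col = c(pal1, pal2))),
              auto.key = list(points = FALSE, rectangles = TRUE,
                              columns = 2, space = "bottom"))
print(p)
```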


[R] strsplit and regexp

2008-08-30 Thread Patrick Hausmann


Dear list,

I am trying to split a string using regexp:

x <- "2 Value 34 a-c 45 t"
strsplit(x, "[0-9]")

[[1]]
[1]Value  a-c  t

But I don't want to lose the digits (pattern), the result
should be:

[[1]]
[1] 2  Value   34   a-c   45  t

Thanks for any tips
Patrick
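
strsplit() always discards whatever the pattern matches, so one alternative is to extract the pieces instead of splitting. The sketch below uses gregexpr() with regmatches() to pull out alternating runs of digits and non-digits; it assumes that the surrounding blanks may stay attached to the text pieces, which the archived output does not show clearly.

```r
x <- "2 Value 34 a-c 45 t"

# Match either a run of digits or a run of non-digits, in order of appearance.
parts <- regmatches(x, gregexpr("[0-9]+|[^0-9]+", x))
parts

# Optionally strip the blanks around each piece.
trimmed <- trimws(parts[[1]])
```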


[R] tapply and grouping

2008-05-17 Thread Patrick Hausmann


Hello all,

I have a df like this:

w <- c(1.20, 1.34, 2.34, 3.12, 2.89, 4.67, 2.43,
   2.89, 1.99, 3.45, 2.01, 2.23, 1.45, 1.59)
g <- rep(c("a", "b"), each=7)
df <- data.frame(g, w)
df

# 1. Mean for each group
tapply(df$w, df$g, function(x) mean(x))

# 2. Range for each group - fixed value 0.15
tapply(df$w, df$g,
   function(x)
   x[(x > mean(x) - 0.15) &
 (x < mean(x) + ( 1 - 0.15 ))])

Now my question: How can I use a different value in place of 0.15 for
each group? As a result of a calculation I have an object
vari:

> vari
   a    b
0.41 0.08

> str(vari)
  num [, 1:2] 0.41 0.08
  - attr(*, "dimnames")=List of 1
   ..$ : chr [1:2] "a" "b"

So, I want to use 0.41 for group a and 0.08 for group b
instead of 0.15...

Thanks for any help!!
Patrick
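
Since tapply() passes only the values of one group to its function, a per-group parameter is easier to supply by splitting the data first and walking over the groups and the parameters in parallel. A minimal sketch with Map(), assuming 'vari' can be indexed by group name (it is written as a plain named vector here) and that the filter keeps values between mean - v and mean + (1 - v); `in_range`, `groups` and `res` are names made up for this sketch.

```r
w <- c(1.20, 1.34, 2.34, 3.12, 2.89, 4.67, 2.43,
       2.89, 1.99, 3.45, 2.01, 2.23, 1.45, 1.59)
g <- rep(c("a", "b"), each = 7)
vari <- c(a = 0.41, b = 0.08)

# Keep the values of one group that fall into the range defined by v.
in_range <- function(x, v) x[x > mean(x) - v & x < mean(x) + (1 - v)]

groups <- split(w, g)                               # list with elements a and b
res    <- Map(in_range, groups, vari[names(groups)])
res
```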


[R] Using tapply

2008-05-15 Thread Patrick Hausmann


Dear list,

I have a dataframe like this:

w <- c(1.2, 1.34, 2.34, 3.12, 2.43, 1.99, 2.01, 2.23, 1.45, 1.59)
g <- rep(c("a", "b"), each=5)
df <- data.frame(g, w)
df

   g    w
1  a 1.20
2  a 1.34
3  a 2.34
4  a 3.12
5  a 2.43
6  b 1.99
7  b 2.01
8  b 2.23
9  b 1.45
10 b 1.59

Using tapply to get the mean for each group:

vk <- tapply(df$w, df$g, mean)
vk
#     a     b
# 2.086 1.854

Now I would like to get for each group the first value *greater*
than the mean.
So for a it should be 2.34 and for b 1.99.

Thanks for any help
Patrick
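
Because the function given to tapply() sees all values of one group at once, the comparison with the group mean can be done inside it. A short sketch on the data from the post; `first_above` is a name made up here.

```r
w  <- c(1.2, 1.34, 2.34, 3.12, 2.43, 1.99, 2.01, 2.23, 1.45, 1.59)
g  <- rep(c("a", "b"), each = 5)
df <- data.frame(g, w)

# For each group: keep the values above the group mean, then take the first.
first_above <- tapply(df$w, df$g, function(x) x[x > mean(x)][1])
first_above
```

A group in which no value exceeds its mean (for example, a single observation) would yield NA.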


[R] ave and sd

2007-11-21 Thread Patrick Hausmann

Dear list,

I'm still trying to calculate the sd for V2 for
each group in V1 if V3 is '0':

> x
   V1   V2 V3
1 A01 2.40  0
2 A01 3.40  1
3 A01 2.80  0
4 A02 3.20  0
5 A02 4.20  0
6 A03 2.98  1
7 A03 2.31  0
8 A04 4.20  0

# Works
x$vmean <- ave(x$V2, x$V1, x$V3 == 0, FUN = mean)

# Works
x$vsd2 <- ave(x$V2, x$V1, FUN = sd)

# Doesn't work
x$vsd <- ave(x$V2, x$V1, x$V3 == 0, FUN = sd)

Thank you for any help!

Patrick
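
With two grouping variables ave() splits by every combination of V1 and V3 == 0, including combinations that have no rows, and sd() may then be called on an empty vector. One way around this, sketched below on the data from the post, is to subset to the rows with V3 == 0 first and group by V1 only. V1 is a character column here; if it were a factor, droplevels() on the subset would avoid empty groups.

```r
x <- data.frame(V1 = c("A01", "A01", "A01", "A02", "A02", "A03", "A03", "A04"),
                V2 = c(2.40, 3.40, 2.80, 3.20, 4.20, 2.98, 2.31, 4.20),
                V3 = c(0, 1, 0, 0, 0, 1, 0, 0))

ok    <- x$V3 == 0
x$vsd <- NA_real_                                # rows with V3 != 0 stay NA
x$vsd[ok] <- ave(x$V2[ok], x$V1[ok], FUN = sd)   # sd per V1 among V3 == 0 rows
x
```

Groups with a single remaining value (A03 and A04 here) get NA, since the standard deviation of one value is undefined.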


[R] ave and levels

2007-11-19 Thread Patrick Hausmann

Dear list,

I want to calculate the standard deviation using
'ave' on two different DFs.

In the first DF, M1 has only 1 level:

> str(x)
'data.frame':   18 obs. of  3 variables:
  $ M1: Factor w/ 1 level "A03": 1 1 1 1 ...
  $ M2: num  2.76 2.93 3.06 3.07 3.12 ...
  $ M3: Factor w/ 2 levels "Ausgewählt","Nicht ausgewählt": 1 1 1 1 ...

and I am getting a correct 'NA' for the last value
ave(x$M2, x$M1, factor(x$M3), FUN = sd)
# [1] 0.1810123 0.1810123 0.1810123 0.1810123
# 0.1810123 0.1810123 0.1810123 0.1810123 0.1810123
# 0.1810123 0.1810123 0.1810123 0.1810123
# 0.1810123 0.1810123 0.1810123 0.1810123 NA

This is the second DF (here M1 has 138 levels):

> str(k)
'data.frame':   18 obs. of  3 variables:
  $ M1: Factor w/ 138 levels "A01","A02","A03",..: 3 3 3 3 ...
  $ M2: num  2.76 2.93 3.06 3.07 3.12 3.12 3.15 3.17 3.17 3.17 ...
  $ M3: Factor w/ 2 levels "Ausgewählt","Nicht ausgewählt": 1 1 1 1 ...

and I am getting this error
ave(k$M2, k$M1, factor(k$M3), FUN = sd)
# Error in var(x, na.rm = na.rm) : 'x' is empty

So, what have I missed?
Thank you for any help!

Patrick
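
A likely cause is the 137 unused levels of M1: ave() groups by all level combinations, so FUN is also called for groups without any rows, and in the R version of the post sd() stopped on an empty vector. Dropping the unused levels before grouping avoids those empty groups. The sketch below uses a small made-up data frame; the level names "sel" and "not" are stand-ins for the two M3 levels.

```r
k <- data.frame(
  M1 = factor(rep("A03", 6), levels = c("A01", "A02", "A03")),  # 2 unused levels
  M2 = c(2.76, 2.93, 3.06, 3.07, 3.12, 3.17),
  M3 = factor(c("sel", "sel", "sel", "sel", "sel", "not"))
)

# droplevels() removes the levels that do not occur, so only
# non-empty M1/M3 groups are passed to sd().
res <- ave(k$M2, droplevels(k$M1), k$M3, FUN = sd)
res
```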
