[R] Hypergeometric Function seems to give wrong results

2015-07-06 Thread Carlos Nasher via R-help
Hello R helpers,

I need to evaluate the Hypergeometric Function of the 2nd kind (Tricomi
confluent hypergeometric function). Therefore I'm using the kummerU
function from the fAsianOptions package. It seems to me that kummerU is
giving wrong results. Here's an example:

library(fAsianOptions)
kummerU(a=19, b=19, x = 10)

R gives 1838.298 for the real part.

If I use Mathematica via the wolfram site (
http://functions.wolfram.com/webMathematica/FunctionEvaluation.jsp?name=HypergeometricU)
the result is 3.52603e-20 which is more reasonable in the context of my
analysis.

Can anyone help me compute the correct values in R?

Best regards,
Carlos
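
One way to cross-check the value in base R, sketched under the assumption that a > 0 and x > 0, is the standard integral representation U(a, b, x) = 1/Gamma(a) * Integral_0^Inf exp(-x*t) t^(a-1) (1+t)^(b-a-1) dt. Writing the integrand as a gamma density times (1+t)^(b-a-1) keeps it on a scale that integrate() handles well. The helper name tricomi_u is hypothetical and is not part of fAsianOptions.

```r
# Tricomi U via its integral representation (assumes a > 0 and x > 0):
#   U(a, b, x) = x^(-a) * E[(1 + t)^(b - a - 1)],  t ~ Gamma(shape = a, rate = x)
tricomi_u <- function(a, b, x) {
  stopifnot(a > 0, x > 0)
  # gamma density times the remaining factor; integrates to a value of order 1
  integrand <- function(t) dgamma(t, shape = a, rate = x) * (1 + t)^(b - a - 1)
  x^(-a) * integrate(integrand, lower = 0, upper = Inf, rel.tol = 1e-8)$value
}

u <- tricomi_u(a = 19, b = 19, x = 10)
print(u)  # compare with the Mathematica value quoted above
```

This is only a numerical check for positive real arguments, not a replacement for a general implementation.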

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] apply function to multiple list arguments

2014-10-14 Thread Carlos Nasher
Hi R helpers,

I'm struggling to apply a function to multiple lists. My function takes
a dataframe, a list of parameters and a fixed value as arguments. Now I
want to apply that function to several dataframes contained in a list, with
several lists of parameters (also contained in a list) and the fixed value.
Here's an example:


#

fix <- 2 # fixed value

x <- c(1,2,3)
y <- c(4,5,6)

df_1 <- data.frame(x,y) # first dataframe
df_2 <- 2*df_1 # second dataframe

list_df <- list(df_1,df_2) # list containing dataframes

par_1 <- list(a=5,b=10) # first list of parameters
par_2 <- list(a=6,b=11) # second list of parameters

list_par <- list(par_1,par_2) # list of lists of parameters

f <- function(data,params,z){
  res <- (data$x*params$a+data$y*params$b)*z
  return(res)
}

# result of applying function to first dataframe and first list of parameters
res_1 <- f(data = df_1, params = par_1, z = fix)
# result of applying function to second dataframe and second list of parameters
res_2 <- f(data = df_2, params = par_2, z = fix)

#

I got the lists of dataframes and parameters from an earlier call to lapply. I
was hoping to get the desired results (res_1, res_2) in a list again. I
tried mapply, but I can't get it to work. Can anybody help?

Thanks and best regards,
Carlos
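
A minimal sketch of how mapply could be used here, assuming the two lists have the same length and are matched element by element. The fixed value goes into MoreArgs so that it is not iterated over, and SIMPLIFY = FALSE keeps the results in a list:

```r
fix <- 2
df_1 <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
df_2 <- 2 * df_1
list_df  <- list(df_1, df_2)
list_par <- list(list(a = 5, b = 10), list(a = 6, b = 11))

f <- function(data, params, z) {
  (data$x * params$a + data$y * params$b) * z
}

# The i-th dataframe is paired with the i-th parameter list; z is passed unchanged
res <- mapply(f, list_df, list_par, MoreArgs = list(z = fix), SIMPLIFY = FALSE)
print(res)
```

Map(f, list_df, list_par, MoreArgs = list(z = fix)) should behave the same way, since Map is a thin wrapper around mapply with SIMPLIFY = FALSE.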



-- 
-
Carlos Nasher
Buchenstr. 12
22299 Hamburg

tel:+49 (0)40 67952962
mobil:+49 (0)175 9386725
mail:  carlos.nas...@gmail.com



Re: [R] SOLVED: Count number of consecutive zeros by group

2013-11-11 Thread Carlos Nasher
Thanks to all of you. All solutions work fine. I'm running S Ellison's
version with William's comment. Perfect for what I'm doing.

And sorry for using the name of a base R function (twice) ;-)

Cheers,
Carlos


2013/11/1 PIKAL Petr petr.pi...@precheza.cz

 Hi

 Yes you are right. This gives number of zeroes not max number of
 consecutive zeroes.

 Regards
 Petr


  -Original Message-
  From: arun [mailto:smartpink...@yahoo.com]
  Sent: Friday, November 01, 2013 2:17 PM
  To: R help
  Cc: PIKAL Petr; Carlos Nasher
  Subject: Re: [R] Count number of consecutive zeros by group
 
  I think this gives a different result than the one OP asked for:
 
  df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
  2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), x = c(1, 0, 0, 1, 0,
  0, 0, 1, 2, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0)), .Names = c("ID",
  "x"), row.names = c(NA, -22L), class = "data.frame")
 
  with(df1, sapply(split(x, ID), function(x) sum(x==0)))
 
  with(df1,tapply(x,list(ID),function(y) {rl <- rle(!y);
  max(c(0,rl$lengths[rl$values]))}))
 
 
  A.K.
 
 
  On Friday, November 1, 2013 6:01 AM, PIKAL Petr
  petr.pi...@precheza.cz wrote:
  Hi
 
  Another option is sapply/split/sum construction
 
  with(data, sapply(split(x, ID), function(x) sum(x==0)))
 
  Regards
  Petr
 
 






[R] Count number of consecutive zeros by group

2013-10-31 Thread Carlos Nasher
Dear R-helpers,

I need to count the maximum number of consecutive zero values of a variable
in a dataframe by different groups. My dataframe looks like this:

ID <- c(1,1,1,2,2,3,3,3,3)
x <- c(1,0,0,0,0,1,1,0,1)
df <- data.frame(ID=ID,x=x)
rm(ID,x)

So I want to get the max number of consecutive zeros of variable x for each
ID. I found rle() to be helpful for this task; so I did:

FUN <- function(x) {
  rles <- rle(x == 0)
}
consec <- lapply(split(df[,2],df[,1]), FUN)

consec is now a list of rle objects, one for each ID. Each contains
$lengths: int, the length of every run of consecutive values, and
$values: logi, indicating whether the values in that run are zero or not.

Unfortunately I'm not very experienced with lists. Could you help me
extract the max number of consecutive zeros for each ID and return the result as
a dataframe containing ID and max number of consecutive zeros?

Different approaches are also welcome. Since the real dataframe is quite
large, a fast solution is appreciated.

Best regards,
Carlos




Re: [R] Count number of consecutive zeros by group

2013-10-31 Thread Carlos Nasher
If I apply your function to my test data:

ID <- c(1,1,1,2,2,3,3,3,3)
x <- c(1,0,0,0,0,1,1,0,1)
data <- data.frame(ID=ID,x=x)
rm(ID,x)

f2 <- function(x) {
  max( rle(x == 0)$lengths )
}
with(data, tapply(x, ID, f2))

the result is
1 2 3
2 2 2

which is not what I'm aiming for. It should be
1 2 3
2 2 1

I think f2 does not return the max number of consecutive zeros, but the length
of the longest run of either zeros or non-zeros... Any idea how to fix this?
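
The behaviour described above follows from rle(x == 0) reporting runs of TRUE (zeros) and FALSE (non-zeros) alike. A minimal sketch of one possible fix is to keep only the runs whose value is TRUE before taking the maximum. The names dat and max_zero_run are hypothetical, and the leading 0 covers groups without any zero:

```r
dat <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                  x  = c(1, 0, 0, 0, 0, 1, 1, 0, 1))

max_zero_run <- function(x) {
  runs <- rle(x == 0)
  # runs$values is TRUE for runs of zeros; ignore the runs of non-zeros
  max(c(0L, runs$lengths[runs$values]))
}

res <- tapply(dat$x, dat$ID, max_zero_run)
out <- data.frame(ID = names(res), max_zeros = as.vector(res))
print(out)
```

For ID 3 the longest run of any kind is the two leading ones, which is why the unfiltered version reports 2 instead of 1.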


2013/10/31 S Ellison s.elli...@lgcgroup.com



  -Original Message-
  So I want to get the max number of consecutive zeros of variable x for
 each
  ID. I found rle() to be helpful for this task; so I did:
 
  FUN <- function(x) {
    rles <- rle(x == 0)
  }
  consec <- lapply(split(df[,2],df[,1]), FUN)

 You're probably better off with tapply and a function that returns what
 you want. You're probably also better off with a data frame name that isn't
 a function name, so I'll use dfr instead of df...

 dfr <- data.frame(x=rpois(500, 1.5), ID=gl(5,100)) # 5 ID groups numbered
 # 1-5, equal size but that doesn't matter for tapply

 f2 <- function(x) {
   max( rle(x == 0)$lengths )
 }
 with(dfr, tapply(x, ID, f2))


 S Ellison




Re: [R] [dfoptim] 'Error in fn(ginv(par), ...) : object 'alpha' not found'

2013-09-09 Thread Carlos Nasher
Hi Simon,

thank you for your help. You suggest using 'alpha_zero' and 'beta_zero' in
the 'L' function, but that's not exactly what I'm trying to do. Maybe it
helps if I show what the 'Likelihood_cov' model was developed from. The basic
likelihood looks like this:

### Likelihood function ###
Likelihood <- function(params, x, tx, T) {
  r <- params[1]
  alpha <- params[2]
  s <- params[3]
  beta <- params[4]
  f <- function(x, tx, T)
  {
    g <- function(y)
      (y + alpha)^(-( r + x))*(y + beta)^(-(s + 1))
    integrate(g, tx, T)$value
  }
  integral <- mdply(data, f)
  L <-
    exp(lgamma(r+x)-lgamma(r)+r*(log(alpha)-log(alpha+T))-x*log(alpha+T)+s*(log(beta)-log(beta+T)))+exp(lgamma(r+x)-lgamma(r)+r*log(alpha)+log(s)+s*log(beta)+log(integral$V1))
  f <- -sum(log(L))
  return (f)
}

Parameters to be estimated are r, alpha, s and beta. This works fine so far. Now
I intend to incorporate the covariate 'IS' into this model. I do this by
using a proportional hazards approach for the parameters 'alpha' and
'beta'. So alpha should be replaced by alpha_zero*exp(-gamma_1*IS) and
likewise beta should become beta_zero*exp(-gamma_2*IS). For that extended
model the parameters to be estimated are r, alpha_zero, s, beta_zero,
gamma_1 and gamma_2.

Therefore alpha and beta should also be replaced by
alpha_zero*exp(-gamma_1*IS) and beta_zero*exp(-gamma_2*IS) in the L function
as well as in the integrate call. I thought that would work by assigning:

data$alpha <- alpha_zero*exp(-gamma_1*IS)
data$beta <- beta_zero*exp(-gamma_2*IS)

as I did in the code I posted previously. But as you pointed out, alpha and
beta do not exist when they are used in the L function. Unfortunately I do
not understand why they do not exist and how to fix this. Maybe you could
help me out here?

Thanks in advance and best regards,
Carlos
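
A small sketch of the scoping behaviour in question, using hypothetical names (dat, broken, fixed) rather than the model above. Assigning to data$alpha inside a function adds a column to a local copy of the data frame. It does not create a variable called alpha, so a later bare reference to alpha fails unless the column is accessed explicitly or a plain local vector is created:

```r
dat <- data.frame(x = c(1, 2, 3))

broken <- function(scale) {
  dat$alpha <- scale * dat$x  # adds a column to a local copy of dat only
  sum(log(alpha))             # no variable named 'alpha' exists in this scope
}

fixed <- function(scale) {
  alpha <- scale * dat$x      # plain local vector, visible to the code below
  sum(log(alpha))
}

msg <- tryCatch(broken(2), error = function(e) conditionMessage(e))
val <- fixed(2)
print(msg)
print(val)
```

Note that beta is also the name of a base R function, so a bare reference to beta would find that function instead of raising "object not found", and the failure would then surface later with a different message.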


2013/9/3 Simon Zehnder szehn...@uni-bonn.de

 Hi Carlos,

 your problem is a wrong definition of your Likelihood function. You use
 symbols in the code (alpha, beta) which have no value assigned to them. When
 the long calculation in the last lines is assigned to L, alpha and beta do
 not exist. The code below corrects it. But you have a problem with a
 divergent integral when calling integrate, a problem you can surely fix as
 you know what your function is doing.

 Likelihood_cov <- function(params, x, tx, T, IS) {
   r <- params[1]
   alpha_zero <- params[2]
   s <- params[3]
   beta_zero <- params[4]
   gamma_1 <- params[5]
   gamma_2 <- params[6]
   data$alpha <- alpha_zero*exp(-gamma_1*IS)
   data$beta <- beta_zero*exp(-gamma_2*IS)
   f <- function(x, tx, T, alpha, beta)
   {
     g <- function(y)
       (y + alpha)^(-( r + x))*(y + beta)^(-(s + 1))
     integrate(g, tx, T)$value
   }
   integral <- mdply(data, f)
   L <-
     exp(lgamma(r+x)-lgamma(r)+r*(log(alpha_zero)-log(alpha_zero+T))-x*log(alpha_zero+T)+s*(log(beta_zero)-log(beta_zero+T)))+exp(lgamma(r+x)-lgamma(r)+r*log(alpha_zero)+log(s)+s*log(beta_zero)+log(integral$V1))
   f <- -sum(log(L))
   return (f)
 }


 Best

 Simon



[R] [dfoptim] 'Error in fn(ginv(par), ...) : object 'alpha' not found'

2013-09-03 Thread Carlos Nasher
Dear R helpers,

I have trouble properly defining a likelihood function. Thanks to your
help my basic model is running quite well, but I can't get the
enhanced version (now incorporating covariates) running.

Within my likelihood function I define a variable 'alpha'. When I want to
optimize the function I get the error message:

'Error in fn(ginv(par), ...) : object 'alpha' not found'

I think it's actually not a problem with the optimization function (nmkb),
but with the likelihood function itself. I do not understand why 'alpha' is
a missing object. 'alpha' should be part of the dataframe 'data' (as 'beta'
should be too), like 'x', 'tx' and 'T'. But it obviously isn't.

Here's a minimal example which reproduces my problem:

##

library(plyr)
library(dfoptim)

### Sample data ###
x <- c(3, 0, 2, 5, 1, 0, 0, 1, 0, 2)
tx <- c(24.57, 0.00, 26.86, 34.57, 2.14, 0.00, 0.00, 8.57, 0.00, 14.29)
T <- c(33.29, 30.71, 31.29, 34.57, 36.00, 35.43, 31.14, 33.86, 35.71, 35.86)
IS <- c(54.97, 13.97, 122.33, 110.84, 30.72, 14.96, 30.72, 20.74, 29.16,
        83.00)
data <- data.frame(x=x, tx=tx, T=T)
rm(x, tx, T)

### Likelihood function ###
Likelihood_cov <- function(params, x, tx, T, IS) {
  r <- params[1]
  alpha_zero <- params[2]
  s <- params[3]
  beta_zero <- params[4]
  gamma_1 <- params[5]
  gamma_2 <- params[6]
  data$alpha <- alpha_zero*exp(-gamma_1*IS)
  data$beta <- beta_zero*exp(-gamma_2*IS)
  f <- function(x, tx, T, alpha, beta)
  {
    g <- function(y)
      (y + alpha)^(-( r + x))*(y + beta)^(-(s + 1))
    integrate(g, tx, T)$value
  }
  integral <- mdply(data, f)
  L <-
    exp(lgamma(r+x)-lgamma(r)+r*(log(alpha)-log(alpha+T))-x*log(alpha+T)+s*(log(beta)-log(beta+T)))+exp(lgamma(r+x)-lgamma(r)+r*log(alpha)+log(s)+s*log(beta)+log(integral$V1))
  f <- -sum(log(L))
  return (f)
}

### ML optimization ###
params <- c(0.2, 5, 0.2, 5, -0.02, -0.02)
fit <- nmkb(par=params, fn=Likelihood_cov, lower=c(0.0001, 0.0001, 0.0001,
0.0001, -Inf, -Inf), upper=c(Inf, Inf, Inf, Inf, Inf, Inf), x=data$x,
tx=data$tx, T=data$T, IS=IS)

##


Maybe you could give me a hint where the flaw in my code is. Many thanks in
advance.
Carlos




[R] [optim/bbmle] function returns NA at ... distance from x

2013-08-13 Thread Carlos Nasher
Dear R helpers,

I am trying to estimate the model parameters using mle2 (bbmle package). When
I try to optimize the likelihood function, the following error message occurs:

Error in grad.default(objectivefunction, coef) :
  function returns NA at
1e-040.001013016911639890.0003166929388711890.000935163594829395 distance
from x.
In addition: Warning message:
In optimx(par = c(0.5, 10, 0.7, 10), fn = function (p)  :
  Gradient not computable after method Nelder-Mead

I can't figure out what that means exactly and how to fix it. I understand
that mle2 uses optim (or in my case optimx) to optimize the likelihood
function. As I use the Nelder-Mead method it should not be a problem if the
function returns NA at any iteration (as long as the initial values don't
return NA). Can anyone help me with that?

Here is a small example of my code that reproduces the problem:

library(plyr)
library(optimx)

### Sample data ###
x <- c(1,1,4,2,3,0,1,6,0,0)
tx <- c(30.14, 5.14, 24.43, 10.57, 25.71, 0.00, 14.14, 32.86, 0.00, 0.00)
T <- c(32.57, 29.14, 33.57, 34.71, 27.71, 38.14, 36.57, 37.71, 35.86, 30.57)
data <- data.frame(x=x, tx=tx, T=T)

### Likelihood function ###
Likelihood <- function(data, r, alpha, s, beta) {
  with(data, {
    if (r <= 0 | alpha <= 0 | s <= 0 | beta <= 0) return (NaN)
    f <- function(x, tx, T)
    {
      g <- function(y)
        (y + alpha)^(-( r + x))*(y + beta)^(-(s + 1))
      integrate(g, tx, T)$value
    }
    integral <- mdply(data, f)
    L <-
      exp(lgamma(r+x)-lgamma(r)+r*(log(alpha)-log(alpha+T))-x*log(alpha+T)+s*(log(beta)-log(beta+T)))+exp(lgamma(r+x)-lgamma(r)+r*log(alpha)+log(s)+s*log(beta)+log(integral$V1))
    f <- sum(log(L))
    return (f)
  })
}

### ML estimation function ###
Estimate_parameters_MLE <- function(data, initValues) {
  llhd <- function(r, alpha, s, beta) {
    return (Likelihood(data, r, alpha, s, beta))
  }
  library(bbmle)
  fit <- mle2(llhd, initValues, skip.hessian=TRUE, optimizer="optimx",
              method="Nelder-Mead", control=list(maxit=1e8))
  return (fit)
}

### Parameter estimation ###
### check initial parameters -- -72.75183 -- initial parameters do return value
Likelihood(data=data, r=0.5, alpha=10, s=0.7, beta=10)
MLE_estimation <- Estimate_parameters_MLE(data=data, list(r=0.5, alpha=10,
s=0.7, beta=10))

'Error in grad.default(objectivefunction, coef) :
  function returns NA at
1e-040.001013016911639890.0003166929388711890.000935163594829395 distance
from x.
In addition: Warning message:
  In optimx(par = c(0.5, 10, 0.7, 10), fn = function (p)  :
  Gradient not computable after method Nelder-Mead'


Best regards,
Carlos



[R] Numerical evaluation of integral with different bounds

2013-08-08 Thread Carlos Nasher
Hello R helpers,

I'm struggling to apply the integrate function to a data frame. Here is
an example of what I'm trying to do:

# Create data frame
x <- 0:4
tx <- 10:14
T <- 12:16
data <- data.frame(x=x, tx=tx, T=T)

# Parameter
alpha <- 10
beta <- 11

# Integral
integrand <- function(y){
  (y+alpha)^-(r+data$x)*(y+beta^-(s+1))
}

Now I want to apply the integrate function to evaluate the integral for
each row of the data frame with tx as the lower and T as the upper bound.
The respective values (and the values only) should be returned in a vector.

I want to avoid the use of a loop since the integral is part of a function
I want to optimize with optim and so speed is crucial. I tried to do this
by something like:

integral <- lapply(data$tx, integrate, f=integrand, upper=data$T)
integral2 <- sapply(integral, function(x){x[1]})
integral3 <- unlist(integral2, use.names=FALSE)

But this doesn't work properly. I'd be glad if you have any hints on how to get
this done.


Many thanks and best regards,
Carlos
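
A possible sketch with mapply, which walks over x, tx and T in parallel and returns only the integral values. It rests on two assumptions that are not in the post: r and s are given illustrative values (0.5 and 0.7), and the integrand is taken to be (y+alpha)^-(r+x) * (y+beta)^-(s+1), with the closing parenthesis moved. The names dat and row_integral are hypothetical.

```r
dat <- data.frame(x = 0:4, tx = 10:14, T = 12:16)

alpha <- 10
beta  <- 11
r <- 0.5  # assumed value, not given in the post
s <- 0.7  # assumed value, not given in the post

# One integral per row: x enters the integrand, tx and T are the bounds
row_integral <- function(x, tx, T) {
  integrand <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
  integrate(integrand, lower = tx, upper = T)$value
}

integral <- mapply(row_integral, dat$x, dat$tx, dat$T)
print(integral)
```

mapply still loops internally, so this mainly tidies the code. Whether it is fast enough inside optim would need to be measured.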




[R] Set a zero at minimum row by group

2012-12-18 Thread Carlos Nasher
Dear R Helpers,

I'm struggling with a data preparation problem. I feel that it is quite
an easy task, but I can't get it done. I hope you can help me with that.

I have a data frame looking like this:

ID <- c(1,1,1,2,2,3,3,3,3)
T <- c(1,2,3,1,4,3,5,6,8)
x <- rep(1,9)
df <- data.frame(ID,T,x)

df
ID T x
1 1 1
1 2 1
1 3 1
2 1 1
2 4 1
3 3 1
3 5 1
3 6 1
3 8 1

I want to manipulate the x column in a way that for each customer (ID) at
the minimum of T the x value is set to zero. The result should look like
this:

ID T x x_new
1 1 1 0
1 2 1 1
1 3 1 1
2 1 1 0
2 4 1 1
3 3 1 0
3 5 1 1
3 6 1 1
3 8 1 1

I already tried the aggregate() and apply() functions, but I don't get the
result I'm looking for. I would be glad if you could help me out.

Best regards,
Carlos
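
One possible approach, sketched with base R's ave(), which returns the group-wise minimum of T aligned with the original rows. Rows where T equals that minimum get 0, all others keep their x value. The helper variable min_T is hypothetical, and ties for the minimum within an ID would all be set to zero:

```r
df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                 T  = c(1, 2, 3, 1, 4, 3, 5, 6, 8),
                 x  = rep(1, 9))

# Minimum of T within each ID, repeated for every row of that ID
min_T <- ave(df$T, df$ID, FUN = min)

# Zero at the row holding the group minimum, original x elsewhere
df$x_new <- ifelse(df$T == min_T, 0, df$x)
print(df)
```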
