[R] Hypergeometric Function seems to give wrong results
Hello R helpers,

I need to evaluate the confluent hypergeometric function of the second kind (Tricomi's function). For this I'm using the kummerU() function from the fAsianOptions package. It seems to me that kummerU() gives wrong results. Here's an example:

    library(fAsianOptions)
    kummerU(a = 19, b = 19, x = 10)

R gives 1838.298 for the real part. If I use Mathematica via the Wolfram site (http://functions.wolfram.com/webMathematica/FunctionEvaluation.jsp?name=HypergeometricU), the result is 3.52603e-20, which is more reasonable in the context of my analysis. Can anyone help me compute the correct values within R?

Best regards,
Carlos

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
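A minimal sketch of a package-free workaround: for a > 0 and x > 0, Tricomi's U has the standard integral representation U(a, b, x) = 1/Gamma(a) * Integral_0^Inf exp(-x*t) * t^(a-1) * (1+t)^(b-a-1) dt, which base R's integrate() can evaluate directly. The function name hyperg_U below is just a label chosen here, not an existing R function; the log-scale integrand is used to avoid overflow for large a.

```r
# Tricomi confluent hypergeometric function U(a, b, x) via its integral
# representation (valid for a > 0, x > 0); a sketch, not a general-purpose
# implementation.
hyperg_U <- function(a, b, x) {
  # unnormalized integrand evaluated on the log scale for stability
  f <- function(t) exp(-x * t + (a - 1) * log(t) + (b - a - 1) * log1p(t))
  I <- integrate(f, lower = 0, upper = Inf,
                 rel.tol = 1e-10, abs.tol = 1e-12)$value
  exp(log(I) - lgamma(a))  # divide by gamma(a) without overflow
}

u <- hyperg_U(19, 19, 10)  # approx 3.5e-20, in line with Mathematica's 3.52603e-20
```

This reproduces the value Mathematica reports for the example in the post, which supports the suspicion that kummerU() is returning a wrong result here.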
[R] apply function to multiple list arguments
Hi R helpers,

I'm struggling with how to apply a function to multiple lists. My function takes a data frame, a list of parameters and a fixed value as arguments. Now I want to apply that function to several data frames contained in a list, with several parameter lists (also contained in a list) and the fixed value. Here's an example:

    fix <- 2                        # fixed value
    x <- c(1, 2, 3)
    y <- c(4, 5, 6)
    df_1 <- data.frame(x, y)        # first data frame
    df_2 <- 2 * df_1                # second data frame
    list_df <- list(df_1, df_2)     # list containing the data frames

    par_1 <- list(a = 5, b = 10)    # first list of parameters
    par_2 <- list(a = 6, b = 11)    # second list of parameters
    list_par <- list(par_1, par_2)  # list of parameter lists

    f <- function(data, params, z) {
      res <- (data$x * params$a + data$y * params$b) * z
      return(res)
    }

    res_1 <- f(data = df_1, params = par_1, z = fix)  # result for first data frame, first parameter list
    res_2 <- f(data = df_2, params = par_2, z = fix)  # result for second data frame, second parameter list

I got the list of data frames and the list of parameter lists from an earlier use of lapply(). I was hoping to get the desired results (res_1, res_2) again in a list. I tried mapply(), but I can't get it running. Can anybody help?

Thanks and best regards,
Carlos

--
Carlos Nasher
Buchenstr. 12
22299 Hamburg
tel: +49 (0)40 67952962
mobil: +49 (0)175 9386725
mail: carlos.nas...@gmail.com
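A minimal sketch of one way this can be done (not from the thread): mapply() walks list_df and list_par in parallel, MoreArgs passes the fixed value to every call, and SIMPLIFY = FALSE keeps the results as a list. The objects below mirror the example in the post.

```r
# Rebuild the example inputs from the post
fix <- 2
df_1 <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
df_2 <- 2 * df_1
list_df  <- list(df_1, df_2)
list_par <- list(list(a = 5, b = 10), list(a = 6, b = 11))

f <- function(data, params, z) (data$x * params$a + data$y * params$b) * z

# One call of f per (data frame, parameter list) pair; z is fixed for all
res_list <- mapply(f, data = list_df, params = list_par,
                   MoreArgs = list(z = fix), SIMPLIFY = FALSE)
# res_list[[1]] is c(90, 120, 150); res_list[[2]] is c(200, 268, 336)
```

Map(f, data = list_df, params = list_par, MoreArgs = list(z = fix)) is an equivalent spelling, since Map() is a thin wrapper around mapply() with SIMPLIFY = FALSE.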
Re: [R] SOLVED: Count number of consecutive zeros by group
Thanks to all of you. All solutions work fine. I'm running S Ellison's version with William's comment. Perfect for what I'm doing. And sorry for using a name the same as a base R function (twice) ;-)

Cheers,
Carlos

2013/11/1 PIKAL Petr petr.pi...@precheza.cz

 Hi

 Yes, you are right. This gives the number of zeroes, not the max number of consecutive zeroes.

 Regards
 Petr

 -----Original Message-----
 From: arun [mailto:smartpink...@yahoo.com]
 Sent: Friday, November 01, 2013 2:17 PM
 To: R help
 Cc: PIKAL Petr; Carlos Nasher
 Subject: Re: [R] Count number of consecutive zeros by group

 I think this gives a different result than the one the OP asked for:

     df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
       2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
       x = c(1, 0, 0, 1, 0, 0, 0, 1, 2, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0)),
       .Names = c("ID", "x"), row.names = c(NA, -22L), class = "data.frame")

     with(df1, sapply(split(x, ID), function(x) sum(x == 0)))
     with(df1, tapply(x, list(ID), function(y) {
       rl <- rle(!y)
       max(c(0, rl$lengths[rl$values]))
     }))

 A.K.

 On Friday, November 1, 2013 6:01 AM, PIKAL Petr petr.pi...@precheza.cz wrote:

  Hi

  Another option is a sapply/split/sum construction:

      with(data, sapply(split(x, ID), function(x) sum(x == 0)))

  Regards
  Petr

  -----Original Message-----
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
  On Behalf Of Carlos Nasher
  Sent: Thursday, October 31, 2013 6:46 PM
  To: S Ellison
  Cc: r-help@r-project.org
  Subject: Re: [R] Count number of consecutive zeros by group

   If I apply your function to my test data:

       ID <- c(1,1,1,2,2,3,3,3,3)
       x  <- c(1,0,0,0,0,1,1,0,1)
       data <- data.frame(ID = ID, x = x)
       rm(ID, x)
       f2 <- function(x) { max(rle(x == 0)$lengths) }
       with(data, tapply(x, ID, f2))

   the result is

       1 2 3
       2 2 2

   which is not what I'm aiming for. It should be

       1 2 3
       2 2 1

   I think f2 does not return the max run of consecutive zeros, but the max run of any consecutive value... Any idea how to fix this?

   2013/10/31 S Ellison s.elli...@lgcgroup.com

    You're probably better off with tapply and a function that returns what you want. You're probably also better off with a data frame name that isn't a function name, so I'll use dfr instead of df...

        dfr <- data.frame(x = rpois(500, 1.5), ID = gl(5, 100))
        # 5 ID groups numbered 1-5; equal size, but that doesn't matter for tapply
        f2 <- function(x) { max(rle(x == 0)$lengths) }
        with(dfr, tapply(x, ID, f2))

    S Ellison

    *** This email and any attachments are confidential. Any u...{{dropped:24}}
[R] Count number of consecutive zeros by group
Dear R-helpers,

I need to count the maximum number of consecutive zero values of a variable in a data frame, by group. My data frame looks like this:

    ID <- c(1,1,1,2,2,3,3,3,3)
    x  <- c(1,0,0,0,0,1,1,0,1)
    df <- data.frame(ID = ID, x = x)
    rm(ID, x)

So I want to get the max number of consecutive zeros of variable x for each ID. I found rle() to be helpful for this task, so I did:

    FUN <- function(x) {
      rles <- rle(x == 0)
    }
    consec <- lapply(split(df[, 2], df[, 1]), FUN)

consec is now a list of rle objects, one per ID, each containing $lengths (integer: the length of each run) and $values (logical: whether the run consists of zeros). Unfortunately I'm not very experienced with lists. Could you help me extract the max number of consecutive zeros for each ID and return the result as a data frame containing ID and the max number of consecutive zeros? Different approaches are also welcome. Since the real data frame is quite large, a fast solution is appreciated.

Best regards,
Carlos
Re: [R] Count number of consecutive zeros by group
If I apply your function to my test data:

    ID <- c(1,1,1,2,2,3,3,3,3)
    x  <- c(1,0,0,0,0,1,1,0,1)
    data <- data.frame(ID = ID, x = x)
    rm(ID, x)
    f2 <- function(x) { max(rle(x == 0)$lengths) }
    with(data, tapply(x, ID, f2))

the result is

    1 2 3
    2 2 2

which is not what I'm aiming for. It should be

    1 2 3
    2 2 1

I think f2 does not return the max run of consecutive zeros, but the max run of any consecutive value... Any idea how to fix this?

2013/10/31 S Ellison s.elli...@lgcgroup.com

 -----Original Message-----
  So I want to get the max number of consecutive zeros of variable x for each ID. I found rle() to be helpful for this task; so I did:

      FUN <- function(x) {
        rles <- rle(x == 0)
      }
      consec <- lapply(split(df[,2], df[,1]), FUN)

 You're probably better off with tapply and a function that returns what you want. You're probably also better off with a data frame name that isn't a function name, so I'll use dfr instead of df...

     dfr <- data.frame(x = rpois(500, 1.5), ID = gl(5, 100))
     # 5 ID groups numbered 1-5; equal size, but that doesn't matter for tapply
     f2 <- function(x) { max(rle(x == 0)$lengths) }
     with(dfr, tapply(x, ID, f2))

 S Ellison

 *** This email and any attachments are confidential. Any u...{{dropped:24}}
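A minimal sketch of the fix the poster is asking for: rle(x == 0) encodes runs of both zeros and non-zeros, so the max must be taken only over runs whose $values entry is TRUE (runs of zeros). The c(0, ...) guard handles groups that contain no zeros at all, where the subset would otherwise be empty.

```r
# Max run of consecutive zeros per group; f2_fixed is a name chosen here
f2_fixed <- function(x) {
  r <- rle(x == 0)
  max(c(0, r$lengths[r$values]))  # only runs of zeros; 0 if none exist
}

ID <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
x  <- c(1, 0, 0, 0, 0, 1, 1, 0, 1)
res <- tapply(x, ID, f2_fixed)
# res is c(`1` = 2, `2` = 2, `3` = 1), the output the poster expected
```

This matches the `rl$lengths[rl$values]` idea that appears later in the solved thread.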
Re: [R] [dfoptim] 'Error in fn(ginv(par), ...) : object 'alpha' not found'
Hi Simon,

thank you for your help. You suggest using 'alpha_zero' and 'beta_zero' in the 'L' function. But that's not exactly what I'm trying to do. Maybe it helps if I show where the 'Likelihood_cov' model developed from. The basic likelihood looks like this:

    ### Likelihood function ###
    Likelihood <- function(params, x, tx, T) {
      r     <- params[1]
      alpha <- params[2]
      s     <- params[3]
      beta  <- params[4]
      f <- function(x, tx, T) {
        g <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
        integrate(g, tx, T)$value
      }
      integral <- mdply(data, f)
      L <- exp(lgamma(r+x) - lgamma(r) + r*(log(alpha) - log(alpha+T)) -
               x*log(alpha+T) + s*(log(beta) - log(beta+T))) +
           exp(lgamma(r+x) - lgamma(r) + r*log(alpha) + log(s) +
               s*log(beta) + log(integral$V1))
      f <- -sum(log(L))
      return(f)
    }

Parameters to be estimated are r, alpha, s and beta. Works fine so far. Now I intend to incorporate the covariate 'IS' into this model. I do this via a proportional hazards approach for the parameters 'alpha' and 'beta'. So alpha should be replaced by alpha_zero*exp(-gamma_1*IS), and likewise beta shall become beta_zero*exp(-gamma_2*IS). For that extended model the parameters to be estimated are r, alpha_zero, s, beta_zero, gamma_1 and gamma_2. Therefore alpha and beta shall also be replaced in the L function by alpha_zero*exp(-gamma_1*IS) and beta_zero*exp(-gamma_2*IS), as well as within the integrate function. I thought that would work by assigning:

    data$alpha <- alpha_zero*exp(-gamma_1*IS)
    data$beta  <- beta_zero*exp(-gamma_2*IS)

as I did in the code I posted previously. But as you pointed out, alpha and beta do not exist when they are referenced in the L function. Unfortunately I do not understand why they do not exist, or how to fix this. Maybe you could help me out here?

Thanks in advance and best regards,
Carlos

2013/9/3 Simon Zehnder szehn...@uni-bonn.de

 Hi Carlos,

 your problem is a wrong definition of your Likelihood function. You call symbols in the code (alpha, beta) which have no value assigned to them.
 When the long calculation in the last lines is assigned to L, alpha and beta do not exist. The code below corrects it. But you have a problem with a divergent integral when calling integrate() -- a problem you can surely fix, as you know what your function is doing.

     Likelihood_cov <- function(params, x, tx, T, IS) {
       r          <- params[1]
       alpha_zero <- params[2]
       s          <- params[3]
       beta_zero  <- params[4]
       gamma_1    <- params[5]
       gamma_2    <- params[6]
       data$alpha <- alpha_zero * exp(-gamma_1 * IS)
       data$beta  <- beta_zero * exp(-gamma_2 * IS)
       f <- function(x, tx, T, alpha, beta) {
         g <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
         integrate(g, tx, T)$value
       }
       integral <- mdply(data, f)
       L <- exp(lgamma(r+x) - lgamma(r) + r*(log(alpha_zero) - log(alpha_zero+T)) -
                x*log(alpha_zero+T) + s*(log(beta_zero) - log(beta_zero+T))) +
            exp(lgamma(r+x) - lgamma(r) + r*log(alpha_zero) + log(s) +
                s*log(beta_zero) + log(integral$V1))
       f <- -sum(log(L))
       return(f)
     }

 Best
 Simon

 On Sep 3, 2013, at 1:28 PM, Carlos Nasher carlos.nas...@googlemail.com wrote:

  Dear R helpers,

  I have problems properly defining a likelihood function. Thanks to your help my basic model is running quite well, but I have problems getting the enhanced version (now incorporating covariates) running. Within my likelihood function I define a variable 'alpha'. When I want to optimize the function I get the error message:

      'Error in fn(ginv(par), ...) : object 'alpha' not found'

  I think it's actually not a problem with the optimization function (nmkb), but with the likelihood function itself. I do not understand why 'alpha' is a missing object. 'alpha' should be part of the data frame 'data' (as 'beta' should be too), like 'x', 'tx' and 'T'. But it obviously isn't.
  Here's a minimal example which reproduces my problem:

      library(plyr)
      library(dfoptim)

      ### Sample data ###
      x  <- c(3, 0, 2, 5, 1, 0, 0, 1, 0, 2)
      tx <- c(24.57, 0.00, 26.86, 34.57, 2.14, 0.00, 0.00, 8.57, 0.00, 14.29)
      T  <- c(33.29, 30.71, 31.29, 34.57, 36.00, 35.43, 31.14, 33.86, 35.71, 35.86)
      IS <- c(54.97, 13.97, 122.33, 110.84, 30.72, 14.96, 30.72, 20.74, 29.16, 83.00)
      data <- data.frame(x = x, tx = tx, T = T)
      rm(x, tx, T)

      ### Likelihood function ###
      Likelihood_cov <- function(params, x, tx, T, IS) {
        r          <- params[1]
        alpha_zero <- params[2]
        s          <- params[3]
        beta_zero  <- params[4]
        gamma_1    <- params[5]
        gamma_2    <- params[6]
        data$alpha <- alpha_zero * exp(-gamma_1 * IS)
        data$beta  <- beta_zero * exp(-gamma_2 * IS)
        f <- function(x, tx, T, alpha, beta) {
          g <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
          integrate(g, tx, T)$value
        }
        integral <- mdply(data, f)
        L <- exp(lgamma(r+x) - lgamma(r) + r*(log(alpha) - log(alpha+T)) -
                 x*log(alpha+T) + s*(log(beta) - log(beta+T))) +
             exp(lgamma(r+x) - lgamma(r) + r*log(alpha) + log(s) +
                 s*log(beta) + log(integral$V1))
        f <- -sum(log(L))
        return(f)
      }
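A minimal sketch of the scoping issue at the heart of this thread (variable names are illustrative, not the poster's full model): after `data$alpha <- ...`, `alpha` exists only as a *column* of the data frame `data`, never as a free-standing variable in the function's environment. So a bare `alpha` in the later `L <- ...` expression fails with "object 'alpha' not found", while `data$alpha` works.

```r
# Reproduce the error in isolation, using values from the sample data
data <- data.frame(IS = c(54.97, 13.97, 122.33))
alpha_zero <- 5
gamma_1 <- -0.02
data$alpha <- alpha_zero * exp(-gamma_1 * data$IS)  # creates a COLUMN only

bare  <- try(log(alpha), silent = TRUE)  # fails: no object 'alpha' in scope
fixed <- log(data$alpha)                 # works: the column is referenced explicitly
```

The equivalent fix in the posted likelihood is to write `data$alpha` and `data$beta` (or `with(data, ...)`) wherever the `L` expression uses the covariate-dependent parameters.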
[R] [dfoptim] 'Error in fn(ginv(par), ...) : object 'alpha' not found'
Dear R helpers,

I have problems properly defining a likelihood function. Thanks to your help my basic model is running quite well, but I have problems getting the enhanced version (now incorporating covariates) running. Within my likelihood function I define a variable 'alpha'. When I want to optimize the function I get the error message:

    'Error in fn(ginv(par), ...) : object 'alpha' not found'

I think it's actually not a problem with the optimization function (nmkb), but with the likelihood function itself. I do not understand why 'alpha' is a missing object. 'alpha' should be part of the data frame 'data' (as 'beta' should be too), like 'x', 'tx' and 'T'. But it obviously isn't. Here's a minimal example which reproduces my problem:

    library(plyr)
    library(dfoptim)

    ### Sample data ###
    x  <- c(3, 0, 2, 5, 1, 0, 0, 1, 0, 2)
    tx <- c(24.57, 0.00, 26.86, 34.57, 2.14, 0.00, 0.00, 8.57, 0.00, 14.29)
    T  <- c(33.29, 30.71, 31.29, 34.57, 36.00, 35.43, 31.14, 33.86, 35.71, 35.86)
    IS <- c(54.97, 13.97, 122.33, 110.84, 30.72, 14.96, 30.72, 20.74, 29.16, 83.00)
    data <- data.frame(x = x, tx = tx, T = T)
    rm(x, tx, T)

    ### Likelihood function ###
    Likelihood_cov <- function(params, x, tx, T, IS) {
      r          <- params[1]
      alpha_zero <- params[2]
      s          <- params[3]
      beta_zero  <- params[4]
      gamma_1    <- params[5]
      gamma_2    <- params[6]
      data$alpha <- alpha_zero * exp(-gamma_1 * IS)
      data$beta  <- beta_zero * exp(-gamma_2 * IS)
      f <- function(x, tx, T, alpha, beta) {
        g <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
        integrate(g, tx, T)$value
      }
      integral <- mdply(data, f)
      L <- exp(lgamma(r+x) - lgamma(r) + r*(log(alpha) - log(alpha+T)) -
               x*log(alpha+T) + s*(log(beta) - log(beta+T))) +
           exp(lgamma(r+x) - lgamma(r) + r*log(alpha) + log(s) +
               s*log(beta) + log(integral$V1))
      f <- -sum(log(L))
      return(f)
    }

    ### ML optimization ###
    params <- c(0.2, 5, 0.2, 5, -0.02, -0.02)
    fit <- nmkb(par = params, fn = Likelihood_cov,
                lower = c(0.0001, 0.0001, 0.0001, 0.0001, -Inf, -Inf),
                upper = c(Inf, Inf, Inf, Inf, Inf, Inf),
                x = data$x, tx = data$tx, T = data$T, IS = IS)

Maybe you could give me a hint where the flaw in my code is. Many thanks in advance.

Carlos
[R] [optim/bbmle] function returns NA at ... distance from x
Dear R helpers,

I try to find the model parameters using mle2() (bbmle package). As I try to optimize the likelihood function, the following error message occurs:

    Error in grad.default(objectivefunction, coef) :
      function returns NA at 1e-04 0.00101301691163989 0.000316692938871189 0.000935163594829395 distance from x.
    In addition: Warning message:
    In optimx(par = c(0.5, 10, 0.7, 10), fn = function (p) :
      Gradient not computable after method Nelder-Mead

I can't figure out what that means exactly and how to fix it. I understand that mle2 uses optim (or in my case optimx) to optimize the likelihood function. As I use the Nelder-Mead method, it should not be a problem if the function returns NA at some iteration (as long as the initial values don't return NA). Can anyone help me with that? Here is a small example of my code that reproduces the problem:

    library(plyr)
    library(optimx)

    ### Sample data ###
    x  <- c(1, 1, 4, 2, 3, 0, 1, 6, 0, 0)
    tx <- c(30.14, 5.14, 24.43, 10.57, 25.71, 0.00, 14.14, 32.86, 0.00, 0.00)
    T  <- c(32.57, 29.14, 33.57, 34.71, 27.71, 38.14, 36.57, 37.71, 35.86, 30.57)
    data <- data.frame(x = x, tx = tx, T = T)

    ### Likelihood function ###
    Likelihood <- function(data, r, alpha, s, beta) {
      with(data, {
        if (r <= 0 | alpha <= 0 | s <= 0 | beta <= 0) return(NaN)
        f <- function(x, tx, T) {
          g <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
          integrate(g, tx, T)$value
        }
        integral <- mdply(data, f)
        L <- exp(lgamma(r+x) - lgamma(r) + r*(log(alpha) - log(alpha+T)) -
                 x*log(alpha+T) + s*(log(beta) - log(beta+T))) +
             exp(lgamma(r+x) - lgamma(r) + r*log(alpha) + log(s) +
                 s*log(beta) + log(integral$V1))
        f <- sum(log(L))
        return(f)
      })
    }

    ### ML estimation function ###
    Estimate_parameters_MLE <- function(data, initValues) {
      llhd <- function(r, alpha, s, beta) {
        return(Likelihood(data, r, alpha, s, beta))
      }
      library(bbmle)
      fit <- mle2(llhd, initValues, skip.hessian = TRUE, optimizer = "optimx",
                  method = "Nelder-Mead", control = list(maxit = 1e8))
      return(fit)
    }

    ### Parameter estimation ###
    Likelihood(data = data, r = 0.5, alpha = 10, s = 0.7, beta = 10)
    ### check initial parameters -> -72.75183 -> initial parameters do return a value

    MLE_estimation <- Estimate_parameters_MLE(data = data,
                        list(r = 0.5, alpha = 10, s = 0.7, beta = 10))

    'Error in grad.default(objectivefunction, coef) :
      function returns NA at 1e-04 0.00101301691163989 0.000316692938871189 0.000935163594829395 distance from x.
    In addition: Warning message:
    In optimx(par = c(0.5, 10, 0.7, 10), fn = function (p) :
      Gradient not computable after method Nelder-Mead'

Best regards,
Carlos
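A minimal sketch of one common workaround (not from the thread): instead of returning NaN when a parameter goes non-positive, optimize over log-parameters, so the back-transformed values stay strictly positive and the objective never returns NaN. That keeps finite-difference machinery (like the gradient check that failed above) on valid ground. The toy objective below is a stand-in, not the poster's likelihood.

```r
# Optimize over theta = log(params); exp() guarantees positivity
neg_loglik <- function(theta) {
  r     <- exp(theta[1])   # back-transform: always > 0
  alpha <- exp(theta[2])
  (r - 0.5)^2 + (alpha - 10)^2   # illustrative stand-in for the real -logL
}

fit <- optim(par = log(c(1, 1)), fn = neg_loglik, method = "Nelder-Mead")
est <- exp(fit$par)  # estimates on the original scale, approx c(0.5, 10)
```

With this parameterization there is no need for the `if (... <= 0) return(NaN)` guard at all, and any optimizer (or gradient computation) can probe freely in all directions.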
[R] Numercial evaluation of intgral with different bounds
Hello R helpers,

I'm struggling with how to apply the integrate() function to a data frame. Here is an example of what I'm trying to do:

    # Create data frame
    x  <- 0:4
    tx <- 10:14
    T  <- 12:16
    data <- data.frame(x = x, tx = tx, T = T)

    # Parameters
    alpha <- 10
    beta  <- 11

    # Integrand
    integrand <- function(y) {
      (y + alpha)^(-(r + data$x)) * (y + beta)^(-(s + 1))
    }

Now I want to apply the integrate function to evaluate the integral for each row of the data frame, with tx as the lower and T as the upper bound. The respective values (and the values only) should be returned in a vector. I want to avoid a loop, since the integral is part of a function I want to optimize with optim, so speed is crucial. I tried something like:

    integral  <- lapply(data$tx, integrate, f = integrand, upper = data$T)
    integral2 <- sapply(integral, function(x) { x[1] })
    integral3 <- unlist(integral2, use.names = FALSE)

But this doesn't work properly. I'd be glad for any hints on how to get this done.

Many thanks and best regards,
Carlos
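A minimal sketch of one way to do this: integrate() handles one pair of bounds per call, so run it once per row with mapply(), passing that row's x into the integrand and its tx/T as the bounds. The parameters r and s are not defined in the post, so illustrative values are assumed here.

```r
# Parameters from the post, plus assumed values for r and s (not in the post)
alpha <- 10
beta  <- 11
r <- 0.5
s <- 0.7

data <- data.frame(x = 0:4, tx = 10:14, T = 12:16)

# One integral per row: x enters the integrand, tx/T are the bounds
row_integral <- function(x, tx, T) {
  g <- function(y) (y + alpha)^(-(r + x)) * (y + beta)^(-(s + 1))
  integrate(g, lower = tx, upper = T)$value
}

vals <- mapply(row_integral, data$x, data$tx, data$T)
# vals is a plain numeric vector, one value per row of 'data'
```

mapply() still loops internally, but the per-call overhead is small next to the quadrature itself; this is the usual idiom when the bounds differ per row.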
[R] Set a zero at minimum row by group
Dear R Helpers,

I'm struggling with a data preparation problem. I feel it is quite an easy task, but I can't get it done. I hope you can help me with it. I have a data frame looking like this:

    ID <- c(1,1,1,2,2,3,3,3,3)
    T  <- c(1,2,3,1,4,3,5,6,8)
    x  <- rep(1,9)
    df <- data.frame(ID, T, x)
    df
      ID T x
    1  1 1 1
    2  1 2 1
    3  1 3 1
    4  2 1 1
    5  2 4 1
    6  3 3 1
    7  3 5 1
    8  3 6 1
    9  3 8 1

I want to manipulate the x column so that for each customer (ID), the x value at the minimum of T is set to zero. The result should look like this:

      ID T x x_new
    1  1 1 1     0
    2  1 2 1     1
    3  1 3 1     1
    4  2 1 1     0
    5  2 4 1     1
    6  3 3 1     0
    7  3 5 1     1
    8  3 6 1     1
    9  3 8 1     1

I already tried the aggregate() and apply() functions, but I don't get the result I'm looking for. I would be glad if you could help me out.

Best regards,
Carlos
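A minimal sketch of one base-R answer: ave() computes the group-wise minimum of T aligned with the original rows, so x can be zeroed exactly where T equals its group minimum. This assumes the minimum T is unique within each ID, as it is in the example.

```r
# Rebuild the example data frame from the post
df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                 T  = c(1, 2, 3, 1, 4, 3, 5, 6, 8),
                 x  = rep(1, 9))

# ave() returns min(T) per ID, repeated for every row of that ID
df$x_new <- ifelse(df$T == ave(df$T, df$ID, FUN = min), 0, df$x)
# df$x_new is c(0, 1, 1, 0, 1, 0, 1, 1, 1), matching the desired output
```

If ties in T were possible within an ID, every tied row would be zeroed; zeroing only the first would need something like `!duplicated(df$ID[...])` on top.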