[R] Which external functions are called in a package? [Solved]
By using the *pkgapi* package and with quite a bit of manual work I was able
to (almost) automatically find all function calls to my package in 150
depending, importing, or suggesting packages. It took two days to overcome
all the obstacles during the process -- and was a very rewarding experience,
and one where I learned a lot about R again. I also found two bugs in
*pkgapi* that I will report.

I will write up a short notice on RPubs for those interested in doing the
same for their packages. (By the way, I like the somewhat stricter rules for
package dependencies and vignettes on Bioconductor.)

NB: `trapz`, i.e. the trapezoidal integration formula, seems to be the
numerical function missed the most in base R.

Thanks for all the help.  --HW

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
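[For readers wondering about the `trapz` remark: base R indeed has no
built-in trapezoidal rule. A minimal sketch, written for this note -- the
function name and one-liner are an illustration, not code from base R or
from the pracma package:]

```r
# Trapezoidal integration: approximate the integral of y over the grid x
# by summing the areas of the trapezoids between consecutive grid points.
trapz <- function(x, y) {
  idx <- seq_along(x)[-1]
  sum((x[idx] - x[idx - 1]) * (y[idx] + y[idx - 1])) / 2
}

x <- seq(0, pi, length.out = 1001)
trapz(x, sin(x))    # close to the exact integral of sin on [0, pi], i.e. 2
```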
Re: [R] Fwd: Which external functions are called in a package?
Jeff, the problem is: there I see the packages that depend on mine, but not
which functions are used. Or maybe I misunderstood your comment.

On Fri, 3 Jan 2020 at 22:49, Jeff Newmiller wrote:
>
> If you are so lucky as to have this problem, perhaps you could take a look
> at the reverse dependencies on your packages' CRAN web page.
>
> On January 3, 2020 1:45:42 PM PST, Hans W Borchers wrote:
> >You are absolutely right. I forgot that there is a difference between
> >the unpacked and the installed directory of a package. The
> >documentation of the *pkgapi* package in development is quite scarce
> >and does not mention the details. Thanks for the tip.
> >
> >--HW
> >
> >PS: Still I would like to learn about other approaches for listing
> >external calls of a package. This must be a general problem for
> >package developers whose packages are depended on by many other CRAN
> >packages.
> >
> >On Fri, 3 Jan 2020 at 20:19, Duncan Murdoch wrote:
> >>
> >> I'm not really familiar with pkgapi, but I believe the first argument
> >> is to the source directory for a package. It looks as though you are
> >> pointing to the installed copy of it.
> >>
> >> Duncan Murdoch
>
> --
> Sent from my phone. Please excuse my brevity.
[R] Fwd: Which external functions are called in a package?
You are absolutely right. I forgot that there is a difference between the
unpacked and the installed directory of a package. The documentation of the
*pkgapi* package in development is quite scarce and does not mention the
details. Thanks for the tip.

--HW

PS: Still I would like to learn about other approaches for listing external
calls of a package. This must be a general problem for package developers
whose packages are depended on by many other CRAN packages.

On Fri, 3 Jan 2020 at 20:19, Duncan Murdoch wrote:
>
> I'm not really familiar with pkgapi, but I believe the first argument is
> to the source directory for a package. It looks as though you are
> pointing to the installed copy of it.
>
> Duncan Murdoch
[R] Which external functions are called in a package?
How can I find out which functions of my package A are called within another
package B that depends on, imports, or suggests package A? And more
specifically, which functions in B are calling functions in A?

I tried to utilize the *pkgapi* package, but get error messages like

    > map_package("./R/x86_64-pc-linux-gnu-library/3.6/cranlogs")
    Error: --> in process 22909

(1) What do these error messages mean?
(2) Are there easier ways to get this information?

Thanks in advance.  --HW
Re: [R] class of 'try' if error is raised
Yes, CRAN did accept 'if(inherits(e, "try-error"))'. I remember now, when I
used the try-construct the first time, I also saw tryCatch and found it a
bit too extensive for my purposes. Will look at it again when needed.

Thanks to you and Enrico.

On Sun, 15 Dec 2019 at 16:03, Bert Gunter wrote:
>
> See ?try which links you to ?tryCatch for the preferred approach.
>
> Alternatively: if(inherits(e, "try-error")) ## should work and satisfy
> CRAN
>
> -- Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Sun, Dec 15, 2019 at 6:21 AM Hans W Borchers wrote:
>>
>> I have been informed by CRAN administrators that the development
>> version of R issues warnings for my package(s). Some are easy to mend
>> (such as Internet links not working anymore), but this one I don't
>> know how to avoid:
>>
>>     Error in if (class(e) == "try-error") { : the condition has length > 1
>>
>> I understand that `class` can return more than one value. But what
>> would be the appropriate way to catch an error in a construct like
>> this:
>>
>>     e <- try(b <- solve(a), silent=TRUE)
>>     if (class(e) == "try-error") {
>>         # ... do something
>>     }
>>
>> Should I instead compare the class with "matrix" or "array" (or both)?
>> That is, in each case check with a correct result class instead of an
>> error?
>>
>> Thanks, HW
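[For readers hitting the same CRAN NOTE, a minimal sketch of both idioms
mentioned above; the singular matrix `a` is a made-up toy example, not data
from the thread:]

```r
a <- matrix(1, 2, 2)           # singular, so solve(a) raises an error

# Idiom 1: try() plus inherits(), the form CRAN accepts
e <- try(b <- solve(a), silent = TRUE)
if (inherits(e, "try-error")) {
  b <- NULL                    # ... do something
}

# Idiom 2: tryCatch() with an explicit error handler
b2 <- tryCatch(solve(a), error = function(err) NULL)
```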
[R] class of 'try' if error is raised
I have been informed by CRAN administrators that the development version of
R issues warnings for my package(s). Some are easy to mend (such as Internet
links not working anymore), but this one I don't know how to avoid:

    Error in if (class(e) == "try-error") { : the condition has length > 1

I understand that `class` can return more than one value. But what would be
the appropriate way to catch an error in a construct like this:

    e <- try(b <- solve(a), silent=TRUE)
    if (class(e) == "try-error") {
        # ... do something
    }

Should I instead compare the class with "matrix" or "array" (or both)? That
is, in each case check with a correct result class instead of an error?

Thanks, HW
Re: [R] CRAN package NlcOptim query
The maintainer of the *NlcOptim* package told me that he has fixed the
problem and already submitted a new version to CRAN. Thanks, XianYan, for
this prompt reaction.
Re: [R] CRAN package NlcOptim query
To be corrected: `Constr_new` with a capital letter; `aeq2` is a list and
should be a matrix.

As I said last month, you can yourself combine inequality constraints with
bounds constraints as follows:

    myA <- rbind(-Constr_new, diag(-1,numel), diag(1,numel))
    myB <- c(-x_than0, rep(0,numel), rep(1,numel))

and `solnl` will return a result like this:

    sol <- NlcOptim::solnl(X = c(InputTM), objfun = obj_F,
                           A = myA, B = myB,
                           Aeq = as.matrix(aeq2), Beq = beq2)
    c(sol$par)
    [1] 0.8310997170, 0.0378150241, ..., 0.2463006547, 1.00
    sol$fn
    [1] 0.00421616

I will write to the maintainer asking why this example does not work --
supplying functioning code, maybe that will trigger a response.

Hans Werner

Please note: You are sending e-mail in HTML format which makes it almost
impossible to use as code in the R console.

On Wed, Dec 12, 2018 at 12:45 PM Hans W Borchers wrote:
>
> This is still not complete: `x_than0` is missing.
> `Constr_new` is written with a capital 'C'.
> And aeq2 is a list of column vectors, not a matrix.
> Setting the tolerance to 0 does not seem to be a good idea.
>
> Making aeq2 a matrix and adding `x_than0 <- matrix(c(1, 1))`, then
>
>     aeq2 <- as.matrix(aeq2)
>     x_than0 <- matrix(c(1, 1))
>
>     NlcOptim::solnl(X=c(InputTM), objfun=obj_F, A=-Constr_new, B=-x_than0,
>                     Aeq=as.matrix(aeq2), Beq=beq2,
>                     lb=c(rep(0,numel)), ub=c(rep(1,numel)), tolX = 0)
>
> will indeed return the same error, while it runs without error if you
> either leave out the inequality constraints or the bounds constraints. So
> I guess there may be a bug when the function internally combines these
> constraints and the bounds.
>
> You could / should write to the maintainer. I know he is very responsive.
>
> For the moment, you can combine the inequality constraints and the lower
> and upper bounds yourself:
>
>     myA <- rbind(-Constr_new, diag(-1,numel), diag(1,numel))
>     myB <- c(-x_than0, rep(0,numel), rep(1,numel))
>
>     NlcOptim::solnl(X=c(InputTM), objfun=obj_F, A=myA, B=myB,
>                     Aeq=as.matrix(aeq2), Beq=beq2)
>
> returns "constraints are inconsistent, no solution!", but that may be the
> case because I don't know your `x_than` value.
Re: [R] CRAN package NlcOptim query
This is still not complete: `x_than0` is missing.
`Constr_new` is written with a capital 'C'.
And aeq2 is a list of column vectors, not a matrix.
Setting the tolerance to 0 does not seem to be a good idea.

Making aeq2 a matrix and adding `x_than0 <- matrix(c(1, 1))`, then

    aeq2 <- as.matrix(aeq2)
    x_than0 <- matrix(c(1, 1))

    NlcOptim::solnl(X=c(InputTM), objfun=obj_F, A=-Constr_new, B=-x_than0,
                    Aeq=as.matrix(aeq2), Beq=beq2,
                    lb=c(rep(0,numel)), ub=c(rep(1,numel)), tolX = 0)

will indeed return the same error, while it runs without error if you
either leave out the inequality constraints or the bounds constraints. So I
guess there may be a bug when the function internally combines these
constraints and the bounds.

You could / should write to the maintainer. I know he is very responsive.

For the moment, you can combine the inequality constraints and the lower
and upper bounds yourself:

    myA <- rbind(-Constr_new, diag(-1,numel), diag(1,numel))
    myB <- c(-x_than0, rep(0,numel), rep(1,numel))

    NlcOptim::solnl(X=c(InputTM), objfun=obj_F, A=myA, B=myB,
                    Aeq=as.matrix(aeq2), Beq=beq2)

returns "constraints are inconsistent, no solution!", but that may be the
case because I don't know your `x_than` value.
Re: [R] How can I know the Hausdorff dimensions of fractals in the 'fractalcurve' function of package 'pracma'?
> Dear Dr. Hans W. Borchers,

This is a public mailing list; do not address specific people here, everyone
can read and (possibly) answer your questions. And please send e-mail in
plain text format, not as HTML.

> I'm using your 'pracma' package. It is very useful. May I have a small
> question for the 'fractalcurve' function? How can I know the Hausdorff
> dimension for every option below?
>     c("hilbert", "sierpinski", "snowflake", "dragon", "triangle",
>       "arrowhead", "flowsnake", "molecule")
> For the "dragon" option, its Hausdorff dimension is log(4)/log(2) = 2.
> For the others, what are the Hausdorff dimensions?

I don't know. Have you tried the *fractaldim* package to find at least a
plausible numerical value for the Hausdorff dimension?

> I have found the list of some fractals by Hausdorff dimensions.
> I don't know which in the above option corresponds to which in Wikipedia.

Guess you mean the "List of fractals by Hausdorff dimension". Well,
"hilbert" is obviously the Hilbert curve and has a Hausdorff dimension of 2
-- because it is a 'space-filling' curve.[1]

Otherwise, use `fractalcurve()` to generate plots of the fractal curves and
compare with the pictures shown on the Wikipedia page. Not all the curves
may be present on this list, though.

> Thanks a lot!
> Best wishes,
> Peijian Shi

---
[1] By the way, I doubt your estimation that the dragon curve has Hausdorff
dimension 2; it looks more like 1.5236.
Re: [R] Optimisation with Normalisation Constraint
One way will be to solve this as an ordinary optimization problem with an
equality constraint. Function `alabama::auglag` can do this:

    library(alabama)
    fn  <- function(p) sum((df$y - p[1]*exp(-p[2]*df$x))^2)
    heq <- function(p) sum(p[1]*exp(-p[2]*df$x)) - 5

    # Start with initial values near first solution
    sol <- auglag(c(4, 4), fn=fn, heq=heq)

    Solution with myfit:  4.134    4.078
    Solution with auglag: 4.126763 4.017768

The equality constraint is still fulfilled:

    a <- sol$par[1]; lambda <- sol$par[2]
    sum(a*exp(-lambda*df$x))
    ## [1] 5

Plot the differences of these two solutions:

    plot(df$x, a*exp(-lambda*df$x) - predict(myfit), type='l')

Interestingly, the difference between 5 and `predict(myfit)` is *not* just
evenly spread across all 15 points.
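[Since the thread's `df` and `myfit` are not shown, here is a self-contained
base-R sketch of the same idea using an elimination trick instead of auglag:
the equality constraint sum(a*exp(-lambda*x)) == 5 determines `a` from
`lambda`, so the constrained fit reduces to a one-dimensional problem for
`optimize()`. The data frame below is invented for illustration only.]

```r
# Toy data standing in for the thread's df (not the original data)
set.seed(1)
x  <- seq(0, 2, length.out = 15)
y  <- 4 * exp(-4 * x) + rnorm(15, sd = 0.05)
df <- data.frame(x = x, y = y)

# The constraint fixes a as a function of lambda ...
a_of <- function(lambda) 5 / sum(exp(-lambda * df$x))
# ... leaving a 1-D least-squares objective in lambda alone
fn1  <- function(lambda) sum((df$y - a_of(lambda) * exp(-lambda * df$x))^2)

opt    <- optimize(fn1, c(0.1, 20))
lambda <- opt$minimum
a      <- a_of(lambda)

sum(a * exp(-lambda * df$x))   # constraint fulfilled: equals 5
```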
Re: [R] lsqlin in R package pracma
On Tue, Sep 1, 2015 at 11:24 PM, Wang, Xue, Ph.D. wrote:
>
> slsqp in R seems quite slow. Does anyone have some suggestion as how to
> speed up this?

It is no surprise that a general solver like slsqp() takes longer than
specialized quadratic solvers such as solve.QP(), ipop(), or lowRankQP().
You got good advice in this thread, even with code; so why don't you use
what has been suggested in this thread?

If in your timings you include calling lsqlin() in pracma, then this will
increase the time by at least 0.4 ms. NLopt is an external, compiled
program that cannot be accelerated from within R.

I wrote a small wrapper for all the different approaches proposed. Here is
a rough timing estimate (with microbenchmark, in milliseconds), all derived
from the MATLAB example:

                with constraints    w/o constraints
    -----------------------------------------------
    qr.solve          --               0.06  ms
    slsqp            5.5   ms          5.25  ms  [*]
    solve.QP         0.045 ms           --
    ipop             5.0   ms           --
    lowRankQP        0.16  ms           --
    -----------------------------------------------
    [*] plus 0.4 ms when using lsqlin for start values.

Because solve.QP is fastest, I append a function mimicking the MATLAB API
of lsqlin (without error handling for now) that can be used with and
without equality constraints, utilizing Berend's contribution. If this is
too slow, I guess you will have to look outside R; for example, Julia
provides convex solvers that may be fast.
--- cut --

    require(quadprog)

    constrLsqlin <- function(C, d, A, b,    # linear inequalities required
                             Aeq = NULL, beq = NULL,
                             lb = NULL, ub = NULL) {
        m <- nrow(C); n <- ncol(C)
        meq <- nrow(Aeq); neq <- ncol(Aeq)
        if (is.null(meq)) meq <- 0
        Dmat <- t(C) %*% C
        dvec <- t(C) %*% d
        if (is.null(Aeq)) {
            Amat <- -A
            bvec <- -b
        } else {
            Amat <- rbind(Aeq, -A)
            bvec <- c(beq, -b)
        }
        if (!is.null(lb)) {
            Amat <- rbind(Amat, diag(n))
            bvec <- c(bvec, lb)
        }
        if (!is.null(ub)) {
            Amat <- rbind(Amat, -diag(n))
            bvec <- c(bvec, -ub)
        }
        rslt <- solve.QP(Dmat, dvec, t(Amat), bvec, meq=meq)
        rslt$solution
    }
Re: [R] lsqlin in R package pracma
I got interested in enabling the full functionality that MATLAB's lsqlin()
has, that is with equality and bound constraints. To replace an equality
constraint with two inequality constraints will not work with solve.QP()
because it requires positive definite matrices. I will use kernlab::ipop()
instead.

To handle the full MATLAB example, add the following simple linear equality
constraint

    3x1 + 5x2 + 7x3 + 9x4 = 4

to the example above, plus lower and upper bounds -0.1 and 2.0 for all x_i.

    C <- matrix(c(
        0.9501, 0.7620, 0.6153, 0.4057,
        0.2311, 0.4564, 0.7919, 0.9354,
        0.6068, 0.0185, 0.9218, 0.9169,
        0.4859, 0.8214, 0.7382, 0.4102,
        0.8912, 0.4447, 0.1762, 0.8936), 5, 4, byrow=TRUE)
    d <- c(0.0578, 0.3528, 0.8131, 0.0098, 0.1388)
    A <- matrix(c(
        0.2027, 0.2721, 0.7467, 0.4659,
        0.1987, 0.1988, 0.4450, 0.4186,
        0.6037, 0.0152, 0.9318, 0.8462), 3, 4, byrow=TRUE)
    b <- c(0.5251, 0.2026, 0.6721)

    # Add the equality constraint to matrix A
    Aeq <- c(3, 5, 7, 9)
    beq <- 4
    A1 <- rbind(A, c(3, 5, 7, 9))
    b1 <- c(b, 4)

    lb <- rep(-0.1, 4)          # lower and upper bounds
    ub <- rep( 2.0, 4)
    r1 <- c(1, 1, 1, 0)         # 0 to force an equality constraint

    # Prepare for a quadratic solver
    Dmat <- t(C) %*% C
    dvec <- t(C) %*% d
    Amat <- -1 * A1
    bvec <- -1 * b1

    library(kernlab)
    s <- ipop(-dvec, Dmat, Amat, bvec, lb, ub, r1)
    s
    # An object of class ipop
    # Slot primal:
    # [1] -0.0885 -0.0997 0.15990817 0.40895991
    # ...

    x <- s@primal
    # [1] -0.1000 -0.1000 0.1599 0.4090
    A1 %*% x - b1 <= 0          # i.e., A x <= b and 3x[1] + ... + 9x[4] = 4
    sum((C %*% x - d)^2)        # minimum: 0.1695

And this is exactly the solution that lsqlin() in MATLAB computes.

On Thu, Aug 27, 2015 at 6:06 PM, Raubertas, Richard
richard_rauber...@merck.com wrote:
>
> Is it really that complicated? This looks like an ordinary quadratic
> programming problem, and 'solve.QP' from the 'quadprog' package seems to
> solve it without user-specified starting values:
>
>     library(quadprog)
>     Dmat <- t(C) %*% C
>     dvec <- t(C) %*% d
>     Amat <- -1 * t(A)
>     bvec <- -1 * b
>     rslt <- solve.QP(Dmat, dvec, Amat, bvec)
>     sum((C %*% rslt$solution - d)^2)
>     [1] 0.01758538
>
> Richard Raubertas
> Merck & Co.
Re: [R] lsqlin in R package pracma
On Mon, Aug 24, Wang, Xue, Ph.D. <Wang.Xue at mayo.edu> wrote:
>
> I am looking for a R version of Matlab function lsqlin. I came across R
> pracma package which has a lsqlin function. Compared with Matlab lsqlin,
> the R version does not allow inequality constraints. I am wondering if
> this functionality will be available in future. And also like to get
> your opinion on which R package/function is the best for solving least
> square minimization problem with linear inequality constraints. Thanks
> very much for your time and attention!

Solving (linear) least-squares problems with linear inequality constraints
is more difficult than one would expect. Inspecting the MATLAB code reveals
that it employs advanced methods such as active-set (for linear inequality
constraints) and interior-point (for bounds constraints).

Function nlsLM() in package *minpack.lm* supports bound constraints if that
is sufficient for you. The same is true for *nlmrt*.

Convex optimization might be a promising approach for linear inequality
constraints, but there is no easy-to-handle convex solver in R at this
moment. So the most straightforward way would be to use constrOptim(), that
is optim with linear constraints. It requires a reasonable starting point,
and keeping your fingers crossed that you are able to find such a point in
the interior of the feasible region.
If someone wants to try: here is the example from the MATLAB lsqlin page:

    C <- matrix(c(
        0.9501, 0.7620, 0.6153, 0.4057,
        0.2311, 0.4564, 0.7919, 0.9354,
        0.6068, 0.0185, 0.9218, 0.9169,
        0.4859, 0.8214, 0.7382, 0.4102,
        0.8912, 0.4447, 0.1762, 0.8936), 5, 4, byrow=TRUE)
    d <- c(0.0578, 0.3528, 0.8131, 0.0098, 0.1388)
    A <- matrix(c(
        0.2027, 0.2721, 0.7467, 0.4659,
        0.1987, 0.1988, 0.4450, 0.4186,
        0.6037, 0.0152, 0.9318, 0.8462), 3, 4, byrow=TRUE)
    b <- c(0.5251, 0.2026, 0.6721)

The least-squares function to be minimized is ||C x - d||_2, and the
constraints are A x <= b:

    f <- function(x) sum((C %*% x - d)^2)

The solution x0 returned by MATLAB has a minimum of f(x0) = 0.01759204.
This point does not lie in the interior and cannot be used for a start.
Re: [R] lsqlin in R package pracma
I am a strong advocate of the *nloptr* package; and sequential quadratic
programming, i.e. slsqp(), should be a good choice for least-squares
problems. Note that you still have to provide a starting point.

BUT: this point does not need to lie in the interior of the feasible
region. So you can start with a point that solves the problem on the
boundary (with lsqlin()) and then continue with slsqp(). For the example
mentioned above, it looks like this:

    library(pracma)
    x0 <- lsqlin(C, d, A, b)        # starting point on the boundary

    library(nloptr)
    f   <- function(x) sum((C %*% x - d)^2)   # objective
    hin <- function(x) -A %*% x + b           # constraint, w/ hin(x) >= 0 !

    slsqp(x0, f, hin = hin)
    # $par
    # [1]  0.1298620 -0.5756944  0.4251035  0.2438448
    #
    # $value
    # [1] 0.01758538
    # ...

The solution is slightly better than the MATLAB one or the constrOptim one.
Of course, all these can be improved by changing some optional parameters.

On Wed, Aug 26, 2015 at 2:18 PM, Wang, Xue, Ph.D. <wang@mayo.edu> wrote:
>
> Hi Hans,
>
> Thanks for your comments! I need the linear inequality constraints so
> nlsLM is not a candidate. I am also looking at the functions mma and
> slsqp in R package nloptr. Compared with constrOptim(), which approach
> would you recommend?
>
> Thanks,
> Xue
Re: [R] Second order bessel function
On Wed Mar 4 21:32:30 CET 2015, Chris Vanlangenberg writes:
>
> I want to compute the numerical values for the modified second order
> bessel function given x and other parameters. Currently base R has a
> bessel function for 1st order and I have tried to use the relationship
> between 1st and 2nd order to compute the 2nd order bessel function, but
> I ended up getting a zero. Any suggestions how to proceed on this? or
> any alternative methods?
>
> Regards,
> Chris Vanlangenberg

R has the Bessel and modified Bessel functions of first and second kind, of
integer and fractional order. Package 'Bessel' contains Bessel functions
for real and complex numbers. Package 'gsl' provides access to lots of
special Bessel functions. What exactly are you looking for?
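[To illustrate the base R functions mentioned in the reply -- the sample
arguments below are arbitrary: `besselJ`/`besselY` are the Bessel functions
of the first and second kind, `besselI`/`besselK` the modified ones, and
the order `nu` may be fractional:]

```r
x <- 2.5

besselJ(x, nu = 2)     # Bessel function of the first kind, order 2
besselY(x, nu = 2)     # Bessel function of the second kind, order 2
besselI(x, nu = 2)     # modified Bessel function, first kind
besselK(x, nu = 2)     # modified Bessel function, second kind
besselK(x, nu = 0.5)   # fractional order works, too
```

A quick sanity check: for half-integer order, K_{1/2}(x) = sqrt(pi/(2x)) *
exp(-x), which the base R function reproduces to machine precision.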
Re: [R] integrate with vector arguments
marKo <mtoncic at ffri.hr> writes:
>
> I'm a bit stuck. I have to integrate a series of polynomial functions
> with vector arguments.
>
>     v1 <- c(1:5)
>     v2 <- c(1:5)
>     f1 <- function(x) {v1*x + v2*x^2}
>
> The problem is that integrate(f1, 0, 1) does not work.

The point is not that there are vector arguments, but that your function is
vector-valued, and so the generated error message below rightly says
"evaluation of function gave a result of wrong length".

You could integrate each dimension separately or, e.g., use quadv() from
package 'pracma' which handles vector-valued functions:

    v1 <- v2 <- 1:5
    f1 <- function(x) {v1*x + v2*x^2}

    library(pracma)
    quadv(f1, 0, 1, tol=1e-10)
    $Q
    [1] 0.833 1.667 2.500 3.333 4.167

    $fcnt
    [1] 13

    $estim.prec
    [1] 0.03

quadv() employs an adaptive Simpson quadrature where the recursion is
applied to all components at once.

> It does not, even if I pass the arguments (v1, v2)
>
>     f1 <- function(x, v1, v2) {v1*x + v2*x^2}
>
> or if I try to vectorize the function
>
>     f1 <- Vectorize(function(x, v1, v2) {v1*x + v2*x^2},
>                     vectorize.args=c("v1", "v2"))
>
> integrate(f1, 0, 1) gives an error:
>
>     Error in integrate(f1, 0, 1) :
>       evaluation of function gave a result of wrong length
>
> Any help will be greatly appreciated.
>
> Thanks, Marko
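[The "integrate each dimension separately" route mentioned above can be
sketched in base R without pracma -- the sapply wrapper is written for this
note, not taken from any package: call integrate() once per component and
collect the values.]

```r
v1 <- v2 <- 1:5

# Integrate the i-th component of the vector-valued function on [0, 1];
# analytically, the integral of v1[i]*x + v2[i]*x^2 is v1[i]/2 + v2[i]/3.
Q <- sapply(seq_along(v1), function(i)
  integrate(function(x) v1[i]*x + v2[i]*x^2, 0, 1)$value)

round(Q, 3)   # 0.833 1.667 2.500 3.333 4.167, matching the quadv() result
```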
Re: [R] Nonlinear integer programming (again)
Zwick, Rebecca J <RZwick at ETS.ORG> writes:
>
> Oddly, Excel's Solver will produce a solution to such problems but (1) I
> don't trust it and (2) it cannot handle a large number of constraints.
> [...] My question is whether there is an R package that can handle this
> problem.

There are not many free integer (nonlinear) programming (IP, INLP) solvers
available. You could send your problem to one of the MINLP solvers at NEOS:

    http://neos.mcs.anl.gov/neos/solvers/

[See the list of relevant NEOS solvers (commercial and free) on this page:
http://www.neos-guide.org/content/mixed-integer-nonlinear-programming]

You can also use the 'rneos' package to send your request to one of these
Web services, but most of the time I find it easier to directly fill in the
solver template. Please note that you have to format your problem and data
according to the solver's needs, i.e. likely in AMPL or GAMS syntax.

If you intend to solve such problems more often, I'd suggest downloading
one of the commercial solvers with academic licenses (e.g., Knitro, Gurobi,
...) and installing a corresponding R package to access the solver. For
more information see the Optimization task view.

I would *never* trust Excel for these kinds of problems...

By the way, I may not correctly understand your objective, but perhaps you
can formulate it as maximizing a quadratic function (with constraints).
Re: [R] Looking for an R package for Set Cover Problem
As the Wikipedia page you took your example problem from explains, the set
cover problem can be formulated as an integer linear programming problem.
In R, such problems can be solved effectively by applying one of the
available MILP packages, for example lpSolve or Rsymphony.

Kumar Mainali <kpmainali at gmail.com> writes:
>
> I am looking for an R package that solves the Set Cover Problem. As
> Wikipedia explains: Given a set of elements {1, 2, ..., m} (called the
> universe) and a set S of n sets whose union equals the universe, the set
> cover problem is to identify the smallest subset of S whose union equals
> the universe. For example, consider the universe U = {1, 2, 3, 4, 5} and
> the set of sets S = {{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}}. Clearly the
> union of S is U. However, we can cover all of the elements with the
> following, smaller number of sets: {{1, 2, 3}, {4, 5}}.

For this concrete case the set of linear inequalities looks like:

    x1                >= 1    # as 1 is only in set S1
    x1 + x2           >= 1    # etc.
    x1 + x3           >= 1
    x2 + x3 + x4      >= 1
    x4                >= 1
    # all xi in {0, 1}

which has the minimal solution x1, x2, x3, x4 = 1, 0, 0, 1.

> Thank you,
> Kumar Mainali
>
> Postdoctoral Fellow
> Environmental Science Institute
> The University of Texas at Austin
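[A quick base-R check of this formulation -- a brute-force sketch over all
2^4 binary vectors, written just to illustrate the inequalities; for real
instances use an MILP package such as lpSolve, as recommended above:]

```r
# Element-membership matrix: rows = elements 1..5, columns = sets S1..S4
A <- matrix(c(1, 0, 0, 0,    # element 1 is only in S1
              1, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 1, 1,
              0, 0, 0, 1),   # element 5 is only in S4
            nrow = 5, byrow = TRUE)

# Enumerate all binary vectors x and keep those with A %*% x >= 1 (covers)
cand   <- as.matrix(expand.grid(rep(list(0:1), 4)))
covers <- cand[apply(cand %*% t(A) >= 1, 1, all), , drop = FALSE]

# The minimal cover selects S1 and S4, i.e. x = (1, 0, 0, 1)
best <- covers[which.min(rowSums(covers)), ]
unname(best)
```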
Re: [R] Double Infinite Integration
Aya Anas <aanas at feps.edu.eg> writes:
>
> Hello all,
>
> I need to perform the following integration where the integrand is the
> product of three functions: f(x)g(y)z(x,y). The limits of x are (0, inf)
> and the limits of y are (-inf, inf). Could this be done using R?

There is a saying: Don't ask "Can this be done in R?", ask "How is it
done?"

Extracting function f(x) from the inner integral may not always be the
best idea. And applying package 'cubature' will not work as
adaptIntegrate() does not really handle non-finite interval limits.

As an example, let us assume the functions are

    f <- function(x) x
    g <- function(y) y^2
    h <- function(x, y) exp(-(x^2+y^2))

Define a function that calculates the inner integral:

    F1 <- function(x) {
        fun <- function(y) f(x) * g(y) * h(x, y)
        integrate(fun, -Inf, Inf)$value
    }
    F1 <- Vectorize(F1)     # required when using integrate()

We have to check that integrate() is indeed capable of computing this
integrand over an infinite interval.

    F1(c(0:4))   # looks good
    ## [1] 0.00e+00 3.260247e-01 3.246362e-02 3.281077e-04 3.989274e-07

Now integrate this function over the second (infinite) interval.

    integrate(F1, 0, Inf)
    ## 0.4431135 with absolute error < 2.4e-06

Correct, as the integral is equal to sqrt(pi)/4 ~ 0.44311346...

If we extract f(x) from the inner integral, the value of the integral and
the computation times will be the same, but the overall handling will be
slightly more complicated.

> I tried using the function integrate 2 times, but it didn't work:
>
>     z <- function(x, y) { }
>     f <- function(x) {
>         rr <- put here the function in x *
>               integrate(function(y) z(x, y), -Inf, Inf)$value
>         return(rr)
>     }
>     rr2 <- integrate(function(x) f(x), 0, Inf)$value
>     print(rr2)
>
> I didn't get any output at all!!!
>
> Thanks, Aya
Re: [R] Graphing complex functions
John Van Praag <john at jvp247.com> writes:
>
> Does R have any facilities, or packages, for graphing complex functions?

Package 'elliptic' has function view() for "Visualization of complex
functions using colourmaps and contours".

Hans Werner
Re: [R] MCP solver
Maxwell, John McFarland <jmmaxwell at wsu.edu> writes:
>
> Hello,
>
> I'm trying to find a solver that will work for the mixed complementarity
> problem (MCP). I've searched the CRAN task view page on optimization and
> mathematical programming as well as many google searches to no avail.
>
> Does anyone know if there is an MCP solver available for R?
>
> Thanks very much,
> JM

The problem class of 'mixed complementarity problems' is quite big and
encompasses difficult optimization problems such as nonlinear systems of
equations, nonlinear optimization problems, or optimization with equality
constraints, among others.

There is no solver or package in R that will solve 'mixed complementarity
problems' in general. Perhaps your problem can be cast into a more
specialized form that is accessible to one of the available solvers.
Otherwise, GAMS has its own module for solving MCP problems.
Re: [R] How to double integrate a function in R
Tiago V. Pereira tiago.pereira at mbe.bio.br writes: I am trying to double integrate the following expression:

    # expression
    (1/(2*pi)) * exp(-y2/2) * sqrt(y1/(y2 - y1))

for y2 > y1 > 0. I am trying the following approach:

    # first attempt
    library(cubature)
    fun <- function(x) { (1/(2*pi)) * exp(-x[2]/2) * sqrt(x[1]/(x[2] - x[1])) }
    adaptIntegrate(fun, lower = c(0, 0), upper = c(5, 6), tol = 1e-8)

However, I don't know how to constrain the integration so that y2 > y1 > 0. Any ideas? Tiago

You could use integral2() in package 'pracma'. It implements the TwoD algorithm and has the following properties: (1) the boundaries of the second variable y can be functions of the first variable x; (2) it can handle singularities on the boundaries (to a certain extent).

    library(pracma)
    fun <- function(y1, y2) (1/(2*pi)) * exp(-y2/2) * sqrt(y1/(y2 - y1))
    integral2(fun, 0, 5, function(x) x, 6, singular = TRUE)
    ## $Q
    ## [1] 0.7706771
    ## $error
    ## [1] 7.890093e-11

The relative error is a bit optimistic; the absolute error here is 0.5e-6. The computation time is 0.025 seconds. Hans Werner
Re: [R] how to compute maximum of fitted polynomial?
David Winsemius dwinsemius at comcast.net writes: [...] On Jun 4, 2013, at 10:15 PM, Hans W Borchers wrote:

In the case of polynomials, "elementary math ... methods" can actually be executed with R:

    library(polynomial)
    # -6 + 11*x - 6*x^2 + x^3
    p0 <- polynomial(c(-6, 11, -6, 1))   # has zeros at 1, 2, and 3
    p1 <- deriv(p0); p2 <- deriv(p1)     # first and second derivative
    xm <- solve(p1)                      # maxima and minima of p0
    xmax = xm[predict(p2, xm) < 0]       # select the maxima
    xmax                                 # [1] 1.42265

These look like the functions present in the 'polynom' package authored by Bill Venables [aut] (S original), Kurt Hornik [aut, cre] (R port), Martin Maechler. I wasn't able to find a 'polynomial' package on CRAN. The 'mpoly' package by David Kahle offers multivariate symbolic operations as well.

Sorry, yes of course, it should be `library(polynom)`. Somehow I'm making this mistake again and again. And one has to be a bit careful about complex roots. Hans Werner
Re: [R] how to compute maximum of fitted polynomial?
Bert Gunter gunter.berton at gene.com writes:

1. This looks like a homework question. We should not do homework here.
2. optim() will only approximate the max.
3. optim() is not the right numerical tool for this anyway. optimize() is.
4. There is never a guarantee that numerical methods will find the max.
5. This can (and should?) be done exactly using elementary math rather than numerical methods.

Cheers, Bert

In the case of polynomials, "elementary math ... methods" can actually be executed with R:

    library(polynomial)
    # -6 + 11*x - 6*x^2 + x^3
    p0 <- polynomial(c(-6, 11, -6, 1))   # has zeros at 1, 2, and 3
    p1 <- deriv(p0); p2 <- deriv(p1)     # first and second derivative
    xm <- solve(p1)                      # maxima and minima of p0
    xmax = xm[predict(p2, xm) < 0]       # select the maxima
    xmax                                 # [1] 1.42265

Obviously, the same procedure will work for polynomials p0 of higher orders. Hans Werner

Em 04-06-2013 21:32, Joseph Clark escreveu: My script fits a third-order polynomial to my data with something like this:

    model <- lm( y ~ poly(x, 3) )

What I'd like to do is find the theoretical maximum of the polynomial (i.e. the x at which the model predicts the highest y). Specifically, I'd like to predict the maximum for 0 <= x <= 1. What's the best way to accomplish that in R? Bonus question: can R give me the derivative or 2nd derivative of the polynomial? I'd like to be able to compute these at that maximum point. Thanks in advance! // joseph w. clark , phd , visiting research associate \\ university of nebraska at omaha - college of IST
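As a complement to Bert's point 3, here is a minimal sketch of the numerical route with optimize() on the restricted interval [0, 1], using the same example polynomial -6 + 11x - 6x^2 + x^3 as a hypothetical stand-in for the OP's fitted model. On [0, 1] this polynomial is increasing, so the constrained maximum sits at the right endpoint; the hand-written derivatives answer the bonus question for this cubic.

```r
p <- function(x) -6 + 11*x - 6*x^2 + x^3

# maximize over the closed interval [0, 1]
res <- optimize(p, interval = c(0, 1), maximum = TRUE)
res$maximum    # close to 1 (a boundary maximum, up to optimize()'s tolerance)
res$objective  # close to p(1) = 0

# bonus question: derivatives, written out by hand for this cubic
p1 <- function(x) 11 - 12*x + 3*x^2   # first derivative
p2 <- function(x) -12 + 6*x           # second derivative
p1(res$maximum); p2(res$maximum)
```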
Re: [R] Is DUD available in nls()?
Bert Gunter gunter.berton at gene.com writes: I certainly second all of Jeff's comments. **HOWEVER**: http://www.tandfonline.com/doi/pdf/10.1080/00401706.1978.10489610 IIRC, DUD's provenance is old, being originally a BMDP feature.

Thanks for the pointer. That seems to be an interesting derivative-free approach, specialized for least-squares problems. I will take a closer look, as it might be a nice addition to the 'dfoptim' package for derivative-free optimization. -- Hans Werner

Jeff Newmiller jdnewmil at dcn.davis.ca.us writes: a) Read the Posting Guide. [...] b) Regarding your DUD algorithm, it doesn't sound familiar. There are many algorithms that don't use derivatives for optimization, so it isn't clear that this is a referenceable label. The help file for nls mentions two algorithms that don't require derivatives. You might want to review the CRAN Task View on Optimization and Mathematical Programming if you want to know more about what is available in CRAN and base R.

qi A send2aqi at gmail.com writes: SAS has DUD (Does not Use Derivatives)/Secant Method for nonlinear regression; does R offer this option for nonlinear regression? I have read the help file for nls() and could not find such an option, any suggestion? Thanks, Derek
Re: [R] Optimization in R similar to MS Excel Solver
Pavel_K kuk064 at vsb.cz writes: Dear all, I am trying to find the solution for an optimization problem focused on finding the minimum cost. I used the solution proposed by the Excel Solver, but there is a restriction on the number of variables. My data consists of 300 rows representing cities and 6 columns representing the centres. It constitutes a cost matrix, where the costs are distances between each city and each of six centres, plus 1 column containing a variable representing the number of firms. I want to calculate the minimum cost between cities and centres. Each city can belong to only one of the centres.

(1) The solution you say the Excel Solver returns does not appear to be correct: the column sums in columns 3 to 5 are not (greater or) equal to 1, as you request. (2) lpSolve does not return an error, but says "no feasible solution found", which seems to be correct: the equality constraints are too strict. (3) If you relieve these constraints to inequalities, lpSolve does find a solution:

    costs <- matrix(c(30, 20, 60, 40, 66, 90,
                      20, 30, 60, 40, 66, 90,
                      25, 31, 60, 40, 66, 90,
                      27, 26, 60, 40, 66, 90), 4, 6, byrow = TRUE)
    firms <- c(15, 10, 5, 30)
    row.signs <- rep("=", 4)
    row.rhs   <- firms
    col.signs <- rep(">=", 6)
    col.rhs   <- c(1, 1, 1, 1, 1, 1)
    require(lpSolve)
    T <- lp.transport(costs, "min", row.signs, row.rhs, col.signs, col.rhs,
                      presolve = 0, compute.sens = 0)
    T$solution
    sum(T$solution * costs)   # 1557

Of course, I don't know which constraints you really want to impose.
Hans Werner

A model example: costs (distances between municipalities and centres) plus the number of firms in each municipality:

    Municipality  Centre1 Centre2 Centre3 Centre4 Centre5 Centre6  Firms
    Muni1              30      20      60      40      66      90     15
    Muni2              20      30      60      40      66      90     10
    Muni3              25      31      60      40      66      90      5
    Muni4              27      26      60      40      66      90     30

The outcome of the Excel Solver is (cost assigned):

    Municipality  Centre1 Centre2 Centre3 Centre4 Centre5 Centre6  Solution
    Muni1               0      20       0       0       0       0      300
    Muni2              20       0       0       0       0       0      200
    Muni3              25       0       0       0       0       0      125
    Muni4               0      26       0       0       0       0      780
    objective: 1405

I used package lpSolve, but there is a problem with the variable firms:

    s <- as.matrix(read.table("C:/R/OPTIMALIZATION/DATA.TXT",
                              dec = ",", sep = ";", header = TRUE))
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]   30   20   60   40   66   90
    [2,]   20   30   60   40   66   90
    [3,]   25   31   60   40   66   90
    [4,]   27   26   60   40   66   90

    row.signs <- rep("=", 4)
    row.rhs   <- c(15, 10, 5, 30)
    col.signs <- rep("=", 6)
    col.rhs   <- c(1, 1, 1, 1, 1, 1)
    lp.transport(costs, "min", row.signs, row.rhs, col.signs, col.rhs,
                 presolve = 0, compute.sens = 0)
    lp.transport(costs, "min", row.signs, row.rhs, col.signs, col.rhs,
                 presolve = 0, compute.sens = 0)$solution

Outcome: Error in lp.transport(costs, ...): Error: We have 6 signs, but 7 columns

Does anyone know where the problem could be? Does there exist any other possibility how to perform that analysis in R? I am a bit confused here about how I can treat the variable firms. Thanks, Pavel
Re: [R] Using lm to estimate a parameter?
Uwe Ligges ligges at statistik.tu-dortmund.de writes: On 19.02.2013 11:23, hellen wrote: Hi, I have data with three variables (X, Y, Z) and I have an equation Z = X/(1 + L*X/Y), where L is a constant which needs to be estimated from the data. How should I write the formula in lm, or is it possible to fit a linear model in this case?

Neither, it is nonlinear in the parameters. See ?nls or ?optim, for example.

Well, if the Z values are not too small, you can linearize it as U = (X Y - Y Z) / Z = L X and solve it with lm(U ~ X - 1), that is, without an intercept term.

Uwe Ligges

Thanks! Hallen
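A minimal sketch of this linearization with simulated data and a hypothetical true value L = 2 (all names and numbers invented for illustration): the slope of the no-intercept fit recovers L.

```r
set.seed(1)
L <- 2                                # "true" constant, assumed for the demo
X <- runif(50, 1, 10)
Y <- runif(50, 1, 10)
Z <- X / (1 + L * X / Y)              # exact model, no noise for clarity

U <- (X * Y - Y * Z) / Z              # linearized response: U = L * X
fit <- lm(U ~ X - 1)                  # no intercept
coef(fit)                             # ~ 2
```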
Re: [R] R nls results different from those of Excel ??
Jeff Newmiller jdnewmil at dcn.davis.ca.us writes: Excel definitely does not use nonlinear least-squares fitting for power curve fitting. It uses linear LS fitting of the logs of x and y. There should be no surprise in the OP's observation.

May I be allowed to say that while the general comments on MS Excel may be alright, in this special case they are not. The Excel Solver -- which is made by an external company, not MS -- has a good reputation for being fast and accurate. And it indeed solves least-squares and nonlinear problems better than some of the solvers available in R. There is a professional version of this solver, not available from Microsoft, that could be called excellent. We, and this includes me, should not be too arrogant towards the outside, non-R world, the 'barbarians' as the ancient Greeks called them. Hans Werner

-- Sent from my phone. Please excuse my brevity.
Re: [R] minimizing a numerical integration
Dear all, I am a new user of R and I am using the pracma and nloptr libraries to minimize a numerical integration subject to a single constraint. The integrand itself is somehow a complicated function of x and y that is computed through several steps. I formulated the integrand in a separate function called f, which is a function of x and y. I want to find the optimal value of x such that the integration over y is minimal. Here is my code, it is not working. I got an error: 'y' is missing

    library('nloptr')
    library('pracma')

Guess there's no need to use pracma or nloptr. `eval_f0` returns the error message, as the function `f(x, y)`, when called in an integration routine, cannot decide which variable to integrate over. You don't provide code, so here is a simple example:

    f <- function(x, y) sin(y) * cos(x * y)
    eval_f0 <- function(x) integrate(function(y) f(x, y), 0, pi)$value
    optimize(eval_f0, c(0, 2*pi))
    ## minimum:   1.652188
    ## objective: -0.844125

In your code, x is a scalar. But if x is a vector, applying nloptr might be a choice:

    f <- function(x, y) sin(y) * cos((x[1] + x[2]) * y)
    eval_f0 <- function(x) integrate(function(y) f(x, y), 0, pi)$value
    eval_g0 <- function(x) x[1]^2 + x[2]^2 - 1   # i.e., sum(x^2) <= 1
    nloptr(x0 = c(0.5, 0.5), eval_f = eval_f0,
           lb = c(0, 0), ub = c(1, 1), eval_g_ineq = eval_g0,
           opts = list(algorithm = "NLOPT_LN_COBYLA", maxeval = 1000))
    ## Optimal value of objective function: -0.733744658465974
    ## Optimal value of controls: 0.707091 0.7071225

Hans Werner

    f <- function(x, y) {
        # here I should put the commands representing my function
        return( )
    }
    # constraint function
    eval_g0 <- function(x) {
        return( )
    }
    # objective function
    eval_f0 <- function(x) { romberg(f, 0.5, 0.5001) }
    ARL1 <- nloptr(x0 = c(0.653), eval_f = eval_f0,
                   lb = c(0), ub = c(6), eval_g_ineq = eval_g0,
                   opts = list(algorithm = "NLOPT_LN_COBYLA", maxeval = 1000))

Thanks
Re: [R] cubic spline
but definitely *no* need to use a function from an extra CRAN package .. as someone else ``erroneously'' suggested.

Except that Matlab's interp1() 'cubic' method does not use cubic spline interpolation, but Moler's 'pchip' approach, a piecewise cubic Hermite interpolation. Thus the results are different:

    % Matlab:
    interp1(X, Y, 11, 'cubic')   % 5.8000
    interp1(X, Y, 15, 'cubic')   % 5.0000

    # R:
    spfun <- splinefun(X, Y, method = "natural")
    spfun(11)   #=> 5.785714
    spfun(15)   #=> 4.928571
    spfun <- splinefun(X, Y, method = "monoH.FC")
    spfun(11)   #=> 5.8
    spfun(15)   #=> 5.0

Unfortunately, if the points are not monotonic, the 'monoH.FC' method does not exactly agree with the 'pchip' approach, i.e. it does not in general return the same results. By the way, pchip() in package 'pracma' implements Moler's approach and does return the same (interpolation and extrapolation) results as interp1() with the 'cubic' option in Matlab. Hans Werner

Martin Maechler maechler at stat.math.ethz.ch writes: David Winsemius dwinsemius at comcast.net on Sat, 1 Dec 2012 09:25:42 -0700 writes: On Dec 1, 2012, at 5:09 AM, Steve Stephenson wrote: Hallo, I'm facing a problem and I would really appreciate your support. I have to translate some Matlab code into R that I don't know very well, but I would like to. I have to interpolate 5 points with a cubic spline function, and then I expect my function to return the Y value as output for a specific X value inside the evaluation range. Let's suppose that: 1 - *X = [-10, -5, 0, 5, 10]*; 2 - *Y = [12, 10, 8, 7, 6]*; 3 - *I have to interpolate with a cubic spline assuming x = 11*. In Matlab I used this function: *y = interp1(X, Y, x, 'cubic');* How can I do the same in R? Many thanks in advance for your reply and support!

    splinefun(x = c(-10, -5, 0, 5, 10), y = c(12, 10, 8, 7, 6), method = "natural")(11)
    [1] 5.785714

Yes, indeed, or simply spline(), but definitely *no* need to use a function from an extra CRAN package .. as someone else ``erroneously'' suggested.
Note that spline() and splinefun(), together with approx() and approxfun(), are among the several hundred functions that were already part of pre-alpha R, i.e., before R had a version number or *any* packages ... and yes, the README then started with the two lines

    | R Source Code (Tue Jun 20 14:33:47 NZST 1995)
    | Copyright 1993, 1994, 1995 by Robert Gentleman and Ross Ihaka

and it would be *really* *really* great if people did not add stuff to their packages that has been part of R for longer than they have even heard of R. Martin Maechler, ETH Zurich
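A small sketch comparing the interpolants discussed above on the example data (pracma assumed installed; the expected values for the splinefun() calls are the ones quoted in the thread):

```r
X <- c(-10, -5, 0, 5, 10)
Y <- c(12, 10, 8, 7, 6)

natural  <- splinefun(X, Y, method = "natural")
monotone <- splinefun(X, Y, method = "monoH.FC")
natural(11)    # 5.785714, the natural cubic spline extrapolation
monotone(11)   # 5.8, matching Matlab's interp1(..., 'cubic') here

# Moler's pchip as implemented in package 'pracma' (an assumption: pracma installed)
library(pracma)
pchip(X, Y, c(-7.5, 2.5))   # piecewise cubic Hermite interpolation inside the range
```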
Re: [R] The best solver for non-smooth functions?
Cren oscar.soppelsa at bancaakros.it writes:

The most robust solver for non-smooth functions I know of in R is the Nelder-Mead solver in the 'dfoptim' package (which also allows for box constraints). First throw out the equality constraint by using c(w1, w2, 1 - w1 - w2) as input. This will enlarge the domain a bit, but comes out all right in the end.

    sharpe2 <- function(w) {
        w <- c(w[1], w[2], 1 - w[1] - w[2])
        - (t(w) %*% y) / cm.CVaR(M, lgd, w, N, n, r, rho, alpha, rating)
    }
    nmkb(c(1/3, 1/3), sharpe2, lower = c(0, 0), upper = c(1, 1))
    ## $par
    ## [1] 0.1425304 0.1425646
    ## $value
    ## [1] -0.03093439

This is still in the domain of definition, and is about the same optimum that solnp() finds. There are some more solvers, especially aimed at non-smooth functions, in the making. For low-dimensional problems like this, Nelder-Mead is a reasonable choice.

    # Whoops! I have just seen there's a little mistake
    # in the 'sharpe' function, because I had to use
    # the 'w' array instead of 'ead' in the cm.CVaR function!
    # This does not change the main features of my problem,
    # but you should be aware of it.

    # The function to be minimized
    sharpe <- function(w) {
        - (t(w) %*% y) / cm.CVaR(M, lgd, ead, N, n, r, rho, alpha, rating)
    }
    # This becomes...
    sharpe <- function(w) {
        - (t(w) %*% y) / cm.CVaR(M, lgd, w, N, n, r, rho, alpha, rating)
    }
    # ...substituting 'ead' with 'w'.
Re: [R] identifying local maxima
Gary Dong pdxgary163 at gmail.com writes: Dear R users, I have created a loess surface in R, in which x is relative longitude in miles, y is relative latitude in miles, and z is population density at the neighborhood level. The purpose is to identify some population centers in the region. I'm wondering if there is a way to determine the coordinates (x, y) of each center, so I can know exactly where they are. Let me use the elevation data set in geoR to illustrate what I have done:

    require(geoR)
    data(elevation)
    elev.df <- data.frame(x = 50 * elevation$coords[, "x"],
                          y = 50 * elevation$coords[, "y"],
                          z = 10 * elevation$data)
    elev.loess <- loess(z ~ x * y, data = elev.df, degree = 2, span = 0.1)
    grid.x <- seq(10, 300, 1)
    grid.y <- seq(10, 300, 1)
    grid.mar <- list(x = grid.x, y = grid.y)
    elev.fit <- expand.grid(grid.mar)
    z.pred <- predict(elev.loess, newdata = elev.fit)
    persp(grid.x, grid.y, z.pred, phi = 45, theta = 45,
          xlab = "X Coordinate (feet)", ylab = "Y Coordinate (feet)",
          main = "Surface elevation data")

With this, what's the right way to determine the coordinates of the local maxima on the surface?

If there are more local maxima, you probably need to start the optimization process from each point of your grid and see if you stumble into different maxima. Here is how to find the one maximum in your example.

    require(geoR); data(elevation)
    elev.df <- data.frame(x = 50 * elevation$coords[, "x"],
                          y = 50 * elevation$coords[, "y"],
                          z = 10 * elevation$data)

    ## Compute the surface on an irregular grid
    library(akima)
    foo <- function(xy)
        with(elev.df, -interp(x, y, z, xy[1], xy[2], linear = FALSE, extrap = TRUE)$z)

    ## Use Nelder-Mead to find a local maximum
    optim(c(150, 150), foo)
    # $par
    # [1] 195.8372  21.5866
    # $value
    # [1] -9705.536

    ## See from a contour plot whether this is the only maximum
    with(elev.df, {
        A <- akima::interp(x, y, z, linear = FALSE)
        plot(x, y, pch = 20, col = "blue")
        contour(A$x, A$y, A$z, add = TRUE)
    })
    points(195.8372, 21.5866, col = "red")

Thanks.
Gary
Re: [R] numerical integration
Michael Meyer mjhmeyer at yahoo.com writes:

Check your logic. The following lines show that integrate *does* return the correct values:

    a <- 0.08   # alpha
    M <- function(j, s) {
        return(exp(-j*a*s))
    }
    A <- matrix(NA, 5, 5)
    for (i in 1:5) {
        for (j in i:5) {
            f <- function(s) { return(M(i, s)) }
            g <- function(s) { return(M(j, s)) }
            integrand <- function(s) { return(f(s)*g(s)) }
            A[i, j] <- integrate(integrand, lower = 0, upper = Inf)$value
            if (i != j) A[j, i] <- A[i, j]
        }
    }
    A
    #          [,1]     [,2]     [,3]     [,4]     [,5]
    # [1,] 6.250000 4.166667 3.125000 2.500000 2.083333
    # [2,] 4.166667 3.125000 2.500000 2.083333 1.785714
    # [3,] 3.125000 2.500000 2.083333 1.785714 1.562500
    # [4,] 2.500000 2.083333 1.785714 1.562500 1.388889
    # [5,] 2.083333 1.785714 1.562500 1.388889 1.250000

Try setting A <- matrix(NA, 5, 5). You'll see that the wrong entries in matrix A are still NA, that is, not yet computed. Regards, Hans Werner

I encounter a strange problem computing some numerical integrals on [0, oo). Define $$ M_j(x) = \exp(-jax) $$ where $a = 0.08$. We want to compute the $L^2([0,\infty))$ inner products $$ A_{ij} := (M_i, M_j) = \int_0^\infty M_i(x) M_j(x)\,dx. $$ Analytically we have $$ A_{ij} = 1/(a(i+j)). $$ In the code below we compute the matrix $A_{ij}$, $1 \leq i,j \leq 5$, numerically and check against the known analytic values. When I run this code most components of A are correct, but some are zero. I get the following output:

    [1] "Dot products, analytic:"
    #          [,1]     [,2]     [,3]     [,4]     [,5]
    # [1,] 6.250000 4.166667 3.125000 2.500000 2.083333
    # [2,] 4.166667 3.125000 2.500000 2.083333 1.785714
    # [3,] 3.125000 2.500000 2.083333 1.785714 1.562500
    # [4,] 2.500000 2.083333 1.785714 1.562500 1.388889
    # [5,] 2.083333 1.785714 1.562500 1.388889 1.250000
    [1] "Dot products, numeric:"
    #          [,1]     [,2]     [,3]     [,4]     [,5]
    # [1,] 6.250000 4.166667 3.125000 2.500000 2.083333
    # [2,] 4.166667 3.125000 2.500000 2.083333 0.000000
    # [3,] 3.125000 2.500000 2.083333 0.000000 0.000000
    # [4,] 2.500000 2.083333 0.000000 0.000000 0.000000
    # [5,] 2.083333 1.785714 1.562500 1.388889 1.250000

Undoubtedly the code is suboptimal. What is wrong with it?
    a <- 0.08   # alpha
    M <- function(j, s) {
        return(exp(-j*a*s))
    }

    # Inner product in $L^2([0,+\oo))$
    innerProduct <- function(f, g) {
        integrand <- function(s) { return(f(s)*g(s)) }
        return(integrate(integrand, lower = 0, upper = Inf)$value)
    }

    # The 5x5 matrix of the L^2 inner products of $M_1, \dots, M_5$.
    numericalDotProductMatrix_M <- function() {
        A <- matrix(0, 5, 5)
        for (i in seq(1:5))
            for (j in seq(i:5)) {
                f <- function(s) { return(M(i, s)) }
                g <- function(s) { return(M(j, s)) }
                A[i, j] <- innerProduct(f, g)
                if (i < j) A[j, i] <- A[i, j]
            }
        return(A)
    }

    # The 5x5 matrix of the analytic L^2 inner products of $M_1, \dots, M_5$.
    dotProductMatrix_M <- function() {
        A <- matrix(0, 5, 5)
        for (i in seq(1:5))
            for (j in seq(1:5))
                A[i, j] <- 1/(a*(i+j))
        return(A)
    }

    testNumericalIntegration <- function() {
        A <- dotProductMatrix_M()
        print("Dot products, analytic:")
        print(A)
        # why doesn't the numerical integration work?
        B <- numericalDotProductMatrix_M()
        print("Dot products, numeric:")
        print(B)
    }

    testNumericalIntegration()
Re: [R] Need to help to get value for bigger calculation
Rehena Sultana hena_4567 at yahoo.com writes: I want to calculate values like 15^200 or 17^300 in R. In the normal case it can calculate the small values of b (a^b). I have fixed width = 1 and digits = 22, but still the answers are Inf. How to deal with cases like these? Thanks in advance.

    library(Rmpfr)
    m15 <- mpfr(15, precBits = 1024)
    m15^200
    165291991078820803015600259355571011187461128806050897708002963982861165279305672605355515846488679511983197774726563035424119280679882914553697406070345089013507994161512883544043567583004683422057135011584705353016 03376865386962890625
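As an alternative sketch (assuming the 'gmp' package, which is not mentioned in the thread), exact big-integer arithmetic avoids floating point entirely; 15^200 has 236 decimal digits:

```r
library(gmp)         # assumption: the CRAN package 'gmp' is installed

v15 <- as.bigz(15)^200   # exact value of 15^200
v15
as.bigz(17)^300          # exact value of 17^300
```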
Re: [R] Runs up and runs down test
JWDougherty jwd at surewest.net writes: Can someone point me to an implementation of the runs up and runs down test, or does such a beast exist in R? From web searches, the runs up and runs down test is commonly used for testing pseudo-random number generators and in simulations. John C. Davis describes its use in geology in his book on geological statistics (2002). Having searched the full site, I can locate only runs.test() in the tseries package.

See the "Runs" test in the *Dieharder* random number test suite (if I understand your request properly), available in R through the 'RDieHarder' package. Hans Werner
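For reference, a minimal sketch of the runs.test() the OP did find (tseries assumed installed); note this is the ordinary two-sided runs test on a two-level factor, not the runs-up/runs-down variant:

```r
library(tseries)   # assumption: the CRAN package 'tseries' is installed

set.seed(42)
x <- factor(rbinom(100, 1, 0.5))   # a binary sequence to test for randomness
rt <- runs.test(x)                 # standard runs test
rt$p.value                         # large p-value expected for iid input
```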
Re: [R] Optimization inconsistencies
peter dalgaard pdalgd at gmail.com writes: On May 18, 2012, at 00:14, Nathan Stephens wrote: I have a very simple maximization problem where I'm solving for the vector x:

    objective function:  w'x = value to maximize
    box constraints (for all elements of x):  low <= x <= high
    equality constraint: sum(x) = 1

But I get inconsistent results depending on what starting values I use. I've tried various packages, but none seem to beat the Solver in Excel. Any recommendations on what packages or functions I should try?

Use the equality constraint to reduce the dimension of the problem by one. Then formulate the inequality constraints and apply, e.g., constrOptim(). You can immediately write down and use the gradient, so method "L-BFGS-B" is appropriate. Because the problem is linear, there is only one maximum and no dependency on starting values. If you had supplied some data and code (which packages did you try, and how?), a more concrete answer would have been possible. My own test example worked out of the box. Hans Werner

Sounds like a linear programming problem, so perhaps one of the packages that are specialized for that? lpSolve looks like it should do it. (As a general matter: there's nothing simple about constrained maximization problems, and generic optimizers aren't really geared towards dealing with large sets of constraints.) --Nathan
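Following the lpSolve suggestion, here is a minimal sketch with made-up numbers (w and the bounds are invented for illustration): maximize w'x subject to sum(x) = 1 and 0 <= x_i <= 0.6. Being an LP, the solution is unique and independent of any starting values.

```r
library(lpSolve)

w <- c(3, 1, 2)                           # hypothetical objective coefficients
const.mat <- rbind(rep(1, 3), diag(3))    # sum(x) row, then the upper bounds
const.dir <- c("=", rep("<=", 3))
const.rhs <- c(1, rep(0.6, 3))            # sum(x) = 1, x_i <= 0.6
# (lp() assumes x >= 0, which covers the lower bounds here)

sol <- lp("max", w, const.mat, const.dir, const.rhs)
sol$solution   # 0.6 0.0 0.4 -- mass goes to the largest coefficients
sol$objval     # 2.6
```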
Re: [R] Optimization inconsistencies
Marc Girondot marc_grt at yahoo.fr writes: On 18/05/12 00:14, Nathan Stephens wrote: I have a very simple maximization problem where I'm solving for the vector x. But I get inconsistent results depending on what starting values I use. I've tried various packages, but none seem to beat the Solver in Excel. Any recommendations on what packages or functions I should try? [...]

I had a similar problem to solve (the x were frequencies) and the optimization stopped before reaching the global maximum. As [...] indicate[d], I fitted {x}-1 values because the last one is known from the equality constraint. For the vector constraints, I used w <- -Inf when x goes out of the limits.

Same as before: would you *please* post data and code, as the posting guide requests! The variable that is left out still has some (linear) inequalities to fulfil. I didn't understand how you worked that out with optim(), as it only allows one to include box constraints. Hans Werner

Finally I used Bayesian MCMC to get convergence and it works much better. I don't know why in this case optim does not converge. Hope it helps, Marc
Re: [R] Translation of matlab vectors code into r
Haio joakim.aback at gmail.com writes: Hi there, I am a new user of R; I would need some help to translate some code for vectors in Matlab to R. I have managed to translate the first 7 rows, but not the rest. Could anyone give me any suggestions for this problem?

Matlab code:

    tempo=[]; temps=[]; tempn=[];
    tempao=[]; tempas=[]; tempan=[];
    for k=1:5
        tempo  = [tempo n_o(k,:)];
        temps  = [temps n_s(k,:)];
        tempn  = [tempn n_n(k,:)];
        tempao = [tempao nanst_o(k,:)];
        tempas = [tempas nanst_s(k,:)];
        tempan = [tempan nanst_n(k,:)];
    end

This is the code that I'm trying to translate into R; so far I have managed to translate the first 7 rows, but not the last 6 rows.

This is extremely bad Matlab code; I would discourage you from translating it line by line. What the code does: it takes each of the matrices n_o, n_s, n_n, nanst_o, nanst_s, nanst_n and combines the first five rows of each into a vector. A way to do this in R could be as follows:

    tempo <- c(t(n_o[1:5, ]))   # etc.

where n_o, etc. are the same matrices as in your Matlab example. A similar solution, of course, would be possible in Matlab.
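A quick sketch with a made-up matrix (invented for illustration) showing that c(t(...)) lays rows end to end, just like the Matlab loop [tempo n_o(k,:)] does:

```r
n_o <- matrix(1:12, nrow = 6, ncol = 2)   # hypothetical 6 x 2 matrix
# row k of n_o is (k, k + 6)

# rows 1..5 concatenated, row by row
tempo <- c(t(n_o[1:5, ]))
tempo   # 1 7 2 8 3 9 4 10 5 11
```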
Re: [R] Define lower-upper bound for parameters in Optim using Nelder-Mead method
Ben Bolker bbolker at gmail.com writes: Ted.Harding at wlandres.net writes: In addition to these options, there is also a derivative-free box-constrained optimizer (bobyqa) in the 'minqa' package (and in an optim-like wrapper via the optimx package), and a box-constrained Nelder-Mead optimizer in the development (R-Forge) version of lme4, which is based on the NLopt optimization library (also accessible via the nloptr package).

I could add another Nelder-Mead implementation in package 'dfoptim'. It comes in pure R and is still quite efficient, based on the code from Kelley's well-known book. It exists in unconstrained and box-constrained versions. The optimization world in R is by now really scattered across many different packages with sometimes 'strange' names. Some of the packages have not yet made it from R-Forge to CRAN. Unfortunately, the Optimization task view is not of much help anymore in this shattered world. We will get a lot more of these questions on R-help if we do not come up with a solution to this problem, for instance more up-to-date optimization functions in base R, a recommended package for optimization, or e.g. an optimization guide as a kind of global vignette. Hans Werner
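A minimal sketch of the box-constrained Nelder-Mead from 'dfoptim' on a toy problem (function and bounds invented for illustration): the unconstrained minimum of (x1 - 2)^2 + (x2 - 3)^2 lies outside the box [0, 1]^2, so the solver should end up near the corner (1, 1).

```r
library(dfoptim)   # assumption: the CRAN package 'dfoptim' is installed

fn <- function(x) (x[1] - 2)^2 + (x[2] - 3)^2   # minimum at (2, 3), outside the box
res <- nmkb(par = c(0.5, 0.5), fn = fn, lower = c(0, 0), upper = c(1, 1))
res$par   # close to the corner (1, 1)
```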
Re: [R] Error in gamma(delta + (complex(0, 0, 1) * (x - mu))/alpha) : unimplemented complex function
I am trying to obtain the graph of a pdf, but this error keeps showing. Here is the code.

Or use the complex gamma function gammaz() in package 'pracma'. The following code works, that is, it produces a plot:

    library(pracma)
    MXN.fd <- function(x, alpha, beta, mu, delta) {
        A <- (2*cos(beta/2))^(2*delta)
        B <- 2*alpha*pi*gamma(2*delta)
        C <- (beta*(x - mu))/alpha
        D <- abs(gammaz(delta + (complex(0, 0, 1)*(x - mu))/alpha)^2)
        M <- A/B * exp(C) * D
        plot(x, M, type = "l", lwd = 2, col = "red")
    }
    alpha <- 0.02612297; beta <- -0.50801886
    mu <- 0.00575829917; delta <- 0.67395
    x <- seq(-0.04, 0.04, length = 200)
    MXN.fd(x, alpha, beta, mu, delta)
    grid()

I think the problem is the gamma function; does anyone know how to compute gamma with imaginary numbers? Thanks in advance
Re: [R] R numerical integration
casperyc casperyc at hotmail.co.uk writes:

I don't know what is wrong with your Maple calculations, but I think you should check them carefully, because: (1) as Petr explained, the value of the integral will be < 0.5; (2) the approach of Peter still works and returns 0.4999777; (3) and the same result comes out with Mathematica: 0.49997769...

Hans Werner

The quadinf command in library pracma still fails when mu = -2.986731 with sigma = 53415.18, while Maple gives me an estimate of 0.5001701024. Maple (for those who are interested):

    myf := (mu, sigma) ->
        evalf(Int(exp(-(x-mu)^2/2/sigma^2)/sigma/sqrt(2*Pi)/(1+exp(-x)),
                  x = -infinity..infinity));
    myf(-2.986731, 53415.18);
        0.5001701024

These mu's and sigma's are now random starting points I generated for an optimization problem like the one I have mentioned. I should really investigate the behavior of this function before I ask R to do the integration. As I have mentioned, I had already realized the integral is between 0 and 1. And I have had a look at the contour plots for different mu and sigma. I am going to 'restrict' mu and sigma to certain (small) values, and still get the integral to produce a value between 0 and 1. All of your help is much appreciated. casper
Re: [R] R numerical integration
casperyc <casperyc at hotmail.co.uk> writes: Are there any other packages for numerical integration besides the default 'integrate'? Basically, I am integrating:

    integrate(function(x) dnorm(x, mu, sigma)/(1+exp(-x)), -Inf, Inf)$value

The integration is ok provided sigma is 0. However, when mu=-1.645074 and sigma=17535.26 it stops working. On the other hand, Maple gives me a value of 0.5005299403.

Using `integrate()` to integrate from -1e-8 to 1e-8 will give quite a correct result, while integrating from -1e-10 to 1e-10 will return 0. It seems more appropriate to transform the infinite into a finite interval. Function `quadinf` in package pracma does this automatically, applying the `integrate` routine to the finite interval [-1, 1]:

    library(pracma)
    quadinf(fun, -Inf, Inf, tol=1e-10)
    # [1] 0.4999626

(here fun is the integrand above). I am astonished to see the result from Maple, as it does not appear to be correct --- Mathematica, for instance, returns 0.499963.

It is an important line of the code that I am writing, and I am looking for some package that can do numerical integration efficiently (fast and accurate to tol=1e-4). I have tried 'cubature', which does not give me anything even after 10 minutes. Thanks. casper
Re: [R] R numerical integration
Hans W Borchers <hwborchers at googlemail.com> writes: casperyc <casperyc at hotmail.co.uk> writes: Are there any other packages for numerical integration besides the default 'integrate'? Basically, I am integrating:

    integrate(function(x) dnorm(x, mu, sigma)/(1+exp(-x)), -Inf, Inf)$value

The integration is ok provided sigma is 0. However, when mu=-1.645074 and sigma=17535.26 it stops working. On the other hand, Maple gives me a value of 0.5005299403.

Using `integrate()` to integrate from -1e-8 to 1e-8 will give quite a correct result, while integrating from -1e-10 to 1e-10 will return 0.

Saturday morning... Well, of course I meant integrating from -1e8 to 1e8 and from -1e10 to 1e10. The first returns almost the correct result, while the other returns 0. The same happens for `adaptIntegrate` in package cubature. It shows that one cannot automatically set the limits very high. Therefore, transforming to a finite interval is to be preferred. There are several ways to do that, depending also on the convergence rate of your function at infinity. Hans Werner

It seems more appropriate to transform the infinite into a finite interval. Function `quadinf` in package pracma does this automatically, applying the `integrate` routine to the finite interval [-1, 1]:

    library(pracma)
    quadinf(fun, -Inf, Inf, tol=1e-10)
    # [1] 0.4999626

I am astonished to see the result from Maple, as it does not appear to be correct --- Mathematica, for instance, returns 0.499963.
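As an illustration of such a transformation (my own sketch, not necessarily the exact substitution quadinf uses internally): the map x = t/(1 - t^2) sends (-1, 1) onto (-Inf, Inf), so the doubly infinite integral becomes a finite one that `integrate` can handle directly.

```r
# Integrand from the thread, with the problematic parameters
f <- function(x, mu = -1.645074, sigma = 17535.26)
    dnorm(x, mu, sigma) / (1 + exp(-x))

# Substitute x = t/(1 - t^2); then dx = (1 + t^2)/(1 - t^2)^2 dt
g <- function(t) f(t / (1 - t^2)) * (1 + t^2) / (1 - t^2)^2

integrate(g, -1, 1, rel.tol = 1e-8)$value
```

With these parameters the result should land near the 0.49996 value quoted above, though the transformed integrand is sharply peaked near the endpoints when sigma is huge, so the tolerance may need loosening in extreme cases.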
Re: [R] ACM Software Copyright and License Agreement
peter dalgaard <pdalgd at gmail.com> writes: On Feb 16, 2012, at 12:31, Hans W Borchers wrote: I have often seen the use of routines from the ACM Collected Algorithms, i.e. netlib.org/toms/ (CALGO, or Trans. on Math. Software, TOMS), in Open Source programs, maybe also in some R packages --- and sometimes these programs are distributed under the GPL license, sometimes under proprietary licenses, e.g. in Scilab. The use of these CALGO programs is subject to the ACM Software Copyright and License Agreement www.acm.org/publications/policies/softwarecrnotice which includes the following paragraph: **Commercial Use** Any User wishing to make a commercial use of the Software must contact ACM at permissions at acm.org to arrange an appropriate license. Commercial use includes (1) integrating or incorporating all or part of the source code into a product for sale or license by, or on behalf of, User to third parties, (2) distribution of the binary or source code to third parties for use with a commercial product sold or licensed by, or on behalf of, User. I assume that this license extension is not compatible with GPL, but I may be wrong here. So my question is: Can software from the ACM Collected Algorithms be distributed under a GPL-compatible licence, and how to formulate and where to put such a license extension? One needs to tread _really_ carefully with these items. You plainly can't claim that the ACM license is compatible with the GPL; it just isn't. However, there are cases where software has been placed in the Public Domain in addition to being published by an ACM journal. E.g., the NSWC (Naval Surface Warfare Center) library is in the Public Domain even though some of its routines have been published in TOMS. And how can I be sure that these algorithms have been rightly placed on the NSWC library page under a license different from their original ACM license? I am inclined to be quite suspicious about that. Best, Hans Werner However, I am not a lawyer, etc...
-pd
[R] ACM Software Copyright and License Agreement
ACM Software Copyright and License Agreement. I have often seen the use of routines from the ACM Collected Algorithms, i.e. netlib.org/toms/ (CALGO, or Trans. on Math. Software, TOMS), in Open Source programs, maybe also in some R packages --- and sometimes these programs are distributed under the GPL license, sometimes under proprietary licenses, e.g. in Scilab. The use of these CALGO programs is subject to the ACM Software Copyright and License Agreement www.acm.org/publications/policies/softwarecrnotice which includes the following paragraph: **Commercial Use** Any User wishing to make a commercial use of the Software must contact ACM at permissi...@acm.org to arrange an appropriate license. Commercial use includes (1) integrating or incorporating all or part of the source code into a product for sale or license by, or on behalf of, User to third parties, or (2) distribution of the binary or source code to third parties for use with a commercial product sold or licensed by, or on behalf of, User. I assume that this license extension is not compatible with GPL, but I may be wrong here. So my question is: Can software from the ACM Collected Algorithms be distributed under a GPL-compatible licence, and how to formulate and where to put such a license extension? Thanks, Hans Werner
Re: [R] Schwefel Function Optimization
Vartanian, Ara <aravart at indiana.edu> writes: All, I am looking for an optimization library that does well on something as chaotic as the Schwefel function:

    schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))

With these guys, not much luck:

    optim(c(1,1), schwefel)$value
    # [1] -7.890603
    optim(c(1,1), schwefel, method="SANN", control=list(maxit=1))$value
    # [1] -28.02825
    optim(c(1,1), schwefel, lower=c(-500,-500), upper=c(500,500), method="L-BFGS-B")$value
    # [1] -7.890603
    optim(c(1,1), schwefel, method="BFGS")$value
    # [1] -7.890603
    optim(c(1,1), schwefel, method="CG")$value
    # [1] -7.890603

Why is it necessary over and over again to point to the Optimization Task View? This is a question about a global optimization problem, and the task view tells you to look at packages like 'nloptr' with specialized routines, or to use one of the packages with evolutionary algorithms, such as 'DEoptim' or 'pso'.

    library(DEoptim)
    schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))
    de <- DEoptim(schwefel, lower = c(-500,-500), upper = c(500,500),
                  control = list(trace = FALSE))
    de$optim$bestmem
    #     par1     par2
    # 420.9687 420.9687
    de$optim$bestval
    # [1] -837.9658

All trapped in local minima. I get the right answer when I pick a starting point that's close:

    optim(c(400,400), schwefel, lower=c(-500,-500), upper=c(500,500),
          method="L-BFGS-B")$value
    # [1] -837.9658

Of course I can always roll my own:

    r <- vector()
    for (i in 1:1000) {
        x <- runif(2, -500, 500)
        m <- optim(x, schwefel, lower=c(-500,-500), upper=c(500,500),
                   method="L-BFGS-B")
        r <- rbind(r, c(m$par, m$value))
    }

And this does fine. I'm just wondering if this is the right approach, or if there is some other package that wraps this kind of multi-start up so that the user doesn't have to think about it. Best, Ara
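The multi-start loop above can be wrapped in a small reusable function. A minimal sketch (`multistart` is a hypothetical helper, not a package function), keeping only the best local solution:

```r
# Multi-start local optimization over a box (a sketch)
multistart <- function(fn, lower, upper, n = 100) {
  best <- NULL
  for (i in seq_len(n)) {
    x0 <- runif(length(lower), lower, upper)        # random start in the box
    m  <- optim(x0, fn, method = "L-BFGS-B", lower = lower, upper = upper)
    if (is.null(best) || m$value < best$value) best <- m
  }
  best
}

schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))
set.seed(1)
res <- multistart(schwefel, c(-500, -500), c(500, 500))
res$value   # with enough starts, should approach -837.9658
```

This is essentially what the "roll my own" loop does, minus storing every restart; dedicated multi-start wrappers exist in optimization packages if you want restarts managed for you.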
Re: [R] Need very fast application of 'diff' - ideas?
R. Michael Weylandt <michael.weylandt at gmail.com> writes: I'd write your own diff() that eliminates the method dispatch and argument checking that diff -> diff.default does.

    x[-1] - x[-length(x)]   # is all you really need

(You could also try something like c(x[-1], NA) - x, which may be marginally faster as it only subsets x once, but you should profile to find out.) That is probably about as fast as you can get within pure R code (the function call overhead will add a little bit of time as well, so if speed is truly the only thing that matters, best not to wrap it in a function). If you want even more speed, you'll have to go to compiled code; I'd suggest inline+Rcpp as the easiest way to do so. That could get it down to a single pass through the vector in pure C (or nice C++), which seems to be a lower bound for speed. Michael

Python has become astonishingly fast during the last years. On an iMac with 3.06 GHz I see the following timings (though I do feel a bit suspicious about the timings Python reports):

    Python     0.040 s   Version 2.6.1, 1e7 integer elements
    Matlab     0.095 s   Matlab's diff function (Version R2011b)
    Matlab     0.315 s   Matlab using x(2:N)-x(1:(N-1))
    R          0.375 s   R's diff() function (R 2.14.1)
    R          0.365 s   R using x[-1]-x[-N]
    R          0.270 s   R using c(x[-1],NA)-x
    R+Fortran  0.180 s   R function calling .Fortran
    R+C        0.180 s   R function calling .C

where --- as an example --- the C code looks like:

    void diff(int *n, int *x, int *d)
    {
        for (long i = 0; i <= *n-2; i++)
            d[i] = x[i+1] - x[i];
    }

There appears to be a factor of 4 between R+compiled code and Python code. It is also interesting to see that in Matlab 'diff' is considerably faster than differencing vectors, while in R it is slower.

P.S.: To make the comparison fair I have used the following Python call:

    python -m timeit -n 1 -r 1 -s 'import numpy' \
      -s 'arr = numpy.random.randint(0, 1000, (1000,1)).astype("int32")' \
      'diff = arr[1:] - arr[:-1]'

i.e., used 32-bit integers and included the indexing in the loop.
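The inline+Rcpp route mentioned above can be sketched with Rcpp::cppFunction, which compiles a C++ snippet and exposes it as an R function (assumes the Rcpp package is installed):

```r
# Single-pass compiled diff via Rcpp (a sketch of the approach suggested above)
library(Rcpp)
cppFunction('
  NumericVector diff_cpp(NumericVector x) {
    int n = x.size();
    NumericVector d(n - 1);
    for (int i = 0; i < n - 1; i++)
      d[i] = x[i + 1] - x[i];        // one pass, no copies beyond the result
    return d;
  }
')

x <- round(runif(1e6, 0, 1000))
all.equal(diff_cpp(x), diff(x))   # TRUE
```

The first call pays a one-time compilation cost; after that the function runs at C++ speed, which is where the ".C/.Fortran" timings in the table come from.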
On Fri, Jan 27, 2012 at 7:15 PM, Kevin Ummel <kevinummel at gmail.com> wrote: Hi everyone, Speed is the key here. I need to find the difference between a vector and its one-period lag (i.e. the difference between each value and the subsequent one in the vector). Let's say the vector contains 10 million random integers between 0 and 1,000. The solution vector will have 9,999,999 values, since there is no lag for the 1st observation. In R we have:

    # Set up input vector
    x = runif(n=10e6, min=0, max=1000)
    x = round(x)
    # Find one-period difference
    y = diff(x)

Question is: How can I get the 'diff(x)' part as fast as absolutely possible? I queried some colleagues who work with other languages, and they provided equivalent solutions in Python and Clojure that, on their machines, appear to be potentially much faster (I've put the code below in case anyone is interested). However, they mentioned that the overhead in passing the data between languages could kill any improvements. I don't have much experience integrating other languages, so I'm hoping the community has some ideas about how to approach this particular problem... Many thanks, Kevin

In iPython:

    In [3]: import numpy as np
    In [4]: arr = np.random.randint(0, 1000, (1000,1)).astype('int16')
    In [5]: arr1 = arr[1:].view()
    In [6]: timeit arr2 = arr1 - arr[:-1]
    10 loops, best of 3: 20.1 ms per loop

In Clojure:

    (defn subtract-lag [n]
      (let [v (take n (repeatedly rand))]
        (time (dorun (map - v (cons 0 v))))))
Re: [R] k-means++
Ferebee Tunno <ferebee.tunno at mathstat.astate.edu> writes: Hi everyone - I know that R is capable of clustering using the k-means algorithm, but can R do k-means++ clustering as well?

k-means++ is a routine to suggest center points before the classical k-means is called. The following lines of code will do that, where X is a matrix of data points, as requested for kmeans, and k the number of centers:

    kmpp <- function(X, k) {
        n <- nrow(X)
        C <- numeric(k)
        C[1] <- sample(1:n, 1)
        for (i in 2:k) {
            dm <- distmat(X, X[C, ])
            pr <- apply(dm, 1, min);  pr[C] <- 0
            C[i] <- sample(1:n, 1, prob = pr)
        }
        kmeans(X, X[C, ])
    }

Here distmat(a, b) should return the distances between the rows of two matrices a and b. There may be several implementations in R; one is distmat() in package pracma. Please note that AFAIK it is not clear whether the approach of k-means++ is really better than, e.g., kmeans with several restarts. Hans Werner

Thanks, -- Dr. Ferebee Tunno, Assistant Professor, Department of Mathematics and Statistics, Arkansas State University, P.O. Box 70, State University, AR 72467, ftu...@astate.edu, (870) 329-7710
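A quick usage sketch for the kmpp() function above on a toy two-cluster mixture (assumes pracma for distmat; the data and seed are illustrative only):

```r
# Trying kmpp() on two well-separated Gaussian blobs
library(pracma)
set.seed(42)
X <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),   # blob near (0, 0)
           matrix(rnorm(100, mean = 5), ncol = 2))   # blob near (5, 5)
km <- kmpp(X, 2)
km$centers    # two centers, ideally near (0,0) and (5,5)
km$tot.withinss
```

Because the k-means++ seeding picks far-apart starting centers with high probability, a single run usually suffices here, whereas plain kmeans(X, 2) may need nstart > 1.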
Re: [R] Base function for flipping matrices
Hadley Wickham <hadley at rice.edu> writes: See functions flipud(), fliplr() in package 'matlab' (or 'pracma'). Those are the names of the corresponding functions in MATLAB. Hans Werner

Hi all, Are there base functions that do the equivalent of this?

    fliptb <- function(x) x[nrow(x):1, ]
    fliplr <- function(x) x[, ncol(x):1]

Obviously not hard to implement (although it needs some more checks), just wondering if it had already been implemented. Hadley
Re: [R] Union/Intersect two continuous sets
谢一鸣 <cutexym at gmail.com> writes: Dear All, It is my first time using this mailing list to post a question, and I sincerely hope that this will not bother any subscribers. So far as I know, there are functions like union(), which can help to combine two sets of discrete data. But what if the data sets contain continuous data? For instance, I have three sets like [2, 4], [5, 6], [3.4, 5.5]. I wonder if there is a quick solution to recognize that their union is [2, 6], and return the size of this union, 6-2=4, to me. There is a similar demand for the intersection operation. If you have any idea, please inform me. Thanks in advance! Xie

I got interested in your request as I could use it myself (not for intervals, but for a similar kind of problem). I assume you represent your set of intervals as a matrix M with two columns, the first column the left endpoints, the second the right ones. Intersection is easy, as it is the interval from the maximum of the left endpoints to the minimum of the right endpoints (or NULL if that maximum is greater than the minimum). For the union I didn't see a simple, i.e. one- or two-line, solution ahead, so here is my dumb, looped version. Surely more elegant solutions are coming.

    # Mnew the union of intervals in M, an (nx2)-matrix with n > 1:
    o <- order(M[, 1], M[, 2])
    L <- M[o, 1];  R <- M[o, 2]
    k <- 1
    Mnew <- matrix(c(L[k], R[k]), 1, 2)
    for (i in 2:nrow(M)) {
        if (L[i] <= Mnew[k, 2]) {
            Mnew[k, 2] <- max(R[i], Mnew[k, 2])
        } else {
            k <- k + 1
            Mnew <- rbind(Mnew, c(L[i], R[i]))
        }
    }

    # Inew the intersection of intervals in M, an (nx2)-matrix with n > 1:
    L <- max(M[, 1]);  R <- min(M[, 2])
    Inew <- if (L <= R) c(L, R) else c()
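A self-contained check of the OP's example, using a compact function-wrapped variant of the merge loop above (`interval_union` is a hypothetical helper name):

```r
# Union of intervals given as rows (left, right) of a matrix
interval_union <- function(M) {
  o <- order(M[, 1]); M <- M[o, , drop = FALSE]   # sort by left endpoint
  out <- M[1, , drop = FALSE]
  for (i in 2:nrow(M)) {
    k <- nrow(out)
    if (M[i, 1] <= out[k, 2])                      # overlaps the last interval
      out[k, 2] <- max(out[k, 2], M[i, 2])         # extend it
    else
      out <- rbind(out, M[i, ])                    # start a new interval
  }
  out
}

M <- rbind(c(2, 4), c(5, 6), c(3.4, 5.5))
u <- interval_union(M)
u                      # a single row: 2 6
sum(u[, 2] - u[, 1])   # total size: 4
```

This reproduces the requested answer: the union [2, 6] with size 4, because [3.4, 5.5] bridges the gap between [2, 4] and [5, 6].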
Re: [R] axis tick colors: only one value allowed?
Carl Witthoft <carl at witthoft.com> writes: Hi, So far as I can tell, the 'col.ticks' parameter for axis() only uses the first value provided. E.g.:

    plot(0:1, 0:1, col.ticks=c('blue','red','green'))  # all ticks are blue

Just wondering if there's a different option in the basic plot commands that can handle multiple colors, and also whether ggplot and/or lattice allow for multiple tick colors.

See `?axis` or try this:

    plot(0:1, 0:1, type = "n", axes = FALSE)
    box()
    axis(side=1, lwd.ticks = 2, col.ticks="blue")
    axis(side=2, lwd.ticks = 2, col.ticks="red")

[...] Carl
Re: [R] nls start values
Niklaus Fankhauser <niklaus.fankhauser at cell.biol.ethz.ch> writes: I'm using nls to fit periodic gene-expression data to sine waves. I need to set the upper and lower boundaries because I do not want any negative phase and amplitude solutions. This means that I have to use the port algorithm. The problem is that, depending on what start value I choose for phase, the fit works for some cases but not for others. In the example below, the fit works using phase=pi, but not using phase=0. But there are many examples which fit just fine using 0. Is there a comparable alternative to nls that is not so extremely influenced by the start values?

Use package `nls2` to first search on a grid, and then apply `nls` again to identify the globally best point:

    # Data for example fit
    afreq <- 1 / (24 / 2 / pi)
    tpoints <- c(0,0,0,2,2,2,4,4,4,6,6,6,8,8,8,12,12,12,
                 14,14,14,16,16,16,18,18,18,20,20,20,24,24,24)
    gene_expression <- c(1.551383, 1.671742, 1.549499, 1.694480, 1.632436,
                         1.471568, 1.623381, 1.579361, 1.809394, 1.753223,
                         1.685918, 1.754968, 1.963069, 1.820690, 1.985159,
                         2.205064, 2.160308, 2.120189, 2.194758, 2.165993,
                         2.189981, 2.098671, 2.122207, 2.012621, 1.963610,
                         1.884184, 1.955160, 1.801175, 1.829686, 1.773260,
                         1.588768, 1.563774, 1.559192)
    shift <- mean(gene_expression)   # y-axis (expression) shift

    # Grid search
    library(nls2)
    frml <- gene_expression ~ sin(tpoints * afreq + phase) * amp + shift
    startdf <- data.frame(phase = c(0, 2*pi), amp = c(0, 2))
    nls2(frml, algorithm = "grid-search", start = startdf,
         control = list(maxiter = 200))

    # Perfect fit
    startvals <- list(phase = 4.4880, amp = 0.2857)
    sine_nls <- nls(frml, start = startvals)
    # phase     amp
    # 4.3964  0.2931
    # residual sum-of-squares: 0.1378

Maybe this can be done in one step. Hans Werner
Re: [R] efficiently finding the integrals of a sequence of functions
JeffND <Zuofeng.Shang.5 at nd.edu> writes: Hi folks, I have a question about efficiently finding the integrals of a list of functions.

We had the same discussion last month under the heading "performance of adaptIntegrate vs. integrate", see https://stat.ethz.ch/pipermail/r-help/2011-November/295260.html . It also depends on the accuracy you want to achieve. Adaptive methods will be slower, as the adaptation will be different in each single step for each function, i.e. no vectorization here. Non-adaptive Gaussian quadrature appears to be a good candidate. Assume you have found grid points x_i and weights w_i for your interval [a, b]. Then, if F is the matrix with F_ij = f_j(x_i), the integrals can all be computed at once with w %*% F. Example: a function that returns x, x^2, ..., x^5 in columns:

    f <- function(x) cbind(x, x^2, x^3, x^4, x^5)

The grid points and weights for the interval [0, 1] are:

    x <- c(0.025446, 0.129234, 0.297077, 0.50, 0.702923, 0.870766, 0.974554)
    w <- c(0.064742, 0.139853, 0.190915, 0.208980, 0.190915, 0.139853, 0.064742)

and the integrals for these five functions are:

    w %*% f(x)
    # 0.5 0.334 0.25 0.2 0.167

Functions to calculate Gaussian points and weights are mentioned in the thread above. Hans Werner

To be specific, here is a simple example showing my question. Suppose we have a function f defined by

    f <- function(x, y, z) c(x, y^2, z^3)

Thus, f actually corresponds to three one-dimensional functions f_1(x)=x, f_2(y)=y^2 and f_3(z)=z^3. What I am looking for are the integrals of these three functions f_1, f_2, f_3 over some interval, say, (0,1). More specifically, the integrals \int_0^1 f_1(x) dx, \int_0^1 f_2(y) dy and \int_0^1 f_3(z) dz. For this simple example, of course, we can do these three integrals one by one using

    integrate(f_1, lower=0, upper=1)

However, in practice I have a sequence of 5000 one-dimensional functions and hope to find all of their integrals (over some regions), so using loops to do this one by one is not efficient.
A possible idea is to convert the sequence of functions to a list of objects and use sapply(), which allows us to find the integrals in a vectorized fashion. But I don't know how to convert the above example function f to a list. In my research problem, I can only define the 5000 functions in a vectorized way like

    f <- function(x1, x2, ..., x5000) c(f1(x1), f2(x2), ..., f5000(x5000))

So how to convert it to a list will be a crucial step to efficiently get the integrals. Thanks for all the suggestions! Jeff
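The nodes and weights quoted above can be generated in base R with the Golub-Welsch eigenvalue method, so no table lookup is needed. A sketch (`gauss_legendre` is a hypothetical helper; the thread also mentions packages that provide these values):

```r
# Golub-Welsch: Gauss-Legendre nodes/weights on [-1, 1] from the Jacobi matrix
gauss_legendre <- function(n) {
  k <- 1:(n - 1)
  b <- k / sqrt(4 * k^2 - 1)                    # off-diagonal recurrence terms
  J <- diag(0, n)
  J[cbind(k, k + 1)] <- b                        # symmetric tridiagonal matrix
  J[cbind(k + 1, k)] <- b
  e <- eigen(J, symmetric = TRUE)
  list(x = e$values,                             # nodes = eigenvalues
       w = 2 * e$vectors[1, ]^2)                 # weights from first components
}

# Rescale to [0, 1] and integrate x, x^2, ..., x^5 in one matrix product
gl <- gauss_legendre(7)
x <- (gl$x + 1) / 2
w <- gl$w / 2
f <- function(x) cbind(x, x^2, x^3, x^4, x^5)
drop(w %*% f(x))   # 0.5000 0.3333 0.2500 0.2000 0.1667
```

With n = 7 the rule is exact for polynomials up to degree 13, which is why all five integrals come out exact here; for 5000 functions the whole job is still a single w %*% F product.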
Re: [R] Intersection of 2 matrices
Michael Kao <mkao006rmail at gmail.com> writes: Your solution is fast, but not completely correct, because you are also counting possible duplicates within the second matrix. The 'refitted' function could look as follows:

    compMat2 <- function(A, B) {
        # count rows of B present in A
        B0 <- B[!duplicated(B), ]
        na <- nrow(A);  nb <- nrow(B0)
        AB <- rbind(A, B0)
        ab <- duplicated(AB)[(na+1):(na+nb)]
        return(sum(ab))
    }

and testing an example of the size the OP was asking for:

    set.seed(8237)
    A <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
    B <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)
    system.time(n <- compMat2(A, B))
    # n = 3790

while compMat() will return 5522 rows, with 1732 duplicates within B! A 3.06 GHz iMac needs about 2 -- 2.5 seconds. Hans Werner

On 2/12/2011 2:48 p.m., David Winsemius wrote: On Dec 2, 2011, at 4:20 AM, oluwole oyebamiji wrote: Hi all, I have a matrix A of 67420 by 2 and another matrix B of 59199 by 2. I would like to find the number of rows of matrix B that I can find in matrix A (rows that are common to both matrices, with or without sorting). I have tried the intersection and is.element functions in R, but they only work for vectors and not matrices, i.e. intersection(A, B) and is.element(A, B). Have you considered the 'duplicated' function? Here is an example based on the duplicated function:

    test.mat1 <- matrix(1:20, nc = 5)
    test.mat2 <- rbind(test.mat1[sample(1:4, 2), ], matrix(101:120, nc = 5))
    compMat <- function(mat1, mat2){
        nr1 <- nrow(mat1)
        nr2 <- nrow(mat2)
        mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
    }
    compMat(test.mat1, test.mat2)
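An alternative that avoids the big rbind is to match rows via character keys, which gives the same count of unique B rows present in A. A sketch (`rows_in` is a hypothetical helper; "\r" is just an unlikely separator so distinct rows never collide):

```r
# Count unique rows of B that also occur in A, via keyed matching
rows_in <- function(B, A) {
  keyA <- do.call(paste, c(as.data.frame(A), sep = "\r"))
  keyB <- do.call(paste, c(as.data.frame(B), sep = "\r"))
  sum(unique(keyB) %in% keyA)
}

A <- rbind(c(1, 2), c(3, 4))
B <- rbind(c(3, 4), c(3, 4), c(5, 6))
rows_in(B, A)   # 1  (the duplicated row c(3,4) is counted once)
```

Like compMat2(), this counts each distinct B row at most once; dropping the unique() call would instead count every occurrence in B.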
Re: [R] Intersection of 2 matrices
Michael Kao <mkao006rmail at gmail.com> writes: Well, taking a second look, I'd say it depends on the exact formulation. In the applications I have in mind, I would like to count each occurrence in B only once. Perhaps the OP never thought about duplicates in B. Hans Werner

Here is an example based on the duplicated function:

    test.mat1 <- matrix(1:20, nc = 5)
    test.mat2 <- rbind(test.mat1[sample(1:4, 2), ], matrix(101:120, nc = 5))
    compMat <- function(mat1, mat2){
        nr1 <- nrow(mat1)
        nr2 <- nrow(mat2)
        mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
    }
    compMat(test.mat1, test.mat2)
Re: [R] x, y for point of intersection
Monica has sent me some data and code for taking a quick look. As it turned out, there was a simple programming error on her side. The segm_distance() function in package 'pracma' works correctly, and there is no minimization procedure in it; it simply solves some equations from plane geometry. Maybe the function suggested by Don is faster; I haven't checked that. Regards, Hans Werner

2011/11/29 Monica Pisica <pisican...@hotmail.com>: Hi again, Working with my real data and not with the little example I sent to the list, I discovered that the segm_distance function from package pracma does not converge to 0 in all cases, even if I increase the number of iterations to 10,000, for example. It seems to depend on the initialization point -- most likely a minimization function. So my thanks go to Don, whose suggestion works for the real data as well without any problems -- so far ;-) He suggested using the function crossing.psp from package spatstat. Thanks again to all who have answered and helped to solve my problem. I certainly learned a few new things. Monica

From: macque...@llnl.gov, Wed, 23 Nov 2011: The function crossing.psp() in the spatstat package might be of use. Here's an excerpt from its help page:

    crossing.psp    package:spatstat    R Documentation
    Crossing Points of Two Line Segment Patterns
    Description: Finds any crossing points between two line segment patterns.
    Usage: crossing.psp(A, B)

-Don -- Don MacQueen, Lawrence Livermore National Laboratory, 7000 East Ave., L-627, Livermore, CA 94550, 925-423-1062
Re: [R] x, y for point of intersection
Monica Pisica <pisicandru at hotmail.com> writes: Hi everyone, I am trying to get the point of intersection between a polyline and a straight line, and to get the x and y coordinates of this point. For exemplification consider this:

    set.seed(123)
    k1 <- rnorm(100, mean=1.77, sd=3.33)
    k1 <- sort(k1)
    q1 <- rnorm(100, mean=2.37, sd=0.74)
    q1 <- sort(q1, decreasing = TRUE)
    plot(k1, q1, xlim = c(min(k1)-5, max(k1)+5), type="l")
    xa <- -5; ya <- 2
    xb <- 12; yb <- 4
    lines(c(xa, xb), c(ya, yb), col = 2)

I want to get the x and y coordinates of the intersection of the 2 lines ...

    m <- (ya-yb)/(xa-xb)
    b <- ya - m*xa
    ln <- loess(q1 ~ k1)
    lines(ln)

It is clear that the x, y will satisfy both the linear equation y = m*x + b and the ln polyline ... but while I can visualize the equation of the straight line, I have problems with the polyline. I will appreciate any ideas to solve this problem. I thought it had a trivial solution, but it seems I cannot see it.

You could apply the function segm_distance in package 'pracma'. If the distance between two segments is 0, it returns the intersection point:

    p1 <- c(xa, ya); p2 <- c(xb, yb)
    for (i in 2:100) {
        p3 <- c(k1[i-1], q1[i-1]); p4 <- c(k1[i], q1[i])
        s <- segm_distance(p1, p2, p3, p4)
        if (s$d == 0) break
    }
    s$p
    # 0.2740154 2.6204724
    points(s$p[1], s$p[2], pch="+", col="red")

Thanks, Monica
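For reference, the segment-segment intersection itself needs no package: solving the two parametric line equations with cross products does it in a few lines. A sketch (`seg_intersect` is a hypothetical helper), returning NULL when the segments are parallel or do not overlap:

```r
# Intersection point of segments p1-p2 and p3-p4, or NULL if none
seg_intersect <- function(p1, p2, p3, p4) {
  d1 <- p2 - p1; d2 <- p4 - p3
  den <- d1[1]*d2[2] - d1[2]*d2[1]                 # cross(d1, d2)
  if (abs(den) < .Machine$double.eps) return(NULL) # parallel segments
  t <- ((p3[1]-p1[1])*d2[2] - (p3[2]-p1[2])*d2[1]) / den
  u <- ((p3[1]-p1[1])*d1[2] - (p3[2]-p1[2])*d1[1]) / den
  if (t < 0 || t > 1 || u < 0 || u > 1) return(NULL)  # outside both segments
  p1 + t * d1
}

seg_intersect(c(0,0), c(2,2), c(0,2), c(2,0))   # 1 1
```

Run against each polyline segment in turn, this reproduces what the segm_distance() loop above does, just without the distance detour.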
Re: [R] performance of adaptIntegrate vs. integrate
baptiste auguie <baptiste.auguie at googlemail.com> writes: Dear list, [cross-posting from Stack Overflow, where this question has remained unanswered for two weeks] I'd like to perform a numerical integration in one dimension,

    I = int_a^b f(x) dx

where the integrand f: x in IR -> f(x) in IR^p is vector-valued. integrate() only allows scalar integrands, thus I would need to call it many (p=200, typically) times, which sounds suboptimal. The cubature package seems well suited, as illustrated below:

    library(cubature)
    Nmax <- 1e3
    tolerance <- 1e-4
    integrand <- function(x, a=0.01) c(exp(-x^2/a^2), cos(x))
    adaptIntegrate(integrand, -1, 1, tol=tolerance, fDim=2, maxEval=Nmax)$integral
    # [1] 0.01772454 1.68294197

However, adaptIntegrate appears to perform quite poorly when compared to integrate. Consider the following example (one-dimensional integrand):

    library(cubature)
    integrand <- function(x, a=0.01) exp(-x^2/a^2)*cos(x)
    Nmax <- 1e3
    tolerance <- 1e-4

    # using cubature's adaptIntegrate
    time1 <- system.time(replicate(1e3, {
        a <- adaptIntegrate(integrand, -1, 1, tol=tolerance, maxEval=Nmax)
    }))
    # using integrate
    time2 <- system.time(replicate(1e3, {
        b <- integrate(integrand, -1, 1, rel.tol=tolerance, subdivisions=Nmax)
    }))

    time1
    #  user  system elapsed
    # 2.398   0.004   2.403
    time2
    #  user  system elapsed
    # 0.204   0.004   0.208

    a$integral
    # [1] 0.0177241
    b$value
    # [1] 0.0177241
    a$functionEvaluations
    # [1] 345
    b$subdivisions
    # [1] 10

Somehow, adaptIntegrate was using many more function evaluations for a similar precision. Both methods apparently use Gauss-Kronrod quadrature, though ?integrate adds Wynn's epsilon algorithm. Could that explain the large timing difference?

Cubature is astonishingly slow here, though it was robust and accurate in most cases where I used it. You may write to the maintainer for more information.
Even a pure R implementation of adaptive Gauss-Kronrod, as in pracma::quadgk, is faster than cubature:

    library(pracma)
    time3 <- system.time(replicate(1e3, {
        d <- quadgk(integrand, -1, 1)   # 0.0177240954011303
    }))
    #  user  system elapsed
    # 0.899   0.002   0.897

If your functions are somehow similar and you are willing to dispense with the adaptive part, then compute Gaussian quadrature nodes xi and weights wi for the fixed interval and compute sum(wi * fj(xi)) for each function fj. This avoids recomputing nodes and weights for every call or function. Package 'gaussquad' provides such nodes and weights. Or copy the relevant calculations from quadgk (the usual (7,15)-rule). Regards, Hans Werner

I'm open to suggestions of alternative ways of dealing with vector-valued integrands. Thanks. baptiste
Re: [R] optim seems to be finding a local minimum
Ben Bolker <bbolker at gmail.com> writes: Simulated annealing and other stochastic global optimization methods are also possible solutions, although they may or may not work better than the many-starting-points solution -- it depends on the problem, and pretty much everything has to be tuned. Tabu search (http://en.wikipedia.org/wiki/Tabu_search) is another possibility, although I don't know much about it ...

It is known that the Excel Solver has much improved during recent years. Still, there are slightly better points, such as

    myfunc(c(0.889764228112319, 94701144.5712312))
    # 334.18844

restricting the domain to [0, 1] x [0, 10^9] for an evolutionary approach, for instance DEoptim::DEoptim(). Finding a global optimum in 2 dimensions is not so difficult. Here the scale of the second variable could pose a problem, as small local minima might be overlooked easily. Hans Werner
Re: [R] Linear programming problem, RGPLK - no feasible solution.
Liu Evans, Gareth Gareth.Liu-Evans at liverpool.ac.uk writes:

In my post at https://stat.ethz.ch/pipermail/r-help/2011-October/292019.html I included an undefined term ej. The problem code should be as follows. It seems like a simple linear programming problem, but for some reason my code is not finding the solution.

    obj  <- c(rep(0, 3), 1)
    col1 <- c(1, 0, 0, 1, 0, 0, 1, -2.330078923, 0)
    col2 <- c(0, 1, 0, 0, 1, 0, 1, -2.057855981, 0)
    col3 <- c(0, 0, 1, 0, 0, 1, 1, -1.885177032, 0)
    col4 <- c(-1, -1, -1, 1, 1, 1, 0, 0, 1)
    mat  <- cbind(col1, col2, col3, col4)
    dir  <- c(rep("<=", 3), rep(">=", 3), rep("==", 2), ">=")
    rhs  <- c(rep(0, 7), 1, 0)
    sol  <- Rglpk_solve_LP(obj, mat, dir, rhs, types = NULL, max = FALSE,
                           bounds = c(-100, 100), verbose = TRUE)

The R output says there is no feasible solution, but e.g. (-2.3756786, 0.3297676, 2.0459110, 2.3756786) is feasible. The output is:

    GLPK Simplex Optimizer, v4.42
    9 rows, 4 columns, 19 non-zeros
    0: obj = 0.0e+000  infeas = 1.000e+000 (2)
    PROBLEM HAS NO FEASIBLE SOLUTION

Please have a closer look at the help page ?Rglpk_solve_LP. The way to define the bounds is a bit clumsy, but then it works:

    sol <- Rglpk_solve_LP(obj, mat, dir, rhs, types = NULL, max = FALSE,
                          bounds = list(lower = list(ind = 1:4, val = rep(-100, 4)),
                                        upper = list(ind = 1:4, val = rep(100, 4))),
                          verbose = TRUE)
    GLPK Simplex Optimizer, v4.42
    9 rows, 4 columns, 19 non-zeros
      0: obj = -1.0e+02  infeas = 1.626e+03 (2)
    *10: obj =  1.0e+02  infeas = 0.000e+00 (0)
    *13: obj =  2.247686558e+00  infeas = 0.000e+00 (0)
    OPTIMAL SOLUTION FOUND

    sol
    $optimum
    [1] 2.247687
    $solution
    [1] -2.247687e+00 -6.446292e-31  2.247687e+00  2.247687e+00

One other thing, a possible bug: if I run this code with dir shorter than it should be, R crashes. My version of R is 2.13.1 (svn rev 56322), and I'm running it on Windows 7.

If you can reproduce that R crashes -- which it should never do -- inform the maintainer of this package. On the Mac it doesn't crash; it goes into an infinite loop with

    Error detected in file glplib03.c at line 83
    Execution aborted.
Regards, Hans Werner Regards, Gareth __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Binary optimization problem in R
Michael Haenlein haenlein at escpeurope.eu writes:

Dear all, I would like to solve a problem similar to a multiple knapsack problem and am looking for a function in R that can help me. Specifically, my situation is as follows: I have a list of n items which I would like to allocate to m groups of fixed size. Each item has a certain profit value, and this profit depends on which group the item is in. My problem is to allocate the items to groups so that the overall profit is maximized while respecting the fixed size of each group. Take the following example with 20 items (n = 20) and 5 groups (m = 5):

    set.seed(1)
    profits <- matrix(runif(100), nrow = 20)
    size <- c(2, 3, 4, 5, 6)

The matrix 'profits' describes the profit of each item when it is allocated to a certain group. For example, when item 1 is allocated to group 1 it generates a profit of 0.26550866. However, when item 1 is allocated to group 2 it generates a profit of 0.93470523. The vector 'size' describes the size of each group. So group 1 can contain 2 items, group 2 can contain 3 items, group 3 can contain 4 items, etc. I think this is probably something that could be done with constrOptim() but I'm not exactly sure how.

The task you describe is an integer optimization problem and can most likely not be solved with a numeric optimization routine like constrOptim(). In the example above, the solution I get is a maximal profit of 16.31194 with these allocations:

    group 1: 4, 18
    group 2: 1, 9, 15
    group 3: 3, 6, 11, 12
    group 4: 8, 10, 16, 17, 20
    group 5: 2, 5, 7, 13, 14, 19

(mis)using the mixed-integer package 'lpSolve' in the following way (with 100 binary variables):

    library(lpSolve)
    obj <- -c(profits)    # we will look for a minimum
    dir <- rep("==", 25)  # discrete equality constraints
    rhs <- c(c(2, 3, 4, 5, 6), rep(1, 20))
    # assign 2, 3, 4, 5, 6 items to groups 1--5 resp.
    mat <- matrix(0, 25, 100)
    mat[1, 1:20]  <- 1; mat[2, 21:40] <- 1
    mat[3, 41:60] <- 1; mat[4, 61:80] <- 1; mat[5, 81:100] <- 1
    # assign each item to only one group
    for (i in 1:20) {
        mat[i+5, c(i, i+20, i+40, i+60, i+80)] <- 1
    }
    sol <- lp("min", obj, mat, dir, rhs, all.bin = TRUE)

    # RESULT
    -sol$objval  # 16.31194

    # ALLOCATIONS (note: here '-' really is a subtraction of group offsets)
    which(sol$solution == 1) - rep(seq(0, 80, 20), 2:6)
    # 4 18 1 9 15 3 6 11 12 8 10 16 17 20 2 5 7 13 14 19

In case you have thousands of items and many groups, packages like 'lpSolve', 'Rglpk', or 'Rsymphony' will not be able to handle the task and you might be forced to look for tools outside the R realm.

Hans Werner

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fitting a sinus curve
David Winsemius dwinsemius at comcast.net writes:

On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote:

maaariiianne marianne.zeyringer at ec.europa.eu writes:

Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to the electricity load profile of a household (a curve with two peaks). I was thinking of checking whether a polynomial of 4th order, a sine/cosine combination, or a combination of 3 parabolas fits the data best. I have problems with the sine/cosine regression:

    time <- c( 0.00,  0.15,  0.30,  0.45,  1.00,  1.15,  1.30,  1.45,
               2.00,  2.15,  2.30,  2.45,  3.00,  3.15,  3.30,  3.45,
               4.00,  4.15,  4.30,  4.45,  5.00,  5.15,  5.30,  5.45,
               6.00,  6.15,  6.30,  6.45,  7.00,  7.15,  7.30,  7.45,
               8.00,  8.15,  8.30,  8.45,  9.00,  9.15,  9.30,  9.45,
              10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45,
              12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45,
              14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45,
              16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45,
              18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45,
              20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45,
              22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45)
    watt <- c( 94.1,  70.8,  68.2,  65.9,  63.3,  59.5,  55,    50.5,  46.6,  43.9,
               42.3,  41.4,  40.8,  40.3,  39.9,  39.5,  39.1,  38.8,  38.5,  38.3,
               38.3,  38.5,  39.1,  40.3,  42.4,  45.6,  49.9,  55.3,  61.6,  68.9,
               77.1,  86.1,  95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3,
              144.8, 146,   147.2, 148.4, 149.8, 151.5, 153.5, 156,   159,   162.4,
              165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8,
              149,   146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9,
              174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213,   210.4,
              203.9, 192.9, 179,   164.4, 151.5, 141.9, 135.3, 131,   128.2, 126.1,
              124.1, 121.6, 118.2, 113.4, 107.4, 100.8)
    df <- data.frame(time, watt)
    lmfit <- lm(time ~ watt + cos(time) + sin(time), data = df)

Your regression formula does not make sense to me.
You seem to expect a periodic function within 24 hours, and if not, it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations:

    library(pracma)
    plot(2*pi*time/24, watt, col = "red")
    ts  <- seq(0, 2*pi, len = 100)
    xs6 <- trigApprox(ts, watt, 6)
    xs8 <- trigApprox(ts, watt, 8)
    lines(ts, xs6, col = "blue",  lwd = 2)
    lines(ts, xs8, col = "green", lwd = 2)
    grid()

where, as examples, the trigonometric fits of degree 6 and 8 are used. I would not advise using higher orders, even if the fit is not perfect.

Thank you! That is a real gem of a worked example. Not only did it introduce me to a useful package I was not familiar with, but there was even a worked example in one of the help pages that might have specifically answered the question about getting a 2nd(?) order trig regression. If I understood the commentary on that page, this method might also be appropriate for an irregular time series, whereas trigApprox and trigPoly would not?

That's true. For the moment, the trigPoly() function works correctly only with equidistant data between 0 and 2*pi.

This is adapted from the trigPoly help page in Hans Werner's pracma package:

The error I made myself was to take the 'time' variable literally, though obviously the numbers after the decimal point were meant as minutes. Thus time <- seq(0, 23.75, len = 96) would be a better choice. The rest of your adaptation is absolutely correct.
    A <- cbind(1, cos(pi*time/24), sin(pi*time/24),
               cos(2*pi*time/24), sin(2*pi*time/24))
    (ab <- qr.solve(A, watt))
    # [1] 127.29131 -26.88824 -10.06134 -36.22793 -38.56219
    ts <- seq(0, pi, length.out = 100)
    xs <- ab[1] + ab[2]*cos(ts) + ab[3]*sin(ts) + ab[4]*cos(2*ts) + ab[5]*sin(2*ts)
    plot(pi*time/24, watt, col = "red", xlim = c(0, pi), ylim = range(watt),
         main = "Trigonometric Regression")
    lines(ts, xs, col = "blue")

Hans: I corrected the spelling of "Trigonometric", but other than that I may well have introduced other errors for which I would be happy to be corrected. For instance, I'm unsure of the terminology regarding the ordinality of this model. I'm also not sure if my pi/24 and 2*pi/24 factors were correct in normalizing the time scale, although the prediction seemed sensible.

And yes, this curve is the best trigonometric approximation you can get for this order(?). You will see the same result when you apply and plot

    xs1 <- trigApprox(ts, watt, 1)

But I see your problem with the term 'order'. I will have a closer look at this and clarify the terminology on the help page. [All this reminds me of an article in the Mathematical Intelligencer some years ago where it was convincingly argued that the universal constant \pi should have the value 2*pi (in today's notation).]

Thanks, Hans Werner
Re: [R] fitting a sinus curve
maaariiianne marianne.zeyringer at ec.europa.eu writes:

Dear R community! I am new to R and would be very grateful for any kind of help. I am a PhD student and need to fit a model to the electricity load profile of a household (a curve with two peaks). I was thinking of checking whether a polynomial of 4th order, a sine/cosine combination, or a combination of 3 parabolas fits the data best. I have problems with the sine/cosine regression:

    time <- c( 0.00,  0.15,  0.30,  0.45,  1.00,  1.15,  1.30,  1.45,
               2.00,  2.15,  2.30,  2.45,  3.00,  3.15,  3.30,  3.45,
               4.00,  4.15,  4.30,  4.45,  5.00,  5.15,  5.30,  5.45,
               6.00,  6.15,  6.30,  6.45,  7.00,  7.15,  7.30,  7.45,
               8.00,  8.15,  8.30,  8.45,  9.00,  9.15,  9.30,  9.45,
              10.00, 10.15, 10.30, 10.45, 11.00, 11.15, 11.30, 11.45,
              12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45,
              14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45,
              16.00, 16.15, 16.30, 16.45, 17.00, 17.15, 17.30, 17.45,
              18.00, 18.15, 18.30, 18.45, 19.00, 19.15, 19.30, 19.45,
              20.00, 20.15, 20.30, 20.45, 21.00, 21.15, 21.30, 21.45,
              22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45)
    watt <- c( 94.1,  70.8,  68.2,  65.9,  63.3,  59.5,  55,    50.5,  46.6,  43.9,
               42.3,  41.4,  40.8,  40.3,  39.9,  39.5,  39.1,  38.8,  38.5,  38.3,
               38.3,  38.5,  39.1,  40.3,  42.4,  45.6,  49.9,  55.3,  61.6,  68.9,
               77.1,  86.1,  95.7, 105.8, 115.8, 124.9, 132.3, 137.6, 141.1, 143.3,
              144.8, 146,   147.2, 148.4, 149.8, 151.5, 153.5, 156,   159,   162.4,
              165.8, 168.4, 169.8, 169.4, 167.6, 164.8, 161.5, 158.1, 154.9, 151.8,
              149,   146.5, 144.4, 142.7, 141.5, 140.9, 141.7, 144.9, 151.5, 161.9,
              174.6, 187.4, 198.1, 205.2, 209.1, 211.1, 212.2, 213.2, 213,   210.4,
              203.9, 192.9, 179,   164.4, 151.5, 141.9, 135.3, 131,   128.2, 126.1,
              124.1, 121.6, 118.2, 113.4, 107.4, 100.8)
    df <- data.frame(time, watt)
    lmfit <- lm(time ~ watt + cos(time) + sin(time), data = df)

Your regression formula does not make sense to me.
You seem to expect a periodic function within 24 hours, and if not, it would still be possible to subtract the trend and then look at a periodic solution. Applying a trigonometric regression results in the following approximations:

    library(pracma)
    plot(2*pi*time/24, watt, col = "red")
    ts  <- seq(0, 2*pi, len = 100)
    xs6 <- trigApprox(ts, watt, 6)
    xs8 <- trigApprox(ts, watt, 8)
    lines(ts, xs6, col = "blue",  lwd = 2)
    lines(ts, xs8, col = "green", lwd = 2)
    grid()

where, as examples, the trigonometric fits of degree 6 and 8 are used. I would not advise using higher orders, even if the fit is not perfect.

Hans Werner

Thanks a lot, Marianne

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] interpolation and extremum location of a surface
Clement LAUZIN lauzin at hotmail.com writes:

Hello, I have an x, y, z file. z does not correspond to a simple analytical function of x and y (see below). I am trying to find the minimum location of this surface, and the z value at this location, by a spline interpolation or a polynomial fit. I tried the akima package, but as the point I am looking for (the minimum) is outside of the convex hull, it does not give an answer. If someone has any idea or suggestion? Thank you in advance, Pierre

According to a 2-dimensional barycentric Lagrange interpolation that I am implementing right now in R, the minimum appears to be at

    x0 = (4.230490, 68.52776);  fval = -175.7959

I certainly would be interested to hear of other reasonable locations of a minimum, as this may be a good test case. -- Hans Werner

    x     y   z
    4.1   60  -152.1719593
    [ ... ]
    4.35  90  -163.8389615

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
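The poster's data are elided above, but the barycentric approach can be illustrated on a toy grid. This is only a sketch under two assumptions: that pracma's barylag2d() is used for the 2-d barycentric Lagrange interpolation (the thread does not name a function), and that the quadratic toy surface below stands in for the real z values:

```r
library(pracma)

# toy grid standing in for the poster's (x, y, z) file -- hypothetical data
xn <- seq(4.10, 4.35, length.out = 6)
yn <- seq(60, 90, length.out = 7)
Z  <- outer(xn, yn,
            function(x, y) 100*(x - 4.23)^2 + ((y - 68.5)/10)^2 - 176)

# interpolate the gridded values and hand the interpolant to a local optimizer
fint <- function(p) barylag2d(Z, xn, yn, p[1], p[2])
optim(c(4.2, 70), fint)$par   # should land near (4.23, 68.5)
```

Because the interpolant is defined on the full rectangle spanned by the grid, this avoids the convex-hull restriction that stopped the akima-based attempt.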
Re: [R] matrix of higher order differences
Jeroen Ooms jeroenooms at gmail.com writes:

Is there an easy way to turn a vector of length n into an n by n matrix in which the diagonal equals the vector, the first off-diagonal equals the first order differences, the second... etc.? I.e. to do this more efficiently:

    diffmatrix <- function(x){
        n <- length(x)
        M <- diag(x)
        for (i in 1:(n-1)) {
            differences <- diff(x, differences = i)
            for (j in 1:length(differences)) {
                M[j, i+j] <- differences[j]
            }
        }
        M[lower.tri(M)] <- t(M)[lower.tri(M)]
        return(M)
    }
    x <- c(1, 2, 3, 5, 7, 11, 13, 17, 19)
    diffmatrix(x)

I do not know whether you will call the appended version more elegant, but at least it is much faster -- up to ten times for length(x) = 1000, i.e. less than 2 secs for generating and filling a 1000-by-1000 matrix. I also considered col(), row() indexing:

    M[col(M) == row(M) + k] <- x

Surprisingly (for me), this makes it even slower than your version with a double 'for' loop. -- Hans Werner

    diffmatrix <- function(x){
        n <- length(x)
        if (n == 1) return(x)
        M <- diag(x)
        for (i in 1:(n-1)) {
            x <- diff(x)            # use 'diff' in a loop
            for (j in 1:(n-i)) {    # length is known
                M[j, i+j] <- x[j]   # and reuse x
            }
        }
        M[lower.tri(M)] <- t(M)[lower.tri(M)]
        return(M)
    }

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How do I modify uniroot function to return .0001 if error ?
eric ericstrom at aol.com writes:

I am calling the uniroot function from inside another function using these lines (the last two lines of the function):

    d <- uniroot(k, c(.001, 250), tol = .05)
    return(d$root)

The problem is that on occasion there's a problem with the values I'm passing to uniroot. In those instances uniroot stops and sends a message that it can't calculate the root because f.upper * f.lower is greater than zero. All I'd like to do in those cases is set the return value of my calling function to .0001. But I'm not sure how to pull that off. I tried a few modifications to uniroot but so far no luck.

Do not modify uniroot(). Use 'try' or 'tryCatch', for example:

    e <- try( d <- uniroot(k, c(.001, 250), tol = .05), silent = TRUE )
    if (class(e) == "try-error") {
        return(0.0001)
    } else {
        return(d$root)
    }

--Hans Werner

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
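The same fallback can be written more compactly with tryCatch(), which jumps straight to an error handler instead of inspecting the class of a try() result afterwards. A sketch of the pattern, with `k` standing for the poster's function and `safe_root` a name chosen here for illustration:

```r
# returns the root of k on [0.001, 250], or 0.0001 if uniroot() fails
# (e.g. because f.lower * f.upper > 0)
safe_root <- function(k) {
  tryCatch(uniroot(k, c(0.001, 250), tol = 0.05)$root,
           error = function(e) 0.0001)
}

safe_root(function(x) x - 1)    # sign change on the interval: root near 1
safe_root(function(x) x^2 + 1)  # no sign change: falls back to 0.0001
```

The handler receives the condition object `e`, so it could also log the failing message before returning the sentinel value.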
Re: [R] question about calculating derivatives for vectors
Mingwei Min mm809 at cam.ac.uk writes:

Hi all, I am trying to calculate derivatives for vectors using R. I have a numerical vector, in time-series form, which contains NAs. I want to know the rate of change throughout the whole time range. I am wondering if there is a function in R that can do this. In the R references I only found functions for calculating derivatives of expressions. For example, a vector like this:

    x <- c(0.997549748, 0.98039798, 0.963097713, 0.930576181, 0.907484407,
           0.875705376, 0.826477576, NA, 0.772794773)

Thanks in advance. Mingwei

First you have to decide how to interpolate the data points, e.g. through a polynomial fit (of degree 2, 3, ...), some Newton or Lagrange interpolation, spline fitting (see splinefun()), etc. Then you can differentiate this fitted function, e.g. directly if it is a polynomial, or numerically (see numDeriv::grad()). Most of these functions simply skip any NA values. --Hans Werner

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
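The spline route suggested above can be sketched in a few lines; the equally spaced time index is an assumption, since the thread does not give the sampling times:

```r
x <- c(0.997549748, 0.98039798, 0.963097713, 0.930576181, 0.907484407,
       0.875705376, 0.826477576, NA, 0.772794773)

t  <- seq_along(x)   # assumed: equally spaced sampling times
ok <- !is.na(x)

# fit an interpolating cubic spline through the non-NA points only
sf <- splinefun(t[ok], x[ok])

# splinefun() returns a function whose 'deriv' argument gives derivatives,
# so the rate of change is available at every time point, NA gaps included
sf(t, deriv = 1)
```

Note the derivative at the NA position is taken from the spline bridging the gap, so it is only as trustworthy as the interpolation there.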
Re: [R] linear constrained optimization in R
sammyny sjain at caa.columbia.edu writes:

I am trying to use http://rss.acs.unt.edu/Rdoc/library/stats/html/constrOptim.html in R to do optimization with some given linear constraints but am not able to figure out how to set up the problem. For example, I need to maximize $f(x,y) = log(x) + \frac{x^2}{y^2}$ subject to the constraints $g_1(x,y): x + y \le 1$, $g_2(x,y): x \ge 0$, and $g_3(x,y): y \ge 0$. How do I do this in R? This is just a hypothetical example. Do not worry about its structure; instead I am interested to know how to set this up in R. Thanks!

To get a reasonable solution, avoid coming near to x = 0 and y = 0. With x >= 0.0001 and y >= 0.0001, constrOptim() can be called as follows:

    constrOptim(c(0.25, 0.25), function(x) -log(x[1]) - x[1]^2/x[2]^2, NULL,
                matrix(c(-1, 1, 0, -1, 0, 1), 3, 2), c(-1, 0.0001, 0.0001))

--Hans Werner

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minimum distance between line segments
Darcy Webber darcy.webber at gmail.com writes:

Dear R helpers, I think that this may be a bit of a math question, as the more I consider it, the harder it seems. I am trying to come up with a way to work out the minimum distance between line segments. For instance, consider 20 random line segments:

    x1 <- runif(20)
    y1 <- runif(20)
    x2 <- runif(20)
    y2 <- runif(20)
    plot(x1, y1, type = "n")
    segments(x1, y1, x2, y2)

Initially I thought the solution to this problem was to work out the distance between midpoints (it quickly became apparent that this is totally wrong when looking at the plot). So I thought that perhaps finding the minimum distance between each of the lines' endpoints AND their midpoints would be a good proxy for this, so I set up a loop that uses Pythagoras to work out these 9 distances and find the minimum. But this solution is obviously flawed as well (sometimes lines actually intersect, sometimes the minimum distances are less, etc.). Any help/direction on this one would be much appreciated. Thanks in advance, Darcy.

A correct approach could proceed as follows: (1) Find out whether two segments intersect (e.g., compute the intersection point of the extended lines and compare its x-coords with those of the segment endpoints); if they do, the distance is 0, otherwise set it to Inf. (2) For each endpoint, compute the intersection point of the perpendicular line through this point with the other segment's line; if this point lies on the other segment, take the distance, otherwise compute the distance to the other two endpoints. (3) The minimum of all those distances is what you seek. I have done a fast implementation, but the code is so crude that I would not like to post it here. If you are really in need I could send it to you. --Hans Werner (I am not aware of a pure plane geometry package in R --- is there any?)
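Step (2) of the recipe above, the point-to-segment distance, might be sketched as follows; the function names are made up here, and the segment-to-segment helper is valid only for non-intersecting segments, so step (1)'s intersection test still has to be added:

```r
# distance from point p to the segment with endpoints a and b (2-vectors)
pt_seg_dist <- function(p, a, b) {
  d  <- b - a
  t0 <- if (sum(d^2) == 0) 0 else sum((p - a) * d) / sum(d^2)
  t0 <- max(0, min(1, t0))        # clamp the foot point onto the segment
  sqrt(sum((a + t0 * d - p)^2))
}

# for NON-intersecting segments, the minimum distance is attained at one
# of the four endpoints, so the four point-to-segment distances suffice
seg_seg_dist <- function(a1, b1, a2, b2) {
  min(pt_seg_dist(a1, a2, b2), pt_seg_dist(b1, a2, b2),
      pt_seg_dist(a2, a1, b1), pt_seg_dist(b2, a1, b1))
}

seg_seg_dist(c(0, 0), c(1, 0), c(0, 1), c(1, 1))  # parallel, unit apart: 1
```

Clamping t0 to [0, 1] is what replaces the "does the perpendicular foot lie on the segment" case analysis in the prose: outside that range the nearest point is automatically an endpoint.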
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding pairs with least magnitude difference from mean
rex.dwyer at syngenta.com writes:

James, it seems the 2*mean(x) term is irrelevant if you are seeking to minimize the sd. Then you want to sort the distances from smallest to largest. Then it seems clear that your five values will be adjacent in the list, since if you have a set of five adjacent values, exchanging any of them for one further away in the list will increase the sd. The only problem I see with this is that you can't use a number more than once. In any case, you need to compute the best five pairs beginning at position i in the sorted list, for 1 <= i <= choose(n,2), then take the max over all i. There's no R in my answer, such as you'd notice, but I hope it helps just the same. Rex

You probably mean something like the following:

    x <- rnorm(10)
    y <- outer(x, x, "+") - (2 * mean(x))
    o <- order(x)
    sd(c(y[o[1], o[10]], y[o[2], o[9]], y[o[3], o[8]],
         y[o[4], o[7]], y[o[5], o[6]]))

This seems reasonable, though you would have to supply a more stringent argument. I did two tests and it works all right. --Hans Werner

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Finding pairs with least magnitude difference from mean
I have what I think is some kind of linear programming question. Basically, what I want to figure out is: if I have a vector of numbers,

    x <- rnorm(10)
    x
    #  [1] -0.44305959 -0.26707077  0.07121266  0.44123714 -1.10323616
    #  [6] -0.19712807  0.20679494 -0.98629992  0.97191659 -0.77561593
    mean(x)
    # [1] -0.2081249

using each number only once, I want to find the set of five pairs where the magnitude of the difference between mean(x) and each pair's sum is least.

    y <- outer(x, x, "+") - (2 * mean(x))

With this matrix, if I put together a combination of pairs which uses each number only once, the sum of the corresponding numbers is 0. For example, compare the SD between this set of 5 pairs

    sd(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
    # [1] 1.007960

versus this hand-selected, possibly lowest-SD combination of pairs:

    sd(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
    # [1] 0.2367030

Your selection is not bad, as only about 0.4% of all possible distinct combinations have a smaller value -- the minimum is 0.1770076, attained for example by [10 7 9 5 8 4 6 2 3 1].

(1) combinations() from the 'combinat' package seems slow; try instead the permutations() function from 'e1071'.

(2) Yes -- except if your vector gets much larger, in which case brute force is no longer feasible.

(3) This is not a linear programming problem, but a combinatorial optimization task. You could try optim() with the SANN method, or some mixed-integer linear program (e.g., lpSolve, Rglpk, Rsymphony) by intelligently using binary variables to define the sets. This does not mean that some specialized approach might not be more appropriate.

--Hans Werner

I believe that if I could test all the various five-pair combinations, the combination with the lowest SD of values from the table would give me my answer. I believe I have 3 questions regarding my problem. 1) How can I find all the 5-pair combinations of my 10 numbers so that I can perform a brute-force test of each set of combinations? I believe there are 45 different pairs (i.e. choose(10,2)).
I found combinations from the {Combinations} package but I can't figure out how to get it to provide pairs. 2) Will my brute force strategy of testing the SD of each of these 5 pair combinations actually give me the answer I'm searching for? 3) Is there a better way of doing this? Probably something to do with real linear programming, rather than this method I've concocted. Thanks for any help you can provide regarding my question. Best regards, James __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
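For question (1), the distinct ways of splitting 10 numbers into 5 pairs can be enumerated directly with a small recursion; this sketch is not from the thread, and the helper name `pairings` is made up here:

```r
# enumerate all perfect pairings of an even-length index set;
# for 10 elements there are 9*7*5*3*1 = 945 of them, since the smallest
# remaining index can be matched with any of the others
pairings <- function(idx) {
  if (length(idx) == 0) return(list(list()))
  a   <- idx[1]
  out <- list()
  for (b in idx[-1]) {
    rest <- setdiff(idx, c(a, b))
    for (p in pairings(rest))
      out <- c(out, list(c(list(c(a, b)), p)))
  }
  out
}

length(pairings(1:10))   # 945 pairings: small enough to test them all
```

Given the matrix y from the question, the brute-force answer is then `min(sapply(pairings(1:10), function(p) sd(sapply(p, function(ij) y[ij[1], ij[2]]))))`; at 945 candidates this is trivial for n = 10, though it grows as (n-1)!! and so quickly becomes infeasible.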
Re: [R] Using uniroot() with output from deriv() or D()
I need to find the root of the second derivative of many curves and do not want to cut and paste the expression results from the deriv() or D() functions every time. Below is an example. What I need to do is refer to fn2nd in the uniroot() function, but when I try something like uniroot(fn2nd, c(0,1)) I get an error, so I have resorted to pasting in the expression, but this is highly inefficient. Thanks, J [...]

What is so wrong with using

    r <- uniroot(function(x) eval(fn2nd, list(x = x)),
                 interval = c(0, 1), tol = 0.0001)

(I thought you were almost there) or even

    fn2nd_fun <- function(x) eval(fn2nd, list(x = x))
    ex <- seq(from = 0, to = 1, length.out = 1000)
    y1 <- fn2nd_fun(ex)
    ...
    r <- uniroot(fn2nd_fun, interval = c(0, 1), tol = 0.0001)

--Hans Werner

    r$root
    abline(h = 0, col = "red")
    abline(v = r$root, col = "green")
    arrows(0.6, -2, r$root, 0, length = 0.1, angle = 30, code = 2, col = "red")
    text(0.765, -2.3, paste("b = ", r$root, sep = ""))

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
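A self-contained toy version of the wrapping idea may make it concrete; the expression below is illustrative, not the poster's curve:

```r
# symbolic second derivative of an illustrative expression
fn    <- expression(x^4 - 3*x^2 + x)
fn2nd <- D(D(fn[[1]], "x"), "x")      # 12 * x^2 - 6

# wrap the unevaluated expression in an ordinary function
fn2nd_fun <- function(x) eval(fn2nd, list(x = x))

# uniroot() can now be handed the wrapper directly;
# the sign change on (0, 1) is at x = sqrt(1/2)
uniroot(fn2nd_fun, interval = c(0, 1), tol = 0.0001)$root
```

The point of the wrapper is that D() returns an unevaluated call, not a function; eval() with a named list supplies the variable binding each time, so no cut-and-paste of the derivative expression is needed.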
Re: [R] Regarding Savitzky-Golay Smoothing Filter
reynolds pravindev reynoldspravindev at gmail.com writes:

Hi, when we use the sav_gol command in R, it shows an error which says: "error in as.matrix". We've downloaded the necessary packages. Kindly help us with this issue. If there is any other function to perform Savitzky-Golay smoothing in R, please let me know.

It would be good to mention the package where your 'sav_gol' is located. You could also provide your session info, so that others can see which packages are loaded. I guess you mean RTisean::sav_gol(). There is another implementation in the 'signal' package. Personally, I am still using a version you can find by RSiteSearch'ing the mailing list 2002-2007. --Hans Werner

With Regards, Reynolds

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
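For reference, a minimal call to the 'signal' implementation mentioned above; the noisy test signal and the filter settings (cubic polynomial, 11-point window) are chosen arbitrarily here:

```r
library(signal)

# noisy test signal (illustrative only)
set.seed(42)
x <- sin(seq(0, 4*pi, length.out = 200)) + rnorm(200, sd = 0.2)

# Savitzky-Golay smoothing: least-squares cubic fit over 11-point windows
xs <- sgolayfilt(x, p = 3, n = 11)
```

The filter length n must be odd and larger than the polynomial order p; larger n smooths more aggressively at the cost of flattening genuine peaks.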
Re: [R] Rubin's rules of multiple imputation
Joe P King jp at joepking.com writes:

Hello all, if I have multiple imputed data sets, is there a command or function in R, in any package you know of, to combine them? I know one common MI approach is Rubin's rules; is there a way to do this using his rules or others? I know there are ways, like using Amelia from Gary King's website, to create the imputed data sets, but how to make them into one, or combine them for analysis?

On Gary King's page you can find the following hint: "To automatically combine multiply imputed data sets: in R see Zelig; ..." and on CRAN there are the packages

    norm:   Analysis of multivariate normal datasets with missing values,
            based on routines described in Chapter 5 of Schafer (1997a)
    Amelia: Amelia II: A Program for Missing Data
            (see http://gking.harvard.edu/amelia)

--Hans Werner

Joe King, M.A., Ph.D. Student, University of Washington - Seattle, 206-913-2912, jp at joepking.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] optim function multi variables -newbie
michalseneca michalseneca at gmail.com writes:

I tried to modify the code, thanks for noticing... Now I get that the function cannot be evaluated at the initial parameters. However, I do not know what that means. Should I try to modify the parameters? I am still not sure of the syntax of the function. I cannot get that optimization to work correctly; I am trying to test that, however I cannot get to a result. In Matlab, however, it works correctly. Did anybody of you try to run it successfully? Michal

I went through your functions and the Matlab code; below I will append my versions. Matlab and R now return the same values, at least for a few test cases. One obstacle, for instance, was that solve() for polynomials (package 'polynom') expects the coefficients in reverse order. Here is an example of how to apply optim(). Whether the result is reasonable you'll have to find out for yourself:

    MktVol <- c(20.8, 19.8, 18.1, 16.1, 15.1, 14.5, 14.0, 14.3, 15.0, 15.9,
                16.2, 16.4, 16.6, 17.3, 19.1) / 100
    MktStrike <- c(4.3, 4.8, 5.3, 5.8, 6.3, 6.8, 7.3, 7.8, 8.3, 8.8,
                   9.3, 9.8, 10.3, 10.8, 11.3) / 100
    F <- 0.073
    ATMVol <- 0.14
    T <- 1
    b <- 1
    parsbox <- c(0.7, 0.5)
    f <- function(x) EstimateRhoAndVol(x, MktStrike, MktVol, ATMVol, F, T, b)
    optim(parsbox, f)
    # $par
    # [1] -0.06374008  0.60383201

Of course, there are some more open points, for instance: will 'findAlpha' always find a real, positive cubic root? etc.
--Hans Werner

### --- snip ---

    library(polynom)

    EstimateRhoAndVol <- function(params, MktStrike, MktVol, ATMVol, F, T, b) {
        # -------------------------------------------------------------
        # Returns the following SABR parameters:
        #   r = rho
        #   v = vol-of-vol
        # Uses the ATM volatility to estimate alpha.
        # Required inputs:
        #   MktStrike = vector of strikes
        #   MktVol    = vector of corresponding volatilities
        #   ATMVol    = ATM volatility
        #   F = spot price,  T = maturity,  b = beta parameter
        # -------------------------------------------------------------
        r <- params[1]
        v <- params[2]
        a <- findAlpha(F, F, T, ATMVol, b, r, v)
        N <- length(MktVol)
        ModelVol <- numeric(N)
        error <- numeric(N)
        # define the model volatility and the squared error terms
        for (i in 1:N) {
            ModelVol[i] <- SABRvol(a, b, r, v, F, MktStrike[i], T)
            error[i] <- (ModelVol[i] - MktVol[i])^2
        }
        # return the SSE
        y <- sum(error, na.rm = TRUE)
        # impose the constraint -1 <= rho <= +1
        # via a penalty on the objective function
        if (abs(r) > 1) y <- Inf
        return(y)
    }

    SABRvol <- function(a, b, r, v, F, K, T) {
        # -------------------------------------------------------------
        # Returns the SABR volatility.
        # Required inputs:
        #   a = alpha,  b = beta,  r = rho,  v = vol-of-vol,
        #   F = spot price,  K = strike price,  T = maturity
        # -------------------------------------------------------------
        # Separate into the ATM case (simplest case) and
        # the non-ATM case (most general case)
        if (abs(F - K) <= 0.001) {
            # ATM vol
            Term1 <- a / F^(1-b)
            Term2 <- ((1-b)^2/24 * a^2/F^(2-2*b) + r*b*a*v/4/F^(1-b) +
                      (2-3*r^2)*v^2/24)
            y <- Term1 * (1 + Term2*T)
        } else {
            # non-ATM vol
            FK <- F*K
            z <- v/a * FK^((1-b)/2) * log(F/K)
            x <- log((sqrt(1 - 2*r*z + z^2) + z - r) / (1-r))
            Term1 <- a / FK^((1-b)/2) /
                     (1 + (1-b)^2/24*log(F/K)^2 + (1-b)^4/1920*log(F/K)^4)
            if (abs(x - z) < 1e-10) Term2 <- 1 else Term2 <- z/x
            Term3 <- 1 + ((1-b)^2/24*a^2/FK^(1-b) + r*b*v*a/4/FK^((1-b)/2) +
                          (2-3*r^2)/24*v^2) * T
            y <- Term1 * Term2 * Term3
        }
        return(y)
    }

    findAlpha <- function(F, K, T, ATMvol, b, r, v) {
        # -------------------------------------------------------------
        # Finds the SABR alpha parameter by solving the cubic equation
        # described in West (2005), "Calibration of the SABR Model in
        # Illiquid Markets". The function can return multiple roots.
        # In that case, the program eliminates negative roots and selects
        # the smallest root among the positive roots that remain.
        # Required inputs:
        #   F = spot price,  K = strike price,  T = maturity,
        #   ATMvol = ATM market volatility,
        #   b = beta,  r = rho,  v = vol-of-vol
        # -------------------------------------------------------------
        # find the coefficients of the cubic equation for alpha
        C0 <- -ATMvol * F^(1-b)
        C1 <- (1 + (2-3*r^2) * v^2 * T / 24)
        C2 <- r * b * v * T / 4 / F^(1-b)
        C3 <- (1-b)^2 * T / 24 / F^(2-2*b)
        # return the roots of the cubic equation (multiple roots;
        # note that the coefficients are given in increasing order)
        AlphaVector <- solve(as.polynomial(c(C0, C1, C2, C3)))
        # find and return the smallest positive real root
        index <- which(Im(AlphaVector) == 0 & Re(AlphaVector) > 0)
        Alpha <- AlphaVector[index]
        if (length(Alpha) == 0) Alpha <- 0
        return(min(Re(Alpha)))
    }

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Hilbert Huang Transformation
Jumlong Vongprasert jumlong.ubru at gmail.com writes:

  Dear All, I try to use the Hilbert Huang Transformation in R. How can I do this? Many Thanks.

The Hilbert-Huang transformation is a combination of empirical mode decomposition (EMD) and Hilbert spectral analysis. For both see the package 'EMD' on CRAN. You could propose to the package maintainer to include HHT explicitly.

--Hans Werner
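A minimal sketch of the two HHT steps with the 'EMD' package; the function names emd() and hilbertspec() and the result components are taken from that package's documentation and should be verified against your installed version:

```r
library(EMD)
tt <- seq(0, 9, length.out = 1000)
xt <- sin(pi * tt) + sin(2 * pi * tt)   # toy two-component signal

dec  <- emd(xt, tt)                     # step 1: empirical mode decomposition
spec <- hilbertspec(dec$imf, tt = tt)   # step 2: Hilbert spectral analysis
# spec contains instantaneous amplitude/frequency per IMF
```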
Re: [R] optim function multi variables -newbie
michalseneca michalseneca at gmail.com writes:

  Hi, basically what I am trying is to rewrite Matlab code into R ... This is code for the famous SABR model calibration. I did most of the code up to the optimization issue. In the attachment you can find the Matlab code and my unfinished R code. This is very helpful code ... Basically I am trying to achieve the same results. Thank you very much for your help ... This might help me much with my thesis. Best regards, Michal

Apparently, you are resetting the input 'params' within the function EstimateRhoAndVol() to c(0.5, 0.5), regardless of what the actual input is. No wonder 'optim' returns the starting values: it sees a constant function.

Besides that, you have to be more careful with NAs in the SABR model. Did you try out the function? --- Are you testing the list?

--Hans Werner
Re: [R] Integration of two lines
Xavier Robin Xavier.Robin at unige.ch writes:

  Hello, I need to integrate the absolute difference between two lines measured on different points. For example:

    x <- seq(0, 1, 1/100)
    f_x <- runif(101) + x
    y <- seq(0, 1, 1/23)
    f_y <- runif(24) + (1 - y)
    plot(x, f_x, type = "l")
    lines(y, f_y)

  Then I would like to compute Integral( | f_x - f_y | )dx. (This is not the same as | Integral(f_x)dx - Integral(f_y)dx |.)

First define functions from those points:

  fx <- approxfun(x, f_x)
  fy <- approxfun(y, f_y)
  f <- function(x) abs(fx(x) - fy(x))

and now you can apply integrate() or trapz():

  xx <- sort(c(x, y))
  yy <- f(xx)
  trapz(xx, yy)

trapz() should return the more accurate (i.e., exact) result.

--Hans Werner

  Computing this integral looks non-trivial. I guess I should interpolate the points of f_y over x and integrate both lines on these intervals. Even then I would miss points where the lines cross. There are functions to integrate below *one* line (I'm thinking of the trapz function in caTools). Do you know if there is a function to do this integration properly with two lines (and especially their absolute difference)? Regards, Xavier
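Since base R has no trapz() (a gap noted elsewhere in this collection), here is a minimal self-contained sketch of trapezoidal integration, equivalent in spirit to caTools::trapz:

```r
trapz <- function(x, y) {
  # trapezoidal rule: sum of (interval width * average height)
  idx <- 2:length(x)
  sum((x[idx] - x[idx - 1]) * (y[idx] + y[idx - 1])) / 2
}

trapz(c(0, 1, 2), c(0, 1, 2))   # integral of y = x over [0, 2] is 2
```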
Re: [R] Help on a Display function
William Dunlap wdunlap at tibco.com writes:

  But it fails on this:

    my_names <- c("Bill", "William")
    display(rev(my_names))
    # rev(my_names) = Error in cat(list(...), file, sep, fill, labels, append) :
    #   argument 3 (type 'list') cannot be handled by 'cat'

  This is because you call eval() using the environment of the function, while ordinary argument evaluation evaluates them in the environment of the caller. Do not use eval() directly but use the ordinary evaluation that R does automatically.

I appreciate this extensive answer. So I decided to use the following function, suppressing the deparsed form if there is a tag (while I'm inclined to forbid function code as input):

  display <- function (...) {
    evaluatedArgs <- list(...)
    n <- length(evaluatedArgs)
    argTags <- names(evaluatedArgs)
    deparsedArgs <- lapply(substitute(placeholderFunction(...))[-1],
      function(expr) {
        d <- deparse(expr)
        paste(d, collapse = "\n")
      })
    if (is.null(argTags)) argTags <- rep("", n)
    namedArgs <- ifelse(argTags != "", argTags, deparsedArgs)
    cat(paste(namedArgs, evaluatedArgs, sep = " = "), sep = "\n")
  }

It works fine on your examples,

  my_names <- c("Bill", "William")
  display(x=log(10), 1+2+3, sin(1), rev(my_names),
          z=(function(p){lp <- log(p); lp+lp^2/2+lp^3/6})(0.2))
  # x = 2.30258509299405
  # 1 + 2 + 3 = 6
  # sin(1) = 0.841470984807897
  # rev(my_names) = c("William", "Bill")
  # z = -1.00911130949159

while there still are problems with matrices (and probably other structures):

  A <- matrix(1:4, 2, 2); B <- diag(4)
  display(A, B)
  # A = 1:4
  # B = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1)

I guess I have to define my own output representation in these cases, based on lapply(evaluatedArgs, class). Many thanks, Hans Werner

  display2 <- function (...) {
    evaluatedArgs <- list(...)
    argTags <- names(evaluatedArgs)
    deparsedArgs <- lapply(substitute(placeholderFunction(...))[-1],
      function(expr) {
        d <- deparse(expr)
        paste(d, collapse = "\n")  # or d[1] or ...
      })
    # use if (is.null(argTags)) ... cat without argTags ... ?
    cat(paste(sep = " = ", argTags, deparsedArgs, evaluatedArgs), sep = "\n")
  }

  my_names <- c("Bill", "William")
  display2(rev(my_names))
  #  = rev(my_names) = c("William", "Bill")
  display2(strings=rev(my_names))
  # strings = rev(my_names) = c("William", "Bill")
  display2(x=log(10), 1+2+3, z=(function(p){lp <- log(p); lp+lp^2/2+lp^3/6})(0.2))
  # x = log(10) = 2.30258509299405
  #  = 1 + 2 + 3 = 6
  # z = (function(p) {
  #     lp <- log(p)
  #     lp + lp^2/2 + lp^3/6
  # })(0.2) = -1.00911130949159

  My questions: (1) Is there a better or more appropriate way to write such a function? --- I'm not so well versed in internal R functions such as (de)parse(), substitute(), or eval(). (2) What is the role of the placeholderFunction? I could not find enough information about it resp. about the whole construction.

  The function f <- function(x) substitute(func(x)) produces an object of class "call" whose first element is the name 'func' and whose subsequent elements are the arguments in the call. The [-1] strips off the function name. You might also do f <- function(x) substitute((x)), which produces an object of class "(" whose first element is the name "(", which you can strip off. "(" is easier to misread, and "(" objects must have length 2, while "call" objects may have any length greater than 0. I learned about this by playing around with the output of quote() (or parse() or substitute() or expression()). Look at the class, length, and [[elements]] of expressions. The following str.language might get you started. It prints the name, class, length, and a summary of the value of each part of an expression. [...]

  Bill Dunlap
  Spotfire, TIBCO Software
  wdunlap tibco.com

Thanks, Hans Werner
[R] Help on a Display function
I wanted to simulate the Matlab DISPLAY function for some time now. After seeing a recent proposal by Gabor Grothendieck I came up with the following solution,

  display <- function(...) {
    my_names <- lapply(substitute(placeholderFunction(...))[-1], deparse)
    for (my in my_names)
      cat(my, "=", eval(parse(text = my)), "\n", sep = " ")
  }

that works about as well as I would have hoped:

  a <- 1; b <- a + 1
  display(a, b, sin(1))
  # a = 1
  # b = 2
  # sin(1) = 0.841471

My questions: (1) Is there a better or more appropriate way to write such a function? --- I'm not so well versed in internal R functions such as (de)parse(), substitute(), or eval(). (2) What is the role of the placeholderFunction? I could not find enough information about it resp. about the whole construction.

Thanks, Hans Werner
Re: [R] Integrate and subdivisions limit
Dear all, I have some issues with integrate in R, thus I would like to request your help. I am trying to calculate the integral of f(x)*g(x). f(x) is a step function while g(x) is a polynomial. If f(x) (the step function) changes its value only a few times (5 or 6 'steps'), everything is calculated correctly (I verified the results on scrap paper), but if f(x) takes some 800 different values I receive the error

  Error in integrate ... "number of subdivisions reached"

I did some checks on the internet and found that I can increase the number of subdivisions, as this is a parameter of integrate(). Thus I raised it from 100 to 1000 (and then to 10,000). A. Does this make the error produced higher or does it only stress the computer? B. When the number was raised to 10,000 I started getting the error message "roundoff error was detected". What do you think I should do to solve that? I would like to thank you in advance for your help. Best Regards, Alex

There's obviously a more numerically stable approach. If g(x) is a polynomial, you know its polynomial antiderivative. Take that and sum up over all intervals where the step function is constant. Example: step function g(x) = 1 constant, integrate x^2 over 0..1 in 10,000 subdivisions. The antiderivative is 1/3 * x^3, thus

  g <- function(x) 1
  x <- seq(0, 1, len = 10001)
  sum((x[2:10001]^3 - x[1:10000]^3) * g(x[2:10001])) / 3
  # => 0.3333333

The antiderivative of a polynomial a_1 x^n + ... + a_{n+1}, given as a vector c(a_1, ..., a_{n+1}), can also be determined automatically; no manual work is necessary.

Hans Werner
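The "determined automatically" remark can be sketched in a few lines; polyint() and evalp() are hypothetical helper names of my own, with coefficients stored highest degree first:

```r
# antiderivative coefficients: divide by the new exponents, append constant 0
polyint <- function(p) c(p / seq(length(p), 1), 0)
# evaluate a coefficient vector (highest degree first) at a scalar x
evalp <- function(p, x) sum(p * x^seq(length(p) - 1, 0))

P <- polyint(c(1, 0, 0))      # antiderivative of x^2 is x^3/3
evalp(P, 1) - evalp(P, 0)     # exact integral over [0, 1]: 1/3
```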
Re: [R] Quadratic programming with semi-definite matrix
Andreas Jensen bentrevisor at gmail.com writes:

  Hello. I'm trying to solve a quadratic programming problem of the form

    min ||Hx - y||^2   s.t.  x >= 0 and sum(x) <= t

  using solve.QP in the quadprog package, but I'm having problems with Dmat not being positive definite, which is kind of expected since it will be numerically semi-definite in most cases. As far as I'm aware the problem arises because the Goldfarb and Idnani method first solves the unconstrained problem, requiring a positive definite matrix. Are there any (fast) packages that allow me to do QP with (large) semi-definite matrices? Example:

    t <- 100
    y <- signalConvNoisy[1,]
    D <- crossprod(H, H)
    d <- crossprod(H, y)
    A <- cbind(rep(-1, nrow(H)), diag(ncol(H)))
    b0 <- c(t, rep(0, ncol(H)))
    sol <- solve.QP(Dmat=D, dvec = d, Amat = A, bvec = b0)$solution
    # Error in solve.QP(Dmat = D, dvec = d, Amat = A, bvec = b0) :
    #   matrix D in quadratic function is not positive definite!

  Thanks in advance, Andreas Jensen

See the Optimization task view; there are several packages that promise to handle quadratic programming problems. You may also try one of the nonlinear optimization packages, they are not much slower nowadays. Your code cannot be executed as signalConvNoisy is unknown.

Hans Werner
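One common workaround, not mentioned in the thread, is to add a tiny ridge term so that Dmat becomes numerically positive definite; this reuses the poster's D, d, A and b0 and perturbs the solution only by the order of the ridge:

```r
library(quadprog)
# small regularization relative to the scale of D (my own choice of eps)
eps <- 1e-8 * max(abs(D))
sol <- solve.QP(Dmat = D + eps * diag(nrow(D)),
                dvec = d, Amat = A, bvec = b0)$solution
```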
Re: [R] Optimize multiple variable sets
peter dalgaard pdalgd at gmail.com writes:

  On Dec 6, 2010, at 15:15, Jonathan P Daily wrote:

    Correct me if I'm wrong, but isn't the minimal x value in your example the same regardless of what positive coefficient you apply to x? If that is the case, you would expect the same min(x) for each iteration, i.e. in the interval [0,1] the minimum x value of x^2 + x is the same as x^2 + 1*x, at x = 0.

  You're wrong --- slightly. The returned $minimum is the x, the y is $objective. But the interval given doesn't bracket the minimum, as you'll clearly see if you put int=c(-10,10). The only puzzling bit is that optimize() doesn't actually return the left endpoint, but rather the first evaluation point inside the interval. The rather wide tolerance of .Machine$double.eps^0.25 == 0.0001220703 probably plays a role in this.

'optimize', like other golden section search algorithms, never returns one of the endpoints. So to speak, these algorithms avoid evaluating the function at the end points of the interval. An advantage is that the function value at one of these points can be NA or Inf. Of course, with tol=.Machine$double.eps you will get closer.

Hans Werner
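The behaviour described above is easy to verify; a short sketch:

```r
f <- function(x) x^2 + x          # true minimum at x = -0.5

optimize(f, c(0, 1))$minimum      # slightly inside the interval, not exactly 0
optimize(f, c(-10, 10))$minimum   # interval brackets the minimum: near -0.5
optimize(f, c(-10, 10), tol = .Machine$double.eps)$minimum  # much closer still
```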
Re: [R] How to formulate constraint like abs(x) <= y in constrOptim (or other)
Benjamin B. benj.bad.ac at googlemail.com writes:

  Hello list readers, I am trying to form some constraints for an optimization I am working on. I think I understand the use of constraints in matrix form. I use them like: [...] Now I would like to formulate a constraint like

    Sum(Abs(x_i)) <= limit

  But I have no idea how I could get such a condition by using the matrix constraint formulation. I have already searched the help, web and all searchable R resources I could find, but got no helping result. Is constrOptim the right function to use? Maybe you can give me a hint or can point me to a good description of constraint handling. Greetings and thanks in advance, Benjamin (Hamburg, Germany)

With 'constrOptim' you can formulate linear constraints only, and abs() is not linear. You might try one of the nonlinear optimization packages. Another possibility is to reformulate the absolute value with two auxiliary nonnegative variables, if your optimization function itself is linear; you didn't tell us the whole story.

Hans Werner
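A sketch of the reformulation hinted at above, assuming the objective itself is linear (the variable names and the use of lpSolve are my own, not from the thread): write x = xp - xm with xp, xm >= 0, so that sum(abs(x_i)) <= limit becomes the linear constraint sum(xp + xm) <= limit.

```r
library(lpSolve)
n <- 3
cvec  <- c(1, -2, 0.5)                    # hypothetical linear objective c'x
limit <- 1

obj <- c(cvec, -cvec)                     # objective in the 2n variables (xp, xm)
con <- matrix(rep(1, 2 * n), nrow = 1)    # sum(xp) + sum(xm) <= limit
sol <- lp("max", obj, con, "<=", limit)   # xp, xm >= 0 is lpSolve's default
x   <- sol$solution[1:n] - sol$solution[(n + 1):(2 * n)]  # recover x
```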
Re: [R] Integral of PDF
That does not remedy the situation in any case; take the following function

  fun <- function(x) dnorm(x, -500, 50) + dnorm(x, 500, 50)

that has a 'mode' of 0 again. Interestingly, if I transform it by 1/x, to integrate I again have to reduce the error tolerance to at least 1e-10:

  f <- function(x) fun(1/x)/x^2
  integrate(f, 0, 1)        # -> 0, should be 1

while cubature::adaptIntegrate does not require changing the tolerance:

  require(cubature)
  adaptIntegrate(f, 0, 1)   # -> 1

What I find *dangerous* about 'integrate' is not that it returns a wrong result, but that it --- w/o warning --- gives an error estimate that leads completely astray.

Hans Werner

Ravi Varadhan wrote:

  But that only postpones the misery, Hans Werner! You can always make the mean large enough to get a wrong answer from 'integrate':

    integrate(function(x) dnorm(x, 700, 50), -Inf, Inf,
              subdivisions=500, rel.tol=1e-11)
    integrate(function(x) dnorm(x, -700, 50), -Inf, Inf,
              subdivisions=500, rel.tol=1e-11)

  The problem is that the quadrature algorithm is not smart enough to recognize that the center of mass is at 'mean'. The QUADPACK algorithm (Gauss-Kronrod quadrature) is adaptive, but it does not try to identify regions of high mass. Algorithms which can locate regions of highest density, such as those developed by Alan Genz, will do much better in problems like this; see Genz and Kass (1997), Journal of Computational and Graphical Statistics. There is a FORTRAN routine DCHURE that you might want to try for infinite regions. I don't think this has been ported to R (although I may be wrong). There may be other more recent ones. A simple solution is to locate the mode of the integrand, which should be quite easy to do, then do a coordinate shift to that point and integrate the mean-shifted integrand using 'integrate'.

  Ravi.

  ---
  Ravi Varadhan, Ph.D.
  Assistant Professor, Division of Geriatric Medicine and Gerontology
  School of Medicine, Johns Hopkins University
  Ph.
(410) 502-2619 email: rvarad...@jhmi.edu
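The mode-shift advice above can be sketched in a few lines; the shift value is assumed known here (it is just the normal mean):

```r
f <- function(x) dnorm(x, 700, 50)
integrate(f, -Inf, Inf)       # wrong: the mass lies far from the origin

m <- 700                      # location of the mode
g <- function(x) f(x + m)     # shifted integrand, mode now at 0
integrate(g, -Inf, Inf)       # same integral, now computed reliably
```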
Re: [R] Integral of PDF
You can dive into the thread "puzzle with integrate over infinite range" from September this year. The short answer appears to be: increase the error tolerance.

  integrate(function(x) dnorm(x, 500, 50), -Inf, Inf,
            subdivisions=500, rel.tol=1e-11)
  # 1 with absolute error < 1.1e-12

Hans Werner

Doran, Harold wrote:

  The integral of any probability density from -Inf to Inf should equal 1, correct? I don't understand the last result below.

    integrate(function(x) dnorm(x, 0, 1), -Inf, Inf)
    # 1 with absolute error < 9.4e-05
    integrate(function(x) dnorm(x, 100, 10), -Inf, Inf)
    # 1 with absolute error < 0.00012
    integrate(function(x) dnorm(x, 500, 50), -Inf, Inf)
    # 8.410947e-11 with absolute error < 1.6e-10
    all.equal(integrate(function(x) dnorm(x, 500, 50), -Inf, Inf)$value, 0)
    # [1] TRUE

    sessionInfo()
    # R version 2.10.1 (2009-12-14), i386-pc-mingw32
    # locale:
    # [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
    # [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
    # [5] LC_TIME=English_United States.1252
    # attached base packages:
    # [1] stats graphics grDevices utils datasets methods base
    # other attached packages:
    # [1] statmod_1.4.6 mlmRev_0.99875-1 lme4_0.999375-35 Matrix_0.999375-33 lattice_0.17-26
    # loaded via a namespace (and not attached):
    # [1] grid_2.10.1 nlme_3.1-96 stats4_2.10.1 tools_2.10.1
Re: [R] question about constraint minimization
dhacademic at gmail.com writes:

  Hi, I have struggled with this bound optimization with an equality constraint, using the optim function, for two days, but still fail to prepare a good input. Can anyone help to prepare the input for my specific case? Many thanks. Best, Hao

You did not disclose the function f: is it linear or nonlinear, does it have (many) local minima, do you know its gradient, etc.? With some bloody tricks it is possible to emulate equality and inequality constraints with 'optim', too. But in general, I would suggest applying 'constrOptim.nl' in the alabama package (or 'solnp' in Rsolnp). These are newer implementations and will handle equality constraints nicely.

Assuming your original function f12 is a function of 12 variables, e.g.

  f12 <- function(x) sum((1:12)*(x - 1:12)^2)

define a new function f eliminating x3, x4, and x12 through

  f <- function(x) {
    xx1r <- 1.5 - x[1] - sum(x)
    x12 <- c(x[1], x[2], x[1], x[1], x[3:9], xx1r)
    f12(x12)
  }

I would suggest 'solnp' in package Rsolnp, as the bound constraints can be formulated somewhat more easily; the start value has to be feasible(!):

  lower <- c(-1, 0, -1, 0, 0, 0, 0, 0, 0)
  upper <- c( 0, 1,  0, 1, 1, 1, 1, 1, 1)
  fun <- function(x) 1.5 - 2*x[1] - x[2] - sum(x[3:9])
  start <- c(-0.2, 0.2, -0.2, rep(0.2, 6))
  S <- solnp(start, f, LB=lower, UB=upper, ineqfun=fun, ineqLB=0, ineqUB=1)

This will return a (local?) minimum (-1, 0, -1, 0, 0, 0.5, 1, 1, 1) as:

  S$pars
  # [1] -1.00e+00 1.209474e-08 -9.99e-01 4.801754e-08 1.930926e-07
  # [6] 4.99e-01 9.98e-01 1.00e+00 1.00e+00

Hans Werner

P.S.: Sorry, Ravi, but I could not resist the temptation to at least **indicate** one complete solution.
Re: [R] question about constraint minimization
dhacademic at gmail.com writes:

  Hi, I am a beginner with R. There is a question about constrained minimization. A function, y = f(x1, x2, x3, ..., x12), needs to be minimized. There are 3 requirements for the minimization:

  (1) x2+x3+...+x12 = 1.5 (x1 is excluded);
  (2) x1 = x3 = x4;
  (3) x1, x3 and x5 are in the range -1~0, respectively. The rest of the variables (x2, x4, x6, x7, ..., x12) are in the range 0~1, respectively.

  The optim function is used, and part of my input is as follows, where xx1r represents x12:

    xx1r = 1.5-x[2]-x[1]-x[1]-x[3]-x[4]-x[5]-x[6]-x[7]-x[8]-x[9]
    start = rnorm(9)
    up = 1:9/1:9*1
    lo = 1:9/1:9*-1
    out = optim(start, f, lower=lo, upper=up, method="L-BFGS-B",
                hessian=TRUE, control=list(trace=6, maxit=1000))

  There are two problems with this input: up and lo only define a range of -1~1 for x1 to x11, which cannot meet requirement (3). In addition, there is no constraint imposed on x12. I have no idea how to specify a matrix that can impose different constraints on individual variables in a function. Any suggestion is highly appreciated. Best, Hao

I don't see any direct need for real 'constraint' optimization here; it is a 'bounded' optimization where you are allowed to use

  lower <- c(-1,0,-1,0,-1,0,0,0,0,0,0,0)
  upper <- c( 0,1, 0,0, 0,1,1,1,1,1,1,1)

Otherwise, your description is confusing: (1) Did you change f to a new function with 9 variables, eliminating x3, x4, and x12? (2) x4 (being equal to x1) has to be in [-1, 0] but also in [0, 1]? (3) If you need to restrict x12 to [0, 1] as well, you cannot eliminate it. Either keep x12 and use an equality constraint, or use inequality constraints on xx1r.

Hans Werner
Re: [R] how to find near neighbors?
Czerminski, Ryszard Ryszard.Czerminski at astrazeneca.com writes:

  I am looking for an efficient way to find near neighbors... More specifically, I have two sets of points, A and B, and I want to find the points in set B which are closer to set A than some cutoff (or the n closest). I will appreciate very much any pointers... Ryszard

'nn' in the RANN package represents a good solution for this task, as it implements a very fast algorithm and provides the ability to search for nearest neighbors only within a certain radius.

Hans Werner
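A hedged sketch with RANN; current versions expose the search as nn2(), and the argument names below follow that function's documentation (verify against your installed version):

```r
library(RANN)
set.seed(1)
A <- matrix(runif(200), ncol = 2)   # reference set
B <- matrix(runif(100), ncol = 2)   # query set
cutoff <- 0.05

res <- nn2(data = A, query = B, k = 1,
           searchtype = "radius", radius = cutoff)
# rows of B with a neighbour in A within the cutoff
# (points with no neighbour in the radius get index 0):
close <- which(res$nn.idx[, 1] > 0)
```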
Re: [R] Fwd: Numerical integration
Eduardo de Oliveira Horta eduardo.oliveirahorta at gmail.com writes:

  -- Forwarded message --
  From: Eduardo de Oliveira Horta eduardo.oliveirahorta at gmail.com
  Date: Wed, Nov 17, 2010 at 3:59 PM
  Subject: Re: [R] Numerical integration
  To: David Winsemius dwinsemius at comcast.net

  It works, however it is not very efficient for the problem I'm working with, since it is the same as vectorizing the integrand before applying integrate. Any other thoughts? Efficiency would be highly welcome! Thanks again, Eduardo Horta

'adaptIntegrate' in package cubature does not require pre-vectorized functions. But I doubt it is more efficient, as every integration needs to perform a lot of function calls. And with 'adaptIntegrate' you have to do the reduction of an infinite interval to a finite one yourself. E.g., with a given function f define

  g <- function(x) f(1/x) / x^2
  adaptIntegrate(g, 0, 1/a)   # instead of integrate(f, a, Inf)

You have not given the slightest indication why your function integration would be so slow, so it is difficult to give more focussed advice.

Hans Werner
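A quick check of the 1/x substitution (this assumes the lower limit a is positive; here a = 1):

```r
f <- function(x) exp(-x)          # integral over (1, Inf) equals exp(-1)
g <- function(x) f(1 / x) / x^2   # transformed integrand on (0, 1)

integrate(f, 1, Inf)$value
integrate(g, 0, 1)$value          # same value via the substitution
```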
Re: [R] Rsolnp examples
Jan Theodore Galkowski bayesianlogic at acm.org writes:

  I'm interested in the Rsolnp package. For their primary function 'solnp', one example is given, and there is a reference to unit tests. Anyone know where these can be found? Also, Rsolnp is used in a few other packages (e.g., depmixS4), but I cannot seem to find source illustrating its call sequence and the precise definition of the functions passed. Can anyone help? Thanks, - Jan

When you download the source package, the test functions are in the Rsolnp/R/benchmarks.R file. And you will find the original manual describing these tests in Rsolnp/inst/doc (Matlab version). Calling 'solnp' is easy; what exactly is your problem?

Hans Werner
Re: [R] calculate area between intersecting polygons
jonas garcia garcia.jonas80 at googlemail.com writes:

  Thanks for your reply. My main issue is that I don't have any equations to generate the data, just a bunch of points, each corresponding to a polygon. I was looking in package sp and there is a function to calculate areas (areapl()), but not for intersecting polygons. Is there any other package that does this? Thanks

There is a formula for calculating the area of a polygon from its points,

  A = 1/2 * sum_{i=1}^{n} (x_i*y_{i+1} - x_{i+1}*y_i),

where (x_{n+1}, y_{n+1}) = (x_1, y_1) and n is the number of points (non-closed), i.e.

  polygonArea(poly1)  # 6.39605
  polygonArea(poly2)  # 9.35967

You will need to identify the intersection points between the polygons and merge the appropriate parts of the polygon lines.

Hans Werner

  polygonArea <- function(P) {
    n <- nrow(P)
    x <- P[, 1]; y <- P[, 2]
    p1 <- sum(x[1:(n-1)]*y[2:n]) + x[n]*y[1]
    p2 <- sum(x[2:n]*y[1:(n-1)]) + x[1]*y[n]
    return(0.5*(p1-p2))
  }
Re: [R] calculate area between intersecting polygons
Remko Duursma remkoduursma at gmail.com writes:

  Here is a different solution:

    library(gpclib)
    p1 <- as(poly1, "gpc.poly")
    p2 <- as(poly2, "gpc.poly")
    area.poly(p2) + area.poly(p1) - area.poly(union(p1, p2))

  I.e., take the areas of both polygons and subtract the area of the union (check plot(union(p1, p2))) to get the area of the intersection. greetings, Remko

Thanks for the hint. I didn't know that the polygon clipping library had been packaged for R; I could have used it several times before. There are too many packages, though this time I admit it would have been easy to hit upon it by searching CRAN carefully.

Hans Werner
Re: [R] Hash fuctions library for R?
Uwe Ziegenhagen ziegenhagen at gmail.com writes:

  Hi, I would like to calculate various hash sums for a given string but haven't found a suitable package yet. I found tools::md5sum, which works only for files; however, I would like to avoid putting my string in a file before calculating the hash.

See the 'digest' package.

Hans Werner
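For example (digest() and its algo/serialize arguments are from the 'digest' package documentation):

```r
library(digest)
digest("my string", algo = "md5")                      # hash of the serialized R object
digest("my string", algo = "md5", serialize = FALSE)   # hash of the raw string itself
digest("my string", algo = "sha1", serialize = FALSE)  # other algorithms are available
```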
Re: [R] Finding Zeros of a Function
Lorenzo Isella wrote:

  Hello, and sorry for the late reply. You see, my problem is that it is not known a priori whether I have one or two zeros of my function in the interval where I am considering it; on top of that, does BBsolve allow me to select an interval of variation of x? Many thanks, Lorenzo

Let's see how many zeroes there are of the function sin(1/x) in the interval [0.1, 1.0]. Can you find out by plotting? In the 1-dimensional case the simplest approach is to split into subintervals small enough to contain at most one zero and search each of these, e.g., with the secant rule --- or uniroot() if you like. The following function will find 31 zeroes for sin(1/x) using subintervals of length 1/1000, i.e.

  x <- seq(0.1, 1.0, len=1001); y <- sin(1/x)

and call piecewise(x, y).

Hans Werner

  piecewise <- function(x, y) {
    n <- length(x)
    zeros <- if (y[1] == 0) c(x[1]) else c()
    for (i in 2:n) {
      if (y[i]*y[i-1] >= 0) {
        if (y[i] == 0) zeros <- c(zeros, x[i])
      } else {
        x0 <- (x[i-1]*y[i] - x[i]*y[i-1])/(y[i] - y[i-1])
        zeros <- c(zeros, x0)
      }
    }
    return(zeros)
  }
Re: [R] Finding Zeros of a Function
Hans W Borchers wrote:

  Let's see how many zeroes there are of the function sin(1/x) in the interval [0.1, 1.0]. Can you find out by plotting? In the 1-dimensional case the simplest approach is to split into subintervals small enough to contain at most one zero and search each of these, e.g., with the secant rule --- or uniroot() if you like. The following function will find 31 zeroes for sin(1/x) using subintervals of length 1/1000 [...]

    piecewise <- function(x, y) {
      n <- length(x)
      zeros <- if (y[1] == 0) c(x[1]) else c()
      for (i in 2:n) {
        if (y[i]*y[i-1] >= 0) {
          if (y[i] == 0) zeros <- c(zeros, x[i])
        } else {
          x0 <- (x[i-1]*y[i] - x[i]*y[i-1])/(y[i] - y[i-1])
          zeros <- c(zeros, x0)
        }
      }
      return(zeros)
    }

Sorry, I meant: how many zeroes are there in the interval [0.01, 1.0]?! The following will find 31 zeroes, and these are all:

  x <- seq(0.01, 1.0, len=1001)
  y <- sin(1/x)
  piecewise(x, y)

Hans Werner
Re: [R] Hashing a set
Lorenzo Isella <lorenzo.isella at gmail.com> writes:

> Dear All,
> I am given a time series such that, at every time t_i, I am given a
> set of data (every element of the set is just an integer number). What
> I need is an injective function able to map every set to a number
> (possibly an integer number, but that is not carved in stone). Does
> anybody know how to achieve that?
> Cheers
> Lorenzo

In set theory you learn about the function that assigns to a set of integers (i1, ..., in) the integer p1^i1 * ... * pn^in, where p1, p2, ... is the sequence of primes. In practice, unfortunately, this leads to numbers that are far too large.

I would recommend the 'digest' package, thanks to Dirk Eddelbuettel, which provides hashing functions such as 'md5', etc. It is injective enough for practical purposes and returns character strings of fixed length.

Hans Werner
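The same idea works in any language with a standard hashing library; here is a Python sketch (hashlib's md5 standing in for the digest package; the function name hash_set is invented for illustration). The one point needing care is a canonical encoding: sort the elements and join them with a separator, so that equal sets always produce the same digest and, e.g., {1, 23} cannot collide with {12, 3} by plain concatenation.

```python
import hashlib

def hash_set(s):
    """Map a set (or list) of integers to a fixed-length hex digest."""
    # sorted() gives a canonical order; the separator keeps the
    # encoding unambiguous ("1,23" vs "12,3")
    canonical = ",".join(str(i) for i in sorted(s))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Like md5 in digest, this is not truly injective, but collisions are negligible in practice, and the fixed 32-character output is convenient to use as a key.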
Re: [R] constrained optimization -which package?
Leonardo Monasterio <leonardo.monasterio at gmail.com> writes:

> Dear R users,
> In the function below I want to find the maximum value of v, subject
> to the constraint that the sum of x equals 1. I want to maximize
>
>     v <- t(x) %*% distance %*% x
>
> subject to
>
>     sum(x) = 1
>
> where x is an n x 1 vector and distance is a given n x n matrix. (In
> practice, n can go up to the hundreds.) I have a bit of experience
> using R but not much on optimization techniques or on R optimization
> packages. I have taken a look at optim and nlminb, but I am quite
> confused. Do you have any suggestions on how to do it?
> Thank you very much,
> Leo Monasterio.

I do not see why you would take the trouble to use constrained optimization here. Treat x_n as a dependent variable, x_n <- 1 - sum(x), with x now an (n-1)-dimensional vector; define another function appropriately,

    f1 <- function(x) f(c(x, 1 - sum(x)))

and apply optim() to f1 without constraints.

Hans Werner
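The substitution is easy to see on a toy case. The sketch below (Python, with a made-up 2x2 distance matrix, so purely illustrative, not the poster's data) eliminates the equality constraint by writing x2 = 1 - x1 and then runs plain gradient ascent on the reduced one-variable objective; any unconstrained optimizer, optim() included, would do the same job.

```python
# Toy distance matrix: d12 = d21 = 1, zero diagonal, so
#   v(x) = x' D x = 2 * x1 * x2,
# and with x1 + x2 = 1 the maximum is v = 0.5 at x = (0.5, 0.5).
d12 = 1.0

def v(x):                        # objective on the full vector
    return 2.0 * d12 * x[0] * x[1]

def v_reduced(z):                # substitute x2 = 1 - x1: no constraint left
    return v([z, 1.0 - z])

# plain gradient ascent with a central-difference derivative
z, lr, h = 0.1, 0.1, 1e-6
for _ in range(200):
    grad = (v_reduced(z + h) - v_reduced(z - h)) / (2.0 * h)
    z += lr * grad

x = [z, 1.0 - z]                 # recovered full solution; sums to 1 by construction
```

The constraint is satisfied exactly at every iterate because it is built into the parametrization, which is precisely the point of the substitution.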
Re: [R] non-linear integer optimization?
darckeen <darckeen at hotmail.com> writes:

> Are there any packages that do non-linear integer optimization? I
> looked at lpSolve, but I'm pretty sure it only works with linear
> programming and not non-linear. I tried L-BFGS-B optim after flooring
> all my params; it works somewhat but seems really inefficient.
> Anything else I should look at?

The most general answer is: there is no R package for MINLP problems!

But it would have been nice if you had done the following things:

  - have a look into the 'Optimization' task view;
  - provide an example of a function you would like to optimize (can it
    be reduced to a quadratic or convex problem?);
  - tell whether you need to do it repeatedly or only a few times
    (e.g., utilizing an interface to the NEOS solvers);
  - say how many integer variables there are, binary or truly integer,
    etc. (could you do your own branch-and-bound, e.g.?).

Providing as little information as above makes communicating difficult.

Hans Werner

P.S.: The R-Forge project "Integer and Nonlinear Optimization in R" (RINO) will (hopefully) make some of the COIN-OR solvers available.
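The "do your own branch-and-bound" suggestion can be made concrete in a few dozen lines. The sketch below (Python, with an invented separable test problem, so purely illustrative) solves a box-bounded continuous relaxation at each node, prunes nodes whose relaxation value cannot beat the incumbent, and branches on the most fractional variable. For the separable quadratic used here the box-constrained relaxation is just coordinate-wise clipping; a real MINLP would plug a proper NLP solver into that step.

```python
import math

def branch_and_bound(relax, f, lower, upper, tol=1e-9):
    """Minimal branch-and-bound over integer variables in a box.
    relax(lo, hi) must return a continuous minimizer of f over the box;
    with an iterative solver, a looser integrality tol may be needed."""
    best_val, best_x = math.inf, None
    stack = [(list(lower), list(upper))]
    while stack:
        lo, hi = stack.pop()
        if any(l > h for l, h in zip(lo, hi)):
            continue                       # empty box
        x = relax(lo, hi)
        val = f(x)
        if val >= best_val:
            continue                       # prune: bound no better than incumbent
        fracs = [abs(v - round(v)) for v in x]
        i = max(range(len(fracs)), key=fracs.__getitem__)
        if fracs[i] < tol:
            best_val, best_x = val, [int(round(v)) for v in x]
        else:
            # branch on the most fractional variable: x_i >= ceil or x_i <= floor
            stack.append((lo[:i] + [math.ceil(x[i])] + lo[i+1:], hi))
            stack.append((lo, hi[:i] + [math.floor(x[i])] + hi[i+1:]))
    return best_val, best_x

# Invented test problem: minimize (x1 - 1.3)^2 + (x2 - 2.7)^2 over integers
# in [0, 5]^2; the integer optimum is x = (1, 3) with value 0.18.
target = (1.3, 2.7)
f = lambda x: sum((xi - t) ** 2 for xi, t in zip(x, target))
relax = lambda lo, hi: [min(max(t, l), h) for t, l, h in zip(target, lo, hi)]
val, sol = branch_and_bound(relax, f, [0, 0], [5, 5])
```

Each branch tightens an integer bound by at least one, so the tree is finite; the pruning test is what keeps it small in practice.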