Re: [R] Number of Cores limited to two in CRAN
Dear Henrik,

Thank you! This is quite helpful, especially your longer blog post.

Best regards,
Ravi

From: Henrik Bengtsson
Sent: Sunday, July 2, 2023 4:33 AM
To: Ravi Varadhan
Cc: R-Help
Subject: Re: [R] Number of Cores limited to two in CRAN

Short answer: You don't want to override that limit in your R package. Don't do it.

Long answer: You'll find the reason for this in the 'CRAN Repository Policy' (https://cran.r-project.org/web/packages/policies.html). Specifically, the following passage:

"Checking the package should take as little CPU time as possible, as the CRAN check farm is a very limited resource and there are thousands of packages. Long-running tests and vignette code can be made optional for checking, but do ensure that the checks that are left do exercise all the features of the package. **If running a package uses multiple threads/cores it must never use more than two simultaneously: the check farm is a shared resource and will typically be running many checks simultaneously.** Examples should run for no more than a few seconds each: they are intended to exemplify to the would-be user how to use the functions in the package."

Basically, you can use two cores to demonstrate or validate (e.g., in package tests) that your code *can* run in parallel, but you must not use more than that to demonstrate that your code can "run super fast".

Even-longer answer: See my blog post 'Please Avoid detectCores() in your R Packages' (https://www.jottr.org/2022/12/05/avoid-detectcores/) from 2022-12-05 for even more reasons.

/Henrik

On Sun, Jul 2, 2023 at 9:55 AM Ravi Varadhan via R-help wrote:
>
> This is the specific error message from R CMD check --as-cran:
>
> Error in .check_ncores(length(names)) : 16 simultaneous processes spawned
> Calls: prepost -> makeCluster -> makePSOCKcluster -> .check_ncores
> Execution halted
>
> Thanks,
> Ravi
>
> From: Ravi Varadhan
> Sent: Saturday, July 1, 2023 1:15 PM
> To: R-Help
> Subject: Number of Cores limited to two in CRAN
>
> Hi,
> I am developing a package where I would like to utilize multiple cores for
> parallel computing. However, I get an error message when I run R CMD check
> --as-cran.
>
> I read that CRAN limits the number of cores to 2. Is this correct? Is there
> any way to overcome this limitation?
> Thank you,
> Ravi
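As a concrete illustration of staying within this policy, here is a minimal sketch of how a package's examples or tests could cap their worker count (the use of the 'parallelly' package is an assumption here, not something from the thread; it is the approach Henrik's blog post advocates):

## Cap the worker count at 2 in examples/tests.
## parallelly::availableCores() is designed to honor the
## _R_CHECK_LIMIT_CORES_ setting that 'R CMD check --as-cran' uses.
n_workers <- if (requireNamespace("parallelly", quietly = TRUE)) {
  min(2L, parallelly::availableCores())
} else {
  2L
}
cl <- parallel::makeCluster(n_workers)
res <- parallel::parLapply(cl, seq_len(4), function(i) i^2)
parallel::stopCluster(cl)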
Re: [R] Number of Cores limited to two in CRAN
This is the specific error message from R CMD check --as-cran:

Error in .check_ncores(length(names)) : 16 simultaneous processes spawned
Calls: prepost -> makeCluster -> makePSOCKcluster -> .check_ncores
Execution halted

Thanks,
Ravi

____
From: Ravi Varadhan
Sent: Saturday, July 1, 2023 1:15 PM
To: R-Help
Subject: Number of Cores limited to two in CRAN

Hi,
I am developing a package where I would like to utilize multiple cores for parallel computing. However, I get an error message when I run R CMD check --as-cran.

I read that CRAN limits the number of cores to 2. Is this correct? Is there any way to overcome this limitation?

Thank you,
Ravi
[R] Number of Cores limited to two in CRAN
Hi,
I am developing a package where I would like to utilize multiple cores for parallel computing. However, I get an error message when I run R CMD check --as-cran.

I read that CRAN limits the number of cores to 2. Is this correct? Is there any way to overcome this limitation?

Thank you,
Ravi
Re: [R] How to use ifelse without invoking warnings
Thank you to Bert, Sarah, and John. I did consider suppressing warnings, but I felt that there must be a more principled approach. While John's solution is what I would prefer, I cannot help but wonder why `ifelse' was not constructed to avoid this behavior.

Thanks & Best regards,
Ravi

From: John Fox
Sent: Thursday, October 7, 2021 2:00 PM
To: Ravi Varadhan
Cc: R-Help
Subject: Re: [R] How to use ifelse without invoking warnings

Dear Ravi,

It's already been suggested that you could disable warnings, but that's risky in case there's a warning that you didn't anticipate. Here's a different approach:

> kk <- k[k >= -1 & k <= n]
> ans <- numeric(length(k))
> ans[k > n] <- 1
> ans[k >= -1 & k <= n] <- pbeta(p, kk + 1, n - kk, lower.tail=FALSE)
> ans
[1] 0.0 0.006821826 0.254991551 1.0

BTW, I don't think that you mentioned that p = 0.3, but that seems apparent from the output you showed.

I hope this helps,
John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2021-10-07 12:29 p.m., Ravi Varadhan via R-help wrote:
> Hi,
> I would like to execute the following vectorized calculation:
>
>   ans <- ifelse (k >= -1 & k <= n, pbeta(p, k+1, n-k, lower.tail = FALSE),
>                  ifelse (k < -1, 0, 1) )
>
> For example:
>
>> k <- c(-1.2, -0.5, 1.5, 10.4)
>> n <- 10
>> ans <- ifelse (k >= -1 & k <= n, pbeta(p, k+1, n-k, lower.tail=FALSE),
>>                ifelse (k < -1, 0, 1) )
> Warning message:
> In pbeta(p, k + 1, n - k, lower.tail = FALSE) : NaNs produced
>> print(ans)
> [1] 0.0 0.006821826 0.254991551 1.0
>
> The answer is correct. However, I would like to eliminate the annoying
> warnings. Is there a better way to do this?
>
> Thank you,
> Ravi
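For reference, a self-contained version of John's approach (p = 0.3 is an assumption, inferred from the output shown in the thread):

p <- 0.3
k <- c(-1.2, -0.5, 1.5, 10.4)
n <- 10
ans <- numeric(length(k))        # zeros already cover the k < -1 case
ans[k > n] <- 1
inside <- k >= -1 & k <= n
kk <- k[inside]
ans[inside] <- pbeta(p, kk + 1, n - kk, lower.tail = FALSE)
ans   # 0.000000000 0.006821826 0.254991551 1.000000000, with no warnings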
[R] How to use ifelse without invoking warnings
Hi,
I would like to execute the following vectorized calculation:

ans <- ifelse (k >= -1 & k <= n, pbeta(p, k+1, n-k, lower.tail = FALSE),
               ifelse (k < -1, 0, 1) )

For example:

> k <- c(-1.2, -0.5, 1.5, 10.4)
> n <- 10
> ans <- ifelse (k >= -1 & k <= n, pbeta(p, k+1, n-k, lower.tail=FALSE),
+                ifelse (k < -1, 0, 1) )
Warning message:
In pbeta(p, k + 1, n - k, lower.tail = FALSE) : NaNs produced
> print(ans)
[1] 0.0 0.006821826 0.254991551 1.0

The answer is correct. However, I would like to eliminate the annoying warnings. Is there a better way to do this?

Thank you,
Ravi
Re: [R] A Question about MLE in R
I agree with John that SANN should be removed from optim. More importantly, the default choice of optimizer in optim should be changed from "Nelder-Mead" to "BFGS." Nelder-Mead is a bad choice for the most commonly encountered optimization problems in statistics. I really do not see a good reason for not changing the default in optim.

Best regards,
Ravi
Re: [R] Fast matrix multiplication
Does Microsoft R Open come with a pre-compiled BLAS that is optimized for matrix computations?

Thanks,
Ravi

-----Original Message-----
From: Ista Zahn
Sent: Monday, August 13, 2018 3:18 PM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: [R] Fast matrix multiplication

On Mon, Aug 13, 2018 at 2:41 PM Ravi Varadhan wrote:
>
> Hi Ista,
> Thank you for the response. I use Windows. Is there a pre-compiled version
> of openBLAS for windows that would make it easy for me to use it?

Not sure. If you want an easy way I would use MRO. More info at https://mran.microsoft.com/rro#intelmkl1

--Ista

> Thanks,
> Ravi
>
> -----Original Message-----
> From: Ista Zahn
> Sent: Friday, August 10, 2018 12:20 PM
> To: Ravi Varadhan
> Cc: r-help@r-project.org
> Subject: Re: [R] Fast matrix multiplication
>
> Hi Ravi,
>
> You can achieve substantial speedup by using a faster BLAS (e.g., OpenBLAS
> or MKL), especially on systems with multiple CPUs. On my (6 year old, but 8
> core) system your example takes 3.9 seconds using the reference BLAS and
> only 0.9 seconds using OpenBLAS.
>
> Best,
> Ista
>
> On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan wrote:
> >
> > Hi,
> >
> > I would like to compute: A %*% B %*% t(A)
> >
> > A is a mxn matrix and B is an nxn symmetric, positive-definite matrix,
> > where m is large relative to n (e.g., m=50,000 and n=100).
> >
> > Here is a sample code.
> >
> > M <- 1
> > N <- 100
> > A <- matrix(rnorm(M*N), M, N)
> > B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
> > positive-definite matrix
> >
> > # method 1
> > system.time(D <- A %*% B %*% t(A))
> >
> > # I can obtain speedup by using a Cholesky decomposition of B
> > # method 2
> > system.time({
> >   C <- t(chol(B))
> >   E <- tcrossprod(A %*% C)
> > })
> >
> > all.equal(D, E)
> >
> > I am wondering how to obtain more substantial speedup. Any suggestions
> > would be greatly appreciated.
> >
> > Thanks,
> > Ravi
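A quick way to check which BLAS/LAPACK libraries a given R session is actually using (a sketch; in recent versions of R, sessionInfo() prints the linked library paths):

## Inspect the linear-algebra libraries this R session is linked against.
sessionInfo()   # look for the "BLAS:" and "LAPACK:" lines near the top
La_version()    # version of LAPACK in use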
Re: [R] Fast matrix multiplication
Hi Ista,
Thank you for the response. I use Windows. Is there a pre-compiled version of openBLAS for windows that would make it easy for me to use it?

Thanks,
Ravi

-----Original Message-----
From: Ista Zahn
Sent: Friday, August 10, 2018 12:20 PM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: [R] Fast matrix multiplication

Hi Ravi,

You can achieve substantial speedup by using a faster BLAS (e.g., OpenBLAS or MKL), especially on systems with multiple CPUs. On my (6 year old, but 8 core) system your example takes 3.9 seconds using the reference BLAS and only 0.9 seconds using OpenBLAS.

Best,
Ista

On Fri, Aug 10, 2018 at 11:46 AM Ravi Varadhan wrote:
>
> Hi,
>
> I would like to compute: A %*% B %*% t(A)
>
> A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where
> m is large relative to n (e.g., m=50,000 and n=100).
>
> Here is a sample code.
>
> M <- 1
> N <- 100
> A <- matrix(rnorm(M*N), M, N)
> B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric
> positive-definite matrix
>
> # method 1
> system.time(D <- A %*% B %*% t(A))
>
> # I can obtain speedup by using a Cholesky decomposition of B
> # method 2
> system.time({
>   C <- t(chol(B))
>   E <- tcrossprod(A %*% C)
> })
>
> all.equal(D, E)
>
> I am wondering how to obtain more substantial speedup. Any suggestions would
> be greatly appreciated.
>
> Thanks,
> Ravi
[R] Fast matrix multiplication
Hi,

I would like to compute: A %*% B %*% t(A)

A is a mxn matrix and B is an nxn symmetric, positive-definite matrix, where m is large relative to n (e.g., m=50,000 and n=100).

Here is a sample code.

M <- 1
N <- 100
A <- matrix(rnorm(M*N), M, N)
B <- crossprod(matrix(rnorm(N*N), N, N)) # creating a symmetric positive-definite matrix

# method 1
system.time(D <- A %*% B %*% t(A))

# I can obtain speedup by using a Cholesky decomposition of B
# method 2
system.time({
  C <- t(chol(B))
  E <- tcrossprod(A %*% C)
})

all.equal(D, E)

I am wondering how to obtain more substantial speedup. Any suggestions would be greatly appreciated.

Thanks,
Ravi
Re: [R] adding overall constraint in optim()
Here is what you do for your problem:

require(BB)
Mo.vect <- as.vector(tail(head(mo, i), 1))
wgt.vect <- as.vector(tail(head(moWeightsMax, i), 1))
cov.mat <- cov(tail(head(morets, i+12), 12))
opt.fun <- function(wgt.vect) -sum(Mo.vect %*% wgt.vect) /
  (t(wgt.vect) %*% (cov.mat %*% wgt.vect))
LowerBounds <- c(0.2, 0.05, 0.1, 0, 0, 0)
UpperBounds <- c(0.6, 0.3, 0.6, 0.15, 0.1, 0.2)
spgSolution <- spg(wgt.vect, fn=opt.fun, lower=LowerBounds, upper=UpperBounds,
                   project="projectLinear",
                   projectArgs=list(A=matrix(1, 1, length(wgt.vect)), b=1, meq=1))

Ravi

____
From: Ravi Varadhan
Sent: Saturday, May 5, 2018 12:31 PM
To: m.ash...@enduringinvestments.com; r-help@r-project.org
Subject: adding overall constraint in optim()

Hi,
You can use the projectLinear argument in BB::spg to optimize with linear equality/inequality constraints. Here is how you implement the constraint that all parameters sum to 1.

require(BB)
spg(par=p0, fn=myFn, project="projectLinear",
    projectArgs=list(A=matrix(1, 1, length(p0)), b=1, meq=1))

Hope this is helpful,
Ravi
[R] adding overall constraint in optim()
Hi,
You can use the projectLinear argument in BB::spg to optimize with linear equality/inequality constraints. Here is how you implement the constraint that all parameters sum to 1.

require(BB)
spg(par=p0, fn=myFn, project="projectLinear",
    projectArgs=list(A=matrix(1, 1, length(p0)), b=1, meq=1))

Hope this is helpful,
Ravi
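A runnable toy illustration of the same idea (the quadratic objective and target weights below are made up for demonstration; they are not from the thread):

## Minimize a quadratic distance to target weights, subject to
## sum(w) = 1 (equality, meq = 1) and box constraints 0 <= w <= 1.
require(BB)
target <- c(0.4, 0.3, 0.2, 0.1)
obj <- function(w) sum((w - target)^2)
p0 <- rep(0.25, 4)
ans <- spg(par=p0, fn=obj, lower=rep(0, 4), upper=rep(1, 4),
           project="projectLinear",
           projectArgs=list(A=matrix(1, 1, 4), b=1, meq=1))
ans$par   # close to 'target', which already sums to 1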
Re: [R] Subject: glm and stepAIC selects too many effects
More principled would be to use a lasso-type approach, which combines selection and estimation in one fell swoop!

Ravi

From: Ravi Varadhan
Sent: Tuesday, June 6, 2017 10:16 AM
To: r-help@r-project.org
Subject: Subject: [R] glm and stepAIC selects too many effects

If AIC is giving you a model that is too large, then use BIC (log(n) as the penalty for adding a term in the model). This will yield a more parsimonious model. Now, if you ask me which is the better option, I have to refer you to the huge literature on model selection.

Best,
Ravi
[R] Subject: glm and stepAIC selects too many effects
If AIC is giving you a model that is too large, then use BIC (log(n) as the penalty for adding a term in the model). This will yield a more parsimonious model. Now, if you ask me which is the better option, I have to refer you to the huge literature on model selection.

Best,
Ravi
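In practice, the BIC penalty can be requested directly from MASS::stepAIC via its k argument (a sketch; 'fit' and 'mydata' are hypothetical placeholders for the fitted model and data):

library(MASS)
## k = 2 gives AIC (the default); k = log(n) gives the BIC penalty.
n <- nrow(mydata)
fit_bic <- stepAIC(fit, k = log(n))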
Re: [R] Haplo.glm error: Failed to converge during EM-glm loop
It is a "warning" message, but not an "error" message. Therefore, you should still get parameter estimates. However, lack of convergence is often an important issue, although in the case of the EM algorithm it may not be so serious, since EM is notoriously slow to converge when you set a small tolerance for convergence. You may fiddle with the control options for the EM algorithm and see if you can get convergence. If not, you should contact the package authors.

Hope this is helpful,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
[R] Linear Regressions with non-negativity constraint
Hi,
Take a look at the package "ic.infer" by Ulrike Grömping.
https://www.jstatsoft.org/article/view/v033i10

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
Re: [R] help in maximum likelihood
Standard error = sqrt(diag(solve(opt$hessian)))

Ravi

From: Alaa Sindi [mailto:alaasi...@icloud.com]
Sent: Wednesday, March 02, 2016 3:22 PM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: help in maximum likelihood

Thank you very much prof. Ravi,

That was very helpful. Is there a way to get the t and p value for the coefficients?

Thanks
Alaa

On Mar 2, 2016, at 10:05 AM, Ravi Varadhan wrote:

There is nothing wrong with the optimization. It is a warning message. However, this is a good example to show that one should not simply dismiss a warning before understanding what it means.

The MLE parameters are also large, indicating that there is something funky about the model or the data or both. In your case, there is one major problem with the data: for the highest dose (value of x), you have all subjects responding, i.e. y = n. Even for the next lower dose, there is almost complete response. Where do these data come from? Are they real or fake (simulated) data?

Also, take a look at the eigenvalues of the hessian at the solution. You will see that there is some ill-conditioning, as the eigenvalues are widely separated.

x <- c(1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839)
y <- c( 6, 13, 18, 28, 52, 53, 61, 60)
n <- c(59, 60, 62, 56, 63, 59, 62, 60)

# note: there is no need to have the choose(n, y) term in the likelihood
fn <- function(p) sum( - (y*(p[1]+p[2]*x) - n*log(1+exp(p[1]+p[2]*x))) )
out <- nlm(fn, p = c(-50, 20), hessian = TRUE)
out
eigen(out$hessian)

Hope this is helpful,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
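A minimal sketch answering Alaa's follow-up about t and p values, building on the nlm() result 'out' from the code above (this assumes the usual asymptotic normality of the MLE, hence Wald-type z statistics):

se <- sqrt(diag(solve(out$hessian)))   # standard errors from the inverse Hessian
z  <- out$estimate / se                # nlm() returns the MLE in $estimate
p  <- 2 * pnorm(-abs(z))               # two-sided p-values
cbind(estimate = out$estimate, se = se, z = z, p.value = p)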
Re: [R] help in maximum likelihood
There is nothing wrong with the optimization. It is a warning message. However, this is a good example to show that one should not simply dismiss a warning before understanding what it means.

The MLE parameters are also large, indicating that there is something funky about the model or the data or both. In your case, there is one major problem with the data: for the highest dose (value of x), you have all subjects responding, i.e. y = n. Even for the next lower dose, there is almost complete response. Where do these data come from? Are they real or fake (simulated) data?

Also, take a look at the eigenvalues of the hessian at the solution. You will see that there is some ill-conditioning, as the eigenvalues are widely separated.

x <- c(1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839)
y <- c( 6, 13, 18, 28, 52, 53, 61, 60)
n <- c(59, 60, 62, 56, 63, 59, 62, 60)

# note: there is no need to have the choose(n, y) term in the likelihood
fn <- function(p) sum( - (y*(p[1]+p[2]*x) - n*log(1+exp(p[1]+p[2]*x))) )
out <- nlm(fn, p = c(-50, 20), hessian = TRUE)
out
eigen(out$hessian)

Hope this is helpful,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
[R] Looking for a data set - Teratology data of Chen and Kodell (1989)
Hi,

I am looking for the teratology data set from the Chen and Kodell (JASA 1989) paper on mice exposed to different doses of a chemical, DEHP. This was also analyzed in a Biometrics 2000 paper by Louise Ryan using GEE. I think this data set is publicly available, but I am unable to locate it. Does anyone know of this data set or worked with it? If so, can someone share it with me?

Thank you very much.

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
Re: [R] Solve an equation including integral
There was a mistake in the previously sent function. I had hard-coded the values of the parameters `nu' and `exi'. Use this one:

myroot <- function(t, nu, exi, alpha){
  fun <- function(t, exi, nu) (nu+t^2)^(-(nu+1)/2) * exp(((nu+1)*exi*t)/((nu+t^2)^0.5))
  res <- alpha - integrate(fun, -Inf, t, nu=nu, exi=exi)$value
  return(res)
}

uniroot(myroot, c(-2, 2), nu=2, exi=0.5, alpha=.05)

Ravi

From: Ravi Varadhan
Sent: Friday, January 08, 2016 11:29 AM
To: r-help@r-project.org; 'sstol...@gmail.com'
Subject: Re: Solve an equation including integral

I think this is what you want:

myroot <- function(t, nu, exi, alpha){
  fun <- function(t, exi, nu) (nu+t^2)^(-(nu+1)/2) * exp(((nu+1)*exi*t)/((nu+t^2)^0.5))
  res <- alpha - integrate(fun, -Inf, t, nu=2, exi=0.5)$value
  return(res)
}

uniroot(myroot, c(-2, 2), nu=2, exi=0.5, alpha=.05)

Hope this is helpful,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
Re: [R] Solve an equation including integral
I think this is what you want:

myroot <- function(t, nu, exi, alpha){
  fun <- function(t, exi, nu) (nu+t^2)^(-(nu+1)/2) * exp(((nu+1)*exi*t)/((nu+t^2)^0.5))
  res <- alpha - integrate(fun, -Inf, t, nu=2, exi=0.5)$value
  return(res)
}

uniroot(myroot, c(-2, 2), nu=2, exi=0.5, alpha=.05)

Hope this is helpful,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
[R] grep/regexp
Hi,

I would appreciate some help with using grep(). I have a bunch of variables in a data frame and I would like to select some of them using grep. Here is an example of what I am trying to do:

vars <- c("Fr_I_total", "Fr_I_percent_of_CD4", "Ki.67_in_Fr_I_percent_of_Fr_I",
          "Fr_II_percent_of_CD4", "Ki.67_in_Fr_II_percent_of_Fr_II")

From the above vector, I would like to select those variables beginning with `Fr' and containing `percent' in them. In other words, I would like to get the variables "Fr_I_percent_of_CD4" and "Fr_II_percent_of_CD4".

How can I use grep() to do this? More generally, are there any good online resources with examples like this for the use of grep() and regexpr() in R? I didn't find the help pages for these very user-friendly.

Thank you very much,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
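A minimal sketch of one way to do this, either by combining two simple patterns or with a single regular expression:

## names that start with "Fr" AND contain "percent"
vars[grepl("^Fr", vars) & grepl("percent", vars)]
## the same selection with one pattern
grep("^Fr.*percent", vars, value = TRUE)
## both return: "Fr_I_percent_of_CD4" "Fr_II_percent_of_CD4"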
Re: [R] Cautioning optim() users about "Nelder-Mead" default - (originally) Optim instability
Hi John,
My main point is not about Nelder-Mead per se. It is *primarily* about the Nelder-Mead implementation in optim(). The users of optim() should be cautioned regarding the default algorithm and that they should consider alternatives such as "BFGS" in optim(), or other implementations of Nelder-Mead.

Best regards,
Ravi

From: ProfJCNash
Sent: Sunday, November 15, 2015 12:21 PM
To: Ravi Varadhan; 'r-help@r-project.org'; lorenzo.ise...@gmail.com
Cc: b...@xs4all.nl; Gabor Grothendieck
Subject: Re: Cautioning optim() users about "Nelder-Mead" default - (originally) Optim instability

Not contradicting Ravi's message, but I wouldn't say Nelder-Mead is "bad" per se. Its issues are that it assumes the parameters are all on the same scale, and the termination (not convergence) test can't use gradients, so it tends to get "near" the optimum very quickly -- say only 10% of the computational effort -- then spends an awful amount of effort deciding it's got there. It often will do poorly when the function has nearly "flat" zones, e.g., a long valley with very low slope.

So my message is still that Nelder-Mead is an unfortunate default -- it has been chosen I believe because it is generally robust and doesn't need gradients. BFGS really should use accurate gradients, preferably computed analytically, so it would only be a good default in that case or with very good approximate gradients (which are costly computationally).

However, if you understand what NM is doing, and use it accordingly, it is a valuable tool. I generally use it as a first try BUT turn on the trace to watch what it is doing as a way to learn a bit about the function I am minimizing. Rarely would I use it as a production minimizer.

Best,
JN

On 15-11-15 12:02 PM, Ravi Varadhan wrote:
> Hi,
>
> While I agree with the comments about paying attention to parameter
> scaling, a major issue here is that the default optimization algorithm,
> Nelder-Mead, is not very good. It is unfortunate that the optim
> implementation chose this as the "default" algorithm. I have several
> instances where people have come to me with poor results from using
> optim(), because they did not realize that the default algorithm is
> bad. We (John Nash and I) have pointed this out before, but the R core
> has not addressed this issue due to backward compatibility reasons.
>
> There is a better implementation of Nelder-Mead in the "dfoptim" package.
>
> require(dfoptim)
> mm_def1 <- nmk(par = par_ini1, min.perc_error, data = data)
> mm_def2 <- nmk(par = par_ini2, min.perc_error, data = data)
> mm_def3 <- nmk(par = par_ini3, min.perc_error, data = data)
> print(mm_def1$par)
> print(mm_def2$par)
> print(mm_def3$par)
>
> In general, better implementations of optimization algorithms are
> available in packages such as "optimx", "nloptr". It is unfortunate
> that most naïve users of optimization in R do not recognize this.
> Perhaps, there should be a "message" in the optim help file that points
> this out to the users.
>
> Hope this is helpful,
> Ravi
[R] Cautioning optim() users about "Nelder-Mead" default - (originally) Optim instability
Hi,

While I agree with the comments about paying attention to parameter scaling, a major issue here is that the default optimization algorithm, Nelder-Mead, is not very good. It is unfortunate that the optim implementation chose this as the "default" algorithm. I have several instances where people have come to me with poor results from using optim(), because they did not realize that the default algorithm is bad. We (John Nash and I) have pointed this out before, but the R core has not addressed this issue due to backward compatibility reasons.

There is a better implementation of Nelder-Mead in the "dfoptim" package.

require(dfoptim)
mm_def1 <- nmk(par = par_ini1, min.perc_error, data = data)
mm_def2 <- nmk(par = par_ini2, min.perc_error, data = data)
mm_def3 <- nmk(par = par_ini3, min.perc_error, data = data)
print(mm_def1$par)
print(mm_def2$par)
print(mm_def3$par)

In general, better implementations of optimization algorithms are available in packages such as "optimx", "nloptr". It is unfortunate that most naïve users of optimization in R do not recognize this. Perhaps, there should be a "message" in the optim help file that points this out to the users.

Hope this is helpful,
Ravi
Re: [R] About error: L-BFGS-B needs finite values of 'fn'
It seems like there is substantial finite-sample bias in the MLEs. Either that or there is some error in your procedure. See attached code.

Ravi

From: Ravi Varadhan
Sent: Wednesday, November 11, 2015 2:33 PM
To: 'denizozo...@gazi.edu.tr'; r-help@r-project.org
Cc: 'profjcn...@gmail.com'
Subject: Re: [R] About error: L-BFGS-B needs finite values of 'fn'

With a small sample size, n=30, you will have realizations of data where you will run into difficulties with the MLE of the generalized gamma distribution. This is mainly due to the `k' parameter. Increase the sample size (e.g., n=50 or 100) and this problem is less likely to happen (but can still happen).

I would strongly suggest that when you are doing simulations, you should encapsulate the parameter estimation inside a `try' or `tryCatch' statement so that when there is an error, the simulation keeps going rather than crashing out. See the attached code.

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
Re: [R] About error: L-BFGS-B needs finite values of 'fn'
With a small sample size, n=30, you will have realizations of data where you will run into difficulties with the MLE of the generalized gamma distribution. This is mainly due to the `k' parameter. Increase the sample size (e.g., n=50 or 100) and this problem is less likely to happen (but can still happen).

I would strongly suggest that when you are doing simulations, you should encapsulate the parameter estimation inside a `try' or `tryCatch' statement so that when there is an error, the simulation keeps going rather than crashing out. See the attached code.

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
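A minimal sketch of the tryCatch pattern described above (the attached code is not in the digest; the fitting call, 'negloglik', and the simulated data below are hypothetical placeholders):

nsim <- 1000
estimates <- vector("list", nsim)
for (i in seq_len(nsim)) {
  dat <- rgamma(30, shape = 2, rate = 1)   # stand-in for one simulated sample
  fit <- tryCatch(
    optim(par = c(1, 1), fn = negloglik, x = dat, method = "L-BFGS-B",
          lower = c(1e-6, 1e-6)),
    error = function(e) NULL               # record the failure, keep going
  )
  if (!is.null(fit)) estimates[[i]] <- fit$par
}
## failed replicates remain NULL and can be counted afterwards:
sum(vapply(estimates, is.null, logical(1)))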
Re: [R] How to get variable name while doing series of regressions in an automated manner?
Thank you very much, Marc & Bert.

Bert - I think you're correct. With Marc's solution, I am not able to get the response variable name in the call to lm(). But, your solution works well.

Best regards,
Ravi

-----Original Message-----
From: Bert Gunter [mailto:bgunter.4...@gmail.com]
Sent: Tuesday, October 27, 2015 1:50 PM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: [R] How to get variable name while doing series of regressions in an automated manner?

Marc, Ravi:

I may misunderstand, but I think Marc's solution labels the list components but not necessarily the summary() outputs. This might be sufficient, as in:

> z <- list(y1=rnorm(10,5), y2=rnorm(10,8), x=1:10)
>
> ## 1
> results1 <- lapply(z[-3], function(y) lm(log(y)~x, data=z))
> lapply(results1, summary)
$y1

Call:
lm(formula = log(y) ~ x, data = z)

Residuals:
    Min      1Q  Median      3Q     Max
-0.2185 -0.1259 -0.0643  0.1340  0.3988

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.69319    0.14375  11.779 2.47e-06 ***
x           -0.01495    0.02317  -0.645    0.537
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2104 on 8 degrees of freedom
Multiple R-squared:  0.04945,  Adjusted R-squared:  -0.06937
F-statistic: 0.4161 on 1 and 8 DF,  p-value: 0.5369

$y2

Call:
lm(formula = log(y) ~ x, data = z)

Residuals:
      Min        1Q    Median        3Q       Max
-0.229072 -0.094579 -0.006498  0.134303  0.188158

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.084846   0.104108  20.026 4.03e-08 ***
x           -0.006226   0.016778  -0.371     0.72
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1524 on 8 degrees of freedom
Multiple R-squared:  0.01692,  Adjusted R-squared:  -0.106
F-statistic: 0.1377 on 1 and 8 DF,  p-value: 0.7202

## 2
Alternatively, if you want output with the correct variable names, bquote() can be used, as in:

> results2 <- lapply(names(z)[1:2],
+    function(nm){
+      fo <- formula(paste0("log(", nm, ")~x"))
+      eval(bquote(lm(.(u), data=z), list(u=fo)))
+    })
> lapply(results2, summary)
[[1]]

Call:
lm(formula = log(y1) ~ x, data = z)

Residuals:
    Min      1Q  Median      3Q     Max
-0.2185 -0.1259 -0.0643  0.1340  0.3988

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.69319    0.14375  11.779 2.47e-06 ***
x           -0.01495    0.02317  -0.645    0.537
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2104 on 8 degrees of freedom
Multiple R-squared:  0.04945,  Adjusted R-squared:  -0.06937
F-statistic: 0.4161 on 1 and 8 DF,  p-value: 0.5369

[[2]]

Call:
lm(formula = log(y2) ~ x, data = z)

Residuals:
      Min        1Q    Median        3Q       Max
-0.229072 -0.094579 -0.006498  0.134303  0.188158

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.084846   0.104108  20.026 4.03e-08 ***
x           -0.006226   0.016778  -0.371     0.72
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1524 on 8 degrees of freedom
Multiple R-squared:  0.01692,  Adjusted R-squared:  -0.106
F-statistic: 0.1377 on 1 and 8 DF,  p-value: 0.7202

HTH or apologies if I've missed the point and broadcasted noise.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
   -- Clifford Stoll

On Tue, Oct 27, 2015 at 8:19 AM, Ravi Varadhan wrote:
> Hi,
>
> I am running through a series of regressions in a loop as follows:
>
> results <- vector("list", length(mydata$varnames))
>
> for (i in 1:length(mydata$varnames)) {
>   results[[i]] <- summary(lm(log(eval(parse(text=varnames[i]))) ~ age + sex +
>                              CMV.status, data=mydata))
> }
>
> Now, when I look at the results[[i]] objects, I won't be able to see the
> original variable names. Obviously, I will only see the following:
>
> Call:
> lm(formula = log(eval(parse(text = varnames[i]))) ~ age + sex + CMV.status,
>    data = mydata)
>
> Is there a way to display the original variable names on the LHS? In
> addition, is there a better paradigm for doing this type of series of
> regressions in an automatic fashion?
>
> Thank you very much,
> Ravi
>
> Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
> Associate Professor, Department of Oncology
> Division of Biostatistics & Bioinformatics
> Sidney Kimmel Comprehensive Cancer Center
> Johns Hopkins University
> 550 N. Broadway, Suite -E
> Baltimore, MD 21205
> 410-502-2619
[R] How to get variable name while doing series of regressions in an automated manner?
Hi,

I am running through a series of regressions in a loop as follows:

results <- vector("list", length(mydata$varnames))

for (i in 1:length(mydata$varnames)) {
  results[[i]] <- summary(lm(log(eval(parse(text=varnames[i]))) ~ age + sex +
                             CMV.status, data=mydata))
}

Now, when I look at the results[[i]] objects, I won't be able to see the original variable names. Obviously, I will only see the following:

Call:
lm(formula = log(eval(parse(text = varnames[i]))) ~ age + sex + CMV.status,
   data = mydata)

Is there a way to display the original variable names on the LHS? In addition, is there a better paradigm for doing this type of series of regressions in an automatic fashion?

Thank you very much,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
Re: [R] Linear regression with a rounded response variable
Dear Peter, Charles, Gabor, Jim, Mark, Victor, Peter, and Harold,

You have given me plenty of ammunition. Thank you very much for the useful answers.

Gratefully,
Ravi

From: peter dalgaard
Sent: Wednesday, October 21, 2015 8:11 PM
To: Charles C. Berry
Cc: Ravi Varadhan; r-help@r-project.org
Subject: Re: [R] Linear regression with a rounded response variable

> On 21 Oct 2015, at 19:57 , Charles C. Berry wrote:
>
> On Wed, 21 Oct 2015, Ravi Varadhan wrote:
>
>> [snippage]
>
> If half the subjects have a value of 5 seconds and the rest are split between
> 4 and 6, your assertion that rounding induces an error of
> dunif(epsilon, -0.5, 0.5) is surely wrong (more positive errors in the 6 second
> group and more negative errors in the 4 second group under any plausible
> model).

Yes, and I think that the suggestion in another post to look at censored regression is more in the right direction.

In general, I'd expect the bias caused by rounding the response to be quite small, except at very high granularity. I did a few small experiments with the simplest possible linear model: estimating a mean based on highly rounded data,

> y <- round(rnorm(1e2, pi, .5))
> mean(y)
[1] 3.12
> table(y)
y
 2  3  4  5
13 63 23  1

Or, using a bigger sample:

> mean(round(rnorm(1e8, pi, .5)))
[1] 3.139843

in which there is a visible bias, but quite a small one:

> pi - 3.139843
[1] 0.001749654

At lower granularity (sd=1 instead of .5), the bias has almost disappeared.

> mean(round(rnorm(1e8, pi, 1)))
[1] 3.141577

If the granularity is increased sufficiently, you _will_ see a sizeable bias (because almost all observations will be round(pi)==3):

> mean(round(rnorm(1e8, pi, .1)))
[1] 3.00017

A full ML fit (with known sigma) is pretty easily done:

> library(stats4)
> mll <- function(mu) -sum(log(pnorm(y+.5, mu, .5) - pnorm(y-.5, mu, .5)))
> mle(mll, start=list(mu=3))

Call:
mle(minuslogl = mll, start = list(mu = 3))

Coefficients:
      mu
3.122069

> mean(y)
[1] 3.12

As you see, the difference is only 0.002. A small simulation (1000 repl.) gave (r[1,]==MLE; r[2,]==mean)

> summary(r[1,] - r[2,])
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.004155  0.000702  0.001495  0.001671  0.002554  0.006860

so the corrections relative to the crude mean stay within one unit in the 2nd place. Notice that the corrections are pretty darn close to cancelling out the bias.

-pd

> HTH,
>
> Chuck

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
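The censored-regression suggestion above can be carried out as interval regression, since a recorded value y is only known to lie in [y - 0.5, y + 0.5]. A minimal sketch using the survival package (the data frame 'walkdat' and the predictors are hypothetical placeholders; this is one way to implement the idea, not code from the thread):

library(survival)
## Each rounded response is treated as interval-censored in
## [y - 0.5, y + 0.5]; dist = "gaussian" gives a linear model on that scale.
fit <- survreg(Surv(y - 0.5, y + 0.5, type = "interval2") ~ age + height + sex,
               data = walkdat, dist = "gaussian")
summary(fit)
predict(fit)   # real-valued predictions, not integers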
[R] Linear regression with a rounded response variable
Hi,

I am dealing with a regression problem where the response variable, time (seconds) to walk 15 ft, is rounded to the nearest integer. I do not care for the regression coefficients per se, but my main interest is in getting the prediction equation for walking speed, given the predictors (age, height, sex, etc.), where the predictions will be real numbers, and not integers. The hope is that these predictions should provide unbiased estimates of the "unrounded" walking speed.

This sounds like a measurement error problem, where the measurement error is due to rounding and hence would be uniformly distributed (-0.5, 0.5). Are there any canonical approaches for handling this type of a problem? What is wrong with just doing the standard linear regression? I googled and saw that this question was asked by someone else in a stackexchange post, but it was unanswered.

Any suggestions?

Thank you,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
Re: [R] Bug in auglag?
Dear Rainer,

This is NOT a bug in auglag. I already mentioned that auglag() can work with infeasible starting values, which also implies that the function must be evaluable at infeasible values. A simple solution to your problem would be to fix up your objective function such that it evaluates to `Inf' or some large value when the parameter values are not in the constrained domain. constrOptim.nl() is a barrier method, so it forces the initial value and the subsequent iterates to be feasible.

Best,
Ravi

From: Rainer M Krug
Sent: Tuesday, October 6, 2015 9:20 AM
To: Ravi Varadhan
Cc: 'r-help@r-project.org'
Subject: Bug in auglag?

Hi Ravi,

I would like to come back to your offer. I have a problem which possibly is caused by a bug or by something I don't understand: my function to be minimised is executed even when an element in hin() is negative.

My hin looks as follows:

--8<---cut here---start->8---
hinMahat <- function(x, hauteur, na, zjoint, y, LAI, ...) {
  if (x[1] < 0) {
    cat(names(list(...)), "\n")
    cat(..., "\n")
    cat(x, "|", hauteur, LAI, y, "\n")
  }
  h <- rep(NA, 8)
  if (!missing(na)) {
    x <- c(na, x)
  }
  if (!missing(y)) {
    x <- c(x, y)
  }
  if (!missing(zjoint)) {
    x <- c(x[1], zjoint, x[2])
  }
  ##
  dep <- hauteur * (0.05 + LAI^0.02 / 2) + (x[3] - 1)/20
  h[1] <- dep
  h[2] <- hauteur - dep
  ## if (h[2]==0) {
  ##   h[2] <- -1
  ## }
  ##
  z0 <- hauteur * (0.23 + LAI^0.25 / 10) + (x[3] - 1)/67
  h[3] <- z0
  ## if (h[3]==0) {
  ##   h[3] <- -1
  ## }
  h[4] <- hauteur - z0
  ##
  h[5] <- x[1]
  ##
  h[6] <- x[2]
  h[7] <- hauteur - x[2]
  ##
  h[8] <- hauteur - dep - z0
  if (any(h<=0)) {
    cat(h, "\n")
    cat("\n")
  }
  return(h)
}
--8<---cut here---end--->8---

The x contains up to three elements: c(na=, zjoint=, y=) and I fit these three, unless one or two are specified explicitly.

The values going into hin are:

,----
| ... (z u ua za z0sol )
| 3 11 17 23 29 37 0.315 0.422 0.458 0.556 1.567 1.747 1.747 37 0.001
|
| x(na, zjoint): -8.875735 24.51316
| hauteur: 28
| na: 8.1
| y: 3
`----

The resulting hin() is:

,----
| 16.09815 11.90185 11.19352 16.80648 -8.875735 24.51316 3.486843 0.708335
`----

which is negative in element 5, as x[2]=na is negative. So I would expect that the function fn is not evaluated. But it is, and raises an error:

,----
| Error in wpLELMahat(z = z, ua = ua, na = ifelse(missing(na), par[1], na), :
|   na has to be larger or equal than zero!
`----

Is this a misunderstanding on my part, or is it an error in the function auglag?

Below is the function which is doing the minimisation. If I replace auglag() with constrOptim.nl(), the optimisation is working as expected. So I think this is a bug in auglag?

Let me know if you need further information.

Cheers,
Rainer

--8<---cut here---start->8---
fitAuglag.wpLEL.mahat.single <- function(
  z,
  u,
  LAI,
  initial = c(na=9, zjoint=0.2*2, y=3),
  na,
  zjoint,
  y,
  h = 28,
  za = 37,
  z0sol = 0.001,
  hin,
  ...
) {
  if (missing(hin)) {
    hin <- hinMahat
  }
  wpLELMin <- function(par, na, zjoint, y, z, u, ua, hauteur, za, z0sol, LAI) {
    result <- NA
    try({
      p <- wpLELMahat(
        z = z,
        ua = ua,
        na = ifelse(missing(na), par[1], na),
        zjoint = ifelse(missing(zjoint), par[2], zjoint),
        h = hauteur,
        za = za,
        z0sol = z0sol,
        LAI = LAI,
        y = ifelse(missing(y), par[3], y)
      )
      result <- sum( ( (p$u - u)^2 ) / length(u) )
    },
    silent = FALSE
    )
    ## cat("From wpLELMin", par, "\n")
    return( result )
  }
  ua <- u[length(u)]
  result <- list()
  result$method <- "fitAuglag.wpLEL.mahat.single"
  result$initial <- initial
  result$dot <- list(...)
  result$z <- z
  result$u <- u
  result$fit <- auglag(
    par = initial,
    fn = wpLELMin,
    hin = hin,
    na = na,
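A minimal sketch of the fix Ravi describes -- wrapping the objective so it is evaluable, but heavily penalized, at infeasible points ('fn', 'hin', and 'p0' below stand for the user's objective, constraint function, and starting values; they are placeholders, not code from this thread):

## auglag() may evaluate the objective at infeasible points, so return a
## large value (or Inf) there instead of stopping with an error.
safe_fn <- function(x, ...) {
  if (any(hin(x, ...) <= 0)) return(1e10)  # infeasible: penalize, don't error
  fn(x, ...)                               # feasible: the original objective
}
ans <- auglag(par = p0, fn = safe_fn, hin = hin)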
Re: [R] optimizing with non-linear constraints
I would recommend that you use auglag() rather than constrOptim.nl() in the package "alabama". It is a better algorithm, and it does not require feasible starting values.

Best,
Ravi

-----Original Message-----
From: Rainer M Krug [mailto:rai...@krugs.de]
Sent: Thursday, October 01, 2015 3:37 AM
To: Ravi Varadhan
Cc: 'r-help@r-project.org'
Subject: Re: optimizing with non-linear constraints

Ravi Varadhan writes:

> Hi Rainer,
> It is very simple to specify the constraints (linear or nonlinear) in
> "alabama". They are specified in a function called `hin', where the
> constraints are written such that they are positive.

OK - I somehow missed the part that, when the values x are valid, i.e. in the range as defined by the conditions, the results of hin(x) are all positive.

> Your two nonlinear constraints would be written as follows:
>
> hin <- function(x, LAI) {
>   h <- rep(NA, 2)
>   h[1] <- LAI^x[2] / x[3] + x[1]
>   h[2] <- 1 - x[1] - LAI^x[2] / x[3]
>   h
> }

Makes perfect sense.

> Please take a look at the help page. If it is still not clear, you can
> contact me offline.

Yup - I did. But I somehow missed the fact stated above.

I am using constrOptim() and constrOptim.nl() for a paper and am compiling a separate document which explains how to get the constraints for the two functions step by step - I will make it available as a blog post and a pdf.

I might have further questions concerning the different fitting functions and which ones are the most appropriate in my case.

Thanks a lot,
Rainer

> Best,
> Ravi
>
> Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
> Associate Professor, Department of Oncology
> Division of Biostatistics & Bioinformatics
> Sidney Kimmel Comprehensive Cancer Center
> Johns Hopkins University
> 550 N. Broadway, Suite -E
> Baltimore, MD 21205
> 410-502-2619

--
Rainer M. Krug
email: Rainerkrugsde
PGP: 0x0F52F982
Re: [R] optimizing with non-linear constraints
Hi Rainer,

It is very simple to specify the constraints (linear or nonlinear) in "alabama". They are specified in a function called `hin', where the constraints are written such that they are positive. Your two nonlinear constraints would be written as follows:

hin <- function(x, LAI) {
  h <- rep(NA, 2)
  h[1] <- LAI^x[2] / x[3] + x[1]
  h[2] <- 1 - x[1] - LAI^x[2] / x[3]
  h
}

Please take a look at the help page. If it is still not clear, you can contact me offline.

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor, Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619
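For completeness, a sketch of how this hin() plugs into the optimizer (the objective 'myFn', starting values 'p0', and data 'LAI' are placeholders):

library(alabama)
## auglag() enforces hin(x) >= 0 at the solution; extra arguments such as
## LAI are passed through to both fn and hin.
ans <- auglag(par = p0, fn = myFn, hin = hin, LAI = LAI)
ans$par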
[R] lsqlin in pracma
In solve.QP(), you don't need to expand the equality into two inequalities. It can DIRECTLY handle the equality constraints. The first `meq' rows of the constraint matrix are equality constraints. Here is the excerpt from the documentation:

meq: the first meq constraints are treated as equality constraints, all further as inequality constraints (defaults to 0).

Therefore, solve.QP() can provide the full functionality of lsqlin in Matlab. However, one caveat is that the bounds constraints have to be implemented via inequalities in solve.QP(), which is a slight pain, but not a deal breaker.

Best,
Ravi
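A minimal sketch of a constrained least-squares problem ||Cx - d||^2, posed for quadprog::solve.QP() (the matrices C and d are made-up illustration data):

## solve.QP minimizes -dvec'b + (1/2) b'Dmat b subject to
## t(Amat) %*% b >= bvec, with the first meq constraints as equalities.
library(quadprog)
set.seed(1)
C <- matrix(rnorm(20), 10, 2); d <- rnorm(10)
Dmat <- crossprod(C)                # C'C, the quadratic term
dvec <- crossprod(C, d)             # C'd, the linear term
Amat <- cbind(rep(1, 2), diag(2))   # sum(x) = 1 first, then bounds x >= 0
bvec <- c(1, 0, 0)
solve.QP(Dmat, dvec, Amat, bvec, meq = 1)$solution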
Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R
What about numeric constants, like `163'? eval(Pre(exp(sqrt(163)*pi), 120)) does not work.
Thanks, Ravi

From: David Winsemius
Sent: Saturday, July 4, 2015 1:12 PM
To: Duncan Murdoch
Cc: r-help; John Nash; Ravi Varadhan
Subject: Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R

> On Jul 4, 2015, at 12:20 AM, Duncan Murdoch wrote:
>
> On 04/07/2015 8:21 AM, David Winsemius wrote:
>>
>>> On Jul 3, 2015, at 11:05 PM, Duncan Murdoch wrote:
>>>
>>> On 04/07/2015 3:45 AM, David Winsemius wrote:
>>>>
>>>>> On Jul 3, 2015, at 5:08 PM, David Winsemius wrote:
>>>>>
>>>>> It doesn't appear to me that mpfr was ever designed to handle expressions as the first argument.
>>>>
>>>> This could be a start. Obviously one would want to include code to do other substitutions, probably using the all.vars function to pull out the other "constants" and 'numeric' values to make them of equivalent precision. I'm guessing you want to follow the parse-tree, test the numbers for integer-ness, and then replace them by paste0("mpfr(", val, "L, ", prec, ")").
>>>>
>>>> Pre <- function(expr, prec){ sexpr <- deparse(substitute(expr))
>>>
>>> Why deparse? That's almost never a good idea. I can't try your code (I don't have mpfr available), but it would be much better to modify the expression than the text representation of it. For example, I think your code would modify strings containing "pi", or variables with those letters in them, etc. If you used substitute(expr) without the deparse(), you could replace the symbol "pi" with the call to the Const function, and be more robust.
>>>
>>
>> Really? I did try. I was fairly sure that someone could do better, but I don't see an open path along the lines you suggest. I'm pretty sure I tried `substitute(expr, list(pi = pi))` when `expr` had been the formal argument and got disappointed because there is no `pi` in the expression `expr`. I _thought_ the problem was that `substitute` does not evaluate its first argument, but I do admit to being pretty much of a klutz with this sort of programming. I don't think you need to have mpfr installed in order to demonstrate this.
>
> The substitute() function really does two different things. substitute(expr) (with no second argument) grabs the underlying expression out of a promise. substitute(expr, list(pi = pi)) tries to make the substitution in the expression "expr", so it doesn't see "pi".

Thank you. That was really helpful. I hope it "sticks" to a sufficiently durable set of neurons.

> This should work:
>
> do.call(substitute, list(expr = substitute(expr), env = list(pi = Const(pi, 120))))
>
> (but I can't evaluate the Const function to test it).
The expression `pi` as the argument to Const only needed to be character()-ized, and then evaluation on that result succeeded:

library(Rmpfr)

Pre <- function(expr, prec){
  do.call(substitute, list(expr = substitute(expr),
                           env = list(pi = Const(pi, prec))))
}

> Pre(exp(pi), 120)
Error in Const(pi, prec) : 'name' must be one of 'pi', 'gamma', 'catalan', 'log2'

Pre <- function(expr, prec){
  do.call(substitute, list(expr = substitute(expr),
                           env = list(pi = Const('pi', prec))))
}

> Pre(exp(pi), 120)
exp(list())
> eval( Pre(exp(pi), 120) )
1 'mpfr' number of precision 120 bits
[1] 23.140692632779269005729086367948547394

So the next step for delivering Ravi's request would be to add the rest of the 'Const' options:

Pre <- function(expr, prec){
  do.call(substitute, list(expr = substitute(expr),
                           env = list(pi = Const('pi', prec),
                                      catalan = Const('catalan', prec),
                                      log2 = Const('log2', prec),
                                      gamma = Const('gamma', prec))))
}

> eval(Pre(exp(gamma)))
Error in stopifnot(is.numeric(prec)) : argument "prec" is missing, with no default
> eval(Pre(exp(gamma), 120))
1 'mpfr' number of precision 120 bits
[1] 1.781072417990197985236504103107179549

> This works:
>
> do.call(substitute, list(expr = substitute(expr), env = list(pi = quote(Const(pi, 120)))))
>
> because it never evaluates the Const function, it just returns some code that you could modify more if you liked.
>
> Duncan Murdoch

David Winsemius, MD
Alameda, CA, USA

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
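[A tiny, Rmpfr-free sketch of Duncan's two points about substitute(), added here for illustration; quote() keeps Const() unevaluated, so nothing needs to be installed to run it.]

f <- function(expr) substitute(expr)
f(exp(pi))
## exp(pi)   -- pulls the expression out of the promise

do.call(substitute,
        list(quote(exp(pi)), list(pi = quote(Const("pi", 120)))))
## exp(Const("pi", 120))   -- substitutes into an explicit expression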
Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R
Thank you all. I did think about declaring `pi' as a special constant, but for some reason didn't actually try it. Would it be easy to have mpfr() written such that its argument is automatically of extended precision? In other words, if I just called mpfr(exp(sqrt(163)*pi), 120), then all the constants, 163 and pi, would automatically be of 120 bits precision. Is this easy to do?
Best, Ravi

From: David Winsemius
Sent: Friday, July 3, 2015 2:06 PM
To: John Nash
Cc: r-help; Ravi Varadhan
Subject: Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R

> On Jul 3, 2015, at 8:08 AM, John Nash wrote:
>
> Third try -- I unsubscribed and re-subscribed. Sorry to Ravi for extra traffic.
>
> In case anyone gets a duplicate, R-help bounced my msg "from non-member" (I changed server, but not email yesterday, so possibly something changed enough). Trying again.
>
> JN
>
> I got the Wolfram answer as follows:
>
> library(Rmpfr)
> n163 <- mpfr(163, 500)
> n163
> pi500 <- mpfr(pi, 500)
> pi500
> pitan <- mpfr(4, 500)*atan(mpfr(1,500))
> pitan
> pitan-pi500
> r500 <- exp(sqrt(n163)*pitan)
> r500
> check <- "262537412640768743.25007259719818568887935385..."
> savehistory("jnramanujan.R")
>
> Note that I used 4*atan(1) to get pi.

RK got it right by following the example in the help page for mpfr: Const("pi", 120). The R `pi` constant is not recognized by mpfr as being anything other than another double. There are four special values that mpfr recognizes.

-- Best; David

> It seems that may be important, rather than converting.
>
> JN
>
> On 15-07-03 06:00 AM, r-help-requ...@r-project.org wrote:
>
>> Message: 40
>> Date: Thu, 2 Jul 2015 22:38:45 +
>> From: Ravi Varadhan
>> To: "'Richard M. Heiberger'", Aditya Singh
>> Cc: r-help
>> Subject: Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R
>> Message-ID: <14ad39aaf6a542849bbf3f62a0c2f...@dom-eb1-2013.win.ad.jhu.edu>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi Rich,
>>
>> The Wolfram answer is correct.
>> http://mathworld.wolfram.com/RamanujanConstant.html
>>
>> There is no code for Wolfram alpha. You just go to their web engine and plug in the expression and it will give you the answer.
>> http://www.wolframalpha.com/
>>
>> I am not sure that the precedence matters in Rmpfr. Even if it does, the answer you get is still wrong, as you showed.
>>
>> Thanks,
>> Ravi
>>
>> -----Original Message-----
>> From: Richard M. Heiberger [mailto:r...@temple.edu]
>> Sent: Thursday, July 02, 2015 6:30 PM
>> To: Aditya Singh
>> Cc: Ravi Varadhan; r-help
>> Subject: Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R
>>
>> There is a precedence error in your R attempt. You need to convert 163 to 120 bits first, before taking its square root.
>>
>>>> exp(sqrt(mpfr(163, 120)) * mpfr(pi, 120))
>> 1 'mpfr' number of precision 120 bits
>> [1] 262537412640768333.51635812597335712954
>>
>> ## just the last four characters to the left of the decimal point.
>>>> tmp <- c(baseR=8256, Wolfram=8744, Rmpfr=8333, wrongRmpfr=7837)
>>>> tmp-tmp[2]
>>      baseR    Wolfram      Rmpfr wrongRmpfr
>>       -488          0       -411       -907
>>
>> You didn't give the Wolfram alpha code you used. There is no way of verifying the correct value from your email. Please check that you didn't have a similar precedence error in that code.
>>
>> Rich
>>
>> On Thu, Jul 2, 2015 at 2:02 PM, Aditya Singh via R-help wrote:
>>>> Ravi
>>>>
>>>> I am a chemical engineer by training. Is there not something like the law of corresponding states in numerical analysis?
>>>>
>>>> Aditya
>>>>
>>>> -- On Thu 2 Jul, 2015 7:28 AM PDT Ravi Varadhan wrote:
>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Ramanujan supposedly discovered that the number, 163, has this interesting property that exp(sqrt(163)*pi), which is obviously a transcendental number, is real close to an integer (close to 10^(-12)).
>>>>
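[Consolidating the thread's resolution into one minimal sketch: promote 163 to extended precision before taking its square root, and use a genuinely high-precision pi, either Const("pi", 120) or 4*atan(mpfr(1, 120)), instead of the double-precision constant. The digits printed will depend on the precision requested.]

library(Rmpfr)
exp(sqrt(mpfr(163, 120)) * Const("pi", 120))
exp(sqrt(mpfr(163, 120)) * (4 * atan(mpfr(1, 120))))   # same value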
Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R
Hi Rich,
The Wolfram answer is correct.
http://mathworld.wolfram.com/RamanujanConstant.html

There is no code for Wolfram alpha. You just go to their web engine and plug in the expression and it will give you the answer.
http://www.wolframalpha.com/

I am not sure that the precedence matters in Rmpfr. Even if it does, the answer you get is still wrong, as you showed.
Thanks,
Ravi

-----Original Message-----
From: Richard M. Heiberger [mailto:r...@temple.edu]
Sent: Thursday, July 02, 2015 6:30 PM
To: Aditya Singh
Cc: Ravi Varadhan; r-help
Subject: Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R

There is a precedence error in your R attempt. You need to convert 163 to 120 bits first, before taking its square root.

> exp(sqrt(mpfr(163, 120)) * mpfr(pi, 120))
1 'mpfr' number of precision 120 bits
[1] 262537412640768333.51635812597335712954

## just the last four characters to the left of the decimal point.
> tmp <- c(baseR=8256, Wolfram=8744, Rmpfr=8333, wrongRmpfr=7837)
> tmp-tmp[2]
     baseR    Wolfram      Rmpfr wrongRmpfr
      -488          0       -411       -907

You didn't give the Wolfram alpha code you used. There is no way of verifying the correct value from your email. Please check that you didn't have a similar precedence error in that code.

Rich

On Thu, Jul 2, 2015 at 2:02 PM, Aditya Singh via R-help wrote:
>
> Ravi
>
> I am a chemical engineer by training. Is there not something like the law of corresponding states in numerical analysis?
>
> Aditya
>
> -- On Thu 2 Jul, 2015 7:28 AM PDT Ravi Varadhan wrote:
>
>> Hi,
>>
>> Ramanujan supposedly discovered that the number, 163, has this interesting property that exp(sqrt(163)*pi), which is obviously a transcendental number, is real close to an integer (close to 10^(-12)).
>>
>> If I compute this using the Wolfram alpha engine, I get:
>> 262537412640768743.25007259719818568887935385...
>>
>> When I do this in R 3.1.1 (64-bit windows), I get:
>> 262537412640768256.
>>
>> The absolute error between the exact and R's value is 488, with a relative error of about 1.9x10^(-15).
>>
>> In order to replicate Wolfram Alpha, I tried doing this in "Rmpfr" but I am unable to get accurate results:
>>
>> library(Rmpfr)
>>
>>> exp(sqrt(163) * mpfr(pi, 120))
>> 1 'mpfr' number of precision 120 bits
>> [1] 262537412640767837.08771354274620169031
>>
>> The above answer is not only inaccurate, but it is actually worse than the answer using the usual double precision. Any thoughts as to what I am doing wrong?
>>
>> Thank you,
>> Ravi
>>
>> [[alternative HTML version deleted]]
>>
>> __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Ramanujan and the accuracy of floating point computations - using Rmpfr in R
Hi,
Ramanujan supposedly discovered that the number, 163, has this interesting property that exp(sqrt(163)*pi), which is obviously a transcendental number, is real close to an integer (close to 10^(-12)).

If I compute this using the Wolfram alpha engine, I get:
262537412640768743.25007259719818568887935385...

When I do this in R 3.1.1 (64-bit windows), I get:
262537412640768256.

The absolute error between the exact and R's value is 488, with a relative error of about 1.9x10^(-15).

In order to replicate Wolfram Alpha, I tried doing this in "Rmpfr" but I am unable to get accurate results:

library(Rmpfr)

> exp(sqrt(163) * mpfr(pi, 120))
1 'mpfr' number of precision 120 bits
[1] 262537412640767837.08771354274620169031

The above answer is not only inaccurate, but it is actually worse than the answer using the usual double precision. Any thoughts as to what I am doing wrong?
Thank you,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] multiple parameter optimization with optim()
Harold,
Obviously the bottleneck is your objective function fn(). I have sped up your function by a factor of about 2.4 by using `outer' instead of sapply. I think it can be sped up much more, but I couldn't figure that out without spending a lot of time. I am sure someone on this list-serv can come up with a cleverer way to program the objective function.

pl3 <- function(dat, Q, startVal = NULL, ...){
  # startVal holds both a and b, hence 2 * ncol(dat) parameters
  if(!is.null(startVal) && length(startVal) != 2 * ncol(dat)){
    stop("Length of argument startVal not equal to the number of parameters estimated")
  }
  if(!is.null(startVal)){
    startVal <- startVal
  } else {
    p <- colMeans(dat)
    startValA <- rep(1, ncol(dat))
    startValB <- as.vector(log((1 - p)/p))
    startVal <- c(startValA, startValB)
  }
  rr1 <- rr2 <- numeric(nrow(dat))
  qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma = 1)  # from the statmod package
  nds <- qq$nodes
  wts <- qq$weights
  Q <- length(qq$nodes)
  dat <- as.matrix(dat)
  fn <- function(params){
    a <- params[1:20]   # hard-coded for 20 items (ncol(dat) == 20)
    b <- params[21:40]
    for(j in 1:length(rr1)){
      rr1[j] <- sum(wts * exp(colSums(outer(dat[j, ], nds,
        function(x, y) dbinom(x, 1, 1/(1 + exp(-1.7 * a * (y - b))), log = TRUE)))))
    }
    -sum(log(rr1))
  }
  # opt <- optim(startVal, fn, method = "BFGS", hessian = TRUE)
  opt <- nlminb(startVal, fn, control = list(trace = 1, rel.tol = 1.e-06, iter.max = 50))
  # opt <- spg(startVal, fn)
  opt
  # list("coefficients" = opt$par, "LogLik" = -opt$value,
  #      "Std.Error" = sqrt(diag(solve(opt$hessian))))
}

Hope this is helpful,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unable to use `eval(parse(text))' in nlme::lme
Yes, this is a very important point. Thank you, Bill.
Best,
Ravi

From: William Dunlap
Sent: Friday, February 13, 2015 4:56 PM
To: Ravi Varadhan
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] Unable to use `eval(parse(text))' in nlme::lme

> ff <- reformulate(termlabels=c("time","as.factor(gvhd)"), response=yname, intercept=TRUE)

If the right hand side of the formula were more complicated than an additive list of terms, say '~ time * as.factor(gvhd)', or the left side were more than a name, say 'log(yname)', you could use bquote() instead of reformulate. E.g.,

> formulaTemplate <- log(.(responseName)) ~ time * as.factor(gvhd)
> lapply(c("y1","y2","y3"),
+        function(yname) do.call(bquote, list(formulaTemplate, list(responseName = as.name(yname)))))
[[1]]
log(y1) ~ time * as.factor(gvhd)

[[2]]
log(y2) ~ time * as.factor(gvhd)

[[3]]
log(y3) ~ time * as.factor(gvhd)

I used 'do.call' because bquote does not evaluate its first argument, but we need to evaluate the name 'formulaTemplate'. You could avoid that by putting the template verbatim in the call to bquote, as in

lapply(c("y1","y2","y3"),
       function(yname) bquote(log(.(responseName)) ~ time * as.factor(gvhd),
                              list(responseName = as.name(yname))))

I like the do.call method because I can bury it in a function and forget about it. bquote() retains the environment of the formula template.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Feb 9, 2015 at 6:44 AM, Ravi Varadhan wrote:

Thanks to Rolf, Duncan, and Ben. Ben, your suggestion worked (with a minor correction of concatenating the termlabels into a vector). Here is the solution for those interested.

ff <- reformulate(termlabels=c("time","as.factor(gvhd)"), response=yname, intercept=TRUE)
dd <- subset(labdata2, Transplant_type!=0 & time >0)
lme(ff, random=~1|Patient, data=dd, correlation=corAR1(), na.action=na.omit)

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor
Department of Oncology
Division of Biostatistics & Bioinformatics
Johns Hopkins University
550 N. Broadway
Baltimore, MD 21205
410-502-2619

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Unable to use `eval(parse(text))' in nlme::lme
Thanks to Rolf, Duncan, and Ben. Ben, your suggestion worked (with a minor correction of concatenating the termlabels into a vector). Here is the solution for those interested.

ff <- reformulate(termlabels=c("time","as.factor(gvhd)"), response=yname, intercept=TRUE)
dd <- subset(labdata2, Transplant_type!=0 & time >0)
lme(ff, random=~1|Patient, data=dd, correlation=corAR1(), na.action=na.omit)

Best,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor
Department of Oncology
Division of Biostatistics & Bioinformatics
Johns Hopkins University
550 N. Broadway
Baltimore, MD 21205
410-502-2619

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
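[A hedged sketch of wrapping the reformulate() solution so lme() can be looped over many responses. dd, Patient, time, and gvhd are the poster's objects; y1, y2, y3 are hypothetical response names.]

library(nlme)

fit_one <- function(yname, dat) {
  ff <- reformulate(c("time", "as.factor(gvhd)"), response = yname)
  lme(ff, random = ~ 1 | Patient, data = dat,
      correlation = corAR1(), na.action = na.omit)
}

# fits <- lapply(c("y1", "y2", "y3"), fit_one, dat = dd)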
[R] Unable to use `eval(parse(text))' in nlme::lme
Hi,
I would like to run lme() on a number of response variables in a dataframe in an automatic manner. But when I use eval(parse(text=yname)) to denote the LHS of the formula in lme(), I get the following error message:

> require(nlme)
> mod2 <- with(subset(labdata2, Transplant_type!=0 & time >0),
+   lme(eval(parse(text=yname)) ~ time + as.factor(gvhd), random = ~1|Patient,
+       correlation = corAR1(), method="ML", na.action=na.omit))
Error in model.frame.default(formula = ~Patient + yname + time + gvhd, :
  variable lengths differ (found for 'yname')

The same usage works well in lme4::lmer without any problems. It seems that there is a problem in how the formula object is evaluated in lme(). Is there an alternative way to do this?
Thank you,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Re-order levels of a categorical (factor) variable
Dear Bert, Martin, Bill,
Thank you very much for the help. Bert & Martin's solutions solved my problem. Would it be useful to add an example like this to the help page on ?factor or ?levels?
Thanks again, Ravi

From: Martin Maechler
Sent: Thursday, January 22, 2015 10:29 AM
To: Bert Gunter
Cc: William Dunlap; r-help@r-project.org; Ravi Varadhan
Subject: Re: [R] Re-order levels of a categorical (factor) variable

>>>>> Bert Gunter on Wed, 21 Jan 2015 18:52:12 -0800 writes:

> Bill/Ravi:
> I believe the problem is that the factor is automatically created when a data frame is created by read.table(). By default, the levels are lexicographically ordered. The following reproduces the problem and gives a solution.

>> library(lattice)
>> z <- data.frame(y = 1:9, x = rep(c("pre", "day2","day10")))
>> xyplot(y~x, data=z)  ## x axis order is day10, day2, pre
>> levels(z$x)
> [1] "day10" "day2"  "pre"
>> z$x <- factor(as.character(z$x), levels=c(levels(z$x)[3:1]))  ## explicitly defines level order
>> xyplot(y~x, data=z)  ## desired plot

Indeed, thank you, Bert, and using levels(.) <- * does *not* work... (as I first thought). However, slightly shorter and easier, and maybe even easier to remember, than

z$x <- factor(as.character(z$x), levels = c(levels(z$x)[3:1]))  ## def. level order

is

z$x <- factor(z$x, levels = levels(z$x)[3:1])  ## def. level order

Martin Maechler, ETH Zurich

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Re-order levels of a categorical (factor) variable
Hi,
I have a fairly elementary problem that I am unable to figure out. I have a continuous variable, Y, repeatedly measured at multiple times, T. The variable T, however, is coded as a factor variable having these levels: c("Presurgery", "Day 30", "Day 60", "Day 180", "Day 365"). When I plot the boxplot, e.g., boxplot(Y ~ T), it displays the boxes in this order: c("Day 180", "Day 30", "Day 365", "Day 60", "Presurgery"). Is there a way to control the order of the boxes such that they are plotted in a particular order that I want, for example: c("Presurgery", "Day 30", "Day 60", "Day 180", "Day 365")? More generally, is there a simple way to redefine the ordering of the categorical variable such that this ordering will be used in whatever operation is done? I looked at relevel, reorder, etc., but they did not seem to be applicable to my problem. Thanks for any help.
Best,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
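[A hedged, self-contained illustration of the fix suggested in the reply above, using made-up data in place of the poster's Y and T. The poster's variable name T is kept for continuity, although it shadows the TRUE shorthand.]

lev <- c("Presurgery", "Day 30", "Day 60", "Day 180", "Day 365")
set.seed(1)
T <- sample(lev, 100, replace = TRUE)   # character, so factor() sorts it
Y <- rnorm(100)

boxplot(Y ~ factor(T))                  # boxes in lexicographic order
boxplot(Y ~ factor(T, levels = lev))    # boxes in the desired order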
Re: [R] R 3.1.2 mle2() function on Windows 7 Error and multiple solutions
A more important point that I want to make is that I find few people taking advantage of the "comparative evaluation" or benchmarking ability of optimx. There is no "uniformly best" optimizer for all problems. Different ones turn out to perform better for different problems, and it is quite difficult to know a priori which one would be the best for "my" problem. Thus, the benchmarking capability provided by optimx is a powerful feature.
Ravi

From: Ben Bolker
Sent: Thursday, January 15, 2015 9:29 AM
To: Ravi Varadhan; R-Help
Cc: malqura...@ksu.edu.sa
Subject: Re: R 3.1.2 mle2() function on Windows 7 Error and multiple solutions

For what it's worth, you can use either nlminb (directly) or optimx within the mle2 wrapper by specifying the 'optimizer' parameter ... this gives you flexibility in optimization along with the convenience of mle2 (likelihood ratio tests via anova(), likelihood profiling, etc.)

On 15-01-15 09:26 AM, Ravi Varadhan wrote:
> Hi,
>
> I tried your problem with the optimx package. I found a better solution than that found by mle2.
>
> library(optimx)
>
> # the objective function needs to be re-written
> LL2 <- function(par, y) {
>   lambda <- par[1]
>   alpha <- par[2]
>   beta <- par[3]
>   R <- Nweibull(y, lambda, alpha, beta)
>   -sum(log(R))
> }
>
> optimx(fn=LL2, par=c(.01, 325, .8), y=y, lower=c(.1, .1, .1),
>        upper=c(Inf, Inf, Inf), control=list(all.methods=TRUE))
>
> # Look at the solution found by `nlminb' and `nmkb'. This is the optimal one.
> # This log-likelihood is larger than that of mle2 and other optimizers in optimx.
>
> If this solution is not what you are looking for, your problem may be poorly scaled. First, make sure that the likelihood is coded correctly. If it is correct, then you may need to improve the scaling of the problem.
>
> Hope this is helpful,
> Ravi

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
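[A self-contained sketch of the benchmarking feature praised above; the Rosenbrock function here is a stand-in for the poster's likelihood, not code from the thread.]

library(optimx)

fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2   # Rosenbrock test function

# one call runs every available optimizer and tabulates the results,
# making it easy to compare solutions, function counts, and timings
optimx(par = c(-1.2, 1), fn = fr, control = list(all.methods = TRUE))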
Re: [R] R 3.1.2 mle2() function on Windows 7 Error and multiple solutions
Hi,
I tried your problem with the optimx package. I found a better solution than that found by mle2.

library(optimx)

# the objective function needs to be re-written
LL2 <- function(par, y) {
  lambda <- par[1]
  alpha <- par[2]
  beta <- par[3]
  R <- Nweibull(y, lambda, alpha, beta)
  -sum(log(R))
}

optimx(fn=LL2, par=c(.01, 325, .8), y=y, lower=c(.1, .1, .1),
       upper=c(Inf, Inf, Inf), control=list(all.methods=TRUE))

# Look at the solution found by `nlminb' and `nmkb'. This is the optimal one.
# This log-likelihood is larger than that of mle2 and other optimizers in optimx.

If this solution is not what you are looking for, your problem may be poorly scaled. First, make sure that the likelihood is coded correctly. If it is correct, then you may need to improve the scaling of the problem.

Hope this is helpful,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] short-sale constraint with nonpositive-definite matrix in portfolio optimization
Hi,
You can try a projected gradient approach, which is implemented in the spg() function in the "BB" package. You have to provide a projection function that takes an infeasible matrix as input and returns a feasible (i.e., positive-definite) matrix as output. This kind of matrix projection is quite easy to do, and in fact there are functions in R to do it (e.g., see the posdefify() function in the "sfsmisc" package). There is a recent review article on the spectral projected gradient algorithm in the Journal of Statistical Software: http://www.jstatsoft.org/v60/i03

Hope this is helpful,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
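[A hedged sketch of spg() with a user-supplied projection, assuming the project function is called with the parameter vector alone. Instead of projecting covariance matrices as described above, this toy problem projects portfolio weights onto the simplex {w >= 0, sum(w) = 1}; mu and S are made-up inputs.]

library(BB)

proj_simplex <- function(w) {
  # Euclidean projection onto the probability simplex (sort-based algorithm)
  u <- sort(w, decreasing = TRUE)
  css <- cumsum(u)
  rho <- max(which(u + (1 - css) / seq_along(u) > 0))
  pmax(w + (1 - css[rho]) / rho, 0)
}

mu <- c(0.05, 0.08, 0.12)        # hypothetical expected returns
S <- diag(c(0.02, 0.05, 0.10))   # hypothetical covariance matrix
fn <- function(w) as.numeric(t(w) %*% S %*% w - 0.5 * sum(mu * w))

ans <- spg(par = rep(1/3, 3), fn = fn, project = proj_simplex)
ans$par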
[R] Checking modeling assumptions in a binomial GLMM
Dear All, I am fitting a model for a binary response variable measured repeatedly at multiple visits. I am using the binomial GLMM using the glmer() function in lme4 package. How can I evaluate the model assumptions (e.g., residual diagnostics, adequacy of random effects distribution) for a binomial GLMM? Are there any standard checks that are commonly done? Are there any pedagogical examples or data sets where model assumptions have been examined for binomial GLMMs? Any suggestions/guidance is appreciated. Thank you, Ravi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
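[No reply is archived here; a hedged sketch of one common starting point for such checks, using lme4's built-in cbpp data as a stand-in for the poster's repeated binary outcomes.]

library(lme4)

gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)

# residual diagnostics: look for trend or fanning in Pearson residuals
plot(fitted(gm), residuals(gm, type = "pearson"),
     xlab = "Fitted probabilities", ylab = "Pearson residuals")

# rough check of the normality assumption for the random intercepts
qqnorm(unlist(ranef(gm)$herd))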
Re: [R] Box-cox transformation
Thank you. It is very helpful.
Ravi

-----Original Message-----
From: Joshua Wiley [mailto:jwiley.ps...@gmail.com]
Sent: Monday, July 07, 2014 4:15 PM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: [R] Box-cox transformation

Dear Ravi,

In my previous example, I used the residuals, so:

sum[ (r_i / scaling)^2 ]

If you want to use the deviance from glm, that gives you:

sum[ r_i^2 ]

and since the scaling factor is just a constant for any given lambda, the modification would be:

sum[ r_i^2 ] / ( scaling^2 )

and is given in the modified code below (posted back to R-help in case anyone else has this question).

Hope this helps,
Josh

##
require(MASS)
myp <- function(y, lambda) (y^lambda - 1)/lambda

lambda <- seq(-0.05, 0.45, len = 20)
N <- nrow(quine)
res <- matrix(numeric(0), nrow = length(lambda), 2,
              dimnames = list(NULL, c("Lambda", "LL")))

# scaling constant: the geometric mean of (Days + 1), which enters the
# profile log-likelihood through the Jacobian of the Box-Cox transformation
C <- exp(mean(log(quine$Days + 1)))

for(i in seq_along(lambda)) {
  SS <- deviance(glm(myp(Days + 1, lambda[i]) ~ Eth*Sex*Age*Lrn, data = quine))
  LL <- (- (N/2) * log(SS/((C^lambda[i])^2)))
  res[i, ] <- c(lambda[i], LL)
}

# box cox
boxcox(Days+1 ~ Eth*Sex*Age*Lrn, data = quine, lambda = lambda)
# add our points on top to verify match
points(res[, 1], res[, 2], pch = 16)
##

On Mon, Jul 7, 2014 at 11:57 PM, Ravi Varadhan wrote:
> Dear Josh,
> Thank you very much. I knew that the scaling had to be adjusted, but was not sure how to do this.
>
> Can you please show me how to do this scaling with `glm'? In other words, how would I scale the deviance from glm?
>
> Thanks,
> Ravi
>
> -----Original Message-----
> From: Joshua Wiley [mailto:jwiley.ps...@gmail.com]
> Sent: Sunday, July 06, 2014 11:34 PM
> To: Ravi Varadhan
> Cc: r-help@r-project.org
> Subject: Re: [R] Box-cox transformation
>
> Hi Ravi,
>
> Deviance is the SS in this case, but you need a normalizing constant adjusted by the lambda to put them on the same scale. I modified your example below to simplify slightly and use the normalization (see the LL line).
>
> Cheers,
> Josh
>
> ##
> require(MASS)
> myp <- function(y, lambda) (y^lambda - 1)/lambda
>
> lambda <- seq(-0.05, 0.45, len = 20)
> N <- nrow(quine)
> res <- matrix(numeric(0), nrow = length(lambda), 2,
>               dimnames = list(NULL, c("Lambda", "LL")))
>
> # scaling constant
> C <- exp(mean(log(quine$Days + 1)))
>
> for(i in seq_along(lambda)) {
>   r <- resid(lm(myp(Days + 1, lambda[i]) ~ Eth*Sex*Age*Lrn, data = quine))
>   LL <- (- (N/2) * log(sum((r/(C^lambda[i]))^2)))
>   res[i, ] <- c(lambda[i], LL)
> }
>
> # box cox
> boxcox(Days+1 ~ Eth*Sex*Age*Lrn, data = quine, lambda = lambda)
> # add our points on top to verify match
> points(res[, 1], res[, 2], pch = 16)
> ##
>
> On Mon, Jul 7, 2014 at 11:33 AM, Ravi Varadhan wrote:
>> Hi,
>>
>> I am trying to do a Box-Cox transformation, but I am not sure how to do it correctly. Here is an example showing what I am trying:
>>
>> # example from MASS
>> require(MASS)
>> boxcox(Days+1 ~ Eth*Sex*Age*Lrn, data = quine,
>>        lambda = seq(-0.05, 0.45, len = 20))
>>
>> # Here is my attempt at getting the profile likelihood for the Box-Cox parameter
>> lam <- seq(-0.05, 0.45, len = 20)
>> dev <- rep(NA, length=20)
>>
>> for (i in 1:20) {
>>   a <- lam[i]
>>   ans <- glm(((Days+1)^a - 1)/a ~ Eth*Sex*Age*Lrn, family=gaussian, data = quine)
>>   dev[i] <- ans$deviance
>> }
>>
>> plot(lam, dev, type="b", xlab="lambda", ylab="deviance")
>>
>> I am trying to create the profile likelihood for the Box-Cox parameter, but obviously I am not getting it right. I am not sure that ans$deviance is the right thing to do.
>>
>> I would appreciate any guidance.
>>
>> Thanks & Best,
>> Ravi
>>
>> [[alternative HTML version deleted]]
>>
>> __ R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Joshua F. Wiley
> Ph.D. Student, UCLA Department of Psychology http://joshuawiley.com/
> Senior Analyst, Elkhart Group Ltd.
> http://elkhartgroup.com
> Office: 260.673.5518

--
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Box-cox transformation
Hi,
I am trying to do a Box-Cox transformation, but I am not sure how to do it correctly. Here is an example showing what I am trying:

# example from MASS
require(MASS)
boxcox(Days+1 ~ Eth*Sex*Age*Lrn, data = quine,
       lambda = seq(-0.05, 0.45, len = 20))

# Here is my attempt at getting the profile likelihood for the Box-Cox parameter
lam <- seq(-0.05, 0.45, len = 20)
dev <- rep(NA, length=20)

for (i in 1:20) {
  a <- lam[i]
  ans <- glm(((Days+1)^a - 1)/a ~ Eth*Sex*Age*Lrn, family=gaussian, data = quine)
  dev[i] <- ans$deviance
}

plot(lam, dev, type="b", xlab="lambda", ylab="deviance")

I am trying to create the profile likelihood for the Box-Cox parameter, but obviously I am not getting it right. I am not sure that ans$deviance is the right thing to do.
I would appreciate any guidance.
Thanks & Best,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A combinatorial assignment problem
Thank you, Dan and Bert.

Bert - Your approach provides a solution. However, it has the undesired property of lumping referees together (I apologize that I did not state this as a condition). In other words, it does not "mix" the referees in some random fashion.

Dan - your approach attempts to have the desired properties, but is not guaranteed to work. Here is a counterexample:

> set.seed(1234)
> a <- assignment(40,15,3)
> table(a)
a
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
10  7 12  7  4 10  8  6  8 13  7  7 11  3  7

Notice that the difference between the maximum and minimum number of candidates per referee is 13 - 3 = 10. Of course, I could increase the number of iterations to get a better solution, but for large K and R this may not converge at all.

Best regards,
Ravi

From: Ravi Varadhan
Sent: Wednesday, April 30, 2014 1:49 PM
To: r-help@r-project.org
Subject: A combinatorial assignment problem

Hi,
I have this problem: K candidates apply for a job. There are R referees available to review their resumes and provide feedback. Suppose that we would like M referees to review each candidate (M < R). How would I assign candidates to referees (or, conversely, referees to candidates)? There are two important cases: (a) K > (R choose M) and (b) K < (R choose M).

Case (a) actually reduces to case (b), so we only have to consider case (b). Without any other constraints, the problem is quite easy to solve. Here is an example that shows this.

require(gtools)
set.seed(12345)
K <- 10   # number of candidates
R <- 7    # number of referees
M <- 3    # overlap, number of referees reviewing each candidate

allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
assignment

> assignment
      [,1] [,2] [,3]
 [1,]    3    4    5
 [2,]    3    5    7
 [3,]    5    6    7
 [4,]    3    5    6
 [5,]    1    6    7
 [6,]    1    2    7
 [7,]    1    4    5
 [8,]    3    6    7
 [9,]    2    4    5
[10,]    4    5    7

Here each row stands for a candidate and the columns give the referees who review that candidate. Of course, there are some constraints that make the assignment challenging. We would like to have an even distribution of the number of candidates reviewed by each referee. For example, the above code produces an assignment where referee #2 gets only 2 candidates and referee #5 gets 7 candidates. We would like to avoid such uneven assignments.

> table(assignment)
assignment
1 2 3 4 5 6 7
3 2 4 4 7 4 6

Note that in this example there are 35 possible triplets of referees and 10 candidates. Therefore, a perfectly even assignment is not possible. I tried some obvious, greedy search methods but they are not guaranteed to work. Any hints or suggestions would be greatly appreciated.
Best,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] A combinatorial assignment problem
Thanks, Bert. I have written this simple code, which is crude but seems to do a decent job. It works perfectly when M is a factor of R. Otherwise, it gives decent balance (of course, balance is not guaranteed). I guess it is possible to take the results, which are somewhat unbalanced, and then reshuffle them semi-randomly to get better balance. Any improvements are appreciated!

assign <- function(K, R, M, iseed=1234) {   # note: this shadows base::assign()
  assignment <- matrix(NA, K, M)
  N <- R %/% M
  Nrem <- R %% M
  iseq <- seq(1, K, N)
  for (i in iseq){
    size <- ifelse(K-i >= N, R-Nrem, M*(K-i+1))
    sel <- sample(1:R, size=size, replace=FALSE)
    end <- min((i+N-1), K)
    assignment[i:end, ] <- sel
  }
  assignment
}

sol <- assign(40, 16, 3)
table(sol)

Thanks,
Ravi

-----Original Message-----
From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Thursday, May 01, 2014 10:46 AM
To: Ravi Varadhan
Cc: r-help@r-project.org; djnordl...@frontier.com
Subject: Re: A combinatorial assignment problem

Ravi:

You cannot simultaneously have balance and guarantee random mixing. That is, you would need to specify precisely what you mean by balance and random mixing in this context, as these terms are now subjective and undefined. You could, of course, randomize the initial assignment of referees to positions, and note also that some mixing of referee groups would occur if m does not divide r evenly. More explicitly, note that a very fast way to generate the groups I described is:

rmk <- function(nrefs, size, ncands){
  n <- ncands * size
  matrix(rep(seq_len(nrefs), n %/% nrefs + 1)[seq_len(n)], nrow = ncands, byrow = TRUE)
}
## corner case checks and adjustments need to be made to this, of course.

> rmk(7,3,10)
      [,1] [,2] [,3]
 [1,]    1    2    3
 [2,]    4    5    6
 [3,]    7    1    2
 [4,]    3    4    5
 [5,]    6    7    1
 [6,]    2    3    4
 [7,]    5    6    7
 [8,]    1    2    3
 [9,]    4    5    6
[10,]    7    1    2

You could then modify this by randomly reordering the referees every time a complete cycle of the groupings occurred, i.e. after each nrefs/gcd(nrefs,size) candidates = rows. This would give variable groups and even assignment. This algorithm could be further fiddled with by choosing nrefs and size to assure they are relatively prime and then adding and removing further refs randomly during the cycling, etc.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom." H. Gilbert Welch

On Thu, May 1, 2014 at 6:09 AM, Ravi Varadhan wrote:
> Thank you, Dan and Bert.
>
> Bert - Your approach provides a solution. However, it has the undesired property of lumping referees together (I apologize that I did not state this as a condition). In other words, it does not "mix" the referees in some random fashion.
>
> Dan - your approach attempts to have the desired properties, but is not guaranteed to work. Here is a counterexample:
>
>> set.seed(1234)
>> a <- assignment(40,15,3)
>> table(a)
> a
>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
> 10  7 12  7  4 10  8  6  8 13  7  7 11  3  7
>
> Notice that the difference between the maximum and minimum number of candidates per referee is 13 - 3 = 10. Of course, I could increase the number of iterations to get a better solution, but for large K and R this may not converge at all.
>
> Best regards,
> Ravi
>
> From: Ravi Varadhan
> Sent: Wednesday, April 30, 2014 1:49 PM
> To: r-help@r-project.org
> Subject: A combinatorial assignment problem
>
> Hi,
>
> I have this problem: K candidates apply for a job. There are R referees available to review their resumes and provide feedback. Suppose that we would like M referees to review each candidate (M < R). How would I assign candidates to referees (or, conversely, referees to candidates)? There are two important cases: (a) K > (R choose M) and (b) K < (R choose M).
>
> Case (a) actually reduces to case (b), so we only have to consider case (b). Without any other constraints, the problem is quite easy to solve. Here is an example that shows this.
>
> require(gtools)
> set.seed(12345)
> K <- 10   # number of candidates
> R <- 7    # number of referees
> M <- 3    # overlap, number of referees reviewing each candidate
>
> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
> assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
> assignment
>
>> assignment
>       [,1] [,2] [,3]
> [1,
[R] A combinatorial assignment problem
Hi,
I have this problem: K candidates apply for a job. There are R referees available to review their resumes and provide feedback. Suppose that we would like M referees to review each candidate (M < R). How would I assign candidates to referees (or, conversely, referees to candidates)? There are two important cases: (a) K > (R choose M) and (b) K < (R choose M).

Case (a) actually reduces to case (b), so we only have to consider case (b). Without any other constraints, the problem is quite easy to solve. Here is an example that shows this.

require(gtools)
set.seed(12345)
K <- 10   # number of candidates
R <- 7    # number of referees
M <- 3    # overlap, number of referees reviewing each candidate

allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
assignment

> assignment
      [,1] [,2] [,3]
 [1,]    3    4    5
 [2,]    3    5    7
 [3,]    5    6    7
 [4,]    3    5    6
 [5,]    1    6    7
 [6,]    1    2    7
 [7,]    1    4    5
 [8,]    3    6    7
 [9,]    2    4    5
[10,]    4    5    7

Here each row stands for a candidate and the columns give the referees who review that candidate. Of course, there are some constraints that make the assignment challenging. We would like to have an even distribution of the number of candidates reviewed by each referee. For example, the above code produces an assignment where referee #2 gets only 2 candidates and referee #5 gets 7 candidates. We would like to avoid such uneven assignments.

> table(assignment)
assignment
1 2 3 4 5 6 7
3 2 4 4 7 4 6

Note that in this example there are 35 possible triplets of referees and 10 candidates. Therefore, a perfectly even assignment is not possible. I tried some obvious, greedy search methods but they are not guaranteed to work. Any hints or suggestions would be greatly appreciated.
Best,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
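[A hedged sketch of one way to combine balance with mixing, a variation on Bert's cycling idea from the follow-ups rather than code from the thread itself: fill the K*M review slots from repeated complete random shufflings of the referee pool, so that referee loads differ by at most one.]

assign_balanced <- function(K, R, M, seed = 1234) {
  set.seed(seed)
  slots <- K * M
  # concatenate as many full random permutations of 1:R as needed
  pool <- unlist(lapply(seq_len(ceiling(slots / R)), function(i) sample(R)))
  matrix(pool[seq_len(slots)], nrow = K, ncol = M, byrow = TRUE)
}

# Caveat: when M does not divide R, a candidate's row can straddle two
# shuffles and may then contain a duplicated referee; a repair step would
# be needed in that case.
a <- assign_balanced(40, 15, 3)
table(a)   # every referee gets exactly 120/15 = 8 candidates here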
[R] A couple of issues with lattice::bwplot
Hi All,
I am using lattice::bwplot to plot the results of a subgroup analysis for different simulation scenarios. I have 4 subgroups, 5 methods of estimating treatment effect in subgroups, and 4 scenarios. I am plotting the subgroup effect and its upper and lower CI. I have 2 issues that I am unable to resolve:

1. How can I control the order in which the panels are plotted? For example, I would like the bottom and top rows to be switched, and the 2nd and 3rd rows to be switched.

2. How can I plot a point within each panel at a fixed coordinate?

My code is as follows:

library(lattice)
trellis.par.set("box.rectangle", list(lty=1, lwd=2))
trellis.par.set("box.umbrella", list(lty=1, lwd=2))

n.param <- 16
est.method <- c("Bayes-DS", "EB", "MLE-DS", "Naive", "Stand")
est.type <- c("Point", "LCL", "UCL")
est.names <- c("Int.", paste0("X", 2:n.param))   # (overwritten on the next line)
est.names <- c(outer(c("X0=0", "X0=1", "X1=0", "X1=1"),
                     paste0(", scenario", 1:4), function(x,y) paste0(x, y)))
n.methods <- length(est.method)
n.types <- length(est.type)

estimates <- data.frame(
  Tx.effect = sort(rnorm(n.param*n.methods*n.types, mean=0.5, sd=0.2)),
  method = rep(est.method, each=n.param*n.types),
  name = rep(est.names, times=n.types*n.methods),
  type = rep(est.type, times=n.param*n.methods))

truth <- runif(16)   # I would like to plot these points, one per panel

bwplot(method ~ Tx.effect | name, data=estimates, box.width=0, coef=0, pch=22)

I would greatly appreciate any help.
Thank you,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
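[Both questions go unanswered in this archive. A hedged sketch of the usual lattice approach, relying only on the objects defined above: index.cond permutes the order in which the levels of `name' are laid out (reversing the four rows of a 4 x 4 layout swaps top with bottom and the two middle rows), and a custom panel function draws one extra point per panel; packet.number() indexes the panel's underlying level combination, so `truth' stays correctly matched even after reordering.]

bwplot(method ~ Tx.effect | name, data = estimates,
       box.width = 0, coef = 0, pch = 22, layout = c(4, 4),
       index.cond = list(c(13:16, 9:12, 5:8, 1:4)),   # reverse the 4 rows
       panel = function(x, y, ...) {
         panel.bwplot(x, y, ...)
         # the y-coordinate 3 (the 3rd method) is arbitrary, for illustration
         panel.points(truth[packet.number()], 3, pch = 17, col = "red")
       })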
[R] Unable to Install a package from source in Windows
Hi,
I am using the following R version:

> version
               _
platform       i386-w64-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          3
minor          0.1
year           2013
month          05
day            16
svn rev        62743
language       R
version.string R version 3.0.1 (2013-05-16)
nickname       Good Sport

I was able to build the source of a package without any errors or warnings. However, when I try to install the package, I get the following error messages. Can someone point me to what I am doing wrong?
Thanks in advance,
Ravi

> install.packages("H:/Documents/computations/BB_2014.01-1.tar.gz", repos=NULL, type="source")
Installing package into '\\homer.win.ad.jhu.edu/users$/rvaradh1/Documents/R/win-library/3.0'
(as 'lib' is unspecified)
'\\homer.win.ad.jhu.edu\users$\rvaradh1\Documents'
CMD.EXE was started with the above path as the current directory.
UNC paths are not supported. Defaulting to Windows directory.
* installing *source* package 'BB' ...
** R
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
*** arch - i386
Warning in library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = TRUE) :
  there is no package called 'BB'
Error: loading failed
Execution halted
*** arch - x64
Warning in library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = TRUE) :
  there is no package called 'BB'
Error: loading failed
Execution halted
ERROR: loading failed for 'i386', 'x64'
* removing '\\homer.win.ad.jhu.edu/users$/rvaradh1/Documents/R/win-library/3.0/BB'
* restoring previous '\\homer.win.ad.jhu.edu/users$/rvaradh1/Documents/R/win-library/3.0/BB'
Warning messages:
1: running command '"C:/PROGRA~1/R/R-30~1.1/bin/i386/R" CMD INSTALL -l "\\homer.win.ad.jhu.edu\users$\rvaradh1\Documents\R\win-library\3.0" "H:/Documents/computations/BB_2014.01-1.tar.gz"' had status 1
2: In install.packages("H:/Documents/computations/BB_2014.01-1.tar.gz", :
  installation of package 'H:/Documents/computations/BB_2014.01-1.tar.gz' had non-zero exit status

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
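[No reply is archived here. The failure appears to stem from the default library living on a UNC network share, which cmd.exe cannot use as a working directory ("UNC paths are not supported" above). A hedged workaround is to install into, and load from, a library on a local drive; the path below is hypothetical.]

dir.create("C:/Rlibs", showWarnings = FALSE)   # hypothetical local library
install.packages("H:/Documents/computations/BB_2014.01-1.tar.gz",
                 repos = NULL, type = "source", lib = "C:/Rlibs")
library(BB, lib.loc = "C:/Rlibs")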
Re: [R] convergence=0 in optim and nlminb is real?
The optimization algorithms did converge to a limit point, but not to a stationary point, i.e., a point in parameter space where the first- and second-order KKT conditions are satisfied. If you check the gradient at the solution, you will see that it is quite large in magnitude relative to 0. So, why did the algorithms declare convergence? Convergence is based on the absolute change in function value and/or the relative change in parameter values between consecutive iterations; this does not ensure that the KKT conditions are satisfied.

Now, to the real issue: your problem is ill-posed. As you can tell from the eigenvalues of the Hessian, they vary over 9 orders of magnitude. This may indicate a problem with the data, in that the log-likelihood is over-parametrized relative to the information in the data set. Get a better data set or formulate a simpler model, and the problem will disappear.
Best,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
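[A hedged sketch of the check described above; `fn' (the negative log-likelihood) and `fit' (the returned optim/nlminb object) are placeholders for the poster's objects.]

library(numDeriv)

g <- grad(fn, fit$par)        # first-order KKT condition: should be near 0
H <- hessian(fn, fit$par)

max(abs(g))                                   # large magnitude => not stationary
range(eigen(H, only.values = TRUE)$values)    # wide spread => ill-conditioning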
Re: [R] [optim/bbmle] function returns NA
Carlos,
There are likely several problems with your likelihood. You should check it carefully before you do any optimization. It seems to me that you have box constraints on the parameters. The way you are enforcing them is not correct. I would prefer to use an optimization algorithm that can handle box constraints rather than an artificial mechanism such as what you are trying to do:

if (r<=0 | alpha<=0 | s<=0 | beta<=0) return(NaN)

Even here, a better approach would be:

if (r<=0 | alpha<=0 | s<=0 | beta<=0) return(-.Machine$double.xmax)

I also notice that you do not use `tx' in your `g' function. There are likely a number of other issues as well. You may try the nmkb() function in the "dfoptim" package. It can handle box constraints and typically tends to perform a bit better than optim's Nelder-Mead for unconstrained problems.
Hope this is helpful,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
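[A self-contained illustration of handling box constraints directly with dfoptim::nmkb, instead of returning NaN; the objective here is a made-up smooth function, not Carlos's likelihood.]

library(dfoptim)

fn <- function(p) sum((p - c(0.5, 2, 1, 3))^2)   # hypothetical objective

# note: nmkb requires a starting point strictly inside the bounds
ans <- nmkb(par = c(1, 1, 1, 1), fn = fn,
            lower = rep(1e-8, 4), upper = rep(10, 4))
ans$par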
Re: [R] GARCH Optimization Problems
Hi Tony,
Did you try the `auglag' algorithm in the "alabama" package?
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] While using R CMD check: LaTex error: File `inconsolata.sty' not found
Hi,
While using R CMD check I get the following LaTeX error message, which occurs when creating the PDF version of the manual:

LaTeX error: File `inconsolata.sty' not found

I am using Windows 7 (64-bit) and R 3.0.1. I have MiKTeX 2.9. I see that inconsolata.sty is present under the \doc\fonts folder. How can I eliminate this problem?
Thanks,
Ravi

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vectorized code for generating the Kac (Clement) matrix
Thank you, Berend and Enrico, for looking into this. I did not think of Enrico's clever use of cbind() to form the subsetting indices.
Best,
Ravi

From: Berend Hasselman [b...@xs4all.nl]
Sent: Friday, April 26, 2013 10:08 AM
To: Enrico Schumann
Cc: Ravi Varadhan; 'r-help@r-project.org'
Subject: Re: [R] Vectorized code for generating the Kac (Clement) matrix

On 26-04-2013, at 14:42, Enrico Schumann wrote:

> On Thu, 25 Apr 2013, Ravi Varadhan writes:
>
>> Hi, I am generating large Kac matrices (also known as Clement matrices). This is a tridiagonal matrix. I was wondering whether there is a vectorized solution that avoids the `for' loops in the following code:
>>
>> n <- 1000
>> Kacmat <- matrix(0, n+1, n+1)
>> for (i in 1:n) Kacmat[i, i+1] <- n - i + 1
>> for (i in 2:(n+1)) Kacmat[i, i-1] <- i-1
>>
>> The above code is fast, but I am curious about vectorized ways to do this.
>> Thanks in advance.
>> Best,
>> Ravi
>
> This may be a bit faster; but as Berend and you said, the original function seems already fast.
>
> n <- 5000
>
> f1 <- function(n) {
>   Kacmat <- matrix(0, n+1, n+1)
>   for (i in 1:n) Kacmat[i, i+1] <- n - i + 1
>   for (i in 2:(n+1)) Kacmat[i, i-1] <- i-1
>   Kacmat
> }
> f3 <- function(n) {
>   n1 <- n + 1L
>   res <- numeric(n1 * n1)
>   dim(res) <- c(n1, n1)
>   bw <- n:1L  ## bw = backward, fw = forward
>   fw <- seq_len(n)
>   res[cbind(fw, fw + 1L)] <- bw
>   res[cbind(fw + 1L, fw)] <- fw
>   res
> }
>
> system.time(K1 <- f1(n))
> system.time(K3 <- f3(n))
> identical(K3, K1)
>
> ##    user  system elapsed
> ##   0.132   0.028   0.161
> ##
> ##    user  system elapsed
> ##   0.024   0.048   0.071
> ##

Using some of your code in my function I was able to speed up my function f2. Complete code:

f1 <- function(n) { # Ravi
  Kacmat <- matrix(0, n+1, n+1)
  for (i in 1:n) Kacmat[i, i+1] <- n - i + 1
  for (i in 1:n) Kacmat[i+1, i] <- i
  Kacmat
}

f2 <- function(n) { # Berend 1, modified to use 1L
  Kacmat <- matrix(0, n+1, n+1)
  Kacmat[row(Kacmat)==col(Kacmat)-1L] <- n:1L
  Kacmat[row(Kacmat)==col(Kacmat)+1L] <- 1L:n
  Kacmat
}

f3 <- function(n) { # Enrico
  n1 <- n + 1L
  res <- numeric(n1 * n1)
  dim(res) <- c(n1, n1)
  bw <- n:1L  ## bw = backward, fw = forward
  fw <- seq_len(n)
  res[cbind(fw, fw + 1L)] <- bw
  res[cbind(fw + 1L, fw)] <- fw
  res
}

f4 <- function(n) { # Berend 2, using which with arr.ind=TRUE
  Kacmat <- matrix(0, n+1, n+1)
  k1 <- which(row(Kacmat)==col(Kacmat)-1L, arr.ind=TRUE)
  k2 <- which(row(Kacmat)==col(Kacmat)+1L, arr.ind=TRUE)
  Kacmat[k1] <- n:1L
  Kacmat[k2] <- 1L:n
  Kacmat
}

library(compiler)
f1.c <- cmpfun(f1)
f2.c <- cmpfun(f2)
f3.c <- cmpfun(f3)
f4.c <- cmpfun(f4)

f1(n)
f2(n)

n <- 5000
system.time(K1 <- f1(n))
system.time(K2 <- f2(n))
system.time(K3 <- f3(n))
system.time(K4 <- f4(n))
system.time(K1c <- f1.c(n))
system.time(K2c <- f2.c(n))
system.time(K3c <- f3.c(n))
system.time(K4c <- f4.c(n))
identical(K2,K1)
identical(K3,K1)
identical(K4,K1)
identical(K1c,K1)
identical(K2c,K2)
identical(K3c,K3)
identical(K4c,K4)

Result:

# > system.time(K1 <- f1(n))
#    user  system elapsed
#   0.387   0.120   0.511
# > system.time(K2 <- f2(n))
#    user  system elapsed
#   3.541   0.702   4.250
# > system.time(K3 <- f3(n))
#    user  system elapsed
#   0.108   0.089   0.199
# > system.time(K4 <- f4(n))
#    user  system elapsed
#   1.975   0.355   2.336
# > system.time(K1c <- f1.c(n))
#    user  system elapsed
#   0.323   0.120   0.445
# > system.time(K2c <- f2.c(n))
#    user  system elapsed
#   3.374   0.422   3.807
# > system.time(K3c <- f3.c(n))
#    user  system elapsed
#   0.107   0.098   0.205
# > system.time(K4c <- f4.c(n))
#    user  system elapsed
#   1.816   0.384   2.203
# > identical(K2,K1)
# [1] TRUE
# > identical(K3,K1)
# [1] TRUE
# > identical(K4,K1)
# [1] TRUE
# > identical(K1c,K1)
# [1] TRUE
# > identical(K2c,K2)
# [1] TRUE
# > identical(K3c,K3)
# [1] TRUE
# > identical(K4c,K4)
# [1] TRUE

So Ravi's original and Enrico's versions are the quickest. Using which with arr.ind made my version run a lot quicker. All in all an interesting exercise.

Berend

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Vectorized code for generating the Kac (Clement) matrix
Hi,
I am generating large Kac matrices (also known as Clement matrices). This is a tridiagonal matrix. I was wondering whether there is a vectorized solution that avoids the `for' loops in the following code:

n <- 1000
Kacmat <- matrix(0, n+1, n+1)
for (i in 1:n) Kacmat[i, i+1] <- n - i + 1
for (i in 2:(n+1)) Kacmat[i, i-1] <- i-1

The above code is fast, but I am curious about vectorized ways to do this.
Thanks in advance.
Best,
Ravi

Ravi Varadhan, Ph.D.
Assistant Professor
The Center on Aging and Health
Division of Geriatric Medicine & Gerontology
Johns Hopkins University
rvarad...@jhmi.edu
410-502-2619

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Crrstep help
Kathy,
You need to make sure that you have installed the latest version of the `cmprsk' package (version 2.2-4). crrstep will not work otherwise.
Ravi

From: Ravi Varadhan
Sent: Thursday, March 28, 2013 5:25 PM
To: 'kathyhan...@gmail.com'
Cc: r-help@r-project.org
Subject: Re: [R] Crrstep help

Hi Kathy,
You should first contact the package maintainer (which is me!) before posting to r-help. Which version of the package are you using? Can you send me a minimal reproducible example?
Ravi

Ravi Varadhan, Ph.D.
Assistant Professor
The Center on Aging and Health
Division of Geriatric Medicine & Gerontology
Johns Hopkins University
rvarad...@jhmi.edu
410-502-2619

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Crrstep help
Hi Kathy, You should first contact the package maintainer (which is me!) before posting to r-help. Which version of the package are you using? Can you send me a minimal, reproducible example? Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Optimization in R (alabama)
Berend, I had pointed out this solution to Axel already, but he apparently wants to solve more general problems like this, so he wanted a solution from constrained optimizers. But, if the more general problems have a similarly discrete set of candidate parameter values, using constrained optimizers like alabama would still NOT be a good approach. It would be easiest to just go through the discrete values. Ravi From: Berend Hasselman [b...@xs4all.nl] Sent: Monday, February 11, 2013 12:45 AM To: Axel Urbiz Cc: R-help@r-project.org; Ravi Varadhan Subject: Re: [R] Constrained Optimization in R (alabama) On 10-02-2013, at 21:16, Axel Urbiz wrote: > Dear List, > > I'm trying to solve this simple optimization problem in R. The parameters > are the exponents to the matrix mm. The constraints specify that each row > of the parameter matrix should sum to 1 and their product to 0. I don't > understand why the constraints are not satisfied at the solution. I must be > misinterpreting how to specify the constrains somehow. > > > library(alabama) > > ff <- function (x) { > mm <- matrix(c(10, 25, 5, 10), 2, 2) > matx <- matrix(x, 2, 2) > -sum(apply(mm ^ matx, 1, prod)) > } > > > ### constraints > > heq <- function(x) { > h <- rep(NA, 1) > h[1] <- x[1] + x[3] -1 > h[2] <- x[2] + x[4] -1 > h[3] <- x[1] * x[3] > h[4] <- x[2] * x[4] > h > } > > res <- constrOptim.nl(par = c(1, 1, 1, 1), fn = ff, > heq = heq) > res$convergence #why NULL? > matrix(round(res$par, 2), 2, 2) #why constraints are not satisfied? There is no need to use an optimization function in this case. You can easily find the optimum by hand. From the constraints: x[3] equals (1-x[1]) x[4] equals (1-x[2]) Therefore x[1] * (1-x[1]) equals 0 x[2] * (1-x[2]) equals 0 So you only need to calculate the function value for 4 vectors: x1 <- c(1,1,0,0) x2 <- c(0,0,1,1) x3 <- c(1,0,0,1) x4 <- c(0,1,1,0) Modifying your ff function in the way Ravi suggested: ff <- function (x) { mm <- matrix(c(10, 25, 5, 10), 2, 2) matx <- matrix(x, 2, 2) sum(apply(mm ^ matx, 1, prod)) } and running ff with x{1,2,3,4} as argument, one gets > ff(x1) [1] 35 > ff(x2) [1] 15 > ff(x3) [1] 20 > ff(x4) [1] 30 So ff(x2) achieves the minimum, which corresponds to Ravi's result with auglag. Berend __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Optimization in R (alabama)
Axel, Your objective function has the wrong sign. The maximum is infinity, so it does not make sense to maximize. You should be minimizing the product. So, remove the negative sign and it works as you had expected. ff <- function (x) { mm <- matrix(c(10, 25, 5, 10), 2, 2) matx <- matrix(x, 2, 2) # - sum(apply(mm ^ matx, 1, prod)) # this is the problem sum(apply(mm ^ matx, 1, prod)) } ### constraints heq <- function(x) { h <- rep(NA, 1) h[1] <- x[1] + x[3] -1 h[2] <- x[2] + x[4] -1 h[3] <- x[1] * x[3] h[4] <- x[2] * x[4] h } res <- auglag(par = rep(1,4), fn = ff, heq = heq) Ravi From: Axel Urbiz [axel.ur...@gmail.com] Sent: Sunday, February 10, 2013 3:16 PM To: R-help@r-project.org; Ravi Varadhan Subject: Constrained Optimization in R (alabama) Dear List, I'm trying to solve this simple optimization problem in R. The parameters are the exponents to the matrix mm. The constraints specify that each row of the parameter matrix should sum to 1 and their product to 0. I don't understand why the constraints are not satisfied at the solution. I must be misinterpreting how to specify the constrains somehow. library(alabama) ff <- function (x) { mm <- matrix(c(10, 25, 5, 10), 2, 2) matx <- matrix(x, 2, 2) -sum(apply(mm ^ matx, 1, prod)) } ### constraints heq <- function(x) { h <- rep(NA, 1) h[1] <- x[1] + x[3] -1 h[2] <- x[2] + x[4] -1 h[3] <- x[1] * x[3] h[4] <- x[2] * x[4] h } res <- constrOptim.nl(par = c(1, 1, 1, 1), fn = ff, heq = heq) res$convergence #why NULL? matrix(round(res$par, 2), 2, 2) #why constraints are not satisfied? Axel. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Optimization in R (alabama)
Your constraints are non-sensical. The only way for these constraints to be satisfied is for two of the parameters to be 0 and the other two to be 1. Please spend time to formulate your problem correctly, before imposing on others' time. Ravi From: Axel Urbiz [axel.ur...@gmail.com] Sent: Sunday, February 10, 2013 3:16 PM To: R-help@r-project.org; Ravi Varadhan Subject: Constrained Optimization in R (alabama) Dear List, I'm trying to solve this simple optimization problem in R. The parameters are the exponents to the matrix mm. The constraints specify that each row of the parameter matrix should sum to 1 and their product to 0. I don't understand why the constraints are not satisfied at the solution. I must be misinterpreting how to specify the constrains somehow. library(alabama) ff <- function (x) { mm <- matrix(c(10, 25, 5, 10), 2, 2) matx <- matrix(x, 2, 2) -sum(apply(mm ^ matx, 1, prod)) } ### constraints heq <- function(x) { h <- rep(NA, 1) h[1] <- x[1] + x[3] -1 h[2] <- x[2] + x[4] -1 h[3] <- x[1] * x[3] h[4] <- x[2] * x[4] h } res <- constrOptim.nl(par = c(1, 1, 1, 1), fn = ff, heq = heq) res$convergence #why NULL? matrix(round(res$par, 2), 2, 2) #why constraints are not satisfied? Axel. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Non linear programming: choose R that minimizes corr(y,x^R)
Hi John, Do you want to minimize correlation or |correlation|? Here is a simple code, with example, that shows how to optimize the magnitude (that is corr^2) of correlation between y and x^p, where p is a positive real number. You can modify it to suit your needs. corxy.opt <- function(x, y, interval=c(0,4), maximum=TRUE){ obj <- function(p, x, y) cor(y, x^p)^2 ans <- optimize(f=obj, interval=interval, maximum=maximum, x=x, y=y) ans } set.seed(321) x <- runif(100) y <- rnorm(100, mean=1-2*x, sd=1) corxy.opt(x, y) # Why minimizing does not make sense obj2 <- function(p, x, y) sapply(p, function(p) cor(y, x^p)^2) plot(seq(0,5,length=1000), obj2(seq(0,5,length=1000),x,y), type="l") Best, Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] starting values in glm(..., family = binomial(link =log))
I did not get any warnings when I ran your data/model example. From: Fischer, Felix [mailto:felix.fisc...@charite.de] Sent: Wednesday, January 30, 2013 11:19 AM To: Ravi Varadhan Cc: r-help@r-project.org Subject: Re: [R] starting values in glm(..., family = binomial(link =log)) Thanks for your replies! It seems that I can fit my model now when I provide the "right" starting values; however, there remain warnings, such as: 1: In log(ifelse(y == 1, 1, (1 - y)/(1 - mu))) : NaNs produced 2: step size truncated due to divergence 3: step size truncated: out of bounds ... That makes me feel uncomfortable and I wonder whether I can "trust" the fitted model. Why is this kind of regression so picky about starting values compared to logistic regression? And is there a way to explain the difference between binomial and quasibinomial to a simple mind like mine? Best, Felix From: Ravi Varadhan [mailto:ravi.varad...@jhu.edu] Sent: Wednesday, January 30, 2013 5:02 PM To: Fischer, Felix Cc: r-help@r-project.org<mailto:r-help@r-project.org> Subject: [R] starting values in glm(..., family = binomial(link =log)) Try this: Age_log_model = glm(Arthrose ~ Alter, data=x, start=c(-1, 0), family=quasibinomial(link = log)) Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] starting values in glm(..., family = binomial(link =log))
Try this: Age_log_model = glm(Arthrose ~ Alter, data=x, start=c(-1, 0), family=quasibinomial(link = log)) Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] model selection with spg and AIC (or, convert list to fitted model object)
Adam, Getting the variance of the MLE when the true parameter is on the boundary is a very difficult problem. It is known that the standard bootstrap does not work. There are some sub-sampling approaches (Springer book: Politis, Romano, Wolf), but I am not an expert on this. However, an important question is how we know that the truth is actually on the boundary. If you can assume that the truth is in the interior, then you can use standard approaches: observed hessian or bootstrap to get standard errors. Best, Ravi From: Adam Zeilinger [mailto:ad...@ucr.edu] Sent: Sunday, December 02, 2012 5:04 PM To: Ravi Varadhan Cc: Adam Zeilinger (zeil0...@umn.edu); r-help@r-project.org Subject: Re: [R] model selection with spg and AIC (or, convert list to fitted model object) Dear Ravi, Thank you so much for the help. I switched to using the optimx function but I continue to use the spg method (for the most part) because I found that only spg consistently converges given different datasets. I also decided to use AIC rather than a likelihood ratio test. I have a new question. I would like to construct 95% confidence intervals for the parameter estimates from the best model. From a previous R Help thread, you said that it was a bad idea to use the Hessian matrix computed from optim/optimx or hessian() [numDeriv package] when the optimization is constrained and parameter estimates are on the boundary because the MLE is likely at a local minimum: http://tolstoy.newcastle.edu.au/R/e15/help/11/09/6673.html In the same thread, you suggest using the Hessian matrix from augmented Lagrangian optimization with auglag() [alabama package] (with some caveats). I would like to construct 95% CI with auglag, but I don't understand how to write the inequality constraints (hin) function. Could you please help me write the hin function? Below is R code for my MLE problem, using data that results in parameter estimates on the boundary, and my unsuccessful attempt at auglag() optimization. Note: I have gradient functions for NLL1 and NLL2 but they're very large and don't seem to improve optimization, so they are not included here. I can supply them if needed for the auglag() function. Also, I am running R 2.15.2 on a Windows 7 64-bit machine.
## library(optimx) library(alabama) # define multinomial distribution dmnom2 <- function(x,prob,log=FALSE) { r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1)) if (log) r else exp(r) } # data frame y y <- structure(list(t = c(0.167, 0.5, 1, 12, 18, 24, 36), n1 = c(1L, 1L, 1L, 8L, 9L, 10L, 12L), n2 = c(0L, 1L, 2L, 6L, 5L, 3L, 2L), n3 = c(13L, 12L, 11L, 0L, 0L, 1L, 0L)), .Names = c("t", "n1", "n2", "n3"), class = "data.frame", row.names = 36:42) # Negative log-likelihood functions NLL1 <- function(par, y) { p1 <- par[1] p2 <- p1 mu1 <- par[2] mu2 <- mu1 t <- y$t n1 <- y$n1 n2 <- y$n2 n3 <- y$n3 P1 <- (p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) + mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)*mu2* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2 P2 <- (p2*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*(-mu1^2 + 2*mu2*p1 + mu1*(mu2 - p1 + p2)) - mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)*mu1* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2 P3 <- 1 - P1 - P2 p.all <- c(P1, P2, P3) if(all(p.all > 0 & p.all < 1)) -sum(dmnom2(c(n1, n2, n3), prob = p.all, log = TRUE)) else 1e07 } NLL2 <- function(par, y) { p1 <- par[1] p2 <- par[2] mu1 <- par[3] mu2 <- par[4] t <- y$t n1 <- y$n1 n2 <- y$n2 n
Re: [R] Integration in R
Send us reproducible R code that shows what you actually tried. Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] vectorized uni-root?
Hi Ivo, The only problem is that uniroot() actually uses Brent's algorithm, and is based on the C code from netlib (there is also Fortran code available - zeroin.f). Brent's algorithm is more sophisticated than the simple bisection approach that you have vectorized. It combines bisection and secant methods along with inverse quadratic interpolation. The idea behind this hybrid approach (which by the way is a fundamental theme in all of numerical analysis) is to get faster convergence while not sacrificing the global convergence of bisection. So, the existing uniroot() will not be deprecated unless you can vectorize Brent's hybrid root-finding approach! Best, Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
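For contrast with Brent's hybrid, a minimal sketch of what a vectorized root-finder can look like (plain bisection only, not Brent; the function and brackets are illustrative):

vbisect <- function(f, lo, hi, tol = 1e-8, maxit = 100) {
  # f must be vectorized; each interval [lo[i], hi[i]] must bracket a root
  flo <- f(lo)
  for (i in seq_len(maxit)) {
    mid <- (lo + hi) / 2
    fmid <- f(mid)
    left <- flo * fmid <= 0      # sign change in [lo, mid]: root is there
    hi[left] <- mid[left]
    lo[!left] <- mid[!left]
    flo[!left] <- fmid[!left]
    if (max(hi - lo) < tol) break
  }
  (lo + hi) / 2
}
vbisect(function(x) x^2 - c(2, 3, 5), lo = rep(0, 3), hi = rep(3, 3))
# approximately sqrt(c(2, 3, 5))

Each interval merely halves per iteration (linear convergence), which is exactly the behavior Brent's combination of secant steps and inverse quadratic interpolation improves upon for each individual root.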
Re: [R] optim & .C / Crashing on run
You might also want to try the Nelder-Mead algorithm, nmk(), in the "dfoptim" package. It is a better algorithm than the Nelder-Mead in optim. It is all R code, so you might be able to modify it to fit your needs. Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
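A minimal usage sketch of the suggestion above (assuming the "dfoptim" package is installed; the test function is a stand-in for the poster's objective):

library(dfoptim)
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2  # Rosenbrock
ans <- nmk(par = c(-1.2, 1), fn = fr)
ans$par          # should be close to c(1, 1)
ans$convergence  # 0 indicates successful convergence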
Re: [R] Struggeling with nlminb...
Try optimx() in the "optimx" package with the control option `all.methods=TRUE'. Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
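A small sketch of that call (assuming the "optimx" package is installed; the objective is illustrative, not the poster's):

library(optimx)
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2  # Rosenbrock
ans <- optimx(par = c(-1.2, 1), fn = fr, control = list(all.methods = TRUE))
print(ans)  # one row per method: estimates, objective value, convergence code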
Re: [R] Creating Optimization Constraints
You have nonlinear constraints on the parameters. Take a look at constrained optimization algorithms in packages "alabama" or "Rsolnp" that can handle non-linearly constrained optimization problems. Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fit a "threshold" function with nls
Veronique, Can you write down the likelihood function in R and send it to me (this is very easy to do, but I don't want to do your work)? Also send me the code for simulating the data. I will show you how to fit such models using optimization tools. Best, Ravi From: Vito Muggeo [vito.mug...@unipa.it] Sent: Tuesday, October 16, 2012 9:55 AM To: Bert Gunter Cc: Véronique Boucher Lalonde; r-help@r-project.org; Ravi Varadhan Subject: Re: [R] fit a "threshold" function with nls Véronique, in addition to Bert's comments, I would like to bring to your attention that there are several packages that perform threshold/breakpoint/changepoint estimation in R, including cumSeg, segmented, strucchange, and bcp for a Bayesian approach. Moreover, some packages, such as cghFLasso, perform smoothing with an L1 penalty that can yield a step-function fit. I hope this helps you, vito On 15/10/2012 22.21, Bert Gunter wrote: > Véronique: > > I've cc'ed this to a true expert (Ravi Varadhan) who is one of those > who can give you a definitive response, but I **believe** the problem > is that threshold-type function fits have objective functions whose > derivatives are discontinuous, and hence gradient-based methods can > run into the sorts of problems that you see. > > **If** this is so, then you might do better using an explicit > non-gradient optimizer = rss minimizer via one of the optim() suite of > functions (maybe even the default Nelder-Mead); but this is definitely > where the counsel of an expert would be valuable. Also check out the > CRAN Optimization Task View for advice on optimization options. > > Cheers, > Bert > > On Mon, Oct 15, 2012 at 9:43 AM, Véronique Boucher Lalonde > wrote: >> I am trying to model a dependent variable as a threshold function of >> my independent variable. >> What I mean is that I want to fit different intercepts to y following 2 >> breakpoints, but with fixed slopes. >> I am trying to do this by using ifelse statements in the nls function. >> Perhaps, this is not an appropriate approach. >> >> I have created a very simple example to illustrate what I am trying to do. >> >> #Creating my independent variable >> x <- seq(0, 1000) >> #Creating my dependent variable, where all values below threshold #1 are 0, >> all values above threshold #2 are 0 and all values within the two >> thresholds are 1 >> y <- ifelse(x < 300, 0, ifelse(x>700, 0, 1)) >> #Adding noise to my dependent variable >> y <- y + rnorm(length(y), 0, 0.01) >> #I am having trouble clearly explaining the model I want to fit but perhaps >> the plot is self explanatory >> plot(x, y) >> #Now I am trying to adjust a nonlinear model to fit the two breakpoints, >> with known slopes between the breakpoints (here, respectively 0, 1 and 0) >> threshold.model <- nls(y ~ ifelse(x < b1, 0, ifelse(x > b2, 0, 1)), >> start=list(b1=300, b2=700), trace=T) >> >> I have played around with this function quite a bit and systematically get >> an error saying: singular gradient matrix at initial parameter estimates >> I admit that I don't fully understand what a singular gradient matrix >> implies. >> But it seems to me that both my model and starting values are sensible >> given the data, and that the break points are in fact estimable. >> Could someone point to what I am doing wrong?
>> >> More generally, I am interested in knowing >> (1) whether I can use such ifelse statements in the function nls >> (2) whether I can even use nls for this model >> (3) whether I can model this with a function that would allow me to assume >> that the errors are binomial, rather than normally distributed >> (ultimately I want to use such a model on binomial data) >> >> I am using R version 2.15.1 on 64-bit Windows 7 >> >> Any guidance would be greatly appreciated. >> Veronique >> >> [[alternative HTML version deleted]] >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > > -- Vito M.R. Muggeo Dip.to Sc Statist e Matem `Vianelli' Università di Palermo viale delle Scienze, edificio 13 90128 Palermo - ITALY tel: 091 23895240 fax: 091 485726 http://dssm.unipa.it/vmuggeo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] model selection with spg and AIC (or, convert list to fitted model object)
Adam, See the attached R code that solves your problem and beyond. One important issue is that you are enforcing constraints only indirectly. You need to make sure that P1, P2, and P3 (which are functions of the original parameters and time) are all between 0 and 1. It is not enough to impose constraints on the original parameters p1, p2, mu1 and mu2. I also show you how to do a likelihood ratio test with the results from spg. You can also do the same for nlminb or Rvmmin. Finally, I also show you how to use optimx to compare different algorithms. This shows you that in addition to spg, you also get very good results (as seen from objective function values and KKT conditions) with a number of other algorithms (e.g., Rvmmin, nlminb), many of which are much faster than spg. This example illustrates the utility of optimx(). I am always surprised that more R users doing optimization are not using optimx. It is a very powerful tool for benchmarking unconstrained and box-constrained optimization problems. It deserves to be used widely, in my biased, but correct, opinion. Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constraint Optimization with constrOptim
Hi Raimund, You can use spg() in the BB package. This requires that you have a projection rule for projecting an infeasible point onto the simplex defined by: 0 <= x <= 1, sum(x) = 1. I show you here how to project onto the unit simplex. So, this is how your problem can be solved: require(BB) proj.simplex <- function(y, c=1) { # # project an n-dim vector y to the simplex S_n # S_n = { x | x \in R^n, 0 <= x <= c, sum(x) = c} # Copyright: Ravi Varadhan, Johns Hopkins University # August 8, 2012 # n <- length(y) sy <- sort(y, decreasing=TRUE) csy <- cumsum(sy) rho <- max(which(sy > (csy - c)/(1:n))) theta <- (csy[rho] - c) / rho return(pmax(0, y - theta)) } # Rendite (returns) data dataset <- matrix(c(0.019120, 0.033820, -0.053180, 0.036220, -0.021480, -0.029380, -0.012180, -0.076980, -0.060380, 0.038901, -0.032107, -0.034153, -0.031944, 0.006494, -0.021062, -0.058497, 0.050473, 0.026086, 0.004916, -0.015996, 0.003968, 0.030721, 0.034774, -0.004972, 0.019459, 0.000196, -0.001849), nrow=3, ncol=9, byrow=TRUE) #Computation of the variance-covariance matrix covarianceMatrix <- ((t(dataset)) %*% dataset) / 3 fkt <- function(x, covMat) {t(x) %*% covMat %*% x} startingEstimates = rep(1/9, 9) ans <- spg(startingEstimates, fn=fkt, project=proj.simplex, covMat = covarianceMatrix) ans sum(ans$par) ans <- spg(runif(9), fn=fkt, project=proj.simplex, covMat = covarianceMatrix) # non-uniqueness of the optimal solution ans sum(ans$par) # You see that there is an infinite number of solutions. Because your covariance matrix is rank-deficient, it only has rank 3. # You get a unique solution when the covariance matrix is positive-definite aMat <- matrix(rnorm(81), 9, 9) covMat <- aMat %*% t(aMat) ans <- spg(runif(9), fn=fkt, project=proj.simplex, covMat=covMat) # unique solution ans sum(ans$par) Hope this is helpful, Ravi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Coxph not converging with continuous variable
Dear Terry, I agree that this example is highly atypical. Having said that, my experience with optimization algorithms is that scaling (a.k.a. standardizing) the continuous covariates is greatly helpful in terms of convergence. Have you considered automatically standardizing the continuous covariates, and then converting the scaled coefficients back to the original scale? Of course, the end user could do this just as easily! Best, Ravi Ravi Varadhan, Ph.D. Assistant Professor The Center on Aging and Health Division of Geriatric Medicine & Gerontology Johns Hopkins University rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> 410-502-2619 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
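A sketch of the standardize-then-backtransform idea from the message above (the lung data from the "survival" package is used purely for illustration):

library(survival)
fit.scaled <- coxph(Surv(time, status) ~ scale(age), data = lung)
# coefficient on the standardized scale; dividing by the covariate's SD
# recovers the coefficient on the original scale
coef(fit.scaled) / sd(lung$age)
coef(coxph(Surv(time, status) ~ age, data = lung))  # should agree

Since standardization is a linear reparametrization, the two fits are equivalent; only the optimizer's path differs, which is where the better scaling helps.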
Re: [R] A contour plot question - vis.gam () function in "mgcv"
Thank you, Peter. I don't know why I didn't think of this! Also, thanks to Ilai. Ravi From: Peter Ehlers [ehl...@ucalgary.ca] Sent: Tuesday, April 03, 2012 8:22 PM To: ilai Cc: Ravi Varadhan; r-help@r-project.org Subject: Re: [R] A contour plot question - vis.gam () function in "mgcv" On 2012-04-03 15:49, ilai wrote: > Try to plot the points first followed by vis.gam(...,type='contour', > color='bw', add=T) instead of vis.gam followed by points. > > HTH Or, if vis.gam gives you default scales that you wish to preserve, then just replot the contours over the points with vis.gam(., add = TRUE) Peter Ehlers > > > On Tue, Apr 3, 2012 at 2:48 PM, Ravi Varadhan wrote: >> Hi, >> >> Please see the attached contour plot (I am sorry about the big file). This >> was created using the vis.gam() function in "mgcv" package. However, my >> question is somewhat broader. >> >> In generating this figure, I first created the contours using vis.gam() and >> then I plotted the points. These point are plotted on top of the contours >> so that some of the contour lines are only partially visible. Is there a >> way to make the contour lines fully visible? I cannot reverse the order and >> plot the points first and then call vis.gam(). Or, can I? Are there other >> options? >> >> Thanks for any help or hints. >> >> Best, >> Ravi >> >> >> Ravi Varadhan, Ph.D. >> Assistant Professor >> The Center on Aging and Health >> Division of Geriatric Medicine& Gerontology >> Johns Hopkins University >> rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> >> 410-502-2619 >> >> >> __ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] quadratic programming-maximization instead of
Maximizing f(x) = x'Ax makes sense only when A is negative-definite. Therefore, this is the same as minimizing x'Bx, where B = -A, and B is positive-definite. In other words, you should be able to simply flip the sign of the original matrix. This should yield a positive-definite matrix since the original matrix ought to be negative-definite. Ravi --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
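A small sketch of the sign flip using the "quadprog" package (the matrix and the constraint sum(x) = 1 are illustrative; solve.QP minimizes (1/2) x'Dx - d'x, so maximizing x'Ax corresponds to Dmat = -2*A):

library(quadprog)
A <- -diag(c(1, 2, 3))                 # a negative-definite matrix
stopifnot(all(eigen(-A)$values > 0))   # B = -A is positive-definite
sol <- solve.QP(Dmat = -2 * A, dvec = rep(0, 3),
                Amat = matrix(1, 3, 1), bvec = 1, meq = 1)  # sum(x) = 1
sol$solution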
[R] problem of "constrOptim.nl", no hessian and convergence
Hi, Use the `auglag' function in "alabama" if you want to get the Hessian at convergence. This typically tends to perform better than `constrOptim.nl'. Also, `constrOptim.nl' does not compute the Hessian. You should not specify method="L-BFGS-B". The default method "BFGS" is better in this setting. Hope this helps, Ravi ------- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Non-negativity constraints for logistic regression
Hi Thomas, Using box-constrained optimizer in glm.fit is a good suggestion for finding the point estimates. However, there is still the issue of making inference, i.e., computing the variances and p-values for the estimates. You have to deal with the issue of MLE possibly being on the boundary. Asymptotic distribution of MLE estimators will not be normal in the case of convergence at the boundary. This is a difficult problem. Best, Ravi --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
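A sketch of the point-estimation part on simulated data (illustrative only; the inference problem described above remains regardless of how the estimates are computed):

set.seed(1)
X <- cbind(1, matrix(rnorm(400), 200, 2))
y <- rbinom(200, 1, plogis(X %*% c(-0.5, 1, 0.2)))
negloglik <- function(beta) {
  eta <- drop(X %*% beta)
  -sum(y * eta - log(1 + exp(eta)))   # Bernoulli log-likelihood, logit link
}
# intercept left free; slope coefficients constrained to be non-negative
optim(par = c(0, 0.1, 0.1), fn = negloglik, method = "L-BFGS-B",
      lower = c(-Inf, 0, 0))$par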
[R] performance of adaptIntegrate vs. integrate
The integrand is highly peaked. It is approximately an impulse function where much of the mass is concentrated at a very small interval. Plot the function and see for yourself. This is the likely cause of the problem. Other types of integrands where you could experience problems are: integrands with singularity at either limit and slowly decaying oscillatory integrands. As to why integrate performs better than adaptIntegrate in this situation, I don't know. You have to study the two implementations. Wynn's epsilon algorithm is an extrapolation method for improving the convergence of a sequence. This could be an explanation for the better performance, but I cannot say for sure. Hope this is helpful, Ravi --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
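A sketch of the kind of integrand being described (values are illustrative; the true integral over (0, 1) is essentially 1, but a quadrature rule can miss the spike entirely if no abscissa lands near it):

f <- function(x) dnorm(x, mean = 0.5, sd = 1e-4)
curve(f, 0, 1, n = 10001)   # virtually all of the mass sits near x = 0.5
integrate(f, 0, 1)          # may or may not locate the spike
integrate(f, 0.49, 0.51)    # helping the integrator with tighter limits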
[R] optim seems to be finding a local minimum
Hi Dimitri, Your problem has little to do with local versus global optimum. You can convince yourself that the solution you got is not even a local optimum by checking the gradient at the solution. The main issue is that your objective function is not differentiable everywhere. So, you have 2 options: either you use a smooth objective function (e.g. squared residuals) or you use an optimization algorithm that can handle non-smooth objective functions. Here I show that your problem is well solved by the `nmkb' function (a bound-constrained version of the Nelder-Mead simplex method) from the "dfoptim" package.

library(dfoptim)
> myopt2 <- nmkb(fn=myfunc, par=c(0.1,max(IV)), lower=0)
> myopt2
$par
[1] 8.897590e-01 9.470163e+07
$value
[1] 334.1901
$feval
[1] 204
$restarts
[1] 0
$convergence
[1] 0
$message
[1] "Successful convergence"

Then, there is also the issue of properly scaling your function, because it is poorly scaled. Look how different the 2 parameters are - they are 7 orders of magnitude apart. You are really asking for trouble here. Hope this is helpful, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Poor performance of "Optim"
Hi, You really need to study the documentation of "optim" carefully before you make broad generalizations. There are several algorithms available in optim. The default is a simplex-type algorithm called Nelder-Mead. I think this is an unfortunate choice as the default algorithm. Nelder-Mead is a robust algorithm that can work well for almost any kind of objective function (smooth or nasty). However, the trade-off is that it is very slow in terms of convergence rate. For simple, smooth problems, such as yours, you should use "BFGS" (or "L-BFGS-B" if you have simple box-constraints). Also, take a look at the "optimx" package and the most recent paper in J Stat Software on optimx for a better understanding of the wide array of optimization options available in R. Best, Ravi. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
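A quick illustration of the advice (the test function is a stand-in; exact counts will vary):

fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2  # Rosenbrock
optim(c(-1.2, 1), fr)$counts                   # Nelder-Mead, the default
optim(c(-1.2, 1), fr, method = "BFGS")$counts  # compare evaluation counts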
[R] Download statistics for a package
Hi, How can I get information on how many times a particular package has been downloaded from CRAN? Thanks, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Global optimisation with inequality constraints
Hi, Take a look at the multiStart() function in the "BB" package. This allows you to specify random multiple starting values, which need not satisfy constraints. Then you need to specify the constraints using the `projectLinear' argument. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Imposing Feller condition using project constraint in spg
Hi Kristian, The idea behind projection is that you take an iterate that violates the constraints and project it onto the nearest point that satisfies the constraints. Suppose you have an iterate (w1, w2) that does not satisfy the constraint w1 * w2 >= (1 + eps)/2. Our goal is to find a (w1', w2'), given (w1, w2), such that (A) w1' * w2' = (1+eps)/2 = k and (B) (w1-w1')^2 + (w2-w2')^2 is minimum. This is quite easy to solve. We know (w1, w2). You plug in w2' = k/w1' from (A) into (B) and minimize the function of w1'. This is a simple calculus exercise, and I will leave this as a homework problem for you to solve! Best, Ravi. ------- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
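One way to check the resulting algebra numerically, without giving away the closed-form homework answer (a sketch; the search bracket and the test values are illustrative):

project.pair <- function(w1, w2, k, upper = 10) {
  # minimize (B) over w1' after substituting w2' = k / w1' from (A)
  obj <- function(w1p) (w1 - w1p)^2 + (w2 - k / w1p)^2
  w1p <- optimize(obj, interval = c(1e-8, upper))$minimum
  c(w1p, k / w1p)
}
wp <- project.pair(w1 = 0.3, w2 = 0.4, k = 0.5)
wp
prod(wp)   # equals k by construction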
Re: [R] Hessian matrix issue
"However, it is not known whether the standard errors obtained from this Hessian are asymptotically valid." Let me rephrase this. I think that as a measure of dispersion, the standard error obtained using the augmented Lagrangian algorithm is correct. However, what is *not known* is the asymptotic distribution of the parameter estimates when constraints are active. This is a "non-regular" situation where MLEs might have strange asymptotic behavior. We cannot generally assume normality and use the standard error estimates to construct confidence intervals or calculate significance levels. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> From: Ravi Varadhan Sent: Wednesday, September 07, 2011 1:37 PM To: 'gpet...@uark.edu'; 'nas...@uottawa.ca'; 'r-help@r-project.org' Subject: Hessian matrix issue Yes, numDeriv::hessian is very accurate. So, I normally take the output from the optimizer, *if it is a local optimum*, and then apply numDeriv::hessian to it. I, then, compute the standard errors from it. However, it is important to know that you have obtained a local optimum. In fact, you need the gradient and Hessian to actually verify this - the first and second order KKT conditions. However, this is tricky in *constrained* optimization problems. If you have constraints, and at least one of the constraints is active at the best parameter estimate, then the gradient of the original objective function need not be close to zero and the hessian is also incorrect. If you employ an augmented Lagrangian approach (see alabama::auglag), then you can obtain the correct gradient and Hessian at best parameter estimate. These gradients and Hessian correspond to the modified objective function that includes a Lagrangian term augmented by a quadratic penalty term. They can be used to check the KKT conditions. Here is a simple example: require(alabama) fr <- function(x) { ## Rosenbrock Banana function x1 <- x[1] x2 <- x[2] 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 } grr <- function(x) { ## Gradient of 'fr' x1 <- x[1] x2 <- x[2] c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1), 200 * (x2 - x1 * x1)) } p0 <- c(0, 0) ans1 <- optim(par=p0, fn=fr, gr=grr, upper=c(0.9, Inf), method="L-BFGS-B", hessian=TRUE) grr(ans1$par) # this will not be close to zero ans1$hessian # Using an augmented Lagrangian optimizer hcon <- function(x) 0.9 - x[1] hcon.jac <- function(x) matrix(c(-1, 0) , 1, 2) ans2 <- auglag(par=p0, fn=fr, gr=grr, hin=hcon, hin.jac=hcon.jac) ans2$gradient # this will be close to zero ans2$hessian However, it is not known whether the standard errors obtained from this Hessian are asymptotically valid. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Hessian matrix issue
Yes, numDeriv::hessian is very accurate. So, I normally take the output from the optimizer, *if it is a local optimum*, and then apply numDeriv::hessian to it. I, then, compute the standard errors from it. However, it is important to know that you have obtained a local optimum. In fact, you need the gradient and Hessian to actually verify this - the first and second order KKT conditions. However, this is tricky in *constrained* optimization problems. If you have constraints, and at least one of the constraints is active at the best parameter estimate, then the gradient of the original objective function need not be close to zero and the hessian is also incorrect. If you employ an augmented Lagrangian approach (see alabama::auglag), then you can obtain the correct gradient and Hessian at best parameter estimate. These gradients and Hessian correspond to the modified objective function that includes a Lagrangian term augmented by a quadratic penalty term. They can be used to check the KKT conditions. Here is a simple example: require(alabama) fr <- function(x) { ## Rosenbrock Banana function x1 <- x[1] x2 <- x[2] 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 } grr <- function(x) { ## Gradient of 'fr' x1 <- x[1] x2 <- x[2] c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1), 200 * (x2 - x1 * x1)) } p0 <- c(0, 0) ans1 <- optim(par=p0, fn=fr, gr=grr, upper=c(0.9, Inf), method="L-BFGS-B", hessian=TRUE) grr(ans1$par) # this will not be close to zero ans1$hessian # Using an augmented Lagrangian optimizer hcon <- function(x) 0.9 - x[1] hcon.jac <- function(x) matrix(c(-1, 0) , 1, 2) ans2 <- auglag(par=p0, fn=fr, gr=grr, hin=hcon, hin.jac=hcon.jac) ans2$gradient # this will be close to zero ans2$hessian However, it is not known whether the standard errors obtained from this Hessian are asymptotically valid. Best, Ravi. ------- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Gradients in optimx
Hi Reuben, I am puzzled to note that the gradient check in "optimx" does not work for you. Can you send me a reproducible example so that I can figure this out? John - I think the best solution for now is to issue a "warning" rather than an error message, when the numerical gradient is not sufficiently close to the user-specified gradient. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Gradient function in OPTIMX
Hi Kathie, The gradient check in "optimx" checks if the user-specified gradient (at starting parameters) is within roughly 1.e-05 * (1 + fval) of the numerically computed gradient. It is likely that you have correctly coded up the gradient, but still there can be significant differences b/w numerical and exact gradients. This can happen when the gradients are very large. I would check this again separately as follows: require(numDeriv) mygrad <- gr.fy(theta0) numgrad <- grad(x=theta0, func=fy) # fy = the objective function whose gradient gr.fy computes cbind(mygrad, numgrad) all.equal(mygrad, numgrad) Can you report these gradients to us? In "optimx", we should probably change this into a "warning" rather than a "stop". Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to generate a random variate that is correlated with a given right-censored random variate?
Hi, Here is an update. I implemented this bivariate lognormal approach. It works well in simulations when I generated the marginal [T] from a lognormal distribution and independently censored it. It, however, does not do well when I generate from a marginal [T] that is Weibull or a distribution not well approximated by a lognormal. How to make the approach more robust to distributional assumptions on the marginal of [T]? I am thinking that a bivariate copula approach is called for. The [U] margin can be standard lognormal as before, and the [T] margin needs to be a flexible distribution. What kind of bivariate copula might work? Then, how to generate from the conditional distribution [U | T]? Any thoughts? Thanks, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> From: Ravi Varadhan Sent: Friday, August 26, 2011 2:56 PM To: r-help@r-project.org Subject: How to generate a random variate that is correlated with a given right-censored random variate? Hi, I have a right-censored (positive) random variable (e.g. failure times subject to right censoring) that is observed for N subjects: Y_i, i = 1, 2, ..., N. Note that Y_i = min(T_i, C_i), where T_i is the true failure time and C_i is the censoring time. Let us assume that C_i is independent of T_i. Now, I would like to generate another random variable U_i, i = 1, 2, ..., N, which is correlated with T. In other words, I would like to generate U from the conditional distribution [U | T=t]. One approach might be to assume that the joint distn [T, U] is bivariate lognormal. So, the marginals [T] and [U], as well as the conditional [U | T] are also lognormal. I can estimate the marginal [T] using the right-censored data Y (assuming independent censoring). For example, I might use survival::survreg to do this. Then, I assume that U is standard lognormal (mean = 0, var = 1). Now, I only need to assume a value for correlation parameter, r, and I can then sample from the conditional [U | T=t] which is also a lognormal (parametrized by r). Does this sound right? Are there better/simpler ways to do this? Thanks very much for any hints. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to generate a random variate that is correlated with a given right-censored random variate?
Hi, I have a right-censored (positive) random variable (e.g. failure times subject to right censoring) that is observed for N subjects: Y_i, i = 1, 2, ..., N. Note that Y_i = min(T_i, C_i), where T_i is the true failure time and C_i is the censoring time. Let us assume that C_i is independent of T_i. Now, I would like to generate another random variable U_i, i = 1, 2, ..., N, which is correlated with T. In other words, I would like to generate U from the conditional distribution [U | T=t]. One approach might be to assume that the joint distn [T, U] is bivariate lognormal. So, the marginals [T] and [U], as well as the conditional [U | T] are also lognormal. I can estimate the marginal [T] using the right-censored data Y (assuming independent censoring). For example, I might use survival::survreg to do this. Then, I assume that U is standard lognormal (mean = 0, var = 1). Now, I only need to assume a value for correlation parameter, r, and I can then sample from the conditional [U | T=t] which is also a lognormal (parametrized by r). Does this sound right? Are there better/simpler ways to do this? Thanks very much for any hints. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
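A sketch of the conditional draw under the bivariate lognormal assumption described above (mu and sigma stand for the meanlog and sdlog of [T], e.g. as estimated by survreg with dist = "lognormal"; r is the assumed correlation of the logs; the example values are illustrative):

# log T ~ N(mu, sigma^2), log U ~ N(0, 1), cor(log T, log U) = r,
# so log U | log T = s is N(r * (s - mu) / sigma, 1 - r^2)
rU.given.T <- function(t, mu, sigma, r) {
  z <- (log(t) - mu) / sigma
  rlnorm(length(t), meanlog = r * z, sdlog = sqrt(1 - r^2))
}
set.seed(123)
Tvals <- rlnorm(5, meanlog = 1, sdlog = 0.5)   # stand-in failure times
rU.given.T(Tvals, mu = 1, sigma = 0.5, r = 0.6)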
Re: [R] Scaling problem in optim()
There is an error in your call to `optim'. The `lower' bound is incorrect: lower=c(rep(-1,n),1e-5,1e-5,1e5) This should be: lower=c(rep(-1,n),1e-5,1e-5,1e-5) I am not sure if this fixes the problem, but it is worth a try. I do not understand your scaling. From the lower and upper bounds it seems like the (n+2)th parameter is the only one that needs to be scaled by a scale factor of 100. After you have fixed the error in `lower' and the scaling, you might also want to try the "optimx" package that runs a bunch of optimization algorithms. Ravi. ------- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
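For the scaling part, optim's built-in `parscale' control is the standard mechanism; a toy sketch (the function and magnitudes are illustrative, not from the original post):

badfn <- function(p) (p[1] - 1)^2 + (p[2] / 1e4 - 1)^2   # optimum at c(1, 1e4)
optim(c(0, 0), badfn, method = "BFGS")$par               # may stop far from it
optim(c(0, 0), badfn, method = "BFGS",
      control = list(parscale = c(1, 1e4)))$par          # well-scaled internally

Dividing each parameter by its typical magnitude makes the internal search space roughly isotropic, which is what gradient-based methods implicitly assume.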
[R] optimization problems
Kathie, It is very difficult to help without adequate information. What does your objective function look like? Are you maximizing (in which case you have to make sure that the sign of the objective function is correct) or minimizing? Can you try "optimx" with the control option all.methods=TRUE? Hope this is helpful, Ravi. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ridge regression - covariance matrices of ridge
Hi Michael, The coefficients of ridge regression are given by:

\beta^* = (X'X + k I)^{-1} X' y    (1)

where k > 0 is the penalty parameter and I is the identity matrix. The ridge estimates are related to the OLS estimates \beta as follows:

\beta^* = Z \beta    (2)

where

Z = [I + k(X'X)^{-1}]^{-1}    (3)

Let \Sigma and \Sigma^* be the variance-covariance matrices of \beta and \beta^*, respectively. Therefore,

\Sigma^* = Z \Sigma Z'    (4)

In other words, for a fixed k, you can obtain the covariance matrix of \beta^* by making use of (3) and (4). You can read the original paper by Hoerl and Kennard (Technometrics 1970) for more details. Hope this is helpful, Ravi. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
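A sketch of equations (1)-(4) on simulated data (illustrative; a no-intercept model, with \sigma^2 estimated from the OLS residuals):

set.seed(42)
n <- 50; p <- 3; k <- 0.1
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(1, -1, 0.5) + rnorm(n)
XtX.inv <- solve(crossprod(X))
beta <- XtX.inv %*% crossprod(X, y)       # OLS estimates
Z <- solve(diag(p) + k * XtX.inv)         # eq. (3)
beta.star <- Z %*% beta                   # eq. (2); identical to eq. (1)
s2 <- drop(crossprod(y - X %*% beta)) / (n - p)
Sigma <- s2 * XtX.inv                     # var-cov of the OLS estimates
Sigma.star <- Z %*% Sigma %*% t(Z)        # eq. (4)
Sigma.star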
Re: [R] How to capture console output in a numeric format
I did think of this solution, Keith, but I am generally uncomfortable (maybe "paranoid" is a better word) with the use of `<<-'. Perhaps, my fear is unjustified in this particular situation. Thanks, Ravi. ------- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Keith Jewell Sent: Friday, June 24, 2011 11:49 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] How to capture console output in a numeric format If you don't want the information as character, why are you printing it rather than storing it in a matrix? Why not something along the lines of this... fr <- function(x) { ## Rosenbrock Banana function on.exit(aMatrix <<- rbind(aMatrix, cbind(x1, x2, f))) x1 <- x[1] x2 <- x[2] f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 f } aMatrix <- NULL ans <- optim(c(-1.2,1), fr) aMatrix HTH Keith J -Original Message- "Ravi Varadhan" wrote in message news:2f9ea67ef9ae1c48a147cb41be2e15c303e...@dom-eb-mail1.win.ad.jhu.edu... Thank you very much, Jim. That works! I did know that I could process the character strings using regex, but was also wondering if there was a direct way to get this. Suppose, in the current example I would like to obtain a 3-column matrix that contains the parameters and the function value: fr <- function(x) { ## Rosenbrock Banana function on.exit(print(cbind(x1, x2, f))) x1 <- x[1] x2 <- x[2] f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 f } fvals <- capture.output(ans <- optim(c(-1.2,1), fr)) Now, I need to tweak your solution to get the 3-column matrix. It would be nice if there was a more direct way to get the numerical output, perhaps a numeric option in capture.output(). Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to capture console output in a numeric format
Thank you very much, Jim. That works! I did know that I could process the character strings using regex, but was also wondering if there was a direct way to get this. Suppose, in the current example I would like to obtain a 3-column matrix that contains the parameters and the function value: fr <- function(x) { ## Rosenbrock Banana function on.exit(print(cbind(x1, x2, f))) x1 <- x[1] x2 <- x[2] f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 f } fvals <- capture.output(ans <- optim(c(-1.2,1), fr)) Now, I need to tweak your solution to get the 3-column matrix. It would be nice, if there was a more direct way to get the numerical output, perhaps a numeric option in capture.output(). Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: Friday, June 24, 2011 10:48 AM To: Ravi Varadhan Cc: r-help@r-project.org Subject: Re: [R] How to capture console output in a numeric format try this: > fr <- function(x) { ## Rosenbrock Banana function +on.exit(print(f)) +x1 <- x[1] +x2 <- x[2] +f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 +f + } > > fvals <- capture.output(ans <- optim(c(-1.2,1), fr)) > # convert to numeric > fvals <- as.numeric(sub("^.* ", "", fvals)) > > fvals [1] 24.20 7.095296 15.08 4.541696 [5] 6.029216 4.456256 8.879936 7.777856 [9] 4.728125 5.167901 4.21 4.437670 [13] 4.178989 4.326023 4.070813 4.221489 [17] 4.039810 4.896359 4.009379 4.077130 [21] 4.020798 3.993600 4.024586 4.117625 [25] 3.993115 3.976081 3.971089 4.023905 [29] 3.980807 3.9525770000 3.932179 3.935345 On Fri, Jun 24, 2011 at 10:39 AM, Ravi Varadhan wrote: > Hi, > > I would like to know how to capture the console output from running an > algorithm for further analysis. I can capture this using capture.output() > but that yields a character vector. I would like to extract the actual > numeric values. Here is an example of what I am trying to do. > > fr <- function(x) { ## Rosenbrock Banana function > on.exit(print(f)) > x1 <- x[1] > x2 <- x[2] > f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 > f > } > > fvals <- capture.output(ans <- optim(c(-1.2,1), fr)) > > Now, `fvals' contains character elements, but I would like to obtain the > actual numerical values. How can I do this? > > Thanks very much for any suggestions. > > Best, > Ravi. > > --- > Ravi Varadhan, Ph.D. > Assistant Professor, > Division of Geriatric Medicine and Gerontology School of Medicine Johns > Hopkins University > > Ph. (410) 502-2619 > email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to capture console output in a numeric format
Hi, I would like to know how to capture the console output from running an algorithm for further analysis. I can capture this using capture.output() but that yields a character vector. I would like to extract the actual numeric values. Here is an example of what I am trying to do. fr <- function(x) { ## Rosenbrock Banana function on.exit(print(f)) x1 <- x[1] x2 <- x[2] f <- 100 * (x2 - x1 * x1)^2 + (1 - x1)^2 f } fvals <- capture.output(ans <- optim(c(-1.2,1), fr)) Now, `fvals' contains character elements, but I would like to obtain the actual numerical values. How can I do this? Thanks very much for any suggestions. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu<mailto:rvarad...@jhmi.edu> [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.