Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-21 Thread Wang, Zhu
Thanks Michael. I agree that it is possible to simplify the codes.

Best,

Zhu 

-Original Message-
From: Michael Weylandt  
Sent: Monday, May 20, 2019 3:41 PM
To: Wang, Zhu 
Cc: Ivan Krylov ; R-package-devel@r-project.org
Subject: Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in 
Fortran?

Negative binomial is a bit trickier since it's a two parameter family without a 
closed-form MLE.

For the probability parameter, you can use the closed form MLE at 
https://en.wikipedia.org/wiki/Negative_binomial_distribution#Maximum_likelihood_estimation

For the number of samples, you'll need to use an iterative method to solve the 
score equation (see the above link).

It's probably easier to code this up yourself rather than calling into R, but 
if you do call into R, I'd look into using the `fitdistr` function instead of a 
full regression method, as demonstrated at 
https://stat.ethz.ch/pipermail/r-help/2012-October/338458.html

Michael

On Sat, May 11, 2019 at 11:10 AM Wang, Zhu  wrote:
>
> Thanks Michael.
>
> I also need an intercept-only negative binomial model with unknown scale 
> parameter. So my thought was on borrowing some codes that already existed. I 
> think Ivan's solution is an excellent one and can be extended to other 
> scenarios.
>
> Best,
>
> Zhu
>
> On May 11, 2019, at 9:48 AM, Michael Weylandt  
> wrote:
>
> On Sat, May 11, 2019 at 8:28 AM Wang, Zhu  wrote:
>>
>>
>> I am open to whatever suggestions but I am not aware a simple closed-form 
>> solution for my original question.
>>
>
> It would help if you could clarify your original question a bit more, 
> but for at least the main three GLMs, there are closed form solutions, 
> based on means of y. Assuming canonical links,
>
> - Gaussian: intercept = mean(y)
> - Logistic: intercept = logit(mean(y))  [Note that you have problems 
> here if your data is all 0 or all 1]
> - Poisson: intercept = log(mean(y)) [You have problems here if your 
> data is all 0]
>
> (Check my math on these, but I'm pretty sure this is right.)
>
> Like I said above, this gets trickier if you add observation weights or 
> offsets, but the same ideas work.
>
> Stepping back to the statistical theory: GLMs predict the mean of y, 
> conditional on x. If x doesn't vary (intercept only model), then the GLM is 
> just predicting the mean of y and the MLE for the mean of y is exactly that 
> under standard GLM assumptions - the sample mean of y.
>
> We then just have to use the link function and its inverse to transform to 
> and from the observation space (where mean(y) lives) and the linear predictor 
> space (where the intercept term naturally lives).
>
> Michael
__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-20 Thread Michael Weylandt
Negative binomial is a bit trickier since it's a two parameter family
without a closed-form MLE.

For the probability parameter, you can use the closed form MLE at
https://en.wikipedia.org/wiki/Negative_binomial_distribution#Maximum_likelihood_estimation

For the number of samples, you'll need to use an iterative method to
solve the score equation (see the above link).

It's probably easier to code this up yourself rather than calling into
R, but if you do call into R, I'd look into using the `fitdistr`
function instead of a full regression method, as demonstrated at
https://stat.ethz.ch/pipermail/r-help/2012-October/338458.html

Michael

On Sat, May 11, 2019 at 11:10 AM Wang, Zhu  wrote:
>
> Thanks Michael.
>
> I also need an intercept-only negative binomial model with unknown scale 
> parameter. So my thought was on borrowing some codes that already existed. I 
> think Ivan's solution is an excellent one and can be extended to other 
> scenarios.
>
> Best,
>
> Zhu
>
> On May 11, 2019, at 9:48 AM, Michael Weylandt  
> wrote:
>
> On Sat, May 11, 2019 at 8:28 AM Wang, Zhu  wrote:
>>
>>
>> I am open to whatever suggestions but I am not aware a simple closed-form 
>> solution for my original question.
>>
>
> It would help if you could clarify your original question a bit more, but for 
> at least the main three GLMs, there are closed form solutions, based on means 
> of y. Assuming canonical links,
>
> - Gaussian: intercept = mean(y)
> - Logistic: intercept = logit(mean(y))  [Note that you have problems here if 
> your data is all 0 or all 1]
> - Poisson: intercept = log(mean(y)) [You have problems here if your data is 
> all 0]
>
> (Check my math on these, but I'm pretty sure this is right.)
>
> Like I said above, this gets trickier if you add observation weights or 
> offsets, but the same ideas work.
>
> Stepping back to the statistical theory: GLMs predict the mean of y, 
> conditional on x. If x doesn't vary (intercept only model), then the GLM is 
> just predicting the mean of y and the MLE for the mean of y is exactly that 
> under standard GLM assumptions - the sample mean of y.
>
> We then just have to use the link function and its inverse to transform to 
> and from the observation space (where mean(y) lives) and the linear predictor 
> space (where the intercept term naturally lives).
>
> Michael

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Wang, Zhu
Thanks Michael.

I also need an intercept-only negative binomial model with unknown scale 
parameter. So my thought was on borrowing some codes that already existed. I 
think Ivan's solution is an excellent one and can be extended to other 
scenarios.

Best,

Zhu

On May 11, 2019, at 9:48 AM, Michael Weylandt 
mailto:michael.weyla...@gmail.com>> wrote:

On Sat, May 11, 2019 at 8:28 AM Wang, Zhu 
mailto:wan...@uthscsa.edu>> wrote:

I am open to whatever suggestions but I am not aware a simple closed-form 
solution for my original question.


It would help if you could clarify your original question a bit more, but for 
at least the main three GLMs, there are closed form solutions, based on means 
of y. Assuming canonical links,

- Gaussian: intercept = mean(y)
- Logistic: intercept = logit(mean(y))  [Note that you have problems here if 
your data is all 0 or all 1]
- Poisson: intercept = log(mean(y)) [You have problems here if your data is all 
0]

(Check my math on these, but I'm pretty sure this is right.)

Like I said above, this gets trickier if you add observation weights or 
offsets, but the same ideas work.

Stepping back to the statistical theory: GLMs predict the mean of y, 
conditional on x. If x doesn't vary (intercept only model), then the GLM is 
just predicting the mean of y and the MLE for the mean of y is exactly that 
under standard GLM assumptions - the sample mean of y.

We then just have to use the link function and its inverse to transform to and 
from the observation space (where mean(y) lives) and the linear predictor space 
(where the intercept term naturally lives).

Michael

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Michael Weylandt
On Sat, May 11, 2019 at 8:28 AM Wang, Zhu  wrote:

>
> I am open to whatever suggestions but I am not aware a simple closed-form
> solution for my original question.
>
>
It would help if you could clarify your original question a bit more, but
for at least the main three GLMs, there are closed form solutions, based on
means of y. Assuming canonical links,

- Gaussian: intercept = mean(y)
- Logistic: intercept = logit(mean(y))  [Note that you have problems here
if your data is all 0 or all 1]
- Poisson: intercept = log(mean(y)) [You have problems here if your data is
all 0]

(Check my math on these, but I'm pretty sure this is right.)

Like I said above, this gets trickier if you add observation weights or
offsets, but the same ideas work.

Stepping back to the statistical theory: GLMs predict the mean of y,
conditional on x. If x doesn't vary (intercept only model), then the GLM is
just predicting the mean of y and the MLE for the mean of y is exactly that
under standard GLM assumptions - the sample mean of y.

We then just have to use the link function and its inverse to transform to
and from the observation space (where mean(y) lives) and the linear
predictor space (where the intercept term naturally lives).

Michael

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Wang, Zhu
Ivan's answer is very impressive.   

Michael,

I am open to whatever suggestions but I am not aware a simple closed-form 
solution for my original question.

Thanks,

Zhu

> On May 11, 2019, at 8:01 AM, Michael Weylandt  
> wrote:
> 
> This is very cool, but I wonder if it isn't over-kill for the larger
> problem.
> 
> In general, calculating the coefficient of an intercept-only GLM is just
> calculating (a transformation of) the MLE of a univariate exponential
> family distribution. (Things may be a bit trickier if the GLM also involves
> weights and offsets, not just an intercept, but I'm assuming it doesn't.)
> 
> OP: Can you clarify why you want to invoke R's entire GLM machinery as
> opposed to just using the closed form solutions?
> 
> Michael
> 
>> On Sat, May 11, 2019 at 7:23 AM Ivan Krylov  wrote:
>> 
>> On Fri, 10 May 2019 16:17:42 +
>> "Wang, Zhu"  wrote:
>> 
>>> Are there any examples or links for me to follow through more closely?
>> 
>> Calling R functions from C++ is described at
>>  and
>> elsewhere in Rcpp documentation. An example follows:
>> 
>> --8<--glmfit.cpp--8<--
>> #include 
>> #include 
>> using namespace Rcpp;
>> 
>> extern "C" double intercept_glm(size_t n, const double * response) {
>>// access functions from default environment
>>Function glm_fit("glm.fit"), coef("coef");
>> 
>>// intercept-only model: response ~ 1
>>NumericVector x(n);
>>x.fill(1);
>> 
>>// I couldn't find a way to wrap a double* into a NumericVector
>>// without copying anything, sorry; perhaps someone else
>>// can offer a solution
>>NumericVector y(n);
>>std::copy_n(response, n, y.begin());
>> 
>>// call the R function, convert the result back
>>return as(coef(glm_fit(x, y)));
>> }
>> --8<--glmfit.cpp--8<--
>> 
>> Since this function is extern "C" and uses only primitive C types, it
>> should be fairly easy to call from Fortran. (C is the lingua franca of
>> programming languages). Fortran-C interoperability is well described in
>> "Modern Fortran Explained" by Metcalf et al. Here is the Fortran side
>> of the code:
>> 
>> --8<--callglm.f90--8<--
>> subroutine callglm(ret)
>> use, intrinsic :: iso_c_binding, only: c_size_t, c_double
>> ! using iso_c_binding here
>> ! - to get correct type of ret when R calls the function
>> ! - to convert variables before calling C function
>> implicit none
>> ! using F77-style arguments to match expectations of .Fortran()
>> real(c_double), intent(out) :: ret
>> ! toy data to compare against R code later
>> real :: y(10) = [10, 11, 20, 9, 10, 8, 11, 45, 2, 3]
>> ! the interface block declares an extern "C" function
>> interface
>>  ! double intercept_glm(size_t n, const double * response)
>>  function intercept_glm(n, response) bind(c)
>>   use, intrinsic :: iso_c_binding
>>   real(c_double) :: intercept_glm
>>   integer(c_size_t), value :: n
>>   real(c_double) :: response(*)
>>  end function
>> end interface
>> 
>> ! call the function as you would call any other function
>> ret = intercept_glm(int(size(y), c_size_t), real(y, c_double))
>> end subroutine
>> --8<--callglm.f90--8<--
>> 
>> For a quick test, make sure that you have Rcpp installed and run:
>> 
>> # adjust R version and path if your library is elsewhere
>> PKG_CPPFLAGS='-g -I ~/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include' \
>>R CMD SHLIB callglm.f90 glmfit.cpp
>> R
>> library(Rcpp)
>> dyn.load('callglm.so') # change extension if needed
>> .Fortran('callglm', ret=numeric(1))
>> # $ret
>> # [1] 12.9
>> coef(glm.fit(rep(1, 10), c(10, 11, 20, 9, 10, 8, 11, 45, 2, 3)))
>> # [1] 12.9
>> 
>> To use this in a package, place both files in the src/ subdirectory of
>> your package and add LinkingTo: Rcpp in the DESCRIPTION.
>> 
>> --
>> Best regards,
>> Ivan
>> 
>> __
>> R-package-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>> 
> 
>[[alternative HTML version deleted]]
> 
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Michael Weylandt
This is very cool, but I wonder if it isn't over-kill for the larger
problem.

In general, calculating the coefficient of an intercept-only GLM is just
calculating (a transformation of) the MLE of a univariate exponential
family distribution. (Things may be a bit trickier if the GLM also involves
weights and offsets, not just an intercept, but I'm assuming it doesn't.)

OP: Can you clarify why you want to invoke R's entire GLM machinery as
opposed to just using the closed form solutions?

Michael

On Sat, May 11, 2019 at 7:23 AM Ivan Krylov  wrote:

> On Fri, 10 May 2019 16:17:42 +
> "Wang, Zhu"  wrote:
>
> > Are there any examples or links for me to follow through more closely?
>
> Calling R functions from C++ is described at
>  and
> elsewhere in Rcpp documentation. An example follows:
>
> --8<--glmfit.cpp--8<--
> #include 
> #include 
> using namespace Rcpp;
>
> extern "C" double intercept_glm(size_t n, const double * response) {
> // access functions from default environment
> Function glm_fit("glm.fit"), coef("coef");
>
> // intercept-only model: response ~ 1
> NumericVector x(n);
> x.fill(1);
>
> // I couldn't find a way to wrap a double* into a NumericVector
> // without copying anything, sorry; perhaps someone else
> // can offer a solution
> NumericVector y(n);
> std::copy_n(response, n, y.begin());
>
> // call the R function, convert the result back
> return as(coef(glm_fit(x, y)));
> }
> --8<--glmfit.cpp--8<--
>
> Since this function is extern "C" and uses only primitive C types, it
> should be fairly easy to call from Fortran. (C is the lingua franca of
> programming languages). Fortran-C interoperability is well described in
> "Modern Fortran Explained" by Metcalf et al. Here is the Fortran side
> of the code:
>
> --8<--callglm.f90--8<--
> subroutine callglm(ret)
> use, intrinsic :: iso_c_binding, only: c_size_t, c_double
> ! using iso_c_binding here
> ! - to get correct type of ret when R calls the function
> ! - to convert variables before calling C function
> implicit none
> ! using F77-style arguments to match expectations of .Fortran()
> real(c_double), intent(out) :: ret
> ! toy data to compare against R code later
> real :: y(10) = [10, 11, 20, 9, 10, 8, 11, 45, 2, 3]
>  ! the interface block declares an extern "C" function
>  interface
>   ! double intercept_glm(size_t n, const double * response)
>   function intercept_glm(n, response) bind(c)
>use, intrinsic :: iso_c_binding
>real(c_double) :: intercept_glm
>integer(c_size_t), value :: n
>real(c_double) :: response(*)
>   end function
>  end interface
>
>  ! call the function as you would call any other function
>  ret = intercept_glm(int(size(y), c_size_t), real(y, c_double))
> end subroutine
> --8<--callglm.f90--8<--
>
> For a quick test, make sure that you have Rcpp installed and run:
>
> # adjust R version and path if your library is elsewhere
> PKG_CPPFLAGS='-g -I ~/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include' \
> R CMD SHLIB callglm.f90 glmfit.cpp
> R
> library(Rcpp)
> dyn.load('callglm.so') # change extension if needed
> .Fortran('callglm', ret=numeric(1))
> # $ret
> # [1] 12.9
> coef(glm.fit(rep(1, 10), c(10, 11, 20, 9, 10, 8, 11, 45, 2, 3)))
> # [1] 12.9
>
> To use this in a package, place both files in the src/ subdirectory of
> your package and add LinkingTo: Rcpp in the DESCRIPTION.
>
> --
> Best regards,
> Ivan
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Ivan Krylov
On Fri, 10 May 2019 16:17:42 +
"Wang, Zhu"  wrote:

> Are there any examples or links for me to follow through more closely?

Calling R functions from C++ is described at
 and
elsewhere in Rcpp documentation. An example follows:

--8<--glmfit.cpp--8<--
#include 
#include 
using namespace Rcpp;

extern "C" double intercept_glm(size_t n, const double * response) {
// access functions from default environment
Function glm_fit("glm.fit"), coef("coef");

// intercept-only model: response ~ 1
NumericVector x(n);
x.fill(1);

// I couldn't find a way to wrap a double* into a NumericVector
// without copying anything, sorry; perhaps someone else
// can offer a solution
NumericVector y(n);
std::copy_n(response, n, y.begin());

// call the R function, convert the result back
return as(coef(glm_fit(x, y)));
}
--8<--glmfit.cpp--8<--

Since this function is extern "C" and uses only primitive C types, it
should be fairly easy to call from Fortran. (C is the lingua franca of
programming languages). Fortran-C interoperability is well described in
"Modern Fortran Explained" by Metcalf et al. Here is the Fortran side
of the code:

--8<--callglm.f90--8<--
subroutine callglm(ret)
use, intrinsic :: iso_c_binding, only: c_size_t, c_double
! using iso_c_binding here
! - to get correct type of ret when R calls the function
! - to convert variables before calling C function
implicit none
! using F77-style arguments to match expectations of .Fortran()
real(c_double), intent(out) :: ret
! toy data to compare against R code later
real :: y(10) = [10, 11, 20, 9, 10, 8, 11, 45, 2, 3]
 ! the interface block declares an extern "C" function
 interface
  ! double intercept_glm(size_t n, const double * response)
  function intercept_glm(n, response) bind(c)
   use, intrinsic :: iso_c_binding
   real(c_double) :: intercept_glm
   integer(c_size_t), value :: n
   real(c_double) :: response(*)
  end function
 end interface

 ! call the function as you would call any other function
 ret = intercept_glm(int(size(y), c_size_t), real(y, c_double))
end subroutine
--8<--callglm.f90--8<--

For a quick test, make sure that you have Rcpp installed and run:

# adjust R version and path if your library is elsewhere
PKG_CPPFLAGS='-g -I ~/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include' \
R CMD SHLIB callglm.f90 glmfit.cpp
R
library(Rcpp)
dyn.load('callglm.so') # change extension if needed
.Fortran('callglm', ret=numeric(1))
# $ret
# [1] 12.9
coef(glm.fit(rep(1, 10), c(10, 11, 20, 9, 10, 8, 11, 45, 2, 3)))
# [1] 12.9

To use this in a package, place both files in the src/ subdirectory of
your package and add LinkingTo: Rcpp in the DESCRIPTION.

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-10 Thread Wang, Zhu
Thanks Ivan for the tip. Are there any examples or links for me to follow 
through more closely?

Thanks,

Zhu

-Original Message-
From: Ivan Krylov  
Sent: Monday, May 6, 2019 4:14 AM
To: Wang, Zhu 
Cc: R-package-devel@r-project.org
Subject: Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in 
Fortran?

On Sat, 4 May 2019 22:41:16 +
"Wang, Zhu"  wrote:

> In an R package I would like to compute intercept for an 
> intercept-only GLM in a Fortran subroutine.

If all else fails, you could use R API [*] to call coef(glm.fit(...)), though 
it might require writing a C or C++ wrapper to avoid translating all function 
prototypes from Rinternals.h into Fortran 2003 C interoperability syntax.

--
Best regards,
Ivan

[*]:
https://cran.r-project.org/doc/manuals/R-exts.html#Evaluating-R-expressions-from-C

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-06 Thread Ivan Krylov
On Sat, 4 May 2019 22:41:16 +
"Wang, Zhu"  wrote:

> In an R package I would like to compute intercept for an
> intercept-only GLM in a Fortran subroutine.

If all else fails, you could use R API [*] to call coef(glm.fit(...)),
though it might require writing a C or C++ wrapper to avoid translating
all function prototypes from Rinternals.h into Fortran 2003 C
interoperability syntax.

-- 
Best regards,
Ivan

[*]:
https://cran.r-project.org/doc/manuals/R-exts.html#Evaluating-R-expressions-from-C

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel