Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Wang, Zhu
Thanks Michael.

I also need an intercept-only negative binomial model with unknown scale 
parameter. So my thought was on borrowing some codes that already existed. I 
think Ivan's solution is an excellent one and can be extended to other 
scenarios.

Best,

Zhu

On May 11, 2019, at 9:48 AM, Michael Weylandt 
mailto:michael.weyla...@gmail.com>> wrote:

On Sat, May 11, 2019 at 8:28 AM Wang, Zhu 
mailto:wan...@uthscsa.edu>> wrote:

I am open to whatever suggestions but I am not aware a simple closed-form 
solution for my original question.


It would help if you could clarify your original question a bit more, but for 
at least the main three GLMs, there are closed form solutions, based on means 
of y. Assuming canonical links,

- Gaussian: intercept = mean(y)
- Logistic: intercept = logit(mean(y))  [Note that you have problems here if 
your data is all 0 or all 1]
- Poisson: intercept = log(mean(y)) [You have problems here if your data is all 
0]

(Check my math on these, but I'm pretty sure this is right.)

Like I said above, this gets trickier if you add observation weights or 
offsets, but the same ideas work.

Stepping back to the statistical theory: GLMs predict the mean of y, 
conditional on x. If x doesn't vary (intercept only model), then the GLM is 
just predicting the mean of y and the MLE for the mean of y is exactly that 
under standard GLM assumptions - the sample mean of y.

We then just have to use the link function and its inverse to transform to and 
from the observation space (where mean(y) lives) and the linear predictor space 
(where the intercept term naturally lives).

Michael

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Michael Weylandt
On Sat, May 11, 2019 at 8:28 AM Wang, Zhu  wrote:

>
> I am open to whatever suggestions but I am not aware a simple closed-form
> solution for my original question.
>
>
It would help if you could clarify your original question a bit more, but
for at least the main three GLMs, there are closed form solutions, based on
means of y. Assuming canonical links,

- Gaussian: intercept = mean(y)
- Logistic: intercept = logit(mean(y))  [Note that you have problems here
if your data is all 0 or all 1]
- Poisson: intercept = log(mean(y)) [You have problems here if your data is
all 0]

(Check my math on these, but I'm pretty sure this is right.)

Like I said above, this gets trickier if you add observation weights or
offsets, but the same ideas work.

Stepping back to the statistical theory: GLMs predict the mean of y,
conditional on x. If x doesn't vary (intercept only model), then the GLM is
just predicting the mean of y and the MLE for the mean of y is exactly that
under standard GLM assumptions - the sample mean of y.

We then just have to use the link function and its inverse to transform to
and from the observation space (where mean(y) lives) and the linear
predictor space (where the intercept term naturally lives).

Michael

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Wang, Zhu
Ivan's answer is very impressive.   

Michael,

I am open to whatever suggestions but I am not aware a simple closed-form 
solution for my original question.

Thanks,

Zhu

> On May 11, 2019, at 8:01 AM, Michael Weylandt  
> wrote:
> 
> This is very cool, but I wonder if it isn't over-kill for the larger
> problem.
> 
> In general, calculating the coefficient of an intercept-only GLM is just
> calculating (a transformation of) the MLE of a univariate exponential
> family distribution. (Things may be a bit trickier if the GLM also involves
> weights and offsets, not just an intercept, but I'm assuming it doesn't.)
> 
> OP: Can you clarify why you want to invoke R's entire GLM machinery as
> opposed to just using the closed form solutions?
> 
> Michael
> 
>> On Sat, May 11, 2019 at 7:23 AM Ivan Krylov  wrote:
>> 
>> On Fri, 10 May 2019 16:17:42 +
>> "Wang, Zhu"  wrote:
>> 
>>> Are there any examples or links for me to follow through more closely?
>> 
>> Calling R functions from C++ is described at
>>  and
>> elsewhere in Rcpp documentation. An example follows:
>> 
>> --8<--glmfit.cpp--8<--
>> #include 
>> #include 
>> using namespace Rcpp;
>> 
>> extern "C" double intercept_glm(size_t n, const double * response) {
>>// access functions from default environment
>>Function glm_fit("glm.fit"), coef("coef");
>> 
>>// intercept-only model: response ~ 1
>>NumericVector x(n);
>>x.fill(1);
>> 
>>// I couldn't find a way to wrap a double* into a NumericVector
>>// without copying anything, sorry; perhaps someone else
>>// can offer a solution
>>NumericVector y(n);
>>std::copy_n(response, n, y.begin());
>> 
>>// call the R function, convert the result back
>>return as(coef(glm_fit(x, y)));
>> }
>> --8<--glmfit.cpp--8<--
>> 
>> Since this function is extern "C" and uses only primitive C types, it
>> should be fairly easy to call from Fortran. (C is the lingua franca of
>> programming languages). Fortran-C interoperability is well described in
>> "Modern Fortran Explained" by Metcalf et al. Here is the Fortran side
>> of the code:
>> 
>> --8<--callglm.f90--8<--
>> subroutine callglm(ret)
>> use, intrinsic :: iso_c_binding, only: c_size_t, c_double
>> ! using iso_c_binding here
>> ! - to get correct type of ret when R calls the function
>> ! - to convert variables before calling C function
>> implicit none
>> ! using F77-style arguments to match expectations of .Fortran()
>> real(c_double), intent(out) :: ret
>> ! toy data to compare against R code later
>> real :: y(10) = [10, 11, 20, 9, 10, 8, 11, 45, 2, 3]
>> ! the interface block declares an extern "C" function
>> interface
>>  ! double intercept_glm(size_t n, const double * response)
>>  function intercept_glm(n, response) bind(c)
>>   use, intrinsic :: iso_c_binding
>>   real(c_double) :: intercept_glm
>>   integer(c_size_t), value :: n
>>   real(c_double) :: response(*)
>>  end function
>> end interface
>> 
>> ! call the function as you would call any other function
>> ret = intercept_glm(int(size(y), c_size_t), real(y, c_double))
>> end subroutine
>> --8<--callglm.f90--8<--
>> 
>> For a quick test, make sure that you have Rcpp installed and run:
>> 
>> # adjust R version and path if your library is elsewhere
>> PKG_CPPFLAGS='-g -I ~/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include' \
>>R CMD SHLIB callglm.f90 glmfit.cpp
>> R
>> library(Rcpp)
>> dyn.load('callglm.so') # change extension if needed
>> .Fortran('callglm', ret=numeric(1))
>> # $ret
>> # [1] 12.9
>> coef(glm.fit(rep(1, 10), c(10, 11, 20, 9, 10, 8, 11, 45, 2, 3)))
>> # [1] 12.9
>> 
>> To use this in a package, place both files in the src/ subdirectory of
>> your package and add LinkingTo: Rcpp in the DESCRIPTION.
>> 
>> --
>> Best regards,
>> Ivan
>> 
>> __
>> R-package-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>> 
> 
>[[alternative HTML version deleted]]
> 
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Ivan Krylov
On Fri, 10 May 2019 16:17:42 +
"Wang, Zhu"  wrote:

> Are there any examples or links for me to follow through more closely?

Calling R functions from C++ is described at
 and
elsewhere in Rcpp documentation. An example follows:

--8<--glmfit.cpp--8<--
#include 
#include 
using namespace Rcpp;

extern "C" double intercept_glm(size_t n, const double * response) {
// access functions from default environment
Function glm_fit("glm.fit"), coef("coef");

// intercept-only model: response ~ 1
NumericVector x(n);
x.fill(1);

// I couldn't find a way to wrap a double* into a NumericVector
// without copying anything, sorry; perhaps someone else
// can offer a solution
NumericVector y(n);
std::copy_n(response, n, y.begin());

// call the R function, convert the result back
return as(coef(glm_fit(x, y)));
}
--8<--glmfit.cpp--8<--

Since this function is extern "C" and uses only primitive C types, it
should be fairly easy to call from Fortran. (C is the lingua franca of
programming languages). Fortran-C interoperability is well described in
"Modern Fortran Explained" by Metcalf et al. Here is the Fortran side
of the code:

--8<--callglm.f90--8<--
subroutine callglm(ret)
use, intrinsic :: iso_c_binding, only: c_size_t, c_double
! using iso_c_binding here
! - to get correct type of ret when R calls the function
! - to convert variables before calling C function
implicit none
! using F77-style arguments to match expectations of .Fortran()
real(c_double), intent(out) :: ret
! toy data to compare against R code later
real :: y(10) = [10, 11, 20, 9, 10, 8, 11, 45, 2, 3]
 ! the interface block declares an extern "C" function
 interface
  ! double intercept_glm(size_t n, const double * response)
  function intercept_glm(n, response) bind(c)
   use, intrinsic :: iso_c_binding
   real(c_double) :: intercept_glm
   integer(c_size_t), value :: n
   real(c_double) :: response(*)
  end function
 end interface

 ! call the function as you would call any other function
 ret = intercept_glm(int(size(y), c_size_t), real(y, c_double))
end subroutine
--8<--callglm.f90--8<--

For a quick test, make sure that you have Rcpp installed and run:

# adjust R version and path if your library is elsewhere
PKG_CPPFLAGS='-g -I ~/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include' \
R CMD SHLIB callglm.f90 glmfit.cpp
R
library(Rcpp)
dyn.load('callglm.so') # change extension if needed
.Fortran('callglm', ret=numeric(1))
# $ret
# [1] 12.9
coef(glm.fit(rep(1, 10), c(10, 11, 20, 9, 10, 8, 11, 45, 2, 3)))
# [1] 12.9

To use this in a package, place both files in the src/ subdirectory of
your package and add LinkingTo: Rcpp in the DESCRIPTION.

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to obtain intercept of intercept-only glm in Fortran?

2019-05-11 Thread Michael Weylandt
This is very cool, but I wonder if it isn't over-kill for the larger
problem.

In general, calculating the coefficient of an intercept-only GLM is just
calculating (a transformation of) the MLE of a univariate exponential
family distribution. (Things may be a bit trickier if the GLM also involves
weights and offsets, not just an intercept, but I'm assuming it doesn't.)

OP: Can you clarify why you want to invoke R's entire GLM machinery as
opposed to just using the closed form solutions?

Michael

On Sat, May 11, 2019 at 7:23 AM Ivan Krylov  wrote:

> On Fri, 10 May 2019 16:17:42 +
> "Wang, Zhu"  wrote:
>
> > Are there any examples or links for me to follow through more closely?
>
> Calling R functions from C++ is described at
>  and
> elsewhere in Rcpp documentation. An example follows:
>
> --8<--glmfit.cpp--8<--
> #include 
> #include 
> using namespace Rcpp;
>
> extern "C" double intercept_glm(size_t n, const double * response) {
> // access functions from default environment
> Function glm_fit("glm.fit"), coef("coef");
>
> // intercept-only model: response ~ 1
> NumericVector x(n);
> x.fill(1);
>
> // I couldn't find a way to wrap a double* into a NumericVector
> // without copying anything, sorry; perhaps someone else
> // can offer a solution
> NumericVector y(n);
> std::copy_n(response, n, y.begin());
>
> // call the R function, convert the result back
> return as(coef(glm_fit(x, y)));
> }
> --8<--glmfit.cpp--8<--
>
> Since this function is extern "C" and uses only primitive C types, it
> should be fairly easy to call from Fortran. (C is the lingua franca of
> programming languages). Fortran-C interoperability is well described in
> "Modern Fortran Explained" by Metcalf et al. Here is the Fortran side
> of the code:
>
> --8<--callglm.f90--8<--
> subroutine callglm(ret)
> use, intrinsic :: iso_c_binding, only: c_size_t, c_double
> ! using iso_c_binding here
> ! - to get correct type of ret when R calls the function
> ! - to convert variables before calling C function
> implicit none
> ! using F77-style arguments to match expectations of .Fortran()
> real(c_double), intent(out) :: ret
> ! toy data to compare against R code later
> real :: y(10) = [10, 11, 20, 9, 10, 8, 11, 45, 2, 3]
>  ! the interface block declares an extern "C" function
>  interface
>   ! double intercept_glm(size_t n, const double * response)
>   function intercept_glm(n, response) bind(c)
>use, intrinsic :: iso_c_binding
>real(c_double) :: intercept_glm
>integer(c_size_t), value :: n
>real(c_double) :: response(*)
>   end function
>  end interface
>
>  ! call the function as you would call any other function
>  ret = intercept_glm(int(size(y), c_size_t), real(y, c_double))
> end subroutine
> --8<--callglm.f90--8<--
>
> For a quick test, make sure that you have Rcpp installed and run:
>
> # adjust R version and path if your library is elsewhere
> PKG_CPPFLAGS='-g -I ~/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include' \
> R CMD SHLIB callglm.f90 glmfit.cpp
> R
> library(Rcpp)
> dyn.load('callglm.so') # change extension if needed
> .Fortran('callglm', ret=numeric(1))
> # $ret
> # [1] 12.9
> coef(glm.fit(rep(1, 10), c(10, 11, 20, 9, 10, 8, 11, 45, 2, 3)))
> # [1] 12.9
>
> To use this in a package, place both files in the src/ subdirectory of
> your package and add LinkingTo: Rcpp in the DESCRIPTION.
>
> --
> Best regards,
> Ivan
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel