Re: [R] Constrained Poisson model / Bayesian Poisson model

2016-01-23 Thread David Winsemius

> On Jan 23, 2016, at 5:24 AM, mara.pfleide...@uni-ulm.de wrote:
> 
> Hi David,
> 
> I'm sorry. I'm not familiar with posting problems on helppages.
> As the data I deal with is confidential I can't provide all details,
> but I will try to be as precise as possible about my problem:
> 
> As I said I'm working on a Poisson regression model with a linear
> predictor and the identity as link function.

Then that is not what most people would call a "Poisson regression model".

In particular: I have data which consists of observations of two random 
variables X and Y over the period of about 3 months. I have clustered this data 
into hours, so I received 2088 observations of those variables and each 
observation represents the values of the variables in one hour. Now I assume Y 
to be Poisson distributed (Y_i~Poi(lambda_i)) and that the values of Y come 
from two different impacts, so I split Y_i into two Poisson-distributed 
variables Y_i1 and Y_i2. I assume that the values of X have a long-term effect 
(of at least one day) on the part Y_i1 and I have estimators for the parameters 
lambda_i2, but I have no exact values of the variables of Y_i1 and Y_i2 but 
only of Y_i as a whole.

> So my model looks as follows:
> fml=Y_i1 ~ b_1*X_i++b_n*X_(i-n+1) - lambda_i2 -1 with n>=24
> That means that the last 24 (or more) values of X influence the value of Y_i1 
> in an additive way and I have no intercept(b_0=0).
> Now I made a matrix whose rows represent Y_i, all of its 24 (or more) 
> regressors for each observation of Y_i and the corresponding estimator 
> lambda_i2.
> Then I used glm(fml, family=poisson(link="identity"), data=matrix) and tried 
> it for different values of n (=24,48,36,...).

I worry that might not construct the model you described above. The use of 
`poisson` constructs Poisson distributed _errors_ with an additive link, and to 
my understanding does _not_ assume Poisson distributed Y values.


> But always some of the coefficients received negative values which doesnt't 
> make sense in interpretation. (The values of X represent certain events which 
> can only have a positive or none effect on the value of Y.)
> 
> Now I want to use the constraint b_i >=0 in my model, but I don't know how I 
> can do this.
> 
> I also thought of analyzing this model in a Bayesian way, but yet I haven't 
> found a Bayesian version of glm() for the Poisson distribution such that I 
> can specify the prior for the b_i on my own. (Then I could include the 
> positivity in the prior.) Do you have a hint for me?

The R blogoshere has been heralding the arrival of the `rstanarm` package which 
provides a glm-like interface to an MCMC (Bayesian) estimation engine. Version 
2.90 is on CRAN. The list of authors in the description file is impressive: 

Authors@R: c(person("Jonah", "Gabry", email = "jsg2...@columbia.edu", role = 
"aut"),
 person("Trustees of", "Columbia University", role = "cph"),
 person("R Core", "Deveopment Team", role = "cph", 
 comment = "R/pp_data.R, R/stan_aov.R"),
 person("Douglas", "Bates", role = "cph", comment = "R/pp_data.R"),
 person("Martin", "Maechler", role = "cph", comment = 
"R/pp_data.R"),
 person("Ben", "Bolker", role = "cph", comment = "R/pp_data.R"),
 person("Steve", "Walker", role = "cph", comment = "R/pp_data.R"),
 person("Brian", "Ripley", role = "cph", 
comment = "R/stan_aov.R, R/stan_polr.R"),
 person("William", "Venables", role = "cph", comment = 
"R/stan_polr.R"),
 person("Ben", "Goodrich", email = 
"benjamin.goodr...@columbia.edu", 
role = c("cre", "aut")))

> 
> I'm sorry if I went too much into detail now...
> I hope you understand my point now and have some answers for me!
> 
> Best,
> Mara
> 
> 
> Zitat von David Winsemius :
> 
>> 
>>> On Jan 22, 2016, at 7:01 AM, mara.pfleide...@uni-ulm.de wrote:
>>> 
>>> Hi all,
>>> 
>>> I am dealing with a problem about my linear Poisson regression   model 
>>> (link function=identity).
>>> 
>>> I am using the glm()-function which results in negative   coefficients, but 
>>> a negative influence of the regressors wouldn't   make sense.
>> 
>> Negative coefficients merely indicate a lower relative rate. You   need to 
>> be more specific about the exactly data and model output   before you can 
>> raise our concern to a level where further comment   can be made.
>> 
>> 
>>> 
>>> (i) Is there a possibility to set constraints on the regression   
>>> parameters in glm() such that all coefficients are positive? Or is   there 
>>> another function in R for which this is possible?
>>> 
>>> (ii) Is there a Bayesian version of the glm()-function where I can   
>>> specify the prior distribution for my regression parameters? (e.g.   a 
>>> Dirichlet prior s.t. the parameters are positive)
>>> 
>>> All this with respect to the linear Poisson model...

Re: [R] Constrained Poisson model / Bayesian Poisson model

2016-01-23 Thread David Winsemius

> On Jan 23, 2016, at 9:40 AM, David Winsemius  wrote:
> 
> 
>> On Jan 23, 2016, at 5:24 AM, mara.pfleide...@uni-ulm.de wrote:
>> 
>> Hi David,
>> 
>> I'm sorry. I'm not familiar with posting problems on helppages.
>> As the data I deal with is confidential I can't provide all details,
>> but I will try to be as precise as possible about my problem:
>> 
>> As I said I'm working on a Poisson regression model with a linear
>> predictor and the identity as link function.
> 
> Then that is not what most people would call a "Poisson regression model".
> 
>> In particular: I have data which consists of observations of two random 
>> variables X and Y over the period of about 3 months. I have clustered this 
>> data into hours, so I received 2088 observations of those variables and each 
>> observation represents the values of the variables in one hour. Now I assume 
>> Y to be Poisson distributed (Y_i~Poi(lambda_i)) and that the values of Y 
>> come from two different impacts, so I split Y_i into two Poisson-distributed 
>> variables Y_i1 and Y_i2. I assume that the values of X have a long-term 
>> effect (of at least one day) on the part Y_i1 and I have estimators for the 
>> parameters lambda_i2, but I have no exact values of the variables of Y_i1 
>> and Y_i2 but only of Y_i as a whole.
>> 
>> So my model looks as follows:
>> fml=Y_i1 ~ b_1*X_i++b_n*X_(i-n+1) - lambda_i2 -1 with n>=24
>> That means that the last 24 (or more) values of X influence the value of 
>> Y_i1 in an additive way and I have no intercept(b_0=0).
>> Now I made a matrix whose rows represent Y_i, all of its 24 (or more) 
>> regressors for each observation of Y_i and the corresponding estimator 
>> lambda_i2.
>> Then I used glm(fml, family=poisson(link="identity"), data=matrix) and tried 
>> it for different values of n (=24,48,36,...).
> 
> I worry that might not construct the model you described above. The use of 
> `poisson` constructs Poisson distributed _errors_ with an additive link, and 
> to my understanding does _not_ assume Poisson distributed Y values.
> 
> 
>> But always some of the coefficients received negative values which doesnt't 
>> make sense in interpretation. (The values of X represent certain events 
>> which can only have a positive or none effect on the value of Y.)

One further thought. Since your errors are being constructed from a 
non-negative distribution, and the Y|X_i dependence is linear, then it seems 
perfectly reasonable that some of the estimates (which are on the linear scale) 
might be forced to be negative despite this not being completely sensible in 
real-life. The errors would be force to be "centered" around the linear 
estimate and so if the "true" value were small and the errors were larger, then 
that would "push" the estimate down into negative territory. Just one of the 
dangers in choosing a non-canonical choice of link and error distribution that 
you as an analyst needs to accept.


>> 
>> Now I want to use the constraint b_i >=0 in my model, but I don't know how I 
>> can do this.
>> 
>> I also thought of analyzing this model in a Bayesian way, but yet I haven't 
>> found a Bayesian version of glm() for the Poisson distribution such that I 
>> can specify the prior for the b_i on my own. (Then I could include the 
>> positivity in the prior.) Do you have a hint for me?
> 
> The R blogoshere has been heralding the arrival of the `rstanarm` package 
> which provides a glm-like interface to an MCMC (Bayesian) estimation engine. 
> Version 2.90 is on CRAN.

The version on CRAN is actually 2.90-1 and after running the script to install 
from GitHub, you can have the 2.90-2 version.

https://github.com/stan-dev/rstanarm

The github route will require that you have the needed development platform for 
your OS.

The install log showed an error with failure to "find template 
reference-stan-manual" but that did not stop compilation and there seemed to be 
a substantial numer of vignettes to consult, including one entitled: "stan_glm: 
GLMs for Count Data"

Best of luck.

> The list of authors in the description file is impressive: 
> 
> Authors@R: c(person("Jonah", "Gabry", email = "jsg2...@columbia.edu", role = 
> "aut"),
> person("Trustees of", "Columbia University", role = "cph"),
> person("R Core", "Deveopment Team", role = "cph", 
> comment = "R/pp_data.R, R/stan_aov.R"),
> person("Douglas", "Bates", role = "cph", comment = "R/pp_data.R"),
> person("Martin", "Maechler", role = "cph", comment = 
> "R/pp_data.R"),
> person("Ben", "Bolker", role = "cph", comment = "R/pp_data.R"),
> person("Steve", "Walker", role = "cph", comment = "R/pp_data.R"),
> person("Brian", "Ripley", role = "cph", 
>comment = "R/stan_aov.R, R/stan_polr.R"),
> person("William", "Venables", role = "cph", comment = 
> "R/stan_polr.R"),
> person("Ben", 

Re: [R] Constrained Poisson model / Bayesian Poisson model

2016-01-23 Thread mara . pfleiderer

Hi David,

I'm sorry. I'm not familiar with posting problems on helppages.
As the data I deal with is confidential I can't provide all details,
but I will try to be as precise as possible about my problem:

As I said I'm working on a Poisson regression model with a linear
predictor and the identity as link function.
In particular: I have data which consists of observations of two  
random variables X and Y over the period of about 3 months. I have  
clustered this data into hours, so I received 2088 observations of  
those variables and each observation represents the values of the  
variables in one hour. Now I assume Y to be Poisson distributed  
(Y_i~Poi(lambda_i)) and that the values of Y come from two different  
impacts, so I split Y_i into two Poisson-distributed variables Y_i1  
and Y_i2. I assume that the values of X have a long-term effect (of at  
least one day) on the part Y_i1 and I have estimators for the  
parameters lambda_i2, but I have no exact values of the variables of  
Y_i1 and Y_i2 but only of Y_i as a whole.

So my model looks as follows:
fml=Y_i1 ~ b_1*X_i++b_n*X_(i-n+1) - lambda_i2 -1 with n>=24
That means that the last 24 (or more) values of X influence the value  
of Y_i1 in an additive way and I have no intercept(b_0=0).
Now I made a matrix whose rows represent Y_i, all of its 24 (or more)  
regressors for each observation of Y_i and the corresponding estimator  
lambda_i2.
Then I used glm(fml, family=poisson(link="identity"), data=matrix) and  
tried it for different values of n (=24,48,36,...).
But always some of the coefficients received negative values which  
doesnt't make sense in interpretation. (The values of X represent  
certain events which can only have a positive or none effect on the  
value of Y.)


Now I want to use the constraint b_i >=0 in my model, but I don't know  
how I can do this.


I also thought of analyzing this model in a Bayesian way, but yet I  
haven't found a Bayesian version of glm() for the Poisson distribution  
such that I can specify the prior for the b_i on my own. (Then I could  
include the positivity in the prior.) Do you have a hint for me?


I'm sorry if I went too much into detail now...
I hope you understand my point now and have some answers for me!

Best,
Mara


Zitat von David Winsemius :




On Jan 22, 2016, at 7:01 AM, mara.pfleide...@uni-ulm.de wrote:

Hi all,

I am dealing with a problem about my linear Poisson regression
model (link function=identity).


I am using the glm()-function which results in negative
coefficients, but a negative influence of the regressors wouldn't
make sense.


Negative coefficients merely indicate a lower relative rate. You
need to be more specific about the exactly data and model output
before you can raise our concern to a level where further comment
can be made.





(i) Is there a possibility to set constraints on the regression
parameters in glm() such that all coefficients are positive? Or is   
 there another function in R for which this is possible?


(ii) Is there a Bayesian version of the glm()-function where I can   
 specify the prior distribution for my regression parameters? (e.g.  
  a Dirichlet prior s.t. the parameters are positive)


All this with respect to the linear Poisson model...


As I implied above, the word "linear" means something different than  
  "additive" when the link is log().


--

David Winsemius
Alameda, CA, USA




__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Constrained Poisson model / Bayesian Poisson model

2016-01-22 Thread mara . pfleiderer

Hi all,

I am dealing with a problem about my linear Poisson regression model  
(link function=identity).


I am using the glm()-function which results in negative coefficients,  
but a negative influence of the regressors wouldn't make sense.


(i) Is there a possibility to set constraints on the regression  
parameters in glm() such that all coefficients are positive? Or is  
there another function in R for which this is possible?


(ii) Is there a Bayesian version of the glm()-function where I can  
specify the prior distribution for my regression parameters? (e.g. a  
Dirichlet prior s.t. the parameters are positive)


All this with respect to the linear Poisson model...

Thanks in advance!
Best,
Mara

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constrained Poisson model / Bayesian Poisson model

2016-01-22 Thread David Winsemius

> On Jan 22, 2016, at 7:01 AM, mara.pfleide...@uni-ulm.de wrote:
> 
> Hi all,
> 
> I am dealing with a problem about my linear Poisson regression model (link 
> function=identity).
> 
> I am using the glm()-function which results in negative coefficients, but a 
> negative influence of the regressors wouldn't make sense.

Negative coefficients merely indicate a lower relative rate. You need to be 
more specific about the exactly data and model output before you can raise our 
concern to a level where further comment can be made.


> 
> (i) Is there a possibility to set constraints on the regression parameters in 
> glm() such that all coefficients are positive? Or is there another function 
> in R for which this is possible?
> 
> (ii) Is there a Bayesian version of the glm()-function where I can specify 
> the prior distribution for my regression parameters? (e.g. a Dirichlet prior 
> s.t. the parameters are positive)
> 
> All this with respect to the linear Poisson model...

As I implied above, the word "linear" means something different than "additive" 
when the link is log().

--

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.