Re: Help with Bayesian Linear Regression

Herman Rubin Thu, 13 Nov 2003 13:52:44 -0800

In article <[EMAIL PROTECTED]>,
Alf <[EMAIL PROTECTED]> wrote:
>Hi,


>I'm a bit confused about this and would like to ask a few questions:

>When can we use an analytical method and when must we use sampling
>simulation?

Use analytic methods whenever possible, and even use them in
conjunction with simulation.

>For example, I have an introductory Statistics book and it provides formulae
>(though with no real explanation or examples), for calculating the posterior
>means, SDs and CIs given the prior (data) and actual data.

>These formulae are:

>mean b = (n0*b0 + n*b)/(n0+n)

>SE b = (sigma/sigmaX)/sqrt(n0+n)

>95% CI b = (n0*b0 + n*b)/(n0+n) +- 1.96*(SE b)

>where the suffix 0 specifies prior data
>and n0 = (sigma/sigmaX)^2/sigma0^2 is called the size of the "quasi-sample".

>So, if I have prior data, can I just perform a linear regression on this
>data to get b0 and sigma0, and again on the test data, then combine using
>the above formulae to get the posterior value for b?

>Can I extend this to a quadratic example, but still linear in the parameters
>like

>y=a+(b1)x+(b2)x^2

>and use the same formulae for a, b1 and b2 to achieve the posterior values
>for the parameters?

I will assume that you know linear algebra; without that,
things are at best clumsy.  Also, I will use the standard
notation for multivariate statistics, partially explained
here.  I hope the abbreviated Usenet TeX notation is not
too difficult.

The standard linear model is 

        Y_t = \sum_j X_tj*\theta_j + U_t,

t denoting the observation, where X_tj are the values of
the explanatory variables, and U_t are the random errors,
which are independent of the X's.  In your quadratic model
above, X_t1 = 1, X_t2 = x_t, X_t3 = x_t^2, etc.
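As a minimal sketch of that construction (Python with NumPy is an
assumption here, and the function name quadratic_design is
hypothetical), the columns of X for the quadratic model could be
built like this:

```python
import numpy as np

def quadratic_design(x):
    """Return the design matrix with columns 1, x, x^2 (one row per observation)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])

# Four observations of the explanatory variable x
X = quadratic_design([0.0, 1.0, 2.0, 3.0])
```

The model stays linear in \theta even though it is quadratic in x,
which is why the same matrix machinery applies.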

Writing this in matrix form, it becomes

        Y = X*\theta + U,

and we will use this.  The usual least squares solution
is

(1)     \theta~ = (X'*X)^{-1}*X'*Y

and \theta~ is (normally, if U is normal) distributed
with mean \theta and covariance matrix v*(X'*X)^{-1},
where v is the variance of the U's.  For simplicity,
this will be assumed known in the sequel, else things
get more complicated, and numerical methods may have
to be used to deal with the prior for v.  These methods
may or may not include simulation.
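A small numerical sketch of the least squares estimate
(X'*X)^{-1}*X'*Y and its covariance matrix v*(X'*X)^{-1}, with v
treated as known.  The function name least_squares and the toy data
are illustrative assumptions, not part of the original discussion.

```python
import numpy as np

def least_squares(X, Y, v):
    """Least squares estimate and its covariance, with error variance v known."""
    XtX = X.T @ X
    theta_hat = np.linalg.solve(XtX, X.T @ Y)  # (X'X)^{-1} X'Y without forming the inverse
    cov = v * np.linalg.inv(XtX)               # v (X'X)^{-1}
    return theta_hat, cov

# Noise-free toy data generated from y = 1 + 2x + 3x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x, x**2])
Y = 1.0 + 2.0 * x + 3.0 * x**2

theta_hat, cov = least_squares(X, Y, v=1.0)
```

Because the toy data contain no noise, theta_hat reproduces the
generating coefficients up to rounding; with real data the diagonal
of cov gives the squared standard errors.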

For the Bayesian approach, assume that \theta is
normally distributed with mean 0 (the mean can be
corrected for) and covariance matrix v*S.  Then the
posterior distribution of \theta is again normal; its
mean is given by replacing X'*X by X'*X+S^{-1} in (1),
and its covariance matrix by making the same replacement
in v*(X'*X)^{-1}.
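The update can be sketched numerically as follows.  This is an
illustration under stated assumptions: the function name posterior is
hypothetical, and the optional prior mean m is an addition that goes
beyond the mean-0 case above (it uses the standard conjugate form
(X'*X+S^{-1})^{-1}*(X'*Y+S^{-1}*m), which reduces to the mean-0
result when m = 0).

```python
import numpy as np

def posterior(X, Y, v, S, m=None):
    """Posterior mean and covariance of theta for the prior N(m, v*S), v known."""
    p = X.shape[1]
    m = np.zeros(p) if m is None else np.asarray(m, dtype=float)
    S_inv = np.linalg.inv(S)
    A = X.T @ X + S_inv                             # X'X replaced by X'X + S^{-1}
    mean = np.linalg.solve(A, X.T @ Y + S_inv @ m)  # posterior mean
    cov = v * np.linalg.inv(A)                      # posterior covariance
    return mean, cov

# Quadratic model y = a + b1*x + b2*x^2 with a mean-0 prior (toy values)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x, x**2])
Y = 1.0 + 2.0 * x + 3.0 * x**2
S = 10.0 * np.eye(3)   # prior covariance is v*S
post_mean, post_cov = posterior(X, Y, v=1.0, S=S)
```

With a single centered regressor, and taking sigmaX^2 to be the mean
of x_t^2, this appears to reduce to the quoted textbook formulae:
X'*X = n*sigmaX^2 and S^{-1} = sigma^2/sigma0^2, so dividing through
by sigmaX^2 gives the posterior mean (n0*b0 + n*b)/(n0+n) and the
posterior standard deviation (sigma/sigmaX)/sqrt(n0+n).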

-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558
.
.
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
