On Sunday, August 31, 2014 1:30:32 PM UTC-5, Bradley Setzler wrote:
>
> Thank you Adam, this works.
>
> Let me suggest that this information be included in the GLM documentation:
>
> To fit a GLM model, use the function,
> glm(formula, data, family, link), 
> where,
> - formula uses column symbols from the DataFrame data, e.g., if 
> names(data)=[:Y,:X], then a valid formula is Y~X;
> - data is a DataFrame which may contain NA values; rows with NA values 
> will be ignored (apparently);
> - family may be chosen from Binomial(), Gamma(), Normal(), or Poisson(), 
> and the parentheses are required; and,
> - link may be chosen from the list in the GLM documentation, such as 
> LogitLink(), and again the parentheses are required. For some families, a 
> default link is available so the link argument may be left blank.
>

It would be more accurate to say that if the link argument is omitted 
("left blank" is ambiguous) the canonical link is used.  A distribution 
from the exponential family (en.wikipedia.org/wiki/Exponential_family) has 
a canonical link function derived from the probability mass or probability 
density function.  Because it is difficult to distinguish between models 
fit using the same distribution but different links, it is uncommon to use 
non-canonical links.  What I am trying to say is that the canonical link is 
more than an arbitrarily chosen default.
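
To make "derived from the probability mass or probability density function" 
concrete, here is a sketch using the usual textbook parameterization of a 
one-parameter exponential family.  The symbols theta, phi, a, b and c are 
the standard ones from the GLM literature, not anything defined in this 
thread, and the four cases listed are the families named in the quoted 
message.

```latex
% Density (or mass) function in natural form, with natural parameter \theta
\[
  f(y;\theta,\phi) = \exp\!\left\{ \frac{y\,\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\},
  \qquad \mu = \mathrm{E}[Y] = b'(\theta).
\]
% The canonical link g is the function that maps the mean to the natural parameter
\[
  g(\mu) = \theta \quad\Longleftrightarrow\quad g = (b')^{-1}.
\]
% Worked cases
\begin{align*}
  \text{Poisson:}  \quad & b(\theta) = e^{\theta},            & g(\mu) &= \log \mu \\
  \text{Binomial:} \quad & b(\theta) = \log(1 + e^{\theta}),  & g(\mu) &= \log\frac{\mu}{1-\mu} \\
  \text{Normal:}   \quad & b(\theta) = \theta^{2}/2,          & g(\mu) &= \mu \\
  \text{Gamma:}    \quad & b(\theta) = -\log(-\theta),        & g(\mu) &= -1/\mu
    \quad \text{(conventionally written as } 1/\mu \text{)}
\end{align*}
```

In each case the link falls out of b alone, which is why it is determined 
by the distribution rather than picked by convention.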

It is unfortunate that the names "Poisson regression", "Logistic 
regression" and "Probit regression" existed before Nelder and 
Wedderburn came up with a unifying framework for such models.  These names 
refer to three different aspects of the model: "Poisson" refers to the 
distribution, "Logistic" to the inverse link function and "Probit" to the 
link.
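
Written out, with eta for the linear predictor and Phi for the standard 
normal distribution function (my notation, chosen only for this 
illustration), the three names pick out these pieces:

```latex
\begin{align*}
  \text{``Poisson'' (distribution):}        \quad & Y \sim \mathrm{Poisson}(\mu) \\
  \text{``Logistic'' (inverse link):}       \quad & \mu = g^{-1}(\eta) = \frac{1}{1 + e^{-\eta}},
      \quad \text{whose link is the logit } g(\mu) = \log\frac{\mu}{1-\mu} \\
  \text{``Probit'' (link):}                 \quad & g(\mu) = \Phi^{-1}(\mu),
      \quad \text{whose inverse link is } \mu = \Phi(\eta)
\end{align*}
```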

Of course most statistics nomenclature is badly botched so this 
inconsistency should not be a surprise.

The parentheses after the name create an instance of a distribution 
type or of a link type for the purposes of dispatch.  It should be possible 
to dispatch on a DataType as well (i.e., you could write Poisson instead of 
Poisson()).  I took a look at the sources, but I have lost track of the 
changes relative to the original design and am not sure where that change 
would be made now.
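
As a rough illustration of the difference between dispatching on an 
instance and dispatching on a type, here is a small self-contained sketch.  
The names ToyDistribution, ToyPoisson, ToyBinomial and canonical_link_name 
are hypothetical and exist only for this example; this is not the GLM or 
Distributions source, and it uses current Julia syntax (struct), which is 
newer than the release discussed in this thread.

```julia
# Hypothetical stand-ins for distribution types (not the GLM.jl or
# Distributions.jl definitions).
abstract type ToyDistribution end
struct ToyPoisson <: ToyDistribution end
struct ToyBinomial <: ToyDistribution end

# Methods that dispatch on an instance; this is why the call site
# has to write ToyPoisson() with parentheses.
canonical_link_name(::ToyPoisson) = "log"
canonical_link_name(::ToyBinomial) = "logit"

# One extra method that dispatches on the type itself and forwards to the
# instance methods. It assumes the type has a zero-argument constructor.
canonical_link_name(::Type{D}) where {D<:ToyDistribution} = canonical_link_name(D())

println(canonical_link_name(ToyPoisson()))  # dispatch on an instance
println(canonical_link_name(ToyBinomial))   # dispatch on the type
```

The forwarding method is the whole change in this toy version.  Whether 
it is that simple in the real package depends on how the family argument is 
threaded through fit, which I have not checked.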
 

> On Sunday, August 31, 2014 12:56:19 PM UTC-5, Adam Kapor wrote:
>
>> This works for me:
>>
>> ```
>> julia> fit(GeneralizedLinearModel,Y~X,data,Binomial(),ProbitLink())
>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>> Coefficients:
>>                 Estimate Std.Error     z value Pr(>|z|)
>> (Intercept)     0.430727   1.98019    0.217518   0.8278
>> X            2.37745e-17   0.91665 2.59362e-17   1.0000
>>
>> julia> fit(GeneralizedLinearModel,Y~X,data,Binomial(),LogitLink())
>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>> Coefficients:
>>                  Estimate Std.Error      z value Pr(>|z|)
>> (Intercept)      0.693147   3.24037      0.21391   0.8306
>> X            -7.44332e-17       1.5 -4.96221e-17   1.0000
>> ```
>>
>> On Sunday, August 31, 2014 1:27:15 PM UTC-4, Bradley Setzler wrote:
>>>
>>> Has anyone successfully performed probit or logit regression in Julia? 
>>> The GLM documentation <https://github.com/JuliaStats/GLM.jl> does not 
>>> provide a generalizable example of how to use glm(). It gives a Poisson 
>>> example without any suggestion of how to switch from Poisson to some other 
>>> type.
>>>
>>> *Using the Poisson example from GLM documentation works:*
>>>
>>> julia> X = [1;2;3.]
>>> julia> Y = [1;0;1.]
>>> julia> data = DataFrame(X=X,Y=Y)
>>> julia> fit(GeneralizedLinearModel, Y ~ X,data, Poisson())
>>> DataFrameRegressionModel{GeneralizedLinearModel,Float64}:
>>> Coefficients:
>>>                 Estimate Std.Error      z value Pr(>|z|)
>>> (Intercept)    -0.405465   1.87034    -0.216787   0.8284
>>> X           -3.91448e-17    0.8658 -4.52123e-17   1.0000
>>>
>>> *But does not generalize:*
>>>
>>> julia> fit(GeneralizedLinearModel, Y ~ X ,data, Logit()) 
>>> ERROR: Logit not defined
>>>
>>> julia> fit(GeneralizedLinearModel, Y ~ X, data, link=:ProbitLink) 
>>> ERROR: `fit` has no method matching fit(::Type{GeneralizedLinearModel}, 
>>> ::Array{Float64,2}, ::Array{Float64,1})
>>>
>>> julia> fit(GeneralizedLinearModel, Y ~ X, data, 
>>> family="binomial",link="probit") 
>>> ERROR: `fit` has no method matching fit(::Type{GeneralizedLinearModel}, 
>>> ::Array{Float64,2}, ::Array{Float64,1})
>>>
>>> ....and a dozen other similar attempts fail. 
>>>
>>>
>>> Thanks,
>>> Bradley
>>>
>>>
