Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread William Dunlap via R-help
I like using a logical response in cases like this, but put its
construction in the formula so it is unambiguous when I look at the
results later.
> d <- data.frame(Covid=c("Pos","Pos","Neg","Pos","Neg","Neg"), Age=41:46)
> glm(family=binomial, data=d, Covid=="Pos"~Age)

Call:  glm(formula = Covid == "Pos" ~ Age, family = binomial, data = d)

Coefficients:
(Intercept)          Age
     52.810       -1.214

Degrees of Freedom: 5 Total (i.e. Null);  4 Residual
Null Deviance:      8.318
Residual Deviance: 4.956    AIC: 8.956
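
A minimal sketch of why spelling out the response in the formula helps, assuming the same toy data frame d as above (the names fit_logical, fit_factor and fit_flipped are only illustrative). With a logical response the modeled event is TRUE; with a factor response it is "anything but the first level", so the level order decides the sign of the coefficients:

```r
# Same toy data as in the example above.
d <- data.frame(Covid = c("Pos", "Pos", "Neg", "Pos", "Neg", "Neg"), Age = 41:46)

# Logical response: TRUE ("Pos") is the modeled event.
fit_logical <- glm(Covid == "Pos" ~ Age, family = binomial, data = d)

# Factor response with "Neg" as first level: "Pos" is again the modeled event.
fit_factor <- glm(factor(Covid, levels = c("Neg", "Pos")) ~ Age,
                  family = binomial, data = d)

# Factor response with the levels reversed: now "Neg" is the modeled event,
# so every coefficient changes sign.
fit_flipped <- glm(factor(Covid, levels = c("Pos", "Neg")) ~ Age,
                   family = binomial, data = d)

same_fit  <- all.equal(unname(coef(fit_logical)), unname(coef(fit_factor)))
sign_flip <- all.equal(unname(coef(fit_flipped)), -unname(coef(fit_logical)),
                       tolerance = 1e-6)
print(same_fit)
print(sign_flip)
```

The formula in the Call: line then documents which outcome was modeled, without having to look up the factor levels.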


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Aug 1, 2020 at 12:21 PM John Fox  wrote:
>
> Dear Paul,
>
> I think that this thread has gotten unnecessarily complicated. The
> answer, as is easily demonstrated, is that a binary response for a
> binomial GLM in glm() may be a factor, a numeric variable, or a logical
> variable, with identical results; for example:
>
> --- snip -
>
>  > set.seed(123)
>
>  > head(x <- rnorm(100))
> [1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499
>
>  > head(y <- rbinom(100, 1, 1/(1 + exp(-x))))
> [1] 0 1 1 1 1 0
>
>  > head(yf <- as.factor(y))
> [1] 0 1 1 1 1 0
> Levels: 0 1
>
>  > head(yl <- y == 1)
> [1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE
>
>  > glm(y ~ x, family=binomial)
>
> Call:  glm(formula = y ~ x, family = binomial)
>
> Coefficients:
> (Intercept)            x
>      0.3995       1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> Null Deviance:      134.6
> Residual Deviance: 114.9    AIC: 118.9
>
>  > glm(yf ~ x, family=binomial)
>
> Call:  glm(formula = yf ~ x, family = binomial)
>
> Coefficients:
> (Intercept)            x
>      0.3995       1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> Null Deviance:      134.6
> Residual Deviance: 114.9    AIC: 118.9
>
>  > glm(yl ~ x, family=binomial)
>
> Call:  glm(formula = yl ~ x, family = binomial)
>
> Coefficients:
> (Intercept)            x
>      0.3995       1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> Null Deviance:      134.6
> Residual Deviance: 114.9    AIC: 118.9
>
> --- snip -
>
> The original poster claimed to have encountered an error with a 0/1
> numeric response, but didn't show any data or even a command. I suspect
> that the response was a character variable, but of course can't really
> know that.
>
> Best,
>   John
>
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
> On 2020-08-01 2:25 p.m., Paul Bernal wrote:
> > Dear friend,
> >
> > I am aware that I have a binomial dependent variable, which is covid status
> > (1 if covid positive, and 0 otherwise).
> >
> > My question was if R requires to turn a binomial response variable into a
> > factor or not, that's all.
> >
> > Cheers,
> >
> > Paul
> >
> > On Sat, Aug 1, 2020 at 1:22 PM, Bert Gunter wrote:
> >
> >> ... yes, but so does lm() for a categorical **INdependent** variable with
> >> more than 2 numerically labeled levels. n levels  = (n-1) df for a
> >> categorical covariate, but 1 for a continuous one (unless more complex
> >> models are explicitly specified of course). As I said, the OP seems
> >> confused about whether he is referring to the response or covariates. Or
> >> maybe he just made the same typo I did.
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along and
> >> sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) <
> >> mal...@malonequantitative.com> wrote:
> >>
> >>> No, R does not. glm() does in order to do logistic regression.
> >>>
> >>> On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal 
> >>> wrote:
> >>>
> >>>> Hi Bert,
> >>>>
> >>>> Thank you for the kind reply.
> >>>>
> >>>> But what if I don't turn the variable into a factor. Let's say that in
> >>>> excel I just coded the variable as 1s and 0s and just imported the dataset
> >>>> into R and fitted the logistic regression without turning any categorical
> >>>> variable or dummy variable into a factor?
> >>>>
> >>>> Does R requires every dummy variable to be treated as a factor?
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Paul
> >>>>
> >>>> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
> >>>> bgunter.4...@gmail.com> wrote:
> >>>>
> >>>> > x <- factor(0:1)
> >>>> > x <- factor(c("yes","no"))
> >>>> >
> >>>> > will produce identical results up to labeling.
> >>>> >
> >>>> >
> >>>> > Bert Gunter
> >>>> >
> >>>> > "The trouble with having an open mind is that people keep coming along and
> >>>> > sticking things into it."
> >>>> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>> >
> >>>> >
> >>>> > On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal
> >>>> > wrote:
> >>>> >
> >>>> >> Dear friends,
> >>>> >>
> >>>> >>

Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Rui Barradas

Hello,

Inline.

At 20:01 on 01/08/2020, John Fox wrote:

Dear Paul,

I think that this thread has gotten unnecessarily complicated. The 
answer, as is easily demonstrated, is that a binary response for a 
binomial GLM in glm() may be a factor, a numeric variable, or a 
logical variable, with identical results; for example:


--- snip -

> set.seed(123)

> head(x <- rnorm(100))
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499


> head(y <- rbinom(100, 1, 1/(1 + exp(-x))))
[1] 0 1 1 1 1 0

> head(yf <- as.factor(y))
[1] 0 1 1 1 1 0
Levels: 0 1

> head(yl <- y == 1)
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE

> glm(y ~ x, family=binomial)

Call:  glm(formula = y ~ x, family = binomial)

Coefficients:
(Intercept)    x
 0.3995   1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:    134.6
Residual Deviance: 114.9 AIC: 118.9

> glm(yf ~ x, family=binomial)

Call:  glm(formula = yf ~ x, family = binomial)

Coefficients:
(Intercept)    x
 0.3995   1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:    134.6
Residual Deviance: 114.9 AIC: 118.9

> glm(yl ~ x, family=binomial)

Call:  glm(formula = yl ~ x, family = binomial)

Coefficients:
(Intercept)    x
 0.3995   1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:    134.6
Residual Deviance: 114.9 AIC: 118.9

--- snip -

The original poster claimed to have encountered an error with a 0/1 
numeric response, but didn't show any data or even a command. I 
suspect that the response was a character variable, but of course 
can't really know that.


So continuing with your example:

> head(yc <- as.character(y))
[1] "0" "1" "1" "1" "1" "0"
> glm(yc ~ x, family=binomial)
Error in weights * y : non-numeric argument to binary operator


But the OP says that

[...] R complains that I should make the dependent variable a factor.

That is not what the error message says; it "asks" for a numeric 
argument to the '*' operator.
We haven't seen the exact R message yet, so, like others have said, the 
OP should post it along with code.
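
A hedged sketch of how one might check for this case and work around it, assuming the response really did arrive as character (the thread never confirms that). It reuses the simulated x and y from the example above; res, fit_num and fit_fac are illustrative names only:

```r
set.seed(123)
x  <- rnorm(100)
y  <- rbinom(100, 1, 1/(1 + exp(-x)))
yc <- as.character(y)          # simulate a response imported as text

# First step when glm() complains: look at what the response actually is.
print(class(yc))               # "character"

# Per the transcript above, this call fails in the R version used there;
# try() keeps the script running whatever the installed version does.
res <- try(glm(yc ~ x, family = binomial), silent = TRUE)
print(inherits(res, "try-error"))

# Either conversion gives glm() a response type it documents as supported.
fit_num <- glm(as.numeric(yc) ~ x, family = binomial)
fit_fac <- glm(factor(yc) ~ x, family = binomial)

same_coef <- all.equal(unname(coef(fit_num)), unname(coef(fit_fac)))
print(same_coef)
```

The str() of the imported data frame would show the same thing at a glance for every column.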


Hope this helps,

Rui Barradas



Best,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Patrick (Malone Quantitative)
I didn't mean to imply that was the only time that it was required, only
that it's not universal in R.

On Sat, Aug 1, 2020 at 2:22 PM Bert Gunter  wrote:

> ... yes, but so does lm() for a categorical **INdependent** variable with
> more than 2 numerically labeled levels. n levels  = (n-1) df for a
> categorical covariate, but 1 for a continuous one (unless more complex
> models are explicitly specified of course). As I said, the OP seems
> confused about whether he is referring to the response or covariates. Or
> maybe he just made the same typo I did.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) <
> mal...@malonequantitative.com> wrote:
>
>> No, R does not. glm() does in order to do logistic regression.
>>
>> On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal 
>> wrote:
>>
>>> Hi Bert,
>>>
>>> Thank you for the kind reply.
>>>
>>> But what if I don't turn the variable into a factor. Let's say that in
>>> excel I just coded the variable as 1s and 0s and just imported the
>>> dataset
>>> into R and fitted the logistic regression without turning any categorical
>>> variable or dummy variable into a factor?
>>>
>>> Does R requires every dummy variable to be treated as a factor?
>>>
>>> Best regards,
>>>
>>> Paul
>>>
>>> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
>>> bgunter.4...@gmail.com> wrote:
>>>
>>> > x <- factor(0:1)
>>> > x <- factor(c("yes","no"))
>>> >
>>> > will produce identical results up to labeling.
>>> >
>>> >
>>> > Bert Gunter
>>> >
>>> > "The trouble with having an open mind is that people keep coming along
>>> and
>>> > sticking things into it."
>>> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>> >
>>> >
>>> > On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal 
>>> > wrote:
>>> >
>>> >> Dear friends,
>>> >>
>>> >> Hope you are doing great. I want to fit a logistic regression in R,
>>> where
>>> >> the dependent variable is the covid status (I used 1 for covid
>>> positives,
>>> >> and 0 for covid negatives), but when I ran the glm, R complains that I
>>> >> should make the dependent variable a factor.
>>> >>
>>> >> What would be more advisable, to keep the dependent variable with 1s
>>> and
>>> >> 0s, or code it as yes/no and then make it a factor?
>>> >>
>>> >> Any guidance will be greatly appreciated,
>>> >>
>>> >> Best regards,
>>> >>
>>> >> Paul
>>> >>
>>> >> [[alternative HTML version deleted]]
>>> >>
>>> >> __
>>> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> PLEASE do read the posting guide
>>> >> http://www.R-project.org/posting-guide.html
>>> >> and provide commented, minimal, self-contained, reproducible code.
>>> >>
>>> >
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>> --
>> Patrick S. Malone, Ph.D., Malone Quantitative
>> NEW Service Models: http://malonequantitative.com
>>
>> He/Him/His
>>
>

-- 
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Rui Barradas

Hello,

From the documentation, help('glm'):


 Details

A typical predictor has the form 'response ~ terms' where 'response' is the
(numeric) response vector and 'terms' is a series of terms which specifies
a linear predictor for 'response'.
For 'binomial' and 'quasibinomial' families the response can also be
specified as a 'factor' (when the first level
denotes failure and all others success) or as a two-column matrix with
the columns giving the numbers of successes and failures. A terms
specification of the form 'first + second' indicates all the terms
in 'first' together with all the terms in 'second' with any duplicates removed.



There is no need for the response to be a factor; it is optional, and the
wording is very clear:


"For 'binomial' and 'quasibinomial' families the response *can* also be
specified as a 'factor'"


And with binary, numeric responses I cannot reproduce the warning 
message, the models fit silently.
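
As an illustration of the two alternative response forms the help page mentions, here is a small sketch on simulated data (the data and the names yf, fit_n, fit_f and fit_m are assumptions made for the example, not anything from the original question):

```r
set.seed(123)
x <- rnorm(100)
y <- rbinom(100, 1, 1/(1 + exp(-x)))

# Plain 0/1 numeric response.
fit_n <- glm(y ~ x, family = binomial)

# Factor response: the first level ("no") denotes failure, all others success.
yf <- factor(ifelse(y == 1, "yes", "no"), levels = c("no", "yes"))
fit_f <- glm(yf ~ x, family = binomial)

# Two-column matrix response: cbind(successes, failures), one trial per row here.
fit_m <- glm(cbind(y, 1 - y) ~ x, family = binomial)

eq_factor <- all.equal(coef(fit_f), coef(fit_n))
eq_matrix <- all.equal(coef(fit_m), coef(fit_n))
print(eq_factor)
print(eq_matrix)
```

All three calls describe the same likelihood, so the coefficient estimates should agree.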



Hope this helps,

Rui Barradas




At 18:39 on 01/08/2020, Paul Bernal wrote:

Dear friends,

Hope you are doing great. I want to fit a logistic regression in R, where
the dependent variable is the covid status (I used 1 for covid positives,
and 0 for covid negatives), but when I ran the glm, R complains that I
should make the dependent variable a factor.

What would be more advisable, to keep the dependent variable with 1s and
0s, or code it as yes/no and then make it a factor?

Any guidance will be greatly appreciated,

Best regards,

Paul



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread John Fox

Dear Paul,

I think that this thread has gotten unnecessarily complicated. The 
answer, as is easily demonstrated, is that a binary response for a 
binomial GLM in glm() may be a factor, a numeric variable, or a logical 
variable, with identical results; for example:


--- snip -

> set.seed(123)

> head(x <- rnorm(100))
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499

> head(y <- rbinom(100, 1, 1/(1 + exp(-x))))
[1] 0 1 1 1 1 0

> head(yf <- as.factor(y))
[1] 0 1 1 1 1 0
Levels: 0 1

> head(yl <- y == 1)
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE

> glm(y ~ x, family=binomial)

Call:  glm(formula = y ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.3995       1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      134.6
Residual Deviance: 114.9    AIC: 118.9

> glm(yf ~ x, family=binomial)

Call:  glm(formula = yf ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.3995       1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      134.6
Residual Deviance: 114.9    AIC: 118.9

> glm(yl ~ x, family=binomial)

Call:  glm(formula = yl ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.3995       1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      134.6
Residual Deviance: 114.9    AIC: 118.9

--- snip -

The original poster claimed to have encountered an error with a 0/1 
numeric response, but didn't show any data or even a command. I suspect 
that the response was a character variable, but of course can't really 
know that.
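
For anyone who prefers a programmatic check to comparing printouts by eye, the three fits above can be collected into one table. This is only a sketch; fits and summary_table are illustrative names, and the data are regenerated with the same seed as in the transcript:

```r
set.seed(123)
x  <- rnorm(100)
y  <- rbinom(100, 1, 1/(1 + exp(-x)))
yf <- as.factor(y)
yl <- y == 1

# One fit per response type.
fits <- list(
  numeric = glm(y  ~ x, family = binomial),
  factor  = glm(yf ~ x, family = binomial),
  logical = glm(yl ~ x, family = binomial)
)

# One row per fit: coefficients, residual deviance and AIC.
summary_table <- t(sapply(fits, function(f) {
  c(coef(f), deviance = deviance(f), AIC = AIC(f))
}))
print(summary_table)

# Largest absolute difference between any two rows of the table.
max_diff <- max(abs(sweep(summary_table, 2, summary_table[1, ])))
print(max_diff)
```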


Best,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Bert Gunter
You appear to be confusing a binomial **response** with categorical
"dependent variables." glm() of course fits continuous or categorical
dependent variables. If a continuous dependent variable has only 2 values,
the results for glm() will be the same whether or not it is considered to
be continuous or categorical, though you may not recognize it as such.

This discussion has already wandered off topic to statistical issues. I
will not comment further on or off list. I suggest you consult a good
reference on linear/generalized linear models or talk with a local
statistician.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Aug 1, 2020 at 11:04 AM Paul Bernal  wrote:

> Hi Bert,
>
> Thank you for the kind reply.
>
> But what if I don't turn the variable into a factor. Let's say that in
> excel I just coded the variable as 1s and 0s and just imported the dataset
> into R and fitted the logistic regression without turning any categorical
> variable or dummy variable into a factor?
>
> Does R requires every dummy variable to be treated as a factor?
>
> Best regards,
>
> Paul
>
> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
> bgunter.4...@gmail.com> wrote:
>
>> x <- factor(0:1)
>> x <- factor(c("yes","no"))
>>
>> will produce identical results up to labeling.
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal 
>> wrote:
>>
>>> Dear friends,
>>>
>>> Hope you are doing great. I want to fit a logistic regression in R, where
>>> the dependent variable is the covid status (I used 1 for covid positives,
>>> and 0 for covid negatives), but when I ran the glm, R complains that I
>>> should make the dependent variable a factor.
>>>
>>> What would be more advisable, to keep the dependent variable with 1s and
>>> 0s, or code it as yes/no and then make it a factor?
>>>
>>> Any guidance will be greatly appreciated,
>>>
>>> Best regards,
>>>
>>> Paul
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Patrick (Malone Quantitative)
No, R does not. glm() does in order to do logistic regression.

On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal  wrote:

> Hi Bert,
>
> Thank you for the kind reply.
>
> But what if I don't turn the variable into a factor. Let's say that in
> excel I just coded the variable as 1s and 0s and just imported the dataset
> into R and fitted the logistic regression without turning any categorical
> variable or dummy variable into a factor?
>
> Does R requires every dummy variable to be treated as a factor?
>
> Best regards,
>
> Paul
>
> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
> bgunter.4...@gmail.com> wrote:
>
> > x <- factor(0:1)
> > x <- factor(c("yes","no"))
> >
> > will produce identical results up to labeling.
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal 
> > wrote:
> >
> >> Dear friends,
> >>
> >> Hope you are doing great. I want to fit a logistic regression in R,
> where
> >> the dependent variable is the covid status (I used 1 for covid
> positives,
> >> and 0 for covid negatives), but when I ran the glm, R complains that I
> >> should make the dependent variable a factor.
> >>
> >> What would be more advisable, to keep the dependent variable with 1s and
> >> 0s, or code it as yes/no and then make it a factor?
> >>
> >> Any guidance will be greatly appreciated,
> >>
> >> Best regards,
> >>
> >> Paul
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Bert Gunter
... and further:
" If a continuous independent variable has only 2 values,..."

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Aug 1, 2020 at 11:11 AM Bert Gunter  wrote:

> You appear to be confusing a binomial **response** with categorical
> "dependent variables." glm() of course fits continuous or categorical
> dependent variables. If a continuous dependent variable has only 2 values,
> the results for glm() will be the same whether or not it is considered to
> be continuous or categorical, though you may not recognize it as such.
>
> This discussion has already wandered off topic to statistical issues. I
> will not comment further on or off list. I suggest you consult a good
> reference on linear/generalized linear models or talk with a local
> statistician.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Aug 1, 2020 at 11:04 AM Paul Bernal 
> wrote:
>
>> Hi Bert,
>>
>> Thank you for the kind reply.
>>
>> But what if I don't turn the variable into a factor. Let's say that in
>> excel I just coded the variable as 1s and 0s and just imported the dataset
>> into R and fitted the logistic regression without turning any categorical
>> variable or dummy variable into a factor?
>>
>> Does R requires every dummy variable to be treated as a factor?
>>
>> Best regards,
>>
>> Paul
>>
>> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
>> bgunter.4...@gmail.com> wrote:
>>
>>> x <- factor(0:1)
>>> x <- factor(c("yes","no"))
>>>
>>> will produce identical results up to labeling.
>>>
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal 
>>> wrote:
>>>
>>>> Dear friends,
>>>>
>>>> Hope you are doing great. I want to fit a logistic regression in R, where
>>>> the dependent variable is the covid status (I used 1 for covid positives,
>>>> and 0 for covid negatives), but when I ran the glm, R complains that I
>>>> should make the dependent variable a factor.
>>>>
>>>> What would be more advisable, to keep the dependent variable with 1s and
>>>> 0s, or code it as yes/no and then make it a factor?
>>>>
>>>> Any guidance will be greatly appreciated,
>>>>
>>>> Best regards,
>>>>
>>>> Paul
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> __
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Bert Gunter
Sorry, typo. My first sentences should read:

"You appear to be confusing a binomial **response** with categorical
"independent variables." glm() of course fits continuous or categorical
independent variables."
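
A small sketch of the corrected statement, on simulated data (everything here, including the names x2, x3, fit_num and fit_fac, is an assumption made for illustration). A 0/1 covariate gives the same fit whether it enters as numeric or as a factor; with more than two levels the two codings differ, as mentioned elsewhere in the thread:

```r
set.seed(42)
n  <- 200
x2 <- rbinom(n, 1, 0.5)                       # covariate with only two values, 0 and 1
y  <- rbinom(n, 1, plogis(-0.5 + 1.2 * x2))   # binary response

# Two-valued covariate: numeric and factor codings give the same design matrix.
fit_num <- glm(y ~ x2, family = binomial)
fit_fac <- glm(y ~ factor(x2), family = binomial)
same_coef <- all.equal(unname(coef(fit_num)), unname(coef(fit_fac)))
same_dev  <- all.equal(deviance(fit_num), deviance(fit_fac))
print(same_coef)
print(same_dev)

# Three numerically labeled levels: one slope if numeric, two dummies if factor.
x3 <- sample(1:3, n, replace = TRUE)
n_coef_numeric <- length(coef(glm(y ~ x3, family = binomial)))
n_coef_factor  <- length(coef(glm(y ~ factor(x3), family = binomial)))
print(c(numeric = n_coef_numeric, factor = n_coef_factor))
```

Note that the equality of coefficients relies on the 0/1 coding; with values such as 1/2 the fitted probabilities would still agree but the intercept would be parameterized differently.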

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Aug 1, 2020 at 11:11 AM Bert Gunter  wrote:

> You appear to be confusing a binomial **response** with categorical
> "dependent variables." glm() of course fits continuous or categorical
> dependent variables. If a continuous dependent variable has only 2 values,
> the results for glm() will be the same whether or not it is considered to
> be continuous or categorical, though you may not recognize it as such.
>
> This discussion has already wandered off topic to statistical issues. I
> will not comment further on or off list. I suggest you consult a good
> reference on linear/generalized linear models or talk with a local
> statistician.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Aug 1, 2020 at 11:04 AM Paul Bernal 
> wrote:
>
>> Hi Bert,
>>
>> Thank you for the kind reply.
>>
>> But what if I don't turn the variable into a factor. Let's say that in
>> excel I just coded the variable as 1s and 0s and just imported the dataset
>> into R and fitted the logistic regression without turning any categorical
>> variable or dummy variable into a factor?
>>
>> Does R requires every dummy variable to be treated as a factor?
>>
>> Best regards,
>>
>> Paul
>>
>> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
>> bgunter.4...@gmail.com> wrote:
>>
>>> x <- factor(0:1)
>>> x <- factor(c("yes","no"))
>>>
>>> will produce identical results up to labeling.
>>>
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal 
>>> wrote:
>>>
>>>> Dear friends,
>>>>
>>>> Hope you are doing great. I want to fit a logistic regression in R, where
>>>> the dependent variable is the covid status (I used 1 for covid positives,
>>>> and 0 for covid negatives), but when I ran the glm, R complains that I
>>>> should make the dependent variable a factor.
>>>>
>>>> What would be more advisable, to keep the dependent variable with 1s and
>>>> 0s, or code it as yes/no and then make it a factor?
>>>>
>>>> Any guidance will be greatly appreciated,
>>>>
>>>> Best regards,
>>>>
>>>> Paul
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> __
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Paul Bernal
Dear friend,

I am aware that I have a binomial dependent variable, which is covid status
(1 if covid positive, and 0 otherwise).

My question was whether R requires a binomial response variable to be
turned into a factor or not, that's all.

Cheers,

Paul

On Sat, Aug 1, 2020, 1:22 PM Bert Gunter wrote:

> ... yes, but so does lm() for a categorical **INdependent** variable with
> more than 2 numerically labeled levels: a categorical covariate with n
> levels uses (n-1) df, but a continuous one uses 1 (unless more complex
> models are explicitly specified, of course). As I said, the OP seems
> confused about whether he is referring to the response or covariates. Or
> maybe he just made the same typo I did.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) <
> mal...@malonequantitative.com> wrote:
>
>> No, R does not. glm() does in order to do logistic regression.
>>
>> On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal wrote:
>>
>>> Hi Bert,
>>>
>>> Thank you for the kind reply.
>>>
>>> But what if I don't turn the variable into a factor? Let's say that in
>>> Excel I just coded the variable as 1s and 0s, imported the dataset into R,
>>> and fitted the logistic regression without turning any categorical
>>> variable or dummy variable into a factor.
>>>
>>> Does R require every dummy variable to be treated as a factor?
>>>
>>> Best regards,
>>>
>>> Paul
>>
>>
>> --
>> Patrick S. Malone, Ph.D., Malone Quantitative
>> NEW Service Models: http://malonequantitative.com
>>
>> He/Him/His
>>
>



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Bert Gunter
... yes, but so does lm() for a categorical **INdependent** variable with
more than 2 numerically labeled levels: a categorical covariate with n
levels uses (n-1) df, but a continuous one uses 1 (unless more complex
models are explicitly specified, of course). As I said, the OP seems
confused about whether he is referring to the response or covariates. Or
maybe he just made the same typo I did.
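
The degrees-of-freedom point can be checked with a small simulated example.
This is only a sketch: the data and the names g, y and two are made up for
illustration and do not come from the thread.

```r
# Hypothetical data: a covariate with three numerically labeled levels.
set.seed(1)
g <- rep(1:3, each = 10)
y <- rnorm(30)

fit_num <- lm(y ~ g)          # g treated as continuous: a single slope
fit_fac <- lm(y ~ factor(g))  # g treated as categorical: 3 - 1 = 2 dummy coefficients

length(coef(fit_num))   # intercept + 1 slope
length(coef(fit_fac))   # intercept + 2 dummies
df.residual(fit_num)    # 30 - 2
df.residual(fit_fac)    # 30 - 3

# With only two distinct values, the numeric and factor codings give the same fit.
two <- rep(0:1, times = 15)
all.equal(fitted(lm(y ~ two)), fitted(lm(y ~ factor(two))))
```

The distinction only matters once a numerically coded covariate has more than
two levels; a 0/1 dummy can be left numeric without changing the fit.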

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) <
mal...@malonequantitative.com> wrote:

> No, R does not. glm() does in order to do logistic regression.
>
> On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal  wrote:
>
>> Hi Bert,
>>
>> Thank you for the kind reply.
>>
>> But what if I don't turn the variable into a factor? Let's say that in
>> Excel I just coded the variable as 1s and 0s, imported the dataset into R,
>> and fitted the logistic regression without turning any categorical
>> variable or dummy variable into a factor.
>>
>> Does R require every dummy variable to be treated as a factor?
>>
>> Best regards,
>>
>> Paul
>>
>
>
> --
> Patrick S. Malone, Ph.D., Malone Quantitative
> NEW Service Models: http://malonequantitative.com
>
> He/Him/His
>



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Paul Bernal
Hi Bert,

Thank you for the kind reply.

But what if I don't turn the variable into a factor? Let's say that in
Excel I just coded the variable as 1s and 0s, imported the dataset into R,
and fitted the logistic regression without turning any categorical
variable or dummy variable into a factor.

Does R require every dummy variable to be treated as a factor?

Best regards,

Paul

On Sat, Aug 1, 2020, 12:59 PM Bert Gunter <bgunter.4...@gmail.com> wrote:

> x <- factor(0:1)
> x <- factor(c("no", "yes"))
>
> will produce identical results up to labeling.
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal wrote:
>
>> Dear friends,
>>
>> Hope you are doing great. I want to fit a logistic regression in R, where
>> the dependent variable is the covid status (I used 1 for covid positives,
>> and 0 for covid negatives), but when I ran the glm, R complains that I
>> should make the dependent variable a factor.
>>
>> What would be more advisable, to keep the dependent variable with 1s and
>> 0s, or code it as yes/no and then make it a factor?
>>
>> Any guidance will be greatly appreciated,
>>
>> Best regards,
>>
>> Paul
>>
>



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Rich Shepard

On Sat, 1 Aug 2020, Paul Bernal wrote:


> Hope you are doing great. I want to fit a logistic regression in R, where
> the dependent variable is the covid status (I used 1 for covid positives,
> and 0 for covid negatives), but when I ran the glm, R complains that I
> should make the dependent variable a factor.
>
> What would be more advisable, to keep the dependent variable with 1s and
> 0s, or code it as yes/no and then make it a factor?


Paul,

1 or 0 are equivalent to yes or no, success or failure. All are nominal
variables, so all should be factors, regardless of the coding.

Rich
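
If the response is stored as a factor, the level order decides which outcome
glm() models: with a binomial family the first level is treated as failure
and every other level as success. A minimal sketch with simulated data (the
names age, pos and status are hypothetical, not from the thread):

```r
# Hypothetical data: infection status depending on age.
set.seed(42)
age <- round(runif(200, 20, 80))
pos <- rbinom(200, 1, plogis(-3 + 0.05 * age))

# "neg" is the first level, so glm() models the probability of "pos".
status <- factor(pos, levels = 0:1, labels = c("neg", "pos"))
fit_pos <- glm(status ~ age, family = binomial)

# Making "pos" the first level switches the modeled outcome to "neg".
fit_neg <- glm(relevel(status, ref = "pos") ~ age, family = binomial)

# Same model, coefficients with opposite signs.
all.equal(unname(coef(fit_pos)), -unname(coef(fit_neg)))
```

Checking levels(status) before fitting is a cheap way to confirm which
outcome the coefficients refer to.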



Re: [R] Dependent Variable in Logistic Regression

2020-08-01 Thread Bert Gunter
x <- factor(0:1)
x <- factor(c("no", "yes"))

will produce identical results up to labeling.
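
One way to check this claim is to fit the same binomial model twice, once
with a 0/1 factor and once with a no/yes factor. The sketch below uses
simulated data; the names x, y, f01 and fny are assumptions made for the
example only.

```r
# Hypothetical data: a 0/1 outcome that depends on x.
set.seed(123)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))

f01 <- factor(y)                                          # levels "0", "1"
fny <- factor(y, levels = 0:1, labels = c("no", "yes"))   # levels "no", "yes"

fit01 <- glm(f01 ~ x, family = binomial)
fitny <- glm(fny ~ x, family = binomial)

# Only the labels of the response differ, so the estimates agree.
all.equal(coef(fit01), coef(fitny))
```

The two fits agree because "no" and "0" are both the first level of their
factor, which glm() treats as failure.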


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal  wrote:

> Dear friends,
>
> Hope you are doing great. I want to fit a logistic regression in R, where
> the dependent variable is the covid status (I used 1 for covid positives,
> and 0 for covid negatives), but when I ran the glm, R complains that I
> should make the dependent variable a factor.
>
> What would be more advisable, to keep the dependent variable with 1s and
> 0s, or code it as yes/no and then make it a factor?
>
> Any guidance will be greatly appreciated,
>
> Best regards,
>
> Paul
>



[R] Dependent Variable in Logistic Regression

2020-08-01 Thread Paul Bernal
Dear friends,

Hope you are doing great. I want to fit a logistic regression in R, where
the dependent variable is the covid status (I used 1 for covid positives,
and 0 for covid negatives), but when I ran the glm, R complains that I
should make the dependent variable a factor.

What would be more advisable, to keep the dependent variable with 1s and
0s, or code it as yes/no and then make it a factor?

Any guidance will be greatly appreciated,

Best regards,

Paul
