Re: [R] Varying statistical significance in estimates of linear model

2013-08-09 Thread Stathis Kamperis
For archiving reasons:

1. Practical Regression and Anova using R by Faraway
2. Possible reason: multi-collinearity in predictor variables.
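
A rough way to check for the collinearity mentioned in point 2, using base R only. This is a sketch on simulated data, not the original data set: the variables x1, x2, x3 and all numeric values below are made up for illustration. The variance inflation factor is computed by hand as 1 / (1 - R^2), where R^2 comes from regressing each predictor on the others.

```r
# Simulated predictors (hypothetical; not the poster's data)
set.seed(42)                           # arbitrary seed, for reproducibility only
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)    # deliberately collinear with x1
x3 <- rnorm(n)                         # generated independently of x1 and x2
d  <- data.frame(x1, x2, x3)

# Pairwise correlations: a quick first look
print(round(cor(d), 2))

# Variance inflation factor for each predictor:
# regress it on all other predictors and use 1 / (1 - R^2)
vif <- sapply(names(d), function(v) {
  f  <- reformulate(setdiff(names(d), v), response = v)
  r2 <- summary(lm(f, data = d))$r.squared
  1 / (1 - r2)
})
print(round(vif, 2))
```

Values close to 1 suggest a predictor is nearly unrelated to the others; large values suggest its coefficient will be estimated imprecisely. Any cut-off for "large" is a convention, not a rule.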

Thanks everybody!

On Thu, Aug 8, 2013 at 1:43 PM, Stathis Kamperis <ekamp...@gmail.com> wrote:
> Hi everyone,
>
> I have a response variable 'y' and several predictor variables 'x_i'.
> I start with a linear model:
>
> m1 <- lm(y ~ x1); summary(m1)
>
> and I get a statistically significant estimate for 'x1'. Then, I
> modify my model as:
>
> m2 <- lm(y ~ x1 + x2); summary(m2)
>
> At this moment, the estimate for x1 might become non-significant while
> the estimate of x2 significant.
>
> As I add more predictor variables (or interaction terms), the
> estimates for which I get a statistically significant result vary. So
> sometimes x1, x2, x6 are significant, while others, x2, x4, x5 are.
>
> It seems to me that I could tweak my model in such a way (by
> adding/removing predictor variables or suitable interaction terms)
> that I could prove whatever I'd like to prove.
>
> What is the proper methodology involved here ? What do you people do
> in such cases ? I can provide the data if anyone cares and would like
> to have a look at them.
>
> Best regards,
> Stathis Kamperis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Varying statistical significance in estimates of linear model

2013-08-08 Thread Stathis Kamperis
Hi everyone,

I have a response variable 'y' and several predictor variables 'x_i'.
I start with a linear model:

m1 <- lm(y ~ x1); summary(m1)

and I get a statistically significant estimate for 'x1'. Then, I
modify my model as:

m2 <- lm(y ~ x1 + x2); summary(m2)

At this point, the estimate for x1 might become non-significant while
the estimate for x2 becomes significant.
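
The pattern described above can be reproduced with a small simulation. This is only a sketch: the data are simulated, not the poster's, and the sample size, coefficients and noise levels are arbitrary assumptions. Here x1 is built as a noisy copy of x2, and only x2 actually drives y.

```r
# Simulated data (hypothetical values chosen for illustration)
set.seed(1)                            # arbitrary seed, for reproducibility only
n  <- 100
x2 <- rnorm(n)
x1 <- x2 + rnorm(n, sd = 0.3)          # x1 is mostly a noisy copy of x2
y  <- 2 * x2 + rnorm(n)                # only x2 enters the true model

m1 <- lm(y ~ x1)                       # x1 alone stands in for x2
m2 <- lm(y ~ x1 + x2)                  # x1 and x2 compete for the same signal

c1 <- summary(m1)$coefficients
c2 <- summary(m2)$coefficients

# Standard error and p-value of x1 in each model
se_x1_alone <- c1["x1", "Std. Error"]
se_x1_joint <- c2["x1", "Std. Error"]
p_x1_alone  <- c1["x1", "Pr(>|t|)"]
p_x1_joint  <- c2["x1", "Pr(>|t|)"]

print(c(se_alone = se_x1_alone, se_joint = se_x1_joint))
print(c(p_alone = p_x1_alone, p_joint = p_x1_joint))
```

In m1 the coefficient of x1 picks up the effect of the omitted x2. Once x2 is in the model, the two correlated predictors share the same information, so the standard error of x1 grows and its p-value changes accordingly.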

As I add more predictor variables (or interaction terms), the
estimates for which I get a statistically significant result vary. So
sometimes x1, x2, x6 are significant, while at other times x2, x4, x5 are.

It seems to me that I could tweak my model in such a way (by
adding/removing predictor variables or suitable interaction terms)
that I could prove whatever I'd like to prove.

What is the proper methodology here? What do you people do
in such cases? I can provide the data if anyone would like
to have a look at them.

Best regards,
Stathis Kamperis



Re: [R] Varying statistical significance in estimates of linear model

2013-08-08 Thread ONKELINX, Thierry
Dear Stathis,

I recommend that you try to get some advice from a local statistician or read 
an introductory book on statistics. This kind of question is beyond the scope 
of a mailing list.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey




Re: [R] Varying statistical significance in estimates of linear model

2013-08-08 Thread Bert Gunter
Stathis:

1. This has nothing to do with R.  Post on a statistics list, like
stats.stackexchange.com

2. Read a basic regression/linear models text. You need to educate yourself.

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
