Re: [R] Code Verification

2005-07-20 Thread Christoph Buser
Dear Andy

Since the power of the t-test decreases when there are
discrepancies in the data to the normal distribution and there
is only a small loss of power if the data is normal distributed,
the only reason to use the t.test is his simplicity and the
easier interpretation.
Generally I'd prefer the wilcoxon test even if the data is
normal distributed.
But I agree that for a gamma distribution there is no huge loss
of power.

## example simulation:

n <- 1000
z1 <- numeric(n)
z2 <- numeric(n)

## gamma distribution
for (i in 1:n)
{
x<-rgamma(40, 2.5, 0.1)  
y<-rgamma(40, 3.5, 0.1)  
z1[i]<-t.test(x, y)$p.value
z2[i]<-wilcox.test(x, y)$p.value
}
## Power
1 - sum(z1>0.05)/1000  ## 0.71
1 - sum(z2>0.05)/1000  ## 0.76

##

## t distribution
for (i in 1:n)
{
x<-rt(40, df = 3)  
y<-1 + rt(40, df = 3)  
z1[i]<-t.test(x, y)$p.value
z2[i]<-wilcox.test(x, y)$p.value
}
## Power
1 - sum(z1>0.05)/1000  ## 0.76
1 - sum(z2>0.05)/1000  ## 0.91

##

## normal distribution
for (i in 1:n)
{
x<-rnorm(40, 0, 3)  
y<-1 + rnorm(40, 1, 3)  
z1[i]<-t.test(x, y)$p.value
z2[i]<-wilcox.test(x, y)$p.value
}
## Power
1 - sum(z1>0.05)/1000  ## 0.83
1 - sum(z2>0.05)/1000  ## 0.81

Regards,

Christoph Buser

--
Christoph Buser <[EMAIL PROTECTED]>
Seminar fuer Statistik, LEO C13
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--

Liaw, Andy writes:
 > > From: Christoph Buser
 > > 
 > > Hi
 > > 
 > > "t.test" assumes that your data within each group has a normal
 > > distribution. This is not the case in your example.
 > 
 > Eh?  What happen to the CLT?
 > 
 > Andy
 > 
 > 
 > > I would recommend you a non parametric test like "wilcox.test" if
 > > you want to compare the mean of two samples that are not normal
 > > distributed. 
 > > see ?wilcox.test
 > > 
 > > Be careful. Your example produces two gamma distributed samples
 > > with rate = 10, not scale = 10. 
 > > rate = 1/scale.
 > > If you want to use scale, you need to specify this argument 
 > > x<-rgamma(40, 2.5, scale = 10)
 > > see ?rgamma
 > > 
 > > I do not see the interpretation of your result. Since you do
 > > know the distribution and the parameters of your sample, you
 > > know the true means and that they are different. It is only a
 > > question of the sample size and the power of your test, if this
 > > difference is detected.
 > > Is that something you are investigating? Maybe a power
 > > calculation or something similar.
 > > 
 > > Regards,
 > > 
 > > Christoph Buser
 > > 
 > > --
 > > Christoph Buser <[EMAIL PROTECTED]>
 > > Seminar fuer Statistik, LEO C13
 > > ETH (Federal Inst. Technology) 8092 Zurich  SWITZERLAND
 > > phone: x-41-44-632-4673fax: 632-1228
 > > http://stat.ethz.ch/~buser/
 > > --
 > > 
 > > [EMAIL PROTECTED] writes:
 > >  > Hi R Users
 > >  > I have a code which I am running for my thesis work. Just 
 > > want to make sure that
 > >  > its ok. Its a t test I am conducting between two gamma 
 > > distributions with
 > >  > different shape parameters.
 > >  > 
 > >  > the code looks like:
 > >  > 
 > >  > sink("a1.txt");
 > >  > 
 > >  > for (i in 1:1000)
 > >  > {
 > >  > x<-rgamma(40, 2.5, 10)  # n = 40, shape = 2.5, Scale = 10
 > >  > y<-rgamma(40, 2.8, 10)  # n = 40, shape = 2.8, Scale = 10
 > >  > z<-t.test(x, y)
 > >  > print(z)
 > >  > }
 > >  > 
 > >  > 
 > >  > I will appreciate it if someone could tell me if its alrite or not.
 > >  > 
 > >  > thanks
 > >  > 
 > >  > -dev
 > >  > 
 > >  > __
 > >  > R-help@stat.math.ethz.ch mailing list
 > >  > https://stat.ethz.ch/mailman/listinfo/r-help
 > >  > PLEASE do read the posting guide! 
 > > http://www.R-project.org/posting-guide.html
 > > 
 > > 
 > > __
 > > R-help@stat.math.ethz.ch mailing list
 > > https://stat.ethz.ch/mailman/listinfo/r-help
 > > PLEASE do read the posting guide! 
 > > http://www.R-project.org/posting-guide.html
 > > 
 > > 
 > > 
 > 
 > 
 > 
 > --
 > Notice:  This e-mail message, together with any attachments, contains 
 > information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New 
 > Jersey, USA 08889), and/or its affiliates (which may be known outside the 
 > United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as 
 > Banyu) that may be confidential, proprietary copyrighted and/or legally 
 > privileged. It is intended solely for the use of the individual or entity 
 > named on this message.  If you are not the intended recipient, and have 
 > received this message in 

Re: [R] Code Verification

2005-07-20 Thread Liaw, Andy
> From: Christoph Buser
> 
> Hi
> 
> "t.test" assumes that your data within each group has a normal
> distribution. This is not the case in your example.

Eh?  What happen to the CLT?

Andy


> I would recommend you a non parametric test like "wilcox.test" if
> you want to compare the mean of two samples that are not normal
> distributed. 
> see ?wilcox.test
> 
> Be careful. Your example produces two gamma distributed samples
> with rate = 10, not scale = 10. 
> rate = 1/scale.
> If you want to use scale, you need to specify this argument 
> x<-rgamma(40, 2.5, scale = 10)
> see ?rgamma
> 
> I do not see the interpretation of your result. Since you do
> know the distribution and the parameters of your sample, you
> know the true means and that they are different. It is only a
> question of the sample size and the power of your test, if this
> difference is detected.
> Is that something you are investigating? Maybe a power
> calculation or something similar.
> 
> Regards,
> 
> Christoph Buser
> 
> --
> Christoph Buser <[EMAIL PROTECTED]>
> Seminar fuer Statistik, LEO C13
> ETH (Federal Inst. Technology)8092 Zurich  SWITZERLAND
> phone: x-41-44-632-4673   fax: 632-1228
> http://stat.ethz.ch/~buser/
> --
> 
> [EMAIL PROTECTED] writes:
>  > Hi R Users
>  > I have a code which I am running for my thesis work. Just 
> want to make sure that
>  > its ok. Its a t test I am conducting between two gamma 
> distributions with
>  > different shape parameters.
>  > 
>  > the code looks like:
>  > 
>  > sink("a1.txt");
>  > 
>  > for (i in 1:1000)
>  > {
>  > x<-rgamma(40, 2.5, 10)  # n = 40, shape = 2.5, Scale = 10
>  > y<-rgamma(40, 2.8, 10)  # n = 40, shape = 2.8, Scale = 10
>  > z<-t.test(x, y)
>  > print(z)
>  > }
>  > 
>  > 
>  > I will appreciate it if someone could tell me if its alrite or not.
>  > 
>  > thanks
>  > 
>  > -dev
>  > 
>  > __
>  > R-help@stat.math.ethz.ch mailing list
>  > https://stat.ethz.ch/mailman/listinfo/r-help
>  > PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Code Verification

2005-07-20 Thread Christoph Buser
Hi

"t.test" assumes that your data within each group has a normal
distribution. This is not the case in your example.
I would recommend you a non parametric test like "wilcox.test" if
you want to compare the mean of two samples that are not normal
distributed. 
see ?wilcox.test

Be careful. Your example produces two gamma distributed samples
with rate = 10, not scale = 10. 
rate = 1/scale.
If you want to use scale, you need to specify this argument 
x<-rgamma(40, 2.5, scale = 10)
see ?rgamma

I do not see the interpretation of your result. Since you do
know the distribution and the parameters of your sample, you
know the true means and that they are different. It is only a
question of the sample size and the power of your test, if this
difference is detected.
Is that something you are investigating? Maybe a power
calculation or something similar.

Regards,

Christoph Buser

--
Christoph Buser <[EMAIL PROTECTED]>
Seminar fuer Statistik, LEO C13
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--

[EMAIL PROTECTED] writes:
 > Hi R Users
 > I have a code which I am running for my thesis work. Just want to make sure 
 > that
 > its ok. Its a t test I am conducting between two gamma distributions with
 > different shape parameters.
 > 
 > the code looks like:
 > 
 > sink("a1.txt");
 > 
 > for (i in 1:1000)
 > {
 > x<-rgamma(40, 2.5, 10)  # n = 40, shape = 2.5, Scale = 10
 > y<-rgamma(40, 2.8, 10)  # n = 40, shape = 2.8, Scale = 10
 > z<-t.test(x, y)
 > print(z)
 > }
 > 
 > 
 > I will appreciate it if someone could tell me if its alrite or not.
 > 
 > thanks
 > 
 > -dev
 > 
 > __
 > R-help@stat.math.ethz.ch mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-help
 > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Code Verification

2005-07-19 Thread pantd
Hi R Users
I have a code which I am running for my thesis work. Just want to make sure that
its ok. Its a t test I am conducting between two gamma distributions with
different shape parameters.

the code looks like:

sink("a1.txt");

for (i in 1:1000)
{
x<-rgamma(40, 2.5, 10)  # n = 40, shape = 2.5, Scale = 10
y<-rgamma(40, 2.8, 10)  # n = 40, shape = 2.8, Scale = 10
z<-t.test(x, y)
print(z)
}


I will appreciate it if someone could tell me if its alrite or not.

thanks

-dev

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html