Re: [R] iplots problem

2007-05-25 Thread ryestone

You have to use Sun's Java on your PC, not Microsoft's. I just loaded it
on mine, restarted the computer, and it seemed to work.

Hope this helps!

mister_bluesman wrote:
> 
> Hi
> 
> How did you 'load' Sun's Java in R?
> 
> Many thanks
> 
> 
> 
> ryestone wrote:
>> 
>> Did you load Sun's java? Try that and also try rebooting your machine.
>> This worked for me.
>> 
>> 
>> mister_bluesman wrote:
>>> 
>>> Hi. I try to load iplots using the following commands
>>> 
>>> library(rJava)
>>> library(iplots)
>>> 
>>> but then I get the following error:
>>> 
>>> Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
>>> Cannot create Java Virtual Machine
>>> Error in library(iplots) : .First.lib failed for 'iplots'
>>> 
>>> What do I have to do to correct this?
>>> 
>>> I have jdk1.6 and jre1.6 installed on my windows machine
>>> Thanks
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10813663
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
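For readers hitting the same "Cannot create Java Virtual Machine" error, here is a sketch of the usual fix, assuming a Sun JDK is installed; the JAVA_HOME path below is hypothetical and must be adjusted to your own installation:

```r
## Sketch only: point rJava at a Sun JDK before anything tries to start the JVM.
## The path below is a hypothetical jdk1.6 location -- adjust to your machine.
Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jdk1.6.0")
options(java.parameters = "-Xmx512m")  # the heap size iplots requests

library(rJava)
.jinit()          # should now succeed instead of failing to create the JVM
library(iplots)
```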


Re: [R] Estimation of Dispersion parameter in GLM for Gamma Dist.

2007-05-25 Thread Prof Brian Ripley
This is discussed in the book the MASS package (sic) supports, and/or its 
online material (depending on the edition).

On Fri, 25 May 2007, fredrik odegaard wrote:

> Hi All,
> could someone shed some light on the difference between the
> estimated dispersion parameter that is supplied by the GLM function
> and the one that the 'gamma.dispersion( )' function in the MASS
> library gives? And is there consensus on which estimated value to
> use?
>
>
> It seems that the dispersion parameter that comes with the summary
> command for a GLM with a Gamma dist. is close to (but not exactly):
> Pearson Chi-Sq./d.f.

Sometimes close to, but by no means always.  Again, discussed in MASS.

> While the dispersion parameter from the MASS library
> ('gamma.dispersion ( )' ) is close to the approximation given in
> McCullagh&Nelder (p.291):
> (Res.Dev./n) * (6 + Res.Dev./n) / (6 + 2*Res.Dev./n)
>
> (Since it is only an approximation it seems reasonable that they are
> not exactly alike.)
>
>
> Many thanks,
> Fredrik
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595

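The two estimators can be compared side by side. A minimal sketch with simulated Gamma data (the model and shape value are invented for illustration):

```r
library(MASS)  # gamma.dispersion()

set.seed(1)
x <- runif(200)
y <- rgamma(200, shape = 2, rate = 2 / exp(1 + x))  # true mean exp(1 + x), dispersion 0.5
fit <- glm(y ~ x, family = Gamma(link = "log"))

## what summary.glm() reports: the Pearson chi-square estimate
disp.pearson <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
## summary(fit)$dispersion gives the same number

## the MLE-based estimate from MASS
disp.mle <- gamma.dispersion(fit)

c(pearson = disp.pearson, mle = disp.mle)  # close, but not identical
```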


Re: [R] Problem with rpart

2007-05-25 Thread Prof Brian Ripley
You only have 43 cases.  After one split, the groups are too small 
to split again with the default settings.  See  ?rpart.control.



On Fri, 25 May 2007, Silvia Lomascolo wrote:

>
> I work on Windows, R version 2.4.1.  I'm very new with R!
>
> I am trying to build a classification tree using rpart but, although the
> matrix has 108 variables, the program builds a tree with only one split
> using one variable!  I know it is probable that only one variable is
> informative, but I think it's unlikely.  I was wondering if someone can help
> me identify if I'm doing something wrong because I can't see it, nor could I
> find it in the help or in this forum.
>
> I want to see whether I can predict disperser type (5 categories) of a
> species given the volatile compounds that the fruits emit (108 volatiles)
> I am writing:
>
>> dispvol.x<- read.table ('C:\\Documents and
> Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
>> dispvol.df<- as.data.frame (dispvol.x)
>> attach (dispvol.df) #I think I need to do this so the variables are
> identified when I write the regression equation
>> dispvol.ctree <- rpart(disperser ~ P3.70 + P4.29 + P5.05 + ... + P30.99
>> + P32.25 + TotArea, data = dispvol.df, method = 'class')
>
> and I get the following output:
>
> n= 28
>
> node), split, n, loss, yval, (yprob)
>  * denotes terminal node
>
> 1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)
>  2) P10.01>=1.185 10  4 bat (0.1 0.6 0.2 0 0.1) *
>  3) P10.01< 1.185 18  6 non (0 0.17 0 0.17 0.67) *
>
> There is nothing special about P10.01 that I can see in my data, and I don't
> know why it chooses that variable and stops there!
>
> My matrix looks something like this (except, with a lot more variables)
>
> disperser  P3.70  P4.29   P6.45   P6.55  P10.01  P10.15  P10.18  TotArea
> ban         0.00   0.00    1.34    0.00    1.49    0.00    0.00     2.83
> non         0.00   0.00    0.00  152.80    0.00   14.31    0.00   167.11
> bat         0.00   0.00    0.00  131.56    0.65    0.00    0.00   132.21
> bat         0.00   0.00    5.05    0.00   13.01    6.85    0.00    24.90
> non         0.00   0.00   72.65  103.26    4.10    0.00    0.00   180.02
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> bat         1.23   0.00    0.48    0.89    0.25    0.00    0.00     2.85
> bat         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    1.06    0.00    0.00     1.06
> bat         0.00   0.00    0.00    0.00   28.69    0.00   21.33    50.02
> mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.00    0.00    0.00    1.15    0.00    0.00     1.15
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> non         0.00   0.82    0.00    1.65    0.00    0.00    0.00     2.47
> bat         0.00   0.00  133.24    0.00    3.13    0.00    0.00   136.37
> bir         0.00   0.00   11.08    3.16    1.79    2.09    0.48    18.61
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
> bat         0.00   0.00    0.00    0.00    1.31    0.00    0.00     1.31
> non         0.00   0.00    0.00    0.00    0.00    0.00    1.23     1.23
> bat         0.00   0.00    1.81    0.00    2.84    0.00    0.00     4.65
> non         0.00   0.00    1.18    0.00    0.73    0.00    0.00     1.91
> bir         0.00   0.00    0.00    0.00    1.40    0.00    0.00     1.40
> bat         0.00   0.00    8.16    1.50    1.22    0.00    0.00    10.88
> mix         0.00   0.55    0.00    0.00    0.00    0.00    0.00     0.55
> non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
>
> Thanks! Silvia.
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595

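To illustrate ?rpart.control, here is a sketch on the shipped iris data (standing in for the volatiles matrix; the control values are illustrative, not recommendations):

```r
library(rpart)

## default settings: minsplit = 20, so small groups are not split further
fit.default <- rpart(Species ~ ., data = iris, method = "class")

## relaxed settings allow a deeper tree when there are few cases
fit.relaxed <- rpart(Species ~ ., data = iris, method = "class",
                     control = rpart.control(minsplit = 5, minbucket = 2,
                                             cp = 0.001))

nrow(fit.default$frame)  # nodes in the default tree
nrow(fit.relaxed$frame)  # at least as many once the limits are relaxed
```

As an aside, `rpart(disperser ~ ., data = dispvol.df, method = 'class')` saves typing all 108 variable names, and makes `attach()` unnecessary.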


Re: [R] How to get the "Naive SE" of coefficients from the zelig output

2007-05-25 Thread Ferdinand Alimadhi
Dear Abdus,

Can you try:  summary(il6w.out)$table[,"(Naive SE)"]

If you want to know more detail about where that comes from, run the
following command at your R prompt and have a look at the printed code:

> survival:::summary.survreg


You might also consider subscribing to the Zelig mailing list
(http://lists.gking.harvard.edu/?info=zelig) for
any Zelig-related questions.

btw, in zelig(), there is no need to use

formula = il6.data$il6 ~ il6.data$apache

You can just use:
formula = il6 ~ apache

Best,
Ferdi

On 5/25/07, Abdus Sattar <[EMAIL PROTECTED]> wrote:
>
> Dear R-user:
>
> After fitting the Tobit model using zelig, I can get the regression
> coefficients with the following command:
>
> beta=coefficients(il6.out)
> > beta
> (Intercept)  apache
>  4.7826  0.9655
>
> How may I extract the "Naive SE" from the following output please?
>
> > summary(il6w.out)
> Call:
> zelig(formula = il6.data$il6 ~ il6.data$apache, model = "tobit",
> data = il6.data, robust = TRUE, cluster = "il6.data$subject",
> weights = il6.data$w)
>                  Value Std. Err (Naive SE)     z         p
> (Intercept)      4.572  0.12421    0.27946  36.8 1.44e-296
> il6.data$apache  0.983  0.00189    0.00494 519.4  0.00e+00
> Log(scale)       2.731  0.00660    0.00477 414.0  0.00e+00
> Scale= 15.3
> Gaussian distribution
> Loglik(model)= -97576   Loglik(intercept only)= -108964
> Chisq= 22777 on 1 degrees of freedom, p= 0
> (Loglikelihood assumes independent observations)
> Number of Newton-Raphson Iterations: 6
> n=5820 (1180 observations deleted due to missingness)
>
> I would appreciate any help you could provide. Thank you.
>
> Sattar
>
>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

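Since the zelig output above is printed by survival's summary.survreg, the extraction can be sketched on any robust survreg fit; the shipped ovarian data below is a stand-in for the il6 data:

```r
library(survival)

## robust = TRUE makes the summary table carry both "Std. Err" and "(Naive SE)"
fit <- survreg(Surv(futime, fustat) ~ age, data = ovarian,
               dist = "gaussian", robust = TRUE)

tab <- summary(fit)$table
colnames(tab)                   # includes "(Naive SE)"
naive.se <- tab[, "(Naive SE)"]
naive.se
```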


Re: [R] Interactive plots?

2007-05-25 Thread Michael Lawrence
You could load it into GGobi using the rggobi package and then use the
identification mode, as well as all the other interactive features...

http://www.ggobi.org

Michael

On 5/25/07, mister_bluesman <[EMAIL PROTECTED]> wrote:
>
>
> Hi there.
>
> I have a matrix that provides place names and the distances between them:
>
>Chelt Exeter London  Birm
> Chelt 0   118 96  50
> Exeter   1180   118 163
> London  96 118 0   118
> Birm  50 163 118 0
>
> After performing multidimensional scaling I get the following points
> plotted
> as follows
>
> http://www.nabble.com/file/p10810700/demo.jpeg
>
> I would like to know how, when I hover over a point, I can get a little box
> telling me which place the point refers to. Does anyone know?
>
> Many thanks.
> --
> View this message in context:
> http://www.nabble.com/Interactive-plots--tf3818454.html#a10810700
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

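GGobi aside, the labelling itself can be sketched in base R with cmdscale() plus text() (static labels) or identify() (click-to-label on an open graphics window), using the distance matrix from the thread:

```r
places <- c("Chelt", "Exeter", "London", "Birm")
d <- matrix(c(  0, 118,  96,  50,
              118,   0, 118, 163,
               96, 118,   0, 118,
               50, 163, 118,   0),
            nrow = 4, dimnames = list(places, places))

pts <- cmdscale(as.dist(d), k = 2)   # classical MDS, as in the original plot

plot(pts, xlab = "Dim 1", ylab = "Dim 2")
text(pts, labels = rownames(pts), pos = 3)   # static labels beside each point

## interactive alternative (needs an on-screen device; click points, Esc to stop):
## identify(pts[, 1], pts[, 2], labels = rownames(pts))
```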


Re: [R] How to get the "Naive SE" of coefficients from the zelig output

2007-05-25 Thread Abdus Sattar
Dear R-user:

After fitting the Tobit model using zelig, I can get the regression
coefficients with the following command:

beta=coefficients(il6.out)
> beta
(Intercept)  apache 
 4.7826  0.9655 

How may I extract the "Naive SE" from the following output please?

> summary(il6w.out)
Call:
zelig(formula = il6.data$il6 ~ il6.data$apache, model = "tobit", 
data = il6.data, robust = TRUE, cluster = "il6.data$subject", 
weights = il6.data$w)
                 Value Std. Err (Naive SE)     z         p
(Intercept)      4.572  0.12421    0.27946  36.8 1.44e-296
il6.data$apache  0.983  0.00189    0.00494 519.4  0.00e+00
Log(scale)       2.731  0.00660    0.00477 414.0  0.00e+00
Scale= 15.3 
Gaussian distribution
Loglik(model)= -97576   Loglik(intercept only)= -108964
Chisq= 22777 on 1 degrees of freedom, p= 0 
(Loglikelihood assumes independent observations)
Number of Newton-Raphson Iterations: 6 
n=5820 (1180 observations deleted due to missingness)

I would appreciate any help you could provide. Thank you.

Sattar


 






Re: [R] 3D plots with data.frame

2007-05-25 Thread Duncan Murdoch
On 25/05/2007 8:22 PM, [EMAIL PROTECTED] wrote:
> 
> You could try the function 'plot3d', in package 'rgl':
> 
> library(rgl)
> ?plot3d
> x<-data.frame(a=rnorm(100),b=rnorm(100),c=rnorm(100))
> plot3d(x$a,x$b,x$c)

Or more simply, plot3d(x) (which plots the first three columns).

Duncan Murdoch
> 
> Jose
> 
> 
> Quoting "H. Paul Benton" <[EMAIL PROTECTED]>:
> 
>> Dear all,
>>
>> Thank you for any help. I have a data.frame and would like to plot
>> it in 3D. I have tried wireframe() and cloud(), I got
>>
>> scatterplot3d(xs)
>> Error: could not find function "scatterplot3d"
>>
>>> wireframe(xs)
>> Error in wireframe(xs) : no applicable method for "wireframe"
>>
>>> persp(x=x, y=y, z=xs)
>> Error in persp.default(x = x, y = y, z = xs) :
>> (list) object cannot be coerced to 'double'
>>> class(xs)
>> [1] "data.frame"
>> Where x and y were sequences from the min to the max of xs[,1] and xs[,2], in steps of 50.
>>
>> my data is/looks like:
>>
>>> dim(xs)
>> [1] 400   4
>>> xs[1:5,]
>> x   y Z1 Z2
>> 1 27172.4 19062.4  0 128
>> 2 27000.9 19077.8  0  0
>> 3 27016.8 19077.5  0  0
>> 4 27029.5 19077.3  0  0
>> 5 27045.4 19077.0  0  0
>>
>> Cheers,
>>
>> Paul
>>
>> --
>> Research Technician
>> Mass Spectrometry
>>o The
>>   /
>> o Scripps
>>   \
>>o Research
>>   /
>> o Institute
>>
>> __
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> 
> 
>

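The persp() error in the quoted message comes from passing a data.frame as z; persp() wants a numeric matrix whose dimensions match x and y. A sketch with an invented grid standing in for the 400-row xs:

```r
## build a regular grid; expand.grid() varies its first argument fastest
g <- expand.grid(x = seq(0, 1, length.out = 20),
                 y = seq(0, 1, length.out = 20))
g$Z1 <- sin(pi * g$x) * cos(pi * g$y)   # stand-in surface values

## column-major filling matches the expand.grid() ordering: zmat[i, j] is
## the Z1 value at the i-th x and j-th y
zmat <- matrix(g$Z1, nrow = 20, ncol = 20)

persp(x = unique(g$x), y = unique(g$y), z = zmat,
      theta = 30, phi = 25, xlab = "x", ylab = "y", zlab = "Z1")
```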


Re: [R] 3D plots with data.frame

2007-05-25 Thread J . delasHeras


You could try the function 'plot3d', in package 'rgl':

library(rgl)
?plot3d
x<-data.frame(a=rnorm(100),b=rnorm(100),c=rnorm(100))
plot3d(x$a,x$b,x$c)

Jose


Quoting "H. Paul Benton" <[EMAIL PROTECTED]>:

> Dear all,
>
> Thank you for any help. I have a data.frame and would like to plot
> it in 3D. I have tried wireframe() and cloud(), I got
>
> scatterplot3d(xs)
> Error: could not find function "scatterplot3d"
>
>> wireframe(xs)
> Error in wireframe(xs) : no applicable method for "wireframe"
>
>> persp(x=x, y=y, z=xs)
> Error in persp.default(x = x, y = y, z = xs) :
> (list) object cannot be coerced to 'double'
>> class(xs)
> [1] "data.frame"
> Where x and y were sequences from the min to the max of xs[,1] and xs[,2], in steps of 50.
>
> my data is/looks like:
>
>> dim(xs)
> [1] 400   4
>> xs[1:5,]
> x   y Z1 Z2
> 1 27172.4 19062.4  0 128
> 2 27000.9 19077.8  0  0
> 3 27016.8 19077.5  0  0
> 4 27029.5 19077.3  0  0
> 5 27045.4 19077.0  0  0
>
> Cheers,
>
> Paul
>
> --
> Research Technician
> Mass Spectrometry
>o The
>   /
> o Scripps
>   \
>o Research
>   /
> o Institute
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Dr. Jose I. de las Heras  Email: [EMAIL PROTECTED]
The Wellcome Trust Centre for Cell BiologyPhone: +44 (0)131 6513374
Institute for Cell & Molecular BiologyFax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK



Re: [R] normality tests [Broadcast]

2007-05-25 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
> Following up on Frank's thought, why is it that parametric tests are so
> much more popular than their non-parametric counterparts?  As
> non-parametric tests require fewer assumptions, why aren't they the
> default?  The relative efficiency of the Wilcoxon test as compared to the
> t-test is 0.955, and yet I still see t-tests in the medical literature all
> the time.  Granted, the Wilcoxon still requires the assumption of symmetry
> (I'm curious as to why the Wilcoxon is often used when asymmetry is
> suspected, since the Wilcoxon assumes symmetry), but that's less stringent
> than requiring normally distributed data.  In a similar vein, one usually
> sees the mean and standard deviation reported as summary statistics for a
> continuous variable - these are not very informative unless you assume the
> variable is normally distributed.  However, clinicians often insist that I
> include these figures in reports.
> 
> Cody Hamilton, PhD
> Edwards Lifesciences

Well said Cody, just want to add that Wilcoxon does not assume symmetry 
if you are interested in testing for stochastic ordering and not just 
for a mean.

Frank

> 
> 
> 
>
> From: Frank E Harrell Jr <[EMAIL PROTECTED]>
> To: "Lucke, Joseph F" <[EMAIL PROTECTED]>
> Cc: r-help
> Subject: Re: [R] normality tests [Broadcast]
> Date: 05/25/2007 02:42 PM
>
> Lucke, Joseph F wrote:
>>  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
> 
> I beg to differ Joseph.  I have had many datasets in which the CLT was
> of no use whatsoever, i.e., where bootstrap confidence limits were
> asymmetric because the data were so skewed, and where symmetric
> normality-based confidence intervals had bad coverage in both tails
> (though correct on the average).  I see this the opposite way:
> nonparametric tests work fine if normality holds.
> 
> Note that the CLT helps with type I error but not so much with type II
> error.
> 
> Frank
> 
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: [EMAIL PROTECTED]; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: [EMAIL PROTECTED]
>>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>>> [EMAIL PROTECTED] wrote:
>>>>> Hi all,
>>>>>
>>>>> apologies for seeking advice on a general stats question. I've run
>>>>> normality tests using 8 different methods:
>>>>> - Lilliefors
>>>>> - Shapiro-Wilk
>>>>> - Robust Jarque Bera
>>>>> - Jarque Bera
>>>>> - Anderson-Darling
>>>>> - Pearson chi-square
>>>>> - Cramer-von Mises
>>>>> - Shapiro-Francia
>>>>>
>>>>> All show that the null hypothesis that the data come from a normal
>>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>>> to report the values of 8 different tests in a report. One note is
>>>>> that my sample size is really tiny (less than 20 independent cases).
>>>>> Without wanting to start a flame war, is there any advice on which
>>>>> one/ones would be more appropriate and should be reported (along with
>>>>> a Q-Q plot)? Thank you.
>>>>>
>>>>> Regards,
>>>>>
>>>> Wow - I have so many concerns with that approach that it's hard to know
>>>> where to begin.  But first of all, why care about normality?  Why not
>>>> use distribution-free methods?
>>>>
>>>> You should examine the power of the tests for n=20.  You'll probably
>>>> find it's not good enough to reach a reliable conclusion.
>>> And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.

Re: [R] normality tests [Broadcast]

2007-05-25 Thread Cody_Hamilton

Following up on Frank's thought, why is it that parametric tests are so
much more popular than their non-parametric counterparts?  As
non-parametric tests require fewer assumptions, why aren't they the
default?  The relative efficiency of the Wilcoxon test as compared to the
t-test is 0.955, and yet I still see t-tests in the medical literature all
the time.  Granted, the Wilcoxon still requires the assumption of symmetry
(I'm curious as to why the Wilcoxon is often used when asymmetry is
suspected, since the Wilcoxon assumes symmetry), but that's less stringent
than requiring normally distributed data.  In a similar vein, one usually
sees the mean and standard deviation reported as summary statistics for a
continuous variable - these are not very informative unless you assume the
variable is normally distributed.  However, clinicians often insist that I
include these figures in reports.

Cody Hamilton, PhD
Edwards Lifesciences



   
From: Frank E Harrell Jr <[EMAIL PROTECTED]>
To: "Lucke, Joseph F" <[EMAIL PROTECTED]>
Cc: r-help
Subject: Re: [R] normality tests [Broadcast]
Date: 05/25/2007 02:42 PM




Lucke, Joseph F wrote:
>  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
> non-normality for significance testing. It's the sample means that have
> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
> for normality prior to choosing a test statistic is generally not a good
> idea.

I beg to differ Joseph.  I have had many datasets in which the CLT was
of no use whatsoever, i.e., where bootstrap confidence limits were
asymmetric because the data were so skewed, and where symmetric
normality-based confidence intervals had bad coverage in both tails
(though correct on the average).  I see this the opposite way:
nonparametric tests work fine if normality holds.

Note that the CLT helps with type I error but not so much with type II
error.

Frank

>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Friday, May 25, 2007 12:04 PM
> To: [EMAIL PROTECTED]; Frank E Harrell Jr
> Cc: r-help
> Subject: Re: [R] normality tests [Broadcast]
>
> From: [EMAIL PROTECTED]
>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>> [EMAIL PROTECTED] wrote:
>>>> Hi all,
>>>>
>>>> apologies for seeking advice on a general stats question. I've run
>>>> normality tests using 8 different methods:
>>>> - Lilliefors
>>>> - Shapiro-Wilk
>>>> - Robust Jarque Bera
>>>> - Jarque Bera
>>>> - Anderson-Darling
>>>> - Pearson chi-square
>>>> - Cramer-von Mises
>>>> - Shapiro-Francia
>>>>
>>>> All show that the null hypothesis that the data come from a normal
>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>> to report the values of 8 different tests in a report. One note is
>>>> that my sample size is really tiny (less than 20 independent cases).
>>>> Without wanting to start a flame war, is there any advice on which
>>>> one/ones would be more appropriate and should be reported (along with
>>>> a Q-Q plot)? Thank you.
>>>>
>>>> Regards,
>>>>
>>> Wow - I have so many concerns with that approach that it's hard to know
>>> where to begin.  But first of all, why care about normality?  Why not
>>> use distribution-free methods?
>>>
>>> You should examine the power of the tests for n=20.  You'll probably
>>> find it's not good enough to reach a reliable conclusion.
>> And wouldn't it be even worse if I used non-parametric tests?
>
> I believe what Frank meant was that it's probably better to use a
> distribution-free procedure to do the real test of interest (if there is
> one) instead of testing for normality, and then use a test that assumes
> normality.
>
> I guess the question is, what exactly do you want to do with the outcome
> of the normality tests?  If those are going to be used as basis for
> deciding which test(s) to do next, then I concur with Frank's
> reservation.
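Cody's relative-efficiency point is easy to demonstrate: on skewed data both tests can be run side by side, and nothing stops you from reporting the rank-based one. A sketch with simulated exponential samples (sizes and shift invented):

```r
set.seed(42)
x <- rexp(20)        # skewed sample
y <- rexp(20) + 0.5  # shifted skewed sample

t.test(x, y)$p.value        # relies on approximate normality of the means
wilcox.test(x, y)$p.value   # rank-based, distribution-free
```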

[R] Estimation of Dispersion parameter in GLM for Gamma Dist.

2007-05-25 Thread fredrik odegaard
Hi All,
could someone shed some light on the difference between the
estimated dispersion parameter that is supplied by the GLM function
and the one that the 'gamma.dispersion( )' function in the MASS
library gives? And is there consensus on which estimated value to
use?


It seems that the dispersion parameter that comes with the summary
command for a GLM with a Gamma dist. is close to (but not exactly):
Pearson Chi-Sq./d.f.

While the dispersion parameter from the MASS library
('gamma.dispersion ( )' ) is close to the approximation given in
McCullagh&Nelder (p.291):
(Res.Dev./n) * (6 + Res.Dev./n) / (6 + 2*Res.Dev./n)

(Since it is only an approximation it seems reasonable that they are
not exactly alike.)


Many thanks,
Fredrik



Re: [R] normality tests [Broadcast]

2007-05-25 Thread Cody_Hamilton

You can also try validating your regression model via the bootstrap (the
validate() function in the Design library is very helpful).  To my mind
that would be much more reassuring than normality tests performed on twenty
residuals.

By the way, be careful with the correlation test - it's only good at
detecting linear relationships between two variables (i.e. not helpful for
detecting non-linear relationships).

Regards,
   -Cody

Cody Hamilton, PhD
Edwards Lifesciences
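
validate() in Design is the full answer; the underlying bootstrap idea can be sketched with base R alone (data simulated, true slope 2 by construction):

```r
set.seed(1)
n <- 30
x <- rnorm(n)
y <- 2 * x + rnorm(n)

## resample cases with replacement and refit; 999 replicates is an arbitrary choice
boot.slopes <- replicate(999, {
  i <- sample(n, replace = TRUE)
  coef(lm(y[i] ~ x[i]))[2]
})

## percentile interval: no normality assumption on the residuals needed
quantile(boot.slopes, c(0.025, 0.975))
```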


   
From: [EMAIL PROTECTED]
To: "Lucke, Joseph F" <[EMAIL PROTECTED]>
Cc: r-help
Subject: Re: [R] normality tests [Broadcast]
Date: 05/25/2007 11:23 AM




Thank you all for your replies, they have been most useful... well,
in my case I have chosen to do some parametric tests (more precisely,
correlation and linear regressions among some variables)... so it
would be nice if I had an extra bit of support for my decisions... If I
understood well from all your replies, I shouldn't pay so much
attention to the normality tests, so it wouldn't matter which one/ones
I use to report... but rather focus on issues such as the power of the
test...

Thanks again.

On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote:
>  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
> non-normality for significance testing. It's the sample means that have
> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
> for normality prior to choosing a test statistic is generally not a good
> idea.
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Friday, May 25, 2007 12:04 PM
> To: [EMAIL PROTECTED]; Frank E Harrell Jr
> Cc: r-help
> Subject: Re: [R] normality tests [Broadcast]
>
> From: [EMAIL PROTECTED]
> >
> > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
> > > [EMAIL PROTECTED] wrote:
> > > > Hi all,
> > > >
> > > > apologies for seeking advice on a general stats question. I ve run
>
> > > > normality tests using 8 different methods:
> > > > - Lilliefors
> > > > - Shapiro-Wilk
> > > > - Robust Jarque Bera
> > > > - Jarque Bera
> > > > - Anderson-Darling
> > > > - Pearson chi-square
> > > > - Cramer-von Mises
> > > > - Shapiro-Francia
> > > >
> > > > All show that the null hypothesis that the data come from a normal
>
> > > > distro cannot be rejected. Great. However, I don't think
> > it looks nice
> > > > to report the values of 8 different tests on a report. One note is
>
> > > > that my sample size is really tiny (less than 20
> > independent cases).
> > > > Without wanting to start a flame war, are there any
> > advices of which
> > > > one/ones would be more appropriate and should be reported
> > (along with
> > > > a Q-Q plot). Thank you.
> > > >
> > > > Regards,
> > > >
> > >
> > > Wow - I have so many concerns with that approach that it's
> > hard to know
> > > where to begin.  But first of all, why care about
> > normality?  Why not
> > > use distribution-free methods?
> > >
> > > You should examine the power of the tests for n=20.  You'll probably
>
> > > find it's not good enough to reach a reliable conclusion.
> >
> > And wouldn't it be even worse if I used non-parametric tests?
>
> I believe what Frank meant was that it's probably better to use a
> distribution-free procedure to do the real test of interest (if there is
> one) instead of testing for normality, and then use a test that assumes
> normality.
>
> I guess the question is, what exactly do you want to do with the outcome
> of the normality tests?  If those are going to be used as basis for
> deciding which test(s) to do next, then I concur with Frank's
> reservation.
>
> Generally speaking, I do not find goodness-of-fit for distributions very
> useful, mostly for the reason that failure to reject the null is no
> evidence in favor of the null.  It's difficult for me to imagine why
> "there's insufficient evidence to show that the data did not come from a
> > normal distribution" would be interesting.
> >
> > Andy

Re: [R] Interactive plots?

2007-05-25 Thread Tony Plate
The package RSVGTipsDevice allows you to do just that -- you create a
plot in an SVG file that can be viewed in a browser like Firefox, and
the points (or shapes) in that plot can have pop-up tooltips.

-- Tony Plate

mister_bluesman wrote:
> Hi there. 
> 
> I have a matrix that provides place names and the distances between them:
> 
>Chelt Exeter   London  Birm
> Chelt 0   118 96  50
> Exeter   1180   118 163
> London  96 118 0   118
> Birm  50 163 118 0
> 
> After performing multidimensional scaling I get the following points plotted
> as follows
> 
> http://www.nabble.com/file/p10810700/demo.jpeg 
> 
> I would like to know how, when I hover over a point, I can get a little box
> telling me which place the point refers to. Does anyone know?
> 
> Many thanks.



Re: [R] normality tests [Broadcast]

2007-05-25 Thread wssecn
The normality of the residuals is important for the inference procedures of
the classical linear regression model, and normality is very important in
correlation analysis (second moment)...
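
For a regression on a small sample, the point can be sketched with one test plus a Q-Q plot of the residuals (data simulated for illustration):

```r
set.seed(7)
x <- rnorm(19)                 # under 20 cases, as in the original question
y <- 1 + 2 * x + rnorm(19)
fit <- lm(y ~ x)

shapiro.test(residuals(fit))   # one reported test is plenty...
qqnorm(residuals(fit))         # ...alongside a Q-Q plot
qqline(residuals(fit))
```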

Washington S. Silva

> Thank you all for your replies, they have been most useful... well,
> in my case I have chosen to do some parametric tests (more precisely,
> correlation and linear regressions among some variables)... so it
> would be nice if I had an extra bit of support for my decisions... If I
> understood well from all your replies, I shouldn't pay so much
> attention to the normality tests, so it wouldn't matter which one/ones
> I use to report... but rather focus on issues such as the power of the
> test...
> 
> Thanks again.
> 
> On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote:
> >  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
> > non-normality for significance testing. It's the sample means that have
> > to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
> > for normality prior to choosing a test statistic is generally not a good
> > idea.
> >
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> > Sent: Friday, May 25, 2007 12:04 PM
> > To: [EMAIL PROTECTED]; Frank E Harrell Jr
> > Cc: r-help
> > Subject: Re: [R] normality tests [Broadcast]
> >
> > From: [EMAIL PROTECTED]
> > >
> > > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
> > > > [EMAIL PROTECTED] wrote:
> > > > > Hi all,
> > > > >
> > > > > apologies for seeking advice on a general stats question. I ve run
> >
> > > > > normality tests using 8 different methods:
> > > > > - Lilliefors
> > > > > - Shapiro-Wilk
> > > > > - Robust Jarque Bera
> > > > > - Jarque Bera
> > > > > - Anderson-Darling
> > > > > - Pearson chi-square
> > > > > - Cramer-von Mises
> > > > > - Shapiro-Francia
> > > > >
> > > > > All show that the null hypothesis that the data come from a normal
> >
> > > > > distro cannot be rejected. Great. However, I don't think
> > > it looks nice
> > > > > to report the values of 8 different tests on a report. One note is
> >
> > > > > that my sample size is really tiny (less than 20
> > > independent cases).
> > > > > Without wanting to start a flame war, are there any
> > > advices of which
> > > > > one/ones would be more appropriate and should be reported
> > > (along with
> > > > > a Q-Q plot). Thank you.
> > > > >
> > > > > Regards,
> > > > >
> > > >
> > > > Wow - I have so many concerns with that approach that it's
> > > hard to know
> > > > where to begin.  But first of all, why care about
> > > normality?  Why not
> > > > use distribution-free methods?
> > > >
> > > > You should examine the power of the tests for n=20.  You'll probably
> >
> > > > find it's not good enough to reach a reliable conclusion.
> > >
> > > And wouldn't it be even worse if I used non-parametric tests?
> >
> > I believe what Frank meant was that it's probably better to use a
> > distribution-free procedure to do the real test of interest (if there is
> > one) instead of testing for normality, and then use a test that assumes
> > normality.
> >
> > I guess the question is, what exactly do you want to do with the outcome
> > of the normality tests?  If those are going to be used as basis for
> > deciding which test(s) to do next, then I concur with Frank's
> > reservation.
> >
> > Generally speaking, I do not find goodness-of-fit for distributions very
> > useful, mostly for the reason that failure to reject the null is no
> > evidence in favor of the null.  It's difficult for me to imagine why
> > "there's insufficient evidence to show that the data did not come from a
> > normal distribution" would be interesting.
> >
> > Andy
> >
> >
> > > >
> > > > Frank
> > > >
> > > >
> > > > --
> > > > Frank E Harrell Jr   Professor and Chair   School
> > > of Medicine
> > > >   Department of Biostatistics
> > > Vanderbilt University
> > > >
> > >
> > >
> > > --
> > > yianni
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> > >
> > >
> >
> >
> >
> >
> 
> 
> -- 
> yianni
> 

[R] Interactive plots?

2007-05-25 Thread mister_bluesman

Hi there. 

I have a matrix that provides place names and the distances between them:

        Chelt Exeter London Birm
Chelt       0    118     96   50
Exeter    118      0    118  163
London     96    118      0  118
Birm       50    163    118    0

After performing multidimensional scaling I get the points plotted
as follows:

http://www.nabble.com/file/p10810700/demo.jpeg 

I would like to know how, if I hover over a point, I can get a little box
telling me which place the point refers to. Does anyone know?

Many thanks.
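[Editor's note: base R graphics has no hover tooltips, but identify() labels clicked points. A minimal sketch using the distance matrix from the message above; the iplots mention is an assumption about what "interactive" could mean here, not part of the original post.]

```r
# Distances taken from the message; cmdscale() performs classical MDS.
places <- c("Chelt", "Exeter", "London", "Birm")
d <- matrix(c(  0, 118,  96,  50,
              118,   0, 118, 163,
               96, 118,   0, 118,
               50, 163, 118,   0),
            nrow = 4, byrow = TRUE,
            dimnames = list(places, places))
pts <- cmdscale(as.dist(d), k = 2)
plot(pts, xlab = "Dim 1", ylab = "Dim 2")
# identify() waits for mouse clicks and writes the row label next to
# the nearest point; right-click (or Esc) ends the interaction.
identify(pts[, 1], pts[, 2], labels = rownames(pts))
```

For true hover behaviour an interactive toolkit such as the iplots package would be needed; base graphics only supports click-to-label.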
-- 
View this message in context: 
http://www.nabble.com/Interactive-plots--tf3818454.html#a10810700
Sent from the R help mailing list archive at Nabble.com.



Re: [R] normality tests [Broadcast]

2007-05-25 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
> Thank you all for your replies they have been more useful... well
> in my case I have chosen to do some parametric tests (more precisely
> correlation and linear regressions among some variables)... so it
> would be nice if I had an extra bit of support on my decisions... If I
> understood well from all your replies... I shouldn't pay so much
> attention to the normality tests, so it wouldn't matter which one/ones
> I use to report... but rather focus on issues such as the power of the
> test...

If doing regression I assume your normality tests were on residuals 
rather than raw data.

Frank

> 
> Thanks again.
> 
> On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote:
>>  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: [EMAIL PROTECTED]; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: [EMAIL PROTECTED]
>> >
>> > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>> > > [EMAIL PROTECTED] wrote:
>> > > > Hi all,
>> > > >
>> > > > apologies for seeking advice on a general stats question. I ve run
>>
>> > > > normality tests using 8 different methods:
>> > > > - Lilliefors
>> > > > - Shapiro-Wilk
>> > > > - Robust Jarque Bera
>> > > > - Jarque Bera
>> > > > - Anderson-Darling
>> > > > - Pearson chi-square
>> > > > - Cramer-von Mises
>> > > > - Shapiro-Francia
>> > > >
>> > > > All show that the null hypothesis that the data come from a normal
>>
>> > > > distro cannot be rejected. Great. However, I don't think
>> > it looks nice
>> > > > to report the values of 8 different tests on a report. One note is
>>
>> > > > that my sample size is really tiny (less than 20
>> > independent cases).
>> > > > Without wanting to start a flame war, are there any
>> > advices of which
>> > > > one/ones would be more appropriate and should be reported
>> > (along with
>> > > > a Q-Q plot). Thank you.
>> > > >
>> > > > Regards,
>> > > >
>> > >
>> > > Wow - I have so many concerns with that approach that it's
>> > hard to know
>> > > where to begin.  But first of all, why care about
>> > normality?  Why not
>> > > use distribution-free methods?
>> > >
>> > > You should examine the power of the tests for n=20.  You'll probably
>>
>> > > find it's not good enough to reach a reliable conclusion.
>> >
>> > And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.
>>
>> I guess the question is, what exactly do you want to do with the outcome
>> of the normality tests?  If those are going to be used as basis for
>> deciding which test(s) to do next, then I concur with Frank's
>> reservation.
>>
>> Generally speaking, I do not find goodness-of-fit for distributions very
>> useful, mostly for the reason that failure to reject the null is no
>> evidence in favor of the null.  It's difficult for me to imagine why
>> "there's insufficient evidence to show that the data did not come from a
>> normal distribution" would be interesting.
>>
>> Andy
>>
>>
>> > >
>> > > Frank
>> > >
>> > >
>> > > --
>> > > Frank E Harrell Jr   Professor and Chair   School
>> > of Medicine
>> > >   Department of Biostatistics
>> > Vanderbilt University
>> > >
>> >
>> >
>> > --
>> > yianni
>> >
>> >
>> >
>> >
>>
>>
>>
>>
> 
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University


Re: [R] normality tests [Broadcast]

2007-05-25 Thread Frank E Harrell Jr
Lucke, Joseph F wrote:
>  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
> non-normality for significance testing. It's the sample means that have
> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
> for normality prior to choosing a test statistic is generally not a good
> idea. 

I beg to differ, Joseph.  I have had many datasets in which the CLT was 
of no use whatsoever, i.e., where bootstrap confidence limits were 
asymmetric because the data were so skewed, and where symmetric 
normality-based confidence intervals had bad coverage in both tails 
(though correct on the average).  I see this the opposite way: 
nonparametric tests work fine if normality holds.

Note that the CLT helps with type I error but not so much with type II 
error.
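[Editor's note: the asymmetry Frank describes is easy to reproduce. A sketch (illustrative data, not from the original post) comparing a symmetric t-based interval with a percentile bootstrap interval for the mean of a small, right-skewed sample:]

```r
set.seed(1)
x <- rlnorm(20, meanlog = 0, sdlog = 1)   # n = 20, right-skewed

# Symmetric normal-theory interval for the mean:
t_ci <- mean(x) + c(-1, 1) * qt(0.975, df = 19) * sd(x) / sqrt(20)

# Percentile bootstrap interval (no symmetry assumed):
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))
boot_ci <- quantile(boot_means, c(0.025, 0.975))

t_ci
boot_ci   # the bootstrap limits sit asymmetrically around mean(x)
```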

Frank

> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Friday, May 25, 2007 12:04 PM
> To: [EMAIL PROTECTED]; Frank E Harrell Jr
> Cc: r-help
> Subject: Re: [R] normality tests [Broadcast]
> 
> From: [EMAIL PROTECTED]
>> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
>>> [EMAIL PROTECTED] wrote:
>>>> Hi all,
>>>>
>>>> apologies for seeking advice on a general stats question. I ve run
>>>> normality tests using 8 different methods:
>>>> - Lilliefors
>>>> - Shapiro-Wilk
>>>> - Robust Jarque Bera
>>>> - Jarque Bera
>>>> - Anderson-Darling
>>>> - Pearson chi-square
>>>> - Cramer-von Mises
>>>> - Shapiro-Francia
>>>>
>>>> All show that the null hypothesis that the data come from a normal
>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>> to report the values of 8 different tests on a report. One note is
>>>> that my sample size is really tiny (less than 20 independent cases).
>>>> Without wanting to start a flame war, are there any advices of which
>>>> one/ones would be more appropriate and should be reported (along with
>>>> a Q-Q plot). Thank you.
>>>>
>>>> Regards,
>>>>
>>> Wow - I have so many concerns with that approach that it's hard to know
>>> where to begin.  But first of all, why care about normality?  Why not
>>> use distribution-free methods?
>>>
>>> You should examine the power of the tests for n=20.  You'll probably
>>> find it's not good enough to reach a reliable conclusion.
>> And wouldn't it be even worse if I used non-parametric tests?
> 
> I believe what Frank meant was that it's probably better to use a
> distribution-free procedure to do the real test of interest (if there is
> one) instead of testing for normality, and then use a test that assumes
> normality.
> 
> I guess the question is, what exactly do you want to do with the outcome
> of the normality tests?  If those are going to be used as basis for
> deciding which test(s) to do next, then I concur with Frank's
> reservation.
> 
> Generally speaking, I do not find goodness-of-fit for distributions very
> useful, mostly for the reason that failure to reject the null is no
> evidence in favor of the null.  It's difficult for me to imagine why
> "there's insufficient evidence to show that the data did not come from a
> normal distribution" would be interesting.
> 
> Andy
> 
>
>>> Frank
>>>
>>>
>>> --
>>> Frank E Harrell Jr   Professor and Chair   School of Medicine
>>>   Department of Biostatistics   Vanderbilt University
>>
>> --
>> yianni
>>
>>
>>
>>
> 
> 
> 
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



[R] 3D plots with data.frame

2007-05-25 Thread H. Paul Benton
Dear all, 
 
Thank you for any help. I have a data.frame and would like to plot
it in 3D. I have tried wireframe() and cloud(), and I got:

> scatterplot3d(xs)
Error: could not find function "scatterplot3d"

> wireframe(xs)
Error in wireframe(xs) : no applicable method for "wireframe"

> persp(x=x, y=y, z=xs)
Error in persp.default(x = x, y = y, z = xs) :
(list) object cannot be coerced to 'double'
> class(xs)
[1] "data.frame"
where x and y are sequences of length 50 running from the min to the max
of xs[,1] and xs[,2], respectively.

My data looks like this:

> dim(xs)
[1] 400   4
> xs[1:5,]
        x       y Z1  Z2
1 27172.4 19062.4  0 128
2 27000.9 19077.8  0   0
3 27016.8 19077.5  0   0
4 27029.5 19077.3  0   0
5 27045.4 19077.0  0   0
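[Editor's note: the errors above are consistent with the scatterplot3d package not being loaded, and with persp() requiring a numeric z matrix rather than a data frame. A sketch under those assumptions; the 20 x 20 grid shape is a guess from dim(xs) being 400 rows:]

```r
# scatterplot3d lives in the package of the same name:
# install.packages("scatterplot3d")   # if not yet installed
library(scatterplot3d)
scatterplot3d(xs$x, xs$y, xs$Z1)      # pass columns, not the data frame

# persp() wants z as a matrix over an x/y grid; assuming the 400 rows
# form a regular 20 x 20 grid in row-major x order:
z <- matrix(xs$Z1, nrow = 20, ncol = 20)
persp(x = 1:20, y = 1:20, z = z, theta = 30, phi = 25)
```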

Cheers,

Paul

-- 
Research Technician
Mass Spectrometry
   o The
  /
o Scripps
  \
   o Research
  /
o Institute



[R] Problem with rpart

2007-05-25 Thread Silvia Lomascolo

I work on Windows, R version 2.4.1.  I'm very new with R!

I am trying to build a classification tree using rpart but, although the
matrix has 108 variables, the program builds a tree with only one split
using one variable!  I know it is probable that only one variable is
informative, but I think it's unlikely.  I was wondering if someone can help
me identify if I'm doing something wrong because I can't see it, nor could I
find it in the help or in this forum.

I want to see whether I can predict the disperser type (5 categories) of a
species given the volatile compounds that the fruits emit (108 volatiles).
I am writing:

> dispvol.x <- read.table('C:\\Documents and
Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
> dispvol.df <- as.data.frame(dispvol.x)
> attach(dispvol.df)  # I think I need to do this so the variables are
identified when I write the regression equation
> dispvol.ctree <- rpart(disperser ~ P3.70 + P4.29 + P5.05 + ... + P30.99 +
P32.25 + TotArea, data = dispvol.df, method = 'class')

and I get the following output:

n= 28 

node), split, n, loss, yval, (yprob)
  * denotes terminal node

1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)  
  2) P10.01>=1.185 10  4 bat (0.1 0.6 0.2 0 0.1) *
  3) P10.01< 1.185 18  6 non (0 0.17 0 0.17 0.67) *

There is nothing special about P10.01 that I can see in my data and I don't
know why it chooses that variable and stops there!

My matrix looks something like this (except with a lot more variables):

disperser   P3.70   P4.29   P6.45   P6.55  P10.01  P10.15  P10.18  TotArea
ban          0.00    0.00    1.34    0.00    1.49    0.00    0.00     2.83
non          0.00    0.00    0.00  152.80    0.00   14.31    0.00   167.11
bat          0.00    0.00    0.00  131.56    0.65    0.00    0.00   132.21
bat          0.00    0.00    5.05    0.00   13.01    6.85    0.00    24.90
non          0.00    0.00   72.65  103.26    4.10    0.00    0.00   180.02
non          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
bat          1.23    0.00    0.48    0.89    0.25    0.00    0.00     2.85
bat          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
non          0.00    0.00    0.00    0.00    1.06    0.00    0.00     1.06
bat          0.00    0.00    0.00    0.00   28.69    0.00   21.33    50.02
mix          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
non          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
non          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
non          0.00    0.00    0.00    0.00    1.15    0.00    0.00     1.15
non          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
non          0.00    0.82    0.00    1.65    0.00    0.00    0.00     2.47
bat          0.00    0.00  133.24    0.00    3.13    0.00    0.00   136.37
bir          0.00    0.00   11.08    3.16    1.79    2.09    0.48    18.61
non          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
mix          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
bat          0.00    0.00    0.00    0.00    1.31    0.00    0.00     1.31
non          0.00    0.00    0.00    0.00    0.00    0.00    1.23     1.23
bat          0.00    0.00    1.81    0.00    2.84    0.00    0.00     4.65
non          0.00    0.00    1.18    0.00    0.73    0.00    0.00     1.91
bir          0.00    0.00    0.00    0.00    1.40    0.00    0.00     1.40
bat          0.00    0.00    8.16    1.50    1.22    0.00    0.00    10.88
mix          0.00    0.55    0.00    0.00    0.00    0.00    0.00     0.55
non          0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
Thanks! Silvia.

-- 
View this message in context: 
http://www.nabble.com/Problem-with-rpart-tf3818436.html#a10810625
Sent from the R help mailing list archive at Nabble.com.



Re: [R] iplots problem

2007-05-25 Thread mister_bluesman

Hi

How did you 'load' Sun's Java in R?

Many thanks



ryestone wrote:
> 
> Did you load Sun's java? Try that and also try rebooting your machine.
> This worked for me.
> 
> 
> mister_bluesman wrote:
>> 
>> Hi. I try to load iplots using the following commands
>> 
>>> library(rJava)
>>> library(iplots)
>> 
>> but then I get the following error:
>> 
>> Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
>> Cannot create Java Virtual Machine
>> Error in library(iplots) : .First.lib failed for 'iplots'
>> 
>> What do I have to do to correct this?
>> 
>> I have jdk1.6 and jre1.6 installed on my windows machine
>> Thanks
>> 
> 
> 
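[Editor's note: "loading" here just means installing Sun's JRE on the machine; nothing extra is loaded inside R beyond library(rJava). Once installed, one way to check from R which JVM rJava actually starts (a sketch; nothing below is from the original thread):]

```r
Sys.getenv("JAVA_HOME")   # empty, or pointing at the wrong JRE, is a
                          # common cause of "Cannot create Java Virtual Machine"
library(rJava)
.jinit()                  # starts the JVM explicitly
# Ask the running JVM for its version string:
.jcall("java/lang/System", "S", "getProperty", "java.version")
```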

-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10810602
Sent from the R help mailing list archive at Nabble.com.



Re: [R] covariance question which has nothing to do with R

2007-05-25 Thread toby909
while my other program is running.

The reference I mentioned previously addresses exactly this. Snijders and 
Bosker's Multilevel Analysis book discusses this on pages 31 and 33, 
sections 3.6.2 and 3.6.3.

When you say that the Xs are correlated, you need to say according to 
which structure they are correlated:


(1,X1,Y1)
(1,X2,Y2)
(1,X3,Y3)
.
.
.
(1,X55,Y55)
(2,X56,Y56)
(2,X57,Y57)
.
.
.
(2,...

To pick a real-world example: one row represents a person, or a stock, and 
the first column indicates which organization or country that 
person/stock belongs to. The Xs are then correlated within the organization/country.
You will have two covariances: a within-country and a between-country 
covariance of stocks.
This can be implemented in R manually, giving method-of-moments estimates, or 
the gls function can be used, giving ML or REML estimates.
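[Editor's note: the gls route mentioned above can be sketched with the nlme package, using a compound-symmetry correlation structure within group. The data and variable names (dat, y, x, group) are hypothetical placeholders:]

```r
library(nlme)
# Hypothetical data: 'group' is the country/organization column;
# corCompSymm fits a single within-group correlation rho.
fit <- gls(y ~ x, data = dat,
           correlation = corCompSymm(form = ~ 1 | group))
summary(fit)    # the "Rho" line is the estimated within-group correlation
intervals(fit)  # confidence intervals, including one for rho
```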

I am not a post doc, just a pre master :-)

Toby


Leeds, Mark (IED) wrote:
> This is a covariance calculation question so nothing to do with R but
> maybe someone could help me anyway.
> 
> Suppose, I have two random variables X and Y whose means are both known
> to be zero and I want to get an estimate of their covariance.
> 
> I have n sample pairs 
> 
> (X1,Y1)
> (X2,Y2)
> .
> .
> .
> .
> .
> (Xn,Yn)
> 
> , so that the covariance estimate is clearly 1/n *(sum from i = 1 to n
> of ( X_i*Y_i) ) 
> 
> But, suppose that it is know that the X_i are positively correlated with
> each other and that the Y_i are independent
> of each other.
> 
> Then, does this change the formula for the covariance estimate at all ?
> Intuitively, I would think that, if the X_i's are positively
> correlated , then something should change because there is less info
> there than if they were independent but i'm not sure what should change
> and I couldn't find it in a book.  
> 
> I can assume that the correlation between the X_i's is rho if this makes
> things easier ? Thanks.
> 
> References are appreciated also.
> 
> 
> 
>



[R] Scale mixture of normals

2007-05-25 Thread Anup Nandialath
Dear Friends,

Is there an R package which implements regression
models with error distributions following a scale
mixture of normals? 
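[Editor's note: for what it's worth, the t distribution is itself a scale mixture of normals (normal with chi-square-mixed variance), so errors of this type can at least be simulated and explored directly. A sketch of the construction, not a pointer to any specific package:]

```r
set.seed(42)
n <- 200; nu <- 4
x <- rnorm(n)
# Scale-mixture construction: w_i = nu / chisq_nu and e_i | w_i ~ N(0, w_i)
# gives e_i ~ t with nu degrees of freedom marginally.
w <- nu / rchisq(n, df = nu)
e <- rnorm(n, sd = sqrt(w))
y <- 1 + 2 * x + e
coef(lm(y ~ x))   # OLS remains unbiased under heavy tails, just inefficient
```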

Thanks

Anup



Re: [R] email the silly fuckers instead.

2007-05-25 Thread mister_bluesman

What? I'm not sure about anyone else but I have absolutely no idea what
you're going on about.



[R] email the silly fuckers instead.

2007-05-25 Thread foxy
To all financial investors! This stock will take off! The rally starts
Tuesday, May 29!

Company: Talktech Telemedia
Ticker: OYQ / OYQ.F / OYQ.DE
WKN: 278104
ISIN: US8742781045

Price: 0.54
5-day forecast: 1.75

A realized price gain of 450% in 5 days! OYQ will take off like a rocket!
THE RALLY STARTS TUESDAY, MAY 29!

But thanks for drawing my attention to it anyway. These universities
attract talented people who develop new ideas which can be commercially
developed in the surrounding community, creating high-paying jobs. A new
breakout from the main PKK tube has been advancing down the pali in the
past three weeks, more than a kilometer west of the Campout tube.
This article was written by scientists at the U. We are also encouraging
licence fee payers to respond to the consultation.
Its well regarded as being generally independant news reviewer, and has
clashed with various governments over the years on the indepenance and
nature of its reporting.
shtml Send This Article to a Friend: Your Name: Your E-Mail: Please fill
in all fields.
including renewable energy research and development. to be fully
involved in the digital revolution that is sweeping the world.
regardless of their economic circumstances. The second principle is that
innovation, entrepreneurship and risk-taking need to be encouraged,
nurtured and rewarded.
Frank Pemberton from Cornell Cooperative Extension on "Home Lawn Care
and Garden Tips. It is located at Kamokuna, about midway between the two
older entries. Whether students study STEM subjects or other fields of
study we need to increase the incentives for parents to save for their
children's college education. com Nome Search Powered by :: Free RSS
news Add RSS news to your web site Garden news vertical portal can now
be syndicated quickly and easily using our new  Really Simple
Syndication feeds. including renewable energy research and development.
We all want a higher standard of living for ourselves and our children.
This is the bad news. Cross off those plans to tool to the shore in a
convertible.
They think it is too hard to change. In fact, spectacular eruptions like
that of Mount Pinatubo are demonstrated to contribute to global cooling
through the injection of solar energy reflecting ash and other small
particles. It is important for us to have a concrete, shared
understanding of where we want to go.
Educator Mary Jean LeTendre said it best when she observed, "America's
future walks through the doors of our schools each day. " would have
been answered differently.
Vintage apron show and open house at May Creek Home :: Web Directory ::
Garden News :: Free RSS news :: Free Newsletter :: Tell a Friend
Clientfinder. Bush vows to work with allies for tougher response to
Iran's . This is the bad news.
In fact, spectacular eruptions like that of Mount Pinatubo are
demonstrated to contribute to global cooling through the injection of
solar energy reflecting ash and other small particles. Garden tool sets
Home :: Web Directory :: Garden News :: Free RSS news :: Free Newsletter
:: Tell a Friend Clientfinder. We must seize the moment. will support
three new endowed professorships at UH in STEM disciplines. Recipient
Name: Recipient E-Mail: Personal Message: Discuss This Story:
HawaiiThreads. To achieve this vision, we have to change our economy
from one based on land development, to one fueled by innovation and the
new ideas generated by our universities and a highly-trained workforce.
And, to do this while preserving those aspects of life that make our
island home so special. ro. The optimization campaign for the EHConstruct
site.
com Nome Search Powered by :: Free RSS news Add RSS news to your web
site Garden news vertical portal can now be syndicated quickly and
easily using our new  Really Simple Syndication feeds.
The state's High Technology Development Corporation would become the
Center's master lessee.
and by their ability to work and communicate effectively with others
from around the world. These skills are known collectively as STEM.
Keep up with the latest Advogato features and changes by checking the
Advogato  status blog.
Simply put, success means producing a constantly rising standard of
living for all Hawai'i's people while using fewer natural resources,
including land.
" "I want to lead us down the path of innovation, because it is the path
of hope and opportunity," she said.
Sapling balm after storm fury Home :: Web Directory :: Garden News ::
Free RSS news :: Free Newsletter :: Tell a Friend Clientfinder.
Senator Daniel Inouye about this proposal and a follow-up conversation
with his senior staff about how the Senator can help us achieve our
vision for a new economic future for West O'ahu.
Educator Mary Jean LeTendre said it best when she observed, "America's
future walks through the doors of our schools each day. Please welcome
students from McKinley High School and their teacher Osa Tui and FIRST
Waialua High School students and their teacher Glenn Lee

Re: [R] File path expansion

2007-05-25 Thread Duncan Murdoch
On 5/25/2007 1:09 PM, Prof Brian Ripley wrote:
> On Fri, 25 May 2007, Martin Maechler wrote:
> 
>>
>>> path.expand("~")
>> [1] "/home/maechler"
> 
> Yes, but beware that may not do what you want on Windows in R <= 2.5.0, 
> since someone changed the definition of 'home' but not path.expand.

A more basic problem is that the definition of "~" in Windows is very 
ambiguous.  Is it my Cygwin home directory, where "cd ~" would take me 
while in Cygwin?  Is it my Windows CSIDL_PERSONAL folder, usually 
%HOMEDRIVE%/%HOMEPATH%/My Documents?  Is it the parent of that folder, 
%HOMEDRIVE%/%HOMEPATH%?

"~" is a shell concept that makes sense in Unix-like shells, but not in 
Windows.
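[Editor's note: the ambiguity is easy to inspect from within R itself; output is machine- and version-dependent:]

```r
path.expand("~")        # what R substitutes for "~"
Sys.getenv(c("HOME", "R_USER", "HOMEDRIVE", "HOMEPATH"))
normalizePath("~")      # the absolute path the OS resolves it to
```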

Duncan Murdoch



Re: [R] In which package is the errbar command located?

2007-05-25 Thread Charles C. Berry


help.search("errbar")

shows the one installed on my system (and might on yours), but

RSiteSearch("errbar")

shows that more than one package contains a function by that name.


On Fri, 25 May 2007, Judith Flores wrote:

>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                 (858) 534-2098
                                 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]       UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901



Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Ravi Varadhan
Mike,

Attached is an R function to do this, along with an example that will
reproduce the MathCad plot shown in your attached paper. I haven't checked
it thoroughly, but it seems to reproduce the MathCad example well.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mike Lawrence
Sent: Friday, May 25, 2007 1:55 PM
To: Lucke, Joseph F
Cc: Rhelp
Subject: Re: [R] Calculation of ratio distribution properties

According to the paper I cited, there is controversy over the  
sufficiency of Hinkley's solution, hence their proposed more complete  
solution.

On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote:

> The exact ratio is given in
>
> On the Ratio of Two Correlated Normal Random Variables, D. V.  
> Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639.
>

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein



Re: [R] lme with corAR1 errors - can't find AR coefficient in output

2007-05-25 Thread Stephen Weigand
Millo,

On 5/24/07, Millo Giovanni <[EMAIL PROTECTED]> wrote:

> Dear List,
>
> I am using the output of a ML estimation on a random effects model with
> first-order autocorrelation to make a further conditional test. My model
> is much like this (which reproduces the method on the famous Grunfeld
> data, for the econometricians out there it is Table 5.2 in Baltagi):
>
> library(Ecdat)
> library(nlme)
> data(Grunfeld)
> mymod <- lme(inv ~ value + capital, data = Grunfeld, random = ~1|firm,
>              correlation = corAR1(0, ~year|firm))
>
> Embarrassing as it may be, I can find the autoregressive parameter
> ('Phi', if I get it right) in the printout of summary(mymod) but I am
> utterly unable to locate the corresponding element in the lme or
> summary.lme objects.
>
> Any help appreciated. This must be something stupid I'm overlooking,
> either in str(mymod) or in the help files, but it's a huge problem for
> me.
>
>

Try

coef(mymod$modelStruct$corStruct,
     unconstrained = FALSE)

Stephen
-- 
Rochester, Minn. USA



Re: [R] iplots problem

2007-05-25 Thread ryestone

Did you load Sun's java? Try that and also try rebooting your machine. This
worked for me.


mister_bluesman wrote:
> 
> Hi. I try to load iplots using the following commands
> 
>> library(rJava)
>> library(iplots)
> 
> but then I get the following error:
> 
> Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
> Cannot create Java Virtual Machine
> Error in library(iplots) : .First.lib failed for 'iplots'
> 
> What do I have to do to correct this?
> 
> I have jdk1.6 and jre1.6 installed on my windows machine
> Thanks
> 

-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10808131
Sent from the R help mailing list archive at Nabble.com.
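[Editorial note] The fix described in this thread can be sketched as follows; the JDK path below is a hypothetical example, so point it at wherever Sun's JDK is actually installed on your machine:

```r
# Make rJava pick up Sun's JVM instead of Microsoft's by setting
# JAVA_HOME before rJava initializes (hypothetical install path).
Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jdk1.6.0_01")
library(rJava)    # .jinit() should now find Sun's JVM
library(iplots)
```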



[R] /tmp/ gets filled up fast

2007-05-25 Thread Alessandro Gagliardi
Dear useRs,

I'm running some pretty big R scripts using a PBS that calls upon the
RMySQL library and it's filling up the /tmp/ directory with Rtmp*
files.  I thought the problem might have come from scripts crashing
and therefore not getting around to executing the dbDisconnect()
function, but I just ran a script that exited correctly and it still
left a giant file in /tmp/ without cleaning up after itself.  I'm
running all of my scripts using the --vanilla tag (saving the data I
need either into a separate .RData file or into the MySQL database).
Is there some other tag I should be using or some command I can put at
the end of my script to remove the Rtmp file when it's done?

Thank you,
-Alessandro

-- Forwarded message --
From: Yaroslav Halchenko
Date: May 24, 2007 7:42 PM
Subject: Re: mysql> Access denied for user
To: Alessandro Gagliardi

Alessandro
We need to resolve this issue in a better way somehow... /tmp/ gets filled
up fast with your R tasks...

*$> du -scm RtmpvKEqwr/
804 RtmpvKEqwr/
804 total

[EMAIL PROTECTED]:/tmp
$> ls -ld RtmpvKEqwr/
0 drwx-- 2 eklypse eklypse 80 2007-05-24 15:18 RtmpvKEqwr//

[EMAIL PROTECTED]:/tmp
$> df .
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sda6  1052184   1052184 0 100% /tmp


I had to remove it...

Check whether there is a way to use some other directory, maybe? Or to
clean up properly?

On Wed, 23 May 2007, Alessandro Gagliardi wrote:

> Well, this is a real problem then.  Because I'm generating tables that
> R cannot get into MySQL because they are too big and it times out
> before it's done.
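[Editorial note] One way to steer R's per-session Rtmp* directory somewhere roomier is the TMPDIR environment variable, which R reads at startup; a sketch (the /scratch path is a hypothetical example):

```r
# R creates its session temporary directory (the Rtmp* folder) under
# TMPDIR (or TMP/TEMP) as read at startup, so export TMPDIR in the PBS
# script before invoking R, e.g.:  TMPDIR=/scratch/$USER R --vanilla ...
# Inside a session, tempdir() shows where temporary files are going:
td <- tempdir()
print(td)
# Files you create there can be removed before the session ends:
f <- file.path(td, "scratch.bin")
writeBin(raw(16), f)
unlink(f)
stopifnot(!file.exists(f))
```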



Re: [R] normality tests [Broadcast]

2007-05-25 Thread gatemaze
Thank you all for your replies, they have been most useful... well,
in my case I have chosen to do some parametric tests (more precisely,
correlations and linear regressions among some variables)... so it
would be nice if I had an extra bit of support for my decisions... If I
understood your replies well, I shouldn't pay so much attention
to the normality tests, so it wouldn't matter which one/ones
I use to report... but should rather focus on issues such as the power
of the test...

Thanks again.

On 25/05/07, Lucke, Joseph F <[EMAIL PROTECTED]> wrote:
>  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
> non-normality for significance testing. It's the sample means that have
> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
> for normality prior to choosing a test statistic is generally not a good
> idea.
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Friday, May 25, 2007 12:04 PM
> To: [EMAIL PROTECTED]; Frank E Harrell Jr
> Cc: r-help
> Subject: Re: [R] normality tests [Broadcast]
>
> From: [EMAIL PROTECTED]
> >
> > On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
> > > [EMAIL PROTECTED] wrote:
> > > > Hi all,
> > > >
> > > > apologies for seeking advice on a general stats question. I ve run
>
> > > > normality tests using 8 different methods:
> > > > - Lilliefors
> > > > - Shapiro-Wilk
> > > > - Robust Jarque Bera
> > > > - Jarque Bera
> > > > - Anderson-Darling
> > > > - Pearson chi-square
> > > > - Cramer-von Mises
> > > > - Shapiro-Francia
> > > >
> > > > All show that the null hypothesis that the data come from a normal
>
> > > > distro cannot be rejected. Great. However, I don't think
> > it looks nice
> > > > to report the values of 8 different tests on a report. One note is
>
> > > > that my sample size is really tiny (less than 20
> > independent cases).
> > > > Without wanting to start a flame war, are there any
> > advices of which
> > > > one/ones would be more appropriate and should be reported
> > (along with
> > > > a Q-Q plot). Thank you.
> > > >
> > > > Regards,
> > > >
> > >
> > > Wow - I have so many concerns with that approach that it's
> > hard to know
> > > where to begin.  But first of all, why care about
> > normality?  Why not
> > > use distribution-free methods?
> > >
> > > You should examine the power of the tests for n=20.  You'll probably
>
> > > find it's not good enough to reach a reliable conclusion.
> >
> > And wouldn't it be even worse if I used non-parametric tests?
>
> I believe what Frank meant was that it's probably better to use a
> distribution-free procedure to do the real test of interest (if there is
> one) instead of testing for normality, and then use a test that assumes
> normality.
>
> I guess the question is, what exactly do you want to do with the outcome
> of the normality tests?  If those are going to be used as basis for
> deciding which test(s) to do next, then I concur with Frank's
> reservation.
>
> Generally speaking, I do not find goodness-of-fit for distributions very
> useful, mostly for the reason that failure to reject the null is no
> evidence in favor of the null.  It's difficult for me to imagine why
> "there's insufficient evidence to show that the data did not come from a
> normal distribution" would be interesting.
>
> Andy
>
>
> > >
> > > Frank
> > >
> > >
> > > --
> > > Frank E Harrell Jr   Professor and Chair   School
> > of Medicine
> > >   Department of Biostatistics
> > Vanderbilt University
> > >
> >
> >
> > --
> > yianni
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
>
>
> 
> --
> Notice:  This e-mail message, together with any
> attachments,...{{dropped}}
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
yianni



[R] iPlots package

2007-05-25 Thread ryestone

I am having trouble connecting two points in iplot. In the normal plot
command I would use segments(). I know there is a function ilines() but can
you just enter coordinates of 2 points?
-- 
View this message in context: 
http://www.nabble.com/iPlots-package-tf3817683.html#a10808180
Sent from the R help mailing list archive at Nabble.com.



Re: [R] In which package is the errbar command located?

2007-05-25 Thread John Kane
Hmisc
--- Judith Flores <[EMAIL PROTECTED]> wrote:



Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Mike Lawrence
According to the paper I cited, there is controversy over the  
sufficiency of Hinkley's solution, hence their proposed more complete  
solution.

On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote:

> The exact ratio is given in
>
> On the Ratio of Two Correlated Normal Random Variables, D. V.  
> Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639.
>

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein



Re: [R] xyplot: different scales accross rows, same scales within rows

2007-05-25 Thread Deepayan Sarkar
On 5/25/07, Marta Rufino <[EMAIL PROTECTED]> wrote:
> Dear list members,
>
>
> I would like to set up a multiple panel in xyplots, with the same scale
> for all colunms in each row, but different accross rows.
> relation="free" would set up all x or y scales free... which is not what
> I want :-(
>
> Is this possible?

It's possible, but requires some abuse of the Trellis design, which
doesn't really allow for such use. See

https://stat.ethz.ch/pipermail/r-help/2004-October/059396.html

-Deepayan



Re: [R] xyplot: different scales accross rows, same scales within rows

2007-05-25 Thread Gabor Grothendieck
xlim= can take a list:

# CO2 is built into R
library(lattice)
xlim <- rep(list(c(0, 1000), c(0, 2000)), each = 2)
xyplot(uptake ~ conc | Type * Treatment, data = CO2,
scales = list(relation = "free"), xlim = xlim)


On 5/25/07, Marta Rufino <[EMAIL PROTECTED]> wrote:
> Dear list members,
>
>
> I would like to set up a multiple panel in xyplots, with the same scale
> for all colunms in each row, but different accross rows.
> relation="free" would set up all x or y scales free... which is not what
> I want :-(
>
> Is this possible?
>
>
> Thank you in advance,
> Best wishes,
> Marta
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] normality tests [Broadcast]

2007-05-25 Thread Lucke, Joseph F
 Most standard tests, such as t-tests and ANOVA, are fairly resistant to
non-normality for significance testing. It's the sample means that have
to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
for normality prior to choosing a test statistic is generally not a good
idea. 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
Sent: Friday, May 25, 2007 12:04 PM
To: [EMAIL PROTECTED]; Frank E Harrell Jr
Cc: r-help
Subject: Re: [R] normality tests [Broadcast]

From: [EMAIL PROTECTED]
> 
> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
> > [EMAIL PROTECTED] wrote:
> > > Hi all,
> > >
> > > apologies for seeking advice on a general stats question. I ve run

> > > normality tests using 8 different methods:
> > > - Lilliefors
> > > - Shapiro-Wilk
> > > - Robust Jarque Bera
> > > - Jarque Bera
> > > - Anderson-Darling
> > > - Pearson chi-square
> > > - Cramer-von Mises
> > > - Shapiro-Francia
> > >
> > > All show that the null hypothesis that the data come from a normal

> > > distro cannot be rejected. Great. However, I don't think
> it looks nice
> > > to report the values of 8 different tests on a report. One note is

> > > that my sample size is really tiny (less than 20
> independent cases).
> > > Without wanting to start a flame war, are there any
> advices of which
> > > one/ones would be more appropriate and should be reported
> (along with
> > > a Q-Q plot). Thank you.
> > >
> > > Regards,
> > >
> >
> > Wow - I have so many concerns with that approach that it's
> hard to know
> > where to begin.  But first of all, why care about
> normality?  Why not
> > use distribution-free methods?
> >
> > You should examine the power of the tests for n=20.  You'll probably

> > find it's not good enough to reach a reliable conclusion.
> 
> And wouldn't it be even worse if I used non-parametric tests?

I believe what Frank meant was that it's probably better to use a
distribution-free procedure to do the real test of interest (if there is
one) instead of testing for normality, and then use a test that assumes
normality.

I guess the question is, what exactly do you want to do with the outcome
of the normality tests?  If those are going to be used as basis for
deciding which test(s) to do next, then I concur with Frank's
reservation.

Generally speaking, I do not find goodness-of-fit for distributions very
useful, mostly for the reason that failure to reject the null is no
evidence in favor of the null.  It's difficult for me to imagine why
"there's insufficient evidence to show that the data did not come from a
normal distribution" would be interesting.

Andy

 
> >
> > Frank
> >
> >
> > --
> > Frank E Harrell Jr   Professor and Chair   School 
> of Medicine
> >   Department of Biostatistics   
> Vanderbilt University
> >
> 
> 
> --
> yianni
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 



--
Notice:  This e-mail message, together with any
attachments,...{{dropped}}



[R] In which package is the errbar command located?

2007-05-25 Thread Judith Flores



Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Mike Lawrence

MathCad code failed to attach last time. Here it is.


On 25-May-07, at 2:24 PM, Mike Lawrence wrote:


I came across this reference:
http://www.informaworld.com/smpp/content?content=10.1080/03610920600683689


The authors sent me code (attached with permission) in MathCad to  
perform the calculations in which I'm interested. However, I do not  
have MathCad nor experience with its syntax, so I thought I'd send  
the code to the list to see if anyone with more experience with R  
and MathCad would be interested in making this code into a function  
or package of some sort


Mike

Begin forwarded message:


From: "Noyan Turkkan" <[EMAIL PROTECTED]>
Date: May 25, 2007 11:09:02 AM ADT
To: "Mike Lawrence" <[EMAIL PROTECTED]>
Subject: Rép. : R code for 'Density of the Ratio of Two Normal  
Random Variables'?


Hi Mike
I do not know if anyone coded my approach in R. However, if you
have access to MathCad, I am including a MathCad file (also in
PDF) which computes the density of the ratio of 2 dependent
normal variables, its mean & variance. If you do not have access to
MathCad, you will see that all the computations can be easily
programmed in R, as I replaced the Hypergeometric function by the Erf
function. I am not very familiar with R but the erf function may
be programmed as: (erf <- function(x) 2*pnorm(x*sqrt(2)) - 1).
Good luck.


Noyan Turkkan, ing.
Professeur titulaire & directeur / Professor & Head
Dépt. de génie civil / Civil Eng. Dept.
Faculté d'ingénierie / Faculty of Engineering
Université de Moncton
Moncton, N.B., Canada, E1A 3E9



>>> Mike Lawrence <[EMAIL PROTECTED]> 05/25/07 9:20 am >>>
Hi Dr. Turkkan,

I am working on a problem that necessitates the estimation of the
mean and variance of the density of two dependent normal random
variables and in my search for methods to achieve such estimation I
came across your paper 'Density of the Ratio of Two Normal Random
Variables' (2006). I'm not a statistics or math expert by any means,
but I am quite familiar with the R programming language; do you
happen to know whether anyone has coded your approach for R yet?

Cheers,

Mike

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein




--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein




--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein
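[Editorial note] Turkkan's erf substitution quoted above is easy to sanity-check in R; a minimal sketch:

```r
# erf via pnorm, exactly as suggested in the quoted message
erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1
# known values: erf(0) = 0, erf(Inf) = 1, and erf is an odd function
stopifnot(abs(erf(0)) < 1e-12,
          abs(erf(Inf) - 1) < 1e-12,
          abs(erf(0.5) + erf(-0.5)) < 1e-12)
```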




Re: [R] File path expansion

2007-05-25 Thread Prof Brian Ripley
On Fri, 25 May 2007, Martin Maechler wrote:

>
>> path.expand("~")
> [1] "/home/maechler"

Yes, but beware that may not do what you want on Windows in R <= 2.5.0, 
since someone changed the definition of 'home' but not path.expand.

>
>> "RobMcG" == McGehee, Robert <[EMAIL PROTECTED]>
>> on Fri, 25 May 2007 11:44:27 -0400 writes:
>
>RobMcG> R-Help,
>RobMcG> I discovered a "mis-feature" is ghostscript, which is used by the 
> bitmap
>RobMcG> function. It seems that specifying file names in the form 
> "~/abc.png"
>RobMcG> rather than "/home/directory/abc.png" causes my GS to crash when I 
> open
>RobMcG> the bitmap device on my Linux box.
>
>RobMcG> The easiest solution would seem to be to intercept any file names 
> in the
>RobMcG> form "~/abc.png" and replace the "~" with the user's home 
> directory. I'm
>RobMcG> sure I could come up with something involving regular expressions 
> and
>RobMcG> system calls to do this in Linux, but even that might not be system
>RobMcG> independent. So, I wanted to see if anyone knew of a native R 
> solution
>RobMcG> of converting "~" to its full path expansion.
>
>RobMcG> Thanks,
>RobMcG> Robert
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] normality tests [Broadcast]

2007-05-25 Thread Liaw, Andy
From: [EMAIL PROTECTED]
> 
> On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
> > [EMAIL PROTECTED] wrote:
> > > Hi all,
> > >
> > > apologies for seeking advice on a general stats question. I ve run
> > > normality tests using 8 different methods:
> > > - Lilliefors
> > > - Shapiro-Wilk
> > > - Robust Jarque Bera
> > > - Jarque Bera
> > > - Anderson-Darling
> > > - Pearson chi-square
> > > - Cramer-von Mises
> > > - Shapiro-Francia
> > >
> > > All show that the null hypothesis that the data come from a normal
> > > distro cannot be rejected. Great. However, I don't think 
> it looks nice
> > > to report the values of 8 different tests on a report. One note is
> > > that my sample size is really tiny (less than 20 
> independent cases).
> > > Without wanting to start a flame war, are there any 
> advices of which
> > > one/ones would be more appropriate and should be reported 
> (along with
> > > a Q-Q plot). Thank you.
> > >
> > > Regards,
> > >
> >
> > Wow - I have so many concerns with that approach that it's 
> hard to know
> > where to begin.  But first of all, why care about 
> normality?  Why not
> > use distribution-free methods?
> >
> > You should examine the power of the tests for n=20.  You'll probably
> > find it's not good enough to reach a reliable conclusion.
> 
> And wouldn't it be even worse if I used non-parametric tests?

I believe what Frank meant was that it's probably better to use a
distribution-free procedure to do the real test of interest (if there is
one) instead of testing for normality, and then use a test that assumes
normality.

I guess the question is, what exactly do you want to do with the outcome
of the normality tests?  If those are going to be used as basis for
deciding which test(s) to do next, then I concur with Frank's
reservation.

Generally speaking, I do not find goodness-of-fit for distributions very
useful, mostly for the reason that failure to reject the null is no
evidence in favor of the null.  It's difficult for me to imagine why
"there's insufficient evidence to show that the data did not come from a
normal distribution" would be interesting.

Andy

 
> >
> > Frank
> >
> >
> > --
> > Frank E Harrell Jr   Professor and Chair   School 
> of Medicine
> >   Department of Biostatistics   
> Vanderbilt University
> >
> 
> 
> -- 
> yianni
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}



[R] xyplot: different scales accross rows, same scales within rows

2007-05-25 Thread Marta Rufino
Dear list members,


I would like to set up a multiple panel in xyplots, with the same scale 
for all colunms in each row, but different accross rows.
relation="free" would set up all x or y scales free... which is not what 
I want :-(

Is this possible?


Thank you in advance,
Best wishes,
Marta



Re: [R] windows to unix

2007-05-25 Thread Barry Rowlingson
Martin Maechler wrote:
>> "Erin" == Erin Hodgess <[EMAIL PROTECTED]>
>> on Fri, 25 May 2007 06:10:10 -0500 writes:
> 
> Erin> Dear R People:
> Erin> Is there any way to take a Windows version of R, compiled from 
> source, 
> Erin> compress it, and put it on a Unix-like environment, please?

> Just 'zip' the corresponding directory and copy the zip
> file to your "unix like" environment.
> 
>

  You can take a Windows-compiled R to Unix, but you can't make it work.

  The big unasked question is 'What is this unix-like environment?'.

  Linux isn't Unix, so maybe you mean that, in which case you'll not 
make your Windows compiled R run. Not without 'Wine' or some other layer 
of obfuscation.

Barry



Re: [R] normality tests

2007-05-25 Thread gatemaze
On 25/05/07, Frank E Harrell Jr <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Hi all,
> >
> > apologies for seeking advice on a general stats question. I ve run
> > normality tests using 8 different methods:
> > - Lilliefors
> > - Shapiro-Wilk
> > - Robust Jarque Bera
> > - Jarque Bera
> > - Anderson-Darling
> > - Pearson chi-square
> > - Cramer-von Mises
> > - Shapiro-Francia
> >
> > All show that the null hypothesis that the data come from a normal
> > distro cannot be rejected. Great. However, I don't think it looks nice
> > to report the values of 8 different tests on a report. One note is
> > that my sample size is really tiny (less than 20 independent cases).
> > Without wanting to start a flame war, are there any advices of which
> > one/ones would be more appropriate and should be reported (along with
> > a Q-Q plot). Thank you.
> >
> > Regards,
> >
>
> Wow - I have so many concerns with that approach that it's hard to know
> where to begin.  But first of all, why care about normality?  Why not
> use distribution-free methods?
>
> You should examine the power of the tests for n=20.  You'll probably
> find it's not good enough to reach a reliable conclusion.

And wouldn't it be even worse if I used non-parametric tests?

>
> Frank
>
>
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
>   Department of Biostatistics   Vanderbilt University
>


-- 
yianni



Re: [R] File path expansion

2007-05-25 Thread Martin Maechler

> path.expand("~")
[1] "/home/maechler"

> "RobMcG" == McGehee, Robert <[EMAIL PROTECTED]>
> on Fri, 25 May 2007 11:44:27 -0400 writes:

RobMcG> R-Help,
RobMcG> I discovered a "mis-feature" in ghostscript, which is used by the
RobMcG> bitmap function. It seems that specifying file names in the form
RobMcG> "~/abc.png" rather than "/home/directory/abc.png" causes my GS to
RobMcG> crash when I open the bitmap device on my Linux box.

RobMcG> The easiest solution would seem to be to intercept any file names
RobMcG> in the form "~/abc.png" and replace the "~" with the user's home
RobMcG> directory. I'm sure I could come up with something involving
RobMcG> regular expressions and system calls to do this in Linux, but even
RobMcG> that might not be system independent. So, I wanted to see if anyone
RobMcG> knew of a native R solution of converting "~" to its full path
RobMcG> expansion.

RobMcG> Thanks,
RobMcG> Robert



Re: [R] normality tests

2007-05-25 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
> Hi all,
> 
> apologies for seeking advice on a general stats question. I've run
> normality tests using 8 different methods:
> - Lilliefors
> - Shapiro-Wilk
> - Robust Jarque Bera
> - Jarque Bera
> - Anderson-Darling
> - Pearson chi-square
> - Cramer-von Mises
> - Shapiro-Francia
> 
> All show that the null hypothesis that the data come from a normal
> distro cannot be rejected. Great. However, I don't think it looks nice
> to report the values of 8 different tests on a report. One note is
> that my sample size is really tiny (less than 20 independent cases).
> Without wanting to start a flame war, is there any advice on which
> one/ones would be more appropriate and should be reported (along with
> a Q-Q plot)? Thank you.
> 
> Regards,
> 

Wow - I have so many concerns with that approach that it's hard to know 
where to begin.  But first of all, why care about normality?  Why not 
use distribution-free methods?

You should examine the power of the tests for n=20.  You'll probably 
find it's not good enough to reach a reliable conclusion.

Frank


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] windows to unix

2007-05-25 Thread Martin Maechler
> "Erin" == Erin Hodgess <[EMAIL PROTECTED]>
> on Fri, 25 May 2007 06:10:10 -0500 writes:

Erin> Dear R People:
Erin> Is there any way to take a Windows version of R, compiled from 
source, 
Erin> compress it, and put it on a Unix-like environment, please?

Since nobody has answered yet, let a die-hard non-windows user
try:

Just 'zip' the corresponding directory and copy the zip
file to your "unix like" environment.

I assume the only things this does not contain
would be the
- registry entries (which used to be optional anyway;
I'm not sure if that's still true) 
- desktop links to R
- startup menu links to R

but the last two can easily be recreated after people copy the
zip file and unpack it in "their windows environment" --
which I assume is the purpose of the whole procedure.

{Please reply to R-help, not me; I am *the* windows-non-expert ...}

Martin


Erin> thanks in advance,
Erin> Sincerely,
Erin> Erin Hodgess
Erin> Associate Professor
Erin> Department of Computer and Mathematical Sciences
Erin> University of Houston - Downtown
Erin> mailto: [EMAIL PROTECTED]



Re: [R] trouble with snow and Rmpi

2007-05-25 Thread Ramon Diaz-Uriarte
Dear Erin,

What operating system are you trying this on? Windows? In Linux you
definitely don't need MPICH2 but, rather, LAM/MPI.

Best,

R.

On 5/25/07, Erin Hodgess <[EMAIL PROTECTED]> wrote:
> Dear R People:
>
> I am having some trouble with the snow package.
>
> It requires MPICH2 and Rmpi.
>
> Rmpi is fine.  However, I downloaded the MPICH2 package, and installed.
>
> There is no mpicc, mpirun, etc.
>
> Does anyone have any suggestions, please?
>
> Thanks in advance!
>
> Sincerely,
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: [EMAIL PROTECTED]
>
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz



Re: [R] File path expansion

2007-05-25 Thread Gabor Grothendieck
Try

?path.expand

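For example, expanding the name before it reaches ghostscript sidesteps the crash; the expanded directory shown in the comment is hypothetical and system-dependent:

```r
## path.expand() is the native R way to resolve a leading "~"
f <- path.expand("~/abc.png")
f                            # e.g. "/home/robert/abc.png" (depends on the system)

## pass the expanded name to bitmap() so ghostscript never sees the "~"
## (call commented out here since it requires ghostscript to be installed)
# bitmap(f, type = "png256")
```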
On 5/25/07, McGehee, Robert <[EMAIL PROTECTED]> wrote:
> R-Help,
> I discovered a "mis-feature" in ghostscript, which is used by the bitmap
> function. It seems that specifying file names in the form "~/abc.png"
> rather than "/home/directory/abc.png" causes my GS to crash when I open
> the bitmap device on my Linux box.
>
> The easiest solution would seem to be to intercept any file names in the
> form "~/abc.png" and replace the "~" with the user's home directory. I'm
> sure I could come up with something involving regular expressions and
> system calls to do this in Linux, but even that might not be system
> independent. So, I wanted to see if anyone knew of a native R solution
> of converting "~" to its full path expansion.
>
> Thanks,
> Robert
>
>



[R] File path expansion

2007-05-25 Thread McGehee, Robert
R-Help,
I discovered a "mis-feature" in ghostscript, which is used by the bitmap
function. It seems that specifying file names in the form "~/abc.png"
rather than "/home/directory/abc.png" causes my GS to crash when I open
the bitmap device on my Linux box.

The easiest solution would seem to be to intercept any file names in the
form "~/abc.png" and replace the "~" with the user's home directory. I'm
sure I could come up with something involving regular expressions and
system calls to do this in Linux, but even that might not be system
independent. So, I wanted to see if anyone knew of a native R solution
of converting "~" to its full path expansion.

Thanks,
Robert



Re: [R] Read in 250K snp chips

2007-05-25 Thread James W. MacDonald
bhiggs wrote:
> I'm having trouble getting summaries out of the 250K snp chips in R.  I'm
> using the oligo package and when I attempt to create the necessary SnpQSet
> object (to get genotype calls and intensities) using snprma, I encounter
> memory issues.
> 
> Anyone have an alternative package or workaround for these large snp chips?


Oligo is a Bioconductor package, so you should probably direct questions
to the Bioconductor mailing list rather than R-help.

Best,

Jim


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623





[R] Cochran-Armitage

2007-05-25 Thread [EMAIL PROTECTED]
Hi,
 
I have to calculate a Cochran-Armitage test on my data. I don't know much 
about this test, so I have a few questions about the independence_test 
function. I have read the help and searched the archives, but this didn't 
help me...
 
- There are two ways: either with a formula (independence_test(tumor ~ 
dose, data = lungtumor, teststat = "quad")) or with a table 
(independence_test(table(lungtumor$dose, lungtumor$tumor), teststat = 
"quad")).
Why this difference? The two ways do not give the same results... Why?
 
- In the teststat option, what is the difference between "quad", "max" 
and "scalar"? Which is generally used?
 
- Why choose an approximate or asymptotic distribution if an exact test 
can be calculated?
 
- For a classical Cochran-Armitage test (nothing special was specified, 
so I guess I should do the most classical Cochran-Armitage test), 
which options should I use?
 
Thanks for your help.
 
Delphine Fontaine
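For comparison with whatever coin options end up being used, base R's prop.trend.test() (in the stats package) computes the classical Cochran-Armitage trend test directly; the counts below are invented for illustration:

```r
## Classical Cochran-Armitage trend test using base R (stats package).
## Hypothetical counts: tumours observed at three increasing dose levels.
events <- c(2, 9, 18)     # animals with tumours per dose group
n      <- c(50, 50, 50)   # animals per dose group

prop.trend.test(events, n, score = c(0, 1, 2))
## a small p-value indicates a linear trend in tumour proportion with dose
```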



[R] Read in 250K snp chips

2007-05-25 Thread bhiggs

I'm having trouble getting summaries out of the 250K snp chips in R.  I'm
using the oligo package and when I attempt to create the necessary SnpQSet
object (to get genotype calls and intensities) using snprma, I encounter
memory issues.

Anyone have an alternative package or workaround for these large snp chips?
-- 
View this message in context: 
http://www.nabble.com/Read-in-250K-snp-chips-tf3816761.html#a10805124
Sent from the R help mailing list archive at Nabble.com.



[R] Help with complex lme model fit

2007-05-25 Thread Colin Beale
 Hi R helpers,

 I'm trying to fit a rather complex model to some simulated data using
 lme and am not getting the correct results. It seems there might be some
 identifiability issues that could possibly be dealt with by specifying
 starting parameters - but I can't see how to do this. I'm comparing
 results from R to those obtained with GenStat...

 The raw data are available on the web at
http://cmbeale.freehostia.com/OutData.txt and can be read directly
into R using:

 gpdat <- read.table("http://cmbeale.freehostia.com/OutData.txt",
                     header = TRUE)
 gpdat$X7 <- as.factor(gpdat$X7)
 gpdat$X4 <- as.factor(gpdat$X4)
 rand_mat <- as.matrix(gpdat[, 11:26])
 gpdat <- groupedData(Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum | .g,
                      data = gpdat)


 the model fitted using:

 library(Matrix)
 library(nlme)

 m_sum <- rowSums(gpdat[,11:27])
 mod1 <- lme(fixed = Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum,
             random = pdBlocked(list(pdIdent(~ 1), pdIdent(~ X6 - 1),
                                     pdIdent(~ X7 - 1),
                                     pdIdent(~ rand_mat - 1))),
             data = gpdat)

 Which should recover the variance components:

 var_label          var_est
 rand_mat_scalar    0.00021983
 X6_scalar          0.62314002
 X7_scalar          0.03853604

 as recovered by GenStat and used to generate the dataset. Instead I
 get:

 X6         0.6231819
 X7         0.05221481
 rand_mat   1.377596e-11

 However, if I change or drop either X5 or X6, I then get much closer
 estimates to what is expected. For example:


 mod2 <- lme(fixed = Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum,
             random = pdBlocked(list(pdIdent(~ 1), pdIdent(~ X6 - 1),
                                     pdIdent(~ as.numeric(X7) - 1),
                                     pdIdent(~ rand_mat - 1))),
             data = gpdat)

 returns variance components:
 X6         0.6137986
 X7         Not meaningful
 rand_mat   0.0006119088

 which is much closer to those used to generate the dataset for the
 parameters that are now meaningful, and has appropriate random effect
 estimates for the -rand_mat columns (the variable of most interest
 here). This suggests to me that there is some identifiability issue
that
 might be helped by giving different starting values. Is this
possible?
 Or does anyone have any other suggestions?

 Thanks,

 Colin

 sessionInfo:
R version 2.5.0 (2007-04-23) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United Kingdom.1252;
LC_CTYPE=English_United Kingdom.1252;
LC_MONETARY=English_United Kingdom.1252;
LC_NUMERIC=C;
LC_TIME=English_United Kingdom.1252

attached base packages:
[1] "stats" "graphics"  "grDevices" "datasets"  "tcltk" "utils"
"methods"   "base" 

other attached packages:
   nlme  Matrix latticesvSocketsvIO  R2HTML
 svMisc   svIDE 
   "3.1-80" "0.9975-11""0.15-5" "0.9-5" "0.9-5"  "1.58"
"0.9-5" "0.9-5" 



Dr. Colin Beale
Spatial Ecologist
The Macaulay Institute
Craigiebuckler
Aberdeen
AB15 8QH
UK

Tel: 01224 498245 ext. 2427
Fax: 01224 311556
Email: [EMAIL PROTECTED] 






Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Adam Witney

 
> Adam> Here are the results:
> 
>>> xB <- c(2000,1e6,1e50,Inf)
>>> for(df in c(0.1, 1, 10))
> Adam> +for(ncp in c(0, 1, 10, 100))
> Adam> +print(pchisq(xB, df=df, ncp=ncp), digits == 15)
> Adam> Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15)
> :
> Adam> object "digits" not found
> 
> well, that's a typo, I think - you should have been able to fix it
> (I said "something like" ...).
> Just replace the '==' with '='.

Sorry, my R is very limited... Here is the output with '=' instead:

>  xB <- c(2000,1e6,1e50,Inf)
>   for(df in c(0.1, 1, 10))
+for(ncp in c(0, 1, 10, 100))
+print(pchisq(xB, df=df, ncp=ncp), digits = 15)
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1

Thanks again

Adam



Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Martin Maechler
> "Adam" == Adam Witney <[EMAIL PROTECTED]>
> on Fri, 25 May 2007 14:48:18 +0100 writes:

>>> ##-- non central Chi^2 :
>>> xB <- c(2000,1e6,1e50,Inf)
>>> for(df in c(0.1, 1, 10))
>> + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) 
==1)
>> Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
>> Execution halted
>> 
>> Ok, thanks;
>> so, if we want to learn more, we need
>> the output of something like
>> 
>> xB <- c(2000,1e6,1e50,Inf)
>> for(df in c(0.1, 1, 10))
>> for(ncp in c(0, 1, 10, 100))
>> print(pchisq(xB, df=df, ncp=ncp), digits == 15)

Adam> Here are the results:

>> xB <- c(2000,1e6,1e50,Inf)
>> for(df in c(0.1, 1, 10))
Adam> +for(ncp in c(0, 1, 10, 100))
Adam> +print(pchisq(xB, df=df, ncp=ncp), digits == 15)
Adam> Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) :
Adam> object "digits" not found

well, that's a typo, I think - you should have been able to fix it
(I said "something like" ...).
Just replace the '==' with '='.

Martin

Adam> Thanks again...

Adam> adam



Re: [R] Problem with numerical integration and optimization with BFGS

2007-05-25 Thread Deepankar Basu
Ravi,

Thanks a lot for your detailed suggestions. I will certainly look at the
links that you have sent and the package "mnormt". For the moment, I
have managed to analytically integrate the expression using "pnorm"
along the lines suggested by Prof. Ripley yesterday. 

For instance, my first integral becomes the following:

f1 <- function(w1,w0) {

   a <- 1/(2*(1-rho2^2)*sigep^2)  
   b <- (rho2*(w1-w0+delta))/((1-rho2^2)*sigep*sigeta)
   c <- ((w1-w0+delta)^2)/(2*(1-rho2^2)*sigeta^2)
   d <- muep
   k <- 2*pi*sigep*sigeta*(sqrt(1-rho2^2))

   b1 <- ((-2*a*d - b)^2)/(4*a) - a*d^2 - b*d - c
   b21 <- sqrt(a)*(w1-rho1*w0) + (-2*a*d - b)/(2*sqrt(a))
   b22 <- sqrt(a)*(-w1+rho1*w0) + (-2*a*d - b)/(2*sqrt(a))
   b31 <- 2*pnorm(b21*sqrt(2)) - 1  # ERROR FUNCTION  
   b32 <- 2*pnorm(b22*sqrt(2)) - 1  # ERROR FUNCTION
   b33 <- as.numeric(w1-rho1*w0>=0)*(b31-b32)   
   
   return(sqrt(pi)*(1/(2*k*sqrt(a)))*exp(b1)*b33)
  }

 ## collect one value per observation; the original loop overwrote out1
 ## on every iteration, keeping only the last value
 out1 <- numeric(n)
 for (i in 2:n) {
   out1[i] <- f1(y[i], y[i - 1])
 }

I have worked out similar expressions for the other two integrals also. 

Deepankar

On Fri, 2007-05-25 at 09:56 -0400, Ravi Varadhan wrote:
> Deepankar,
> 
> If the problem seems to be in the evaluation of numerical quadrature part,
> you might want to try quadrature methods that are better suited to
> integrands with strong peaks.  The traditional Gaussian quadrature methods,
> even their adaptive versions such as Gauss-Kronrod, are not best suited for
> integrating because they do not explicitly account for the "peakedness" of
> the integrand, and hence can be inefficient and inaccurate. See the article
> below:
> http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.edu
> zSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf
> 
> Alan Genz has worked on this problem a lot and has a number of computational
> tools available. I used some of them when I was working on computing Bayes
> factors for binomial regression models with different link functions.  If
> you are interested, check the following:
> 
> http://www.math.wsu.edu/faculty/genz/software/software.html.
> 
> For your immediate needs, there is an R package called "mnormt" that has a
> function for computing integrals under a multivariate normal (and
> multivariate t) densities, which is actually based on Genz's Fortran
> routines.  You could try that.
> 
> Ravi.
> 
> 
> 
> ---
> 
> Ravi Varadhan, Ph.D.
> 
> Assistant Professor, The Center on Aging and Health
> 
> Division of Geriatric Medicine and Gerontology 
> 
> Johns Hopkins University
> 
> Ph: (410) 502-2619
> 
> Fax: (410) 614-9625
> 
> Email: [EMAIL PROTECTED]
> 
> Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
> 
>  
> 
> 
> 
> 
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu
> Sent: Friday, May 25, 2007 12:02 AM
> To: Prof Brian Ripley
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Problem with numerical integration and optimization with
> BFGS
> 
> Prof. Ripley,
> 
> The code that I provided with my question of course does not contain
> code for the derivatives; but I am supplying analytical derivatives in
> my full program. I did not include that code with my question because
> that would have added about 200 more lines of code without adding any
> new information relevant for my question. The problem that I had pointed
> to occurs whether I provide analytical derivatives or not to the
> optimization routine. And the problem was that when I use the "BFGS"
> method in optim, I get an error message saying that the integrals are
> probably divergent; I know, on the other hand, that the integrals are
> convergent. The same problem does not arise when I instead use the
> Nelder-Mead method in optim.
> 
> Your suggestion that the expression can be analytically integrated
> (which will involve "pnorm") might be correct though I do not see how to
> do that. The integrands are the bivariate normal density functions with
> one variable replaced by known quantities while I integrate over the
> second. 
> 
> For instance, the first integral is as follows: the integrand is the
> bivariate normal density function (with general covariance matrix) where
> the second variable has been replaced by 
> y[i] - rho1*y[i-1] + delta 
> and I integrate over the first variable; the range of integration is
> lower=-y[i]+rho1*y[i-1]
> upper=y[i]-rho1*y[i-1]
> 
> The other two integrals are very similar. It would be of great help if
> you could point out how to integrate the expressions analytically using
> "pnorm".
> 
> Thanks.
> Deepankar
> 
> 
> On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote:
> > You are trying to use a derivative-based optimization method without 
> > supplying derivatives.  Th

Re: [R] Problem with numerical integration and optimization with BFGS

2007-05-25 Thread Ravi Varadhan
Deepankar,

If the problem seems to be in the evaluation of numerical quadrature part,
you might want to try quadrature methods that are better suited to
integrands with strong peaks.  The traditional Gaussian quadrature methods,
even their adaptive versions such as Gauss-Kronrod, are not best suited for
integrating because they do not explicitly account for the "peakedness" of
the integrand, and hence can be inefficient and inaccurate. See the article
below:
http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.edu
zSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf

Alan Genz has worked on this problem a lot and has a number of computational
tools available. I used some of them when I was working on computing Bayes
factors for binomial regression models with different link functions.  If
you are interested, check the following:

http://www.math.wsu.edu/faculty/genz/software/software.html.

For your immediate needs, there is an R package called "mnormt" that has a
function for computing integrals under a multivariate normal (and
multivariate t) densities, which is actually based on Genz's Fortran
routines.  You could try that.

Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu
Sent: Friday, May 25, 2007 12:02 AM
To: Prof Brian Ripley
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Problem with numerical integration and optimization with
BFGS

Prof. Ripley,

The code that I provided with my question of course does not contain
code for the derivatives; but I am supplying analytical derivatives in
my full program. I did not include that code with my question because
that would have added about 200 more lines of code without adding any
new information relevant for my question. The problem that I had pointed
to occurs whether I provide analytical derivatives or not to the
optimization routine. And the problem was that when I use the "BFGS"
method in optim, I get an error message saying that the integrals are
probably divergent; I know, on the other hand, that the integrals are
convergent. The same problem does not arise when I instead use the
Nelder-Mead method in optim.

Your suggestion that the expression can be analytically integrated
(which will involve "pnorm") might be correct though I do not see how to
do that. The integrands are the bivariate normal density functions with
one variable replaced by known quantities while I integrate over the
second. 

For instance, the first integral is as follows: the integrand is the
bivariate normal density function (with general covariance matrix) where
the second variable has been replaced by 
y[i] - rho1*y[i-1] + delta 
and I integrate over the first variable; the range of integration is
lower=-y[i]+rho1*y[i-1]
upper=y[i]-rho1*y[i-1]

The other two integrals are very similar. It would be of great help if
you could point out how to integrate the expressions analytically using
"pnorm".

Thanks.
Deepankar


On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote:
> You are trying to use a derivative-based optimization method without 
> supplying derivatives.  This will use numerical approximations to the 
> derivatives, and your objective function will not be suitable as it is 
> internally using adaptive numerical quadrature and hence is probably not 
> close enough to a differentiable function (it may well have steps).
> 
> I believe you can integrate analytically (the answer will involve pnorm), 
> and that you can also find analytical derivatives.
> 
> Using (each of) numerical optimization and integration is a craft, and it 
> seems you need to know more about it.  The references on ?optim are too 
> advanced I guess, so you could start with Chapter 16 of MASS and its 
> references.
> 
> On Thu, 24 May 2007, Deepankar Basu wrote:
> 
> > Hi R users,
> >
> > I have a couple of questions about some problems that I am facing with
> > regard to numerical integration and optimization of likelihood
> > functions. Let me provide a little background information: I am trying
> > to do maximum likelihood estimation of an econometric model that I have
> > developed recently. I estimate the parameters of the model using the
> > monthly US unemployment rate series obtained from the Federal Reserve
> > Bank of St. Louis. (The data is freely available from their web-based
> > database called FRED-II).
> >
> > For my model, the likelihood function for each observation is the sum of
> > three integrals. The integrand in each 

Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Adam Witney

 
>> ##-- non central Chi^2 :
>> xB <- c(2000,1e6,1e50,Inf)
>> for(df in c(0.1, 1, 10))
>   + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) ==1)
>   Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
>   Execution halted
> 
> Ok, thanks;
> so, if we want to learn more, we need
> the output of something like
> 
>   xB <- c(2000,1e6,1e50,Inf)
>   for(df in c(0.1, 1, 10))
>for(ncp in c(0, 1, 10, 100))
>print(pchisq(xB, df=df, ncp=ncp), digits == 15)

Here is the results:

>   xB <- c(2000,1e6,1e50,Inf)
>   for(df in c(0.1, 1, 10))
+for(ncp in c(0, 1, 10, 100))
+print(pchisq(xB, df=df, ncp=ncp), digits == 15)
Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) :
object "digits" not found

Thanks again...

adam



[R] normality tests

2007-05-25 Thread gatemaze
Hi all,

apologies for seeking advice on a general stats question. I've run
normality tests using 8 different methods:
- Lilliefors
- Shapiro-Wilk
- Robust Jarque Bera
- Jarque Bera
- Anderson-Darling
- Pearson chi-square
- Cramer-von Mises
- Shapiro-Francia

All show that the null hypothesis that the data come from a normal
distro cannot be rejected. Great. However, I don't think it looks nice
to report the values of 8 different tests on a report. One note is
that my sample size is really tiny (less than 20 independent cases).
Without wanting to start a flame war, is there any advice on which
one/ones would be more appropriate and should be reported (along with
a Q-Q plot)? Thank you.

Regards,

-- 
yianni
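If a single test plus the plot is to be reported, one common minimal combination can be sketched as follows; the data here are simulated, standing in for the real n < 20 sample:

```r
set.seed(42)
x <- rnorm(18)            # simulated stand-in for a small (n < 20) sample

shapiro.test(x)           # Shapiro-Wilk: one commonly reported choice
qqnorm(x); qqline(x)      # Q-Q plot with a reference line
```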



[R] Competing Risks Analysis

2007-05-25 Thread Kevin E. Thorpe
I am working on a competing risks problem, specifically an analysis of
cause-specific mortality.  I am familiar with the cmprsk package and
have used it before to create cumulative incidence plots.  I also came
across an old (1998) s-news post from Dr. Terry Therneau describing
a way to use coxph to model competing risks.  I am re-producing the
post at the bottom of this message.

I would like to know if this approach is still reasonable or are there
other ways to go now.  I did an RSiteSearch with the term
"competing risks" and found some interesting articles but nothing as
specific as the post below.


- S-news Article Begins -
Competing risks

It's actually quite easy.

Assume a data set with n subjects and 4 types of competing events. Then
create a data set with 4n observations
First n obs: the data set you would create for an analysis of
"time to event type 1", where all other event types are censored. An
extra variable "etype" is =1.
Second n obs: the data set you would create for "time to event type 2",
with etype=2
.
.
.

Then
fit <- coxph(Surv(time,status) ~  + strata(etype), 

1. Wei, Lin, and Weissfeld apply this to data sets where the competing
risks are not necessarily exclusive, i.e., time to progression and time
to death for cancer patients. JASA 1989, 1065-1073. If a given subject
can have more than one "event", then you need to use the sandwich estimate
of variance, obtained by adding ".. + cluster(id).." to the model
statement above, where "id" is variable unique to each subject.
(The method of fitting found in WLW, namely to do individual fits and
then glue the results together, is not necessary).

2. If a given subject can have at most one event, then it is not clear
that the sandwich estimate of variance is necessary. See Lunn and McNeil,
Biometrics (year?) for an example.

3. The covariates can be coded any way you like. WLW put in all of the
strata * covariate interactions for instance (the "x" coef is different for
each event type), but I never seem to have a big enough sample to justify
doing this. Lunn and McNeil use a certain coding of the treatment effect,
so that the betas are a contrast of interest to them; I've used similar
things
but never that particular one.

4. "etype" doesn't have to be 1,2,3,... of course; etype= 'paper',
'scissors', 'stone', 'pc' would work as well.

Terry M. Therneau, Ph.D.
- S-news Article Ends -
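Therneau's stacking recipe can be sketched with the survival package as below; the data and the two event types are simulated stand-ins, and the covariate name x is made up:

```r
library(survival)

## simulated data: one row per subject; cause 0 = censored, 1 or 2 = event type
set.seed(1)
n <- 100
d <- data.frame(id    = seq_len(n),
                time  = rexp(n),
                cause = sample(0:2, n, replace = TRUE),
                x     = rnorm(n))

## stack one copy of the data per event type; status is 1 only for that cause
long <- do.call(rbind, lapply(1:2, function(k)
  transform(d, etype = k, status = as.integer(cause == k))))

## competing-risks fit in the style of the post above
fit <- coxph(Surv(time, status) ~ x + strata(etype), data = long)
fit
## add "+ cluster(id)" when a subject can experience more than one event type
```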

-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: [EMAIL PROTECTED]  Tel: 416.864.5776  Fax: 416.864.6057



Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Martin Maechler
> "Adam" == Adam Witney <[EMAIL PROTECTED]>
> on Fri, 25 May 2007 09:38:29 +0100 writes:

Adam> Thanks for your replies Details inline below:

Adam> On 24/5/07 17:12, "Martin Maechler" <[EMAIL PROTECTED]> wrote:

>>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
>>> on Thu, 24 May 2007 17:34:16 +0200 writes:
>> 
UweL> Some of these test are expected from time to time, since they are
>> using 
UweL> random numbers. Just re-run.
>> 
>> eehm,  "some of these", yes, but not the ones Adam mentioned,
>> d-p-q-r-tests.R.
>> 
>> Adam, if you want more info you should report to us the *end*
>> (last dozen of lines) of
>> your d-p-q-r-tests.Rout[.fail]  file.

Adam> Ok, here they are...

  [1] TRUE TRUE TRUE TRUE
  > 
  > ##-- non central Chi^2 :
  > xB <- c(2000,1e6,1e50,Inf)
  > for(df in c(0.1, 1, 10))
  + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) ==1)
  Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
  Execution halted

Ok, thanks;
so, if we want to learn more, we need
the output of something like

  xB <- c(2000, 1e6, 1e50, Inf)
  for(df in c(0.1, 1, 10))
   for(ncp in c(0, 1, 10, 100)) 
   print(pchisq(xB, df=df, ncp=ncp), digits = 15)



UweL> BTW: We do have R-2.5.0 these days.
>> 
>> Indeed! 
>> 
>> And gcc 2.95.4 is also very old.
>> Maybe you've recovered an old compiler / math-library bug from
>> that antique compiler suite ?

Adam> Yes, maybe I should start thinking about upgrading this box!

yes, at least "start" ... ;-)

Adam> Thanks again

Adam> adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-About PLSR

2007-05-25 Thread Gavin Simpson
On Fri, 2007-05-25 at 17:25 +0530, Nitish Kumar Mishra wrote:
> hi R help group,
> I have installed the pls package in R and have used the princomp & prcomp
> commands for calculating PCA with its example file (the USArrests example).
> But how can I use pls for partial least squares, R-squared, and mvrCv? One
> more thing: how can I import an external file into R? When I use plsr, R2,
> or RMSEP I get an error: could not find function "plsr", "RMSEP", etc.
> How can I calculate PLS, R2, RMSEP, PCR, and MVR using the pls package in R?
> Thanking you

Did you load the package with:

library(pls)

Before you tried to use the functions you mention?
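As a quick check once the package is attached, something along these lines should run (a sketch using the yarn example data shipped with the pls package; `plsr`, `RMSEP`, and `R2` are functions from that package):

```r
library(pls)
data(yarn)  # example data bundled with the pls package

## Partial least squares regression with cross-validation
fit <- plsr(density ~ NIR, ncomp = 6, data = yarn, validation = "CV")

summary(fit)   # includes cross-validated RMSEP per component
RMSEP(fit)     # root mean squared error of prediction
R2(fit)        # R-squared

## External data can be imported with, e.g., read.table() or read.csv():
## mydata <- read.csv("mydata.csv")   # hypothetical file name
```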

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R-About PLSR

2007-05-25 Thread Nitish Kumar Mishra
hi R help group,
I have installed the pls package in R and have used the princomp & prcomp
commands for calculating PCA with its example file (the USArrests example).
But how can I use pls for partial least squares, R-squared, and mvrCv? One
more thing: how can I import an external file into R? When I use plsr, R2,
or RMSEP I get an error: could not find function "plsr", "RMSEP", etc.
How can I calculate PLS, R2, RMSEP, PCR, and MVR using the pls package in R?
Thanking you



-- 
Nitish Kumar Mishra
Junior Research Fellow
BIC, IMTECH, Chandigarh, India
E-Mail Address:
[EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmer and scale parameter in glmm model

2007-05-25 Thread Olivier MARTIN
Hi all,

I am trying to fit a GLMM with a binomial distribution, and I would like
to verify that the scale parameter is close to 1...

the lmer function gives the following result :
Estimated scale (compare to  1 )  0.766783

But I would like to know how this estimate (0.766783) is computed, and
whether it is possible to reproduce it from the various results returned
by the function lmer.

Thanks,
Olivier.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] iplots problem

2007-05-25 Thread mister_bluesman

Hi. I try to load iplots using the following commands

> library(rJava)
> library(iplots)

but then I get the following error:

Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
Cannot create Java Virtual Machine
Error in library(iplots) : .First.lib failed for 'iplots'

What do I have to do to correct this?

Thanks
-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10801096
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] windows to unix

2007-05-25 Thread Erin Hodgess
Dear R People:

Is there any way to take a Windows version of R, compiled from source, 
compress it, and put it on a Unix-like environment, please?

thanks in advance,
Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] trouble with snow and Rmpi

2007-05-25 Thread Erin Hodgess
Dear R People:

I am having some trouble with the snow package.

It requires MPICH2 and Rmpi.

Rmpi is fine.  However, I downloaded and installed the MPICH2 package.

There is no mpicc, mpirun, etc.

Does anyone have any suggestions, please?

Thanks in advance!

Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] in unix opening data object created under win

2007-05-25 Thread John Kane

--- [EMAIL PROTECTED] wrote:

> On Unix R's version is 2.3.1 and on PC its 2.4.1.
> 
> I dont have the rights to install newer version of R
> on Unix.
> 
> I tried different upload methods. No one worked.
> 
> On Unix it looks as follows (dots to hide my
> userid):
> 
>  >
>
load("/afs/ir/users/../project/ps/data/dtaa")
>  > head(dtaa)
>   hospid mfpi1 mfpi2 mfpi3 mfpi4 mfpi5
> mfpi6 mfpi7 mfpi8
> NA9 0.1428571 1   0.5 0.2857143  0.50  
> 0.0 0.333 0
> 4041  9 0.1428571 0   0.0 0.2857143  0.25  
> 0.2 0.000 0
>   mfpi9
> NA   0.333
> 4041 1.000
>  >
> 
> The data comes through but its screwed up.
> 
> Thanks for your help.
> 
> Toby

Hi Toby,
Apart from the fact that the rest of the data is not showing up,
why do you say the data is screwed up?

I don't know what your data should look like but the
two rows above look "okay".  

If you are having a compatibility problem with the two
versions of R, something you might want to try is
downloading 2.4.1 or 2.5.0 and installing it on a USB
stick.  You can run R quite nicely from a USB stick, and it
might indicate whether it is R or a corrupt file that is
the problem.

 
 
> 
> Liaw, Andy wrote:
> > What are the versions of R on the two platform? 
> Is the version on Unix
> > at least as new as the one on Windows?
> > 
> > Andy 
> > 
> > From: [EMAIL PROTECTED]
> > 
> >>Hi All
> >>
> >>I am saving a dataframe in my MS-Win R with
> save().
> >>Then I copy it onto my personal AFS space.
> >>Then I start R and run it with emacs and load()
> the data.
> >>It loads only 2 lines: head() shows only two lines
> nrow() als 
> >>say it has only 2 
> >>lines, I get error message, when trying to use
> this data 
> >>object, saying that 
> >>some row numbers are missing.
> >>If anyone had similar situation, I appreciate
> letting me know.
> >>
> >>Best Toby

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to mimic plot=F for truehist?

2007-05-25 Thread Vladimir Eremeev

By defining your own function.
You can get the function body by typing its name at the R command line and
pressing Enter.

Copy and paste the function body into a plain-text file (R source code),
redefine it as you like, for example by adding the desired argument and
code for processing it, then source that file and use your customized
function.


Johannes Graumann-2 wrote:
> 
> Dear Rologists,
> 
> In order to combine plots I need access to some "par"s specific to my
> plot before replotting it with modified parameters. I have not found any
> option like "plot=F" associated with truehist and would like to know
> whether someone can point out how to overcome this problem.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/how-to-mimic-plot%3DF-for-truehist--tf3815196.html#a10800310
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] L1-SVM

2007-05-25 Thread Gorden T Jemwa

> Is there a package in R to find L1-SVM. I did search and found svmpath
> but not sure if it is what I need.


Try kernlab

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] testing difference (or similarities) between two distance matrices (not independent)

2007-05-25 Thread Stephane . Buhler
Hi,

I'm looking to test whether two distance matrices are statistically
different from each other.

These two matrices have been computed on the same set of data (several  
population samples)

1. using a particular genetic distance
2. weighting that genetic distance with an extra factor

(we can look at this as one set is computed before applying a  
treatment and the second one after applying that particular treatment  
... kind of a similar situation)

Both of these matrices are obviously not independent of each other, so the
Mantel test and other correlation tests do not apply here.

I thought of testing the order of values between these matrices (if
distances are ordered the same way in both matrices, we have very
similar matrices and the additional factor has almost no effect on the
calculation). Is there any package or function in R that allows one to
do this and to test it statistically (with permutations or another
approach)?

I checked the mailing lists but did not find anything on the problem
I'm trying to solve.

thanks for your help and insights on that problem

Stéphane Buhler

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to mimic plot=F for truehist?

2007-05-25 Thread Johannes Graumann
Dear Rologists,

In order to combine plots I need access to some "par"s specific to my plot
before replotting it with modified parameters. I have not found any option
like "plot=F" associated with truehist and would like to know whether
someone can point out how to overcome this problem.

Thanks, Joh

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Juan Pablo Lewinger
That's beautiful. For the full 120 x 65,000 matrix your approach took 
85 seconds. A truly remarkable improvement over my 80 minutes!

Thank you!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question concerning "pastecs" package

2007-05-25 Thread Rainer M. Krug
Dear Philippe

Thanks a lot for your information - it is extremely useful.

Rainer


Philippe Grosjean wrote:
> Hello,
> 
> I already answered your question privately. No, there is no 
> translation of pastecs.pdf. The English documentation is accessible, as 
> usual, by:
> 
> ?turnpoints
> 
> Regarding your specific question, 'info' is the quantity of information 
> I associated with the turning points:
> 
> I = -log2 P(t)
> 
> where P is the probability of observing a turning point at time t under 
> the null hypothesis that the time series is purely random, and thus, the 
> distribution of turning points follows a normal distribution with:
> 
> E(p) = 2/3*(n-2)
> var(p) = (16*n - 29)/90
> 
> with p, the number of observed turning points and n the number of 
> observations. Ibanez (1982, in French, sorry... not my fault!) 
> demonstrated that P(t) is:
> 
> P(t) = 2*(1/n(t-1)! * (n-1)!)
> 
> As you can easily imagine, from this point on, it is straightforward to 
> construct a test to determine whether the series is random (regarding the
> distribution of the turning points) or more or less monotonic (fewer or
> more turning points than expected). See also the reference cited in the
> online help (Kendall 1976).
> 
> References:
> ---
> Ibanez, F., 1982. Sur une nouvelle application de la théorie de 
> l'information à la description des séries chronologiques planctoniques. 
> J. Exp. Mar. Biol. Ecol., 4:619-632
> 
> Kendall, M.G., 1976. Time-series, 2nd ed. Charles Griffin & Co, London
> 
> Best,
> 
> Philippe Grosjean
> 
> ..<°}))><
>  ) ) ) ) )
> ( ( ( ( (Prof. Philippe Grosjean
>  ) ) ) ) )
> ( ( ( ( (Numerical Ecology of Aquatic Systems
>  ) ) ) ) )   Mons-Hainaut University, Belgium
> ( ( ( ( (
> ..
> 
> Rainer M. Krug wrote:
>> Hi
>>
>> I just installed the pastecs package and I am wondering: is there an 
>> english (or german) translation of the file pastecs.pdf? If not, is 
>> there an explanation somewhere of the object of type 'turnpoints' as a 
>> result of turnpoints(), especially the "info" field?
>>
>> Thanks,
>>
>> Rainer
>>


-- 
NEW EMAIL ADDRESS AND ADDRESS:

[EMAIL PROTECTED]

[EMAIL PROTECTED] WILL BE DISCONTINUED END OF MARCH

Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Leslie Hill Institute for Plant Conservation
University of Cape Town
Rondebosch 7701
South Africa

Fax:+27 - (0)86 516 2782
Fax:+27 - (0)21 650 2440 (w)
Cell:   +27 - (0)83 9479 042

Skype:  RMkrug

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Off topic: S.E. for cross validation

2007-05-25 Thread Gad Abraham
Hi,

I'm performing (blocked) 10-fold cross-validation of a several time 
series forecasting methods, measuring their mean squared error (MSE).

I know that the MSE_cv is the average over the 10 MSEs. Is there a way 
to calculate the standard error as well?

The usual SD/sqrt(n) formula probably doesn't apply here as the 10 
observations aren't independent.

Thanks,
Gad

-- 
Gad Abraham
Department of Mathematics and Statistics
The University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.ms.unimelb.edu.au/~gabraham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question concerning "pastecs" package

2007-05-25 Thread Philippe Grosjean
Hello,

I already answered your question privately. No, there is no 
translation of pastecs.pdf. The English documentation is accessible, as 
usual, by:

?turnpoints

Regarding your specific question, 'info' is the quantity of information 
I associated with the turning points:

I = -log2 P(t)

where P is the probability of observing a turning point at time t under 
the null hypothesis that the time series is purely random, and thus, the 
distribution of turning points follows a normal distribution with:

E(p) = 2/3*(n-2)
var(p) = (16*n - 29)/90

with p, the number of observed turning points and n the number of 
observations. Ibanez (1982, in French, sorry... not my fault!) 
demonstrated that P(t) is:

P(t) = 2*(1/n(t-1)! * (n-1)!)

As you can easily imagine, from this point on, it is straightforward to 
construct a test to determine whether the series is random (regarding the
distribution of the turning points) or more or less monotonic (fewer or
more turning points than expected). See also the reference cited in the
online help (Kendall 1976).
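The moments above translate directly into a z-test for randomness (a hedged sketch, not the pastecs implementation; turnpoints() reports the analogous quantities):

```r
## Turning-point test for randomness of a series, using E(p) and var(p)
## from the explanation above (not the pastecs code itself).
turnpoint.test <- function(x) {
  n <- length(x)
  ## a turning point occurs where the sign of successive differences flips
  d <- sign(diff(x))
  p <- sum(d[-1] * d[-length(d)] < 0)   # observed number of turning points
  Ep <- 2/3 * (n - 2)                   # E(p) under randomness
  Vp <- (16 * n - 29) / 90              # var(p) under randomness
  z  <- (p - Ep) / sqrt(Vp)
  c(turning.points = p, expected = Ep, z = z,
    p.value = 2 * pnorm(-abs(z)))
}

set.seed(1)
turnpoint.test(rnorm(100))   # random series: z should be near 0
turnpoint.test(1:100)        # monotonic series: strongly negative z
```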

References:
---
Ibanez, F., 1982. Sur une nouvelle application de la théorie de 
l'information à la description des séries chronologiques planctoniques. 
J. Exp. Mar. Biol. Ecol., 4:619-632

Kendall, M.G., 1976. Time-series, 2nd ed. Charles Griffin & Co, London

Best,

Philippe Grosjean

..<°}))><
  ) ) ) ) )
( ( ( ( (Prof. Philippe Grosjean
  ) ) ) ) )
( ( ( ( (Numerical Ecology of Aquatic Systems
  ) ) ) ) )   Mons-Hainaut University, Belgium
( ( ( ( (
..

Rainer M. Krug wrote:
> Hi
> 
> I just installed the pastecs package and I am wondering: is there an 
> english (or german) translation of the file pastecs.pdf? If not, is 
> there an explanation somewhere of the object of type 'turnpoints' as a 
> result of turnpoints(), especially the "info" field?
> 
> Thanks,
> 
> Rainer
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why might X11() not be found?

2007-05-25 Thread Patrick Connolly
On Fri, 25-May-2007 at 08:25AM +1200, Patrick Connolly wrote:

|> > sessionInfo()
|> R version 2.5.0 (2007-04-23) 
|> x86_64-unknown-linux-gnu 
|> 
|> locale:
|> 
|> 
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
|> 
|> attached base packages:
|> [1] "utils""stats""graphics" "methods"  "base"

Well, in fact it was very simple.  There's no "package:grDevices" in
there.  Now, why that didn't happen before, I'm yet to work out.


Thanks for the suggestions.

best


-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~} Great minds discuss ideas
 _( Y )_Middle minds discuss events 
(:_~*~_:)Small minds discuss people  
 (_)-(_)   . Anon
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Adam Witney

Thanks for your replies. Details inline below:

On 24/5/07 17:12, "Martin Maechler" <[EMAIL PROTECTED]> wrote:

>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
>> on Thu, 24 May 2007 17:34:16 +0200 writes:
> 
> UweL> Some of these test failures are expected from time to time, since
> UweL> they use random numbers. Just re-run.
> 
> eehm,  "some of these", yes, but not the ones Adam mentioned,
> d-p-q-r-tests.R.
> 
> Adam, if you want more info you should report to us the *end*
> (last dozen of lines) of
> your d-p-q-r-tests.Rout[.fail]  file.

Ok, here they are...

[1] TRUE TRUE TRUE TRUE
> 
> ##-- non central Chi^2 :
> xB <- c(2000,1e6,1e50,Inf)
> for(df in c(0.1, 1, 10))
+ for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
Execution halted


> UweL>  BTW: We do have R-2.5.0 these days.
> 
> Indeed! 
> 
> And gcc 2.95.4 is also very old.
> Maybe you've recovered an old compiler / math-library bug from
> that antique compiler suite ?

Yes, maybe I should start thinking about upgrading this box!

Thanks again

adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Bill.Venables
Here is a possibility.  The only catch is that if a pair of rows is
selected twice you will get the results in a block, not scattered at
random throughout the columns of G.  I can't see that as a problem.

### --- start code excerpt ---
nSNPs <- 1000
H <- matrix(sample(0:1, 120*nSNPs , replace=T), nrow=120)

# G <- matrix(0, nrow=3, ncol=nSNPs)

# Keep in mind that the real H is 120 x 65000

ij <- as.matrix(subset(expand.grid(i = 1:120, j = 1:120), i < j))

nResamples <- 3000
sel <- sample(1:nrow(ij), nResamples, rep = TRUE)
repf <- table(sel)   # replication factors
ij <- ij[as.numeric(names(repf)), ]  # distinct choice made

G <- matrix(0, nrow = 3, ncol = nrow(ij))  # for now

for(j in 1:ncol(G))
  G[,j] <- rowSums(outer(0:2, colSums(H[ij[j, ], ]), "=="))

G <- G[, rep(1:ncol(G), repf)] # bulk up the result

# _
# _
# _
# _pair <- replicate(nResamples, sample(1:120, 2))
# _
# _gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}
# _
# _for (i in 1:nResamples){
# _G <- G + apply(H[pair[,i],], 2, gen)
# _}
### --- end of code excerpt ---

I did a timing on my machine which is a middle-of-the range windows
monstrosity...

> system.time({
+ 
+ nSNPs <- 1000
+ H <- matrix(sample(0:1, 120*nSNPs , replace=T), nrow=120)
+ 
+ # G <- matrix(0, nrow=3, ncol=nSNPs)
+ 
+ # Keep in mind that the real H is 120 x 65000
+ 
+ ij <- as.matrix(subset(expand.grid(i = 1:120, j = 1:120), i < j))
+ 
+ nResamples <- 3000
+ sel <- sample(1:nrow(ij), nResamples, rep = TRUE)
+ repf <- table(sel)   # replication factors
+ ij <- ij[as.numeric(names(repf)), ]  # distinct choice made
+ 
+ G <- matrix(0, nrow = 3, ncol = nrow(ij))  # for now
+ 
+ for(j in 1:ncol(G))
+   G[,j] <- rowSums(outer(0:2, colSums(H[ij[j, ], ]), "=="))
+ 
+ G <- G[, rep(1:ncol(G), repf)] # bulk up the result
+ 
+ # _
+ # _
+ # _
+ # _pair <- replicate(nResamples, sample(1:120, 2))
+ # _
+ # _gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}
+ # _
+ # _for (i in 1:nResamples){
+ # _G <- G + apply(H[pair[,i],], 2, gen)
+ # _}
+ # _#---
+ # _
+ })
   user  system elapsed 
   0.970.000.99 


Less than a second.  Somewhat of an improvement on the 80 minutes, I
reckon.  This will increase, of course, when you step the size of the H
matrix up from 1000 to 65000 columns.

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:(I don't have one!)
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/ 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Juan Pablo
Lewinger
Sent: Friday, 25 May 2007 4:04 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Speeding up resampling of rows from a large matrix

I'm trying to:

Resample with replacement pairs of distinct rows from a 120 x 65,000 
matrix H of 0's and 1's. For each resampled pair sum the resulting 2 
x 65,000 matrix by column:

 0 1 0 1 ...
+
 0 0 1 1 ...
___
=  0 1 1 2 ...

For each column accumulate the number of 0's, 1's and 2's over the 
resamples to obtain a 3 x 65,000 matrix G.

For those interested in the background, H is a matrix of haplotypes, 
each pair of haplotypes forms a genotype, and each column corresponds 
to a SNP. I'm using resampling to compute the null distribution of 
the maximum over correlated SNPs of a simple statistic.


The code:
#---

nSNPs <- 1000
H <- matrix(sample(0:1, 120*nSNPs , replace=T), nrow=120)
G <- matrix(0, nrow=3, ncol=nSNPs)
# Keep in mind that the real H is 120 x 65000

nResamples <- 3000
pair <- replicate(nResamples, sample(1:120, 2))

gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}

for (i in 1:nResamples){
G <- G + apply(H[pair[,i],], 2, gen)
}
#---

The problem is that the loop takes about 80 mins to complete and I 
need to repeat the whole thing 10,000 times, which would then take 
over a year and a half!

Is there a way to speed this up so that the full 10,000 iterations 
take a reasonable amount of time (say a week)?

My machine has an Intel Xeon 3.40GHz CPU with 1GB of RAM

 > sessionInfo()
R version 2.5.0 (2007-04-23)
i386-pc-mingw32

I would greatly appreciate any help.

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Running R in Bash and R GUI

2007-05-25 Thread michael watson \(IAH-C\)
There are two things that occur to me.  Firstly, I normally have to unset
no_proxy:

%> unset no_proxy; R

Secondly, if for some reason http_proxy isn't being seen in R, you can
use the Sys.putenv() function within R to manipulate the environment

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: 24 May 2007 16:05
To: R-help@stat.math.ethz.ch
Subject: [R] Running R in Bash and R GUI

I have been trying to get the R and package update functions in the  
GUI version of R to work on my Mac.

Initially I got error messages that suggested I needed to set up the  
http_proxy for GUI R to use, but how can this be done?

I eventually got to the point of writing a .bash_profile file in the  
Bash terminal and setting the proxy addresses there.

I can now use my Bash terminal, invoke R, and run the update /  
install commands and they work!

The problem that still remains is that in the R console of the GUI  
R,  the http_proxy is not seen and thus I cannot connect to CRAN or  
any other mirror using the GUI functions in the pull-down menus.

I get

 > update.packages ()
Warning: unable to access index for repository http://cran.uk.r- 
project.org/bin/macosx/universal/contrib/2.5
 >

Basically it still seems unable to access port 80.

Is there a way of solving this so that I can use both terminals  
rather than just everything through Bash?

Thanks


Steve Hodgkinson

University of Brighton

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.