[R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Juan Pablo Lewinger
I'm trying to:

Resample with replacement pairs of distinct rows from a 120 x 65,000 
matrix H of 0's and 1's. For each resampled pair sum the resulting 2 
x 65,000 matrix by column:

 0 1 0 1 ...
+
 0 0 1 1 ...
___
=  0 1 1 2 ...

For each column accumulate the number of 0's, 1's and 2's over the 
resamples to obtain a 3 x 65,000 matrix G.

For those interested in the background, H is a matrix of haplotypes, 
each pair of haplotypes forms a genotype, and each column corresponds 
to a SNP. I'm using resampling to compute the null distribution of 
the maximum over correlated SNPs of a simple statistic.


The code:
#---
nSNPs <- 1000
H <- matrix(sample(0:1, 120*nSNPs, replace=T), nrow=120)
G <- matrix(0, nrow=3, ncol=nSNPs)
# Keep in mind that the real H is 120 x 65000

nResamples <- 3000
pair <- replicate(nResamples, sample(1:120, 2))

gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}

for (i in 1:nResamples){
  G <- G + apply(H[pair[,i],], 2, gen)
}
#---
The problem is that the loop takes about 80 mins to complete and I 
need to repeat the whole thing 10,000 times, which would then take 
over a year and a half!

Is there a way to speed this up so that the full 10,000 iterations 
take a reasonable amount of time (say a week)?

My machine has an Intel Xeon 3.40GHz CPU with 1GB of RAM

> sessionInfo()
R version 2.5.0 (2007-04-23)
i386-pc-mingw32

I would greatly appreciate any help.

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hosmer-lemeshow test for survival

2007-05-25 Thread giovanni parrinello
Dear All,
I am looking for code for a function implementing the Hosmer-Lemeshow
goodness-of-fit test for survival data.

Could anyone share it?

TIA

Mario


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] L1-SVM

2007-05-25 Thread Saeed Abu Nimeh
Hi List,
Is there a package in R to fit an L1-SVM? I searched and found svmpath,
but I am not sure it is what I need.
Thanks,
Saeed

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Running R in Bash and R GUI

2007-05-25 Thread michael watson (IAH-C)
There are two things that occur to me.  Firstly, I normally have to unset
no_proxy:

% unset no_proxy; R

Secondly, if for some reason http_proxy isn't being seen in R, you can
use the Sys.putenv() function within R to manipulate the environment.
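
For example (a sketch; the proxy URL below is a placeholder, substitute
your own; in R 2.5.0 and later the preferred name is Sys.setenv):

Sys.putenv(http_proxy = "http://your.proxy.server:8080/")
Sys.getenv("http_proxy")   # check that it took effect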

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: 24 May 2007 16:05
To: R-help@stat.math.ethz.ch
Subject: [R] Running R in Bash and R GUI

I have been trying to get the R and package update functions in the  
GUI version of R to work on my Mac.

Initially I got error messages that suggested I needed to set up the  
http_proxy for GUI R to use, but how can this be done?

I eventually got to the point of writing a .bash_profile file in the  
Bash terminal and setting the proxy addresses there.

I can now use my Bash terminal, invoke R, and run the update /  
install commands and they work!

The problem that still remains is that in the R console of the GUI  
R,  the http_proxy is not seen and thus I cannot connect to CRAN or  
any other mirror using the GUI functions in the pull-down menus.

I get

> update.packages()
Warning: unable to access index for repository
http://cran.uk.r-project.org/bin/macosx/universal/contrib/2.5

Basically it still seems unable to access port 80.

Is there a way of solving this so that I can use both terminals  
rather than just everything through Bash?

Thanks


Steve Hodgkinson

University of Brighton

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Bill.Venables
Here is a possibility.  The only catch is that if a pair of rows is
selected twice you will get the results in a block, not scattered at
random throughout the columns of G.  I can't see that as a problem.

### --- start code excerpt ---
nSNPs <- 1000
H <- matrix(sample(0:1, 120*nSNPs, replace=T), nrow=120)

# G <- matrix(0, nrow=3, ncol=nSNPs)

# Keep in mind that the real H is 120 x 65000

ij <- as.matrix(subset(expand.grid(i = 1:120, j = 1:120), i < j))

nResamples <- 3000
sel <- sample(1:nrow(ij), nResamples, rep = TRUE)
repf <- table(sel)                   # replication factors
ij <- ij[as.numeric(names(repf)), ]  # distinct choices made

G <- matrix(0, nrow = 3, ncol = nrow(ij))  # for now

for(j in 1:ncol(G))
  G[,j] <- rowSums(outer(0:2, colSums(H[ij[j, ], ]), "=="))

G <- G[, rep(1:ncol(G), repf)] # bulk up the result

# _
# _pair <- replicate(nResamples, sample(1:120, 2))
# _
# _gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}
# _
# _for (i in 1:nResamples){
# _  G <- G + apply(H[pair[,i],], 2, gen)
# _}
### --- end of code excerpt ---

I did a timing on my machine, which is a middle-of-the-range Windows
monstrosity...

> system.time({
+ 
+ nSNPs <- 1000
+ H <- matrix(sample(0:1, 120*nSNPs, replace=T), nrow=120)
+ 
+ # G <- matrix(0, nrow=3, ncol=nSNPs)
+ 
+ # Keep in mind that the real H is 120 x 65000
+ 
+ ij <- as.matrix(subset(expand.grid(i = 1:120, j = 1:120), i < j))
+ 
+ nResamples <- 3000
+ sel <- sample(1:nrow(ij), nResamples, rep = TRUE)
+ repf <- table(sel)                   # replication factors
+ ij <- ij[as.numeric(names(repf)), ]  # distinct choices made
+ 
+ G <- matrix(0, nrow = 3, ncol = nrow(ij))  # for now
+ 
+ for(j in 1:ncol(G))
+   G[,j] <- rowSums(outer(0:2, colSums(H[ij[j, ], ]), "=="))
+ 
+ G <- G[, rep(1:ncol(G), repf)] # bulk up the result
+ 
+ # _pair <- replicate(nResamples, sample(1:120, 2))
+ # _
+ # _gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}
+ # _
+ # _for (i in 1:nResamples){
+ # _  G <- G + apply(H[pair[,i],], 2, gen)
+ # _}
+ # _
+ })
   user  system elapsed 
   0.97    0.00    0.99 


Less than a second.  Somewhat of an improvement on the 80 minutes, I
reckon.  This will increase, of course, when you step the size of the H
matrix up from 1,000 to 65,000 columns.
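
A further variant in the same spirit (a sketch only, untimed, assuming
the H, ij and repf objects above) accumulates the counts per SNP
instead, giving the 3 x nSNPs matrix G described in the original post:

S <- H[ij[, 1], ] + H[ij[, 2], ]    # 0/1/2 column sums, one row per distinct pair
w <- as.vector(repf)                # replication factor of each distinct pair
G3 <- rbind(colSums((S == 0) * w),  # weighted count of 0's for each SNP
            colSums((S == 1) * w),  # ... of 1's
            colSums((S == 2) * w))  # ... of 2's; G3 is 3 x nSNPs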

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:(I don't have one!)
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/ 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Juan Pablo
Lewinger
Sent: Friday, 25 May 2007 4:04 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Speeding up resampling of rows from a large matrix

I'm trying to:

Resample with replacement pairs of distinct rows from a 120 x 65,000 
matrix H of 0's and 1's. For each resampled pair sum the resulting 2 
x 65,000 matrix by column:

 0 1 0 1 ...
+
 0 0 1 1 ...
___
=  0 1 1 2 ...

For each column accumulate the number of 0's, 1's and 2's over the 
resamples to obtain a 3 x 65,000 matrix G.

For those interested in the background, H is a matrix of haplotypes, 
each pair of haplotypes forms a genotype, and each column corresponds 
to a SNP. I'm using resampling to compute the null distribution of 
the maximum over correlated SNPs of a simple statistic.


The code:
#---

nSNPs <- 1000
H <- matrix(sample(0:1, 120*nSNPs, replace=T), nrow=120)
G <- matrix(0, nrow=3, ncol=nSNPs)
# Keep in mind that the real H is 120 x 65000

nResamples <- 3000
pair <- replicate(nResamples, sample(1:120, 2))

gen <- function(x){g <- sum(x); c(g==0, g==1, g==2)}

for (i in 1:nResamples){
  G <- G + apply(H[pair[,i],], 2, gen)
}
#---

The problem is that the loop takes about 80 mins to complete and I 
need to repeat the whole thing 10,000 times, which would then take 
over a year and a half!

Is there a way to speed this up so that the full 10,000 iterations 
take a reasonable amount of time (say a week)?

My machine has an Intel Xeon 3.40GHz CPU with 1GB of RAM

> sessionInfo()
R version 2.5.0 (2007-04-23)
i386-pc-mingw32

I would greatly appreciate any help.

Juan Pablo Lewinger
Department of Preventive Medicine
Keck School of Medicine
University of Southern California

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list

Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Adam Witney

Thanks for your replies. Details inline below:

On 24/5/07 17:12, Martin Maechler [EMAIL PROTECTED] wrote:

 >>>>> "UweL" == Uwe Ligges [EMAIL PROTECTED]
 >>>>>     on Thu, 24 May 2007 17:34:16 +0200 writes:
 
 UweL Some of these test failures are expected from time to time, since
 UweL they are using random numbers. Just re-run.
 
 eehm,  some of these, yes, but not the ones Adam mentioned,
 d-p-q-r-tests.R.
 
 Adam, if you want more info you should report to us the *end*
 (last dozen of lines) of
 your d-p-q-r-tests.Rout[.fail]  file.

Ok, here they are...

[1] TRUE TRUE TRUE TRUE
> 
> ##-- non central Chi^2 :
> xB <- c(2000,1e6,1e50,Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
Execution halted


 UweL  BTW: We do have R-2.5.0 these days.
 
 Indeed! 
 
 And gcc 2.95.4 is also very old.
 Maybe you've recovered an old compiler / math-library bug from
 that antique compiler suite ?

Yes, maybe I should start thinking about upgrading this box!

Thanks again

adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why might X11() not be found?

2007-05-25 Thread Patrick Connolly
On Fri, 25-May-2007 at 08:25AM +1200, Patrick Connolly wrote:

| > sessionInfo()
| R version 2.5.0 (2007-04-23) 
| x86_64-unknown-linux-gnu 
| 
| locale:
| 
| 
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
| 
| attached base packages:
| [1] utilsstatsgraphics methods  base

Well, in fact it was very simple.  There's no package:grDevices in
there.  Now, why that didn't happen before, I'm yet to work out.


Thanks for the suggestions.

best


-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~} Great minds discuss ideas
 _( Y )_Middle minds discuss events 
(:_~*~_:)Small minds discuss people  
 (_)-(_)   . Anon
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Off topic: S.E. for cross validation

2007-05-25 Thread Gad Abraham
Hi,

I'm performing (blocked) 10-fold cross-validation of a several time 
series forecasting methods, measuring their mean squared error (MSE).

I know that the MSE_cv is the average over the 10 MSEs. Is there a way 
to calculate the standard error as well?

The usual SD/sqrt(n) formula probably doesn't apply here as the 10 
observations aren't independent.

Thanks,
Gad

-- 
Gad Abraham
Department of Mathematics and Statistics
The University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.ms.unimelb.edu.au/~gabraham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question concerning pastecs package

2007-05-25 Thread Philippe Grosjean
Hello,

I already answered your question privately. No, there is no 
translation of pastecs.pdf. The English documentation is accessible, as 
usual, by:

?turnpoints

Regarding your specific question, 'info' is the quantity of information 
I associated with the turning points:

I = -log2 P(t)

where P is the probability of observing a turning point at time t under 
the null hypothesis that the time series is purely random; the number of 
turning points then follows approximately a normal distribution with:

E(p) = 2/3*(n-2)
var(p) = (16*n - 29)/90

with p, the number of observed turning points and n the number of 
observations. Ibanez (1982, in French, sorry... not my fault!) 
demonstrated that P(t) is:

P(t) = 2*(1/n(t-1)! * (n-1)!)

As you can easily imagine, from this point on it is straightforward to 
construct a test of whether the series is random (regarding the 
distribution of the turning points), more or less monotonic (more or 
fewer turning points than expected), etc. See also the reference cited 
in the online help (Kendall 1976).
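
For example, a minimal usage sketch (assuming the pastecs package is
installed; the series below is purely illustrative):

library(pastecs)
x <- rnorm(100)      # a purely random series under the null
tp <- turnpoints(x)
summary(tp)          # turning points plus the test against randomness
tp$info              # the quantity of information I = -log2 P(t)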

References:
---
Ibanez, F., 1982. Sur une nouvelle application de la théorie de 
l'information à la description des séries chronologiques planctoniques. 
J. Exp. Mar. Biol. Ecol., 4:619-632.

Kendall, M.G., 1976. Time-Series, 2nd ed. Charles Griffin & Co, London.

Best,

Philippe Grosjean

..°}))
  ) ) ) ) )
( ( ( ( (Prof. Philippe Grosjean
  ) ) ) ) )
( ( ( ( (Numerical Ecology of Aquatic Systems
  ) ) ) ) )   Mons-Hainaut University, Belgium
( ( ( ( (
..

Rainer M. Krug wrote:
 Hi
 
 I just installed the pastecs package and I am wondering: is there an 
 english (or german) translation of the file pastecs.pdf? If not, is 
 there an explanation somewhere of the object of type 'turnpoints' as a 
 result of turnpoints(), especially the info field?
 
 Thanks,
 
 Rainer


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question concerning pastecs package

2007-05-25 Thread Rainer M. Krug
Dear Philippe

Thanks a lot for your information - it is extremely useful.

Rainer


Philippe Grosjean wrote:
 Hello,
 
 I already answered privately to your question. No, there is no 
 translation of pastecs.pdf. The English documentation is accessible, as 
 usual, by:
 
 ?turnpoints
 
 Regarding your specific question, 'info' is the quantity of information 
 I associated with the turning points:
 
 I = -log2 P(t)
 
 where P is the probability of observing a turning point at time t under 
 the null hypothesis that the time series is purely random; the number of 
 turning points then follows approximately a normal distribution with:
 
 E(p) = 2/3*(n-2)
 var(p) = (16*n - 29)/90
 
 with p, the number of observed turning points and n the number of 
 observations. Ibanez (1982, in French, sorry... not my fault!) 
 demonstrated that P(t) is:
 
 P(t) = 2*(1/n(t-1)! * (n-1)!)
 
 As you can easily imagine, from this point on it is straightforward to 
 construct a test of whether the series is random (regarding the 
 distribution of the turning points), more or less monotonic (more or 
 fewer turning points than expected), etc. See also the reference cited 
 in the online help (Kendall 1976).
 
 References:
 ---
 Ibanez, F., 1982. Sur une nouvelle application de la théorie de 
 l'information à la description des séries chronologiques planctoniques. 
 J. Exp. Mar. Biol. Ecol., 4:619-632
 
 Kendall, M.G., 1976. Time-Series, 2nd ed. Charles Griffin & Co, London.
 
 Best,
 
 Philippe Grosjean
 
 ..°}))
  ) ) ) ) )
 ( ( ( ( (Prof. Philippe Grosjean
  ) ) ) ) )
 ( ( ( ( (Numerical Ecology of Aquatic Systems
  ) ) ) ) )   Mons-Hainaut University, Belgium
 ( ( ( ( (
 ..
 
 Rainer M. Krug wrote:
 Hi

 I just installed the pastecs package and I am wondering: is there an 
 english (or german) translation of the file pastecs.pdf? If not, is 
 there an explanation somewhere of the object of type 'turnpoints' as a 
 result of turnpoints(), especially the info field?

 Thanks,

 Rainer



-- 
NEW EMAIL ADDRESS AND ADDRESS:

[EMAIL PROTECTED]

[EMAIL PROTECTED] WILL BE DISCONTINUED END OF MARCH

Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Leslie Hill Institute for Plant Conservation
University of Cape Town
Rondebosch 7701
South Africa

Fax:+27 - (0)86 516 2782
Fax:+27 - (0)21 650 2440 (w)
Cell:   +27 - (0)83 9479 042

Skype:  RMkrug

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speeding up resampling of rows from a large matrix

2007-05-25 Thread Juan Pablo Lewinger
That's beautiful. For the full 120 x 65,000 matrix your approach took 
85 seconds. A truly remarkable improvement over my 80 minutes!

Thank you!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to mimic plot=F for truehist?

2007-05-25 Thread Johannes Graumann
Dear Rologists,

In order to combine plots I need to get access to some pars specific
to my plot prior to replotting it with modified parameters. I have not found
any option like plot=F associated with truehist and would like to know
whether someone can point out how to overcome this problem.

Thanks, Joh

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] testing difference (or similarities) between two distance matrices (not independent)

2007-05-25 Thread Stephane . Buhler
Hi,

i'm looking to test whether two distance matrices are statistically
different from each other.

These two matrices have been computed on the same set of data (several  
population samples)

1. using a particular genetic distance
2. weighting that genetic distance with an extra factor

(we can look at this as one set is computed before applying a  
treatment and the second one after applying that particular treatment  
... kind of a similar situation)

both these matrices are obviously not independent from each other, so
the Mantel test and other correlation tests do not apply here.

I thought of testing the order of values between these matrices (if
distances are ordered the same way in both matrices, we have very
similar matrices and the additional factor has almost no effect on the
calculation). Is there any package or function in R that allows one to
do that and to test it statistically (with permutations or another approach)?

I checked the mailing lists but did not find anything on this
problem i'm trying to solve.

thanks for your help and insights on that problem

Stéphane Buhler

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] L1-SVM

2007-05-25 Thread Gorden T Jemwa

 Is there a package in R to find L1-SVM. I did search and found svmpath
 but not sure if it is what I need.


Try kernlab
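
For instance, a minimal sketch (assuming kernlab is installed; ksvm()
supports several SVM formulations, so check in ?ksvm that the L1 variant
you mean is among them):

library(kernlab)
data(spam)                                  # example data shipped with kernlab
fit <- ksvm(type ~ ., data = spam[1:300, ],
            kernel = "vanilladot", C = 1)   # linear kernel
fit                                         # training error and support vectors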

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to mimic plot=F for truehist?

2007-05-25 Thread Vladimir Eremeev

By defining your own function.
You can get the function body by typing its name at the R command line and
pressing Enter.

Copy and paste the function body into an ASCII file (R source code), redefine
it as you like, for example by adding the desired argument and code for
processing it, then source that file and use your customized function.
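
A different route (a sketch, separate from the redefinition approach
above): hist() already has plot = FALSE, so you can compute the pieces
first and only draw with truehist() afterwards.

library(MASS)
x <- rnorm(100)
h <- hist(x, plot = FALSE)          # breaks, counts, density; nothing drawn
# inspect h$breaks etc., then plot with a matching bin width
truehist(x, h = diff(h$breaks)[1])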


Johannes Graumann-2 wrote:
 
 Dear Rologists,
 
 In order to combine plots I need to get access to the some pars specific
 to my plot prior to replot it with modified parameters. I have not found
 any option like plot=F associated with truehist and would like to know
 whether someone can point out how to overcome this problem.
 
 

-- 
View this message in context: 
http://www.nabble.com/how-to-mimic-plot%3DF-for-truehist--tf3815196.html#a10800310
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] in unix opening data object created under win

2007-05-25 Thread John Kane

--- [EMAIL PROTECTED] wrote:

 On Unix, R's version is 2.3.1 and on the PC it's 2.4.1.
 
 I dont have the rights to install newer version of R
 on Unix.
 
 I tried different upload methods. None worked.
 
 On Unix it looks as follows (dots to hide my
 userid):
 
  

 > load("/afs/ir/users/../project/ps/data/dtaa")
 > head(dtaa)
      hospid     mfpi1 mfpi2 mfpi3     mfpi4 mfpi5 mfpi6 mfpi7 mfpi8 mfpi9
 NA        9 0.1428571     1   0.5 0.2857143  0.50   0.0 0.333     0 0.333
 4041      9 0.1428571     0   0.0 0.2857143  0.25   0.2 0.000     0 1.000
  
 
 The data comes through but its screwed up.
 
 Thanks for your help.
 
 Toby

Hi Toby,
Except that the rest of the data is not showing up,
why do you say the data is screwed up?

I don't know what your data should look like but the
two rows above look okay.  

If you are having a compatibility problem with the two
versions of R, something you might want to try is
downloading 2.4.1 or 2.5.0 and installing it on a USB
stick.  You can run R quite nicely from a USB stick and it
might indicate if it is R or a corrupt file that is
the problem.

 
 
 
 Liaw, Andy wrote:
  What are the versions of R on the two platform? 
 Is the version on Unix
  at least as new as the one on Windows?
  
  Andy 
  
  From: [EMAIL PROTECTED]
  
 Hi All
 
 I am saving a dataframe in my MS-Win R with
 save().
 Then I copy it onto my personal AFS space.
 Then I start R and run it with emacs and load()
 the data.
 It loads only 2 lines: head() shows only two lines,
 nrow() also says it has only 2 lines, and I get an
 error message, when trying to use this data object,
 saying that some row numbers are missing.
 If anyone had similar situation, I appreciate
 letting me know.
 
 Best Toby

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] windows to unix

2007-05-25 Thread Erin Hodgess
Dear R People:

Is there any way to take a Windows version of R, compiled from source, 
compress it, and put it on a Unix-like environment, please?

thanks in advance,
Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] iplots problem

2007-05-25 Thread mister_bluesman

Hi. I try to load iplots using the following commands

 library(rJava)
 library(iplots)

but then I get the following error:

Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
Cannot create Java Virtual Machine
Error in library(iplots) : .First.lib failed for 'iplots'

What do I have to do to correct this?

Thanks
-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10801096
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] trouble with snow and Rmpi

2007-05-25 Thread Erin Hodgess
Dear R People:

I am having some trouble with the snow package.

It requires MPICH2 and Rmpi.

Rmpi is fine.  However, I downloaded the MPICH2 package, and installed.

There is no mpicc, mpirun, etc.

Does anyone have any suggestions, please?

Thanks in advance!

Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmer and scale parameter in glmm model

2007-05-25 Thread Olivier MARTIN
Hi all,

I am trying to fit a glmm with a binomial distribution and I would like to
verify that the scale parameter is close to 1...

The lmer function gives the following result:
Estimated scale (compare to 1)  0.766783

But I would like to know how this estimate (0.766783) is computed, and
whether it is possible to reproduce it from the various results
returned by lmer.

Thanks,
Olivier.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R-About PLSR

2007-05-25 Thread Nitish Kumar Mishra
hi R help group,
I have installed the pls package in R and used it with the princomp and
prcomp commands for calculating PCA on its example file (the USArrests
example). But how can I use pls for partial least squares, R-squared and
mvrCv? One more thing: how can I import an external file into R? When I
use plsr, R2 or RMSEP, it shows the error "could not find function plsr"
(or RMSEP, etc.).
How can I calculate PLS, R2, RMSEP, PCR and MVR using the pls package in R?
Thanking you



-- 
Nitish Kumar Mishra
Junior Research Fellow
BIC, IMTECH, Chandigarh, India
E-Mail Address:
[EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Martin Maechler
>>>>> "Adam" == Adam Witney [EMAIL PROTECTED]
>>>>>     on Fri, 25 May 2007 09:38:29 +0100 writes:

Adam Thanks for your replies. Details inline below:

Adam On 24/5/07 17:12, Martin Maechler [EMAIL PROTECTED] wrote:

 >>>>> "UweL" == Uwe Ligges [EMAIL PROTECTED]
 >>>>>     on Thu, 24 May 2007 17:34:16 +0200 writes:
 
UweL Some of these test failures are expected from time to time, since
UweL they are using random numbers. Just re-run.
 
 eehm,  some of these, yes, but not the ones Adam mentioned,
 d-p-q-r-tests.R.
 
 Adam, if you want more info you should report to us the *end*
 (last dozen of lines) of
 your d-p-q-r-tests.Rout[.fail]  file.

Adam Ok, here they are...

  [1] TRUE TRUE TRUE TRUE
  > 
  > ##-- non central Chi^2 :
  > xB <- c(2000,1e6,1e50,Inf)
  > for(df in c(0.1, 1, 10))
  +   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
  Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
  Execution halted

Ok, thanks;
so, if we want to learn more, we need
the output of something like

  xB <- c(2000,1e6,1e50,Inf)
  for(df in c(0.1, 1, 10))
    for(ncp in c(0, 1, 10, 100)) 
      print(pchisq(xB, df=df, ncp=ncp), digits == 15)



UweL BTW: We do have R-2.5.0 these days.
 
 Indeed! 
 
 And gcc 2.95.4 is also very old.
 Maybe you've recovered an old compiler / math-library bug from
 that antique compiler suite ?

Adam Yes, maybe I should start thinking about upgrading this box!

yes, at least start ... ;-)

Adam Thanks again

Adam adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-About PLSR

2007-05-25 Thread Gavin Simpson
On Fri, 2007-05-25 at 17:25 +0530, Nitish Kumar Mishra wrote:
 hi R help group,
 I have installed PLS package in R and use it for princomp  prcomp
 commands for calculating PCA using its example file(USArrests example).
 But How I can use PLS for Partial least square, R square, mvrCv one more
 think how i can import external file in R. When I use plsr, R2, RMSEP it
 show error could not find function plsr, RMSEP etc.
 How I can calculate PLS, R2, RMSEP, PCR, MVR using pls package in R.
 Thanking you

Did you load the package with:

library(pls)

before you tried to use the functions you mention?
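
For example, a minimal sketch using the yarn data that ships with pls:

library(pls)
data(yarn)
fit <- plsr(density ~ NIR, ncomp = 6, data = yarn, validation = "CV")
RMSEP(fit)   # cross-validated root mean squared error of prediction
R2(fit)      # R-squared per number of components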

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] normality tests

2007-05-25 Thread gatemaze
Hi all,

apologies for seeking advice on a general stats question. I've run
normality tests using 8 different methods:
- Lilliefors
- Shapiro-Wilk
- Robust Jarque Bera
- Jarque Bera
- Anderson-Darling
- Pearson chi-square
- Cramer-von Mises
- Shapiro-Francia

All show that the null hypothesis that the data come from a normal
distribution cannot be rejected. Great. However, I don't think it looks
nice to report the values of 8 different tests in a report. One note is
that my sample size is really tiny (fewer than 20 independent cases).
Without wanting to start a flame war, is there any advice on which
one/ones would be more appropriate and should be reported (along with
a Q-Q plot)? Thank you.
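
For reference, the Q-Q plot and a single test are quick to produce (a
sketch with placeholder data of roughly the size described):

x <- rnorm(18)          # stand-in for the real sample, n < 20
qqnorm(x); qqline(x)    # Q-Q plot with reference line
shapiro.test(x)         # one commonly reported test for small samples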

Regards,

-- 
yianni

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Competing Risks Analysis

2007-05-25 Thread Kevin E. Thorpe
I am working on a competing risks problem, specifically an analysis of
cause-specific mortality.  I am familiar with the cmprsk package and
have used it before to create cumulative incidence plots.  I also came
across an old (1998) s-news post from Dr. Terry Therneau describing
a way to use coxph to model competing risks.  I am re-producing the
post at the bottom of this message.

I would like to know if this approach is still reasonable or are there
other ways to go now.  I did an RSiteSearch with the term
"competing risks" and found some interesting articles but nothing as
specific as the post below.


- S-news Article Begins -
Competing risks

It's actually quite easy.

Assume a data set with n subjects and 4 types of competing events. Then
create a data set with 4n observations
First n obs: the data set you would create for an analysis of
time to event type 1, where all other event types are censored. An
extra variable etype is =1.
Second n obs: the data set you would create for time to event type 2,
with etype=2
.
.
.

Then
fit <- coxph(Surv(time, status) ~ <covariates> + strata(etype), ...)

1. Wei, Lin, and Weissfeld apply this to data sets where the competing
risks are not necessarily exclusive, i.e., time to progression and time
to death for cancer patients. JASA 1989, 1065-1073. If a given subject
can have more than one event, then you need to use the sandwich estimate
of variance, obtained by adding .. + cluster(id).. to the model
statement above, where id is variable unique to each subject.
(The method of fitting found in WLW, namely to do individual fits and
then glue the results together, is not necessary).

2. If a given subject can have at most one event, then it is not clear
that the sandwich estimate of variance is necessary. See Lunn and McNeil,
Biometrics (year?) for an example.

3. The covariates can be coded any way you like. WLW put in all of the
strata * covariate interactions for instance (the x coef is different for
each event type), but I never seem to have a big enough sample to justify
doing this. Lunn and McNeil use a certain coding of the treatment effect,
so that the betas are a contrast of interest to them; I've used similar
things
but never that particular one.

4. etype doesn't have to be 1,2,3,... of course; etype= 'paper',
'scissors', 'stone', 'pc' would work as well.

Terry M. Therneau, Ph.D.
- S-news Article Ends -
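
A concrete sketch of the stacked-data setup described above (the objects
are hypothetical: a data frame dat with columns time, status, event.type,
x and id, with exclusive event types, so no cluster(id) term is needed):

library(survival)
types <- unique(dat$event.type)
stacked <- do.call(rbind, lapply(types, function(tp) {
  d <- dat
  d$status <- ifelse(d$event.type == tp, d$status, 0)  # censor other event types
  d$etype <- tp
  d
}))
fit <- coxph(Surv(time, status) ~ x + strata(etype), data = stacked)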

-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: [EMAIL PROTECTED]  Tel: 416.864.5776  Fax: 416.864.6057

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Adam Witney

 
 ##-- non central Chi^2 :
 xB - c(2000,1e6,1e50,Inf)
 for(df in c(0.1, 1, 10))
   + for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) ==1)
   Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
   Execution halted
 
 Ok, thanks;
 so, if we want to learn more, we need
 the output of something like
 
   xB - c(2000,1e6,1e50,Inf)
   for(df in c(0.1, 1, 10))
for(ncp in c(0, 1, 10, 100))
print(pchisq(xB, df=df, ncp=ncp), digits == 15)

Here is the results:

   xB - c(2000,1e6,1e50,Inf)
   for(df in c(0.1, 1, 10))
+for(ncp in c(0, 1, 10, 100))
+print(pchisq(xB, df=df, ncp=ncp), digits == 15)
Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) :
object digits not found

Thanks again...

adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with numerical integration and optimization with BFGS

2007-05-25 Thread Ravi Varadhan
Deepankar,

If the problem seems to be in the evaluation of the numerical quadrature part,
you might want to try quadrature methods that are better suited to
integrands with strong peaks.  The traditional Gaussian quadrature methods,
even their adaptive versions such as Gauss-Kronrod, are not best suited for
such integrands because they do not explicitly account for the peakedness of
the integrand, and hence can be inefficient and inaccurate. See the article
below:
http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.eduzSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf

Alan Genz has worked on this problem a lot and has a number of computational
tools available. I used some of them when I was working on computing Bayes
factors for binomial regression models with different link functions.  If
you are interested, check the following:

http://www.math.wsu.edu/faculty/genz/software/software.html.

For your immediate needs, there is an R package called mnormt that has
functions for computing integrals under multivariate normal (and
multivariate t) densities, which are actually based on Genz's Fortran
routines.  You could try that.
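
For instance (a sketch, assuming mnormt is installed), a bivariate normal
probability over a rectangle:

library(mnormt)
mu <- c(0, 0)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2, 2)   # correlation 0.5
sadmvn(lower = c(-1, -Inf), upper = c(1, 0),
       mean = mu, varcov = Sigma)          # P(-1 < X1 < 1, X2 < 0)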

Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu
Sent: Friday, May 25, 2007 12:02 AM
To: Prof Brian Ripley
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Problem with numerical integration and optimization with
BFGS

Prof. Ripley,

The code that I provided with my question of course does not contain
code for the derivatives; but I am supplying analytical derivatives in
my full program. I did not include that code with my question because
that would have added about 200 more lines of code without adding any
new information relevant for my question. The problem that I had pointed
to occurs whether I provide analytical derivatives or not to the
optimization routine. And the problem was that when I use the BFGS
method in optim, I get an error message saying that the integrals are
probably divergent; I know, on the other hand, that the integrals are
convergent. The same problem does not arise when I instead use the
Nelder-Mead method in optim.

Your suggestion that the expression can be analytically integrated
(which will involve pnorm) might be correct though I do not see how to
do that. The integrands are the bivariate normal density functions with
one variable replaced by known quantities while I integrate over the
second. 

For instance, the first integral is as follows: the integrand is the
bivariate normal density function (with general covariance matrix) where
the second variable has been replaced by 
y[i] - rho1*y[i-1] + delta 
and I integrate over the first variable; the range of integration is
lower=-y[i]+rho1*y[i-1]
upper=y[i]-rho1*y[i-1]

The other two integrals are very similar. It would be of great help if
you could point out how to integrate the expressions analytically using
pnorm.

Thanks.
Deepankar


On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote:
 You are trying to use a derivative-based optimization method without 
 supplying derivatives.  This will use numerical approximations to the 
 derivatives, and your objective function will not be suitable as it is 
 internally using adaptive numerical quadrature and hence is probably not 
 close enough to a differentiable function (it may well have steps).
 
 I believe you can integrate analytically (the answer will involve pnorm), 
 and that you can also find analytical derivatives.
 
 Using (each of) numerical optimization and integration is a craft, and it 
 seems you need to know more about it.  The references on ?optim are too 
 advanced I guess, so you could start with Chapter 16 of MASS and its 
 references.
 
 On Thu, 24 May 2007, Deepankar Basu wrote:
 
  Hi R users,
 
  I have a couple of questions about some problems that I am facing with
  regard to numerical integration and optimization of likelihood
  functions. Let me provide a little background information: I am trying
  to do maximum likelihood estimation of an econometric model that I have
  developed recently. I estimate the parameters of the model using the
  monthly US unemployment rate series obtained from the Federal Reserve
  Bank of St. Louis. (The data is freely available from their web-based
  database called FRED-II).
 
  For my model, the likelihood function for each observation is the sum of
  three integrals. The integrand in each of these integrals is of the
  following form:
 
  

Re: [R] Problem with numerical integration and optimization with BFGS

2007-05-25 Thread Deepankar Basu
Ravi,

Thanks a lot for your detailed suggestions. I will certainly look at the
links that you have sent and the package mnormt. For the moment, I
have managed to analytically integrate the expression using pnorm
along the lines suggested by Prof. Ripley yesterday. 

For instance, my first integral becomes the following:

f1 <- function(w1, w0) {

   a <- 1/(2*(1-rho2^2)*sigep^2)
   b <- (rho2*(w1-w0+delta))/((1-rho2^2)*sigep*sigeta)
   c <- ((w1-w0+delta)^2)/(2*(1-rho2^2)*sigeta^2)
   d <- muep
   k <- 2*pi*sigep*sigeta*(sqrt(1-rho2^2))

   b1 <- ((-2*a*d - b)^2)/(4*a) - a*d^2 - b*d - c
   b21 <- sqrt(a)*(w1-rho1*w0) + (-2*a*d - b)/(2*sqrt(a))
   b22 <- sqrt(a)*(-w1+rho1*w0) + (-2*a*d - b)/(2*sqrt(a))
   b31 <- 2*pnorm(b21*sqrt(2)) - 1  # ERROR FUNCTION
   b32 <- 2*pnorm(b22*sqrt(2)) - 1  # ERROR FUNCTION
   b33 <- as.numeric(w1-rho1*w0 >= 0)*(b31-b32)

   return(sqrt(pi)*(1/(2*k*sqrt(a)))*exp(b1)*b33)
}

for (i in 2:n) {
   out1 <- f1(y[i], y[i-1])
}

I have worked out similar expressions for the other two integrals also. 

Deepankar

On Fri, 2007-05-25 at 09:56 -0400, Ravi Varadhan wrote:
 Deepankar,
 
 If the problem seems to be in the evaluation of numerical quadrature part,
 you might want to try quadrature methods that are better suited to
 integrands with strong peaks.  The traditional Gaussian quadrature methods,
 even their adaptive versions such as Gauss-Kronrod, are not best suited for
 integrating because they do not explicitly account for the peakedness of
 the integrand, and hence can be inefficient and inaccurate. See the article
 below:
 http://citeseer.ist.psu.edu/cache/papers/cs/18996/http:zSzzSzwww.sci.wsu.edu
 zSzmathzSzfacultyzSzgenzzSzpaperszSzmvn.pdf/genz92numerical.pdf
 
 Alan Genz has worked on this problem a lot and has a number of computational
 tools available. I used some of them when I was working on computing Bayes
 factors for binomial regression models with different link functions.  If
 you are interested, check the following:
 
 http://www.math.wsu.edu/faculty/genz/software/software.html.
 
 For your immediate needs, there is an R package called mnormt that has a
 function for computing integrals under a multivariate normal (and
 multivariate t) densities, which is actually based on Genz's Fortran
 routines.  You could try that.
 
 Ravi.
 
 
 
 ---
 
 Ravi Varadhan, Ph.D.
 
 Assistant Professor, The Center on Aging and Health
 
 Division of Geriatric Medicine and Gerontology 
 
 Johns Hopkins University
 
 Ph: (410) 502-2619
 
 Fax: (410) 614-9625
 
 Email: [EMAIL PROTECTED]
 
 Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
 
  
 
 
 
 
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Deepankar Basu
 Sent: Friday, May 25, 2007 12:02 AM
 To: Prof Brian Ripley
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Problem with numerical integration and optimization with
 BFGS
 
 Prof. Ripley,
 
 The code that I provided with my question of course does not contain
 code for the derivatives; but I am supplying analytical derivatives in
 my full program. I did not include that code with my question because
 that would have added about 200 more lines of code without adding any
 new information relevant for my question. The problem that I had pointed
 to occurs whether I provide analytical derivatives or not to the
 optimization routine. And the problem was that when I use the BFGS
 method in optim, I get an error message saying that the integrals are
 probably divergent; I know, on the other hand, that the integrals are
 convergent. The same problem does not arise when I instead use the
 Nelder-Mead method in optim.
 
 Your suggestion that the expression can be analytically integrated
 (which will involve pnorm) might be correct though I do not see how to
 do that. The integrands are the bivariate normal density functions with
 one variable replaced by known quantities while I integrate over the
 second. 
 
 For instance, the first integral is as follows: the integrand is the
 bivariate normal density function (with general covariance matrix) where
 the second variable has been replaced by 
 y[i] - rho1*y[i-1] + delta 
 and I integrate over the first variable; the range of integration is
 lower=-y[i]+rho1*y[i-1]
 upper=y[i]-rho1*y[i-1]
 
 The other two integrals are very similar. It would be of great help if
 you could point out how to integrate the expressions analytically using
 pnorm.
 
 Thanks.
 Deepankar
 
 
 On Fri, 2007-05-25 at 04:22 +0100, Prof Brian Ripley wrote:
  You are trying to use a derivative-based optimization method without 
  supplying derivatives.  This will use numerical approoximations to the 
  derivatives, and your objective function will not be suitable as it is 
  

[R] Help with complex lme model fit

2007-05-25 Thread Colin Beale
 Hi R helpers,

 I'm trying to fit a rather complex model to some simulated data using
 lme and am not getting the correct results. It seems there might be
 some identifiability issues that could possibly be dealt with by
 specifying starting parameters - but I can't see how to do this. I'm
 comparing results from R to those obtained when using GenStat...

 The raw data are available on the web at
 http://cmbeale.freehostia.com/OutData.txt and can be read directly
 into R using:

 gpdat <- read.table("http://cmbeale.freehostia.com/OutData.txt",
                     header = TRUE)
 gpdat$X7 <- as.factor(gpdat$X7)
 gpdat$X4 <- as.factor(gpdat$X4)
 rand_mat <- as.matrix(gpdat[,11:26])
 gpdat <- groupedData(Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum | .g,
                      data = gpdat)


 the model is fitted using:

 library(Matrix)
 library(nlme)

 m_sum <- rowSums(gpdat[,11:27])
 mod1 <- lme(fixed = Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum,
             random = pdBlocked(list(pdIdent(~ 1), pdIdent(~ X6 - 1),
                                     pdIdent(~ X7 - 1),
                                     pdIdent(~ rand_mat - 1))),
             data = gpdat)

 Which should recover the variance components:

 var_label        var_est
 rand_mat_scalar  0.00021983
 X6_scalar        0.62314002
 X7_scalar        0.03853604

 as recovered by GenStat and used to generate the dataset. Instead I
 get:

 X6        0.6231819
 X7        0.05221481
 rand_mat  1.377596e-11

 However, if I change or drop either X5 or X6, I then get much closer
 estimates to what is expected. For example:

 mod2 <- lme(fixed = Y1 ~ X1 + X2 + X3 + X4 + X5 + m_sum,
             random = pdBlocked(list(pdIdent(~ 1), pdIdent(~ X6 - 1),
                                     pdIdent(~ as.numeric(X7) - 1),
                                     pdIdent(~ rand_mat - 1))),
             data = gpdat)

 returns variance components:
 X6        0.6137986
 X7        Not meaningful
 rand_mat  0.0006119088

 which is much closer to those used to generate the dataset for the
 parameters that are now meaningful, and has appropriate random effect
 estimates for the -rand_mat columns (the variable of most interest
 here). This suggests to me that there is some identifiability issue
that
 might be helped by giving different starting values. Is this
possible?
 Or does anyone have any other suggestions?

 Thanks,

 Colin

 sessionInfo:
R version 2.5.0 (2007-04-23) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United Kingdom.1252;
LC_CTYPE=English_United Kingdom.1252;
LC_MONETARY=English_United Kingdom.1252;
LC_NUMERIC=C;
LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices datasets  tcltk utils
methods   base 

other attached packages:
      nlme    Matrix   lattice  svSocket      svIO    R2HTML    svMisc     svIDE
    3.1-80 0.9975-11    0.15-5     0.9-5     0.9-5      1.58     0.9-5     0.9-5



Dr. Colin Beale
Spatial Ecologist
The Macaulay Institute
Craigiebuckler
Aberdeen
AB15 8QH
UK

Tel: 01224 498245 ext. 2427
Fax: 01224 311556
Email: [EMAIL PROTECTED] 



-- 
Please note that the views expressed in this e-mail are those of the
sender and do not necessarily represent the views of the Macaulay
Institute. This email and any attachments are confidential and are
intended solely for the use of the recipient(s) to whom they are
addressed. If you are not the intended recipient, you should not read,
copy, disclose or rely on any information contained in this e-mail, and
we would ask you to contact the sender immediately and delete the email
from your system. Thank you.
Macaulay Institute and Associated Companies, Macaulay Drive,
Craigiebuckler, Aberdeen, AB15 8QH.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Martin Maechler
>>>>> "Adam" == Adam Witney [EMAIL PROTECTED]
>>>>>     on Fri, 25 May 2007 14:48:18 +0100 writes:

 ##-- non central Chi^2 :
 xB <- c(2000,1e6,1e50,Inf)
 for(df in c(0.1, 1, 10))
 +   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
 Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
 Execution halted
 
 Ok, thanks;
 so, if we want to learn more, we need
 the output of something like
 
 xB <- c(2000,1e6,1e50,Inf)
 for(df in c(0.1, 1, 10))
   for(ncp in c(0, 1, 10, 100))
     print(pchisq(xB, df=df, ncp=ncp), digits == 15)

Adam Here is the results:

 xB <- c(2000,1e6,1e50,Inf)
 for(df in c(0.1, 1, 10))
Adam +for(ncp in c(0, 1, 10, 100))
Adam +print(pchisq(xB, df=df, ncp=ncp), digits == 15)
Adam Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) :
Adam object digits not found

well, that's a typo - I think - you should have been able to fix it
(I did say "something like ...").
Just replace the '==' by '='

Martin

Adam Thanks again...

Adam adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Make check failure for R-2.4.1

2007-05-25 Thread Adam Witney

 
 Adam Here is the results:
 
 xB <- c(2000,1e6,1e50,Inf)
 for(df in c(0.1, 1, 10))
 Adam +for(ncp in c(0, 1, 10, 100))
 Adam +print(pchisq(xB, df=df, ncp=ncp), digits == 15)
 Adam Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15)
 :
 Adam object digits not found
 
 well, that's a typo - I think - you should have been able to fix
 (I said something like ...).
 Just do replace the '==' by '='

Sorry, my R is very limited... Here is the output with an '=' instead:

> xB <- c(2000,1e6,1e50,Inf)
> for(df in c(0.1, 1, 10))
+     for(ncp in c(0, 1, 10, 100))
+         print(pchisq(xB, df=df, ncp=ncp), digits = 15)
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1
[1] 1 1 1 1

Thanks again

Adam

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Read in 250K snp chips

2007-05-25 Thread bhiggs

I'm having trouble getting summaries out of the 250K SNP chips in R.  I'm
using the oligo package, and when I attempt to create the necessary SnpQSet
object (to get genotype calls and intensities) using snprma, I encounter
memory issues.

Does anyone have an alternative package or workaround for these large SNP chips?
-- 
View this message in context: 
http://www.nabble.com/Read-in-250K-snp-chips-tf3816761.html#a10805124
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] File path expansion

2007-05-25 Thread McGehee, Robert
R-Help,
I discovered a mis-feature in ghostscript, which is used by the bitmap
function. It seems that specifying file names in the form "~/abc.png"
rather than "/home/directory/abc.png" causes my GS to crash when I open
the bitmap device on my Linux box.

The easiest solution would seem to be to intercept any file names in the
form "~/abc.png" and replace the "~" with the user's home directory. I'm
sure I could come up with something involving regular expressions and
system calls to do this in Linux, but even that might not be system
independent. So, I wanted to see if anyone knew of a native R solution
for converting "~" to its full path expansion.

Thanks,
Robert

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Read in 250K snp chips

2007-05-25 Thread James W. MacDonald
bhiggs wrote:
 I'm having trouble getting summaries out of the 250K snp chips in R.  I'm
 using the oligo package and when I attempt to create the necessary SnpQSet
 object (to get genotype calls and intensities) using snprma, I encounter
 memory issues.
 
 Anyone have an alternative package or workaround for these large snp chips?


Oligo is a Bioconductor package, so you should probably direct questions 
to the Bioconductor mailing list rather than to R-help.

Best,

Jim


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


**
Electronic Mail is not secure, may not be read every day, and should not be 
used for urgent or sensitive issues.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] File path expansion

2007-05-25 Thread Gabor Grothendieck
Try

?path.expand

On 5/25/07, McGehee, Robert [EMAIL PROTECTED] wrote:
 R-Help,
 I discovered a mis-feature in ghostscript, which is used by the bitmap
 function. It seems that specifying file names in the form "~/abc.png"
 rather than "/home/directory/abc.png" causes my GS to crash when I open
 the bitmap device on my Linux box.

 The easiest solution would seem to be to intercept any file names in the
 form "~/abc.png" and replace the "~" with the user's home directory. I'm
 sure I could come up with something involving regular expressions and
 system calls to do this in Linux, but even that might not be system
 independent. So, I wanted to see if anyone knew of a native R solution
 for converting "~" to its full path expansion.

 Thanks,
 Robert

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trouble with snow and Rmpi

2007-05-25 Thread Ramon Diaz-Uriarte
Dear Erin,

What operating system are you trying this on? Windows? In Linux you
definitely don't need MPICH2 but, rather, LAM/MPI.

Best,

R.

On 5/25/07, Erin Hodgess [EMAIL PROTECTED] wrote:
 Dear R People:

 I am having some trouble with the snow package.

 It requires MPICH2 and Rmpi.

 Rmpi is fine.  However, I downloaded the MPICH2 package, and installed.

 There is no mpicc, mpirun, etc.

 Does anyone have any suggestions, please?

 Thanks in advance!

 Sincerely,
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] File path expansion

2007-05-25 Thread Martin Maechler

> path.expand("~")
[1] "/home/maechler"

>>>>> "RobMcG" == McGehee, Robert [EMAIL PROTECTED]
>>>>>     on Fri, 25 May 2007 11:44:27 -0400 writes:

RobMcG R-Help,
RobMcG I discovered a mis-feature in ghostscript, which is used by the bitmap
RobMcG function. It seems that specifying file names in the form "~/abc.png"
RobMcG rather than "/home/directory/abc.png" causes my GS to crash when I open
RobMcG the bitmap device on my Linux box.

RobMcG The easiest solution would seem to be to intercept any file names in the
RobMcG form "~/abc.png" and replace the "~" with the user's home directory. I'm
RobMcG sure I could come up with something involving regular expressions and
RobMcG system calls to do this in Linux, but even that might not be system
RobMcG independent. So, I wanted to see if anyone knew of a native R solution
RobMcG of converting "~" to its full path expansion.

RobMcG Thanks,
RobMcG Robert

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] windows to unix

2007-05-25 Thread Martin Maechler
 Erin == Erin Hodgess [EMAIL PROTECTED]
 on Fri, 25 May 2007 06:10:10 -0500 writes:

Erin Dear R People:
Erin Is there any way to take a Windows version of R, compiled from 
source, 
Erin compress it, and put it on a Unix-like environment, please?

Since nobody has answered yet, let a die-hard non-windows user
try:

Just 'zip' the corresponding directory and copy the zip
file to your unix-like environment.

I assume the only things this does not contain
would be the
- registry entries (which used to be optional anyway;
I'm not sure if that's still true) 
- desktop  links to R
- startup menu links to R

but the last two can easily be recreated after people copy the
zip file and unpack it in their Windows environment --
which I assume is the purpose of the whole procedure.

{Please reply to R-help, not me; I am *the* Windows non-expert ...}

Martin


Erin thanks in advance,
Erin Sincerely,
Erin Erin Hodgess
Erin Associate Professor
Erin Department of Computer and Mathematical Sciences
Erin University of Houston - Downtown
Erin mailto: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests

2007-05-25 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
 Hi all,
 
  apologies for seeking advice on a general stats question. I've run
 normality tests using 8 different methods:
 - Lilliefors
 - Shapiro-Wilk
 - Robust Jarque Bera
 - Jarque Bera
 - Anderson-Darling
 - Pearson chi-square
 - Cramer-von Mises
 - Shapiro-Francia
 
 All show that the null hypothesis that the data come from a normal
 distro cannot be rejected. Great. However, I don't think it looks nice
 to report the values of 8 different tests on a report. One note is
 that my sample size is really tiny (less than 20 independent cases).
  Without wanting to start a flame war, is there any advice on which
  one(s) would be more appropriate and should be reported (along with
 a Q-Q plot). Thank you.
 
 Regards,
 

Wow - I have so many concerns with that approach that it's hard to know 
where to begin.  But first of all, why care about normality?  Why not 
use distribution-free methods?

You should examine the power of the tests for n=20.  You'll probably 
find it's not good enough to reach a reliable conclusion.

Frank


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests

2007-05-25 Thread gatemaze
On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
  Hi all,
 
   apologies for seeking advice on a general stats question. I've run
  normality tests using 8 different methods:
  - Lilliefors
  - Shapiro-Wilk
  - Robust Jarque Bera
  - Jarque Bera
  - Anderson-Darling
  - Pearson chi-square
  - Cramer-von Mises
  - Shapiro-Francia
 
  All show that the null hypothesis that the data come from a normal
  distro cannot be rejected. Great. However, I don't think it looks nice
  to report the values of 8 different tests on a report. One note is
  that my sample size is really tiny (less than 20 independent cases).
   Without wanting to start a flame war, is there any advice on which
   one(s) would be more appropriate and should be reported (along with
  a Q-Q plot). Thank you.
 
  Regards,
 

 Wow - I have so many concerns with that approach that it's hard to know
 where to begin.  But first of all, why care about normality?  Why not
 use distribution-free methods?

 You should examine the power of the tests for n=20.  You'll probably
 find it's not good enough to reach a reliable conclusion.

And wouldn't it be even worse if I used non-parametric tests?


 Frank


 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University



-- 
yianni

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] windows to unix

2007-05-25 Thread Barry Rowlingson
Martin Maechler wrote:
 Erin == Erin Hodgess [EMAIL PROTECTED]
 on Fri, 25 May 2007 06:10:10 -0500 writes:
 
 Erin Dear R People:
 Erin Is there any way to take a Windows version of R, compiled from 
 source, 
 Erin compress it, and put it on a Unix-like environment, please?

 Just 'zip' the corresponding directory and copy the zip
  file to your unix-like environment.
 


  You can take a Windows-compiled R to Unix, but you can't make it work.

  The big unasked question is 'What is this unix-like environment?'.

  Linux isn't Unix, so maybe you mean that, in which case you'll not
make your Windows-compiled R run. Not without 'Wine' or some other layer
of obfuscation.

Barry

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xyplot: different scales across rows, same scales within rows

2007-05-25 Thread Marta Rufino
Dear list members,


I would like to set up a multiple panel in xyplots, with the same scale 
for all columns in each row, but different across rows.
relation="free" would set up all x or y scales free... which is not what
I want :-(

Is this possible?


Thank you in advance,
Best wishes,
Marta

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread Liaw, Andy
From: [EMAIL PROTECTED]
 
 On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
  [EMAIL PROTECTED] wrote:
   Hi all,
  
   apologies for seeking advice on a general stats question. I've run
   normality tests using 8 different methods:
   - Lilliefors
   - Shapiro-Wilk
   - Robust Jarque Bera
   - Jarque Bera
   - Anderson-Darling
   - Pearson chi-square
   - Cramer-von Mises
   - Shapiro-Francia
  
   All show that the null hypothesis that the data come from a normal
   distro cannot be rejected. Great. However, I don't think it looks nice
   to report the values of 8 different tests on a report. One note is
   that my sample size is really tiny (less than 20 independent cases).
   Without wanting to start a flame war, is there any advice on which
   one(s) would be more appropriate and should be reported (along with
   a Q-Q plot). Thank you.
  
   Regards,
  
 
  Wow - I have so many concerns with that approach that it's hard to know
  where to begin.  But first of all, why care about normality?  Why not
  use distribution-free methods?
 
  You should examine the power of the tests for n=20.  You'll probably
  find it's not good enough to reach a reliable conclusion.
 
 And wouldn't it be even worse if I used non-parametric tests?

I believe what Frank meant was that it's probably better to use a
distribution-free procedure to do the real test of interest (if there is
one) instead of testing for normality, and then use a test that assumes
normality.

I guess the question is, what exactly do you want to do with the outcome
of the normality tests?  If those are going to be used as basis for
deciding which test(s) to do next, then I concur with Frank's
reservation.

Generally speaking, I do not find goodness-of-fit for distributions very
useful, mostly for the reason that failure to reject the null is no
evidence in favor of the null.  It's difficult for me to imagine why
"there is insufficient evidence to show that the data did not come from a
normal distribution" would be interesting.

Andy

 
 
  Frank
 
 
  --
  Frank E Harrell Jr   Professor and Chair   School 
 of Medicine
Department of Biostatistics   
 Vanderbilt University
 
 
 
 -- 
 yianni
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] File path expansion

2007-05-25 Thread Prof Brian Ripley
On Fri, 25 May 2007, Martin Maechler wrote:


  path.expand("~")
  [1] "/home/maechler"

Yes, but beware that may not do what you want on Windows in R >= 2.5.0, 
since someone changed the definition of 'home' but not path.expand.


 RobMcG == McGehee, Robert [EMAIL PROTECTED]
 on Fri, 25 May 2007 11:44:27 -0400 writes:

RobMcG R-Help,
RobMcG I discovered a mis-feature in ghostscript, which is used by the bitmap
RobMcG function. It seems that specifying file names in the form "~/abc.png"
RobMcG rather than "/home/directory/abc.png" causes my GS to crash when I open
RobMcG the bitmap device on my Linux box.

RobMcG The easiest solution would seem to be to intercept any file names in the
RobMcG form "~/abc.png" and replace the "~" with the user's home directory. I'm
RobMcG sure I could come up with something involving regular expressions and
RobMcG system calls to do this in Linux, but even that might not be system
RobMcG independent. So, I wanted to see if anyone knew of a native R solution
RobMcG of converting "~" to its full path expansion.

RobMcG Thanks,
RobMcG Robert

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] In which package is the errbar command located?

2007-05-25 Thread Judith Flores

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread Lucke, Joseph F
 Most standard tests, such as t-tests and ANOVA, are fairly resistant to
non-normality for significance testing. It's the sample means that have
to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
for normality prior to choosing a test statistic is generally not a good
idea. 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
Sent: Friday, May 25, 2007 12:04 PM
To: [EMAIL PROTECTED]; Frank E Harrell Jr
Cc: r-help
Subject: Re: [R] normality tests [Broadcast]

From: [EMAIL PROTECTED]
 
 On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
  [EMAIL PROTECTED] wrote:
   Hi all,
  
    apologies for seeking advice on a general stats question. I've run
    normality tests using 8 different methods:
   - Lilliefors
   - Shapiro-Wilk
   - Robust Jarque Bera
   - Jarque Bera
   - Anderson-Darling
   - Pearson chi-square
   - Cramer-von Mises
   - Shapiro-Francia
  
    All show that the null hypothesis that the data come from a normal
    distro cannot be rejected. Great. However, I don't think it looks nice
    to report the values of 8 different tests on a report. One note is
    that my sample size is really tiny (less than 20 independent cases).
    Without wanting to start a flame war, is there any advice on which
    one(s) would be more appropriate and should be reported (along with
    a Q-Q plot). Thank you.
  
   Regards,
  
 
   Wow - I have so many concerns with that approach that it's hard to know
   where to begin.  But first of all, why care about normality?  Why not
   use distribution-free methods?

   You should examine the power of the tests for n=20.  You'll probably
   find it's not good enough to reach a reliable conclusion.
 
 And wouldn't it be even worse if I used non-parametric tests?

I believe what Frank meant was that it's probably better to use a
distribution-free procedure to do the real test of interest (if there is
one) instead of testing for normality, and then use a test that assumes
normality.

I guess the question is, what exactly do you want to do with the outcome
of the normality tests?  If those are going to be used as basis for
deciding which test(s) to do next, then I concur with Frank's
reservation.

Generally speaking, I do not find goodness-of-fit for distributions very
useful, mostly for the reason that failure to reject the null is no
evidence in favor of the null.  It's difficult for me to imagine why
"there is insufficient evidence to show that the data did not come from a
normal distribution" would be interesting.

Andy

 
 
  Frank
 
 
  --
  Frank E Harrell Jr   Professor and Chair   School 
 of Medicine
Department of Biostatistics   
 Vanderbilt University
 
 
 
 --
 yianni
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Mike Lawrence

MathCad code failed to attach last time. Here it is.


On 25-May-07, at 2:24 PM, Mike Lawrence wrote:


I came across this reference:
http://www.informaworld.com/smpp/content?content=10.1080/03610920600683689


The authors sent me code (attached with permission) in MathCad to  
perform the calculations in which I'm interested. However, I do not  
have MathCad nor experience with its syntax, so I thought I'd send  
the code to the list to see if anyone with more experience with R  
and MathCad would be interested in making this code into a function  
or package of some sort.


Mike

Begin forwarded message:


From: Noyan Turkkan [EMAIL PROTECTED]
Date: May 25, 2007 11:09:02 AM ADT
To: Mike Lawrence [EMAIL PROTECTED]
Subject: Re: R code for 'Density of the Ratio of Two Normal  
Random Variables'?


Hi Mike
I do not know if anyone coded my approach in R. However if you
have access to MathCad, I am including a MathCad file (also in
PDF) which computes the density of the ratio of 2 dependent
normal variables, its mean & variance. If you do not have access to
MathCad, you will see that all the computations can be easily
programmed in R, as I replaced the hypergeometric function with the
erf function. I am not very familiar with R but the erf function may
be programmed as: erf <- function(x) 2*pnorm(x * sqrt(2)) - 1.
Good luck.
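
A quick simulation cross-check of any implementation (a rough sketch; the
means, variances and correlation below are arbitrary made-up values):

library(MASS)                                    # for mvrnorm
set.seed(42)
Sigma <- matrix(c(1, 0.5, 0.5, 1), 2)            # unit variances, rho = 0.5
Z <- mvrnorm(1e5, mu = c(1, 2), Sigma = Sigma)   # dependent normal pair
plot(density(Z[, 1] / Z[, 2]), xlim = c(-2, 3))  # empirical ratio density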


Noyan Turkkan, ing.
Professeur titulaire & directeur / Professor & Head
Dépt. de génie civil / Civil Eng. Dept.
Faculté d'ingénierie / Faculty of Engineering
Université de Moncton
Moncton, N.B., Canada, E1A 3E9



 Mike Lawrence [EMAIL PROTECTED] 05/25/07 9:20 am 
Hi Dr. Turkkan,

I am working on a problem that necessitates the estimation of the
mean and variance of the ratio of two dependent normal random
variables and in my search for methods to achieve such estimation I
came across your paper 'Density of the Ratio of Two Normal Random
Variables' (2006). I'm not a statistics or math expert by any means,
but I am quite familiar with the R programming language; do you
happen to know whether anyone has coded your approach for R yet?

Cheers,

Mike

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein

Ratio2depNV.pdf


--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting- 
guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xyplot: different scales across rows, same scales within rows

2007-05-25 Thread Gabor Grothendieck
xlim= can take a list:

# CO2 is built into R
library(lattice)
xlim <- rep(list(c(0, 1000), c(0, 2000)), each = 2)
xyplot(uptake ~ conc | Type * Treatment, data = CO2,
scales = list(relation = "free"), xlim = xlim)


On 5/25/07, Marta Rufino [EMAIL PROTECTED] wrote:
 Dear list members,


 I would like to set up a multiple panel in xyplots, with the same scale
  for all columns in each row, but different across rows.
  relation="free" would set up all x or y scales free... which is not what
 I want :-(

 Is this possible?


 Thank you in advance,
 Best wishes,
 Marta

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xyplot: different scales across rows, same scales within rows

2007-05-25 Thread Deepayan Sarkar
On 5/25/07, Marta Rufino [EMAIL PROTECTED] wrote:
 Dear list members,


 I would like to set up a multiple panel in xyplots, with the same scale
  for all columns in each row, but different across rows.
  relation="free" would set up all x or y scales free... which is not what
 I want :-(

 Is this possible?

It's possible, but requires some abuse of the Trellis design, which
doesn't really allow for such use. See

https://stat.ethz.ch/pipermail/r-help/2004-October/059396.html

-Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] In which package is the errbar command located?

2007-05-25 Thread John Kane
Hmisc
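
For instance (a quick sketch with made-up numbers, assuming Hmisc is
installed):

library(Hmisc)
x <- 1:5; y <- c(2, 4, 3, 5, 4); se <- rep(0.5, 5)
errbar(x, y, y + se, y - se)    # points with +/- 1 SE bars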
--- Judith Flores [EMAIL PROTECTED] wrote:

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Mike Lawrence
According to the paper I cited, there is controversy over the  
sufficiency of Hinkley's solution, hence their proposed more complete  
solution.

On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote:

 The exact ratio is given in

 "On the Ratio of Two Correlated Normal Random Variables", D. V.  
 Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639.


--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] iPlots package

2007-05-25 Thread ryestone

I am having trouble connecting two points in iplot. In the normal plot
command I would use segments(). I know there is a function ilines() but can
you just enter coordinates of 2 points?
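
A guess at the syntax, treating ilines() like base lines() (an untested
sketch with made-up coordinates):

library(iplots)
iplot(1:10, rnorm(10))
ilines(c(2, 7), c(-1, 1))   # should join the points (2,-1) and (7,1)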
-- 
View this message in context: 
http://www.nabble.com/iPlots-package-tf3817683.html#a10808180
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread gatemaze
Thank you all for your replies, they have been most useful... well
in my case I have chosen to do some parametric tests (more precisely
correlation and linear regressions among some variables)... so it
would be nice if I had an extra bit of support on my decisions... If I
understood well from all your replies... I shouldn't pay so much
attention to the normality tests, so it wouldn't matter which one(s)
I use to report... but rather focus on issues such as the power of the
test...

Thanks again.

On 25/05/07, Lucke, Joseph F [EMAIL PROTECTED] wrote:
  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
 non-normality for significance testing. It's the sample means that have
 to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
 for normality prior to choosing a test statistic is generally not a good
 idea.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: Friday, May 25, 2007 12:04 PM
 To: [EMAIL PROTECTED]; Frank E Harrell Jr
 Cc: r-help
 Subject: Re: [R] normality tests [Broadcast]

 From: [EMAIL PROTECTED]
 
  On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
   [EMAIL PROTECTED] wrote:
Hi all,
   
apologies for seeking advice on a general stats question. I've run
normality tests using 8 different methods:
- Lilliefors
- Shapiro-Wilk
- Robust Jarque Bera
- Jarque Bera
- Anderson-Darling
- Pearson chi-square
- Cramer-von Mises
- Shapiro-Francia
   
All show that the null hypothesis that the data come from a normal
distro cannot be rejected. Great. However, I don't think it looks nice
to report the values of 8 different tests on a report. One note is
that my sample size is really tiny (less than 20 independent cases).
Without wanting to start a flame war, is there any advice on which
one(s) would be more appropriate and should be reported (along with
a Q-Q plot). Thank you.
   
Regards,
   
  
   Wow - I have so many concerns with that approach that it's hard to know
   where to begin.  But first of all, why care about normality?  Why not
   use distribution-free methods?

   You should examine the power of the tests for n=20.  You'll probably
   find it's not good enough to reach a reliable conclusion.
 
  And wouldn't it be even worse if I used non-parametric tests?

 I believe what Frank meant was that it's probably better to use a
 distribution-free procedure to do the real test of interest (if there is
 one) instead of testing for normality, and then use a test that assumes
 normality.

 I guess the question is, what exactly do you want to do with the outcome
 of the normality tests?  If those are going to be used as basis for
 deciding which test(s) to do next, then I concur with Frank's
 reservation.

 Generally speaking, I do not find goodness-of-fit for distributions very
 useful, mostly for the reason that failure to reject the null is no
 evidence in favor of the null.  It's difficult for me to imagine why
 "there is insufficient evidence to show that the data did not come from a
 normal distribution" would be interesting.

 Andy


  
   Frank
  
  
   --
   Frank E Harrell Jr   Professor and Chair   School
  of Medicine
 Department of Biostatistics
  Vanderbilt University
  
 
 
  --
  yianni
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 


 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
yianni

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] /tmp/ gets filled up fast

2007-05-25 Thread Alessandro Gagliardi
Dear useRs,

I'm running some pretty big R scripts using a PBS that calls upon the
RMySQL library and it's filling up the /tmp/ directory with Rtmp*
files.  I thought the problem might have come from scripts crashing
and therefore not getting around to executing the dbDisconnect()
function, but I just ran a script that exited correctly and it still
left a giant file in /tmp/ without cleaning up after itself.  I'm
running all of my scripts using the --vanilla tag (saving the data I
need either into a separate .RData file or into the MySQL database).
Is there some other tag I should be using or some command I can put at
the end of my script to remove the Rtmp file when it's done?

Thank you,
-Alessandro

-- Forwarded message --
From: Yaroslav Halchenko
Date: May 24, 2007 7:42 PM
Subject: Re: mysql Access denied for user
To: Alessandro Gagliardi

Alessandro
We need to resolve the issue somehow better way... /tmp/ gets filled up
fast with your R tasks...

$ du -scm RtmpvKEqwr/
804 RtmpvKEqwr/
804 total

[EMAIL PROTECTED]:/tmp
$ ls -ld RtmpvKEqwr/
0 drwx-- 2 eklypse eklypse 80 2007-05-24 15:18 RtmpvKEqwr//

[EMAIL PROTECTED]:/tmp
$ df .
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sda6  1052184   1052184 0 100% /tmp


I had to remove it...

check if there is a way to use some other directory may be? or properly
clean up??

On Wed, 23 May 2007, Alessandro Gagliardi wrote:

 Well, this is a real problem then.  Because I'm generating tables that
 R cannot get into MySQL because they are too big and it times out
 before it's done.
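
One workaround to sketch here (not from the thread; the /scratch path is a
made-up example): R creates its per-session Rtmp* directory wherever the
TMPDIR environment variable points, so the job can be started with a
roomier location and the script can clean up explicitly at the end:

## shell:  TMPDIR=/scratch/$USER R --vanilla -f myscript.R
## at the end of the R script:
leftovers <- list.files(tempdir(), full.names = TRUE)
file.remove(leftovers)    # delete files R has left in its Rtmp* directory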

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iplots problem

2007-05-25 Thread ryestone

Did you load Sun's Java? Try that and also try rebooting your machine. This
worked for me.


mister_bluesman wrote:
 
 Hi. I try to load iplots using the following commands
 
 library(rJava)
 library(iplots)
 
 but then I get the following error:
 
 Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
 Cannot create Java Virtual Machine
 Error in library(iplots) : .First.lib failed for 'iplots'
 
 What do I have to do to correct this?
 
 I have jdk1.6 and jre1.6 installed on my windows machine
 Thanks
 

-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10808131
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculation of ratio distribution properties

2007-05-25 Thread Ravi Varadhan
Mike,

Attached is an R function to do this, along with an example that will
reproduce the MathCad plot shown in your attached paper. I haven't checked
it thoroughly, but it seems to reproduce the MathCad example well.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mike Lawrence
Sent: Friday, May 25, 2007 1:55 PM
To: Lucke, Joseph F
Cc: Rhelp
Subject: Re: [R] Calculation of ratio distribution properties

According to the paper I cited, there is controversy over the  
sufficiency of Hinkley's solution, hence their proposed more complete  
solution.

On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote:

 The exact ratio is given in

 "On the Ratio of Two Correlated Normal Random Variables", D. V.  
 Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639.


--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://myweb.dal.ca/mc973993
Public calendar: http://icalx.com/public/informavore/Public

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] In which package is the errbar command located?

2007-05-25 Thread Charles C. Berry


help.search("errbar")

shows the one installed on my system (and might on yours), but

RSiteSearch("errbar")

shows that more than one package contains a function by that name


On Fri, 25 May 2007, Judith Flores wrote:


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme with corAR1 errors - can't find AR coefficient in output

2007-05-25 Thread Stephen Weigand
Millo,

On 5/24/07, Millo Giovanni [EMAIL PROTECTED] wrote:

 Dear List,

 I am using the output of a ML estimation on a random effects model with
 first-order autocorrelation to make a further conditional test. My model
 is much like this (which reproduces the method on the famous Grunfeld
 data, for the econometricians out there it is Table 5.2 in Baltagi):

 library(Ecdat)
 library(nlme)
 data(Grunfeld)
 mymod <- lme(inv ~ value + capital, data = Grunfeld, random = ~1|firm,
              correlation = corAR1(0, ~year|firm))

 Embarrassing as it may be, I can find the autoregressive parameter
 ('Phi', if I get it right) in the printout of summary(mymod) but I am
 utterly unable to locate the corresponding element in the lme or
 summary.lme objects.

 Any help appreciated. This must be something stupid I'm overlooking,
 either in str(mymod) or in the help files, but it's a huge problem for
 me.



Try

coef(mymod$modelStruct$corStruct,
     unconstrained = FALSE)

Stephen
-- 
Rochester, Minn. USA
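
An alternative worth trying (a sketch, not from the original reply):
nlme's intervals() reports the same AR(1) parameter, labelled 'Phi',
together with confidence limits:

intervals(mymod, which = "var-cov")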

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] File path expansion

2007-05-25 Thread Duncan Murdoch
On 5/25/2007 1:09 PM, Prof Brian Ripley wrote:
 On Fri, 25 May 2007, Martin Maechler wrote:
 

  path.expand("~")
  [1] "/home/maechler"
 
 Yes, but beware that may not do what you want on Windows in R = 2.5.0, 
 since someone changed the definition of 'home' but not path.expand.

A more basic problem is that the definition of ~ in Windows is very 
ambiguous.  Is it my Cygwin home directory, where cd ~ would take me 
while in Cygwin?  Is it my Windows CSIDL_PERSONAL folder, usually 
%HOMEDRIVE%/%HOMEPATH%/My Documents?  Is it the parent of that folder, 
%HOMEDRIVE%/%HOMEPATH%?

~ is a shell concept that makes sense in Unix-like shells, but not in 
Windows.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Scale mixture of normals

2007-05-25 Thread Anup Nandialath
Dear Friends,

Is there an R package which implements regression
models with error distributions following a scale
mixture of normals? 

Thanks

Anup

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] covariance question which has nothing to do with R

2007-05-25 Thread toby909
(Replying while my other program is running.)

The reference I mentioned previously addresses exactly this. Snijders and 
Bosker's Multilevel Analysis book, pages 31 and 33, sections 3.6.2 and 3.6.3, 
discusses this.

When you say that the Xs are correlated then you would need to say according to 
which structure they are correlated:


(1,X1,Y1)
(1,X2,Y2)
(1,X3,Y3)
.
.
.
(1,X55,Y55)
(2,X56,Y56)
(2,X57,Y57)
.
.
.
(2,...

To pick some real-world examples, one row represents a person, or a stock, and 
the first column indicates the organization or country that person/stock 
belongs to. The Xs are then correlated within the organization/country.
You will have two covariances, a within-country and a between-country 
covariance of stocks.
This can be implemented in R manually, giving method-of-moments estimates, or 
the gls function can be used to obtain ML or REML estimates.

I am not a post doc, just a pre master :-)

Toby
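
A rough method-of-moments sketch of the within/between split (simulated
data, not from the thread; the grouping and effect sizes are made up):

set.seed(1)
g <- factor(rep(1:8, each = 25))         # grouping, e.g. country
x <- rnorm(nlevels(g))[g] + rnorm(200)   # Xs correlated within group
y <- 0.3 * x + rnorm(200)
xb <- ave(x, g); yb <- ave(y, g)         # per-group means
c(between = mean(xb * yb),               # between-group covariance
  within  = mean((x - xb) * (y - yb)))   # within-group covariance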


Leeds, Mark (IED) wrote:
 This is a covariance calculation question so nothing to do with R but
 maybe someone could help me anyway.
 
 Suppose, I have two random variables X and Y whose means are both known
 to be zero and I want to get an estimate of their covariance.
 
 I have n sample pairs 
 
 (X1,Y1)
 (X2,Y2)
 .
 .
 .
 .
 .
 (Xn,Yn)
 
 , so that the covariance estimate is clearly (1/n) * (sum from i = 1 to n
 of X_i*Y_i)
 
 But, suppose that it is know that the X_i are positively correlated with
 each other and that the Y_i are independent
 of each other.
 
 Then, does this change the formula for the covariance estimate at all ?
 Intuitively, I would think that, if the X_i's are positively
 correlated , then something should change because there is less info
 there than if they were independent but i'm not sure what should change
 and I couldn't find it in a book.  
 
 I can assume that the correlation between the X_i's is rho if this makes
 things easier ? Thanks.
 
 References are appreciated also.
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iplots problem

2007-05-25 Thread mister_bluesman

Hi

How did you 'load' Sun's Java in R?

Many thanks



ryestone wrote:
 
  Did you load Sun's Java? Try that and also try rebooting your machine.
 This worked for me.
 
 
 mister_bluesman wrote:
 
 Hi. I try to load iplots using the following commands
 
 library(rJava)
 library(iplots)
 
 but then I get the following error:
 
  Error in .jinit(cp, parameters = "-Xmx512m", silent = TRUE) : 
 Cannot create Java Virtual Machine
 Error in library(iplots) : .First.lib failed for 'iplots'
 
 What do I have to do to correct this?
 
 I have jdk1.6 and jre1.6 installed on my windows machine
 Thanks
 
 
 

-- 
View this message in context: 
http://www.nabble.com/iplots-problem-tf3815516.html#a10810602
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with rpart

2007-05-25 Thread Silvia Lomascolo

I work on Windows, R version 2.4.1.  I'm very new to R!

I am trying to build a classification tree using rpart but, although the
matrix has 108 variables, the program builds a tree with only one split
using one variable!  I know it is possible that only one variable is
informative, but I think it's unlikely.  I was wondering if someone can help
me identify if I'm doing something wrong because I can't see it, nor could I
find it in the help or in this forum.

I want to see whether I can predict disperser type (5 categories) of a
species given the volatile compounds that the fruits emit (108 volatiles).
I am writing:

dispvol.x <- read.table('C:\\Documents and
Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
dispvol.df <- as.data.frame(dispvol.x)
attach(dispvol.df) # I think I need to do this so the variables are
                   # identified when I write the regression equation
dispvol.ctree <- rpart(disperser ~ P3.70 + P4.29 + P5.05 + ... + P30.99 + P32.25
                       + TotArea, data = dispvol.df, method='class')

and I get the following output:

n= 28 

node), split, n, loss, yval, (yprob)
  * denotes terminal node

1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)  
  2) P10.01=1.185 10  4 bat (0.1 0.6 0.2 0 0.1) *
  3) P10.01 1.185 18  6 non (0 0.17 0 0.17 0.67) *

There is nothing special about P10.01 that I can see in my data and I don't
know why it chooses that variable and stops there!

My matrix looks something like this (except with a lot more variables):

disperser  P3.70  P4.29   P6.45   P6.55  P10.01  P10.15  P10.18  TotArea
ban         0.00   0.00    1.34    0.00    1.49    0.00    0.00     2.83
non         0.00   0.00    0.00  152.80    0.00   14.31    0.00   167.11
bat         0.00   0.00    0.00  131.56    0.65    0.00    0.00   132.21
bat         0.00   0.00    5.05    0.00   13.01    6.85    0.00    24.90
non         0.00   0.00   72.65  103.26    4.10    0.00    0.00   180.02
non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
bat         1.23   0.00    0.48    0.89    0.25    0.00    0.00     2.85
bat         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
non         0.00   0.00    0.00    0.00    1.06    0.00    0.00     1.06
bat         0.00   0.00    0.00    0.00   28.69    0.00   21.33    50.02
mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
non         0.00   0.00    0.00    0.00    1.15    0.00    0.00     1.15
non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
non         0.00   0.82    0.00    1.65    0.00    0.00    0.00     2.47
bat         0.00   0.00  133.24    0.00    3.13    0.00    0.00   136.37
bir         0.00   0.00   11.08    3.16    1.79    2.09    0.48    18.61
non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
bat         0.00   0.00    0.00    0.00    1.31    0.00    0.00     1.31
non         0.00   0.00    0.00    0.00    0.00    0.00    1.23     1.23
bat         0.00   0.00    1.81    0.00    2.84    0.00    0.00     4.65
non         0.00   0.00    1.18    0.00    0.73    0.00    0.00     1.91
bir         0.00   0.00    0.00    0.00    1.40    0.00    0.00     1.40
bat         0.00   0.00    8.16    1.50    1.22    0.00    0.00    10.88
mix         0.00   0.55    0.00    0.00    0.00    0.00    0.00     0.55
non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00

Thanks! Silvia.

-- 
View this message in context: 
http://www.nabble.com/Problem-with-rpart-tf3818436.html#a10810625
Sent from the R help mailing list archive at Nabble.com.
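
One guess worth checking (not a reply from the archive): with n = 28,
rpart's default stopping rules (minsplit = 20, cp = 0.01) leave almost no
room to grow the tree, so it can stop after one split no matter how many
variables are informative. Relaxing them shows whether other variables
compete:

library(rpart)   # sketch; assumes dispvol.df as above
fit <- rpart(disperser ~ ., data = dispvol.df, method = 'class',
             control = rpart.control(minsplit = 5, cp = 0.001))
printcp(fit)     # inspect the full cp table before pruning back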

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3D plots with data.frame

2007-05-25 Thread H. Paul Benton
Dear all, 
 
Thank you for any help. I have a data.frame and would like to plot
it in 3D. I have tried wireframe() and cloud(), I got

> scatterplot3d(xs)
Error: could not find function "scatterplot3d"

> wireframe(xs)
Error in wireframe(xs) : no applicable method for "wireframe"

> persp(x=x, y=y, z=xs)
Error in persp.default(x = x, y = y, z = xs) :
        (list) object cannot be coerced to 'double'
> class(xs)
[1] "data.frame"
Where x and y were sequences from the min to the max (in steps of 50) of
xs[,1] and xs[,2].

my data is/looks like:

> dim(xs)
[1] 400   4
> xs[1:5,]
x   y Z1 Z2
1 27172.4 19062.4  0128
2 27000.9 19077.8  0  0
3 27016.8 19077.5  0  0
4 27029.5 19077.3  0  0
5 27045.4 19077.0  0  0

Cheers,

Paul

-- 
Research Technician
Mass Spectrometry
   o The
  /
o Scripps
  \
   o Research
  /
o Institute
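
A guess at what is going wrong (not a reply from the archive):
scatterplot3d lives in a contributed package of the same name, so it has to
be installed and loaded first, and the lattice functions expect a formula
rather than a bare data frame. A sketch, assuming xs as printed above:

install.packages("scatterplot3d")   # once
library(scatterplot3d)
with(xs, scatterplot3d(x, y, Z1))   # point cloud of Z1 over (x, y)

library(lattice)
cloud(Z1 ~ x * y, data = xs)        # lattice wants a formula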

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread Frank E Harrell Jr
Lucke, Joseph F wrote:
  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
 non-normality for significance testing. It's the sample means that have
 to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
 for normality prior to choosing a test statistic is generally not a good
 idea. 

I beg to differ Joseph.  I have had many datasets in which the CLT was 
of no use whatsoever, i.e., where bootstrap confidence limits were 
asymmetric because the data were so skewed, and where symmetric 
normality-based confidence intervals had bad coverage in both tails 
(though correct on the average).  I see this the opposite way: 
nonparametric tests work fine if normality holds.

Note that the CLT helps with type I error but not so much with type II 
error.

Frank

 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: Friday, May 25, 2007 12:04 PM
 To: [EMAIL PROTECTED]; Frank E Harrell Jr
 Cc: r-help
 Subject: Re: [R] normality tests [Broadcast]
 
 From: [EMAIL PROTECTED]
 On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
 Hi all,

  apologies for seeking advice on a general stats question. I've run
  normality tests using 8 different methods:
 - Lilliefors
 - Shapiro-Wilk
 - Robust Jarque Bera
 - Jarque Bera
 - Anderson-Darling
 - Pearson chi-square
 - Cramer-von Mises
 - Shapiro-Francia

  All show that the null hypothesis that the data come from a normal
  distro cannot be rejected. Great. However, I don't think it looks nice
  to report the values of 8 different tests on a report. One note is
  that my sample size is really tiny (less than 20 independent cases).
  Without wanting to start a flame war, is there any advice on which
  one(s) would be more appropriate and should be reported (along with
  a Q-Q plot). Thank you.

 Regards,

  Wow - I have so many concerns with that approach that it's hard to know
  where to begin.  But first of all, why care about normality?  Why not
  use distribution-free methods?

  You should examine the power of the tests for n=20.  You'll probably
  find it's not good enough to reach a reliable conclusion.
 And wouldn't it be even worse if I used non-parametric tests?
 
 I believe what Frank meant was that it's probably better to use a
 distribution-free procedure to do the real test of interest (if there is
 one) instead of testing for normality, and then use a test that assumes
 normality.
 
 I guess the question is, what exactly do you want to do with the outcome
 of the normality tests?  If those are going to be used as basis for
 deciding which test(s) to do next, then I concur with Frank's
 reservation.
 
 Generally speaking, I do not find goodness-of-fit for distributions very
 useful, mostly for the reason that failure to reject the null is no
 evidence in favor of the null.  It's difficult for me to imagine why
 "there is insufficient evidence to show that the data did not come from a
 normal distribution" would be interesting.
 
 Andy
 
  
 Frank


 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
                      Department of Biostatistics   Vanderbilt University

 --
 yianni

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 
 
 
 --
 Notice:  This e-mail message, together with any
 attachments,...{{dropped}}
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
 Thank you all for your replies they have been more useful... well
 in my case I have chosen to do some parametric tests (more precisely
 correlation and linear regressions among some variables)... so it
 would be nice if I had an extra bit of support on my decisions... If I
 understood well from all your replies... I shouldn't pay so much
 attention to the normality tests, so it wouldn't matter which one/ones
 I use to report... but rather focus on issues such as the power of the
 test...

If doing regression I assume your normality tests were on residuals 
rather than raw data.
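
[A minimal editorial sketch of checking residuals rather than raw data;
the model and data names here are hypothetical:]

fit <- lm(y ~ x1 + x2, data = mydata)   # hypothetical regression
r <- residuals(fit)
qqnorm(r); qqline(r)                    # graphical check of the residuals
shapiro.test(r)                         # a formal test, if one is wanted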

Frank

 
 Thanks again.
 
 On 25/05/07, Lucke, Joseph F [EMAIL PROTECTED] wrote:
  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
non-normality for significance testing. It's the sample means that have
 to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
 for normality prior to choosing a test statistic is generally not a good
 idea.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: Friday, May 25, 2007 12:04 PM
 To: [EMAIL PROTECTED]; Frank E Harrell Jr
 Cc: r-help
 Subject: Re: [R] normality tests [Broadcast]

 From: [EMAIL PROTECTED]
 
  On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
   [EMAIL PROTECTED] wrote:
Hi all,
   
 apologies for seeking advice on a general stats question. I've run

normality tests using 8 different methods:
- Lilliefors
- Shapiro-Wilk
- Robust Jarque Bera
- Jarque Bera
- Anderson-Darling
- Pearson chi-square
- Cramer-von Mises
- Shapiro-Francia
   
All show that the null hypothesis that the data come from a normal

distro cannot be rejected. Great. However, I don't think
  it looks nice
to report the values of 8 different tests on a report. One note is

that my sample size is really tiny (less than 20
  independent cases).
 Without wanting to start a flame war, is there any
   advice on which
one/ones would be more appropriate and should be reported
  (along with
a Q-Q plot). Thank you.
   
Regards,
   
  
   Wow - I have so many concerns with that approach that it's
  hard to know
   where to begin.  But first of all, why care about
  normality?  Why not
   use distribution-free methods?
  
   You should examine the power of the tests for n=20.  You'll probably

   find it's not good enough to reach a reliable conclusion.
 
  And wouldn't it be even worse if I used non-parametric tests?

 I believe what Frank meant was that it's probably better to use a
 distribution-free procedure to do the real test of interest (if there is
 one) instead of testing for normality, and then use a test that assumes
 normality.

 I guess the question is, what exactly do you want to do with the outcome
 of the normality tests?  If those are going to be used as basis for
 deciding which test(s) to do next, then I concur with Frank's
 reservation.

 Generally speaking, I do not find goodness-of-fit for distributions very
 useful, mostly for the reason that failure to reject the null is no
 evidence in favor of the null.  It's difficult for me to imagine why
 "there's insufficient evidence to show that the data did not come from a
 normal distribution" would be interesting.

 Andy


  
   Frank
  
  
   --
   Frank E Harrell Jr   Professor and Chair   School of Medicine
                        Department of Biostatistics   Vanderbilt University
  
 
 
  --
  yianni
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 


 
 --
 Notice:  This e-mail message, together with any
 attachments,...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interactive plots?

2007-05-25 Thread mister_bluesman

Hi there. 

I have a matrix that provides place names and the distances between them:

        Chelt  Exeter  London  Birm
Chelt       0     118      96    50
Exeter    118       0     118   163
London     96     118       0   118
Birm       50     163     118     0

After performing multidimensional scaling I get the following points plotted
as follows

http://www.nabble.com/file/p10810700/demo.jpeg 

I would like to know how if I hover a point I can get a little box telling
me which place the point refers to. Does anyone know?

Many thanks.
-- 
View this message in context: 
http://www.nabble.com/Interactive-plots--tf3818454.html#a10810700
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactive plots?

2007-05-25 Thread Tony Plate
The package RSVGTipsDevice allows you to do just that -- you create a
plot as an SVG file that can be viewed in a browser such as Firefox, and
the points (or shapes) in that plot can have pop-up tooltips.
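
[A minimal editorial sketch of that approach, assuming RSVGTipsDevice's
devSVGTips()/setSVGShapeToolTip() interface; the file name is
illustrative, and cmdscale() supplies the MDS coordinates:]

library(RSVGTipsDevice)
places <- c("Chelt", "Exeter", "London", "Birm")
d <- matrix(c(  0, 118,  96,  50,
              118,   0, 118, 163,
               96, 118,   0, 118,
               50, 163, 118,   0),
            4, 4, dimnames = list(places, places))
xy <- cmdscale(as.dist(d))              # classical MDS, 2 dimensions
devSVGTips("mds.svg", toolTipMode = 1)
plot(xy, type = "n", xlab = "Dim 1", ylab = "Dim 2")
for (p in places) {
  setSVGShapeToolTip(title = p)         # tooltip shows the place name
  points(xy[p, 1], xy[p, 2], pch = 19)
}
dev.off()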

-- Tony Plate

mister_bluesman wrote:
 Hi there. 
 
 I have a matrix that provides place names and the distances between them:
 
         Chelt  Exeter  London  Birm
 Chelt       0     118      96    50
 Exeter    118       0     118   163
 London     96     118       0   118
 Birm       50     163     118     0
 
 After performing multidimensional scaling I get the following points plotted
 as follows
 
 http://www.nabble.com/file/p10810700/demo.jpeg 
 
 I would like to know how if I hover a point I can get a little box telling
 me which place the point refers to. Does anyone know?
 
 Many thanks.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread wssecn
 The normality of the residuals is important in the inference procedures for 
the classical linear regression model, and normality is very important in 
correlation analysis (second moment)...
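
[Editorial sketch of the rank-based fallback for correlation when
normality is doubtful; the data are simulated for illustration:]

set.seed(1)
x <- rexp(30)
y <- x + rexp(30)                       # skewed but monotonically related
cor.test(x, y, method = "pearson")      # moment-based
cor.test(x, y, method = "spearman")     # distribution-free, uses ranks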

Washington S. Silva

 Thank you all for your replies they have been more useful... well
 in my case I have chosen to do some parametric tests (more precisely
 correlation and linear regressions among some variables)... so it
 would be nice if I had an extra bit of support on my decisions... If I
 understood well from all your replies... I shouldn't pay so much
 attention to the normality tests, so it wouldn't matter which one/ones
 I use to report... but rather focus on issues such as the power of the
 test...
 
 Thanks again.
 
 On 25/05/07, Lucke, Joseph F [EMAIL PROTECTED] wrote:
   Most standard tests, such as t-tests and ANOVA, are fairly resistant to
  non-normality for significance testing. It's the sample means that have
  to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
  for normality prior to choosing a test statistic is generally not a good
  idea.
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
  Sent: Friday, May 25, 2007 12:04 PM
  To: [EMAIL PROTECTED]; Frank E Harrell Jr
  Cc: r-help
  Subject: Re: [R] normality tests [Broadcast]
 
  From: [EMAIL PROTECTED]
  
   On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote:
 Hi all,

  apologies for seeking advice on a general stats question. I've run
 
 normality tests using 8 different methods:
 - Lilliefors
 - Shapiro-Wilk
 - Robust Jarque Bera
 - Jarque Bera
 - Anderson-Darling
 - Pearson chi-square
 - Cramer-von Mises
 - Shapiro-Francia

 All show that the null hypothesis that the data come from a normal
 
 distro cannot be rejected. Great. However, I don't think
   it looks nice
 to report the values of 8 different tests on a report. One note is
 
 that my sample size is really tiny (less than 20
   independent cases).
  Without wanting to start a flame war, is there any
    advice on which
 one/ones would be more appropriate and should be reported
   (along with
 a Q-Q plot). Thank you.

 Regards,

   
Wow - I have so many concerns with that approach that it's
   hard to know
where to begin.  But first of all, why care about
   normality?  Why not
use distribution-free methods?
   
You should examine the power of the tests for n=20.  You'll probably
 
find it's not good enough to reach a reliable conclusion.
  
   And wouldn't it be even worse if I used non-parametric tests?
 
  I believe what Frank meant was that it's probably better to use a
  distribution-free procedure to do the real test of interest (if there is
  one) instead of testing for normality, and then use a test that assumes
  normality.
 
  I guess the question is, what exactly do you want to do with the outcome
  of the normality tests?  If those are going to be used as basis for
  deciding which test(s) to do next, then I concur with Frank's
  reservation.
 
  Generally speaking, I do not find goodness-of-fit for distributions very
  useful, mostly for the reason that failure to reject the null is no
  evidence in favor of the null.  It's difficult for me to imagine why
  "there's insufficient evidence to show that the data did not come from a
  normal distribution" would be interesting.
 
  Andy
 
 
   
Frank
   
   
--
 Frank E Harrell Jr   Professor and Chair   School of Medicine
                      Department of Biostatistics   Vanderbilt University
   
  
  
   --
   yianni
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
  
  
  
 
 
  
  --
  Notice:  This e-mail message, together with any
  attachments,...{{dropped}}
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 -- 
 yianni
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] normality tests [Broadcast]

2007-05-25 Thread Cody_Hamilton

You can also try validating your regression model via the bootstrap (the
validate() function in the Design library is very helpful).  To my mind
that would be much more reassuring than normality tests performed on twenty
residuals.
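
[A minimal sketch of that workflow, assuming the Design package as
distributed at the time; the model and data names are illustrative:]

library(Design)
fit <- ols(y ~ x1 + x2, data = mydata, x = TRUE, y = TRUE)
validate(fit, B = 200)                  # bootstrap-validated fit indexes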

By the way, be careful with the correlation test - it's only good at
detecting linear relationships between two variables (i.e. not helpful for
detecting non-linear relationships).

Regards,
   -Cody

Cody Hamilton, PhD
Edwards Lifesciences


   
From: [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: Lucke, Joseph F [EMAIL PROTECTED]
Cc: r-help r-help@stat.math.ethz.ch
Date: 05/25/2007 11:23 AM
Subject: Re: [R] normality tests [Broadcast]




Thank you all for your replies they have been more useful... well
in my case I have chosen to do some parametric tests (more precisely
correlation and linear regressions among some variables)... so it
would be nice if I had an extra bit of support on my decisions... If I
understood well from all your replies... I shouldn't pay so much
attention to the normality tests, so it wouldn't matter which one/ones
I use to report... but rather focus on issues such as the power of the
test...

Thanks again.

On 25/05/07, Lucke, Joseph F [EMAIL PROTECTED] wrote:
  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
 non-normality for significance testing. It's the sample means that have
 to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
 for normality prior to choosing a test statistic is generally not a good
 idea.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: Friday, May 25, 2007 12:04 PM
 To: [EMAIL PROTECTED]; Frank E Harrell Jr
 Cc: r-help
 Subject: Re: [R] normality tests [Broadcast]

 From: [EMAIL PROTECTED]
 
  On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
   [EMAIL PROTECTED] wrote:
Hi all,
   
 apologies for seeking advice on a general stats question. I've run

normality tests using 8 different methods:
- Lilliefors
- Shapiro-Wilk
- Robust Jarque Bera
- Jarque Bera
- Anderson-Darling
- Pearson chi-square
- Cramer-von Mises
- Shapiro-Francia
   
All show that the null hypothesis that the data come from a normal

distro cannot be rejected. Great. However, I don't think
  it looks nice
to report the values of 8 different tests on a report. One note is

that my sample size is really tiny (less than 20
  independent cases).
 Without wanting to start a flame war, is there any
   advice on which
one/ones would be more appropriate and should be reported
  (along with
a Q-Q plot). Thank you.
   
Regards,
   
  
   Wow - I have so many concerns with that approach that it's
  hard to know
   where to begin.  But first of all, why care about
  normality?  Why not
   use distribution-free methods?
  
   You should examine the power of the tests for n=20.  You'll probably

   find it's not good enough to reach a reliable conclusion.
 
  And wouldn't it be even worse if I used non-parametric tests?

 I believe what Frank meant was that it's probably better to use a
 distribution-free procedure to do the real test of interest (if there is
 one) instead of testing for normality, and then use a test that assumes
 normality.

 I guess the question is, what exactly do you want to do with the outcome
 of the normality tests?  If those are going to be used as basis for
 deciding which test(s) to do next, then I concur with Frank's
 reservation.

 Generally speaking, I do not find goodness-of-fit for distributions very
 useful, mostly for the reason that failure to reject the null is no
 evidence in favor of the null.  It's difficult for me to imagine why
 "there's insufficient evidence to show that the data did not come from a
 normal distribution" would be interesting.

 Andy


  
   Frank
  
  
   --
  Frank E Harrell Jr   Professor and Chair   School of Medicine

[R] Estimation of Dispersion parameter in GLM for Gamma Dist.

2007-05-25 Thread fredrik odegaard
Hi All,
could someone shed some light on what the difference between the
estimated dispersion parameter that is supplied with the GLM function
and the one that the 'gamma.dispersion( )' function in the MASS
library gives? And is there consensus for which estimated value to
use?


It seems that the dispersion parameter that comes with the summary
command for a GLM with a Gamma dist. is close to (but not exactly):
Pearson Chi-Sq./d.f.

While the dispersion parameter from the MASS library
('gamma.dispersion()') is close to the approximation given in
McCullagh & Nelder (p.291):
(Res.Dev./n) * (6 + Res.Dev./n) / (6 + 2*Res.Dev./n)

(Since it is only an approximation it seems reasonable that they are
not exactly alike.)
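
[Editorial sketch comparing the two estimates on simulated Gamma data;
the shape, link and sample size are arbitrary choices:]

library(MASS)
set.seed(1)
x <- runif(200)
y <- rgamma(200, shape = 2, rate = 2/exp(1 + x))     # mean = exp(1 + x)
fit <- glm(y ~ x, family = Gamma(link = "log"))
summary(fit)$dispersion                              # moment estimate
sum(residuals(fit, "pearson")^2)/df.residual(fit)    # Pearson Chi-Sq./d.f.
gamma.dispersion(fit)                                # ML estimate from MASS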


Many thanks,
Fredrik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] normality tests [Broadcast]

2007-05-25 Thread Cody_Hamilton

Following up on Frank's thought, why is it that parametric tests are so
much more popular than their non-parametric counterparts?  As
non-parametric tests require fewer assumptions, why aren't they the
default?  The asymptotic relative efficiency of the Wilcoxon test compared
to the t-test is 0.955 under normality, and yet I still see t-tests in the
medical literature all
the time.  Granted, the Wilcoxon still requires the assumption of symmetry
(I'm curious as to why the Wilcoxon is often used when asymmetry is
suspected, since the Wilcoxon assumes symmetry), but that's less stringent
than requiring normally distributed data.  In a similar vein, one usually
sees the mean and standard deviation reported as summary statistics for a
continuous variable - these are not very informative unless you assume the
variable is normally distributed.  However, clinicians often insist that I
include these figures in reports.
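
[A small editorial sketch of the comparison; the data are simulated:]

set.seed(7)
x <- rnorm(20, mean = 0.5)
t.test(x)                    # assumes normality
wilcox.test(x)               # signed-rank test; assumes only symmetry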

Cody Hamilton, PhD
Edwards Lifesciences



   
From: Frank E Harrell Jr [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: Lucke, Joseph F [EMAIL PROTECTED]
Cc: r-help r-help@stat.math.ethz.ch
Date: 05/25/2007 02:42 PM
Subject: Re: [R] normality tests [Broadcast]
   
   
   
   
   




Lucke, Joseph F wrote:
  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
 non-normality for significance testing. It's the sample means that have
 to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
 for normality prior to choosing a test statistic is generally not a good
 idea.

I beg to differ Joseph.  I have had many datasets in which the CLT was
of no use whatsoever, i.e., where bootstrap confidence limits were
asymmetric because the data were so skewed, and where symmetric
normality-based confidence intervals had bad coverage in both tails
(though correct on the average).  I see this the opposite way:
nonparametric tests work fine if normality holds.

Note that the CLT helps with type I error but not so much with type II
error.

Frank


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: Friday, May 25, 2007 12:04 PM
 To: [EMAIL PROTECTED]; Frank E Harrell Jr
 Cc: r-help
 Subject: Re: [R] normality tests [Broadcast]

 From: [EMAIL PROTECTED]
 On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
 Hi all,

 apologies for seeking advice on a general stats question. I've run

 normality tests using 8 different methods:
 - Lilliefors
 - Shapiro-Wilk
 - Robust Jarque Bera
 - Jarque Bera
 - Anderson-Darling
 - Pearson chi-square
 - Cramer-von Mises
 - Shapiro-Francia

 All show that the null hypothesis that the data come from a normal

 distro cannot be rejected. Great. However, I don't think
 it looks nice
 to report the values of 8 different tests on a report. One note is

 that my sample size is really tiny (less than 20
 independent cases).
 Without wanting to start a flame war, is there any
 advice on which
 one/ones would be more appropriate and should be reported
 (along with
 a Q-Q plot). Thank you.

 Regards,

 Wow - I have so many concerns with that approach that it's
 hard to know
 where to begin.  But first of all, why care about
 normality?  Why not
 use distribution-free methods?

 You should examine the power of the tests for n=20.  You'll probably

 find it's not good enough to reach a reliable conclusion.
 And wouldn't it be even worse if I used non-parametric tests?

 I believe what Frank meant was that it's probably better to use a
 distribution-free procedure to do the real test of interest (if there is
 one) instead of testing for normality, and then use a test that assumes
 normality.

 I guess the question is, what exactly do you want to do with the outcome
 of the normality tests?  If those are going to be used as basis for
 deciding which test(s) to do next, then I concur with Frank's
 reservation.

 Generally speaking, I do not find goodness-of-fit for distributions very
 useful, mostly for the reason that failure to reject the null is no
 evidence in favor of the null.

Re: [R] normality tests [Broadcast]

2007-05-25 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
 Following up on Frank's thought, why is it that parametric tests are so
 much more popular than their non-parametric counterparts?  As
 non-parametric tests require fewer assumptions, why aren't they the
 default?  The asymptotic relative efficiency of the Wilcoxon test compared
 to the t-test is 0.955 under normality, and yet I still see t-tests in the
 medical literature all
 the time.  Granted, the Wilcoxon still requires the assumption of symmetry
 (I'm curious as to why the Wilcoxon is often used when asymmetry is
 suspected, since the Wilcoxon assumes symmetry), but that's less stringent
 than requiring normally distributed data.  In a similar vein, one usually
 sees the mean and standard deviation reported as summary statistics for a
 continuous variable - these are not very informative unless you assume the
 variable is normally distributed.  However, clinicians often insist that I
 include these figures in reports.
 
 Cody Hamilton, PhD
 Edwards Lifesciences

Well said Cody, just want to add that Wilcoxon does not assume symmetry 
if you are interested in testing for stochastic ordering and not just 
for a mean.

Frank

 
 
 

  From: Frank E Harrell Jr [EMAIL PROTECTED]
  Sent by: [EMAIL PROTECTED]
  To: Lucke, Joseph F [EMAIL PROTECTED]
  Cc: r-help r-help@stat.math.ethz.ch
  Date: 05/25/2007 02:42 PM
  Subject: Re: [R] normality tests [Broadcast]





 
 
 
 
 Lucke, Joseph F wrote:
  Most standard tests, such as t-tests and ANOVA, are fairly resistant to
 non-normality for significance testing. It's the sample means that have
 to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
 for normality prior to choosing a test statistic is generally not a good
 idea.
 
 I beg to differ Joseph.  I have had many datasets in which the CLT was
 of no use whatsoever, i.e., where bootstrap confidence limits were
 asymmetric because the data were so skewed, and where symmetric
 normality-based confidence intervals had bad coverage in both tails
 (though correct on the average).  I see this the opposite way:
 nonparametric tests work fine if normality holds.
 
 Note that the CLT helps with type I error but not so much with type II
 error.
 
 Frank
 
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
 Sent: Friday, May 25, 2007 12:04 PM
 To: [EMAIL PROTECTED]; Frank E Harrell Jr
 Cc: r-help
 Subject: Re: [R] normality tests [Broadcast]

 From: [EMAIL PROTECTED]
 On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
 Hi all,

  apologies for seeking advice on a general stats question. I've run
 normality tests using 8 different methods:
 - Lilliefors
 - Shapiro-Wilk
 - Robust Jarque Bera
 - Jarque Bera
 - Anderson-Darling
 - Pearson chi-square
 - Cramer-von Mises
 - Shapiro-Francia

 All show that the null hypothesis that the data come from a normal
 distro cannot be rejected. Great. However, I don't think
 it looks nice
 to report the values of 8 different tests on a report. One note is
 that my sample size is really tiny (less than 20
 independent cases).
 Without wanting to start a flame war, is there any
 advice on which
 one/ones would be more appropriate and should be reported
 (along with
 a Q-Q plot). Thank you.

 Regards,

 Wow - I have so many concerns with that approach that it's
 hard to know
 where to begin.  But first of all, why care about
 normality?  Why not
 use distribution-free methods?

 You should examine the power of the tests for n=20.  You'll probably
 find it's not good enough to reach a reliable conclusion.
 And wouldn't it be even worse if I used non-parametric tests?
 I believe what Frank meant was that it's probably better to use a
 distribution-free procedure to do the real test of interest (if there is
 one) instead of testing for normality, and then use a test that assumes
 normality.

 I guess the question is, what exactly do you want to do with the outcome
 of the normality tests?

Re: [R] 3D plots with data.frame

2007-05-25 Thread J . delasHeras


You could try the function 'plot3d' in package 'rgl':

library(rgl)
?plot3d
x <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
plot3d(x$a, x$b, x$c)
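
[Editorial note: the 'could not find function scatterplot3d' error in the
quoted post just means that package was never installed/loaded; if you
prefer it over rgl:]

install.packages("scatterplot3d")       # once
library(scatterplot3d)
scatterplot3d(xs$x, xs$y, xs$Z1)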

Jose


Quoting H. Paul Benton [EMAIL PROTECTED]:

 Dear all,

 Thank you for any help. I have a data.frame and would like to plot
 it in 3D. I have tried wireframe() and cloud(), I got

 scatterplot3d(xs)
 Error: could not find function scatterplot3d

 wireframe(xs)
 Error in wireframe(xs) : no applicable method for wireframe

 persp(x=x, y=y, z=xs)
 Error in persp.default(x = x, y = y, z = xs) :
 (list) object cannot be coerced to 'double'
 class(xs)
 [1] data.frame
 Where x and y were a sequence of my min - max by 50 of xs[,1] and xs[,2].

 my data is/looks like:

 dim(xs)
 [1] 400   4
 xs[1:5,]
          x       y Z1  Z2
  1 27172.4 19062.4  0 128
  2 27000.9 19077.8  0   0
  3 27016.8 19077.5  0   0
  4 27029.5 19077.3  0   0
  5 27045.4 19077.0  0   0

 Cheers,

 Paul

 --
 Research Technician
 Mass Spectrometry
o The
   /
 o Scripps
   \
o Research
   /
 o Institute

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Dr. Jose I. de las Heras  Email: [EMAIL PROTECTED]
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get the Naive SE of coefficients from the zelig output

2007-05-25 Thread Ferdinand Alimadhi
Dear Abdus,

Can you try:  summary(il6w.out)$table[, "(Naive SE)"]

If you want to know where that comes from, run the following command at
the R prompt and have a look at the code it prints:

 survival:::summary.survreg
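
[An editorial aside: str() on the summary object shows everything that
can be extracted; il6w.out is the poster's fitted model:]

s <- summary(il6w.out)
colnames(s$table)            # includes "(Naive SE)" when robust = TRUE
str(s$table)                 # a plain matrix, so [ , ] indexing works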


You might also consider subscribing at the zelig mailing list (
http://lists.gking.harvard.edu/?info=zelig) for
any Zelig related question

btw, in zelig(), there is no need to use

formula = il6.data$il6 ~ il6.data$apache

You can just use:
formula = il6 ~ apache

Best,
Ferdi

On 5/25/07, Abdus Sattar [EMAIL PROTECTED] wrote:

 Dear R-user:

 After fitting the Tobit model using zelig, if I use the following
 command then I can get the regression coefficients:

 beta=coefficients(il6.out)
  beta
 (Intercept)  apache
  4.7826  0.9655

 How may I extract the Naive SE from the following output please?

  summary(il6w.out)
 Call:
 zelig(formula = il6.data$il6 ~ il6.data$apache, model = "tobit",
 data = il6.data, robust = TRUE, cluster = il6.data$subject,
 weights = il6.data$w)
                  Value  Std. Err  (Naive SE)      z         p
 (Intercept)      4.572   0.12421     0.27946   36.8 1.44e-296
 il6.data$apache  0.983   0.00189     0.00494  519.4  0.00e+00
 Log(scale)       2.731   0.00660     0.00477  414.0  0.00e+00
 Scale= 15.3
 Gaussian distribution
 Loglik(model)= -97576   Loglik(intercept only)= -108964
 Chisq= 22777 on 1 degrees of freedom, p= 0
 (Loglikelihood assumes independent observations)
 Number of Newton-Raphson Iterations: 6
 n=5820 (1180 observations deleted due to missingness)

 I would appreciate if any help you could provide please. Thank you.

 Sattar




 



 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with rpart

2007-05-25 Thread Prof Brian Ripley
You only have 43 cases.  After one split, the groups are too small 
to split again with the default settings.  See  ?rpart.control.
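
[A minimal sketch of relaxing those defaults via rpart.control(); the
values are illustrative, not recommendations:]

library(rpart)
fit <- rpart(disperser ~ ., data = dispvol.df, method = "class",
             control = rpart.control(minsplit = 5, minbucket = 2,
                                     cp = 0.001))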



On Fri, 25 May 2007, Silvia Lomascolo wrote:


 I work on Windows, R version 2.4.1.  I'm very new with R!

 I am trying to build a classification tree using rpart but, although the
 matrix has 108 variables, the program builds a tree with only one split
 using one variable!  I know it is probable that only one variable is
 informative, but I think it's unlikely.  I was wondering if someone can help
 me identify if I'm doing something wrong because I can't see it, nor could I
 find it in the help or in this forum.

 I want to see whether I can predict disperser type (5 categories) of a
 species given the volatile compounds that the fruits emit (108 volatiles).
 I am writing:

 dispvol.x <- read.table('C:\\Documents and
 Settings\\silvia\\...\\volatile_disperser_matrix.txt', header=T)
 dispvol.df <- as.data.frame(dispvol.x)
 attach(dispvol.df)  # I think I need to do this so the variables are
 # identified when I write the regression equation
 dispvol.ctree <- rpart(disperser ~ P3.70 + P4.29 + P5.05 + ... + P30.99 +
 P32.25 + TotArea, data = dispvol.df, method = 'class')

 and I get the following output:

 n= 28

 node), split, n, loss, yval, (yprob)
  * denotes terminal node

 1) root 28 15 non (0.036 0.32 0.071 0.11 0.46)
  2) P10.01>=1.185 10  4 bat (0.1 0.6 0.2 0 0.1) *
  3) P10.01< 1.185 18  6 non (0 0.17 0 0.17 0.67) *

 There is nothing special about P10.01 that I can see in my data and I don't
 know why it chooses that variable and stops there!

 My matrix looks something like this (except, with a lot more variables)

 disperser  P3.70  P4.29   P6.45   P6.55  P10.01  P10.15  P10.18  TotArea
 ban         0.00   0.00    1.34    0.00    1.49    0.00    0.00     2.83
 non         0.00   0.00    0.00  152.80    0.00   14.31    0.00   167.11
 bat         0.00   0.00    0.00  131.56    0.65    0.00    0.00   132.21
 bat         0.00   0.00    5.05    0.00   13.01    6.85    0.00    24.90
 non         0.00   0.00   72.65  103.26    4.10    0.00    0.00   180.02
 non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 bat         1.23   0.00    0.48    0.89    0.25    0.00    0.00     2.85
 bat         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 non         0.00   0.00    0.00    0.00    1.06    0.00    0.00     1.06
 bat         0.00   0.00    0.00    0.00   28.69    0.00   21.33    50.02
 mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 non         0.00   0.00    0.00    0.00    1.15    0.00    0.00     1.15
 non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 non         0.00   0.82    0.00    1.65    0.00    0.00    0.00     2.47
 bat         0.00   0.00  133.24    0.00    3.13    0.00    0.00   136.37
 bir         0.00   0.00   11.08    3.16    1.79    2.09    0.48    18.61
 non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 mix         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00
 bat         0.00   0.00    0.00    0.00    1.31    0.00    0.00     1.31
 non         0.00   0.00    0.00    0.00    0.00    0.00    1.23     1.23
 bat         0.00   0.00    1.81    0.00    2.84    0.00    0.00     4.65
 non         0.00   0.00    1.18    0.00    0.73    0.00    0.00     1.91
 bir         0.00   0.00    0.00    0.00    1.40    0.00    0.00     1.40
 bat         0.00   0.00    8.16    1.50    1.22    0.00    0.00    10.88
 mix         0.00   0.55    0.00    0.00    0.00    0.00    0.00     0.55
 non         0.00   0.00    0.00    0.00    0.00    0.00    0.00     0.00

 Thanks! Silvia.



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Estimation of Dispersion parameter in GLM for Gamma Dist.

2007-05-25 Thread Prof Brian Ripley
This is discussed in the book the MASS package (sic) supports, and/or its 
online material (depending on the edition).

On Fri, 25 May 2007, fredrik odegaard wrote:

 Hi All,
 could someone shed some light on what the difference between the
 estimated dispersion parameter that is supplied with the GLM function
 and the one that the 'gamma.dispersion( )' function in the MASS
 library gives? And is there consensus for which estimated value to
 use?


 It seems that the dispersion parameter that comes with the summary
 command for a GLM with a Gamma dist. is close to (but not exactly):
 Pearson Chi-Sq./d.f.

Sometimes close to, but by no means always.  Again, discussed in MASS.

 While the dispersion parameter from the MASS library
 ('gamma.dispersion()') is close to the approximation given in
 McCullagh & Nelder (p.291):
 (Res.Dev./n) * (6 + Res.Dev./n) / (6 + 2*Res.Dev./n)

 (Since it is only an approximation it seems reasonable that they are
 not exactly alike.)


 Many thanks,
 Fredrik

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.