Re: [R] Suggestion for ?split

2006-06-22 Thread Prof Brian Ripley
On Thu, 22 Jun 2006, Simon Blomberg wrote:

 Hi all,

 I noticed an undocumented feature for split. It sorts the resulting list
 according to the grouping factor. An example:

 test <- data.frame(x=rnorm(48), f=letters[sample(1:8)])
 split(test, test$f)

 I wasn't expecting this behaviour, although I was pleasantly surprised.
 I suggest that the help page for split be amended to include this
 feature. I know it's a small thing, but someone else may also find it
 useful to know.

It is not really true.  The help page says

  The value returned from 'split' is a list of vectors containing
  the values for the groups.  The components of the list are named
  by the _used_ factor levels given by 'f'.

They are in the same order as the _used_ factor levels (as the statement 
implies), but those are in no sense sorted.  Indeed, the factor may be 
created by as.factor or interaction, and working out the order of the 
factor levels can be tricky, which is why they are named.
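
A small illustration of that point (the components follow the order of
levels(f), not a sort of the data):

f <- factor(c("b", "a", "b", "c"), levels = c("c", "b", "a"))
split(seq_along(f), f)
# gives list(c = 4, b = c(1, 3), a = 2), i.e. in the order of the used levels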

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] NLME: using the layout option with the plot command

2006-06-22 Thread Mark Difford
Hi Greg,

Since you haven't yet had a response, perhaps you can distill something from 
this.  It uses the Pixel dataset from the nlme package as an example.

## To get separate files, do this
postscript("c:\\MyGraph%03d.ps", onefile=F)
plot(Pixel, display = Dog, inner = ~Side, layout=c(4,1))
dev.off()

## To get your layout into one file, as separate pages, do this
postscript("c:\\MyGraph.ps", onefile=T)
plot(Pixel, display = Dog, inner = ~Side, layout=c(4,1))
dev.off()

If you prefer pdf, then use pdf(filename, onefile=F), etc.  At the R prompt do 
: ?postscript (and ?pdf), and go on reading!  Also have a look at setps() in 
package Hmisc.

Regards,
Mark.

 
Mark DiffordPh.D. candidate, Botany Department,
Nelson Mandela Metropolitan University,
Port Elizabeth, SA.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Basic NA handling problem

2006-06-22 Thread Sumanta Basak
Hi All,

I need your help in NA handling.

I've following data series.

x <- c(1,4,5,8,NA,4,NA,5,5,1,2,7,8,9,NA,NA,NA,15,6,8)

Now I want to interpolate where an NA value persists. For example, between 9 and 15 
there are three NA's. So, those should be interpolated like:

1st NA = 9 + (15-9)/4
2nd NA = 1st NA value + (15-9)/4

Can I get help on this using a 'for' loop? Actually I have a huge daily time 
series with lots of NA values where I need to fill in these values.

Please help.

Thanks,
Sumanta Basak.


-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Basic NA handling problem

2006-06-22 Thread JeeBee

Try this:
help.search("interpolate")

Then find this:

> library(zoo)
> na.approx(x)
 [1]  1.0  4.0  5.0  8.0  6.0  4.0  4.5  5.0  5.0  1.0  2.0  7.0  8.0  9.0 10.5
[16] 12.0 13.5 15.0  6.0  8.0
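
For reference, the same linear fill can also be done in base R with approx()
(a minimal sketch, assuming the first and last values are not NA):

x <- c(1,4,5,8,NA,4,NA,5,5,1,2,7,8,9,NA,NA,NA,15,6,8)
ok <- !is.na(x)
x[!ok] <- approx(which(ok), x[ok], xout = which(!ok))$y
x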

Hth,
JeeBee.

On Thu, 22 Jun 2006 06:54:53 +0100, Sumanta Basak wrote:

 Hi All,
 
 I need your help in NA handling.
 
 I've following data series.
 
 x <- c(1,4,5,8,NA,4,NA,5,5,1,2,7,8,9,NA,NA,NA,15,6,8)
 
 Now i want to interpolate where NA value persists. Like, between 9 and 5 
 there are three NA's. So, that should be interpolated like,
 
 1st NA- (15-9)/4
 2nd NA- 1st NA value + (15-9)/4
 
 Can i get help on this using a 'for' loop. Actually i have huge daily time 
 series with lots of NA values where i need to make these values.
 
 Please help.
 
 Thanks,
 Sumanta Basak.
 
   
 -
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to plot a image with restrictions?

2006-06-22 Thread Roger Bivand
On Wed, 21 Jun 2006, Cleber N.Borges wrote:

 
 
 Hello All!
 
 
 How to plot a image ( image function )
 with restrictions, like a polygon??

The suggestions given in:

http://finzi.psych.upenn.edu/R/Rhelp02a.new/archive/15030.html

by Denis White let you overplot the image with a masking polygon - that
may be the most suitable for you. Alternatively, you can set the image
values to be omitted to NA, but how you would do this depends rather on
whether the polygon is irregular or not. If the polygon is regular,
unrolling the image into a vector and setting conditions on expand.grid()
of the row and column coordinates will work with is.na(). If irregular,
you'll need to do point-in-polygon checking of the image grid coordinates.
If your data are geographical (not assumed here), you may like to follow
this up on the R-sig-geo list.
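
A rough sketch of the NA-masking route for an irregular polygon, using
point.in.polygon() from the sp package (the grid and image values below are
made up; the triangle is the one from the example further down):

library(sp)
x <- y <- seq(0, 1, len = 27)
z <- outer(x, y)                                  # some image values
px <- c(0.2, 0.8, 0.5); py <- c(0.2, 0.2, 0.8)    # triangle vertices
grd <- expand.grid(x = x, y = y)                  # grid coords, x varying fastest
inside <- point.in.polygon(grd$x, grd$y, px, py) > 0
z[!inside] <- NA                                  # blank out everything outside
image(x, y, z)
polygon(px, py, lwd = 5)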

Roger

 
 Thanks in advance!
 
 Cleber N. Borges
 
 
 # problem example
 
 x <- y <- seq(-4*pi, 4*pi, len=27)
 r <- sqrt(outer(x^2, y^2, "+"))
 image( r )
 contour(r, add = TRUE, drawlabels = FALSE)
 
 m=scan()
 0.2  0.2
 0.8  0.2
 0.5  0.8
 
 m = matrix(m,nr=3,byrow=T)
 polygon( m, lwd=5 )
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Rosner's test

2006-06-22 Thread Martin Maechler
 BertG == Berton Gunter [EMAIL PROTECTED]
 on Tue, 13 Jun 2006 14:34:48 -0700 writes:

BertG RSiteSearch('Rosner') ?RSiteSearch or search directly
BertG from CRAN.

BertG Incidentally, I'll repeat what I've said
BertG before. Don't do outlier tests.  They're
BertG dangerous. Use robust methods instead.

Yes, yes, yes!!!

Note that  rlm() or cov.rob()  from recommended package MASS
will most probably be sufficient for your needs.

For slightly newer methodology, look at package 'robustbase', or
also 'rrcov'.
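
For example, a minimal sketch with the built-in stackloss data (outliers are
downweighted by the robust fit rather than tested and deleted):

library(MASS)
fit <- rlm(stack.loss ~ ., data = stackloss)
summary(fit)
fit$w    # final weights; unusually small weights flag suspect observations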

Martin Maechler, ETH Zurich

BertG -- Bert Gunter Genentech Non-Clinical Statistics
BertG South San Francisco, CA
 
BertG The business of the statistician is to catalyze the
BertG scientific learning process.  - George E. P. Box

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] multivariate splits

2006-06-22 Thread Bálint Czúcz
Thanks to everyone for the replies to my question.

There are several good algorithms to make decision trees with
multivariate splits (sometimes called oblique trees), some are even
freeware (QUEST, CRUISE...), but most are not open-sourced.

Unfortunately there seems to be no good choice for oblique trees in R
now. mvpart seems to be really fancy, although it cannot handle splits
on linear combinations of the covariates. RWeka might be useful,
but it will still take me some time to get used to its language.
I'll see where I get.

Thanks again and bye:
Bálint


-- 
Czúcz Bálint
PhD hallgató
BCE KTK Talajtan és Vízgazdálkodás Tanszék
1118 Budapest, Villányi út 29-43.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sort matrix by sum of columns

2006-06-22 Thread john seers \(IFR\)

Albert

Is this what you want?:

 a[,order(colSums(a))]
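
Putting it together with the barplot (a small sketch; add decreasing = TRUE to
order() if the largest columns should come first):

a <- matrix(as.integer(rnorm(50, 4, 2)), 10, 5)
colnames(a) <- c("alfa", "bravo", "charlie", "delta", "echo")
a.sorted <- a[, order(colSums(a))]
barplot(a.sorted, horiz = TRUE, legend.text = TRUE)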


John S

 
---


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Albert Vilella
Sent: 21 June 2006 11:38
To: r help
Subject: [R] sort matrix by sum of columns


Hi all,

I would like to know how can I sort the cols of a matrix by the sum of
their elements.


a <- matrix(as.integer(rnorm(25,4,2)),10,5)
colnames(a) = c("alfa","bravo","charlie","delta","echo")

I guess I should use colSums, and then rearrange the matrix somehow
according to the result.

My idea is to display a sorted barplot:

barplot(a, horiz=TRUE, legend.text=T)

Thanks in advance,

Albert.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Two sets of axis in a lattice plot?

2006-06-22 Thread Fredrik Karlsson
Dear list,

Is it possible to have lattice provide two different ordinate axis,
one on the right side and one of the left side of the plot?

I would like to make a xyplot with x both in a linear scale and in a
log-approximate scale.

Thank you in advance.

/Fredrik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] problem with integrate() - correction

2006-06-22 Thread Rainer M Krug
Sorry

should read:

ip <- function(x) 1240 * x^(-0.88)


Rainer M Krug wrote:
 Hi
 
 System:
 Linux, SuSE 10
 R 2.3.0
 
 I try to do the following
 
 ip <- function(x) 1240*dip(x, 0.88)
 iip <- function(x) integrate(ip, x - 0.045, x + 0.045)$value
 f <- integrate(iip, 10, 100)
 Error in integrate(iip, 10, 100) : evaluation of function gave a result
 of wrong length
 
 How can I solve this?
 
 Thanks
 
 Rainer
 
 
 


-- 
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa

Tel:+27 - (0)72 808 2975 (w)
Fax:+27 - (0)21 808 3304
Cell:   +27 - (0)83 9479 042

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] problem with integrate() - correction

2006-06-22 Thread Frede Aakmann Tøgersen
Well why not do it symbolically, e.g.

int(a*x^b,x) = \frac{a}{b+1}x^{b+1}

and integrating this again gives

\frac{a}{(b+1)(b+2)}x^{b+2},

of course for reasonable values of a and b such that things are well defined.

But if you follow the link

https://stat.ethz.ch/pipermail/r-help/2005-January/064026.html

you may come to this solution:


ip <- function(x) 1240 * x^(-0.88)

iip <- function(x){
  res <- rep(NA,length(x))
  for (i in seq(along=x))
    res[i] <- integrate(ip, x[i] - 0.045, x[i] + 0.045)$value
  return(res)
}

f <- integrate(iip, 10, 100)

which may not be the most efficient way to do it.
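
An equivalent, more compact way to write the wrapper (a sketch using base R's
Vectorize(), which builds essentially the same loop for you):

ip  <- function(x) 1240 * x^(-0.88)
iip <- Vectorize(function(x) integrate(ip, x - 0.045, x + 0.045)$value)
f   <- integrate(iip, 10, 100)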


Best regards,
Frede Aakmann Tøgersen
 

 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Rainer M Krug
 Sent: 22 June 2006 14:03
 To: [EMAIL PROTECTED]
 Cc: R help list
 Subject: [R] problem with integrate() - correction
 
 Sorry
 
 should read:
 
 ip <- function(x) 1240 * x^(-0.88)
 
 
 Rainer M Krug wrote:
  Hi
  
  System:
  Linux, SuSE 10
  R 2.3.0
  
  I try to do the following
  
  ip <- function(x) 1240*dip(x, 0.88)
  iip <- function(x) integrate(ip, x - 0.045, x + 0.045)$value
  f <- integrate(iip, 10, 100)
  Error in integrate(iip, 10, 100) : evaluation of function gave a 
  result of wrong length
  
  How can I solve this?
  
  Thanks
  
  Rainer
  
  
  
 
 
 --
 Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)
 
 Department of Conservation Ecology and Entomology University 
 of Stellenbosch Matieland 7602 South Africa
 
 Tel:  +27 - (0)72 808 2975 (w)
 Fax:  +27 - (0)21 808 3304
 Cell: +27 - (0)83 9479 042
 
 email:[EMAIL PROTECTED]
   [EMAIL PROTECTED]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] problem with integrate() - correction

2006-06-22 Thread Rainer M Krug
Thanks a lot - the symbolic solution will work and I understand what is
wrong with my function

Rainer


Frede Aakmann Tøgersen wrote:
 Well why not do it symbolically, e.g.
 
 int(a*x^b,x) = \frac{a}{b+1}x^{b+1}
 
 and integrating this again gives
 
 \frac{a}{(b+1)(b+2)}x^{b+2},
 
 of course for resonable values of a and b such that things are well defined.
 
 But if you follow the link
 
 https://stat.ethz.ch/pipermail/r-help/2005-January/064026.html
 
 you may come to this solution:
 
 
 ip <- function(x) 1240 * x^(-0.88)
 
 iip <- function(x){
   res <- rep(NA,length(x))
   for (i in seq(along=x))
     res[i] <- integrate(ip, x[i] - 0.045, x[i] + 0.045)$value
   return(res)
 }
 
 f <- integrate(iip, 10, 100)
 
 which may not be an effective way to do it.
 
 
 Best regards,
 Frede Aakmann Tøgersen
  
 
  
 
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Rainer M Krug
 Sent: 22 June 2006 14:03
 To: [EMAIL PROTECTED]
 Cc: R help list
 Subject: [R] problem with integrate() - correction

 Sorry

 should read:

 ip <- function(x) 1240 * x^(-0.88)


 Rainer M Krug wrote:
 Hi

 System:
 Linux, SuSE 10
 R 2.3.0

 I try to do the following

 ip <- function(x) 1240*dip(x, 0.88)
 iip <- function(x) integrate(ip, x - 0.045, x + 0.045)$value
 f <- integrate(iip, 10, 100)
 Error in integrate(iip, 10, 100) : evaluation of function gave a 
 result of wrong length

 How can I solve this?

 Thanks

 Rainer




 --
 Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)

 Department of Conservation Ecology and Entomology University 
 of Stellenbosch Matieland 7602 South Africa

 Tel: +27 - (0)72 808 2975 (w)
 Fax: +27 - (0)21 808 3304
 Cell:+27 - (0)83 9479 042

 email:   [EMAIL PROTECTED]
  [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html



-- 
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa

Tel:+27 - (0)72 808 2975 (w)
Fax:+27 - (0)21 808 3304
Cell:   +27 - (0)83 9479 042

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Two sets of axis in a lattice plot?

2006-06-22 Thread Deepayan Sarkar
On 6/22/06, Fredrik Karlsson [EMAIL PROTECTED] wrote:
 Dear list,

 Is it possible to have lattice provide two different ordinate axis,
 one on the right side and one of the left side of the plot?

Not easily. I hope to add some support for this sort of thing in the
future, but I'm not sure when. Currently, you can add a second axis as
part of the panel function (using e.g. panel.axis), but you will have
to adjust the space for it manually. You will also need to turn off
clipping.
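
A rough sketch of that route (the data, the right-padding width and the
log-style labelling are placeholders; clipping is switched off through
par.settings):

library(lattice)
dat <- data.frame(x = 1:20, y = exp(seq(0, 3, length = 20)))
xyplot(y ~ x, data = dat,
       par.settings = list(clip = list(panel = "off"),
                           layout.widths = list(right.padding = 8)),
       panel = function(x, y, ...) {
           panel.xyplot(x, y, ...)
           at <- pretty(log(y))              # second ordinate, log scale
           panel.axis("right", at = exp(at), labels = format(at),
                      outside = TRUE)
       })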

-Deepayan

 I would like to make a xyplot with x both in a linear scale and in a
 log-approximate scale.

 Thank you in advance.

 /Fredrik

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



-- 
http://www.stat.wisc.edu/~deepayan/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] As.Factor with Logistic Regression

2006-06-22 Thread Justin Rapp
I am modeling the probability of a player succeeding in the NFL with a
binomial logistic regression, with 1 signifying success and 0
signifying no success.   I regressed the binomial
variable against overall draft position, using the college conference
in which each player played as a factor via the
as.factor(Conference) command.

My question is:

How do I plot specific factors against the curve for the entire set.
There are only a few factors that have significant coefficients and I
would like to plot those against the logistic curve for the entire
set.

Any help would be greatly appreciated.

-- 
Justin Rapp
409 S. 22nd St.
Apt. 1
Philadelphia, PA 19146
Cell:(267)252.0297
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] glm beta hypothesis testing

2006-06-22 Thread Davis, Jacob B.
Thanks for the insight on the debugger.  It has taken me a day or so to
get use to it, but it is very helpful.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ben Bolker
Sent: Tuesday, June 20, 2006 12:56 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] glm beta hypothesis testing

Davis, Jacob B.  JBDavis at txfb-ins.com writes:

 
 In summary.glm I'm trying to get a better feel for the z output.  The
 following lines can be found in the function
 
  [snip]

  digging through the function is good: debugging your way through
the function is sometimes even better.

examples(glm,local=TRUE)  ## run glm examples and get
   ## results left in local workspace)

ls()
debug(summary.glm)
summary(glm.D93)

  shows ...
  p is object rank (~ number of parameters)
  Qr$pivot gives the order in which the parameters
have been rearranged to solve the model,
so Qr$pivot[p1] gives the rearranged order
of the coefficients.  We need this rearranged
order because we're going to extract the
unscaled covariance matrix by solving the
inverse QR matrix, which is in the pivoted
(rearranged) order.

The null hypothesis for any particular contrast
in glm is that the parameter is 0, so the
estimates of the coefficients (object$coefficients)
*are* the distance from the null hypothesis.

   Ben Bolker

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] As.Factor with Logistic Regression

2006-06-22 Thread Frank E Harrell Jr
Justin Rapp wrote:
 I am modeling the probability of player succeeding in the NFL with a
 binomial logistic regression with 1 signifying success and 0
 signifying no success.   I performed the regression of the binomial
 variable against overall draft position using the college conference
 for which each player played as a factor using the
 as.factor(Conference) command.
 
 My question is:
 
 How do I plot specific factors against the curve for the entire set.
 There are only a few factors that have significant coefficients and I
 would like to plot those against the logistic curve for the entire
 set.
 
 Any help would be greatly appreciated.
 

It will bias the analysis to omit insignificant factors.

To get the plots you want:

library(Hmisc); library(Design)
dd <- datadist(mydata); options(datadist='dd')
f <- lrm(y ~ x1+x2+...)
plot(f, x1=NA, fun=plogis)
plot(f, x2=NA, fun=plogis)
plot(f, fun=plogis)  # plot partial effects of all predictors, probability scale

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] GLM plotting estimated responses

2006-06-22 Thread Davis, Jacob B.
I have recently fit a binomial (link = logit) model and I want to visually
inspect the estimated response across a specific covariate.

 

I can accomplish this by creating a function that extracts the betas
from the model, forms a linear combination with the classes I'm
interested in, and passes the value through the inverse logit function; I do
this for each class of a covariate, then plot the results.
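
For an ordinary glm() fit, much of that hand work can be pushed onto predict(),
which forms the linear combination and applies the inverse link itself; a small
sketch (y, x1, x2 and mydata are placeholder names, not from the original post):

fit <- glm(y ~ x1 + x2, family = binomial, data = mydata)
new <- data.frame(x1 = sort(unique(mydata$x1)), x2 = "classA")
p   <- predict(fit, newdata = new, type = "response")   # inverse logit applied
plot(new$x1, p, type = "b", xlab = "x1", ylab = "estimated response")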

 

My question is:  Is there a built in function that will do all of this
for me, where all I have to do is supply the classes for each covariate
I'm interested in and it plots for me?

 

On a similar note is there a quick way to plot the study size across a
covariate from model results, without creating my own functions?

 

Thanks in advance,

 

~~

 

Jacob Davis

Actuarial Analyst

 

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] programming advice

2006-06-22 Thread Fred JEAN
Dear R users

I want to compute Kendall's Tau between two vectors x and y.
But x and y  may have zeros in the same position(s) and I wrote the
following function to be sure to drop out those double zeros

cor.kendall <- function(x,y) {
  nox <- c()
  noy <- c()
  #
  for (i in 1:length(x)) if (x[i] != 0 | y[i] != 0)
    nox[length(nox)+1] <- x[i]
  for (i in 1:length(y)) if (x[i] != 0 | y[i] != 0)
    noy[length(noy)+1] <- y[i]
  #
  res.kendall <- cor.test(nox, noy, method = "kendall", exact=F)
  return(list(x=nox, y=noy, res.kendall, length(nox)))
}

Do you know a more elegant way to achieve the same goal ?
(I'm sure you do : it's a newbie's program actually)
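
A more compact sketch of the same idea (drop the positions where both are zero,
then call cor.test once on the kept values):

cor.kendall2 <- function(x, y) {
  keep <- !(x == 0 & y == 0)          # drop only the double zeros
  res  <- cor.test(x[keep], y[keep], method = "kendall", exact = FALSE)
  list(x = x[keep], y = y[keep], res, n = sum(keep))
}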

-- 
Fred

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Basic package structure question

2006-06-22 Thread Jay Emerson
At the encouragement of many at UseR, I'm trying to build my first real
package. I have no C/Fortran code, just plain old R code, so it shouldn't be
rocket science.  On a Linux box, I used package.skeleton() to create a basic
package containing just one hello world type of function.  I edited the
DESCRIPTION file, changing the package name appropriately.  I edited the
hello.Rd file.  Upon running R CMD check hello, the only warning had to do
with the fact that src/ was empty (obviously I had no source in such a
simple package).  I doubt this is a problem.

I was able to install and use the package successfully on the Linux system
from the .tar.gz file, so far so good!  Next, on to Windows, where the
problem arose:

I created a zip file from inside the package directory: zip -r ../hello.zip
./*

When I moved this to my Windows machine and tried to install the package, I
received the following error:

 utils:::menuInstallLocal()
Error in unpackPkg(pkgs[i], pkgnames[i], lib, installWithVers) :
malformed bundle DESCRIPTION file, no Contains field

I only found one mention of this in my Google search, with no reply to the
thread.  The Contains field appears to be used for bundles, but I'm trying
to create a package, not a bundle.  This leads me to believe that a simple
zipping of the package directory structure is not the correct format for
Windows.

Needless to say, there appears to be wide agreement that making packages
requires precision, but fundamentally a package should (as described in the
documentation) just be a collection of files and folders organized a certain
way.  If someone could point me to documentation I may have missed that
explains this, I would be grateful.

Regards,

Jay

-- 
John W. Emerson (Jay)
Assistant Professor of Statistics
Yale University
http://www.stat.yale.edu/~jay

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Why different results with different initial values for MLE (optim)!

2006-06-22 Thread Xin
Hi, All:

I used optim() to minimise the likelihood function for fitting the data to a 
particular distribution. The function converged, but the value of the 
log-likelihood is different when I change the initial value.

Does this mean the program does not work well?

   Thanks!

   Xin
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Bayesian Networks with deal

2006-06-22 Thread Spencer Graves
  Since I haven't seen a reply to this, I will offer a couple of 
comments.  I've never used deal, but it sounded interesting, so I 
thought I'd look at it.

  Have you looked at Susanne G. Bøttcher and Claus Dethlefsen. deal: A 
Package for Learning Bayesian Networks. Journal of Statistical Software, 
8(20), 2003, and the deal reference manual downloadable under 
documentation from www.math.aau.dk/~dethlef/novo/deal?  If yes and 
you still would like more help from this listserve, please submit 
another post including a simple, self-contained example explaining 
something you've tried and why it doesn't seem to answer your question? 
  (This is suggested in the posting guide! 
'www.R-project.org/posting-guide.html'.)

  This documentation might answer your questions.  Even though I've not 
read them, I will guess potential answers to your two questions, hoping 
some other reader may disabuse us both of our ignorance:

  From what I saw in the examples, I would guess that deal supports 
two types of distributions:  normal and finite (discrete).  If so, it 
does NOT support a Poisson.  If it were my problem and I still held that 
view after reviewing this documentation, I might write to the maintainer 
[listed with help(package=deal)] and ask him for suggestions.  Then if 
it were sufficiently important, I might think about how I would modify 
the code to allow for a Poisson.

  Regarding simulations, have you looked at rnetwork, which provides 
simulation of data sets with a given dependency structure?

  Hope this helps,
  Spencer Graves

Carsten Steinhoff wrote:
 Hello,
  
 I want to use R to model a bayesian belief network of frequencies for system
 failiures in various departments of a company.
  
 For the nodes I want to use a poisson-distribution parameterized with expert
 knowledge (e.g. with a gamma prior).
 Then I want to fill in learning-data to improve the initial estimates and
 get some information about possible connections.
 Later I want to simulate dependent random variables from the network
  
 I tried to use the package deal for that task, which is as far as I know
 the best (and only?) R-package for that task.
 But a few questions arose that I could not solve with the documentation:
  
 (1) Is it possible to parameterize the prior distribution (for example
 dpois(x, lambda=60)) directly and non-Gaussian?
  
 (2) If I've chosen a structure, can I simulate dependent states that are
 non-Gaussian distributed?
  
 Thank you for any idea!
  
 Regards, Carsten
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Reading in a table with ISO-latin1 encoding in MacOS-X (Intel)

2006-06-22 Thread Antti Arppe

Dear colleagues,

With the help of a colleague of mine here in Helsinki (Seppo Nyrkkö) 
who looked at the innards of the R source code for the Mac, it turned out that 
this was indeed an issue concerning the Mac locale and its settings, 
and not R.


Though we had tried this earlier by changing the LANG variable to 
'fi_FI', we hadn't looked hard enough in the available encodings (with 
locale -a) to select exactly the right value, which is:


LANG=fi_FI.ISO8859-1;
export LANG;

With this configuration R was able to happily read in my original 
table with the Scandinavian characters in the header, without any 
fuss.


Thanks for your advice, and wishing all a good Midsummer,

-Antti Arppe

On Thu, 8 Jun 2006, Peter Dalgaard wrote:

so one can only guess that you have a local or Mac-specific setup
issue.

On Thu, 8 Jun 2006, Prof Brian Ripley wrote:

If so, you need to investigate the locale in use, as which letters are valid
depends on the locale: on Linux UTF-8 locales all letters in all languages
are valid in R names, but that is not necessarily the MacOS interpretation.
(Invalid characters in names will be converted to ., and if the locale is
wrong so may be the interpretation of bytes as characters.)
You might find more informed help on the r-sig-mac list.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Package structure question (progress!)

2006-06-22 Thread John W Emerson

Thanks to Jeff Laake for giving me a starting point.
It appears my zipping should have been:

zip -r hello.zip hello

essentially creating an outer folder having the package
name inside the zip file.  It still didn't quite work,
with an error as follows:

 utils:::menuInstallLocal()
updating HTML package descriptions
Warning message:
no package 'file678418be' was found in: packageDescription(i, lib.loc = 
lib, field = "Title", encoding = "UTF-8")


The warning doesn't mean anything to me, but I suspect that because
I zipped the linux files, it is possible that the unix end-of-line
difference might pose a problem.  I'll try doing the obvious
conversion and will report back.

As I wrote to Jeff, I'll be happy to document this experience once
I understand it.  For those of us interested in spreading R source
with no C/Fortran (e.g. packages in their simplest form), this
packaging should be fairly simple from first principles without
having to rely on special tools.

Regards,

Jay

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Why different results with different initial values for MLE(optim)!

2006-06-22 Thread Ravi Varadhan
It is difficult to say much since you haven't described your likelihood
function.  However, there are some general things that you can do: (1) make
sure that you are maximizing, by defining your objective function as the
negative of likelihood, (2) use log-likelihood rather than the likelihood
function, (3) look at the value of the objective function at different
converged points to see whether you truly have multiple local maxima or if
the likelihood is very flat near the optimum, (4) look at convergence
criterion from the output to see whether you have true convergence, and (5)
try different optimization techniques in optim, or try the fitdistr function
in the MASS package.
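
As a small illustration of points (1)-(3), a sketch fitting a gamma distribution
by minimizing the negative log-likelihood from two different starting values
(the log-parameterization simply keeps the parameters positive):

set.seed(1)
x <- rgamma(200, shape = 3, rate = 2)
negll <- function(p, x) -sum(dgamma(x, shape = exp(p[1]), rate = exp(p[2]), log = TRUE))
f1 <- optim(c(0, 0), negll, x = x)
f2 <- optim(c(2, 2), negll, x = x)
c(f1$value, f2$value)       # converged objective values should agree closely
exp(f1$par); exp(f2$par)    # compare with fitdistr(x, "gamma") from MASS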

Ravi.

--
Ravi Varadhan, Ph.D.
Assistant Professor,  The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email:  [EMAIL PROTECTED]
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html 
--
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:r-help-
 [EMAIL PROTECTED] On Behalf Of Xin
 Sent: Thursday, June 22, 2006 12:10 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Why different results with different initial values for
 MLE(optim)!
 
 Hi, All:
 
 I used optim() to minimise likelihood function for fitting the data to
 a partiuclar distribution. The function is converged and the value of log-
 likelihood is different when I change the intial value.
 
 Whether it means the program does not work well?
 
Thanks!
 
Xin
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-
 guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] useR! Thanks

2006-06-22 Thread Ravi Varadhan
Dear All,

I would like to totally echo Frank Harrell's appreciation of both UseR! and
Vienna.  I had a wonderful time despite being jet-lagged for the whole time!

I was also wondering whether the minutes of the final session on getting
credit for contributions to computational statistics software could be made
available to the participants.

Best,
Ravi.

--
Ravi Varadhan, Ph.D.
Assistant Professor,  The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email:  [EMAIL PROTECTED]
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html 
--

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:r-help-
 [EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
 Sent: Monday, June 19, 2006 6:14 AM
 To: RHELP
 Subject: [R] useR! Thanks
 
 After attending my first useR! conference I want to thank the organizers
 for doing a wonderful job and the presenters for their high quality
 presentations and stimulating ideas.   The conference venue was
 excellent and of course Vienna is one of the greatest cities in the
 world to visit.  useR! is one of the most fun conferences I've attended.
 
 Thanks again!
 --
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-
 guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] QUERY: correlation between two time series

2006-06-22 Thread Stefano Sofia
I am writing on behalf of a colleague. 
 
She needs to evaluate the correlation between two time series. 
How can this be performed through R? (how many types of correlations can be 
performed and which are the commands to do it?)
 
thank you for your help and attention
Stefano
 
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] High breakdown/efficiency statistics -- was RE: Rosner's test [Broadcast]

2006-06-22 Thread Liaw, Andy
What would be nice is to have something like a robust task view...

Andy 

From: Berton Gunter
 
 Many thanks for this Martin. There now are several packages 
 with what appear to be overlapping functions (or at least 
 algorithms). Besides those you mentioned, robust and 
 roblm are at least two others. Any recommendations about 
 how or whether to choose among these for us enthusiastic but 
 non-expert users?
 
 
 Cheers,
 Bert
  
 
  -Original Message-
  From: Martin Maechler [mailto:[EMAIL PROTECTED]
  Sent: Thursday, June 22, 2006 2:04 AM
  To: Berton Gunter
  Cc: 'Robert Powell'; r-help@stat.math.ethz.ch
  Subject: Re: [R] Rosner's test
  
   BertG == Berton Gunter [EMAIL PROTECTED]
   on Tue, 13 Jun 2006 14:34:48 -0700 writes:
  
  BertG RSiteSearch('Rosner') ?RSiteSearch or search directly
  BertG from CRAN.
  
  BertG Incidentally, I'll repeat what I've said
  BertG before. Don't do outlier tests.  They're
  BertG dangerous. Use robust methods instead.
  
  Yes, yes, yes!!!
  
  Note that  rlm() or cov.rob()  from recommended package 
 MASS will most 
  probably be sufficient for your needs.
  
  For slightly newer methodology, look at package 
 'robustbase', or also 
  'rrcov'.
  
  Martin Maechler, ETH Zurich
  
  BertG -- Bert Gunter Genentech Non-Clinical Statistics
  BertG South San Francisco, CA
   
  BertG The business of the statistician is to catalyze the
  BertG scientific learning process.  - George E. P. Box
   
   
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] useR! Thanks

2006-06-22 Thread Jeffrey Horner
Achim Zeileis wrote:
[...]
 The plan is to provide the original presentation slides of the
 panelists and a video of the whole panel disc, and probably some minutes
 and/or further information.

Did you happen to video Ivan Mizera's talk as well? I'd really like to 
see that again!

-- 
Jeffrey Horner   Computer Systems Analyst School of Medicine
615-322-8606 Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] useR! Thanks

2006-06-22 Thread Marc Schwartz (via MN)
On Thu, 2006-06-22 at 15:40 -0500, Jeffrey Horner wrote:
 Achim Zeileis wrote:
 [...]
  The plan is to provide the original presentation slides of the
  panelists and a video of the whole panel disc, and probably some minutes
  and/or further information.
 
 Did you happen to video Ivan Mizera's talk as well? I'd really like to 
 see that again!

Hear! Hear!

A most enjoyable presentation!

I now suffer from one newly engendered fear:

   Becoming an abuseR!

;-)


Thanks to the Organizers, Hosts and Presenters for another wonderful
meeting. It is great to see the continued healthy growth and diversity
in this community. I look forward to the next one.

Best regards,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] QUERY: correlation between two time series

2006-06-22 Thread Ben Bolker
Stefano Sofia stefano.sofia at regione.marche.it writes:

 
 I am writing on the behalf of a colleague. 
 
 She needs to evaluate the correlation between two time series. 
 How can this be performed through R? (how many types of correlations can be
performed and which are the
 commands to do it?)
 

  This question is too broad, and the first answer is
the usual "please read the
posting guide/documentation/etc." -- as a start, try 

?cor
help.search("correlation")
help.search("time series")

RSiteSearch( etc. )

  Taking a wild guess at your friend's statistical issue,
she might need to estimate cross-correlations between
the series in the presence of autocorrelation in each time
series, which is typically done using a multiple ARMA
model:

http://finzi.psych.upenn.edu/R/Rhelp02a.new/archive/24612.html

suggests that the dse bundle has such capabilities.

  This list is *not* a general-purpose statistics query list, but
if your friend (or you) submitted a much more careful description
of the problem, and showed more evidence of (1) having a good idea
of the general statistical background of the problem (e.g. "I think
model X would be appropriate based on reading the literature in
my area but I can't figure out whether R can do it" or 
"everyone in my field uses model X but it's not appropriate
in this case because ...") and (2) having tried to use the existing
R documentation (RSiteSearch, etc.) to find the answers to your
questions --- then you'd be much more likely to get a useful
response, even if it was not strictly an R question.

  Ben Bolker

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] xyplot, type=b

2006-06-22 Thread Benjamin Tyner
For those who are interested, here is a solution using grid. This
preserves some of the original spirit of do_plot_xy, but is more
satisfactory when x and y are on different scales:

my.panel <- function(x, y, cex=1, ...){
    require(grid)
    panel.xyplot(x, y, cex = cex, ...)
    a <- 0.5 * cex * convertWidth(unit(1, "char"), "native", valueOnly=TRUE)
    b <- 0.5 * cex * convertHeight(unit(1, "char"), "native", valueOnly=TRUE)
    for(i in 1:length(x)){
        dx <- x[i] - x[i - 1]
        dy <- y[i] - y[i - 1]
        f <- 1 / sqrt((dx/a)^2 + (dy/b)^2)
        if((i > 1) && identical(f < 0.5, TRUE)){
            panel.lines(x = c(x[i - 1] + f * dx, x[i] - f * dx),
                        y = c(y[i - 1] + f * dy, y[i] - f * dy))
        }
    }
}

# example
n <- 50
dat <- data.frame(x = runif(n), y = runif(n))
dat <- dat[order(dat$x), ]
myplot <- xyplot(y~x, data = dat, pch=".", panel = my.panel)
print(myplot)

(It's a little slow due to looping panel.lines.)

Ben

Benjamin Tyner wrote:

Unfortunately this is not quite right; to do it correctly it seems one
has to address two problems:

1. how to get the size of the default 'cex' to use for 'd' (do_plot_xy uses 
'GConvertYUnits' to accomplish this)
2. figure out how to achieve the same effect as what 'GConvert(xx, yy, USER, 
INCHES, dd)' does in do_plot_xy. Otherwise, the gap sizes are not constant.

(1) sounds easy but I don't know the answer offhand. (2) seems more subtle. 
Any suggestions would be greatly appreciated.

Ben
  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Effect size in mixed models

2006-06-22 Thread Spencer Graves
  I just learned that my earlier suggestion was wrong.  It's better to 
compute the variance of the predicted or fitted values and compare those 
with the estimated variance components.

  To see how to do this, consider the following minor modification of 
an example in the lme documentation:

fm1. <- lme(distance ~ age, data = Orthodont, random=~1)
fm2. <- lme(distance ~ age + Sex, data = Orthodont, random = ~ 1)

# str(fm1.) suggested the following:
  var(fm2.$fitted[, "fixed"] - fm1.$fitted[, "fixed"])
[1] 1.312756
  VarCorr(fm1.)[, 1]
(Intercept)Residual
  4.472056  2.049456
  VarCorr(fm2.)[, 1]
(Intercept)Residual
  3.266784  2.049456

  In this example, the subject variance without considering Sex was 
4.47 but with Sex in the model, it dropped to 3.27, while the Residual 
variance remained unchanged at 2.05.  The difference between 
fm2.$fitted[, "fixed"] and fm1.$fitted[, "fixed"] is the change in the 
predictions generated by the addition of Sex to the model.  The 
variance of that difference was 1.31.  Note that 3.27 + 1.31 = 4.58, 
which is moderately close to 4.47.

  In sum, I think we can get a reasonable estimate of the size of an 
effect from the variance of the differences in the fixed portion of 
the fitted model.

  Comments?
  Hope this helps.
  Spencer Graves

Spencer Graves wrote:
   You have asked a great question:  It would indeed be useful to 
 compare the relative magnitude of fixed and random effects, e.g. to 
 prioritize efforts to better understand and possibly manage processes 
 being studied.  I will offer some thoughts on this, and I hope if there 
 are errors in my logic or if someone else has a better idea, we will 
 both benefit from their comments.
 
   The ideal might be an estimate of something like a mean square for 
 a particular effect to compare with an estimated variance component.
 Such mean squares were a mandatory component of any analysis of variance 
 table prior to the (a) popularization of generalized linear models and 
 (b) availability of software that made it feasible to compute maximum 
 likelihood estimates routinely for unbalanced, mixed-effects models. 
However, in anova(lme(...)) output such mean squares are for most purposes 
unnecessary clutter in a modern anova table.
 
   To estimate a mean square for a fixed effect, consider the 
 following log(likelihood) for a mixed-effects model:
 
   lglk = (-0.5)*(n*log(2*pi*var.e)-log(det(W)) + 
 t(y-X%*%b)%*%W%*%(y-X%*%b)/var.e),
 
 where n = the number of observations,
 
  b = the fixed-effect parameter vector,
 
 and the covariance matrix of the residuals, after integrating out the 
 random effects is var.e*solve(W).  In this formulation, the matrix W 
 is a function of the variance components.  Since they are not needed to 
 compute the desired mean squares, they are suppressed in the notation here.
 
   Then, the maximum likelihood estimate of
 
   var.e = SSR/n,
 
 where SSR = t(y-X%*%b)%*%W%*%(y-X%*%b).
 
   Then
 
   mle.lglk = (-0.5)*(n*(log(2*pi*SSR/n)-1)-log(det(W))).
 
   Now let
 
   SSR0 = this generalized sum of squares of residuals (SSR) without 
 effect 1,
 
 and
 
   SSR1 = this generalized SSR with this effect 1.
 
   If I've done my math correctly, then
 
   D = deviance = 2*log(likelihood ratio)
 = (n*log(SSR0/SSR1)+log(det(W1)/det(W0)))
 
   For roughly half a century, a major part of the analysis of 
 variance was the Pythagorean idea that the sum of squares under H0 was 
 the sum of squares under H1 plus the sum of squares for effect 1:
 
   SSR0 = SS1 + SSR1.
 
   Whence,
 
   exp((D/n)-log(det(W1)/det(W0))) = 1+SS1/SSR1.
 
 Thus,
 
   SS1 = SSR1*(exp((D/n)-log(det(W1)/det(W0)))-1).
 
   If the difference between det(W1) and det(W0) can be ignored, we get:
 
   SS1 = SSR1*(exp(D/n)-1).
 
   Now compute MS1 = SS1/df1, and compare with the variance components.
 
   If there is a flaw in this logic, I hope someone will disabuse me 
 of it.
 
   If this seems too terse or convoluted to follow, please provide a 
 simple, self-contained example, as suggested in the posting guide! 
 www.R-project.org/posting-guide.html.  You asked a theoretical 
 question, you got a theoretical answer.  If you want a concrete answer, 
 it might help to pose a more concrete question.
 
   Hope this helps.
   Spencer Graves   
 
 Bruno L. Giordano wrote:
 Hello,
 Is there a way to compare the relative relevance of fixed and random 
 effects in mixed models? I have in mind measures of effect size in 
 ANOVAs, and would like to obtain similar information with mixed models.

 Are there information criteria that allow to compare the relevance of 
 each of the effects in a mixed model to the overall fit?

 Thank you,
 Bruno

 __
 R-help@stat.math.ethz.ch mailing list
 

Re: [R] rank(x,y)?

2006-06-22 Thread Duncan Murdoch
Gabor Grothendieck wrote:
 On 6/21/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
   
 Peter Dalgaard wrote:
 
 Duncan Murdoch [EMAIL PROTECTED] writes:


   
 Suppose I have two columns, x,y.  I can use order(x,y) to calculate a
 permutation that puts them into increasing order of x,
 with ties broken by y.

 I'd like instead to calculate the rank of each pair under the same
 ordering, but the rank() function doesn't take multiple values
 as input.  Is there a simple way to get what I want?

 E.g.

  x <- c(1,2,3,4,1,2,3,4)
  y <- c(1,2,3,1,2,3,1,2)
   rank(x+y/10)
 [1] 1 3 6 7 2 4 5 8

 gives me the answer I want, but only because I know the range of y and
 the size of gaps in the x values.  What do I do in general?

 
 Still not quite general, but in the absence of ties:


   
  z[order(x,y)] <- 1:8
 z

 
 [1] 1 3 6 7 2 4 5 8


   
 Thanks to all who have replied.  Unfortunately for me, ties do exist,
 and I'd like them to get identical ranks.  John Fox's suggestion would
 handle ties properly, but I'm worried about rounding error giving
 spurious ties.

 

 Try this variant of my prior solution:

 (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2

 Note that no arithmetic is done on the original data, only on
 the output of order, so there should not be any worry about
 rounding -- in fact its sufficiently general that the data
 do not have to be numeric, e.g.

   
  x <- c("a", "a", "b", "a", "c", "d")
  y <- c("b", "a", "b", "b", "a", "a")
  (order(order(x,y)) + rev(order(order(rev(x), rev(y)))))/2
 
 [1] 2.5 1.0 4.0 2.5 5.0 6.0
   

This is a very nice solution, thanks!

So now we have equivalents to ties="average" and "first"; ties="random" 
would be easy.  I wonder if it's worth working out ties="max" and 
ties="min" and putting in a new function?
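
A quick sketch of the "min"/"max" variants built on the same trick (group the
identical pairs, then take the group-wise extreme of the first-come ranks):

rank2 <- function(x, y, ties = c("min", "max")) {
    ties <- match.arg(ties)
    r <- order(order(x, y))              # ranks with ties broken by position
    g <- interaction(x, y, drop = TRUE)  # identical (x, y) pairs share a group
    ave(r, g, FUN = if (ties == "min") min else max)
}
rank2(x, y, "min")  # with x, y as in the example above: 2 1 4 2 5 6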

Duncan Murdoch
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] hurdle and zip model

2006-06-22 Thread Jeff Miller
This is more of a stats question, but since I'm using R.

 

Can someone tell me if the standard errors produced by hurdle(), zicounts(),
poisson, and the negative binomial formulations of the three are directly
comparable? Why or why not? 

 

Thank you,

Jeff

 

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Basic package structure question

2006-06-22 Thread Gabor Grothendieck
On 6/22/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
 Jay Emerson wrote:
  At the encouragement of many at UseR, I'm trying to build my first real
   package. I have no C/Fortran code, just plain old R code, so it shouldn't be
   rocket science.  On a Linux box, I used package.skeleton() to create a basic
  package containing just one hello world type of function.  I edited the
   DESCRIPTION file, changing the package name appropriately.  I edited the
  hello.Rd file.  Upon running R CMD check hello, the only warning had to do
  with the fact that src/ was empty (obviously I had no source in such a
  simple package).  I doubt this is a problem.
 
  I was able to install and use the package successfully on the Linux system
  from the .tar.gz file, so far so good!  Next, on to Windows, where the
  problem arose:
 
  I created a zip file from inside the package directory: zip -r ../hello.zip
  ./*
 
 
 Which package directory, the source or the installed copy?  I think this
 might work in the installed copy, but would not work on the source.
 It's not the documented way to build a binary zip, though.
  When I moved this to my Windows machine and tried to install the package, I
  received the following error:
 
 
  utils:::menuInstallLocal()
 
  Error in unpackPkg(pkgs[i], pkgnames[i], lib, installWithVers) :
  malformed bundle DESCRIPTION file, no Contains field
 
  I only found one mention of this in my Google search, with no reply to the
  thread.  The Contains field appears to be used for bundles, but I'm trying
  to create a package, not a bundle.  This leads me to believe that a simple
  zipping of the package directory structure is not the correct format for
  Windows.
 
  Needless to say, there appears to be wide agreement that making packages
  requires precision, but fundamentally a package should (as described in the
  documentation) just be a collection of files and folders organized a certain
  way.  If someone could point me to documentation I may have missed that
  explains this, I would be grateful.
 

  I think the "organized in a certain way" part is actually important.
  Using R CMD INSTALL --build is the documented way to achieve this.  It's
 not trivial to do this on Windows, because you need to set up a build
 environment first, but it's not horribly difficult.

 Duncan Murdoch
  Regards,
 
  Jay

One idea that occurred to me in reading this would be to have a server
that one can send a package to and get back a Windows build to
eliminate having to set up a development environment.  Not sure if
this is feasible, particularly security aspects, but if it were it would
open up package building on Windows to a larger audience.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] eacf

2006-06-22 Thread Spencer Graves
  RSiteSearch("EACF") led me to
http://finzi.psych.upenn.edu/R/Rhelp02a.new/archive/43927.html.  In
this note from Oct. 2004, Brian Ripley said the EACF did not exist but
'acf' and 'pacf' did.  Moreover, he said that model selection using acf
and pacf is a rather old-fashioned idea. Just fit all the target models
and compare their performance.
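
In that spirit, a small sketch comparing a handful of candidate ARMA orders by
AIC (lh is a built-in series; the orders below are arbitrary):

orders <- list(c(1,0,0), c(2,0,0), c(1,0,1), c(0,0,2))
fits <- lapply(orders, function(ord) arima(lh, order = ord))
data.frame(order = sapply(orders, paste, collapse = ","),
           AIC   = sapply(fits, AIC))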

What are you trying to do?  If modeling time series, I suggest you follow 
Prof. Ripley's advice.  If you are conducting research and you want to 
compare the EACF with something else, I doubt if it would be too hard to 
program in R (though I could neither recall nor find the formulae 
involved in the few minutes I spent just now looking).  If you try to 
write an 'eacf' function and get stuck, please submit another post 
describing what you've tried and what does not seem to work (as 
suggested in the posting guide! www.R-project.org/posting-guide.html).

  Hope this helps.
  Spencer Graves
p.s.  Your English is just fine.

Michel Helcias wrote:
 I am Brazilian and I don't speak English well, so I apologize
 for my writing.
 I'd like information about the extended (sample) autocorrelation
 function (*EACF*) for time series, and where to acquire some related
 command.
 thank you.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] xyplot, type=b

2006-06-22 Thread Paul Murrell
Hi


Benjamin Tyner wrote:
 For those who are interested, here is a solution using grid. This
 preserves some of the original spirit of do_plot_xy, but is more
 satisfactory when x and y are on different scales:
 
 my.panel - function(x, y, cex=1, ...){
 require(grid)
 panel.xyplot(x, y, cex = cex, ...)
 a - 0.5 * cex * convertWidth(unit(1, char), native, valueOnly=TRUE)
 b - 0.5 * cex * convertHeight(unit(1, char), native,valueOnly=TRUE)
 for(i in 1:length(x)){
 dx - x[i] - x[i - 1]
 dy - y[i] - y[i - 1]
 f - 1 / sqrt((dx/a)^2 + (dy/b)^2)
 if((i  1)  identical(f  0.5, TRUE)){
 panel.lines(x = c(x[i - 1] + f * dx, x[i] - f * dx),
 y = c(y[i - 1] + f * dy, y[i] - f * dy))
 }
 }
 }
 
 # example
 n - 50
 dat - data.frame(x = runif(n), y = runif(n))
 dat - dat[order(dat$x), ]
 myplot - xyplot(y~x, data = dat, pch=., panel = my.panel)
 print(myplot)
 
 (It's a little slow due to looping panel.lines.)


Here's one way to vectorize the lines ...

my.panel2 <- function(x, y, cex=1, ...){
    panel.xyplot(x, y, cex = cex, ...)
    a <- 0.5 * cex *
        convertWidth(unit(1, "char"), "native", valueOnly=TRUE)
    b <- 0.5 * cex *
        convertHeight(unit(1, "char"), "native", valueOnly=TRUE)
    dx <- diff(x)
    dy <- diff(y)
    f <- 1 / sqrt((dx/a)^2 + (dy/b)^2)
    n <- length(x)
    x1 <- x[1:(n-1)] + f*dx
    x2 <- x[2:n] - f*dx
    y1 <- y[1:(n-1)] + f*dy
    y2 <- y[2:n] - f*dy
    i <- f < 0.5
    panel.segments(x1[i], y1[i], x2[i], y2[i])
}

Paul


 Benjamin Tyner wrote:
 
 
Unfortunately this is not quite right; to do it correctly it seems one
has to address two problems:

1. how to get the size of the default 'cex' to use for 'd' (do_plot_xy uses 
'GConvertYUnits' to accomplish this)
2. figure out how to achieve the same effect as what 'GConvert(xx, yy, 
USER, INCHES, dd)' does in do_plot_xy. Otherwise, the gap sizes are not 
constant.

(1) sounds easy but I don't know the answer offhand. (2) seems more subtle. 
Any suggestions would be greatly appreciated.

Ben
 

 
 
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] xyplot, type=b

2006-06-22 Thread Benjamin Tyner
Most clever! I'd completely forgotten about panel.segments.

Paul Murrell wrote:

Here's one way to vectorize the lines ...

my.panel2 <- function(x, y, cex=1, ...){
    panel.xyplot(x, y, cex = cex, ...)
    a <- 0.5 * cex *
        convertWidth(unit(1, "char"), "native", valueOnly=TRUE)
    b <- 0.5 * cex *
        convertHeight(unit(1, "char"), "native", valueOnly=TRUE)
    dx <- diff(x)
    dy <- diff(y)
    f <- 1 / sqrt((dx/a)^2 + (dy/b)^2)
    n <- length(x)
    x1 <- x[1:(n-1)] + f*dx
    x2 <- x[2:n] - f*dx
    y1 <- y[1:(n-1)] + f*dy
    y2 <- y[2:n] - f*dy
    i <- f < 0.5
    panel.segments(x1[i], y1[i], x2[i], y2[i])
}

Paul


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] positioning of separate y-axis labels in xyplot

2006-06-22 Thread Paul Murrell
Hi


Benjamin Tyner wrote:
 I like the functionality provided by outer=TRUE, but when it comes time 
 to place separate xlabs or ylabs, I always end up 'eyeballing' it on a 
 case-by-case basis. For example,
 
 ##begin example
 require(lattice)
 
 cars.lo <- loess(dist ~ speed, cars)
 
 print(xyplot(cars.lo$residuals + cars.lo$fitted ~ cars.lo$x,
  strip = FALSE,
  outer = TRUE,
  layout = c(1, 2),
  ylab = "",
  scales = list(y = list(relation = "free", rot = 0)),
  panel = function(x, y, panel.number, ...){
    if(panel.number == 1){
       panel.xyplot(x, y)
       panel.abline(h = 0)
    } else {
       panel.xyplot(x, y = cars.lo$y)
       panel.xyplot(x, y, type = "l")
    }
  }))
 
 require(grid)
 trellis.focus("panel", 1, 1, clip.off = TRUE, highlight = FALSE)
 grid.text("residuals", x = unit(0, "npc") + unit(-2, "lines"), rot = 90)
 trellis.focus("panel", 1, 2, clip.off = TRUE, highlight = FALSE)
 grid.text("fitted", x = unit(0, "npc") + unit(-2, "lines"), rot = 90)
 ## end example
 
 In this case, a distance of -2 lines happens to be enough, but one has 
 to make the plot to know this. I'm interested in learning how one can 
 place the ylabs without fear of overlapping the tick labels; i.e., how 
 to use the exact space allocated by ylab="". I'm thinking it must 
 involve viewports?


There is a 'ylab' viewport set up by lattice, although it is just one 
big one for all panels.  Here's what you could do for your example (with 
knowledge of how many rows of panels there are) ...

downViewport(trellis.vpname("ylab"))
# If you want to see where the viewport is ...
# grid.rect(gp = gpar(col = "red"), name = "temp")
grid.text("residuals", y = 0.25, rot = 90)
grid.text("fitted", y = 0.75, rot = 90)
# If you want to remove the rect again ...
# grid.remove("temp")
upViewport(0)
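
The same idea generalises to any number of rows.  A small sketch, assuming a
one-column layout with equal-height panels counted from the bottom as above
('nr' and 'labs' are made-up names):

nr <- 2                                   # number of panel rows
labs <- c("residuals", "fitted")          # one label per row, bottom to top
downViewport(trellis.vpname("ylab"))
for (i in 1:nr)
    grid.text(labs[i], y = (2 * i - 1) / (2 * nr), rot = 90)
upViewport(0)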

Paul
-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] frechet distance

2006-06-22 Thread Spencer Graves
  RSiteSearch("Frechet distance") returned only one hit for me just 
now, and that was for a Frechet distribution, as you mentioned.  Google 
found www.cs.concordia.ca/cccg/papers/39.pdf, which suggests that 
computing it may not be easy.

  If you'd like more help from this listserve, I can offer two 
suggestions:

  1.  If you absolutely need the Frechet distance and you can describe 
an algorithm for computing it but get stuck writing a function for it, 
please submit another question outlining what you've tried and the 
obstacle you've found.

  2.  Alternatively, you might describe the more general problem you 
are trying to solve, why you thought the Frechet distance might help and 
invite alternative suggestions.

  Hope this helps.
  Spencer Graves

Rajarshi Guha wrote:
 Hi, is there any package (or source code snippet) that will evaluate the
 Frechet distance for curves represented as sets of points?
 
 Searching around only threw up references to a Frechet distribution.
 
 Thanks,
 
 ---
 Rajarshi Guha [EMAIL PROTECTED] http://jijo.cjb.net
 GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
 ---
 
 If you believe in telekinesis, raise my hand.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Nested variance-covariance matrix in Multilevel model

2006-06-22 Thread Spencer Graves
  What's your error message?  I see a syntax error in

random=list(Probe=pdBlocked(list(~1,pdCompSymm(~1
),End=~1,ProbeNo=pdCompSymm(~1))

  I count 5 open parens and 3 close parens.  I know that might have 
been part of what you copied into this email and not what you gave R. 
However, it sounds like you have several problems.

  Have you seen Pinheiro and Bates (2000) Mixed-effects models in S and 
S-PLUS (Springer)?  This is the premier (almost mandatory) reference for 
'lme'.  Moreover, the ~\library\nlme\scripts subdirectory of the R 
installation directory (at least under Windows) contains files ch01.R, 
ch02.R, ..., ch06.R, ch08.R containing essentially all the S 
commands used in the book, organized by chapter.  This can save endless 
hassles with spelling problems (and minor syntax differences between 
S-Plus and R).  Sections 1.5 and 1.6, pp. 40-72, discuss analyses of two 
different nested experiments.

  If you just want a standard nested analysis, this should be adequate.  If 
you have other special things happening that I missed in skimming your 
email, please explain why the analyses described in Pinheiro and Bates 
do not apply.  In doing so, please include a simple, self-contained 
example showing what you tried and why you think it's not adequate. 
Please don't send us a megabyte of data.  Instead, please either (a) 
modify an example from the book or some other R documentation  or (b) 
generate phony data with a very few lines of R code.
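
For orientation only, a parenthesis-balanced skeleton of such a call might
look as follows.  This is a syntax sketch, not a claim about the correct
covariance model for these data: 'vov1i2', 'Probe' and 'ProbeNo' are the
poster's own names, and 'ProbeNo' is assumed to be a factor.

library(nlme)
fm <- lme(fixed  = log2(Intensity) ~ End + logpgc,
          random = list(Probe = pdBlocked(list(pdIdent(~ 1),
                                               pdCompSymm(~ ProbeNo - 1)))),
          data   = vov1i2)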

  Hope this helps.
  Spencer Graves

Tobias Guennel wrote:
  I completely forgot to supply the R code I tried:
  
 vov1i2 <- read.table("VOV1_INHIBITED6-16-2006-13h35min33sec.txt", header=TRUE)
  
 test.lme <- lme(fixed=log2(Intensity)~End+logpgc, random=list(Probe=pdBlocked(list(~1,pdCompSymm(~1
 ),End=~1,ProbeNo=pdCompSymm(~1)), data=vov1i2)
 
  It doesn't look right to me and it produces an error too.
 
  It should be something around those lines though.
 
  Tobias Guennel

Tobias Guennel wrote:
 Dear R community,
 
 I have trouble implementing a nested variance-covariance matrix in the 
 lme function.
 
 The model has two fixed effects called End and logpgc, the response 
 variable is the logarithm to base 2 of  Intensity ( log2(Intensity) ) 
 and the random effects are called Probe and ProbeNo.
 The model has the following nesting structure: A Pixel is nested within 
 the ProbeNo,the ProbeNo is within the ProbeEnd ( there are two ends for 
 every probe), and the ProbeEnd is within the Probe.
 
 Now the problem I have is that the variance-covariance structure of the 
 model is quite complex and I can not find the right syntax for fitting 
 it in the lme function.
 
 The variance-covariance structure is a block-diagonal matrix of the form
 
        V1   0    0
   V =  0    V2   0
        0    0    V3
 
 where V1...V3 have the structure
 
         v11  v12
   V1 =            and so on.
         v21  v22
 
 V1...V3 are assumed to have a compound-symmetric variance-covariance
 structure and therefore the submatrices are of the form:
 
                 Lambda  Delta1  Delta1  ...  Delta1
                 Delta1  Lambda  Delta1  ...  Delta1
   v11 = v22 =     .       .       .            .
                 Delta1   ...     ...         Lambda
 
                 Delta2  Delta2  Delta2  ...  Delta2
                 Delta2  Delta2  Delta2  ...  Delta2
   v12 = v21 =     .       .       .            .
                 Delta2   ...     ...         Delta2
 
 The elements of these submatrices depend only upon the three covariance 
 parameters: the compound symmetry parameter delta; the variance of 
 random effect sigma^2g; and the residual variance sigma^2. I have 
 formulas for the submatrices Lambda, Delta1 and Delta2 which I can't 
 really paste in here.
 
 The SAS code dealing with this model is the following:
 
 proc mixed data=rnadeg.pnau;
 title 'CV structure for PNAU';
 class probepos probeno end probe pixelid newprobeid;
 model logPM=end logpgc / ddfm=satterth;
 random probeno newprobeid / subject=probe type=cs;
 lsmeans end / diff cl; run;
 
 Any ideas are appreciated a lot since I am kind of stuck at this point.
 
 Thank you
 Tobias Guennel
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Time series labeling with Zoo

2006-06-22 Thread Gad Abraham
Hi,

I'm using zoo because it can automatically label the months of a time 
series composed of daily observations.

This works well for certain time series lengths, but not for others, e.g.:

While:

  library(zoo)
   plot(zoo(runif(10), as.Date("2005-06-01") + 0:50))

Shows up the months and day of month,

   plot(zoo(runif(10), as.Date("2005-06-01") + 0:380))

results in a single label 2006 in the middle of the x axis, with no 
months labelled, although there's plenty of room for labels.


How do I get zoo to label the months too, without having to manually 
work out which day was the first day of each month?


I'm using zoo 1.0-3 with R 2.2.1.

Thanks,
Gad

-- 
Gad Abraham
Department of Mathematics and Statistics
University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.ms.unimelb.edu.au/~gabraham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] frechet distance

2006-06-22 Thread Rajarshi Guha
On Thu, 2006-06-22 at 19:52 -0700, Spencer Graves wrote:
 RSiteSearch("Frechet distance") returned only one hit for me just 
 now, and that was for a Frechet distribution, as you mentioned.  Google 
 found www.cs.concordia.ca/cccg/papers/39.pdf, which suggests that 
 computing it may not be easy.

In addition, from what I have read it is supposed to be NP-hard. However
my problems are usually small. 

 1.  If you absolutely need the Frechet distance and you can describe 
 an algorithm for computing it but get stuck writing a function for it, 
 please submit another question outlining what you've tried and that 
 obstacle you've found.
 
 2.  Alternatively, you might describe the more general problem you 
 are trying to solve, why you thought the Frechet distance might help and 
 invite alternative suggestions.

I have some curves (in the form of points) which are sigmoidal. I also
have some curves which look like 2 sigmoid curves joined head to tail.
Something like


 -
/
   /
 --
/
   /
  /
   --

Now these curves can vary: in some cases the initial lower tail might be
truncated.  For the 'stepwise' sigmoidal curves, the middle step might
not be horizontal but inclined to some degree and so on.

My goal is to be given a set of points representing a curve and try and
identify whether it is of the standard sigmoid form or of the 'stepwise'
sigmoid form.

My plan was to generate a 'canonical sigmoid curve' via the logistic
equation and then perform a curve matching operation. Thus for a
supplied curve that is sigmoid in nature it will match the 'canonical
curve' to a better extent than would a curve that is stepwise in nature.
(The matching is performed after applying a Procrustes transformation)
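
A tiny sketch of one way to construct such a canonical curve with plogis
(the 100 points and the slope of 12 are arbitrary choices):

t <- seq(0, 1, length = 100)
canonical <- cbind(x = t, y = plogis(12 * (t - 0.5)))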

Initially I tried using the Hausdorff distance, but this does not take
into account the ordering of the points in the curve and did not always
give a conclusive answer. A number of references (including the one
above) indicate that the Frechet distance is better suited for curve
matching problems.

As you noted, evaluating the Frechet distance is non-trivial and the
only code that I could find was some code that is dependent on the CGAL
(http://www.cgal.org/) library. As far as I could see, CGAL does not
have any R bindings.

 An alternative that I had considered was to evaluate the distance
matrix of the points making up the curve and then evaluating the root
mean square error of the matrix elements for the canonical curve and the
supplied curve. My initial experiments indicated that this generally
works but I observed some cases where a stepwise curve matched the
canonical sigmoid better (ie lower RMSE) than an actual sigmoid curve.

 Another alternative is to look at a graph of the first derivative of the
curve. A standard sigmoidal curve will result in a graph with a single
peak, a stepwise curve like above will result in a graph with 2 peaks.
Thus this could be reduced to a peak picking problem. The problem is the
curves I'll get are not smooth and can have small kinks - this leads to
(usually) quite small peaks in the graph of the first derivative - but
most of the code that has been described on this list for peak picking
also picks them up, thus making identification of the curve ambiguous.

To be honest I do not fully understand the algorithm used to evaluate
the Frechet distance, hence my request for code. However, I'm not fixated
on the Frechet distance :) If there are simpler approaches I'm open to
them.
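
In case a concrete starting point helps, below is a minimal sketch of the
*discrete* Frechet (coupling) distance between two curves supplied as ordered
point matrices, following the dynamic-programming recursion of Eiter and
Mannila (1994).  The function name and the example curves are made up, and
this is only the discrete approximation, not the continuous Frechet distance.

discrete.frechet <- function(P, Q) {
    ## P, Q: n x 2 and m x 2 matrices of points, ordered along each curve
    n <- nrow(P); m <- nrow(Q)
    d <- function(i, j) sqrt(sum((P[i, ] - Q[j, ])^2))   # Euclidean distance
    ca <- matrix(NA, n, m)                               # coupling measures
    for (i in 1:n) {
        for (j in 1:m) {
            dij <- d(i, j)
            if (i == 1 && j == 1) {
                ca[i, j] <- dij
            } else if (i == 1) {
                ca[i, j] <- max(ca[1, j - 1], dij)
            } else if (j == 1) {
                ca[i, j] <- max(ca[i - 1, 1], dij)
            } else {
                ca[i, j] <- max(min(ca[i - 1, j],
                                    ca[i - 1, j - 1],
                                    ca[i, j - 1]), dij)
            }
        }
    }
    ca[n, m]
}

## made-up example: canonical sigmoid vs. a two-step ('stepwise') curve
t <- seq(0, 1, length = 50)
sig <- cbind(t, plogis(12 * (t - 0.5)))
stp <- cbind(t, 0.5 * plogis(24 * (t - 0.25)) + 0.5 * plogis(24 * (t - 0.75)))
discrete.frechet(sig, sig)   # 0: identical curves
discrete.frechet(sig, stp)   # larger: the step curve matches the sigmoid less well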

Thanks,

---
Rajarshi Guha [EMAIL PROTECTED] http://jijo.cjb.net
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
---
Q: What do you get when you cross a mosquito with a mountain climber?
A: Nothing. You can't cross a vector with a scaler.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] frechet distance

2006-06-22 Thread Spencer Graves
see inline

Rajarshi Guha wrote:
 On Thu, 2006-06-22 at 19:52 -0700, Spencer Graves wrote:
RSiteSearch("Frechet distance") returned only one hit for me just 
 now, and that was for a Frechet distribution, as you mentioned.  Google 
 found www.cs.concordia.ca/cccg/papers/39.pdf, which suggests that 
 computing it may not be easy.
 
 In addition, from what I have read it is supposed to be NP-hard. However
 my problems are usually small. 
 
 1.  If you absolutely need the Frechet distance and you can describe 
 an algorithm for computing it but get stuck writing a function for it, 
 please submit another question outlining what you've tried and that 
 obstacle you've found.

2.  Alternatively, you might describe the more general problem you 
 are trying to solve, why you thought the Frechet distance might help and 
 invite alternative suggestions.
 
 I have some curves (in the form of points) which are sigmoidal. I also
 have some curves which look like 2 sigmoid curves joined head to tail.
 Something like
 
 
  -
   /
  /
--
   /
  /
 /
  --
 
  If all curves are of this form, it suggests they are continuous and 
monotonically increasing.  Is that correct?  If yes, isn't the Frechet 
distance then just the max of the distances between the two curves at 
the end points?

  I don't see a complete proof yet, but it would seem that the 
difficulties in computing Frechet distances would come from functions 
that are not monotonic and / or discontinuous.

  This brings me back to my second question:  What's the larger problem 
you want to solve, for which you think the Frechet distance might help?

  Hope this helps.
  Spencer Graves


 Now these curves can vary: in some cases the initial lower tail might be
 truncated.  For the 'stepwise' sigmoidal curves, the middle step might
 not be horizontal but inclined to some degree and so on.
 
 My goal is to be given a set of points representing a curve and try and
 identify whether it is of the standard sigmoid form or of the 'stepwise'
 sigmoid form.
 
 My plan was to generate a 'canonical sigmoid curve' via the logistic
 equation and then perform a curve matching operation. Thus for a
 supplied curve that is sigmoid in nature it will match the 'canonical
 curve' to a better extent than would a curve that is stepwise in nature.
 (The matching is performed after applying a Procrustes transformation)
 
 Initially I tried using the Hausdorff distance, but this does not take
 into account the ordering of the points in the curve and did not always
 give a conclusive answer. A number of references (including the one
 above) indicate that the Frechet distance is better suited for curve
 matching problems.
 
 As you noted, evaluating the Frechet distance is non-trivial and the
 only code that I could find was some code that is dependent on the CGAL
 (http://www.cgal.org/) library. As far as I could see, CGAL does not
 have any R bindings.
 
 An alternative that I had considered was to to evaluate the distance
 matrix of the points making up the curve and then evaluating the root
 mean square error of the matrix elements for the canonical curve and the
 supplied curve. My initial experiments indicated that this generally
 works but I observed some cases where a stepwise curve matched the
 canonical sigmoid better (ie lower RMSE) than an actual sigmoid curve.
 
 Another alternative is look at a graph of the first derivative of the
 curve. A standard sigmoidal curve will result in a graph with a single
 peak, a stepwise curve like above will result in a graph with 2 peaks.
 Thus this could be reduced to a peak picking problem. The problem is the
 curves I'll get are not smooth and can have small kinks - this leads to
 (usually) quite small peaks in the graph of the first derivative - but
 most of the code that has been described on this list for peak picking
 also picks them up, thus making identification of the curve ambiguous.
 
 To be honest I do not fully understand the algorithm used to evaluate
 the Frechet distance hence my request for code. However, I'm not fixated
 on the Frechet distance :) If there are simpler approaches I'm open to
 them.
 
 Thanks,
 
 ---
 Rajarshi Guha [EMAIL PROTECTED] http://jijo.cjb.net
 GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
 ---
 Q: What do you get when you cross a mosquito with a mountain climber?
 A: Nothing. You can't cross a vector with a scaler.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do 

Re: [R] High breakdown/efficiency statistics -- was RE: Rosner's test [Broadcast]

2006-06-22 Thread Martin Maechler
 AndyL == Liaw, Andy [EMAIL PROTECTED]
 on Thu, 22 Jun 2006 14:21:17 -0400 writes:

AndyL What would be nice is to have something like a
AndyL robust task view...  Andy

Indeed.  I volunteered to do this a while ago, but first wanted to get
robustbase into a reasonable state.

A 'robust' CRAN task view is planned to be done by the end of
the summer, hopefully even earlier.

Martin

AndyL From: Berton Gunter
  Many thanks for this Martin. There now are several
 packages with what appear to be overlapping functions (or
 at least algorithms). Besides those you mentioned,
 robust and roblm are at least two others. Any
 recommendations about how or whether to choose among
 these for us enthusiastic but non-expert users?
 
 
 Cheers, Bert
 
 
  -Original Message-  From: Martin Maechler
 [mailto:[EMAIL PROTECTED]  Sent: Thursday,
 June 22, 2006 2:04 AM  To: Berton Gunter  Cc: 'Robert
 Powell'; r-help@stat.math.ethz.ch  Subject: Re: [R]
 Rosner's test
  
   BertG == Berton Gunter [EMAIL PROTECTED]
   on Tue, 13 Jun 2006 14:34:48 -0700 writes:
  
  BertG RSiteSearch('Rosner') ?RSiteSearch or search
 directly  BertG from CRAN.
  
  BertG Incidentally, I'll repeat what I've said 
 BertG before. Don't do outlier tests.  They're  BertG
 dangerous. Use robust methods instead.
  
  Yes, yes, yes!!!
  
  Note that rlm() or cov.rob() from recommended package
 MASS will most  probably be sufficient for your needs.
  
  For slightly newer methodology, look at package
 'robustbase', or also  'rrcov'.
  
  Martin Maechler, ETH Zurich
  
  BertG -- Bert Gunter Genentech Non-Clinical Statistics
  BertG South San Francisco, CA
   
  BertG The business of the statistician is to catalyze
 the  BertG scientific learning process.  - George
 E. P. Box
   
   
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
 read the posting guide!
 http://www.R-project.org/posting-guide.html
 
 


AndyL 
--
AndyL Notice: This e-mail message, together with any
AndyL attachments, contains information of Merck  Co.,
AndyL Inc. (One Merck Drive, Whitehouse Station, New
AndyL Jersey, USA 08889), and/or its affiliates (which may
AndyL be known outside the United States as Merck Frosst,
AndyL Merck Sharp  Dohme or MSD and in Japan, as Banyu)
AndyL that may be confidential, proprietary copyrighted
AndyL and/or legally privileged. It is intended solely for
AndyL the use of the individual or entity named on this
AndyL message.  If you are not the intended recipient, and
AndyL have received this message in error, please notify us
AndyL immediately by reply e-mail and then delete it from
AndyL your system.
AndyL 
--

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Time series labeling with Zoo

2006-06-22 Thread Gabor Grothendieck
When the axis labelling does not work well you will have to do it yourself,
like this.  The plot statement is instructed not to draw the x axis; then
we extract into tt all the dates whose day of the month is 1 and draw the
axis manually at those positions.

library(zoo)
set.seed(1)
z <- zoo(runif(10), as.Date("2005-06-01") + 0:380)

plot(z, xaxt = "n")
tt <- time(z)[as.POSIXlt(time(z))$mday == 1]
axis(1, tt, format(tt, "%d%b%y"))

On 6/22/06, Gad Abraham [EMAIL PROTECTED] wrote:
 Hi,

 I'm using zoo because it can automatically label the months of a time
 series composed of daily observations.

 This works well for certain time series lengths, but not for others, e.g.:

 While:

   library(zoo)
    plot(zoo(runif(10), as.Date("2005-06-01") + 0:50))

 Shows up the months and day of month,

    plot(zoo(runif(10), as.Date("2005-06-01") + 0:380))

 results in a single label 2006 in the middle of the x axis, with no
 months labelled, although there's plenty of room for labels.


 How do I get zoo to label the months too, without having to manually
 work out which day was the first day of each month?


 I'm using zoo 1.0-3 with R 2.2.1.

 Thanks,
 Gad

 --
 Gad Abraham
 Department of Mathematics and Statistics
 University of Melbourne
 Parkville 3010, Victoria, Australia
 email: [EMAIL PROTECTED]
 web: http://www.ms.unimelb.edu.au/~gabraham

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html