Re: [R] thanks

2005-01-12 Thread Lefebure Tristan
see man R

example from a shell:

echo -e "pdf(file=\"test.pdf\")\nplot(1:10,11:20)\ndev.off(dev.cur())\n" > cmd.R
R -s < cmd.R

(write a file of commands for R, and then feed R with it)

On Tuesday 11 January 2005 15:59, Cserháti Mátyás wrote:
 Dear all,

 Thanks to those 3 people who sent me answers to my question. Got
 the problem solved. Great!

 Now, another question of mine is:

 I would like to run an R script from the Linux prompt. Is there any way
 possible to do this? The reason is, the calculation that I'm doing takes a
 few hours, and I would like to automate it.

 Or does it mean that I have to run source within the R prompt?

 Or is there a way to do the automation within the R prompt?

 Thanks, Matthew

 P.S. Thanks, Zoli!

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html

-- 

Tristan LEFEBURE
Laboratoire d'écologie des hydrosystèmes fluviaux (UMR 5023)
Université Lyon I - Campus de la Doua
Bat. Darwin C 69622 Villeurbanne - France

Phone: (33) (0)4 26 23 44 02
Fax: (33) (0)4 72 43 15 23

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] (no subject)

2005-01-12 Thread Bigby
Hello,

My name is Graham. I am an engineering student and my lecturer wishes us to
get a numerical summary of some data. He said to use the command
numerical.summary(Data), which didn't work; he suggested we try library(s20x)
first, which came up with an error on my console. I have version 2.0.1 of R
and I don't understand what to do. As this is part of an assignment I would
really appreciate some advice.

Regards

-  Graham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] global objects not overwritten within function

2005-01-12 Thread McGehee, Robert
I would suggest reading the posting guide
(http://www.r-project.org/posting-guide.html) and giving a reproducible
example, with the error message that you received. As is, I have no idea
what you are doing here, and certainly cannot run this code. You use
"..." as an argument to your functions (why, I have no idea), but then
use "..." within your function seemingly to mean "code was omitted" rather
than using your function argument. What "...obj..." means I have
no idea.

One point though. If the f() function does not take any arguments, then
why are you using the special R object "..." as an argument?
Furthermore, why not just do the assignment inside of fct() instead of
calling another function that just runs code? Please reference the
_rest_ of the R Language Definition for information on correct usage of
"...", especially the chapter on functions, and include a script that
can be run from start to finish by anyone at an R prompt without having
to decipher what the missing code does, or what the ellipsis is doing in
your context.

My guess as to what is going on here is that you are trying to use
dynamic scoping, when R uses lexical scoping. If you are an S user, this
will be a change. The f() function is stored in .GlobalEnv, so it is not
aware of any objects stored in the fct() environment. When you run f(),
it's looking for an obj object in its environment, probably can't find
one, and then looks for the obj object in the global environment. If
it finds it there, it assigns it to itself, basically doing nothing.
Once again, the R Language Definition will explain this. You could solve
this by either embedding the f() function inside of fct(), passing in the
obj object instead of relying on dynamic scoping (which R doesn't use), or
probably preferably, not having an f() function at all, as all it does is
call another function.

Also, I'd reference ?"<<-" for perhaps a cleaner way of doing global
assignments. Using this alone may solve your problems, as it may force
you to scope your code correctly.
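
For instance, a minimal sketch (not from the original message) contrasting
an explicit assign() with the "<<-" operator:

f1 <- function(x) assign("obj", x, envir = .GlobalEnv)  # explicit target environment
f2 <- function(x) obj <<- x  # searches enclosing frames, then assigns in .GlobalEnv
f1(1); obj  # [1] 1
f2(2); obj  # [1] 2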


-Original Message-
From: bogdan romocea [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, January 11, 2005 9:26 AM
To: r-help@stat.math.ethz.ch
Subject: [R] global objects not overwritten within function


Dear useRs,

I have a function that creates several global objects with
assign("obj", obj, .GlobalEnv), and which I need to run iteratively in
another function. The code is similar to

f <- function(...) {
    assign("obj", obj, .GlobalEnv)
}
fct <- function(...) {
    for (i in 1:1000)
    {
    ...
    f(...)
    ...obj...
    rm(obj) # code fails without this line
    }
}

I don't understand why f(), when run in a for() loop inside fct(), does
not overwrite the global object 'obj'. If I don't delete 'obj' after I
use it, the code fails - the same objects created by the first
iteration are used by subsequent iterations. 

I checked ?assign and the Evaluation chapter in 'R Language Definition'
but still don't understand why the above happens. Can someone briefly
explain or suggest something I should read? By the way, I don't want to
use 'better' techniques (lists, functions that return values instead of
creating global objects etc) - I want to create global objects with f()
and overwrite them again and again within fct().

Thank you,
b.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Graphical table in R

2005-01-12 Thread Marc Schwartz
On Tue, 2005-01-11 at 14:59 +0000, Dan Bolser wrote:
 On 10 Jan 2005, Peter Dalgaard wrote:
 
 Dan Bolser [EMAIL PROTECTED] writes:
 
 Cheers. This is really me just being lazy (as usual). The latex function
 in Hmisc allows me to make a .ps file then grab a screen shot of that ps
 and make a .png file.
 
 I would just like to use plot so I can wrap it in a png command and not
 have to use the 'screen shot' in between.
 
 A screen shot of a ps file? That sounds ... weird. If you can view it,
 presumably you have Ghostscript, and that can do png files.
 
 The thing is the ps file has the wrong size, so I end up with a small
 table in the corner of a big white page (using ImageMagick's convert
 function).
 
 I haven't tried Ghostscript (don't know the command).
 
 I could set the paper size correctly if I knew the size of my table, but I
 don't know how to calculate that beforehand and feed it into the latex
 commands (Hmisc).
 
 Seems like I should roll my own table with the plot command and
 'primitives' (like demo(mathplot)) - I just hoped that someone had
 already done the hard work for me and I could type something like...
 
 plot.xtable(x)
 
 x = any R object that makes sense to have as tabular output.
 
 Seems like such a function done correctly could be useful for helping
 people write up (hem) analysis.
 
 Thanks again for the help everyone.
 
 Dan.

Dan,

I think that taking Peter's/Thomas' solution provides a substantial
level of flexibility in formatting. I wish that I had thought of that
approach... :-)

For example:

  plot(1:10, type = "n")

  txt <- capture.output(ftable(UCBAdmissions))

  par(family = "mono")

  text(4, 8, paste(txt, collapse = "\n"))

  text(4, 6, paste(txt, collapse = "\n"), cex = 0.75)

  text(4, 4, paste(txt, collapse = "\n"), cex = 0.5)


Using par(cex) in the call to text() and modifying the x,y coordinates
will enable you to place the table anywhere within the plot region and
also adjust the overall size of the table by modifying the font size.

You can also use the 'adj' and 'pos' arguments in the call to text() to
adjust the placement of the table, so rather than being centered on x,y
(the default) it could be moved accordingly. See ?text for more
information.

Finally, you can even put a frame around the table by crudely using
strwidth() and strheight(). Some additional hints on this would be
available by reviewing the code for legend()...

# Do this for the first table (assumes 'cex = 1'):

# Get table width and add 10%
table.w <- max(strwidth(txt)) * 1.1

# Get table height (not including space between rows)
table.h <- sum(strheight(txt))

rect(4 - (table.w / 2), 8 - (table.h),
     4 + (table.w / 2), 8 + (table.h))


It would take some work to combine all of this into a single function,
providing for additional flexibility in positioning, frame line
types/color/width, adjusting for 'cex' and so on. It could be done
though...
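
As a very rough sketch of what such a combined function might look like
(the name, arguments and defaults below are purely illustrative, not an
existing API):

plotTableText <- function(x, cx, cy, cex = 1, pad = 1.1) {
    txt <- capture.output(x)
    op <- par(family = "mono")
    on.exit(par(op))   # restore the previous font family on exit
    text(cx, cy, paste(txt, collapse = "\n"), cex = cex)
    w <- max(strwidth(txt, cex = cex)) * pad
    h <- sum(strheight(txt, cex = cex)) * pad
    rect(cx - w / 2, cy - h, cx + w / 2, cy + h)
}

# e.g.: plot(1:10, type = "n"); plotTableText(ftable(UCBAdmissions), 4, 8)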

This is, in effect, taking an entire R character object and plotting it.

Does that help?

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] RODBC package -- sqlQuery(channel,.....,nullstring=0) still gives NA's

2005-01-12 Thread Luis Rideau Cruz
R-help,

I'm using the RODBC package to retrieve data from an ODBC database which
contains NA's.

By using the argument nullstring = "0" in sqlQuery() I expect to
coerce them to numeric but still get NA's in my select.

I'm running on Windows XP

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R


Thank you

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Meeker's SPLIDA Reliability in R

2005-01-12 Thread Cunningham, Colin A
Hi,

Would anyone be aware of an R package implementing the functionality
found in Meeker's SPLIDA software written for S-Plus?  I don't know if
anyone has tried to port the s/w to R directly, or if equivalent
functions are available within another package.

Thanks in Advance.

- Colin

Colin Cunningham
D1C Ramp Statistician
Intel Corporation
971.214.6623
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] thanks

2005-01-12 Thread Jan T. Kim
On Tue, Jan 11, 2005 at 03:59:58PM +0100, Cserháti Mátyás wrote:

 I would like to run an R script from the Linux prompt. Is there any way 
 possible to do this? The reason is, the calculation that I'm doing takes a 
 few hours, and I would like to automate it.
 
 Or does it mean that I have to run source within the R prompt?
 
 Or is there a way to do the automation within the R prompt?

The standard way (well, my usual way, anyway) is to just use I/O
redirection:

linux> R --vanilla < stuff.r

is, for the most part (see below), equivalent to

linux> R
> source("stuff.r");

The --vanilla option is necessary to suppress any interactive questions
concerning workspace saving (i.e. the "Save workspace image? [y/n/c]"
thing); differences between the automated and the interactive form may
be due to your script depending on some saved environment, or some
stuff in your init files.

I'd like to encourage you to automate your calculations, as this enhances
not only convenience but also reproducibility of your results.

Best regards, Jan
-- 
 +- Jan T. Kim ---+
 |*NEW*email: [EMAIL PROTECTED]   |
 |*NEW*WWW:   http://www.cmp.uea.ac.uk/people/jtk |
 *-=  hierarchical systems are for files, not for humans  =-*

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] global objects not overwritten within function

2005-01-12 Thread Prof Brian Ripley
On Tue, 11 Jan 2005, bogdan romocea wrote:
Dear useRs,
I have a function that creates several global objects with
assign("obj", obj, .GlobalEnv), and which I need to run iteratively in
another function. The code is similar to
f <- function(...) {
    assign("obj", obj, .GlobalEnv)
}
fct <- function(...) {
    for (i in 1:1000)
    {
    ...
    f(...)
    ...obj...
    rm(obj) # code fails without this line
    }
}
I don't understand why f(), when run in a for() loop inside fct(), does
not overwrite the global object 'obj'. If I don't delete 'obj' after I
use it, the code fails - the same objects created by the first
iteration are used by subsequent iterations.
I checked ?assign and the Evaluation chapter in 'R Language Definition'
but still don't understand why the above happens. Can someone briefly
explain or suggest something I should read? By the way, I don't want to
use 'better' techniques (lists, functions that return values instead of
creating global objects etc) - I want to create global objects with f()
and overwrite them again and again within fct().
Since you are not using "..." in the sense it is used in R, we have little
idea of what your real code looks like and so what it does.

Can you please give a small real example that fails?  Here is one that
works, yet has all the features I can deduce from your non-code:

f <- function(x) assign("obj", x, pos = .GlobalEnv)
fct <- function()
{
    for (i in 1:2) {
        x <- i + 3
        f(x)
        print(obj)
    }
}
> fct()
[1] 4
[1] 5
> obj
[1] 5
--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] help on integrate function

2005-01-12 Thread Francisca xuan
here is a function I wrote
cdfest = function(t, lambda, delta, x, y) {
    a1 = mean(x <= t)
    a2 = mean(x <= t - delta)
    a3 = mean(y1 <= t)
    s = ((1 - lambda) * a1 + lambda * a2 - a3)^2
    s
}
when I try to integrate over t, I got this message:
integrate(cdfest,0,4,lambda=0.3,delta=1,x=x,y=y1)
Error in integrate(cdfest, 0, 4, lambda = 0.3, delta = 1, x = x, y = y1) :
   evaluation of function gave a result of wrong length
but the function is definitely one-dimensional. What is wrong?
Any suggestions are welcome. Thanks.
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] useR 2005 ?

2005-01-12 Thread Rau, Roland
Dear R-Help-List,

are there any plans to organize a useR conference in 2005?

Best,
Roland



+
This mail has been sent through the MPI for Demographic Rese...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] thanks

2005-01-12 Thread Uwe Ligges
Cserháti Mátyás wrote:
Dear all,
Thanks to those 3 people who sent me answers to my question. Got 
the problem solved. Great!

Now, another question of mine is:
I would like to run an R script from the Linux prompt. Is there any way 
possible to do this? The reason is, the calculation that I'm doing takes a 
few hours, and I would like to automate it.

Or does it mean that I have to run source within the R prompt?
Or is there a way to do the automation within the R prompt?
Thanks, Matthew
P.S. Thanks, Zoli!
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
See
a) the manual "An Introduction to R", Appendix B
b) inside R, type:  ?BATCH
c) outside R, type: R CMD BATCH --help
Uwe Ligges
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] getting variable names from formula

2005-01-12 Thread Daniel Almirall
R-list,

1.  Given a formula (f) with variables referencing some data set (dat), is
there any easier/faster way than this to get the names (in character form)
of the variables on the RHS of '~' ?

 dat <- data.frame(x1 = x1 <- rnorm(100,0,1), x2 = x2 <- rnorm(100,0,1),
                   y = x1 + x2 + rnorm(100,0,1))

 f <- y ~ x1 + x2

 mf <- model.frame(f, data = dat)

 mt <- attr(mf, "terms")

 predvarnames <- attr(mt, "term.labels")

 predvarnames
[1] "x1" "x2"

-

2.  Also, is there an easy/fast way to do it, without having the data set
(dat) available?  That is, not using 'model.frame' which requires 'data'?
I understand that one approach for this is to use the way formulas are
stored as 'list's.  For example, this works

 predvarnames <- character()

 for (i in 2:length(f[[3]])) {

     predvarnames <- c(predvarnames, as.character(f[[3]][[i]]))

 }

 predvarnames
[1] "x1" "x2"

but is there a better way?

Thanks,
Danny

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Graphical table in R

2005-01-12 Thread Dan Bolser
On Tue, 11 Jan 2005, Marc Schwartz wrote:

On Tue, 2005-01-11 at 14:59 +0000, Dan Bolser wrote:
 On 10 Jan 2005, Peter Dalgaard wrote:
 
 Dan Bolser [EMAIL PROTECTED] writes:
 
  Cheers. This is really me just being lazy (as usual). The latex function
  in Hmisc allows me to make a .ps file then grab a screen shot of that ps
  and make a .png file.
  
  I would just like to use plot so I can wrap it in a png command and not
  have to use the 'screen shot' in between.
 
 A screen shot of a ps file? That sounds ... weird. If you can view it,
 presumably you have Ghostscript, and that can do png files.
 
  The thing is the ps file has the wrong size, so I end up with a small
  table in the corner of a big white page (using ImageMagick's convert
  function).
  
  I haven't tried Ghostscript (don't know the command).
  
  I could set the paper size correctly if I knew the size of my table, but
  I don't know how to calculate that beforehand and feed it into the latex
  commands (Hmisc).
  
  Seems like I should roll my own table with the plot command and
  'primitives' (like demo(mathplot)) - I just hoped that someone had
  already done the hard work for me and I could type something like...
  
  plot.xtable(x)
  
  x = any R object that makes sense to have as tabular output.
  
  Seems like such a function done correctly could be useful for helping
  people write up (hem) analysis.
  
  Thanks again for the help everyone.
  
  Dan.

Dan,

I think that taking Peter's/Thomas' solution provides a substantial
level of flexibility in formatting. I wish that I had thought of that
approach... :-)

For example:

  plot(1:10, type = "n")

  txt <- capture.output(ftable(UCBAdmissions))

  par(family = "mono")

  text(4, 8, paste(txt, collapse = "\n"))

  text(4, 6, paste(txt, collapse = "\n"), cex = 0.75)

  text(4, 4, paste(txt, collapse = "\n"), cex = 0.5)


Using par(cex) in the call to text() and modifying the x,y coordinates
will enable you to place the table anywhere within the plot region and
also adjust the overall size of the table by modifying the font size.

You can also use the 'adj' and 'pos' arguments in the call to text() to
adjust the placement of the table, so rather than being centered on x,y
(the default) it could be moved accordingly. See ?text for more
information.

Finally, you can even put a frame around the table by crudely using
strwidth() and strheight(). Some additional hints on this would be
available by reviewing the code for legend()...

# Do this for the first table (assumes 'cex = 1'):

# Get table width and add 10%
table.w <- max(strwidth(txt)) * 1.1

# Get table height (not including space between rows)
table.h <- sum(strheight(txt))

rect(4 - (table.w / 2), 8 - (table.h),
     4 + (table.w / 2), 8 + (table.h))


It would take some work to combine all of this into a single function,
providing for additional flexibility in positioning, frame line
types/color/width, adjusting for 'cex' and so on. It could be done
though...

This is, in effect, taking an entire R character object and plotting it.

Does that help?

It certainly fits the bill. I will give it a go, but I may stick with the
latex() functions in Hmisc.

Thanks for all the help, it is a really elegant solution in the end :)

Dan.



Marc



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Standard error for the area under a smoothed ROC curve?

2005-01-12 Thread Dan Bolser

Hello, 

I am making some use of ROC curve analysis. 

I find much help on the mailing list, and I have used the Area Under the
Curve (AUC) functions from the ROC function in the bioconductor project...

http://www.bioconductor.org/repository/release1.5/package/Source/
ROC_1.0.13.tar.gz 

However, I read here...

http://www.medcalc.be/manual/mpage06-13b.php

"The 95% confidence interval for the area can be used to test the
hypothesis that the theoretical area is 0.5. If the confidence interval
does not include the 0.5 value, then there is evidence that the laboratory
test does have an ability to distinguish between the two groups (Hanley &
McNeil, 1982; Zweig & Campbell, 1993)."

But aside from early on, the above article is short on details. Can anyone
tell me how to calculate the CI of the AUC calculation?


I read this...

http://www.bioconductor.org/repository/devel/vignette/ROCnotes.pdf

Which talks about resampling (by showing R code), but I can't understand
what is going on, or what is calculated (the example given is specific to
microarray analysis I think).

I think a general AUC CI function would be a good addition to the ROC
package.




One more thing: in calculating the AUC, I see the splines function is
recommended over the approx function. Here...

http://tolstoy.newcastle.edu.au/R/help/04/10/6138.html

How would I rewrite the following AUC functions (adapted from bioconductor
source) to use splines (or approxfun or splinefun) ...

> spe # Specificity
 [1] 0.02173913 0.13043478 0.21739130 0.32608696 0.43478261 0.54347826
 [7] 0.65217391 0.76086957 0.89130435 1.00000000 1.00000000 1.00000000
[13] 1.00000000

> sen # Sensitivity
 [1] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.9302326 0.8139535
 [8] 0.6976744 0.5581395 0.4418605 0.3488372 0.2325581 0.1162791

trapezint(1-spe,sen)
my.integrate(1-spe,sen)

## Functions
## Nicked (and modified) from the ROC function in bioconductor.
trapezint <-
function (x, y, a = 0, b = 1)
{
    if (x[1] > x[length(x)]) {
        x <- rev(x)
        y <- rev(y)
    }
    y <- y[x >= a & x <= b]
    x <- x[x >= a & x <= b]
    if (length(unique(x)) < 2)
        return(NA)
    ya <- approx(x, y, a, ties = max, rule = 2)$y
    yb <- approx(x, y, b, ties = max, rule = 2)$y
    x <- c(a, x, b)
    y <- c(ya, y, yb)
    h <- diff(x)
    lx <- length(x)
    0.5 * sum(h * (y[-1] + y[-lx]))
}

my.integrate <-
function (x, y, t0 = 1)
{
    f <- function(j) approx(x, y, j, rule = 2, ties = max)$y
    integrate(f, 0, t0)$value
}
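
For the splines question, one possible variant is sketched below (just an
illustration, not tested against the ROC package; note that splinefun()
needs distinct x values, so the tied 1.0000000 entries in 'spe' would have
to be collapsed first):

my.spline.integrate <- function(x, y, t0 = 1) {
    # interpolate with a natural cubic spline instead of linearly
    f <- splinefun(x, y, method = "natural")
    integrate(f, 0, t0)$value
}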





Thanks for any pointers,
Dan.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] help on integrate function

2005-01-12 Thread Prof Brian Ripley
On Tue, 11 Jan 2005, Francisca xuan wrote:
here is a function I wrote
cdfest = function(t, lambda, delta, x, y) {
    a1 = mean(x <= t)
    a2 = mean(x <= t - delta)
    a3 = mean(y1 <= t)
    s = ((1 - lambda) * a1 + lambda * a2 - a3)^2
    s
}
when I try to integrate over t, I got this message:
integrate(cdfest,0,4,lambda=0.3,delta=1,x=x,y=y1)
Error in integrate(cdfest, 0, 4, lambda = 0.3, delta = 1, x = x, y = y1) :
  evaluation of function gave a result of wrong length
but the function is definitely one-dimensional. What is wrong?
Please read the help page:

    f: an R function taking a numeric first argument and returning a
       numeric vector of the same length.  Returning a non-finite
       element will generate an error.

Your function does not do that: it returns a scalar for a vector input of
length > 1, as the message clearly says.
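
One way around this (a sketch, not part of the original reply) is to
vectorize cdfest over its first argument before handing it to integrate():

cdfest.vec <- function(t, ...) sapply(t, cdfest, ...)
integrate(cdfest.vec, 0, 4, lambda = 0.3, delta = 1, x = x, y = y1)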


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Breslow Day Test

2005-01-12 Thread Palos, Judit
Breslow-Day test 
A statistical test for the homogeneity of odds ratios.
 

Homogeneity

In  javascript:void(0); systematic reviews homogeneity refers to the
degree to which the results of studies included in a review are similar.
Clinical homogeneity means that, in studies included in a review, the
participants, interventions and outcome measures are similar or comparable.
Studies are considered statistically homogeneous if their results vary no
more than might be expected by the play of chance. See
javascript:void(0); heterogeneity.
 
 

Odds ratio (OR)

The ratio of the odds of an event in the experimental (intervention) group
to the odds of an event in the control group. Odds are the ratio of the
number of people in a group with an event to the number without an event.
Thus, if a group of 100 people had an event rate of 0.20, 20 people had
the event and 80 did not, and the odds would be 20/80 or 0.25. An odds
ratio of one indicates no difference between comparison groups. For
undesirable outcomes an OR that is less than one indicates that the
intervention was effective in reducing the risk of that outcome. When the
event rate is small, odds ratios are very similar to relative risks.
 
 
http://www.cochrane.dk/cochrane/handbook/contents.htm
 
Bye,
Judit
 
 

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] getting variable names from formula

2005-01-12 Thread Douglas Bates
Daniel Almirall wrote:
R-list,
1.  Given a formula (f) with variables referencing some data set (dat), is
there any easier/faster way than this to get the names (in character form)
of the variables on the RHS of '~' ?
 dat <- data.frame(x1 = x1 <- rnorm(100,0,1), x2 = x2 <- rnorm(100,0,1),
                   y = x1 + x2 + rnorm(100,0,1))
 f <- y ~ x1 + x2
 mf <- model.frame(f, data = dat)
 mt <- attr(mf, "terms")
 predvarnames <- attr(mt, "term.labels")

predvarnames
[1] "x1" "x2"
-
2.  Also, is there an easy/fast way to do it, without having the data set
(dat) available?  That is, not using 'model.frame' which requires 'data'?
I understand that one approach for this is to use the way formulas are
stored as 'list's.  For example, this works
 predvarnames <- character()
 for (i in 2:length(f[[3]])) {
     predvarnames <- c(predvarnames, as.character(f[[3]][[i]]))
 }

predvarnames
[1] "x1" "x2"
but is there a better way?
Thanks,
Danny
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
That's exactly what the all.vars function does.  If you apply it to the
formula you get all the names of variables referenced in the formula.
If you only want the right-hand side, then apply it to the third
component of the formula:

> f <- y ~ x1 + x2
> all.vars(f)
[1] "y"  "x1" "x2"
> all.vars(f[[3]])
[1] "x1" "x2"
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] transcan() from Hmisc package for imputing data

2005-01-12 Thread avneet singh
Hello:

I have been trying to impute missing values of a data
frame which has both numerical and categorical values
using the function transcan() with little luck.

Would you be able to give me a simple example where a
data frame is fed to transcan and it spits out a new
data frame with the NA values filled up?

Or is there any other function that i could use?

Thank you
avneet

=
I believe in equality for everyone, except reporters and photographers.
~Mahatma Gandhi

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] getting variable names from formula

2005-01-12 Thread Dimitris Rizopoulos
maybe something like:

f <- y ~ x1 + x2
attr(terms(f), "term.labels")

but this won't work if you have a more complex formula (e.g., f <- y ~
x1*x2 + I(x1^2)) and you want only c("x1", "x2").
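
For example (a quick illustration; all.vars is the variables-only
alternative mentioned elsewhere in this thread):

f2 <- y ~ x1*x2 + I(x1^2)
attr(terms(f2), "term.labels")  # "x1" "x2" "I(x1^2)" "x1:x2"
all.vars(f2[[3]])               # "x1" "x2"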

I hope it helps.
Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
- Original Message - 
From: Daniel Almirall [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, January 11, 2005 9:55 PM
Subject: [R] getting variable names from formula


R-list,
1.  Given a formula (f) with variables referencing some data set (dat), is
there any easier/faster way than this to get the names (in character form)
of the variables on the RHS of '~' ?

dat <- data.frame(x1 = x1 <- rnorm(100,0,1), x2 = x2 <-
rnorm(100,0,1), y = x1 + x2 + rnorm(100,0,1))

f <- y ~ x1 + x2
mf <- model.frame(f, data = dat)
mt <- attr(mf, "terms")
predvarnames <- attr(mt, "term.labels")
predvarnames
[1] "x1" "x2"
-
2.  Also, is there an easy/fast way to do it, without having the data set
(dat) available?  That is, not using 'model.frame' which requires 'data'?
I understand that one approach for this is to use the way formulas are
stored as 'list's.  For example, this works

predvarnames <- character()
for (i in 2:length(f[[3]])) {
    predvarnames <- c(predvarnames, as.character(f[[3]][[i]]))
}
predvarnames
[1] "x1" "x2"
but is there a better way?
Thanks,
Danny
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] (no subject)

2005-01-12 Thread Kevin Wang
Hi,

On Wed, 12 Jan 2005, Bigby wrote:

 Hello,

 numerical.summary(Data), which didn't work; he suggested we try library(s20x)
 first, which came up with an error on my console. I have version 2.0.1 of R

s20x is a package written by the Department of Statistics at the
University of Auckland.  It is used for their STATS 201/208 courses.  It
is not on CRAN.  You may need to contact them for it.

But you can get most of it using other commands.  From memory, it simply
combines several other R functions, such as summary(), quantile(), etc.
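
For instance, a rough stand-in (purely illustrative; this is not the
actual s20x function) could be built from those pieces:

numSummary <- function(x) c(summary(x), sd = sd(x), n = length(x))
numSummary(rnorm(100))  # min/quartiles/mean/max plus sd and n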

HTH,

Kevin


Ko-Kang Kevin Wang
PhD Student
Centre for Mathematics and its Applications
Building 27, Room 1004
Mathematical Sciences Institute (MSI)
Australian National University
Canberra, ACT 0200
Australia

Homepage: http://wwwmaths.anu.edu.au/~wangk/
Ph (W): +61-2-6125-2431
Ph (H): +61-2-6125-7407
Ph (M): +61-40-451-8301

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Destructor for S4 objects?

2005-01-12 Thread Adam Lyon
Hi Robert,

It looks like there is no way to explicitly make an S4 object call a
function when it is garbage collected unless you resort to tricks with
reg.finalizer.

It turns out that Prof. Ripley's reply (thanks!!) had enough hints in it
that I was able to get the effect I wanted by using R's external pointer
facility. In fact it works quite nicely.

In a nutshell, I create a C++ object (with new) and then wrap its pointer
with an R external pointer using

SEXP rExtPtr = R_MakeExternalPtr( cPtr, aTag, R_NilValue );

where cPtr is the C++/C pointer to the object and aTag is an R symbol
describing the pointer type [e.g. SEXP aTag =
install("this_is_a_tag_for_a_pointer_to_my_object")]. The final argument is
a value to protect. I don't know what this means, but all of the examples
I saw use R_NilValue.

If you want a C++ function to be called when R loses the reference to the
external pointer (actually when R garbage collects it, or when R quits), do
R_RegisterCFinalizerEx( rExtPtr, (R_CFinalizer_t)functionToBeCalled, TRUE );

The TRUE means that R will call the functionToBeCalled if the pointer is
still around when R quits. I guess if you set it to FALSE, then you are
assuming that your shell can delete memory and/or release resources when R
quits. 

So return this external pointer to R (the function that new'ed it was called
by .Call or something similar) and stick it in a slot of your object. Then
when your object is garbage collected, functionToBeCalled will be called.
The slot would have the type externalptr.
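
On the R side, a minimal sketch of such a class might look like this (the
.Call entry point "create_my_thing" is hypothetical):

setClass("MyThing", representation(ptr = "externalptr"))
newMyThing <- function()
    new("MyThing", ptr = .Call("create_my_thing"))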

The functionToBeCalled contains the code to delete the C++ pointer or
release resources, for example...

SEXP functionToBeCalled( SEXP rExtPtr ) {
  // Get the C++ pointer (R_ExternalPtrAddr returns void*, so cast it)
  MyThing* ptr = (MyThing*) R_ExternalPtrAddr(rExtPtr);

  // Delete it
  delete ptr;

  // Clear the external pointer
  R_ClearExternalPtr(rExtPtr);

  return R_NilValue;
}

And there you have it.

There doesn't seem to be any official documentation on this stuff (at least
none that I could find). The best references I found are on the R developers
web page. See the links within "some notes on references, external
objects, or mutable state for R" and "a simple implementation of external
references and finalization". Note that the documents are slightly out of
date (the function names have apparently been changed somewhat). The latter
one has some examples that are very helpful. And as Prof. Ripley pointed
out, RODBC uses this facility too, so look at that code.

Hope this was useful. Good luck.

--- Adam

Adam Lyon (lyon-at-fnal.gov)
Fermi National Accelerator Laboratory
Computing Division / D0 Experiment

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RODBC package -- sqlQuery(channel,.....,nullstring=0) stillgives NA's

2005-01-12 Thread Prof Brian Ripley
PLEASE do read the help page, which says

nullstring: character string to be used when reading 'SQL_NULL_DATA'
            character items from the database.
            ^^^^^^^^^

so this does not apply to numeric items.
You can of course easily change numeric NAs to 0s, if you want to.
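
For example (illustrative, with 'dat' being the data frame returned by
sqlQuery()):

dat[is.na(dat)] <- 0   # replace all NAs in the result with 0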
On Wed, 12 Jan 2005, Luis Rideau Cruz wrote:
There is something strange in R behaviour (perhaps).
Your negative remarks are not appreciated.
I have run the same select in Oracle SQL*Plus (version 10.1.0.2.0) and
the output comes out with NULLs (which is what it ought to be).
But in R I still get the same result with NAs (no matter whether I use
the na.strings or nullstring arguments).
An output example follows below:
Using na.strings = "0" and nullstring = "0" (sorry about the indents):
  Length 2003 2002 2001 2000 1999 1998 1997 1996 1995
1     32   NA    1   NA   NA   NA   NA   NA    2   NA
2     34    3   NA   NA   NA   NA   NA   NA    6   NA
3     35   NA   NA   NA   NA    2   NA   NA   NA   NA
4     36   NA   12   NA   NA   10   NA   NA    1   NA
5     37    3    3   NA   NA    4   NA   NA   31   NA
6     38    2    4    1    1   12    6   NA   11   NA
7     39    4   13    5    5   34    8   NA   58   13

And the same select in SQL*Plus, with NULLs shown as blanks:

  Length 2003 2002 2001 2000 1999 1998 1997 1996 1995
      32    1                                  2
      34    3                                  6
      35                        2
      36        12             10              1
      37    3    3              4             31
      38    2    4    1    1   12    6        11
      39    4   13    5    5   34    8        58   13
Best,
Luis

Prof Brian Ripley [EMAIL PROTECTED] 12/01/2005 09:14:22 
On Tue, 11 Jan 2005, Luis Rideau Cruz wrote:
R-help,
I'm using the RODBC package to retrieve data from an ODBC database
which contains NA's.
By using the argument nullstring = "0" in sqlQuery() I expect to
coerce them to numeric but still get NA's in my select.
You need to read the help page (as the posting guide asks): it says
na.strings: character string(s) to be mapped to 'NA' when reading
            character data.
which is the opposite of what you are saying you want to do.
An ODBC database cannot contain NA's.  It may contain NULLs, and it may
contain "NA", so we have no idea what you mean.
--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RODBC package -- sqlQuery(channel,.....,nullstring=0)stillgives NA's

2005-01-12 Thread Luis Rideau Cruz

(1) I do read the posting guide (the fact that I misread or
misunderstood something does not imply not reading).

(2) I could change NAs to 0 (I know), but I have previously (with older
versions of R and SQL*Plus) used the same select with the right output
(namely with 0s).

(3) AFAIK "strange" is not a negative remark, and it does not seem so to
me at the very least, but that is always a matter of taste.

(4) Thank you for your replies, but the door is still open for a
solution to the select that does not coerce NAs to 0s after retrieving
the data.

Best,
Luis 

 Prof Brian Ripley [EMAIL PROTECTED] 12/01/2005 11:21:33 
PLEASE do read the help page, which says

nullstring: character string to be used when reading 'SQL_NULL_DATA'
            character items from the database.
            ^^^^^^^^^
so this does not apply to numeric items.

You can of course easily change numeric NAs to 0s, if you want to.

On Wed, 12 Jan 2005, Luis Rideau Cruz wrote:

 There is something strange in R behaviour (perhaps).

Your negative remarks are not appreciated.

 I have run the same select in Oracle SQL*Plus (version 10.1.0.2.0) and
 the output comes out with NULLs (which is what it ought to be).

 But in R I still get the same result with NAs (no matter whether I use
 the na.strings or nullstring arguments).
 An output example follows below:

 Using na.strings = "0" and nullstring = "0" (sorry about the indents):

   Length 2003 2002 2001 2000 1999 1998 1997 1996 1995
 1     32   NA    1   NA   NA   NA   NA   NA    2   NA
 2     34    3   NA   NA   NA   NA   NA   NA    6   NA
 3     35   NA   NA   NA   NA    2   NA   NA   NA   NA
 4     36   NA   12   NA   NA   10   NA   NA    1   NA
 5     37    3    3   NA   NA    4   NA   NA   31   NA
 6     38    2    4    1    1   12    6   NA   11   NA
 7     39    4   13    5    5   34    8   NA   58   13

 And the same select in SQL*Plus, with NULLs shown as blanks:

   Length 2003 2002 2001 2000 1999 1998 1997 1996 1995
       32    1                                  2
       34    3                                  6
       35                        2
       36        12             10              1
       37    3    3              4             31
       38    2    4    1    1   12    6        11
       39    4   13    5    5   34    8        58   13


 Best,
 Luis


 Prof Brian Ripley [EMAIL PROTECTED] 12/01/2005 09:14:22 
 On Tue, 11 Jan 2005, Luis Rideau Cruz wrote:

 R-help,

 I'm using the RODBC package to retrieve data from an ODBC database
 which contains NA's.

 By using the argument nullstring = "0" in sqlQuery() I expect to
 coerce them to numeric but still get NA's in my select.

 You need to read the help page (as the posting guide asks): it says

 na.strings: character string(s) to be mapped to 'NA' when reading
             character data.

 which is the opposite of what you are saying you want to do.

 An ODBC database cannot contain NA's.  It may contain NULLs, and it
 may contain "NA", so we have no idea what you mean.

 -- 
 Brian D. Ripley,  [EMAIL PROTECTED] 
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/

 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



-- 
Brian D. Ripley,  [EMAIL PROTECTED] 
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/ 
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Breslow Day Test

2005-01-12 Thread Tobias Verbeke
On Tue, 11 Jan 2005 10:45:48 -0500
Palos, Judit [EMAIL PROTECTED] wrote:

 Breslow-Day test 
 A statistical test for the homogeneity of odds ratios.
  
[..some definitions..]

Your message was not particularly clear, but if
you were looking for R code to do a Breslow-Day test,
Google found this for you:

http://www.math.montana.edu/~jimrc/classes/stat524/Rcode/breslowday.test.r

HTH,
Tobias

 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Standard error for the area under a smoothed ROC curve?

2005-01-12 Thread Frank E Harrell Jr
Dan Bolser wrote:
Hello, 

I am making some use of ROC curve analysis. 

I find much help on the mailing list, and I have used the Area Under the
Curve (AUC) functions from the ROC function in the bioconductor project...
http://www.bioconductor.org/repository/release1.5/package/Source/
ROC_1.0.13.tar.gz 

However, I read here...
http://www.medcalc.be/manual/mpage06-13b.php
"The 95% confidence interval for the area can be used to test the
hypothesis that the theoretical area is 0.5. If the confidence interval
does not include the 0.5 value, then there is evidence that the laboratory
test does have an ability to distinguish between the two groups (Hanley &
McNeil, 1982; Zweig & Campbell, 1993)."
But aside from early on, the above article is short on details. Can anyone
tell me how to calculate the CI of the AUC calculation?
I read this...
http://www.bioconductor.org/repository/devel/vignette/ROCnotes.pdf
Which talks about resampling (by showing R code), but I can't understand
what is going on, or what is calculated (the example given is specific to
microarray analysis I think).
I think a general AUC CI function would be a good addition to the ROC
package.

One more thing: in calculating the AUC, I see the splines function is
recommended over the approx function. Here...
http://tolstoy.newcastle.edu.au/R/help/04/10/6138.html
How would I rewrite the following AUC functions (adapted from bioconductor
source) to use splines (or approxfun or splinefun) ...

> spe # Specificity
 [1] 0.02173913 0.13043478 0.21739130 0.32608696 0.43478261 0.54347826
 [7] 0.65217391 0.76086957 0.89130435 1.00000000 1.00000000 1.00000000
[13] 1.00000000

> sen # Sensitivity
 [1] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.9302326 0.8139535
 [8] 0.6976744 0.5581395 0.4418605 0.3488372 0.2325581 0.1162791
trapezint(1-spe,sen)
my.integrate(1-spe,sen)
## Functions
## Nicked (and modified) from the ROC function in bioconductor.
trapezint <-
function (x, y, a = 0, b = 1)
{
    if (x[1] > x[length(x)]) {
        x <- rev(x)
        y <- rev(y)
    }
    y <- y[x >= a & x <= b]
    x <- x[x >= a & x <= b]
    if (length(unique(x)) < 2)
        return(NA)
    ya <- approx(x, y, a, ties = max, rule = 2)$y
    yb <- approx(x, y, b, ties = max, rule = 2)$y
    x <- c(a, x, b)
    y <- c(ya, y, yb)
    h <- diff(x)
    lx <- length(x)
    0.5 * sum(h * (y[-1] + y[-lx]))
}
my.integrate <-
function (x, y, t0 = 1)
{
    f <- function(j) approx(x, y, j, rule = 2, ties = max)$y
    integrate(f, 0, t0)$value
}


Thanks for any pointers,
Dan.
I don't see why the above formulas are being used.  The
Bamber-Hanley-McNeil-Wilcoxon-Mann-Whitney nonparametric method works
great.  Just get the U statistic (concordance probability) used in
Wilcoxon.  As Somers' Dxy rank correlation coefficient is 2*(C - 0.5),
where C is the concordance or ROC area, the Hmisc package function
rcorr.cens uses U statistic methods to get the standard error of Dxy.
You can easily translate this to a standard error of C.
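
A sketch of that translation (illustrative only; 'pred' is a numeric
predictor and 'truth' a binary outcome, both assumed to exist already):

library(Hmisc)
r <- rcorr.cens(pred, truth)
C <- r["C Index"]
se.C <- r["S.D."] / 2          # SE(Dxy)/2, since C = (Dxy + 1)/2
C + c(-1.96, 1.96) * se.C      # approximate 95% CI for the AUC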

Frank
--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] transcan() from Hmisc package for imputing data

2005-01-12 Thread Frank E Harrell Jr
avneet singh wrote:
Hello:
I have been trying to impute missing values of a data
frame which has both numerical and categorical values
using the function transcan() with little luck.
Would you be able to give me a simple example where a
data frame is fed to transcan and it spits out a new
data frame with the NA values filled up?
Or is there any other function that i could use?
Thank you
avneet
It's in the help file for transcan.  But multiple imputation is much 
better, and transcan does not do multiple imputation as well as the 
newer Hmisc function aregImpute.
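
A minimal sketch of aregImpute usage (illustrative data; see the Hmisc
help pages for the real details):

library(Hmisc)
set.seed(1)
d <- data.frame(age = rnorm(50, 50, 10),
                sex = factor(sample(c("m", "f"), 50, replace = TRUE)))
d$age[1:4] <- NA
d$sex[5:6] <- NA
imp <- aregImpute(~ age + sex, data = d, n.impute = 5)
imp$imputed$age   # multiply imputed values for the missing ages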

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [survey] R for Reporting - the R Output MAnager (ROMA) project

2005-01-12 Thread Eric Lecoutre
Hi R UseRs,
I am interested in providing Reporting abilities to R and have initiated a 
project called R Output MAnager (ROMA).
My starting point was my R2HTML package which provides (rough) HTML
exports. I began by trying to mimic it for LaTeX but quickly realized
that it was a bad idea.
Thus, I started a new package again from scratch and spent a lot of
time reading about this topic, looking at what other software do (SAS ODS,
S-PLUS SPXML, ...) and studying technologies and formats: XML+XSLT, LyX,
DocBook, RTF, ...

What follows is a description of my plans. This email is targeted at
interested useRs, in order to get feedback on it.
It comes with a little survey at the end that will help me to
focus my package.
If you are also interested in Reporting (Output, Formats, Exchange, ...),
please read the following and answer the survey.
If not, you can skip this message - apologies for sending it to R-help, I
hope you don't mind.

---

As a matter of fact, I have implemented something that shows promise
(according to me). Currently, from the following output description:

***
 data(iris)
 mm = as.matrix(iris[1:5, 1:4])
 out = emptyContent()
 out = out + Section("A title here")
 out = out + diag(2)
 out = out + Comment("comment: yes!")
 out = out + list(un = 1, pi)
 out = out + "Then a boolean: " + TRUE
 out = out + Section("Default matrix", level = 2)
 out = out + mm
 out = out + Section("Custom matrix" + Footnote("It works!"), level = 2)
 out = out + ROMA(mm, style = "custommatrix",
                  rowstyle = paste("color", row(mm)[,1] %% 2, sep = ""),
                  align = "left")
 out = out + Section("An other title")
 out = out + ROMAgenerated()   # ROMAgenerated is a predefined function
***

You can generate a proper HTML file with the following command:
 Export(out)
(see result: http://www.stat.ucl.ac.be/ROMA/sample.htm)
The same output object could be exported to (tex+dvi+ps+pdf) with:
 Export(out, driver = "latex")
(see result: http://www.stat.ucl.ac.be/ROMA/sample.pdf / Change extension
for other formats: tex and ps)

--- Survey ---
IMPORTANT: PLEASE REPLY ONLY TO ME, NOT TO THE R-HELP MAILING LIST
Simply fill in the questions you want to answer:
1. I am interested in Reporting abilities for R
[ ] Definitely
[ ] Rather Yes
[ ] Rather No
[ ] Not at all

2. I have some knowledge about these different formats / specifications
[ ] rtf  [ ] LaTeX  [ ] LyX
[ ] html [ ] css    [ ] xHTML
[ ] XML  [ ] XSLT   [ ] DocBook

3. I have some knowledge about these tools
[ ] SAS ODS
[ ] S-PLUS SPXML library
[ ] XSLT + XSL-FO chain
4. I would be especially interested in the following formats (multiple
choices possible)
[ ] rtf
[ ] tex
[ ] lyx
[ ] XML, with a DTD specific to R
[ ] XML, with the DTD from SPlus (compatible with SPXML library)
[ ] XML, DocBook flavor
[ ] HTML + css (good xHTML)
[ ] Word (doc)
[ ] OpenOffice (oo)
[ ] Plain text
[ ] Other:

4bis: If several formats, the best (according to me and my needs) one would 
be: 

5. The approach is to fully separate content from formatting. So, XML would
be an ideal output format. Nevertheless, few people who use R may also
master XSLT to produce nicely formatted output. Thus, a way to handle styles
(bold, colors, fonts, etc.) from R would also be great. It may not be a
priority. Statistical output does have some specific issues: mathematics,
complicated tables, graphs, and so on. For each of the following items,
please tell me how important the issue is for you:

0: I don't need that (and think I will never need it)
1: Not really important
...
5: Crucial - I can't live without that point anymore

 5.1 - Being able to read the document in any OS:
 Importance: __
 5.2 - Having an object that describes the output within R (as in the
example), so that I could add/remove things and re-export it
 Importance: __

 5.3 - Being able to define basic formatting also within R (bold,
colors, fonts, and so on)
 Importance: __

 5.4 - Being able to include mathematics, as (La)TeX code or MathML
 Importance: __
 5.5 - Being able to build complicated tables, with merged cells,
embedded lists, and eventually sub-tables
 Importance: __

6. Here are some conceptual objects that a report may contain. Are there
any more you can think of which may be important?
	Tables (containing Rows and Cells), Lists, Titles, Footnotes, Comments,
Abbreviations / Acronyms, Code, Links, Graphs, Layout (to have 2 or 3
columns), Mathematics (equations), Table of Contents, Index
	
	Others that could be added:
	

7. Two different tools allow one to create dynamic or similar documents:
Sweave (for LaTeX and HTML) and Rpad (HTML, with a server). I would be
interested in being able to describe the structure of a document that
would be exportable to:

 7.1 -  Sweave   [ ] Yes[ ] No
 7.2 -  Rpad [ ] Yes[ ] No
If you are interested in contributing to the project, please let me know 
also. 

Re: [R] thanks

2005-01-12 Thread Jan T. Kim
On Tue, Jan 11, 2005 at 04:24:11PM +0100, Lefebure Tristan wrote:

 example from a shell:
 
 echo -e "pdf(file=\"test.pdf\")\nplot(1:10,11:20)\ndev.off(dev.cur())\n" > cmd.R
 R -s < cmd.R
 
 (write a file of commands for R, and then feed R with it)

This may be on the verge of becoming offtopic, but let me remark
that the technique proposed here should be used for illustrative
purposes only. For real life, use pipes:

echo 'print(mean(rnorm(10)));' | R --vanilla

This is equivalent to

echo 'print(mean(rnorm(10)));' > cmd.R
R --vanilla < cmd.R

*as long as only one shell is executing this sequence at any given time*.

The reason I mention this here is that I've seen it happen a few times
that this temporary command file approach has made it from examples
into shell scripts of which then, later on, multiple instances were
run at a time, resulting in very rare, very irreproducible, and most
inexplicable erroneous results.

Best regards, Jan
-- 
 +- Jan T. Kim ---+
 |*NEW*email: [EMAIL PROTECTED]   |
 |*NEW*WWW:   http://www.cmp.uea.ac.uk/people/jtk |
 *-=  hierarchical systems are for files, not for humans  =-*

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] CUSUM SQUARED structural breaks approach?

2005-01-12 Thread Achim Zeileis
On Tue, 11 Jan 2005 19:33:41 + Rick Ram wrote:

 Groundwork for the choice of break method in my specific application
 has already been done - otherwise I would need to reinvent the wheel
 (make a horribly detailed comparison of the performance of break
 approaches in the context of modelling post-break).
 
 If it interests you, Pesaran & Timmerman 2002 compared CUSUM Squared,
 Bai-Perron and a time-varying approach to detect singular previous
 breaks in reverse-ordered financial time series so as to update a
 forecasting model.

Yes, I know that paper. And if I recall correctly they are mainly
interested in modelling the time period after the last break. For this,
the reverse ordered recursive CUSUM approach works because they
essentially look back in time to see when their predictions break down.
And for their application looking for variance changes also makes sense.
The approach is surely valid and sound in this context...but it might be
possible to do something better (but I would have to look much closer at
the particular application to have an idea what might be a way to go).

 This works fine i.e. the plot looks correct.  The problem is how to
 appropriately normalise these to rescale them to what the CUSUM
 squared procedure expects (this looks to be a different and more
 complicated procedure than the normalisation used for the basic
 CUSUM).  I am from an IT background and am slightly illiterate in
 terms of math notation... guidance from anyone would be appreciated

I just had a brief glance at BDE75, page 154, Section 2.4. If I
haven't missed anything important on reading it very quickly, you just
need to do something like the following (a reproducible example, based
on data from strucchange, using a notation similar to BDE's):

## load GermanM1 data and model
library(strucchange)
data(GermanM1)
M1.model <- dm ~ dy2 + dR + dR1 + dp + ecm.res + season

## compute squared recursive residuals
w2 <- recresid(M1.model, data = GermanM1)^2
## compute CUSUM of squares process
sr <- ts(cumsum(c(0, w2))/sum(w2), end = end(GermanM1$dm), freq = 12)
## the border (r-k)/(T-k)
border <- ts(seq(0, 1, length = length(sr)),
             start = start(sr), freq = 12)

## nice plot
plot(sr, xaxs = "i", yaxs = "i", main = "CUSUM of Squares")
lines(border, col = grey(0.5))
lines(0.4 + border, col = grey(0.5))
lines(-0.4 + border, col = grey(0.5))

Instead of 0.4 you would have to use the appropriate critical values
from Durbin (1969) if my reading of the paper is correct.
 
hth,
Z

 Does anyone know if this represents some commonly performed type of
 normalisation that exists in another function??
 
 I will hunt out the 1969 paper for the critical values but prior to
 doing this I am a bit confused as to how they will
 implemented/interpreted... the CUSUM squared plot does/should run
 diagonally up from left to right and there are two straight lines that
 one would put around this from the critical values.  Hence, a
 different interpretation/implementation of confidence levels than in
 other contexts.  I realise this is not just a R thing but a problem
 with my theoretical background.
 
 
 Thanks for detailed reply!
 
 Rick.
 
 
  
  But depending on the model and hypothesis you want to test, another
  technique than CUSUM of squares might be more appropriate and also
  available in strucchange.
 
  
  hth,
  Z
  
   Any help or pointers about where to look would be more than
   appreciated!  Hopefully I have just missed obvious something in
   the package...
  
   Many thanks,
  
   Rick R.
   
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide!
   http://www.R-project.org/posting-guide.html
  
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] [survey] R for Reporting - the R Output MAnager (ROMA) project

2005-01-12 Thread A.J. Rossini
Your example is sequential, ignoring the tree-like structure of most
documents. Why not via a DOM or similar XML-ish structure?

While I'd never advocate general purpose XML as a user format, as you
note, that is what XSLT is for, and using XML as an electronic
internal document representation would provide a potentially more
scalable system.

(i.e. use XML and the DOM internally, but provide a simple API to it).
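
A toy illustration of that idea (assuming the XML package; the node names
here are made up):

library(XML)
sec <- xmlNode("section", attrs = c(title = "A title here"),
               xmlNode("comment", "comment: yes!"))
doc <- xmlNode("report", sec)   # DOM-style tree held internally
doc                             # prints as serialized XML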

The other advantage would be that you could stick a dependency DAG
(ADG) via a second set of marked edges of the document graph/tree to
allow for selective regeneration of results.

But then, this project isn't on my to-do list this year :-).

best,
-tony



On Wed, 12 Jan 2005 14:16:57 +0100, Eric Lecoutre
[EMAIL PROTECTED] wrote:
 
 Hi R UseRs,
 
 I am interested in providing Reporting abilities to R and have initiated a
 project called R Output MAnager (ROMA).
  My starting point was my R2HTML package which provides (rough) HTML
  exports. I began by trying to mimic it for LaTeX but quickly realized
  that it was a bad idea.
  Thus, I started a new package again from scratch and spent a lot of
  time reading about this topic, looking at what other software do (SAS ODS,
  S-PLUS SPXML, ...) and studying technologies and formats: XML+XSLT, LyX,
  DocBook, RTF, ...
 
  What follows is a description of my plans. This email is targeted at
  interested useRs, in order to get feedback on it.
  It comes with a little survey at the end that will help me to
  focus my package.
  If you are also interested in Reporting (Output, Formats, Exchange, ...),
  please read the following and answer the survey.
  If not, you can skip this message - apologies for sending it to R-help, I
  hope you don't mind.
 
 ---
 
 As a matter of fact, I have implemented something that shows promise
 (according to me). Currently, from the following output description:
 
 ***
  data(iris)
  mm = as.matrix(iris[1:5, 1:4])
 
  out = emptyContent()
  out = out + Section("A title here")
  out = out + diag(2)
  out = out + Comment("comment: yes!")
  out = out + list(un = 1, pi)
  out = out + "Then a boolean: " + TRUE
  out = out + Section("Default matrix", level = 2)
  out = out + mm
  out = out + Section("Custom matrix" + Footnote("It works!"), level = 2)
  out = out +
 ROMA(mm, style = "custommatrix", rowstyle = paste("color", row(mm)[,1] %% 2, sep = ""), align = "left")
  out = out + Section("An other title")
  out = out + ROMAgenerated()   # ROMAgenerated is a predefined function
 ***
 
 You can generate a proper HTML file by the following command:
 
  Export(out)
 
 (see result: http://www.stat.ucl.ac.be/ROMA/sample.htm)
 
 The same output object could be exported to (tex+dvi+ps+pdf) with:
 
   Export(out, driver = "latex")
 
 (see result: http://www.stat.ucl.ac.be/ROMA/sample.pdf / Change extension
 for other formats: tex and ps)
 
 --- Survey ---
 
 IMPORTANT: PLEASE REPLY ONLY TO ME, NOT TO THE R-HELP MAILING LIST
 
 Simply fill in the questions you want to answer:
 
 1. I am interested in Reporting abilities for R
 [ ] Definitely
 [ ] Rather Yes
 [ ] Rather No
 [ ] Not at all
 
 2. I have some knowledge about those different formats / specifications
 [ ] rtf  [ ] LaTeX  [ ] LyX
 [ ] html [ ] css[ ] xHTML
 [ ] XML  [ ] XSLT   [ ] DocBook
 
 3. I have some knowledge about those tools
 [ ] SAS ODS
 [ ] SPlus SPXML library
 [ ] XSLT + XSL-FO chain
 
 4. I would be specially interested in the following formats (multiple
 choices possible)
 [ ] rtf
 [ ] tex
 [ ] lyx
 [ ] XML, with a DTD specific to R
 [ ] XML, with the DTD from SPlus (compatible with SPXML library)
 [ ] XML, DocBook flavor
 [ ] HTML + css (good xHTML)
 [ ] Word (doc)
 [ ] OpenOffice (oo)
 [ ] Plain text
 [ ] Other:
 
 4bis: If several formats, the best (according to me and my needs) one would
 be: 
 
 5. The approach is to fully separate content from formatting. So, XML would
 be an ideal output format. Nevertheless, few people who use R may also
 master XSLT to produce nicely formatted output. Thus, a way to handle styles
 (bold, colors, fonts, etc.) from R would also be great. It may not be a
 priority. Statistical output does have some specific issues: mathematics,
 complicated tables, graphs, and so on. For each of the following items,
 please tell me how important the issue is for you:
 
0: I don't need that (and think I will never need it)
1: Not really important
...
5: Crucial - I can't live without it anymore
 
  5.1 - Being able to read the document in any OS:
  Importance: __
 
  5.2 - Having an object that describes the output within R (as in the
 example), so that I could add/remove things and re-export it
  Importance: __
 
  5.3 - Being able to define basic formatting also within R (bold,
 colors, fonts, and so on)
  Importance: __
 
  5.4 - Being able to include mathematics, as (La)TeX code or MathML
  Importance: __
 
  5.5 - Being able to build 

Re: [R] useR 2005 ?

2005-01-12 Thread Marc Schwartz
On Tue, 2005-01-11 at 17:39 +0100, Rau, Roland wrote:
 Dear R-Help-List,
 
 are there any plans to organize a useR conference in 2005?
 
 Best,
 Roland


As I understand it, no. The next one will be in 2006, so it will be
every other year, interleaved with the DSC meeting in the odd years. 

Information on past DSC meetings is here:

http://www.ci.tuwien.ac.at/Conferences/DSC.html

I have not seen anything posted yet for DSC 2005, unless I missed it
someplace.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] (no subject)

2005-01-12 Thread nicolas . deig

hi,

I am trying to grow a classification tree on some data, but I have a little
problem. In order to do so I have to use a function like tree in R, and in the
online help(tree) I get the following: 

"The left-hand-side (response) should be either a numerical vector when a
regression tree will be fitted or a factor, when a classification tree is 
produced"

I would like to know what a factor is in R: is it a numerical value with no
formula, or just a word?

Thanks in advance
Nicolas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] What is a factor? [was :(no subject)]

2005-01-12 Thread Marc Schwartz
On Wed, 2005-01-12 at 15:17 +0100, [EMAIL PROTECTED] wrote:
 hi,
 
 I am trying to grow a classification tree on some data, but I have a little
 problem. In order to do so I have to use a function like tree in R, and in 
 the
 online help(tree) I get the following: 
 
 "The left-hand-side (response) should be either a numerical vector when a
 regression tree will be fitted or a factor, when a classification tree is 
 produced"
 
 I would like to know what a factor is in R: is it a numerical value with no
 formula, or just a word?
 
 Thanks in advance
 Nicolas

See ?factor and/or Chapter 4, "Ordered and Unordered Factors", in "An
Introduction to R".

Also, you might want to look into the 'rpart' package for an alternative
to 'tree'. rpart is included in the base R distribution:

library(rpart)
?rpart

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] (no subject)

2005-01-12 Thread saverio vicario
Dear help desk and R community,
I have a problem with how R 2.0 handles RAM, maybe a bug.
In fact I had used R 1.9 since January 2004 with large data sets on a
Mac OS X G5 with 1 GB of RAM without problems.  Then I moved to 2.0 and
found myself short of RAM and using virtual memory.  I tried to run
the program from a terminal window to avoid the GUI, but it was the
same.  The annoying part is that even if I delete big objects from
the workspace, the RAM consumption does not decrease (looking at the
percentage of usage in ps, or the actual value in Activity Monitor).
Only after a long time (1/2 hour) does the consumption of RAM decrease
somewhat.  When I use the workspace browser in the GUI and press
refresh, the RAM consumption fluctuates each time, both decreasing
and increasing, even if the workspace does not change.  For example I can
go from using 270 MB of memory to 430 MB (or the contrary) simply by
pressing refresh several times.  The value is stable once I stop
pressing refresh.
thanks
saverio

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Breslow Day Test

2005-01-12 Thread Thomas Lumley
On Wed, 12 Jan 2005, Tobias Verbeke wrote:
On Tue, 11 Jan 2005 10:45:48 -0500
Palos, Judit [EMAIL PROTECTED] wrote:
Breslow-Day test
A statistical test for the homogeneity of odds ratios.
[..some definitions..]
Your message was not particularly clear, but if
you were looking for R code to do a Breslow-Day test,
Google found this for you:
There is code for meta-analyses, including a test of homogeneity that I 
think is the same as the Breslow-Day one, in the rmeta package. The 
package does forest plots, too.

-thomas
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] (no subject)

2005-01-12 Thread Anne
I think you will find all the doc in the help files:
 ?factor
gets

   The function 'factor' is used to encode a vector as a factor (the
 terms 'category' and 'enumerated type' are also used for factors).
  If 'ordered' is 'TRUE', the factor levels are assumed to be
 ordered. For compatibility with S there is also a function
 'ordered'.

 'is.factor', 'is.ordered', 'as.factor' and 'as.ordered' are the
 membership and coercion functions for these classes.

Usage:

 factor(x, levels = sort(unique.default(x), na.last = TRUE),
labels = levels, exclude = NA, ordered = is.ordered(x))
 ordered(x, ...)
etc...

it's a categorical variable, whose levels (values) are strings
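
For instance, a small illustration (made-up values, not from the help page):

x <- c("low", "high", "high", "low", "medium")
f <- factor(x, levels = c("low", "medium", "high"), ordered = TRUE)
levels(f)        # the categories, stored as strings
as.integer(f)    # the underlying integer codes: 1 3 3 1 2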

To get help:
type ?functionname
or if you are under Windows see the menu Help\Html help and look under
Packages. What you will want first are the Base and Statistics packages

Anne



- Original Message - 
From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Wednesday, January 12, 2005 3:17 PM
Subject: [R] (no subject)



 hi,

 I am trying to grow a classification tree on some data, but I have a
little
 problem. In order to do so I have to use a function like tree in R and
on the
 internet help(tree) I get the following:

 The left-hand-side (response) should be either a numerical vector when a
 regression tree will be fitted or a factor, when a classification tree is
produced

 I would like to know what is a factor in R, is it numerical value with
no
 formula or just a word??

 Thanks in advance
 Nicolas

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] changing language

2005-01-12 Thread Kurt Sys
Hi all,
I've got a small, practical question, which until now I couldn't solve 
(otherwise I wouldn't mail it, right?) First of all, I'm talking about 
R 2.0.1 on a winxp system (using the default graphical interface being 
'Rgui').
When I make plots, using dates on the x-axis, it puts the labels in 
Dutch, which is nice (since it's my mother tongue) unless I want them in 
English... Is there a way to change this behaviour?  (Can I change the 
labels etc to English?)

tnx,
Kurt Sys
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] defining lower part of distribution with covariate

2005-01-12 Thread Troels Ring
I try again - perhaps it is analysis of covariance with treatment 
(thio, ultiva) as two categories and antime as covariate. On the basis of 
such a model, is then the probability of GCS <= 12 larger with thio treatment?

Dear friends, forgive me a simple question, possibly related to quantreg, 
but I failed to get it done and hope for basic instruction.

I have two sets of observed Glasgow coma scores at admission to ICU after 
operation, and the accompanying time of anesthesia (in hours).
Thio is cheap and perhaps old fashioned; ultiva is expensive and rapidly 
terminated. The problem is to estimate the probability of GCS 12 or lower 
on the two treatments after taking time of anesthesia into account (antime), 
which is longer for thio. How would I do that in the best way?

Best wishes
Troels Ring, MD
Aalborg, Denmark
thio
      GCS antime
 [1,]  14    4.5
 [2,]  15    7.5
 [3,]  11    7.5
 [4,]  15    4.5
 [5,]  14    4.5
 [6,]  15    3.5
 [7,]  15    5.5
 [8,]  14    5.5
 [9,]  15    3.5
[10,]  14    8.5
[11,]  13    4.5
[12,]  12    5.5
[13,]  15    3.5
[14,]  13    6.5
[15,]   9    8.5
[16,]  15    6.5
 ultiva
      GCS antime
 [1,]  15    4.5
 [2,]  15    4.5
 [3,]  15    2.5
 [4,]  15    3.5
 [5,]  15    3.5
 [6,]  12    5.5
 [7,]  15    4.5
 [8,]  15    3.5
 [9,]  15    8.5
[10,]  13    4.5
[11,]  14    3.5
[12,]  14    4.5
[13,]  15    4.5
[14,]  14    2.5
[15,]  15    4.5
[16,]  15    3.5
[17,]  15    3.5
[18,]  14    4.5
[19,]  14    4.5
[20,]  15    4.5
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Kolmogorov-Smirnof test for lognormal distribution with estimated parameters

2005-01-12 Thread Christoph Buser
Hi Kwabena

I once did a simulation, generating normally distributed values
(500 values) and calculating a KS test with estimated
parameters. Repeating this test 10'000 times I got only about 1
significant test, whereas at a level alpha=0.05 I would expect about 500 
significant tests by chance.
So I think if you estimate the parameters from the data, you fit
too well, and the distribution used for the test statistic is not
adequate, as is indicated in the help page you cited. There
(in the help page) is some literature, but it is not easy stuff
to read.
Furthermore I know of no implementation of a KS test which
accounts for this estimation of the parameters.

I recommend a graphical tool instead of a test:

x <- rlnorm(100)
qqnorm(log(x))

See also ?qqnorm and ?qqplot.

If you insist on testing a theoretical distribution, be aware
that a non-significant test does not mean that your data has the
tested distribution (especially if you have few data there is
no power in the test to detect deviations from the theoretical
distribution, and the conclusion that the data fits well is
treacherous)

If there are enough data I'd prefer a chi square test to the KS
test (but even there I use graphical tools instead). 

See ?chisq.test

For this test you have to specify classes and this is 
subjective (you can't avoid this).

You can reduce the DF of the expected chi-square distribution
(under H_0) by the number of parameters estimated from the data
and will get better results. 

DF = number of classes - 1 - estimated parameters

I think this test is more powerful than the KS test,
particularly if you must estimate the parameters from data.
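
A hand-rolled sketch of that recipe (the classes here are just an
illustrative choice):

x  <- rlnorm(200)
lx <- log(x)
br <- quantile(lx, probs = seq(0, 1, by = 0.1))  # 10 classes
br[1] <- -Inf; br[length(br)] <- Inf             # close the tails
obs <- table(cut(lx, br))                        # observed class counts
p   <- diff(pnorm(br, mean(lx), sd(lx)))         # probs under fitted normal
stat <- sum((obs - 200 * p)^2 / (200 * p))       # Pearson chi-square statistic
df   <- length(obs) - 1 - 2                      # minus 2 estimated parameters
pchisq(stat, df, lower.tail = FALSE)             # approximate p-value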

Regards,

Christoph

-- 
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C11
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-1-632-5414  fax: 632-1228
http://stat.ethz.ch/~buser/



Kwabena Adusei-Poku writes:
  Hello all,
  
  Would somebody be kind enough to show me how to do a KS test in R for a
  lognormal distribution with ESTIMATED parameters? The R function
  ks.test() says the parameters specified must be prespecified and not
  estimated from the data. Is there a way to correct this when one uses
  estimated parameters?
  
  Regards,
  
  Kwabena.
  
  
  Kwabena Adusei-Poku
  University of Goettingen
  Institute of Statistics and Econometrics
  Platz der Goettingen Sieben 5
  37073 Goettingen
  Germany
  Tel: +49-(0)551-394794
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] model.response error

2005-01-12 Thread Bang
When I installed R 2.0.1 (replacing 1.9.0) for Windows, code using
model.response began acting up.  Here are the first several lines of
code I had been tweaking for a spatial model (the code is mostly that of
Roger Bivand--I am adapting it to a slightly different data structure, and
the problem I'm sure is with my changes, not his code).

command.name <- function (formula, data = list(), weights, na.action =
na.fail, type = "lag", quiet = TRUE, zero.policy = FALSE, tol.solve = 1e-07,
tol.opt = .Machine$double.eps^0.5, sparsedebug = FALSE)
{
mt <- terms(formula, data = data)
mf <- lm(formula, data, na.action = na.action, method = "model.frame")
na.act <- attr(mf, "na.action")
if (!is.matrix.csr(weights))
cat("\nWarning: weights matrix not in sparse form\n")
switch(type, lag = if (!quiet)
cat("\nSpatial lag model\n"), mixed = if (!quiet)
cat("\nSpatial mixed autoregressive model\n"), stop("\nUnknown model
type\n"))
if (!quiet)
cat("Jacobian calculated using weights matrix eigenvalues\n")
y <- model.response(mf, "numeric")
if (any(is.na(y))) stop("NAs in dependent variable")
x <- model.matrix(mt, mf)
if (any(is.na(x))) stop("NAs in independent variable")
if (nrow(x) != nrow(weights))
stop("Input data and weights have different dimensions")
n <- nrow(x)
m <- ncol(x)

When it reads the Y variable in the command:

y <- model.response(mf, "numeric")

The error it gives is:

Error in model.response(mf, "numeric") : No direct or inherited method
for function "model.response" for this call

The problem is puzzling me because it is not something I encountered when I
was running the same code in 1.9.0, but it is causing problems in 2.0.1.

Thanks, and any comments on debugging the error are welcome. 

Jim

Well I AM missing the back of my head... you COULD cut me a little slack!
-Homer Simpson

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Kolmogorov-Smirnof test for lognormal distribution with estimated parameters

2005-01-12 Thread Frank E Harrell Jr
Christoph Buser wrote:
Hi Kwabena
I once did a simulation, generating normally distributed values
(500 values) and calculating a KS test with estimated
parameters. Repeating this test 10'000 times I got only about 1
significant test, whereas at a level alpha=0.05 I would expect about 500 
significant tests by chance.
So I think if you estimate the parameters from the data, you fit
too well, and the distribution used for the test statistic is not
adequate, as is indicated in the help page you cited. There
(in the help page) is some literature, but it is not easy stuff
to read.
Furthermore I know of no implementation of a KS test which
accounts for this estimation of the parameters.

I recommend a graphical tool instead of a test:
x <- rlnorm(100)
qqnorm(log(x))
See also ?qqnorm and ?qqplot.
If you insist on testing a theoretical distribution, be aware
that a non-significant test does not mean that your data has the
tested distribution (especially if you have few data there is
no power in the test to detect deviations from the theoretical
distribution, and the conclusion that the data fits well is
treacherous)
If there are enough data I'd prefer a chi square test to the KS
test (but even there I use graphical tools instead). 

See ?chisq.test
For this test you have to specify classes and this is 
subjective (you can't avoid this).

You can reduce the DF of the expected chi-square distribution
(under H_0) by the number of parameters estimated from the data
and will get better results. 

DF = number of classes - 1 - estimated parameters
I think this test is more powerful than the KS test,
particularly if you must estimate the parameters from data.
Regards,
Christoph
It is also a good idea to ask why one compares against a known 
distribution form.  If you use the empirical CDF to select a parametric 
distribution, the final estimate of the distribution will inherit the 
variance of the ECDF.  The main reason statisticians think that 
parametric curve fits are far more efficient than nonparametric ones is 
that they don't account for model uncertainty in their final confidence 
intervals.

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] changing language

2005-01-12 Thread Peter Dalgaard
Kurt Sys [EMAIL PROTECTED] writes:

 Hi all,
 
 I've got a small, practical question, which until now I couldn't
 solve (otherwise I wouldn't mail it, right?) First of all, I'm
 talking about R 2.0.1 on a winxp system (using the default graphical
 interface being 'Rgui').
 When I make plots, using dates on the x-axis, it puts the labels in
 Dutch, which is nice (since it's my mother tongue) unless I want them
 in English... Is there a way to change this behaviour?  (Can I change
 the labels etc to English?)

This type of stuff works on Linux at least:

Sys.setlocale("LC_ALL", "da_DK") # or "en_GB", or ...
plot(date, ...)


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Kolmogorov-Smirnof test for lognormal distribution with estimated parameters

2005-01-12 Thread Christian Hennig
For the KS-test of normality with estimated parameters see

?lillie.test in package nortest.

Best,
Christian

On Wed, 12 Jan 2005, Christoph Buser wrote:

 Hi Kwabena
 
 I once did a simulation, generating normally distributed values
 (500 values) and calculating a KS test with estimated
 parameters. Repeating this test 10'000 times I got only about 1
 significant test, whereas at a level alpha=0.05 I would expect about 500 
 significant tests by chance.
 So I think if you estimate the parameters from the data, you fit
 too well, and the distribution used for the test statistic is not
 adequate, as is indicated in the help page you cited. There
 (in the help page) is some literature, but it is not easy stuff
 to read.
 Furthermore I know of no implementation of a KS test which
 accounts for this estimation of the parameters.
 
 I recommend a graphical tool instead of a test:
 
 x <- rlnorm(100)
 qqnorm(log(x))
 
 See also ?qqnorm and ?qqplot.
 
 If you insist on testing a theoretical distribution, be aware
 that a non-significant test does not mean that your data has the
 tested distribution (especially if you have few data there is
 no power in the test to detect deviations from the theoretical
 distribution, and the conclusion that the data fits well is
 treacherous)
 
 If there are enough data I'd prefer a chi square test to the KS
 test (but even there I use graphical tools instead). 
 
 See ?chisq.test
 
 For this test you have to specify classes and this is 
 subjective (you can't avoid this).
 
 You can reduce the DF of the expected chi-square distribution
 (under H_0) by the number of parameters estimated from the data
 and will get better results. 
 
 DF = number of classes - 1 - estimated parameters
 
 I think this test is more powerful than the KS test,
 particularly if you must estimate the parameters from data.
 
 Regards,
 
 Christoph
 
 -- 
 Christoph Buser [EMAIL PROTECTED]
 Seminar fuer Statistik, LEO C11
 ETH (Federal Inst. Technology)8092 Zurich  SWITZERLAND
 phone: x-41-1-632-5414fax: 632-1228
 http://stat.ethz.ch/~buser/
 
 
 
 Kwabena Adusei-Poku writes:
   Hello all,
   
   Would somebody be kind enough to show me how to do a KS test in R for a
   lognormal distribution with ESTIMATED parameters? The R function
   ks.test() says the parameters specified must be prespecified and not
   estimated from the data. Is there a way to correct this when one uses
   estimated parameters?
   
   Regards,
   
   Kwabena.
   
   
   Kwabena Adusei-Poku
   University of Goettingen
   Institute of Statistics and Econometrics
   Platz der Goettingen Sieben 5
   37073 Goettingen
   Germany
   Tel: +49-(0)551-394794
   
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

***
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
ich empfehle www.boag-online.de

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] changing language

2005-01-12 Thread Gabor Grothendieck
Kurt Sys kurt.sys at pandora.be writes:

: 
: Hi all,
: 
: I've got a small, practical question, which until now I couldn't solve 
: (otherwise I wouldn't mail it, right?) First of all, I'm talking about 
: R 2.0.1 on a winxp system (using the default graphical interface being 
: 'Rgui').
: When I make plots, using dates on the x-axis, it puts the labels in 
: Dutch, which is nice (since it's my mother tongue) unless I want them in 
: English... Is there a way to change this behaviour?  (Can I change the 
: labels etc to English?)


Here is an example:

R> Sys.setlocale("LC_TIME", "en-us")
[1] "English_United States.1252"
R> format(ISOdate(2004,1:12,1), "%B")
 [1] "January"   "February"  "March"     "April"     "May"       "June"     
 [7] "July"      "August"    "September" "October"   "November"  "December" 
R> Sys.setlocale("LC_TIME", "du-be")
[1] "Dutch_Netherlands.1252"
R> format(ISOdate(2004,1:12,1), "%B")
 [1] "januari"   "februari"  "maart"     "april"     "mei"       "juni"     
 [7] "juli"      "augustus"  "september" "oktober"   "november"  "december" 
R> R.version.string # XP
[1] "R version 2.1.0, 2005-01-02" 

For more codes, google for:

   "Microsoft language codes" 

and look at the first result that is on a Microsoft site.

This may or may not change your labels depending on precisely
what you are doing.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] changing language

2005-01-12 Thread Prof Brian Ripley
It uses the language set by LC_TIME: see ?Sys.setlocale and ?format.Date
which references it.
On Wed, 12 Jan 2005, Kurt Sys wrote:
Hi all,
I've got a small, practical question, which until now I couldn't solve 
(otherwise I wouldn't mail it, right?) First of all, I'm talking about R 
2.0.1 on a winxp system (using the default graphical interface being 'Rgui').
When I make plots, using dates on the x-axis, it puts the labels in Dutch, 
which is nice (since it's my mother tongue) unless I want them in English... 
Is there a way to change this behaviour?  (Can I change the labels etc to 
English?)

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] model.response error

2005-01-12 Thread Roger Bivand
On Wed, 12 Jan 2005, Bang wrote:

 When I installed R 2.0.1 (replacing 1.9.0) for Windows, a code using
 model.response began acting up.  Here are the first several lines of a
 code I had been tweaking for a spatial model (the code is mostly that of
 Roger Bivand--I am adapting it to a slightly different data structure and
 the problem I'm sure is with my changes, not his code).

I don't think it's the R versions, but rather the SparseM versions. I think
what is happening is that the SparseM generic for model.response is being
picked up. In the current NAMESPACE file in the spdep package I now have:

importFrom(stats, model.matrix, model.response)

but I'm not sure that your function is in a package. You will probably 
need to say that both model.response and model.matrix are from stats; at 
least this should give you a lead.
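
Outside a package, a quick sketch of the same idea would be to qualify the
calls explicitly, e.g.:

y <- stats::model.response(mf, "numeric")
x <- stats::model.matrix(mt, mf)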

Best wishes,

Roger

 
 command.name <- function (formula, data = list(), weights, na.action =
 na.fail, type = "lag", quiet = TRUE, zero.policy = FALSE, tol.solve = 1e-07,
 tol.opt = .Machine$double.eps^0.5, sparsedebug = FALSE)
 {
 mt <- terms(formula, data = data)
 mf <- lm(formula, data, na.action = na.action, method = "model.frame")
 na.act <- attr(mf, "na.action")
 if (!is.matrix.csr(weights))
 cat("\nWarning: weights matrix not in sparse form\n")
 switch(type, lag = if (!quiet)
 cat("\nSpatial lag model\n"), mixed = if (!quiet)
 cat("\nSpatial mixed autoregressive model\n"), stop("\nUnknown model
 type\n"))
 if (!quiet)
 cat("Jacobian calculated using weights matrix eigenvalues\n")
 y <- model.response(mf, "numeric")
 if (any(is.na(y))) stop("NAs in dependent variable")
 x <- model.matrix(mt, mf)
 if (any(is.na(x))) stop("NAs in independent variable")
 if (nrow(x) != nrow(weights))
 stop("Input data and weights have different dimensions")
 n <- nrow(x)
 m <- ncol(x)
 
 When it reads the Y variable in the command:
 
 y <- model.response(mf, "numeric")
 
 The error it gives is:
 
 Error in model.response(mf, "numeric") : No direct or inherited method
 for function "model.response" for this call
 
 The problem is puzzling me because it is not something I encountered when I
 was running the same code in 1.9.0, but it is causing problems in 2.0.1.
 
 Thanks, and any comments on debugging the error are welcome. 
 
 Jim
 
 Well I AM missing the back of my head... you COULD cut me a little slack!
 -Homer Simpson
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] defining lower part of distribution with covariate

2005-01-12 Thread Berton Gunter
Troels:

It would be best if you discussed this with a local statistician to make
sure that the data and analysis are properly addressing the scientific
issues. Perhaps that is why no one replied to your previous post. Also, this
is primarily a **statistical** issue, not really an ** R-issue **.

Having said that, I'll take a stab at it ...

Probably the most important thing to say is that there is probably not much
that these data can tell you, as you only have 36 cases in all and only 3 are
<= 12. While this probably represented a **lot** of work for you, the simple
fact is that when trying to understand what influences dichotomous
probabilities, you generally need lots of data (hundreds of cases,
typically). Note: This remark may be subject to correction by wiser
statisticians.

Next, the nature of your response, GCS. It appears to be a subjective rating
score that is probably best modeled as an ordered categorical response,
which in R is called an ordered factor. Dichotomizing it to <=12 / >12 loses
information. Treating it as a continuous response (quantreg/ancova) seems
inappropriate for your data.

Finally, the model. Considering GCS to be an ordered category, a reasonable
modeling strategy seems to be proportional odds logistic regression, which
models the GCS response as a linear function of the anstimes and anstypes
(which encompasses your ancova ideas). The results from this model would
then allow you to calculate the <=12 probability if you chose to do so. This
model can be fit using the polr() function in the MASS package.
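
Purely as a sketch (and not a recommendation to trust a fit on 36 cases),
using the data as transcribed above:

library(MASS)   # polr()
gcs.thio <- c(14,15,11,15,14,15,15,14,15,14,13,12,15,13,9,15)
ant.thio <- c(4.5,7.5,7.5,4.5,4.5,3.5,5.5,5.5,3.5,8.5,4.5,5.5,3.5,6.5,8.5,6.5)
gcs.ult  <- c(15,15,15,15,15,12,15,15,15,13,14,14,15,14,15,15,15,14,14,15)
ant.ult  <- c(4.5,4.5,2.5,3.5,3.5,5.5,4.5,3.5,8.5,4.5,3.5,4.5,4.5,2.5,4.5,
              3.5,3.5,4.5,4.5,4.5)
dat <- data.frame(GCS    = ordered(c(gcs.thio, gcs.ult)),
                  antime = c(ant.thio, ant.ult),
                  treat  = factor(rep(c("thio", "ultiva"), c(16, 20))))
fit <- polr(GCS ~ antime + treat, data = dat, Hess = TRUE)
## cumulative class probabilities for, say, 5 hours of thio anesthesia:
p <- predict(fit,
             data.frame(antime = 5,
                        treat = factor("thio", levels = levels(dat$treat))),
             type = "probs")
sum(p[as.numeric(names(p)) <= 12])   # estimated P(GCS <= 12)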

However, I again urge you to discuss this with a local statistically
knowledgeable resource -- and not to expect too much from such rather meager
data. 

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Troels 
 Ring (by way of Troels Ring [EMAIL PROTECTED])
 Sent: Wednesday, January 12, 2005 9:11 AM
 To: R-help
 Subject: [R] defining lower part of distribution with covariate
 
 I try again - perhaps it is analysis of covariance with treatment 
 (thio,ultiva) as two categories and antime as covariate. On 
 the basis of 
 such a model, is then the probability of GCS <= 12 larger 
 with thio treatment ?
 
 
 Dear friends, forgive me a simple question, possibly related 
 to quantreg 
 but I failed to get it done and hope for basic instruction.
 
 I have two sets of observed Glasgow coma scores at admission 
 to ICU after 
 operation, and accompanying time of anesthesia (in hours).
 Thio is cheap and perhaps old fashioned, and ultiva expensive 
 and rapidly 
 terminated. The problem is to estimate the probability of GCS 
 12 or lower 
 on the two treatments after taking time of anesthesia into 
 account (antime) 
 which is longer for thio. How would I do that in the best way ?
 
 Best wishes
 Troels Ring, MD
 Aalborg, Denmark
 
 
 thio
       GCS antime
  [1,]  14    4.5
  [2,]  15    7.5
  [3,]  11    7.5
  [4,]  15    4.5
  [5,]  14    4.5
  [6,]  15    3.5
  [7,]  15    5.5
  [8,]  14    5.5
  [9,]  15    3.5
 [10,]  14    8.5
 [11,]  13    4.5
 [12,]  12    5.5
 [13,]  15    3.5
 [14,]  13    6.5
 [15,]   9    8.5
 [16,]  15    6.5
  ultiva
       GCS antime
  [1,]  15    4.5
  [2,]  15    4.5
  [3,]  15    2.5
  [4,]  15    3.5
  [5,]  15    3.5
  [6,]  12    5.5
  [7,]  15    4.5
  [8,]  15    3.5
  [9,]  15    8.5
 [10,]  13    4.5
 [11,]  14    3.5
 [12,]  14    4.5
 [13,]  15    4.5
 [14,]  14    2.5
 [15,]  15    4.5
 [16,]  15    3.5
 [17,]  15    3.5
 [18,]  14    4.5
 [19,]  14    4.5
 [20,]  15    4.5
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] transfer function models

2005-01-12 Thread Paul Gilbert
I don't know what SAS does, but transfer functions are essentially MA/AR 
from an ARMA model, so you should be able to get what you want from the 
various ARMA estimation tools in R.
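
For instance, a regression with ARMA errors is one simple transfer-function
model with added noise, and can be fit with arima()'s xreg argument (a
sketch on simulated data, not SAS's exact model):

set.seed(1)
x <- rnorm(200)                       # input series
e <- arima.sim(list(ar = 0.5), 200)   # ARMA disturbance
y <- 2 + 0.8 * x + e                  # output = transfer of x + noise
arima(y, order = c(1, 0, 0), xreg = x)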

Paul Gilbert
Samuel Kemp (Comp) wrote:
Hi,
Does anyone know of a function in R that can estimate the parameters of 
a transfer function model with added noise like in SAS?

Thanks in advance,
Sam.
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Standard error for the area under a smoothed ROC curve?

2005-01-12 Thread Frank E Harrell Jr
Dan Bolser wrote:
On Wed, 12 Jan 2005, Frank E Harrell Jr wrote:

Dan Bolser wrote:
Hello, 

I am making some use of ROC curve analysis. 

I find much help on the mailing list, and I have used the Area Under the
Curve (AUC) functions from the ROC package in the bioconductor project...
http://www.bioconductor.org/repository/release1.5/package/Source/
ROC_1.0.13.tar.gz 

However, I read here...
http://www.medcalc.be/manual/mpage06-13b.php
The 95% confidence interval for the area can be used to test the
hypothesis that the theoretical area is 0.5. If the confidence interval
does not include the 0.5 value, then there is evidence that the laboratory
test does have an ability to distinguish between the two groups (Hanley 
McNeil, 1982; Zweig  Campbell, 1993).
But aside from early on the above article is short on details. Can anyone
tell me how to calculate the CI of the AUC calculation?
I read this...
http://www.bioconductor.org/repository/devel/vignette/ROCnotes.pdf
Which talks about resampling (by showing R code), but I can't understand
what is going on, or what is calculated (the example given is specific to
microarray analysis I think).
I think a general AUC CI function would be a good addition to the ROC
package.

One more thing, in calculating the AUC I see the splines function is
recommended over the approx function. Here...
http://tolstoy.newcastle.edu.au/R/help/04/10/6138.html
How would I rewrite the following AUC functions (adapted from bioconductor
source) to use splines (or approxfun or splinefun) ...

spe # Specificity
[1] 0.02173913 0.13043478 0.21739130 0.32608696 0.43478261 0.54347826
[7] 0.65217391 0.76086957 0.89130435 1. 1. 1.
[13] 1.

sen # Sensitivity
[1] 1.000 1.000 1.000 1.000 1.000 0.9302326 0.8139535
[8] 0.6976744 0.5581395 0.4418605 0.3488372 0.2325581 0.1162791
trapezint(1-spe,sen)
my.integrate(1-spe,sen)
## Functions
## Nicked (and modified) from the ROC function in bioconductor.
trapezint <-
function (x, y, a = 0, b = 1)
{
   if (x[1] > x[length(x)]) {
 x <- rev(x)
 y <- rev(y)
   }
   y <- y[x >= a & x <= b]
   x <- x[x >= a & x <= b]
   if (length(unique(x)) < 2)
   return(NA)
   ya <- approx(x, y, a, ties = max, rule = 2)$y
   yb <- approx(x, y, b, ties = max, rule = 2)$y
   x <- c(a, x, b)
   y <- c(ya, y, yb)
   h <- diff(x)
   lx <- length(x)
   0.5 * sum(h * (y[-1] + y[-lx]))
}
my.integrate <-
function (x, y, t0 = 1)
{
   f <- function(j) approx(x, y, j, rule = 2, ties = max)$y
   integrate(f, 0, t0)$value
}


Thanks for any pointers,
Dan.
I don't see why the above formulas are being used.  The 
Bamber-Hanley-McNeil-Wilcoxon-Mann-Whitney nonparametric method works 
great.  Just get the U statistic (concordance probability) used in 
Wilcoxon.  As Somers' Dxy rank correlation coefficient is 2*(C - 0.5), where 
C is the concordance or ROC area, the Hmisc package function rcorr.cens 
uses U statistic methods to get the standard error of Dxy.  You can 
easily translate this to a standard error of C.

I am sure I could do this easily, except I can't. 

The good thing about ROC is that I understand it (I can see it). I know
why the area means what it means, and I could even imagine how sampling
the data could give a CI on the area. 

However, I don't know why the area under the ROC curve is well known to
be equivalent to the numerator of the Mann-Whitney U statistic - from
http://www.bioconductor.org/repository/devel/vignette/ROCnotes.pdf
Nor do I know how to calculate the numerator of the Mann-Whitney U
statistic.
This is clear in the original Bamber or Hanley-McNeil articles.  The ROC 
area is a linear translation of the mean rank of predicted values in one 
of the two outcome groups.  The little somers2 function in Hmisc shows this:

##S function somers2
##
##Calculates concordance probability and Somers'  Dxy  rank  correlation
##between  a  variable  X  (for  which  ties are counted) and a binary
##variable Y (having values 0 and 1, for which ties are not  counted).
##Uses short cut method based on average ranks in two groups.
##
##Usage:
##
## somers2(X,Y)
##
##Returns vector whose elements are C Index, Dxy, n and missing, where
##C Index is the concordance probability and Dxy=2(C Index-.5).
##
##F. Harrell 28 Nov 90 6 Apr 98: added weights
somers2 <- function(x, y, weights=NULL, normwt=FALSE, na.rm=TRUE) {
  if(length(y)!=length(x))stop("y must have same length as x")
  y <- as.integer(y)
  wtpres <- length(weights)
  if(wtpres && (wtpres != length(x)))
    stop('weights must have same length as x')
  if(na.rm) {
    miss <- if(wtpres) is.na(x + y + weights) else is.na(x + y)
    nmiss <- sum(miss)
    if(nmiss > 0) {
      miss <- !miss
      x <- x[miss]
      y <- y[miss]
      if(wtpres) weights <- weights[miss]
    }
  } else nmiss <- 0
  u <- sort(unique(y))
  if(any(y %nin% 0:1)) stop('y must be binary')  ## 7dec02
  if(wtpres) {
    if(normwt) weights <- 
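
A quick usage sketch on made-up data (Hmisc assumed installed):

library(Hmisc)
set.seed(1)
x <- rnorm(200)                      # marker / predicted value
y <- rbinom(200, 1, plogis(2 * x))   # binary outcome
somers2(x, y)   # C (ROC area), Dxy = 2*(C - 0.5), n, missing
## an SE for C should then be SE(Dxy)/2, e.g. from
## rcorr.cens(x, y)["S.D."] / 2 (rcorr.cens reports the S.D. of Dxy)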

Re: [R] global objects not overwritten within function

2005-01-12 Thread bogdan romocea
Apparently the message below wasn't posted on R-help, so I'm sending it
again. Sorry if you received it twice.

--- bogdan romocea [EMAIL PROTECTED] wrote:

 Date: Tue, 11 Jan 2005 17:31:42 -0800 (PST)
 From: bogdan romocea [EMAIL PROTECTED]
 Subject: Re: [R] global objects not overwritten within function

Thank you to everyone who replied. I had no idea that ... means
something in R, I only wanted to make the code look simpler. I'm
pasting below the functional equivalent of what took me yesterday a
couple of hours to debug. Function f() takes several arguments (that's
why I want to have the code as a function) and creates several objects.
I then need to use those objects in another function fct(), and I want
to overwrite them to save memory (they're pretty large).

It appears that Robert's guess (dynamic/lexical scoping) explains
what's going on. I've noticed though another strange (to me) issue:
without indexing (such as obj1 <- obj1[obj1 > 0], which I need to use
though), fct() prints the expected values even without removing the
objects after each iteration. However, after indexing is introduced,
rm() must be used to make fct() return the intended output. How would
that be explained?

Kind regards,
b.

f <- function(read, position){
obj1 <- 5 * read[position]:(read[position]+5)
obj2 <- 7 * read[position]:(read[position]+5)
assign("obj1", obj1, .GlobalEnv)
assign("obj2", obj2, .GlobalEnv)
}
fct <- function(input){
for (i in 1:5)
{
f(input, i)
obj1 <- obj1[obj1 > 0]
obj2 <- obj2[obj2 > 0]
print(obj1)
print(obj2)
#   rm(obj1, obj2)   # get intended results with this line
}
}
a <- 1:10
fct(a)
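
A sketch of what is probably happening, consistent with lexical scoping: the
first obj1 <- obj1[obj1 > 0] creates a *local* obj1 inside fct(), and on
later iterations that stale local copy shadows the global one that f() keeps
reassigning, until rm() deletes it. In miniature:

h <- function(i) assign("obj", i * 1:3, .GlobalEnv)  # like f(): writes a global
g <- function() {
    for (i in 1:3) {
        h(i)
        obj <- obj[obj > 0]  # pass 1 reads the global, then creates a LOCAL obj
        print(obj)           # later passes read the stale local copy instead
    }
}
g()   # prints 1 2 3 three times, not 2 4 6 and 3 6 9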

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] changing language [SOLVED]

2005-01-12 Thread Kurt Sys
To all that replied, thanks... I have a clue where I can change the settings.

tnx,
Kurt Sys


- Original Message -
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 12, 2005 05:26 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] changing language

Kurt Sys kurt.sys at pandora.be writes:

: 
: Hi all,
: 
: I've got a small, practical question, which until now I couldn't solve 
: (otherwise I wouldn't mail it, right?) First of all, I'm talking about 
: R 2.0.1 on a winxp system (using the default graphical interface being 
: 'Rgui').
: When I make plots, using dates on the x-axis, it puts the labels in 
: Dutch, which is nice (since it's my mother tongue) unless I want them in 
: English... Is there a way to change this behaviour?  (Can I change the 
: labels etc to English?)


Here is an example:

R> Sys.setlocale("LC_TIME", "en-us")
[1] "English_United States.1252"
R> format(ISOdate(2004,1:12,1), "%B")
 [1] "January"   "February"  "March"     "April"     "May"       "June"     
 [7] "July"      "August"    "September" "October"   "November"  "December" 
R> Sys.setlocale("LC_TIME", "du-be")
[1] "Dutch_Netherlands.1252"
R> format(ISOdate(2004,1:12,1), "%B")
 [1] "januari"   "februari"  "maart"     "april"     "mei"       "juni"     
 [7] "juli"      "augustus"  "september" "oktober"   "november"  "december" 
R> R.version.string # XP
[1] "R version 2.1.0, 2005-01-02" 

For more codes, google for:

   "Microsoft language codes" 

and look at the first result that is on a Microsoft site.

This may or may not change your labels depending on precisely
what you are doing.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] gbm

2005-01-12 Thread Weiwei Shi
Hi, there:
I am wondering if I can find some detailed explanation
on gbm or explanation on examples of gbm.

thanks,

Ed

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Finding seasonal peaks in a time series....

2005-01-12 Thread Dr Carbon
 I have a seasonal time series. I want to calculate the annual mean
value of the time series at its peak

 (say the mean of the three values before the peak, the peak, and the
three values after the peak).

 The peak of the time series might change cycle slightly from year to year.

# E.g.,
nPts <- 254
foo <- sin((2 * pi * 1/24) * 1:nPts)
foo <- foo + rnorm(nPts, 0, 0.05)
bar <- ts(foo, start = c(1980,3), frequency = 24)
plot(bar)
start(bar)
end(bar)

# I want to find the peak value from each year, and then get the mean
# of the values on either side.
# So, if the peak value in the year 1981 is
max.in.1981 <- max(window(bar, start = c(1981,1), end = c(1981,24)))
# e.g., cycle 7 or 8
window(bar, start = c(1981,1), end = c(1981,24)) == max.in.1981
# E.g. if the highest value in 1981 is in cycle 8 I want
mean.in.1981 <- mean(window(bar, start = c(1981,5), end = c(1981,11)))
plot(bar)
points(ts(mean.in.1981, start = c(1981,8), frequency = 24), col =
"red", pch = "+")

 Is there a way to automate this for each year?

 How can I return the cycle of the max value by year?
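
# One possible approach (a sketch, using only base functions):
yr <- floor(time(bar))
pk <- tapply(seq_along(bar), yr, function(i) i[which.max(bar[i])])
cycle(bar)[pk]                    # cycle of the max value, by year
sapply(pk, function(i)            # mean of the 7 values centred on the peak
    mean(bar[max(1, i - 3):min(length(bar), i + 3)]))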

 Thanks in advance. -DC

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] gbm

2005-01-12 Thread Spencer Graves
 I just got 25 hits from www.r-project.org -> search -> "R site 
search".  Might one or more of these help you?  If they don't solve your 
problem, I suggest you try the posting guide, 
http://www.R-project.org/posting-guide.html.  If that still doesn't 
solve your problem, it should help you phrase your question to increase 
the chances of getting a helpful reply. 

 hope this helps.  spencer graves
Weiwei Shi wrote:
Hi, there:
I am wondering if I can find some detailed explanation
on gbm or explanation on examples of gbm.
thanks,
Ed
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] gbm

2005-01-12 Thread Peter Dalgaard
Weiwei Shi [EMAIL PROTECTED] writes:

 Hi, there:
 I am wondering if I can find some detailed explanation
 on gbm or explanation on examples of gbm.

What is gbm?

Green Belt Movement?
Georgie Boy Manufacturing?

I'm serious! Well, only sort of, but try Google on gbm and you'll
find those two expansions and several others like them.

I suppose you mean Gradient Boosting Machine, or Generalized Boosted
regression Models. Have you followed up on the references and examples
on its help page?

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] gbm

2005-01-12 Thread Daniel Almirall

You can also check out:

http://www.i-pensieri.com/gregr/gbm.shtml

There are reference papers on there, too.

HTH,
Danny


On Wed, 12 Jan 2005, Peter Dalgaard wrote:

 Weiwei Shi [EMAIL PROTECTED] writes:

  Hi, there:
  I am wondering if I can find some detailed explanation
  on gbm or explanation on examples of gbm.

 What is gbm?

 Green Belt Movement?
 Georgie Boy Manufacturing?

 I'm serious! Well, only sort of, but try Google on gbm and you'll
 find those two expansions and several others like them.

 I suppose you mean Gradient Boosting Machine, or Generalized Boosted
 regression Models. Have you followed up on the references and examples
 on its help page?

 --
O__   Peter Dalgaard Blegdamsvej 3
   c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
  (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Off Topic: Statistical philosophy rant

2005-01-12 Thread Berton Gunter
R-Listers.

The following is a rant originally sent privately to Frank Harrell in
response to remarks he made on this list. The ideas are not new or original,
but he suggested I share it with the list, as he felt that it might be of
wider interest, nonetheless. I have real doubts about this, and I apologize
in advance to those who agree that I should have kept my remarks private.
In view of this, if you wish to criticize my remarks on list, that's fine,
but I won't respond (I've said enough already!). I would be happy to discuss
issues (a little) further off list with anyone who wishes to bother, but not
on list. 

Also, Frank sent me a relevant reference for those who might wish to read a
more thoughtful consideration of the issues:

@ARTICLE{far92cos,
   author = {Faraway, J. J.},
   year = 1992,
   title = {The cost of data analysis},
   journal = {J Comp Graphical Stat},
   volume = 1,
   pages = {213-229},
   annote = {bootstrap; validation; predictive accuracy; modeling strategy;
regression diagnostics;model uncertainty}
}

I welcome further relevant references, pro or con!

Finally, I need to emphasize that these are clearly my very personal views
and do not reflect those of my company or colleagues. 

Cheers to all ...
---

The relevant portion of Frank's original comment was in a thread about K-S
tests for the goodness of fit of a parametric distribution:

...
 If you use the empirical CDF to select a parametric 
 distribution, the final estimate of the distribution will inherit the 
 variance of the ECDF.
 The main reason statisticians think that 
 parametric curve fits are far more efficient than 
 nonparametric ones is 
 that they don't account for model uncertainty in their final 
 confidence 
 intervals.
 
 -- Frank Harrell

My reply:

That's a perceptive remark, but I would go further... You mentioned
**model** uncertainty. In fact, in any data analysis in which we explore the
data first to choose a model, fit the model (parametric or non..), and then
use whatever (pivots from parametric analysis; bootstrapping;...) to say
something about model uncertainty, we're always kidding ourselves and our
colleagues because we fail to take into account the considerable variability
introduced by our initial subjective exploration and subsequent choice of
modeling strategy. One can only say (at best) that the stated model
uncertainty is an underestimate of the true uncertainty. And very likely a
considerable underestimate because of the model choice subjectivity.

Now I in no way wish to discourage or abridge data exploration; only to
point out that we statisticians have promulgated a self-serving and
unrealistic view of the value of formal inference in quantifying true
scientific uncertainty when we do such exploration -- and that there is
therefore something fundamentally contradictory in our own rhetoric and
methods. Taking a larger view, I think this remark is part of the deeper
epistemological issue of characterizing what can be scientifically known
or, indeed, defining the difference between science and art, say. My own
view is that scientific certainty is a fruitless concept: we build models
that we benchmark against our subjective measurements (as the measurements
themselves depend on earlier scientific models) of reality. Insofar as
data can limit or support our flights of modeling fancy, they do; but in the
end, it is neither an objective process nor one whose uncertainty can be
strictly quantified. In creating the illusion that statistical methods can
overcome these limitations, I think we have both done science a disservice
and relegated ourselves to an isolated, fringe role in scientific inquiry.

Needless to say, opposing viewpoints to such iconoclastic remarks are
cheerfully welcomed.

Best regards,

Bert Gunter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] gbm

2005-01-12 Thread Weiwei Shi
Hi, there:
Thanks a lot for all people' prompt replies.

In detail, I am facing a huge amount of data: over
10,000 rows and 400 vars. This project is very challenging
and interesting to me. I tried rpart, which gives me
some promising results, but not good enough. So I am
trying randomForest and gbm now. 

My plan of using gbm is like this:
rt <- rpart(...)
gbm(formula(rt), ...)

Does this work? (My first question)

Another concern of mine for gbm is scalability, since I
realize R seems to load all the data into memory. (My
second question)

But I believe the idea above will run very slowly. (I
think I might try TreeNet, though I don't like it
since it is commercial.) BTW, sampling might be a
good idea, but it does not seem a good one for my
project, judging from previous experiments.

I read some of the references mentioned earlier by helpers
before I sent my first email. But I still appreciate
any help. You guys are so nice!

BTW, gbm means gradient boosting modeling :)

Ed

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Off Topic: Statistical philosophy rant

2005-01-12 Thread Dan Bolser
On Wed, 12 Jan 2005, Berton Gunter wrote:

R-Listers.

The following is a rant originally sent privately to Frank Harrell in
response to remarks he made on this list. The ideas are not new or original,
but he suggested I share it with the list, as he felt that it might be of
wider interest, nonetheless. I have real doubts about this, and I apologize
in advance to those who agree that I should have kept my remarks private.
In view of this, if you wish to criticize my remarks on list, that's fine,
but I won't respond (I've said enough already!). I would be happy to discuss
issues (a little) further off list with anyone who wishes to bother, but not
on list. 

Also, Frank sent me a relevant reference for those who might wish to read a
more thoughtful consideration of the issues:

@ARTICLE{far92cos,
   author = {Faraway, J. J.},
   year = 1992,
   title = {The cost of data analysis},
   journal = {J Comp Graphical Stat},
   volume = 1,
   pages = {213-229},
   annote = {bootstrap; validation; predictive accuracy; modeling strategy;
regression diagnostics;model uncertainty}
}

I welcome further relevant references, pro or con!

Finally, I need to emphasize that these are clearly my very personal views
and do not reflect those of my company or colleagues. 

Cheers to all ...
---

The relevant portion of Frank's original comment was in a thread about K-S
tests for the goodness of fit of a parametric distribution:

...
 If you use the empirical CDF to select a parametric 
 distribution, the final estimate of the distribution will inherit the 
 variance of the ECDF.
 The main reason statisticians think that 
 parametric curve fits are far more efficient than 
 nonparametric ones is 
 that they don't account for model uncertainty in their final 
 confidence 
 intervals.
 
 -- Frank Harrell

My reply:

That's a perceptive remark, but I would go further... You mentioned
**model** uncertainty. In fact, in any data analysis in which we explore the
data first to choose a model, fit the model (parametric or non..), and then
use whatever (pivots from parametric analysis; bootstrapping;...) to say
something about model uncertainty, we're always kidding ourselves and our
colleagues because we fail to take into account the considerable variability
introduced by our initial subjective exploration and subsequent choice of
modeling strategy. One can only say (at best) that the stated model
uncertainty is an underestimate of the true uncertainty. And very likely a
considerable underestimate because of the model choice subjectivity.

Now I in no way wish to discourage or abridge data exploration; only to
point out that we statisticians have promulgated a self-serving and
unrealistic view of the value of formal inference in quantifying true
scientific uncertainty when we do such exploration -- and that there is
therefore something fundamentally contradictory in our own rhetoric and
methods. Taking a larger view, I think this remark is part of the deeper
epistemological issue of characterizing what can be scientifically known
or, indeed, defining the difference between science and art, say. My own
view is that scientific certainty is a fruitless concept: we build models
that we benchmark against our subjective measurements (as the measurements
themselves depend on earlier scientific models) of reality. Insofar as
data can limit or support our flights of modeling fancy, they do; but in the
end, it is neither an objective process nor one whose uncertainty can be
strictly quantified. 

I totally agree with the above and I am totally unqualified to comment on
the below.


You (and others) might find these papers interesting...

http://www.santafe.edu/~chaos/chaos/pubs.htm


Specifically papers like...

Synchronizing to the Environment: Information Theoretic Constraints on
Agent Learning.
http://www.santafe.edu/~cmg/papers/stte.pdf

Is Anything Ever New? Considering Emergence.
http://www.santafe.edu/~cmg/papers/EverNew.pdf


Observing Complexity and The Complexity of Observation
http://www.santafe.edu/~cmg/papers/OCACO.pdf


What Lies Between Order and Chaos?
http://www.santafe.edu/~cmg/papers/wlboac.pdf



And probably many more.


In creating the illusion that statistical methods can
overcome these limitations, I think we have both done science a disservice
and relegated ourselves to an isolated, fringe role in scientific inquiry.

Needless to say, opposing viewpoints to such iconoclastic remarks are
cheerfully welcomed.

Does it make any difference to the mass of Saturn?

Dan.


Best regards,

Bert Gunter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] gbm

2005-01-12 Thread Liaw, Andy
 From: Weiwei Shi
 
 Hi, there:
 Thanks a lot for everyone's prompt replies.
 
 In detail, I am facing a huge amount of data: over
 10,000 rows and 400 vars. This project is very challenging
 and interesting to me. I tried rpart which gives me
 some promising results but not good enough. So I am
 trying randomForest and gbm now. 
 
 My plan for using gbm is like this:
 rt <- rpart(...)
 gbm(formula(rt), ...)
 
 Does this work? (My first question)

Given a machine with sufficient memory and CPU speed, yes.
 
 Another concern of mine for gbm is scalability, since I
 realize R seems to load all the data into memory. (My
 second question)

We have dealt with data larger than what you described.  One thing to avoid
is the use of the formula interface if you have _lots_ (like, hundreds) of
variables.  gbm.fit(), I believe, was created for that reason.
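
A minimal sketch of that route, with simulated stand-ins for the data and
purely illustrative tuning values:

library(gbm)
# x: a data frame of predictors; y: a 0/1 response (both hypothetical)
x <- data.frame(matrix(rnorm(1000 * 5), 1000, 5))
y <- rbinom(1000, 1, 0.5)
# gbm.fit() takes x and y directly, bypassing the formula machinery
# that becomes expensive with hundreds of variables
fit <- gbm.fit(x, y, distribution = "bernoulli", n.trees = 100,
               interaction.depth = 2, shrinkage = 0.05, verbose = FALSE)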
 
 But I believe the idea above will run very slowly. (I
 think I might try TreeNet, though I don't like it
 since it is commercial.) BTW, sampling might be a
 good idea in general, but it did not seem to work for my
 project in previous experiments.

To me being commercial is not a crime.  I judge software on quality, ease of
use, access to source (if I need it), etc.  To me, TreeNet failed on several
of those criteria, but it works just fine for some people.
 
 I read some references mentioned earlier by helpers
 before I sent my first email. But I still appreciate
 any help. You guys are so nice!

That's no excuse for not following the posting guide, right?
 
 BTW, gbm means gradient boosting modeling :)

No.  I believe Greg calls it `generalized boosting models'.

Andy

 
 Ed
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Changing the ranges for the axis in image()

2005-01-12 Thread Mulholland, Tom
There's something that you're not telling me. If you want something other than 
your data, why use your data? If you have another set of data that you wish to 
overlay on the image, then you are going to have to scale one of the data 
sources to match the other. I'm not sure where your problem is, but the code 
below might prove useful in understanding how the plotting occurs. I assume 
that image() always uses the -0.25 to 1.25 limits, but this would need to be 
confirmed.
 
I assume that when you talk about trying to use label(), you are referring to the 
Hmisc package. I don't use this function, so I can't give advice about it. 
However, I think you probably need to be more familiar with it before you will get 
the best out of Frank's package. Someone else on the list may be able to help 
with the use of this function.
 
x <- matrix(c(1,1,0,1,0,1,0,1,1), 3, 3)
 
# dummy secondary data
y <- runif(20) * 20
z <- runif(20) * 20
y1 <- (y / (20/1.5)) - 0.25 # rescale y from 0..20 to -0.25..1.25
z1 <- (z / (20/1.5)) - 0.25
x; y; z; y1; z1
 
par(mfrow = c(1,3))
image(x, xlim = c(-0.25,1.25), ylim = c(-0.25,1.25))

image(x, xlim = c(0,0.5), ylim = c(0.1,0.9), axes = FALSE)
points(y1, z1)

image(x, axes = FALSE, xlim = c(-0.25,1.25), ylim = c(-0.25,1.25))
points(y1, z1)
rect(0, 0.1, 0.5, 0.9)
axis(2, at = seq(-0.25,1.25,length = 5), labels = seq(0,20,length = 5))
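
As an aside, the box-frame part of the original question can be handled with
box(), which draws a frame around the current plot region; a minimal sketch:

image(x, axes = FALSE)
box()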
 
Tom

 -Original Message-
From: Costas Vorlow [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 12 January 2005 6:22 PM
To: Mulholland, Tom
Subject: Re: [R] Changing the ranges for the axis in image()



Dear Tom,

Thanks. What happens, though, if I want an entirely different range from that of 
my data? I am trying label() but it doesn't work properly.

Best,
Costas

Mulholland, Tom wrote:


Setting axes = FALSE does not remove the coordinate system; you can therefore still 
set the limits using xlim and ylim.



x <- matrix(c(1,1,0,1,0,1,0,1,1), 3, 3)

par(mfrow = c(1,2))

image(x)

image(x, xlim = c(0.5,0.8), ylim = c(0.1,0.9), axes = FALSE)





Tom



  

-Original Message-

From: Costas Vorlow [ mailto:[EMAIL PROTECTED]

Sent: Tuesday, 11 January 2005 9:29 PM

To:  r-help@stat.math.ethz.ch

Subject: [R] Changing the ranges for the axis in image()





Dear all,



I can not find/understand the solution to this from the help pages:



Say we have the following script:



 x <- matrix(c(1,1,0,1,0,1,0,1,1), 3, 3)

image(x)



How can I change the ranges on the vertical and horizontal axes to a 

range of my own, or at least place a box frame around the image if I 

choose to use axes = FALSE?



Apologies for such a basic question and thanks beforehand for 

your answers.



__

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide! 

http://www.R-project.org/posting-guide.html









  


-- 



This e-mail contains information intended for the addressee only.  It may be 
confidential and may be the subject of legal and/or professional privilege. Any 
dissemination, distribution, copying or use of this communication without 
prior permission of the addressee is strictly prohibited.

---

 Costas E. Vorlow   | Tel: +44 (0)191 33 45727

 Durham Business School | Fax: +44 (0)191 33 45201

 Room (324), University of Durham,  | email: K.E.Vorloou(at)durham.ac.uk

 Mill Hill Lane,| or : costas(at)vorlow.org

 Durham DH1 3LB, UK.|  http://www.vorlow.org

  http://ssrn.com/author=341149  | replace (at) with @ for my email



  Fingerprint: B010 577A 9EC3 9185 08AE  8F22 1A48 B4E7 9FA6 C31A



   How empty is theory in presence of fact!  (Mark Twain, 1889)


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] multivariate diagnostics

2005-01-12 Thread Yulei He
Hi, there.

I have two questions about diagnostics in multivariate statistics.

1. Is there any diagnostic tool to check whether a multivariate sample is from
a multivariate normal distribution? If there is one, is there any function
for it in R?

2. Is there any function for testing whether two multivariate distributions are
the same, i.e. a multivariate extension of the Kolmogorov-Smirnov test?
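
One classical diagnostic for question 1 is a chi-square Q-Q plot of squared
Mahalanobis distances: under multivariate normality with p variables, the
squared distances are approximately chi-square on p degrees of freedom. A
minimal base-R sketch, with simulated data standing in for a real sample:

set.seed(1)
n <- 100; p <- 3
X <- matrix(rnorm(n * p), n, p)            # stand-in for a real sample
d2 <- mahalanobis(X, colMeans(X), cov(X))  # squared Mahalanobis distances
qqplot(qchisq(ppoints(n), df = p), d2,
       xlab = "Chi-square quantiles",
       ylab = "Squared Mahalanobis distances")
abline(0, 1)  # points near this line are consistent with multivariate normality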

Thanks for your help.

Yulei


$$$
Yulei He
1586 Murfin Ave. Apt 37
Ann Arbor, MI 48105-3135
[EMAIL PROTECTED]
734-647-0305(H)
734-763-0421(O)
734-763-0427(O)
734-764-8263(fax)
$$

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Off Topic: Statistical philosophy rant

2005-01-12 Thread Mulholland, Tom
I have often noted that statistics can't prove a damn thing, but they can be 
really useful in disproving something. Having spent most of the 80s and half of 
the 90s with the Australian Bureau of Statistics finding out how you collect 
these numbers, I am disconcerted at the apparent disregard for measurement 
issues such as bias, input error, questionnaire design, etc. ... Science 
wars ... the real world ... and the not-so-real world. Having only recently 
discovered what our esteemed J Baron does, I should say that a lot of his work 
requires us to ask how we use (abuse?) the tools we have.

Having said that, some of my most influential work has come from data 
exploration within fields where I would describe myself as a complete novice. 
Using only the phrase "the data seem to indicate relationship x with y" or some 
variant, and asking if this is an accepted norm, has produced some unexpected 
paradigm shifts.

Someone on the list has a footline along the lines of "All models 
are wrong, but some of them are useful." I think this is attributed to Box. As 
most of us know, some of the advice on this list is more sage than the rest.

That all concludes to say that the manner in which we deal with non-model 
uncertainty impacts the degree to which we perform a disservice to 
science/ourselves. I think you are being unduly pessimistic, but then again I 
might just be a cynic masquerading as a realist.

Tom

 -Original Message-
...
 That's a perceptive remark, but I would go further... You mentioned
 **model** uncertainty. In fact, in any data analysis in which 
 we explore the
 data first to choose a model, fit the model (parametric or 
 non..), and then
 use whatever (pivots from parametric analysis; 
 bootstrapping;...) to say
 something about model uncertainty, we're always kidding 
 ourselves and our
 colleagues because we fail to take into account the 
 considerable variability
 introduced by our initial subjective exploration and 
 subsequent choice of
 modelling strategy. One can only say (at best) that the stated model
 uncertainty is an underestimate of the true uncertainty. And 
 very likely a
 considerable underestimate because of the model choice subjectivity.
 
 Now I in no way wish to discourage or abridge data 
 exploration; only to
 point out that we statisticians have promulgated a self-serving and
 unrealistic view of the value of formal inference in quantifying true
 scientific uncertainty when we do such exploration -- and 
 that there is
 therefore something fundamentally contradictory in our own 
 rhetoric and
 methods. Taking a larger view, I think this remark is part of 
 the deeper
 epistemological issue of characterizing what can be 
 scientifically known
 or, indeed, defining the difference between science and art, 
 say. My own
 view is that scientific certainty is a fruitless concept: we 
 build models
 that we benchmark against our subjective measurements (as the 
 measurements
 themselves depend on earlier scientific models) of reality. 
 Insofar as
 data can limit or support our flights of modeling fancy, they 
 do; but in the
 end, it is neither an objective process nor one whose 
 uncertainty can be
 strictly quantified. In creating the illusion that 
 statistical methods can
 overcome these limitations, I think we have both done science 
 a disservice
 and relegated ourselves to an isolated, fringe role in 
 scientific inquiry.
 
 Needless to say, opposing viewpoints to such iconoclastic remarks are
 cheerfully welcomed.
 
 Best regards,
 
 Bert Gunter
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Finding seasonal peaks in a time series....

2005-01-12 Thread Mulholland, Tom
You might find breakpoints in strucchange helpful

Tom

 -Original Message-
 From: Dr Carbon [mailto:[EMAIL PROTECTED]
 Sent: Thursday, 13 January 2005 6:19 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Finding seasonal peaks in a time series
 
 
  I have a seasonal time series. I want to calculate the annual mean
 value of the time series at its peak
 
  (say the mean of the three values before the peak, the peak, and the
 three values after the peak).
 
  The peak of the time series might change cycle slightly from 
 year to year.
 
 # E.g.,
 nPts <- 254
 foo <- sin((2 * pi * 1/24) * 1:nPts)
 foo <- foo + rnorm(nPts, 0, 0.05)
 bar <- ts(foo, start = c(1980,3), frequency = 24)
 plot(bar)
 start(bar)
 end(bar)
 
 # I want to find the peak value from each year, and then get the mean
 # of the values on either side.
 # So, if the peak value in the year 1981 is
 max.in.1981 <- max(window(bar, start = c(1981,1), end = c(1981,24)))
 # e.g., cycle 7 or 8
 window(bar, start = c(1981,1), end = c(1981,24)) == max.in.1981
 # E.g. if the highest value in 1981 is in cycle 8 I want
 mean.in.1981 <- mean(window(bar, start = c(1981,5), end = c(1981,11)))
 plot(bar)
 points(ts(mean.in.1981, start = c(1981,8), frequency = 24), col =
 "red", pch = "+")
 
 
  Is there a way to automate this for each year?
 
  How can I return the cycle of the max value by year?
 
  Thanks in advance. -DC
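
One way to automate both steps, as a minimal base-R sketch built on the
simulated bar above (the 7-point window is simply clipped at the ends of
the series):

yrs <- unique(floor(time(bar)))
# cycle at which the series peaks, for each calendar year
peak.cycle <- sapply(yrs, function(y) {
  i <- which(floor(time(bar)) == y)
  cycle(bar)[i][which.max(bar[i])]
})
# mean of the 7-point window centred on each yearly peak
peak.mean <- sapply(yrs, function(y) {
  i <- which(floor(time(bar)) == y)
  j <- i[which.max(bar[i])]
  mean(bar[max(1, j - 3):min(length(bar), j + 3)])
})
names(peak.cycle) <- names(peak.mean) <- yrs
peak.cycle  # the cycle of the max value, by year
peak.mean   # the windowed annual mean around each peak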
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Please unsubscribe me from you list

2005-01-12 Thread Mulholland, Tom
You stand a better chance if you do it yourself:
https://stat.ethz.ch/mailman/listinfo/r-help

 -Original Message-
 From: Kevin Ita [mailto:[EMAIL PROTECTED]
 Sent: Thursday, 13 January 2005 2:25 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] Please unsubscribe me from you list
 
 
 Please unsubscribe me from your list.
  
 Thank you.
  
 Kevin
 
   
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Finding seasonal peaks in a time series....

2005-01-12 Thread Mulholland, Tom
Sorry, I didn't read the question properly. Please disregard my earlier 
suggestion; my mind was elsewhere.

Tom

 -Original Message-
 From: Mulholland, Tom 
 Sent: Thursday, 13 January 2005 10:52 AM
 To: Dr Carbon; r-help@stat.math.ethz.ch
 Subject: RE: [R] Finding seasonal peaks in a time series
 
 
 You might find breakpoints in strucchange helpful
 
 Tom
 
  -Original Message-
  From: Dr Carbon [mailto:[EMAIL PROTECTED]
  Sent: Thursday, 13 January 2005 6:19 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] Finding seasonal peaks in a time series
  
  
   I have a seasonal time series. I want to calculate the annual mean
  value of the time series at its peak
  
   (say the mean of the three values before the peak, the 
 peak, and the
  three values after the peak).
  
   The peak of the time series might change cycle slightly from 
  year to year.
  
  # E.g.,
  nPts <- 254
  foo <- sin((2 * pi * 1/24) * 1:nPts)
  foo <- foo + rnorm(nPts, 0, 0.05)
  bar <- ts(foo, start = c(1980,3), frequency = 24)
  plot(bar)
  start(bar)
  end(bar)
  
  # I want to find the peak value from each year, and then get the mean
  # of the values on either side.
  # So, if the peak value in the year 1981 is
  max.in.1981 <- max(window(bar, start = c(1981,1), end = c(1981,24)))
  # e.g., cycle 7 or 8
  window(bar, start = c(1981,1), end = c(1981,24)) == max.in.1981
  # E.g. if the highest value in 1981 is in cycle 8 I want
  mean.in.1981 <- mean(window(bar, start = c(1981,5), end = c(1981,11)))
  plot(bar)
  points(ts(mean.in.1981, start = c(1981,8), frequency = 24), col =
  "red", pch = "+")
  
  
   Is there a way to automate this for each year?
  
   How can I return the cycle of the max value by year?
  
   Thanks in advance. -DC
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] New package: MatchIt

2005-01-12 Thread Elizabeth Stuart
We would like to announce the release of our software MatchIt, now 
available on CRAN.   MatchIt implements a variety of matching methods for 
causal inference.  

Abstract:
MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2004) 
for improving parametric statistical models by preprocessing data with 
nonparametric matching methods. MatchIt implements a wide range of 
sophisticated matching methods, making it possible to greatly reduce the 
dependence of causal inferences on hard-to-justify, but commonly made, 
statistical modeling assumptions. The software also easily fits into 
existing research practices since, after preprocessing data with MatchIt, 
researchers can use whatever parametric model they would have used without 
MatchIt, but produce inferences with substantially more robustness and 
less sensitivity to modeling assumptions. MatchIt is an R program, and 
also works seamlessly with Zelig.
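
For the curious, the intended workflow looks roughly like this. This is a
sketch rather than official documentation; it assumes the lalonde example
data included with the package:

library(MatchIt)
data(lalonde)
# preprocess: nearest-neighbour propensity score matching
m.out <- matchit(treat ~ age + educ + re74 + re75,
                 data = lalonde, method = "nearest")
summary(m.out)  # balance diagnostics, before vs. after matching
# then fit whatever parametric model you would have used anyway,
# this time on the matched data
m.data <- match.data(m.out)
fit <- lm(re78 ~ treat + age + educ, data = m.data)

The point of the design is that the second stage is ordinary lm()/glm() code,
unchanged by the matching.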

For more information, please see http://gking.harvard.edu/matchit/.  
Comments and suggestions are welcome.  

Sincerely,
Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html