Re: [R] Translating lm.object to SQL, C, etc function

2003-02-14 Thread j+rhelp
On Fri, 14 Feb 2003 08:31:58 +0100, Uwe Ligges
[EMAIL PROTECTED] said:
 [EMAIL PROTECTED] wrote:
...
  So my question is, how do I export an lm.object in some form that I can
  then apply to prediction in C, SQL, or some other language? All I'm
  looking for is some well-structured textual or data frame output that I
  can then manipulate with appropriate tools, whether it be S itself, or
  something like Perl.
...

 See ?dump

Thanks for the suggestion. After my last post I tried switching from
SPLUS to R and discovered the useful xlevels attribute, which when output
with expression(), combined with the coefficients attribute, gives me the
information I need. dump() also provides those things, although it has a
lot of other stuff not needed to build the prediction function.
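
A minimal sketch of the approach described above, turning the xlevels and coefficients of a fit into an SQL expression. It assumes default treatment contrasts, factor main effects only, and SQL column names identical to the R variable names; the built-in warpbreaks data set and the names pieces, whens and sql are illustrative only.

```r
# Illustrative fit: two factor main effects on a built-in data set.
fit <- lm(breaks ~ wool + tension, data = warpbreaks)
b <- coef(fit)

# Start with the intercept, then add one CASE expression per factor.
pieces <- sprintf("%.6f", b[["(Intercept)"]])
for (v in names(fit$xlevels)) {
  levs  <- fit$xlevels[[v]][-1]   # first level is the baseline under treatment contrasts
  betas <- b[paste0(v, levs)]     # default names are <variable><level>, with no delimiter
  whens <- sprintf("WHEN '%s' THEN %.6f", levs, betas)
  pieces <- c(pieces,
              sprintf("CASE %s %s ELSE 0 END", v, paste(whens, collapse = " ")))
}
sql <- paste(pieces, collapse = " + ")
cat(sql, "\n")
```

Because the lookup is driven by xlevels rather than by parsing coefficient names, the missing delimiter is not a problem here. Numeric predictors and interactions would need additional handling.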

I'll start coding something using this, but it won't be ideal. The two
problems are:
 - The variable name / level name are still concatenated with
   no delimiter in the coefficients, so it's possible there will
   be ambiguous names
 - It feels rather clunky to be relying on these attributes when
   I feel like I should be adding methods directly to the class
   somehow...

In SPLUS I came across a useful attribute 'assign', which has a mapping
of term names to variables - the same attribute in R doesn't appear to
provide this information. Is this available somewhere?
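
For what it is worth, a hedged sketch of how a coefficient-to-term mapping can be recovered in R: the "assign" attribute of the model matrix holds term indices, and the terms object supplies the labels. The data set and the names coef_map and term_vars are illustrative assumptions.

```r
fit <- lm(breaks ~ wool * tension, data = warpbreaks)
tt  <- terms(fit)

# 0 marks the intercept; k marks the k-th term label.
assign_idx  <- attr(model.matrix(fit), "assign")
term_labels <- c("(Intercept)", attr(tt, "term.labels"))

coef_map <- data.frame(
  coefficient = names(coef(fit)),
  term        = term_labels[assign_idx + 1],
  stringsAsFactors = FALSE
)

# Which variables take part in each term (rows: variables, columns: terms).
term_vars <- attr(tt, "factors")
print(coef_map)
```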

What approaches are others using to apply their models to data sets where
S is not available? Has anyone written any convertors of models to other
languages? Is it possible to compile an expression or model into a DLL or
COM object and access it that way? I'm aware of the SOAP interface, but
that doesn't really suit our needs in this case.

TIA,
  Jeremy

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] Translating lm.object to SQL, C, etc function

2003-02-14 Thread ripley
The issue here is that coef() tells you the coefficients in R's internal
parametrization of the model, and that is of no use to you unless you have
a means of creating a model matrix in C, SQL or (heaven forbid) Perl. The
information needed to re-create a model matrix is stored in the lm fit,
but in ways that are going to be hard to use anywhere else (since they
include R functions).  This is not perverse: what R does is very general,
*far* more so than SPSS.  Formulae in lm can include poly() and ns() 
terms, for example.

The only practical solution it seems to us is to ask R to create the model 
matrix for new data.  Then the things you are talking about are just the 
colnames of that matrix, and don't need to be interpreted.
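
A minimal sketch of that suggestion, using simulated data and hypothetical variable names (x, g, y): R rebuilds the model matrix for new data from the terms, factor levels and contrasts stored in the fit, and prediction reduces to a matrix product.

```r
# Toy data: one numeric predictor, one factor, and their interaction.
set.seed(1)
d <- data.frame(x = rnorm(30),
                g = factor(rep(c("a", "b", "c"), 10)))
d$y <- 1 + 2 * d$x + as.numeric(d$g) + rnorm(30, sd = 0.1)
fit <- lm(y ~ x * g, data = d)

# New data; the factor here contains only some of the original levels.
newd <- data.frame(x = c(0.5, -1), g = factor(c("b", "c")))

# Rebuild the design matrix from the information stored in the fit.
tt <- delete.response(terms(fit))
mf <- model.frame(tt, newd, xlev = fit$xlevels)
X  <- model.matrix(tt, mf, contrasts.arg = fit$contrasts)

# The columns of X line up with the coefficients, so no name parsing is needed.
pred <- drop(X %*% coef(fit))
```

The matrix X could then be exported (for example with write.table) and multiplied by the coefficient vector in any other environment.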

You may want to read the sources to find out how R does it: that area is 
one of the most complex parts of the internals, and one in which bugs 
continue to emerge.

On Fri, 14 Feb 2003 [EMAIL PROTECTED] wrote:

 This is my first post to this list so I suppose a quick intro is in
 order. I've been using SPLUS 2000 and R1.6.2 for just a couple of days,
 and love S already. I'm reading MASS and also John Fox's book - both have
 been very useful. My background in stat software was mainly SPSS (which
 I've never much liked - thanks heavens I've found S!), and Perl is my
 tool of choice for general-purpose programming (I chaired the
 perl6-language-data working group, responsible for improving the data
 analysis capabilities in Perl).
 
 I have just completed my first S project, and I now have 8 lm.objects.
 The models are all reasonably complex with multiple numeric and factor
 variables and some 2-way and 3-way interactions. I now need to use these
 models in other environments, such as C code, SQL functions (using CASE)
 and in Perl - I can not work out how to do this.
 
 The difficulty I am having is that the output of coef() is not really
 parsable, since there is no marker in the name of a coefficient to
 separate out the components. For instance, in SPSS the name of a
 coefficient might be:
 
   var1=[a]*var2=[b]*var3
 
 ...which is easy to write a little script to pull that apart and turn it
 into a line of SQL, C, or whatever. In S however the name looks like:
 
   var1avar2bvar3
 
 ...which provides no way to pull the bits apart.

I find that impossible to understand anyway, but doubt that it corresponds
to SPSS.  For a variable V, label Va does not mean V=[a] except in unusual
special cases.

 So my question is, how do I export an lm.object in some form that I can
 then apply to prediction in C, SQL, or some other language? All I'm
 looking for is some well-structured textual or data frame output that I
 can then manipulate with appropriate tools, whether it be S itself, or
 something like Perl.
 
 Thanks in advance for any suggestions (and apologies in advance if this
 is well documented somewhere!),
 
   Jeremy
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595




[R] How to keep two Vectors to be

2003-02-14 Thread #JIA YIYU#

Hi all,

I am a beginner with R and would like to ask for your help.

I have two data.frame objects, s40 and s100, with the same structure.
Each is effectively a two-column array like:

V1    V2
34    6768
234   36
65    60
.

s40 and s100 have almost the same values in V1, but each lacks some of the
V1 values present in the other. I want to expand both to the same length by
inserting the missing values into V1 of s40 and s100, with the corresponding
value in V2 set to 0 or to the mean of V2.

Is there an easy way to solve this problem? Any help will be appreciated very much!
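
One way the operation described above might be done, offered as a sketch only. The s40 values come from the table in the message; the s100 values and the helper expand_to are hypothetical.

```r
s40  <- data.frame(V1 = c(34, 234, 65), V2 = c(6768, 36, 60))
s100 <- data.frame(V1 = c(34, 65, 90),  V2 = c(10, 20, 30))   # made-up values

# Every V1 value that occurs in either data frame.
all_keys <- sort(union(s40$V1, s100$V1))

# Expand df to one row per key; rows missing from df get the fill value.
expand_to <- function(df, keys, fill = 0) {
  out <- data.frame(V1 = keys, V2 = fill)
  out$V2[match(df$V1, keys)] <- df$V2
  out
}

s40x  <- expand_to(s40, all_keys)
s100x <- expand_to(s100, all_keys)
```

To fill with the mean instead of 0, one could call expand_to(s40, all_keys, fill = mean(s40$V2)).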

Jia Yiyu




[R] factorial function

2003-02-14 Thread Serge Boiko

Sorry for the stupid question, but is there the factorial function in
R? I tried to find it using help.search('factorial') but got nothing
appropriate. 
Many thanks, 
-Serge




[R] off topic: sharing of software in the life sciences

2003-02-14 Thread Ramon Diaz
This might be of interest to some people in these lists.
The latest issue of Science (vol 299, 14 Febr. 2003), on p. 990, mentions a 
recent report from the National Academy of Sciences that deals with some 
guidelines for the sharing of data and research materials in the life 
sciences. The NAS report can be accessed from

http://bob.nap.edu/books/0309088593/html/

and the most relevant pages, regarding making code available, are pp. 20-23 
and p. 27.




-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
http://bioinfo.cnio.es/~rdiaz




Re: [R] factorial function

2003-02-14 Thread Ko-Kang Kevin Wang
Hi,

- Original Message -
From: Serge Boiko [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, February 14, 2003 11:36 PM
Subject: [R] factorial function



 Sorry for the stupid question, but is there the factorial function in
 R? I tried to find it using help.search('factorial') but got nothing
 appropriate.


There isn't.

But there are at least four different ways to do this -- from the S
Programming Workshop (by Dr. Ross Ihaka):

  # Iteration
  fac1 <- function(n) {
    ans <- 1
    for (i in seq_len(n)) ans <- ans * i   # seq_len(0) is empty, so fac1(0) is 1
    ans
  }

  # Recursion
  fac2 <- function(n)
    if (n <= 0) 1 else n * fac2(n - 1)

  # Vectorised
  fac3 <- function(n)
    prod(seq_len(n))

  # Special Mathematical Function -- Gamma
  fac4 <- function(n)
    gamma(n + 1)

Of these, gamma is probably the most efficient.  Note that none of the above
does any input checking; you will probably want to add some.

Cheers,

Kevin


Ko-Kang Kevin Wang
Master of Science (MSc) Student
Department of Statistics
University of Auckland
New Zealand
www.stat.auckland.ac.nz/~kwan022




Re: [R] factorial function

2003-02-14 Thread Ramon Diaz
Dear Serge,

For factorial of x, you can use gamma(x + 1). Alternatively, you can install 
the gregmisc package which has a factorial function that does that (if I 
recall correctly).

Best,




On Friday 14 February 2003 11:36, Serge Boiko wrote:
 Sorry for the stupid question, but is there the factorial function in
 R? I tried to find it using help.search('factorial') but got nothing
 appropriate.
 Many thanks,
 -Serge


-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
http://bioinfo.cnio.es/~rdiaz




RE: [R] Change array size

2003-02-14 Thread Liaw, Andy
Is the following what you want?

 x <- rnorm(800)
 xt <- x[1:2^trunc(log(length(x),base=2))]
 length(xt)
[1] 512

HTH,
Andy

 -Original Message-
 From: Poizot Emmanuel [mailto:[EMAIL PROTECTED]]
 Sent: Friday, February 14, 2003 4:52 AM
 To: [EMAIL PROTECTED]
 Subject: [R] Change array size
 
 
 Hi,
 I would like to know if there is a way to change a vector of 
 arbitrary size
 to make it fits the nearest upper size multiple of a power of 2.
 
 -- 
 Cordialy
 
 Emmanuel POIZOT
 Cnam/Intechmer
 Digue de Collignon
 50110 Tourlaville
 Tél : (33)(0)2 33 88 73 42
 Fax : (33)(0)2 33 88 73 39
 -
 
 





Re: [R] time series missing 0 counts

2003-02-14 Thread ripley
Yes, there is an easy way.  Create the regular time series you want by 
something like

x <- ts(0, start=c(2000,52), end=c(2003,9), frequency=52)

and fill in the time points you have data for by

xYear <- trunc(time(x)); xWeek <- cycle(x)
attach(mydata)
x[(xYear==Year) & (xWeek==Week)] <- Count
detach()

Easy!
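
The elementwise comparison in the snippet above relies on the vector lengths lining up. A hedged sketch that matches on (year, week) keys instead is given here; the data frame mydata and its column names are taken from the question, and the small offset added before floor() is a precaution against floating-point error in time().

```r
# The example rows from the question.
mydata <- data.frame(Year  = c(2000, 2001, 2001, 2001, 2001),
                     Week  = c(52, 1, 2, 5, 7),
                     Count = c(2, 5, 7, 4, 2))

# Regular weekly series of zeros covering the period of interest.
x <- ts(0, start = c(2000, 52), end = c(2001, 8), frequency = 52)

# Year and week of every slot in the regular series.
xYear <- as.integer(floor(time(x) + 1e-8))
xWeek <- as.integer(cycle(x))

# Position of each observed (year, week) pair within the series.
idx <- match(paste(mydata$Year, mydata$Week), paste(xYear, xWeek))
x[idx] <- mydata$Count
```

Weeks absent from mydata keep their initial value of 0.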

On Fri, 14 Feb 2003, Schnitzler, Johannes wrote:

  I have several large data sets with counts per week. 
  (Maximum week per year is 52. Counts from Week 53
  are added to week 52.) 
  
  A data set contains for example:
  
  Year    Week    Count
  2000    52      2
  2001    1       5
  2001    2       7
  2001    5       4
  2001    7       2
  ...     ...     ...
  ...     ...     ...
  
  Weeks with 0 counts are not listed in the data set.
  I want to perform time series analysis (frequency 52).
  
  
  Is there an easy way to expand the data set to:
  
  Year    Week    Count
  2000    52      2
  2001    1       5
  2001    2       7
  2001    3       0
  2001    4       0
  2001    5       4
  2001    6       0
  2001    7       2
  ...     ...     ...
  ...     ...     ...
  
  or is there already a function in ts, which i have not found so far,
  to deal with this problem?
  
  
  Thank you very much.
  
  Johannes Schnitzler
  Germany Berlin
  
   
 
 
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595




Re: [R] Translating lm.object to SQL, C, etc function

2003-02-14 Thread Frank E Harrell Jr
On Fri, 14 Feb 2003 16:37:42 +1100
[EMAIL PROTECTED] wrote:

 This is my first post to this list so I suppose a quick intro is in
 order. I've been using SPLUS 2000 and R1.6.2 for just a couple of days,
 and love S already. I'm reading MASS and also John Fox's book - both have
 been very useful. My background in stat software was mainly SPSS (which
 I've never much liked - thanks heavens I've found S!), and Perl is my
 tool of choice for general-purpose programming (I chaired the
 perl6-language-data working group, responsible for improving the data
 analysis capabilities in Perl).
 
 I have just completed my first S project, and I now have 8 lm.objects.
 The models are all reasonably complex with multiple numeric and factor
 variables and some 2-way and 3-way interactions. I now need to use these
 models in other environments, such as C code, SQL functions (using CASE)
 and in Perl - I can not work out how to do this.
 
 The difficulty I am having is that the output of coef() is not really
 parsable, since there is no marker in the name of a coefficient to
 separate out the components. For instance, in SPSS the name of a
 coefficient might be:
 
   var1=[a]*var2=[b]*var3
 
 ...which is easy to write a little script to pull that apart and turn it
 into a line of SQL, C, or whatever. In S however the name looks like:
 
   var1avar2bvar3
 
 ...which provides no way to pull the bits apart.
 
 So my question is, how do I export an lm.object in some form that I can
 then apply to prediction in C, SQL, or some other language? All I'm
 looking for is some well-structured textual or data frame output that I
 can then manipulate with appropriate tools, whether it be S itself, or
 something like Perl.
 
 Thanks in advance for any suggestions (and apologies in advance if this
 is well documented somewhere!),
 
   Jeremy
 


Some functions that may give you some ideas, from the Design library
(http://hesweb1.med.virginia.edu/biostat/s/Design.html):

Function(fit): generate S function to obtain predicted values from a regression fit 
that was done with Design in effect (i.e., fit with ols, cph, lrm, psm, glmD)

latex(fit): generate LaTeX code for typesetting the model fit

sascode(Function(fit)): translate formula to SAS notation

What I think would be very useful is a function like Function that instead
symbolically creates the design matrix, together with a translation of that
function to SQL etc.  This would allow computation of confidence limits.
-- 
Frank E Harrell Jr  Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat




RE: [R] FW: [Fwd: Re: [S] Exact p-values]

2003-02-14 Thread Christian . Stratowa
Dear Spencer

Thank you for this extensive explanation of the problem.
I was just curious.

Best regards
Christian

==
Christian Stratowa, PhD
Boehringer Ingelheim Austria
Dept NCE Lead Discovery - Bioinformatics
Dr. Boehringergasse 5-11
A-1121 Vienna, Austria
Tel.: ++43-1-80105-2470
Fax: ++43-1-80105-2683
email: [EMAIL PROTECTED]

 -Original Message-
 From: Spencer Graves [SMTP:[EMAIL PROTECTED]]
 Sent: Friday, February 14, 2003 1:29 PM
 To:   Stratowa,Dr,Christian   FEX BIG-AT-V
 Cc:   [EMAIL PROTECTED]; David Smith
 Subject:  Re: [R] FW: [Fwd: Re: [S] Exact p-values]
 
 To understand the correct answer, you need to understand the following:
 
   pbinom(1, 2, .5)
 [1] 0.75
 
 This is the binomial cumulative distribution function.
 *** pbinom(0, 2, .5) = 0.25
 *** pbinom(1, 2, .5) = 0.75 = 0.25 + 0.5
 *** pbinom(2, 2, .5) = 1
 
 However, pbinom(1e15, 2e15, .5) is a computational challenge.  Standard 
 numerical algorithms often fail in situations like this.  The code 
 should test for such cases and use more numerically stable 
 approximations in place of the exact algorithms.
 
 The standard deviation for a binomial is sqrt(p*(1-p)/n) = 
 0.5/sqrt(2e15), which is roughly 1e-8 in your case.
 
 
 I get the following from both S-Plus and R:
 
   pbinom(1e5+c(-1, 0, 1), 2e5, .5)
 [1] 0.4991079 0.5008921 0.5026762
 
 For the problem you cite, the correct answer should be 0.5 to about 8 
 significant digits.  Instead, I get 1 from R (as you did) and the 
 following from S-Plus:
 
   pbinom(1e15,2e15,0.5)
 [1] 0.7411209
 
 Both give wrong answers without warning, though in this case, S-Plus is 
 closer.
 
 Answer the question?
 Spencer Graves
 #
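
As an illustration of the "more numerically stable approximations" mentioned in the quoted message, here is a sketch of the normal approximation with a continuity correction. The helper name pbinom_normal is hypothetical, and this is not a claim about how R or S-Plus implement pbinom.

```r
# Normal approximation to the binomial CDF with a continuity correction.
pbinom_normal <- function(k, n, p) {
  pnorm((k + 0.5 - n * p) / sqrt(n * p * (1 - p)))
}

# Moderate case: compare with the exact function.
approx_small <- pbinom_normal(1e5, 2e5, 0.5)
exact_small  <- pbinom(1e5, 2e5, 0.5)

# Extreme case from the thread: the result should be very close to 0.5.
approx_huge <- pbinom_normal(1e15, 2e15, 0.5)
```

The approximation is only appropriate when n*p*(1-p) is large, so a production routine would have to choose between the exact and approximate branches.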
 
 [EMAIL PROTECTED] wrote:
  Dear all
  
  Just for fun, I have just downloaded the paper mentioned below and
  checked it with R-1.6.1.
  Everything is ok with the exception of Table 2b, where I always get 1
  instead of 0.5:
  
 pbinom(1e15,2e15,0.5)
  
  [1] 1
  
  Which value should be correct?
  
  Best regards
  Christian Stratowa
  
  ==
  Christian Stratowa, PhD
  Boehringer Ingelheim Austria
  Dept NCE Lead Discovery - Bioinformatics
  Dr. Boehringergasse 5-11
  A-1121 Vienna, Austria
  Tel.: ++43-1-80105-2470
  Fax: ++43-1-80105-2683
  email: [EMAIL PROTECTED]
  
  
  
  Original Message 
 Subject: Re: [S] Exact p-values
 Date: Thu, 13 Feb 2003 18:31:38 +0100
 From: Rau, Roland [EMAIL PROTECTED]
 To: 'Spencer Graves' [EMAIL PROTECTED],  Jose María Fedriani
 Laffitte [EMAIL PROTECTED]
 CC: [EMAIL PROTECTED]
 
 Dear all,
 
 in relation to your question, the following working paper of Leo
 Knuesel, University of Munich, might be of interest:
 On the Accuracy of Statistical Distributions in S-Plus for Windows
 (1999)
 You can download the paper from (pdf-Format, 45k):
 http://www.stat.uni-muenchen.de/~knuesel/elv/accuracy.html
 
 Best,
 Roland
 
   -Original Message-
   From:Spencer Graves [SMTP:[EMAIL PROTECTED]]
   Sent:Thursday, February 13, 2003 6:12 PM
   To:  Jose María Fedriani Laffitte
   Cc:  [EMAIL PROTECTED]
   Subject: Re: [S] Exact p-values
  
  
   Try ( 1-pchisq(29.8, df=1)):  With S-Plus 6.1, I got  4.78992e-008.
  
   By the way, the distribution functions in R have more arguments.
   For example, pchisq(29.8, df=1, lower.tail=F) produces the same
   answer, and pchisq(29.8, df=1, lower.tail=F, log=T) produces its
   natural logarithm.  Also, pchisq, dchisq, qchisq, and rchisq in R all
   have an ncp noncentrality parameter argument;  only pchisq has such
   in S-Plus 6.1.  Similarly, none of the Student's t functions in S-Plus
   have a noncentrality parameter;  in R, pt has an argument ncp, and
   from this one can easily program ncp for dt, qt and rt.  Also, the
   distribution functions in the current release of S-Plus are known to
   have problems.  For example, pt(-1, Inf) = 0.5 in S-Plus 6.1, but
   0.159 in R;  clearly, S-Plus gives a wrong answer without warning.
  
   Best Wishes,
   Spencer Graves
  
   Jose María Fedriani Laffitte wrote:
  
   Dear all,
   
   I want to get the exact p-values, on 1 degree of freedom, for an
   array of chi-square values.  When my chi-square values are equal to
   or lower than 29.7, I get the exact associated p-values.  Thus, for
   instance:
   
   pchisq(29.7, df=1)
   [1] 0.999
   
   However, when my chi-square values are greater than or equal to 29.8,
   what I get is:
   
   pchisq(29.8, df=1)
   [1] 1
   
   Could anyone tell me how to fix this trivial issue?  Very grateful,
   Jose M. Fedriani
   
   
   Jose Mª Fedriani Laffitte
   Estacion Biologica de Donana (CSIC)
   Avda. Mª Luisa s/n
   41013-Sevilla
   Spain
   Tel. +34-954232340
   Fax +34-954621125
   http://ebd.csic.es
   
   

Re: [R] Translating lm.object to SQL, C, etc function

2003-02-14 Thread John Fox
Dear Jeremy,

I've written replacements for the standard R contrast functions that 
produce the kind of more easily parsed (and more readable) contrast names 
that I think you have in mind. I intend to include these in the next 
release of the car package for R but haven't done so yet. Since the code 
isn't very long, I've appended it (and the .Rd documentation file) to this 
note. Note that R does separate terms in an interaction with a colon.

I hope that this does what you need.
 John
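
To illustrate the remark about the colon, a small sketch (data set and names are illustrative only): the components of an interaction coefficient can be recovered by splitting its name on ":". This assumes that no variable or level name itself contains a colon.

```r
fit <- lm(breaks ~ wool * tension, data = warpbreaks)

# Split every coefficient name into its per-variable components.
parts <- strsplit(names(coef(fit)), ":", fixed = TRUE)
names(parts) <- names(coef(fit))

parts[["woolB:tensionM"]]
```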

 Contrasts.R -

# last modified 2 Dec 2002 by J. Fox
# all of these functions are adapted from functions in the R base package

contr.Treatment <- function (n, base = 1, contrasts = TRUE) {
    if (is.numeric(n) && length(n) == 1)
        levs <- 1:n
    else {
        levs <- n
        n <- length(n)
    }
    lev.opt <- getOption("decorate.contrasts")
    pre <- if (is.null(lev.opt)) "[" else lev.opt[1]
    suf <- if (is.null(lev.opt)) "]" else lev.opt[2]
    dec <- getOption("decorate.contr.Treatment")
    dec <- if (!contrasts) ""
           else if (is.null(dec)) "T."
           else dec
    contr.names <- paste(pre, dec, levs, suf, sep="")
    contr <- array(0, c(n, n), list(levs, contr.names))
    diag(contr) <- 1
    if (contrasts) {
        if (n < 2)
            stop(paste("Contrasts not defined for", n - 1,
                       "degrees of freedom"))
        if (base < 1 | base > n)
            stop("Baseline group number out of range")
        contr <- contr[, -base, drop = FALSE]
    }
    contr
}

contr.Sum <- function (n, contrasts = TRUE)
{
    if (length(n) <= 1) {
        if (is.numeric(n) && length(n) == 1 && n > 1)
            levels <- 1:n
        else stop("Not enough degrees of freedom to define contrasts")
    }
    else levels <- n
    lenglev <- length(levels)
    lev.opt <- getOption("decorate.contrasts")
    pre <- if (is.null(lev.opt)) "[" else lev.opt[1]
    suf <- if (is.null(lev.opt)) "]" else lev.opt[2]
    dec <- getOption("decorate.contr.Sum")
    dec <- if (!contrasts) ""
           else if (is.null(dec)) "S."
           else dec
    show.lev <- getOption("contr.Sum.show.levels")
    contr.names <- if ((is.null(show.lev)) || show.lev)
        paste(pre, dec, levels, suf, sep="")
    if (contrasts) {
        cont <- array(0, c(lenglev, lenglev - 1), list(levels,
            contr.names[-lenglev]))
        cont[col(cont) == row(cont)] <- 1
        cont[lenglev, ] <- -1
    }
    else {
        cont <- array(0, c(lenglev, lenglev), list(levels,
            contr.names))
        cont[col(cont) == row(cont)] <- 1
    }
    cont
}


contr.Helmert <- function (n, contrasts = TRUE)
{
    if (length(n) <= 1) {
        if (is.numeric(n) && length(n) == 1 && n > 1)
            levels <- 1:n
        else stop("contrasts are not defined for 0 degrees of freedom")
    }
    else levels <- n
    lenglev <- length(levels)
    lev.opt <- getOption("decorate.contrasts")
    pre <- if (is.null(lev.opt)) "[" else lev.opt[1]
    suf <- if (is.null(lev.opt)) "]" else lev.opt[2]
    dec <- getOption("decorate.contr.Helmert")
    dec <- if (!contrasts) ""
           else if (is.null(dec)) "H."
           else dec
    nms <- if (contrasts) 1:lenglev else levels
    contr.names <- paste(pre, dec, nms, suf, sep="")
    if (contrasts) {
        cont <- array(-1, c(lenglev, lenglev - 1), list(levels,
            contr.names[-lenglev]))
        cont[col(cont) <= row(cont) - 2] <- 0
        cont[col(cont) == row(cont) - 1] <- 1:(lenglev - 1)
    }
    else {
        cont <- array(0, c(lenglev, lenglev), list(levels, contr.names))
        cont[col(cont) == row(cont)] <- 1
    }
    cont
}

--- Contrasts.Rd 
--

\name{Contrasts}
\alias{Contrasts}
\alias{contr.Treatment}
\alias{contr.Sum}
\alias{contr.Helmert}

\title{Functions to Construct Contrasts}
\description{
These are substitutes for similarly named functions in the base package
(note the uppercase letter starting the second word in each function 
name).
The only difference is that the contrast functions from the car package
produce easier-to-read names for the contrasts when they are used in 
statistical models.

The functions and this documentation are adapted from the base package.
}

\usage{
contr.Treatment(n, base = 1, contrasts = TRUE)

contr.Sum(n, contrasts = TRUE)

contr.Helmert(n, contrasts = TRUE)
}

\arguments{
  \item{n}{a vector of levels for a factor, or the number of levels.}
  \item{base}{an integer specifying which level is considered the baseline 
level.
Ignored if \code{contrasts} is \code{FALSE}.}
  \item{contrasts}{a logical indicating whether contrasts should be computed.}
}

\details{
These functions are used for creating contrast matrices for use in 
fitting analysis of variance and regression models.
The columns of the resulting matrices contain contrasts which can be 
used for coding a factor with \code{n} levels.
The returned value contains the computed contrasts. If the argument 
\code{contrasts} is \code{FALSE} 


Re: [R] Change array size

2003-02-14 Thread Peter Dalgaard BSA
Liaw, Andy [EMAIL PROTECTED] writes:

 Is the following what you want?
 
  x <- rnorm(800)
  xt <- x[1:2^trunc(log(length(x),base=2))]
  length(xt)
 [1] 512

I don't think so (notice "upper"). More likely

x <- rnorm(800)
l <- length(x)
xt <- c(x,numeric(2^ceiling(log(l,base=2))-l))
length(xt) # 1024

but "fits" might also imply interpolation?
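
A hedged alternative sketch of the zero-padding variant: nextn() from the stats package returns the smallest integer not less than its argument that is a product of the given factors, which avoids relying on a floating-point logarithm. The helper name pad_to_pow2 is hypothetical.

```r
# Pad a vector with zeros up to the next power of 2 (no padding if already one).
pad_to_pow2 <- function(x) {
  target <- nextn(length(x), factors = 2)
  c(x, numeric(target - length(x)))
}

xt <- pad_to_pow2(rnorm(800))
length(xt)
```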

  Hi,
  I would like to know if there is a way to change a vector of 
  arbitrary size
  to make it fits the nearest upper size multiple of a power of 2.


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907




Re: [R] data manipulation function descriptions

2003-02-14 Thread Luke Tierney
On Fri, 14 Feb 2003 [EMAIL PROTECTED] wrote:

 On Thu, 13 Feb 2003, kjetil brinchmann halvorsen wrote:
 
  On 13 Feb 2003 at 17:09, Jason Bond wrote:
 
   case  switch
  [R-core: switch should be better announced. It is, for instance,
  not mentioned in An Introduction to R]
 
 Well, that is an *introduction*, not a programmer's guide.  You will find
 switch() is rarely used in R: it is a bit peculiar in its semantics, and 
 something definitely not to be considered introductory.
 
 On the original question, I think it would be a mistake to translate what
 you know.  R is a vector language, not a pairlist language, and I see
 quite a bit of evidence of convoluted solutions in its internals dating
 from when R was the second.  Chapter 2 of Venables & Ripley (2002) (as in
 the R FAQ) is devoted to using S/R for data manipulation.

As someone reasonably familiar with both languages I have to disagree
with several points here.  First and foremost, despite differences in
surface syntax, as languages xlispstat and R are much more alike than
they are different.  xlispstat is much closer to R than S-plus because
both xlispstat and R use lexical scope, a feature of R that is still
not used as much as it could be.  The main language differences are
the limited form of lazy evaluation used in R, which you can usually
ignore, and the fact that R does not provide mutable data structures,
which is also rarely an issue.  There are other differences, but these
are the main ones that affect coding practices I think.

The basic xlispstat data handling functions mentioned in the original
post are quite similar to corresponding basic functions in R.  This is
not by accident: the choice of functions included in xlispstat was
heavily influenced by what was then called the New S language.  As a
result, if you want to create an R version of an xlispstat function
you can often do far worse than start with a fairly direct
transliteration.  In my view at least, good coding practices in
xlispstat are good coding practices for any high level mostly
functional language and carry over quite well to R.

I am sorry if the following seems a bit harsh, but I, and many others
who have worked with lisp, find it extremely frustrating to read
statements about lisp like the one above that suggest that lisp is a
pairlist language only, especially when these statements come from
people I thought knew better.  Lisp dates back to the 1950's.  The
only other language of any consequence still in use from that era is
FORTRAN.  No one would now claim that a major flaw in FORTRAN is the
lack of an if-then-else construct.  That was true in the early days
but has not been for several decades.  But for some reason many people
seem very happy to very authoritatively make statements about lisp
that, if they were ever true at all, have not been so for a very long
time indeed.  Pairlists are a very useful data structure for
expressing many algorithms in a functional style.  That is why they
were one of the first data structures in Lisp, and that is why they
are available in virtually all other high level functional languages
(ML, Haskell, Miranda, Clean, ...).  Pairlists are NOT the only data
structure in Lisp.  For many years Lisp has also supported vectors and
arrays, both generic and typed (and other data structures).  Vectors
and pairlists are collectively referred to as sequences, and, if I
remember correctly, all the functions listed in the original post
except mapcar are designed to work on all kinds of sequences (the
sequence version of mapcar is map).  Code written in xlispstat in
terms of sequence functions can often be translated quite easily to R,
and the resulting code will be quite consistent with good R coding
practices.
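
As a rough illustration of the functional style described above (an assumption about what such a transliteration might look like, not a statement about any particular xlispstat function), R's own sequence-oriented functions work directly on vectors, and lexical scope makes closures straightforward. All names below are illustrative.

```r
xs <- c(3, 1, 4, 1, 5)

# Apply a function to each element, fold, and filter, all on a plain vector.
squares <- sapply(xs, function(v) v^2)
total   <- Reduce(`+`, xs)
odds    <- Filter(function(v) v %% 2 == 1, xs)

# Lexical scope: the returned function remembers the k it was created with.
make_adder <- function(k) function(v) v + k
add2 <- make_adder(2)
add2(40)
```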

R does not provide a pairlist data structure. This creates a dilemma
when translating some list-based xlispstat code, or, more importantly,
when implementing an algorithm for which pairlists are the natural
data structure to use.  There are two choices: use a vector based
algorithm that may be a bit less natural but fits better with the
basic R data structures, or build your own pairlist abstraction for
this particular problem and write the algorithm the more natural way.
I have used both approaches on different occasions.  I usually prefer
to write an algorithm in the most natural way for the algorithm, since
that usually maximizes the probability that my code is actually
correct.  If this approach requires some additional abstract data
types, be they pairlists or anything else, then I develop and test
them separately and write the main code in terms of these
abstractions.  Occasionally, but not all that often, this results in
code that is slower than I like; then I 

[R] Numeric Coerceing

2003-02-14 Thread Wayne Jones
Does anyone know how to coerce a numeric to a string??

THanks

Wayne


KSS Ltd
A division of Knowledge Support Systems Group plc
Seventh Floor  St James's Buildings  79 Oxford Street  Manchester  M1 6SS  England
Company Registration Number 2800886 (Limited) 3449594 (plc)
Tel: +44 (0) 161 228 0040   Fax: +44 (0) 161 236 6305
mailto:[EMAIL PROTECTED]  http://www.kssg.com





RE: [R] Numeric Coerceing

2003-02-14 Thread Marc Schwartz
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED]] On Behalf Of Wayne Jones
Sent: Friday, February 14, 2003 8:20 AM
To: [EMAIL PROTECTED]
Subject: [R] Numeric Coerceing


Does anyone know how to coerce a numeric to a string??

Thanks

Wayne


See ?as.character

For example:

> y <- 123
> y
[1] 123
> as.character(y)
[1] "123"
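
For completeness, a minimal sketch of a few related base R conversions. The variable names are illustrative only and are not part of the original reply.

```r
y <- 123.456

s1 <- as.character(y)      # general-purpose coercion to a string
s2 <- sprintf("%.1f", y)   # string with a fixed number of decimals
s3 <- paste("y =", y)      # paste() coerces numbers implicitly

# Converting back with as.numeric() recovers the number for this value.
back <- as.numeric(s1)
```

as.character() is usually enough; sprintf() is worth a look when the number of digits in the string matters.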


Regards,

Marc Schwartz




Re: [R] Change array size

2003-02-14 Thread Spencer Graves
then change trunc to ceiling???

Peter Dalgaard BSA wrote:

Liaw, Andy [EMAIL PROTECTED] writes:



Is the following what you want?



x <- rnorm(800)
xt <- x[1:2^trunc(log(length(x),base=2))]
length(xt)


[1] 512



I don't think so (notice upper). More likely

x <- rnorm(800)
l <- length(x)
xt <- c(x,numeric(2^ceiling(log(l,base=2))-l))
length(xt) # 1024

but fits might also imply interpolation?
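
The zero-padding variant above can be wrapped in a small helper. This is only a sketch: the name pad_pow2 is made up here, and it assumes that padding with zeros (rather than interpolating) is what is wanted.

```r
# Hypothetical helper: pad a numeric vector with zeros up to the next power of 2.
pad_pow2 <- function(x) {
  n <- length(x)
  target <- 2^ceiling(log2(n))   # smallest power of 2 that is >= n
  c(x, numeric(target - n))      # append the missing zeros
}

x <- rnorm(800)
xt <- pad_pow2(x)
length(xt)
```

A vector whose length is already a power of 2 is returned unchanged, since target - n is then zero.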



Hi,
I would like to know if there is a way to change a vector of 
arbitrary size
to make it fits the nearest upper size multiple of a power of 2.









[R] pairlists (was: data manipulation function descriptions)

2003-02-14 Thread Warnes, Gregory R


 -Original Message-
 From: Luke Tierney [mailto:[EMAIL PROTECTED]]

 R does not provide a pairlist data structure. This creates a dilemma
 when translating some list-based xlispstat code, or, more
 importantly, when implementing an algorithm for which pairlists are
 the natural data structure to use.
 ...
 Pairlists were and still are used internally for many things. 
 ...

Wouldn't it, therefore, make sense to provide a 'pairlist' package which
exposes the internal pairlist structure and provides appropriate functions
(car, cdr, ...), instead of expecting people to keep re-implementing these
features?
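
To make the discussion concrete, here is a minimal sketch of the kind of home-made pairlist abstraction being talked about. It is built on ordinary R lists, not on R's internal pairlists, and the names cons, car, cdr and cell_length are assumptions chosen to mirror Lisp.

```r
# Hypothetical cons-cell helpers on top of ordinary R lists.
cons <- function(head, tail = NULL) list(car = head, cdr = tail)
car  <- function(cell) cell$car
cdr  <- function(cell) cell$cdr

# Build the linked list (1 2 3).
lst <- cons(1, cons(2, cons(3)))

# Walk the cells until the NULL terminator is reached.
cell_length <- function(cell) {
  n <- 0
  while (!is.null(cell)) {
    n <- n + 1
    cell <- cdr(cell)
  }
  n
}

car(cdr(lst))      # second element
cell_length(lst)   # number of cells
```

Because R lists are copied on modification, this gives the functional (non-destructive) flavour of pairlists only.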

-Greg





Re: [R] programs for genetics - haplo.score for R

2003-02-14 Thread Roger Peng
It would appear that Gregory Warnes has ported it to R and the package
`haplo.score' can be downloaded from CRAN (http://cran.r-project.org).

-roger
___
UCLA Department of Statistics
[EMAIL PROTECTED]
http://www.stat.ucla.edu/~rpeng

On Fri, 14 Feb 2003, Shona Livingstone wrote:

 Dear All,
 
 I wish to use a suite of programs called haplo.score first written in S plus by
 Rowland et al of the Mayo clinic (details given below). Unfortunately, I do not
 have S plus available to me at the moment
 
 Has anyone written the equivalent for R?
 
 Any pointers will be appreciated, bearing in mind that I am new to R.
 
 Thank you for your help
 
 Shona Livingstone
 
 Epidemiology and Public Health, UCL
 
 *
 
 haplo.score
 
 Score Tests for Association of Traits with Haplotypes when
 Linkage Phase is Ambiguous
 
 Charles M. Rowland, David E. Tines, and Daniel J. Schaid
 Mayo Clinic
 Rochester, MN
 E-mail contact: [EMAIL PROTECTED]
 
 [A suite of S-PLUS routines, referred to as haplo.score, can be used to
 compute score statistics to test associations between haplotypes and a wide
 variety of traits, including binary, ordinal, quantitative, and Poisson.
 These methods assume that all subjects are unrelated and that haplotypes
 are ambiguous (due to unknown linkage phase of the genetic markers). The
 methods provide several different global and haplotype-specific tests for
 association, as well as provide adjustment for non-genetic covariates and
 computation of simulation p-values (which may be needed for sparse data).
 Details on the background and theory of the score statistics can be found
 in the following reference:
 
 Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. Score tests for
 association of traits with haplotypes when linkage phase is ambiguous.
 American J Human Genetics, February, 2002.]
 



Re: [R] How to solve A'A=S for A

2003-02-14 Thread Cliff Lunneborg
It is not clear to me that one can. If the singular value decomposition
of A is the triple product P d Q', then the singular value decomposition
of A'A=S is Q d^2 Q'. The information about the orthonormal matrix P is
lost, is it not?
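
The non-uniqueness can be checked numerically. The sketch below is an illustration only: it assumes S is positive definite so that chol() applies, and the matrices are randomly generated rather than taken from the original question.

```r
set.seed(1)
A <- matrix(rnorm(9), 3, 3)
S <- crossprod(A)                        # S = A'A

R <- chol(S)                             # one solution: upper triangular, R'R = S
P <- qr.Q(qr(matrix(rnorm(9), 3, 3)))    # an arbitrary orthogonal matrix
B <- P %*% R                             # another solution, since (PR)'(PR) = R'R

all.equal(crossprod(R), S)
all.equal(crossprod(B), S)
```

Both R and B reproduce S, and neither need equal the original A, which is the point made above: S determines A only up to an orthogonal factor on the left.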
**
Cliff Lunneborg, Professor Emeritus, Statistics 
Psychology, University of Washington, Seattle
Visiting: Melbourne, Feb-May 1999, Brisbane, Jun-Aug 1999,
Sydney, Sep-Nov 1999, Perth, Dec 1999-Feb 2000
[EMAIL PROTECTED]




Re: [R] pairlists (was: data manipulation function descriptions)

2003-02-14 Thread Peter Dalgaard BSA
Warnes, Gregory R [EMAIL PROTECTED] writes:

  -Original Message-
  From: Luke Tierney [mailto:[EMAIL PROTECTED]]
 
  R does not provide a pairlist data structure. This creates a dilemma
  when translating some list-based xlispstat code, or, more
  importantly, when implementing an algorithm for which pairlists are
  the natural data structure to use.
  ...
  Pairlists were and still are used internally for many things. 
  ...
 
 Wouldn't it, therefore, make sense to provide a 'pairlist' package which
 exposes the internal pairlist structure and provides appropriate functions
 (car, cdr, ...), instead of expecting people to keep re-implementing these
 features?

Some ancient consideration pops up here. We do actually expose
pairlists in a few places (try mode(.Options)). Some people consider
that this is a remnant and should be stamped out, but we might also
consider doing what you suggest. 
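
A short sketch of where pairlists show through at the R level, using only base functions. What mode(.Options) reports may depend on the R version, so no output is shown for it.

```r
mode(.Options)                     # the options object mentioned above

pl <- pairlist(a = 1, b = "two")   # construct a pairlist directly
is.pairlist(pl)

v <- as.list(pl)                   # convert to an ordinary list (generic vector)
pl2 <- as.pairlist(v)              # and back again
```

Note that is.list() is not a good discriminator here, so the sketch uses is.pairlist() to tell the two representations apart.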

The big problem with old R was not so much the pairlists but that they
were used for representing objects of mode list so to get to X[[n]]
you had to count through the list from the beginning which killed
performance in some important cases. Then again, adding elements to a
generic vector requires copying the whole thing. Of course all the
legacy S code tended to do the former and not the latter, so generic
vectors ended up winning.

One or two reservations: With full lisp style access, could we end up
with (circular) data structures that confuse the garbage collector?
And might we -- supposing we allowed destructive list modifications --
end up with strange semantics a la the .Alias mess we had for a while?
Of course Luke would be the first to know about this.

Then of course there is the question of reverse compatibility. I don't
consider it much of a loss if R code doesn't run in Splus, but others
might.

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907




Re: [R] Function update problem

2003-02-14 Thread lun li
Dear Professor Ripley,

Thank you very much for your help.

I tested the idea with form <- substitute(.~.+x[,i], list(i=i)); however, the 
problem is still there.  In order to automate model selection, I prefer to 
use update. My regression problem is to select significant predictors from 
several hundred candidates using forward regression with the BIC criterion. In 
each loop, I first find the candidate with the maximum BIC and then update the 
lm fit to get a new model.  So it is convenient to use the update function. I 
would be grateful if you could still help.

In addition, I tried combining update and add1.  But the new problem is the 
error "update.default(model, . ~ . + x) need an object with call component". 
Can you help? The code is:

model <- add1(model,.~.+form)
model.new <- update(model,.~.+x)


With best regards,


Lun





You can use substitute: something like (untested)

for(i in 1:100){
form <- substitute(.~.+x[,i], list(i=i))
model <- update(model, form)
## do something useful in here
}
and you do not need to update unchanged arguments!

However, why are you rewriting add1.default, when there is add1.lm?
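
A minimal sketch of a loop of this kind that can be run as is. It swaps in a different technique from the one suggested above: the candidates are held as named columns of a data frame and each formula is built with as.formula(paste(...)). The data are simulated and all names (dat, x1, x2, x3) are made up for illustration.

```r
set.seed(42)
n <- 50
dat <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- dat$y + 2 * dat$x2

model <- lm(y ~ x1, data = dat)
for (nm in c("x2", "x3")) {
  form <- as.formula(paste(". ~ . +", nm))   # e.g. . ~ . + x2
  model <- update(model, form)               # refit with one more term
}
formula(model)

# add1() returns an anova table of candidate terms, not a fitted model.
tab <- add1(lm(y ~ x1, data = dat), scope = ~ x1 + x2 + x3)
class(tab)
```

The last two lines may explain the "need an object with call component" message reported above: the result of add1() is a table, so it should be inspected to choose a term rather than assigned back to model and passed to update().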

On Thu, 13 Feb 2003, lun li wrote:

 Dear all,

 I am trying to automate model selection for a multiple linear regression
 using the functions lm and update, but I have run into a problem with
 update: it does not work when the variables are supplied as columns of a
 matrix (for example, x is a matrix with 100 regression variables). The
 code is as below:

model <- lm(y~x1,singular.ok=T,na.action=na.omit)
for(i in 1:100){
model <- update(model,.~.+x[,i],singular.ok=T,na.action=na.omit)}

 If the above code is written out as below, I get the correct result.
 However, I must use the loop.

 model <- lm(y~x1,singular.ok=T,na.action=na.omit)
 model <- update(model,.~.+x[,1],singular.ok=T,na.action=na.omit)
 model <- update(model,.~.+x[,2],singular.ok=T,na.action=na.omit)
 ..
 model <- update(model,.~.+x[,100],singular.ok=T,na.action=na.omit)

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


- End forwarded message -



Lun Li
Department of Geology and Geophysics
The University of Edinburgh
Grant InstituteTel. +44(0)131 650 7339
King's Buildings   Fax. +44(0)131 668 3184
West Mains Road			   E-mail:[EMAIL PROTECTED]
Edinburgh EH9 3JW
UK




[R] error in unique(pkgs) (fwd)

2003-02-14 Thread Katalin Csillery
 
 I read your reply as an R mail archive in respect to the installation of
 new R contributed packages on a Mac OS X. 
 I found the same problem, I use OroborOSX and emacs-ESS, I wanted to
 install xtable, ape packages, with the install.packages() command, and I
 got the same error message (naturally: sudo emacs - which avoids the
 argument lib missing error message), 
 
 Error in unique(pkgs) : Object ape not found
 
 Ape is not a base package so it should be available separately.
 
 Any idea?
 
 Thanks for your help in advance!

You really need to use the r-help list for this, I'm not a Mac expert.
(I think you might need to fetch the Stuffit archive from CRAN and
unstuff it manually.)
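
One possible reading of the error message, offered as an assumption rather than a diagnosis: "Object ape not found" is what R reports when a package name is passed unquoted, so that R looks for a variable called ape. The sketch below reproduces that kind of failure without installing anything; the name ape_not_defined is deliberately made up.

```r
# Unquoted: R tries to evaluate a variable and fails before any download starts.
res <- tryCatch(unique(ape_not_defined),
                error = function(e) conditionMessage(e))
res

# Quoted: package names are plain character strings.
pkgs <- unique(c("xtable", "ape"))
pkgs

# install.packages(pkgs)   # not run here: needs network access and a writable library
```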

-p

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907




Re: [R] matrix from sequences

2003-02-14 Thread Miha STAUT
Hi, Miha:

1.  How do I get the GRASS library?  library(GRASS) produced "Error in 
library(GRASS) : There is no package called `GRASS'" for me from R 1.6.2 
for Windows.

I do not know whether it exists for Windows or not, but look under:
http://cran.r-project.org/src/contrib/Devel/
Or visit:
http://grass.itc.it/index.html



2.  I assume there is a typographical error in the last line of your email: 
 If G$xseq and G$yseq are coordinates of points, then length(G$xseq) == 
length(G$yseq)???  In that case, 'as.matrix(G[,c("xseq", "yseq")])' should 
give you what you want.

Or am I missing something?

OK, I really am lousy at explaining things. The length(G$xseq) * 
length(G$yseq) stands because you have to get all the combinations of the 
elements of those two sequences. Get it? If you have:
xseq <- 1:10
yseq <- 1:10
I would like to get:
 x
y [,1] [,2] [,3] [,4] [,5] ...
[1,]
[2,]
[3,]
[4,]
...

or

str(xy)
$x 1,1,1,1,1,1,1,1,1,1,2,2,2,...
$y 1,2,3,4,5,6,7,8,9,10,1,2,3,...
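
A minimal sketch of one way to get both forms with base R, assuming the two sequences are plain vectors as in the example above. expand.grid() varies its first argument fastest, so y is listed first to reproduce the ordering shown in str(xy).

```r
xseq <- 1:10
yseq <- 1:10

# All (x, y) pairs as a data frame, with x changing slowest.
xy <- expand.grid(y = yseq, x = xseq)[, c("x", "y")]
nrow(xy)
head(xy, 3)

# An empty matrix laid out like the map: one row per y, one column per x.
grid <- matrix(NA_real_, nrow = length(yseq), ncol = length(xseq),
               dimnames = list(y = yseq, x = xseq))
dim(grid)
```

as.matrix(xy) turns the data frame of pairs into a two-column matrix if that is the form needed.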


Spencer Graves

Miha STAUT wrote:

Miha STAUT wrote:


Hi all,

I have a data frame with sequences of x and y from a map. I would like 
to know it both ways:
1. How to make a matrix from that;
2. how to make a data frame of all points in a map.

Probably it is a silly question, but please tell me where to read about 
it or tell me how to do it.

Miha Staut


Hi Miha,

1) What is the structure of your data.frame ? Assuming all co-ordinates 
are in the same column (one column for x and one column for y), the 
simplest way to extract them and turn them into a matrix would be:

as.matrix(mydata[ , c("x", "y")])

e.g.:

R> mydata <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))
R> mydata
             x           y          z
1  -0.73735224 -0.51218243 -0.9602624
2  -1.46079091 -0.63634091  1.4967066
3  -0.28574919 -1.30719383 -0.2887403
4   0.04137159  0.61711350 -0.7057102
5   0.03179303  0.05734869 -0.4637660
6  -0.06638058 -0.74565157  0.9239402
7  -0.67611541 -1.01760810 -0.2854017
8   0.34215052  0.30564550  0.6931193
9   0.83597837  0.75443762 -2.3394679
10 -0.14967073 -0.02027512 -0.1143414
R> as.matrix(mydata[ , c("x", "y")])
             x           y
1  -0.73735224 -0.51218243
2  -1.46079091 -0.63634091
3  -0.28574919 -1.30719383
4   0.04137159  0.61711350
5   0.03179303  0.05734869
6  -0.06638058 -0.74565157
7  -0.67611541 -1.01760810
8   0.34215052  0.30564550
9   0.83597837  0.75443762
10 -0.14967073 -0.02027512


2) How are the points stored ? If in a matrix, say mat, with 2 columns 
for x and y, simply:

as.data.frame(mat)

Best,

Renaud



Thanks to both of you (Dr Renaud Lancelot and James Holtman)

I see I formulated the question in a wrong way. I got from GRASS the 
coordinates of a map. There is a package in R named GRASS to connect R 
with GRASS.

library(GRASS)
G <- gmeta() # copy the environment from GRASS

Now G is a data frame also containing $xseq and $yseq, which would be the 
coordinates of all the points in the x and y directions. The final matrix 
should have length(G$xseq) * length(G$yseq) points.

Miha Staut




Re: [R] function editing?

2003-02-14 Thread Sundar Dorai-Raj
See ?fix.

Joshua Gramlich wrote:

Is there a way to edit user defined functions once they've been
created?  For instance, I've a simple function that plots a table, but
I'd like to go back and add more parameters to the barplot call.  Is
there a way to change this function without completely starting from
scratch?  Other than storing the code in a file and re-running it?


Joshua Gramlich
Piocon Technologies
Chicago, Illinois USA




Re: [R] function editing?

2003-02-14 Thread Deepayan Sarkar

See ?edit

If you use ESS, C-c C-d.
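
A small sketch of the two suggestions in this thread. The function myplot is a made-up stand-in for the poster's function, and the editor calls are guarded so the snippet can also be sourced non-interactively.

```r
# A stand-in for the user-defined plotting function.
myplot <- function(tab) barplot(tab)

if (interactive()) {
  fix(myplot)              # opens an editor and saves the result back under the same name
  myplot <- edit(myplot)   # edit() only returns the edited function, so assign it
}
```

The practical difference is that fix() updates the object for you, while the result of edit() is lost unless it is assigned.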

On Friday 14 February 2003 04:07 pm, Joshua Gramlich wrote:
 Is there a way to edit user defined functions once they've been
 created?  For instance, I've a simple function that plots a table, but
 I'd like to go back and add more parameters to the barplot call.  Is
 there a way to change this function without completely starting from
 scratch?  Other than storing the code in a file and re-running it?


 Joshua Gramlich
 Piocon Technologies
 Chicago, Illinois USA

 __
 [EMAIL PROTECTED] mailing list
 http://www.stat.math.ethz.ch/mailman/listinfo/r-help




Re: [R] Translating lm.object to SQL, C, etc function

2003-02-14 Thread j+rhelp
On Fri, 14 Feb 2003 08:06:45 + (GMT), [EMAIL PROTECTED] said:
 The issue here is that coef() tells you the coefficients in R's
 internal parametrization of the model, and that is of no use to you
 unless you have a means of creating a model matrix in C, SQL or (heaven
 forbid) Perl. The information needed to re-create a model matrix is
 stored in the lm fit, but in ways that are going to be hard to use
 anywhere else (since they include R functions).  This is not perverse:
 what R does is very general, *far* more so than SPSS.  Formulae in lm
 can include poly() and ns() terms, for example.

I understand that. And indeed a perfectly general function export is a
very big job. However, once we can export the model into a reasonably
generic textual form, simply including the text name of any R functions
in the export, then users can create special-case translators for the
parts that they need. We try to make this as easy for ourselves as
possible, for instance by doing all required transformations in SQL
(where possible) before importing to R, which means that all the terms in
the linear model are often untransformed variables. The only thing we
don't do in SQL normally is creating the contrasts, since this is
something that SQL is not well suited for.

 The only practical solution it seems to us is to ask R to create the
 model matrix for new data.  Then the things you are talking about are
 just the colnames of that matrix, and don't need to be interpreted.
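
A minimal sketch of that suggestion, using simulated data and made-up variable names (y, x, g). It builds the model matrix for new data and shows that prediction is then just a matrix product with the coefficients.

```r
set.seed(1)
dat <- data.frame(y = rnorm(30), x = rnorm(30),
                  g = factor(rep(c("a", "b", "c"), 10)))
fit <- lm(y ~ x * g, data = dat)

newdat <- data.frame(x = c(0.5, -1),
                     g = factor(c("b", "c"), levels = levels(dat$g)))

# Model matrix for the new data, using the factor levels stored in the fit.
X <- model.matrix(delete.response(terms(fit)), newdat, xlev = fit$xlevels)
colnames(X)                        # same labels as names(coef(fit))

pred <- drop(X %*% coef(fit))      # prediction as a plain matrix product
all.equal(unname(pred), unname(predict(fit, newdat)))
```

Once X exists, the column labels never need to be parsed; the hard part outside R is reproducing X itself.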

Yes, that makes things pretty easy then, but it's not an option in all
cases. We need to embed our models into C code. Previously we had a
routine to take the SPSS output, convert it into C code, and then
recompile the C code into our simulation. The linear model is utilised in
the inner loop of the simulation so needs to be very fast; CORBA or SOAP
calls to uncompiled code in the inner loop slow things down a great deal.
In addition, the simulation is accessed by many people - requiring all of
them to install R would make the roll-out procedure much more complex.
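
For the simplest case, where every term is an untransformed numeric variable, the export can be sketched in a few lines. This is an illustration under that assumption only: it uses the built-in mtcars data, and the C function name predict_mpg is made up.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
b <- coef(fit)

# Arithmetic expression that is valid in both C and R.
body_expr <- paste(c(sprintf("(%.17g)", b[1]),
                     sprintf("(%.17g) * %s", b[-1], names(b)[-1])),
                   collapse = " + ")

# Wrap it in a C function definition, held as a string.
c_src <- sprintf("double predict_mpg(double wt, double hp) {\n    return %s;\n}\n",
                 body_expr)
cat(c_src)

# Sanity check inside R: the generated expression reproduces the fitted values.
all.equal(eval(parse(text = body_expr), mtcars), unname(fitted(fit)))
```

Factors, interactions and transformed terms are not handled here; they need the contrast information discussed in the rest of this thread.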

 You may want to read the sources to find out how R does it: that area
 is one of the most complex parts of the internals, and one in which
 bugs continue to emerge.

I'm glad to hear it is considered complex! ;-) I've actually been reading
that bit of the code quite a bit over the last two days and haven't been
getting that far. My lack of familiarity with the language, combined with
the lack of comments in that section of code, and the very
concise/non-descriptive variable names often used in the code, make this
even harder. Still, it's a useful exercise for learning more about the
language. 

  The difficulty I am having is that the output of coef() is not really
  parsable, since there is no marker in the name of a coefficient to
  separate out the components. For instance, in SPSS the name of a
  coefficient might be:
  coefficient might be:
 
var1=[a]*var2=[b]*var3
 
  ...which is easy to write a little script to pull that apart and
  turn it into a line of SQL, C, or whatever. In S however the name
  looks like:
 
var1avar2bvar3
 
  ...which provides no way to pull the bits apart.

 I find that impossible to understand anyway, but doubt that it
 corresponds to SPSS.  For a variable V, label Va does not mean V=[a]
 except in unusual special cases.

I should firstly mention that I got this slightly wrong - I showed above
the SPLUS output, not the R output. R actually looks like this:
  var1a:var2b:var3

The ':'s certainly help a lot, but still there's the problem of handling
factor levels, which are concatenated with the variable name without a
delimiter (at least, in all the linear models I've run so far, this is
the case).
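
One way to work around the missing delimiter, sketched under the assumption of default treatment contrasts: use the xlevels component of the fit to enumerate every possible "variable + level" label, then look each piece of a coefficient name up in that table. The data and the helper names (lookup, parse_coef) are made up for illustration.

```r
dat <- data.frame(
  y    = c(1.2, 0.7, 3.1, 2.4, 2.9, 4.0, 1.1, 3.3, 2.2),
  var1 = factor(rep(c("a", "b", "c"), 3)),
  var3 = c(0.5, 1.5, 2.5, 1.0, 2.0, 3.0, 0.2, 2.8, 1.7)
)
fit <- lm(y ~ var1 * var3, data = dat)

# Every label that a factor can contribute to a coefficient name.
lookup <- do.call(rbind, lapply(names(fit$xlevels), function(v) {
  data.frame(label = paste0(v, fit$xlevels[[v]]), variable = v,
             level = fit$xlevels[[v]], stringsAsFactors = FALSE)
}))

# Split a coefficient name on ":" and resolve each piece through the table.
parse_coef <- function(name) {
  parts <- strsplit(name, ":", fixed = TRUE)[[1]]
  hit <- match(parts, lookup$label)
  data.frame(variable = ifelse(is.na(hit), parts, lookup$variable[hit]),
             level = lookup$level[hit], stringsAsFactors = FALSE)
}

parsed <- lapply(names(coef(fit)), parse_coef)
names(parsed) <- names(coef(fit))
parsed[["var1b:var3"]]
```

The ambiguity mentioned above does not go away: a numeric variable that happens to be called var1b would be misread as level b of var1.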

I think with all the great feedback and ideas I've got so far on the list
and in private mail (thanks everyone!) I have enough information to make
a start. If I create anything that might be more generally useful I'll
post back of course.

Many thanks,
  Jeremy
-- 
  Jeremy Howard
  [EMAIL PROTECTED]




[R] using locator with xyplot() result

2003-02-14 Thread Vadim Ogranovich
Dear R-Users,

Is there a way to interactively get location of a point on a graph produced
by xyplot() of lattice package (similar to what locator() does with a
regular plot)?

Thanks, Vadim




Re: [R] using locator with xyplot() result

2003-02-14 Thread Deepayan Sarkar

No.

On Friday 14 February 2003 04:37 pm, Vadim Ogranovich wrote:
 Dear R-Users,

 Is there a way to interactively get location of a point on a graph produced
 by xyplot() of lattice package (similar to what locator() does with a
 regular plot)?

 Thanks, Vadim




Re: [R] Translating lm.object to SQL, C, etc function

2003-02-14 Thread j+rhelp
On Fri, 14 Feb 2003 07:53:46 -0500, John Fox [EMAIL PROTECTED] said:
 I've written replacements for the standard R contrast functions that 
 produce the kind of more easily parsed (and more readable) contrast names 
 that I think you have in mind. I intend to include these in the next 
 release of the car package for R but haven't done so yet. Since the code 
 isn't very long, I've appended it (and the .Rd documentation file to this 
 note). Note that R does separate terms in an interaction with a colon.
 
 I hope that this does what you need.
...
 ##  Coefficients:
 ##             (Intercept)                  income               education
 ##                2.275753                0.003522                1.713275
 ##            type[T.prof]              type[T.wc]     income:type[T.prof]
 ##               15.351896              -33.536652               -0.002903
 ##       income:type[T.wc]  education:type[T.prof]    education:type[T.wc]
 ##               -0.002072                1.387809                4.290875
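
Names in this bracketed style can be pulled apart with a regular expression. The sketch below assumes exactly the "variable[T.level]" pattern shown in the output above; parse_piece and parse_name are hypothetical helper names, not part of the car package.

```r
# Split one piece such as "type[T.prof]" into variable and level.
parse_piece <- function(piece) {
  m <- regmatches(piece, regexec("^(.+)\\[T\\.(.+)\\]$", piece))[[1]]
  if (length(m) == 3) {
    c(variable = m[2], level = m[3])
  } else {
    c(variable = piece, level = NA_character_)   # plain numeric term
  }
}

# Split a full coefficient name on ":" and parse each piece.
parse_name <- function(name) {
  pieces <- strsplit(name, ":", fixed = TRUE)[[1]]
  t(sapply(pieces, parse_piece))
}

res <- parse_name("income:type[T.prof]")
res
```

A level name that itself contains ":" would still confuse the split, so this is a convenience rather than a full parser.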

Yes, it's perfect. Thanks so much (and also thanks for your really
readable and useful book, including web appendices)!

Regards,
  Jeremy
-- 
  Jeremy Howard
  [EMAIL PROTECTED]




Re: [R] time series missing 0 counts

2003-02-14 Thread Dirk Eddelbuettel
On Fri, Feb 14, 2003 at 11:21:33AM +, [EMAIL PROTECTED] wrote:
 Yes, there is an easy way.  Create the regular time series you want by 
 something like
 
 x <- ts(0, start=c(2000,52), end=c(2003,9), frequency=52)
 
 and fill in the time points you have data for by
 
 xYear <- trunc(time(x)); xWeek <- cycle(x)
 attach(mydata)
 x[(xYear==year) & (xWeek==Week)] <- Count
 detach()

But won't this fail for weeks numbered 0 or 53 (as both of those are
outside the ts() range specified above)?

As 52*7=364 is different from the number of days in a year, each year is
bound to have one of those unless the data is pre-scrubbed.
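
A self-contained sketch of the expansion on the small example from the original question. It assumes, as the poster states, that week numbers have already been folded into 1..52; the helper idx and the data frame full are made-up names, and the approach (a linear week index plus match()) is a substitute for the indexing shown above, not a transcription of it.

```r
mydata <- data.frame(Year  = c(2000, 2001, 2001, 2001, 2001),
                     Week  = c(52, 1, 2, 5, 7),
                     Count = c(2, 5, 7, 4, 2))

# Linear index of a (year, week) pair, assuming exactly 52 weeks per year.
idx <- function(year, week) year * 52 + week

all_idx <- seq(idx(2000, 52), idx(2001, 7))   # every week in the observed span

full <- data.frame(Year = (all_idx - 1) %/% 52,
                   Week = (all_idx - 1) %% 52 + 1,
                   Count = 0)

# Fill in the weeks that have data; the rest stay at zero.
full$Count[match(idx(mydata$Year, mydata$Week), all_idx)] <- mydata$Count
full

x <- ts(full$Count, start = c(2000, 52), frequency = 52)
```

Data that still contain week 0 or week 53 would need to be recoded first, which is the concern raised above.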

Dirk


 Easy!
 
 On Fri, 14 Feb 2003, Schnitzler, Johannes wrote:
 
   I have several large data sets with counts per week. 
   (Maximum week per year is 52. Counts from Week 53
   are added to week 52.) 
   
   A data set contains for example:
   
   Year  Week    Count
   2000  52  2
   2001  1   5
   2001  2   7
   2001  5   4
   2001  7   2
   ...   ... ...
   ...   ... ...
   
   Weeks with 0 counts are not listed in the data set.
   I want to perform time series analysis (frequency 52).
   
   
   Is there an easy way to expand the data set to:
   
   Year  Week    Count
   2000  52  2
   2001  1   5
   2001  2   7
   2001  3   0
   2001  4   0
   2001  5   4
   2001  6   0
   2001  7   2
   ...   ... ...
   ...   ... ...
   
   or is there already a function in ts, which i have not found so far,
   to deal with this problem?
   
   
   Thank you very much.
   
   Johannes Schnitzler
   Germany Berlin
   

  
  
  
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 
 

-- 
Prediction is very difficult, especially about the future. 
 -- Niels Bohr
