Re: [R] Running R function as a Batch process

2007-05-16 Thread Hanke, Alex
Hi,
There are many clues in the help.
First I created the file c:\sumfunction.R

x <- as.numeric(commandArgs()[-1:-4])
print(x)
addtogether <- function(x, y){ SUM <- x + y; print(SUM) }
addtogether(x[1],x[2])

Then at the command line in Windows I enter

R --vanilla --slave --args 7 10 < c:\sumfunction.R > c:\logout.txt
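
As a variant (a sketch, not part of the original reply): R CMD BATCH also accepts an
--args string, and in R versions that have commandArgs(trailingOnly = TRUE) the script
does not need to count the leading arguments by hand:

R CMD BATCH "--args 7 10" c:\sumfunction.R c:\logout.txt

# inside sumfunction.R
x <- as.numeric(commandArgs(trailingOnly = TRUE))
print(sum(x))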

Regards
Alex

 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of d. sarthi
maheshwari
Sent: May 16, 2007 8:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Running R function as a Batch process

Hi,

I am struggling with using R CMD BATCH command. Kindly suggest solution
to the following problem.

I have a function named CinC which accepts two input parameters. It can be
shown as:

CinC <- function(start, end)

where start and end both are character strings.

Please suggest how I can run this function using the command R CMD BATCH.

Currently I am trying to run it like this: R CMD BATCH c:/tt/CinC.r
c:/tt/h.Rout -20070416 08:41 -20070416 10:33

What is wrong/incorrect with it?

Your suggestions are important to me. Kindly reply.
Thanks in advance.

Regards
Divya Sarthi

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Error using newdata argument in survfit

2005-06-16 Thread Hanke, Alex
Hi Brian,
The factor Prior.f has 5 levels (1,2,3,4,5) which coxph deals with by
creating 4 dummy variables coded with 1 or zero. That's what I see when I
look at fit$x.
fit$x[1:5,]
   Week LagAOO factor(Prior.f)2 factor(Prior.f)3 factor(Prior.f)4 factor(Prior.f)5
31   22      0                0                0                0                0
32   22      0                0                0                0                0
33   22      2                0                0                0                0
34   22      3                0                0                0                0
35   22      2                0                0                0                0
I have played with the formula a bit adding a term at a time and then
checking to see if I can produce the survival curves for pseudo cohorts. I
get as far as the Prior.f term and am successful if I treat it as a
continuous variable. If I introduce it as a factor and assume it wants four
dummy variables as above I get the variable lengths error. If I represent
the term with one variable:
survfit(fit,list(Week=c(15,15),LagAOO=c(0,0),Prior.f=c(1,2)))
I get:
Error in x2 %*% coef : non-conformable arguments
Which is a nice change but still short of knowing what is going on.
 Regards
Alex 

-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
Sent: June 15, 2005 4:24 PM
To: Hanke, Alex
Cc: 'r-help@stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

You appear to have a coding for prior.f in newdata rather than the factor 
itself.

It's a bit hard to be sure when we don't have data8 to compare with.

On Wed, 15 Jun 2005, Hanke, Alex wrote:

 Dear R-helpers,
 To get curves for a pseudo cohort other than the one centered at the mean
of
 the covariates, I have been trying to use the newdata argument to survfit
 with no success. Here is my model statement, the newdata and the ensuing
 error. What am I doing wrong?

 summary(fit)
 Call:
coxph(formula = Surv(Start, Stop, Event, type = "counting") ~
    Week + LagAOO + Prior.f + cluster(interaction(Station, Year)),
    data = data8, method = "breslow", x = T, y = T)

  n= 1878
             coef exp(coef) se(coef) robust se     z       p
Week      0.00582      1.01   0.0323    0.0239 0.244 8.1e-01
LagAOO    0.71929      2.05   0.1238    0.1215 5.918 3.3e-09
Prior.f2  0.12927      1.14   0.4402    0.4025 0.321 7.5e-01
Prior.f3  0.79082      2.21   0.5484    0.4460 1.773 7.6e-02
Prior.f4  2.04189      7.71   0.6008    0.4685 4.358 1.3e-05
Prior.f5  1.20450      3.34   0.6423    0.5481 2.198 2.8e-02

 exp(coef) exp(-coef) lower .95 upper .95
 Week  1.01  0.994 0.960  1.05
 LagAOO2.05  0.487 1.618  2.61
 Prior.f2  1.14  0.879 0.517  2.50
 Prior.f3  2.21  0.453 0.920  5.29
 Prior.f4  7.71  0.130 3.076 19.30
 Prior.f5  3.34  0.300 1.139  9.76

 Rsquare= 0.047   (max possible= 0.25 )
 Likelihood ratio test= 91  on 6 df,   p=0
 Wald test= 209  on 6 df,   p=0
 Score (logrank) test = 142  on 6 df,   p=0,   Robust = 17.4  p=0.00803

  (Note: the likelihood ratio and score tests assume independence of
 observations within a cluster, the Wald and robust score tests do
not).

newdat
      Week   LagAOO Prior.f2 Prior.f3 Prior.f4 Prior.f5
1 17.55218 1.191693        1        0        0        0
2 17.55218 1.191693        0        0        0        0

 survfit(fit,newdata=newdat)
 Error in model.frame(formula, rownames, variables, varnames, extras,
 extranames,  :
variable lengths differ
 In addition: Warning message:
 'newdata' had 2 rows but variable(s) found have 1878 rows

 Regards,
 Alex


 Alex Hanke
 Department of Fisheries and Oceans
 St. Andrews Biological Station
 531 Brandy Cove Road
 St. Andrews, NB
 Canada
 E5B 2L9



   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error using newdata argument in survfit

2005-06-16 Thread Hanke, Alex
Thanks to Thomas, Brian and Ales,
whose advice led me to the solution. I actually had a second problem that kept
Thomas' solution:
survfit(fit, list(Week=c(15,15), LagAOO=c(0,0), Prior.f=factor(c(1,2), levels=1:5)))
from working. In the model statement I created the factor from Prior.f via
factor(Prior.f). Instead, one should predefine the factor variable,
Prior.f <- factor(Prior.f), use that term in the model, and then Thomas'
solution works fine.
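
For reference, a condensed sketch of that fix (data8 and the covariates are the
poster's data, not reproduced here; the survival package is assumed):

library(survival)
## convert Prior.f to a factor *before* fitting, so the model term is the
## factor itself rather than factor(Prior.f)
data8$Prior.f <- factor(data8$Prior.f, levels = 1:5)
fit <- coxph(Surv(Start, Stop, Event, type = "counting") ~
               Week + LagAOO + Prior.f + cluster(interaction(Station, Year)),
             data = data8, method = "breslow", x = TRUE, y = TRUE)
## newdata now supplies the factor with all five levels declared
survfit(fit, list(Week = c(15, 15), LagAOO = c(0, 0),
                  Prior.f = factor(c(1, 2), levels = 1:5)))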
Alex

-Original Message-
From: Thomas Lumley [mailto:[EMAIL PROTECTED] 
Sent: June 16, 2005 11:00 AM
To: Hanke, Alex
Cc: 'r-help@stat.math.ethz.ch'
Subject: Re: [R] Error using newdata argument in survfit

On Thu, 16 Jun 2005, Hanke, Alex wrote:

 Hi Brian,
 The factor Prior.f has 5 levels (1,2,3,4,5) which coxph deals with by
 creating 4 dummy variables coded with 1 or zero. That's what I see when I
 look at fit$x.
fit$x[1:5,]
   Week LagAOO factor(Prior.f)2 factor(Prior.f)3 factor(Prior.f)4 factor(Prior.f)5
31   22      0                0                0                0                0
32   22      0                0                0                0                0
33   22      2                0                0                0                0
34   22      3                0                0                0                0
35   22      2                0                0                0                0
 I have played with the formula a bit adding a term at a time and then
 checking to see if I can produce the survival curves for pseudo cohorts. I
 get as far as the Prior.f term and am successful if I treat it as a
 continuous variable. If I introduce it as a factor and assume it wants
four
 dummy variables as above I get the variable lengths error. If I represent
 the term with one variable:
 survfit(fit,list(Week=c(15,15),LagAOO=c(0,0),Prior.f=c(1,2)))
 I get:
 Error in x2 %*% coef : non-conformable arguments

Yes, but it wants a factor with *5* levels.  Try
survfit(fit, list(Week=c(15,15), LagAOO=c(0,0), Prior.f=factor(c(1,2), levels=1:5)))

-thomas


 Which is a nice change but still short of knowing what is going on.
 Regards
 Alex

 -Original Message-
 From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
 Sent: June 15, 2005 4:24 PM
 To: Hanke, Alex
 Cc: 'r-help@stat.math.ethz.ch'
 Subject: Re: [R] Error using newdata argument in survfit

 You appear to have a coding for prior.f in newdata rather than the factor
 itself.

 It's a bit hard to be sure when we don't have data8 to compare with.

 On Wed, 15 Jun 2005, Hanke, Alex wrote:

 Dear R-helpers,
 To get curves for a pseudo cohort other than the one centered at the mean
 of
 the covariates, I have been trying to use the newdata argument to survfit
 with no success. Here is my model statement, the newdata and the ensuing
 error. What am I doing wrong?

 summary(fit)
 Call:
coxph(formula = Surv(Start, Stop, Event, type = "counting") ~
    Week + LagAOO + Prior.f + cluster(interaction(Station, Year)),
    data = data8, method = "breslow", x = T, y = T)

  n= 1878
             coef exp(coef) se(coef) robust se     z       p
Week      0.00582      1.01   0.0323    0.0239 0.244 8.1e-01
LagAOO    0.71929      2.05   0.1238    0.1215 5.918 3.3e-09
Prior.f2  0.12927      1.14   0.4402    0.4025 0.321 7.5e-01
Prior.f3  0.79082      2.21   0.5484    0.4460 1.773 7.6e-02
Prior.f4  2.04189      7.71   0.6008    0.4685 4.358 1.3e-05
Prior.f5  1.20450      3.34   0.6423    0.5481 2.198 2.8e-02

 exp(coef) exp(-coef) lower .95 upper .95
 Week  1.01  0.994 0.960  1.05
 LagAOO2.05  0.487 1.618  2.61
 Prior.f2  1.14  0.879 0.517  2.50
 Prior.f3  2.21  0.453 0.920  5.29
 Prior.f4  7.71  0.130 3.076 19.30
 Prior.f5  3.34  0.300 1.139  9.76

 Rsquare= 0.047   (max possible= 0.25 )
 Likelihood ratio test= 91  on 6 df,   p=0
 Wald test= 209  on 6 df,   p=0
 Score (logrank) test = 142  on 6 df,   p=0,   Robust = 17.4  p=0.00803

  (Note: the likelihood ratio and score tests assume independence of
 observations within a cluster, the Wald and robust score tests do
 not).

newdat
      Week   LagAOO Prior.f2 Prior.f3 Prior.f4 Prior.f5
1 17.55218 1.191693        1        0        0        0
2 17.55218 1.191693        0        0        0        0

 survfit(fit,newdata=newdat)
 Error in model.frame(formula, rownames, variables, varnames, extras,
 extranames,  :
variable lengths differ
 In addition: Warning message:
 'newdata' had 2 rows but variable(s) found have 1878 rows

 Regards,
 Alex


 Alex Hanke
 Department of Fisheries and Oceans
 St. Andrews Biological Station
 531 Brandy Cove Road
 St. Andrews, NB
 Canada
 E5B 2L9



  [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R

[R] Error using newdata argument in survfit

2005-06-15 Thread Hanke, Alex
Dear R-helpers,
To get curves for a pseudo cohort other than the one centered at the mean of
the covariates, I have been trying to use the newdata argument to survfit
with no success. Here is my model statement, the newdata and the ensuing
error. What am I doing wrong?

 summary(fit)
Call:
coxph(formula = Surv(Start, Stop, Event, type = "counting") ~
    Week + LagAOO + Prior.f + cluster(interaction(Station, Year)),
    data = data8, method = "breslow", x = T, y = T)

  n= 1878 
             coef exp(coef) se(coef) robust se     z       p
Week      0.00582      1.01   0.0323    0.0239 0.244 8.1e-01
LagAOO    0.71929      2.05   0.1238    0.1215 5.918 3.3e-09
Prior.f2  0.12927      1.14   0.4402    0.4025 0.321 7.5e-01
Prior.f3  0.79082      2.21   0.5484    0.4460 1.773 7.6e-02
Prior.f4  2.04189      7.71   0.6008    0.4685 4.358 1.3e-05
Prior.f5  1.20450      3.34   0.6423    0.5481 2.198 2.8e-02

 exp(coef) exp(-coef) lower .95 upper .95
Week  1.01  0.994 0.960  1.05
LagAOO2.05  0.487 1.618  2.61
Prior.f2  1.14  0.879 0.517  2.50
Prior.f3  2.21  0.453 0.920  5.29
Prior.f4  7.71  0.130 3.076 19.30
Prior.f5  3.34  0.300 1.139  9.76

Rsquare= 0.047   (max possible= 0.25 )
Likelihood ratio test= 91  on 6 df,   p=0
Wald test= 209  on 6 df,   p=0
Score (logrank) test = 142  on 6 df,   p=0,   Robust = 17.4  p=0.00803

  (Note: the likelihood ratio and score tests assume independence of
 observations within a cluster, the Wald and robust score tests do not).

newdat
      Week   LagAOO Prior.f2 Prior.f3 Prior.f4 Prior.f5
1 17.55218 1.191693        1        0        0        0
2 17.55218 1.191693        0        0        0        0

 survfit(fit,newdata=newdat)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  : 
variable lengths differ
In addition: Warning message: 
'newdata' had 2 rows but variable(s) found have 1878 rows 

Regards,
Alex


Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Survfit,newdata and continuous time varying covariates

2005-06-14 Thread Hanke, Alex
Hi All,
I am working with three time varying covariates in a coxph model. I cannot
seem to figure out how to use survfit and the newdata argument to provide
estimated survival curves for two scenarios of one covariate while holding
the other two at the mean value.
Is it possible to display how estimated survival depends on alternative
scenarios/profiles of a time varying covariate?


Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Estimate of baseline hazard in survival

2005-06-10 Thread Hanke, Alex
Dear All,
I'm having just a little terminology problem, relating the language used in
the Hosmer and Lemeshow text on Applied Survival Analysis to that of the
help that comes with the survival package.

I am trying to back out the values for the baseline hazard, h_o(t_i), for
each event time or observation time.
Now survfit(fit)$surv gives me the value of the  survival function,
S(t_i|X_i,B), using mean values of the covariates and the coxph() object
provides me with the estimate of the linear predictors, exp(X'B).
If S(t_i|X_i,B) = S_o(t_i)^exp(X_iB) is the expression for the survival
function,
and
-ln(S_o(t_i)) is the expression for the cumulative baseline hazard
function, H_o(t_i),
then by rearranging the expression for the survival function I get the
following:
-ln(S_o(t_i)) = -ln( S(t_i|X_i,B) ) / exp(X_iB)
              = basehaz(fit)/exp(fit$linear.predictors)
Am I right so far and is there an easier way?
The plot of the cumulative baseline hazard function, H_o(t_i), should be
linear across time. Once I have H_o(t_i), to get at h_o(t_i) I then need
to reverse the cumsum operation. The corresponding plot should have a
constant baseline hazard over time.
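
For illustration, a minimal sketch of that calculation using the survival package's
lung example data (not the poster's data); basehaz()'s centered argument controls
whether the baseline refers to the mean covariate values:

library(survival)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
H0  <- basehaz(fit, centered = FALSE)   # cumulative baseline hazard H_o(t_i)
h0  <- diff(c(0, H0$hazard))            # reverse the cumsum to get h_o(t_i)
plot(H0$time, H0$hazard, type = "s", xlab = "time", ylab = "H_0(t)")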

I am aware of cox.zph() for testing the proportionality of hazards
assumption.

Thanks
Alex





Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Problem installing packages and weird R site behaviour

2005-03-29 Thread Hanke, Alex
Hi,
I tried to install a package using the menu option and was presented a list
filled with NA's.
I then tried visiting the R site and each option on the side bar (eg. CRAN,
Search,FAQ) sends me to the address attached below (NB: I left off the h in
http).
The first problem seems to be related to the second.
Is anyone else experiencing this behaviour and how do I restore normal
behaviour?
Regards
Alex

Problem 1
local({a <- CRAN.packages()
+ install.packages(select.list(a[,1],,TRUE), .libPaths()[1], available=a,
dependencies=TRUE)})
trying URL `http://cran.r-project.org/bin/windows/contrib/2.0/PACKAGES'
Content type `text/html' length unknown
opened URL
downloaded 953 bytes

Problem 2
{ttp://64.235.246.142/?a_id=794&adultfilter=on&domainname=r-project.org}

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Hexidecimal conversion

2004-12-02 Thread Hanke, Alex

Help
I can produce the hexadecimal equivalent of a decimal number but I am having
a hard time reversing the operation. I'm good for hex representations to 159
and am close to extending to 2559. The archives are not clear on the
existence of a function for this task. Is there one?
Here is what I have got so far:
#Good for hex values to 9F
as.decmode <- function(x){
  x <- as.character(x)              # expects two-digit hex strings, e.g. "9F"
  hexDigit <- c(0:9, LETTERS[1:6])
  z <- matrix(c(strsplit(x, NULL), recursive=T),
              length(x), 2, byrow=T)
  z.1 <- as.numeric(z[,1])          # first digit must be 0-9, hence the 9F limit
  z.2 <- match(z[,2], hexDigit) - 1
  dec <- 16*z.1 + z.2
  return(dec)
  }
Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Hexidecimal conversion

2004-12-02 Thread Hanke, Alex
Thanks to Patrick Burns, Peter Wolf and Duncan Murdoch who all provided me
with workable solutions to the hexadecimal conversion problem. They all work
and basically differ in the number of bells and whistles. 
Alex

-Original Message-
From: Duncan Murdoch [mailto:[EMAIL PROTECTED] 
Sent: December 2, 2004 9:42 AM
To: Hanke, Alex
Subject: Re: [R] Hexidecimal conversion


On Wed, 01 Dec 2004 15:07:16 -0400, Hanke, Alex
[EMAIL PROTECTED] wrote :


Help
I can produce the hexadecimal equivalent of a decimal number but I am
having
a hard time reversing the operation. I'm good for hex representations to
159
and am close to extending to 2559. The archives are not clear on the
existence of a function for this task. Is there one?

I don't think so.

Here is what I have got so far:
#Good for hex values to 9F
as.decmode <- function(x){
  x <- as.character(x)              # expects two-digit hex strings, e.g. "9F"
  hexDigit <- c(0:9, LETTERS[1:6])
  z <- matrix(c(strsplit(x, NULL), recursive=T),
              length(x), 2, byrow=T)
  z.1 <- as.numeric(z[,1])          # first digit must be 0-9, hence the 9F limit
  z.2 <- match(z[,2], hexDigit) - 1
  dec <- 16*z.1 + z.2
  return(dec)
  }

I think what you're missing is a loop over the characters.  You can
probably vectorize this to make it more efficient, but here's a
sketch:

hex2numeric <- function(x) {
    hexDigits <- c(0:9, LETTERS[1:6])
    chars <- strsplit(toupper(x), split=NULL)
    result <- rep(0, length(chars))
    for (i in seq(along=chars)) {
        for (j in seq(along=chars[[i]]))
            result[i] <- 16*result[i] + match(chars[[i]][j], hexDigits) - 1
    }
    result
}
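
For example (hypothetical input values), the sketch above gives:

hex2numeric(c("9f", "A0", "FF"))
[1] 159 160 255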

Duncan Murdoch

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Google for scientists: search engine

2004-11-30 Thread Hanke, Alex

A new way to search for documentation on your favourite science topic.
http://scholar.google.com/

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] RODBC and Table views

2004-11-22 Thread Hanke, Alex
This question relates to the use of the RODBC package for retrieving data
from a MS Access database. It is quite easy to retrieve data sitting in
tables but is it possible to select from views based on these tables? The
archives do not touch on this point.
sqlTables() lets me see tables and views but only the tables yield data. Do
I need to recreate these views on the R side of the connection?
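
For context, a minimal sketch of the setup being described (hypothetical .mdb path
and view name):

library(RODBC)
ch  <- odbcConnectAccess("c:/data/mydb.mdb")     # hypothetical database
sqlTables(ch)                                    # lists both tables and views
dat <- sqlQuery(ch, "SELECT * FROM MyView")      # attempt to select from a saved view
odbcClose(ch)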
Regards
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] isoMDS

2004-09-09 Thread Hanke, Alex
I get the following message:
Error in isoMDS(tt) : zero or negative distance between objects 1 and 2
This makes sense since a and b are identical in their relationship to c to
h. 
Drop row 1 and col 1 and you get
 isoMDS(tt[2:8,2:8])
initial  value 14.971992 
iter   5 value 8.027815
iter  10 value 4.433377
iter  15 value 3.496364
iter  20 value 3.346726
final  value 3.233738 
converged
$points
   [,1]   [,2]
[1,] -2.3143653 -0.1259226
[2,] -0.3205746 -1.1534662
[3,] -2.8641922 -0.1182906
[4,]  0.7753674  0.1497328
[5,] -0.5705552  1.2416843
[6,]  2.2305175 -0.6995917
[7,]  3.0638025  0.7058540

$stress
[1] 3.233738

Does this help?

-Original Message-
From: Doran, Harold [mailto:[EMAIL PROTECTED] 
Sent: September 9, 2004 8:26 AM
To: Jari Oksanen
Cc: Doran, Harold; Prof Brian Ripley; R-News
Subject: RE: [R] isoMDS


Thank you. I use the same matrix on cmdscale as I did with isoMDS. I have
reproduced my steps below for clarification if this happens to shed any
light.
 
Here is the original total matrix (see opening thread if you care how this
is created)
 
  a b c d e f g h
a 4 4 2 4 1 2 0 0
b 4 4 2 4 1 2 0 0
c 2 2 4 2 3 2 2 1
d 4 4 2 4 1 2 0 0
e 1 1 3 1 4 3 3 2
f 2 2 2 2 3 4 2 1
g 0 0 2 0 3 2 4 3
h 0 0 1 0 2 1 3 4
 
So, there are 8 items. This matrix indicates that items 1,2, and 4 were
always grouped together (or viewed as being similar by individuals). I
transformed this using 
 
tt <- max(t) - t
 
which results in
  a b c d e f g h
a 0 0 2 0 3 2 4 4
b 0 0 2 0 3 2 4 4
c 2 2 0 2 1 2 2 3
d 0 0 2 0 3 2 4 4
e 3 3 1 3 0 1 1 2
f 2 2 2 2 1 0 2 3
g 4 4 2 4 1 2 0 1
h 4 4 3 4 2 3 1 0
 
When I run isoMDS on this new matrix, it tells me to specify the initial
config because of the NA/INFs/ But when I perform cmdscale on this same
matrix I end up with the following results,
 
bt <- cmdscale(tt); bt
 
 [,1]   [,2]
a -1.79268634 -0.2662750
b -1.79268634 -0.2662750
c -0.02635497  0.5798934
d -1.79268634 -0.2662750
e  1.08978620  0.6265313
f -0.02635497  0.5798934
g  2.20852966  0.2828937
h  2.13245309 -1.2703869
 
The results suggest that items 1,2, and 4 have similar locations as is
expected. Also items 3 and 6 have similar locations as would also be
expected. So, my results seem to have been replicated correctly using
cmdscale. 
 
I've tried to specify an initial config using isoMDS in a few ways without
success, so I am surely doing something wrong. So far, I have tried the
following:
 
ll <- isoMDS(tt, y=cmdscale(tt))
 
which tells me zero or negative distance between objects 1 and 2
 
ll <- isoMDS(tt, y=cmdscale(tt, k=2))
 
 
Again, thanks,
 
Harold
 
 

-Original Message- 
From: Jari Oksanen [mailto:[EMAIL PROTECTED] 
Sent: Thu 9/9/2004 4:26 AM 
To: Doran, Harold 
Cc: Prof Brian Ripley; R-News 
Subject: RE: [R] isoMDS



On Wed, 2004-09-08 at 21:31, Doran, Harold wrote:
 Thank you. Quick clarification. isoMDS only works with
dissimilarities.
 Converting my similarity matrix into the dissimilarity matrix is
done as
 (from an email I found on the archives)

d <- max(tt) - tt

 Where tt is the similarity matrix. With this, I tried isoMDS as
follows:

tt.mds <- isoMDS(d)

 and I get the following error message.

 Error in isoMDS(d) : An initial configuration must be supplied
with
 NA/Infs in d. I was a little confused on exactly how to specify
this
 initial config. So, from here I ran cmdscale on d as

This error message is quite informative: you have either missing or
non-finite entries in your data. The only surprising thing here is
that
cmdscale works: it should fail, too. Are you sure that you haven't
done
anything with your data matrix in between, like changed it from
matrix
to a dist object? If the Inf/NaN/NA values are on the diagonal, they
will magically disappear with as.dist. Anyway, if you're able to get
a
metric scaling result, you can manually feed that into isoMDS for
the
initial configuration, and  avoid the check. See ?isoMDS.

d.mds <- cmdscale(d)

 which seemed to work fine and produce reasonable results. I was
able to
 take the coordinates and run them through a k-means cluster and
the
 results seemed to correctly match the grouping structure I created
for
 this sample analysis.

 Cmdscale is for metric scaling, but it seemed to produce the
results
 correctly.

 So, did I correctly convert the similarity matrix to the
dissimilarity
 matrix? Second, should I have used cmdscale rather than isoMDS as
I have
 done? Or, is there a way to specify the initial configuration that
I
 have not done correctly.

If you don't know whether you should use isoMDS or cmdscale, you
probably 

RE: [R] kolmogorov-smirnov for discrete ordinal scale data

2004-09-09 Thread Hanke, Alex
Hi
I think the answer is no. However, I have written a script that implements
the test described in "Testing for shifts in the Vertical Distribution of
Plankton using a robust Kolmogorov-Smirnov like Statistic" by Smith, Beet
and Solow (1998). The test has the properties you are looking for. If this
sounds helpful let me know.
Alex

-Original Message-
From: Gila Lithwick [mailto:[EMAIL PROTECTED] 
Sent: September 9, 2004 11:03 AM
To: [EMAIL PROTECTED]
Subject: [R] kolmogorov-smirnov for discrete ordinal scale data


Hi,

I was wondering whether there is an implementation of the 
Kolmogorov-Smirnov goodness of fit test for discrete, ordinal scale data 
in R - I've only managed to find the test for continuous data.

Thanks!
Gila

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Skipping panels in Lattice

2004-09-09 Thread Hanke, Alex
I have the same problem. As far as I can see, the only thing you can do is :

attach(df2)
group=paste(facb, facc, sep=" ")
bwplot( dv ~ faca | factor(group))

Alex

-Original Message-
From: Leon Barmuta [mailto:[EMAIL PROTECTED] 
Sent: September 9, 2004 1:19 AM
To: [EMAIL PROTECTED]
Subject: [R] Skipping panels in Lattice


Dear all,

I wish to generate a lattice boxplot which skips an empty cell in a design. 
I have trawled r-help, scruitinized xyplot(lattice) help page, and merrily 
reproduced examples of using skip from a couple of previous r-help queries 
and the example given in Pinheiro & Bates. But I must be missing
something...

Here's an example (running R 1.9.1 on Win2k):

# generate some data

df1 <- data.frame(expand.grid(obsnum=seq(1, 15, 1), faca=c("A1", "A2", "A3"),
                  facb=c("B1", "B2", "B3", "B4"), facc=c("C1", "C2")),
                  dv=rpois(15*3*4*2, 10))

# now get rid of the cell B4 & C1 to simulate a missing treatment
combination

df2 <- df1[df1$facb != "B4" | df1$facc != "C1", ]

# plain vanilla lattice plot generates an empty panel corresponding to the 
empty cell

plot1 <- bwplot(dv ~ faca | facb*facc, data=df2)
plot1

# now try to skip the empty panel
# turn plot history on so that the separate pages can be recalled

plot2 <- update(plot1, skip=c(rep(F, 3), T, rep(F, 4)))
plot2

and the 4th panel position of the bottom row is skipped, BUT the B4&C1 cell 
is shunted to the top left of row 1 and the last panel of plot1 is now 
moved to page 2. Messing with layout= doesn't help, neither does 
substituting NA for the values of the missing cell (instead of cutting it 
out of the data frame). I also get the same behaviour for stripplot and 
dotplot too.

Apologies if I've missed a previous solution to this during my searches of 
the archive.

Regards,

Leon Barmuta
School of Zoology & TAFI, University of Tasmania, Australia.

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] isoMDS

2004-09-08 Thread Hanke, Alex
Distances cannot always be constructed from similarities. This can be done
only if the matrix of similarities is nonnegative definite. With the
nonnegative definite condition, and with the maximum similarity scaled so
that s_ii=1, d_ik = (2*(1 - s_ik))^0.5
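
A small sketch of that conversion (hypothetical similarity matrix built from the
thread's tt; note that identical objects give zero distances, which isoMDS rejects,
as seen elsewhere in this thread):

library(MASS)                      # isoMDS
s <- tt / max(tt)                  # scale the counts so the diagonal is 1
d <- sqrt(2 * (1 - s))             # d_ik = (2*(1 - s_ik))^0.5
diag(d) <- 0
mds <- isoMDS(as.dist(d))          # errors if any off-diagonal distance is 0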

Check out the vegan package.
Alex

-Original Message-
From: Doran, Harold [mailto:[EMAIL PROTECTED] 
Sent: September 8, 2004 10:00 AM
To: [EMAIL PROTECTED]
Cc: Doran, Harold
Subject: [R] isoMDS


Dear List:

I have a question regarding an MDS procedure that I am accustomed to
using. I have searched around the archives a bit and the help doc and
still need a little assistance. The package isoMDS is what I need to
perform the non-metric scaling, but I am working with similarity
matrices, not dissimilarities. The question may end up being resolved
simply.

Here is a bit of substantive background. I am working on a technique
where individuals organize items based on how similar they perceive the
items to be. For example, assume there are 10 items. Person 1 might
group items 1,2,3,4,5 in group 1 and the others in group 2. I then turn
this grouping into a binomial similarity matrix. The following is a
sample matrix for Person 1 based on this hypothetical grouping. The off
diagonals are the similar items with the 1's representing similarities. 
  a b c d e f g h i j
a 1 1 1 1 1 0 0 0 0 0
b 1 1 1 1 1 0 0 0 0 0
c 1 1 1 1 1 0 0 0 0 0
d 1 1 1 1 1 0 0 0 0 0
e 1 1 1 1 1 0 0 0 0 0
f 0 0 0 0 0 1 1 1 1 1
g 0 0 0 0 0 1 1 1 1 1
h 0 0 0 0 0 1 1 1 1 1
i 0 0 0 0 0 1 1 1 1 1
j 0 0 0 0 0 1 1 1 1 1


Each of these individual matrices are summed over individuals. So, in
this summed matrix diagonal elements represent the total number of
participants and the off-diagonals represent the number of times an item
was viewed as being similar by members of the group (obviously the
matrix is symmetric below the diagonal). So, a 4 in row 'a' column 'c'
means that these items were viewed as being similar by 4 people. A
sample total matrix is at the bottom of this email describing the
perceived similarities of 10 items across 4 individuals.

It is this total matrix that I end up working with in the MDS. I have
previously worked in systat where I run the MDS and specify the matrix
as a similarity matrix. I then take the resulting data from the MDS and
perform a k-means cluster analysis to identify which items belong to a
particular cluster, centroids, etc.

So, here are my questions. 

1)  Can isoMDS work only with dissimilarities? Or, is there a way
that it can perform the analysis on the similarity matrix as I have
described it?
2)  If I cannot perform the analysis on the similarity matrix, how
can I turn this matrix into the necessary dissimilarity matrix? I am less
familiar with this matrix and how it would be constructed.

Thanks for any help offered,

Harold 


  a b c d e f g h i j
a 4 2 4 3 3 2 0 0 0 0
b 2 4 2 3 1 0 2 2 2 2
c 4 2 4 3 3 2 0 0 0 0
d 3 3 3 4 2 1 1 1 1 1
e 3 1 3 2 4 3 1 1 1 1
f 2 0 2 1 3 4 2 2 2 2
g 0 2 0 1 1 2 4 4 4 4
h 0 2 0 1 1 2 4 4 4 4
i 0 2 0 1 1 2 4 4 4 4
j 0 2 0 1 1 2 4 4 4 4

[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] isoMDS

2004-09-08 Thread Hanke, Alex
I don't understand.
If isoMDS does not work with distances, why does the help for isoMDS
indicate that the "Data are assumed to be dissimilarities or relative
distances"?
Equally confusing is the loose use of the terms dissimilarities and
distances in the literature. As you point out in your book, "Distances are
often called dissimilarities".

-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
Sent: September 8, 2004 11:58 AM
To: Hanke, Alex
Cc: 'Doran, Harold'; '[EMAIL PROTECTED]'
Subject: RE: [R] isoMDS


On Wed, 8 Sep 2004, Hanke, Alex wrote:

 Distances cannot always be constructed from similarities. This can be done
 only if the matrix of similarities is nonnegative definite. With the
 nonnegative definite condition, and with the maximum similarity scaled so
 that s_ii=1, d_ik=(2*(1-s_ik))^-.5

But isoMDS works with dissimilarities not distances.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] A question about external time-dependent covariates in co x model

2004-08-19 Thread Hanke, Alex
Dear Rui,
From my understanding of time-dependent covariates (not an expert but have
been working on a similar problem), it would appear that the coding of the
status column is not correct. Unless you have observed an event at each
interval you should only have status=1 for the last interval. In your
example I see 3 in total. Also, I think that if "end" is proportional to
your covariate you are incorporating a redundant time effect into the
model. The time effect is in the baseline hazard.

Alex
-Original Message-
From: Rui Song [mailto:[EMAIL PROTECTED] 
Sent: August 19, 2004 12:21 AM
To: [EMAIL PROTECTED]
Subject: [R] A question about external time-dependent covariates in cox
model


Dear Sir or Madam:
I am a graduate student in UW-Madison statistics department. I have a
question about fitting a cox model with external time-dependent
covariates.

Say the original data is in the following format:
Obs Eventtime  Status  Cov(time=5)  Cov(time=8)  Cov(time=10)   Cov(time=12)
1   5   1   2
2   8   0(censored) 2   4
3   10  1   2   4   6
4   12  1   2   4   6   8


Notice that the time-dependent covariates are identical at the same
time points for all obs since they are external to the failure process.
process.

Then I organized the data as the following:
obs start   end eventtime   status  cov
1   0   5   5   1   2
2   0   5   8   0   2
2   5   8   8   0   4
3   0   5   10  1   2
3   5   8   10  1   4
3   8   10  10  1   6
4   0   5   12  1   2
4   5   8   12  1   4
4   8   10  12  1   6
4   10  12  12  1   8

And fit the model using:

fit <- coxph(Surv(start, end, status) ~ cov)

When I fit the model to my data set (Which has 89 observations and 81
distinct time points, sort of large.), I always got a message that
Process R segmentation fault (core dumped). Would you let me know if it
is due to the matrix sigularity in the computation of the partial
likelihood or something else? And how should I fit a cox model with
external time-dependent covariates?

Thanks a lot for your time and help!

Sincerely,
Rui Song

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Clustering and the test for proportional hazards

2004-08-19 Thread Hanke, Alex
Dear List,
Is the test for proportional hazards valid when the model contains a cluster
variable? The output looks strange with the cluster variable.

My intervals are based on calendar time and the clustering variable is
related to the season the event occurs in. 

model1 <- coxph(Surv(Start, Stop, Event) ~ LagAOO + I(LagAOO^2) +
    FirstSeen + TSLE + strata(CumPOO.rc) + cluster(quarter), data=data8, x=T)

cox.zph(model1)
               rho chisq      p
LagAOO       0.209  5.61 0.0179
I(LagAOO^2) -0.209  5.61 0.0179
FirstSeen    0.209  5.61 0.0179
TSLE         0.209  5.61 0.0179
GLOBAL          NA  5.61 0.2304
 
In this example there is strong evidence for non-proportional hazards for
each of my covariates.
By defining my strata differently I can remove the non-proportional hazards.
A quick comment is appreciated.
Alex

[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Xtable method for coxph, bug?

2004-08-18 Thread Hanke, Alex
Hi
There is an xtable method for coxph. It bombs, however, when applied to my
coxph object. It cannot find 'nse', which I think is
sqrt(diag(coxph.object$naive.var)). Adding 'nse' to the coxph object cures
the problem. Is this a bug in xtable.coxph?

There is no xtable method for summary.coxph. How can I access the result of
summary(coxph.object) so that I can create a matrix object for which there
is an xtable method?
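
One possibility (a sketch, assuming a version of the survival package in which
summary.coxph() returns its printed tables as matrix components):

library(survival); library(xtable)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)   # example fit
s   <- summary(fit)
xtable(s$coefficients)   # coef, exp(coef), se(coef), z, p as a matrix
xtable(s$conf.int)       # exp(coef) with confidence limits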
Thanks
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Isotopic notation in plots

2004-05-18 Thread Hanke, Alex
Henrik, Please try:
plot(1:10,xlab=expression({}^{14}*C))

-Original Message-
From: Andersson, Henrik [mailto:[EMAIL PROTECTED] 
Sent: May 18, 2004 7:00 AM
To: [EMAIL PROTECTED]
Subject: [R] Isotopic notation in plots


I really like to use R for all my graphs, and as I work with stable
isotopes I want to have a proper chemical notation in my plots

I have looked at ?plotmath, but didn't find the answer and also searched
the R website.

--

plot(1:10,xlab=expression(^{14}*C))  # I want to have a superscript with
nothing in front, but it doesn't work 

plot(1:10,xlab=expression(.^{14}*C))  # this works, but is not
beautiful

Any ideas ?


-
Henrik Andersson
Netherlands Institute of Ecology -
Centre for Estuarine and Marine Ecology
P.O. Box 140
4400 AC Yerseke
Phone: +31 113 577473
[EMAIL PROTECTED]
http://www.nioo.knaw.nl/ppages/handersson

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] (no subject)

2004-04-21 Thread Hanke, Alex
Dear R-Help
Does The R Package for Multivariate and Spatial Analysis Version 4.0
(Casgrain
and Legendre, 2001) exist on CRAN and under what name?  It supposedly has a
chronological clustering program ,CHRONO, that I would like to use. 
Alternatively, I would ask if there is a R based program that performs
chronological clustering?

Thanks
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] multiple plots problem

2004-04-01 Thread Hanke, Alex
The command:
layout(c(1,2,3), 3, 1) specifies 3 plots
Try
layout(1:4,2,2,byrow=T)

Regards,
Alex
-Original Message-
From: Oleg Bartunov [mailto:[EMAIL PROTECTED] 
Sent: April 1, 2004 7:39 AM
To: R-help
Subject: [R] multiple plots problem


Hello,

for testing & learning purposes I create X11 device and specify layout like
layout(c(1,2,3), 3, 1), so I could play with parameters and see
several plots at the same time. That works fine until I try to create 4-th
plot - all other plots erased. Such behaviour isn't desirable for testing
purposes and I'm asking  where to look to disable erasing other plots.

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] multiple plots problem

2004-04-01 Thread Hanke, Alex
Correction:
I should have wrote
layout(matrix(c(1,2,3,4), 2, 2, byrow = TRUE))
Sorry
Alex
-Original Message-
From: Hanke, Alex [mailto:[EMAIL PROTECTED] 
Sent: April 1, 2004 9:25 AM
To: 'Oleg Bartunov'; '[EMAIL PROTECTED]'
Subject: RE: [R] multiple plots problem


The command:
layout(c(1,2,3), 3, 1) specifies 3 plots
Try
layout(1:4,2,2,byrow=T)

Regards,
Alex
-Original Message-
From: Oleg Bartunov [mailto:[EMAIL PROTECTED] 
Sent: April 1, 2004 7:39 AM
To: R-help
Subject: [R] multiple plots problem


Hello,

for testing & learning purposes I create X11 device and specify layout like
layout(c(1,2,3), 3, 1), so I could play with parameters and see
several plots at the same time. That works fine until I try to create 4-th
plot - all other plots erased. Such behaviour isn't desirable for testing
purposes and I'm asking  where to look to disable erasing other plots.

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] identify() and controlling label size

2004-03-31 Thread Hanke, Alex
I thought this was going to be easy ...
Can the label size of identify() be controlled by setting par(cex.*)? I'm
having no luck. My only recourse is to save the index and position of
the labels from identify() and use text() to replot them.

Regards
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9



[[alternative HTML version deleted]]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] identify() and controlling label size

2004-03-31 Thread Hanke, Alex
Thank-you it works! I have ignored ps and relied on the cex arguments until
now.
Alex

-Original Message-
From: Barry Rowlingson [mailto:[EMAIL PROTECTED] 
Sent: March 31, 2004 10:59 AM
To: Hanke, Alex
Cc: '[EMAIL PROTECTED]'
Subject: Re: [R] identify() and controlling label size


Hanke, Alex wrote:
 I thought this was going to be easy ...
 Can the label size of identify() be controlled by setting par(cex.*)
because
 I'm having no luck? My only recourse is to save the index and position of
 the labels from identify() and use text() to replot them.
 

  Funny, it seems to ignore that, but allows 'col=' and 'font=' to set 
the font colour and type.

  But! It accepts 'ps=' to set the point size! You may be saved...

Try:

plot(1:10)
identify(1:10,ps=24)
identify(1:10,ps=12)

Baz

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] strange thing with sd

2004-03-29 Thread Hanke, Alex
Just so you don't feel that you are alone
I get the same response as you use R1.8.0 on an XP operating system.
Alex

-Original Message-
From: Andreas Pauling [mailto:[EMAIL PROTECTED] 
Sent: March 29, 2004 10:02 AM
To: [EMAIL PROTECTED]
Subject: [R] strange thing with sd


Dear R people

I came across a strange thing: 


sd(rep(0.01,  15))  #0
sd(rep(0.001, 15))  #4.489023e-19
sd(rep(0.1,   15))  #0
sd(rep(0.0001,15))  #1.712427e-24

sd(rep(0.01,  13))  #1.805557e-18
sd(rep(0.001, 13))  #4.513894e-19
sd(rep(0.1,   13))  #0
sd(rep(0.0001,13))  #0

sd(rep(5.01,  15))  #0
sd(rep(5.001, 15))  #4.489023e-19
sd(rep(5.1,   15))  #1.838704e-15
sd(rep(5.0001,15))  #9.19352e-16

sd(rep(5.01,  13))  #9.244454e-16
sd(rep(5.001, 13))  #9.244454e-16
sd(rep(5.1,   13))  #1.848891e-15
sd(rep(5.0001,13))  #0

Why gives sd sometimes zero and sometimes values close to zero
and why does it depend on the value and on how many times it is
repeated?
Shouldn't it give always zero?
Is there a way to control this?

I use R Version 1.8.1 under UNIX.

Thanks for any suggestions!!

Andreas

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Setting the 'fig' graphic parameter

2004-03-22 Thread Hanke, Alex
Try:
x <- rnorm(100)
y <- rnorm(100)

par(fig=c(0,0.7,0,1))
plot(x,y)

# (please maximise your plotting device so you get a 'rectangular' area)

# now lets do the upper corner 'little' plot

par(fig=c(0.7, 1, 0.5, 1),new=T)
plot(x,y)

# and ...

par(fig=c(0.7, 1, 0, 0.5),new=T)
plot(x,y)

-Original Message-
From: Mario dos Reis [mailto:[EMAIL PROTECTED] 
Sent: March 22, 2004 12:23 PM
To: [EMAIL PROTECTED]
Subject: [R] Setting the 'fig' graphic parameter


Hi guys,

# I would like to plot a figure with the following layout:
# (sketch: one tall panel filling the left ~70% of the device, and two
#  smaller panels stacked on top of each other in the right-hand column)

x <- rnorm(100)
y <- rnorm(100)

par(fig=c(0,0.7,0,1))
plot(x,y)

# (please maximise your plotting device so you get a 'rectangular' area)

# now lets do the upper corner 'little' plot

par(fig=c(0.7, 1, 0.5, 1))
plot(x,y)

# and ...

par(fig=c(0.7, 1, 0, 0.5))
plot(x,y)

Now, my problem is as you might have seen already, that the old figure
gets deleted when the new one is placed. I was trying to reproduce an
exercise a saw in an S-plus book. I would really like to know what is
going on, the documentation about graphic parameters is not very helpful
about fig, and I would really like to set a graph with the above layout.

Thanks,
Mario dos Reis.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Help to compare...

2004-03-22 Thread Hanke, Alex
DF <- data.frame(V1=c(0,8,6,4,3,1,2,9,6,5), V2=1:10)
DF[((DF$V1 >= 2 & DF$V1 < 8)*1:10)[2:8],]
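
If the goal is the replacement shown in the question rather than row selection,
a sketch (using NA where the question shows "."):

DF   <- data.frame(Id = 1:10,
                   V1 = c(0, 8, 6, 4, 3, 1, 2, 9, 6, 5),
                   V2 = c(1, 10, 2, 4, 7, 8, 6, 7, 1, 4))
sel  <- 2:8                                   # rows Id 2..8 only
drop <- DF$V1[sel] < 2 | DF$V1[sel] >= 8      # values outside [2, 8)
DF$V1[sel][drop] <- NA
DF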

-Original Message-
From: joseclaudio.faria [mailto:[EMAIL PROTECTED] 
Sent: March 22, 2004 1:28 PM
To: [EMAIL PROTECTED]
Subject: [R] Help to compare...


Dear list,

I need to submit the values (V1 = 8,6,4,3,1,2,9) (Id = 2:8) of a data.frame
(DF), like the one below,

Id  V1  V2 ...
 1   0   1 ...
 2   8  10 ...
 3   6   2 ...
 4   4   4 ...
 5   3   7 ...
 6   1   8 ...
 7   2   6 ...
 8   9   7 ...
 9   6   1 ...
10   5   4 ...

to a selection (>= 2 and < 8), leaving the remainder like below:

Id  V1  V2 ...
 1   0   1 ...
 2   .  10 ...
 3   6   2 ...
 4   4   4 ...
 5   3   7 ...
 6   .   8 ...
 7   2   6 ...
 8   .   7 ...
 9   6   1 ...
10   5   4 ...

What is the best way to do that with R? Is there a command to compare all of
them at the same time?

I would be very thankful.

Yours sincerely

José Cláudio Faria
UESC/DCET
Brasil
73-634.2779
[EMAIL PROTECTED]
[EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Reading Data

2004-03-19 Thread Hanke, Alex
The following response by B.Ripley to a similar request may help.
Alex

On Tue, 21 Oct 2003, Ernie Adorio wrote:

 
 Am using R on a Linux box and am currently writing an interactive R
script.
 
 1. How do I ask a user to press any key to continue ? I used a system call
to 
 read but this only works if the Enter key is pressed:
  print("Press any key to continue")
  system("read")

You seem over-fond of the system() command!  Try

cat("Press enter key to continue\n")
foo <- readLines(stdin(), 1)

In general R does not have a character-by-character interface with a 
keyboard.

 2. How do I get a string input from the user? Would like to see an R
function,
 say askget():
 
   delay = askget("Enter delay in seconds")
   system(paste("sleep", delay))

cat("Enter delay in seconds:\n")
Delay <- scan("", numeric(1))
Sys.sleep(Delay)

will do it (there are other more elegant ways such as

testit <- function()
{
  cat("Enter delay in seconds: ")
  Delay <- as.numeric(readLines(stdin(), 1))
  if(is.na(Delay)) stop("non-numeric input")
  Sys.sleep(Delay)
}

)


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

-Original Message-
From: Kissell, Robert [EQRE] [mailto:[EMAIL PROTECTED] 
Sent: March 19, 2004 12:47 PM
To: [EMAIL PROTECTED]
Subject: [R] Reading Data


Hi,

Quick question on reading data.

I am running simulations but want to allow the user the option to define the
number of simulations. How can I have R read-in user data entered from the
keyboard? Is there a difference for reading in numeric and character data?

For example, I am trying to write the following in R:

Enter Number of Iterations?
the user then enters a number, say Y

X <- Y   # I then want to read that number into X


Thanks.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] How to ascertain the number of clusters automatically?

2004-03-10 Thread Hanke, Alex
Hi,
You may be interested in a clustering algorithm called OPTICS. It is both
interactive and automatic and does not require a lot of input parameters. It
is described as creating  an augmented ordering of the data representing
its density-based clustering structure. It automatically and efficiently
extracts not only traditional clustering information but also the intrinsic
clustering structure.
Check out this site:
http://www.dbs.informatik.uni-muenchen.de/Forschung/KDD/Clustering/
Alex

-Original Message-
From: Fucang Jia [mailto:[EMAIL PROTECTED] 
Sent: March 9, 2004 11:30 AM
To: [EMAIL PROTECTED]
Subject: [R] How to ascertain the number of clusters automatically?


Hi, everyone,

There is many small cells which can be classified into several big cells 
from the scanned image. K-means clustering does not work well in this 
condition. I have done hierarchical clustering on cells successfully which 
uses shortest distance between classes. The number of clusters is about 3, 
4, 5, 6, 7 generally. One can ascertain the number of clusters visually.  
But because there are thousands of images to be clustered. So it is humdrum 
to me. I want to know if there are any methods that can be used to ascertain

the number of clusters automatically, especially in this case, only several 
clusters?

Thank you very much!

Best,

Fucang

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Rank Simulations - Test statistic Help

2004-03-10 Thread Hanke, Alex
Dear S,
Try rephrasing your question instead of altering the subject line. I can see
why you haven't any takers.
Alex

-Original Message-
From: S P [mailto:[EMAIL PROTECTED] 
Sent: March 10, 2004 9:59 AM
To: [EMAIL PROTECTED]
Subject: [R] Rank Simulations - Test statistic Help


Hi all,

I am a biostatistician and I have developed my own
ranking system for clinical data. I would like to test
the efficiency of it w.r.t. to other ranking systems.
I would like to simulate the data and after assigning
ranks to my observed scores(after neglecting
dropouts), observe the type I error. If I want to do a
Kruuskal Wallis type of test, what test statistic
should I use to test for a difference of delta
between the two groups (I know delta since the data
is generated with that difference). The default K-W
test statistics test for a NULL difference and I want
to see how frequently my ranking system rejects the
correct null hypothesis of a delta difference.

(So my data is now in 2 arrays of unequal lengths due
to dropouts, which have the ranks for drug and
placebo.)

Thank you in advance for your help.

~S

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] boot package

2004-03-08 Thread Hanke, Alex
Hi,
The function anosim() in vegan package or sample() in base may be of help to
you.
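
For instance, a sketch along those lines (traits is a hypothetical name for the
156 x 20 trait matrix):

rand.stats <- function(traits, n) {
  comm <- traits[sample(nrow(traits), n), , drop = FALSE]  # no species repeated
  d <- dist(comm)                        # Euclidean distances between species
  m <- as.matrix(d); diag(m) <- Inf
  c(MNND = mean(apply(m, 1, min)),       # mean nearest-neighbour distance
    MED  = mean(d))                      # mean Euclidean distance
}
null78 <- replicate(1000, rand.stats(traits, 78))   # repeat for 90, 100, 102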
Alex

-Original Message-
From: Rogério Rosa da Silva [mailto:[EMAIL PROTECTED] 
Sent: March 4, 2004 4:57 AM
To: rhelp
Subject: [R] boot package 



Dear all 

As part of an ongoing study on the ecomorphology of ant communities, I have 
obtained a matrix with 156 row (species) and 20 columns (several
measurements 
of body shape) for 4 localities.

For each community, I calculated a matrix of Euclidean distances between all
pairs of species. From this matrix, I extracted two measures of community
structure: i) I identified the distance from a individual to its nearest
neighbor (NND) in the morphological space and then calculated the averages
of 
the NND (MNND); ii) the mean of the Euclidean distances (MED). NND and MED 
are of practical use in describing spatial relations between species. The 
results of Euclidean distance studies in each of the four localities were 
compared with each other for evidence repeating patterns.

To determine whether or not the morphological arrangement of species in 
communities reflected internal organization, MNND and MED should be compared

with randomly generated communities.

Is it possible to write a function with the boot package to evaluate the values

of average nearest-neighbor distance and of the mean of Euclidean distances 
in randomly generated communities ? One set of 1000 random communities each 
with 78, 90, 100 and 102 species must be generated from the pool of species 
(156 species). The only restrictions on species composition  is that no 
species could be placed in a community more than once, and only row values 
can be permuted.

Sorry, I don't know how to define the statistic term in the boot() function 
from the boot library in R. I have studied the boot package, argument list
of 
functions and the functions definition. Can someone point me the correct 
syntax?

I would very much appreciate any help

Best regards

Rogério

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] limit to complex plots?

2004-02-26 Thread Hanke, Alex

try:
layout(matrix(c(0,0,0,0,0,1,0,2,0,0,0,0,0,3,0,4), nrow=4,
byrow=TRUE))
plot(rnorm(10), rnorm(10))
plot(rnorm(20), rnorm(20))
plot(rnorm(30), rnorm(30))
plot(rnorm(40), rnorm(40))

 -Original Message-
 From: [EMAIL PROTECTED] [SMTP:[EMAIL PROTECTED]
 Sent: Thursday, February 26, 2004 7:15 AM
 To:   [EMAIL PROTECTED]
 Subject:  [R] limit to complex plots?
 
 
 Hello all.
 
 I am trying to create one figure, divided into 6 graphs/plots each with an
 inset sub-figure.  I can use to layout command to generate one figure with
 one inset sub-figure, but cannot seem to do it for multiple figures on one
 page.
 
 I've tried a test with the following code:
 
 layout(matrix(c(1,2,3,4), nrow=2, byrow=TRUE))
 plot(rnorm(10), rnorm(10))
 plot(rnorm(10), rnorm(10))
 plot(rnorm(30), rnorm(30))
 plot(rnorm(40), rnorm(40))
 layout.show(4)
 
 #this works and gives me my one page with 4 figures on it
 
 layout(matrix(c(0,0,0,0,0,1,0,2,0,0,0,0,0,3,0,4), nrow=4, byrow=TRUE))
 par(new=TRUE)
 plot(rnorm(10), rnorm(10))
 
 par(new=TRUE)
 plot(rnorm(20), rnorm(20))
 
 par(new=TRUE)
 plot(rnorm(30), rnorm(30))
 
 par(new=TRUE)
 plot(rnorm(40), rnorm(40))
 
 # this is the part that doesn't work.  I've tried using only one
 'par(new=TRUE)' command before ALL the plot commands, and as written above.
 The best I can get is three sub-figures (2, 3 and 4), in positions 1, 2 and 3.
 
 Has anyone figured this out?
 
 thanks,
 Suzanne Blatt
 

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] strptime() behaviour

2004-02-20 Thread Hanke, Alex
Is it normal behaviour for strptime("29-Jan-01", "%d-%b-%y")$mon to return a
value of 0?
strptime("29-Jan-01", "%d-%b-%y")$year  # works ok
101
strptime("29-Jan-01", "%d-%b-%y")$mday  # works ok
29
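
These values are in fact what the documentation describes: in a POSIXlt
object, $mon counts months from 0 (so January is 0) and $year counts years
since 1900 (see ?DateTimeClasses), so 0 and 101 are expected here. A quick
check:

lt <- strptime("29-Jan-01", "%d-%b-%y")
unlist(unclass(lt)[c("mday", "mon", "year")])   # mday 29, mon 0, year 101
format(lt, "%m")                                # "01" -- the usual one-based month, as text
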
Regards,
Alex

Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE:[R] reshape direction=wide

2004-02-19 Thread Hanke, Alex

v.names = c("var1", "var2") creates a separate column for each combination of
variable in v.names and level of the variable identified by timevar.
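
For example, a minimal sketch with invented data (the real bids columns are
not shown here), reshaping two measured variables in a single call:

bids <- data.frame(id      = rep(1:3, each = 2),
                   keyword = rep(c("a", "b"), times = 3),
                   var1    = rnorm(6),
                   var2    = rnorm(6))
wide <- reshape(bids, direction = "wide",
                idvar   = "id",
                timevar = "keyword",
                v.names = c("var1", "var2"))
wide   # one var1.* and one var2.* column per keyword level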



I am reshaping a data.frame bids -- reshaped as shown below.

I thought this should be possible with a single invocation of
reshape, but the only way I came up with is reshaping subsets for each
keyword and then joining them together. Does anyone have an idea how to
solve this in a more elegant way? Efficiency is a concern as the datasets
are very large.

Is there a way to specify multiple v.names?



Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] glm.poisson.disp versus glm.nb

2004-02-06 Thread Hanke, Alex
In response to my own query (see below):
the estimate theta from glm.nb is actually 1/phi (or 1/alpha in some texts),
where phi is the dispersion parameter for the negative binomial
distribution. However, the dispersion estimate from glm.poisson.disp does
not equal 1/theta because 1/theta is a maximum likelihood (ML) estimate
whereas the dispersion estimate from glm.poisson.disp is based on the method
of moments (MM). Correct me if I'm wrong!
The two estimates for dispersion are fairly different.
Paul and Banerjee (1998) observe that the test statistic discussed below
(TNBI) is more conservative when one uses the moment-based estimate of
phi, and that phi is underestimated by ML and overestimated by MM. However,
the overestimation by MM is slight compared to the underestimation by ML.
Does that then mean MM estimates of phi are to be preferred?
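
For anyone wanting to reproduce the comparison with MASS alone, a rough,
untested sketch along these lines contrasts the two estimates of phi on the
quine data; the moment estimate below simply equates a generalized Pearson
statistic to its residual degrees of freedom, which may not be exactly the
estimator glm.poisson.disp uses:

library(MASS)
data(quine)

fit.nb <- glm.nb(Days ~ Eth * Age, data = quine)   # a 2x4 layout
phi.ml <- 1 / fit.nb$theta                         # ML dispersion on the Var = mu*(1 + mu*phi) scale

fit.p  <- glm(Days ~ Eth * Age, family = poisson, data = quine)
mu     <- fitted(fit.p)
y      <- quine$Days
# choose phi so that sum((y - mu)^2 / (mu * (1 + mu * phi))) equals the residual df
phi.mm <- uniroot(function(phi)
            sum((y - mu)^2 / (mu * (1 + mu * phi))) - df.residual(fit.p),
          interval = c(1e-8, 100))$root

c(phi.ML = phi.ml, phi.MM = phi.mm)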

Alex

Dear list,
This is a question about overdispersion and the ML estimates of the
parameters returned by the glm.poisson.disp (L. Scrucca) and glm.nb
(Venables and Ripley) functions. Both appear to assume a negative binomial
distribution for the response variable.

Paul and Banerjee (1998) developed C(alpha) tests for interaction and main
effects, in an unbalanced two-way layout of counts involving two fixed
factors, when data are Poisson distributed, and when data are extra
dispersed. In R I coded their C(alpha) test statistic (called TNBI) for
interaction for the case where the counts are extra-dispersed, as well as
their test for extra-dispersion (called T_a). Using the Quine data set
(Quine, 1975) the authors collapse the original 4x2x2x2 study design into a
2x4 table and obtained estimates of TNBI=10.36 and T_a=90.81.
Using the dispersion estimate from glm.poisson.disp and the estimates for mu
I got exactly the same results for TNBI and T_a. This made me happy. Now I
thought to try the ML estimates from glm.nb to see if the results would be
the same but I am having difficulty relating the dispersion phi from
glm.poisson.disp to theta estimated by glm.nb.
According to the R help for glm.poisson.disp, Var(y_i) = mu_i*(1 + mu_i*phi).
The help for glm.nb led me to a book by VR (1994) which indicates that
Var(y) = mu + mu^2/theta. From this I gathered that phi = 1/theta, but the
estimates do not relate to each other in this way unless one is in error. In
a document by L.P. Ammann he says a negative binomial model can be
specified with mean mu and dispersion phi by taking theta=mu/(phi-1). I had
a problem implementing this because in my mind mu is a vector whereas phi
and theta are scalars.



Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] glm.poisson.disp versus glm.nb

2004-02-02 Thread Hanke, Alex
Dear list,
This is a question about overdispersion and the ML estimates of the
parameters returned by the glm.poisson.disp (L. Scrucca) and glm.nb
(Venables and Ripley) functions. Both appear to assume a negative binomial
distribution for the response variable.

Paul and Banerjee (1998) developed C(alpha) tests for interaction and main
effects, in an unbalanced two-way layout of counts involving two fixed
factors, when data are Poisson distributed, and when data are extra
dispersed. In R I coded their C(alpha) test statistic (called TNBI) for
interaction for the case where the counts are extra-dispersed, as well as
their test for extra-dispersion (called T_a). Using the Quine data set
(Quine, 1975) the authors collapse the original 4x2x2x2 study design into a
2x4 table and obtained estimates of TNBI=10.36 and T_a=90.81.
Using the dispersion estimate from glm.poisson.disp and the estimates for mu
I got exactly the same results for TNBI and T_a. This made me happy. Now I
thought to try the ML estimates from glm.nb to see if the results would be
the same but I am having difficulty relating the dispersion phi from
glm.poisson.disp to theta estimated by glm.nb.
According to the R help for glm.poisson.disp, Var(y_i) = mu_i*(1 + mu_i*phi).
The help for glm.nb led me to a book by VR (1994) which indicates that
Var(y) = mu + mu^2/theta. From this I gathered that phi = 1/theta, but the
estimates do not relate to each other in this way unless one is in error. In
a document by L.P. Ammann he says a negative binomial model can be
specified with mean mu and dispersion phi by taking theta=mu/(phi-1). I had
a problem implementing this because in my mind mu is a vector whereas phi
and theta are scalars.

Consequently, I would like to know how to get phi from theta so that I can
compare the glm.poisson.disp and glm.nb methods for estimating dispersion. 

Regards, Alex




Alex Hanke
Department of Fisheries and Oceans
St. Andrews Biological Station
531 Brandy Cove Road
St. Andrews, NB
Canada
E5B 2L9

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html