Re: [R] A rude question

2005-01-27 Thread John Dougherty
On Wednesday 26 January 2005 21:09, [EMAIL PROTECTED] wrote:
 Dear all,
  I am beginner using R. I have a question about it. When you use it,
  since it is written by so many authors, how do you know that the
  results are trustable?(I don't want to affend anyone, also I trust
  people). But I think this should be a question.

Almost all software - certainly all important software - has numerous 
authors.  Windows has hundreds, perhaps thousands, of coders.  So too does 
Unix.  The big difference between open source and closed source is not in the 
number of authors.  Rather, it is in the open availability of the code.  
Arguably, if there is sufficient interest in an open source project, studies 
have indicated that the code is likely to be superior to that of a comparable 
closed source program.  This is a probability, though, not a natural law.

If you are concerned about the trustworthiness of R, then perhaps the best 
gauge is that some of our favorite if occasionally curmudgeonly authors on 
this list are also experts in S and S-Plus, the proprietary, closed source 
language of which R is also a dialect.  They evidently know what they're 
doing and work comfortably in both domains.

If you compare statistical results using R and Excel, there is no question 
that R is superior, but that would also be true if you tested Excel against 
S-PLUS, SAS, or NCSS - all proprietary programs - or any number of other 
closed and open source programs designed for statistical analysis.  At the 
same time, just about any spreadsheet, open or closed source, will suffer 
in a similar comparison.

If you want more information about the safety of Excel, I would suggest this 
site:

http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html

Read the various links. Beyond this, there is a broad literature available on 
the risks and benefits of open and closed source programs.  Read it.

JWDougherty

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] getting package version inside .First.lib

2005-01-27 Thread Prof Brian Ripley
On Thu, 27 Jan 2005, Adrian Baddeley wrote:
Greetings -
Is it possible, inside .First.lib,
to find out the version number of the package that is being loaded?
If only one version of the package has been installed,
we could scan the DESCRIPTION file, something like
.First.lib <- function(lib, pkg) {
   library.dynam("spatstat", pkg, lib)
   dfile <- system.file("DESCRIPTION", package="spatstat")
   ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]
\n not ^M, please, and readLines is better than scan here.
   vvv <- strsplit(ttt, " ")[[1]][2]
   cat("spatstat version number", vvv, "\n")
}
but even this does not seem very safe (it makes assumptions about the
format of the DESCRIPTION file).
It is better to use read.dcf or the installed description information in 
package.rds. Take a look at how library() does this.

Post R-2.0.0 you can assume the format is as library uses.
BTW: all installed.packages does is to read the descriptions of all the 
packages it finds, and in .First.lib you know the path to your package.

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


RE: [R] A rude question

2005-01-27 Thread Bill.Venables
When Haydn was asked about his 100+ symphonies he is reputed to have
replied "sunt mala bona mixta", which is kind of dog Latin for "there are
good ones and bad ones all mixed together".  It's certainly the same
with R packages, so to continue the Latin motif: caveat emptor.

The R engine, on the other hand, is pretty well uniformly excellent code
but you have to take my word for that.  Actually, you don't.  The whole
engine is open source so, if you wish, you can check every line of it.
If people were out to push dodgy software, this is not the way they'd go
about it.

Bill Venables.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, 27 January 2005 3:10 PM
To: r-help@stat.math.ethz.ch
Subject: [R] A rude question


Dear all, 
 I am beginner using R. I have a question about it. When you use it,
since it is written by so many authors, how do you know that the
results are trustable?(I don't want to affend anyone, also I trust
people). But I think this should be a question.

 Thanks,
 Ming




[R] Request for help

2005-01-27 Thread Michela Marignani
My name is Michela Marignani and I'm an ecologist trying to solve a problem 
linked to the knight's tour algorithm.
I need a program to create random matrices with presence/absence (i.e. 1,0) 
values, with defined column and row sums, to create null models for 
statistical comparison of species distribution phenomena.
I've seen many solutions to the problem on the web, but none provides the 
freedom to easily change the row and column constraints, and none of them 
produces matrices with 1 and 0. Also, I've tried to use R, but it is too 
complicated for a non-statistician like me... can you help me?

Thank you for your attention,
so long
Michela Marignani
University of Siena
Environmental Science Dept.
Siena, Italy
[EMAIL PROTECTED]


Re: [R] Specification of factorial random-effects model

2005-01-27 Thread Douglas Bates
Berton Gunter wrote:
If you read the Help file for lme (!), you'll see that ~1|a*b is certainly
incorrect.
Briefly, the issue has been discussed before on this list: the current
version of lme() follows the original Laird/Ware formulation for **nested**
random effects. Specifying **crossed** random effects is possible but
difficult, and the fitting algorithm is not optimized for this. See p. 163
in Pinheiro and Bates for an example.
The development package lme4 has a version of a linear mixed model 
function that does handle crossed random effects.  In lme4_0.8-1 and 
later the new version of lme, called lmer (which could mean either lme 
revised or lme for R), has a different syntax for specifying mixed 
models.  A random effects specification is indicated by a `|' character 
which  separates a linear model expression on the left side from the 
grouping factor on the right side.  Because the | operator has very low 
precedence, such terms usually must be enclosed in parentheses.

The same type of specification is used for nested or crossed or 
partially crossed grouping factors.  The only restriction is that the 
grouping factor must have a unique level for each group, which is to say 
that you must explicitly create nested factors - you cannot specify them 
implicitly.

This example could be fit as
> (fm1 <- lmer(y ~ c + (1|a) + (1|b) + (1|a:b)))
Linear mixed-effects model fit by REML
Formula: y ~ c + (1 | a) + (1 | b) + (1 | a:b)
      AIC      BIC    logLik MLdeviance REMLdeviance
 376.0148 392.7759 -181.0074   369.2869     362.0148
Random effects:
 Groups   Name        Variance Std.Dev.
 a:b      (Intercept)   1.1118  1.0544
 b        (Intercept) 286.8433 16.9364
 a        (Intercept)  86.2138  9.2851
 Residual               3.4626  1.8608
# of obs: 81, groups: a:b, 9; b, 3; a, 3
Fixed effects:
             Estimate Std. Error DF  t value  Pr(>|t|)
(Intercept)  65.91259   11.16262 78   5.9048 8.707e-08
c2           -9.47000    0.50645 78 -18.6989 < 2.2e-16
c3          -10.88259    0.50645 78 -21.4881 < 2.2e-16
For the random effects the Variance column is the estimate of the 
variance component.  The Std.Dev. column is simply the square root of 
the estimated variance.  I find it easier to think in terms of standard 
deviations rather than variances because I can compare the standard 
deviations to the scale of the data.  Note that this column is *not* a 
standard error of the estimated variance component (and purposely so 
because I feel that such quantities are often nonsensical).

A test of, say, whether the variance component for the interaction could 
be zero is performed by fitting the reduced model and using the anova 
function to compare the fitted models.  The p-value quoted for this test 
is conservative because the null hypothesis is on the boundary of the 
parameter space.

> (fm2 <- lmer(y ~ c + (1|a) + (1|b)))
Linear mixed-effects model fit by REML
Formula: y ~ c + (1 | a) + (1 | b)
      AIC      BIC    logLik MLdeviance REMLdeviance
 379.3209 393.6876 -183.6605   374.8822     367.3209
Random effects:
 Groups   Name        Variance Std.Dev.
 a        (Intercept)  86.3823  9.2942
 b        (Intercept) 286.5391 16.9275
 Residual               4.0039  2.0010
# of obs: 81, groups: a, 3; b, 3
Fixed effects:
            Estimate Std. Error DF  t value  Pr(>|t|)
(Intercept) 65.9126    11.1560 78   5.9083  8.58e-08
c2          -9.4700     0.5446 78 -17.3890 < 2.2e-16
c3         -10.8826     0.5446 78 -19.9829 < 2.2e-16
Warning message:
optim returned message "ERROR: ABNORMAL_TERMINATION_IN_LNSRCH"
 in: "LMEoptimize<-"(`*tmp*`, value = list(maxIter = 50, msMaxIter = 50,
> anova(fm1, fm2)
Data:
Models:
fm2: y ~ c + (1 | a) + (1 | b)
fm1: y ~ c + (1 | a) + (1 | b) + (1 | a:b)
    Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm2  6 386.88 401.25 -187.44
fm1  7 383.29 400.05 -184.64 5.5953      1    0.01801
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Nicholas Galwey
Sent: Wednesday, January 26, 2005 1:45 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Specification of factorial random-effects model

I want to specify two factors and their interaction as random 
effects using
the function lme().   This works okay when I specify these 
terms using the
function Error() within the function aov(), but I can't get 
the same model
fitted using lme().   The code below illustrates the problem.


a <- factor(rep(c(1:3), each = 27))
b <- factor(rep(rep(c(1:3), each = 9), times = 3))
c <- factor(rep(rep(c(1:3), each = 3), times = 9))
y <- c(74.59,75.63,76.7,63.48,63.17,65.99,64,66.35,64.5,
  46.57,44.16,47.96,35.09,36.14,35.16,36.4,34.72,34.58,
  41.82,47.35,45.74,33.33,36.8,33.38,34.13,34.39,34.48,
  89.73,85.24,90.86,82.5,79.44,81.65,77.74,77.02,81.62,
  59.32,62.29,60.7,55.42,55.5,51.17,50.54,53.54,51.85,
  64.5,63.6,65.19,55.07,50.26,53.73,54.57,47.8,48.8,91.56,
  94.49,92.17,82.14,83.16,81.31,83.58,78.63,77.08,60.53,
  

RE: [R] A rude question

2005-01-27 Thread michael watson (IAH-C)
Hi

I don't know if you are asking the question for the same reasons I did,
but recently (and ongoing) we have been required to adopt an
internationally recognised standard.  Being in the bioinformatics field,
where open-source software is the beating heart of cutting edge
research, we have obviously had to ask ourselves that exact question:
"How can we be sure the software we use works?"

In science, this doesn't just apply to software though.  When someone
publishes a paper, how can any of us be sure they did what they said
they did?  Or that their methods are the correct ones to use?  Luckily,
there is a two-word answer that we hope will satisfy our auditors, and
that is "Peer Review".  In the context of R, I would say that you could
put a confidence measure on any package based on the number of people
who use it; the more people who use a package, the more likely they are
to find and remove bugs.  

I won't get into the open source vs commercial argument, but put
simply, all software has bugs at some stage, no matter who has written
it.  Given that fact, I prefer the code to be open so I can see them,
not closed so that I can't.  The fact that we can see all code relating
to R is surely the biggest quality measure of all?

Cheers
Mick

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: 27 January 2005 05:10
To: r-help@stat.math.ethz.ch
Subject: [R] A rude question


Dear all, 
 I am beginner using R. I have a question about it. When you use it,
since it is written by so many authors, how do you know that the
results are trustable?(I don't want to affend anyone, also I trust
people). But I think this should be a question.

 Thanks,
 Ming




[R] Results of MCD estimators in MASS and rrcov

2005-01-27 Thread rainer grohmann
Hi!

I tested two different implementations of the robust MCD estimator:
cov.mcd from the MASS package and
covMcd from the rrcov package.
Tests were done on the hbk dataset included in the rrcov package. 

Unfortunately I get quite different results -- so the question is whether
these differences are justified, an error on my side, or a bug.

Here is, what I did:

> require(MASS)
> require(rrcov)
> data(hbk)

> mass.mcd <- cov.mcd(hbk, quantile.used = 57)
> rrcov.covMcd <- covMcd(hbk, alpha = 0.75)

> # output from cov.mcd (MASS)
> mass.mcd$center
     X1      X2      X3       Y
 1.5583  1.8033  1.6600 -0.0867
> mass.mcd$cov
           X1         X2         X3           Y
X1 1.12484463 0.02217514  0.1537288  0.07615819
X2 0.02217514 1.13897175  0.1814915  0.02029379
X3 0.15372881 0.18149153  1.0434576 -0.12877966
Y  0.07615819 0.02029379 -0.1287797  0.31236158

> # output from covMcd (rrcov)
> rrcov.covMcd$center
         X1          X2          X3           Y
 1.53770492  1.78032787  1.68688525 -0.07377049
> rrcov.covMcd$cov

 X1  X2 X3Y
  X1 1.61921813 0.072595397  0.1678300  0.083905209
  X2 0.07259540 1.648137481  0.2013022  0.002657454
  X3 0.16782996 0.201302158  1.5306858 -0.150876964
  Y  0.08390521 0.002657454 -0.1508770  0.453846286

As you can see, the results are quite different. 

I tried to start both calls with 75% (=57 of 75) good data-points.

I cross-checked the results with the MCD implementation in MATLAB by Verboven
and Hubert. These functions give the same results as cov.mcd (MASS).

If somebody knows why the results do not match, although both functions are
implementations of the same estimator, please tell me.

Thanks,
   Rainer




Re: [R] getting package version inside .First.lib

2005-01-27 Thread Roger Bivand
On Thu, 27 Jan 2005, Prof Brian Ripley wrote:

 On Thu, 27 Jan 2005, Adrian Baddeley wrote:
 
  Greetings -
 
  Is it possible, inside .First.lib,
  to find out the version number of the package that is being loaded?
 
  If only one version of the package has been installed,
  we could scan the DESCRIPTION file, something like
 
  .First.lib <- function(lib, pkg) {
     library.dynam("spatstat", pkg, lib)
     dfile <- system.file("DESCRIPTION", package="spatstat")
     ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]
 
 \n not ^M, please, and readLines is better than scan here.
 
     vvv <- strsplit(ttt, " ")[[1]][2]
     cat("spatstat version number", vvv, "\n")
  }
 
  but even this does not seem very safe (it makes assumptions about the
  format of the DESCRIPTION file).
 
 It is better to use read.dcf or the installed description information in 
 package.rds. Take a look at how library() does this.

Or even packageDescription() in utils, which uses read.dcf() and should be
a way of making sure you get the version even if the underlying formatting
changes.
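A sketch of that approach (hedged: exact DESCRIPTION fields depend on the R
version, and the package name is simply taken from the .First.lib arguments
rather than hard-coded):

.First.lib <- function(lib, pkg) {
    ## report the installed version via utils::packageDescription()
    desc <- utils::packageDescription(pkg, lib.loc = lib)
    cat(pkg, "version", desc$Version, "\n")
}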

Roger

 
 Post R-2.0.0 you can assume the format is as library uses.
 
 BTW: all installed.packages does is to read the descriptions of all the 
 packages it finds, and in .First.lib you know the path to your package.
 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: [EMAIL PROTECTED]



Re: [R] Cluster analysis using EM algorithm

2005-01-27 Thread Christian Hennig
Hi!

Take a look at the packages mclust and flexmix!
They use the EM algorithm for mixture modelling, sometimes called model
based cluster analysis.

Best,
Christian

On Wed, 26 Jan 2005 [EMAIL PROTECTED] wrote:

 Hi, 
  I am looking for a package to do the clustering analysis using the
  expectation maximization algorithm. 
 
  Thanks in advance.
 
  Ming
 

***
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
ich empfehle www.boag-online.de



Re: [R] Converting yr mo da to dates

2005-01-27 Thread Uwe Ligges
David Parkhurst wrote:
I'm using R 2.0.1 in windows XP (and am not currently subscribed to this 
mailing list).

I have a USGS dataset, a text file with fixed width fields, that 
includes dates as 6-digit integers in the form yrmoda.  I could either 
read them that way, or with yr, mo, and da as separate integers.  In 
either case, I'd like to convert them to a form will allow plotting 
other y variables against the dates (with correct spacing) on the 
horizontal axis.

I've looked in all the manuals, but didn't find a way to do this.  I can 
copy the data to a spreadsheet, make the conversion there, and then move 
the data to R, but that's a nuisance.

I'd appreciate learning whether there is a way to do this all within R. 
Thanks.

Dave Parkhurst
See ?strptime as in:
strptime(c("050127", "050128"), "%y%m%d")
Uwe Ligges
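For plotting with correctly spaced dates, an alternative sketch (the response
values here are invented for illustration) converts the strings to Date class:

## Sketch: read yrmoda as character so leading zeros survive, then convert
dates <- as.Date(c("050127", "050128", "050201"), format = "%y%m%d")
y <- c(1.2, 3.4, 2.8)            # invented response values
plot(dates, y, type = "b")       # date-aware spacing on the x-axis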


[R] self-written function

2005-01-27 Thread Christoph Scherber
Dear all,
I've got a simple self-written function to calculate the mean + s.e. 
from arcsine-transformed data:

backsin <- function(x, y, ...) {
    backtransf <- list()
    backtransf$back <- ((sin(x[x != NA]))^2) * 100
    backtransf$mback <- tapply(backtransf$back, y[x != NA], mean)
    backtransf$sdback <- tapply(backtransf$back, y[x != NA], stdev) / sqrt(length(y[x != NA]))
    backtransf
}
I would like to apply this function to whole datasets, such as
tapply(variable,list(A,B,C,D),backsin)
Of course, this doesn't work with the way in which the backsin() 
function is specified.

Does anyone have suggestions on how I could improve my function?
Regards,
Christoph
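A hedged sketch of one possible repair (assuming !is.na(x) was intended where
the message tests x != NA, which always yields NA in R; sd() replaces the
undefined stdev(); and the s.e. uses per-group sizes rather than the total n):

backsin <- function(x, y, ...) {
    ok   <- !is.na(x)
    back <- (sin(x[ok]))^2 * 100        # back-transform from arcsine scale
    n    <- tapply(back, y[ok], length) # per-group sample sizes
    list(back   = back,
         mback  = tapply(back, y[ok], mean),
         seback = tapply(back, y[ok], sd) / sqrt(n))
}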


RE: [R] getting package version inside .First.lib

2005-01-27 Thread Liaw, Andy
 From: Roger Bivand
 
 On Thu, 27 Jan 2005, Prof Brian Ripley wrote:
 
  On Thu, 27 Jan 2005, Adrian Baddeley wrote:
  
   Greetings -
  
   Is it possible, inside .First.lib,
   to find out the version number of the package that is 
 being loaded?
  
   If only one version of the package has been installed,
   we could scan the DESCRIPTION file, something like
  
   .First.lib <- function(lib, pkg) {
  library.dynam("spatstat", pkg, lib)
  dfile <- system.file("DESCRIPTION", package="spatstat")
  ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]
  
  \n not ^M, please, and readLines is better than scan here.
  
  vvv <- strsplit(ttt, " ")[[1]][2]
  cat("spatstat version number", vvv, "\n")
   }
  
   but even this does not seem very safe (it makes 
 assumptions about the
   format of the DESCRIPTION file).
  
  It is better to use read.dcf or the installed description 
 information in 
  package.rds. Take a look at how library() does this.
 
 Or even packageDescription() in utils, which uses read.dcf() 
 and should be
 a way of making sure you get the version even if the 
 underlying formatting
 changes.

This is how I do it in randomForest (using .onAttach instead of .First.Lib):

.onAttach <- function(libname, pkgname) {
    RFver <- if (as.numeric(R.version$major) < 2 &&
                 as.numeric(R.version$minor) < 9.0)
        package.description("randomForest")["Version"] else
            packageDescription("randomForest")$Version
    cat(paste("randomForest", RFver), "\n")
    cat("Type rfNews() to see new features/changes/bug fixes.\n")
}

HTH,
Andy
 
 Roger
 
  
  Post R-2.0.0 you can assume the format is as library uses.
  
  BTW: all installed.packages does is to read the 
 descriptions of all the 
  packages it finds, and in .First.lib you know the path to 
 your package.
  
  
 
 -- 
 Roger Bivand
 Economic Geography Section, Department of Economics, 
 Norwegian School of
 Economics and Business Administration, Breiviksveien 40, 
 N-5045 Bergen,
 Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
 e-mail: [EMAIL PROTECTED]
 
 




RE: [R] getting package version inside .First.lib

2005-01-27 Thread Prof Brian Ripley
On Thu, 27 Jan 2005, Liaw, Andy wrote:
From: Roger Bivand
On Thu, 27 Jan 2005, Prof Brian Ripley wrote:
On Thu, 27 Jan 2005, Adrian Baddeley wrote:
Greetings -
Is it possible, inside .First.lib,
to find out the version number of the package that is
being loaded?
If only one version of the package has been installed,
we could scan the DESCRIPTION file, something like
.First.lib <- function(lib, pkg) {
   library.dynam("spatstat", pkg, lib)
   dfile <- system.file("DESCRIPTION", package="spatstat")
   ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]
\n not ^M, please, and readLines is better than scan here.
   vvv <- strsplit(ttt, " ")[[1]][2]
   cat("spatstat version number", vvv, "\n")
}
but even this does not seem very safe (it makes
assumptions about the
format of the DESCRIPTION file).
It is better to use read.dcf or the installed description
information in
package.rds. Take a look at how library() does this.
Or even packageDescription() in utils, which uses read.dcf() and should 
be a way of making sure you get the version even if the underlying 
formatting changes.
This is how I do it in randomForest (using .onAttach instead of .First.Lib):
.onAttach <- function(libname, pkgname) {
   RFver <- if (as.numeric(R.version$major) < 2 &&
as.numeric(R.version$minor) < 9.0)
 package.description("randomForest")["Version"] else
   packageDescription("randomForest")$Version
   cat(paste("randomForest", RFver), "\n")
   cat("Type rfNews() to see new features/changes/bug fixes.\n")
}
Please don't use functions from utils in such places without explicitly 
loading them from utils unless your package has an explicit dependence on 
utils (and randomForest does not).

There was a good reason why I suggested what I did: you don't need the 
utils namespace for this.

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] getting package version inside .First.lib

2005-01-27 Thread Adrian Baddeley
Thanks, Brian. 

So, to print the version number when 'mypackage' is loaded,

.First.lib <- function(lib, pkg) {
library.dynam("mypackage", pkg, lib)
vvv <- read.dcf(file=system.file("DESCRIPTION", package="mypackage"), 
fields="Version")
cat(paste("mypackage", vvv, "\n"))
}



Re: [R] computing roots of bessel function

2005-01-27 Thread Robin Hankin
hi
The gsl package
calculates zeros of regular Bessel functions of integral order.
You need the function bessel_zero_Jnu().
best wishes
Robin
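A minimal sketch of the call (assuming the gsl package is installed; the
second argument indexes which zero is wanted):

library(gsl)
## s-th zeros of the Bessel function J_nu; here the first three zeros of J0,
## which are approximately 2.405, 5.520 and 8.654
bessel_zero_Jnu(nu = 0, s = 1:3)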
On Jan 27, 2005, at 11:55 am, coutand wrote:
I am not yet an R user, but I will be soon.
I am looking for the R command and syntax to compute the roots of a 
Bessel function, i.e. computing the z values that lead to Jnu(z)=0, 
where J is a Bessel function of order nu.
Can you help me?
thanks in advance.

Dr Catherine COUTAND
Institut National de la Recherche Agronomique (INRA)
umr Physiologie Intégrative de l'Arbre Fruitier et Forestier (PIAF)
234 av. du Brézet
63039 Clermont-Ferrand cedex 02
France
tel : 00-33-(0)4-73-62-46-73
fax : 00-33-(0)4-73-62-44-54
email : [EMAIL PROTECTED]
http://www.clermont.inra.fr/piaf


--
Robin Hankin
Uncertainty Analyst
Southampton Oceanography Centre
European Way, Southampton SO14 3ZH, UK
 tel  023-8059-7743


RE: [R] getting package version inside .First.lib

2005-01-27 Thread Liaw, Andy
 From: Prof Brian Ripley 
 
 On Thu, 27 Jan 2005, Liaw, Andy wrote:
 
  From: Roger Bivand
 
  On Thu, 27 Jan 2005, Prof Brian Ripley wrote:
 
  On Thu, 27 Jan 2005, Adrian Baddeley wrote:
 
  Greetings -
 
  Is it possible, inside .First.lib,
  to find out the version number of the package that is
  being loaded?
 
  If only one version of the package has been installed,
  we could scan the DESCRIPTION file, something like
 
   .First.lib <- function(lib, pkg) {
  library.dynam("spatstat", pkg, lib)
  dfile <- system.file("DESCRIPTION", package="spatstat")
  ttt <- scan(dfile, what="", sep="^M", quiet=TRUE)[2]
 
  \n not ^M, please, and readLines is better than scan here.
 
  vvv <- strsplit(ttt, " ")[[1]][2]
  cat("spatstat version number", vvv, "\n")
  }
 
  but even this does not seem very safe (it makes
  assumptions about the
  format of the DESCRIPTION file).
 
  It is better to use read.dcf or the installed description
  information in
  package.rds. Take a look at how library() does this.
 
  Or even packageDescription() in utils, which uses 
 read.dcf() and should 
  be a way of making sure you get the version even if the underlying 
  formatting changes.
 
  This is how I do it in randomForest (using .onAttach 
 instead of .First.Lib):
 
  .onAttach <- function(libname, pkgname) {
 RFver <- if (as.numeric(R.version$major) < 2 &&
  as.numeric(R.version$minor) < 9.0)
   package.description("randomForest")["Version"] else
 packageDescription("randomForest")$Version
 cat(paste("randomForest", RFver), "\n")
 cat("Type rfNews() to see new features/changes/bug fixes.\n")
  }
 
 Please don't use functions from utils in such places without 
 explicitly 
 loading them from utils unless your package has an explicit 
 dependence on 
 utils (and randomForest does not).
 
 There was a good reason why I suggested what I did: you don't 
 need the 
 utils namespace for this.

Thanks for the tip!  Will remediate...

Andy
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 




Re: [R] Request for help

2005-01-27 Thread Patrick Burns
If I understand your problem properly, then your matrices have a
known number of zeros and ones in them.  So you can create a
matrix with just this constraint binding via:
mat <- matrix(sample(rep(0:1, c(nzeros, nones))), nr, nc)
That command first generates the appropriate number of zeros and ones
(via 'rep'), then does a random permutation of them (with 'sample') and
finally turns it into a matrix.
You could then test for the row and column constraints, and permute
the sub-matrix of rows and columns that do not obey their constraints.
It could look something like:
mat[bad.rows, bad.cols] <- sample(mat[bad.rows, bad.cols])
where 'bad.rows' and 'bad.cols' are logical vectors stating if the 
constraints
are satisfied or not.
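Put together, Patrick's permute-until-satisfied idea might look like the
sketch below (the margins rsum and csum are invented for illustration; this
is not a uniform sampler over the constrained set and may be slow):

## Hedged sketch: reshuffle offending rows/columns until the margins match.
nr <- 5; nc <- 4
rsum <- c(2, 1, 3, 2, 2)       # target row sums (invented)
csum <- c(3, 2, 3, 2)          # target column sums (invented)
nones <- sum(rsum)
mat <- matrix(sample(rep(0:1, c(nr * nc - nones, nones))), nr, nc)
repeat {
    bad.rows <- rowSums(mat) != rsum
    bad.cols <- colSums(mat) != csum
    if (!any(bad.rows) && !any(bad.cols)) break   # all constraints met
    sub <- mat[bad.rows, bad.cols]
    if (length(sub) > 1)
        mat[bad.rows, bad.cols] <- sample(sub)
    else   # degenerate submatrix: reshuffle the whole matrix and retry
        mat <- matrix(sample(as.vector(mat)), nr, nc)
}
stopifnot(all(rowSums(mat) == rsum), all(colSums(mat) == csum))

Note the guard on length(sub): calling sample() on a single number n would
draw a permutation of 1:n rather than permuting one cell.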

You do not need to be a statistician to use R -- far from it.  The 
'Guide for the Unwilling' gives you a brief introduction.  There is also 
a lot of introductory material in the contributed documentation section 
of the R Project website.

It would be good to use a more descriptive subject for messages to R-help.
Patrick Burns
Burns Statistics
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)
Michela Marignani wrote:
My name is Michela Marignani and I'm an ecologist trying to solve a 
problem linked to the knight's tour algorithm.
I need a program to create random matrices with presence/absence (i.e. 
1,0) values, with defined column and row sums, to create null models 
for statistical comparison of species distribution phenomena.
I've seen many solutions to the problem on the web, but none provides 
the freedom to easily change the row and column constraints, and none 
of them produces matrices with 1 and 0. Also, I've tried to use R, but 
it is too complicated for a non-statistician like me... can you help me?

Thank you for your attention,
so long
Michela Marignani
University of Siena
Environmental Science Dept.
Siena, Italy
[EMAIL PROTECTED]




RE: [R] A rude question

2005-01-27 Thread Pikounis, Bill [CNTUS]
Ming,

  results are trustable?(I don't want to affend anyone, also I trust
  people).

Years ago I read about a simplified formula to answer whether I trust
someone, and in turn, something:

Trustworthiness = Competence + Character.

I think a bit of research, as the other R-help posters have so
comprehensively covered in their replies to your original question, will
convince you, or anyone else you need to convince, that the R-core team and
the core product of R itself rate at the top of the scale on both character
and competence.

Packages of course will not be as consistently high in the trustworthiness
continuum, but rest assured there are several that are high, which again,
you can verify yourself for your and/or your audience's needs.

Best Regards,
Bill

---
Bill Pikounis, PhD

Nonclinical Statistics
Centocor, Inc.
200 Great Valley Parkway
MailStop C4-1
Malvern, PA 19355

610 240 8498
fax 610 651 6717 

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Sent: Thursday, January 27, 2005 12:10 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] A rude question
 
 
 Dear all, 
  I am a beginner using R. I have a question about it. When you use it,
  since it is written by so many authors, how do you know that the
  results are trustable? (I don't want to offend anyone; also I trust
  people). But I think this should be a question.
 
  Thanks,
  Ming
 
 



[R] Output predictions based on a Cox model

2005-01-27 Thread George Fraser
Hi,

I've generated a Cox model, but I'm struggling to work out how to output
predictions based on the model I've made.

my.model <- coxph(Surv(duration, status) ~ gender + var1 + var2,
data = mydata)

My test data set looks something like this:

id,actualduration,gender,var1,var2
a,65,m,1,3
b,34,f,1,5
...

What i need to do is for each id, output a predicted duration based on
my cox model so that I can compare it with other models.

I've looked in the survival package, and the Design package, but I can
only see how to output survival probabilities.  I'm probably missing
something obvious, but trawling the mail archives has been fruitless,
any suggestions?
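One common route (a sketch, not from this thread; it uses the survival package's built-in lung data as a stand-in for the poster's data): get each new case's predicted survival curve from survfit() and take its median survival time as a point prediction of duration.

```r
library(survival)

# Stand-in for the poster's model, fit on the packaged lung data
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Predicted survival curves for new cases, then each curve's median
# survival time as a predicted duration
newd <- data.frame(age = c(60, 70), sex = c(1, 2))
sf   <- survfit(fit, newdata = newd)
med  <- summary(sf)$table[, "median"]
```

`med` then has one predicted duration per row of `newd`, comparable across models.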

Cheers,
George



Re: [R] A rude question

2005-01-27 Thread Michael Grant
Ming,

You have received a number of excellent replies to
your question and should really consider them. Here is
another point--really extending a Bill Venables
comment:

If people were out to push dodgy software, this is
not the way they'd go about it.

Definitely! Look at the requirements for submitting a
package to R. While the required documentation and the
uniform approach mandated do not automatically equate
to V&V'ed (verified and validated) code, they are a
strong indication of the commitment of the R core and
contributing communities. The imposition of these
standards by the core team and the time committed to
the project vis-a-vis development, the help list, etc.
speak volumes about the quality of R. Rest assured such
commitment is not the norm.

That being said, I do respectfully disagree with Dr.
Rossini in one minor detail ;O). It is not 'extremely
paranoid' to  re-code in another language and
definitely not so to do hand calculations! Murphy's
Law is relentless in all matters! If you are like most
of us (all of us?) you will find errors in your own
coding and maybe rarely an R bug. 

BTW, since you are starting out in R... voraciously
read the documentation, help list, newsletter, and other
free and commercial material on R; work through the
examples relevant to your area of endeavor; read more,
code more, read more, code more, read more, code
more... The facility with R that you gain as a result
will reward you multifold down the road.

Best regards,
Michael Grant

P.S. Whenever you upgrade R, read the CHANGES, NEWS
files, etc. R does evolve--even the core--although it
is very controlled and managed. (You will learn of
bugfixes there too.)


--- [EMAIL PROTECTED] wrote:

 Dear all, 
  I am a beginner using R. I have a question about it. When you use it,
  since it is written by so many authors, how do you know that the
  results are trustable? (I don't want to offend anyone; also I trust
  people). But I think this should be a question.
 
  Thanks,
  Ming
 




Re: [R] sw

2005-01-27 Thread Christoph Buser
Dear Mahdi

Mahdi Osman writes:
  Hi list,
  
  I am just a new user of R.
  
  How can I run stepwise regression in R?

If you look for a stepwise procedure for Linear Regression
Models or Generalized Linear Models, you can use step() 
(see ?step) 
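A minimal sketch of step() on a built-in data set (the variable choice here is arbitrary, just to give step() something to drop):

```r
# Start from a full model, then let step() drop terms by AIC
fit0 <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
fit1 <- step(fit0, trace = 0)   # trace = 0 suppresses the per-step printout
formula(fit1)                   # the selected model
```

By default step() searches both directions from the supplied model; see the `direction` and `scope` arguments of ?step for forward-only or backward-only selection.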

Regards,

Christoph

--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C11
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-1-632-5414  fax: 632-1228
http://stat.ethz.ch/~buser/
--



  Is there a graphical user interface for any of the spatial packages included
  in R, such as gstat, geoR and some others? I am mainly interested in
  interactive variogram modelling and mapping.
  
  Thanks
  Mahdi
  
  -- 
  ---
  Mahdi Osman (PhD)
  E-mail: [EMAIL PROTECTED]
  ---
  
  10 GB Mailbox, 100 FreeSMS http://www.gmx.net/de/go/topmail
  



[R] cluster, mona error

2005-01-27 Thread Morten Mattingsdal
Hi
I have a problem using the cluster package on my binary data. I want to 
try mona first, but I get an error.

hc <- read.table("all.txt", header=TRUE, sep="\t", row.names=1)
str(hc)
'data.frame':   51 obs. of  59 variables:
 $ G1p : int  2 1 1 1 1 1 1 1 1 1 ...
 $ G1q : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G2p : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G2q : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G3p : int  1 1 1 1 1 1 1 1 1 1 ...
m <- mona(hc)
Error in mona(hc) : All variables must be binary (factor with 2 levels).
I find this strange when the cluster dataset animals have the same 
structure as my data.

str(animals)
'data.frame':   20 obs. of  6 variables:
 $ war: int  1 1 2 1 2 2 2 2 2 1 ...
 $ fly: int  1 2 1 1 1 1 2 2 1 2 ...
 $ ver: int  1 1 2 1 2 2 2 2 2 1 ...
 $ end: int  1 1 1 1 2 1 1 2 2 1 ...
 $ gro: int  2 2 1 1 2 2 2 1 2 1 ...
 $ hai: int  1 2 2 2 2 2 1 1 1 1 ...
m <- mona(animals)   # works fine
what is this error trying to tell me?
mvh
morten


Re: [R] getting package version inside .First.lib

2005-01-27 Thread Roger D. Peng
This is what I use for all my packages, which I believe handles multiple 
versions of the same package being installed:

.First.lib <- function(lib, pkg) {
    ver <- read.dcf(file.path(lib, pkg, "DESCRIPTION"), "Version")
    ver <- as.character(ver)
    ...
}
-roger
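The same read.dcf() idea can be checked outside .First.lib against any installed package's DESCRIPTION file (read.dcf parses the field structure, so no assumptions about line order are needed, unlike scan()ing raw lines):

```r
# Read the Version field of an installed package's DESCRIPTION
dfile <- system.file("DESCRIPTION", package = "stats")
ver   <- as.character(read.dcf(dfile, "Version"))
ver
```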
Adrian Baddeley wrote:
Greetings - 

Is it possible, inside .First.lib,
to find out the version number of the package that is being loaded?
If only one version of the package has been installed,
we could scan the DESCRIPTION file, something like
.First.lib <- function(lib, pkg) {
    library.dynam("spatstat", pkg, lib)
    dfile <- system.file("DESCRIPTION", package="spatstat")
    ttt <- scan(dfile, what="", sep="\n", quiet=TRUE)[2]
    vvv <- strsplit(ttt, " ")[[1]][2]
    cat("spatstat version number", vvv, "\n")
}
but even this does not seem very safe (it makes assumptions about the
format of the DESCRIPTION file).
Is there a better way?
thanks
Adrian Baddeley
--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/


Re: [R] cluster, mona error

2005-01-27 Thread Sean Davis
On Jan 27, 2005, at 9:06 AM, Morten Mattingsdal wrote:
Hi
I have a problem using the cluster package on my binary data. I want 
to try mona first, but I get an error.

hc <- read.table("all.txt", header=TRUE, sep="\t", row.names=1)
str(hc)
'data.frame':   51 obs. of  59 variables:
 $ G1p : int  2 1 1 1 1 1 1 1 1 1 ...
 $ G1q : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G2p : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G2q : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G3p : int  1 1 1 1 1 1 1 1 1 1 ...
m <- mona(hc)
Error in mona(hc) : All variables must be binary (factor with 2 levels).

You have to be careful that the data are indeed each factors with 2 
levels (numeric variables with values 1 and 2 will not do).  A summary 
of the data will tell you that.

Sean
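A sketch of the check described above, on a toy stand-in for the poster's data: coerce the columns to factors and look for any that do not have exactly 2 levels, since those are what mona() rejects.

```r
# G1q is constant (all 1s), so as a factor it has only one level
hc <- data.frame(G1p = c(2, 1, 1, 1),
                 G1q = c(1, 1, 1, 1))
hc[] <- lapply(hc, factor)      # coerce every column to a factor
nlev <- sapply(hc, nlevels)
names(hc)[nlev != 2]            # columns that would trigger the mona() error
```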
I find this strange when the cluster dataset animals have the same 
structure as my data.

str(animals)
'data.frame':   20 obs. of  6 variables:
 $ war: int  1 1 2 1 2 2 2 2 2 1 ...
 $ fly: int  1 2 1 1 1 1 2 2 1 2 ...
 $ ver: int  1 1 2 1 2 2 2 2 2 1 ...
 $ end: int  1 1 1 1 2 1 1 2 2 1 ...
 $ gro: int  2 2 1 1 2 2 2 1 2 1 ...
 $ hai: int  1 2 2 2 2 2 1 1 1 1 ...
m <- mona(animals)   # works fine
what is this error trying to tell me?
mvh
morten


Re: [R] cluster, mona error solved

2005-01-27 Thread Morten Mattingsdal
Sean Davis wrote:
On Jan 27, 2005, at 9:06 AM, Morten Mattingsdal wrote:
Hi
I have a problem using the cluster package on my binary data. I want 
to try mona first, but I get an error.

hc <- read.table("all.txt", header=TRUE, sep="\t", row.names=1)
str(hc)
'data.frame':   51 obs. of  59 variables:
 $ G1p : int  2 1 1 1 1 1 1 1 1 1 ...
 $ G1q : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G2p : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G2q : int  1 1 1 1 1 1 1 1 1 1 ...
 $ G3p : int  1 1 1 1 1 1 1 1 1 1 ...
m <- mona(hc)
Error in mona(hc) : All variables must be binary (factor with 2 levels).
You have to be careful that the data are indeed each factors with 2 
levels (numeric variables with values 1 and 2 will not do).  A summary 
of the data will tell you that.

Sean
Yes, now I understand. There was a single variable among my 59 that had 
only one level. I used summary(mydata) as you said and found:

L16p
Min.   :1
1st Qu.:1
Median :1
Mean   :1
3rd Qu.:1
Max.   :1
I removed this variable and now it works fine. Thanks a lot for your quick reply.
regards
grateful morten


Re: [R] cluster, mona error

2005-01-27 Thread Christian Hennig
Morten,

just a try: is there a constant variable (only 1) in the first dataset?

Christian

On Thu, 27 Jan 2005, Morten Mattingsdal wrote:

 Hi
 
 I have a problem using the cluster package on my binary data. I want to 
 try mona first, but I get an error.
 
 hc <- read.table("all.txt", header=TRUE, sep="\t", row.names=1)
 str(hc)
 'data.frame':   51 obs. of  59 variables:
  $ G1p : int  2 1 1 1 1 1 1 1 1 1 ...
  $ G1q : int  1 1 1 1 1 1 1 1 1 1 ...
  $ G2p : int  1 1 1 1 1 1 1 1 1 1 ...
  $ G2q : int  1 1 1 1 1 1 1 1 1 1 ...
  $ G3p : int  1 1 1 1 1 1 1 1 1 1 ...
 
 m <- mona(hc)
 Error in mona(hc) : All variables must be binary (factor with 2 levels).
 
 I find this strange when the cluster dataset animals have the same 
 structure as my data.
 
 str(animals)
 'data.frame':   20 obs. of  6 variables:
  $ war: int  1 1 2 1 2 2 2 2 2 1 ...
  $ fly: int  1 2 1 1 1 1 2 2 1 2 ...
  $ ver: int  1 1 2 1 2 2 2 2 2 1 ...
  $ end: int  1 1 1 1 2 1 1 2 2 1 ...
  $ gro: int  2 2 1 1 2 2 2 1 2 1 ...
  $ hai: int  1 2 2 2 2 2 1 1 1 1 ...
 
 m <- mona(animals)   # works fine
 
 what is this error trying to tell me?
 mvh
 morten
 
 

***
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
ich empfehle www.boag-online.de



[R] Indexing Lists and Partial Matching

2005-01-27 Thread McGehee, Robert
I was unaware until recently that partial matching was used to index
data frames and lists. This is now causing a great deal of problems in
my code as I sometimes index a list without knowing what elements it
contains, expecting a NULL if the column does not exist. However, if
partial matching is used, sometimes R will return an object I do not
want. My question, is there an easy way of getting around this?

For example:
> a <- NULL
> a$abc <- 5
> a$a
[1] 5
> a$a <- a$a
> a
$abc
[1] 5

$a
[1] 5

Certainly from a coding perspective, one might expect that assigning a$a
to itself wouldn't do anything, since either 1) a$a doesn't exist, so
nothing happens, or 2) a$a does exist and so it just assigns its value
to itself. However, in the above case it creates a new element entirely,
because I happen to have another element called a$abc. I do not want
this behavior.

The solution I came up with was to create another indexing function that
uses subset() (which doesn't partial match), then check for an
error, and if there is an error substitute NULL (to mimic the [[
behavior). However, I don't really want to start using another indexing
function altogether just to get around this behavior. Is there a better
way? Can I turn off partial matching?

Thanks,
Robert


Robert McGehee
Geode Capital Management, LLC
53 State Street, 5th Floor | Boston, MA | 02109
Tel: 617/392-8396Fax:617/476-6389
mailto:[EMAIL PROTECTED]






[R] partial ranking models

2005-01-27 Thread Ruud H. Koning
Dear R-users, is a library available to estimate partial ranking models? 
Best, Ruud



[R] weighting in nls

2005-01-27 Thread Robert Brown FM CEFAS
I'm fitting nonlinear functions to some growth data, but I'm getting radically 
different results in R than in another program (Prism). Furthermore, the values 
from the other program give a better fit and seem more realistic.  I think there 
is a problem with the results from the R nls function. The differences only occur 
with weighted data, so I think I'm making a mistake in the weighting. I'm 
following the procedure outlined on p. 244 of MASS (or at least I'm trying to).

Thus, I'm using mean data with heteroscedasticity so I'm weighting by n/ 
variance, where the variance is well known from a large data set. This 
weighting factor is available as the variable 'novervar'.

The function is a von Bertalanffy curve of the form 
weight ~ (a * (1 - exp(-b * (age - c))))^3.  Thus I'm entering the command in the form:

solb1wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3), data=solb1.na.rm, start=list(a=0.85, b=0.45, c=0.48))

Can anyone suggest what I'm doing wrong?  I seem to be following the 
instructions in MASS. I tried following the similar instructions on page 450 of 
the white book, but these were a bit cryptic.

I'm using R 2.0.0 on a Windows 2000 machine
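The MASS zero-response trick can be sanity-checked on simulated data where the truth is known. This sketch (my own, using a simpler exponential-decay model rather than the poster's von Bertalanffy curve) shows the idea: the formula has no left-hand side, so nls() minimizes the sum of squares of sqrt(w) * (y - f(x)) directly.

```r
set.seed(1)
x <- 1:20
w <- runif(20, 1, 3)                          # known weights, as in the post
y <- 5 * exp(-0.3 * x) + rnorm(20, sd = 0.02 / sqrt(w))

# Zero-response formula: nls() minimizes sum((sqrt(w) * (y - f(x)))^2)
fit <- nls(~ sqrt(w) * (y - a * exp(-b * x)),
           start = list(a = 4, b = 0.25))
round(coef(fit), 2)
```

With clean simulated data the estimates should recover a = 5 and b = 0.3 closely; a large discrepancy would point to a problem in how the weights enter the formula.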

Regards,

Robert Brown





RE: [R] Indexing Lists and Partial Matching

2005-01-27 Thread Liaw, Andy
This has been discussed a few times on this list before, so you might want
to dig into the archive...

You might want to check existence of name instead of checking whether the
component is NULL:

> x <- list(bc = "bc", ab = "ab")
> is.null(x$b)
[1] FALSE
> "b" %in% names(x)
[1] FALSE

Andy
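The name check above can be wrapped into a small accessor that sidesteps partial matching entirely (the helper name here is my own, not from the thread):

```r
# Return the element only on an exact name match, else NULL
get_exact <- function(x, name) {
  if (name %in% names(x)) x[[name]] else NULL
}

a <- list(abc = 5)
a$a                   # partial match: returns 5
get_exact(a, "a")     # exact match only: NULL
get_exact(a, "abc")   # 5
```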


 From: McGehee, Robert
 
 I was unaware until recently that partial matching was used to index
 data frames and lists. This is now causing a great deal of problems in
 my code as I sometimes index a list without knowing what elements it
 contains, expecting a NULL if the column does not exist. However, if
 partial matching is used, sometimes R will return an object I do not
 want. My question, is there an easy way of getting around this?
 
 For example:
 > a <- NULL
 > a$abc <- 5
 > a$a
 [1] 5
 > a$a <- a$a
 > a
 $abc
 [1] 5
 $a
 [1] 5
 
 Certainly from a coding perspective, one might expect assigning a$a to
 itself wouldn't do anything since either 1) a$a doesn't exist, so
 nothing happens, or 2) a$a does exist and so it just assigns its value
 to itself. However, in the above case, it creates a new 
 column entirely
 because I happen to have another column called a$abc. I do 
 not want this
 behavior.
 
 The solution I came up with was to create another indexing 
 function that
 uses the subset() (which doesn't partial match), then check for an
 error, and if there is an error substitute NULL (to mimic the [
 behavior). However, I don't really want to start using 
 another indexing
 function altogether just to get around this behavior. Is 
 there a better
 way? Can I turn off partial matching?
 
 Thanks,
 Robert
 
 
 Robert McGehee
 Geode Capital Management, LLC
 53 State Street, 5th Floor | Boston, MA | 02109
 Tel: 617/392-8396Fax:617/476-6389
 mailto:[EMAIL PROTECTED]
 
 
 
 
 




RE: [R] Indexing Lists and Partial Matching

2005-01-27 Thread Huntsinger, Reid
This came up a few months ago. Check the thread on hashing and partial
matching around Nov 18. The short answer is no, you can't turn it off
because lots of code relies on that behavior. 

Reid Huntsinger 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of McGehee, Robert
Sent: Thursday, January 27, 2005 9:34 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Indexing Lists and Partial Matching


I was unaware until recently that partial matching was used to index
data frames and lists. This is now causing a great deal of problems in
my code as I sometimes index a list without knowing what elements it
contains, expecting a NULL if the column does not exist. However, if
partial matching is used, sometimes R will return an object I do not
want. My question, is there an easy way of getting around this?

For example:
> a <- NULL
> a$abc <- 5
> a$a
[1] 5
> a$a <- a$a
> a
$abc
[1] 5
$a
[1] 5

Certainly from a coding perspective, one might expect assigning a$a to
itself wouldn't do anything since either 1) a$a doesn't exist, so
nothing happens, or 2) a$a does exist and so it just assigns its value
to itself. However, in the above case, it creates a new column entirely
because I happen to have another column called a$abc. I do not want this
behavior.

The solution I came up with was to create another indexing function that
uses the subset() (which doesn't partial match), then check for an
error, and if there is an error substitute NULL (to mimic the [
behavior). However, I don't really want to start using another indexing
function altogether just to get around this behavior. Is there a better
way? Can I turn off partial matching?

Thanks,
Robert


Robert McGehee
Geode Capital Management, LLC
53 State Street, 5th Floor | Boston, MA | 02109
Tel: 617/392-8396Fax:617/476-6389
mailto:[EMAIL PROTECTED]






RE: [R] weighting in nls

2005-01-27 Thread Liaw, Andy
Can you show us the difference; i.e., what are the parameter estimates and
associated SEs from the two programs?  Even better, can you supply an
example data set?

[With this `trick' for weighted nls, you need to be careful with the output
of predict().]

Andy

 From: Robert Brown FM CEFAS
 
 I'm fitting nonlinear functions to some growth data but  I'm 
 getting radically different results in R to another program 
 (Prism). Furthermore the values from the other program give a 
 better fit and seem more realistic.  I think there is a 
 problem with the results from the r nls function. The 
 differences only occur with weighted data so I think I'm 
 making a mistake in the weighting. I'm following the 
 procedure outlined on p 244 of MASS (or at least I'm trying to).
 
 Thus, I'm using mean data with heteroscedasticity so I'm 
 weighting by n/ variance, where the variance is well known 
 from a large data set. This weighting factor is available as 
 the variable 'novervar'.
 
 The function is a von Bertalanffy curve of the form 
 weight ~ (a * (1 - exp(-b * (age - c))))^3.  Thus I'm entering the 
 command in the form:
 
 solb1wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3), data=solb1.na.rm, start=list(a=0.85, b=0.45, c=0.48))
 
 Can anyone suggest what I'm doing wrong?  I seem to be 
 folowing the instructions in MASS. I tried following the 
 similar instructions on page 450 of the white book but these 
 were a bit cryptic.
 
 I'm using R 2.0.0 on a Windows 2000 machine
 
 Regards,
 
 Robert Brown
 
 
 
 




RE: [R] weighting in nls

2005-01-27 Thread Robert Brown FM CEFAS

Hi there,

this is the output from R

> solb2wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3), data=solb2.na.rm, start=list(a=0.85, b=0.45, c=0.48))
> summary(solb2wvb)

Formula:  ~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3)

Parameters:
  Value Std. Error  t value 
a  1.087370 0.01193090  91.1392
b  0.151838 0.00714963  21.2372
c -1.809770 0.13186000 -13.7250

Residual standard error: 4.41368 on 109 degrees of freedom

The output from Prism is:

von Bertalanffy 
Best-fit values 
 A  0.8957
 B  0.2381
 C  -1.358
Std. Error  
 A  0.002280
 B  0.002568
 C  0.02919
95% Confidence Intervals
 A  0.8912 to 0.9001
 B  0.2331 to 0.2431
 C  -1.415 to -1.300

The latter has a much better visual fit and reasonable residuals. Furthermore, 
theory and practice both lead to the expectation that this model should fit 
the data.

Incidentally, I was under the impression that with a weighted nls in R the SE 
values were not accurate.

Finally I've attached the dataset




-Original Message-
From: Liaw, Andy [mailto:[EMAIL PROTECTED]
Sent: 27 January 2005 15:25
To: Robert Brown FM CEFAS; r-help@stat.math.ethz.ch
Subject: RE: [R] weighting in nls


Can you show us the difference; i.e., what are the parameter estimates and
associated SEs from the two programs?  Even better, can you supply an
example data set?

[With is `trick' for weighted nls, you need to be careful with the output of
predict().]

Andy

 From: Robert Brown FM CEFAS
 
 I'm fitting nonlinear functions to some growth data but  I'm 
 getting radically different results in R to another program 
 (Prism). Furthermore the values from the other program give a 
 better fit and seem more realistic.  I think there is a 
 problem with the results from the r nls function. The 
 differences only occur with weighted data so I think I'm 
 making a mistake in the weighting. I'm following the 
 procedure outlined on p 244 of MASS (or at least I'm trying to).
 
 Thus, I'm using mean data with heteroscedasticity so I'm 
 weighting by n/ variance, where the variance is well known 
 from a large data set. This weighting factor is available as 
 the variable 'novervar'.
 
 The function is a von Bertalanffy curve of the form 
 weight ~ (a * (1 - exp(-b * (age - c))))^3.  Thus I'm entering the 
 command in the form:
 
 solb1wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3), data=solb1.na.rm, start=list(a=0.85, b=0.45, c=0.48))
 
 Can anyone suggest what I'm doing wrong?  I seem to be 
 folowing the instructions in MASS. I tried following the 
 similar instructions on page 450 of the white book but these 
 were a bit cryptic.
 
 I'm using R 2.0.0 on a Windows 2000 machine
 
 Regards,
 
 Robert Brown
 
 
 
 
 




RE: [R] Indexing Lists and Partial Matching

2005-01-27 Thread McGehee, Robert
Thank you both for your reference. I had missed the previous discussions
before posting.

I am surprised to hear that there is code that relies on this indexing
behavior, especially if it is in the base package. I'm not sure how a
function could even make use of this feature without first asking R what
the names of the list or data frame are, and then intentionally
shortening them to something else. It even seems reasonable that if code
_does_ rely on this behavior, then it may be subject to other problems
anyway, such as if the wrong data is unintentionally returned (when NULL
or error should be returned instead). (Although I freely acknowledge my
ignorance of the uses of this feature as I only recently discovered it.)

From the previous posts, it seems the only way in R to code around this
is to _always_ check the names of a list before indexing, as anything
else could lead to very subtle errors in complex code, unless one can a
priori guarantee that the list names are always distinguishable. 

Perhaps one easy way to optionally remove this feature without breaking
anything would be to have an option/flag in the description or namespace
of a package indicating that list-indexing partial-matching should not
be used for any function within that package. But that might be a bit
hackish.

However, for my personal code, the a[[match("abc", names(a))]] construct
(from one of the Nov 18th posts) is easy enough to use, so no intention
to rehash an already well-discussed topic.

Thanks,
Robert

PS. None of this applies to partial matching of function arguments, as
this is certainly widely used.

-Original Message-
From: Huntsinger, Reid [mailto:[EMAIL PROTECTED] 
Sent: Thursday, January 27, 2005 10:15 AM
To: 'McGehee, Robert'; r-help@stat.math.ethz.ch
Subject: RE: [R] Indexing Lists and Partial Matching


This came up a few months ago. Check the thread on hashing and partial
matching around Nov 18. The short answer is no, you can't turn it off
because lots of code relies on that behavior. 

Reid Huntsinger 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of McGehee, Robert
Sent: Thursday, January 27, 2005 9:34 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Indexing Lists and Partial Matching


I was unaware until recently that partial matching was used to index
data frames and lists. This is now causing a great deal of problems in
my code as I sometimes index a list without knowing what elements it
contains, expecting a NULL if the column does not exist. However, if
partial matching is used, sometimes R will return an object I do not
want. My question, is there an easy way of getting around this?

For example:
> a <- NULL
> a$abc <- 5
> a$a
[1] 5
> a$a <- a$a
> a
$abc
[1] 5
$a
[1] 5

Certainly from a coding perspective, one might expect assigning a$a to
itself wouldn't do anything since either 1) a$a doesn't exist, so
nothing happens, or 2) a$a does exist and so it just assigns its value
to itself. However, in the above case, it creates a new column entirely
because I happen to have another column called a$abc. I do not want this
behavior.

The solution I came up with was to create another indexing function that
uses the subset() (which doesn't partial match), then check for an
error, and if there is an error substitute NULL (to mimic the [
behavior). However, I don't really want to start using another indexing
function altogether just to get around this behavior. Is there a better
way? Can I turn off partial matching?

Thanks,
Robert


Robert McGehee
Geode Capital Management, LLC
53 State Street, 5th Floor | Boston, MA | 02109
Tel: 617/392-8396Fax:617/476-6389
mailto:[EMAIL PROTECTED]











[R] Installing Problems

2005-01-27 Thread Ashok Veeraraghavan
Hi,

I tried installing R on Mac OS X 10.3. After the R installation I tried
installing Bioconductor, which requires R. I ran into some problems
with Bioconductor. Right now I want to remove (uninstall) all R and
Bioconductor components from my machine and start afresh. Can somebody
tell me how I can remove (uninstall) all R and Bioconductor components?

Thanks
Regards

Ashok



Re: [R] sw

2005-01-27 Thread Spencer Graves
also stepAIC in library(MASS).  hope this helps.  spencer graves
Christoph Buser wrote:
Dear Mahdi
Mahdi Osman writes:
 Hi list,
 
 I am just a new user of R.
 
 How can I run stepwise regression in R?

If you look for a stepwise procedure for Linear Regression
Models or Generalized Linear Models, you can use step() 
(see ?step) 

Regards,
Christoph
--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C11
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-1-632-5414  fax: 632-1228
http://stat.ethz.ch/~buser/
--

 Is there a graphical user interface for any of the spatial packages included
 in R, such as gstat, geoR and some others? I am mainly interested in
 interactive variogram modelling and mapping.
 
 Thanks
 Mahdi
 
 -- 
 ---
 Mahdi Osman (PhD)
 E-mail: [EMAIL PROTECTED]
 ---
 
 10 GB Mailbox, 100 FreeSMS http://www.gmx.net/de/go/topmail
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] How to generate labels or names?

2005-01-27 Thread Eric Rodriguez
Hi,

I'm new to R and I would like to generate labels like data.frame does:
V1 V2 V3.
I'm trying to generate an N-element vector with labels such as Lab1 Lab2 ... LabN.

I guess this is pretty easy when you know R ;)

Thanks for help

Eric

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] svd error

2005-01-27 Thread WU,TONGTONG
Hi,

  I ran into a problem recently and need your help.  I would really
appreciate it.

  I kept receiving the following error message when running a program:

'Error in svd(X) : infinite or missing values in x'.

However, I did not use any svd function in this program though I did
include the function pseudoinverse.  Is the problem caused by doing
pseudoinverse?

Best regards,
Tongtong

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] weighting in nls

2005-01-27 Thread Liaw, Andy
There seems to be some peculiarity with the weights.  If you try the
unweighted fit, it comes much closer to the answer from Prism...

Andy

 From: Robert Brown FM CEFAS 
 
 Hi there,
 
 this is the output from R
 
  
 solb2wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3),
 data = solb2.na.rm, start = list(a = 0.85, b = 0.45, c = 0.48))
  summary(solb2wvb)
 
 Formula:  ~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3)
 
 Parameters:
   Value Std. Error  t value 
 a  1.087370 0.01193090  91.1392
 b  0.151838 0.00714963  21.2372
 c -1.809770 0.13186000 -13.7250
 
 Residual standard error: 4.41368 on 109 degrees of freedom
 
 The output from Prism is:
 
 von Bertalanffy
 Best-fit values
  A    0.8957
  B    0.2381
  C   -1.358
 Std. Error
  A    0.002280
  B    0.002568
  C    0.02919
 95% Confidence Intervals
  A    0.8912 to 0.9001
  B    0.2331 to 0.2431
  C   -1.415 to -1.300
 
 The latter has much better visual fit and reasonable 
 residuals. Furthermore theory and practice both lead to the 
 expectation that this model should fit the data.
 
 Incidentally, I was under the impression that with a weighted 
 nls in R the SE values were not accurate.
 
 Finally I've attached the dataset
 
 
 
 
 -Original Message-
 From: Liaw, Andy [mailto:[EMAIL PROTECTED]
 Sent: 27 January 2005 15:25
 To: Robert Brown FM CEFAS; r-help@stat.math.ethz.ch
 Subject: RE: [R] weighting in nls
 
 
 Can you show us the difference; i.e., what are the parameter 
 estimates and
 associated SEs from the two programs?  Even better, can you supply an
 example data set?
 
 [With this `trick' for weighted nls, you need to be careful 
 with the output of
 predict().]
 
 Andy
 
  From: Robert Brown FM CEFAS
  
  I'm fitting nonlinear functions to some growth data but I'm 
  getting radically different results in R than in another program 
  (Prism). Furthermore the values from the other program give a 
  better fit and seem more realistic.  I think there is a 
  problem with the results from the R nls function. The 
  differences only occur with weighted data, so I think I'm 
  making a mistake in the weighting. I'm following the 
  procedure outlined on p. 244 of MASS (or at least I'm trying to).
  
  Thus, I'm using mean data with heteroscedasticity, so I'm 
  weighting by n/variance, where the variance is well known 
  from a large data set. This weighting factor is available as 
  the variable 'novervar'.
  
  The function is a von Bertalanffy curve of the form 
  weight ~ (a * (1 - exp(-b * (age - c))))^3.  Thus I'm entering the 
  command in the form:
  
  solb1wvb <- nls(~ sqrt(novervar) * (weight - (a * (1 - exp(-b * (age - c))))^3),
  data = solb1.na.rm, start = list(a = 0.85, b = 0.45, c = 0.48))
  
  Can anyone suggest what I'm doing wrong?  I seem to be 
  following the instructions in MASS. I tried following the 
  similar instructions on page 450 of the white book but these 
  were a bit cryptic.
  
  I'm using R 2.0.0 on a Windows 2000 machine
  
  Regards,
  
  Robert Brown
  
  
  **
  *
  This email and any attachments are intended for the named 
  re...{{dropped}}
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
  
  
 
 
 --
 
 Notice:  This e-mail message, together with any attachments, 
 contains information of Merck  Co., Inc. (One Merck Drive, 
 Whitehouse Station, New Jersey, USA 08889), and/or its 
 affiliates (which may be known outside the United States as 
 Merck Frosst, Merck Sharp  Dohme or MSD and in Japan, as 
 Banyu) that may be confidential, proprietary copyrighted 
 and/or legally privileged. It is intended solely for the use 
 of the individual or entity named on this message.  If you 
 are not the intended recipient, and have received this 
 message in error, please notify us immediately by reply 
 e-mail and then delete it from your system.
 --
 
 
 
 **
 *
 This email and any attachments are intended for the named 
 recipient only.  Its unauthorised use, distribution, 
 disclosure, storage or copying is not permitted.  If you have 
 received it in error, please destroy all copies and notify 
 the sender.  In messages of a non-business nature, the views 
 and opinions expressed are the author's own and do not 
 necessarily reflect those of the organisation from which it 
 is sent.  All emails may be subject to monitoring.
 **
 *
 



RE: [R] Request for help (reference details)

2005-01-27 Thread Huntsinger, Reid
I referred in my reply to a paper by Diaconis and Sturmfels. The exact
reference is:

Diaconis and Sturmfels, Algebraic algorithms for sampling from conditional
distributions, Ann. Statist. 26 (1998), 363-397.

They cite the following:

Besag and Clifford, Generalized Monte Carlo significance tests, Biometrika
76 (1989), 633-642.

which actually contains your problem (Section 3, "Testing the Rasch model")
and gives a very simple Markov chain for sampling from the uniform
distribution on these matrices. If you need other than the uniform
distribution, see the modifications Diaconis and Sturmfels make (the
Metropolis step).
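[For the archives: the swap chain Besag and Clifford describe can be sketched in a few lines of R. This is my own illustration under stated assumptions — the function name swap_step() is invented, and the example matrix is made up — not code from either paper.]

```r
## One step of a "checkerboard swap" Markov chain on 0/1 matrices.
## Flipping a 2x2 submatrix [[1,0],[0,1]] <-> [[0,1],[1,0]] leaves
## every row and column sum unchanged.
swap_step <- function(m) {
  i <- sample(nrow(m), 2)                 # two distinct rows
  j <- sample(ncol(m), 2)                 # two distinct columns
  sub <- m[i, j]
  anti <- sub[cbind(1:2, 2:1)]            # the anti-diagonal entries
  if ((all(diag(sub) == 1) && all(anti == 0)) ||
      (all(diag(sub) == 0) && all(anti == 1)))
    m[i, j] <- 1 - sub                    # flip the checkerboard
  m
}

set.seed(42)
m <- matrix(c(1, 0, 1,
              0, 1, 0,
              1, 1, 0), nrow = 3, byrow = TRUE)
rs <- rowSums(m); cs <- colSums(m)
for (k in 1:1000) m <- swap_step(m)       # run the chain
all(rowSums(m) == rs, colSums(m) == cs)   # margins preserved: TRUE
```

A Metropolis acceptance step would be needed for non-uniform target distributions, as Diaconis and Sturmfels discuss.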

Reid Huntsinger

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Huntsinger, Reid
Sent: Thursday, January 27, 2005 10:50 AM
To: 'Michela Marignani'; r-help@stat.math.ethz.ch
Subject: RE: [R] Request for help


Persi Diaconis and Bernd Sturmfels have an article on generating random
contingency tables uniformly distributed subject to having fixed marginals
for the same purpose (null distribution of conditional test) and they used
Markov Chain Monte Carlo to sample. That could perhaps be adapted here. The
article is in the Annals of Statistics from several years ago, and if you google
for "algebraic statistics" you'll probably find several recent expositions
of the ideas, possibly even code.

Reid Huntsinger

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Michela Marignani
Sent: Thursday, January 27, 2005 3:52 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Request for help


My name is Michela Marignani and I'm an ecologist trying to solve a problem 
linked to the knight's tour algorithm.
I need a program to create random presence/absence matrices (i.e. with 1/0 
values), with given column and row sums, to create null models for 
statistical comparison of species distribution phenomena.
I've seen many solutions to the problem on the web, but none provides the 
freedom to easily change the row and column constraints, and none of them 
produces matrices of 1s and 0s. Also, I've tried to use R, but it is too 
complicated for a non-statistician like me. Can you help me?

Thank you for your attention,
so long

Michela Marignani
University of Siena
Environmental Science Dept.
Siena, Italy
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to generate labels or names?

2005-01-27 Thread Christoph Buser
Hi Eric

If you want produce a vector with names, you can use

v <- rnorm(20)
names(v) <- paste("Lab", 1:20, sep = "")

Regards,

Christoph

--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C11
ETH (Federal Inst. Technology)  8092 Zurich  SWITZERLAND
phone: x-41-1-632-5414  fax: 632-1228
http://stat.ethz.ch/~buser/
--


Eric Rodriguez writes:
  Hi,
  
  I'm new to R and I would like to generate labels like data.frame does
  : V1 V2 V3
  I'm trying to generate a N vector with label such as Lab1 Lab2 ... LabN.
  
  I guess this is pretty easy when you know R ;)
  
  Thanks for help
  
  Eric
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] svd error

2005-01-27 Thread Spencer Graves
 You haven't told us what you used to compute the pseudoinverse, 
but I can get that error message using ginv in library(MASS).  When I 
then typed ginv (without the parentheses), it listed the code, and I 
quickly saw Xsvd <- svd(X) [using R 2.0.1 under Windows 2000]. 

 hope this helps. 
 spencer graves
p.s.  The posting guide (www.R-project.org/posting-guide.html) can help 
you find answers to many questions like this yourself, in addition to 
improving your facility with the language AND improving, I believe, your 
chances of getting a reply that actually answers your question.  In this 
case, if you are not using ginv in library(MASS) and the discussion 
above doesn't help you solve the problem otherwise, following the 
posting guide would have made it much easier for someone like me to 
provide a more useful answer. 
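[A small sketch of the check implied above — my own code, not from the thread; it requires the MASS package.]

```r
## ginv() calls svd(), which fails on non-finite input, so test the
## matrix before trying to invert it.
library(MASS)
X <- matrix(c(1, 2, 3, NA), nrow = 2)       # contains a missing value
ok <- all(is.finite(X))                     # FALSE here: X has an NA
if (ok) ginv(X) else which(!is.finite(X))   # locate the bad entries
## After the real error, traceback() shows the call path into svd().
```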

WU,TONGTONG wrote:
Hi,
 I met a probem recently and need your help.  I would really appreciate
it.
 I kept receiving the following error message when running a program:
'Error in svd(X) : infinite or missing values in x'.
However, I did not use any svd function in this program though I did
include the function pseudoinverse.  Is the problem caused by doing
pseudoinverse?
Best regards,
Tongtong
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to generate labels or names?

2005-01-27 Thread Sean Davis
Note that sometimes it makes more sense to use a list than a labeled 
vector.

Sean
On Jan 27, 2005, at 12:26 PM, Spencer Graves wrote:
?paste
One of its examples is
paste("A", 1:6, sep = "")
[1] "A1" "A2" "A3" "A4" "A5" "A6"
spencer graves
Eric Rodriguez wrote:
Hi,
I'm new to R and I would like to generate labels like data.frame does
: V1 V2 V3
I'm trying to generate a N vector with label such as Lab1 Lab2 ... 
LabN.

I guess this is pretty easy when you know R ;)
Thanks for help
Eric
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] svd error

2005-01-27 Thread Spencer Graves
Dear Prof. Ripley: 

 With library(MASS), I got the following in R 2.0.1 under Windows 
2000: 

> X
     [,1] [,2]
[1,]    1    3
[2,]    2   NA
> ginv(X)
Error in svd(X) : infinite or missing values in x
 This may not relate to Tongtong Wu's problem, but it used ginv 
in library(MASS) as you suggested and did produce the cited error message. 

 spencer graves
Prof Brian Ripley wrote:
On Thu, 27 Jan 2005, WU,TONGTONG wrote:
Hi,
 I met a probem recently and need your help.  I would really appreciate
it.
 I kept receiving the following error message when running a program:
'Error in svd(X) : infinite or missing values in x'.
However, I did not use any svd function in this program though I did
include the function pseudoinverse.  Is the problem caused by doing
pseudoinverse?

Where did you find that function?  It is not part of R as it ships, 
and it *may* be part of GeneTS, where it calls svd after squaring the 
matrix. But there are simpler pseudoinverse functions (e.g. ginv in 
MASS) that will not introduce that error.

The tool you needed was traceback(): try it to see what it tells you 
here.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] svd error

2005-01-27 Thread Prof Brian Ripley
On Thu, 27 Jan 2005, Spencer Graves wrote:
Dear Prof. Ripley: 
With library(MASS), I got the following in R 2.0.1 under Windows 2000: 
> X
     [,1] [,2]
[1,]    1    3
[2,]    2   NA
> ginv(X)
Error in svd(X) : infinite or missing values in x
This may not relate to Tongtong Wu's problem, but it used ginv in 
library(MASS) as you suggested and did produce the cited error message.
I said `introduce'.  The cause of the error is in X, not introduced by 
ginv. pseudoinverse can introduce NaNs/infinities.

Please do remember the care I take when writing things.
BDR

spencer graves
Prof Brian Ripley wrote:
On Thu, 27 Jan 2005, WU,TONGTONG wrote:
Hi,
 I met a probem recently and need your help.  I would really appreciate
it.
 I kept receiving the following error message when running a program:
'Error in svd(X) : infinite or missing values in x'.
However, I did not use any svd function in this program though I did
include the function pseudoinverse.  Is the problem caused by doing
pseudoinverse?

Where did you find that function?  It is not part of R as it ships, and it 
*may* be part of GeneTS, where it calls svd after squaring the matrix. But 
there are simpler pseudoinverse functions (e.g. ginv in MASS) that will not 
introduce that error.

The tool you needed was traceback(): try it to see what it tells you here.

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with R and Bioconductor

2005-01-27 Thread Jeff Gentry
 seemed successful. Then while attempting to getBioC() I had to force
 quit the R application since I had to attend to something else
 urgently. When i returned and tried to getBioC, I am getting errors

Why not just let it run?

 indicating that there is a lock on some files. So i would like to

The directory will likely be path-to-R/library/00LOCK (I say likely
because the 'path-to-R/library' part could be something else if you
specified an alternate installation directory or your default .libPaths is
different from the standard), and removing that directory will solve your
issues.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Array Manipulation

2005-01-27 Thread cdsmith
I have a data set that looks like the following:
ID  Responce
1   57
1   63
1   49
2   31
2   45
2   67
2   91
3   56
3   43
4   23
4   51
4   61
4   76
4   68
5   34
5   35
5   45
I used sample(unique(ID)) to select a sample of IDs, say, (1,4,5).  Now
I want to pull out the rows with IDs 1, 4, and 5.  I've tried forcing
the matrix into a vector but it does not create an appropriate vector.
I've also tried an if statement but it didn't work right either.  Any
suggestions?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Array Manipulation

2005-01-27 Thread Liaw, Andy
Something like:

dat[dat$ID %in% sample(unique(dat$ID), 3), ]

Andy

 From: [EMAIL PROTECTED]
 
 I have a data set that looks like the following:
 ID  Responce
 1   57
 1   63
 1   49
 2   31
 2   45
 2   67
 2   91
 3   56
 3   43
 4   23
 4   51
 4   61
 4   76
 4   68
 5   34
 5   35
 5   45
 I used sample(unique(ID)) to select a sample if ID's, say, 
 (1,4,5).  Now
 I want to pull out the rows with ID's 1, 4, and 5.  I've 
 tried forceing
 the matrix into a vector but it does not create and 
 appropriate vector.
  I've also tried the if statment but it didn't work right either.  Any
 suggestions?
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Array Manipulation

2005-01-27 Thread Douglas Bates
Liaw, Andy wrote:
Something like:
dat[dat$ID %in% sample(unique(dat$ID), 3), ]
or
subset(dat, ID %in% sample(unique(ID), 3))
which I find to be more readable.
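[Both idioms can be checked side by side on a small made-up version of the poster's data — my own sketch; the sampled IDs depend on the seed.]

```r
## The %in% selection and the subset() selection pick the same rows.
dat <- data.frame(ID       = c(1, 1, 2, 2, 2, 3, 4, 4, 5),
                  Responce = c(57, 63, 31, 45, 67, 56, 23, 51, 34))
set.seed(1)
ids <- sample(unique(dat$ID), 3)          # three sampled IDs
a <- dat[dat$ID %in% ids, ]               # %in% indexing
b <- subset(dat, ID %in% ids)             # subset() version
identical(a, b)                           # TRUE: same rows either way
```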
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Is glm weird or am I?

2005-01-27 Thread Peter Dalgaard
roy wilson [EMAIL PROTECTED] writes:

 Hi,
 
 I've written a script that checks all bivariate correlations for
 variables in a matrix. I'm now trying to run a logistic regression on
 each pair (x,y) where y is a factor with 2 levels. I don't know how
 (or whether I want) to try to fathom what's up with glm.
 
 What I wrote is attached. Here's what I get.

[If you want people to debug your code, you might supply the data as
well. People might be more helpful if they can actually run your code.
Remember who is asking who for a favour...]
 
 *
 
 source("lrtest.R")
 building model: Wgend ~ WAY
 construct_and_run_model:
 class of x: integer nlevels(x): 0
 class of y: factor nlevels(y): 2
   model built
   model ran
 -1.070886   0.01171153
 building model: Wgend ~ WBWS
 construct_and_run_model:
 class of x: integer nlevels(x): 0
 class of y: factor nlevels(y): 2
   model built
   model ran
 0.0837854   0.01898052
 building model: Wgend ~ Wcond
 construct_and_run_model:
 class of x: factor nlevels(x): 2
 class of y: factor nlevels(y): 2
 Error in "contrasts<-"(`*tmp*`, value = "contr.treatment") :
 contrasts can be applied only to factors with 2 or more levels
 
 *
 
 Both Wcond and Wgend take values in {1,2}. My understanding is that,
 when the family is binomial, glm recodes these to {0, 1}. That's
 consistent with what I've seen previously.
 
 Excuse the possible stupidity :-).

What you're seeing is similar to this:

> x <- factor(rep(0, 20), levels = 0:1)
> y <- rbinom(20, 1, .5)
> glm(y ~ x, binomial)
Error in "contrasts<-"(`*tmp*`, value = "contr.treatment") :
contrasts can be applied only to factors with 2 or more levels

I.e. x is a two-level factor, but only one level is actually present
in data.

You have

 attach(newDataSet)
 for (cond in 1:2) {
 # Select rows for each condition
 t <- newDataSet[Wcond == cond,] 


and then you proceed to use Wcond as a regressor within the data frame
t. 
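[A minimal sketch of the failure mode and the usual diagnostic — my own code, extending the example above.]

```r
## A factor can declare levels that never occur in the data; glm()
## then fails when building contrasts for it.
set.seed(1)
x <- factor(rep(0, 20), levels = 0:1)   # level "1" declared, never seen
y <- rbinom(20, 1, 0.5)
fit <- try(glm(y ~ x, binomial), silent = TRUE)
inherits(fit, "try-error")              # TRUE: the contrasts error
x2 <- factor(x)                         # re-factor: drops unused levels
nlevels(x2)                             # 1 -- x2 is constant, so it has
                                        # no business being a regressor
```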

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Where is MASS

2005-01-27 Thread Doran, Harold
Dear List:

I have been using the MASS package until 5 minutes ago. I just updated
some packages from CRAN, something happened and R crashed. I then
started R again and tried to source in some code that calls MASS, but
received an error that there is not a package called MASS.

I then went to install packages from CRAN and MASS was not visible as an
option and I then went to the CRAN website and did not see MASS as one
of the contributed packages available. I looked at the changes in the
last two editions of R-News and didn't see anything related to MASS. I
might be missing something obvious. 

Has something happened to this package?

Thanks,

Harold

Windows XP
2.0.1

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] clustering

2005-01-27 Thread WeiWei Shi
Hi,
I just got a question (sorry if it is a dumb one) and I phrase my
question in the following R code:

group1 <- rnorm(n = 50, mean = 0, sd = 1)
group2 <- rnorm(n = 20, mean = 1, sd = 1.5)
group3 <- c(group1, group2)


Now, if I am given a dataset from group3, what method (discriminant
analysis, clustering, maybe) is best for clustering them using R?
The known info includes: 2 clusters, normal distributions (but the
parameters are unknown).

Thanks,

Ed

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Where is MASS

2005-01-27 Thread Liaw, Andy
It's part of the VR bundle (for a _long_ time...)

Andy

 From: Doran, Harold
 
 Dear List:
 
 I have been using the MASS package until 5 minutes ago. I just updated
 some packages from CRAN, something happened and R crashed. I then
 started R again and tried to source in some code that calls MASS, but
 received an error that there is not a package called MASS.
 
 I then went to install packages from CRAN and MASS was not 
 visible as an
 option and I then went to the CRAN website and did not see MASS as one
 of the contributed packages available. I looked at the changes in the
 last two editions of R-News and didn't see anything related to MASS. I
 might be missing something obvious. 
 
 Has something happened to this package?
 
 Thanks,
 
 Harold
 
 Windows XP
 2.0.1
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Where is MASS

2005-01-27 Thread Rolf Turner
You need to get the VR bundle from CRAN.  MASS has been part of the
VR bundle for a long time.  Like, forever.  But the packages in
the VR bundle ship with R by default, so if you've installed R
you should have those packages there automatically.

So it would seem that something got damaged in the R crash that
you spoke of.  Perhaps you should re-install R.

cheers,

Rolf Turner
[EMAIL PROTECTED]

===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===

Harold Doran wrote:

 I have been using the MASS package until 5 minutes ago. I just updated
 some packages from CRAN, something happened and R crashed. I then
 started R again and tried to source in some code that calls MASS, but
 received an error that there is not a package called MASS.
 
 I then went to install packages from CRAN and MASS was not visible as an
 option and I then went to the CRAN website and did not see MASS as one
 of the contributed packages available. I looked at the changes in the
 last two editions of R-News and didn't see anything related to MASS. I
 might be missing something obvious. 
 
 Has something happened to this package?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Results of MCD estimators in MASS and rrcov

2005-01-27 Thread Valentin Todorov
The two implementations use different consistency factors as well as
different small sample correction factors.

1. The search parts of both implementations produce the same result -
compare rrcov.mcd$best and mass.mcd$best.

2. The raw MCD covariance matrix is corrected as follows:

MASS:
 - Rousseeuw and Leroy (1987), p.259 (eq. 1.26)
 - Marazzi (1993) (or maybe Rousseeuw and van Zomeren (1990), p. 638 (eq. A.9))

rrcov:
 - Croux and Haesbroeck (1999), Pison et al., p. 337
 - Pison et al. (2002), p. 338

3. The reweighted (final) covariance matrix is corrected as follows:

MASS: no correction
rrcov: Pison et al. (2002), p. 339

This explains the different covariance matrices.
As far as the location is concerned, in this particular case the raw MCD
estimates in MASS identify one additional outlier - observation 53, which is
discarded from the computation of the reweighted estimates.
Look at the following plots and judge for yourself if this is an outlier or not:

  covPlot(hbk, mcd = rrcov.mcd, which = "distance", id.n = 15)
  covPlot(hbk, mcd = mass.mcd, which = "distance", id.n = 15)

valentin

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Where is MASS

2005-01-27 Thread Pikounis, Bill [CNTUS]
Harold,

Try looking in C:\Program Files\R\rw2001\library (presuming the default
install path) to see if the MASS folder is still there. If it is, then what
exactly is the error message you are getting when you try library(MASS)?

If MASS is no longer there, I would try the Packages > Install package(s)
from CRAN... menu item in the R Console and see if VR is listed. If so,
install it via Select > OK and MASS should be restored.  Actually, this
*might* work even if you still have a MASS folder.

Hope that helps,
Bill

---
Bill Pikounis, PhD

Nonclinical Statistics
Centocor, Inc.
200 Great Valley Parkway
MailStop C4-1
Malvern, PA 19355

 
  I have been using the MASS package until 5 minutes ago. I 
 just updated
  some packages from CRAN, something happened and R crashed. I then
  started R again and tried to source in some code that calls 
 MASS, but
  received an error that there is not a package called MASS.
  
  I then went to install packages from CRAN and MASS was not 
 visible as an
  option and I then went to the CRAN website and did not see 
 MASS as one
  of the contributed packages available. I looked at the 
 changes in the
  last two editions of R-News and didn't see anything related 
 to MASS. I
  might be missing something obvious. 
  
  Has something happened to this package?
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] clustering

2005-01-27 Thread msck9
The cluster analysis should be able to handle that. I think if you 
know how many clusters you have, kmeans is ok, or the EM algorithm 
can also do that. 
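[A hedged sketch using the simulated data from the question. kmeans() ships with R; a model-based EM fit would need an add-on package such as mclust.]

```r
## Two-cluster k-means on the one-dimensional normal mixture.
set.seed(1)
group1 <- rnorm(n = 50, mean = 0, sd = 1)
group2 <- rnorm(n = 20, mean = 1, sd = 1.5)
group3 <- c(group1, group2)
km <- kmeans(group3, centers = 2)
table(km$cluster)          # sizes of the two recovered clusters
## With means only 1 sd apart the groups overlap heavily, so no
## method will recover the true labels perfectly.
```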
On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote:
 Hi,
 I just get a question (sorry if it is a dumb one) and I phase my
 question in the following R codes:
 
 group1-rnorm(n=50, mean=0, sd=1)
 group2-rnorm(n=20, mean=1, sd=1.5)
 group3-c(group1,group2)
 
 
 Now, if I am given a dataset from group3, what method (discriminant
 analysis, clustering, maybe) is the best to cluster them by using R.
 The known info includes: 2 clusters, normal distribution (but the
 parameters are unknown).
 
 Thanks,
 
 Ed
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Finding runs of TRUE in binary vector

2005-01-27 Thread Sean Davis
I have a binary vector and I want to find all regions of that vector 
that are runs of TRUE (or FALSE).

> a <- rnorm(10)
> b <- a < 0.5
> b
 [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
My function would return something like a list:
region[[1]] 1,3
region[[2]] 5,5
region[[3]] 7,10
Any ideas besides looping and setting start and ends directly?
Thanks,
Sean
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] clustering

2005-01-27 Thread WeiWei Shi
Hi,
thanks for the reply. In fact, I tried both of them, and I also tried
other methods, and I found that all of them gave me different boundaries
(on my real datasets). I am thinking about k-medians but hoping to get
more suggestions from all of you in this forum.

Cheers,

Ed


On Thu, 27 Jan 2005 15:37:16 -0600, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 The cluster analysis should be able to handle that. I think if you
 know how many clusters you have, kmeans is ok, or the EM algorithm
 can also do that.
 On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote:
  Hi,
  I just get a question (sorry if it is a dumb one) and I phase my
  question in the following R codes:
 
  group1-rnorm(n=50, mean=0, sd=1)
  group2-rnorm(n=20, mean=1, sd=1.5)
  group3-c(group1,group2)
 
 
  Now, if I am given a dataset from group3, what method (discriminant
  analysis, clustering, maybe) is the best to cluster them by using R.
  The known info includes: 2 clusters, normal distribution (but the
  parameters are unknown).
 
  Thanks,
 
  Ed
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Finding runs of TRUE in binary vector

2005-01-27 Thread Vadim Ogranovich
Untested:

c(TRUE, b[-1] != b[-length(b)]) gives you the (logical) indexes of the
beginnings of the runs
c(b[-1] != b[-length(b)], TRUE) gives you the (logical) indexes of the
ends of the runs

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sean Davis
 Sent: Thursday, January 27, 2005 2:14 PM
 To: r-help
 Subject: [R] Finding runs of TRUE in binary vector
 
 I have a binary vector and I want to find all regions of 
 that vector that are runs of TRUE (or FALSE).
 
  > a <- rnorm(10)
  > b <- a < 0.5
  > b
   [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
 
 My function would return something like a list:
 region[[1]] 1,3
 region[[2]] 5,5
 region[[3]] 7,10
 
 Any ideas besides looping and setting start and ends directly?
 
 Thanks,
 Sean
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Finding runs of TRUE in binary vector

2005-01-27 Thread Sean Davis
Thanks Patrick, Albyn, and Vadim.
rle() does what I want and, Vadim, your method gives the same results 
in a different form.  I appreciate the help!

Sean
On Jan 27, 2005, at 5:29 PM, Vadim Ogranovich wrote:
Untested:
c(TRUE, b[-1] != b[-length(b)]) gives you the (logical) indexes of the
beginnings of the runs
c(b[-1] != b[-length(b)], TRUE) gives you the (logical) indexes of the
ends of the runs
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Sean Davis
Sent: Thursday, January 27, 2005 2:14 PM
To: r-help
Subject: [R] Finding runs of TRUE in binary vector
I have a binary vector and I want to find all regions of
that vector that are runs of TRUE (or FALSE).
a <- rnorm(10)
b <- a < 0.5
b
  [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
My function would return something like a list:
region[[1]] 1,3
region[[2]] 5,5
region[[3]] 7,10
Any ideas besides looping and setting start and ends directly?
Thanks,
Sean
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Survreg with gamma distribution

2005-01-27 Thread Roger Dungan
Dear r-help subscribers,

I am working on survival analysis of some interval-censored failure-time
data in R. I have done similar analysis before using PROC LIFEREG
in SAS. In that instance, a gamma survival function was the optimum
parametric model for describing the survival and hazard functions. I
would like to be able to use a gamma function in R, but apparently the
survival package does not support this distribution. I have been
googling around for some help, and have found some threads on a similar
question posted to the R-help list in October last year. Because I am a
bit of a survival-analysis and R newbie, I didn't really understand the
discussion thread.

I've been working with a Weibull distribution, thus:

leafsurv.weibull <- survreg(Surv(minage, maxage, censorcode,
    type = "interval") ~ 1, dist = "weibull")

And I guess I'd like to be able to do something that's the equivalent of

leafsurv.gamma <- survreg(Surv(minage, maxage, censorcode,
    type = "interval") ~ 1, dist = "gamma")

At least one of the R-help listserver comments mentioned using
survreg.distributions to customise a gamma distribution, but I can't
figure out how to make this work with the resources (intellectual and
bibliographical!) that I have available.

With thanks in advance for your help,

Dr Roger Dungan
School of Biological Sciences
University of Canterbury
Christchurch, New Zealand
ph +64 3 366 7001 ext. 4848
fax +64 3 354 2590

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Multiple colors in a plot title

2005-01-27 Thread Leo Espindle
R-Help:

Is there a way to use multiple colors in the title of a plot?  For
instance, to have certain words be red, and certain words be blue?

thanks in advance,
Leo

-- 
1718 Commonwealth Avenue
Apt 2
Brighton, MA 02135
Cell:  617-599-0037
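One workaround I believe works for two-colour titles is the plotmath phantom() trick: draw the title twice, once per colour, with phantom() reserving the space of the other piece so both halves line up as one title. A sketch (the colour split and wording are illustrative):

```r
# Overlay two title() calls; phantom() keeps the spacing of the hidden
# part so the red and blue pieces align as a single title.
plot(1:10, main = "")
title(main = expression("red words" * phantom(" and blue words")),
      col.main = "red")
title(main = expression(phantom("red words") * " and blue words"),
      col.main = "blue")
```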

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Array Manipulation

2005-01-27 Thread Eric Rodriguez
Or something like this:

dat[dat$ID == sample(unique(dat$ID), 3), 2]   ?

I'm not sure about the , 2; maybe you need the full matrix?


PS: the first time, I forgot to copy the list.
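A hedged follow-up on that line: == recycles the three sampled IDs element-by-element against the ID column, which silently selects the wrong rows; %in% does the intended membership test. The toy dat below is my own illustration, not the poster's data:

```r
# Toy data frame standing in for the poster's dat (illustrative only).
set.seed(3)
dat <- data.frame(ID = rep(1:5, each = 2), value = rnorm(10))
ids <- sample(unique(dat$ID), 3)

# %in% tests each row's ID for membership in the sampled set.
sub <- dat[dat$ID %in% ids, ]
sub   # every row whose ID is one of the 3 sampled IDs
```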

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Finding runs of TRUE in binary vector

2005-01-27 Thread Peter Dalgaard
Sean Davis [EMAIL PROTECTED] writes:

 I have a binary vector and I want to find all regions of that vector
 that are runs of TRUE (or FALSE).
 
  > a <- rnorm(10)
  > b <- a < 0.5
  > b
   [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
 
 My function would return something like a list:
 region[[1]] 1,3
 region[[2]] 5,5
 region[[3]] 7,10
 
 Any ideas besides looping and setting start and ends directly?

You could base it on

> rle(b)
Run Length Encoding
  lengths: int [1:5] 1 1 2 4 2
  values : logi [1:5]  TRUE FALSE  TRUE FALSE  TRUE
> b
 [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE

(Notice that my b differs from yours)

then you might proceed with

> end <- cumsum(rle(b)$lengths)
> start <- rev(length(b) + 1 - cumsum(rev(rle(b)$lengths)))
> # or:   start <- c(1, end[-length(end)] + 1)
> cbind(start, end)[rle(b)$values, ]
     start end
[1,]     1   1
[2,]     3   4
[3,]     9  10
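Peter's rle() recipe can be packaged as a small helper that returns the list of (start, end) pairs Sean asked for; the function name true.regions is my own:

```r
# Return the (start, end) index pairs of the TRUE runs in a logical vector,
# as a list, built on the cumsum(rle(...)$lengths) recipe above.
true.regions <- function(b) {
  r <- rle(b)
  end <- cumsum(r$lengths)
  start <- c(1, head(end, -1) + 1)
  m <- cbind(start, end)[r$values, , drop = FALSE]
  lapply(seq_len(nrow(m)), function(i) m[i, ])
}

b <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)
reg <- true.regions(b)
reg   # three regions: (1, 3), (5, 5), (7, 10)
```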


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] binomial data and mixed model

2005-01-27 Thread deanm
Hi,
I am a first user of R.

I was hoping I could get some help on some data I need to analyze.

The experimental design is a completely randomized design with 2 factors (source 
material and depth). It was supposed to consist of 4 treatments replicated 3 
times: source 1 applied at 10 cm and source 2 applied at 20 cm. During 
construction of the treatments the depths varied considerably, so I can't test 
all my samples based on 10 and 20 cm any more; the depths are now considered 
random rather than fixed. Each treatment was sampled for depth and total plant 
density along 3 transects with 28 quadrats per transect. The data are very 
non-normal (lots of zeros), so the only way to analyze them is to convert to 
binomial data. Does anyone know what type of analysis I should use? I was told 
that an NLMIXED model would work, but also that a generalized linear mixed 
model would be appropriate. Is there any info on using these in R?

Dean D. MacKenzie
Master's of Science Candidate. A.Ag
Department of Renewable Resources
Rm 723 GSB
University of Alberta
Edmonton, AB
T6G 2H1
Office tel: (780) 492-4135
Home tel:   (780) 437-9563

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Threshhold Models in gnlm

2005-01-27 Thread Ben Bolker

  

Eliot McIntire emcintire at forestry.umt.edu writes:

 Hello,
 
 I am interested in fitting a generalized nonlinear regression (gnlr) model  
 with negative binomial errors.
 
 I have found Jim Lindsey's package that will do gnlr, but I am having  
 trouble with the particular model I am interested in fitting.
 
 It is a threshhold model, where below a certain value of one of the  
 parameters being fitted, the model changes.
 
   [BIG SNIP]

Threshold models (also known as piecewise linear, or more recently,
hockey stick models) are actually surprisingly challenging to fit
numerically.  There are papers on least-squares fitting going back
to Bacon and Watts (1971, Biometrika) and before to Quandt, and more
recent (2000) posts on the S-PLUS lists from Bill Venables, Mary Lindstrom,
and Nicholas Barrowman (who has a paper with Ram Myers on hockey stick
models in fisheries).  The basic trick is that, unless you do some kind
of numerical smoothing, it's very easy to get stuck in local minima.
I got a little carried away with the problem and am sending you some
code off-list ...

  cheers
Ben Bolker

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with R and Bioconductor

2005-01-27 Thread Ashok Veeraraghavan
Hi Jeff,

First of all, thanks for the response. But I am still encountering some
problems with installing Bioconductor.

I did remove the directory 00LOCK.
I ran getBioC("affy")

During the installation I get a warning several times:
chmod: /Library/Frameworks/R.framework/Versions/2.0.1/Resources/library/R.css: Operation
not permitted
Is this warning message critical to the installation?

Moreover, at the end of the installation I also got this message. I
have appended that message to the end of this email.

Question 1.
Package annotate was not updated. Why wasn't it updated? How can I update it?

Question 2.
Should I be worried about the other warning messages?

Thanks
Regards
Ashok




Packages that were not updated:
annotate

Warning messages: 
1: 
 Package annotate version 1.5.1 suggests zebrafish
Package annotate version 1.5.1 suggests xenopuslaevis 
 in: resolve.depends(pkgInfo, repEntry, force, lib = lib,
searchOptions = searchOptions,
2: Installation of package annaffy had non-zero exit status in:
installPkg(fileName, pkg, pkgVer, type, lib, repEntry, versForce)
3: 
 Package annotate version 1.5.1 suggests zebrafish
Package annotate version 1.5.1 suggests xenopuslaevis 
 in: resolve.depends(pkgInfo, repEntry, force, lib = lib,
searchOptions = searchOptions,
4: Installation of package Rgraphviz had non-zero exit status in:
installPkg(fileName, pkg, pkgVer, type, lib, repEntry, versForce)
5: Installation of package geneplotter had non-zero exit status in:
installPkg(fileName, pkg, pkgVer, type, lib, repEntry, versForce)
6: 
 Package annotate version 1.5.1 suggests zebrafish
Package annotate version 1.5.1 suggests xenopuslaevis 
 in: resolve.depends(pkgInfo, repEntry, force, lib = lib,
searchOptions = searchOptions,




On Thu, 27 Jan 2005 13:41:45 -0500 (EST), Jeff Gentry
[EMAIL PROTECTED] wrote:
  seemed successful. Then while attempting to getBioC() I had to force
  quit the R application since I had to attend to something else
  urgently. When i returned and tried to getBioC, I am getting errors
 
 Why not just let it run?
 
  indicating that there is a lock on some files. So i would like to
 
 The directory will likely be path-to-R/library/00LOCK (I say likely
 because the 'path-to-R/library' part could be something else if you
 specified an alternate installation directory or your default .libPaths is
 different than standard), and removing that directory will solve your
 issues.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Finding runs of TRUE in binary vector

2005-01-27 Thread james . holtman




Use 'rle':

> a <- rnorm(20)
> b <- a < .5
> b
 [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
[13] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE
> rle(b)
Run Length Encoding
  lengths: int [1:9] 1 7 2 2 2 3 1 1 1
  values : logi [1:9] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE

__
James HoltmanWhat is the problem you are trying to solve?
Executive Technical Consultant  --  Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929



   
Sean Davis [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
01/27/2005 17:13

To:      r-help r-help@stat.math.ethz.ch
cc:
Subject: [R] Finding runs of TRUE in binary vector




I have a binary vector and I want to find all regions of that vector
that are runs of TRUE (or FALSE).

  a <- rnorm(10)
  b <- a < 0.5
  b
  [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

My function would return something like a list:
region[[1]] 1,3
region[[2]] 5,5
region[[3]] 7,10

Any ideas besides looping and setting start and ends directly?

Thanks,
Sean

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] agglomerative coefficient in agnes (cluster)

2005-01-27 Thread Weiguang Shi
Thanks very much, Andy, for the code and the
explanation. The meaning of AC is much clearer now.

I did notice, when I tried the code, that the results were
not exactly the same as yours:

> sapply(c(.25, .5), testAC, x=x[1:4], method="single")
Loading required package: cluster
Error in FUN(X[[1]], ...) : Object "x" not found
> x = rnorm(50)
> sapply(c(.25, .5), testAC, x=x[1:4], method="single")
[1] 0.7450599 0.9926918

> version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

Regards,
Weiguang

 --- Liaw, Andy [EMAIL PROTECTED] wrote: 
 It has to do with sample sizes.  Consider the
 following:
 
 testAC <- function(prop1=0.5, x=rnorm(50),
 center=c(0, 100), ...) {
 stopifnot(require(cluster))
 n <- length(x)
 n1 <- ceiling(n * prop1)
 n2 <- n - n1
 agnes(x + rep(center, c(n1, n2)), ...)$ac
 }
 
 Now some tests:
 
 > sapply(c(.25, .5), testAC, x=x[1:4], method="single")
 [1] 0.7427591 0.9862944
 > sapply(1:5 / 10, testAC, x=x[1:10], method="single")
 [1] 0.8977139 0.9974224 0.9950061 0.9946366 0.9946366
 > sapply(1:5 / 10, testAC, x=x, method="single")
 [1] 0.9982955 0.9969757 0.9971114 0.9971127 0.9975111
 
 So it seems that AC does not count isolated singletons as cluster
 structure.  This is only discernible in small samples, though.
 
 Andy
 
 
  
   --- Liaw, Andy [EMAIL PROTECTED] wrote: 
   BTW, I checked the book.  You're not going find
 much
   more than that.
   
  Thanks for checking.
  
  Weiguang
  
 

__
   
  Post your free ad now! http://personals.yahoo.ca
  
  
 
 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] clustering

2005-01-27 Thread Liaw, Andy
It depends a lot on what you know or don't know about the data, and what
problem you're trying to solve.  

If you know for sure it's a mixture of Gaussians, likelihood-based
approaches might be better.  MASS (the book) has an example of fitting a
univariate mixture of Gaussians using various optimizers.  The code is even
in $R_HOME/library/MASS/scripts/ch16.R.

Andy
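A minimal sketch of the likelihood-based route Andy mentions, fitting Ed's two-component normal mixture with optim(); the parameterisation and starting values are my own choices (MASS ch. 16 treats this more carefully):

```r
# Direct maximum likelihood for a two-component normal mixture.
# th = (logit of mixing weight, mu1, mu2, log sd1, log sd2).
set.seed(1)
x <- c(rnorm(50, mean = 0, sd = 1), rnorm(20, mean = 1, sd = 1.5))

negll <- function(th) {
  p <- plogis(th[1])                          # mixing weight kept in (0, 1)
  -sum(log(p * dnorm(x, th[2], exp(th[4])) +
           (1 - p) * dnorm(x, th[3], exp(th[5]))))
}
fit <- optim(c(0, -0.5, 1.5, 0, 0), negll)

# Posterior probability of component 1 gives a soft cluster assignment.
p  <- plogis(fit$par[1])
d1 <- p * dnorm(x, fit$par[2], exp(fit$par[4]))
d2 <- (1 - p) * dnorm(x, fit$par[3], exp(fit$par[5]))
cluster <- ifelse(d1 / (d1 + d2) > 0.5, 1, 2)
table(cluster)
```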

 From: WeiWei Shi
 
 Hi,
 thanks for reply. In fact, I tried both of them and I also tried the
 other method and I found all of them gave me different boundaries (to
 my real datasets). I am thinking about k-median but hoping to get more
 suggestions from all of you in this forum.
 
 Cheers,
 
 Ed
 
 
 On Thu, 27 Jan 2005 15:37:16 -0600, [EMAIL PROTECTED] 
 [EMAIL PROTECTED] wrote:
  The cluster analysis should be able to handle that. I think if you
  know how many clusters you have, kmeans is ok, or the EM algorithm
  can also do that.
  On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote:
   Hi,
   I just get a question (sorry if it is a dumb one) and I phase my
   question in the following R codes:
  
   group1 <- rnorm(n=50, mean=0, sd=1)
   group2 <- rnorm(n=20, mean=1, sd=1.5)
   group3 <- c(group1, group2)
  
  
   Now, if I am given a dataset from group3, what method 
 (discriminant
   analysis, clustering, maybe) is the best to cluster them 
 by using R.
   The known info includes: 2 clusters, normal distribution (but the
   parameters are unknown).
  
   Thanks,
  
   Ed
   
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] clustering

2005-01-27 Thread WeiWei Shi
Actually the problem I am trying to solve is to discretize a
continuous variable (which is the response (dependent) variable in my
project) so that I can turn a regression problem into a
classification one. (There are many reasons for doing this.)

Since there is no class label for this variable (because this variable
is my class variable :), an unsupervised approach can be applied
here. However, checking the related papers shows there is little
research (to my knowledge; I haven't checked the MCC yet) in this
field. Checking normality with qqnorm and a histogram indicates
there might be two normal distributions.

My approach is to split the values of this variable into 2 or 3
intervals and check each interval's normality again. If an approach
like clustering, or the one Andy suggests, works well, then I should get
much better normality. I will try that tomorrow.

I am not sure whether my idea works here; please advise!

Thanks,

Ed
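Ed's plan can be sketched as follows; kmeans() as the unsupervised splitter and the simulated y are illustrative stand-ins for his real response variable:

```r
# Split the response into two groups without labels, then check
# each piece for normality (Shapiro-Wilk as a numeric companion
# to the qqnorm() plots Ed mentions).
set.seed(2)
y <- c(rnorm(50, mean = 0, sd = 1), rnorm(20, mean = 1, sd = 1.5))

km <- kmeans(y, centers = 2)          # unsupervised 2-way split
groups <- split(y, km$cluster)

sapply(groups, function(g) shapiro.test(g)$p.value)
```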


On Thu, 27 Jan 2005 18:58:28 -0500, Liaw, Andy [EMAIL PROTECTED] wrote:
 It depends a lot on what you know or don't know about the data, and what
 problem you're trying to solve.
 
 If you know for sure it's a mixture of gaussians, likelihood based
 approaches might be better.  MASS (the book) has an example of fitting
 univariate mixture of gaussians using various optimizers.  The code is even
 in $R_HOME/library/MASS/scripts/ch16.R.
 
 Andy
 
  From: WeiWei Shi
 
  Hi,
  thanks for reply. In fact, I tried both of them and I also tried the
  other method and I found all of them gave me different boundaries (to
  my real datasets). I am thinking about k-median but hoping to get more
  suggestions from all of you in this forum.
 
  Cheers,
 
  Ed
 
 
  On Thu, 27 Jan 2005 15:37:16 -0600, [EMAIL PROTECTED]
  [EMAIL PROTECTED] wrote:
   The cluster analysis should be able to handle that. I think if you
   know how many clusters you have, kmeans is ok, or the EM algorithm
   can also do that.
   On Thu, Jan 27, 2005 at 03:44:42PM -0500, WeiWei Shi wrote:
Hi,
I just get a question (sorry if it is a dumb one) and I phase my
question in the following R codes:
   
group1 <- rnorm(n=50, mean=0, sd=1)
group2 <- rnorm(n=20, mean=1, sd=1.5)
group3 <- c(group1, group2)
   
   
Now, if I am given a dataset from group3, what method
  (discriminant
analysis, clustering, maybe) is the best to cluster them
  by using R.
The known info includes: 2 clusters, normal distribution (but the
parameters are unknown).
   
Thanks,
   
Ed
   
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
  http://www.R-project.org/posting-guide.html
  
 
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide!
  http://www.R-project.org/posting-guide.html
 
 
 
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Matrix multiplication in R is inaccurate!!

2005-01-27 Thread Vikas Rawal

If you multiply a matrix by its inverse you should get an identity matrix. In 
R, you get an answer that is accurate only to about 16 decimal places. Why 
can't one get an exact answer?

See for example: 

c(5,3) -> x1
c(3,2) -> x2
cbind(x1,x2) -> x
solve(x) -> y
x %*% y
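For what it's worth, a short sketch of how to see that this is ordinary floating-point rounding rather than an error; all.equal() compares with a tolerance and zapsmall() rounds the tiny residue away for display:

```r
# The entries are doubles, so solve() and %*% round at roughly the
# 16th significant digit; the product is the identity only up to that.
x1 <- c(5, 3); x2 <- c(3, 2)
x <- cbind(x1, x2)
y <- solve(x)

xy <- x %*% y
print(xy)                      # off-diagonals may be ~1e-16 rather than 0
all.equal(xy, diag(2), check.attributes = FALSE)   # TRUE: equal up to tolerance
zapsmall(xy)                   # displays as a clean identity matrix
```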

Vikas Rawal


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Error: cannot allocate vector of size... but with a twist

2005-01-27 Thread James Muller
Hi,
I have a memory problem, one which I've seen pop up on the list a few 
times, but which seems to be a little different. It is the "Error: cannot 
allocate vector of size x" problem. I'm running R 2.0 on RH9.

My R program is joining big datasets together, so there are lots of 
duplicate cases of data in memory. This (and other tasks) prompted me 
to... expand... my swap partition to 16Gb. I have 0.5Gb of regular, fast 
DDR. The OS seems to be fine accepting the large amount of memory, and 
I'm not restricting memory use or vector size in any way.

R chews up memory up until the 3.5Gb area, then halts. Here's the last 
bit of output:

> # join the data together
> cdata01.data <- cbind(c.1,c.2,c.3,c.4,c.5,c.6,c.7,c.8,c.9,c.10,c.11,c.12,c.13,c.14,c.15,c.16,c.17,c.18,c.19,c.20,c.21,c.22,c.23,c.24,c.25,c.26,c.27,c.28,c.29,c.30,c.31,c.32,c.33)
Error: cannot allocate vector of size 145 Kb
Execution halted

145 Kb?? This has me rather lost. Maybe an overflow of some sort?
Maybe an OS problem of some sort? I'm scratching my head here.

Before you question it, there is a legitimate reason for sticking all 
these components in the one data.frame.

One of the problems here is that tinkering is not really feasible. This 
cbind took 1.5 hrs to finally halt.

Any help greatly appreciated,
James
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error: cannot allocate vector of size... but with a twist

2005-01-27 Thread Prof Brian Ripley
On Fri, 28 Jan 2005, James Muller wrote:
Hi,
I have a memory problem, one which I've seen pop up in the list a few times, 
but which seems to be a little different. It is the "Error: cannot allocate 
vector of size x" problem. I'm running R 2.0 on RH9.

My R program is joining big datasets together, so there are lots of duplicate 
cases of data in memory. This (and other tasks) prompted me to... expand... 
my swap partition to 16Gb. I have 0.5Gb of regular, fast DDR. The OS seems to 
be fine accepting the large amount of memory, and I'm not restricting memory 
use or vector size in any way.

R chews up memory up until the 3.5Gb area, then halts. Here's the last bit of 
output:
You have, presumably, a 32-bit computer with a 4GB-per-process memory 
limit.  You have hit it (you get less than 4Gb as the OS services need 
some and there is some fragmentation).  The last failed allocation may be 
small, as you see, if you are allocating lots of smallish pieces.

The only way to overcome that is to use a 64-bit OS and version of R.
What was the `twist' mentioned in the title?  You will find a similar 
overall limit mentioned about weekly on this list if you look in the 
archives.


# join the data together
cdata01.data <- cbind(c.1,c.2,c.3,c.4,c.5,c.6,c.7,c.8,c.9,c.10,c.11,c.12,c.13,c.14,c.15,c.16,c.17,c.18,c.19,c.20,c.21,c.22,c.23,c.24,c.25,c.26,c.27,c.28,c.29,c.30,c.31,c.32,c.33)
Error: cannot allocate vector of size 145 Kb
Execution halted
145--Kb---?? This has me rather lost. Maybe on overflow of some sort?? Maybe 
on OS problem of some sort? I'm scratching here.

Before you question it, there is a legitimate reason for sticking all these 
components in the one data.frame.

One of the problems here is that tinkering is not really feasible. This cbind 
took 1.5 hrs to finally halt.

Any help greatly appreciated,
James
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error: cannot allocate vector of size... but with a twist

2005-01-27 Thread Paul Roebuck
On Fri, 28 Jan 2005, James Muller wrote:

 I have a memory problem, one which I've seen pop up in the list a few
 times, but which seems to be a little different. It is the "Error: cannot
 allocate vector of size x" problem. I'm running R 2.0 on RH9.

 [SNIP]

 R chews up memory up until the 3.5Gb area, then halts. Here's the last
 bit of output:


32-bit addressing goes to ~4Gb.

--
SIGSIG -- signature too long (core dumped)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html