Re: [R] R2WinBUGS error

2006-04-06 Thread Uwe Ligges
Joseph Retzer wrote:
  
  
 Dear R-help,
   I'm using the R2WinBUGS package and getting an error message:
  
  Error in file(file, "r") : unable to open connection
  In addition: Warning message:
  cannot open file 'codaIndex.txt', reason 'No such file or
 directory' 


Looks like there was an error in WinBUGS: R cannot access the file, 
because WinBUGS has not written it before.

You can use
   bugs(..., debug = TRUE)
Then WinBUGS stays open and you can look at its error messages in order
to correct the code/data/inits or whatever.
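
For example, with the call from the post below (debug = TRUE is the only
change):

olog.out <- bugs(data, inits, parameters,
                 model.file = "c:/Documents and Settings/Administrator/My Documents/r_tutorial/ologit2.txt",
                 n.chains = 2, n.iter = 1000,
                 bugs.directory = "c:/Program Files/WinBUGS14/",
                 debug = TRUE)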

Uwe Ligges









 
 I'm using R 2.2.1 and WinBUGS 1.4.1 on a Windows machine (XP).  My R code
 and WinBUGS code are given below. The complete WinBUGS program executes
 correctly in WinBUGS; however, I'm generating some of my inits using WinBUGS,
 so I may be making an error there. On the other hand, the error generated in
 R seems to imply it cannot locate a file. I've checked my paths and they are
 correct. Also, my data is loading correctly.
 Many thanks,
 Joe
  
 #
  
 # R code
 # Runs Bayesian Ordered Logit by calling WinBUGS from R
 # ologit2.txt: WinBUGS commands
  
 library(R2WinBUGS)
  
 setwd("c:/docume~1/admini~1/mydocu~1/r_tuto~1")
 load("oldat.Rdata") # R data file containing data frame ol.dat
 # with vars: q02, bf23f, bf23b, bf22, bf34a, bf34.1,
 bf34.2
  
    q02 <- ol.dat$q02
  bf23f <- ol.dat$bf23f
  bf23b <- ol.dat$bf23b
   bf22 <- ol.dat$bf22
  bf34a <- ol.dat$bf34a
 bf34.1 <- ol.dat$bf34.1
 bf34.2 <- ol.dat$bf34.2
  
 N=nrow(ol.dat) 
 Ncut=5  
  
 data <- list("N", "q02", "bf23f", "bf23b", "bf22", "bf34a", "bf34.1",
 "bf34.2", "Ncut")
  
 inits <- function()
 {
list(k=c(-5, -4, -3, -2, -1), tau=2, theta=rnorm(7, -1, 100))
 }
  
 parameters <- c("k")
  
 olog.out <- bugs(data, inits, parameters,
      model.file = "c:/Documents and Settings/Administrator/My Documents/r_tutorial/ologit2.txt",
      n.chains = 2, n.iter = 1000,
      bugs.directory = "c:/Program Files/WinBUGS14/")
  
 
 # WinBUGS code
  
 model exec;  
 {
  # Priors on regression coefficients
theta[1] ~  dnorm( -1,1.0) ;   theta[2] ~  dnorm(-1,1.0)   
theta[3] ~  dnorm(  1,1.0) ;   theta[4] ~  dnorm(-1,1.0)   
theta[5] ~  dnorm( -1,1.0) ;   theta[6] ~  dnorm( 1,1.0)   
theta[7] ~  dnorm( -1,1.0)  
  
  # Priors on latent variable cutpoints  
k[1] ~ dnorm(0, 0.1)I(, k[2]);  k[2] ~ dnorm(0, 0.1)I(k[1], k[3])
k[3] ~ dnorm(0, 0.1)I(k[2], k[4]);  k[4] ~ dnorm(0, 0.1)I(k[3], k[5])
k[5] ~ dnorm(0, 0.1)I(k[4], )
  
  # Prior on precision
tau ~ dgamma(0.001, 0.001)
  
  # Some defs
sigma <- sqrt(1 / tau); log.sigma <- log(sigma);
  
  for (i in 1 : N) 
{
 # Prior on 
  b[i] ~ dnorm(0.0, tau)
  
 # Model Mean
  mu[i] - theta[1] + theta[2]*bf22[i] + theta[3]*bf23b[i] +
 theta[4]*bf23f[i] + theta[5]*bf34a[i] + theta[6]*bf34.1[i] +
 theta[7]*bf34.2[i]
  
  for (j in 1 : Ncut) 
{ 
   # Logit Model
   # cumulative probability of lower response than j
logit(Q[i, j]) <- -(k[j] + mu[i] + b[i])
}
  
# probability that response = j
  p[i,1] <- max( min(1 - Q[i,1], 1), 0)
  
  for (j in 2 : Ncut) 
  { 
 p[i,j] <- max( min(Q[i,j-1] - Q[i,j], 1), 0)
  }
  
  p[i,(Ncut+1)] <- max( min(Q[i,Ncut], 1), 0)
  q02[i] ~ dcat(p[i, ])
}
 }
 
  
 
 


Re: [R] rounding of voronoi vertices using deldir()

2006-04-06 Thread Prof Brian Ripley
The problem is most likely your use of cat() for output.  Consider

> x <- 8665540.49905558
> cat(x, "\n")
8665540
> cat(as.character(x), "\n")
8665540.49905558
> options(digits=10)
> cat(x, "\n")
8665540.499

So it would be best to do the conversions yourself, and I would
investigate using format() to do so.  But just changing options(digits=) 
may suffice.
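
So, for example (a sketch):

> cat(format(x, digits = 15), "\n")
8665540.49905558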

The 'digits' option in deldir rounds the output, and with the size of 
numbers you have 10dp is at or below representation accuracy.

There are alternative R packages for Voronoi polygons, e.g. tripack.

On Wed, 5 Apr 2006, Mike Leahy wrote:

 Hello list,

 I'm just getting started with using R - I have been trying over the past
 day or so to work out a method for generating voronoi polygons for
 PostGIS using SQL.  I was able to put together a procedure which works
 relatively well, but is somewhat inefficient.  Someone on the PostGIS
 list pointed me to the deldir() function in R, for which I can export a
 text file with x/y coordinates from a PostGIS table, and write an SQL
 script with insert statements containing PostGIS ewkt-compatible
 geometries representing the voronoi polygons generated by deldir().

 The script I'm using is below.  I've added a very small frac parameter
 to ensure none of the points are dropped from my dataset (which was
 actually happening for me with some fairly dispersed clusters of
 points), and a rectangular window that is much larger than the dataset
 itself to ensure that the voronoi polygons extend far enough to actually
 cover all of the points in my dataset.

 The problem I'm having is that the vertices at the corners of these
 polygons seem to have their coordinates rounded to one decimal place.
 Based on the documentation, the 'digits' option should allow me to
 change this, but I'm not getting anything different no matter
 what I set it to.  Is there something obvious that I've missed?  I
 realize that in most practical applications it's not a significant
 issue, but I'd prefer more accuracy as I may be using voronoi polygons
 to evaluate some statistical methods.

 Below are some sample coordinates from my points.txt file, and the
 script I'm running in R to write out the voronoi polygons.  If anyone
 can see what's going on, I'd be happy to hear it, as the R calculations are
 much faster than the method I'm using within PostGIS/SQL:

 points.txt:
 22.462042329 | 8665540.49905558
 270171.836250559 |  8667802.6446983
 268895.572741816 | 8674257.75469324
 270054.378262961 | 8666483.37597101
 268402.641255299 | 8664853.87941629
 265707.056272354 | 8665434.09025432
 269985.118229025 | 8667743.14071004
 269282.034045422 | 8665403.39312076
 


 R
  library(deldir)
  points = scan(file="points.txt", what=list(x=0.0, y=0.0), sep="|")
  voro = deldir(points$x, points$y, digits=10, frac=0.001,
    rw=c(min(points$x)-abs(min(points$x)-max(points$x)),
         max(points$x)+abs(min(points$x)-max(points$x)),
         min(points$y)-abs(min(points$y)-max(points$y)),
         max(points$y)+abs(min(points$y)-max(points$y))))
  # generate voronoi edges
  tiles = tile.list(voro)   # combine edges into polygons
  sink("voronoi.sql")       # redirect output to file
  for (i in 1:length(tiles)) {   # write out polygons
    tile = tiles[[i]]
    cat("insert into mytable (the_geom) values(geomfromtext('POLYGON((")
    for (j in 1:length(tile$x)) {
      cat(tile$x[[j]], ' ', tile$y[[j]], ",")
    }
    cat(tile$x[[1]], ' ', tile$y[[1]])  # close polygon
    cat("))',32718));\n")   # add SRID and newline
  }
  sink()   # output to terminal
 q()
 n



 Regards,
 Mike



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax:  +44 1865 272595



Re: [R] Uneven y-axis scale

2006-04-06 Thread Jim Lemon
Kåre Edvardsen wrote:
 Dear R-gurus!
 
 Is it possible within boxplot to break the y-scale into two (or
 anything)? I'd like to have a normal linear y-range from 0-10 and the
 next tick mark starting at, say 60 and continue to 90. The reason is for
 better visualising  the outliers.
 
Hi Kare,

In the plotrix package, there are two functions, gap.plot and 
gap.barplot, that do something like this. If you think that they are 
sufficiently close to what you want, I could have a try at doing a 
gap.boxplot function.
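
For example (a sketch, with made-up data):

library(plotrix)
y <- c(runif(20, 0, 10), 65, 80, 88)
gap.plot(seq_along(y), y, gap = c(11, 59), xlab = "index", ylab = "y")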

Jim



[R] imaging and contouring

2006-04-06 Thread Nicolas Degallier
Dear R'Helpers and Colleagues,

I have looked in the documentation, asked to some colleagues in the  
lab, but was unable to solve the following difficulty.

I am running R on an iMac G5 with OS 10.4.

The file below (73 rows x 144 cols) shows the values of a climatic
index on the globe on a 2.5° x 2.5° grid (NA = no value):


With image() and map() and running the following function:

dessin_i <- function(nomfichier="ERA40NA.dat", lat=73, long=144)
{ # define the function
library(maps)
a <- read.table(nomfichier) # read the file
z <- array(0, dim=c(long, lat)) # create an array of zeros
z <- t(a) # transpose matrix a into z
z1 <- rbind(z[65:long,], z[1:64,]) # centre the map on Greenwich
nlat <- seq(-90, 90, 2.5) # latitudes / y-values in 2.5° steps
nlong <- seq(-180, 177.5, 2.5) # longitudes / x-values in 2.5° steps
image(nlong, nlat, z1, col=rainbow(100), xlab="longitude",
ylab="latitude") # draw
map(add=TRUE) # overlay the map of continents and countries
}

And then:
 dessin_i()

I got the following figure (anonymous ftp):

ftp://ftp.lodyc.jussieu.fr/ndelod/ND_Fig1.pdf

running the following:

dessin_c <- function(nomfichier="ERA40NA.dat", lat=73, long=144)
{ # define the function
library(maps) # load the maps package
a <- read.table(nomfichier) # read the data
z <- array(0, dim=c(long, lat)) # create an array of zeros
z <- t(a) # transpose matrix a into z
z1 <- rbind(z[65:long,], z[1:64,]) # centre the map on Greenwich
nlat <- seq(-90, 90, 2.5) # latitudes / y-values in 2.5° steps
nlong <- seq(-180, 177.5, 2.5) # longitudes / x-values in 2.5° steps
contour(nlong, nlat, z1, col=rainbow(100), xlab="longitude",
ylab="latitude") # draw
map(add=TRUE) # overlay the map of continents and countries
}

and:
 dessin_c()

I got the following figure (anonymous ftp):

ftp://ftp.lodyc.jussieu.fr/ndelod/ND_Fig2.pdf

Finally, running:

dessin_f <- function(nomfichier="ERA40NA.dat", lat=73, long=144)
{ # define the function
library(maps) # load the maps package
a <- read.table(nomfichier) # read the data
z <- array(0, dim=c(long, lat)) # create an array of zeros
z <- t(a) # transpose matrix a into z
z1 <- rbind(z[65:long,], z[1:64,]) # centre the map on Greenwich
nlat <- seq(-90, 90, 2.5) # latitudes / y-values in 2.5° steps
nlong <- seq(-180, 177.5, 2.5) # longitudes / x-values in 2.5° steps
filled.contour(nlong, nlat, z1, col=rainbow(100), xlab="longitude",
ylab="latitude") # draw
map(add=TRUE) # overlay the map of continents and countries
}

and:
 dessin_f()

I got the following figure (anonymous ftp):

ftp://ftp.lodyc.jussieu.fr/ndelod/ND_Fig3.pdf

It may be similar to what I am looking for, i.e. the first figure
with smoothed contours and a scale of the colours/values, but I have
been unable (or didn't find the right options) to put the map in the
right place.

Is it possible, and how:
1) to use filled.contour() without the scale?
2) to overlay the map so that it lines up with the coloured contours?
3) to obtain the colours of the first figure with the filled.contour
function?

A subsidiary question: how can the level lines be made thicker and/or
drawn in another colour only for specified values of z1?
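
One possible approach for 2) (a sketch): filled.contour() draws the panel
in its own coordinate system, so map() has to be called inside the
plot.axes argument to land in the right place:

filled.contour(nlong, nlat, z1, color.palette = rainbow,
               xlab = "longitude", ylab = "latitude",
               plot.axes = { axis(1); axis(2); map(add = TRUE) })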

Many thanks in advance.

Sincerely,

Nicolas Degallier

UMR 7159 / IRD UR182
Laboratoire d'Océanographie et du Climat, Expérimentation et  
Approches Numériques (LOCEAN)
Tour 45-55, 4e ét., case 100, 4 place Jussieu
75252  Paris Cedex 5  France
tél: (33) 01 44 27 51 57
fax: (33) 01 44 27 38 05

E-mail: [EMAIL PROTECTED]
pdf reprints (anonymous ftp): ftp://ftp.lodyc.jussieu.fr/ndelod



[R] rbind

2006-04-06 Thread Mahdi Osman
Hi list,

I have been trying to pile up data frames with different column names using
rbind. I am getting the following error message. I was wondering if there is
a way to get around this minor problem?

Error in match.names(clabs, names(xi)) : names don't match previous names:
G

Thanks indeed for your information

Regards

Mahdi

-- 
---
Mahdi Osman (PhD)
E-mail: [EMAIL PROTECTED]
---


[R] interpreting anova summary tables - newbie

2006-04-06 Thread Andrew McDonagh
Hello,

Apologies if this is the wrong list, I am a first-time poster here. I 
have an experiment in which an output is measured in response to 42 
different categories.
I am only interested which of the categories is significantly different 
from a reference category.

Here is the summary of the results:

> summary(simple.fit)

Call:
lm(formula = as.numeric(as.vector(TNFa)) ~ Mutant.ID, data = 
imputed.data)

Residuals:
      Min        1Q    Median        3Q       Max
-238.459   -25.261    -0.868    25.660   309.496

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   49.0479    10.5971   4.628 5.08e-06 ***
Mutant.IDB   149.8070    23.1632   6.467 3.09e-10 ***
Mutant.IDC    98.7443    23.1632   4.263 2.55e-05 ***
Mutant.IDD    97.2203    23.1632   4.197 3.37e-05 ***
Mutant.IDE   118.9820    23.1632   5.137 4.49e-07 ***
Mutant.IDF   241.8537    23.1632  10.441  < 2e-16 ***
Mutant.IDG   107.4883    23.1632   4.640 4.80e-06 ***
Mutant.IDH   105.7664    23.1632   4.566 6.74e-06 ***
Mutant.IDI   517.4650    23.1632  22.340  < 2e-16 ***
Mutant.IDJ    19.        23.1632   0.854 0.393735    
Mutant.IDK    47.4240    23.1632   2.047 0.041313 *  
Mutant.IDL     3.2542    23.1632   0.140 0.888347    
Mutant.IDM   180.9638    23.1632   7.813 5.63e-14 ***
Mutant.IDN    19.0582    23.1632   0.823 0.411155    
Mutant.IDO    61.8684    23.1632   2.671 0.007891 ** 
Mutant.IDP    -0.5306    23.1632  -0.023 0.981738    
Mutant.IDQ   -10.6972    23.1632  -0.462 0.644478    
Mutant.IDR     1.5377    23.1632   0.066 0.947107    
Mutant.IDS    14.6333    23.1632   0.632 0.527934    
Mutant.IDT    48.8900    23.1632   2.111 0.035458 *  
Mutant.IDU    58.9597    23.1632   2.545 0.011313 *  
Mutant.IDV    81.7657    23.1632   3.530 0.000467 ***
Mutant.IDW    82.9576    23.1632   3.581 0.000386 ***
Mutant.IDY    49.1926    23.1632   2.124 0.034343 *  
Mutant.IDZ    51.0381    23.1632   2.203 0.028170 *  
Mutant.IDZA  116.0487    23.1632   5.010 8.38e-07 ***
Mutant.IDZB   56.4402    23.1632   2.437 0.015287 *  
Mutant.IDZC  -14.5305    23.1632  -0.627 0.530838    
Mutant.IDZD   -5.0069    23.1632  -0.216 0.828983    
Mutant.IDZE    9.1176    23.1632   0.394 0.694080    
Mutant.IDZF  232.2879    23.1632  10.028  < 2e-16 ***
Mutant.IDZG  -27.1671    23.1632  -1.173 0.241595    
Mutant.IDZH    0.8757    23.1632   0.038 0.969862    
Mutant.IDZI    4.7952    23.1632   0.207 0.836108    
Mutant.IDZJ   -5.5859    23.1632  -0.241 0.809568    
Mutant.IDZK  -12.9263    23.1632  -0.558 0.577138    
Mutant.IDZL   38.8621    23.1632   1.678 0.094224 .  
Mutant.IDZM   39.2643    23.1632   1.695 0.090880 .  
Mutant.IDZN   73.8419    23.1632   3.188 0.001553 ** 
Mutant.IDZO  147.7804    23.1632   6.380 5.20e-10 ***
Mutant.IDZP    0.5654    23.1632   0.024 0.980540    
Mutant.IDZQ   50.5117    23.1632   2.181 0.029824 *  
Mutant.IDZR  217.6824    23.1632   9.398  < 2e-16 ***
Mutant.IDZS  237.3227    23.1632  10.246  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 61.79 on 377 degrees of freedom
Multiple R-Squared: 0.7351, Adjusted R-squared: 0.7049
F-statistic: 24.33 on 43 and 377 DF,  p-value: < 2.2e-16

 

My question relates to the meaning of the p-values. Do the p-values
relate to
a) the confidence in the estimate
or
b) the confidence that the non-intercept categories are different from
the intercept?

Somebody mentioned to me that the p-value for the intercept is the 
confidence in the estimate of the intercept, whereas the remaining 
entries are the confidence in each strain being different from the 
reference / intercept

Note the contrasts setting is contr.treatment.
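
A minimal sketch of that reading, with made-up data (all names hypothetical):

d <- data.frame(y = rnorm(60), g = gl(3, 20, labels = c("A", "B", "C")))
fit <- lm(y ~ g, data = d)
coef(summary(fit))  # "(Intercept)": mean of reference level A, tested against 0;
                    # "gB", "gC": differences from A, each tested against 0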

Any help would be appreciated

Andrew McDonagh,
PhD Candidate,
Department of Infectious Diseases,
Commonwealth Building,
Hammersmith Hospital,
Du Cane Road,
London W12 ONN

[EMAIL PROTECTED]



[R] Import .scv data query

2006-04-06 Thread nicolas . guibert
Hi,

I am having trouble importing data from a csv file in R.

This is my current code:


returns <- read.zoo("empty143.csv", format = "%d/%b/%Y", sep = ",",
header = TRUE, skip = 1)


for importing data of this type:

Request4*,
,fx.TWD.USD.SPOT.soho
01/Jan/1988,0.03502627
04/Jan/1988,0.03502627
05/Jan/1988,0.03502627
06/Jan/1988,0.035038542

the error is:
Error in read.zoo("empty143.csv", format = "%d/%b/%Y", sep = ",", header =
TRUE,  :
index contains NAs

I guess my syntax is just wrong

The next code lines would be:
returns <- mreturns <- window(returns, start = as.Date("01/Jan/1988"))
returns <- window(returns, end = as.Date("08/Jun/1988"))
retdat - as.data.frame(returns)
attach(retdat)
retdat
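
One way to sidestep read.zoo() here (a sketch, assuming the dates sit in
the first column and the two header lines are as shown above):

library(zoo)
raw <- read.csv("empty143.csv", skip = 1, header = TRUE)
returns <- zoo(raw[, -1],
               order.by = as.Date(as.character(raw[, 1]), "%d/%b/%Y"))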

Any help would be more than greatly appreciated.

Cheers,
NIco





Re: [R] Problems in package management after Linux system upgrade

2006-04-06 Thread José Matos
On 05/04/06, Paul Johnson [EMAIL PROTECTED] wrote:

  Hi Paul,

 I upgraded from Fedora Core 4 to Fedora Core 5 and I find a lot of
 previously installed packages won't run because shared libraries or
 other system things have changed out from under the installed R
 libraries.  I do not know for sure if the R version now from
 Fedora-Extras (2.2.1) is exactly the same one I was using in FC4.

  It is. I expect also that as soon as 2.3.0 is released it will
become available for FC-4, Fc-5 and devel.

 I see problems in many packages. Example, Hmisc:
 unable to load shared library 
 '/usr/lib/R/library/Hmisc/libs/Hmisc.so':
   libgfortran.so.0: cannot open shared object file: No such file or directory
 Error in library(Hmisc) : .First.lib failed for 'Hmisc'

  To avoid this problem I am submitting R packages to Extras. I maintain
some and I intend to submit more; I have posted here a Python script
to convert an R package into a src.rpm that follows the Extras packaging
policy.

  All help is welcome. :-)

 --
 Paul E. Johnson
 Professor, Political Science
 1541 Lilac Lane, Room 504
 University of Kansas




--
José Abílio



Re: [R] rbind

2006-04-06 Thread Petr Pikal
Hi

if you are really sure that all columns in all data frames you want
to stack are the same type and in the same position, just make the names
in all data frames equal.

> df1 <- data.frame(rnorm(10), rnorm(10))
> df2 <- data.frame(rnorm(10), rnorm(10))
> names(df2) <- c("a", "b")
> df1
    rnorm.10. rnorm.10..1
1  -0.1645236   0.3981059
2  -0.2533617  -0.6120264
...
> df2
            a            b
1   2.40161776  0.475509529
2  -0.03924000 -0.709946431
...
> rbind(df1, df2)
Error in match.names(clabs, names(xi)) : names don't match previous
names:
        a, b
> names(df1) <- names(df2)
> rbind(df1, df2)
              a            b
1   -0.16452360  0.398105880
2   -0.25336168 -0.612026393
3    0.69696338  0.341119691
4    0.55666320 -1.129363096
5   -0.68875569  1.433023702
...
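
The same fix as a single expression (a sketch):

> dfs <- lapply(list(df1, df2), function(d) { names(d) <- c("a", "b"); d })
> do.call(rbind, dfs)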

HTH
Petr


On 6 Apr 2006 at 9:46, Mahdi Osman wrote:

Date sent:  Thu, 6 Apr 2006 09:46:56 +0200 (MEST)
From:   Mahdi Osman [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] rbind

 Hi list,
 
 I have been trying to pile up data frames with different column names
 using rbind. I am getting the following error message. I was wondering
 if there is a way to get around this minor problem?
 
 Error in match.names(clabs, names(xi)) : names don't match previous
 names:
  G
 
 Thanks indeed for your information
 
 Regards
 
 Mahdi
 
 -- 
 ---
 Mahdi Osman (PhD)
 E-mail: [EMAIL PROTECTED]
 ---
 
 
 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] skipping rows in trellis key

2006-04-06 Thread Sundar Dorai-Raj
Hi, Steve,

I think you can trick it by adding an empty level with a small cex:

keyArgs <- list(points=list(pch=c(NA, rep(17,5), NA), lwd=2,
        col=c(NA, c("red", "chartreuse3", "black", "cyan",
        "blue"), NA)),
        text=list(lab=c("S-R Mapping", "Color", "Shape", "Letter",
        "Compatible", "Incompatible", ""), cex=c(1.2,1,1,1,1,1,.2)),
        text=list(lab=c(expression(R^2), as.character(rep(0.999,5)), ""),
        cex=c(1.2,1,1,1,1,1,.2)))

HTH,

--sundar

Steven Lacey wrote:
 Try this...
 
 xyplot(y~x, data=data.frame(x=1:10, y=1:10))
 keyArgs <- list()
 keyArgs <- list(points=list(pch=c(NA, rep(17,5)), lwd=2,
         col=c(NA, c("red", "chartreuse3", "black", "cyan", "blue"))),
         text=list(lab=c("S-R Mapping", "Color", "Shape", "Letter",
         "Compatible", "Incompatible"), cex=c(1.2,1,1,1,1,1)),
         text=list(lab=c(expression(R^2), as.character(rep(0.999,5))),
         cex=c(1.2,1,1,1,1,1)))
 keyArgs$between <- c(1,0,5)
 keyArgs$columns <- 1
 keyArgs$column.between <- 1
 keyArgs$border <- TRUE
 drawKeyArgs <- list(key=keyArgs, draw=FALSE)
 keyArgs$draw <- FALSE
 key <- do.call(draw.key, drawKeyArgs)
 vp1 <- viewport(x=1, y=1, height=unit(1, "grobheight", key),
         width=unit(1, "grobwidth", key), just=c("right", "top"))
 pushViewport(vp1)
 grid.draw(key)
 popViewport()
 
 Steve
 
 -Original Message-
 From: Deepayan Sarkar [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, April 05, 2006 11:53 PM
 To: Steven Lacey
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] skipping rows in trellis key
 
 
 On 4/5/06, Steven Lacey [EMAIL PROTECTED] wrote:
 
Yes, that works! Thanks!

On a related note... When I draw my key there is not sufficient 
padding between the last line of the key and the border line. For 
instance, the vertical line in p often drops below the border. Is 
there an easy way to add padding between the last row of the key and 
the border?
 
 
 Example please.
 
 Deepayan
 


[R] pelora extension to continuous response problem

2006-04-06 Thread Rossana Dell'Anna
Hi,

 

I'm using Mark Dettling's supclust package. 

I have a typical numeric matrix x of explanatory variables (n cases, p
features, n < p).

I have a numeric vector y of length n, containing a quantitative
response variable.

I am interested in grouping features in a way which is strongly
predictive of response y (like in PLS approach).

In the paper describing Pelora it is said that Pelora can be easily adapted
to this case.

However, it seems to me that no explicit indication is given in the
supclust user manual.

Should I  simply call 

 

aa <- pelora(x_training, y)   and then

predict(aa, x_test, type="probs") ?

 

 

Thank You in advance

rossana




[R] Help on hypothesis testing

2006-04-06 Thread Jinsong Zhao
Hi,

I hope to use R to perform some hypothesis testing, such as t.test() and 
var.test(). 
However, I can't find a function that tests means when the variance is known
(i.e., the u test or z test in some textbooks), nor one that tests the
variance of a normal distribution (i.e. the chi-squared test).

Thanks in advance for any hints.

Best wishes,

Jinsong



[R] polynomial predict with lme

2006-04-06 Thread i.m.s.white
Does lme prediction work correctly with poly() terms?
In the following simulated example, the predictions
are wildly off.

Or am I doing something daft?

Milk yield for five cows is measured weekly for 45 weeks.
Yield is simulated as cubic function of weekno + random
cow effect (on intercept) + residual error. 
I want to recover an estimate of the fixed curve.
###
library(nlme)
set.seed(1)
ncows <- 5; nweeks <- 45; week <- 1:nweeks
mcurve <- 25 + 0.819*week - 0.0588*week^2 + 0.000686*week^3
cow.eff <- rnorm(ncows)
week <- rep(week, ncows)
cow <- gl(ncows, nweeks)
yield <- mcurve + cow.eff[cow] + rnorm(ncows*nweeks)

lmefit <- lme(yield ~ poly(week,3), random = ~1|cow)
summary(lmefit) # seems OK
someweeks <- seq(5, 45, by=5)
new <- data.frame(week=someweeks)
predicts <- predict(lmefit, new, level=0)
print(predicts) # not even close

#plot(week, yield, las=1)
#lines(someweeks, predicts)
###

-- 

*I.White   *
*University of Edinburgh   *
*Ashworth Laboratories, West Mains Road*
*Edinburgh EH9 3JT *
*Fax: 0131 650 6564   Tel: 0131 650 5490   *
*E-mail: [EMAIL PROTECTED]  *



Re: [R] Help on hypothesis testing

2006-04-06 Thread Sean Davis



On 4/6/06 7:17 AM, Jinsong Zhao [EMAIL PROTECTED] wrote:

 Hi,
 
 I hope to use R to perform some hypothesis testing, such as t.test() and
 var.test().
 However, I can't find a function that tests means when the variance is
 known (i.e., the u test or z test in some textbooks), nor one that tests
 the variance of a normal distribution (i.e. the chi-squared test).

See ?pnorm and ?pchisq, if I understand you correctly.
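
The textbook tests are then short functions on top of those (a sketch;
the function names are made up):

z.test <- function(x, mu0, sigma0) {
  # two-sided z test for a mean, with known sd sigma0
  z <- (mean(x) - mu0) / (sigma0 / sqrt(length(x)))
  c(z = z, p.value = 2 * pnorm(-abs(z)))
}
chisq.var.test <- function(x, var0) {
  # two-sided chi-squared test for the variance of a normal sample
  n <- length(x)
  stat <- (n - 1) * var(x) / var0
  c(statistic = stat,
    p.value = 2 * min(pchisq(stat, n - 1),
                      pchisq(stat, n - 1, lower.tail = FALSE)))
}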

Sean



[R] Sort Problem

2006-04-06 Thread Sumanta Basak
Hi All,



I have certain combinations for which I have some value, e.g.



0 1 20
0 2 15
1 1 40
1 2 52

Now I need to sort this list so that I get the combination with the
lowest value. In this case, 15, and the combination will be 0 2.




--
SUMANTA BASAK.



Re: [R] Sort Problem

2006-04-06 Thread Sean Davis



On 4/6/06 8:04 AM, Sumanta Basak [EMAIL PROTECTED] wrote:

 Hi All,
 
 
 
 I have certain combinations for which I have some value, e.g.
 
 0 1 20
 0 2 15
 1 1 40
 1 2 52
 
 Now I need to sort this list so that I get the combination with the
 lowest value. In this case, 15, and the combination will be 0 2.

See ?order.  You can use the ordering that you get from order() to grab the
appropriate entry in your matrix/data.frame.
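
A sketch, assuming the rows sit in a matrix m:

m <- matrix(c(0,1,20, 0,2,15, 1,1,40, 1,2,52), ncol = 3, byrow = TRUE)
m[order(m[, 3]), ]         # all rows, sorted on the value column
m[which.min(m[, 3]), 1:2]  # the combination with the lowest value: 0 2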

Sean



Re: [R] polynomial predict with lme

2006-04-06 Thread Prof Brian Ripley
I don't believe predict.lme implements makepredictcall (which is the magic 
used in normal model-fitting), nor does it use model.frame.default.

So the answer would appear to be that there is no reason to expect it to 
work with poly().

See http://developer.r-project.org/model-fitting-functions.txt
for the standard paradigm.
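
One way to sidestep the issue in this example (a sketch): write the
polynomial out in terms that predict() can rebuild from week alone,

lmefit2 <- lme(yield ~ week + I(week^2) + I(week^3), random = ~1|cow)
predict(lmefit2, data.frame(week = someweeks), level = 0)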

On Thu, 6 Apr 2006, i.m.s.white wrote:

 Does lme prediction work correctly with poly() terms?
 In the following simulated example, the predictions
 are wildly off.

 Or am I doing something daft?

 Milk yield for five cows is measured weekly for 45 weeks.
 Yield is simulated as cubic function of weekno + random
 cow effect (on intercept) + residual error.
 I want to recover an estimate of the fixed curve.
 ###
 library(nlme)
 set.seed(1)
 ncows <- 5; nweeks <- 45; week <- 1:nweeks
 mcurve <- 25 + 0.819*week - 0.0588*week^2 + 0.000686*week^3
 cow.eff <- rnorm(ncows)
 week <- rep(week, ncows)
 cow <- gl(ncows, nweeks)
 yield <- mcurve + cow.eff[cow] + rnorm(ncows*nweeks)

 lmefit <- lme(yield ~ poly(week,3), random = ~1|cow)
 summary(lmefit) # seems OK
 someweeks <- seq(5, 45, by=5)
 new <- data.frame(week=someweeks)
 predicts <- predict(lmefit, new, level=0)
 print(predicts) # not even close

 #plot(week, yield, las=1)
 #lines(someweeks, predicts)
 ###



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax:  +44 1865 272595



[R] weighted kernel density estimate

2006-04-06 Thread May, Roel
Dear R-users,

I intend to do a spatial analysis on the genetic structuring within a
population. For this I had thought to prepare a kernel density estimate
map showing the spatial distribution of individuals, while incorporating
the genetic distances among individuals. I have a dataset of locations
of N unique individuals (XY-coordinates) and an NxN matrix with the
genetic distances given as a fraction between 0 and 1. As far as I
understand the methodology, a kernel density estimate works with the
geographic distance matrix. My idea was to somehow incorporate the
genetic distance matrix (e.g. as an among-individual-based smoothing
factor???) in the estimation. Does anyone know if this is possible? To
me it sounds like a logical inclusion which may be interesting for a wide
variety of topics (i.e., not all individuals are "equal"). I hope
someone knows of a way to proceed. Thanks in advance,
 

Cheers Roel May

 
Roel May
Norwegian Institute for Nature Research (NINA)
Tungasletta 2, NO-7485 Trondheim, Norway
Tlf. +47 73 80 14 65, Mob. +47 95 78 59 95
Email [EMAIL PROTECTED]
Internett www.nina.no, www.jerv.info



Re: [R] Sort Problem

2006-04-06 Thread Liaw, Andy
From: Sean Davis
 
 On 4/6/06 8:04 AM, Sumanta Basak [EMAIL PROTECTED] wrote:
 
  Hi All,
  
  
  
  I have certain combinations for which I have some value, e.g.
  
  
  
  0 1 20
  
  0 2 15
  
  1 1 40
  
  1 2 52
  
  
  
  Now I need to sort this list for which I'll get the combination 
  against the lowest value. In this case, 15, and the 
 combination will 
  be 0 2.
 
 See ?order.  You can use the ordering that you get from 
 order() to grab the appropriate entry in your matrix/data.frame.
 
 Sean

If Sumanta just wants the entry with the minimum, there's no need to sort
the whole thing.  which.min() should work.

Andy



Re: [R] Uneven y-axis scale

2006-04-06 Thread Jian Zhang
It is useful to me. Thank you!
May I ask another question?
I have some data with many points on a smaller scale (e.g. 0-10) and a few
points on a larger scale (e.g. 10-1000). I want a normal linear
y-scale from 0-10 and to display the other points (10-1000) over a
shorter distance.
Can I do it by using the package?
How to do it?
Thanks!

On 4/6/06, Jim Lemon [EMAIL PROTECTED] wrote:

 Kåre Edvardsen wrote:
  Dear R-gurus!
 
  Is it possible within boxplot to break the y-scale into two (or
  anything)? I'd like to have a normal linear y-range from 0-10 and the
  next tick mark starting at, say 60 and continue to 90. The reason is for
  better visualising  the outliers.
 
 Hi Kare,

 In the plotrix package, there are two functions, gap.plot and
 gap.barplot, that do something like this. If you think that they are
 sufficiently close to what you want, I could have a try at doing a
 gap.boxplot function.

 Jim


Re: [R] weighted kernel density estimate

2006-04-06 Thread Liaw, Andy
If you don't insist on kernel smoothing, and are willing to use something
similar, locfit() in the locfit package uses local likelihood to estimate
density and can accept weights.  E.g.,

library(locfit)
plot(locfit(~Petal.Length + Petal.Width, data=iris))
plot(locfit(~Petal.Length + Petal.Width, data=iris,
            weights=rep(1:3, each=50)))

Andy

From: May, Roel
 
 Dear R-users,
 
 I intend to do a spatial analysis on the genetic structuring 
 within a population. For this I had thought to prepare a 
 kernel density estimate map showing the spatial distribution 
 of individuals, while incorporating the genetic distances 
 among individuals. I have a dataset of locations of N unique 
 individuals (XY-coordinates) and an NxN matrix with the 
 genetic distances given as a fraction between 0 and 1. As far 
 as I understand the methodology, a kernel density estimate 
 works with the geographic distance matrix. My idea was to 
 somehow incorporate the genetic distance matrix (e.g. as an 
 among-individual-based smoothing
 factor???) in the estimation. Does anyone know if this is 
 possible? To me it sounds like a logical inclusion which may be 
 interesting for a wide variety of topics (i.e., not all 
 individuals are "equal"). I hope someone knows of a way to 
 proceed. Thanks in advance,
  
 
 Cheers Roel May
 
  
 Roel May
 Norwegian Institute for Nature Research (NINA)
 Tungasletta 2, NO-7485 Trondheim, Norway
 Tlf. +47 73 80 14 65, Mob. +47 95 78 59 95
 Email [EMAIL PROTECTED]
 Internett www.nina.no, www.jerv.info
 


[R] key position in trellis plotting area

2006-04-06 Thread Steven Lacey
Hi, 
 
I want to do the following:
 
1) create a trellis plot with 1 x 1 layout
2) add a key in the upper right hand corner of the plotting region (i.e.,
the panel), but after the initial call to trellis
3) resize the resulting graphics device without changing the relative
position of the key
 
For instance, the code below draws the key relative to the device
window--not the plotting area.
 
xyplot(y~x, data=data.frame(x=1:10, y=1:10),
       par.settings=list(background="white"), col="black")
keyArgs <- list()
keyArgs <- list(points=list(pch=17, lwd=2,
        col=c("transparent", "red", "chartreuse3", "black", "cyan",
        "blue", "transparent")),
        text=list(lab=c("S-R Mapping", "Color", "Shape", "Letter",
        "Compatible", "Incompatible", ""), cex=c(1.2,1,1,1,1,1,0.2)),
        text=list(lab=c(expression(R^2), as.character(rep(0.999,5)), ""),
        cex=c(1.2,1,1,1,1,1,0.2)))
keyArgs$between <- c(1,0,5)
keyArgs$background <- "white"
keyArgs$border <- TRUE
drawKeyArgs <- list(key=keyArgs, draw=FALSE)
keyArgs$draw <- FALSE
key <- do.call(draw.key, drawKeyArgs)
vp1 <- viewport(x=1, y=1, height=unit(1, "grobheight", key),
        width=unit(1, "grobwidth", key), just=c("right", "top"))
pushViewport(vp1)
grid.draw(key)
popViewport()
 
If I embed the call to draw.key into the panel function, where presumably
the plotting area is the current viewport, then it works. 
 
 
panel.test <- function(x, y, ...){
    panel.xyplot(x, y, ...)
    keyArgs <- list()
    keyArgs <- list(points=list(pch=17, lwd=2,
            col=c("transparent", "red", "chartreuse3", "black", "cyan",
            "blue", "transparent")),
            text=list(lab=c("S-R Mapping", "Color", "Shape", "Letter",
            "Compatible", "Incompatible", ""), cex=c(1.2,1,1,1,1,1,0.2)),
            text=list(lab=c(expression(R^2), as.character(rep(0.999,5)), ""),
            cex=c(1.2,1,1,1,1,1,0.2)))
    keyArgs$between <- c(1,0,5)
    keyArgs$background <- "white"
    keyArgs$border <- TRUE
    drawKeyArgs <- list(key=keyArgs, draw=FALSE)
    keyArgs$draw <- FALSE
    key <- do.call(draw.key, drawKeyArgs)
    vp1 <- viewport(x=1, y=1, height=unit(1, "grobheight", key),
            width=unit(1, "grobwidth", key), just=c("right", "top"))
    pushViewport(vp1)
    grid.draw(key)
    popViewport()
}
xyplot(y~x, data=data.frame(x=1:10, y=1:10),
       par.settings=list(background="white"), col="black",
       panel=panel.test)

 
But, I don't want to write panel functions that are specific to the key. I'd
like to draw the plot, access the plotting area viewport, push it, and add
the key. 
 
Something may be possible with baseViewports, but I can't get that to work
properly with a 1 x 1 lattice plot, and the relative position of the key
changes when the device is rescaled. Is there an easy way to do this in the
initial call to xyplot? My read of the documentation is that a key may be
positioned in the npc coordinates of the device region, but not the plotting
region. Is that correct?
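
One way to reach the panel viewport after the plot is drawn (a sketch
using lattice's interaction functions, reusing the key grob built above):

print(xyplot(y~x, data=data.frame(x=1:10, y=1:10)))
trellis.focus("panel", column=1, row=1, highlight=FALSE)
pushViewport(viewport(x=1, y=1,
    height=unit(1, "grobheight", key),
    width=unit(1, "grobwidth", key), just=c("right", "top")))
grid.draw(key)
popViewport()
trellis.unfocus()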
 
Thanks,
Steve



Re: [R] weighted kernel density estimate

2006-04-06 Thread Markus Jantti

On Thu, 2006-04-06 at 14:29 +0200, May, Roel wrote:
 Dear R-users,

 I intend to do a spatial analysis on the genetic structuring within a
 population. For this I had thought to prepare a kernel density estimate
 map showing the spatial distribution of individuals, while incorporating
 the genetic distances among individuals. I have a dataset of locations
 of N unique individuals (XY-coordinates) and an NxN matrix with the
 genetic distances given as a fraction between 0 and 1. As far as I
 understand the methodology, a kernel density estimate works with the
 geographic distance matrix. My idea was to somehow incorporate the
 genetic distance matrix (e.g. as an among-individual-based smoothing
 factor???) in the estimation. Does anyone know if this is possible? To
 me it sounds a logical inclusion which may be interesting for a wide
 variety of topics (i.e., not all individuals are equal). I hope
 someone knows of any way to proceed. Thanks in advance,



Dear Roel -- it is not entirely clear what you wish to achieve. Sampling
weights associated with a unit can be incorporated at least in the
locfit package, which also allows you to do a 2-dimensional density
estimate, but this does not sound like what you are interested in.

From your description, it sounds like you have a data set which consists
of 1/2*N*(N-1) unique pairs of individuals with data

x1 y1 x2 y2 w

where (x1, y1) and (x2, y2) are the (x,y) coordinates and w is the
genetic distance between the two (there are only 1/2*N*(N-1) on the
assumption that the genetic distance for any i,j individuals is
symmetric so w(i,j) = w(j, i) and w(i, i) = 1).

Are you interested in plotting, on a map of (x, y) coordinates, some
measure of how genetically related the population at that coordinate
is with its surroundings? I.e., the value at x1, y1 of w summarised by a
kernel function over all x2, y2?
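
If so, a minimal sketch of such a smoother (all names made up; x, y hold
one coordinate per observation, e.g. pair midpoints, w the genetic
distance, h a bandwidth):

kern.surface <- function(x, y, w, h, grid = 50) {
  gx <- seq(min(x), max(x), length = grid)
  gy <- seq(min(y), max(y), length = grid)
  z <- matrix(NA, grid, grid)
  for (i in 1:grid) for (j in 1:grid) {
    k <- dnorm(x - gx[i], sd = h) * dnorm(y - gy[j], sd = h)  # Gaussian kernel
    z[i, j] <- sum(k * w) / sum(k)   # kernel-weighted mean of w
  }
  list(x = gx, y = gy, z = z)        # ready for image() or contour()
}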

Best wishes,


Markus

 Cheers Roel May


 Roel May
 Norwegian Institute for Nature Research (NINA)
 Tungasletta 2, NO-7485 Trondheim, Norway
 Tlf. +47 73 80 14 65, Mob. +47 95 78 59 95
 Email [EMAIL PROTECTED]
 Internett www.nina.no, www.jerv.info


-- 
Markus Jantti
Abo Akademi University
[EMAIL PROTECTED]
http://www.iki.fi/~mjantti



[R] R CMD check for packages in a bundle

2006-04-06 Thread Robin Hankin
Hi

[MacOsX 10.4.6;  R-2.2.1]

I have a bundle that comprises three packages.  I want to run R CMD check
on each one individually, as it takes a long time to run on all three.
I am having problems.

The bundle as a whole passes R CMD check, but fails when I cd to the
bundle directory and run R CMD check on a package directory.

The whole bundle passes:

octopus:~/scratch% R CMD check ./BACCO
* checking for working latex ... OK
* using log directory '/Users/rksh/scratch/BACCO.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'BACCO/DESCRIPTION' ... OK
* looks like 'BACCO' is a package bundle
* this is bundle 'BACCO' version '1.0-29'
* checking if this is a source bundle ... OK

[snip]

** creating approximator-manual.tex ... OK
** checking approximator-manual.tex ... OK

octopus:~/scratch%

but R CMD check fails on the packages individually:


octopus:~/scratch/BACCO% R CMD check ./calibrator
* checking for working latex ... OK
* using log directory '/Users/rksh/scratch/BACCO/calibrator.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'calibrator/DESCRIPTION' ... OK
* looks like 'calibrator' is a package bundle
* this is bundle 'BACCO' version '1.0-29'
* checking if this is a source bundle ... OK

/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd:
/Users/rksh/scratch/BACCO/calibrator/emulator: No such file or directory
/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd:
/Users/rksh/scratch/BACCO/calibrator/calibrator: No such file or directory
/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd:
/Users/rksh/scratch/BACCO/calibrator/approximator: No such file or directory
ERROR: no 'Package' field in 'DESCRIPTION'
ERROR
Installation failed.
octopus:~/scratch/BACCO%


Both DESCRIPTION and DESCRIPTION.in files seem to be OK:



octopus:~/scratch/BACCO/calibrator% cat DESCRIPTION
Package: calibrator
Description: Performs Bayesian calibration of computer models as per
[snip]
Title: Bayesian calibration of computer models
Bundle: BACCO
Version: 1.0-29
Date: 13 October 2005
Depends: R (>= 2.0.0), mvtnorm, adapt
Contains: emulator calibrator approximator
Title: Bayesian Analysis of Computer Code Output
BundleDescription: Bayesian analysis of computer code software.  The
   bundle contains routines that evaluate the formulae of Kennedy,
   O'Hagan, and Oakley.
License: GPL
Author: Robin K. S. Hankin [EMAIL PROTECTED]
Maintainer: Robin K. S. Hankin [EMAIL PROTECTED]
URL: http://www.noc.soton.ac.uk/JRD/SAT/pers/rksh.html



octopus:~/scratch/BACCO/calibrator% cat DESCRIPTION.in
Package: calibrator
Description: Performs Bayesian calibration of computer models as per
[snip]
Title: Bayesian calibration of computer models



--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



Re: [R] weighted kernel density estimate

2006-04-06 Thread May, Roel
Dear Markus,

I indeed have a data set consisting of 1/2*N*(N-1) unique pairs of
individuals with data

x1 y1 x2 y2 w

I am however not interested in, like you said, the value at x1, y1 of w
summarised by a kernel function  over all x2, y2 (if I understand you
rightly that is...). This sounds like doing a weighted kernel density
estimate as seen from the 'viewpoint' of a certain location/individual.
I know this could be done with sm.density in the sm library.

I am interested in creating a map depicting the total structuring in the
entire population (based on both geographic and genetic distances). This
means that the evaluation/interpolation at each location has to take
into account both the geographic AND the genetic distance matrix (both
with 1/2*N*(N-1) unique combinations).

To clarify myself a bit more, such a map could show for example
differentiation in the population (of wolverines by the way) because of
large geographic distances OR because of large genetic distances.

I have quickly checked locfit but I am not sure if this would work for
me.

Roel

-Original Message-
From: Markus Jantti [mailto:[EMAIL PROTECTED] 
Sent: 6. april 2006 14:56
To: r-help@stat.math.ethz.ch
Cc: May, Roel
Subject: Re: [R] weighted kernel density estimate


On Thu, 2006-04-06 at 14:29 +0200, May, Roel wrote:
 Dear R-users,

 I intend to do a spatial analysis on the genetic structuring within a 
 population. For this I had thought to prepare a kernel density 
 estimate map showing the spatial distribution of individuals, while 
 incorporating the genetic distances among individuals. I have a 
 dataset of locations of N unique individuals (XY-coordinates) and an 
 NxN matrix with the genetic distances given as a fraction between 0 
 and 1. As far as I understand the methodology, a kernel density 
 estimate works with the geographic distance matrix. My idea was to 
 somehow incorporate the genetic distance matrix (e.g. as an 
 among-individual-based smoothing
 factor???) in the estimation. Does anyone know if this is possible? To

 me it sounds like a logical inclusion which may be interesting for a wide 
 variety of topics (i.e., not all individuals are "equal"). I hope 
 someone knows of a way to proceed. Thanks in advance,



Dear Roel -- it is not entirely clear what you wish to achieve. Sampling
weights associated with a unit can be incorporated at least in the
locfit package, which also allows you to do a 2-dimensional density
estimate, but this does not sound like what you are interested in.

From your description, it sounds like you have a data set which consists
of 1/2*N*(N-1) unique pairs of individuals with data

x1 y1 x2 y2 w

where (x1, y1) and (x2, y2) are the (x,y) coordinates and w is the
genetic distance between the two (there are only 1/2*N*(N-1) on the
assumption that the genetic distance for any i,j individuals is
symmetric so w(i,j) = w(j, i) and w(i, i) = 1).

Are you interested in plotting, on a map of (x, y) coordinates, some
measure of how genetically related the population at that coordinate
is with its surroundings? I.e., the value at x1, y1 of w summarised by a
kernel function over all x2, y2?

Best wishes,


Markus

 Cheers Roel May


 Roel May
 Norwegian Institute for Nature Research (NINA) Tungasletta 2, NO-7485 
 Trondheim, Norway Tlf. +47 73 80 14 65, Mob. +47 95 78 59 95 Email 
 [EMAIL PROTECTED] Internett www.nina.no, www.jerv.info


--
Markus Jantti
Abo Akademi University
[EMAIL PROTECTED]
http://www.iki.fi/~mjantti



[R] convert a data frame to matrix - changed column name

2006-04-06 Thread Muhammad Subianto
I have a question which is very easy to solve, but I can't find a solution.
I want to convert a data frame to a matrix. Here is my toy example:

> L3 <- c(1:3)
> L10 <- c(1:6)
> d <- data.frame(cbind(x=c(10,20), y=L10), fac=sample(L3,
+ 6, repl=TRUE))
> d
   x y fac
1 10 1   1
2 20 2   1
3 10 3   1
4 20 4   3
5 10 5   2
6 20 6   2
> is.data.frame(d)
[1] TRUE
> sapply(d, function(x) unlist(x, use.names=FALSE))
      x y fac
[1,] 10 1   1
[2,] 20 2   1
[3,] 10 3   1
[4,] 20 4   3
[5,] 10 5   2
[6,] 20 6   2
> is.matrix(sapply(d, function(x) unlist(x, use.names=FALSE)))
[1] TRUE


Yes, I get a matrix, TRUE. But I need to change the column names to [,1]
[,2] [,3]. I need the result like

     [,1] [,2] [,3]
[1,]   10    1    1
[2,]   20    2    1
[3,]   10    3    1
[4,]   20    4    3
[5,]   10    5    2
[6,]   20    6    2

How can I do that?
Thanks in advance, Muhammad Subianto



Re: [R] convert a data frame to matrix - changed column name

2006-04-06 Thread Robin Hankin
Hi

set the column names to NULL:


  > a <- data.frame(x=1:4, y=4:1)
  > aa <- as.matrix(a)
  > colnames(aa) <- NULL
  > aa
    [,1] [,2]
  1    1    4
  2    2    3
  3    3    2
  4    4    1



best wishes


Robin


On 6 Apr 2006, at 15:16, Muhammad Subianto wrote:

 I have a question, which very easy to solve, but I can't find a  
 solution.
 I want to convert a data frame to matrix. Here my toy example:

 > L3 <- c(1:3)
 > L10 <- c(1:6)
 > d <- data.frame(cbind(x=c(10,20), y=L10), fac=sample(L3,
 + 6, repl=TRUE))
 > d
    x y fac
 1 10 1   1
 2 20 2   1
 3 10 3   1
 4 20 4   3
 5 10 5   2
 6 20 6   2
 > is.data.frame(d)
 [1] TRUE
 > sapply(d, function(x) unlist(x, use.names=FALSE))
       x y fac
 [1,] 10 1   1
 [2,] 20 2   1
 [3,] 10 3   1
 [4,] 20 4   3
 [5,] 10 5   2
 [6,] 20 6   2
 > is.matrix(sapply(d, function(x) unlist(x, use.names=FALSE)))
 [1] TRUE


 Yes, I get a matrix, TRUE. But I need to change the column names to [,1]
 [,2] [,3]. I need the result like

      [,1] [,2] [,3]
 [1,]   10    1    1
 [2,]   20    2    1
 [3,]   10    3    1
 [4,]   20    4    3
 [5,]   10    5    2
 [6,]   20    6    2

 How can I do that?
 Thanks in advance, Muhammad Subianto


--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



Re: [R] convert a data frame to matrix - changed column name

2006-04-06 Thread Dimitris Rizopoulos
try the following:

out <- data.matrix(d)
dimnames(out) <- NULL
out


Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Muhammad Subianto [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Thursday, April 06, 2006 4:16 PM
Subject: [R] convert a data frame to matrix - changed column name


I have a question, which very easy to solve, but I can't find a 
solution.
 I want to convert a data frame to matrix. Here my toy example:

 > L3 <- c(1:3)
 > L10 <- c(1:6)
 > d <- data.frame(cbind(x=c(10,20), y=L10), fac=sample(L3,
 + 6, repl=TRUE))
 > d
    x y fac
 1 10 1   1
 2 20 2   1
 3 10 3   1
 4 20 4   3
 5 10 5   2
 6 20 6   2
 > is.data.frame(d)
 [1] TRUE
 > sapply(d, function(x) unlist(x, use.names=FALSE))
      x y fac
 [1,] 10 1   1
 [2,] 20 2   1
 [3,] 10 3   1
 [4,] 20 4   3
 [5,] 10 5   2
 [6,] 20 6   2
 > is.matrix(sapply(d, function(x) unlist(x, use.names=FALSE)))
 [1] TRUE


 Yes, I get a matrix, TRUE. But I need to change the column names to
 [,1] [,2] [,3]. I need the result like

      [,1] [,2] [,3]
 [1,]   10    1    1
 [2,]   20    2    1
 [3,]   10    3    1
 [4,]   20    4    3
 [5,]   10    5    2
 [6,]   20    6    2

 How can I do that?
 Thanks in advance, Muhammad Subianto



Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm



Re: [R] convert a data frame to matrix - changed column name

2006-04-06 Thread Muhammad Subianto
On this day 06/04/2006 16:22, Robin Hankin wrote:
 Hi

 set the column names to NULL:


  > a <- data.frame(x=1:4, y=4:1)
  > aa <- as.matrix(a)
  > colnames(aa) <- NULL
  > aa

On this day 06/04/2006 16:28, Dimitris Rizopoulos wrote:
 try the following:

 out <- data.matrix(d)
 dimnames(out) <- NULL
 out


Thank you very much for your help.
Best, Muhammad Subianto



On 4/6/06, Muhammad Subianto [EMAIL PROTECTED] wrote:
 I have a question, which very easy to solve, but I can't find a solution.
 I want to convert a data frame to matrix. Here my toy example:

  > L3 <- c(1:3)
  > L10 <- c(1:6)
  > d <- data.frame(cbind(x=c(10,20), y=L10), fac=sample(L3,
  + 6, repl=TRUE))
  > d
     x y fac
  1 10 1   1
  2 20 2   1
  3 10 3   1
  4 20 4   3
  5 10 5   2
  6 20 6   2
  > is.data.frame(d)
  [1] TRUE
  > sapply(d, function(x) unlist(x, use.names=FALSE))
        x y fac
  [1,] 10 1   1
  [2,] 20 2   1
  [3,] 10 3   1
  [4,] 20 4   3
  [5,] 10 5   2
  [6,] 20 6   2
  > is.matrix(sapply(d, function(x) unlist(x, use.names=FALSE)))
  [1] TRUE
 

 Yes, I get a matrix, TRUE. But I need to change the column names to [,1]
 [,2] [,3]. I need the result like

      [,1] [,2] [,3]
 [1,]   10    1    1
 [2,]   20    2    1
 [3,]   10    3    1
 [4,]   20    4    3
 [5,]   10    5    2
 [6,]   20    6    2

 How can I do that?
 Thanks in advance, Muhammad Subianto




[R] New behavior in estimable(), bug or feature?

2006-04-06 Thread CG Pettersson
Hello all!

w2k, R-2.2.1. update.packages done today.

I just started to work with a new dataset, using lme() (library nlme) and
estimable() (library gmodels). I first wanted to establish the fixed
effects for eight fertiliser treatments (variable treat) coded as A to H.

Fitting and reducing a first model for grain yield went smoothly. When I
wanted to look at the fixed effects with estimable() I got an error
message claiming that I was using the wrong variable names, estimable
wanted the variable names in the form usually given by fixed.effects().

I changed the names in my matrix to the requested form, but got the same
error message. Then I changed to an old library with a variety trial
material using a matrix I _know_ worked two months ago, I got the same
type of error message.

What is happening?
Is there a new namecheck in estimable acting a little over-enthusiastic or
what?

Here are some output from R:

> trMat1

A 1 0 0 0 0 0 0 0
B 0 1 0 0 0 0 0 0
C 0 0 1 0 0 0 0 0
D 0 0 0 1 0 0 0 0
E 0 0 0 0 1 0 0 0
F 0 0 0 0 0 1 0 0
G 0 0 0 0 0 0 1 0
H 0 0 0 0 0 0 0 1
> formula(m4.y)
y.dm ~ treat - 1
> fixed.effects(m4.y)
  treatA   treatB   treatC   treatD   treatE   treatF   treatG   treatH
2065.267 4052.033 4571.479 4933.026 4680.980 5021.347 5063.306 5198.153
> estimable(m4.y, trMat1)
Error in FUN(newX[, i], ...) :
Invalid parameter name(s): , , , , , , ,
Valid names are: treatA, treatB, treatC, treatD, treatE, treatF,
treatG, treatH



All the best
/CG

-- 
CG Pettersson, MSci, PhD Stud.
Swedish University of Agricultural Sciences (SLU)
Dep. of Crop Production Ekology. Box 7043.
SE-750 07 Uppsala, Sweden
[EMAIL PROTECTED]



Re: [R] Multivariate linear regression

2006-04-06 Thread bogdan romocea
Apparently you do not understand the point, and seem to (want to) see
patterns all over the place. A good start for the treatment of this
interesting disease is 'Fooled by Randomness' by Nassim Nicholas
Taleb. The main point of the book is that many things may be a lot
more random than one might care to imagine or believe. (Ramsey theory
is misleading and of no help here, given its biased premise that
"complete disorder is impossible" (T. S. Motzkin, Wikipedia).)


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Nagu
 Sent: Wednesday, April 05, 2006 8:09 PM
 To: Berton Gunter
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Multivariate linear regression

 Hi Bert,

 Thank you for your prompt reply.

 I understand your point.

  But randomness is just a matter of scale of the object (Ramsey
  Theory). The X matrix does not explain the complete variation in Y
  due to a large noise in X, or simply the mapping f: X -> Y is many-valued
  (or due to some other finite number of reasons). Theoretically the inverse
  does not exist for many-valued functions. In regression type problems, we
  are evaluating the pseudoinverse of the data space.

 To estimate the inverses of many valued functions, theoretically, we
 may have to use branch cuts method or something called Riemann
 surfaces, they are partition of the domain of connected sheets.

 As I am not a qualified statistician and do not have much experience
 in building statistical models for highly noisy data, I am wondering
 how you dealt with such situations, if any arose, in your work
 experience?

 I will try your idea of feeding some random variables as
 predictors in X.

 Thank you again,
 Nagu

 P.S. Why is it that pattern recognition is all about finding patterns
 that cannot be seen easily, huh?

 On 4/5/06, Berton Gunter [EMAIL PROTECTED] wrote:
  Ummm...

  If y is unrelated to x, then why would one expect any reasonable
  method to show a greater or lesser relationship than any other? It's
  all random. Of course, put enough random regressors into / tune the
  parameters of any regression methodology enough and you'll be able
  to precisely predict the data at hand -- but **only** the data at
  hand. I should note that such work apparently appears frequently in
  various sorts of informatics / data mining / omics / etc. journals
  these days, as various papers demonstrating the irreproducibility of
  numerous purported discoveries have infamously demonstrated. Let us
  not forget Occam!
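
 Bert's point is easy to demonstrate: regress pure noise on enough
 pure-noise predictors and the in-sample fit looks impressive. A
 minimal sketch:

 set.seed(42)
 y <- rnorm(30)                           # response unrelated to anything
 X <- matrix(rnorm(30 * 25), ncol = 25)   # 25 pure-noise predictors
 summary(lm(y ~ X))$r.squared             # large R^2, in-sample only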
 
  Just being cranky ...
 
  -- Bert Gunter
 
 
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED] On Behalf Of Nagu
   Sent: Wednesday, April 05, 2006 3:52 PM
   To: r-help@stat.math.ethz.ch
   Subject: [R] Multivariate linear regression
  
   Hi,
  
   I am working on a multivariate linear regression of the
 form y = Ax.
  
   I am seeing a great dispersion of y w.r.t x. For example, the
   correlations between y and x are very small, even after using some
   typical transformations like log, power.
  
   I tried with simple linear regression, robust regression
 and ace and
   avas package in R (or splus). I didn't see an improvement
 in the fit
   and predictions over simple linear regression. (I also
 tried this with
   transformed variables)
  
   I am sure that some of you came across such data. How did you
   deal with it?
  
   Linear regressions are good for data like y = x + 0.01*Normal(mu, sigma^2),
   i.e. a small noise (data observed in a lab). But linear regressions
   are bad for large noise, like typical market (or survey) data.
  
   Thank you,
   Nagu
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide!
   http://www.R-project.org/posting-guide.html
  
 
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] key position in trellis plotting area

2006-04-06 Thread Steven Lacey
Hi, 
 
After posting this I found a collection of functions designed to interact
with a trellis plot after it has been created. One such function is
trellis.focus, which retrieves the viewport for a specific panel.
 
Calling trellis.focus(name = "panel", row = 1, column = 1, highlight = FALSE)
sets the current viewport to the one for the panel in which I want to
position the key.
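
In outline, assuming a key grob has already been built with
draw.key(..., draw = FALSE) as in the code below, the sequence is roughly:

trellis.focus("panel", row = 1, column = 1, highlight = FALSE)
pushViewport(viewport(x = 1, y = 1,
                      height = unit(1, "grobheight", key),
                      width  = unit(1, "grobwidth",  key),
                      just   = c("right", "top")))
grid.draw(key)
popViewport()
trellis.unfocus()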
 
Steve
 
 
 

-Original Message-
From: Steven Lacey [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 8:55 AM
To: 'r-help@stat.math.ethz.ch'
Subject: key position in trellis plotting area


Hi, 
 
I want to do the following:
 
1) create a trellis plot with 1 x 1 layout
2) add a key in the upper right hand corner of the plotting region (i.e.,
the panel), but after the initial call to trellis
3) resize the resulting graphics device without changing the relative
position of the key
 
For instance, the code below draws the key relative to the device
window--not the plotting area.
 
xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10),
       par.settings = list(background = "white"), col = "black")
keyArgs <- list(points = list(pch = 17, lwd = 2,
                    col = c("transparent", "red", "chartreuse3",
                            "black", "cyan", "blue", "transparent")),
                text = list(lab = c("S-R Mapping", "Color", "Shape",
                                    "Letter", "Compatible",
                                    "Incompatible", ""),
                            cex = c(1.2, 1, 1, 1, 1, 1, 0.2)),
                text = list(lab = c(expression(R^2),
                                    as.character(rep(0.999, 5)), ""),
                            cex = c(1.2, 1, 1, 1, 1, 1, 0.2)))
keyArgs$between <- c(1, 0, 5)
keyArgs$background <- "white"
keyArgs$border <- TRUE
drawKeyArgs <- list(key = keyArgs, draw = FALSE)
key <- do.call("draw.key", drawKeyArgs)
vp1 <- viewport(x = 1, y = 1, height = unit(1, "grobheight", key),
                width = unit(1, "grobwidth", key),
                just = c("right", "top"))
pushViewport(vp1)
grid.draw(key)
popViewport()
 
If I embed the call to draw.key into the panel function, where presumably
the plotting area is the current viewport, then it works. 
 
 
panel.test <- function(x, y, ...) {
    panel.xyplot(x, y, ...)
    keyArgs <- list(points = list(pch = 17, lwd = 2,
                        col = c("transparent", "red", "chartreuse3",
                                "black", "cyan", "blue", "transparent")),
                    text = list(lab = c("S-R Mapping", "Color", "Shape",
                                        "Letter", "Compatible",
                                        "Incompatible", ""),
                                cex = c(1.2, 1, 1, 1, 1, 1, 0.2)),
                    text = list(lab = c(expression(R^2),
                                        as.character(rep(0.999, 5)), ""),
                                cex = c(1.2, 1, 1, 1, 1, 1, 0.2)))
    keyArgs$between <- c(1, 0, 5)
    keyArgs$background <- "white"
    keyArgs$border <- TRUE
    key <- do.call("draw.key", list(key = keyArgs, draw = FALSE))
    vp1 <- viewport(x = 1, y = 1, height = unit(1, "grobheight", key),
                    width = unit(1, "grobwidth", key),
                    just = c("right", "top"))
    pushViewport(vp1)
    grid.draw(key)
    popViewport()
}
xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10),
       par.settings = list(background = "white"), col = "black",
       panel = panel.test)

 
But, I don't want to write panel functions that are specific to the key. I'd
like to draw the plot, access the plotting area viewport, push it, and add
the key. 
 
Something may be possible with baseViewports, but I can't get that to work
properly with a 1 x 1 lattice plot, and the relative position of the key
changes when the device is rescaled. Is there an easy way to do this in the
initial call to xyplot? My read of the documentation is that a key may be
positioned in the npc coordinates of the device region, but not the plotting
region. Is that correct?
 
Thanks,
Steve


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Sorting problem

2006-04-06 Thread Kåre Edvardsen
R-gurus...

I've got a 5-column data frame where I'd like to plot each ID's 'b'
against 'c', with 'b' in ascending order (within the same ID). How do I
sort on 'b' so that the other variables are reordered accordingly?

    ID   a    b    c      d
   101   1  240 26.7  21.85
   101   2  335 21.8  21.85
   101   3 1387 26.6  21.85
   101   4  877 24.1  21.85
   101   5  978 24.3  21.85
   101   7 4962 28.3  21.85
   101   8 1690 26.1  21.85
   101   9 2038 24.4  21.85
   102   1  240 53.4  51.65
   102   2  533 50.4  51.65
   102   3  802 53.3  51.65
   102   4 1155 47.0  51.65
   102   5  953 47.7  51.65
   102   7 5353 47.8  51.65
   .
   .
   .
   115   9  840 34.9  42.01

All the best,
Kare

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] imaging and contouring

2006-04-06 Thread Gregory Snow
The help for filled.contour suggests using the plot.axes argument to annotate a 
plot, try changing the last part of your 3rd example to:

filled.contour(nlong, nlat, z1, col = rainbow(100),
               xlab = "longitude", ylab = "latitude",
               plot.axes = map(add = TRUE))

Or

filled.contour(nlong, nlat, z1, col = rainbow(100),
               xlab = "longitude", ylab = "latitude",
               plot.axes = {box(); axis(1); axis(2); map(add = TRUE)})

Hope this helps

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Nicolas Degallier
 Sent: Tuesday, April 04, 2006 8:20 AM
 To: Ethz. Ch
 Cc: Roger Bivand
 Subject: [R] imaging and contouring
 
 Dear R'Helpers and Colleagues,
 
 I have looked in the documentation, asked to some colleagues 
 in the lab, but was unable to solve the following difficulty.
 
 I am running R on a  iMac  G5 with OS 10.4.
 
 The file below (73 rows x 144 col) shows the values of a climatic
 index on the globe with a grid of 2.5° x 2.5° (NA = no value):
 
  
 With image() and map() and running the following function:
 
 dessin_i <- function(nomfichier = "ERA40NA.dat", lat = 73, long = 144)
 { # define the function
   library(maps)
   a <- read.table(nomfichier)            # read the file
   z <- array(0, dim = c(long, lat))      # create an array of zeros
   z <- t(a)                              # transpose matrix a into z
   z1 <- rbind(z[65:long, ], z[1:64, ])   # centre the map on Greenwich
   nlat <- seq(-90, 90, 2.5)              # latitudes (y axis), 2.5° steps
   nlong <- seq(-180, 177.5, 2.5)         # longitudes (x axis), 2.5° steps
   image(nlong, nlat, z1, col = rainbow(100),
         xlab = "longitude", ylab = "latitude")   # draw
   map(add = TRUE)     # overlay the map of continents and countries
 }
 
 And then:
  dessin_i()
 
 I got the following figure (anonymous ftp):
 
 ftp://ftp.lodyc.jussieu.fr/ndelod/ND_Fig1.pdf
 
 running the following:
 
 dessin_c <- function(nomfichier = "ERA40NA.dat", lat = 73, long = 144)
 { # define the function
   library(maps)                          # load the maps package
   a <- read.table(nomfichier)            # read the data
   z <- array(0, dim = c(long, lat))      # create an array of zeros
   z <- t(a)                              # transpose matrix a into z
   z1 <- rbind(z[65:long, ], z[1:64, ])   # centre the map on Greenwich
   nlat <- seq(-90, 90, 2.5)              # latitudes (y axis), 2.5° steps
   nlong <- seq(-180, 177.5, 2.5)         # longitudes (x axis), 2.5° steps
   contour(nlong, nlat, z1, col = rainbow(100),
           xlab = "longitude", ylab = "latitude")   # draw
   map(add = TRUE)     # overlay the map of continents and countries
 }
 
 and:
  dessin_c()
 
 I got the following figure (anonymous ftp):
 
 ftp://ftp.lodyc.jussieu.fr/ndelod/ND_Fig2.pdf
 
 Finally, running:
 
 dessin_f <- function(nomfichier = "ERA40NA.dat", lat = 73, long = 144)
 { # define the function
   library(maps)                          # load the maps package
   a <- read.table(nomfichier)            # read the data
   z <- array(0, dim = c(long, lat))      # create an array of zeros
   z <- t(a)                              # transpose matrix a into z
   z1 <- rbind(z[65:long, ], z[1:64, ])   # centre the map on Greenwich
   nlat <- seq(-90, 90, 2.5)              # latitudes (y axis), 2.5° steps
   nlong <- seq(-180, 177.5, 2.5)         # longitudes (x axis), 2.5° steps
   filled.contour(nlong, nlat, z1, col = rainbow(100),
                  xlab = "longitude", ylab = "latitude")   # draw
   map(add = TRUE)     # overlay the map of continents and countries
 }
 
 and:
  dessin_f()
 
 I got the following figure (anonymous ftp):
 
 ftp://ftp.lodyc.jussieu.fr/ndelod/ND_Fig3.pdf
 
 It may be similar to what I am looking for, i.e. the first figure
 with smoothed contours and a scale for the colours/values, but I have
 been unable (or didn't find the right options) to put the map in the
 right place.
 
 Is it possible, and how:
 1) to use filled.contour() without the scale?
 2) to overlay the map so that it lines up with the coloured contours?
 3) to obtain the colours of the first figure with the filled.contour
 function?

 A subsidiary question: how can I make the level lines thicker and/or
 a different colour only for specified values of z1?
 
 Many thanks in advance.
 
 Sincerely,
 
 Nicolas Degallier
 
 UMR 7159 / IRD UR182
 Laboratoire d'Océanographie et du Climat, Expérimentation et  
 Approches Numériques (LOCEAN)
 Tour 45-55, 4e ét., case 100, 4 place Jussieu
 75252  Paris Cedex 5  France
 tél: (33) 01 44 27 51 57
 fax: (33) 01 44 27 38 05
 
 E-mail: [EMAIL PROTECTED]
 pdf reprints (anonymous ftp): ftp://ftp.lodyc.jussieu.fr/ndelod
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Uneven y-axis scale

2006-04-06 Thread Clint Bowman
Have you thought about using a log scale?

Clint BowmanINTERNET:   [EMAIL PROTECTED]
Air Quality Modeler INTERNET:   [EMAIL PROTECTED]
Department of Ecology   VOICE:  (360) 407-6815
PO Box 47600FAX:(360) 407-7534
Olympia, WA 98504-7600

On Thu, 6 Apr 2006, Jim Lemon wrote:

 Kåre Edvardsen wrote:
  Dear R-gurus!
 
  Is it possible within boxplot to break the y-scale into two (or
  anything)? I'd like to have a normal linear y-range from 0-10 and the
  next tick mark starting at, say 60 and continue to 90. The reason is for
  better visualising  the outliers.
 
 Hi Kare,

 In the plotrix package, there are two functions, gap.plot and
 gap.barplot, that do something like this. If you think that they are
 sufficiently close to what you want, I could have a try at doing a
 gap.boxplot function.

 Jim

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Help on hypothesis testing

2006-04-06 Thread Gregory Snow
There is a z.test function in the TeachingDemos Package.  Note however
that this function is meant for teaching purposes as a stepping stone
for students to learn the syntax and output for hypothesis test
functions before learning about t tests etc.  The z.test function only
does one sample tests.

If you need more functionality you can look at the code for the function
(as well as the code for functions t.test and var.test) to see how to
write your own function (or extend one of the above).

For simple tests you can compute the test statistic by hand and use the
pnorm and pchisq functions.  If you want the output in the standard
hypothesis test format then you will need to write or extend a function
to do that (it's not hard).
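
For example, a one-sample z test and a chi-squared test for a variance take
only a few lines; a sketch with made-up data and hypothesised values:

x <- rnorm(25, mean = 10, sd = 2)        # example data
mu0 <- 10; sigma <- 2                    # hypothesised mean, known sd
z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
2 * pnorm(-abs(z))                       # two-sided p-value

sigma0 <- 2                              # hypothesised sd
stat <- (length(x) - 1) * var(x) / sigma0^2
2 * min(pchisq(stat, length(x) - 1),
        pchisq(stat, length(x) - 1, lower.tail = FALSE))   # two-sided p-value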



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jinsong Zhao
 Sent: Thursday, April 06, 2006 5:17 AM
 To: R-help
 Subject: [R] Help on hypothesis testing
 
 Hi,
 
 I hope to use R to perform some hypothesis testing, such as
 t.test() and var.test().
 However, I can't find a function that does a test on means with
 known variance (i.e., the u test or z test of some textbooks), or a
 function that does a test on the variance of a normal distribution
 (i.e. the chi-squared test).
 
 Thanks in advance for any hints.
 
 Best wishes,
 
 Jinsong
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Sorting problem

2006-04-06 Thread Berton Gunter
Please read the Help files before posting! At the bottom of ?sort you will
find a link to 'order', which is the answer to your question.
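
For the data posted, that amounts to something like (a sketch, assuming the
data frame is called d):

d.sorted <- d[order(d$ID, d$b), ]   # sort by ID, then by b within ID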

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Kåre Edvardsen
 Sent: Thursday, April 06, 2006 8:01 AM
 To: R-help
 Subject: [R] Sorting problem
 
 R-gurus...
 
 I've got a 5 column dataframe where I'd like to plot each ID's b
 against c with b in ascending order (within the same ID). How do I
 sort b so that the other variables are altered equally?
 
    ID   a    b    c      d
101   1  240 26.7  21.85
101   2  335 21.8  21.85
101   3 1387 26.6  21.85
101   4  877 24.1  21.85
101   5  978 24.3  21.85
101   7 4962 28.3  21.85
101   8 1690 26.1  21.85
101   9 2038 24.4  21.85
102   1  240 53.4  51.65
102   2  533 50.4  51.65
102   3  802 53.3  51.65
102   4 1155 47.0  51.65
102   5  953 47.7  51.65
102   7 5353 47.8  51.65
.
.
.
115   9  840 34.9  42.01
 
 All the best,
 Kare
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R CMD check for packages in a bundle

2006-04-06 Thread Paul Gilbert
I think you have to remove the  DESCRIPTION.in  file when you check the 
package stand alone. As I recall,  if it exists that is the flag to 
indicate a bundle. Also beware that the DESCRIPTION file in the stand 
alone package is not exactly the same as the DESCRIPTION.in file when it 
is part of the bundle.

What you are trying to do does work, I do it all the time, but there are 
a few details to work out. Because of the above differences I actually 
generate the DESCRIPTION* files from my Makefile, differently depending 
on the target.

Paul Gilbert

Robin Hankin wrote:

Hi

[MacOsX 10.4.6;  R-2.2.1]

I have a  bundle that comprises three packages.  I want to run R CMD  
check on
each one individually, as it takes a long time to run  on all three.  
I am
having problems.

The bundle as a whole passes R CMD check, but fails when I cd to the  
bundle
directory and run R CMD check on a  package directory.

The whole bundle passes:

octopus:~/scratch% R CMD check ./BACCO
* checking for working latex ... OK
* using log directory '/Users/rksh/scratch/BACCO.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'BACCO/DESCRIPTION' ... OK
* looks like 'BACCO' is a package bundle
* this is bundle 'BACCO' version '1.0-29'
* checking if this is a source bundle ... OK

[snip]

** creating approximator-manual.tex ... OK
** checking approximator-manual.tex ... OK

octopus:~/scratch%

but R CMD check fails on the packages individually:


octopus:~/scratch/BACCO% R CMD check ./calibrator
* checking for working latex ... OK
* using log directory '/Users/rksh/scratch/BACCO/calibrator.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'calibrator/DESCRIPTION' ... OK
* looks like 'calibrator' is a package bundle
* this is bundle 'BACCO' version '1.0-29'
* checking if this is a source bundle ... OK

/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd: /Users/rksh/scratch/BACCO/calibrator/emulator: No such file or directory
/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd: /Users/rksh/scratch/BACCO/calibrator/calibrator: No such file or directory
/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd: /Users/rksh/scratch/BACCO/calibrator/approximator: No such file or directory
ERROR: no 'Package' field in 'DESCRIPTION'
ERROR
Installation failed.
octopus:~/scratch/BACCO%


Both DESCRIPTION and DESCRIPTION.in files seem to be OK:



octopus:~/scratch/BACCO/calibrator% cat DESCRIPTION
Package: calibrator
Description: Performs Bayesian calibration of computer models as per
[snip]
Title: Bayesian calibration of computer models
Bundle: BACCO
Version: 1.0-29
Date: 13 October 2005
Depends: R (>= 2.0.0), mvtnorm, adapt
Contains: emulator calibrator approximator
Title: Bayesian Analysis of Computer Code Output
BundleDescription: Bayesian analysis of computer code software.  The
   bundle contains routines that evaluate the formulae of Kennedy,
   O'Hagan, and Oakley.
License: GPL
Author: Robin K. S. Hankin [EMAIL PROTECTED]
Maintainer: Robin K. S. Hankin [EMAIL PROTECTED]
URL: http://www.noc.soton.ac.uk/JRD/SAT/pers/rksh.html



octopus:~/scratch/BACCO/calibrator% cat DESCRIPTION.in
Package: calibrator
Description: Performs Bayesian calibration of computer models as per
[snip]
Title: Bayesian calibration of computer models



--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
  




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread r user
Can anyone comment or point me to a discussion of the
pros and cons of robust regressions, vs. a more
manual approach to trimming outliers and/or
normalizing data used in regression analysis?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] interpreting anova summary tables - newbie

2006-04-06 Thread Prof Brian Ripley
On Thu, 6 Apr 2006, Andrew McDonagh wrote:

My question relates to the meaning of the p-values. Do the p-values
relate to
a) the confidence in the estimate
or
b)the confidence that the non-intercept categories are different to the
intercept

Both (given a loose interpretation of your words and some assumptions 
that a newbie is using the default settings of e.g. contrasts).

The estimates are of the difference in intercept from the reference 
category, I guess Mutant.IDA.  The p-values refer to the test that the 
particular (population) difference is zero.

However, the p-values are for a test done in isolation.  Here you are 
doing 41 tests (ignoring the intercept), and I think you should be making 
some allowance for multiple comparisons.  See e.g. the package multcomp.

My preference (and it is widely shared) would be to present confidence 
intervals for the estimated differences you are interested in (which might 
be all the differences from the reference category, or something else 
as, just guessing, you might be interested to know if there is a 
difference between IDG and IDH).
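
A minimal sketch of both suggestions using only base R (multcomp provides
proper simultaneous intervals):

confint(simple.fit)                             # CIs for the differences
pv <- summary(simple.fit)$coefficients[-1, 4]   # non-intercept p-values
p.adjust(pv, method = "holm")                   # simple multiplicity adjustment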


On Thu, 6 Apr 2006, Andrew McDonagh wrote:

 Hello,

 Apologies if this is the wrong list, I am a first-time poster here. I
 have an experiment in which an output is measured in response to 42
 different categories.
 I am only interested which of the categories is significantly different
 from a reference category.

 Here is the summary of the results:

 summary(simple.fit)

 Call:
 lm(formula = as.numeric(as.vector(TNFa)) ~ Mutant.ID, data =
 imputed.data)

 Residuals:
      Min       1Q   Median       3Q      Max
 -238.459  -25.261   -0.868   25.660  309.496

 Coefficients:
             Estimate Std. Error t value Pr(>|t|)
 (Intercept)  49.0479    10.5971   4.628 5.08e-06 ***
 Mutant.IDB  149.8070    23.1632   6.467 3.09e-10 ***
 Mutant.IDC   98.7443    23.1632   4.263 2.55e-05 ***
 Mutant.IDD   97.2203    23.1632   4.197 3.37e-05 ***
 Mutant.IDE  118.9820    23.1632   5.137 4.49e-07 ***
 Mutant.IDF  241.8537    23.1632  10.441  < 2e-16 ***
 Mutant.IDG  107.4883    23.1632   4.640 4.80e-06 ***
 Mutant.IDH  105.7664    23.1632   4.566 6.74e-06 ***
 Mutant.IDI  517.4650    23.1632  22.340  < 2e-16 ***
 Mutant.IDJ   19.        23.1632   0.854 0.393735
 Mutant.IDK   47.4240    23.1632   2.047 0.041313 *
 Mutant.IDL    3.2542    23.1632   0.140 0.888347
 Mutant.IDM  180.9638    23.1632   7.813 5.63e-14 ***
 Mutant.IDN   19.0582    23.1632   0.823 0.411155
 Mutant.IDO   61.8684    23.1632   2.671 0.007891 **
 Mutant.IDP   -0.5306    23.1632  -0.023 0.981738
 Mutant.IDQ  -10.6972    23.1632  -0.462 0.644478
 Mutant.IDR    1.5377    23.1632   0.066 0.947107
 Mutant.IDS   14.6333    23.1632   0.632 0.527934
 Mutant.IDT   48.8900    23.1632   2.111 0.035458 *
 Mutant.IDU   58.9597    23.1632   2.545 0.011313 *
 Mutant.IDV   81.7657    23.1632   3.530 0.000467 ***
 Mutant.IDW   82.9576    23.1632   3.581 0.000386 ***
 Mutant.IDY   49.1926    23.1632   2.124 0.034343 *
 Mutant.IDZ   51.0381    23.1632   2.203 0.028170 *
 Mutant.IDZA 116.0487    23.1632   5.010 8.38e-07 ***
 Mutant.IDZB  56.4402    23.1632   2.437 0.015287 *
 Mutant.IDZC -14.5305    23.1632  -0.627 0.530838
 Mutant.IDZD  -5.0069    23.1632  -0.216 0.828983
 Mutant.IDZE   9.1176    23.1632   0.394 0.694080
 Mutant.IDZF 232.2879    23.1632  10.028  < 2e-16 ***
 Mutant.IDZG -27.1671    23.1632  -1.173 0.241595
 Mutant.IDZH   0.8757    23.1632   0.038 0.969862
 Mutant.IDZI   4.7952    23.1632   0.207 0.836108
 Mutant.IDZJ  -5.5859    23.1632  -0.241 0.809568
 Mutant.IDZK -12.9263    23.1632  -0.558 0.577138
 Mutant.IDZL  38.8621    23.1632   1.678 0.094224 .
 Mutant.IDZM  39.2643    23.1632   1.695 0.090880 .
 Mutant.IDZN  73.8419    23.1632   3.188 0.001553 **
 Mutant.IDZO 147.7804    23.1632   6.380 5.20e-10 ***
 Mutant.IDZP   0.5654    23.1632   0.024 0.980540
 Mutant.IDZQ  50.5117    23.1632   2.181 0.029824 *
 Mutant.IDZR 217.6824    23.1632   9.398  < 2e-16 ***
 Mutant.IDZS 237.3227    23.1632  10.246  < 2e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Residual standard error: 61.79 on 377 degrees of freedom
 Multiple R-Squared: 0.7351, Adjusted R-squared: 0.7049
 F-statistic: 24.33 on 43 and 377 DF,  p-value: < 2.2e-16

 

 My question relates to the meaning of the p-values. Do the p-values
 relate to
 a) the confidence in the estimate
 or
 b)the confidence that the non-intercept categories are different to the
 intercept

 Somebody mentioned to me that the p-value for the intercept is the
 confidence in the estimate of the intercept, whereas the remaining
 entries are the confidence in each strain being different from the
 reference / intercept

 Note the contrasts setting is contr.treatment.

 Any help would be appreciated

 Andrew McDonagh,
 PhD Candidate,
 Department of Infectious Diseases,
 Commonwealth Building,
 Hammersmith Hospital,
 Du Cane Road,
 London W12 ONN

 [EMAIL PROTECTED]

 

[R] recommendation for post-hoc tests after anova

2006-04-06 Thread Colm Connolly
Greetings all,

I've done some ANOVAs and have found significant effects across my
groups, so now I have to do some post-hoc tests to ascertain which of
the groups is driving the significant effect. Usually one would do
something like a Newman-Keuls or Scheffe test to get at this but based
on googling and browsing through the r-help archives R doesn't support
these and they seem to be badly received by the R community generally
(for reasons related to alpha not being properly controlled, if I
understand correctly).

I found the TukeyHSD function but on reading the help page, I'm unsure
whether it's safe for me to use it since the number of subjects in my
groups is not balanced (ctrl=6, short=9, long=9). Would you reckon
that this is 'mildly unbalanced' in the spirit of the help page, or
is my caution warranted?

In the absence of NK or Scheffé tests, could someone recommend an
appropriate test for me to conduct in this instance? From what I've
read in Explaining Psychological Statistics, the REGWQ test seems to
be the way to go. Can R perform this test?

Thanks for your time,

Regards,

--
Dr Colm G. Connolly
School of Psychology and Institute of Neuroscience
The Lloyd Building
University of Dublin
Trinity College, Dublin 2, Éire

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] calculating similarity/distance among hierarchically classified items

2006-04-06 Thread Michael Friendly
This is a question about how to calculate similarities/distances
among items that are classified by hierarchical attributes
for the purpose of visualizing the relations among items by means
of clustering, MDS, self-organizing maps, and so forth.

I have a set of ~260 items that have been classified using two sets of
hierarchically-organized codes on the basis of form and content.  The
data looks like that below, where the last two variables (ITEMFORM and
ITEMCONTENT) are each a ';' separated list of codes assigned to each
item. The items are identified by the KEY variable. (Other fields are
ignored here.)

KEY,YEAR,WHERE,CONTENT,FORM,ITEMFORM,ITEMCONTENT
1782Fourcroy,1782,Eur,Hdem,Stats,F5;F5K;F5N;F5N1,C8;C82
1785Crome,1785,Eur,Pdem,Stats,F5;F5N;F5N1,C7
1786Playfair,1786,Eur,Hdem,Stats,F6;F68;F69;F61;F62,C3;C32;C321;C323
1787Chladni,1787,Eur,Other,Other,F5;F55;FH;FD;FD3,C9;C95
1794Buxton,1794,Eur,Other,Tech,F3;F31;F7;F72;F722,C9;C9A
1795Pouchet,1795,Eur,Math,Stats,F5;F5G;FG;FG7,C2
1796Watt,1796,Eur,Pdem,Tech,FGB,C7;C9;C9A
1798Senefelder,1798,Eur,Other,Tech,FB;F5,C9;C97
 ...

The codes are hierarchical in the sense that, e.g.,
C321 corresponds to the levels in a tree,
Commerce (C3) > Internal (C32) > Labour (C321)
F5G corresponds to
Diagram (F5) > Nomogram (F5G)
so the number of characters in a code is the level in the tree.

There are about 290 distinct codes, with varying frequency of use,
from 1 ..~40, so the data could be rearranged to a 260x290 incidence matrix
of items x codes.  In computing similarities between items, all measures
I know of for binary attribute data treat the attributes as nominal, and
so ignore the hierarchical nature of the codes. 

To take that into account, the 0/1 values could be replaced by the
tree level values (0=NA, 1..5) of the codes in each column.  Then some
measure of similarity could be computed based on the profiles for each
pair of items. 
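
Building that items-by-codes matrix from the ';'-separated fields is
straightforward; a sketch, assuming a data frame dat with columns KEY and
ITEMCONTENT:

codes <- strsplit(as.character(dat$ITEMCONTENT), ";")
all.codes <- sort(unique(unlist(codes)))
inc <- t(sapply(codes, function(x) as.integer(all.codes %in% x)))
dimnames(inc) <- list(dat$KEY, all.codes)
lev <- sweep(inc, 2, nchar(all.codes), "*")   # 0/1 -> tree level (code length)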

But I don't know what measure (Gower's, euclidean, etc.) would be (most,
or arguably) appropriate here. Is this a situation that anyone recognizes?
Or, maybe there is another way to approach this.  I'd appreciate any
suggestions.


-- 
Michael Friendly Email: [EMAIL PROTECTED] 
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Berton Gunter
There is a **Huge** literature on robust regression, including many books
that you can search on at e.g. Amazon. I think it fair to say that we have
known since at least the 1970's that practically any robust downweighting
procedure (see, e.g M-estimation) is preferable (more efficient, better
continuity properties, better estimates) to trimming outliers defined by
arbitrary threshholds. An excellent but now probably dated introductory
discussion can be found in UNDERSTANDING ROBUST AND EXPLORATORY DATA
ANALYSIS edited by Hoaglin, Tukey, Mosteller, et. al.

The rub in all this is that nice small-sample inference results go out the
window, though bootstrapping can help with this. Nevertheless, for a variety
of reasons, my recommendation is simply to **never** use lm and **always**
use rlm (with maybe a few minor caveats). Many would disagree with this,
however.

I don't think normalizing data as it's conventionally used has anything to
do with robust regression, btw.
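
A quick illustration of the difference, as a sketch using MASS::rlm:

library(MASS)
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)
y[20] <- y[20] + 30      # one gross outlier
coef(lm(y ~ x))          # noticeably pulled by the outlier
coef(rlm(y ~ x))         # close to the true intercept 0 and slope 2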

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of r user
 Sent: Thursday, April 06, 2006 8:51 AM
 To: rhelp
 Subject: [R] pros and cons of robust regression? (i.e. rlm vs lm)
 
 Can anyone comment or point me to a discussion of the
 pros and cons of robust regressions, vs. a more
 manual approach to trimming outliers and/or
 normalizing data used in regression analysis?
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R2WinBUGS error Solved, THANK YOU!

2006-04-06 Thread Joseph Retzer
Thanks very much Uwe! There was a problem reading my data which I would not
have discovered without careful examination of the WinBUGS log file. The
program is working now!
Take care,
Joe 

-Original Message-
From: Uwe Ligges [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 1:30 AM
To: Joseph Retzer
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] R2WinBUGS error

Joseph Retzer wrote:
  
  
 Dear R-help,
   I'm using the R2WinBUGS package and getting an error message:
  
  Error in file(file, r) : unable to open connection
  In addition: Warning message:
  cannot open file 'codaIndex.txt', reason 'No such file or 
 directory'


Looks like there was an error in WinBUGS: R cannot access the file, because
WinBUGS has not written it before.

You can use
   bugs(.., debug = TRUE)
Then WinBUGS stays open and you can look at its error messages in order to
correct the code/data/inits or whatever

Uwe Ligges









 
 I'm using R 2.2.1 and WinBUGS 1.4.1 on a windows machine (XP).  My R 
 code and WinBUGS code is given below. The complete WinBUGS program 
 executes correctly in WinBUGS however I'm generating some of my inits 
 using WinBUGS so I may be making an error there. On the other hand, 
 the error generated in R seems to imply it cannot locate a file. I've 
 checked my paths and they are correct. Also, my data is loading correctly.
 Many thanks,
 Joe
  
 #
  
 # R code
 # Runs Bayesian Ordered Logit by calling WinBUGS from R
 # ologit2.txt: WinBUGS commands
  
 library(R2WinBUGS)

 setwd("c:/docume~1/admini~1/mydocu~1/r_tuto~1")
 load("oldat.Rdata")  # R data file containing data frame ol.dat
                      # with vars: q02, bf23f, bf23b, bf22, bf34a,
                      # bf34.1, bf34.2

    q02 <- ol.dat$q02
  bf23f <- ol.dat$bf23f
  bf23b <- ol.dat$bf23b
   bf22 <- ol.dat$bf22
  bf34a <- ol.dat$bf34a
 bf34.1 <- ol.dat$bf34.1
 bf34.2 <- ol.dat$bf34.2

 N = nrow(ol.dat)
 Ncut = 5

 data <- list("N", "q02", "bf23f", "bf23b", "bf22", "bf34a", "bf34.1",
              "bf34.2", "Ncut")

 inits <- function()
 {
    list(k = c(-5, -4, -3, -2, -1), tau = 2, theta = rnorm(7, -1, 100))
 }

 parameters <- c("k")

 olog.out <- bugs(data, inits, parameters,
                  model.file = "c:/Documents and Settings/Administrator/My Documents/r_tutorial/ologit2.txt",
                  n.chains = 2, n.iter = 1000,
                  bugs.directory = "c:/Program Files/WinBUGS14/")
  
 
 # WinBUGS code
  
 model exec;
 {
  # Priors on regression coefficients
    theta[1] ~ dnorm(-1, 1.0) ;   theta[2] ~ dnorm(-1, 1.0)
    theta[3] ~ dnorm( 1, 1.0) ;   theta[4] ~ dnorm(-1, 1.0)
    theta[5] ~ dnorm(-1, 1.0) ;   theta[6] ~ dnorm( 1, 1.0)
    theta[7] ~ dnorm(-1, 1.0)

  # Priors on latent variable cutpoints
    k[1] ~ dnorm(0, 0.1)I(, k[2]);  k[2] ~ dnorm(0, 0.1)I(k[1], k[3])
    k[3] ~ dnorm(0, 0.1)I(k[2], k[4]);  k[4] ~ dnorm(0, 0.1)I(k[3], k[5])
    k[5] ~ dnorm(0, 0.1)I(k[4], )

  # Prior on precision
    tau ~ dgamma(0.001, 0.001)

  # Some defs
    sigma <- sqrt(1 / tau); log.sigma <- log(sigma);

  for (i in 1 : N)
    {
     # Prior on
      b[i] ~ dnorm(0.0, tau)

     # Model Mean
      mu[i] <- theta[1] + theta[2]*bf22[i] + theta[3]*bf23b[i] +
               theta[4]*bf23f[i] + theta[5]*bf34a[i] +
               theta[6]*bf34.1[i] + theta[7]*bf34.2[i]

      for (j in 1 : Ncut)
        {
         # Logit Model
         # cumulative probability of a response lower than j
          logit(Q[i, j]) <- -(k[j] + mu[i] + b[i])
        }

     # probability that response = j
      p[i,1] <- max(min(1 - Q[i,1], 1), 0)

      for (j in 2 : Ncut)
        {
          p[i,j] <- max(min(Q[i,j-1] - Q[i,j], 1), 0)
        }

      p[i,(Ncut+1)] <- max(min(Q[i,Ncut], 1), 0)
      q02[i] ~ dcat(p[i, ])
    }
 }
 
  
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Reshaping genetic data from long to wide

2006-04-06 Thread Farrel Buchinsky
Bottom Line Up Front: How does one reshape genetic data from long to wide?

I currently have a lot of data. About 180 individuals (some
probands/patients, some parents, rare siblings) and SNP data from 6000 loci
on each. The standard formats seem to be something along the lines of Famid,
pid, fatid, motid, affected, sex, locus1Allele1, locus1Allele2,
locus2Allele1, locus2Allele2, etc

In other words one human, one row. If there were multiple loci then the
variables would continue to be heaped up on the right. This kind of
orientation, shall be referred to as wide.

Given how big my dataset is, it is easier to manage the data in the database
in the long format. In this format I have a pedigree table and from it, a
one to many relationship with the SNP data. The SNP table has fields:
uniqueHumanID, Allele1, Allele2, locus

That makes for an incredibly long table.

Data is stored in a Sybase database that I communicate with through ODBC
using Microsoft Access. RODBC package then reads the queries that I have
created in Microsoft Access. The only reason for Microsoft Access is that I
have had well over a decade's worth of experience using it at an
intermediate level.

With the magic of SQL I can mix and match these tables. But creating the
table that is 180 rows long and about 12010 variables wide is daunting.
Essentially the 6000 SNPs mean each human has 12000 repeated
measures (6000 SNPs times 2 alleles).

I presume I would be able to use the reshape function in R:
Reshape Grouped Data
Description
This function reshapes a data frame between 'wide' format with repeated
measurements in separate columns of the same record and 'long' format with
the repeated measurements in separate records. 
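
In the long layout described above, the call would look roughly like this;
a sketch using the hypothetical table name snp:

## snp: one row per (uniqueHumanID, locus), with Allele1 and Allele2
wide <- reshape(snp, idvar = "uniqueHumanID", timevar = "locus",
                v.names = c("Allele1", "Allele2"), direction = "wide")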


BUT BEFORE I launch into this.

Is there a way that either the Warnes package (genetics) or the David
Clayton package can handle the data in the long form?
If not, do any of the packages reshape the data in a way that is pedigree-
and genotype-aware? The general R reshape function is not designed with
genetic data in mind.

Farrel Buchinsky, MD --- Mobile (412) 779-1073
Pediatric Otolaryngologist
Allegheny General Hospital
Pittsburgh, PA 



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] skipping rows in trellis key

2006-04-06 Thread Deepayan Sarkar
On 4/5/06, Steven Lacey [EMAIL PROTECTED] wrote:
 Try this...

 xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10))
 keyArgs <- list(points = list(pch = c(NA, rep(17, 5)), lwd = 2,
                     col = c(NA, "red", "chartreuse3", "black",
                             "cyan", "blue")),
                 text = list(lab = c("S-R Mapping", "Color", "Shape",
                                     "Letter", "Compatible",
                                     "Incompatible"),
                             cex = c(1.2, 1, 1, 1, 1, 1)),
                 text = list(lab = c(expression(R^2),
                                     as.character(rep(0.999, 5))),
                             cex = c(1.2, 1, 1, 1, 1, 1)))
 keyArgs$between <- c(1, 0, 5)
 keyArgs$columns <- 1
 keyArgs$column.between <- 1
 keyArgs$border <- TRUE
 drawKeyArgs <- list(key = keyArgs, draw = FALSE)
 key <- do.call("draw.key", drawKeyArgs)
 vp1 <- viewport(x = 1, y = 1, height = unit(1, "grobheight", key),
                 width = unit(1, "grobwidth", key),
                 just = c("right", "top"))
 pushViewport(vp1)
 grid.draw(key)
 popViewport()

Hmm. I'll add a component called 'padding.text' in R 2.3.0 which can
be used to control the padding around text rows.

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] key position in trellis plotting area

2006-04-06 Thread Deepayan Sarkar
On 4/6/06, Steven Lacey [EMAIL PROTECTED] wrote:
 Hi,

 I want to do the following:

 1) create a trellis plot with 1 x 1 layout
 2) add a key in the upper right hand corner of the plotting region (i.e.,
 the panel), but after the initial call to trellis
 3) resize the resulting device graphics device without changing the relative
 position of the key

 For instance, the code below draws the key relative to the device
 window--not the plotting area.

When I first wrote draw.key, I thought about which choice made more
sense, and decided on the device window (or more accurately the total
graphing area) because it is slightly more general. I eventually
decided that there should be an option to support both, but I haven't
gotten around to implementing that yet. It won't happen in time for R
2.3.0, but perhaps in some later update.

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Uneven y-axis scale

2006-04-06 Thread Jian Zhang
Yeah, I tried this method, but it does not work well for some data.



On 4/6/06, Clint Bowman [EMAIL PROTECTED] wrote:

 Have you thought about using a log scale?

 Clint BowmanINTERNET:   [EMAIL PROTECTED]
 Air Quality Modeler INTERNET:   [EMAIL PROTECTED]
 Department of Ecology   VOICE:  (360) 407-6815
 PO Box 47600FAX:(360) 407-7534
 Olympia, WA 98504-7600

 On Thu, 6 Apr 2006, Jim Lemon wrote:

  Kåre Edvardsen wrote:
   Dear R-gurus!
  
   Is it possible within boxplot to break the y-scale into two (or
   anything)? I'd like to have a normal linear y-range from 0-10 and the
   next tick mark starting at, say 60 and continue to 90. The reason is
 for
   better visualising  the outliers.
  
  Hi Kare,
 
  In the plotrix package, there are two functions, gap.plot amd
  gap.barplot, that do something like this. If you think that they are
  sufficiently close to what you want, I could have a try at doing a
  gap.boxplot function.
 
  Jim
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Liaw, Andy
To add to Bert's comments:

-  Normalizing data (e.g., subtracting the mean and dividing by the SD) can
help the numerical stability of the computation, but that's mostly
unnecessary with modern hardware.  As Bert said, it has nothing to do with
robustness.

-  Instead of _replacing_ lm() with rlm() or another robust procedure, I'd
do both.  Some scientists view robust procedures that omit some data points
(e.g., by assigning essentially 0 weight to them) in automatic fashion and
just trust the result as bad science, and I think they have a point.  Use
of a robust procedure does not free one from examining the data carefully
and looking at diagnostics.  Careful treatment of outliers is especially
important, I think, for data coming from a confirmatory experiment.  If the
conclusion you draw depends on downweighting or omitting certain data
points, you ought to have a very good reason for doing so.  It cannot be
over-emphasized how important it is not to take outlier deletion lightly.
I've seen many cases where what seemed like an outlier originally turned
out to be legitimate data, and omitting it just led to an overly optimistic
assessment of variability.

Andy

From: Berton Gunter
 
 There is a **Huge** literature on robust regression, 
 including many books that you can search on at e.g. Amazon. I 
 think it fair to say that we have known since at least the 
 1970's that practically any robust downweighting procedure 
 (see, e.g M-estimation) is preferable (more efficient, 
 better continuity properties, better estimates) to trimming 
 outliers defined by arbitrary threshholds. An excellent but 
 now probably dated introductory discussion can be found in 
 UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited 
 by Hoaglin, Tukey, Mosteller, et. al.
 
 The rub in all this is that nice small sample inference 
 results go out the window, though bootstrapping can help with 
 this. Nevertheless, for a variety of reasons, my 
 recommendation is simply to **never** use lm and **always** 
 use rlm (with maybe a few minor caveats). Many would disagree 
 with this, however.
 
 I don't think normalizing data as it's conventionally used 
 has anything to do with robust regression, btw.
 
 -- Bert Gunter
 Genentech Non-Clinical Statistics
 South San Francisco, CA
  
 The business of the statistician is to catalyze the 
 scientific learning process.  - George E. P. Box
  
  
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of r user
  Sent: Thursday, April 06, 2006 8:51 AM
  To: rhelp
  Subject: [R] pros and cons of robust regression? (i.e. rlm vs lm)
  
  Can anyone comment or point me to a discussion of the
  pros and cons of robust regressions, vs. a more
  manual approach to trimming outliers and/or
  normalizing data used in regression analysis?
  
  __
  R-help@stat.math.ethz.ch mailing list 
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide!
  http://www.R-project.org/posting-guide.html
 
 
 __
 R-help@stat.math.ethz.ch mailing list 
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Berton Gunter
Thanks, Andy. Well said. Excellent points. The final weights from rlm serve
this diagnostic purpose, of course.
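
That is, after fitting one can simply inspect which points were
downweighted; a sketch:

fit <- rlm(y ~ x)        # MASS
which(fit$w < 0.5)       # observations that were heavily downweighted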

-- Bert
 

 -Original Message-
 From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, April 06, 2006 9:56 AM
 To: 'Berton Gunter'; 'r user'; 'rhelp'
 Subject: RE: [R] pros and cons of robust regression? (i.e. 
 rlm vs lm)
 
 To add to Bert's comments:
 
 -  Normalizing data (e.g., subtracting mean and dividing by 
 SD) can help
 numerical stability of the computation, but that's mostly 
 unnecessary with
 modern hardware.  As Bert said, that has nothing to do with 
 robustness.
 
 -  Instead of _replacing_ lm() with rlm() or other robust 
 procedure, I'd do
 both of them.  Some scientists view robust procedures that 
 omit some data
 points (e.g., by assigning basically 0 weight to them) in 
 automatic fashion
 and just trust the result as bad science, and I think they 
 have a point.
 Use of robust procedure does not free one from examining the 
 data carefully
 and looking at diagnostics.  Careful treatment of outliers is 
 especially
 important, I think, for data coming from a confirmatory 
 experiment.  If the
 conclusion you draw depends on downweighting or omitting certain data
 points, you ought to have very good reason for doing so.  I 
 think it can not
 be over-emphasized how important it is not to take outlier 
 deletion lightly.
 I've seen many cases where what seemed like an outlier originally
 turned out to be legitimate data, and omitting it just led to an
 overly optimistic assessment of variability.
 
 Andy
 
 From: Berton Gunter
  
  There is a **Huge** literature on robust regression, 
  including many books that you can search on at e.g. Amazon. I 
  think it fair to say that we have known since at least the 
  1970's that practically any robust downweighting procedure 
  (see, e.g M-estimation) is preferable (more efficient, 
  better continuity properties, better estimates) to trimming 
  outliers defined by arbitrary threshholds. An excellent but 
  now probably dated introductory discussion can be found in 
  UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited 
  by Hoaglin, Tukey, Mosteller, et. al.
  
  The rub in all this is that nice small sample inference 
  results go out the window, though bootstrapping can help with 
  this. Nevertheless, for a variety of reasons, my 
  recommendation is simply to **never** use lm and **always** 
  use rlm (with maybe a few minor caveats). Many would disagree 
  with this, however.
  
  I don't think normalizing data as it's conventionally used 
  has anything to do with robust regression, btw.
  
  -- Bert Gunter
  Genentech Non-Clinical Statistics
  South San Francisco, CA
   
  The business of the statistician is to catalyze the 
  scientific learning process.  - George E. P. Box
   
   
  
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED] On Behalf Of r user
   Sent: Thursday, April 06, 2006 8:51 AM
   To: rhelp
   Subject: [R] pros and cons of robust regression? (i.e. 
 rlm vs lm)
   
   Can anyone comment or point me to a discussion of the
   pros and cons of robust regressions, vs. a more
   manual approach to trimming outliers and/or
   normalizing data used in regression analysis?
   
   __
   R-help@stat.math.ethz.ch mailing list 
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide!
   http://www.R-project.org/posting-guide.html
  
  
  __
  R-help@stat.math.ethz.ch mailing list 
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
  
  
 
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R CMD check for packages in a bundle

2006-04-06 Thread Prof Brian Ripley

On Thu, 6 Apr 2006, Paul Gilbert wrote:


I think you have to remove the  DESCRIPTION.in  file when you check the
package stand alone. As I recall,  if it exists that is the flag to
indicate a bundle. Also beware that the DESCRIPTION file in the stand
alone package is not exactly the same as the DESCRIPTION.in file when it
is part of the bundle.

What you are trying to do does work, I do it all the time, but there are
a few details to work out. Because of the above differences I actually
generate the DESCRIPTION* files from my Makefile, differently depending
on the target.


I do too.  Having a DESCRIPTION.in is irrelevant: having a Contains: line 
in DESCRIPTION is the test used.


Can we please read the posting guide and not use R-help for

'questions and discussion about code development in R'

which is the description of the R-devel list.



Paul Gilbert

Robin Hankin wrote:


Hi

[MacOsX 10.4.6;  R-2.2.1]

I have a  bundle that comprises three packages.  I want to run R CMD
check on
each one individually, as it takes a long time to run  on all three.
I am
having problems.

The bundle as a whole passes R CMD check, but fails when I cd to the
bundle
directory and run R CMD check on a  package directory.

The whole bundle passes:

octopus:~/scratch% R CMD check ./BACCO
* checking for working latex ... OK
* using log directory '/Users/rksh/scratch/BACCO.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'BACCO/DESCRIPTION' ... OK
* looks like 'BACCO' is a package bundle
* this is bundle 'BACCO' version '1.0-29'
* checking if this is a source bundle ... OK

[snip]

** creating approximator-manual.tex ... OK
** checking approximator-manual.tex ... OK

octopus:~/scratch%

but R CMD check fails on the packages individually:


octopus:~/scratch/BACCO% R CMD check ./calibrator
* checking for working latex ... OK
* using log directory '/Users/rksh/scratch/BACCO/calibrator.Rcheck'
* using R version 2.2.1, 2005-12-20
* checking for file 'calibrator/DESCRIPTION' ... OK
* looks like 'calibrator' is a package bundle
* this is bundle 'BACCO' version '1.0-29'
* checking if this is a source bundle ... OK

/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd: /Users/rksh/scratch/BACCO/calibrator/emulator: No such file or directory
/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd: /Users/rksh/scratch/BACCO/calibrator/calibrator: No such file or directory
/Library/Frameworks/R.framework/Resources/bin/INSTALL: line 1: cd: /Users/rksh/scratch/BACCO/calibrator/approximator: No such file or directory
ERROR: no 'Package' field in 'DESCRIPTION'
ERROR
Installation failed.
octopus:~/scratch/BACCO%


Both DESCRIPTION and DESCRIPTION.in files seem to be OK:



octopus:~/scratch/BACCO/calibrator% cat DESCRIPTION
Package: calibrator
Description: Performs Bayesian calibration of computer models as per
[snip]
Title: Bayesian calibration of computer models
Bundle: BACCO
Version: 1.0-29
Date: 13 October 2005
Depends: R (>= 2.0.0), mvtnorm, adapt
Contains: emulator calibrator approximator
Title: Bayesian Analysis of Computer Code Output
BundleDescription: Bayesian analysis of computer code software.  The
  bundle contains routines that evaluate the formulae of Kennedy,
  O'Hagan, and Oakley.
License: GPL
Author: Robin K. S. Hankin [EMAIL PROTECTED]
Maintainer: Robin K. S. Hankin [EMAIL PROTECTED]
URL: http://www.noc.soton.ac.uk/JRD/SAT/pers/rksh.html



octopus:~/scratch/BACCO/calibrator% cat DESCRIPTION.in
Package: calibrator
Description: Performs Bayesian calibration of computer models as per
[snip]
Title: Bayesian calibration of computer models



--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
 tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html






__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Ruben Roa
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Berton Gunter
Sent: 06 April 2006 14:22
To: 'r user'; 'rhelp'
Subject: Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

There is a **Huge** literature on robust regression, including many books that 
you can search on at e.g. Amazon. I think it fair to say that we have known 
since at least the 1970's that practically any robust downweighting procedure 
(see, e.g M-estimation) is preferable (more efficient, better continuity 
properties, better estimates) to trimming outliers defined by arbitrary 
threshholds. An excellent but now probably dated introductory discussion can be 
found in UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited by 
Hoaglin, Tukey, Mosteller, et. al.


In the mixture-of-distributions approach of ADMB's robust_regression(x,y,a) 
command, there is no need to abandon the likelihood function for a more general 
function. The outliers are assumed to come from another, contaminating 
distribution, with extra parameter a, and then a proper, more complete, 
likelihood function is used. Also it seems that the mixture-of-distributions 
approach is more interpretable, more related to physical mechanisms generating 
departures from the distributional assumptions. In a paper on nonlinear models 
for the growth of certain marine animals where I used ADMB robust regression, I 
argued that the outliers were produced by human errors in the reading of age in 
certain hard structures in the body of the animals. This was consistent with 
the structure of the likelihood which consisted of the mixture of a normal and 
another contaminating distribution with fatter tails, operating mostly at 
higher values of the predictor variable (age).
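
For concreteness, a hedged sketch of that mixture likelihood in plain R
(simulated data; this illustrates the idea only and is not ADMB's
robust_regression()): each residual is N(0, sigma^2) with probability
(1 - a), and comes from a fat-tailed contaminating density, here a Cauchy,
with probability a.

negll <- function(par, x, y) {
  b0 <- par[1]; b1 <- par[2]
  sigma <- exp(par[3])      # log scale keeps sigma positive
  a     <- plogis(par[4])   # mixing proportion constrained to (0, 1)
  r <- y - (b0 + b1 * x)
  -sum(log((1 - a) * dnorm(r, 0, sigma) + a * dcauchy(r, 0, sigma)))
}
set.seed(1)
x <- runif(100, 1, 10)
y <- 1 + 0.8 * x + rnorm(100, sd = 0.3)
bad <- sample(100, 5)
y[bad] <- y[bad] + 8                   # a few gross "age reading" errors
fit <- optim(c(0, 1, 0, -2), negll, x = x, y = y)
round(fit$par[1:2], 2)                 # intercept and slope, little affected
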
Ruben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] recommendation for post-hoc tests after anova

2006-04-06 Thread Liaw, Andy
You might want to take a look at the multcomp package, which does it in a more
modern fashion.  The "tukey" or "dunnett" options there refer to the types
of comparisons (Tukey means all pairwise, Dunnett means all vs. control)
rather than the named procedures.  The package has a vignette that ought to
be helpful to you.
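
For instance, a hedged sketch (with simulated stand-ins for the ctrl/short/long
groups; note this uses the glht()/mcp() interface of later multcomp versions,
while the 2006-era package exposed simint()/simtest() instead):

library(multcomp)
dat <- data.frame(y = rnorm(24),
                  group = factor(rep(c("ctrl", "short", "long"), c(6, 9, 9))))
fit <- aov(y ~ group, data = dat)
summary(glht(fit, linfct = mcp(group = "Tukey")))    # all pairwise
summary(glht(fit, linfct = mcp(group = "Dunnett")))  # all vs. control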

Andy

From: Colm Connolly
 
 Greetings all,
 
 I've done some ANOVAs and have found significant effects 
 across my groups, so now I have to do some post-hoc tests to 
 ascertain which of the groups is driving the significant 
 effect. Usually one would do something like a Newman-Keuls or 
 Scheffe test to get at this, but based on googling and 
 browsing through the r-help archives, R doesn't support these 
 and they seem to be badly received by the R community 
 generally (for reasons related to alpha not being properly 
 controlled, if I understand correctly).
 
 I found the TukeyHSD function but on reading the help page, 
 I'm unsure whether it's safe for me to use it since the 
 number of subjects in my groups is not balanced (ctrl=6, 
 short=9, long=9). Would you reckon that this is mildly 
 unbalanced in the spirit of the help page, or is my 
 caution warranted?
 
 In the absence of NK or Scheffe tests, could someone recommend 
 an appropriate test for me to conduct in this instance? From 
 what I've read in Explaining Psychological Statistics, the 
 REGWQ test seems to be the way to go. Can R perform this test?
 
 Thanks for your time,
 
 Regards,
 
 --
 Dr Colm G. Connolly
 School of Psychology and Institute of Neuroscience
 The Lloyd Building
 University of Dublin
 Trinity College, Dublin 2, Éire
 
 __
 R-help@stat.math.ethz.ch mailing list 
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread roger bos
I'm asking this question purely for my own benefit, not to try to correct
anyone.  The procedure you refer to as normalization I have always heard
referred to as standardization.  Is the former the proper term?  Also, you
say it's not necessary given today's hardware, but isn't it beneficial to get
all the variables in a similar range?  Is there any other transformation that
you would suggest?  I use rlm (and normalization) in the models I run every
day, so I was happy to read the above comments.

Thanks,

Roger



On 4/6/06, Berton Gunter [EMAIL PROTECTED] wrote:

 Thanks, Andy. Well said. Excellent points. The final weights from rlm
 serve
 this diagnostic purpose, of course.

 -- Bert


  -Original Message-
  From: Liaw, Andy [mailto:[EMAIL PROTECTED]
  Sent: Thursday, April 06, 2006 9:56 AM
  To: 'Berton Gunter'; 'r user'; 'rhelp'
  Subject: RE: [R] pros and cons of robust regression? (i.e.
  rlm vs lm)
 
  To add to Bert's comments:
 
  -  Normalizing data (e.g., subtracting mean and dividing by
  SD) can help
  numerical stability of the computation, but that's mostly
  unnecessary with
  modern hardware.  As Bert said, that has nothing to do with
  robustness.
 
  -  Instead of _replacing_ lm() with rlm() or other robust
  procedure, I'd do
  both of them.  Some scientists view robust procedures that
  omit some data
  points (e.g., by assigning basically 0 weight to them) in
  automatic fashion
  and just trust the result as bad science, and I think they
  have a point.
  Use of robust procedure does not free one from examining the
  data carefully
  and looking at diagnostics.  Careful treatment of outliers is
  especially
  important, I think, for data coming from a confirmatory
  experiment.  If the
  conclusion you draw depends on downweighting or omitting certain data
  points, you ought to have very good reason for doing so.  I
  think it can not
  be over-emphasized how important it is not to take outlier
  deletion lightly.
  I've seen many cases where what seemed like outliers originally
  turned out to
  be legitimate data, and omitting them just led to overly
  optimistic
  assessment of variability.
 
  Andy
 
  From: Berton Gunter
  
   There is a **Huge** literature on robust regression,
   including many books that you can search on at e.g. Amazon. I
   think it fair to say that we have known since at least the
   1970's that practically any robust downweighting procedure
   (see, e.g., M-estimation) is preferable (more efficient,
   better continuity properties, better estimates) to trimming
   outliers defined by arbitrary thresholds. An excellent but
   now probably dated introductory discussion can be found in
   UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited
   by Hoaglin, Tukey, Mosteller, et al.
  
   The rub in all this is that nice small sample inference
   results go out the window, though bootstrapping can help with
   this. Nevertheless, for a variety of reasons, my
   recommendation is simply to **never** use lm and **always**
   use rlm (with maybe a few minor caveats). Many would disagree
   with this, however.
  
   I don't think normalizing data as it's conventionally used
   has anything to do with robust regression, btw.
  
   -- Bert Gunter
   Genentech Non-Clinical Statistics
   South San Francisco, CA
  
   The business of the statistician is to catalyze the
   scientific learning process.  - George E. P. Box
  
  
  
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of r user
Sent: Thursday, April 06, 2006 8:51 AM
To: rhelp
Subject: [R] pros and cons of robust regression? (i.e.
  rlm vs lm)
   
Can anyone comment or point me to a discussion of the
pros and cons of robust regressions, vs. a more
manual approach to trimming outliers and/or
normalizing data used in regression analysis?
   
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
   
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide!
   http://www.R-project.org/posting-guide.html
  
  
 
 

Re: [R] skipping rows in trellis key

2006-04-06 Thread Steven Lacey
Deepayan, 

I also noticed that the width of the key object is off if the title of the
key is bigger than the columns. For example, 

xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10))
keyArgs <- list()
keyArgs <- list(points = list(pch = c(NA, rep(17, 5)), lwd = 2,
        col = c(NA, c("red", "chartreuse3", "black", "cyan", "blue"))),
    text = list(lab = c("S-R Mapping",
        "Color", "Shape", "Letter", "Compatible", "Incompatible"),
        cex = c(1.2, 1, 1, 1, 1, 1)),
    text = list(lab = c(expression(R^2), as.character(rep(0.999, 5))),
        cex = c(1.2, 1, 1, 1, 1, 1)))
keyArgs$title <- "try this very, very, very long title"
keyArgs$cex.title <- 1.4
keyArgs$between <- c(1, 0, 5)
keyArgs$columns <- 1
keyArgs$column.between <- 1
keyArgs$border <- TRUE
drawKeyArgs <- list(key = keyArgs, draw = FALSE)
keyArgs$draw <- FALSE
key <- do.call(draw.key, drawKeyArgs)
vp1 <- viewport(x = 1, y = 1, height = unit(1, "grobheight", key),
        width = unit(1, "grobwidth", key), just = c("right", "top"))
pushViewport(vp1)
grid.draw(key)
popViewport()

What is a good workaround?

Thanks, 
Steve


-Original Message-
From: Deepayan Sarkar [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 12:32 PM
To: Steven Lacey
Cc: r-help@stat.math.ethz.ch
Subject: Re: skipping rows in trellis key


On 4/5/06, Steven Lacey [EMAIL PROTECTED] wrote:
 Try this...

 xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10))
 keyArgs <- list()
 keyArgs <- list(points = list(pch = c(NA, rep(17, 5)), lwd = 2,
         col = c(NA, c("red", "chartreuse3", "black", "cyan", "blue"))),
     text = list(lab = c("S-R Mapping",
         "Color", "Shape", "Letter", "Compatible", "Incompatible"),
         cex = c(1.2, 1, 1, 1, 1, 1)),
     text = list(lab = c(expression(R^2), as.character(rep(0.999, 5))),
         cex = c(1.2, 1, 1, 1, 1, 1)))
 keyArgs$between <- c(1, 0, 5)
 keyArgs$columns <- 1
 keyArgs$column.between <- 1
 keyArgs$border <- TRUE
 drawKeyArgs <- list(key = keyArgs, draw = FALSE)
 keyArgs$draw <- FALSE
 key <- do.call(draw.key, drawKeyArgs)
 vp1 <- viewport(x = 1, y = 1, height = unit(1, "grobheight", key),
         width = unit(1, "grobwidth", key), just = c("right", "top"))
 pushViewport(vp1)
 grid.draw(key)
 popViewport()

Hmm. I'll add a component called 'padding.text' in R 2.3.0 which can be used
to control the padding around text rows.

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Spencer Graves
  A great example of the hazards of automatic outlier rejection is the 
story of how the hole in the ozone layer in the southern hemisphere was 
discovered.  Outliers were dutifully entered into the data base but 
discounted as probable metrology problems, which also plagued the 
investigation.  As the percentage of outliers became excessive, 
investigators untimately became convinced that many of the outliers 
were not metrology problems but real physical problems.

  For a recent discussion of this, see Maureen Christie (2004) "11. 
Data Collection and the Ozone Hole:  Too much of a good thing?" 
Proceedings of the International Commission on History of Meteorology 
1.1, pp. 99-105 
(www.meteohistory.org/2004proceedings1.1/pdfs/11christie.pdf).

  spencer graves
p.s.  I understand that Australia now has one of the world's highest 
rates of skin cancer, which has contributed to a major change in outdoor 
styles of dress there.

Berton Gunter wrote:

 Thanks, Andy. Well said. Excellent points. The final weights from rlm serve
 this diagnostic purpose, of course.
 
 -- Bert
  
 
 
-Original Message-
From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 9:56 AM
To: 'Berton Gunter'; 'r user'; 'rhelp'
Subject: RE: [R] pros and cons of robust regression? (i.e. 
rlm vs lm)

To add to Bert's comments:

-  Normalizing data (e.g., subtracting mean and dividing by 
SD) can help
numerical stability of the computation, but that's mostly 
unnecessary with
modern hardware.  As Bert said, that has nothing to do with 
robustness.

-  Instead of _replacing_ lm() with rlm() or other robust 
procedure, I'd do
both of them.  Some scientists view robust procedures that 
omit some data
points (e.g., by assigning basically 0 weight to them) in 
automatic fashion
and just trust the result as bad science, and I think they 
have a point.
Use of robust procedure does not free one from examining the 
data carefully
and looking at diagnostics.  Careful treatment of outliers is 
especially
important, I think, for data coming from a confirmatory 
experiment.  If the
conclusion you draw depends on downweighting or omitting certain data
points, you ought to have very good reason for doing so.  I 
think it can not
be over-emphasized how important it is not to take outlier 
deletion lightly.
I've seen many cases where what seemed like outliers originally 
turned out to
be legitimate data, and omitting them just led to overly 
optimistic
assessment of variability.

Andy

From: Berton Gunter

There is a **Huge** literature on robust regression, 
including many books that you can search on at e.g. Amazon. I 
think it fair to say that we have known since at least the 
1970's that practically any robust downweighting procedure 
(see, e.g., M-estimation) is preferable (more efficient, 
better continuity properties, better estimates) to trimming 
outliers defined by arbitrary thresholds. An excellent but 
now probably dated introductory discussion can be found in 
UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited 
by Hoaglin, Tukey, Mosteller, et al.

The rub in all this is that nice small sample inference 
results go out the window, though bootstrapping can help with 
this. Nevertheless, for a variety of reasons, my 
recommendation is simply to **never** use lm and **always** 
use rlm (with maybe a few minor caveats). Many would disagree 
with this, however.

I don't think normalizing data as it's conventionally used 
has anything to do with robust regression, btw.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the 
scientific learning process.  - George E. P. Box
 
 


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of r user
Sent: Thursday, April 06, 2006 8:51 AM
To: rhelp
Subject: [R] pros and cons of robust regression? (i.e. 

rlm vs lm)

Can anyone comment or point me to a discussion of the
pros and cons of robust regressions, vs. a more
manual approach to trimming outliers and/or
normalizing data used in regression analysis?

__
R-help@stat.math.ethz.ch mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html





Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Berton Gunter
Spencer:

Your comment reinforces Andy's point, which is that purported outliers must
not be ignored but need to be clearly identified and examined. For reasons
that you well understand, robust regression methods are better for this in
the linear models context than standard least squares. However, as I
understand it, the problem in the case that you describe is **not** that the
outliers weren't identified and examined, but that they were and were
dismissed as metrological anomalies. I would say this was an error of
scientific (mis)judgment, not data-analytical methodology.
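
A hedged sketch of what "identified and examined" can look like in practice
(simulated data): rlm's final IWLS weights flag exactly which points were
downweighted, so they can be inspected rather than silently dropped.

library(MASS)
set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30, sd = 0.5)
y[30] <- 40                            # one gross outlier
fit.ls  <- lm(y ~ x)
fit.rob <- rlm(y ~ x)                  # M-estimation, Huber psi by default
rbind(ls = coef(fit.ls), robust = coef(fit.rob))
which(fit.rob$w < 0.5)                 # flags observation 30 for inspection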

Anyway, this has taken us too far afield, so no more on-list comments from
me.

-- Bert 
 

 -Original Message-
 From: Spencer Graves [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, April 06, 2006 10:30 AM
 To: Berton Gunter
 Cc: 'Liaw, Andy'; 'r user'; 'rhelp'
 Subject: Re: [R] pros and cons of robust regression? (i.e. 
 rlm vs lm)
 
 A great example of the hazards of automatic outlier 
 rejection is the 
 story of how the hole in the ozone layer in the southern 
 hemisphere was 
 discovered.  Outliers were dutifully entered into the data base but 
 discounted as probable metrology problems, which also plagued the 
 investigation.  As the percentage of outliers became excessive, 
 investigators ultimately became convinced that many of the outliers 
 were not metrology problems but real physical problems.
 
 For a recent discussion of this, see Maureen Christie 
 (2004) "11. 
 Data Collection and the Ozone Hole:  Too much of a good thing?" 
 Proceedings of the International Commission on History of Meteorology 
 1.1, pp. 99-105 
 (www.meteohistory.org/2004proceedings1.1/pdfs/11christie.pdf).
 
 spencer graves
 p.s.  I understand that Australia now has one of the world's highest 
 rates of skin cancer, which has contributed to a major change 
 in outdoor 
 styles of dress there.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] difference between two density plots

2006-04-06 Thread Philipp H. Mohr
Hello,

I have two densities which I plot with:

sm.density(gps)
sm.density(gps2)

Both data sets are 2D and have the same scale.

Is there a way of plotting the difference between the two density plots ?

Thank you very much,
Phil

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] skipping rows in trellis key

2006-04-06 Thread Deepayan Sarkar
On 4/6/06, Steven Lacey [EMAIL PROTECTED] wrote:
 Deepayan,

 I also noticed that the width of the key object is off if the title of the
 key is bigger than the columns. For example,

 xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10))
 keyArgs <- list()
 keyArgs <- list(points = list(pch = c(NA, rep(17, 5)), lwd = 2,
         col = c(NA, c("red", "chartreuse3", "black", "cyan", "blue"))),
     text = list(lab = c("S-R Mapping",
         "Color", "Shape", "Letter", "Compatible", "Incompatible"),
         cex = c(1.2, 1, 1, 1, 1, 1)),
     text = list(lab = c(expression(R^2), as.character(rep(0.999, 5))),
         cex = c(1.2, 1, 1, 1, 1, 1)))
 keyArgs$title <- "try this very, very, very long title"
 keyArgs$cex.title <- 1.4
 keyArgs$between <- c(1, 0, 5)
 keyArgs$columns <- 1
 keyArgs$column.between <- 1
 keyArgs$border <- TRUE
 drawKeyArgs <- list(key = keyArgs, draw = FALSE)
 keyArgs$draw <- FALSE
 key <- do.call(draw.key, drawKeyArgs)
 vp1 <- viewport(x = 1, y = 1, height = unit(1, "grobheight", key),
         width = unit(1, "grobwidth", key), just = c("right", "top"))
 pushViewport(vp1)
 grid.draw(key)
 popViewport()

 What is a good workaround?

None I can think of (and it's not easy to fix given the current
implementation). Bad workarounds are:

(1) use space="top" and border=FALSE
(2) Shorten the title by embedding newlines or shrinking the text size
(3) add dummy columns (say rectangles) with transparent colors
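
For instance, workaround (2) is just (a hedged tweak of the keyArgs list
quoted above):

 keyArgs$title <- "try this very, very,\nvery long title"  # embedded newline
 keyArgs$cex.title <- 1.1                                  # and/or smaller text
 key <- do.call(draw.key, list(key = keyArgs, draw = FALSE))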

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Creating tables with std errors beneath estimates

2006-04-06 Thread Brian Quinif
I have recently become familiar with the latex() function thanks to
the help of many people on this list.  However, now I need to figure
out how to make the right dataframe for the tables I want to make.

What I need to do is make tables with the std. errors beneath the
estimates in parentheses (standard econ style).  How can I make a
dataframe with that format?

Thanks,

Brian

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] difference between two density plots

2006-04-06 Thread Duncan Murdoch
On 4/6/2006 1:34 PM, Philipp H. Mohr wrote:
 Hello,
 
 I have two densities which I plot with:
 
 sm.density(gps)
 sm.density(gps2)
 
 Both data sets are 2D and have the same scale.
 
 Is there a way of plotting the difference between the two density plots ?

?sm.density describes the value returned by those function calls, which 
includes the values of the density estimate at the evaluation points. 
   The evaluation points may be different between the two datasets, but 
you could use approxfun to generate functions doing linear interpolation 
for each of them, and then evaluate the density difference wherever you 
like.
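
A hedged alternative sketch using MASS::kde2d (a different kernel estimator
than sm.density), chosen because its evaluation grid is explicit and easy to
make identical for both data sets; it assumes gps and gps2 are two-column
matrices on a common scale:

library(MASS)
lims <- c(range(gps[, 1], gps2[, 1]), range(gps[, 2], gps2[, 2]))
d1 <- kde2d(gps[, 1], gps[, 2], n = 50, lims = lims)
d2 <- kde2d(gps2[, 1], gps2[, 2], n = 50, lims = lims)
dz <- d1$z - d2$z                     # difference of the two density surfaces
image(d1$x, d1$y, dz)
contour(d1$x, d1$y, dz, add = TRUE)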

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread Liaw, Andy
Some people use the two terms sort of interchangeably.
 
Beneficial in what sense?  If there are no polynomial terms involving a
variable, any linear transformation of a variable (by itself) does not
change the quality of the fit at all.  The scaling gets reflected in the
coefficient and its SE, but t-statistics and p-value stay the same, as well
as predictions, R-squared, etc.  Even when there's a polynomial term,
R-squared and predictions do not change.
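
A hedged check of this on simulated data: rescaling a predictor changes the
coefficient and its SE, but not the t-statistic, p-value, R-squared, or fits.

set.seed(42)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
f1 <- lm(y ~ x)
f2 <- lm(y ~ I(x / 10))                    # linear rescaling of x
all.equal(fitted(f1), fitted(f2))          # TRUE: identical predictions
summary(f1)$coefficients                   # slope near 2
summary(f2)$coefficients                   # slope and SE 10x; same t and p
all.equal(summary(f1)$r.squared, summary(f2)$r.squared)  # TRUE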
 
HTH,
Andy

-Original Message-
From: roger bos [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 1:15 PM
To: Berton Gunter; Liaw, Andy
Cc: rhelp
Subject: Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)


I'm asking this question purely for my own benefit, not to try to correct
anyone.  The procedure you refer to as normalization I have always heard
referred to as standardization.  Is the former the proper term?  Also, you
 say it's not necessary given today's hardware, but isn't it beneficial to get
 all the variables in a similar range?  Is there any other transformation that
 you would suggest?  I use rlm (and normalization) in the models I run every
day, so I was happy to read the above comments. 
 
Thanks,
 
Roger


 
On 4/6/06, Berton Gunter [EMAIL PROTECTED] wrote: 

Thanks, Andy. Well said. Excellent points. The final weights from rlm serve
this diagnostic purpose, of course. 

-- Bert


 -Original Message-
 From: Liaw, Andy [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, April 06, 2006 9:56 AM
 To: 'Berton Gunter'; 'r user'; 'rhelp' 
 Subject: RE: [R] pros and cons of robust regression? (i.e.
 rlm vs lm)

 To add to Bert's comments:

 -  Normalizing data (e.g., subtracting mean and dividing by 
 SD) can help
 numerical stability of the computation, but that's mostly
 unnecessary with
 modern hardware.  As Bert said, that has nothing to do with
 robustness.

 -  Instead of _replacing_ lm() with rlm() or other robust 
 procedure, I'd do
 both of them.  Some scientists view robust procedures that
 omit some data
 points (e.g., by assigning basically 0 weight to them) in
 automatic fashion
 and just trust the result as bad science, and I think they 
 have a point.
 Use of robust procedure does not free one from examining the
 data carefully
 and looking at diagnostics.  Careful treatment of outliers is
 especially
 important, I think, for data coming from a confirmatory 
 experiment.  If the
 conclusion you draw depends on downweighting or omitting certain data
 points, you ought to have very good reason for doing so.  I
 think it can not
 be over-emphasized how important it is not to take outlier 
 deletion lightly.
 I've seen many cases where what seemed like outliers originally
 turned out to
 be legitimate data, and omitting them just led to overly
 optimistic
 assessment of variability. 

 Andy

 From: Berton Gunter
 
  There is a **Huge** literature on robust regression,
  including many books that you can search on at e.g. Amazon. I
  think it fair to say that we have known since at least the 
  1970's that practically any robust downweighting procedure
  (see, e.g., M-estimation) is preferable (more efficient,
  better continuity properties, better estimates) to trimming 
  outliers defined by arbitrary thresholds. An excellent but
  now probably dated introductory discussion can be found in
  UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited 
  by Hoaglin, Tukey, Mosteller, et al.
 
  The rub in all this is that nice small sample inference
  results go out the window, though bootstrapping can help with
  this. Nevertheless, for a variety of reasons, my 
  recommendation is simply to **never** use lm and **always**
  use rlm (with maybe a few minor caveats). Many would disagree
  with this, however.
 
  I don't think normalizing data as it's conventionally used 
  has anything to do with robust regression, btw.
 
  -- Bert Gunter
  Genentech Non-Clinical Statistics
  South San Francisco, CA
 
  The business of the statistician is to catalyze the 
  scientific learning process.  - George E. P. Box
 
 
 
   -Original Message-
   From: [EMAIL PROTECTED]
   [mailto:[EMAIL PROTECTED] On Behalf Of r user
   Sent: Thursday, April 06, 2006 8:51 AM 
   To: rhelp
   Subject: [R] pros and cons of robust regression? (i.e.
 rlm vs lm)
  
   Can anyone comment or point me to a discussion of the 
   pros and cons of robust regressions, vs. a more
   manual approach to trimming outliers and/or
   normalizing data used in regression analysis?
   
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide!
   http://www.R-project.org/posting-guide.html
  
 
  

Re: [R] Mpirun with R CMD scripts

2006-04-06 Thread Martin Morgan
Your description was a bit hard for me to follow. I think what you're
trying to do is something like

TestRmpi.R
--
library(Rmpi)
if (0==mpi.comm.rank(comm=0)) {
  # 'manager', e.g., cat("manager\n")
} else {
  # 'worker', e.g., cat(mpi.comm.rank(comm=0), "worker\n")
}
mpi.quit()

TestRmpi.sh
---
R --slave < $1

and from the command line

mpirun -np 3 TestRmpi.sh TestRmpi.R

If instead you try

mpirun -np 3 R CMD BATCH TestRmpi.R

then each node executes R CMD BATCH. As a consequence, each node
writes to a file TestRmpi.Rout, and you end up with an undetermined
outcome (because the different nodes are all trying to write to the
same file).

If your file were

WRONGTestSnow.R
---
library(snow)
cl <- makeCluster(3, type = "MPI")
clusterCall(cl, function() Sys.info()[c("nodename", "machine")])
stopCluster(cl)

then you would execute makeCluster on each node, for a total of 9
instances of R. Probably not what you want.

I found that snow was useful for interactive and exploratory work, but
that Rmpi (or rpvm) provide the level of control that you really need
for more substantial use of parallel resources. Recent versions of
Rmpi have mpi.parLapply and related, which are like the high-level
functions in snow.

Hope that helps,

Martin

Srividya Valivarthi [EMAIL PROTECTED] writes:

 Hi,

I am working on a 64-bit rocks cluster and am relatively new to the
 R package. I am trying to get Snow working with R and Rmpi and have
 run into the following issue. R is able to load the Rmpi and snow
 libraries and is able to run simple commands both interactively and
 batch as follows:

 ---
 [EMAIL PROTECTED] ~]$ R
 R : Copyright 2005, The R Foundation for Statistical Computing
 Version 2.2.0  (2005-10-06 r35749)
 ISBN 3-900051-07-0
 library(snow)
 c1 <- makeCluster(3, type = "MPI")
 Loading required package: Rmpi
 3 slaves are spawned successfully. 0 failed.



  clusterCall(c1, function() Sys.info()[c("nodename", "machine")])
  [[1]]
             nodename              machine 
  "cheaha.ac.uab.edu"             "x86_64" 

  [[2]]
              nodename              machine 
  "compute-0-12.local"            "x86_64" 

  [[3]]
              nodename              machine 
  "compute-0-13.local"            "x86_64" 

 stopCluster(c1)
 [1] 1
 q()
 --

 But on running the same script with mpirun i get the following error.
 *8
 [EMAIL PROTECTED] ~]$ mpirun -np 3 R -slave R CMD BATCH TestSnow.R
 /home/srividya/R/library/snow/RMPInode.sh: line 9: 19431 Segmentation
 fault  ${RPROG:-R} --vanilla > ${OUT:-/dev/null} 2>&1 <<EOF

 library(Rmpi)
 library(snow)

 runMPIslave()
 EOF
 **

 I am not sure about what the error could be and any help is greatly
 appreaciated.

 Thanks,
 Srividya

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Finding out the format of an object

2006-04-06 Thread Thomas L Jones
Suppose I have an arbitrary R object. Is there a way to find out its 
format? There are 118 points, each described by two numbers. Let the 
name of the object be "obj" without the quotes. I can do a print 
(obj), but all I get is a bunch of numbers. I can do a ls.str (obj), 
but all I get is a bunch of numbers. Is it a data frame? A vector with 
118 elements, each having two sub-elements? Or ---?

The object in question is of the class gam (Generalized Additive 
Model). I know how to pull out some data with the predict function, 
but, when I try to print the value of the predict function, all I get 
is a bunch of numbers.

Also, how could I have looked this up without having to post a 
question here?

Tom
Thomas L. Jones, Ph.D., Computer Science

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Finding out the format of an object

2006-04-06 Thread Duncan Murdoch
On 4/6/2006 2:24 PM, Thomas L Jones wrote:
 Suppose I have an arbitrary R object. Is there a way to find out its 
 format? There are 118 points, each described by two numbers. Let the 
 name of the object be "obj" without the quotes. I can do a print 
 (obj), but all I get is a bunch of numbers. I can do a ls.str (obj), 
 but all I get is a bunch of numbers. Is it a data frame? A vector with 
 118 elements, each having two sub-elements? Or ---?
 
 The object in question is of the class gam (Generalized Additive 
 Model). I know how to pull out some data with the predict function, 
 but, when I try to print the value of the predict function, all I get 
 is a bunch of numbers.

str(obj) usually gives you the sort of thing you're looking for.
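
For instance (assuming a fitted object called fit of class "gam"):

str(fit, max.level = 1)   # lists the top-level components compactly
class(fit)                # which class(es) the object belongs to
str(predict(fit))         # shows what predict() actually returned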

 Also, how could I have looked this up without having to post a 
 question here?

This is a tricky question.  Usually RSiteSearch() will turn up useful 
information, but I suspect you'll get a lot of false hits if you look 
for "format" or "structure".

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Indexing With List Of Vectors (Replacement)

2006-04-06 Thread Paul Roebuck
I have the following:

> a <- matrix(1:10, nrow = 2, byrow = TRUE)
> b <- array(as.integer(0), c(7, 5))
> idx <- list()
> length(idx) <- 2
> dim(idx) <- c(1, 2)
> idx[[1]] <- as.integer(1:2)
> idx[[2]] <- as.integer(1:5)

I can do the following, which works if 'b' is a matrix.

> b[idx[[1]], idx[[2]]] <- a
> b
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
[3,]    8    8    8    8    8
[4,]    8    8    8    8    8
[5,]    8    8    8    8    8
[6,]    8    8    8    8    8
[7,]    8    8    8    8    8

Looking for a way to do this generically such that 'idx'
with 'n' length can be used to index n-dimensional arrays.

b[translate 'idx' to array indices] <- a

--
SIGSIG -- signature too long (core dumped)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Finding out the format of an object

2006-04-06 Thread Sarah Goslee
Hi Tom,

If you assume that's a common question, and so likely to be in base,
help.search("object", package = "base") calls up a list of functions relating
to objects. (Leaving off the package argument calls up a much longer one).
Browsing thru the list gets you the intriguing line:
class(base) Object Classes

?class gets you more information about that function, which sounds
promising.

class(myobject) will tell you what class(es) myobject belongs to.

names(myobject) will show you the named subcomponents, which can
be accessed with myobject$somename

The object in question is of the class gam (Generalized Additive
 Model). I know how to pull out some data with the predict function,
 but, when I try to print the value of the predict function, all I get
 is a bunch of numbers.


I'm not sure what you mean by "value of the predict function", but you
could try typing just
predict.gam
at the prompt to get some idea of what it's doing (although not necessarily
the complete source).

Sarah
--
Sarah Goslee
http://www.stringpage.com


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] algebra

2006-04-06 Thread Franklin Parlamis
hello all.  i have been trying to develop a representation (in the S4 sense) 
for a floating cash object, which would store a cash amount as a function of an 
arbitrary number of variables (with unknown values).  for example, an interest 
rate swap may call for a payment in one year that can be represented as a 
function of a 3-month libor rate to be determined in nine months.  this 
floating cash amount would be represented as, say, "25*L" (i am omitting some 
of the details leading to the scalar), where L is the unknown rate.  it will be 
necessary for addition and multiplication operations (among others) to make 
sense for the class.  i haven't found R to be particularly accommodating of 
this type of problem, in particular of the algebraic computations necessary to 
make addition and multiplication work.

my best thought so far was to have one slot in the representation be an 
expression object that would evaluate to a numeric cash object, if the 
evaluation were to take place in an environment where all the variables were 
ultimately bound to numerics (in the example above, expression(25*L) would 
fill the slot).  this approach leads me to the following questions:

(i) does R have any native computer algebra functionality that would allow 

 expression(25*L) + expression(25*M)

to evaluate to 

[1] expression(25*(L+M))

(ii) if not, does anyone have experience with integrating available computer 
algebra platforms (such as Mathomatic, which Gabor discussed in an earlier 
thread, but maybe not Mathomatic, because of its lack of natural log support) 
with R, i.e, passing expressions out for algebraic computations and then 
returning the result?  are there any packages available to provide APIs to any 
systems?  Any open source systems?

(iii) is there another way to handle all this other than algebraic manipulation 
of expressions?  even though i can't complete the thought, i keep picturing a 
massive n-dimensional array (if n is the total number of unknowns i am dealing 
with) that would somehow store scalars associated with all the possible 
combinations of variables?  does that thought have a future?  (please ignore 
this last question if i am not, as is often the case, making sense)
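
for reference, the closest i have gotten in base R (a hedged sketch):
expressions can be combined unevaluated and evaluated later, once the
unknowns are bound, but nothing gets simplified to 25*(L+M):

e1 <- quote(25 * L)
e2 <- quote(25 * M)
both <- bquote(.(e1) + .(e2))          # the call 25 * L + 25 * M, unsimplified
eval(both, list(L = 0.04, M = 0.05))   # 2.25 once the rates are known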

thanks in advance for any help.

franklin parlamis

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] difference between two density plots

2006-04-06 Thread Roger Bivand
On Thu, 6 Apr 2006, Duncan Murdoch wrote:

 On 4/6/2006 1:34 PM, Philipp H. Mohr wrote:
  Hello,
  
  I have two densities which I plot with:
  
  sm.density(gps)
  sm.density(gps2)
  
  Both data sets are 2D and have the same scale.
  
  Is there a way of plotting the difference between the two density plots ?
 
 ?sm.density describes the value returned by those function calls, which 
 includes the values of the density estimate at the evaluation points. 
The evaluation points may be different between the two datasets, but 
 you could use approxfun to generate functions doing linear interpolation 
 for each of them, and then evaluate the density difference whereever you 
 like.

The eval.points can be set using sm.options(), fixing them to be the same.

 
 Duncan Murdoch
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] JSS

2006-04-06 Thread Jan de Leeuw
There are some nice new R contributions in the
latest volume of JSS

http://www.jstatsoft.org

The most recent one (and a good model for similar
comparative work) is

Authors:
Alexandros Karatzoglou and David Meyer and Kurt Hornik
Title:
Support Vector Machines in R
Reference:
Volume 15, 2006, Issue 9
Acquire:
paper [0]  browse files [0]
Dates:
submitted: 2005-10-24, accepted: 2006-04-06

If you want to be kept abreast of whatever
is published, subscribe to jss-announce at

http://lists.stat.ucla.edu/mailman/listinfo/jss-announce


===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of  
Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical  
Software
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: [EMAIL PROTECTED]
.mac: jdeleeuw ++  aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
-
   No matter where you go, there you are. --- Buckaroo Banzai
http://gifi.stat.ucla.edu/sounds/nomatter.au

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pros and cons of robust regression? (i.e. rlm vs lm)

2006-04-06 Thread bogdan romocea
There are several kinds of standardization, and 'normalization' is
only one of them. For some details you could check
http://support.sas.com/91doc/getDoc/statug.hlp/stdize_index.htm
(see Details for standardization methods).

Standardization is required prior to clustering to control for the
impact of scale. (Variables with large variances tend to have more
effect on the resulting clusters than those with small variances.) I
don't know how valuable standardization may be in other areas.
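
A hedged illustration with made-up variables: the large-variance column
dominates unscaled k-means, while after scale() both columns contribute.

set.seed(7)
x <- cbind(height_cm = rnorm(60, 170, 10),
           income    = rnorm(60, 50000, 15000))
cl.raw    <- kmeans(x, centers = 2)          # split driven almost only by income
cl.scaled <- kmeans(scale(x), centers = 2)   # variables on comparable footing
table(raw = cl.raw$cluster, scaled = cl.scaled$cluster)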


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of roger bos
 Sent: Thursday, April 06, 2006 1:15 PM
 To: Berton Gunter; Liaw, Andy
 Cc: rhelp
 Subject: Re: [R] pros and cons of robust regression? (i.e.
 rlm vs lm)

 I'm asking this question purely for my own benefit, not to
 try to correct
 anyone.  The procedure you refer to as normalization I have
 always heard
 referred to as standardization.  Is the former the proper
 term?  Also, you
 say its not necessary given today's hardware, but isn't it
 beneficial to get
 all the variables in a similar range?  Is thre any other
 transformation that
 you would suggest?  I use rlm (and normalization) in my
 models I use every
 day, so I was happy to read the above comments.

 Thanks,

 Roger



 On 4/6/06, Berton Gunter [EMAIL PROTECTED] wrote:
 
  Thanks, Andy. Well said. Excellent points. The final
 weights from rlm
  serve
  this diagnostic purpose, of course.
 
  -- Bert
 
 
   -Original Message-
   From: Liaw, Andy [mailto:[EMAIL PROTECTED]
   Sent: Thursday, April 06, 2006 9:56 AM
   To: 'Berton Gunter'; 'r user'; 'rhelp'
   Subject: RE: [R] pros and cons of robust regression? (i.e.
   rlm vs lm)
  
   To add to Bert's comments:
  
   -  Normalizing data (e.g., subtracting mean and dividing by
   SD) can help
   numerical stability of the computation, but that's mostly
   unnecessary with
   modern hardware.  As Bert said, that has nothing to do with
   robustness.
  
   -  Instead of _replacing_ lm() with rlm() or other robust
   procedure, I'd do
   both of them.  Some scientists view robust procedures that
   omit some data
   points (e.g., by assigning basically 0 weight to them) in
   automatic fashion
   and just trust the result as bad science, and I think they
   have a point.
   Use of robust procedure does not free one from examining the
   data carefully
   and looking at diagnostics.  Careful treatment of outliers is
   especially
   important, I think, for data coming from a confirmatory
   experiment.  If the
   conclusion you draw depends on downweighting or omitting
 certain data
   points, you ought to have very good reason for doing so.  I
   think it can not
   be over-emphasized how important it is not to take outlier
   deletion lightly.
   I've seen many cases where what seemed like outliers originally
   turned out to
   be legitimate data, and omitting them just led to overly
   optimistic
   assessment of variability.
  
   Andy
  
   From: Berton Gunter
   
There is a **Huge** literature on robust regression,
including many books that you can search on at e.g. Amazon. I
think it fair to say that we have known since at least the
1970's that practically any robust downweighting procedure
(see, e.g., M-estimation) is preferable (more efficient,
better continuity properties, better estimates) to trimming
outliers defined by arbitrary thresholds. An excellent but
now probably dated introductory discussion can be found in
UNDERSTANDING ROBUST AND EXPLORATORY DATA ANALYSIS edited
by Hoaglin, Tukey, Mosteller, et al.
   
The rub in all this is that nice small sample inference
results go out the window, though bootstrapping can help with
this. Nevertheless, for a variety of reasons, my
recommendation is simply to **never** use lm and **always**
use rlm (with maybe a few minor caveats). Many would disagree
with this, however.
   
I don't think normalizing data as it's conventionally used
has anything to do with robust regression, btw.
   
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
   
The business of the statistician is to catalyze the
scientific learning process.  - George E. P. Box
   
   
   
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of r user
 Sent: Thursday, April 06, 2006 8:51 AM
 To: rhelp
 Subject: [R] pros and cons of robust regression? (i.e.
   rlm vs lm)

 Can anyone comment or point me to a discussion of the
 pros and cons of robust regressions, vs. a more
 manual approach to trimming outliers and/or
 normalizing data used in regression analysis?

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html
   

[R] Evaluating a function with another function

2006-04-06 Thread Guenther, Cameron
Hello all,
I hope someone can help me with this.

I have function that calculates two values based on input data.  A
simple example follows:

test <- function(x, s, rangit = seq(0, 10, 1))
{
rangit <- rangit
y <- vector()
p <- vector()
for(i in 0:length(rangit)){
y[i] <- x + s[1] + rangit[i]
p[i] <- x + s[2] + rangit[i]
}
return(data.frame(rangit, y, p))
}

And returns the following:

   rangit  y  p
1   0  2  3
2   1  3  4
3   2  4  5
4   3  5  6
5   4  6  7
6   5  7  8
7   6  8  9
8   7  9 10
9   8 10 11
10  9 11 12
11 10 12 13

Which is what I want.  The part I am having trouble with is that I want
to write another function that will evaluate the previous function at
multiple levels of x, where the levels of x may not always be the same
length

I have tried several options but so far have not been able to figure it
out.

I tried:

testexp <- function(x, s, rangit){
out <- data.frame()
for (j in 1:length(x)){
out[j] <- test(x[j], s)
}
return(out)
}
Q2 <- testexp(x = c(1:4), s = c(1, 2))

But that returns a warning with no values.  I have also tried various
other methods with no success.

Basically what I want the output to look like is

Out[1]  Out[2]  Out[3]  Out[4]
Rangit y p  rangit y p  rangit y p  rangit y p
0  2 3  0  3 4  0  4 5  0  5 6
1  3 4  1  4 5  1  5 6  1  6 7  
2  4 5  2  5 6  2  6 7  2  7 8
3  . .  .  . .  .  . .  .  . . 
4  . .  .  . .  .  . .  .  . .
5  . .  .  . .  .  . .  .  . .
6  . .  .  . .  .  . .  .  . .
7  . .  .  . .  .  . .  .  . .
8  . .  .  . .  .  . .  .  . .
9  . .  .  . .  .  . .  .  . .
10 . .  .  . .  .  . .  .  . .

Thanks for any help you can offer.


Cameron Guenther, Ph.D. 
Associate Research Scientist
FWC/FWRI, Marine Fisheries Research
100 8th Avenue S.E.
St. Petersburg, FL 33701
(727)896-8626 Ext. 4305
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] More Logistic Regression Tools?

2006-04-06 Thread Ramón Casero Cañas
Frank E Harrell Jr wrote:
 Eric Rescorla [EMAIL PROTECTED] wrote:
 
 (2) I'd like to compute goodness-of-fit statistics for my fit
 (Hosmer-Lemeshow, Pearson, etc.). I didn't see a package that
 did this. Have I missed one?
 
 Hosmer-Lemeshow has low power and relies on arbitrary binning of
 predicted probabilities. The Hosmer-Le Cessie omnibus test is unique
 and has more power usually. To get it:
 
 f <- update(f, x=T, y=T)
 resid(f, 'gof') # uses residuals.lrm 


The documentation of the Design package, section residuals.lrm, says

  "Details: For the goodness-of-fit test, the le Cessie-van Houwelingen
  normal test statistic for the unweighted sum of squared errors (Brier
  score times n) is used."

It is not clear to me whether the test implemented is for the statistic
with a constant kernel described in

S. le Cessie and J.C. van Houwelingen. A goodness-of-fit test for binary
regression models, based on smoothing methods. Biometrics, 47:1267-1282,
Dec 1991.

or the variation in

D.W. Hosmer, T. Hosmer, S. Le Cessie, and S. Lemeshow. A comparison of
goodness-of-fit tests for the logistic regression model. Statistics in
Medicine, 16:965-980, 1997.

Sorry, I couldn't get this from looking at the code either (I'm quite
new to R). Is the statistic tested the T defined by le Cessie and van
Houwelingen?

Regards,

-- 
Ramón Casero Cañas

http://www.robots.ox.ac.uk/~rcasero/wiki
http://www.robots.ox.ac.uk/~rcasero/blog

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] for bclust in package e1071

2006-04-06 Thread Linda Lei
Hi All,

 

Could you please help with the error I get from the following code?

library(e1071)    # bclust
library(cluster)  # clara, used as the base method

data(iris)

bc1 <- bclust(iris[, 2:5], 3, base.method = "clara", base.centers = 5)

Then I got:

Committee Member:
1(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20)
Error in bclust(iris[, 2:5], 3, base.method = "clara",
                base.centers = 5):

  Could not find valid cluster solution in 20 replications

I don't quite understand what the error means here and how to fix it?

Thank you!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] reshape question

2006-04-06 Thread Richard van Wingerden
Hi,

I have a data frame like this:

date column1 column2 column3 value1 value2 value3
1-1 A B C 10 5 2
2-1 A B D 5 2 0
3-1 A B E 17 10 7

How can I reshape it to:

date column1 column2 column3 v x
1-1 A B C value1 10
1-1 A B C value2 5
1-1 A B C value3 2
2-1 A B D value1 5
2-1 A B D value2 2
2-1 A B D value3 0
3-1 A B E value1 17
3-1 A B E value2 10
3-1 A B E value3 7

Thx!

Regards,

Richard

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Evaluating a function with another function

2006-04-06 Thread Gabor Grothendieck
I assume you want a list containing length(x) data frames, one per
list component.  If that's it try this:

test <- function(x = 1:4, s = 1:2, rangit = 0:11)
   lapply(x, function(x) data.frame(rangit, y = x + s[1] + rangit, p = x + s[2] + rangit))
test() # test run

On 4/6/06, Guenther, Cameron [EMAIL PROTECTED] wrote:
 Hello all,
 I hope someone can help me with this.

 I have function that calculates two values based on input data.  A
 simple example follows:

 test <- function(x, s, rangit = seq(0, 10, 1))
 {
 rangit <- rangit
 y <- vector()
 p <- vector()
 for(i in 0:length(rangit)){
 y[i] <- x + s[1] + rangit[i]
 p[i] <- x + s[2] + rangit[i]
 }
 return(data.frame(rangit, y, p))
 }

 And returns the following:

   rangit  y  p
 1   0  2  3
 2   1  3  4
 3   2  4  5
 4   3  5  6
 5   4  6  7
 6   5  7  8
 7   6  8  9
 8   7  9 10
 9   8 10 11
 10  9 11 12
 11 10 12 13

 Which is what I want.  The part I am having trouble with is that I want
 to write another function that will evaluate the previous function at
 multiple levels of x, where the levels of x may not always be the same
 length

 I have tried several options but so far have not been able to figure it
 out.

 I tried:

 testexp <- function(x, s, rangit){
out <- data.frame()
for (j in 1:length(x)){
out[j] <- test(x[j], s)
}
return(out)
 }
 Q2 <- testexp(x = c(1:4), s = c(1, 2))

 But that returns a warning with no values.  I have also tried various
 other methods with no success.

 Basically what I want the output to look like is

 Out[1]  Out[2]  Out[3]  Out[4]
 Rangit y p  rangit y p  rangit y p  rangit y p
 0  2 3  0  3 4  0  4 5  0  5 6
 1  3 4  1  4 5  1  5 6  1  6 7
 2  4 5  2  5 6  2  6 7  2  7 8
 3  . .  .  . .  .  . .  .  . .
 4  . .  .  . .  .  . .  .  . .
 5  . .  .  . .  .  . .  .  . .
 6  . .  .  . .  .  . .  .  . .
 7  . .  .  . .  .  . .  .  . .
 8  . .  .  . .  .  . .  .  . .
 9  . .  .  . .  .  . .  .  . .
 10 . .  .  . .  .  . .  .  . .

 Thanks for any help you can offer.


 Cameron Guenther, Ph.D.
 Associate Research Scientist
 FWC/FWRI, Marine Fisheries Research
 100 8th Avenue S.E.
 St. Petersburg, FL 33701
 (727)896-8626 Ext. 4305
 [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Indexing With List Of Vectors (Replacement)

2006-04-06 Thread Gabor Grothendieck
Try this:

do.call("[<-", c(list(b), idx, list(a)))
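
A hedged illustration that the same construction generalizes to any number of
dimensions, since c(list(b), idx, list(a)) supplies one index vector per
dimension:

b3   <- array(0L, c(3, 4, 5))
idx3 <- list(1:2, 1:2, 1:2)
a3   <- array(1:8, c(2, 2, 2))
b3   <- do.call("[<-", c(list(b3), idx3, list(a3)))  # returns the modified copy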

On 4/6/06, Paul Roebuck [EMAIL PROTECTED] wrote:
 I have the following:

  a <- matrix(1:10, nrow = 2, byrow = TRUE)
  b <- array(as.integer(0), c(7, 5))
  idx <- list()
  length(idx) <- 2
  dim(idx) <- c(1, 2)
  idx[[1]] <- as.integer(1:2)
  idx[[2]] <- as.integer(1:5)

 I can do the following, which works if 'b' is a matrix.

  b[idx[[1]], idx[[2]]] <- a
  b
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    3    5    7    9
 [2,]    2    4    6    8   10
 [3,]    8    8    8    8    8
 [4,]    8    8    8    8    8
 [5,]    8    8    8    8    8
 [6,]    8    8    8    8    8
 [7,]    8    8    8    8    8

 Looking for a way to do this generically such that 'idx'
 with 'n' length can be used to index n-dimensional arrays.

 b[translate 'idx' to array indices] <- a

 --
 SIGSIG -- signature too long (core dumped)

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] reshape question

2006-04-06 Thread Gabor Grothendieck
Just follow this example from example(reshape):

 reshape(wide, idvar = "Subject", varying = list(names(wide)[2:12]),
 v.names = "conc", direction = "long")

like this:

long <- reshape(dd, direction = "long", varying =
list(names(dd)[5:7]), idvar = "v",
   v.names = "x", times = paste("value", 1:3, sep = ""))
long[order(long$date),]

Its even simpler if you use the reshape package:

library(reshape)
long <- melt(dd, id = 1:4)
long[order(long$date),]

On 4/6/06, Richard van Wingerden [EMAIL PROTECTED] wrote:
 Hi,

 I have a data fram like this:

 date column1 column2 column3 value1 value2 value3
 1-1 A B C 10 5 2
 2-1 A B D 5 2 0
 3-1 A B E 17 10 7

 How can I reshape it to:

 date column1 column2 column3 v x
 1-1 A B C value1 10
 1-1 A B C value2 5
 1-1 A B C value3 2
 2-1 A B D value1 5
 2-1 A B D value2 2
 2-1 A B D value3 0
 3-1 A B E value1 17
 3-1 A B E value2 10
 3-1 A B E value3 7

 Thx!

 Regards,

 Richard

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] problem with statetable function in msm

2006-04-06 Thread Vassily Shvets
Just a note to say that if you use msm, you may find that
statetable returns an incorrect state table: you can
then build an incorrect model and not even know that
statetable was the villain.

shfets

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Referencing variables in a dataframe.

2006-04-06 Thread Brian Quinif
I have a question about how to reference variables in a dataframe.

Normally, after I have read in some Stata data using the following command
all - read.dta('all.dta')

Whenever I want to use the variable sat.vr1 in the all data frame,
I do so using

all$sat.vr1

However, I'd like to be able to use the sat.vr1 variable without the
"all$" (as well as for all of the other variables of the data frame).

Is there some way to make a particular data frame the "active" or
default one so that I can do this?

Thanks,

Brian

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Referencing variables in a dataframe.

2006-04-06 Thread Gabor Grothendieck
Here are two ways:

with(iris, Sepal.Length + Sepal.Width)

attach(iris)
Species



On 4/6/06, Brian Quinif [EMAIL PROTECTED] wrote:
 I have a question about how to reference variables in a dataframe.

 Normally, after I have read in some Stata data using the following command
 all - read.dta('all.dta')

 Whenever I want to use the variable sat.vr1 in the all data frame,
 I do so using

 all$sat.vr1

 However, I'd like to be able to use the sat.vr1 variable without the
 all$ (as well as all of the other variables of the dataframe).

 Is there some way to make a particular dataframe the active or
 default one so that a can do this?

 Thanks,

 Brian

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Financial functions

2006-04-06 Thread Spencer Graves
  R has many different functions for financial computations, but I 
don't know if you will find them all together.  Are you familiar with 
RSiteSearch (including the second "restrict" argument)?

  If you can't find what you want, please submit another post (after 
first reading the posting guide! www.R-project.org/posting-guide.html; 
  the guide has helped people answer their own questions and, failing 
that, posts more consistent with that guide tend to receive quicker, 
more useful answers).

  hope this helps,
  spencer graves

vittorio wrote:

 In what R package(-s) can I find the entire set  of financial functions that 
 you can find in MS-Excel such as PMT, PPMT, FV and IPMT? 
 
 Ciao
 Vittorio
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Financial functions

2006-04-06 Thread ronggui
I don't know what PMT, etc., do,
so I will just give some hints; maybe they are not so useful.
There are several packages on CRAN related to finance. Have you tried
to find some functions in them?

fBasics Rmetrics - Marketes and Basic Statistics
fCalendar Rmetrics - Chronological Objects
fExtremes Rmetrics - Extreme Financial Market Data
fMultivar Rmetrics - Multivariate Market Analysis
fOptions Rmetrics - Option Valuation
fPortfolio Rmetrics - Portfolio Selection and Optimization
fSeries Rmetrics - The Dynamical Process Behind Markets




--
黄荣贵
Department of Sociology
Fudan University


Re: [R] Financial functions

2006-04-06 Thread Liaw, Andy
Those functions in Excel are for computing things like the payment on
a fixed-rate loan, etc.  RSiteSearch() doesn't turn up anything
useful, so I'd guess no one has written them.  A good opportunity for
people to contribute, I guess.

Andy
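
The core of Excel's PMT is just the fixed-rate annuity formula, so a
minimal version is easy to write (a sketch of my own, not from any
package; sign conventions follow Excel, where money paid out is
negative):

pmt <- function(rate, nper, pv, fv = 0) {
    # periodic payment that amortizes pv (plus a target fv) over nper periods
    if (rate == 0) return(-(pv + fv) / nper)
    -(pv * (1 + rate)^nper + fv) * rate / ((1 + rate)^nper - 1)
}

pmt(0.06/12, 360, 200000)  # 30-year loan of 200000 at 6%: about -1199.10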



Re: [R] Financial functions

2006-04-06 Thread Gabor Grothendieck
See the first two hits for:

RSiteSearch("IRR")





[R] Command line support tools - suggestions?

2006-04-06 Thread Andrew Robinson
Dear Community,

I'm interested in developing a package that could ease the
command-line learning curve for new users.  It would provide more
detailed syntax checking and commentary as feedback.  It would try to
anticipate common new-user errors, and provide feedback to help
correct them. 

As a trivial example, instead of

> mean(c(1,2,NA))
[1] NA

we might have

> mean(c(1,2,NA))
[1] NA
Warning: your data contain missing values.  If you wish to ignore
these to compute a mean, use na.rm=TRUE.
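
One way to produce that kind of feedback is to mask a base function
with a wrapper (a minimal sketch, assuming that masking base functions
is acceptable for a teaching package):

mean <- function(x, ...) {
    # call the real mean(), then explain a common new-user surprise
    result <- base::mean(x, ...)
    if (is.numeric(x) && any(is.na(x)) && is.na(result))
        message("Note: your data contain missing values. To ignore them ",
                "when computing the mean, use na.rm = TRUE.")
    result
}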




I'm interested in any thoughts that people have about this idea -
what errors do you commonly see, and how can they be dealt with?

I have applied for funding for a couple of months of programming
support for this idea, and will find out whether or not I get it
later this month.  Until then, I thought that it would be interesting
to kick around some ideas for the sorts of changes that we could try.

Cheers,

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
Email: [EMAIL PROTECTED] http://www.ms.unimelb.edu.au



[R] rpart.predict error--subscript out of bounds

2006-04-06 Thread Leon
Hi,
I am using rpart to do leave-one-out cross-validation, but have run
into a problem. Data is a data frame: the first column is the subject
id, the second column is the group id, and the remaining columns are
numerical variables.

> Data[1:5, 1:10]
  sub.id group.id X3262.345 X3277.402 X3369.036 X3439.895 X3886.935 X3939.054 
X3953.777 X3970.352
1  32613  HAM_TSP  417.7082  430.4895  619.4776  720.8246 1048.1290  374.4598  
770.4653  646.8712
2  32616  HAM_TSP 2553.5913 1113.4997  225.5340  274.5644 1105.7908  363.4623 
2380.1872  784.9196
3  32617  HAM_TSP  291.6596  314.7322  238.3287  315.8305  982.6276  299.5855 
1059.1140  540.8592
4  32628  HAM_TSP  456.9504  508.2278  552.3632  719.9989 1306.6299  446.6184 
1352.9955  867.4219
5  32629  HAM_TSP  898.8879  640.2680  342.5477  386.5816  811.6709  518.0244  
715.9886  441.1622

For example, I use the first sample as the test set and the rest as the training set:

> fit <- rpart(as.factor(Data[-1, 2]) ~ ., Data[-1, -c(1:2)], minbucket = 2)

> predict(fit, Data[1, ], type = 'prob')
Error in predict.rpart(fit, Data[1, ]) : subscript out of bounds

but when I change the type parameter to 'class', it works:

> predict(fit, Data[1, -c(1:2)], type = 'class')
[1] HTLV_Carrier
Levels: HAM_TSP HTLV_Carrier Leukemia Normal

Could anyone tell me what the problem is?

Thanks very much in advance!
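
One thing worth checking (an observation, not a confirmed diagnosis):
the failing call passes Data[1, ], with the two id columns still
included, while the working call passes Data[1, -c(1:2)]. Keeping
newdata in the same form as the data used for fitting is worth trying
first:

# assumes library(rpart) is loaded, as in the original post
fit <- rpart(as.factor(Data[-1, 2]) ~ ., Data[-1, -c(1:2)], minbucket = 2)
predict(fit, Data[1, -c(1:2)], type = 'prob')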




Re: [R] Command line support tools - suggestions?

2006-04-06 Thread Larry Howe
> I'm interested in any thoughts that people have about this idea -
> what errors do you commonly see, and how can they be dealt with?

NAs introduced by coercion
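
A concrete instance, with the kind of extra feedback the proposed tool
might add (the final Hint is a mock-up, in the same spirit as the
mean() example above):

> as.numeric(c("12", "3.4", "n/a"))
[1] 12.0  3.4   NA
Warning message:
NAs introduced by coercion
Hint: some elements could not be interpreted as numbers; check for
text such as 'n/a' before converting.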



Re: [R] Using vectorization instead of for loop for performing a calculation efficiently

2006-04-06 Thread Peter Wilkinson
Thanks,

Ok so I feel silly now ... I think I need to get a more thorough text and
work through some examples. 'replace' does what I was trying to do manually.

Thanks for the tip, this really is MUCH faster.

 

-----Original Message-----
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Saturday, April 01, 2006 7:58 PM
To: Peter Wilkinson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Using vectorization instead of for loop for performing a
calculation efficiently

Try this:

> set.seed(1)
> mat <- matrix(rnorm(2 * 10), 2)
> system.time(mat2 <- replace(mat, rowSums(mat > 0) == 10, NA))
[1] 0.04 0.01 0.05   NA   NA
> R.version.string # Windows XP
[1] "R version 2.2.1, 2005-12-20"
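
A note on why that one-liner blanks out whole rows (my reading of the
code, not part of the original reply): matrix elements are indexed in
column-major order, so a logical index of length nrow(mat) recycles
across the whole matrix and selects exactly the rows where it is TRUE.
The same operation written out explicitly:

bad <- rowSums(mat > 0) == ncol(mat)  # rows whose values are all positive
mat2 <- mat
mat2[bad, ] <- NA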


On 4/1/06, Peter Wilkinson [EMAIL PROTECTED] wrote:
 I am trying to write an efficient function that will do the following:

 Given an n x m matrix, say 10 rows (observations) by 10 columns (samples):
 for each row, test if all values in the row are greater than a value k.
 If all values are greater than k, then set all values in that row to NA
 (or something). Return an n x m matrix with the modified rows.

 If I do this with a matrix of 20,000 rows, I will be waiting until 
 Christmas for it to finish:

 For rows in Matrix:
    if rows > filter
       set all elements in rows to NA (or something)
    else
       do nothing
 Return the matrix with the modified rows


 I don't know how to code this properly. The following:

 if (sum(ifelse(nvector > filter, 1, 0)) == 0)

 tells me if any row has at least 1 value above the filter. How do I
 get rid of the 'outer' loop?

 Peter

 -

 Peter Wilkinson
 Senior Bioinformatician / Programmer-Analyst National Immune 
 Monitoring Laboratory
 tel: (514)-343-7876



[R] creating files using for loop

2006-04-06 Thread Brian Quinif
I would like to use a for loop to run estimations on 12 different
subsets of a dataset.

I have the basics of my script figured out, but I am having problems
getting the loop to name some files as I would like.   Here is a
sketch of my code:

sub.dataset <- c(101, 201)
#Assume I only have two subsets which I call 101 and 201

for (i in 1:length(sub.dataset)) {
#Estimation commands here...unimportant for this question
#Estimates <- matrix of estimation results I will write out to LaTeX
latex(Estimates, file=sub.dataset[i].tex)
}

So, the latex command is what's problematic here.  How can I get my
for loop to put file='101.tex' and file='201.tex' (or many similar
things) where I want them?  I could just as easily want to put
something like 101$variable1 or 201$variable1.

Sorry for such a basic question regarding for loops, but thanks for the help.

Brian



Re: [R] creating files using for loop

2006-04-06 Thread Gabor Grothendieck
See ?paste and try:

paste(sub.dataset[i], "tex", sep = ".")
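
Dropped into the loop from the original post, that becomes (a sketch;
latex() is assumed to be the Hmisc function, as the post implies):

for (i in 1:length(sub.dataset)) {
    latex(Estimates, file = paste(sub.dataset[i], "tex", sep = "."))
}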



Re: [R] creating files using for loop

2006-04-06 Thread Brian Quinif
I tried that, and the problem seems to be that the results of the
paste()ing are enclosed in quotes; that is, the paste yields
'101.tex'.  Of course, that is what I said I wanted above, and it
should work for that particular instance, but I also want to be able
to create estimates.101 and estimates.201, for example.  That is, I
want to be able to generate the value of sub.dataset[i] and follow it
or precede it by whatever I choose, not necessarily quotation marks.

Thanks,

BQ
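
For the second request, names like estimates.101 built on the fly,
paste() can construct the name and assign() can bind a value to it,
although collecting results in a named list is usually cleaner (a
sketch):

for (i in 1:length(sub.dataset)) {
    assign(paste("estimates", sub.dataset[i], sep = "."), Estimates)
}

# usually preferable: keep the results together in a named list
estimates <- list()
for (i in 1:length(sub.dataset)) {
    estimates[[as.character(sub.dataset[i])]] <- Estimates
}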
