-Original Message-
From: X.H Chen [mailto:[EMAIL PROTECTED]
Sent: Sunday, September 24, 2006 12:16 AM
To: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Subject: Re: [R] Data frames questions
1) Is there a way to build an empty data frame, containing
nothing but
the data frame
How about:
all.nas <- apply( old, 1, function(x) sum( is.na( x ) ) )
new <- old[all.nas < dim( old )[2], ]
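A minimal sketch of the same idea on made-up data (the data frame `old` and its columns are invented here for illustration). It uses `rowSums()` to count the missing values per row, which avoids the `apply()` call:

```r
# Hypothetical example data: row 2 consists of NAs only
old <- data.frame(a = c(1, NA, 3), b = c("u", NA, NA))

# Number of missing values in each row
all.nas <- rowSums(is.na(old))

# Keep the rows where at least one value is present
new <- old[all.nas < ncol(old), ]
new
```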
--
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2
DK-2820 Gentofte
Denmark
tel: +45 44 43 87 38
mob: +45 30 75 87 38
fax: +45 44 43 07 06
I have an odd problem in building a package with only R-code in it.
I have a package mainly used by myself which I last built under R
1.9.0.
The operating system is Win2000 5.00.2195, Service Pack 3
When I do:
c:\stat\r\rw2000\bin\Rcmd install --docs=normal --build
try:
plot(1:10, xlab = substitute( expression(paste("nm SO"[4]^{2-x}, ", ",
mu, "eq cm"^{-2}, "yr"^{-1})),
list(x="") ) )
I have no understanding of why it works, formula fidgeting usually
requires use of substitute().
Btw. I started out
To Danardono's concern of splitting time for records with delayed entry:
This can fairly easily be accommodated by simply splitting time in small
intervals of time since entry into the study, and then computing the value
of the other timescales for each of these, e.g.:
current.age - time.from.entry
Two major advantages of SAS that seem to have been overlooked in
the previous replies are:
1) The data-set language in SAS for data manipulation is more
human-readable than R-code in general.
R is not a definite write-only language like APL, but in particular
in data manipulation it is
There is a package, Lexis, not official though, which contains
a function ROC (and some other stuff for epidemiology).
You can find it in:
http://biostat.ku.dk/~bxc/SPE/library/
Bendix Carstensen
Try to say:
class(x)
unclass(x)
and it will dawn on you what goes on.
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Henrik
Andersson
Sent: Monday, December 06, 2004 2:44 PM
To: [EMAIL PROTECTED]
Subject: [R] Modyfing PATH in Windows Installer for R
Just a small suggestion since Windows have a file system not
not that it's much shorter:
length( table( sapply( list(x,y,z), length ) ) ) == 1
Bendix
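The one-liner above counts the distinct lengths via table(). A sketch of an equivalent check, assuming a version of R that has `lengths()`; the vectors x, y and z are made up for illustration:

```r
# Hypothetical vectors of different types but equal length
x <- 1:5
y <- letters[1:5]
z <- c(0.1, 0.2, 0.3, 0.4, 0.5)

# TRUE when all elements of the list have the same length
same.length <- length(unique(lengths(list(x, y, z)))) == 1
same.length
```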
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Rachel Pearce
Sent: Monday, December 13, 2004 10:37 AM
To: [EMAIL PROTECTED]
Subject: [R] Percentages in contingency tables *warning
trivial question*
I hesitate to post this question in the
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Saurin Jani
Sent: Friday, December 17, 2004 3:59 PM
To: [EMAIL PROTECTED]
Subject: [R] How can I take anti log of log base 2 values in R
Hi,
I am using R for microarray data analysis. When I
In:
http://biostat.ku.dk/~bxc/SPE/library/
you will find a zip of the Lexis package that contains the
function Relevel, which has precisely this (and other) features.
Bendix
I was wrong about needing the Relevel from the Lexis package.
The default version of relevel does the job of reshuffling levels
in any desired order, albeit with a warning (which comes from the
fact that apparently only a single number had been anticipated by
the designer):
testf <- factor(
try:
tapply( Z, list( X, Y ), mean )
If you have all your vectors in a list
vl <- list( p1, p2, p3)
then the following should do the trick:
res <- numeric(0)
for( i in 1:length(vl) ) res <- c( res, vl[[i]] )
Bendix Carstensen
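For plain atomic vectors the loop above can also be written without a loop. A small sketch with made-up vectors p1, p2 and p3 (the values are assumptions for illustration):

```r
# Hypothetical vectors of different lengths
p1 <- c(1, 2)
p2 <- 3
p3 <- c(4, 5, 6)
vl <- list(p1, p2, p3)

# unlist() concatenates the elements of the list in order
res <- unlist(vl)
res
```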
For exact control of height and width, you may want to have a look at:
?win.metafile
Bendix Carstensen
You would probably be better off without the loop, for example:
ns <- length( sequence )
num.seq <- match( sequence, names )
scores <- mscore[cbind(num.seq[-1],num.seq[-ns])]
sum( scores )
I have used the fact that if you index a matrix with
a two-column matrix ( here, cbind( , ) ), you select the
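To illustrate indexing by a two-column matrix, here is a small self-contained sketch. The score matrix and the sequence are invented for illustration; only the indexing technique is taken from the text above:

```r
# Hypothetical 3 x 3 score matrix with named states
names  <- c("A", "B", "C")
mscore <- matrix(1:9, nrow = 3, dimnames = list(names, names))

# Hypothetical sequence of states
sequence <- c("A", "C", "B", "B")
ns <- length(sequence)
num.seq <- match(sequence, names)

# Each row of idx is a (row, column) pair: (current state, previous state)
idx <- cbind(num.seq[-1], num.seq[-ns])

# One matrix element is picked per row of idx
scores <- mscore[idx]
sum(scores)
```

Each row of the index matrix selects a single element, so the result is a vector with one score per transition.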
You probably want something like:
t1 <- table( x )
t1
x
 1  2 20
 3  2  1
t2 <- rbind( as.numeric( names( t1 ) ), t1 )
t2
    1  2 20
    1  2 20
t1  3  2  1
dimnames( t2 ) <- NULL
t2
     [,1] [,2] [,3]
[1,]    1    2   20
[2,]    3    2    1
Bendix
You want:
tapply( Outcome, predictor, mean )
Bendix Carstensen
The code below illustrates some points about results from tapply that
I find strange. I wonder if they are intended and if so why it is so.
1) When you make a table the dimnames is a *named* list, tapply
returns an unnamed list.
2) data.frame behaves differently on an array and a table. Is
how about:
x <- rnorm(400)
nbin <- 7
hist(x,breaks=quantile(x,prob=seq(0,1,length=nbin+1)))
Bendix Carstensen
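A sketch of what the quantile-based breaks achieve: each bin receives (nearly) the same number of observations. The seed and the use of `plot = FALSE` are choices made here for illustration, and the argument names are written out in full:

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rnorm(400)
nbin <- 7

# Breaks at the 0/7, 1/7, ..., 7/7 quantiles of x
brks <- quantile(x, probs = seq(0, 1, length.out = nbin + 1))

# plot = FALSE returns the histogram object without drawing it
h <- hist(x, breaks = brks, plot = FALSE)
h$counts
```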
Unless you have repeated events per person, you do not need to
keep track of which follow-up belongs to whom.
The likelihood contribution from the two parts is a product.
See any textbook on survival analysis or p. 50 ff. of:
http://staff.pubhealth.ku.dk/~bxc/Melbourne/Staff/foils.pdf
or another
What you need is matrix multiplication:
rbind( c(1,1,0,0,0,0),
c(0,0,1,1,0,0),
c(0,0,0,0,1,1) ) %*% M
Bendix
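A small sketch of the matrix product above, assuming that M has six rows and that the aim is to add up consecutive pairs of rows (this reading is inferred from the 0/1 matrix; the values in M are made up):

```r
# Hypothetical 6 x 2 matrix
M <- matrix(1:12, nrow = 6)

# Each row of A picks out one pair of consecutive rows of M
A <- rbind(c(1, 1, 0, 0, 0, 0),
           c(0, 0, 1, 1, 0, 0),
           c(0, 0, 0, 0, 1, 1))
S <- A %*% M
S

# The same 0/1 matrix built as a Kronecker product
A2 <- diag(3) %x% matrix(1, nrow = 1, ncol = 2)

# rowsum() gives the same sums from a grouping vector
S2 <- rowsum(M, group = rep(1:3, each = 2))
```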
The ZIP model can be fitted with Jim Lindsey's function fmr
from his gnlm library, see:
http://popgen0146uns50.unimaas.nl/~jlindsey/rcode.html
Bendix Carstensen
Course in
STATISTICAL PRACTICE IN EPIDEMIOLOGY USING R
Tartu, Estonia, 26 - 31 May 2005
The course is aimed at epidemiologists and statisticians who wish to
use R for statistical modelling and analysis of epidemiological data.
The course
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Florian Menzel
Sent: Tuesday, January 25, 2005 3:22 PM
To: r-help@stat.math.ethz.ch; r-help@stat.math.ethz.ch
Subject: [R] GLM function with poisson distribution
Hello all,
I found a weird
You have a perfect separation of y by x, i.e.
y == (x30)
is true for all units.
Bendix Carstensen
Consider the following two specifications of a model:
library( splines )
x <- 1:100
y <- rnorm( 100 )
w <- rep( 1, 100 )
A <- factor( sample( 1:2, 100, replace=T ) )
B <- factor( sample( letters[1:4], 100, replace=T ) )
summary( lm( y ~ ns( x, knots=c(30, 50, 70 ), intercept=T ):A - 1 + B )
)
summary(
What you want is probably:
cxy <- c(x,y)
xy <- rep( c("x","y"), c(length(x),length(y)) )
( txy <- table(xy, cxy ) )
   cxy
xy  2 3 4 5
  x 0 6 4 0
  y 4 0 1 5
barplot( txy, beside=T )
Bendix Carstensen
?Devices
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Liu, Jane
Sent: Monday, February 21, 2005 10:14 AM
To: RHELP
Subject: [R] save plot as jpg/gif
I am generating multiple plots and would like to save them as
jpg or gif files. Could
When I run in BATCH mode I use a script (win2000):
c:\stat\r\%R_VERS%\bin\Rcmd BATCH -q --no-restore --no-save %1
Now I want to be able to print the filename of the program, i.e.
the value of the %1 argument, in the .Rout file.
(Basically I want to write a piece of code in .First() which
tapply( x, f, mean )
maybe with
tapply( x, f, mean, na.rm=T )
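A short illustration of the difference between the two calls, on made-up data with one missing value:

```r
# Hypothetical data: one NA in group "b"
x <- c(1, 2, NA, 4, 6)
f <- factor(c("a", "a", "b", "b", "b"))

# Without na.rm the mean of group "b" is NA
m1 <- tapply(x, f, mean)

# Extra arguments are passed on to mean(), so the NA is dropped
m2 <- tapply(x, f, mean, na.rm = TRUE)
m2
```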
?tapply
Bendix Carstensen
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of thomas hills
Sent: Saturday, February 26, 2005 6:17 PM
To: r-help@stat.math.ethz.ch
Subject: [R] averaging within columns
I have a dataframe with names in the first
The function addmargins() adds margins to a table, but returns a matrix.
But even after conversion to a table, the zero.print="." option of
print.table() does not work:
x <- sample( 1:7, 20, replace=T )
y <- sample( 1:7, 20, replace=T )
tt <- table( x, y )
tx <- as.table( addmargins( table( x, y ) ) )
STATISTICAL PRACTICE IN EPIDEMIOLOGY USING R
Tartu, Estonia, Thursday 8 - Tuesday 13 June 2006
Application deadline: 15 April 2006.
The course is aimed at epidemiologists and statisticians who wish to
use R for statistical modelling and analysis of
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of David
J. Netherway
Sent: Monday, May 03, 2004 10:25 AM
To: [EMAIL PROTECTED]
Subject: [R] Setting up contrasts
I am using the following model:
lm - lm(mydata[[variableName]] ~ Age + Gender +
This is how I get the month names from within R:
mon <- rep(strptime("01/01/1952", format = "%d/%m/%Y"), 12)
mon$mon <- mon$mon + 0:11
mnam <- months(mon, abbreviate = F)
mnam
 [1] "januar"    "februar"   "marts"     "april"     "maj"       "juni"
 [7] "juli"      "august"    "september"
[10] "oktober"   "november"  "december"
Is the following an inconsistency, programming glitch or a feature?
One would expect that vcov(obj) was the variance-covariance of
coef(obj),
but apparently this is not the case for polr objects:
x <- rnorm( 100 )
y <- rnorm( 100 )
ff <- factor( sample( 1:4, 100, replace=T ) )
pm <- polr( ff ~
Is there a way to prevent latex.default() from starting LaTeX and JUST
create the file requested?
I generate a number of LaTeX tables for inclusion in a document running
R in batch, and I don't want a lot of calls to LaTeX.
I cannot find any arguments for that task in the documentation.
It seems
You want the function cut(), followed by table().
Bendix
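A minimal sketch of the cut() followed by table() combination; the variable `age` and the break points are made up for illustration:

```r
# Hypothetical continuous variable
age <- c(23, 37, 41, 58, 62, 35, 49)

# cut() turns the numbers into a factor of intervals (right-closed by default)
grp <- cut(age, breaks = c(20, 40, 60, 80))

# table() then counts the observations in each interval
tab <- table(grp)
tab
```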
t( my.table )
-Original
You (and the mailing list) would definitely benefit from clicking on:
Help - Manuals - An introduction to R
and spending a few hours in front of R while reading that.
Bendix
Has anyone written a function that will print a difftime in the form:
hh:mm:ss
or
yy-mm-dd hh:mm:ss
depending on the actual size.
(sloppy notation for months/minutes, but surely you get the point).
Bendix
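One possible sketch for the shorter hh:mm:ss form only. The name `fmt_hms` is a hypothetical helper, not an existing R function, and the longer yy-mm-dd form is not attempted here:

```r
# Hypothetical helper: format a difftime as hh:mm:ss
fmt_hms <- function(d) {
  # Convert to whole seconds regardless of the units stored in d
  secs <- round(as.numeric(d, units = "secs"))
  sprintf("%02d:%02d:%02d",
          secs %/% 3600,            # hours
          (secs %% 3600) %/% 60,    # minutes
          secs %% 60)               # seconds
}

d1 <- as.difftime(3725, units = "secs")
d2 <- as.difftime(90, units = "mins")
fmt_hms(d1)
fmt_hms(d2)
```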
Just open a device before you plot:
pdf( "plotfile.pdf" )
plot( x, y )
dev.off()
also have a look at:
?Devices
Best
Bendix
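A self-contained version of the steps above, writing to a temporary file so that nothing is left in the working directory (the temporary file and the plotted data are choices made here for illustration):

```r
# Temporary file name with a .pdf extension
plotfile <- tempfile(fileext = ".pdf")

pdf(plotfile)                 # open the pdf device
plot(1:10, (1:10)^2)          # everything plotted now goes to the file
invisible(dev.off())          # close the device, which writes the file

file.exists(plotfile)
```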
The key to solving your problem is that read.spss by default
gives you a *list* and not a *dataframe* (can anyone explain this
choice of default?).
So most likely you want:
children = read.spss(filename,to.data.frame=TRUE)
attach(children)
or to get things a little more handy:
children <-
There is now an Epi-package on CRAN.
It is intended for epidemiological analysis in R.
It has its own homepage, http://www.pubhealth.ku.dk/~bxc/Epi
The package has been used at the course Statistical practise
in Epidemiology with R, see http://www.pubhealth.ku.dk/~bxc/SPE.
A mailing list for
In Venables & Ripley 3rd edition (p. 231) the proportional odds model
is described as:
logit(p<=k) = zeta_k + eta
but polr apparently thinks there is a minus in front of eta,
as is apparent below.
Is this a bug or a feature I have overlooked?
Here is the naked code for reproduction, below the
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Mike Hollegger
Sent: Saturday, February 14, 2004 12:03 PM
To: [EMAIL PROTECTED]
Subject: [R] converting data to date format
Dear all,
...snip
I've tried to bring it in character-format
Course in
STATISTICAL PRACTICE IN EPIDEMIOLOGY USING R
Tartu, Estonia, 29 May - 4 June 2004.
Aimed at young statisticians and epidemiologists wishing to broaden
their epidemiological skills, in particular with respect to practical
statistical
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Wednesday, March 03, 2004 6:27 PM
To: Peter Dalgaard
Cc: [EMAIL PROTECTED]
Subject: Re: [R] read.spss and time/date information
On Wed, 3 Mar 2004, Peter Dalgaard
I guess what you want is:
a <- abs(a)
floor( a / 10^floor( log10( a ) ) )
Bendix
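The two lines above appear to extract the leading (first significant) digit of a number; that reading is an assumption, since the original question is not shown. Wrapped in a function with the hypothetical name `first.digit`, it can be tried like this:

```r
# Hypothetical wrapper around the two lines above
first.digit <- function(a) {
  a <- abs(a)
  # Divide by the largest power of 10 not exceeding a, then drop the decimals
  floor(a / 10^floor(log10(a)))
}

fd <- first.digit(c(123, 0.045, -7.8, 9000))
fd
```

Note that the expression is not defined for zero, since log10(0) is -Inf.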
paste
is the function you need.
Bendix C.
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of David Firth
Sent: Tuesday, March 16, 2004 1:12 PM
To: Paul Johnson
Cc: [EMAIL PROTECTED]
Subject: Re: [R] glm questions
Dear Paul
Here are some attempts at your questions. I hope it's of some
lr.mod <- glm( y ~ x + w, family=binomial )
exp( summary( lr.mod )$coef[,1:2] %*% rbind( c(1,1,1), 1.96*c(0,-1,1) ) )
should do the job.
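A sketch of this computation on simulated data (the data and the seed are invented for illustration). The second row of the multiplier is taken as 1.96*c(0,-1,1), so that the three columns become the estimate, the lower limit and the upper limit; the result is compared with the Wald intervals from confint.default():

```r
set.seed(3)                                  # arbitrary seed
n <- 200
x <- rnorm(n)
w <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))    # hypothetical true model

lr.mod <- glm(y ~ x + w, family = binomial)

# Columns 1:2 of the coefficient table are estimate and standard error
est.se <- summary(lr.mod)$coef[, 1:2]

# (est, se) %*% rbind(c(1,1,1), 1.96*c(0,-1,1)) gives est, est-1.96se, est+1.96se
ci <- exp(est.se %*% rbind(c(1, 1, 1), 1.96 * c(0, -1, 1)))
colnames(ci) <- c("OR", "lower", "upper")
ci

# Wald intervals from base R, for comparison
wald <- exp(confint.default(lr.mod))
```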
Pack it in a function if you like, see e.g. the
(so far) undocumented function ci.lin in:
http://www.biostat.ku.dk/~bxc/R/ci.lin.R
(depends on)
You may want:
lm( y ~ x:z )
This is the same model you fitted, but parametrized differently.
But please check that what you REALLY want is not
lm( y ~ z + x:z )
This is the model with different intercepts as well.
Bendix Carstensen
When I do:
apc <- glm( D ~ ns( Ax, knots=seq(50,80,10), Bo=c(40,90) ) +
ns( Cx, knots=seq(1880,1940,20), Bo=c(1840,1960) ) +
ns( Px, knots=seq(1960,1980,10), Bo=c(1940,2000) ) +
offset( log( Y ) ),
family=poisson )
pterm <-
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Thomas Lumley
Sent: Wednesday, April 07, 2004 5:44 PM
To: Giovanni Petris
On Wed, 7 Apr 2004, Giovanni Petris wrote:
Hello,
After reading the help for predict.lm and predict.glm, it
is not
I am not sure what you mean but you might be interested in the functions
row() and col().
Bendix Carstensen
Is there a function that does the same as pretty but on a log-scale?
Suppose you have
x <- exp( runif( 100, 0, 6 ) )
(which will be between 1 and 403), then I would like to have a result like:
log.pretty( x )
[1] 1 5 10 50 100 500
Bendix C.
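One way such a function could be sketched is shown here. The name `log_pretty` and the choice of the multipliers 1 and 5 are assumptions made to reproduce the wished-for output above; this is not an existing R function:

```r
# Hypothetical helper: "pretty" values on a log scale
log_pretty <- function(x, mult = c(1, 5)) {
  x <- x[is.finite(x) & x > 0]
  lo <- floor(log10(min(x)))      # decade at or below the minimum
  hi <- ceiling(log10(max(x)))    # decade at or above the maximum
  # Candidates: every multiplier times every power of 10 in the range
  cand <- sort(as.vector(outer(mult, 10^(lo:hi))))
  # Keep from the last candidate <= min(x) to the first candidate >= max(x)
  first <- max(which(cand <= min(x)))
  last  <- min(which(cand >= max(x)))
  cand[first:last]
}

lp <- log_pretty(c(1.2, 37, 403))
lp
```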
I wonder if there is a count of the number of downloads of
each version of the R installations (for example rw1090.exe)?
Bendix Carstensen
Course in
STATISTICAL PRACTICE IN EPIDEMIOLOGY USING R
Tartu, Estonia, Thursday 8 - Tuesday 13 June 2006
The course is aimed at epidemiologists and statisticians who wish to
use R for statistical modelling and analysis of epidemiological data.
The
Here is a piece of code fitting a model to a (part) of a dataset, just
for
illustration. I can extract the random interaction and the residual
variance
in group meth==1 using VarCorr, but how do I get the other residual
variance?
Is there any way to get the other variances in numerical form
Course in
STATISTICAL PRACTICE IN EPIDEMIOLOGY USING R
Tartu, Estonia, 25 to 30 May 2007
The course is aimed at epidemiologists and statisticians who wish to use R for
statistical modelling and analysis of epidemiological data.
The course requires
Generally it is difficult to get an overview of what's there.
But the following function I acquired from (???) ages ago does a nice
job:
lls <-
function (pos = 1, pat = "")
{
dimx <- function(dd) if (is.null(dim(dd)))
length(dd)
else dim(dd)
lll <- ls(pos = pos, pat = pat)
Here is a workable solution:
df1 <- data.frame(ar1)
df2 <- data.frame(ar2)
cmn <- intersect(names(df1),names(df2))
rbind(df1[,cmn],df2[,cmn])
Best
Bendix
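The same approach on two made-up data frames (their columns are invented for illustration). `drop = FALSE` is added here as a precaution, so that the result stays a data frame even if only one column is shared:

```r
# Hypothetical data frames sharing the columns "id" and "b"
df1 <- data.frame(id = 1:2, a = c(10, 20), b = c("x", "y"))
df2 <- data.frame(id = 3:4, b = c("z", "w"), c = c(TRUE, FALSE))

# Names present in both data frames
cmn <- intersect(names(df1), names(df2))

# Stack the rows, using the common columns only
both <- rbind(df1[, cmn, drop = FALSE], df2[, cmn, drop = FALSE])
both
```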
__
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2-4
DK-2820
I have a small package that works well and compiles nicely on WinXP,
using R-2.4.1.
Now I want to add a document, so I make a folder inst\doc and put the
.tex and .pdf in there.
But the compilation then crashes. Is this because the installation expects
some file to be present in inst if an inst folder
I was a bit puzzled by:
formatC(6.65,format="f",digits=1)
[1] "6.6"
So I experimented and found:
formatC(6.6501,format="f",digits=1)
[1] "6.6"
formatC(6.651,format="f",digits=1)
[1] "6.7"
round(6.6501,1)
[1] 6.7
round(6.651,1)
[1] 6.7
version