from:"Marie Pierre Sylvestre"

[R] Package rms: c-statistic from lrm function with weights

2016-06-15 Thread Marie-Pierre Sylvestre

Dear list,

I am using the lrm function from the rms package to estimate a logistic
model with weights. The c-statistic (or area under the curve) is part of
the lrm output.

To understand how the weights enter the computation of the c-statistics, I
looked at the script of lrm and lrm.fit but I am out of luck because it is
making a call to a Fortran routine and I don't know Fortran.

z <- .Fortran("lrmfit", coef = initial, nx, 1:nx, x,
y, offset, u = double(nvi), double(nvi * (nvi + 1)),
double(1), n, nx, sumw, nvi, v = double(nvi * nvi),
double(nvi), double(2 * nvi), double(nvi), integer(nvi),
opts = opts, ftable, penmat, weights, PACKAGE = "rms")


Can somebody help me figure out how the weights from the regression are
used in the computation of the c-statistic? Here is a small example that
shows that the c-statistic computed from the rms package and using the pROC
packages are not the same (not even close) when calculated from a weighted
logistic regression.

set.seed(1233)
x <- rnorm(100)
w <- runif(100)
y <- rbinom(100, 1, .5)
require(rms)
# unweighted model
umod <- lrm(y~x)
umod$stat # c-statistic is   0.5776796
# weighted model
wmod <- lrm(y~x, weight = w)
wmod$stat # c-statistic is  0.65625
# using pROC
require(pROC)
umod2 <- glm(y~x, family = binomial)
auc(y, predict(umod2)) # 0.5769
wmod2 <- glm(y~x, weights = w, family = binomial)
auc(y, predict(wmod2)) # 0.5769

BTW results from umod and umod2 and from wmod and wmod2 are identical so
the discrepancy in c-statistics in not due to using lrm vs. glm.

Best regards,
MP

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Frequencies for a list of vectors

2014-08-05 Thread Marie-Pierre Sylvestre

Dear R users,

I have a list of vectors (list is called HTNlist). Each vector is of length
1 to 4 and takes only 0 and 1 as values. E.g.

head(HTNlist)
$`30008`
[1] 1 0 1 0

$`60008`
[1] 0 0 1 0

$`90008`
[1] 0 0 1 0

$`17`
[1] 1

$`130001`
[1] 0 1

$`130007`
[1] 1 0 1 0

I would like to obtain a frequency table for the elements of the list. I
want to know how many of
'1 0 0' I have in the list, how many '1 0 1 0' etc.

Can you please help?

Thank you in advance,
MP

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] update.formula() to add interaction terms

2014-02-04 Thread Marie-Pierre Sylvestre

Many thanks, that is exactly what I was looking for. I did not know about
bquote, interesting!
MP


On Mon, Feb 3, 2014 at 6:29 PM, William Dunlap wdun...@tibco.com wrote:

 Is this the sort of thing you are looking for?
 fm - y ~ x1 + x2 + x3 + log(x4)
 # Use terms() instead of just all.vars() to keep log(x4) as log(x4)
 xVars - with(attributes(terms(fm)),
 as.list(variables)[-c(1,response+1)])
 str(xVars)
List of 4
 $ : symbol x1
 $ : symbol x2
 $ : symbol x3
 $ : language log(x4)
 # use bquote to make the addition to the formula
 update(fm, bquote( ~ . + .(xVars[[1]]) * .(xVars[[length(xVars)]])))
y ~ x1 + x2 + x3 + log(x4) + x1:log(x4)

 As a function it would be
addInteraction - function(formula){
   xVars - with(attributes(terms(formula)),
 as.list(variables)[-c(1,response+1)])
  update(formula, bquote( ~ . + .(xVars[[1]]) *
 .(xVars[[length(xVars)]])))
}
 used as
 addInteraction(y~x1+x2+sqrt(x3))
y ~ x1 + x2 + sqrt(x3) + x1:sqrt(x3)

 If the last 'term' in the formula is a compound like x4:x5 and you
 want x1:x4:x5 added you will need to do more work (look at the
 'factors' attribute of terms()'s output) - currently it adds x1:x5.


 Bill Dunlap
 TIBCO Software
 wdunlap tibco.com


  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf
  Of Marie-Pierre Sylvestre
  Sent: Monday, February 03, 2014 12:54 PM
  To: r-help@r-project.org
  Subject: [R] update.formula() to add interaction terms
 
  Hi,
  I have a list of formulae that I need to modify. For each formula, I need
  to add an interaction term between the first independent variable and the
  last one. I want to write a function to achieve that because the list of
  formulae to modify is long.
 
  For example, if the formula is y ~ x1 + x2 + x3 + x4, then I need to turn
  it either into
 
  y ~ x1*x4 + x2 + x3
 
  or
 
  y ~ x1 + x2 + x3 + x4 + x4:x1
 
  (I know they are the same, but one may be easier to work with than the
  other).
 
 
  Suppose I have the formula a which is defined as a - formula(y ~
  x1+x2+x3+x4)
 
  I know how to access to the first and last independent variables of my
  formula:
 
  firstvar - all.vars(a[[3]])[1]
  lastvar - all.vars(a[[3]])[length( all.vars(a[[3]]))]
 
 
  What I can't figure out is how to use update.formula() to include my
  interaction term in order to get y ~ x1+x2+x3+x4 + x1:x4. Specifically,
 
  update.formula(a, ~ . + paste(all.vars(a[[3]])[1],firstvar, lastvar, sep
 =
  ':'))
 
  is not producing y ~ x1+x2+x3+x4 + x1:x4.
 
  Any idea?
 
  Thanks in advance,
  MP
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] update.formula() to add interaction terms

2014-02-03 Thread Marie-Pierre Sylvestre

Hi,
I have a list of formulae that I need to modify. For each formula, I need
to add an interaction term between the first independent variable and the
last one. I want to write a function to achieve that because the list of
formulae to modify is long.

For example, if the formula is y ~ x1 + x2 + x3 + x4, then I need to turn
it either into

y ~ x1*x4 + x2 + x3

or

y ~ x1 + x2 + x3 + x4 + x4:x1

(I know they are the same, but one may be easier to work with than the
other).


Suppose I have the formula a which is defined as a - formula(y ~
x1+x2+x3+x4)

I know how to access to the first and last independent variables of my
formula:

firstvar - all.vars(a[[3]])[1]
lastvar - all.vars(a[[3]])[length( all.vars(a[[3]]))]


What I can't figure out is how to use update.formula() to include my
interaction term in order to get y ~ x1+x2+x3+x4 + x1:x4. Specifically,

update.formula(a, ~ . + paste(all.vars(a[[3]])[1],firstvar, lastvar, sep =
':'))

is not producing y ~ x1+x2+x3+x4 + x1:x4.

Any idea?

Thanks in advance,
MP

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] meta-analysis, outcome = OR associated with a continuous independent variable

2012-04-04 Thread Marie-Pierre Sylvestre

Hello everyone,
I want to do a meta-analysis of case-control studies on which an OR was
computed based on a continuous exposure. I have found several several
packages (metafor, rmeta, meta) but unless I misunderstood their main
functions,  it seems to me that they focus on two-group comparisons (binary
independent variable), and do not have the option of using a continuous
independent variable.


If this is right, do you have any suggestions for a meta-analysis with
continuous independent variable? I using lme or lme4 with weights my only
option?

Thanks in advance,
MP

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] pairs and panel.smooth for two groups

2010-09-08 Thread Marie-Pierre Sylvestre

Hi,
I have modified the USJudgeRatings data (available in R)  to
illustrate my question.

# Use the first 4 variables of USJudgeRatings and add a group variable
with two levels
USJudgeRatings - USJudgeRatings[,1:4]
USJudgeRatings$group - factor(c(rep(1, 22), rep(0, 21)))

# I can draw a pairs graph where members of each group are drawn in
different colors:

pairs(USJudgeRatings[,1:4], col = c(2,3)[USJudgeRatings$group], pch =
c(21,3)[USJudgeRatings$group])

# I would also like to add a smooth line to each subplot like
pairs(USJudgeRatings[,1:4], panel=panel.smooth)

# but I want the smooth to be done for each of the group, i.e. I want
two smooths per subplot.

# this creates only one smooth
pairs(USJudgeRatings[,1:4], col = c(2,3)[USJudgeRatings$group], pch =
c(21,3)[USJudgeRatings$group], panel = panel.smooth)

# I understand that panel.smooth is a function that is called for each
subplot. I don't know how to tell it to do a smooth for each of my
group in each of the subplot.

Any help would be appreciated!

Best,

Marie-Pierre

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Converting a data set from 'long' format to 'interval' format

2010-03-24 Thread Marie-Pierre Sylvestre

Hi,

I have a data set in which the variable 'dose' is time-varying. Currently,
the data set is in a long format, with 1 row for each time unit of follow-up
for each individual Id. It looks like this:


orig.data - cbind(Id = c(rep(1,4), rep(2,5)), time = c(1:4, 1:5), dose =
c(1,1,1,0,1,0,1,1,0))

orig.data
  Id time dose
 [1,]  111
 [2,]  121
 [3,]  131
 [4,]  140
 [5,]  211
 [6,]  220
 [7,]  231
 [8,]  241
 [9,]  250

What I would like to do is to convert the data set into an interval format.
By that I mean a data set in which each row has a 'Start' and a 'Stop' value
that indicates the time units in which the 'dose' is constant. For example,
my orig.data example would now be:

int.data -  cbind(Id = c(rep(1,2), rep(2,4)), Start = c(1,4,1,2,3,5), Stop
= c(3,4,1,2,4,5), dose = c(1,0,1,0,1,0))

int.data
 Id Start Stop dose
[1,]  1 131
[2,]  1 440
[3,]  2 111
[4,]  2 220
[5,]  2 341
[6,]  2 550

Basically, this implies collapsing rows that have the same Id and dose
and creating Start and Stop to index the time.

While I can write a clumsy routine with multiple loops to do it, it will be
inefficient and will not work for large data set.

I wonder if people know of a function that would reshape my data set from
'long' to 'interval'?

Best,

MP

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gamm (mgvc) and time-varying coefficient model

2009-03-01 Thread Marie-Pierre Sylvestre

Dear R users,

I have repeated measurements on individuals. I want to estimate the
time-varying effect of a factor variable X (taking three levels), e.g. a
model in the spirit of Hastie and Tibshirani (1993). 

I am considering using the package mgvc which implements generalized
additive models, especially the function gamm, which estimates
generalized additive mixed models, and thus, can deal with the
correlated repeated measures within individuals.

However, I am confused as to how to specify the time-varying coefficient
part of the formula. According to the mgvc documentation (p. 35):
by variables are the means for constructing 'varying-coefficient
models' (geographic regression models) and for letting smooths
'interact' with factors or parametric terms.

Suppose that y is the response variable, id identifies individuals,
x is the three-level factor variable and time indexes the chronology
of responses.

Is this model estimating the time-varying coefficient of x? If it is
not, how should I specify the model?

mod - gamm( y ~ factor(x) + s(time, by=factor(x)), data=mydata,
random=list(patient=~ 1), correlation = corAR1())

Best,
MP


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gamm (mgvc) and time-varying coefficient model

2009-03-01 Thread Marie-Pierre Sylvestre

 

Dear R users,

I have repeated measurements on individuals. I want to estimate the
time-varying effect of a factor variable X (taking three levels), e.g. a
model in the spirit of Hastie and Tibshirani (1993). 

I am considering using the package mgvc which implements generalized
additive models, especially the function gamm, which estimates
generalized additive mixed models, and thus, can deal with the
correlated repeated measures within individuals.

However, I am confused as to how to specify the time-varying coefficient
part of the formula. According to the mgvc documentation (p. 35):
by variables are the means for constructing 'varying-coefficient
models' (geographic regression models) and for letting smooths
'interact' with factors or parametric terms.

Suppose that y is the response variable, id identifies individuals,
x is the three-level factor variable and time indexes the chronology
of responses.

Is this model estimating the time-varying coefficient of x? If it is
not, how should I specify the model?

mod - gamm( y ~ factor(x) + s(time, by=factor(x)), data=mydata,
random=list(patient=~ 1), correlation = corAR1())

Best,
MP

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] creating a package with internal (hiden) S4 classes and methods

2008-09-11 Thread Marie-Pierre Sylvestre


Dear R users,

I am writing my first R package and it finally past the R CMD check. 
Before sending it to CRAN, I would like to check a few issues.


1. I created s4 classes and methods in the package but I want only 1 
function (foo) to be available to users and documented in the manual 
that goes with the package. I put foo in 1 file and the rest (internal 
functions, classes and methods) in a file called mypackage-internal.R.


For the .Rd file, I created one for each class, method and internal 
functions, and I wrote  \keyword{internal} and the end.


Finally, I have created a NAMESPACE that export my function f00. I 
thought it would work but it created a problem with the classes and 
methods not being declared so I added Classes(''myclass) and 
Methods(my method) to the NAMESPACE file.


Does this make sense as to a way to hide classes and methods from 
users, and avoid describing them?


2. Everything goes smoothly when I R CMD build and R CMD check. However, 
when I look at the mypackage-manual.tex file that the R CMD check 
created, I see all my internal classes, methods and functions listed. 
This is not what I want, I only want foo.


I know that I need to send the tar.gz file to CRAN but I am clueless as 
to how the package manual (pdf) will be created. Will it look like the 
mypackage-manual.tex? If so, how do I make sure that only the package 
DESCRIPTION and the foo function are listed?


Please do not simply refer me to the manual 'Writing R extensions!'. I 
have read it carefully but I have to admit that some of the stuff 
explained there is beyond me. I am lost.


many thanks in advance,

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] building a package that contains S4 classes and methods

2008-09-09 Thread Marie Pierre Sylvestre

Hello R users,

I am trying to make a my first package and I get an error that I can
understand. The package is build out of three files (one for functions, 1
for s4 classes and 1 for s4 methods).

Once I source them I run 

package.skeleton( name=TDC )

within a R session and I get 

Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './TDC/Read-and-delete-me'.
Warning messages:
1: In dump(internalObjs, file = file.path(code_dir,
sprintf(%s-internal.R,  :
  deparse of an S4 object will not be source()able
2: In dump(internalObjs, file = file.path(code_dir,
sprintf(%s-internal.R,  :
  deparse of an S4 object will not be source()able
3: In dump(internalObjs, file = file.path(code_dir,
sprintf(%s-internal.R,  :
  deparse of an S4 object will not be source()able
4: In dump(internalObjs, file = file.path(code_dir,
sprintf(%s-internal.R,  :
  deparse may be incomplete


I keep going in spite of the warnings with 
R CMD check --no-examples TDC

and I get 
* checking for working pdflatex ... OK
* using log directory
'/home/mariepierre/Packages/PermAlgo/PermAlgo/PermAlgo2/TDC.Rcheck'
* using R version 2.7.1 (2008-06-23)
* using session charset: UTF-8
* checking for file 'TDC/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'TDC' version '1.0'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking whether package 'TDC' can be installed ... ERROR
Installation failed.

The error file says:
 
* Installing *source* package 'TDC' ...
** R
** preparing package for lazy loading
Error in parse(n = -1, file = file) : unexpected '' at
102: `.__C__BindArgs` -
103: 
Calls: Anonymous - code2LazyLoadDB - sys.source - parse
Execution halted
ERROR: lazy loading failed for package 'TDC'
** Removing
'/home/mariepierre/Packages/PermAlgo/PermAlgo/PermAlgo2/TDC.Rcheck/TDC'

The problem is with my classes and methods. The respective files contain:

setClass(BindArgs,  signature( function ))
setClass(BindArgs2, signature( function )) 

and

setMethod(initialize, BindArgs, function( .Object, f, ... )
  callNextMethod( .Object, function( x ) f( x, ... ) )) 

setMethod(initialize, BindArgs2, function( .Object, f, ...)
  callNextMethod( .Object, function( x, y ) f( x, y, ... ) )) 

Everything works well within a R session but I can build the package.

If I look at the internal R file that this created I get

`.__C__BindArgs` -
S4 object of class structure(classRepresentation, package = methods)
`.__C__BindArgs2` -
S4 object of class structure(classRepresentation, package = methods)
`.__M__initialize:methods` -
S4 object of class structure(MethodsList, package = methods)
`.__T__initialize:methods` -
environment

Well, let just say that I am new to classes so this confuses me greatly. I
have checked the documentation and tried a few things but I reached my
personal limits!

I am using R 2.7.1 on Linux Fedora 8.

Any comments on what is happening and/or help would be greatly appreciated.

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Tidying up code - Warning message: deparse may be incomplete

2008-08-28 Thread Marie Pierre Sylvestre

Dear R users,

I am currently writing a R package and to do so I am following the
guidelines in manual 'Writing R extensions'.

In Section 3.1, it is suggested to tidy up the code using a file
containing the following:
options(keep.source = FALSE)
source(myfuns..R)
dump(ls(all = TRUE), file = new.myfuns.R)

I have done this for my own packages and although it runs, I get several
warnings of the type:

Warning message:
In dump(ls(all = TRUE), file = PermAlgo.R) : deparse may be incomplete

I am clueless as to what this means.

Even if I try to tidy only one function from my code, I get the warning.


E.g. the file lala.R contains only this:
partialHazards - function(t, v, covArray, betas){   exp( covArray[v,t,]
%*% betas ) }

the file tidylala.R contains:

options(keep.source = FALSE)
source(lala.R)
dump(ls(all = TRUE), file = newlala.R)

On Linux I run:

R --vanilla  tidylala.R  

Then I obtain:

Warning message:
In dump(ls(all = TRUE), file = newlala.R) : deparse may be incomplete

The file newlala.R looks like this:

`partialHazards` -
function (t, v, covArray, betas)
{
exp(covArray[v, t, ] %*% betas)
}

What does the warning mean? Can I simply ignore it?

thanks,

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fixing the coefficient of a regressor in formula

2008-05-28 Thread Marie-Pierre Sylvestre


Dear R users,

I want to estimate a Cox PH model with time-dependent covariates so I am 
using a counting process format with the following formula:


Surv(data$start, data$stop, data$event.time) ~ cluster(data$id) +  G1 + 
G2 + G3 + G4 + G5 +G6


Gs represent  a B-spline basis functions so they sum to 1 and I can't 
estimate the model as is without getting the last coefficient to be NA, 
which makes sense given the perfect collinearity.


without getting in lengthy details about my code, let me just say that 
to avoid the colinearity problem,.  I do not want to omit G1 from the 
regression. Instead, I want to fix the regression coefficient of one of 
the regressors, G1, to 1.


I have read the R manual section on formulae but I have not found how to 
do fix a regression coefficient. Conceptually speaking it seems to me 
that it should be simple, and I am sure that someone explained it 
somewhere, but I did not find the proper keywords to find it!


So, does someone know how to fix the coefficient of a regressor in the 
formula for Cox model so that the coefficient is not estimated but still 
taken into account?


Thank you in advance,

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] assigning and saving datasets in a loop, with names changing with i

2007-12-18 Thread Marie Pierre Sylvestre

Dear R users,

I am analysing a very large data set and I need to perform several data
manipulations. The dataset is so big that the only way I can play with it
without having memory problems (E.g. cannot allocate vectors of size...)
is to write a batch script to:

1. cut the data into pieces 
2. save the pieces in seperate .RData files
3. Remove everything from the environment
4. load one of the piece
5. perform the manipulations on it
6. save it and remove it from the environment
7. Redo 4-6 for every piece
8. Merge everything together at the end

It works if coded line by line but since I'll have to perform these tasks
on other data sets, I am trying to automate this as much as I can. 

I am using a loop in which I used 'assign' and 'get' (pseudo code below).
My problem is when I use 'get', it prints the whole object on the screen.
I am wondering whether there is a more efficient way to do what I need to
do. Any help would be appreciated. Please keep in mind that the whole
process is quite computer-intensive, so I can't keep everything in the
environment while R performs calculations.

Say I have 1 big dataframe called data. I use 'split' to divide it into a
list of 12 dataframes (call this list my.list)

my.fun is a function that takes a dataframe, performs several
manipulations on it and returns a dataframe.


for (i in 1:12){
  assign( paste( data, i, sep=),  my.fun(my.list[i]))   # this works
  # now I need to save this new object as a RData. 

  # The following line does not work
  save(paste(data, i, sep = ),  file = paste(  paste(data, i, sep =
), RData, sep=.))
}

  # This works but it is a bit convoluted!!!
  temp - get(paste(data, i, sep = ))
  save(temp,  file = lala.RData)
}


I am *sure* there is something more clever to do but I can't find it. Any
help would be appreciated.

best regards,

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Identifying and characterizing strings of NA in a vector

2007-10-11 Thread Marie-Pierre Sylvestre

Dear R users,

I was wondering if someone could suggest a few lines of code for my problem.

I want to count the number and the length of strings of NA in a vector. 
For example:

vec - c(1, 2, 1, NA, NA, 1, 2, NA, NA, NA, 3, 4, NA, NA)

has 2 strings of NA's of length 2 and 1 string of NA' of length 3.

I can easily count the number of NA's per vector, but I am having a hard 
time counting the number and length of strings of NA's per vector 
without relying heavily on loops. I will have to perform this task for 
many vectors.

Can somebody help?

many thanks,

Marie-Pierre

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Package rms: c-statistic from lrm function with weights

[R] Frequencies for a list of vectors

Re: [R] update.formula() to add interaction terms

[R] update.formula() to add interaction terms

[R] meta-analysis, outcome = OR associated with a continuous independent variable

[R] pairs and panel.smooth for two groups

[R] Converting a data set from 'long' format to 'interval' format

[R] gamm (mgvc) and time-varying coefficient model

[R] gamm (mgvc) and time-varying coefficient model

[R] creating a package with internal (hiden) S4 classes and methods

[R] building a package that contains S4 classes and methods

[R] Tidying up code - Warning message: deparse may be incomplete

[R] Fixing the coefficient of a regressor in formula

[R] assigning and saving datasets in a loop, with names changing with i

[R] Identifying and characterizing strings of NA in a vector

15 matches

Site Navigation

Mail list logo

Footer information