On 5/27/07, Robert A. LaBudde <[EMAIL PROTECTED]> wrote:
> As I was working through elementary examples, I was using dataset
> "plasma" of package "HSAUR".
> In performing a logistic regression of the data, and making the
> diagnostic plots (R-2.5.0)
> data(plasma,package='HSAUR')
> plasma_1 <- glm(ESR ~ fibrinogen * globulin, data=plasma, family=binomial())
> layout(matrix(1:4,nrow=2))
> plot(plasma_1)
> I find that data points corresponding to rownames 17 and 23 are
> outliers and high leverage.
> I would then like to perform a fit without these two rows.
> In principle this should be easy, using an update() with subset=-c(17,23).
> The problem is that the rownames in this dataset are not ordered,
> and, in fact, the relevant rows are 30 and 31, not 17 and 23.
> This brings up the following (elementary?) questions:
> 1. How do you reference rows in "subset=" for which you know the
> rownames, but not the row numbers?

Use a logical vector built from the rownames (negate it with ! in
subset= to drop those rows rather than keep them):

   rownames(plasma) %in% c(17, 23)
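
As a rough sketch of how such a logical vector plugs into update(), here
is the same idea on the built-in mtcars data, so it runs without HSAUR
installed. The model (am ~ wt), the two car names, and the object names
fit_all, fit_trim and drop_names are only illustrative assumptions, not
part of the original question:

```r
# Illustrative stand-in for the plasma fit: a logistic regression on mtcars.
fit_all <- glm(am ~ wt, data = mtcars, family = binomial())

# Rows to exclude, identified by rowname rather than by position
# (these two names are arbitrary choices for the example).
drop_names <- c("Mazda RX4", "Fiat 128")

# Negate the logical vector so the named rows are dropped, not kept.
fit_trim <- update(fit_all, subset = !(rownames(mtcars) %in% drop_names))

# The refit should use two fewer observations than the original fit.
nobs(fit_all)
nobs(fit_trim)
```

For the plasma model the analogous call would presumably be
update(plasma_1, subset = !(rownames(plasma) %in% c(17, 23))).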

> 2. How do you discover the rows corresponding to particular
> rownames? (Using plasma[rownames(plasma)==17,] shows the data, but
> NOT the row number!) (Probably the same answer as in Q. 1 above.)

  which(rownames(plasma) %in% c(17, 23)) # 30, 31
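
One detail worth noting: which() returns positions in row order, while
match() returns them in the order the names were asked for. A small
sketch on a made-up data frame (df and its values are hypothetical,
chosen only so the rownames are out of order):

```r
# Toy data frame whose rownames are numbers stored out of order.
df <- data.frame(x = c(10, 20, 30, 40, 50),
                 row.names = c("3", "17", "1", "23", "2"))

# Positions of the rows named "17" and "23", in row order.
pos_which <- which(rownames(df) %in% c("17", "23"))   # 2 4

# Positions in the order the names are listed in the first argument.
pos_match <- match(c("23", "17"), rownames(df))       # 4 2

pos_which
pos_match
```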

> 3. How do you sort (order) the rows of an existing data frame so that
> the rownames are in order?

  plasma[order(as.numeric(rownames(plasma))), ]
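
The as.numeric() matters because rownames are stored as character
strings, and ordering strings compares them character by character
instead of by numeric value. A minimal sketch on a made-up data frame
(df, sorted_chr and sorted_num are hypothetical names for illustration):

```r
# Toy data frame with numeric-looking rownames stored out of order.
df <- data.frame(x = c(10, 20, 30, 40, 50),
                 y = c("a", "b", "c", "d", "e"),
                 row.names = c("3", "17", "1", "23", "2"))

# Ordering the raw (character) rownames sorts them as strings,
# so "17" lands before "2".
sorted_chr <- df[order(rownames(df)), ]

# Converting to numeric first gives the intended order: 1, 2, 3, 17, 23.
sorted_num <- df[order(as.numeric(rownames(df))), ]

rownames(sorted_chr)
rownames(sorted_num)
```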

R-help@stat.math.ethz.ch mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
