Re: [R] how exactly does 'identify' work?

2010-11-18 Thread Greg Snow
I think that your problem comes from a misunderstanding.

The general rule is that you give the plot command 2 vectors, x and y (though 
you can give it the vectors separately, or together in a list or matrix).  If 
you give plot only a single vector then it will use this as the y vector and 
use the sequence of integers from 1 to the length of y as the x variable.  Now 
in your example that matches your x exactly. 

In your working examples you either give the function both x and y, or only y 
and the generated sequence for x happens to match your x for this specific 
example, but not in general.  For your non-working examples you give only the x 
variable, which is then used as the y variable and the sequence is generated 
for x, so it will only identify points along the diagonal.  It does not know 
where to find your y variable.

The fix: always give both x and y (the fact that your examples worked with only 
y is due to your specific example, not anything general).
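
To make the rule concrete, here is a minimal sketch of how a lone vector gets turned into coordinates. It assumes, as far as I understand the internals, that plot() and identify() both pass their arguments through xy.coords(); the seed and the names only_y and only_x are made up for this illustration and are not from the original post.

```r
# Sketch: what happens when only one vector is supplied.
set.seed(1)                        # arbitrary seed, for repeatability only
x <- 1:26
y <- -23.5 + 0.45 * x + rnorm(26)

only_y <- xy.coords(y)             # x becomes the index 1..26, y stays y
only_x <- xy.coords(x)             # x becomes the index, and your x is used as y

all(only_y$x == seq_along(y))      # generated index on the x axis
all(only_x$y == x)                 # your x has ended up on the y axis

if (interactive()) {
  plot(x, y)
  identify(x, y, labels = LETTERS) # both coordinates given, nothing to guess
}
```

With x = 1:26 the generated index happens to equal x, which is why the y-only calls appeared to work.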

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of casperyc
 Sent: Tuesday, November 16, 2010 5:03 PM
 To: r-help@r-project.org
 Subject: [R] how exactly does 'identify' work?
 
 
 Hi all,
 
 #
 test=data.frame(x=1:26,y=-23.5+0.45*(1:26)+rnorm(26))
 rownames(test)=LETTERS[1:26]
 attach(test)
 #test
 test.lm=lm(y~x)
 
 plot(test.lm,2)
 identify(test.lm$res,,row.names(test))
 # not working
 
 plot(x,y)
 identify(x,y,row.names(test))
 # works fine
 identify(y,,row.names(test))
 # works fine
 identify(x,,row.names(test))
 # not working
 identify(y,,y)
 # works
 identify(x,,y)
 # not working
 
 #
 
 My guess is that identify takes the object 'x' (the first argument) as
 the thing that is on the y axis.
 
 However, I have tried many, many ways to get the LETTERS identified in
 the QQ-plot (plot(test.lm,2)), and it never works.
 
 I have even tried extracting the standardized residuals using the
 'stdres' function from library(MASS) and putting them as the first
 argument in identify; that still failed...
 
 Is there any means to achieve this?
 
 Thanks!
 
 casper
 --
 View this message in context:
 http://r.789695.n4.nabble.com/how-exactly-does-identify-work-tp3045953p3045953.html
 Sent from the R help mailing list archive at Nabble.com.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how exactly does 'identify' work?

2010-11-18 Thread casperyc

Hi, 

I think the problem is:

1 - When a linear model is fitted and we plot qqnorm(test.lm$res),
we don't know what values are actually being used on the y-axis, or
how to refer to the ‘Index’ on the x-axis. So I don't know how to
refer to the x and y coordinates in the identify function.

2 - I have tried using the stdres function in the MASS library to extract
the standardised residuals and plot them manually (using the plot function).
The problem with this approach is that we have to SORT the residuals in
increasing order first to reproduce the same qqnorm plot. In that case the
identify function works, but sorting CHANGES the order, i.e. it won't
return the original A:Z (row.names) labels.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-exactly-does-identify-work-tp3045953p3049357.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how exactly does 'identify' work?

2010-11-18 Thread Duncan Murdoch

On 18/11/2010 1:50 PM, casperyc wrote:

> Hi,
>
> I think the problem is:
>
> 1 - When a linear model is fitted and we plot qqnorm(test.lm$res),
> we don't know what values are actually being used on the y-axis, or
> how to refer to the ‘Index’ on the x-axis. So I don't know how to
> refer to the x and y coordinates in the identify function.


You could look at qqnorm.default to figure those things out, but it is 
probably difficult to do.  You'd be better off using locator() to find 
the coordinates of a mouse click, and plotting the label using text().


For a simple example,

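x <- rnorm(100, mean=10, sd=2)
qqnorm(x)
repeat {
  pt <- locator(1)              # wait for one mouse click
  if (!length(pt$x)) break      # stop when the locator is ended without a click
  # label the click with the index of the observation closest to the clicked y
  text(pt, labels=which.min( abs(x - pt$y) ) )
}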
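
The loop above matches a click on its y value only. A possible variant, sketched here under my own assumptions, matches on both coordinates by using the x and y values that qqnorm() returns. The helper nearest_point() is hypothetical (it is not a base R function), and the distance is measured in plot units, so it may behave oddly when the two axes have very different scales.

```r
# Hypothetical helper (not part of base R): index of the plotted point
# closest to a clicked position, using both coordinates.
nearest_point <- function(qq, px, py) {
  which.min((qq$x - px)^2 + (qq$y - py)^2)
}

set.seed(42)                             # arbitrary seed
x  <- rnorm(100, mean = 10, sd = 2)
qq <- qqnorm(x, plot.it = interactive()) # coordinates are returned either way

if (interactive()) {
  repeat {
    pt <- locator(1)                     # one mouse click
    if (is.null(pt)) break               # locator ended without a click
    text(pt, labels = nearest_point(qq, pt$x, pt$y), pos = 3)
  }
}
```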

Duncan Murdoch




Re: [R] how exactly does 'identify' work?

2010-11-18 Thread casperyc

Yes, I tried to modify the 2L part of plot.lm:

###
if (show[2L]) {
    ylim <- range(rs, na.rm = TRUE)
    ylim[2L] <- ylim[2L] + diff(ylim) * 0.075
    qq <- qqnorm(rs, main = main, ylab = ylab23, ylim = ylim,
        ...)
    if (qqline)
        qqline(rs, lty = 3, col = "gray50")
    if (one.fig)
        title(sub = sub.caption, ...)
    mtext(getCaption(2), 3, 0.25, cex = cex.caption)
    if (id.n > 0)
        text.id(qq$x[show.rs], qq$y[show.rs], rs)
###

but I didn't get very far.

I could just use text to add the labels;
I just don't understand why identify does not 'identify' the residuals of
a linear model in the qqnorm plot ...

Thanks.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-exactly-does-identify-work-tp3045953p3049385.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how exactly does 'identify' work?

2010-11-18 Thread Greg Snow
Did you read the help page for qqnorm?  The return value has the x and y 
coordinates used, so you can just do something like:

 tmp <- qqnorm( resid(test.lm) )
 identify(tmp, , names(resid(test.lm)) )

Or the plot.lm function has an argument id.n that automatically labels the n 
most extreme values:

 plot( test.lm, 2, id.n=10 )

Those both worked in my tests, if they are not working for you then send a 
reproducible example (include data, see ?dput) and maybe we can help further.
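
As a rough sketch of why no sorting is needed: the list returned by qqnorm() keeps the residuals in their original order and pairs each with its theoretical quantile, so labels taken from the model line up directly. The seed is arbitrary, and the model is fitted with data = test here (an assumption of this sketch) so that the residuals carry the row names.

```r
set.seed(1)                                # arbitrary seed
test <- data.frame(x = 1:26, y = -23.5 + 0.45 * (1:26) + rnorm(26))
rownames(test) <- LETTERS
test.lm <- lm(y ~ x, data = test)

r  <- resid(test.lm)
qq <- qqnorm(r, plot.it = interactive())   # coordinates are returned either way

all(qq$y == r)                             # y holds the residuals, unsorted
identical(order(qq$x), order(unname(r)))   # quantiles follow the residuals' ranks

if (interactive()) {
  identify(qq, labels = names(r))          # click points to see A..Z labels
}
```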

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] how exactly does 'identify' work?

2010-11-18 Thread Greg Snow
One additional point on your original post.  You added row names to the test 
data frame, but did not specify the name of the data frame when you did the 
regression, rather you attached the data frame.  When you did this lm found x 
and y, but did not find the rownames, so the diagnostic plot just used numbers 
to label the extreme points.

This is just one of the many pitfalls of using attach rather than the more 
direct methods.  Try your example again, but instead of attaching the data frame, 
use it in the data argument to lm:

 test.lm <- lm( y~x, data=test )

Then when you do plot(test.lm, 2) the most extreme points (3 if you don't 
change the id.n value) will be labeled using the rownames.
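
A small sketch of the difference, with an arbitrary seed and made-up object names (fit_vec, fit_data). The free-standing copies of x and y stand in for what attach(test) makes visible; that substitution is my assumption for the sake of a self-contained example.

```r
set.seed(1)                          # arbitrary seed
test <- data.frame(x = 1:26, y = -23.5 + 0.45 * (1:26) + rnorm(26))
rownames(test) <- LETTERS

# Plain vectors, as lm would see them after attach(test).
x <- test$x
y <- test$y

fit_vec  <- lm(y ~ x)                # no data argument: row names are lost
fit_data <- lm(y ~ x, data = test)   # row names travel with the data frame

head(names(resid(fit_vec)))          # numeric labels
head(names(resid(fit_data)))         # the letters from rownames(test)

if (interactive()) plot(fit_data, 2) # extreme points labelled by row name
```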

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] how exactly does 'identify' work?

2010-11-18 Thread casperyc

Omg!

Yes, it is working now!

tmp <- qqnorm( resid(test.lm) ) 

What a simple, nice trick!!!

Actually, I wasn't looking for the 'i'th label;
I was looking for the 'row.names' as the label,
as I stated in the first post.

 identify(tmp, , row.names(test) ) 

gives the labels I have been trying to get.


THANKS!

casper
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-exactly-does-identify-work-tp3045953p3049507.html
Sent from the R help mailing list archive at Nabble.com.



[R] how exactly does 'identify' work?

2010-11-16 Thread casperyc

Hi all,

#
test=data.frame(x=1:26,y=-23.5+0.45*(1:26)+rnorm(26))
rownames(test)=LETTERS[1:26]
attach(test)
#test
test.lm=lm(y~x)

plot(test.lm,2)
identify(test.lm$res,,row.names(test))
# not working

plot(x,y)
identify(x,y,row.names(test))
# works fine
identify(y,,row.names(test))
# works fine
identify(x,,row.names(test))
# not working
identify(y,,y)
# works
identify(x,,y)
# not working

#

My guess is that identify takes the object 'x' (the first argument) as the
thing that is on the y axis.

However, I have tried many, many ways to get the LETTERS identified in the
QQ-plot (plot(test.lm,2)), and it never works.

I have even tried extracting the standardized residuals using the 'stdres'
function from library(MASS) and putting them as the first argument in
identify; that still failed...

Is there any means to achieve this?

Thanks!

casper
-- 
View this message in context: 
http://r.789695.n4.nabble.com/how-exactly-does-identify-work-tp3045953p3045953.html
Sent from the R help mailing list archive at Nabble.com.
