There are real examples; they are all fairly obscure. It can't be a big problem because the standard formal argument name for a data frame in modelling and graphics functions is 'data'. That's actually a more serious problem than the function called data() -- having local and global variables with the same name won't confuse R, but it can easily confuse you.
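A minimal sketch of the point that R itself is not confused, assuming a fresh session with the standard datasets and stats packages available. The variable names and the toy columns x and y are made up for illustration:

```r
# A data frame stored under the name 'data' (toy values, for illustration only).
data <- data.frame(x = 1:10, y = 2 * (1:10) + 1)

# In call position R looks for a function, so the data frame above is skipped
# and utils::data() is still the thing that gets called.
data(ToothGrowth)
has_tooth <- exists("ToothGrowth")

# Passing the data frame to the formal argument that is also named 'data'.
fit <- lm(y ~ x, data = data)
slope <- unname(coef(fit)["x"])
```

Both uses of the name work; the three different meanings of 'data' on that last call are the part a human reader is more likely to trip over.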
Possibilities for R getting confused include
1. The functions for environment access by name, e.g. exists() and get(),
don't by default check the type of the object they find.
2. bquote() and substitute() substitute before evaluating and could get
confused.
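The first item can be sketched as follows. This assumes a fresh session with no object called 'data' in the workspace; the mode argument of exists() and get() is the documented way to restrict the lookup:

```r
# Assumes a fresh session: no object named 'data' in the workspace yet.
found_any  <- exists("data")                 # finds the function utils::data()
found_list <- exists("data", mode = "list")  # only list-like objects (data frames have mode "list")

# Now store a data frame under the same name.
data <- data.frame(x = 1:3)

got_default  <- get("data")                     # first match by name: the data frame
got_function <- get("data", mode = "function")  # skips non-functions: utils::data
```

So code that tests exists("data") to decide whether the user has supplied a data set will be fooled unless it also specifies a mode, or checks the class of what get() returns.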
There used to be real problems in S when certain function names were used as
data names. Then there was a period of aversive conditioning by irritating
warnings. As a result, I still avoid 'c' and 't' as variable names.
You could call your data frames 'df' -- many of the people who complain about
'data' don't realise that df() is the density function of the F distribution :)
-thomas
On Tue, 13 Jan 2009, Ista Zahn wrote:
On Tue, Jan 13, 2009 at 10:23 AM, jim holtman <[email protected]> wrote:

How about this:

    data(ToothGrowth)
    ls()
    [1] "ToothGrowth"
    data <- function(x){invisible(NULL)}
    data(ToothGrowth)
    ls()
    [1] "data"

Yep, that sure does cause a problem alright. Is it the case that problems arise when you name a function with the same name as an existing function? Or are there cases where naming data.frames, vectors, matrices, etc. can also cause problems? I hope I'm not being annoying -- I'm just trying to determine if I need to break my habit of naming data.frames "data".

Thanks,
Ista

On Tue, Jan 13, 2009 at 9:53 AM, Ista Zahn <[email protected]> wrote:

From: baptiste auguie <[email protected]>
To: Dimitris Rizopoulos <[email protected]>
Date: Tue, 13 Jan 2009 09:38:09 +0000
Subject: Re: [R] indexing question

you can also look at subset,

    my.data.frame <- data.frame(a=rnorm(10),b=factor(sample(letters[1:4], 10, replace=T)))
    str(my.data.frame)
    my.data.frame[my.data.frame$b == "a", ]
    subset(my.data.frame, b == "a")

by the way, it is probably safer not to use "data" as a variable name as it is also a function.

I've often wondered about this. The thing is, I've never run into a problem with this. For example:

    ls()
    character(0)
    data(ToothGrowth)
    ls()
    [1] "ToothGrowth"
    rm(ToothGrowth)
    ls()
    character(0)
    data <- data.frame(1:10, 101:110)
    data(ToothGrowth) #works just the same
    ls()
    [1] "data" "ToothGrowth"

In this example the data command works just the same the second time, even though I have a data.frame named data. Can someone give an example where this causes a problem?

Thanks,
Ista

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Thomas Lumley
Assoc. Professor, Biostatistics
[email protected]
University of Washington, Seattle

