Re: [Rd] RFC: adding an 'exact' argument to [[
In addition to $, which was mentioned in this thread, there is also attr,
e.g.

    names(attributes(CO2))
    [1] "names"     "row.names" "class"     "formula"   "outer"     "labels"
    [7] "units"
    attr(CO2, "f") # matches formula
    uptake ~ conc | Plant

On 5/17/07, Seth Falcon [EMAIL PROTECTED] wrote:
> Hi all,
>
> One of the things I find most problematic in R is the partial matching
> of names in lists. Robert and I have discussed this and we believe
> that having a mechanism that does not do partial matching would be of
> significant benefit to R programmers. To that end, I have written a
> patch that modifies the behavior of [[ as follows:
>
> 1. [[ gains an 'exact' argument with default value NA
>
> 2. Behavior of 'exact' argument:
>
>    exact=NA     partial matching is performed as usual, however, a
>                 warning will be issued when a partial match occurs.
>                 This is the default.
>
>    exact=TRUE   no partial matching is performed.
>
>    exact=FALSE  partial matching is allowed and no warning issued
>                 if it occurs.
>
> This change has been discussed among R-core members and there
> appeared to be a general consensus that this approach was a good way
> to proceed. However, we are interested in other suggestions from the
> broader R developer community.
>
> Some additional rationale for our approach:
>
> Lists are used as the underlying data structures in many R programs
> and in these cases the named elements are not a fixed set of things
> with a fixed set of names. For these programs, [[ will be used with
> an argument that gets evaluated at runtime and partial matching here
> is almost always a disaster. Furthermore, dealing with data that has
> common prefixes happens often and is not an exceptional circumstance
> (a precondition for partial matching issues).
>
> We have tested a similar patch that simply eliminated partial matching
> for [[ on all CRAN and Bioconductor packages and did not see any
> obvious failures.
>
> A downside of this approach is that S4 methods on [[ will need to be
> modified to accommodate the new signature. However, by adding an
> argument, we are able to move more slowly towards a non-partially
> matching [[ (eventually, the default could be exact=TRUE, but that is
> a discussion for another day).
>
> + seth
>
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
> http://bioconductor.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
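The hazard described in the rationale above, a key computed at runtime that happens to be a prefix of a stored name, can be sketched as follows. This is a minimal illustration and not code from the thread: the list `cache`, its names, and the object `x` are hypothetical, and `exact` is passed explicitly so that the result does not depend on the default used by a particular R version.

```r
# Hypothetical lookup table whose names share a common prefix.
cache <- list(sample10 = "ten", sample20 = "twenty")

key <- "sample1"  # computed at runtime; NOT a name in 'cache'

# With partial matching allowed, the lookup silently returns the
# entry stored under "sample10".
partial_hit <- cache[[key, exact = FALSE]]

# With exact matching, the key is reported as absent (NULL for a list).
exact_hit <- cache[[key, exact = TRUE]]

# attr() behaves similarly and has its own 'exact' argument.
x <- structure(1:3, formula = "y ~ x", units = "kg")
attr_partial <- attr(x, "f")              # partial match to "formula"
attr_exact <- attr(x, "f", exact = TRUE)  # no match: NULL
```

In package code, where keys are rarely typed by hand, the exact form is usually what is intended.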
Re: [Rd] RFC: adding an 'exact' argument to [[
Hi again,

Robert has committed the proposed patch to R-devel. So [[ now has an
'exact' argument and the behavior is as described:

Seth Falcon [EMAIL PROTECTED] writes:
> 1. [[ gains an 'exact' argument with default value NA
>
> 2. Behavior of 'exact' argument:
>
>    exact=NA     partial matching is performed as usual, however, a
>                 warning will be issued when a partial match occurs.
>                 This is the default.
>
>    exact=TRUE   no partial matching is performed.
>
>    exact=FALSE  partial matching is allowed and no warning issued
>                 if it occurs.

+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
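The three settings of 'exact' described above can be compared side by side. This is a small sketch under stated assumptions: the lists `x` and `y` are made up for illustration, and `exact` is always given explicitly because the default differs between the R-devel version discussed here (NA) and later released versions of R.

```r
x <- list(fitted.values = c(1.5, 2.5), residuals = c(-0.5, 0.5))

# exact = TRUE: only a full name matches; a prefix gives NULL for a list.
strict <- x[["fitted", exact = TRUE]]

# exact = FALSE: a unique prefix matches silently.
lenient <- x[["fitted", exact = FALSE]]

# exact = NA: the same match is made, but a warning is signalled.
# The handler records that fact and then muffles the warning.
warned <- FALSE
with_na <- withCallingHandlers(
  x[["fitted", exact = NA]],
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

# An ambiguous prefix matches nothing, even when partial matching is allowed.
y <- list(abc1 = 1, abc2 = 2)
ambiguous <- y[["abc", exact = FALSE]]
```

Here `strict` and `ambiguous` are NULL, while `lenient` and `with_na` both hold the `fitted.values` element.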
Re: [Rd] RFC: adding an 'exact' argument to [[
On Thu, 17 May 2007, Duncan Murdoch wrote:

> On 5/17/2007 3:54 PM, Prof Brian Ripley wrote:
>> There is a similar issue with argument partial matching. Since we
>> have the source of R one can pretty easily build a version of R which
>> does not have the feature: I have been doing that in conjunction with
>> 'codetools' to do some checking.
>>
>> In both cases there is traditional partial matching: seq(along=) or
>> seq(length=), and $fitted vs $fitted.values. There are not many uses
>> of seq(along.with=) about and vastly more of seq(along=) (although in
>> R using seq_along() is preferable): even in some packages which do use
>> seq(along.with=) there are more instances of seq(along=).
>
> Opinions, please: In another thread I think we have agreement to add
> an extra arg to the vignette() function to limit it to attached
> packages. By analogy with other similar functions, the arg would be
> named all.available. However, I suspect most users would abbreviate
> that to just all. Should I name it all.available for consistency, or
> all in anticipation of a day when exact argument matching will be
> required?

I don't think it will be required. However, the use of all.names etc is
historical, from the days when S (and R) would warn if you used the name
of a local non-function as a function, so an arg 'all' got in the way.
I would use the most intuitive form.

Shortly R-devel will have options to warn on partial matching in $ and
in args.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
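Warnings of the kind mentioned above, for partial matching in $ and in function arguments, can be tried out along these lines. The option names `warnPartialMatchDollar` and `warnPartialMatchArgs` are taken from current R documentation, not from this thread, so treat their availability in any given R version as an assumption. The function `list_items` and the helper `warning_text` are hypothetical and exist only for this sketch.

```r
# Enable warnings for partial matching in `$` and in function arguments,
# remembering the previous settings so they can be restored afterwards.
old <- options(warnPartialMatchDollar = TRUE, warnPartialMatchArgs = TRUE)

# Hypothetical function with a long argument name, in the spirit of the
# all.available question raised for vignette().
list_items <- function(all.available = FALSE) all.available

# Helper: evaluate an expression and return the warning text, or NA if
# no warning was signalled.
warning_text <- function(expr) {
  tryCatch({ expr; NA_character_ }, warning = function(w) conditionMessage(w))
}

fit <- list(fitted.values = c(1.1, 2.2))
dollar_msg <- warning_text(fit$fitted)           # `$` with an abbreviated name
arg_msg <- warning_text(list_items(all = TRUE))  # abbreviated argument name

options(old)
```

Both `dollar_msg` and `arg_msg` should then hold a message about a partial match, which makes abbreviated names easy to find before exact matching is enforced anywhere.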
Re: [Rd] RFC: adding an 'exact' argument to [[
Bill Dunlap [EMAIL PROTECTED] writes:
> This sounds interesting. Do you intend to leave the $ operator alone,
> so it will continue to do partial matching? I suspect that that is
> where the majority of partial matching for list names is done.

The current proposal will not touch $. I agree that most intentional
partial matching uses $ (hopefully only during interactive sessions).
The main benefit of our proposed change is more reliable package code.
For long lists and certain patterns of use, there are also performance
benefits:

    kk <- paste("abc", 1:(1e6), sep="")
    vv = as.list(1:(1e6))
    names(vv) = kk

    system.time(vv[["fooo", exact=FALSE]])
       user  system elapsed
      0.074   0.000   0.074

    system.time(vv[["fooo", exact=TRUE]])
       user  system elapsed
      0.042   0.000   0.042

> It might be nice to have an option that made x$partial warn so we
> would fix code that relied on partial matching, but that is lower
> priority.

I think that could be useful as well. To digress a bit further in
discussing $... I think the argument that partial matching is desirable
because it saves typing during interactive sessions now has a lot less
weight. The recent integration of the completion code gives less typing
and complete names.

+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org
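The timing comparison above can be repeated with a sketch like the one below. It is a rough illustration, not a careful benchmark: the list is smaller than in the message (1e5 rather than 1e6 elements) to keep it quick, each lookup is repeated in a loop so that the elapsed time is measurable, and the variable names are arbitrary. Actual timings depend on the machine and the R version, so none are shown.

```r
n <- 1e5
keys <- paste("abc", seq_len(n), sep = "")
vv <- as.list(seq_len(n))
names(vv) <- keys

reps <- 20

# Look up a name that is absent, with partial matching allowed.
t_partial <- system.time(
  for (i in seq_len(reps)) vv[["fooo", exact = FALSE]]
)

# The same lookup with exact matching only.
t_exact <- system.time(
  for (i in seq_len(reps)) vv[["fooo", exact = TRUE]]
)

# Both lookups fail and return NULL; only the time taken may differ.
miss_partial <- vv[["fooo", exact = FALSE]]
miss_exact <- vv[["fooo", exact = TRUE]]

print(rbind(partial = t_partial, exact = t_exact)[, "elapsed"])
```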
Re: [Rd] RFC: adding an 'exact' argument to [[
On Thu, 17 May 2007, Seth Falcon wrote:

> Bill Dunlap [EMAIL PROTECTED] writes:
>> This sounds interesting. Do you intend to leave the $ operator alone,
>> so it will continue to do partial matching? I suspect that that is
>> where the majority of partial matching for list names is done.
>
> The current proposal will not touch $. I agree that most intentional
> partial matching uses $ (hopefully only during interactive sessions).
> The main benefit of our proposed change is more reliable package code.
> For long lists and certain patterns of use, there are also performance
> benefits:
>
>     kk <- paste("abc", 1:(1e6), sep="")
>     vv = as.list(1:(1e6))
>     names(vv) = kk
>
>     system.time(vv[["fooo", exact=FALSE]])
>        user  system elapsed
>       0.074   0.000   0.074
>
>     system.time(vv[["fooo", exact=TRUE]])
>        user  system elapsed
>       0.042   0.000   0.042
>
>> It might be nice to have an option that made x$partial warn so we
>> would fix code that relied on partial matching, but that is lower
>> priority.
>
> I think that could be useful as well. To digress a bit further in
> discussing $... I think the argument that partial matching is
> desirable because it saves typing during interactive sessions now has
> a lot less weight. The recent integration of the completion code
> gives less typing and complete names.

There is a similar issue with argument partial matching. Since we have
the source of R one can pretty easily build a version of R which does
not have the feature: I have been doing that in conjunction with
'codetools' to do some checking.

In both cases there is traditional partial matching: seq(along=) or
seq(length=), and $fitted vs $fitted.values. There are not many uses of
seq(along.with=) about and vastly more of seq(along=) (although in R
using seq_along() is preferable): even in some packages which do use
seq(along.with=) there are more instances of seq(along=).

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [Rd] RFC: adding an 'exact' argument to [[
On 5/17/2007 3:54 PM, Prof Brian Ripley wrote:
> On Thu, 17 May 2007, Seth Falcon wrote:
>
>> Bill Dunlap [EMAIL PROTECTED] writes:
>>> This sounds interesting. Do you intend to leave the $ operator
>>> alone, so it will continue to do partial matching? I suspect that
>>> that is where the majority of partial matching for list names is
>>> done.
>>
>> The current proposal will not touch $. I agree that most intentional
>> partial matching uses $ (hopefully only during interactive sessions).
>> The main benefit of our proposed change is more reliable package
>> code. For long lists and certain patterns of use, there are also
>> performance benefits:
>>
>>     kk <- paste("abc", 1:(1e6), sep="")
>>     vv = as.list(1:(1e6))
>>     names(vv) = kk
>>
>>     system.time(vv[["fooo", exact=FALSE]])
>>        user  system elapsed
>>       0.074   0.000   0.074
>>
>>     system.time(vv[["fooo", exact=TRUE]])
>>        user  system elapsed
>>       0.042   0.000   0.042
>>
>>> It might be nice to have an option that made x$partial warn so we
>>> would fix code that relied on partial matching, but that is lower
>>> priority.
>>
>> I think that could be useful as well. To digress a bit further in
>> discussing $... I think the argument that partial matching is
>> desirable because it saves typing during interactive sessions now has
>> a lot less weight. The recent integration of the completion code
>> gives less typing and complete names.
>
> There is a similar issue with argument partial matching. Since we have
> the source of R one can pretty easily build a version of R which does
> not have the feature: I have been doing that in conjunction with
> 'codetools' to do some checking.
>
> In both cases there is traditional partial matching: seq(along=) or
> seq(length=), and $fitted vs $fitted.values. There are not many uses
> of seq(along.with=) about and vastly more of seq(along=) (although in
> R using seq_along() is preferable): even in some packages which do use
> seq(along.with=) there are more instances of seq(along=).

Opinions, please: In another thread I think we have agreement to add an
extra arg to the vignette() function to limit it to attached packages.
By analogy with other similar functions, the arg would be named
all.available. However, I suspect most users would abbreviate that to
just all. Should I name it all.available for consistency, or all in
anticipation of a day when exact argument matching will be required?

Duncan Murdoch