1) I'd like to take a look at what is involved before commenting on efficiency issues. They may not be what I thought they were (or at least, being generic at all may be so big a hit that a few more cases may be immaterial).
2) This + is clearly not commutative.

3) + is part of a group generic. I think it is a little awkward to change the dispatch rules for just one member of the group. That's one argument for character + character being different from character + number. I'm not much in favour of adding special cases.

Brian

On Fri, 25 Aug 2006, Martin Maechler wrote:

> >>>>> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
> >>>>>     on Fri, 25 Aug 2006 13:18:42 -0400 writes:
>
>     Duncan> On 8/25/2006 12:31 PM, Martin Maechler wrote:
>     >> This thread reminds me of an old recurring (last May!)
>     >> theme which maybe fits well to Friday late afternoon...
>     >>
>     >> There have been propositions to make "+" work in S (and
>     >> R) like in some other languages, namely for character
>     >> (vectors),
>     >>
>     >> a + b := paste(a,b, sep="")
>     >>
>     >> IIRC, when this theme came up last, the one argument
>     >> against it was the penalty of method dispatch that we
>     >> were not willing to pay for something as fundamentally
>     >> speed-important as "+" -- which is a .Primitive in R
>     >> exactly for that reason of efficiency.
>     >>
>     >> But then, we actually do dispatch for "+" -- internally
>     >> in C code via DispatchGroup() --- but only if we need, so
>     >> not when usual numeric/complex arguments are used.
>     >>
>     >> I think - but may be wrong - it should be possible to
>     >> also check very fast for two "character" arguments and in
>     >> that case do a fast version of paste(a, b, sep="").
>
>     Duncan> But for consistency shouldn't this work if only one
>     Duncan> of the args is character, coercing the other to
>     Duncan> character? E.g. we have
>
>     Duncan> > "2" > 10
>     Duncan> [1] TRUE
>
> yes. But see also below
>
>     >> When this last came up (in May), Brian said this about
>     >> the fact that you could not just simply define
>     >> "+.character"
>     >>
>     >>>> I would think that the intention was also to positively
>     >>>> discourage messing with the basics of R, as if you were
>     >>>> able to do this, erroneous uses would likely not get
>     >>>> caught.
>     >> (https://stat.ethz.ch/pipermail/r-help/2006-May/104751.html)
>     >> and subsequently
>     >> (https://stat.ethz.ch/pipermail/r-help/2006-May/104754.html)
>     >> gave an example for this
>     >>
>     >>>> 2 + x, for example, where x is not numeric.
>
>     Duncan> This is a valid concern, but I think the clarity
>     Duncan> obtained by coding paste operations using + is worth
>     Duncan> it.
>
>     Duncan> For example, the first instance of paste(a, b,
>     Duncan> sep="") I see in the source is
>
>     Duncan> is.ALL(structure(1:7, names = paste("a",1:7,sep="")))
>
>     Duncan> in base/demo/is.things.R
>
>     Duncan> which I find clearer as
>
>     Duncan> is.ALL(structure(1:7, names = "a" + 1:7))
>
>     Duncan> But then I'm used to using + for strings from
>     Duncan> Borland's Pascal extensions; to a C-speaker the
>     Duncan> meaning may not be so obvious.
>
> yes. I think however if we keep speed and clarity and catching
> user errors all in mind, it would be enough - and better - to
> only dispatch to paste(.,.) when both arguments are character
> (vectors), i.e., the above case would need
>    "a" + as.character(1:7)  or  "a" + paste(1:7)  or  "a" + format(1:7)
> which after all is really clearer, even more so for cases like
>    "1" + 2
> which I'd rather keep giving errors.
>
> If Char + Num should work like above, then also
> Num + Char should (since after all, "+" should be commutative
> apart from floating point precision issues).
>
> and so the internal C code gets a bit more complicated and slightly
> slower.. something we had in mind we should strongly avoid...
>
> Martin
>
>     >> I wonder however, if we do this in C, and basically only
>     >> go into the paste-branch when both arguments are
>     >> characters, if we wouldn't get to a nice useful solution
>     >> without a noticeable performance penalty.
>     >>
>     >> This would also solve my other slightly related uneasiness:
>     >> Many times in the past, when using paste(..., sep='')
>     >> in function definitions I had wanted this (empty sep) to
>     >> be the default and to have an easier, more readable way
>     >> to achieve the same.
>     >>
>     >> But then these are all just musings at the end of the
>     >> week...
>     >>
>     >> Martin Maechler, ETH Zurich
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
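
For illustration only, here is a minimal sketch in R of the rule proposed in the quoted message: concatenate when both arguments are character vectors, and signal an error otherwise. It uses a hypothetical user-defined operator `%+%` (an assumption made for this example, not part of base R) so that the primitive "+" is left untouched. It says nothing about the cost of doing the same check in the C code behind "+".

```r
# Hypothetical operator, not part of base R: paste only when BOTH
# arguments are character vectors; anything else is an error.
`%+%` <- function(a, b) {
    if (is.character(a) && is.character(b)) {
        paste(a, b, sep = "")
    } else {
        stop("%+% needs two character arguments")
    }
}

# The caller coerces explicitly, as in "a" + as.character(1:7) above.
"a" %+% as.character(1:7)    # "a1" "a2" "a3" "a4" "a5" "a6" "a7"

# Concatenation is not commutative.
"a" %+% "b" == "b" %+% "a"   # FALSE

# Mixed character/numeric input is rejected rather than coerced.
res <- tryCatch("1" %+% 2, error = function(e) "error")
res                          # "error"
```

Under this rule "a" %+% 1:7 fails, so a call such as the is.things.R example would have to spell out the coercion.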