Remember that nchar() by default returns the number of *bytes*, not the number of characters. I've recently spotted many cases in which nchar() has been used with substr(), which works in characters; this can lead to incorrect results. (This seems to be the commonest use of nchar() in packages.)
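To make the mismatch concrete, here is a small sketch of what can go wrong when a byte count is fed to substr(). The string and the variable names (x, n_bytes, n_chars, wrong, right) are purely illustrative and not taken from any package; the 'type' argument is spelled out in every call so that the result does not depend on which default is in force.

```r
# Illustrative string with one non-ASCII character; the \u escape
# makes R store it as UTF-8, where "\u00ef" occupies two bytes.
x <- "na\u00efve"

n_bytes <- nchar(x, type = "bytes")   # 6: the two-byte character counts twice
n_chars <- nchar(x, type = "chars")   # 5: one per character

# Intention: drop the last character. substr() indexes by character,
# so a byte count overshoots and nothing is dropped.
wrong <- substr(x, 1, n_bytes - 1)    # the whole string, unchanged
right <- substr(x, 1, n_chars - 1)    # the string minus its final "e"

print(c(n_bytes = n_bytes, n_chars = n_chars))
print(c(wrong = wrong, right = right))
```

For pure ASCII input the two counts agree, which is why this kind of bug tends to go unnoticed until non-ASCII data turns up.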
There were two reasons why nchar() was left defaulting to bytes when we allowed MBCSs in R:

1) Many of the uses are of the form if(nchar(x)) or if(nchar(x)==0) or even if(nchar(x) != 0). Computing the length of a string is an inefficient way to find out if it is non-empty, especially if it has to be converted to wchars to do so.

2) Once you allow multibyte characters, not all character strings are valid, and for those nchar(x, "c") is NA. Not much code has been written to take into account the possibility that nchar() might return an NA.

Despite these reasons, it seems that the dangers of incorrect use outweigh them. So for 2.6.0

- There is a new function nzchar() which provides a quick test of non-zero number of characters.

- The default becomes nchar(type="chars").

It seems that nchar() is used quite often to lay out 'printed' or graphical output. For that, normally nchar(type="width") is what is needed.

None of this is an issue in single-byte locales or for ASCII text in UTF-8 or the Windows' CJK locales, but please bear in mind that you cannot assume such for a public package. (The assumption that ASCII code is represented in single bytes is pretty widespread, but at some point we may want to support Windows' native UCS-2 encoding for which it is not true.)

The best advice is to use the 'type' argument for all uses of nchar() in public code unless perhaps you are sure only ASCII data will ever be encountered.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel