[Rd] Character encodings and packages
Since R 2.5.0 it has been possible to declare the encodings of character
strings (at the level of individual elements of a character vector). As a
reminder, here is the announcement in NEWS:

    o   R now attempts to keep track of character strings which are
        known to be in Latin-1 or UTF-8 and print or plot them
        appropriately in other locales. This is primarily intended to
        make it possible to use data in Western European languages in
        both Latin-1 and UTF-8 locales. Currently scan(), read.table(),
        readLines(), parse() and source() allow encodings to be
        declared, and console input in suitable locales is also
        recognized.

        New function Encoding() can read or set the declared encodings
        for a character vector.

Whereas R itself is careful to make use of this, I see very little
recognition of it in packages -- which need to be making use of
translateChar() rather than CHAR(): see the 'Writing R Extensions' manual.
(I see it used in only one package, and that mainly in a copy of base R
code.)

This will become more important as time goes by and more ways are
introduced to generate marked data. In particular, in R 2.7.0 under Windows
'Unicode' data (as used by NT-based versions of Windows, usually UCS-2 but
possibly UTF-16) is translated to UTF-8 and marked as such.

In essence, every time you use CHAR() in a .Call/.External call in a
package you should consider if the data can be non-ASCII and if so how you
want to handle it. The choices are

- to replace CHAR() by translateChar() and handle the string in the native
  encoding of the current locale. This needs the package to depend on
  'R (>= 2.5.0)'.

- to note the declared encoding and handle the string in that encoding.

- to translate the string to UTF-8 and handle it in UTF-8. This will be
  easiest to do in R >= 2.7.0 using the function translateCharUTF8().
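To make concrete why the bytes returned by CHAR() cannot simply be assumed to be in the encoding the package expects, here is a minimal sketch in plain C of what translating a Latin-1 string to UTF-8 involves at the byte level. The helper latin1_to_utf8() is hypothetical and is not part of R's API; inside a package the translation would be left to translateChar() or translateCharUTF8() as described above. The sketch only assumes that every Latin-1 byte maps to the Unicode code point with the same value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only (not an R API function).
 * Converts the NUL-terminated Latin-1 string `in` to UTF-8 in `out`.
 * `out` must have room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL. */
size_t latin1_to_utf8(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *) in; *p; p++) {
        if (*p < 0x80) {
            /* ASCII: identical byte in both encodings */
            out[n++] = (char) *p;
        } else {
            /* 0x80..0xFF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx */
            out[n++] = (char) (0xC0 | (*p >> 6));
            out[n++] = (char) (0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For instance, the four Latin-1 bytes "caf\xE9" become the five UTF-8 bytes "caf\xC3\xA9", so code that treats the two representations interchangeably will mis-measure or mis-display the string.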
For writers of graphics devices there is a further twist in R >= 2.7.0:
currently text is passed to the graphics device in the native encoding, but
by setting the DevDesc variable hasTextUTF8 to TRUE you can indicate to the
graphics engine the ability to accept text in UTF-8. This is done in
several of the standard devices: for example windows() was already
re-encoding to UCS-2 for plotting, and postscript()/pdf() also re-encode to
the selected single-byte encoding.

Character data passed to .C or .Fortran is automatically re-encoded to the
current locale (for .C, from the encoding specified by ENCODING=, otherwise
from the declared encoding if any).

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Strict-prototypes definitions in R includes
Dear list,

Whenever the flag -Wstrict-prototypes is set in gcc, compiling code that
includes headers in lib/R/include often generates warnings (example with
R-2.6.1:

    Rinternals.h:560: warning: function declaration isn't a prototype

).

All such warnings I looked at were about functions with empty signatures
declared as

    bar foo();

rather than

    bar foo(void);

Is there a reason, or is this just an oversight in the include files?

Thanks,

Laurent
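The distinction behind this warning can be reproduced outside R with a small sketch. The function names answer_old and answer_new are made up for illustration and do not come from the R headers. In C before C23, a declaration with an empty parameter list leaves the parameters unspecified, so it is not a prototype and gcc -Wstrict-prototypes flags it; writing (void) states explicitly that the function takes no arguments.

```c
#include <assert.h>

/* Old-style declaration: before C23 this says nothing about the
 * parameters, so gcc -Wstrict-prototypes reports
 * "function declaration isn't a prototype". */
int answer_old();

/* Proper prototype: the function takes no arguments, and the compiler
 * can reject calls that pass any. */
int answer_new(void);

/* Hypothetical definitions so the sketch links and can be called. */
int answer_old()
{
    return 42;
}

int answer_new(void)
{
    return 42;
}
```

Both functions behave identically when called with no arguments; only the second lets the compiler diagnose a mistaken call such as answer_new(1).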
Re: [Rd] seekViewport error
Hi

Gabor Grothendieck wrote:
> On Jan 23, 2008 9:38 PM, Paul Murrell [EMAIL PROTECTED] wrote:
>> Hi
>>
>> Gabor Grothendieck wrote:
>>> Why does the seekViewport at the bottom give an error?
>>
>> Because the viewport is popped after GRID.cellGrob.84 is drawn.
>> grid.ls() shows the viewport because it recurses down into the legend
>> frame grob. Compare your output with (grid-generated numbering
>> differs) ...
>>
>>   grid.ls(recurs=FALSE, view=TRUE)
>>   ROOT
>>     GRID.rect.28
>>     plot1.toplevel.vp
>>       plot1.xlab.vp
>>         plot1.xlab
>>         1
>>       plot1.ylab.vp
>>         plot1.ylab
>>         1
>>       plot1.strip.1.1.off.vp
>>         GRID.segments.29
>>         1
>>       plot1.strip.left.1.1.off.vp
>>         GRID.segments.30
>>         GRID.text.31
>>         1
>>       plot1.panel.1.1.off.vp
>>         GRID.segments.32
>>         GRID.text.33
>>         GRID.segments.34
>>         1
>>       plot1.panel.1.1.vp
>>         GRID.points.35
>>         GRID.points.36
>>         GRID.points.37
>>         1
>>       plot1.panel.1.1.off.vp
>>         GRID.rect.38
>>         1
>>       plot1.legend.top.vp
>>         GRID.frame.9
>>         1
>>       plot1.
>>         1
>>       1
>>
>> If you look at what viewports are actually available, via
>> current.vpTree(), you'll see that GRID.VP.24 is not there.
>>
>> The problem (see also
>> https://stat.ethz.ch/pipermail/r-help/2008-January/151655.html) is
>> that cellGrobs (children of frame grobs) use their 'vp' component to
>> store the viewport that positions them within the parent frame. This
>> means that the viewport is pushed and then popped (as per normal
>> behaviour for 'vp' components).
>>
>> A possible solution that I am currently trialling uses a special
>> 'cellvp' slot instead so that the cellGrob viewports are pushed and
>> then upped. That way they remain available after the cellGrob has
>> drawn, so you can downViewport() to them. The disadvantage of this
>> approach is that the viewports no longer appear in the grid.ls()
>> listing (because grid.ls() has no way of knowing about special
>> components of grobs that contain viewports). This effect can already
>> be seen by the fact that the viewport for the frame grob
>> (GRID.frame.70) is not shown in the grid.ls() output. On the other
>> hand, the viewports will be visible via current.vpTree() ...
>
> Perhaps some convention could be adopted which, if followed, would let
> grid.ls know? If that worked at least for graphics generated from
> lattice and ggplot2 that would likely satisfy a significant percentage
> of uses.

The gridList() function used by grid.ls() is generic, so a solution is to
simply write a method for frames and cellGrobs. I have committed a fix
along these lines.

Paul

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/
[Rd] tapply on empty data.frames (PR#10644)
Full_Name: Hilmar Berger
Version: 2.4.1/2.6.2alpha
OS: WinXP
Submission from: (NULL) (84.185.128.110)

Hi all,

If I use tapply on an empty data.frame I get an error. I'm not quite sure
if one can actually expect the function to return a result. However, the
error message suggests that this case does not get handled well. This
happens both in R-2.4.1 and 2.6.2alpha (version 2008-01-26).

> z = data.frame(a = c(1,2,3,4),b=c("a","b","c","d"))
> z1 = subset(z,a == 5)
> tapply(z1$a,z1$b,length)
Error in ansmat[index] <- ans :
  incompatible types (from NULL to logical) in subassignment type fix

Deleting unused factor levels from the group parameter gives:

> tapply(z1$a,factor(z1$b),length)
logical(0)

Regards,
Hilmar

platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status         alpha
major          2
minor          6.2
year           2008
month          01
day            26
svn rev        44181
language       R
version.string R version 2.6.2 alpha (2008-01-26 r44181)