[Rd] Character encodings and packages

2008-01-27 Thread Prof Brian Ripley
Since R 2.5.0 it has been possible to declare the encodings of character 
strings (at the level of individual elements of a character vector).
As a reminder, here is the announcement in NEWS

    o   R now attempts to keep track of character strings which are
        known to be in Latin-1 or UTF-8 and print or plot them
        appropriately in other locales.  This is primarily intended
        to make it possible to use data in Western European languages
        in both Latin-1 and UTF-8 locales.  Currently scan(),
        read.table(), readLines(), parse() and source() allow
        encodings to be declared, and console input in suitable
        locales is also recognized.

        New function Encoding() can read or set the declared encodings
        for a character vector.
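
As a rough R-level illustration of the NEWS item above, the sketch below
declares an encoding with Encoding() and then re-encodes the string.  The
use of rawToChar() to build the test string and of iconv() to convert it
are assumptions made for this example and are not taken from the
announcement.

```r
# Bytes for "cafe" with an e-acute in Latin-1 (0xE9); nothing is declared yet.
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
enc_before <- Encoding(x)      # "unknown"

# Declare the encoding; the bytes themselves are unchanged.
Encoding(x) <- "latin1"
enc_after <- Encoding(x)       # "latin1"

# Re-encode to UTF-8 (illustrative use of iconv(), not from the announcement).
y <- iconv(x, from = "latin1", to = "UTF-8")
utf8_bytes <- charToRaw(y)     # 63 61 66 c3 a9
```

Declaring an encoding only attaches a mark to the string; it is the
conversion step that changes the bytes.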

Whereas R itself is careful to make use of this, I see very little 
recognition of it in packages -- which need to be making use of 
translateChar() rather than CHAR(): see the 'Writing R Extensions' manual. 
(I see it used in only one package, and that mainly in a copy of base R 
code.)

This will become more important as time goes by and more ways are 
introduced to generate marked data.  In particular, in R 2.7.0 under 
Windows 'Unicode' data (as used by NT-based versions of Windows, usually 
UCS-2 but possibly UTF-16) is translated to UTF-8 and marked as such.

In essence, every time you use CHAR() in a .Call/.External call in a package 
you should consider whether the data can be non-ASCII and, if so, how you 
want to handle it.  The choices are

- to replace CHAR() by translateChar() and handle the string in the native 
encoding of the current locale.  This needs the package to depend on
'R (>= 2.5.0)'.

- to note the declared encoding and handle the string in that encoding.

- to translate the string to UTF-8 and handle it in UTF-8.  This will be 
easiest to do in R >= 2.7.0 using the function translateCharUTF8().


For writers of graphics devices there is a further twist in R >= 2.7.0: 
currently text is passed to the graphics device in the native encoding, 
but by setting the DevDesc variable hasTextUTF8 to TRUE you can indicate 
to the graphics engine the ability to accept text in UTF-8.  This is done 
in several of the standard devices: for example windows() was already 
re-encoding to UCS-2 for plotting, and postscript()/pdf() also re-encode 
to the selected single-byte encoding.


Character data passed to .C or .Fortran is automatically re-encoded to the 
current locale (for .C, from the encoding specified by ENCODING=, 
otherwise from the declared encoding if any).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Strict-prototypes definitions in R includes

2008-01-27 Thread Laurent Gautier
  Dear list,

  Whenever the flag -Wstrict-prototypes is set in gcc, compiling code that
includes headers in lib/R/include often generates warnings
(example with R-2.6.1:
Rinternals.h:560: warning: function declaration isn't a prototype
).

  All such warnings I looked at were about functions with empty
parameter lists declared as bar foo(); rather than bar foo(void);.
Is there a reason for this, or is it just an oversight in the include
files?

  Thanks,


Laurent



Re: [Rd] seekViewport error

2008-01-27 Thread Paul Murrell
Hi


Gabor Grothendieck wrote:
 On Jan 23, 2008 9:38 PM, Paul Murrell [EMAIL PROTECTED] wrote:
 Hi


 Gabor Grothendieck wrote:
 Why does the seekViewport at the bottom give an error?

 Because the viewport is popped after GRID.cellGrob.84 is drawn.

 grid.ls() shows the viewport because it recurses down into the legend
 frame grob.  Compare your output with (grid-generated numbering differs)
  ...

   grid.ls(recurs=FALSE, view=TRUE)
 ROOT
   GRID.rect.28
   plot1.toplevel.vp
 plot1.xlab.vp
   plot1.xlab
   1
 plot1.ylab.vp
   plot1.ylab
   1
 plot1.strip.1.1.off.vp
   GRID.segments.29
   1
 plot1.strip.left.1.1.off.vp
   GRID.segments.30
   GRID.text.31
   1
 plot1.panel.1.1.off.vp
   GRID.segments.32
   GRID.text.33
   GRID.segments.34
   1
 plot1.panel.1.1.vp
   GRID.points.35
   GRID.points.36
   GRID.points.37
   1
 plot1.panel.1.1.off.vp
   GRID.rect.38
   1
 plot1.legend.top.vp
   GRID.frame.9
   1
 plot1.
   1
 1

 If you look at what viewports are actually available, via
 current.vpTree(), you'll see that GRID.VP.24 is not there.

 The problem (see also
 https://stat.ethz.ch/pipermail/r-help/2008-January/151655.html) is that
 cellGrobs (children of frame grobs) use their 'vp' component to store
 the viewport that positions them within the parent frame.  This means
 that the viewport is pushed and then popped (as per normal behaviour for
 'vp' components).

 A possible solution that I am currently trialling uses a special
 'cellvp' slot instead so that the cellGrob viewports are pushed and then
 upped.  That way they remain available after the cellGrob has drawn,
 so you can downViewport() to them.
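
 The difference between popping and "upping" a viewport described above
 can be sketched with plain grid calls.  This is only an illustration under
 stated assumptions: the viewport names and the helper can_seek() are
 hypothetical, and the code does not reproduce the cellGrob internals.

```r
library(grid)

pdf(NULL)        # null PDF device: nothing is written to disk
grid.newpage()

# Pushed and then popped: the viewport is removed from the tree.
pushViewport(viewport(name = "popped.vp"))
popViewport()

# Pushed and then "upped": the viewport stays in the tree.
pushViewport(viewport(name = "upped.vp"))
upViewport()

# Hypothetical helper: TRUE if seekViewport() can find the named viewport.
can_seek <- function(name) {
  tryCatch({
    seekViewport(name)
    upViewport(0)    # return to the ROOT viewport
    TRUE
  }, error = function(e) FALSE)
}

popped_found <- can_seek("popped.vp")   # FALSE
upped_found  <- can_seek("upped.vp")    # TRUE

invisible(dev.off())
```

 Only the viewport that was left with upViewport() can be revisited
 afterwards, which is the behaviour the proposed change relies on.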

 The disadvantage of this approach is that the viewports no longer appear
 in the grid.ls() listing (because grid.ls() has no way of knowing about
 special components of grobs that contain viewports).  This effect can
 already be seen by the fact that the viewport for the frame grob
 (GRID.frame.70) is not shown in the grid.ls() output.  On the other
 hand, the viewports will be visible via current.vpTree()  ...
 
 Perhaps some convention could be adopted which, if followed, would
 let grid.ls know?  If that worked at least for graphics generated from
 lattice and ggplot2 that would likely satisfy a significant percentage
 of uses.


The gridList() function used by grid.ls() is generic, so a solution is
to simply write a method for frames and cellGrobs.  I have committed a
fix along these lines.

Paul
-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/



[Rd] tapply on empty data.frames (PR#10644)

2008-01-27 Thread hilmar . berger
Full_Name: Hilmar Berger
Version: 2.4.1/2.6.2alpha
OS: WinXP
Submission from: (NULL) (84.185.128.110)


Hi all,

If I use tapply on an empty data.frame I get an error. I'm not quite sure if one
can actually expect the function to return with a result. However, the error
message suggests that this case does not get handled well.

This happens both in R-2.4.1 and 2.6.2alpha (version 2008-01-26).

 z = data.frame(a = c(1,2,3,4),b=c("a","b","c","d"))
 z1 = subset(z,a == 5)
 tapply(z1$a,z1$b,length)
Error in ansmat[index] <- ans : 
  incompatible types (from NULL to logical) in subassignment type fix

Deleting unused factor levels from the group parameter gives:

 tapply(z1$a,factor(z1$b),length)
logical(0)
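
One way to sidestep the empty case in user code is sketched below.  It
assumes the grouping column is a factor (stringsAsFactors = TRUE is set
explicitly so that this holds in any R version), and the helper name
count_by_group() is hypothetical; this is a workaround sketch, not a fix to
tapply() itself.

```r
z  <- data.frame(a = c(1, 2, 3, 4), b = c("a", "b", "c", "d"),
                 stringsAsFactors = TRUE)
z1 <- subset(z, a == 5)          # zero rows, but z1$b keeps all four levels

levels_kept    <- nlevels(z1$b)           # 4
levels_dropped <- nlevels(factor(z1$b))   # 0

# Hypothetical helper: count observations per group, returning an empty
# integer vector instead of calling tapply() on empty input.
count_by_group <- function(values, groups) {
  if (length(values) == 0L) return(integer(0))
  tapply(values, factor(groups), length)
}

empty_counts <- count_by_group(z1$a, z1$b)   # integer(0)
full_counts  <- count_by_group(z$a, z$b)     # one observation per level
```

The guard makes the empty result explicit rather than depending on how
tapply() treats a factor whose levels are all unused.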


Regards,
Hilmar

platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status         alpha
major          2
minor          6.2
year           2008
month          01
day            26
svn rev        44181
language       R
version.string R version 2.6.2 alpha (2008-01-26 r44181)
