Hi list,

When using R CMD Rd2pdf, you can set the environment variable RD2PDF_INPUTENC to "inputenx" to get better support for UTF-8 characters (see ?Rd2pdf). This makes Rd2pdf load the LaTeX package "inputenx" instead of "inputenc".

UTF-8 support can be improved further by making fuller use of the facilities inputenx provides: R CMD Rd2pdf would insert the line "\input{ix-utf8enc.dfu}" into its temporary .tex file. The instructions are in section 1.2, "Unicode", of the inputenx manual: http://mirror.ctan.org/macros/latex/contrib/oberdiek/inputenx.pdf

I suggest that R CMD Rd2pdf automatically insert "\input{ix-utf8enc.dfu}" into its temporary .tex file when it detects the combination of inputenx and UTF-8. The attached small patch does that.

A demo package is also attached (the tarball was built manually, not with R CMD build). It uses some UTF-8 characters that are not supported without the patch: R CMD Rd2pdf fails with an error propagated from LaTeX. With the patch installed, R CMD Rd2pdf works when RD2PDF_INPUTENC=inputenx is set. To test, unpack the tarball and run R CMD Rd2pdf on the resulting directory. Tested on R development version r59731 running on 64-bit Ubuntu 10.10.
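
For convenience, here is a minimal sketch of the test run as shell commands. The tarball name is that of the attachment; the name of the unpacked directory ("encTest3") is an assumption, so adjust PKGDIR if the tarball unpacks to something else. The commands only run if R and the tarball are found.

```shell
# Sketch of the test run. PKGDIR is an assumed directory name, not confirmed.
TARBALL="${TARBALL:-encTest3.tar.gz}"  # attachment to this message
PKGDIR="${PKGDIR:-encTest3}"           # assumed name of the unpacked directory

if command -v R >/dev/null 2>&1 && [ -f "$TARBALL" ]; then
    tar -xzf "$TARBALL"
    # Select inputenx for this one invocation only.
    RD2PDF_INPUTENC=inputenx R CMD Rd2pdf "$PKGDIR"
else
    echo "R or $TARBALL not found; nothing to do" >&2
fi
```

Setting the variable on the command line keeps it local to that single call, so running the same command without the prefix gives the default inputenc behaviour for comparison.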

--
Mikko Korpela
Aalto University School of Science
Department of Information and Computer Science


Attachment: encTest3.tar.gz
Description: GNU Zip compressed data

Index: src/library/tools/R/Rd2pdf.R
===================================================================
--- src/library/tools/R/Rd2pdf.R        (revision 59731)
+++ src/library/tools/R/Rd2pdf.R        (working copy)
@@ -466,12 +466,17 @@
     inputenc <- Sys.getenv("RD2PDF_INPUTENC", "inputenc")
     ## this needs to be canonical, e.g. 'utf8'
     ## trailer is for detection if we want to edit it later.
+    latex_outputEncoding <- latex_canonical_encoding(outputEncoding)
     setEncoding <-
         paste("\\usepackage[",
-              latex_canonical_encoding(outputEncoding), "]{",
+              latex_outputEncoding, "]{",
               inputenc, "} % @SET ENCODING@", sep="")
     useGraphicx <- "% \\usepackage{graphicx} % @USE GRAPHICX@"
     writeLines(c(setEncoding,
+                 if (inputenc == "inputenx" &&
+                     latex_outputEncoding == "utf8") {
+                     "\\input{ix-utf8enc.dfu}"
+                 },
                 useGraphicx,
                  if (index) "\\makeindex{}",
                  "\\begin{document}"), out)
@@ -545,21 +550,28 @@
     latexEncodings <- unique(latexEncodings)
     latexEncodings <- latexEncodings[!is.na(latexEncodings)]
     cyrillic <- if (nzchar(Sys.getenv("_R_CYRILLIC_TEX_"))) "utf8" %in% latexEncodings else FALSE
-    latex_outputEncoding <- latex_canonical_encoding(outputEncoding)
     encs <- latexEncodings[latexEncodings != latex_outputEncoding]
     if (length(encs) || hasFigures || cyrillic) {
         lines <- readLines(outfile)
+        moreUnicode <- inputenc == "inputenx" && "utf8" %in% encs
        encs <- paste(encs, latex_outputEncoding, collapse=",", sep=",")
 
        if (!cyrillic) {
-           lines[lines == setEncoding] <-
+           setEncoding2 <-
                paste0("\\usepackage[", encs, "]{", inputenc, "}")
        } else {
-           lines[lines == setEncoding] <-
+           setEncoding2 <-
                paste(
 "\\usepackage[", encs, "]{", inputenc, "}
 \\IfFileExists{t2aenc.def}{\\usepackage[T2A]{fontenc}}{}", sep = "")
        }
+       if (moreUnicode) {
+           setEncoding2 <-
+               paste0(
+setEncoding2, "
+\\input{ix-utf8enc.dfu}")
+        }
+        lines[lines == setEncoding] <- setEncoding2
        if (hasFigures)
            lines[lines == useGraphicx] <- "\\usepackage{graphicx}\\setkeys{Gin}{width=0.7\\textwidth}"
        writeLines(lines, outfile)

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
