Dear Duncan and Yihui,

I was able to test it with the new R-devel version. Adding only %\SweaveUTF8 to the vignette works: it passes R CMD check --as-cran and the UTF-8 characters render as they should. Adding only Encoding: UTF-8 to the DESCRIPTION instead of %\SweaveUTF8 works too.
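For reference, the two declarations tested in this message look roughly as follows. This is only a sketch: the vignette file name is taken from the check output in this thread, and putting the marker on a line of its own in the vignette source is an assumption, since the exact placement is not shown here.

```text
# Option 1: in the vignette source (utf8vignette.Rmd)
%\SweaveUTF8

# Option 2: in the package DESCRIPTION file
Encoding: UTF-8
```

Either declaration on its own was enough in the R-devel test described above.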

I have tested the same things with the GitHub version of knitr on R-3.1.2-patched. Adding Encoding: UTF-8 to the DESCRIPTION gives an R CMD check --as-cran warning:

* checking package vignettes in 'inst/doc' ... WARNING
Non-ASCII package vignette without specified encoding:
  'utf8vignette.Rmd'

The UTF-8 characters in the vignette are nonetheless rendered correctly. Adding only %\SweaveUTF8 to the vignette makes it pass R CMD check --as-cran, and the UTF-8 characters are rendered correctly.

So the changes to both R-devel and knitr seem to work fine. Thanks a lot.

Thierry

PS I've added the sessionInfo() of both configurations.

#sessionInfo() of R-devel
> library(rmarkdown)
> library(knitr)
> sessionInfo()
R Under development (unstable) (2014-12-18 r67185)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Dutch_Belgium.1252  LC_CTYPE=Dutch_Belgium.1252
[3] LC_MONETARY=Dutch_Belgium.1252 LC_NUMERIC=C
[5] LC_TIME=Dutch_Belgium.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] knitr_1.8        rmarkdown_0.3.11

loaded via a namespace (and not attached):
[1] digest_0.6.4    evaluate_0.5.5  formatR_1.0     htmltools_0.2.6 stringr_0.6.2
[6] tools_3.2.0

#sessionInfo() of R-3.1.2-patched
> library(knitr)
> library(rmarkdown)
> sessionInfo()
R version 3.1.2 Patched (2014-12-11 r67166)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Dutch_Belgium.1252  LC_CTYPE=Dutch_Belgium.1252
[3] LC_MONETARY=Dutch_Belgium.1252 LC_NUMERIC=C
[5] LC_TIME=Dutch_Belgium.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rmarkdown_0.3.11 knitr_1.8.6

loaded via a namespace (and not attached):
[1] bitops_1.0-6    devtools_1.6.1  digest_0.6.6    evaluate_0.5.5  formatR_1.0
[6] htmltools_0.2.6 httr_0.6.0      RCurl_1.95-4.5  stringr_0.6.2   tools_3.1.2

ir.
Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

________________________________________
From: Duncan Murdoch [murdoch.dun...@gmail.com]
Sent: Friday 19 December 2014 14:02
To: Yihui Xie
CC: ONKELINX, Thierry; r-devel@r-project.org; Kurt Hornik
Subject: Re: [Rd] UTF8 markdown vignette

On 18/12/2014, 12:17 AM, Yihui Xie wrote:
> For the record, I saw a change had been made in R-devel:
> https://github.com/wch/r-source/commit/d53b098 (Thanks, Duncan)
> Meanwhile, I also made a change in knitr to assume UTF-8 unless R
> passes an encoding to the vignette engine:
> https://github.com/yihui/knitr/commit/23c6c8e2 Both will solve the
> original problem, but apparently the former one is the ideal fix.

The Windows builds of R-devel were stalled for a few days, but I've given
them a kick now, so this should appear in the Windows binaries on CRAN soon.

Duncan Murdoch

>
> Regards,
> Yihui
> --
> Yihui Xie <xieyi...@gmail.com>
> Web: http://yihui.name
>
>
> On Wed, Dec 10, 2014 at 6:19 AM, Duncan Murdoch
> <murdoch.dun...@gmail.com> wrote:
>> On 09/12/2014, 10:36 PM, Yihui Xie wrote:
>>> I took a look at the R source and I realized that the encoding was
>>> actually never passed to the vignette engine:
>>> https://github.com/wch/r-source/blob/e721ef5f4/src/library/tools/R/Vignettes.R#L507
>>> Apparently only the file and quiet arguments are passed to the
>>> vignette engine. Did I miss anything?
>>
>> I think it's actually a little messier than that: sometimes the
>> encoding is passed (e.g. by tools:::.run_one_vignette, used in R CMD
>> check), but not always. Here's what I think should happen instead:
>>
>> When building a vignette in a package, R knows the encoding declared for
>> the package, so it should assume this as the default for the vignette.
>> If nothing is declared, it should assume "native.enc", i.e. whatever is
>> the native encoding on the machine it's running on.
>>
>> For each vignette, at the same time as it determines the vignette
>> engine, it should see whether there is a declared encoding within the
>> vignette.
>>
>> When it calls the engine, it should pass an encoding (and it should be a
>> legal one, e.g. UTF-8, not utf8).
>>
>> Unless I notice something missing when I do this, or someone else tells
>> me something that's missing, I'll try to make the changes above in
>> R-devel and R-patched sometime before 3.1.3 is released.
>>
>> In the meantime, unless declaring a dependence on R >= 3.1.3, vignette
>> engines should determine the encoding themselves whenever they are
>> called without an "encoding" argument.
>>
>> Duncan Murdoch

Disclaimer: Visit our website <https://drupal.inbo.be/nl/disclaimer-mailberichten-van-het-inbo>

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
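
The fallback rules proposed in the quoted message of 10 December can be sketched in R roughly as below. This is a minimal illustration, not the code in R's tools package or in knitr: the function names normalize_encoding, resolve_vignette_encoding and engine_encoding are hypothetical, and the sketch assumes that a declaration inside the vignette takes precedence over the package-level default, which is how the proposal reads. The engine-side helper follows the knitr behaviour described in this thread (assume UTF-8 unless R passes an encoding).

```r
# Hypothetical helpers for illustration only; none of these functions
# exist in R's tools package or in knitr.

# Map loose spellings such as "utf8" or "utf-8" to the canonical "UTF-8".
# Any other value is returned unchanged.
normalize_encoding <- function(enc) {
  if (grepl("^utf-?8$", enc, ignore.case = TRUE)) "UTF-8" else enc
}

# Pick the encoding that R would hand to the vignette engine.
#   vignette_enc: encoding declared inside the vignette, or NA
#   package_enc:  Encoding field of the package DESCRIPTION, or NA
resolve_vignette_encoding <- function(vignette_enc = NA_character_,
                                      package_enc = NA_character_) {
  if (!is.na(vignette_enc)) {
    normalize_encoding(vignette_enc)   # declared in the vignette itself
  } else if (!is.na(package_enc)) {
    normalize_encoding(package_enc)    # declared for the whole package
  } else {
    "native.enc"                       # nothing declared anywhere
  }
}

# Engine-side fallback: use the encoding R passed in, otherwise assume
# UTF-8 (the default described for the knitr change in this thread).
engine_encoding <- function(encoding = NULL) {
  if (is.null(encoding)) "UTF-8" else normalize_encoding(encoding)
}
```

For example, resolve_vignette_encoding(package_enc = "utf8") returns "UTF-8", while calling it with no arguments returns "native.enc".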