[R] non-ascii characters in R output
All,

I'd like to automatically output text from R to HTML. In doing this I've run into trouble with non-ascii characters, as my browser (and presumably others) does not render such characters correctly. For example, the 'fancy' single quotes associated with summary.lm are multi-byte characters on my platform. This particular problem is solved by options(useFancyQuotes=FALSE). But now I'm concerned about other non-ascii characters. As a possibly overkill measure, my current solution involves capture.output and iconv(..., to="ASCII//TRANSLIT").

Are there other sources of non-ascii characters? Is there a better or more general solution?

Best,
Matt

sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.12.1

--
Matthew S Shotwell
Assistant Professor
School of Medicine
Department of Biostatistics
Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
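The approach described in this message could look roughly like the sketch below. It is only an illustration: the model `fit`, the built-in `cars` data, and the choice of `sub = "?"` are assumptions, not part of the original post. The replacement characters produced by `//TRANSLIT` depend on the platform's iconv implementation, and some implementations may not support the `//TRANSLIT` suffix at all.

```r
# Fit a small model so there is some printed output to capture
fit <- lm(dist ~ speed, data = cars)

# Switch to plain ASCII quotes while printing, then restore the old setting
old <- options(useFancyQuotes = FALSE)
out <- capture.output(print(summary(fit)))   # one string per printed line
options(old)

# Flag lines that still contain non-ASCII characters:
# strict conversion to ASCII returns NA for any such line
non_ascii <- is.na(iconv(out, from = "", to = "ASCII"))
out[non_ascii]

# Fallback: transliterate whatever remains; bytes that cannot be
# transliterated are replaced by "?" instead of turning the line into NA
clean <- iconv(out, from = "", to = "ASCII//TRANSLIT", sub = "?")
```

Inspecting `out[non_ascii]` before transliterating is one way to find out which other parts of the output, if any, produce non-ASCII characters on a given platform.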
Re: [R] non-ascii characters in R output
OK, looks like my web browser does render non-ascii characters output by R when it's given the encoding explicitly. This works for me:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

So that's another solution, but not a general one.

-Matt

On Fri, 2011-02-18 at 12:47 -0600, Matt Shotwell wrote:
> All, I'd like to automatically output text from R to HTML. [...]
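A minimal sketch of writing R output into a page that carries this declaration is shown below. The helper name `write_html_utf8` and the page skeleton are hypothetical. The sketch converts the text to UTF-8 with `enc2utf8()` before writing, so the declared charset matches the bytes in the file even when the session's locale is not UTF-8.

```r
# Hypothetical helper: wrap lines of text in a minimal HTML page that
# declares UTF-8 and is actually written as UTF-8
write_html_utf8 <- function(lines, path) {
  lines <- enc2utf8(lines)                       # convert from native encoding
  # Escape HTML special characters; "&" has to be handled first
  lines <- gsub("&", "&amp;", lines, fixed = TRUE)
  lines <- gsub("<", "&lt;", lines, fixed = TRUE)
  lines <- gsub(">", "&gt;", lines, fixed = TRUE)
  html <- c(
    "<html>", "<head>",
    "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\"/>",
    "</head>", "<body>", "<pre>", lines, "</pre>", "</body>", "</html>"
  )
  con <- file(path, open = "w")
  on.exit(close(con))
  writeLines(html, con, useBytes = TRUE)         # write bytes without re-encoding
  invisible(path)
}

# Example: curly quotes (U+2018, U+2019) and characters that need escaping
f <- write_html_utf8(c("Signif. codes: \u2018***\u2019", "a < b & c"),
                     tempfile(fileext = ".html"))
```

Converting to UTF-8 on output means the same declaration can be used for every locale that R can convert from.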
Re: [R] non-ascii characters in R output
On 18/02/2011 5:58 PM, Matt Shotwell wrote:
> OK, looks like my web browser does render non-ascii characters output by
> R when it's given the encoding explicitly. This works for me:
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
> So that's another solution, but not a general one.

I don't understand your final comment. What is not general about declaring how the file is encoded?

Duncan Murdoch
Re: [R] non-ascii characters in R output
On Fri, 2011-02-18 at 19:50 -0500, Duncan Murdoch wrote:
> On 18/02/2011 5:58 PM, Matt Shotwell wrote:
> > [...] So that's another solution, but not a general one.
>
> I don't understand your final comment. What is not general about
> declaring how the file is encoded?

I meant that declaring UTF-8 is not generally applicable, because R doesn't always output UTF-8 (right?). For example, locales that use exotic encodings might output characters that are not interpretable where UTF-8 is assumed. The general solution, I suppose, is to automatically generate the <meta /> line with the encoding used by R.

Matt
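Generating the declaration from the session's encoding might be sketched as follows. The helper name `meta_line` is hypothetical. `utils::localeToCharset()` only guesses the charset from the locale name and can return NA, and, as far as I know, it reports iconv-style names such as "ISO8859-1", so the hyphen fix below is an assumption that covers only that family of names.

```r
# Hypothetical helper: build the <meta> line from a charset name,
# by default the one R guesses for the current locale
meta_line <- function(charset = utils::localeToCharset()[1]) {
  if (is.na(charset)) {
    stop("could not determine the charset of the current locale")
  }
  # Assumption: turn iconv-style "ISO8859-x" into the "ISO-8859-x"
  # spelling used in HTML; other names are passed through unchanged
  charset <- sub("^ISO8859-", "ISO-8859-", charset)
  sprintf(
    "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=%s\"/>",
    charset
  )
}

meta_line("UTF-8")       # explicit charset
meta_line("ISO8859-1")   # iconv-style name, normalised for HTML
```

Calling `meta_line()` with no argument uses the current locale, so the declared charset follows whatever encoding R is writing in.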