[R] non-ascii characters in R output

2011-02-18 Thread Matt Shotwell

All,

I'd like to automatically output text from R to HTML. In doing this I've 
run into trouble with non-ascii characters, as my browser (and 
presumably others) does not render such characters correctly. For 
example, the 'fancy' single quotes associated with summary.lm are 
multi-byte characters on my platform. This particular problem is solved 
by options(useFancyQuotes=FALSE). But now I'm concerned about other 
non-ascii characters. Perhaps as overkill, my current solution involves 
capture.output and iconv(..., to="ASCII//TRANSLIT"). Are there other 
sources of non-ascii characters? Is there a better or more general solution?
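
A minimal sketch of the capture.output plus iconv approach described above, assuming the built-in `cars` data set as a stand-in for real data. The variable names are illustrative only, and transliteration results depend on the platform's iconv implementation.

```r
# Sketch: capture printed output and transliterate it to ASCII.
old <- options(useFancyQuotes = FALSE)   # plain quotes from sQuote()/dQuote()

fit <- lm(dist ~ speed, data = cars)     # 'cars' ships with base R (assumed example)
out <- capture.output(print(summary(fit)))

# "ASCII//TRANSLIT" asks iconv to approximate non-ASCII characters.
# The result is platform-dependent; input that cannot be converted
# may come back as NA or with "?" substitutions.
ascii <- iconv(out, from = "", to = "ASCII//TRANSLIT")

options(old)                             # restore the previous setting
```

Transliteration discards information (accented letters, symbols), so it is probably best kept for cases where the output really must be pure ASCII.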


Best,
Matt

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.12.1

--
Matthew S Shotwell   Assistant Professor   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-ascii characters in R output

2011-02-18 Thread Matt Shotwell
OK, it looks like my web browser does render the non-ascii characters output
by R when it is given the encoding explicitly. This works for me: <meta
http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So
that's another solution, but not a general one.
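
As a rough illustration of declaring the encoding, the sketch below wraps captured output in a small HTML page that carries the meta line and writes it as UTF-8. The helper `write_html_utf8()` is hypothetical (not part of any package), and the `cars` model is only an assumed example.

```r
# Sketch: wrap captured R output in a minimal HTML page that declares UTF-8.
# write_html_utf8() is a hypothetical helper, not an existing function.
write_html_utf8 <- function(lines, path) {
  # Escape the characters that are special in HTML.
  esc <- gsub("&", "&amp;", lines, fixed = TRUE)
  esc <- gsub("<", "&lt;", esc, fixed = TRUE)
  esc <- gsub(">", "&gt;", esc, fixed = TRUE)
  html <- c(
    "<html><head>",
    "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\"/>",
    "</head><body><pre>",
    esc,
    "</pre></body></html>"
  )
  # Convert from the native encoding to UTF-8 and write the bytes unchanged.
  writeLines(enc2utf8(html), path, useBytes = TRUE)
  invisible(path)
}

out  <- capture.output(print(summary(lm(dist ~ speed, data = cars))))
path <- write_html_utf8(out, tempfile(fileext = ".html"))
```

Because the text is converted with enc2utf8() before writing, the declared charset should match the bytes in the file even when the session itself is not running in a UTF-8 locale.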

-Matt





Re: [R] non-ascii characters in R output

2011-02-18 Thread Duncan Murdoch

On 18/02/2011 5:58 PM, Matt Shotwell wrote:

> OK, looks like my web browser does render non-ascii characters output by
> R when it's given the encoding explicitly. This works for me: <meta
> http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So
> that's another solution, but not a general one.


I don't understand your final comment.  What is not general about 
declaring how the file is encoded?


Duncan Murdoch








Re: [R] non-ascii characters in R output

2011-02-18 Thread Matt Shotwell


On Fri, 2011-02-18 at 19:50 -0500, Duncan Murdoch wrote:
> On 18/02/2011 5:58 PM, Matt Shotwell wrote:
>> OK, looks like my web browser does render non-ascii characters output by
>> R when it's given the encoding explicitly. This works for me: <meta
>> http-equiv="Content-Type" content="text/html; charset=UTF-8"/>. So
>> that's another solution, but not a general one.
>
> I don't understand your final comment.  What is not general about
> declaring how the file is encoded?

I meant that declaring UTF-8 is not generally applicable, because R
doesn't always output UTF-8 (right?). For example, locales that use
exotic encodings might output characters that are not interpretable
where UTF-8 is assumed.

The general solution, I suppose, is to automatically generate the
<meta .../> line with the encoding used by R.
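
One possible sketch of generating that line from the running session. The helpers `current_charset()` and `meta_line()` are hypothetical, and the fallback to "US-ASCII" when no charset can be guessed is an assumption rather than anything R prescribes.

```r
# Sketch: guess the session's charset and build a matching <meta> line.
# current_charset() and meta_line() are hypothetical helpers.
current_charset <- function() {
  info <- l10n_info()                       # base R: describes the current locale
  if (isTRUE(info[["UTF-8"]]))   return("UTF-8")
  if (isTRUE(info[["Latin-1"]])) return("ISO-8859-1")
  # utils::localeToCharset() returns one or more guesses, possibly NA.
  guess <- utils::localeToCharset()[1]
  if (is.na(guess)) "US-ASCII" else guess   # assumed fallback
}

meta_line <- function(charset = current_charset()) {
  sprintf(
    "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=%s\"/>",
    charset
  )
}

meta_line()
```

The names returned by localeToCharset() follow iconv conventions (for example "ISO8859-1") and may not be labels a browser recognizes, so some mapping could still be needed. Converting the text with enc2utf8() and always declaring UTF-8 avoids the guess altogether.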

Matt

 
 
 