Re: [R] odfWeave UTF-8 error and latin characters

2012-01-11 Thread staffan7s
Hello,

I am using R and LibreOffice on Ubuntu 11.10 (64-bit) and have been
experiencing similar problems with character encoding (Swedish, UTF-8) in
odfWeave. Here is an example of what it looks like:

Should be: "Hör Ärland dåligt?"
Appears as: "Hör Ärland dåligt?"
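
Garbling of this kind is what one would expect when UTF-8 bytes are decoded
as Latin-1. A minimal base R sketch of that assumption (the variable names
are illustrative and nothing here calls odfWeave):

```r
# Correct text, written with \u escapes so the result does not depend on
# the encoding this snippet is saved in.
original <- "H\u00f6r \u00c4rland d\u00e5ligt?"

# The UTF-8 bytes of that text.
utf8_bytes <- charToRaw(enc2utf8(original))

# Decode those same bytes as if they were Latin-1 (ISO-8859-1).
garbled <- iconv(rawToChar(utf8_bytes), from = "ISO-8859-1", to = "UTF-8")

# Each accented letter is two bytes in UTF-8, so it turns into two characters.
nchar(original, type = "chars")  # 18
nchar(garbled, type = "chars")   # 21
first_word <- substr(garbled, 1, 4)  # "H", A-tilde, pilcrow sign, "r"
```

The three accented letters account for the three extra characters in the
garbled string.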

I found a (pretty clumsy) solution which I post below. Has anyone been able
to solve this in a more elegant way?

Setup:

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=sv_SE.UTF-8   LC_NUMERIC=C  
other attached packages:
[1] odfWeave_0.7.17 XML_3.2-0   lattice_0.19-30

Problem:

I have some R syntax for tables in the file "in.odt":

<<>>=
irre <- xtabs(~Species, data=iris) 
irre <- data.frame(irre) 
colnames(irre) <- c("växt", "antal") 
row.names(irre) <- c("å", "ä", "ö") 
odfTable(irre)
odfTableCaption("Tabell åäö")
@

Running odfWeave("in.odt", "out.odt") on this yields lots of output, ending
with this warning message: ‘content.Rnw’ has unknown encoding: assuming
Latin-1.

On opening the output file (out.odt), the Swedish characters appear jumbled.
I had a look at the content.Rnw file, which was correctly encoded as UTF-8.
The same was true of the content.xml file in the odt source (which had to be
unzipped first).

I then tried downgrading to XML 3.2, as suggested elsewhere. This didn't
help. I then looked for tools for converting an odt file from one kind of
encoding to another, again to no avail.

Solution:

Save the output odt file in flat XML format (LibreOffice > Save As >
second-to-last option). Convert the resulting .fodt file FROM UTF-8 TO
Latin-1 (aka ISO_8859-1) with iconv from a bash terminal:

iconv -t ISO_8859-1 -f UTF-8 -o converted.fodt out.fodt

This produces a correctly encoded file!
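
One possible explanation for why this works, assuming the output was
double-encoded (UTF-8 bytes read as Latin-1 and then written out as UTF-8
again): converting from UTF-8 to Latin-1 undoes the second step and leaves
the original UTF-8 bytes behind. The base R sketch below reproduces that
round trip on a single string; it is an illustration of the assumption, not
a trace of what odfWeave does internally.

```r
# Caption text from the example, with \u escapes for the Swedish letters.
original <- "Tabell \u00e5\u00e4\u00f6"
utf8_bytes <- charToRaw(enc2utf8(original))

# Step 1 (the assumed bug): UTF-8 bytes are taken for Latin-1 and
# re-encoded as UTF-8, which doubles every accented letter.
double_encoded <- iconv(rawToChar(utf8_bytes),
                        from = "ISO-8859-1", to = "UTF-8")

# Step 2 (the workaround): same direction as
# "iconv -f UTF-8 -t ISO_8859-1" in the shell command.
repaired_bytes <- charToRaw(iconv(double_encoded,
                                  from = "UTF-8", to = "ISO-8859-1"))

# The repaired bytes are the original UTF-8 bytes again.
round_trip_ok <- identical(repaired_bytes, utf8_bytes)
round_trip_ok  # TRUE
```

If the file were not double-encoded in the first place, the same conversion
would produce genuine Latin-1 rather than restoring UTF-8.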




--
View this message in context: 
http://r.789695.n4.nabble.com/odfWeave-UTF-8-error-and-latin-characters-tp2544333p4285335.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] odfWeave UTF-8 error and latin characters

2010-09-17 Thread Pedro Emmanuel Alvarenga Americano do Brasil
Hello R masters,

I have sent this same message to other lists, and so far no one has been
able to shed any light on it. I was trying to use odfWeave to generate a
report from R, and I'm getting an error that I think is related to Latin
characters. I looked around and found some material about the same problem
in Sweave
http://labmoluscos.wordpress.com/2010/02/18/sweave-latex-character-encoding/
but have not found a way to fix it for odfWeave so far. Perhaps someone
could suggest a workaround.

I think my problem is that I have a table with characters such as 'ç', 'ó'
and 'ã' that odfWeave is not recognizing properly. The error follows below.

My setup: Windows Vista (default language: Brazilian Portuguese), R 2.11.1,
odfWeave 0.7.11, OpenOffice 3.0.1

in my odt file ...

<<>>=

odfTable(tabela2,useRowNames=T,name ='Tabela 2')

@

in R console ...

> library(odfWeave)
> imageDefs <- getImageDefs()
> imageDefs$type <- 'bmp'
> imageDefs$device <- 'bmp'
> setImageDefs(imageDefs)
> options(SweaveSyntax="SweaveSyntaxNoweb")
> odfWeave('teste.odt','figura1.odt')
  Copying  teste.odt
  Setting wd to
C:\Users\PEDROE~1\AppData\Local\Temp\Rtmpfv32oJ/odfWeave0215405313
  Unzipping ODF file using unzip -o "teste.odt"
Archive:  teste.odt
 extracting: mimetype
   creating: Configurations2/statusbar/
  inflating: Configurations2/accelerator/current.xml
   creating: Configurations2/floater/
   creating: Configurations2/popupmenu/
   creating: Configurations2/progressbar/
   creating: Configurations2/menubar/
   creating: Configurations2/toolbar/
   creating: Configurations2/images/Bitmaps/
   inflating: content.xml
  inflating: styles.xml
 extracting: meta.xml
  inflating: Thumbnails/thumbnail.png
  inflating: settings.xml
  inflating: META-INF/manifest.xml

  Removing  teste.odt
  Creating a Pictures directory

  Pre-processing the contents
  Sweaving  content.Rnw

  Writing to file content_1.xml
  Processing code chunks ...
1 : term verbatim(label=fluxograma)
Loading required package: shape
Loading required package: shape
2 : term xml(label=tabela2)

  'content_1.xml' has been Sweaved

  Removing content.xml

  Post-processing the contents
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE2 0x6E 0x63 0x69
Erro: 1: Input is not proper UTF-8, indicate encoding !
Bytes: 0xE2 0x6E 0x63 0x69
>
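
The four bytes in the error message can be inspected directly. The sketch
below, in base R, checks that they are not valid UTF-8 and shows what they
spell when read as Latin-1. Note that validUTF8() needs R 3.3.0 or later, so
it would not have been available in the R 2.11.1 session above.

```r
# Bytes reported by the XML parser in the error message.
bad <- as.raw(c(0xE2, 0x6E, 0x63, 0x69))
bad_string <- rawToChar(bad)

# 0xE2 opens a three-byte UTF-8 sequence, but 0x6E is not a continuation
# byte, so the input is rejected as UTF-8.
is_utf8 <- validUTF8(bad_string)

# The same bytes decoded as Latin-1 give readable text.
decoded <- iconv(bad_string, from = "ISO-8859-1", to = "UTF-8")

is_utf8  # FALSE
# decoded is a-circumflex followed by "nci", a substring of the first
# column header of tabela2.
in_header <- grepl(decoded, "Concord\u00e2ncia observada", fixed = TRUE)
in_header  # TRUE
```

This suggests that at least some of the table's labels reach the XML
post-processing step as Latin-1 (or a similar single-byte encoding) rather
than UTF-8.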

> tabela2[1:5,] # a piece of table 2
                                    Concordância observada   Kappa  p valor
Sexo:                                                   1.      1.    0e+00
Referenciamento para diagnóstico:                   0.6863  0.5081    4e-03
Reteste na doação de sangue:                        0.9379  0.7874    0e+00
Resultado do reteste da doação:                     0.9317  0.6607    2e-04
Indicação médica para investigação:                 0.6957  0.5556    1e-04

Following some suggestions from other lists, I tried to encode the table
using enc2utf8 and descr::toUTF8, such as

<<>>=

odfTable(enc2utf8(tabela2),useRowNames=T,name ='Tabela 2')

@

OR

<<>>=

enc2utf8(odfTable(tabela2,useRowNames=T,name ='Tabela 2'))

@

OR

<<>>=

toUTF8(odfTable(tabela2,useRowNames=T,name ='Tabela 2'))

@

But all of them gave the same error. However, if I build the table without
the row names, such as:

<<>>=

odfTable(tabela2,useRowNames=F,name ='Tabela 2')

@

It works fine... but the row names are not there. I tried to bind the
row names as a column, but the error comes back.
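
The attempts above apply the conversion to the whole table object or to the
output of odfTable(). Since enc2utf8() is documented for character vectors,
one thing that could be tried is re-encoding the row and column names
themselves before the table is handed to odfTable(). The sketch below is
untested with odfWeave: dimnames_to_utf8 is a hypothetical helper, the toy
matrix only stands in for tabela2, and the source encoding is an assumption
(on Windows the native encoding may be CP1252 rather than ISO-8859-1).

```r
# Hypothetical helper: re-encode the dimnames of a matrix or data frame.
# 'from' is an assumption about how the labels are currently stored.
dimnames_to_utf8 <- function(tab, from = "ISO-8859-1") {
  rownames(tab) <- iconv(rownames(tab), from = from, to = "UTF-8")
  colnames(tab) <- iconv(colnames(tab), from = from, to = "UTF-8")
  tab
}

# Toy stand-in for tabela2: one row whose label is the Portuguese word
# for "donation", stored as Latin-1 bytes (0xE7 is c-cedilla, 0xE3 is
# a-tilde).
latin1_label <- rawToChar(as.raw(c(0x64, 0x6f, 0x61, 0xe7, 0xe3, 0x6f)))
toy <- matrix(c(0.9379, 0.7874), nrow = 1,
              dimnames = list(latin1_label, c("obs", "Kappa")))

fixed <- dimnames_to_utf8(toy)
label_ok <- rownames(fixed) == "doa\u00e7\u00e3o"
label_ok  # TRUE
```

The converted object would then be passed to odfTable() in place of the
original. If the labels are already UTF-8, this conversion would garble them
instead, so it is worth checking the bytes with charToRaw() first.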

After a couple of days banging my head against this, I'm about to fall back
on my old friend "copy and paste". Any suggestion is most welcome.

Kind regards to all and thanks in advance,

A big hug, and may the force be with you,

Dr. Pedro Emmanuel A. A. do Brasil
Instituto de Pesquisa Clínica Evandro Chagas
Fundação Oswaldo Cruz
Rio de Janeiro - Brasil
Av. Brasil 4365
Tel 55 21 3865-9648
email: pedro.bra...@ipec.fiocruz.br
email: emmanuel.bra...@gmail.com

---Support for free software
www.zotero.org - bibliographic reference management.
www.broffice.org or www.openoffice.org - documents, spreadsheets or presentations.
www.epidata.dk - data entry.
www.r-project.org - data analysis.
www.ubuntu.com - operating system
