Bug#748984: recode html..utf8 breaks existing UTF-8

2014-05-22 Thread Andreas Schamanek
Package: recode
Version: 3.6-20
Severity: normal

When converting text from HTML to UTF-8, existing and valid UTF-8 characters get mangled. Example shell session:

$ cat sample.htm
<!doctype html><html><head>
<meta http-equiv=Content-Type content=text/html; charset=UTF-8>
</head><body><p>Entities:
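To make the reported symptom concrete, here is a minimal Python sketch of the behaviour one would expect versus one plausible way the mangling could arise. The sample string is hypothetical, not the contents of sample.htm. The Latin-1 misreading is an assumption about the cause, not something the thread confirms about recode's internals.

```python
import html

# Hypothetical sample: HTML entities mixed with literal (already UTF-8) characters.
source = "Entities: &auml; &euro; Literal: ä €"

# Expected behaviour: entities are decoded, existing characters are left untouched.
expected = html.unescape(source)


def misread_as_latin1(text: str) -> str:
    """Assumed failure mode: UTF-8 bytes are interpreted as Latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


# Entities still decode correctly, but the literal characters turn into mojibake.
mangled = html.unescape(misread_as_latin1(source))
```

Under this assumption the entity-encoded characters come out fine while the literal ones are double-encoded, which matches the description "existing and valid UTF-8 characters get mangled".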

Bug#748984: recode html..utf8 breaks existing UTF-8

2014-05-22 Thread François Pinard
> When converting text from HTML to UTF-8, existing and valid UTF-8 characters get mangled.

I'm far from my things and with rather limited connectivity. The truth is that I now live in a hospital, and likely for good, in a manner of speaking :-) . So, Recode and my other projects are open to

Bug#748984: recode html..utf8 breaks existing UTF-8

2014-05-22 Thread Andreas Schamanek
Hi François,

Thanks a lot for your prompt reply. I was sorry to read what you wrote. Unfortunately, I cannot take over. I hope others will. All the best to you!

Thanks for pointing me to the -d option. Actually, I forgot to mention a workaround:

$ recode -d utf8..html sample.htm | recode
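The workaround command is cut off in the archive, so its second stage is not known from this thread. The idea its first stage suggests, turning every non-ASCII character into an entity so that only entities remain to be decoded, can be sketched in Python as follows. The function name `entities_roundtrip` is hypothetical, and the decoding step is an assumption about what the truncated pipeline goes on to do.

```python
import html


def entities_roundtrip(text: str) -> str:
    """Encode non-ASCII characters as entities, then decode all entities."""
    # Stage 1: replace each non-ASCII character with a numeric character reference,
    # leaving an ASCII-only string that cannot be misread in another charset.
    ascii_only = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
    # Stage 2 (assumed): decode named and numeric entities back to characters.
    return html.unescape(ascii_only)
```

For example, `entities_roundtrip("&auml; ä €")` yields the same character for the entity and for the literal, since both pass through the entity form before decoding.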