Warren Block wrote:
 > Oliver Fromme wrote:
 > > Warren Block wrote:
 > > > Oliver Fromme wrote:
 > > > > Gary Kline wrote:
 > > > > > 
 > > > > > Whenever I save a word processor file [OOo, say] into a
 > > > > > text file, I get a slew of hex codes to indicate the char to be
 > > > > > used.  I'm looking for a perl one-liner or script to translate
 > > > > > hex back into ', ", -- [that's a dash], and so forth.  Why does
 > > > > > this fail to translate the hex code to an apostrophe?
 > > > > > 
 > > > > > perl -pi.bak -e 's/\xe2\x80\x99/'/g'
 > > > > 
 > > > > You need to escape the inner quote character, of course.
 > > > > I think sed is better suited for this task than perl.
 > > > 
 > > > That's twice now people have suggested sed instead of perl.  Why?  For
 > > > many uses, perl is a better sed than sed.  The regex engine is far more
 > > > powerful and escapes are much simpler.
 > > 
 > > Neither powerful regexes nor escapes will help in this case.
 > 
 > Certainly \x will not help in sed; sed doesn't have it.

Right, that's an annoying flaw in sed (it doesn't even
support the \0 syntax for octal values, which is more
standard than \x).

Normally I just type such characters literally, which
is accepted fine by sed (it is 8 bit clean).
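
As an illustration of the literal-byte approach, here is a minimal
sketch, assuming a POSIX shell.  The function name fix_quotes is made
up for this example, printf is only used so that nothing non-ASCII has
to be typed, and LC_ALL=C is my assumption to make sed treat the input
as plain bytes regardless of the locale.

```shell
# The three bytes E2 80 99, written as octal escapes (which printf does support).
rsquo=$(printf '\342\200\231')

# Hypothetical helper: replace every such byte sequence on stdin with a plain '.
fix_quotes() {
    LC_ALL=C sed "s/${rsquo}/'/g"
}

printf 'don\342\200\231t\n' | fix_quotes
```

The sed expression is in double quotes here so that the shell expands
the variable and the plain apostrophe needs no escaping.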

However, in this particular case I really recommend using
the "recode" tool (ports/converters/recode) to convert
from UTF-8 to some other encoding.  Much easier, and more
correct.

The UTF-8 sequence E2-80-99 (Unicode U+2019) isn't even a
real apostrophe; it's a right single quotation mark.  An
apostrophe would be ASCII 0x27 (decimal 39).
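
To see where 2019 comes from, the three bytes can be decoded by hand.
The following is only a sketch using POSIX shell arithmetic; the
function name decode_utf8_3 is hypothetical, and it assumes the input
really is a well-formed three-byte UTF-8 sequence (it does no
validation).

```shell
# Decode a 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) to its code point.
# $1 $2 $3: the bytes as hex constants, e.g. 0xE2 0x80 0x99
decode_utf8_3() {
    # Keep 4 payload bits of the lead byte and 6 bits of each continuation byte.
    printf '%04X\n' $(( (($1 & 0x0F) << 12) | (($2 & 0x3F) << 6) | ($3 & 0x3F) ))
}

decode_utf8_3 0xE2 0x80 0x99    # prints 2019
```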

Maybe the OP should configure his software to not save the
file with UTF-8 encoding in the first place.  I'm not an
OOo user, so I can't tell how to do that.  But obviously
the OP doesn't want the file to be stored as UTF-8.

 > It's possible "Mastering Regular Expressions" has influenced my thinking 
 > on this.

This isn't about regular expressions at all.  This is
about replacing fixed strings.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"One of the main causes of the fall of the Roman Empire was that,
lacking zero, they had no way to indicate successful termination
of their C programs."
        -- Robert Firth
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions