Re: Sed, shell and hexadecimal character codes

2008-05-23 Thread Oliver Fromme
Karel Miklav wrote:
  There's a tip in the FreeBSD fortunes database that says:
  
   Want to strip UTF-8 BOM (Byte Order Mark) from given files?
   
   sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile

FreeBSD's sed(1) doesn't support hexadecimal or octal
sequences.  I think even GNU sed doesn't support them, but
you might try it yourself (/usr/ports/textproc/gsed).

I don't know why that fortunes entry exists.  It's wrong.

  I can't make it work, and I can't find any other method to
  work with hexa codes in scripts or on the command line so
  I'm kind-a depressed :) I help myself with xxd now, but if
  it is possible to avoid it, I'd like to hear about it.

There is no standard for handling octal and hexadecimal
sequences, unfortunately, so you have to consult the
manual page to find out.  For example, tr(1) supports
octal sequences only (no hexadecimal), while awk(1)
supports both.  So the above line could be rewritten
with awk:

awk '{if(NR==1)sub(/^\xef\xbb\xbf/, "");print}' < bomfile > newfile

Basically that's exactly the same instruction as the sed
one above, but awk is a little more verbose:

1 in sed means that the following command should only
affect the first line.  That's what if(NR==1) does in
awk.

s/OLD/NEW/ is the replacement command in sed.  In awk
it looks like sub(/OLD/, "NEW").

Finally, sed prints all resulting lines by default, while
awk has to be told with an explicit print command.
(awk prints lines automatically only if there are no
other commands at all.)
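
If your sed has no hexadecimal or octal escapes at all, one possible
workaround is to let printf(1) produce the raw bytes from octal escapes
and paste them into the sed expression.  This is only a sketch: the
temporary directory and the sample file contents are made up for the
demonstration, and setting LC_ALL=C is my assumption to make sed treat
the input as plain bytes.

```shell
#!/bin/sh
# Sketch: strip a leading UTF-8 BOM without relying on \x escapes in sed.
# The directory and file names below are placeholders for this demo.
tmpdir=$(mktemp -d)
bomfile="$tmpdir/bomfile"
newfile="$tmpdir/newfile"

# Create a sample file starting with the BOM bytes EF BB BF
# (octal 357 273 277), followed by two lines of text.
printf '\357\273\277hello\nworld\n' > "$bomfile"

# Let printf translate the octal escapes into the three raw bytes.
bom=$(printf '\357\273\277')

# Remove the BOM from the first line only; LC_ALL=C keeps sed byte-oriented.
LC_ALL=C sed -e "1s/^${bom}//" < "$bomfile" > "$newfile"

# Dump the start of the result to check that the BOM is gone.
od -c "$newfile" | head -n 2
```

The double quotes around the sed expression are needed so that the shell
expands ${bom} before sed sees it.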

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

'Instead of asking why a piece of software is using 1970s technology,
start asking why software is ignoring 30 years of accumulated wisdom.'
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Sed, shell and hexadecimal character codes

2008-05-21 Thread Karel Miklav

There's a tip in the FreeBSD fortunes database that says:

 Want to strip UTF-8 BOM (Byte Order Mark) from given files?

 sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile

I can't make it work, and I can't find any other method to
work with hexa codes in scripts or on the command line so
I'm kind-a depressed :) I help myself with xxd now, but if
it is possible to avoid it, I'd like to hear about it.

--

Regards,
Karel Miklav

