RE: UCS-2 to UTF-8 hex values

Hietaniemi Jarkko (NRC/Boston) Wed, 19 Sep 2001 11:35:15 -0700

> CP     UCS
> =============
> 2E     002E  
> 2F     002F  
> 30     0030 
> ...
> 
> Has anyone written or found  a script which takes 4 digit 
> hex representation of UCS-2 as (or similar to) the above 
> which outputs the UTF-8 value equivalent in the 
> same hex format? 

Assuming I parsed your question correctly, any Perl newer than 5.6.0
will do (5.6.1 recommended; check what perl -v shows):

$ perl -le 'print join(" ", map { sprintf "%02x", $_ } unpack("C*",
pack("U*", 0x80)))'
c2 80

Unraveling the incantation:
        pack U:         pack as Unicode (Perl's internal representation
                        is UTF-8)
        unpack C:       unpack as bytes
        sprintf:        format as two-digit hex
        map:            for all the results of the unpack C
        join:           space separated

And for your input file, assuming two hex numbers separated by
whitespace on their own line:

$ perl -nle 'if (/^[0-9a-f]+\s+([0-9a-f]+)$/i) { print "$_\t", join(" ",
map { sprintf "%02x", $_ } unpack("C*", pack("U*", hex($1)))) }' input.file
