How best to handle utf8 in octetty module

peter . billam Wed, 30 Mar 2005 18:25:38 -0800

Greetings.

I'm the author of CPAN's Crypt::Tea_JS which implements the Tiny Encryption
Algorithm compatibly in Perl and JavaScript.  It offers functions such as
   $cyphertext = encrypt( $plaintext, $key );
   $plainagain = decrypt( $cyphertext, $key );


The encryption, of course, works with octets. I've just (version 2.13)
introduced a first attempt at handling utf8 string arguments; this
is still undocumented so I can change it if there's a better way.
Currently, at the top of sub encrypt, there is:

        use bytes;
        ...
   sub encrypt { my ($str,$key) = @_;
      if ($] > 5.007 && Encode::is_utf8($str)) {
         Encode::_utf8_off($str);
         # $str = Encode::encode_utf8($str);
      }
                ...

Is this the right sort of way to do it (e.g. functionality, portability) ?
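
For comparison, here is a minimal sketch of the two alternatives in that
snippet (the live _utf8_off call and the commented-out encode_utf8 call).
The helper name to_octets is hypothetical and not part of Crypt::Tea_JS.
For a string that already has the UTF8 flag set, both routes should yield
the same octets; the practical difference is that encode_utf8 returns a
new string, while _utf8_off changes its argument in place and is marked
as internal in Encode's documentation.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not in Crypt::Tea_JS): return the octets to encrypt.
# Strings without the UTF8 flag are passed through untouched, as in the
# snippet above.
sub to_octets {
    my ($str) = @_;
    return Encode::is_utf8($str) ? Encode::encode_utf8($str) : $str;
}

# The code point above 255 forces Perl to store this string with the
# UTF8 flag on.
my $wide   = "caf\x{e9} \x{263a}";
my $octets = to_octets($wide);

# The in-place route: flip the flag on a copy so $wide is left alone.
my $flipped = $wide;
Encode::_utf8_off($flipped);

print "identical octets\n" if $octets eq $flipped;
print length($wide), " characters, ", length($octets), " octets\n";
```

Note that a string holding only characters in the range 128..255 may or
may not carry the UTF8 flag, so the same text can produce different
octets depending on how it was built.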

This means that after decryption the is_utf8 information is lost, but
I don't see a way round that because 1) Perl's not the only language
involved, 2) putting encoding information into the cyphertext would break
backward compatibility and give an attacker a known-plaintext attack.

Would it be worth giving sub decrypt an option to decode the plaintext
into Perl's internal form (if it's well-formed), or should I leave
that to the user and the Encode module ?
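
A minimal sketch of what such an option might do, assuming the decrypted
plaintext arrives as a plain octet string. The helper name
maybe_decode_utf8 is hypothetical; the same few lines could equally live
in the caller's code if this is left to the user and the Encode module.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode the octets as UTF-8 if they are well-formed,
# otherwise hand them back unchanged.
sub maybe_decode_utf8 {
    my ($octets) = @_;
    # decode() with a CHECK argument may modify its source, so use a copy.
    my $copy  = $octets;
    my $chars = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return defined $chars ? $chars : $octets;
}

# Stand-in for the output of decrypt(): the UTF-8 octets of a 6-character
# string.
my $plain = Encode::encode_utf8("caf\x{e9} \x{263a}");
my $text  = maybe_decode_utf8($plain);

# Octets that are not well-formed UTF-8 come back untouched.
my $binary = maybe_decode_utf8("\xff\xfe\x00");

print length($plain), " octets in, ", length($text), " characters out\n";
print "binary data left alone\n" if $binary eq "\xff\xfe\x00";
```

One caveat with this approach: binary plaintext that happens to be
well-formed UTF-8 would be decoded as well, which is an argument for
making it opt-in.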

Guidance gratefully received,  Regards,  Peter

Peter Billam,  DPIWE/ILS/CIT/Servers,  hbt/lnd/l8,  6233 3061
