Change 15169: Undocument the use of .*utf8.*{upgrade,downgrade,encode,decode}

Jarkko Hietaniemi Mon, 11 Mar 2002 05:57:44 -0800

Change 15169 by jhi@alpha on 2002/03/11 12:57:45

        Undocument the use of .*utf8.*{upgrade,downgrade,encode,decode}
        as general purpose encoding transformation interfaces
        since that's not what they are.


Affected files ...

.... //depot/perl/lib/utf8.pm#23 edit
.... //depot/perl/pod/perlunicode.pod#80 edit
.... //depot/perl/sv.c#526 edit

Differences ...

==== //depot/perl/lib/utf8.pm#23 (text) ====
Index: perl/lib/utf8.pm
--- perl/lib/utf8.pm.~1~        Mon Mar 11 06:15:05 2002
+++ perl/lib/utf8.pm    Mon Mar 11 06:15:05 2002
@@ -79,21 +79,29 @@
 
 Converts internal representation of string to the Perl's internal
 I<UTF-X> form.  Returns the number of octets necessary to represent
-the string as I<UTF-X>.
+the string as I<UTF-X>.  Note that this should not be used to convert
+a legacy byte encoding to Unicode: use Encode for that.  Affected
+by the encoding pragma.
 
 =item * utf8::downgrade($string[, CHECK])
 
 Converts internal representation of string to be un-encoded bytes.
+Note that this should not be used to convert Unicode back to a legacy
+byte encoding: use Encode for that.  B<Not> affected by the encoding
+pragma.
 
 =item * utf8::encode($string)
 
-Converts (in-place) I<$string> from logical characters to octet sequence
-representing it in Perl's I<UTF-X> encoding.
+Converts (in-place) I<$string> from logical characters to octet
+sequence representing it in Perl's I<UTF-X> encoding.  Note that this
+should not be used to convert a legacy byte encoding to Unicode: use
+Encode for that.
 
 =item * $flag = utf8::decode($string)
 
 Attempts to convert I<$string> in-place from Perl's I<UTF-X> encoding
-into logical characters.
+into logical characters.  Note that this should not be used to convert
+Unicode back to a legacy byte encoding: use Encode for that.
 
 =back
 

==== //depot/perl/pod/perlunicode.pod#80 (text) ====
Index: perl/pod/perlunicode.pod
--- perl/pod/perlunicode.pod.~1~        Mon Mar 11 06:15:05 2002
+++ perl/pod/perlunicode.pod    Mon Mar 11 06:15:05 2002
@@ -873,6 +873,10 @@
 encoded form.  sv_utf8_downgrade(sv) does the opposite (if possible).
 sv_utf8_encode(sv) is like sv_utf8_upgrade but the UTF8 flag does not
 get turned on.  sv_utf8_decode() does the opposite of sv_utf8_encode().
+Note that none of these are to be used as general purpose encoding/decoding
+interfaces: use Encode for that.  sv_utf8_upgrade() is affected by the
+encoding pragma, but sv_utf8_downgrade() is not (since the encoding
+pragma is designed to be a one-way street).
 
 =item *
 

==== //depot/perl/sv.c#526 (text) ====
Index: perl/sv.c
--- perl/sv.c.~1~       Mon Mar 11 06:15:05 2002
+++ perl/sv.c   Mon Mar 11 06:15:05 2002
@@ -3313,6 +3313,9 @@
 Always sets the SvUTF8 flag to avoid future validity checks even
 if all the bytes have hibit clear.
 
+This is not a general purpose byte encoding to Unicode interface:
+use the Encode extension for that.
+
 =cut
 */
 
@@ -3332,6 +3335,9 @@
 will C<mg_get> on C<sv> if appropriate, else not. C<sv_utf8_upgrade> and
 C<sv_utf8_upgrade_nomg> are implemented in terms of this function.
 
+This is not a general purpose byte encoding to Unicode interface:
+use the Encode extension for that.
+
 =cut
 */
 
@@ -3397,6 +3403,9 @@
 if this is the case, either returns false or, if C<fail_ok> is not
 true, croaks.
 
+This is not a general purpose Unicode to byte encoding interface:
+use the Encode extension for that.
+
 =cut
 */
 
End of Patch.
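
For readers wondering what the distinction drawn in this change looks like in practice, here is a minimal sketch. It assumes a Perl with the Encode module available and no encoding pragma in effect; the variable names are made up for illustration and are not part of the patch. It shows that utf8::upgrade and utf8::downgrade only switch the internal representation of a string, while Encode is what actually converts between a legacy byte encoding and Unicode characters.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character, U+00E9, held in Perl's native (non-UTF-8) form.
my $native = "\xE9";

# utf8::upgrade switches the internal representation only: the string
# still contains the same single character and compares equal.
my $upgraded = $native;
utf8::upgrade($upgraded);
my $same_after_upgrade = ($upgraded eq $native) && length($upgraded) == 1;

# utf8::downgrade switches it back, again without changing the characters.
my $downgraded = $upgraded;
utf8::downgrade($downgraded);
my $same_after_downgrade = ($downgraded eq $native) && !utf8::is_utf8($downgraded);

# Converting a legacy byte encoding to Unicode is a job for Encode.
# In ISO-8859-2 the byte 0xB1 stands for U+0105.
my $latin2_bytes = "\xB1";
my $chars        = decode('iso-8859-2', $latin2_bytes);
my $back_again   = encode('iso-8859-2', $chars);

# utf8::encode / utf8::decode only move between characters and the
# UTF-8 octets that represent them; no legacy encoding is involved.
my $octets = "\x{105}";
utf8::encode($octets);                  # now the two octets 0xC4 0x85
my $octet_count = length($octets);
my $decode_ok   = utf8::decode($octets);

printf "upgrade kept the string intact:   %s\n", $same_after_upgrade   ? "yes" : "no";
printf "downgrade kept the string intact: %s\n", $same_after_downgrade ? "yes" : "no";
printf "ISO-8859-2 0xB1 decodes to U+%04X\n", ord($chars);
```

Had the ISO-8859-2 byte been passed through utf8::upgrade instead of decode(), it would still be the single character U+00B1, which is why these functions are not encoding converters.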
