[perl.git] branch blead, updated. v5.25.9-47-gc83d5090d8

Karl Williamson Thu, 26 Jan 2017 06:16:54 -0800

In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/c83d5090d8022c6cf4240c0a13309bcd1ccbfaed?hp=c1ac151ab04e29c3a8a22e7035487bd0d8793f63>


- Log -----------------------------------------------------------------
commit c83d5090d8022c6cf4240c0a13309bcd1ccbfaed
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Jan 25 22:33:22 2017 -0700

    perlapi: Fix grammar

M       toke.c

commit 8e179dd8df306c5088bf6c15b494826d48278928
Author: Pali <p...@cpan.org>
Date:   Sun Sep 18 17:25:48 2016 +0200

    pod: Suggest to use strict UTF-8 encoding when dealing with external data
    
    For data exchange it is not a good idea to use Perl's non-strict,
    extended dialect of the utf8 encoding.

M       pod/perldiag.pod
M       pod/perlfunc.pod
M       pod/perlpacktut.pod
M       pod/perlunicode.pod
M       pod/perlunicook.pod
M       pod/perlunifaq.pod
M       pod/perluniintro.pod

commit 96b108235b7a4c239dbc0251abf17c3ef015c4d8
Author: Pali <p...@cpan.org>
Date:   Sun Sep 18 17:21:54 2016 +0200

    perluniintro: Encode::encode_utf8() not always appropriate
    
    Do not suggest using Encode::encode_utf8() when you need to know the
    byte length of a string. The Encode module could do some additional
    operations, and the bytes pragma is supposed to do that job.

M       pod/perluniintro.pod

commit f8ac05207854091347d4c59c31cabb61ff952919
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Jan 26 07:10:00 2017 -0700

    Add Pali to AUTHORS

M       AUTHORS
-----------------------------------------------------------------------
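
As a rough illustration of the strict-versus-lax distinction that commit 8e179dd8df is about, here is a minimal sketch. The helper name is_strict_utf8 and the sample byte strings are assumptions made for illustration and are not part of the commits. It relies only on Encode::decode, Encode::find_encoding, and the FB_CROAK and LEAVE_SRC check constants.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of the commits): returns 1 if $bytes is
# well-formed under the strict "UTF-8" encoding, 0 otherwise.
sub is_strict_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # decode() from modifying the string we pass in.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return eval { Encode::decode('UTF-8', $bytes, $check); 1 } ? 1 : 0;
}

# "UTF-8" resolves to the strict encoding, "utf8" to Perl's lax dialect.
my $strict_name = Encode::find_encoding('UTF-8')->name;
my $lax_name    = Encode::find_encoding('utf8')->name;

printf "%s / %s\n", $strict_name, $lax_name;
printf "valid: %d, malformed: %d, surrogate: %d\n",
    is_strict_utf8("\xC4\x80"),      # U+0100, well-formed
    is_strict_utf8("\xC3\x28"),      # lead byte without a continuation byte
    is_strict_utf8("\xED\xA0\x80");  # surrogate U+D800, rejected by strict UTF-8
```

The sketch only exercises the strict encoding. How the lax "utf8" dialect treats the same inputs is described in the Encode documentation and is deliberately not asserted here.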

Summary of changes:
 AUTHORS              |  1 +
 pod/perldiag.pod     |  2 +-
 pod/perlfunc.pod     |  4 ++--
 pod/perlpacktut.pod  |  7 ++++---
 pod/perlunicode.pod  |  8 ++++----
 pod/perlunicook.pod  |  8 ++++----
 pod/perlunifaq.pod   |  6 ++++--
 pod/perluniintro.pod | 15 ++++++---------
 toke.c               |  2 +-
 9 files changed, 27 insertions(+), 26 deletions(-)
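
The perlfunc and perluniintro hunks below refer to two different notions of "byte length". The following sketch contrasts them under stated assumptions: the function names internal_byte_length and utf8_encoded_length are hypothetical, chosen only for this example, and the comparison for chr(0xE9) assumes a string that Perl has not upgraded to its internal UTF-8 form.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: length of Perl's internal representation, in bytes.
sub internal_byte_length {
    my ($s) = @_;
    use bytes;              # lexically scoped to this sub body
    return length($s);
}

# Hypothetical helper: length of the string once encoded as strict UTF-8.
sub utf8_encoded_length {
    my ($s) = @_;
    return length(Encode::encode('UTF-8', $s));
}

my $wide  = chr(0x100);     # needs the internal UTF-8 form
my $latin = chr(0xE9);      # fits in one native byte, not upgraded

printf "U+0100: chars=%d internal=%d encoded=%d\n",
    length($wide), internal_byte_length($wide), utf8_encoded_length($wide);
printf "U+00E9: chars=%d internal=%d encoded=%d\n",
    length($latin), internal_byte_length($latin), utf8_encoded_length($latin);
```

For chr(0x100) both measures agree. For a non-upgraded string such as chr(0xE9) they can differ, which is one reason the two documents recommend different tools for different questions.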

diff --git a/AUTHORS b/AUTHORS
index 6c2dd2131f..4e4756b494 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -930,6 +930,7 @@ Ollivier Robert                     <robe...@keltia.freenix.fr>
 Osvaldo Villalon               <ovilla...@dextratech.com>
 Owain G. Ainsworth             <o...@nicotinebsd.org>
 Owen Taylor                    <o...@cornell.edu>
+Pali                           <p...@cpan.org>
 Papp Zoltan                    <pa...@elte.hu>
 parv                           <p...@pair.com>
 Pascal Rigaux                  <pi...@mandriva.com>
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 585c512753..76edb9b1da 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -3407,7 +3407,7 @@ the variable, C<%s>, part of the message.
 
 One possible cause is that you set the UTF8 flag yourself for data that
 you thought to be in UTF-8 but it wasn't (it was for example legacy
-8-bit data).  To guard against this, you can use Encode::decode_utf8.
+8-bit data).  To guard against this, you can use C<Encode::decode('UTF-8', ...)>.
 
 If you use the C<:encoding(UTF-8)> PerlIO layer for input, invalid byte
 sequences are handled gracefully, but if you use C<:utf8>, the flag is
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 1e32cca6dd..d4dc2dfd53 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -3763,8 +3763,8 @@ many elements these have.  For that, use C<scalar @array> and C<scalar keys
 Like all Perl character operations, L<C<length>|/length EXPR> normally
 deals in logical
 characters, not physical bytes.  For how many bytes a string encoded as
-UTF-8 would take up, use C<length(Encode::encode_utf8(EXPR))> (you'll have
-to C<use Encode> first).  See L<Encode> and L<perlunicode>.
+UTF-8 would take up, use C<length(Encode::encode('UTF-8', EXPR))>
+(you'll have to C<use Encode> first).  See L<Encode> and L<perlunicode>.
 
 =item __LINE__
 X<__LINE__>
diff --git a/pod/perlpacktut.pod b/pod/perlpacktut.pod
index f40d1c2a93..f6a9411c8f 100644
--- a/pod/perlpacktut.pod
+++ b/pod/perlpacktut.pod
@@ -668,9 +668,10 @@ Usually you'll want to pack or unpack UTF-8 strings:
    my @hebrew = unpack( 'U*', $utf );
 
 Please note: in the general case, you're better off using
-Encode::decode_utf8 to decode a UTF-8 encoded byte string to a Perl
-Unicode string, and Encode::encode_utf8 to encode a Perl Unicode string
-to UTF-8 bytes. These functions provide means of handling invalid byte
+L<C<Encode::decode('UTF-8', $utf)>|Encode/decode> to decode a UTF-8
+encoded byte string to a Perl Unicode string, and
+L<C<Encode::encode('UTF-8', $str)>|Encode/encode> to encode a Perl Unicode
+string to UTF-8 bytes. These functions provide means of handling invalid byte
 sequences and generally have a friendlier interface.
 
 =head2 Another Portable Binary Encoding
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 33e52b31b3..ba5e312d02 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -1904,7 +1904,7 @@ check the documentation to verify if this is still true.
 
   if ($] > 5.008) {
     require Encode;
-    $val = Encode::encode_utf8($val); # make octets
+    $val = Encode::encode("UTF-8", $val); # make octets
   }
 
 =item *
@@ -1916,7 +1916,7 @@ want the UTF8 flag restored:
 
   if ($] > 5.008) {
     require Encode;
-    $val = Encode::decode_utf8($val);
+    $val = Encode::decode("UTF-8", $val);
   }
 
 =item *
@@ -2017,8 +2017,8 @@ Perl's internal representation like so:
     sub my_escape_html ($) {
         my($what) = shift;
         return unless defined $what;
-        Encode::decode_utf8(Foo::Bar::escape_html(
-                                         Encode::encode_utf8($what)));
+        Encode::decode("UTF-8", Foo::Bar::escape_html(
+                                     Encode::encode("UTF-8", $what)));
     }
 
 Sometimes, when the extension does not convert data but just stores
diff --git a/pod/perlunicook.pod b/pod/perlunicook.pod
index ac305098eb..9a8d4daaa8 100644
--- a/pod/perlunicook.pod
+++ b/pod/perlunicook.pod
@@ -234,8 +234,8 @@ C<binmode> as described later below.
  or
      $ export PERL_UNICODE=A
  or
-    use Encode qw(decode_utf8);
-    @ARGV = map { decode_utf8($_, 1) } @ARGV;
+    use Encode qw(decode);
+    @ARGV = map { decode('UTF-8', $_, 1) } @ARGV;
 
 =head2 ℞ 14: Decode program arguments as locale encoding
 
@@ -289,8 +289,8 @@ Files opened without an encoding argument will be in UTF-8:
      $ export PERL_UNICODE=SDA
  or
      use open qw(:std :utf8);
-     use Encode qw(decode_utf8);
-     @ARGV = map { decode_utf8($_, 1) } @ARGV;
+     use Encode qw(decode);
+     @ARGV = map { decode('UTF-8', $_, 1) } @ARGV;
 
 =head2 ℞ 19: Open file with specific encoding
 
diff --git a/pod/perlunifaq.pod b/pod/perlunifaq.pod
index 4135fbaeb2..ba391d423f 100644
--- a/pod/perlunifaq.pod
+++ b/pod/perlunifaq.pod
@@ -199,7 +199,9 @@ or by letting automatic decoding and encoding do all the work:
 =head2 What are C<decode_utf8> and C<encode_utf8>?
 
 These are alternate syntaxes for C<decode('utf8', ...)> and C<encode('utf8',
-...)>.
+...)>. Do not use these functions for data exchange. Instead use
+C<decode('UTF-8', ...)> and C<encode('UTF-8', ...)>; see
+L</What's the difference between UTF-8 and utf8?> below.
 
 =head2 What is a "wide character"?
 
@@ -283,7 +285,7 @@ C<UTF-8> is the official standard. C<utf8> is Perl's way of being liberal in
 what it accepts. If you have to communicate with things that aren't so liberal,
 you may want to consider using C<UTF-8>. If you have to communicate with things
 that are too liberal, you may have to use C<utf8>. The full explanation is in
-L<Encode>.
+L<Encode/"UTF-8 vs. utf8 vs. UTF8">.
 
 C<UTF-8> is internally known as C<utf-8-strict>. The tutorial uses UTF-8
 consistently, even where utf8 is actually used internally, because the
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index cd62d4c126..5a865c9912 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -729,16 +729,13 @@ the output string will be UTF-8-encoded C<ab\x80c = \x{100}\n>, but
 C<$a> will stay byte-encoded.
 
 Sometimes you might really need to know the byte length of a string
-instead of the character length. For that use either the
-C<Encode::encode_utf8()> function or the C<bytes> pragma
+instead of the character length. For that use the C<bytes> pragma
 and the C<length()> function:
 
     my $unicode = chr(0x100);
     print length($unicode), "\n"; # will print 1
-    require Encode;
-    print length(Encode::encode_utf8($unicode)),"\n"; # will print 2
     use bytes;
-    print length($unicode), "\n"; # will also print 2
+    print length($unicode), "\n"; # will print 2
                                   # (the 0xC4 0x80 of the UTF-8)
     no bytes;
 
@@ -755,12 +752,12 @@ How Do I Detect Data That's Not Valid In a Particular Encoding?
 Use the C<Encode> package to try converting it.
 For example,
 
-    use Encode 'decode_utf8';
+    use Encode 'decode';
 
-    if (eval { decode_utf8($string, Encode::FB_CROAK); 1 }) {
-        # $string is valid utf8
+    if (eval { decode('UTF-8', $string, Encode::FB_CROAK); 1 }) {
+        # $string is valid UTF-8
     } else {
-        # $string is not valid utf8
+        # $string is not valid UTF-8
     }
 
 Or use C<unpack> to try decoding it:
diff --git a/toke.c b/toke.c
index 61ea45da9b..864c5269c3 100644
--- a/toke.c
+++ b/toke.c
@@ -669,7 +669,7 @@ S_cr_textfilter(pTHX_ int idx, SV *sv, int maxlen)
 Creates and initialises a new lexer/parser state object, supplying
 a context in which to lex and parse from a new source of Perl code.
 A pointer to the new state object is placed in L</PL_parser>.  An entry
-is made on the save stack so that upon unwinding the new state object
+is made on the save stack so that upon unwinding, the new state object
 will be destroyed and the former value of L</PL_parser> will be restored.
 Nothing else need be done to clean up the parsing context.
 

--
Perl5 Master Repository
