[perl.git] branch blead, updated. v5.15.8-221-gf6067ad

Karl Williamson Mon, 19 Mar 2012 22:35:18 -0700

In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/f6067adc61108c3398de698bb0294d95f09b55ef?hp=41c3b428c4a3ce29a0f80c7f63eda133089137de>


- Log -----------------------------------------------------------------
commit f6067adc61108c3398de698bb0294d95f09b55ef
Author: Karl Williamson
Date:   Mon Mar 19 23:15:07 2012 -0600

    charnames: Clarify viacode pod
    
    This mentions that viacode's return can change as a result of
    corrections to the Unicode standard.

M       lib/charnames.pm

commit ffec675822f6354e94f29a96daa07ef9465a43bc
Author: Karl Williamson
Date:   Mon Mar 19 23:14:28 2012 -0600

    charnames pod: slight rewording

M       lib/charnames.pm

commit 228e8c7b6cef1e12cb12f45083fdcae7b35fba27
Author: Karl Williamson
Date:   Mon Mar 19 22:10:18 2012 -0600

    charnames: re-order pod sections
    
    This merely moves one =head1 section to later in the pod, so that future
    changes will make more sense; and it has to bump the version.

M       lib/_charnames.pm
M       lib/charnames.pm
-----------------------------------------------------------------------

Summary of changes:
 lib/_charnames.pm |    2 +-
 lib/charnames.pm  |  109 ++++++++++++++++++++++++++++++++++------------------
 2 files changed, 72 insertions(+), 39 deletions(-)

diff --git a/lib/_charnames.pm b/lib/_charnames.pm
index 02dbef0..d29af30 100644
--- a/lib/_charnames.pm
+++ b/lib/_charnames.pm
@@ -7,7 +7,7 @@ package _charnames;
 use strict;
 use warnings;
 use File::Spec;
-our $VERSION = '1.29';
+our $VERSION = '1.30';
 use unicore::Name;    # mktables-generated algorithmically-defined names
 
 use bytes ();          # for $bytes::hint_bits
diff --git a/lib/charnames.pm b/lib/charnames.pm
index 07c1b70..495c303 100644
--- a/lib/charnames.pm
+++ b/lib/charnames.pm
@@ -1,7 +1,7 @@
 package charnames;
 use strict;
 use warnings;
-our $VERSION = '1.29';
+our $VERSION = '1.30';
 use unicore::Name;    # mktables-generated algorithmically-defined names
 use _charnames ();    # The submodule for this where most of the work gets done
 
@@ -328,43 +328,6 @@ Also, both these methods currently allow only single characters to be named.
 To name a sequence of characters, use a
 L<custom translator|/CUSTOM TRANSLATORS> (described below).
 
-=head1 charnames::viacode(I<code>)
-
-Returns the full name of the character indicated by the numeric code.
-For example,
-
-    print charnames::viacode(0x2722);
-
-prints "FOUR TEARDROP-SPOKED ASTERISK".
-
-The name returned is the official name for the code point, if
-available; otherwise your custom alias for it.  This means that your
-alias will only be returned for code points that don't have an official
-Unicode name (nor alias) such as private use code points.
-Until Unicode 6.1, the 4 control characters U+0080, U+0081, U+0084, and U+0099
-did not have names (actually, to be precise they still don't, but they do have
-aliases, which for most purposes are indistiunguishable from true names).
-To preserve backwards compatibility, any alias you define for these code
-points will be returned by this function, in preference to the official alias.
-
-If you define more than one name for the code point, it is indeterminate
-which one will be returned.
-
-The function returns C<undef> if no name is known for the code point.
-In Unicode the proper name of these is the empty string, which
-C<undef> stringifies to.  (If you ask for a code point past the legal
-Unicode maximum of U+10FFFF that you haven't assigned an alias to, you
-get C<undef> plus a warning.)
-
-The input number must be a non-negative integer, or a string beginning
-with C<"U+"> or C<"0x"> with the remainder considered to be a
-hexadecimal integer.  A literal numeric constant must be unsigned; it
-will be interpreted as hex if it has a leading zero or contains
-non-decimal hex digits; otherwise it will be interpreted as decimal.
-
-Notice that the name returned for U+FEFF is "ZERO WIDTH NO-BREAK
-SPACE", not "BYTE ORDER MARK".
-
 =head1 charnames::string_vianame(I<name>)
 
 This is a runtime equivalent to C<\N{...}>.  I<name> can be any expression
@@ -397,6 +360,76 @@ character, even ones that aren't legal under the C<S<use bytes>> pragma,
 See L</BUGS> for the circumstances in which the behavior differs
 from  that described above.
 
+=head1 charnames::viacode(I<code>)
+
+Returns the full name of the character indicated by the numeric code.
+For example,
+
+    print charnames::viacode(0x2722);
+
+prints "FOUR TEARDROP-SPOKED ASTERISK".
+
+The name returned is the "best" (defined below) official name or alias
+for the code point, if
+available; otherwise your custom alias for it, if defined; otherwise C<undef>.
+This means that your alias will only be returned for code points that don't
+have an official Unicode name (nor alias) such as private use code points.
+
+If you define more than one name for the code point, it is indeterminate
+which one will be returned.
+
+As mentioned, the function returns C<undef> if no name is known for the code
+point.  In Unicode the proper name of these is the empty string, which
+C<undef> stringifies to.  (If you ask for a code point past the legal
+Unicode maximum of U+10FFFF that you haven't assigned an alias to, you
+get C<undef> plus a warning.)
+
+The input number must be a non-negative integer, or a string beginning
+with C<"U+"> or C<"0x"> with the remainder considered to be a
+hexadecimal integer.  A literal numeric constant must be unsigned; it
+will be interpreted as hex if it has a leading zero or contains
+non-decimal hex digits; otherwise it will be interpreted as decimal.
+
+As mentioned above under L</ALIASES>, Unicode 6.1 defines extra names
+(synonyms or aliases) for some code points, most of which were already
+available as Perl extensions.  All these are accepted by C<\N{...}> and the
+other functions in this module, but C<viacode> has to choose which one
+name to return for a given input code point, so it returns the "best" name.
+To understand how this works, it is helpful to know more about the Unicode
+name properties.  All code points actually have only a single name, which
+(starting in Unicode 2.0) can never change once a character has been assigned
+to the code point.  But mistakes have been made in assigning names, for
+example sometimes a clerical error was made during the publishing of the
+Standard which caused words to be misspelled, and there was no way to correct
+those.  The Name_Alias property was eventually created to handle these
+situations.  If a name was wrong, a corrected synonym would be published for
+it, using Name_Alias.  C<viacode> will return that corrected synonym as the
+"best" name for a code point.  (It is even possible, though it hasn't happened
+yet, that the correction itself will need to be corrected, and so another
+Name_Alias can be created for that code point; C<viacode> will return the
+most recent correction.)
+
+The Unicode name for each of the control characters (such as LINE FEED) is the
+empty string.  However almost all had names assigned by other standards, such
+as the ASCII Standard, or were in common use.  C<viacode> returns these names
+as the "best" ones available.  Unicode 6.1 has created Name_Aliases for each
+of them, including alternate names, like NEW LINE.  C<viacode> uses the
+original name, "LINE FEED" in preference to the alternate.  Similarly the
+name returned for U+FEFF is "ZERO WIDTH NO-BREAK SPACE", not "BYTE ORDER
+MARK".
+
+Until Unicode 6.1, the 4 control characters U+0080, U+0081, U+0084, and U+0099
+did not have names nor aliases.
+To preserve backwards compatibility, any alias you define for these code
+points will be returned by this function, in preference to the official name.
+
+Some code points also have abbreviated names, such as "LF" or "NL".
+C<viacode> never returns these.
+
+Because a name correction may be added in future Unicode releases, the name
+that C<viacode> returns may change as a result.  This is a rare event, but it
+does happen.
+
 =head1 CUSTOM TRANSLATORS
 
 The mechanism of translation of C<\N{...}> escapes is general and not
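
For readers who want to try the behavior the revised viacode pod describes,
here is a minimal sketch. It is not part of the commit. The helper
name_or_placeholder() is hypothetical and exists only for this illustration,
and the names returned depend on the Unicode version bundled with the perl
that runs it.

```perl
use strict;
use warnings;
use charnames ();    # viacode() is callable without importing \N{...} support

# Hypothetical helper (not part of charnames): return the name that
# viacode() reports, or a visible placeholder when it returns undef.
sub name_or_placeholder {
    my ($code) = @_;
    my $name = charnames::viacode($code);
    return defined $name ? $name : '(no name)';
}

# The same code point, U+2722, in each input form the pod allows:
# a hex literal, a decimal integer, and "U+" / "0x" prefixed strings.
for my $code (0x2722, 10018, "U+2722", "0x2722") {
    printf "%-8s => %s\n", $code, name_or_placeholder($code);
}

# U+FEFF has more than one official name; the pod says viacode()
# returns "ZERO WIDTH NO-BREAK SPACE" rather than "BYTE ORDER MARK".
print name_or_placeholder(0xFEFF), "\n";

# A private-use code point has no official name, so with no custom
# alias defined the pod says viacode() returns undef.
print name_or_placeholder(0xE000), "\n";
```

Each of the four input forms in the loop should resolve to the name the pod
gives for 0x2722, "FOUR TEARDROP-SPOKED ASTERISK". Because a name correction
may be added in a later Unicode release, code that compares viacode's return
value against a hard-coded string should expect that string to change
occasionally.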

--
Perl5 Master Repository
