In perl.git, the branch blead has been updated <http://perl5.git.perl.org/perl.git/commitdiff/21da7284d6090cdcf2be93de47dfe3e32cbaf6c4?hp=a5ab225509d435feb91e537a15f319832528ca1f>
- Log -----------------------------------------------------------------
commit 21da7284d6090cdcf2be93de47dfe3e32cbaf6c4
Author: Karl Williamson <[email protected]>
Date:   Sun Dec 18 13:57:46 2016 -0700

    perlapi: Add explanation for why certain macros don't exist.

    This also fixes some orphaned references.
-----------------------------------------------------------------------

Summary of changes:
 handy.h | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/handy.h b/handy.h
index 1eb88923bf..848050f333 100644
--- a/handy.h
+++ b/handy.h
@@ -783,6 +783,16 @@ Returns the value of an ASCII-range hex digit and advances the string pointer.
 Behaviour is only well defined when isXDIGIT(*str) is true.
 
 =head1 Character case changing
+Perl uses "full" Unicode case mappings. This means that converting a single
+character to another case may result in a sequence of more than one character.
+For example, the uppercase of C<E<223>> (LATIN SMALL LETTER SHARP S) is the two
+character sequence C<SS>. This presents some complications The lowercase of
+all characters in the range 0..255 is a single character, and thus
+C<L</toLOWER_L1>> is furnished. But, C<toUPPER_L1> can't exist, as it couldn't
+return a valid result for all legal inputs. Instead C<L</toUPPER_uvchr>> has
+an API that does allow every possible legal result to be returned.) Likewise
+no other function that is crippled by not being able to give the correct
+results for the full range of possible inputs has been implemented here.
 
 =for apidoc Am|U8|toUPPER|U8 ch
 Converts the specified character to uppercase. If the input is anything but an
@@ -797,7 +807,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the uppercase version may be longer than the original character.
 
 The first code point of the uppercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more.)
 
 =for apidoc Am|UV|toUPPER_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its uppercase version, and
@@ -806,7 +817,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the uppercase version may be longer than the original character.
 
 The first code point of the uppercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
@@ -824,7 +836,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the foldcase version may be longer than the original character.
 
 The first code point of the foldcased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toFOLD_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its foldcase version, and
@@ -833,7 +846,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the foldcase version may be longer than the original character.
 
 The first code point of the foldcased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
@@ -858,7 +872,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the lowercase version may be longer than the original character.
 
 The first code point of the lowercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toLOWER_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its lowercase version, and
@@ -867,7 +882,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the lowercase version may be longer than the original character.
 
 The first code point of the lowercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
@@ -886,7 +902,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the titlecase version may be longer than the original character.
 
 The first code point of the titlecased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toTITLE_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its titlecase version, and
@@ -895,7 +912,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the titlecase version may be longer than the original character.
 
 The first code point of the titlecased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
-- 
Perl5 Master Repository
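
Illustration (not part of the commit): the asymmetry described in the new "Character case changing" text can be sketched in plain C without any Perl headers. The two functions below, demo_lower_latin1 and demo_upper_latin1, are hypothetical names invented for this note, and they are not how handy.h implements anything. They assume Latin-1/ASCII code point values (not EBCDIC) and cover only inputs 0..255. The point is the shape of the signatures: lowercasing fits a one-byte return value, while uppercasing needs an output buffer and a length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: lowercase of a code point in 0..255.
 * Every result is a single code point that is itself in 0..255,
 * which is why a one-byte-in, one-byte-out interface is possible. */
uint8_t demo_lower_latin1(uint8_t ch)
{
    if (ch >= 0x41 && ch <= 0x5A)               /* A..Z */
        return (uint8_t)(ch + 0x20);
    if (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7) /* U+00C0..U+00DE, skipping the multiplication sign */
        return (uint8_t)(ch + 0x20);
    return ch;
}

/* Hypothetical helper: full uppercase of a code point in 0..255.
 * The result can be two code points, or one code point above 255,
 * so it is written to a caller-supplied buffer and the count is returned. */
size_t demo_upper_latin1(uint8_t ch, uint32_t out[2])
{
    if (ch == 0xDF) {                           /* LATIN SMALL LETTER SHARP S -> "SS" */
        out[0] = 0x53;
        out[1] = 0x53;
        return 2;
    }
    if (ch == 0xFF) {                           /* y with diaeresis -> U+0178 */
        out[0] = 0x178;
        return 1;
    }
    if (ch == 0xB5) {                           /* MICRO SIGN -> U+039C */
        out[0] = 0x39C;
        return 1;
    }
    if ((ch >= 0x61 && ch <= 0x7A) ||           /* a..z */
        (ch >= 0xE0 && ch <= 0xFE && ch != 0xF7)) { /* U+00E0..U+00FE, skipping the division sign */
        out[0] = (uint32_t)ch - 0x20;
        return 1;
    }
    out[0] = ch;                                /* no case mapping: unchanged */
    return 1;
}
```

Calling demo_upper_latin1(0xDF, out) fills out with two code points, which a function returning a single byte could not express; that is the situation the commit message gives as the reason C<toUPPER_L1> does not exist.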
