In perl.git, the branch smoke-me/jkeenan/131337-craig has been updated <http://perl5.git.perl.org/perl.git/commitdiff/19fae8342f6294b0c93105833f278ee0084fb412?hp=8d2660b2367a4d947f69ec307cebdf4a159cdd33>
discards  8d2660b2367a4d947f69ec307cebdf4a159cdd33 (commit)

- Log -----------------------------------------------------------------
commit 19fae8342f6294b0c93105833f278ee0084fb412
Author: James E Keenan <[email protected]>
Date:   Sun May 21 22:16:23 2017 -0400

    Patch suggested by Craig Berry for RT 131337.
-----------------------------------------------------------------------

Summary of changes:
 pod/perldelta.pod | 376 +++++++++++++++++++++++++++++------------------------
 1 file changed, 200 insertions(+), 176 deletions(-)

diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index 4a30a164fd..0e6dada11f 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -15,9 +15,9 @@ This release includes three updates with widespread effects: =over 4 -=item * C<.> no longer in C<@INC> +=item * C<"."> no longer in C<@INC> -For security reasons, the current directory (C<.>) is no longer included +For security reasons, the current directory (C<".">) is no longer included by default at the end of the module search path (C<@INC>). This may have widespread implications for the building, testing and installing of modules, and for the execution of scripts. See the section @@ -27,7 +27,7 @@ for the full details. =item * C<do> may now warn C<do> now gives a deprecation warning when it fails to load a file which -it would have loaded had C<.> been in C<@INC>. +it would have loaded had C<"."> been in C<@INC>. =item * In regular expression patterns, a literal left brace C<"{"> should be escaped @@ -38,28 +38,17 @@ See L</Unescaped literal C<"{"> characters in regular expression patterns are no =head1 Core Enhancements -=head2 New regular expression modifier C</xx> - -Specifying two C<x> characters to modify a regular expression pattern -does everything that a single one does, but additionally TAB and SPACE -characters within a bracketed character class are generally ignored and -can be added to improve readability, like -S<C</[ ^ A-Z d-f p-x ]/xx>>.
Details are at -L<perlre/E<sol>x and E<sol>xx>. - -=head2 New Hash Function For 64-bit Builds - -We have switched to a hybrid hash function to better balance -performance for short and long keys. +=head2 Lexical subroutines are no longer experimental -For short keys, 16 bytes and under, we use an optimised variant of -One At A Time Hard, and for longer keys we use Siphash 1-3. For very -long keys this is a big improvement in performance. For shorter keys -there is a modest improvement. +Using the C<lexical_subs> feature introduced in v5.18 no longer emits a warning. Existing +code that disables the C<experimental::lexical_subs> warning category +that the feature previously used will continue to work. The +C<lexical_subs> feature has no effect; all Perl code can use lexical +subroutines, regardless of what feature declarations are in scope. =head2 Indented Here-documents -This adds a new modifier '~' to here-docs that tells the parser +This adds a new modifier C<"~"> to here-docs that tells the parser that it should look for /^\s*$DELIM\n/ as the closing delimiter. These syntaxes are all supported: @@ -73,7 +62,7 @@ These syntaxes are all supported: <<~ "EOF"; <<~ `EOF`; -The '~' modifier will strip, from each line in the here-doc, the +The C<"~"> modifier will strip, from each line in the here-doc, the same whitespace that appears before the delimiter. Newlines will be copied as-is, and lines that don't include the @@ -89,6 +78,15 @@ For example: prints "Hello there\n" with no leading whitespace. +=head2 New regular expression modifier C</xx> + +Specifying two C<"x"> characters to modify a regular expression pattern +does everything that a single one does, but additionally TAB and SPACE +characters within a bracketed character class are generally ignored and +can be added to improve readability, like +S<C</[ ^ A-Z d-f p-x ]/xx>>. Details are at +L<perlre/E<sol>x and E<sol>xx>. 
+ =head2 @{^CAPTURE}, %{^CAPTURE}, and %{^CAPTURE_ALL} C<@{^CAPTURE}> exposes the capture buffers of the last match as an @@ -106,6 +104,20 @@ C<%{^CAPTURE_ALL}> is the equivalent to C<%-> (I<i.e.>, all named captures). Other than being more self documenting there is no difference between the two forms. +=head2 Declaring a reference to a variable + +As an experimental feature, Perl now allows the referencing operator to come +after L<C<my()>|perlfunc/my>, L<C<state()>|perlfunc/state>, +L<C<our()>|perlfunc/our>, or L<C<local()>|perlfunc/local>. This syntax must +be enabled with C<use feature 'declared_refs'>. It is experimental, and will +warn by default unless C<no warnings 'experimental::refaliasing'> is in effect. +It is intended mainly for use in assignments to references. For example: + + use experimental 'refaliasing', 'declared_refs'; + my \$a = \$b; + +See L<perlref/Assigning to References> for more details. + =head2 Unicode 9.0 is now supported A list of changes is at L<http://www.unicode.org/versions/Unicode9.0.0/>. @@ -123,20 +135,6 @@ programs that very specifically needed the old behavior. The meaning of compound forms, like C<\p{sc=I<script>}> are unchanged. See L<perlunicode/Scripts>. -=head2 Declaring a reference to a variable - -As an experimental feature, Perl now allows the referencing operator to come -after L<C<my()>|perlfunc/my>, L<C<state()>|perlfunc/state>, -L<C<our()>|perlfunc/our>, or L<C<local()>|perlfunc/local>. This syntax must -be enabled with C<use feature 'declared_refs'>. It is experimental, and will -warn by default unless C<no warnings 'experimental::refaliasing'> is in effect. -It is intended mainly for use in assignments to references. For example: - - use experimental 'refaliasing', 'declared_refs'; - my \$a = \$b; - -See L<perlref/Assigning to References> for more details. - =head2 Perl can now do default collation in UTF-8 locales on platforms that support it @@ -155,14 +153,6 @@ ignored at the higher priority ones. 
There are still some gotchas in some strings, though. See L<perllocale/Collation of strings containing embedded C<NUL> characters>. -=head2 Lexical subroutines are no longer experimental - -Using the C<lexical_subs> feature introduced in v5.18 no longer emits a warning. Existing -code that disables the C<experimental::lexical_subs> warning category -that the feature previously used will continue to work. The -C<lexical_subs> feature has no effect; all Perl code can use lexical -subroutines, regardless of what feature declarations are in scope. - =head2 C<CORE> subroutines for hash and array functions callable via reference @@ -172,23 +162,28 @@ be called with ampersand syntax (C<&CORE::keys(\%hash>) and via reference (C<< my $k = \&CORE::keys; $k-E<gt>(\%hash) >>). Previously they could only be used when inlined. -=head2 for XS code, create a safer utf8_hop() called utf8_hop_safe() +=head2 New Hash Function For 64-bit Builds -Unlike utf8_hop(), utf8_hop_safe() won't navigate before the beginning or after -the end of the supplied buffer. +We have switched to a hybrid hash function to better balance +performance for short and long keys. + +For short keys, 16 bytes and under, we use an optimised variant of +One At A Time Hard, and for longer keys we use Siphash 1-3. For very +long keys this is a big improvement in performance. For shorter keys +there is a modest improvement. =head1 Security -=head2 Removal of the current directory (C<.>) from C<@INC> +=head2 Removal of the current directory (C<".">) from C<@INC> The perl binary includes a default set of paths in C<@INC>. Historically -it has also included the current directory (C<.>) as the final entry, +it has also included the current directory (C<".">) as the final entry, unless run with taint mode enabled (C<perl -T>). 
While convenient, this has security implications: for example, where a script attempts to load an optional module when its current directory is untrusted (such as F</tmp>), it could load and execute code from under that directory. -Starting with v5.26, C<.> is always removed by default, not just under +Starting with v5.26, C<"."> is always removed by default, not just under tainting. This has major implications for installing modules and executing scripts. @@ -200,7 +195,7 @@ issues. =item * C<Configure -Udefault_inc_excludes_dot> There is a new C<Configure> option, C<default_inc_excludes_dot> (enabled -by default) which builds a perl executable without C<.>; unsetting this +by default) which builds a perl executable without C<".">; unsetting this option using C<-U> reverts perl to the old behaviour. This may fix your path issues but will reintroduce all the security concerns, so don't build a perl executable like this unless you're I<really> confident that @@ -209,8 +204,8 @@ such issues are not a concern in your environment. =item * C<PERL_USE_UNSAFE_INC> There is a new environment variable recognised by the perl interpreter. -If this variable has the value C<1> when the perl interpreter starts up, -then C<.> will be automatically appended to C<@INC> (except under tainting). +If this variable has the value 1 when the perl interpreter starts up, +then C<"."> will be automatically appended to C<@INC> (except under tainting). This allows you restore the old perl interpreter behaviour on a case-by-case basis. But note that this is intended to be a temporary crutch, @@ -224,7 +219,7 @@ will not reintroduce any security concerns. While it is well-known that C<use> and C<require> use C<@INC> to search for the file to load, many people don't realise that C<do "file"> also -searches C<@INC> if the file is a relative path. With the removal of C<.>, +searches C<@INC> if the file is a relative path. 
With the removal of C<".">, a simple C<do "file.pl"> will fail to read in and execute C<file.pl> from the current directory. Since this is commonly expected behaviour, a new mandatory warning is now issued whenever C<do> fails to load a file which @@ -242,7 +237,7 @@ their software work in the new regime. If the issue is within your own code (rather than within included modules), then you have two main options. Firstly, if you are confident that your script will only be run within a trusted directory (under which -you expect to find trusted files and modules), then add C<.> back into the +you expect to find trusted files and modules), then add C<"."> back into the path; I<e.g.>: BEGIN { @@ -309,7 +304,7 @@ assess the resultant risks first. If you maintain a CPAN distribution, it may need updating to run in a dotless environment. Although C<cpan> and other such tools will currently set the C<PERL_USE_UNSAFE_INC> during module build, this is a -temporary workaround for the set of modules which rely on C<.> being in +temporary workaround for the set of modules which rely on C<"."> being in C<@INC> for installation and testing, and this may mask deeper issues. It could result in a module which passes tests and installs, but which fails at run time. @@ -477,13 +472,16 @@ delimiter that isn't a grapheme by itself. These are unlikely to exist in actual code, as they would typically display as attached to the character in front of them. -=head1 Removed Deprecations +=head2 C<\cI<X>> that maps to a printable is no longer deprecated -The C<\cI<X>> construct is intended to be a way to specify non-printable -characters. Previously it was deprecated to use it for a printable one, -which is better written as simply itself, perhaps preceded by a -backslash for non-word characters. Now this raises a warning, but not a -deprecation. See +This means we have no plans to remove this feature. It still raises a +warning, but only if syntax warnings are enabled. 
The feature was +originally intended to be a way to express non-printable characters that +don't have a mnemonic (C<\t> and C<\n> are mnemonics for two +non-printable characters, but most non-printables don't have a +mnemonic.) But the feature can be used to specify a few printable +characters, though those are more clearly expressed as the printable +itself. See L<http://www.nntp.perl.org/group/perl.perl5.porters/2017/02/msg242944.html>. =head1 Performance Enhancements @@ -1016,9 +1014,9 @@ This adds support for the new L<C<E<47>xx>|perlre/E<sol>x and E<sol>xx> regular expression pattern modifier, and a change to the L<S<C<use re 'strict'>>|re/'strict' mode> experimental feature. When S<C<re 'strict'>> is enabled, a warning now will be generated for all -unescaped uses of the two characters C<}> and C<]> in regular +unescaped uses of the two characters C<"}"> and C<"]"> in regular expression patterns (outside bracketed character classes) that are taken -literally. This brings them more in line with the C<)> character which +literally. This brings them more in line with the C<")"> character which is always a metacharacter unless escaped. Being a metacharacter only sometimes, depending on action at a distance, can lead to silently having the pattern mean something quite different than was intended, @@ -1392,7 +1390,7 @@ Document C<@ISA>. Was documented other places, not not in L<perlvar>. =item * -Since C<.> is now removed from C<@INC> by default, C<do> will now trigger +Since C<"."> is now removed from C<@INC> by default, C<do> will now trigger a warning recommending to fix the C<do> statement: L<do "%s" failed, '.' is no longer in @INC|perldiag/do "%s" failed, '.' is no longer in @INC; did you mean do ".E<sol>%s"?> @@ -1501,7 +1499,7 @@ the C<encoding> pragma, is no longer supported as of Perl 5.26. 
=item * -Since C<.> is now removed from C<@INC> by default, C<do> will now trigger +Since C<"."> is now removed from C<@INC> by default, C<do> will now trigger a warning recommending to fix the C<do> statement: L<do "%s" failed, '.' is no longer in @INC|perldiag/do "%s" failed, '.' is no longer in @INC; did you mean do ".E<sol>%s"?> @@ -2082,8 +2080,8 @@ VAX floating point formats are now supported on NetBSD. =item * The path separator for the C<PERL5LIB> and C<PERLLIB> environment entries is -now a colon (C<:>) when running under a Unix shell. There is no change when -running under DCL (it's still C<|>). +now a colon (C<":">) when running under a Unix shell. There is no change when +running under DCL (it's still C<"|">). =item * @@ -2161,18 +2159,23 @@ t/uni/overload.t: Skip hanging test on FreeBSD. =item * -The C<op_class()> API function has been added. This is like the existing -C<OP_CLASS()> macro, but can more accurately determine what struct an op -has been allocated as. For example C<OP_CLASS()> might return -C<OA_BASEOP_OR_UNOP> indicating that ops of this type are usually -allocated as an C<OP> or C<UNOP>; while C<op_class()> will return -C<OPclass_BASEOP> or C<OPclass_UNOP> as appropriate. +A new API function C<sv_setvpv_bufsize()> allows simultaneously setting the +length and allocated size of the buffer in an C<SV>, growing the buffer if +necessary. =item * -The output format of the C<op_dump()> function (as used by C<perl -Dx>) -has changed: it now displays an "ASCII-art" tree structure, and shows more -low-level details about each op, such as its address and class. +A new API macro C<SvPVCLEAR()> sets its C<SV> argument to an empty string, +like Perl-space C<$x = ''>, but with several optimisations. 
+ +=item * + +Several new macros and functions for dealing with Unicode and +UTF-8-encoded strings have been added to the API, as well some changes in +functionality of existing functions (see L<perlapi/Unicode Support> for +more details): + +=over =item * @@ -2193,23 +2196,70 @@ Similarly, macros like C<toLOWER_utf8> on malformed UTF-8 now die. =item * -Calling the functions C<utf8n_to_uvchr> and its derivatives, while -passing a string length of 0 is now asserted against in DEBUGGING -builds, and otherwise returns the Unicode REPLACEMENT CHARACTER. If -you have nothing to decode, you shouldn't call the decode function. +Several new macros for analysing the validity of utf8 sequences. These +are: + +C<L<perlapi/UTF8_GOT_ABOVE_31_BIT>> +C<L<perlapi/UTF8_GOT_CONTINUATION>> +C<L<perlapi/UTF8_GOT_EMPTY>> +C<L<perlapi/UTF8_GOT_LONG>> +C<L<perlapi/UTF8_GOT_NONCHAR>> +C<L<perlapi/UTF8_GOT_NON_CONTINUATION>> +C<L<perlapi/UTF8_GOT_OVERFLOW>> +C<L<perlapi/UTF8_GOT_SHORT>> +C<L<perlapi/UTF8_GOT_SUPER>> +C<L<perlapi/UTF8_GOT_SURROGATE>> +C<L<perlapi/UTF8_IS_INVARIANT>> +C<L<perlapi/UTF8_IS_NONCHAR>> +C<L<perlapi/UTF8_IS_SUPER>> +C<L<perlapi/UTF8_IS_SURROGATE>> +C<L<perlapi/UVCHR_IS_INVARIANT>> +C<L<perlapi/isUTF8_CHAR_flags>> +C<L<perlapi/isSTRICT_UTF8_CHAR>> +C<L<perlapi/isC9_STRICT_UTF8_CHAR>> =item * -The functions C<utf8n_to_uvchr> and its derivatives now return the -Unicode REPLACEMENT CHARACTER if called with UTF-8 that has the overlong -malformation, and that malformation is allowed by the input parameters. -This malformation is where the UTF-8 looks valid syntactically, but -there is a shorter sequence that yields the same code point. This has -been forbidden since Unicode version 3.1. 
+Functions that are all extensions of the C<is_utf8_string_*()> functions, +that apply various restrictions to the UTF-8 recognized as valid: + +C<L<perlapi/is_strict_utf8_string>>, +C<L<perlapi/is_strict_utf8_string_loc>>, +C<L<perlapi/is_strict_utf8_string_loclen>>, + +C<L<perlapi/is_c9strict_utf8_string>>, +C<L<perlapi/is_c9strict_utf8_string_loc>>, +C<L<perlapi/is_c9strict_utf8_string_loclen>>, + +C<L<perlapi/is_utf8_string_flags>>, +C<L<perlapi/is_utf8_string_loc_flags>>, +C<L<perlapi/is_utf8_string_loclen_flags>>, + +C<L<perlapi/is_utf8_fixed_width_buf_flags>>, +C<L<perlapi/is_utf8_fixed_width_buf_loc_flags>>, +C<L<perlapi/is_utf8_fixed_width_buf_loclen_flags>>. + +C<L<perlapi/is_utf8_invariant_string>>. +C<L<perlapi/is_utf8_valid_partial_char>>. +C<L<perlapi/is_utf8_valid_partial_char_flags>>. =item * -The functions C<utf8n_to_uvchr> and its derivatives now accept an input +The functions C<L<perlapi/utf8n_to_uvchr>> and its derivatives have had +several changes of behaviour. + +Calling them, while passing a string length of 0 is now asserted against +in DEBUGGING builds, and otherwise returns the Unicode REPLACEMENT +CHARACTER. If you have nothing to decode, you shouldn't call the decode +function. + +They now return the Unicode REPLACEMENT CHARACTER if called with UTF-8 +that has the overlong malformation, and that malformation is allowed by +the input parameters. This malformation is where the UTF-8 looks valid +syntactically, but there is a shorter sequence that yields the same code +point. This has been forbidden since Unicode version 3.1. + +They now accept an input flag to allow the overflow malformation. This malformation is when the UTF-8 may be syntactically valid, but the code point it represents is not capable of being represented in the word length on the platform. 
@@ -2218,21 +2268,19 @@ error, and advances the parse pointer to beyond the UTF-8 in question, but it returns the Unicode REPLACEMENT CHARACTER as the value of the code point (since the real value is not representable). -=item * - -The C<PADOFFSET> type has changed from being unsigned to signed, and -several pad-related variables such as C<PL_padix> have changed from being -of type C<I32> to type C<PADOFFSET>. - -=item * - -The function C<L<perlapi/utf8n_to_uvchr>> has been changed to not +C<utf8n_to_uvchr> has been changed to not abandon searching for other malformations when the first one is encountered. A call to it thus can generate multiple diagnostics, instead of just one. =item * +C<valid_utf8_to_uvchr()> has been added to the API (although it was +present in core earlier). Like C<utf8_to_uvchr_buf()>, but assumes that +the next character is well-formed. + +=item * + A new function, C<L<perlapi/utf8n_to_uvchr_error>>, has been added for use by modules that need to know the details of UTF-8 malformations beyond pass/fail. Previously, the only ways to know why a sequence was @@ -2241,108 +2289,85 @@ your own analysis. =item * -Several new functions for handling Unicode have been added to the API: -C<L<perlapi/is_strict_utf8_string>>, -C<L<perlapi/is_c9strict_utf8_string>>, -C<L<perlapi/is_utf8_string_flags>>, -C<L<perlapi/is_strict_utf8_string_loc>>, -C<L<perlapi/is_strict_utf8_string_loclen>>, -C<L<perlapi/is_c9strict_utf8_string_loc>>, -C<L<perlapi/is_c9strict_utf8_string_loclen>>, -C<L<perlapi/is_utf8_string_loc_flags>>, -C<L<perlapi/is_utf8_string_loclen_flags>>, -C<L<perlapi/is_utf8_fixed_width_buf_flags>>, -C<L<perlapi/is_utf8_fixed_width_buf_loc_flags>>, -C<L<perlapi/is_utf8_fixed_width_buf_loclen_flags>>. - -These functions are all extensions of the C<is_utf8_string_*()> functions, -that apply various restrictions to the UTF-8 recognized as valid. +There is now a safer version of utf8_hop(), called utf8_hop_safe(). 
+Unlike utf8_hop(), utf8_hop_safe() won't navigate before the beginning or +after the end of the supplied buffer. =item * -A new API function C<sv_setvpv_bufsize()> allows simultaneously setting the -length and allocated size of the buffer in an C<SV>, growing the buffer if -necessary. +Two new functions, C<utf8_hop_forward()> and C<utf8_hop_back()> are +similar to C<utf8_hop_safe()> but are for when you know which direction +you wish to travel. =item * -A new API macro C<SvPVCLEAR()> sets its C<SV> argument to an empty string, -like Perl-space C<$x = ''>, but with several optimisations. +Two new macros which return useful utf8 byte sequences: -=item * +C<L<perlapi/BOM_UTF8>> +C<L<perlapi/REPLACEMENT_CHARACTER_UTF8>> -All parts of the internals now agree that the C<sassign> op is a C<BINOP>; -previously it was listed as a C<BASEOP> in F<regen/opcodes>, which meant -that several parts of the internals had to be special-cased to accommodate -it. This oddity's original motivation was to handle code like C<$x ||= 1>; -that is now handled in a simpler way. +=back =item * -Several new internal C macros have been added that take a string literal as -arguments, alongside existing routines that take the equivalent value as two -arguments, a character pointer and a length. The advantage of this is that -the length of the string is calculated automatically, rather than having to -be done manually. These routines are now used where appropriate across the -entire codebase. - -=item * +Perl is now built with the C<PERL_OP_PARENT> compiler define enabled by +default. To disable it, use the C<PERL_NO_OP_PARENT> compiler define. +This flag alters how the C<op_sibling> field is used in C<OP> structures, +and has been available optionally since perl 5.22. -The code in F<gv.c> that determines whether a variable has a special meaning -to Perl has been simplified. +See L<perl5220delta/"Internal Changes"> for more details of what this +build option does. 
=item * -The C<DEBUGGING>-mode output for regex compilation and execution has been -enhanced. +Three new ops, C<OP_ARGELEM>, C<OP_ARGDEFELEM> and C<OP_ARGCHECK> have +been added. These are intended principally to implement the individual +elements of a subroutine signature, plus any overall checking required. =item * -Several macros and functions have been added to the public API for -dealing with Unicode and UTF-8-encoded strings. See -L<perlapi/Unicode Support>. +The C<op_class()> API function has been added. This is like the existing +C<OP_CLASS()> macro, but can more accurately determine what struct an op +has been allocated as. For example C<OP_CLASS()> might return +C<OA_BASEOP_OR_UNOP> indicating that ops of this type are usually +allocated as an C<OP> or C<UNOP>; while C<op_class()> will return +C<OPclass_BASEOP> or C<OPclass_UNOP> as appropriate. =item * -Use C<my_strlcat()> in C<locale.c>. While C<strcat()> is safe in this context, -some compilers were optimizing this to C<strcpy()> causing a porting test to -fail that looks for unsafe code. Rather than fighting this, we just use -C<my_strlcat()> instead. +All parts of the internals now agree that the C<sassign> op is a C<BINOP>; +previously it was listed as a C<BASEOP> in F<regen/opcodes>, which meant +that several parts of the internals had to be special-cased to accommodate +it. This oddity's original motivation was to handle code like C<$x ||= 1>; +that is now handled in a simpler way. =item * -Three new ops, C<OP_ARGELEM>, C<OP_ARGDEFELEM> and C<OP_ARGCHECK> have -been added. These are intended principally to implement the individual -elements of a subroutine signature, plus any overall checking required. +The output format of the C<op_dump()> function (as used by C<perl -Dx>) +has changed: it now displays an "ASCII-art" tree structure, and shows more +low-level details about each op, such as its address and class. 
=item * -Perl no longer panics when switching into some locales on machines with -buggy C<strxfrm()> implementations in their libc. [perl #121734] +The C<PADOFFSET> type has changed from being unsigned to signed, and +several pad-related variables such as C<PL_padix> have changed from being +of type C<I32> to type C<PADOFFSET>. =item * -Perl is now built with the C<PERL_OP_PARENT> compiler define enabled by -default. To disable it, use the C<PERL_NO_OP_PARENT> compiler define. -This flag alters how the C<op_sibling> field is used in C<OP> structures, -and has been available optionally since perl 5.22. - -See L<perl5220delta/"Internal Changes"> for more details of what this -build option does. +The C<DEBUGGING>-mode output for regex compilation and execution has been +enhanced. =item * -The meanings of some internal SV flags have been changed - -OPpRUNTIME, SVpbm_VALID, SVpbm_TAIL, SvTAIL_on, SvTAIL_off, SVrepl_EVAL, -SvEVALED +Several obscure SV flags have been eliminated, sometimes along with the +macros which manipulate them: C<SVpbm_VALID>, C<SVpbm_TAIL>, C<SvTAIL_on>, +C<SvTAIL_off>, C<SVrepl_EVAL>, C<SvEVALED> =item * -Change C<hv_fetch(…, "…", …, …)> to C<hv_fetchs(…, "…", …)> - -The dual-life dists all use Devel::PPPort, so they can use this function even -though it was only added in 5.10. +An OP op_private flag has been eliminated: C<OPpRUNTIME>. This used to +often get set on C<PMOP>s, but had become meaningless over time. =back @@ -2352,6 +2377,11 @@ though it was only added in 5.10. =item * +Perl no longer panics when switching into some locales on machines with +buggy C<strxfrm()> implementations in their libc. [perl #121734] + +=item * + C< $-{$name} > would leak an C<AV> on each access if the regular expression had no named captures. The same applies to access to any hash tied with L<Tie::Hash::NamedCapture> and C<< all =E<gt> 1 >>. [perl @@ -2404,7 +2434,7 @@ is wellformed. This resolves [perl #126310].
=item * -The range operator C<..> on strings now handles its arguments correctly when in +The range operator C<".."> on strings now handles its arguments correctly when in the scope of the L<< C<unicode_strings>|feature/"The 'unicode_strings' feature" >> feature. The previous behaviour was sufficiently unexpected that we believe no correct program could have made use of it. @@ -2418,7 +2448,7 @@ the stack. [perl #130262] =item * -Using a large code point with the C<W> pack template character with +Using a large code point with the C<"W"> pack template character with the current output position aligned at just the right point could cause a write a single zero byte immediately beyond the end of an allocated buffer. [perl #129149] @@ -2476,7 +2506,7 @@ in patterns exceeded a minimum length. [perl #130522]. =item * -Only warn once per literal about a misplaced C<_>. [perl #70878]. +Only warn once per literal about a misplaced C<"_">. [perl #70878]. =item * @@ -2582,7 +2612,7 @@ Fixed place where regex was not setting the syntax error correctly. =item * -The C<&.> operator (and the C<&> operator, when it treats its arguments as +The C<&.> operator (and the C<"&"> operator, when it treats its arguments as strings) were failing to append a trailing null byte if at least one string was marked as utf8 internally. Many code paths (system calls, regexp compilation) still expect there to be a null byte in the string buffer @@ -2609,7 +2639,7 @@ Check for null PL_curcop in IN_LC() [perl #129106] =item * Fixed the parser error handling for an 'C<:attr(foo>' that does not have -an ending 'C<)>'. +an ending 'C<")">'. =item * @@ -2689,23 +2719,17 @@ A regression in 5.24 with C<tr/\N{U+...}/foo/> when the code point was between =item * -A regression from the previous development release, 5.23.3, where -compiling a regular expression could crash the interpreter has been -fixed. [perl #128686]. 
- -=item * - Use of a string delimiter whose code point is above 2**31 now works correctly on platforms that allow this. Previously, certain characters, due to truncation, would be confused with other delimiter characters -with special meaning (such as C<?> in C<m?...?>), resulting +with special meaning (such as C<"?"> in C<m?...?>), resulting in inconsistent behaviour. Note that this is non-portable, and is based on Perl's extension to UTF-8, and is probably not displayable nor enterable by any editor. [perl #128738] =item * -C<@{x> followed by a newline where C<x> represents a control or non-ASCII +C<@{x> followed by a newline where C<"x"> represents a control or non-ASCII character no longer produces a garbled syntax error message or a crash. [perl #128951] -- Perl5 Master Repository
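Illustration of the C<do> behaviour described in the Security hunks above: with C<"."> out of C<@INC>, a bare C<do "file.pl"> no longer finds a file in the current directory, while C<do "./file.pl"> still does. The following is only a rough sketch and is not part of the patch. The helper name try_do_forms and the file name dotless_demo.pl are made up for the example, and the sketch strips C<"."> from C<@INC> itself so that it should behave the same on perls where C<"."> is still present.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper (not part of the patch): write $body to a file called
# $name in a scratch directory, make that directory the current one, and try
# both spellings of do() with "." stripped from @INC.
sub try_do_forms {
    my ($name, $body) = @_;

    my $orig_dir = getcwd();
    my $scratch  = tempdir(CLEANUP => 1);
    chdir $scratch or die "cannot chdir to $scratch: $!";

    open my $fh, '>', $name or die "cannot write $name: $!";
    print {$fh} $body;
    close $fh or die "cannot close $name: $!";

    # Mimic the new default explicitly, in case this perl was built or run
    # with "." still in @INC.
    local @INC = grep { $_ ne '.' } @INC;

    my $bare     = do $name;        # relative name: looked up via @INC only
    my $explicit = do "./$name";    # explicit path relative to the cwd

    chdir $orig_dir or die "cannot chdir back to $orig_dir: $!";
    return ($bare, $explicit);
}

my ($bare, $explicit) = try_do_forms('dotless_demo.pl', "42;\n");
printf "bare: %s, explicit: %s\n",
    defined $bare ? $bare : 'undef', $explicit;
```

On a perl that carries this change, the bare form is also expected to emit the new "did you mean do "./..."?" warning on STDERR; the sketch leaves that warning visible rather than silencing it.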
