Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: c2a46f9d3954f4e4f9b0f8255ab006eb0da04ce3
      
https://github.com/Perl/perl5/commit/c2a46f9d3954f4e4f9b0f8255ab006eb0da04ce3
  Author: Karl Williamson <[email protected]>
  Date:   2025-10-17 (Fri, 17 Oct 2025)

  Changed paths:
    M perl.c
    M toke.c

  Log Message:
  -----------
  toke.c: S_intuit_more: Add more commentary

This function is described in its comments as 'terrifying', and by its
original author, Larry Wall, as "truly awful".  As a result, it has been
mostly untouched since its introduction in 1993.  That means it has not
been updated as new language features have been added.

For example, it does not know about lexical variables, so the code it
has for globals simply does not apply to the vast majority of modern
code.

Another example is that it knows nothing of UTF-8; as a result, simply
changing the input encoding from Latin-1 to UTF-8 can flip its outcome
to the opposite result.
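
To illustrate why the encoding can matter to any heuristic that looks
at individual bytes, here is a small Python sketch. It is not Perl's
actual code, and the sample string is made up for illustration; it only
shows that the same characters become different byte sequences in
Latin-1 and in UTF-8.

```python
# Hypothetical sample: a bracketed construct containing non-ASCII letters.
text = "[é-ü]"

latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# Same characters, different bytes: a heuristic that scores individual
# bytes is looking at different input in each case.
print(len(latin1_bytes), list(latin1_bytes))
print(len(utf8_bytes), list(utf8_bytes))
```

Each non-ASCII letter is one byte in Latin-1 but two bytes in UTF-8, so
the byte-level view of the bracketed text changes even though the
characters do not.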

And it is buggy.

An example of how hard this can be to get right is this fairly common
use in our test suite:

 [$A-Z]

That looks like a character class matching 27 characters.  But wait,
what if there exists a $A and a parameterless subroutine 'Z'?  Then this
could instead be an expression for a subscript.
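
The "27 characters" reading can be checked in a regex engine that never
interpolates variables.  The sketch below uses Python's re module purely
as a stand-in (an assumption made for illustration); how Perl itself
parses the construct is exactly the question at issue.

```python
import re

# In Python's re, '$' inside a bracketed class is a literal dollar sign,
# so this class is '$' plus the range A-Z.
char_class = re.compile(r"[$A-Z]")

# Collect every ASCII character the class matches.
matched = [chr(c) for c in range(128) if char_class.fullmatch(chr(c))]
print(len(matched))
```

Under that reading the class covers the dollar sign and the 26
uppercase letters.  The subscript reading has no counterpart in this
sketch, since Python has no variable interpolation inside patterns.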

A few years ago, I set out to try to understand it.  I added commentary
and simplified some overly complicated expressions, but left its
behavior unchanged.

Now, I set out to make some changes, and found many more issues than I
had earlier.  This commit adds commentary about those.  Hopefully this
will lead to some discussion and a consensus on the way forward.


