  Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5

  Commit: c2bb0b9930e2c73d35f279e89dd2a241de96e887
      https://github.com/Perl/perl5/commit/c2bb0b9930e2c73d35f279e89dd2a241de96e887
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)
  Changed paths:
    M regen/regen_lib.pl

  Log Message:
  -----------
  regen_lib: Output blanks; not tabs

This makes it easier to calculate widths; and our policy is to not use tabs anyway.

  Commit: 519d76f5929997820de5bb942a6e6be7f1bf60bd
      https://github.com/Perl/perl5/commit/519d76f5929997820de5bb942a6e6be7f1bf60bd
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regen/regcomp.pl

  Log Message:
  -----------
  regen/regcomp.pl: Change variable name

The more specific name this is changed to will make code clearer in future commits.

  Commit: ce553cf576f837c8b843c307e2e1c957d8bab24d
      https://github.com/Perl/perl5/commit/ce553cf576f837c8b843c307e2e1c957d8bab24d
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regen/regcomp.pl
    M regnodes.h

  Log Message:
  -----------
  regen/regcomp.pl: Generate #defines for UTF8ness

This causes #defines to be generated for regexec.c to use in switch statements, so that each opcode that is a case: there actually gets 4 cases, one for each combination of the target being UTF-8 or not and the pattern being UTF-8 or not. This will be used in future commits to simplify things.

  Commit: dd8dc88c6c318c49836493d65c4faf0e5ede57b2
      https://github.com/Perl/perl5/commit/dd8dc88c6c318c49836493d65c4faf0e5ede57b2
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regexec.c

  Log Message:
  -----------
  regexec.c: S_find_byclass(): utf8ness in switch()

This uses the #defines created in the previous commit to make the switch statement in this function incorporate the UTF8ness of both the pattern and the target string.

The reason for this is that the first statement in nearly every case of the switch is to test if the target string being matched is UTF-8 or not. By putting that information into the case number, those conditionals can be eliminated, leading to cleaner, more modular code.
I had hoped that this would also improve performance since there are fewer conditionals, but Sergey Aleynikov did performance testing of this change for me, and found no noticeable gain or loss.

Further, the cases involving matching EXACTish nodes have to also test if the pattern is UTF-8 or not before doing anything else. I added that information as well to the case number, so that those conditionals can be eliminated. For the non-EXACTish nodes, it simply means that two case statements execute the same code.

This is an intermediate commit, which only does the expansion of the current cases into four for each. The refactoring that takes advantage of this is in the following commit.

  Commit: 56ff0609361466f7eb706d56bdaf69e44342c2e1
      https://github.com/Perl/perl5/commit/56ff0609361466f7eb706d56bdaf69e44342c2e1
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regexec.c

  Log Message:
  -----------
  regexec.c: find_byclass(): Restructure

This is a follow-on to the previous commit. The case number of the main switch statement now includes three things: the regnode op, the UTF8ness of the target, and the UTF8ness of the pattern.

This allows the conditionals within the previous cases (which only encoded the op) to be removed, and things to be moved around so that there are more fall-throughs and fewer gotos. The macros that are called no longer have to test for UTF8ness, so I teased the UTF8 ones apart from the non_UTF8 ones.
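As an illustration of the technique described in the two log messages above, here is a minimal standalone C sketch of folding an opcode and two UTF8ness flags into a single switch value. Everything named in it is an assumption made for this example: the opcodes, the WITH_UTF8NESS macro, the _tb/_t8/_pb/_p8 suffixes, and the select_path() function are hypothetical and are not claimed to match what regen/regcomp.pl writes to regnodes.h or what regexec.c contains.

```c
#include <stdbool.h>

/* Hypothetical opcodes standing in for regnode ops. */
enum { OP_EXACT = 0, OP_ANYOF = 1 };

/* Which code path a case selects (illustrative only). */
enum path {
    PATH_UNKNOWN,
    PATH_EXACT_BYTES,      /* neither target nor pattern is UTF-8 */
    PATH_EXACT_PAT_UTF8,   /* only the pattern is UTF-8 */
    PATH_EXACT_TGT_UTF8,   /* only the target is UTF-8 */
    PATH_EXACT_BOTH_UTF8,  /* both are UTF-8 */
    PATH_ANYOF_BYTES,      /* target is not UTF-8 */
    PATH_ANYOF_UTF8        /* target is UTF-8 */
};

/* Fold the opcode and both flags into one integer: the opcode is
 * shifted left by two bits, bit 1 holds the target's UTF8ness and
 * bit 0 holds the pattern's. */
#define WITH_UTF8NESS(op, t_utf8, p_utf8) \
    (((op) << 2) | ((t_utf8) ? 2 : 0) | ((p_utf8) ? 1 : 0))

/* Four case labels per opcode (t = target, p = pattern,
 * b = bytes, 8 = UTF-8). */
#define OP_EXACT_tb_pb WITH_UTF8NESS(OP_EXACT, 0, 0)
#define OP_EXACT_tb_p8 WITH_UTF8NESS(OP_EXACT, 0, 1)
#define OP_EXACT_t8_pb WITH_UTF8NESS(OP_EXACT, 1, 0)
#define OP_EXACT_t8_p8 WITH_UTF8NESS(OP_EXACT, 1, 1)
#define OP_ANYOF_tb_pb WITH_UTF8NESS(OP_ANYOF, 0, 0)
#define OP_ANYOF_tb_p8 WITH_UTF8NESS(OP_ANYOF, 0, 1)
#define OP_ANYOF_t8_pb WITH_UTF8NESS(OP_ANYOF, 1, 0)
#define OP_ANYOF_t8_p8 WITH_UTF8NESS(OP_ANYOF, 1, 1)

/* One switch replaces "switch (op)" followed by "if (utf8_target)"
 * and "if (utf8_pattern)" tests inside each case. */
static enum path select_path(int op, bool t_utf8, bool p_utf8)
{
    switch (WITH_UTF8NESS(op, t_utf8, p_utf8)) {
    /* An EXACT-like op cares about both flags: four distinct paths. */
    case OP_EXACT_tb_pb: return PATH_EXACT_BYTES;
    case OP_EXACT_tb_p8: return PATH_EXACT_PAT_UTF8;
    case OP_EXACT_t8_pb: return PATH_EXACT_TGT_UTF8;
    case OP_EXACT_t8_p8: return PATH_EXACT_BOTH_UTF8;

    /* A non-EXACT-like op ignores the pattern flag, so two case
     * labels fall through to the same code. */
    case OP_ANYOF_tb_pb:
    case OP_ANYOF_tb_p8: return PATH_ANYOF_BYTES;
    case OP_ANYOF_t8_pb:
    case OP_ANYOF_t8_p8: return PATH_ANYOF_UTF8;

    default:             return PATH_UNKNOWN;
    }
}
```

In this sketch the case labels are compile-time constants, so the compiler sees one flat switch over small integers; whether that is faster than nested conditionals depends on the compiler and workload, which is consistent with the "no noticeable gain or loss" result reported above.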
  Commit: 25f81fd589673867331d0217a5c6ef17ed4d2e70
      https://github.com/Perl/perl5/commit/25f81fd589673867331d0217a5c6ef17ed4d2e70
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regexec.c

  Log Message:
  -----------
  regexec.c: Rename a static variable

This is to distinguish it from a similar variable being added in a future commit.

  Commit: b272adb45fa3fca1b787d7ff479196523e7d6336
      https://github.com/Perl/perl5/commit/b272adb45fa3fca1b787d7ff479196523e7d6336
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regexec.c

  Log Message:
  -----------
  regexec.c: Macroize a common paradigm

  Commit: 526e4b9dadeed7e61d0154aceb7927fb02dd85bb
      https://github.com/Perl/perl5/commit/526e4b9dadeed7e61d0154aceb7927fb02dd85bb
  Author: Karl Williamson <k...@cpan.org>
  Date:   2020-10-14 (Wed, 14 Oct 2020)

  Changed paths:
    M regexec.c

  Log Message:
  -----------
  regexec.c: Macroize another common paradigm

Compare: https://github.com/Perl/perl5/compare/206c207c12a8...526e4b9dadee