In perl.git, the branch smoke-me/khw-onepass has been created <https://perl5.git.perl.org/perl.git/commitdiff/d1917b165eea65098898c2f9c0f5d6c62fb7fbbb?hp=0000000000000000000000000000000000000000>
at d1917b165eea65098898c2f9c0f5d6c62fb7fbbb (commit)

- Log -----------------------------------------------------------------
commit d1917b165eea65098898c2f9c0f5d6c62fb7fbbb
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Nov 15 21:55:25 2018 -0700

    smoke2

commit 59c096c5b089f49db5ce297080176c7c13e0e182
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Nov 15 08:14:23 2018 -0700

    pop

commit d09c0f37d308cd74e6e02fe9afd714e2a59ab0d3
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 14 09:21:04 2018 -0700

    perlrun: Clarify -Dv

commit f9f6f9a97f695c2a4802d5568c308396135ca161
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 14 09:09:38 2018 -0700

    l

commit 914a3778c7ec0e2b5c95cd2de7bf357946c77bf8
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 13 14:17:37 2018 -0700

    regcomp.c: Simplify early failure returns

    Previous commits have removed the need for certain macros and
    generality in returning from functions early. Correspondingly
    simplify l

commit d4e3359808cc67e684061cba5a3b721662576785
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 21:54:41 2018 -0700

    regcomp.c: Add assertion

commit 6e81709e0b928c69cf628b8ea2cb653a50ca30a4
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 21:48:53 2018 -0700

    XXX regcomp.sym:

commit 89c42cf00ca96f32b22f30adace98decc15e132f
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 21:42:51 2018 -0700

    XXX pod,delta -Drv now turns on all regex debugging

    This commit makes the v (verbose) modifier to -Dr do something:
    turn on all possible regex debugging.

commit e868f9a6bdc35c073de067cafa93364f5e0d5de7
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 21:38:29 2018 -0700

    regcomp.h: Delete duplicate macro defn

commit 0f7cea152b97ab4ee86e281da2236b8e0568edb4
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 15:59:14 2018 -0700

    perl.h: White-space, comment

commit a3424ea65ede7282eac22b56cb7893a38a52a431
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 11:59:52 2018 -0700

    Revert "regcomp.c: Avoid a memory leak if fatalized warnings"

    This reverts commit

commit 3ec014f396e11986e87ceb77ea4a45362d7ff870
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 11:53:41 2018 -0700

    regcomp.c: Avoid a memory leak if fatalized warnings

commit 4e05a285d22fe2d77146649a81d9be5907ef15df
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Nov 11 11:31:17 2018 -0700

    re/re_tests: Add test

commit 4ddcd55abf626fcca65eec64aedae3c95257d242
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 7 20:22:36 2018 -0700

    XXX tests PATCH: [perl #133642] Double free

    This was caused by doing a SAVEFREEPV twice. The solution is to not
    do this twice. But this means that if the process unexpectedly dies,
    there is a potential memory leak. That potential already exists with
    other variables, and has its own ticket #133589.

commit 1f0689a9416d6db5cac5057d2628718ee936eaf5
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 7 20:20:53 2018 -0700

    XXX see msg in previous commit

commit 350abc9e9396179285d291fba7edb655c5daa144
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 7 20:11:50 2018 -0700

    XXX utf8.c: calculate variants instead of assuming worst case

    When converting a byte string to UTF-8, the needed size may increase
    due to some bytes (the UTF-8 variants) occupying two bytes instead of
    one under UTF-8. Prior to this commit, the string was assumed to
    contain only variants, and enough memory was allocated for the worst
    case, then the excess was returned at the end.
    This commit actually calculates how much space is needed and
    allocates only that, so there is no need to trim afterwards. There is
    extra work involved in doing this calculation. But the string is
    parsed per-word. For short strings, it doesn't much matter either
    way. But for very long strings, it seems to me the consequences of
    potentially allocating way too much memory outweigh the cost of this
    extra work.

commit 29bf1531feffc99978a9a576b1f7d16bc95fb5a5
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 7 20:07:51 2018 -0700

    regcomp.c: Refactor to remove an else and a NOT_REACHED

    This just simplifies things a bit.

commit 17ccb901a92f91da0d8f59e71ddf5cf4afdceeed
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 7 18:44:56 2018 -0700

    t/re/reg_mesg.t: Add test

    Verify this still works after the recent removal of the sizing pass

commit f310f6488981166fe7a5ae0b914d64924478c3dd
Author: Karl Williamson <k...@cpan.org>
Date:   Wed Nov 7 18:40:37 2018 -0700

    handy.h: Add some comments

    This allows us to remove a comment in regcomp.c

commit e94bdf7da8f48bbf043b65959597b46347ee34d3
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 22:49:51 2018 -0700

    regcomp.c: Remove parameter no longer used and refactor

    This static function no longer is called with a non-NULL final
    parameter. That means it no longer returns a list, and its name is
    hereby changed to reflect that. It also means the function can be
    refactored and made simpler.

commit f929f83dcf67069541614506cadc6479c4746f1b
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 18:44:46 2018 -0700

    regcomp.c: Remove now always NULL parameter

    This parameter is always NULL. No need to have it in this static
    function

commit 2fee0afafaf644eaaa6b50f4e6cc7217c7402d07
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 18:26:39 2018 -0700

    regcomp.c: Don't restart parse for /d to /u if no need to

    This commit keeps track of whether any operations have been
    encountered so far that differ under /d from /u.
    If we switch to /u and haven't so far found anything which differs,
    there's no need to reparse.

commit 5c877d53fe0cef8dce9010616e71944241f0e353
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 18:10:36 2018 -0700

    regcomp.c: Don't restart parse for /d to /u if reparsing anyway

    Prior to this commit, if the rules changed from /d to /u, the parse
    was immediately restarted. This commit changes that so that it
    doesn't do this if it is known that the parse will be redone anyway,
    but a full parse needs to be done first in order to count the
    parentheses.

commit d5b8c76b2fac25935017630f7ce08ec0993d66f3
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 18:02:07 2018 -0700

    regcomp.c: Don't restart parse now if doing so later

    Prior to this commit, if it became apparent that long branches were
    going to be needed, the parse was immediately restarted. This commit
    changes that so that it doesn't do this if it is known that the parse
    will be redone anyway, but a full parse needs to be done first in
    order to count the parentheses.

commit c739703872275dab384eb304ed24c1cda33393c7
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 17:41:18 2018 -0700

    regcomp.c: Swap 'if' branches for readability

    It's easier to understand if the simplest case is first in the code.

commit 6bb91afbb0f43fe22480a93f5c8bd1a7c36140ba
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 17:31:21 2018 -0700

    regcomp.c: Refactor constructing EXACTish nodes

    The previous commits have allowed us to refactor this to eliminate
    redundancies. Previously, the same logic was done separately for
    UTF-8 and non-UTF-8 patterns. This refactors so the logic is done
    once. The details differ for UTF-8 and non-UTF-8, so that is where
    the differences now lie, without having to duplicate the logic.

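The sizing approach described in the utf8.c commit in this log (350abc9e9396179285d291fba7edb655c5daa144) can be illustrated with a minimal sketch. This is not the code from utf8.c: the names count_variants and latin1_to_utf8 are hypothetical, the sketch assumes an ASCII platform with Latin-1 input (so the variants are exactly the bytes 0x80 and above), and it counts byte by byte rather than per-word as the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: count the bytes that need two bytes in UTF-8.
 * Assumption: ASCII platform, Latin-1 input, so a "variant" is any
 * byte >= 0x80. */
size_t count_variants(const unsigned char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += (s[i] >= 0x80);
    return n;
}

/* Hypothetical helper: convert Latin-1 bytes to UTF-8, allocating
 * exactly len + variants + 1 bytes instead of the worst case
 * 2 * len + 1. Returns a malloc'd, NUL-terminated buffer (caller
 * frees) and stores the UTF-8 length in *out_len, or returns NULL if
 * the allocation fails. */
unsigned char *latin1_to_utf8(const unsigned char *s, size_t len,
                              size_t *out_len)
{
    assert(s != NULL && out_len != NULL);

    size_t need = len + count_variants(s, len);
    unsigned char *d = malloc(need + 1);
    if (d == NULL)
        return NULL;

    unsigned char *p = d;
    for (size_t i = 0; i < len; i++) {
        if (s[i] < 0x80) {
            *p++ = s[i];                      /* invariant: one byte */
        } else {
            *p++ = (unsigned char)(0xC0 | (s[i] >> 6));   /* lead byte */
            *p++ = (unsigned char)(0x80 | (s[i] & 0x3F)); /* continuation */
        }
    }
    *p = '\0';
    *out_len = need;
    return d;
}
```

For the four input bytes 'a', 0xE9, 'b', 0xFF, two are variants, so the sketch allocates 6 bytes plus the terminating NUL rather than the worst-case 8 plus NUL, and nothing has to be trimmed afterwards.
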
commit 3c739ec692d78567e49eee794a2163d4dae473b7
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 10:03:00 2018 -0700

    regcomp.c: XXX prepare

commit f9fd8d9914a4df1995406c0754ce5493f628aef8
Author: Karl Williamson <k...@cpan.org>
Date:   Sat Nov 3 10:03:37 2018 -0600

    regcomp.c: Remove obsolete code

    This code was obsoleted by removal of the sizing pass. Previously we
    had to take special care when encountering the LATIN SMALL LETTER
    SHARP S because it can fold to more bytes than it occupies. But with
    the sizing pass gone, that is no longer necessary.

commit f95ea5dfef46b18e56690c39f7703e15f2261de4
Author: Karl Williamson <k...@cpan.org>
Date:   Sat Nov 3 09:41:52 2018 -0600

    regcomp.c: Comments, white-space, rmv extra parens

    This commit re-indents things after the previous commit added a
    block, fixes typos in comments, and removes obsolete references in
    them to the sizing pass, and adds some comments, and does some
    white-space changes. In one case it removes extraneous parentheses

commit 7726dc0b3814bfecf856540e454b05e7f11195fe
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Nov 1 09:08:04 2018 -0600

    regcomp.c: Remove no longer useful code

    This code has been obsoleted by the previous commit. That commit
    looks at a bracketed character class in general and optimizes it to
    some faster and/or simpler operation if possible. The code being
    removed in this commit was originally added to try to find some
    optimizations that were feasible to find in the sizing pass. Now that
    we don't have a sizing pass, and we find optimizations generally,
    this code doesn't add any value.

commit a205b8feb803cc757f90a5091e4a34804f615e15
Author: Karl Williamson <k...@cpan.org>
Date:   Thu Nov 1 01:43:39 2018 -0600

    Find optimizations for /[[:posix:]]/a

    Various optimizations are done when the regular expression compiler
    sees a bracketed character class. For example, /[a]/ is optimized
    into /a/.
    This wasn't done for the various POSIX classes under /a, as the
    speed of the operations of using a regular bracketed class (which
    uses a bitmap) and of an optimized version (which uses a specialized
    opcode for the POSIX class) is similar. The only advantage would have
    been that the specialized opcode is 1/10 the size of the bitmap. But
    the optimization in general couldn't come until the second pass,
    after the size had already been calculated and space allocated. So
    there were no savings. But now that there is no separate sizing pass,
    doing the optimization actually will save space.

commit 1267cb20dfb181ee53905d3969ec58111edce61d
Author: Karl Williamson <k...@cpan.org>
Date:   Sat Oct 20 17:24:53 2018 -0600

    regcomp.c: Don't do unnecessary tests

    A pattern being UTF-8 implies it is /u. By properly initializing
    whether or not it is /u, we can avoid the existing additional tests
    for UTF-8 in those places where we care about /u but don't care about
    UTF-8ness.

commit d5a395255530534894638be4399cdf49c6db0344
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Oct 28 21:24:22 2018 -0600

    regcomp.c: Make sure UTF-8 regex pattern uses /u

    When a pattern is in UTF-8, Unicode rules should be selected. This
    commit makes sure that this happens and that the displayable form of
    the pattern shows /u. I don't know of any bugs this fixes.

commit 0b8b43ccf185cc1588c5b3901794ae1bc1eaf5db
Author: Karl Williamson <k...@cpan.org>
Date:   Tue Nov 6 09:51:53 2018 -0700

    t/re/pat.t: Add a test

commit 203ce5de4cc409ef5eeba69182b9609ae6620fd3
Author: Karl Williamson <k...@cpan.org>
Date:   Mon Oct 15 21:30:00 2018 -0600

    XXX don't push now. Fix up undefined behavior in warnings.h

commit ba292928245dd3e14a6280e8f72bc2b8ee0602f5
Author: Karl Williamson <k...@cpan.org>
Date:   Sun Oct 14 19:45:06 2018 -0600

    Use "%td" for printing ptrdiff_t values.

    I did not know that perl supported this C99 feature, but it does.

-----------------------------------------------------------------------

--
Perl5 Master Repository
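
As a small illustration of the "%td" conversion named in the last commit above, here is a sketch using the standard C99 snprintf. It shows only the C library behavior, not Perl's own formatting routines, and the helper name format_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the distance between two pointers into
 * the same array as "offset=N". The 't' length modifier tells the
 * formatter that the argument is a ptrdiff_t, so no cast to long or
 * int is needed. Returns whatever snprintf returns. */
int format_offset(char *buf, size_t size,
                  const char *start, const char *cur)
{
    assert(buf != NULL && start != NULL && cur != NULL);

    ptrdiff_t off = cur - start;   /* pointer subtraction yields ptrdiff_t */
    return snprintf(buf, size, "offset=%td", off);
}
```

Called with cur pointing three bytes past start, the buffer holds offset=3; with the arguments swapped it holds offset=-3, since ptrdiff_t is signed.
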