In perl.git, the branch smoke-me/khw-onepass has been created
<https://perl5.git.perl.org/perl.git/commitdiff/d1917b165eea65098898c2f9c0f5d6c62fb7fbbb?hp=0000000000000000000000000000000000000000>
at d1917b165eea65098898c2f9c0f5d6c62fb7fbbb (commit)
- Log -----------------------------------------------------------------
commit d1917b165eea65098898c2f9c0f5d6c62fb7fbbb
Author: Karl Williamson <[email protected]>
Date: Thu Nov 15 21:55:25 2018 -0700
smoke2
commit 59c096c5b089f49db5ce297080176c7c13e0e182
Author: Karl Williamson <[email protected]>
Date: Thu Nov 15 08:14:23 2018 -0700
pop
commit d09c0f37d308cd74e6e02fe9afd714e2a59ab0d3
Author: Karl Williamson <[email protected]>
Date: Wed Nov 14 09:21:04 2018 -0700
perlrun: Clarify -Dv
commit f9f6f9a97f695c2a4802d5568c308396135ca161
Author: Karl Williamson <[email protected]>
Date: Wed Nov 14 09:09:38 2018 -0700
l
commit 914a3778c7ec0e2b5c95cd2de7bf357946c77bf8
Author: Karl Williamson <[email protected]>
Date: Tue Nov 13 14:17:37 2018 -0700
regcomp.c: Simplify early failure returns
Previous commits have removed the need for certain macros and generality
in returning from functions early. Correspondingly simplify
l
commit d4e3359808cc67e684061cba5a3b721662576785
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 21:54:41 2018 -0700
regcomp.c: Add assertion
commit 6e81709e0b928c69cf628b8ea2cb653a50ca30a4
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 21:48:53 2018 -0700
XXX regcomp.sym:
commit 89c42cf00ca96f32b22f30adace98decc15e132f
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 21:42:51 2018 -0700
XXX pod,delta -Drv now turns on all regex debugging
This commit makes the v (verbose) modifier to -Dr do something: turn on
all possible regex debugging.
commit e868f9a6bdc35c073de067cafa93364f5e0d5de7
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 21:38:29 2018 -0700
regcomp.h: Delete duplicate macro defn
commit 0f7cea152b97ab4ee86e281da2236b8e0568edb4
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 15:59:14 2018 -0700
perl.h: White-space, comment
commit a3424ea65ede7282eac22b56cb7893a38a52a431
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 11:59:52 2018 -0700
Revert "regcomp.c: Avoid a memory leak if fatalized warnings"
This reverts commit
commit 3ec014f396e11986e87ceb77ea4a45362d7ff870
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 11:53:41 2018 -0700
regcomp.c: Avoid a memory leak if fatalized warnings
commit 4e05a285d22fe2d77146649a81d9be5907ef15df
Author: Karl Williamson <[email protected]>
Date: Sun Nov 11 11:31:17 2018 -0700
re/re_tests: Add test
commit 4ddcd55abf626fcca65eec64aedae3c95257d242
Author: Karl Williamson <[email protected]>
Date: Wed Nov 7 20:22:36 2018 -0700
XXX tests PATCH: [perl #133642] Double free
This was caused by doing a SAVEFREEPV twice. The solution is to not do
this twice.
But this means that if the process unexpectedly dies, there is a
potential memory leak. That potential already exists with other
variables, and has its own ticket #133589.
commit 1f0689a9416d6db5cac5057d2628718ee936eaf5
Author: Karl Williamson <[email protected]>
Date: Wed Nov 7 20:20:53 2018 -0700
XXX see msg in previous commit
commit 350abc9e9396179285d291fba7edb655c5daa144
Author: Karl Williamson <[email protected]>
Date: Wed Nov 7 20:11:50 2018 -0700
XXX utf8.c: calculate variants instead of assuming worst case
When converting a byte string to UTF-8, the needed size may increase due
to some bytes (the UTF-8 variants) occupying two bytes instead of one
under UTF-8.
Prior to this commit, the string was assumed to contain only variants,
and enough memory was allocated for the worst case, then the excess was
returned at the end.
This commit actually calculates how much space is needed and allocates
only that, so there is no need to trim afterwards.
There is extra work involved in doing this calculation. But the string
is parsed per-word. For short strings, it doesn't much matter either
way. But for very long strings, it seems to me that the consequences of
potentially allocating far too much memory outweigh the cost of this
extra work.
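A minimal sketch of the count-then-allocate idea, not the code in
utf8.c: it assumes an ASCII platform where every Latin-1 byte >= 0x80
is a variant needing two UTF-8 bytes, and it counts byte by byte rather
than per word. The names count_variants and latin1_to_utf8 are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Count the bytes that occupy two bytes once encoded as UTF-8.
 * Assumption: ASCII platform, so a variant is any byte >= 0x80. */
size_t count_variants(const unsigned char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += s[i] >> 7;              /* high bit set => variant */
    return n;
}

/* Convert a Latin-1 byte string to UTF-8, allocating exactly
 * len + variants + 1 bytes, so nothing has to be trimmed afterwards.
 * Returns NULL if the allocation fails; the caller frees the result. */
unsigned char *latin1_to_utf8(const unsigned char *src, size_t len,
                              size_t *out_len)
{
    assert(src != NULL && out_len != NULL);

    size_t need = len + count_variants(src, len);
    unsigned char *dst = malloc(need + 1);
    if (dst == NULL)
        return NULL;

    unsigned char *d = dst;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            *d++ = c;                                   /* invariant */
        } else {
            *d++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            *d++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    *d = '\0';
    *out_len = need;
    return dst;
}
```

The worst-case approach would instead allocate 2 * len + 1 bytes up
front and give back the unused part at the end.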
commit 29bf1531feffc99978a9a576b1f7d16bc95fb5a5
Author: Karl Williamson <[email protected]>
Date: Wed Nov 7 20:07:51 2018 -0700
regcomp.c: Refactor to remove an else and a NOT_REACHED
This just simplifies things a bit.
commit 17ccb901a92f91da0d8f59e71ddf5cf4afdceeed
Author: Karl Williamson <[email protected]>
Date: Wed Nov 7 18:44:56 2018 -0700
t/re/reg_mesg.t: Add test
Verify this still works after the recent removal of the sizing pass.
commit f310f6488981166fe7a5ae0b914d64924478c3dd
Author: Karl Williamson <[email protected]>
Date: Wed Nov 7 18:40:37 2018 -0700
handy.h: Add some comments
This allows us to remove a comment in regcomp.c
commit e94bdf7da8f48bbf043b65959597b46347ee34d3
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 22:49:51 2018 -0700
regcomp.c: Remove parameter no longer used and refactor
This static function no longer is called with a non-NULL final
parameter. That means it no longer returns a list, and its name is
hereby changed to reflect that. It also means the function can be
refactored and made simpler.
commit f929f83dcf67069541614506cadc6479c4746f1b
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 18:44:46 2018 -0700
regcomp.c: Remove now always NULL parameter
This parameter is always NULL. No need to have it in this static
function.
commit 2fee0afafaf644eaaa6b50f4e6cc7217c7402d07
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 18:26:39 2018 -0700
regcomp.c: Don't restart parse for /d to /u if no need to
This commit keeps track of whether any operations have been encountered
that differ under /d from /u. If we switch to /u and haven't so far
found anything that differs, there's no need to reparse.
commit 5c877d53fe0cef8dce9010616e71944241f0e353
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 18:10:36 2018 -0700
regcomp.c: Don't restart parse for /d to /u if reparsing anyway
Prior to this commit, if the rules changed from /d to /u, the parse was
immediately restarted. This commit changes that so that it doesn't do
this if it is known that the parse will be redone anyway, but a full
parse needs to be done first in order to count the parentheses.
commit d5b8c76b2fac25935017630f7ce08ec0993d66f3
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 18:02:07 2018 -0700
regcomp.c: Don't restart parse now if doing so later
Prior to this commit, if it became apparent that long branches were
going to be needed, the parse was immediately restarted. This commit
changes that so that it doesn't do this if it is known that the parse
will be redone anyway, but a full parse needs to be done first in order
to count the parentheses.
commit c739703872275dab384eb304ed24c1cda33393c7
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 17:41:18 2018 -0700
regcomp.c: Swap 'if' branches for readability
It's easier to understand if the simplest case is first in the code.
commit 6bb91afbb0f43fe22480a93f5c8bd1a7c36140ba
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 17:31:21 2018 -0700
regcomp.c: Refactor constructing EXACTish nodes
The previous commits have allowed us to refactor this to eliminate
redundancies.
Previously, the same logic was done separately for UTF-8 and non-UTF-8
patterns. This refactors so the logic is done once. Only the details
differ between UTF-8 and non-UTF-8, so the differences are now confined
to those details, without having to duplicate the logic.
commit 3c739ec692d78567e49eee794a2163d4dae473b7
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 10:03:00 2018 -0700
regcomp.c: XXX prepare
commit f9fd8d9914a4df1995406c0754ce5493f628aef8
Author: Karl Williamson <[email protected]>
Date: Sat Nov 3 10:03:37 2018 -0600
regcomp.c: Remove obsolete code
This code was obsoleted by removal of the sizing pass. Previously we
had to take special care when encountering the LATIN SMALL LETTER SHARP
S because it can fold to more bytes than it occupies. But with the
sizing pass gone, that is no longer necessary.
commit f95ea5dfef46b18e56690c39f7703e15f2261de4
Author: Karl Williamson <[email protected]>
Date: Sat Nov 3 09:41:52 2018 -0600
regcomp.c: Comments, white-space, rmv extra parens
This commit re-indents things after the previous commit added a block,
fixes typos in comments, removes obsolete references in them to the
sizing pass, adds some comments, and makes some white-space changes.
In one case it removes extraneous parentheses.
commit 7726dc0b3814bfecf856540e454b05e7f11195fe
Author: Karl Williamson <[email protected]>
Date: Thu Nov 1 09:08:04 2018 -0600
regcomp.c: Remove no longer useful code
This code has been obsoleted by the previous commit. That commit looks
at a bracketed character class in general and optimizes it to some
faster and/or simpler operation if possible. The code being removed in
this commit was originally added to try to find some optimizations that
were feasible to find in the sizing pass. Now that we don't have a
sizing pass, and we find optimizations generally, this code doesn't add
any value.
commit a205b8feb803cc757f90a5091e4a34804f615e15
Author: Karl Williamson <[email protected]>
Date: Thu Nov 1 01:43:39 2018 -0600
Find optimizations for /[[:posix:]]/a
Various optimizations are done when the regular expression compiler sees
a bracketed character class. For example, /[a]/ is optimized into /a/.
This wasn't done for the various POSIX classes under /a, as the speed of
the operations of using a regular bracketed class (which uses a bitmap)
and of an optimized version (which uses a specialized opcode for the
POSIX class) is similar. The only advantage would have been that the
specialized opcode is 1/10 the size of the bitmap. But the optimization
in general couldn't come until the second pass, after the size had
already been calculated and space allocated. So there was no savings.
But now that there is no separate sizing pass, doing the optimization
actually will save space.
commit 1267cb20dfb181ee53905d3969ec58111edce61d
Author: Karl Williamson <[email protected]>
Date: Sat Oct 20 17:24:53 2018 -0600
regcomp.c: Don't do unnecessary tests
A pattern being UTF-8 implies it is /u. By properly initializing
whether it is /u or not, we can avoid the existing additional tests for
UTF-8 in those places where we care about /u but don't care about
UTF-8ness.
commit d5a395255530534894638be4399cdf49c6db0344
Author: Karl Williamson <[email protected]>
Date: Sun Oct 28 21:24:22 2018 -0600
regcomp.c: Make sure UTF-8 regex pattern uses /u
When a pattern is in UTF-8, Unicode rules should be selected. This
commit makes sure that this happens and that the displayable form of the
pattern shows /u.
I don't know of any bugs this fixes.
commit 0b8b43ccf185cc1588c5b3901794ae1bc1eaf5db
Author: Karl Williamson <[email protected]>
Date: Tue Nov 6 09:51:53 2018 -0700
t/re/pat.t: Add a test
commit 203ce5de4cc409ef5eeba69182b9609ae6620fd3
Author: Karl Williamson <[email protected]>
Date: Mon Oct 15 21:30:00 2018 -0600
XXX don't push now. Fix up undefined behavior in warnings.h
commit ba292928245dd3e14a6280e8f72bc2b8ee0602f5
Author: Karl Williamson <[email protected]>
Date: Sun Oct 14 19:45:06 2018 -0600
Use "%td" for printing ptrdiff_t values.
I did not know that perl supported this C99 feature, but it does.
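For reference, a small standalone sketch of the C99 "t" length
modifier, which matches a ptrdiff_t argument without a cast. The
helper name format_offset is hypothetical and is not part of the perl
source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write the distance between two pointers into buf using "%td".
 * Both pointers must point into the same array.  Returns what
 * snprintf returns: the length the full output would have. */
int format_offset(char *buf, size_t size,
                  const char *start, const char *cur)
{
    assert(buf != NULL && start != NULL && cur != NULL);

    ptrdiff_t off = cur - start;    /* pointer subtraction => ptrdiff_t */
    return snprintf(buf, size, "offset=%td", off);
}
```

Without "%td" the value would typically be cast to long or to IV and
printed with the matching format for that type.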
-----------------------------------------------------------------------
--
Perl5 Master Repository