In perl.git, the branch khw/ebcdic has been created
<http://perl5.git.perl.org/perl.git/commitdiff/693b0df73871ccebe6d3f9887fb2f61833292302?hp=0000000000000000000000000000000000000000>
at 693b0df73871ccebe6d3f9887fb2f61833292302 (commit)
- Log -----------------------------------------------------------------
commit 693b0df73871ccebe6d3f9887fb2f61833292302
Author: Karl Williamson <[email protected]>
Date: Wed Feb 27 16:13:24 2013 -0700
makedepend.SH: Split very long preprocessor lines
This avoids preprocessor limitations
M makedepend.SH
commit 3cf86197a68b29c5178dfca109a5b2577c6994dc
Author: Karl Williamson <[email protected]>
Date: Wed Feb 27 21:59:11 2013 -0700
makedepend.SH: Properly join continuation lines
I had thought that a continuation introduced a space. But no,
a continuation can happen in the middle of a token.
M makedepend.SH
commit a8269b875f377b4c36d1361d0a735b3572337f1a
Author: Karl Williamson <[email protected]>
Date: Wed Feb 27 15:51:28 2013 -0700
makedepend.SH: White-space only
Align continuation backslashes
M makedepend.SH
commit 4a46ca32eb8f866be68e6129459f5c535623d2bf
Author: Karl Williamson <[email protected]>
Date: Wed Feb 27 14:39:28 2013 -0700
makedepend.SH: Remove some unnecessary white space
Multi-line preprocessor directives are now joined into single lines.
This can create lines too long for the preprocessor to handle. This
commit removes blanks adjoining comments that get deleted. This makes
things somewhat less likely to exceed the limit.
This commit also fixes several [] which were meant to each match a tab
or a blank, but editors converted the tabs to blanks
M makedepend.SH
commit bec52ba7407d7115e83a311118fb3dcd159f01b8
Author: Karl Williamson <[email protected]>
Date: Wed Feb 27 14:30:51 2013 -0700
makedepend.SH: Retain '/**/' comments
These comments may actually be necessary.
M makedepend.SH
commit 8d7cb442ae02c432acad22f67a140d2b5ed5633a
Author: Karl Williamson <[email protected]>
Date: Wed Feb 27 08:38:19 2013 -0700
handy.h: Remove extraneous parens
M handy.h
commit 6d4b52d1cc9836f1929b2e0f4a55df9cb914351c
Author: Andy Dougherty <[email protected]>
Date: Wed Feb 27 13:06:07 2013 -0500
Disable gcc-style function attributes on z/OS.
John Goodyear <[email protected]> reports that the z/OS C compiler
supports the attribute keyword, but not exactly the same as gcc.
Instead of a "warning", the compiler emits an "INFORMATIONAL" message
that Configure fails to detect. Until Configure is fixed, just disable
the attributes altogether.
John Goodyear
M hints/os390.sh
commit 6de7e69170633d25c4e8da7f4fff1aff4a5d4411
Author: Andy Dougherty <[email protected]>
Date: Wed Feb 27 09:12:13 2013 -0500
Change os390 custom cppstdin script to use fgrep.
Grep appears to be limited to 2048 characters, and truncates
the output for cppstin. Fgrep apparently doesn't have that limit.
Thanks to John Goodyear <[email protected]> for reporting this.
M hints/os390.sh
commit 19731731d1fcb364b8ac1c5888edfbe138d823eb
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 13:45:19 2013 -0700
utf8.c: Use more clearly named macro
In the case of invariants these two macros should do the same thing,
but it seems to me that the latter name more clearly indicates what is
going on.
M utf8.c
commit 0a99fb35323373348787e5fc40ff3bcace498120
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 13:35:12 2013 -0700
Add macro OFFUNISKIP
This means use official Unicode code point numbering, not native. Doing
this converts the existing UNISKIP calls in the code to refer to native
code points, which is what they meant anyway. The terminology is
somewhat ambiguous, but I don't think will cause real confusion.
NATIVESKIP is also introduced for situations where it is important to be
precise.
M toke.c
M utf8.c
M utf8.h
M utfebcdic.h
commit 88c089b646d641bc96d1e1cfddaa60d11d8a46b1
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 13:22:19 2013 -0700
toke.c: white space only
M toke.c
commit b47f86885984496f5484f4b305e74087d1a51bb5
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 12:08:50 2013 -0700
utf8.c: Deprecate two functions
This is to force any code that has been using these functions to change.
Since the Unicode tables are now stored in native order, these functions
should only rarely be needed.
However, the functionality of these is needed, and in actuality, on
ASCII platforms, the native functions are #defined to these. So what
this commit does is rename the functions to something else, and create
wrappers with the old names, so that anyone using them will get the
deprecation.
M embed.fnc
M embed.h
M mathoms.c
M proto.h
M toke.c
M utf8.c
M utf8.h
commit ba0ff268d8fdea2e44fc6e6d8fc43eb3a15ab887
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 11:26:09 2013 -0700
Deprecate uvuni_to_utf8()
Code should almost never be dealing with non-native code points
M embed.fnc
M embed.h
M proto.h
M toke.c
M utf8.c
M utf8.h
commit 8e3ab33a00a8437987bafb138ec8d487b4acc574
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 11:02:33 2013 -0700
Deprecate utf8_to_uni_buf()
Now that the tables are stored in native order, there is almost no need
for code to be dealing in Unicode order.
M embed.fnc
M proto.h
M utf8.c
commit 10b2efe67f7f8b68221d93b1dcd3192cb5f6a8eb
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 09:00:18 2013 -0700
makedepend.SH: Comment out unnecessary code
This causes problems currently for z/OS. But, since we don't know why
it was there, I'm leaving it in as a placeholder.
M makedepend.SH
commit 550ab86a56d860782146b762cf93fe1a2e9ac2ce
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 20:26:44 2013 -0700
Deprecate valid_utf8_to_uvuni()
Now that all the tables are stored in native format, there is very
little reason to use this function; and those who do need this kind of
functionality should be using the bottom level routine, so as to make it
clear they are doing nonstandard stuff.
M embed.fnc
M proto.h
M utf8.c
commit eb7e9af929c62b6c1d0aafd6beee9a9d12019c65
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 20:14:26 2013 -0700
utf8.c: Swap which fcn wraps the other
This is in preparation for the current wrapee becoming deprecated
M embed.fnc
M embed.h
M proto.h
M utf8.c
M utf8.h
commit 49b311ddf29e7afa6e38b9c7df1ad80f4509b949
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 19:29:34 2013 -0700
utf8.c: Skip a no-op
Since the value is invariant under both UTF-8 and not, we already have
it in 'uv'; no need to do anything else to get it
M utf8.c
commit 749357cbbd8ed0d36d2930bc69923e05c20aef19
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 19:26:50 2013 -0700
utf8.c: Move comment to where makes more sense
M utf8.c
commit 8b58e7195f1b8bdee1ee363aaba5203e4885ee52
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 17:30:10 2013 -0700
APItest: Test native code points, instead of Unicode
M ext/XS-APItest/APItest.pm
M ext/XS-APItest/APItest.xs
M ext/XS-APItest/t/utf8.t
commit fd401ad84e35b19fe5c205374d602d98796fe84a
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 17:25:08 2013 -0700
XXX CPAN Normalize
This converts Unicode::Normalize to use the native tables that are used
by Perl starting in XXX, while using the Unicode-ordered ones that were
used before then.
Another alternative would be to have mktables generate just these tables
in Unicode ordering.
M cpan/Unicode-Normalize/Normalize.xs
commit 1d25789561a2f32036c5e553de67f9d58b70bd0a
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 17:22:55 2013 -0700
XXX CPAN prob wrong Collate
This changes to implicity usenative code points. This is likely wrong,
as the module comes with its own data, that are probably in terms of
Unicode
M cpan/Unicode-Collate/Collate.xs
commit a5d696a8f9bbff840a294334df06637df6880bb3
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 17:12:53 2013 -0700
XXX CPAN Encode.xs
Use core function if available. This will insulate this code from any
future changes.
M cpan/Encode/Encode.xs
commit cfe5f893a0d244df64b6bf572c885e84b426ca3a
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 17:04:24 2013 -0700
XXX CPAN and unsure Encode
M cpan/Encode/Encode.xs
M cpan/Encode/Unicode/Unicode.xs
commit dca5b49d9692e2a200a365e213048a9e617f7a26
Author: Karl Williamson <[email protected]>
Date: Mon Feb 25 17:00:47 2013 -0700
XXX CPAN Encode.xs: fix indent
M cpan/Encode/Encode.xs
commit 9feb2f421f1b8ff7b1159a9ccc76aee81f61d998
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 17:23:15 2013 -0700
Don't refer to U+XXXX when mean native
These messages say the output number is Unicode, but it is really
native, so change to saying is 0xXXXX.
M regen/regcharclass_multi_char_folds.pl
M regexec.c
commit a3bc4199f2eff1347f98f722d90b8da8571d65bc
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 16:43:59 2013 -0700
Convert some uvuni() to uvchr()
All the tables are now based on the native character set, so using
uvuni() in almost all cases is wrong.
M cygwin/cygwin.c
M doop.c
M op.c
M pp_pack.c
M regcomp.c
M regexec.c
M toke.c
M utf8.c
commit 8c56179837933d5f041e409c57b76c9f3748f325
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 16:25:47 2013 -0700
handy.h: White space only
M handy.h
commit 25f3ea43eecebb9258736c13fff288007bda6e08
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 16:19:49 2013 -0700
t/test.pl: Allow native/latin1 string conversions to work on utf8.
These functions no longer have the hard-coded definitions in them,
but now end up resolving to internal functions, so that new encodings
could be added and these would automatically understand them.
Instead of using tr///, these now go character by character and
converting to/from ord, which is slower, but allows them to operate on
utf8 strings.
Peephole optimization should make these essentially no-ops on ascii
platforms.
M t/test.pl
commit b3fd0939d9a5953a343cb37c2ea65c2f53b043c8
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 16:05:55 2013 -0700
t/test.pl: Simplify ord to/from native fcns
This commit changes these functions from converting to/from a string to
calling utf8:: functions which operate on ordinals instead.
M t/test.pl
commit 1061aee53f8d4ddc48f88957de917ff70f5dbf7d
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 15:35:38 2013 -0700
Make casing tables native
These are final tables that haven't been converted to native character
set casing.
M perl.h
M utfebcdic.h
commit 542b0745b1df0778711c87d682ebc2defda6ef33
Author: Karl Williamson <[email protected]>
Date: Sun Feb 24 15:32:30 2013 -0700
utfebcdic.h: Remove trailing spaces
M utfebcdic.h
commit e102751f621b5109dd0059cd94871e80969e1162
Author: Karl Williamson <[email protected]>
Date: Fri Feb 22 18:55:26 2013 -0700
EBCDIC has the unicode bug too
We have not had a working modern Perl on EBCDIC for some years. When I
started out, comments and code led me to conclude erroneously that
natively it supported semantics for all 256 characters 0-255. It turns
out that I was wrong; it natively (at least on some platforms) has the
same rules (essentially none) for the characters which don't correspond
to ASCII onees, as the rules for these on ASCII platforms.
This commit forces those rules on EBCDIC platforms (even should there be
one that natively uses all 256). To get all 256, the same things like
'use feature "unicode_strings"' must now be done.
M autodoc.pl
M handy.h
M pod/perlfunc.pod
M pod/perlre.pod
M pod/perlrecharclass.pod
M pod/perlunicode.pod
M pod/perlunifaq.pod
commit d8b59aaae438b443f3361f74addbd01aa70b8cd4
Author: Karl Williamson <[email protected]>
Date: Thu Feb 21 13:47:52 2013 -0700
handy.h: Solve a failure to compile problem under EBCDIC
handy.h is included in files that don't include perl.h, and hence not
utf8.h. We can't rely therefore on the ASCII/EBCDIC conversion
macros being available to us. The best way to cope is to use the native
ctype functions. Most, but not all, of the macros in this commit
currently resolve to use those native ones, but a future commit will
change that.
M handy.h
commit d7641e2c22c163f01a5d7ad9072cac638438e5fb
Author: Karl Williamson <[email protected]>
Date: Thu Feb 21 13:35:12 2013 -0700
handy.h: Simplify some macro definitions
Now, only one of the macros relies on magic numbers (isPRINT), leading
to clearer definitions.
M handy.h
commit 51688b6e58a84328a62ad417a3d2fcbea739bd0b
Author: Karl Williamson <[email protected]>
Date: Thu Feb 21 13:26:49 2013 -0700
handy.h: Combine macros that are same in ASCII, EBCDIC
These 4 macros can have the same RHS for their ASCII and EBCDIC
versions, so no need to duplicate their definitions
This also enables the EBCDIC versions to not have undefined expansions
when compiling without perl.h
M handy.h
commit f236ea46a348a8097cb14fe590db55f038d45d74
Author: Karl Williamson <[email protected]>
Date: Wed Feb 20 10:39:48 2013 -0700
Deprecate NATIVE_TO_NEED and ASCII_TO_NEED
These macros are no longer called in the Perl core. This commit turns
them into functions so that they can use gcc's deprecation facility.
I believe these were defective right from the beginning, and I have
struggled to understand what's going on. From the name, it appears
NATIVE_TO_NEED taks a native byte and turns it into UTF-8 if the
appropriate parameter indicates that. But that is impossible to do
correctly from that API, as for variant characters, it needs to return
two bytes. It could only work correctly if ch is an I8 byte, which
isn't native, and hence the name would be wrong.
Similar arguments for ASCII_TO_NEED.
The function S_append_utf8_from_native_byte(const U8 byte, U8** dest)
does what I think NATIVE_TO_NEED intended.
M embed.fnc
M mathoms.c
M proto.h
M toke.c
M utf8.h
M utfebcdic.h
commit 1455a70c59254accdefb62f1213831c21e154bb4
Author: Karl Williamson <[email protected]>
Date: Wed Feb 20 10:26:43 2013 -0700
Remove remaining calls of NATIVE_TO_NEED
These calls are just copying the input to the output byte by byte.
There is no need to worry about UTF-8 or not, as the output is just an
exact copy of the input
M toke.c
commit ff7710451c91b0f1b968ec0ac7c5ad723e5bd3ce
Author: Karl Williamson <[email protected]>
Date: Wed Feb 20 08:12:15 2013 -0700
toke.c: Remove some NATIVE_TO_NEED calls
I believe NATIVE_TO_NEED is defective, and will remove it in a future
commit. But, just in case I'm wrong, I'm doing it in small steps so
bisects will show the culprit. This removes the calls to it where the
parameter is clearly invariant under UTF-8 and UTF-EBCDIC, and so the
result can't be other than just the parameter.
M toke.c
commit 90133fc55f6253c4c7c386b22050e93336ad1bc8
Author: Karl Williamson <[email protected]>
Date: Wed Feb 20 08:22:07 2013 -0700
toke.c: in [A-Za-z] use macros that exclude non-ASCII alphas
This code is attempting to deal with the problem of holes in the ranges
a-z and A-Z in EBCDIC. Prior to this patch, it accepeted things like A
WITH GRAVE, etc, which shouldn't have the special processing to deal
with the holes
M toke.c
commit 6603add63d88e6be7d10265cd4bc17a45674fd8b
Author: Karl Williamson <[email protected]>
Date: Tue Feb 19 15:13:19 2013 -0700
Use real illegal UTF-8 byte
The code here was wrong in assuming that \xFF is not legal in UTF-8
encoded strings. It currently doesn't work due to a bug, but that may
eventually be fixed: [perl #116867]. The comments are also wrong that
all bytes are legal in UTF-EBCDIC.
It turns out that in well-formed UTF-8, the bytes C0 and C1 never appear
(C2, C3, and C4 as well in UTF-EBCDIC), as they would be the start byte
of an illegal overlong sequence.
This creates a #define for an illegal byte using one of the real illegal
ones, and changes the code to use that.
No test is included due to #116867.
M op.c
M toke.c
M utf8.h
commit b0227b4009e93aa56c48efd3d37216324c1961a5
Author: Karl Williamson <[email protected]>
Date: Sun Feb 17 14:00:13 2013 -0700
toke.c: Don't remap \N{} for EBCDIC
Everything is now in native,
M toke.c
commit 0554a3404b6851f16ea5cec15136c099db013890
Author: Karl Williamson <[email protected]>
Date: Sun Feb 17 13:50:45 2013 -0700
toke.c: Remove remapping for EBCDIC for octal
The code prior to this commit converted something like \04 into its
EBCDIC equivalent only in double-quoted strings. This was not done in
patterns, and so gave inconsistent results. The correct thing to do
should be to do the native thing, what someone who works on a platform
would think \04 do. Platform independent characters are available
through \N{}, either by name or by U+.
The comment changed by this was wrong, as in some cases it was native,
and in some cases Unicode.
M toke.c
commit a78921afce377df42e2e80bc937dd70a292f5748
Author: Karl Williamson <[email protected]>
Date: Sun Feb 17 13:47:13 2013 -0700
Remove EBCDIC remappings
Now that the tables are stored in native format, we shouldn't be doing
remapping.
Note that this assumes that the Latin1 casing tables are stored in
native order; this hasn't been done yet.
M handy.h
M perly.c
M pp.c
M regcomp.c
M regexec.c
M utf8.c
commit d972e2fb5ce708817cdba9d0f9a4014b8b47423a
Author: Karl Williamson <[email protected]>
Date: Sun Feb 17 12:46:05 2013 -0700
Add and use macro to return EBCDIC
The converstion from UTF-8 to code point should generally be to the
native code point. This adds a macro to do that, and converts the
core calls to the existing macro to use the new one instead. The old
macro is retained for possible backwards compatibility, though it
probably should be deprecated.
M handy.h
M pp.c
M regcomp.c
M regexec.c
M toke.c
M utf8.c
M utf8.h
commit 144e5d5c64c7f4397fe3ea0778522ab1702ae721
Author: Karl Williamson <[email protected]>
Date: Sun Feb 17 09:18:06 2013 -0700
charnames: fix nit in comment
M lib/_charnames.pm
commit 3bd78f38e6416f085d876ab8750055486a8bcb56
Author: Karl Williamson <[email protected]>
Date: Sat Feb 16 11:05:44 2013 -0700
charnames: Make work in EBCDIC
Now that mktables generates native tables, the only thing that was
needed was to make U+ mean Unicode instead of native.
M lib/_charnames.pm
M lib/charnames.pm
commit ebe7e35f328796a4f3d2f7683bb8c7a5cda34713
Author: Karl Williamson <[email protected]>
Date: Sat Feb 16 09:35:56 2013 -0700
Unicode::UCD: Work on non-ASCII platforms
Now that mktables generates native tables, it is a fairly simple matter
to get Unicode::UCD to work on those platforms.
M lib/Unicode/UCD.pm
commit fedbd68088202a50d5a1d21b7d0633abffca9d5f
Author: Karl Williamson <[email protected]>
Date: Thu Feb 14 22:16:38 2013 -0700
mktables: Generate native code-point tables
The output tables for mktables are now in the platform's native
character set. This means there is no change for ASCII platforms, but
is a change for EBCDIC ones.
Since we currently don't have any EBCDIC test platforms, I tested this
by faking it out to generate EBCDIC data, and then eye-balled the
results.
Code that didn't realize there was a potential difference between EBCDIC
and non-EBCDIC platforms will now start to work; code that tried to do
the right thing under these circumstances will no longer work. Fixing
that comes in later commits.
M lib/unicore/mktables
commit cb9d53983427e2f0dfb39a1f166f12cf42dd063a
Author: Karl Williamson <[email protected]>
Date: Thu Feb 14 10:50:00 2013 -0700
Fix some EBCDIC problems
These spots have native code points, so should be using the macros for
native code points, instead of Unicode ones.
M regcomp.c
M sv.c
M toke.c
commit 4c478c1658483d61ee2b7c4f94a42c64c8298da7
Author: Karl Williamson <[email protected]>
Date: Wed Feb 13 22:10:19 2013 -0700
Remove unnecessary temp variable in converting to UTF-8
These areas of code included a temporary that is unnecessary.
M inline.h
M regcomp.c
M sv.c
commit 307a48ddc3373bf3c1ebfc0afc4225b50d228bae
Author: Karl Williamson <[email protected]>
Date: Wed Feb 13 22:00:55 2013 -0700
utf8.h: Correct macros for EBCDIC
These macros were incorrect for EBCDIC. The 3 step process given in
utfebcdic.h wasn't being followed.
M utf8.h
commit 88f8a816d5dbf54862220c715fbce5e028e66cfc
Author: Karl Williamson <[email protected]>
Date: Sat Feb 9 21:23:30 2013 -0700
Extract common code to an inline function
This fairly short paradigm is repeated in several places; a later commit
will improve it.
M embed.fnc
M embed.h
M inline.h
M pp_pack.c
M proto.h
M sv.c
M toke.c
M utf8.c
commit 987a0f663fc0b7f00eef04306dea547bd75612f6
Author: Karl Williamson <[email protected]>
Date: Thu Feb 7 21:35:57 2013 -0700
Don't use EBCDIC macro for a C language escape
C recognizes '\a' (for BEL); just use that instead of a look-up.
regen/unicode_constants.pl could be used to generate the character for
the ESC (set in surrounding code), but I didn't do that because of
potential bootstrapping problems when porting to an EBCDIC platform
without a working perl. (The other characters generated in that .pl are
less likely to cause problems when compiling perl.)
M regcomp.c
M toke.c
commit 3ef346541c0a017245b86c16a52aec7a025b1b87
Author: Karl Williamson <[email protected]>
Date: Thu Feb 7 19:53:38 2013 -0700
Use byte domain EBCDIC/LATIN1 macro where appropriate
The macros like NATIVE_TO_UNI will work on EBCDIC, but operate on the
whole Unicode range. In the locations affected by this commit, it is
known that the domain is limited to a single byte, so the simpler ones
whose names contain LATIN1 may be used.
On ASCII platforms, all the macros are null, so there is no effective
change.
M handy.h
M regcomp.c
M utf8.c
commit b035b598721ae0ddd424c2b44879bb9454c0b95e
Author: Karl Williamson <[email protected]>
Date: Thu Feb 7 14:31:09 2013 -0700
Use new clearer named #defines
This converts several areas of code to use the more clearly named macros
introduced in a recent commit
M op.c
M toke.c
M utf8.c
M utf8.h
M utfebcdic.h
commit e0202b67141b06ff18ee5547fe74ef8321be4e05
Author: Karl Williamson <[email protected]>
Date: Thu Feb 7 13:52:31 2013 -0700
utf8.h, utfebcdic.h: Create less confusing #defines
This commit creates macros whose names mean something to me, and I don't
find confusing. The older names are retained for backwards
compatibility. Future commits will fix bugs I introduced from
misunderstanding the meaning of the older names.
The older names are now #defined in terms of the newer ones, and moved
so that they are only defined once, valid for both ASCII and EBCDIC
platforms.
M utf8.h
M utfebcdic.h
commit 45e2bdc63b0b82d1261047e75c15fcd71e0fe8f8
Author: Karl Williamson <[email protected]>
Date: Mon Feb 4 14:22:02 2013 -0700
pp_ctl.c: Use isCNTRL instead of hard-coded mask
This is clearer and portable to EBCDIC.
M pp_ctl.c
commit 3096befe00c139c8cac4fc8be5ee4a88e66c0446
Author: Karl Williamson <[email protected]>
Date: Tue Feb 26 13:51:05 2013 -0700
utf8.c: is_utf8_char_slow() should use native length
What is passed is the actual length of the native utf8 character. What
this was calculating was the length it would be if it were a Unicode
character, and then compares, apples to oranges.
M utf8.c
-----------------------------------------------------------------------
--
Perl5 Master Repository