In perl.git, the branch tonyc/utf8-readline has been created <https://perl5.git.perl.org/perl.git/commitdiff/b04324705e636f83ad75bcf8542afbf975fdbc21?hp=0000000000000000000000000000000000000000>
        at  b04324705e636f83ad75bcf8542afbf975fdbc21 (commit)

- Log -----------------------------------------------------------------
commit b04324705e636f83ad75bcf8542afbf975fdbc21
Author: Tony Cook <t...@develop-help.com>
Date:   Thu May 10 13:40:31 2018 +1000

    this code is WIP

    It's broken, don't expect it to work.

    Tests fail and new tests need to be written.

    Defaults and behaviour will change.

    I expect at least the following issues:

    - seeking to a tell position is broken (and to a certain extent may
      remain broken)[1]

    - a warning may be emitted more than once for the same error

    - the same message might be emitted more than once in the same croak
      for a single error

    - the readdelim code might infinite loop

    - if no transformation is needed (i.e. badly encoded input isn't
      permitted at all, and we fail or croak for any other thing we treat
      as an error), then we could avoid the accumulation buffer (future
      optimization?)

    - the code might not follow the PerlIO API properly (which doesn't
      appear to be defined in detail)

    - this code might make you double-facepalm

    but I might be wrong.

    [1] since errors are transformed into the replacement character we
    can't use the position in the buffer to adjust the seek position,
    *but* that might end up being our best guess. The other issue with
    the accumulation buffer should be fixable.
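
Editorial illustration of footnote [1] above, not code from the branch: a minimal Python sketch of why a buffer position stops matching the file position once invalid bytes are replaced with U+FFFD. The helper name decode_with_replacement and the sample bytes are hypothetical, and Python's errors="replace" handler only stands in for whatever the layer actually does.

```python
# Hypothetical sketch: Python's lenient UTF-8 decoding stands in for the
# layer's replacement behaviour; none of this is taken from the branch.

def decode_with_replacement(raw: bytes) -> bytes:
    """Return the UTF-8 bytes a lenient layer might hand to the reader.

    Each invalid byte becomes U+FFFD, which occupies three bytes in UTF-8.
    """
    return raw.decode("utf-8", errors="replace").encode("utf-8")


raw = b"ab\xff\xfecd\n"            # two invalid bytes in the middle
buffered = decode_with_replacement(raw)

raw_offset = raw.index(b"c")       # where "c" really is in the file
buf_offset = buffered.index(b"c")  # where "c" is in the transformed buffer

print(len(raw), len(buffered))     # the two lengths differ
print(raw_offset, buf_offset)      # so do the offsets of the same character
```

Because each replaced byte grows from one byte to three, an offset measured in the transformed buffer overshoots the true file offset, which is why it can only serve as a best guess when adjusting a seek position.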

commit 406e47796e6f68c223ea45992cd140a08dae2a8a
Author: Tony Cook <t...@develop-help.com>
Date:   Tue May 1 11:05:29 2018 +1000

    WIP

commit 37eea6b72294289ee9abec7db901f0b354989562
Author: Tony Cook <t...@develop-help.com>
Date:   Thu Apr 19 10:15:51 2018 +1000

    version bumps: need to be integrated back to their base commits

commit 809192ca3f44dc3c0e7ee581b8c9ef2520dadceb
Author: Karl Williamson <k...@cpan.org>
Date:   Sat Dec 30 11:46:03 2017 -0700

    sv_utf8_decode_flags

commit 944ec303c05b2ae6e18451fc4f2149c0e3471f04
Author: Leon Timmermans <faw...@gmail.com>
Date:   Wed Dec 14 00:17:01 2016 +0100

    Make :via and :scalar use readdelim

commit f508fed368e6cbe7af667e3375bfb330af98bfed
Author: Leon Timmermans <faw...@gmail.com>
Date:   Mon Apr 9 21:49:11 2012 +0200

    Made :utf8 an actual layer

    It will check the input for validity, by default strict validity
    though less strict forms are provided.

    This also means PerlIO::get_layers doesn't return a "utf8"
    pseudo-layer anymore, which can break some code making that
    assumption.

commit 52e0556cfb4d787323e3ec58240750f000addc61
Author: Leon Timmermans <faw...@gmail.com>
Date:   Mon Nov 14 12:15:18 2016 +0100

    Make :encoding use the new readdelim method

commit 5b8da07401a8b74f4ca8301326418f91d2add21b
Author: Leon Timmermans <faw...@gmail.com>
Date:   Mon Nov 14 12:04:51 2016 +0100

    Add fast readdelim to main buffering layers

commit 0408d6f2cdb52f85b845577479db49358b50aa19
Author: Leon Timmermans <faw...@gmail.com>
Date:   Sun Dec 11 15:44:52 2016 +0100

    Implement new style readline and the slow fallback

-----------------------------------------------------------------------

--
Perl5 Master Repository