This Week on perl5-porters - 24-29 February 2008
"Is this a bug? Or why is this the expected behaviour?" -- Steffen
Ullrich, playing with signal handlers.
Topics of Interest
"use encoding 'utf8'" bug for Latin-1 range
The thread about "use encoding" continued this week. Juerd Waalboer
gave one of the best concise explanations of why Perl's current model
for dealing with Unicode is broken: the "\x" hex escape is overloaded
for both bytes and characters ("\x2b" *versus* "\x{d0b2}"), and it is
interpreted too early, while the source is being read.
As a result, a source code file encoded in an Asian language cannot
embed a Latin-1 character like an e-acute.
Much discussion of remarkable civility followed, regarding what to do
about the matter. Glenn Linderman put forward the following ideas:
* Deprecate "use encoding".
* Deprecate non-ASCII characters in 5.12 source code, unless a
source encoding has been specified.
* Allow Unicode semantics to be applied to all character operations
on strings (case conversion, caseless comparisons and so on),
regardless of their internal representations.
* Sort out the timing of when "\x", "\x{}" and "\N" take effect.
No-one appeared to lament the idea of letting "encoding" go.
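The third idea addresses the way the same character can behave
differently depending on how perl happens to store the string
internally. A minimal sketch of the symptom, assuming a perl where
neither "use locale" nor the "unicode_strings" feature is in effect
(the variable names are only for illustration):

```perl
use strict;
use warnings;

my $native   = "\xe9";       # e-acute, stored internally as one byte
my $upgraded = "\xe9";
utf8::upgrade($upgraded);    # same character, UTF-8 internal storage

# The two strings are equal as far as eq is concerned...
print "strings are equal\n" if $native eq $upgraded;

# ...but case conversion only uses Unicode rules on the upgraded one.
printf "uc(native)   = %02X\n", ord uc $native;      # E9 (unchanged)
printf "uc(upgraded) = %02X\n", ord uc $upgraded;    # C9 (E-acute)
```

Applying Unicode semantics regardless of representation would make
both calls return the same answer.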
Yves Orton pointed out that Microsoft managed to get their Unicode
handling more or less right, albeit at a certain cost to their API,
and regretted that Unix-like operating systems supplied the absolute
strict minimum, pushing all the work onto each and every client
program. Which meant that nothing really worked at all, not even the
so-called shebang line.
Juerd and Nicholas put forward that there is a case to be made for
perl to figure out itself whether a given source file is in ASCII,
Latin-1 or UTF-8. It turns out that it's just about impossible to
construct a sensible Latin-1 file that also turns out to be valid
UTF-8. The idea is to start out in 7-bit ASCII and carry on until a
byte with the high bit set is encountered.
If this byte introduces a valid UTF-8 character, the rest of the file
must be, too. Any invalid byte sequences thereafter trigger a fatal
compile-time error. Otherwise it means it must be Latin-1, in which
case similar but different rules apply which also cause the
compilation to halt if encodings change mid-stream. The key issue is
to determine that the encoding does indeed change.
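The detection idea can be sketched in a few lines of Perl. This is
only an illustration of the approach described above, not code from
the thread: the function name guess_source_encoding is made up, and
since the summary does not spell out the Latin-1 rules, the sketch
simply stops checking once it has decided on Latin-1.

```perl
use strict;
use warnings;

# One well-formed multi-byte UTF-8 character (shortest form only,
# no surrogates, nothing above U+10FFFF).
my $utf8_char = qr/
      [\xC2-\xDF][\x80-\xBF]
    | \xE0[\xA0-\xBF][\x80-\xBF]
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
    | \xED[\x80-\x9F][\x80-\xBF]
    | \xF0[\x90-\xBF][\x80-\xBF]{2}
    | [\xF1-\xF3][\x80-\xBF]{3}
    | \xF4[\x80-\x8F][\x80-\xBF]{2}
/x;

# Hypothetical helper: takes the raw bytes of a source file.
sub guess_source_encoding {
    my ($bytes) = @_;

    # Stay in 7-bit ASCII until a byte with the high bit set shows up.
    return 'ascii' if $bytes !~ /[\x80-\xFF]/;
    my $first = $-[0];                 # offset of that first high byte
    my $rest  = substr($bytes, $first);

    # If it does not start a valid UTF-8 character, assume Latin-1.
    # (Assumption: no further checks are made in this branch.)
    return 'latin1' unless $rest =~ /\A$utf8_char/;

    # Committed to UTF-8: any later invalid sequence is fatal.
    $rest =~ /\A(?:[\x00-\x7F]|$utf8_char)*\z/
        or die "Malformed UTF-8 after offset $first\n";
    return 'utf8';
}

print guess_source_encoding("print 1;\n"),   "\n";   # ascii
print guess_source_encoding("caf\xC3\xA9"),  "\n";   # utf8
print guess_source_encoding("caf\xE9"),      "\n";   # latin1
```

A file that starts out as UTF-8 and then contains a lone "\xE9" makes
the function die, which corresponds to the compile-time error
mentioned above.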
EBCDIC was also mentioned in passing. Sadly, Perl no longer runs on
EBCDIC due to a general lack of nurturing. Then again, if it was
important, Nicholas felt that someone from IBM would have been in
touch at some point.
for some reason I now have a splitting headache
http://xrl.us/bg932
Interrupting "system()" with signal depends on signal handler
Steffen Ullrich noticed that an "alarm" signal handler that does a
"syswrite" behaves differently from one that does a "print". After
digging through pp_sys.c, he found that he could make the "print"
version (which was working correctly) misbehave in the same way by
setting $! to undef.
He produced a one-line patch that fixed the behaviour (hmm, did we get
a test?) and Rafael applied it as change #33408.
handle with care
http://xrl.us/bg98g
CPAN NetBIOS broadcasts
Linda W was scratching her head wondering why CPAN installations on
cygwin were glacially slow. After running a network trace, she
discovered that what had been a path /var/cache/cpan was being
interpreted as a UNC path (/cache/cpan on host //var).
This caused the local host to send out plaintive calls for host //var
to please call home. Michael G. Schwern thought that this sounded like
the same problem described in CPAN bug #32813, as did Linda.
Yves Orton, current maintainer of "ExtUtils::Install", which is where
the problem originated, pushed out a new version and Linda confirmed
that it solved the problem.
Ken Williams was not around to comment on how hard it is to use
File::Spec correctly.
not quite Unix, not quite Windows
http://xrl.us/bg934
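The summary does not say how the stray leading slash crept in. As a
purely hypothetical illustration, assume it came from gluing a root
directory and an absolute path together by hand. The sketch below
contrasts that with File::Spec::Unix (named explicitly so the result
does not depend on the platform); looks_like_unc is an invented
helper, not part of any module.

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $root = '/';
my $dir  = '/var/cache/cpan';

# Naive concatenation doubles the leading slash.
my $naive = $root . $dir;

# catdir joins the pieces and collapses the repeated slashes.
my $clean = File::Spec::Unix->catdir($root, $dir);

# Invented helper: does the path start with exactly the kind of
# "//host" prefix that Cygwin would treat as a UNC path?
sub looks_like_unc { $_[0] =~ m{^//[^/]} ? 1 : 0 }

printf "%-20s unc=%d\n", $naive, looks_like_unc($naive);
printf "%-20s unc=%d\n", $clean, looks_like_unc($clean);
```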
Google summer of code
Eric Wilhelm got the ball rolling on Perl's participation in Google's
Summer of Code project. But you've probably heard about this in other
venues. All hail Eric.
The Perl 5 Wiki is the place to go for the latest information.
summertime fun
http://xrl.us/bg936
http://xrl.us/bg938
Patches of Interest
sv.c consting goodness
Steven Schubiger's consting patch number 4 from the beginning of the
month was applied. This led to patches 5, 6, 7, 8 and 9 from Steven,
each applying ever more consting to sv.c, all of which were applied
by various porters.
http://xrl.us/bg94a
no archlib in otherlibdirs
After some long, hard thought, Andy Dougherty remembered why Reini
Urban's plan for organising site and vendor libraries on Cygwin
wouldn't work in the general case. So Reini withdrew his patch but
would continue to use it locally.
http://xrl.us/bg94c
On the other hand, his enhancements to "B::Debug" made it in.
win some, lose some
http://xrl.us/bg94e
warning message for -M:Foo, extended and revised
Robin Barker finally settled on "Invalid module name :Foo with -M
option: contains single ':'", which was good enough for Rafael.
colonphun
http://xrl.us/bg94g
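The check behind that message lives in perl's C source. As a rough
Perl-level sketch of the same idea (the helper name has_single_colon
is invented for this example), a module name can be tested for a ':'
that is not part of a '::' pair:

```perl
use strict;
use warnings;

# Invented helper: true if $name contains a ':' that has no ':'
# immediately before or after it.
sub has_single_colon {
    my ($name) = @_;
    return $name =~ /(?<!:):(?!:)/ ? 1 : 0;
}

for my $name (':Foo', 'Foo:Bar', 'Foo::Bar') {
    printf "%-10s %s\n", $name,
        has_single_colon($name) ? "invalid: contains single ':'" : 'ok';
}
```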
More diagnostics for Fatal.pm
Slaven Rezic enhanced "Fatal" to name the builtin that could not be
overridden in its dying message.
if I told you I would have to kill you
http://xrl.us/bg94i
Thread patches
Jerry D. Hedden is doing so much work on threads at the moment, he
deserves his own section.
First off, the patch to not install threads on non-thread builds was
reverted (Michael G. Schwern's killer argument being that at least
that way you get a nice error message).
http://xrl.us/bg94k
Then the CPAN 1.69 version of "threads" was synch'ed with blead.
http://xrl.us/bg94n
As was "threads::shared" 1.17.
http://xrl.us/bg94p
At the end of the week, he also delivered version 1.18, which added
some diagnostics to help track down what's going wrong when t/stress.t
decides to go belly up.
http://xrl.us/bg94r
Moving along, "Thread::Semaphore" 2.07 was checked in.
http://xrl.us/bg94t
And last but not least, so was "Thread::Queue" 2.06.
http://xrl.us/bg94v
Watching the smoke signals
It looked like t/stress.t in the threads module failed, and so Jerry
asked if there was any chance of seeing what the new diagnostics had
to say. Steve Hay discovered that the problem was in fact a TODO test
that had started to pass, and Test::Smoke got confused and recorded it
as a failure.
Smoke [5.11.0] 33390 FAIL(F) MSWin32 WinXP/.Net SP2 (x86/2 cpu)
http://xrl.us/bg94x
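For readers unfamiliar with TODO tests: in TAP output a test line
carrying a "# TODO" directive is expected to fail, so an "ok" on such
a line is an unexpected success rather than a failure. The following
sketch shows how a report generator might classify single TAP lines.
It is not Test::Smoke's code, and classify_tap_line is an invented
name.

```perl
use strict;
use warnings;

# Invented helper: classify one line of TAP output.
sub classify_tap_line {
    my ($line) = @_;
    my ($not, $rest) = $line =~ /^(not )?ok\b(.*)/
        or return 'other';                  # plan, diagnostics, etc.
    my $todo = $rest =~ /#\s*TODO\b/i;      # TODO directive present?
    return $todo ? ($not ? 'expected failure' : 'unexpected success')
                 : ($not ? 'fail'             : 'pass');
}

for my $line ('ok 1 - basic',
              'not ok 2 - broken',
              'not ok 3 - known bug # TODO not fixed yet',
              'ok 4 - known bug # TODO not fixed yet') {
    printf "%-45s => %s\n", $line, classify_tap_line($line);
}
```

Only the second line is a genuine failure; the fourth is the case
that got misreported.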
New and old bugs from RT
Segfault when calling "->next::method" on non-existing package (#51092)
David Landgren thought that the test that Rafael Garcia-Suarez added
as part of the fix for this bug should have had the RT bug number
embedded in it somewhere. In other news, we discovered that
there are 485 subscribers to perl5-porters.
http://xrl.us/bg94z
Perl5 Bug Summary
288 new + 1500 open = 1788 (+3 -2)
http://xrl.us/bg943
http://rt.perl.org/rt3/NoAuth/perl5/Overview.html
New Core Modules
ExtUtils::Install version 1.45
This was the fix for the "//var" problem noted by Linda W. (But
stay tuned next week for exciting new developments).
http://xrl.us/bg945
ExtUtils::MakeMaker 6.44
Michael G. Schwern rolled out 6.34_01 plus Yves's EU::I 1.45 as
version 6.44. Other assorted bugfixes made it in, but Michael
announced that he had declined to put in the fixes required to
make paths with whitespace work correctly, saying that he wanted
to think about a better solution.
http://xrl.us/bg947
In Brief
Last week, Jim Cromie had the newfound ability to hook XML analysis to
a test suite (via the "PERL_XMLDUMP" environment variable). This week,
Jim wrote a patch to test -Dmad's PERL_XMLDUMP= output. It was not
applied.
truly madly
http://xrl.us/bg949
On the other hand, Rafael did apply his optimisation of the
"OP_IS_(FILETEST|SOCKET)" macros, with some "OP *"/"int" fuzz.
http://xrl.us/bg95b
The exact recipe for signalling a non-met prerequisite (such that a
perl build without threads should not attempt to require threads) was
nailed down and codified on the CPAN Testers wiki.
http://cpantest.grango.org/
http://xrl.us/bg95d
Salvador Fandiño found that the documentation made no mention of
"av_delete" calling "sv_2mortal" on the returned "SV". Yet "av_pop"
and "av_shift" don't and so the documentation should probably point
out the difference.
quirk quirk
http://xrl.us/bg95f
Craig Berry reported that maint-5.8 was not compiling on VMS, largely
due to incorrect prototypes in re.xs. Nicholas Clark determined that a
subsequent integration fixed the problem.
a matter of time
http://xrl.us/bg95h
Steve Peters wanted to know why quad words on Win32 weren't
configured, since all the pieces were in place to allow them to be.
Jan Dubois thought that it wasn't much of a problem since you really
need to have "IVSIZE" defined to be 8 to take any advantage of them.
mmm, bignums
http://xrl.us/bg95j
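To see how a particular perl was built, the Config module exposes the
integer size, and the "Q" pack format (documented as raising an
exception when quads are not supported) can be probed with eval. A
small sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;
use Config;

# Size in bytes of perl's integer type (IV) for this build.
my $ivsize = $Config{ivsize};

# pack 'Q' dies on builds without 64-bit integer support.
my $has_quads = eval { pack('Q', 1); 1 } ? 1 : 0;

printf "ivsize is %d bytes; pack 'Q' %s\n",
    $ivsize, $has_quads ? 'works' : 'is not available';
```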
Nicholas Clark hacked "perlbug" to allow it to send thank-you messages
back to the porters.
send more money
http://xrl.us/bg95m
Nicholas also got his languages mixed up trying to write else if in C
macros. Fortunately there are only four or five distinct syntaxes to
master for writing else if constructs in all computer languages.
as if
http://xrl.us/bg95o
About this summary
This summary was written by David Landgren. I chopped a day off this
week; it makes it easy to start next week on the first of the month.
17-23 February 2008
http://xrl.us/bg95q
Weekly summaries are published on http://use.perl.org/ and posted on a
mailing list (subscription: [EMAIL PROTECTED]). The
archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
and comments are welcome.
If you found this summary useful, please consider contributing to the
Perl Foundation to help support the development of Perl.