This Week on perl5-porters - 27 March-2 April 2006
Be the first kid on the block to have your very own pragma.
Topics of Interest
[Part of last week's summary vanished into a worm-hole of the
space-time continuum, and reappeared this week. The missing portion is
reproduced below.]
Fixing the "Your Makefile has been rebuilt" tedium
Dave Mitchell grew tired of the fact that the tiniest change to the
perl source causes all the extensions to be rebuilt as well. This is
because they all have a dependency on lib/Config.pm, which itself is
rebuilt each time "miniperl" is rebuilt.
So Dave changed things around so that it (and lib/Config_heavy.pl)
is only updated if it actually changes during a rebuild.
Nicholas thought that this might break parallel makes, and recalled
that the "mv-if-diff" hack had been removed because it operated on a
similar principle: it caused "make" to consider that the Unicode tables
were perpetually out of date, so "Encode" was needlessly rebuilt many
times over in a single run.
Works on my box
http://xrl.us/kqps
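The "only update it if it actually changed" idea can be sketched in
Perl as follows. This is a minimal illustration and not Dave's actual
change to the build: the helper name install_if_changed and the file
names are hypothetical.

```perl
use strict;
use warnings;
use File::Compare qw(compare);
use File::Copy    qw(move);
use File::Temp    qw(tempdir);

# Hypothetical helper: replace $target with $candidate only when the
# contents differ, so that $target's timestamp (and everything "make"
# derives from it) is left alone after a no-op rebuild.
sub install_if_changed {
    my ($candidate, $target) = @_;
    if (-e $target && compare($candidate, $target) == 0) {
        unlink $candidate or die "unlink $candidate: $!";
        return 0;    # identical: target untouched
    }
    move($candidate, $target) or die "move $candidate -> $target: $!";
    return 1;        # new or changed: target replaced
}

sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "open $path: $!";
    print {$fh} $text;
    close $fh or die "close $path: $!";
}

my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/Config.pm";

write_file("$dir/Config.tmp", "version 1\n");
my $first = install_if_changed("$dir/Config.tmp", $target);    # installs

write_file("$dir/Config.tmp", "version 1\n");
my $second = install_if_changed("$dir/Config.tmp", $target);   # no change

write_file("$dir/Config.tmp", "version 2\n");
my $third = install_if_changed("$dir/Config.tmp", $target);    # replaces

print "first=$first second=$second third=$third\n";
```

The catch Nicholas raised follows from the same property: a target that
is deliberately left untouched can look older than its prerequisites,
so "make" may keep re-running the rule that produces it.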
Why "sv_mortalcopy(&sv_no)"?
Nicholas Clark saw that the source code uses the idiom
PUSHs(sv = sv_mortalcopy(&PL_sv_no));
which dates back as far as 5.003, and wondered why the more reasonable
construct
PUSHs(sv = sv_newmortal());
wasn't used instead. Dave Mitchell thought of a couple of reasons why,
such as the "getpw*" returning empty scalars for unsupported features,
but in the general case it was probably quite unnecessary. Yitzchak
Scott-Thoennes realised that there was a subtle difference between the
two constructs.
This all came about because of change #27612, in which Nicholas
changed when and where "sv_mortalcopy" was used in pp_sys.c.
Mortal peril
http://xrl.us/kqpt
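One difference between the two constructs is visible from Perl itself;
whether it is the one Yitzchak had in mind is an assumption. A copy of
"PL_sv_no" is a defined value (the empty string, which also counts as
a numeric zero), whereas "sv_newmortal()" yields an undefined scalar.
A minimal sketch, using "!1" as the Perl-level face of "PL_sv_no":

```perl
use strict;
use warnings;

# !1 yields Perl's canonical "false" value, the Perl-level face of PL_sv_no.
my $no    = !1;
my $fresh = undef;   # what a brand-new, never-assigned scalar looks like

my @report = (
    defined($no)    ? 'no: defined'    : 'no: undef',
    defined($fresh) ? 'fresh: defined' : 'fresh: undef',
);
print "$_\n" for @report;

# The false value is an empty string that is also a clean numeric zero,
# so neither of these triggers a warning.
my $as_string = "[$no]";
my $as_number = $no + 0;
print "$as_string $as_number\n";
```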
"pthread_attr_setstacksize" failure
Jerry D. Hedden mentioned that he had received a couple of bug reports
about the new API for thread stack sizes, involving threads
allocating a stack size of exactly 2Mb. Jerry wondered whether this
was some sort of 32/64-bit conversion failure and was otherwise stuck
as to where to go from here.
2147483648 bytes and counting
http://xrl.us/kqpu
Replacing "S_new_HE" with "Perl_new_body"
Jim Cromie delivered an exploratory patch to simplify "HE" allocations
in hv.c, to see what effect it would have, and asked for comments.
Something for hash-heads
http://xrl.us/kqpv
Arena-based allocations for "op"s
Nicholas Clark delivered a long thoughtful analysis of Jim Cromie's
other patch that allocated ops from arenas, saying that the complexity
it adds probably outweighs the benefits.
But there is a way forward
http://xrl.us/kqpw
Redundant $Config{d_sitearch} paths
Gisle Aas noted that if you configure perl and set $sitearch to be the
same as $archlib then the same directory appears twice in the @INC
path, which is silly.
Gisle wanted to patch this, but wasn't sure whether he could dive
straight in, or whether it required hacking on "metaconfig".
To make matters worse, he found that "SITELIB_STEM" could add yet a
third copy of the same directory to @INC, and came up with one patch
to rule them all.
One is enough
http://xrl.us/kqpx
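The effect Gisle was after, each directory appearing once with the
search order preserved, can be illustrated at the Perl level. This is
only a sketch of the idea: his patch works on the build configuration,
and the helper name and paths below are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: drop repeated directories, keeping the first
# occurrence so the search order is unchanged.
sub unique_dirs {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up paths standing in for archlib, sitearch and a SITELIB_STEM entry.
my @configured = (
    '/opt/perl/lib/5.9.4/i686-linux',
    '/opt/perl/lib/5.9.4',
    '/opt/perl/lib/5.9.4/i686-linux',
    '/opt/perl/lib/5.9.4/i686-linux',
);

my @search_path = unique_dirs(@configured);
print "$_\n" for @search_path;
```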
A working "CLONE" for "Tie::RefHash"
Yuval Kogman patched "Tie::RefHash" so that it would work correctly
with threads. Rafael tidied it a bit as he put it into "blead".
It works!
http://xrl.us/kqpy
Multiple Perl version support quandary
John Peacock admitted to having been exceedingly naughty in releasing
new versions of "version" and not testing them on older perls. Some of
the code contained 5.6-isms (notably dealing with warnings), and he
wondered what to do to make it work again on 5.005.
John thought that the easiest way would be to fake up an *ersatz*
"warnings" module, but then, he wasn't sure how to pull off something
like "no warnings qw(redefine)".
Nick Ing-Simmons provided an elegantly devious lightweight solution
that pleased John no end. Yitzchak thought of a problem that John's
final implementation might have, and provided another improvement.
Revis(?:it)?ing compatibility
http://xrl.us/kqpz
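The summary does not describe Nick's solution, so the following is not
it. It is one possible sketch of an ersatz pragma: if the real module
cannot be loaded, install do-nothing "import" and "unimport" methods
and mark it as loaded. The helper name and the stand-in pragma name
"ersatzdemo" are hypothetical.

```perl
use strict;
use vars qw(%status);

# Hypothetical helper: if the named pragma cannot be loaded, install
# do-nothing import/unimport methods and mark it as loaded in %INC so
# that later "use NAME" / "no NAME" statements compile quietly.
sub stub_pragma_unless_available {
    my ($name) = @_;
    return 'real' if eval "require $name; 1";
    no strict 'refs';
    *{"${name}::import"}   = sub { };
    *{"${name}::unimport"} = sub { };
    $INC{"$name.pm"} = __FILE__;
    return 'stub';
}

BEGIN {
    # "warnings" exists on 5.6 and later; "ersatzdemo" is a made-up name
    # that stands in for "warnings" on a perl that lacks it.
    $status{$_} = stub_pragma_unless_available($_)
        for qw(warnings ersatzdemo);
}

no warnings qw(redefine);
no ersatzdemo qw(redefine);    # compiles thanks to the stub

print "$_: $status{$_}\n" for sort keys %status;
```

The stub only makes the statements compile; it does not emulate any
warning categories.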
Do "PERL_FLEXIBLE_EXCEPTIONS" work?
Nicholas was working (or not) with "PERL_FLEXIBLE_EXCEPTIONS" and came
to the conclusion that due to a compilation error, it didn't work,
couldn't work, and could never have worked in over six years and
twelve stable releases of Perl. Given that no bug reports have been
received to date, Nicholas concluded that they could be scrapped
without causing any harm.
Nick Ing-Simmons cautioned about being too hasty, saying that this
mechanism was there to allow Perl to be compiled natively by a C++
compiler, to map C's "longjmp" to C++'s "throw".
So Nicholas went off and tried to compile perl with a C++ compiler
("g++") on Linux and FreeBSD. Everything failed, usually due to
prototypes not being sufficiently precise (mainly "char *" *versus*
"const char *"). From which one may conclude that it may be possible to
compile Perl with a C++ compiler out of the box, but certainly not
with a common C++ compiler on two of the most common Unix platforms.
Probably not
http://xrl.us/kqp2
Making the "IO::Socket" tests pass on Win32
Yves Orton sent in a patch to get "IO::Socket" to run its test suite
correctly on a threaded Windows build. He had a look at "IO::Pipe",
but couldn't think of a sane enough approach to make it work.
Steve Hay couldn't get it to smoke cleanly on a non-threaded build,
and Yves asked for advice on a better indicator to decide whether or
not to skip some of the tests (that deal with "fork"ing).
Steve Hay showed how various configure-time switches can be combined
in different ways to make all sorts of threadish behaviour in Windows.
Andy Dougherty said that the right way of seeing whether "fork" was
implemented was to look at $Config{d_fork}. And if that gave bogus
results, well by golly it ought to be fixed up so that it gave a
useful result.
Yves thought that this was a marvellous idea... except that it didn't
solve the problems for all the perls out there in the field today.
http://xrl.us/kqp3
Yves followed up with a patch that fixed just about everything, which
was applied by Steve Hay.
Socket to me
http://xrl.us/kqp4
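Andy's suggestion amounts to consulting %Config before running the
"fork" tests. A minimal sketch follows; the second clause is an idiom
commonly seen for Windows builds that emulate "fork" with interpreter
threads, and its exact conditions should be treated as an assumption
rather than as what Yves's patch does.

```perl
use strict;
use warnings;
use Config;

# Real fork() is advertised by d_fork. The second clause covers Windows
# builds that emulate fork() with interpreter threads; treat its exact
# conditions as an assumption to verify.
my $can_fork = $Config{d_fork}
    || ( $^O eq 'MSWin32'
         && $Config{useithreads}
         && ($Config{ccflags} || '') =~ /-DPERL_IMPLICIT_SYS/ );

my $verdict = $can_fork ? 'run the fork tests' : 'skip the fork tests';
print "$verdict\n";
```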
Perl memory management and documentation
Tom Schindl thought that the documentation was unclear on the concept
of explicitly setting lexical variables to "undef" to release memory
inside a scope. Pointers were given to various book references, and
Yitzchak Scott-Thoennes summed up Perl's memory strategy nicely:
"allocate as early as possible and for as long as possible".
Sadahiro Tomoyuki observed that while there are techniques for
pre-extending arrays and hashes, no Perl-level technique is available
for scalars (although "SvGROW" can be used from XS code). This could
be useful for strings that are expected to become very large. He
suggested that making "length($string)" lvalue-able, as in
length($str) = 700000000
to tell perl to allocate that many bytes for a scalar could be very
helpful (notwithstanding the usual provisos about characters not equal
to bytes in Unicode strings).
Let my bytes go
http://xrl.us/kqp5
As usual, Dave Mitchell explained what was happening behind the scenes
in a clear and concise manner.
The algorithm
http://xrl.us/kqp6
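The existing techniques mentioned in this thread look like this in
practice. It is a small sketch with made-up variable names; the
lvalue "length" that Sadahiro proposed is not shown because it was
only a suggestion.

```perl
use strict;
use warnings;

# Pre-extend an array: assigning to $#array reserves the slots up front.
my @rows;
$#rows = 9_999;                     # room for 10,000 elements
my $reserved = scalar @rows;

# Pre-size a hash: assigning to keys() asks for at least that many buckets.
my %index;
keys(%index) = 1_024;
my $entries = scalar keys %index;   # still empty, only the buckets exist

# Explicitly releasing a large lexical before its scope ends.
my $released;
{
    my $buffer = 'x' x 1_000_000;
    undef $buffer;                  # hands the string buffer back to perl now
    $released = defined $buffer ? 0 : 1;
}

print "reserved=$reserved entries=$entries released=$released\n";
```

Memory released this way goes back to perl's allocator for reuse; it
is not necessarily returned to the operating system.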
How should "%^H" work with lexical pragmata
One of Robin Houston's many contributions to "blead" last year was a
somewhat arcane improvement that concerned "%^H" (the hints variable)
being made available to "eval" blocks.
Nicholas Clark used Robin's insight to help finish lexical pragmata,
and was running into conceptual difficulties over the state of the
contents of "%^H" at compile time and run time, and when to make it
readable or writable.
The fundamental problem that needs to be addressed is that the state
of the hints gets compiled into the op-tree, which means that changing
the setting of a hint has no effect (hence the "read-only" nature of
the beast) once the code has been compiled.
Rafael pointed out that that was just it: "%^H" and $^H are for
affecting compilation and, as they stand, are just not useful at run
time.
warnings.pm, which does have an effect at run time, uses its
own variable, a bitfield, which is stored in the op-tree in its own
right; that is how warnings can come and go during run time.
Nicholas had a look at the definition of "struct cop", as that seemed
to be the logical place from which to hang hints, but then realised
that since op-trees are shared between threads there's no sane way to
make it read-write as well (since that would mean all threads would
inherit the change of hinting in any thread).
Nicholas finished up doing an extreme programming number: writing the
test to prove that lexical pragmata work. And then subsequently
committed a patch that made the test succeed.
Then Hugo admitted to being slightly confused. That's good. I was
confused all along.
I'll give you a hint
http://xrl.us/kqp7
Continued in the new month
http://xrl.us/kqp8
Rafael Garcia-Suarez played around with the pragma stuff and came up
with a user-level pragma example. David Nicol and Nicholas played
around with that, and the feedback from the exercise resulted in a
couple of other code tweaks.
(In case you're wondering, a pragma is a module with a lower case name
that can be turned on and off through the code. Two prime examples are
"use strict"/"no strict" and "use warnings"/"no warnings").
Your very own pragma
http://xrl.us/kqp9
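For a feel of what a user-level pragma looks like, here is a minimal
sketch in the style later documented in perlpragma for released perls;
the details in blead at the time of this thread may have differed. The
pragma name "shouty" and the function "greet" are hypothetical. The
pragma records its state in "%^H" at compile time, and code reads it
back at run time through the hints hash returned by "caller".

```perl
use strict;
use warnings;

BEGIN {
    package shouty;                   # hypothetical pragma name
    $INC{'shouty.pm'} = __FILE__;     # pretend the module file was loaded

    # Called at compile time by "use shouty" / "no shouty"; the entry in
    # %^H is scoped to the block being compiled.
    sub import   { $^H{'shouty/on'} = 1 }
    sub unimport { $^H{'shouty/on'} = 0 }

    # Element 10 of caller() is the hints hash that was in force where
    # the call was compiled. Level 1 is whoever called our caller.
    sub in_effect {
        my $hints = (caller(1))[10];
        return $hints && $hints->{'shouty/on'} ? 1 : 0;
    }
}

# A function whose behaviour depends on the pragma state of its caller.
sub greet {
    my $text = 'hello';
    return shouty::in_effect() ? uc $text : $text;
}

my $plain = greet();                  # pragma not in effect here
my $loud;
{
    use shouty;
    $loud = greet();                  # in effect inside this block only
}
my $after = greet();                  # and off again afterwards

print "$plain $loud $after\n";
```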
At the same time, Rafael thought that it ought to be possible to make
encoding lexical as well, and set about trying to find out what was
still needed to get it to work. Nicholas and he thrashed out the
details.
http://xrl.us/kqqa
Combining UTF-16 output with :crlf is awkward
In a parallel universe (read: another mailing list), Jan Dubois
discovered that stacking the ":crlf" layer on top of a Unicode layer
causes "Wide character in print" warnings to be issued. The
work-around is to use the (in Jan's words) "non-intuitive"
":raw:encoding(UTF-16LE):crlf:utf8" layers together, to turn on the
"PERLIO_F_UTF8" bit in the ":crlf" layer.
Jan wondered whether it would be possible for "PerlIOCrlf_pushed()" to
inherit the flag from the previous layer, or whether "PerlIO_isutf8()"
should walk the layer stack in order to determine what it should do.
Nick Ing-Simmons preferred the first approach, going as far as saying
that that should actually be the default behaviour for a layer. The
second solution has the problem of a layer having to determine whether
some other arbitrary layer affects UTF-8 or not.
Layer upon layer upon layer
http://xrl.us/kqqb
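The work-around layer stack can be tried out as follows. This is a
sketch that writes one line to a temporary file and dumps the
resulting bytes; the sample text is arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Layers apply bottom to top: :raw clears the defaults, the encoding layer
# produces UTF-16LE bytes, :crlf above it rewrites "\n" as "\r\n" in terms
# of characters, and the trailing :utf8 marks the top layer as taking
# characters, which avoids the "Wide character" warning.
open my $out, '>:raw:encoding(UTF-16LE):crlf:utf8', $path
    or die "open $path: $!";
print {$out} "caf\x{e9} \x{263a}\n";
close $out or die "close $path: $!";

# Read the file back as plain bytes to see what was written.
open my $in, '<:raw', $path or die "open $path: $!";
my $bytes = do { local $/; <$in> };
close $in;

my $hex = join ' ', map { sprintf '%02x', ord } split //, $bytes;
print "$hex\n";
```

Because ":crlf" sits above the encoding layer, the carriage return is
inserted as a character and then encoded along with everything else,
so the line ending comes out as a proper two-unit UTF-16 sequence.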
Thread non-safety in "sv_setsv"
Nicholas was rather distressed to discover that a problem with
"sv_setsv_flags" may put an end to a workable copy-on-write scheme in
threaded builds. This came about from looking at the hints
implementation, and the fact that threads share op-trees.
Nice ASCII art, Nick
http://xrl.us/kqqc
Patches of Interest
"Devel::DProf" "const"ing
Andy Lester had a look at "Devel::DProf" to see about the bugs Jarkko
Hietaniemi raised a while back. He wasn't able to fix anything yet,
but did clean the code up somewhat, and added some lovely "const"s in
the process.
It's better than nothing
http://xrl.us/kqqd
Poisoning memory
Following on from John Malmberg's plea to have allocated and
deallocated memory filled with garbage values (and thus poisoned, to
cause errant dereferences to be noticed earlier), Jarkko added a patch
to do just that.
Now allocations can be initialised with 0xAB (also known as
*strawberry cyanide*) and freed memory can be overwritten with 0xEF
(or *blueberry lithium*). Andy wondered how we'd gone for so long
without it.
Two exciting new flavours!
http://xrl.us/kqqe
VMS pool corruption fix for "_NLA0:"
In turn, John delivered a patch to make "stat" work correctly on
"NLA0:", which is very important if you're doing VMS work.
http://xrl.us/kqqf
Long file path support for VMS
Having finished with the preliminaries, John then got down to business
with a patch to add long path support to all versions and platforms of
OpenVMS that support it.
http://xrl.us/kqqg
And rejigged the "stat" structure used when "largefile" support was
enabled.
http://xrl.us/kqqh
Tidying up regexec.c
Andy Lester set his sights on regexec.c, now that Dave has finished
with it, and zapped numerous unused macros, inlined a couple of small
static functions and sprinkled the magic wand of "const"ness over the
lot.
http://xrl.us/kqqi
Random accumulated patches from Andy
Andy then shipped out all his patches that had been piling up: consting
and "NULL" tweaks ("NULL" instead of 0 when dealing with pointers, and
removing casts on "NULL" assignments).
http://xrl.us/kqqj
and redid the "PERL_UNUSED_DECL" macro, eliminating a grumpy comment
at the same time. [News flash: this was eventually reverted, as there
is code Out There which relied on the previous behaviour].
http://xrl.us/kqqk
and removed some unnecessary pointer checks
http://xrl.us/kqqm
and found some more appropriate versions of the "SvREFCNT_inc" macro
to use.
http://xrl.us/kqqn
Add "V.pm" to the core
Abe Timmerman posted a patch to add V.pm, which was originally written
back in 2002 in answer to a question from Tels. John Peacock thought
that "Module::Info" would be more useful.
"Why?" said Profane. "Why not?" said Stencil.
http://xrl.us/kqqo
New and old bugs from RT
"$qr = qr/^a$/m; $x =~ $qr" fails (#3038)
Nicholas Clark beat everyone else in closing out this bug from 2004.
http://xrl.us/kqqp
He also fixed up the "hash assignment to a tied hash erroneously
stores data in the real hash too (#36267)" bug too.
http://xrl.us/kqqq
Perl segfaults; test case available (#32332)
http://xrl.us/kqqr
no "sendmsg"/"recvmsg" support (#38808)
Nicholas noted that neither the core, nor the "Socket" module provide
the "sendmsg" and "recvmsg" functions. Gisle Aas thought that "POSIX"
would be a suitable place in which to have them.
http://xrl.us/kqqs
Bad return value from a block with variable localization (#38809)
Vincent Pit filed a bug that showed some code using "if(@_)", "do" and
"return" picking up "undef" in an unexpected manner.
http://xrl.us/kqqt
Encoding error in UTF-8 locales (#38812)
Vincent Lefevre posted an encoding bug. Nicholas stripped down the
example code and highlighted the error. He wasn't sure whether it was
a problem of the documentation not being sufficiently clear, or the
core for not dealing with the issue adequately.
Maybe a bit of both
http://xrl.us/kqqu
"local $h{$unicode}" doesn't work (#38815)
Nicholas Clark noticed that "local $a{"\x{100}"} = 1" doesn't behave
correctly (the way a non-Unicode key like "local $a{"N"} = 1" does),
and promised to come up with a way to fix it, and did.
All part of a day's work
http://xrl.us/kqqv
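The expected behaviour, once fixed, is that a localized element with a
wide-character key disappears at scope exit exactly as an ASCII key
does. A small sketch, with an arbitrary hash name:

```perl
use strict;
use warnings;

our %h = ( plain => 'kept' );
my $wide = "\x{100}";          # a key that needs UTF-8 internally

my ($inside, $inside_ascii);
{
    local $h{$wide} = 1;       # temporary wide-character key
    local $h{N}     = 1;       # temporary ASCII key, for comparison
    $inside       = exists $h{$wide} ? 1 : 0;
    $inside_ascii = exists $h{N}     ? 1 : 0;
}
my $after       = exists $h{$wide} ? 1 : 0;
my $after_ascii = exists $h{N}     ? 1 : 0;

print "inside=$inside/$inside_ascii after=$after/$after_ascii\n";
```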
Segment fault when using "Sockets" (#38817)
http://xrl.us/kqqw
"use sort 'stable'" sorts backwards with perl5.9.3 (#38831)
Stefan Lidman discovered that stable sorting in "blead" sorts
descending instead of ascending by default. Rafael and Robin Houston
had it sorted out in a jiffy.
I hope they added a test case
http://xrl.us/kqqx
What Steve Peters did this week
After Dave Mitchell landed his impressive iterative pattern match
patch, Steve equally impressively trawled RT to resolve, like, a
jazillion bugs, each resulting in a new message to the list.
An interesting case of seeing how different people explain in their
own words what is in fact the same thing.
* Regexp causes "SIGSEGV" (stack overflow?) (#1760)
* Core dump using a Perl regular expression (#6844)
* Segmentation fault in "regmatch()" (#6987)
* Perl Segmentation Fault using "/((\w+ )+)/" on long strings
(#8685)
* Recently-introduced regex segfault (#8870)
* 5.6.0, 5.6.1, 5.8.0 regexp core on "([EMAIL PROTECTED]@|.)*" (#17611)
* perl 5.8.0 segfaults (#18489)
* perl "SIGSEGV" when applying regular expression to a long string
(#21298)
* Regexp segfault "--> ("X"x3529) =~ /( (?: \\. | [^\$] ){1,4000}
)/gx;" (#21333)
* Regexp segfault (#21922)
* Seg fault on long input to re (#21940)
* Segfault (deep recursion?) in regex match (#22051)
* Core dump on big regex (#23666)
* Segmentation fault caused by capturing regex (#24271)
* METABUG - regex stack overflow issues (#24274)
* Segmentation fault at "m//" regexp (#28999)
* Regexp "/^([^f]|f.)+/" Bus error (#31887)
* SEGV with complicated regexp and long string (#32041)
* Long strings causes segmentation fault (#32465)
* Regular expression segfaults perl (#32803)
* "SIGSEGV" in "S_regmatch" (#34349)
* Simple regexp causes segfault (#36020)
* Segfault in simple regular expression (#36999)
* Segfault when doing this regex (#38031)
* Segmentation fault for matching too long regexps (#38379)
* Silent self-termination of script using regex (#38470)
* Regexp Bus error (#38473)
* Perl Segfault in Regex Match (#38717)
When Dave said he thought his patch would allow a whole pile of bugs
to be closed out, he wasn't joking.
Steve attempted to close out "Regular expression causes segfault
(#36903)" but was having access permission problems in retrieving the
test code to be used for the bug. Milo Thurston provided another URL
to get at the code.
403 Forbidden
http://xrl.us/kqqy
Perl5 Bug Summary
1563 bugs (but wait until next time)
http://xrl.us/kqqz
Over here
http://rt.perl.org/rt3/NoAuth/perl5/Overview.html
New Core Modules
* "version" 0.59 by John Peacock,
http://xrl.us/kqq2
* "Module::Build" 0.27_10 by Ken Williams,
http://xrl.us/kqq3
* and "Time::Local" 1.12_01 from Dave Rolsky.
http://xrl.us/kqq4
In Brief
Alan Burlison forwarded a message about a new project that had been
formed to deal with programming language vulnerabilities.
http://xrl.us/kqq5
Hugo van der Sanden added a brief documentation patch to clarify the
fact that you cannot use "times()" to obtain the elapsed time consumed
by running child processes, only for finished processes that have been
"wait"ed upon.
http://xrl.us/kqq6
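A small sketch of the point being documented: the child fields of
"times()" only include a child once it has been waited for, which
"system" does before it returns. The helper name child_cpu and the
busy loop run in the child are arbitrary.

```perl
use strict;
use warnings;

# times() returns (user, system, child user, child system) CPU seconds.
sub child_cpu { my @t = times; return $t[2] + $t[3] }

my $before = child_cpu();

# system() runs the child and waits for it, so its CPU time has been
# collected by the time system() returns.
my @busy = ($^X, '-e', 'my $n = 0; $n += $_ for 1 .. 2_000_000;');
system(@busy) == 0 or die "child failed: $?";

my $after = child_cpu();
printf "child CPU before=%.2f after=%.2f\n", $before, $after;
```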
David Nicol proposed a "Tie::MaskedArray" technique for avoiding the
remotely tied global localized with a sigil exploit. I'm a little hazy
on the details of this particular exploit. David's technique proposes
to replace "local", more slowly, but also more safely.
http://xrl.us/kqq7
John L. Allen forwarded the current, best patch for "pow()" on AIX.
http://xrl.us/kqq8
Jim Cromie updated the documentation to make it more clear what
happens when one does something like "Configure -des
-DNoSuchConfigureFlag".
http://xrl.us/kqq9
Robin Barker found a bug in "Readonly" in 5.8.8 and so fixed the bug,
and added a test to t/op/tie.t to make sure the problem doesn't
return.
(I think I'm beginning to see a pattern here)
http://xrl.us/kqra
Paul Marquess provided a small patch to the zip test harness for
"IO::Compress::Zip".
http://xrl.us/kqrb
Sadahiro Tomoyuki looked at some of the recent changes to the source
and found some mismatched parameters and types. Nicholas updated
embed.fnc to take that into account.
http://xrl.us/kqrc
Andy had a go at linting the source with Sun Studio's lint, and found
lots of things that need to be looked at, and wondered whether there
were any other lint-like tools freely available that could be applied.
Jarkko mentioned FlexeLint (a.k.a Gimpel Lint), which is very nice,
but not free.
http://xrl.us/kqrd
Yves Orton found a tainting oddity that should possibly be documented.
Yitzchak thought that some patches that remove the surprising
behaviour would also be well received.
http://xrl.us/kqre
H.Merijn Brand backported all of the recent changes made by Nicholas in
"blead"'s Configure to that of "maint". At the end of the week he was
still busy filling in gaps in Porting/Glossary.
http://xrl.us/kqrf
Feedback from last week's summary
Craig Berry corrected my misreading of the "Module::Build" on VMS
thread, which is that the VMS port currently doesn't offer the list
form of piped "open".
http://xrl.us/kqrg
About this summary
This summary was written by David Landgren. I will be getting a
life^W^W^Wtaking a break next week so the next summary will be for the
fortnight 3-16 April.
If you want a bookmarklet approach to viewing bugs and change reports,
there are a couple of bookmarklets that you might find useful on my
page of Perl stuff:
http://www.landgren.net/perl/
Weekly summaries are published on http://use.perl.org/ and posted on a
mailing list, (subscription: [EMAIL PROTECTED]). The
archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
and comments are welcome.
If you found this summary useful or enjoyable, please consider
contributing to the Perl Foundation to help support the development of
Perl.
--
"It's overkill of course, but you can never have too much overkill."