This Week on perl5-porters - 27 March-2 April 2006
    Be the first kid on the block to have your very own pragma.

Topics of Interest
    [Part of last week's summary vanished into a worm-hole of the space-time continuum, and reappeared this week. The missing portion is reproduced below.]

  Fixing the "Your Makefile has been rebuilt" tedium
    Dave Mitchell grew tired of the fact that the tiniest change to the perl source causes all the extensions to be rebuilt as well. This is because they all depend on lib/Config.pm, which itself is rebuilt each time "miniperl" is rebuilt. So Dave changed things around so that it (and lib/Config_heavy.pl) is only updated if it actually changes during a rebuild.

    Nicholas thought that this might break parallel makes, and recalled that the "mv-if-diff" hack had been removed because it operated on a similar principle: it caused "make" to consider that the Unicode tables were perpetually out of date, which in turn caused "Encode" to be needlessly rebuilt many times over in a single run.

      Works on my box
      http://xrl.us/kqps

  Why "sv_mortalcopy(&sv_no)"?
    Nicholas Clark saw that the source code uses the idiom

        PUSHs(sv = sv_mortalcopy(&PL_sv_no));

    which dates back as far as 5.003, and wondered why the more reasonable construct

        PUSHs(sv = sv_newmortal());

    wasn't used instead. Dave Mitchell thought of a couple of reasons why, such as the "getpw*" functions returning empty scalars for unsupported features, but in the general case it was probably quite unnecessary. Yitzchak Scott-Thoennes realised that there was a subtle difference between the two constructs.

    This all came about because of change #27612, in which Nicholas changed when and where "sv_mortalcopy" was used in pp_sys.c.

      Mortal peril
      http://xrl.us/kqpt

  "pthread_attr_setstacksize" failure
    Jerry D. Hedden mentioned that he had received a couple of bug reports about the new API for thread stack sizes, involving threads allocating a stack size of exactly 2Mb. Jerry wondered whether this was some sort of 32/64-bit conversion failure, and was otherwise at a loss as to where to go from here.
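
    Jerry's suspicion of a 32/64-bit conversion problem can be probed with a small C sketch. This is only an illustration under stated assumptions, not code from the threads module: the helper names try_stack_size() and fits_in_int32() are hypothetical, and only the pthread_attr_* calls are real POSIX API. The first helper reports whether the system accepts a requested stack size, and the second checks whether a size would survive being stored in a signed 32-bit integer somewhere along the way.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: request a stack size on a fresh attribute object
 * and read back what was recorded. Returns 0 on success, otherwise the
 * error number returned by the failing pthread call. */
int try_stack_size(size_t wanted, size_t *actual)
{
    pthread_attr_t attr;
    int rc;

    assert(actual != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Fails with EINVAL if the size is below PTHREAD_STACK_MIN
     * or exceeds a system-imposed limit. */
    rc = pthread_attr_setstacksize(&attr, wanted);
    if (rc == 0)
        rc = pthread_attr_getstacksize(&attr, actual);

    pthread_attr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: true if the size can be held in a signed 32-bit
 * integer without overflow. */
int fits_in_int32(size_t n)
{
    return n <= (size_t)INT32_MAX;
}
```

    Calling try_stack_size() with the size from a bug report and inspecting the return code would show whether the platform itself rejects the request. fits_in_int32() returns false for 2147483648, so a value of that magnitude could not pass through a signed 32-bit variable intact; whether that is what happened in the reported cases is not established here.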

      2147483648 bytes and counting
      http://xrl.us/kqpu

  Replacing "S_new_HE" with "Perl_new_body"
    Jim Cromie delivered an exploratory patch to simplify "HE" allocations in hv.c, to see what effect it would have, and asked for comments.

      Something for hash-heads
      http://xrl.us/kqpv

  Arena-based allocations for "op"s
    Nicholas Clark delivered a long, thoughtful analysis of Jim Cromie's other patch, which allocated ops from arenas, saying that the complexity it adds probably outweighs the benefits.

      But there is a way forward
      http://xrl.us/kqpw

  Redundant $Config{d_sitearch} paths
    Gisle Aas noted that if you configure perl and set $sitearch to be the same as $archlib then the same directory appears twice in the @INC path, which is silly. Gisle wanted to patch this, but wasn't sure whether he could dive straight in, or whether it required hacking on "metaconfig". To make matters worse, he found that "SITELIB_STEM" could add yet a third copy of the same directory to @INC, and came up with one patch to rule them all.

      One is enough
      http://xrl.us/kqpx

  A working "CLONE" for "Tie::RefHash"
    Yuval Kogman patched "Tie::RefHash" so that it would work correctly with threads. Rafael tidied it a bit as he put it into "blead".

      It works!
      http://xrl.us/kqpy

  Multiple Perl version support quandary
    John Peacock admitted to having been exceedingly naughty in releasing new versions of "version" without testing them on older perls. Some of the code contained 5.6-isms (notably dealing with warnings), and he wondered what to do to make it work again on 5.005.

    John thought that the easiest way would be to fake up an *ersatz* "warnings" module, but then, he wasn't sure how to pull off something like "no warnings qw(redefine)". Nick Ing-Simmons provided an elegantly devious lightweight solution that pleased John no end. Yitzchak thought of a problem that John's final implementation might have, and provided another improvement.

      Revis(?:it)?ing compatibility
      http://xrl.us/kqpz

  Do "PERL_FLEXIBLE_EXCEPTIONS" work?
    Nicholas was working (or not) with "PERL_FLEXIBLE_EXCEPTIONS" and came to the conclusion that due to a compilation error, it didn't work, couldn't work, and could never have worked in over six years and twelve stable releases of Perl. Given that no bug reports have been received to date, Nicholas concluded that it could be scrapped without causing any harm.

    Nick Ing-Simmons cautioned about being too hasty, saying that this mechanism was there to allow Perl to be compiled natively by a C++ compiler, to map C's "longjmp" to C++'s "throw".

    So Nicholas went off and tried to compile perl with a C++ compiler ("g++") on Linux and FreeBSD. Everything failed, usually due to prototypes not being sufficiently precise (mainly "char *" *versus* "const char *"). From which one may conclude that it may be possible to compile Perl with a C++ compiler out of the box, but certainly not with a common C++ compiler on two of the most common Unix platforms.

      Probably not
      http://xrl.us/kqp2

  Making the "IO::Socket" tests pass on Win32
    Yves Orton sent in a patch to get "IO::Socket" to run its test suite correctly on a threaded Windows build. He had a look at "IO::Pipe", but couldn't think of a sane enough approach to make it work. Steve Hay couldn't get it to smoke cleanly on a non-threaded build, and Yves asked for advice on a better indicator to decide whether or not to skip some of the tests (those that deal with "fork"ing).

    Steve Hay showed how various configure-time switches can be combined in different ways to produce all sorts of threadish behaviour on Windows.

    Andy Dougherty said that the right way of seeing whether "fork" was implemented was to look at $Config{d_fork}. And if that gave bogus results, well by golly it ought to be fixed up so that it gave a useful result. Yves thought that this was a marvellous idea... except that it didn't solve the problems for all the perls out there in the field today.

      http://xrl.us/kqp3

    Yves followed up with a patch that fixed just about everything, which was applied by Steve Hay.

      Socket to me
      http://xrl.us/kqp4

  Perl memory management and documentation
    Tom Schindl thought that the documentation was unclear on the concept of explicitly setting lexical variables to "undef" to release memory inside a scope. Pointers were given to various book references, and Yitzchak Scott-Thoennes summed up Perl's memory strategy nicely: "allocate as early as possible and for as long as possible".

    Sadahiro Tomoyuki observed that while there are techniques for pre-extending arrays and hashes, no Perl-level technique is available for scalars (although "SvGROW" can be used from XS code). This could be useful for strings that are expected to become very large. He suggested that making "length($string)" lvalue-able, as in

        length($str) = 700000000

    to tell perl to allocate that many bytes for a scalar, could be very helpful (notwithstanding the usual provisos about characters not being equal to bytes in Unicode strings).

      Let my bytes go
      http://xrl.us/kqp5

    As usual, Dave Mitchell explained what was happening behind the scenes in a clear and concise manner.

      The algorithm
      http://xrl.us/kqp6

  How should "%^H" work with lexical pragmata
    One of Robin Houston's many contributions to "blead" last year was a somewhat arcane improvement that concerned "%^H" (the hints variable) being made available to "eval" blocks. Nicholas Clark used Robin's insight to help finish lexical pragmata, and was running into conceptual difficulties over the state of the contents of "%^H" at compile time and run time, and when to make it readable or writable.

    The fundamental problem that needs to be addressed is that the state of the hints gets compiled into the op-tree, which means that changing the setting of a hint has no effect once the code has been compiled (hence the "read-only" nature of the beast).

    Rafael pointed out that that was just it: "%^H" and $^H are for affecting compilation and, as they stand, are just not useful at run time. By contrast, warnings.pm, which does have an effect at run time, uses its own variable, a bitfield, which is stored in the op-tree in its own right. That is how warnings can come and go during run time.

    Nicholas had a look at the definition of "struct cop", as that seemed to be the logical place from which to hang hints, but then realised that since op-trees are shared between threads there's no sane way to make it read-write as well (since that would mean all threads would inherit a change of hinting made in any thread).

    Nicholas finished up doing an extreme programming number: writing the test to prove that lexical pragmata work, and then subsequently committing a patch that made the test succeed. Then Hugo admitted to being slightly confused. That's good. I was confused all along.

      I'll give you a hint
      http://xrl.us/kqp7

      Continued in the new month
      http://xrl.us/kqp8

    Rafael Garcia-Suarez played around with the pragma stuff and came up with a user-level pragma example. David Nicol and Nicholas played around with that, and the feedback from the exercise resulted in a couple of other code tweaks.

    (In case you're wondering, a pragma is a module with a lower case name that can be turned on and off through the code. Two prime examples are "use strict"/"no strict" and "use warnings"/"no warnings").

      Your very own pragma
      http://xrl.us/kqp9

    At the same time, Rafael thought that it ought to be possible to make encoding lexical as well, and set about trying to find out what was still needed to get it to work. Nicholas and he thrashed out the details.

      http://xrl.us/kqqa

  Combining UTF-16 output with :crlf is awkward
    In a parallel universe (read: another mailing list), Jan Dubois discovered that stacking the ":crlf" layer on top of a Unicode layer causes "Wide character in print" warnings to be issued.
    The work-around is to use the (in Jan's words) "non-intuitive" ":raw:encoding(UTF-16LE):crlf:utf8" layers together, to turn off the "PERLIO_F_UTF8" bit in the ":crlf" layer.

    Jan wondered whether it would be possible for "PerlIOCrlf_pushed()" to inherit the flag from the previous layer, or whether "PerlIO_isutf8()" should walk the layer stack in order to determine what it should do. Nick Ing-Simmons preferred the first approach, going as far as saying that that should actually be the default behaviour for a layer. The second solution has the problem of a layer having to determine whether some other arbitrary layer affects UTF-8 or not.

      Layer upon layer upon layer
      http://xrl.us/kqqb

  Thread non-safety in "sv_setsv"
    Nicholas was rather distressed to discover that a problem with "sv_setsv_flags" may put an end to a workable copy-on-write scheme in threaded builds. This came about from looking at the hints implementation, and the fact that threads share op-trees.

      Nice ASCII art, Nick
      http://xrl.us/kqqc

Patches of Interest
  "Devel::DProf" "const"ing
    Andy Lester had a look at "Devel::DProf" to see about the bugs Jarkko Hietaniemi raised a while back. He wasn't able to fix anything yet, but did clean the code up somewhat, and added some lovely "const"s in the process.

      It's better than nothing
      http://xrl.us/kqqd

  Poisoning memory
    Following on from John Malmberg's plea to have allocated and deallocated memory filled with garbage values (and thus poisoned, to cause errant dereferences to be noticed earlier), Jarkko added a patch to do just that. Now allocations can be initialised with 0xAB (also known as *strawberry cyanide*) and freed memory can be overwritten with 0xEF (or *blueberry lithium*). Andy wondered how we'd gone for so long without it.

      Two exciting new flavours!
      http://xrl.us/kqqe

  VMS pool corruption fix for "_NLA0:"
    In turn, John delivered a patch to make "stat" work correctly on "NLA0:", which is very important if you're doing VMS work.

      http://xrl.us/kqqf

  Long file path support for VMS
    Having finished with the preliminaries, John then got down to business with a patch to add long path support to all versions and platforms of OpenVMS that support them.

      http://xrl.us/kqqg

    He also rejigged the "stat" structure used when "largefile" support is enabled.

      http://xrl.us/kqqh

  Tidying up regexec.c
    Andy Lester set his sights on regexec.c, now that Dave has finished with it, and zapped numerous unused macros, inlined a couple of small static functions and sprinkled the magic wand of "const"ness over the lot.

      http://xrl.us/kqqi

  Random accumulated patches from Andy
    Andy then shipped out all the patches that had been piling up: consting and "NULL" tweaks ("NULL" instead of 0 when dealing with pointers, and removing casts on "NULL" assignments).

      http://xrl.us/kqqj

    He redid the "PERL_UNUSED_DECL" macro, eliminating a grumpy comment at the same time. [News flash: this was eventually reverted, as there is code Out There which relied on the previous behaviour].

      http://xrl.us/kqqk

    He removed some unnecessary pointer checks

      http://xrl.us/kqqm

    and found some more appropriate versions of the "SvREFCNT_inc" macro to use.

      http://xrl.us/kqqn

  Add "V.pm" to the core
    Abe Timmerman posted a patch to add V.pm, which was originally written back in 2002 in answer to a question from Tels. John Peacock thought that "Module::Info" would be more useful.

      "Why?" said Profane. "Why not?" said Stencil.
      http://xrl.us/kqqo

New and old bugs from RT
  "$qr = qr/^a$/m; $x =~ $qr" fails (#3038)
    Nicholas Clark beat everyone else in closing out this bug from 2004.

      http://xrl.us/kqqp

    He also fixed the "hash assignment to a tied hash erroneously stores data in the real hash too (#36267)" bug.

      http://xrl.us/kqqq

  Perl segfaults; test case available (#32332)
      http://xrl.us/kqqr

  no "sendmsg"/"recvmsg" support (#38808)
    Nicholas noted that neither the core nor the "Socket" module provides the "sendmsg" and "recvmsg" functions.
    Gisle Aas thought that "POSIX" would be a suitable place in which to have them.

      http://xrl.us/kqqs

  Bad return value from a block with variable localization (#38809)
    Vincent Pit filed a bug that showed some code using "if(@_)", "do" and "return" picking up "undef" in an unexpected manner.

      http://xrl.us/kqqt

  Encoding error in UTF-8 locales (#38812)
    Vincent Lefevre posted an encoding bug. Nicholas stripped down the example code and highlighted the error. He wasn't sure whether it was a problem of the documentation not being sufficiently clear, or of the core not dealing with the issue adequately.

      Maybe a bit of both
      http://xrl.us/kqqu

  "local $h{$unicode}" doesn't work (#38815)
    Nicholas Clark noticed that "local $a{"\x{100}"} = 1" doesn't behave correctly (the way a non-Unicode key like "local $a{"N"} = 1" does), promised to come up with a way to fix it, and did.

      All part of a day's work
      http://xrl.us/kqqv

  Segment fault when using "Sockets" (#38817)
      http://xrl.us/kqqw

  "use sort 'stable'" sorts backwards with perl5.9.3 (#38831)
    Stefan Lidman discovered that stable sorting in "blead" sorts descending instead of ascending by default. Rafael and Robin Houston had it sorted out in a jiffy.

      I hope they added a test case
      http://xrl.us/kqqx

What Steve Peters did this week
    After Dave Mitchell landed his impressive iterative pattern match patch, Steve equally impressively trawled RT to resolve, like, a jazillion bugs, each resulting in a new message to the list. It makes an interesting case study in how different people explain in their own words what is in fact the same thing.

      * Regexp causes "SIGSEGV" (stack overflow?)
        (#1760)
      * Core dump using a Perl regular expression (#6844)
      * Segmentation fault in "regmatch()" (#6987)
      * Perl Segmentation Fault using "/((\w+ )+)/" on long strings (#8685)
      * Recently-introduced regex segfault (#8870)
      * 5.6.0, 5.6.1, 5.8.0 regexp core on "([EMAIL PROTECTED]@|.)*" (#17611)
      * perl 5.8.0 segfaults (#18489)
      * perl "SIGSEGV" when applying regular expression to a long string (#21298)
      * Regexp segfault "--> ("X"x3529) =~ /( (?: \\. | [^\$] ){1,4000} )/gx;" (#21333)
      * Regexp segfault (#21922)
      * Seg fault on long input to re (#21940)
      * Segfault (deep recursion?) in regex match (#22051)
      * Core dump on big regex (#23666)
      * Segmentation fault caused by capturing regex (#24271)
      * METABUG - regex stack overflow issues (#24274)
      * Segmentation fault at "m//" regexp (#28999)
      * Regexp "/^([^f]|f.)+/" Bus error (#31887)
      * SEGV with complicated regexp and long string (#32041)
      * Long strings causes segmentation fault (#32465)
      * Regular expression segfaults perl (#32803)
      * "SIGSEGV" in "S_regmatch" (#34349)
      * Simple regexp causes segfault (#36020)
      * Segfault in simple regular expression (#36999)
      * Segfault when doing this regex (#38031)
      * Segmentation fault for matching too long regexps (#38379)
      * Silent self-termination of script using regex (#38470)
      * Regexp Bus error (#38473)
      * Perl Segfault in Regex Match (#38717)

    When Dave said he thought his patch would allow a whole pile of bugs to be closed out, he wasn't joking.

    Steve attempted to close out "Regular expression causes segfault (#36903)" but was having access permission problems in retrieving the test code to be used for the bug. Milo Thurston provided another URL to get at the code.

      403 Forbidden
      http://xrl.us/kqqy

Perl5 Bug Summary
      1563 bugs (but wait until next time)
      http://xrl.us/kqqz

      Over here
      http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

New Core Modules
      * "version" 0.59 by John Peacock, http://xrl.us/kqq2
      * "Module::Build" 0.27_10 by Ken Williams, http://xrl.us/kqq3
      * and "Time::Local" 1.12_01 from Dave Rolsky. http://xrl.us/kqq4

In Brief
    Alan Burlison forwarded a message about a new project that had been formed to deal with programming language vulnerabilities.

      http://xrl.us/kqq5

    Hugo van der Sanden added a brief documentation patch to clarify the fact that you cannot use "times()" to obtain the elapsed time consumed by running child processes, only for finished processes that have been "wait"ed upon.

      http://xrl.us/kqq6

    David Nicol proposed a "Tie::MaskedArray" technique for avoiding the remotely tied global localized with a sigil exploit. I'm a little hazy on the details of this particular exploit. David's technique proposes a replacement for "local" that is slower, but also safer.

      http://xrl.us/kqq7

    John L. Allen forwarded the current, best patch for "pow()" on AIX.

      http://xrl.us/kqq8

    Jim Cromie updated the documentation to make it clearer what happens when one does something like "Configure -des -DNoSuchConfigureFlag".

      http://xrl.us/kqq9

    Robin Barker found a bug in "Readonly" in 5.8.8, fixed it, and added a test to t/op/tie.t to make sure the problem doesn't return. (I think I'm beginning to see a pattern here)

      http://xrl.us/kqra

    Paul Marquess provided a small patch to the zip test harness for "IO::Compress::Zip".

      http://xrl.us/kqrb

    Sadahiro Tomoyuki looked at some of the recent changes to the source and found some mismatched parameters and types. Nicholas updated embed.fnc to take that into account.

      http://xrl.us/kqrc

    Andy had a go at linting the source with Sun Studio's lint, and found lots of things that need to be looked at, and wondered whether there were any other lint-like tools freely available that could be applied. Jarkko mentioned FlexeLint (a.k.a. Gimpel Lint), which is very nice, but not free.

      http://xrl.us/kqrd

    Yves Orton found a tainting oddity that should possibly be documented. Yitzchak thought that some patches that remove the surprising behaviour would also be well received.

      http://xrl.us/kqre

    H.Merijn Brand backported all of the recent changes made by Nicholas in "blead"'s Configure to that of "maint". At the end of the week he was still busy filling in gaps in Porting/Glossary.

      http://xrl.us/kqrf

Feedback from last week's summary
    Craig Berry corrected my misreading of the "Module::Build" on VMS thread, which is that the VMS port currently doesn't offer the list form of piped "open".

      http://xrl.us/kqrg

About this summary
    This summary was written by David Landgren. I will be getting a life^W^W^Wtaking a break next week so the next summary will be for the fortnight 3-16 April.

    If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:

      http://www.landgren.net/perl/

    Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: [EMAIL PROTECTED]). The archive is at http://dev.perl.org/perl5/list-summaries/. Corrections and comments are welcome.

    If you found this summary useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.

-- 
"It's overkill of course, but you can never have too much overkill."