This Week on perl5-porters - 15-21 May 2006

"Obviously, that's not supposed to happen. And just to make matters worse, it's deleted all the evidence" -- Andy Dougherty
Topics of Interest

The king is dead

After the 28220th change to the Perforce source repository, Nicholas Clark announced a snapshot for "maint", whose main feature is support for relocatable @INC paths. He mentioned that he had some 1200 patches queued up in his Inbox since October to examine for suitability for merging into "maint". This would take several weeks, followed by a few more weeks of release candidates, and then 5.8.9 would be released. When that day comes, Nicholas said he would step down as pumpking, and that Dave Mitchell has volunteered to take over.

Vive le pompe-roi http://xrl.us/mra5

Building DynaLoader deletes the source tree

Joshua ben Jore was rather alarmed to discover that a recent change caused "DynaLoader" to delete the source tree in the process of being built, which puts a definite clamp on trying to test things afterwards.

Dominic Dunlop suspected that something was amiss with Joshua's source tree, since other smoke reports at the same patch level were not showing anything out of the ordinary. Andy Dougherty isolated a couple of suspect passages in the configuration run that deserved further attention.

Andy's analysis was correct. Joshua found that on Solaris (the platform in question), "DynaLoader" builds correctly with no threads, or with threads and the "gcc" compiler mentioned explicitly. Configure a build with threads but let "Configure" figure out implicitly that "gcc" should be used... and making "DynaLoader" will delete the source tree. Nice party trick.

Andy then determined the exact chain of events, and offered a course of action to those of great Configure-fu to stop this from occurring in the future.

DynaLoader ate my homework http://xrl.us/mra6

All this made Sébastien Aperghis-Tramoni notice that the test suite lacks a specific test script for "DynaLoader", so he remedied the situation. A couple of Sébastien's tests were marked TODO, since "can_ok()" seemed to have a bit of trouble with "AutoLoader"'s autoloaded functions.
chromatic briefly explained how to fix it, but Rafael Garcia-Suarez wasn't sure whether he thought it was the right way, so chromatic elaborated on the concept, and afterwards Rafael did.

Schwern loses a nickel http://xrl.us/mra7

chromatic submitted a patch that fixed it all up. Rafael was about to commit it when he realised that the patch used "Scalar::Util"'s "blessed()" function, but in the context of building the core, it probably hasn't yet been, or may never be, built. So in the end the expedient measure of using "ref()" was adopted instead.

Easier than rearranging the build http://xrl.us/mra8

The question of "Scalar::Util" not being built in turn reminded Randy W. Sims that he had discovered that the latest version of Ubuntu Linux ships without the XS version of "Scalar::Util", which has the unfortunate side-effect of breaking "svk". Dave Rolsky thought that having an XS version and a pure-Perl version of the same module but with different feature sets was madness. The fact that "weaken()" only comes with the XS version is a pain.

Now you get it, then you don't http://xrl.us/mra9

The right hints for "Configure"

After having mulled over bug #39149, Dominic Dunlop thought that the message that "Configure" prints out to explain what hints to use was probably a bit confusing.

To confuse the summariser, H.Merijn Brand explained that he had a single "Policy.sh" file that he uses on all sorts of platforms, from HP-UX to AIX to Cygwin.

After a bout of archaeological prospecting, Dominic discovered one hints file, "greenhill.sh", that looks as if it is to be used in conjunction with another primary hints file, and commented that it is probably thoroughly unused as well.

This caused Andy Dougherty to reminisce about the old days. He also gave a clear explanation about the purpose, and usefulness, of hints.
They are a hack to give "Configure" a sharp poke in the eye to do something quick and dirty, and this saves you considerable time, since you don't have to delve into its guts to make it do the right thing in a nice cross-platform manner.

At the end of the day, a couple of documentation patches made "Configure"'s intent clearer.

http://xrl.us/mrba

Implementing improvements to improve implementations

Randal L. Schwartz thought he was confused about "Attribute::Handlers", when in fact he was confused by "CHECK" and "INIT" blocks not firing on "require" statements, and said that he thought the implementation had a couple of holes in its feature matrix. This led Nicholas to conclude, and it bears repeating in full here:

  I infer that this is because the people/organisations that need the functionality don't have the time/skills to provide the patch in house, and the people who do have the skills to create such a patch don't have the time or the personal need. This seems to be a general problem with Perl 5 development - there are a lot of firms using Perl to make money (that's fine - that's the idea) but no effective way of pooling resources from those firms back into supporting core development, with the upshot that core development and support is purely done by volunteers on a "best-effort" basis.

At least this time there was a bit more of a discussion. TPF got a mention, and there was a bit of grumbling about how Perl 6 seems to be grabbing the spotlight even though it's still just a research project, whereas Perl 5 is here and now, not dead, no, definitely alive and kicking.

And Merlyn, John Peacock and Joshua ben Jore discussed the problem of "require", "CHECK" and "INIT" blocks.

Need to get 5.10 out the door http://xrl.us/mrbb

documenting %^H and lexical pragmas

Rafael Garcia-Suarez had thought that the "%^H" section in "perlvar" would be a suitable place to deal with documenting the new user-level lexical pragmata.
Nicholas Clark looked at the existing text and concluded that the best thing to do would be to start again with a clean slate. Yitzchak Scott-Thoennes side-stepped the issue, and suggested that perlpragma.pod would be an even better place to document all this.

Since no-one else came up with anything suitable to get the ball rolling, Nicholas Clark landed a first cut.

use reason; http://xrl.us/mrbc

"perlapio" and "PerlIO_binmode()"

Matthew Byng-Maddick was having trouble marrying the output from a "truss"/"strace"-type program with "Devel::DProf". He wanted to be able to see exactly where system calls were coming from. His attempts to observe were interfering with what he was trying to measure. In the process of trying to get the thing to work in a reasonable manner he discovered some inconsistencies in the documentation and asked for advice.

Warnocked ye were, and Warnocked ye be http://xrl.us/mrbd

"Perl_PerlIO_context_layers()" and "PerlIO_apply_layers()"

In other "PerlIO" news, Yves Orton said that he was having trouble with building recent "blead"s, and poked and prodded at the code, and managed to get it into a reasonably sane state, albeit with some odd failures in the test suite. Rafael and Steve Hay twiddled a few dials on the big machine and eventually all the errors went away.

http://xrl.us/mrbe

Performance in regular expressions

H.Merijn Brand said that at the Dutch Perl Workshop, Juerd and he talked about the fact that "/[x]/" is not optimised to "/x/", but that sometimes the character class matches faster than the literal, which seems counter-intuitive.

Yves Orton explained why things were the way they were, and in this particular case, it was apparently blind luck as much as anything else. Yves was interested in adding a single character class to literal conversion in the compiler, since character classes cause the new trie code to be skipped, and that would give more patterns a chance to be trie'd.
Promise of a classless society http://xrl.us/mrbf

optimize "/[x]/" to "/x/"

So Yves figured out how to get the compiler to do just that and bundled it up into a shiny patch and tests, which were applied by Dave. The good thing about this is that it appears that there is now a third person, along with Dave and Hugo van der Sanden, who can do battle with the C code of the regexp engine... and emerge victorious.

Hairy C code 0, Yves 1 http://xrl.us/mrbg

Exploring "userelocatableinc"

Nicholas wrapped up the support for relocating @INC.

http://xrl.us/mrbh

Later on, Marcus Holland-Moritz discovered that $Config{startperl} is wrong if "userelocatableinc" is undefined. Nicholas was thrilled, as it meant that all seven people who had downloaded the latest "maint" snapshot (see above) had not tested it. But he fixed it anyway. H.Merijn wondered if it should be included in his smoke configuration.

http://xrl.us/mrbi

The continuing threads saga

David Nicol mapped out a mechanism for linked-list stacks and queues, in the context of last week's "delivering signals to threads" thread. This received no discussion, I think because the point that Dave was trying to make initially was that no one would want to have to have this sort of machinery in the first place.

http://xrl.us/mrbj

Jerry D. Hedden found time to craft a patch to bring "blead" up to threads version 1.28, and this was applied by Rafael.

http://xrl.us/mrbk

Jerry wondered if threads in "BEGIN" blocks were safe to use. The documentation says "<blink>Don't Do That</blink>", but apparently it seems to work just fine.

http://xrl.us/mrbm

Jerry then discovered why creating threads in "BEGIN" leads to "Attempt to free unreferenced scalar" warnings, and suggested a one-line fix that would solve the problem, but wanted to know whether this would produce any unwanted side effects, playing around with reference counting as it does.

Mmmm...
dunno http://xrl.us/mrbn

Since no one could think of any possible harm that Jerry's suggestion could cause, he crafted another patch to fix the problem, and Dave applied it.

Gentlemen, begin your threads http://xrl.us/mrbo

Jerry then finished up adding an explicit thread context mechanism, which Rafael also applied.

http://xrl.us/mrbp

Dual-lifed modules that give CPAN grief

Peter Scott remarked that "Devel::Peek"'s version number was higher in core than on CPAN, and that this caused problems when upgrading CPAN. Nicholas suggested Peter contact Ilya Zakharevich, the author, directly. Rafael wondered whether it made any sense to dual-life the module at all, since it tends to be tied quite intimately to the internals. Peter said that Ilya said that the problem was with CPAN.pm.

http://xrl.us/mrbq

Peter also found that "Data::Dumper", "Devel::DProf" and "Filter" do not configure themselves correctly, which causes them to be installed under "site_perl" instead of the core directories.

http://xrl.us/mrbr

Patches of Interest

"my_snprintf"

Following on from the discussion last week, where Nicholas Clark opined that it would be good to probe for variadic macro support, and use them if available, it just so happens that "Configure" was tweaked to do just that. So Jarkko Hietaniemi redid his patch to take this into account and threw in a number of safety checks at the same time.

Better and better http://xrl.us/mrbs

Strange encodings upsets "pp_chr"

The subject of this item should be in the past tense, since Sadahiro Tomoyuki worked on the matter, and sent in a patch to make "pp_chr" happy. As a bonus, associated test scripts were made EBCDIC-friendly. This in turn made Rafael happy.

http://xrl.us/mrbt

"sv_pos_b2u" dislikes the extended UTF-8

Tomoyuki also fixed up "sv_pos_b2u_forwards" to behave more responsibly in the face of characters residing in Perl's UTF-8 extension space (by avoiding an expensive function call merely to figure out a length).
He then noticed that "S_sv_pos_b2u_forwards" looks like it does the same thing as the public "Perl_utf8_length" function, and wondered if the latter should not be used instead.

Carrying on in this one-person thread, Tomoyuki decided the current approach was a complete mess (indeed the C comments scream out about needing to be fixed). So he fixed it.

But not yet applied http://xrl.us/mrbu

Andy Lester looked at "S_bytes_to_uni" and noticed that it could be made context-free, and tidied up an unused variable in "Perl_refcounted_he_fetch". Applied by Rafael.

http://xrl.us/mrbv

"S_reguni" should return its length

Elsewhere, Andy thought that it was rather silly of "S_reguni" to return its length via a pointer to an integer, and that returning the value on the stack would make the intent a lot clearer. Agreed to and applied by Rafael.

http://xrl.us/mrbw

Signature change of "SvVOK()"

John Peacock sat up in surprise after stumbling across a patch committed by Nicholas back in January that changed the signature of "SvVOK()". The idea was to change from returning 0 or 1, to 0 or "valid-pointer", which in turn cuts down on needless "mg_find" calls. As John has to mimic this behaviour in version.pm, he was hoping for a little moral support on the issue. Support was freely given, and appeared to consist of an inordinate number of tweaks to header files and "Devel::PPPort" to get just right.

Asleep at the wheel http://xrl.us/mrbx

No more "S_regoptail"

Andy Lester noticed that "S_regoptail" is called but once in regcomp.c, so he inlined it, which in turn meant that the code that called it was also able to be simplified further. Applied.

Cascading goodness http://xrl.us/mrby

Andy then undertook some refactoring of "reghops", but this was not applied, despite the fact that the patch featured genuine parameter "const"ing.
Not enough goodness http://xrl.us/mrbz

He finally attempted a "pp_sys" cleanup, but following the discovery that there are no tests in the test suite that actually exercise the code paths in question, Andy pulled it back onto the bench to take another look.

http://xrl.us/mrb2

After a revision, the second time around things looked much better.

http://xrl.us/mrb3

Jarkko was horrified when he realised that his recent "strlcat" work was bogus, goofy and overkill, although probably not exactly dangerous. Steve Peters admitted that some of the blame was his own.

http://xrl.us/mrb4

Watching the smoke signals

Smoke [5.8.8] 28211 FAIL(XM) MSWin32 WinXP/.Net SP2 (x86/2 cpu)

Something went wrong during configuration, so Nicholas fixed that. Other things were going wrong too, but appeared to fix themselves autonomously. Just one of those things, I guess.

http://xrl.us/mrb5

New and old bugs from RT

What Steve Peters did this week

Noted that the desire that "CGI" multipart should support nph parameters (#24542) had been met with CGI version 3.05.

http://xrl.us/mrb6

Realised that the fact that "submit()" of CGI.pm generates warning if "-sticky" used (#24760) was no longer true, at least as of CGI version 3.20.

http://xrl.us/mrb7

Pointed out that no longer does CGI.pm autoloading lose $@ (#30325), thereby closing a third CGI issue.

http://xrl.us/mrb8

Renamed a file because a test case name was too long (#38645), which should make Stratus VOS users happy.

Shorter is better http://xrl.us/mrb9

SEGV with complicated regexp and long string (#32041) was resolved by Dave Mitchell, who fixed up an integer overflow negative wrap-around bug.

http://xrl.us/mrca

Perl segfaults; test case available (#32332) was also resolved by Dave Mitchell, this time adding the required make-work code to keep reference counting happy.

http://xrl.us/mrcb

many threads leads to various crashes (#37652)

Jerry D.
Hedden remarked that the biggest problem with the example code in this bug report was that it spawned threads so fast and furiously that perl never had a chance to catch its breath and do the required housekeeping, so it was little wonder that it ran out of memory. Adding a brief "sleep" to the script seemed to help it considerably, but even then there's still a bit of a resource leak on Windows that will eventually take out the program, after some two million threads have been created.

Take a short nap http://xrl.us/mrcc

Problems building on Solaris 8 (#38664)

Andy Dougherty followed up on this bug, offering some tips on a healthy configuration specification.

Get it in writing http://xrl.us/mrcd

"SvPOK" breaks scalar magic in 5.8.x (#38707)

Dave Mitchell could not figure out how mere bit-testing macros could interfere with magic, and asked for more code, guessing that the problem was really elsewhere. Craig DeForest said he'd try and come up with a small test case.

http://xrl.us/mrce

Threads calling LWP causes exception (#38712)

Dave Mitchell suggested taking this up with the LWP team, since LWP isn't in the core.

Unsafe unless proven otherwise http://xrl.us/mrcf

Regexp optimizer loses its hopes too soon (#39096)

Dave Mitchell and Mike Guy followed up on this thread, which shows how two out of three seemingly identical regular expressions are dispatched by the engine with utmost speed, but the third gets dragged down into a mess of exponential back-tracking. It would appear that there is scope within the engine to identify the third expression as equivalent; however, Dave didn't wish to commit to a date as to when that might occur.

Nested parens bad, m'kay? http://xrl.us/mrcg

"sprintf" with UTF-8 format string and ISO-8859-1 variables redux (#39126)

Sadahiro Tomoyuki took a closer look at this problem. Firstly, he managed to produce a small test case that provoked the bug. Secondly, this allowed him to narrow the offending code down to a section in "Perl_sv_vcatpvfn".
Unfortunately, the solution wasn't obvious; it was apparently one more problem relating to the disconnect between bytes and characters. Fortunately, he was able to cook up an appropriate patch, and as an added bonus, provided a test that exercises the problem in both ASCII and EBCDIC character sets.

http://xrl.us/mrch

failure not always detected in "IPC::Open2::open2" (#39127)

A lengthy thread developed on this, as Steve Peters tried to explain how things work from Unix's and Perl's point of view and Vincent Lefevre tried to explain how things were not working from his point of view. At the end of the week, no agreement had been reached.

You just have to wait http://xrl.us/mrci

"h2ph" generates incorrect code for "#if defined A|| defined B" (#39130)

Rafael applied the suggested patch to "blead" and suggested that Nicholas do as much for "maint".

The thread then segued into the observation that you can actually stuff just about anything into a perl "AV" array slot. Jan Dubois confirmed that this was true, but worked only as long as you accessed the contents within XS. Try to do as much in Perl code and the hammer comes down, smashing your program into tiny pieces.

Just because you can, doesn't mean you can http://xrl.us/mrcj

Lots of warnings with "diagnostics" and ("warn" or "die") (#39141)

Fitz Elliott noted that a bare "warn "\n"" spews large amounts of "Use of uninitialized value in substitution" warnings, and suggested a fix. Dave Mitchell used a slightly different technique than Fitz's to patch diagnostics.pm.

You MUST believe the error message http://xrl.us/mrck

Unable to make Perl 5.8.8 on HP-UX 11.11 (#39143)

Jim Duffield continued to make little progress in getting 5.8.8 to work to his satisfaction on HP-UX. As the goal was to be able to use "perlcc", Joshua ben Jore suggested using "PAR" instead, which is probably the best solution.

http://xrl.us/mrcm

Win32, @_ and "fork" crashing in "dounwind" (#39145)

Brad Bowman showed that "sub { @_ = 3; fork ?
die 5 : die 6 }->(2)" gives Win32 considerable pain. Steve Hay was able to reproduce it on Win32 in "blead", but wondered if anyone in Unix-land was able to do the same.

It boils down to a problem with the way "fork" is emulated on Win32, through a lot of code that simply never gets exercised on Unix. Jan Dubois pointed to a little known "PERL_SYNC_FORK" trick that could be used to serialise the fork executions, although it probably hasn't been used in the past five years, and may have suffered bitrot.

Dave Mitchell took a wild shot in the dark, Steve Hay tried the suggestion, and as usual, Dave had called the play correctly.

http://xrl.us/mrcn

Perl 5.8.8 configure failure (#39149)

Scott McAskill was having trouble configuring Perl on an aging Tru64 machine. Andy Dougherty, despite knowing next to nothing about that platform, was nonetheless able to provide enough information to help Scott get up and running. Ideally there's something that should be tweaked in the hints file, but for the time being it looks like the problem was solved.

http://xrl.us/mrco

diagnostics.pm: "-traceonly" vs "-trace" (#39152)

Julian Mehnle was puzzled by a discrepancy in the documentation, and had to read the source to figure out what was really going on. He thought that the best thing to do was to correct the documentation, so that someone else would not fall into the same trap.

James Mastros suggested that the optimal solution would be to align the code with the documentation, in a way that was both backwards and forwards compatible. Fergal Daly admitted to being the guilty party responsible for the problem in the first place, and cooked up a patch that followed James's suggestion. Applied by Rafael.

http://xrl.us/mrcp

Segmentation fault on simple regexp with string larger than 29kB (#39167)

Krzysztof Leszczynski isolated an innocuous regular expression in the "YAML" distribution that blows the stack on a sufficiently long string.
Dave Mitchell and Dominic Dunlop explained the story of Perl's recursive-but-now-iterative regular expression engine.

One more reason http://xrl.us/mrcq

Do not recommend "Switch.pm" in "perlfaq" (#39170)

Slaven Rezic thought that the FAQ entry concerning how to write a "switch" statement à la C should not mention the "Switch" module (due to the weird syntax errors it can introduce into otherwise sane code, because of its source filter nature). He wanted to point out that in 5.10 one will be able to use the perl6-ish "given"/"when" construct.

Abigail thought it was pretty silly to recommend this latter point, since there is no firm date available as to when 5.10 will ship.

Hopefully sooner rather than later http://xrl.us/mrcr

Perl5 Bug Summary

http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

New Core Modules

* "IO::Compress::*" version 2.000_12 proposed by Paul Marquess and accepted by Steve Peters.
  f y cn rd ths, y nd t gt lf http://xrl.us/mrcs
* "version" version 0.60 from John Peacock syncs CPAN with "blead".
  http://xrl.us/mrct
  And gets it working even betterer than before.
  http://xrl.us/mrcu

In Brief

Nick Ing-Simmons provided a thoughtful follow-up to the question of whether a "FileHandle" is "IO::Seekable".

http://xrl.us/mrcv

Nicholas Clark confirmed, following on from the internal error in Bytecode.pm bug report (#39110), that "Bytecode" is indeed unsupported, since none of the (volunteer) core developers use this experimental module in the normal course of events. It is thus unlikely to receive any attention in the near future.

Any itchiness will remain unscratched http://xrl.us/mrcw

Jerry D. Hedden reported that he had been using the reordered "SV" flags for a few months now, with no ill effect.

But they ain't maint compatible http://xrl.us/mrcx

Joshua ben Jore landed a large set of shiny "B::Lint" changes, saying they were good enough for "blead".

Believed to be maint compatible http://xrl.us/mrcy

Dave Mitchell thought that change #28183 had broken 64-bit builds.
Jarkko Hietaniemi managed to flog off a patch on the cheap to fix it up, but the after sale service nearly drove him round the bend.

http://xrl.us/mrcz

Scott Carroll wanted to know more about "Storable"'s license and copyright status.

This program is free software http://xrl.us/mrc2

Jakob Bjeggaard had a question about "Data::Dumper" not dumping a blessed object correctly. Yves Orton explained that it cannot really hope to be able to dump an inside-out or an XS-defined object correctly. Such objects need to provide their own "freeze"/"thaw" methods to do this properly.

http://xrl.us/mrc3

The "Perforce" server downtime should always be arranged to coincide with London Perl Monger meetings.

http://xrl.us/mrc4

Dave Mitchell made "Devel::Peek" dump "LV"s and "GV"s, following on from the big "SV" internals restructuring a while back.

http://xrl.us/mrc5

He also saw that assigning whole (hash|array) to a tied (hash|array) doesn't mangle "SvTYPE", at least, not in "blead".

http://xrl.us/mrc6

And explained what exactly DEBUG_LEAKING_SCALARS does, and why you might want to use it.

http://xrl.us/mrc7

Daniel Frederick Crisman suggested a way to restructure the quote-like operators section in "perlop". The patch appeared to move *a lot* of stuff around, which may explain why people's eyes glazed over.

The curse of Warnock http://xrl.us/mrc8

Yves fiddled with win32/buildext.pl to handle inclusions and not just exclusions, in order to minimise the number of extensions that were built needlessly while he was performing open heart surgery on the core. He wasn't particularly insistent about having it applied, but Steve Peters did so anyway.

http://xrl.us/mrc9

Last week's summary

I got the part about chromatic's "sv_derived_from" blues wrong. It is code that calls "UNIVERSAL::isa()" and "UNIVERSAL::can()" directly as functions that breaks things.

http://xrl.us/mrda

About this summary

This summary was written by David Landgren.
If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:

http://www.landgren.net/perl/

Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: [EMAIL PROTECTED]). The archive is at http://dev.perl.org/perl5/list-summaries/. Corrections and comments are welcome.

If you found this summary useful or enjoyable, please consider offering Nicholas Clark a job. A nice one, with a swivel chair.

-- "It's overkill of course, but you can never have too much overkill."