This Week on perl5-porters - 30 March-5 April 2008

    The extent that map/grep go to to keep the calling overhead of the block is horrendous and getting that to work for reduce in "List::Util" was difficult. Doing it with multiple blocks is going to be potentially very difficult. -- Graham Barr, not exaggerating how hard it is to work on the parser and optree generator.

Topics of Interest

Dual-lifing "Pod::Html"

Steffen Müller gave David Landgren a commit bit last week to take over the maintenance of "Pod::Html". After looking around the blead directory tree, David wondered where the tests were. Jan Dubois pointed out their hiding place.

    mmm, hand-rolled test harnesses
    http://xrl.us/bi956

Lack of 5.10.x smoking

Dave Mitchell looked through the smoke results from March and saw fewer than half a dozen smokes for 5.10.1-tobe. This led him to ask if some of the regular smokers could schedule a smoke or two more regularly (especially after 5.8.9 is released).

Bram asked for some help on how to start smoking, such as which combinations are the most desirable to smoke. One important point to come out of the discussion was how useful "ccache" can be to cut down smoking time.

    chained smoking
    http://ccache.samba.org/
    http://xrl.us/bi958

Make built-in list functions continuous

Nicholas Clark noticed that one of the Google Summer of Code projects was to improve the performance of built-in functions, by getting them to skip the construction of intermediate lists. Nicholas was curious as to what was meant by this, since it isn't part of the current TODO list.

Wren Argetlahm replied that it is an optimisation known as "deforestation" in Haskell parlance, and comes into play when you have a series of chained maps or greps, and a pipeline of SVs between each step. The answer to this is to use continuous functions, which is just a fancy way of saying that they operate on input and output streams. Wren offered some rewriting strategies that he thought would speed things up.

It all began to fall apart when Nicholas explained that at no point during compilation is there a usable abstract syntax tree (or AST) that could serve as a basis for such manipulations, since the tokeniser and lexer emit what is more or less the final optree directly.
Some additional obligatory fixups are then performed on the tree, as well as some peep-hole optimisations, but both of these operations are hopelessly intertwined. A distinct, pluggable optimiser for Perl 5 remains an elusive dream.

It gets worse. Nicholas said it took him a full-time week's worth of work just to create opcode optimisations for "reverse sort @pig_pen" and "foreach (reverse @recusandae)". It took him a day or so to remove the "srefgen" and "ex-list" ops from the creation of arrayrefs (like "[1, 3, 7]") and hashrefs. He wasn't sure how long it took for Dave Mitchell to teach the optimiser to perform in-place sorts for "@schlip = sort @schlip", or for Yves Orton to achieve a faster "if (%hash) {...}", but these are the only known examples of optree optimisations in the past three years. Dave admitted that it was "quite hard".

Dave explained that naive approaches along the lines of "look for a long string of ops and replace them by a shorter string" are hard to do and very fragile: they are either easily broken, or they break other things. And Rafael chipped in to say that it is difficult to write regression tests for them to boot.

Nicholas thought a better approach would be to get "B::Generate" and co. into the state where one could write optree rewriters in Perl and start to explore where the real wins lie. And it just so happens that Steffen Müller has been playing around with "B" and "B::Utils" to manipulate the optree and was beginning to make progress towards doing just that.

The other alternative that Nicholas came up with was to investigate Larry Wall's MAD work, which purportedly allows one to recover the original source after compilation (although I believe no-one has actually managed to achieve this in the general case).
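
To make the deforestation idea concrete, here is a minimal sketch of what such a rewrite would amount to if done by hand. It is an illustration only: the variable names and the fused loop are assumptions made for this example, not code or a strategy taken from the thread. The chained form builds an intermediate list between "grep" and "map"; the fused form makes a single pass and never constructs it.

```perl
use strict;
use warnings;

my @input = (1 .. 10);

# Chained form: grep produces an intermediate list of odd numbers,
# which map then consumes to produce the squares.
my @chained = map { $_ * $_ } grep { $_ % 2 } @input;

# Hand-fused form (hypothetical rewrite): one pass over the input,
# with no intermediate list between the filter and the transform.
my @fused;
for my $n (@input) {
    next unless $n % 2;      # the grep condition
    push @fused, $n * $n;    # the map expression
}

print "@chained\n";
print "@fused\n";
```

Both forms produce the same list. The point of the discussion above is that perl has no convenient stage at which it could perform this kind of fusion automatically.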

    deforestation
    http://www.cse.unsw.edu.au/~dons/papers/CSL06.html

    for de trees
    http://xrl.us/bi96a

Expose "ptr-table" funcs, add "ptr-table-delete", and benchmark them

Jim Cromie wrote a patch to expose the underlying hashing mechanisms used by the internals, so that XS code could use them directly. He wasn't entirely convinced that it was wise to do so, but a factor of 5 speed-up was nothing to sneeze at. The fact that it might help "Devel::Size" caught Tels's attention, but he wasn't sure he understood what the patch offered.

    the street finds its own use for things
    http://xrl.us/bi96c

Stupid Transaction Idea?

Curtis "Ovid" Poe wanted to know if anyone had ever thought about using forks or threads to create a poor man's transactional memory. Robin Barker pointed to a talk made by Simon Wistow on the subject. Mark-Jason Dominus made the connection between this question and a thread from June 2006 regarding reversible debugging.

This revived the discussion about reversible debuggers and missile launches, until Abigail dragged things back on track, pointing out that rolling back transactions is a much simpler proposition than rolling back the universe. For instance, a call to "fire_missile()" appears to fire a missile, but in reality nothing is launched until the "commit()" is issued.

Paul Fenwick thought that if anyone was brave enough to pursue the idea, they could do worse than use a "Safe" compartment to ensure that no operations that could not be rolled back were performed.

    Simon says
    http://london.pm.org/lpw-2004/talks/simon_wistow-perl_voodoo.ppt

    the p5p thread
    http://xrl.us/bi96e

TodoTracker - get money for fixing TODO tests

Thomas Klausner and the Vienna.pm crew announced the grand opening of their TODO bounty hunter scheme, whereby people who write patches to solve TODO problems earn real money (that is, Euros).
What exactly counts as a TODO, and what it is worth, is still a work in progress; you can find out more about it on their wiki:

    http://socialtext.useperl.at/woc/index.cgi?todo_test_bounties

    make money fast
    http://xrl.us/bi96g

Leopard has more standard "/etc/passwd" files than previous

Back in October 2007, Rafael Garcia-Suarez committed change #32200 to resolve a problem on an older OS/X. In newer OS/X versions, a file crucial to the test suite, "nidump", is no longer available, and thus the test suite fails. Jan Dubois suggested that scraping the output of "dscl" might do the job instead. Unfortunately he lacked the tuits to do so. Nicholas Clark said that Jan should refile it as a bug report so that it isn't left behind.

    http://xrl.us/bi96i

Unicode 5.1.0

The latest Unicode specification was released by the Unicode Consortium. Of particular interest was the inclusion of an uppercase ß (eszett). Tels made a cogent argument for the gradual disappearance of such characters: they are really fiddly to text via SMS. In any event, Perl now does 5.1.0, which is going to simplify the task of people who wish to write domino servers (the game, not the Lotus kind).

    http://xrl.us/bi96k

TODO of the week

(here, this should be an easy one).

"perlmodlib.PL" rewrite

Currently perlmodlib.PL needs to be run from a source directory where perl has been built, or some modules won't be found, and others will be skipped. Make it run from a clean perl source tree (so it's reproducible).

Patches of Interest

Double magic with "substr"

Vincent Pit had been sufficiently annoyed by magic in "substr" being triggered twice, when once was enough, that he sat down and crafted an elegant patch to fix it up. He had a couple of doubts about how to deal with the API change. Nicholas explained how to resolve that by having the old implementation shuffle off to "mathoms.c", and writing a macro that exposes the old name in terms of the new.

    old functions never die
    http://xrl.us/bi96n

    they just mathom
    http://xrl.us/bi96p

Double magic with '\&$x'

In his continuing quest to rid the core of twice-invoked magic, Vincent also delivered a patch to fix up the magic associated with "\&$x". He knew there was another possibility of magic being triggered, but questioned the wisdom of invoking magic for something as tedious as creating an error message.

    a surfeit of magic
    http://xrl.us/bi96r

Make "PL_AMG_names" and "PL_AMG_namelens" static

Jan Dubois noticed that a couple of new symbols were being exported for 5.8.9-tobe. Since they really should be private, he made them static in blead. Steve Hay applied the patch, and tweaked regen.pl to get it to keep track of overload.c and overload.h.

Nicholas Clark thought that since 5.10 was out in the wild, it would not be possible to hide them, since someone might already have discovered a way of using them, and thus removing their public visibility would cause such code to break (or at least, become unlinkable).

    http://xrl.us/bi96t

perlfunc.pod: "atan2(0,0)" returns 0, not "undef"

Paul Fenwick noticed a small error in the documentation concerning "atan2(0,0)", as the result of those arguments is undefined. Paul felt that perl should return "undef", but in fact it returns 0.

Mark-Jason Dominus wondered if it would be better to have it throw an exception, like the logarithm of a negative number, or dividing by zero. Unfortunately that would be almost certain to break a lot of code in the wild. Paul felt that a warning would be sufficient, since people would be free to "use Fatal" and thus obtain an exception in due form.

Rafael Garcia-Suarez invited interested parties to look at the "atan2" manpage on FreeBSD, which put forward some reasons why returning 0 can make sense. Dave Mitchell then looked at the source and discovered that perl just returns whatever the underlying C library does.
Andy Dougherty investigated further and determined that some platforms do indeed return 0 (as dictated by the C89 standard) and some will also set "errno" to EDOM. Nicholas Clark was of the opinion that "CORE::atan2" should return 0, and that leaves "POSIX::atan2" free to call the underlying library.

    getting atan
    http://xrl.us/bi96v

New and old bugs from RT

possible fd bug in "PerlIOStdio_close" (#46173)

Late last year, Steve Peters outlined a scenario where "dup"ing a file descriptor during a "close" could cause a file descriptor to be leaked. Nicholas Clark admitted this week that since Nick Ing-Simmons's passing, probably no-one understood how "PerlIO" works deep down. In any event, he thought the code as it stood appeared to be sufficiently wrong to merit a fix.

This in turn reminded Craig Berry to ask why "PerlIOUnix_open" hard-wires the opened file to 0666 wide-open permissions, and wondered why the code didn't honour the current "umask" setting. Dave Mitchell explained that the kernel took care of that.

    http://xrl.us/bi96x

"[[:print:]]" *versus* "\p{Print}" (#49302)

Given that no-one had been able to reconcile the differences between these two syntaxes (for example, that the former fails to match some things that the latter does), Robin Barker chose to document the differences.

    if you can't beat 'em
    http://xrl.us/bi96z

"utf8::valid" rejects characters in "\x14_FFFF - \x1F_FFFF" (#51710)

Steve Peters wondered whether the patch included in bug #43294 would fix this problem. It didn't, but that left him asking why "\x14ffff" was considered to be a valid character. Chris Hall thought that it was, but "utf8::valid" was also happy with 0x000000 through 0x13ffff and 0x150000 through 0x7fffffff, which left him puzzled as to why "utf8::valid" was singling out the "0x14xxxx" range.

Chris wondered if the patch Steve was looking at was causing "utf8::valid" to reject both 'ill-formed' byte sequences and 'non-characters'.
Either way, it seemed to be sitting on the fence and not to have a clear purpose. After that I lost it a bit.

    we need a unicode-porters list
    http://xrl.us/bi963
    http://xrl.us/bi965

Segfault in "B::SVOP::sv" (#52284)

"Inferno" filed a bug report about code that actually works correctly on a threaded perl; only non-threaded perls have problems. Reini Urban thought that the best solution was for "B::Size" to die a quick, painless death, and to use "Devel::Size" instead, as it is so much nicer.

    bug in march, answer in april
    http://xrl.us/bi967
    http://xrl.us/bi969

Attempt to free temp prematurely (perl 5.8.8) (#52386)

Frank v Waveren reported a bug in 5.8.8 that Nicholas Clark determined had been fixed in 5.8.9-tobe, although he didn't know off-hand what change was responsible for the fix. Frank tracked it down via the git repository, and identified change #30166 as being the fix.

    http://xrl.us/bi97b

"lc"/"uc" have unexpected side effects inside for loop (#52412)

Mike Wver discovered that the following snippet

    my $foo = 'A';
    for my $bar (uc($foo)) {
        my $lower_bar = lc $bar;
        print "$foo $bar\n"; # $bar should still be 'A'
    }

prints "A a" instead of "A A". No-one knew why, but Abigail pointed out that it was fixed in 5.10.0.

    http://xrl.us/bi97d

"map" isn't context aware in some cases (#52452)

Stefan Wehinger wondered why slightly different nested map constructs use some, a lot, or all available memory. David Nicol made a decent stab at explaining it in terms of lists being reclaimed sufficiently early or not.

Nicholas Clark suggested that the desired behaviour described in the report can be achieved, along with a sane level of memory consumption, by rewriting the loops with "foreach" instead of "map".

    http://xrl.us/bi97f

Perl5 Bug Summary

1807 (+7 -3)

    http://xrl.us/bi97h
    http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

New Core Modules

Math::BigInt 1.88

Tels announced the release of a brand new Math::BigInt, along with an updated "bignum" pragma, "Math::BigInt::FastCalc" and "Math::BigRat".
This release closes out nearly all the existing bugs; only two remain, at the bottom of the barrel. In the meantime, Tels is sitting back and waiting to see what the CPAN Testers make of them.

    http://xrl.us/bi97j

In Brief

Tels wondered if Reini Urban had had time to check out his patch for "Devel::Size" and bleadperl, but Reini was moving house this week.

    http://xrl.us/bi97m

Robin Barker's verbosity tweaks to regen.pl and friends made it in.

    http://xrl.us/bi97o

Jan Dubois felt that "PL_bincompat_opt" should be exported on AIX and Windows. Steve Hay thought so too, but realised that Jan was really talking about "PL_bincompat_options". Applied.

    http://xrl.us/bi97q

Jarkko Hietaniemi got H.Merijn Brand to tweak Configure in order to align floating point policies of gcc and cc on Tru64.

    http://xrl.us/bi97s

Jan Dubois thought that change #23984 should be integrated into 5.8.x, as it gets "corelist" installed on Win32. Nicholas Clark said that it was already in, the reason being that it helps "perlbug" go about its business.

    http://xrl.us/bi97u

Andreas König warned that lib/CGI/t/upload_post_text.txt was checked in as binary and wanted to know if it should be changed. Rafael said that it was binary for a reason; it was in fact a GIF file.

    and patent-free
    http://xrl.us/bi97w

Jerry D. Hedden ran into trouble with the above file, and Nicholas Clark straightened things out.

    all packed up
    http://xrl.us/bi97y

Paul Fenwick issued an RFC for "Fatal"/"autodie" exception handling naming and structures.

    http://xrl.us/bi972

Last week's summary

Tels clarified a point regarding the use of POD for wiki markup, explaining that his MediaWiki-Pod distribution on CPAN was a subclass of "Pod::Simple::HTML" that fixes up a lot of the problems that people encounter when using "Pod::Simple::HTML".

    This Week on perl5-porters - 23-29 March 2008
    http://xrl.us/bi974

About this summary

This summary was written by David Landgren.
Weekly summaries are published on http://use.perl.org/ and posted on a mailing list (subscription: [EMAIL PROTECTED]). The archive is at http://dev.perl.org/perl5/list-summaries/. Corrections and comments are welcome.

If you found this summary useful, please consider contributing to the Perl Foundation or attending a YAPC to help support the development of Perl.