This Week on perl5-porters - 30 March-5 April 2008

David Landgren Fri, 11 Apr 2008 15:42:18 -0700

  The extent that map/grep go to to keep the calling overhead of the
  block is horrendous and getting that to work for reduce in
  "List::Util" was difficult. Doing it with multiple blocks is going to
  be potentially very difficult. -- Graham Barr, not exaggerating how
  hard it is to work on the parser and optree generator.


Topics of Interest

Dual-lifing "Pod::Html"

  Steffen Müller gave David Landgren a commit bit last week to take over
  the maintenance of "Pod::Html". After looking around the blead
  directory tree, David wondered where the tests were. Jan Dubois
  pointed out their hiding place.

    mmm, hand-rolled test harnesses
    http://xrl.us/bi956

Lack of 5.10.x smoking

  Dave Mitchell looked through the smoke results from March and saw fewer
  than half a dozen smokes for 5.10.1-tobe. This led him to ask if some
  of the regular smokers could schedule a smoke or two on a more regular
  frequency (especially after 5.8.9 is released).

  Bram asked for some help on how to start smoking, such as what the
  most desirable combinations are for smoking. One important point to
  come out of the discussion was how useful "ccache" can be to cut down
  smoking time.

    chained smoking
    http://ccache.samba.org/
    http://xrl.us/bi958

Make built-in list functions continuous

  Nicholas Clark noticed that one of the Google Summer of Code projects
  was to improve the performance of built-in functions, by getting them
  to skip the construction of intermediate lists. Nicholas was curious
  as to what was meant by this, since it isn't part of the current TODO
  list.

  Wren Argetlahm replied that it is an optimisation known as
  "deforestation" in Haskell parlance, and comes into play when you have
  a series of chained maps or greps, and a pipeline of SVs between each
  step. The answer to this is to use continuous functions, which is just
  a fancy way of saying that they operate on input and output streams.
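
  As a rough illustration of the idea (not code from the thread), the
  sketch below contrasts a chained grep/map, which passes an
  intermediate list from one step to the next, with a hand-fused loop
  that handles each element in a single pass. The variable names are
  made up for the example.

```perl
use strict;
use warnings;

my @input = (1 .. 10);

# Chained form: grep produces an intermediate list that map then consumes.
my @chained = map { $_ * 2 } grep { $_ % 2 } @input;

# Fused form: one pass over the input, no intermediate list between steps.
my @fused;
for my $n (@input) {
    next unless $n % 2;      # the grep step
    push @fused, $n * 2;     # the map step
}

print "@chained\n";
print "@fused\n";
```

  Both forms yield the same list; the optimisation under discussion
  would have perl perform this kind of fusion automatically.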

  Wren offered some rewriting strategies that he thought would speed
  things up. It all began to fall apart when Nicholas explained that
  during compilation there was never at any point a usable abstract
  syntax tree (or AST) that could be used as a basis for such
  manipulations, since the tokeniser and lexer emit what is more or less
  the final optree directly. Some additional obligatory fixups are then
  performed on the tree, as well as some peep-hole optimisations, but
  both of these operations are hopelessly intertwined. A distinct,
  pluggable optimiser for Perl 5 remains an elusive dream.

  It gets worse. Nicholas said it took him a full-time week's worth of
  work, just to create opcode optimisations for "reverse sort @pig_pen"
  and "foreach (reverse @recusandae)". It took him a day or so to remove
  the "srefgen" and "ex-list" ops from the creation of arrayrefs (like
  "[1, 3, 7]") and hashrefs.

  He wasn't sure how long it took for Dave Mitchell to teach the
  optimiser to perform in-place sorts for "@schlip = sort @schlip", or
  for Yves Orton to achieve a faster "if (%hash) {...}", but these are
  the only known examples of optree optimisations in the past three
  years. Dave admitted that it was "quite hard".

  Dave explained that the naive approach of "look for a long string of
  ops and replace them by a shorter string" is hard to implement and
  very fragile: such optimisations are either easily broken, or they
  break other things.
  And Rafael chipped in to say that it is difficult to write regression
  tests for them to boot.

  Nicholas thought a better approach would be to get "B::Generate" and
  co. into the state where one could write optree rewriters in Perl
  and start to explore where the real wins lie. And it just so happens
  that Steffen Müller has been playing around with "B" and "B::Utils" to
  manipulate the optree and was beginning to make progress towards doing
  just that.

  The other alternative that Nicholas came up with was to investigate
  Larry Wall's MAD work, which purportedly allows one to recover the
  original source after compilation (although I believe no-one has
  actually managed to achieve this in the general case).

    deforestation
    http://www.cse.unsw.edu.au/~dons/papers/CSL06.html

    for de trees
    http://xrl.us/bi96a

Expose "ptr-table" funcs, add "ptr-table-delete", and benchmark them

  Jim Cromie wrote a patch to expose the underlying hashing mechanisms
  used by the internals, so that XS code could use it directly. He
  wasn't entirely convinced that it was wise to do so, but a factor of 5
  speed-up was nothing to sneeze at. The fact that it might help
  "Devel::Size" caught Tels's attention, but he wasn't sure he
  understood what the patch offered.

    the street finds its own use for things
    http://xrl.us/bi96c

Stupid Transaction Idea?

  Curtis "Ovid" Poe wanted to know if anyone had ever thought about
  using forks or threads to create a poor man's transactional memory.
  Robin Barker pointed to a talk made by Simon Wistow on the subject.

  Mark-Jason Dominus made the connection between this question and a
  thread from June 2006 regarding reversible debugging. This revived the
  discussion about reversible debuggers and missile launches, until
  Abigail dragged things back on track, pointing out that rolling back
  transactions is a much simpler proposition than rolling the universe.
  For instance, a "fire_missile()" appears really to fire a missile,
  except that in reality it doesn't, not until the "commit()" is issued.
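
  A minimal sketch of that idea, assuming side effects can be queued as
  closures: "fire_missile()" only records the intent, and nothing
  irreversible happens until "commit()" runs. The "rollback()" helper
  and the "@pending" and "@launched" arrays are hypothetical names
  invented for this illustration.

```perl
use strict;
use warnings;

my @pending;    # side effects queued by the current transaction
my @launched;   # stands in for the real, irreversible world

# Appears to fire a missile, but only records the intent.
sub fire_missile {
    my ($target) = @_;
    push @pending, sub { push @launched, $target };
    return 1;
}

# Run every queued side effect, emptying the queue as we go.
sub commit { $_->() for splice @pending; return }

# Discard the queued side effects without running them.
sub rollback { @pending = (); return }

fire_missile('alpha');
rollback();              # nothing was really launched
fire_missile('beta');
commit();                # only now does the side effect happen

print "launched: @launched\n";
```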

  Paul Fenwick thought that if anyone was brave enough to pursue the
  idea, they could do worse than use a "Safe" compartment to ensure that
  no operations that could not be rolled back were performed.

    Simon says
    http://london.pm.org/lpw-2004/talks/simon_wistow-perl_voodoo.ppt

    the p5p thread
    http://xrl.us/bi96e

TodoTracker - get money for fixing TODO tests

  Thomas Klausner and the Vienna.pm crew announced the grand opening of
  their TODO bounty hunter scheme, whereby people who write patches to
  solve TODO problems earn real money (that is, Euros).

  What exactly counts as a TODO, and what it is worth, is still a work
  in progress, and you can find out more about it on their wiki:

    http://socialtext.useperl.at/woc/index.cgi?todo_test_bounties

    make money fast
    http://xrl.us/bi96g

Leopard has more standard "/etc/passwd" files than previous

  Back in October 2007, Rafael Garcia-Suarez committed change #32200 to
  resolve a problem on an older OS/X. In newer OS/X versions, a file
  crucial to the test suite, "nidump", is no longer available, and thus
  the test suite fails. Jan Dubois suggested that scraping the output of
  "dscl" might do the job instead. Unfortunately he lacked the tuits to
  do so.

  Nicholas Clark said that Jan should refile it as a bug report so that
  it isn't left behind.

    http://xrl.us/bi96i

Unicode 5.1.0

  The latest Unicode specification was released by the Unicode
  Consortium. Of particular interest was the inclusion of uppercase ß
  (eszett). Tels made
  a cogent argument for the gradual disappearance of such characters:
  they are really fiddly to text via SMS. In any event, Perl now does
  5.1.0, which is going to simplify the task of people who wish to write
  domino servers (the game, not the Lotus kind).

    http://xrl.us/bi96k

TODO of the week

  (here, this should be an easy one).

"perlmodlib.PL" rewrite

  Currently perlmodlib.PL needs to be run from a source directory where
  perl has been built, or some modules won't be found, and others will
  be skipped. Make it run from a clean perl source tree (so it's
  reproducible).

Patches of Interest

Double magic with "substr"

  Vincent Pit had been sufficiently annoyed by magic in "substr" being
  triggered twice, when once was enough, that he sat down and crafted an
  elegant patch to fix it up. He had a couple of doubts about how to
  deal with the API change.

  Nicholas explained how to resolve that by having the old
  implementation shuffle off to "mathoms.c", and writing a macro that
  exposes the old name in terms of the new.

    old functions never die
    http://xrl.us/bi96n

    they just mathom
    http://xrl.us/bi96p

Double magic with '\&$x'

  In his continuing quest to rid the core of twice-invoked magic,
  Vincent also delivered a patch to fix up the magic associated with
  "\&$x". He knew there was another possibility of magic being
  triggered, but questioned the wisdom of invoking magic for something
  as tedious as creating an error message.

    a surfeit of magic
    http://xrl.us/bi96r

Make "PL_AMG_names" and "PL_AMG_namelens" static

  Jan Dubois noticed that a couple of new symbols were being exported
  for 5.8.9-tobe. Since they really should be private, he made them
  static in blead. Steve Hay applied the patch, and tweaked regen.pl to
  get it to keep track of overload.c and overload.h.

  Nicholas Clark thought that since 5.10 was out in the wild, it would
  not be possible to hide them, since someone might already have
  discovered a way of using them, and thus removing their public
  visibility would cause such code to break (or at least, become
  unlinkable).

    http://xrl.us/bi96t

perlfunc.pod: "atan2(0,0)" returns 0, not "undef"

  Paul Fenwick noticed a small error in the documentation concerning
  "atan2(0,0)", as the result of those arguments is undefined. Paul felt
  that perl should return "undef", but in fact it returns 0.

  Mark-Jason Dominus wondered if it would be better to have it throw an
  exception, like the logarithm of a negative number, or dividing by
  zero. Unfortunately that would be almost certain to break a lot of
  code in the wild. Paul felt that a warning would be sufficient, since
  people would be free to "use Fatal" and thus obtain an exception in
  due form.

  Rafael Garcia-Suarez invited interested parties to look at the "atan2"
  manpage on FreeBSD, which put forward some reasons why returning 0 can
  make sense.

  Dave Mitchell then looked at the source and discovered that perl just
  returns whatever the underlying C library does. Andy Dougherty
  investigated further and determined that some platforms do indeed
  return 0 (as dictated by the C89 standard) and some will also set
  "errno" to EDOM.

  Nicholas Clark was of the opinion that "CORE::atan2" should return 0,
  and that leaves "POSIX::atan2" free to call the underlying library.
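
  The behaviour is easy to probe. The sketch below prints whatever the
  local perl returns for "atan2(0, 0)" (the value depends on the
  platform, as noted above), and adds a hypothetical "strict_atan2"
  wrapper, not part of any proposal in the thread, for callers who
  would rather have an exception.

```perl
use strict;
use warnings;

# With both arguments zero, perl hands back what the C library returns.
my $r = atan2(0, 0);
printf "atan2(0, 0) = %s\n", defined $r ? $r : 'undef';

# Hypothetical wrapper that dies instead of returning a value.
sub strict_atan2 {
    my ($y, $x) = @_;
    die "atan2(0, 0) is undefined\n" if $y == 0 && $x == 0;
    return atan2($y, $x);
}

my $ok = eval { strict_atan2(0, 0); 1 };
print "strict_atan2 died: $@" unless $ok;
```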

    getting atan
    http://xrl.us/bi96v

New and old bugs from RT

possible fd bug in "PerlIOStdio_close" (#46173)

  Late last year, Steve Peters outlined a scenario where "dup"ing a file
  descriptor during a "close" could cause a file descriptor to be
  leaked.

  Nicholas Clark admitted this week that since Nick Ing-Simmons's
  passing, probably no-one understood how "PerlIO" works deep down. In
  any event, he thought the code as it stood appeared to be sufficiently
  wrong to merit a fix.

  This in turn reminded Craig Berry to ask why "PerlIOUnix_open"
  hard-wires the opened file to 0666 wide-open permissions, and wondered
  why the code didn't honour the current "umask" setting. Dave Mitchell
  explained that the kernel took care of that.
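
  Dave's point can be checked from Perl itself. This sketch, which
  assumes a Unix-like system, requests mode 0666 through "sysopen"
  while a umask of 022 is in force, then inspects the mode that the
  file actually received; under those assumptions it should report
  0644. The temporary directory and file name are arbitrary.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.txt";

my $old = umask 022;    # a typical umask; restored below
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    or die "sysopen failed: $!";
close $fh;
umask $old;

# Keep only the permission bits of the mode reported by stat.
my $mode = (stat $path)[2] & 07777;
printf "requested 0666, got %04o\n", $mode;
```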

    http://xrl.us/bi96x

"[[:print:]]" *versus* "\p{Print}" (#49302)

  Given that no-one had been able to reconcile the differences between
  these two syntaxes (for example, that the former fails to match some
  things that the latter does), Robin Barker chose to document the
  differences.

    if you can't beat 'em
    http://xrl.us/bi96z

"utf8::valid" rejects characters in "\x14_FFFF - \x1F_FFFF" (#51710)

  Steve Peters wondered whether the patch included in bug #43294 would
  fix this problem. Which it didn't, but that left him asking why
  "\x14ffff" was considered to be a valid character.

  Chris Hall thought that it was, but "utf8::valid" was also happy with
  0x000000 through 0x13ffff and 0x150000 through 0x7fffffff, which left
  him puzzled as to why "utf8::valid" was singling out the "0x14xxxx"
  range.

  Chris wondered if the patch Steve was looking at was causing
  "utf8::valid" to reject both 'ill-formed' byte sequences and
  'non-characters'. Either way, it seemed to be sitting on the fence and
  not to have a clear purpose.

  After that I lost it a bit.

    we need a unicode-porters list
    http://xrl.us/bi963
    http://xrl.us/bi965

Segfault in "B::SVOP::sv" (#52284)

  "Inferno" filed a bug report for code that works correctly on a
  threaded perl; only non-threaded perls have problems. Reini Urban
  thought that
  the best solution was for "B::Size" to die a quick, painless death,
  and to use "Devel::Size" instead, as it is so much nicer.

    bug in march, answer in april
    http://xrl.us/bi967
    http://xrl.us/bi969

Attempt to free temp prematurely (perl 5.8.8) (#52386)

  Frank v Waveren reported a bug in 5.8.8 that Nicholas Clark determined
  had been fixed in 5.8.9-tobe, although he didn't know off-hand what
  change was responsible for the fix. Frank tracked it down via the git
  repository, and identified change #30166 as being the fix.

    http://xrl.us/bi97b

"lc"/"uc" have unexpected side effects inside for loop (#52412)

  Mike Wver discovered that the following snippet

    my $foo = 'A';
    for my $bar (uc($foo)) {
      my $lower_bar = lc $bar;
      print "$foo $bar\n"; # $bar should still be 'A'
    }

  prints "A a" instead of "A A". No-one knew why, but Abigail pointed
  out that it was fixed in 5.10.0.

    http://xrl.us/bi97d

"map" isn't context aware in some cases (#52452)

  Stefan Wehinger wondered why slightly different nested map constructs
  use some, a lot, or all available memory. David Nicol made a decent
  stab at explaining it in terms of lists being reclaimed sufficiently
  early or not.

  Nicholas Clark suggested that the desired behaviour described in the
  report can be achieved, along with a sane level of memory consumption,
  by rewriting the loops with "foreach" instead of "map".
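
  The code from the bug report is not reproduced here; the following is
  only a generic sketch of the kind of rewrite Nicholas describes.
  Nested maps build the complete list of results before anything is
  done with it, whereas nested "foreach" loops can consume each value
  as it is produced. The data and variable names are invented.

```perl
use strict;
use warnings;

my @rows = (1 .. 3);
my @cols = (1 .. 4);

# Nested map: the whole list of products exists before it is summed.
my @products = map { my $r = $_; map { $r * $_ } @cols } @rows;
my $sum_map = 0;
$sum_map += $_ for @products;

# Nested foreach: each product is added as soon as it is computed.
my $sum_loop = 0;
foreach my $r (@rows) {
    foreach my $c (@cols) {
        $sum_loop += $r * $c;
    }
}

print "$sum_map $sum_loop\n";
```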

    http://xrl.us/bi97f

Perl5 Bug Summary

    1807 (+7 -3)
    http://xrl.us/bi97h
    http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

New Core Modules

  Math::BigInt 1.88
      Tels announced the release of a brand new Math::BigInt, along with
      an updated "bignum" pragma, "Math::BigInt::FastCalc" and
      "Math::BigRat". This release closes out nearly all the existing
      bugs; only two remain, at the bottom of the barrel. In the
      meantime, Tels is sitting back and waiting to see what the CPAN
      Testers make of them.

        http://xrl.us/bi97j

In Brief

  Tels wondered if Reini Urban had had time to check out his patch for
  "Devel::Size" and bleadperl, but Reini was moving house this week.

    http://xrl.us/bi97m

  Robin Barker's verbosity tweaks to regen.pl and friends made it in.

    http://xrl.us/bi97o

  Jan Dubois felt that "PL_bincompat_opt" should be exported on AIX and
  Windows. Steve Hay thought so too, but realised that Jan was really
  talking about "PL_bincompat_options". Applied.

    http://xrl.us/bi97q

  Jarkko Hietaniemi got H.Merijn Brand to tweak Configure in order to
  align floating point policies of gcc and cc on Tru64.

    http://xrl.us/bi97s

  Jan Dubois thought that change #23984 should be integrated into 5.8.x,
  as it gets "corelist" installed on Win32. Nicholas Clark said that it
  was already in, the reason being that it helps "perlbug" go about its
  business.

    http://xrl.us/bi97u

  Andreas König warned that lib/CGI/t/upload_post_text.txt was checked
  in as binary and wanted to know if it could be changed. Rafael said that it
  was binary for a reason; it was in fact a GIF file.

    and patent-free
    http://xrl.us/bi97w

  Jerry D. Hedden ran into trouble with the above file, and Nicholas
  Clark straightened things out.

    all packed up
    http://xrl.us/bi97y

  Paul Fenwick issued an RFC for "Fatal"/"autodie" exception handling
  naming and structures.

    http://xrl.us/bi972

Last week's summary

  Tels clarified a point regarding the use of POD for wiki markup,
  explaining that his MediaWiki-Pod distribution on CPAN was a subclass
  of "Pod::Simple::HTML" that fixes up a lot of the problems that people
  encounter when using "Pod::Simple::HTML".

    This Week on perl5-porters - 23-29 March 2008
    http://xrl.us/bi974

About this summary

  This summary was written by David Landgren.

  Weekly summaries are published on http://use.perl.org/ and posted on a
  mailing list (subscription: [EMAIL PROTECTED]). The
  archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
  and comments are welcome.

  If you found this summary useful, please consider contributing to the
  Perl Foundation or attending a YAPC to help support the development of
  Perl.
