This Week on perl5-porters - 20-26 March 2006

  Dave Mitchell converts the regular expression engine from recursive to
  iterative.

Topics of Interest

More on "Module::Build" on VMS

  Ken Williams got back to Craig A. Berry's patch from last week for
  "Module::Build" on VMS, and implemented a new approach to deal with
  backtick captures. John E. Malmberg and Craig batted it around for a
  while until it looked ready. John wrapped it up as a new version of
  "ExtUtils::CBuilder". John noted that there might be issues with older
  VMS versions that limit command lines to 255 characters, but decided
  to punt the issue for the time being.

    Looking good
    http://xrl.us/km2x

Building a threads-friendly debugger

  Dean Arnold wrote to say that he was in the process of hacking "ptkdb"
  to make it easier to deal with debugging multi-threaded programs. He
  had reached the point where it seemed that the most promising way
  forward was to change the $DB::single variable to be globally shared
  across all the threads.

  After the usual admonishments ("You're mad!", "No-one who has ventured
  there has ever come back alive!"), Dave Mitchell said that he thought
  that it couldn't do much harm, except that it was likely to bring
  about a significant loss in performance, as the threads fought amongst
  themselves to acquire a lock on $DB::single to read it.

  Dean ran a couple of benchmarks and saw that Dave was right: the
  resulting performance hit was pretty atrocious (about two orders of
  magnitude).

    Where hackers fear to tread
    http://xrl.us/km2y

Dynamic libraries on AIX 5.1

  Last time we heard from John L. Allen, he had been busy doing battle
  with 32/64 bit builds with Oracle on AIX. This week he was having
  trouble with "Math::Pari", and he and Ilya Zakharevich, "Math::Pari"'s
  author, were stuck.

  The problem revolved around which libraries were being linked, which
  meant that the wrong version of the C language "pow" function was
  being used. John wanted to understand what was happening and why.
  H.Merijn Brand guided him through the twisty mazes of AIX linker
  techniques.

  By the end of the thread John had managed to concoct a method for
  making it work, and H.Merijn made a plea for an AIX maven to step in
  and take over (and revise) the README.aix file.

    Fear and loathing
    http://xrl.us/km2z

New "Time::Local" failure

  Rafael Garcia-Suarez attempted to upgrade "blead" with "Time::Local"
  version 1.12, and saw that the test suite failed. Steve Hay recalled
  that this was the result of a bug that he had encountered in LWP's
  test suite. Gisle Aas isolated the problem with "Time::Local", and
  Dave Mitchell came up with the patch.

  Steve wondered whether that patch should be applied only to the Win32
  platform. Dave Rolsky, author of the module, responded saying that
  there were some problems with integer overflow that get triggered
  only in certain time zones. He said that it was all a bit of a mess,
  but that he was going to get it sorted out and release 1.13.

    It's about time
    http://xrl.us/km22

Revamped UTF-8 caching code

  Nicholas Clark checked in some code to rework how UTF-8 caching is
  performed.

  First, some background: finding the offset of an arbitrary character
  in a UTF-8 string can be a difficult proposition, depending on the
  number of wide characters encountered in the string. The brute force
  method consists of starting from the beginning, and then counting
  characters until the desired offset is reached. Depending on the
  length of the string, this can be very time-consuming.

  To lessen this cost, perl maintains a cache of where wide characters
  appear in a string, to minimise the amount of linear scanning
  required. A few weeks ago, a bug report revealed that there were some
  problems with the existing cache management code.
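
  The brute-force scan described above can be sketched in plain Perl.
  This is only an illustration of the idea, not perl's internal code:
  "char_to_byte_offset" is a name made up for this example, and it
  assumes the input is a string of well-formed UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: map a character offset to a byte offset in a
# string of UTF-8 encoded bytes by scanning from the start.
sub char_to_byte_offset {
    my ($bytes, $char_offset) = @_;
    my $byte_offset = 0;
    my $length      = length $bytes;
    while ($char_offset > 0 && $byte_offset < $length) {
        my $lead = ord substr($bytes, $byte_offset, 1);
        # The lead byte says how many bytes the character occupies.
        my $width = $lead < 0x80 ? 1
                  : $lead < 0xE0 ? 2
                  : $lead < 0xF0 ? 3
                  :                4;
        $byte_offset += $width;
        $char_offset--;
    }
    return $byte_offset;
}

# "a", e-acute (2 bytes), "b", euro sign (3 bytes), "c"
my $bytes = join '', 'a', "\xC3\xA9", 'b', "\xE2\x82\xAC", 'c';
print char_to_byte_offset($bytes, $_), "\n" for 0 .. 5;
```

  Every lookup walks the string from byte zero, so the cost grows with
  the offset requested. That is the work a position cache is meant to
  avoid.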

  So Nicholas reworked it a fair bit, adding a "${^UTF8CACHE}" variable
  to allow the caching code to be enabled and disabled at will, as well
  as a "PERL_UTF8_CACHE_ASSERT" build-time switch to force extra
  checking (verifying that the cached and uncached results agree). He
  also discovered that the code wasn't taking full advantage of the
  gathered information, and tweaked the code to minimise the amount of
  linear scanning required.

    And accessible from the command-line too
    http://xrl.us/km23

    see also
    http://xrl.us/j9pq

The regexp engine no longer uses recursion

  Dave Mitchell announced that he had reworked the regular expression
  engine to use an iterative technique rather than a recursive one. He
  achieved this feat by making "S_regmatch()" save its match context on
  the heap and restart the main loop, rather than saving it on the
  stack by calling itself.
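
  The general shape of such a conversion can be shown with a toy
  backtracking matcher. This is a sketch of the technique only and has
  nothing to do with the real "S_regmatch()" code: "match_iterative" is
  an invented name, and the pattern language is assumed to be just
  literal characters, "." and a postfix "*", matched against the whole
  string.

```perl
use strict;
use warnings;

# Toy matcher: saved states live in an ordinary array (on the heap)
# and a single loop picks them up, instead of the function calling
# itself once per pending alternative.
sub match_iterative {
    my ($pattern, $string) = @_;
    my @p = split //, $pattern;
    my @s = split //, $string;
    my @saved = ([0, 0]);    # [pattern index, string index]
    while (my $state = pop @saved) {
        my ($pi, $si) = @$state;
        return 1 if $pi == @p && $si == @s;    # both consumed
        next     if $pi == @p;                 # pattern ran out first
        my $char_ok = $si < @s && ($p[$pi] eq '.' || $p[$pi] eq $s[$si]);
        if ($pi + 1 < @p && $p[$pi + 1] eq '*') {
            # x*: either skip the starred item, or consume one
            # character and try the same item again.
            push @saved, [$pi + 2, $si];
            push @saved, [$pi, $si + 1] if $char_ok;
        }
        elsif ($char_ok) {
            push @saved, [$pi + 1, $si + 1];
        }
    }
    return 0;    # no saved state led to a match
}

print match_iterative('a*b', 'aaab') ? "match\n" : "no match\n";
print match_iterative('a*',  'b')    ? "match\n" : "no match\n";
```

  The depth of backtracking is now limited by available memory rather
  than by the size of the C stack.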

  Dave measured that the heap allocation induced a 3% slowdown, but
  thought that this could be avoided by switching to an arena-based
  allocation scheme or similar, further down the track.

  Before you ask, yes, "/(??{$re})/" still causes recursion. And Hugo
  van der Sanden thinks undoing *that* would be hard.

    No more nasty stack overflow bugs
    http://xrl.us/km24

Patches of Interest

Upgrading to "threads" version 1.12

  Jerry Hedden had delivered a patch to sync "blead" with "CPAN". Dave
  Mitchell declined the patch, saying that a patch must never mix
  functionality and whitespace formatting changes. If the whitespace is
  to be changed (and in general the rule is: never), then that should be
  delivered in a separate patch.

  Dave also thought that the approach was back to front. The changes
  should be applied to "blead" first, and then after the changes have
  had time to settle, the "blead" version can be released to "CPAN".

  Jan Dubois agreed that he too would prefer it this way around, since
  each change is tracked in Perforce, the "perl5-changes" mailing list
  gets to hear about it, and e-mail "Message-ID"s from the latter list
  make it easier to cross-reference the changes with traffic on
  "perl5-porters".

    http://xrl.us/km25

  Jerry also asked about the definition of "THREAD_RET_TYPE", in the
  process of coming to grips with the "threads" code base, but received
  no answers.

    http://xrl.us/km26

  Jerry finally got a patch accepted to sync "blead" with CPAN.

    http://xrl.us/km27

Serialising closures via "Storable"

  David Wheeler wanted to know whether "Storable" could be used to dump
  out a closure, bring it back again, and have it work. For instance, to
  be able to say

    my $var = 1;
    my $code = sub { $var };
    print $code->();
    $code = thaw(freeze($code));
    print $code->();

  And have it print out "1" twice, rather than once and a warning about
  uninitialised values in "print". Yuval Kogman explained how it was
  more or less possible, and the pitfalls one would encounter if one
  were brave enough to insist on the approach.

  Yves Orton, author of "Data::Dump::Streamer", showed how using that
  module could probably provide something closer to what David was
  after. Joshua realised that one only had to teach "Storable" to use
  "DDS" instead of "B::Deparse" and it would Just Work.

  Rafael noted that Storable is in the core, but "DDS" is not, although
  it should be possible to teach "Storable" to use it if it were
  available locally.

    http://xrl.us/km28
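
  For reference, "Storable" documents two switches, $Storable::Deparse
  and $Storable::Eval, that let it carry code references by way of
  "B::Deparse". The sketch below only shows the case that is expected
  to work, a subroutine with no closed-over variables. Lexicals such as
  $var in David's example are not carried along, which is the gap under
  discussion.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Both switches are off by default. Enabling Eval means that thawing
# runs whatever code the frozen data contains, so only use it on data
# you trust.
$Storable::Deparse = 1;
$Storable::Eval    = 1;

# A code reference that does not close over anything.
my $code = sub { return 6 * 7 };
my $copy = thaw(freeze([$code]))->[0];

print $copy->(), "\n";
```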

Watching the smoke signals

Compress/IO/Zlib/t/050interop-gzip.t failure on OpenBSD

  Steve Peters tracked down the smoke failures occurring on OpenBSD. It
  turns out that OpenBSD's "gzip" behaves differently when gzipping a
  zero-byte file:

    # Cygwin, FreeBSD, Linux, NetBSD, Solaris, ...
    touch /tmp/foo; gzip -c /tmp/foo > /tmp/foo.gz; echo $?
    0
    # OpenBSD
    touch /tmp/foo; gzip -c /tmp/foo > /tmp/foo.gz; echo $?
    1

  Paul took that into account, but wondered all the same why the smoke
  results mentioned "Inconsistent test results (between TEST and
  harness)", when one would expect both TEST and harness to fail in
  exactly the same way.

  Steve had a hunch that the problem on OpenBSD arose when the file to
  be compressed is less than 10 bytes long. Which seems odd, to say the
  least. Joshua ben Jore mentioned that he had seen similar problems on
  an Ubuntu Linux system but hadn't been paying close attention. He
  promised to go back and look more closely to see if it was the same
  error, or something else again.

    One more reason...
    http://xrl.us/km29

    And the patch to fix it
    http://xrl.us/km3a

Smoke [5.9.4] 27593 FAIL(F) MSWin32 WinXP/.Net SP2 (x86/2 cpu)

  Steve Hay had a Windows build fail due to a problem with
  "ExtUtils::MakeMaker" (which Rafael had recently integrated), and
  asked Michael Schwern to integrate the patch he made into the
  "EU::MM" repository.

    Earth to Schwern, do you read me?
    http://xrl.us/km3b

New and old bugs from RT

"print (...) interpreted as function" occasionally (#4346)

  Many moons ago, Abigail reported that the message "print (...)
  interpreted as function" appears inconsistently, depending on a
  peculiar combination of closing braces, whitespace and/or semicolons.
  Steve Peters said that "say" has picked up a similar habit.

    The more things change...
    http://xrl.us/km3c
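
  A small harness makes it easy to see when the warning fires. This is
  a sketch, with "warnings_for" being a helper invented for the
  example. It compiles each snippet inside an anonymous sub that is
  never called and collects whatever the compiler warns about. As far
  as I can tell, the check keys off the space before the opening
  parenthesis and whatever follows the closing one, so treat the
  results as something to verify on your own perl.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet and return the warnings
# emitted while compiling it. The snippet itself is never run.
sub warnings_for {
    my ($source) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    eval "sub { $source }";
    return @warnings;
}

for my $source ('print (1 + 2) * 3;', 'print (1 + 2);', 'print(1 + 2) * 3;') {
    my $hit = grep { /interpreted as function/ } warnings_for($source);
    printf "%-20s %s\n", $source, $hit ? 'warns' : 'quiet';
}
```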

More on overloading and reblessing (#34925)

  The thread about overloading and reblessing objects continued this
  week. Nicholas Clark proposed a solution to scan all the references to
  an object and fix them up. Yitzchak Scott-Thoennes pointed out that
  such an approach would break the following code:

    $a = $b = {};
    bless $b, OverloadedClass;
    # $a is not overloaded here

  Yitzchak admitted that such a construct would probably be quite rare,
  and wondered whether it wouldn't be better simply to document the fact
  that the initial example doesn't work, with suggested work-arounds.
  Nicholas implemented the scan approach in "maint" as change #27512.

    http://xrl.us/km3d

B::Lint chokes on simple script (#38771)

  Bart Lateur filed a bug report against "B::Lint" (on perl 5.8.7). The
  interesting thing is that the program in question was

    print for 1 .. 10

  Joshua ben Jore, who has recently put a fair amount of work into the
  "B::" namespace, observed that the problem has been fixed in "blead",
  but that it probably still exists in 5.8.8.

    http://xrl.us/km3e

"NaN"s on Win32 (#38779)

  Rob a.k.a. Sisyphus posted a bug report concerning "NaN"s (Not a
  Number) on Win32. It seems that there is a compiler issue: code
  compiled with VC7 behaves correctly, but code compiled with VC6 does
  not.

  Dominic Dunlop noted sadly that the best way to fix this bug would be
  to add a note to the README.win32 documentation to say that perl
  should not be built with VC6. There's an article on the MSDN site that
  goes into more detail about floating point comparison issues.

  Yves Orton thought that that was hardly ideal, since VC6 has always
  been the standard compiler that ActiveState uses for their builds.
  Except that Dominic was talking about Microsoft's freely downloadable
  compiler, which is apparently a slightly different beast.

  Jan Dubois came up with the best patch, one that works around
  compilers that have brain-damaged "NaN" comparison routines. Looking
  more closely at the code, Jan realised that perl's "NaN" handling is
  somewhat uneven. "grok_number()" will set the "IS_NUMBER_NAN" and
  "IS_NUMBER_INFINITY" bits as appropriate, but "sv_2nv()" doesn't
  bother to check them; it ducks the issue and lets "atof()" deal with
  it. He also saw that the cmp.t test that checks how "<=>" deals with
  "NaN"s is probably not doing anything meaningful.

    http://xrl.us/km3f

  In a thread-split elsewhere on the same topic, Jan provided keen
  insight into the subject of C run-time libraries on Windows.

    http://xrl.us/km3g

Constants with "undef" value deliver arbitrary value at first call (#38783)

  Markus Herber posted a bug report dealing with the XS code of "IO-Tty"
  that creates a constant subroutine with "undef" as a value. Nicholas
  Clark understood what was going wrong and promptly supplied a patch
  which solved the problem. The patch is a bit of a stop-gap measure,
  but it will do for now.

    http://xrl.us/km3h

Deep hash of hashes breaks garbage collector (#38786)

  Reto Stamm uncovered a lovely bug in the garbage collector. He posted
  a program (paraphrased for succinctness here):

    my $root = {};
    my $h = $root;
    $h->{kid} = {} and $h = $h->{kid} for 1..250000

  This runs just fine until the program exits. At that point the
  garbage collector runs, exhausts the C stack through recursion, and
  the program goes belly up with a segmentation fault.

  chromatic thought that simply rewriting "S_hfreeentries",
  "Perl_hv_undef", "Perl_sv_clear", "Perl_sv_free2", and
  "Perl_hv_free_ent" for good measure to use iteration instead of
  recursion would probably do the trick.

    *crickets chirping*
    http://xrl.us/km3i
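
  In the meantime, a script can sidestep the problem by dismantling the
  chain itself before it exits, one link at a time. The following is a
  minimal sketch. It assumes the structure is linked through a single
  "kid" key as in the program above, and uses a smaller depth to keep
  memory use modest. Whether a given perl needs this at all depends on
  how it frees nested structures.

```perl
use strict;
use warnings;

# Build the same kind of chain as in the bug report.
my $depth = 100_000;
my $root  = {};
my $h     = $root;
$h->{kid} = {} and $h = $h->{kid} for 1 .. $depth;
undef $h;

# Unlink each child before letting go of its parent, so that every
# hash is empty by the time its reference count drops to zero and
# freeing it never has to descend into a nested structure.
my $node     = $root;
my $unlinked = 0;
while (my $kid = delete $node->{kid}) {
    $node = $kid;
    $unlinked++;
}

print "unlinked $unlinked nested hashes\n";
```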

"Fatal" doesn't like "readdir()" (#38790)

  Tom Hukins filed a report that showed that "readdir" breaks when
  "Fatal" is used. ("Fatal" makes builtins that signal failure by
  returning false raise a fatal error instead.)

  The trouble is that "Fatal" gets mixed up between scalar and list
  context (doesn't everyone?) and throws all the results away. Rafael
  thought that a judiciously placed "wantarray" would solve that, but
  that in turn would alter the behaviour of something as admittedly
  bizarre as

    my @useless = open my $fh, 'does.not.exist';
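
  The context problem is easier to see with a hand-rolled wrapper. This
  is a sketch of the general "wantarray" technique, not the code that
  lives in "Fatal". The names "fatal_wrap" and "entries" are invented
  for the example, and the wrapped sub merely stands in for a
  context-sensitive builtin such as "readdir".

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a code reference so that a false or empty
# result becomes an exception, while passing the caller's context on
# to the wrapped code.
sub fatal_wrap {
    my ($name, $code) = @_;
    return sub {
        if (wantarray) {
            my @results = $code->(@_);    # called in list context
            die "$name failed\n" unless @results;
            return @results;
        }
        my $result = $code->(@_);         # called in scalar context
        die "$name failed\n" unless $result;
        return $result;
    };
}

# Returns everything in list context, one item in scalar context.
my @entries = qw(a b c);
my $wrapped = fatal_wrap(entries => sub {
    return wantarray ? @entries : $entries[0];
});

my @all   = $wrapped->();    # list context
my $first = $wrapped->();    # scalar context
print scalar(@all), " $first\n";
```

  The price is that the wrapped code now sees list context whenever the
  caller supplies it, which is exactly what would change for a
  list-context call like the "open" above.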

  Yitzchak suggested hunting down the exceptions ("select" also seemed
  to be a likely candidate) and documenting their limitations in
  conjunction with "Fatal". Joshua thought that this was less than
  ideal. If someone was going to go to the effort of hunting down all of
  the weird special-context builtins to document them (and there aren't
  a whole lot), it would take about as much effort to code "Fatal" to
  make it do The Right Thing all the time.

  Rafael agreed, and kept looking at his inbox for the patch. Joshua
  mumbled something about some patches to "B::Lint" he was working on,
  and promised to do something about this first.

    http://xrl.us/km3j

  Joshua went looking at "Fatal", and stumbled across some "AUTOLOAD"
  code, and wondered if and how it was used. Mark Jason Dominus
  suggested that its purpose was to allow the construct

    use Fatal;
    Fatal::open();

  to work in the same manner as

    use Fatal 'open';
    open();

  Which is either pretty slick, or pretty sick.

    Nice to know
    http://xrl.us/km3k

Perl5 Bug Summary

    1560 open tickets
    http://xrl.us/km3m

    Right here
    http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

In Brief

  Dave Mitchell reminded us that "our" variables and package variables
  are compiled to the same code internally and as such have identical
  performance characteristics.

    http://xrl.us/km3n
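
  A quick way to convince yourself that the two spellings name the same
  variable is to compare references. This sketch says nothing about
  timing, and the package name "Counter" is made up for the example.

```perl
use strict;
use warnings;

package Counter;

our $count = 0;       # lexically scoped alias for $Counter::count
$Counter::count++;    # the fully qualified name reaches the same variable
$count++;

print "$count $Counter::count\n";
print \$count == \$Counter::count ? "same variable\n" : "different\n";
```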

  Philip M. Gollucci reported a bug that manifests itself using
  "mod_perl" on FreeBSD. Apparently another one of those "this is the
  second time it's broken" bugs. Robin Barker and Gisle Aas committed a
  couple of patches, including adding a check in the test suite, so
  hopefully we won't see the likes of it again.

    Perl_croak and nullch
    http://xrl.us/km3o

  Jim Cromie reported that "bleadperl" was uncompilable, due to problems
  with "DynaLoader" failing. Rafael traced it to the fact that he was
  integrating CPAN's "ExtUtils::MakeMaker" 6.30_01 into "blead", and its
  handling of "MAN3PODS" was broken. So he fixed that, and "bleadperl"
  started compiling again.

    Safe to go back in the water
    http://xrl.us/km3p

  Dan Kogai found an anomaly whilst playing with "YAML::Syck" and
  developed a detailed hypothesis as to what was going wrong. As of
  summary publishing time, no comments had been made.

    How to mangle the SvTYPEs on arrays and hashes
    http://xrl.us/km3q

  Someone asked how to use Perl to run Visual Basic code and was
  directed to Perlmonks.

    http://xrl.us/km3r

About this summary

  This summary was written by David Landgren.

  Information concerning bugs referenced in this summary (as #nnnnn) may
  be viewed at http://rt.perl.org/rt3/Ticket/Display.html?id=nnnnn

  Information concerning patches to "maint" or "blead" referenced in
  this summary (as #nnnnn) may be viewed at
  http://public.activestate.com/cgi-bin/perlbrowse?patch=nnnnn

  If you want a bookmarklet approach to viewing bugs and change reports,
  there are a couple of bookmarklets that you might find useful on my
  page of Perl stuff:

    http://www.landgren.net/perl/

  Weekly summaries are published on http://use.perl.org/ and posted on a
  mailing list (subscription: [EMAIL PROTECTED]). The
  archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
  and comments are welcome.

  If you found this summary useful or enjoyable, please consider
  contributing to the Perl Foundation to help support the development of
  Perl.

--
"It's overkill of course, but you can never have too much overkill."
