This Week on perl5-porters - 5-11 June 2006

  Perl has always been very good about builds. Lets me spend my time
  fighting other software issues. -- Alan Olsen

Topics of Interest

Using "#ifdef" inside a macro

  Discussion continued on this issue affecting mg.c. The essence of the
  problem was a simple issue of the character used for separating paths
  (such as "/" on Unix). Andy Dougherty suggested a way to solve the
  problem upstream, at configure time. John E. Malmberg, ever the
  devil's advocate, pointed out that this would be unlikely to fly on
  VMS, since the path separator could either be "|", ":" or perhaps
  something else again, depending on which shell perl was being run
  from.

    http://xrl.us/nf26

Failing 0.9% of the time

  Jerry D. Hedden traced a random thread test failure down to a
  problem that was a variant on the birthday paradox: put twenty-three
  people in a room, and there's a better than even chance that two
  people share the same birthday. In a similar vein, the tests would
  fail if a random number was generated twice.
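
  As a rough check of the arithmetic, the chance of at least one shared
  value among n draws from d equally likely values is 1 minus the chance
  that all n draws are distinct. The sketch below is illustrative only:
  the sub name and the 365-day year are assumptions, and it does not
  reproduce the parameters of the actual threads test.

```perl
use strict;
use warnings;

# Probability that at least two of $n draws from $d equally likely
# values coincide (the classic birthday calculation).
sub collision_probability {
    my ($n, $d) = @_;
    my $p_unique = 1;
    $p_unique *= ($d - $_) / $d for 0 .. $n - 1;
    return 1 - $p_unique;
}

printf "%d people: %.3f\n", $_, collision_probability($_, 365) for 20, 23;
# 20 people: 0.411
# 23 people: 0.507
```

  With 365 possible birthdays, the probability first passes one half at
  23 people.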

  The original issue was that all threads of a program would generate
  the same sequence of random numbers, which is contrary to many
  people's ideas of randomness. This has since been fixed, albeit with
  the above side effect in the test suite. Jerry applied a quick fix to
  bring the statistical likelihood of bogus failures from 0.9% down to
  0.003% by allowing one duplicate to occur.

  Paul Johnson thought that accepting two or three duplicates would
  ensure that the testing showed that the fix continued to work, while
  pushing the threshold of failure down into the noise. Jerry thought of
  another approach that would be even better still, but considered that
  this current fix was good enough. Rafael Garcia-Suarez applied the
  patch in any event.

    Heads or tails
    http://xrl.us/nf27

UTF-8 testing black smoke

  A recent change (#28528 - abolishing "cop_io") has caused lots of
  black smoke to issue forth from the smoke boxes. Nicholas Clark
  cleaned up a number of problems, but he wasn't sure what the best way
  was to correct a couple of the remaining failures.

  Sadahiro Tomoyuki suggested one way. The problem is still unsolved. It
  boils down to a question that anyone with a bit of Perl knowledge
  could solve: how to tell the test harness that two different outputs
  can be valid, expected output.

    Over to you, dear reader
    http://xrl.us/nf28

Compiling "mod_perl" on Suse Linux 10.1

  Torsten Foertsch was having great difficulty dealing with the
  "#define"s and "#ifdef"s in perl.h and was unable to compile
  "mod_perl". The problem was figuring out what macros expanded into,
  and whether one could typecast the macros and have it work as
  expected. Apparently not, because on some platforms, the macro expands
  into C code that looks like "if(1) ...".

    http://xrl.us/nf29

"dor" *versus* "err"

  Rafael has been using the new defined-or operator in "blead", and
  noticed that he kept writing "use feature 'dor'" instead of "use
  feature 'err'". After a while he grew tired of this, and added "dor"
  as an alias for "err".

    It's not dorky
    http://xrl.us/nf3a

Action queues as an action item

  David Nicol wanted to know whether people were interested in
  pursuing his idea of a queue-based mechanism for dealing with signal
  management and possibly other stuff.

  Nicholas ruled out any changes to the signal code in the maintenance
  branch. He also remarked that there might be a race condition between
  one thread receiving a signal, and another thread draining the queue,
  since, when a signal is received, there is not enough context to
  determine which interpreter should handle it. And it would be very
  hard to solve this in a portable manner.

  David maintained that none of the problems were insurmountable.

    http://xrl.us/nf3b

Losing signals

  Nicholas took a closer look at the existing signals code, and
  identified what he thought was the source of the allegations of lost
  signals: a section of code lacking a mutex.

    That's me in the spotlight
    http://xrl.us/nf3d

Is Perl's "bsearch" bug free?

  Dan Kogai wanted to know whether Perl was subject to the binary search
  bug found by Joshua Bloch, who works at Google. The problem is that a
  binary search works in part by taking the mid-point of a high and low
  value, where the values are indices into the table being searched.

  This has usually been performed until now by adding the two values
  together and dividing by two. There is a problem if the two values
  overflow the maximum integer size, which happens when you have about a
  billion elements in an array. This is becoming more and more frequent
  as machines become deeper and faster.

  Andy Dougherty confirmed that perl's Quicksort implementation as it
  stands does have this problem. The fix is probably, as Joshua
  suggests, to take the difference of the two values, divide that by
  two, and add that to the low value. Nonetheless, a lot of thought
  should be put into exploring the boundary cases (odd and even values,
  odd and even array sizes and so on) to make sure nothing breaks.
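
  The difference between the two midpoint calculations can be sketched
  in Perl itself. Perl scalars do not wrap at 32 bits, so the helper
  wrap_i32 below is a hypothetical stand-in that imitates what a C "I32"
  typically does on overflow; it assumes a perl built with 64-bit
  integers, and the sub names and index values are made up for
  illustration.

```perl
use strict;
use warnings;

# Wrap a value into the signed 32-bit range, imitating typical C "I32"
# overflow behaviour (assumes a perl with 64-bit integers).
sub wrap_i32 {
    my $x = $_[0] % 4294967296;
    return $x >= 2147483648 ? $x - 4294967296 : $x;
}

# Naive midpoint: the sum of the two indices can overflow.
sub mid_naive {
    my ($lo, $hi) = @_;
    return int(wrap_i32($lo + $hi) / 2);
}

# Safer midpoint: the difference never exceeds the larger index.
sub mid_safe {
    my ($lo, $hi) = @_;
    return $lo + int(($hi - $lo) / 2);
}

my ($lo, $hi) = (1_500_000_000, 2_000_000_000);
printf "naive: %d\n", mid_naive($lo, $hi);   # naive: -397483648
printf "safe:  %d\n", mid_safe($lo, $hi);    # safe:  1750000000
```

  Both forms agree for small indices; they only diverge once the sum of
  the two indices no longer fits in 32 bits.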

  Jan Dubois reminded people that even on 64-bit perls, the "av_*" API
  uses "I32" for indices. But changing that causes a huge ripple through
  the code base...

    Sooner or later
    http://xrl.us/nf3e

    Joshua Bloch's blog post
    http://xrl.us/ne99

"dlopen()" not found

  David Wheeler posted an innocuous call for help to get Perl compiled
  on a 64-bit Red Hat platform. It turned out that "Configure" lacked
  the necessary smarts to figure out what library paths should be used.

  Then followed a rather long discussion about Red Hat "spec" files, the
  evils of 32- and 64-bit applications residing on the same box, the
  extent to which Linux distributors backported security patches
  to older Perls and so on.

    Looking for libs in all the wrong places
    http://xrl.us/nf3f

The range operator ("..") *versus* Unicode

  Dan Kogai ruefully discovered that ".." doesn't work with Unicode. For
  instance, "('a'..'z')" produces all the letters of the (ASCII)
  alphabet, but the same cannot be said of "("\N{GREEK CAPITAL LETTER
  ALPHA}" .. "\N{GREEK CAPITAL LETTER OMEGA}")".

  Even more surprising, though, was the fact that a simple work-around,
  by way of a "map {chr}", was all that was needed to make things work.

  Yitzchak Scott-Thoennes connected this issue to the definition of the
  magic auto-increment operator and the fact that it is not geared to
  dealing with this new-fangled Unicode stuff.

    $s = 'a';
    $s++; # now 'b'
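
  A minimal sketch of the "map {chr}" work-around: build the range over
  numeric code points, where ".." is plain integer counting, and convert
  each one back to a character. The variable names are illustrative.

```perl
use strict;
use warnings;
use charnames ':full';

my $alpha = "\N{GREEK CAPITAL LETTER ALPHA}";   # U+0391
my $omega = "\N{GREEK CAPITAL LETTER OMEGA}";   # U+03A9

# Range over code points, then map each one back to a character.
my @greek = map { chr($_) } ord($alpha) .. ord($omega);

printf "%d characters, U+%04X to U+%04X\n",
    scalar @greek, ord($greek[0]), ord($greek[-1]);
# 25 characters, U+0391 to U+03A9
```

  Note that the result covers every code point in the interval,
  including U+03A2, which is unassigned, so a real application would
  probably want to filter the list.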

  Dan was unhappily willing to accept that one could not create Unicode
  ranges, in which case the documentation should be amended to spell
  this out clearly. On the other hand, he was even more unhappy about the
  fact that in regular expressions and transliterations, Unicode ranges
  do work.

  Rafael entertained the idea of prolonging the magic to allow:

    $s = "\N{omega}";
    $s++; # now "\N{alpha}\N{alpha}"

  but thought that the results might prove to be entertaining in all
  sorts of unforeseen ways.

  Sadahiro Tomoyuki reminded people to be very careful when dealing with
  Greek letters, such as final sigma between rho and sigma, and that
  stigma or digamma aren't treated as following epsilon. Or the
  opposite, I'm not sure.

    It's all greek to me
    http://xrl.us/nf3g

Patches of Interest

A better Aho-Corasick, and lots of benchmarks

  Yves Orton sent in a new, improved patch for adding the Aho-Corasick
  pattern matching algorithm to the regular expression engine. He also
  bundled a perl_bench.pl program that he had developed to benchmark
  multiple Perl versions.

  Yves thought that some of the observed slowdown in 5.9.4 might be due
  to Dave Mitchell's work in eliminating recursion from the engine.
  Nonetheless, he was optimistic about the situation, saying that there
  is a lot of room for improvements, and was targeting being at least as
  fast as 5.6.1.

  In a followup, he rejigged the code and had the performance running
  faster than Python (not that we're competing or anything).

  Yves develops on Windows, using MSVC as his main compiler, and as
  such, unwittingly used some MSVC-isms that gave "gcc" and other
  compilers indigestion. Other porters came to the rescue and
  straightened it all out.

    /n[iiiiii]ce/
    http://xrl.us/nf3h

  Andy Lester tossed in some cleanups to regexec.c and regcomp.c. Yves
  took them onboard, along with an observation from Nicholas that showed
  that the patch disagreed with threads, and produced a new version.

    http://xrl.us/nf3i

  Yves also tidied up and shipped out a more current version of a
  pluggable 5.10 regexp engine for 5.8.1.

    Overhaul
    http://xrl.us/nf3j

Onwards, "const"ing soldier

  Andy Lester delivered a patch for toke.c that added a range of
  assorted goodnesses.

    http://xrl.us/nf3k

  and more accumulated cleanups in files too numerous (well, four) to
  mention,

    http://xrl.us/nf3m

  which prompted Nicholas to make a few remarks about "int"s and pointer
  sizes and so forth.

    http://xrl.us/nf3n

  Moving right along, Andy added some home-grown "const"ing to dump.c,

    http://xrl.us/nf3o

  and silenced a couple of "signed"/"unsigned" warnings in toke.c and
  op.c. Not applied, apparently.

    http://xrl.us/nf3p

  Andy wrapped up with a tidying of "sv_dup" in sv.c. It looked big and
  scary, but Rafael fearlessly applied it anyway.

    http://xrl.us/nf3q

Tidying "PL_cshname" and "PL_sh_path"

  Jan Dubois tweaked the code to deal with the problem of installing a
  Perl distribution on a system that has only "/sbin/sh" and not
  "/bin/sh", and refactored the code that deals with "exec" failures,
  regardless of the shell ("sh" or "csh") used.

    Hide in your shell
    http://xrl.us/nf3r

Watching the smoke signals

  A number of smokes elicited a certain amount of comment:

    Smoke [5.9.4] 28368 FAIL(XF) linux 2.6.15-23-386 [debian] (i686/1 cpu)
    http://xrl.us/nf3s

    Smoke [5.9.4] 28372 FAIL(F) linux 2.6.15-23-386 [debian] (i686/1 cpu)
    http://xrl.us/nf3t

    Smoke [5.8.8] 28233 FAIL(F) MSWin32 WinXP/.Net SP2 (x86/2 cpu)
    http://xrl.us/nf3u

  And time is not what I have on my side to go any further.

New and old bugs from RT

"perlop": mention why "print !!0" doesn't (#33765)

  Steve Peters and Dave Mitchell kicked the tyres on this bug, deciding
  that what was needed was a paragraph that explained what values
  boolean operations return.
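
  The behaviour in question can be seen with a short sketch (variable
  names are illustrative): the false value produced by "!" is an empty
  string that also counts as 0 in numeric context, which is why printing
  it shows nothing.

```perl
use strict;
use warnings;

my $true  = !!1;
my $false = !!0;

printf "true:  '%s'\n", $true;                               # true:  '1'
printf "false: '%s' (length %d)\n", $false, length $false;   # false: '' (length 0)
printf "false + 0 = %d\n", $false + 0;                       # false + 0 = 0
```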

    A call to quills
    http://xrl.us/nf3v

More leaks à la "eval "sub { \$foo = 22 "" (#37231)

  Nicholas Clark dared to defy Dave Mitchell with yet another way of
  producing an "eval" leak. Dave went medieval on him, and then just as
  it was getting exciting, vanished into the big room with the blue
  ceiling. While Dave wasn't looking, Nicholas shook out a few more
  weird cases when warnings are made fatal.

    King leer
    http://xrl.us/nf3w

"defined"-ness of substrings disappear over repeated calls (#39247)

  Rafael had been paying close attention to what Sadahiro Tomoyuki had
  said about bugs and patches in "blead" and "maint" that dealt with
  substrings losing their "defined"-ness, for he reverted the change
  that a) broke this language feature and b) was no longer necessary
  anyway.

    Same as it ever was
    http://xrl.us/nf3x

Regexp match fails on empty pattern (#39339)

  This innocuous bug report attracted a surprising amount of comment, on
  the subtlety of "/ /", "//", '' and ' ', and how these all do slightly
  different things to "split", on how to deal with warnings,
  "usesitecustomize" and more besides. In the end, Ronald Fischer asked
  for the report to be closed, and that was the end of that.
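
  For readers who have not met the distinction, a small sketch of how
  these separators behave with "split" (the sample strings are made up):

```perl
use strict;
use warnings;

my $str = "  a b  c";

# A single-space string is special: leading whitespace is skipped and
# the string is split on runs of whitespace.
print join('|', split(' ', $str)), "\n";    # a|b|c

# A pattern of one space splits on every single space, so empty
# fields appear.
print join('|', split(/ /, $str)), "\n";    # ||a|b||c

# An empty pattern (or empty string) splits between characters.
print join('|', split(//, 'abc')), "\n";    # a|b|c
print join('|', split('', 'abc')), "\n";    # a|b|c
```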

    http://xrl.us/nf3y

"sort" with custom subname and prototype "($$)" segfaults intermittently (#39358)

  This appeared to be a curious bug on the surface, dealing as it did
  with doubly-indirected arrays. H.Merijn Brand noted that this no
  longer dumped core in "blead", instead contenting itself to bail out
  with a cryptic "Bizarre copy of UNKNOWN in aelem".

  Graham Barr was the first to realise that the cause was an in-place
  sort being performed, and that assigning the results
  to another array made it work. Salvador Fandiño then correctly
  identified the optimisation that was causing things to go wrong: you
  cannot refer to the contents of the array within the comparator during
  an in-place sort.
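
  The shape of the work-around can be sketched as follows. This is not
  the code from the bug report: the array, its contents and the sub name
  are hypothetical, and the sketch only shows the safe form, in which
  the sorted list is assigned to a different array from the one the
  comparator reads.

```perl
use strict;
use warnings;

my @rank = (2, 0, 1);

# Comparator with a ($$) prototype: operands arrive in @_ rather than
# in $a and $b. Note that it reads @rank, the very array being sorted.
sub by_rank($$) {
    my ($x, $y) = @_;
    return $rank[$x] <=> $rank[$y];
}

# Assigning to a different array sidesteps the in-place optimisation.
my @sorted = sort by_rank @rank;

print "@sorted\n";   # 1 2 0
```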

  Dave Mitchell saw how to fix it, and said he would do so when he had a
  few spare moments.

    Out of sorts
    http://xrl.us/nf3z

Bug in toke.c (eval in subst) (#39365)

  B. Carter discovered that "s//#/e" does slightly naughty things in
  5.8.0, and apparently it still behaves the same way in "blead".

    http://xrl.us/nf32

Perl5 Bug Summary

    7 open, 7 closed, stable at 1488
    http://xrl.us/nf33

    Pick a winner
    http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

New Core Modules

  *   Test-Harness version 2.62 was released by Andy Lester.

        http://xrl.us/nf34

  *   Version 0.64 of "version" was released by John Peacock.

        http://xrl.us/nf35

  *   Sys-Syslog version 0.15 was released by Sébastien Aperghis-Tramoni.

        http://xrl.us/nf36

In Brief

  Rafael made "IPC::Open3" call "POSIX::_exit" upon "exec" failures.
  (Bug #39252).

    http://xrl.us/nf37

  The thread concerning Perl "make" failures on HP-UX went on for far
  longer than anyone expected. (Bug #39269)

    http://xrl.us/nf38

  Lee Goddard was having trouble installing "LWP" and was directed to
  the appropriate forum. (Bug #39334).

    http://xrl.us/nf39

  Jarkko Hietaniemi declaimed that in regcomp.c, thou shalt not index
  with "char" (since that can be "signed" or "unsigned"). The thread
  degenerated into reminiscences of ye olde computers I have known and
  loved, before veering into "alt.kinky.bizarre".

    http://xrl.us/nf4a

  Russ Allbery had not quite finished his quest to deal with quotes,
  both single and double, escaped or not, in "C<>" sections and
  elsewhere.

    The great escape
    http://xrl.us/nf4b

  Rafael suggested that the question about "Pod::Usage"'s usage of $0
  could be side-stepped, choosing efficiency over accuracy.

    A sly escape
    http://xrl.us/nf4c

  Jerry D. Hedden tidied up "threads" a bit more, dealing with HP-UX
  10.20's non-standard "pthread_attr_getstacksize" and dealing with the
  absence of "threads::shared" during testing.

    http://xrl.us/nf4d

  Joshua ben Jore wondered if op/stat.t failures on an NFS-mounted
  directory could be attributed to slight clock differences between the
  local and remote machines. Rafael recalled having encountered similar
  issues in the past.

    This is the hour
    http://xrl.us/nf4e

  Steve Peters taught "Configure" that "icc" is not "gcc".

    The drama unfolds
    http://xrl.us/nf4f

  Peter Scott noticed an odd limitation in charnames.pm that caused
  named escapes in "eval" to fail. Rafael removed it in "blead".

    http://xrl.us/nf4g

  Yitzchak Scott-Thoennes fixed "blead" so that exhausting "<>" in
  "BEGIN" no longer causes an "ARGVOUT" used only once warning.

    http://xrl.us/nf4h

  Mohammad Yaseen continued his quest on building stuff on IBM big iron.
  He now has a 5.8.7 up and running, but was having trouble getting the
  test suite for "OS390-Stdio-0.008" to run cleanly. Robert Zielazinski,
  who has years of experience with MVS, pointed to a couple of gotchas
  that Yaseen should be aware of.

    Man Versus System
    http://xrl.us/nf4i

  He also learnt all about %SVf in perl, what it does, and why you would
  want to.

    http://xrl.us/nf4j

  Anno Siegel delivered a new iteration of support for inside-out
  classes. Abigail queried a point of nomenclature.

    Making a hash of it
    http://xrl.us/nf4k

  Nicholas Clark showed how to get "Math-Pari" compiling again under
  "blead".

    http://xrl.us/nf4m

  Ferdinand Bolhar-Nordenkampf offered a number of suggestions for
  enhancements, that the porters duly considered.

    http://xrl.us/nf4n

  Jarkko suggested turning the "PL_perlio_fd_refcnt" into an "HV". Dave
  Mitchell thought that things were already too complicated.

    http://xrl.us/nf4o

About this summary

  This overdue summary was written by David Landgren.

  Last week's summary, here:

    http://xrl.us/nf4p

  was carefully read by Dr. Ruud, who was kind enough to point to a
  webbed version of a "c.l.p.m" thread discussing Unicode matters that I
  mentioned I was not able to access, where it appears that "It
  seems that utf8 extends the core perl parser in some interesting
  ways."

  Sadahiro Tomoyuki explained what was going on, and Dr. Ruud started to
  look through the other "encoding" bugs to see if they were
  manifestations of a related cause. Tomoyuki pointed out yet another
  bizarre Unicode inconsistency, dealing with Unicode-encoded variable
  names.

  If you want a bookmarklet approach to viewing bugs and change reports,
  there are a couple of bookmarklets that you might find useful on my
  page of Perl stuff:

    http://www.landgren.net/perl/

  Weekly summaries are published on http://use.perl.org/ and posted on a
  mailing list (subscription: [EMAIL PROTECTED]). The
  archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
  and comments are welcome.

  If you found this summary useful, please consider contributing to the
  Perl Foundation to help support the development of Perl.
--
"It's overkill of course, but you can never have too much overkill."
