This Week on perl5-porters - 18-24 May 2008

  "Ah, more details about filenames. Well, this sounds positively weird.
  Octet strings are not particularly user-friendly if you can't
  interpret them as characters reliably.

  From what you say, and what I think I've heard elsewhere, Unix
  filename interpretation is a mess. Seems like the only bigger mess
  I've heard about is VMS file handling, where they seem to have a
  choice of several messes." -- Glenn Linderman, deep in the heart of
  Unicode, case conversion, filenames, encodings, character sets, ß and
  other exciting issues.

Topics of Interest

Another perldoc shortcut

  Tom Christiansen commented on Gisle Aas's perldoc shortcut (that
  "perldoc ipc" would redirect to "perlipc", assuming no ipc.{pod,pm}
  existed), saying that in pre-5.8 times he had been working on a
  technique to make "perlipc" itself, run from the command-line, do the
  same thing. Somewhere along the line, things went astray and the work
  never made it into the core.

    not bitter, not really
    http://xrl.us/bk9mc

"File::Path::mkpath()" incompatibility in perl-5.10

  I had expected to make some progress on this issue, this week, but
  Real Life is eating my tuits like popcorn at the moment.

    next week, cross my heart
    http://xrl.us/bk9me

On the almost impossibility to write correct XS modules

  I might preface this thread "on the almost impossibility to write a
  correct summary of a complex subject". Marc Lehmann had written a few
  weeks ago that a bare "char *" through an XS API is fraught with
  peril, because there is no metadata available to tell you if it's
  Latin-1, KOI8-R, UTF-8 or something else.

  The thread blossomed this week, with a long-running debate about what
  is broken (and when, and how). One point that was made is that Win32
  encodes filenames in a particular way that doesn't really jibe with
  the rest of the internals. Unfortunately, it is only with hindsight
  that the problem really became apparent, hence the dilemma is that
  fixing it would break everything that has tried, with various degrees
  of success, to work around it.

  The "utf8" flag on SVs was again singled out as being responsible for
  world hunger and other assorted ills, with a number of examples
  demonstrating the problems.

  Rafael Garcia-Suarez outlined an approach that just may be a way
  forward out of the mess. After listening to Juerd Waalboer, he thought
  that marking an SV as "binary" and thereby disqualified from being
  upgraded to Unicode would be quite useful.

  Glenn Linderman invented "blorf" as an opaque token for discussing the
  issues without people getting sidetracked over definitions of bytes,
  strings, characters, numbers and codepoints.

    hard core
    http://xrl.us/bk9mg
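
  For readers who have not met the "utf8" flag, here is a minimal
  sketch of what it tracks. It is an illustration only, not code from
  the thread: the same one-character string is held in two different
  internal representations, and the flag records which one is in use.
  The variable names are invented for the example.

```perl
use strict;
use warnings;

# One character, U+00E9, stored in Perl's compact one-byte form.
my $native   = "\xE9";
my $upgraded = $native;

# utf8::upgrade() switches the internal storage to UTF-8 and sets
# the flag; the characters in the string are unchanged.
utf8::upgrade($upgraded);

my $flag_native   = utf8::is_utf8($native)   ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;
my $same_string   = ($native eq $upgraded)   ? 1 : 0;

print "flag before: $flag_native, flag after: $flag_upgraded, ",
      "equal: $same_string, length: ", length($upgraded), "\n";
```

  At the Perl level the two strings compare equal. The trouble
  discussed in the thread starts when code outside Perl's string model,
  such as XS receiving a bare "char *", sees the underlying bytes
  without knowing which representation it has been handed.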

It's wafer thin!

  David Nicol's tiny patch to document the empty pattern ("m//") more
  clearly sparked a fairly intense technical debate over how to get rid
  of the latter.

  One point of particular interest was when Aristotle Pagaltzis
  suggested a "s///R" modifier which would return a modified copy of the
  original string, instead of modifying the contents and returning the
  number of matches made.

  As it turns out, this would solve a number of problems very nicely,
  not the least being the elegantly succinct

    my @changed = map { s/$this/$that/R } @list;

    so let's have it already
    http://xrl.us/bk9mi
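
  For comparison, this is roughly what one has to write without such a
  modifier. It is a sketch using the copy-then-substitute idiom; the
  sample data and the values of $this and $that are made up.

```perl
use strict;
use warnings;

my @list = ('red apple', 'green pear', 'red plum');
my ($this, $that) = ('red', 'blue');

# Copy each element, substitute in the copy, and return the copy;
# this is the behaviour the proposed modifier would provide directly.
my @changed = map { (my $copy = $_) =~ s/$this/$that/; $copy } @list;

print join(', ', @changed), "\n";
print join(', ', @list), "\n";    # the original list is untouched
```

  Without the explicit copy, the substitution would modify @list in
  place (map aliases $_ to each element) and @changed would receive
  match counts rather than strings.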

Compiling 5.10 with g++ 4.3.0

  Not content with compiling perl with old gcc compilers, Bram took a
  very new one for a spin to see how things worked out.

  It did of course go *boom* (otherwise you probably wouldn't be reading
  about it). Bram traced the problems down to typedefs and enums in
  system headers, and wondered how in Configure this could be sorted
  out.

    duty now for the future
    http://xrl.us/bk9mk

"Getopt::Long", + options, installperl and +v

  Nicholas Clark was looking at how to factor out the common code in
  "installman" and "installperl" and noticed that the main sticking
  point regarding "installperl" was that it admitted a "+v" switch
  (which does something different from "-v"), using hand-rolled @ARGV
  processing.

  This precludes it from using "Getopt::Long" because, while
  "Getopt::Long" can be taught to accept "-x" and "+x", it offers no way
  of discriminating between the two.

  Johan Vromans said that as it turns out, with a bit of hand-holding,
  it is possible to coax the information out as things stand, and he
  plans to improve support for - and + switches in a future release.

  Nicholas thought that a middle path might be to keep the hand-rolled
  code, but adjust it to dump its results into an %opts hash, which
  would allow a drop-in replacement when "Getopt::Long" gets updated
  with the needed functionality.

  This brought forth a long discourse from Tom Christiansen, who
  admitted to the wrong kind of laziness regarding command-line switches
  by resorting to hand-rolling code to deal with a solitary switch when
  in fact it would have been better to rely on a module. When he quizzed
  Larry Wall about it during the first decade of Perl's development,
  Larry admitted to rolling his own frequently, since it seemed a bit of
  a waste in his eyes to pull in a module for just one or two switches
  for a program little more than a one-liner. As a peace
  offering for his own hand-rolling sins, Tom offered the list the
  ultimate file renaming Perl program.

    bespoke options
    http://xrl.us/bk9mn
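
  To make the "middle path" concrete, here is a minimal sketch of
  hand-rolled switch processing that remembers whether a switch was
  given with - or + and leaves its results in a hash. It is not the
  code from "installperl"; the function name "parse_switches" and the
  layout of the hash are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: parse single-letter switches, recording which
# prefix ("-" or "+") introduced each one.
sub parse_switches {
    my @args = @_;
    my (%opts, @rest);
    while (@args) {
        my $arg = shift @args;
        if ($arg eq '--') {              # conventional end of switches
            push @rest, @args;
            last;
        }
        if ($arg =~ /^([-+])([A-Za-z])$/) {
            $opts{$2} = $1;              # e.g. v => '+' or v => '-'
        }
        else {
            push @rest, $arg;            # anything else is an operand
        }
    }
    return (\%opts, \@rest);
}

my ($opts, $rest) = parse_switches(qw(+v -n lib/Foo.pm));
print "v was given as $opts->{v}v, n as $opts->{n}n; rest: @$rest\n";
```

  Because callers only ever look at the hash, the parsing loop could
  later be swapped for a module without touching the rest of the
  program.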

On broken manpages, trolling, inconsistent implementation and the difficulty to fix bugs

  Marc Lehmann wrote a long response to Jan Dubois as a spin-off from
  the "On the almost impossibility to write correct XS modules" thread,
  stating that Perl's Unicode handling is broken because some parts of
  the core deal with Unicode one way, and other parts another way. This
  leads to annoying bugs that are hard to identify and hard to fix.

  Tom Christiansen called him out for excessive use of rhetoric and
  asked him to clarify a couple of points. Several messages later Yves
  Orton offered a nice summary of the situation that showed where things
  break down. Then people started to speak about encodings, bytes,
  characters and character sets and as usual my eyes began to acquire
  that dead fish look.

    see also
    http://xrl.us/bk9mp

On the problem of strings and binary data in Perl

  On the subject of subjects on the problem of things, Yves Orton broke
  out into a new thread to discuss the schizophrenic attitude that Perl
  has when dealing with strings. He put forward a proposal for
  identifying and processing Unicode strings and asked people to point out
  where he was wrong. Rafael Garcia-Suarez made a decent effort at doing
  just that.

  Juerd Waalboer provided a contrarian argument, suggesting that Unicode
  works pretty well in Perl, insofar as one can have strings containing
  Unicode, and other strings containing binary data, because in a
  correct program, one usually doesn't have the two appearing in the
  same string (such as having the Thai-encoded name of a Thai person
  concatenated with the slurped contents of a PNG file representing his
  signature in the same Perl scalar). In Juerd's eyes, the main problems
  come about when dealing with pure binary data and hoping that it
  doesn't wind up being treated as Unicode when it shouldn't.

    more recommended reading
    http://xrl.us/bk9mr

  As a followup to the above discussion, Juerd announced that he had
  released BLOB to CPAN.

    http://xrl.us/bk9mt

"English.pm" alias for "%+"

  Amir Elisha Aharoni ventured for the first time into the waters of
  p5p, suggesting that %NAMED_CAPTURE would be a nice English name for
  the new 5.10 "%+" variable. Yves Orton thought the idea was worthy of
  consideration, but one also needed to deal with "%-" at the same time,
  which could be named %NAMED_CAPTURE_LIST.

    updating the babelfish
    http://xrl.us/bk9mv
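
  As a reminder of what the two variables hold, here is a small sketch
  using 5.10 named captures. The pattern and the date string are made
  up for the example.

```perl
use strict;
use warnings;

my $date = '2008-05-18';
my ($year, $month, $all_years);

if ($date =~ /(?<year>\d{4})-(?<month>\d{2})/) {
    $year      = $+{year};     # the value captured by the named group
    $month     = $+{month};
    $all_years = $-{year};     # array ref: every group named "year"
    print "year=$year month=$month first year group=$all_years->[0]\n";
}
```

  "%+" yields one value per name, while "%-" yields an array reference
  per name, which matters when a pattern uses the same group name more
  than once.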

07arith.t failing on "_strptime('2001-2-29 12:34:56','%Y-%m-%d %H:%M:%S')"

  2001 was not a leap year, so February 29, 2001 is not a valid date
  and trying to parse it is an error. Apparently there is a test in
  "Time::Piece" to ensure it fails in the correct manner.
  Unfortunately, on some of the more exotic platforms like VMS and
  Mac OS X, the call also correctly fails, but does so in a way that
  fools the test suite.

    at the third stroke it will be the 32nd of february
    http://xrl.us/bk9mx

  Gisle Aas gave some additional background regarding Time-Piece-1.13
  test failures on HP-UX, by forwarding a message he sent to Matt
  Sergeant, the author of "Time::Piece".

    http://xrl.us/bk9mz

Some smoke digging (HP-UX failures)

  H.Merijn Brand delved into HP-UX smoke reports to figure out what was
  going wrong. "Time::Piece" was already under control (see above), but
  "Math::Trig" was failing (and the only recent change has been an
  upgrade to "Math::Complex"). Tests for "readdir" were also turning
  black, which suggested subtler problems.

  Half way through the conversation, Craig Berry announced the
  integration of Gisle Aas's fix for "Time::Piece" which addressed the
  VMS problems, and H.Merijn reported that it did the trick for HP-UX as
  well. Using the power of CPAN, H.Merijn was able to go through
  previous "Math::Complex" versions, and this allowed him to resolve
  that problem.

  I think the "readdir" problem was solved by upgrading the smoke harness.

  The remaining failure appeared to be caused by "use blib" hoisting in
  an errant directory into @INC. Bram showed him how to fix that, which
  should nail down the last error.

    going for O O O O
    http://xrl.us/bk9m3

  But then H.Merijn reported a problem with a failing blib test, and
  everyone pretended to pay attention to the character encoding debates.

    war knocked
    http://xrl.us/bk9m5

TODO of the week

Improve the coverage of the core tests

  Use "Devel::Cover" to ascertain the core modules' test coverage, then
  add tests that are currently missing.

  Just to help budding testers along, here is a non-exhaustive list of
  suggestions to get you going (suggested by sorting out the biggest
  ".pm" files in lib/):

  "AutoLoader"
  "AutoSplit"
  "Benchmark"
  "Cwd"
  "DB"
  "Dumpvalue"
  "Exporter"
  "Memoize"
  "NEXT"
  "SelfLoader"
  "charnames"
  "diagnostics"
  "overload"
  "warnings"

  Even concentrating on a single module would be helpful.

Patches of Interest

"ExtUtils::ParseXS" - Error reporting problem with INTERFACE and ALIAS keywords

  About a year ago, Ken Williams explained that, while he was the
  maintainer of this module, he didn't know what was the best way to
  address the problem that Robert May had brought up regarding error
  reporting.

    then
    http://xrl.us/bk9m7

  Of the two approaches supplied by Robert as a solution, Ken liked the
  second one back then, and Nicholas Clark, reviving the conversation,
  agreed that it seemed to make more sense.

  He had a look at how things work currently, and realised that with a
  new function, he could effect a small saving of space. As a result,
  both the core and "EU::PXS" could rely on the function.

  Nicholas wrote the function, and felt that it would make it into 5.8.9
  and 5.10.1. For older releases, "ExtUtils::ParseXS" would need to
  bundle the function, and emit it as required if the core didn't supply
  it.

  Rob thought that this sounded reasonable, except that if ever a bug is
  found in the function that Nicholas just wrote, it would need to be
  fixed both in the core and EU::PXS. Since this would be less than
  desirable, Robert said that he would try to come up with an alternate
  patch at some point.

    now
    http://xrl.us/bk9m9

"lib.pm" should not warn about loading ".par" files

  Paul Fenwick noted that a "use lib 'Foo.par'" will issue a warning,
  but load the damned thing anyway. Since someone pulling in a library
  in this way probably has a pretty good idea what they're doing anyway,
  Paul thought it would be a good idea to suppress the warning, just for
  ".par" files.

  Rafael Garcia-Suarez felt that this made sense, so he applied the
  patch. Steffen Müller wanted to know if this meant that lib.pm would
  be dual-lifed, so that 5.8.8 could benefit from the improvements.

    dual-life pragma on par
    http://xrl.us/bk9nb

Indented preprocessor directives in sv.c

  Jerry D. Hedden noticed that some preprocessor directives in sv.c were
  not flush left, and thought that some compilers would choke on them.
  H.Merijn Brand explained that it was perfectly legal according to
  ANSI, although he admitted that some older compilers, such as on AIX,
  would likely get into trouble over this.

  Both Robin May and Andy Dougherty explained that something that does
  work is to leave the # in the first column, and then indent the macro
  preprocessor directive as appropriate.

    hash hard left
    http://xrl.us/bk9nd

New and old bugs from RT

*x{IO} bizarre copying (#3314)

  Steve Peters discovered that some bizarre code that used to emit a
  bizarre error message now emits a more prosaic error message. He
  noticed that the change occurred way back in change #27179 and asked
  if anyone had objections to backporting it to 5.8.

    a leap into the unknown
    http://xrl.us/bk9nf

"exists()": error message on wrong argument type is incorrect (#38955)

  A couple of years ago, Jeremy Hetzler noted that "exists" may be
  applied to a HASH, an ARRAY and also a subroutine name. The
  documentation even admits as much.

  On the other hand, for incorrect use, such as applying it to a scalar,
  the error message makes mention of only HASH and ARRAY, not of
  subroutines.

  Bram patched the source to bring the error message into line with the
  documentation and implementation, and Rafael Garcia-Suarez applied it.

    language lawyers rejoice
    http://xrl.us/bk9nh
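
  For the record, the three accepted forms look like this. It is a
  small sketch with invented names.

```perl
use strict;
use warnings;

my %hash  = (key => undef);
my @array = (1);
sub defined_sub { return 1 }

my @results = (
    exists $hash{key}     ? 1 : 0,   # hash element (value may be undef)
    exists $hash{missing} ? 1 : 0,
    exists $array[0]      ? 1 : 0,   # array element
    exists &defined_sub   ? 1 : 0,   # subroutine name
    exists &no_such_sub   ? 1 : 0,
);
print "@results\n";
```

  Anything else, such as a plain scalar, is rejected at compile time,
  which is where the error message in question appears.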

No complaint about bareword (#53806)

  Rafael Garcia-Suarez supplied a fix for the "print Does::Not::Exist,
  ''" problem, so that the bareword is correctly identified as such, and
  not stringified. Despite all the magic surrounding "print"'s first
  argument, all that Rafael needed to do was to hoist a goto label four
  lines higher in the source.

  H.Merijn Brand applied the correction, along with Bram's tests.

    http://xrl.us/bk9nj

"pod2man" loses =head2 starting ' or . (#53910)

  Bram correctly identified "Pod::Man" as a dual-life module. This means
  that the best place to fix this particular problem is in the CPAN
  distribution, which can then be synched with blead when the problem is
  fixed.

    SEP
    http://xrl.us/bk9nm

"IO::Seekable" + "POSIX" = constant subroutines redefined (#54186)

  Part of the fallout from Nicholas Clark's corrections for this bug is
  that calls with the wrong number of arguments cause the program to
  croak. Rafael Garcia-Suarez felt it was safe enough to inflict on the
  world. As a point of confirmation, Sébastien Aperghis-Tramoni ran a
  code search and didn't find any examples of such usage.

    safe to break
    http://xrl.us/bk9no

"perlipc" problems

  Andrew at Sundale noted a problem in the documentation in "perlipc"
  concerning the signalling of negative process IDs. Steve Peters
  tweaked the example to show more clearly what was happening.

    perlipc and negative pids (#54412)
    http://xrl.us/bk9nq

  Andrew found another problem with "setsid", in that the
  documentation suggests a "setsid or die" idiom, except that, if one
  reads the manpage for "setsid", one learns that it returns -1 on error
  (as do many other system calls). As such, if the "setsid" call fails,
  the die won't be triggered.

    perlipc and negative truth (#54422)
    http://xrl.us/bk9ns

  While we're on the subject, Andrew found one final problem concerning
  the documentation for safe pipe opens.

    perlipc unclear on the concept (#54424)
    http://xrl.us/bk9nu

Faulty "select()" in Activestate perl (#54544)

  Marc Lehmann noted that "select" returns "Unknown Error (10022)"
  instead of simply timing out.

    just no it
    http://xrl.us/bk9nw

Assertion failure fiddling with @ISA (#54566)

  Niko Tyni discovered a way of abusing @ISA that would result in an
  assertion failure. Rafael Garcia-Suarez figured out what was going
  wrong in mg.c and provided a patch, which H.Merijn Brand applied.

    out through the smtp tunnel
    http://xrl.us/bk9ny

"Can't take log of 0" error in perl 5.8.8. 64 bit (#54590)

  Lourdes Peña Castillo reported that on some versions of perl, but not
  others, the number 2.5e-310 gets rounded down to 0, and the log of 0
  is negative infinity.

  Various porters reported similar behaviour on a variety of perls,
  platforms and Configure options, but no clear reasons why.

    now you see it, now you don't
    http://xrl.us/bk9n2
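
  The two halves of the report can be seen in a short sketch. This is
  an illustration, not the reporter's code. Whether the tiny value
  survives as non-zero depends on the platform and on how perl was
  built, so the script reports it rather than assuming it.

```perl
use strict;
use warnings;

my $tiny = 2.5e-310;   # below the smallest normalised IEEE 754 double
my $zero = 0;

# log() of exactly zero is a fatal error, so trap it with eval.
my $ok    = eval { my $unused = log($zero); 1 };
my $error = $ok ? '' : $@;

print "tiny is ", ($tiny > 0 ? "non-zero" : "zero"), " on this perl\n";
print "log(0) died with: $error" if $error;
```

  On a perl where the constant is flushed to zero, log($tiny) would
  fail in the same way as log($zero) does here.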

"PerlIO::via" free unrefed scalar on certain dodgy code (#54686)

  Kevin Ryde wrote some slightly broken code that managed to make the
  perl interpreter complain about memory problems. He wasn't especially
  worried about a fix any time soon, but wondered if it was a symptom of
  an underlying problem that needed to be addressed.

    need to know
    http://xrl.us/bk9n4

Regexp modifier to disable interpolation like m'' (#54702)

  Ed Avis filed a feature enhancement request, to allow the "/n" flag on
  a regular expression to indicate that no interpolation should be
  performed.

  Currently, only "m'300 $US'" (with single quotes as a pattern
  delimiter) does no interpolation. Ed thought that "/300 $US/n" might
  be clearer.

    we'll get the whole alphabet in some day
    http://xrl.us/bk9n6
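
  The difference between the two existing forms can be sketched as
  follows. The variable $US and the test strings are made up. Note that
  single-quote delimiters only switch off interpolation: "$" remains a
  regex metacharacter, so the sketch escapes it in order to match a
  literal dollar sign.

```perl
use strict;
use warnings;

our $US = 'dollars';

# Ordinary delimiters: $US is interpolated before the regex is compiled.
my $interpolated = ('300 dollars' =~ /300 $US/) ? 1 : 0;

# Single-quote delimiters: no interpolation. The "$" is still a regex
# metacharacter, though, so it is escaped here to match it literally.
my $literal = ('300 $US' =~ m'300 \$US') ? 1 : 0;

print "interpolated=$interpolated literal=$literal\n";
```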

"PathTools-3.27" triggers a bug in Perl (#54728)

  Jan Dubois isolated a problem in "File::Spec::Win32"'s "catfile"
  function. The fix from the client side is to stringify a $1 passed as
  a parameter (a variation on the "better to be paranoid than sorry"
  theme), since "catfile" appears to clobber it with some other action
  before getting around to using it. Ideally, "catfile" should stringify
  its arguments itself, although Jan wondered if there was a more
  general way of solving the problem.

    match point
    http://xrl.us/bk9n8
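
  The underlying hazard can be sketched without "File::Spec" at all.
  The function below is a hypothetical stand-in, not the real
  "catfile": it runs a regex of its own before reading its argument.
  Since @_ aliases the caller's variables, passing $1 directly hands
  over the magical variable itself rather than its current value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a function that runs its own regex before
# reading its argument. @_ aliases the caller's variables, so $_[0]
# may be the caller's $1 rather than a copy of its value.
sub read_after_match {
    'xyz' =~ /(y)/;        # a successful match inside the sub
    my $seen = $_[0];      # read the argument only afterwards
    return $seen;
}

'abc' =~ /(b)/;
my $aliased = read_after_match($1);     # passes $1 itself
my $copied  = read_after_match("$1");   # passes a copy of its value

print "aliased=$aliased copied=$copied\n";
```

  The stringified call always sees "b". The aliased call may instead
  see whatever the function's own match captured, which is the kind of
  clobbering described above.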

Perl5 Bug Summary

    278 new + 1345 open = 1623 (+13 -43)
    http://xrl.us/bk9oa
    http://rt.perl.org/rt3/NoAuth/perl5/Overview.html

New Core Modules

  "Thread::Semaphore"
      Jerry D. Hedden released 2.08, which adds a few checks for
      undefined parameters.

        http://xrl.us/bk9oc

In Brief

  Ricardo Signes wondered why "delete local $hash{elem}" didn't work
  when "local $hash{elem}; delete $hash{elem}" did. After boggling
  briefly over the syntax, Rafael Garcia-Suarez thought it wouldn't be
  too hard to make it work.

    http://xrl.us/bk9oe
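
  The two-statement form that already works can be sketched like this,
  with an invented hash; the proposed "delete local" would fold the two
  statements into one.

```perl
use strict;
use warnings;

our %hash = (elem => 1, other => 2);
my $inside;

{
    local $hash{elem};      # save the current value for this scope
    delete $hash{elem};     # then remove the element
    $inside = exists $hash{elem} ? 1 : 0;
}

# On leaving the block, local restores the saved element.
my $after = $hash{elem};
print "inside=$inside after=$after\n";
```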

  Ricardo Signes looked at the documentation in "perlobj" and corrected
  errors and omissions in "DOES". He hinted that he would take the axe
  to the documentation for "UNIVERSAL".

    less is more
    http://xrl.us/bk9og

  Jerry D. Hedden corrected a typo in perlop.pod that H.Merijn Brand
  estimated as being a difference of about 3 pixels, thus possibly
  qualifying for the smallest patch ever.

    http://xrl.us/bk9oi

  He also silenced build warnings in universal.c.

    http://xrl.us/bk9ok

  Nicholas Clark discovered what he thought was a "usage error in XS
  subs" with the ALIAS keyword. This reminded Robert May that he had
  written about a similar problem with INTERFACE last year, and that the
  message had gone nowhere.

    http://xrl.us/bk9on

  Florian Ragwitz also managed what was roughly a seven pixel change to
  fix a documentation typo in "Attribute::Handlers".

    http://xrl.us/bk9op

  Artur Bergman handed over maintenance of "Attribute::Handlers" to
  Rafael Garcia-Suarez.

    http://xrl.us/bk9or

  Moritz Lenz saw that "Memoize.pm" refers to the old title of "Higher
  Order Perl" and changed the wording. There was some discussion as to
  whether the full text of HOP was available on the web, and if so,
  where?

    http://xrl.us/bk9ot

  After Steve Peters performed an upgrade to "AutoLoader" to bring it to
  5.66, Nicholas Clark bumped it up to 5.66_01 to be on the safe side.

    for the record
    http://xrl.us/bk9ov

  Craig Berry returned to the "File::Copy" & permission bits issue,
  saying that changes were unlikely to fly on VMS. Aristotle Pagaltzis
  pointed out that on Windows, files tend to inherit their permission
  bits from the directory in which they reside, and that the only
  important bit to honour on Unix systems is the execute bit.

    http://xrl.us/bk9ox

  Renée Bäcker was Warnocked over a patch to add more documentation to
  attributes.pm.

    http://xrl.us/bk9oz

About this summary

  This summary was written by David Landgren.

  Weekly summaries are published on http://use.perl.org/ and posted on a
  mailing list, (subscription: [EMAIL PROTECTED]). The
  archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
  and comments are welcome.

  If you found this summary useful, please consider contributing to the
  Perl Foundation or attending a YAPC to help support the development of
  Perl.

--
stubborn tiny lights vs. clustering darkness forever ok?
