progress report on piping on VMS

Charles Lane Wed, 8 Mar 2000 15:49:16 -0800
Time for an update on piping code.  I've gone through several iterations
(and a couple of complete rewrites) over the past two weeks, and think
the revised code is close to completion.  Please make sure to look down
near the end of this message for "Suggested behavior changes".

Let me point out some of the problems with existing code:
    .   excess writes to child processes get interpreted as DCL commands
    .   child-of-a-child process termination gives excess EOFs reading from child
    .   reading from abnormally terminated child hangs main process
    .   writing to exited child hangs main process
    .   child inherits SYS$OUTPUT logical, not the current stdout setting
    .   child SYS$ERROR always points to its SYS$OUTPUT, doesn't inherit
        parent's SYS$ERROR (or stderr).
    .   child writing to inherited SYS$OUTPUT contends with main process
        if writing to a file.

To address these problems:
    .   lib$spawn has input set to NL:, runs command procedure
        PERL_ROOT:[LIB]VMSPIPE.COM that defines SYS$INPUT etc and
        executes desired command.  (see RTL manual for behavior of
        LIB$SPAWN that forces this.  SYS$CREPRC has problems of not
        inheriting global symbol definitions).  Command string and
        i/o devices communicated to VMSPIPE.COM via global symbols.

    .   Filter output coming back from a subprocess to Perl, using a pair
        of mailboxes in the process doing the reading.  AST routine to
        transfer data from one MBX to the other on an "on demand" basis
        to avoid main process hangs.  EOFs removed, except for one at
        child process termination, inserted by process completion code.

    .   When child exits, AST routine reads any data written to its MBX
        and discards, preventing hang of main process.

    .   use stdout and stderr of main process (at time of subprocess
        creation) to direct subprocess SYS$OUTPUT and SYS$ERROR;
        logical for SYS$ERROR set in subprocess "wrapper" command file
        (if differing from SYS$OUTPUT), since LIB$SPAWN sets SYS$ERROR
        to SYS$OUTPUT.

    .   use AST routine to copy "MBX to file" from SYS$OUTPUT, SYS$ERROR
        of subprocess if they point to a disk file opened in main process.
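
The EOF filtering described in the second item above can be modeled in
a few lines of Perl.  This is only a sketch of the idea, not the actual
AST code: the function name is hypothetical, each array element stands
for one mailbox message, and undef is used here (as an assumption) to
represent an EOF marker.

```perl
use strict;
use warnings;

# Hypothetical model: each element of @$incoming is one message read from
# the child's mailbox; undef stands in for an EOF marker.
sub filter_child_output {
    my ($incoming) = @_;
    my @outgoing = grep { defined } @$incoming;   # drop every EOF seen in the stream
    push @outgoing, undef;                        # one EOF, added at child termination
    return \@outgoing;
}

# A stream with excess EOFs, e.g. from a child-of-a-child exiting early.
my $out  = filter_child_output([ "line 1\n", undef, "line 2\n", undef, undef ]);
my $eofs = grep { !defined } @$out;
printf "%d messages forwarded, %d of them EOF\n", scalar(@$out), $eofs;
```

The reader then sees the data lines in order followed by exactly one
EOF, no matter how many processes closed the mailbox along the way.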

About file contention:
======================
The last point, about "contention", needs some explanation.  If the
main process has SYS$OUTPUT set to a mailbox or a terminal device,
doing a LIB$SPAWN with the "default output" (or a DCL SPAWN/NOWAIT)
will just point SYS$OUTPUT (and SYS$ERROR) to that device, and
open another channel to it.  No problem, no need for anything special.

This doesn't work with disk files (i.e., spawning from a batch job,
where SYS$OUTPUT points to the log file, or where "$perl foo >blah"
spawns) because the main process isn't going to share its output file.
Instead, the subprocess gets a MBX for SYS$OUTPUT and there's a
little AST routine in the main process that transfers data from the
MBX to the SYS$OUTPUT file.

This is the way LIB$SPAWN works, and the DCL SPAWN works the same
way, as implied by the LIB$SPAWN docs (which say "PPFs not shared")
and verified by tests.

So when we have a subprocess and we want its output (whether
SYS$OUTPUT or SYS$ERROR) to go to a file that was already opened in
the main process, we have to give it a MBX instead, and set up a
little "copy from MBX to file" AST routine.

I tried doing this and found a problem when pushing hard on the
performance of the code: if the main process and the child process are
both writing lots of stuff to the same output file very quickly,
sometimes lines from the subprocess get lost.  Please note that this
is NOT a problem specific to my implementation; if you have
LIB$SPAWN handle SYS$OUTPUT it shows the _exact_ same pathology, as
do pre-modification versions of Perl that rely on LIB$SPAWN.

My improvement (I won't call it a "solution") is to detect when a
write doesn't complete and schedule another attempt after a short
delay, with a maximum number of attempts.  Not pretty, but the data
does get through.
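
A minimal sketch of that retry idea, in portable Perl.  The real code
reschedules the write from an AST rather than sleeping, so this only
shows the control flow; the helper name, the attempt limit and the
delay are all hypothetical values chosen for illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Hypothetical helper: $try_write is a code ref that returns true once the
# write has completed.  Returns the attempt number that succeeded, or 0 if
# every attempt failed.
sub write_with_retry {
    my ($try_write, $max_attempts, $delay) = @_;
    for my $attempt (1 .. $max_attempts) {
        return $attempt if $try_write->();
        sleep($delay) if $attempt < $max_attempts;   # short pause before retrying
    }
    return 0;
}

# Simulate a write that fails twice before going through.
my $failures_left = 2;
my $attempts = write_with_retry(sub { $failures_left-- <= 0 }, 5, 0.01);
print $attempts ? "completed after $attempts attempts\n" : "gave up\n";
```

Bounding the number of attempts keeps a permanently blocked file from
stalling the copy forever, at the cost of dropping that line.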

Here's a little Perl script to illustrate...run it from a batch job:
(and note that no two runs will give quite the same output!)

#! perl
#    test contention  ... simple version
#    name this file "test_contend.pl"
#    and run from batch job...not all of the "slave" writes will get through

$master = $ARGV[0] eq '';

print STDERR "I am master $$\n" if $master;
print STDERR "I am slave  $$\n" if !$master;
$name = $master ? 'master' : 'slave';

if ($master) {
    $pid = open(A,"|mcr $^X test_contend.pl 1") || die('no slave');
    sleep(1);   # wait for slave to start
    print A "go!\n";
} else {
    $x = <STDIN>;  # wait for signal from master
}

for ($j = 0; $j < 1000; $j++) {
    $x = sprintf("write from $name, #%05d\n",$j);
    syswrite(STDOUT, $x, length($x));
}
close A if $master;

#-----------end test_contend.pl------------

I tried the above with Perl 5.5.57, and from a batch job there are
indeed subprocess lines missing...if you do
    $perl test_contend.pl >FOO.
something even worse happens: the "slave" sends its output to
the terminal (not to FOO.), since the old piping code hands the
slave the SYS$OUTPUT logical, and not the stdout at the time the
subprocess was started.

==================================================================
Suggested behavior changes: (more controversial!)

o   Error messages ("%SYSTEM-F-ABORT") at Perl exit should be suppressed
    in subprocesses.  They're still present for the main process.

    Rationale: The error codes are available to the main process in
    $?  ("use vmsish 'status'" is useful for this).  The messages get
    written to both SYS$OUTPUT and SYS$ERROR, and in most cases we get
    duplicates that are obnoxious to filter out.  Turning off the
    messages results in one error in the test suite (in lib/vmsish.t,
    where we look for an error message) while having them on generates
    many, many test errors...triggered because we can now handle SYS$ERROR
    properly.

    My opinion is that the fewer VMS customizations needed (either
    in scripts, modules or the test suites), the better off we
    are...and the smaller the probability of future breakage.

o   Default MBX size for communication with subprocesses should be
    decreased, and be settable at runtime via logical.

    Rationale: process and subprocesses share a pooled BYTLM from
    which mailboxes are deducted; we currently take the minimum of
    a sysgen parameter (MAXBUF, 32K for me) and stdio's BUFSIZ (8K
    for me). It doesn't take many 8K mailboxes to deplete even 100K
    of BYTLM quota!  Note that SPAWN/NOWAIT uses 6000 byte mailboxes,
    which seems to be hardwired...

    I don't think we want a situation where only someone with a system
    account with large quotas can run Perl effectively, so having a
    logical name may be the best approach (with min and max).   It still
    takes a fairly large BYTLM quota to run the test suite, but for
    most purposes mailboxes of 1K or less are perfectly suitable.

    How about "PERL_MBX_SIZE" for the logical?
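
One way the size selection could look, sketched in Perl for brevity.
The function name and the minimum/maximum bounds are hypothetical (no
values are proposed above), and the fallback shown is the current
min(MAXBUF, BUFSIZ) rule rather than a new, smaller default.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical bounds, for illustration only.
my ($MBX_MIN, $MBX_MAX) = (128, 65535);

sub mailbox_size {
    my ($maxbuf, $bufsiz, $requested) = @_;
    # No usable request: fall back to the current rule.
    return min($maxbuf, $bufsiz)
        unless defined $requested && $requested =~ /^\d+$/;
    # Clamp the request to the bounds, and never exceed MAXBUF.
    return min(max($requested, $MBX_MIN), $MBX_MAX, $maxbuf);
}

# On VMS, Perl looks up logical names through %ENV.
my $size = mailbox_size(32768, 8192, $ENV{PERL_MBX_SIZE});
print "mailbox size: $size bytes\n";
```

With the figures quoted above (MAXBUF 32K, BUFSIZ 8K), an unset logical
gives 8192 bytes, and defining PERL_MBX_SIZE as 1024 gives 1K mailboxes.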
--
 Drexel University       \V                     --Chuck Lane
----------------->--------*------------<[EMAIL PROTECTED]
     (215) 895-1545      / \  Particle Physics  [EMAIL PROTECTED]
FAX: (215) 895-5934        /~~~~~~~~~~~         [EMAIL PROTECTED]