Cross posted to CMSTSO Pipelines and IBM-MAIN

Someone shared with me a performance comparison of Pipelines vs.
native *nix commands, both on OPENVM.

Under the OPENVM shell, this command ran for 75 secs. against a 170M file in BFS:

$ time cat m170file.data | wc -c


This command, also under the OPENVM shell with the same file, ran for 9 secs.:

$ cms 'PIPE < /home/....../m170file.data | count bytes | cons'

Unfortunately, the person who sent this to me wasn't in a position to spend
any more time or resources on this, so I invite anyone so inclined to run a
similar comparison and post the results.

You may need something like this to avoid an abend depending on your system:

$ cms 'pipe filedesc 0 | count bytes | cons' < m170file.data


Under OPENMVS, e.g., try something like:

$ tso 'PIPE < /home/....../m170file.data | count bytes | cons'

(Caution, I have not used OPENMVS/USS, so the syntax could be wrong.)
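
For anyone repeating the *nix side of the comparison, here is a minimal
sketch of a timing harness.  It is not what was used for the numbers above.
It assumes a POSIX shell with awk, cat, wc, tr and date available; the
function names and the 16-byte record layout are placeholders of mine, and
date +%s only gives whole seconds, which should be enough at this scale.

```shell
#!/bin/sh
# Sketch only: function names and record layout are illustrative placeholders.

# make_test_file PATH NLINES
# Write NLINES text records of 16 bytes each ("record NNNNNNNN" plus newline).
make_test_file() {
    awk -v n="$2" 'BEGIN { for (i = 1; i <= n; i++) printf "record %08d\n", i }' > "$1"
}

# count_bytes PATH
# The pipeline under test; prints the byte count with any padding removed.
count_bytes() {
    cat "$1" | wc -c | tr -d ' '
}

# time_runs PATH N
# Run count_bytes N times and print the elapsed whole seconds for each run.
time_runs() {
    i=1
    while [ "$i" -le "$2" ]; do
        start=$(date +%s)
        bytes=$(count_bytes "$1")
        end=$(date +%s)
        echo "run $i: $bytes bytes in $((end - start)) s"
        i=$((i + 1))
    done
}
```

For example, make_test_file m170file.data 11141120 should produce a 170M
file (11141120 records of 16 bytes), and time_runs m170file.data 3 then
times three passes so that caching effects on the first run are visible.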

MVSers who don't have PIPElines, and VMers who want to, can try comparing
*nix equivalents to REXX using LINEIN() if you have it, or EXECIO * if you
don't.  This will tell us whether Pipelines' design is the bigger
contributor to efficiency, or whether it is the superiority of record
orientation over character orientation.

I recommend using RITA to get stage level statistics.  I suspect that
scanning for CR/LF is more expensive than counting bytes in PIPElines,
while the cost might be similar in *nix.

Some variations you may want to try:

   - Count lines and/or words.
   - Try a different mainframe *nix version.
   - Add more filters.
   - Add filters that drop and/or add records.
   - Add some filters that change records.
   - Use file(s) already in the CMS/MVS file systems.  The <bfs stage has
   to find the CR/LF before it can emit its output record.  Having the data in
   a record oriented file system avoids that overhead.
   - Even though the exact *nix equivalent is difficult, try something
   multi-stream, e.g. with FANOUT.
   - Binary data files.
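
On the *nix side, a few of these variations could be sketched as below.
This is only an illustration under the assumption of a POSIX shell with wc,
grep and tr; the function names are placeholders of mine, and the
multi-stream case is left out because I don't know of a close equivalent.

```shell
#!/bin/sh
# Sketch only: function names are illustrative placeholders.

# Count lines and words rather than bytes.
count_lines() { wc -l < "$1" | tr -d ' '; }
count_words() { wc -w < "$1" | tr -d ' '; }

# A filter that drops records: count only the lines that contain PATTERN.
# Usage: count_matching PATH PATTERN
count_matching() { grep -c -- "$2" "$1"; }

# A filter that changes records: upper-case every line, then count bytes.
count_after_upcase() {
    tr '[:lower:]' '[:upper:]' < "$1" | wc -c | tr -d ' '
}
```

Each function can be timed the same way as the byte count, e.g.
time count_lines m170file.data, and set against the matching PIPE command.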

Why might one care?  What typically takes multiple *nix command lines to
accomplish typically takes only one command in Pipelines.  So Pipelines is
not only superior to *nix in development productivity, it may also be
about an order of magnitude more efficient.

Thanks in advance.

- Hobart

OREXXMan
JCL is the buggy whip of 21st century computing.  Stabilize it.
Put Pipelines in the z/OS base.  Would you rather process data one
character at a time (Unix/C style), or one record at a time?
IBM has been looking for an HLL for program products; REXX as the C of
mainframes is that language.
