Gil, I know you mean well.

Your first two points are addressed by one answer:  The goal in the test is
to compare the efficiency of record-oriented piping versus
character-by-character piping.

In the first case, bypassing the *nix piping mechanism defeats the goal of
the test.

In the second case, cat can't finish until wc starts reading the last 4k
(or so) characters at the very end of the data.  With 170 M of data, that
means that the timing would be off by very much less than 0.01%, which is
far smaller than normal variation.  Further, the timings were reported to me
in more detail than I believe time reports.  Therefore, it was probably not
the time output that gave 75 secs.  I just chose to use the only total
time that was reported to me.
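
The tail-end estimate above can be checked with a few lines of Python.
This is only a back-of-the-envelope sketch: the 4 KiB buffer size and
reading "170 M" as 170 MiB are my assumptions, not measured values.

```python
# Rough check of how much of the data is still in flight when cat exits.
# Assumptions (not from the reported timings): a 4 KiB pipe buffer and
# "170 M" interpreted as 170 MiB.
PIPE_BUFFER_BYTES = 4 * 1024
FILE_BYTES = 170 * 1024 * 1024

# Fraction of the file that wc still has to read after cat has finished.
tail_fraction = PIPE_BUFFER_BYTES / FILE_BYTES
print(f"tail is about {tail_fraction:.4%} of the data")
```

Under those assumptions the tail is roughly 0.002% of the data, well
under the 0.01% bound.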

Finally, I got confused and assumed that the original PIPElines case was
abbreviated or there was a typo.  I incorrectly added the $ cms and
quotes.  That is entirely on me.  It should have just read:

PIPE < /home/....../m170file.data | count bytes | cons


I've heard of PIPElines with 10K-15K (mostly generated) stages.  Depending
on the topology, such a thing would be impossibly slow with
character-by-character piping.  And that's assuming you could even do
deterministic multi-stream piping the way PIPElines does.

Think of the difference between *nix piping and PIPElines piping this way:
You have a dinner party with a dozen guests.

Under *nix piping, you pass each and every string bean, corn kernel, and
pea to the plate of the person next to you, one-at-a-time, and only one
vegetable can move at a time.  If you even finished such a dinner, you
would never get anyone to show up at another.  And we haven't even
considered the cache-flooding implications.

Under PIPElines piping, whole plates of vegetables get passed around in the
familiar way.  The serving utensil is the "cache loader".  Vegetables that you
don't want just stay in the serving bowl (slow memory or disk).
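
To put rough numbers on the dinner party, here is a toy model in Python.
It only counts hand-offs between guests (stages); it is not a measurement
of how any real shell or PIPElines moves data, and the handoffs function
and the sample data are made up for illustration.

```python
def handoffs(data: bytes, stages: int, by_record: bool) -> int:
    """Count unit hand-offs needed to move all of data past every stage.

    Hypothetical helper: one hand-off per unit per stage, where a unit is
    either a whole record (line) or a single byte.
    """
    units = data.splitlines() if by_record else data
    return len(units) * stages


# Three short records passed around a table of a dozen guests.
sample = b"string beans\ncorn kernels\npeas\n"
per_char = handoffs(sample, stages=12, by_record=False)
per_record = handoffs(sample, stages=12, by_record=True)
print(per_char, per_record)  # 372 36
```

Even with three tiny records, passing one character at a time takes about
ten times as many hand-offs, and the gap grows with record length.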

Bon appetit.  :-)

I hope this helps.

OREXXMan
JCL is the buggy whip of 21st century computing.  Stabilize it.
Put Pipelines in the z/OS base.  Would you rather process data one
character at a time (Unix/C style), or one record at a time?
IBM has been looking for an HLL for program products; REXX is that language.

On Wed, Jun 13, 2018 at 10:07 AM, Paul Gilmartin <
0000000433f07816-dmarc-requ...@listserv.ua.edu> wrote:

> On 2018-06-13, at 07:18:40, Hobart Spitz wrote:
>
> > Cross posted to CMSTSO Pipelines and IBM-MAIN
> >
> > Someone shared with me a performance comparison between Pipelines vs.
> > native *nix commands, both on OPENVM.
> >
> > Under the OPENVM shell, this command ran 75 secs. with a 170M file in BFS:
> >
> > $ time cat m170file.data | wc -b
> >
> The "cat" is superfluous.  Why not just:
> > $ time   wc -b  m170file.data
>
> In fact, you were timing "cat", not "wc".
>
> > This command, also under OPENVM shell with the same file, ran 9 secs.:
> >
> > $ cms 'PIPE < /home/....../m170file.data | count bytes | cons'
> >
> Won't Pipelines respect a path relative to current working directory?
> If not, shame on Pipelines.
>
> -- gil
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
>

