Re: PIPElines vs. *nix
On Wed, 13 Jun 2018 19:00:19 -0400, Hobart Spitz wrote:

>Gil, I know you mean well.
>
>Your first two points are addressed by one answer: The goal in the test is
>to compare the efficiency of record-oriented piping versus
>character-by-character piping.
>
Sorry; I missed that emphasis. COUNT WORDS is probably a fairer
comparison to "wc", since both must inspect every character.

>In the first case, bypassing the *nix piping mechanism defeats the goal of
>the test.
>
But you could amplify the sensitivity with:

    cat m170file.data | cat | cat | ... | cat >/dev/null

Or pipe into HOLE to eliminate the overhead of COUNT or wc.
Also see whether the "-u" option for "cat" makes a difference.
Is there a Pipelines stage comparably neutral to "cat" you could use?

>In the second case, cat can't finish until wc starts reading the last 4k
>(or so) characters at the very end of the data. With 170 M of data, that
>means that the timing would be off by very much less than 0.01%, which
>greatly exceeds normal variation.
>
"exceeds"? "is exceeded by"

I believe the z/OS UNIX kernel buffers are 131k, still an inconsequential
fraction of 170 M.

>I've heard of PIPElines with 10K-15K (mostly generated) stages. Depending
>on the topology, such a thing would be impossibly slow with
>character-by-character piping. And that's assuming you could even do
>deterministic multi-stream piping the way PIPElines does.
>
Deterministic multi-stream is valuable, and upstream propagation of
termination is precious. Multi-streams are possible in C; impractical
in shell.

>Think of the difference between *nix piping and PIPElines piping this way:
>You have a dinner party with a dozen guests.
>
>Under *nix piping, you pass each and every string bean, corn kernel, and
>pea to the plate of the person next to you, one-at-a-time, and only one
>vegetable can move at a time. If you even finished such a dinner, you
>would never get anyone to show up at another. And we haven't even
>considered the cache-flooding implications.
>
>Under PIPElines piping, whole plates of vegetables get passed around in the
>familiar way. The serving utensil is the "cache loader". Vegetables that you
>don't want just stay in the serving bowl (slow memory or disk).
>
I don't think that's fair. I'd expect a well-designed "cat" to issue
a nonblocking read() into an application buffer, which may be smaller
than the kernel buffer, and the kernel to move the data with MVCL, not
an IC; STC loop.

-- gil

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
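As a rough illustration of the buffering point above (not a measurement of OPENVM or z/OS), the sketch below pushes the same bytes through an ordinary OS pipe and counts how many read() calls the reader needs when it asks for 1 byte at a time versus 4096 bytes at a time. The function name count_reads, the 200,000-byte payload, and the 4096-byte request size are assumptions made up for this illustration; it says nothing about how any particular cat, wc, or Pipelines stage is actually implemented.

```python
import os
import threading


def count_reads(data, chunk_size):
    """Send data through an OS pipe; return (read_calls, bytes_received)."""
    read_fd, write_fd = os.pipe()

    def writer():
        # Write everything, then close so the reader sees end-of-file.
        try:
            sent = 0
            while sent < len(data):
                sent += os.write(write_fd, data[sent:sent + 65536])
        finally:
            os.close(write_fd)

    thread = threading.Thread(target=writer)
    thread.start()

    calls = 0
    received = 0
    try:
        while True:
            # os.read returns at most chunk_size bytes, b"" at end-of-file.
            block = os.read(read_fd, chunk_size)
            if not block:
                break
            calls += 1
            received += len(block)
    finally:
        os.close(read_fd)
        thread.join()
    return calls, received


if __name__ == "__main__":
    payload = b"x" * 200_000  # hypothetical stand-in for the 170M file
    for size in (1, 4096):
        calls, received = count_reads(payload, size)
        print(f"request size {size}: {calls} read() calls for {received} bytes")
```

The one-byte reader needs one system call per byte, while the block reader needs far fewer; the data crossing the pipe is identical in both cases, so the cost difference comes from the reader's request size rather than from the pipe itself.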
Re: PIPElines vs. *nix
Gil, I know you mean well.

Your first two points are addressed by one answer: The goal in the test is
to compare the efficiency of record-oriented piping versus
character-by-character piping.

In the first case, bypassing the *nix piping mechanism defeats the goal of
the test.

In the second case, cat can't finish until wc starts reading the last 4k
(or so) characters at the very end of the data. With 170 M of data, that
means that the timing would be off by very much less than 0.01%, which
greatly exceeds normal variation.

Further, the timings were reported to me in more detail than I believe
"time" reports. Therefore, it was probably not the "time" output that gave
75 secs. I just chose to use the only total time that was reported to me.

Finally, I got confused and assumed that the original PIPElines case was
abbreviated or there was a typo. I incorrectly added the $ cms and quotes.
That is entirely on me. It should have just read:

    PIPE < /home/../m170file.data | count bytes | cons

I've heard of PIPElines with 10K-15K (mostly generated) stages. Depending
on the topology, such a thing would be impossibly slow with
character-by-character piping. And that's assuming you could even do
deterministic multi-stream piping the way PIPElines does.

Think of the difference between *nix piping and PIPElines piping this way:
You have a dinner party with a dozen guests.

Under *nix piping, you pass each and every string bean, corn kernel, and
pea to the plate of the person next to you, one-at-a-time, and only one
vegetable can move at a time. If you even finished such a dinner, you
would never get anyone to show up at another. And we haven't even
considered the cache-flooding implications.

Under PIPElines piping, whole plates of vegetables get passed around in the
familiar way. The serving utensil is the "cache loader". Vegetables that you
don't want just stay in the serving bowl (slow memory or disk).

Bon appetit. :-)

I hope this helps.
OREXXMan
JCL is the buggy whip of 21st century computing. Stabilize it.
Put Pipelines in the z/OS base. Would you rather process data one
character at a time (Unix/C style), or one record at a time?
IBM has been looking for an HLL for program products; REXX is that
language.

On Wed, Jun 13, 2018 at 10:07 AM, Paul Gilmartin <
000433f07816-dmarc-requ...@listserv.ua.edu> wrote:

> On 2018-06-13, at 07:18:40, Hobart Spitz wrote:
>
> > Cross posted to CMSTSO Pipelines and IBM-MAIN
> >
> > Someone shared with me a performance comparison between Pipelines vs.
> > native *nix commands, both on OPENVM.
> >
> > Under the OPENVM shell, this command ran 75 secs. with a 170M file in
> > BFS:
> >
> >  $ time cat m170file.data | wc -b
> >
> The "cat" is superfluous. Why not just:
>
>     $ time wc -b m170file.data
>
> In fact, you were timing "cat", not "wc".
>
> > This command, also under OPENVM shell with the same file, ran 9 secs.:
> >
> >  $ cms 'PIPE < /home/../m170file.data | count bytes | cons'
> >
> Won't Pipelines respect a path relative to current working directory?
> If not, shame on Pipelines.
>
> -- gil
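The buffer-size argument in this exchange comes down to one division. A quick check, assuming "170 M" means 170,000,000 bytes, "4k" means 4096 bytes, and the "131k" kernel buffer figure mentioned in the reply means 131,072 bytes (these readings are assumptions; the thread does not spell them out, and the name buffer_share_percent is made up for this sketch):

```python
# Assumed reading of "170 M": 170,000,000 bytes.
FILE_BYTES = 170_000_000


def buffer_share_percent(buffer_bytes, total_bytes=FILE_BYTES):
    """Return the buffer size as a percentage of the total data size."""
    return 100.0 * buffer_bytes / total_bytes


if __name__ == "__main__":
    # Assumed readings of "4k" and "131k" from the thread.
    for label, size in (("4k", 4096), ("131k", 131_072)):
        print(f"{label}: {buffer_share_percent(size):.4f}% of the file")
```

Under these assumptions the 4k case works out to about 0.0024% of the data and the 131k case to about 0.077%, so in either case the data still in flight when cat finishes is a small fraction of the total.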
Re: PIPElines vs. *nix
On 2018-06-13, at 07:18:40, Hobart Spitz wrote:

> Cross posted to CMSTSO Pipelines and IBM-MAIN
>
> Someone shared with me a performance comparison between Pipelines vs.
> native *nix commands, both on OPENVM.
>
> Under the OPENVM shell, this command ran 75 secs. with a 170M file in BFS:
>
>  $ time cat m170file.data | wc -b
>
The "cat" is superfluous. Why not just:

    $ time wc -b m170file.data

In fact, you were timing "cat", not "wc".

> This command, also under OPENVM shell with the same file, ran 9 secs.:
>
>  $ cms 'PIPE < /home/../m170file.data | count bytes | cons'
>
Won't Pipelines respect a path relative to current working directory?
If not, shame on Pipelines.

-- gil