The pyzfile source code has been available on GitHub [1] since its inception, and it can be considered the benchmark implementation for MVS data set I/O in Python. Many people, including IBM employees, have expressed interest in pyzfile [2]. Regarding your mention of standard Python modules, I'm not sure what you mean. Python has a large collection of modules, but not all of them are part of the standard library. For instance, there is no standard YAML library in Python, but it's easy to install one using pip. This is commonplace in languages like Python, Java, and Perl.

[1] https://github.com/daveyc/pyzfile
[2] https://community.ibm.com/community/user/ibmz-and-linuxone/discussion/reading-an-mvs-dataset-using-z-open-automation-utility

On 28/2/23 15:26, Farley, Peter wrote:
David, I will have to complain that your Python benchmark is not a fair comparison. Your Python script uses a module named pyzfile to access z/OS files (which I see from PyPI is authored by you but for which you have published no source yet). A fairer comparison would be a Python script that only used standard Python modules and shell commands. My Python-vs-Rexx testing was done using Python's subprocess.run to execute the "cat" command to copy data from z/OS files (PDS, QSAM, VSAM) to STDOUT captured by the subprocess.run routine and then using the captured STDOUT data for the processing. Writing to z/OS files (PDS member and QSAM only) was accomplished by first writing the output file data to the Unix file system (with encoding 1047 to write in EBCDIC) and then again using subprocess.run to execute "cp" to copy the written Unix file to the z/OS file. In both read and write cases I used the "//'DSN'" file name format for the z/OS files, supported by both "cat" and "cp".

Rexx using EXECIO or RXVSAM from CBT beats that type of Python script by a small margin but not by a lot -- the process I was measuring averaged 23-24 "real" seconds per test for the Python version and 19-20 "real" seconds per test for the Rexx version.
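
For readers who want to reproduce the cat/cp approach described above, it can be sketched roughly as follows. This is a minimal sketch, not the script used in the tests: the helper names (mvs_path, read_dataset, write_dataset) and the injectable run parameter are hypothetical, and the codec name for code page 1047 is left to the caller because it depends on the Python build in use.

```python
import subprocess


def mvs_path(dsn):
    """Return the //'DSN' file name form described above for cat and cp."""
    return f"//'{dsn}'"


def read_dataset(dsn, encoding, run=subprocess.run):
    """Run cat on the data set, capture STDOUT, and split it into lines.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a z/OS system (hypothetical design).
    """
    result = run(["cat", mvs_path(dsn)], capture_output=True, check=True)
    return result.stdout.decode(encoding).splitlines()


def write_dataset(dsn, unix_path, lines, encoding, run=subprocess.run):
    """Write lines to a Unix file first, then copy that file with cp."""
    with open(unix_path, "w", encoding=encoding) as out:
        out.writelines(line + "\n" for line in lines)
    run(["cp", unix_path, mvs_path(dsn)], check=True)
```

Each call starts a separate process and moves the whole data set through a pipe or a temporary file, so this style measures process start-up and copying as well as the record I/O itself.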

This was all done on the IBM Zxplore z/OS platform, which is x86 under the covers rather than "real iron", so probably zPDT. In any case, "students" on Zxplore aren't permitted to install any Python packages and venv/virtualenv are not available for the same reason (DASD-and-CPU-constrained system). The platform only permits you to use standard Python packages or one of the few non-standard ones pre-installed by the admins there.

Peter

-----Original Message-----
From: IBM Mainframe Discussion List <IBM-MAIN@LISTSERV.UA.EDU> On Behalf Of 
David Crayford
Sent: Tuesday, February 28, 2023 1:53 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: Re: zOSMF and zOWE for non-mainframers

On 28/2/23 13:47, David Crayford wrote:
On 25/2/23 01:23, Farley, Peter wrote:
Python on the mainframe is pretty good, but still can't beat out Rexx
in performance even when the Rexx script needs to use BPXWUNIX and
friends to access z/OS Unix file systems,

I have run a series of benchmarks, and the results suggest that
REXX is not as fast as Python. In my testing, I compared the
performance of C, Lua, Python, and REXX, and the results are clear: C
is the fastest, followed by Lua, which is within an order of magnitude
of C. Python comes next, within an order of magnitude of Lua, and REXX
consistently performs the poorest. In addition to the performance
factor, the vast Python ecosystem compared to the limited options
available for REXX also makes it an easy decision. Python is also
simpler to extend with packages, while REXX requires more effort and
potentially complex steps, such as using modern libraries that require
Language Environment (LE).

My benchmarks

Lua

local file = assert(io.open(arg[1], "rb, type=record, noseek"))
while true do
     local rec = file:read()
     if not rec then break end
end

Python

import sys
from pyzfile import ZFile, ZFileError

try:
    # Open the data set in record mode and read every record.
    with ZFile(sys.argv[1], "rb,type=record,noseek") as file:
        for rec in file:
            pass
except ZFileError as e:
    print(e)

REXX

/* REXX */
    arg dsname
    address MVS
    call bpxwdyn "ALLOC FI(INPUT) DA("dsname") SHR"
    do until eof
      "EXECIO 10000 DISKR INPUT ( STEM rec."
      eof = (rc > 0)
    end
    "EXECIO 0 DISKR INPUT ( FINIS"

The results (add user + sys to get total CPU time), in the order the commands ran: Lua, Python, REXX

  > time lua benchio.lua "//'CPA000.QADATA.PMR99999.SSA.HR1315PM'" && \
    time python3 benchio.py "//'CPA000.QADATA.PMR99999.SSA.HR1315PM'" && \
    time ./benchio.rex "CPA000.QADATA.PMR99999.SSA.HR1315PM"
real    0m47.019s
user    0m3.255s
sys     0m1.097s

real    1m0.710s
user    0m8.001s
sys     0m2.678s

real    1m17.772s
user    0m13.575s
sys     0m4.536s
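
Adding user and sys for each run gives the total CPU time. The sketch below does that arithmetic; it assumes the three timing blocks correspond to Lua, Python, and REXX in that order, matching the order of the commands, and the to_seconds helper is a hypothetical name introduced for illustration.

```python
import re


def to_seconds(stamp):
    """Convert a stamp printed by time, such as '0m3.255s', to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)


# (user, sys) pairs copied from the output above; the language labels
# are an assumption based on the order of the commands.
timings = {
    "Lua": ("0m3.255s", "0m1.097s"),
    "Python": ("0m8.001s", "0m2.678s"),
    "REXX": ("0m13.575s", "0m4.536s"),
}

# Total CPU time per language is user time plus sys time.
cpu = {name: to_seconds(u) + to_seconds(s) for name, (u, s) in timings.items()}

for name, total in cpu.items():
    print(f"{name:<7}{total:7.3f}s CPU")
```

Under that assumption the totals are 4.352s for Lua, 10.679s for Python, and 18.111s for REXX, so Python uses roughly 2.5 times the CPU of Lua and REXX roughly 1.7 times the CPU of Python.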


----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
