On Tue, May 16, 2023 at 11:23:46AM -0500, Paul Gilmartin wrote:
> On Tue, 16 May 2023 11:06:59 -0500, John McKown wrote:
> 
> >In one of my C programs, I first read the RDW, did a ntohs() to convert
> >from mainframe to Intel integer, subtracted 4, then read that number of
> >bytes into a char[32768].
> >
> I'm Python-naive.  But trying to educate myself with the example,
> <https://docs.python.org/3/tutorial/inputoutput.html>:
>     with open('workfile', encoding="utf-8") as f:
>         read_data = f.read()

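The default mode for Python's open() is "rt", which is read text, not
binary.  In text mode Python translates line endings (CR LF and a lone
CR both become LF), which alters your binary data.  The encoding="utf-8"
is also wrong here: the binary parts aren't valid UTF-8, so decoding
them will also break things (or raise an exception).  For this kind of
data, open the file with mode "rb" and leave off the encoding.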
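
As a rough sketch of the difference (the sample bytes and the temporary
file are made up for illustration, they are not from anyone's real data):

```python
import os
import tempfile

# Hypothetical sample: a 4-byte RDW-like prefix, then CR LF, then
# 0xC1 0xC2 (EBCDIC "AB"), which is not valid UTF-8.
sample = b"\x00\x08\x00\x00\r\n\xc1\xc2"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# Binary mode: the bytes come back exactly as written.
with open(path, "rb") as f:
    binary_data = f.read()

# Text mode with an encoding that accepts any byte (latin-1):
# nothing fails, but CR LF is silently turned into LF.
with open(path, encoding="latin-1") as f:
    text_data = f.read()

# Text mode with utf-8: decoding the same bytes raises an exception.
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

os.remove(path)
```

Only the "rb" read gives back the original 8 bytes; the latin-1 read
loses the CR, and the utf-8 read never completes.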

> ... I see no explicit byte count.  Is that implied, perhaps by a declaration
> such as "char read_data[ LRECL ]"?

No.  read() without an operand reads everything up to the end of the
file and returns it all at once.

read() can be given an operand saying the maximum amount to read:
read(4) reads at most 4 bytes (in binary mode; in text mode the count
is in characters).  It might return fewer, either because it reached
the end of the input file or because the input is "interactive"
(console? network?) and it just doesn't have any more right now.
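
If you need exactly N bytes, the usual approach is to keep calling
read() until you have them.  A minimal sketch, assuming a blocking
stream (read_exact and TrickleStream are names I made up for the
example, they are not part of Python):

```python
import io


def read_exact(stream, size):
    """Read exactly `size` bytes, or fewer only if EOF comes first.

    Assumes a blocking stream: a non-blocking one may return None,
    which this sketch would treat the same as EOF.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # b"" means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class TrickleStream:
    """Hypothetical stand-in for an 'interactive' source: it hands
    back at most one byte per read() call."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def read(self, size):
        return self._inner.read(min(size, 1))


stream = TrickleStream(b"abcdefgh")
first = read_exact(stream, 4)   # takes four short reads to fill
rest = read_exact(stream, 10)   # EOF arrives before 10 bytes do
```

For an ordinary disk file opened with "rb" a plain read(4) is normally
enough; the loop matters for pipes, sockets and the like.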

Here's what the Python help text for read() says:

 |  read(self, size=-1, /)
 |      Read and return up to n bytes.
 |      
 |      If the argument is omitted, None, or negative, reads and
 |      returns all data until EOF.
 |      
 |      If the argument is positive, and the underlying raw stream is
 |      not 'interactive', multiple raw reads may be issued to satisfy
 |      the byte count (unless EOF is reached first).  But for
 |      interactive raw streams (as well as sockets and pipes), at most
 |      one raw read will be issued, and a short result does not imply
 |      that EOF is imminent.
 |      
 |      Returns an empty bytes object on EOF.
 |      
 |      Returns None if the underlying raw stream was open in non-blocking
 |      mode and no data is available at the moment.

In the mainframe world, many files are "binary" of some sort and dealing
with record boundaries is normal and assumed.  For fixed-length records
this means counting off LRECL bytes at a time; for variable-length
records it means using the length in the RDW to determine where each
record ends.
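
In Python that might look something like the following.  This is only a
sketch under some assumptions: the file was transferred in binary with
the RDWs intact, the RDW is a 2-byte big-endian length that includes the
4 RDW bytes themselves followed by 2 bytes that are ignored here, and
the input is a regular file, so read(n) returns n bytes unless it hits
EOF.  The function names are mine.

```python
import io
import struct


def read_fixed_records(stream, lrecl):
    """Yield fixed-length records of exactly `lrecl` bytes each."""
    while True:
        record = stream.read(lrecl)
        if not record:  # clean end of file
            return
        if len(record) != lrecl:
            raise ValueError("file ends in the middle of a record")
        yield record


def read_variable_records(stream):
    """Yield the data portion of each RDW-prefixed record."""
    while True:
        rdw = stream.read(4)
        if not rdw:  # clean end of file
            return
        if len(rdw) != 4:
            raise ValueError("file ends in the middle of an RDW")
        # First two bytes: record length, big-endian, RDW included.
        (length,) = struct.unpack(">H", rdw[:2])
        if length < 4:
            raise ValueError("RDW length is smaller than the RDW itself")
        data = stream.read(length - 4)
        if len(data) != length - 4:
            raise ValueError("file ends in the middle of a record")
        yield data


# Made-up sample data standing in for files opened with "rb".
fixed = list(read_fixed_records(io.BytesIO(b"AAAABBBB"), 4))
variable = list(read_variable_records(
    io.BytesIO(b"\x00\x07\x00\x00ABC" + b"\x00\x06\x00\x00DE")))
```

The struct.unpack(">H", ...) call does the same job as the ntohs() in
the C program quoted above, and the "length - 4" is the same
subtraction.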

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
