On Wed, Apr 25, 2018 at 1:00 PM, Paul Gilmartin < [email protected]> wrote:
> On Sun, 22 Apr 2018 18:29:38 -0400, Hobart Spitz wrote:
> >
> >*With the *nix/C record and string models, there are these issues:*
> >
> >   1. Errant/unexpected/unintended pieces of binary data in a text
> >      file/string can break something.
> >   2. Separate functions/methods/techniques must be used to manipulate
> >      text files/strings versus binary files/strings. You *must* know
> >      what you are dealing with up front, and/or somehow code logic for
> >      both. (I'm not sure the latter is possible in the general case.)
> >
> Gee. The program must know the format of the data it's dealing with.
> Hardly a surprise.
>
> >   3. Even with *nix/C oriented machine instructions, the need to inspect
> >      all characters up to a target point results in performance killing
> >      cache flooding.
> >
> This flaw is also present in the pervasive RECFM=FB,LRECL=80. Consider
> the FORTRAN statement:
>       PI = 3.14 265
> ... the compiler must inspect the line to the end of the record for
> additional digits.
> Variable length records relieve this. Yet FORTRAN (yet, AFAIK), HLASM,
> and Utility control files must be FB 80.
>

Herman Hollerith, we love you! (signed, IBM). Your cards will live FOREVER in our software!

>
> And quote-delimited strings aggravate the problem. FORTRAN had this
> solved, as in:
>       WRITE ( 6, 100 )
>   100 FORMAT( 13HHello, world. / )
> ... on encountering the 13H, the compiler can just MVC 13 bytes to
> SYSPUNCH without inspecting them. Subsequently, designers came to believe
> that silicon is cheaper than carbon, so in C:
>     fprintf( stdout, "Hello, world.\n" );
> ... the compiler, not the programmer, can count characters in the string.
> Of course, the FORTRAN scheme was easier when the programmer coded on a
> form with columns numbered and handed it off to Data Entry to be punched
> and verified.
>
> ISPF addressed the waste of storing blanks with compressed format.
> Characteristically, IBM introduced this at the wrong implementation layer.
>
Using ISPF "packed" is, IMO, obsolete (and never was really of much use). Today, we compress data sets using the SMS DataClass, so it is "transparent" to the application.

> It should have been done at the access method layer or even the control
> unit so it would be transparent to all programs. Compression should have
> been indicated by a flag in the data set label, not by a "magic number"
> in the data, which is susceptible to being broken by
> "[e]rrant/unexpected/unintended pieces of binary data".
>
> There's considerable merit in the UN*X/C scheme of using variable length
> records and using the TAB character rather than multiple blanks for
> column alignment. It's easier to type, it's fewer characters for the
> compiler to scan, and it economizes storage.
>

I like tabs for semantic purposes (and yes, I like the way the Python language uses them). The only problem which arises is the "wars" that I've seen as to whether a <tab> should be the equivalent of "n" spaces or should align to the next column # which is a multiple of some value (typically 4 or 8?). Or, thirdly, a variable number of spaces which align to the aforementioned column number (I have "vim" tabs set up this way since I transfer data to z/OS UNIX and ISPF hates real tabs -- x'05' in EBCDIC).

>
> -- gil
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO IBM-MAIN
>

--
We all have skeletons in our closet.
Mine are so old, they have osteoporosis.

Maranatha! <><
John McKown
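
A minimal sketch of the tab conventions discussed above, in Python, in case it helps to see them side by side. The function names, the sample line, and the widths of 4 and 8 are assumptions chosen for illustration; nothing here reflects how ISPF or vim implement it internally.

```python
def tab_as_fixed_spaces(line: str, n: int = 4) -> str:
    """Convention 1: every tab becomes exactly n spaces, wherever it falls."""
    return line.replace("\t", " " * n)


def tab_to_next_stop(line: str, width: int = 8) -> str:
    """Conventions 2 and 3: pad with spaces up to the next multiple of width.

    Convention 2 keeps the tab in the file and only displays it this way;
    convention 3 stores the resulting spaces instead of the tab.
    """
    out = []
    col = 0  # zero-based column of the next character to be written
    for ch in line:
        if ch == "\t":
            pad = width - (col % width)  # always between 1 and width spaces
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


sample = "PI\t=\t3.14"  # hypothetical input line with two real tabs
fixed = tab_as_fixed_spaces(sample, 4)
aligned = tab_to_next_stop(sample, 8)
print(repr(fixed))
print(repr(aligned))
```

With a fixed replacement the columns drift according to how long the preceding text is; with tab stops the "=" and the "3.14" land on columns that are multiples of 8 no matter what precedes them. Storing the second form as plain spaces is what keeps the file free of real tab characters.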
