Re: New Line vs. Line Feed
On Thu, 4 Jun 2015 19:19:56 -0500, Bill Godfrey wrote:
> On Thu, 4 Jun 2015 11:05:15 -0400, Shmuel Metz (Seymour J.) wrote:
>> In 4767436570688083.wa.bgodfrey.gzgmail@listserv.ua.edu, on 06/01/2015 at 10:18 PM, Bill Godfrey said:
>>> The grep and awk commands don't match \n to end-of-line on omvs, or on linux for that matter.
>> Don't they match \n to LF on most Eunix and *ix systems?
> In awk there are regex patterns for the input data and there are regex patterns for strings. The regex patterns for the input data are like patterns in grep, in that they do not match \n with anything, but they do match $ with end-of-line.
>> Do '/test$/' and '/test\n/' have the same semantics in awk? In grep?
> '/test\n/' doesn't match anything in grep or in awk's pattern for input data. '/test$/' matches test at end-of-line in grep or in awk's pattern for input data.

Correcting myself. grep doesn't use slashes. awk's pattern for input data uses slashes.

> In awk's pattern for strings, test\n (without slashes) matches test\n anywhere within a string, which could have multiple \n characters, whereas test$ matches test at the end of the string if the string has no \n at the end. You would need test\n$ to match test\n at the end of a string.

-- 
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
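The two cases described above (a pattern applied to input records versus a pattern applied to a whole string) can be sketched in Python. This is only an illustration under stated assumptions: Python's re module stands in for awk's matcher, the variable names are made up for the example, and \Z is used as the strict end-of-string anchor because Python's own $ also matches just before a final newline.

```python
import re

text = "a\ntest\ntesting\n"

# Input-record case: the reader splits on the newline and discards it,
# so each record arrives without its delimiter.
records = text.split("\n")[:-1]          # ['a', 'test', 'testing']

# A literal newline in the pattern can never match such a record.
record_hits_newline = [r for r in records if re.search(r"test\n", r)]
# An end anchor does match, because "test" now ends the record.
record_hits_anchor = [r for r in records if re.search(r"test\Z", r)]

# String case: the newline is still part of the data.
string_has_test_nl = re.search(r"test\n", text) is not None
# A strict end anchor fails if the string still ends in a newline ...
strict_end_fails = re.search(r"test\Z", "test\n") is None
# ... so the newline has to be spelled out before the anchor.
nl_then_end = re.search(r"test\n\Z", "test\n") is not None

print(record_hits_newline, record_hits_anchor)
print(string_has_test_nl, strict_end_fails, nl_then_end)
```

The point of the sketch is that the difference comes from whether the delimiter is still in the data when the pattern is applied, not from the pattern syntax.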
Re: New Line vs. Line Feed
On Thu, 4 Jun 2015 11:05:15 -0400, Shmuel Metz (Seymour J.) wrote:
> In 4767436570688083.wa.bgodfrey.gzgmail@listserv.ua.edu, on 06/01/2015 at 10:18 PM, Bill Godfrey said:
>> The grep and awk commands don't match \n to end-of-line on omvs, or on linux for that matter.
> Don't they match \n to LF on most Eunix and *ix systems?

In awk there are regex patterns for the input data and there are regex patterns for strings. The regex patterns for the input data are like patterns in grep, in that they do not match \n with anything, but they do match $ with end-of-line.

> Do '/test$/' and '/test\n/' have the same semantics in awk? In grep?

'/test\n/' doesn't match anything in grep or in awk's pattern for input data. '/test$/' matches test at end-of-line in grep or in awk's pattern for input data.

In awk's pattern for strings, test\n (without slashes) matches test\n anywhere within a string, which could have multiple \n characters, whereas test$ matches test at the end of the string if the string has no \n at the end. You would need test\n$ to match test\n at the end of a string.

Bill
Re: New Line vs. Line Feed
In 4767436570688083.wa.bgodfrey.gzgmail@listserv.ua.edu, on 06/01/2015 at 10:18 PM, Bill Godfrey bgodfrey...@gmail.com said:
> The grep and awk commands don't match \n to end-of-line on omvs, or on linux for that matter.

Don't they match \n to LF on most Eunix and *ix systems? Do '/test$/' and '/test\n/' have the same semantics in awk? In grep?

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
Re: New Line vs. Line Feed
On Mon, 1 Jun 2015 22:18:20 -0500, Bill Godfrey wrote:
> The grep and awk commands don't match \n to end-of-line on omvs, or on linux for that matter.

awk certainly does. To wit:

    user@OS/390.24.00: cat awknl
    #! /bin/sh -x
    awk 'BEGIN {
        String = "First line\nSecond line.\n"
        # Show that \n is a line end.
        printf( "%s", String )
        # show that \n matches line end.
        print( match( String, "\n" ) ) }'
    user@OS/390.24.00: sh awknl
    First line
    Second line.
    11
    user@OS/390.24.00:

-- gil
Re: New Line vs. Line Feed
On Tue, 2 Jun 2015 03:17:35 -0500, Paul Gilmartin wrote:
> On Mon, 1 Jun 2015 22:18:20 -0500, Bill Godfrey wrote:
>> The grep and awk commands don't match \n to end-of-line on omvs, or on linux for that matter.
> awk certainly does. To wit:
>
>     user@OS/390.24.00: cat awknl
>     #! /bin/sh -x
>     awk 'BEGIN {
>         String = "First line\nSecond line.\n"
>         # Show that \n is a line end.
>         printf( "%s", String )
>         # show that \n matches line end.
>         print( match( String, "\n" ) ) }'
>     user@OS/390.24.00: sh awknl
>     First line
>     Second line.
>     11
>     user@OS/390.24.00:

I was only referring to \n in the pattern used in awk's general pattern {action} syntax, where the pattern is matched against text being read. I should have qualified my statement.

It's important to note that in your awk example and my Perl example the \n is not being treated as an anchor in the regex pattern, like $ would be. You could change \n to \n$ in the last print statement, and the result would be 24 instead of 11.

I'm sure you know all of this already. I'm just mentioning it for anyone who might not.

Bill
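The 11 and 24 quoted above can be reproduced outside awk. The following is a minimal Python sketch, not the awk run itself. It assumes that awk's match() reports a 1-based position, so 1 is added to Python's 0-based index, and it uses \Z as the end-of-string anchor in place of awk's $.

```python
import re

string = "First line\nSecond line.\n"

# First newline anywhere in the string: the \n is an ordinary character here.
first_newline = re.search(r"\n", string).start() + 1

# Newline that is also the last character of the string.
newline_at_end = re.search(r"\n\Z", string).start() + 1

print(first_newline, newline_at_end)
```

"First line" is 10 characters, so the first newline is character 11. The whole string is 24 characters long and its last character is the second newline.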
Re: New Line vs. Line Feed
On Tue, 2 Jun 2015 05:48:31 -0500, Bill Godfrey wrote:
> I was only referring to \n in the pattern used in awk's general pattern {action} syntax, where the pattern is matched against text being read. I should have qualified my statement.

This is a characteristic not of awk's pattern matching but of awk's input processing. Awk discards the delimiting \n like gets() rather than retaining it like fgets().

-- gil
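The gets()/fgets() distinction can be illustrated with a short Python sketch. It is an analogy only: iterating over a Python text stream keeps the delimiter the way fgets() does, while str.splitlines() drops it the way gets() (and, per the message above, awk's record reader) does. The variable names are invented for the example.

```python
import io
import re

data = "a\ntest\ntesting\n"

# fgets()-like: each line keeps its trailing newline.
kept = list(io.StringIO(data))

# gets()-like: the delimiter is discarded before the caller sees the line.
dropped = data.splitlines()

# The same pattern gives different answers depending on the reader.
matches_when_kept = [line for line in kept if re.search(r"test\n", line)]
matches_when_dropped = [line for line in dropped if re.search(r"test\n", line)]

print(kept, dropped)
print(matches_when_kept, matches_when_dropped)
```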
Re: New Line vs. Line Feed
In 5568a3c8.5030...@gmail.com, on 05/30/2015 at 01:37 AM, David Crayford dcrayf...@gmail.com said: It implicitly converted strings to ASCII That's good if you want them converted; not so good if you don't. She died of a favor from which none could save her. and that was the end of Sweet Mollie Malone. -- Shmuel (Seymour J.) Metz, SysProg and JOAT ISO position; see http://patriot.net/~shmuel/resume/brief.html We don't care. We don't have to care, we're Congress. (S877: The Shut up and Eat Your spam act of 2003) -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
In 0379604180364016.wa.bgodfrey.gzgmail@listserv.ua.edu, on 05/29/2015 at 10:30 AM, Bill Godfrey bgodfrey...@gmail.com said:
> I get identical results whether I use \n or $ in the OP's example. In OMVS. I'm not addressing your question but rather the OP's example.

Which OMVS facilities match \n to end of line (record) and which to LF? What do grep et al do about matching \n against legacy PS data sets, where there is a logical end of record?

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
Re: New Line vs. Line Feed
On Mon, 1 Jun 2015 17:11:54 -0400, Shmuel Metz (Seymour J.) shmuel+ibm-m...@patriot.net wrote:
> In 0379604180364016.wa.bgodfrey.gzgmail@listserv.ua.edu, on 05/29/2015 at 10:30 AM, Bill Godfrey bgodfrey...@gmail.com said:
>> I get identical results whether I use \n or $ in the OP's example. In OMVS. I'm not addressing your question but rather the OP's example.
> Which OMVS facilities match \n to end of line (record) and which to LF? What do grep et al do about matching \n against legacy PS data sets, where there is a logical end of record?

When commands like cat and cp read legacy PS data sets as text, the results reflect this description of reading text files in the C/C++ Programming Guide:
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/CBCPG1C0/2.9.4.2
where it says:

    For files opened in fixed text format, rightmost blanks are stripped off a record at input, and a new-line character is placed in the logical record.

That new-line character is hex 15. So a 3-line data set of 80-byte records that look like this:

    a
    test
    testing

will look like this in hex after being read by cat:

    cat //test.cntl | od -tx1 -An
    81 15 A3 85 A2 A3 15 A3 85 A2 A3 89 95 87 15

which is the same result as this command:

    printf "%b" "a\ntest\ntesting\n" | od -tx1 -An
    81 15 A3 85 A2 A3 15 A3 85 A2 A3 89 95 87 15

The only facility with regular expressions that I have found that matches \n to end-of-line is in Perl. For example:

    printf "%b" "a\ntest\ntesting\n" | perl -ne 'print if /test\n/'
    test
    printf "%b" "a\ntest\ntesting\n" | perl -ne 'print if /test$/'
    test

The grep and awk commands don't match \n to end-of-line on omvs, or on linux for that matter. The grep command can't read a legacy PS data set directly, but awk can.

    awk '/test$/' //test.cntl
    test

Bill
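The record handling quoted from the Programming Guide (strip rightmost blanks, place a new-line character in the logical record) can be mimicked in a few lines of Python. This is a sketch under assumptions, not the C runtime's implementation: read_fixed_text is a hypothetical helper, the data set is simulated as a byte string of 80-byte records, and Python's cp037 codec stands in for whatever EBCDIC code page the real data set uses.

```python
EBCDIC_BLANK = b"\x40"   # EBCDIC space
EBCDIC_NL = b"\x15"      # the new-line character the message above reports

def read_fixed_text(dataset: bytes, lrecl: int = 80) -> bytes:
    """Hypothetical helper: turn fixed-length records into a text stream."""
    stream = bytearray()
    for offset in range(0, len(dataset), lrecl):
        record = dataset[offset:offset + lrecl]
        # Strip rightmost blanks, then mark the end of the logical record.
        stream += record.rstrip(EBCDIC_BLANK) + EBCDIC_NL
    return bytes(stream)

# Build a simulated 3-record data set, each record padded to 80 bytes.
lines = ["a", "test", "testing"]
dataset = b"".join(line.encode("cp037").ljust(80, EBCDIC_BLANK) for line in lines)

stream = read_fixed_text(dataset)
hex_dump = " ".join(f"{byte:02X}" for byte in stream)
print(hex_dump)
```

Under these assumptions the dump matches the od output shown in the message: the 240 bytes of padded records collapse to 15 bytes, with 15 after each line.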
Re: New Line vs. Line Feed
In 5567e81c.4040...@vse2pdf.com, on 05/29/2015 at 12:16 AM, Tony Thigpen t...@vse2pdf.com said: 1960's ATT pushes for a replacement of ITA2 which the ATA published as ASCII in 1963. I might believe ASA, through several iterations. I hate overloaded code points! -- Shmuel (Seymour J.) Metz, SysProg and JOAT ISO position; see http://patriot.net/~shmuel/resume/brief.html We don't care. We don't have to care, we're Congress. (S877: The Shut up and Eat Your spam act of 2003) -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
In 838271083.1045039.1432870193420.javamail.ya...@mail.yahoo.com, on 05/29/2015 at 03:29 AM, Ze'ev Atlas 004b34e7c98a-dmarc-requ...@listserv.ua.edu said:
> Does anybody know why do we need two characters that seem to do the same thing

No, especially since they *don't* do the same thing. A better question would be why Eunix hijacked the Line Feed instead of using CRLF.

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
Re: New Line vs. Line Feed
In 8215077812992901.wa.paulgboulderaim@listserv.ua.edu, on 05/29/2015 at 12:27 AM, Paul Gilmartin 000433f07816-dmarc-requ...@listserv.ua.edu said:
> Using a device-specific hardware command to separate records in a general file makes as little sense as Assembler H's use of machine carriage control.

Or as Eunix using LF as a record separator.

> A device-neutral convention might have been Record Separator, ASCII 0x1e.

Please inform Ken Thompson.

> IBM clearly violates a standard.

That's not at all clear. What do POSIX et al. formally say about the use of LF-broken ASCII?

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT
ISO position; see http://patriot.net/~shmuel/resume/brief.html
We don't care. We don't have to care, we're Congress.
(S877: The Shut up and Eat Your spam act of 2003)
Re: New Line vs. Line Feed
Gil is correct, \n is implementation dependent. Actually, PCRE handles it correctly, except that I got confused and chose an incorrect option in my config.h. Once I corrected it, the tests ran smoothly and produced correct results.

Thanks all for the explanations and advice.

ZA
Re: New Line vs. Line Feed
On 29/05/2015 1:43 PM, Anne Lynn Wheeler wrote: EBCDIC and the P-Bit, The Biggest Computer Goof Ever http://www.bobbemer.com/P-BIT.HTM The culprit was T. Vincent Learson. The only thing for his defense is that he had no idea of what he had done. It was when he was an IBM Vice President, prior to tenure as Chairman of the Board, those lofty positions where you believe that, if you order it done, it actually will be done. I've mentioned this fiasco elsewhere. And how much has that dumb decision cost mainframe customers over the years? Fiasco is the right word. -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
Thank you all for the comprehensive explanation.

ZA
Re: New Line vs. Line Feed
Your messages clarified my issue and actually assured me that the solution I'd suggested is correct, so I would like to brief you.

It is apparent that IBM chose to mark the end of line with NL and not with either LF or CRLF. That in itself is probably a correct decision and probably what the standard should have been to begin with. The problem is that in the C language convention, the escape sequence \n has a subtle double meaning. It means LF but it also contains within it the semantics of NL.

When we do

    printf("some text \n");

it will work correctly on all platforms and nobody would ever notice any problem. It will produce "some text NL" on EBCDIC and "some text LF" or "some text CRLF" on ASCII platforms.

But when we issue a pattern match (I'll use Perl syntax for brevity)

    if ($text =~ /some text \n/)

the \n is translated by convention to LF and the EBCDIC-based pattern matching will fail to match!

So the solution should be to somehow (optionally) dictate to the package that \n is NL and not LF. I've requested that such an option be implemented so I can use it.

ZA
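The failure described above can be shown at the byte level. The sketch below is an illustration, not the package's actual code: compile_pattern is a hypothetical helper that stands in for "the package translates \n to a configurable byte", Python's re module stands in for the pattern matcher, and cp037 stands in for the EBCDIC code page in use.

```python
import re

EBCDIC_NL = b"\x15"   # what the z/OS C runtime places at end of record
EBCDIC_LF = b"\x25"   # EBCDIC Line Feed

def compile_pattern(text: str, newline: bytes) -> "re.Pattern[bytes]":
    """Hypothetical: build 'text followed by newline' with a chosen newline byte."""
    return re.compile(re.escape(text.encode("cp037")) + re.escape(newline))

# Data as the C runtime would deliver it: text followed by NL (0x15).
record = "some text ".encode("cp037") + EBCDIC_NL

# A matcher that treats \n as LF (0x25) never sees its newline in the data.
lf_matches = compile_pattern("some text ", EBCDIC_LF).search(record) is not None

# A matcher told that \n means NL (0x15) matches as expected.
nl_matches = compile_pattern("some text ", EBCDIC_NL).search(record) is not None

print(lf_matches, nl_matches)
```

This is the sense in which an option that selects the newline byte resolves the mismatch: the data is unchanged, only the matcher's idea of \n differs.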
Re: New Line vs. Line Feed
On Fri, 29 May 2015 09:56:20 -0500, Paul Gilmartin paulgboul...@aim.com wrote:
> On Fri, 29 May 2015 09:52:42 -0500, Bill Godfrey wrote:
>> On Fri, 29 May 2015 09:03:59 -0500, Ze'ev Atlas wrote:
>>> But when we issue a pattern matching (I'll use Perl syntax for brevity) if ($text =~ /some text \n/) the \n is translated by convention to LF and the EBCDIC based pattern matching will fail to match!
>> why not this? if ($text =~ /some text $/)
> That's a circumvention, not a solution to the problem. But my question remains, by what convention in the z/OS EBCDIC environment is \n translated to LF rather than NL?

I get identical results whether I use \n or $ in the OP's example. In OMVS. I'm not addressing your question but rather the OP's example.

Bill
Re: New Line vs. Line Feed
On Fri, 29 May 2015 09:03:59 -0500, Ze'ev Atlas wrote:
> It is apparent that IBM chose to mark the end of line with NL and not with any of LF or CRLF. That on itself is probably a correct decision and probably what the standard should have been to begin with. The problem is that in the C language convention, the escape sequence \n has subtle double meaning. It means LF but it also contains within it the semantics of NL.

The semantic of \n is implementation-dependent. In Linux, it compiles as LF; in z/OS as NL (but, I believe, as LF in Enhanced ASCII mode); and in Classic Mac OS (pre OS X) as CR.

> When we do printf("some text \n"); it will work correctly on all platforms and nobody would ever notice any problem. it will produce on EBCDIC some text NL and on ASCII platforms some text LF or some text CRLF

Much of this is handled by the device driver.

> But when we issue a pattern matching (I'll use Perl syntax for brevity) if ($text =~ /some text \n/) the \n is translated by convention to LF and the EBCDIC based pattern matching will fail to match!

That problem should not occur. By z/OS convention, \n represents NL and then pattern matching succeeds. What z/OS facility treats \n as LF?

> So the solution should be to somehow (optionally) dictate to the package that \n is NL and not LF. I've requested that such option would be implemented so I can use it.

That should not be necessary. Can you provide more context for your example?

-- gil
Re: New Line vs. Line Feed
On Fri, 29 May 2015 09:03:59 -0500, Ze'ev Atlas wrote:
> But when we issue a pattern matching (I'll use Perl syntax for brevity) if ($text =~ /some text \n/) the \n is translated by convention to LF and the EBCDIC based pattern matching will fail to match!

why not this?

    if ($text =~ /some text $/)

Bill
Re: New Line vs. Line Feed
On Fri, May 29, 2015 at 9:31 AM, Paul Gilmartin 000433f07816-dmarc-requ...@listserv.ua.edu wrote: On Fri, 29 May 2015 19:54:20 +0800, David Crayford wrote: And how much has that dumb decision cost mainframe customers over the years? Fiasco is the right word. And IBM could have recovered, rather than compounding the fiasco at the inception of OMVS by making OMVS ASCII based and providing ASCII--EBCDIC conversion in the C RTL for Legacy data sets except when fopen() was called with mode=*b. The kernel would have been simpler for omitting autoconversion. (I believe Legacy I/O is not handled by kernel.) 99.99% agreement. I'd only change I'd make would be for UTF-8 and not ASCII instead of EBCDIC. But I'm sure that there would be other problems with inter-operability that I haven't thought of if legacy continued to be mainly CP-037 based with UNIX being UTF-8 based. And there would have been no EBCDIC obstacle to porting GNU and other FOSS. Fiasco ** 2. Even yet, I wish IBM would complete the Enhanced ASCII support in the C RTL. Significant omissions are Curses and X11; sockets is already supported. -- gil -- My sister opened a computer store in Hawaii. She sells C shells down by the seashore. If someone tell you that nothing is impossible: Ask him to dribble a football. He's about as useful as a wax frying pan. 10 to the 12th power microphones = 1 Megaphone Maranatha! John McKown -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
On Fri, 29 May 2015 09:52:42 -0500, Bill Godfrey wrote:
> On Fri, 29 May 2015 09:03:59 -0500, Ze'ev Atlas wrote:
>> But when we issue a pattern matching (I'll use Perl syntax for brevity) if ($text =~ /some text \n/) the \n is translated by convention to LF and the EBCDIC based pattern matching will fail to match!
> why not this? if ($text =~ /some text $/)

That's a circumvention, not a solution to the problem. But my question remains, by what convention in the z/OS EBCDIC environment is \n translated to LF rather than NL?

-- gil
Re: New Line vs. Line Feed
On 29/05/2015 10:31 PM, Paul Gilmartin wrote:
> On Fri, 29 May 2015 19:54:20 +0800, David Crayford wrote:
>> And how much has that dumb decision cost mainframe customers over the years? Fiasco is the right word.
> And IBM could have recovered, rather than compounding the fiasco at the inception of OMVS by making OMVS ASCII based and providing ASCII--EBCDIC conversion in the C RTL for Legacy data sets except when fopen() was called with mode=*b. The kernel would have been simpler for omitting autoconversion. (I believe Legacy I/O is not handled by kernel.)

Legacy I/O is handled quite well by Java. I had a good experience this week with Java when I got it to push over 1GB of CSV data to a Linux server in 30 seconds over a Redis backplane. It implicitly converted strings to ASCII (which saved me heaps of time) and it was very fast.

> And there would have been no EBCDIC obstacle to porting GNU and other FOSS. Fiasco ** 2.
>
> Even yet, I wish IBM would complete the Enhanced ASCII support in the C RTL. Significant omissions are Curses and X11; sockets is already supported.
>
> -- gil
Re: New Line vs. Line Feed
> change the file!? That's exactly what I don't want, and it gives me no choice. Notepad++ and vim do better: they allow the user to choose the output format, defaulting to the original input format.

Paul,

It was part of a general comment about how even Microsoft has different rules in different programs.

BUT(!) I use the Wordpad conversion process quite frequently at work. While my laptop is Linux, everybody else is Windows. When I send files (via email), I sometimes forget to run unix2dos against the file. When my coworkers get a text file from me that seems to be one long line, they know to open it with Wordpad instead of Notepad. If they need to retain the file on their Windows box, they just save it from Wordpad and never have to worry about its Linux format again.

Tony Thigpen

Paul Gilmartin wrote on 05/29/2015 05:52 PM:
> On Fri, 29 May 2015 00:16:28 -0400, Tony Thigpen wrote:
>> Interesting, Windows Notepad requires CRLF, but Windows Wordpad will read and display a LF only file correctly and even change the file to CRLF when saved.
> change the file!? That's exactly what I don't want, and it gives me no choice. Notepad++ and vim do better: they allow the user to choose the output format, defaulting to the original input format.
>
> On Sat, 30 May 2015 01:37:12 +0800, David Crayford wrote:
>> Legacy I/O is a handled quite well by Java. I had a good experience this week with Java when I got it to push over 1GB of CSV data to a Linux server in 30 seconds over a Redis backplane. It implicitly converted strings to ASCII (which saved me heaps of time) and it was very fast.
> There; was that so hard!?
>     -- Nick Burns, Your Company's Computer Guy
>
> -- gil
Re: New Line vs. Line Feed
On Fri, 29 May 2015 00:16:28 -0400, Tony Thigpen wrote:
> Interesting, Windows Notepad requires CRLF, but Windows Wordpad will read and display a LF only file correctly and even change the file to CRLF when saved.

change the file!? That's exactly what I don't want, and it gives me no choice. Notepad++ and vim do better: they allow the user to choose the output format, defaulting to the original input format.

On Sat, 30 May 2015 01:37:12 +0800, David Crayford wrote:
> Legacy I/O is a handled quite well by Java. I had a good experience this week with Java when I got it to push over 1GB of CSV data to a Linux server in 30 seconds over a Redis backplane. It implicitly converted strings to ASCII (which saved me heaps of time) and it was very fast.

There; was that so hard!?
    -- Nick Burns, Your Company's Computer Guy

-- gil
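The behavior praised above (let the user choose the output format, defaulting to the original input format) is easy to sketch. The Python below is an illustration only, not how Notepad++ or vim actually work: detect_eol, convert_eol and save are hypothetical names, and the detection rule (CRLF if any CRLF is present, otherwise LF) is a simplifying assumption that ignores mixed or CR-only files.

```python
from typing import Optional

def detect_eol(data: bytes) -> bytes:
    """Guess the file's line ending: CRLF if any is present, otherwise LF."""
    return b"\r\n" if b"\r\n" in data else b"\n"

def convert_eol(data: bytes, eol: bytes) -> bytes:
    """Normalize to LF first so existing CRLF pairs are not doubled, then apply eol."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", eol)

def save(data: bytes, eol: Optional[bytes] = None) -> bytes:
    """Return the bytes to write; keep the original format unless told otherwise."""
    return convert_eol(data, eol if eol is not None else detect_eol(data))

unix_text = b"a\nb\n"
print(save(unix_text))             # unchanged: the original format is the default
print(save(unix_text, b"\r\n"))    # explicit choice, like running unix2dos
```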
Re: New Line vs. Line Feed
john.archie.mck...@gmail.com (John McKown) writes: As a side note (as I have heard it), the reason that Windows uses CRLF as a line ending is because MS-DOS did the same. MS-DOS used CRLF because CPM-80 used CRLF. And, finally, CPM-80 used CRLF because the common printers at the time could not do a carriage return / line feed in a single operation. So, Gary Kindall (author of CPM-80) decided to end text files with CRLF so that he didn't need to complicate the printer driver to put a LF in when a CR was detected. This made good sense in the day that 64K RAM and a 1 Mhz 8080 was top of the line equipment for the hobbyist. a little other topic drift from recent IBM antitrust thread Other trivia ... also at the scientific center ... GML was invented at the science center in 1969 (G, M, L are the 1st letters of the inventor's last name). This is posting by Sowa about GML being used by IBM for documents used in the antitrust suit http://ontolog.cim3.net/forum/ontolog-forum/2012-04/msg00058.html from above: For text that was copied from the original OED, they got GML to produce exactly the same line breaks and hyphenation. They needed to get it exactly right in order to aid the proof readers who had to make sure that the new copy was identical to the old. The GML-based software in the 1980s was far more flexible than MS Word is today. Just look at the OED and imagine how you might use MS Word to match that exactly. ... snip ... in the mid-60s at science center, CMS script was implementation of CTSS runoff using dot formating controls ... then later, script was enhanced to support GML tag processing. in late 70s, a vm370 SE in the LA branch ... did implementation of CMS script on trs80 (NewScript) and periodically mentioned ... 
before ms/dos http://en.wikipedia.org/wiki/MS-DOS there was seattle computer http://en.wikipedia.org/wiki/Seattle_Computer_Products before seattle computer there was cp/m, http://en.wikipedia.org/wiki/CP/M before cp/m, kildall worked with cp67/cms at npg http://en.wikipedia.org/wiki/Naval_Postgraduate_School other Sowa trivia ... on the failure of FS and how poorly 3081 compared to competition http://www.jfsowa.com/computer/memo125.htm -- virtualization experience starting Jan1968, online at home since Mar1970 -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
On Fri, 29 May 2015 19:54:20 +0800, David Crayford wrote: And how much has that dumb decision cost mainframe customers over the years? Fiasco is the right word. And IBM could have recovered, rather than compounding the fiasco at the inception of OMVS by making OMVS ASCII based and providing ASCII--EBCDIC conversion in the C RTL for Legacy data sets except when fopen() was called with mode=*b. The kernel would have been simpler for omitting autoconversion. (I believe Legacy I/O is not handled by kernel.) And there would have been no EBCDIC obstacle to porting GNU and other FOSS. Fiasco ** 2. Even yet, I wish IBM would complete the Enhanced ASCII support in the C RTL. Significant omissions are Curses and X11; sockets is already supported. -- gil -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
It's actually much worse. There are three:

EBCDIC:
CR = x0D
NL = x15
LF = x25

Originally, CR only moved the print back to the first position of the same line. LF only moved the print down one line without moving sideways. NL moved both down and to the first position of the line. When it was designed, they were using teletype machines and simple printers. No CRTs.

Historically:
1930's had the Teletype standard: International Telegraph Alphabet No. 2 (ITA2), which had both a CR and a LF and required both at the end of a line.
1950's IBM introduces BCD and adds NL.
1960's IBM introduces EBCDIC and continued using the 3 values.
1960's ATT pushes for a replacement of ITA2 which the ATA published as ASCII in 1963. (One of their requirements was 7 bit so EBCDIC was ruled out.)

In the ASCII world, CR and LF were the standard until the mid-1960's when the Multics developers decided that using two characters was stupid and they started using just LF. Unix and follow-on OSs carried on the same tradition.

Today, it's a mess. Windows wants CRLF. Internet RFCs normally use CRLF. Mac and Linux use just LF.

Interesting, Windows Notepad requires CRLF, but Windows Wordpad will read and display a LF only file correctly and even change the file to CRLF when saved.

Tony Thigpen

Ze'ev Atlas wrote on 05/28/2015 11:29 PM:
> Hi all,
> I am dealing with some C package on classic z/OS (PDS/E, no USS). When C reads text files it inserts 0x15 in the end of the record (it goes that far as to drop the trailing blanks and substitute them with one 0x15 for fixed length records, but I think that there is an option to override that). 0x15 is defined as New Line, but there is a separate character, 0x25 that is defined as Line Feed.
> Does anybody know why do we need two characters that seem to do the same thing (besides the evil desire to confuse the poor user :)
> Ze'ev Atlas
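The three movements described above (CR returns to the first position, LF moves down one line, NL does both) can be modelled with a toy print-head simulation. This Python sketch is purely illustrative: feed is a hypothetical function, the head position is a (row, column) pair, and every other byte is assumed to print one character and advance one column.

```python
CR, NL, LF = 0x0D, 0x15, 0x25   # EBCDIC code points listed in the message above

def feed(codes, start=(0, 5)):
    """Toy model: return the (row, column) of the print head after the codes."""
    row, col = start
    for code in codes:
        if code == CR:
            col = 0                 # back to the first position, same line
        elif code == LF:
            row += 1                # down one line, no sideways movement
        elif code == NL:
            row, col = row + 1, 0   # down one line and back to the first position
        else:
            col += 1                # any other byte prints and advances
    return row, col

print(feed([CR]), feed([LF]), feed([NL]), feed([CR, LF]))
```

In this model NL leaves the head exactly where the CR, LF pair does, which is why the two conventions look alike on paper even though they are different bytes in a file.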
Re: New Line vs. Line Feed
t...@vse2pdf.com (Tony Thigpen) writes: It's actually much worse. There are three: Ebcdic: CR = x0D NL = x15 LF = x25 Originally, CR only moved the print back to the first position of the same line. LF only moved the print down one line without moving sideways. NL moved both down and to the first position of the line. When it was designed, they were using teletype machines and simple printers. No CRTs. Historically: 1930's had the Teletype standard: International Telegraph Alphabet No. 2 (ITA2); which had both a CR and a LF and required both at the end of a line. 1950's IBM introduces BCD and adds NL 1960's IBM introduces EBCDIC and continued using the 3 values. 1960's ATT pushes for a replacement of ITA2 which the ATA published as ASCII in 1963. (One of their requirements was 7 bit so EBCDIC was ruled out.) In the ASCII world, CR and LF were the standard until the mid-1960's when the Multics developers decided that using two characters was stupid and they started using just LF. Unix and follow-on OSs carried on the same tradition. Today, it's a mess. Windows wants CRLF. Internet RFCs normally use CRLF. Mac and Linux use just LF. Interesting, Windows Notepad requires CRLF, but Windows Wordpad will read and display a LF only file correctly and even change the file to CRLF when saved. IBM did much of the standardization for ASCII and 360 originally was suppose to be an ASCII machine ... unfortunately the 360 ASCII unit record gear wasn't ready ... and the decision was made to go (temporarily) with the old BCD unit record gear (but there was some unfortunate side-effects of that decision). EBCDIC and the P-Bit, The Biggest Computer Goof Ever http://www.bobbemer.com/P-BIT.HTM The culprit was T. Vincent Learson. The only thing for his defense is that he had no idea of what he had done. It was when he was an IBM Vice President, prior to tenure as Chairman of the Board, those lofty positions where you believe that, if you order it done, it actually will be done. 
I've mentioned this fiasco elsewhere. ... snip ... by the father of ASCII http://www.bobbemer.com/FATHEROF.HTM his history index http://www.bobbemer.com/HISTORY.HTM -- virtualization experience starting Jan1968, online at home since Mar1970 -- For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: New Line vs. Line Feed
On Thu, 28 May 2015 23:34:53 -0500, John McKown wrote:
> 0x15 is _NOT_ a Line Feed character. It is a New Line (NEL) character from the 3215 console days. In EBCDIC, 0x25 is the true Line Feed character. On the 3215, the NEL was a single byte which did a carriage return and line feed operation all in one. If you sent a 0x25 (LF) to a 3215, the platen (roller) would advance one line, but the print head would remain stationary.
>
> As a side note (as I have heard it), the reason that Windows uses CRLF as a line ending is because MS-DOS did the same. MS-DOS used CRLF because CPM-80 used CRLF. And, finally, CPM-80 used CRLF because the common printers at the time could not do a carriage return / line feed in a single operation. So, Gary Kildall (author of CPM-80) decided to end text files with CRLF so that he didn't need to complicate the printer driver to put a LF in when a CR was detected. This made good sense in the day that 64K RAM and a 1 Mhz 8080 was top of the line equipment for the hobbyist.

The Teletype 33, running at 10 CPS, could do a CR in less than 0.2 seconds; a LF in less than 0.1 second, so it made sense to use CRLF so the combined operation completed before the next printable character was issued.

Taking its cue from the 3215, VM CP (CP/67?) used NL as a command separator. When the first C compilers, from ISVs, not IBM, and on VM, not MVS appeared, they used 0x15 -- UNIX was not a concern. Then OMVS used 0x15 for compatibility with those compilers.

Using a device-specific hardware command to separate records in a general file makes as little sense as Assembler H's use of machine carriage control. A device-neutral convention might have been Record Separator, ASCII 0x1e.

CMS Pipelines's A2E/E2A map:

    ASCII           EBCDIC
    NEL 0x85   --   NL 0x15
    LF  0x0a   --   LF 0x25

... as do Linux iconv commands and subroutines, and even OMVS's dd command. This results in painful incompatibilities. The standouts are OMVS iconv and other utilities. IBM clearly violates a standard. Footnotes on various reference manual pages do not excuse such a violation.

-- gil
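The mapping described above (EBCDIC NL 0x15 pairs with NEL 0x85, EBCDIC LF 0x25 pairs with LF 0x0a) is also what Python's built-in cp037 codec implements, which gives a quick way to check it. This is a sketch that assumes cp037 is representative. It says nothing about what OMVS iconv does.

```python
# The three EBCDIC control bytes discussed in the thread: NL, LF, CR.
ebcdic_controls = bytes([0x15, 0x25, 0x0D])

# Decode with Python's cp037 codec and look at the resulting code points.
decoded = ebcdic_controls.decode("cp037")
code_points = [f"U+{ord(ch):04X}" for ch in decoded]

# Going the other way: a newline in a Python string is LF (0x0A), so it
# encodes to the EBCDIC Line Feed, not to the EBCDIC New Line.
python_newline_as_ebcdic = "\n".encode("cp037")

print(code_points)
print(python_newline_as_ebcdic.hex())
```

Under this mapping, text written with a plain "\n" on an ASCII system and converted byte for byte arrives on the EBCDIC side with 0x25 line ends, while the z/OS C runtime expects 0x15, which is the incompatibility the message describes.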
Re: New Line vs. Line Feed
On Thu, May 28, 2015 at 10:29 PM, Ze'ev Atlas 004b34e7c98a-dmarc-requ...@listserv.ua.edu wrote:
> Hi all,
> I am dealing with some C package on classic z/OS (PDS/E, no USS). When C reads text files it inserts 0x15 in the end of the record (it goes that far as to drop the trailing blanks and substitute them with one 0x15 for fixed length records, but I think that there is an option to override that). 0x15 is defined as New Line, but there is a separate character, 0x25 that is defined as Line Feed.
> Does anybody know why do we need two characters that seem to do the same thing (besides the evil desire to confuse the poor user :)
> Ze'ev Atlas

0x15 is _NOT_ a Line Feed character. It is a New Line (NEL) character from the 3215 console days. In EBCDIC, 0x25 is the true Line Feed character. On the 3215, the NEL was a single byte which did a carriage return and line feed operation all in one. If you sent a 0x25 (LF) to a 3215, the platen (roller) would advance one line, but the print head would remain stationary.

As a side note (as I have heard it), the reason that Windows uses CRLF as a line ending is because MS-DOS did the same. MS-DOS used CRLF because CPM-80 used CRLF. And, finally, CPM-80 used CRLF because the common printers at the time could not do a carriage return / line feed in a single operation. So, Gary Kildall (author of CPM-80) decided to end text files with CRLF so that he didn't need to complicate the printer driver to put a LF in when a CR was detected. This made good sense in the day that 64K RAM and a 1 Mhz 8080 was top of the line equipment for the hobbyist.

-- 
My sister opened a computer store in Hawaii. She sells C shells down by the seashore.
If someone tell you that nothing is impossible: Ask him to dribble a football.
He's about as useful as a wax frying pan.
10 to the 12th power microphones = 1 Megaphone
Maranatha!
John McKown