Re: bug in awk implementation?
On Mon, Jul 15, 2002 at 04:00:29PM -0400, Garrett Wollman wrote:
> On Mon, 15 Jul 2002 21:47:09 +0200, Robert Drehmel [EMAIL PROTECTED] said:
> > You are right. However, I still consider it a bug. :-)
>
> The standard says that the behavior is ``undefined''. That means that your computer is allowed to turn into a frog. Actually doing something useful is also permitted.

And since it is clearly documented, awk(1) says,

    Records
    Normally, records are separated by newline characters. You can control how records are separated by assigning values to the built-in variable RS. If RS is any single character, that character separates records. Otherwise, RS is a regular expression. Text in the input that matches this regular expression will separate the record. However, in compatibility mode, only the first character of its string value is used for separating records. If RS is set to the null string, then records are separated by blank lines. When RS is set to the null string, the newline character always acts as a field separator, in addition to whatever value FS may have.

It is not a bug.

-- 
Crist J. Clark | [EMAIL PROTECTED] | [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/ | [EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
Re: bug in awk implementation?
On Tue, 16 Jul 2002, Crist J. Clark wrote:
> And since it is clearly documented, awk(1) says,
>
>     Records
>     Normally, records are separated by newline characters. You can control how records are separated by assigning values to the built-in variable RS. If RS is any single character, that character separates records. Otherwise, RS is a regular expression. Text in the input that matches this regular expression will separate the record. However, in compatibility mode, only the first character of its string value is used for separating records. If RS is set to the null string, then records are separated by blank lines. When RS is set to the null string, the newline character always acts as a field separator, in addition to whatever value FS may have.
>
> It is not a bug.

No, you are quoting from the gawk(1) man page. The awk(1) man page makes no such statement.

-gordon
Re: bug in awk implementation?
[Since you insisted on CC'ing me...]

On Tue, 16 Jul 2002 16:57:42 -0700 (PDT), Gordon Tetlow [EMAIL PROTECTED] said:
> No, you are quoting from the gawk(1) man page. The awk(1) man page makes no such statement.

The awk(1) manual page does not define the correct behavior of gawk(1). IEEE Std. 1003.1-2001 defines the correct behavior of both awk(1) and gawk(1), and as I have already demonstrated, it leaves the behavior in question clearly unspecified.

-GAWollman
bug in awk implementation?
I was parsing ldif format with awk (formerly gawk) and found a buglet in awk with the following script:

    BEGIN { RS="\n\n"; FS="(: |\n)"; }
    { print $2; }

Fed the following input:

    dn: Some Such DN
    gidNumber: 1000
    uidNumber: 1080

    dn: Some Other DN
    gidNumber: 1000
    uidNumber: 1405

This is what I get:

one-true-awk:
    Some Such DN
    1000
    1080
    Some Other DN
    1000
    1405

gawk:
    Some Such DN
    Some Other DN

So, this seems to be a bug in the one-true-awk implementation. Any ideas on how to fix this?

-gordon
Re: bug in awk implementation?
On Mon, Jul 15, 2002 at 08:20:58AM -0700, Gordon Tetlow wrote:
> I was parsing ldif format with awk (formerly gawk) and found a buglet in awk with the following script:
>
>     BEGIN { RS="\n\n"; FS="(: |\n)"; }
>     { print $2; }
>
> Fed the following input:
>
>     dn: Some Such DN
>     gidNumber: 1000
>     uidNumber: 1080
>
>     dn: Some Other DN
>     gidNumber: 1000
>     uidNumber: 1405
>
> This is what I get:
>
> one-true-awk:
>     Some Such DN
>     1000
>     1080
>     Some Other DN
>     1000
>     1405

Ok.

> gawk:
>     Some Such DN
>     Some Other DN

Oh.

> So, this seems to be a bug in the one-true-awk implementation. Any ideas on how to fix this?

To me, this seems like a bug in 'gawk'. The AWK language uses only the first character in RS as the record separator, to my knowledge.

ciao,
-robert
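Robert's point suggests that a portable script should keep RS to a single character. The following is a minimal sketch of that, assuming a POSIX shell and an awk on the PATH; the helper name number_records and the sample input are hypothetical and do not come from the thread.

```sh
# Hypothetical helper: number each record read from stdin,
# treating ";" as the (single-character) record separator.
number_records() {
    # A one-character RS is the only non-null form whose behavior
    # every awk discussed in this thread agrees on.
    awk 'BEGIN { RS = ";" } { print NR ": " $0 }'
}

# No trailing newline, so the last record is just "gamma".
printf 'alpha;beta;gamma' | number_records
```

With a single-character RS, one-true-awk and gawk should split the input into the same three records.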
Re: bug in awk implementation?
On Mon, 15 Jul 2002 09:06:36 -0700 (PDT), Gordon Tetlow [EMAIL PROTECTED] said:
> Ah, okay, there is a distinct lack of documentation to that fact. I have figured out that I can just set RS="" and that does the same thing. I suppose it would be helpful to have an awk book around. =)

The Standard is clear:

# The first character of the string value of RS shall be the input
# record separator; a newline by default. If RS contains more than
# one character, the results are unspecified. If RS is null, then
# records are separated by sequences consisting of a newline plus
# one or more blank lines, leading or trailing blank lines shall not
# result in empty records at the beginning or end of the input, and a
# newline shall always be a field separator, no matter what the
# value of FS is.

-GAWollman
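The null-RS rule quoted from the Standard can be sketched as follows. This is a minimal example, assuming a POSIX shell and a POSIX-conforming awk on the PATH; the helper name ldif_second_field is hypothetical, and the sample input is the LDIF fragment from the original report.

```sh
# Hypothetical helper: print the second field of every
# blank-line-separated record read from stdin.
ldif_second_field() {
    # RS = "" selects paragraph mode: records end at one or more blank
    # lines, and a newline always separates fields in addition to FS.
    awk 'BEGIN { RS = ""; FS = ": |\n" } { print $2 }'
}

# Feed the two LDIF entries from the original report, separated by a blank line.
printf '%s\n' \
    'dn: Some Such DN' 'gidNumber: 1000' 'uidNumber: 1080' '' \
    'dn: Some Other DN' 'gidNumber: 1000' 'uidNumber: 1405' |
    ldif_second_field
```

Because this relies only on behavior the Standard specifies, it should print the two dn values under both one-true-awk and gawk, rather than depending on how each treats a multi-character RS.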