Re: bug in awk implementation?

2002-07-16 Thread Crist J. Clark

On Mon, Jul 15, 2002 at 04:00:29PM -0400, Garrett Wollman wrote:
> On Mon, 15 Jul 2002 21:47:09 +0200, Robert Drehmel [EMAIL PROTECTED] said:
>
> > You are right.  However, I still consider it a bug.  :-)
>
> The standard says that the behavior is ``undefined''.  That means that
> your computer is allowed to turn into a frog.  Actually doing something
> useful is also permitted.

And since it is clearly documented, awk(1) says,

   Records
   Normally, records are separated by newline characters.  You can
   control how records are separated by assigning values to the
   built-in variable RS.  If RS is any single character, that
   character separates records.  Otherwise, RS is a regular
   expression.  Text in the input that matches this regular
   expression will separate the record.  However, in compatibility
   mode, only the first character of its string value is used for
   separating records.  If RS is set to the null string, then records
   are separated by blank lines.  When RS is set to the null string,
   the newline character always acts as a field separator, in
   addition to whatever value FS may have.

It is not a bug.
-- 
Crist J. Clark | [EMAIL PROTECTED]
   | [EMAIL PROTECTED]
http://people.freebsd.org/~cjc/| [EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: bug in awk implementation?

2002-07-16 Thread Gordon Tetlow

On Tue, 16 Jul 2002, Crist J. Clark wrote:

> And since it is clearly documented, awk(1) says,
>
>    Records
>    Normally, records are separated by newline characters.  You can
>    control how records are separated by assigning values to the
>    built-in variable RS.  If RS is any single character, that
>    character separates records.  Otherwise, RS is a regular
>    expression.  Text in the input that matches this regular
>    expression will separate the record.  However, in compatibility
>    mode, only the first character of its string value is used for
>    separating records.  If RS is set to the null string, then records
>    are separated by blank lines.  When RS is set to the null string,
>    the newline character always acts as a field separator, in
>    addition to whatever value FS may have.
>
> It is not a bug.

No, you are quoting from the gawk(1) man page. The awk(1) man page makes 
no such statement.

-gordon





Re: bug in awk implementation?

2002-07-16 Thread Garrett Wollman

[Since you insisted on CC'ing me...]

On Tue, 16 Jul 2002 16:57:42 -0700 (PDT), Gordon Tetlow [EMAIL PROTECTED] said:

> No, you are quoting from the gawk(1) man page. The awk(1) man page makes
> no such statement.

The awk(1) manual page does not define the correct behavior of
gawk(1).

IEEE Std. 1003.1-2001 defines the correct behavior of both awk(1) and
gawk(1), and as I have already demonstrated, it leaves the behavior in
question clearly unspecified.

-GAWollman





bug in awk implementation?

2002-07-15 Thread Gordon Tetlow

I was parsing ldif format with awk (formerly gawk) and found a buglet in 
awk with the following script:

BEGIN {
    RS="\n\n";
    FS="(: |\n)";
}

{ print $2; }

Fed the following output:

dn: Some Such DN
gidNumber: 1000
uidNumber: 1080

dn: Some Other DN
gidNumber: 1000
uidNumber: 1405

This is what I get:

one-true-awk:

Some Such DN
1000
1080

Some Other DN
1000
1405

gawk:

Some Such DN
Some Other DN


So, this seems to be a bug in the one-true-awk implementation. Any ideas 
on how to fix this?

-gordon





Re: bug in awk implementation?

2002-07-15 Thread Robert Drehmel

On Mon, Jul 15, 2002 at 08:20:58AM -0700, Gordon Tetlow wrote:
> I was parsing ldif format with awk (formerly gawk) and found a buglet in
> awk with the following script:
>
> BEGIN {
>     RS="\n\n";
>     FS="(: |\n)";
> }
>
> { print $2; }
>
> Fed the following output:
>
> dn: Some Such DN
> gidNumber: 1000
> uidNumber: 1080
>
> dn: Some Other DN
> gidNumber: 1000
> uidNumber: 1405
>
> This is what I get:
>
> one-true-awk:
>
> Some Such DN
> 1000
> 1080
>
> Some Other DN
> 1000
> 1405

Ok.

>
> gawk:
>
> Some Such DN
> Some Other DN
>

Oh.

> So, this seems to be a bug in the one-true-awk implementation. Any ideas
> on how to fix this?

To me, this seems like a bug in 'gawk'.  The AWK language uses
only the first character in RS as the record separator, to my
knowledge.
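
A minimal sketch of that first-character reading, assuming a POSIX
shell and any awk on the PATH.  Writing the separator as RS="\n"
avoids the multi-character case altogether, so every input line
becomes its own record.  The function names first_char_rs and
sample_ldif are made up for illustration.

```shell
# Hypothetical helper: use the single character "\n" as RS, which is what
# an awk that honours only the first character of RS="\n\n" ends up doing.
first_char_rs() {
    awk 'BEGIN { RS = "\n"; FS = "(: |\n)" } { print $2 }'
}

# Sample input copied from the original report.
sample_ldif() {
    printf '%s\n' \
        'dn: Some Such DN' 'gidNumber: 1000' 'uidNumber: 1080' \
        '' \
        'dn: Some Other DN' 'gidNumber: 1000' 'uidNumber: 1405'
}

# Prints the second field of every line; the blank separator line is an
# empty record, so it shows up as an empty output line.
sample_ldif | first_char_rs
```

This should give the same seven lines reported for one-true-awk,
with a blank line between the two groups of values.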

ciao,
-robert




Re: bug in awk implementation?

2002-07-15 Thread Garrett Wollman

On Mon, 15 Jul 2002 09:06:36 -0700 (PDT), Gordon Tetlow [EMAIL PROTECTED] said:

> Ah, okay, there is a distinct lack of documentation to that fact. I have
> figured out that I can just set RS="" and that does the same thing. I
> suppose it would be helpful to have an awk book around. =)

The Standard is clear:

# The first character of the string value of RS shall be the input
# record separator; a newline by default. If RS contains more than
# one character, the results are unspecified. If RS is null, then
# records are separated by sequences consisting of a newline plus
# one or more blank lines, leading or trailing blank lines shall not
# result in empty records at the beginning or end of the input, and a
# newline shall always be a field separator, no matter what the
# value of FS is.
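
The null-RS case in the text quoted above can be sketched as
follows, assuming a POSIX shell and any awk on the PATH.  The
function names print_dn and sample_ldif are made up for
illustration; FS is kept as in the original script, with the
newline listed explicitly.

```shell
# Hypothetical helper: RS="" selects paragraph mode, so each block of
# lines separated by blank lines is one record.
print_dn() {
    awk 'BEGIN { RS = ""; FS = "(: |\n)" } { print $2 }'
}

# Sample input copied from the original report.
sample_ldif() {
    printf '%s\n' \
        'dn: Some Such DN' 'gidNumber: 1000' 'uidNumber: 1080' \
        '' \
        'dn: Some Other DN' 'gidNumber: 1000' 'uidNumber: 1405'
}

# Prints one line per record: the value that follows "dn: ".
sample_ldif | print_dn
```

This should print "Some Such DN" and "Some Other DN", one per
line, without relying on a multi-character RS.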

-GAWollman

