Thanks for your explanations! - My notes below...

> Date: Tue, 17 Mar 2015 17:19:24 -0500
> From: terrence.j.doyle+...@gmail.com
> 
>       There are two points, here.
> 
>       First, you're using an old version of ksh.

Yes, I'm aware of that and noted it myself. I'd be happy if newer ksh's
behaviour were not only different from older ksh's, but also...
(see below)

> 
>       Second, ksh has a different spin from bash on reading fixed-length
> data. Read sees variables in two flavors, text and binary.

In this case I'm primarily interested in text.

> When read
> puts characters into a textual variable newlines are always changed to a
> null character which serves as a string terminator. Also, when ksh reads
> fixed-length data, characters continue to fill the buffer after a
> newline is read. Thus, any characters read into your "line" variable
> after a newline will effectively disappear.

Disappearing characters when processing ordinary text are what
annoys and surprises me, also conceptually.

The man page says:
"The
-n
option causes at most
n 
bytes to read rather a full line
but will return when reading from a slow device
as soon as any
characters have been read."

No mention of discarding any characters. (And why would that
make more sense than reading up to the desired amount and
keeping the rest for another read?)

(And I didn't specify -N, to read "exactly" an amount of characters,
I called it with -n, to read "at most" the given amount of characters.)
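
To illustrate what I mean by "keep the rest for another read", here is
a minimal sketch using dd instead of read -n. read_at_most is a made-up
helper name, and unlike read -n it does not stop at a newline; the only
point is that bytes which were not asked for stay in the input.

```shell
# Hypothetical helper (not a ksh or bash builtin): consume at most $1
# bytes from stdin, one byte per read, and print them.
read_at_most() {
    dd bs=1 count="$1" 2>/dev/null
}

printf 'ABCDEFGHIJ1234567890\n' | {
    first=$(read_at_most 16)   # takes 16 bytes off the pipe, no more
    IFS= read -r rest          # the remainder of the line is still there
    printf "'%s'\n'%s'\n" "$first" "$rest"
}
```

Nothing is discarded: the second read still sees 7890.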

> 
>       When I run your test with ksh-20140929, I get these results:
> 
> 'ABCDEFGHIJ123456'
> '7890'
> '234567890'

(Frankly, this shocks me even more than the output I got from that
older ksh version I used.)

> 
> The first 16 characters are put into line as expected and printf %s
> displays them. In the next loop iteration read puts another 16
> characters into line but changes the newline to a null character. Printf
> %s sees the null character as end-of-string and prints only 7890.

That's fine and as expected.

> Characters 6 through 16 (abcdefghij1) in line can't be seen using printf
> %s.

Here I see the problem. According to read's option -n, it should not
read more than the 6 characters (i.e. it should not insert a NUL and
then append another string of characters). Rather, it should read just
6 characters and abort the read, skipping the NL and terminating the
string internally with NUL, so that the rest is available for the next read.
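
To make the expectation concrete, here is a minimal sketch that emulates
the behaviour I'd expect from read -n16 without using read -n at all.
chunk_lines is a made-up helper name; it assumes a width of at least 1
and plain single-byte text, and it only illustrates the expected output,
not how any shell implements read.

```shell
# chunk_lines WIDTH: hypothetical helper, not part of ksh or bash.
# Prints each input line in quoted pieces of at most WIDTH characters.
# Assumes WIDTH >= 1 and single-byte (ASCII) text.
chunk_lines() {
    width=$1
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -z "$line" ]; then
            printf "''\n"            # an empty line yields one empty chunk
            continue
        fi
        while [ -n "$line" ]; do
            chunk=$(printf "%.${width}s" "$line")   # first WIDTH characters
            printf "'%s'\n" "$chunk"
            line=${line#"$chunk"}                   # keep the rest for the next round
        done
    done
}

chunk_lines 16 <<EOT
ABCDEFGHIJ1234567890
abcdefghij1234567890
EOT
```

With the two test lines this prints the same four chunks that bash
printed in my original test.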

> In the final iteration the remaining characters are put in line and
> printf %s displays them.
> 
>       If you don't want ksh to change newlines to null characters in
> fixed-length data, you need to use binary variables.

No, what I would want is that no more characters are read than
asked for.

All the features related to "binary" processing are not what I wanted;
I just want to do text processing (non-binary, printable characters).

> Binary variables
> are declared with typeset -b. To print binary variables you should use
> printf %B. As with textual variables read will continue to fill the
> buffer after a newline is read. I modified your example for binary
> variables as follows:
> 
> typeset -b line
> while IFS= read -n16 -r line
> do
>         printf -- "'%B'\n" line
> done <<EOT
> ABCDEFGHIJ1234567890
> abcdefghij1234567890
> EOT
> 
> This produced the following output:
> 
> 'ABCDEFGHIJ123456'
> '7890
> abcdefghij1'
> '234567890
> '

(A change to binary processing is not what I wanted to achieve, so the
results of your changes naturally don't reflect the expected outcome.)

> 
> It might be more appropriate to change "line" to "buffer" to go along
> with ksh's behavior.
> 
>       It's still different from bash, but I'm content letting ksh be ksh and
> bash be bash.

To be clear: it's *not* about ksh imitating bash. It's about a behaviour
of ksh's read that's (IMO) extremely surprising (and [IMO] not helpful).

> I don't know if the new bash compatability-mode makes
> reading fixed-length data closer to bash's style.

I'm not a bash user (usually), I very much prefer ksh. But methinks the
visible behaviour of read -n is not intuitive, specifically when it comes
to silently discarding characters.

The explanations here left me with the feeling that read -n behaves the
way it does only because of how it happens to be implemented. Not really
satisfying.

But thanks again. - More insights into why the given read -n behaviour
is more sensible than bash's (in this specific read -n case) are welcome.

> 
>                                       Terrence Doyle
> 
> On 3/15/15 9:10 AM, Janis Papanagnou wrote:
> > I observe a problem (see testcase below) with ksh's read -n.
> > (Version 93t 2008-11-04 on Cygwin).
> > Bash's behaviour would be what I expect.
> > Ksh doesn't read the second line and doesn't terminate output.
> > (Maybe an old bug fixed in newer versions? Or am I missing something?)
> > 
> > --snip--
> > 
> > $ cat readtest
> > while IFS= read -n16 -r line
> > do
> >   printf "'%s'\n" "$line"
> > done <<EOT
> > ABCDEFGHIJ1234567890
> > abcdefghij1234567890
> > EOT
> > 
> > $ bash readtest | head
> > 'ABCDEFGHIJ123456'
> > '7890'
> > 'abcdefghij123456'
> > '7890'
> > 
> > $ ksh readtest | head
> > 'ABCDEFGHIJ123456'
> > '7890'
> > ''
> > ''
> > ''
> > ''
> > ''
> > ''
> > ''
> > ''
> > 
> > $ ksh --version
> >   version         sh (AT&T Research) 93t 2008-11-04
> > 
> > --snip--


                                          