Re: Regression -- can't read input w/stderr redirect

2017-06-20 Thread Chet Ramey
On 6/20/17 8:25 AM, Greg Wooledge wrote:

>>   * "The man page of bash reads: '-d delim  The first character of
> [...]
> 
> It's not in the man page.

   -d delim
 The  first  character  of  delim is used to terminate the
 input line, rather than newline.  If delim is  the  empty
 string,  read  will  terminate a line when it reads a NUL
 character.
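
As a small illustration of that wording (a sketch only; the variable
names first and rec are made up for the example):

```bash
# A non-empty delim: its first character terminates the input.
IFS= read -r -d ':' first <<<'alpha:beta'
declare -p first

# An empty delim: read stops at the first NUL, so the newline is kept
# as ordinary data inside the record.
IFS= read -r -d '' rec < <(printf 'one\ntwo\0three\0')
declare -p rec
```

The second read takes its input from a process substitution rather than
a variable, because a variable cannot hold the NUL in the first place.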


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Regression -- can't read input w/stderr redirect

2017-06-20 Thread Greg Wooledge
On Tue, Jun 20, 2017 at 02:01:51AM -0700, L A Walsh wrote:
> * BASH's 'read' built-in supports '\0' as delimiter

Yes, but not with that syntax.  It uses the -d '' option,
which is undocumented, but supported according to
.

> * And a line is by definition terminated by ascii 0, right? Newline, for
>  instance, is just a formatting character?

No, that isn't a correct statement.  A "line" is terminated by a newline.
However, there are some input sources that don't use lines.  Instead,
they use records terminated by NULs.

Once upon a time, such input sources were rare.  The original Bourne
and POSIX shells have no facilities for working with these inputs.
But bash does.

>  (an apple-darwin user - stackoverflow (SO))
> * marked as 'answer' method to read null terminated lines:
>   |while IFS= read -r -d '' line ; do ... done <<<"$var" (also SO)

That's just plain wrong.  Don't believe everything you read on
stackoverflow.

A bash variable cannot contain a NUL byte, so <<<"$var" is never going
to have NULs in it.
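
A quick way to check this yourself (a sketch, assuming bash; the names
var, line and count exist only for this example):

```bash
# The command substitution drops the NUL.  bash 4.4 also prints a warning,
# which the brace group sends to /dev/null.
{ var=$(printf 'foo\0bar'); } 2>/dev/null
printf '%s' "$var" | od -An -c      # no \0 shows up in the dump

# The here-string therefore contains no NUL for read -d '' to stop at.
# read runs into EOF, returns non-zero, and the loop body never executes.
count=0
while IFS= read -r -d '' line; do
    count=$((count + 1))
done <<<"$var"
echo "loop iterations: $count"
```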

> * support for "-d '' " as equivalent to specifying null termination:

I cited Chet above.  That's the most official word we have.

>   * "The man page of bash reads: '-d delim  The first character of
[...]

It's not in the man page.



Re: Regression -- can't read input w/stderr redirect

2017-06-20 Thread L A Walsh



Pierre Gaston wrote:



> Your response: you accuse him to lie to you.

Huh?  Are you daft?  Just because I can't find something about bash
corrupting input in google, and ask where people have said this,
I'm accusing someone of lying?  Sorry, but looking for context to
see exactly what was said is a far cry from calling someone a liar.

I've tried multiple searches to see where someone might have said
something about bash corrupting input -- most often I see that Nuls
are ignored or can be used as terminators. 


I find lots of stuff about people wanting or trying to get nulls to
work with bash, with several thinking it terminates input (or a line).
But nothing about corruption. Perhaps you have pointers to this
discussion I missed?  I find various things about Nul's but nothing
mentioning their absence as being "corruption".

* BASH’s ‘read’ built-in supports '\0' as delimiter
  (http://transnum.blogspot.com/2008/11/bashs-read-built-in-supports-0-as.html)
* And a line is by definition terminated by ascii 0, right? Newline, for
  instance, is just a formatting character?
  (an apple-darwin user - stackoverflow (SO))
* marked as 'answer' method to read null terminated lines:
  |while IFS= read -r -d '' line ; do ... done <<<"$var" (also SO)
* support for "-d '' " as equivalent to specifying null termination:
  * "The man page of bash reads: '-d delim  The first character of
    delim is  used  to  terminate  the input line, rather than newline.'
    Because strings are null terminated, the first character of an empty
    string is the null byte." (unix.StckExchng (unix.SE))
  * "Your read has -d $'\0'. This works because -d '' also works! It is
    the empty string, so the null byte comes immediately. -d delim uses
    "The first character of delim". (trimmed for brevity; same site)
* question asking why null bytes are removed from a read (unix.com); answer
  was to use zsh or pdksh as they seemed to handle it.
* question asking how to use null bytes in BASH (unix.SE)
* reactionary essay on posix shells being "buggy". How anything in IFS
  can cause corruption.
  (https://www.dwheeler.com/essays/filenames-in-shell.html)

Etc...
Pierre, if anyone is spreading falsehoods, it would be you in your 
accusations.


p.s. -- Chet, if you've taken Pierre seriously, I'm sorry, I am looking for
the source of the discussion, I'm not interested in accusations.





Re: Regression -- can't read input w/stderr redirect

2017-06-19 Thread Greg Wooledge
On Sun, Jun 18, 2017 at 07:17:46PM -0700, L A Walsh wrote:
> Oh?  I want to read in a value from the registry where something may
> have zeros after the data.  Please tell me the mechanism to read this in
> w/no warnings that won't silence more serious cases of zero's other than
> at the end of line.

IFS= read -rd '' regvalue < <(regcommand)

This will read the content from the stream provided by regcommand, up
to the first NUL, where it will stop.  This is the general method used
to read a NUL-delimited value from a command or pseudo-file.  A command
substitution is NOT preferred, because it converts the stream to a
string (by dropping all NUL bytes, including those that are in between
two separate values).

An analogous common use-case on Linux:

wooledg:~$ hd /proc/$$/cmdline
00000000  62 61 73 68 00                                    |bash.|
00000005
wooledg:~$ IFS= read -rd '' foo < /proc/$$/cmdline
wooledg:~$ declare -p foo
declare -- foo="bash"

If your regcommand produces a stream with multiple items in it, like
this:

foo\0bar\0quux\0

Then the IFS= read -rd '' regvalue command will read "foo" (the first
value) and stop, whereas a command substitution will give you the string
"foobarquux" which is nonsense.  (And in bash 4.4, you also get a
warning, because doing that is bad.  You should stop doing that.)
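
To pick up every item rather than just the first, the same read can be
put in a loop.  A minimal sketch, with printf standing in for the
unknown regcommand and items/joined being names made up here:

```bash
# Read each NUL-terminated record into an array element.
items=()
while IFS= read -r -d '' item; do
    items+=("$item")
done < <(printf 'foo\0bar\0quux\0')
declare -p items

# For contrast: the command substitution glues the records together.
# The brace group hides the bash 4.4 warning.
{ joined=$(printf 'foo\0bar\0quux\0'); } 2>/dev/null
declare -p joined
```

The loop relies on every record, including the last one, ending in a
NUL; a trailing record without one makes read return non-zero and would
need separate handling.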

Now, most of us on this mailing list do not use Microsoft Windows, and
do not know what commands you are using, or what output they produce.
So, if you want any more specific advice, you'll have to go back a step
and explain what you're doing, what input you're dealing with, and what
results you expect to achieve.

It also helps if you use standard bash commands, and not your
idiosyncratic aliases like "my" and "int".  Some of us may remember
that you do things like alias my=declare, but others will not.  When
reporting a bug, you want your report to be easily understandable and
reproducible, without external references or guessing.



Re: Regression -- can't read input w/stderr redirect

2017-06-19 Thread Chet Ramey
On 6/18/17 10:17 PM, L A Walsh wrote:
> 
> 
> Chet Ramey wrote:
>> On 6/18/17 6:59 PM, L A Walsh wrote:
>>  
>>> Chet Ramey wrote:
>>>
>>>> Bash has always stripped NULL bytes. Now it tells you it's doing it.
>>>
>>> Why?  Did I toggle a flag asking for the warning?  Seems like it
>>> was sorta sprung on users w/no way to disable it.

I was wondering why you chose this hill to die on, but then I remembered
you've chosen so many in the past.

The worst part is that you could have spent all this time developing a
patch for the behavior you want, but you chose to spend it complaining.

There is (once again) no path forward for this conversation that isn't
a huge waste of time. I don't plan to continue it.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Regression -- can't read input w/stderr redirect

2017-06-19 Thread Pierre Gaston
On Mon, Jun 19, 2017 at 5:17 AM, L A Walsh  wrote:

>
>
> Chet Ramey wrote:
>
>> On 6/18/17 6:59 PM, L A Walsh wrote:
>>
>>
>>> Chet Ramey wrote:
>>>
>>>
>>>> Bash has always stripped NULL bytes. Now it tells you it's doing it.


>>> Why?  Did I toggle a flag asking for the warning?  Seems like it
>>> was sorta sprung on users w/no way to disable it.
>>>
>>>
>>
>> Users asked why bash transformed input without warning, even though it
>> had been doing that its entire lifetime. A warning is appropriate.
>>
>>
> Maybe - but links to, at least, 2-3 users who filed bug reports about this
> problem in bug-bash would be appropriate as well to justify the inclusion
> of the text.
>
> I don't recall it ever coming up until the warning message was discussed
> as being unwelcome.  So please, I'd like to see the bug-report filings
> where this happened.
>
>
> 
>>>    But things are changing -- people have asked for zero-terminated
>>> read's and readarrays.  More unix utils are offering NUL termination as
>>> an option because newlines alone don't cut it in some instances.
>>>
>>>
>>
>> And bash provides mechanisms to deal with the relatively few use cases
>> where it is a problem.
>>
>> Recall that the only thing that has changed is that bash now provides a
>> warning about what it's doing.
>>
>>
> Oh?  I want to read in a value from the registry where something may have
> zeros after the data.  Please tell me the mechanism to read this in w/no
> warnings that won't silence more serious cases of zero's other than at the
> end of line.
>
> I want to see the hyperlinks to archived bug-discussions on bug-bash where
> users complained about this and where it was at the end of a string where
> they expected to be able to read past a binary-0 in the input.
>
> I know I would have liked the ability to read binary data into a var that
> might "include a NUL", but I don't recall ever complaining about
> end-of-string NUL's being trimmed -- and it was drummed home to me how the
> null's were the end of the string -- not how bash read everything but nulls
> from input.
>
>
I'm sorry  to say that your behavior on this list is just not acceptable.

If you were on IRC I would have banned you much earlier, yet after all
these years of trolling the list, Chet is going to great lengths to
explain the rationale of his choices while you keep whining for the
shell to just do what you want in the random particular case you
happen to be working on at the moment.

Your response: you accuse him to lie to you.

I don't think this is constructive in any way and I'm sure
that, even if Chet has probably experienced this kind of online
situation more than most, it's not a pleasant one.

Pierre.


Re: Regression -- can't read input w/stderr redirect

2017-06-19 Thread Robert Elz
Date:        Sun, 18 Jun 2017 14:02:05 -0700
From:        L A Walsh
Message-ID:  <5946ea4d.8030...@tlinx.org>


  | Side question: Why display that message if there are only
  | NUL's at the end?  I would think it normal for bash to
  | use and read NUL terminated strings.

Files with nuls are not normal, they're usually some kind of binary
(a.out, jpeg, ...) which most of the time you don't want to read in
a sh script.


  |   int dpi=$(ord $(<"$pixels_path" 2>/dev/null))
  |   # no error message, but also got (in my script):

I think you'll find that's because the $(/dev/null

I suspect (though I haven't bothered to check it - redirecting errs
from the sh running the script itself has typically been difficult to
achieve.)

kre

ps: I see Eduardo Bustamante has just said almost the same thing with
more detail...




Re: Regression -- can't read input w/stderr redirect

2017-06-18 Thread L A Walsh



Chet Ramey wrote:
> On 6/18/17 6:59 PM, L A Walsh wrote:
>> Chet Ramey wrote:
>>> Bash has always stripped NULL bytes. Now it tells you it's doing it.
>>
>> Why?  Did I toggle a flag asking for the warning?  Seems like it
>> was sorta sprung on users w/no way to disable it.
>
> Users asked why bash transformed input without warning, even though it
> had been doing that its entire lifetime. A warning is appropriate.

Maybe - but links to, at least, 2-3 users who filed bug reports about this
problem in bug-bash would be appropriate as well to justify the inclusion
of the text.

I don't recall it ever coming up until the warning message was discussed
as being unwelcome.  So please, I'd like to see the bug-report filings
where this happened.

>>    But things are changing -- people have asked for zero-terminated read's
>> and readarrays.  More unix utils are offering NUL termination as an option
>> because newlines alone don't cut it in some instances.
>
> And bash provides mechanisms to deal with the relatively few use cases
> where it is a problem.
>
> Recall that the only thing that has changed is that bash now provides a
> warning about what it's doing.

Oh?  I want to read in a value from the registry where something may
have zeros after the data.  Please tell me the mechanism to read this in
w/no warnings that won't silence more serious cases of zero's other than
at the end of line.

I want to see the hyperlinks to archived bug-discussions on bug-bash where
users complained about this and where it was at the end of a string where
they expected to be able to read past a binary-0 in the input.

I know I would have liked the ability to read binary data into a var that
might "include a NUL", but I don't recall ever complaining about
end-of-string NUL's being trimmed -- and it was drummed home to me how the
null's were the end of the string -- not how bash read everything but nulls
from input.










Re: Regression -- can't read input w/stderr redirect

2017-06-18 Thread Chet Ramey
On 6/18/17 6:59 PM, L A Walsh wrote:
> 
> 
> Chet Ramey wrote:
>> Bash has always stripped NULL bytes. Now it tells you it's doing it.
> Why?  Did I toggle a flag asking for the warning?  Seems like it
> was sorta sprung on users w/no way to disable it.

Users asked why bash transformed input without warning, even though it
had been doing that its entire lifetime. A warning is appropriate.

>>  
>>> Side question: Why display that message if there are only
>>> NUL's at the end?  I would think it normal for bash to
>>> use and read NUL terminated strings.  
>>
>> This is very uncommon. Most Unix utilities use newline-terminated
>> lines.
>>   
> 
>But things are changing -- people have asked for zero-terminated read's
> and readarrays.  More unix utils are offering NUL termination as an option
> because newlines alone don't cut it in some instances.

And bash provides mechanisms to deal with the relatively few use cases
where it is a problem.

Recall that the only thing that has changed is that bash now provides a
warning about what it's doing.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Regression -- can't read input w/stderr redirect

2017-06-18 Thread L A Walsh



Chet Ramey wrote:
> Bash has always stripped NULL bytes. Now it tells you it's doing it.

Why?  Did I toggle a flag asking for the warning?  Seems like it
was sorta sprung on users w/no way to disable it.

>> Side question: Why display that message if there are only
>> NUL's at the end?  I would think it normal for bash to
>> use and read NUL terminated strings.
>
> This is very uncommon. Most Unix utilities use newline-terminated
> lines.

   But things are changing -- people have asked for zero-terminated read's
and readarrays.  More unix utils are offering NUL termination as an option
because newlines alone don't cut it in some instances.

   Also, Bash is being used on windows: in cygwin and natively.
It's not uncommon for NUL's to be in input on windows --
it's VERY common if you read something from the registry (as I've
done for over a decade), or if you read something with UTF-16
in it as I just tried to do.  Bash mangles the locale's strings.
In UTF-16, 0x (16-bits) of zero are an eoln.

> Internally, maybe, but not when dealing with external utilities.

Dealing with the registry is pretty common on Windows.
You don't want to stick with a solution that will orphan
all those windows users do you?

:-)







Re: Regression -- can't read input w/stderr redirect

2017-06-18 Thread Chet Ramey
On 6/18/17 5:02 PM, L A Walsh wrote:

>  int dpi=$(ord $(<"$pixels_path" 2>/dev/null))
> 
> 
> This used to work but now works _unreliably_.
> 
> (NOTE: I know that function won't work for values over 255,
> but hasn't been a problem yet, so haven't needed to fix it).
> 
> Tried running it interactively, and got:
> 
>  > int dpi=$(ord $(<"$pixels_path" ))
>  -bash: warning: command substitution: ignored null byte in input

These are not the same command. Eduardo explained why this matters.

> 
> I've always expected the '0' bytes to terminate input so
> my "ord" only picked up the 1st character, but I know
> about the added message.

Bash has always stripped NULL bytes. Now it tells you it's doing it.

> 
> Side question: Why display that message if there are only
> NUL's at the end?  I would think it normal for bash to
> use and read NUL terminated strings.  

This is very uncommon. Most Unix utilities use newline-terminated
lines.


> So why the err message
> in that case?  FWIW, if the null bytes are anywhere BUT
> the end, then I'd see that as an error, but usually with
> C and bash, a NUL-byte terminating a string seems a bit
> "unremarkable". (no?)

Internally, maybe, but not when dealing with external utilities.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Regression -- can't read input w/stderr redirect

2017-06-18 Thread L A Walsh



Eduardo A. Bustamante López wrote:
> On Sun, Jun 18, 2017 at 02:02:05PM -0700, L A Walsh wrote:
> [...]
>>  int dpi=$(ord $(<"$pixels_path" 2>/dev/null))
>> This used to work but now works _unreliably_.
>
> In what version does this used to work?

It used to work when "2>/dev/null" wasn't required to
silence unwanted error messages.

> It's clear that the following conditions must be met:
>
> - The redirection must be performed inside a command substitution (ln.
>   415)
> - The command substitution must be a simple command (ln. 417)
> - The simple command must consist of a single input redirection (ln.
>   420), i.e. no words (ln. 419), no "next" redirection (ln. 421), input
>   redirection (ln. 422), and the target being file descriptor 0 (ln.
>   423).

   Seems unnecessarily limited.  If bash is going to emit
warnings, then it seems allowing them to be silenced at the point
where they come out would be reasonable, though in this case,
see further on...

> I think that you're looking for:
>
>   $ bash -c 'printf "x\0y" > f; { a=$(< f); } 2>/dev/null; declare -p a'
>   declare -- a="xy"
> Instead.


Not really -- I hadn't thought about the impact of NUL's
other than at the end.  For example, if I tried to create a
workaround but tried to read 256, which would have been
encoded as "0x00 0x01 0x00 0x00" -- even in UTF-16:

printf  "\0\001\0\0" >/tmp/f
...> a=$(</tmp/f)
bash: warning: command substitution: ignored null byte in input
...> echo "$a"|hexdump -C
  01 0a

 echo $LC_CTYPE
en_US.UTF-16

(0x100 = a cap A with a line over it (Ā)).

I'd prefer to see the warning if the NUL byte was other than
at the end. The case with NUL at the end is a standard
string terminator.  I think the warning message wouldn't have been
seen by as many people if it only complained about NUL's in the
middle (vs. as a string-terminator).  No?





Re: Regression -- can't read input w/stderr redirect

2017-06-18 Thread Eduardo A . Bustamante López
On Sun, Jun 18, 2017 at 02:02:05PM -0700, L A Walsh wrote:
[...]
>  int dpi=$(ord $(<"$pixels_path" 2>/dev/null))
> 
> This used to work but now works _unreliably_.

In what version does this used to work?

I tested on a couple of versions, and the behavior you describe didn't work:

  dualbus@debian:~/src/gnu/bash-builds$ for b in bash-*/; do $b/bash -c 'echo $(< <(echo x)) $BASH_VERSION'; done
  x 3.2.57(1)-release
  x 4.2.0(1)-release
  x 4.2.53(1)-release
  x 4.3.30(1)-release
  
  dualbus@debian:~/src/gnu/bash-builds$ for b in bash-*/; do $b/bash -c 'echo $(< <(echo x) >/dev/stdout) $BASH_VERSION'; done
  3.2.57(1)-release
  4.2.0(1)-release
  4.2.53(1)-release
  4.3.30(1)-release


And if you inspect the source code, you'll notice that the command
substitution "cat file" functionality is implemented here:

builtins/evalstring.c:

  413   /* See if this is a candidate for $( <file ). */
  414   if (startup_state == 2 &&
  415       (subshell_environment & SUBSHELL_COMSUB) &&
  416       *bash_input.location.string == '\0' &&
  417       command->type == cm_simple && !command->redirects &&
  418       (command->flags & CMD_TIME_PIPELINE) == 0 &&
  419       command->value.Simple->words == 0 &&
  420       command->value.Simple->redirects &&
  421       command->value.Simple->redirects->next == 0 &&
  422       command->value.Simple->redirects->instruction == r_input_direction &&
  423       command->value.Simple->redirects->redirector.dest == 0)
  424     {
  425       int r;
  426       r = cat_file (command->value.Simple->redirects);
  427       last_result = (r < 0) ? EXECUTION_FAILURE : EXECUTION_SUCCESS;
  428     }

It's clear that the following conditions must be met:

- The redirection must be performed inside a command substitution (ln.
  415)
- The command substitution must be a simple command (ln. 417)
- The simple command must consist of a single input redirection (ln.
  420), i.e. no words (ln. 419), no "next" redirection (ln. 421), input
  redirection (ln. 422), and the target being file descriptor 0 (ln.
  423).

So that means that `echo $(<X >Y)' is not a valid "cat file" command
substitution.

I think that you're looking for:

  $ bash -c 'printf "x\0y" > f; { a=$(< f); } 2>/dev/null; declare -p a'
  declare -- a="xy"

Instead.
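
The two placements of the stderr redirection can be compared side by
side.  This is only a sketch: tmp, a and b are names chosen for the
example, and mktemp is assumed to be available.

```bash
tmp=$(mktemp)
printf 'x\0y' > "$tmp"

# Two redirections inside $( ): this is no longer the "cat file" form,
# so nothing is read and a ends up empty.
a=$(<"$tmp" 2>/dev/null)
declare -p a

# A single input redirection inside $( ), with stderr redirected around
# the whole assignment: the file is read, the NUL is dropped, and the
# warning (bash 4.4) goes to /dev/null.
{ b=$(<"$tmp"); } 2>/dev/null
declare -p b

rm -f "$tmp"
```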

-- 
Eduardo Bustamante
https://dualbus.me/