history builtin with no HISTFILE

2023-07-27 Thread Grisha Levit
If the `history' builtin is used with no filename argument and the HISTFILE
variable is unset, the Readline default history file ~/.history is used.

$ rm -f ~/.history
$ bash -c 'history -s foo; history -w'
$ cat ~/.history
foo

The help text states:

If FILENAME is given, it is used as the history file.  Otherwise,
if HISTFILE has a value, that is used, else ~/.bash_history.

The man and info pages both just state that if no filename is supplied, then
"the value of HISTFILE is used."

Since Bash normally doesn't perform any history file operations if HISTFILE
is empty or unset, I think it would make sense for `history' to do the same.


I think there's also some ambiguity in the documentation regarding an empty
HISTFILE value:

   HISTFILE
          The name of the file in which command history is saved (see
          HISTORY below).  The default value is ~/.bash_history.  If unset,
          the command history is not saved when a shell exits.

It might be nice to be explicit here, and in the HISTORY section, that an empty
HISTFILE is treated the same as an unset one, and/or clarify that the "default
value" is what the variable is set to on shell startup if it is unset.
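To make the unset/empty distinction concrete, here is a minimal sketch of how
a script could tell the three cases apart.  The function name histfile_state
is hypothetical (it is not part of bash); the sketch only relies on the
standard ${VAR+x} expansion.

```shell
# Hypothetical helper: classify HISTFILE as "unset", "empty", or "set".
histfile_state() {
    if [ -z "${HISTFILE+x}" ]; then
        echo unset      # the variable does not exist at all
    elif [ -z "$HISTFILE" ]; then
        echo empty      # the variable exists but holds the empty string
    else
        echo set        # the variable holds a non-empty file name
    fi
}

histfile_state
```

A script that wants to avoid the ~/.history fallback could, for example, run
`history -w' only when this reports "set".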



Re: comments inside command subst are handled inconsistently

2023-07-27 Thread Dale R. Worley
Denys Vlasenko writes:
> Try these two commands:
>
> $ echo "Date: `date #comment`"
> Date: Thu Jul 27 10:28:13 CEST 2023
>
> $ echo "Date: $(date #comment)"
> > )"
> Date: Thu Jul 27 10:27:58 CEST 2023
>
> As you see, #comment is handled differently in `` and $().
> I think the handling in `` makes more sense.

Or more exactly, the handling of a ")" after a "#" is different from the
handling of a "`" after a "#".  I suspect the parsing is done
differently.  Likely `...` is treated as a quoting operation, and "date
#comment" is first extracted as a string and then parsed.  Whereas it
looks like the parser, upon seeing "$(", does not first scan for ")" but
instead adjusts its context and continues parsing characters.  With that
method, the "#" turns all of "comment)" into comment.
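If that reading is right, a comment inside $() simply needs a newline before
the closing parenthesis.  A small sketch, assuming the behavior reported in
this thread (the variable names are only for illustration):

```shell
# Backquote form: the text between the backquotes is collected first,
# so the comment ends where the substitution ends.
legacy=`echo hello # trailing comment`

# $() form: the comment runs to the end of the line, so the closing
# parenthesis goes on the next line.
modern=$(echo hello # trailing comment
)

printf '%s\n' "$legacy" "$modern"
```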

Dale



Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character

2023-07-27 Thread Chet Ramey

On 7/26/23 11:35 AM, G. Branden Robinson wrote:


> I know a little about groff.  Your advice is fine for man pages that
> target only groff[1] and/or mandoc[2], but not Heirloom Doctools
> troff,[3] neatroff[4] or Plan 9 troff (in its original form or as
> maintained in Plan 9 from User Space[5]), and not legacy implementations
> descended from AT&T troff that are, as far as I can tell, unmaintained
> by the few Unix System V vendors that still exist.[6][7]
>
> Many projects don't need to worry about such extreme portability in
> their man pages, but GNU Bash arguably does.  (I'm open to correction.)


It's an ongoing struggle. There are projects (e.g., a 4.3BSD preservation
effort) that request such accommodations, but it's becoming more difficult
to support them.


> Furthermore, in the *roff language itself, as originally implemented by
> Joe Ossanna (and re-implemented by Brian Kernighan) there is no good
> way to test for the existence of a special character.[8]
>
> As a first stab at it, I'd divide the world into two camps: (a) groff
> and mandoc(1), and (b) everything else, and not worry about (b).


I'd consider that, but, as you point out, there are some legacy Unix
systems that still ship old versions of troff and associated tools.


> The bash(1) man page has an extensive preamble already that still
> includes a workaround for 4.3BSD(!), so adding a little bit to it to
> accommodate systems developed since 1990 might not be too disruptive.
>
> I'm attaching a straw man diff to the bash(1) page.  If Chet likes it,
> I'm happy to prepare one against the bash devel branch.


Thanks. I'll probably apply some variant of this to the set of man pages
that need it.


> bash(1) also attempts to select a font named "CW" in places, which is
> another portability problem (it's a Unix System III [and later] troff
> font name that was available on _some_ output devices).  But I'd like to
> see how we get over this bridge before I try to cross that one.  :)


(For others reading, it's the constant width font, usually Courier.) I
haven't received any bug reports about that, and groff and mandoc both
support it, so I'm inclined to leave it alone.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character

2023-07-27 Thread G. Branden Robinson
Hi Chet,

At 2023-07-27T11:54:19-0400, Chet Ramey wrote:
> On 7/26/23 11:35 AM, G. Branden Robinson wrote:
> > Many projects don't need to worry about such extreme portability in
> > their man pages, but GNU Bash arguably does.  (I'm open to
> > correction.)
> 
> It's an ongoing struggle. There are projects (e.g., a 4.3BSD
> preservation effort) that request such accommodations, but it's
> becoming more difficult to support them.

I hear you.  Developing groff involves a bit of legacy awareness itself!

> > Furthermore, in the *roff language itself, as originally implemented
> > by Joe Ossanna (and re-implemented by Brian Kernighan) there is no
> > good way to test for the existence of a special character.[...]
> > 
> > As a first stab at it, I'd divide the world into two camps: (a)
> > groff and mandoc(1), and (b) everything else, and not worry about
> > (b).
> 
> I'd consider that, but, as you point out, there are some legacy Unix
> systems that still ship old versions of troff and associated tools.

I was unclear.  By "not worry about (b)", I meant "just define the
string contents as the same old ASCII character".  In retrospect I'm not
sure how anyone was supposed to figure that out from what I typed.

But that is in fact what I did in my suggested patch.  So it should be
highly portable, and I'm happy to support it.

> > The bash(1) man page has an extensive preamble already that still
> > includes a workaround for 4.3BSD(!), so adding a little bit to it to
> > accommodate systems developed since 1990 might not be too
> > disruptive.
> > 
> > I'm attaching a straw man diff to the bash(1) page.  If Chet likes
> > it, I'm happy to prepare one against the bash devel branch.
> 
> Thanks. I'll probably apply some variant of this to the set of man
> pages that need it.

Cool.

> > bash(1) also attempts to select a font named "CW" in places, which
> > is another portability problem (it's a Unix System III [and later]
> > troff font name that was available on _some_ output devices).  But
> > I'd like to see how we get over this bridge before I try to cross
> > that one.  :)
> 
> (For others reading, it's the constant width font, usually Courier.)

This history of the "CW" font name in *roff is becoming clearer to me
and I know of no other narrative about it, so I'm recording this for
posterity--and to invoke Cunningham's Law.[0]

To be precise, it is, or resembles, Courier roman (that is: upright, not
slanted; and of medium stroke weight).  On some output devices,
Documenter's Workbench 3.3 troff (ca. 1990) supported the Courier family
using the names C, CI, CB, and CX (roman, italic, bold, bold-italic).[1]
It also made the roman style available as "CW"--I assume for backward
compatibility with Unix System III (1980), where the brand-new
device-independent troff supported a phototypesetter that featured a
Courier-ish font and offered tools supporting it.[2]  This history is
pretty murky, though; this was commercial Unix troff and licenses for it
were expensive.  That is, I conjecture, the reason that BSD Unix had no
device-independent troff until it adopted groff in the Net/2 release
(1991).[3]  Therefore, device-independent troff font names tended not to
be portable to BSD Unix--not that they were often portable across output
devices anyway, a problem largely tamed nowadays by (1) the dominance of
the PostScript and PDF specifications in this sector, which establish a
base set of workaday typefaces; and (2) groff's insistence on
portability for a base set of font names wherever possible.[4]

Per the copyright page of the first edition of _The C Programming
Language_ (1978), Kernighan & Ritchie must have gotten the monospaced
font into the book because they acquired photographic plates for the
Graphic Systems (by then, Wang) C/A/T in a Courier face (apparently, in
roman only, because no other style was used).  This preceded
device-independent troff[5].  Perhaps by then, a tradition of calling
this face "CW" was entrenched--but I've yet to see any evidence of it in
Seventh Edition Unix (1979).  Possibly its only application was in the
troff sources for the book itself (and the related "C Reference
Manual"), which Prentice-Hall still treats like a trade secret.

> I haven't received any bug reports about that, and groff and mandoc
> both support it, so I'm inclined to leave it alone.

You might start to receive them; stock groff 1.23.0 now produces
font-related diagnostics in many situations where groff 1.22.4 and
earlier did not.[6]  Colin Watson has patched Debian's groff-base
package to suppress this one when man pages are being formatted for
terminals, as suggested by comments in the stock man.local file.[7]  As
far as I can tell, to date other distributors have not.

Regards,
Branden

[0] https://meta.wikimedia.org/wiki/Cunningham%27s_Law
[1] https://github.com/n-t-roff/DWB3.3/tree/master/postscript/devpost

It also makes a "CO" font available, but as an alias of "C" (roman
style), not "CI" 

Re: built-in printf returns success when integer is out of range

2023-07-27 Thread Chet Ramey

On 7/26/23 4:18 PM, tho...@habets.se wrote:


> Bash Version: 5.2
> Patch Level: 15
> Release Status: release
>
> Description:
>  printf '%d\n' 111 && echo success
>  prints "success"
>  /usr/bin/printf does not, but instead returns EXIT_FAILURE (1).


The bash printf builtin doesn't consider this an error -- it prints a
warning -- because strtoimax completely converts the argument (*ep == 0)
even as it changes errno to ERANGE. That dates back to bash-2.05b.

I can make ERANGE overflow a conversion error as well; that seems
consistent with POSIX and what other shells do.
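Until such a change is available, a script can treat the warning itself as a
failure.  This is only a sketch: print_int_strict is a hypothetical helper,
not part of bash, and it assumes that printf reports an out-of-range value
either through its exit status or through a diagnostic on stderr.

```shell
# Hypothetical helper (not part of bash): print an integer with %d, but
# fail if printf reports any problem, whether through its exit status
# or only through a diagnostic on stderr.
print_int_strict() {
    # Trial run: capture stderr only and discard the formatted output.
    _pis_err=$(printf '%d\n' "$1" 2>&1 >/dev/null)
    _pis_status=$?
    if [ "$_pis_status" -ne 0 ] || [ -n "$_pis_err" ]; then
        return 1
    fi
    # The trial run was clean, so print the value for real.
    printf '%d\n' "$1"
}

print_int_strict 42 && echo "42 converted cleanly"
```

Running printf twice is wasteful, but it keeps the formatted output and the
diagnostic separate without needing a temporary file.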

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




comments inside command subst are handled inconsistently

2023-07-27 Thread Denys Vlasenko

Try these two commands:

$ echo "Date: `date #comment`"
Date: Thu Jul 27 10:28:13 CEST 2023

$ echo "Date: $(date #comment)"
> )"
Date: Thu Jul 27 10:27:58 CEST 2023


As you see, #comment is handled differently in `` and $().
I think the handling in `` makes more sense.