TPUT expects an incorrect number of arguments (re-posting in plain text)

Michael Hambly Wed, 15 Jun 2022 12:20:57 -0700

 >Synopsis:    /usr/bin/tput expects an incorrect number of arguments.


 >Category:    system/user

 >Environment:

     System      : OpenBSD 7.0
     Details     : OpenBSD 7.0 (GENERIC.MP) #232: Thu Sep 30 14:25:29 MDT 2021
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP

     Architecture: OpenBSD.amd64
     Machine     : amd64

 >Description:

     The *tput* command can be used to query or set terminal behaviour,
and is often used by application scripts for this purpose. The OpenBSD
version of this program calculates the expected number of arguments
from the target terminal capability string, and aborts if the number of
command line arguments does not match. The problem is that the program
calculates the wrong number of expected arguments if the target
capability string includes conditionals. For example, the '*setaf*' and
'*setab*' capabilities used to set the foreground and background colours
use conditionals to vary the output for different colour densities.

This problem has been noted before (search for tput on the sendbug
mailing list), and there is a workaround. Unfortunately that workaround
is unlikely to be implemented on 3rd party apps, and hence it limits
some packages from attaining OpenBSD compatibility.

 >How-To-Repeat:
     Set terminal type to *xterm-256color*, and execute "*tput setaf 3*".

     % echo $TERM
     xterm-256color
     % tput setaf 3
     tput: not enough arguments (3) for capability `setaf'

     The '*tput setaf 3*' command should set the foreground colour to
yellow. The command as shown is using the correct number of arguments.

 >Fix:
     The problem arises when processing string based terminfo
capabilities; capabilities retrieved from the terminfo database by
calling *tigetstr*().

* Boolean (tigetflag) and numeric (tigetnum) capability types are not
subject to the same calculation errors.

     When *tput* encounters a string-based terminfo capability it passes
the string to the "*process*" function to be "processed". It is in the
"*process*" function that the calculation for the expected number of
arguments fails.

* The relevant code for this can be found in
*/usr/src/usr.bin/tput/tput.c*.

     The "*process*" function takes three arguments: *cap*, *str*, and
*argv*.

* *cap*: should be a string identifying the terminfo capability name
(e.g. “*setaf*”).
* *str*: should be the value of the terminfo capability string
corresponding to the indicated cap name. In particular this should
be the string returned by calling tigetstr (e.g. after calling
tigetstr(“setaf”)).
* *argv*: should be a pointer to an array of strings containing the
normalized argument vector.

     The capability string (*str*) is the crucial argument for purposes
of correcting the error. The *process* function uses the capability
string to try to calculate how many arguments to expect. If the number
of command line arguments matches what it expects then it calls *tparm*,
passing *tparm* the capability string and the full argument array. If
the number of command line arguments does not match the number expected,
the program aborts and does not pass the request to the *tparm*
function. Basically it's a case of over-achieving, where adding extra
checks to increase robustness sometimes leads to less stability by way
of increasing complexity.

     Since I had never previously dug into the details of the terminfo
database, ANSI escape codes, or the tput command, I had to do a bit of
digging to figure out not only why the program was breaking, but what it
was supposed to be doing in the first place (I was trying to get a third
party app to run on OpenBSD and wasn't quite sure what the tput call was
supposed to be doing). As a result of that digging I was able to figure
out most of what was going on, so I will detail some of that here in
case it helps for anybody that wants to take a shot at fixing this.

     In essence the terminfo capability string defines what characters
should be sent to the terminal in order to enact different terminal
functions such as moving the cursor, or changing the text colour. For
the most part these constitute ANSI escape codes (see:
https://en.wikipedia.org/wiki/ANSI_escape_code), but they can also
involve hardware-dependent character sequences. In other words the
capability string differs dependent upon the target terminal, and the
target capability/function.

* Note: Terminfo is the newer library, but some functionality falls
back to the older termcap library. For the purposes of the tput
program the functionality of both libraries is similar.

     In order for the *process* function to figure out how many
arguments it should be passing to *tparm* for the target string
capability, it tries to figure out how many arguments the capability
string consumes and how many it spits out. The code looks like it might
work fine for simple cases, but it's kind of smelly. It doesn't process
the string the same way that the terminfo library would, and so while it
might work, it's not a good implementation. Furthermore, as noted, it
does fail for cases where the capability string includes conditionals.

* Apple does it differently: they use a lookup function, though they
note that their method is imperfect and has extensibility issues (see
the *tparm_type* function in
https://opensource.apple.com/source/ncurses/ncurses-7/ncurses/progs/tput.c.auto.html).

     Conditionals in a terminfo capability string are used to adjust the
output character sequence to accommodate different output standards or
protocols. For example, the *setaf* capability/command is used to set
the foreground colour, but different terminals employ different colour
densities and so there is a wide variety in the possible output codes to
deal with these varying capabilities. Furthermore programs used to
generate screen output may also use different colour densities. Where a
modern web app may focus on 24 bit colour, older terminal applications
may just use a simple 3 bit RGB designation. There is also 4 bit colour,
8 bit colour, etc. The *setaf* capability string deals with these
varying input and output scenarios by accepting values in any of the
applicable colour densities and adjusting the character output
accordingly. It does this through the use of conditionals.

     At present the *tput* program does not handle conditionals. It
doesn't even make an attempt to recognize and process the character
sequences associated with conditionals, and because of this it winds up
double-counting the outputs. For example the *setaf* command has three
possible output modes for the *xterm-256color* terminal type: 3 bit
colour (8 colours), 4 bit colour (8 colours in two different brightness
levels), and 8 bit colour (256 colours). But there is still only one
input value passed into the command, the target colour value. The trick
is that the terminfo capability string is used to modify the output
character sequence based on the colour density of the input.
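
     To make the double-counting concrete, here is a minimal Python
sketch. It is not the code from tput.c. It assumes a counter that treats
every %d or %s output operator as one expected argument, which is only a
stand-in for the kind of counting described above, and compares that
with taking the highest N used in any %pN push. The function names are
hypothetical, and the sketch ignores %i and older termcap-style strings
that do not use %pN.

```python
# setaf for xterm-256color, as quoted in this report.
SETAF = "\033[%?%p1%{8}%<%t3%p1%d%e%p1%{16}%<%t9%p1%{8}%-%d%e38;5;%p1%d%;m"


def count_output_ops(cap):
    """Naive count: each %d or %s is assumed to consume one argument."""
    count, i = 0, 0
    while i < len(cap):
        if cap[i] == "%" and i + 1 < len(cap):
            if cap[i + 1] in ("d", "s"):
                count += 1
            i += 2  # step over the operator character (also covers "%%")
        else:
            i += 1
    return count


def highest_param(cap):
    """Highest N in any %pN push; the same whichever branch is taken."""
    highest, i = 0, 0
    while i < len(cap):
        if cap[i] == "%" and i + 1 < len(cap):
            if cap[i + 1] == "p" and i + 2 < len(cap) and cap[i + 2].isdigit():
                highest = max(highest, int(cap[i + 2]))
            i += 2
        else:
            i += 1
    return highest


print(count_output_ops(SETAF))  # 3: one %d in each of the three branches
print(highest_param(SETAF))     # 1: only %p1 is ever pushed
```

     The first number is consistent with the (3) reported in the error
message, since each of the three branches ends in its own %d even though
only one branch ever runs.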

     The conditional elements within a terminfo string are marked by the
following character sequences: '*%?*', '*%t*', '*%e*', and '*%;*'. You
could probably construct modifications to the existing loop to handle
these conditionals, but I would suggest that a better approach would
probably involve using recursive calls to emulate the stack-based nature
of the capability strings. This is a bit hard to see until you
actually step through decoding a capability string to see how it
functions; it is not a classic imperative algorithm. It is
stack-based, like using the dc program to do calculations. You have to
look at the algorithm a little differently than you would for a
'C'-based conditional.

     To that end, consider the following example...

*Decoding the Capabilities String*

The following is my attempt to decode the machinations of the *setaf*
capabilities string for an *xterm-256color* terminal. Interpretation of
the character codes was based on a table that I found at IBM describing
the parameter mechanism (
https://www.ibm.com/docs/en/iis/9.1?topic=functions-tparm-function ).

*Cap String*:
|"\033[%?%p1%{8}%<%t3%p1%d%e%p1%{16}%<%t9%p1%{8}%-%d%e38;5;%p1%d%;m"|

1. \033[   -> ESC [ -> Control Sequence Introducer (ANSI Escape Codes)
     * See: https://en.wikipedia.org/wiki/ANSI_escape_code#CSIsection

2. %?%p1%{8}%<   -> IF (ARG1 < 8)
     * %?    -> Begin Conditional Expression (terminates at %;)
     * %p1   -> Push arg 1 onto stack.
     * %{8}  -> Push 8 onto stack.
     * %<    -> Pop the top two elements off the stack and compare them.
If arg 1 is less than 8, push 1 onto the stack; otherwise 0.

3. %t3%p1%d   -> THEN { Output(3); Output(ARG1); }
     * (Set Foreground Colour, ANSI CSI codes 30-37)
     * %t    -> Then
     * 3     -> Output(3) -> Prefix for mapping 3-bit RGB codes -> (ESC
[ 30-37 m ).
     * %p1   -> Push arg 1 onto stack.
     * %d    -> Pop top element off the stack and output it as a decimal
number.

4. %e%p1%{16}%<   -> ELSE IF (ARG1 < 16), interpretation of conditional
is similar to step 2.
     * %e    -> Else
     * %p1   -> Push arg 1 onto stack.
     * %{16} -> Push 16 onto stack.
     * %<    -> Pop the top two elements off the stack and compare them.
If arg 1 is less than 16, push 1 onto the stack; otherwise 0.

5. %t9%p1%{8}%-%d -> THEN { Output(9); Output(ARG1 - 8); }
     * (Set Bright Foreground Colour, ANSI CSI Codes 90-97)
     * %t    -> Then
     * 9     -> Output(9) -> Prefix for mapping bright 3-bit RGB codes
-> ( ESC [ 90-97 m ).
     * %p1   -> Push arg 1 onto stack (this is a 4-bit value with the
high bit set (8-15)).
     * %{8}  -> Push 8 onto stack.
     * %-    -> Calculate the difference of the top two stack items (p1
- 8), put result back on the stack.
     * %d    -> Pop top element off the stack and output it as a decimal
number.

6. %e38;5;%p1%d    -> ELSE { Output(“38;5;”); Output(ARG1); }
     * (Set 8-Bit Foreground Colour, ANSI CSI Code 38;5;n)
     * %e    -> Else
     * 38;5; -> Output(3); Output(8); Output(;); Output(5); Output(;);
     * %p1   -> Push arg 1 onto stack.
     * %d    -> Pop top element off the stack and output it as a decimal
number.

7. %;     -> Terminates the IF/THEN/ELSE Expression (returns processing
to normal).

8. m     -> Output(m) -> Terminates the ANSI CSI Sequence (e.g. ESC [ 0
m => ANSI Reset)
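
The eight steps above can be checked with a small stack machine. This is
a sketch in Python under several assumptions, not the terminfo library's
implementation: the names expand and skip_past are hypothetical, only
the operators that appear in this one string are handled, and the skip
is a flat scan that does not account for nested %? groups (setaf has
none). It assumes that %t pops the test result and, when that result is
zero, skips ahead to the next %e or %;, and that reaching %e after a
then-part has run skips ahead to %;.

```python
SETAF = "\033[%?%p1%{8}%<%t3%p1%d%e%p1%{16}%<%t9%p1%{8}%-%d%e38;5;%p1%d%;m"


def skip_past(cap, i, stops):
    """Return the index just past the next %<stop> operator (flat scan)."""
    while i < len(cap) - 1:
        if cap[i] == "%":
            if cap[i + 1] in stops:
                return i + 2
            i += 2
        else:
            i += 1
    return len(cap)


def expand(cap, *params):
    """Evaluate cap using only the operators decoded in steps 1-8."""
    out, stack, i = [], [], 0
    while i < len(cap):
        if cap[i] != "%":
            out.append(cap[i])           # literal character, copied as-is
            i += 1
            continue
        op = cap[i + 1]
        i += 2
        if op == "p":                    # %pN: push parameter N
            stack.append(params[int(cap[i]) - 1])
            i += 1
        elif op == "{":                  # %{n}: push integer constant n
            end = cap.index("}", i)
            stack.append(int(cap[i:end]))
            i = end + 1
        elif op == "<":                  # pop b, pop a, push (a < b)
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a < b else 0)
        elif op == "-":                  # pop b, pop a, push (a - b)
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "d":                  # pop and print as decimal
            out.append(str(stack.pop()))
        elif op == "t":                  # then: a false test skips the branch
            if stack.pop() == 0:
                i = skip_past(cap, i, ("e", ";"))
        elif op == "e":                  # then-part already ran: skip the rest
            i = skip_past(cap, i, (";",))
        elif op in ("?", ";"):           # markers only, nothing to do
            pass
        else:
            raise ValueError("operator %" + op + " not handled in this sketch")
    return "".join(out)


print(repr(expand(SETAF, 3)))    # '\x1b[33m'
print(repr(expand(SETAF, 11)))   # '\x1b[93m'
print(repr(expand(SETAF, 200)))  # '\x1b[38;5;200m'
```

In all three cases a single argument is consumed, whichever branch
produces the output.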

*On Processing Conditionals:*

* Note that the conditionals do not require a second ‘%?’ designator to
invoke nesting.
     o The '%?' sequence just marks the start of then/else processing.
It does not invoke branching.
     o The '%;' sequence marks the end of then/else processing.
     o It is not clear from the capability string whether '%?'
invokes a conditional processing mode (i.e. enables conditional
processing), or whether it is just a placeholder. Either case could work.
     o The '%;' seems to act as a definitive terminator, forcing any
then/else mode to be cleared.

* This is "stack-based", so think different!
     o The conditional test is implied by the ‘%<‘ operator (and its
        brethren), and branching by the '%t' operator; not the '%?'
        sequence.
     o The conditional test pushes a boolean onto the stack, which is
        read by the following "then" sequence (%t).
     o Except where you are using a boolean for some other purpose, a
        conditional sequence (e.g. '%..%<') should probably always be
        followed by a then sequence (%t).

* Hence ‘%t’ probably says “Check the top of the stack; then …”
     o If (TOS != 0), continue processing.
     o If (TOS == 0), continue reading bytes, but don’t process them.

* When ‘%e’ is hit, it probably doesn’t check the stack, it probably
just inverts the processing state invoked by "then".
     o Note that the value pushed onto the stack by the conditional
        will already have been consumed by ‘%t’.
     o But we don’t need to check the stack, we just need to invert the
        processing state set by ‘%t’.
     o The mental model is a little different than the one you use to
        think about C conditionals.

* Assuming that my interpretation of the then/else processing state is
correct, then hitting the expression terminator (%;) cancels any
active then/else state, forcing the processing state to a
known/normal state.

* The above is a guess at how the conditional expressions within a
capability string are handled, but assuming a stack-based paradigm I
think my interpretation of the way that it is processed makes sense.
In essence my interpretation is that by utilizing a stack-based
approach, the program does not bother to track its nesting level. It
just looks for ‘%?’, ‘%t’, ‘%e’, and ‘%;’ byte sequences, and alters
the processing state accordingly.
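
One case where the nesting level would matter is a conditional placed
inside the then-part of another. The Python sketch below assumes that
such nesting is permitted and that an evaluator skips a false branch by
scanning ahead in the string. The function name branch_end and the test
string NESTED are made up for illustration; NESTED is not taken from any
terminfo entry.

```python
def branch_end(cap, i, stop_at_else):
    """Scan from index i for the %; that closes the current conditional,
    or its %e when stop_at_else is true, stepping over any nested
    %? ... %; groups. Returns the index just past that operator."""
    depth = 0
    while i < len(cap) - 1:
        if cap[i] != "%":
            i += 1
            continue
        op = cap[i + 1]
        i += 2
        if op == "?":
            depth += 1                   # entering a nested conditional
        elif op == ";":
            if depth == 0:
                return i                 # end of the current conditional
            depth -= 1                   # end of a nested conditional
        elif op == "e" and depth == 0 and stop_at_else:
            return i                     # else-part of the current conditional
    return len(cap)


# Hypothetical string: IF p1 THEN (IF p2 THEN "A" ELSE "B") ELSE "C"
NESTED = "%?%p1%t%?%p2%tA%eB%;%eC%;"

# Index 7 is the first character after the outer %t.
print(NESTED[branch_end(NESTED, 7, True):])  # C%;
```

A scan that stopped at the first %e it met would land on the inner
else-part ("B") instead, so for strings like this one the evaluator does
need to count %? and %; pairs while it is skipping.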

Again, this is my first time looking into the details of the
termcap/terminfo libraries, so I may have misinterpreted some of the
details, but overall I think the essence of my analysis is correct.
--
Michael Hambly
Blackbird Software Design Ltd.
Email: [email protected]
