Mark Millard wrote:


On 2020-Jul-10, at 16:12, Yuri Pankov <yuripv at yuripv.dev> wrote:

Mark Millard wrote:
On 2020-Jul-10, at 11:05, Yuri Pankov <yuripv at yuripv.dev> wrote:
Steve Wills wrote:
On 11/28/19 4:08 PM, Mark Millard via svn-src-head wrote:
Author: daichi
Date: Fri Sep 20 17:37:23 2019
New Revision: 352558
URL:
https://svnweb.freebsd.org/changeset/base/352558


Log:
    top(1): support multibyte characters in command names (ARGV array)
    depending on locale.
     - add setlocale()
     - remove printable() function
     - add VIS_OCTAL and VIS_SAFE to the flag of strvisx() to display
       non-printable characters that do not use C-style backslash sequences
       in three digit octal sequence, or remove it
    This change allows multibyte characters to be displayed according to
    locale. If it is recognized as a non-display character according to the
    locale, it is displayed in three digit octal sequence.


Initially picking on tab characters as an example of what is
probably a somewhat broader issue . . .

Ever since this change, characters like tabs that do not fit
in the next character cell when output, but for which they
are !isprintable(...), now mess up the top display. Again
using tab as an example: line wrapping from the text having
been shifted over by more than one character cell. top does
not track the line wrapping result in how it decides what
to output for the following display updates.

Apologies for the way late reply here, but I just now bothered tracking this down. This 
commit seems to be the cause of some corruption I'm seeing in long running top(1) as 
well. As Mark mentions, if I use "hh" it clears up. Should I open a bugzilla 
bug? I can share screenshots of the corruption, such as:
https://i.imgur.com/Xqlwf9h.png
https://i.imgur.com/Jv0d5NU.png

Does removing VIS_SAFE fixes the issue for you?

As for original Mark's report (which I missed), removing isprintable() doesn't 
look wrong as vis(3) should take of its functionality (and in multibyte-aware 
way).
vis (as used) and the old isprintable logic are not
equivalent when multi-byte is not needed/involved.
Otherwise I'd not have had anything to ever report.
If vis can do what is needed, more work needed to
be done when the change was made in order to avoid
msesed up displays in single-byte contexts.
Also, is there an easy way to reproduce this?
The following sort of command (the empty space inside quoted
text are tab characters):
# tr '0\n      1\n     2\n     3\n     4\n     5\n     6\n     7\n     8\n' '\t0      
 \t1     \t2     \t3     \t4     \t5     \t6     \t7     \t8' < /dev/zero > 
/dev/null
causes my 200 character wide window running top to show:
32920 root        100    0  12764Ki    2420Ki CPU3     3   2:22  99.87% tr 0\\n 
1\\n    2\\n    3\\n    4\\n    5\\n    6\\n    7\\n    8\\n \\t0       \\t1    
\\t2    \\t3    \\t4    \\t5    \\t6    \\t733   \\t8       20        7172      
5448Ki CPU23   23   0:00   0.04% top -HiSCazopid
But that does not show where the lines wrap at the edges of the window,
so breaking it up explicitly after the first "\" in \\7:
32920 root        100    0  12764Ki    2420Ki CPU3     3   2:22  99.87% tr 0\\n 
1\\n    2\\n    3\\n    4\\n    5\\n    6\\n    7\\n    8\\n \\t0       \\t1    
\\t2    \\t3    \\t4    \\t5    \\t6    \
\t733   \\t8       20        7172      5448Ki CPU23   23   0:00   0.04% top 
-HiSCazopid
Note how \n turned into \\n , taking an extra character for
each \n . Similarly for \t vs. \\t . (Other examples do
similarly.)
The tab characters really do use more than one character cell
on the display (sometimes).
The text from the tr command ends up spread across 2 lines
as things look like in the window where top is running.
I ran top in another ssh session first and then the tr command.
Before running the tr command, top showed as:
33019 root         20    0  17172Ki    5448Ki CPU24   24   0:00   0.05% top 
-HiSCazopid
If you do not end up with top listed just after tr in top's output,
then it will not be top's line that ends up partially overwritten.
If you have wider windows, you may need more text in the tr quoted
strings.
In another experiment I inserted a large number of backspace characters
(control-H's) at the front of the first quoted string in the tr command.
The top output displayed:
0\\n5 ro1\\n    2\\93   3\\n12764\\n   25\\ni CP6\\n   97\\n:12 100.00\t0r 
\nHiS\\t1pid \\t2    \\t3    \\t4    \\t5    \\t6    \\t
33094 root         20    0  17172Ki    5488Ki CPU21   21   0:00   0.06% top 
-HiSCazopid
In other words, backspace moved the cursor position back over prior
fields on the line and then the later line content overwrote those
fields instead of being after "tr" someplace (or truncated off).
Note that part of "-HiSCazopid" shows up on both lines. The extra
is from when top was running but tr had not started yet. top is
not managing text replacement correctly for output characters that
end up not being just "in" the next character-cell on the terminal.
The same sort of result happens when instead adding just one
carriage return (control-M) in front of that first quuoted
string instead:
0\\n8 ro1\\n    2\\92   3\\n12764\\n   25\\ni CP6\\n  117\\n:11 100.00\t0r 
\nHiS\\t1pid \\t2    \\t3    \\t4    \\t5    \\t6    \\t
33094 root         20    0  17172Ki    5488Ki CPU23   23   0:00   0.04% top 
-HiSCazopid
I do not intend to try to find all examples of characters that
cause problems but used to not cause problems.
 From what I've seen, cursor positioning escape character sequences
seem to be sent through and cause overwrites at arbitrary places
on screen, based on the escape sequence content. There are command
lines around that contain such sequences. So I sometimes see the
first few lines of top's output have garbage text from commands
that were listed below at some point overwriting the top text.
Part of what is going on is top avoiding rewriting characters
that its tracking indicates have not been updated. When the
actual display and that supposed-tracking mismatch, the
display ends up wrong when updated (bad text continues to
display).
The text in commands should not make "top -a" output mess up
the display of other lines in top's output, nor of other
top output fields on the same line. In my view, if some usage
contexts need otherwise, it should take an extra command line
option to put top in a mode that might do such things. The
default behavior should strictly avoid having such things
happen.

Thanks.

The attached diff seems to take care of the issue for me, adding VIS_TAB and 
removing VIS_SAFE, which can be blamed for passing through the following:

VIS_SAFE   Currently this form allows space, tab, newline, backspace,
           bell, and return — in addition to all graphic characters —
           unencoded.
<top.txt>

A quick test suggests agreement. We will see how it
looks for on-going use.

But I'll note that top's man page should document the
translations that are being used: it is not the same
text that top produced before -r352558 and one should
be able to read the man page to find out how to
interpret what top reports for the likes of top -a .

I think this was taken care of in r352568, and what it says now is correct:

Non-printable characters in the command line are encoded in C-style backslash sequences or a three digit octal sequences.

(It does not appear that escape sequences or vertical
tab would have gone through unencoded. So I'm still
unclear how I ever had the top few lines of top's
output messed up by command text. So it is also
unclear that this change would make a difference
for such. We will see over time if that text is
ever messed up.)

_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to