Re: ps and AIX field descriptors
On Tue 21 Feb 2023 at 16:06:48 (+0100), Andreas Leha wrote: > David Wright writes: > > On Mon 20 Feb 2023 at 10:39:21 (+0100), Andreas Leha wrote: > >> Greg Wooledge writes: > >> > On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote: > >> >> But even that's not enough > >> >> because the field width is somewhat variable: try ps -eo '%c | %z > >> >> | %a' > >> >> (We can still use | to make the problem somewhat more obvious.) > >> > > >> > Oh wow. Yeah, OK, that's not really solvable. > >> > > >> > For those who don't want to try to reverse engineer David's conclusion, > >> > or who don't just happen to stumble upon it with their current process > >> > list, here's what I'm seeing: > >> > > >> > COMMAND | VSZ | COMMAND > >> > systemd | 164140 | /sbin/init > >> > kthreadd | 0 | [kthreadd] > >> > rcu_gp | 0 | [rcu_gp] > >> > rcu_par_gp | 0 | [rcu_par_gp] > >> > [...] > >> > steamwebhelper | 4631064 | > >> > /home/greg/.steam/debian-installation/[...] > >> > [...] > >> > chrome_crashpad | 33567792 | > >> > /opt/google/chrome/chrome_crashpad_handler[...] > >> > [...] > >> > kworker/3:0-eve | 0 | [kworker/3:0-events] > >> > > >> > ps appears to guess an initial maximum width for the VSZ field, but > >> > when a value comes along that exceeds the guessed maximum, it simply > >> > shoves the field barrier over. It doesn't even become the new maximum, > >> > with all of the fields aligning after that. It's just a one-time shove, > >> > breaking the current line only. > >> > > >> > Therefore, parsing the header line cannot give us enough information to > >> > insert field separators correctly in body lines after the fact. > >> > >> > >> Dear all, > >> > >> Thanks for chiming in. The example was indeed simplified and I am using > >> %a which can contain internal whitespace. 
> >> > >> This is the command I was using previously: > >> > >> ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu > >> I now replaced it with > >> > >> ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu | sed -E 's/([0-9]+) > >> (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/' > >> > >> This works, but is of course cumbersome to maintain. > >> > >> Again, thanks for all the comments! > > > > I think there are a few too many assumptions in there; > > in particular, numbers in %a will match patterns designed > > to match cpu and mem, because you can't prevent sed from > > being greedy (except with the [^ … … ]+ construction, to > > restrict what it matches). > > > > This version makes a few assumptions as well: > > . that the new format matches the old one (mine) if the > > delimiters given are a single space (like '%p %c %C'), > > or stripped (like "%mem" and '%a', but not ' %a'). > > . the short command is always 15 chars wide even if all > > the commands in the table are shorter, eg with ps -o. > > . I don't have any of those new-fangled extra-long PIDs > > yet today. > > > > It might well break if a CPU or MEM is running at 100%. > > That's not easily tested here. > > > > I've reordered the columns on the first pass, so that the > > numeric ones (with their limited character set) come first, > > which means I can use an auxiliary character for > > correcting the spacing. (The spaces between the columns get > > comingled with the leading spaces of numbers.) The second > > pass sorts that out and processes the heading. 
> > > > $ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu | sed -E 's/( *[0-9]+) > > (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;' | sed -E 's/([^~]+)~ > > ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) > > /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' | less > > $ > > > > This is the same, except I deliberately chose _ for the auxiliary > > character, knowing that short commands are stuffed with underscores: > > > > $ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu | sed -E 's/( *[0-9]+) > > (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1_\3_\2\4/;' | sed -E 's/([^_]+)_ > > ([^_]+)_(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) > > /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' | less > > $ > > > > Examples: > > > > PID|COMMAND|%CPU %MEM|COMMAND > >9798|firefox-esr| 2.5 5.8|firefox-esr > > 16143|Isolated Web Co| 1.8 2.2|/usr/lib/firefox-esr/firefox-esr > > -contentproc -childID 11 -isForBrowser -prefsLen 47676 -prefMapSize 232307 > > -jsInitLen 277276 -parentBuildID 20230214011352 -appDir > > /usr/lib/firefox-esr/browser 9798 true tab > >1242|Xorg | 1.0 1.4|/usr/lib/xorg/Xorg -nolisten tcp :0 vt1 > > -keeptty -auth /tmp/serverauth.FxvBp8B7Qn > > [ … ] > > 8|mm_percpu_wq | 0.0 0.0|[mm_percpu_wq] > > 9|rcu_tasks_rude_| 0.0 0.0|[rcu_tasks_rude_] > > 10|rcu_tasks_trace| 0.0 0.0|[rcu_tasks_trace] > > > > An incestuous one, with -o rather -eo: > > > > PID|COMMAND|%CPU
Re: ps and AIX field descriptors
David Wright writes: > On Mon 20 Feb 2023 at 10:39:21 (+0100), Andreas Leha wrote: >> Greg Wooledge writes: >> > On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote: >> >> But even that's not enough >> >> because the field width is somewhat variable: try ps -eo '%c | %z | >> >> %a' >> >> (We can still use | to make the problem somewhat more obvious.) >> > >> > Oh wow. Yeah, OK, that's not really solvable. >> > >> > For those who don't want to try to reverse engineer David's conclusion, >> > or who don't just happen to stumble upon it with their current process >> > list, here's what I'm seeing: >> > >> > COMMAND | VSZ | COMMAND >> > systemd | 164140 | /sbin/init >> > kthreadd | 0 | [kthreadd] >> > rcu_gp | 0 | [rcu_gp] >> > rcu_par_gp | 0 | [rcu_par_gp] >> > [...] >> > steamwebhelper | 4631064 | /home/greg/.steam/debian-installation/[...] >> > [...] >> > chrome_crashpad | 33567792 | >> > /opt/google/chrome/chrome_crashpad_handler[...] >> > [...] >> > kworker/3:0-eve | 0 | [kworker/3:0-events] >> > >> > ps appears to guess an initial maximum width for the VSZ field, but >> > when a value comes along that exceeds the guessed maximum, it simply >> > shoves the field barrier over. It doesn't even become the new maximum, >> > with all of the fields aligning after that. It's just a one-time shove, >> > breaking the current line only. >> > >> > Therefore, parsing the header line cannot give us enough information to >> > insert field separators correctly in body lines after the fact. >> >> >> Dear all, >> >> Thanks for chiming in. The example was indeed simplified and I am using >> %a which can contain internal whitespace. >> >> This is the command I was using previously: >> >> ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu >> >> I now replaced it with >> >> ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu | sed -E 's/([0-9]+) >> (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/' >> >> This works, but is of course cumbersome to maintain. 
>> >> Again, thanks for all the comments! > > I think there are a few too many assumptions in there; > in particular, numbers in %a will match patterns designed > to match cpu and mem, because you can't prevent sed from > being greedy (except with the [^ … … ]+ construction, to > restrict what it matches). > > This version makes a few assumptions as well: > . that the new format matches the old one (mine) if the > delimiters given are a single space (like '%p %c %C'), > or stripped (like "%mem" and '%a', but not ' %a'). > . the short command is always 15 chars wide even if all > the commands in the table are shorter, eg with ps -o. > . I don't have any of those new-fangled extra-long PIDs > yet today. > > It might well break if a CPU or MEM is running at 100%. > That's not easily tested here. > > I've reordered the columns on the first pass, so that the > numeric ones (with their limited character set) come first, > which means I can use an auxiliary character for > correcting the spacing. (The spaces between the columns get > comingled with the leading spaces of numbers.) The second > pass sorts that out and processes the heading. 
> > $ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu | sed -E 's/( *[0-9]+) > (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;' | sed -E 's/([^~]+)~ > ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) /\1|\2|/;s/%MEM > COMMAND/%MEM|COMMAND/;' | less > $ > > This is the same, except I deliberately chose _ for the auxiliary > character, knowing that short commands are stuffed with underscores: > > $ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu | sed -E 's/( *[0-9]+) > (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1_\3_\2\4/;' | sed -E 's/([^_]+)_ > ([^_]+)_(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) /\1|\2|/;s/%MEM > COMMAND/%MEM|COMMAND/;' | less > $ > > Examples: > > PID|COMMAND|%CPU %MEM|COMMAND >9798|firefox-esr| 2.5 5.8|firefox-esr > 16143|Isolated Web Co| 1.8 2.2|/usr/lib/firefox-esr/firefox-esr > -contentproc -childID 11 -isForBrowser -prefsLen 47676 -prefMapSize 232307 > -jsInitLen 277276 -parentBuildID 20230214011352 -appDir > /usr/lib/firefox-esr/browser 9798 true tab >1242|Xorg | 1.0 1.4|/usr/lib/xorg/Xorg -nolisten tcp :0 vt1 > -keeptty -auth /tmp/serverauth.FxvBp8B7Qn > [ … ] > 8|mm_percpu_wq | 0.0 0.0|[mm_percpu_wq] > 9|rcu_tasks_rude_| 0.0 0.0|[rcu_tasks_rude_] > 10|rcu_tasks_trace| 0.0 0.0|[rcu_tasks_trace] > > An incestuous one, with -o rather -eo: > > PID|COMMAND|%CPU %MEM|COMMAND >1694|bash | 0.0 0.1|bash > 23486|ps | 0.0 0.0|ps -o %p %c %C -o %mem -o %a --sort=-%cpu > 23487|sed| 0.0 0.0|sed -E s/( *[0-9]+) (.{15})( +[0-9.]+ > +[0-9.]+) (.*$)/\1~\3~\2\4/; > 23488|sed| 0.0 0.0|sed -E s/([^~]+)~ >
Re: ps and AIX field descriptors
On Mon 20 Feb 2023 at 10:39:21 (+0100), Andreas Leha wrote: > Greg Wooledge writes: > > On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote: > >> But even that's not enough > >> because the field width is somewhat variable: try ps -eo '%c | %z | > >> %a' > >> (We can still use | to make the problem somewhat more obvious.) > > > > Oh wow. Yeah, OK, that's not really solvable. > > > > For those who don't want to try to reverse engineer David's conclusion, > > or who don't just happen to stumble upon it with their current process > > list, here's what I'm seeing: > > > > COMMAND | VSZ | COMMAND > > systemd | 164140 | /sbin/init > > kthreadd | 0 | [kthreadd] > > rcu_gp | 0 | [rcu_gp] > > rcu_par_gp | 0 | [rcu_par_gp] > > [...] > > steamwebhelper | 4631064 | /home/greg/.steam/debian-installation/[...] > > [...] > > chrome_crashpad | 33567792 | > > /opt/google/chrome/chrome_crashpad_handler[...] > > [...] > > kworker/3:0-eve | 0 | [kworker/3:0-events] > > > > ps appears to guess an initial maximum width for the VSZ field, but > > when a value comes along that exceeds the guessed maximum, it simply > > shoves the field barrier over. It doesn't even become the new maximum, > > with all of the fields aligning after that. It's just a one-time shove, > > breaking the current line only. > > > > Therefore, parsing the header line cannot give us enough information to > > insert field separators correctly in body lines after the fact. > > > Dear all, > > Thanks for chiming in. The example was indeed simplified and I am using > %a which can contain internal whitespace. > > This is the command I was using previously: > > ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu > > I now replaced it with > > ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu | sed -E 's/([0-9]+) > (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/' > > This works, but is of course cumbersome to maintain. > > Again, thanks for all the comments! 
I think there are a few too many assumptions in there;
in particular, numbers in %a will match patterns designed
to match cpu and mem, because you can't prevent sed from
being greedy (except with the [^ … … ]+ construction, to
restrict what it matches).

This version makes a few assumptions as well:
 . that the new format matches the old one (mine) if the
   delimiters given are a single space (like '%p %c %C'),
   or stripped (like "%mem" and '%a', but not ' %a').
 . the short command is always 15 chars wide even if all
   the commands in the table are shorter, eg with ps -o.
 . I don't have any of those new-fangled extra-long PIDs
   yet today.

It might well break if a CPU or MEM is running at 100%.
That's not easily tested here.

I've reordered the columns on the first pass, so that the
numeric ones (with their limited character set) come first,
which means I can use an auxiliary character for
correcting the spacing. (The spaces between the columns get
comingled with the leading spaces of numbers.) The second
pass sorts that out and processes the heading.

$ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu |
    sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;' |
    sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' |
    less
$

This is the same, except I deliberately chose _ for the auxiliary
character, knowing that short commands are stuffed with underscores:

$ ps -eo '%p %c %C' -o "%mem" -o '%a' --sort=-%cpu |
    sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1_\3_\2\4/;' |
    sed -E 's/([^_]+)_ ([^_]+)_(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) /\1|\2|/;s/%MEM COMMAND/%MEM|COMMAND/;' |
    less
$

Examples:

PID|COMMAND|%CPU %MEM|COMMAND
9798|firefox-esr| 2.5 5.8|firefox-esr
16143|Isolated Web Co| 1.8 2.2|/usr/lib/firefox-esr/firefox-esr
  -contentproc -childID 11 -isForBrowser -prefsLen 47676 -prefMapSize 232307
  -jsInitLen 277276 -parentBuildID 20230214011352 -appDir
  /usr/lib/firefox-esr/browser 9798 true tab
1242|Xorg | 1.0 1.4|/usr/lib/xorg/Xorg -nolisten tcp :0 vt1
  -keeptty -auth /tmp/serverauth.FxvBp8B7Qn
[ … ]
8|mm_percpu_wq | 0.0 0.0|[mm_percpu_wq]
9|rcu_tasks_rude_| 0.0 0.0|[rcu_tasks_rude_]
10|rcu_tasks_trace| 0.0 0.0|[rcu_tasks_trace]

An incestuous one, with -o rather than -eo:

PID|COMMAND|%CPU %MEM|COMMAND
1694|bash | 0.0 0.1|bash
23486|ps | 0.0 0.0|ps -o %p %c %C -o %mem -o %a --sort=-%cpu
23487|sed| 0.0 0.0|sed -E s/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/;
23488|sed| 0.0 0.0|sed -E s/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/;s/^( *PID) (COMMAND) /\1|\2|/;s/%MEM|COMMAND/%MEM|COMMAND/;
23489|less | 0.0 0.0|less

Cheers,
David.
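For anyone who wants to try the two body-line substitutions without depending on a live process list, here is a minimal sketch. It assumes ps lays each line out as a 5-column PID, a 15-column short command and two 4-column percentages; those widths are an assumption and are faked here with printf. The helper names two_pass and fake_ps_line and the sample rows are made up, and the header substitutions from the pipeline above are left out.

```shell
#!/bin/sh
# two_pass: hypothetical helper wrapping only the two body-line
# substitutions from the pipeline above (header handling omitted).
# Pass 1 moves the numeric columns in front of the 15-char short
# command and marks the column boundaries with the auxiliary ~.
# Pass 2 drops the separator space, restores the order and inserts |.
two_pass() {
    sed -E 's/( *[0-9]+) (.{15})( +[0-9.]+ +[0-9.]+) (.*$)/\1~\3~\2\4/' |
    sed -E 's/([^~]+)~ ([^~]+)~(.{15})(.*)/\1|\3|\2|\4/'
}

# fake_ps_line: made-up stand-in for one line of ps output.
# The field widths (5, 15, 4, 4) are assumptions, not taken from ps.
fake_ps_line() {
    printf '%5s %-15s %4s %4s %s\n' "$@"
}

# Arguments containing numbers and spaces do not confuse the split,
# because the short command is matched by width rather than by content.
fake_ps_line 1242 Xorg 1.0 1.4 '/usr/lib/xorg/Xorg -nolisten tcp :0 vt1' | two_pass
fake_ps_line 16143 'Isolated Web Co' 1.8 2.2 'firefox-esr -childID 11 -prefsLen 47676' | two_pass
```

The sketch only holds while the assumed widths hold; a PID or percentage wider than its column would shift the 15-character window, which is the caveat already stated in the list of assumptions.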
Re: ps and AIX field descriptors
Greg Wooledge writes:

> On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote:
>> But even that's not enough
>> because the field width is somewhat variable: try ps -eo '%c | %z | %a'
>> (We can still use | to make the problem somewhat more obvious.)
>
> Oh wow. Yeah, OK, that's not really solvable.
>
> For those who don't want to try to reverse engineer David's conclusion,
> or who don't just happen to stumble upon it with their current process
> list, here's what I'm seeing:
>
> COMMAND         |    VSZ | COMMAND
> systemd         | 164140 | /sbin/init
> kthreadd        |      0 | [kthreadd]
> rcu_gp          |      0 | [rcu_gp]
> rcu_par_gp      |      0 | [rcu_par_gp]
> [...]
> steamwebhelper  | 4631064 | /home/greg/.steam/debian-installation/[...]
> [...]
> chrome_crashpad | 33567792 | /opt/google/chrome/chrome_crashpad_handler[...]
> [...]
> kworker/3:0-eve |      0 | [kworker/3:0-events]
>
> ps appears to guess an initial maximum width for the VSZ field, but
> when a value comes along that exceeds the guessed maximum, it simply
> shoves the field barrier over. It doesn't even become the new maximum,
> with all of the fields aligning after that. It's just a one-time shove,
> breaking the current line only.
>
> Therefore, parsing the header line cannot give us enough information to
> insert field separators correctly in body lines after the fact.

Dear all,

Thanks for chiming in. The example was indeed simplified and I am using
%a which can contain internal whitespace.

This is the command I was using previously:

  ps -eo '%p|%c|%C' -o "%mem" -o '|%a' --sort=-%cpu

I now replaced it with

  ps -eo '%p %c %C' -o "%mem" -o ' %a' --sort=-%cpu | sed -E 's/([0-9]+) (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/'

This works, but is of course cumbersome to maintain.

Again, thanks for all the comments!

Best,
Andreas
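One caveat with the sed expression above can be shown on a made-up line: (.+) is greedy, so when the argument list itself contains two adjacent numbers, the separators land there instead of around %CPU and %MEM. The sketch below is only an illustration; the wrapper name naive_split and the single-spaced sample lines are hypothetical and much simpler than real ps output.

```shell
#!/bin/sh
# naive_split: hypothetical wrapper around the sed expression quoted above.
naive_split() {
    sed -E 's/([0-9]+) (.+) ([0-9]+.?[0-9]?) ([0-9]+.?[0-9]?) (.+)/\1|\2|\3|\4|\5/'
}

# Made-up lines in the order PID, short command, %CPU, %MEM, arguments.

# No numbers in the arguments: the split is the intended one.
printf '%s\n' '123 bash 0.0 0.1 bash -l' | naive_split
# -> 123|bash|0.0|0.1|bash -l

# Numbers in the arguments: the greedy (.+) swallows %CPU and %MEM
# and the separators end up inside the argument list.
printf '%s\n' '123 sleep 0.0 0.1 sleep 10 20 30' | naive_split
# -> 123|sleep 0.0 0.1 sleep|10|20|30
```

Anchoring on fixed widths or on a restricted character class, rather than on (.+), is what avoids this.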
Re: ps and AIX field descriptors
Reco writes:

> Hi.
>
> On Fri, Feb 17, 2023 at 07:46:23AM +0100, Andreas Leha wrote:
>> Now my question: How can I restore the previous behaviour that allowed
>> other than whitespace separators between fields?
>
> diff -purw procps-3.3.17/ps/sortformat.c procps-4.0.2/src/ps/sortformat.c
> shows me that:
>
> @@ -128,22 +127,24 @@ static const char *aix_format_parse(sf_n
>    items = 0;
>    walk = sfn->sf;
>    /* state machine */ {
> -  int c;
> +  int c = *walk++;
>    initial:
> -    c = *walk++;
>      if(c=='%')    goto get_desc;
>      if(!c)        goto looks_ok;
>    /* get_text: */
>      items++;
> -  get_more_text:
> +  get_more:
>      c = *walk++;
>      if(c=='%')    goto get_desc;
> -    if(c)         goto get_more_text;
> +    if(c==' ')    goto get_more;
> +    if(c)         goto aix_oops;
>      goto looks_ok;
>    get_desc:
>      items++;
>      c = *walk++;
> -    if(c)         goto initial;
> +    if(c&&c!=' ') goto initial;
> +    return _("missing AIX field descriptor");
> +  aix_oops:
>      return _("improper AIX field descriptor");
>    looks_ok:
>      ;
>
> If you look at "get_more" label, you'll notice that "old" version of
> procps (bullseye's) checked for any character after "%" block.
> "New" one (bookworm's) explicitly checks for space, and goes to
> "aix_oops" in any other case.
>
> And there is no #ifdefs, no environment variable checks, no options
> etc.
>
> So, to answer your question - currently the only way to restore the
> behaviour you want is to patch procps and rebuild it.
>
> Reco

Dear Reco,

Thanks for the fast and accurate answer!

What a shame for this change...

Best,
Andreas
Re: ps and AIX field descriptors
On Sun, Feb 19, 2023 at 12:04:22PM -0600, David Wright wrote:
> But even that's not enough
> because the field width is somewhat variable: try ps -eo '%c | %z | %a'
> (We can still use | to make the problem somewhat more obvious.)

Oh wow. Yeah, OK, that's not really solvable.

For those who don't want to try to reverse engineer David's conclusion,
or who don't just happen to stumble upon it with their current process
list, here's what I'm seeing:

COMMAND         |    VSZ | COMMAND
systemd         | 164140 | /sbin/init
kthreadd        |      0 | [kthreadd]
rcu_gp          |      0 | [rcu_gp]
rcu_par_gp      |      0 | [rcu_par_gp]
[...]
steamwebhelper  | 4631064 | /home/greg/.steam/debian-installation/[...]
[...]
chrome_crashpad | 33567792 | /opt/google/chrome/chrome_crashpad_handler[...]
[...]
kworker/3:0-eve |      0 | [kworker/3:0-events]

ps appears to guess an initial maximum width for the VSZ field, but
when a value comes along that exceeds the guessed maximum, it simply
shoves the field barrier over. It doesn't even become the new maximum,
with all of the fields aligning after that. It's just a one-time shove,
breaking the current line only.

Therefore, parsing the header line cannot give us enough information to
insert field separators correctly in body lines after the fact.
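The one-time shove can be reproduced without ps, since printf also treats a field width as a minimum. The sketch below is an illustration under stated assumptions: the 15-column COMMAND and 6-column VSZ widths are assumed, the helper names fake_row and split_fixed are made up, and /opt/example is a placeholder path. It cuts each line at the offsets a header parser would derive and shows the wide row coming out wrong.

```shell
#!/bin/sh
# fake_row: made-up stand-in for one line of space-separated ps output,
# assuming a 15-column COMMAND and a 6-column right-justified VSZ.
# An 8-digit value overflows its column and pushes the rest right.
fake_row() {
    printf '%-15s %6s %s\n' "$1" "$2" "$3"
}

# split_fixed: insert | at the column offsets implied by the header
# (columns 1-15, 17-22 and 24 onwards), as a header parser would.
split_fixed() {
    awk '{ printf "%s|%s|%s\n", substr($0, 1, 15), substr($0, 17, 6), substr($0, 24) }'
}

# Fits the assumed width: the three fields come out intact.
fake_row systemd 164140 /sbin/init | split_fixed

# Overflows the assumed width: VSZ is truncated to 335677, one digit
# falls on the separator column and is lost, and the last digit leaks
# into the third field.
fake_row chrome_crashpad 33567792 /opt/example | split_fixed
```

Because only the overflowing line is shifted, no single set of offsets derived from the header works for every line.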
Re: ps and AIX field descriptors
On Sat 18 Feb 2023 at 09:53:01 (-0500), Greg Wooledge wrote:

> It should be noted that there appear to be two TYPES of data fields:
> numeric and string. Look at this example:
>
> unicorn:~$ ps -o '%C %g %n %p %U %a'
> %CPU RGROUP    NI     PID USER     COMMAND
>  0.0 greg       0    1010 greg     bash
>  0.0 greg       0 2094243 greg     ps -o %C %g %n %p %U %a
>
> The "%CPU", "NI" and "PID" fields are right-justified. The "RGROUP",
> "USER" and "COMMAND" fields are left-justified.
>
> This means the header parser will also need to contain knowledge about
> each header -- whether it's left-justified (string) or right- (numeric).

Oh, it's somewhat worse than that. You need to know the maximum length
that can be shown for left-justified strings, and also what the maximum
width of a numeric field is. But even that's not enough
because the field width is somewhat variable: try ps -eo '%c | %z | %a'
(We can still use | to make the problem somewhat more obvious.)

> With all those pieces, I think the problem can be "solved", although I
> wouldn't care to write such a thing. Time spent on writing that
> parser/filter would be better spent advocating to restore the previous
> functionality, IMHO.

We don't know the priorities of the OP, and whether the example was
somewhat simplified just for posing the question. If it wasn't, then
quick and dirty might suffice. That's up to the OP, and what they
consider the chances are of the "fix" being accepted.

Much of man ps seems to be an exercise in vagueness. Somewhat.

Cheers,
David.
Re: ps and AIX field descriptors
On Fri, Feb 17, 2023 at 10:28:43PM -0600, David Wright wrote:
> On Fri 17 Feb 2023 at 11:30:43 (-0500), Greg Wooledge wrote:
> > On Fri, Feb 17, 2023 at 09:20:34AM -0600, David Wright wrote:
> > > $ ps -eo '%p %C' | sed -e 's/\([^ ]\+\) /\1|/;'
>
> > Eww, GNUisms.
>
> I don't keep a list of differences to hand, but I guess you'd prefer:
>
> $ ps -eo '%p %C' | sed -E 's/([^ ]+) /\1|/;'
> PID|%CPU
> 1| 0.0
> 2| 0.0

That's *slightly* better, in that it works on both GNU and BSD (and maybe
some future edition of POSIX -- I've been told they're considering
adopting the -E flag).

A truly portable version would either use \{1,\} or would simply repeat
itself: [^ ][^ ]*

(The latter is by far the more common, especially in scripts that target
ancient Unixes where \{1,\} might not work.)

However, a bigger issue is that your command only works for the two-column
case. It doesn't support more columns:

unicorn:~$ ps -o '%p|%U|%a'
PID|USER|COMMAND
1010|greg|bash
2093990|greg|ps -o %p|%U|%a
unicorn:~$ ps -o '%p %U %a' | sed -E 's/([^ ]+) /\1|/;'
PID|USER COMMAND
1010|greg bash
2093858|greg ps -o %p %U %a
2093859|greg sed -E s/([^ ]+) /\1|/;

And even if you extended it in the "obvious" way, it would break down on
columns that can contain internal whitespace (e.g. %a).

> > That aside, a workaround like this is ugly and should
> > not be needed.
>
> The OP wrote: "How can I restore the previous behaviour that
> allowed other than whitespace separators between fields?"
>
> If that's the required format, what are the alternatives?

Because data fields can contain internal whitespace, the only way to
parse the output of ps and determine the right spot to put pipelines
(or whatever) would be to parse the header row. All of the headers
listed under "AIX format specifiers" are free of whitespace. So, one
could in theory parse that line, determine the column numbers where each
data field will end, and then replace spaces with pipelines in those
column numbers.

It should be noted that there appear to be two TYPES of data fields:
numeric and string. Look at this example:

unicorn:~$ ps -o '%C %g %n %p %U %a'
%CPU RGROUP    NI     PID USER     COMMAND
 0.0 greg       0    1010 greg     bash
 0.0 greg       0 2094243 greg     ps -o %C %g %n %p %U %a

The "%CPU", "NI" and "PID" fields are right-justified. The "RGROUP",
"USER" and "COMMAND" fields are left-justified.

This means the header parser will also need to contain knowledge about
each header -- whether it's left-justified (string) or right- (numeric).

With all those pieces, I think the problem can be "solved", although I
wouldn't care to write such a thing. Time spent on writing that
parser/filter would be better spent advocating to restore the previous
functionality, IMHO.
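The two portable spellings mentioned earlier in this message can be tried on fixed input. This is a small sketch only; the helper names sep_repeat and sep_interval and the two sample lines are made up, and both helpers share the two-column limitation described above.

```shell
#!/bin/sh
# Both helpers replace the first "run of non-spaces followed by a space"
# with the same run followed by |, using only basic regular expressions
# (no -E flag, no GNU \+ extension).
sep_repeat()   { sed 's/\([^ ][^ ]*\) /\1|/'; }
sep_interval() { sed 's/\([^ ]\{1,\}\) /\1|/'; }

# Made-up two-column sample in the shape of ps -eo '%p %C' output.
printf '%s\n' '    PID %CPU' '      1  0.0' | sep_repeat
printf '%s\n' '    PID %CPU' '      1  0.0' | sep_interval
```

Only the first separator on each line is replaced, so a third column would be left untouched.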
Re: ps and AIX field descriptors
On Fri 17 Feb 2023 at 11:30:43 (-0500), Greg Wooledge wrote:
> On Fri, Feb 17, 2023 at 09:20:34AM -0600, David Wright wrote:
> > On Fri 17 Feb 2023 at 10:05:20 (+0300), Reco wrote:
> > > So, to answer your question - currently the only way to restore the
> > > behaviour you want is to patch procps and rebuild it.
>
> Fabulous analysis.
>
> > Or, depending on the context, you could of course restore
> > the appearance of the output with sed:
> >
> > $ ps -eo '%p %C' | sed -e 's/\([^ ]\+\) /\1|/;'
> > PID|%CPU
> > 1| 0.0
> > 2| 0.0
> > 3| 0.0
> > 4| 0.0
> > 6| 0.0
> > [ … ]
>
> Eww, GNUisms.

I don't keep a list of differences to hand, but I guess you'd prefer:

$ ps -eo '%p %C' | sed -E 's/([^ ]+) /\1|/;'
PID|%CPU
1| 0.0
2| 0.0
[ … ]

> That aside, a workaround like this is ugly and should
> not be needed.

The OP wrote: "How can I restore the previous behaviour that
allowed other than whitespace separators between fields?"

If that's the required format, what are the alternatives?

> This sounds like a bug in procps that should be reported,
> if it hasn't already.

And how long before it's fixed? As for whether it /is/ a bug,
I guess that depends on the interpretation of somewhat in
"This ps supports AIX format descriptors, which work somewhat like the
formatting codes of printf(1) and printf(3)."
That's beyond my pay-grade.

Cheers,
David.
Re: ps and AIX field descriptors
On 2023-02-17 at 15:21, Greg Wooledge wrote:

> On Fri, Feb 17, 2023 at 01:49:59PM -0500, The Wanderer wrote:
>
>> I can't speak to the new version, as I'm still running 3.3.17-7.1 on my
>> machine - but I can at least note that the man page from that older
>> version also explicitly says "a blank-separated or comma-separated list"
>> in the description for the '-o' option, but the given command line (with
>> a pipe for a separator) still works. (This may reflect only the same
>> thing that you said, above.)
>>
>> It's entirely possible that this was an intentional change, to bring
>> things in line with the documentation, and/or even one required in order
>> to be in line with some appropriate specification.
>
> Hmm... fair point. POSIX says:
>
>    The application shall ensure that the format specification is a list of
>    names presented as a single argument, <blank> or <comma>-separated.
>
> So the behavior of ps in bullseye is an extension of the POSIX requirement,
> and apparently only applies to the "AIX format specifiers", which are yet
> another extension.

FWIW, at least in the 3.3.17-7.1 version I have, the man page claims
that ps supports the POSIXLY_CORRECT environment variable. This type of
more-strict POSIX compliance, even when it means being less capable,
strikes me as the sort of thing that should probably be gated behind a
check for that flag.

> Nevertheless, this change definitely feels like a regression. Scripts
> that are relying on the bullseye behavior, with full output formatting
> capability, will no longer work in bookworm.
>
> I'm not using any such scripts, so I don't have anything to lose here,
> but the OP might seriously want to get a bug report in, at least to
> learn whether this is an intended regression, or an accidental one.

Agreed,

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man.         -- George Bernard Shaw
Re: ps and AIX field descriptors
On Fri, Feb 17, 2023 at 01:49:59PM -0500, The Wanderer wrote:
> I can't speak to the new version, as I'm still running 3.3.17-7.1 on my
> machine - but I can at least note that the man page from that older
> version also explicitly says "a blank-separated or comma-separated list"
> in the description for the '-o' option, but the given command line (with
> a pipe for a separator) still works. (This may reflect only the same
> thing that you said, above.)
>
> It's entirely possible that this was an intentional change, to bring
> things in line with the documentation, and/or even one required in order
> to be in line with some appropriate specification.

Hmm... fair point. POSIX says:

   The application shall ensure that the format specification is a list of
   names presented as a single argument, <blank> or <comma>-separated.

So the behavior of ps in bullseye is an extension of the POSIX requirement,
and apparently only applies to the "AIX format specifiers", which are yet
another extension.

On bullseye:

unicorn:~$ ps -o '%U|%p|%a'
USER|PID|COMMAND
greg| 1010|bash
greg|2023595|ps -o %U|%p|%a
unicorn:~$ ps -o 'user|pid|args'
error: unknown user-defined format specifier "user|pid|args"
[...]

Nevertheless, this change definitely feels like a regression. Scripts
that are relying on the bullseye behavior, with full output formatting
capability, will no longer work in bookworm.

I'm not using any such scripts, so I don't have anything to lose here,
but the OP might seriously want to get a bug report in, at least to
learn whether this is an intended regression, or an accidental one.
Re: ps and AIX field descriptors
On 2023-02-17 at 13:21, debian-u...@howorth.org.uk wrote:

> Greg Wooledge wrote:
>
>> This sounds like a bug in procps that should be reported, if it
>> hasn't already.
>
> It might be a bug if it disagreed with its documentation. But do the
> docs say anything about this feature? What they do say is that you
> should be able to use comma-separated field descriptions instead of
> space-separated I think. Is that true for the new version?

I can't speak to the new version, as I'm still running 3.3.17-7.1 on my
machine - but I can at least note that the man page from that older
version also explicitly says "a blank-separated or comma-separated list"
in the description for the '-o' option, but the given command line (with
a pipe for a separator) still works. (This may reflect only the same
thing that you said, above.)

It's entirely possible that this was an intentional change, to bring
things in line with the documentation, and/or even one required in order
to be in line with some appropriate specification.

It might be interesting to dig up the actual commit message from the
upstream development commit that made this change, and possibly also any
here's-what's-changed-in-the-new-version documentation (whether in
Debian or upstream), to see whether there's anything that sheds light on
whether this was intentional and if so what the reason was.

The answer might, at least, inform the approach to be taken in arguing
for the restoration of this functionality in a potential future version.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man.         -- George Bernard Shaw
Re: ps and AIX field descriptors
Greg Wooledge wrote:
> This sounds like a bug in procps that should be reported, if it
> hasn't already.

It might be a bug if it disagreed with its documentation. But do the
docs say anything about this feature? What they do say is that you
should be able to use comma-separated field descriptions instead of
space-separated I think. Is that true for the new version?
Re: ps and AIX field descriptors
On Fri, Feb 17, 2023 at 09:20:34AM -0600, David Wright wrote:
> On Fri 17 Feb 2023 at 10:05:20 (+0300), Reco wrote:
> > So, to answer your question - currently the only way to restore the
> > behaviour you want is to patch procps and rebuild it.

Fabulous analysis.

> Or, depending on the context, you could of course restore
> the appearance of the output with sed:
>
> $ ps -eo '%p %C' | sed -e 's/\([^ ]\+\) /\1|/;'
> PID|%CPU
> 1| 0.0
> 2| 0.0
> 3| 0.0
> 4| 0.0
> 6| 0.0
> [ … ]

Eww, GNUisms.

That aside, a workaround like this is ugly and should not be needed.
This sounds like a bug in procps that should be reported, if it
hasn't already.
Re: ps and AIX field descriptors
On Fri 17 Feb 2023 at 10:05:20 (+0300), Reco wrote:
> On Fri, Feb 17, 2023 at 07:46:23AM +0100, Andreas Leha wrote:
> > Now my question: How can I restore the previous behaviour that allowed
> > other than whitespace separators between fields?
>
> diff -purw procps-3.3.17/ps/sortformat.c procps-4.0.2/src/ps/sortformat.c
> shows me that:
>
> @@ -128,22 +127,24 @@ static const char *aix_format_parse(sf_n
>    items = 0;
>    walk = sfn->sf;
>    /* state machine */ {
> -  int c;
> +  int c = *walk++;
>    initial:
> -    c = *walk++;
>      if(c=='%')    goto get_desc;
>      if(!c)        goto looks_ok;
>    /* get_text: */
>      items++;
> -  get_more_text:
> +  get_more:
>      c = *walk++;
>      if(c=='%')    goto get_desc;
> -    if(c)         goto get_more_text;
> +    if(c==' ')    goto get_more;
> +    if(c)         goto aix_oops;
>      goto looks_ok;
>    get_desc:
>      items++;
>      c = *walk++;
> -    if(c)         goto initial;
> +    if(c&&c!=' ') goto initial;
> +    return _("missing AIX field descriptor");
> +  aix_oops:
>      return _("improper AIX field descriptor");
>    looks_ok:
>      ;
>
> If you look at "get_more" label, you'll notice that "old" version of
> procps (bullseye's) checked for any character after "%" block.
> "New" one (bookworm's) explicitly checks for space, and goes to
> "aix_oops" in any other case.
>
> And there is no #ifdefs, no environment variable checks, no options
> etc.
>
> So, to answer your question - currently the only way to restore the
> behaviour you want is to patch procps and rebuild it.

Or, depending on the context, you could of course restore
the appearance of the output with sed:

$ ps -eo '%p %C' | sed -e 's/\([^ ]\+\) /\1|/;'
PID|%CPU
1| 0.0
2| 0.0
3| 0.0
4| 0.0
6| 0.0
[ … ]

Cheers,
David.
Re: ps and AIX field descriptors
Hi.

On Fri, Feb 17, 2023 at 07:46:23AM +0100, Andreas Leha wrote:
> Now my question: How can I restore the previous behaviour that allowed
> other than whitespace separators between fields?

diff -purw procps-3.3.17/ps/sortformat.c procps-4.0.2/src/ps/sortformat.c
shows me that:

@@ -128,22 +127,24 @@ static const char *aix_format_parse(sf_n
   items = 0;
   walk = sfn->sf;
   /* state machine */ {
-  int c;
+  int c = *walk++;
   initial:
-    c = *walk++;
     if(c=='%')    goto get_desc;
     if(!c)        goto looks_ok;
   /* get_text: */
     items++;
-  get_more_text:
+  get_more:
     c = *walk++;
     if(c=='%')    goto get_desc;
-    if(c)         goto get_more_text;
+    if(c==' ')    goto get_more;
+    if(c)         goto aix_oops;
     goto looks_ok;
   get_desc:
     items++;
     c = *walk++;
-    if(c)         goto initial;
+    if(c&&c!=' ') goto initial;
+    return _("missing AIX field descriptor");
+  aix_oops:
     return _("improper AIX field descriptor");
   looks_ok:
     ;

If you look at "get_more" label, you'll notice that "old" version of
procps (bullseye's) checked for any character after "%" block.
"New" one (bookworm's) explicitly checks for space, and goes to
"aix_oops" in any other case.

And there is no #ifdefs, no environment variable checks, no options
etc.

So, to answer your question - currently the only way to restore the
behaviour you want is to patch procps and rebuild it.

Reco
ps and AIX field descriptors
Hi all,

I am facing a strange issue.

This command used to work

  ps -eo '%p|%C'

Now, on a debian testing machine only

  ps -eo '%p %C'

works. Running

  ps -eo '%p|%C'

results in this error:

  error: improper AIX field descriptor

ps --version says 'ps from procps-ng 4.0.2'

Now my question: How can I restore the previous behaviour that allowed
other than whitespace separators between fields?

Thanks in advance!
Andreas