Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-12-23 Thread Donald Buczek

On 12/23/22 09:49, Thorsten Glaser wrote:

Donald Buczek dixit:


To be fair, this daemon doesn't use /proc/pid/stat for that, but /proc/pid/comm


Yes, and that’s proper. The field in /proc/pid/stat is size-limited
and so not necessarily distinct.


No, it is the process name itself, which is size limited, so in this regard it 
doesn't make a difference if you read it from /proc/pid/stat or /proc/pid/comm.


As /proc/pid/stat is also used in many places, it could as well use
that to avoid code duplication or reuse data already read from the
other source.


No, because the data in /stat is incomplete *and* anything using
it that would be affected by escaping was already broken.


"Incomplete" because if truncation?

The usage in my example is not already broken. Truncation doesn't happen, because the 
process name used is the fixed string "mxqd reaper".

A process name is limited to 15 characters. The limit is already in force when 
you use PR_SET_NAME, so there is no truncation when you read it back from 
procfs.

D.



bye,
//mirabilos


--
Donald Buczek
buc...@molgen.mpg.de
Tel: +49 30 8413 1433



Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-12-23 Thread Thorsten Glaser
Donald Buczek dixit:

> To be fair, this daemon doesn't use /proc/pid/stat for that, but 
> /proc/pid/comm

Yes, and that’s proper. The field in /proc/pid/stat is size-limited
and so not necessarily distinct.

> As /proc/pid/stat is also used in many places, it could as well use
> that to avoid code duplication or reuse data already read from the
> other source.

No, because the data in /stat is incomplete *and* anything using
it that would be affected by escaping was already broken.

bye,
//mirabilos
-- 
22:20⎜ The crazy that persists in his craziness becomes a master
22:21⎜ And the distance between the craziness and geniality is
only measured by the success 18:35⎜ "Psychotics are consistently
inconsistent. The essence of sanity is to be inconsistently inconsistent



Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-12-23 Thread Donald Buczek

On 12/22/22 21:28, Thorsten Glaser wrote:

Donald Buczek dixit:


No, Escaping would break existing programs which parse the line by
searching for the ')' from the right.


Huh? No!

The format is "(" + string + ") " after all, and only the string
part would get escaped.

The only visible change would be that programs containing a
whitespace character (and, ideally, a ‘(’) in their name would


')'


be escaped, which are these that are currently broken anyway.


You would still break programs which use the string for anything else than 
displaying it to the user.

We have a job control daemon, which stored PIDs of jobs it has started in a 
database. The daemon is able to restart. When it comes back up, it needs to 
find out, whether its jobs are still alive. The problem here is pid wrap: A job 
might be gone, but a unrelated new process might have gotten the recycled pid. 
To avoid confusion, the restarting job control daemon looks at the comm value 
of the process in question, which is known for its own jobs [1].

[1]: 
https://github.molgen.mpg.de/mariux64/mxq/blob/f3d9fb8c6143c3a884210b212ed4a8514a49a414/mxqd.c#L904

In this case, the fixed process name (set with prctl PR_SET_NAME) even contains a space: 
"mxqd reaper".

To be fair, this daemon doesn't use /proc/pid/stat for that, but /proc/pid/comm 
instead. So it wouldn't really be affected by your proposed change. But that is 
just a random design decision. As /proc/pid/stat is also used in many places, 
it could as well use that to avoid code duplication or reuse data already read 
from the other source.


And perhaps backslashes, if you decide to encode unambiguous,
but given the field length limit, I don’t think that was ever
a goal (both because I suspect this file was intended to be
used to get a quick overview and therefore deliberately shortens
and because the full info is available elsewhere), so no need to
encode unambiguously.


If some documentation suggests, that you can just parse it with scanf,
the documentation should be corrected/improved instead.


No. Someone recently did a survey, and most code in existence splits
by whitespace. Fix the kernel bug instead.


Yes, I've seen that on oss-security. No doubt, its easy to parse the file 
wrongly and no doubt, many programs do that.

I also acknowledge, that the man page and the implementation are in conflict.

However, afaik, 'correctness' in the kernel world is not defined by 
specifications, less by man pages, but by implementation. So this can't be 
declared a kernel bug just because it conflicts with a manpage.

Plus the manpage, which you use as a foundation that there is something to fix, 
doesn't talk about encoding, either. So even when some encoding was applied, 
the interface would still be in conflict with the manpage.

Generally, changes, which might break userspace, are not very welcome, even if the current 
implementation is ugly and difficult to work with. I just wanted to point out,  that there exists 
programs which interpret the comm value they got from procfs. If these programs happen to use 
/proc/pid/stat for reading it, they might fail, if the format was changed. And experience shows, 
that any (miss-)feature is used by somebody somewhere, so any "might break" is really a 
"will break".

I don't object to a change and I think its a valid position to risk a breakage 
of a very few programs for what you might gain here. But it is not self-evident 
by the declarative power of the manpage or otherwise. It's a judgement, which 
has to be taken.

Best
  Donald


Are you referring to proc(5) "The fields, in order, with their proper
scanf(3) format specifiers, are listed below" [1] or something else?


Yes.


The referenced manual page is wrong in regard to the length, too. There
is no 16 character limit to the field, because it can contain a
workqueue task name, too:


Probably used to be cut off after 16. Go fix that in the manpage
then. But do fix the encoding kernel-side.




In fact, if you start escaping now you might also break programs which
rely on the current 64 character limit.


Just cut off at the end then, like I suspect was done at 16 bytes
initially.

Or strip whitespace and closing parenthesis if present instead
of encoding them, or replace them with a question mark.

bye,
//mirabilos


--
Donald Buczek
buc...@molgen.mpg.de
Tel: +49 30 8413 1433



Bug#1024811: Re: Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-12-22 Thread Thorsten Glaser
Donald Buczek dixit:

>No, Escaping would break existing programs which parse the line by
>searching for the ')' from the right.

Huh? No!

The format is "(" + string + ") " after all, and only the string
part would get escaped.

The only visible change would be that programs containing a
whitespace character (and, ideally, a ‘(’) in their name would
be escaped, which are these that are currently broken anyway.
And perhaps backslashes, if you decide to encode unambiguous,
but given the field length limit, I don’t think that was ever
a goal (both because I suspect this file was intended to be
used to get a quick overview and therefore deliberately shortens
and because the full info is available elsewhere), so no need to
encode unambiguously.

>If some documentation suggests, that you can just parse it with scanf,
>the documentation should be corrected/improved instead.

No. Someone recently did a survey, and most code in existence splits
by whitespace. Fix the kernel bug instead.

>Are you referring to proc(5) "The fields, in order, with their proper
>scanf(3) format specifiers, are listed below" [1] or something else?

Yes.

>The referenced manual page is wrong in regard to the length, too. There
>is no 16 character limit to the field, because it can contain a
>workqueue task name, too:

Probably used to be cut off after 16. Go fix that in the manpage
then. But do fix the encoding kernel-side.

>In fact, if you start escaping now you might also break programs which
>rely on the current 64 character limit.

Just cut off at the end then, like I suspect was done at 16 bytes
initially.

Or strip whitespace and closing parenthesis if present instead
of encoding them, or replace them with a question mark.

bye,
//mirabilos
-- 
“It is inappropriate to require that a time represented as
 seconds since the Epoch precisely represent the number of
 seconds between the referenced time and the Epoch.”
-- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2



Bug#1024811: Re: Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-12-22 Thread Donald Buczek
On 12/22/22 1:53 AM, Thorsten Glaser wrote:
> On Sat, 26 Nov 2022, Alexey Dobriyan wrote:
> 
>> /proc never escaped "comm" field of /proc/*/stat.
> 
> Yes, that’s precisely the bug.
> 
>> To parse /proc/*/stat reliably, search for '(' from the beginning, then
>> for ')' backwards. Everything in between parenthesis is "comm".
> 
> That’s not guaranteed to stay reliable: fields can be, and have
> been in the past, added, and new %s fields will break this. Do
> not rely on it either.
> 
>> Everything else are numbers separated by spaces.
> 
> Currently, yes.
> 
> But the field is *clearly* documented as intended to be
> parsable by scanf(3), which splits on white space. So the
> Linux kernel MUST encode embedded whitespace so the
> documented(!) access method works.

No, Escaping would break existing programs which parse the line by searching 
for the ')' from the right. The format, surly, is ugly, but that is how it is.

If some documentation suggests, that you can just parse it with scanf, the 
documentation should be corrected/improved instead.

Are you referring to proc(5) "The fields, in order, with their proper scanf(3) 
format specifiers, are listed below" [1] or something else?

The referenced manual page is wrong in regard to the length, too. There is no 
16 character limit to the field, because it can contain a workqueue task name, 
too:

buczek@theinternet:/tmp$ cat /proc/27190/stat
27190 (kworker/11:2-mm_percpu_wq) I 2 0 0 0 -1 69238880 0 0 0 0 0 170 0 0 
20 0 1 0 109348986 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 
11 0 0 0 0 0 0 0 0 0 0 0 0 0

The current limit seems to be 64 characters [2] when escaping is off, as it is 
the case with /proc/pid/stat. But generally the length of the field and thereby 
of the whole line seems to be rather undefined. So to parse that, you either 
either need to do some try-and-restart-with-a-bigger-buffer dance or use a 
buffer size of which you just hope that it will be big enough for the forseable 
time. 

In fact, if you start escaping now you might also break programs which rely on 
the current 64 character limit.

Best

  Donald

[1]: https://man7.org/linux/man-pages/man5/proc.5.html
[2]: https://elixir.bootlin.com/linux/latest/source/fs/proc/array.c#L99

> 
> bye,
> //mirabilos
> 


-- 
Donald Buczek
buc...@molgen.mpg.de
Tel: +49 30 8413 1433



Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-12-21 Thread Thorsten Glaser
On Sat, 26 Nov 2022, Alexey Dobriyan wrote:

>/proc never escaped "comm" field of /proc/*/stat.

Yes, that’s precisely the bug.

>To parse /proc/*/stat reliably, search for '(' from the beginning, then
>for ')' backwards. Everything in between parenthesis is "comm".

That’s not guaranteed to stay reliable: fields can be, and have
been in the past, added, and new %s fields will break this. Do
not rely on it either.

>Everything else are numbers separated by spaces.

Currently, yes.

But the field is *clearly* documented as intended to be
parsable by scanf(3), which splits on white space. So the
Linux kernel MUST encode embedded whitespace so the
documented(!) access method works.

bye,
//mirabilos
-- 
15:41⎜ Somebody write a testsuite for helloworld :-)



Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-11-25 Thread Thorsten Glaser
Dixi quod…

>The effect is that /proc/[pid]/stat cannot be parsed the way it is
>documented, as it does not escape embedded whitespace characters;

… nor parenthesēs:

tglase@x61w:~ $ ./mk\)sh -c 'echo $$; sleep 10' &
[1] 13375
tglase@x61w:~ $ 13375
tglase@x61w:~ $ cat /proc/13375/stat
13375 (mk)sh) S 13330 13375 13330 34837 13377 4194304 124 0 0 0 0 0 0 0 20 0 1 
0 59029474 2977792 180 18446744073709551615 94911056490496 94911056739789 
140721459110048 0 0 0 0 0 134307847 0 0 0 17 1 0 0 0 0 0 94911056765744 
94911056773808 94911059955712 140721459115917 140721459115946 140721459115946 
140721459118064 0

This is… sad — extremely so. It does not escape anything.
I found proc_task_name(), which has an escape parameter,
which is set to false here, but it’s only for /status
which must escape newlines.

It’s used with false in /stat and /comm… the latter indeed
needing no escapes.

I’d argue that it needs a tristate argument, 0 for /comm
to not escape anything, 1 for /status to escape newlines,
and 2 for /stat to escape whitespace (and perhaps also a
closing parenthesis, using \x29, so splitting both using
scanf as indicated in the manpage and using parenthesēs
as people seem to do on the ’net is fixed).

bye,
//mirabilos
-- 
22:20⎜ The crazy that persists in his craziness becomes a master
22:21⎜ And the distance between the craziness and geniality is
only measured by the success 18:35⎜ "Psychotics are consistently
inconsistent. The essence of sanity is to be inconsistently inconsistent



Bug#1024811: linux: /proc/[pid]/stat unparsable

2022-11-25 Thread Thorsten Glaser
Package: src:linux
Version: 5.10.149-2
Severity: normal
Tags: upstream
X-Debbugs-Cc: t...@mirbsd.de, adobri...@gmail.com

tglase@x61w:~ $ cp /bin/mksh mk\ sh
tglase@x61w:~ $ ./mk\ sh -c 'echo $$; sleep 10' &
[1] 12862
tglase@x61w:~ $ 12862
cat /proc/12862/stat
12862 (mk sh) S 12649 12862 12649 34838 12864 4194304 124 0 0 0 0 0 0 0 20 0 1 
0 58970609 2977792 211 18446744073709551615 93898845827072 93898846076365 
140724844598496 0 0 0 0 0 134307847 0 0 0 17 0 0 0 0 0 0 93898846102320 
93898846110384 93898853036032 140724844603277 140724844603306 140724844603306 
140724844605424 0
tglase@x61w:~ $ cat /proc/12862/stat | hd
  31 32 38 36 32 20 28 6d  6b 20 73 68 29 20 53 20  |12862 (mk sh) S |
[…]

This directly violates the documentation of proc(5) which says…

  The  fields,  in order, with their proper scanf(3) format speci‐
  fiers, are listed below.  Whether or not certain of these fields

… while scanf(3) indicates:

   s  Matches  a  sequence  of  non-white-space  characters;  the next

The effect is that /proc/[pid]/stat cannot be parsed the way it is
documented, as it does not escape embedded whitespace characters;
this, as a result, makes this file realistically useless and anything
relying on it broken.

-- Package-specific info:
** Version:
Linux version 5.10.0-19-amd64 (debian-ker...@lists.debian.org) (gcc-10 (Debian 
10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP 
Debian 5.10.149-2 (2022-10-21)

** Command line:
BOOT_IMAGE=/SDcardBoot/vmlinuz-5.10.0-19-amd64 
root=/dev/mapper/vg--cSD-lv--root ro net.ifnames=0 vga=792

** Tainted: W (512)
 * kernel issued warning

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: LENOVO
product_name: 7673AG4
product_version: ThinkPad X61
chassis_vendor: LENOVO
chassis_version: Not Available
bios_vendor: LENOVO
bios_version: 7NET30WW (1.11 )
board_vendor: LENOVO
board_name: 7673AG4
board_version: Not Available

** Loaded modules:
cdc_acm
apple_mfi_fastcharge
fuse
cpufreq_ondemand
tun
ctr
ccm
cpufreq_powersave
snd_hda_codec_analog
snd_hda_codec_generic
snd_hda_intel
i915
snd_intel_dspcfg
soundwire_intel
soundwire_generic_allocation
snd_soc_core
iwl4965
iwlegacy
snd_compress
soundwire_cadence
mac80211
snd_hda_codec
coretemp
snd_hda_core
kvm_intel
snd_hwdep
drm_kms_helper
cfg80211
kvm
soundwire_bus
thinkpad_acpi
cec
snd_pcm
pcmcia
drm
iTCO_wdt
nvram
snd_timer
ppdev
ledtrig_audio
intel_pmc_bxt
libarc4
snd
evdev
yenta_socket
i2c_algo_bit
soundcore
rfkill
iTCO_vendor_support
irqbypass
pcmcia_rsrc
watchdog
parport_pc
pcmcia_core
serio_raw
pcspkr
sg
parport
ac
acpi_cpufreq
button
ecb
aes_generic
libaes
crypto_simd
cryptd
glue_helper
xts
dm_crypt
dm_mod
ext4
crc16
mbcache
jbd2
crc32c_generic
sd_mod
t10_pi
crc_t10dif
crct10dif_generic
mmc_block
crct10dif_common
ata_generic
ahci
ata_piix
sdhci_pci
libahci
cqhci
ehci_pci
uhci_hcd
libata
e1000e
scsi_mod
ehci_hcd
sdhci
i2c_i801
i2c_smbus
psmouse
lpc_ich
mmc_core
usbcore
ptp
usb_common
pps_core
battery
video

** PCI devices:
not available

** USB devices:
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 002: ID 17ef:1000 Lenovo ThinkPad X6 UltraBase
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub


-- System Information:
Debian Release: 11.5
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-19-amd64 (SMP w/2 CPU threads)
Kernel taint flags: TAINT_WARN
Locale: LANG=C, LC_CTYPE=C (charmap=UTF-8) (ignored: LC_ALL set to C.UTF-8), 
LANGUAGE not set
Shell: /bin/sh linked to /bin/lksh
Init: sysvinit (via /sbin/init)

Versions of packages linux-image-5.10.0-19-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.140
ii  kmod28-1
ii  linux-base  4.6

Versions of packages linux-image-5.10.0-19-amd64 recommends:
pn  apparmor 
ii  firmware-linux-free  20200122-1

Versions of packages linux-image-5.10.0-19-amd64 suggests:
pn  debian-kernel-handbook  
ii  grub-pc 2.06-3~deb11u4
pn  linux-doc-5.10  

Versions of packages linux-image-5.10.0-19-amd64 is related to:
pn  firmware-amd-graphics 
pn  firmware-atheros  
pn  firmware-bnx2 
pn  firmware-bnx2x
pn  firmware-brcm80211
pn  firmware-cavium   
pn  firmware-intel-sound  
pn  firmware-intelwimax   
pn  firmware-ipw2x00  
pn  firmware-ivtv 
ii  firmware-iwlwifi  20210315-3
pn  firmware-libertas 
pn  firmware-linux-nonfree
pn  firmware-misc-nonfree 
pn