Bug#1024811: linux: /proc/[pid]/stat unparsable
On 12/23/22 09:49, Thorsten Glaser wrote: Donald Buczek dixit: To be fair, this daemon doesn't use /proc/pid/stat for that, but /proc/pid/comm Yes, and that’s proper. The field in /proc/pid/stat is size-limited and so not necessarily distinct. No, it is the process name itself, which is size limited, so in this regard it doesn't make a difference if you read it from /proc/pid/stat or /proc/pid/comm. As /proc/pid/stat is also used in many places, it could as well use that to avoid code duplication or reuse data already read from the other source. No, because the data in /stat is incomplete *and* anything using it that would be affected by escaping was already broken. "Incomplete" because if truncation? The usage in my example is not already broken. Truncation doesn't happen, because the process name used is the fixed string "mxqd reaper". A process name is limited to 15 characters. The limit is already in force when you use PR_SET_NAME, so there is no truncation when you read it back from procfs. D. bye, //mirabilos -- Donald Buczek buc...@molgen.mpg.de Tel: +49 30 8413 1433
Bug#1024811: linux: /proc/[pid]/stat unparsable
Donald Buczek dixit: > To be fair, this daemon doesn't use /proc/pid/stat for that, but > /proc/pid/comm Yes, and that’s proper. The field in /proc/pid/stat is size-limited and so not necessarily distinct. > As /proc/pid/stat is also used in many places, it could as well use > that to avoid code duplication or reuse data already read from the > other source. No, because the data in /stat is incomplete *and* anything using it that would be affected by escaping was already broken. bye, //mirabilos -- 22:20⎜ The crazy that persists in his craziness becomes a master 22:21⎜ And the distance between the craziness and geniality is only measured by the success 18:35⎜ "Psychotics are consistently inconsistent. The essence of sanity is to be inconsistently inconsistent
Bug#1024811: linux: /proc/[pid]/stat unparsable
On 12/22/22 21:28, Thorsten Glaser wrote: Donald Buczek dixit: No, Escaping would break existing programs which parse the line by searching for the ')' from the right. Huh? No! The format is "(" + string + ") " after all, and only the string part would get escaped. The only visible change would be that programs containing a whitespace character (and, ideally, a ‘(’) in their name would ')' be escaped, which are these that are currently broken anyway. You would still break programs which use the string for anything else than displaying it to the user. We have a job control daemon, which stored PIDs of jobs it has started in a database. The daemon is able to restart. When it comes back up, it needs to find out, whether its jobs are still alive. The problem here is pid wrap: A job might be gone, but a unrelated new process might have gotten the recycled pid. To avoid confusion, the restarting job control daemon looks at the comm value of the process in question, which is known for its own jobs [1]. [1]: https://github.molgen.mpg.de/mariux64/mxq/blob/f3d9fb8c6143c3a884210b212ed4a8514a49a414/mxqd.c#L904 In this case, the fixed process name (set with prctl PR_SET_NAME) even contains a space: "mxqd reaper". To be fair, this daemon doesn't use /proc/pid/stat for that, but /proc/pid/comm instead. So it wouldn't really be affected by your proposed change. But that is just a random design decision. As /proc/pid/stat is also used in many places, it could as well use that to avoid code duplication or reuse data already read from the other source. And perhaps backslashes, if you decide to encode unambiguous, but given the field length limit, I don’t think that was ever a goal (both because I suspect this file was intended to be used to get a quick overview and therefore deliberately shortens and because the full info is available elsewhere), so no need to encode unambiguously. If some documentation suggests, that you can just parse it with scanf, the documentation should be corrected/improved instead. No. Someone recently did a survey, and most code in existence splits by whitespace. Fix the kernel bug instead. Yes, I've seen that on oss-security. No doubt, its easy to parse the file wrongly and no doubt, many programs do that. I also acknowledge, that the man page and the implementation are in conflict. However, afaik, 'correctness' in the kernel world is not defined by specifications, less by man pages, but by implementation. So this can't be declared a kernel bug just because it conflicts with a manpage. Plus the manpage, which you use as a foundation that there is something to fix, doesn't talk about encoding, either. So even when some encoding was applied, the interface would still be in conflict with the manpage. Generally, changes, which might break userspace, are not very welcome, even if the current implementation is ugly and difficult to work with. I just wanted to point out, that there exists programs which interpret the comm value they got from procfs. If these programs happen to use /proc/pid/stat for reading it, they might fail, if the format was changed. And experience shows, that any (miss-)feature is used by somebody somewhere, so any "might break" is really a "will break". I don't object to a change and I think its a valid position to risk a breakage of a very few programs for what you might gain here. But it is not self-evident by the declarative power of the manpage or otherwise. It's a judgement, which has to be taken. Best Donald Are you referring to proc(5) "The fields, in order, with their proper scanf(3) format specifiers, are listed below" [1] or something else? Yes. The referenced manual page is wrong in regard to the length, too. There is no 16 character limit to the field, because it can contain a workqueue task name, too: Probably used to be cut off after 16. Go fix that in the manpage then. But do fix the encoding kernel-side. In fact, if you start escaping now you might also break programs which rely on the current 64 character limit. Just cut off at the end then, like I suspect was done at 16 bytes initially. Or strip whitespace and closing parenthesis if present instead of encoding them, or replace them with a question mark. bye, //mirabilos -- Donald Buczek buc...@molgen.mpg.de Tel: +49 30 8413 1433
Bug#1024811: Re: Bug#1024811: linux: /proc/[pid]/stat unparsable
Donald Buczek dixit: >No, Escaping would break existing programs which parse the line by >searching for the ')' from the right. Huh? No! The format is "(" + string + ") " after all, and only the string part would get escaped. The only visible change would be that programs containing a whitespace character (and, ideally, a ‘(’) in their name would be escaped, which are these that are currently broken anyway. And perhaps backslashes, if you decide to encode unambiguous, but given the field length limit, I don’t think that was ever a goal (both because I suspect this file was intended to be used to get a quick overview and therefore deliberately shortens and because the full info is available elsewhere), so no need to encode unambiguously. >If some documentation suggests, that you can just parse it with scanf, >the documentation should be corrected/improved instead. No. Someone recently did a survey, and most code in existence splits by whitespace. Fix the kernel bug instead. >Are you referring to proc(5) "The fields, in order, with their proper >scanf(3) format specifiers, are listed below" [1] or something else? Yes. >The referenced manual page is wrong in regard to the length, too. There >is no 16 character limit to the field, because it can contain a >workqueue task name, too: Probably used to be cut off after 16. Go fix that in the manpage then. But do fix the encoding kernel-side. >In fact, if you start escaping now you might also break programs which >rely on the current 64 character limit. Just cut off at the end then, like I suspect was done at 16 bytes initially. Or strip whitespace and closing parenthesis if present instead of encoding them, or replace them with a question mark. bye, //mirabilos -- “It is inappropriate to require that a time represented as seconds since the Epoch precisely represent the number of seconds between the referenced time and the Epoch.” -- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2
Bug#1024811: Re: Bug#1024811: linux: /proc/[pid]/stat unparsable
On 12/22/22 1:53 AM, Thorsten Glaser wrote: > On Sat, 26 Nov 2022, Alexey Dobriyan wrote: > >> /proc never escaped "comm" field of /proc/*/stat. > > Yes, that’s precisely the bug. > >> To parse /proc/*/stat reliably, search for '(' from the beginning, then >> for ')' backwards. Everything in between parenthesis is "comm". > > That’s not guaranteed to stay reliable: fields can be, and have > been in the past, added, and new %s fields will break this. Do > not rely on it either. > >> Everything else are numbers separated by spaces. > > Currently, yes. > > But the field is *clearly* documented as intended to be > parsable by scanf(3), which splits on white space. So the > Linux kernel MUST encode embedded whitespace so the > documented(!) access method works. No, Escaping would break existing programs which parse the line by searching for the ')' from the right. The format, surly, is ugly, but that is how it is. If some documentation suggests, that you can just parse it with scanf, the documentation should be corrected/improved instead. Are you referring to proc(5) "The fields, in order, with their proper scanf(3) format specifiers, are listed below" [1] or something else? The referenced manual page is wrong in regard to the length, too. There is no 16 character limit to the field, because it can contain a workqueue task name, too: buczek@theinternet:/tmp$ cat /proc/27190/stat 27190 (kworker/11:2-mm_percpu_wq) I 2 0 0 0 -1 69238880 0 0 0 0 0 170 0 0 20 0 1 0 109348986 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 11 0 0 0 0 0 0 0 0 0 0 0 0 0 The current limit seems to be 64 characters [2] when escaping is off, as it is the case with /proc/pid/stat. But generally the length of the field and thereby of the whole line seems to be rather undefined. So to parse that, you either either need to do some try-and-restart-with-a-bigger-buffer dance or use a buffer size of which you just hope that it will be big enough for the forseable time. In fact, if you start escaping now you might also break programs which rely on the current 64 character limit. Best Donald [1]: https://man7.org/linux/man-pages/man5/proc.5.html [2]: https://elixir.bootlin.com/linux/latest/source/fs/proc/array.c#L99 > > bye, > //mirabilos > -- Donald Buczek buc...@molgen.mpg.de Tel: +49 30 8413 1433
Bug#1024811: linux: /proc/[pid]/stat unparsable
On Sat, 26 Nov 2022, Alexey Dobriyan wrote: >/proc never escaped "comm" field of /proc/*/stat. Yes, that’s precisely the bug. >To parse /proc/*/stat reliably, search for '(' from the beginning, then >for ')' backwards. Everything in between parenthesis is "comm". That’s not guaranteed to stay reliable: fields can be, and have been in the past, added, and new %s fields will break this. Do not rely on it either. >Everything else are numbers separated by spaces. Currently, yes. But the field is *clearly* documented as intended to be parsable by scanf(3), which splits on white space. So the Linux kernel MUST encode embedded whitespace so the documented(!) access method works. bye, //mirabilos -- 15:41⎜ Somebody write a testsuite for helloworld :-)
Bug#1024811: linux: /proc/[pid]/stat unparsable
Dixi quod… >The effect is that /proc/[pid]/stat cannot be parsed the way it is >documented, as it does not escape embedded whitespace characters; … nor parenthesēs: tglase@x61w:~ $ ./mk\)sh -c 'echo $$; sleep 10' & [1] 13375 tglase@x61w:~ $ 13375 tglase@x61w:~ $ cat /proc/13375/stat 13375 (mk)sh) S 13330 13375 13330 34837 13377 4194304 124 0 0 0 0 0 0 0 20 0 1 0 59029474 2977792 180 18446744073709551615 94911056490496 94911056739789 140721459110048 0 0 0 0 0 134307847 0 0 0 17 1 0 0 0 0 0 94911056765744 94911056773808 94911059955712 140721459115917 140721459115946 140721459115946 140721459118064 0 This is… sad — extremely so. It does not escape anything. I found proc_task_name(), which has an escape parameter, which is set to false here, but it’s only for /status which must escape newlines. It’s used with false in /stat and /comm… the latter indeed needing no escapes. I’d argue that it needs a tristate argument, 0 for /comm to not escape anything, 1 for /status to escape newlines, and 2 for /stat to escape whitespace (and perhaps also a closing parenthesis, using \x29, so splitting both using scanf as indicated in the manpage and using parenthesēs as people seem to do on the ’net is fixed). bye, //mirabilos -- 22:20⎜ The crazy that persists in his craziness becomes a master 22:21⎜ And the distance between the craziness and geniality is only measured by the success 18:35⎜ "Psychotics are consistently inconsistent. The essence of sanity is to be inconsistently inconsistent
Bug#1024811: linux: /proc/[pid]/stat unparsable
Package: src:linux Version: 5.10.149-2 Severity: normal Tags: upstream X-Debbugs-Cc: t...@mirbsd.de, adobri...@gmail.com tglase@x61w:~ $ cp /bin/mksh mk\ sh tglase@x61w:~ $ ./mk\ sh -c 'echo $$; sleep 10' & [1] 12862 tglase@x61w:~ $ 12862 cat /proc/12862/stat 12862 (mk sh) S 12649 12862 12649 34838 12864 4194304 124 0 0 0 0 0 0 0 20 0 1 0 58970609 2977792 211 18446744073709551615 93898845827072 93898846076365 140724844598496 0 0 0 0 0 134307847 0 0 0 17 0 0 0 0 0 0 93898846102320 93898846110384 93898853036032 140724844603277 140724844603306 140724844603306 140724844605424 0 tglase@x61w:~ $ cat /proc/12862/stat | hd 31 32 38 36 32 20 28 6d 6b 20 73 68 29 20 53 20 |12862 (mk sh) S | […] This directly violates the documentation of proc(5) which says… The fields, in order, with their proper scanf(3) format speci‐ fiers, are listed below. Whether or not certain of these fields … while scanf(3) indicates: s Matches a sequence of non-white-space characters; the next The effect is that /proc/[pid]/stat cannot be parsed the way it is documented, as it does not escape embedded whitespace characters; this, as a result, makes this file realistically useless and anything relying on it broken. -- Package-specific info: ** Version: Linux version 5.10.0-19-amd64 (debian-ker...@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.149-2 (2022-10-21) ** Command line: BOOT_IMAGE=/SDcardBoot/vmlinuz-5.10.0-19-amd64 root=/dev/mapper/vg--cSD-lv--root ro net.ifnames=0 vga=792 ** Tainted: W (512) * kernel issued warning ** Kernel log: Unable to read kernel log; any relevant messages should be attached ** Model information sys_vendor: LENOVO product_name: 7673AG4 product_version: ThinkPad X61 chassis_vendor: LENOVO chassis_version: Not Available bios_vendor: LENOVO bios_version: 7NET30WW (1.11 ) board_vendor: LENOVO board_name: 7673AG4 board_version: Not Available ** Loaded modules: cdc_acm apple_mfi_fastcharge fuse cpufreq_ondemand tun ctr ccm cpufreq_powersave snd_hda_codec_analog snd_hda_codec_generic snd_hda_intel i915 snd_intel_dspcfg soundwire_intel soundwire_generic_allocation snd_soc_core iwl4965 iwlegacy snd_compress soundwire_cadence mac80211 snd_hda_codec coretemp snd_hda_core kvm_intel snd_hwdep drm_kms_helper cfg80211 kvm soundwire_bus thinkpad_acpi cec snd_pcm pcmcia drm iTCO_wdt nvram snd_timer ppdev ledtrig_audio intel_pmc_bxt libarc4 snd evdev yenta_socket i2c_algo_bit soundcore rfkill iTCO_vendor_support irqbypass pcmcia_rsrc watchdog parport_pc pcmcia_core serio_raw pcspkr sg parport ac acpi_cpufreq button ecb aes_generic libaes crypto_simd cryptd glue_helper xts dm_crypt dm_mod ext4 crc16 mbcache jbd2 crc32c_generic sd_mod t10_pi crc_t10dif crct10dif_generic mmc_block crct10dif_common ata_generic ahci ata_piix sdhci_pci libahci cqhci ehci_pci uhci_hcd libata e1000e scsi_mod ehci_hcd sdhci i2c_i801 i2c_smbus psmouse lpc_ich mmc_core usbcore ptp usb_common pps_core battery video ** PCI devices: not available ** USB devices: Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 002 Device 002: ID 17ef:1000 Lenovo ThinkPad X6 UltraBase Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub -- System Information: Debian Release: 11.5 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 5.10.0-19-amd64 (SMP w/2 CPU threads) Kernel taint flags: TAINT_WARN Locale: LANG=C, LC_CTYPE=C (charmap=UTF-8) (ignored: LC_ALL set to C.UTF-8), LANGUAGE not set Shell: /bin/sh linked to /bin/lksh Init: sysvinit (via /sbin/init) Versions of packages linux-image-5.10.0-19-amd64 depends on: ii initramfs-tools [linux-initramfs-tool] 0.140 ii kmod28-1 ii linux-base 4.6 Versions of packages linux-image-5.10.0-19-amd64 recommends: pn apparmor ii firmware-linux-free 20200122-1 Versions of packages linux-image-5.10.0-19-amd64 suggests: pn debian-kernel-handbook ii grub-pc 2.06-3~deb11u4 pn linux-doc-5.10 Versions of packages linux-image-5.10.0-19-amd64 is related to: pn firmware-amd-graphics pn firmware-atheros pn firmware-bnx2 pn firmware-bnx2x pn firmware-brcm80211 pn firmware-cavium pn firmware-intel-sound pn firmware-intelwimax pn firmware-ipw2x00 pn firmware-ivtv ii firmware-iwlwifi 20210315-3 pn firmware-libertas pn firmware-linux-nonfree pn firmware-misc-nonfree pn