netconsole module unload broken between 2.6.19 and 2.6.20 (and still broken as of 2.6.21-rc6)
(Please CC me on emails, I'm not on LKML).

Somewhere between 2.6.19 and 2.6.20, unloading of the netconsole module got broken. It's still broken as of 2.6.21-rc6. If you try to unload the module, the rmmod/modprobe -r just sits there forever. I can reproduce it on tg3, forcedeth and e1000 hardware (all in various Opteron machines).

Looking at the differences in netconsole itself between .19 and .20, they are extremely small, so I'd guess that the problem probably lies in netpoll itself.

Originally, I was trying to unload the module to reconfigure the log destination - maybe a sysfs interface for (re-)configuration would be a good addition as well?

-- 
Robin Hugh Johnson
Gentoo Linux Developer Council Member
E-Mail : [EMAIL PROTECTED]
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
Idle loadavg of ~1, maybe MD related
(Please CC me, I'm subbed to LKML).

My G5, while running practically nothing (just sshd and something to watch the load), has a weird cycle of load averages. I think it might be related to MD, simply because that's the only thing that is clocking up cputime.

A full cycle lasts approximately 27 minutes.

Mins   Load
0-2    0.0-0.15          (stable, 0 level)
3-5    0.50, 0.80, 0.95  (fast increase)
6-21   0.95-1.10         (stable, 1 level)
22-24  0.9, 0.8, 0.1     (fast decrease, to 0 level)
25-27  0.2, 0.3, 0.15    (local maxima peak)

Here's a graph of it, spanning 230 minutes:
http://dev.gentoo.org/~robbat2/20071230-g5-loadavg-bug.png
Processed data for the graph here:
http://dev.gentoo.org/~robbat2/20071230-g5-loadavg-bug.txt

For the entire 230 minute period, there was _no_ disk I/O. Not recorded by iostat, nor generated.

# while true ; do uptime ; iostat -t 60 2 -N -d | tail -n15 ; done >/dev/shm/foo

Example of single output pass for the above loop:

00:59:37 up 8:32, 2 users, load average: 0.02, 0.47, 0.66
Device:          tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
sda             0.00        0.00        0.00         0         0
sdb             0.00        0.00        0.00         0         0
md1             0.00        0.00        0.00         0         0
md0             0.00        0.00        0.00         0         0
md2             0.00        0.00        0.00         0         0
md3             0.00        0.00        0.00         0         0
vg-usr          0.00        0.00        0.00         0         0
vg-var          0.00        0.00        0.00         0         0
vg-tmp          0.00        0.00        0.00         0         0
vg-opt          0.00        0.00        0.00         0         0
vg-home         0.00        0.00        0.00         0         0
vg-usr_src      0.00        0.00        0.00         0         0
vg-usr_portage  0.00        0.00        0.00         0         0

This is basically 1842c7f2 from Linus's tree, my own stuff is config'd out with =n for the moment. And the problem does still occur in the main tree.

Snippet from the head of 'top', sorting by cputime.
top - 01:59:08 up 9:32, 2 users, load average: 1.04, 0.87, 0.70
Tasks: 74 total, 1 running, 73 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.5%sy, 0.0%ni, 99.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.2%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 12074480k total, 292520k used, 11781960k free, 76812k buffers
Swap: 8388536k total, 0k used, 8388536k free, 144276k cached

  PID USER  PR  NI  VIRT   RES   SHR S %CPU %MEM    TIME+ COMMAND
 4635 root  15  -5     0     0     0 S    0  0.0  8:14.33 md3_raid1
 3121 root  15  -5     0     0     0 S    1  0.0  3:15.12 md2_raid1
 3098 root  15  -5     0     0     0 S    0  0.0  3:07.83 md1_raid1
 3076 root  15  -5     0     0     0 S    0  0.0  0:45.82 md0_raid1
  829 root  15  -5     0     0     0 D    0  0.0  0:01.85 kwindfarm
   13 root  15  -5     0     0     0 S    0  0.0  0:01.41 ksoftirqd/3
   18 root  15  -5     0     0     0 S    0  0.0  0:01.39 events/3
   10 root  15  -5     0     0     0 S    0  0.0  0:00.94 ksoftirqd/2
    1 root  20   0  1900   652   576 S    0  0.0  0:00.89 init
32086 root  20   0  9336  2696  2076 S    0  0.0  0:00.87 sshd

# ver_linux
Linux buck-int 2.6.24-rc6-prod-g6f0f5304 #10 SMP Sat Dec 29 05:11:24 PST 2007 ppc64 PPC970MP, altivec supported PowerMac11,2 GNU/Linux

Gnu C                  4.2.2
Gnu make               3.81
binutils               2.18
util-linux             2.13
mount                  2.13
module-init-tools      3.4
e2fsprogs              1.40.3
reiserfsprogs          3.6.19
xfsprogs               2.9.4
quota-tools            3.15.
PPP                    2.4.4
Linux C Library        2.7
Dynamic linker (ldd)   2.7
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.13
Sh-utils               6.9
udev                   118
wireless-tools         29
Modules Loaded         nfsd exportfs auth_rpcgss ipv6 unix tg3 nfs_acl lockd sunrpc dm_mod

# lsmod
Module         Size  Used by
nfsd         346552  1
exportfs       8392  1 nfsd
auth_rpcgss   69152  1 nfsd
ipv6         428760  20
unix          47384  13
tg3          159020  0
nfs_acl        6056  1 nfsd
lockd        105248  1 nfsd
sunrpc       281248  6 nfsd,auth_rpcgss,nfs_acl,lockd
dm_mod       100520  15

# ps -ef
UID   PID  PPID  C STIME TTY      TIME CMD
root    1     0  0 Dec29 ?    00:00:00 init [3]
root    2     0  0 Dec29 ?    00:00:00 [kthreadd]
root 3 2 0
Re: Idle loadavg of ~1, maybe MD related
On Sat, Jan 05, 2008 at 01:30:37AM -0800, Andrew Morton wrote:
From that I'd suspect that kwindfarm is being a bad citizen. If a process is consistently stuck in D state, run Windfarm.

echo w > /proc/sysrq-trigger

then record the resulting dmesg output so we can see where it got stuck.

Traceback:
[552710.416174] SysRq : Show Blocked State
[552710.417876] taskPC stack pid father
[552710.417888] kwindfarm D 0 829 2
[552710.417892] Call Trace:
[552710.417895] [c0036c9835f0] [c0528b90] 0xc0528b90 (unreliable)
[552710.417908] [c0036c9837c0] [c000f4a8] .__switch_to+0xd8/0x110
[552710.417985] [c0036c983850] [c03bb2a0] .schedule+0x62c/0x6c8
[552710.417992] [c0036c983940] [c03bb8c4] .schedule_timeout+0x3c/0xe8
[552710.417997] [c0036c983a10] [c03bb51c] .wait_for_common+0x100/0x1bc
[552710.418002] [c0036c983ae0] [c0285300] .smu_fan_set+0x17c/0x1e4
[552710.418009] [c0036c983c30] [c0284078] .pm112_wf_notify+0xc50/0x12d0
[552710.418015] [c0036c983d20] [c03bfb84] .notifier_call_chain+0x5c/0xcc
[552710.418021] [c0036c983dc0] [c006f4b4] .__blocking_notifier_call_chain+0x70/0xb0
[552710.418027] [c0036c983e70] [c0282d9c] .wf_thread_func+0x78/0x11c
[552710.418032] [c0036c983f00] [c0069b00] .kthread+0x78/0xc4
[552710.418039] [c0036c983f90] [c0023d0c] .kernel_thread+0x4c/0x68
[552710.418064] Sched Debug Version: v0.07, 2.6.24-rc6-prod-g6f0f5304 #10
[552710.418067] now at 552723945.170635 msecs
[552710.418070] .sysctl_sched_latency: 60.00
[552710.418073] .sysctl_sched_min_granularity: 12.00
[552710.418076] .sysctl_sched_wakeup_granularity : 30.00
[552710.418079] .sysctl_sched_batch_wakeup_granularity : 30.00
[552710.418082] .sysctl_sched_child_runs_first : 0.01
[552710.418085] .sysctl_sched_features : 7

Full output at http://dev.gentoo.org/~robbat2/20080105_windfarm_sysrq_w.txt

-- 
Robin Hugh Johnson
Gentoo Linux Developer Infra Guy
E-Mail : [EMAIL PROTECTED]
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
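
For readers wondering why a single thread in D state would show up as a load average of about 1: Linux counts uninterruptible tasks as well as runnable ones when it samples the load, and folds each sample into an exponentially damped average. The sketch below is a userland illustration only, not the kernel source. The constants (an 11-bit fixed-point shift, a 1-minute decay factor of 1884/2048, one sample roughly every 5 seconds) are written from memory of the kernel's CALC_LOAD macro and should be treated as assumptions; the function names are hypothetical.

```c
/* Fixed-point parameters, assumed to mirror the kernel's CALC_LOAD macro. */
#define FSHIFT   11                /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point */
#define EXP_1    1884UL            /* ~ FIXED_1 * exp(-5s / 60s) */

/* Fold one sample of 'active' tasks (running + uninterruptible) into 'load'. */
static unsigned long loadavg_step(unsigned long load, unsigned long active,
				  unsigned long exp_factor)
{
	load *= exp_factor;
	load += active * FIXED_1 * (FIXED_1 - exp_factor);
	return load >> FSHIFT;
}

/* Apply 'steps' consecutive samples with a constant active-task count. */
static unsigned long loadavg_run(unsigned long load, unsigned long active,
				 unsigned long exp_factor, unsigned int steps)
{
	for (unsigned int i = 0; i < steps; i++)
		load = loadavg_step(load, active, exp_factor);
	return load;
}
```

Calling loadavg_run(0, 1, EXP_1, n) with growing n shows the 1-minute figure climbing toward FIXED_1 (that is, 1.0) when exactly one task stays blocked, and loadavg_run(FIXED_1, 0, EXP_1, n) shows it draining back toward zero once the task wakes up. Divide the result by FIXED_1 to get the familiar decimal value.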
Intel Core Duo/Duo2 T2300/E6400 - Hyper-Threading (the absence of)
(Please CC me, I am not subscribed to LKML [I have set the Mail-Followup-To header accordingly]).

On two of my new machines, with Intel Core Duo T2300 and Core2 Duo E6400 chips respectively, I noticed some weirdness in how many CPUs are present.

If the hyper-threading bit is present in the CPU info, should there always be an extra CPU presented to the system per physical core?

Both the Core1 and Core2 chips I have have the ht bit set, but present only their two physical cores to the system. There is no access to the hyper-threading capabilities at all. I also see no configuration options in the BIOS to enable or disable hyper-threading.

That is, /proc/cpuinfo and all topology data only show 2 CPUs present, and show that they are not an HT pair. (CONFIG_NR_CPUS=8 is set).

(This was originally triggered by somebody else's code that read the CPU flags, saw hyper-threading, and decided there were 2x cpus for each physical core. Said code has already been taken out back and shot repeatedly).

-- 
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail : [EMAIL PROTECTED]
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
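
A note on the flag itself, with a small sketch. As documented for the CPUID instruction, the HTT bit (leaf 1, EDX bit 28) only says that the package can hold more than one logical processor, which is also true of a plain dual-core part. A legacy way to tell the cases apart is to compare the logical-processor count in leaf 1 (EBX bits 23:16) with the core count in leaf 4 (EAX bits 31:26, plus one). The helpers below take raw register values as arguments so they run anywhere; the function names are hypothetical, and newer CPUs expose better topology leaves than this.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_EDX_HTT (UINT32_C(1) << 28)

/* True if the package reports it may contain more than one logical CPU. */
static bool htt_flag(uint32_t leaf1_edx)
{
	return (leaf1_edx & CPUID1_EDX_HTT) != 0;
}

/* Logical processors per package; the EBX field is only valid with HTT set. */
static unsigned int logical_per_package(uint32_t leaf1_edx, uint32_t leaf1_ebx)
{
	if (!htt_flag(leaf1_edx))
		return 1;
	return (leaf1_ebx >> 16) & 0xff;
}

/* Cores per package from leaf 4: the field stores the count minus one. */
static unsigned int cores_per_package(uint32_t leaf4_eax)
{
	return ((leaf4_eax >> 26) & 0x3f) + 1;
}

/* Hardware threads per core; a result above 1 means SMT is really there. */
static unsigned int threads_per_core(uint32_t leaf1_edx, uint32_t leaf1_ebx,
				     uint32_t leaf4_eax)
{
	unsigned int logical = logical_per_package(leaf1_edx, leaf1_ebx);
	unsigned int threads = logical / cores_per_package(leaf4_eax);

	return threads ? threads : 1;
}
```

With illustrative register values (not captured from the machines above), a package reporting HTT set, two logical processors and two cores yields one thread per core, which is consistent with seeing the ht flag but only two CPUs.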
Licensing copyright of kernel .config files (defconfig, *config)
(Please CC me on replies, not subscribed to LKML) Hi, Somewhat of an odd question, but none of the files in question seem to have a copyright header on them... For a kernel .config file, either from one of the defconfig or any other *config option that automates the answer: 1. What license does the file fall under? 2. Who are the copyright holders? Naively, since the defconfigs are bundled with the kernel, that could fall under GPLv2-only implicitly, but lacking any explicit copyright headers makes this interesting (arch/*/configs/* contain lots of files, no copyright headers on them). If I manually write the names of some configuration options to a new .config file, at that point I logically am the only author and have copyright of it. My editor slaps a default license on it of BSD-2. Thereafter I run olddefconfig, and now it's a combined work of the kernel's defconfig and my manual settings. If GPL-2 was inherited from the kernel tree, this is now a combined BSD-GPL2 work, or is it? The kernel config tools did consider my file as input, possibly overrode the settings if they didn't work with others, and re-output everything. If the files are to be marked with a copyright header, who is the holder of it that it should be attributed to? Alternatively, is this a case where the work is not copyrightable, and the files should have a notice to that effect? Background: Gentoo has a bunch of stock kernel configurations for release engineering, our initramfs tool (genkernel), and other endeavors over the years. These projects claim BSD, GPL2, LGPL2 on various pieces, and I don't think they can all be correct. I'm working on getting them into one place, because some of them have been getting stale, but the differing licenses raised a red flag to me. 
-- Robin Hugh Johnson Gentoo Linux: Developer, Infrastructure Lead E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Licensing copyright of kernel .config files (defconfig, *config)
On Mon, Jun 02, 2014 at 12:01:46AM +0100, Ken Moffat wrote:
> > Naively, since the defconfigs are bundled with the kernel, that could fall under GPLv2-only implicitly, but lacking any explicit copyright headers makes this interesting (arch/*/configs/* contain lots of files, no copyright headers on them).
>
> I am not a lawyer, but surely _many_ of the kernel files do not contain any explicit copyright information ?

On closer inspection, more files than I thought don't have any explicit copyrights on them. ~67% of files in v3.13 had the text 'Copyright' or 'Licens' appear in them.

> Why does your editor put a default license on anything ?

It's my stock header, customized by per-directory vimrc. The non-project-specific default one actually has a CHANGEME string in it, to help remind me that it needs an edit before I release that file.

I was just using the BSD license on the file as an example. Submissions to other open source projects are generally bound by the license of the project, with a few exceptions (I've put patches into public domain to avoid signing some CLA-like agreements).

> If I was being awkward, I would suggest that the config would not be useful until you had run it through make oldconfig or similar, and that therefore the kernel license of GPL-2 applies.

That's the case I was interested in :-).

> > If the files are to be marked with a copyright header, who is the holder of it that it should be attributed to?
>
> Iff the work is copyrightable (I do not have an opinion on that), surely the license only matters if you breach it ? ;-) If you distribute a compiled kernel with the source, and all of that source is GPL-2, then I assume you are in the clear. For extras which include binaries without source, my understanding is that you would always be vulnerable to kernel copyright holders. So, I suspect that the attribution of a config file is not particularly important.

I agree with your reasoning if I was distributing kernel sources or compiled kernels, but this is going to be a package of kernel configurations only.

> > Background: Gentoo has a bunch of stock kernel configurations for release engineering, our initramfs tool (genkernel), and other endeavors over the years. These projects claim BSD, GPL2, LGPL2 on various pieces, and I don't think they can all be correct. I'm working on getting them into one place, because some of them have been getting stale, but the differing licenses raised a red flag to me.
>
> To the extent that GPL-2 can include LGPL-2 and BSD, I suggest that you label them all as GPL-2. That is the licence of the kernel, and for practical reasons it will not change (this was discussed when somebody asked about GPL-3 : even if the main copyright holders wanted to make the change (and many do not), some copyright holders are no longer contactable). You might be able to dual-license some of these distro files, but I have no idea if that would be appropriate.

If the rest of the logic is correct, then the non-GPL2 license on these files was never valid in the first place; they inherited GPL2 from the kernel from the get go, and I don't need to be concerned about the hassle of formally relicensing them by contacting the authors of the configs (which again, aren't always contactable anymore).

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
[PATCH v2] libata: disable a disk via libata.force params
A user on StackExchange had a failing SSD that's soldered directly onto the motherboard of his system. The BIOS does not give any option to disable it at all, so he can't just hide it from the OS via the BIOS.

The old IDE layer had hdX=noprobe override for situations like this, but that was never ported to the libata layer.

This patch implements a disable flag for libata.force.

Example use: libata.force=2.0:disable

[v2 of the patch, removed the nodisable flag per Tejun Heo]

Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
X-URL: http://unix.stackexchange.com/questions/102648/how-to-tell-linux-kernel-3-0-to-completely-ignore-a-failing-disk
X-URL: http://askubuntu.com/questions/352836/how-can-i-tell-linux-kernel-to-completely-ignore-a-disk-as-if-it-was-not-even-co
X-URL: http://superuser.com/questions/599333/how-to-disable-kernel-probing-for-drive
---
 Documentation/kernel-parameters.txt | 2 ++
 drivers/ata/libata-core.c           | 1 +
 2 files changed, 3 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a5..b9e9bd8 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1529,6 +1529,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			* atapi_dmadir: Enable ATAPI DMADIR bridge support
 
+			* disable: Disable this device.
+
 			If there are multiple matching configurations changing
 			the same attribute, the last one is used.
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 75b9367..70529b8 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6519,6 +6519,7 @@ static int __init ata_parse_force_one(char **cur,
 		{ "norst", .lflags = ATA_LFLAG_NO_HRST | ATA_LFLAG_NO_SRST },
 		{ "rstonce", .lflags = ATA_LFLAG_RST_ONCE },
 		{ "atapi_dmadir", .horkage_on = ATA_HORKAGE_ATAPI_DMADIR },
+		{ "disable", .horkage_on = ATA_HORKAGE_DISABLE },
 	};
 	char *start = *cur, *p = *cur;
 	char *id, *val, *endp;
-- 
1.8.4.3
[PATCH] libata: provide the ability to disable a disk via the params.
This was posted by a user on StackExchange, who has a failing SSD that's soldered directly onto the motherboard of his system. The BIOS does not give any option to disable it at all, so he can't just hide it that way.

The old IDE layer had hdX=noprobe override for situations like this, but that was never ported to the libata layer.

Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
X-URL: http://unix.stackexchange.com/questions/102648/how-to-tell-linux-kernel-3-0-to-completely-ignore-a-failing-disk
X-URL: http://askubuntu.com/questions/352836/how-can-i-tell-linux-kernel-to-completely-ignore-a-disk-as-if-it-was-not-even-co
X-URL: http://superuser.com/questions/599333/how-to-disable-kernel-probing-for-drive
---
 Documentation/kernel-parameters.txt | 2 ++
 drivers/ata/libata-core.c           | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a5..40bf5ff 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1529,6 +1529,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			* atapi_dmadir: Enable ATAPI DMADIR bridge support
 
+			* [no]disable: Enable or disable this device.
+
 			If there are multiple matching configurations changing
 			the same attribute, the last one is used.
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 75b9367..5069a96 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6519,6 +6519,8 @@ static int __init ata_parse_force_one(char **cur,
 		{ "norst", .lflags = ATA_LFLAG_NO_HRST | ATA_LFLAG_NO_SRST },
 		{ "rstonce", .lflags = ATA_LFLAG_RST_ONCE },
 		{ "atapi_dmadir", .horkage_on = ATA_HORKAGE_ATAPI_DMADIR },
+		{ "disable", .horkage_on = ATA_HORKAGE_DISABLE },
+		{ "nodisable", .horkage_off = ATA_HORKAGE_DISABLE },
 	};
 	char *start = *cur, *p = *cur;
 	char *id, *val, *endp;
-- 
1.8.4.3
Re: [PATCH] libata: provide the ability to disable a disk via the params.
On Thu, Dec 12, 2013 at 08:39:35AM -0500, Tejun Heo wrote:
> Hello, Robin.
>
> On Sat, Dec 07, 2013 at 04:56:27PM -0800, Robin H. Johnson wrote:
> > +		{ "disable", .horkage_on = ATA_HORKAGE_DISABLE },
> > +		{ "nodisable", .horkage_off = ATA_HORKAGE_DISABLE },
>
> Given the current usage of ATA_HORKAGE_DISABLE, I don't think we need nodisable. Let's just add disable for now. Can you please update the patch and resend?

Before I do so, I have two questions:

1. Countering your nodisable comment, would it be valid to do:
   libata.force=2:disable libata.force=2.02:nodisable
   To disable all of port 2 except device 2?

2. One of my friends wondered if it would be worthwhile to add force keywords for other HORKAGE bits, and if so, should the ata_lflag/ata_link force bits also be presented?

There are only 3 HORKAGE bits presently available in libata.force:
ATA_HORKAGE_NONCQ
ATA_HORKAGE_DUMP_ID
ATA_HORKAGE_ATAPI_DMADIR

And 3 ata_link flags:
ATA_LFLAG_NO_HRST
ATA_LFLAG_NO_SRST
ATA_LFLAG_RST_ONCE

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
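
To make question 1 concrete, here is a minimal userland sketch of how an entry of the documented form "[ID:]VAL", with ID being PORT[.DEVICE], splits into a port, a device and a keyword, and how a parsed entry could be tested against a given port/device pair. It is not the kernel's ata_parse_force_one(); the struct and function names are hypothetical, and using -1 to mean "all ports" or "all devices" is a convention chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* One parsed libata.force entry; -1 in port/device means "match all". */
struct force_spec {
	int port;
	int device;
	const char *val;
};

/* Parse "[PORT[.DEVICE]:]VAL" in place. Returns 0 on success, -1 on error. */
static int parse_force_spec(char *arg, struct force_spec *out)
{
	char *colon = strchr(arg, ':');
	char *dot, *endp;

	out->port = -1;
	out->device = -1;
	if (!colon) {			/* bare value: applies everywhere */
		out->val = arg;
		return 0;
	}
	*colon = '\0';
	out->val = colon + 1;

	dot = strchr(arg, '.');
	if (dot)
		*dot = '\0';

	out->port = (int)strtol(arg, &endp, 10);
	if (endp == arg || *endp != '\0')
		return -1;
	if (dot) {
		out->device = (int)strtol(dot + 1, &endp, 10);
		if (endp == dot + 1 || *endp != '\0')
			return -1;
	}
	return 0;
}

/* Does this entry apply to the given port and device? */
static int force_applies(const struct force_spec *s, int port, int device)
{
	return (s->port < 0 || s->port == port) &&
	       (s->device < 0 || s->device == device);
}
```

Under this reading, "2:disable" applies to every device on port 2 while "2.02:nodisable" applies only to device 2 on that port, so an implementation where the last matching entry wins would leave device 2 enabled.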
Re: [PATCH] libata: provide the ability to disable a disk via the params.
On Thu, Dec 12, 2013 at 09:36:55PM +0100, Levente Kurusa wrote:
> > 2. One of my friends wondered if it would be worthwhile to add force keywords for other HORKAGE bits, and if so, should the ata_lflag/ata_link force bits also be presented?
>
> I don't think so. Most of the other HORKAGEs are automatically recognized and applied by the code. I think the only ones which can cause trouble if not detected at first are the ones that are currently in the list.

His thinking was that it would aid debugging/testing on new buggy devices if the options are available at boot.

I'd think of the following as candidates for that:
ATA_HORKAGE_NODMA
ATA_HORKAGE_MAX_SEC_128
ATA_HORKAGE_DIAGNOSTIC
ATA_HORKAGE_BROKEN_HPA
ATA_HORKAGE_DISABLE
ATA_HORKAGE_HPA_SIZE
ATA_HORKAGE_IVB
ATA_HORKAGE_STUCK_ERR (only set by code presently, not by blacklist)
ATA_HORKAGE_BRIDGE_OK
ATA_HORKAGE_ATAPI_MOD16_DMA
ATA_HORKAGE_NOSETXFER
ATA_HORKAGE_MAX_SEC_LBA48

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
[PATCH] PCI: QEMU top-level IDs for (sub)vendor & device
Introduce PCI_VENDOR/PCI_SUBVENDOR/PCI_SUBDEVICE defines to replace the constants scattered in the kernel already used to detect QEMU.

They are defined in the QEMU codebase per docs/specs/pci-ids.txt.

Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
---
This change prompted by a near-miss in the review of recent change:
'drm/i915: refine qemu south bridge detection'

Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
---
 drivers/gpu/drm/bochs/bochs_drv.c   | 4 ++--
 drivers/gpu/drm/cirrus/cirrus_drv.c | 5 +++--
 drivers/virtio/virtio_pci_common.c  | 2 +-
 include/linux/pci_ids.h             | 4 ++++
 sound/pci/intel8x0.c                | 4 ++--
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index 7f1a360..b332b4d3 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -182,8 +182,8 @@ static const struct pci_device_id bochs_pci_tbl[] = {
 	{
 		.vendor = 0x1234,
 		.device = 0x,
-		.subvendor = 0x1af4,
-		.subdevice = 0x1100,
+		.subvendor = PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+		.subdevice = PCI_SUBDEVICE_ID_QEMU,
 		.driver_data = BOCHS_QEMU_STDVGA,
 	},
 	{
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c b/drivers/gpu/drm/cirrus/cirrus_drv.c
index b1619e2..7bc394e 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -33,8 +33,9 @@ static struct drm_driver driver;
 
 /* only bind to the cirrus chip in qemu */
 static const struct pci_device_id pciidlist[] = {
-	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, 0x1af4, 0x1100, 0,
-	  0, 0 },
+	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446,
+	  PCI_SUBVENDOR_ID_REDHAT_QUMRANET, PCI_SUBDEVICE_ID_QEMU,
+	  0, 0, 0 },
 	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, PCI_VENDOR_ID_XEN,
 	  0x0001, 0, 0, 0 },
 	{0,}
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 36205c2..127dfe4 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -467,7 +467,7 @@ static const struct dev_pm_ops virtio_pci_pm_ops = {
 
 /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
 static const struct pci_device_id virtio_pci_id_table[] = {
-	{ PCI_DEVICE(0x1af4, PCI_ANY_ID) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
 	{ 0 }
 };
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 37f05cb..6d249d3 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2506,6 +2506,10 @@
 #define PCI_VENDOR_ID_AZWAVE 0x1a3b
 
+#define PCI_VENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_SUBDEVICE_ID_QEMU 0x1100
+
 #define PCI_VENDOR_ID_ASMEDIA 0x1b21
 
 #define PCI_VENDOR_ID_CIRCUITCO 0x1cc8
diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c
index 42bcbac..12c2c18 100644
--- a/sound/pci/intel8x0.c
+++ b/sound/pci/intel8x0.c
@@ -2980,8 +2980,8 @@ static int snd_intel8x0_inside_vm(struct pci_dev *pci)
 		goto fini;
 
 	/* check for known (emulated) devices */
-	if (pci->subsystem_vendor == 0x1af4 &&
-	    pci->subsystem_device == 0x1100) {
+	if (pci->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
+	    pci->subsystem_device == PCI_SUBDEVICE_ID_QEMU) {
 		/* KVM emulated sound, PCI SSID: 1af4:1100 */
 		msg = "enable KVM";
 	} else if (pci->subsystem_vendor == 0x1ab8) {
-- 
2.3.0
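
As a quick illustration of what the named constants buy a reader, the check from the intel8x0 hunk can be written as a standalone userland helper. This is only a sketch: the two defines carry the values from the patch, while struct pci_ids and is_qemu_subsystem() are hypothetical stand-ins for the kernel's struct pci_dev and its callers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as introduced by the patch to include/linux/pci_ids.h. */
#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
#define PCI_SUBDEVICE_ID_QEMU            0x1100

/* Hypothetical stand-in for the two struct pci_dev fields being tested. */
struct pci_ids {
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
};

/* True when the subsystem IDs are the pair QEMU uses for emulated devices. */
static bool is_qemu_subsystem(const struct pci_ids *ids)
{
	return ids->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
	       ids->subsystem_device == PCI_SUBDEVICE_ID_QEMU;
}
```

A mistyped digit in a bare 0x1af4 or 0x1100 still compiles and silently never matches, whereas a misspelled macro name fails at build time.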
[PATCH resend] PCI: QEMU top-level IDs for (sub)vendor & device
Introduce PCI_VENDOR/PCI_SUBVENDOR/PCI_SUBDEVICE defines to replace the constants scattered in the kernel already used to detect QEMU.

They are defined in the QEMU codebase per docs/specs/pci-ids.txt.

Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
Reviewed-by: Takashi Iwai <ti...@suse.de>
Reviewed-by: Gerd Hoffmann <kra...@redhat.com>
---
This change prompted by a near-miss in the review of recent change:
'drm/i915: refine qemu south bridge detection'

This patch was previously sent to LKML 25 Jan 2016; and got some reviews, but otherwise slipped through the cracks.
---
 drivers/gpu/drm/bochs/bochs_drv.c   | 4 ++--
 drivers/gpu/drm/cirrus/cirrus_drv.c | 5 +++--
 drivers/virtio/virtio_pci_common.c  | 2 +-
 include/linux/pci_ids.h             | 4 ++++
 sound/pci/intel8x0.c                | 4 ++--
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index 7f1a360..b332b4d3 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -182,8 +182,8 @@ static const struct pci_device_id bochs_pci_tbl[] = {
 	{
 		.vendor = 0x1234,
 		.device = 0x,
-		.subvendor = 0x1af4,
-		.subdevice = 0x1100,
+		.subvendor = PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+		.subdevice = PCI_SUBDEVICE_ID_QEMU,
 		.driver_data = BOCHS_QEMU_STDVGA,
 	},
 	{
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c b/drivers/gpu/drm/cirrus/cirrus_drv.c
index b1619e2..7bc394e 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -33,8 +33,9 @@ static struct drm_driver driver;
 
 /* only bind to the cirrus chip in qemu */
 static const struct pci_device_id pciidlist[] = {
-	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, 0x1af4, 0x1100, 0,
-	  0, 0 },
+	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446,
+	  PCI_SUBVENDOR_ID_REDHAT_QUMRANET, PCI_SUBDEVICE_ID_QEMU,
+	  0, 0, 0 },
 	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, PCI_VENDOR_ID_XEN,
 	  0x0001, 0, 0, 0 },
 	{0,}
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 36205c2..127dfe4 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -467,7 +467,7 @@ static const struct dev_pm_ops virtio_pci_pm_ops = {
 
 /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
 static const struct pci_device_id virtio_pci_id_table[] = {
-	{ PCI_DEVICE(0x1af4, PCI_ANY_ID) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
 	{ 0 }
 };
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 37f05cb..6d249d3 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2506,6 +2506,10 @@
 #define PCI_VENDOR_ID_AZWAVE 0x1a3b
 
+#define PCI_VENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_SUBDEVICE_ID_QEMU 0x1100
+
 #define PCI_VENDOR_ID_ASMEDIA 0x1b21
 
 #define PCI_VENDOR_ID_CIRCUITCO 0x1cc8
diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c
index 42bcbac..12c2c18 100644
--- a/sound/pci/intel8x0.c
+++ b/sound/pci/intel8x0.c
@@ -2980,8 +2980,8 @@ static int snd_intel8x0_inside_vm(struct pci_dev *pci)
 		goto fini;
 
 	/* check for known (emulated) devices */
-	if (pci->subsystem_vendor == 0x1af4 &&
-	    pci->subsystem_device == 0x1100) {
+	if (pci->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
+	    pci->subsystem_device == PCI_SUBDEVICE_ID_QEMU) {
 		/* KVM emulated sound, PCI SSID: 1af4:1100 */
 		msg = "enable KVM";
 	} else if (pci->subsystem_vendor == 0x1ab8) {
-- 
2.3.0
PROBLEM: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy
(Replies CC to list and direct to me please)

Summary:
dmesg spammed with alloc_contig_range: [XX, YY) PFNs busy

Description:
I recently upgraded to 4.9-rc5 (previous kernel 4.5.0-rc6-00141-g6794402), and since then my dmesg has been absolutely flooded with 'PFNs busy' (>3GiB/day). My config did not change (all new options =n).

The addresses are not consistent, so the squelch of identical printk lines hasn't helped.

Eg output:
[187487.621916] alloc_contig_range: [83f0a9, 83f0aa) PFNs busy
[187487.621924] alloc_contig_range: [83f0ce, 83f0cf) PFNs busy
[187487.621976] alloc_contig_range: [83f125, 83f126) PFNs busy
[187487.622013] alloc_contig_range: [83f127, 83f128) PFNs busy

Keywords:
mm, alloc_contig_range, CMA

Most recent kernel version which did not have the bug:
Known 4.5.0-rc6-00141-g6794402

ver_linux:
Linux bohr-int 4.9.0-rc5-00177-g81bcfe5 #12 SMP Wed Nov 16 13:16:32 PST 2016 x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel GNU/Linux

GNU C                  5.3.0
GNU Make               4.2.1
Binutils               2.25.1
Util-linux             2.29
Mount                  2.29
Quota-tools            4.03
Linux C Library        2.23
Dynamic linker (ldd)   2.23
readlink: missing operand
Try 'readlink --help' for more information.
Procps                 3.3.12
Net-tools              1.60
Kbd                    2.0.3
Console-tools          2.0.3
Sh-utils               8.25
Udev                   230
Modules Loaded         3w_sas 3w_ ablk_helper aesni_intel aes_x86_64 af_packet ahci aic79xx amdgpu async_memcpy async_pq async_raid6_recov async_tx async_xor ata_piix auth_rpcgss binfmt_misc bluetooth bnep bnx2 bonding btbcm btintel btrfs btrtl btusb button cdrom cn configs coretemp crc32c_intel crc32_pclmul crc_ccitt crc_itu_t crct10dif_pclmul cryptd dca dm_bio_prison dm_bufio dm_cache dm_cache_smq dm_crypt dm_delay dm_flakey dm_log dm_log_userspace dm_mirror dm_mod dm_multipath dm_persistent_data dm_queue_length dm_raid dm_region_hash dm_round_robin dm_service_time dm_snapshot dm_thin_pool dm_zero drm drm_kms_helper dummy e1000 e1000e evdev ext2 fat fb_sys_fops firewire_core firewire_ohci fjes fscache fuse ghash_clmulni_intel glue_helper grace hangcheck_timer hid_a4tech hid_apple hid_belkin hid_cherry hid_chicony hid_cypress hid_ezkey hid_generic hid_gyration hid_logitech hid_logitech_dj hid_microsoft hid_monterey hid_petalynx hid_pl hid_samsung hid_sony hid_sunplus hwmon_vid i2c_algo_bit i2c_i801 i2c_smbus igb input_leds intel_rapl ip6_udp_tunnel ipv6 irqbypass iscsi_tcp iTCO_vendor_support iTCO_wdt ixgb ixgbe jfs kvm kvm_intel libahci libata libcrc32c libiscsi libiscsi_tcp linear lockd lpc_ich lpfc lrw macvlan mdio md_mod megaraid_mbox megaraid_mm megaraid_sas mii mptbase mptfc mptsas mptscsih mptspi multipath nfs nfs_acl nfsd nls_cp437 nls_iso8859_1 nvram ohci_hcd pata_jmicron pata_marvell pata_platform pcspkr psmouse qla1280 qla2xxx r8169 radeon raid0 raid10 raid1 raid456 raid6_pq reiserfs rfkill sata_mv sata_sil24 scsi_transport_fc scsi_transport_iscsi scsi_transport_sas scsi_transport_spi sd_mod sg sky2 snd snd_hda_codec snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_core snd_hda_intel snd_hwdep snd_pcm snd_timer soundcore sr_mod sunrpc syscopyarea sysfillrect sysimgblt tg3 ttm uas udp_tunnel usb_storage vfat virtio virtio_net virtio_ring vxlan w83627ehf x86_pkg_temp_thermal xfs xhci_hcd xhci_pci xor zlib_deflate

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
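
As a first step toward making sense of the printed ranges, the two hex numbers are page frame numbers, so shifting them left by the page shift gives physical addresses that can be compared against /proc/iomem or the CMA region reported at boot. The sketch below parses one log line and does that conversion. It assumes 4 KiB pages (the usual x86_64 configuration, matching the machine above); the struct and function names are hypothetical, and it does not by itself identify which allocation is holding the pages.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SHIFT_4K 12	/* assumption: 4 KiB pages */

/* Half-open PFN range [start, end) as printed by alloc_contig_range. */
struct pfn_range {
	uint64_t start;
	uint64_t end;
};

/* Extract the PFN range from one dmesg line. Returns 0 on success. */
static int parse_pfns_busy(const char *line, struct pfn_range *r)
{
	const char *p = strstr(line, "alloc_contig_range: [");
	unsigned long long s, e;

	if (!p)
		return -1;
	if (sscanf(p, "alloc_contig_range: [%llx, %llx)", &s, &e) != 2)
		return -1;
	r->start = s;
	r->end = e;
	return 0;
}

/* Physical address of the first byte of a page frame. */
static uint64_t pfn_to_phys(uint64_t pfn)
{
	return pfn << PAGE_SHIFT_4K;
}

/* Size of the range in bytes. */
static uint64_t range_bytes(const struct pfn_range *r)
{
	return (r->end - r->start) << PAGE_SHIFT_4K;
}
```

Feeding the sample lines above through parse_pfns_busy() shows that each message covers a single page, and counting distinct start values is a cheap way to confirm how few unique ranges are involved.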
PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy
I didn't get any responses to this. git bisect shows that the problem did actually exist in 4.5.0-rc6, but has gotten worse by many orders of magnitude (< 1/week to ~20M/hour). Presently with 4.9-rc5, it's now writing ~2.5GB/hour to syslog. The list of addresses in that time is only ~80 unique ranges, each appearing ~320K times. They don't appear exactly in order, so the kernel does not squelch the log message for appearing too frequently. Could somebody at least make a suggestion on how to trace the printed range to somewhere in the kernel? On Sat, Nov 19, 2016 at 03:25:32AM +0000, Robin H. Johnson wrote: > (Replies CC to list and direct to me please) > > Summary: > > dmesg spammed with alloc_contig_range: [XX, YY) PFNs busy > > Description: > > I recently upgrading 4.9-rc5, (previous kernel 4.5.0-rc6-00141-g6794402), > and since then my dmesg has been absolutely flooded with 'PFNs busy' > (>3GiB/day). My config did not change (all new options =n). > > It's not consistent addresses, so the squelch of identical printk lines > hasn't helped. > Eg output: > [187487.621916] alloc_contig_range: [83f0a9, 83f0aa) PFNs busy > [187487.621924] alloc_contig_range: [83f0ce, 83f0cf) PFNs busy > [187487.621976] alloc_contig_range: [83f125, 83f126) PFNs busy > [187487.622013] alloc_contig_range: [83f127, 83f128) PFNs busy > > Keywords: > - > mm, alloc_contig_range, CMA > > Most recent kernel version which did not have the bug: > -- > Known 4.5.0-rc6-00141-g6794402 > > ver_linux: > -- > Linux bohr-int 4.9.0-rc5-00177-g81bcfe5 #12 SMP Wed Nov 16 13:16:32 PST > 2016 x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel > GNU/Linux > > GNU C 5.3.0 > GNU Make 4.2.1 > Binutils 2.25.1 > Util-linux2.29 > Mount 2.29 > Quota-tools 4.03 > Linux C Library 2.23 > Dynamic linker (ldd) 2.23 > readlink: missing operand > Try 'readlink --help' for more information. 
> Procps3.3.12 > Net-tools 1.60 > Kbd 2.0.3 > Console-tools 2.0.3 > Sh-utils 8.25 > Udev 230 > Modules Loaded3w_sas 3w_ ablk_helper aesni_intel > aes_x86_64 af_packet ahci aic79xx amdgpu async_memcpy async_pq > async_raid6_recov async_tx async_xor ata_piix auth_rpcgss binfmt_misc > bluetooth bnep bnx2 bonding btbcm btintel btrfs btrtl btusb button cdrom > cn configs coretemp crc32c_intel crc32_pclmul crc_ccitt crc_itu_t > crct10dif_pclmul cryptd dca dm_bio_prison dm_bufio dm_cache dm_cache_smq > dm_crypt dm_delay dm_flakey dm_log dm_log_userspace dm_mirror dm_mod > dm_multipath dm_persistent_data dm_queue_length dm_raid dm_region_hash > dm_round_robin dm_service_time dm_snapshot dm_thin_pool dm_zero drm > drm_kms_helper dummy e1000 e1000e evdev ext2 fat fb_sys_fops > firewire_core firewire_ohci fjes fscache fuse ghash_clmulni_intel > glue_helper grace hangcheck_timer hid_a4tech hid_apple hid_belkin > hid_cherry hid_chicony hid_cypress hid_ezkey hid_generic hid_gyration > hid_logitech hid_logitech_dj hid_microsoft hid_monterey hid_petalynx > hid_pl hid_samsung hid_sony hid_sunplus hwmon_vid i2c_algo_bit i2c_i801 > i2c_smbus igb input_leds intel_rapl ip6_udp_tunnel ipv6 irqbypass > iscsi_tcp iTCO_vendor_support iTCO_wdt ixgb ixgbe jfs kvm kvm_intel > libahci libata libcrc32c libiscsi libiscsi_tcp linear lockd lpc_ich lpfc > lrw macvlan mdio md_mod megaraid_mbox megaraid_mm megaraid_sas mii > mptbase mptfc mptsas mptscsih mptspi multipath nfs nfs_acl nfsd > nls_cp437 nls_iso8859_1 nvram ohci_hcd pata_jmicron pata_marvell > pata_platform pcspkr psmouse qla1280 qla2xxx r8169 radeon raid0 raid10 > raid1 raid456 raid6_pq reiserfs rfkill sata_mv sata_sil24 > scsi_transport_fc scsi_transport_iscsi scsi_transport_sas > scsi_transport_spi sd_mod sg sky2 snd snd_hda_codec > snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek > snd_hda_core snd_hda_intel snd_hwdep snd_pcm snd_timer soundcore sr_mod > sunrpc syscopyarea sysfillrect sysimgblt tg3 ttm uas udp_tunnel > 
usb_storage vfat virtio virtio_net virtio_ring vxlan w83627ehf > x86_pkg_temp_thermal xfs xhci_hcd xhci_pci xor zlib_deflate -- Robin Hugh Johnson E-Mail : robb...@orbis-terrarum.net Home Page : http://www.orbis-terrarum.net/?l=people.robbat2 ICQ# : 30269588 or 41961639 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 signature.asc Description: Digital signature
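The ranges in the log lines quoted above are page frame numbers (PFNs), not byte addresses. As a first step toward tracing a printed range, the following minimal sketch pulls the range out of a log line and converts a PFN to a physical address. It assumes 4 KiB pages (a page shift of 12, the usual x86_64 configuration); `parse_pfn_range`, `pfn_to_phys` and `SKETCH_PAGE_SHIFT` are illustrative names, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12 as on typical x86_64. */
#define SKETCH_PAGE_SHIFT 12

struct pfn_range {
    uint64_t start; /* first PFN, inclusive */
    uint64_t end;   /* end PFN, exclusive */
};

/*
 * Extract the "[start, end)" PFN pair from a line such as
 *   "[187487.621916] alloc_contig_range: [83f0a9, 83f0aa) PFNs busy"
 * Returns 1 on success, 0 if the line carries no such range.
 */
static int parse_pfn_range(const char *line, struct pfn_range *out)
{
    const char *p;
    unsigned long long s, e;

    assert(line != NULL && out != NULL);

    /* Skip the dmesg timestamp by locating the function name first. */
    p = strstr(line, "alloc_contig_range: [");
    if (p == NULL)
        return 0;
    if (sscanf(p, "alloc_contig_range: [%llx, %llx)", &s, &e) != 2)
        return 0;

    out->start = s;
    out->end = e;
    return 1;
}

/* Convert a page frame number to the physical address of its first byte. */
static uint64_t pfn_to_phys(uint64_t pfn)
{
    return pfn << SKETCH_PAGE_SHIFT;
}
```

The resulting physical address can then be compared against the "cma: Reserved ..." line printed at boot to see whether the busy range lies inside the CMA area.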
Re: PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy
(I'm going to respond directly to this email with the stack trace.)

On Wed, Nov 30, 2016 at 02:28:49PM +0100, Michal Hocko wrote:
> > On the other hand, if this didn’t happen and now happens all the time,
> > this indicates a regression in CMA’s capability to allocate pages so
> > just rate limiting the output would hide the potential actual issue.
>
> Or there might be just a much larger demand on those large blocks, no?
> But seriously, dumping those message again and again into the low (see
> the 2.5_GB_/h to the log is just insane. So there really should be some
> throttling.
>
> Does the following help you Robin. At least to not get swamped by those
> message.

Here's what I whipped up based on that, to ensure that dump_stack got
rate-limited at the same pass as PFNs-busy. It dropped the dmesg spew to
~25MB/hour (and is suppressing ~43 entries/second right now).

commit 6ad4037e18ec2199f8755274d8a745a9904241a1
Author: Robin H. Johnson <robb...@gentoo.org>
Date:   Wed Nov 30 10:32:57 2016 -0800

    mm: ratelimit & trace PFNs busy.

    Signed-off-by: Robin H. Johnson <robb...@gentoo.org>

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..3c28ec3d18f8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7289,8 +7289,15 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 
 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
-		pr_info("%s: [%lx, %lx) PFNs busy\n",
-			__func__, outer_start, end);
+		static DEFINE_RATELIMIT_STATE(ratelimit_pfn_busy,
+					DEFAULT_RATELIMIT_INTERVAL,
+					DEFAULT_RATELIMIT_BURST);
+		if (__ratelimit(&ratelimit_pfn_busy)) {
+			pr_info("%s: [%lx, %lx) PFNs busy\n",
+				__func__, outer_start, end);
+			dump_stack();
+		}
+
 		ret = -EBUSY;
 		goto done;
 	}

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
Somewhere in the Radeon/DRM codebase, CMA page allocation has either
regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is
doing something different with pages.

Given that I haven't seen ANY other reports of this, I'm inclined to
believe the problem is drm/radeon specific (if I don't start X, I can't
reproduce the problem).

The rate of the problem starts slow, and also is relatively low on an idle
system (my screens blank at night, no xscreensaver running), but it still
ramps up over time (to the point of generating 2.5GB/hour of "(timestamp)
alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses
(~100 unique ranges for a day).

My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors
w/ 9 virtual desktops per monitor).

I added a stack trace & rate limit to alloc_contig_range's PFNs busy
message (patch in previous email on LKML/-MM lists); and they point to
radeon.

alloc_contig_range: [83f2a3, 83f2a4) PFNs busy
CPU: 3 PID: 8518 Comm: X Not tainted 4.9.0-rc7-00024-g6ad4037e18ec #27
Hardware name: System manufacturer System Product Name/P8Z68 DELUXE, BIOS 0501 05/09/2011
 ad50c3d7f730 b236c873 0083f2a3 0083f2a4
 ad50c3d7f810 b2183b38 999dff4d8040 20fca8c0
 0083f400 0083f000 0083f2a3 0004
Call Trace:
 [] dump_stack+0x85/0xc2
 [] alloc_contig_range+0x368/0x370
 [] cma_alloc+0x127/0x2e0
 [] dma_alloc_from_contiguous+0x38/0x40
 [] dma_generic_alloc_coherent+0x91/0x1d0
 [] x86_swiotlb_alloc_coherent+0x25/0x50
 [] ttm_dma_populate+0x48a/0x9a0 [ttm]
 [] ? __kmalloc+0x1b6/0x250
 [] radeon_ttm_tt_populate+0x22a/0x2d0 [radeon]
 [] ? ttm_dma_tt_init+0x67/0xc0 [ttm]
 [] ttm_tt_bind+0x37/0x70 [ttm]
 [] ttm_bo_handle_move_mem+0x528/0x5a0 [ttm]
 [] ? shmem_alloc_inode+0x1a/0x30
 [] ttm_bo_validate+0x114/0x130 [ttm]
 [] ? _raw_write_unlock+0xe/0x10
 [] ttm_bo_init+0x31d/0x3f0 [ttm]
 [] radeon_bo_create+0x19b/0x260 [radeon]
 [] ? radeon_update_memory_usage.isra.0+0x50/0x50 [radeon]
 [] radeon_gem_object_create+0xad/0x180 [radeon]
 [] radeon_gem_create_ioctl+0x5f/0xf0 [radeon]
 [] drm_ioctl+0x21b/0x4d0 [drm]
 [] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
 [] radeon_drm_ioctl+0x4c/0x80 [radeon]
 [] do_vfs_ioctl+0x92/0x5c0
 [] SyS_ioctl+0x79/0x90
 [] do_syscall_64+0x73/0x190
 [] entry_SYSCALL64_slow_path+0x25/0x25

The Radeon card in my case is a VisionTek HD 7750 Eyefinity 6, which is
reported as:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] (prog-if 00 [VGA controller])
	Subsystem: VISIONTEK Cape Verde PRO [Radeon HD 7750/8740 / R7 250E]
	Flags: bus master, fast devsel, latency 0, IRQ 58
	Memory at c000 (64-bit, prefetchable) [size=256M]
	Memory at fbe0 (64-bit, non-prefetchable) [size=256K]
	I/O ports at e000 [size=256]
	Expansion ROM at 000c [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
	Capabilities: [150] Advanced Error Reporting
	Kernel driver in use: radeon
	Kernel modules: radeon, amdgpu

-- 
Robin Hugh Johnson
E-Mail     : robb...@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ#       : 30269588 or 41961639
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
signature.asc Description: Digital signature
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Thu, Dec 01, 2016 at 08:38:15AM +0100, Vlastimil Babka wrote:
> >> By default config this should not be used on x86.
> > What do you mean by that statement?
>
> I mean that the 16 mbytes for generic CMA area is not a default on x86:
>
> config CMA_SIZE_MBYTES
>         int "Size in Mega Bytes"
>         depends on !CMA_SIZE_SEL_PERCENTAGE
>         default 0 if X86
>         default 16

d7be003a9d275299f5ee36bbdf156654f59e08e9 (v3.18-2122-gd7be003a9d27) is
where the 0MB if-x86 default was added to the tree. Prior to that, it was
16MiB, and that's where my system picked up the value from.

I have a record of all my kconfigs, because I use oldconfig each time
(going back 8 years to 2.6.27).

# Added in 3.12.0-1-g5f258d0
CONFIG_CMA=y
# Added in 3.16.0-rc6-00042-g67dd8f3
CONFIG_CMA_ALIGNMENT=8
CONFIG_CMA_AREAS=7
CONFIG_CMA_SIZE_MBYTES=16
CONFIG_CMA_SIZE_SEL_MBYTES=y
CONFIG_DMA_CMA=y

So the next question is: why did I pick up CMA in
3.16.0-rc6-00042-g67dd8f3... I'll poke at that.

> > Yes, I'd say if there's a fallback without much penalty, nowarn makes
> > sense. If the fallback just tries multiple addresses until success, then
> > the warning should only be issued when too many attempts have been made.
> On the other hand, if the warnings are correlated with high kernel CPU usage,
> it's arguably better to be warned.
Keep the rate-limit on the warning for cases like this?

> >> > The rate of the problem starts slow, and also is relatively low on an
> >> > idle system (my screens blank at night, no xscreensaver running), but
> >> > it still ramps up over time (to the point of generating 2.5GB/hour of
> >> > "(timestamp) alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with
> >> > various addresses (~100 unique ranges for a day).
> >> >
> >> > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors
> >> > w/ 9 virtual desktops per monitor).
> >> So IIUC, except the messages, everything actually works fine?
> > There's high kernel CPU usage that seems to roughly correlate with the
> > messages, but I can't yet tell if that's due to the syslog itself, or
> > repeated alloc_contig_range requests.
> You could try running perf top.
Will do in the morning.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Wed, Nov 30, 2016 at 10:24:59PM +0100, Vlastimil Babka wrote:
> [add more CC's]
>
> On 11/30/2016 09:19 PM, Robin H. Johnson wrote:
> > Somewhere in the Radeon/DRM codebase, CMA page allocation has either
> > regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is
> > doing something different with pages.
>
> Could be that it didn't use dma_generic_alloc_coherent() before, or you
> didn't have the generic CMA pool configured.
v4.9-rc7-23-gded6e842cf49:
[0.00] cma: Reserved 16 MiB at 0x00083e40
[0.00] Memory: 32883108K/33519432K available (6752K kernel code, 1244K rwdata, 4716K rodata, 1772K init, 2720K bss, 619940K reserved, 16384K cma-reserved)

> What's the output of "grep CMA" on your
> .config?
# grep CMA .config |grep -v -e SECMARK= -e CONFIG_BCMA -e CONFIG_USB_HCD_BCMA -e INPUT_CMA3000 -e CRYPTO_CMAC
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
CONFIG_CMA_AREAS=7
CONFIG_DMA_CMA=y
CONFIG_CMA_SIZE_MBYTES=16
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8

> Or any kernel boot options with cma in name?
None.

> By default config this should not be used on x86.
What do you mean by that statement?
It should be disallowed to enable CONFIG_CMA? Radeon and CMA should be
mutually exclusive?

> > Given that I haven't seen ANY other reports of this, I'm inclined to
> > believe the problem is drm/radeon specific (if I don't start X, I can't
> > reproduce the problem).
>
> It's rather CMA specific, the allocation attemps just can't be 100% reliable
> due to how CMA works. The question is if it should be spewing in the log in
> the context of dma-cma, which has a fallback allocation option. It even uses
> __GFP_NOWARN, perhaps the CMA path should respect that?
Yes, I'd say if there's a fallback without much penalty, nowarn makes
sense. If the fallback just tries multiple addresses until success, then
the warning should only be issued when too many attempts have been made.

> > The rate of the problem starts slow, and also is relatively low on an idle
> > system (my screens blank at night, no xscreensaver running), but it still
> > ramps up over time (to the point of generating 2.5GB/hour of "(timestamp)
> > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses
> > (~100 unique ranges for a day).
> >
> > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors
> > w/ 9 virtual desktops per monitor).
> So IIUC, except the messages, everything actually works fine?
There's high kernel CPU usage that seems to roughly correlate with the
messages, but I can't yet tell if that's due to the syslog itself, or
repeated alloc_contig_range requests.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
signature.asc Description: Digital signature
Re: Regarding your thread on LKML - drm_radeon spamming alloc_contig_range [WAS: Re: PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy]
CC'd back to LKML.

On Thu, Jun 29, 2017 at 06:11:00PM +0530, Kumar Abhishek wrote:
> Hi Robin,
>
> I am an independent developer who stumbled upon your thread on the LKML
> after facing a similar issue - my kernel log being spammed by
> alloc_contig_range messages. I am running Linux on an ARM system
> (specifically the BeagleBoard-X15) and am on kernel version 4.9.33 with TI
> patches on top of it.
>
> I am running Debian Stretch (9.0) on the system.
>
> Here's what my stack trace looks like:
..
>
> It's somewhat similar to your stack trace, but this here happens on an
> etnaviv GPU (Vivante GCxx).
>
> In my case if I do 'sudo service lightdm stop', these messages stop too.
> This seems to suggest that the problem may be in the X server rather than
> the kernel? I seem to think this because I replicated this on an entirely
> different set of hardware than yours.
>
> I just wanted to bring this to your notice, and also ask you if you managed
> to solve it for yourself.
>
> One solution could be to demote the pr_info in alloc_contig_range to
> pr_debug or to do away with the message altogether, but this would be
> suppressing the issue instead of really knowing what it is about.
>
> Let me know how I could further investigate this.

The problem, as far as I got it diagnosed on LKML, is that some of the GPUs
have a bunch of non-fatal contiguous memory allocation requests: they
have a meaningful fallback path on the allocation, so 'PFNs busy' is a
false busy for their case.

However, if there was another consumer that does NOT have a fallback,
the output would still be crucially useful.

Attached is the patch that I unsuccessfully proposed on LKML to
rate-limit the messages, with the last revision to only dump_stack() if
CONFIG_CMA_DEBUG was set.

The path that LKML wanted was to add a new parameter to suppress or at
least demote the failure message, and update all of the callers: but it
means that many of the indirect callers need that added parameter as
well.

mm/cma.c:cma_alloc
	this call can suppress the error, you can see it retry.
mm/hugetlb.c:
	These callers should get the error message.

The error message DOES still have a good general use in notifying you
that something is going wrong. There was noticeable performance slowdown
in my case when it was trying hard to allocate.

-- 
Robin Hugh Johnson
E-Mail     : robb...@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ#       : 30269588 or 41961639
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

commit 808c209dc82ce79147122ca78e7047bc74a16149
Author: Robin H. Johnson <robb...@gentoo.org>
Date:   Wed Nov 30 10:32:57 2016 -0800

    mm: ratelimit & trace PFNs busy.

    Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
    Acked-by: Michal Nazarewicz <min...@mina86.com>

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..3c28ec3d18f8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7289,8 +7289,16 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 
 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
-		pr_info("%s: [%lx, %lx) PFNs busy\n",
-			__func__, outer_start, end);
+		static DEFINE_RATELIMIT_STATE(ratelimit_pfn_busy,
+					DEFAULT_RATELIMIT_INTERVAL,
+					DEFAULT_RATELIMIT_BURST);
+		if (__ratelimit(&ratelimit_pfn_busy)) {
+			pr_info("%s: [%lx, %lx) PFNs busy\n",
+				__func__, outer_start, end);
+			if (IS_ENABLED(CONFIG_CMA_DEBUG))
+				dump_stack();
+		}
+
 		ret = -EBUSY;
 		goto done;
 	}

signature.asc Description: Digital signature
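To illustrate the alternative described in this message (an extra parameter that lets callers with a fallback silence the failure message, while callers without one still get it), here is a small userspace sketch. Everything in it is hypothetical: `sketch_alloc_contig`, `sketch_dma_alloc`, `sketch_hugepage_alloc` and the `no_warn` flag are stand-ins for the idea, not the kernel's functions or any interface that was merged.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts the "PFNs busy" messages this sketch would have logged. */
static unsigned long busy_messages;

/*
 * Hypothetical stand-in for a contiguous allocation attempt.
 * range_busy simulates the isolation test failing; no_warn is the
 * added parameter that lets the caller suppress the log message.
 */
static int sketch_alloc_contig(bool range_busy, bool no_warn)
{
    if (!range_busy)
        return 0;
    if (!no_warn) {
        busy_messages++;
        fprintf(stderr, "sketch_alloc_contig: PFNs busy\n");
    }
    return -EBUSY;
}

enum sketch_path {
    SKETCH_PATH_CONTIG = 1,  /* contiguous allocation succeeded */
    SKETCH_PATH_FALLBACK = 2 /* caller fell back to another allocator */
};

/* A caller WITH a fallback: failure is expected noise, so stay quiet. */
static enum sketch_path sketch_dma_alloc(bool range_busy)
{
    int rc = sketch_alloc_contig(range_busy, true);

    if (rc == 0)
        return SKETCH_PATH_CONTIG;
    assert(rc == -EBUSY); /* the only failure this sketch models */
    return SKETCH_PATH_FALLBACK;
}

/* A caller WITHOUT a fallback: the message is still wanted here. */
static int sketch_hugepage_alloc(bool range_busy)
{
    return sketch_alloc_contig(range_busy, false);
}
```

The cost noted above shows up even in this toy version: every caller between the policy decision and the allocator has to carry the extra flag.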
[PATCH] firmware: cleanup FIRMWARE_IN_KERNEL message
The help for FIRMWARE_IN_KERNEL still references the firmware_install
command that was recently removed by commit 5620a0d1aacd ("firmware:
delete in-kernel firmware").

Clean up the message to direct the user to their distribution's
linux-firmware package, and remove any reference to firmware being
included in the kernel source tree.

Cc: Greg K-H <gre...@linuxfoundation.org>
Cc: Masahiro Yamada <yamada.masah...@socionext.com>
Cc: David Woodhouse <dw...@infradead.org>
Signed-off-by: Robin H. Johnson <robb...@gentoo.org>
---
 drivers/base/Kconfig | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 2f6614c9a229..bdc87907d6a1 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -91,22 +91,23 @@ config FIRMWARE_IN_KERNEL
 	depends on FW_LOADER
 	default y
 	help
-	  The kernel source tree includes a number of firmware 'blobs'
-	  that are used by various drivers. The recommended way to
-	  use these is to run "make firmware_install", which, after
-	  converting ihex files to binary, copies all of the needed
-	  binary files in firmware/ to /lib/firmware/ on your system so
-	  that they can be loaded by userspace helpers on request.
+	  Various drivers in the kernel source tree may require firmware,
+	  which is generally available in your distribution's linux-firmware
+	  package.
+
+	  The linux-firmware package should install firmware into
+	  /lib/firmware/ on your system, so they can be loaded by userspace
+	  helpers on request.
 
 	  Enabling this option will build each required firmware blob
-	  into the kernel directly, where request_firmware() will find
-	  them without having to call out to userspace. This may be
-	  useful if your root file system requires a device that uses
-	  such firmware and do not wish to use an initrd.
+	  specified by EXTRA_FIRMWARE into the kernel directly, where
+	  request_firmware() will find them without having to call out to
+	  userspace. This may be useful if your root file system requires a
+	  device that uses such firmware and you do not wish to use an
+	  initrd.
 
 	  This single option controls the inclusion of firmware for
-	  every driver that uses request_firmware() and ships its
-	  firmware in the kernel source tree, which avoids a
+	  every driver that uses request_firmware(), which avoids a
 	  proliferation of 'Include firmware for xxx device' options.
 
 	  Say 'N' and let firmware be loaded from userspace.
-- 
2.14.1
Re: [PATCH 2/3] firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
+1 on this series. Signed-off-by: Robin H. Johnson <robb...@gentoo.org> On Tue, Jan 23, 2018 at 06:06:31PM -0800, Benjamin Gilbert wrote: > It doesn't actually do anything. Merge its help text into > EXTRA_FIRMWARE. > > Fixes: 5620a0d1aacd ("firmware: delete in-kernel firmware") > Fixes: 0946b2fb38fd ("firmware: cleanup FIRMWARE_IN_KERNEL message") > Signed-off-by: Benjamin Gilbert <benjamin.gilb...@coreos.com> > Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org> > Cc: Robin H. Johnson <robb...@gentoo.org> > --- > arch/arc/configs/axs101_defconfig | 1 - > arch/arc/configs/axs103_defconfig | 1 - > arch/arc/configs/axs103_smp_defconfig | 1 - > arch/arc/configs/haps_hs_defconfig | 1 - > arch/arc/configs/haps_hs_smp_defconfig | 1 - > arch/arc/configs/hsdk_defconfig| 1 - > arch/arc/configs/nsim_700_defconfig| 1 - > arch/arc/configs/nsim_hs_defconfig | 1 - > arch/arc/configs/nsim_hs_smp_defconfig | 1 - > arch/arc/configs/nsimosci_defconfig| 1 - > arch/arc/configs/nsimosci_hs_defconfig | 1 - > arch/arc/configs/nsimosci_hs_smp_defconfig | 1 - > arch/arc/configs/tb10x_defconfig | 1 - > arch/arc/configs/vdk_hs38_defconfig| 1 - > arch/arc/configs/vdk_hs38_smp_defconfig| 1 - > arch/arm/configs/cns3420vb_defconfig | 1 - > arch/arm/configs/magician_defconfig| 1 - > arch/arm/configs/mini2440_defconfig| 1 - > arch/arm/configs/mv78xx0_defconfig | 1 - > arch/arm/configs/mxs_defconfig | 1 - > arch/arm/configs/orion5x_defconfig | 1 - > arch/arm/configs/tegra_defconfig | 1 - > arch/arm/configs/vf610m4_defconfig | 1 - > arch/m68k/configs/amiga_defconfig | 1 - > arch/m68k/configs/apollo_defconfig | 1 - > arch/m68k/configs/atari_defconfig | 1 - > arch/m68k/configs/bvme6000_defconfig | 1 - > arch/m68k/configs/hp300_defconfig | 1 - > arch/m68k/configs/mac_defconfig| 1 - > arch/m68k/configs/multi_defconfig | 1 - > arch/m68k/configs/mvme147_defconfig| 1 - > arch/m68k/configs/mvme16x_defconfig| 1 - > arch/m68k/configs/q40_defconfig| 1 - > arch/m68k/configs/sun3_defconfig | 1 - > 
arch/m68k/configs/sun3x_defconfig | 1 - > arch/mips/configs/ar7_defconfig| 1 - > arch/mips/configs/ath25_defconfig | 1 - > arch/mips/configs/ath79_defconfig | 1 - > arch/mips/configs/pic32mzda_defconfig | 1 - > arch/mips/configs/qi_lb60_defconfig| 1 - > arch/mips/configs/rt305x_defconfig | 1 - > arch/mips/configs/xway_defconfig | 1 - > arch/mn10300/configs/asb2364_defconfig | 1 - > arch/powerpc/configs/44x/warp_defconfig| 1 - > arch/powerpc/configs/mpc512x_defconfig | 1 - > arch/powerpc/configs/ppc6xx_defconfig | 1 - > arch/powerpc/configs/ps3_defconfig | 1 - > arch/powerpc/configs/wii_defconfig | 1 - > arch/s390/configs/zfcpdump_defconfig | 1 - > arch/sh/configs/polaris_defconfig | 1 - > arch/tile/configs/tilegx_defconfig | 1 - > arch/tile/configs/tilepro_defconfig| 1 - > drivers/base/Kconfig | 28 +--- > 53 files changed, 5 insertions(+), 75 deletions(-) > > diff --git a/arch/arc/configs/axs101_defconfig > b/arch/arc/configs/axs101_defconfig > index ec7c849a5c8e..09f85154c5a4 100644 > --- a/arch/arc/configs/axs101_defconfig > +++ b/arch/arc/configs/axs101_defconfig > @@ -44,7 +44,6 @@ CONFIG_IP_PNP_RARP=y > CONFIG_DEVTMPFS=y > # CONFIG_STANDALONE is not set > # CONFIG_PREVENT_FIRMWARE_BUILD is not set > -# CONFIG_FIRMWARE_IN_KERNEL is not set > CONFIG_SCSI=y > CONFIG_BLK_DEV_SD=y > CONFIG_NETDEVICES=y > diff --git a/arch/arc/configs/axs103_defconfig > b/arch/arc/configs/axs103_defconfig > index 63d3cf69e0b0..09fed3ef22b6 100644 > --- a/arch/arc/configs/axs103_defconfig > +++ b/arch/arc/configs/axs103_defconfig > @@ -44,7 +44,6 @@ CONFIG_IP_PNP_RARP=y > CONFIG_DEVTMPFS=y > # CONFIG_STANDALONE is not set > # CONFIG_PREVENT_FIRMWARE_BUILD is not set > -# CONFIG_FIRMWARE_IN_KERNEL is not set > CONFIG_BLK_DEV_LOOP=y > CONFIG_SCSI=y > CONFIG_BLK_DEV_SD=y > diff --git a/arch/arc/configs/axs103_smp_defconfig > b/arch/arc/configs/axs103_smp_defconfig > index f613ecac14a7..ea2f6d817d1a 100644 > --- a/arch/arc/configs/axs103_smp_defconfig > +++ 
b/arch/arc/configs/axs103_smp_defconfig > @@ -45,7 +45,6 @@ CONFIG_IP_PNP_RARP=y > CONFIG_DEVTMPFS=y > # CONFIG_STANDALONE
Intel Core Duo/Duo2 T2300/E6400 - Hyper-Threading (the absence of)
(Please CC me, I am not subscribed to LKML [I have set the
Mail-Followup-To header accordingly]).

On two of my new machines, with Intel Core Duo T2300 and Core2 Duo E6400
chips respectively, I noticed some weirdness in how many CPUs are present.

If the hyper-threading bit is present in the CPU info, should there always
be an extra CPU presented to the system per physical core?

Both the Core1 and Core2 chips I have show the ht bit set, but present only
their two physical cores to the system. No access to the hyper-threading
capabilities at all. I also see no configuration options in the BIOS to
enable or disable hyper-threading.

That is, /proc/cpuinfo and all topology data only shows 2 CPUs present, and
that they are not the HT pair. (CONFIG_NR_CPUS=8 is set).

(This was originally triggered by somebody else's code that read the CPU
flags, saw hyper-threading, and decided there were 2x cpus for each
physical core. Said code has already been taken out back and shot
repeatedly).

-- 
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail     : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
pgp6ZAuJuHuyo.pgp Description: PGP signature
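As background to the question above: as far as I understand the Intel CPUID documentation, the "ht" flag (CPUID leaf 1, EDX bit 28) only says that the package may contain more than one logical processor, which is also true of a plain dual-core part. A minimal sketch of a more careful check follows; it compares the logical-processor ID count (leaf 1, EBX bits 23:16) with the core ID count (leaf 4, EAX bits 31:26, plus one). The function names are illustrative, the register values are passed in by the caller so nothing here executes CPUID itself, and both fields are "maximum addressable ID" counts, so the result is a hint, not an exact topology.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 1, EDX bit 28: the flag shown as "ht" in /proc/cpuinfo. */
static int cpuid1_htt_flag(uint32_t edx)
{
    return (int)((edx >> 28) & 1u);
}

/* CPUID leaf 1, EBX bits 23:16: max addressable logical processor IDs. */
static unsigned int cpuid1_logical_ids(uint32_t ebx)
{
    return (ebx >> 16) & 0xffu;
}

/* CPUID leaf 4, EAX bits 31:26: max addressable core IDs, minus one. */
static unsigned int cpuid4_core_ids(uint32_t eax)
{
    return ((eax >> 26) & 0x3fu) + 1u;
}

/*
 * Returns 1 only if the package advertises more logical processor IDs
 * than core IDs, i.e. if cores may have SMT siblings; the "ht" flag
 * alone is not treated as proof of hyper-threading.
 */
static int package_may_have_smt(uint32_t leaf1_edx, uint32_t leaf1_ebx,
                                uint32_t leaf4_eax)
{
    unsigned int logical = cpuid1_logical_ids(leaf1_ebx);
    unsigned int cores = cpuid4_core_ids(leaf4_eax);

    assert(cores >= 1);
    if (!cpuid1_htt_flag(leaf1_edx))
        return 0;
    return logical > cores;
}
```

With synthetic register values for a dual-core package without SMT (flag set, 2 logical IDs, 2 core IDs) the check returns 0, which matches the two CPUs seen in /proc/cpuinfo.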
Re: [2.6.21.1] SATA freeze
On Sat, May 12, 2007 at 12:48:59PM -0600, Robert Hancock wrote: > Fred Moyer wrote: > > I just joined the list today so apologies if this email breaks any email > > client post threading. > > I have been seeing similar errors on two different systems. I applied > > Robert's sata_nv patch posted to the list on May 5th, and approved today by > > Jeff Garzik. I've taken several steps to insure that this isn't a faulty > > cable or drive issue. This is running on a hp dl145g2. Here is my lspci, > > dmesg, and relevant kernel config sections: > > (snip) > > > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen > > ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 > > in > > res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM > > violation) > > ata1: soft resetting port > > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > > ata1.00: configured for UDMA/100 > > ata1: EH complete > > This appears to be a different problem. Something is issuing SMART-related > commands (smartd or smartctl perhaps) which the drive seems to be reacting > strangely to. It apparently completed the command but never raised DRQ to > request any data being transferred even though we expected it to. Maybe > SMART is disabled on the drive and that's causing it to just toss these > commands? CCing linux-ide in case anyone knows what would cause this. I previously posted a near identical error to linux-ide. http://article.gmane.org/gmane.linux.ide/18375 Specifically, I could trigger it by running 'smartctl -d ata -S on /dev/sda' OR (s-S/o/). Same sata_nv controller, two different drives, many different cables. Reproducible over 7 systems [two different models of Tyan mobo] that I have. -- Robin Hugh Johnson Gentoo Linux Developer & Council Member E-Mail : [EMAIL PROTECTED] GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85 pgp1lg2M9qRYv.pgp Description: PGP signature
[PATCH] libata: provide the ability to disable a disk via the params.
This was posted by a user on StackExchange, who has a failing SSD that's
soldered directly onto the motherboard of his system. The BIOS does not
give any option to disable it at all, so he can't just hide it that way.

The old IDE layer had hdX=noprobe override for situations like this, but
that was never ported to the libata layer.

Signed-off-by: Robin H. Johnson
X-URL: http://unix.stackexchange.com/questions/102648/how-to-tell-linux-kernel-3-0-to-completely-ignore-a-failing-disk
X-URL: http://askubuntu.com/questions/352836/how-can-i-tell-linux-kernel-to-completely-ignore-a-disk-as-if-it-was-not-even-co
X-URL: http://superuser.com/questions/599333/how-to-disable-kernel-probing-for-drive
---
 Documentation/kernel-parameters.txt | 2 ++
 drivers/ata/libata-core.c           | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a5..40bf5ff 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1529,6 +1529,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
 			* atapi_dmadir: Enable ATAPI DMADIR bridge support
 
+			* [no]disable: Enable or disable this device.
+
 			If there are multiple matching configurations changing
 			the same attribute, the last one is used.
 
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 75b9367..5069a96 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6519,6 +6519,8 @@ static int __init ata_parse_force_one(char **cur,
 		{ "norst",	.lflags		= ATA_LFLAG_NO_HRST | ATA_LFLAG_NO_SRST },
 		{ "rstonce",	.lflags		= ATA_LFLAG_RST_ONCE },
 		{ "atapi_dmadir", .horkage_on	= ATA_HORKAGE_ATAPI_DMADIR },
+		{ "disable",	.horkage_on	= ATA_HORKAGE_DISABLE },
+		{ "nodisable",	.horkage_off	= ATA_HORKAGE_DISABLE },
 	};
 	char *start = *cur, *p = *cur;
 	char *id, *val, *endp;
-- 
1.8.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] libata: provide the ability to disable a disk via the params.
On Thu, Dec 12, 2013 at 08:39:35AM -0500, Tejun Heo wrote:
> Hello, Robin.
>
> On Sat, Dec 07, 2013 at 04:56:27PM -0800, Robin H. Johnson wrote:
> > +		{ "disable",	.horkage_on	= ATA_HORKAGE_DISABLE },
> > +		{ "nodisable",	.horkage_off	= ATA_HORKAGE_DISABLE },
> Given the current usage of ATA_HORKAGE_DISABLE, I don't think we need
> "nodisable". Let's just add "disable" for now. Can you please update
> the patch and resend?
Before I do so, I have two questions:

1. Countering your nodisable comment, would it be valid to do:
   libata.force=2:disable libata.force=2.02:nodisable
   To disable all of port 2 except device 2?

2. One of my friends wondered if it would be worthwhile to add force
   keywords for other HORKAGE bits, and if so, should the
   ata_lflag/ata_link force bits also be presented?

There are only 3 HORKAGE bits presently available in libata.force:
ATA_HORKAGE_NONCQ
ATA_HORKAGE_DUMP_ID
ATA_HORKAGE_ATAPI_DMADIR

And 3 ata_link flags:
ATA_LFLAG_NO_HRST
ATA_LFLAG_NO_SRST
ATA_LFLAG_RST_ONCE

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail     : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: [PATCH] libata: provide the ability to disable a disk via the params.
On Thu, Dec 12, 2013 at 09:36:55PM +0100, Levente Kurusa wrote:
> > 2. One of my friends wondered if it would be worthwhile to add force
> > keywords for other HORKAGE bits, and if so, should the
> > ata_lflag/ata_link force bits also be presented?
> I don't think so. Most of the other HORKAGEs are automatically
> recognized and applied by the code. I think the only ones
> which can cause trouble if not detected at first are the ones that are
> currently in the list.
His thinking was that it would aid debugging/testing on new buggy
devices if the options are available at boot.

I'd think of the following as candidates for that:
ATA_HORKAGE_NODMA
ATA_HORKAGE_MAX_SEC_128
ATA_HORKAGE_DIAGNOSTIC
ATA_HORKAGE_BROKEN_HPA
ATA_HORKAGE_DISABLE
ATA_HORKAGE_HPA_SIZE
ATA_HORKAGE_IVB
ATA_HORKAGE_STUCK_ERR (only set by code presently, not by blacklist)
ATA_HORKAGE_BRIDGE_OK
ATA_HORKAGE_ATAPI_MOD16_DMA
ATA_HORKAGE_NOSETXFER
ATA_HORKAGE_MAX_SEC_LBA48

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail     : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
[PATCH v2] libata: disable a disk via libata.force params
A user on StackExchange had a failing SSD that's soldered directly onto the motherboard of his system. The BIOS does not give any option to disable it at all, so he can't just hide it from the OS via the BIOS.

The old IDE layer had hdX=noprobe override for situations like this, but that was never ported to the libata layer.

This patch implements a disable flag for libata.force.

Example use:
libata.force=2.0:disable

[v2 of the patch, removed the nodisable flag per Tejun Heo]

Signed-off-by: Robin H. Johnson
X-URL: http://unix.stackexchange.com/questions/102648/how-to-tell-linux-kernel-3-0-to-completely-ignore-a-failing-disk
X-URL: http://askubuntu.com/questions/352836/how-can-i-tell-linux-kernel-to-completely-ignore-a-disk-as-if-it-was-not-even-co
X-URL: http://superuser.com/questions/599333/how-to-disable-kernel-probing-for-drive
---
 Documentation/kernel-parameters.txt | 2 ++
 drivers/ata/libata-core.c           | 1 +
 2 files changed, 3 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 50680a5..b9e9bd8 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1529,6 +1529,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

 			* atapi_dmadir: Enable ATAPI DMADIR bridge support

+			* disable: Disable this device.
+
 			If there are multiple matching configurations changing
 			the same attribute, the last one is used.

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 75b9367..70529b8 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6519,6 +6519,7 @@ static int __init ata_parse_force_one(char **cur,
 		{ "norst",	.lflags		= ATA_LFLAG_NO_HRST | ATA_LFLAG_NO_SRST },
 		{ "rstonce",	.lflags		= ATA_LFLAG_RST_ONCE },
 		{ "atapi_dmadir", .horkage_on	= ATA_HORKAGE_ATAPI_DMADIR },
+		{ "disable",	.horkage_on	= ATA_HORKAGE_DISABLE },
 	};
 	char *start = *cur, *p = *cur;
 	char *id, *val, *endp;
-- 
1.8.4.3
Licensing & copyright of kernel .config files (defconfig, *config)
(Please CC me on replies, not subscribed to LKML) Hi, Somewhat of an odd question, but none of the files in question seem to have a copyright header on them... For a kernel .config file, either from one of the defconfig or any other *config option that automates the answer: 1. What license does the file fall under? 2. Who are the copyright holders? Naively, since the defconfigs are bundled with the kernel, that could fall under GPLv2-only implicitly, but lacking any explicit copyright headers makes this interesting (arch/*/configs/* contain lots of files, no copyright headers on them). If I manually write the names of some configuration options to a new .config file, at that point I logically am the only author and have copyright of it. My editor slaps a default license on it of BSD-2. Thereafter I run olddefconfig, and now it's a combined work of the kernel's defconfig and my manual settings. If GPL-2 was inherited from the kernel tree, this is now a combined BSD-GPL2 work, or is it? The kernel config tools did consider my file as input, possibly overrode the settings if they didn't work with others, and re-output everything. If the files are to be marked with a copyright header, who is the holder of it that it should be attributed to? Alternatively, is this a case where the work is not copyrightable, and the files should have a notice to that effect? Background: Gentoo has a bunch of "stock" kernel configurations for release engineering, our initramfs tool (genkernel), and other endeavors over the years. These projects claim BSD, GPL2, LGPL2 on various pieces, and I don't think they can all be correct. I'm working on getting them into one place, because some of them have been getting stale, but the differing licenses raised a red flag to me. 
-- Robin Hugh Johnson Gentoo Linux: Developer, Infrastructure Lead E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: Licensing & copyright of kernel .config files (defconfig, *config)
On Mon, Jun 02, 2014 at 12:01:46AM +0100, Ken Moffat wrote: > > Naively, since the defconfigs are bundled with the kernel, that could > > fall under GPLv2-only implicitly, but lacking any explicit copyright > > headers makes this interesting (arch/*/configs/* contain lots of files, > > no copyright headers on them). > I am not a lawyer, but surely _many_ of the kernel files do not > contain any explicit copyright information ? On closer inspection, more files than I thought don't have any explicit copyrights on them. ~67% of files in v3.13 had the text 'Copyright' or 'Licens' appear in them. > Why does your editor put a default license on anything ? It's my stock header, customized by per-directory vimrc. The non-project-specific default one actually has a CHANGEME string it in, to help remind me that it needs an edit before I release that file. I was just using the BSD license on the file as an example. Submissions to other open source projects are generally bound by the license of the project, with a few exceptions (I've put patches into public domain to avoid signing some CLA-like agreements). > If I was being awkward, I would suggest that the config would not > be useful until you had run it through "make oldconfig" or similar, > and that therefore the kernel license of GPL-2 applies. That's the case I was interested in :-). > > If the files are to be marked with a copyright header, who is the holder > > of it that it should be attributed to? > Iff the work is copyrightable (I do not have an opinion on that), > surely the license only matters if you breach it ? ;-) If you > distribute a compiled kernel with the source, and all of that source > is GPL-2, then I assume you are in the clear. For "extras" which > include binaries without source, my understanding is that you would > always be vulnerable to kernel copyright holders. So, I suspect > that the attribution of a config file is not particularly important. 
I agree with your reasoning if I was distributing kernel sources or compiled kernels, but this is going to be a package of kernel configurations only. > > Background: > > Gentoo has a bunch of "stock" kernel configurations for release > > engineering, our initramfs tool (genkernel), and other endeavors over > > the years. These projects claim BSD, GPL2, LGPL2 on various pieces, and > > I don't think they can all be correct. I'm working on getting them into > > one place, because some of them have been getting stale, but the > > differing licenses raised a red flag to me. > To the extent that GPL-2 can include LGPL-2 and BSD, I suggest that > you label them all as GPL-2. That is the licence of the kernel, and > for practical reasons it will not change (this was discussed when > somebody asked about GPL-3 : even if the main copyright holders > wanted to make the change (and many do not), some copyright holders > are no longer contactable). You might be able to dual-license some > of these distro files, but I have no idea if that would be appropriate. If the rest of the logic is correct, then the non-GPL2 license on these files was never valid in the first place; they inherited GPL2 from the kernel from the get go, and I don't need to be concerned about the hassle of formally relicensing them by contacting the authors of the configs (which again, aren't always contactable anymore). -- Robin Hugh Johnson Gentoo Linux: Developer, Infrastructure Lead E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
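The "~67% of files" estimate in the reply above can be approximated with a short script. This is a sketch, not the command originally used to produce that figure. It assumes a plain case-sensitive substring match for 'Copyright' or 'Licens' and counts every regular file under the tree; the function names are illustrative.

```python
import os

# Marker strings taken from the message: 'Copyright' or 'Licens'
MARKERS = (b"Copyright", b"Licens")


def has_marker(path):
    """Return True if the file contains any of the marker strings."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        # Unreadable files are treated as having no marker
        return False
    return any(marker in data for marker in MARKERS)


def marker_ratio(root):
    """Return (files_with_marker, total_files) for regular files under root."""
    total = 0
    marked = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so it does not skew the count
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            total += 1
            if has_marker(path):
                marked += 1
    return marked, total
```

Calling marker_ratio() on the root of a checked-out kernel tree returns the two counts, from which the percentage follows as 100 * marked / total.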
Re: Regarding your thread on LKML - drm_radeon spamming alloc_contig_range [WAS: Re: PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy]
CC'd back to LKML.

On Thu, Jun 29, 2017 at 06:11:00PM +0530, Kumar Abhishek wrote:
> Hi Robin,
>
> I am an independent developer who stumbled upon your thread on the LKML
> after facing a similar issue - my kernel log being spammed by
> alloc_contig_range messages. I am running Linux on an ARM system
> (specifically the BeagleBoard-X15) and am on kernel version 4.9.33 with TI
> patches on top of it.
>
> I am running Debian Stretch (9.0) on the system.
>
> Here's what my stack trace looks like:
..
>
> It's somewhat similar to your stack trace, but this here happens on an
> etnaviv GPU (Vivante GCxx).
>
> In my case if I do 'sudo service lightdm stop', these messages stop too.
> This seems to suggest that the problem may be in the X server rather than
> the kernel? I seem to think this because I replicated this on an entirely
> different set of hardware than yours.
>
> I just wanted to bring this to your notice, and also ask you if you managed
> to solve it for yourself.
>
> One solution could be to demote the pr_info in alloc_contig_range to
> pr_debug or to do away with the message altogether, but this would be
> suppressing the issue instead of really knowing what it is about.
>
> Let me know how I could further investigate this.

The problem, as far as it got diagnosed on LKML, is that some of the GPUs issue a bunch of non-fatal contiguous memory allocation requests: they have a meaningful fallback path on the allocation, so 'PFNs busy' is a false busy for their case.

However, if there was another consumer that does NOT have a fallback, the output would still be crucially useful.

Attached is the patch that I unsuccessfully proposed on LKML to rate-limit the messages, with the last revision only calling dump_stack() if CONFIG_CMA_DEBUG was set.

The path that LKML wanted was to add a new parameter to suppress or at least demote the failure message, and update all of the callers: but it means that many of the indirect callers need that added parameter as well.

mm/cma.c:cma_alloc this call can suppress the error, you can see it retry.
mm/hugetlb.c: These callers should get the error message.

The error message DOES still have a good general use in notifying you that something is going wrong. There was noticeable performance slowdown in my case when it was trying hard to allocate.

-- 
Robin Hugh Johnson
E-Mail : robb...@orbis-terrarum.net
Home Page : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ# : 30269588 or 41961639
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

commit 808c209dc82ce79147122ca78e7047bc74a16149
Author: Robin H. Johnson
Date: Wed Nov 30 10:32:57 2016 -0800

    mm: ratelimit & trace PFNs busy.

    Signed-off-by: Robin H. Johnson
    Acked-by: Michal Nazarewicz

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..3c28ec3d18f8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7289,8 +7289,16 @@ int alloc_contig_range(unsigned long start, unsigned long end,

 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
-		pr_info("%s: [%lx, %lx) PFNs busy\n",
-			__func__, outer_start, end);
+		static DEFINE_RATELIMIT_STATE(ratelimit_pfn_busy,
+					      DEFAULT_RATELIMIT_INTERVAL,
+					      DEFAULT_RATELIMIT_BURST);
+		if (__ratelimit(&ratelimit_pfn_busy)) {
+			pr_info("%s: [%lx, %lx) PFNs busy\n",
+				__func__, outer_start, end);
+			if (IS_ENABLED(CONFIG_CMA_DEBUG))
+				dump_stack();
+		}
+
 		ret = -EBUSY;
 		goto done;
 	}
Re: Idle loadavg of ~1, maybe MD related
On Sat, Jan 05, 2008 at 01:30:37AM -0800, Andrew Morton wrote: > >From that I'd suspect that kwindfarm is being a bad citizen. > If a process is consistently stuck in D state, run Windfarm. > echo w > /proc/sysrq-trigger > then record the resulting dmesg output so we can see where it got stuck. Traceback: [552710.416174] SysRq : Show Blocked State [552710.417876] taskPC stack pid father [552710.417888] kwindfarm D 0 829 2 [552710.417892] Call Trace: [552710.417895] [c0036c9835f0] [c0528b90] 0xc0528b90 (unreliable) [552710.417908] [c0036c9837c0] [c000f4a8] .__switch_to+0xd8/0x110 [552710.417985] [c0036c983850] [c03bb2a0] .schedule+0x62c/0x6c8 [552710.417992] [c0036c983940] [c03bb8c4] .schedule_timeout+0x3c/0xe8 [552710.417997] [c0036c983a10] [c03bb51c] .wait_for_common+0x100/0x1bc [552710.418002] [c0036c983ae0] [c0285300] .smu_fan_set+0x17c/0x1e4 [552710.418009] [c0036c983c30] [c0284078] .pm112_wf_notify+0xc50/0x12d0 [552710.418015] [c0036c983d20] [c03bfb84] .notifier_call_chain+0x5c/0xcc [552710.418021] [c0036c983dc0] [c006f4b4] .__blocking_notifier_call_chain+0x70/0xb0 [552710.418027] [c0036c983e70] [c0282d9c] .wf_thread_func+0x78/0x11c [552710.418032] [c0036c983f00] [c0069b00] .kthread+0x78/0xc4 [552710.418039] [c0036c983f90] [c0023d0c] .kernel_thread+0x4c/0x68 [552710.418064] Sched Debug Version: v0.07, 2.6.24-rc6-prod-g6f0f5304 #10 [552710.418067] now at 552723945.170635 msecs [552710.418070] .sysctl_sched_latency: 60.00 [552710.418073] .sysctl_sched_min_granularity: 12.00 [552710.418076] .sysctl_sched_wakeup_granularity : 30.00 [552710.418079] .sysctl_sched_batch_wakeup_granularity : 30.00 [552710.418082] .sysctl_sched_child_runs_first : 0.01 [552710.418085] .sysctl_sched_features : 7 Full output at http://dev.gentoo.org/~robbat2/20080105_windfarm_sysrq_w.txt -- Robin Hugh Johnson Gentoo Linux Developer & Infra Guy E-Mail : [EMAIL PROTECTED] GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85 pgp9Esutq5LQs.pgp Description: PGP signature
Re: Idle loadavg of ~1, maybe MD related
On Sun, Jan 06, 2008 at 10:21:57PM +1100, Paul Mackerras wrote: > Robin, what does the "motherboard" line in /proc/cpuinfo say on your > machine? motherboard : PowerMac11,2 MacRISC4 Power Macintosh -- Robin Hugh Johnson Gentoo Linux Developer & Infra Guy E-Mail : [EMAIL PROTECTED] GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85 pgphw30OEtyKF.pgp Description: PGP signature
Idle loadavg of ~1, maybe MD related
(Please CC me, I'm subbed to LKML). My G5, while running practically nothing (just sshd and some to watch the load), has a weird cycle of load averages. I think it might be related to MD, simply because that's the only thing that is clocking up cputime. A full cycle lasts approximately 27 minutes. MinsLoad 0-2 0.0-0.15 (stable, 0 level) 3-5 0.50, 0.80, 0.95 (fast increase) 6-210.95-1.10 (stable, 1 level) 22-24 0.9, 0.8, 0.1 (fast decrease, to 0 level) 25-27 0.2, 0.3, 0.15 (local maxima peak) Here's a graph of it, spanning 230 minutes: http://dev.gentoo.org/~robbat2/20071230-g5-loadavg-bug.png Processed data for the graph here: http://dev.gentoo.org/~robbat2/20071230-g5-loadavg-bug.txt For the entire 230 minute period, there was _no_ disk I/O. Not recorded by iostat, nor generated. # while true ; do uptime ; iostat -t 60 2 -N -d | tail -n15 ; done >/dev/shm/foo Example of single output pass for the above loop: 00:59:37 up 8:32, 2 users, load average: 0.02, 0.47, 0.66 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 0.00 0.00 0.00 0 0 sdb 0.00 0.00 0.00 0 0 md1 0.00 0.00 0.00 0 0 md0 0.00 0.00 0.00 0 0 md2 0.00 0.00 0.00 0 0 md3 0.00 0.00 0.00 0 0 vg-usr0.00 0.00 0.00 0 0 vg-var0.00 0.00 0.00 0 0 vg-tmp0.00 0.00 0.00 0 0 vg-opt0.00 0.00 0.00 0 0 vg-home 0.00 0.00 0.00 0 0 vg-usr_src0.00 0.00 0.00 0 0 vg-usr_portage 0.00 0.00 0.00 0 0 This is basically 1842c7f2 from Linus's tree, my own stuff is config'd out with =n for the moment. And the problem does still occur in the main tree. Snippet from the head of 'top', sorting by cputime. 
top - 01:59:08 up 9:32, 2 users, load average: 1.04, 0.87, 0.70 Tasks: 74 total, 1 running, 73 sleeping, 0 stopped, 0 zombie Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu1 : 0.0%us, 0.5%sy, 0.0%ni, 99.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu2 : 0.0%us, 0.2%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 12074480k total, 292520k used, 11781960k free,76812k buffers Swap: 8388536k total,0k used, 8388536k free, 144276k cached PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 4635 root 15 -5 000 S0 0.0 8:14.33 md3_raid1 3121 root 15 -5 000 S1 0.0 3:15.12 md2_raid1 3098 root 15 -5 000 S0 0.0 3:07.83 md1_raid1 3076 root 15 -5 000 S0 0.0 0:45.82 md0_raid1 829 root 15 -5 000 D0 0.0 0:01.85 kwindfarm 13 root 15 -5 000 S0 0.0 0:01.41 ksoftirqd/3 18 root 15 -5 000 S0 0.0 0:01.39 events/3 10 root 15 -5 000 S0 0.0 0:00.94 ksoftirqd/2 1 root 20 0 1900 652 576 S0 0.0 0:00.89 init 32086 root 20 0 9336 2696 2076 S0 0.0 0:00.87 sshd # ver_linux Linux buck-int 2.6.24-rc6-prod-g6f0f5304 #10 SMP Sat Dec 29 05:11:24 PST 2007 ppc64 PPC970MP, altivec supported PowerMac11,2 GNU/Linux Gnu C 4.2.2 Gnu make 3.81 binutils 2.18 util-linux 2.13 mount 2.13 module-init-tools 3.4 e2fsprogs 1.40.3 reiserfsprogs 3.6.19 xfsprogs 2.9.4 quota-tools3.15. PPP2.4.4 Linux C Library2.7 Dynamic linker (ldd) 2.7 Procps 3.2.7 Net-tools 1.60 Kbd1.13 Sh-utils 6.9 udev 118 wireless-tools 29 Modules Loaded nfsd exportfs auth_rpcgss ipv6 unix tg3 nfs_acl lockd sunrpc dm_mod # lsmod Module Size Used by nfsd 346552 1 exportfs8392 1 nfsd auth_rpcgss69152 1 nfsd ipv6 428760 20 unix 47384 13 tg3 159020 0 nfs_acl 6056 1 nfsd lockd 105248 1 nfsd sunrpc281248 6 nfsd,auth_rpcgss,nfs_acl,lockd dm_mod100520 15 # ps -ef UIDPID PPID C STIME TTY TIME CMD root 1 0 0 Dec29 ?00:00:00 init [3] root 2 0 0 Dec29 ?00:00:00 [kthreadd] root 3 2 0
Re: [PATCH 2/3] firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
+1 on this series. Signed-off-by: Robin H. Johnson On Tue, Jan 23, 2018 at 06:06:31PM -0800, Benjamin Gilbert wrote: > It doesn't actually do anything. Merge its help text into > EXTRA_FIRMWARE. > > Fixes: 5620a0d1aacd ("firmware: delete in-kernel firmware") > Fixes: 0946b2fb38fd ("firmware: cleanup FIRMWARE_IN_KERNEL message") > Signed-off-by: Benjamin Gilbert > Cc: Greg Kroah-Hartman > Cc: Robin H. Johnson > --- > arch/arc/configs/axs101_defconfig | 1 - > arch/arc/configs/axs103_defconfig | 1 - > arch/arc/configs/axs103_smp_defconfig | 1 - > arch/arc/configs/haps_hs_defconfig | 1 - > arch/arc/configs/haps_hs_smp_defconfig | 1 - > arch/arc/configs/hsdk_defconfig| 1 - > arch/arc/configs/nsim_700_defconfig| 1 - > arch/arc/configs/nsim_hs_defconfig | 1 - > arch/arc/configs/nsim_hs_smp_defconfig | 1 - > arch/arc/configs/nsimosci_defconfig| 1 - > arch/arc/configs/nsimosci_hs_defconfig | 1 - > arch/arc/configs/nsimosci_hs_smp_defconfig | 1 - > arch/arc/configs/tb10x_defconfig | 1 - > arch/arc/configs/vdk_hs38_defconfig| 1 - > arch/arc/configs/vdk_hs38_smp_defconfig| 1 - > arch/arm/configs/cns3420vb_defconfig | 1 - > arch/arm/configs/magician_defconfig| 1 - > arch/arm/configs/mini2440_defconfig| 1 - > arch/arm/configs/mv78xx0_defconfig | 1 - > arch/arm/configs/mxs_defconfig | 1 - > arch/arm/configs/orion5x_defconfig | 1 - > arch/arm/configs/tegra_defconfig | 1 - > arch/arm/configs/vf610m4_defconfig | 1 - > arch/m68k/configs/amiga_defconfig | 1 - > arch/m68k/configs/apollo_defconfig | 1 - > arch/m68k/configs/atari_defconfig | 1 - > arch/m68k/configs/bvme6000_defconfig | 1 - > arch/m68k/configs/hp300_defconfig | 1 - > arch/m68k/configs/mac_defconfig| 1 - > arch/m68k/configs/multi_defconfig | 1 - > arch/m68k/configs/mvme147_defconfig| 1 - > arch/m68k/configs/mvme16x_defconfig| 1 - > arch/m68k/configs/q40_defconfig| 1 - > arch/m68k/configs/sun3_defconfig | 1 - > arch/m68k/configs/sun3x_defconfig | 1 - > arch/mips/configs/ar7_defconfig| 1 - > arch/mips/configs/ath25_defconfig 
| 1 - > arch/mips/configs/ath79_defconfig | 1 - > arch/mips/configs/pic32mzda_defconfig | 1 - > arch/mips/configs/qi_lb60_defconfig| 1 - > arch/mips/configs/rt305x_defconfig | 1 - > arch/mips/configs/xway_defconfig | 1 - > arch/mn10300/configs/asb2364_defconfig | 1 - > arch/powerpc/configs/44x/warp_defconfig| 1 - > arch/powerpc/configs/mpc512x_defconfig | 1 - > arch/powerpc/configs/ppc6xx_defconfig | 1 - > arch/powerpc/configs/ps3_defconfig | 1 - > arch/powerpc/configs/wii_defconfig | 1 - > arch/s390/configs/zfcpdump_defconfig | 1 - > arch/sh/configs/polaris_defconfig | 1 - > arch/tile/configs/tilegx_defconfig | 1 - > arch/tile/configs/tilepro_defconfig| 1 - > drivers/base/Kconfig | 28 +--- > 53 files changed, 5 insertions(+), 75 deletions(-) > > diff --git a/arch/arc/configs/axs101_defconfig > b/arch/arc/configs/axs101_defconfig > index ec7c849a5c8e..09f85154c5a4 100644 > --- a/arch/arc/configs/axs101_defconfig > +++ b/arch/arc/configs/axs101_defconfig > @@ -44,7 +44,6 @@ CONFIG_IP_PNP_RARP=y > CONFIG_DEVTMPFS=y > # CONFIG_STANDALONE is not set > # CONFIG_PREVENT_FIRMWARE_BUILD is not set > -# CONFIG_FIRMWARE_IN_KERNEL is not set > CONFIG_SCSI=y > CONFIG_BLK_DEV_SD=y > CONFIG_NETDEVICES=y > diff --git a/arch/arc/configs/axs103_defconfig > b/arch/arc/configs/axs103_defconfig > index 63d3cf69e0b0..09fed3ef22b6 100644 > --- a/arch/arc/configs/axs103_defconfig > +++ b/arch/arc/configs/axs103_defconfig > @@ -44,7 +44,6 @@ CONFIG_IP_PNP_RARP=y > CONFIG_DEVTMPFS=y > # CONFIG_STANDALONE is not set > # CONFIG_PREVENT_FIRMWARE_BUILD is not set > -# CONFIG_FIRMWARE_IN_KERNEL is not set > CONFIG_BLK_DEV_LOOP=y > CONFIG_SCSI=y > CONFIG_BLK_DEV_SD=y > diff --git a/arch/arc/configs/axs103_smp_defconfig > b/arch/arc/configs/axs103_smp_defconfig > index f613ecac14a7..ea2f6d817d1a 100644 > --- a/arch/arc/configs/axs103_smp_defconfig > +++ b/arch/arc/configs/axs103_smp_defconfig > @@ -45,7 +45,6 @@ CONFIG_IP_PNP_RARP=y > CONFIG_DEVTMPFS=y > # CONFIG_STANDALONE is not set > # 
CONFIG_PREVENT_FIRMWARE_BUILD is not set > -# CONFIG_FIRMWARE_IN_KERNEL is not set > CONFIG_BLK_
[PATCH] firmware: cleanup FIRMWARE_IN_KERNEL message
The help for FIRMWARE_IN_KERNEL still references the firmware_install command that was recently removed by commit 5620a0d1aacd ("firmware: delete in-kernel firmware"). Clean up the message to direct the user to their distribution's linux-firmware package, and remove any reference to firmware being included in the kernel source tree. Cc: Greg K-H Cc: Masahiro Yamada Cc: David Woodhouse Signed-off-by: Robin H. Johnson --- drivers/base/Kconfig | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 2f6614c9a229..bdc87907d6a1 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -91,22 +91,23 @@ config FIRMWARE_IN_KERNEL depends on FW_LOADER default y help - The kernel source tree includes a number of firmware 'blobs' - that are used by various drivers. The recommended way to - use these is to run "make firmware_install", which, after - converting ihex files to binary, copies all of the needed - binary files in firmware/ to /lib/firmware/ on your system so - that they can be loaded by userspace helpers on request. + Various drivers in the kernel source tree may require firmware, + which is generally available in your distribution's linux-firmware + package. + + The linux-firmware package should install firmware into + /lib/firmware/ on your system, so they can be loaded by userspace + helpers on request. Enabling this option will build each required firmware blob - into the kernel directly, where request_firmware() will find - them without having to call out to userspace. This may be - useful if your root file system requires a device that uses - such firmware and do not wish to use an initrd. + specified by EXTRA_FIRMWARE into the kernel directly, where + request_firmware() will find them without having to call out to + userspace. This may be useful if your root file system requires a + device that uses such firmware and you do not wish to use an + initrd. 
This single option controls the inclusion of firmware for - every driver that uses request_firmware() and ships its - firmware in the kernel source tree, which avoids a + every driver that uses request_firmware(), which avoids a proliferation of 'Include firmware for xxx device' options. Say 'N' and let firmware be loaded from userspace. -- 2.14.1
PROBLEM: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy
(Replies CC to list and direct to me please)

Summary:
dmesg spammed with alloc_contig_range: [XX, YY) PFNs busy

Description:
I recently upgraded to 4.9-rc5 (previous kernel 4.5.0-rc6-00141-g6794402), and since then my dmesg has been absolutely flooded with 'PFNs busy' (>3GiB/day). My config did not change (all new options =n).

The addresses are not consistent, so the squelch of identical printk lines hasn't helped.
Eg output:
[187487.621916] alloc_contig_range: [83f0a9, 83f0aa) PFNs busy
[187487.621924] alloc_contig_range: [83f0ce, 83f0cf) PFNs busy
[187487.621976] alloc_contig_range: [83f125, 83f126) PFNs busy
[187487.622013] alloc_contig_range: [83f127, 83f128) PFNs busy

Keywords:
mm, alloc_contig_range, CMA

Most recent kernel version which did not have the bug:
Known 4.5.0-rc6-00141-g6794402

ver_linux:
Linux bohr-int 4.9.0-rc5-00177-g81bcfe5 #12 SMP Wed Nov 16 13:16:32 PST 2016 x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel GNU/Linux

GNU C                   5.3.0
GNU Make                4.2.1
Binutils                2.25.1
Util-linux              2.29
Mount                   2.29
Quota-tools             4.03
Linux C Library         2.23
Dynamic linker (ldd)    2.23
readlink: missing operand
Try 'readlink --help' for more information.
Procps 3.3.12 Net-tools 1.60 Kbd 2.0.3 Console-tools 2.0.3 Sh-utils8.25 Udev230 Modules Loaded 3w_sas 3w_ ablk_helper aesni_intel aes_x86_64 af_packet ahci aic79xx amdgpu async_memcpy async_pq async_raid6_recov async_tx async_xor ata_piix auth_rpcgss binfmt_misc bluetooth bnep bnx2 bonding btbcm btintel btrfs btrtl btusb button cdrom cn configs coretemp crc32c_intel crc32_pclmul crc_ccitt crc_itu_t crct10dif_pclmul cryptd dca dm_bio_prison dm_bufio dm_cache dm_cache_smq dm_crypt dm_delay dm_flakey dm_log dm_log_userspace dm_mirror dm_mod dm_multipath dm_persistent_data dm_queue_length dm_raid dm_region_hash dm_round_robin dm_service_time dm_snapshot dm_thin_pool dm_zero drm drm_kms_helper dummy e1000 e1000e evdev ext2 fat fb_sys_fops firewire_core firewire_ohci fjes fscache fuse ghash_clmulni_intel glue_helper grace hangcheck_timer hid_a4tech hid_apple hid_belkin hid_cherry hid_chicony hid_cypress hid_ezkey hid_generic hid_gyration hid_logitech hid_logitech_dj hid_microsoft hid_monterey hid_petalynx hid_pl hid_samsung hid_sony hid_sunplus hwmon_vid i2c_algo_bit i2c_i801 i2c_smbus igb input_leds intel_rapl ip6_udp_tunnel ipv6 irqbypass iscsi_tcp iTCO_vendor_support iTCO_wdt ixgb ixgbe jfs kvm kvm_intel libahci libata libcrc32c libiscsi libiscsi_tcp linear lockd lpc_ich lpfc lrw macvlan mdio md_mod megaraid_mbox megaraid_mm megaraid_sas mii mptbase mptfc mptsas mptscsih mptspi multipath nfs nfs_acl nfsd nls_cp437 nls_iso8859_1 nvram ohci_hcd pata_jmicron pata_marvell pata_platform pcspkr psmouse qla1280 qla2xxx r8169 radeon raid0 raid10 raid1 raid456 raid6_pq reiserfs rfkill sata_mv sata_sil24 scsi_transport_fc scsi_transport_iscsi scsi_transport_sas scsi_transport_spi sd_mod sg sky2 snd snd_hda_codec snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_core snd_hda_intel snd_hwdep snd_pcm snd_timer soundcore sr_mod sunrpc syscopyarea sysfillrect sysimgblt tg3 ttm uas udp_tunnel usb_storage vfat virtio virtio_net virtio_ring vxlan w83627ehf 
x86_pkg_temp_thermal xfs xhci_hcd xhci_pci xor zlib_deflate -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136 signature.asc Description: Digital signature
PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy
I didn't get any responses to this. git bisect shows that the problem did actually exist in 4.5.0-rc6, but has gotten worse by many orders of magnitude (< 1/week to ~20M/hour). Presently with 4.9-rc5, it's now writing ~2.5GB/hour to syslog. The list of addresses in that time is only ~80 unique ranges, each appearing ~320K times. They don't appear exactly in order, so the kernel does not squelch the log message for appearing too frequently. Could somebody at least make a suggestion on how to trace the printed range to somewhere in the kernel? On Sat, Nov 19, 2016 at 03:25:32AM +0000, Robin H. Johnson wrote: > (Replies CC to list and direct to me please) > > Summary: > > dmesg spammed with alloc_contig_range: [XX, YY) PFNs busy > > Description: > > I recently upgrading 4.9-rc5, (previous kernel 4.5.0-rc6-00141-g6794402), > and since then my dmesg has been absolutely flooded with 'PFNs busy' > (>3GiB/day). My config did not change (all new options =n). > > It's not consistent addresses, so the squelch of identical printk lines > hasn't helped. > Eg output: > [187487.621916] alloc_contig_range: [83f0a9, 83f0aa) PFNs busy > [187487.621924] alloc_contig_range: [83f0ce, 83f0cf) PFNs busy > [187487.621976] alloc_contig_range: [83f125, 83f126) PFNs busy > [187487.622013] alloc_contig_range: [83f127, 83f128) PFNs busy > > Keywords: > - > mm, alloc_contig_range, CMA > > Most recent kernel version which did not have the bug: > -- > Known 4.5.0-rc6-00141-g6794402 > > ver_linux: > -- > Linux bohr-int 4.9.0-rc5-00177-g81bcfe5 #12 SMP Wed Nov 16 13:16:32 PST > 2016 x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel > GNU/Linux > > GNU C 5.3.0 > GNU Make 4.2.1 > Binutils 2.25.1 > Util-linux2.29 > Mount 2.29 > Quota-tools 4.03 > Linux C Library 2.23 > Dynamic linker (ldd) 2.23 > readlink: missing operand > Try 'readlink --help' for more information. 
> Procps3.3.12 > Net-tools 1.60 > Kbd 2.0.3 > Console-tools 2.0.3 > Sh-utils 8.25 > Udev 230 > Modules Loaded3w_sas 3w_ ablk_helper aesni_intel > aes_x86_64 af_packet ahci aic79xx amdgpu async_memcpy async_pq > async_raid6_recov async_tx async_xor ata_piix auth_rpcgss binfmt_misc > bluetooth bnep bnx2 bonding btbcm btintel btrfs btrtl btusb button cdrom > cn configs coretemp crc32c_intel crc32_pclmul crc_ccitt crc_itu_t > crct10dif_pclmul cryptd dca dm_bio_prison dm_bufio dm_cache dm_cache_smq > dm_crypt dm_delay dm_flakey dm_log dm_log_userspace dm_mirror dm_mod > dm_multipath dm_persistent_data dm_queue_length dm_raid dm_region_hash > dm_round_robin dm_service_time dm_snapshot dm_thin_pool dm_zero drm > drm_kms_helper dummy e1000 e1000e evdev ext2 fat fb_sys_fops > firewire_core firewire_ohci fjes fscache fuse ghash_clmulni_intel > glue_helper grace hangcheck_timer hid_a4tech hid_apple hid_belkin > hid_cherry hid_chicony hid_cypress hid_ezkey hid_generic hid_gyration > hid_logitech hid_logitech_dj hid_microsoft hid_monterey hid_petalynx > hid_pl hid_samsung hid_sony hid_sunplus hwmon_vid i2c_algo_bit i2c_i801 > i2c_smbus igb input_leds intel_rapl ip6_udp_tunnel ipv6 irqbypass > iscsi_tcp iTCO_vendor_support iTCO_wdt ixgb ixgbe jfs kvm kvm_intel > libahci libata libcrc32c libiscsi libiscsi_tcp linear lockd lpc_ich lpfc > lrw macvlan mdio md_mod megaraid_mbox megaraid_mm megaraid_sas mii > mptbase mptfc mptsas mptscsih mptspi multipath nfs nfs_acl nfsd > nls_cp437 nls_iso8859_1 nvram ohci_hcd pata_jmicron pata_marvell > pata_platform pcspkr psmouse qla1280 qla2xxx r8169 radeon raid0 raid10 > raid1 raid456 raid6_pq reiserfs rfkill sata_mv sata_sil24 > scsi_transport_fc scsi_transport_iscsi scsi_transport_sas > scsi_transport_spi sd_mod sg sky2 snd snd_hda_codec > snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec_realtek > snd_hda_core snd_hda_intel snd_hwdep snd_pcm snd_timer soundcore sr_mod > sunrpc syscopyarea sysfillrect sysimgblt tg3 ttm uas udp_tunnel > 
usb_storage vfat virtio virtio_net virtio_ring vxlan w83627ehf > x86_pkg_temp_thermal xfs xhci_hcd xhci_pci xor zlib_deflate -- Robin Hugh Johnson E-Mail : robb...@orbis-terrarum.net Home Page : http://www.orbis-terrarum.net/?l=people.robbat2 ICQ# : 30269588 or 41961639 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 signature.asc Description: Digital signature
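To check figures like "about 80 unique ranges, each appearing ~320K times", the log can be tallied per range. The following is a minimal Python sketch, assuming log lines in the format quoted above; the function name and the regular expression are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches e.g. "alloc_contig_range: [83f0a9, 83f0aa) PFNs busy"
PFN_BUSY = re.compile(
    r"alloc_contig_range: \[([0-9a-f]+), ([0-9a-f]+)\) PFNs busy"
)


def count_busy_ranges(lines):
    """Count how often each [start, end) PFN range is reported busy."""
    counts = Counter()
    for line in lines:
        match = PFN_BUSY.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            counts[(start, end)] += 1
    return counts
```

Feeding it an open log file (any iterable of lines works) and calling most_common() on the result lists the ranges that are reported most often, and len() of the result gives the number of unique ranges.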
Re: PROBLEM-PERSISTS: dmesg spam: alloc_contig_range: [XX, YY) PFNs busy
(I'm going to respond directly to this email with the stack trace.)

On Wed, Nov 30, 2016 at 02:28:49PM +0100, Michal Hocko wrote:
> > On the other hand, if this didn’t happen and now happens all the time,
> > this indicates a regression in CMA’s capability to allocate pages so
> > just rate limiting the output would hide the potential actual issue.
>
> Or there might be just a much larger demand on those large blocks, no?
> But seriously, dumping those message again and again into the low (see
> the 2.5_GB_/h to the log is just insane. So there really should be some
> throttling.
>
> Does the following help you Robin. At least to not get swamped by those
> message.

Here's what I whipped up based on that, to ensure that dump_stack() gets
rate-limited in the same pass as the PFNs-busy message.

It dropped the dmesg spew to ~25MB/hour (and is suppressing ~43
entries/second right now).

commit 6ad4037e18ec2199f8755274d8a745a9904241a1
Author: Robin H. Johnson
Date:   Wed Nov 30 10:32:57 2016 -0800

    mm: ratelimit & trace PFNs busy.

    Signed-off-by: Robin H. Johnson

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..3c28ec3d18f8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7289,8 +7289,15 @@ int alloc_contig_range(unsigned long start, unsigned long end,

 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
-		pr_info("%s: [%lx, %lx) PFNs busy\n",
-			__func__, outer_start, end);
+		static DEFINE_RATELIMIT_STATE(ratelimit_pfn_busy,
+					DEFAULT_RATELIMIT_INTERVAL,
+					DEFAULT_RATELIMIT_BURST);
+		if (__ratelimit(&ratelimit_pfn_busy)) {
+			pr_info("%s: [%lx, %lx) PFNs busy\n",
+				__func__, outer_start, end);
+			dump_stack();
+		}
+
 		ret = -EBUSY;
 		goto done;
 	}

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
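For readers who want to see the interval/burst idea behind the patch outside the kernel, here is a minimal userspace sketch. It is not the kernel's __ratelimit() implementation: the struct, the function names, and the tick-based clock are all assumptions made for illustration. It only shows the message and the stack dump being gated by one shared decision, so both are allowed or suppressed together.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a rate-limit state (not the kernel struct). */
struct ratelimit {
	long interval;  /* window length, in caller-defined ticks */
	int burst;      /* messages allowed per window */
	long begin;     /* tick at which the current window started */
	int printed;    /* messages allowed so far in this window */
	int missed;     /* messages suppressed so far in this window */
	bool started;   /* false until the first call */
};

/* Returns true if the caller may emit its message at time `now`. */
static bool ratelimit_ok(struct ratelimit *rs, long now)
{
	if (!rs->started || now - rs->begin >= rs->interval) {
		/* A new window begins: reset the counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/* Emits the PFNs-busy line and a stand-in for dump_stack() under one check. */
static bool report_pfn_busy(struct ratelimit *rs, long now,
			    unsigned long start, unsigned long end)
{
	if (!ratelimit_ok(rs, now))
		return false;
	printf("alloc_contig_range: [%lx, %lx) PFNs busy\n", start, end);
	printf("(a stack trace would be dumped here)\n");
	return true;
}
```

With an interval of 5 ticks and a burst of 2, the first two reports in a window are printed and later ones are only counted until the window rolls over.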
drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
Somewhere in the Radeon/DRM codebase, CMA page allocation has either
regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is
doing something different with pages.

Given that I haven't seen ANY other reports of this, I'm inclined to
believe the problem is drm/radeon specific (if I don't start X, I can't
reproduce the problem).

The rate of the problem starts slow, and also is relatively low on an idle
system (my screens blank at night, no xscreensaver running), but it still ramps
up over time (to the point of generating 2.5GB/hour of "(timestamp)
alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100
unique ranges for a day).

My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9
virtual desktops per monitor).

I added a stack trace & rate limit to alloc_contig_range's PFNs busy message
(patch in previous email on LKML/-MM lists); and they point to radeon.

alloc_contig_range: [83f2a3, 83f2a4) PFNs busy
CPU: 3 PID: 8518 Comm: X Not tainted 4.9.0-rc7-00024-g6ad4037e18ec #27
Hardware name: System manufacturer System Product Name/P8Z68 DELUXE, BIOS 0501 05/09/2011
 ad50c3d7f730 b236c873 0083f2a3 0083f2a4
 ad50c3d7f810 b2183b38 999dff4d8040 20fca8c0
 0083f400 0083f000 0083f2a3 0004
Call Trace:
 [] dump_stack+0x85/0xc2
 [] alloc_contig_range+0x368/0x370
 [] cma_alloc+0x127/0x2e0
 [] dma_alloc_from_contiguous+0x38/0x40
 [] dma_generic_alloc_coherent+0x91/0x1d0
 [] x86_swiotlb_alloc_coherent+0x25/0x50
 [] ttm_dma_populate+0x48a/0x9a0 [ttm]
 [] ? __kmalloc+0x1b6/0x250
 [] radeon_ttm_tt_populate+0x22a/0x2d0 [radeon]
 [] ? ttm_dma_tt_init+0x67/0xc0 [ttm]
 [] ttm_tt_bind+0x37/0x70 [ttm]
 [] ttm_bo_handle_move_mem+0x528/0x5a0 [ttm]
 [] ? shmem_alloc_inode+0x1a/0x30
 [] ttm_bo_validate+0x114/0x130 [ttm]
 [] ? _raw_write_unlock+0xe/0x10
 [] ttm_bo_init+0x31d/0x3f0 [ttm]
 [] radeon_bo_create+0x19b/0x260 [radeon]
 [] ? radeon_update_memory_usage.isra.0+0x50/0x50 [radeon]
 [] radeon_gem_object_create+0xad/0x180 [radeon]
 [] radeon_gem_create_ioctl+0x5f/0xf0 [radeon]
 [] drm_ioctl+0x21b/0x4d0 [drm]
 [] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
 [] radeon_drm_ioctl+0x4c/0x80 [radeon]
 [] do_vfs_ioctl+0x92/0x5c0
 [] SyS_ioctl+0x79/0x90
 [] do_syscall_64+0x73/0x190
 [] entry_SYSCALL64_slow_path+0x25/0x25

The Radeon card in my case is a VisionTek HD 7750 Eyefinity 6, which is
reported as:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] (prog-if 00 [VGA controller])
	Subsystem: VISIONTEK Cape Verde PRO [Radeon HD 7750/8740 / R7 250E]
	Flags: bus master, fast devsel, latency 0, IRQ 58
	Memory at c000 (64-bit, prefetchable) [size=256M]
	Memory at fbe0 (64-bit, non-prefetchable) [size=256K]
	I/O ports at e000 [size=256]
	Expansion ROM at 000c [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
	Capabilities: [150] Advanced Error Reporting
	Kernel driver in use: radeon
	Kernel modules: radeon, amdgpu

-- 
Robin Hugh Johnson
E-Mail     : robb...@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ#       : 30269588 or 41961639
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Wed, Nov 30, 2016 at 10:24:59PM +0100, Vlastimil Babka wrote:
> [add more CC's]
>
> On 11/30/2016 09:19 PM, Robin H. Johnson wrote:
> > Somewhere in the Radeon/DRM codebase, CMA page allocation has either
> > regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is
> > doing something different with pages.
>
> Could be that it didn't use dma_generic_alloc_coherent() before, or you didn't
> have the generic CMA pool configured.

v4.9-rc7-23-gded6e842cf49:
[0.00] cma: Reserved 16 MiB at 0x00083e40
[0.00] Memory: 32883108K/33519432K available (6752K kernel code, 1244K rwdata, 4716K rodata, 1772K init, 2720K bss, 619940K reserved, 16384K cma-reserved)

> What's the output of "grep CMA" on your
> .config?

# grep CMA .config |grep -v -e SECMARK= -e CONFIG_BCMA -e CONFIG_USB_HCD_BCMA -e INPUT_CMA3000 -e CRYPTO_CMAC
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
CONFIG_CMA_AREAS=7
CONFIG_DMA_CMA=y
CONFIG_CMA_SIZE_MBYTES=16
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8

> Or any kernel boot options with cma in name?

None.

> By default config this should not be used on x86.

What do you mean by that statement?
It should be disallowed to enable CONFIG_CMA? Radeon and CMA should be
mutually exclusive?

> > Given that I haven't seen ANY other reports of this, I'm inclined to
> > believe the problem is drm/radeon specific (if I don't start X, I can't
> > reproduce the problem).
>
> It's rather CMA specific, the allocation attemps just can't be 100% reliable due
> to how CMA works. The question is if it should be spewing in the log in the
> context of dma-cma, which has a fallback allocation option. It even uses
> __GFP_NOWARN, perhaps the CMA path should respect that?

Yes, I'd say if there's a fallback without much penalty, nowarn makes
sense. If the fallback just tries multiple addresses until success, then
the warning should only be issued when too many attempts have been made.

> > The rate of the problem starts slow, and also is relatively low on an idle
> > system (my screens blank at night, no xscreensaver running), but it still ramps
> > up over time (to the point of generating 2.5GB/hour of "(timestamp)
> > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100
> > unique ranges for a day).
> >
> > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9
> > virtual desktops per monitor).
> So IIUC, except the messages, everything actually works fine?

There's high kernel CPU usage that seems to roughly correlate with the
messages, but I can't yet tell if that's due to the syslog itself, or
repeated alloc_contig_range requests.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
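The "warn only after too many attempts" suggestion above can be sketched in plain C. This is a hypothetical userspace illustration, not CMA or dma-cma code: alloc_with_retry(), struct alloc_result, the warn_after threshold, and the demo predicate are all invented for the example. The caller is assumed to have a fallback allocator to use when every candidate range is busy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback: returns true if `range` could be claimed. */
typedef bool (*try_range_fn)(unsigned long range);

struct alloc_result {
	bool ok;              /* true if some candidate was claimed */
	unsigned long range;  /* valid only when ok is true */
	size_t attempts;      /* number of candidates tried */
	bool warned;          /* whether the warning was emitted */
};

/* Demo predicate (assumption): ranges at or above 0x83f000 are free. */
static bool demo_range_is_free(unsigned long range)
{
	return range >= 0x83f000UL;
}

/*
 * Tries each candidate in order. Busy ranges are silent until `warn_after`
 * failures have accumulated, at which point one warning is printed.
 */
static struct alloc_result alloc_with_retry(const unsigned long *candidates,
					    size_t n, size_t warn_after,
					    try_range_fn try_range)
{
	struct alloc_result res = { false, 0, 0, false };

	for (size_t i = 0; i < n; i++) {
		res.attempts++;
		if (try_range(candidates[i])) {
			res.ok = true;
			res.range = candidates[i];
			return res;
		}
		if (res.attempts == warn_after) {
			fprintf(stderr, "alloc: %zu ranges busy, still retrying\n",
				res.attempts);
			res.warned = true;
		}
	}
	return res; /* not ok: the caller would use its fallback allocator */
}
```

The design choice here is that a single busy range is treated as normal and stays silent, while a run of failures is still visible in the log once it crosses the threshold.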
Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
On Thu, Dec 01, 2016 at 08:38:15AM +0100, Vlastimil Babka wrote:
> >> By default config this should not be used on x86.
> > What do you mean by that statement?
>
> I mean that the 16 mbytes for generic CMA area is not a default on x86:
>
> config CMA_SIZE_MBYTES
>         int "Size in Mega Bytes"
>         depends on !CMA_SIZE_SEL_PERCENTAGE
>         default 0 if X86
>         default 16

d7be003a9d275299f5ee36bbdf156654f59e08e9 (v3.18-2122-gd7be003a9d27) is
where the 0MB if-x86 default was added to the tree. Prior to that, it was
16MiB, and that's where my system picked up the value from.

I have a record of all my kconfigs, because I use oldconfig each time
(going back 8 years to 2.6.27).

# Added in 3.12.0-1-g5f258d0
CONFIG_CMA=y
# Added in 3.16.0-rc6-00042-g67dd8f3
CONFIG_CMA_ALIGNMENT=8
CONFIG_CMA_AREAS=7
CONFIG_CMA_SIZE_MBYTES=16
CONFIG_CMA_SIZE_SEL_MBYTES=y
CONFIG_DMA_CMA=y

So the next question is why I picked up CMA in
3.16.0-rc6-00042-g67dd8f3... I'll poke at that.

> > Yes, I'd say if there's a fallback without much penalty, nowarn makes
> > sense. If the fallback just tries multiple addresses until success, then
> > the warning should only be issued when too many attempts have been made.
> On the other hand, if the warnings are correlated with high kernel CPU usage,
> it's arguably better to be warned.

Keep the rate-limit on the warning for cases like this?

> >> > The rate of the problem starts slow, and also is relatively low on an idle
> >> > system (my screens blank at night, no xscreensaver running), but it still ramps
> >> > up over time (to the point of generating 2.5GB/hour of "(timestamp)
> >> > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100
> >> > unique ranges for a day).
> >> >
> >> > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9
> >> > virtual desktops per monitor).
> >> So IIUC, except the messages, everything actually works fine?
> > There's high kernel CPU usage that seems to roughly correlate with the
> > messages, but I can't yet tell if that's due to the syslog itself, or
> > repeated alloc_contig_range requests.
> You could try running perf top.

Will do in the morning.

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
[PATCH resend] PCI: QEMU top-level IDs for (sub)vendor & device
Introduce PCI_VENDOR/PCI_SUBVENDOR/PCI_SUBDEVICE defines to replace the
constants scattered in the kernel already used to detect QEMU.

They are defined in the QEMU codebase per docs/specs/pci-ids.txt.

Signed-off-by: Robin H. Johnson
Reviewed-by: Takashi Iwai
Reviewed-by: Gerd Hoffmann
---
This change prompted by a near-miss in the review of recent change:
'drm/i915: refine qemu south bridge detection'

This patch was previously sent to LKML 25 Jan 2016; and got some
reviews, but otherwise slipped through the cracks.
---
 drivers/gpu/drm/bochs/bochs_drv.c   | 4 ++--
 drivers/gpu/drm/cirrus/cirrus_drv.c | 5 +++--
 drivers/virtio/virtio_pci_common.c  | 2 +-
 include/linux/pci_ids.h             | 4 ++++
 sound/pci/intel8x0.c                | 4 ++--
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index 7f1a360..b332b4d3 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -182,8 +182,8 @@ static const struct pci_device_id bochs_pci_tbl[] = {
 	{
 		.vendor      = 0x1234,
 		.device      = 0x,
-		.subvendor   = 0x1af4,
-		.subdevice   = 0x1100,
+		.subvendor   = PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+		.subdevice   = PCI_SUBDEVICE_ID_QEMU,
 		.driver_data = BOCHS_QEMU_STDVGA,
 	},
 	{
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c b/drivers/gpu/drm/cirrus/cirrus_drv.c
index b1619e2..7bc394e 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -33,8 +33,9 @@ static struct drm_driver driver;

 /* only bind to the cirrus chip in qemu */
 static const struct pci_device_id pciidlist[] = {
-	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, 0x1af4, 0x1100, 0,
-	  0, 0 },
+	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446,
+	  PCI_SUBVENDOR_ID_REDHAT_QUMRANET, PCI_SUBDEVICE_ID_QEMU,
+	  0, 0, 0 },
 	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, PCI_VENDOR_ID_XEN,
 	  0x0001, 0, 0, 0 },
 	{0,}
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 36205c2..127dfe4 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -467,7 +467,7 @@ static const struct dev_pm_ops virtio_pci_pm_ops = {

 /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
 static const struct pci_device_id virtio_pci_id_table[] = {
-	{ PCI_DEVICE(0x1af4, PCI_ANY_ID) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
 	{ 0 }
 };

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 37f05cb..6d249d3 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2506,6 +2506,10 @@

 #define PCI_VENDOR_ID_AZWAVE		0x1a3b

+#define PCI_VENDOR_ID_REDHAT_QUMRANET    0x1af4
+#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_SUBDEVICE_ID_QEMU            0x1100
+
 #define PCI_VENDOR_ID_ASMEDIA		0x1b21

 #define PCI_VENDOR_ID_CIRCUITCO		0x1cc8
diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c
index 42bcbac..12c2c18 100644
--- a/sound/pci/intel8x0.c
+++ b/sound/pci/intel8x0.c
@@ -2980,8 +2980,8 @@ static int snd_intel8x0_inside_vm(struct pci_dev *pci)
 		goto fini;

 	/* check for known (emulated) devices */
-	if (pci->subsystem_vendor == 0x1af4 &&
-	    pci->subsystem_device == 0x1100) {
+	if (pci->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
+	    pci->subsystem_device == PCI_SUBDEVICE_ID_QEMU) {
 		/* KVM emulated sound, PCI SSID: 1af4:1100 */
 		msg = "enable KVM";
 	} else if (pci->subsystem_vendor == 0x1ab8) {
-- 
2.3.0
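As a standalone illustration of what the named constants buy, the QEMU check from the intel8x0 hunk can be written as a small userspace sketch. The two constant values are taken from the patch; struct pci_ids_demo and is_qemu_subsystem() are hypothetical names used only here and are not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as introduced by the patch to include/linux/pci_ids.h. */
#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
#define PCI_SUBDEVICE_ID_QEMU            0x1100

/* Hypothetical stand-in for the two struct pci_dev fields the check reads. */
struct pci_ids_demo {
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
};

/* True when the subsystem IDs match the pair QEMU uses (PCI SSID 1af4:1100). */
static bool is_qemu_subsystem(const struct pci_ids_demo *dev)
{
	return dev->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
	       dev->subsystem_device == PCI_SUBDEVICE_ID_QEMU;
}
```

A reviewer reading the named form can tell at a glance which vendor and device are being matched, which is the kind of near-miss the patch notes mention.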
[PATCH] PCI: QEMU top-level IDs for (sub)vendor & device
Introduce PCI_VENDOR/PCI_SUBVENDOR/PCI_SUBDEVICE defines to replace the
constants scattered in the kernel already used to detect QEMU.

They are defined in the QEMU codebase per docs/specs/pci-ids.txt.

Signed-off-by: Robin H. Johnson
---
This change prompted by a near-miss in the review of recent change:
'drm/i915: refine qemu south bridge detection'

Signed-off-by: Robin H. Johnson
---
 drivers/gpu/drm/bochs/bochs_drv.c   | 4 ++--
 drivers/gpu/drm/cirrus/cirrus_drv.c | 5 +++--
 drivers/virtio/virtio_pci_common.c  | 2 +-
 include/linux/pci_ids.h             | 4 ++++
 sound/pci/intel8x0.c                | 4 ++--
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index 7f1a360..b332b4d3 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -182,8 +182,8 @@ static const struct pci_device_id bochs_pci_tbl[] = {
 	{
 		.vendor      = 0x1234,
 		.device      = 0x,
-		.subvendor   = 0x1af4,
-		.subdevice   = 0x1100,
+		.subvendor   = PCI_SUBVENDOR_ID_REDHAT_QUMRANET,
+		.subdevice   = PCI_SUBDEVICE_ID_QEMU,
 		.driver_data = BOCHS_QEMU_STDVGA,
 	},
 	{
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c b/drivers/gpu/drm/cirrus/cirrus_drv.c
index b1619e2..7bc394e 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -33,8 +33,9 @@ static struct drm_driver driver;

 /* only bind to the cirrus chip in qemu */
 static const struct pci_device_id pciidlist[] = {
-	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, 0x1af4, 0x1100, 0,
-	  0, 0 },
+	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446,
+	  PCI_SUBVENDOR_ID_REDHAT_QUMRANET, PCI_SUBDEVICE_ID_QEMU,
+	  0, 0, 0 },
 	{ PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_5446, PCI_VENDOR_ID_XEN,
 	  0x0001, 0, 0, 0 },
 	{0,}
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 36205c2..127dfe4 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -467,7 +467,7 @@ static const struct dev_pm_ops virtio_pci_pm_ops = {

 /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
 static const struct pci_device_id virtio_pci_id_table[] = {
-	{ PCI_DEVICE(0x1af4, PCI_ANY_ID) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_ANY_ID) },
 	{ 0 }
 };

diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 37f05cb..6d249d3 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2506,6 +2506,10 @@

 #define PCI_VENDOR_ID_AZWAVE		0x1a3b

+#define PCI_VENDOR_ID_REDHAT_QUMRANET    0x1af4
+#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_SUBDEVICE_ID_QEMU            0x1100
+
 #define PCI_VENDOR_ID_ASMEDIA		0x1b21

 #define PCI_VENDOR_ID_CIRCUITCO		0x1cc8
diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c
index 42bcbac..12c2c18 100644
--- a/sound/pci/intel8x0.c
+++ b/sound/pci/intel8x0.c
@@ -2980,8 +2980,8 @@ static int snd_intel8x0_inside_vm(struct pci_dev *pci)
 		goto fini;

 	/* check for known (emulated) devices */
-	if (pci->subsystem_vendor == 0x1af4 &&
-	    pci->subsystem_device == 0x1100) {
+	if (pci->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
+	    pci->subsystem_device == PCI_SUBDEVICE_ID_QEMU) {
 		/* KVM emulated sound, PCI SSID: 1af4:1100 */
 		msg = "enable KVM";
 	} else if (pci->subsystem_vendor == 0x1ab8) {
-- 
2.3.0