[RFC PATCH v3 0/1] x86/sgx: Explicitly give up the CPU in EDMM's ioctl() to avoid softlockup
Hi folks,

This is the third version of the patch to fix the softlockup in the EDMM ioctl()s [1][2].

The EDMM ioctl()s (sgx_ioc_enclave_{ modify_types | restrict_permissions | remove_pages}) provided by the kernel support changing the attributes of an enclave's EPC pages in batches. If a userspace application asks the kernel to handle too many EPC pages in one call, the kernel may be stuck for a long time (with preemption disabled).

If we run an enclave equipped with a large EPC (30G or greater on my platform) on a Linux kernel with preemption disabled (by configuring "CONFIG_PREEMPT_NONE=y"), the following softlockup warning is reported in the "dmesg" log:

[ cut here ]
[ 901.101294] watchdog: BUG: soft lockup - CPU#92 stuck for 23s! [occlum-run:4289]
[ 901.109617] Modules linked in: veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter br_netfilter bridge stp llc overlay nls_iso8859_1 intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit binfmt_misc ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 pmt_telemetry sha1_ssse3 pmt_class joydev intel_sdsi input_leds aesni_intel crypto_simd cryptd dax_hmem cxl_acpi cmdlinepart rapl cxl_core ast spi_nor intel_cstate drm_shmem_helper einj mtd drm_kms_helper mei_me idxd isst_if_mmio isst_if_mbox_pci isst_if_common intel_vsec idxd_bus mei acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter mac_hid sch_fq_codel msr parport_pc ppdev lp parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore drm ip_tables x_tables
[ 901.109670] autofs4 mlx5_ib ib_uverbs ib_core hid_generic usbhid hid ses enclosure scsi_transport_sas mlx5_core pci_hyperv_intf mlxfw igb ahci psample i2c_algo_bit i2c_i801 spi_intel_pci xhci_pci tls
megaraid_sas dca spi_intel crc32_pclmul i2c_smbus i2c_ismt libahci xhci_pci_renesas wmi pinctrl_emmitsburg [ 901.109691] CPU: 92 PID: 4289 Comm: occlum-run Not tainted 6.9.0-rc5 #3 [ 901.109693] Hardware name: Inspur NF5468-M7-A0-R0-00/NF5468-M7-A0-R0-00, BIOS 05.02.01 05/08/2023 [ 901.109695] RIP: 0010:sgx_enclave_restrict_permissions+0xba/0x1f0 [ 901.109701] Code: 48 c1 e6 05 48 89 d1 48 8d 5c 24 40 b8 0e 00 00 00 48 2b 8e 70 8e 15 8b 48 c1 e9 05 48 c1 e1 0c 48 03 8e 68 8e 15 8b 0f 01 cf 00 00 00 40 0f 85 b2 00 00 00 85 c0 0f 85 db 00 00 00 4c 89 ef [ 901.109702] RSP: 0018:ad0ae5d0f8c0 EFLAGS: 0202 [ 901.109704] RAX: RBX: ad0ae5d0f900 RCX: ad11dfc0e000 [ 901.109705] RDX: ad2adcff81c0 RSI: RDI: 9a12f5f4f000 [ 901.109706] RBP: ad0ae5d0f9b0 R08: 0002 R09: 9a1289f57520 [ 901.109707] R10: 005d R11: 0002 R12: 0006d8ff2000 [ 901.109708] R13: 9a12f5f4f000 R14: ad0ae5d0fa18 R15: 9a12f5f4f020 [ 901.109709] FS: 7fb20ad1d740() GS:9a317fe0() knlGS: [ 901.109710] CS: 0010 DS: ES: CR0: 80050033 [ 901.109711] CR2: 7f8041811000 CR3: 000118530006 CR4: 00770ef0 [ 901.109712] DR0: DR1: DR2: [ 901.109713] DR3: DR6: fffe07f0 DR7: 0400 [ 901.109714] PKRU: 5554 [ 901.109714] Call Trace: [ 901.109716] [ 901.109718] ? show_regs+0x67/0x70 [ 901.109722] ? watchdog_timer_fn+0x1f3/0x280 [ 901.109725] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 901.109727] ? __hrtimer_run_queues+0xc8/0x220 [ 901.109731] ? hrtimer_interrupt+0x10c/0x250 [ 901.109733] ? __sysvec_apic_timer_interrupt+0x53/0x130 [ 901.109736] ? sysvec_apic_timer_interrupt+0x7b/0x90 [ 901.109739] [ 901.109740] [ 901.109740] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 901.109745] ? sgx_enclave_restrict_permissions+0xba/0x1f0 [ 901.109747] ? aa_file_perm+0x145/0x550 [ 901.109750] sgx_ioctl+0x1ab/0x900 [ 901.109751] ? xas_find+0x84/0x200 [ 901.109754] ? sgx_enclave_etrack+0xbb/0x140 [ 901.109756] ? sgx_encl_may_map+0x19a/0x240 [ 901.109758] ? common_file_perm+0x8a/0x1b0 [ 901.109760] ? obj_cgroup_charge_pages+0xa2/0x100 [ 901.109763] ? 
tlb_flush_mmu+0x31/0x1c0 [ 901.109766] ? tlb_finish_mmu+0x42/0x80 [ 901.109767] ? do_mprotect_pkey+0x150/0x530 [ 901.109769] ? __fget_light+0xc0/0x100 [ 901.109772] __x64_sys_ioctl+0x95/0xd0 [ 901.109775] x64_sys_call+0x1209/0x20c0 [ 901.109777] do_syscall_64+0x6d/0x110 [ 901.109779] ? syscall_exit_to_user_mode+0x86/0x1c0 [ 901.109782] ? do_syscall_64+0x79/0x110 [ 901.109783] ? syscall_exit_to_user_mode+0x86/0x1c0 [ 901.109784] ? do_syscall_64+0x79/0x110 [ 901.109785] ? free_unref_page+0x10e/0x180 [ 901.109788] ? __do_fault+0x36/0x130 [ 901.109791] ?
[PATCH v3 0/1] arm64: Implement stack trace termination record
From: "Madhavan T. Venkataraman" Reliable stacktracing requires that we identify when a stacktrace is terminated early. We can do this by ensuring all tasks have a final frame record at a known location on their task stack, and checking that this is the final frame record in the chain. All tasks have a pt_regs structure right after the task stack in the stack page. The pt_regs structure contains a stackframe field. Make this stackframe field the final frame in the task stack so all stack traces end at a fixed stack offset. For kernel tasks, this is simple to understand. For user tasks, there is some extra detail. User tasks get created via fork() et al. Once they return from fork, they enter the kernel only on an EL0 exception. In arm64, system calls are also EL0 exceptions. The EL0 exception handler uses the task pt_regs mentioned above to save register state and call different exception functions. All stack traces from EL0 exception code must end at the pt_regs. So, make pt_regs->stackframe the final frame in the EL0 exception stack. To summarize, task_pt_regs(task)->stackframe will always be the final frame in a stack trace. Sample stack traces === Showing just the last couple of frames in each stack trace to show how the stack trace ends. Primary CPU idle task = ... [0.077109] rest_init+0x108/0x144 [0.077188] arch_call_rest_init+0x18/0x24 [0.077220] start_kernel+0x3ac/0x3e4 [0.077293] __primary_switched+0xac/0xb0 Secondary CPU idle task === ... [0.077264] secondary_start_kernel+0x228/0x388 [0.077326] __secondary_switched+0x80/0x84 Sample kernel thread ... 
[ 24.543250] kernel_init+0xa4/0x164 [ 24.561850] ret_from_fork+0x10/0x18 Write system call (EL0 exception) = (using a test driver called callfd) [ 1160.628723] callfd_stack+0x3c/0x70 [ 1160.628768] callfd_op+0x35c/0x3a8 [ 1160.628791] callfd_write+0x5c/0xc8 [ 1160.628813] vfs_write+0x104/0x3b8 [ 1160.628837] ksys_write+0xd0/0x188 [ 1160.628859] __arm64_sys_write+0x4c/0x60 [ 1160.628883] el0_svc_common.constprop.0+0xa8/0x240 [ 1160.628904] do_el0_svc+0x40/0xa8 [ 1160.628921] el0_svc+0x2c/0x78 [ 1160.628942] el0_sync_handler+0xb0/0xb8 [ 1160.628962] el0_sync+0x17c/0x180 NULL pointer dereference exception (EL1 exception) == [ 1160.637984] callfd_stack+0x3c/0x70 [ 1160.638015] die_kernel_fault+0x80/0x108 [ 1160.638042] do_page_fault+0x520/0x600 [ 1160.638075] do_translation_fault+0xa8/0xdc [ 1160.638102] do_mem_abort+0x68/0x100 [ 1160.638120] el1_abort+0x40/0x60 [ 1160.638138] el1_sync_handler+0xac/0xc8 [ 1160.638157] el1_sync+0x74/0x100 [ 1160.638174] 0x0 <=== NULL pointer dereference [ 1160.638189] callfd_write+0x5c/0xc8 [ 1160.638211] vfs_write+0x104/0x3b8 [ 1160.638234] ksys_write+0xd0/0x188 [ 1160.638278] __arm64_sys_write+0x4c/0x60 [ 1160.638325] el0_svc_common.constprop.0+0xa8/0x240 [ 1160.638358] do_el0_svc+0x40/0xa8 [ 1160.638379] el0_svc+0x2c/0x78 [ 1160.638409] el0_sync_handler+0xb0/0xb8 [ 1160.638452] el0_sync+0x17c/0x180 Timer interrupt (EL1 exception) === Secondary CPU idle task interrupted by the timer interrupt: [ 1160.702949] callfd_callback: [ 1160.703006] callfd_stack+0x3c/0x70 [ 1160.703060] callfd_callback+0x30/0x40 [ 1160.703087] call_timer_fn+0x48/0x220 [ 1160.703113] run_timer_softirq+0x7cc/0xc70 [ 1160.703144] __do_softirq+0x1ec/0x608 [ 1160.703166] irq_exit+0x138/0x180 [ 1160.703193] __handle_domain_irq+0x8c/0xf0 [ 1160.703218] gic_handle_irq+0xec/0x410 [ 1160.703253] el1_irq+0xc0/0x180 [ 1160.703278] arch_local_irq_enable+0xc/0x28 [ 1160.703329] default_idle_call+0x54/0x1d8 [ 1160.703355] do_idle+0x2d8/0x350 [ 1160.703388] 
cpu_startup_entry+0x2c/0x98 [ 1160.703412] secondary_start_kernel+0x238/0x388 [ 1160.703446] __secondary_switched+0x80/0x84 --- Changelog: v3: - Added Reviewed-by: Mark Brown . - Fixed an extra space after a cast reported by checkpatch --strict. - Synced with mainline tip. v2: - Changed some wordings as suggested by Mark Rutland. - Removed the synthetic return PC for idle tasks. Changed the branches to start_kernel() and secondary_start_kernel() to calls so that they will have a proper return PC. v1: - Set up task_pt_regs(current)->stackframe as the final frame when a new task is initialized in copy_thread(). - Create pt_regs for the idle tasks and set up pt_regs->stackframe as the final frame for the idle tasks. - Set up task_pt_regs(current)->stackframe as the final frame in the EL0 exception handler so the EL0 exception stack trace ends there. - Terminate the stack trace successfully in unwind_frame() when
[PATCH v3 0/1] dwc2: Enable USB when booted in ACPI mode
The BCM2711 has a DesignWare USB controller that is commonly used on the CM4 and RPi400. There is a desire to use these machines with a standard UEFI+ACPI stack, as is being done with the normal RPi4. This patch enables this by adding ACPI module boilerplate to the existing dwc2 controller.

It should also be noted that there is an ACPI table update in the firmware which marks the ACPI _DMA() entries as ResourceProducers. That change is required for this to work with the 1G DMA translation present on the platform.

Changes:
v2->v3: Add this cover letter to describe the patch changes
v1->v2: Fix the kernel_ulong_t/set_parms() function typecasting warning by explicitly doing the type cast.

Jeremy Linton (1):
  usb: dwc2: Enable RPi in ACPI mode

 drivers/usb/dwc2/core.h     | 2 ++
 drivers/usb/dwc2/params.c   | 18 +-
 drivers/usb/dwc2/platform.c | 1 +
 3 files changed, 20 insertions(+), 1 deletion(-)

-- 
2.29.2
[PATCH v3 0/1] NVIDIA Tegra memory improvements
Hi,

Here is the last patch of the series, which had a minor problem in v2. The rest of the patches have already been applied by Krzysztof Kozlowski.

Changelog:

v3: - Added a new optional reg property for emc-tables nodes in order to fix a dt_binding_check warning. Please note that I will prepare a separate patch for v5.14 that will add the new property to the device-trees, since Thierry already sent out the PR for v5.13.

v2: - Fixed typos in the converted schemas.
    - Corrected the reg entry of the tegra20-mc-gart schema to use a fixed number of items.
    - Made power-domain use maxItems instead of a $ref phandle in the schemas.

Dmitry Osipenko (1):
  dt-bindings: memory: tegra20: emc: Convert to schema

 .../memory-controllers/nvidia,tegra20-emc.txt  | 130
 .../nvidia,tegra20-emc.yaml                    | 303 ++
 2 files changed, 303 insertions(+), 130 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-emc.txt
 create mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-emc.yaml

-- 
2.30.2
Re: [PATCH v3 0/1] drm/tiny: add support for Waveshare 2inch LCD module
On Tue, 30 Mar 2021 09:17:19 -0500 David Lechner wrote: > On 3/30/21 3:08 AM, Carlis wrote: > > From: Xuezhi Zhang > > > > This adds a new module for the ST7789V controller with parameters > > for the Waveshare 2inch LCD module. > > > > Signed-off-by: Xuezhi Zhang > > --- > > v2:change compatible value. > > v3:change author name. > > --- > > MAINTAINERS| 8 + > > drivers/gpu/drm/tiny/Kconfig | 14 ++ > > drivers/gpu/drm/tiny/Makefile | 1 + > > drivers/gpu/drm/tiny/st7789v.c | 269 > > + 4 files changed, 292 insertions(+) > > create mode 100644 drivers/gpu/drm/tiny/st7789v.c > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index d92f85ca831d..df25e8e0deb1 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -5769,6 +5769,14 @@ T: git > > git://anongit.freedesktop.org/drm/drm-misc F: > > Documentation/devicetree/bindings/display/sitronix,st7735r.yaml > > F: drivers/gpu/drm/tiny/st7735r.c > > +DRM DRIVER FOR SITRONIX ST7789V PANELS > > +M: David Lechner > OK, i will remove this in the next patch. > I should not be added here. I don't have one of these displays. > > > +M: Xuezhi Zhang > > +S: Maintained > > +T: git git://anongit.freedesktop.org/drm/drm-misc > > +F: > > Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml > > +F: drivers/gpu/drm/tiny/st7789v.c + > > DRM DRIVER FOR SONY ACX424AKP PANELS > > M:Linus Walleij > > S:Maintained thanks, Xuezhi Zhang
[PATCH v3 0/1] nvmem: Change to unified property interface
Change from using device tree (Open Firmware) APIs to the unified 'fwnode' interface.

Change of_nvmem_cell_get() to fwnode_nvmem_cell_get(), and add a wrapper for of_nvmem_cell_get().

Change of_nvmem_device_get() to fwnode_nvmem_device_get(). There are no known users of the OF interface here, so no wrapper is needed.

The first version of this patch incorrectly had a wrapper for of_nvmem_device_get(), even though the comments about the patch not needing one were correct.

The second version of this patch had an incorrect return type for of_nvmem_device_get().
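
To illustrate the wrapper idea described above, here is a minimal userspace sketch. It is not the patch itself: the struct layouts are reduced stand-ins, the function bodies are mock-ups, and errors are reported as NULL for brevity (an assumption made only for this example). Only the two function names come from the cover letter.

```c
#include <stddef.h>

/* Stand-in types: simplified mock-ups, NOT the kernel definitions. */
struct fwnode_handle { const char *name; };
struct device_node { struct fwnode_handle fwnode; };
struct nvmem_cell { const struct fwnode_handle *owner; const char *id; };

/* Single static cell so the mock lookup has something to hand back. */
static struct nvmem_cell demo_cell;

/* Generic lookup keyed on a firmware node (mock body for illustration). */
struct nvmem_cell *fwnode_nvmem_cell_get(struct fwnode_handle *fwnode,
                                         const char *id)
{
    if (!fwnode || !id)
        return NULL;
    demo_cell.owner = fwnode;
    demo_cell.id = id;
    return &demo_cell;
}

/*
 * Thin OF wrapper: existing device-tree callers keep passing a
 * device_node, and the wrapper forwards its embedded fwnode handle
 * to the generic function.
 */
static inline struct nvmem_cell *of_nvmem_cell_get(struct device_node *np,
                                                   const char *id)
{
    return fwnode_nvmem_cell_get(np ? &np->fwnode : NULL, id);
}
```

With this shape, OF-based callers do not have to change, while ACPI or other firmware descriptions can call the fwnode variant directly.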
Re: [PATCH v3 0/1] drm/tiny: add support for Waveshare 2inch LCD module
On 3/30/21 3:08 AM, Carlis wrote: From: Xuezhi Zhang This adds a new module for the ST7789V controller with parameters for the Waveshare 2inch LCD module. Signed-off-by: Xuezhi Zhang --- v2:change compatible value. v3:change author name. --- MAINTAINERS| 8 + drivers/gpu/drm/tiny/Kconfig | 14 ++ drivers/gpu/drm/tiny/Makefile | 1 + drivers/gpu/drm/tiny/st7789v.c | 269 + 4 files changed, 292 insertions(+) create mode 100644 drivers/gpu/drm/tiny/st7789v.c diff --git a/MAINTAINERS b/MAINTAINERS index d92f85ca831d..df25e8e0deb1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5769,6 +5769,14 @@ T: git git://anongit.freedesktop.org/drm/drm-misc F:Documentation/devicetree/bindings/display/sitronix,st7735r.yaml F:drivers/gpu/drm/tiny/st7735r.c +DRM DRIVER FOR SITRONIX ST7789V PANELS +M: David Lechner I should not be added here. I don't have one of these displays. +M: Xuezhi Zhang +S: Maintained +T: git git://anongit.freedesktop.org/drm/drm-misc +F: Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml +F: drivers/gpu/drm/tiny/st7789v.c + DRM DRIVER FOR SONY ACX424AKP PANELS M:Linus Walleij S:Maintained
[PATCH v3 0/1] drm/tiny: add support for Waveshare 2inch LCD module
From: Xuezhi Zhang This adds a new module for the ST7789V controller with parameters for the Waveshare 2inch LCD module. Signed-off-by: Xuezhi Zhang --- v2:change compatible value. v3:change author name. --- MAINTAINERS| 8 + drivers/gpu/drm/tiny/Kconfig | 14 ++ drivers/gpu/drm/tiny/Makefile | 1 + drivers/gpu/drm/tiny/st7789v.c | 269 + 4 files changed, 292 insertions(+) create mode 100644 drivers/gpu/drm/tiny/st7789v.c diff --git a/MAINTAINERS b/MAINTAINERS index d92f85ca831d..df25e8e0deb1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5769,6 +5769,14 @@ T: git git://anongit.freedesktop.org/drm/drm-misc F: Documentation/devicetree/bindings/display/sitronix,st7735r.yaml F: drivers/gpu/drm/tiny/st7735r.c +DRM DRIVER FOR SITRONIX ST7789V PANELS +M: David Lechner +M: Xuezhi Zhang +S: Maintained +T: git git://anongit.freedesktop.org/drm/drm-misc +F: Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml +F: drivers/gpu/drm/tiny/st7789v.c + DRM DRIVER FOR SONY ACX424AKP PANELS M: Linus Walleij S: Maintained diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig index 2b6414f0fa75..ac2c7fb702f0 100644 --- a/drivers/gpu/drm/tiny/Kconfig +++ b/drivers/gpu/drm/tiny/Kconfig @@ -131,3 +131,17 @@ config TINYDRM_ST7735R * Okaya RH128128T 1.44" 128x128 TFT If M is selected the module will be called st7735r. + +config TINYDRM_ST7789V + tristate "DRM support for Sitronix ST7789V display panels" + depends on DRM && SPI + select DRM_KMS_HELPER + select DRM_KMS_CMA_HELPER + select DRM_MIPI_DBI + select BACKLIGHT_CLASS_DEVICE + help + DRM driver for Sitronix ST7789V with one of the following + LCDs: + * Waveshare 2inch lcd module 240x320 TFT + + If M is selected the module will be called st7789v. 
diff --git a/drivers/gpu/drm/tiny/Makefile b/drivers/gpu/drm/tiny/Makefile index 6ae4e9e5a35f..aa0caa2b6c16 100644 --- a/drivers/gpu/drm/tiny/Makefile +++ b/drivers/gpu/drm/tiny/Makefile @@ -10,3 +10,4 @@ obj-$(CONFIG_TINYDRM_MI0283QT)+= mi0283qt.o obj-$(CONFIG_TINYDRM_REPAPER) += repaper.o obj-$(CONFIG_TINYDRM_ST7586) += st7586.o obj-$(CONFIG_TINYDRM_ST7735R) += st7735r.o +obj-$(CONFIG_TINYDRM_ST7789V) += st7789v.o diff --git a/drivers/gpu/drm/tiny/st7789v.c b/drivers/gpu/drm/tiny/st7789v.c new file mode 100644 index ..9b4bb9edba40 --- /dev/null +++ b/drivers/gpu/drm/tiny/st7789v.c @@ -0,0 +1,269 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * DRM driver for display panels connected to a Sitronix ST7789V + * display controller in SPI mode. + * + * Copyright 2017 David Lechner + * Copyright (C) 2019 Glider bvba + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#define ST7789V_PORCTRL 0xb2 +#define ST7789V_GCTRL 0xb7 +#define ST7789V_VCOMS 0xbb +#define ST7789V_LCMCTRL 0xc0 +#define ST7789V_VDVVRHEN0xc2 +#define ST7789V_VRHS0xc3 +#define ST7789V_VDVS0xc4 +#define ST7789V_FRCTRL2 0xc6 +#define ST7789V_PWCTRL1 0xd0 +#define ST7789V_PVGAMCTRL 0xe0 +#define ST7789V_NVGAMCTRL 0xe1 + +#define ST7789V_MY BIT(7) +#define ST7789V_MX BIT(6) +#define ST7789V_MV BIT(5) +#define ST7789V_RGBBIT(3) + +struct st7789v_cfg { + const struct drm_display_mode mode; + unsigned int left_offset; + unsigned int top_offset; + unsigned int write_only:1; + unsigned int rgb:1; /* RGB (vs. 
BGR) */ +}; + +struct st7789v_priv { + struct mipi_dbi_dev dbidev; /* Must be first for .release() */ + const struct st7789v_cfg *cfg; +}; + +static void st7789v_pipe_enable(struct drm_simple_display_pipe *pipe, + struct drm_crtc_state *crtc_state, + struct drm_plane_state *plane_state) +{ + struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(pipe->crtc.dev); + struct st7789v_priv *priv = container_of(dbidev, struct st7789v_priv, +dbidev); + struct mipi_dbi *dbi = >dbi; + int ret, idx; + u8 addr_mode; + + if (!drm_dev_enter(pipe->crtc.dev, )) + return; + + DRM_DEBUG_KMS("\n"); + + ret = mipi_dbi_poweron_reset(dbidev); + if (ret) + goto out_exit; + + msleep(150); + + mipi_dbi_command(dbi, MIPI_DCS_EXIT_SLEEP_MODE); + msleep(100); + + + switch (dbidev->rotation) { + default: + addr_mode = 0; + break; + case 90: + addr_mode = ST7789V_MY | ST7789V_MV; + break; + case 180: + addr_mode = ST7789V_MX | ST7789V_MY; +
Re: [PATCH v3 0/1] Allow drivers to modify dql.min_limit value
Hello: This patch was applied to netdev/net-next.git (refs/heads/master): On Sun, 21 Mar 2021 22:48:48 +0900 you wrote: > Abstract: would like to directly set dql.min_limit value inside a > driver to improve BQL performances of a CAN USB driver. > > CAN packets have a small PDU: for classical CAN maximum size is > roughly 16 bytes (8 for payload and 8 for arbitration, CRC and > others). > > [...] Here is the summary with links: - [v3,1/1] netdev: add netdev_queue_set_dql_min_limit() https://git.kernel.org/netdev/net-next/c/f57bac3c33e7 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH v3 0/1] correct the inside linear map range during hotplug check
On Tue, 16 Feb 2021 10:03:50 -0500, Pavel Tatashin wrote: > v3: - Sync with linux-next where arch_get_mappable_range() was > introduced. > v2: - Added test-by Tyler Hicks > - Addressed comments from Anshuman Khandual: moved check under > IS_ENABLED(CONFIG_RANDOMIZE_BASE), added > WARN_ON(start_linear_pa > end_linear_pa); > > [...] Applied to arm64 (for-next/fixes), thanks! [1/1] arm64: mm: correct the inside linear map range during hotplug check https://git.kernel.org/arm64/c/ee7febce0519 Cheers, -- Will https://fixes.arm64.dev https://next.arm64.dev https://will.arm64.dev
[PATCH v3 0/1] Allow drivers to modify dql.min_limit value
Abstract: I would like to set the dql.min_limit value directly inside a driver to improve the BQL performance of a CAN USB driver.

CAN packets have a small PDU: for classical CAN the maximum size is roughly 16 bytes (8 for payload and 8 for arbitration, CRC and others).

I am writing a CAN driver for a USB interface. To compensate for the extra latency introduced by USB, I want to group several CAN frames and do one USB bulk send. For this purpose, I implemented BQL in my driver.

However, the BQL algorithm can take time to adjust, especially if there are small bursts.

The best way I found is to directly modify dql.min_limit and set it to some empirical value. This way, even during small burst events I can have good throughput. Slightly increasing dql.min_limit has no measurable impact on latency as long as the frames fit in the same USB packet (i.e. the BQL overhead is negligible compared to the USB overhead).

BQL was not designed for USB, nor was it designed for CAN's small PDUs, which probably explains why I am the first one to ever have thought of using dql.min_limit within a driver.

The code I wrote looks like:

> #ifdef CONFIG_BQL
> netdev_get_tx_queue(netdev, 0)->dql.min_limit = ;
> #endif

Using #ifdef to set up some variables is not a best practice. I am sending this RFC to see if we can add a function to set dql.min_limit in a cleaner way.

For your reference, this RFC is a follow-up of a discussion on the linux-can mailing list:
https://lore.kernel.org/linux-can/20210309125708.ei75tr5vp2san...@pengutronix.de/

Thank you for your comments.

Yours sincerely,
Vincent

** Changelog **

RFC v2 -> v3
  - More verbose commit description.
  - Fix kernel documentation.

RFC v1 -> RFC v2
  - Fix incorrect #ifdef use.
    Reference: https://lore.kernel.org/linux-can/20210309153547.q7zspf46k6ter...@pengutronix.de/

Link to RFC v1: https://lore.kernel.org/linux-can/20210309152354.95309-1-mailhol.vinc...@wanadoo.fr/T/#t

Vincent Mailhol (1):
  netdev: add netdev_queue_set_dql_min_limit()

 include/linux/netdevice.h | 18 ++
 1 file changed, 18 insertions(+)

-- 
2.26.2
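
For readers who want to see the shape of such a helper, the following is a userspace mock-up, not the contents of the patch. The two structs are heavily reduced stand-ins for the kernel ones, and CONFIG_BQL is defined by hand for the demo; only the helper name netdev_queue_set_dql_min_limit() is taken from the diffstat above. The point is that the #ifdef lives once inside the helper rather than in every driver.

```c
/* Assumption for this demo only: pretend the kernel was built with BQL. */
#define CONFIG_BQL 1

/* Reduced stand-ins for the kernel structures (illustrative only). */
struct dql {
    unsigned int min_limit;   /* lower bound used by the BQL algorithm */
};

struct netdev_queue {
#ifdef CONFIG_BQL
    struct dql dql;
#endif
    int placeholder;          /* keeps the struct non-empty without BQL */
};

/*
 * Set the minimum dynamic queue limit of one TX queue.
 * Compiles to a no-op when BQL is not configured, so callers
 * need no #ifdef of their own.
 */
static inline void netdev_queue_set_dql_min_limit(struct netdev_queue *q,
                                                  unsigned int min_limit)
{
#ifdef CONFIG_BQL
    q->dql.min_limit = min_limit;
#else
    (void)q;
    (void)min_limit;
#endif
}
```

A driver would call the helper once per TX queue during setup. The value to pass is driver-specific and found empirically; the original message does not state one.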
[PATCH v3 0/1] dump kmessage before machine_kexec
Changelog
v3
  - Re-sending because it still has not landed in mainline.
  - Sync with mainline
  - Added Acked-by: Baoquan He
v2
  - Added review-by's
  - Sync with mainline

Allow studying the shutdown performance of kexec reboot calls by having the kmsg log saved via pstore.

Previous submissions
v1 https://lore.kernel.org/lkml/20200605194642.62278-1-pasha.tatas...@soleen.com
v2 https://lore.kernel.org/lkml/20210126204125.313820-1-pasha.tatas...@soleen.com

Pavel Tatashin (1):
  kexec: dump kmessage before machine_kexec

 kernel/kexec_core.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.25.1
[PATCH v3 0/1] GIC v4.1: Disable VSGI support for GIC CPUIF < v4.1
This patchset is v3 of a previous version [1]. v2 -> v3: - Coalesced all checks in one function (Marc's feedback) - Allow sgi_ops on cpuif mismatch (to keep v4.1 doorbell mechanism that works fine even if GIC CPUIF < v4.1) v1 -> v2: - Fixed vGIC behaviour according to v1 [1] review - Removed capability detection - rely on sanitised reg read - Added vsgi specific flag (for gic and kvm) [1] https://lore.kernel.org/linux-arm-kernel/20210302102744.12692-1-lorenzo.pieral...@arm.com -- Original cover letter -- GIC v4.1 introduced changes to the GIC CPU interface; systems that integrate CPUs that do not support GIC v4.1 features (as reported in the ID_AA64PFR0_EL1.GIC bitfield) and a GIC v4.1 controller must disable in software virtual SGIs support since the CPUIF and GIC controller version mismatch results in CONSTRAINED UNPREDICTABLE behaviour at architectural level. For systems with CPUs reporting ID_AA64PFR0_EL1.GIC == b0001 integrated in a system with a GIC v4.1 it _should_ still be safe to enable vLPIs (other than vSGI) since the protocol between the GIC redistributor and the GIC CPUIF was not changed from GIC v4.0 to GIC v4.1. Cc: Marc Zyngier Lorenzo Pieralisi (1): irqchip/gic-v4.1: Disable vSGI upon (GIC CPUIF < v4.1) detection arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 ++-- drivers/irqchip/irq-gic-v4.c | 27 +-- include/linux/irqchip/arm-gic-v4.h | 2 ++ 3 files changed, 29 insertions(+), 4 deletions(-) -- 2.29.1
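
The decision described in the cover letter can be sketched as a small standalone predicate. This is an illustration only, not the code of the patch: the helper names are hypothetical, and the field position (bits [27:24]) and the value 0b0011 for a v4.1 CPU interface are taken from my reading of the Arm architecture documentation rather than from this message, which only mentions the 0b0001 case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the GIC field in ID_AA64PFR0_EL1 (bits [27:24]). */
#define ID_AA64PFR0_GIC_SHIFT 24
#define ID_AA64PFR0_GIC_MASK  0xfULL

/* Assumed field value reported by a CPU interface that implements v4.1. */
#define GIC_CPUIF_V4_1        0x3u

/* Extract the GIC field from a (sanitised) ID_AA64PFR0_EL1 value. */
static inline unsigned int cpuif_gic_field(uint64_t pfr0)
{
    return (unsigned int)((pfr0 >> ID_AA64PFR0_GIC_SHIFT) &
                          ID_AA64PFR0_GIC_MASK);
}

/*
 * Hypothetical helper: vSGIs are only usable when both the GIC
 * controller and the CPU interface implement v4.1. A v4.1 controller
 * paired with an older CPUIF (field == 0b0001) must not use them.
 */
bool vsgi_usable(bool gic_is_v4_1, uint64_t pfr0)
{
    return gic_is_v4_1 && cpuif_gic_field(pfr0) >= GIC_CPUIF_V4_1;
}
```

Note that the predicate only gates vSGIs; per the cover letter, other vLPI functionality should remain usable on the mismatched configuration.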
[PATCH v3 0/1] add ACPI binding to RX6110 driver
Hi, it took some time, but now we got the official ACPI id for the RX6110 RTC driver from Seiko Epson. regards, Claudius Johannes Hahn (1): rtc: rx6110: add ACPI bindings to I2C drivers/rtc/rtc-rx6110.c | 12 1 file changed, 12 insertions(+) -- 2.30.1
[PATCH v3 0/1] Unprivileged chroot
Hi,

This new patch replaces the path_is_under() check with current_chrooted(), as is done with user namespaces. Indeed, it is much simpler to check the current root than to limit access to a subset of files.

The chroot system call is currently limited to processes with the CAP_SYS_CHROOT capability. This protects against malicious processes willing to trick SUID-like binaries. The following patch allows unprivileged users to safely use chroot(2), which may be complementary to the use of user namespaces.

This patch is a follow-up of a previous one sent by Andy Lutomirski some time ago:
https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.l...@amacapital.net/

This patch can be applied on top of v5.12-rc2. I would really appreciate constructive reviews.

Previous version:
https://lore.kernel.org/r/20210310181857.401675-1-...@digikod.net

Regards,

Mickaël Salaün (1):
  fs: Allow no_new_privs tasks to call chroot(2)

 fs/open.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

base-commit: a38fd8748464831584a19438cbb3082b5a2dab15
-- 
2.30.2
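
From the userspace side, the flow this patch is aiming for could look roughly like the sketch below. The helper names are hypothetical. prctl(PR_SET_NO_NEW_PRIVS) and chroot(2) are existing interfaces, but whether the chroot() call succeeds without CAP_SYS_CHROOT depends on a kernel carrying this patch; on other kernels it is expected to fail with EPERM.

```c
#define _DEFAULT_SOURCE   /* for the chroot() declaration in <unistd.h> */
#include <sys/prctl.h>
#include <unistd.h>

/* Set no_new_privs on the calling task. Returns 0 on success, -1 on error. */
int enable_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Query no_new_privs: returns 1 if set, 0 if not set, -1 on error. */
int has_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/*
 * Hypothetical helper: give up the ability to gain privileges, then
 * try to change root. Returns 0 on success, -1 on error (see errno).
 */
int unprivileged_chroot(const char *new_root)
{
    if (enable_no_new_privs() != 0)
        return -1;
    if (chroot(new_root) != 0)
        return -1;
    /* Move into the new root so the working directory is not left outside. */
    return chdir("/");
}
```

Setting no_new_privs is irreversible for the task and is inherited by its children, which is what makes it a reasonable precondition against tricking SUID-like binaries.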
[PATCH v3 0/1] Bluetooth: Suspend improvements
Hi Marcel (and linux bluetooth), Here are a few suspend improvements based on user reports we saw on ChromeOS and feedback from Hans de Goede on the mailing list. I have tested this using our ChromeOS suspend/resume automated tests (full SRHealth test coverage and some suspend resume stress tests). Thanks Abhishek Changes in v3: * Minor change to if statement Changes in v2: * Removed hci_dev_lock from hci_cc_set_event_filter since flags are set/cleared atomically Abhishek Pandit-Subedi (1): Bluetooth: Remove unneeded commands for suspend include/net/bluetooth/hci.h | 1 + net/bluetooth/hci_event.c | 27 +++ net/bluetooth/hci_request.c | 44 +++-- 3 files changed, 55 insertions(+), 17 deletions(-) -- 2.31.0.rc0.254.gbdcc3b1a9d-goog
[PATCH v3 0/1] s390/vfio-ap: fix circular lockdep when starting SE guest
Commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") introduced a change that results in a circular lockdep when a Secure Execution guest that is configured with crypto devices is started. The problem arose because the patch moved the setting of the guest's AP masks under the protection of the matrix_dev->lock when the vfio_ap driver is notified that the KVM pointer has been set. Since it is not critical that the setting/clearing of the guest's AP masks be done under the matrix_dev->lock when the driver is notified, the masks will not be updated under the matrix_dev->lock. The lock is necessary for the setting/unsetting of the KVM pointer, however, so that will remain in place.

The dependency chain for the circular lockdep resolved by this patch is (in reverse order):

2: vfio_ap_mdev_group_notifier: kvm->lock
                                matrix_dev->lock

1: handle_pqap:                 matrix_dev->lock
   kvm_vcpu_ioctl:              vcpu->mutex

0: kvm_s390_cpus_to_pv:         vcpu->mutex
   kvm_vm_ioctl:                kvm->lock

Please note:
---
* If checkpatch is run against this patch series, you may get a "WARNING: Unknown commit id 'f21916ec4826', maybe rebased or not pulled?" message. The commit 'f21916ec4826', however, is definitely in the master branch on top of which this patch series was built, so I'm not sure why this message is being output by checkpatch.

* All acks granted from the previous review of this patch have been removed because this patch introduces non-trivial changes (see change log below).

Change log v2=> v3:
--
* Added two fields - 'bool kvm_busy' and 'wait_queue_head_t wait_for_kvm' - to struct ap_matrix_mdev. The former indicates that the KVM pointer is in the process of being updated, and the latter allows a function that needs access to the KVM pointer to wait until it is no longer being updated. This resolves the problem of synchronization between the functions that change the KVM pointer value and the functions that require access to it.

Change log v1=> v2:
--
* No longer holding the matrix_dev->lock prior to setting/clearing the masks supplying the AP configuration to a KVM guest.

* Make all updates to the data in the matrix mdev that is used to manage AP resources used by the KVM guest in the vfio_ap_mdev_set_kvm() function instead of the group notifier callback.

* Check for the matrix mdev's KVM pointer in the vfio_ap_mdev_unset_kvm() function instead of the vfio_ap_mdev_release() function.

Tony Krowiak (1):
  s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

 drivers/s390/crypto/vfio_ap_ops.c     | 312 ++
 drivers/s390/crypto/vfio_ap_private.h | 2 +
 2 files changed, 218 insertions(+), 96 deletions(-)

-- 
2.21.3
[PATCH v3 0/1] iio: adc: ad7124: allow more than 8 channels
From: Alexandru Tachici

Currently the AD7124-8 driver cannot use more than 8 IIO channels because it assigns the channel configurations bijectively to the channels specified in the device-tree. This is not possible when using more than 8 channels, as the AD7124-8 has only 8 configuration registers.

All configurations are marked as live if they are programmed on the device. Any change that happens from userspace (sampling rate, filters etc.) will invalidate them.

To allow the user to use all channels at once, the driver will keep configurations for all channels in memory but will program only 8 of them at a time on the device. If multiple channels have the same configuration, only one configuration register will be used. If more configurations are needed than there are registers, only the last 8 used configurations will be allowed to exist on the device, in an LRU fashion (in the case of raw reads).

If a read is requested on a channel whose configuration is not programmed:
- check if there is a similar configuration already programmed
  - if yes: point the channel to that config
  - if no: check if there are empty config slots
    - if yes: write the config, push it into the queue of LRU configs
    - if no: pop one config, get its config slot number, write the new config to the old slot, push the new config into the queue of LRU configs.

Alexandru Tachici (1):
  iio: adc: ad7124: allow more than 8 channels

 drivers/iio/adc/ad7124.c | 461 ++-
 1 file changed, 308 insertions(+), 153 deletions(-)

-- 
2.20.1
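
The slot-selection steps above can be sketched in plain C as follows. This is a simplified model, not the driver code: the struct fields and function names are hypothetical, register writes are reduced to a struct assignment, and a per-slot "last use" counter stands in for the queue of LRU configs that the cover letter describes.

```c
#include <stdbool.h>

#define NUM_SLOTS 8   /* the AD7124-8 has 8 configuration registers */

/* Hypothetical per-channel configuration (real driver has more fields). */
struct cfg {
    unsigned int odr;      /* output data rate setting */
    unsigned int filter;   /* filter type */
};

struct slot {
    bool used;               /* a configuration is programmed here */
    struct cfg cfg;          /* what is programmed */
    unsigned long last_use;  /* counter value at the most recent use */
};

struct state {
    struct slot slots[NUM_SLOTS];
    unsigned long tick;      /* incremented on every read request */
};

static bool cfg_equal(const struct cfg *a, const struct cfg *b)
{
    return a->odr == b->odr && a->filter == b->filter;
}

/*
 * Return the slot index to use for a read with configuration `want`:
 *  1. reuse a slot that already holds an identical configuration;
 *  2. otherwise program the first empty slot;
 *  3. otherwise overwrite the least recently used slot.
 */
int slot_for_read(struct state *st, const struct cfg *want)
{
    int empty = -1;
    int lru = 0;

    st->tick++;
    for (int i = 0; i < NUM_SLOTS; i++) {
        struct slot *s = &st->slots[i];

        if (s->used && cfg_equal(&s->cfg, want)) {
            s->last_use = st->tick;   /* refresh: most recently used */
            return i;
        }
        if (!s->used && empty < 0)
            empty = i;
        if (s->last_use < st->slots[lru].last_use)
            lru = i;
    }

    /* `lru` is only consulted when every slot is in use. */
    int victim = (empty >= 0) ? empty : lru;

    st->slots[victim] = (struct slot){
        .used = true, .cfg = *want, .last_use = st->tick,
    };
    return victim;
}
```

Because lookup is by configuration value, two channels with identical settings naturally share one slot, which matches the "only one configuration register will be used" behaviour described above.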
[PATCH v3 0/1] Automatic LSM stack ordering
Hi,

This patch series gives users the opportunity to not manually configure the list of LSMs enabled at boot, but instead always rely on the up-to-date list of existing LSMs. Indeed, CONFIG_LSM may never be updated by a make oldconfig, whereas users may select new LSMs over time.

With this patch, when running make oldconfig, a new option CONFIG_LSM_AUTO is pre-selected to delegate LSM ordering to the kernel developers, according to the user configuration.

This third series replaces the previous virtual dependencies with a new option to automatically enable all selected LSMs. This is cleaner, simpler, and makes the transition more convenient.

This patch series can be applied on v5.11-7580-gea914b7ffbfd (or v5.11).

Previous version:
https://lore.kernel.org/r/20210215181511.2840674-1-...@digikod.net

Mickaël Salaün (1):
  security: Add CONFIG_LSM_AUTO to handle default LSM stack ordering

 security/Kconfig    | 19 +++
 security/security.c | 26 +-
 2 files changed, 44 insertions(+), 1 deletion(-)

base-commit: 31caf8b2a847214be856f843e251fc2ed2cd1075
-- 
2.30.0
Re: [PATCH v3 0/1] phy: fsl-imx8-mipi-dphy: Hook into runtime pm
Hi Guido, On Fri, 2021-02-19 at 10:38 +0100, Guido Günther wrote: > Hi, > On Wed, Dec 16, 2020 at 07:22:32PM +0100, Guido Günther wrote: > > This allows us to shut down the mipi power domain on the imx8. The > > alternative > > would be to drop the dphy from the mipi power domain in the SOCs device tree > > and only have the DSI host controller visible there but since the PD is > > mostly > > about the PHY that would defeat it's purpose. > > Is there anything I can do to move that forward. I assume this needs to > go via the phy/ subsystem not drm? I cannot find patch 1/1 of v3 in my mailbox, so I'll provide comment on v2. Regards, Liu Ying > Cheers, > -- Guido > > > This is basically a resend from February 2020 which went without feedback. > > > > This allows to shut off the power domain hen blanking the LCD panel: > > > > pm_genpd_summary before: > > > > domain status slaves > > /device runtime status > > -- > > mipion > > /devices/platform/soc@0/soc@0:bus@3080/30a00300.dphy unsupported > > /devices/platform/soc@0/soc@0:bus@3080/30a0.mipi_dsi suspended > > > > after: > > > > mipioff-0 > > /devices/platform/soc@0/soc@0:bus@3080/30a00300.dphy suspended > > /devices/platform/soc@0/soc@0:bus@3080/30a0.mipi_dsi suspended > > > > Changes from v1: > > - Tweak commit message slightly > > > > Changes from v2: > > - As pre review comment by Lucas Stach > > > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flinux-arm-kernel%2Fee22b072e0abe07559a3e6a63ccf6ece064a46cb.camel%40pengutronix.de%2Fdata=04%7C01%7Cvictor.liu%40nxp.com%7Ccac0b14c892c4a35340508d8d4ba2e16%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637493243396909710%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PU5kegolJwKK%2BQ7nD7V9qjrKJ2fJ9eKoySoFihnFoD8%3Dreserved=0 > > Check for pm_runtime_get_sync failure > > > > Guido Günther (1): > > phy: fsl-imx8-mipi-dphy: Hook into runtime pm > > > > 
.../phy/freescale/phy-fsl-imx8-mipi-dphy.c| 25 ++- > > 1 file changed, 24 insertions(+), 1 deletion(-) > > > > -- > > 2.29.2 > > > > > > ___ > > linux-arm-kernel mailing list > > linux-arm-ker...@lists.infradead.org > > https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.infradead.org%2Fmailman%2Flistinfo%2Flinux-arm-kerneldata=04%7C01%7Cvictor.liu%40nxp.com%7Ccac0b14c892c4a35340508d8d4ba2e16%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C637493243396909710%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kkC3Go0wvHemjxaKVHwU%2F6gWRsgVOFoVz7QEHB7Zqx0%3Dreserved=0
Re: [PATCH v3 0/1] phy: fsl-imx8-mipi-dphy: Hook into runtime pm
Hi, On Wed, Dec 16, 2020 at 07:22:32PM +0100, Guido Günther wrote: > This allows us to shut down the mipi power domain on the imx8. The alternative > would be to drop the dphy from the mipi power domain in the SOCs device tree > and only have the DSI host controller visible there but since the PD is mostly > about the PHY that would defeat it's purpose. Is there anything I can do to move that forward. I assume this needs to go via the phy/ subsystem not drm? Cheers, -- Guido > > This is basically a resend from February 2020 which went without feedback. > > This allows to shut off the power domain hen blanking the LCD panel: > > pm_genpd_summary before: > > domain status slaves > /device runtime status > -- > mipion > /devices/platform/soc@0/soc@0:bus@3080/30a00300.dphy unsupported > /devices/platform/soc@0/soc@0:bus@3080/30a0.mipi_dsi suspended > > after: > > mipioff-0 > /devices/platform/soc@0/soc@0:bus@3080/30a00300.dphy suspended > /devices/platform/soc@0/soc@0:bus@3080/30a0.mipi_dsi suspended > > Changes from v1: > - Tweak commit message slightly > > Changes from v2: > - As pre review comment by Lucas Stach > > https://lore.kernel.org/linux-arm-kernel/ee22b072e0abe07559a3e6a63ccf6ece064a46cb.ca...@pengutronix.de/ > Check for pm_runtime_get_sync failure > > Guido Günther (1): > phy: fsl-imx8-mipi-dphy: Hook into runtime pm > > .../phy/freescale/phy-fsl-imx8-mipi-dphy.c| 25 ++- > 1 file changed, 24 insertions(+), 1 deletion(-) > > -- > 2.29.2 > > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
[PATCH v3 0/1] Add FITRIM ioctl support for exFAT filesystem
This adds FITRIM ioctl functionality to the exFAT filesystem. To do that, add a generic ioctl function and a FITRIM handler.

Changelog
=========
v2->v3:
- Remove unnecessary local variable
- Merge all changes to a single patch

v1->v2:
- Change variable declaration order as reverse tree style.
- Return -EOPNOTSUPP from sb_issue_discard() just as it is.
- Remove cond_resched() in while loop.
- Move ioctl related code into its helper function.

Hyeongseok Kim (1):
  exfat: add support ioctl and FITRIM function

 fs/exfat/balloc.c   | 81 +
 fs/exfat/dir.c      |  5 +++
 fs/exfat/exfat_fs.h |  4 +++
 fs/exfat/file.c     | 53 +
 4 files changed, 143 insertions(+)

-- 
2.27.0.83.g0313f36
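For context, this is roughly how user space would exercise a FITRIM handler such as the one this patch adds. It is a minimal sketch, not taken from the patch: `fstrim_whole_fs()` and `fstrim_path()` are hypothetical helper names, while `FITRIM` and `struct fstrim_range` come from `<linux/fs.h>`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h> /* FITRIM, struct fstrim_range */
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Build a range that covers the whole filesystem. */
struct fstrim_range fstrim_whole_fs(uint64_t minlen)
{
	struct fstrim_range range = {
		.start = 0,
		.len = UINT64_MAX,
		.minlen = minlen,
	};

	return range;
}

/*
 * Ask the filesystem mounted at mountpoint to discard its free space.
 * Returns 0 on success (and stores the trimmed byte count in *trimmed
 * if it is not NULL), or -1 with errno set.
 */
int fstrim_path(const char *mountpoint, uint64_t minlen, uint64_t *trimmed)
{
	struct fstrim_range range = fstrim_whole_fs(minlen);
	int fd, ret, saved_errno;

	assert(mountpoint);

	fd = open(mountpoint, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, FITRIM, &range);
	saved_errno = errno;
	if (ret == 0 && trimmed)
		*trimmed = range.len; /* the kernel writes back the bytes trimmed */
	close(fd);
	errno = saved_errno;
	return ret;
}
```

On a filesystem or device without discard support the call is expected to fail with errno set to EOPNOTSUPP, which matches the v1->v2 note about returning -EOPNOTSUPP unchanged.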
[PATCH v3 0/1] correct the inside linear map range during hotplug check
v3:
- Sync with linux-next where arch_get_mappable_range() was introduced.
v2:
- Added Tested-by from Tyler Hicks
- Addressed comments from Anshuman Khandual: moved check under IS_ENABLED(CONFIG_RANDOMIZE_BASE), added WARN_ON(start_linear_pa > end_linear_pa);

Fixes a hotplug error that may occur on systems with CONFIG_RANDOMIZE_BASE enabled.

Applies against linux-next.

v1: https://lore.kernel.org/lkml/20210213012316.1525419-1-pasha.tatas...@soleen.com
v2: https://lore.kernel.org/lkml/20210215192237.362706-1-pasha.tatas...@soleen.com

Pavel Tatashin (1):
  arm64: mm: correct the inside linear map range during hotplug check

 arch/arm64/mm/mmu.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

-- 
2.25.1
[PATCH v3 0/1] AMD EPYC: fix schedutil perf regression (freq-invariance)
v2 at https://lore.kernel.org/lkml/20210122204038.3238-1-ggherdov...@suse.cz Changes wrt v2: - removed redundant "#ifdef CONFIG_ACPI_CPPC_LIB" Giovanni Gherdovich (1): x86,sched: On AMD EPYC set freq_max = max_boost in schedutil invariant formula drivers/cpufreq/acpi-cpufreq.c | 61 ++-- drivers/cpufreq/cpufreq.c| 3 ++ include/linux/cpufreq.h | 5 +++ kernel/sched/cpufreq_schedutil.c | 8 +++-- 4 files changed, 73 insertions(+), 4 deletions(-) -- 2.26.2
[PATCH v3 0/1] scale loop device lock
Changelog
v3
- Added review-by Tyler
- Sync with mainline
v2
- Addressed Tyler Hicks comments
- added mutex_destroy()
- comment in lo_open()
- added lock around lo_disk
===

In our environment we are using systemd portable containers in squashfs format, converting them into loop devices, and mounting them.

NAME                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop5                   7:5    0  76.4M  0 loop
`-BaseImageM1908      252:3    0  76.4M  1 crypt /BaseImageM1908
loop6                   7:6    0    20K  0 loop
`-test_launchperf20   252:17   0   1.3M  1 crypt /app/test_launchperf20
loop7                   7:7    0    20K  0 loop
`-test_launchperf18   252:4    0   1.5M  1 crypt /app/test_launchperf18
loop8                   7:8    0     8K  0 loop
`-test_launchperf8    252:25   0    28K  1 crypt app/test_launchperf8
loop9                   7:9    0   376K  0 loop
`-test_launchperf14   252:29   0  45.7M  1 crypt /app/test_launchperf14
loop10                  7:10   0    16K  0 loop
`-test_launchperf4    252:11   0   968K  1 crypt app/test_launchperf4
loop11                  7:11   0   1.2M  0 loop
`-test_launchperf17   252:26   0 150.4M  1 crypt /app/test_launchperf17
loop12                  7:12   0    36K  0 loop
`-test_launchperf19   252:13   0   3.3M  1 crypt /app/test_launchperf19
loop13                  7:13   0     8K  0 loop
...

We have over 50 loop devices which are mounted during boot.

We observed contention around loop_ctl_mutex.

The sample contention stacks:

Contention 1:
__blkdev_get()
  bdev->bd_disk->fops->open()
    lo_open()
      mutex_lock_killable(&loop_ctl_mutex); <- contention

Contention 2:
__blkdev_put()
  disk->fops->release()
    lo_release()
      mutex_lock(&loop_ctl_mutex); <- contention

The total time spent waiting for loop_ctl_mutex during boot is ~18.8s (across 8 CPUs) on our machine (69 loop devices): 2.35s per CPU.

Scaling this lock eliminates this contention entirely, and improves the boot performance by 2s on our machine.

v2 https://lore.kernel.org/lkml/20200723211748.13139-1-pasha.tatas...@soleen.com
v1 https://lore.kernel.org/lkml/20200717205322.127694-1-pasha.tatas...@soleen.com

Pavel Tatashin (1):
  loop: scale loop device by introducing per device lock

 drivers/block/loop.c | 92 +---
 drivers/block/loop.h |  1 +
 2 files changed, 54 insertions(+), 39 deletions(-)

-- 
2.25.1
[RFC PATCH v3 0/1] Adding support for IIO SCMI based sensors
Hi, This series adds support for ARM SCMI Protocol based IIO Device. This driver provides support for Accelerometer and Gyroscope sensor using new SCMI Sensor Protocol defined by the upcoming SCMIv3.0 ARM specification, which is available at https://developer.arm.com/documentation/den0056/c/ This version of the patch series has been tested using version 5.4.21 branch of Android common kernel. Any feedback welcome, Thanks, Jyoti Bhayana v2 --> v3 - Incorporated the feedback comments from v2 review of the patch v1 --> v2 - Incorporated the feedback comments from v1 review of the patch - Regarding the new ABI for sensor_power,sensor_max_range, and sensor_resolution, these are some of the sensor attributes which Android passes to the apps. If there is any other way of getting those values, please let us know Jyoti Bhayana (1): iio/scmi: Adding support for IIO SCMI Based Sensors MAINTAINERS| 6 + drivers/iio/common/Kconfig | 1 + drivers/iio/common/Makefile| 1 + drivers/iio/common/scmi_sensors/Kconfig| 18 + drivers/iio/common/scmi_sensors/Makefile | 5 + drivers/iio/common/scmi_sensors/scmi_iio.c | 736 + 6 files changed, 767 insertions(+) create mode 100644 drivers/iio/common/scmi_sensors/Kconfig create mode 100644 drivers/iio/common/scmi_sensors/Makefile create mode 100644 drivers/iio/common/scmi_sensors/scmi_iio.c -- 2.30.0.280.ga3ce27912f-goog
[PATCH v3 0/1] arm64: PCI SMC config conduit
This set provides a platform standardized way to access PCI config space. It does that via an Arm specific interface exported by the firmware. The Arm specification this is based on can be found here: The Arm PCI Configuration Space Access Firmware Interface https://developer.arm.com/documentation/den0115/latest v2->v3: Convert from SMC only calls to arm_smccc_1_1_invoke() for better conformance with the specification. v1->v2: Add SMC_PCI_FEATURES calls to verify _READ, _WRITE and _SEG_INFO functions exist. Add a _SEG_INFO bus start, end validation against the ACPI table. Adjust some function naming, and log messages. Jeremy Linton (1): arm64: PCI: Enable SMC conduit arch/arm64/kernel/pci.c | 111 ++ include/linux/arm-smccc.h | 29 ++ 2 files changed, 140 insertions(+) -- 2.26.2
[PATCH v3 0/1] Add software TX timestamps to the CAN devices
With the ongoing work to add BQL to Socket CAN, I figured out that it would be nice to have an easy way to measure the latency. One easy way to do so is to check the round trip time of the packet by taking the difference between the software rx timestamp and the software tx timestamp.

rx timestamps are already available. This patch gives the missing piece: add a tx software timestamp feature to the CAN devices.

Of course, the tx software timestamp might also be used for other purposes such as performance measurements of the different queuing disciplines (e.g. by checking the difference between the kernel tx software timestamp and the userland tx software timestamp).

v2 was a mistake, please ignore it (forgot to do git add, changes were not reflected...)

v3 reflects the comments that Jeroen made in
https://lkml.org/lkml/2021/1/10/54

Vincent Mailhol (1):
  can: dev: add software tx timestamps

 drivers/net/can/dev.c | 1 +
 1 file changed, 1 insertion(+)

-- 
2.26.2
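The latency measurement described above boils down to subtracting two timestamps. A minimal sketch, assuming both software timestamps have already been retrieved as `struct timespec` values (for instance through the socket timestamping interface, which is not shown here); `timespec_diff_ns()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from earlier to later (negative if swapped). */
int64_t timespec_diff_ns(const struct timespec *later,
			 const struct timespec *earlier)
{
	assert(later && earlier);

	return (int64_t)(later->tv_sec - earlier->tv_sec) * 1000000000LL +
	       (int64_t)(later->tv_nsec - earlier->tv_nsec);
}
```

Passing the software rx timestamp as `later` and the software tx timestamp as `earlier` gives the round trip time in nanoseconds.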
[PATCH v3 0/1] mfd: intel-m10-bmc: add sysfs files for mac_address
Add two sysfs nodes to the Intel MAX10 BMC driver: mac_address and mac_count. The mac_address provides the first of a series of sequential MAC addresses assigned to the FPGA card. The mac_count indicates how many MAC addresses are assigned to the card.

Changelog v2 -> v3:
- Updated Date and KernelVersion in ABI documentation

Changelog v1 -> v2:
- Updated the documentation for the mac_address and mac_count sysfs nodes to clarify their usage.
- Changed sysfs _show() functions to use sysfs_emit() instead of sprintf.

Russ Weight (1):
  mfd: intel-m10-bmc: expose mac address and count

 .../ABI/testing/sysfs-driver-intel-m10-bmc | 21 +
 drivers/mfd/intel-m10-bmc.c                | 43 +++
 include/linux/mfd/intel-m10-bmc.h          |  9
 3 files changed, 73 insertions(+)

-- 
2.25.1
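As an illustration of how a consumer might combine the two nodes, the sketch below derives the n-th address of the series from the first one. It rests on assumptions that the cover letter does not state: that mac_address reads as colon-separated hex ("aa:bb:cc:dd:ee:ff", optionally followed by a newline) and that "sequential" means adding the index to the address taken as a 48-bit integer. `mac_parse()` and `mac_nth()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "aa:bb:cc:dd:ee:ff" (optional trailing newline) into a 48-bit value. */
int mac_parse(const char *text, uint64_t *mac)
{
	unsigned int b[6];
	char tail;
	int n, i;

	assert(text && mac);

	n = sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &tail);
	if (n != 6 && !(n == 7 && tail == '\n'))
		return -1;

	*mac = 0;
	for (i = 0; i < 6; i++)
		*mac = (*mac << 8) | b[i];
	return 0;
}

/* Format address number idx (0-based) of a series of count addresses. */
int mac_nth(uint64_t base, unsigned int count, unsigned int idx, char out[18])
{
	uint64_t mac;

	assert(out);

	if (idx >= count)
		return -1; /* not one of the addresses assigned to the card */

	mac = (base + idx) & 0xffffffffffffULL;
	snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
		 (unsigned int)((mac >> 40) & 0xff),
		 (unsigned int)((mac >> 32) & 0xff),
		 (unsigned int)((mac >> 24) & 0xff),
		 (unsigned int)((mac >> 16) & 0xff),
		 (unsigned int)((mac >> 8) & 0xff),
		 (unsigned int)(mac & 0xff));
	return 0;
}
```

The bounds check against mac_count is the point of exposing both values: an index at or beyond the count is rejected instead of producing an address the card does not own.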
[RFC PATCH V3 0/1] block: fix I/O errors in BLKRRPART
Hello,

This patch fixes I/O errors during the BLKRRPART ioctl() when it is issued right after a format operation that changed the logical block size of the block device, with the same file descriptor still open.

Testcase:

The following testcase is a case of an NVMe namespace with the following conditions:
- Current LBA format is lbaf=0 (512 bytes logical block size)
- LBA Format(lbaf=1) has 4096 bytes logical block size

# Format block device logical block size 512B to 4096B
nvme format /dev/nvme0n1 --lbaf=1 --force

This will cause I/O errors because the BLKRRPART ioctl() happens right after the format command with the same file descriptor still open in the application (e.g., nvme-cli), like:

	fd = open("/dev/nvme0n1", O_RDONLY);
	nvme_format(fd, ...);
	if (ioctl(fd, BLKRRPART) < 0)
		...

Errors:

We can see the Read command with Number of LBA(NLB) 0xffff(65535), which underflowed because the BLKRRPART operation sized its request, via buffer_head, based on the i_blkbits of the block device, which is still 9.

[dmesg-snip]
[ 10.771740] blk_update_request: operation not supported error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 10.780262] Buffer I/O error on dev nvme0n1, logical block 0, async page read

[event-snip]
kworker/0:1H-56 [000] 913.456922: nvme_setup_cmd: nvme0: disk=nvme0n1, qid=1, cmdid=216, nsid=1, flags=0x0, meta=0x0, cmd=(nvme_cmd_read slba=0, len=65535, ctrl=0x0, dsmgmt=0, reftag=0)
ksoftirqd/0-9 [000] .Ns. 916.566351: nvme_complete_rq: nvme0: disk=nvme0n1, qid=1, cmdid=216, res=0x0, retries=0, flags=0x0, status=0x4002

The patch below fixes the I/O errors by setting a flag on the gendisk and rejecting I/O requests from the block layer until the file descriptor is re-opened and updated by __blkdev_get(). This is based on the previous discussion [1].

Since V2:
- Cover letter with testcase and error logs attached. Removed un-related changes: empty line. (Chaitanya, [2])
- Put blkdev with blkdev_put_no_open().
Since V1: - Updated patch to reject I/O rather than updating i_blkbits of the block device's inode directly from driver. (Christoph, [1]) [1] https://lore.kernel.org/linux-nvme/20201223183143.GB13354@localhost.localdomain/T/#t [2] https://lore.kernel.org/linux-nvme/20201230140504.GB7917@localhost.localdomain/T/#t Thanks, Minwoo Im (1): block: reject I/O for same fd if block size changed block/blk-settings.c| 8 block/partitions/core.c | 11 +++ fs/block_dev.c | 6 ++ include/linux/genhd.h | 1 + 4 files changed, 26 insertions(+) -- 2.17.1
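The 65535 seen in the trace above is consistent with a zero-based 16-bit block count wrapping around. The sketch below reproduces that arithmetic; it assumes the count is derived as (bytes >> lba_shift) - 1, which is a reading of the logs rather than something this cover letter states, and `nlb_zero_based()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Zero-based 16-bit "number of logical blocks" for a request of `bytes`
 * on a device whose logical block size is 1 << lba_shift.
 */
uint16_t nlb_zero_based(uint32_t bytes, unsigned int lba_shift)
{
	assert(lba_shift < 32);

	/* A request smaller than one block yields 0 - 1, which wraps. */
	return (uint16_t)((bytes >> lba_shift) - 1);
}
```

A 512-byte read sized from the stale i_blkbits of 9 is fine against a 512-byte namespace (`nlb_zero_based(512, 9)` is 0, meaning one block), but against the reformatted 4096-byte namespace it becomes `nlb_zero_based(512, 12)`, which is 0xffff.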
[PATCH v3 0/1] mm: memmap defer init dosn't work as expected
Post the regression fix as a standalone patch, as Andrew suggested, for easier backporting to the -stable branch. This is rebased on the latest master branch of the mainline kernel; there's almost no change compared with v2.

https://lore.kernel.org/linux-mm/20201220082754.6900-1-...@redhat.com/

Tested on a system with 24G ram as below, adding 'memmap=128M!0x5' to split the one ram region into two regions in numa node1 to simulate the scenario of VMware.

[ +0.00] BIOS-provided physical RAM map:
[ +0.00] BIOS-e820: [mem 0x-0x0009bfff] usable
[ +0.00] BIOS-e820: [mem 0x0009c000-0x0009] reserved
[ +0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[ +0.00] BIOS-e820: [mem 0x0010-0x6cdcefff] usable
[ +0.00] BIOS-e820: [mem 0x6cdcf000-0x6efcefff] reserved
[ +0.00] BIOS-e820: [mem 0x6efcf000-0x6fdfefff] ACPI NVS
[ +0.00] BIOS-e820: [mem 0x6fdff000-0x6fffefff] ACPI data
[ +0.00] BIOS-e820: [mem 0x6000-0x6fff] usable
[ +0.00] BIOS-e820: [mem 0x7000-0x8fff] reserved
[ +0.00] BIOS-e820: [mem 0xe000-0x] reserved
[ +0.00] BIOS-e820: [mem 0x0001-0x00067f1f] usable
[ +0.00] BIOS-e820: [mem 0x00067f20-0x00067fff] reserved

Test passed as below.
As you can see, with patch applied, memmap init will cost much less time on numa node 1: Without the patch: [0.065029] Early memory node ranges [0.065030] node 0: [mem 0x1000-0x0009bfff] [0.065032] node 0: [mem 0x0010-0x6cdcefff] [0.065034] node 0: [mem 0x6000-0x6fff] [0.065036] node 0: [mem 0x0001-0x00027fff] [0.065038] node 1: [mem 0x00028000-0x0004] [0.065040] node 1: [mem 0x00050800-0x00067f1f] [0.065185] Zeroed struct page in unavailable ranges: 16533 pages [0.065187] Initmem setup node 0 [mem 0x1000-0x00027fff] [0.069616] Initmem setup node 1 [mem 0x00028000-0x00067f1f] [0.096298] ACPI: PM-Timer IO Port: 0x408 With the patch applied: [0.065029] Early memory node ranges [0.065030] node 0: [mem 0x1000-0x0009bfff] [0.065032] node 0: [mem 0x0010-0x6cdcefff] [0.065034] node 0: [mem 0x6000-0x6fff] [0.065036] node 0: [mem 0x0001-0x00027fff] [0.065038] node 1: [mem 0x00028000-0x0004] [0.065041] node 1: [mem 0x00050800-0x00067f1f] [0.065187] Zeroed struct page in unavailable ranges: 16533 pages [0.065189] Initmem setup node 0 [mem 0x1000-0x00027fff] [0.069572] Initmem setup node 1 [mem 0x00028000-0x00067f1f] [0.070161] ACPI: PM-Timer IO Port: 0x408 Baoquan He (1): mm: memmap defer init dosn't work as expected arch/ia64/mm/init.c | 4 ++-- include/linux/mm.h | 5 +++-- mm/memory_hotplug.c | 2 +- mm/page_alloc.c | 8 +--- 4 files changed, 11 insertions(+), 8 deletions(-) -- 2.17.2
[PATCH v3 0/1] phy: fsl-imx8-mipi-dphy: Hook into runtime pm
This allows us to shut down the mipi power domain on the imx8. The alternative would be to drop the dphy from the mipi power domain in the SOCs device tree and only have the DSI host controller visible there but since the PD is mostly about the PHY that would defeat its purpose.

This is basically a resend from February 2020 which went without feedback.

This allows to shut off the power domain when blanking the LCD panel:

pm_genpd_summary before:

domain                          status          slaves
    /device                                             runtime status
----------------------------------------------------------------------
mipi                            on
    /devices/platform/soc@0/soc@0:bus@3080/30a00300.dphy  unsupported
    /devices/platform/soc@0/soc@0:bus@3080/30a0.mipi_dsi  suspended

after:

mipi                            off-0
    /devices/platform/soc@0/soc@0:bus@3080/30a00300.dphy  suspended
    /devices/platform/soc@0/soc@0:bus@3080/30a0.mipi_dsi  suspended

Changes from v1:
- Tweak commit message slightly

Changes from v2:
- As per review comment by Lucas Stach
  https://lore.kernel.org/linux-arm-kernel/ee22b072e0abe07559a3e6a63ccf6ece064a46cb.ca...@pengutronix.de/
  Check for pm_runtime_get_sync failure

Guido Günther (1):
  phy: fsl-imx8-mipi-dphy: Hook into runtime pm

 .../phy/freescale/phy-fsl-imx8-mipi-dphy.c | 25 ++-
 1 file changed, 24 insertions(+), 1 deletion(-)

-- 
2.29.2
[PATCH v3 0/1] net: Reduce rcu_barrier() contentions from 'unshare(CLONE_NEWNET)'
From: SeongJae Park

On a few of our systems, I found that frequent 'unshare(CLONE_NEWNET)' calls make the number of active slab objects, including those of the 'sock_inode_cache' type, increase rapidly and continuously. As a result, memory pressure occurs.

In more detail, I made an artificial reproducer that resembles the workload in which we found the problem and reproduces the problem faster. It merely repeats 'unshare(CLONE_NEWNET)' 50,000 times in a loop. It takes about 2 minutes. On a machine with 40 CPU cores and 70GB DRAM, the available memory decreased continuously and quickly (about 120MB per second, 15GB in total within the 2 minutes). Note that the issue doesn't reproduce on every machine. On my 6 CPU cores machine, the problem didn't reproduce.

'cleanup_net()' and 'fqdir_work_fn()' are functions that deallocate the relevant memory objects. They are asynchronously invoked by the work queues and internally use 'rcu_barrier()' to ensure safe destructions. 'cleanup_net()' works in a batched manner in a single thread worker, while 'fqdir_work_fn()' works for each 'fqdir_exit()' call in the 'system_wq'. Therefore, 'fqdir_work_fn()' was called frequently under the workload and made the contention for 'rcu_barrier()' high. In more detail, the global mutex, 'rcu_state.barrier_mutex', became the bottleneck.

I tried making 'rcu_barrier()' and subsequent lightweight works in 'fqdir_work_fn()' to be processed by a dedicated singlethread worker in batch and confirmed it works. After the change, no continuous memory reduction but some fluctuation was observed. Nevertheless, the available memory reduction was only up to about 400MB. The following patch is for the change. I think this is the right solution for a point fix of this issue, but someone might blame different parts.

1. User: Frequent 'unshare()' calls
From some point of view, such frequent 'unshare()' calls might seem only insane.

2. Global mutex in 'rcu_barrier()'
Because of the global mutex, 'rcu_barrier()' callers could wait long even after the callbacks started before the call finished. Therefore, similar issues could happen in other 'rcu_barrier()' usages. Maybe we can use some wait queue like mechanism to notify the waiters when the desired time came.

I personally believe applying the point fix for now and making the 'rcu_barrier()' improvement in the long term makes sense. If I'm missing something or you have a different opinion, please feel free to let me know.

Patch History
-------------

Changes from v2 (https://lore.kernel.org/lkml/20201210080844.23741-1-sjp...@amazon.com/)
- Add numbers after the patch (Eric Dumazet)
- Make only 'rcu_barrier()' and subsequent lightweight works serialized (Eric Dumazet)

Changes from v1 (https://lore.kernel.org/netdev/20201208094529.23266-1-sjp...@amazon.com/)
- Keep xmas tree variable ordering (Jakub Kicinski)
- Add more numbers (Eric Dumazet)
- Use 'llist_for_each_entry_safe()' (Eric Dumazet)

SeongJae Park (1):
  net/ipv4/inet_fragment: Batch fqdir destroy works

 include/net/inet_frag.h  |  1 +
 net/ipv4/inet_fragment.c | 45 +---
 2 files changed, 39 insertions(+), 7 deletions(-)

-- 
2.17.1
[PATCH v3 0/1] Fix object remain in offline per-cpu quarantine
This patch fixes objects remaining in the offline per-cpu quarantine, as described below.

Freed objects are put into the per-cpu quarantine if generic KASAN is enabled. If a cpu is offline and users call kmem_cache_destroy, the kernel will detect objects that still remain in the offline per-cpu quarantine and report an error.

Register a cpu hotplug function to remove all objects in the offline per-cpu quarantine when a cpu is going offline. Set a per-cpu variable to indicate this cpu is offline.

Changes since v3:
- Add a barrier to ensure the ordering
- Rename the init function

Changes since v2:
- Thanks to Dmitry for the suggestions
- Remove unnecessary code
- Put offline variable into cpu_quarantine
- Use single qlist_free_all call instead of iteration over all slabs
- Add bug reporter in commit message

Kuan-Ying Lee (1):
  kasan: fix object remain in offline per-cpu quarantine

 mm/kasan/quarantine.c | 40
 1 file changed, 40 insertions(+)

-- 
2.18.0
[PATCH V3 0/1] Add QPIC NAND support for IPQ6018
IPQ6018 has the QPIC NAND controller of version 1.5.0, which uses the BAM DMA. Add support for the QPIC BAM, QPIC NAND and enable the same in the board DTS file. [V3]: - Rebased on v5.10-rc6 - Renamed the qpic bam dma node name from 'dma' to 'dma-controller' - Update the device register space to 64bit format Above mentioned last two points based on the latest changes in the QCOM tree. [V2]: - Rebased on v5.10-rc2 - Replaced "ok" with "okay" for status property - Dropped the MTD and dt-bindings patch as they are already picked in MTD tree Kathiravan T (1): arm64: dts: ipq6018: Add the QPIC peripheral nodes arch/arm64/boot/dts/qcom/ipq6018-cp01-c1.dts | 16 arch/arm64/boot/dts/qcom/ipq6018.dtsi| 41 2 files changed, 57 insertions(+) base-commit: b65054597872ce3aefbc6a666385eabdf9e288da -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
[PATCH v3 0/1] Add macro definition for the upcoming new OST driver.
Add a new macro definition to "ingenic,sysost.h" and exchange the original ABI values of OST_CLK_PERCPU_TIMER and OST_CLK_GLOBAL_TIMER, to prepare for the upcoming new OST driver. I'm sure that exchanging the ABI values of OST_CLK_PERCPU_TIMER and OST_CLK_GLOBAL_TIMER will not affect the existing related drivers and the SoCs which use these drivers, so we should be able to exchange them safely.

v1->v2:
Rewrite the commit message so that each line is less than 80 characters.

v2->v3:
Add the description of why the exchange of ABI values will not affect the existing driver into the commit message.

周琰杰 (Zhou Yanjie) (1):
  dt-bindings: timer: Add new OST support for the upcoming new driver.

 include/dt-bindings/clock/ingenic,sysost.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

-- 
2.11.0
[PATCH v3 0/1] ARM: dts: sun8i: add FriendlyArm ZeroPi support
This patch adds FriendlyArm ZeroPi support.

Wiki: http://wiki.friendlyarm.com/wiki/index.php/ZeroPi
Schematic: http://wiki.friendlyarm.com/wiki/images/7/71/ZeroPi_20190731_Schematic.pdf

v1:
- Remove the extra spaces in description text.
v2:
- Remove the ehci0 and ohci0 device nodes.
- Remove the usbphy->usb0_id_det-gpios property.
v3:
- Enable RGMII RX/TX delay on PHY.

Yu-Tung Chang (1):
  ARM: dts: sun8i: add FriendlyArm ZeroPi support

 .../devicetree/bindings/arm/sunxi.yaml |  5 ++
 arch/arm/boot/dts/Makefile             |  1 +
 arch/arm/boot/dts/sun8i-h3-zeropi.dts  | 87 +++
 3 files changed, 93 insertions(+)
 create mode 100644 arch/arm/boot/dts/sun8i-h3-zeropi.dts

-- 
2.29.0
[PATCH v3 0/1] fix i2c polling mode workaround for FU540-C000 SoC
The polling mode workaround for the FU540-C000 on the HiFive Unleashed A00 board was added earlier. The logic for this seems to work only if the interrupt property is missing from (not added to) the i2c0 device node.

Here we address this issue by identifying the SoC based on its compatibility string and setting the master xfers to polling mode if it's the FU540-C000 SoC.

The fix has been tested on Linux 5.9.0-rc8 with a PMOD based RTCC sensor connected to the I2C pins of the J1 header of the board.

Log for reference:

# uname -a
Linux buildroot 5.9.0-rc8-1-g9da7791 #1 SMP Fri Oct 9 07:56:13 PDT 2020 riscv64 GNU/Linux
# i2cdetect -y 0
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          -- -- -- -- -- -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- -- -- -- -- -- 57 -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 6f
70: -- -- -- -- -- -- -- --
# i2cget 0 0x57 0 b -y
0xa5
# i2cset 0 0x57 0 0x9f b -y
# i2cget 0 0x57 0 b -y
0x9f
# i2cget 0 0x57 1 b -y
0xff
# i2cset 0 0x57 1 0xa9 b -y
# i2cget 0 0x57 1 b -y
0xa9
# i2cget 0 0x6f 0x20 b -y
0x98
# i2cset 0 0x6f 0x20 0xa5 b -y
# i2cget 0 0x6f 0x20 b -y
0xa5
# i2cget 0 0x6f 0x5f b -y
0x55
# i2cset 0 0x6f 0x5f 0x5a b -y
# i2cget 0 0x6f 0x5f b -y
0x5a
#

Without the fix here, it's observed that "i2cdetect -y 0" turns the system unresponsive, with CPU stall messages.

Patch History:
===
V3:
-Rectified typo as suggested here: https://lkml.org/lkml/2020/10/9/902
V2:
-Incorporated changes as suggested by Peter Kosgaard https://lkml.org/lkml/2020/10/8/663
V1:
Base version

Sagar Shrikant Kadam (1):
  i2c: ocores: fix polling mode workaround on FU540-C000 SoC

 drivers/i2c/busses/i2c-ocores.c | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

-- 
2.7.4
[RESEND PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
This is a resend of v3 as the original, sent over 6 hours ago, is yet to make it to LKML.

- Changes since v2:
  * set status to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER.

- Changes since v1:
  * changed the commit message to clarify what broke post commit 6d2c89441571
  * dropped the misnomer post_reset_status variable in favour of a more natural approach that relies on a boolean to keep track of the outcome of reset_link()

After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful reset which breaks error recovery by breaking driver (re)initialisation.

Cc: Russ Anderson
Cc: Kuppuswamy Sathyanarayanan
Cc: Bjorn Helgaas
Cc: Ashok Raj
Cc: Joerg Roedel
Cc: sta...@kernel.org # v5.7+
---
Hedi Berriche (1):
  PCI/ERR: don't clobber status after reset_link()

 drivers/pci/pcie/err.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
2.28.0
[PATCH v3 0/1] PCI/ERR: fix regression introduced by 6d2c89441571 ("PCI/ERR: Update error status after reset_link()")
- Changes since v2: * set status to PCI_ERS_RESULT_RECOVERED, in case of successful link reset, if and only if the initial value of error status is PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER. - Changes since v1: * changed the commit message to clarify what broke post commit 6d2c89441571 * dropped the misnomer post_reset_status variable in favour of a more natural approach that relies on a boolean to keep track of the outcome of reset_link() After commit 6d2c89441571 ("PCI/ERR: Update error status after reset_link()") pcie_do_recovery() no longer calls ->slot_reset() in the case of a successful reset which breaks error recovery by breaking driver (re)initialisation. Cc: Russ Anderson Cc: Kuppuswamy Sathyanarayanan Cc: Bjorn Helgaas Cc: Ashok Raj Cc: Joerg Roedel Cc: sta...@kernel.org # v5.7+ --- Hedi Berriche (1): PCI/ERR: don't clobber status after reset_link() drivers/pci/pcie/err.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) -- 2.28.0
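The rule stated under "Changes since v2" can be written down as a small decision function. This is a user-space sketch only: the enum is a hypothetical stand-in for the kernel's pci_ers_result_t values, `status_after_reset_link()` is not a kernel function, and leaving the status untouched when the reset fails is an assumption, since the cover letter does not describe that path.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's pci_ers_result_t values. */
enum ers_result {
	ERS_RESULT_CAN_RECOVER,
	ERS_RESULT_NEED_RESET,
	ERS_RESULT_DISCONNECT,
	ERS_RESULT_RECOVERED,
	ERS_RESULT_NO_AER_DRIVER,
};

/*
 * Only promote the status to RECOVERED after a successful link reset
 * when it started out as DISCONNECT or NO_AER_DRIVER; any other status
 * is left alone so that later recovery steps still see it.
 */
enum ers_result status_after_reset_link(enum ers_result status, bool reset_ok)
{
	if (reset_ok && (status == ERS_RESULT_DISCONNECT ||
			 status == ERS_RESULT_NO_AER_DRIVER))
		return ERS_RESULT_RECOVERED;

	return status;
}
```

The case that matters for the regression is a status of NEED_RESET with a successful reset: it stays NEED_RESET, so the ->slot_reset() step is not skipped.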
[PATCH v3 0/1] 8bpp support for Ingenic-drm
Final (?) version of my "small improvements to ingenic-drm" patchset. Most of the patches of V2 have been merged to drm-misc-next, except this one which required some more work. In the CRTC's .atomic_check callback, the size of the gamma LUT property is now checked, so that only a complete 256-entry palette is accepted. Cheers, -Paul Paul Cercueil (1): drm/ingenic: Add support for paletted 8bpp drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 66 +-- 1 file changed, 62 insertions(+), 4 deletions(-) -- 2.28.0
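The palette check mentioned above amounts to comparing the size of the gamma LUT blob against exactly 256 entries. A minimal user-space sketch, assuming 8-byte entries laid out like the DRM colour LUT entry (red, green, blue, reserved as 16-bit fields); `struct lut_entry` and `check_palette_blob()` are hypothetical names, not the driver's code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PALETTE_ENTRIES 256 /* 8bpp: one palette entry per pixel value */

/* Assumed entry layout, redefined here so the sketch builds in user space. */
struct lut_entry {
	uint16_t red;
	uint16_t green;
	uint16_t blue;
	uint16_t reserved;
};

/* Accept only a complete palette; reject shorter or longer blobs. */
int check_palette_blob(size_t blob_bytes)
{
	if (blob_bytes != PALETTE_ENTRIES * sizeof(struct lut_entry))
		return -EINVAL;

	return 0;
}
```

Rejecting partial palettes up front means the commit path never has to decide what colour an unlisted pixel value should map to.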
[PATCH v3 0/1] convert l2 cache dt bindings to YAML format
This patch is created and tested on top of mainline linux commit 856deb866d16 ("Linux 5.9-rc5") Reference log of "make dt_binding_check" is available here[1]. Just in case required the log of dt_binding_check without this patch is available here[2] [1] https://paste.ubuntu.com/p/d2bXwvpFz9/ [2] https://paste.ubuntu.com/p/X2TzBbCs3k/ Change History: v3: -Incorporated changes as suggested by Rob Herring here[3] [3] https://lkml.org/lkml/2020/9/15/670 -Rebased patch on 5.9-rc5 V2: -Fixed bot failure mentioned by Rob Herring -Updated dt-schema and kernel as suggested V1: Base version Sagar Kadam (1): dt-bindings: riscv: sifive-l2-cache: convert bindings to json-schema .../devicetree/bindings/riscv/sifive-l2-cache.txt | 51 .../devicetree/bindings/riscv/sifive-l2-cache.yaml | 90 ++ 2 files changed, 90 insertions(+), 51 deletions(-) delete mode 100644 Documentation/devicetree/bindings/riscv/sifive-l2-cache.txt create mode 100644 Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml -- 2.7.4
Re: [PATCH v3 0/1] drm/bridge: ps8640: Make sure all needed is powered to get the EDID
Hi Sam, On 27/8/20 10:59, Enric Balletbo i Serra wrote: > The first 4 patches of the series version 2: > - drm/bridge_connector: Set default status connected for eDP connectors > - drm/bridge: ps8640: Get the EDID from eDP control > - drm/bridge: ps8640: Return an error for incorrect attach flags > - drm/bridge: ps8640: Print an error if VDO control fails > > Are already applied to drm-misc-next, so I removed from this series. The > pending patch is part of the original series and is a rework of the power > handling to get the EDID. Basically, we need to make sure all the > needed is powered to be able to get the EDID. Before, we saw that getting > the EDID failed as explained in the third patch. > > [1] https://lkml.org/lkml/2020/6/15/1208 > > Changes in v3: > - Make poweron/poweroff and pre_enable/post_disable reverse one to each other > (Sam Ravnborg) > > Changes in v2: > - Use drm_bridge_chain_pre_enable/post_disable() helpers (Sam Ravnborg) > > Enric Balletbo i Serra (1): > drm/bridge: ps8640: Rework power state handling > > drivers/gpu/drm/bridge/parade-ps8640.c | 68 ++ > 1 file changed, 58 insertions(+), 10 deletions(-) > A gentle ping on this patch. Would be nice land this together with the already accepted patches. Thanks, Enric
Re: [PATCH v3 0/1] drm/bridge: ps8640: Make sure all needed is powered to get the EDID
Hi, On 15/09/2020 14:40, Enric Balletbo i Serra wrote: > Hi Sam, > > On 27/8/20 10:59, Enric Balletbo i Serra wrote: >> The first 4 patches of the series version 2: >> - drm/bridge_connector: Set default status connected for eDP connectors >> - drm/bridge: ps8640: Get the EDID from eDP control >> - drm/bridge: ps8640: Return an error for incorrect attach flags >> - drm/bridge: ps8640: Print an error if VDO control fails >> >> Are already applied to drm-misc-next, so I removed from this series. The >> pending patch is part of the original series and is a rework of the power >> handling to get the EDID. Basically, we need to make sure all the >> needed is powered to be able to get the EDID. Before, we saw that getting >> the EDID failed as explained in the third patch. >> >> [1] https://lkml.org/lkml/2020/6/15/1208 >> >> Changes in v3: >> - Make poweron/poweroff and pre_enable/post_disable reverse one to each >> other (Sam Ravnborg) >> >> Changes in v2: >> - Use drm_bridge_chain_pre_enable/post_disable() helpers (Sam Ravnborg) >> >> Enric Balletbo i Serra (1): >> drm/bridge: ps8640: Rework power state handling >> >> drivers/gpu/drm/bridge/parade-ps8640.c | 68 ++ >> 1 file changed, 58 insertions(+), 10 deletions(-) >> > > A gentle ping on this patch. Would be nice land this together with the already > accepted patches. Applying it to drm-misc-next Thanks, Neil > > Thanks, > Enric >
[RFC/RFT PATCH v3 0/1] arc: add sparsemem support
From: Mike Rapoport

Hi,

This is yet another attempt to enable SPARSEMEM on ARC.

I've boot tested it on nSIM with haps_hs_defconfig with highmem and
sparsemem enabled.

With sparsemem the kernel text becomes a bit smaller, but bss and data are
slightly increased:

$ size discontig/vmlinux sparse/vmlinux
   text    data     bss     dec     hex filename
4429390  785456  244580 5459426  534de2 discontig/vmlinux
4415099  786224  244844 5446167  531a17 sparse/vmlinux

I've also added dummy global functions to wrap pfn_valid(), page_to_pfn()
and pfn_to_page(). Judging by objdump, sparsemem is a bit more efficient:

DISCONTIGMEM                    SPARSEMEM
:
 seths   r2,0x3,r0              lsr     r2,r0,0xe
 mpy     r2,r2,1896             mpy     r0,r0,0x24
 add     r3,r2,0x8050066c       add3    r2,0x80529d1c,r2
 add_s   r2,r2,0x80500668       ld_s    r2,[r2,0]
 ld_s    r3,[r3,0]              bmskn   r2,r2,0x3
 sub_s   r0,r0,r3               j_s.d   [blink]
 ld_s    r2,[r2,0]              add_s   r0,r0,r2
 mpy     r0,r0,0x24             nop_s
 j_s.d   [blink]
 add_s   r0,r0,r2
:
 ld_s    r2,[r0,0]              ld_s    r2,[r0,0]
 lsr_s   r2,r2,0x1f             lsr_s   r2,r2,0x1b
 mpy     r2,r2,1896             add3    r2,0x80529d1c,r2
 add     r3,r2,0x80500668       ld_s    r2,[r2,0]
 add_s   r2,r2,0x8050066c       bmskn   r2,r2,0x3
 ld_s    r3,[r3,0]              sub_s   r0,r0,r2
 sub_s   r0,r0,r3               asr_s   r0,r0,0x2
 ld_s    r2,[r2,0]              mpy     r0,r0,0x38e38e39
 asr_s   r0,r0,0x2              j_s     [blink]
 mpy     r0,r0,0x38e38e39
 j_s.d   [blink]
 add_s   r0,r0,r2
 nop_s
:
 cmp_s   r0,0x3                 lsr_s   r0,r0,0xe
 mov_s   r2,0                   brhs.nt r0,0x20,24
 mov.ls  r2,0x768               add3    r0,0x80529d1c,r0
 add_s   r2,r2,0x80500814       breq_s  r0,0,12
 ld.as   r3,[r2,-106]           ld_s    r0,[r0,0]
 ld.as   r2,[r2,-104]           j_s.d   [blink]
 add_s   r2,r2,r3               xbfu    r0,r0,0x1
 j_s.d   [blink]                j_s.d   [blink]
 seths   r0,r2,r0               mov_s   r0,0
                                nop_s

Still, SPARSEMEM has an issue with potentially wasted memory allocated for
the memory map. The memory maps are allocated for each present section,
which means that if part of the section is not populated we'll have a bunch
of unused 'struct page' objects.
The smaller the section size, the smaller the memory overhead, but
SECTION_SIZE_BITS cannot be much smaller than MAX_PHYSMEM_BITS because
MAX_PHYSMEM_BITS - SECTION_SIZE_BITS has to fit into page flags and the room
there is limited.

There is yet another possibility to support separate banks. It is possible
to use FLATMEM and free the memmap allocated for the hole, like, for
instance, ARM does [1]. This will require an ARC override for pfn_valid()
that takes into account the actual memory configuration rather than relying
on the memmap.

[1] https://elixir.bootlin.com/linux/latest/source/arch/arm/mm/init.c#L305

Mike Rapoport (1):
  arc: add sparsemem support

 arch/arc/Kconfig                 | 10 ++
 arch/arc/include/asm/sparsemem.h | 13 +
 arch/arc/mm/init.c               | 6 +-
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 arch/arc/include/asm/sparsemem.h

--
2.26.2
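To make the section-size trade-off described in this cover letter concrete, here is a rough userspace C sketch of the arithmetic. It is not kernel code: the helper names are hypothetical, and the values used in the usage note (32 physical address bits, 27 section bits, 8K pages, a 36-byte 'struct page') are assumptions read off the `lsr ...,0xe`, `brhs.nt r0,0x20` and `mpy r0,r0,0x24` instructions in the objdump, not values stated by the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of memory sections that the section index stored in page flags
 * must be able to address: 2^(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS).
 */
static uint64_t nr_sections(unsigned max_physmem_bits, unsigned section_size_bits)
{
    assert(max_physmem_bits >= section_size_bits);
    assert(max_physmem_bits - section_size_bits < 64);
    return UINT64_C(1) << (max_physmem_bits - section_size_bits);
}

/*
 * Bytes of memory map allocated for one present section. The whole map is
 * allocated even when only part of the section is populated, which is the
 * waste the cover letter mentions.
 */
static uint64_t memmap_bytes_per_section(unsigned section_size_bits,
                                         unsigned page_shift,
                                         uint64_t struct_page_size)
{
    assert(section_size_bits >= page_shift);
    assert(section_size_bits - page_shift < 64);
    return (UINT64_C(1) << (section_size_bits - page_shift)) * struct_page_size;
}
```

With the assumed values, nr_sections(32, 27) gives 32 sections (5 bits of section index in page flags), and memmap_bytes_per_section(27, 13, 36) gives 589824 bytes (576 KiB) of memmap per present section. Lowering the section bits by one halves the per-section memmap but doubles the number of sections the page flags must encode.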
[PATCH v3 0/1] drm/bridge: ps8640: Make sure all needed is powered to get the EDID
The first 4 patches of the series version 2: - drm/bridge_connector: Set default status connected for eDP connectors - drm/bridge: ps8640: Get the EDID from eDP control - drm/bridge: ps8640: Return an error for incorrect attach flags - drm/bridge: ps8640: Print an error if VDO control fails Are already applied to drm-misc-next, so I removed from this series. The pending patch is part of the original series and is a rework of the power handling to get the EDID. Basically, we need to make sure all the needed is powered to be able to get the EDID. Before, we saw that getting the EDID failed as explained in the third patch. [1] https://lkml.org/lkml/2020/6/15/1208 Changes in v3: - Make poweron/poweroff and pre_enable/post_disable reverse one to each other (Sam Ravnborg) Changes in v2: - Use drm_bridge_chain_pre_enable/post_disable() helpers (Sam Ravnborg) Enric Balletbo i Serra (1): drm/bridge: ps8640: Rework power state handling drivers/gpu/drm/bridge/parade-ps8640.c | 68 ++ 1 file changed, 58 insertions(+), 10 deletions(-) -- 2.28.0
[PATCH v3 0/1]extcon: ptn5150: Add usb-typec support for Intel LGM SoC
Add USB Type-C detection support for Intel LGM SoC based boards.

The original driver did not support USB detection on Intel LGM SoC based
boards. We debugged and fixed the issue, but before we sent our patches
Mr. Krzyszto sent the same kind of patches, so I have rebased on top of his
latest patches, which are present in the maintainer tree. I built and tested
that tree and it works fine, then created the new patch on top of it.

Thanks to Chanwoo Choi for the review comments and suggestions.

---
v3:
  - Chanwoo Choi review comments update
  - replace 'capabiliy' to 'state' in commit message
  - add blank line
v2:
  - Krzyszto review comments update
  - squash my previous patches 1 to 5 as single patch
  - add extcon_set_property_capability for EXTCON_USB and
    EXTCON_PROP_USB_TYPEC_POLARITY

Ramuthevar Vadivel Murugan (1):
  extcon: ptn5150: Set the VBUS and POLARITY property capability

 drivers/extcon/extcon-ptn5150.c | 7 +++
 1 file changed, 7 insertions(+)

--
2.11.0
[PATCH v3 0/1] netfilter: nat: add a range check for l3/l4 protonum
Hi Pablo, > This patch is much smaller and if you confirm this is address the > issue, then this is awesome. Yes, I can confirm the updated patch does fix the kernel panic. I have retested on the Pixel 4 XL with version 4.14.180. Please see the updated patchset v3. Thanks, Will Will McVicker (1): netfilter: nat: add a range check for l3/l4 protonum net/netfilter/nf_conntrack_netlink.c | 2 ++ 1 file changed, 2 insertions(+) -- 2.28.0.297.g1956fa8f8d-goog
Re: [PATCH V3 0/1] irqchip: intmux: implement intmux PM
On Mon, 27 Jul 2020 22:17:33 +0800, Joakim Zhang wrote: > This patch intends to implement intmux PM. > > ChangeLogs: > V2->V3: > 1. allocate u32 saved_reg for a per channel. > > V1->V2: > 1. add more detailed commit message. > 2. use u32 for 32bit HW registers. > 3. fix kbuild failures. > 4. move trivial functions into their respective callers. > 5. squash two patches together. > > [...] Applied to irq/irqchip-next, thanks! [1/1] irqchip/imx-intmux: Implement intmux runtime power management commit: bb403111e017a327737242eca40311921f833627 Cheers, M. -- Without deviation from the norm, progress is not possible.
[PATCH V3 0/1] irqchip: intmux: implement intmux PM
This patch intends to implement intmux PM. ChangeLogs: V2->V3: 1. allocate u32 saved_reg for a per channel. V1->V2: 1. add more detailed commit message. 2. use u32 for 32bit HW registers. 3. fix kbuild failures. 4. move trivial functions into their respective callers. 5. squash two patches together. Joakim Zhang (1): irqchip: imx-intmux: implement intmux PM drivers/irqchip/irq-imx-intmux.c | 67 +++- 1 file changed, 65 insertions(+), 2 deletions(-) -- 2.17.1
Re: [PATCH v3 0/1] ASoC: fsl_asrc: always select different clocks
On Fri, Jul 17, 2020 at 01:34:34PM +0200, Arnaud Ferraris wrote: > Understood, sorry about that. Should I do a "clean" re-send for this one? It's fine, please just remember this for future submissions.
Re: [PATCH v3 0/1] ASoC: fsl_asrc: always select different clocks
On 17/07/2020 at 13:21, Mark Brown wrote: > On Fri, Jul 17, 2020 at 12:38:56PM +0200, Arnaud Ferraris wrote: >> This patch fixes the automatic clock selection so it always selects >> distinct input and output clocks. > > Please don't send new patches in reply to old ones, it buries things and > makes it hard to keep track of what the current version of a series > looks like. Just send new versions as a completely new thread. > > Please don't send cover letters for single patches, if there is anything > that needs saying put it in the changelog of the patch or after the --- > if it's administrative stuff. This reduces mail volume and ensures that > any important information is recorded in the changelog rather than being > lost. > Understood, sorry about that. Should I do a "clean" re-send for this one? Regards, Arnaud
Re: [PATCH v3 0/1] ASoC: fsl_asrc: always select different clocks
On Fri, Jul 17, 2020 at 12:38:56PM +0200, Arnaud Ferraris wrote: > This patch fixes the automatic clock selection so it always selects > distinct input and output clocks. Please don't send new patches in reply to old ones, it buries things and makes it hard to keep track of what the current version of a series looks like. Just send new versions as a completely new thread. Please don't send cover letters for single patches, if there is anything that needs saying put it in the changelog of the patch or after the --- if it's administrative stuff. This reduces mail volume and ensures that any important information is recorded in the changelog rather than being lost.
[PATCH v3 0/1] ASoC: fsl_asrc: always select different clocks
This patch fixes the automatic clock selection so it always selects distinct input and output clocks. v2 -> v3: - Update code comment, fix formatting and add more detailed explanations in commit message v1 -> v2: - compare clock indexes (and not the location in the clock table) to make sure input and output clocks are different Arnaud Ferraris(1): ASoC: fsl_asrc: make sure the input and output clocks are different sound/soc/fsl/fsl_asrc.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-)
[PATCH v3 0/1] power: Emit change uevent when updating sysfs
Hi linux-pm,

ChromeOS has a udev rule to chown the `power/wakeup` attribute so that
the power manager can modify it during runtime.

(https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/platform2/power_manager/udev/99-powerd-permissions.rules)

In our automated tests, we found that the `power/wakeup` attributes
weren't being chown-ed for some boards. On investigating, I found that
when the drivers probe and call device_set_wakeup_capable, no uevent was
being emitted for the newly added power/wakeup attribute. This was
manifesting at boot on some boards (Marvell SDIO bluetooth and Broadcom
Serial bluetooth drivers) or during usb disconnects during resume
(Realtek btusb driver with reset resume quirk).

It seems reasonable to me that changes to the attributes of a device
should cause a changed uevent so I have added that here.

Here's an example of the kernel events after toggling the authorized
bit of /sys/bus/usb/devices/1-3/

$ echo 0 > /sys/bus/usb/devices/1-3/authorized
KERNEL[27.357994] remove /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.0/bluetooth/hci0/rfkill1 (rfkill)
KERNEL[27.358049] remove /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.0/bluetooth/hci0 (bluetooth)
KERNEL[27.358458] remove /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.0 (usb)
KERNEL[27.358486] remove /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.1 (usb)
KERNEL[27.358529] change /devices/pci:00/:00:15.0/usb1/1-3 (usb)

$ echo 1 > /sys/bus/usb/devices/1-3/authorized
KERNEL[36.415749] change /devices/pci:00/:00:15.0/usb1/1-3 (usb)
KERNEL[36.415798] add /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.0 (usb)
KERNEL[36.417414] add /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.0/bluetooth/hci0 (bluetooth)
KERNEL[36.417447] add /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.0/bluetooth/hci0/rfkill2 (rfkill)
KERNEL[36.417481] add /devices/pci:00/:00:15.0/usb1/1-3/1-3:1.1 (usb)

Thanks
Abhishek

Changes in v3:
- Simplified error handling

Changes in v2:
- Add newline at end of bt_dev_err

Abhishek Pandit-Subedi (1):
  power: Emit changed uevent on wakeup_sysfs_add/remove

 drivers/base/power/sysfs.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

--
2.27.0.212.ge8ba1cc988-goog
Re: [PATCH v3 0/1] hwmon:max6697: Allow max6581 to create tempX_offset
On 7/6/20 6:18 PM, Chu Lin wrote: > Per max6581, reg 4d and reg 4e is used for temperature read offset. > This patch will let the user specify the temperature read offset for > max6581. This patch is tested on max6581 and only applies to max6581. > Since this is a single patch, you don't need patch 0. Just add the change log after "---" to the actual patch. Thanks, Guenter > Testing: > echo 16250 > temp2_offset > cat temp2_offset > 16250 > > echo 17500 > temp3_offset > cat temp3_offset > 17500 > cat temp4_offset > 0 > cat temp2_offset > 17500 > > echo 0 > temp2_offset > cat temp2_offset > 0 > cat temp3_offset > 17500 > > echo -0 > temp2_offset > cat temp2_offset > 0 > > echo -10 > temp2_offset > cat temp2_input > 4875 > > echo 1 > temp2_offset > cat temp2_input > 47125 > > echo -2000 > temp2_offset > cat temp2_input > 34875 > > echo -0 > temp2_offset > cat temp2_input > 37000 > > Signed-off-by: Chu Lin > --- > ChangeLog v2 -> v3: > - Use reverse christmas tree order convension > - Fix the type issue where comparision is always true > - Change the line limit to 100 char instead of 80 char > > ChangeLog v1 -> v2: > - Simplify the offset reg raw value to milli ceisus conversion > - Substitute the temp1_offset with dummy attr > - Avoid using double negative in the macro definition > - Return the actual error when i2c read/write is failed > - clamp the value to MAX or MIN respectively if an out of range input is > given > - Provide mux protection when multiple i2c accesses is required > > Chu Lin (1): > hwmon:max6697: Allow max6581 to create tempX_offset attributes > > drivers/hwmon/max6697.c | 92 +++-- > 1 file changed, 88 insertions(+), 4 deletions(-) >
[PATCH v3 0/1] hwmon:max6697: Allow max6581 to create tempX_offset
Per max6581, reg 4d and reg 4e are used for temperature read offset.
This patch will let the user specify the temperature read offset for
max6581. This patch is tested on max6581 and only applies to max6581.

Testing:
echo 16250 > temp2_offset
cat temp2_offset
16250

echo 17500 > temp3_offset
cat temp3_offset
17500
cat temp4_offset
0
cat temp2_offset
17500

echo 0 > temp2_offset
cat temp2_offset
0
cat temp3_offset
17500

echo -0 > temp2_offset
cat temp2_offset
0

echo -10 > temp2_offset
cat temp2_input
4875

echo 1 > temp2_offset
cat temp2_input
47125

echo -2000 > temp2_offset
cat temp2_input
34875

echo -0 > temp2_offset
cat temp2_input
37000

Signed-off-by: Chu Lin
---
ChangeLog v2 -> v3:
- Use reverse christmas tree order convention
- Fix the type issue where comparison is always true
- Change the line limit to 100 char instead of 80 char

ChangeLog v1 -> v2:
- Simplify the offset reg raw value to milli Celsius conversion
- Substitute the temp1_offset with dummy attr
- Avoid using double negative in the macro definition
- Return the actual error when i2c read/write fails
- Clamp the value to MAX or MIN respectively if an out of range input is
  given
- Provide mux protection when multiple i2c accesses are required

Chu Lin (1):
  hwmon:max6697: Allow max6581 to create tempX_offset attributes

 drivers/hwmon/max6697.c | 92 +++--
 1 file changed, 88 insertions(+), 4 deletions(-)

--
2.27.0.383.g050319c2ae-goog
[RFC PATCH v3 0/1] Add rwsem "contended hook" API and mmap_lock histograms
The overall goal of this patch is to add tracepoints around mmap_lock
acquisition. This will let us collect latency histograms, so we can see
how long we block for in the contended case. Our goal is to collect this
data across all of production at Google, so low overhead is critical.

I'm sending this RFC for feedback on the changes to rwsem.{h,c} and
lockdep.h in particular.

I'll describe reasoning for the down_write case, for brevity. We want to
measure the time lock acquisition takes. Naively, this is:

	u64 start = sched_clock();
	down_write(/* ... */);
	trace(sched_clock() - start);

My measurements show that this adds ~5-6% overhead to building a kernel
on a test machine [1]. This level of overhead is unacceptably high.

My measurements show that only instrumenting the contended case lowers
overhead to < 1%. Naively, we can instrument only the contended case like
this:

	if (!down_write_trylock(/* ... */))
		/* Time and call down_write as before. */

However, in the case where `_trylock` succeeds, we have lost the lockdep
annotations (e.g. around ordering) `down_write` would normally include.
(Granted, we don't run with lockdep in production, but debug builds do.)

Assuming we need lower overhead, we aren't okay with losing lock
annotations, and we reject various alternatives to this patch:

- Making rwsem.c's __down_write and __down_write_trylock public, so
  mmap_lock.c could construct its own version of LOCK_CONTENDED with
  tracepoint calls.
- Having mmap_lock.c reach into rwsem.c's internals with "extern" forward
  declarations for these functions (and removing "static inline").
- Somehow adding the instrumentation directly to rwsem.c (either affecting
  all locks, or polluting it some other way).

The remaining alternative, I think, is what this patch proposes: add API
surface to rwsem.h which allows callers to provide instrumentation
callbacks which are invoked in the contended case.
[1]: For measuring the overhead of the instrumentation, I've been timing a
defconfig kernel build. The numbers above come from a KVM instance with
4 CPUs + 32G RAM, running 5.8-rc1 with this patch applied and a histogram
trigger configured for the acquire_returned tracepoint. My test script is
simple:

for (( i=0; i<5; ++i)); do
	make mrproper > /dev/null || exit 1
	make defconfig > /dev/null || exit 1
	sync || exit 1
	echo 3 > /proc/sys/vm/drop_caches || exit 1
	/usr/bin/time make -j5 > /dev/null
done

The numbers I'm giving above are computed as:
(avg of 5 runs with this hist trigger enabled) / (avg on 5.8-rc1).

Axel Rasmussen (1):
  mmap_lock: add tracepoints around mmap_lock acquisition

 include/linux/lockdep.h          | 47 ++
 include/linux/mmap_lock.h        | 27 ++-
 include/linux/rwsem.h            | 12 ++
 include/trace/events/mmap_lock.h | 76 +
 kernel/locking/rwsem.c           | 64 +++
 mm/Kconfig                       | 19 +++
 mm/Makefile                      | 1 +
 mm/mmap_lock.c                   | 281 +++
 8 files changed, 526 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/mmap_lock.h
 create mode 100644 mm/mmap_lock.c

--
2.27.0.111.gc72c7da667-goog
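The "instrument only the contended case" idea from this cover letter can be sketched in userspace C with a POSIX rwlock standing in for the kernel rwsem. This is only an analogue under stated assumptions: write_lock_traced(), contended_hook_t and now_ns() are hypothetical names, not the API the patch adds, and the sketch says nothing about lockdep annotations, which are the reason the patch touches rwsem.{h,c} rather than wrapping the lock like this.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type: receives the time spent blocked, in ns. */
typedef void (*contended_hook_t)(uint64_t wait_ns);

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Take the write lock. The uncontended fast path is a bare trylock with no
 * clock reads; only when it fails do we time the blocking acquisition and
 * report it through the hook. Returns 1 if the contended path was taken,
 * 0 otherwise.
 */
static int write_lock_traced(pthread_rwlock_t *lock, contended_hook_t hook)
{
    uint64_t start;
    int ret;

    if (pthread_rwlock_trywrlock(lock) == 0)
        return 0;

    start = now_ns();
    ret = pthread_rwlock_wrlock(lock);
    assert(ret == 0);
    (void)ret;
    if (hook)
        hook(now_ns() - start);
    return 1;
}
```

The point of the shape is that the common uncontended acquisition pays for one trylock and nothing else, which is consistent with the cover letter's observation that contended-only instrumentation costs far less than timing every acquisition.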
[PATCH v3 0/1] s390: virtio: let arch choose to accept devices without IOMMU feature
An architecture protecting the guest memory against unauthorized host access may want to enforce VIRTIO I/O device protection through the use of VIRTIO_F_IOMMU_PLATFORM. Let's give a chance to the architecture to accept or not devices without VIRTIO_F_IOMMU_PLATFORM. Pierre Morel (1): s390: virtio: let arch accept devices without IOMMU feature arch/s390/mm/init.c | 6 ++ drivers/virtio/virtio.c | 22 ++ include/linux/virtio.h | 2 ++ 3 files changed, 30 insertions(+) -- 2.25.1 Changelog to v3: - add warning (Connie, Christian) - add comment (Connie) - change hook name (Halil, Connie) to v2: - put the test in virtio_finalize_features() (Connie) - put the test inside VIRTIO core (Jason) - pass a virtio device as parameter (Halil)
[PATCH v3 0/1] ARM: Add Rockchip rk3288w support
Hello everyone,

Context
-------

Here is my V3 of my patches that add the support for the Rockchip
RK3288w which is a revision of the RK3288. It is mostly the same SOC
except for, at least, one clock tree which is different.
This difference is only known by looking at the BSP kernel [1].

Currently, the mainline kernel will not hang on rk3288w but it is
probably by "chance" because we got an issue on a lower kernel version.

According to Rockchip's U-Boot [2], the rk3288w can be detected using
the HDMI revision number (= 0x1A) in this version of the SOC.

Changelog
---------

In this V3, the revision's detection is not done in the kernel anymore.
This patch will handle the rk3288w clock tree according to a new
compatible "rockchip,rk3288w-cru" that must be provided by bootloaders.

Changes since v2:
- Remove all codes about revision detection, let's handle that by
  Bootloaders

Best regards,
Mylène Josserand

[1] https://github.com/rockchip-linux/kernel/blob/develop-4.4/drivers/clk/rockchip/clk-rk3288.c#L960..L964
[2] https://github.com/rockchip-linux/u-boot/blob/next-dev/arch/arm/mach-rockchip/rk3288/rk3288.c#L378..L388

Mylène Josserand (1):
  clk: rockchip: rk3288: Handle clock tree for rk3288w

 drivers/clk/rockchip/clk-rk3288.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--
2.26.2
Re: [PATCH v3 0/1] net: ethernet: stmmac: simplify phy modes management for stm32
Hi, Just a "gentleman ping" Regards, Christophe. On 27/04/2020 12:00, Christophe Roullier wrote: > No new feature, just to simplify stm32 part to be easier to use. > Add by default all Ethernet clocks in DT, and activate or not in function > of phy mode, clock frequency, if property "st,ext-phyclk" is set or not. > Keep backward compatibility > > version 3: > Add acked from Alexandre Torgue > Rebased on top of v5.7-rc2 > > Christophe Roullier (1): >net: ethernet: stmmac: simplify phy modes management for stm32 > > .../net/ethernet/stmicro/stmmac/dwmac-stm32.c | 74 +++ > 1 file changed, 44 insertions(+), 30 deletions(-) >
[PATCH v3 0/1] vfio-ccw: Enable transparent CCW IPL from DASD
Remove the explicit prefetch check when using vfio-ccw devices. This check
does not trigger in practice as all Linux channel programs are intended to
use prefetch.

Version 3 improves logging by including the UUID of the vfio device that
triggers the warning. A custom rate limit is used because the generic rate
limit of 10 per 5 seconds will still result in multiple warnings during IPL.
The warning message has been clarified to reflect that a channel program
will be executed using prefetch even though prefetch was not specified.

The text of the warning itself does not explicitly refer to non-prefetching
channel programs as unsupported because it will trigger during IPL, which is
a normal and expected sequence. Likewise, because we expect the message to
appear during IPL, the warning also does not explicitly alert to the
potential of an error; rather, it simply notes that a channel program is
being executed in a way other than specified.

Version 3 also makes some word choice changes to the documentation.

Jared Rossi (1):
  vfio-ccw: Enable transparent CCW IPL from DASD

 Documentation/s390/vfio-ccw.rst | 6 ++
 drivers/s390/cio/vfio_ccw_cp.c  | 19 ---
 2 files changed, 18 insertions(+), 7 deletions(-)

--
2.17.0
[PATCH v3 0/1] dmaengine: avalon: Intel Avalon-MM DMA Interface for PCIe
This series is against v5.4-rc3

I am posting "avalon-dma" update alone and going to post "avalon-test"
update as a follow-up or in the next round.

Changes since v2:
- avalon_dma_register() return value bug fixed;
- device_prep_slave_sg() does not crash dmaengine_prep_slave_single() now;
- kernel configuration options removed in favour of module parameters;
- BUG_ONs, WARN_ONs and dev_dbgs removed;
- goto labels renamed, other style issues addressed;
- polling loop in interrupt handler commented;

Changes since v1:
- "avalon-dma" converted to "dmaengine" model;
- "avalon-drv" renamed to "avalon-test";

The Avalon-MM DMA Interface for PCIe is a design used in hard IPs for
Intel Arria, Cyclone or Stratix FPGAs. It transfers data between on-chip
memory and system memory.

Testing was done using a custom FPGA build with Arria 10 FPGA streaming
data to target device RAM:

  +----------+    +----------+    +----------+        +----------+
  | Nios CPU |<-->|   RAM    |<-->|  Avalon  |<-PCIe->| Host CPU |
  +----------+    +----------+    +----------+        +----------+

The RAM was examined for data integrity by examining RAM contents
from host CPU (indirectly - checking data DMAed to the system) and
from Nios CPU that has direct access to the device RAM.
A companion tool using "avalon-test" driver was used to DMA files to the device: https://github.com/a-gordeev/avalon-tool.git CC: dmaeng...@vger.kernel.org Alexander Gordeev (1): dmaengine: avalon: Intel Avalon-MM DMA Interface for PCIe drivers/dma/Kconfig | 2 + drivers/dma/Makefile | 1 + drivers/dma/avalon/Kconfig | 14 + drivers/dma/avalon/Makefile | 6 + drivers/dma/avalon/avalon-core.c | 476 +++ drivers/dma/avalon/avalon-core.h | 92 ++ drivers/dma/avalon/avalon-hw.c | 186 drivers/dma/avalon/avalon-hw.h | 85 ++ drivers/dma/avalon/avalon-pci.c | 144 ++ 9 files changed, 1006 insertions(+) create mode 100644 drivers/dma/avalon/Kconfig create mode 100644 drivers/dma/avalon/Makefile create mode 100644 drivers/dma/avalon/avalon-core.c create mode 100644 drivers/dma/avalon/avalon-core.h create mode 100644 drivers/dma/avalon/avalon-hw.c create mode 100644 drivers/dma/avalon/avalon-hw.h create mode 100644 drivers/dma/avalon/avalon-pci.c Because the amount of changes since the previous version is quite big, I am also posting the interdiff. diff -u b/drivers/dma/avalon/Kconfig b/drivers/dma/avalon/Kconfig --- b/drivers/dma/avalon/Kconfig +++ b/drivers/dma/avalon/Kconfig @@ -15,74 +14,0 @@ - -if AVALON_DMA - -config AVALON_DMA_MASK_WIDTH - int "Avalon DMA streaming and coherent bitmask width" - range 0 64 - default 64 - help - Width of bitmask for streaming and coherent DMA operations - -config AVALON_DMA_CTRL_BASE - hex "Avalon DMA controllers base" - default "0x" - -config AVALON_DMA_RD_EP_DST_LO - hex "Avalon DMA read controller base low" - default "0x8000" - help - Specifies the lower 32-bits of the base address of the read - status and descriptor table in the Root Complex memory. - -config AVALON_DMA_RD_EP_DST_HI - hex "Avalon DMA read controller base high" - default "0x" - help - Specifies the upper 32-bits of the base address of the read - status and descriptor table in the Root Complex memory. 
- -config AVALON_DMA_WR_EP_DST_LO - hex "Avalon DMA write controller base low" - default "0x80002000" - help - Specifies the lower 32-bits of the base address of the write - status and descriptor table in the Root Complex memory. - -config AVALON_DMA_WR_EP_DST_HI - hex "Avalon DMA write controller base high" - default "0x" - help - Specifies the upper 32-bits of the base address of the write - status and descriptor table in the Root Complex memory. - -config AVALON_DMA_PCI_VENDOR_ID - hex "PCI vendor ID" - default "0x1172" - -config AVALON_DMA_PCI_DEVICE_ID - hex "PCI device ID" - default "0xe003" - -config AVALON_DMA_PCI_BAR - int "PCI device BAR the Avalon DMA controller is mapped to" - range 0 5 - default 0 - help - Number of PCI BAR the DMA controller is mapped to - -config AVALON_DMA_PCI_MSI_COUNT_ORDER - int "Count of MSIs the PCI device provides (order)" - range 0 5 - default 5 - help - Number of vectors the PCI device uses in multiple MSI mode. - This number is provided as the power of two. - -config AVALON_DMA_PCI_MSI_VECTOR - int "Vector number the DMA controller is mapped to" - range 0 31 - default 0 - help - Number of MSI vector the DMA controller is mapped to in - multiple MSI mode. - -endif diff -u b/drivers/dma/avalon/avalon-core.c b/drivers/dma/avalon/avalon-core.c ---
[PATCH v3 0/1] x86/init: Add option to skip using RTC
Hi,

We have a new Atom Airmont core based product which does not support RTC as
a persistent clock source. Presently, the platform ops get/set wallclock
always use the MC146818 RTC/CMOS device to read & set time. This causes boot
failure on our SOC with no RTC. More specifically, it hangs in the RTC
driver's mach_get_cmos_time() when it polls the RTC_FREQ_SELECT register and
loops until the Update-In-Progress (UIP) flag gets cleared, i.e. the code
snippet below.

	while ((CMOS_READ(RTC_FREQ_SELECT) & RTC_UIP))
		cpu_relax();

After a few rounds of review cycles/feedback, we concluded that we should
control it from the Motorola MC146818 compatible RTC devicetree node. Please
see [1].

Make RTC read/write optional by detecting platforms which do not support an
RTC/CMOS device through the corresponding DT node status property. If status
says disabled, then noop the get/set wallclock ops. For non DT enabled
platforms or for DT enabled platforms which do not define the optional
status property, proceed same as before.

Patch is baselined upon Linux 5.4-rc2 at below Git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/core

[1] Documentation/devicetree/bindings/rtc/rtc-cmos.txt

v3:
* Rebase to latest 5.4-rc2 kernel.
* Fix a build warning reported by kbuild test robot.

v2:
* As per review feedback, do not hack RTC read/write functions directly.
  Instead, override get/set wallclock ops during setup_arch init sequence.

v1:
* Detect platforms with no RTC in RTC read/write functions and skip RTC
  read/write if not applicable.

Rahul Tanwar (1):
  x86/init: Noop get/set wallclock when platform doesn't support RTC

 arch/x86/kernel/x86_init.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

--
2.11.0
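The decision described in this cover letter (keep the RTC-backed wallclock ops unless the DT node's status says "disabled") can be modelled in plain userspace C as below. This is a sketch, not the patch: struct wallclock_ops, the *_stub and *_noop functions and select_wallclock_ops() are hypothetical names, the status string is passed in directly instead of being read from a devicetree, and zeroing the time and returning -1 from the noop ops are choices made only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the platform's get/set wallclock hooks. */
struct wallclock_ops {
    void (*get_wallclock)(struct timespec *now);
    int (*set_wallclock)(const struct timespec *now);
};

/* Placeholders for the RTC/CMOS-backed implementations. */
static void rtc_get_stub(struct timespec *now)
{
    now->tv_sec = 1;
    now->tv_nsec = 0;
}

static int rtc_set_stub(const struct timespec *now)
{
    (void)now;
    return 0;
}

/* No-op replacements: never touch the (absent) RTC hardware. */
static void get_wallclock_noop(struct timespec *now)
{
    now->tv_sec = 0;
    now->tv_nsec = 0;
}

static int set_wallclock_noop(const struct timespec *now)
{
    (void)now;
    return -1; /* report failure; the exact error code is an assumption */
}

/*
 * dt_status is NULL when there is no DT node or no status property; in that
 * case, and for any value other than "disabled", the existing ops are kept.
 */
static void select_wallclock_ops(struct wallclock_ops *ops, const char *dt_status)
{
    assert(ops != NULL);
    if (dt_status != NULL && strcmp(dt_status, "disabled") == 0) {
        ops->get_wallclock = get_wallclock_noop;
        ops->set_wallclock = set_wallclock_noop;
    }
}
```

Swapping the function pointers once at init time, rather than checking for the RTC inside every read/write, mirrors the v2 review feedback quoted in the changelog: the RTC accessors stay untouched and callers never reach the UIP polling loop on a platform without the device.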
[PATCH v3 0/1] leds: fix /sys/class/leds//trigger
Reading /sys/class/leds//trigger returns all available LED triggers.
However, the size of this file is limited to PAGE_SIZE because of the
limitation for sysfs attributes.

Enabling the LED CPU trigger on systems with thousands of CPUs easily hits
the PAGE_SIZE limit, and makes it impossible to see all available LED
triggers and which trigger is currently activated.

This patch converts /sys/class/leds//trigger to a bin attribute and removes
the PAGE_SIZE limitation.

The first version of this series provided a new api that follows the "one
value per file" rule of sysfs. The second version dropped it because there
were a number of problems and it turned out that the new api should be
submitted separately.

* v3
- Remove "query" parameters from led_trigger_snprintf() and
  led_trigger_format()
- Return -ENOMEM immediately if memory allocation fails
- Drop Acked-by: tag due to a certain amount of changes

* v2
- Update commit message
- Drop patches for new api

Akinobu Mita (1):
  leds: remove PAGE_SIZE limit of /sys/class/leds//trigger

 drivers/leds/led-class.c    | 8 ++--
 drivers/leds/led-triggers.c | 90 ++---
 drivers/leds/leds.h         | 6 +++
 include/linux/leds.h        | 5 ---
 4 files changed, 78 insertions(+), 31 deletions(-)

Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Jacek Anaszewski
Cc: Pavel Machek
Cc: Dan Murphy

--
2.7.4
[PATCH v3 0/1] intel_cht_int33fe: Split code to USB Micro-B and Type-C variants
Patch to support INT33FE ACPI pseudo-device on hardware with USB Micro-B connector. v4: - Micro-B variant: Don't print error to the kernel log if i2c_acpi_new_device() has returned -EPROBE_DEFER. v3: - Rename TypeB variant to Micro-B (we have only one such device for now and it has Micro-B connector) - Rebase on current linus/master - Remove empty lines and replace "TypeC" by "Type-C" v2: Instead of defining two separated modules with two separated config options, compile {common,typeb,typec} sources into one .ko module. Call needed variant-specific probe function based after of hardware type detection in common code. Yauhen Kharuzhy (1): platform/x86/intel_cht_int33fe: Split code to USB Micro-B and Type-C variants drivers/platform/x86/Kconfig | 12 +- drivers/platform/x86/Makefile | 4 + .../platform/x86/intel_cht_int33fe_common.c | 147 ++ .../platform/x86/intel_cht_int33fe_common.h | 41 + .../platform/x86/intel_cht_int33fe_microb.c | 67 ...ht_int33fe.c => intel_cht_int33fe_typec.c} | 78 +- 6 files changed, 276 insertions(+), 73 deletions(-) create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.c create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.h create mode 100644 drivers/platform/x86/intel_cht_int33fe_microb.c rename drivers/platform/x86/{intel_cht_int33fe.c => intel_cht_int33fe_typec.c} (82%) -- 2.23.0.rc1
[PATCH v3 0/1] intel_cht_int33fe: Split code to USB Micro-B and Type-C variants
Patch to support INT33FE ACPI pseudo-device on hardware with USB Micro-B connector. v3: - Rename TypeB variant to Micro-B (we have only one such device for now and it has Micro-B connector) - Rebase on current linus/master - Remove empty lines and replace "TypeC" by "Type-C" v2: Instead of defining two separate modules with two separate config options, compile {common,typeb,typec} sources into one .ko module. Call the needed variant-specific probe function after hardware type detection in common code. Yauhen Kharuzhy (1): platform/x86/intel_cht_int33fe: Split code to USB Micro-B and Type-C variants drivers/platform/x86/Kconfig | 12 +- drivers/platform/x86/Makefile | 4 + .../platform/x86/intel_cht_int33fe_common.c | 147 ++ .../platform/x86/intel_cht_int33fe_common.h | 41 + .../platform/x86/intel_cht_int33fe_microb.c | 63 ...ht_int33fe.c => intel_cht_int33fe_typec.c} | 78 +- 6 files changed, 272 insertions(+), 73 deletions(-) create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.c create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.h create mode 100644 drivers/platform/x86/intel_cht_int33fe_microb.c rename drivers/platform/x86/{intel_cht_int33fe.c => intel_cht_int33fe_typec.c} (82%) -- 2.23.0.rc1
[PATCH v3 0/1] Add option to skip using RTC
Hi, There is a new product which does not support RTC as a persistent clock source. Platform ops get/set wallclock are used to get/set timespec through kernel timekeeping read/update_persistent_clock64() routines. Presently, get/set wallclock ops always use the MC146818A RTC/CMOS device to read & set time. This causes boot failure on our new SoC with no RTC. Make RTC read/write optional by detecting platforms which do not support the RTC/CMOS device through the corresponding DT node status property. If status says disabled, then noop the get/set wallclock ops. For non DT enabled machines or for DT enabled machines which do not define the optional status property, proceed same as before. These patches are baselined upon Linux 5.3-rc6 at below Git tree: git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/core v3: * Fix a build warning reported by kbuild test robot. v2: * As per review feedback, do not hack RTC read/write functions directly. Instead, override get/set wallclock ops during setup_arch init sequence. v1: * Detect platforms with no RTC in RTC read/write functions and skip RTC read/write if not applicable. Rahul Tanwar (1): x86/init: Noop get/set wallclock when platform doesn't support RTC arch/x86/kernel/x86_init.c | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) -- 2.11.0
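Editorial sketch of the v2/v3 approach described above (override the ops once at init instead of patching the RTC read/write paths). This is a simplified userspace model, not the kernel code: struct wallclock_ops, enum rtc_dt_status, wallclock_ops_init() and the fake RTC store are all hypothetical stand-ins, and the noop return value (-EINVAL) and the zeroed timestamp are choices made for this sketch only.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical model of the RTC node's DT "status" property. */
enum rtc_dt_status {
	RTC_DT_STATUS_ABSENT,	/* no DT, or node without a status property */
	RTC_DT_STATUS_OKAY,
	RTC_DT_STATUS_DISABLED,
};

/* Simplified stand-in for the platform get/set wallclock ops. */
struct wallclock_ops {
	void (*get_wallclock)(struct timespec *ts);
	int (*set_wallclock)(const struct timespec *ts);
};

/* Fake RTC backing store so the sketch runs without hardware. */
static struct timespec fake_rtc;

static void rtc_get_wallclock(struct timespec *ts)
{
	*ts = fake_rtc;
}

static int rtc_set_wallclock(const struct timespec *ts)
{
	fake_rtc = *ts;
	return 0;
}

/* Noop variants: never touch the (absent) RTC device. */
static void noop_get_wallclock(struct timespec *ts)
{
	ts->tv_sec = 0;
	ts->tv_nsec = 0;
}

static int noop_set_wallclock(const struct timespec *ts)
{
	(void)ts;
	return -EINVAL;	/* sketch's choice of error code */
}

/*
 * Install the default RTC-backed ops, then override them only when the
 * status property explicitly says "disabled". Absent or "okay" status
 * keeps the previous behaviour, as the cover letter describes.
 */
void wallclock_ops_init(struct wallclock_ops *ops, enum rtc_dt_status status)
{
	ops->get_wallclock = rtc_get_wallclock;
	ops->set_wallclock = rtc_set_wallclock;

	if (status == RTC_DT_STATUS_DISABLED) {
		ops->get_wallclock = noop_get_wallclock;
		ops->set_wallclock = noop_set_wallclock;
	}
}
```

The point of doing the override at init time is that callers keep going through the same ops table and need no per-call "is there an RTC?" check.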
Re: [PATCH v3 0/1] aacraid: Host adapter Adaptec 6405 constantly resets under high io load
> Problem description: > > A node with Adaptec 6405 controller, latest BIOS V5.3-0[19204] A lot > of disks attached to the controller. Simple test: running mkfs.ext4 > on many disks on the same controller in parallel (mkfs is not > important here, any serious io load triggers controller aborts) Microchip folks: Please review! -- Martin K. Petersen Oracle Linux Engineering
[PATCH v3 0/1] aacraid: Host adapter Adaptec 6405 constantly resets under high io load
Problem description: A node with Adaptec 6405 controller, latest BIOS V5.3-0[19204] A lot of disks attached to the controller. Simple test: running mkfs.ext4 on many disks on the same controller in parallel (mkfs is not important here, any serious io load triggers controller aborts) Results: * no problems (controller resets) with kernels prior to 395e5df79a95 ("scsi: aacraid: Remove reference to Series-9") * latest ms kernel v5.2-rc6-15-g249155c20f9b - mkfs processes are in D state, lot of complains in logs like: [ 654.894633] aacraid: Host adapter abort request. aacraid: Outstanding commands on (0,1,43,0): [ 699.441034] aacraid: Host adapter abort request. aacraid: Outstanding commands on (0,1,40,0): [ 699.442950] aacraid: Host adapter reset request. SCSI hang ? [ 714.457428] aacraid: Host adapter reset request. SCSI hang ? ... [ 759.514759] aacraid: Host adapter reset request. SCSI hang ? [ 759.514869] aacraid :03:00.0: outstanding cmd: midlevel-0 [ 759.514870] aacraid :03:00.0: outstanding cmd: lowlevel-0 [ 759.514872] aacraid :03:00.0: outstanding cmd: error handler-498 [ 759.514873] aacraid :03:00.0: outstanding cmd: firmware-471 [ 759.514875] aacraid :03:00.0: outstanding cmd: kernel-60 [ 759.514912] aacraid :03:00.0: Controller reset type is 3 [ 759.515013] aacraid :03:00.0: Issuing IOP reset [ 850.296705] aacraid :03:00.0: IOP reset succeeded Same complains on Ubuntu kernel 4.15.0-50-generic: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1777586 Controller: === 03:00.0 RAID bus controller: Adaptec Series 6 - 6G SAS/PCIe 2 (rev 01) Subsystem: Adaptec Series 6 - ASR-6405 - 4 internal 6G SAS ports Test: = # cat dev.list /dev/sdq1 /dev/sde1 /dev/sds1 /dev/sdb1 /dev/sdk1 /dev/sdaj1 /dev/sdaf1 /dev/sdd1 /dev/sdac1 /dev/sdai1 /dev/sdz1 /dev/sdj1 /dev/sdy1 /dev/sdn1 /dev/sdae1 /dev/sdg1 /dev/sdi1 /dev/sdc1 /dev/sdf1 /dev/sdl1 /dev/sda1 /dev/sdab1 /dev/sdr1 /dev/sdo1 /dev/sdah1 /dev/sdm1 /dev/sdt1 /dev/sdp1 /dev/sdad1 /dev/sdh1 === # cat run_mkfs.sh 
#!/bin/bash
while read i; do
	mkfs.ext4 $i -q -E lazy_itable_init=1 -O uninit_bg -m 0 &
done
=
# cat dev.list | ./run_mkfs.sh
The issue is 100% reproducible. i've bisected to the culprit patch, it's 395e5df79a95 ("scsi: aacraid: Remove reference to Series-9") it changes arc ctrl checks for Series-6 controllers and i've checked that resurrection of original logic in arc ctrl checks eliminates controller hangs/resets. Konstantin Khorenko (1): scsi: aacraid: resurrect correct arc ctrl checks for Series-6 -- v3 changes: * introduced another wrapper to check for devices except for Series 6 controllers upon request from Sagar Biradar (Microchip) * dropped mentions of private bug ids drivers/scsi/aacraid/aacraid.h | 11 +++ drivers/scsi/aacraid/comminit.c | 5 ++--- drivers/scsi/aacraid/linit.c| 2 +- 3 files changed, 14 insertions(+), 4 deletions(-) -- 2.15.1
Re: [PATCH v3 0/1] waitid: process group enhancement
On Wed, Aug 14, 2019 at 11:58:22AM -0400, Rich Felker wrote: > On Wed, Aug 14, 2019 at 05:43:59PM +0200, Christian Brauner wrote: > > Hey everyone, > > > > This patch adds support for waiting on the current process group by > > specifying waitid(P_PGID, 0, ...) as discussed in [1]. The details why > > we need to do this are in the commit message of [PATCH 1/1] so I won't > > repeat them here. > > > > I've picked this up since the thread has gone stale and parts of > > userspace are actually blocked by this. > > > > Note that the patch has been changed to be more closely aligned with the > > P_PIDFD changes to waitid() I have sitting in my for-next branch (cf. [2]). > > This makes the merge conflict a little simpler and picks up on the > > coding style discussions that guided the P_PIDFD patchset. > > > > There was some desire to get this feature in with 5.3 (cf. [3]). > > But given that this is a new feature for waitid() and for the sake of > > avoiding any merge conflicts I would prefer to land this in the 5.4 > > merge window together with the P_PIDFD changes. > > That makes 5.4 (or later, depending on other stuff) the hard minimum > for RV32 ABI. Is that acceptable? I was under the impression (perhaps > mistaken) that 5.3 was going to be next LTS series which is why I'd > like to have the necessary syscalls for a complete working RV32 > userspace in it. If I'm wrong about that please ignore me. :-) 5.3 is not going to be an LTS and we don't do new features after the merge window is closed anyway. :) Christian
Re: [PATCH v3 0/1] waitid: process group enhancement
On Wed, Aug 14, 2019 at 05:43:59PM +0200, Christian Brauner wrote: > Hey everyone, > > This patch adds support for waiting on the current process group by > specifying waitid(P_PGID, 0, ...) as discussed in [1]. The details why > we need to do this are in the commit message of [PATCH 1/1] so I won't > repeat them here. > > I've picked this up since the thread has gone stale and parts of > userspace are actually blocked by this. > > Note that the patch has been changed to be more closely aligned with the > P_PIDFD changes to waitid() I have sitting in my for-next branch (cf. [2]). > This makes the merge conflict a little simpler and picks up on the > coding style discussions that guided the P_PIDFD patchset. > > There was some desire to get this feature in with 5.3 (cf. [3]). > But given that this is a new feature for waitid() and for the sake of > avoiding any merge conflicts I would prefer to land this in the 5.4 > merge window together with the P_PIDFD changes. That makes 5.4 (or later, depending on other stuff) the hard minimum for RV32 ABI. Is that acceptable? I was under the impression (perhaps mistaken) that 5.3 was going to be next LTS series which is why I'd like to have the necessary syscalls for a complete working RV32 userspace in it. If I'm wrong about that please ignore me. :-) Rich
[PATCH v3 0/1] waitid: process group enhancement
Hey everyone, This patch adds support for waiting on the current process group by specifying waitid(P_PGID, 0, ...) as discussed in [1]. The details why we need to do this are in the commit message of [PATCH 1/1] so I won't repeat them here. I've picked this up since the thread has gone stale and parts of userspace are actually blocked by this. Note that the patch has been changed to be more closely aligned with the P_PIDFD changes to waitid() I have sitting in my for-next branch (cf. [2]). This makes the merge conflict a little simpler and picks up on the coding style discussions that guided the P_PIDFD patchset. There was some desire to get this feature in with 5.3 (cf. [3]). But given that this is a new feature for waitid() and for the sake of avoiding any merge conflicts I would prefer to land this in the 5.4 merge window together with the P_PIDFD changes. Thanks! Christian /* v0 */ Link: https://www.sourceware.org/ml/libc-alpha/2019-07/msg00587.html /* v1 */ Link: https://lore.kernel.org/lkml/20190814113822.9505-1-christian.brau...@ubuntu.com/ /* v2 */ Link: https://lore.kernel.org/lkml/20190814130732.23572-1-christian.brau...@ubuntu.com/ /* References */ [1]: https://www.sourceware.org/ml/libc-alpha/2019-07/msg00587.html [2]: https://lore.kernel.org/lkml/2019072729.6516-1-christ...@brauner.io/ [3]: https://www.sourceware.org/ml/libc-alpha/2019-08/msg00304.html Eric W. Biederman (1): waitid: Add support for waiting for the current process group kernel/exit.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) -- 2.22.0
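Editorial sketch of the call this cover letter describes, seen from userspace: waitid(P_PGID, 0, ...) waits for a child in the caller's own process group. It assumes a kernel that carries this change (the letter targets the 5.4 merge window); older kernels are expected to reject id 0 with EINVAL, so the sketch falls back to passing getpgrp() explicitly. The function name reap_child_in_own_pgrp() is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code`, then reap it by waiting on the
 * caller's own process group. Returns the child's exit status, or -1
 * on any error.
 */
int reap_child_in_own_pgrp(int code)
{
	siginfo_t info;
	pid_t pid;
	int ret;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: stays in the parent's process group */

	/* id == 0 with P_PGID means "the caller's current process group". */
	ret = waitid(P_PGID, 0, &info, WEXITED);
	if (ret < 0 && errno == EINVAL) {
		/* Assumed behaviour of kernels without this patch. */
		ret = waitid(P_PGID, getpgrp(), &info, WEXITED);
	}
	if (ret < 0)
		return -1;

	if (info.si_pid != pid || info.si_code != CLD_EXITED)
		return -1;

	return info.si_status;
}
```

The fallback keeps the sketch usable on older kernels, but it reads the group id in a separate step from the wait, so it is not equivalent if the caller's process group can change in between.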
[PATCH v3 0/1] get_user_pages changes
In this 3rd version of the patch series, I have compressed the patches of the previous patch series into one patch. This was suggested by Christoph Hellwig. The suggestion was to remove the pte_lookup functions and use the get_user_pages* functions directly instead of the pte_lookup functions. There is nothing different in this series compared to the previous series; it essentially compresses the 3 patches of the original series into one patch. Bharath Vedartham (1): sgi-gru: Remove *pte_lookup functions drivers/misc/sgi-gru/grufault.c | 114 +--- 1 file changed, 25 insertions(+), 89 deletions(-) -- 2.7.4
[PATCH v3 0/1] builddeb: generate multi-arch friendly linux-libc-dev
Changes in v3: - add Multi-Arch: same to debian/control for linux-libc-dev Changes in v2: - forward $debarch from mkdebian to builddeb - use dpkg-architecture -qDEB_HOST_MULTIARCH instead of $CC -dumpmachine Cedric Hombourger (1): builddeb: generate multi-arch friendly linux-libc-dev package scripts/package/builddeb | 8 scripts/package/mkdebian | 5 +++-- 2 files changed, 11 insertions(+), 2 deletions(-) -- 2.11.0
[PATCH v3 0/1] mm/vmalloc.c: improve readability and rewrite vmap_area
v2 -> v3: * patch 1-4: Abandoned * patch 5: - Eliminate "flags" (suggested by Uladzislau Rezki) - Based on https://lkml.org/lkml/2019/6/6/455 and https://lkml.org/lkml/2019/7/3/661 v1 -> v2: * patch 3: Rename __find_vmap_area to __search_va_in_busy_tree instead of __search_va_from_busy_tree. * patch 5: Add motivation and necessary test data to the commit message. * patch 5: Let va->flags use only some low bits of va_start instead of completely overwriting va_start. The current implementation of struct vmap_area wasted space. After applying this commit, sizeof(struct vmap_area) has been reduced from 11 words to 8 words. Pengfei Li (1): Modify struct vmap_area to reduce its size include/linux/vmalloc.h | 20 +--- mm/vmalloc.c| 24 ++-- 2 files changed, 23 insertions(+), 21 deletions(-) -- 2.21.0
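Editorial sketch of one common way to shrink a structure like this: fields that are never live at the same time share storage in a union, and a separate flags word is dropped when the state can be derived elsewhere. The cover letter does not spell out the exact layout change, so the structs below are hypothetical and simplified, not the kernel's struct vmap_area, and the word counts differ from the 11 and 8 quoted above.

```c
#include <stddef.h>

/* Minimal doubly linked list node, standing in for a kernel list head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical "before" layout: every field has its own storage. */
struct area_separate {
	unsigned long start;
	unsigned long end;
	unsigned long free_subtree_max;	/* meaningful only while free */
	struct list_node list;
	void *owner;			/* meaningful only while busy */
	unsigned long flags;		/* records which state applies */
};

/*
 * Hypothetical "after" layout: the free-only and busy-only fields
 * overlap, and the flags word is gone because the state is assumed to
 * be implied by which container currently holds the area.
 */
struct area_shared {
	unsigned long start;
	unsigned long end;
	struct list_node list;
	union {
		unsigned long free_subtree_max;	/* while free */
		void *owner;			/* while busy */
	};
};

/* Number of machine words saved per object by the shared layout. */
size_t words_saved(void)
{
	return (sizeof(struct area_separate) - sizeof(struct area_shared)) /
	       sizeof(unsigned long);
}
```

The trade-off is that code must never read the union member belonging to the other state, which is why such a change usually goes together with clear rules about when an object moves between the free and busy containers.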
[PATCH v3 0/1] iio: common: cros_ec_sensors: Add protocol v3 support
This patch is part of a split of the following patch: https://lkml.org/lkml/2019/6/18/268 To address Enric's comments from https://lkml.org/lkml/2019/6/25/949 I extracted it from the other series to speed up acceptance, because other patches need it to be upstreamed. Changes since v2: - Use patch 1 from v1 after discussion on ML Changes since v1: - Drop second patch - return ENODEV if version is 0 Fabien Lahoudere (1): iio: common: cros_ec_sensors: determine protocol version .../cros_ec_sensors/cros_ec_sensors_core.c| 36 ++- 1 file changed, 35 insertions(+), 1 deletion(-) -- 2.19.2
[PATCH v3 0/1] Add support for IPMB driver
Thank you for your feedback Wolfram. I have addressed your comments. Concerning your questions: "Why can't we use i2c_smbus_write_block_data()?" i2c_smbus_write_block_data() does not allow me to pass the requester_i2c_addr argument. Instead, it uses the client->addr. The client->addr in this driver is set to the i2c address of the device where this driver is loaded (since we used i2c_slave_register to register this device as a slave). But the address we want to pass to i2c_smbus_write_block_data_local is actually the i2c address of the device on the other end of the I2C bus. This is the case where our device acts as a master and sends the IPMB (equivalent to I2C) response to the requester device (which becomes the I2C slave). "Can't we leave the default or will the compiler complain?" I chose to leave the default because IPMB by definition only allows master write. It doesn't do any reads. So if there is any external device that tries to do a read, this i2c cb function will just go to the default case. "I really don't know enough about IPMB to judge if the design of having one i2c-dev interface and another ipmb-dev interface is a good solution" I am open for discussion. My reasoning was that we need to interact with user space so I used misc strictly to enable read/write. Maybe we could do something similar to the i2c-slave-eeprom.c where the eeprom_data struct uses bin_attributes? Asmaa Mnebhi (1): Add support for IPMB driver drivers/char/ipmi/Kconfig| 8 + drivers/char/ipmi/Makefile | 1 + drivers/char/ipmi/ipmb_dev_int.c | 386 +++ 3 files changed, 395 insertions(+) create mode 100644 drivers/char/ipmi/ipmb_dev_int.c -- 2.1.2
Re: [PATCH v3 0/1] Use HMM for ODP v3
On Thu, Apr 11, 2019 at 12:29:43PM +, Leon Romanovsky wrote: > On Wed, Apr 10, 2019 at 11:41:24AM -0400, jgli...@redhat.com wrote: > > From: Jérôme Glisse > > > > Changes since v1/v2 are about rebase and better comments in the code. > > Previous cover letter slightly updated. > > > > > > This patchset convert RDMA ODP to use HMM underneath this is motivated > > by stronger code sharing for same feature (share virtual memory SVM or > > Share Virtual Address SVA) and also stronger integration with mm code to > > achieve that. It depends on HMM patchset posted for inclusion in 5.2 [2] > > and [3]. > > > > It has been tested with pingpong test with -o and others flags to test > > different size/features associated with ODP. > > > > Moreover they are some features of HMM in the works like peer to peer > > support, fast CPU page table snapshot, fast IOMMU mapping update ... > > It will be easier for RDMA devices with ODP to leverage those if they > > use HMM underneath. > > > > Quick summary of what HMM is: > > HMM is a toolbox for device driver to implement software support for > > Share Virtual Memory (SVM). Not only it provides helpers to mirror a > > process address space on a device (hmm_mirror). It also provides > > helper to allow to use device memory to back regular valid virtual > > address of a process (any valid mmap that is not an mmap of a device > > or a DAX mapping). They are two kinds of device memory. Private memory > > that is not accessible to CPU because it does not have all the expected > > properties (this is for all PCIE devices) or public memory which can > > also be access by CPU without restriction (with OpenCAPI or CCIX or > > similar cache-coherent and atomic inter-connect). > > > > Device driver can use each of HMM tools separatly. You do not have to > > use all the tools it provides. > > > > For RDMA device i do not expect a need to use the device memory support > > of HMM. This device memory support is geared toward accelerator like GPU. 
> > > > > > You can find a branch [1] with all the prerequisite in. This patch is on > > top of rdma-next with the HMM patchset [2] and mmu notifier patchset [3] > > applied on top of it. > > > > [1] https://cgit.freedesktop.org/~glisse/linux/log/?h=rdma-5.2 > > Hi Jerome, > > I took this branch and merged with our latest rdma-next, but it doesn't > compile. > > In file included from drivers/infiniband/hw/mlx5/mem.c:35: > ./include/rdma/ib_umem_odp.h:110:20: error: field _mirror_ has > incomplete type > struct hmm_mirror mirror; > ^~ > ./include/rdma/ib_umem_odp.h:132:18: warning: _struct hmm_range_ declared > inside parameter list will not be visible outside of this definition or > declaration > struct hmm_range *range); > ^ > make[4]: *** [scripts/Makefile.build:276: drivers/infiniband/hw/mlx5/mem.o] > Error 1 > > The reason to it that in my .config, ZONE_DEVICE, MEMORY_HOTPLUG and HMM > options were disabled. Silly me, I forgot to update kconfig, so I pushed a branch with proper kconfig changes in the ODP patch, but it depends on changes to the HMM kconfig so that HMM_MIRROR can be enabled on architectures that do not have everything for HMM_DEVICE. https://cgit.freedesktop.org/~glisse/linux/log/?h=rdma-odp-hmm-v4 I am doing builds of various kconfig variations before posting to make sure it is all good. Cheers, Jérôme
Re: [PATCH v3 0/1] Use HMM for ODP v3
On Wed, Apr 10, 2019 at 11:41:24AM -0400, jgli...@redhat.com wrote: > From: Jérôme Glisse > > Changes since v1/v2 are about rebase and better comments in the code. > Previous cover letter slightly updated. > > > This patchset convert RDMA ODP to use HMM underneath this is motivated > by stronger code sharing for same feature (share virtual memory SVM or > Share Virtual Address SVA) and also stronger integration with mm code to > achieve that. It depends on HMM patchset posted for inclusion in 5.2 [2] > and [3]. > > It has been tested with pingpong test with -o and others flags to test > different size/features associated with ODP. > > Moreover they are some features of HMM in the works like peer to peer > support, fast CPU page table snapshot, fast IOMMU mapping update ... > It will be easier for RDMA devices with ODP to leverage those if they > use HMM underneath. > > Quick summary of what HMM is: > HMM is a toolbox for device driver to implement software support for > Share Virtual Memory (SVM). Not only it provides helpers to mirror a > process address space on a device (hmm_mirror). It also provides > helper to allow to use device memory to back regular valid virtual > address of a process (any valid mmap that is not an mmap of a device > or a DAX mapping). They are two kinds of device memory. Private memory > that is not accessible to CPU because it does not have all the expected > properties (this is for all PCIE devices) or public memory which can > also be access by CPU without restriction (with OpenCAPI or CCIX or > similar cache-coherent and atomic inter-connect). > > Device driver can use each of HMM tools separatly. You do not have to > use all the tools it provides. > > For RDMA device i do not expect a need to use the device memory support > of HMM. This device memory support is geared toward accelerator like GPU. > > > You can find a branch [1] with all the prerequisite in. 
This patch is on > top of rdma-next with the HMM patchset [2] and mmu notifier patchset [3] > applied on top of it. > > [1] https://cgit.freedesktop.org/~glisse/linux/log/?h=rdma-5.2 Hi Jerome, I took this branch and merged with our latest rdma-next, but it doesn't compile. In file included from drivers/infiniband/hw/mlx5/mem.c:35: ./include/rdma/ib_umem_odp.h:110:20: error: field _mirror_ has incomplete type struct hmm_mirror mirror; ^~ ./include/rdma/ib_umem_odp.h:132:18: warning: _struct hmm_range_ declared inside parameter list will not be visible outside of this definition or declaration struct hmm_range *range); ^ make[4]: *** [scripts/Makefile.build:276: drivers/infiniband/hw/mlx5/mem.o] Error 1 The reason is that in my .config, the ZONE_DEVICE, MEMORY_HOTPLUG and HMM options were disabled. Thanks
[PATCH v3 0/1] Use HMM for ODP v3
From: Jérôme Glisse Changes since v1/v2 are about rebase and better comments in the code. Previous cover letter slightly updated. This patchset converts RDMA ODP to use HMM underneath. This is motivated by stronger code sharing for the same feature (shared virtual memory, SVM, or shared virtual address, SVA) and also stronger integration with mm code to achieve that. It depends on the HMM patchset posted for inclusion in 5.2 [2] and [3]. It has been tested with the pingpong test with -o and other flags to test different sizes/features associated with ODP. Moreover, there are some features of HMM in the works like peer-to-peer support, fast CPU page table snapshot, fast IOMMU mapping update ... It will be easier for RDMA devices with ODP to leverage those if they use HMM underneath. Quick summary of what HMM is: HMM is a toolbox for device drivers to implement software support for Shared Virtual Memory (SVM). Not only does it provide helpers to mirror a process address space on a device (hmm_mirror), it also provides helpers that allow device memory to back regular valid virtual addresses of a process (any valid mmap that is not an mmap of a device or a DAX mapping). There are two kinds of device memory: private memory that is not accessible to the CPU because it does not have all the expected properties (this is for all PCIE devices), and public memory which can also be accessed by the CPU without restriction (with OpenCAPI or CCIX or a similar cache-coherent and atomic interconnect). A device driver can use each of the HMM tools separately. You do not have to use all the tools it provides. For RDMA devices I do not expect a need to use the device memory support of HMM. This device memory support is geared toward accelerators like GPUs. You can find a branch [1] with all the prerequisites in. This patch is on top of rdma-next with the HMM patchset [2] and mmu notifier patchset [3] applied on top of it. 
[1] https://cgit.freedesktop.org/~glisse/linux/log/?h=rdma-5.2 [2] https://lkml.org/lkml/2019/4/3/1032 [3] https://lkml.org/lkml/2019/3/26/900 Cc: linux-r...@vger.kernel.org Cc: Jason Gunthorpe Cc: Leon Romanovsky Cc: Doug Ledford Cc: Artemy Kovalyov Cc: Moni Shoua Cc: Mike Marciniszyn Cc: Kaike Wan Cc: Dennis Dalessandro Jérôme Glisse (1): RDMA/odp: convert to use HMM for ODP v3 drivers/infiniband/core/umem_odp.c | 486 - drivers/infiniband/hw/mlx5/mem.c | 20 +- drivers/infiniband/hw/mlx5/mr.c| 2 +- drivers/infiniband/hw/mlx5/odp.c | 106 --- include/rdma/ib_umem_odp.h | 48 ++- 5 files changed, 219 insertions(+), 443 deletions(-) -- 2.20.1
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On 3/14/19 2:06 AM, Jan Kara wrote: > On Wed 13-03-19 19:21:37, Christopher Lameter wrote: >> On Wed, 13 Mar 2019, Christoph Hellwig wrote: >> >>> On Wed, Mar 13, 2019 at 09:11:13AM +1100, Dave Chinner wrote: On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote: > IMHO I don't think that the copy_file_range() is going to carry us > through the > next wave of user performance requirements. RDMA, while the first, is > not the > only technology which is looking to have direct access to files. XDP is > another.[1] Sure, all I doing here was demonstrating that people have been trying to get local direct access to file mappings to DMA directly into them for a long time. Direct Io games like these are now largely unnecessary because we now have much better APIs to do zero-copy data transfer between files (which can do hardware offload if it is available!). >>> >>> And that is just the file to file case. There are tons of other >>> users of get_user_pages, including various drivers that do large >>> amounts of I/O like video capture. For them it makes tons of sense >>> to transfer directly to/from a mmap()ed file. >> >> That is very similar to the RDMA case and DAX etc. We need to have a way >> to tell a filesystem that this is going to happen and that things need to >> be setup for this to work properly. > > The way to tell filesystem what's happening is exactly what we are working > on with these patches... > >> But if that has not been done then I think its proper to fail a long term >> pin operation on page cache pages. Meaning the regular filesystems >> maintain control of whats happening with their pages. > > And as I mentioned in my other email, we cannot just fail the pin for > pagecache pages as that would regress existing applications. > > Honza > Christopher L, Are you OK with this approach now? If so, I'd like to collect any additional ACKs people are willing to provide, and ask Andrew to consider this first patch for 5.2, so we can get started. 
thanks, -- John Hubbard NVIDIA
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On 3/14/19 1:25 PM, William Kucharski wrote: >> On Mar 14, 2019, at 7:30 AM, Jan Kara wrote: >> >> Well I have some crash reports couple years old and they are not from QA >> departments. So I'm pretty confident there are real users that use this in >> production... and just reboot their machine in case it crashes. > > Do you know what the use case in those crashes actually was? I'm curious to > know they were actually cases of say DMA from a video capture card or if the > uses posited to date are simply theoretical. It's not merely theoretical. In addition to Jan's bug reports, I've personally investigated a bug that involved an GPU (acting basically as an AI accelerator in this case) that was doing DMA to memory that turned out to be file backed. The backtrace for that is in the commit description. As others have mentioned, this works well enough to lure people into using it, but then fails when you load down a powerful system (and put it under memory pressure). I think that as systems get larger, and more highly threaded, we might see more such failures--maybe even in the Direct IO case someday, although so far that race window is so small that that one truly is still theoretical (or, we just haven't been in communication with anyone who hit it). thanks, -- John Hubbard NVIDIA > It's always good to know who might be doing this and why if for no other > reason than as something to keep in mind when designing future interfaces.
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
> On Mar 14, 2019, at 7:30 AM, Jan Kara wrote: > > Well I have some crash reports couple years old and they are not from QA > departments. So I'm pretty confident there are real users that use this in > production... and just reboot their machine in case it crashes. Do you know what the use case in those crashes actually was? I'm curious to know they were actually cases of say DMA from a video capture card or if the uses posited to date are simply theoretical. It's always good to know who might be doing this and why if for no other reason than as something to keep in mind when designing future interfaces.
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Thu 14-03-19 09:57:18, Jason Gunthorpe wrote: > On Thu, Mar 14, 2019 at 10:03:45AM +0100, Jan Kara wrote: > > On Wed 13-03-19 19:16:51, Christopher Lameter wrote: > > > On Tue, 12 Mar 2019, Jerome Glisse wrote: > > > > > > > > > This has been discuss extensively already. GUP usage is now > > > > > > widespread in > > > > > > multiple drivers, removing that would regress userspace ie break > > > > > > existing > > > > > > application. We all know what the rules for that is. > > > > > > You are still misstating the issue. In RDMA land GUP is widely used for > > > anonyous memory and memory based filesystems. *Not* for real filesystems. > > > > Maybe in your RDMA land. But there are apparently other users which do use > > mmap of a file on normal filesystem (e.g. ext4) as a buffer for DMA > > (Infiniband does not prohibit this if nothing else, video capture devices > > also use very similar pattern of gup-ing pages and using them as video > > buffers). And these users are reporting occasional kernel crashes. That's > > how this whole effort started. Sadly the DMA to file mmap is working good > > enough that people started using it so at this point we cannot just tell: > > Sorry it was a mistake to allow this, just rewrite your applications. > > This is where we are in RDMA too.. People are trying it and the ones > that do enough load testing find their kernel OOPs > > So it is not clear at all if this has graduated to a real use, or just > an experiment. Perhaps there are some system configurations that don't > trigger crashes.. Well I have some crash reports couple years old and they are not from QA departments. So I'm pretty confident there are real users that use this in production... and just reboot their machine in case it crashes. Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Thu, Mar 14, 2019 at 10:03:45AM +0100, Jan Kara wrote: > On Wed 13-03-19 19:16:51, Christopher Lameter wrote: > > On Tue, 12 Mar 2019, Jerome Glisse wrote: > > > > > > > This has been discuss extensively already. GUP usage is now > > > > > widespread in > > > > > multiple drivers, removing that would regress userspace ie break > > > > > existing > > > > > application. We all know what the rules for that is. > > > > You are still misstating the issue. In RDMA land GUP is widely used for > > anonyous memory and memory based filesystems. *Not* for real filesystems. > > Maybe in your RDMA land. But there are apparently other users which do use > mmap of a file on normal filesystem (e.g. ext4) as a buffer for DMA > (Infiniband does not prohibit this if nothing else, video capture devices > also use very similar pattern of gup-ing pages and using them as video > buffers). And these users are reporting occasional kernel crashes. That's > how this whole effort started. Sadly the DMA to file mmap is working good > enough that people started using it so at this point we cannot just tell: > Sorry it was a mistake to allow this, just rewrite your applications. This is where we are in RDMA too.. People are trying it and the ones that do enough load testing find their kernel OOPs So it is not clear at all if this has graduated to a real use, or just an experiment. Perhaps there are some system configurations that don't trigger crashes.. Jason
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Wed 13-03-19 19:21:37, Christopher Lameter wrote: > On Wed, 13 Mar 2019, Christoph Hellwig wrote: > > > On Wed, Mar 13, 2019 at 09:11:13AM +1100, Dave Chinner wrote: > > > On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote: > > > > IMHO I don't think that the copy_file_range() is going to carry us > > > > through the > > > > next wave of user performance requirements. RDMA, while the first, is > > > > not the > > > > only technology which is looking to have direct access to files. XDP is > > > > another.[1] > > > > > > Sure, all I doing here was demonstrating that people have been > > > trying to get local direct access to file mappings to DMA directly > > > into them for a long time. Direct Io games like these are now > > > largely unnecessary because we now have much better APIs to do > > > zero-copy data transfer between files (which can do hardware offload > > > if it is available!). > > > > And that is just the file to file case. There are tons of other > > users of get_user_pages, including various drivers that do large > > amounts of I/O like video capture. For them it makes tons of sense > > to transfer directly to/from a mmap()ed file. > > That is very similar to the RDMA case and DAX etc. We need to have a way > to tell a filesystem that this is going to happen and that things need to > be setup for this to work properly. The way to tell filesystem what's happening is exactly what we are working on with these patches... > But if that has not been done then I think its proper to fail a long term > pin operation on page cache pages. Meaning the regular filesystems > maintain control of whats happening with their pages. And as I mentioned in my other email, we cannot just fail the pin for pagecache pages as that would regress existing applications. Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Wed 13-03-19 19:16:51, Christopher Lameter wrote:
> On Tue, 12 Mar 2019, Jerome Glisse wrote:
>
> > > > This has been discussed extensively already. GUP usage is now
> > > > widespread in multiple drivers, removing that would regress
> > > > userspace, i.e. break existing applications. We all know what
> > > > the rules for that are.
>
> You are still misstating the issue. In RDMA land GUP is widely used for
> anonymous memory and memory based filesystems. *Not* for real filesystems.

Maybe in your RDMA land. But there are apparently other users which do use
mmap of a file on normal filesystem (e.g. ext4) as a buffer for DMA
(Infiniband does not prohibit this if nothing else, video capture devices
also use very similar pattern of gup-ing pages and using them as video
buffers). And these users are reporting occasional kernel crashes. That's
how this whole effort started. Sadly the DMA to file mmap is working good
enough that people started using it so at this point we cannot just tell:
Sorry it was a mistake to allow this, just rewrite your applications.

Plus we have O_DIRECT io which can use file mmap as a buffer and as Dave
Chinner mentioned there are real applications using this.

So no, we are not going to get away with "just forbid GUP for file backed
pages" which seems to be what you suggest. We might get away with that for
*some* GUP users and you are welcome to do that in the drivers you care
about but definitely not for all.

Honza
-- 
Jan Kara
SUSE Labs, CR
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Wed, Mar 13, 2019 at 07:16:51PM +, Christopher Lameter wrote:
> On Tue, 12 Mar 2019, Jerome Glisse wrote:
>
> > > > This has been discussed extensively already. GUP usage is now
> > > > widespread in multiple drivers, removing that would regress
> > > > userspace, i.e. break existing applications. We all know what
> > > > the rules for that are.
>
> You are still misstating the issue. In RDMA land GUP is widely used for
> anonymous memory and memory based filesystems. *Not* for real filesystems.

Then why are there bug reports like the one pointed out in the cover
letter? It means someone is doing GUP on a filesystem. Moreover, looking
at the RDMA driver I do not see anything that checks that the VA for GUP
belongs to a vma that is not backed by a regular file.

> > > Because someone was able to get away with weird ways of abusing the
> > > system is not an argument that we should continue to allow such
> > > things. In fact we have repeatedly ensured that the kernel works
> > > reliably by improving the kernel so that a proper failure is occurring.
> >
> > Driver doing GUP on mmap of regular file is something that seems to
> > already have widespread users (in the RDMA devices at least). So they
> > are active users and they were never told that what they are doing
> > was illegal.
>
> Not true. Again please differentiate the use cases between regular
> filesystem and anonymous mappings.

Again, where does the bug come from? Where in RDMA is the check that the
VA belongs to a vma that is not backed by a file?

> > > Well swapout cannot occur if the page is pinned and those pages are
> > > also often mlocked.
> >
> > I would need to check the swapout code but I believe the write to disk
> > can happen before the pin check happens. I believe the event flow is:
> > map read only, allocate swap, write to disk, try to free page which
> > checks for pin. So you could write stale data to disk, with the GUP
> > going away before you perform the pin checks.
>
> Allocate swap is a separate step that associates a swap entry to an
> anonymous page.
>
> > There are other things to take into account that need proper page
> > dirtying, like soft dirtiness for instance.
>
> RDMA mapped pages are all dirty all the time.

The point is that the pte dirty bit might not be accurate, nor the soft
dirty bit, because the GUP user does not update those bits. The GUP user
therefore needs to call set_page_dirty() or similar to properly report
page dirtiness.

> > Well RDMA driver maintainer seems to report that this has been a valid
> > and working workload for their users.
>
> No they don't.
>
> Could you please get up to date on the discussion before posting?

Again, why is there a bug report? Where is the code in RDMA that checks
that the VA does not belong to a vma that is backed by a file? As much as
I would like this use case not to exist, I fear it does and it has been
upstream for a while. This also very much applies to O_DIRECT whether you
like it or not.

Cheers,
Jérôme
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Wed, 13 Mar 2019, Christoph Hellwig wrote:

> On Wed, Mar 13, 2019 at 09:11:13AM +1100, Dave Chinner wrote:
> > On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote:
> > > IMHO I don't think that the copy_file_range() is going to carry us
> > > through the next wave of user performance requirements. RDMA, while
> > > the first, is not the only technology which is looking to have
> > > direct access to files. XDP is another.[1]
> >
> > Sure, all I was doing here was demonstrating that people have been
> > trying to get local direct access to file mappings to DMA directly
> > into them for a long time. Direct IO games like these are now
> > largely unnecessary because we now have much better APIs to do
> > zero-copy data transfer between files (which can do hardware offload
> > if it is available!).
>
> And that is just the file to file case. There are tons of other
> users of get_user_pages, including various drivers that do large
> amounts of I/O like video capture. For them it makes tons of sense
> to transfer directly to/from a mmap()ed file.

That is very similar to the RDMA case and DAX etc. We need to have a way
to tell a filesystem that this is going to happen and that things need to
be set up for this to work properly.

But if that has not been done then I think it's proper to fail a long term
pin operation on page cache pages. Meaning the regular filesystems
maintain control of what's happening with their pages.
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Tue, 12 Mar 2019, Jerome Glisse wrote:

> > > This has been discussed extensively already. GUP usage is now
> > > widespread in multiple drivers, removing that would regress
> > > userspace, i.e. break existing applications. We all know what
> > > the rules for that are.

You are still misstating the issue. In RDMA land GUP is widely used for
anonymous memory and memory based filesystems. *Not* for real filesystems.

> > Because someone was able to get away with weird ways of abusing the
> > system is not an argument that we should continue to allow such
> > things. In fact we have repeatedly ensured that the kernel works
> > reliably by improving the kernel so that a proper failure is occurring.
>
> Driver doing GUP on mmap of regular file is something that seems to
> already have widespread users (in the RDMA devices at least). So they
> are active users and they were never told that what they are doing
> was illegal.

Not true. Again please differentiate the use cases between regular
filesystem and anonymous mappings.

> > Well swapout cannot occur if the page is pinned and those pages are
> > also often mlocked.
>
> I would need to check the swapout code but I believe the write to disk
> can happen before the pin check happens. I believe the event flow is:
> map read only, allocate swap, write to disk, try to free page which
> checks for pin. So you could write stale data to disk, with the GUP
> going away before you perform the pin checks.

Allocate swap is a separate step that associates a swap entry to an
anonymous page.

> There are other things to take into account that need proper page
> dirtying, like soft dirtiness for instance.

RDMA mapped pages are all dirty all the time.

> Well RDMA driver maintainer seems to report that this has been a valid
> and working workload for their users.

No they don't.

Could you please get up to date on the discussion before posting?
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Wed, Mar 13, 2019 at 09:11:13AM +1100, Dave Chinner wrote:
> On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote:
> > IMHO I don't think that the copy_file_range() is going to carry us
> > through the next wave of user performance requirements. RDMA, while
> > the first, is not the only technology which is looking to have
> > direct access to files. XDP is another.[1]
>
> Sure, all I was doing here was demonstrating that people have been
> trying to get local direct access to file mappings to DMA directly
> into them for a long time. Direct IO games like these are now
> largely unnecessary because we now have much better APIs to do
> zero-copy data transfer between files (which can do hardware offload
> if it is available!).

And that is just the file to file case. There are tons of other
users of get_user_pages, including various drivers that do large
amounts of I/O like video capture. For them it makes tons of sense
to transfer directly to/from a mmap()ed file.
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Wed, Mar 13, 2019 at 09:11:13AM +1100, Dave Chinner wrote:
> On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote:
> > IMHO I don't think that the copy_file_range() is going to carry us
> > through the next wave of user performance requirements. RDMA, while
> > the first, is not the only technology which is looking to have
> > direct access to files. XDP is another.[1]
>
> Sure, all I was doing here was demonstrating that people have been
> trying to get local direct access to file mappings to DMA directly
> into them for a long time. Direct IO games like these are now
> largely unnecessary because we now have much better APIs to do
> zero-copy data transfer between files (which can do hardware offload
> if it is available!).
>
> It's the long term pins that RDMA does that are the problem here.
> I'm assuming that for XDP, you're talking about userspace zero copy
> from files to the network hardware and vice versa? Transmit is
> simple (read-only mapping), but receive probably requires bpf
> programs to ensure that data (minus headers) in the incoming packet
> stream is correctly placed into the UMEM region?

Yes, exactly.

> XDP receive seems pretty much like the same problem as RDMA writes
> into the file. i.e. the incoming write DMAs are going to have to
> trigger page faults if the UMEM is a long term pin so the filesystem
> behaves correctly with this remote data placement. I'd suggest that
> RDMA, XDP and any other hardware that is going to pin
> file-backed mappings for the long term need to use the same "inform
> the fs of a write operation into its mapping" mechanisms...

Yes, agreed. I have a hack patch I'm testing right now which allows the
user to take a LAYOUT lease from user space, and GUP triggers on that,
either allowing or rejecting the pin based on the lease. I think this is
the first step of what Jan suggested.[1]

There is a lot more detail to work out with what happens if that lease
needs to be broken.

> And if we start talking about wanting to do peer-to-peer DMA from
> network/GPU device to storage device without going through a
> file-backed CPU mapping, we still need to have the filesystem
> involved to translate file offsets to storage locations the
> filesystem has allocated for the data and to lock them down for as
> long as the peer-to-peer DMA offload is in place. In effect, this
> is the same problem as RDMA+FS-DAX - the filesystem owns the file
> offset to storage location mapping and manages storage access
> arbitration, not the mm/vma mapping presented to userspace

I've only daydreamed about peer-to-peer transfers. But yes, I think this
is the direction we need to go. But the details of doing a

GPU -> RDMA -> {network} -> RDMA -> FS DAX

and back again... without CPU/OS involvement are only a twinkle in my
eye... If that.

Ira

[1] https://lore.kernel.org/lkml/20190212160707.ga19...@quack2.suse.cz/
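The lease-gated pin described in this message can be pictured with a small toy model. It is only a sketch, written in Python for brevity: it is not Ira's patch and not a kernel interface, and `FileMapping`, `take_layout_lease` and `long_term_pin` are hypothetical names made up for the illustration. It deliberately leaves out what happens when a lease has to be broken, since the message says that part is still to be worked out.

```python
class FileMapping:
    """Toy model of a file-backed mapping; every name here is hypothetical."""

    def __init__(self, name):
        self.name = name
        self.lease_holders = set()   # owners that announced their intent to pin
        self.long_term_pins = []     # (owner, page_index) pairs currently pinned

    def take_layout_lease(self, owner):
        """Userspace tells the filesystem that it is going to pin this file."""
        self.lease_holders.add(owner)

    def long_term_pin(self, owner, page_index):
        """Allow a long-term pin only for an owner that holds a lease."""
        if owner not in self.lease_holders:
            raise PermissionError(f"{owner}: no layout lease on {self.name}")
        pin = (owner, page_index)
        self.long_term_pins.append(pin)
        return pin


mapping = FileMapping("data.bin")
mapping.take_layout_lease("rdma-app")
mapping.long_term_pin("rdma-app", 0)       # allowed: lease is held
try:
    mapping.long_term_pin("other-app", 0)  # rejected: no lease was taken
except PermissionError as exc:
    print(exc)
```

The point of the model is only that the filesystem learns about the pin before it exists, so it is the party that decides whether the pin is allowed.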
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Tue, Mar 12, 2019 at 03:39:33AM -0700, Ira Weiny wrote:
> IMHO I don't think that the copy_file_range() is going to carry us
> through the next wave of user performance requirements. RDMA, while
> the first, is not the only technology which is looking to have
> direct access to files. XDP is another.[1]

Sure, all I was doing here was demonstrating that people have been
trying to get local direct access to file mappings to DMA directly
into them for a long time. Direct IO games like these are now
largely unnecessary because we now have much better APIs to do
zero-copy data transfer between files (which can do hardware offload
if it is available!).

It's the long term pins that RDMA does that are the problem here.
I'm assuming that for XDP, you're talking about userspace zero copy
from files to the network hardware and vice versa? Transmit is
simple (read-only mapping), but receive probably requires bpf
programs to ensure that data (minus headers) in the incoming packet
stream is correctly placed into the UMEM region?

XDP receive seems pretty much like the same problem as RDMA writes
into the file. i.e. the incoming write DMAs are going to have to
trigger page faults if the UMEM is a long term pin so the filesystem
behaves correctly with this remote data placement. I'd suggest that
RDMA, XDP and any other hardware that is going to pin
file-backed mappings for the long term need to use the same "inform
the fs of a write operation into its mapping" mechanisms...

And if we start talking about wanting to do peer-to-peer DMA from
network/GPU device to storage device without going through a
file-backed CPU mapping, we still need to have the filesystem
involved to translate file offsets to storage locations the
filesystem has allocated for the data and to lock them down for as
long as the peer-to-peer DMA offload is in place. In effect, this
is the same problem as RDMA+FS-DAX - the filesystem owns the file
offset to storage location mapping and manages storage access
arbitration, not the mm/vma mapping presented to userspace

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Tue, Mar 12, 2019 at 05:23:21AM +, Christopher Lameter wrote:
> On Mon, 11 Mar 2019, Dave Chinner wrote:
>
> > > Direct IO on a mmapped file backed page doesn't make any sense.
> >
> > People have used it for many, many years as a zero-copy data movement
> > pattern. i.e. mmap the destination file, use direct IO to DMA direct
> > into the destination file page cache pages, fdatasync() to force
> > writeback of the destination file.
>
> Well we could make that more safe through a special API that designates a
> range of pages in a file in the same way as for RDMA. This is inherently
> not reliable as we found out.

I'm not following. What API was not reliable? In [2] we had ideas on such
an API but AFAIK these have not been tried.

From what I have seen the above is racy and is prone to the issues John
has seen. The difference is that Direct IO has a smaller window than RDMA.
(Or at least I thought we already established that?)

	"And also remember that while RDMA might be the case at least some
	people care about here it really isn't different from any of the
	other gup + I/O cases, including doing direct I/O to a mmap area.
	The only difference in the various cases is how long the area
	should be pinned down..."

	-- Christoph Hellwig : https://lkml.org/lkml/2018/10/1/591

> > Now we have copy_file_range() to optimise this sort of data
> > movement, the need for games with mmap+direct IO largely goes away.
> > However, we still can't just remove that functionality as it will
> > break lots of random userspace stuff...
>
> It is already broken and unreliable. Are there really "lots" of these
> things around? Can we test this by adding a warning in the kernel and see
> where it actually crops up?

IMHO I don't think that the copy_file_range() is going to carry us through
the next wave of user performance requirements. RDMA, while the first, is
not the only technology which is looking to have direct access to files.
XDP is another.[1]

Ira

[1] https://www.kernel.org/doc/html/v4.19-rc1/networking/af_xdp.html
[2] https://lore.kernel.org/lkml/20190205175059.gb21...@iweiny-desk2.sc.intel.com/
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Tue, Mar 12, 2019 at 11:35:29AM -0400, Jerome Glisse wrote:
> > > > Yes you now have the filesystem as well as the GUP pinner claiming
> > > > authority over the contents of a single memory segment. Maybe
> > > > better not allow that?
> > >
> > > This goes back to regressing existing drivers with existing users.
> >
> > There is no regression if that behavior never really worked.
>
> Well RDMA driver maintainer seems to report that this has been a valid
> and working workload for their users.

I think it is more O_DIRECT that is the history here..

In RDMA land long term GUPs of file backed pages tend to crash the kernel
(what John is trying to fix here) so I'm not sure there are actual real &
tested users, only people that wish they could do this..

Jason
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Tue, Mar 12, 2019 at 04:52:07AM +, Christopher Lameter wrote:
> On Fri, 8 Mar 2019, Jerome Glisse wrote:
>
> > > It would be good if that understanding would be enforced somehow
> > > given the problems that we see.
> >
> > This has been discussed extensively already. GUP usage is now
> > widespread in multiple drivers, removing that would regress
> > userspace, i.e. break existing applications. We all know what
> > the rules for that are.
>
> The applications that work are using anonymous memory and memory
> filesystems. I have never seen use cases with a real filesystem and would
> have objected if someone tried something crazy like that.
>
> Because someone was able to get away with weird ways of abusing the
> system is not an argument that we should continue to allow such
> things. In fact we have repeatedly ensured that the kernel works
> reliably by improving the kernel so that a proper failure is occurring.

Driver doing GUP on mmap of regular file is something that seems to
already have widespread users (in the RDMA devices at least). So they
are active users and they were never told that what they are doing
was illegal.

Note that I am personally fine with breaking device drivers that cannot
abide by mmu notifier, but the consensus seems to be that it is not fine
to do so.

> > > > In fact, the GUP documentation even recommends that pattern.
> > >
> > > Isn't that pattern safe for anonymous memory and memory filesystems
> > > like hugetlbfs etc? Which is the common use case.
> >
> > Still an issue with respect to swapout, i.e. if an anon/shmem page was
> > mapped read only in preparation for swapout and we do not report the
> > page as dirty, what ends up in swap might lack what was written last
> > through GUP.
>
> Well swapout cannot occur if the page is pinned and those pages are also
> often mlocked.

I would need to check the swapout code but I believe the write to disk
can happen before the pin check happens. I believe the event flow is:
map read only, allocate swap, write to disk, try to free page which
checks for pin. So you could write stale data to disk, with the GUP
going away before you perform the pin checks.

There are other things to take into account that need proper page
dirtying, like soft dirtiness for instance.

> > > Yes you now have the filesystem as well as the GUP pinner claiming
> > > authority over the contents of a single memory segment. Maybe
> > > better not allow that?
> >
> > This goes back to regressing existing drivers with existing users.
>
> There is no regression if that behavior never really worked.

Well RDMA driver maintainer seems to report that this has been a valid
and working workload for their users.

> > > Two filesystems trying to sync one memory segment both believing to
> > > have exclusive access and we want to sort this out. Why? Don't allow
> > > this.
> >
> > This is allowed, it always was, forbidding that case now would regress
> > existing applications and it would also mean that we are modifying the
> > API we expose to userspace. So again this is not something we can block
> > without regressing existing users.
>
> We have always stopped the user from doing obviously stupid and risky
> things. It would be logical to do it here as well.

While I would rather only allow devices that can handle mmu notifier, it
is just not acceptable to regress existing users, and they do seem to
exist and to have had working setups going for a while.

Cheers,
Jérôme
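The ordering concern in this message can be made concrete with a toy model. This is only a sketch of the sequence Jérôme describes (which he says he would still need to check against the swapout code). It is plain Python rather than kernel code, and `Page`, `swap_out_write`, `dma_write` and `try_to_free` are hypothetical names invented for the illustration. It shows why a GUP user that writes through its pin has to report the page dirty: if it does not, the copy written to swap earlier is the one that survives.

```python
class Page:
    """Toy stand-in for an anonymous page; not a kernel structure."""

    def __init__(self, data):
        self.data = data
        self.dirty = False
        self.pins = 0
        self.swap_copy = None  # what has been written to the swap device


def swap_out_write(page):
    """Map read-only, allocate swap, write to disk (no pin check yet)."""
    page.swap_copy = page.data
    page.dirty = False


def dma_write(page, data, mark_dirty):
    """A device writes through its pin; CPU dirty bits are not updated."""
    assert page.pins > 0, "device must hold a pin"
    page.data = data
    if mark_dirty:
        page.dirty = True  # models the GUP user reporting dirtiness


def try_to_free(page):
    """Last step of the flow: the pin check happens only here."""
    if page.pins > 0:
        return False          # still pinned, keep the page
    if page.dirty:
        swap_out_write(page)  # content changed, write it out again
    page.data = None          # page freed; only the swap copy remains
    return True


def run(mark_dirty):
    page = Page("old")
    page.pins += 1                      # GUP pin taken by a driver
    swap_out_write(page)                # reclaim writes "old" to swap
    dma_write(page, "new", mark_dirty)  # device writes after that
    page.pins -= 1                      # pin dropped before the pin check
    assert try_to_free(page)
    return page.swap_copy


print(run(mark_dirty=False))  # old  (stale data is all that is left)
print(run(mark_dirty=True))   # new
```

In the model the pin check never fires, because the pin is gone by the time reclaim looks; only the dirty report rescues the new data.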
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Mon, 11 Mar 2019, Dave Chinner wrote:

> > Direct IO on a mmapped file backed page doesn't make any sense.
>
> People have used it for many, many years as a zero-copy data movement
> pattern. i.e. mmap the destination file, use direct IO to DMA direct
> into the destination file page cache pages, fdatasync() to force
> writeback of the destination file.

Well we could make that more safe through a special API that designates a
range of pages in a file in the same way as for RDMA. This is inherently
not reliable as we found out.

> Now we have copy_file_range() to optimise this sort of data
> movement, the need for games with mmap+direct IO largely goes away.
> However, we still can't just remove that functionality as it will
> break lots of random userspace stuff...

It is already broken and unreliable. Are there really "lots" of these
things around? Can we test this by adding a warning in the kernel and see
where it actually crops up?
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Fri, 8 Mar 2019, Jerome Glisse wrote:

> > It would be good if that understanding would be enforced somehow
> > given the problems that we see.
>
> This has been discussed extensively already. GUP usage is now
> widespread in multiple drivers, removing that would regress
> userspace, i.e. break existing applications. We all know what
> the rules for that are.

The applications that work are using anonymous memory and memory
filesystems. I have never seen use cases with a real filesystem and would
have objected if someone tried something crazy like that.

Because someone was able to get away with weird ways of abusing the system
is not an argument that we should continue to allow such things. In fact
we have repeatedly ensured that the kernel works reliably by improving the
kernel so that a proper failure is occurring.

> > > In fact, the GUP documentation even recommends that pattern.
> >
> > Isn't that pattern safe for anonymous memory and memory filesystems
> > like hugetlbfs etc? Which is the common use case.
>
> Still an issue with respect to swapout, i.e. if an anon/shmem page was
> mapped read only in preparation for swapout and we do not report the
> page as dirty, what ends up in swap might lack what was written last
> through GUP.

Well swapout cannot occur if the page is pinned and those pages are also
often mlocked.

> > Yes you now have the filesystem as well as the GUP pinner claiming
> > authority over the contents of a single memory segment. Maybe better
> > not allow that?
>
> This goes back to regressing existing drivers with existing users.

There is no regression if that behavior never really worked.

> > Two filesystems trying to sync one memory segment both believing to
> > have exclusive access and we want to sort this out. Why? Don't allow
> > this.
>
> This is allowed, it always was, forbidding that case now would regress
> existing applications and it would also mean that we are modifying the
> API we expose to userspace. So again this is not something we can block
> without regressing existing users.

We have always stopped the user from doing obviously stupid and risky
things. It would be logical to do it here as well.
Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
On Fri, Mar 08, 2019 at 03:08:40AM +, Christopher Lameter wrote:
> On Wed, 6 Mar 2019, john.hubb...@gmail.com wrote:
> > Direct IO
> > =========
> >
> > Direct IO can cause corruption, if userspace does Direct-IO that writes to
> > a range of virtual addresses that are mmap'd to a file. The pages written
> > to are file-backed pages that can be under write back, while the Direct IO
> > is taking place. Here, Direct IO races with a write back: it calls
> > GUP before page_mkclean() has replaced the CPU pte with a read-only entry.
> > The race window is pretty small, which is probably why years have gone by
> > before we noticed this problem: Direct IO is generally very quick, and
> > tends to finish up before the filesystem gets around to do anything with
> > the page contents. However, it's still a real problem. The solution is
> > to never let GUP return pages that are under write back, but instead,
> > force GUP to take a write fault on those pages. That way, GUP will
> > properly synchronize with the active write back. This does not change the
> > required GUP behavior, it just avoids that race.
>
> Direct IO on a mmapped file backed page doesn't make any sense.

People have used it for many, many years as a zero-copy data movement
pattern. i.e. mmap the destination file, use direct IO to DMA direct
into the destination file page cache pages, fdatasync() to force
writeback of the destination file.

Now we have copy_file_range() to optimise this sort of data
movement, the need for games with mmap+direct IO largely goes away.
However, we still can't just remove that functionality as it will
break lots of random userspace stuff...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
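The copy_file_range() route mentioned in this message can be sketched from userspace as follows. This is a minimal illustration in Python, not code from the thread: the helper name `copy_file`, the chunk size and the set of errno values that trigger the fallback are all assumptions made for the example. `os.copy_file_range()` is Python's wrapper around copy_file_range(2) (Python 3.8 or later, Linux only), and the read()/write() fallback exists only so the sketch still runs where the syscall is missing or refuses the file pair.

```python
import errno
import os

# Errors taken to mean "copy_file_range() cannot serve this pair of files";
# anything else is treated as a real failure. The exact list is an assumption.
_FALLBACK_ERRNOS = {errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP}


def copy_file(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path in the kernel where possible.

    Hypothetical helper for illustration; returns the number of bytes copied.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            remaining = os.fstat(src_fd).st_size
            use_cfr = hasattr(os, "copy_file_range")
            copied = 0
            while remaining > 0:
                want = min(chunk, remaining)
                if use_cfr:
                    try:
                        # No user buffer involved: the kernel moves the data.
                        n = os.copy_file_range(src_fd, dst_fd, want)
                    except OSError as exc:
                        if exc.errno not in _FALLBACK_ERRNOS:
                            raise
                        use_cfr = False  # retry this chunk with read/write
                        continue
                else:
                    buf = os.read(src_fd, want)
                    n = len(buf)
                    view = memoryview(buf)
                    while view:  # os.write() may write less than asked
                        view = view[os.write(dst_fd, view):]
                if n == 0:
                    break  # source ended early
                copied += n
                remaining -= n
            os.fdatasync(dst_fd)  # force writeback of the destination
            return copied
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Compared with the mmap + direct IO pattern, nothing here pins page cache pages from userspace, which is why the pattern is described above as largely unnecessary for plain file-to-file copies.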