Re: [PATCH] cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
On 06/03/2014 10:46 AM, Gautham R Shenoy wrote:
> On Mon, Jun 02, 2014 at 01:45:38PM +0530, Srivatsa S. Bhat wrote:
>> On 06/02/2014 01:03 PM, Gautham R Shenoy wrote:
>>> Hi,
>>>
>>> On Tue, May 27, 2014 at 02:23:38AM +0530, Srivatsa S. Bhat wrote:
>>>
>>> [..snip..]

Experimental results:

I ran a modified version of ebizzy (called 'sleeping-ebizzy') that sleeps in
between its execution such that its total utilization can be a user-defined
value, say 10% or 20% (the higher the utilization specified, the fewer sleeps
are injected). This ebizzy was run with a single thread, tied to CPU 8.

Behavior observed with tracing (sample taken from 40% utilization runs):

Without patch:
~~
kworker/8:2-12137 416.335742: cpu_frequency: state=2061000 cpu_id=8
kworker/8:2-12137 416.335744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.345741: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.345744: cpu_frequency: state=4123000 cpu_id=8
kworker/8:2-12137 416.345746: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.355738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
-
<...>-40753 416.402202: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
<idle>-0 416.502130: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
<...>-40753 416.505738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8
kworker/8:2-12137 416.505741: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40753 416.515739: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-12137 416.515742: cpu_frequency: state=4123000 cpu_id=8
kworker/8:2-12137 416.515744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy

Observation: Ebizzy went idle at 416.402202, and started running again at
416.502130. But cpufreq noticed the long idle period, and dropped the
frequency at 416.505739, only to increase it back again at 416.515742,
realizing that the workload is in fact CPU bound. Thus ebizzy needlessly ran
at the lowest frequency for almost 13 milliseconds (almost 1 full sample
period), and this pattern repeats on every sleep-wakeup. This could hurt
latency-sensitive workloads quite a lot.

With patch:
~~~
kworker/8:2-29802 464.832535: cpu_frequency: state=2061000 cpu_id=8
-
kworker/8:2-29802 464.962538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 464.972533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 464.972536: cpu_frequency: state=4123000 cpu_id=8
kworker/8:2-29802 464.972538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 464.982531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
-
kworker/8:2-29802 465.022533: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.032531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 465.032532: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.035797: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
<idle>-0 465.240178: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
<...>-40738 465.242533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
kworker/8:2-29802 465.242535: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
<...>-40738 465.252531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2

>>>
>>> Have the log entries emitted by kworker/8 to report about the
>>> cpu_frequency states been snipped out in the entries post the
>>> "465.032531" mark ?
>>>
>>
>> No, why? Anything looks odd at that point?
>
> I was expecting to see log messages of the following kind after a
> kworker thread is scheduled in.
>
> "kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8"
>

But this gets printed only if the frequency is changed. If the frequency is
left at the same value as it was previously set at (that's the point of this
patch), then we won't get this print.

[Note that these logs are with the patch applied.]

>>
>> Note
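For readers following the thread, the behaviour being discussed can be sketched in C roughly as below. This is only an illustration under stated assumptions, not the patch itself: the excerpt above does not show the patch body, so the struct, the function name `effective_load`, and the "more than two sampling periods" threshold are all assumptions made for this sketch. The idea is that when a sampling window contains an unusually long idle gap, the governor acts once on the load from the previous window instead of the freshly computed (low) load, so the frequency is not dropped right as the task wakes up.

```c
#include <assert.h>

/* Hypothetical per-CPU state for this sketch; not a kernel structure. */
struct sample_state {
	unsigned int prev_load;	/* load (0-100) of the previous window, 0 = nothing to reuse */
};

/*
 * Return the load (0-100) the governor would act on for this window.
 *
 * wall_us          time elapsed since the previous sample
 * idle_us          idle time within that interval
 * sampling_rate_us nominal sampling period
 *
 * Assumed rule: if the window is longer than two sampling periods (the CPU
 * slept through at least one sample) and a previous load is recorded, reuse
 * that load exactly once. Otherwise compute the load normally.
 */
unsigned int effective_load(struct sample_state *s, unsigned int wall_us,
			    unsigned int idle_us, unsigned int sampling_rate_us)
{
	unsigned int load;

	if (wall_us > 2 * sampling_rate_us && s->prev_load) {
		load = s->prev_load;
		s->prev_load = 0;	/* reuse only once, then fall back to real load */
	} else {
		unsigned long long busy = idle_us < wall_us ? wall_us - idle_us : 0;

		load = wall_us ? (unsigned int)(100ULL * busy / wall_us) : 0;
		s->prev_load = load;
	}
	return load;
}
```

With a 10 ms sampling period, a window of 110 ms that contains 100 ms of idle time would compute to about 9% load on its own. In this sketch the first such window reports the previous window's load instead, which corresponds to the "frequency is left at the same value" behaviour described in the reply above.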
[GIT PULL] USB driver patches for 3.16-rc1
The following changes since commit d6d211db37e75de2ddc3a4f979038c40df7cc79c:

  Linux 3.15-rc5 (2014-05-09 13:10:52 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/ tags/usb-3.16-rc1

for you to fetch changes up to 4a95b1fce97756d0333f8232eb7ed6974e93b054:

  usb: hub_handle_remote_wakeup() only exists for CONFIG_PM=y (2014-06-02 15:16:33 -0700)

USB driver patches for 3.16-rc1

Here is the big USB driver pull request for 3.16-rc1.

Nothing huge here, but lots of little things in the USB core, and in
lots of drivers. Hopefully the USB power management will work better
now that it has been reworked to do per-port power control dynamically.
There's also a raft of gadget driver updates and fixes. CONFIG_USB_DEBUG
is finally gone now that everything has been converted over to the
dynamic debug interface; the last hold-out drivers were cleaned up and
the config option removed. There were also other minor things all
through the drivers/usb/ tree, the shortlog shows this pretty well.

All have been in linux-next, including the very last patch, which came
from linux-next to fix a build issue on some platforms.

Signed-off-by: Greg Kroah-Hartman Alan Stern (1): USB: mutual exclusion for resetting a hub and power-managing a port Aleksander Morgado (2): usb: qcserial: add Netgear AirCard 341U usb: qcserial: add additional Sierra Wireless QMI devices Alexander Gordeev (1): xhci: Use pci_enable_msix_exact() instead of pci_enable_msix() Alexander Shiyan (1): usb: chipidea: core: Add missing module owner field Alexandre Belloni (1): usb: gadget: atmel_usba: always test udc->driver Alexey Khoroshilov (1): usb: gadget: gr_udc: unconditionally use GFP_ATOMIC in gr_queue_ext() Andreas Larsson (7): usb: gadget: gr_udc: improve platform_device variable name usb: gadget: gr_udc: Expand devicetree documentation usb: gadget: gr_udc: Use platform_get_irq instead of irq_of_parse_and_map usb: gadget: gr_udc: Use of_property_read_u32_index to access arrays usb: gadget: gr_udc: Add ep.maxpacket_limit to debugfs information usb: gadget: gr_udc: Return error code when trying to set ep.maxpacket > ep.maxpacket_limit usb: gadget: gr_udc: Use GFP_ATOMIC when allocating under held spinlock Andrzej Pietrasiewicz (9): usb: gadget: FunctionFS: share VLA macros with all usb gadget files usb: gadget: OS String support usb: gadget: OS Feature Descriptors support usb: gadget: f_rndis: OS descriptors support usb: gadget: configfs: OS String support usb: gadget: configfs: OS Extended Compatibility descriptors support usb: gadget: f_rndis: OS Descriptors configfs support usb: gadget: configfs: OS Extended Properties descriptors support usb: gadget: f_uac2: don't queue new requests when shutting down Andy Shevchenko (2): usb: dwc3: no need to initialize ret variable usb: dwc3: convert to pcim_enable_device() Antoine Ténart (1): phy: exynos-mipi-video: fix check on array index Apelete Seketeli (1): documentation: docbook: document process of writing an musb glue layer Arnd Bergmann (9): PHY: Exynos: fix SATA phy license typo phy: kona2: use 'select GENERIC_PHY' in Kconfig usb: gadget: s3c2410_udc: don't use 
pr_debug return value usb: musb: tusb-dma can't be built-in if tusb is not usb: musb: omap2plus bus glue needs USB host support usb: phy: msm: reset controller is mandatory now usb: xhci: avoid warning for !PM_SLEEP usb: ohci-da8xx can only be built-in usb: ohci: sort out dependencies for lpc32xx and omap Benoit Taine (1): USB: storage: ene_ub6250: Use kmemdup instead of kmalloc + memcpy Bjørn Mork (4): usb: qcserial: fix multiline comment coding style usb: qcserial: refactor device layout selection usb: qcserial: define and use Sierra Wireless layout usb: qcserial: remove interface number matching Boris BREZILLON (1): usb: ehci-platform: add optional reset controller retrieval Dan Carpenter (2): usb: phy: msm: change devm_ioremap() to devm_ioremap_resource() usb: phy: msm: fix bug in probe() Dan Williams (17): usb: catch attempts to submit urbs with a vmalloc'd transfer buffer usb: disable port power control if not supported in wHubCharacteristics usb: rename usb_port device objects usb: cleanup setting udev->removable from port_dev->connect_type usb: assign default peer ports for root hubs usb: assign usb3 external hub port peers usb: find internal hub tier mismatch via acpi usb: sysfs link peer ports usb: make usb_port flags atomic, rename did_runtime_put to child_usage usb: block suspension of superspeed port
[GIT PULL] TTY/Serial patches for 3.16-rc1
The following changes since commit d1db0eea852497762cab43b905b879dfcd3b8987: Linux 3.15-rc3 (2014-04-27 19:29:27 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git/ tags/tty-3.16-rc1 for you to fetch changes up to 9ce4f8f3f45443922c98e25133b8c9790fc7949a: Revert "serial: imx: remove the DMA wait queue" (2014-05-29 19:30:54 -0700) TTY/Serial driver patches for 3.16-rc1 Here is the big tty / serial driver pull request for 3.16-rc1. A variety of different serial driver fixes and updates and additions, nothing huge, and no real major core tty changes at all. All have been in linux-next for a while. Signed-off-by: Greg Kroah-Hartman Adam Borowski (1): vt: emulate 8- and 24-bit colour codes. Alexander Shiyan (2): serial: sccnxp: Remove useless timer_pending() check serial: sccnxp: Add IGNPAR flag handling Alexander Stein (1): pch_uart: Add uart device to irq name Arnd Bergmann (1): serial: add missing SERIAL_CORE dependencies Barry Song (1): serial: sirf: move to writel for TXFIFO instead of writeb Benjamin Herrenschmidt (1): tty/hvc/hvc_console: Fix wakeup of HVC thread on hvc_kick() Christopher Covington (1): ARM: tty: Move HVC DCC assembly to arch/arm Daniel Thompson (7): serial: mux: Align SUPPORT_SYSRQ behaviour with other drivers. 
serial: st-asc: Fix data corruption during long console bursts serial: sirf: Fix compilation failure serial: cpm_uart: No LF conversion in put_poll_char() serial: kgdb_nmi: Use container_of() to locate private data serial: kgdb_nmi: Switch from tasklets to real timers serial: kgdb_nmi: Improve console integration with KDB I/O Doug Anderson (1): serial_core: Commonalize crlf when working w/ a non open console port Ezequiel Garcia (1): parport: Add support for the WCH353 1S/1P multi-IO card Fabian Frederick (1): drivers/tty/n_hdlc.c: replace kmalloc/memset by kzalloc Fabio Estevam (1): serial: imx: Disable new features of autobaud detection Felipe Balbi (11): bluetooth: hci_ldisc: fix deadlock condition Revert "serial: omap: unlock the port lock" serial: fix UART_IIR_ID tty: serial: add missing braces tty: serial: omap: switch over to devm_request_gpio tty: serial: omap: cleanup variable declarations tty: serial: omap: switch over to platform_get_resource tty: serial: omap: switch over to devm_ioremap_resource tty: serial: omap: remove some dead code tty: serial: omap: remove unneeded singlethread workqueue tty: serial: omap: fix Sparse warnings Geert Uytterhoeven (1): serial: SERIAL_FSL_LPUART should depend on HAS_DMA Greg Kroah-Hartman (3): Revert "serial: sh-sci: Add device tree support for r8a7779" Merge 3.15-rc3 into tty-next Revert "serial: imx: remove the DMA wait queue" Heikki Krogerus (1): serial: 8250_dma: check the result of TX buffer mapping Huang Shijie (5): tty_ldisc: add more limits to the @write_wakeup serial: imx: reset the uart port all the time serial: imx: remove the redundant code serial: imx: remove the DMA wait queue serial: imx: disable the receiver ready interrupt for imx_stop_rx Jan Moskyto Matejka (1): serial: sc16is7xx: compile I2C when REGMAP_I2C is module Jean Delvare (2): serial: pch_uart: Fix Kconfig dependencies tty: n_hdlc: Drop redundant error message Joe Perches (1): serial: samsung: Neaten dbg uses Johannes Thumshirn (2): tty: 
serial: Add driver for MEN's 16z135 High Speed UART. tty: serial: men_z135_uart: Don't activate TX Space available IRQ on startup Jon Ringle (5): serial: sc16is7xx serial: sc16is7xx: Add bindings documentation for the SC16IS7XX UARTs serial: sc16is7xx: depend on I2C serial: sc16is7xx: fix implicit decl of func copy_{to,from}_user serial: sc16is7xx: dynamically allocate tx/rx buffer Julia Lawall (1): tty: serial: replace del_timer by del_timer_sync Loic Poulain (1): 8250_dw: Support all baudrates on baytrail Michal Simek (3): tty: xuartps: Fix kernel-doc errors in the driver tty: xuartps: Initialize ports according to aliases tty: serial: uartlite: Specify time for sending chars Murali Karicheri (1): serial: uart: add hw flow control support configuration Qipan Li (1): serial: sirf: fix spinlock deadlock issue Richard Genoud (5): tty/serial: atmel_serial: Fix device tree documentation ARM: at91: gpio: implement get_direction tty/serial: Add GPIOLIB helpers for controlling modem lines tty/serial: at91: use mctrl_gpio helpers tty/serial: at91: add interrupts for modem control lines Rob Herring (9): x86: move
Re: [GIT PULL] Driver core / sysfs patches for 3.16-rc1
On Mon, Jun 02, 2014 at 10:45:12PM -0700, Greg KH wrote:
> The following changes since commit 4b660a7f5c8099d88d1a43d8ae138965112592c7:
>
>   Linux 3.15-rc6 (2014-05-22 06:42:02 +0900)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/ tags/driver-core-3.16-rc1
>
> for you to fetch changes up to cda43576afa641d83ae268cb9795ae2a549d53d9:
>
>   crypto/nx/nx-842: dev_set_drvdata can no longer fail (2014-05-28 13:39:51 -0700)
>
> Driver core / kernfs pull request for 3.16-rc1
>
> Here is the "big" pull request for 3.16-rc1.
>
> Not a lot of changes here, some kernfs work, a revert of a very old
> driver core change that ended up causing some memory leaks on driver
> probe error paths, and other minor things.
>
> As was pointed out earlier today, one commit here,
> 26fc9cd200ec839e0b3095e05ae018f27314e7aa (kernfs: move the last
> knowledge of sysfs out from kernfs) is also needed in your 3.15-final
> branch as well. If you could cherry-pick it there, it would be most
> appreciated by Andy Lutomirski to prevent a regression there.
>
> All of these have been in linux-next for a while.
>
> Signed-off-by: Greg Kroah-Hartman
>
> Fabian Frederick (2):
>       lib/devres.c: use dev in devm_request_and_ioremap
>       lib/devres.c: fix checkpatch warnings
>
> Greg Kroah-Hartman (2):
>       Merge 3.15-rc3 into staging-next

Note, that merge description was wrong, I merged into driver-core-next
but fat-fingered the message when doing the commit description, sorry
about that.

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[GIT PULL] Driver core / sysfs patches for 3.16-rc1
The following changes since commit 4b660a7f5c8099d88d1a43d8ae138965112592c7:

  Linux 3.15-rc6 (2014-05-22 06:42:02 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git/ tags/driver-core-3.16-rc1

for you to fetch changes up to cda43576afa641d83ae268cb9795ae2a549d53d9:

  crypto/nx/nx-842: dev_set_drvdata can no longer fail (2014-05-28 13:39:51 -0700)

Driver core / kernfs pull request for 3.16-rc1

Here is the "big" pull request for 3.16-rc1.

Not a lot of changes here, some kernfs work, a revert of a very old
driver core change that ended up causing some memory leaks on driver
probe error paths, and other minor things.

As was pointed out earlier today, one commit here,
26fc9cd200ec839e0b3095e05ae018f27314e7aa (kernfs: move the last
knowledge of sysfs out from kernfs) is also needed in your 3.15-final
branch as well. If you could cherry-pick it there, it would be most
appreciated by Andy Lutomirski to prevent a regression there.

All of these have been in linux-next for a while.

Signed-off-by: Greg Kroah-Hartman

Fabian Frederick (2):
      lib/devres.c: use dev in devm_request_and_ioremap
      lib/devres.c: fix checkpatch warnings

Greg Kroah-Hartman (2):
      Merge 3.15-rc3 into staging-next
      Merge 3.15-rc6 into driver-core-next

Jean Delvare (6):
      driver core: Move driver_data back to struct device
      driver core: dev_set_drvdata can no longer fail
      driver core: dev_set_drvdata returns void
      driver core: dev_get_drvdata: Don't check for NULL dev
      driver core: Inline dev_set/get_drvdata
      crypto/nx/nx-842: dev_set_drvdata can no longer fail

Jianyu Zhan (1):
      kernfs: move the last knowledge of sysfs out from kernfs

Michael Marineau (1):
      kobject: Make support for uevent_helper optional.

Robert ABEL (1):
      sysfs: fix attribute_group bin file path on removal

Simon Wunderlich (1):
      sysfs.h: don't return a void-valued expression in sysfs_remove_file

Tejun Heo (2):
      kernfs: implement kernfs_root->supers list
      kernfs: make kernfs_notify() trigger inotify events too

Tony Lindgren (1):
      init.h: Update initcall_sync variants to fix build errors

 drivers/base/Kconfig         | 17 +++--
 drivers/base/base.h          |  3 ---
 drivers/base/dd.c            | 26 --
 drivers/crypto/nx/nx-842.c   |  7 +--
 drivers/iommu/exynos-iommu.c |  7 +--
 drivers/vfio/vfio.c          |  8 +---
 fs/kernfs/dir.c              |  1 +
 fs/kernfs/file.c             | 41 +++--
 fs/kernfs/kernfs-internal.h  |  5 +
 fs/kernfs/mount.c            | 22 +-
 fs/sysfs/group.c             | 10 +-
 fs/sysfs/mount.c             |  4 +++-
 include/linux/device.h       | 15 +--
 include/linux/init.h         | 14 +-
 include/linux/kernfs.h       | 17 -
 include/linux/kobject.h      |  2 ++
 include/linux/sysfs.h        |  2 +-
 kernel/cgroup.c              |  4 +++-
 kernel/ksysfs.c              |  5 -
 kernel/sysctl.c              |  4 ++--
 lib/devres.c                 | 10 +-
 lib/kobject_uevent.c         |  6 ++
 22 files changed, 141 insertions(+), 89 deletions(-)
[GIT PULL] char/misc driver patches for 3.16-rc1
The following changes since commit d1db0eea852497762cab43b905b879dfcd3b8987: Linux 3.15-rc3 (2014-04-27 19:29:27 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ tags/char-misc-3.16-rc1 for you to fetch changes up to a100d88df1e924e5c9678fabf054d1bae7ab74fb: hv: use correct order when freeing monitor_pages (2014-05-28 13:45:15 -0700) Char / misc driver patches for 3.16-rc1 Here is the big char / misc driver updates for 3.16-rc1. Lots of different driver updates for a variety of different drivers and minor driver subsystems. All have been in linux-next with no reported issues. Signed-off-by: Greg Kroah-Hartman Alexander Usyskin (6): mei: txe: add runtime pm framework mei: txe: use runtime PG pm domain for non wakeable devices mei: extract fw status registers mei: make return values consistent across the driver mei: set connecting state just upon connection request is sent to the fw mei: add per device configuration Alexey Khoroshilov (1): w1: do not unlock unheld list_mutex in __w1_remove_master_device() Arnd Bergmann (1): misc: atmel_pwm: only build for supported platforms Bin Wang (1): uio: fix vma io range check in mmap Chanwoo Choi (11): extcon: max14577: Change extcon name instead of static name according to device type Merge tag 'ib-mfd-extcon-3.16' of git://git.kernel.org/.../lee/mfd into HEAD extcon: Add extcon_dev_allocate/free() to control the memory of extcon device extcon: Add devm_extcon_dev_allocate/free to manage the resource of extcon device extcon: max8997: Use devm_extcon_dev_allocate for extcon_dev extcon: max77693: Use devm_extcon_dev_allocate for extcon_dev extcon: max14577: Use devm_extcon_dev_allocate for extcon_dev extcon: arizona: Use devm_extcon_dev_allocate for extcon_dev extcon: adc-jack: Use devm_extcon_dev_allocate for extcon_dev extcon: gpio: Use devm_extcon_dev_allocate for extcon_dev extcon: palmas: Use devm_extcon_dev_allocate for extcon_dev Christian Engelmayer (1): 
misc: genwqe: fix uninitialized return value in genwqe_free_sync_sgl() Daeseok Youn (1): drivers: uio_dmem_genirq: Fix memory leak in uio_dmem_genirq_probe() Dan Carpenter (1): applicom: dereferencing NULL on error path David Fries (2): connector: allow multiple messages to be sent in one packet w1: optional bundling of netlink kernel replies Geert Uytterhoeven (1): drivers: Remove duplicate conditionally included subdirs Greg Kroah-Hartman (1): Merge tag 'extcon-next-for-3.16' of git://git.kernel.org/.../chanwoo/extcon into char-misc-next Jean Delvare (2): misc: Add hardware dependencies to Atmel drivers misc: pch_phub: Fix Kconfig dependencies Johannes Thumshirn (1): mcb: Add support for shared PCI IRQs Josh Cartwright (1): spmi: of: fixup generic SPMI devicetree binding example K. Y. Srinivasan (3): Drivers: hv: Eliminate the channel spinlock in the callback path Drivers: hv: vmbus: Implement per-CPU mapping of relid to channel Drivers: hv: balloon: Ensure pressure reports are posted regularly Kishon Vijay Abraham I (1): extcon: palmas: explicitly set edev name as node name Krzysztof Kozlowski (14): mfd: max14577: Add muic prefix to regmap config mfd: max14577: Add detection of device type extcon: max14577: Add max14577 prefix to muic_irqs extcon: max14577: Choose muic_irqs according to device type mfd: max14577: Add MAX14577 prefix to IRQ defines mfd: max77836: Add MAX77836 support to max14577 driver extcon: max14577: Add support for MAX77836 regulator: max14577: Add support for MAX77836 regulators extcon: max77693: Fix two NULL pointer exceptions on missing pdata extcon: max8997: Fix NULL pointer exception on missing pdata extcon: max77693: Use power efficient workqueue for delayed cable detection extcon: max8997: Use power efficient workqueue for delayed cable detection extcon: max14577: Fix probe failure on successful work queue extcon: max14577: Properly handle regmap_irq_get_virq error Masanari Iida (1): misc: genwqe: Fix format string mismatch in 
card_debugfs.c Radim Krčmář (1): hv: use correct order when freeing monitor_pages Rob Herring (2): dt/bindings: add binding for ARM Versatile character LCD misc: arm-charlcd: add DT probe support Robert P. J. Day (2): MAINTAINERS: Add miscdevice.h to file list for char/misc drivers. miscdevice.h: Simple syntax fix to make pointers consistent. Sangjung Woo (8): extcon: Add resource-managed extcon register function extcon: adc-jack: Use devm_extcon_dev_register() extcon: gpio: Use
[PATCH] ARM: EXYNOS: Fix the sequence of secondary CPU boot for Exynos3250
This patch sets the AUTOWAKEUP_EN bit in the ARM_CORE_CONFIGURATION register
because Exynos3250 removes WFE in secure mode, so that the core turns on
automatically after CORE_LOCAL_PWR_EN is set.

Also, this patch uses the dsb_sev() macro instead of IPI_WAKEUP to guarantee
data synchronization of the command, because Exynos3250 doesn't have WFE mode
in secure mode.

Signed-off-by: Chanwoo Choi
Acked-by: Kyungmin Park
---
 arch/arm/mach-exynos/platsmp.c  | 9 -
 arch/arm/mach-exynos/pm.c       | 8 ++--
 arch/arm/mach-exynos/regs-pmu.h | 4
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-exynos/platsmp.c b/arch/arm/mach-exynos/platsmp.c
index ec02422..882fb84 100644
--- a/arch/arm/mach-exynos/platsmp.c
+++ b/arch/arm/mach-exynos/platsmp.c
@@ -149,6 +149,10 @@ static int exynos_boot_secondary(unsigned int cpu, struct task_struct *idle)
 			return -ETIMEDOUT;
 		}
 	}
+
+	if (soc_is_exynos3250())
+		__raw_writel(EXYNOS3_COREPORESET(phys_cpu), EXYNOS_SWRESET);
+
 	/*
 	 * Send the secondary CPU a soft interrupt, thereby causing
 	 * the boot monitor to read the system wide flags register,
@@ -182,7 +186,10 @@ static int exynos_boot_secondary(unsigned int cpu, struct task_struct *idle)
 		call_firmware_op(cpu_boot, phys_cpu);
-		arch_send_wakeup_ipi_mask(cpumask_of(cpu));
+		if (soc_is_exynos3250())
+			dsb_sev();
+		else
+			arch_send_wakeup_ipi_mask(cpumask_of(cpu));
 		if (pen_release == -1)
 			break;
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c
index 87c0d34..4681f64 100644
--- a/arch/arm/mach-exynos/pm.c
+++ b/arch/arm/mach-exynos/pm.c
@@ -121,8 +121,12 @@ void exynos_cpu_power_down(int cpu)
  */
 void exynos_cpu_power_up(int cpu)
 {
-	__raw_writel(S5P_CORE_LOCAL_PWR_EN,
-		     EXYNOS_ARM_CORE_CONFIGURATION(cpu));
+	u32 core_conf = 0;
+
+	core_conf |= S5P_CORE_LOCAL_PWR_EN;
+	if (soc_is_exynos3250())
+		core_conf |= S5P_CORE_AUTOWAKEUP_EN;
+	__raw_writel(core_conf, EXYNOS_ARM_CORE_CONFIGURATION(cpu));
 }
 /**
diff --git a/arch/arm/mach-exynos/regs-pmu.h b/arch/arm/mach-exynos/regs-pmu.h
index 1d13b08..674dfc2 100644
--- a/arch/arm/mach-exynos/regs-pmu.h
+++ b/arch/arm/mach-exynos/regs-pmu.h
@@ -128,6 +128,7 @@
 #define S5P_CORE_LOCAL_PWR_EN			0x3
 #define S5P_INT_LOCAL_PWR_EN			0x7
+#define S5P_CORE_AUTOWAKEUP_EN			(1 << 31)
 /* Only for EXYNOS4210 */
 #define S5P_CMU_CLKSTOP_LCD1_LOWPWR		S5P_PMUREG(0x1154)
@@ -186,6 +187,9 @@
 #define S5P_DIS_IRQ_CORE3			S5P_PMUREG(0x1034)
 #define S5P_DIS_IRQ_CENTRAL3			S5P_PMUREG(0x1038)
+/* For EXYNOS3 */
+#define EXYNOS3_COREPORESET(cpu)		((1 << 4) << cpu)
+
 /* For EXYNOS5 */
 #define EXYNOS5_SYS_I2C_CFG			S5P_SYSREG(0x0234)
--
1.8.0
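As a worked illustration of the register values this patch composes, here is a small standalone C sketch. It is not kernel code: the `SKETCH_` names and the helper `sketch_core_conf()` are hypothetical, and only the bit positions (0x3, bit 31, and bit 4 shifted by the CPU number) are taken from the diff above. The sketch uses unsigned constants and parenthesizes the macro argument, which is a defensive choice made for this example (shifting a signed 1 into bit 31 is not defined by ISO C) and not something proposed in the posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirrored from the diff; names are hypothetical for this sketch. */
#define SKETCH_CORE_LOCAL_PWR_EN	0x3u
#define SKETCH_CORE_AUTOWAKEUP_EN	(1u << 31)
#define SKETCH_COREPORESET(cpu)		((1u << 4) << (cpu))

/*
 * Compose the value written to the core configuration register:
 * always the local power enable bits, plus the auto-wakeup bit when
 * the SoC is an Exynos3250 (passed in as a flag here, since the real
 * soc_is_exynos3250() check is not available outside the kernel).
 */
uint32_t sketch_core_conf(int is_exynos3250)
{
	uint32_t core_conf = SKETCH_CORE_LOCAL_PWR_EN;

	if (is_exynos3250)
		core_conf |= SKETCH_CORE_AUTOWAKEUP_EN;
	return core_conf;
}
```

Under these assumptions, other SoCs get 0x00000003 and Exynos3250 gets 0x80000003, while the per-core reset mask is 0x10 for CPU 0 and 0x20 for CPU 1. Parenthesizing `(cpu)` only matters if the macro is ever called with an expression such as `cpu + 1`.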
Re: [PATCH v3] hwmon: add support for Sensirion SHTC1 sensor
On Mon, Jun 02, 2014 at 03:13:40PM -0700, Tomas Pop wrote:
> Hi Guenter,
>
Hi Tomas,

comments inline. Mostly nitpicks plus one real bug.

> here is a third version of the driver for Sensirion SHTC1 humidity and
> temperature sensor. We included suggested corrections, the most important
> changes to the previous version are:
>
> * default mode of I2C communication is changed to non-blocking
> * default values of configuration added to documentation
> * driver uses new ABI to obtain shtc1_data* and i2c_client*
>
> The patch was generated against kernel v3.14.4 (the version used for testing),
> but the patch can be applied smoothly also to v3.15-rc8.
>
> Signed-off-by: Tomas Pop
>
> ---

In general, you'll want a description of the patch prior to the --- line
(this is what is added to the commit log), and everything that should not
be part of the commit log below the --- line.

> Documentation/hwmon/shtc1 | 44 +++
> drivers/hwmon/Kconfig | 10 ++
> drivers/hwmon/shtc1.c | 252
> include/linux/platform_data/shtc1.h | 23
> 4 files changed, 329 insertions(+)
> create mode 100644 Documentation/hwmon/shtc1
> create mode 100644 drivers/hwmon/shtc1.c
> create mode 100644 include/linux/platform_data/shtc1.h
>
> diff --git a/Documentation/hwmon/shtc1 b/Documentation/hwmon/shtc1
> new file mode 100644
> index 000..6bf00a4
> --- /dev/null
> +++ b/Documentation/hwmon/shtc1
> @@ -0,0 +1,44 @@
> +Kernel driver shtc1
> +===
> +
> +Supported chips:
> + * Sensirion SHTC1
> +Prefix: 'shtc1'
> +Addresses scanned: none
> +Datasheet: Publicly available at the Sensirion website

Since you have a pointer to the datasheet you don't need the "Publicly
available".

> +http://www.sensirion.com/file/datasheet_shtc1
> +
> + * Sensirion SHTW1
> +Prefix: 'shtc1'

I would suggest using shtw1 here. This lets users instantiate this chip
with the correct ID. That is especially useful if instantiated through
devicetree. Also see below.

> +Addresses scanned: none > +Datasheet: Not publicly available > + > +Author: > + Johannes Winkelmann > + > +Description > +--- > + > +This driver implements support for the Sensirion SHTC1 chip, a humidity and > +temperature sensor. Temperature is measured in degrees celsius, relative > +humidity is expressed as a percentage. Driver can be used as well for SHTW1 > +chip, which has the same electrical interface. > + > +The device communicates with the I2C protocol. All sensors are set to I2C > +address 0x70. See Documentation/i2c/instantiating-devices for methods to > +instantiate the device. > + > +There are two options configurable by means of shtc1_platform_data: > +1. blocking (pull the I2C clock line down while performing the measurement) > or > + non-blocking mode. Blocking mode will guarantee the fastest result but > + the I2C bus will be busy during that time. By default, non-blocking mode > + is used. Make sure clock-stretching works properly on your device if you > + want to use blocking mode. > +2. high or low accuracy. High accuracy is used by default and using it is > + strongly recommended. > + > +sysfs-Interface > +--- > + > +temp1_input - temperature input > +humidity1_input - humidity input > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig > index 193c496..bf06213 100644 > --- a/drivers/hwmon/Kconfig > +++ b/drivers/hwmon/Kconfig > @@ -1047,6 +1047,16 @@ config SENSORS_SHT21 > This driver can also be built as a module. If so, the module > will be called sht21. > > +config SENSORS_SHTC1 > + tristate "Sensiron humidity and temperature sensors. SHTC1 and compat." How about "Sensirion SHTC1 and compatible humidity and temperature sensors" ? > + depends on I2C > + help > + If you say yes here you get support for the Sensiron SHTC1 and SHTW1 > + humidity and temperature sensor. > + sensors > + This driver can also be built as a module. If so, the module > + will be called shtc1. 
> + > config SENSORS_S3C > tristate "Samsung built-in ADC" > depends on S3C_ADC > diff --git a/drivers/hwmon/shtc1.c b/drivers/hwmon/shtc1.c > new file mode 100644 > index 000..b3278ab > --- /dev/null > +++ b/drivers/hwmon/shtc1.c > @@ -0,0 +1,252 @@ > +/* Sensirion SHTC1 humidity and temperature sensor driver > + * > + * Copyright (C) 2014 Sensirion AG, Switzerland > + * Author: Johannes Winkelmann > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + *
Re: [RFC][PATCH 1/2] Add a super operation for writeback
On the 3rd of June 2014 05:39, Dave Chinner wrote:

On Mon, Jun 02, 2014 at 10:30:07AM +0200, Christian Stroetmann wrote:

When I followed the advice of Dave Chinner: "We're not going to merge that
page forking stuff (like you were told at LSF 2013 more than a year ago:
http://lwn.net/Articles/548091/) without rigorous design review and a
demonstration of the solutions to all the hard corner cases it has" given in
his e-mail related to the presentation of the latest version of the Tux3 file
system (see [1]) and read the linked article, I found in the second comment:
"Parts of this almost sound like it either a.) overlaps with or b.) would
benefit greatly from something similar to Featherstitch [[2]]."

Could it be that we have with Featherstitch a general solution already that
is said to be even "file system agnostic"? Honestly, I thought that something
like this would make its way into the Linux code base.

Here's what I said about the last proposal (a few months ago) for integrating
featherstitch into the kernel:

http://www.spinics.net/lists/linux-fsdevel/msg72799.html

It's not a viable solution.

Cheers,
Dave.

How annoying, I did not remember your e-mail of the referenced thread
"[Lsf-pc] [LSF/MM TOPIC] atomic block device" even though I saved it to local
disk. Thanks a lot for the reminder.

I also directly saw the problem with the research prototype Featherstitch,
specifically the point "All the filesystem modules it has are built into the
featherstitch kernel module, and called through a VFS shim layer". But it is
just a prototype and its concept of abstraction does not have to be copied
1:1 into the Linux code base.

In general, I do not believe that the complexity problems of soft updates,
atomic writes, and related techniques can be solved by hand/manually. So my
suggestion is to automatically handle the complexity problem of e.g.
dependencies in a way that is comparable to a(n on-the-fly) file-system
compiler, so to say, that works on a very large dependency graph (having
several billions of graph vertices actually). And at this point an
abstraction like the one given with Featherstitch helps to feed and control
this special FS compiler.

Actually, I have to follow the discussion further on the one hand and go
deeper into the highly complex problem space on the other hand.

With all the best
Christian
[PATCH v2] delete unnecessary bootmem struct page array
From: Honggang Li

This patch is based on linux-next-2014-06-02.

The old init_maps function does two things:
1) allocates and initializes one struct page array for bootmem
2) counts the total number of pages

After removing the code related to the unnecessary array, the name
'init_maps' no longer fits, because the function only counts the total
number of pages. So I renamed the function to 'mem_total_pages'.

I tested the patch by rebooting the UML kernel many times.

[real@name linux-next]$ make ARCH=um defconfig
[real@name linux-next]$ make ARCH=um linux
[real@name linux-next]$ file linux
linux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not stripped
[real@name linux-next]$ ./linux ubda=/home/real/linux-next/Fedora20-AMD64-root_fs mem=256m && sync && echo 1
[real@name linux-next]$ ./linux ubda=/home/real/linux-next/Fedora20-AMD64-root_fs mem=256m && sync && echo 2
(repeat, rebooting the UML kernel many times..)

Honggang Li (1):
  delete unnecessary bootmem struct page array

 arch/um/include/shared/mem_user.h |  2 +-
 arch/um/kernel/physmem.c          | 32 ++--
 arch/um/kernel/um_arch.c          |  7 +--
 3 files changed, 8 insertions(+), 33 deletions(-)

--
1.8.3.1
[PATCH] delete unnecessary bootmem struct page array
From: Honggang Li

1) The UML kernel manages bootmem through bootmem_data->node_bootmem_map,
not through the struct page array, so the array is unnecessary.

2) The bootmem struct page array is referenced only by a *local* pointer,
struct page *map, in the init_maps function, so the array can be accessed
only within init_maps's scope. As a result, the UML kernel wastes about 1%
of total memory.

Signed-off-by: Honggang Li
---
 arch/um/include/shared/mem_user.h |  2 +-
 arch/um/kernel/physmem.c          | 32 ++--
 arch/um/kernel/um_arch.c          |  7 +--
 3 files changed, 8 insertions(+), 33 deletions(-)

diff --git a/arch/um/include/shared/mem_user.h b/arch/um/include/shared/mem_user.h
index 46384ac..cb84414 100644
--- a/arch/um/include/shared/mem_user.h
+++ b/arch/um/include/shared/mem_user.h
@@ -49,7 +49,7 @@ extern int iomem_size;
 extern int init_mem_user(void);
 extern void setup_memory(void *entry);
 extern unsigned long find_iomem(char *driver, unsigned long *len_out);
-extern int init_maps(unsigned long physmem, unsigned long iomem,
+extern void mem_total_pages(unsigned long physmem, unsigned long iomem,
 		     unsigned long highmem);
 extern unsigned long get_vm(unsigned long len);
 extern void setup_physmem(unsigned long start, unsigned long usable,
diff --git a/arch/um/kernel/physmem.c b/arch/um/kernel/physmem.c
index 30fdd5d..549ecf3 100644
--- a/arch/um/kernel/physmem.c
+++ b/arch/um/kernel/physmem.c
@@ -22,39 +22,19 @@ EXPORT_SYMBOL(high_physmem);
 
 extern unsigned long long physmem_size;
 
-int __init init_maps(unsigned long physmem, unsigned long iomem,
+void __init mem_total_pages(unsigned long physmem, unsigned long iomem,
 		     unsigned long highmem)
 {
-	struct page *p, *map;
-	unsigned long phys_len, phys_pages, highmem_len, highmem_pages;
-	unsigned long iomem_len, iomem_pages, total_len, total_pages;
-	int i;
-
-	phys_pages = physmem >> PAGE_SHIFT;
-	phys_len = phys_pages * sizeof(struct page);
-
-	iomem_pages = iomem >> PAGE_SHIFT;
-	iomem_len = iomem_pages * sizeof(struct page);
+	unsigned long phys_pages, highmem_pages;
+	unsigned long iomem_pages, total_pages;
 
+	phys_pages = physmem >> PAGE_SHIFT;
+	iomem_pages = iomem >> PAGE_SHIFT;
 	highmem_pages = highmem >> PAGE_SHIFT;
-	highmem_len = highmem_pages * sizeof(struct page);
-
-	total_pages = phys_pages + iomem_pages + highmem_pages;
-	total_len = phys_len + iomem_len + highmem_len;
 
-	map = alloc_bootmem_low_pages(total_len);
-	if (map == NULL)
-		return -ENOMEM;
-
-	for (i = 0; i < total_pages; i++) {
-		p = &map[i];
-		memset(p, 0, sizeof(struct page));
-		SetPageReserved(p);
-		INIT_LIST_HEAD(&p->lru);
-	}
+	total_pages = phys_pages + iomem_pages + highmem_pages;
 
 	max_mapnr = total_pages;
-	return 0;
 }
 
 void map_memory(unsigned long virt, unsigned long phys, unsigned long len,
diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
index 6043c76..dbd5bda 100644
--- a/arch/um/kernel/um_arch.c
+++ b/arch/um/kernel/um_arch.c
@@ -338,12 +338,7 @@ int __init linux_main(int argc, char **argv)
 	start_vm = VMALLOC_START;
 
 	setup_physmem(uml_physmem, uml_reserved, physmem_size, highmem);
-	if (init_maps(physmem_size, iomem_size, highmem)) {
-		printf("Failed to allocate mem_map for %Lu bytes of physical "
-		       "memory and %Lu bytes of highmem\n", physmem_size,
-		       highmem);
-		exit(1);
-	}
+	mem_total_pages(physmem_size, iomem_size, highmem);
 
 	virtmem_size = physmem_size;
 	stack = (unsigned long) argv;
--
1.8.3.1
[PATCH v5 2/3] mfd: intel_soc_pmic: Crystal Cove support
This patch provides chip-specific support for Crystal Cove. Crystal
Cove is the PMIC in Baytrail-T platform.

Also adds Intel SoC PMIC support to the build files.

Signed-off-by: Yang, Bin
Signed-off-by: Zhu, Lejun
---
v2:
 - Add regmap_config for Crystal Cove.
v3:
 - Convert IRQ config to regmap_irq_chip.
v4:
 - Cleanup include files.
 - Remove useless init() function.
 - Remove useless .label and .init from struct intel_soc_pmic_config.
 - Fix various coding style issues.
v5:
 - Use CRYSTAL_COVE_IRQ_ prefix for IRQ bits definition.
 - Merge build files patch to here.
---
 drivers/mfd/Kconfig              |  12 +++
 drivers/mfd/Makefile             |   3 +
 drivers/mfd/intel_soc_pmic_crc.c | 158 +++
 3 files changed, 173 insertions(+)
 create mode 100644 drivers/mfd/intel_soc_pmic_crc.c

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 3383412..d987b71 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -241,6 +241,18 @@ config LPC_SCH
 	  LPC bridge function of the Intel SCH provides support for
 	  System Management Bus and General Purpose I/O.
 
+config INTEL_SOC_PMIC
+	bool "Support for Intel Atom SoC PMIC"
+	depends on I2C=y
+	select MFD_CORE
+	select REGMAP_I2C
+	select REGMAP_IRQ
+	help
+	  Select this option to enable support for the PMIC device
+	  on some Intel SoC systems. The PMIC provides ADC, GPIO,
+	  thermal, charger and related power management functions
+	  on these systems.
+
 config MFD_INTEL_MSIC
 	bool "Intel MSIC"
 	depends on INTEL_SCU_IPC
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 2851275..36dda4c 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -166,3 +166,6 @@ obj-$(CONFIG_MFD_RETU)		+= retu-mfd.o
 obj-$(CONFIG_MFD_AS3711)	+= as3711.o
 obj-$(CONFIG_MFD_AS3722)	+= as3722.o
 obj-$(CONFIG_MFD_STW481X)	+= stw481x.o
+
+intel-soc-pmic-objs		:= intel_soc_pmic_core.o intel_soc_pmic_crc.o
+obj-$(CONFIG_INTEL_SOC_PMIC)	+= intel-soc-pmic.o
diff --git a/drivers/mfd/intel_soc_pmic_crc.c b/drivers/mfd/intel_soc_pmic_crc.c
new file mode 100644
index 000..7107cab
--- /dev/null
+++ b/drivers/mfd/intel_soc_pmic_crc.c
@@ -0,0 +1,158 @@
+/*
+ * intel_soc_pmic_crc.c - Device access for Crystal Cove PMIC
+ *
+ * Copyright (C) 2013, 2014 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Author: Yang, Bin
+ * Author: Zhu, Lejun
+ */
+
+#include
+#include
+#include
+#include
+#include "intel_soc_pmic_core.h"
+
+#define CRYSTAL_COVE_MAX_REGISTER	0xC6
+
+#define CRYSTAL_COVE_REG_IRQLVL1	0x02
+#define CRYSTAL_COVE_REG_MIRQLVL1	0x0E
+
+#define CRYSTAL_COVE_IRQ_PWRSRC		0
+#define CRYSTAL_COVE_IRQ_THRM		1
+#define CRYSTAL_COVE_IRQ_BCU		2
+#define CRYSTAL_COVE_IRQ_ADC		3
+#define CRYSTAL_COVE_IRQ_CHGR		4
+#define CRYSTAL_COVE_IRQ_GPIO		5
+#define CRYSTAL_COVE_IRQ_VHDMIOCP	6
+
+static struct resource gpio_resources[] = {
+	{
+		.name	= "GPIO",
+		.start	= CRYSTAL_COVE_IRQ_GPIO,
+		.end	= CRYSTAL_COVE_IRQ_GPIO,
+		.flags	= IORESOURCE_IRQ,
+	},
+};
+
+static struct resource pwrsrc_resources[] = {
+	{
+		.name  = "PWRSRC",
+		.start = CRYSTAL_COVE_IRQ_PWRSRC,
+		.end   = CRYSTAL_COVE_IRQ_PWRSRC,
+		.flags = IORESOURCE_IRQ,
+	},
+};
+
+static struct resource adc_resources[] = {
+	{
+		.name  = "ADC",
+		.start = CRYSTAL_COVE_IRQ_ADC,
+		.end   = CRYSTAL_COVE_IRQ_ADC,
+		.flags = IORESOURCE_IRQ,
+	},
+};
+
+static struct resource thermal_resources[] = {
+	{
+		.name  = "THERMAL",
+		.start = CRYSTAL_COVE_IRQ_THRM,
+		.end   = CRYSTAL_COVE_IRQ_THRM,
+		.flags = IORESOURCE_IRQ,
+	},
+};
+
+static struct resource bcu_resources[] = {
+	{
+		.name  = "BCU",
+		.start = CRYSTAL_COVE_IRQ_BCU,
+		.end   = CRYSTAL_COVE_IRQ_BCU,
+		.flags = IORESOURCE_IRQ,
+	},
+};
+
+static struct mfd_cell crystal_cove_dev[] = {
+	{
+		.name = "crystal_cove_pwrsrc",
+		.num_resources = ARRAY_SIZE(pwrsrc_resources),
+		.resources = pwrsrc_resources,
+	},
+	{
+		.name = "crystal_cove_adc",
+		.num_resources =
[PATCH v5 0/3] mfd: Intel SoC Power Management IC
Devices based on Intel SoC products such as Baytrail have a Power
Management IC. In the PMIC there are subsystems for voltage regulation,
A/D conversion, GPIO and PWMs. The PMIC in Baytrail-T platform is
called Crystal Cove.

This series contains common code for these PMICs, and device specific
support for Crystal Cove.

v2:
 - Use regmap instead of creating our own I2C read/write callbacks.
 - Add one missing EXPORT_SYMBOL.
 - Remove some duplicate code and put them into pmic_regmap_load_from_hw.
v3:
 - Use regmap-irq and remove lots of duplicate code.
 - Remove 2 unused APIs.
 - Some other cleanup.
v4:
 - Remove all exported APIs which are wrappers of regmap API, export the
   regmap in data structure instead.
 - Combine intel_soc_pmic_core.c and intel_soc_pmic_i2c.c
 - Clean up include files.
 - Remove useless members of struct intel_soc_pmic_config.
 - Fix various coding style issues.
v5:
 - Add comment to describe what is done in _find_gpio_irq().
 - Remove i2c id. Only keep ACPI id and match it in _probe().
 - Further fix of coding style issues.
 - Add the GPIO patch, to merge it along with the MFD changes.

Zhu, Lejun (3):
  mfd: intel_soc_pmic: Core driver
  mfd: intel_soc_pmic: Crystal Cove support
  gpio: Add support for Intel Crystal Cove PMIC

 drivers/gpio/Kconfig               |  13 ++
 drivers/gpio/Makefile              |   1 +
 drivers/gpio/gpio-crystalcove.c    | 379 +
 drivers/mfd/Kconfig                |  12 ++
 drivers/mfd/Makefile               |   3 +
 drivers/mfd/intel_soc_pmic_core.c  | 168
 drivers/mfd/intel_soc_pmic_core.h  |  32
 drivers/mfd/intel_soc_pmic_crc.c   | 158
 include/linux/mfd/intel_soc_pmic.h |  30 +++
 9 files changed, 796 insertions(+)
 create mode 100644 drivers/gpio/gpio-crystalcove.c
 create mode 100644 drivers/mfd/intel_soc_pmic_core.c
 create mode 100644 drivers/mfd/intel_soc_pmic_core.h
 create mode 100644 drivers/mfd/intel_soc_pmic_crc.c
 create mode 100644 include/linux/mfd/intel_soc_pmic.h

--
1.8.3.2
[PATCH v5 1/3] mfd: intel_soc_pmic: Core driver
This patch provides the common I2C driver code for Intel SoC PMICs.

Signed-off-by: Yang, Bin
Signed-off-by: Zhu, Lejun
---
v2:
 - Use regmap instead of creating our own I2C read/write callbacks.
 - Add one missing EXPORT_SYMBOL.
 - Remove duplicate code and put them into pmic_regmap_load_from_hw.
v3:
 - Use regmap-irq. Remove our own pmic_regmap_* and IRQ handling code.
 - Remove intel_soc_pmic_dev() and intel_soc_pmic_set_pdata().
 - Use EXPORT_SYMBOL_GPL for exposed APIs.
 - Use gpiod interface instead of gpio numbers.
 - Remove redundant I2C IDs.
 - Use managed allocations.
v4:
 - Remove all exported APIs which are wrappers of regmap API, export the
   regmap in data structure instead.
 - Combine I2C and core .c files.
 - Clean up include files.
 - Use intel_soc_pmic_ prefix to replace pmic_ and intel_pmic_.
 - Fix various coding style issues.
v5:
 - Add comment to describe what is done in _find_gpio_irq().
 - Remove i2c id. Only keep ACPI id and match it in _probe().
 - Further fix of coding style issues.
---
 drivers/mfd/intel_soc_pmic_core.c  | 168 +
 drivers/mfd/intel_soc_pmic_core.h  |  32 +++
 include/linux/mfd/intel_soc_pmic.h |  30 +++
 3 files changed, 230 insertions(+)
 create mode 100644 drivers/mfd/intel_soc_pmic_core.c
 create mode 100644 drivers/mfd/intel_soc_pmic_core.h
 create mode 100644 include/linux/mfd/intel_soc_pmic.h

diff --git a/drivers/mfd/intel_soc_pmic_core.c b/drivers/mfd/intel_soc_pmic_core.c
new file mode 100644
index 000..7638b34
--- /dev/null
+++ b/drivers/mfd/intel_soc_pmic_core.c
@@ -0,0 +1,168 @@
+/*
+ * intel_soc_pmic_core.c - Intel SoC PMIC MFD Driver
+ *
+ * Copyright (C) 2013, 2014 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Author: Yang, Bin
+ * Author: Zhu, Lejun
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "intel_soc_pmic_core.h"
+
+/*
+ * On some boards the PMIC interrupt may come from a GPIO line.
+ * Try to lookup the ACPI table and see if such connection exists. If not,
+ * return -ENOENT and use the IRQ provided by I2C.
+ */
+static int intel_soc_pmic_find_gpio_irq(struct device *dev)
+{
+	struct gpio_desc *desc;
+	int irq;
+
+	desc = devm_gpiod_get_index(dev, "intel_soc_pmic", 0);
+	if (IS_ERR(desc))
+		return -ENOENT;
+
+	irq = gpiod_to_irq(desc);
+	if (irq < 0)
+		dev_warn(dev, "Can't get irq: %d\n", irq);
+
+	return irq;
+}
+
+static int intel_soc_pmic_i2c_probe(struct i2c_client *i2c,
+				    const struct i2c_device_id *i2c_id)
+{
+	struct device *dev = &i2c->dev;
+	const struct acpi_device_id *id;
+	struct intel_soc_pmic_config *config;
+	struct intel_soc_pmic *pmic;
+	int ret;
+	int irq;
+
+	id = acpi_match_device(dev->driver->acpi_match_table, dev);
+	if (!id || !id->driver_data)
+		return -ENODEV;
+
+	config = (struct intel_soc_pmic_config *)id->driver_data;
+
+	pmic = devm_kzalloc(dev, sizeof(*pmic), GFP_KERNEL);
+	dev_set_drvdata(dev, pmic);
+
+	pmic->regmap = devm_regmap_init_i2c(i2c, config->regmap_config);
+
+	irq = intel_soc_pmic_find_gpio_irq(dev);
+	pmic->irq = (irq < 0) ? i2c->irq : irq;
+
+	ret = regmap_add_irq_chip(pmic->regmap, pmic->irq,
+				  config->irq_flags | IRQF_ONESHOT,
+				  0, config->irq_chip,
+				  &pmic->irq_chip_data);
+	if (ret)
+		return ret;
+
+	ret = enable_irq_wake(pmic->irq);
+	if (ret)
+		dev_warn(dev, "Can't enable IRQ as wake source: %d\n", ret);
+
+	ret = mfd_add_devices(dev, -1, config->cell_dev,
+			      config->n_cell_devs, NULL, 0,
+			      regmap_irq_get_domain(pmic->irq_chip_data));
+	if (ret)
+		goto err_del_irq_chip;
+
+	return 0;
+
+err_del_irq_chip:
+	regmap_del_irq_chip(pmic->irq, pmic->irq_chip_data);
+	return ret;
+}
+
+static int intel_soc_pmic_i2c_remove(struct i2c_client *i2c)
+{
+	struct intel_soc_pmic *pmic = dev_get_drvdata(&i2c->dev);
+
+	regmap_del_irq_chip(pmic->irq, pmic->irq_chip_data);
+
+	mfd_remove_devices(&i2c->dev);
+
+	return 0;
+}
+
+static void intel_soc_pmic_shutdown(struct i2c_client *i2c)
+{
+	struct intel_soc_pmic *pmic = dev_get_drvdata(&i2c->dev);
+
+	disable_irq(pmic->irq);
+
+	return;
+}
+
+static int
[PATCH v5 3/3] gpio: Add support for Intel Crystal Cove PMIC
Devices based on Intel SoC products such as Baytrail have a Power
Management IC. In the PMIC there are subsystems for voltage regulation,
A/D conversion, GPIO and PWMs. The PMIC in Baytrail-T platform is
called Crystal Cove.

This patch adds support for the GPIO function in Crystal Cove.

Signed-off-by: Yang, Bin
Signed-off-by: Zhu, Lejun
Reviewed-by: Mika Westerberg
Reviewed-by: Alexandre Courbot
Reviewed-by: Linus Walleij
---
v5:
 - Fix the order of doing gpiochip_add() and gpiochip_irqchip_add().
 - Add it to this patch set, to merge it along with the MFD changes.
---
 drivers/gpio/Kconfig            |  13 ++
 drivers/gpio/Makefile           |   1 +
 drivers/gpio/gpio-crystalcove.c | 379
 3 files changed, 393 insertions(+)
 create mode 100644 drivers/gpio/gpio-crystalcove.c

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index a86c49a..fed08d9d 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -440,6 +440,19 @@ config GPIO_ARIZONA
 	help
 	  Support for GPIOs on Wolfson Arizona class devices.
 
+config GPIO_CRYSTAL_COVE
+	tristate "GPIO support for Crystal Cove PMIC"
+	depends on INTEL_SOC_PMIC
+	select GPIOLIB_IRQCHIP
+	help
+	  Support for GPIO pins on Crystal Cove PMIC.
+
+	  Say Yes if you have a Intel SoC based tablet with Crystal Cove PMIC
+	  inside.
+
+	  This driver can also be built as a module. If so, the module will be
+	  called gpio-crystalcove.
+
 config GPIO_LP3943
 	tristate "TI/National Semiconductor LP3943 GPIO expander"
 	depends on MFD_LP3943
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 6309aff..e6cd935 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_GPIO_BCM_KONA)	+= gpio-bcm-kona.o
 obj-$(CONFIG_GPIO_BT8XX)	+= gpio-bt8xx.o
 obj-$(CONFIG_GPIO_CLPS711X)	+= gpio-clps711x.o
 obj-$(CONFIG_GPIO_CS5535)	+= gpio-cs5535.o
+obj-$(CONFIG_GPIO_CRYSTAL_COVE)	+= gpio-crystalcove.o
 obj-$(CONFIG_GPIO_DA9052)	+= gpio-da9052.o
 obj-$(CONFIG_GPIO_DA9055)	+= gpio-da9055.o
 obj-$(CONFIG_GPIO_DAVINCI)	+= gpio-davinci.o
diff --git a/drivers/gpio/gpio-crystalcove.c b/drivers/gpio/gpio-crystalcove.c
new file mode 100644
index 000..5a98499
--- /dev/null
+++ b/drivers/gpio/gpio-crystalcove.c
@@ -0,0 +1,379 @@
+/*
+ * gpio-crystalcove.c - Intel Crystal Cove GPIO Driver
+ *
+ * Copyright (C) 2012, 2014 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Author: Yang, Bin
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define CRYSTALCOVE_GPIO_NUM	16
+
+#define UPDATE_IRQ_TYPE		BIT(0)
+#define UPDATE_IRQ_MASK		BIT(1)
+
+#define GPIO0IRQ		0x0b
+#define GPIO1IRQ		0x0c
+#define MGPIO0IRQS0		0x19
+#define MGPIO1IRQS0		0x1a
+#define MGPIO0IRQSX		0x1b
+#define MGPIO1IRQSX		0x1c
+#define GPIO0P0CTLO		0x2b
+#define GPIO0P0CTLI		0x33
+#define GPIO1P0CTLO		0x3b
+#define GPIO1P0CTLI		0x43
+
+#define CTLI_INTCNT_DIS		(0)
+#define CTLI_INTCNT_NE		(1 << 1)
+#define CTLI_INTCNT_PE		(2 << 1)
+#define CTLI_INTCNT_BE		(3 << 1)
+
+#define CTLO_DIR_IN		(0)
+#define CTLO_DIR_OUT		(1 << 5)
+
+#define CTLO_DRV_CMOS		(0)
+#define CTLO_DRV_OD		(1 << 4)
+
+#define CTLO_DRV_REN		(1 << 3)
+
+#define CTLO_RVAL_2KDW		(0)
+#define CTLO_RVAL_2KUP		(1 << 1)
+#define CTLO_RVAL_50KDW		(2 << 1)
+#define CTLO_RVAL_50KUP		(3 << 1)
+
+#define CTLO_INPUT_SET	(CTLO_DRV_CMOS | CTLO_DRV_REN | CTLO_RVAL_2KUP)
+#define CTLO_OUTPUT_SET	(CTLO_DIR_OUT | CTLO_INPUT_SET)
+
+enum ctrl_register {
+	CTRL_IN,
+	CTRL_OUT,
+};
+
+/**
+ * struct crystalcove_gpio - Crystal Cove GPIO controller
+ * @buslock: for bus lock/sync and unlock.
+ * @chip: the abstract gpio_chip structure.
+ * @regmap: the regmap from the parent device.
+ * @update: pending IRQ setting update, to be written to the chip upon unlock.
+ * @intcnt_value: the Interrupt Detect value to be written.
+ * @set_irq_mask: true if the IRQ mask needs to be set, false to clear.
+ */
+struct crystalcove_gpio {
+	struct mutex buslock; /* irq_bus_lock */
+	struct gpio_chip chip;
+	struct regmap *regmap;
+	int update;
+	int intcnt_value;
+	bool set_irq_mask;
+};
+
+static inline struct
Re: [PATCH RFC 2/2] rcu: Add Josh Triplett as designated reviewer
On Mon, 2014-06-02 at 22:10 -0700, Josh Triplett wrote:
> On Mon, Jun 02, 2014 at 08:11:36PM -0700, Joe Perches wrote:
[]
> "the appropriate mailing lists" often just mean LKML,
> for many patches I've sent; almost nobody sees a patch only sent to
> LKML, unless they specifically go looking for it.)

It's a good thing then that Andrew Morton is all-seeing.

> > the
> > quantity of names and addresses on the [0/n]
> > patch can easily exceed vger's 1024 byte
> > maximum header size limit.
>
> Is that the limit on the size of any *one* header, or on the size of all
> headers combined?

All headers

> > Another possibility is to add a new "--bcc_cmd"
> > to git send-email so that vger's header limit
> > can be worked around.
>
> That breaks the ability for the recipients to see replies.

If interested, the recipient knows where to look.

> > I had patches to git to do that awhile ago.
>
> That'd be handy; did you try submitting them upstream?  What reception
> did you get?

I don't recall.  It was a bit after I added the
cc-cmd stuff.  Nearly 7 years ago now.

> If you have objections to putting it directly in get_maintainer.pl,

I do.

> it'd
> be easy enough to make a secondary script (patch_add_maintainers?) to
> drive it.

Go for it.

cheers, Joe
Re: [PATCH] cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
On Mon, Jun 02, 2014 at 01:45:38PM +0530, Srivatsa S. Bhat wrote:
> On 06/02/2014 01:03 PM, Gautham R Shenoy wrote:
> > Hi,
> >
> > On Tue, May 27, 2014 at 02:23:38AM +0530, Srivatsa S. Bhat wrote:
> >
> > [..snip..]
> >>
> >> Experimental results:
> >>
> >> I ran a modified version of ebizzy (called 'sleeping-ebizzy') that sleeps in
> >> between its execution such that its total utilization can be a user-defined
> >> value, say 10% or 20% (higher the utilization specified, lesser the amount of
> >> sleeps injected). This ebizzy was run with a single-thread, tied to CPU 8.
> >>
> >> Behavior observed with tracing (sample taken from 40% utilization runs):
> >>
> >> Without patch:
> >> ~~
> >> kworker/8:2-12137 416.335742: cpu_frequency: state=2061000 cpu_id=8
> >> kworker/8:2-12137 416.335744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40753 416.345741: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >> kworker/8:2-12137 416.345744: cpu_frequency: state=4123000 cpu_id=8
> >> kworker/8:2-12137 416.345746: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40753 416.355738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >>
> >> -
> >>
> >> <...>-40753 416.402202: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
> >> <idle>-0 416.502130: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
> >> <...>-40753 416.505738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >> kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8
> >> kworker/8:2-12137 416.505741: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40753 416.515739: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >> kworker/8:2-12137 416.515742: cpu_frequency: state=4123000 cpu_id=8
> >> kworker/8:2-12137 416.515744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >>
> >> Observation: Ebizzy went idle at 416.402202, and started running again at
> >> 416.502130. But cpufreq noticed the long idle period, and dropped the frequency
> >> at 416.505739, only to increase it back again at 416.515742, realizing that the
> >> workload is in-fact CPU bound. Thus ebizzy needlessly ran at the lowest frequency
> >> for almost 13 milliseconds (almost 1 full sample period), and this pattern
> >> repeats on every sleep-wakeup. This could hurt latency-sensitive workloads quite
> >> a lot.
> >>
> >> With patch:
> >> ~~~
> >>
> >> kworker/8:2-29802 464.832535: cpu_frequency: state=2061000 cpu_id=8
> >>
> >> -
> >>
> >> kworker/8:2-29802 464.962538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40738 464.972533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >> kworker/8:2-29802 464.972536: cpu_frequency: state=4123000 cpu_id=8
> >> kworker/8:2-29802 464.972538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40738 464.982531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >>
> >> -
> >>
> >> kworker/8:2-29802 465.022533: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40738 465.032531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >> kworker/8:2-29802 465.032532: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40738 465.035797: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8
> >> <idle>-0 465.240178: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy
> >> <...>-40738 465.242533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >> kworker/8:2-29802 465.242535: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy
> >> <...>-40738 465.252531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2
> >>
> >
> > Have the log entries emitted by kworker/8 to report about the
> > cpu_frequency states been snipped out in the entries post the
> > "465.032531" mark ?
> >
>
> No, why? Anything looks odd at that point?

I was expecting to see log messages of the following kind after a
kworker thread is scheduled in.

"kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8"

>
> Note that the CPU went idle from 465.035797 to 465.240178, and hence cpufreq's
> deferrable timer didn't fire (and hence kworker didn't run). But once the CPU
> became busy again at 465.240178, the kworker got scheduled on the CPU within
> 2 ms (at 465.242533).

Yes, but the logs don't show the frequency that
Re: [PATCH RFC 2/2] rcu: Add Josh Triplett as designated reviewer
On Mon, Jun 02, 2014 at 08:11:36PM -0700, Joe Perches wrote:
> On Mon, 2014-06-02 at 18:51 -0700, Josh Triplett wrote:
> > git send-email can invoke 'scripts/get_maintainer.pl --no-rolestats'
> > directly via --to-cmd or --cc-cmd; that works fine as long as you don't
> > have a cover letter.
> >
> > Depending on the system I'm running on, and whether it's more convenient
> > to invoke git-send-email or to edit patch mails and send them with 'mutt
> > -H', I have a shell pipeline which invokes get_maintainer.pl on an
> > entire patch series, collects all the email addresses it returns, and
> > inserts them all into each mail as CCs.  (That way, when I send a
> > cross-subsystem patch series, I don't get a pile of maintainers confused
> > that they only received a couple of the numbered patches.)  One example:
>
> I think that as long as the appropriate mailing lists receive
> the cover letter, any real maintainer won't be confused.

Not so much "confused" as "annoyed"; I've had people specifically
complain about getting one or two patches but not the cover letter, for
instance.  (And "the appropriate mailing lists" often just mean LKML,
for many patches I've sent; almost nobody sees a patch only sent to
LKML, unless they specifically go looking for it.)

> > { echo -n "To: " ; for x in *.patch ; do scripts/get_maintainer.pl --no-rolestats < $x | fgrep -v j...@joshtriplett.org ; done | sort -u | sed 's/$/, /;$s/, $//' | tr -d '\n' ; echo ; } | sed -i '/^From:/r/dev/stdin'
> >
> > Personally, I'd find it handy if one of the following happened:
> >
> > - git send-email (and ideally also git format-patch) grew an option to
> >   collect *all* the to-cmd and cc-cmd output from each patch and apply
> >   it to every patch (including the cover letter).
>
> The biggest issue with doing that is the
> quantity of names and addresses on the [0/n]
> patch can easily exceed vger's 1024 byte
> maximum header size limit.

Is that the limit on the size of any *one* header, or on the size of all
headers combined?  If the former, there's an easy way around that.  And
if the latter, that seems absurdly small.

Might also help to strip out the names and just insert the addresses;
annoying, but a handy workaround to make it likely that a sensibly sized
patch series won't hit the limit.

> Another possibility is to add a new "--bcc_cmd"
> to git send-email so that vger's header limit
> can be worked around.

That breaks the ability for the recipients to see replies.

> I had patches to git to do that awhile ago.

That'd be handy; did you try submitting them upstream?  What reception
did you get?

> > - get_maintainer.pl accepted multiple patchfile names and output the
> >   union of the results.  Ideally, get_maintainer.pl would also have a -i
> >   option to edit the patch files and insert the addresses in the mail
> >   headers.
>
> Why would get_maintainer.pl have any option like that?
>
> Tools for uses.  Scripting.
> Aren't we good at that sort of thing?

Yes, we're good at scripting; we put scripts many people might wish to
use in a scripts/ directory, such as the extremely handy script
get_maintainer.pl that does that sort of thing with patches.  And since
it currently does nothing useful at all with cover letters, it'd be nice
to change that.

If you have objections to putting it directly in get_maintainer.pl, it'd
be easy enough to make a secondary script (patch_add_maintainers?) to
drive it.

- Josh Triplett
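The workaround mentioned above, stripping display names and keeping only the addresses, can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper names address_only() and cc_header_len() are hypothetical, the entries are assumed to look like "Display Name <user@host>" or a bare address, and the 1024-byte figure is simply the limit quoted in this thread (whether it applies to one header or to all headers combined is the question being asked there).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Limit quoted in the thread for vger; its exact scope is an open question. */
#define VGER_HEADER_LIMIT 1024

/*
 * Hypothetical helper: copy only the address part of
 * "Display Name <user@host>" into out. An entry without angle brackets
 * is copied unchanged. Returns the number of bytes written (excluding
 * the terminating NUL), or 0 if the result does not fit in outsz.
 */
static size_t address_only(const char *entry, char *out, size_t outsz)
{
	const char *start, *end;
	size_t len;

	assert(entry && out);

	start = strchr(entry, '<');
	end = start ? strchr(start, '>') : NULL;
	if (start && end) {
		start++;			/* skip '<' */
		len = (size_t)(end - start);
	} else {
		start = entry;
		len = strlen(entry);
	}

	if (len + 1 > outsz)
		return 0;
	memcpy(out, start, len);
	out[len] = '\0';
	return len;
}

/* Length of a "Cc: a, b, c" header built from n address-only entries. */
static size_t cc_header_len(const char *const *entries, size_t n)
{
	char buf[256];
	size_t total = strlen("Cc: ");
	size_t i;

	for (i = 0; i < n; i++) {
		total += address_only(entries[i], buf, sizeof(buf));
		if (i + 1 < n)
			total += 2;		/* ", " separator */
	}
	return total;
}
```

A caller could compare cc_header_len() against VGER_HEADER_LIMIT before sending and fall back to some other scheme when the combined list is too long; the addresses used to exercise it would be placeholders such as jane@example.org.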
[PATCH v3 1/2] xen: vnuma for pv guests
Issues Xen hypercall subop XENMEM_get_vnumainfo and sets the
NUMA topology, otherwise sets dummy NUMA node and prevents
numa_init from calling other numa initializers as they don't
work with pv guests.

Signed-off-by: Elena Ufimtseva
---
 arch/x86/include/asm/xen/vnuma.h |  10
 arch/x86/mm/numa.c               |   3 +
 arch/x86/xen/Makefile            |   1 +
 arch/x86/xen/setup.c             |   6 +-
 arch/x86/xen/vnuma.c             | 121 ++
 include/xen/interface/memory.h   |  50
 6 files changed, 190 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/xen/vnuma.h
 create mode 100644 arch/x86/xen/vnuma.c

diff --git a/arch/x86/include/asm/xen/vnuma.h b/arch/x86/include/asm/xen/vnuma.h
new file mode 100644
index 000..8c8b098
--- /dev/null
+++ b/arch/x86/include/asm/xen/vnuma.h
@@ -0,0 +1,10 @@
+#ifndef _ASM_X86_VNUMA_H
+#define _ASM_X86_VNUMA_H
+
+#ifdef CONFIG_XEN
+int xen_numa_init(void);
+#else
+static inline int xen_numa_init(void) { return -1; };
+#endif
+
+#endif /* _ASM_X86_VNUMA_H */
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 1d045f9..37a9c84 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -18,6 +18,7 @@
 #include
 #include
 
+#include "asm/xen/vnuma.h"
 #include "numa_internal.h"
 
 int __initdata numa_off;
@@ -687,6 +688,8 @@ static int __init dummy_numa_init(void)
 void __init x86_numa_init(void)
 {
 	if (!numa_off) {
+		if (!numa_init(xen_numa_init))
+			return;
 #ifdef CONFIG_ACPI_NUMA
 		if (!numa_init(x86_acpi_numa_init))
 			return;
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 96ab2c0..185ec9b 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -22,3 +22,4 @@ obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= spinlock.o
 obj-$(CONFIG_XEN_DEBUG_FS)	+= debugfs.o
 obj-$(CONFIG_XEN_DOM0)		+= apic.o vga.o
 obj-$(CONFIG_SWIOTLB_XEN)	+= pci-swiotlb-xen.o
+obj-$(CONFIG_NUMA)		+= vnuma.o
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 0982233..0235f19 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -622,6 +623,9 @@ void __init xen_arch_setup(void)
 	WARN_ON(xen_set_default_idle());
 	fiddle_vdso();
 #ifdef CONFIG_NUMA
-	numa_off = 1;
+	if (xen_initial_domain())
+		numa_off = 1;
+	else
+		numa_off = 0;
 #endif
 }
diff --git a/arch/x86/xen/vnuma.c b/arch/x86/xen/vnuma.c
new file mode 100644
index 000..a02f9c6
--- /dev/null
+++ b/arch/x86/xen/vnuma.c
@@ -0,0 +1,121 @@
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * Called from numa_init if numa_off = 0;
+ */
+int __init xen_numa_init(void)
+{
+	unsigned int i, j, idx;
+	unsigned int cpu, pcpus, nr_nodes, nr_cpus;
+	unsigned int *vdistance, *cpu_to_node;
+	unsigned long mem_size, dist_size, cpu_to_node_size;
+	struct vmemrange *vmem;
+	u64 physm, physd, physc;
+	int rc;
+
+	struct vnuma_topology_info numa_topo = {
+		.domid = DOMID_SELF
+	};
+
+	rc = -EINVAL;
+	physm = physd = physc = 0;
+
+	/* For now only PV guests are supported */
+	if (!xen_pv_domain())
+		return rc;
+
+	/* get the number of nodes for allocation of memblocks */
+	pcpus = num_possible_cpus();
+	nr_cpus = setup_max_cpus < pcpus ? setup_max_cpus : pcpus;
+
+	/* support for nodes with at least one cpu */
+	nr_nodes = nr_cpus;
+
+	/*
+	 * Allocate arrays for nr_cpus/nr_nodes sizes and let
+	 * hypervisor know that these are the boundaries. Partial
+	 * copy is not allowed and hypercall will fail.
+	 */
+
+	mem_size = nr_nodes * sizeof(struct vmemrange);
+	dist_size = nr_nodes * nr_nodes * sizeof(*numa_topo.distance.h);
+	cpu_to_node_size = nr_cpus * sizeof(*numa_topo.cpu_to_node.h);
+
+	physm = memblock_alloc(mem_size, PAGE_SIZE);
+	physd = memblock_alloc(dist_size, PAGE_SIZE);
+	physc = memblock_alloc(cpu_to_node_size, PAGE_SIZE);
+
+	if (!physm || !physd || !physc)
+		goto out;
+
+	vmem = __va(physm);
+	vdistance = __va(physd);
+	cpu_to_node = __va(physc);
+
+	numa_topo.nr_nodes = nr_nodes;
+	numa_topo.nr_cpus = nr_cpus;
+
+	set_xen_guest_handle(numa_topo.memrange.h, vmem);
+	set_xen_guest_handle(numa_topo.distance.h, vdistance);
+	set_xen_guest_handle(numa_topo.cpu_to_node.h, cpu_to_node);
+
+	if (HYPERVISOR_memory_op(XENMEM_get_vnuma_info, &numa_topo) < 0)
+		goto out;
+
+	/*
+	 * NUMA nodes memory ranges are in pfns, constructed and
+	 * aligned based on e820 ram domain map.
+	 */
+	for (i = 0; i < nr_nodes; i++) {
+
[PATCH v3 0/2] xen: vnuma for PV guests
The patchset introduces vnuma to paravirtualized Xen guests running
as domU. A Xen subop hypercall is used to retrieve vnuma topology
information. Based on the topology retrieved from Xen, the number of
NUMA nodes, memory ranges, distance table and cpumask are set.
If initialization fails, a 'dummy' node is set and the nodemask is
unset.

Patchsets for Xen and linux:
git://gitorious.org/xenvnuma_v5/linuxvnuma_v5.git
https://git.gitorious.org/xenvnuma_v5/linuxvnuma_v5.git

Xen patchset is available at:
git://gitorious.org/xenvnuma_v5/xenvnuma_v5.git
https://git.gitorious.org/xenvnuma_v5/xenvnuma_v5.git

Example of vnuma enabled pv domain dmesg:

[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00] node 0: [mem 0x1000-0x0009]
[0.00] node 0: [mem 0x0010-0x]
[0.00] node 1: [mem 0x1-0x1]
[0.00] node 2: [mem 0x2-0x2]
[0.00] node 3: [mem 0x3-0x3]
[0.00] On node 0 totalpages: 1048479
[0.00] DMA zone: 56 pages used for memmap
[0.00] DMA zone: 21 pages reserved
[0.00] DMA zone: 3999 pages, LIFO batch:0
[0.00] DMA32 zone: 14280 pages used for memmap
[0.00] DMA32 zone: 1044480 pages, LIFO batch:31
[0.00] On node 1 totalpages: 1048576
[0.00] Normal zone: 14336 pages used for memmap
[0.00] Normal zone: 1048576 pages, LIFO batch:31
[0.00] On node 2 totalpages: 1048576
[0.00] Normal zone: 14336 pages used for memmap
[0.00] Normal zone: 1048576 pages, LIFO batch:31
[0.00] On node 3 totalpages: 1048576
[0.00] Normal zone: 14336 pages used for memmap
[0.00] Normal zone: 1048576 pages, LIFO batch:31
[0.00] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
[0.00] smpboot: Allowing 4 CPUs, 0 hotplug CPUs
[0.00] No local APIC present
[0.00] APIC: disable apic facility
[0.00] APIC: switched to apic NOOP
[0.00] nr_irqs_gsi: 16
[0.00] PM: Registered nosave memory: [mem 0x000a-0x000f]
[0.00] e820: cannot find a gap in the 32bit address range
[0.00] e820: PCI devices with unassigned 32bit BARs may break!
[0.00] e820: [mem 0x40010-0x4004f] available for PCI devices
[0.00] Booting paravirtualized kernel on Xen
[0.00] Xen version: 4.4-unstable (preserve-AD)
[0.00] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:4 nr_node_ids:4
[0.00] PERCPU: Embedded 28 pages/cpu @8800ffc0 s85376 r8192 d21120 u2097152
[0.00] pcpu-alloc: s85376 r8192 d21120 u2097152 alloc=1*2097152

numactl output:

root@heatpipe:~# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0
node 0 size: 4031 MB
node 0 free: 3997 MB
node 1 cpus: 1
node 1 size: 4039 MB
node 1 free: 4022 MB
node 2 cpus: 2
node 2 size: 4039 MB
node 2 free: 4023 MB
node 3 cpus: 3
node 3 size: 3975 MB
node 3 free: 3963 MB
node distances:
node   0   1   2   3
  0:  10  20  20  20
  1:  20  10  20  20
  2:  20  20  10  20
  3:  20  20  20  10

Elena Ufimtseva (1):
      Xen vnuma introduction.

 arch/x86/include/asm/xen/vnuma.h |   10
 arch/x86/mm/numa.c               |    3 +
 arch/x86/xen/Makefile            |    1 +
 arch/x86/xen/setup.c             |    6 +-
 arch/x86/xen/vnuma.c             |  121 ++
 include/xen/interface/memory.h   |   50
 6 files changed, 190 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/xen/vnuma.h
 create mode 100644 arch/x86/xen/vnuma.c

--
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH V2] scripts/checkpatch.pl: device_initcall is not the only __initcall substitute
This patch adds a pointer to include/linux/init.h so that users can find
the appropriate initcall function to replace the obsolete __initcall().

Cc: Andy Whitcroft
Cc: Joe Perches
Cc: Andrew Morton
Signed-off-by: Fabian Frederick
---
V2: s/accurate/appropriate

 scripts/checkpatch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 34eb216..ee28107 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -4443,10 +4443,10 @@ sub process {
 				"$1 is obsolete, use k$3 instead\n" . $herecurr);
 		}
 
-# check for __initcall(), use device_initcall() explicitly please
+# check for __initcall(), use device_initcall() explicitly or more appropriate function please
 		if ($line =~ /^.\s*__initcall\s*\(/) {
 			WARN("USE_DEVICE_INITCALL",
-			     "please use device_initcall() instead of __initcall()\n" . $herecurr);
+			     "please use device_initcall() or more appropriate function instead of __initcall() (see include/linux/init.h)\n" . $herecurr);
 		}
 
 # check for various ops structs, ensure they are const.
--
1.8.4.5
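For readers who do not read Perl regexes fluently, here is a rough C
rendering of what the pattern /^.\s*__initcall\s*\(/ in the hunk above
accepts. It is only an illustration: the function name
uses_plain_initcall() is made up for this note and is not part of
checkpatch.pl.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a diff line calls plain __initcall().
 * The first character is the diff marker ('+', '-' or ' '), matched by '.'
 * in the Perl pattern.
 */
static int uses_plain_initcall(const char *line)
{
	static const char name[] = "__initcall";
	const char *p = line;

	assert(line != NULL);
	if (*p == '\0')
		return 0;
	p++;					/* '.'   : skip the diff marker */
	while (isspace((unsigned char)*p))	/* '\s*' : optional indentation */
		p++;
	if (strncmp(p, name, sizeof(name) - 1) != 0)
		return 0;
	p += sizeof(name) - 1;
	while (isspace((unsigned char)*p))	/* '\s*' : optional space before '(' */
		p++;
	return *p == '(';
}
```

Because the match is anchored at the start of the line, device_initcall()
and friends do not trigger the warning, and neither does __initcall
mentioned later in a line (for example inside a comment).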
Re: [PATCH] blk-mq: fix sparse warning on missed __percpu annotation
On 2014-06-02 21:24, Ming Lei wrote:
> 'struct blk_mq_ctx' is __percpu, so add the annotation
> and fix the sparse warning reported from Fengguang:
>
> [block:for-linus 2/3] block/blk-mq.h:75:16: sparse: incorrect
> type in initializer (different address spaces)

Thanks, applied.

--
Jens Axboe
Re: [PATCH] RFT: pinctrl: spear: switch plgpio to irqchip helpers
On Thu, May 29, 2014 at 2:55 PM, Linus Walleij wrote:
> This switches the SPEAr PLGPIO driver over to using the irqchip
> helpers.
>
> As part of this effort, also get rid of the strange irq_base
> calculation and failure to use d->hwirq for obtaining a local
> irqchip offset.
>
> Cc: Viresh Kumar
> Cc: Shiraz Hashim
> Cc: spear-de...@list.st.com
> Signed-off-by: Linus Walleij
> ---
> SPEAr folks: please make sure to test this, it is part of an
> important GPIO refactoring round. If it doesn't work: please do
> an honest attempt at troubleshooting.

Neither Shiraz nor I can test it anymore, we both switched jobs :)
And I'm not sure who else can do it.. @Pratyush/Mohit ?
Re: [PATCH] driver core: platform: add device binding path 'driver_override'
On Mon, Jun 02, 2014 at 07:42:58PM -0500, Kim Phillips wrote:
> Needed by platform device drivers, such as the upcoming
> vfio-platform driver, in order to bypass the existing OF, ACPI,
> id_table and name string matches, and successfully be able to be
> bound to any device, like so:
>
> echo vfio-platform > /sys/bus/platform/devices/fff51000.ethernet/driver_override
> echo fff51000.ethernet > /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
> echo fff51000.ethernet > /sys/bus/platform/drivers_probe
>
> This mimics "PCI: Introduce new device binding path using
> pci_dev.driver_override", which is an interface enhancement
> for more deterministic PCI device binding, e.g., when in the
> presence of hotplug.
>
> Reviewed-by: Alex Williamson
> Reviewed-by: Alexander Graf
> Reviewed-by: Stuart Yoder
> Signed-off-by: Kim Phillips
> ---
> Greg,
>
> This is largely identical to the PCI version of the same that has
> been accepted for v3.16 and ack'd by you:
>
> https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009674.html
>
> and applied to Bjorn Helgaas' PCI tree:
>
> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization=782a985d7af26db39e86070d28f987cad21313c0
>
> You are the platform driver core maintainer: can you apply this to
> your driver-core tree now?

Yes, I will after this merge window ends, it's too late for 3.16-rc1
with the window opening up a week early, sorry.

greg k-h
[PATCH ftrace/core 0/2] ftrace: Introduce the new file "saved_cmdlines_size"
Hi Steven,

This patch set introduces the new file "saved_cmdlines_size" for increasing
the number of saved cmdlines. Current saved_cmdlines can store just 128
command names and PIDs, so process names are often lost and shown as <...>
when we read trace data. If the process still exists, we can get the name by
using the ps command. However, if the process has already exited, we cannot
get the name.

To solve this issue, we introduce the new file "saved_cmdlines_size" to
expand the max number of saved command line names. This file is very simple.
If we write a number to saved_cmdlines_size, that number of command names
will be stored. And, if we read the file, we get the current maximum number
of command names. The default number is 128, which is the current default,
so this patch does not change the memory usage of saved_cmdlines when we
boot the kernel.

I found a bug in current ftrace, so I fixed the bug before introducing the
saved_cmdlines_size file (2nd patch). Note that the 2nd patch depends on the
1st patch.

Thanks!

Changes in V2:
[2/2]
- Fix a racing problem of savedcmd between saved_cmdlines I/F and
  nr_saved_cmdlines I/F. If one reads saved_cmdlines and writes a value to
  nr_saved_cmdlines at the same time, then the write returns -EBUSY.

  [terminal 1] Read saved_cmdlines
  # while true; do cat saved_cmdlines > /dev/null; done;

  [terminal 2] Write 1024 to nr_saved_cmdlines
  # while true; do echo 1024 > nr_saved_cmdlines; done;
  -bash: echo: write error: Device or resource busy
  -bash: echo: write error: Device or resource busy
  -bash: echo: write error: Device or resource busy

Changes in V3:
[1/2]
- Introduce this patch
[2/2]
- Change 'nr_saved_cmdlines' to 'saved_cmdlines_size'
- Delete two helper functions (trace_init_savedcmd() and
  trace_create_and_init_saved_cmd())
- Rebase this patch for current ftrace/core tree
- Remove reader member in saved_cmdlines_buffer structure because the racing
  problem does not occur even if we don't use the member in this patch
- Fix several typos

---

Yoshihiro YUNOMAE (2):
      [BUGFIX] ftrace: Avoid panic when allocation of max_buffer is failed
      ftrace: Introduce saved_cmdlines_size file

 kernel/trace/trace.c | 205 --
 1 file changed, 179 insertions(+), 26 deletions(-)

--
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae...@hitachi.com
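To make the cover letter's idea concrete, the following is a small
user-space model of a pid-to-command-name cache whose number of slots is
chosen at creation time. It is only a sketch under assumptions: the names
cmd_cache, cache_create(), cache_save() and cache_find() are invented for
this note, the round-robin eviction is a simplification, and none of it is
the kernel code from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define COMM_LEN 16		/* mirrors TASK_COMM_LEN */
#define PID_MAX  32768		/* mirrors PID_MAX_DEFAULT */
#define NO_MAP   ((unsigned)-1)

/* Hypothetical model of a resizable pid-comm list. */
struct cmd_cache {
	unsigned pid_to_slot[PID_MAX + 1];
	unsigned *slot_to_pid;	/* one entry per slot */
	char *names;		/* size * COMM_LEN bytes */
	unsigned size;		/* number of slots (the "saved_cmdlines_size") */
	unsigned next;		/* next slot to fill or recycle */
};

static void cache_destroy(struct cmd_cache *c)
{
	if (!c)
		return;
	free(c->slot_to_pid);
	free(c->names);
	free(c);
}

static struct cmd_cache *cache_create(unsigned size)
{
	struct cmd_cache *c;
	unsigned i;

	assert(size > 0 && size <= PID_MAX);
	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->slot_to_pid = malloc(size * sizeof(*c->slot_to_pid));
	c->names = calloc(size, COMM_LEN);
	if (!c->slot_to_pid || !c->names) {
		cache_destroy(c);
		return NULL;
	}
	for (i = 0; i <= PID_MAX; i++)
		c->pid_to_slot[i] = NO_MAP;
	for (i = 0; i < size; i++)
		c->slot_to_pid[i] = NO_MAP;
	c->size = size;
	return c;
}

/* Record a name for pid; when all slots are used, recycle the oldest one. */
static void cache_save(struct cmd_cache *c, unsigned pid, const char *comm)
{
	unsigned slot, old;

	if (pid > PID_MAX)
		return;
	slot = c->pid_to_slot[pid];
	if (slot == NO_MAP) {
		slot = c->next;
		c->next = (c->next + 1) % c->size;
		old = c->slot_to_pid[slot];
		if (old != NO_MAP)
			c->pid_to_slot[old] = NO_MAP;	/* old name is lost */
		c->slot_to_pid[slot] = pid;
		c->pid_to_slot[pid] = slot;
	}
	strncpy(&c->names[slot * COMM_LEN], comm, COMM_LEN - 1);
	c->names[slot * COMM_LEN + COMM_LEN - 1] = '\0';
}

/* Look up a name; "<...>" is what a trace reader sees for a lost entry. */
static const char *cache_find(const struct cmd_cache *c, unsigned pid)
{
	unsigned slot;

	if (pid > PID_MAX)
		return "<...>";
	slot = c->pid_to_slot[pid];
	return slot == NO_MAP ? "<...>" : &c->names[slot * COMM_LEN];
}
```

With only two slots, saving a third pid pushes out the first one, which
then reads back as <...>. A larger size simply delays that point, which is
the effect the new file is meant to give.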
[PATCH ftrace/core 1/2] [BUGFIX] ftrace: Avoid panic when allocation of max_buffer is failed
When allocation of max_buffer fails, the kernel frees
tr->trace_buffer.data per CPU and returns -ENOMEM in
allocate_trace_buffers(). However, tracer_alloc_buffers(), which calls
allocate_trace_buffers(), also frees the per-CPU data when
allocate_trace_buffers() returns -ENOMEM. Therefore, the allocation
failure induces a double free.

For the out_free_mask path in tracer_alloc_buffers(),
global_trace.trace_buffer.data and global_trace.max_buffer.data are not
allocated yet, so free_percpu of those is not needed.

Signed-off-by: Yoshihiro YUNOMAE
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: linux-kernel@vger.kernel.org
---
 kernel/trace/trace.c |    4
 1 file changed, 4 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 626dbfd..135af32 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6671,10 +6671,6 @@ __init static int tracer_alloc_buffers(void)
 out_free_temp_buffer:
 	ring_buffer_free(temp_buffer);
 out_free_cpumask:
-	free_percpu(global_trace.trace_buffer.data);
-#ifdef CONFIG_TRACER_MAX_TRACE
-	free_percpu(global_trace.max_buffer.data);
-#endif
 	free_cpumask_var(global_trace.tracing_cpumask);
 out_free_buffer_mask:
 	free_cpumask_var(tracing_buffer_mask);
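The ownership rule behind this fix can be shown with a small user-space
sketch: the callee undoes its own partial allocations on failure, and the
caller's error path only releases what the caller itself set up. All names
below (struct tracer, alloc_buffers(), setup(), teardown(), and the
fail_second switch) are hypothetical stand-ins for illustration, not the
kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tracer state. */
struct tracer {
	void *cpumask;		/* owned by the caller, setup() */
	void *trace_data;	/* owned by the callee, alloc_buffers() */
	void *max_data;		/* owned by the callee, alloc_buffers() */
};

/*
 * Callee: allocates both data areas. On failure it frees what it already
 * allocated and clears the pointers, so nothing is left for the caller to
 * free. fail_second simulates the second allocation failing.
 */
static int alloc_buffers(struct tracer *t, int fail_second)
{
	t->trace_data = malloc(64);
	if (!t->trace_data)
		return -ENOMEM;

	t->max_data = fail_second ? NULL : malloc(64);
	if (!t->max_data) {
		free(t->trace_data);
		t->trace_data = NULL;
		return -ENOMEM;
	}
	return 0;
}

/* Caller: its error path releases the cpumask only, never the data areas. */
static int setup(struct tracer *t, int fail_second)
{
	int ret;

	assert(t != NULL);
	t->cpumask = malloc(16);
	if (!t->cpumask)
		return -ENOMEM;

	ret = alloc_buffers(t, fail_second);
	if (ret < 0) {
		free(t->cpumask);
		t->cpumask = NULL;
		return ret;	/* no free of trace_data here: avoids double free */
	}
	return 0;
}

static void teardown(struct tracer *t)
{
	free(t->max_data);
	free(t->trace_data);
	free(t->cpumask);
	t->max_data = t->trace_data = t->cpumask = NULL;
}
```

If setup() also freed trace_data after a failed alloc_buffers(), the same
pointer would be released twice, which is the pattern the patch removes.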
[PATCH ftrace/core 2/2] ftrace: Introduce saved_cmdlines_size file
Introduce saved_cmdlines_size file for changing the size of the pid-comm
list. saved_cmdlines can currently store 128 command names using
SAVED_CMDLINES, but names of processes that no longer exist are often lost
from saved_cmdlines when we read trace data. So, by introducing the
saved_cmdlines_size file, the rule of storing 128 command names is changed
to a user-defined number of command names.

When we write a value to saved_cmdlines_size, that number of command names
will be stored in the pid-comm list:

# echo 1024 > /sys/kernel/debug/tracing/saved_cmdlines_size

Here, 1024 command names are stored. The default number is 128 and the
maximum number is PID_MAX_DEFAULT (=32768 if CONFIG_BASE_SMALL is not set).
So, if we want to avoid losing command names, we need to set
saved_cmdlines_size to 32768.

We can read the maximum number of the list:

# cat /sys/kernel/debug/tracing/saved_cmdlines_size
128

Changes in V2:
- Fix a racing problem of savedcmd between saved_cmdlines I/F and
  nr_saved_cmdlines I/F. If one reads saved_cmdlines and writes a value to
  nr_saved_cmdlines at the same time, then the write returns -EBUSY.
[terminal 1] Read saved_cmdlines # while true; do cat saved_cmdlines > /dev/null; done; [terminal 2] Write 1024 to nr_saved_cmdlines # while true; do echo 1024 > nr_saved_cmdlines; done; -bash: echo: write error: Device or resource busy -bash: echo: write error: Device or resource busy -bash: echo: write error: Device or resource busy Changes in V3: - Change 'nr_saved_cmdlines' to 'saved_cmdlines_size' - Delete two helper functions(trace_init_savedcmd() and trace_create_and_init_saved_cmd()) - Rebase this patch for current ftrace/core tree - Remove reader member in saved_cmdlines_buffer structure because the racing problem does not occur even if we don't use the member in this patch - Fix several typos Signed-off-by: Yoshihiro YUNOMAE Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org --- kernel/trace/trace.c | 203 -- 1 file changed, 180 insertions(+), 23 deletions(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 135af32..473eb68 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1285,22 +1285,82 @@ void tracing_reset_all_online_cpus(void) } } -#define SAVED_CMDLINES 128 +#define SAVED_CMDLINES_DEFAULT 128 #define NO_CMDLINE_MAP UINT_MAX -static unsigned map_pid_to_cmdline[PID_MAX_DEFAULT+1]; -static unsigned map_cmdline_to_pid[SAVED_CMDLINES]; -static char saved_cmdlines[SAVED_CMDLINES][TASK_COMM_LEN]; -static int cmdline_idx; static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED; +struct saved_cmdlines_buffer { + unsigned map_pid_to_cmdline[PID_MAX_DEFAULT+1]; + unsigned *map_cmdline_to_pid; + unsigned cmdline_num; + int cmdline_idx; + char *saved_cmdlines; +}; +static struct saved_cmdlines_buffer *savedcmd; /* temporary disable recording */ static atomic_t trace_record_cmdline_disabled __read_mostly; -static void trace_init_cmdlines(void) +static inline char *get_saved_cmdlines(int idx) +{ + return >saved_cmdlines[idx * TASK_COMM_LEN]; +} + +static inline void set_cmdline(int 
idx, const char *cmdline) +{ + memcpy(get_saved_cmdlines(idx), cmdline, TASK_COMM_LEN); +} + +static int allocate_cmdlines_buffer(unsigned int val, + struct saved_cmdlines_buffer *s) { - memset(_pid_to_cmdline, NO_CMDLINE_MAP, sizeof(map_pid_to_cmdline)); - memset(_cmdline_to_pid, NO_CMDLINE_MAP, sizeof(map_cmdline_to_pid)); - cmdline_idx = 0; + s->map_cmdline_to_pid = kmalloc(val * sizeof(unsigned), GFP_KERNEL); + if (!s->map_cmdline_to_pid) + goto out; + + s->saved_cmdlines = kmalloc(val * TASK_COMM_LEN, GFP_KERNEL); + if (!s->saved_cmdlines) + goto out_free_map_cmdline_to_pid; + + return 0; + +out_free_map_cmdline_to_pid: + kfree(s->map_cmdline_to_pid); +out: + return -ENOMEM; +} + +static void trace_init_cmdlines_buffer(unsigned int val, + struct saved_cmdlines_buffer *s) +{ + s->cmdline_idx = 0; + s->cmdline_num = val; + memset(>map_pid_to_cmdline, NO_CMDLINE_MAP, + sizeof(s->map_pid_to_cmdline)); + memset(s->map_cmdline_to_pid, NO_CMDLINE_MAP, + val * sizeof(*s->map_cmdline_to_pid)); +} + +static int trace_create_savedcmd(void) +{ + int ret; + + savedcmd = kmalloc(sizeof(struct saved_cmdlines_buffer), GFP_KERNEL); + if (!savedcmd) + goto out; + + ret = allocate_cmdlines_buffer(SAVED_CMDLINES_DEFAULT, savedcmd); + if (ret < 0) + goto out_free; + + trace_init_cmdlines_buffer(SAVED_CMDLINES_DEFAULT, savedcmd); + + return 0; + +out_free: + kfree(savedcmd); + savedcmd = NULL; +out: + return -ENOMEM; } int
Re: [PATCH] ARM: Remove ARCH_HAS_CPUFREQ config option
On Tue, Jun 3, 2014 at 3:37 AM, Stephen Boyd wrote:
> This config exists entirely to hide the cpufreq menu from the
> kernel configuration unless a platform has selected it. Nothing
> is actually built if this config is 'Y' and it just leads to
> more patches that add a select under a platform Kconfig so that
> some other CPUfreq option can be chosen. Let's remove the
> option so that we can always enable CPUfreq drivers on ARM
> platforms.
>
> Signed-off-by: Stephen Boyd
> ---
>  arch/arm/Kconfig               | 18 --
>  arch/arm/mach-davinci/Kconfig  |  1 -
>  arch/arm/mach-highbank/Kconfig |  1 -
>  arch/arm/mach-imx/Kconfig      |  3 ---
>  arch/arm/mach-mvebu/Kconfig    |  1 -
>  arch/arm/mach-omap2/Kconfig    |  1 -
>  arch/arm/mach-shmobile/Kconfig |  2 --
>  arch/arm/mach-spear/Kconfig    |  1 -
>  arch/arm/mach-tegra/Kconfig    |  1 -
>  arch/arm/mach-ux500/Kconfig    |  1 -
>  arch/arm/mach-vexpress/Kconfig |  1 -
>  arch/arm/mach-vt8500/Kconfig   |  1 -
>  arch/arm/mach-zynq/Kconfig     |  1 -
>  13 files changed, 33 deletions(-)

Acked-by: Viresh Kumar

While you are at it, please see if you can also fix unicore as well.
Re: 3.15 regression: wrong cgroup magic
On Mon, Jun 02, 2014 at 06:30:20PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 6:22 PM, Tejun Heo wrote:
> >
> > Linus, can you please cherry-pick the commit?
>
> I'd much rather see it go through the proper channels than go ahead
> and cherry-pick from some branch that hasn't even been sent to me yet.
> The whole "you have to send things to me for me to take them" policy
> is not new, I don't want to start taking stuff that the
> authors/maintainers haven't actively sent my way.
>
> That said, I suspect that Greg didn't expect this to actually matter
> (the commit message certainly doesn't make it sound like anything that
> people would notice), so the reason it is in -next is likely that
> nobody thought it was a regression.

Yes, I did not realize it at all, otherwise I would have sent it to you
earlier.

> Of course, Greg could just send it to me for my next branch (since the
> merge window for 3.16 is already open) and tell me that it's also
> stable material for 3.15. At _that_ point I'll happily cherry-pick it
> into master...

Ok, I'll go make up a pull request tonight for that.

thanks,

greg k-h
3.15-rc8 mm/filemap.c:202 BUG
I'm still seeing this one from time to time, though it takes me quite a
while to hit it, despite my attempts to narrow down the set of syscalls
that cause it.

kernel BUG at mm/filemap.c:202!
invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 3 PID: 3013 Comm: trinity-c361 Not tainted 3.15.0-rc8+ #225
task: 88006c61 ti: 88005596 task.ti: 88005596
RIP: 0010:[] [] __delete_from_page_cache+0x318/0x360
RSP: 0018:880055963b90 EFLAGS: 00010046
RAX: RBX: 0003 RCX: 880146f68388
RDX: 022a RSI: aca8db38 RDI: aca62b17
RBP: 880055963be0 R08: 0002 R09: 88000613d530
R10: 880055963ba8 R11: 880007f49a40 R12: ea0006795880
R13: 880143232ad0 R14: R15: 880143232ad8
FS: 7f1e40673700() GS:88024d18() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7f1e404e6000 CR3: 603eb000 CR4: 001407e0
DR0: 01bb1000 DR1: 02537000 DR2: 016a5000
DR3: DR6: fffe0ff0 DR7: 0600
Stack:
 880143232ae8 88000613d530 88000613d568 08828259 ea0006795880 880143232ae8 0002 0002 880055963c08 ac158eae
Call Trace:
 [] delete_from_page_cache+0x3e/0x70
 [] truncate_inode_page+0x5b/0x90
 [] shmem_undo_range+0x363/0x790
 [] shmem_truncate_range+0x14/0x30
 [] shmem_fallocate+0x9f/0x340
 [] ? timerqueue_add+0x60/0xb0
 [] do_fallocate+0x116/0x1a0
 [] SyS_madvise+0x3c0/0x870
 [] ? __this_cpu_preempt_check+0x13/0x20
 [] tracesys+0xdd/0xe2
Code: ff ff 01 41 f6 c6 01 48 8b 45 c8 75 16 4c 89 30 e9 70 fe ff ff 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 0f 0b 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 41 54 9d e8 78 9e fd ff e9 8c fe ff ff
RIP [] __delete_from_page_cache+0x318/0x360

There was also another variant of the same BUG with a slightly different
stack trace.

kernel BUG at mm/filemap.c:202!
invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 2 PID: 6928 Comm: trinity-c45 Not tainted 3.15.0-rc5+ #208
task: 88023669d0a0 ti: 880186146000 task.ti: 880186146000
RIP: 0010:[] [] __delete_from_page_cache+0x315/0x320
RSP: 0018:880186147b18 EFLAGS: 00010046
RAX: RBX: 0003 RCX: 0002
RDX: 012a RSI: 84a9a83c RDI: 84a6e0c0
RBP: 880186147b68 R08: 0002 R09: 88002669e668
R10: 880186147b30 R11: R12: ea0008b067c0
R13: 880025355670 R14: R15: 880025355678
FS: 7fc10026f740() GS:88024440() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 2ab350f5c004 CR3: 00018566c000 CR4: 001407e0
DR0: 01989000 DR1: 00944000 DR2: 02494000
DR3: DR6: fffe0ff0 DR7: 0600
Stack:
 880025355688 8800253556a0 88002669e668 88002669e6a0 8ea099ef ea0008b067c0 880025355688 0002 880186147b90 8415ba4d
Call Trace:
 [] delete_from_page_cache+0x3d/0x70
 [] truncate_inode_page+0x5b/0x90
 [] shmem_undo_range+0x30b/0x780
 [] shmem_truncate_range+0x14/0x30
 [] shmem_evict_inode+0xcd/0x150
 [] evict+0xa7/0x170
 [] iput+0xf5/0x180
 [] dentry_kill+0x260/0x2d0
 [] dput+0x6c/0x110
 [] __fput+0x189/0x200
 [] fput+0xe/0x10
 [] task_work_run+0xb4/0xe0
 [] do_exit+0x302/0xb80
 [] ? __this_cpu_preempt_check+0x13/0x20
 [] do_group_exit+0x4c/0xc0
 [] SyS_exit_group+0x14/0x20
 [] tracesys+0xdd/0xe2
Code: 4c 89 30 e9 80 fe ff ff 48 8b 75 c0 4c 89 ff e8 82 8f 1c 00 84 c0 0f 85 6c fe ff ff e9 4f fe ff ff 0f 1f 44 00 00 e8 ae 95 5e 00 <0f> 0b e8 04 1c f1 ff 0f 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41
Re: [PATCH] sysctl: Fix division by zero in percpu_pagelist_fraction handler
Hello!

On Jun 2, 2014, at 11:57 PM, David Rientjes wrote:
> I'm pretty sure we want to allow users to restore the kernel default
> behavior if they've already written to this file and now want to change it
> back.
>
> What do you think about doing it like this instead?
> ---
>  Documentation/sysctl/vm.txt |  3 ++-
>  kernel/sysctl.c             |  3 +--
>  mm/page_alloc.c             | 20
>  3 files changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -702,7 +702,8 @@ The batch value of each per cpu pagelist is also updated as a result. It is
>  set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8)
>
>  The initial value is zero. Kernel does not use this value at boot time to set
> -the high water marks for each per cpu page list.
> +the high water marks for each per cpu page list. If the user writes '0' to this
> +sysctl, it will revert to this default behavior.

I think this is probably a better idea indeed.
Always good to let users return back to defaults too.

Thanks!
Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler
On Mon, 2014-06-02 at 13:33 -0400, Tejun Heo wrote:
> Hello, Pavel.
>
> On Mon, Jun 02, 2014 at 01:14:33PM +0200, Pavel Machek wrote:
> > Now.. I see it is more work for storage maintainers, because there'll
> > be more code to maintain in the interim. But perhaps user advantages
> > are worth it?
>
> I'm quite skeptical about going that route. Not necessarily because
> of the extra amount of work but more the higher probability of getting
> into situation where we can neither push forward or back out. It's
> difficult to define clear deadline and there will likely be unforeseen
> challenges in the planned convergence of the two schedulers,
> eventually, it isn't too unlikely to be in a situation where we have
> to admit defeat and just keep both schedulers. Note that developer
> overhead isn't the only factor here. Providing two slightly different
> alternatives inevitably makes userland grow dependencies on subtleties
> of both and there's a lot less pressure to make judgement calls and
> take appropriate trade-offs, which have fairly high chance of
> deadlocking progress towards any direction.

But OTOH..

This thing (allegedly) fixes issues that have existed for ages, issues
which have (also allegedly) not been fixed in all that time despite a
number of people having done a lot of this and that over the years.

If the claims are true, seems to me that would make BFQ a bit special,
and perhaps worth some extra leeway and effort to ensure that what we
are being offered on a silver plate doesn't molder away out of tree
forever.

If it were say put in staging, and it were stated right up front that
it isn't ever going to go further (Jens already said that more or less),
and _will_ drop dead if it stagnates, that would surely increase the
test base to shake out problem spots (surely it has some), and allow
users who meet an issue in either IO scheduler to verify it with the
flick of a switch every step of the way to whichever ending, and maybe
even motivate other IO people to help with the merge and/or to compare
their changes at the flick of that same switch.

-Mike
[PATCH 3/3] cgroup: set visible flag only after we've mounted the default root
This fixes the failure path, so we won't set the visible flag even though
the mount has failed.

Signed-off-by: Li Zefan
---
 kernel/cgroup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index dabc486..0b6b44e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1671,7 +1671,6 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 
 	/* look for a matching existing root */
 	if (!opts.subsys_mask && !opts.none && !opts.name) {
-		cgrp_dfl_root_visible = true;
 		root = &cgrp_dfl_root;
 		cgroup_get(&root->cgrp);
 		ret = 0;
@@ -1770,6 +1769,9 @@ out_free:
 	dentry = kernfs_mount(fs_type, flags, root->kf_root, &new_sb);
 	if (IS_ERR(dentry) || !new_sb)
 		cgroup_put(&root->cgrp);
+	else if (root == &cgrp_dfl_root)
+		cgrp_dfl_root_visible = true;
+
 	return dentry;
 }
 
--
1.8.0.2
[PATCH 2/3] cgroup: make the default root invisible when it's umounted
Before this patch (in a fresh system):

  # cat /proc/$$/cgroup
  # mount -t cgroup -o __DEVEL__sane_behavior xxx /cgroup
  # umount /cgroup
  # cat /proc/$$/cgroup
  0:cpuset,cpu,cpuacct,memory,devices,freezer,net_cls,blkio,perf_event,net_prio,hugetlb:/

After this patch (in a fresh system):

  # cat ...
  # mount ...
  # umount ...
  # cat /proc/$$/cgroup
  #

You won't see the default root after it's umounted.

Signed-off-by: Li Zefan
---
 kernel/cgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f73fe48..dabc486 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1788,6 +1788,8 @@ static void cgroup_kill_sb(struct super_block *sb)
 	} else {
 		if (root != &cgrp_dfl_root)
 			percpu_ref_kill(&root->cgrp.self.refcnt);
+		else
+			cgrp_dfl_root_visible = false;
 	}
 
 	kernfs_kill_sb(sb);
--
1.8.0.2
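The visibility rule from patches 2/3 and 3/3 (publish the flag only after
a successful mount, clear it again on umount) can be modelled in a few
lines of user-space C. This is a sketch under assumptions: do_mount(),
mount_default_root() and umount_default_root() are hypothetical names for
illustration and do not correspond to kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the cgrp_dfl_root_visible flag. */
static bool dfl_root_visible;

/* Hypothetical mount step; 'ok' selects success or failure. */
static int do_mount(bool ok)
{
	return ok ? 0 : -1;
}

static int mount_default_root(bool ok)
{
	int ret = do_mount(ok);

	if (ret == 0)
		dfl_root_visible = true;	/* set only once the mount succeeded */
	return ret;
}

static void umount_default_root(void)
{
	dfl_root_visible = false;		/* hide the root again on umount */
}
```

Setting the flag before do_mount() would leave it stuck at true after a
failed mount, which is the failure path that 3/3 describes.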
[PATCH 1/3] cgroup: don't destroy the default root
The default root is allocated and initialized at boot, so we
shouldn't destroy the default root when it's umounted, otherwise
it will lead to disaster.

Signed-off-by: Li Zefan
---
 kernel/cgroup.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a5f75ac..f73fe48 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1027,12 +1027,14 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
 static void cgroup_get(struct cgroup *cgrp)
 {
 	WARN_ON_ONCE(cgroup_is_dead(cgrp));
-	css_get(&cgrp->self);
+	if (!(cgrp->self.flags & CSS_NO_REF))
+		css_get(&cgrp->self);
 }
 
 static void cgroup_put(struct cgroup *cgrp)
 {
-	css_put(&cgrp->self);
+	if (!(cgrp->self.flags & CSS_NO_REF))
+		css_put(&cgrp->self);
 }
 
 /**
@@ -1781,10 +1783,12 @@ static void cgroup_kill_sb(struct super_block *sb)
 	 * This prevents new mounts by disabling percpu_ref_tryget_live().
 	 * cgroup_mount() may wait for @root's release.
 	 */
-	if (css_has_online_children(&root->cgrp.self))
+	if (css_has_online_children(&root->cgrp.self)) {
 		cgroup_put(&root->cgrp);
-	else
-		percpu_ref_kill(&root->cgrp.self.refcnt);
+	} else {
+		if (root != &cgrp_dfl_root)
+			percpu_ref_kill(&root->cgrp.self.refcnt);
+	}
 
 	kernfs_kill_sb(sb);
 }
--
1.8.0.2
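The idea of exempting a boot-time object from reference counting can be
illustrated with a tiny user-space model. It is a sketch only: struct obj,
OBJ_NO_REF, obj_get() and obj_put() are hypothetical names modelled loosely
on the CSS_NO_REF check in the diff above, not kernel API.

```c
#include <assert.h>

#define OBJ_NO_REF 0x1u		/* hypothetical flag, modelled on CSS_NO_REF */

struct obj {
	unsigned flags;
	int refcnt;
	int destroyed;
};

/* Take a reference, unless the object is exempt from refcounting. */
static void obj_get(struct obj *o)
{
	if (!(o->flags & OBJ_NO_REF))
		o->refcnt++;
}

/* Drop a reference; exempt objects are never counted and never destroyed. */
static void obj_put(struct obj *o)
{
	if (o->flags & OBJ_NO_REF)
		return;
	assert(o->refcnt > 0);
	if (--o->refcnt == 0)
		o->destroyed = 1;
}
```

An object created at boot and flagged this way survives any number of
get/put pairs, or even unbalanced puts, while ordinary objects are still
torn down when their last reference goes away.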
[GIT PULL] hwmon changes for 3.16
Hi Linus,

Please pull hwmon changes for Linux 3.16 from signed tag:

    git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-for-linus

Thanks,
Guenter
--

The following changes since commit 4b660a7f5c8099d88d1a43d8ae138965112592c7:

  Linux 3.15-rc6 (2014-05-22 06:42:02 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git tags/hwmon-for-linus

for you to fetch changes up to 9d311eddf3565ed0e05b3cb5a22db41fa74d9d86:

  hwmon: (nct6775) Fix probe unwind paths to properly unregister platform devices (2014-05-24 08:30:29 -0700)

New driver for NCT6683D

New chip support to existing drivers:
    Add support for STTS2004 and AT30TSE004 to jc42 driver
    Add support for EMC1402/EMC1412/EMC1422 to emc1403 driver

Other notable changes:
    Document hwmon kernel API
    Convert jc42, lm70, lm75, lm77, lm83, lm92, max1619, tmp421, and
    tmp102 drivers to use new hwmon API functions
    Replace function macros in lm80, lm92, and jc42 drivers with real code
    Convert emc1403 driver to use regmap, add support for additional
    attributes, and add device IDs for EMC1412, EMC1413, and EMC1414
    Various additional cleanup and minor bug fixes in several drivers

Axel Lin (2):
      hwmon: (nct6683) Fix probe unwind paths to properly unregister platform devices
      hwmon: (nct6775) Fix probe unwind paths to properly unregister platform devices

Guenter Roeck (43):
      hwmon: (lm70) Convert to use devm_hwmon_device_register_with_groups
      hwmon: (tmp102) Introduce dev variable in probe function
      hwmon: (tmp102) Convert to use hwmon_device_register_with_groups
      hwmon: (tmp421) Convert to use devm_hwmon_device_register_with_groups
      hwmon: (lm77) Drop FSF mailing address
      hwmon: (lm77) Rearrange code to no longer require forward declarations
      hwmon: (lm77) Do not preserve hysteresis when updating critical temp limit
      hwmon: (lm77) Drop function macros
      hwmon: (lm77) Convert to use devm_hwmon_device_register_with_groups
      hwmon: (lm75) Convert to use hwmon_device_register_with_groups
      hwmon: (lm92) Drop unnecessary forward declaration
      hwmon: (lm92) Drop FSF mailing address
      hwmon: (lm92) Drop function macros
      hwmon: (lm92) Convert to use devm_hwmon_device_register_with_groups
      hwmon: (ltc2945) Fix 1st comment line
      hwmon: Document hwmon kernel API
      hwmon: (emc1403) Add driver documentation
      hwmon: (emc1403) Convert to use regmap
      hwmon: (emc1403) Report external diode fault status
      hwmon: (emc1403) Add support for alarm and diode fault status on EMC14x2
      hwmon: (emc1403) Make all hyst attributes except for temp1_crit_hyst read-only
      hwmon: (emc1403) Relax hysteresis limit write checks
      hwmon: (emc1403) Add support for max_hyst attributes
      hwmon: (emc1403) Add support for min_hyst attributes
      hwmon: (emc1403) Add device IDs for EMC1412, EMC1413, and EMC1414
      hwmon: (jc42) Rearrange code to avoid forward declarations
      hwmon: (jc42) Convert function macros into functions
      hwmon: (jc42) Add support for STTS2004 and AT30TSE004
      hwmon: (max1619) Fix critical alarm display
      hwmon: (max1619) Drop FSF address
      hwmon: (max1619) Rearrange code to avoid forward declarations
      hwmon: (max1619) Drop function macros
      hwmon: (max1619) Convert to use devm_hwmon_device_register_with_groups
      hwmon: (lm83) Drop FSF address
      hwmon: (lm83) Rearange code to avoid forward declarations
      hwmon: (lm83) Convert to use devm_hwmon_device_register_with_groups
      hwmon: (lm80) Simplify TEMP_FROM_REG
      hwmon: (lm80) Normalize all temperature values to 16 bit
      hwmon: (lm80) Convert temperature display function macros into functions
      hwmon: (lm80) Convert voltage display function macros into functions
      hwmon: (lm80) Convert fan display function macros into functions
      hwmon: (lm80) Rearrange code to avoid forward declarations
      hwmon: Driver for NCT6683D

Himangi Saraogi (1):
      hwmon: (ultra45_env) Introduce managed version of kzalloc

Jingoo Han (11):
      hwmon: (f71805f) remove unnecessary OOM messages
      hwmon: (ibmpex) remove unnecessary OOM messages
      hwmon: (lm93) remove unnecessary OOM messages
      hwmon: (max) remove unnecessary OOM messages
      hwmon: (max197) remove unnecessary OOM messages
      hwmon: (pc87427) remove unnecessary OOM messages
      hwmon: (s3c-hwmon) remove unnecessary OOM messages
      hwmon: (vt1211) remove unnecessary OOM messages
      hwmon: (g762) Make of_device_id array const
      hwmon: (gpio-fan) Make of_device_id array const
      hwmon: (iio_hwmon) Make of_device_id array const

Josef Gajdusek (1):
      hwmon:
Re: [PATCH] sysctl: Fix division by zero in percpu_pagelist_fraction handler
On Mon, 2 Jun 2014, Oleg Drokin wrote:

> It's needed because at the very beginning of proc_dointvec_minmax it checks
> if the length is 0 and if it is, immediately returns success because there's
> nothing to be done
> from it's perspective.
> And since percpu_pagelist_fraction is 0 from the very beginning, next step is
> division by this value.
>
> If length is not 0 on the other hand, then there's some value
> proc_dointvec_minmax can interpret and the min and max checks happen and
> everything works correctly.
>
> This is kind of hard to fix in proc_dointvec_minmax because there's no passed
> in value to check against, I think. Sure, you can also check the passed in
> pointer to value
> to make sure it's within range, but that goes beyond the scope of the
> function.
> You might as well just assign a sane value to percpu_pagelist_fraction, which
> has an added benefit of when you cat that /proc file, you'll get real value
> of what the kernel uses instead of 0.
>

I'm pretty sure we want to allow users to restore the kernel default
behavior if they've already written to this file and now want to change it
back.

What do you think about doing it like this instead?
---
 Documentation/sysctl/vm.txt |  3 ++-
 kernel/sysctl.c             |  3 +--
 mm/page_alloc.c             | 20 ++++++++++++--------
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -702,7 +702,8 @@ The batch value of each per cpu pagelist is also updated as a result. It is
 set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8)

 The initial value is zero. Kernel does not use this value at boot time to set
-the high water marks for each per cpu page list.
+the high water marks for each per cpu page list. If the user writes '0' to this
+sysctl, it will revert to this default behavior.

 ==

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -136,7 +136,6 @@ static unsigned long dirty_bytes_min = 2 * PAGE_SIZE;
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;
 static int minolduid;
-static int min_percpu_pagelist_fract = 8;

 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
@@ -1305,7 +1304,7 @@ static struct ctl_table vm_table[] = {
 		.maxlen		= sizeof(percpu_pagelist_fraction),
 		.mode		= 0644,
 		.proc_handler	= percpu_pagelist_fraction_sysctl_handler,
-		.extra1		= &min_percpu_pagelist_fract,
+		.extra1		= ,
 	},
 #ifdef CONFIG_MMU
 	{
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -69,6 +69,7 @@

 /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
 static DEFINE_MUTEX(pcp_batch_high_lock);
+#define MIN_PERCPU_PAGELIST_FRAC	(8)

 #ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
 DEFINE_PER_CPU(int, numa_node);
@@ -4223,8 +4224,8 @@ static void pageset_set_high(struct per_cpu_pageset *p,
 	pageset_update(&p->pcp, high, batch);
 }

-static void __meminit pageset_set_high_and_batch(struct zone *zone,
-		struct per_cpu_pageset *pcp)
+static void pageset_set_high_and_batch(struct zone *zone,
+				       struct per_cpu_pageset *pcp)
 {
 	if (percpu_pagelist_fraction)
 		pageset_set_high(pcp,
@@ -5850,20 +5851,23 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write,
 	void __user *buffer, size_t *length, loff_t *ppos)
 {
 	struct zone *zone;
-	unsigned int cpu;
 	int ret;

 	ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
-	if (!write || (ret < 0))
+	if (!write || ret < 0)
 		return ret;

+	if (percpu_pagelist_fraction &&
+	    percpu_pagelist_fraction < MIN_PERCPU_PAGELIST_FRAC)
+		percpu_pagelist_fraction = MIN_PERCPU_PAGELIST_FRAC;
+
 	mutex_lock(&pcp_batch_high_lock);
 	for_each_populated_zone(zone) {
-		unsigned long high;
-		high = zone->managed_pages / percpu_pagelist_fraction;
+		unsigned int cpu;
+
 		for_each_possible_cpu(cpu)
-			pageset_set_high(per_cpu_ptr(zone->pageset, cpu),
-					 high);
+			pageset_set_high_and_batch(zone,
+					per_cpu_ptr(zone->pageset, cpu));
 	}
 	mutex_unlock(&pcp_batch_high_lock);
 	return 0;
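The intended behaviour of the handler above (zero means "kernel default", small non-zero values are raised to the minimum, and nothing is ever divided by zero) can be sketched in userspace as follows. This is a rough illustration only: clamp_fraction() and pcp_high() are hypothetical helpers, and default_high merely stands in for whatever value the kernel computes when the sysctl is zero.

```c
#define MIN_PERCPU_PAGELIST_FRAC 8

/*
 * Clamp a user-written fraction. Zero is preserved because it means
 * "revert to the default"; any other value below the minimum is raised.
 */
int clamp_fraction(int fraction)
{
	if (fraction && fraction < MIN_PERCPU_PAGELIST_FRAC)
		return MIN_PERCPU_PAGELIST_FRAC;
	return fraction;
}

/*
 * Hypothetical helper: pick the per-cpu high mark for a zone.
 * default_high is an assumed placeholder for the boot-time value.
 */
unsigned long pcp_high(unsigned long managed_pages, int fraction,
		       unsigned long default_high)
{
	fraction = clamp_fraction(fraction);
	if (!fraction)
		return default_high;	/* zero never reaches the division */
	return managed_pages / (unsigned long)fraction;
}
```

The point of the sketch is that the zero check and the division sit in the same place, so a zero-length write that leaves the fraction at 0 takes the default branch instead of dividing.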
Re: [RFC][PATCH 1/2] Add a super operation for writeback
On Mon, Jun 02, 2014 at 10:30:07AM +0200, Christian Stroetmann wrote:
> When I followed the advice of Dave Chinner:
> "We're not going to merge that page forking stuff (like you were
> told at LSF 2013 more than a year ago:
> http://lwn.net/Articles/548091/) without rigorous design review and
> a demonstration of the solutions to all the hard corner cases it
> has"
> given in his e-mail related with the presentation of the latest
> version of the Tux3 file system (see [1]) and read the linked
> article, I found in the second comments:
> "Parts of this almost sound like it either a.) overlaps with or b.)
> would benefit greatly from something similar to Featherstitch
> [[2]]."
>
> Could it be that we have with Featherstitch a general solution
> already that is said to be even "file system agnostic"?
> Honestly, I thought that something like this would make its way into
> the Linux code base.

Here's what I said about the last proposal (a few months ago) for
integrating featherstitch into the kernel:

http://www.spinics.net/lists/linux-fsdevel/msg72799.html

It's not a viable solution.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
Re: [RFC][PATCH 1/2] Add a super operation for writeback
[ please line wrap at something sane like 68 columns ]

On Mon, Jun 02, 2014 at 01:02:29PM -0700, Daniel Phillips wrote:
> On 06/01/2014 08:15 PM, Dave Chinner wrote:
> > On Sun, Jun 01, 2014 at 02:41:02PM -0700, I wrote:
> >> +
> >> +/*
> >> + * Add inode to writeback dirty list with current time.
> >> + */
> >> +void inode_writeback_touch(struct inode *inode)
> >> +{
> >> +	struct backing_dev_info *bdi = inode->i_sb->s_bdi;
> >> +	spin_lock(&bdi->wb.list_lock);
> >> +	inode->dirtied_when = jiffies;
> >> +	list_move(&inode->i_wb_list, &bdi->wb.b_dirty);
> >> +	spin_unlock(&bdi->wb.list_lock);
> >> +}
> >> +EXPORT_SYMBOL_GPL(inode_writeback_touch);
> > You should be able to use redirty_tail() for this
>
> Redirty_tail nearly works, but "if (!list_empty(>b_dirty))" is
> not correct because the inode needs to end up on the dirty list
> whether it was already there or not.

redirty_tail() always moves the inode to the end of the dirty list.

> This requirement is analogous to __mark_inode_dirty() and must
> tolerate similar races. At the microoptimization level, calling
> redirty_tail from inode_writeback_touch would be less efficient
> and more bulky. Another small issue is, redirty_tail does not
> always update the timestamp, which could trigger some bogus
> writeback events.

redirty_tail does not update the timestamp when it doesn't need to
change. If it needs to be changed because the current value would
violate the time ordering requirements of the list, it rewrites it.

So there is essentially no functional difference between the new
function and redirty_tail.

> > H - this is using the wb dirty lists and locks, but you
> > don't pass the wb structure to the writeback callout? That seem
> > wrong to me - why would you bother manipulating these lists if
> > you aren't using them to track dirty inodes in the first place?
>
> From Tux3's point of view, the wb lists just allow fs-writeback to
> determine when to call ->writeback(). We agree that inode lists
> are a suboptimal mechanism, but that is how fs-writeback currently
> thinks. It would be better if our inodes never go onto wb lists in
> the first place, provided that fs-writeback can still drive
> writeback appropriately.

It can't, and definitely not with the callout you added.

However, we already avoid the VFS writeback lists for certain
filesystems for pure metadata. e.g. XFS does not use the VFS dirty
inode lists for inode metadata changes. They get tracked internally
by the transaction subsystem which does its own writeback according
to the requirements of journal space availability.

This is done simply by not calling mark_inode_dirty() on any
metadata only change. If we want to do the same for data, then we'd
simply not call mark_inode_dirty() in the data IO path. That
requires a custom ->set_page_dirty method to be provided by the
filesystem that didn't call

	__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);

and instead did its own thing.

So the per-superblock dirty tracking is something we can do right
now, and some filesystems do it for metadata. The missing piece for
data is writeback infrastructure capable of deferring to superblocks
for writeback rather than BDIs.

> Perhaps fs-writeback should have an option to work without inode
> lists at all, and just maintain writeback scheduling statistics in
> the superblock or similar. That would be a big change, more on the
> experimental side. We would be happy to take it on after merge,
> hopefully for the benefit of other filesystems besides Tux3.

Well, I don't see it that way. If you have a requirement to be able
to track dirty inodes internally, then let's move to implementing
that infrastructure rather than hacking around with callouts that
only have a limited shelf-life.

> What we pass to ->writeback() is just a matter of taste at the
> moment because we currently ignore everything except
> >nr_pages. Anything sane is fine. Note that bdi_writeback is
> already available from bdi->wb, the "default writeback info",
> whatever that means. A quick tour of existing usage suggests that
> you can reliably obtain the wb that way, but a complete audit
> would be needed.
>
> Most probably, this API will evolve as new users arrive, and also
> as our Tux3 usage becomes more sophisticated. For now, Tux3 just
> flushes everything without asking questions. Exactly how that
> might change in the future is hard to predict. You are in a better
> position to know what XFS would require in order to use this
> interface.

XFS already has everything it needs internally to track dirty
inodes. In fact, we used to do data writeback from within XFS and we
slowly removed it as the generic writeback code was improved and
made the XFS code redundant.

That said, parallelising writeback so we can support hundreds of
GB/s of delayed allocation based writeback is something we
eventually need to do, and that pretty much means we need to bring
dirty data inode tracking back into XFS.

So what we really
[PATCH] blk-mq: fix sparse warning on missed __percpu annotation
'struct blk_mq_ctx' is __percpu, so add the annotation
and fix the sparse warning reported from Fengguang:

[block:for-linus 2/3] block/blk-mq.h:75:16: sparse: incorrect
type in initializer (different address spaces)

Reported-by: kbuild test robot
Signed-off-by: Ming Lei
---
 block/blk-mq.c         |    2 +-
 include/linux/blkdev.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 43eb315..3bb4cfe 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1767,7 +1767,7 @@ static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
 struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *set)
 {
 	struct blk_mq_hw_ctx **hctxs;
-	struct blk_mq_ctx *ctx;
+	struct blk_mq_ctx __percpu *ctx;
 	struct request_queue *q;
 	unsigned int *map;
 	int i;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2f3886e..3cd426e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -335,7 +335,7 @@ struct request_queue {
 	unsigned int		*mq_map;

 	/* sw queues */
-	struct blk_mq_ctx	*queue_ctx;
+	struct blk_mq_ctx __percpu	*queue_ctx;
 	unsigned int		nr_queues;

 	/* hw dispatch queues */
-- 
1.7.9.5
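As background on why the annotation silences sparse, here is a small standalone sketch. It assumes that __percpu expands to a noderef pointer in a separate address space when sparse (which defines __CHECKER__) processes the file, and to nothing for a normal compiler, which is how the kernel headers of that period are understood to define it. struct demo_ctx, struct demo_queue and demo_queue_set_ctx() are hypothetical stand-ins, not block layer code.

```c
#include <stddef.h>

/*
 * Assumption: mirrors the kernel's definition of __percpu at the time.
 * Sparse sees a distinct address space; gcc/clang see an empty macro.
 */
#ifdef __CHECKER__
# define __percpu __attribute__((noderef, address_space(3)))
#else
# define __percpu
#endif

/* Hypothetical stand-ins for struct blk_mq_ctx and struct request_queue. */
struct demo_ctx {
	unsigned int cpu;
};

struct demo_queue {
	struct demo_ctx __percpu *queue_ctx;
};

/*
 * Both sides of the assignment carry __percpu, so sparse has no
 * address-space mismatch to report. Dropping the annotation from
 * either side is what produces "different address spaces".
 */
void demo_queue_set_ctx(struct demo_queue *q, struct demo_ctx __percpu *ctx)
{
	q->queue_ctx = ctx;
}
```

The annotation has no effect on generated code; it only gives the checker enough type information to flag a per-cpu pointer being used as an ordinary one.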
[PATCH v2 4/4] tools lib traceevent: Added support for __get_bitmask() macro
From: "Steven Rostedt (Red Hat)" Coming in v3.16, trace events will be able to save bitmasks in raw format in the ring buffer and output it with the __get_bitmask() macro. In order for userspace tools to parse this, it must be able to handle the __get_bitmask() call and be able to convert the data that's in the ring buffer into a nice bitmask format. The output is similar to what the kernel uses to print bitmasks, with a comma separator every 4 bytes (8 characters). This allows for cpumasks to also be saved efficiently. The first user is the thermal:thermal_power_limit event which has the following output: thermal_power_limit: cpus=000f freq=190 cdev_state=0 power=5252 Link: http://lkml.kernel.org/r/20140506132238.22e13...@gandalf.local.home Suggested-by: Javi Merino Tested-by: Javi Merino Signed-off-by: Steven Rostedt --- tools/lib/traceevent/event-parse.c | 113 + tools/lib/traceevent/event-parse.h | 7 ++ .../perf/util/scripting-engines/trace-event-perl.c | 1 + .../util/scripting-engines/trace-event-python.c| 1 + 4 files changed, 122 insertions(+) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index b83184f2d484..93825a17dcce 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -765,6 +765,9 @@ static void free_arg(struct print_arg *arg) case PRINT_BSTRING: free(arg->string.string); break; + case PRINT_BITMASK: + free(arg->bitmask.bitmask); + break; case PRINT_DYNAMIC_ARRAY: free(arg->dynarray.index); break; @@ -2268,6 +2271,7 @@ static int arg_num_eval(struct print_arg *arg, long long *val) case PRINT_FIELD ... PRINT_SYMBOL: case PRINT_STRING: case PRINT_BSTRING: + case PRINT_BITMASK: default: do_warning("invalid eval type %d", arg->type); ret = 0; @@ -2296,6 +2300,7 @@ static char *arg_eval (struct print_arg *arg) case PRINT_FIELD ... 
PRINT_SYMBOL: case PRINT_STRING: case PRINT_BSTRING: + case PRINT_BITMASK: default: do_warning("invalid eval type %d", arg->type); break; @@ -2683,6 +2688,35 @@ process_str(struct event_format *event __maybe_unused, struct print_arg *arg, return EVENT_ERROR; } +static enum event_type +process_bitmask(struct event_format *event __maybe_unused, struct print_arg *arg, + char **tok) +{ + enum event_type type; + char *token; + + if (read_expect_type(EVENT_ITEM, ) < 0) + goto out_free; + + arg->type = PRINT_BITMASK; + arg->bitmask.bitmask = token; + arg->bitmask.offset = -1; + + if (read_expected(EVENT_DELIM, ")") < 0) + goto out_err; + + type = read_token(); + *tok = token; + + return type; + + out_free: + free_token(token); + out_err: + *tok = NULL; + return EVENT_ERROR; +} + static struct pevent_function_handler * find_func_handler(struct pevent *pevent, char *func_name) { @@ -2797,6 +2831,10 @@ process_function(struct event_format *event, struct print_arg *arg, free_token(token); return process_str(event, arg, tok); } + if (strcmp(token, "__get_bitmask") == 0) { + free_token(token); + return process_bitmask(event, arg, tok); + } if (strcmp(token, "__get_dynamic_array") == 0) { free_token(token); return process_dynamic_array(event, arg, tok); @@ -3324,6 +3362,7 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg return eval_type(val, arg, 0); case PRINT_STRING: case PRINT_BSTRING: + case PRINT_BITMASK: return 0; case PRINT_FUNC: { struct trace_seq s; @@ -3556,6 +3595,60 @@ static void print_str_to_seq(struct trace_seq *s, const char *format, trace_seq_printf(s, format, str); } +static void print_bitmask_to_seq(struct pevent *pevent, +struct trace_seq *s, const char *format, +int len_arg, const void *data, int size) +{ + int nr_bits = size * 8; + int str_size = (nr_bits + 3) / 4; + int len = 0; + char buf[3]; + char *str; + int index; + int i; + + /* +* The kernel likes to put in commas every 32 bits, we +* can do the same. 
+*/ + str_size += (nr_bits - 1) / 32; + + str = malloc(str_size + 1); + if (!str) { + do_warning("%s: not enough memory!", __func__); + return; + } + str[str_size] = 0; + + /* Start out with -2 for the two chars per byte */ + for (i = str_size - 2; i >= 0; i -= 2)
[PATCH v2 1/4] tools lib traceevent: Add flag to not load event plugins
From: "Steven Rostedt (Red Hat)"

Add flags to pevent that callers can set to keep the system plugins,
or all plugins, from being loaded. This is useful when plugins might
hide certain information and seeing the raw events shows what may be
going on.

Signed-off-by: Steven Rostedt
---
 tools/lib/traceevent/event-parse.h  | 2 ++
 tools/lib/traceevent/event-plugin.c | 7 ++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index feab94281634..a68ec3d8289f 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -354,6 +354,8 @@ enum pevent_func_arg_type {

 enum pevent_flag {
 	PEVENT_NSEC_OUTPUT		= 1,	/* output in NSECS */
+	PEVENT_DISABLE_SYS_PLUGINS	= 1 << 1,
+	PEVENT_DISABLE_PLUGINS		= 1 << 2,
 };

 #define PEVENT_ERRORS							      \
diff --git a/tools/lib/traceevent/event-plugin.c b/tools/lib/traceevent/event-plugin.c
index 0c8bf6780e4d..317466bd1a37 100644
--- a/tools/lib/traceevent/event-plugin.c
+++ b/tools/lib/traceevent/event-plugin.c
@@ -148,12 +148,17 @@ load_plugins(struct pevent *pevent, const char *suffix,
 	char *path;
 	char *envdir;

+	if (pevent->flags & PEVENT_DISABLE_PLUGINS)
+		return;
+
 	/*
 	 * If a system plugin directory was defined,
 	 * check that first.
 	 */
 #ifdef PLUGIN_DIR
-	load_plugins_dir(pevent, suffix, PLUGIN_DIR, load_plugin, data);
+	if (!(pevent->flags & PEVENT_DISABLE_SYS_PLUGINS))
+		load_plugins_dir(pevent, suffix, PLUGIN_DIR,
+				 load_plugin, data);
 #endif

 	/*
-- 
2.0.0.rc2
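The effect of the two new flags on load_plugins() can be sketched as a small decision function. This is a simplified illustration, not the library code: the enum values are copied from the patch, while LOAD_SYS_DIR, LOAD_OTHER_DIRS and plugin_dirs_to_load() are hypothetical, and "other directories" is an assumption about whatever load_plugins() scans after the system directory.

```c
/* Flag values as added by the patch. */
enum pevent_flag {
	PEVENT_NSEC_OUTPUT		= 1,
	PEVENT_DISABLE_SYS_PLUGINS	= 1 << 1,
	PEVENT_DISABLE_PLUGINS		= 1 << 2,
};

/* Hypothetical result bits, used only by this sketch. */
#define LOAD_SYS_DIR	(1 << 0)
#define LOAD_OTHER_DIRS	(1 << 1)

/* Return which plugin directories would be scanned for the given flags. */
int plugin_dirs_to_load(int flags)
{
	int dirs = 0;

	/* Disabling all plugins wins over everything else. */
	if (flags & PEVENT_DISABLE_PLUGINS)
		return 0;

	/* The system directory is skipped only when asked to. */
	if (!(flags & PEVENT_DISABLE_SYS_PLUGINS))
		dirs |= LOAD_SYS_DIR;

	dirs |= LOAD_OTHER_DIRS;
	return dirs;
}
```

A tool that wants raw event output would set PEVENT_DISABLE_PLUGINS before loading; one that only distrusts the system-wide plugin directory would set PEVENT_DISABLE_SYS_PLUGINS instead.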
[PATCH v2 3/4] tools lib traceevent: Add options to function plugin
From: "Steven Rostedt (Red Hat)" Add the options "parent" and "indent" to the function plugin. When parent is set, the output looks like this: function: fsnotify_modify <-- vfs_write function: zone_statistics <-- get_page_from_freelist function:__inc_zone_state <-- zone_statistics function:inotify_inode_queue_event <-- fsnotify_modify function:fsnotify_parent <-- fsnotify_modify function:__inc_zone_state <-- zone_statistics function: __fsnotify_parent <-- fsnotify_parent function: inotify_dentry_parent_queue_event <-- fsnotify_parent function: add_to_page_cache_lru <-- do_read_cache_page When it's not set, it looks like: function: fsnotify_modify function: zone_statistics function:__inc_zone_state function:inotify_inode_queue_event function:fsnotify_parent function:__inc_zone_state function: __fsnotify_parent function: inotify_dentry_parent_queue_event function: add_to_page_cache_lru When the otpion "indent" is not set, it looks like this: function: fsnotify_modify <-- vfs_write function: zone_statistics <-- get_page_from_freelist function: __inc_zone_state <-- zone_statistics function: inotify_inode_queue_event <-- fsnotify_modify function: fsnotify_parent <-- fsnotify_modify function: __inc_zone_state <-- zone_statistics function: __fsnotify_parent <-- fsnotify_parent function: inotify_dentry_parent_queue_event <-- fsnotify_parent function: add_to_page_cache_lru <-- do_read_cache_page Signed-off-by: Steven Rostedt --- tools/lib/traceevent/plugin_function.c | 43 +- 1 file changed, 37 insertions(+), 6 deletions(-) diff --git a/tools/lib/traceevent/plugin_function.c b/tools/lib/traceevent/plugin_function.c index 80ba4ff1fe84..a00ec190821a 100644 --- a/tools/lib/traceevent/plugin_function.c +++ b/tools/lib/traceevent/plugin_function.c @@ -33,6 +33,29 @@ static int cpus = -1; #define STK_BLK 10 +struct pevent_plugin_option plugin_options[] = +{ + { + .name = "parent", + .plugin_alias = "ftrace", + .description = + "Print parent of functions for function events", + }, 
+ { + .name = "indent", + .plugin_alias = "ftrace", + .description = + "Try to show function call indents, based on parents", + .set = 1, + }, + { + .name = NULL, + } +}; + +static struct pevent_plugin_option *ftrace_parent = _options[0]; +static struct pevent_plugin_option *ftrace_indent = _options[1]; + static void add_child(struct func_stack *stack, const char *child, int pos) { int i; @@ -119,7 +142,8 @@ static int function_handler(struct trace_seq *s, struct pevent_record *record, parent = pevent_find_function(pevent, pfunction); - index = add_and_get_index(parent, func, record->cpu); + if (parent && ftrace_indent->set) + index = add_and_get_index(parent, func, record->cpu); trace_seq_printf(s, "%*s", index*3, ""); @@ -128,11 +152,13 @@ static int function_handler(struct trace_seq *s, struct pevent_record *record, else trace_seq_printf(s, "0x%llx", function); - trace_seq_printf(s, " <-- "); - if (parent) - trace_seq_printf(s, "%s", parent); - else - trace_seq_printf(s, "0x%llx", pfunction); + if (ftrace_parent->set) { + trace_seq_printf(s, " <-- "); + if (parent) + trace_seq_printf(s, "%s", parent); + else + trace_seq_printf(s, "0x%llx", pfunction); + } return 0; } @@ -141,6 +167,9 @@ int PEVENT_PLUGIN_LOADER(struct pevent *pevent) { pevent_register_event_handler(pevent, -1, "ftrace", "function", function_handler, NULL); + + traceevent_plugin_add_options("ftrace", plugin_options); + return 0; } @@ -157,6 +186,8 @@ void PEVENT_PLUGIN_UNLOADER(struct pevent *pevent) free(fstack[i].stack); } + traceevent_plugin_remove_options(plugin_options); + free(fstack); fstack = NULL; cpus = -1; -- 2.0.0.rc2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
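The option table in this patch is terminated by an entry whose .name is NULL, and the handler reads the .set field of individual entries. A minimal sketch of walking such a table is shown below. It uses a trimmed-down, hypothetical struct demo_option rather than the real struct pevent_plugin_option, and find_option() is an illustrative helper that is not part of libtraceevent.

```c
#include <stddef.h>
#include <string.h>

/* Trimmed-down, hypothetical copy of the option structure. */
struct demo_option {
	const char *name;
	const char *plugin_alias;
	int set;
};

/* Same shape as the table in the patch: "indent" defaults to set. */
struct demo_option demo_options[] = {
	{ .name = "parent", .plugin_alias = "ftrace" },
	{ .name = "indent", .plugin_alias = "ftrace", .set = 1 },
	{ .name = NULL },
};

/* Walk the table up to the NULL-name sentinel; return the match or NULL. */
struct demo_option *find_option(struct demo_option *opts,
				const char *alias, const char *name)
{
	struct demo_option *op;

	for (op = opts; op->name; op++) {
		if (op->plugin_alias && !strcmp(op->plugin_alias, alias) &&
		    !strcmp(op->name, name))
			return op;
	}
	return NULL;
}
```

With this layout a tool can flip find_option(demo_options, "ftrace", "parent")->set to 1 and the handler picks up the change on the next event, since it only dereferences the table entry.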
[PATCH v2 0/4] tools lib traceevent: bitmask handling and plugin updates
This is v2 of the patch series that added __get_bitmask() as well as added some plugin code. This version addresses what Namhyung suggested. The diff from v1 is posted below. Steven Rostedt (Red Hat) (4): tools lib traceevent: Add flag to not load event plugins tools lib traceevent: Add options to plugins tools lib traceevent: Add options to function plugin tools lib traceevent: Added support for __get_bitmask() macro tools/lib/traceevent/event-parse.c | 113 tools/lib/traceevent/event-parse.h | 25 ++- tools/lib/traceevent/event-plugin.c| 204 - tools/lib/traceevent/plugin_function.c | 43 - .../perf/util/scripting-engines/trace-event-perl.c | 1 + .../util/scripting-engines/trace-event-python.c| 1 + 6 files changed, 377 insertions(+), 10 deletions(-) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index 2d6aa92..93825a1 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -3794,7 +3794,7 @@ static void print_str_arg(struct trace_seq *s, void *data, int size, f = pevent_find_any_field(event, arg->bitmask.bitmask); arg->bitmask.offset = f->offset; } - bitmask_offset = data2host4(pevent, data + arg->string.offset); + bitmask_offset = data2host4(pevent, data + arg->bitmask.offset); bitmask_size = bitmask_offset >> 16; bitmask_offset &= 0x; print_bitmask_to_seq(pevent, s, format, len_arg, diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index 025627f..7a3873f 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -362,7 +362,7 @@ enum pevent_func_arg_type { enum pevent_flag { PEVENT_NSEC_OUTPUT = 1,/* output in NSECS */ PEVENT_DISABLE_SYS_PLUGINS = 1 << 1, - PEVENT_DISABLE_PLUGINS = 1 << 2 + PEVENT_DISABLE_PLUGINS = 1 << 2, }; #define PEVENT_ERRORS\ @@ -419,6 +419,8 @@ enum pevent_errno { struct plugin_list; +#define INVALID_PLUGIN_LIST_OPTION ((char **)((unsigned long)-1)) + struct plugin_list *traceevent_load_plugins(struct 
pevent *pevent); void traceevent_unload_plugins(struct plugin_list *plugin_list, struct pevent *pevent); diff --git a/tools/lib/traceevent/event-plugin.c b/tools/lib/traceevent/event-plugin.c index a244794..648ef84 100644 --- a/tools/lib/traceevent/event-plugin.c +++ b/tools/lib/traceevent/event-plugin.c @@ -57,7 +57,7 @@ struct plugin_list { * used by toggling the option. * * Returns NULL if there's no options registered. On error it returns - * an (char **)-1 (must check for that) + * INVALID_PLUGIN_LIST_OPTION * * Must be freed with traceevent_plugin_free_options_list(). */ @@ -72,6 +72,7 @@ char **traceevent_plugin_list_options(void) for (reg = registered_options; reg; reg = reg->next) { for (op = reg->options; op->name; op++) { char *alias = op->plugin_alias ? op->plugin_alias : op->file; + char **temp = list; name = malloc(strlen(op->name) + strlen(alias) + 2); if (!name) @@ -80,6 +81,7 @@ char **traceevent_plugin_list_options(void) sprintf(name, "%s:%s", alias, op->name); list = realloc(list, count + 2); if (!list) { + list = temp; free(name); goto err; } @@ -87,16 +89,14 @@ char **traceevent_plugin_list_options(void) list[count] = NULL; } } - if (!count) - return NULL; return list; err: - while (--count > 0) + while (--count >= 0) free(list[count]); free(list); - return (char **)((unsigned long)-1); + return INVALID_PLUGIN_LIST_OPTION; } void traceevent_plugin_free_options_list(char **list) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
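The first hunk of the v1 to v2 diff reads a 32-bit word at the bitmask field's offset and splits it into a size and an offset. A rough sketch of that split is below, assuming the size sits in the upper 16 bits (as the ">> 16" shows) and the offset in the lower 16 bits, so the mask is taken to be 0xffff. struct data_loc and decode_data_loc() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical holder for the two halves of the descriptor word. */
struct data_loc {
	unsigned int offset;	/* where the payload starts in the record */
	unsigned int size;	/* payload length in bytes */
};

/*
 * Split a 32-bit descriptor. Assumption: size in the high 16 bits,
 * offset in the low 16 bits.
 */
struct data_loc decode_data_loc(uint32_t word)
{
	struct data_loc loc;

	loc.size = word >> 16;
	loc.offset = word & 0xffff;
	return loc;
}
```

Reading the word from the wrong field offset, as v1 did with arg->string.offset, would feed an unrelated value into this split and yield a bogus size and offset, which is what the v2 fix avoids.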
[PATCH v2 2/4] tools lib traceevent: Add options to plugins
From: "Steven Rostedt (Red Hat)" The traceevent plugins allows developers to have their events print out information that is more advanced than what can be achieved by the trace event format files. As these plugins are used on the userspace side of the tracing tools, it is only logical that the tools should be able to produce different types of output for the events. The types of events still need to be defined by the plugins thus we need a way to pass information from the tool to the plugin to specify what type of information to be shown. Not only does the information need to be passed by the tool to plugin, but the plugin also requires a way to notify the tool of what options it can provide. This builds the plugin option infrastructure that is taken from trace-cmd that is used to allow plugins to produce different output based on the options specified by the tool. Signed-off-by: Steven Rostedt --- tools/lib/traceevent/event-parse.h | 16 ++- tools/lib/traceevent/event-plugin.c | 197 2 files changed, 210 insertions(+), 3 deletions(-) diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index a68ec3d8289f..56e0e6c12411 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -107,8 +107,8 @@ typedef int (*pevent_event_handler_func)(struct trace_seq *s, typedef int (*pevent_plugin_load_func)(struct pevent *pevent); typedef int (*pevent_plugin_unload_func)(struct pevent *pevent); -struct plugin_option { - struct plugin_option*next; +struct pevent_plugin_option { + struct pevent_plugin_option *next; void*handle; char*file; char*name; @@ -135,7 +135,7 @@ struct plugin_option { * PEVENT_PLUGIN_OPTIONS: (optional) * Plugin options that can be set before loading * - * struct plugin_option PEVENT_PLUGIN_OPTIONS[] = { + * struct pevent_plugin_option PEVENT_PLUGIN_OPTIONS[] = { * { * .name = "option-name", * .plugin_alias = "overide-file-name", (optional) @@ -412,9 +412,19 @@ enum pevent_errno { struct 
plugin_list; +#define INVALID_PLUGIN_LIST_OPTION ((char **)((unsigned long)-1)) + struct plugin_list *traceevent_load_plugins(struct pevent *pevent); void traceevent_unload_plugins(struct plugin_list *plugin_list, struct pevent *pevent); +char **traceevent_plugin_list_options(void); +void traceevent_plugin_free_options_list(char **list); +int traceevent_plugin_add_options(const char *name, + struct pevent_plugin_option *options); +void traceevent_plugin_remove_options(struct pevent_plugin_option *options); +void traceevent_print_plugins(struct trace_seq *s, + const char *prefix, const char *suffix, + const struct plugin_list *list); struct cmdline; struct cmdline_list; diff --git a/tools/lib/traceevent/event-plugin.c b/tools/lib/traceevent/event-plugin.c index 317466bd1a37..648ef84dc37e 100644 --- a/tools/lib/traceevent/event-plugin.c +++ b/tools/lib/traceevent/event-plugin.c @@ -18,6 +18,7 @@ * ~~ */ +#include #include #include #include @@ -30,12 +31,208 @@ #define LOCAL_PLUGIN_DIR ".traceevent/plugins" +static struct registered_plugin_options { + struct registered_plugin_options*next; + struct pevent_plugin_option *options; +} *registered_options; + +static struct trace_plugin_options { + struct trace_plugin_options *next; + char*plugin; + char*option; + char*value; +} *trace_plugin_options; + struct plugin_list { struct plugin_list *next; char*name; void*handle; }; +/** + * traceevent_plugin_list_options - get list of plugin options + * + * Returns an array of char strings that list the currently registered + * plugin options in the format of :. This list can be + * used by toggling the option. + * + * Returns NULL if there's no options registered. On error it returns + * INVALID_PLUGIN_LIST_OPTION + * + * Must be freed with traceevent_plugin_free_options_list(). 
+ */ +char **traceevent_plugin_list_options(void) +{ + struct registered_plugin_options *reg; + struct pevent_plugin_option *op; + char **list = NULL; + char *name; + int count = 0; + + for (reg = registered_options; reg; reg = reg->next) { + for (op = reg->options; op->name; op++) { + char *alias = op->plugin_alias ? op->plugin_alias : op->file; + char **temp = list; + + name = malloc(strlen(op->name) +
Re: [PATCH] perf record: Fix poll return value propagation
On Mon, 2 Jun 2014 20:02:06 +0200, Jiri Olsa wrote: > If the perf record command is interrupted in record__mmap_read_all > function, the 'done' is set and err has the latest poll return > value, which is most likely positive number (= number of pollfds > ready to read). > > This 'positive err' is then propagated to the exit code, resulting > in not finishing the perf.data header properly, causing following > error in report: > > # perf record -F 5 -a > > --- > make the system real busy, so there's more chance > to interrupt perf in event writing code > --- > > ^C[ perf record: Woken up 16 times to write data ] > [ perf record: Captured and wrote 30.292 MB perf.data (~1323468 samples) ] > > # perf report --stdio > /dev/null > WARNING: The perf.data file's data size field is 0 which is unexpected. > Was the 'perf record' command properly terminated? > > Fixing this by checking for positive poll return value > and setting err to 0. Acked-by: Namhyung Kim Thanks, Namhyung -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
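The fix described in the quoted changelog can be illustrated outside of perf. The following is a minimal sketch, not the perf source: record_exit_status() and its poll_results[] argument are hypothetical stand-ins for the record loop and for successive poll() return values. It only shows why a positive poll result has to be reset to 0 before it is used as an exit status.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the record loop: poll_results[] plays the
 * role of successive poll() return values, and the loop is treated as
 * interrupted ("done") after the last entry.
 */
int record_exit_status(const int *poll_results, int nr)
{
	int err = 0;
	int i;

	assert(poll_results && nr > 0);

	for (i = 0; i < nr; i++) {
		err = poll_results[i];
		if (err < 0)
			break;	/* a real poll() failure ends the loop early */
	}

	/* A positive value only counts ready descriptors; it is not an error. */
	if (err > 0)
		err = 0;

	return err;
}
```

Without the final check, an interrupted run whose last poll() reported ready descriptors would leave the function with a positive value and be treated as a failure by the caller.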
RE: [PATCH] watchdog: imx2_wdt: adds big endianness support.
> > @@ -201,6 +204,10 @@ static int __init imx2_wdt_probe(struct platform_device > *pdev) > > if (!wdev) > > return -ENOMEM; > > > > + big_endian = of_property_read_bool(np, "big-endian"); > > + if (big_endian) > > + imx2_wdt_regmap_config.val_format_endian = REGMAP_ENDIAN_BIG; > > + > > You'll need to document the use of this property in > Documentation/devicetree/bindings/watchdog/fsl-imx-wdt.txt. > Sorry for the late reply. Please see the next version. Thanks, BRs Xiubo > Guenter > > > res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > > base = devm_ioremap_resource(&pdev->dev, res); > > if (IS_ERR(base))
Re: nfs4_do_reclaim lockdep pop in v3.15.0-rc1
On Mon, Jun 2, 2014 at 5:59 PM, Trond Myklebust wrote: > On Mon, Jun 2, 2014 at 6:49 PM, John Stultz wrote: >> On Mon, Jun 2, 2014 at 3:42 PM, Trond Myklebust >> wrote: >>> The so_reclaim_seqcount only exists in order to tell the other threads >>> that they may need to replay file open or file lock requests that have >>> raced with state recovery (because those threads got scheduled out >>> after their RPC calls ran, but before they managed to set up the >>> tracking of the new state). It is basically an edge condition >>> killer... >> >> Would then swapping the acquisition order, so the seqcount is taken >> before the so_lock at the top of nfs4_reclaim_open_state() avoid this >> then, without having to disable lockdep? >> > > I can change the write seqcount to use raw_write_seqcount(), but that So this doesn't address my suggestion to change the locking order... is that solution not feasible? > doesn't answer the question of why raw_seqcount_begin() is the _only_ > object out there with a "raw_" prefix, that doesn't explicitly disable > lockdep checking. > > What justifies the inconsistency? Here's the naming discussion... https://lkml.org/lkml/2014/1/2/404 thanks -john On Mon, Jun 2, 2014 at 5:59 PM, Trond Myklebust wrote: > On Mon, Jun 2, 2014 at 6:49 PM, John Stultz wrote: >> On Mon, Jun 2, 2014 at 3:42 PM, Trond Myklebust >> wrote: >>> On Mon, Jun 2, 2014 at 6:12 PM, John Stultz wrote: On Mon, Jun 2, 2014 at 9:02 AM, Trond Myklebust wrote: > On Mon, Jun 2, 2014 at 10:49 AM, Jeff Layton > wrote: >> I've been working on the patchset to break up the client_mutex in nfsd. >> While doing some debugging, I had mounted my kernel git tree with >> NFSv4.1, and was running crash on the vmlinux image in it. >> >> A little while later, I saw the following lockdep inversion pop. >> Unfortunately, I couldn't get the whole log, but I think it's enough to >> show that there's a potential problem? 
>> >> I've not had time to give it a hard look yet, but thought I'd post it >> here in the hopes that it might look familiar to someone: >> >> [ 2581.104687] == >> [ 2581.104716] [ INFO: possible circular locking dependency detected ] >> [ 2581.104716] 3.15.0-rc1.jlayton.1+ #2 Tainted: G OE >> [ 2581.104716] --- >> [ 2581.104716] 2001:470:8:d63:/5622 is trying to acquire lock: >> [ 2581.104716] (&(>so_lock)->rlock){+.+...}, at: >> [] nfs4_do_reclaim+0x5bd/0x7f0 [nfsv4] >> [ 2581.104716] >> [ 2581.104716] but task is already holding lock: >> [ 2581.104716] (>so_reclaim_seqcount){+.+...}, at: >> [] nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] >> [ 2581.104716] >> [ 2581.104716] which lock already depends on the new lock. >> [ 2581.104716] >> [ 2581.104716] >> [ 2581.104716] the existing dependency chain (in reverse order) is: >> [ 2581.104716] >> -> #1 (>so_reclaim_seqcount){+.+...}: >> [ 2581.104716][] lock_acquire+0xa2/0x1d0 >> [ 2581.104716][] nfs4_do_reclaim+0x290/0x7f0 >> [nfsv4] >> [ 2581.104716][] >> nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] >> [ 2581.104716][] kthread+0xff/0x120 >> [ 2581.104716][] ret_from_fork+0x7c/0xb0 >> [ 2581.104716] >> -> #0 (&(>so_lock)->rlock){+.+...}: >> [ 2581.104716][] __lock_acquire+0x1b8f/0x1ca0 >> [ 2581.104716][] lock_acquire+0xa2/0x1d0 >> [ 2581.104716][] _raw_spin_lock+0x3e/0x80 >> [ 2581.104716][] nfs4_do_reclaim+0x5bd/0x7f0 >> [nfsv4] >> [ 2581.104716][] >> nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] >> [ 2581.104716][] kthread+0xff/0x120 >> [ 2581.104716][] ret_from_fork+0x7c/0xb0 >> [ 2581.104716] >> [ 2581.104716] other info that might help us debug this: >> [ 2581.104716] >> [ 2581.104716] Possible unsafe locking scenario: >> [ 2581.104716] >> [ 2581.104716]CPU0CPU1 >> [ 2581.104716] >> [ 2581.104716] lock(>so_reclaim_seqcount); >> [ 2581.104716] >> lock(&(>so_lock)->rlock); >> [ 2581.104716] >> lock(>so_reclaim_seqcount); >> [ 2581.104716] lock(&(>so_lock)->rlock); >> [ 2581.104716] >> [ 2581.104716] *** DEADLOCK *** >> 
[ 2581.104716] >> [ 2581.104716] 1 lock held by 2001:470:8:d63:/5622: >> [ 2581.104716] #0: (>so_reclaim_seqcount){+.+...}, at: >> [] nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] >> [ 2581.104716] >> [ 2581.104716] stack backtrace: >> [ 2581.104716] CPU: 2 PID: 5622 Comm: 2001:470:8:d63: Tainted: G >> OE 3.15.0-rc1.jlayton.1+ #2 >> [ 2581.104716] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Re: [PATCH] tools: perf: builtin-trace.c: Cleaning up memory leak
On Sun, 1 Jun 2014 13:38:26 +0200, Rickard Strandqvist wrote: > There is a risk for memory leak in when something unexpected happens > and the function returns. > > This was largely found by using a static code analysis program called > cppcheck. Acked-by: Namhyung Kim Thanks, Namhyung
Re: [PATCH 2/4] tools lib traceevent: Add options to plugins
On Mon, 19 May 2014 23:29:35 +0900 Namhyung Kim wrote: > > - * struct plugin_option PEVENT_PLUGIN_OPTIONS[] = { > > + * struct pevent_plugin_option PEVENT_PLUGIN_OPTIONS[] = { > > * { > > * .name = "option-name", > > * .plugin_alias = "overide-file-name", (optional) > > @@ -355,7 +355,7 @@ enum pevent_func_arg_type { > > enum pevent_flag { > > PEVENT_NSEC_OUTPUT = 1,/* output in NSECS */ > > PEVENT_DISABLE_SYS_PLUGINS = 1 << 1, > > - PEVENT_DISABLE_PLUGINS = 1 << 2, > > + PEVENT_DISABLE_PLUGINS = 1 << 2 > > Unnecessary change? Hmm, no idea why I changed that. > > +/** > > + * traceevent_plugin_list_options - get list of plugin options > > + * > > + * Returns an array of char strings that list the currently registered > > + * plugin options in the format of :. This list can be > > + * used by toggling the option. > > + * > > + * Returns NULL if there's no options registered. On error it returns > > + * an (char **)-1 (must check for that) > > What about making it a macro like INVALID_OPTION_LIST? Yeah, I could do this. > > > + * > > + * Must be freed with traceevent_plugin_free_options_list(). > > + */ > > +char **traceevent_plugin_list_options(void) > > +{ > > + struct registered_plugin_options *reg; > > + struct pevent_plugin_option *op; > > + char **list = NULL; > > + char *name; > > + int count = 0; > > + > > + for (reg = registered_options; reg; reg = reg->next) { > > + for (op = reg->options; op->name; op++) { > > + char *alias = op->plugin_alias ? op->plugin_alias : > > op->file; > > + > > + name = malloc(strlen(op->name) + strlen(alias) + 2); > > + if (!name) > > + goto err; > > + > > + sprintf(name, "%s:%s", alias, op->name); > > + list = realloc(list, count + 2); > > + if (!list) { > > This will lost the original list pointer. Please use a temp variable. Will fix. 
> > > > + free(name); > > + goto err; > > + } > > + list[count++] = name; > > + list[count] = NULL; > > + } > > + } > > + if (!count) > > + return NULL; > > + return list; > > Isn't it enough to simply return the list? Yep, will do. > > > + > > + err: > > + while (--count > 0) > > Shouldn't it be >= instead of > ? > Fixed. Thanks, -- Steve > Thanks, > Namhyung > > > > + free(list[count]); > > + free(list); > > + > > + return (char **)((unsigned long)-1); > > +}
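The two review points in this thread (keep the old pointer when realloc() fails, and free index 0 on the error path) can be sketched in a standalone form. This is an illustration under assumptions, not the traceevent code: list_append() and list_free() are hypothetical helpers, and sizing the array in units of sizeof(char *) is this sketch's own choice, which the review above does not discuss.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: append a copy of str to a NULL-terminated list.
 * Returns 0 on success, -1 on failure. On failure *listp is untouched,
 * so the caller can still free what it already owns.
 */
int list_append(char ***listp, int *countp, const char *str)
{
	char **temp;
	char *copy;

	assert(listp && countp && str);

	copy = malloc(strlen(str) + 1);
	if (!copy)
		return -1;
	strcpy(copy, str);

	/* Room for the existing entries, the new one and the NULL terminator. */
	temp = realloc(*listp, (*countp + 2) * sizeof(*temp));
	if (!temp) {
		free(copy);
		return -1;	/* the old array in *listp is still valid */
	}

	temp[(*countp)++] = copy;
	temp[*countp] = NULL;
	*listp = temp;
	return 0;
}

/* Free every entry, including index 0, then the array itself. */
void list_free(char **list, int count)
{
	while (--count >= 0)
		free(list[count]);
	free(list);
}
```

Assigning the realloc() result to a temporary first is what keeps the original array reachable on failure; with "--count > 0" in the cleanup loop the entry at index 0 would never be freed.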
Re: [PATCH] tools: lib: traceevent: parse-filter.c: Cleaning up memory leak
Hi Rickard, On Sun, 1 Jun 2014 13:36:56 +0200, Rickard Strandqvist wrote: > There is a risk for memory leak in when something unexpected happens > and the function returns. > > This was largely found by using a static code analysis program called > cppcheck. Nice work! Acked-by: Namhyung Kim Thanks, Namhyung
Re: [PATCH RFC 2/2] rcu: Add Josh Triplett as designated reviewer
On Mon, 2014-06-02 at 18:51 -0700, Josh Triplett wrote: > git send-email can invoke 'scripts/get_maintainer.pl --no-rolestats' > directly via --to-cmd or -cc-cmd; that works fine as long as you don't > have a cover letter. > > Depending on the system I'm running on, and whether it's more convenient > to invoke git-send-email or to edit patch mails and send them with 'mutt > -H', I have a shell pipeline which invokes get_maintainer.pl on an > entire patch series, collects all the email addresses it returns, and > inserts them all into each mail as CCs. (That way, when I send a > cross-subsystem patch series, I don't get a pile of maintainers confused > that they only received a couple of the numbered patches.) One example: I think that as long as the appropriate mailing lists receive the cover letter, any real maintainer won't be confused. > { echo -n "To: " ; for x in *.patch ; do scripts/get_maintainer.pl > --no-rolestats < $x | fgrep -v j...@joshtriplett.org ; done | sort -u | sed > 's/$/, /;$s/, $//' | tr -d '\n' ; echo ; } | sed -i '/^From:/r/dev/stdin' > > Personally, I'd find it handy if one of the following happened: > > - git send-email (and ideally also git format-patch) grew an option to > collect *all* the to-cmd and cc-cmd output from each patch and apply > it to every patch (including the cover letter). The biggest issue with doing that is the quantity of names and addresses on the [0/n] patch can easily exceed vger's 1024 byte maximum header size limit. I drop all but the primary maintainers and just cc lists. I use a couple of scripts for that (attached) for the "--to_cmd" and "--cc_cmd" options Another possibility is to add a new "--bcc_cmd" to git send-email so that vger's header limit can be worked around. I had patches to git to do that awhile ago. > - get_maintainer.pl accepted multiple patchfile names and output the > union of the results. 
Ideally, get_maintainer.pl would also have a -i > option to edit the patch files and insert the addresses in the mail > headers. Why would get_maintainer.pl have any option like that? Tools for uses. Scripting. Aren't we good at that sort of thing? to.sh Description: application/shellscript cc.sh Description: application/shellscript
Re: [PATCH] perf: fix 'make help' message error
Hi, Namhyung, On Tue, Jun 3, 2014 at 10:51 AM, Namhyung Kim wrote: > > s/condiered/considered/ ? > > Oops, sorry for the typo. >>> has an empty value, so "prefix" is honored. However, "prefix" is >>> unconditionally >>> assigned to $HOME, regardless of what it is set to from command line. So our >>> "prefix" setting got no respect and the actual destination falls back to >>> $HOME. >>> >>> This patch fixes this issue and corrects the help message. > > With that changed, > > Acked-by: Namhyung Kim Thanks. I renew the patch for maintainers to pick up. -8<- Subject: [PATCH] perf: fix 'make help' message error Currently 'make help' message has such hint: use "make prefix= " to install to a particular path like make prefix=/usr/local install install-doc But this is misleading, when I specify "prefix=/usr/local", it has got no respect at all. This is because that, "DESTDIR" is considered first. In this case, "DESTDIR" has an empty value, so "prefix" is honored. However, "prefix" is unconditionally assigned to $HOME, regardless of what it is set to from command line. So our "prefix" setting got no respect and the actual destination falls back to $HOME. This patch fixes this issue and corrects the help message. 
Acked-by: Namhyung Kim Signed-off-by: Jianyu Zhan --- tools/perf/Makefile.perf | 4 ++-- tools/perf/config/Makefile | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 895edd3..5918063 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -784,8 +784,8 @@ help: @echo '' @echo 'Perf install targets:' @echo ' NOTE: documentation build requires asciidoc, xmlto packages to be installed' - @echo ' HINT: use "make prefix= " to install to a particular' - @echo 'path like make prefix=/usr/local install install-doc' + @echo ' HINT: use "prefix" or "DESTDIR" to install to a particular' + @echo 'path like "make prefix=/usr/local install install-doc"' @echo ' install- install compiled binaries' @echo ' install-doc- install *all* documentation' @echo ' install-man- install manpage documentation' diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile index 802cf54..53dc11e 100644 --- a/tools/perf/config/Makefile +++ b/tools/perf/config/Makefile @@ -601,7 +601,7 @@ endif # Make the path relative to DESTDIR, not to prefix ifndef DESTDIR -prefix = $(HOME) +prefix ?= $(HOME) endif bindir_relative = bin bindir = $(prefix)/$(bindir_relative) -- 2.0.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] perf script: pass more arguments to the python event handler
Hi Sebastian, On Fri, 30 May 2014 18:25:23 +0200, Sebastian Andrzej Siewior wrote: > This patch extends the current argument list in case of events by > - IP / addr of the event. Currently only the function name is passed. > - seconds and ns as the timestamp. Split into two value to stay close to > what the trace handler passes. > - the pid of the proccess > > I added a common_ prefix to stay close to the naming of the "trace" > handler. Currently I don't mind dropping it since "comm" isn't named > "common_comm". Any suggestions? Please take a look at Joseph's work on the same direction (and more). https://lkml.org/lkml/2014/4/3/217 Thanks, Namhyung
Re: [PATCH] toosl/perf: convert "==" into "="
Hi Arnaldo, On Fri, 30 May 2014 12:47:09 -0300, Arnaldo Carvalho de Melo wrote: > Em Fri, May 30, 2014 at 05:37:29PM +0200, Jiri Olsa escreveu: >> On Fri, May 30, 2014 at 12:20:11PM -0300, Arnaldo Carvalho de Melo wrote: >> > Em Fri, May 30, 2014 at 02:44:46PM +0200, Jean Delvare escreveu: >> > > I don't have anything to do with this, I'm not the author of the code >> > > nor the maintainer and I don't know anything about it. Arnaldo Carvalho >> > > de Melo is the right person to handle this bug. >> > >> > Jiri, > >> >Please take this patch, you can stick my: > >> > Acked-by: Arnaldo Carvalho de Melo > >> sure, but I dont see the patch on the lkml.. any chance of resend? > >> I guess I could dig the patch from above.. if you confirm it's >> the only change ;-) > > Hey, no need to go to such great lenghts, submitters must try and make > it easier to maintainers! :-) > > Find it attached, zhangdianfang, please send it as an attachment next > time, also please CC lkml as well. [SNIP] > > convert "==" into "=" > > Bug description: https://bugzilla.kernel.org/show_bug.cgi?id=76751 > > Cc: Jean Delvare > Reported-by: David Binderman > Signed-off-by: Dianfang Zhang Acked-by: Namhyung Kim Thanks, Namhyung > --- > tools/perf/ui/browser.c |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c > index d11541d..3ccf6e1 100644 > --- a/tools/perf/ui/browser.c > +++ b/tools/perf/ui/browser.c > @@ -194,7 +194,7 @@ int ui_browser__warning(struct ui_browser *browser, int > timeout, > ui_helpline__vpush(format, args); > va_end(args); > } else { > - while ((key == ui__question_window("Warning!", text, > + while ((key = ui__question_window("Warning!", text, > "Press any key...", > timeout)) == K_RESIZE) > ui_browser__handle_resize(browser);
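The one-character change in the quoted hunk is easier to see in a reduced form. The sketch below is not the perf UI code: fake_question_window(), wait_for_key() and the K_RESIZE / K_ENTER values are hypothetical stand-ins. It shows the corrected idiom, where the result of the call is assigned to key and then compared with the resize code.

```c
#include <assert.h>

/* Hypothetical key codes; the real values live in the perf UI headers. */
#define K_RESIZE 1
#define K_ENTER  2

/* Scripted key source standing in for ui__question_window(). */
static const int *next_key;

int fake_question_window(void)
{
	return *next_key++;
}

/*
 * Keep asking for a key while the answer is a resize event, counting
 * the resizes. Note the single '=': the call result is stored in key
 * first and only then compared with K_RESIZE.
 */
int wait_for_key(const int *keys, int *resizes)
{
	int key;

	assert(keys && resizes);

	next_key = keys;
	*resizes = 0;
	while ((key = fake_question_window()) == K_RESIZE)
		(*resizes)++;

	return key;
}
```

With "==" in place of "=", key would never be assigned: the loop condition would compare an uninitialized variable with the call result, and the value returned to the caller would be garbage.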
Re: [PATCH] perf: fix 'make help' message error
Hi Jianyu, On Tue, 3 Jun 2014 00:53:16 +0800, Jianyu Zhan wrote: > On Tue, Jun 3, 2014 at 12:44 AM, Jianyu Zhan wrote: >> Hi, Namhyung, >> >>>I don't know what's the correct way to do this. But it seems like the >>>prefix was overwritten when given from user, so below patch will work >>>also. >> >> This does work too. So I update the patch as below: >> >> ---8<--- >> From: Jianyu Zhan >> Date: Sat, 24 May 2014 22:34:26 +0800 >> Subject: [PATCH] perf: fix 'make help' message error >> >> Currently 'make help' message has such hint: >> >>use "make prefix= " to install to a particular >>path like make prefix=/usr/local install install-doc >> >> But this is misleading, when I specify "prefix=/usr/local", it has got no >> respect at all. >> >> This is because that, "DESTDIR" is condiered first. In this case, "DESTDIR" s/condiered/considered/ ? >> has an empty value, so "prefix" is honored. However, "prefix" is >> unconditionally >> assigned to $HOME, regardless of what it is set to from command line. So our >> "prefix" setting got no respect and the actual destination falls back to >> $HOME. >> >> This patch fixes this issue and corrects the help message. 
With that changed, Acked-by: Namhyung Kim Thanks, Namhyung >> --- >> tools/perf/Makefile.perf | 4 ++-- >> tools/perf/config/Makefile | 2 +- >> 2 files changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf >> index 895edd3..5918063 100644 >> --- a/tools/perf/Makefile.perf >> +++ b/tools/perf/Makefile.perf >> @@ -784,8 +784,8 @@ help: >> @echo '' >> @echo 'Perf install targets:' >> @echo ' NOTE: documentation build requires asciidoc, xmlto packages >> to be installed' >> - @echo ' HINT: use "make prefix= " to install >> to a particular' >> - @echo 'path like make prefix=/usr/local install install-doc' >> + @echo ' HINT: use "prefix" or "DESTDIR" to install to a particular' >> + @echo 'path like "make prefix=/usr/local install >> install-doc"' >> @echo ' install- install compiled binaries' >> @echo ' install-doc- install *all* documentation' >> @echo ' install-man- install manpage documentation' >> diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile >> index 802cf54..53dc11e 100644 >> --- a/tools/perf/config/Makefile >> +++ b/tools/perf/config/Makefile >> @@ -601,7 +601,7 @@ endif >> >> >> # Make the path relative to DESTDIR, not to prefix >> ifndef DESTDIR >> -prefix = $(HOME) >> +prefix ?= $(HOME) >> endif >> bindir_relative = bin >> bindir = $(prefix)/$(bindir_relative) > > Cc Namyung with a correct email... > > > Thanks, > Jianyu Zhan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/4] tools lib traceevent: Added support for __get_bitmask() macro
On Tue, 03 Jun 2014 11:43:26 +0900 Namhyung Kim wrote: > Hi Steve, > > On Fri, 16 May 2014 10:02:19 -0400, Steven Rostedt wrote: > > @@ -3691,6 +3784,23 @@ static void print_str_arg(struct trace_seq *s, void > > *data, int size, > > case PRINT_BSTRING: > > print_str_to_seq(s, format, len_arg, arg->string.string); > > break; > > + case PRINT_BITMASK: { > > + int bitmask_offset; > > + int bitmask_size; > > + > > + if (arg->bitmask.offset == -1) { > > + struct format_field *f; > > + > > + f = pevent_find_any_field(event, arg->bitmask.bitmask); > > + arg->bitmask.offset = f->offset; > > + } > > + bitmask_offset = data2host4(pevent, data + arg->string.offset); > > s/string.offset/bitmask.offset/ Duh, thanks! -- Steve > > Thanks, > Namhyung > > > > + bitmask_size = bitmask_offset >> 16; > > + bitmask_offset &= 0xffff; > > + print_bitmask_to_seq(pevent, s, format, len_arg, > > +data + bitmask_offset, bitmask_size); > > + break; > > + }
Re: [PATCH 4/4] tools lib traceevent: Added support for __get_bitmask() macro
Hi Steve, On Fri, 16 May 2014 10:02:19 -0400, Steven Rostedt wrote: > @@ -3691,6 +3784,23 @@ static void print_str_arg(struct trace_seq *s, void > *data, int size, > case PRINT_BSTRING: > print_str_to_seq(s, format, len_arg, arg->string.string); > break; > + case PRINT_BITMASK: { > + int bitmask_offset; > + int bitmask_size; > + > + if (arg->bitmask.offset == -1) { > + struct format_field *f; > + > + f = pevent_find_any_field(event, arg->bitmask.bitmask); > + arg->bitmask.offset = f->offset; > + } > + bitmask_offset = data2host4(pevent, data + arg->string.offset); s/string.offset/bitmask.offset/ Thanks, Namhyung > + bitmask_size = bitmask_offset >> 16; > + bitmask_offset &= 0xffff; > + print_bitmask_to_seq(pevent, s, format, len_arg, > + data + bitmask_offset, bitmask_size); > + break; > + }
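The quoted hunk reads one 32-bit word from the record and splits it into a payload offset and a payload size. A minimal sketch of that split follows, assuming the layout the hunk implies (size in the upper 16 bits, offset in the lower 16 bits, so the mask is taken to be 0xffff). read_data_loc() and struct data_loc are hypothetical names, and the sketch reads the word in host byte order instead of going through data2host4().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: where the payload sits and how long it is. */
struct data_loc {
	unsigned int offset;
	unsigned int size;
};

/*
 * Read the 32-bit location word stored at field_offset inside the
 * record and split it. Assumed layout: size in the upper 16 bits,
 * offset in the lower 16 bits.
 */
struct data_loc read_data_loc(const unsigned char *record,
			      unsigned int field_offset)
{
	struct data_loc loc;
	uint32_t word;

	assert(record);

	/* memcpy avoids an unaligned access; host byte order is assumed. */
	memcpy(&word, record + field_offset, sizeof(word));
	loc.size = word >> 16;
	loc.offset = word & 0xffff;
	return loc;
}
```

The review comment in this thread is about field_offset: the word has to be read at the bitmask field's own offset, not at the offset stored for a string argument.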
linux-next: manual merge of the net-next tree with the net tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in net/ipv6/output_core.c between commit 39c36094d78c ("net: fix inet_getid() and ipv6_select_ident() bugs") from the net tree and commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") from the net-next tree. I fixed it up (the latter removed the code that the former updated, so I did that) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au
linux-next: manual merge of the net-next tree with the net tree
Hi all, Today's linux-next merge of the net-next tree got a conflict in include/net/inetpeer.h between commit 39c36094d78c ("net: fix inet_getid() and ipv6_select_ident() bugs") from the net tree and commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") from the net-next tree. I fixed it up (the latter removed the code that the former updated, so I did that) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH] tools: lib: traceevent: parse-filter.c: Cleaning up memory leak
On Sun, 1 Jun 2014 13:36:56 +0200 Rickard Strandqvist wrote: > There is a risk for memory leak in when something unexpected happens > and the function returns. > > This was largely found by using a static code analysis program called > cppcheck. > > Signed-off-by: Rickard Strandqvist Acked-by: Steven Rostedt -- Steve > --- > tools/lib/traceevent/parse-filter.c |4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/tools/lib/traceevent/parse-filter.c > b/tools/lib/traceevent/parse-filter.c > index b502344..5865c9e 100644 > --- a/tools/lib/traceevent/parse-filter.c > +++ b/tools/lib/traceevent/parse-filter.c > @@ -1492,8 +1492,10 @@ static int copy_filter_type(struct event_filter > *filter, > arg->boolean.value = 0; > > filter_type = add_filter_type(filter, event->id); > - if (filter_type == NULL) > + if (filter_type == NULL) { > + free_arg(arg); > return -1; > + } > > filter_type->filter = arg; >
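The pattern behind this patch is general: an object allocated earlier in a function has to be released on every early return that happens before ownership is handed over. The sketch below is a reduced illustration, not the parse-filter.c code. struct arg, struct slot and attach_arg() are hypothetical, and the simulate_failure flag stands in for add_filter_type() returning NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the filter argument and its owner. */
struct arg {
	int value;
};

struct slot {
	struct arg *filter;
};

/*
 * Allocate an argument and hand it to the slot. simulate_failure plays
 * the role of the registration step failing after the allocation.
 * Returns 0 on success and -1 on failure, without leaking the argument.
 */
int attach_arg(struct slot *slot, int value, int simulate_failure)
{
	struct arg *arg;

	assert(slot);

	arg = malloc(sizeof(*arg));
	if (!arg)
		return -1;
	arg->value = value;

	if (simulate_failure) {
		free(arg);	/* nobody else holds a reference yet */
		return -1;
	}

	slot->filter = arg;	/* ownership moves to the slot */
	return 0;
}
```

Once the pointer has been stored in the slot, freeing it becomes the owner's job, which is why only the early-return path needs the extra free.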
Re: [PATCH] sysctl: Fix division by zero in percpu_pagelist_fraction handler
Hello! On Jun 2, 2014, at 9:40 PM, David Rientjes wrote: >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 5dba293..91d0265 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -5854,7 +5854,7 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table >> *table, int write, >> int ret; >> >> ret = proc_dointvec_minmax(table, write, buffer, length, ppos); >> -if (!write || (ret < 0)) >> +if (!write || (ret < 0) || !*length) >> return ret; >> >> mutex_lock(&pcp_batch_high_lock); > This hasn't made it to linux-next yet (probably because you didn't cc > Andrew Morton, the mm maintainer), but I'm wondering why it's needed. > Shouldn't this value always be >= min_percpu_pagelist_fract? > > If there's something going on in proc_dointvec_minmax() that disregards > that minimum then we need to fix it rather than the caller. It's needed because at the very beginning of proc_dointvec_minmax it checks if the length is 0 and if it is, immediately returns success because there's nothing to be done from its perspective. And since percpu_pagelist_fraction is 0 from the very beginning, the next step is division by this value. If length is not 0 on the other hand, then there's some value proc_dointvec_minmax can interpret and the min and max checks happen and everything works correctly. This is kind of hard to fix in proc_dointvec_minmax because there's no passed in value to check against, I think. Sure, you can also check the passed in pointer to value to make sure it's within range, but that goes beyond the scope of the function. You might as well just assign a sane value to percpu_pagelist_fraction, which has the added benefit that when you cat that /proc file, you'll get the real value the kernel uses instead of 0. Bye, Oleg
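The failure mode described in this reply can be reproduced in a few lines of plain C. The sketch below is a simplified model, not the sysctl handler: update_fraction(), MIN_FRACTION and the managed_pages argument are hypothetical, and the value 8 for the minimum is an assumption made for the example. It shows the guard from the patch: a write that carries no data must return before the stored fraction, which starts at 0, is used as a divisor.

```c
#include <assert.h>

/* Assumed lower bound, standing in for min_percpu_pagelist_fract. */
#define MIN_FRACTION 8

/* Starts at 0, like percpu_pagelist_fraction in the discussion above. */
static int fraction;

/*
 * Simplified model of the handler. Reads and zero-length writes carry
 * no new value, so they return before any division takes place.
 * Returns 0 on success and -1 for an out-of-range value.
 */
int update_fraction(int write, unsigned long length, int new_value,
		    unsigned long managed_pages, unsigned long *high)
{
	assert(high);

	if (!write || !length)
		return 0;	/* nothing parsed, fraction may still be 0 */

	if (new_value < MIN_FRACTION)
		return -1;	/* the range check only runs when data exists */

	fraction = new_value;
	*high = managed_pages / fraction;
	return 0;
}
```

Dropping the "!length" test would let an empty write reach the division with fraction still 0, which is the crash the patch addresses.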
[PATCH V3] regulator: DA9211 : new regulator driver
This is the driver for the Dialog DA9211 Multi-phase 12A DC-DC Buck Converter regulator. It communicates via an I2C bus to the device. Signed-off-by: James Ban --- This patch is relative to linux-next repository tag next-20140530. Changes in V3: - Removed voltage selection in the da9211_regulator_set_suspend_voltage. Changes in V2: - Removed the redundant interrupt code. drivers/regulator/Kconfig| 10 + drivers/regulator/Makefile | 1 + drivers/regulator/da9211-regulator.c | 704 +++ drivers/regulator/da9211-regulator.h | 271 ++ include/linux/regulator/da9211.h | 65 5 files changed, 1051 insertions(+) create mode 100644 drivers/regulator/da9211-regulator.c create mode 100644 drivers/regulator/da9211-regulator.h create mode 100644 include/linux/regulator/da9211.h diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig index 789eb46..f5040fc 100644 --- a/drivers/regulator/Kconfig +++ b/drivers/regulator/Kconfig @@ -198,6 +198,16 @@ config REGULATOR_DA9210 converter 12A DC-DC Buck controlled through an I2C interface. +config REGULATOR_DA9211 + tristate "Dialog Semiconductor DA9211/DA9212 regulator" + depends on I2C + select REGMAP_I2C + help + Say y here to support for the Dialog Semiconductor DA9211/DA9212. + The DA9211/DA9212 is a multi-phase synchronous step down + converter 12A DC-DC Buck controlled through an I2C + interface. 
+ config REGULATOR_DBX500_PRCMU bool diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile index d461110..aa4a6aa 100644 --- a/drivers/regulator/Makefile +++ b/drivers/regulator/Makefile @@ -27,6 +27,7 @@ obj-$(CONFIG_REGULATOR_DA9052)+= da9052-regulator.o obj-$(CONFIG_REGULATOR_DA9055) += da9055-regulator.o obj-$(CONFIG_REGULATOR_DA9063) += da9063-regulator.o obj-$(CONFIG_REGULATOR_DA9210) += da9210-regulator.o +obj-$(CONFIG_REGULATOR_DA9211) += da9211-regulator.o obj-$(CONFIG_REGULATOR_DBX500_PRCMU) += dbx500-prcmu.o obj-$(CONFIG_REGULATOR_DB8500_PRCMU) += db8500-prcmu.o obj-$(CONFIG_REGULATOR_FAN53555) += fan53555.o diff --git a/drivers/regulator/da9211-regulator.c b/drivers/regulator/da9211-regulator.c new file mode 100644 index 000..e3ef43a --- /dev/null +++ b/drivers/regulator/da9211-regulator.c @@ -0,0 +1,704 @@ +/* + * da9211-regulator.c - Regulator device driver for DA9211 + * Copyright (C) 2014 Dialog Semiconductor Ltd. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Library General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Library General Public License for more details. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "da9211-regulator.h" + +#define DA9211_BUCK_MODE_SLEEP 1 +#define DA9211_BUCK_MODE_SYNC 2 +#define DA9211_BUCK_MODE_AUTO 3 + +/* DA9211 REGULATOR IDs */ +#define DA9211_ID_BUCKA0 +#define DA9211_ID_BUCKB1 + +struct da9211_conf_reg { + int reg; + int sel_mask; +}; + +struct da9211_volt_reg { + int reg_a; + int reg_b; + int v_mask; +}; + +struct da9211_mode_reg { + int reg; + int mask; + int shift; +}; + +struct da9211_regulator_info { + struct regulator_desc reg_desc; + struct da9211_conf_reg conf; + struct da9211_volt_reg volt; + struct da9211_mode_reg mode; + int current_shift; +}; + +struct da9211 { + struct device *dev; + struct regmap *regmap; + struct da9211_pdata *pdata; + struct da9211_regulator_info *info; + struct regulator_dev *rdev[DA9211_MAX_REGULATORS]; + int num_regulator; + int chip_irq; +}; + +struct da9211_regulator { + struct da9211 *da9211; + struct da9211_regulator_info *info; + struct regulator_dev *rdev; + enum da9211_gpio_rsel_select reg_rselect; +}; + +static const struct regmap_config da9211_regmap_config = { + .reg_bits = 8, + .val_bits = 8, +}; + +/* Default limits measured in millivolts and milliamps */ +#define DA9211_MIN_MV 300 +#define DA9211_MAX_MV 1570 +#define DA9211_STEP_MV 10 + +/* Current limits for buck (uA) indices corresponds with register values */ +static const int da9211_current_limits[] = { + 200, 220, 240, 260, 280, 300, 320, 340, + 360, 380, 400, 420, 440, 460, 480, 500 +}; + +static unsigned int da9211_buck_get_mode(struct regulator_dev *rdev) +{ +
[git pull] m68knommu arch fixes for 3.16
Hi Linus, Can you please pull the m68knommu git tree, for-next branch. Nothing too big, just a handfull of small changes. A couple of dragonball fixes, coldfire qspi cleanup and fixes, and some coldfire gpio cleanup, fixes and extensions. Regards Greg The following changes since commit c7208164e66f63e3ec1759b98087849286410741: Linux 3.15-rc7 (2014-05-25 16:06:00 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git for-next for you to fetch changes up to 83c6bdb827c9422fe6e02130d9546800143304c1: m68knommu: Implement gpio support for m54xx. (2014-05-26 13:28:38 +1000) Daniel Palmer (2): m68knommu: Fix mach_sched_init for EZ and VZ DragonBall chips m68k: fix a compiler warning when building for DragonBall Steven King (7): m68knommu: Add qspi clk for Coldfire SoCs without real clks. m68knommu: Fix the 5249/525x qspi base address. m68knommu: qspi declutter. m68knommu: add to_irq function so we can map gpios to external interrupts. m68knommu: setting the gpio data direction register to output doesn't dependent upon the value to output! m68knommu: Make everything thats not exported, static. m68knommu: Implement gpio support for m54xx. 
arch/m68k/include/asm/m525xsim.h| 2 +- arch/m68k/include/asm/m54xxsim.h| 12 +--- arch/m68k/include/asm/mcfgpio.h | 12 arch/m68k/kernel/setup_no.c | 13 ++--- arch/m68k/platform/68000/m68EZ328.c | 3 ++- arch/m68k/platform/68000/m68VZ328.c | 1 + arch/m68k/platform/coldfire/gpio.c | 34 +++--- arch/m68k/platform/coldfire/m520x.c | 8 ++-- arch/m68k/platform/coldfire/m523x.c | 10 -- arch/m68k/platform/coldfire/m5249.c | 10 -- arch/m68k/platform/coldfire/m525x.c | 2 ++ arch/m68k/platform/coldfire/m5272.c | 2 ++ arch/m68k/platform/coldfire/m527x.c | 10 -- arch/m68k/platform/coldfire/m528x.c | 10 -- arch/m68k/platform/coldfire/m53xx.c | 8 ++-- 15 files changed, 74 insertions(+), 63 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: mm,console: circular dependency between console_sem and zone lock
On Sun, 01 Jun 2014 10:08:21 -0400 Sasha Levin wrote:

> On 05/12/2014 12:28 PM, Jan Kara wrote:
> > On Wed 07-05-14 22:03:08, Sasha Levin wrote:
> >> Hi all,
> >>
> >> While fuzzing with trinity inside a KVM tools guest running the latest -next kernel I've stumbled on the following spew:
> >
> > Thanks for the report. So the problem seems to be marginally valid but I'm not 100% sure whom to blame :). So printk() code calls up() which calls try_to_wake_up() under console_sem.lock spinlock. That function can take rq->lock which is all expected.
> >
> > The next part of the chain is that during CPU initialization we call __sched_fork() with rq->lock which calls into hrtimer_init() which can allocate memory which creates a dependency rq->lock => zone.lock.rlock.
> >
> > And memory management code calls printk() with zone.lock.rlock held which closes the loop. Now I suspect the second link in the chain can happen only while CPU is booting and might even happen only if some debug options are enabled. But I don't really know scheduler code well enough. Steven?
>
> I've cc'ed Peter and Ingo who may be able to answer that, as it still happens on -next.

Hmm, it failed on a try lock, but on the spinlock within the trylock. I wonder if we should add this. Peter?

-- Steve

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index 6815171..6579f84 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -132,7 +132,9 @@ int down_trylock(struct semaphore *sem)
 	unsigned long flags;
 	int count;
 
-	raw_spin_lock_irqsave(&sem->lock, flags);
+	if (!raw_spin_trylock_irqsave(&sem->lock, flags))
+		return 1;
+
 	count = sem->count - 1;
 	if (likely(count >= 0))
 		sem->count = count;
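The idea in the patch above can be tried outside the kernel. What follows is only a rough userspace sketch, assuming a pthread mutex as a stand-in for the raw spinlock; `struct toy_sem`, `toy_sem_init()` and `toy_down_trylock()` are made-up names for illustration, not kernel API. It keeps the inverted return convention of down_trylock() (0 means acquired, 1 means not acquired) and gives up instead of waiting when the inner lock is contended.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for the kernel's struct semaphore. */
struct toy_sem {
	pthread_mutex_t lock;	/* protects count, playing the role of sem->lock */
	int count;		/* number of slots still available */
};

static void toy_sem_init(struct toy_sem *sem, int count)
{
	assert(count >= 0);
	pthread_mutex_init(&sem->lock, NULL);
	sem->count = count;
}

/*
 * Returns 0 if the semaphore was acquired and 1 otherwise, the same
 * inverted convention as down_trylock(). It never blocks: if the inner
 * lock is busy, it reports failure rather than waiting for it.
 */
static int toy_down_trylock(struct toy_sem *sem)
{
	int count;

	if (pthread_mutex_trylock(&sem->lock) != 0)
		return 1;

	count = sem->count - 1;
	if (count >= 0)
		sem->count = count;
	pthread_mutex_unlock(&sem->lock);

	return count < 0;
}
```

One consequence visible in the sketch: a caller can now see a failure even while slots are free, simply because someone else held the inner lock at that instant. Whether that is acceptable for console_sem is exactly the question being put to Peter.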
Re: [PATCH RFC 2/2] rcu: Add Josh Triplett as designated reviewer
On Mon, Jun 02, 2014 at 06:07:18PM -0700, Randy Dunlap wrote:
> On 06/02/2014 05:02 PM, j...@joshtriplett.org wrote:
> > On Mon, Jun 02, 2014 at 01:38:56PM -0700, Randy Dunlap wrote:
> >> On 06/02/2014 01:36 PM, Joe Perches wrote:
> >>> On Mon, 2014-06-02 at 13:35 -0700, Andrew Morton wrote:
> >>>> On Mon, 2 Jun 2014 10:00:20 -0700 "Paul E. McKenney" wrote:
> >>>>
> >>>>> --- a/MAINTAINERS
> >>>>> +++ b/MAINTAINERS
> >>>>> @@ -7321,6 +7321,7 @@ F: kernel/rcu/torture.c
> >>>>>
> >>>>> RCUTORTURE TEST FRAMEWORK
> >>>>> M: "Paul E. McKenney"
> >>>>> +R: Josh Triplett
> >>>>> L: linux-kernel@vger.kernel.org
> >>>>> S: Supported
> >>>>> T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >>>>
> >>>> I like the general principle - knowing who to poke regarding a kernel change is useful.
> >>>>
> >>>> I don't care much whether it's "M:" or "R:", although "R:" carries more meaning and hence is probably better.
> >>>>
> >>>> But why not "Cc:"? That's meaningful too and is more copy-n-paste friendly.
> >>
> >> Josh, what are you assuming that Andrew and I did not?
> >
> > Not sure what you mean here. Responding to the text you quoted: I have no particular need to bikeshed the tag name, so if you prefer "Cc" and can convince get_maintainer.pl to handle it, fine by me.
>
> Sorry, what I meant is that Andrew and I both mentioned copy-paste and you replied earlier (and I have already deleted it) that copy-paste shouldn't be necessary for someone who is using get_maintainer.pl.
>
> Do you redirect its output to your patch file and then edit it or does get_maintainer.pl work with git-send-email or something else? If something else, what is it, please?

Oh, I see; that was in text you hadn't quoted, so I didn't know what you were asking. :)

git send-email can invoke 'scripts/get_maintainer.pl --no-rolestats' directly via --to-cmd or --cc-cmd; that works fine as long as you don't have a cover letter.
Depending on the system I'm running on, and whether it's more convenient to invoke git-send-email or to edit patch mails and send them with 'mutt -H', I have a shell pipeline which invokes get_maintainer.pl on an entire patch series, collects all the email addresses it returns, and inserts them all into each mail as CCs. (That way, when I send a cross-subsystem patch series, I don't get a pile of maintainers confused that they only received a couple of the numbered patches.) One example:

{ echo -n "To: " ; for x in *.patch ; do scripts/get_maintainer.pl --no-rolestats < $x | fgrep -v j...@joshtriplett.org ; done | sort -u | sed 's/$/, /;$s/, $//' | tr -d '\n' ; echo ; } | sed -i '/^From:/r/dev/stdin'

Personally, I'd find it handy if one of the following happened:

- git send-email (and ideally also git format-patch) grew an option to collect *all* the to-cmd and cc-cmd output from each patch and apply it to every patch (including the cover letter).

- get_maintainer.pl accepted multiple patchfile names and output the union of the results. Ideally, get_maintainer.pl would also have a -i option to edit the patch files and insert the addresses in the mail headers.
Re: [PATCH] sysctl: Fix division by zero in percpu_pagelist_fraction handler
On Sat, 3 May 2014, Oleg Drokin wrote: > percpu_pagelist_fraction_sysctl_handler calls proc_dointvec_minmax > and blindly assumes that return value of 0 means success. > In fact the other valid case is when it got a zero length input. > > After that it proceeds to a division by percpu_pagelist_fraction > value which is conveniently set to a default of zero, resulting in > division by zero. > > Other than checking the bytecount to be more than zero, perhaps > a better default value for percpu_pagelist_fraction would help too. > > [ 661.985469] divide error: [#1] SMP DEBUG_PAGEALLOC > [ 661.985868] Modules linked in: binfmt_misc cfg80211 rfkill rpcsec_gss_krb5 > ttm drm_kms_helper drm i2c_piix4 microcode i2c_core joydev serio_raw pcspkr > virtio_blk nfsd > [ 661.986008] CPU: 1 PID: 9142 Comm: badarea_io Not tainted > 3.15.0-rc2-vm-nfs+ #19 > [ 661.986008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ 661.986008] task: 8800d5aeb6e0 ti: 8800d87a2000 task.ti: > 8800d87a2000 > [ 661.986008] RIP: 0010:[] [] > percpu_pagelist_fraction_sysctl_handler+0x84/0x120 > [ 661.988031] RSP: 0018:8800d87a3e78 EFLAGS: 00010246 > [ 661.988031] RAX: 0f89 RBX: 88011f7fd000 RCX: > > [ 661.988031] RDX: RSI: 0001 RDI: > 0010 > [ 661.988031] RBP: 8800d87a3e98 R08: 81d002c8 R09: > 8800d87a3f50 > [ 661.988031] R10: 000b R11: 0246 R12: > 0060 > [ 661.988031] R13: 81c3c3e0 R14: 81cfddf8 R15: > 8801193b0800 > [ 661.988031] FS: 7f614f1e9740() GS:88011f44() > knlGS: > [ 661.988031] CS: 0010 DS: ES: CR0: 8005003b > [ 661.988031] CR2: 7f614f1fa000 CR3: d9291000 CR4: > 06e0 > [ 661.988031] Stack: > [ 661.988031] 0001 ffea 81c3c3e0 > > [ 661.988031] 8800d87a3ee8 8122b163 8800d87a3f50 > 7fff1564969c > [ 661.988031] 8800d8098f00 7fff1564969c > 8800d87a3f50 > [ 661.988031] Call Trace: > [ 661.988031] [] proc_sys_call_handler+0xb3/0xc0 > [ 661.988031] [] proc_sys_write+0x14/0x20 > [ 661.988031] [] vfs_write+0xba/0x1e0 > [ 661.988031] [] SyS_write+0x46/0xb0 > [ 661.988031] [] tracesys+0xe1/0xe6 > [ 
661.988031] Code: 1f 84 00 00 00 00 00 48 83 bb b0 06 00 00 00 0f 84 7c 00 00 00 48 63 0d 93 6a e1 00 48 8b 83 b8 06 00 00 31 d2 41 bc 60 00 00 00 <48> f7 f1 ba 01 00 00 00 49 89 c5 48 c1 e8 02 48 85 c0 48 0f 44
>
> Signed-off-by: Oleg Drokin
> CC: Rohit Seth
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5dba293..91d0265 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5854,7 +5854,7 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write,
>  	int ret;
>
>  	ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
> -	if (!write || (ret < 0))
> +	if (!write || (ret < 0) || !*length)
>  		return ret;
>
>  	mutex_lock(&pcp_batch_high_lock);

This hasn't made it to linux-next yet (probably because you didn't cc Andrew Morton, the mm maintainer), but I'm wondering why it's needed. Shouldn't this value always be >= min_percpu_pagelist_fract? If there's something going on in proc_dointvec_minmax() that disregards that minimum then we need to fix it rather than the caller.
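A small userspace model may help with the question of how the value can stay below the minimum. This is a sketch under assumptions, not the kernel code: `toy_parse_int()` only imitates the behaviour described in the changelog (a zero-length write returns 0 and stores nothing), `toy_fraction` stands in for percpu_pagelist_fraction with its default of zero, and the minimum of 8 is chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tunable, defaulting to zero like percpu_pagelist_fraction. */
static int toy_fraction;

/*
 * Toy stand-in for the parsing helper: stores the parsed value in *val.
 * A zero-length write is not an error, so it returns 0 and leaves *val
 * untouched. The minimum is only checked when something is parsed.
 */
static int toy_parse_int(const char *buf, size_t len, int *val, int min)
{
	long v;

	if (len == 0)
		return 0;
	v = strtol(buf, NULL, 10);
	if (v < min)
		return -1;
	*val = (int)v;
	return 0;
}

/*
 * Returns pages / toy_fraction after a successful write, 0 if nothing
 * was written, or -1 if the value was rejected.
 */
static long toy_handler(const char *buf, size_t len, long pages)
{
	int ret = toy_parse_int(buf, len, &toy_fraction, 8);

	if (ret < 0)
		return ret;
	if (len == 0)		/* the guard the patch adds */
		return 0;
	assert(toy_fraction != 0);
	return pages / toy_fraction;
}
```

In this model the minimum is enforced on every value that is actually parsed, yet an empty write still reports success with the default of zero in place. Without the length check the division would run on that zero, which matches the trace in the changelog.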
Re: [PATCH 2/4] perf tools: allow user to specify hardware breakpoint bp_len
Hi Jiri,

On Fri, 30 May 2014 15:39:06 +0200, Jiri Olsa wrote:
> On Thu, May 29, 2014 at 05:26:51PM +0200, Frederic Weisbecker wrote:
>> From: Jacob Shin
>>
>> Currently bp_len is given a default value of 4. Allow user to override it:
>>
>> $ perf stat -e mem:0x1000/8
>>                           ^
>>                           bp_len
>>
>> If no value is given, it will default to 4 as it did before.
>
> Namhyung,
> both perf tools patches from this patchset mess up with hists tests.. I haven't found any connection yet.. any idea? ;-)

So you already found the problem in the hpp->elide change and that's the reason for the failure, right? :)

Thanks,
Namhyung
Re: [PATCH -mm] slab: delete cache from list after __kmem_cache_shutdown succeeds
On Thu, 15 May 2014, Vladimir Davydov wrote:

> Currently, on kmem_cache_destroy we delete the cache from the slab_list before __kmem_cache_shutdown, inserting it back to the list on failure. Initially, this was done, because we could release the slab_mutex in __kmem_cache_shutdown to delete sysfs slub entry, but since commit 41a212859a4d ("slub: use sysfs'es release mechanism for kmem_cache") we remove sysfs entry later in kmem_cache_destroy after dropping the slab_mutex, so that no implementation of __kmem_cache_shutdown can ever release the lock. Therefore we can simplify the code a bit by moving list_del after __kmem_cache_shutdown.
>
> Signed-off-by: Vladimir Davydov

Acked-by: David Rientjes
Re: 3.15 regression: wrong cgroup magic
On Mon, Jun 2, 2014 at 6:22 PM, Tejun Heo wrote:
>
> Linus, can you please cherry-pick the commit?

I'd much rather see it go through the proper channels than go ahead and cherry-pick from some branch that hasn't even been sent to me yet.

The whole "you have to send things to me for me to take them" policy is not new, I don't want to start taking stuff that the authors/maintainers haven't actively sent my way.

That said, I suspect that Greg didn't expect this to actually matter (the commit message certainly doesn't make it sound like anything that people would notice), so the reason it is in -next is likely that nobody thought it was a regression.

Of course, Greg could just send it to me for my next branch (since the merge window for 3.16 is already open) and tell me that it's also stable material for 3.15. At _that_ point I'll happily cherry-pick it into master...

Linus
Re: [PATCH RFC 1/2] MAINTAINERS: Add "R:" designated-reviewers tag
On Tue, 3 Jun 2014 11:11:25 +1000 Dave Chinner wrote: > You've ignored the (c).(2) "free of known issues" criteria there. > You cannot say a patch is free of issues if you haven't applied, > compiled and tested it. > > > We should not, for instance, prevent someone from providing a > > Reviewed-by (as opposed to an Acked-by) for a driver whose hardware few > > people actually have. There's significant value in code review even > > without the ability to test. > > I don't disagree with you that there's value in code review, but > that's not the only part of what "reviewed-by" means. > > You can test that the code is free of known issues without reviewing > it (i.e. tested-by). You can read the code and note that you can't > see any technical issues without testing it (Acked-by). Unless you run every test imaginable on all existing hardware, you are not stating that it is free of known issues. I say your logic is flawed right there. I find that review finds more bugs than testing does. > > But you can't say that is it both free of techical and known > issues without both reading the code and testing it (Reviewed-by). I disagree. Testing only tests what you run. It's useless otherwise. Most code I review, and find bugs for in that review, will not be caught by tests unless you ran it on a 1024 CPU box for a week. I value looking hard at code much more than booting it and running some insignificant micro test. > > > > Anyone using Reviewed-by without having actually applied and tested > > > the patch is mis-using the tag - they should be using Acked-by: if > > > all they have done is read the code in their mail program > > > > Acked-by and Reviewed-by mean two different things (Reviewed-by being a > > superset of Acked-by), and the difference is not "I've applied and > > tested this"; that's Tested-by. > > Right, the difference is more than that - Reviewed-by is a > superset of both Acked-by and Tested-by. I disagree. 
>
> And, yes, this is the definition we've been using for "reviewed-by" for XFS code since, well, years before the "reviewed-by" tag even existed...

Fine, just like all else. That is up to the maintainer to decide. You may require people to run and test it as their review, but I require that people understand the code I write and look for those flaws that 99% of tests won't catch.

I run lots of specific tests on the code I write, I don't expect those that review my code to do the same. In fact that's never what I even ask for when I ask someone to review my code. Note, I do ask for testers when I want people to test it, but those are not the same people that review my code.

I find the reviewers of my code to be the worst testers. That's because those that I ask to review my code know what it's supposed to do, and those are the people that are not going to stumble over bugs. It's the people that have no idea how your code works that will trigger the most bugs in testing it. My best testers would be my worst reviewers.

What do you require as a test anyway? I could boot your patches, but since I don't have an XFS filesystem, I doubt that would be much use for you.

-- Steve
Re: [RESEND PATCH] kexec : add sparse memory related values to vmcore
On 2014/5/29 8:13, Simon Horman wrote:
> On Wed, May 28, 2014 at 09:49:56PM +0800, Liu Hua wrote:
>> This patch deals with the sparse memory model.
>>
>> For ARM32 platforms, different vendors may define different SECTION_SIZE_BITS, which we did not write to vmcore.
>>
>> For example:
>>
>> 1 arch/arm/mach-clps711x/include/mach/memory.h
>>   #define SECTION_SIZE_BITS 24
>> 2 arch/arm/mach-exynos/include/mach/memory.h
>>   #define SECTION_SIZE_BITS 28
>> 3 arch/arm/mach-sa1100/include/mach/memory.h
>>   #define SECTION_SIZE_BITS 27
>
> I wonder if this problem will eventually go away, or at least only apply to older platforms, as ARM moves towards multiplatform: a single kernel for more than one platform.
>
>> It is really bad news for user space tools such as makedumpfile and crash, which have to define them as macros. So for the same architecture, we may need to recompile them to parse vmcores with different SECTION_SIZE_BITS.
>>
>> And if we enable LPAE, MAX_PHYSMEM_SIZE can also be variable.
>>
>> This patch adds these SECTION_SIZE_BITS and MAX_PHYSMEM_SIZE to vmcore, which makes user space tools more compatible.
>>
>> BTW, makedumpfile has queued the related patch.
>>
>> Signed-off-by: Liu Hua
>> ---
>> kernel/kexec.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/kexec.c b/kernel/kexec.c
>> index bf0b929e..8b1a193 100644
>> --- a/kernel/kexec.c
>> +++ b/kernel/kexec.c
>> @@ -1577,6 +1577,8 @@ static int __init crash_save_vmcoreinfo_init(void)
>>  	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
>>  	VMCOREINFO_STRUCT_SIZE(mem_section);
>>  	VMCOREINFO_OFFSET(mem_section, section_mem_map);
>> +	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
>> +	VMCOREINFO_NUMBER(SECTION_SIZE_BITS);
>>  #endif
>>  	VMCOREINFO_STRUCT_SIZE(page);
>>  	VMCOREINFO_STRUCT_SIZE(pglist_data);
>> --
>> 1.9.0
>>
>> ___
>> kexec mailing list
>> ke...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>
> .
>
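To make the benefit for tools like makedumpfile and crash concrete, here is a minimal sketch of how a user space tool might read the two new values at run time instead of compiling them in. It assumes the vmcoreinfo note is plain text with one `NUMBER(<name>)=<value>` entry per line; `vmcoreinfo_number()` and the three-argument `pfn_to_section_nr()` are hypothetical helpers written for this note, not code from either tool. The shift by SECTION_SIZE_BITS minus the page shift follows the way the kernel derives a section index from a page frame number.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Looks up a "NUMBER(<name>)=<value>" line in a vmcoreinfo text blob.
 * Returns 0 and stores the value on success, -1 if the key is absent
 * or malformed. (Hypothetical helper for illustration.)
 */
static int vmcoreinfo_number(const char *info, const char *name, long *out)
{
	char key[64];
	const char *p;
	int n;

	n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;

	for (p = strstr(info, key); p; p = strstr(p + 1, key)) {
		/* Only accept a match that starts a line. */
		if (p == info || p[-1] == '\n')
			return sscanf(p + n, "%ld", out) == 1 ? 0 : -1;
	}
	return -1;
}

/* Section index of a page frame, computed from run-time values. */
static unsigned long pfn_to_section_nr(unsigned long pfn,
				       long section_size_bits,
				       long page_shift)
{
	assert(section_size_bits >= page_shift);
	return pfn >> (section_size_bits - page_shift);
}
```

With 4 KiB pages, the page frame at physical address 256 MiB lands in section 1 when SECTION_SIZE_BITS is 28 and in section 16 when it is 24, which is why a tool built for one value misreads a vmcore from a platform using the other.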
Re: 3.15 regression: wrong cgroup magic
On Tue, Jun 03, 2014 at 09:18:25AM +0800, Li Zefan wrote:
> > commit 2bd59d48ebfb3df41ee56938946ca0dd30887312
> > Author: Tejun Heo
> > Date: Tue Feb 11 11:52:49 2014 -0500
> >
> > cgroup: convert to kernfs
> >
> > In particular, this piece:
> >
> > - sb->s_magic = CGROUP_SUPER_MAGIC;
> >
> > The result is that cgroup shows up with the wrong magic number, so my code goes "oh crap, cgroupfs isn't mounted" and fails.
> >
> > I can change my code to hack around this, but I can imagine other things getting tripped up. Is there still time to fix this?
> >
>
> This should be fixed by "kernfs: move the last knowledge of sysfs out from kernfs".
>
> It's in driver-core-next.
>
> https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git/commit/?h=driver-core-next&id=26fc9cd200ec839e0b3095e05ae018f27314e7aa

Right, I was writing about the same patch with a nagging sense of deja-vu. I should have noticed that this must go through driver-core-linus not -next. Sorry about that.

Linus, can you please cherry-pick the commit?

Thanks.

-- tejun
Re: 3.15 regression: wrong cgroup magic
Cc: Greg
Cc: Jianyu Zhan

On 2014/6/3 8:56, Andy Lutomirski wrote:
> Sorry I didn't notice this earlier. Linux 3.15 breaks my production

But 3.15 hasn't been released. :)

> system :( The cause appears to be:
>
> commit 2bd59d48ebfb3df41ee56938946ca0dd30887312
> Author: Tejun Heo
> Date: Tue Feb 11 11:52:49 2014 -0500
>
> cgroup: convert to kernfs
>
> In particular, this piece:
>
> - sb->s_magic = CGROUP_SUPER_MAGIC;
>
> The result is that cgroup shows up with the wrong magic number, so my code goes "oh crap, cgroupfs isn't mounted" and fails.
>
> I can change my code to hack around this, but I can imagine other things getting tripped up. Is there still time to fix this?
>

This should be fixed by "kernfs: move the last knowledge of sysfs out from kernfs".

It's in driver-core-next.

https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git/commit/?h=driver-core-next&id=26fc9cd200ec839e0b3095e05ae018f27314e7aa
Re: [PATCH v5 6/6] seccomp: add SECCOMP_EXT_ACT_TSYNC and SECCOMP_FILTER_TSYNC
On Mon, Jun 2, 2014 at 6:09 PM, Kees Cook wrote: > On Mon, Jun 2, 2014 at 5:29 PM, Andy Lutomirski wrote: >> On Mon, Jun 2, 2014 at 5:14 PM, Kees Cook wrote: >>> On Mon, Jun 2, 2014 at 1:53 PM, Andy Lutomirski wrote: On Tue, May 27, 2014 at 12:55 PM, Kees Cook wrote: > On Tue, May 27, 2014 at 12:27 PM, Andy Lutomirski > wrote: >> On Tue, May 27, 2014 at 12:23 PM, Kees Cook >> wrote: >>> On Tue, May 27, 2014 at 12:10 PM, Andy Lutomirski >>> wrote: On Tue, May 27, 2014 at 11:45 AM, Kees Cook wrote: > On Tue, May 27, 2014 at 11:40 AM, Andy Lutomirski > wrote: >> On Tue, May 27, 2014 at 11:24 AM, Kees Cook >> wrote: >>> On Mon, May 26, 2014 at 12:27 PM, Andy Lutomirski >>> wrote: On Fri, May 23, 2014 at 10:05 AM, Kees Cook wrote: > On Thu, May 22, 2014 at 4:11 PM, Andy Lutomirski > wrote: >> On Thu, May 22, 2014 at 4:05 PM, Kees Cook >> wrote: >>> Applying restrictive seccomp filter programs to large or diverse >>> codebases often requires handling threads which may be started >>> early in >>> the process lifetime (e.g., by code that is linked in). While >>> it is >>> possible to apply permissive programs prior to process start >>> up, it is >>> difficult to further restrict the kernel ABI to those threads >>> after that >>> point. >>> >>> This change adds a new seccomp extension action for >>> synchronizing thread >>> group seccomp filters and a prctl() for accessing that >>> functionality, >>> as well as a flag for SECCOMP_EXT_ACT_FILTER to perform sync at >>> filter >>> installation time. >>> >>> When calling prctl(PR_SECCOMP_EXT, SECCOMP_EXT_ACT, >>> SECCOMP_EXT_ACT_FILTER, >>> flags, filter) with flags containing SECCOMP_FILTER_TSYNC, or >>> when calling >>> prctl(PR_SECCOMP_EXT, SECCOMP_EXT_ACT, SECCOMP_EXT_ACT_TSYNC, >>> 0, 0), it >>> will attempt to synchronize all threads in current's >>> threadgroup to its >>> seccomp filter program. 
This is possible iff all threads are >>> using a filter >>> that is an ancestor to the filter current is attempting to >>> synchronize to. >>> NULL filters (where the task is running as SECCOMP_MODE_NONE) >>> are also >>> treated as ancestors allowing threads to be transitioned into >>> SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS, ...) has >>> been set on the >>> calling thread, no_new_privs will be set for all synchronized >>> threads too. >>> On success, 0 is returned. On failure, the pid of one of the >>> failing threads >>> will be returned, with as many filters installed as possible. >> >> Is there a use case for adding a filter and synchronizing filters >> being separate operations? If not, I think this would be easier >> to >> understand and to use if there was just a single operation. > > Yes: if the other thread's lifetime is not well controlled, it's > good > to be able to have a distinct interface to retry the thread sync > that > doesn't require adding "no-op" filters. Wouldn't this still be solved by: seccomp_add_filter(final_filter, SECCOMP_FILTER_ALL_THREADS); the idea would be that, if seccomp_add_filter fails, then you give up and, if it succeeds, then you're done. It shouldn't fail unless out of memory or you've nested too deeply. >>> >>> I wanted to keep the case of being able to to wait for non-ancestor >>> threads to finish. For example, 2 threads start and set separate >>> filters. 1 does work and exits, 2 starts another thread (3) which >>> adds >>> filters, does work, and then waits for 1 to finish by calling TSYNC. >>> Once 1 dies, TSYNC succeeds. In the case of not having direct >>> control >>> over thread lifetime (say, when using third-party libraries), I'd >>> like >>> to retain the flexibility of being able to do TSYNC without needing >>> a >>> filter being attached to it. >> >> I must admit this strikes me as odd. What's the point of having a >> thread set a filter if it intends to be a short-lived thread? 
> > I was illustrating the
Re: 3.15 regression: wrong cgroup magic
On Mon, Jun 2, 2014 at 5:56 PM, Andy Lutomirski wrote:
>
> In particular, this piece:
>
> - sb->s_magic = CGROUP_SUPER_MAGIC;
>
> The result is that cgroup shows up with the wrong magic number, so my code goes "oh crap, cgroupfs isn't mounted" and fails.
>
> I can change my code to hack around this, but I can imagine other things getting tripped up. Is there still time to fix this?

Sure. Send me a tested patch. I'm assuming it's going to look something like

--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -54,6 +54,7 @@
 #include <linux/vmalloc.h> /* TODO: replace with more sophisticated array */
 #include <linux/kthread.h>
 #include <linux/delay.h>
+#include <linux/magic.h>
 
 #include <linux/atomic.h>
 
@@ -1607,6 +1608,8 @@ out_unlock:
 	dentry = kernfs_mount(fs_type, flags, root->kf_root, &new_sb);
 	if (IS_ERR(dentry) || !new_sb)
 		cgroup_put(&root->cgrp);
+	else
+		dentry->d_sb->s_magic = CGROUP_SUPER_MAGIC;
 	return dentry;
 }

but somebody definitely needs to test it.

Linus
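For readers wondering how user space ends up depending on s_magic at all: the value is what statfs(2) reports in f_type. The following is a guess at the kind of check involved, not Andy's actual code; `is_cgroup_magic()` and `path_is_cgroupfs()` are hypothetical names, and the constant repeats the value of CGROUP_SUPER_MAGIC from <linux/magic.h> so the sketch builds without kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>

/* Same value as CGROUP_SUPER_MAGIC in <linux/magic.h>, repeated locally. */
#define TOY_CGROUP_SUPER_MAGIC 0x27e0ebL

/* Pure comparison, kept separate so it can be checked without a mount. */
static int is_cgroup_magic(long f_type)
{
	return f_type == TOY_CGROUP_SUPER_MAGIC;
}

/*
 * Returns 1 if path lives on a filesystem reporting the cgroup magic,
 * 0 if it reports some other magic, and -1 if statfs() itself fails.
 */
static int path_is_cgroupfs(const char *path)
{
	struct statfs st;

	assert(path != NULL);
	if (statfs(path, &st) != 0)
		return -1;
	return is_cgroup_magic((long)st.f_type);
}
```

A check like this starts returning 0 for a correctly mounted hierarchy as soon as the superblock reports a different magic, which is the failure mode described in the report.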
Re: [patch v2] mm, memcg: periodically schedule when emptying page list
On Mon, Jun 02, 2014 at 05:51:25PM -0700, David Rientjes wrote:
> From: Hugh Dickins
>
> mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and
> mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none
> of which indicate that the iteration may be taking too long, is met.
>
> We have encountered the following stack trace many times indicating
> "need_resched set for > 5120 ns (51 ticks) without schedule", for example:
>
> 	scheduler_tick()
>
> 	mem_cgroup_move_account+0x4d/0x1d5
> 	mem_cgroup_move_parent+0x8d/0x109
> 	mem_cgroup_reparent_charges+0x149/0x2ba
> 	mem_cgroup_css_offline+0xeb/0x11b
> 	cgroup_offline_fn+0x68/0x16b
> 	process_one_work+0x129/0x350
>
> If this iteration is taking too long, we still need to do cond_resched() even
> when an individual page is not busy.
>
> [rient...@google.com: changelog]
> Signed-off-by: Hugh Dickins
> Signed-off-by: David Rientjes

Acked-by: Johannes Weiner
[GIT] Networking
1) Unbreak zebra and other netlink apps, from Eric W. Biederman.

2) Some new qmi_wwan device IDs, from Aleksander Morgado.

3) Fix info leak in DCB netlink handler of qlcnic driver, from Dan Carpenter.

4) inet_getid() and ipv6_select_ident() do not generate monotonically
   increasing ID numbers, fix from Eric Dumazet.

5) Fix memory leak in __sk_prepare_filter(), from Leon Yu.

6) Netlink leftover bytes warning message is user triggerable, rate
   limit it. From Michal Schmidt.

7) Fix non-linear SKB panic in ipvs, from Peter Christensen.

8) Congestion window undo needs to be performed even if only never
   retransmitted data is SACK'd, fix from Yuchung Cheng.

Please pull, thanks a lot!

The following changes since commit 1ee1ceafb572f1a925809168267a7962a4289de8:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc (2014-05-23 15:41:52 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to 418c96ac151a16a5094a95d14252c92c1d47ec67:

  net: filter: fix possible memory leak in __sk_prepare_filter() (2014-06-02 17:49:45 -0700)

Aleksander Morgado (3):
      net: qmi_wwan: add Netgear AirCard 341U
      net: qmi_wwan: add additional Sierra Wireless QMI devices
      net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI

Bart De Schuymer (1):
      ebtables: Update MAINTAINERS entry.

Dan Carpenter (1):
      qlcnic: info leak in qlcnic_dcb_peer_app_info()

David S. Miller (4):
      Merge branch 'for-davem' of git://git.kernel.org/.../linville/wireless
      Merge tag 'linux-can-fixes-for-3.15-20140528' of git://gitorious.org/linux-can/linux-can
      Merge branch 'master' of git://git.kernel.org/.../pablo/nf
      Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Emmanuel Grumbach (1):
      iwlwifi: mvm: disable beacon filtering

Eric Dumazet (1):
      net: fix inet_getid() and ipv6_select_ident() bugs

Eric W. Biederman (1):
      netlink: Only check file credentials for implicit destinations

Ivan Mikhaylov (2):
      emac: add missing support of 10mbit in emac/rgmii
      emac: aggregation of v1-2 PLB errors for IER register

Jack Morgenstein (1):
      net/mlx4_core: Reset RoCE VF gids when guest driver goes down

Jean Delvare (1):
      net: ec_bhf: Add runtime dependencies

Jiri Pirko (1):
      team: fix mtu setting

John W. Linville (3):
      Merge branch 'master' of git://git.kernel.org/.../bluetooth/bluetooth
      Merge branch 'for-john' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem

Jon Maxwell (1):
      bridge: notify user space after fdb update

Kristian Evensen (1):
      ipheth: Add support for iPad 2 and iPad 3

Leon Yu (1):
      net: filter: fix possible memory leak in __sk_prepare_filter()

Marek Lindner (1):
      batman-adv: fix NULL pointer dereferences

Michal Schmidt (1):
      netlink: rate-limit leftover bytes warning and print process name

Nikolay Aleksandrov (1):
      net: fix wrong mac_len calculation for vlans

Oliver Hartkopp (1):
      can: only rename enabled led triggers when changing the netdev name

Peter Christensen (1):
      ipvs: Fix panic due to non-linear skb

Samuel Ortiz (1):
      Bluetooth: Fix L2CAP LE debugfs entries permissions

Toshiaki Makita (1):
      bridge: Prevent insertion of FDB entry with disallowed vlan

Yuchung Cheng (1):
      tcp: fix cwnd undo on DSACK in F-RTO

 MAINTAINERS                                           |   3 +-
 drivers/net/can/led.c                                 |   3 ++
 drivers/net/ethernet/Kconfig                          |   1 +
 drivers/net/ethernet/ibm/emac/mal.c                   |   5 +---
 drivers/net/ethernet/ibm/emac/mal.h                   |  20 +
 drivers/net/ethernet/ibm/emac/rgmii.c                 |   3 ++
 drivers/net/ethernet/mellanox/mlx4/main.c             |   1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4.h             |  20 +
 drivers/net/ethernet/mellanox/mlx4/port.c             | 117 +++-
 drivers/net/ethernet/mellanox/mlx4/resource_tracker.c |   3 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_dcb.c       |   1 +
 drivers/net/team/team.c                               |   7 -
 drivers/net/usb/ipheth.c                              |  10 +++
 drivers/net/usb/qmi_wwan.c                            |   6 +++-
 drivers/net/wireless/iwlwifi/mvm/mac80211.c           |   2 +-
 include/linux/if_team.h                               |   1 +
 include/linux/netlink.h                               |   7 +++--
 include/net/inetpeer.h                                |   9 +-
 lib/nlattr.c                                          |   4 +--
 net/batman-adv/multicast.c                            |   6 ++--
Re: [PATCH RFC 1/2] MAINTAINERS: Add "R:" designated-reviewers tag
On Mon, Jun 02, 2014 at 04:59:15PM -0700, j...@joshtriplett.org wrote: > On Tue, Jun 03, 2014 at 09:19:49AM +1000, Dave Chinner wrote: > > On Mon, Jun 02, 2014 at 12:17:46PM -0700, Joe Perches wrote: > > > What it needs is testing, not reviewing. > > > > > > I tested it for all of 10 seconds. > > > > From Documentation/SubmittingPatches: > > > > " (c) While there may be things that could be improved with this > > submission, I believe that it is, at this time, (1) a > > worthwhile modification to the kernel, and (2) free of known > > issues which would argue against its inclusion. > > . > > > > A Reviewed-by tag is a statement of opinion that the patch is an > > appropriate modification of the kernel without any remaining serious > > technical issues." > > > > So, for someone to say they have reviewed the code and are able to > > say it is free of known issues and has no remaining technical > > issues, they would have had to apply, compile and test the patch, > > yes? > > > > i.e. Reviewed-by implies both Acked-by, Tested-by and that the code > > is technically sound. > > No, not at all. It implies Acked-by, and that the code is technically > sound (both at the micro-level and in overall architecture/approach), > but does not imply Tested-by; that's a separate tag for a reason. You've ignored the (c).(2) "free of known issues" criteria there. You cannot say a patch is free of issues if you haven't applied, compiled and tested it. > We should not, for instance, prevent someone from providing a > Reviewed-by (as opposed to an Acked-by) for a driver whose hardware few > people actually have. There's significant value in code review even > without the ability to test. I don't disagree with you that there's value in code review, but that's not the only part of what "reviewed-by" means. You can test that the code is free of known issues without reviewing it (i.e. tested-by). 
You can read the code and note that you can't see any technical issues without testing it (Acked-by). But you can't say that it is free of both technical and known issues without both reading the code and testing it (Reviewed-by).

> > Anyone using Reviewed-by without having actually applied and tested
> > the patch is mis-using the tag - they should be using Acked-by: if
> > all they have done is read the code in their mail program
>
> Acked-by and Reviewed-by mean two different things (Reviewed-by being a
> superset of Acked-by), and the difference is not "I've applied and
> tested this"; that's Tested-by.

Right, the difference is more than that - Reviewed-by is a superset of both Acked-by and Tested-by.

And, yes, this is the definition we've been using for "reviewed-by" for XFS code since, well, years before the "reviewed-by" tag even existed...

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
[RFC PATCH 1/3] CMA: generalize CMA reserved area management functionality
Currently, there are two users of the CMA functionality: one is the DMA subsystem and the other is KVM on powerpc. Each has its own code to manage its CMA reserved area, even though the two look really similar. My guess is that this is caused by differing needs in bitmap management. The KVM side wants to maintain its bitmap not at 1-page granularity but at a coarser one; it ends up using a bitmap where one bit represents 64 pages.

When I implement CMA-related patches, I have to change both places to apply my change, which is painful. I want to change this situation and reduce future code management overhead through this patch.

This change could also help developers who want to use CMA in their new feature development, since they can use CMA easily without copying & pasting this reserved area management code.

Signed-off-by: Joonsoo Kim

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index 00e13ce..b3fe1cc 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -283,7 +283,7 @@ config CMA_ALIGNMENT If unsure, leave the default value "8".
-config CMA_AREAS +config DMA_CMA_AREAS int "Maximum count of the CMA device-private areas" default 7 help diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c index 83969f8..48cdac8 100644 --- a/drivers/base/dma-contiguous.c +++ b/drivers/base/dma-contiguous.c @@ -186,7 +186,7 @@ static int __init cma_activate_area(struct cma *cma) return 0; } -static struct cma cma_areas[MAX_CMA_AREAS]; +static struct cma cma_areas[MAX_DMA_CMA_AREAS]; static unsigned cma_area_count; static int __init cma_init_reserved_areas(void) diff --git a/include/linux/cma.h b/include/linux/cma.h new file mode 100644 index 000..60ba06f --- /dev/null +++ b/include/linux/cma.h @@ -0,0 +1,28 @@ +/* + * Contiguous Memory Allocator + * + * Copyright LG Electronics Inc., 2014 + * Written by: + * Joonsoo Kim + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of the + * License or (at your optional) any later version of the license. + * + */ + +#ifndef __CMA_H__ +#define __CMA_H__ + +struct cma; + +extern struct page *cma_alloc(struct cma *cma, unsigned long count, + unsigned long align); +extern bool cma_release(struct cma *cma, struct page *pages, + unsigned long count); +extern int __init cma_declare_contiguous(phys_addr_t size, phys_addr_t base, + phys_addr_t limit, phys_addr_t alignment, + unsigned long bitmap_shift, bool fixed, + struct cma **res_cma); +#endif diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h index 772eab5..dfb1dc9 100644 --- a/include/linux/dma-contiguous.h +++ b/include/linux/dma-contiguous.h @@ -63,7 +63,7 @@ struct device; * There is always at least global CMA area and a few optional device * private areas configured in kernel .config. 
*/ -#define MAX_CMA_AREAS (1 + CONFIG_CMA_AREAS) +#define MAX_DMA_CMA_AREAS (1 + CONFIG_DMA_CMA_AREAS) extern struct cma *dma_contiguous_default_area; @@ -123,7 +123,7 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages, #else -#define MAX_CMA_AREAS (0) +#define MAX_DMA_CMA_AREAS (0) static inline struct cma *dev_get_cma_area(struct device *dev) { diff --git a/mm/Kconfig b/mm/Kconfig index 7511b4a..0877ddc 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -515,6 +515,17 @@ config CMA_DEBUG processing calls such as dma_alloc_from_contiguous(). This option does not affect warning and error messages. +config CMA_AREAS + int "Maximum count of the CMA areas" + depends on CMA + default 7 + help + CMA allows to create CMA areas for particular purpose, mainly, + used as device private area. This parameter sets the maximum + number of CMA area in the system. + + If unsure, leave the default value "7". + config ZBUD tristate default n diff --git a/mm/Makefile b/mm/Makefile index 1eaa70b..bc0422b 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -62,3 +62,4 @@ obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o obj-$(CONFIG_ZBUD) += zbud.o obj-$(CONFIG_ZSMALLOC) += zsmalloc.o obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o +obj-$(CONFIG_CMA) += cma.o diff --git a/mm/cma.c b/mm/cma.c new file mode 100644 index 000..0dae88d --- /dev/null +++ b/mm/cma.c @@ -0,0 +1,329 @@ +/* + * Contiguous Memory Allocator + * + * Copyright (c) 2010-2011 by Samsung Electronics. + * Copyright IBM Corporation, 2013 + * Copyright LG Electronics Inc., 2014 + * Written by: + * Marek Szyprowski + * Michal Nazarewicz + * Aneesh Kumar K.V + * Joonsoo Kim + * + * This program is free software; you can
Re: [PATCH v5 6/6] seccomp: add SECCOMP_EXT_ACT_TSYNC and SECCOMP_FILTER_TSYNC
On Mon, Jun 2, 2014 at 5:29 PM, Andy Lutomirski wrote: > On Mon, Jun 2, 2014 at 5:14 PM, Kees Cook wrote: >> On Mon, Jun 2, 2014 at 1:53 PM, Andy Lutomirski wrote: >>> On Tue, May 27, 2014 at 12:55 PM, Kees Cook wrote: On Tue, May 27, 2014 at 12:27 PM, Andy Lutomirski wrote: > On Tue, May 27, 2014 at 12:23 PM, Kees Cook wrote: >> On Tue, May 27, 2014 at 12:10 PM, Andy Lutomirski >> wrote: >>> On Tue, May 27, 2014 at 11:45 AM, Kees Cook >>> wrote: On Tue, May 27, 2014 at 11:40 AM, Andy Lutomirski wrote: > On Tue, May 27, 2014 at 11:24 AM, Kees Cook > wrote: >> On Mon, May 26, 2014 at 12:27 PM, Andy Lutomirski >> wrote: >>> On Fri, May 23, 2014 at 10:05 AM, Kees Cook >>> wrote: On Thu, May 22, 2014 at 4:11 PM, Andy Lutomirski wrote: > On Thu, May 22, 2014 at 4:05 PM, Kees Cook > wrote: >> Applying restrictive seccomp filter programs to large or diverse >> codebases often requires handling threads which may be started >> early in >> the process lifetime (e.g., by code that is linked in). While it >> is >> possible to apply permissive programs prior to process start up, >> it is >> difficult to further restrict the kernel ABI to those threads >> after that >> point. >> >> This change adds a new seccomp extension action for >> synchronizing thread >> group seccomp filters and a prctl() for accessing that >> functionality, >> as well as a flag for SECCOMP_EXT_ACT_FILTER to perform sync at >> filter >> installation time. >> >> When calling prctl(PR_SECCOMP_EXT, SECCOMP_EXT_ACT, >> SECCOMP_EXT_ACT_FILTER, >> flags, filter) with flags containing SECCOMP_FILTER_TSYNC, or >> when calling >> prctl(PR_SECCOMP_EXT, SECCOMP_EXT_ACT, SECCOMP_EXT_ACT_TSYNC, 0, >> 0), it >> will attempt to synchronize all threads in current's threadgroup >> to its >> seccomp filter program. This is possible iff all threads are >> using a filter >> that is an ancestor to the filter current is attempting to >> synchronize to. 
>> NULL filters (where the task is running as SECCOMP_MODE_NONE) >> are also >> treated as ancestors allowing threads to be transitioned into >> SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS, ...) has >> been set on the >> calling thread, no_new_privs will be set for all synchronized >> threads too. >> On success, 0 is returned. On failure, the pid of one of the >> failing threads >> will be returned, with as many filters installed as possible. > > Is there a use case for adding a filter and synchronizing filters > being separate operations? If not, I think this would be easier > to > understand and to use if there was just a single operation. Yes: if the other thread's lifetime is not well controlled, it's good to be able to have a distinct interface to retry the thread sync that doesn't require adding "no-op" filters. >>> >>> Wouldn't this still be solved by: >>> >>> seccomp_add_filter(final_filter, SECCOMP_FILTER_ALL_THREADS); >>> >>> the idea would be that, if seccomp_add_filter fails, then you give >>> up >>> and, if it succeeds, then you're done. It shouldn't fail unless out >>> of memory or you've nested too deeply. >> >> I wanted to keep the case of being able to to wait for non-ancestor >> threads to finish. For example, 2 threads start and set separate >> filters. 1 does work and exits, 2 starts another thread (3) which >> adds >> filters, does work, and then waits for 1 to finish by calling TSYNC. >> Once 1 dies, TSYNC succeeds. In the case of not having direct control >> over thread lifetime (say, when using third-party libraries), I'd >> like >> to retain the flexibility of being able to do TSYNC without needing a >> filter being attached to it. > > I must admit this strikes me as odd. What's the point of having a > thread set a filter if it intends to be a short-lived thread? I was illustrating the potential insanity of third-party libraries. 
There isn't much sense in that behavior, but if it exists, working around it is harder without the separate TSYNC-only call.
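As an illustration of the ancestor rule quoted in this thread (a thread can be synchronized only if its current filter is an ancestor of the target filter, and a NULL filter counts as an ancestor of everything), here is a minimal userspace sketch. The `struct filter` type and both function names are hypothetical stand-ins chosen for this example, not the patch's actual code; the sketch also reports an array index where the proposed interface would report a pid.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a seccomp filter object: each filter points
 * at the filter that was installed before it (NULL for the first one).
 */
struct filter {
	const struct filter *prev;
};

/*
 * True if `parent` is NULL (no filter installed yet) or if `parent`
 * appears on the prev-chain that starts at `child`.
 */
bool is_ancestor(const struct filter *parent, const struct filter *child)
{
	if (parent == NULL)
		return true;
	for (; child != NULL; child = child->prev)
		if (child == parent)
			return true;
	return false;
}

/*
 * Index of the first thread whose current filter is not an ancestor of
 * `target`, or -1 if every thread could be synchronized to `target`.
 */
int first_blocking_thread(const struct filter *const threads[], size_t n,
			  const struct filter *target)
{
	for (size_t i = 0; i < n; i++)
		if (!is_ancestor(threads[i], target))
			return (int)i;
	return -1;
}
```

In the three-thread scenario described above, thread 1's separately installed filter is not on thread 3's chain, so a sync from thread 3 is blocked until thread 1 exits; that is the case the TSYNC-only call is meant to let callers retry.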
[RFC PATCH 2/3] DMA, CMA: use general CMA reserved area management framework
Now that we have a general CMA reserved area management framework, use it for future maintainability. There is no functional change.

Signed-off-by: Joonsoo Kim

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index b3fe1cc..4eac559 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -283,16 +283,6 @@ config CMA_ALIGNMENT If unsure, leave the default value "8". -config DMA_CMA_AREAS - int "Maximum count of the CMA device-private areas" - default 7 - help - CMA allows to create CMA areas for particular devices. This parameter - sets the maximum number of such device private CMA areas in the - system. - - If unsure, leave the default value "7". - endif endmenu diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c index 48cdac8..4bce4e1 100644 --- a/drivers/base/dma-contiguous.c +++ b/drivers/base/dma-contiguous.c @@ -24,23 +24,9 @@ #include #include -#include -#include -#include #include -#include -#include -#include #include - -struct cma { - unsigned long base_pfn; - unsigned long count; - unsigned long *bitmap; - struct mutex lock; -}; - -struct cma *dma_contiguous_default_area; +#include #ifdef CONFIG_CMA_SIZE_MBYTES #define CMA_SIZE_MBYTES CONFIG_CMA_SIZE_MBYTES @@ -48,6 +34,8 @@ struct cma *dma_contiguous_default_area; #define CMA_SIZE_MBYTES 0 #endif +struct cma *dma_contiguous_default_area; + /* * Default global CMA area size can be defined in kernel's .config.
* This is useful mainly for distro maintainers to create a kernel @@ -154,55 +142,6 @@ void __init dma_contiguous_reserve(phys_addr_t limit) } } -static DEFINE_MUTEX(cma_mutex); - -static int __init cma_activate_area(struct cma *cma) -{ - int bitmap_size = BITS_TO_LONGS(cma->count) * sizeof(long); - unsigned long base_pfn = cma->base_pfn, pfn = base_pfn; - unsigned i = cma->count >> pageblock_order; - struct zone *zone; - - cma->bitmap = kzalloc(bitmap_size, GFP_KERNEL); - - if (!cma->bitmap) - return -ENOMEM; - - WARN_ON_ONCE(!pfn_valid(pfn)); - zone = page_zone(pfn_to_page(pfn)); - - do { - unsigned j; - base_pfn = pfn; - for (j = pageblock_nr_pages; j; --j, pfn++) { - WARN_ON_ONCE(!pfn_valid(pfn)); - if (page_zone(pfn_to_page(pfn)) != zone) - return -EINVAL; - } - init_cma_reserved_pageblock(pfn_to_page(base_pfn)); - } while (--i); - - mutex_init(>lock); - return 0; -} - -static struct cma cma_areas[MAX_DMA_CMA_AREAS]; -static unsigned cma_area_count; - -static int __init cma_init_reserved_areas(void) -{ - int i; - - for (i = 0; i < cma_area_count; i++) { - int ret = cma_activate_area(_areas[i]); - if (ret) - return ret; - } - - return 0; -} -core_initcall(cma_init_reserved_areas); - /** * dma_contiguous_reserve_area() - reserve custom contiguous area * @size: Size of the reserved area (in bytes), @@ -224,176 +163,31 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base, phys_addr_t limit, struct cma **res_cma, bool fixed) { - struct cma *cma = _areas[cma_area_count]; - phys_addr_t alignment; - int ret = 0; - - pr_debug("%s(size %lx, base %08lx, limit %08lx)\n", __func__, -(unsigned long)size, (unsigned long)base, -(unsigned long)limit); - - /* Sanity checks */ - if (cma_area_count == ARRAY_SIZE(cma_areas)) { - pr_err("Not enough slots for CMA reserved regions!\n"); - return -ENOSPC; - } - - if (!size) - return -EINVAL; - - /* Sanitise input arguments */ - alignment = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order); - base = ALIGN(base, 
alignment); - size = ALIGN(size, alignment); - limit &= ~(alignment - 1); - - /* Reserve memory */ - if (base && fixed) { - if (memblock_is_region_reserved(base, size) || - memblock_reserve(base, size) < 0) { - ret = -EBUSY; - goto err; - } - } else { - phys_addr_t addr = memblock_alloc_range(size, alignment, base, - limit); - if (!addr) { - ret = -ENOMEM; - goto err; - } else { - base = addr; - } - } - - /* -* Each reserved area must be initialised later, when more kernel -* subsystems (like slab allocator) are available. -*/ - cma->base_pfn =
[RFC PATCH 3/3] PPC, KVM, CMA: use general CMA reserved area management framework
Now that we have a general CMA reserved area management framework, use it for future maintainability. There is no functional change.

Signed-off-by: Joonsoo Kim

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c index 8cd0dae..43c3f81 100644 --- a/arch/powerpc/kvm/book3s_hv_builtin.c +++ b/arch/powerpc/kvm/book3s_hv_builtin.c @@ -15,12 +15,14 @@ #include #include #include +#include #include #include #include -#include "book3s_hv_cma.h" +#define KVM_CMA_CHUNK_ORDER 18 + /* * Hash page table alignment on newer cpus(CPU_FTR_ARCH_206) * should be power of 2. @@ -42,6 +44,8 @@ static unsigned long kvm_cma_resv_ratio = 5; unsigned long kvm_rma_pages = (1 << 27) >> PAGE_SHIFT; /* 128MB */ EXPORT_SYMBOL_GPL(kvm_rma_pages); +static struct cma *kvm_cma; + /* Work out RMLS (real mode limit selector) field value for a given RMA size. Assumes POWER7 or PPC970. */ static inline int lpcr_rmls(unsigned long rma_size) @@ -96,7 +100,7 @@ struct kvm_rma_info *kvm_alloc_rma() ri = kmalloc(sizeof(struct kvm_rma_info), GFP_KERNEL); if (!ri) return NULL; - page = kvm_alloc_cma(kvm_rma_pages, kvm_rma_pages); + page = cma_alloc(kvm_cma, kvm_rma_pages, get_order(kvm_rma_pages)); if (!page) goto err_out; atomic_set(&ri->use_count, 1); @@ -111,7 +115,7 @@ EXPORT_SYMBOL_GPL(kvm_alloc_rma); void kvm_release_rma(struct kvm_rma_info *ri) { if (atomic_dec_and_test(&ri->use_count)) { - kvm_release_cma(pfn_to_page(ri->base_pfn), kvm_rma_pages); + cma_release(kvm_cma, pfn_to_page(ri->base_pfn), kvm_rma_pages); kfree(ri); } } @@ -133,13 +137,13 @@ struct page *kvm_alloc_hpt(unsigned long nr_pages) /* Old CPUs require HPT aligned on a multiple of its size */ if (!cpu_has_feature(CPU_FTR_ARCH_206)) align_pages = nr_pages; - return kvm_alloc_cma(nr_pages, align_pages); + return cma_alloc(kvm_cma, nr_pages, get_order(align_pages)); } EXPORT_SYMBOL_GPL(kvm_alloc_hpt); void kvm_release_hpt(struct page *page, unsigned long nr_pages) { - kvm_release_cma(page, nr_pages); +
cma_release(kvm_cma, page, nr_pages); } EXPORT_SYMBOL_GPL(kvm_release_hpt); @@ -178,6 +182,7 @@ void __init kvm_cma_reserve(void) align_size = HPT_ALIGN_PAGES << PAGE_SHIFT; align_size = max(kvm_rma_pages << PAGE_SHIFT, align_size); - kvm_cma_declare_contiguous(selected_size, align_size); + cma_declare_contiguous(selected_size, 0, 0, align_size, + KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, false, _cma); } } diff --git a/arch/powerpc/kvm/book3s_hv_cma.c b/arch/powerpc/kvm/book3s_hv_cma.c deleted file mode 100644 index d9d3d85..000 --- a/arch/powerpc/kvm/book3s_hv_cma.c +++ /dev/null @@ -1,240 +0,0 @@ -/* - * Contiguous Memory Allocator for ppc KVM hash pagetable based on CMA - * for DMA mapping framework - * - * Copyright IBM Corporation, 2013 - * Author Aneesh Kumar K.V - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation; either version 2 of the - * License or (at your optional) any later version of the license. - * - */ -#define pr_fmt(fmt) "kvm_cma: " fmt - -#ifdef CONFIG_CMA_DEBUG -#ifndef DEBUG -# define DEBUG -#endif -#endif - -#include -#include -#include -#include - -#include "book3s_hv_cma.h" - -struct kvm_cma { - unsigned long base_pfn; - unsigned long count; - unsigned long *bitmap; -}; - -static DEFINE_MUTEX(kvm_cma_mutex); -static struct kvm_cma kvm_cma_area; - -/** - * kvm_cma_declare_contiguous() - reserve area for contiguous memory handling - * for kvm hash pagetable - * @size: Size of the reserved memory. - * @alignment: Alignment for the contiguous memory area - * - * This function reserves memory for kvm cma area. It should be - * called by arch code when early allocator (memblock or bootmem) - * is still activate. 
- */ -long __init kvm_cma_declare_contiguous(phys_addr_t size, phys_addr_t alignment) -{ - long base_pfn; - phys_addr_t addr; - struct kvm_cma *cma = _cma_area; - - pr_debug("%s(size %lx)\n", __func__, (unsigned long)size); - - if (!size) - return -EINVAL; - /* -* Sanitise input arguments. -* We should be pageblock aligned for CMA. -*/ - alignment = max(alignment, (phys_addr_t)(PAGE_SIZE << pageblock_order)); - size = ALIGN(size, alignment); - /* -* Reserve memory -* Use __memblock_alloc_base() since -* memblock_alloc_base() panic()s. -*/ - addr = __memblock_alloc_base(size, alignment, 0); - if (!addr) { - base_pfn = -ENOMEM; - goto err; -
[RFC PATCH 0/3] CMA: generalize CMA reserved area management code
Currently, there are two users of the CMA functionality: one is the DMA subsystem and the other is KVM on powerpc. Each has its own code to manage its CMA reserved area, even though the two look really similar. My guess is that this is caused by differing needs in bitmap management. The KVM side wants to maintain its bitmap not at 1-page granularity but at a coarser one; it ends up using a bitmap where one bit represents 64 pages.

When I implement CMA-related patches, I have to change both places to apply my change, which is painful. I want to change this situation and reduce future code management overhead through this patch.

This change could also help developers who want to use CMA in their new feature development, since they can use CMA easily without copying & pasting this reserved area management code.

We are now in the merge window, so this is not for merging. I'd like to hear opinions from people involved with this code before actually trying to merge this patchset. If all agree with this change, I will resend it after rc1.

Thanks.

Joonsoo Kim (3):
  CMA: generalize CMA reserved area management functionality
  DMA, CMA: use general CMA reserved area management framework
  PPC, KVM, CMA: use general CMA reserved area management framework

 arch/powerpc/kvm/book3s_hv_builtin.c |  17 +-
 arch/powerpc/kvm/book3s_hv_cma.c     | 240 -
 arch/powerpc/kvm/book3s_hv_cma.h     |  27 ---
 drivers/base/Kconfig                 |  10 --
 drivers/base/dma-contiguous.c        | 230 ++--
 include/linux/cma.h                  |  28 +++
 include/linux/dma-contiguous.h       |   7 +-
 mm/Kconfig                           |  11 ++
 mm/Makefile                          |   1 +
 mm/cma.c                             | 329 ++
 10 files changed, 396 insertions(+), 504 deletions(-)
 delete mode 100644 arch/powerpc/kvm/book3s_hv_cma.c
 delete mode 100644 arch/powerpc/kvm/book3s_hv_cma.h
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma.c

--
1.7.9.5
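To make the "one bit represents 64 pages" granularity concrete, here is a small arithmetic sketch. Patch 3/3 passes `KVM_CMA_CHUNK_ORDER - PAGE_SHIFT` as the `bitmap_shift` argument, with `KVM_CMA_CHUNK_ORDER` defined as 18. The helper names below are hypothetical and exist only for this illustration, and the 4 KiB page size (`PAGE_SHIFT` of 12) is an assumption; with 64 KiB pages the same chunk order would give 4 pages per bit.

```c
#include <stddef.h>

/* Pages covered by one bitmap bit for a given shift (hypothetical helper). */
unsigned long pages_per_bit(unsigned int bitmap_shift)
{
	return 1UL << bitmap_shift;
}

/* Bitmap bits needed to track `count` pages at that granularity. */
unsigned long bitmap_bits(unsigned long count, unsigned int bitmap_shift)
{
	return count >> bitmap_shift;
}
```

With a chunk order of 18 and an assumed page shift of 12, the shift is 6, so each bit covers 2^6 = 64 pages, and an area of 32768 pages needs a 512-bit bitmap instead of a 32768-bit one.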
Re: [PATCH RFC 2/2] rcu: Add Josh Triplett as designated reviewer
On 06/02/2014 05:02 PM, j...@joshtriplett.org wrote:
> On Mon, Jun 02, 2014 at 01:38:56PM -0700, Randy Dunlap wrote:
>> On 06/02/2014 01:36 PM, Joe Perches wrote:
>>> On Mon, 2014-06-02 at 13:35 -0700, Andrew Morton wrote:
>>>> On Mon, 2 Jun 2014 10:00:20 -0700 "Paul E. McKenney" wrote:
>>>>> --- a/MAINTAINERS
>>>>> +++ b/MAINTAINERS
>>>>> @@ -7321,6 +7321,7 @@ F: kernel/rcu/torture.c
>>>>>
>>>>> RCUTORTURE TEST FRAMEWORK
>>>>> M: "Paul E. McKenney"
>>>>> +R: Josh Triplett
>>>>> L: linux-kernel@vger.kernel.org
>>>>> S: Supported
>>>>> T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>
>>>> I like the general principle - knowing who to poke regarding a kernel
>>>> change is useful. I don't care much whether it's "M:" or "R:", although
>>>> "R:" carries more meaning and hence is probably better. But why not
>>>> "Cc:"? That's meaningful too and is more copy-n-paste friendly.
>>
>> Josh, what are you assuming that Andrew and I did not?
>
> Not sure what you mean here. Responding to the text you quoted: I have
> no particular need to bikeshed the tag name, so if you prefer "Cc" and
> can convince get_maintainer.pl to handle it, fine by me.

Sorry, what I meant is that Andrew and I both mentioned copy-paste and you replied earlier (and I have already deleted it) that copy-paste shouldn't be necessary for someone who is using get_maintainer.pl.

Do you redirect its output to your patch file and then edit it, or does get_maintainer.pl work with git-send-email or something else? If something else, what is it, please?

thanks,
--
~Randy
Re: nfs4_do_reclaim lockdep pop in v3.15.0-rc1
On Mon, Jun 2, 2014 at 6:49 PM, John Stultz wrote: > On Mon, Jun 2, 2014 at 3:42 PM, Trond Myklebust > wrote: >> On Mon, Jun 2, 2014 at 6:12 PM, John Stultz wrote: >>> On Mon, Jun 2, 2014 at 9:02 AM, Trond Myklebust >>> wrote: On Mon, Jun 2, 2014 at 10:49 AM, Jeff Layton wrote: > I've been working on the patchset to break up the client_mutex in nfsd. > While doing some debugging, I had mounted my kernel git tree with > NFSv4.1, and was running crash on the vmlinux image in it. > > A little while later, I saw the following lockdep inversion pop. > Unfortunately, I couldn't get the whole log, but I think it's enough to > show that there's a potential problem? > > I've not had time to give it a hard look yet, but thought I'd post it > here in the hopes that it might look familiar to someone: > > [ 2581.104687] == > [ 2581.104716] [ INFO: possible circular locking dependency detected ] > [ 2581.104716] 3.15.0-rc1.jlayton.1+ #2 Tainted: G OE > [ 2581.104716] --- > [ 2581.104716] 2001:470:8:d63:/5622 is trying to acquire lock: > [ 2581.104716] (&(>so_lock)->rlock){+.+...}, at: > [] nfs4_do_reclaim+0x5bd/0x7f0 [nfsv4] > [ 2581.104716] > [ 2581.104716] but task is already holding lock: > [ 2581.104716] (>so_reclaim_seqcount){+.+...}, at: > [] nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] > [ 2581.104716] > [ 2581.104716] which lock already depends on the new lock. 
> [ 2581.104716] > [ 2581.104716] > [ 2581.104716] the existing dependency chain (in reverse order) is: > [ 2581.104716] > -> #1 (>so_reclaim_seqcount){+.+...}: > [ 2581.104716][] lock_acquire+0xa2/0x1d0 > [ 2581.104716][] nfs4_do_reclaim+0x290/0x7f0 > [nfsv4] > [ 2581.104716][] > nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] > [ 2581.104716][] kthread+0xff/0x120 > [ 2581.104716][] ret_from_fork+0x7c/0xb0 > [ 2581.104716] > -> #0 (&(>so_lock)->rlock){+.+...}: > [ 2581.104716][] __lock_acquire+0x1b8f/0x1ca0 > [ 2581.104716][] lock_acquire+0xa2/0x1d0 > [ 2581.104716][] _raw_spin_lock+0x3e/0x80 > [ 2581.104716][] nfs4_do_reclaim+0x5bd/0x7f0 > [nfsv4] > [ 2581.104716][] > nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] > [ 2581.104716][] kthread+0xff/0x120 > [ 2581.104716][] ret_from_fork+0x7c/0xb0 > [ 2581.104716] > [ 2581.104716] other info that might help us debug this: > [ 2581.104716] > [ 2581.104716] Possible unsafe locking scenario: > [ 2581.104716] > [ 2581.104716]CPU0CPU1 > [ 2581.104716] > [ 2581.104716] lock(>so_reclaim_seqcount); > [ 2581.104716] > lock(&(>so_lock)->rlock); > [ 2581.104716] > lock(>so_reclaim_seqcount); > [ 2581.104716] lock(&(>so_lock)->rlock); > [ 2581.104716] > [ 2581.104716] *** DEADLOCK *** > [ 2581.104716] > [ 2581.104716] 1 lock held by 2001:470:8:d63:/5622: > [ 2581.104716] #0: (>so_reclaim_seqcount){+.+...}, at: > [] nfs4_run_state_manager+0x7ee/0xc00 [nfsv4] > [ 2581.104716] > [ 2581.104716] stack backtrace: > [ 2581.104716] CPU: 2 PID: 5622 Comm: 2001:470:8:d63: Tainted: G > OE 3.15.0-rc1.jlayton.1+ #2 > [ 2581.104716] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ 2581.104716] d29e16c4 8800d8d8fba8 > 817d318e > [ 2581.104716] 8262d5e0 8800d8d8fbe8 817ce525 > 8800d8d8fc40 > [ 2581.104716] 8800362a8b98 8800362a8b98 0001 > 8800362a8000 > [ 2581.104716] Call Trace: > [ 2581.104716] [] dump_stack+0x4d/0x66 > [ 2581.104716] [] print_circular_bug+0x201/0x20f > [ 2581.104716] [] __lock_acquire+0x1b8f/0x1ca0 > [ 2581.104716] [] ? 
> debug_check_no_obj_freed+0x17e/0x270 > [ 2581.104716] [] lock_acquire+0xa2/0x1d0 > [ 2581.104716] [] ? nfs4_do_reclaim+0x5bd/0x7f0 [nfsv4] > [ 2581.104716] [] _raw_spin_lock+0x3e/0x80 > [ 2581.104716] [] ? nfs4_do_reclaim+0x5bd/0x7f0 [nfsv4] > [ 2581.104716] [] nfs4_do_reclaim+0x5bd/0x7f0 [nfsv4] > [ 2581.104716] [] ? nfs4_run_state_manager+0x7ee/0xc00 > [nfsv4] > [ 2581.104716] [] nfs4_run_state_manager+0x7ee/0xc00 > [nfsv4] > [ 2581.104716] [] ? nfs4_do_reclaim+0x7f0/0x7f0 [nfsv4] > [ 2581.104716] [] kthread+0xff/0x120 > [ 2581.104716] [] ? insert_kthread_work+0x80/0x80 > [ 2581.104716] [] ret_from_fork+0x7c/0xb0 > [ 2581.104716] [] ? insert_kthread_work+0x80/0x80 OK. So now that lockdep has been added to raw_seqcount_begin()
3.15 regression: wrong cgroup magic
Sorry I didn't notice this earlier. Linux 3.15 breaks my production system :( The cause appears to be: commit 2bd59d48ebfb3df41ee56938946ca0dd30887312 Author: Tejun Heo Date: Tue Feb 11 11:52:49 2014 -0500 cgroup: convert to kernfs In particular, this piece: - sb->s_magic = CGROUP_SUPER_MAGIC; The result is that cgroup shows up with the wrong magic number, so my code goes "oh crap, cgroupfs isn't mounted" and fails. I can change my code to hack around this, but I can imagine other things getting tripped up. Is there still time to fix this? --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch v2] mm, memcg: periodically schedule when emptying page list
From: Hugh Dickins mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none of which indicate that the iteration may be taking too long, is met. We have encountered the following stack trace many times indicating "need_resched set for > 5120 ns (51 ticks) without schedule", for example: scheduler_tick() mem_cgroup_move_account+0x4d/0x1d5 mem_cgroup_move_parent+0x8d/0x109 mem_cgroup_reparent_charges+0x149/0x2ba mem_cgroup_css_offline+0xeb/0x11b cgroup_offline_fn+0x68/0x16b process_one_work+0x129/0x350 If this iteration is taking too long, we still need to do cond_resched() even when an individual page is not busy. [rient...@google.com: changelog] Signed-off-by: Hugh Dickins Signed-off-by: David Rientjes --- v2: always reschedule if needed, "page" itself may not have a pc mismatch or been unable to isolate. mm/memcontrol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4784,9 +4784,9 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, if (mem_cgroup_move_parent(page, pc, memcg)) { /* found lock contention or "pc" is obsolete. */ busy = page; - cond_resched(); } else busy = NULL; + cond_resched(); } while (!list_empty(list)); }
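The shape of the change can be illustrated with a userspace analogue. This is only a sketch under stated assumptions: sched_yield() stands in for cond_resched(), and struct node, try_move() and drain_list() are hypothetical names, not kernel code. The point it shows is that the yield sits outside the busy/not-busy branch, so it happens on every pass through the loop.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical list element; "busy" models an item that cannot be moved
 * on the first attempt. */
struct node {
    struct node *next;
    int busy;
};

/* Hypothetical stand-in for mem_cgroup_move_parent(): returns nonzero
 * if the item was busy (and clears the flag so a retry succeeds). */
static int try_move(struct node *n)
{
    if (n->busy) {
        n->busy = 0;
        return 1;
    }
    return 0;
}

/* Empties the list and returns how many times we offered to yield.
 * As in the v2 patch, the yield is unconditional: it runs whether or
 * not the current item was busy. */
static size_t drain_list(struct node **head)
{
    size_t yields = 0;

    assert(head != NULL);
    while (*head != NULL) {
        struct node *n = *head;

        if (try_move(n) == 0)
            *head = n->next;   /* moved: unlink it */
        /* else: leave it at the head and retry on the next pass */

        sched_yield();         /* stand-in for cond_resched() */
        yields++;
    }
    return yields;
}
```

With the yield placed inside the busy branch instead, a long list of non-busy items would be walked end to end without ever offering to reschedule, which matches the stack trace in the changelog.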
linux-next: manual merge of the mips tree with Linus' tree
Hi Ralf, Today's linux-next merge of the mips tree got a conflict in arch/mips/mti-malta/malta-memory.c between commit 2ff89d64f23e ("MIPS: malta: memory.c: Initialize the 'memsize' variable") from Linus' tree and commit acd8bc1a70ff ("MIPS: malta: Remove 'maybe_unused' attribute from ememsize{, _str}") from the mips tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au diff --cc arch/mips/mti-malta/malta-memory.c index f2364e419682,1ca34887d990.. --- a/arch/mips/mti-malta/malta-memory.c +++ b/arch/mips/mti-malta/malta-memory.c @@@ -26,8 -26,8 +26,8 @@@ unsigned long physical_memsize = 0L fw_memblock_t * __init fw_getmdesc(int eva) { - char *memsize_str, *ememsize_str __maybe_unused = NULL, *ptr; - unsigned long memsize = 0, ememsize __maybe_unused = 0; + char *memsize_str, *ememsize_str = NULL, *ptr; - unsigned long memsize, ememsize = 0; ++ unsigned long memsize = 0, ememsize = 0; static char cmdline[COMMAND_LINE_SIZE] __initdata; int tmp;
Re: [PATCH] net: filter: fix possible memory leak in __sk_prepare_filter()
From: Leon Yu Date: Sun, 1 Jun 2014 05:37:25 + > __sk_prepare_filter() was reworked in commit bd4cf0ed3 (net: filter: > rework/optimize internal BPF interpreter's instruction set) so that it should > have uncharged memory once things went wrong. However that work isn't > complete. > Error is handled only in __sk_migrate_filter() while memory can still leak in > the error path right after sk_chk_filter(). > > Signed-off-by: Leon Yu Applied, thanks.
[PATCH] driver core: platform: add device binding path 'driver_override'
Needed by platform device drivers, such as the upcoming vfio-platform driver, in order to bypass the existing OF, ACPI, id_table and name string matches, and successfully be able to be bound to any device, like so: echo vfio-platform > /sys/bus/platform/devices/fff51000.ethernet/driver_override echo fff51000.ethernet > /sys/bus/platform/devices/fff51000.ethernet/driver/unbind echo fff51000.ethernet > /sys/bus/platform/drivers_probe This mimics "PCI: Introduce new device binding path using pci_dev.driver_override", which is an interface enhancement for more deterministic PCI device binding, e.g., when in the presence of hotplug. Reviewed-by: Alex Williamson Reviewed-by: Alexander Graf Reviewed-by: Stuart Yoder Signed-off-by: Kim Phillips --- Greg, This is largely identical to the PCI version of the same that has been accepted for v3.16 and ack'd by you: https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009674.html and applied to Bjorn Helgaas' PCI tree: https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization=782a985d7af26db39e86070d28f987cad21313c0 You are the platform driver core maintainer: can you apply this to your driver-core tree now? Thanks, Kim changes since previous version (v2 of the v5 vfio-platform driver series): - commit text: replaced vfio platform driver reference with 'upcoming', and removed the PCI version mailing list reference since it has now been accepted. - added Alex W., Alex G., and Stuart's Reviewed-by's. changes in v2 patch of v5 of this patchseries: - rebased onto today's Linus' ToT - added kfree to match PCI counterpart fix, as Alex Williamson just posted a v3 of the patch (thanks Christoffer for the notification) - in the commit text, replaced vfio platform driver reference with 'later in series', and updated the PCI version mailing list reference to the v3 version. 
Documentation/ABI/testing/sysfs-bus-platform | 20 drivers/base/platform.c | 47 include/linux/platform_device.h | 1 + 3 files changed, 68 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-bus-platform diff --git a/Documentation/ABI/testing/sysfs-bus-platform b/Documentation/ABI/testing/sysfs-bus-platform new file mode 100644 index 000..5172a61 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-platform @@ -0,0 +1,20 @@ +What: /sys/bus/platform/devices/.../driver_override +Date: April 2014 +Contact: Kim Phillips +Description: + This file allows the driver for a device to be specified which + will override standard OF, ACPI, ID table, and name matching. + When specified, only a driver with a name matching the value + written to driver_override will have an opportunity to bind + to the device. The override is specified by writing a string + to the driver_override file (echo vfio-platform > \ + driver_override) and may be cleared with an empty string + (echo > driver_override). This returns the device to standard + matching rules binding. Writing to driver_override does not + automatically unbind the device from its current driver or make + any attempt to automatically load the specified driver. If no + driver with a matching name is currently loaded in the kernel, + the device will not bind to any driver. This also allows + devices to opt-out of driver binding using a driver_override + name such as "none". Only a single driver may be specified in + the override, there is no support for parsing delimiters. 
diff --git a/drivers/base/platform.c b/drivers/base/platform.c index 5b47210..4f47563 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -23,6 +23,7 @@ #include #include #include +#include #include "base.h" #include "power/power.h" @@ -188,6 +189,7 @@ static void platform_device_release(struct device *dev) kfree(pa->pdev.dev.platform_data); kfree(pa->pdev.mfd_cell); kfree(pa->pdev.resource); + kfree(pa->pdev.driver_override); kfree(pa); } @@ -695,8 +697,49 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *a, } static DEVICE_ATTR_RO(modalias); +static ssize_t driver_override_store(struct device *dev, +struct device_attribute *attr, +const char *buf, size_t count) +{ + struct platform_device *pdev = to_platform_device(dev); + char *driver_override, *old = pdev->driver_override, *cp; + + if (count > PATH_MAX) + return -EINVAL; + + driver_override = kstrndup(buf, count, GFP_KERNEL); + if (!driver_override) + return -ENOMEM; + + cp = strchr(driver_override, '\n');
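The quoted patch is cut off right after the strchr() call, so the following is not the rest of it. It is a userspace sketch of the input handling that the ABI text above describes, under the assumption that a trailing newline from echo is dropped and that an empty result means "clear the override". parse_override() is a hypothetical name, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies at most count bytes of buf, cuts the copy at the first newline,
 * and returns the result. Returns NULL when the result is empty (the
 * "clear the override" case) or when allocation fails. The caller
 * frees the returned string. */
static char *parse_override(const char *buf, size_t count)
{
    char *copy, *nl;

    assert(buf != NULL);
    copy = malloc(count + 1);
    if (!copy)
        return NULL;
    memcpy(copy, buf, count);
    copy[count] = '\0';

    /* "echo name > driver_override" appends '\n'; strip it. */
    nl = strchr(copy, '\n');
    if (nl)
        *nl = '\0';

    /* "echo > driver_override" leaves nothing: treat as a clear. */
    if (copy[0] == '\0') {
        free(copy);
        return NULL;
    }
    return copy;
}
```

A real store callback would additionally have to distinguish allocation failure from an intentional clear, and free the previous value, as the kfree() added to platform_device_release() suggests.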
Re: [Patch V3 03/37] x86, mpparse: use pr_lvl() helper utilities to replace printk(KERN_LVL)
On Tue, 27 May 2014, Jiang Liu wrote: > Use pr_lvl() helper utilities to replace printk(KERN_LVL) for readability, > no function changes. > > Signed-off-by: Jiang Liu Acked-by: David Rientjes Eek, lots of newlines in single printks without pr_cont() that are existing, probably to make them more visible during debugging. I think we could clean this up even further, but the printk(KERN_LVL) -> pr_lvl() conversion of this patch is a step in the right direction.
Re: [RFC 11/32] xfs: convert to struct inode_time
On Mon, Jun 02, 2014 at 01:43:44PM +0200, Arnd Bergmann wrote: > On Monday 02 June 2014 10:28:22 Dave Chinner wrote: > > On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote: > > > On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote: > > > > In my list at http://kernelnewbies.org/y2038, I found that almost > > > > all file systems at least times until 2106, because they treat > > > > the on-disk value as unsigned on 64-bit systems, or they use > > > > a completely different representation. My guess is that somebody > > > > earlier spent a lot of work on making that happen. > > > > > > > > The exceptions are: > > > > > > > > * exofs uses signed values, which can probably be changed to be > > > > consistent with the others. > > > > * isofs has a bug that limits it until 2027 on architectures with > > > > a signed 'char' type (otherwise it's 2155). > > > > * udf can represent times for many thousands of years through a > > > > 16-bit year representation, but the code to convert to epoch > > > > uses a const array that ends at 2038. > > > > * afs uses signed seconds and can probably be fixed > > > > * coda relies on user space time representation getting passed > > > > through an ioctl. > > > > * I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds, > > > > where they really use signed. > > > > > > > > I was confused about XFS since I didn't noticed that there are > > > > separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected > > > > XFS to also use the 1970-2106 time range on 64-bit systems today. > > > > > > You've missed an awful lot more than just the implications for the > > > core kernel code. > > > > > > There's a good chance such changes propagate to APIs elsewhere in > > > the filesystems, because something you haven't realised is that XFS > > > effectively exposes the on-disk timestamp format directly to > > > userspace via the bulkstat interface (see struct xfs_bstat). 
It also > > > affects the XFS open-by-handle ioctl and the swap extent ioctl used > > > by the online defragmenter. > > I really didn't look at them at all, as ioctl is very late on my > mental list of things to change. I do realize that a lot of drivers > and file systems do have ioctls that pass time values and we need to > address them one by one. > > I just looked at the ioctls you mentioned but don't see how open-by-handle > is affected by this. Can you point me to what you mean? Sorry, I misremembered how some of the XFS open-by-handle code works in userspace (XFS has a pretty rich open-by-handle ioctl() interface that predates the kernel syscalls by at least 10 years). Basically there is code in userspace that uses the information returned from bulkstat to construct file handles to pass to the open-by-handle ioctls. xfs_fsr then uses the combination of open-by-handle from the bulkstat output and the bulkstat output to feed into the swap extent ioctls i.e. the filesystem's idea of what time is is passed to userspace as an opaque cookie in this case, but it is not used directly by the open-by-handle interfaces like I implied it was. > > Just to put that in context, here's the kernel patch to add extended > > epoch support to XFS. It's completely untested as I haven't done any > > userspace code changes to enable the feature. However, it should > > give you an indication of how far the simple act of changing the > > kernel time representation spread through the filesystem. This does > > not include any of the VFS infrastructure to specifying the range of > > supported timestamps. It survives some smoke testing, but dies when > > the online defragmenter starts using the bulkstat and swap extent > > ioctls (the assert in xfs_inode_time_from_epoch() fires), so I > > probably don't have that all sorted correctly yet... 
> > > > To test extended epoch support, however, I need to some fstests that > > define and validate the behaviour of the new syscalls - until we get > > those we can't validate that the filesystem follows the spec > > properly. I also suspect we are going to need an interface to query > > the supported range of timestamps from a filesystem so that we can > > test boundary conditions in an automated fashion > > Thanks a lot for having an initial look at this yourself! > > I'd still consider the two problems largely orthogonal. Depends how you look at it. You can't extend the kernel's idea of time without permanent storage being able to specify the supported bounds - that's a non-negotiable aspect of introducing extended epoch timestamp support. The actual addition of extended timestamp support to each individual filesystem is orthogonal to the introduction of the struct inode_time, but doing this addition properly is dependent on the VFS infrastructure being there in the first place. > My patch set > (at least with the 64-bit tv_sec) just gets 32-bit kernels to behave > more like 64-bit kernels regarding inode time stamps, which does >
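The year limits quoted earlier in this message (2038 for signed 32-bit seconds, 2106 for unsigned, and 2027 versus 2155 for isofs) can be checked with a little arithmetic. The sketch below assumes seconds are counted from 1970-01-01 UTC and that isofs stores the year as a single byte offset from 1900; year_of_epoch() and isofs_last_year() are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

static int is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Calendar year (UTC) that contains the instant `secs` seconds after
 * 1970-01-01 00:00:00. Only defined for secs >= 0. */
static int year_of_epoch(int64_t secs)
{
    int64_t days = secs / 86400;
    int year = 1970;

    assert(secs >= 0);
    for (;;) {
        int len = is_leap(year) ? 366 : 365;

        if (days < len)
            return year;
        days -= len;
        year++;
    }
}

/* Last year representable when the year is one byte counted from 1900:
 * 127 is the largest signed byte value, 255 the largest unsigned one. */
static int isofs_last_year(int char_is_signed)
{
    return 1900 + (char_is_signed ? INT8_MAX : UINT8_MAX);
}
```

Here year_of_epoch(INT32_MAX) gives the last year reachable with signed 32-bit seconds and year_of_epoch(UINT32_MAX) the last year with unsigned 32-bit seconds, which is why treating the on-disk value as unsigned buys roughly 68 extra years.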
Re: [PATCH RFC 1/2] MAINTAINERS: Add "R:" designated-reviewers tag
On Mon, 2 Jun 2014 16:24:24 -0700 Andrew Morton wrote: > On Tue, 3 Jun 2014 09:19:49 +1000 Dave Chinner wrote: > > > Anyone using Reviewed-by without having actually applied and tested > > the patch is mis-using the tag > > I think you just described 94.7% of Reviewed-by:s. 94.8% /me raises his hand in shame! Yeah, to me Reviewed-by means that I agonized over the patch to understand it as much as if I wrote it myself. But I honestly don't always test it, or even compile it for that matter. -- Steve
Re: [PATCH v5 6/6] seccomp: add SECCOMP_EXT_ACT_TSYNC and SECCOMP_FILTER_TSYNC
On Mon, Jun 2, 2014 at 5:14 PM, Kees Cook wrote: > On Mon, Jun 2, 2014 at 1:53 PM, Andy Lutomirski wrote: >> On Tue, May 27, 2014 at 12:55 PM, Kees Cook wrote: >>> On Tue, May 27, 2014 at 12:27 PM, Andy Lutomirski >>> wrote: On Tue, May 27, 2014 at 12:23 PM, Kees Cook wrote: > On Tue, May 27, 2014 at 12:10 PM, Andy Lutomirski > wrote: >> On Tue, May 27, 2014 at 11:45 AM, Kees Cook >> wrote: >>> On Tue, May 27, 2014 at 11:40 AM, Andy Lutomirski >>> wrote: On Tue, May 27, 2014 at 11:24 AM, Kees Cook wrote: > On Mon, May 26, 2014 at 12:27 PM, Andy Lutomirski > wrote: >> On Fri, May 23, 2014 at 10:05 AM, Kees Cook >> wrote: >>> On Thu, May 22, 2014 at 4:11 PM, Andy Lutomirski >>> wrote: On Thu, May 22, 2014 at 4:05 PM, Kees Cook wrote: > Applying restrictive seccomp filter programs to large or diverse > codebases often requires handling threads which may be started > early in > the process lifetime (e.g., by code that is linked in). While it > is > possible to apply permissive programs prior to process start up, > it is > difficult to further restrict the kernel ABI to those threads > after that > point. > > This change adds a new seccomp extension action for synchronizing > thread > group seccomp filters and a prctl() for accessing that > functionality, > as well as a flag for SECCOMP_EXT_ACT_FILTER to perform sync at > filter > installation time. > > When calling prctl(PR_SECCOMP_EXT, SECCOMP_EXT_ACT, > SECCOMP_EXT_ACT_FILTER, > flags, filter) with flags containing SECCOMP_FILTER_TSYNC, or > when calling > prctl(PR_SECCOMP_EXT, SECCOMP_EXT_ACT, SECCOMP_EXT_ACT_TSYNC, 0, > 0), it > will attempt to synchronize all threads in current's threadgroup > to its > seccomp filter program. This is possible iff all threads are > using a filter > that is an ancestor to the filter current is attempting to > synchronize to. 
> NULL filters (where the task is running as SECCOMP_MODE_NONE) are > also > treated as ancestors allowing threads to be transitioned into > SECCOMP_MODE_FILTER. If prctrl(PR_SET_NO_NEW_PRIVS, ...) has been > set on the > calling thread, no_new_privs will be set for all synchronized > threads too. > On success, 0 is returned. On failure, the pid of one of the > failing threads > will be returned, with as many filters installed as possible. Is there a use case for adding a filter and synchronizing filters being separate operations? If not, I think this would be easier to understand and to use if there was just a single operation. >>> >>> Yes: if the other thread's lifetime is not well controlled, it's >>> good >>> to be able to have a distinct interface to retry the thread sync >>> that >>> doesn't require adding "no-op" filters. >> >> Wouldn't this still be solved by: >> >> seccomp_add_filter(final_filter, SECCOMP_FILTER_ALL_THREADS); >> >> the idea would be that, if seccomp_add_filter fails, then you give up >> and, if it succeeds, then you're done. It shouldn't fail unless out >> of memory or you've nested too deeply. > > I wanted to keep the case of being able to to wait for non-ancestor > threads to finish. For example, 2 threads start and set separate > filters. 1 does work and exits, 2 starts another thread (3) which adds > filters, does work, and then waits for 1 to finish by calling TSYNC. > Once 1 dies, TSYNC succeeds. In the case of not having direct control > over thread lifetime (say, when using third-party libraries), I'd like > to retain the flexibility of being able to do TSYNC without needing a > filter being attached to it. I must admit this strikes me as odd. What's the point of having a thread set a filter if it intends to be a short-lived thread? >>> >>> I was illustrating the potential insanity of third-party libraries. 
>>> There isn't much sense in that behavior, but if it exists, working >>> around it is harder without the separate TSYNC-only call. >>> In any case, I must have missed the ability for TSYNC to block. Hmm. That seems complicated, albeit potentially useful. >>> >>> Oh, no, I didn't mean to imply TSYNC should block. I
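The "ancestor" condition from the quoted patch description can be sketched in a few lines. This is an illustration only, assuming each filter points at the filter it was stacked on top of; struct filter, is_ancestor() and first_unsyncable() are names invented for the sketch, and it returns an array index where the description above talks about returning a pid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter node: `prev` is the filter this one was stacked on,
 * or NULL for the first filter in a chain. */
struct filter {
    const struct filter *prev;
};

/* True if `parent` appears in the chain starting at `child`.
 * A NULL parent (thread has no filter yet) is an ancestor of everything,
 * and a filter counts as an ancestor of itself. */
static bool is_ancestor(const struct filter *parent,
                        const struct filter *child)
{
    if (parent == NULL)
        return true;
    for (; child != NULL; child = child->prev)
        if (child == parent)
            return true;
    return false;
}

/* Returns -1 if every thread's current filter is an ancestor of `target`
 * (so all threads could be moved to it), otherwise the index of the
 * first thread that blocks the sync. */
static int first_unsyncable(const struct filter *const *thread_filters,
                            size_t n, const struct filter *target)
{
    for (size_t i = 0; i < n; i++)
        if (!is_ancestor(thread_filters[i], target))
            return (int)i;
    return -1;
}
```

In the three-thread example discussed above, thread 1's separately installed filter is not in the caller's chain, so a check like this keeps failing until that thread has exited.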
Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks
On Mon, Jun 02, 2014 at 04:53:44PM -0700, Eric Dumazet wrote: > On Mon, 2014-06-02 at 16:17 -0700, Paul E. McKenney wrote: > > > But given that I already have preemption disabled and given that > > __srcu_read_lock() is not to be used by irq handlers, I should be able to > > use __this_cpu_inc(), correct? Just to avoid unnecessary irq disabling > > on non-x86 platforms... > > Absolutely, __this_cpu_inc() is OK here. Cool, giving it a test... Thanx, Paul
Re: [PATCH V2 0/5] Huge pages for short descriptors on ARM
On Thu, Apr 24, 2014 at 2:03 PM, Russell King - ARM Linux wrote: > On Thu, Apr 24, 2014 at 11:55:56AM +0100, Steve Capper wrote: >> On 24 April 2014 11:42, Russell King - ARM Linux >> wrote: >> > On Thu, Apr 24, 2014 at 11:36:39AM +0100, Will Deacon wrote: >> >> I guess I'm after some commitment that this is (a) useful to somebody and >> >> (b) going to be tested regularly, otherwise it will go the way of things >> >> like big-endian, where we end up carrying around code which is broken more >> >> often than not (although big-endian is more self-contained). >> > >> > It may be something worth considering adding to my nightly builder/boot >> > testing, but I suspect that's impractical as it probably requires a BE >> > userspace, which would then mean that the platform can't boot LE. >> > >> > I suspect that we will just have to rely on BE users staying around and >> > reporting problems when they occur. >> >> The huge page support is for standard LE, I think Will was saying that >> this will be like BE if no-one uses it. > > We're not saying that. > > What we're asking is this: *Who* is using hugepages today? We are using it on the OpenPandora handheld, where it's really useful for doing graphics in software. Here are some benchmarks I did some time ago: http://lists.infradead.org/pipermail/linux-arm-kernel/2013-February/148835.html For example, Cortex-A8 only has 32 dTLB entries, so they run out pretty fast while drawing vertical lines on linear images. And that's not such a rare thing to do, for example when drawing vertical scrollbars. Other people find use for it too, for example to get more consistent results between benchmark runs: http://ssvb.github.io/2013/06/27/fullhd-x11-desktop-performance-of-the-allwinner-a10.html Yes, in my case this is a niche device and I can keep patching in the hugepage support, but mainline support would make life easier and would be very much appreciated.
-- Gražvydas
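The vertical-line case can be made concrete with a small calculation. The sketch below counts how many distinct pages a one-pixel-wide vertical line touches, assuming a linear image whose base address is page-aligned. The image geometry used in the note after the code (a stride of 4096 bytes, i.e. 1024 pixels at 32 bpp, and 480 rows) is a made-up example, and pages_touched() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Number of distinct pages hit when touching one pixel in each of
 * `height` consecutive rows of a linear image. Row r starts at byte
 * offset r * stride; offsets grow monotonically, so counting changes
 * in the page index is enough. */
static size_t pages_touched(size_t stride, size_t height, size_t page_size)
{
    size_t count = 0, last_page = 0;

    assert(page_size > 0);
    for (size_t row = 0; row < height; row++) {
        size_t page = (row * stride) / page_size;

        if (row == 0 || page != last_page) {
            count++;
            last_page = page;
        }
    }
    return count;
}
```

With 4 KiB pages, the example image puts every row on its own page, so a 480-row line needs 480 translations against the 32 dTLB entries mentioned above. With a 1 MiB mapping (the section size in ARM's short-descriptor format) the same line stays within 2 mappings.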
Re: [GIT PULL] extcon next-v2 for v3.16
On 06/03/2014 09:12 AM, Greg KH wrote: > On Tue, Jun 03, 2014 at 08:53:42AM +0900, Chanwoo Choi wrote: >> Dear Greg, >> >> On 06/03/2014 07:55 AM, Greg KH wrote: >>> On Mon, Jun 02, 2014 at 02:09:16PM +0900, Chanwoo Choi wrote: Dear Greg, This is extcon-next-v2 full request for v3.16. This pull request includes additional patchset after merged previous extcon-next pull request(tags/extcon-next-for-v3.16). I add detailed description of this pull request on below. Please pull extcon with following update. Best Regards, Chanwoo Choi The following changes since commit 3f79a3fb5f41e8f2229e5bf8aa725eaa79686f14: extcon: palmas: Use devm_extcon_dev_allocate for extcon_dev (2014-04-29 09:52:12 +0900) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon.git tags/extcon-next-v2-for-3.16 >>> >>> I was thinking it was too late for 3.16-rc1, as the merge window is now >>> open and I like to have things in my tree a week or so before that >>> happens, but I figured this was nice and self-contained so I would take >>> it. >> >> Sorry late pull-request. I'll send next pull-request before -rc7 release >> certainly. >> >>> >>> So I pulled, and built, and then got this: >>> >>> ERROR: "regmap_irq_get_virq" [drivers/extcon/extcon-sm5502.ko] undefined! >>> ERROR: "regmap_add_irq_chip" [drivers/extcon/extcon-sm5502.ko] undefined! >>> ERROR: "regmap_del_irq_chip" [drivers/extcon/extcon-sm5502.ko] undefined! >>> >>> which makes me know that you haven't done much build testing of the code >>> :( >> >> Sorry about this mistake. As you comment, I'll check much build testing. > > Any reason your trees are not in linux-next? I'd recommend them to be > there, so you can get that build testing ahead-of-time. As you comment, I'm going to find the way to include extcon patchset for build test in linux-next. 
Best Regards, Chanwoo Choi