Re: [PATCH] gpio: pxa: normalize the return value for gpio_get
On Fri, Jan 10, 2014 at 7:03 AM, Neil Zhang wrote:
> It would be convenient to normalize the return value for gpio_get.
>
> I have checked mach-mmp / mach-pxa / plat-pxa / plat-orion / mach-orion5x.
> It's OK for all of them to change this function to return 0 and 1.
>
> Signed-off-by: Neil Zhang

Bah, I updated the commit message a bit ... you dropped the "gpio: pxa:"
etc. from the previous patch, not good, but don't worry, I fixed it up.

Yours,
Linus Walleij
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
linux-next: Tree for Jan 15
Hi all,

This tree fails (more than usual) the powerpc allyesconfig build.

Changes since 20140114:

Dropped tree: sh (complex merge conflicts against very old commits)
Removed tree: pstore (at maintainer's request)

The powerpc tree still had its build failure.

The net-next tree gained a conflict against the mips tree.

The infiniband tree gained a conflict against the net-next tree.

The device-mapper tree gained a build failure, so I used the version from
next-20140114.

The md tree gained conflicts against the block tree.

The iommu tree gained a conflict against the drm tree.

The audit tree gained a conflict against Linus' tree.

The tip tree lost its build failure and gained conflicts against the mips
tree.

The kvm tree still had its build failure so I used the version from
next-20140109.

Non-merge commits (relative to Linus' tree): 9105
 8462 files changed, 439243 insertions(+), 228766 deletions(-)

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are
tracking the linux-next tree using git, you should not use "git pull" to
do so as that will try to merge the new linux-next release with the old
one. You should use "git fetch" as mentioned in the FAQ on the wiki (see
below).

You can see which trees have been included by looking in the Next/Trees
file in the source. There are also quilt-import.log and merge.log files
in the Next directory.

Between each merge, the tree was built with a ppc64_defconfig for powerpc
and an allmodconfig for x86_64 and a multi_v7_defconfig for arm. After
the final fixups (if any), it is also built with powerpc allnoconfig (32
and 64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 208 trees (counting Linus' and 29 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next . If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds. And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ . Thanks to Frank Seidel.

--
Cheers,
Stephen Rothwell
s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (a6da83f98267 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc)
Merging fixes/master (b0031f227e47 Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator)
Merging kbuild-current/rc-fixes (19514fc665ff arm, kbuild: make "make install" not depend on vmlinux)
Merging arc-current/for-curr (1e01c7eb7c43 ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h)
Merging arm-current/fixes (b25f3e1c3584 ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling)
Merging m68k-current/for-linus (77a42796786c m68k: Remove deprecated IRQF_DISABLED)
Merging metag-fixes/fixes (3b2f64d00c46 Linux 3.11-rc2)
Merging powerpc-merge/merge (10348f597683 powerpc: Check return value of instance-to-package OF call)
Merging sparc/master (ef350bb7c5e0 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4)
Merging net/master (fdc3452cd2c7 net: usbnet: fix SG initialisation)
Merging ipsec/master (965cdea82569 dccp: catch failed request_module call in dccp_probe init)
Merging sound-current/for-linus (150116bcfbd9 ALSA: hiface: Fix typo in 352800 rate definition)
Merging pci-current/for-linus (f0b75693cbb2 MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers)
Merging wireless/master (2eff7c791a18 Merge tag 'nfc-fixes-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes)
Merging driver-core.current/driver-core-linus (413541dd66d5 Linux 3.13-rc5)
Merging tty.current/tty-linus (413541dd66d5 Linux 3.13-rc5)
Merging usb.current/usb-linus (413541dd66d5 Linux 3.13-rc5)
Merging staging.current/staging-linus (413541dd66d5 Linux 3.13-rc5)
Merging char-misc.current/char-misc-linus (802eee95bde7 Linux 3.13-rc6)
Merging input-current/for-linus (8e2f2325b73f Input: xpad - add new USB IDs for Logitech F310 and F710)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (efb753b8e013 crypto: ixp4xx - Fix kernel compile error)
Merging id
Re: [PATCH] gpio: pxa: fix bug when get gpio value
On Thu, Jan 9, 2014 at 11:46 AM, Gerhard Sittig wrote:
> Here is why I'm asking: Is there a need from GPIO get_value()
> routines to return normalized values,

That totally depends. All drivers calling gpio[d]_get_value() will be
returned the value directly from the driver without any clamping to [0,1]
in gpiolib. There are 496 occurrences in the kernel; you'd have to check
them all to see if they expect this or not.

Hm. Maybe we should clamp it in gpiolib...

> and if so should not more
> drivers receive an update?

Probably. But on my part I want that more as a code readability and
maintenance hygiene thing; it gives a clear sign that the driver author
thought about the details. (Possibly it also gives the compiler a chance
to optimize things, I don't quite know.)

> If the GPIO subsystem's API wants to guarantee values of 0 and 1
> (which I think it doesn't), then I feel the adjustment should be
> done in the gpio_get_value() routines (in all its public
> variants, or a common routine which all of them pass through),
> and certainly not in individual chip drivers.

One does not exclude the other.

Yours,
Linus Walleij
Re: [RFC 1/3] mutex: In mutex_can_spin_on_owner(), return false if task need_resched()
On Wed, Jan 15, 2014 at 08:44:20AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 14, 2014 at 04:33:08PM -0800, Jason Low wrote:
> > The mutex_can_spin_on_owner() function should also return false if the
> > task needs to be rescheduled.
>
> While I was staring at mutex_can_spin_on_owner(); don't we need this?
>
>  kernel/locking/mutex.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 4dd6e4c219de..480d2f437964 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -214,8 +214,10 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
>
>  	rcu_read_lock();
>  	owner = ACCESS_ONCE(lock->owner);
> -	if (owner)
> +	if (owner) {
> +		smp_read_barrier_depends();
>  		retval = owner->on_cpu;
> +	}
>  	rcu_read_unlock();
>  	/*
>  	 * if lock->owner is not set, the mutex owner may have just acquired

That is, it's an unmatched barrier, as mutex_set_owner() doesn't include
a barrier, and I don't think it needs to; but on Alpha we still need this
read barrier to ensure we do not mess up this related load, afaik.

Paul, can you explain an unpaired smp_read_barrier_depends()?
Re: [PATCH v3.1 11/21] ARM: pxa: support ICP DAS LP-8x4x FPGA irq
This is looking much better!

On Fri, Jan 10, 2014 at 12:07 AM, Sergei Ianovich wrote:
> +++ b/drivers/irqchip/irq-lp8x4x.c
(...)

You could add some kerneldoc to the following struct (OK, nitpick, but
still nice, especially for the last two variables).

> +struct lp8x4x_irq_data {
> +	void			*base;
> +	struct irq_domain	*domain;
> +	unsigned long		num_irq;
> +	unsigned char		irq_sys_enabled;
> +	unsigned char		irq_high_enabled;
> +};
> +
> +static void lp8x4x_mask_irq(struct irq_data *d)
> +{
> +	unsigned mask;
> +	unsigned long irq = d->hwirq;

Name the local variable hwirq too so we know what it is.

> +	struct lp8x4x_irq_data *host = irq_data_get_irq_chip_data(d);
> +
> +	if (!host) {
> +		pr_err("lp8x4x: missing host data for irq %i\n", d->irq);
> +		return;
> +	}
> +
> +	if (irq >= host->num_irq) {
> +		pr_err("lp8x4x: wrong irq handler for irq %i\n", d->irq);
> +		return;
> +	}

This is on the hot path. Do you *really* need these two checks?

(...)

> +static void lp8x4x_unmask_irq(struct irq_data *d)
> +{
> +	unsigned mask;
> +	unsigned long irq = d->hwirq;

Name the variable "hwirq".

> +	struct lp8x4x_irq_data *host = irq_data_get_irq_chip_data(d);
> +
> +	if (!host) {
> +		pr_err("lp8x4x: missing host data for irq %i\n", d->irq);
> +		return;
> +	}
> +
> +	if (irq >= host->num_irq) {
> +		pr_err("lp8x4x: wrong irq handler for irq %i\n", d->irq);
> +		return;
> +	}

Again overzealous error checks.

(...)

> +static void lp8x4x_irq_handler(unsigned int irq, struct irq_desc *desc)
> +{
> +	int n;
> +	unsigned long mask;
> +	struct irq_chip *chip = irq_desc_get_chip(desc);
> +	struct lp8x4x_irq_data *host = irq_desc_get_handler_data(desc);
> +
> +	if (!host)
> +		return;

I don't think this happens either?

> +	chained_irq_enter(chip, desc);
> +
> +	for (;;) {
> +		mask = ioread8(host->base + CLRHILVINT) & 0xff;
> +		mask |= (ioread8(host->base + SECOINT) & SECOINT_MASK) << 8;
> +		mask |= (ioread8(host->base + PRIMINT) & PRIMINT_MASK) << 8;
> +		mask &= host->irq_high_enabled | (host->irq_sys_enabled << 8);
> +		if (mask == 0)
> +			break;
> +		for_each_set_bit(n, &mask, BITS_PER_LONG)
> +			generic_handle_irq(irq_find_mapping(host->domain, n));
> +	}

I like the looks of this. If you fix this:
Reviewed-by: Linus Walleij

Yours,
Linus Walleij
Re: [RFC 1/3] mutex: In mutex_can_spin_on_owner(), return false if task need_resched()
On Tue, Jan 14, 2014 at 04:33:08PM -0800, Jason Low wrote:
> The mutex_can_spin_on_owner() function should also return false if the
> task needs to be rescheduled.

While I was staring at mutex_can_spin_on_owner(); don't we need this?

 kernel/locking/mutex.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 4dd6e4c219de..480d2f437964 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -214,8 +214,10 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 
 	rcu_read_lock();
 	owner = ACCESS_ONCE(lock->owner);
-	if (owner)
+	if (owner) {
+		smp_read_barrier_depends();
 		retval = owner->on_cpu;
+	}
 	rcu_read_unlock();
 	/*
 	 * if lock->owner is not set, the mutex owner may have just acquired
Dear Customer
This message is from Naukri Job Portal and to all registered Naukri account owners. We are currently facing phishers on our Data Base due to Spam. We want to exercise an improve secure service quality in our Admin System to reduce the spam in every job/users portal. Please Confirm your Naukri Login account. click the blow link, fill your detail of Naukri login. http://naukriresdexportalsecurityadministration.yolasite.com Confirmation of your Naukri account will help to stop spaming. Warning!!! Naukri Secure Team
Re: [PATCH v3 11/21] ARM: pxa: support ICP DAS LP-8x4x FPGA irq
On Wed, Jan 8, 2014 at 8:01 PM, Sergei Ianovich wrote:
> On Thu, 2014-01-02 at 13:32 +0100, Linus Walleij wrote:
>> On Tue, Dec 17, 2013 at 8:37 PM, Sergei Ianovich wrote:
>> Usually combined GPIO+IRQ controllers are put into drivers/gpio but
>> this is a bit special as it seems to handle also non-GPIO-related IRQs
>> so let's get some input on this.
>
> This one is a plain IRQ controller. It has simple input lines, not GPIO
> pins. The chip reports its status to the upper-level interrupt
> controller. The upper-level controller is the PXA GPIO in this case.

Hm, I don't know why I was deluded into thinking this had something to do
with GPIO. I must have been soft in the head. Sorry about all those
comments ... I'll re-read the irqchip driver v3.1.

Yours,
Linus Walleij
Re: [RFC PATCH] sched: find the latest idle cpu
On Wed, Jan 15, 2014 at 12:07:59PM +0800, Alex Shi wrote:
> Currently we just try to find least load cpu. If some cpus idled,
> we just pick the first cpu in cpu mask.
>
> In fact we can get the interrupted idle cpu or the latest idled cpu,
> then we may get the benefit from both latency and power.
> The selected cpu maybe not the best, since other cpu may be interrupted
> during our selecting. But be captious costs too much.

No, we should not do anything like this without first integrating
cpuidle. At which point we have a sane view of the idle states and can
make a sane choice between them.
Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries
On Tue, 2014-01-14 at 17:06 -0800, Davidlohr Bueso wrote:
> On Tue, 2014-01-14 at 16:33 -0800, Jason Low wrote:
> > When running workloads that have high contention in mutexes on an 8 socket
> > machine, spinners would often spin for a long time with no lock owner.
> >
> > One of the potential reasons for this is because a thread can be preempted
> > after clearing lock->owner but before releasing the lock
>
> What happens if you invert the order here? So mutex_clear_owner() is
> called after the actual unlocking (__mutex_fastpath_unlock).

Reversing the mutex_fastpath_unlock and mutex_clear_owner resulted in a
20+% performance improvement to Ingo's test-mutex application at 160
threads on an 8 socket box.

I have tried this method before, but what I was initially concerned
about with clearing the owner after unlocking was that the following
scenario may occur:

  thread 1 releases the lock
  thread 2 acquires the lock (in the fastpath)
  thread 2 sets the owner
  thread 1 clears owner

In this situation, the lock owner is NULL but thread 2 has the lock.

> > or preempted after
> > acquiring the mutex but before setting lock->owner.
>
> That would be the case _only_ for the fastpath. For the slowpath
> (including optimistic spinning) preemption is already disabled at that
> point.

Right, that is just for the fastpath lock.

> > In those cases, the
> > spinner cannot check if owner is not on_cpu because lock->owner is NULL.
> >
> > A solution that would address the preemption part of this problem would
> > be to disable preemption between acquiring/releasing the mutex and
> > setting/clearing the lock->owner. However, that will require adding overhead
> > to the mutex fastpath.
>
> It's not uncommon to disable preemption in hotpaths, the overhead
> should be quite small, actually.
>
> > The solution used in this patch is to limit the # of times a thread can
> > spin on lock->count when !owner.
> >
> > The threshold used in this patch for each spinner was 128, which appeared to
> > be a generous value, but any suggestions on another method to determine
> > the threshold are welcomed.
>
> Hmm, generous compared to what? Could you elaborate further on how you
> reached this value? These kinds of magic numbers have produced
> significant debate in the past.

I've observed that when running workloads which don't exhibit this
behavior (long spins with no owner), threads rarely take more than 100
extra spins. So I went with 128 based on those numbers.
Re: [PATCH 3/3] pinctrl: single: fix infinite loop caused by bad mask
On Thu, Jan 9, 2014 at 1:50 PM, Tomi Valkeinen wrote:
> commit 4e7e8017a80e1 (pinctrl: pinctrl-single:
> enhance to configure multiple pins of different modules) improved
> support for pinctrl-single,bits option, but also caused a regression
> in parsing badly configured mask data.
>
> If the masks in DT data are not quite right,
> pcs_parse_bits_in_pinctrl_entry() can end up in an infinite loop,
> trashing memory at the same time.
>
> Add a check to verify that each loop actually removes bits from the
> 'mask', so that the loop can eventually end.
>
> Signed-off-by: Tomi Valkeinen
> Acked-by: Tony Lindgren

Patch applied.

Yours,
Linus Walleij
Re: [PATCH 2/3] pinctrl: single: fix pcs_disable with bits_per_mux
On Thu, Jan 9, 2014 at 1:50 PM, Tomi Valkeinen wrote:
> pcs_enable() uses vals->mask instead of pcs->fmask when bits_per_mux is
> enabled. However, pcs_disable() always uses pcs->fmask.
>
> Fix pcs_disable() to use vals->mask with bits_per_mux.
>
> Signed-off-by: Tomi Valkeinen
> Acked-by: Peter Ujfalusi
> Acked-by: Tony Lindgren

Patch applied. However, make sure to CC Haojian on such patches, as he's
using it on a non-OMAP system and often has good feedback.

Yours,
Linus Walleij
Re: [PATCH 1/3] pinctrl: single: fix DT bindings documentation
On Thu, Jan 9, 2014 at 1:50 PM, Tomi Valkeinen wrote:
> Remove extra comma in pinctrl-single documentation.
>
> Signed-off-by: Tomi Valkeinen
> Acked-by: Tony Lindgren

Patch applied.

Yours,
Linus Walleij
[PATCH] ACPI / init: Run acpi_early_init() before timekeeping_init()
This is a variant of Rafael J. Wysocki's patch

  ACPI / init: Run acpi_early_init() before efi_enter_virtual_mode()

According to Matt Fleming, if acpi_early_init() is executed before
efi_enter_virtual_mode(), the EFI initialization can benefit from it, so
Rafael's patch makes that happen. In addition, we want to access the
ACPI TAD device to set the system clock, so move acpi_early_init()
before timekeeping_init(). This final position is also before
efi_enter_virtual_mode().

v2: Move acpi_early_init() before timekeeping_init() to prepare for
    setting the system clock with ACPI TAD.
v1: Rafael J. Wysocki's "ACPI / init: Run acpi_early_init() before
    efi_enter_virtual_mode()"

Cc: Rafael J. Wysocki
Cc: Matt Fleming
Cc: H. Peter Anvin
Cc: Borislav Petkov
Cc: Matthew Garrett
Tested-by: Toshi Kani
Signed-off-by: Lee, Chun-Yi
---
 init/main.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/init/main.c b/init/main.c
index febc511..b6d93c8 100644
--- a/init/main.c
+++ b/init/main.c
@@ -565,6 +565,7 @@ asmlinkage void __init start_kernel(void)
 	init_timers();
 	hrtimers_init();
 	softirq_init();
+	acpi_early_init();
 	timekeeping_init();
 	time_init();
 	sched_clock_postinit();
@@ -641,7 +642,6 @@ asmlinkage void __init start_kernel(void)
 
 	check_bugs();
 
-	acpi_early_init();	/* before LAPIC and SMP init */
 	sfi_init_late();
 
 	if (efi_enabled(EFI_RUNTIME_SERVICES)) {
-- 
1.6.4.2
Re: [PATCH net-next] tun/macvtap: limit the packets queued through rcvbuf
On Wed, Jan 15, 2014 at 11:36:01AM +0800, Jason Wang wrote:
> On 01/14/2014 05:52 PM, Michael S. Tsirkin wrote:
> > On Tue, Jan 14, 2014 at 04:45:24PM +0800, Jason Wang wrote:
> > > On 01/14/2014 04:25 PM, Michael S. Tsirkin wrote:
> > > > On Tue, Jan 14, 2014 at 02:53:07PM +0800, Jason Wang wrote:
> > > > > We used to limit the number of packets queued through
> > > > > tx_queue_length. This has several issues:
> > > > >
> > > > > - tx_queue_length is the control of qdisc queue length; simply
> > > > >   reusing it to control the packets queued by the device may
> > > > >   cause confusion.
> > > > > - After commit 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 ("macvtap:
> > > > >   Add support of packet capture on macvtap device."), an unexpected
> > > > >   qdisc caused by non-zero tx_queue_length will lead to qdisc lock
> > > > >   contention for multiqueue devices.
> > > > > - What we really want is to limit the total amount of memory
> > > > >   occupied, not the number of packets.
> > > > >
> > > > > So this patch tries to solve the above issues by using the socket
> > > > > rcvbuf to limit the packets that can be queued for tun/macvtap.
> > > > > This was done by using sock_queue_rcv_skb() instead of a direct
> > > > > call to skb_queue_tail(). Also, two new ioctl()s were introduced
> > > > > for userspace to change the rcvbuf like what we have done for
> > > > > sndbuf.
> > > > >
> > > > > With this fix, we can safely change the tx_queue_len of macvtap to
> > > > > zero. This will make multiqueue work without extra lock contention.
> > > > >
> > > > > Cc: Vlad Yasevich
> > > > > Cc: Michael S. Tsirkin
> > > > > Cc: John Fastabend
> > > > > Cc: Stephen Hemminger
> > > > > Cc: Herbert Xu
> > > > > Signed-off-by: Jason Wang
> > > >
> > > > No, I don't think we can change userspace-visible behaviour like
> > > > that.
> > > >
> > > > This will break any existing user that tries to control
> > > > queue length through sysfs, netlink or device ioctl.
> > >
> > > But it looks like a buggy API, since tx_queue_len should be for the
> > > qdisc queue length instead of the device itself.
> >
> > Probably, but it's been like this since 2.6.x time.
> > Also, the qdisc queue is unused for tun so it seemed kind of
> > reasonable to override tx_queue_len.
> >
> > > If we really want to preserve the
> > > behaviour, how about using a new feature flag and changing the
> > > behaviour only when the device is created (TUNSETIFF) with the new
> > > flag?
> >
> > OK, this addresses the issue partially, but there's also an issue
> > of permissions: tx_queue_len can only be changed if
> > capable(CAP_NET_ADMIN). OTOH in your patch a regular user
> > can change the amount of memory consumed per queue
> > by calling TUNSETRCVBUF.
>
> Yes, but we have the same issue for TUNSETSNDBUF.

To an extent, but TUNSETSNDBUF is different. It limits how much the
device can queue *in the networking stack*, but each queue in the stack
is also limited; when we exceed that we start dropping packets. So while
with an infinite value (which is the default, btw) you can keep the host
pretty busy, you will not be able to run it out of memory. The proposed
TUNSETRCVBUF would keep the configured amount of memory around
indefinitely, so you can run the host out of memory.

So, assuming all this, how about an ethtool or netlink command to
configure this instead?

> > > > Take a look at my patch in msg ID 20140109071721.gd19...@redhat.com
> > > > which gives one way to set tx_queue_len to zero without
> > > > breaking userspace.
> > >
> > > If I read the patch correctly, it will leave no way for the user who
> > > really wants to change the qdisc queue length for tun.
> >
> > Why would this matter? As far as I can see the qdisc queue is currently
> > unused.
>
> User may use qdisc to do port mirroring, bandwidth limitation, traffic
> prioritization or more for a VM. So we do have users, and maybe more if
> you consider the case of vpn.

Well, it's not used by default at least.

I remember that we discussed this previously, actually. If all we want
to do is utilize no_qdisc by default, we can simply use Eric's patch:
http://article.gmane.org/gmane.linux.kernel/1279597 and a similar patch
for macvtap. I tried it at the time and it didn't seem to help
performance at all, but a lot has changed since; in particular, I didn't
test mq. If you now have results showing how it's beneficial, pls post
them.

-- 
MST
Re: [RFT][PATCH] ACPI / init: Run acpi_early_init() before efi_enter_virtual_mode()
On Tue, 2014-01-14 at 13:32 -0700, Toshi Kani wrote:
> > > +	acpi_early_init();
> > > 	timekeeping_init();
> > > 	time_init();
> > > 	sched_clock_postinit();
> > > @@ -641,7 +642,6 @@ asmlinkage void __init start_kernel(void)
> > >
> > > 	check_bugs();
> > >
> > > -	acpi_early_init();	/* before LAPIC and SMP init */
> > > 	sfi_init_late();
> > >
> > > 	if (efi_enabled(EFI_RUNTIME_SERVICES)) {
> >
> > Hi Toshi,
> >
> > Could you try this variant, too? If this works as well then we end up
> > solving two problems in one patch...
>
> Hi Peter,
>
> Yes, this version works fine as well.
>
> Tested-by: Toshi Kani
>
> Thanks,
> -Toshi

Thanks a lot for your testing. I will re-send a formal patch with a
changelog to everybody.

Regards
Joey Lee
Re: [RFC][PATCH 1/9] mm: slab/slub: use page->list consistently instead of page->lru
On Tue, 14 Jan 2014, Dave Hansen wrote:
> > block/blk-mq.c: In function ‘blk_mq_free_rq_map’:
> > block/blk-mq.c:1094:10: error: ‘struct page’ has no member named ‘list’
> > block/blk-mq.c:1094:10: warning: initialization from incompatible pointer
> > type [enabled by default]
> > block/blk-mq.c:1094:10: error: ‘struct page’ has no member named ‘list’
> > block/blk-mq.c:1095:22: error: ‘struct page’ has no member named ‘list’
> > block/blk-mq.c: In function ‘blk_mq_init_rq_map’:
> > block/blk-mq.c:1159:22: error: ‘struct page’ has no member named ‘list’
>
> As I mentioned in the introduction, these are against linux-next.
> There's a patch in there at the moment which fixed this.

Ok, thanks, I like this patch.

Acked-by: David Rientjes
[PATCH 2/2] powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked()
At a glance these are just the inverse of each other. The one subtlety is
that arch_spin_value_unlocked() takes the lock by value, rather than as a
pointer, which is important for the lockref code. On the other hand
arch_spin_is_locked() doesn't really care, so implement it in terms of
arch_spin_value_unlocked().

Signed-off-by: Michael Ellerman
---
 arch/powerpc/include/asm/spinlock.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index 5162f8c..a30ef69 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -28,8 +28,6 @@
 #include
 #include
 
-#define arch_spin_is_locked(x)	((x)->slock != 0)
-
 #ifdef CONFIG_PPC64
 /* use 0x80yy when locked, where yy == CPU number */
 #ifdef __BIG_ENDIAN__
@@ -59,6 +57,11 @@ static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 	return lock.slock == 0;
 }
 
+static inline int arch_spin_is_locked(arch_spinlock_t *lock)
+{
+	return !arch_spin_value_unlocked(*lock);
+}
+
 /*
  * This returns the old value in the lock, so we succeeded
 * in getting the lock if the return value is 0.
-- 
1.8.3.2
[PATCH 1/2] powerpc: Add support for the optimised lockref implementation
This commit adds the architecture support required to enable the optimised implementation of lockrefs. That's as simple as defining arch_spin_value_unlocked() and selecting the Kconfig option. We also define cmpxchg64_relaxed(), because the lockref code does not need the cmpxchg to have barrier semantics. Using Linus' test case[1] on one system I see a 4x improvement for the basic enablement, and a further 1.3x for cmpxchg64_relaxed(), for a total of 5.3x vs the baseline. On another system I see more like 2x improvement. [1]: http://marc.info/?l=linux-fsdevel=137782380714721=4 Signed-off-by: Michael Ellerman --- arch/powerpc/Kconfig| 1 + arch/powerpc/include/asm/cmpxchg.h | 1 + arch/powerpc/include/asm/spinlock.h | 5 + 3 files changed, 7 insertions(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index b44b52c..b34b53d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -139,6 +139,7 @@ config PPC select OLD_SIGACTION if PPC32 select HAVE_DEBUG_STACKOVERFLOW select HAVE_IRQ_EXIT_ON_IRQ_STACK + select ARCH_USE_CMPXCHG_LOCKREF if PPC64 config GENERIC_CSUM def_bool CPU_LITTLE_ENDIAN diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h index e245aab..d463c68 100644 --- a/arch/powerpc/include/asm/cmpxchg.h +++ b/arch/powerpc/include/asm/cmpxchg.h @@ -300,6 +300,7 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new, BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ cmpxchg_local((ptr), (o), (n)); \ }) +#define cmpxchg64_relaxed cmpxchg64_local #else #include #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n)) diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 5f54a74..5162f8c 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -54,6 +54,11 @@ #define SYNC_IO #endif +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) +{ + return lock.slock == 0; +} + /* * This returns the old 
value in the lock, so we succeeded * in getting the lock if the return value is 0. -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
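For reference, the lockref fast path this patch enables packs the spinlock and the reference count into one 64-bit word and bumps the count with a single cmpxchg as long as the lock field reads as unlocked. Below is a user-space sketch of the idea; GCC `__atomic` builtins stand in for the kernel's cmpxchg64_relaxed(), and the layout and names are illustrative rather than lib/lockref.c verbatim:

```c
#include <stdint.h>

/* Packed lock + count word, mirroring the struct lockref idea. */
struct lockref {
	union {
		uint64_t lock_count;
		struct {
			uint32_t lock;	/* 0 == unlocked, cf. arch_spin_value_unlocked() */
			uint32_t count;	/* reference count */
		};
	};
};

/*
 * Try to bump the count without taking the lock: succeeds only while
 * the lock field is observed unlocked, like the CMPXCHG loop in
 * lib/lockref.c.  Returns 1 on success, 0 if the caller must fall
 * back to taking the spinlock.
 */
static int lockref_get_not_locked(struct lockref *lr)
{
	uint64_t old = __atomic_load_n(&lr->lock_count, __ATOMIC_RELAXED);

	for (;;) {
		struct lockref new = { .lock_count = old };

		if (new.lock != 0)
			return 0;	/* locked: slow path needed */
		new.count++;
		/* relaxed ordering is the analogue of cmpxchg64_relaxed() */
		if (__atomic_compare_exchange_n(&lr->lock_count, &old,
						new.lock_count, 0,
						__ATOMIC_RELAXED,
						__ATOMIC_RELAXED))
			return 1;
		/* CAS failed: 'old' was refreshed with the current value, retry */
	}
}
```

The win measured above comes from uncontended get/put pairs never touching the spinlock cacheline in locked mode at all.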
Re: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries
On Tue, 2014-01-14 at 17:00 -0800, Andrew Morton wrote: > On Tue, 14 Jan 2014 16:33:10 -0800 Jason Low wrote: > > > When running workloads that have high contention in mutexes on an 8 socket > > machine, spinners would often spin for a long time with no lock owner. > > > > One of the potential reasons for this is because a thread can be preempted > > after clearing lock->owner but before releasing the lock, or preempted after > > acquiring the mutex but before setting lock->owner. In those cases, the > > spinner cannot check if owner is not on_cpu because lock->owner is NULL. > > That sounds like a very small window. And your theory is that this > window is being hit sufficiently often to impact aggregate runtime > measurements, which sounds improbable to me? > > > A solution that would address the preemption part of this problem would > > be to disable preemption between acquiring/releasing the mutex and > > setting/clearing the lock->owner. However, that will require adding overhead > > to the mutex fastpath. > > preempt_disable() is cheap, and sometimes free. > > Have you confirmed that the preempt_disable() approach actually fixes > the performance issues? If it does then this would confirm your > "potential reason" hypothesis. If it doesn't then we should be hunting > further for the explanation. Using Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50) which can also generate high mutex contention, the preempt_disable() approach did provide approximately a 4% improvement at 160 threads, but not nearly the 25+% I was seeing with this patchset. So, it looks like preemption is not the main cause of the problem then. Thanks, Jason -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
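The owner-less spinning under discussion can be pictured as a capped retry loop. The sketch below is an illustrative reduction of the RFC's idea, not the kernel's mutex code; the cap value, types and field names are invented for the example:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NO_OWNER_SPINS 100	/* illustrative cap, not the kernel's value */

struct task { bool on_cpu; };

struct mutex_s {
	struct task *owner;	/* NULL inside the set/clear windows above */
	int locked;
};

/*
 * Sketch of the optimistic-spin decision: keep spinning while the
 * owner is known to be running on a cpu; bail out after observing
 * "no owner" too many times, since we cannot tell a free lock from
 * a preempted holder in that state.
 */
static bool should_keep_spinning(struct mutex_s *lock, int *no_owner_tries)
{
	struct task *owner = lock->owner;

	if (owner)
		return owner->on_cpu;	/* owner preempted => stop spinning */

	/*
	 * No owner recorded: either the lock is free (we will win the
	 * trylock soon) or the holder is between acquiring the lock and
	 * setting lock->owner, as described in the mail above.
	 */
	if (++(*no_owner_tries) > MAX_NO_OWNER_SPINS)
		return false;
	return true;
}
```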
[PATCH] vt: detect and ignore OSC codes.
These can be used to send commands consisting of an arbitrary string to the terminal, most often used to set a terminal's window title or to redefine the colour palette. Our console doesn't use OSC, unlike everything else, which can lead to junk being displayed if a process sends such a code unconditionally. Not following Ecma-48, this commit recognizes 7-bit forms (ESC ] ... 0x07, ESC ] .. ESC \) but not 8-bit (0x9D ... 0x9C). Signed-off-by: Adam Borowski --- drivers/tty/vt/vt.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 61b1137..0377c52 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -1590,7 +1590,7 @@ static void restore_cur(struct vc_data *vc) enum { ESnormal, ESesc, ESsquare, ESgetpars, ESgotpars, ESfunckey, EShash, ESsetG0, ESsetG1, ESpercent, ESignore, ESnonstd, - ESpalette }; + ESpalette, ESosc }; /* console_lock is held (except via vc_init()) */ static void reset_terminal(struct vc_data *vc, int do_clear) @@ -1650,11 +1650,15 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c) * Control characters can be used in the _middle_ * of an escape sequence. */ + if (vc->vc_state == ESosc && c>=8 && c<=13) /* ... 
except for OSC */ + return; switch (c) { case 0: return; case 7: - if (vc->vc_bell_duration) + if (vc->vc_state == ESosc) + vc->vc_state = ESnormal; + else if (vc->vc_bell_duration) kd_mksound(vc->vc_bell_pitch, vc->vc_bell_duration); return; case 8: @@ -1765,7 +1769,9 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c) } else if (c=='R') { /* reset palette */ reset_palette(vc); vc->vc_state = ESnormal; - } else + } else if (c>='0' && c<='9') + vc->vc_state = ESosc; + else vc->vc_state = ESnormal; return; case ESpalette: @@ -2023,6 +2029,8 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c) return; default: vc->vc_state = ESnormal; + case ESosc: + return; } } -- 1.8.5.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
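For context, the sequences the patch swallows look like `ESC ] ... BEL` or `ESC ] ... ESC \`. The following user-space sketch filters 7-bit OSC sequences out of a byte stream with a state machine that loosely mirrors the patch's ESosc handling; it is illustrative only and deliberately lets non-OSC escapes pass through untouched:

```c
#include <stddef.h>

/* Minimal filter that drops 7-bit OSC sequences (ESC ] ... BEL or
 * ESC ] ... ESC \) from a byte stream, in the spirit of the ESosc
 * state added to the console above. */
enum state { NORMAL, ESC, OSC, OSC_ESC };

static size_t strip_osc(const char *in, size_t len, char *out)
{
	enum state st = NORMAL;
	size_t n = 0;

	for (size_t i = 0; i < len; i++) {
		char c = in[i];

		switch (st) {
		case NORMAL:
			if (c == 0x1b)
				st = ESC;
			else
				out[n++] = c;
			break;
		case ESC:
			if (c == ']') {		/* start of OSC: swallow */
				st = OSC;
			} else {		/* any other escape passes through */
				out[n++] = 0x1b;
				out[n++] = c;
				st = NORMAL;
			}
			break;
		case OSC:
			if (c == 0x07)		/* BEL terminator */
				st = NORMAL;
			else if (c == 0x1b)	/* maybe ESC \ (ST) */
				st = OSC_ESC;
			/* everything else is OSC payload: dropped */
			break;
		case OSC_ESC:
			st = (c == '\\') ? NORMAL : OSC;
			break;
		}
	}
	return n;
}
```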
Re: [PATCH 2/2] x86: intel-mid: sfi_handle_*_dev() should check for pdata error code
* David Cohen wrote:

> Hi Ingo,
>
> On Fri, Dec 20, 2013 at 09:49:53AM +0100, Ingo Molnar wrote:
> >
> > * David Cohen wrote:
> >
> > > Prevent sfi_handle_*_dev() to register device in case
> > > intel_mid_sfi_get_pdata() failed to execute.
> > >
> > > Since 'NULL' is a valid return value, this patch makes
> > > sfi_handle_*_dev() functions to use IS_ERR() to validate returned pdata.
> >
> > Is this bug triggering in practice? If not then please say so in the
> > changelog. If yes then is this patch desired for v3.13 merging and
> > also please fix the changelog to conform to the standard changelog
> > style:
> >
> >  - first describe the symptoms of the bug - how does a user notice?
> >
> >  - then describe how the code behaves today and how that is causing
> >    the bug
> >
> >  - and then only describe how it's fixed.
> >
> > The first item is the most important one - while developers
> > (naturally) tend to concentrate on the least important point, the last
> > one.
>
> Thanks for the feedback :)
> This new patch set was done in reply to your comment:
> https://lkml.org/lkml/2013/12/20/517

Hm, in what way does the new changelog address my first request:

> > - first describe the symptoms of the bug - how does a user notice?

They are all phrased as bug fixes, yet _none_ of the three changelogs appears to describe specific symptoms on specific systems - they all seem to talk in the abstract, with no specific connection to reality.

That really makes it harder for patches to get into the (way too narrow) attention span of maintainers, while phrasing it like this:

  'If an Intel-MID system boots in a specific SFI environment then it will hang on bootup without this fix.'

or:

  'Existing Intel-MID hardware will run faster with this patch.'

will certainly wake up maintainers like a good coffee in the morning.

If a patch is a cleanup with no known bug fix effects then say so in the title and the changelog.
Thanks,

	Ingo
Re: [RFC][PATCH 1/9] mm: slab/slub: use page->list consistently instead of page->lru
On 01/14/2014 06:31 PM, David Rientjes wrote:
> Did you try with a CONFIG_BLOCK config?
>
> block/blk-mq.c: In function ‘blk_mq_free_rq_map’:
> block/blk-mq.c:1094:10: error: ‘struct page’ has no member named ‘list’
> block/blk-mq.c:1094:10: warning: initialization from incompatible pointer type [enabled by default]
> block/blk-mq.c:1094:10: error: ‘struct page’ has no member named ‘list’
> block/blk-mq.c:1095:22: error: ‘struct page’ has no member named ‘list’
> block/blk-mq.c: In function ‘blk_mq_init_rq_map’:
> block/blk-mq.c:1159:22: error: ‘struct page’ has no member named ‘list’

As I mentioned in the introduction, these are against linux-next. There's a patch in there at the moment which fixed this.
Re: [RFC PATCH] sched: find the latest idle cpu
On 01/15/2014 01:33 PM, Michael wang wrote:
> On 01/15/2014 12:07 PM, Alex Shi wrote:
>> Currently we just try to find the least loaded cpu. If some cpus are idle,
>> we just pick the first cpu in the cpu mask.
>>
>> In fact we can get the interrupted idle cpu or the latest idled cpu,
>> then we may get the benefit from both latency and power.
>> The selected cpu may not be the best, since another cpu may be interrupted
>> during our selection. But being too cautious costs too much.
>
> So the idea here is we want to choose the latest idle cpu if we have
> multiple idle cpus to choose from, correct?

yes.

> And I guess that was in order to avoid choosing a tickless cpu while there
> are non-tickless idle ones, is that right?

no, the current logic chooses the least loaded cpu no matter whether it is idle.

> What confused me is, what about those cpus which are just about to recover
> from tickless as you mentioned? That means the latest idle cpu isn't
> necessarily the best choice, and could even be the worst (if there are just
> two choices, and the longer-tickless one is just about to recover while the
> latest one is just going tickless).

yes, to handle your scenario we would need to know the next timer event for each idle cpu, but even that is not enough: interrupts are totally unpredictable. So I'd rather bear with the coarse method for now.

> So what about just checking 'ts->tick_stopped' and recording one ticking
> idle cpu? The cost could be lower than a timestamp comparison, and we could
> reduce the risk maybe... (well, not so risky since the logic only kicks in
> when the system is relaxed with several cpus idle)

first, nohz full also stops the tick. second, tick_stopped cannot reflect interrupts: when an idle cpu is interrupted it is woken, and then becomes a good candidate for running the task.

--
Thanks
    Alex
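The selection policy being debated, preferring whichever idle cpu entered idle most recently, can be sketched like this. The per-cpu idle timestamp field below is assumed purely for illustration; it is not the scheduler's actual data structure:

```c
#include <stdint.h>

struct cpu_state {
	int idle;		/* 1 if the cpu is currently idle */
	uint64_t idle_stamp_ns;	/* when it last entered idle (illustrative) */
};

/*
 * Among idle cpus, pick the one that idled latest: it is least likely
 * to be deep in a tickless state, which is the latency/power trade-off
 * argued for in the RFC.  Returns -1 if no cpu is idle.
 */
static int find_latest_idle_cpu(const struct cpu_state *cpus, int n)
{
	int best = -1;

	for (int i = 0; i < n; i++) {
		if (!cpus[i].idle)
			continue;
		if (best < 0 || cpus[i].idle_stamp_ns > cpus[best].idle_stamp_ns)
			best = i;
	}
	return best;
}
```

As the thread notes, the timestamp is only a heuristic: a cpu can be interrupted (and woken) at any time between the scan and the actual task placement.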
[RESEND PATCH v10] x86, apic, kexec, Documentation: Add disable_cpu_apicid kernel parameter
Add disable_cpu_apicid kernel parameter. To use this kernel parameter, specify an initial APIC ID of the corresponding CPU you want to disable. This is mostly used for the kdump 2nd kernel to disable BSP to wake up multiple CPUs without causing system reset or hang due to sending INIT from AP to BSP. Kdump users first figure out initial APIC ID of the BSP, CPU0 in the 1st kernel, for example from /proc/cpuinfo and then set up this kernel parameter for the 2nd kernel using the obtained APIC ID. However, doing this procedure at each boot time manually is awkward, which should be automatically done by user-land service scripts, for example, kexec-tools on fedora/RHEL distributions. This design is more flexible than disabling BSP in kernel boot time automatically in that in kernel boot time we have no choice but referring to ACPI/MP table to obtain initial APIC ID for BSP, meaning that the method is not applicable to the systems without such BIOS tables. One assumption behind this design is that users get initial APIC ID of the BSP in still healthy state and so BSP is uniquely kept in CPU0. Thus, through the kernel parameter, only one initial APIC ID can be specified. In a comparison with disabled_cpu_apicid, we use read_apic_id(), not boot_cpu_physical_apicid, because on some platforms, the variable is modified to the apicid reported as BSP through MP table and this function is executed with the temporarily modified boot_cpu_physical_apicid. As a result, disabled_cpu_apicid kernel parameter doesn't work well for apicids of APs. Fixing the wrong handling of boot_cpu_physical_apicid requires some reviews and tests beyond some platforms and it could take some time. The fix here is a kind of workaround to focus on the main topic of this patch. 
Signed-off-by: HATAYAMA Daisuke --- Documentation/kernel-parameters.txt |9 ++ arch/x86/kernel/apic/apic.c | 49 +++ 2 files changed, 58 insertions(+) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 50680a5..4e5528c 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -774,6 +774,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted. disable=[IPV6] See Documentation/networking/ipv6.txt. + disable_cpu_apicid= [X86,APIC,SMP] + Format: + The number of initial APIC ID for the + corresponding CPU to be disabled at boot, + mostly used for the kdump 2nd kernel to + disable BSP to wake up multiple CPUs without + causing system reset or hang due to sending + INIT from AP to BSP. + disable_ddw [PPC/PSERIES] Disable Dynamic DMA Window support. Use this if to workaround buggy firmware. diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index d278736..6c0b7d5 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -75,6 +75,13 @@ unsigned int max_physical_apicid; physid_mask_t phys_cpu_present_map; /* + * Processor to be disabled specified by kernel parameter + * disable_cpu_apicid=, mostly used for the kdump 2nd kernel to + * avoid undefined behaviour caused by sending INIT from AP to BSP. + */ +unsigned int disabled_cpu_apicid = BAD_APICID; + +/* * Map cpu index to physical APIC ID */ DEFINE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid, BAD_APICID); @@ -2115,6 +2122,39 @@ int generic_processor_info(int apicid, int version) phys_cpu_present_map); /* +* boot_cpu_physical_apicid is designed to have the apicid +* returned by read_apic_id(), i.e, the apicid of the +* currently booting-up processor. However, on some platforms, +* it is temporarilly modified by the apicid reported as BSP +* through MP table. 
Concretely: +* +* - arch/x86/kernel/mpparse.c: MP_processor_info() +* - arch/x86/mm/amdtopology.c: amd_numa_init() +* - arch/x86/platform/visws/visws_quirks.c: MP_processor_info() +* +* This function is executed with the modified +* boot_cpu_physical_apicid. So, disabled_cpu_apicid kernel +* parameter doesn't work to disable APs on kdump 2nd kernel. +* +* Since fixing handling of boot_cpu_physical_apicid requires +* another discussion and tests on each platform, we leave it +* for now and here we use read_apic_id() directly in this +* function, generic_processor_info(). +*/ + if (disabled_cpu_apicid != BAD_APICID && + disabled_cpu_apicid != read_apic_id() && + disabled_cpu_apicid == apicid) { + int thiscpu = num_processors + disabled_cpus; + + pr_warning("ACPI:
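The user-land half of the workflow described in the changelog, reading the BSP's initial APIC ID from /proc/cpuinfo and turning it into the kernel parameter for the kdump kernel, could look roughly like the sketch below. The cpuinfo text is embedded as a sample here; a real service script would read the file itself, and the sketch assumes (as the changelog does) that processor 0 is listed first while the system is still healthy:

```c
#include <stdio.h>
#include <string.h>

/* Sample /proc/cpuinfo fragment; processor 0 (the BSP) comes first,
 * so the first "initial apicid" line belongs to the BSP. */
static const char sample_cpuinfo[] =
	"processor\t: 0\n"
	"vendor_id\t: GenuineIntel\n"
	"initial apicid\t: 0\n"
	"processor\t: 1\n"
	"initial apicid\t: 2\n";

/* Return the BSP's initial APIC ID, or -1 on parse failure. */
static int bsp_initial_apicid(const char *cpuinfo)
{
	const char *p = strstr(cpuinfo, "initial apicid");
	int apicid;

	if (!p || sscanf(p, "initial apicid : %d", &apicid) != 1)
		return -1;
	return apicid;
}

/* Build the parameter the kdump service would append to the 2nd
 * kernel's command line. */
static int format_param(char *buf, size_t len, const char *cpuinfo)
{
	int apicid = bsp_initial_apicid(cpuinfo);

	if (apicid < 0)
		return -1;
	return snprintf(buf, len, "disable_cpu_apicid=%d", apicid);
}
```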
Re: [PATCH 7/8 v3] crypto:s5p-sss: validate iv before memcpy
Hello Tomasz, On 10 January 2014 21:33, Tomasz Figa wrote: > Hi Naveen, > > > On 10.01.2014 12:45, Naveen Krishna Chatradhi wrote: >> >> This patch adds code to validate "iv" buffer before trying to >> memcpy the contents >> >> Signed-off-by: Naveen Krishna Chatradhi >> --- >> Changes since v2: >> None >> >> drivers/crypto/s5p-sss.c |5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c >> index f274f5f..7058bb6 100644 >> --- a/drivers/crypto/s5p-sss.c >> +++ b/drivers/crypto/s5p-sss.c >> @@ -381,8 +381,9 @@ static void s5p_set_aes(struct s5p_aes_dev *dev, >> struct samsung_aes_variant *var = dev->variant; >> void __iomem *keystart; >> >> - memcpy(dev->ioaddr + SSS_REG_AES_IV_DATA >> - (var->aes_offset, 0), iv, 0x10); >> + if (iv) >> + memcpy(dev->ioaddr + SSS_REG_AES_IV_DATA >> + (var->aes_offset, 0), iv, 0x10); > > > In what conditions can the iv end up being NULL? req->info is the initialization vector in our case, which comes from user space. Its good to have a check to avoid any crashes. Also AES ECB mode does not use IV. > > Best regards, > Tomasz -- Shine bright, (: Nav :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] perf tools: Synthesize anon MMAP records on the heap
Hi Namhyung, On 1/15/14, 12:46 AM, "Namhyung Kim" wrote: >I'd like to take my ack back - it seems I missed some points. No worries, looks like the patch wasn’t well thought out. >On Tue, 14 Jan 2014 20:48:23 +, Gaurav Jain wrote: >> On 1/13/14, 11:54 AM, "Don Zickus" wrote: >> >>>On Sat, Jan 11, 2014 at 08:32:14PM -0800, Gaurav Jain wrote: Anon records usually do not have the 'execname' entry. However if they are on the heap, the execname shows up as '[heap]'. The fix considers any executable entries in the map that do not have a name or are on the heap as anon records and sets the name to '//anon'. This fixes JIT profiling for records on the heap. >>> >>>I guess I don't understand the need for this fix. It seems breaking out >>>//anon vs. [heap] would be useful. Your patch is saying otherwise. Can >>>give a description of the problem you are trying to solve? >> >> Thank you for looking at the patch. >> >> We generate a perf map file which includes certain JIT¹ed functions that >> show up as [heap] entries. As a result, I included the executable heap >> entries as anon pages so that it would be handled in >> util/map.c:map__new(). The alternative would be to handle heap entries >>in >> map__new() directly, however I wasn¹t sure if this would break something >> as it seems that heap and stack entries are expected to fail all >> map__find_* functions. Thus I considered executable heap entries as >> //anon, but perhaps there is a better way. > >Hmm.. so the point is that an executable heap mapping should have >/tmp/perf-XXX.map as a file name, right? If so, does something like >below work well for you? Just gave it a try and it fixed the issue perfectly! Thanks for the help. This looks like a much better solution than treating the heap mapping as an anon record. 
Gaurav >diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c >index 9b9bd719aa19..d52387fe83f1 100644 >--- a/tools/perf/util/map.c >+++ b/tools/perf/util/map.c >@@ -69,7 +69,7 @@ struct map *map__new(struct list_head *dsos__list, u64 >start, u64 len, > map->ino = ino; > map->ino_generation = ino_gen; > >- if (anon) { >+ if (anon || (no_dso && type == MAP__FUNCTION)) { > snprintf(newfilename, sizeof(newfilename), > "/tmp/perf-%d.map", pid); > filename = newfilename; > } >@@ -93,7 +93,7 @@ struct map *map__new(struct list_head *dsos__list, u64 >start, u64 len, >* functions still return NULL, and we avoid the >* unnecessary map__load warning. >*/ >- if (no_dso) >+ if (no_dso && type != MAP__FUNCTION) > dso__set_loaded(dso, map->type); > } > }
Re: [GIT PULL] clockevents/clocksources: 3.13 fixes
* Daniel Lezcano wrote:

> Hi Thomas and Ingo,
>
> here is a pull request for a single fix for 3.13. It is based on the
> latest timers/urgent update.
>
> * Soren Brinkmann fixed the cadence_ttc driver where a call to
>   clk_get_rate happens in an interrupt context. More precisely in an
>   IPI when the broadcast timer is initialized for each cpu in the
>   cpuidle driver
>
> Thanks
>   -- Daniel
>
> The following changes since commit b0031f227e47919797dc0e1c1990f3ef151ff0cc:
>
>   Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator (2013-12-17 12:57:36 -0800)
>
> are available in the git repository at:
>
>   git://git.linaro.org/people/daniel.lezcano/linux.git clockevents/3.13-fixes
>
> for you to fetch changes up to c1dcc927dae01dfd4904ee82ce2c00b50eab6dc3:
>
>   clocksource: cadence_ttc: Fix mutex taken inside interrupt context (2013-12-30 11:32:24 +0100)
>
> Soren Brinkmann (1):
>       clocksource: cadence_ttc: Fix mutex taken inside interrupt context
>
>  drivers/clocksource/cadence_ttc_timer.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)

Pulled into tip:timers/urgent, thanks Daniel!

	Ingo
Re: [tip:core/urgent] sched_clock: Disable seqlock lockdep usage in sched_clock()
* John Stultz wrote:

> On 01/12/2014 10:42 AM, tip-bot for John Stultz wrote:
> > Commit-ID:  7a06c41cbec33c6dbe7eec575c61986122617408
> > Gitweb:     http://git.kernel.org/tip/7a06c41cbec33c6dbe7eec575c61986122617408
> > Author:     John Stultz
> > AuthorDate: Thu, 2 Jan 2014 15:11:14 -0800
> > Committer:  Ingo Molnar
> > CommitDate: Sun, 12 Jan 2014 10:14:00 +0100
> >
> > sched_clock: Disable seqlock lockdep usage in sched_clock()
> >
> > Unfortunately the seqlock lockdep enablement can't be used
> > in sched_clock(), since the lockdep infrastructure eventually
> > calls into sched_clock(), which causes a deadlock.
> >
> > Thus, this patch changes all generic sched_clock() usage
> > to use the raw_* methods.
> >
> > Acked-by: Linus Torvalds
> > Reviewed-by: Stephen Boyd
> > Reported-by: Krzysztof Hałasa
> > Signed-off-by: John Stultz
> > Cc: Uwe Kleine-König
> > Cc: Willy Tarreau
> > Signed-off-by: Peter Zijlstra
> > Link: http://lkml.kernel.org/r/1388704274-5278-2-git-send-email-john.stu...@linaro.org
> > Signed-off-by: Ingo Molnar
>
> Hey Ingo,
>     Just wanted to follow up here, since I've still not seen this (and
> the raw_ renaming patch) submitted to Linus. These address a lockup that
> triggers on ARM systems if lockdep is enabled and it would be good to
> get it in before 3.13 is out.

It's in tip:core/urgent, i.e. lined up for v3.13, I plan to send it to Linus later today.

Thanks,

	Ingo
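Schematically, the fix swaps the lockdep-instrumented seqlock reader for the raw one so that sched_clock() never re-enters lockdep. The following is a user-space reduction of the reader shape only; memory barriers and the lockdep hooks themselves are omitted, and while the names follow the kernel's raw_* convention, this is not the kernel implementation:

```c
#include <stdint.h>

/* Reduced seqcount: even sequence = stable, odd = writer in progress. */
struct seqcount { volatile unsigned seq; };

static unsigned raw_read_seqcount_begin(const struct seqcount *s)
{
	unsigned ret;

	while ((ret = s->seq) & 1)
		;	/* spin while a write is in flight */
	return ret;	/* the raw_ variant has no lockdep hooks here */
}

static int raw_read_seqcount_retry(const struct seqcount *s, unsigned start)
{
	return s->seq != start;	/* retry if a writer intervened */
}

/* sched_clock()-style reader: must not call back into lockdep. */
static uint64_t read_clock(const struct seqcount *s, const uint64_t *ns)
{
	uint64_t t;
	unsigned seq;

	do {
		seq = raw_read_seqcount_begin(s);
		t = *ns;
	} while (raw_read_seqcount_retry(s, seq));
	return t;
}
```

The instrumented read_seqcount_begin() would record the acquisition with lockdep, and lockdep timestamps events via sched_clock(), hence the recursion the patch breaks.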
Re: [tip:x86/urgent] x86, cpu, amd: Add workaround for family 16h, erratum 793
* H. Peter Anvin wrote:

> On 01/14/2014 04:45 PM, tip-bot for Borislav Petkov wrote:
> > +	rdmsrl(MSR_AMD64_LS_CFG, val);
> > +	if (!(val & BIT(15)))
> > +		wrmsrl(MSR_AMD64_LS_CFG, val | BIT(15));
>
> Incidentally, I'm wondering if we shouldn't have a
> set_in_msr()/clear_in_msr() set of functions which would incorporate the
> above construct:
>
> void set_in_msr(u32 msr, u64 mask)
> {
> 	u64 old, new;
>
> 	old = rdmsrl(msr);
> 	new = old | mask;
> 	if (old != new)
> 		wrmsrl(msr, new);
> }
>
> ... and the obvious equivalent for clear_in_msr().
>
> The perhaps only question is if it should be "set/clear_bit_in_msr()"
> rather than having to haul a full 64-bit mask in the common case.

I'd suggest the introduction of a standard set of methods operating on MSRs:

	msr_read()
	msr_write()
	msr_set_bit()
	msr_clear_bit()
	msr_set_mask()
	msr_clear_mask()

etc.

msr_read() would essentially map to rdmsr_safe(). Each method has a return value that can be checked for failure.

Note that the naming of 'msr_set_bit()' and 'msr_clear_bit()' mirrors that of bitops, and set_mask/clear_mask is named along a similar pattern, so that it's more immediately obvious what's going on.

With such methods in place we could use them in most new code, and would use 'raw, unsafe' rdmsr()/wrmsr() only in very specific, justified cases.

Thanks,

	Ingo
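A user-space sketch of the proposed mask helpers, built over stub rdmsrl()/wrmsrl() so the write-avoidance from the erratum workaround is visible. The msr_set_mask()/msr_clear_mask() names follow the proposal in this thread and are not an existing kernel API; the real kernel rdmsrl() is a two-argument macro, simplified here to a value-returning function:

```c
#include <stdint.h>

/* Stub MSR storage standing in for real rdmsrl()/wrmsrl(). */
static uint64_t fake_msr[8];
static int msr_writes;	/* counts (potentially expensive/trapping) writes */

static uint64_t rdmsrl(unsigned msr)		{ return fake_msr[msr]; }
static void wrmsrl(unsigned msr, uint64_t v)	{ fake_msr[msr] = v; msr_writes++; }

/* Set bits in an MSR, skipping the write when the bits are already
 * set - the exact pattern from the family 16h erratum workaround. */
static void msr_set_mask(unsigned msr, uint64_t mask)
{
	uint64_t old = rdmsrl(msr), new = old | mask;

	if (old != new)
		wrmsrl(msr, new);
}

static void msr_clear_mask(unsigned msr, uint64_t mask)
{
	uint64_t old = rdmsrl(msr), new = old & ~mask;

	if (old != new)
		wrmsrl(msr, new);
}
```

The bit variants proposed above would simply be msr_set_mask(msr, 1ULL << bit) wrappers, and the kernel versions would additionally return the rdmsr_safe()/wrmsr_safe() error code.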
Re: [PATCH 2/3] perf tools: Spare double comparison of callchain first entry
On Tue, 14 Jan 2014 16:37:15 +0100, Frederic Weisbecker wrote:
> When a new callchain child branch matches an existing one in the rbtree,
> the comparison of its first entry is performed twice:
>
> 1) From append_chain_children() on branch lookup
>
> 2) If 1) reports a match, append_chain() then compares all entries of
>    the new branch against the matching node in the rbtree, and this
>    comparison includes the first entry of the new branch again.

Right.

> Lets shortcut this by performing the whole comparison only from
> append_chain() which then returns the result of the comparison between
> the first entry of the new branch and the iterating node in the rbtree.
> If the first entry matches, the lookup on the current level of siblings
> stops and propagates to the children of the matching nodes.

Hmm.. it looks like that I thought directly calling append_chain() has some overhead - but it's not.

> This results in less comparisons performed by the CPU.

Do you have any numbers? I suspect it'd not be a big change, but just curious.

> Signed-off-by: Frederic Weisbecker

Reviewed-by: Namhyung Kim

Thanks,
Namhyung
Re: [PATCH 3/3] perf tools: Remove unnecessary callchain cursor state restore on unmatch
On Tue, 14 Jan 2014 16:37:16 +0100, Frederic Weisbecker wrote:
> If a new callchain branch doesn't match a single entry of the node that
> it is given against comparison in append_chain(), then the cursor is
> expected to be at the same position as it was before the comparison loop.
>
> As such, there is no need to restore the cursor position on exit in case
> of non matching branches.
>
> Signed-off-by: Frederic Weisbecker

Reviewed-by: Namhyung Kim

Thanks,
Namhyung
Re: [PATCH V2] ACPI/Battery: Add a _BIX quirk for NEC LZ750/LS
On 01/14/2014 03:37 PM, Rafael J. Wysocki wrote:
> On Tuesday, January 14, 2014 04:06:01 PM Matthew Garrett wrote:
>> On Mon, Jan 06, 2014 at 11:25:53PM +0100, Rafael J. Wysocki wrote:
>>> Queued up as a fix for 3.13 (I fixed up the indentation).
>>
>> Ah, sorry, I missed this chunk of the thread. If the system provides
>> valid _BIF data then we should possibly just fall back to that rather
>> than adding another quirk table.
>
> The problem is to know that _BIX is broken. If we could figure that out
> upfront, we wouldn't need the quirk table in any case.
>
> Tianyu, can we do some effort during the driver initialization to detect
> this breakage and handle it without blacklisting systems?

Yes, the usual question in such cases is "how does Windows manage to function on such systems, (almost certainly) without a system-specific hack, and can we replicate that behavior?"
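One possible shape for the init-time detection asked about above, hypothetical and illustrative only: sanity-check the _BIX payload and silently fall back to _BIF data when it fails. The struct, the field subset and the specific consistency checks below are invented for the sketch, not the ACPI battery driver's actual logic:

```c
#include <stdbool.h>

/* Hypothetical battery info; a small field subset of what the ACPI
 * _BIX/_BIF packages carry. */
struct battery_info {
	int design_capacity;
	int full_charge_capacity;
	bool valid;		/* evaluation succeeded at all */
};

/* Quirk-free selection: accept _BIX only if its payload looks
 * self-consistent, otherwise fall back to _BIF (illustrative checks,
 * e.g. a healthy pack's full-charge capacity not exceeding design). */
static struct battery_info pick_battery_info(struct battery_info bix,
					     struct battery_info bif)
{
	bool bix_sane = bix.valid &&
			bix.design_capacity > 0 &&
			bix.full_charge_capacity > 0 &&
			bix.full_charge_capacity <= bix.design_capacity;

	return bix_sane ? bix : bif;
}
```

The hard part, as the thread notes, is that "sane-looking but wrong" _BIX data cannot be caught this way, which is why the quirk table was proposed in the first place.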
Re: [PATCH v2] clk: sirf: re-arch to make the codes support both prima2 and atlas6
2014/1/15 Mike Turquette : > Quoting Barry Song (2014-01-05 21:38:19) >> diff --git a/drivers/clk/sirf/clk-atlas6.c b/drivers/clk/sirf/clk-atlas6.c >> new file mode 100644 >> index 000..21e776a >> --- /dev/null >> +++ b/drivers/clk/sirf/clk-atlas6.c >> @@ -0,0 +1,153 @@ >> +/* >> + * Clock tree for CSR SiRFatlasVI >> + * >> + * Copyright (c) 2011 Cambridge Silicon Radio Limited, a CSR plc group >> company. >> + * >> + * Licensed under GPLv2 or later. >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include > > Please do not use clk-private.h. It is slated for removal (some day...). > Do you actually need it? removed in v3. > > Regards, > Mike -barry -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] toshiba_acpi: Support RFKILL hotkey scancode
In Windows, it disables the Wi-Fi & Bluetooth. In Linux, it does nothing but show this error in dmesg about the unrecognized scancode:

[230128.042047] toshiba_acpi: Unknown key 158

On Tue, Jan 14, 2014 at 4:54 PM, Matthew Garrett wrote:
> On Tue, 2014-01-14 at 11:06 +0100, Unai Uribarri wrote:
>> This scancode is used in new 2013 models like Satellite P75-A7200.
>
> Just to check - this is generated by a key that otherwise does nothing,
> right? Ie, hitting the key generates the scancode but doesn't
> automatically change the wireless state?
>
> --
> Matthew Garrett
Re: [PATCH] ARM: dts: imx28-apf28dev: add user button
On Tue, Jan 14, 2014 at 03:21:27PM +0100, Sébastien Szymanski wrote: > Signed-off-by: Sébastien Szymanski > --- > arch/arm/boot/dts/imx28-apf28dev.dts | 11 +++ > 1 file changed, 11 insertions(+) Applied, thanks. Shawn > > diff --git a/arch/arm/boot/dts/imx28-apf28dev.dts > b/arch/arm/boot/dts/imx28-apf28dev.dts > index 334dea5..221cac4 100644 > --- a/arch/arm/boot/dts/imx28-apf28dev.dts > +++ b/arch/arm/boot/dts/imx28-apf28dev.dts > @@ -48,6 +48,7 @@ > MX28_PAD_LCD_D20__GPIO_1_20 > MX28_PAD_LCD_D21__GPIO_1_21 > MX28_PAD_LCD_D22__GPIO_1_22 > + MX28_PAD_GPMI_CE1N__GPIO_0_17 > >; > fsl,drive-strength = ; > fsl,voltage = ; > @@ -193,4 +194,14 @@ > brightness-levels = <0 4 8 16 32 64 128 255>; > default-brightness-level = <6>; > }; > + > + gpio-keys { > + compatible = "gpio-keys"; > + > + user-button { > + label = "User button"; > + gpios = < 17 0>; > + linux,code = <0x100>; > + }; > + }; > }; > -- > 1.8.3.2 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 9/9] mm: keep page cache radix tree nodes in check
Hi Johannes,

On 01/11/2014 02:10 AM, Johannes Weiner wrote:
> Previously, page cache radix tree nodes were freed after reclaim
> emptied out their page pointers. But now reclaim stores shadow
> entries in their place, which are only reclaimed when the inodes
> themselves are reclaimed. This is problematic for bigger files that
> are still in use after they have a significant amount of their cache
> reclaimed, without any of those pages actually refaulting. The shadow
> entries will just sit there and waste memory. In the worst case, the
> shadow entries will accumulate until the machine runs out of memory.

I have one more question. It seems that other algorithms only remember history information for a limited number of evicted pages, where that number is usually the same as the total cache or memory size. But in your patch, I didn't see a preferred value for how many evicted pages' history information should be recorded. Does it all depend on the workingset_shadow_shrinker?

Thanks,
-Bob
Re: [PATCH 1/3] perf tools: Do proper comm override error handling
Hi Frederic,

On Tue, 14 Jan 2014 16:37:14 +0100, Frederic Weisbecker wrote:
> The comm overriding API ignores memory allocation failures by silently
> keeping the previous and out of date comm.
>
> As a result, the user may get buggy events without ever being notified
> about the problem and its source.
>
> Lets start to fix this by propagating the error from the API. Not all
> callers may be doing proper error handling on comm set yet but this
> is the first step toward it.

Acked-by: Namhyung Kim

Thanks,
Namhyung
Re: [PATCH] Documentation / CPU hotplug: Fix the typo in example code
On 01/15/2014 10:40 AM, Sangjung Woo wrote: > As the notifier_block name (i.e. foobar_cpu_notifer) is different from > the parameter (i.e.foobar_cpu_notifier) of register function, that is > definitely error and it also makes readers confused. > > Signed-off-by: Sangjung Woo Reviewed-by: Srivatsa S. Bhat Regards, Srivatsa S. Bhat > --- > Documentation/cpu-hotplug.txt |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt > index 8cb9938..be675d2 100644 > --- a/Documentation/cpu-hotplug.txt > +++ b/Documentation/cpu-hotplug.txt > @@ -285,7 +285,7 @@ A: This is what you would need in your kernel code to > receive notifications. > return NOTIFY_OK; > } > > - static struct notifier_block foobar_cpu_notifer = > + static struct notifier_block foobar_cpu_notifier = > { > .notifier_call = foobar_cpu_callback, > }; > --
Re: [PATCH] perf tools: Synthesize anon MMAP records on the heap
Hi Gaurav, I'd like to take my ack back - it seems I missed some points. On Tue, 14 Jan 2014 20:48:23 +, Gaurav Jain wrote: > On 1/13/14, 11:54 AM, "Don Zickus" wrote: > >>On Sat, Jan 11, 2014 at 08:32:14PM -0800, Gaurav Jain wrote: >>> Anon records usually do not have the 'execname' entry. However if they >>>are on >>> the heap, the execname shows up as '[heap]'. The fix considers any >>>executable >>> entries in the map that do not have a name or are on the heap as anon >>>records >>> and sets the name to '//anon'. >>> >>> This fixes JIT profiling for records on the heap. >> >>I guess I don't understand the need for this fix. It seems breaking out >>//anon vs. [heap] would be useful. Your patch is saying otherwise. Can you >>give a description of the problem you are trying to solve? > > Thank you for looking at the patch. > > We generate a perf map file which includes certain JIT'ed functions that > show up as [heap] entries. As a result, I included the executable heap > entries as anon pages so that it would be handled in > util/map.c:map__new(). The alternative would be to handle heap entries in > map__new() directly, however I wasn't sure if this would break something > as it seems that heap and stack entries are expected to fail all > map__find_* functions. Thus I considered executable heap entries as > //anon, but perhaps there is a better way. Hmm.. so the point is that an executable heap mapping should have /tmp/perf-XXX.map as a file name, right? If so, does something like below work well for you?
Thanks, Namhyung diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c index 9b9bd719aa19..d52387fe83f1 100644 --- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -69,7 +69,7 @@ struct map *map__new(struct list_head *dsos__list, u64 start, u64 len, map->ino = ino; map->ino_generation = ino_gen; - if (anon) { + if (anon || (no_dso && type == MAP__FUNCTION)) { snprintf(newfilename, sizeof(newfilename), "/tmp/perf-%d.map", pid); filename = newfilename; } @@ -93,7 +93,7 @@ struct map *map__new(struct list_head *dsos__list, u64 start, u64 len, * functions still return NULL, and we avoid the * unnecessary map__load warning. */ - if (no_dso) + if (no_dso && type != MAP__FUNCTION) dso__set_loaded(dso, map->type); } } --
Re: [PATCH] mm/zswap: add writethrough option
Hello, On Tue, Jan 14, 2014 at 10:10:44AM -0500, Dan Streetman wrote: > On Mon, Jan 13, 2014 at 7:11 PM, Minchan Kim wrote: > > Hello Dan, > > > > Sorry for the late response and I didn't look at the code yet > > because I am not convinced. :( > > > > On Thu, Dec 19, 2013 at 08:23:27AM -0500, Dan Streetman wrote: > >> Currently, zswap is writeback cache; stored pages are not sent > >> to swap disk, and when zswap wants to evict old pages it must > >> first write them back to swap cache/disk manually. This avoids > >> swap out disk I/O up front, but only moves that disk I/O to > >> the writeback case (for pages that are evicted), and adds the > >> overhead of having to uncompress the evicted pages and the > >> need for an additional free page (to store the uncompressed page). > >> > >> This optionally changes zswap to writethrough cache by enabling > >> frontswap_writethrough() before registering, so that any > >> successful page store will also be written to swap disk. The > >> default remains writeback. To enable writethrough, the param > >> zswap.writethrough=1 must be used at boot. > >> > >> Whether writeback or writethrough will provide better performance > >> depends on many factors including disk I/O speed/throughput, > >> CPU speed(s), system load, etc. In most cases it is likely > >> that writeback has better performance than writethrough before > >> zswap is full, but after zswap fills up writethrough has > >> better performance than writeback. > > > > So you claim we should use writeback by default but writethrough > > after the memory limit is full? > > But it would break LRU ordering and I think a better idea is to > > handle it in a more generic way rather than changing the entire policy. > > This patch only adds the option of using writethrough. That's all. The point is: please explain what your problem is now, and prove that adding a new option is the best way to solve it. Just "optionally having it is better" is not a good approach to merging/maintaining something.
> > > Now, zswap evict out just *a* page rather than a bunch of pages > > so it stalls every store if many swap writes happen continuously. > > It's not efficient so how about adding kswapd's threshold concept > > like min/low/high? So, it could evict early before reaching zswap > > memory pool and stop it reaches high watermark. > > I guess it could be better than now. > > Well, I don't think that's related to this patch, but certainly a good idea to > investigate. The reason I suggested it is that, from your description, I feel wb is just slower than wt once the zswap memory pool is full. > > > > > Other point: As I read device-mapper/cache.txt, cache operating mode > > already supports writethrough. It means zRAM can support > > writeback/writethrough with dm-cache. > > Have you tried it? Is there any problem? > > zswap isn't a block device though, so that doesn't apply (unless I'm > missing something). zRAM is a block device, so you can freely use it as a swap block device, and binding it with dm-cache will give you what you want. The whole point is that we could do what you want without adding anything new, so I hope you show what the problem is in the existing solution, so that we can judge and try to solve the pain point with a more ideal approach. > > > > > Actually, I really don't know how much benefit we have that in-memory > > swap overcoming to the real storage but if you want, zRAM with dm-cache > > is another option rather than invent new wheel by "just having is better". > > I'm not sure if this patch is related to the zswap vs. zram discussions. This > only adds the option of using writethrough to zswap. It's a first > step to possibly > making zswap work more efficiently using writeback and/or writethrough > depending on > the system and conditions. The patch size is small. Okay, I don't want to be a party-pooper, but I should at least state my thoughts to help Andrew judge.
> > > >> > >> Signed-off-by: Dan Streetman > >> > >> --- > >> > >> Based on specjbb testing on my laptop, the results for both writeback > >> and writethrough are better than not using zswap at all, but writeback > >> does seem to be better than writethrough while zswap isn't full. Once > >> it fills up, performance for writethrough is essentially close to not > >> using zswap, while writeback seems to be worse than not using zswap. > >> However, I think more testing on a wider span of systems and conditions > >> is needed. Additionally, I'm not sure that specjbb is measuring true > >> performance under fully loaded cpu conditions, so additional cpu load > >> might need to be added or specjbb parameters modified (I took the > >> values from the 4 "warehouses" test run). > >> > >> In any case though, I think having writethrough as an option is still > >> useful. More changes could be made, such as changing from writeback > >> to writethrough based on the zswap % full. And the patch doesn't > >> change default behavior - writethrough must be specifically
[PATCH REGRESSION FIX] x86 idle: restore mwait_idle()
From: Len Brown In Linux-3.9 we removed the mwait_idle() loop: 'x86 idle: remove mwait_idle() and "idle=mwait" cmdline param' (69fb3676df3329a7142803bb3502fa59dc0db2e3) The reasoning was that modern machines should be sufficiently happy during the boot process using the default_idle() HALT loop, until cpuidle loads and either acpi_idle or intel_idle invoke the newer MWAIT-with-hints idle loop. But two machines reported problems: 1. Certain Core2-era machines support MWAIT-C1 and HALT only. MWAIT-C1 is preferred for optimal power and performance. But if they support just C1, cpuidle never loads and so they use the boot-time default idle loop forever. 2. Some laptops will boot-hang if HALT is used, but will boot successfully if MWAIT is used. This appears to be a hidden assumption in BIOS SMI, that is presumably valid on the proprietary OS where the BIOS was validated. https://bugzilla.kernel.org/show_bug.cgi?id=60770 So here we effectively revert the patch above, restoring the mwait_idle() loop. However, we don't bother restoring the idle=mwait cmdline parameter, since it appears to add no value. Maintainer notes: For 3.9, simply revert 69fb3676df for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in context For 3.11, 3.12, 3.13, this patch applies cleanly Cc: Mike Galbraith Cc: Ian Malone Cc: Josh Boyer Cc: # 3.9, 3.10, 3.11, 3.12, 3.13 Signed-off-by: Len Brown --- arch/x86/kernel/process.c | 46 ++ 1 file changed, 46 insertions(+) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 3fb8d95..db471a8 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -398,6 +398,49 @@ static void amd_e400_idle(void) default_idle(); } +/* + * Intel Core2 and older machines prefer MWAIT over HALT for C1. + * We can't rely on cpuidle installing MWAIT, because it will not load + * on systems that support only C1 -- so the boot default must be MWAIT. + * + * Some AMD machines are the opposite, they depend on using HALT. 
+ * + * So for default C1, which is used during boot until cpuidle loads, + * use MWAIT-C1 on Intel HW that has it, else use HALT. + */ +static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c) +{ + if (c->x86_vendor != X86_VENDOR_INTEL) + return 0; + + if (!cpu_has(c, X86_FEATURE_MWAIT)) + return 0; + + return 1; +} + +/* + * MONITOR/MWAIT with no hints, used for default C1 state. + * This invokes MWAIT with interrupts enabled and no flags, + * which is backwards compatible with the original MWAIT implementation. + */ + +static void mwait_idle(void) +{ + if (!need_resched()) { + if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)) + clflush((void *)&current_thread_info()->flags); + + __monitor((void *)&current_thread_info()->flags, 0, 0); + smp_mb(); + if (!need_resched()) + __sti_mwait(0, 0); + else + local_irq_enable(); + } else + local_irq_enable(); +} + void select_idle_routine(const struct cpuinfo_x86 *c) { #ifdef CONFIG_SMP @@ -411,6 +454,9 @@ void select_idle_routine(const struct cpuinfo_x86 *c) /* E400: APIC timer interrupt does not wake up CPU from C1e */ pr_info("using AMD E400 aware idle routine\n"); x86_idle = amd_e400_idle; + } else if (prefer_mwait_c1_over_halt(c)) { + pr_info("using mwait in idle threads\n"); + x86_idle = mwait_idle; } else x86_idle = default_idle; } -- 1.8.5.2.309.ga25014b
Re: [PATCH v2] clk: sirf: re-arch to make the codes support both prima2 and atlas6
Quoting Barry Song (2014-01-05 21:38:19) > diff --git a/drivers/clk/sirf/clk-atlas6.c b/drivers/clk/sirf/clk-atlas6.c > new file mode 100644 > index 000..21e776a > --- /dev/null > +++ b/drivers/clk/sirf/clk-atlas6.c > @@ -0,0 +1,153 @@ > +/* > + * Clock tree for CSR SiRFatlasVI > + * > + * Copyright (c) 2011 Cambridge Silicon Radio Limited, a CSR plc group > company. > + * > + * Licensed under GPLv2 or later. > + */ > + > +#include > +#include > +#include > +#include > +#include Please do not use clk-private.h. It is slated for removal (some day...). Do you actually need it? Regards, Mike
Re: [RFC PATCH] sched: find the latest idle cpu
On 01/15/2014 12:07 PM, Alex Shi wrote: > Currently we just try to find least load cpu. If some cpus idled, > we just pick the first cpu in cpu mask. > > In fact we can get the interrupted idle cpu or the latest idled cpu, > then we may get the benefit from both latency and power. > The selected cpu maybe not the best, since other cpu may be interrupted > during our selecting. But be captious costs too much. So the idea here is that we want to choose the latest idle cpu if we have multiple idle cpus to choose from, correct? And I guess that was in order to avoid choosing a tickless cpu while there is a non-tickless idle one, is that right? What confuses me is, what about those cpus which are just about to recover from tickless, as you mentioned? That means the latest idle cpu isn't necessarily the best choice, and could even be the worst (if there are just two choices, and the longer-tickless one is just about to recover while the latest one is just going tickless). So what about just checking 'ts->tick_stopped' and recording one ticking idle cpu? The cost could be lower than a time comparison, and we could reduce the risk maybe... (well, not so risky, since the logic only works when the system is relaxed with several cpus idle) Regards, Michael Wang > > Signed-off-by: Alex Shi > --- > kernel/sched/fair.c | 20 > 1 file changed, 20 insertions(+) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index c7395d9..fb52d26 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -4167,6 +4167,26 @@ find_idlest_cpu(struct sched_group *group, struct > task_struct *p, int this_cpu) > min_load = load; > idlest = i; > } > +#ifdef CONFIG_NO_HZ_COMMON > + /* > + * Coarsely to get the latest idle cpu for shorter latency and > + * possible power benefit.
> + */ > + if (!min_load) { > + struct tick_sched *ts = &per_cpu(tick_cpu_sched, i); > + > + s64 latest_wake = 0; > + /* idle cpu doing irq */ > + if (ts->inidle && !ts->idle_active) > + idlest = i; > + /* the cpu resched */ > + else if (!ts->inidle) > + idlest = i; > + /* find latest idle cpu */ > + else if (ktime_to_us(ts->idle_entrytime) > latest_wake) > + idlest = i; > + } > +#endif > } > > return idlest; >
Re: linux-next: manual merge of the md tree with the block tree
On Wed, 15 Jan 2014 15:07:43 +1100 Stephen Rothwell wrote: > Hi Neil, > > Today's linux-next merge of the md tree got a conflict in > drivers/md/raid10.c between commit 4f024f3797c4 ("block: Abstract out > bvec iterator") from the block tree and commit b50c259e25d9 ("md/raid10: > fix two bugs in handling of known-bad-blocks") from the md tree. > > I fixed it up (see below) and can carry the fix as necessary (no action > is required). > Thanks Stephen. Those md fixes are now on their way to Linus (I'm hoping for 3.13 inclusion, I might be lucky). Good that they are easy to resolve! Thanks, NeilBrown signature.asc Description: PGP signature
[PATCH tip/core/timers 3/4] timers: Reduce future __run_timers() latency for newly emptied list
From: "Paul E. McKenney" The __run_timers() function currently steps through the list one jiffy at a time in order to update the timer wheel. However, if the timer wheel is empty, no adjustment is needed other than updating ->timer_jiffies. Therefore, if we just emptied the timer wheel, for example, by deleting the last timer, we should mark the timer wheel as being up to date. This marking will reduce (and perhaps eliminate) the jiffy-stepping that a future __run_timers() call will need to do in response to some future timer posting or migration. This commit therefore catches up ->timer_jiffies for this case. Signed-off-by: Paul E. McKenney --- kernel/timer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/timer.c b/kernel/timer.c index 295837e5e011..bdd1c00ec4ec 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -700,6 +700,7 @@ static int detach_if_pending(struct timer_list *timer, struct tvec_base *base, base->next_timer = base->timer_jiffies; } base->all_timers--; + (void)catchup_timer_jiffies(base); return 1; } -- 1.8.1.5
[PATCH tip/core/timers 2/4] timers: Reduce __run_timers() latency for empty list
From: "Paul E. McKenney" The __run_timers() function currently steps through the list one jiffy at a time in order to update the timer wheel. However, if the timer wheel is empty, no adjustment is needed other than updating ->timer_jiffies. In this case, which is likely to be common for NO_HZ_FULL kernels, the kernel currently incurs a large latency for no good reason. This commit therefore short-circuits this case. Signed-off-by: Paul E. McKenney --- kernel/timer.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/kernel/timer.c b/kernel/timer.c index 2245b7374c3d..295837e5e011 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -338,6 +338,17 @@ void set_timer_slack(struct timer_list *timer, int slack_hz) } EXPORT_SYMBOL_GPL(set_timer_slack); +static bool catchup_timer_jiffies(struct tvec_base *base) +{ +#ifdef CONFIG_NO_HZ_FULL + if (!base->all_timers) { + base->timer_jiffies = jiffies; + return 1; + } +#endif /* #ifdef CONFIG_NO_HZ_FULL */ + return 0; +} + static void __internal_add_timer(struct tvec_base *base, struct timer_list *timer) { @@ -1150,6 +1161,10 @@ static inline void __run_timers(struct tvec_base *base) struct timer_list *timer; spin_lock_irq(&base->lock); + if (catchup_timer_jiffies(base)) { + spin_unlock_irq(&base->lock); + return; + } while (time_after_eq(jiffies, base->timer_jiffies)) { struct list_head work_list; struct list_head *head = &work_list; -- 1.8.1.5
[PATCH tip/core/timers 1/4] timers: Track total number of timers in list
From: "Paul E. McKenney" Currently, the tvec_base structure's ->active_timers field tracks only the non-deferrable timers, which means that even if ->active_timers is zero, there might well be deferrable timers in the list. This commit therefore adds an ->all_timers field to track all the timers, whether deferrable or not. Signed-off-by: Paul E. McKenney --- kernel/timer.c | 5 + 1 file changed, 5 insertions(+) diff --git a/kernel/timer.c b/kernel/timer.c index 6582b82fa966..2245b7374c3d 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -81,6 +81,7 @@ struct tvec_base { unsigned long timer_jiffies; unsigned long next_timer; unsigned long active_timers; + unsigned long all_timers; struct tvec_root tv1; struct tvec tv2; struct tvec tv3; @@ -392,6 +393,7 @@ static void internal_add_timer(struct tvec_base *base, struct timer_list *timer) base->next_timer = timer->expires; base->active_timers++; } + base->all_timers++; } #ifdef CONFIG_TIMER_STATS @@ -671,6 +673,7 @@ detach_expired_timer(struct timer_list *timer, struct tvec_base *base) detach_timer(timer, true); if (!tbase_get_deferrable(timer->base)) base->active_timers--; + base->all_timers--; } static int detach_if_pending(struct timer_list *timer, struct tvec_base *base, @@ -685,6 +688,7 @@ static int detach_if_pending(struct timer_list *timer, struct tvec_base *base, if (timer->expires == base->next_timer) base->next_timer = base->timer_jiffies; } + base->all_timers--; return 1; } @@ -1560,6 +1564,7 @@ static int init_timers_cpu(int cpu) base->timer_jiffies = jiffies; base->next_timer = base->timer_jiffies; base->active_timers = 0; + base->all_timers = 0; return 0; } -- 1.8.1.5
[PATCH tip/core/timers 4/4] timers: Reduce future __run_timers() latency for first add to empty list
From: "Paul E. McKenney" The __run_timers() function currently steps through the list one jiffy at a time in order to update the timer wheel. However, if the timer wheel is empty, no adjustment is needed other than updating ->timer_jiffies. Therefore, just before we add a timer to an empty timer wheel, we should mark the timer wheel as being up to date. This marking will reduce (and perhaps eliminate) the jiffy-stepping that a future __run_timers() call will need to do in response to some future timer posting or migration. This commit therefore updates ->timer_jiffies for this case. Signed-off-by: Paul E. McKenney --- kernel/timer.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/timer.c b/kernel/timer.c index bdd1c00ec4ec..b49d2d0e879e 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -749,6 +749,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, base = lock_timer_base(timer, &flags); + (void)catchup_timer_jiffies(base); ret = detach_if_pending(timer, base, false); if (!ret && pending_only) goto out_unlock; -- 1.8.1.5
[GIT PULL REQUEST] last minute md fixes for 3.13
Sorry they are late; Christmas holidays and all that. Hopefully they can still squeak into 3.13. NeilBrown The following changes since commit 6d183de4077191d1201283a9035ce57a9b05254d: md/raid5: fix newly-broken locking in get_active_stripe. (2013-11-28 11:00:15 +1100) are available in the git repository at: git://neil.brown.name/md tags/md/3.13-fixes for you to fetch changes up to 8313b8e57f55b15e5b7f7fc5d1630bbf686a9a97: md: fix problem when adding device to read-only array with bitmap. (2014-01-14 16:44:08 +1100) md: half a dozen bug fixes for 3.13 All of these fix real bugs that people have hit, and are tagged for -stable. NeilBrown (6): md/raid5: Fix possible confusion when multiple write errors occur. md/raid10: fix two bugs in handling of known-bad-blocks. md/raid1: fix request counting bug in new 'barrier' code. md/raid5: fix a recently broken BUG_ON(). md/raid10: fix bug when raid10 recovery fails to recover a block. md: fix problem when adding device to read-only array with bitmap. drivers/md/md.c | 18 +++--- drivers/md/md.h | 3 +++ drivers/md/raid1.c | 3 +-- drivers/md/raid10.c | 12 ++-- drivers/md/raid5.c | 7 --- 5 files changed, 29 insertions(+), 14 deletions(-) signature.asc Description: PGP signature
[PATCH v2 tip/core/timers] Crude timer-wheel latency hacks
Hello! The following four patches provide some crude timer-wheel latency hacks. I understand that a more comprehensive solution is in progress, but in the meantime, these patches work well in cases where a given CPU has either zero or one timers pending, which is a common case for NO_HZ_FULL kernels. So, on the off-chance that this is helpful to someone, the individual patches are as follows: 1. Add ->all_timers field to tvec_base to count all timers, not just the non-deferrable ones. 2. Avoid jiffy-at-a-time stepping when the timer wheel is empty. 3. Avoid jiffy-at-a-time stepping when the timer wheel transitions to empty. 4. Avoid jiffy-at-a-time stepping after a timer is added to an initially empty timer wheel. Differences from v1: o Fix an embarrassing bug located by Oleg Nesterov where the timer wheel could be judged to be empty even if it contained deferrable timers. Thanx, Paul b/kernel/timer.c | 22 ++ 1 file changed, 22 insertions(+)
[git pull] drm last fixes
Hi Linus, One nouveau regression fix on older cards, i915 black screen fixes, and a revert for a strange G33 intel problem. Dave. The following changes since commit 7e22e91102c6b9df7c4ae2168910e19d2bb14cd6: Linux 3.13-rc8 (2014-01-12 17:04:18 +0700) are available in the git repository at: git://people.freedesktop.org/~airlied/linux drm-fixes for you to fetch changes up to 703a8c2dfa5aa69b9b0a6684dc78ea28a2c7fe3e: Merge branch 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes (2014-01-15 15:01:11 +1000) Ben Skeggs (1): drm/nouveau: fix null ptr dereferences on some boards Dave Airlie (3): Merge tag 'drm-intel-fixes-2014-01-13' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Revert "drm: copy mode type in drm_mode_connector_list_update()" Merge branch 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes Jesse Barnes (1): drm/i915/bdw: make sure south port interrupts are enabled properly v2 Paulo Zanoni (1): drm/i915: fix DDI PLLs HW state readout code Ville Syrjälä (1): drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init() drivers/gpu/drm/drm_modes.c | 2 +- drivers/gpu/drm/i915/i915_irq.c | 2 ++ drivers/gpu/drm/i915/intel_ddi.c | 8 +++- drivers/gpu/drm/i915/intel_display.c | 4 ++-- drivers/gpu/drm/nouveau/core/include/subdev/i2c.h | 2 +- drivers/gpu/drm/nouveau/core/include/subdev/instmem.h | 7 +++ drivers/gpu/drm/nouveau/core/subdev/i2c/base.c| 4 ++-- drivers/gpu/drm/nouveau/core/subdev/therm/ic.c| 10 +- drivers/gpu/drm/nouveau/dispnv04/dfp.c| 2 +- drivers/gpu/drm/nouveau/dispnv04/tvnv04.c | 2 +- 10 files changed, 29 insertions(+), 14 deletions(-)
Re: [PATCH] mm/zswap: Check all pool pages instead of one pool pages
On Tue, Jan 14, 2014 at 02:15:44PM +0800, Weijie Yang wrote: > On Tue, Jan 14, 2014 at 1:42 PM, Bob Liu wrote: > > > > On 01/14/2014 01:05 PM, Minchan Kim wrote: > >> On Tue, Jan 14, 2014 at 01:50:22PM +0900, Minchan Kim wrote: > >>> Hello Bob, > >>> > >>> On Tue, Jan 14, 2014 at 09:19:23AM +0800, Bob Liu wrote: > > On 01/14/2014 07:35 AM, Minchan Kim wrote: > > Hello, > > > > On Sat, Jan 11, 2014 at 03:43:07PM +0800, Cai Liu wrote: > >> zswap can support multiple swapfiles. So we need to check > >> all zbud pool pages in zswap. > > > > True but this patch is rather costly in that we should iterate > > zswap_tree[MAX_SWAPFILES] to check it. SIGH. > > > > How about defining zswap_trees as a linked list instead of a static > > array? Then, we could reduce unnecessary iteration too much. > > > > But if we use a linked list, it might not be easy to access the tree like this: > struct zswap_tree *tree = zswap_trees[type]; > >>> > >>> struct zswap_tree { > >>> .. > >>> .. > >>> struct list_head list; > >>> } > >>> > >>> zswap_frontswap_init() > >>> { > >>> .. > >>> .. > >>> zswap_trees[type] = tree; > >>> list_add(&tree->list, &zswap_list); > >>> } > >>> > >>> get_zswap_pool_pages(void) > >>> { > >>> struct zswap_tree *cur; > >>> list_for_each_entry(cur, &zswap_list, list) { > >>> pool_pages += zbud_get_pool_size(cur->pool); > >>> } > >>> return pool_pages; > >>> } > > > > Okay, I see your point. Yes, it's much better. > > Cai, please make a new patch. > > This improved patch could reduce unnecessary iteration too much. > > But I still have a question: why do we need so many zbud pools? > How about using only one global zbud pool for all zswap_trees? > I have not tested it, but I think it can improve the store density. Just a quick glance: I don't know how popular multiple-swap configurations are. With your approach, what kinds of changes do we need in frontswap_invalidate_area? Will you add an encoded *type* in the offset of the entry? So we would always have to decode it whenever we need a search operation?
We lose speed but gain density (though I'm not sure, since it depends on the workload) for a rare configuration (i.e., multiple swap) and a rare event (i.e., swapoff). It's just a question that popped up, not a strong objection. Anyway, the point is that you can try it if you want and then report the numbers. :) Thanks. > > Just for your reference, Thanks! > -- Kind regards, Minchan Kim
[PATCH v5] Staging: comedi: convert while loop to timeout in ni_mio_common.c
This patch for ni_mio_common.c changes out a while loop for a timeout, which is preferred. Signed-off-by: Chase Southwood --- All right, I think this guy's ready to go now! Thanks for all the help! Chase 2: Changed from simple clean-up to swapping a timeout in for a while loop. 3: Removed extra counter variable, and added error checking. 4: No longer using counter variable, using jiffies instead. 5: udelay for 10u, instead of 1u. drivers/staging/comedi/drivers/ni_mio_common.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/staging/comedi/drivers/ni_mio_common.c b/drivers/staging/comedi/drivers/ni_mio_common.c index 457b884..07e9777 100644 --- a/drivers/staging/comedi/drivers/ni_mio_common.c +++ b/drivers/staging/comedi/drivers/ni_mio_common.c @@ -687,12 +687,21 @@ static void ni_clear_ai_fifo(struct comedi_device *dev) { const struct ni_board_struct *board = comedi_board(dev); struct ni_private *devpriv = dev->private; + unsigned long timeout; if (board->reg_type == ni_reg_6143) { /* Flush the 6143 data FIFO */ ni_writel(0x10, AIFIFO_Control_6143); /* Flush fifo */ ni_writel(0x00, AIFIFO_Control_6143); /* Flush fifo */ - while (ni_readl(AIFIFO_Status_6143) & 0x10) ; /* Wait for complete */ + /* Wait for complete */ + timeout = jiffies + msecs_to_jiffies(500); + while (ni_readl(AIFIFO_Status_6143) & 0x10) { + if (time_after(jiffies, timeout)) { + comedi_error(dev, "FIFO flush timeout."); + break; + } + udelay(10); + } } else { devpriv->stc_writew(dev, 1, ADC_FIFO_Clear); if (board->reg_type == ni_reg_625x) { -- 1.8.4.2
Re: [PATCH v7 00/15] PCI: Allocate 64-bit BARs above 4G when possible
> > I added Daniel's Reviewed-by to the AGP patches (except the trivial > PCI_COMMAND change in ati_configure()). > > I added the incremental patch below to fix these warnings found by > Fengguang's autobuilder in the original b1e0e392f5dd commit: > > drivers/char/agp/amd-k7-agp.c:115:38: warning: 'addr' may be used > uninitialized in this function [-Wmaybe-uninitialized] > drivers/pci/bus.c:105:5: warning: large integer implicitly truncated to > unsigned type [-Woverflow] > > Finally, I merged the pci/resource branch with these changes into my "next" > branch, so it should appear in v3.14-rc1. > > Dave, let me know if you have any issue with these AGP changes going > through my tree. None, all fine by me. Acked-by: Dave Airlie Dave. > > Bjorn > > > diff --git a/drivers/char/agp/amd-k7-agp.c b/drivers/char/agp/amd-k7-agp.c > index e8c2e9167e89..3661a51e93e2 100644 > --- a/drivers/char/agp/amd-k7-agp.c > +++ b/drivers/char/agp/amd-k7-agp.c > @@ -148,8 +148,8 @@ static int amd_create_gatt_table(struct agp_bridge_data > *bridge) >* used to program the agp master not the cpu >*/ > > - agp_bridge->gart_bus_addr = pci_bus_address(agp_bridge->dev, > - AGP_APERTURE_BAR); > + addr = pci_bus_address(agp_bridge->dev, AGP_APERTURE_BAR); > + agp_bridge->gart_bus_addr = addr; > > /* Calculate the agp offset */ > for (i = 0; i < value->num_entries / 1024; i++, addr += 0x0040) { > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c > index 107ad9a5b8aa..86fb8ec5e448 100644 > --- a/drivers/pci/bus.c > +++ b/drivers/pci/bus.c > @@ -99,10 +99,12 @@ void pci_bus_remove_resources(struct pci_bus *bus) > } > > static struct pci_bus_region pci_32_bit = {0, 0xULL}; > +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT > static struct pci_bus_region pci_64_bit = {0, > - (resource_size_t) 0xULL}; > -static struct pci_bus_region pci_high = {(resource_size_t) 0x1ULL, > - (resource_size_t) 0xULL}; > + (dma_addr_t) 0xULL}; > +static struct pci_bus_region pci_high = {(dma_addr_t) 0x1ULL, > + (dma_addr_t) 0xULL}; 
> +#endif > > /* > * @res contains CPU addresses. Clip it so the corresponding bus addresses > @@ -207,6 +209,7 @@ int pci_bus_alloc_resource(struct pci_bus *bus, struct > resource *res, > resource_size_t), > void *alignf_data) > { > +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT > int rc; > > if (res->flags & IORESOURCE_MEM_64) { > @@ -220,6 +223,7 @@ int pci_bus_alloc_resource(struct pci_bus *bus, struct > resource *res, > type_mask, alignf, alignf_data, > &pci_64_bit); > } > +#endif > > return pci_bus_alloc_from_region(bus, res, size, align, min, > type_mask, alignf, alignf_data, >
[PATCH] Documentation / CPU hotplug: Fix the typo in example code
As the notifier_block name (i.e. foobar_cpu_notifer) is different from the parameter (i.e. foobar_cpu_notifier) of the register function, this is definitely an error and it also confuses readers. Signed-off-by: Sangjung Woo --- Documentation/cpu-hotplug.txt |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt index 8cb9938..be675d2 100644 --- a/Documentation/cpu-hotplug.txt +++ b/Documentation/cpu-hotplug.txt @@ -285,7 +285,7 @@ A: This is what you would need in your kernel code to receive notifications. return NOTIFY_OK; } - static struct notifier_block foobar_cpu_notifer = + static struct notifier_block foobar_cpu_notifier = { .notifier_call = foobar_cpu_callback, }; -- 1.7.9.5
linux-next: unsigned commits in the mips tree
Hi Ralf, I noticed that there is a whole series of commits in today's mips tree that were committed by John Crispin but have no Signed-off-by from him ... -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH v3 5/5] slab: make more slab management structure off the slab
On Mon, 2 Dec 2013, Joonsoo Kim wrote: > Now, the size of the freelist for slab management has diminished, > so the on-slab management structure can waste a lot of space > if the objects of the slab are large. > > Consider a 128 byte sized slab. If on-slab is used, 31 objects can be > in the slab. The size of the freelist for this case would be 31 bytes > so that 97 bytes, that is, more than 75% of object size, are wasted. > > In a 64 byte sized slab case, no space is wasted if we use on-slab. > So set the off-slab determining constraint to 128 bytes. > > Acked-by: Christoph Lameter > Signed-off-by: Joonsoo Kim Acked-by: David Rientjes
[PATCH/RFC] perf ui/tui: Show column header in hist browser
Add a line for showing column headers like --stdio. Signed-off-by: Namhyung Kim --- tools/perf/ui/browser.c | 4 +-- tools/perf/ui/browsers/hists.c | 55 ++ 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c index d11541d4d7d7..441470ba8d2c 100644 --- a/tools/perf/ui/browser.c +++ b/tools/perf/ui/browser.c @@ -278,8 +278,8 @@ void ui_browser__hide(struct ui_browser *browser) static void ui_browser__scrollbar_set(struct ui_browser *browser) { int height = browser->height, h = 0, pct = 0, - col = browser->width, - row = browser->y - 1; + col = browser->width, row = 0; + if (browser->nr_entries > 1) { pct = ((browser->index * (browser->height - 1)) / diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index e0ab399d431d..5b94de48f23c 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -25,6 +25,7 @@ struct hist_browser { struct map_symbol *selection; int print_seq; bool show_dso; + bool show_column_header; float min_pcnt; u64 nr_pcnt_entries; }; @@ -312,6 +313,57 @@ static void ui_browser__warn_lost_events(struct ui_browser *browser) static void hist_browser__update_pcnt_entries(struct hist_browser *hb); +static void hist_browser__column_header(struct hist_browser *browser) +{ + struct perf_hpp_fmt *fmt; + struct sort_entry *se; + struct hists *hists = browser->hists; + bool show = browser->show_column_header; + unsigned int width; + + pthread_mutex_lock(&ui__lock); + + if (!show) + goto out; + + ui_browser__gotorc(&browser->b, 0, 0); + + perf_hpp__for_each_format(fmt) { + char buf[128]; + struct perf_hpp hpp = { + .buf = buf, + .size = sizeof(buf), + .ptr = hists_to_evsel(hists), + }; + + width = fmt->width(fmt, &hpp); + fmt->header(fmt, &hpp); + + slsmg_printf(" %-*s", width, hpp.buf); + } + + list_for_each_entry(se, &hist_entry__sort_list, list) { + if (se->elide) + continue; + + width = strlen(se->se_header); + if (!hists__new_col_len(hists, se->se_width_idx, width)) + width = hists__col_len(hists, se->se_width_idx); + + slsmg_printf(" %-*s", width, se->se_header); + } + + /* +* We just called ui_browser__refresh_dimensions(). +* Update it to honor a new column header line. +*/ + browser->b.y++; + browser->b.height--; + +out: + pthread_mutex_unlock(&ui__lock); +} + static int hist_browser__run(struct hist_browser *browser, const char *ev_name, struct hist_browser_timer *hbt) { @@ -331,6 +383,8 @@ static int hist_browser__run(struct hist_browser *browser, const char *ev_name, "Press '?' for help on key bindings") < 0) return -1; + hist_browser__column_header(browser); + while (1) { key = ui_browser__run(&browser->b, delay_secs); @@ -1198,6 +1252,7 @@ static struct hist_browser *hist_browser__new(struct hists *hists) browser->b.refresh = hist_browser__refresh; browser->b.seek = ui_browser__hists_seek; browser->b.use_navkeypressed = true; + browser->show_column_header = true; } return browser; -- 1.7.11.7
Re: [PATCH v3 4/5] slab: introduce byte sized index for the freelist of a slab
On Mon, 2 Dec 2013, Joonsoo Kim wrote: > Currently, the freelist of a slab consists of unsigned int sized indexes. > Since most slabs have fewer than 256 objects, such large > indexes are needless. For example, consider the minimum kmalloc slab. Its > object size is 32 bytes and it would consist of one page, so 256 indexes > through a byte sized index are enough to cover all possible indexes. > > There can be some slabs whose object size is 8 bytes. We cannot handle > this case with a byte sized index, so we need to restrict the minimum > object size. Since these slabs are not major, wasted memory from these > slabs would be negligible. > > Some architectures' page size isn't 4096 bytes but rather larger than > 4096 bytes (one example is the 64KB page size on PPC or IA64) so that > a byte sized index doesn't fit them. In this case, we will use > a two bytes sized index. > > Below are some numbers for this patch. > > * Before * > kmalloc-512 52564051281 : tunables 54 270 : > slabdata 80 80 0 > kmalloc-256 210210256 151 : tunables 120 600 : > slabdata 14 14 0 > kmalloc-192 1016 1040192 201 : tunables 120 600 : > slabdata 52 52 0 > kmalloc-96 560620128 311 : tunables 120 600 : > slabdata 20 20 0 > kmalloc-64 2148 2280 64 601 : tunables 120 600 : > slabdata 38 38 0 > kmalloc-128 647682128 311 : tunables 120 600 : > slabdata 22 22 0 > kmalloc-32 11360 11413 32 1131 : tunables 120 600 : > slabdata101101 0 > kmem_cache 197200192 201 : tunables 120 600 : > slabdata 10 10 0 > > * After * > kmalloc-512 52164851281 : tunables 54 270 : > slabdata 81 81 0 > kmalloc-256 208208256 161 : tunables 120 600 : > slabdata 13 13 0 > kmalloc-192 1029 1029192 211 : tunables 120 600 : > slabdata 49 49 0 > kmalloc-96 529589128 311 : tunables 120 600 : > slabdata 19 19 0 > kmalloc-64 2142 2142 64 631 : tunables 120 600 : > slabdata 34 34 0 > kmalloc-128 660682128 311 : tunables 120 600 : > slabdata 22 22 0 > kmalloc-32 11716 11780 32 1241 : tunables 120 600 : > slabdata 95 95 0 > kmem_cache
197210192 211 : tunables 120 600 : > slabdata 10 10 0 > > kmem_caches consisting of objects less than or equal to 256 bytes have > one or more extra objects per slab compared to before. In the case of kmalloc-32, we have 11 more > objects, so 352 bytes (11 * 32) are saved and this is roughly a 9% saving of > memory. Of course, this percentage decreases as the number of objects > in a slab decreases. > > Here are the performance results on my 4-cpu machine. > > * Before * > > Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 > runs): > >229,945,138 cache-misses >( +- 0.23% ) > > 11.627897174 seconds time elapsed >( +- 0.14% ) > > * After * > > Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 > runs): > >218,640,472 cache-misses >( +- 0.42% ) > > 11.504999837 seconds time elapsed >( +- 0.21% ) > > cache-misses are reduced by this patchset, roughly 5%. > And elapsed times are improved by 1%. > > Signed-off-by: Joonsoo Kim Acked-by: David Rientjes
Re: [RFC PATCH] sched: find the latest idle cpu
On 01/15/2014 12:53 PM, Alex Shi wrote: >>> >> I guess we missed some code for latest_wake here? >> > >> > Yes, thanks for the reminder! >> > >> > so updated patch: >> > > oops, still incorrect. re-updated: update to wrong file. re-re-update. :( === From b75e43bb77df14e2209532c1e5c48e0e03afa414 Mon Sep 17 00:00:00 2001 From: Alex Shi Date: Tue, 14 Jan 2014 23:07:42 +0800 Subject: [PATCH] sched: find the latest idle cpu Currently we just try to find the least-loaded cpu. If some cpus idled, we just pick the first cpu in the cpu mask. In fact we can get the interrupted idle cpu or the latest idled cpu, then we may get the benefit from both latency and power. The selected cpu may not be the best, since another cpu may be interrupted during our selecting. But being too cautious costs too much. Signed-off-by: Alex Shi --- kernel/sched/fair.c | 28 1 file changed, 28 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c7395d9..f82ca3d 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4159,6 +4159,10 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu) int idlest = -1; int i; +#ifdef CONFIG_NO_HZ_COMMON + s64 latest_wake = 0; +#endif + /* Traverse only the allowed CPUs */ for_each_cpu_and(i, sched_group_cpus(group), tsk_cpus_allowed(p)) { load = weighted_cpuload(i); @@ -4167,6 +4171,30 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu) min_load = load; idlest = i; } +#ifdef CONFIG_NO_HZ_COMMON + /* +* Coarsely to get the latest idle cpu for shorter latency and +* possible power benefit.
+*/ + if (!load) { + struct tick_sched *ts = &per_cpu(tick_cpu_sched, i); + + /* idle cpu doing irq */ + if (ts->inidle && !ts->idle_active) + idlest = i; + /* the cpu resched */ + else if (!ts->inidle) + idlest = i; + /* find latest idle cpu */ + else { + s64 temp = ktime_to_us(ts->idle_entrytime); + if (temp > latest_wake) { + latest_wake = temp; + idlest = i; + } + } + } +#endif } return idlest; -- 1.8.1.2 -- Thanks Alex
linux-next: manual merge of the tip tree with the mips tree
Hi all, Today's linux-next merge of the tip tree got a conflict in arch/mips/netlogic/xlp/setup.c between commit 8e4857962d97 ("MIPS: Netlogic: Core wakeup improvements") from the mips tree and commit 7972e966b032 ("MIPS: Remove panic_timeout settings") from the tip tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au diff --cc arch/mips/netlogic/xlp/setup.c index c3af2d8772cf,54e75c77184b.. --- a/arch/mips/netlogic/xlp/setup.c +++ b/arch/mips/netlogic/xlp/setup.c @@@ -96,15 -92,6 +96,14 @@@ static void __init xlp_init_mem_from_ba void __init plat_mem_setup(void) { +#ifdef CONFIG_SMP + nlm_wakeup_secondary_cpus(); + + /* update TLB size after waking up threads */ + current_cpu_data.tlbsize = ((read_c0_config6() >> 16) & 0x) + 1; + + register_smp_ops(&nlm_smp_ops); +#endif - panic_timeout = 5; _machine_restart = (void (*)(char *))nlm_linux_exit; _machine_halt = nlm_linux_exit; pm_power_off = nlm_linux_exit;
Re: [PATCH v3 3/5] slab: restrict the number of objects in a slab
On Mon, 2 Dec 2013, Joonsoo Kim wrote: > To prepare to implement a byte sized index for managing the freelist > of a slab, we should restrict the number of objects in a slab to be less > than or equal to 256, since a byte can only represent 256 different values. > Setting the size of an object to a value equal to or greater than the newly introduced > SLAB_OBJ_MIN_SIZE ensures that the number of objects in a slab is less than or > equal to 256 for a slab with 1 page. > > If the page size is rather larger than 4096, the above assumption would be wrong. > In this case, we would fall back on a 2 bytes sized index. > > If the minimum size of kmalloc is less than 16, we use it as the minimum object > size and give up this optimization. > > Signed-off-by: Joonsoo Kim Acked-by: David Rientjes
Re: [PATCH 01/17] tools lib traceevent: Add state member to struct trace_seq
On Tue, 14 Jan 2014 21:56:55 -0500, Steven Rostedt wrote: > On Wed, 15 Jan 2014 11:49:28 +0900 > Namhyung Kim wrote: >> Oh, it looks better. But I'd like to keep TRACE_SEQ_CHECK() as is for some >> cases. How about this? > > Looks good to me. > > Acked-by: Steven Rostedt > > I'll try to look at the rest of the patches tomorrow. Thanks, and I found that trace_seq_destroy() should use TRACE_SEQ_CHECK_RET instead. Here's an updated patch. Thanks, Namhyung From 539a5dd1cb3ba4cb03c283115a9a84f8e345514e Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Thu, 19 Dec 2013 18:17:44 +0900 Subject: [PATCH] tools lib traceevent: Add state member to struct trace_seq The trace_seq->state is for tracking errors during the use of trace_seq APIs and getting rid of die() in it. Acked-by: Steven Rostedt Signed-off-by: Namhyung Kim --- tools/lib/traceevent/Makefile | 2 +- tools/lib/traceevent/event-parse.h | 7 + tools/lib/traceevent/trace-seq.c | 55 +- 3 files changed, 50 insertions(+), 14 deletions(-) diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile index f778d48ac609..56d52a33a3df 100644 --- a/tools/lib/traceevent/Makefile +++ b/tools/lib/traceevent/Makefile @@ -136,7 +136,7 @@ export Q VERBOSE EVENT_PARSE_VERSION = $(EP_VERSION).$(EP_PATCHLEVEL).$(EP_EXTRAVERSION) -INCLUDES = -I. $(CONFIG_INCLUDES) +INCLUDES = -I. -I $(srctree)/../../include $(CONFIG_INCLUDES) # Set compile option CFLAGS if not set elsewhere CFLAGS ?= -g -Wall diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index cf5db9013f2c..3c890cb28db7 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -58,6 +58,12 @@ struct pevent_record { #endif }; +enum trace_seq_fail { + TRACE_SEQ__GOOD, + TRACE_SEQ__BUFFER_POISONED, + TRACE_SEQ__MEM_ALLOC_FAILED, +}; + /* * Trace sequences are used to allow a function to call several other functions * to create a string of data to use (up to a max of PAGE_SIZE).
@@ -68,6 +74,7 @@ struct trace_seq { unsigned intbuffer_size; unsigned intlen; unsigned intreadpos; + enum trace_seq_fail state; }; void trace_seq_init(struct trace_seq *s); diff --git a/tools/lib/traceevent/trace-seq.c b/tools/lib/traceevent/trace-seq.c index d7f2e68bc5b9..f7112138e6af 100644 --- a/tools/lib/traceevent/trace-seq.c +++ b/tools/lib/traceevent/trace-seq.c @@ -22,6 +22,7 @@ #include #include +#include #include "event-parse.h" #include "event-utils.h" @@ -32,10 +33,21 @@ #define TRACE_SEQ_POISON ((void *)0xdeadbeef) #define TRACE_SEQ_CHECK(s) \ do { \ - if ((s)->buffer == TRACE_SEQ_POISON)\ - die("Usage of trace_seq after it was destroyed"); \ + if (WARN_ONCE((s)->buffer == TRACE_SEQ_POISON, \ + "Usage of trace_seq after it was destroyed")) \ + (s)->state = TRACE_SEQ__BUFFER_POISONED;\ } while (0) +#define TRACE_SEQ_CHECK_RET_N(s, n)\ +do { \ + TRACE_SEQ_CHECK(s); \ + if ((s)->state != TRACE_SEQ__GOOD) \ + return n; \ +} while (0) + +#define TRACE_SEQ_CHECK_RET(s) TRACE_SEQ_CHECK_RET_N(s, ) +#define TRACE_SEQ_CHECK_RET0(s) TRACE_SEQ_CHECK_RET_N(s, 0) + /** * trace_seq_init - initialize the trace_seq structure * @s: a pointer to the trace_seq structure to initialize @@ -46,6 +58,7 @@ void trace_seq_init(struct trace_seq *s) s->readpos = 0; s->buffer_size = TRACE_SEQ_BUF_SIZE; s->buffer = malloc_or_die(s->buffer_size); + s->state = TRACE_SEQ__GOOD; } /** @@ -71,7 +84,7 @@ void trace_seq_destroy(struct trace_seq *s) { if (!s) return; - TRACE_SEQ_CHECK(s); + TRACE_SEQ_CHECK_RET(s); free(s->buffer); s->buffer = TRACE_SEQ_POISON; } @@ -80,8 +93,9 @@ static void expand_buffer(struct trace_seq *s) { s->buffer_size += TRACE_SEQ_BUF_SIZE; s->buffer = realloc(s->buffer, s->buffer_size); - if (!s->buffer) - die("Can't allocate trace_seq buffer memory"); + if (WARN_ONCE(!s->buffer, + "Can't allocate trace_seq buffer memory")) + s->state = TRACE_SEQ__MEM_ALLOC_FAILED; } /** @@ -105,9 +119,9 @@ trace_seq_printf(struct trace_seq *s, const char *fmt, ...) 
int len; int ret; - TRACE_SEQ_CHECK(s); - try_again: + TRACE_SEQ_CHECK_RET0(s); + len = (s->buffer_size - 1) - s->len; va_start(ap, fmt); @@ -141,9 +155,9 @@ trace_seq_vprintf(struct trace_seq *s, const char *fmt, va_list args) int len;
[PATCH 16/16] hold bus_mutex in netlink and search
The bus_mutex needs to be taken to serialize access to a specific bus. netlink wasn't updated when bus_mutex was added and was calling without that lock held, and not all of the masters were holding the bus_mutex in a search. This was causing the ds2490 hardware to stop responding when both netlink and /sys slaves were executing bus commands at the same time. Signed-off-by: David Fries --- This fixes existing bugs, tacking it to the end of the previous patch series. drivers/w1/masters/ds1wm.c |4 +++- drivers/w1/masters/ds2490.c |8 ++-- drivers/w1/w1_netlink.c | 13 +++-- 3 files changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/w1/masters/ds1wm.c b/drivers/w1/masters/ds1wm.c index 02df3b1..b077b8b 100644 --- a/drivers/w1/masters/ds1wm.c +++ b/drivers/w1/masters/ds1wm.c @@ -326,13 +326,14 @@ static void ds1wm_search(void *data, struct w1_master *master_dev, unsigned slaves_found = 0; unsigned int pass = 0; + mutex_lock(&master_dev->bus_mutex); dev_dbg(&ds1wm_data->pdev->dev, "search begin\n"); while (true) { ++pass; if (pass > 100) { dev_dbg(&ds1wm_data->pdev->dev, "too many attempts (100), search aborted\n"); - return; + break; } mutex_lock(&master_dev->bus_mutex); @@ -439,6 +440,7 @@ static void ds1wm_search(void *data, struct w1_master *master_dev, dev_dbg(&ds1wm_data->pdev->dev, "pass: %d total: %d search done ms d bit pos: %d\n", pass, slaves_found, ms_discrep_bit); + mutex_unlock(&master_dev->bus_mutex); } /* - */ diff --git a/drivers/w1/masters/ds2490.c b/drivers/w1/masters/ds2490.c index db0bf32..7404ad30 100644 --- a/drivers/w1/masters/ds2490.c +++ b/drivers/w1/masters/ds2490.c @@ -727,9 +727,11 @@ static void ds9490r_search(void *data, struct w1_master *master, */ u64 buf[2*64/8]; + mutex_lock(&master->bus_mutex); + /* address to start searching at */ if (ds_send_data(dev, (u8 *)&master->search_id, 8) < 0) - return; + goto search_out; master->search_id = 0; value = COMM_SEARCH_ACCESS | COMM_IM | COMM_RST | COMM_SM | COMM_F | @@ -739,7 +741,7 @@ static void ds9490r_search(void *data, struct w1_master *master, search_limit = 0; index = search_type | (search_limit << 8); if (ds_send_control(dev, value, index) < 0) - return; + goto search_out; do { schedule_timeout(jtime); @@ -791,6 +793,8 @@ static void ds9490r_search(void *data, struct w1_master *master, master->max_slave_count); set_bit(W1_WARN_MAX_COUNT, &master->flags); } +search_out: + mutex_unlock(&master->bus_mutex); } #if 0 diff --git a/drivers/w1/w1_netlink.c b/drivers/w1/w1_netlink.c index a5dc219..5234964 100644 --- a/drivers/w1/w1_netlink.c +++ b/drivers/w1/w1_netlink.c @@ -246,11 +246,16 @@ static int w1_process_command_master(struct w1_master *dev, { int err = -EINVAL; + /* drop bus_mutex for search (does its own locking), and add/remove +* which doesn't use the bus +*/ switch (req_cmd->cmd) { case W1_CMD_SEARCH: case W1_CMD_ALARM_SEARCH: case W1_CMD_LIST_SLAVES: + mutex_unlock(&dev->bus_mutex); err = w1_get_slaves(dev, req_msg, req_hdr, req_cmd); + mutex_lock(&dev->bus_mutex); break; case W1_CMD_READ: case W1_CMD_WRITE: @@ -262,8 +267,12 @@ static int w1_process_command_master(struct w1_master *dev, break; case W1_CMD_SLAVE_ADD: case W1_CMD_SLAVE_REMOVE: + mutex_unlock(&dev->bus_mutex); + mutex_lock(&dev->mutex); err = w1_process_command_addremove(dev, req_msg, req_hdr, req_cmd); + mutex_unlock(&dev->mutex); + mutex_lock(&dev->bus_mutex); break; default: err = -EINVAL; @@ -400,7 +409,7 @@ static void w1_process_cb(struct w1_master *dev, struct w1_async_cmd *async_cmd) struct w1_slave *sl = node->sl; struct w1_netlink_cmd *cmd = NULL; - mutex_lock(&dev->mutex); + mutex_lock(&dev->bus_mutex); dev->portid = node->block->portid; if (sl && w1_reset_select_slave(sl)) err = -ENODEV; @@ -437,7 +446,7 @@ static void w1_process_cb(struct w1_master *dev, struct w1_async_cmd *async_cmd) else atomic_dec(>refcnt); dev->portid = 0; - mutex_unlock(&dev->mutex); + mutex_unlock(&dev->bus_mutex); mutex_lock(&dev->list_mutex); list_del(&async_cmd->async_entry); -- 1.7.10.4
[git pull] Please pull powerpc.git merge branch
Hi Linus! So you make the call on whether to take this one now or wait for the merge window. It's a bug fix for a crash in mremap that occurs on powerpc with THP enabled. The fix however requires a small change in the generic code. It moves a condition into a helper we can override from the arch, which is harmless, but it *also* slightly changes the order of the set_pmd and the withdraw & deposit, which should be fine according to Kirill (who wrote that code) but I agree -rc8 is a bit late... It was acked by Kirill and Andrew told me to just merge it via powerpc. My original intent was to put it in powerpc-next and then shoot it to stable, but it got a tad annoying (due to churn it needs to be applied at least on rc4 or later while my next is at rc1 and clean that way), so I put it in the merge branch. From there, you tell me if you want to take it now; if not, I'll send you that branch along with my normal next one after you open the merge window. Cheers, Ben. The following changes since commit a6da83f98267bc8ee4e34aa899169991eb0ceb93: Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc (2014-01-13 10:59:05 +0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge for you to fetch changes up to b3084f4db3aeb991c507ca774337c7e7893ed04f: powerpc/thp: Fix crash on mremap (2014-01-15 15:46:38 +1100) Aneesh Kumar K.V (1): powerpc/thp: Fix crash on mremap arch/powerpc/include/asm/pgtable-ppc64.h | 14 ++ include/asm-generic/pgtable.h| 12 mm/huge_memory.c | 14 +- 3 files changed, 31 insertions(+), 9 deletions(-)
linux-next: manual merge of the tip tree with the mips tree
Hi all, Today's linux-next merge of the tip tree got a conflict in arch/mips/Kconfig between commit 597ce1723e0f ("MIPS: Support for 64-bit FP with O32 binaries") from the mips tree and commit 19952a92037e ("stackprotector: Unify the HAVE_CC_STACKPROTECTOR logic between architectures") from the tip tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwells...@canb.auug.org.au diff --cc arch/mips/Kconfig index 87dc0c3fe05f,c93d92beb3d6.. --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@@ -2354,36 -2323,6 +2355,23 @@@ config SECCOM If unsure, say Y. Only embedded should say N here. - config CC_STACKPROTECTOR - bool "Enable -fstack-protector buffer overflow detection (EXPERIMENTAL)" - help - This option turns on the -fstack-protector GCC feature. This - feature puts, at the beginning of functions, a canary value on - the stack just before the return address, and validates - the value just before actually returning. Stack based buffer - overflows (that need to overwrite this return address) now also - overwrite the canary, which gets detected and the attack is then - neutralized via a kernel panic. - - This feature requires gcc version 4.2 or above. - +config MIPS_O32_FP64_SUPPORT + bool "Support for O32 binaries using 64-bit FP" + depends on 32BIT || MIPS32_O32 + default y + help +When this is enabled, the kernel will support use of 64-bit floating +point registers with binaries using the O32 ABI along with the +EF_MIPS_FP64 ELF header flag (typically built with -mfp64). On +32-bit MIPS systems this support is at the cost of increasing the +size and complexity of the compiled FPU emulator. Thus if you are +running a MIPS32 system and know that none of your userland binaries +will require 64-bit floating point, you may wish to reduce the size +of your kernel & potentially improve FP emulation performance by +saying N here. + +If unsure, say Y. 
+ config USE_OF bool select OF
Re: [PATCH v3 2/5] slab: introduce helper functions to get/set free object
On Mon, 2 Dec 2013, Joonsoo Kim wrote: > In the following patches, the way we get/set free objects from the freelist > changes, so that simple casting no longer works for it. Therefore, > introduce helper functions. > > Acked-by: Christoph Lameter > Signed-off-by: Joonsoo Kim Acked-by: David Rientjes
Re: [PATCH v3 1/5] slab: factor out calculate nr objects in cache_estimate
On Mon, 2 Dec 2013, Joonsoo Kim wrote: > This logic is not simple to understand, so make it a separate function > to help readability. Additionally, we can use this change in the > following patch, which implements a differently sized freelist index > according to the number of objects. > > Acked-by: Christoph Lameter > Signed-off-by: Joonsoo Kim Acked-by: David Rientjes
Re: [PATCH v3 13/14] mm, hugetlb: retry if failed to allocate and there is concurrent user
On Tue, 14 Jan 2014 20:37:49 -0800 Davidlohr Bueso wrote: > On Tue, 2014-01-14 at 19:08 -0800, David Rientjes wrote: > > On Mon, 6 Jan 2014, Davidlohr Bueso wrote: > > > > > > If Andrew agrees, it would be great to merge patches 1-7 into mainline > > > > before your mutex approach. There are some clean-up patches and, IMO, > > > > they make the code more readable and maintainable, so they are worth merging > > > > separately. > > > > > > Fine by me. > > > > > > > It appears like patches 1-7 are still missing from linux-next, would you > > mind posting them in a series with your approach? > > I haven't looked much into patches 4-7, but at least the first three are > ok. I was waiting for Andrew to take all seven for linux-next and then > I'd rebase my approach on top. Anyway, unless Andrew has any > preferences, if by later this week they're not picked up, I'll resend > everything. Well, we're mainly looking for bugfixes this late in the cycle. "[PATCH v3 03/14] mm, hugetlb: protect region tracking via newly introduced resv_map lock" fixes a bug, but I'd assumed that it depended on earlier patches. If we think that one is serious then it would be better to cook up a minimal fix which is backportable into 3.12 and earlier?
Re: [RFC PATCH] sched: find the latest idle cpu
On 01/15/2014 12:48 PM, Alex Shi wrote: > On 01/15/2014 12:31 PM, Michael wang wrote: >> Hi, Alex >> >> On 01/15/2014 12:07 PM, Alex Shi wrote: >> [snip] } >>> +#ifdef CONFIG_NO_HZ_COMMON >>> + /* >>> +* Coarsely to get the latest idle cpu for shorter latency and >>> +* possible power benefit. >>> +*/ >>> + if (!min_load) { > > here should be !load. >>> + struct tick_sched *ts = &per_cpu(tick_cpu_sched, i); >>> + >>> + s64 latest_wake = 0; >> >> I guess we missed some code for latest_wake here? > > Yes, thanks for the reminder! > > so updated patch: > oops, still incorrect. re-updated: === From 5d48303b3eb3b5ca7fde54a6dfcab79cff360403 Mon Sep 17 00:00:00 2001 From: Alex Shi Date: Tue, 14 Jan 2014 23:07:42 +0800 Subject: [PATCH] sched: find the latest idle cpu Currently we just try to find the least-loaded cpu. If some cpus idled, we just pick the first cpu in the cpu mask. In fact we can get the interrupted idle cpu or the latest idled cpu, then we may get the benefit from both latency and power. The selected cpu may not be the best, since another cpu may be interrupted during our selecting. But being too cautious costs too much. Signed-off-by: Alex Shi --- kernel/sched/fair.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c7395d9..e2c4cd9 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4161,12 +4161,38 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu) /* Traverse only the allowed CPUs */ for_each_cpu_and(i, sched_group_cpus(group), tsk_cpus_allowed(p)) { + s64 latest_wake = 0; + load = weighted_cpuload(i); if (load < min_load || (load == min_load && i == this_cpu)) { min_load = load; idlest = i; } +#ifdef CONFIG_NO_HZ_COMMON + /* +* Coarsely to get the latest idle cpu for shorter latency and +* possible power benefit.
+*/ + if (!load) { + struct tick_sched *ts = _cpu(tick_cpu_sched, i); + + /* idle cpu doing irq */ + if (ts->inidle && !ts->idle_active) + idlest = i; + /* the cpu resched */ + else if (!ts->inidle) + idlest = i; + /* find latest idle cpu */ + else { + s64 temp = ktime_to_us(ts->idle_entrytime); + if (temp > latest_wake) { + latest_wake = temp; + idlest = i; + } + } + } +#endif } return idlest; -- 1.8.1.2 -- Thanks Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH] sched: find the latest idle cpu
On 01/15/2014 12:31 PM, Michael wang wrote:
> Hi, Alex
>
> On 01/15/2014 12:07 PM, Alex Shi wrote:
> [snip]
>> +#ifdef CONFIG_NO_HZ_COMMON
>> +		/*
>> +		 * Coarsely to get the latest idle cpu for shorter latency and
>> +		 * possible power benefit.
>> +		 */
>> +		if (!min_load) {

here should be !load.

>> +			struct tick_sched *ts = &per_cpu(tick_cpu_sched, i);
>> +
>> +			s64 latest_wake = 0;
>
> I guess we missed some code for latest_wake here?

Yes, thanks for the reminder!

so updated patch:

>From c3a88e73fed3da96549b5a922076e996832685f8 Mon Sep 17 00:00:00 2001
From: Alex Shi
Date: Tue, 14 Jan 2014 23:07:42 +0800
Subject: [PATCH] sched: find the latest idle cpu

Currently we just try to find the least loaded cpu. If some cpus are
idle, we just pick the first cpu in the cpu mask. In fact we can get
the interrupted idle cpu or the latest idled cpu, and then we may get
a benefit in both latency and power. The selected cpu may not be the
best, since other cpus may be interrupted during our selection. But
being too careful costs too much.

Signed-off-by: Alex Shi
---
 kernel/sched/fair.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c7395d9..73a2a07 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4167,6 +4167,31 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
 			min_load = load;
 			idlest = i;
 		}
+#ifdef CONFIG_NO_HZ_COMMON
+		/*
+		 * Coarsely to get the latest idle cpu for shorter latency and
+		 * possible power benefit.
+		 */
+		if (!load) {
+			struct tick_sched *ts = &per_cpu(tick_cpu_sched, i);
+
+			s64 latest_wake = 0;
+			/* idle cpu doing irq */
+			if (ts->inidle && !ts->idle_active)
+				idlest = i;
+			/* the cpu resched */
+			else if (!ts->inidle)
+				idlest = i;
+			/* find latest idle cpu */
+			else {
+				s64 temp = ktime_to_us(ts->idle_entrytime);
+				if (temp > latest_wake) {
+					latest_wake = temp;
+					idlest = i;
+				}
+			}
+		}
+#endif
 	}

 	return idlest;
--
1.8.1.2

--
Thanks
Alex
linux-next: manual merge of the audit tree with Linus' tree
Hi Eric,

Today's linux-next merge of the audit tree got a conflict in include/net/xfrm.h between commit d511337a1eda ("xfrm.h: Remove extern from function prototypes") from Linus' tree and commit 4440e8548153 ("audit: convert all sessionid declaration to unsigned int") from the audit tree.

I fixed it up (see below) and can carry the fix as necessary (no action is required).

--
Cheers,
Stephen Rothwell
s...@canb.auug.org.au

diff --cc include/net/xfrm.h
index cd7c46ff6f1f,f8d32b908423..
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@@ -714,23 -713,23 +714,23 @@@ static inline void xfrm_audit_helper_us
  	audit_log_task_context(audit_buf);
  }
 -extern void xfrm_audit_policy_add(struct xfrm_policy *xp, int result,
 -			kuid_t auid, unsigned int ses, u32 secid);
 -extern void xfrm_audit_policy_delete(struct xfrm_policy *xp, int result,
 -			kuid_t auid, unsigned int ses, u32 secid);
 -extern void xfrm_audit_state_add(struct xfrm_state *x, int result,
 -			kuid_t auid, unsigned int ses, u32 secid);
 -extern void xfrm_audit_state_delete(struct xfrm_state *x, int result,
 -			kuid_t auid, unsigned int ses, u32 secid);
 -extern void xfrm_audit_state_replay_overflow(struct xfrm_state *x,
 -			struct sk_buff *skb);
 -extern void xfrm_audit_state_replay(struct xfrm_state *x,
 -			struct sk_buff *skb, __be32 net_seq);
 -extern void xfrm_audit_state_notfound_simple(struct sk_buff *skb, u16 family);
 -extern void xfrm_audit_state_notfound(struct sk_buff *skb, u16 family,
 -			__be32 net_spi, __be32 net_seq);
 -extern void xfrm_audit_state_icvfail(struct xfrm_state *x,
 -			struct sk_buff *skb, u8 proto);
 +void xfrm_audit_policy_add(struct xfrm_policy *xp, int result, kuid_t auid,
- 			   u32 ses, u32 secid);
++			   unsigned int ses, u32 secid);
 +void xfrm_audit_policy_delete(struct xfrm_policy *xp, int result, kuid_t auid,
- 			      u32 ses, u32 secid);
++			      unsigned int ses, u32 secid);
 +void xfrm_audit_state_add(struct xfrm_state *x, int result, kuid_t auid,
- 			  u32 ses, u32 secid);
++			  unsigned int ses, u32 secid);
 +void xfrm_audit_state_delete(struct xfrm_state *x, int result, kuid_t auid,
- 			     u32 ses, u32 secid);
++			     unsigned int ses, u32 secid);
 +void xfrm_audit_state_replay_overflow(struct xfrm_state *x,
 +				      struct sk_buff *skb);
 +void xfrm_audit_state_replay(struct xfrm_state *x, struct sk_buff *skb,
 +			     __be32 net_seq);
 +void xfrm_audit_state_notfound_simple(struct sk_buff *skb, u16 family);
 +void xfrm_audit_state_notfound(struct sk_buff *skb, u16 family, __be32 net_spi,
 +			       __be32 net_seq);
 +void xfrm_audit_state_icvfail(struct xfrm_state *x, struct sk_buff *skb,
 +			      u8 proto);
  #else
  static inline void xfrm_audit_policy_add(struct xfrm_policy *xp, int result,
[PATCH v9 2/5] qrwlock x86: Enable x86 to use queue read/write lock
This patch makes the necessary changes at the x86 architecture specific layer to enable the presence of the CONFIG_QUEUE_RWLOCK kernel option to replace the read/write lock by the queue read/write lock. It also enables the CONFIG_QUEUE_RWLOCK option by default for x86 which will force the use of queue read/write lock. That will greatly improve the fairness of read/write lock and eliminate live-lock situation where one task may not get the lock for an indefinite period of time. Signed-off-by: Waiman Long Reviewed-by: Paul E. McKenney --- arch/x86/Kconfig |1 + arch/x86/include/asm/spinlock.h |2 ++ arch/x86/include/asm/spinlock_types.h |4 3 files changed, 7 insertions(+), 0 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index e0acf82..3c6094b 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -119,6 +119,7 @@ config X86 select MODULES_USE_ELF_RELA if X86_64 select CLONE_BACKWARDS if X86_32 select ARCH_USE_BUILTIN_BSWAP + select ARCH_USE_QUEUE_RWLOCK select OLD_SIGSUSPEND3 if X86_32 || IA32_EMULATION select OLD_SIGACTION if X86_32 select COMPAT_OLD_SIGACTION if IA32_EMULATION diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h index bf156de..8fb88c5 100644 --- a/arch/x86/include/asm/spinlock.h +++ b/arch/x86/include/asm/spinlock.h @@ -188,6 +188,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t *lock) cpu_relax(); } +#ifndef CONFIG_QUEUE_RWLOCK /* * Read-write spinlocks, allowing multiple readers * but only one writer. 
@@ -270,6 +271,7 @@ static inline void arch_write_unlock(arch_rwlock_t *rw) asm volatile(LOCK_PREFIX WRITE_LOCK_ADD(%1) "%0" : "+m" (rw->write) : "i" (RW_LOCK_BIAS) : "memory"); } +#endif /* CONFIG_QUEUE_RWLOCK */ #define arch_read_lock_flags(lock, flags) arch_read_lock(lock) #define arch_write_lock_flags(lock, flags) arch_write_lock(lock) diff --git a/arch/x86/include/asm/spinlock_types.h b/arch/x86/include/asm/spinlock_types.h index 4f1bea1..a585635 100644 --- a/arch/x86/include/asm/spinlock_types.h +++ b/arch/x86/include/asm/spinlock_types.h @@ -34,6 +34,10 @@ typedef struct arch_spinlock { #define __ARCH_SPIN_LOCK_UNLOCKED { { 0 } } +#ifdef CONFIG_QUEUE_RWLOCK +#include +#else #include +#endif #endif /* _ASM_X86_SPINLOCK_TYPES_H */ -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v9 1/5] qrwlock: A queue read/write lock implementation
This patch introduces a new read/write lock implementation that puts waiting readers and writers into a queue instead of actively contending for the lock like the current read/write lock implementation. This will improve performance in highly contended situations by reducing the cache line bouncing effect.

The queue read/write lock (qrwlock) is a fair lock even though there is still a slight chance of lock stealing if a reader or writer comes at the right moment. Other than that, lock granting is done in a FIFO manner. As a result, it is possible to determine a maximum time period after which the waiting is over and the lock can be acquired.

Internally, however, there is a second type of reader which tries to steal the lock aggressively. It simply increments the reader count and waits until the writer releases the lock. The transition to an aggressive reader happens in the read lock slowpath when:

1. In an interrupt context.
2. When a reader comes to the head of the wait queue and sees the release of a write lock.

The queue read lock is safe to use in an interrupt context (softirq or hardirq) as it will switch to become an aggressive reader in such an environment, allowing recursive read locks.

The only downside of queue rwlock is the size increase in the lock structure by 4 bytes for 32-bit systems and by 12 bytes for 64-bit systems.

In terms of single-thread performance (no contention), a 256K lock/unlock loop was run on 2.4GHz and 2.93GHz Westmere x86-64 CPUs. The following table shows the average time (in ns) for a single lock/unlock sequence (including the looping and timing overhead):

Lock Type		2.4GHz		2.93GHz
---------		------		-------
Ticket spinlock		 14.9		 12.3
Read lock		 17.0		 13.5
Write lock		 17.0		 13.5
Queue read lock		 16.0		 13.4
Queue write lock	  9.2		  7.8

The queue read lock is slightly slower than the spinlock, but is slightly faster than the read lock. The queue write lock, however, is the fastest of all. It is almost twice as fast as the write lock and about 1.5X the speed of the spinlock.
With lock contention, the speed of each individual lock/unlock function is less important than the amount of contention-induced delays. To investigate the performance characteristics of the queue rwlock compared with the regular rwlock, Ingo's anon_vmas patch that converts rwsem to rwlock was applied to a 3.12 kernel. This kernel was then tested under the following 3 conditions:

1) Plain 3.12
2) Ingo's patch
3) Ingo's patch + qrwlock

Each of the 3 kernels was booted up twice, with and without the "idle=poll" kernel parameter which keeps the CPUs in C0 state while idling instead of a more energy-saving sleep state. The jobs per minute (JPM) results of the AIM7 high_systime workload at 1500 users on an 8-socket 80-core DL980 (HT off) were:

Kernel		JPMs		%Change from (1)
------		----		----------------
  1	145704/227295		-
  2	229750/236066	+58%/+3.8%
  3	240062/248606	+65%/+9.4%

The first JPM number is without the "idle=poll" kernel parameter; the second number is with that parameter. It can be seen that most of the performance benefit of converting rwsem to rwlock actually comes from the latency improvement of not needing to wake up a CPU from a deep sleep state when work is available.

The use of non-sleeping locks did improve performance by eliminating the context switching cost; using the queue rwlock almost tripled that performance gain. The gain was reduced somewhat with a fair lock, which was to be expected. Looking at the perf profiles (with idle=poll) below, we can clearly see that other bottlenecks were constraining the performance improvement.
Perf profile of kernel (2):

 18.65%   reaim  [kernel.kallsyms]  [k] __write_lock_failed
  9.00%   reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
  5.21% swapper  [kernel.kallsyms]  [k] cpu_idle_loop
  3.08%   reaim  [kernel.kallsyms]  [k] mspin_lock
  2.50%   reaim  [kernel.kallsyms]  [k] anon_vma_interval_tree_insert
  2.00%      ls  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
  1.29%   reaim  [kernel.kallsyms]  [k] update_cfs_rq_blocked_load
  1.21%   reaim  [kernel.kallsyms]  [k] __read_lock_failed
  1.12%   reaim  [kernel.kallsyms]  [k] _raw_spin_lock
  1.10%   reaim  [kernel.kallsyms]  [k] perf_event_aux
  1.09%    true  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave

Perf profile of kernel (3):

 20.14% swapper  [kernel.kallsyms]  [k] cpu_idle_loop
  7.94%   reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
  5.41%   reaim  [kernel.kallsyms]  [k] queue_write_lock_slowpath
  5.01%   reaim  [kernel.kallsyms]  [k] mspin_lock
  2.12%   reaim  [kernel.kallsyms]  [k] anon_vma_interval_tree_insert
  2.07%      ls  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
  1.58%   reaim  [kernel.kallsyms]  [k] update_cfs_rq_blocked_load
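As an editorial aside, the word layout this series builds on (a byte-sized writer field plus a reader count in one word) can be sketched in userspace with C11 atomics. This is a hypothetical toy — trylock paths only, no queuing, and none of the kernel's names or memory-ordering subtleties — not the patch's code:

```c
#include <assert.h>
#include <stdatomic.h>

#define QW_LOCKED	0xffu		/* low 8 bits: writer byte */
#define QR_SHIFT	8
#define QR_BIAS		(1u << QR_SHIFT)	/* one reader */

typedef struct { atomic_uint cnts; } toy_qrwlock;

static int toy_read_trylock(toy_qrwlock *l)
{
	/* Optimistically bump the reader count (the xadd-style fast path). */
	unsigned int c = atomic_fetch_add(&l->cnts, QR_BIAS) + QR_BIAS;

	if (!(c & QW_LOCKED))
		return 1;			/* no writer: read lock held */
	atomic_fetch_sub(&l->cnts, QR_BIAS);	/* writer present: undo */
	return 0;
}

static void toy_read_unlock(toy_qrwlock *l)
{
	atomic_fetch_sub(&l->cnts, QR_BIAS);
}

static int toy_write_trylock(toy_qrwlock *l)
{
	unsigned int expected = 0;

	/* A writer needs the whole word to be 0: no readers, no writer. */
	return atomic_compare_exchange_strong(&l->cnts, &expected, QW_LOCKED);
}

static void toy_write_unlock(toy_qrwlock *l)
{
	atomic_fetch_sub(&l->cnts, QW_LOCKED);
}
```

The fetch-add in the read path mirrors the xadd fast path mentioned in the changelog; the queued slowpaths and fairness guarantees are what the rest of the series adds on top.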
[PATCH v9 0/5] qrwlock: Introducing a queue read/write lock implementation
v8->v9: - Rebase to the tip branch which has the PeterZ's smp_load_acquire()/smp_store_release() patch. - Only pass integer type arguments to smp_load_acquire() & smp_store_release() functions. - Add a new patch to make any data type less than or equal to long as atomic or native in x86. - Modify write_unlock() to use atomic_sub() if the writer field is not atomic. v7->v8: - Use atomic_t functions (which are implemented in all arch's) to modify reader counts. - Use smp_load_acquire() & smp_store_release() for barriers. - Further tuning in slowpath performance. v6->v7: - Remove support for unfair lock, so only fair qrwlock will be provided. - Move qrwlock.c to the kernel/locking directory. v5->v6: - Modify queue_read_can_lock() to avoid false positive result. - Move the two slowpath functions' performance tuning change from patch 4 to patch 1. - Add a new optional patch to use the new smp_store_release() function if that is merged. v4->v5: - Fix wrong definitions for QW_MASK_FAIR & QW_MASK_UNFAIR macros. - Add an optional patch 4 which should only be applied after the mcs_spinlock.h header file is merged. v3->v4: - Optimize the fast path with better cold cache behavior and performance. - Removing some testing code. - Make x86 use queue rwlock with no user configuration. v2->v3: - Make read lock stealing the default and fair rwlock an option with a different initializer. - In queue_read_lock_slowpath(), check irq_count() and force spinning and lock stealing in interrupt context. - Unify the fair and classic read-side code path, and make write-side to use cmpxchg with 2 different writer states. This slows down the write lock fastpath to make the read side more efficient, but is still slightly faster than a spinlock. v1->v2: - Improve lock fastpath performance. - Optionally provide classic read/write lock behavior for backward compatibility. - Use xadd instead of cmpxchg for fair reader code path to make it immute to reader contention. - Run more performance testing. 
As mentioned in the LWN article http://lwn.net/Articles/364583/, the read/write lock suffers from an unfairness problem: it is possible for a stream of incoming readers to block a waiting writer from getting the lock for a long time. Also, a waiting reader/writer contending a rwlock in local memory will have a higher chance of acquiring the lock than a reader/writer in a remote node. This patch set introduces a queue-based read/write lock implementation that can largely solve this unfairness problem.

The read lock slowpath will check if the reader is in an interrupt context. If so, it will force lock spinning and stealing without waiting in a queue. This is to ensure the read lock will be granted as soon as possible.

The queue write lock can also be used as a replacement for ticket spinlocks that are highly contended, if the lock size increase is not an issue.

The first 2 patches provide the base queue read/write lock support on the x86 architecture. Support for other architectures can be added later on once architecture-specific support infrastructure is added and proper testing is done.

Patches 3 and 4 are currently applicable on the tip git tree where the smp_load_acquire() & smp_store_release() macros are defined. The optional patch 5 has a dependency on the mcs_spinlock.h header file which has not been merged yet. So this patch should only be applied after the mcs_spinlock.h header file is merged.
Waiman Long (5): qrwlock: A queue read/write lock implementation qrwlock x86: Enable x86 to use queue read/write lock qrwlock, x86 - Treat all data type not bigger than long as atomic in x86 qrwlock: Use smp_store_release() in write_unlock() qrwlock: Use the mcs_spinlock helper functions for MCS queuing arch/x86/Kconfig |1 + arch/x86/include/asm/barrier.h|8 ++ arch/x86/include/asm/spinlock.h |2 + arch/x86/include/asm/spinlock_types.h |4 + include/asm-generic/qrwlock.h | 208 + kernel/Kconfig.locks |7 + kernel/locking/Makefile |1 + kernel/locking/qrwlock.c | 191 ++ 8 files changed, 422 insertions(+), 0 deletions(-) create mode 100644 include/asm-generic/qrwlock.h create mode 100644 kernel/locking/qrwlock.c -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v9 4/5] qrwlock: Use smp_store_release() in write_unlock()
This patch modifies the queue_write_unlock() function to use the new smp_store_release() function (currently in tip). It also removes the temporary implementations of smp_load_acquire() and smp_store_release() in qrwlock.c. This patch will use atomic subtraction instead if the writer field is not atomic.

Signed-off-by: Waiman Long
---
 include/asm-generic/qrwlock.h | 10 ++++---
 kernel/locking/qrwlock.c      | 34 --------------------
 2 files changed, 6 insertions(+), 38 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 5abb6ca..68f488b 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
 static inline void queue_write_unlock(struct qrwlock *lock)
 {
 	/*
-	 * Make sure that none of the critical section will be leaked out.
+	 * If the writer field is atomic, it can be cleared directly.
+	 * Otherwise, an atomic subtraction will be used to clear it.
 	 */
-	smp_mb__before_clear_bit();
-	ACCESS_ONCE(lock->cnts.writer) = 0;
-	smp_mb__after_clear_bit();
+	if (__native_word(lock->cnts.writer))
+		smp_store_release(&lock->cnts.writer, 0);
+	else
+		atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
 }
 
 /*
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 053be4d..2727188 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -47,40 +47,6 @@
 # define arch_mutex_cpu_relax()	cpu_relax()
 #endif
 
-#ifndef smp_load_acquire
-# ifdef CONFIG_X86
-# define smp_load_acquire(p)				\
-	({						\
-		typeof(*p) ___p1 = ACCESS_ONCE(*p);	\
-		barrier();				\
-		___p1;					\
-	})
-# else
-# define smp_load_acquire(p)				\
-	({						\
-		typeof(*p) ___p1 = ACCESS_ONCE(*p);	\
-		smp_mb();				\
-		___p1;					\
-	})
-# endif
-#endif
-
-#ifndef smp_store_release
-# ifdef CONFIG_X86
-# define smp_store_release(p, v)	\
-	do {				\
-		barrier();		\
-		ACCESS_ONCE(*p) = v;	\
-	} while (0)
-# else
-# define smp_store_release(p, v)	\
-	do {				\
-		smp_mb();		\
-		ACCESS_ONCE(*p) = v;	\
-	} while (0)
-# endif
-#endif
-
 /*
  * If an xadd (exchange-add) macro isn't available, simulate one with
  * the atomic_add_return() function.
--
1.7.1
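The before/after difference in this patch is the unlock barrier: a full-barrier/store/full-barrier sequence becomes a single store-release. In C11 terms this looks roughly like the sketch below (illustrative only, not the kernel code; names are made up):

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_uchar toy_writer;	/* the byte-sized writer field */
static int shared_data;		/* protected by the toy lock */

static void toy_wlock(void)
{
	/* Spin until we change the writer byte from 0 to locked (0xff). */
	while (atomic_exchange_explicit(&toy_writer, 0xff,
					memory_order_acquire))
		;
}

static void toy_wunlock(void)
{
	/*
	 * One store-release replaces the old mb(); store; mb() sequence:
	 * every write inside the critical section is ordered before the
	 * lock is observed as free by other threads.
	 */
	atomic_store_explicit(&toy_writer, 0, memory_order_release);
}
```

The release store is cheaper on x86 (a plain store plus a compiler barrier) while still giving the unlock semantics the removed `smp_mb__*_clear_bit()` pair provided.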
[PATCH v9 3/5] qrwlock, x86 - Treat all data type not bigger than long as atomic in x86
The generic __native_word() macro defined in include/linux/compiler.h only allows "int" and "long" data types to be treated as native and atomic. The x86 architecture, however, allows the use of char and short data types as atomic as well. This patch extends the data types allowed in the __native_word() macro to include char and short.

Signed-off-by: Waiman Long
---
 arch/x86/include/asm/barrier.h | 8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 04a4890..4d3e30a 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -24,6 +24,14 @@
 #define wmb()	asm volatile("sfence" ::: "memory")
 #endif
 
+/*
+ * All data types <= long are atomic in x86
+ */
+#ifdef __native_word
+#undef __native_word
+#endif
+#define __native_word(t)	(sizeof(t) <= sizeof(long))
+
 /**
  * read_barrier_depends - Flush all pending reads that subsequents reads
  * depend on.
--
1.7.1
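The macro's effect is easy to check in isolation. A standalone restatement of the same sizeof test (hypothetical macro name, not the kernel's `__native_word`):

```c
#include <assert.h>

/*
 * On x86, any data type no bigger than a long can be loaded and stored
 * atomically, including char and short -- which is exactly what the
 * patch encodes as a compile-time predicate.
 */
#define NATIVE_WORD_X86(t)	(sizeof(t) <= sizeof(long))
```

Anything wider than a machine word, such as a struct of two longs, fails the predicate and would need an atomic_t-style accessor instead.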
[PATCH v9 5/5] qrwlock: Use the mcs_spinlock helper functions for MCS queuing
There is a pending MCS lock patch series that adds generic MCS lock helper functions to do MCS-style locking. This patch will enable the queue rwlock to use those generic MCS lock/unlock primitives for its internal queuing. This patch should only be merged after the merging of that generic MCS locking patch.

Signed-off-by: Waiman Long
Reviewed-by: Paul E. McKenney
---
 include/asm-generic/qrwlock.h |  7 +---
 kernel/locking/qrwlock.c      | 71 +----------------
 2 files changed, 9 insertions(+), 69 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 68f488b..db6b1e2 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -38,10 +38,7 @@
  * the writer field. The least significant 8 bits is the writer field
  * whereas the remaining 24 bits is the reader count.
  */
-struct qrwnode {
-	struct qrwnode *next;
-	int		wait;	/* Waiting flag */
-};
+struct mcs_spinlock;
 
 typedef struct qrwlock {
 	union qrwcnts {
@@ -57,7 +54,7 @@ typedef struct qrwlock {
 		atomic_t	rwa;	/* Reader/writer atomic */
 		u32		rwc;	/* Reader/writer counts */
 	} cnts;
-	struct qrwnode	*waitq;		/* Tail of waiting queue */
+	struct mcs_spinlock *waitq;	/* Tail of waiting queue */
 } arch_rwlock_t;
 
 /*
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 2727188..18b4f2c 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /*
@@ -58,64 +59,6 @@
 #endif
 
 /**
- * wait_in_queue - Add to queue and wait until it is at the head
- * @lock: Pointer to queue rwlock structure
- * @node: Node pointer to be added to the queue
- */
-static inline void wait_in_queue(struct qrwlock *lock, struct qrwnode *node)
-{
-	struct qrwnode *prev;
-
-	node->next = NULL;
-	node->wait = true;
-	prev = xchg(&lock->waitq, node);
-	if (prev) {
-		prev->next = node;
-		/*
-		 * Wait until the waiting flag is off
-		 */
-		while (smp_load_acquire(&node->wait))
-			arch_mutex_cpu_relax();
-	}
-}
-
-/**
- * signal_next - Signal the next one in queue to be at the head
- * @lock: Pointer to queue rwlock structure
- * @node: Node pointer to the current head of queue
- */
-static inline void signal_next(struct qrwlock *lock, struct qrwnode *node)
-{
-	struct qrwnode *next;
-
-	/*
-	 * Try to notify the next node first without disturbing the cacheline
-	 * of the lock. If that fails, check to see if it is the last node
-	 * and so should clear the wait queue.
-	 */
-	next = ACCESS_ONCE(node->next);
-	if (likely(next))
-		goto notify_next;
-
-	/*
-	 * Clear the wait queue if it is the last node
-	 */
-	if ((ACCESS_ONCE(lock->waitq) == node) &&
-	    (cmpxchg(&lock->waitq, node, NULL) == node))
-		return;
-	/*
-	 * Wait until the next one in queue set up the next field
-	 */
-	while (likely(!(next = ACCESS_ONCE(node->next))))
-		arch_mutex_cpu_relax();
-	/*
-	 * The next one in queue is now at the head
-	 */
-notify_next:
-	smp_store_release(&next->wait, false);
-}
-
-/**
  * rspin_until_writer_unlock - inc reader count & spin until writer is gone
  * @lock : Pointer to queue rwlock structure
  * @writer: Current queue rwlock writer status byte
@@ -138,7 +81,7 @@ rspin_until_writer_unlock(struct qrwlock *lock, u32 rwc)
  */
 void queue_read_lock_slowpath(struct qrwlock *lock)
 {
-	struct qrwnode node;
+	struct mcs_spinlock node;
 	union qrwcnts cnts;
 
 	/*
@@ -158,7 +101,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
 	/*
 	 * Put the reader into the wait queue
 	 */
-	wait_in_queue(lock, &node);
+	mcs_spin_lock(&lock->waitq, &node);
 
 	/*
 	 * At the head of the wait queue now, wait until the writer state
@@ -175,7 +118,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
 	/*
 	 * Signal the next one in queue to become queue head
 	 */
-	signal_next(lock, &node);
+	mcs_spin_unlock(&lock->waitq, &node);
 }
 EXPORT_SYMBOL(queue_read_lock_slowpath);
 
@@ -231,18 +174,18 @@ static inline void queue_write_3step_lock(struct qrwlock *lock)
  */
 void queue_write_lock_slowpath(struct qrwlock *lock)
 {
-	struct qrwnode node;
+	struct mcs_spinlock node;
 
 	/*
	 * Put the writer into the wait queue
 	 */
-	wait_in_queue(lock, &node);
+	mcs_spin_lock(&lock->waitq, &node);
 
 	/*
 	 * At the head of the wait queue now, call queue_write_3step_lock()
 	 * to acquire the lock until it is done.
 	 */
 	queue_write_3step_lock(lock);
-	signal_next(lock, &node);
+
Re: [PATCH v3 13/14] mm, hugetlb: retry if failed to allocate and there is concurrent user
On Tue, 2014-01-14 at 19:08 -0800, David Rientjes wrote: > On Mon, 6 Jan 2014, Davidlohr Bueso wrote: > > > > If Andrew agree, It would be great to merge 1-7 patches into mainline > > > before your mutex approach. There are some of clean-up patches and, IMO, > > > it makes the code more readable and maintainable, so it is worth to merge > > > separately. > > > > Fine by me. > > > > It appears like patches 1-7 are still missing from linux-next, would you > mind posting them in a series with your approach? I haven't looked much into patches 4-7, but at least the first three are ok. I was waiting for Andrew to take all seven for linux-next and then I'd rebase my approach on top. Anyway, unless Andrew has any preferences, if by later this week they're not picked up, I'll resend everything. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: build failure after merge of the device-mapper tree
On Tue, Jan 14 2014 at 10:52pm -0500, Stephen Rothwell wrote: > Hi all, > > After merging the device-mapper tree, today's linux-next build (powerpc > ppc64_defconfig) failed like this: > > ERROR: ".dm_bufio_get_device_size" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_release" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_client_destroy" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_prefetch" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_set_minimum_buffers" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_forget" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_client_create" [drivers/md/dm-snapshot.ko] undefined! > ERROR: ".dm_bufio_read" [drivers/md/dm-snapshot.ko] undefined! > > Presumably caused by commit b41bf7440bcf ("dm snapshot: use dm-bufio"). Hi Stephen, That commit was missing a Kconfig update to have DM_SNAPSHOT select DM_BUFIO. I've rebased the "dm snapshot: use dm-bufio" commit to include the Kconfig change and pushed to 'for-next'. Thanks, Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
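The fix Mike describes — making DM_SNAPSHOT select DM_BUFIO so the dm-bufio symbols are built when dm-snapshot is — would look roughly like this in drivers/md/Kconfig (a sketch; the actual hunk may differ):

```
config DM_SNAPSHOT
	tristate "Snapshot target"
	depends on BLK_DEV_DM
	select DM_BUFIO
```

Without the `select`, a config with DM_SNAPSHOT=m but DM_BUFIO unset produces exactly the undefined-symbol link errors Stephen reported.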
Re: [RFC PATCH] sched: find the latest idle cpu
Hi, Alex

On 01/15/2014 12:07 PM, Alex Shi wrote:
[snip]
> +#ifdef CONFIG_NO_HZ_COMMON
> +		/*
> +		 * Coarsely to get the latest idle cpu for shorter latency and
> +		 * possible power benefit.
> +		 */
> +		if (!min_load) {
> +			struct tick_sched *ts = &per_cpu(tick_cpu_sched, i);
> +
> +			s64 latest_wake = 0;

I guess we missed some code for latest_wake here?

Regards,
Michael Wang

> +			/* idle cpu doing irq */
> +			if (ts->inidle && !ts->idle_active)
> +				idlest = i;
> +			/* the cpu resched */
> +			else if (!ts->inidle)
> +				idlest = i;
> +			/* find latest idle cpu */
> +			else if (ktime_to_us(ts->idle_entrytime) > latest_wake)
> +				idlest = i;
> +		}
> +#endif
> 	}
>
> 	return idlest;
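The selection policy the patch is converging on is: pick the least-loaded CPU, and among fully idle CPUs prefer the one that entered idle most recently (largest idle_entrytime). A toy model of that loop (hypothetical data layout, not the kernel's):

```c
#include <assert.h>

struct toy_cpu {
	unsigned long load;
	long long idle_entrytime_us;	/* only meaningful when load == 0 */
};

static int toy_find_idlest_cpu(const struct toy_cpu *cpu, int n)
{
	unsigned long min_load = (unsigned long)-1;
	long long latest_wake = 0;
	int idlest = -1;
	int i;

	for (i = 0; i < n; i++) {
		/* Base policy: remember the least-loaded CPU. */
		if (cpu[i].load < min_load) {
			min_load = cpu[i].load;
			idlest = i;
		}
		/* Among fully idle CPUs, prefer the most recently idled:
		 * it is likeliest still cache-warm / in a shallow C-state. */
		if (!cpu[i].load && cpu[i].idle_entrytime_us > latest_wake) {
			latest_wake = cpu[i].idle_entrytime_us;
			idlest = i;
		}
	}
	return idlest;
}
```

Michael's point in the thread is precisely that without carrying `latest_wake` across iterations (as above), the "latest idle" comparison degenerates to picking whichever idle CPU is visited last.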
Re: [PATCH 3/3] zram: rework reported to end-user zram statistics
Hello Sergey,

On Tue, Jan 14, 2014 at 12:37:40PM +0300, Sergey Senozhatsky wrote:
> 1) Introduce ZRAM_ATTR_RO macro to generate zram atomic64_t stats
> `show' functions and reduce code duplication.
>
> 2) Account and report back to user numbers of failed READ and WRITE
> operations.
>
> 3) Remove `good' and `bad' compressed sub-requests stats. A RW request may
> cause a number of RW sub-requests. zram used to account `good' compressed
> sub-queries (with compressed size less than 50% of original size), `bad'
> compressed sub-queries (with compressed size greater than 75% of original
> size), leaving sub-requests with compression size between 50% and 75% of
> original size not accounted and not reported. Account each sub-request
> compression size so we can calculate the real device compression ratio.
>
> 4) reported zram stats:
> - num_writes -- number of writes
> - num_reads -- number of reads
> - pages_stored -- number of pages currently stored
> - compressed_size -- compressed size of pages stored
> - pages_zero -- number of zero filled pages
> - failed_read -- number of failed reads
> - failed_writes -- can happen when memory is too low
> - invalid_io -- non-page-aligned I/O requests
> - notify_free -- number of swap slot free notifications
> - memory_used -- zs pool zs_get_total_size_bytes()
>
> Signed-off-by: Sergey Senozhatsky

So this patch includes both cleanups and behavior changes? Please separate them, and each patch with a behavior change should include why it's bad or good (ie, the motivation). It could make the reviewer/maintainer happy — and you too, because some of the patches could be picked up while other things are dropped.

Sorry for bothering you.

Thanks.
> --- > drivers/block/zram/zram_drv.c | 167 > -- > drivers/block/zram/zram_drv.h | 17 ++--- > 2 files changed, 54 insertions(+), 130 deletions(-) > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > index 2a7682c..8bddaff 100644 > --- a/drivers/block/zram/zram_drv.c > +++ b/drivers/block/zram/zram_drv.c > @@ -42,6 +42,17 @@ static struct zram *zram_devices; > /* Module params (documentation at end) */ > static unsigned int num_devices = 1; > > +#define ZRAM_ATTR_RO(name) \ > +static ssize_t zram_attr_##name##_show(struct device *d, \ > + struct device_attribute *attr, char *b) \ > +{\ > + struct zram *zram = dev_to_zram(d); \ > + return sprintf(b, "%llu\n", \ > + (u64)atomic64_read(&zram->stats.name)); \ > +}\ > +static struct device_attribute dev_attr_##name = \ > + __ATTR(name, S_IRUGO, zram_attr_##name##_show, NULL); > + > static inline int init_done(struct zram *zram) > { > return zram->meta != NULL; > @@ -52,97 +63,36 @@ static inline struct zram *dev_to_zram(struct device *dev) > return (struct zram *)dev_to_disk(dev)->private_data; > } > > -static ssize_t disksize_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%llu\n", zram->disksize); > -} > - > -static ssize_t initstate_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%u\n", init_done(zram)); > -} > - > -static ssize_t num_reads_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%llu\n", > - (u64)atomic64_read(&zram->stats.num_reads)); > -} > - > -static ssize_t num_writes_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%llu\n", > - (u64)atomic64_read(&zram->stats.num_writes)); > -} > - > -static ssize_t
invalid_io_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%llu\n", > - (u64)atomic64_read(&zram->stats.invalid_io)); > -} > - > -static ssize_t notify_free_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%llu\n", > - (u64)atomic64_read(&zram->stats.notify_free)); > -} > - > -static ssize_t zero_pages_show(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - struct zram *zram = dev_to_zram(dev); > - > - return sprintf(buf, "%u\n",
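The ZRAM_ATTR_RO macro in the diff above removes the duplicated show functions by token pasting. The same technique can be sketched in plain userspace C (the stats struct and field names here are stand-ins, not the zram definitions):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stats structure standing in for zram->stats. */
struct stats {
	long long num_reads;
	long long num_writes;
};

/* Token-pasting macro: one line per counter, no duplicated function
 * bodies, exactly the idea behind ZRAM_ATTR_RO. */
#define STAT_SHOW(name)                                         \
static int name##_show(const struct stats *s, char *buf)        \
{                                                               \
	return sprintf(buf, "%lld\n", s->name);                 \
}

STAT_SHOW(num_reads)
STAT_SHOW(num_writes)
```

Each STAT_SHOW(x) expansion defines a function x_show() whose body is generated once, so adding a counter is a one-line change.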
linux-next: manual merge of the iommu tree with the drm tree
Hi Joerg, Today's linux-next merge of the iommu tree got a conflict in drivers/gpu/drm/msm/Kconfig between commit 3083894f7f29 ("drm/msm: COMPILE_TEST support") from the drm tree and commit 4c071c7b851b ("drm/msm: Fix link error with !MSM_IOMMU") from the iommu tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au diff --cc drivers/gpu/drm/msm/Kconfig index bb103fb4519e,d3de8e1ae915.. --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@@ -2,7 -2,9 +2,8 @@@ config DRM_MSM tristate "MSM DRM" depends on DRM - depends on ARCH_MSM - depends on ARCH_MSM8960 + depends on (ARCH_MSM && ARCH_MSM8960) || (ARM && COMPILE_TEST) + depends on MSM_IOMMU select DRM_KMS_HELPER select SHMEM select TMPFS
[PATCH] misc: sram: cleanup the code
Since devm_gen_pool_create() is used, the gen_pool_destroy() here is redundant. Signed-off-by: Xiubo Li --- drivers/misc/sram.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c index afe66571..e3e421d 100644 --- a/drivers/misc/sram.c +++ b/drivers/misc/sram.c @@ -87,8 +87,6 @@ static int sram_remove(struct platform_device *pdev) if (gen_pool_avail(sram->pool) < gen_pool_size(sram->pool)) dev_dbg(&pdev->dev, "removed while SRAM allocated\n"); - gen_pool_destroy(sram->pool); - if (sram->clk) clk_disable_unprepare(sram->clk); -- 1.8.4
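For context, the devm_* family ties a resource's lifetime to the device, so teardown runs automatically and a manual destroy becomes a double release. A userspace sketch of that idea (all names hypothetical, not the kernel's devres implementation):

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace sketch of the devm_* idea: resources registered against a
 * "device" are released automatically when the device is torn down, so
 * callers must not also free them by hand. Names are illustrative. */
typedef void (*release_fn)(void *);

struct devres { release_fn release; void *data; struct devres *next; };
struct device { struct devres *resources; };

/* Attach a resource to the device's cleanup list. */
static void *devm_register(struct device *dev, void *data, release_fn release)
{
	struct devres *r = malloc(sizeof(*r));
	r->release = release;
	r->data = data;
	r->next = dev->resources;
	dev->resources = r;
	return data;
}

/* Tear down the device: every registered release callback runs once,
 * which is why an explicit destroy in a remove() path is redundant. */
static void device_teardown(struct device *dev)
{
	struct devres *r = dev->resources;
	while (r) {
		struct devres *next = r->next;
		r->release(r->data);
		free(r);
		r = next;
	}
	dev->resources = NULL;
}

static int released_count;
static void count_release(void *data) { (void)data; released_count++; }
```

With this model, calling a manual destroy from remove() and then tearing the device down would release the same resource twice, which is the bug class the patch avoids.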
Notifying on empty cgroup
I want to write software which needs to receive a signal when the cgroup created by it becomes empty. (After this, the empty cgroup should be deleted, just so as not to clutter memory.) If the kernel does not support such notifications, it should be improved. This functionality is crucial for some kinds of software. There is /sys/fs/cgroup/systemd/release_agent but I don't understand how to use it. I don't understand why we would need it at all. Starting a binary when a cgroup empties, just in order to notify another binary, looks like overkill. Also, my program should work in userspace without the need to use release_agent, which can be accessed only by root. Note that my work is related to sandboxing software (running a program in a closed environment, so that it cannot, for example, remove the user's files). See also http://portonsoft.wordpress.com/2014/01/11/toward-robust-linux-sandbox/ -- Victor Porton - http://portonvictor.org
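For reference, the v1 release_agent mechanism mentioned above is driven entirely through cgroupfs files. A rough configuration sketch, assuming a v1 hierarchy mounted at /sys/fs/cgroup and a hypothetical /usr/local/bin/on-empty helper (root required):

```shell
# Sketch of the cgroup v1 release_agent mechanism (requires root).
# Paths assume a v1 hierarchy at /sys/fs/cgroup; /usr/local/bin/on-empty
# is a hypothetical helper of your own, not a standard tool.
echo /usr/local/bin/on-empty > /sys/fs/cgroup/systemd/release_agent
mkdir /sys/fs/cgroup/systemd/sandbox
echo 1 > /sys/fs/cgroup/systemd/sandbox/notify_on_release
echo $$ > /sys/fs/cgroup/systemd/sandbox/tasks    # move this shell in
# When the last task leaves the group, the kernel execs the release
# agent with the group's relative path ("/sandbox") as its argument.
```

As the post says, this is root-only and spawns a process per notification, which is exactly the overhead being objected to.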
[RFC PATCH] sched: find the latest idle cpu
Currently we just try to find the least loaded cpu. If some cpus are idle, we just pick the first one in the cpu mask. In fact we can pick the interrupted idle cpu or the most recently idled cpu, and then we may benefit on both latency and power. The selected cpu may not be the best, since another cpu may be interrupted during our selection. But being too captious costs too much. Signed-off-by: Alex Shi --- kernel/sched/fair.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c7395d9..fb52d26 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4167,6 +4167,26 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu) min_load = load; idlest = i; } +#ifdef CONFIG_NO_HZ_COMMON + /* +* Coarsely to get the latest idle cpu for shorter latency and +* possible power benefit. +*/ + if (!min_load) { + struct tick_sched *ts = &per_cpu(tick_cpu_sched, i); + + s64 latest_wake = 0; + /* idle cpu doing irq */ + if (ts->inidle && !ts->idle_active) + idlest = i; + /* the cpu resched */ + else if (!ts->inidle) + idlest = i; + /* find latest idle cpu */ + else if (ktime_to_us(ts->idle_entrytime) > latest_wake) + idlest = i; + } +#endif } return idlest; -- 1.8.1.2
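The selection policy the RFC describes, pick the least-loaded cpu and, among idle cpus, prefer the one that entered idle most recently, can be modeled outside the kernel. This sketch uses hypothetical per-cpu data rather than the kernel's tick_sched bookkeeping, and makes the latest-idle tracking explicit:

```c
#include <assert.h>

/* Toy model of the policy described in the patch: least load wins, and
 * among idle (zero-load) cpus the most recent idler wins the tiebreak.
 * The cpu_state fields are illustrative, not kernel structures. */
struct cpu_state {
	unsigned long load;       /* 0 means the cpu is idle */
	long long idle_entry_us;  /* when an idle cpu entered idle */
};

static int find_idlest_cpu(const struct cpu_state *cpu, int ncpu)
{
	unsigned long min_load = (unsigned long)-1;
	long long latest_idle = -1;
	int idlest = -1, i;

	for (i = 0; i < ncpu; i++) {
		if (cpu[i].load > min_load)
			continue;
		if (cpu[i].load < min_load) {
			min_load = cpu[i].load;
			latest_idle = -1;  /* new minimum: reset tiebreak */
			idlest = i;
		}
		/* among idle cpus, prefer the latest one to go idle */
		if (min_load == 0 && cpu[i].idle_entry_us > latest_idle) {
			latest_idle = cpu[i].idle_entry_us;
			idlest = i;
		}
	}
	return idlest;
}
```

The recently-idled cpu is attractive because its caches are warmer and, on some platforms, it may not yet have entered a deep sleep state, which is the latency/power benefit the changelog claims.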
linux-next: manual merge of the md tree with the block tree
Hi Neil, Today's linux-next merge of the md tree got a conflict in drivers/md/raid10.c between commit 4f024f3797c4 ("block: Abstract out bvec iterator") from the block tree and commit b50c259e25d9 ("md/raid10: fix two bugs in handling of known-bad-blocks") from the md tree. I fixed it up (see below) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au diff --cc drivers/md/raid10.c index 6d43d88657aa,8d39d63281b9.. --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@@ -1256,8 -1319,8 +1256,8 @@@ read_again /* Could not read all from this device, so we will * need another r10_bio. */ - sectors_handled = (r10_bio->sectors + max_sectors + sectors_handled = (r10_bio->sector + max_sectors - - bio->bi_sector); + - bio->bi_iter.bi_sector); r10_bio->sectors = max_sectors; spin_lock_irq(&conf->device_lock); if (bio->bi_phys_segments == 0)
Re: [PATCH v4 1/3] Send loginuid and sessionid in SCM_AUDIT
On 14/01/13, Jan Kaluza wrote: > Server-like processes in many cases need credentials and other > metadata of the peer, to decide if the calling process is allowed to > request a specific action, or the server just wants to log away this > type of information for auditing tasks. > > The current practice to retrieve such process metadata is to look that > information up in procfs with the $PID received over SCM_CREDENTIALS. > This is sufficient for long-running tasks, but introduces a race which > cannot be worked around for short-living processes; the calling > process and all the information in /proc/$PID/ is gone before the > receiver of the socket message can look it up. > > This introduces a new SCM type called SCM_AUDIT to allow the direct > attaching of "loginuid" and "sessionid" to SCM, which is significantly more > efficient and will reliably avoid the race with the round-trip over > procfs. > > Signed-off-by: Jan Kaluza Acked-by: Richard Guy Briggs > --- > include/linux/socket.h | 6 ++ > include/net/af_unix.h | 2 ++ > include/net/scm.h | 28 ++-- > net/unix/af_unix.c | 7 +++ > 4 files changed, 41 insertions(+), 2 deletions(-) > > diff --git a/include/linux/socket.h b/include/linux/socket.h > index 5d488a6..eeac565 100644 > --- a/include/linux/socket.h > +++ b/include/linux/socket.h > @@ -130,6 +130,7 @@ static inline struct cmsghdr * cmsg_nxthdr (struct msghdr > *__msg, struct cmsghdr > #define SCM_RIGHTS 0x01/* rw: access rights (array of > int) */ > #define SCM_CREDENTIALS 0x02 /* rw: struct ucred */ > #define SCM_SECURITY 0x03/* rw: security label */ > +#define SCM_AUDIT0x04/* rw: struct uaudit*/ > > struct ucred { > __u32 pid; > @@ -137,6 +138,11 @@ struct ucred { > __u32 gid; > }; > > +struct uaudit { > + __u32 loginuid; > + __u32 sessionid; > +}; > + > /* Supported address families. 
*/ > #define AF_UNSPEC0 > #define AF_UNIX 1 /* Unix domain sockets */ > diff --git a/include/net/af_unix.h b/include/net/af_unix.h > index a175ba4..3b9d22a 100644 > --- a/include/net/af_unix.h > +++ b/include/net/af_unix.h > @@ -36,6 +36,8 @@ struct unix_skb_parms { > u32 secid; /* Security ID */ > #endif > u32 consumed; > + kuid_t loginuid; > + unsigned intsessionid; > }; > > #define UNIXCB(skb) (*(struct unix_skb_parms *)&((skb)->cb)) > diff --git a/include/net/scm.h b/include/net/scm.h > index 262532d..67de64f 100644 > --- a/include/net/scm.h > +++ b/include/net/scm.h > @@ -6,6 +6,7 @@ > #include > #include > #include > +#include > > /* Well, we should have at least one descriptor open > * to accept passed FDs 8) > @@ -18,6 +19,11 @@ struct scm_creds { > kgid_t gid; > }; > > +struct scm_audit { > + kuid_t loginuid; > + unsigned int sessionid; > +}; > + > struct scm_fp_list { > short count; > short max; > @@ -28,6 +34,7 @@ struct scm_cookie { > struct pid *pid; /* Skb credentials */ > struct scm_fp_list *fp;/* Passed files */ > struct scm_credscreds; /* Skb credentials */ > + struct scm_auditaudit; /* Skb audit*/ > #ifdef CONFIG_SECURITY_NETWORK > u32 secid; /* Passed security ID */ > #endif > @@ -58,6 +65,13 @@ static __inline__ void scm_set_cred(struct scm_cookie *scm, > scm->creds.gid = gid; > } > > +static inline void scm_set_audit(struct scm_cookie *scm, > + kuid_t loginuid, unsigned int sessionid) > +{ > + scm->audit.loginuid = loginuid; > + scm->audit.sessionid = sessionid; > +} > + > static __inline__ void scm_destroy_cred(struct scm_cookie *scm) > { > put_pid(scm->pid); > @@ -77,8 +91,12 @@ static __inline__ int scm_send(struct socket *sock, struct > msghdr *msg, > memset(scm, 0, sizeof(*scm)); > scm->creds.uid = INVALID_UID; > scm->creds.gid = INVALID_GID; > - if (forcecreds) > - scm_set_cred(scm, task_tgid(current), current_uid(), > current_gid()); > + if (forcecreds) { > + scm_set_cred(scm, task_tgid(current), current_uid(), > + current_gid()); > + 
scm_set_audit(scm, audit_get_loginuid(current), > + audit_get_sessionid(current)); > + } > unix_get_peersec_dgram(sock, scm); > if (msg->msg_controllen <= 0) > return 0; > @@ -123,7 +141,13 @@ static __inline__ void scm_recv(struct socket *sock, > struct msghdr *msg, > .uid = from_kuid_munged(current_ns, scm->creds.uid), > .gid =
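The proposed SCM_AUDIT rides on the same cmsg plumbing as the existing SCM_CREDENTIALS. For readers unfamiliar with that plumbing, here is a minimal userspace example of receiving kernel-filled credentials over an AF_UNIX socket pair (standard Linux API, not the patch's new type):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* With SO_PASSCRED set on the receiving end of an AF_UNIX socket, the
 * kernel attaches the sender's pid/uid/gid to every message as an
 * SCM_CREDENTIALS control message. Returns 0 and fills *out on success. */
static int recv_creds(int fd, struct ucred *out)
{
	char data;
	union {                    /* properly aligned control buffer */
		char buf[CMSG_SPACE(sizeof(struct ucred))];
		struct cmsghdr align;
	} u;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	if (recvmsg(fd, &msg, 0) < 0)
		return -1;
	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_CREDENTIALS) {
			memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

The patch's SCM_AUDIT would be consumed the same way, just with cmsg_type SCM_AUDIT and a struct uaudit payload, and crucially the loginuid/sessionid are captured at send time, avoiding the /proc/$PID race described in the changelog.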
Re: [STABLE] find missing bug fixes in a stable kernel
On Tue, Jan 14, 2014 at 09:37:22AM +0800, Li Zefan wrote: > On 2014/1/13 23:57, Greg Kroah-Hartman wrote: > > On Mon, Jan 13, 2014 at 03:28:11PM +0800, Li Zefan wrote: > >> We have several long-term and extended stable kernels, and it's possible > >> that a bug fix is in some stable versions but is missing in some other > >> versions, so I've written a script to find out those fixes. > >> > >> Take 3.4.xx and 3.2.xx for example. If a bug fix was merged into upstream > >> kernel after 3.4, and then it was backported to 3.2.xx, then it probably > >> needs to be backported to 3.4.xx. > > > > I agree. > > > >> The result is, there're ~430 bug fixes in 3.2.xx that probably need to be > >> backported to 3.4.xx. Given there're about 4500 commits in 3.2.xx, that > >> is ~10%, which is quite a big number for stable kernels. > > > > That's a really big number, how am I missing so many patches for the 3.4 > > kernel? Is it because people are doing backports to 3.2 for patches > > that didn't apply to 3.4? Or are these patches being applied that do > > not have -stable markings on them? Or something else? > > > > I guess the biggest reason is, most people tag a patch with stable without > specifying kernel versions, and if this patch can't be applied to 3.4, it > will be dropped silently. I guess Ben has been checking this kind of patches > manually. > > >> We (our team in Huawei) are going to go through the whole list to filter > >> out fixes that're applicable for 3.4.xx. Please do this, I don't have the resources at the moment to be able to go through all of these git ids to see if they really are relevant or not right now. Any help that you can provide with this for 3.4 or 3.10 would be most appreciated. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4 2/3] Send comm and cmdline in SCM_PROCINFO
On 14/01/13, Jan Kaluza wrote: > Server-like processes in many cases need credentials and other > metadata of the peer, to decide if the calling process is allowed to > request a specific action, or the server just wants to log away this > type of information for auditing tasks. > > The current practice to retrieve such process metadata is to look that > information up in procfs with the $PID received over SCM_CREDENTIALS. > This is sufficient for long-running tasks, but introduces a race which > cannot be worked around for short-living processes; the calling > process and all the information in /proc/$PID/ is gone before the > receiver of the socket message can look it up. > > This introduces a new SCM type called SCM_PROCINFO to allow the direct > attaching of "comm" and "cmdline" to SCM, which is significantly more > efficient and will reliably avoid the race with the round-trip over > procfs. > > To achieve that, new struct called unix_skb_parms_scm had to be created, > because otherwise unix_skb_parms would be too big. > > scm_get_current_procinfo is inspired by ./fs/proc/base.c. 
> > Signed-off-by: Jan Kaluza Acked-by: Richard Guy Briggs > --- > include/linux/socket.h | 2 ++ > include/net/af_unix.h | 11 +++-- > include/net/scm.h | 24 +++ > net/core/scm.c | 65 > ++ > net/unix/af_unix.c | 57 +-- > 5 files changed, 150 insertions(+), 9 deletions(-) > > diff --git a/include/linux/socket.h b/include/linux/socket.h > index eeac565..5a41f35 100644 > --- a/include/linux/socket.h > +++ b/include/linux/socket.h > @@ -131,6 +131,8 @@ static inline struct cmsghdr * cmsg_nxthdr (struct msghdr > *__msg, struct cmsghdr > #define SCM_CREDENTIALS 0x02 /* rw: struct ucred */ > #define SCM_SECURITY 0x03/* rw: security label */ > #define SCM_AUDIT0x04/* rw: struct uaudit*/ > +#define SCM_PROCINFO 0x05/* rw: comm + cmdline (NULL terminated > +array of char *) */ > > struct ucred { > __u32 pid; > diff --git a/include/net/af_unix.h b/include/net/af_unix.h > index 3b9d22a..05c7678 100644 > --- a/include/net/af_unix.h > +++ b/include/net/af_unix.h > @@ -27,6 +27,13 @@ struct unix_address { > struct sockaddr_un name[0]; > }; > > +struct unix_skb_parms_scm { > + kuid_t loginuid; > + unsigned int sessionid; > + char *procinfo; > + int procinfo_len; > +}; > + > struct unix_skb_parms { > struct pid *pid; /* Skb credentials */ > kuid_t uid; > @@ -36,12 +43,12 @@ struct unix_skb_parms { > u32 secid; /* Security ID */ > #endif > u32 consumed; > - kuid_t loginuid; > - unsigned intsessionid; > + struct unix_skb_parms_scm *scm; > }; > > #define UNIXCB(skb) (*(struct unix_skb_parms *)&((skb)->cb)) > #define UNIXSID(skb) (((skb)).secid) > +#define UNIXSCM(skb) (*(UNIXCB((skb)).scm)) > > #define unix_state_lock(s) spin_lock(_sk(s)->lock) > #define unix_state_unlock(s) spin_unlock(_sk(s)->lock) > diff --git a/include/net/scm.h b/include/net/scm.h > index 67de64f..f084e19 100644 > --- a/include/net/scm.h > +++ b/include/net/scm.h > @@ -30,11 +30,17 @@ struct scm_fp_list { > struct file *fp[SCM_MAX_FD]; > }; > > +struct scm_procinfo { > + char *procinfo; > + int len; > +}; > + > 
struct scm_cookie { > struct pid *pid; /* Skb credentials */ > struct scm_fp_list *fp;/* Passed files */ > struct scm_credscreds; /* Skb credentials */ > struct scm_auditaudit; /* Skb audit*/ > + struct scm_procinfo procinfo; /* Skb procinfo */ > #ifdef CONFIG_SECURITY_NETWORK > u32 secid; /* Passed security ID */ > #endif > @@ -45,6 +51,7 @@ void scm_detach_fds_compat(struct msghdr *msg, struct > scm_cookie *scm); > int __scm_send(struct socket *sock, struct msghdr *msg, struct scm_cookie > *scm); > void __scm_destroy(struct scm_cookie *scm); > struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl); > +int scm_get_current_procinfo(char **procinfo); > > #ifdef CONFIG_SECURITY_NETWORK > static __inline__ void unix_get_peersec_dgram(struct socket *sock, struct > scm_cookie *scm) > @@ -72,10 +79,20 @@ static inline void scm_set_audit(struct scm_cookie *scm, > scm->audit.sessionid = sessionid; > } > > +static inline void scm_set_procinfo(struct scm_cookie *scm, > + char *procinfo, int len) > +{ > + scm->procinfo.procinfo = procinfo; > + scm->procinfo.len = len; > +} > + > static __inline__ void scm_destroy_cred(struct scm_cookie *scm) > { >
linux-next: manual merge of the md tree with the block tree
Hi Neil, Today's linux-next merge of the md tree got a conflict in drivers/md/raid1.c between commit 4f024f3797c4 ("block: Abstract out bvec iterator") from the block tree and commit 41a336e01188 ("md/raid1: fix request counting bug in new 'barrier' code") from the md tree. I fixed it up (a line fixed in the latter was removed by the former) and can carry the fix as necessary (no action is required). -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [PATCH v4] Staging: comedi: convert while loop to timeout in ni_mio_common.c
On Tue, Jan 14, 2014 at 06:23:05PM -0600, Chase Southwood wrote: > This patch for ni_mio_common.c changes out a while loop for a timeout, > which is preferred. > > Signed-off-by: Chase Southwood > --- > > OK, here's another go at it. Hopefully everything looks more correct > this time. Greg, I've followed the pattern you gave me, and I really > appreciate all of the tips! As always, just let me know if there are > still things that need adjusting (especially length of timeout, udelay, > etc.). > > Thanks, > Chase Southwood > > 2: Changed from simple clean-up to swapping a timeout in for a while loop. > > 3: Removed extra counter variable, and added error checking. > > 4: No longer using counter variable, using jiffies instead. > > drivers/staging/comedi/drivers/ni_mio_common.c | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/drivers/staging/comedi/drivers/ni_mio_common.c > b/drivers/staging/comedi/drivers/ni_mio_common.c > index 457b884..882b249 100644 > --- a/drivers/staging/comedi/drivers/ni_mio_common.c > +++ b/drivers/staging/comedi/drivers/ni_mio_common.c > @@ -687,12 +687,21 @@ static void ni_clear_ai_fifo(struct comedi_device *dev) > { > const struct ni_board_struct *board = comedi_board(dev); > struct ni_private *devpriv = dev->private; > + unsigned long timeout; > > if (board->reg_type == ni_reg_6143) { > /* Flush the 6143 data FIFO */ > ni_writel(0x10, AIFIFO_Control_6143); /* Flush fifo */ > ni_writel(0x00, AIFIFO_Control_6143); /* Flush fifo */ > - while (ni_readl(AIFIFO_Status_6143) & 0x10) ; /* Wait for > complete */ > + /* Wait for complete */ > + timeout = jiffies + msecs_to_jiffies(100); > + while (ni_readl(AIFIFO_Status_6143) & 0x10) { > + if (time_after(jiffies, timeout)) { > + comedi_error(dev, "FIFO flush timeout."); > + break; > + } > + udelay(1); Sleep for at least 10, as I think that's the smallest time delay you can sleep for anyway (meaning it will be that long no matter what number you put there less than 10,
depending on the hardware used of course.) Other than that, looks much better. thanks, greg k-h
Re: [patch 7/9] mm: thrash detection-based file cache sizing
Hello On 01/15/2014 10:57 AM, Bob Liu wrote: > > On 01/15/2014 03:16 AM, Johannes Weiner wrote: >> On Tue, Jan 14, 2014 at 09:01:09AM +0800, Bob Liu wrote: >>> Hi Johannes, >>> >>> On 01/11/2014 02:10 AM, Johannes Weiner wrote: The VM maintains cached filesystem pages on two types of lists. One list holds the pages recently faulted into the cache, the other list holds pages that have been referenced repeatedly on that first list. The idea is to prefer reclaiming young pages over those that have shown to benefit from caching in the past. We call the recently used list "inactive list" and the frequently used list "active list". Currently, the VM aims for a 1:1 ratio between the lists, which is the "perfect" trade-off between the ability to *protect* frequently used pages and the ability to *detect* frequently used pages. This means that working set changes bigger than half of cache memory go undetected and thrash indefinitely, whereas working sets bigger than half of cache memory are unprotected against used-once streams that don't even need caching. >>> >>> Good job! This patch looks good to me and with nice descriptions. >>> But it seems that this patch only fix the issue "working set changes >>> bigger than half of cache memory go undetected and thrash indefinitely". >>> My concern is could it be extended easily to address all other issues >>> based on this patch set? >>> >>> The other possible way is something like Peter has implemented the CART >>> and Clock-Pro which I think may be better because of using advanced >>> algorithms and consider the problem as a whole from the beginning.(Sorry >>> I haven't get enough time to read the source code, so I'm not 100% sure.) >>> http://linux-mm.org/PeterZClockPro2 >> >> My patches are moving the VM towards something that is comparable to >> how Peter implemented Clock-Pro. However, the current VM has evolved >> over time in small increments based on real life performance >> observations. 
Rewriting everything in one go would be incredibly >> disruptive and I doubt very much we would merge any such proposal in >> the first place. So it's not like I don't see the big picture, it's >> just divide and conquer: >> >> Peter's Clock-Pro implementation was basically a double clock with an >> intricate system to classify hotness, augmented by eviction >> information to work with reuse distances independent of memory size. >> >> What we have right now is a double clock with a very rudimentary >> system to classify whether a page is hot: it has been accessed twice >> while on the inactive clock. My patches now add eviction information >> to this, and improve the classification so that it can work with reuse >> distances up to memory size and is no longer dependent on the inactive >> clock size. >> >> This is the smallest imaginable step that is still useful, and even >> then we had a lot of discussions about scalability of the data >> structures and confusion about how the new data point should be >> interpreted. It also took a long time until somebody read the series >> and went, "Ok, this actually makes sense to me." Now, maybe I suck at >> documenting, but maybe this is just complicated stuff. Either way, we >> have to get there collectively, so that the code is maintainable in >> the long term. >> >> Once we have these new concepts established, we can further improve >> the hotness detector so that it can classify and order pages with >> reuse distances beyond memory size. But this will come with its own >> set of problems. For example, some time ago we stopped regularly >> scanning and rotating active pages because of scalability issues, but >> we'll most likely need an uptodate estimate of the reuse distances on >> the active list in order to classify refaults properly. >> > > Thank you for your kind explanation. It makes sense to me, please feel > free to add my review. > + * Approximating inactive page access frequency - Observations: + * + * 1.
When a page is accessed for the first time, it is added to the + *head of the inactive list, slides every existing inactive page + *towards the tail by one slot, and pushes the current tail page + *out of memory. + * + * 2. When a page is accessed for the second time, it is promoted to + *the active list, shrinking the inactive list by one slot. This + *also slides all inactive pages that were faulted into the cache + *more recently than the activated page towards the tail of the + *inactive list. + * >>> >>> Nitpick, how about the reference bit? >> >> What do you mean? >> > > Sorry, I mean the PG_referenced flag. I thought when a page is accessed > for the second time only PG_referenced flag will be set instead of be > promoted to active list. > No. I try to explain a bit. For mapped file pages, if the second access occurs on a
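The double-list behavior discussed in this thread (first access lands a page at the head of the inactive list, a second access while it is still there promotes it to the active list) can be modeled with a small toy, nothing like the kernel's actual data structures:

```c
#include <assert.h>
#include <string.h>

/* Toy model of the two-list scheme: fixed-size arrays stand in for the
 * inactive and active LRU lists; index 0 is the head, the last slot is
 * the tail that gets evicted on overflow. */
#define LIST_MAX 4

struct lru {
	int inactive[LIST_MAX], n_inactive;
	int active[LIST_MAX], n_active;
};

static int on_list(const int *list, int n, int page)
{
	for (int i = 0; i < n; i++)
		if (list[i] == page)
			return 1;
	return 0;
}

/* Add at the head, sliding everything toward the tail; the tail page
 * falls out of memory when the list is full (observation 1 above). */
static void list_add_head(int *list, int *n, int page)
{
	if (*n == LIST_MAX)
		(*n)--;
	memmove(list + 1, list, *n * sizeof(int));
	list[0] = page;
	(*n)++;
}

static void access_page(struct lru *l, int page)
{
	if (on_list(l->active, l->n_active, page))
		return;  /* already classified hot */
	if (on_list(l->inactive, l->n_inactive, page)) {
		/* second access while still inactive: promote (obs. 2) */
		for (int i = 0; i < l->n_inactive; i++)
			if (l->inactive[i] == page) {
				memmove(l->inactive + i, l->inactive + i + 1,
					(l->n_inactive - i - 1) * sizeof(int));
				l->n_inactive--;
				break;
			}
		list_add_head(l->active, &l->n_active, page);
		return;
	}
	list_add_head(l->inactive, &l->n_inactive, page);  /* first access */
}
```

The thrashing case in the changelog is visible here: a working set larger than LIST_MAX cycles through the inactive list and evicts every page before its second access can ever promote it, which is exactly what the refault-distance tracking is meant to detect.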
linux-next: build failure after merge of the device-mapper tree
Hi all, After merging the device-mapper tree, today's linux-next build (powerpc ppc64_defconfig) failed like this: ERROR: ".dm_bufio_get_device_size" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_release" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_client_destroy" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_prefetch" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_set_minimum_buffers" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_forget" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_client_create" [drivers/md/dm-snapshot.ko] undefined! ERROR: ".dm_bufio_read" [drivers/md/dm-snapshot.ko] undefined! Presumably caused by commit b41bf7440bcf ("dm snapshot: use dm-bufio"). I have used the device-mapper tree from next-20140114 for today. -- Cheers, Stephen Rothwell s...@canb.auug.org.au
Re: [RFC] sysfs_rename_link() and its usage
On 2014/1/15 1:17, Veaceslav Falico wrote: > Hi, > > I'm hitting a strange issue and/or I'm completely lost in sysfs internals. > > Consider having two net_device *a, *b; which are registered normally. > Now, to create a link from /sys/class/net/a->name/linkname to b, one should > use: > > sysfs_create_link(&(a->dev.kobj), &(b->dev.kobj), linkname); > > To remove it, even simpler: > > sysfs_remove_link(&(a->dev.kobj), linkname); > > This works like a charm. However, if I want to use (obviously, with the > symlink present): > > sysfs_rename_link(&(a->dev.kobj), &(b->dev.kobj), oldname, newname); > > this fails with: > > "sysfs: ns invalid in 'a->name' for 'oldname'" > > in > > 608 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd, > ... > 615 if (!!sysfs_ns_type(parent_sd) != !!ns) { > 616 WARN(1, KERN_WARNING "sysfs: ns %s in '%s' for '%s'\n", > 617 sysfs_ns_type(parent_sd) ? "required" : > "invalid", > 618 parent_sd->s_name, name); > 619 return NULL; > 620 } > > Code path: > warn_slowpath_fmt+0x46/0x50 > sysfs_get_dirent_ns+0x30/0x80 > sysfs_find_dirent+0x84/0x110 > sysfs_get_dirent_ns+0x3e/0x80 > sysfs_rename_link_ns+0x54/0xd0 > > I have no idea what this code means. Is there any reason for it to > fail (i.e. am I doing something wrong?) or have I hit a bug? > > I've tested the only user of it (bridge) - and it works fine, however it's > not using its own net_device's kobject but rather its own dir. > I used sysfs_rename_link() the same way and hit the same problem. Reviewing the bridge code, I found that br->ifobj is created with kobject_create_and_add() as its own subdirectory; maybe that helps? Ding > Thank you!
Re: [PATCH 1/2 v2] ixgbe: define IXGBE_MAX_VFS_DRV_LIMIT macro and cleanup const 63
On Fri, 2013-12-27 at 01:02 -0800, Jeff Kirsher wrote: > On Wed, 2013-12-25 at 00:12 +0800, Ethan Zhao wrote: > > Because the ixgbe driver limits the max number of VF functions that can be > > enabled to 63, define one macro IXGBE_MAX_VFS_DRV_LIMIT and clean up the > > constant 63 in code. > > > > v2: fix a typo. > > > > Signed-off-by: Ethan Zhao > > --- > > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-- > > drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 5 +++-- > > drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h | 5 + > > 3 files changed, 10 insertions(+), 4 deletions(-) > > Added to my queue, thanks Ethan! Hi Ethan, Did Jeff contact you about this failing to compile? I'm currently providing vacation coverage for him and we found this was failing to compile just before he left. We captured the failure in our notes for this but there is no comment on whether you were contacted or not. Regardless, when I apply this patch (with or without 2/2) we get the following error on a compilation attempt: drivers/net/ethernet/intel/ixgbe/ixgbe_main.c: In function "ixgbe_sw_init": drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:5033: error: stray "\357" in program drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:5033: error: stray "\274" in program drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:5033: error: stray "\215" in program drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:5033: error: expected ")" before numeric constant drivers/net/ethernet/intel/ixgbe/ixgbe_main.c: In function "ixgbe_probe": drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:7977: error: stray "\357" in program drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:7977: error: stray "\274" in program drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:7977: error: stray "\215" in program drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:7977: error: expected ")" before numeric constant make[5]: *** [drivers/net/ethernet/intel/ixgbe/ixgbe_main.o] Error 1 make[5]: *** Waiting for unfinished jobs
make[4]: *** [drivers/net/ethernet/intel/ixgbe] Error 2 make[4]: *** Waiting for unfinished jobs make[3]: *** [drivers/net/ethernet/intel] Error 2 make[2]: *** [drivers/net/ethernet] Error 2 make[1]: *** [drivers/net] Error 2 make: *** [drivers] Error 2 Thanks, Aaron
Re: [PATCH] mm: nobootmem: avoid type warning about alignment value
On Sun, 12 Jan 2014, Russell King - ARM Linux wrote: > This patch makes their types match exactly with x86's definitions of > the same, which is the basic problem: on ARM, they all took "int" values > and returned "int"s, which leads to min() in nobootmem.c complaining. > > arch/arm/include/asm/bitops.h | 54 > +++ > 1 file changed, 44 insertions(+), 10 deletions(-) For the record: Acked-by: Nicolas Pitre The reason why macros were used at the time this was originally written is because gcc used to have issues forwarding the constant nature of a variable down multiple levels of inline functions and __builtin_constant_p() always returned false. But that was quite a long time ago. > diff --git a/arch/arm/include/asm/bitops.h b/arch/arm/include/asm/bitops.h > index e691ec91e4d3..b2e298a90d76 100644 > --- a/arch/arm/include/asm/bitops.h > +++ b/arch/arm/include/asm/bitops.h > @@ -254,25 +254,59 @@ static inline int constant_fls(int x) > } > > /* > - * On ARMv5 and above those functions can be implemented around > - * the clz instruction for much better code efficiency. > + * On ARMv5 and above those functions can be implemented around the > + * clz instruction for much better code efficiency. __clz returns > + * the number of leading zeros, zero input will return 32, and > + * 0x8000 will return 0. > */ > +static inline unsigned int __clz(unsigned int x) > +{ > + unsigned int ret; > + > + asm("clz\t%0, %1" : "=r" (ret) : "r" (x)); > > + return ret; > +} > + > +/* > + * fls() returns zero if the input is zero, otherwise returns the bit > + * position of the last set bit, where the LSB is 1 and MSB is 32. > + */ > static inline int fls(int x) > { > - int ret; > - > if (__builtin_constant_p(x)) > return constant_fls(x); > > - asm("clz\t%0, %1" : "=r" (ret) : "r" (x)); > - ret = 32 - ret; > - return ret; > + return 32 - __clz(x); > +} > + > +/* > + * __fls() returns the bit position of the last bit set, where the > + * LSB is 0 and MSB is 31. Zero input is undefined. 
> + */ > +static inline unsigned long __fls(unsigned long x) > +{ > + return fls(x) - 1; > +} > + > +/* > + * ffs() returns zero if the input was zero, otherwise returns the bit > + * position of the first set bit, where the LSB is 1 and MSB is 32. > + */ > +static inline int ffs(int x) > +{ > + return fls(x & -x); > +} > + > +/* > + * __ffs() returns the bit position of the first bit set, where the > + * LSB is 0 and MSB is 31. Zero input is undefined. > + */ > +static inline unsigned long __ffs(unsigned long x) > +{ > + return ffs(x) - 1; > } > > -#define __fls(x) (fls(x) - 1) > -#define ffs(x) ({ unsigned long __t = (x); fls(__t & -__t); }) > -#define __ffs(x) (ffs(x) - 1) > #define ffz(x) __ffs( ~(x) ) > > #endif > > > -- > FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up. Estimation > in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad. > Estimate before purchase was "up to 13.2Mbit".
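The identities the patch builds on, fls(x) = 32 - clz(x) with fls(0) = 0, and ffs(x) = fls(x & -x), can be checked in userspace with GCC's __builtin_clz (which, unlike the ARM clz instruction that returns 32 for zero input, is undefined for zero, hence the guard):

```c
#include <assert.h>

/* Userspace check of the bit-search identities from the patch, using
 * GCC's __builtin_clz in place of the ARM clz instruction. Note the
 * zero guard: __builtin_clz(0) is undefined, while ARM's clz returns
 * 32 for zero, which is why the kernel version needs no branch. */
static int my_fls(int x)
{
	return x ? 32 - __builtin_clz((unsigned int)x) : 0;
}

/* x & -x isolates the lowest set bit, so its fls position is exactly
 * the ffs position. */
static int my_ffs(int x)
{
	return my_fls(x & -x);
}
```

This mirrors the patch's point that once fls() is efficient (a single clz), ffs(), __fls(), and __ffs() all fall out of it for free.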