Re: 32-bit bug in iovec iterator changes
On Fri, Jun 20, 2014 at 11:51:44PM -0400, Theodore Ts'o wrote:
> On Fri, Jun 20, 2014 at 08:38:20AM +1000, Dave Chinner wrote:
> >
> > Short reads are more likely a bug in all the iovec iterator stuff
> > that got merged in from the vfs tree. ISTR a 32 bit-only bug in that
> > stuff go past in to do with not being able to partition a 32GB block
> > dev on a 32 bit system due to a 32 bit size_t overflow somewhere
>
> Dave Chinner called it.
>
> Al, I'm seeing a regression which shows up using a 32-bit x86 kernel.
> The symptoms of the bug is when run under KVM, with a 5 GB /dev/vdc
> virtual block device, a read at offset 2 ** 30 fails with a short
> read:
>
> # dd if=/dev/vdc of=/dev/null bs=4k skip=262144 count=1
> 0+0 records in
> 0+0 records out
> 0 bytes (0 B) copied, 0.0164144 s, 0.0 kB/s

Argh... ed include/linux/uio.h
[PATCH][BUGFIX] x86/reboot: Disable scheduler before disabling IO APIC
From: Fenghua Yu

During reboot, in the middle of disabling IO APIC, the scheduler may be
triggered by the per-cpu timer to do load balance. But since the kernel is
already in the process of shutting down and cannot execute the scheduler's
load balance at this point, it triggers an invalid TSS exception and hangs
during reboot.

This happens on some boards (e.g. AsRock ZT87 Extreme4 BIOS 2.70) in
32-bit kernel, reported in Bugzilla 76661 at
https://bugzilla.kernel.org/show_bug.cgi?id=76661

To fix the issue, we disable local irq including per cpu timer before
disabling IO APIC. By doing this, the scheduler will not disturb
disable_IO_APIC().

Signed-off-by: Fenghua Yu
Tested-by: berndku...@hotmail.com
---
 arch/x86/kernel/reboot.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 52b1157..16111c6 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -574,6 +574,15 @@ static void native_machine_emergency_restart(void)
 void native_machine_shutdown(void)
 {
 	/* Stop the cpus and apics */
+
+#ifdef CONFIG_SMP
+	/*
+	 * Disable the local irq to not receive the per-cpu timer interrupt
+	 * which may trigger scheduler's load balance.
+	 */
+	local_irq_disable();
+#endif
+
 #ifdef CONFIG_X86_IO_APIC
 	/*
 	 * Disabling IO APIC before local APIC is a workaround for
@@ -591,11 +600,8 @@ void native_machine_shutdown(void)
 
 #ifdef CONFIG_SMP
 	/*
-	 * Stop all of the others. Also disable the local irq to
-	 * not receive the per-cpu timer interrupt which may trigger
-	 * scheduler's load balance.
+	 * Stop all of the others.
 	 */
-	local_irq_disable();
 	stop_other_cpus();
 #endif
 
--
1.8.1.2
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] dma: imx-sdma: Add a new DMATYPE for Shared Peripheral ASRC
On Mon, Jun 16, 2014 at 11:31:05AM +0800, Nicolin Chen wrote:
> Shared Peripheral ASRC, running on SPBA, needs to use shp scripts for
> DMA transfer. So this patch just adds a new DMATYPE for it.
>
> Signed-off-by: Nicolin Chen

Acked-by: Shawn Guo
Re: [PATCH tip/core/rcu] Reduce overhead of cond_resched() checks for RCU
On Fri, Jun 20, 2014 at 07:59:58PM -0700, Paul E. McKenney wrote:
> Commit ac1bea85781e (Make cond_resched() report RCU quiescent states)
> fixed a problem where a CPU looping in the kernel with but one runnable
> task would give RCU CPU stall warnings, even if the in-kernel loop
> contained cond_resched() calls. Unfortunately, in so doing, it introduced
> performance regressions in Anton Blanchard's will-it-scale "open1" test.
> The problem appears to be not so much the increased cond_resched() path
> length as an increase in the rate at which grace periods complete, which
> increased per-update grace-period overhead.
>
> This commit takes a different approach to fixing this bug, mainly by
> moving the RCU-visible quiescent state from cond_resched() to
> rcu_note_context_switch(), and by further reducing the check to a
> simple non-zero test of a single per-CPU variable. However, this
> approach requires that the force-quiescent-state processing send
> resched IPIs to the offending CPUs. These will be sent only once
> the grace period has reached an age specified by the boot/sysfs
> parameter rcutree.jiffies_till_sched_qs, or once the grace period
> reaches an age halfway to the point at which RCU CPU stall warnings
> will be emitted, whichever comes first.
>
> Reported-by: Dave Hansen
> Signed-off-by: Paul E. McKenney
> Cc: Josh Triplett
> Cc: Andi Kleen
> Cc: Christoph Lameter
> Cc: Mike Galbraith
> Cc: Eric Dumazet

I like this approach *far* better. This is the kind of thing I had in
mind when I suggested using the fqs machinery: remove the poll entirely
and just thwack a CPU if it takes too long without a quiescent state.

Reviewed-by: Josh Triplett > --- > > b/Documentation/kernel-parameters.txt |6 + > b/include/linux/rcupdate.h| 36 > b/kernel/rcu/tree.c | 140 > +++--- > b/kernel/rcu/tree.h |6 + > b/kernel/rcu/tree_plugin.h|2 > b/kernel/rcu/update.c | 18 > b/kernel/sched/core.c |7 - > 7 files changed, 125 insertions(+), 90 deletions(-) > > diff --git a/Documentation/kernel-parameters.txt > b/Documentation/kernel-parameters.txt > index 6eaa9cdb7094..910c3829f81d 100644 > --- a/Documentation/kernel-parameters.txt > +++ b/Documentation/kernel-parameters.txt > @@ -2785,6 +2785,12 @@ bytes respectively. Such letter suffixes can also be > entirely omitted. > leaf rcu_node structure. Useful for very large > systems. > > + rcutree.jiffies_till_sched_qs= [KNL] > + Set required age in jiffies for a > + given grace period before RCU starts > + soliciting quiescent-state help from > + rcu_note_context_switch(). > + > rcutree.jiffies_till_first_fqs= [KNL] > Set delay from grace-period initialization to > first attempt to force quiescent states. > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > index 5a75d19aa661..243aa4656cb7 100644 > --- a/include/linux/rcupdate.h > +++ b/include/linux/rcupdate.h > @@ -44,7 +44,6 @@ > #include > #include > #include > -#include > #include > > extern int rcu_expedited; /* for sysctl */ > @@ -300,41 +299,6 @@ bool __rcu_is_watching(void); > #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) > || defined(CONFIG_SMP) */ > > /* > - * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings. > - */ > - > -#define RCU_COND_RESCHED_LIM 256 /* ms vs. 100s of ms. */ > -DECLARE_PER_CPU(int, rcu_cond_resched_count); > -void rcu_resched(void); > - > -/* > - * Is it time to report RCU quiescent states? > - * > - * Note unsynchronized access to rcu_cond_resched_count. Yes, we might > - * increment some random CPU's count, and possibly also load the result from > - * yet another CPU's count. 
We might even clobber some other CPU's attempt > - * to zero its counter. This is all OK because the goal is not precision, > - * but rather reasonable amortization of rcu_note_context_switch() overhead > - * and extremely high probability of avoiding RCU CPU stall warnings. > - * Note that this function has to be preempted in just the wrong place, > - * many thousands of times in a row, for anything bad to happen. > - */ > -static inline bool rcu_should_resched(void) > -{ > - return raw_cpu_inc_return(rcu_cond_resched_count) >= > -RCU_COND_RESCHED_LIM; > -} > - > -/* > - * Report quiscent states to RCU if it is time to do so. > - */ > -static inline void rcu_cond_resched(void) > -{ > - if (unlikely(rcu_should_resched())) > - rcu_resched(); > -} > - > -/* > * Infrastructure to implement the synchronize_() primitives in > * TREE_RCU and rcu_barrier_() primitives in TINY_RCU. > */ > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >
Re: [PATCH] regulator: palmas: Fix SMPS enable/disable/is_enabled
On Sat, Jun 21, 2014 at 2:26 AM, Nishanth Menon wrote:
> We use regmap regulator ops to enable/disable and check if regulator
> is enabled for various SMPS. However, these depend on valid
> enable_reg, enable_mask and enable_value in regulator descriptor.
>
> Currently we do not populate these for SMPS other than SMPS10, this
> results in spurious results as regmap assumes that the values are
> valid and ends up reading register 0x0 RTC:SECONDS_REG on Palmas
> variants that do have RTC! To fix this, we update proper parameters
> for the descriptor fields.
>
> Further, we want to ensure the behavior consistent with logic
> prior to commit dbabd624d4eec50b6, where, once you do a set_mode,
> enable/disable ensure the logic remains consistent and configures
> Palmas to the configuration that we set with set_mode (since the
> configuration register is common). To do this, we can rely on the
> regulator core's regulator_register behavior where the regulator
> descriptor pointer provided by the regulator driver is stored. (no
> reallocation and copy is done). This lets us update the enable_value
> post registration, to remain consistent with the mode we configure as
> part of set_mode.

Tested-by: Alexandre Courbot

Thanks!
[PATCH] direct-io: squelch maybe-uninitialized warning in do_direct_IO()
The following warnings:

  fs/direct-io.c: In function ‘__blockdev_direct_IO’:
  fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  fs/direct-io.c:913:16: note: ‘to’ was declared here
  fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  fs/direct-io.c:913:10: note: ‘from’ was declared here

are not necessary because dio_get_page() either fails, or sets both
'from' and 'to'.

Make the compiler happy so we can more easily detect legitimate warnings.

Signed-off-by: Jason Cooper
---
 fs/direct-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 98040ba388ac..c0a9854d2bc7 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -910,7 +910,8 @@ static int do_direct_IO(struct dio *dio, struct dio_submit *sdio,
 
 	while (sdio->block_in_file < sdio->final_block_in_request) {
 		struct page *page;
-		size_t from, to;
+		size_t from = 0;
+		size_t to = 0;
 		page = dio_get_page(dio, sdio, &from, &to);
 		if (IS_ERR(page)) {
 			ret = PTR_ERR(page);
--
2.0.0
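For readers unfamiliar with this class of warning, here is a minimal user-space sketch of the pattern the patch describes. It is not the fs/direct-io.c code: get_range() and range_len() are hypothetical names made up for illustration. The callee either fails or sets both outputs, which the compiler cannot always prove, so initializing at the declaration silences -Wmaybe-uninitialized without changing behaviour.

```c
#include <stddef.h>

/*
 * Hypothetical callee (illustration only): returns -1 on failure, or 0
 * after setting BOTH *from and *to.
 */
static int get_range(size_t len, size_t *from, size_t *to)
{
	if (len == 0)
		return -1;	/* failure: outputs are left untouched */
	*from = 0;
	*to = len;
	return 0;
}

/* Returns the length of the range, or 0 if get_range() failed. */
static size_t range_len(size_t len)
{
	size_t from = 0;	/* initialized only to keep the compiler quiet */
	size_t to = 0;

	if (get_range(len, &from, &to) < 0)
		return 0;
	return to - from;
}
```

The trade-off is the usual one: the initializers cost nothing measurable, but they also hide the warning if a later change really does leave one of the variables unset.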
[v3.10-rt / v3.12-rt] scheduling while atomic in cgroup code
Hi.

Call Trace:
[e22d5a90] [c0007ea8] show_stack+0x4c/0x168 (unreliable)
[e22d5ad0] [c0618c04] __schedule_bug+0x94/0xb0
[e22d5ae0] [c060b9ec] __schedule+0x530/0x550
[e22d5bf0] [c060bacc] schedule+0x30/0xbc
[e22d5c00] [c060ca24] rt_spin_lock_slowlock+0x180/0x27c
[e22d5c70] [c00b39dc] res_counter_uncharge_until+0x40/0xc4
[e22d5ca0] [c013ca88] drain_stock.isra.20+0x54/0x98
[e22d5cc0] [c01402ac] __mem_cgroup_try_charge+0x2e8/0xbac
[e22d5d70] [c01410d4] mem_cgroup_charge_common+0x3c/0x70
[e22d5d90] [c0117284] __do_fault+0x38c/0x510
[e22d5df0] [c011a5f4] handle_pte_fault+0x98/0x858
[e22d5e50] [c060ed08] do_page_fault+0x42c/0x6fc
[e22d5f40] [c000f5b4] handle_page_fault+0xc/0x80

What happens:

- refill_stock() calls get_cpu_var() and thus disables preemption until
  matching put_cpu_var() is called,
- then it calls drain_stock() -> res_counter_uncharge() ->
  res_counter_uncharge_until()
- and here we have spin_lock(), which under RT can sleep.

Thus we have sleeping with preemption disabled.

Any ideas how to fix?
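The sequence described in this report can be illustrated with a toy user-space model. Everything prefixed toy_ is hypothetical and exists only for this sketch; none of it is kernel API. The model assumes a single preemption counter and treats "taking a lock that may sleep while the counter is non-zero" as the bug condition.

```c
#include <stdbool.h>

/* Toy model of a per-task preemption counter (not the kernel's). */
static int toy_preempt_count;

static void toy_preempt_disable(void) { toy_preempt_count++; }
static void toy_preempt_enable(void)  { toy_preempt_count--; }

/*
 * Toy stand-in for a lock that may sleep, such as spin_lock() under RT.
 * Returns false when it is taken with preemption disabled, which is the
 * situation the report calls "scheduling while atomic".
 */
static bool toy_sleeping_lock_ok(void)
{
	return toy_preempt_count == 0;
}

/* Mirrors the reported call order: disable preemption, then lock. */
static bool toy_refill_stock(void)
{
	bool ok;

	toy_preempt_disable();		/* get_cpu_var() in the report */
	ok = toy_sleeping_lock_ok();	/* spin_lock() under RT */
	toy_preempt_enable();		/* put_cpu_var() in the report */
	return ok;
}
```

In this model toy_refill_stock() always reports the problem, because the lock is taken inside the preemption-disabled window; taking it outside that window would not.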
32-bit bug in iovec iterator changes
On Fri, Jun 20, 2014 at 08:38:20AM +1000, Dave Chinner wrote:
>
> Short reads are more likely a bug in all the iovec iterator stuff
> that got merged in from the vfs tree. ISTR a 32 bit-only bug in that
> stuff go past in to do with not being able to partition a 32GB block
> dev on a 32 bit system due to a 32 bit size_t overflow somewhere

Dave Chinner called it.

Al, I'm seeing a regression which shows up using a 32-bit x86 kernel.
The symptom of the bug is that, when run under KVM with a 5 GB /dev/vdc
virtual block device, a read at offset 2 ** 30 fails with a short
read:

# dd if=/dev/vdc of=/dev/null bs=4k skip=262144 count=1
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.0164144 s, 0.0 kB/s

On a 3.15 kernel, this command works:

# dd if=/dev/vdc of=/dev/null bs=4k skip=262144 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0457984 s, 89.4 kB/s

I tried bisecting it, but unfortunately the iovec iterator changes are
not cleanly bisectable, since copy_page_from_iter() gets introduced
some two dozen patches before it gets defined. :-(

However, the bisect leads quite squarely to the iovec iterator
patches. Al, I'd appreciate it if you could take a look?

Thanks!!

- Ted % git bisect start # good: [1860e379875dfe7271c649058aeddffe5afd9d0d] Linux 3.15 git bisect good 1860e379875dfe7271c649058aeddffe5afd9d0d # bad: [7171511eaec5bf23fb06078f59784a3a0626b38f] Linux 3.16-rc1 git bisect bad 7171511eaec5bf23fb06078f59784a3a0626b38f # good: [aaeb2554337217dfa4eac2fcc90da7be540b9a73] Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into next git bisect good aaeb2554337217dfa4eac2fcc90da7be540b9a73 # bad: [16b9057804c02e2d351e9c8f606e909b43cbd9e7] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs git bisect bad 16b9057804c02e2d351e9c8f606e909b43cbd9e7 # good: [82abb273d838318424644d8f02825db0fbbd400a] Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus git bisect good 82abb273d838318424644d8f02825db0fbbd400a # good: [d1e1cda862c16252087374ac75949b0e89a5717e] Merge tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs git bisect good d1e1cda862c16252087374ac75949b0e89a5717e # good: [23d4ed53b7342bf5999b3ea227d9f69e75e5a625] Merge branch 'for-linus' of git://git.kernel.dk/linux-block git bisect good 23d4ed53b7342bf5999b3ea227d9f69e75e5a625 # good: [2840c566e95599cd60c7143762ca8b49d9395050] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs git bisect good 2840c566e95599cd60c7143762ca8b49d9395050 # good: [4251c2a67011801caecd63671f26dd8c9aedb24c] Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux git bisect good 4251c2a67011801caecd63671f26dd8c9aedb24c # skip: [3dae8750c368f8ac11c3c8c2a28f56dcee865c01] cifs: switch to ->write_iter() git bisect skip 3dae8750c368f8ac11c3c8c2a28f56dcee865c01 # good: [5c02c392cd2320e8d612376d6b72b6548a680923] Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux git bisect good 5c02c392cd2320e8d612376d6b72b6548a680923 # bad: 
[5f073850602084fbcbb987948ff3e70ae273f7d2] kill generic_file_splice_write() git bisect bad 5f073850602084fbcbb987948ff3e70ae273f7d2 # good: [38583f095c5a8138ae2a1c9173d0fd8a9f10e8aa] Merge branch 'akpm' (incoming from Andrew) git bisect good 38583f095c5a8138ae2a1c9173d0fd8a9f10e8aa # good: [f5ccfe1ddbaf9d923a3ebdadcb1e5e32d83e9c28] ext4: fix locking for O_APPEND writes git bisect good f5ccfe1ddbaf9d923a3ebdadcb1e5e32d83e9c28 # bad: [f0d1bec9d58d4c038d0ac958c9af82be6eb18045] new helper: copy_page_from_iter() git bisect bad f0d1bec9d58d4c038d0ac958c9af82be6eb18045 % (unset DISPLAY; git bisect visualize) commit f0d1bec9d58d4c038d0ac958c9af82be6eb18045 Author: Al Viro Date: Thu Apr 3 15:05:18 2014 -0400 new helper: copy_page_from_iter() parallel to copy_page_to_iter(). pipe_write() switched to it (and became ->write_iter()). Signed-off-by: Al Viro commit 84c3d55cc474f9c234c023c92e2769f940d5548c Author: Al Viro Date: Thu Apr 3 14:33:23 2014 -0400 fuse: switch to ->write_iter() Signed-off-by: Al Viro commit b30ac0fc4109701fc122d41ee085c65b52dc44a3 Author: Al Viro Date: Thu Apr 3 14:29:04 2014 -0400 btrfs: switch to ->write_iter() Signed-off-by: Al Viro commit 3ef045c3d8ae8550abbfd44074efce6ff642cc86 Author: Al Viro Date: Thu Apr 3 14:25:22 2014 -0400 ocfs2: switch to ->write_iter() Signed-off-by: Al Viro commit bf97f3bc0c32140c43fe5ca53d23514ea46a54ca Author: Al Viro Date: Thu Apr 3 14:20:23 2014 -0400 xfs: switch to ->write_iter() Signed-off-by: Al Viro commit 50b5551d1719c8bce60c6d4027b814cfc72c2307 Author: Al Viro Date: Thu Apr 3 14:13:46 2014 -0400 afs: switch to ->write_iter() Signed-off-by: Al Viro
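The short read reported above is consistent with a 64-bit byte count being narrowed to a 32-bit size_t somewhere along the read path. The following is a minimal user-space sketch of that effect, not the kernel code. It assumes the device is exactly 5 GiB and that the narrowed quantity is the number of bytes remaining after the read offset; both are assumptions, as is the helper naming, and uint32_t merely stands in for size_t on a 32-bit kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for size_t on a 32-bit kernel (assumption for illustration). */
typedef uint32_t size32_t;

/*
 * Hypothetical helper: bytes left between pos and the end of the device,
 * narrowed to 32 bits the way an implicit conversion to size_t would.
 */
static size32_t remaining_as_size32(uint64_t dev_size, uint64_t pos)
{
	assert(pos <= dev_size);
	return (size32_t)(dev_size - pos);	/* keeps only the low 32 bits */
}

/* Same computation kept in 64 bits, so nothing is lost. */
static uint64_t remaining_as_u64(uint64_t dev_size, uint64_t pos)
{
	assert(pos <= dev_size);
	return dev_size - pos;
}
```

With dev_size = 5ULL << 30 and pos = 1ULL << 30 (bs=4k, skip=262144), the remaining count is 2^32 bytes, whose low 32 bits are all zero. A 4096-byte request clamped to the narrowed count would therefore shrink to zero bytes, matching the "0+0 records in" output, while the 64-bit version leaves the request intact.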
Re: [PATCH] include/trace/syscall.h: Use HAVE_SYSCALL_TRACEPOINTS instead of TRACEPOINTS
On Sat, 21 Jun 2014 10:32:37 +0800 Chen Gang wrote:

> diff --git a/include/trace/syscall.h b/include/trace/syscall.h
> index 291c282..a709cbd 100644
> --- a/include/trace/syscall.h
> +++ b/include/trace/syscall.h
> @@ -33,7 +33,7 @@ struct syscall_metadata {
>  	struct ftrace_event_call *exit_event;
>  };
>
> -#ifdef CONFIG_TRACEPOINTS
> +#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
>  static inline void syscall_tracepoint_update(struct task_struct *p)
>  {
>  	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))

This has already been fixed and is in my for-next branch getting ready
to be pushed.

-- Steve
Re: [PATCH] Check for Null return of function of affs_bread in function affs_truncate
I don't think that it's a good idea; in that case I would recommend
either leaving this bug open or closing it, as there doesn't seem to be a
good way of testing this.

Cheers
Nick

On Fri, Jun 20, 2014 at 11:09 PM, Andrew Morton wrote:
> On Fri, 20 Jun 2014 22:55:07 -0400 Nick Krause wrote:
>
>> On Fri, Jun 20, 2014 at 10:38 PM, Andrew Morton
>> wrote:
>> > On Fri, 20 Jun 2014 22:25:47 -0400 Nick Krause wrote:
>> >
>> >> If you have any ideas about what is better
>> >> please let me know.
>> >
>> > I think the proposed patch was not a good one - it will cause truncate
>> > to silently return, probably leaving the fs in an inconsistent state.
>> > Neither the user nor the running application know this happened so they
>> > will just keep on modifying the filesystem, possibly mangling it
>> > further.
>> >
>> > The code as it stands at present is better - if bread() fails we'll get
>> > a nice solid oops and the current app will be terminated (at least).
>> > As we're in truncate it's quite possible that the entire fs will get
>> > wedged up due to now-permanently-held i_mutex, which is even better.
>> >
>> >
>> > As for the best fix, umm, hard. We're pretty screwed if we cannot read
>> > that block at this code site. Perhaps emit loud printks, forcibly turn
>> > the fs read-only then return -EIO/-ENOMEM/etc from the truncate. Such
>> > a change would require runtime testing, with some form of developer fault
>> > injection.
>>
>> Fair enough if somebody is running this file system I would be
>> happy to have someone test my code in order to fix this.
>
> (top-posting repaired - please don't top-post!)
>
> It's going to be hard to find such a person. As mkfs.affs doesn't
> appear to exist (?) your best bet would be to find someone who has an
> Amiga, get them to create a new fs for you (via loopback-on-file) then
> gzip the underlying file and send it to you. You can then use that fs
> image file as many times as you want via loopback or straight onto a
> disk. Make sure the image file is zeroed out first so it compresses
> well.
>
> Or something like that.
Re: [PATCH] Check for Null return of function of affs_bread in function affs_truncate
On Fri, 20 Jun 2014 22:55:07 -0400 Nick Krause wrote:

> On Fri, Jun 20, 2014 at 10:38 PM, Andrew Morton
> wrote:
> > On Fri, 20 Jun 2014 22:25:47 -0400 Nick Krause wrote:
> >
> >> If you have any ideas about what is better
> >> please let me know.
> >
> > I think the proposed patch was not a good one - it will cause truncate
> > to silently return, probably leaving the fs in an inconsistent state.
> > Neither the user nor the running application know this happened so they
> > will just keep on modifying the filesystem, possibly mangling it
> > further.
> >
> > The code as it stands at present is better - if bread() fails we'll get
> > a nice solid oops and the current app will be terminated (at least).
> > As we're in truncate it's quite possible that the entire fs will get
> > wedged up due to now-permanently-held i_mutex, which is even better.
> >
> >
> > As for the best fix, umm, hard. We're pretty screwed if we cannot read
> > that block at this code site. Perhaps emit loud printks, forcibly turn
> > the fs read-only then return -EIO/-ENOMEM/etc from the truncate. Such
> > a change would require runtime testing, with some form of developer fault
> > injection.
>
> Fair enough if somebody is running this file system I would be
> happy to have someone test my code in order to fix this.

(top-posting repaired - please don't top-post!)

It's going to be hard to find such a person. As mkfs.affs doesn't
appear to exist (?) your best bet would be to find someone who has an
Amiga, get them to create a new fs for you (via loopback-on-file) then
gzip the underlying file and send it to you. You can then use that fs
image file as many times as you want via loopback or straight onto a
disk. Make sure the image file is zeroed out first so it compresses
well.

Or something like that.
[PATCH] staging: vt6655: remove header declarations for static functions
The functions iwctl_giwscan() and iwctl_siwscan() are only referenced
within iwctl.c -- so, remove their function declarations from iwctl.h
and mark these functions as static.

Signed-off-by: James A Shackleford
---
 drivers/staging/vt6655/iwctl.c |4 ++--
 drivers/staging/vt6655/iwctl.h | 10 --
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/vt6655/iwctl.c b/drivers/staging/vt6655/iwctl.c
index ba50d7f..747d723 100644
--- a/drivers/staging/vt6655/iwctl.c
+++ b/drivers/staging/vt6655/iwctl.c
@@ -129,7 +129,7 @@ int iwctl_giwname(struct net_device *dev,
  * Wireless Handler : set scan
  */
 
-int iwctl_siwscan(struct net_device *dev,
+static int iwctl_siwscan(struct net_device *dev,
 		  struct iw_request_info *info,
 		  struct iw_point *wrq,
 		  char *extra)
@@ -190,7 +190,7 @@ int iwctl_siwscan(struct net_device *dev,
  * Wireless Handler : get scan results
  */
 
-int iwctl_giwscan(struct net_device *dev,
+static int iwctl_giwscan(struct net_device *dev,
 		  struct iw_request_info *info,
 		  struct iw_point *wrq,
 		  char *extra)
diff --git a/drivers/staging/vt6655/iwctl.h b/drivers/staging/vt6655/iwctl.h
index 10564b4..de0a337 100644
--- a/drivers/staging/vt6655/iwctl.h
+++ b/drivers/staging/vt6655/iwctl.h
@@ -161,16 +161,6 @@ int iwctl_giwpower(struct net_device *dev,
 		   struct iw_param *wrq,
 		   char *extra);
 
-int iwctl_giwscan(struct net_device *dev,
-		  struct iw_request_info *info,
-		  struct iw_point *wrq,
-		  char *extra);
-
-int iwctl_siwscan(struct net_device *dev,
-		  struct iw_request_info *info,
-		  struct iw_point *wrq,
-		  char *extra);
-
 //2008-0409-07, by Einsn Liu
 #ifdef WPA_SUPPLICANT_DRIVER_WEXT_SUPPORT
 int iwctl_siwauth(struct net_device *dev,
--
1.7.9.5
[PATCH tip/core/rcu] Reduce overhead of cond_resched() checks for RCU
Commit ac1bea85781e (Make cond_resched() report RCU quiescent states) fixed a problem where a CPU looping in the kernel with but one runnable task would give RCU CPU stall warnings, even if the in-kernel loop contained cond_resched() calls. Unfortunately, in so doing, it introduced performance regressions in Anton Blanchard's will-it-scale "open1" test. The problem appears to be not so much the increased cond_resched() path length as an increase in the rate at which grace periods complete, which increased per-update grace-period overhead. This commit takes a different approach to fixing this bug, mainly by moving the RCU-visible quiescent state from cond_resched() to rcu_note_context_switch(), and by further reducing the check to a simple non-zero test of a single per-CPU variable. However, this approach requires that the force-quiescent-state processing send resched IPIs to the offending CPUs. These will be sent only once the grace period has reached an age specified by the boot/sysfs parameter rcutree.jiffies_till_sched_qs, or once the grace period reaches an age halfway to the point at which RCU CPU stall warnings will be emitted, whichever comes first. Reported-by: Dave Hansen Signed-off-by: Paul E. McKenney Cc: Josh Triplett Cc: Andi Kleen Cc: Christoph Lameter Cc: Mike Galbraith Cc: Eric Dumazet --- b/Documentation/kernel-parameters.txt |6 + b/include/linux/rcupdate.h| 36 b/kernel/rcu/tree.c | 140 +++--- b/kernel/rcu/tree.h |6 + b/kernel/rcu/tree_plugin.h|2 b/kernel/rcu/update.c | 18 b/kernel/sched/core.c |7 - 7 files changed, 125 insertions(+), 90 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 6eaa9cdb7094..910c3829f81d 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2785,6 +2785,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. leaf rcu_node structure. Useful for very large systems. 
+ rcutree.jiffies_till_sched_qs= [KNL] + Set required age in jiffies for a + given grace period before RCU starts + soliciting quiescent-state help from + rcu_note_context_switch(). + rcutree.jiffies_till_first_fqs= [KNL] Set delay from grace-period initialization to first attempt to force quiescent states. diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 5a75d19aa661..243aa4656cb7 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -44,7 +44,6 @@ #include #include #include -#include #include extern int rcu_expedited; /* for sysctl */ @@ -300,41 +299,6 @@ bool __rcu_is_watching(void); #endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_RCU_TRACE) || defined(CONFIG_SMP) */ /* - * Hooks for cond_resched() and friends to avoid RCU CPU stall warnings. - */ - -#define RCU_COND_RESCHED_LIM 256 /* ms vs. 100s of ms. */ -DECLARE_PER_CPU(int, rcu_cond_resched_count); -void rcu_resched(void); - -/* - * Is it time to report RCU quiescent states? - * - * Note unsynchronized access to rcu_cond_resched_count. Yes, we might - * increment some random CPU's count, and possibly also load the result from - * yet another CPU's count. We might even clobber some other CPU's attempt - * to zero its counter. This is all OK because the goal is not precision, - * but rather reasonable amortization of rcu_note_context_switch() overhead - * and extremely high probability of avoiding RCU CPU stall warnings. - * Note that this function has to be preempted in just the wrong place, - * many thousands of times in a row, for anything bad to happen. - */ -static inline bool rcu_should_resched(void) -{ - return raw_cpu_inc_return(rcu_cond_resched_count) >= - RCU_COND_RESCHED_LIM; -} - -/* - * Report quiscent states to RCU if it is time to do so. 
- */ -static inline void rcu_cond_resched(void) -{ - if (unlikely(rcu_should_resched())) - rcu_resched(); -} - -/* * Infrastructure to implement the synchronize_() primitives in * TREE_RCU and rcu_barrier_() primitives in TINY_RCU. */ diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index f1ba77363fbb..7d711f9a2e86 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -206,6 +206,70 @@ void rcu_bh_qs(int cpu) rdp->passed_quiesce = 1; } +static DEFINE_PER_CPU(int, rcu_sched_qs_mask); + +static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = { + .dynticks_nesting = DYNTICK_TASK_EXIT_IDLE, + .dynticks = ATOMIC_INIT(1), +#ifdef CONFIG_NO_HZ_FULL_SYSIDLE + .dynticks_idle_nesting = DYNTICK_TASK_NEST_VALUE, + .dynticks_idle = ATOMIC_INIT(1),
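The "whichever comes first" rule in the commit message above can be sketched in a few lines of user-space C. This only illustrates the stated rule and is not the code from kernel/rcu/tree.c: the function names and the stall_timeout parameter are hypothetical, and all ages are assumed to be in jiffies.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: grace-period age at which resched IPIs would be
 * considered, i.e. the smaller of the jiffies_till_sched_qs tunable and
 * half of the RCU CPU stall-warning timeout.
 */
static unsigned long resched_ipi_age(unsigned long jiffies_till_sched_qs,
				     unsigned long stall_timeout)
{
	unsigned long half_stall = stall_timeout / 2;

	return jiffies_till_sched_qs < half_stall ?
		jiffies_till_sched_qs : half_stall;
}

/* True once the grace period is old enough to warrant a resched IPI. */
static bool should_send_resched_ipi(unsigned long gp_age,
				    unsigned long jiffies_till_sched_qs,
				    unsigned long stall_timeout)
{
	return gp_age >= resched_ipi_age(jiffies_till_sched_qs, stall_timeout);
}
```

For example, with a tunable of 800 and a stall timeout of 1000, the halfway point (500) is reached first, so the IPI would be considered from age 500 onward.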
Re: [PATCH V3 16/16] irqchip: crossbar: allow for quirky hardware with direct hardwiring of GIC
On Mon, Jun 16, 2014 at 04:53:16PM +0530, Sricharan R wrote: > From: Nishanth Menon > > On certain platforms such as DRA7, SPIs 0, 1, 2, 3, 5, 6, 10, 131, > 132, 133 are direct wired to hardware blocks bypassing crossbar. > This quirky implementation is *NOT* supposed to be the expectation > of crossbar hardware usage. However, these are already marked in our > description of the hardware with SKIP and RESERVED where appropriate. > > Unfortunately, we need to be able to refer to these hardwired IRQs. > So, to request these, crossbar driver can use the existing information > from it's table that these SKIP/RESERVED maps are direct wired sources > and generic allocation/programming of crossbar should be avoided. > > Signed-off-by: Nishanth Menon > Signed-off-by: Sricharan R > --- > .../devicetree/bindings/arm/omap/crossbar.txt | 12 ++-- > drivers/irqchip/irq-crossbar.c | 20 > ++-- > 2 files changed, 28 insertions(+), 4 deletions(-) > > diff --git a/Documentation/devicetree/bindings/arm/omap/crossbar.txt > b/Documentation/devicetree/bindings/arm/omap/crossbar.txt > index 8210ea4..438ccab 100644 > --- a/Documentation/devicetree/bindings/arm/omap/crossbar.txt > +++ b/Documentation/devicetree/bindings/arm/omap/crossbar.txt > @@ -42,8 +42,10 @@ Documentation/devicetree/bindings/arm/gic.txt for further > details. > > An interrupt consumer on an SoC using crossbar will use: > interrupts = > -request number shall be between 0 to that described by > -"ti,max-crossbar-sources" > +When the request number is between 0 to that described by > +"ti,max-crossbar-sources", it is assumed to be a crossbar mapping. If the > +request_number is greater than "ti,max-crossbar-sources", then it is mapped > as a > +quirky hardware mapping direct to GIC. > > Example: > device_x@0x4a023000 { > @@ -51,3 +53,9 @@ Example: > interrupts = ; > ... 
>  };
> +
> +	device_y@0x4a033000 {
> +		/* Direct mapped GIC SPI 1 used */
> +		interrupts = ;

Ideally, I'd like to see a macro here so that it's clear that we crossed
a magic threshold. eg:

#define MAX_SOURCES 400
#define DIRECT_IRQ(irq) (MAX_SOURCES + irq)

...
	interrupts = ;

and, then:

	ti,max-crossbar-sources = ;

> +		...
> +	};
> diff --git a/drivers/irqchip/irq-crossbar.c b/drivers/irqchip/irq-crossbar.c
> index ef613c4..fff6218 100644
> --- a/drivers/irqchip/irq-crossbar.c
> +++ b/drivers/irqchip/irq-crossbar.c
> @@ -86,8 +86,13 @@ static inline int allocate_free_irq(int cb_no)
>
>  static inline bool needs_crossbar_write(irq_hw_number_t hw)
>  {
> -	if (hw > GIC_IRQ_START)
> -		return true;
> +	int cb_no;
> +
> +	if (hw > GIC_IRQ_START) {
> +		cb_no = cb->irq_map[hw - GIC_IRQ_START];
> +		if (cb_no != IRQ_RESERVED && cb_no != IRQ_SKIP)
> +			return true;
> +	}
>
>  	return false;
>  }
> @@ -130,8 +135,19 @@ static int crossbar_domain_xlate(struct irq_domain *d,
>  {
>  	int ret;
>  	int req_num = intspec[1];
> +	int direct_map_num;
>
>  	if (req_num >= cb->max_crossbar_sources) {
> +		direct_map_num = req_num - cb->max_crossbar_sources;
> +		if (direct_map_num < cb->int_max) {
> +			ret = cb->irq_map[direct_map_num];
> +			if (ret == IRQ_RESERVED || ret == IRQ_SKIP) {
> +				/* We use the interrupt num as h/w irq num */
> +				ret = direct_map_num;
> +				goto found;
> +			}
> +		}
> +
>  		pr_err("%s: requested crossbar number %d > max %d\n",
>  		       __func__, req_num, cb->max_crossbar_sources);
>  		return -EINVAL;

thx,

Jason.
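The macro suggested in this reply, together with the offset arithmetic in the quoted crossbar_domain_xlate() hunk, can be tried out in a small user-space sketch. The value 400 is the reply's illustrative figure; the sketch assumes it matches "ti,max-crossbar-sources". decode_direct_irq() is a hypothetical helper, not part of the driver, and the macro argument is parenthesized here, which the version in the mail does not do.

```c
/*
 * Illustrative threshold only; on real hardware the value would come
 * from "ti,max-crossbar-sources" in the device tree.
 */
#define MAX_SOURCES	400
#define DIRECT_IRQ(irq)	(MAX_SOURCES + (irq))

/*
 * Hypothetical decode step: returns the direct-wired IRQ number encoded
 * in req_num, or -1 if req_num is an ordinary crossbar request.
 */
static int decode_direct_irq(int req_num)
{
	if (req_num >= MAX_SOURCES)
		return req_num - MAX_SOURCES;
	return -1;
}
```

A consumer would then write DIRECT_IRQ(1) for direct-mapped SPI 1, and the threshold crossing is visible at the point of use instead of being hidden in a bare number.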
Re: [PATCH] Check for Null return of function of affs_bread in function affs_truncate
Fair enough. If somebody is running this file system, I would be happy to have them test my code in order to fix this.

Cheers,
Nick

On Fri, Jun 20, 2014 at 10:38 PM, Andrew Morton wrote:
> On Fri, 20 Jun 2014 22:25:47 -0400 Nick Krause wrote:
>
>> If you have any ideas about what is better
>> please let me know.
>
> I think the proposed patch was not a good one - it will cause truncate
> to silently return, probably leaving the fs in an inconsistent state.
> Neither the user nor the running application knows this happened so they
> will just keep on modifying the filesystem, possibly mangling it
> further.
>
> The code as it stands at present is better - if bread() fails we'll get
> a nice solid oops and the current app will be terminated (at least).
> As we're in truncate it's quite possible that the entire fs will get
> wedged up due to now-permanently-held i_mutex, which is even better.
>
>
> As for the best fix, umm, hard.  We're pretty screwed if we cannot read
> that block at this code site.  Perhaps emit loud printks, forcibly turn
> the fs read-only then return -EIO/-ENOMEM/etc from the truncate.  Such
> a change would require runtime testing, with some form of developer fault
> injection.
Re: [PATCH] staging:rtl8821ae: rewrite legacy wifi check in halbcoutsrc
Is this patch being merged, or is this not an issue? I am confused: did I make a mistake in my patch, or is a different patch being merged?

Thanks,
Nick

On Fri, Jun 20, 2014 at 10:34 PM, Joe Perches wrote:
> On Fri, 2014-06-20 at 22:26 -0400, Nick Krause wrote:
>> Thanks for the feedback I will resend the patch fixed.
>
> Please do not.
>
>> Otherwise please use Larry's idea.
>
> It's not Larry's idea.  Larry is the primary
> contributor for Realtek drivers in staging.
Re: [PATCH] Check for Null return of function of affs_bread in function affs_truncate
On Fri, 20 Jun 2014 22:25:47 -0400 Nick Krause wrote:

> If you have any ideas about what is better
> please let me know.

I think the proposed patch was not a good one - it will cause truncate
to silently return, probably leaving the fs in an inconsistent state.
Neither the user nor the running application knows this happened so they
will just keep on modifying the filesystem, possibly mangling it
further.

The code as it stands at present is better - if bread() fails we'll get
a nice solid oops and the current app will be terminated (at least).
As we're in truncate it's quite possible that the entire fs will get
wedged up due to now-permanently-held i_mutex, which is even better.

As for the best fix, umm, hard.  We're pretty screwed if we cannot read
that block at this code site.  Perhaps emit loud printks, forcibly turn
the fs read-only then return -EIO/-ENOMEM/etc from the truncate.  Such
a change would require runtime testing, with some form of developer fault
injection.
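The failure path suggested above (complain loudly, force the filesystem read-only, return an error) might look roughly like the userspace model below. It is only an illustration of the control flow: struct fs_model and truncate_step() are hypothetical names, fprintf() stands in for printk(), and the real affs_truncate() returns void, so an actual fix would need more plumbing than this.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a mounted filesystem's state. */
struct fs_model {
	bool read_only;
};

/*
 * Model of one truncate step. When the block could not be read
 * (block == NULL), report it, mark the filesystem read-only and
 * return -EIO instead of returning silently or dereferencing NULL.
 */
static int truncate_step(struct fs_model *fs, const void *block)
{
	if (!block) {
		fprintf(stderr,
			"truncate: cannot read extension block, forcing read-only\n");
		fs->read_only = true;
		return -EIO;
	}
	return 0;
}
```

The point of the model is that the caller learns about the failure and no further modification is allowed, which is what the silent return in the proposed patch does not provide.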
Re: [PATCH] staging:rtl8821ae: rewrite legacy wifi check in halbcoutsrc
On Fri, 2014-06-20 at 22:26 -0400, Nick Krause wrote:
> Thanks for the feedback I will resend the patch fixed.

Please do not.

> Otherwise please use Larry's idea.

It's not Larry's idea.  Larry is the primary
contributor for Realtek drivers in staging.
Re: [PATCH V3 03/16] irqchip: crossbar: introduce ti,irqs-skip to skip
Sricharan,

Your subject line seems truncated:

"irqchip: crossbar: introduce ti,irqs-skip to skip"

maybe "... Introduce DT property to skip hardwired irqs" ?

Also note that you need to correct the subject line for *every* patch in
the series wrt capitalization.  I don't mind correcting it when I apply
it, provided that:

  - the patch is otherwise ready
  - I only have to do it once or twice for the series
  - I never had a chance to ask since you created a rockstar patch
    series the first time out of the gate (except for capitalization).

Once I've looked over the whole series, please resend with the subject
lines corrected.

On Mon, Jun 16, 2014 at 04:53:03PM +0530, Sricharan R wrote:
> From: Nishanth Menon
>
> When, in the system due to varied reasons, interrupts might be unusable
> due to hardware behavior, but register maps do exist, then those interrupts
> should be skipped while mapping irq to crossbars.
>
> Signed-off-by: Nishanth Menon
> Signed-off-by: Sricharan R
> ---
> [V3] introduced ti,irqs-skip dt property to list the
>      irqs to be skipped.
>
>  .../devicetree/bindings/arm/omap/crossbar.txt |  4
>  drivers/irqchip/irq-crossbar.c                | 20
>  2 files changed, 24 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/arm/omap/crossbar.txt b/Documentation/devicetree/bindings/arm/omap/crossbar.txt
> index fb88585..cfcbd52 100644
> --- a/Documentation/devicetree/bindings/arm/omap/crossbar.txt
> +++ b/Documentation/devicetree/bindings/arm/omap/crossbar.txt
> @@ -17,6 +17,10 @@ Required properties:
>    so crossbar bar driver should not consider them as free
>    lines.
>
> +Optional properties:
> +- ti,irqs-skip: This is similar to "ti,irqs-reserved", but are irq mappings
> +  which are not supposed to be used for errata or other reasons(virtualization).

I would specifically mention SoC-specific hard-wiring of irqs here.
Also the fact that the hardwiring unexpectedly bypasses the crossbar.

> +
>  Examples:
>  crossbar_mpu: @4a02 {
>  compatible = "ti,irq-crossbar";

Please include a ti,irqs-skip example here.

> diff --git a/drivers/irqchip/irq-crossbar.c b/drivers/irqchip/irq-crossbar.c
> index 51d4b87..27049de 100644
> --- a/drivers/irqchip/irq-crossbar.c
> +++ b/drivers/irqchip/irq-crossbar.c
> @@ -18,6 +18,7 @@
>
>  #define IRQ_FREE -1
>  #define IRQ_RESERVED -2
> +#define IRQ_SKIP -3
>  #define GIC_IRQ_START 32
>
>  /*
> @@ -160,6 +161,25 @@ static int __init crossbar_of_init(struct device_node *node)
>  		}
>  	}
>
> +	/* Skip the ones marked as skip */

This comment is redundant, perhaps "Skip irqs hardwired to bypass the
crossbar."?

> +	irqsr = of_get_property(node, "ti,irqs-skip", &size);
> +	if (irqsr) {
> +		size /= sizeof(__be32);
> +
> +		for (i = 0; i < size; i++) {
> +			of_property_read_u32_index(node,
> +						   "ti,irqs-skip",
> +						   i, &entry);
> +			if (entry > max) {
> +				pr_err("Invalid skip entry\n");
> +				ret = -EINVAL;
> +				goto err3;
> +			}
> +			cb->irq_map[entry] = IRQ_SKIP;
> +		}
> +	}
> +
> +
>  	cb->register_offsets = kzalloc(max * sizeof(int), GFP_KERNEL);
>  	if (!cb->register_offsets)
>  		goto err3;

thx,

Jason.
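For illustration, the skip-marking loop from the patch can be modeled outside the kernel as below. This is a sketch under assumptions: mark_skipped_irqs() is a hypothetical helper, the skip list is passed as a plain array instead of being read from the device tree, and irq_map is assumed to hold exactly max entries. Under that assumption the sketch rejects entry == max as well, whereas the quoted patch only rejects entry > max.

```c
#include <errno.h>
#include <stddef.h>

#define IRQ_FREE  -1
#define IRQ_SKIP  -3

/*
 * Mark every listed irq as IRQ_SKIP in irq_map, which is assumed to
 * have `max` entries. Returns 0 on success or -EINVAL as soon as an
 * entry falls outside the map.
 */
static int mark_skipped_irqs(int *irq_map, unsigned int max,
			     const unsigned int *skip, size_t nskip)
{
	size_t i;

	for (i = 0; i < nskip; i++) {
		/* Assumption: index `max` is already one past the end. */
		if (skip[i] >= max)
			return -EINVAL;
		irq_map[skip[i]] = IRQ_SKIP;
	}
	return 0;
}
```

Whether the real driver's map is sized this way is not visible in the quoted hunk, so the stricter bound should be read as a suggestion to check, not as a confirmed bug.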
[PATCH] include/trace/syscall.h: Use HAVE_SYSCALL_TRACEPOINTS instead of TRACEPOINTS
At present, most architectures support TRACEPOINTS, but only about 10 of 29 architectures support HAVE_SYSCALL_TRACEPOINTS.

TIF_SYSCALL_TRACEPOINT depends on HAVE_SYSCALL_TRACEPOINTS, so an architecture that supports TRACEPOINTS does not necessarily define TIF_SYSCALL_TRACEPOINT.

So use HAVE_SYSCALL_TRACEPOINTS instead of TRACEPOINTS for this guard; otherwise the build fails.

The related error (allmodconfig under score):

    CC      init/main.o
  In file included from include/asm-generic/preempt.h:4:0,
                   from arch/score/include/generated/asm/preempt.h:1,
                   from include/linux/preempt.h:18,
                   from include/linux/spinlock.h:50,
                   from include/linux/seqlock.h:35,
                   from include/linux/time.h:5,
                   from include/linux/stat.h:18,
                   from include/linux/module.h:10,
                   from init/main.c:15:
  include/trace/syscall.h: In function 'syscall_tracepoint_update':
  include/trace/syscall.h:39:23: error: 'TIF_SYSCALL_TRACEPOINT' undeclared (first use in this function)
    if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
                         ^
  include/linux/thread_info.h:103:45: note: in definition of macro 'test_thread_flag'
    test_ti_thread_flag(current_thread_info(), flag)
                                               ^
  include/trace/syscall.h:39:23: note: each undeclared identifier is reported only once for each function it appears in
    if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
                         ^
  include/linux/thread_info.h:103:45: note: in definition of macro 'test_thread_flag'
    test_ti_thread_flag(current_thread_info(), flag)
                                               ^
  make[1]: *** [init/main.o] Error 1
  make: *** [init] Error 2

Signed-off-by: Chen Gang
---
 include/trace/syscall.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 291c282..a709cbd 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -33,7 +33,7 @@ struct syscall_metadata {
 	struct ftrace_event_call *exit_event;
 };
 
-#ifdef CONFIG_TRACEPOINTS
+#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
 static inline void syscall_tracepoint_update(struct task_struct *p)
 {
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
-- 
1.9.2.459.g68773ac
[PATCH v2] initramfs: Support initrd that is bigger than 2GiB
When initrd (compressed or not) is used, kernel report data corrupted with /dev/ram0. The root cause: During initramfs checking, if it is initrd, it will be transferred to /initrd.image with sys_write. sys_write only support 2G-4K write, so if the initrd ram is more than that, /initrd.image will not complete at all. Add local xwrite to loop calling sys_write to workaround the problem. Also need to use xwrite in write_buffer() to handle: image is uncompressed cpio and there is one big file (>2G) in it. unpack_to_rootfs ===> write_buffer ===> actions[]/do_copy At the same time, we don't need to worry about sys_read/sys_write in do_mounts_rd.c::crd_load. As decompressor will have fill/flush and local buffer that is smaller than 2G. Test with uncompressed initrd, and compressed ones with gz, bz2, lzma,xz, lzop. -v2: according to HPA, change name to xwrite. Signed-off-by: Yinghai Lu Acked-by: H. Peter Anvin --- init/initramfs.c | 33 + 1 file changed, 29 insertions(+), 4 deletions(-) Index: linux-2.6/init/initramfs.c === --- linux-2.6.orig/init/initramfs.c +++ linux-2.6/init/initramfs.c @@ -19,6 +19,26 @@ #include #include +static long __init xwrite(unsigned int fd, char *p, + size_t count) +{ + ssize_t left = count; + long written; + + /* sys_write only can write MAX_RW_COUNT aka 2G-4K bytes at most */ + while (left > 0) { + written = sys_write(fd, p, left); + + if (written <= 0) + break; + + left -= written; + p += written; + } + + return (written < 0) ? 
written : count; +} + static __initdata char *message; static void __init error(char *x) { @@ -346,7 +366,7 @@ static int __init do_name(void) static int __init do_copy(void) { if (count >= body_len) { - sys_write(wfd, victim, body_len); + xwrite(wfd, victim, body_len); sys_close(wfd); do_utime(vcollected, mtime); kfree(vcollected); @@ -354,7 +374,7 @@ static int __init do_copy(void) state = SkipIt; return 0; } else { - sys_write(wfd, victim, count); + xwrite(wfd, victim, count); body_len -= count; eat(count); return 1; @@ -604,8 +624,13 @@ static int __init populate_rootfs(void) fd = sys_open("/initrd.image", O_WRONLY|O_CREAT, 0700); if (fd >= 0) { - sys_write(fd, (char *)initrd_start, - initrd_end - initrd_start); + long written = xwrite(fd, (char *)initrd_start, + initrd_end - initrd_start); + + if (written != initrd_end - initrd_start) + pr_err("/initrd.image: incomplete write (%ld != %ld)\n", + written, initrd_end - initrd_start); + sys_close(fd); free_initrd(); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
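The looping-write idea behind xwrite() can be tried in userspace with a sketch like the one below. It uses write(2) instead of sys_write(), and write_all() is a hypothetical name. One deliberate difference, stated here as a design choice of the sketch and not as the patch's behavior: the posted xwrite() returns count whenever the last write did not fail, while this version returns the number of bytes actually written, so a write that stops making progress is visible to the caller as a short count.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Keep calling write(2) until `count` bytes are written, an error
 * occurs, or the kernel stops accepting data. Returns the number of
 * bytes written, or -1 if nothing could be written at all.
 */
static ssize_t write_all(int fd, const char *p, size_t count)
{
	size_t done = 0;

	while (done < count) {
		ssize_t n = write(fd, p + done, count - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return done ? (ssize_t)done : -1;
		}
		if (n == 0)
			break;			/* no progress, give up */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

A caller can then compare the return value with the requested length, much as populate_rootfs() does in the patch when it reports an incomplete /initrd.image.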
[PATCH] initrd: Fix lz4 decompress with initrd
During testing initrd (>2G) support, find decompress/lz4 does not work with initrd at all. decompress_* should support: 1. inbuf[]/outbuf[] for kernel preboot. 2. inbuf[]/flush() for initramfs 3. fill()/flush() for initrd. in the unlz4 does not handle case 3, as input len is passed as 0, and it failed in first try. Fix that add one extra if (fill) checking, and get out if EOF from the fill(). Signed-off-by: Yinghai Lu --- lib/decompress_unlz4.c | 65 - 1 file changed, 43 insertions(+), 22 deletions(-) Index: linux-2.6/lib/decompress_unlz4.c === --- linux-2.6.orig/lib/decompress_unlz4.c +++ linux-2.6/lib/decompress_unlz4.c @@ -83,13 +83,20 @@ STATIC inline int INIT unlz4(u8 *input, if (posp) *posp = 0; - if (fill) - fill(inp, 4); + if (fill) { + size = fill(inp, 4); + if (size < 4) { + error("data corrupted"); + goto exit_2; + } + } chunksize = get_unaligned_le32(inp); if (chunksize == ARCHIVE_MAGICNUMBER) { - inp += 4; - size -= 4; + if (!fill) { + inp += 4; + size -= 4; + } } else { error("invalid header"); goto exit_2; @@ -100,29 +107,44 @@ STATIC inline int INIT unlz4(u8 *input, for (;;) { - if (fill) - fill(inp, 4); + if (fill) { + size = fill(inp, 4); + if (size == 0) + break; + if (size < 4) { + error("data corrupted"); + goto exit_2; + } + } chunksize = get_unaligned_le32(inp); if (chunksize == ARCHIVE_MAGICNUMBER) { - inp += 4; - size -= 4; + if (!fill) { + inp += 4; + size -= 4; + } if (posp) *posp += 4; continue; } - inp += 4; - size -= 4; + if (posp) *posp += 4; - if (fill) { + if (!fill) { + inp += 4; + size -= 4; + } else { if (chunksize > lz4_compressbound(uncomp_chunksize)) { error("chunk length is longer than allocated"); goto exit_2; } - fill(inp, chunksize); + size = fill(inp, chunksize); + if (size < chunksize) { + error("data corrupted"); + goto exit_2; + } } #ifdef PREBOOT if (out_len >= uncomp_chunksize) { @@ -149,18 +171,17 @@ STATIC inline int INIT unlz4(u8 *input, if (posp) *posp += chunksize; - size -= chunksize; + if (!fill) { + size -= 
chunksize; - if (size == 0) - break; - else if (size < 0) { - error("data corrupted"); - goto exit_2; + if (size == 0) + break; + else if (size < 0) { + error("data corrupted"); + goto exit_2; + } + inp += chunksize; } - - inp += chunksize; - if (fill) - inp = inp_start; } ret = 0; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
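The fill() handling added by this patch (treat 0 as end of input, treat fewer than 4 bytes as corruption) can be modeled in isolation as follows. This is a sketch with assumed names: struct mem_src, mem_fill() and read_word() are hypothetical, the in-memory source stands in for the initrd device, and fill() is given a long-returning signature here purely for convenience.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory source standing in for the initrd device. */
struct mem_src {
	const uint8_t *data;
	size_t len;
	size_t pos;
};

/* fill() carries no context argument in this model, so use a global. */
static struct mem_src src;

/* Copy up to `want` bytes into buf; returns bytes delivered, 0 at EOF. */
static long mem_fill(void *buf, unsigned long want)
{
	size_t left = src.len - src.pos;
	size_t n = want < left ? (size_t)want : left;

	memcpy(buf, src.data + src.pos, n);
	src.pos += n;
	return (long)n;
}

/*
 * Read one 4-byte little-endian word through fill().
 * Returns 1 on success, 0 on clean end of input, -1 on a short read.
 */
static int read_word(long (*fill)(void *, unsigned long), uint32_t *out)
{
	uint8_t b[4];
	long got = fill(b, 4);

	if (got == 0)
		return 0;	/* EOF: the caller leaves its loop */
	if (got < 4)
		return -1;	/* truncated word: data corrupted */

	*out = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
	return 1;
}
```

Distinguishing the three outcomes is what lets the decompressor stop cleanly at the end of the stream while still reporting a header that was cut short.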
[PATCH v2] initramfs: Support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is bigger than 2g, as we could put it above 4G. Found compressed initramfs image could not be decompressed properly. It turns out that image length is int during decompress detection, and it will become < 0 when length is more than 2G. Furthermore, during decompressing len as int is used for inbuf count, that has problem too. Change len to long, that should be ok as on 32 bit platform long is 32bits. Tested with following compressed initramfs image as root with kexec. gzip, bzip2, xz, lzma, lzop, lz4. run time for populate_rootfs(): sizename Nehalem-EX Westmere-EX Ivybridge-EX 9034400256 root_img : 26s 24s 30s 3561095057 root_img.lz4 : 28s 27s 27s 3459554629 root_img.lzo : 29s 29s 28s 3219399480 root_img.gz : 64s 62s 49s 2251594592 root_img.xz : 262s 260s 183s 2226366598 root_img.lzma: 386s 376s 277s 2901482513 root_img.bz2 : 635s 599s -v2: fix pr_debug format error. Signed-off-by: Yinghai Lu --- crypto/zlib.c |8 fs/isofs/compress.c|6 +- fs/jffs2/compr_zlib.c |7 --- include/linux/decompress/bunzip2.h |8 include/linux/decompress/generic.h | 10 +- include/linux/decompress/inflate.h |8 include/linux/decompress/unlz4.h |8 include/linux/decompress/unlzma.h |8 include/linux/decompress/unlzo.h |8 include/linux/decompress/unxz.h|8 include/linux/zlib.h |4 ++-- init/do_mounts_rd.c| 10 +- init/initramfs.c | 22 +++--- lib/decompress.c |2 +- lib/decompress_bunzip2.c | 26 +- lib/decompress_inflate.c | 12 ++-- lib/decompress_unlz4.c | 18 +- lib/decompress_unlzma.c| 28 ++-- lib/decompress_unlzo.c | 12 ++-- lib/decompress_unxz.c | 10 +- 20 files changed, 110 insertions(+), 113 deletions(-) Index: linux-2.6/include/linux/decompress/generic.h === --- linux-2.6.orig/include/linux/decompress/generic.h +++ linux-2.6/include/linux/decompress/generic.h @@ -1,11 +1,11 @@ #ifndef DECOMPRESS_GENERIC_H #define DECOMPRESS_GENERIC_H -typedef int (*decompress_fn) (unsigned char *inbuf, int len, - int(*fill)(void*, unsigned 
int), - int(*flush)(void*, unsigned int), +typedef int (*decompress_fn) (unsigned char *inbuf, long len, + long (*fill)(void*, unsigned long), + long (*flush)(void*, unsigned long), unsigned char *outbuf, - int *posp, + long *posp, void(*error)(char *x)); /* inbuf - input buffer @@ -33,7 +33,7 @@ typedef int (*decompress_fn) (unsigned c /* Utility routine to detect the decompression method */ -decompress_fn decompress_method(const unsigned char *inbuf, int len, +decompress_fn decompress_method(const unsigned char *inbuf, long len, const char **name); #endif Index: linux-2.6/init/initramfs.c === --- linux-2.6.orig/init/initramfs.c +++ linux-2.6/init/initramfs.c @@ -174,7 +174,7 @@ static __initdata enum state { } state, next_state; static __initdata char *victim; -static __initdata unsigned count; +static unsigned long count __initdata; static __initdata loff_t this_header, next_header; static inline void __init eat(unsigned n) @@ -186,7 +186,7 @@ static inline void __init eat(unsigned n static __initdata char *vcollected; static __initdata char *collected; -static __initdata int remains; +static long remains __initdata; static __initdata char *collect; static void __init read_into(char *buf, unsigned size, enum state next) @@ -213,7 +213,7 @@ static int __init do_start(void) static int __init do_collect(void) { - unsigned n = remains; + unsigned long n = remains; if (count < n) n = count; memcpy(collect, victim, n); @@ -384,7 +384,7 @@ static __initdata int (*actions[])(void) [Reset] = do_reset, }; -static int __init write_buffer(char *buf, unsigned len) +static long __init write_buffer(char *buf, unsigned long len) { count = len; victim = buf; @@ -394,11 +394,11 @@ static int __init write_buffer(char *buf return len - count; } -static int __init flush_buffer(void *bufv, unsigned len) +static long __init
Re: [PATCH] staging:rtl8821ae: rewrite legacy wifi check in halbcoutsrc
Thanks for the feedback. I will resend the patch fixed. Otherwise, please use Larry's idea.

Cheers,
Nick

On Fri, Jun 20, 2014 at 4:08 PM, Joe Perches wrote:
> On Fri, 2014-06-20 at 22:59 +0300, Dan Carpenter wrote:
>> On Fri, Jun 20, 2014 at 12:56:50PM -0400, Nicholas Krause wrote:
>> > Rewrites the wireless check for legacy checking in function
>> > halbtc_legacy to check for both Mode A and B.
>>
>> You're just guessing that A and B were intended but it could have been
>> something B and G...
>>
>> Don't do this.  Just leave the static checker warning there so someone
>> can fix it properly instead of introducing a second new bug and hiding
>> the warning so it's impossible to find.
>>
>
> It's most likely G anyway:
>
> drivers/staging/rtl8192ee/btcoexist/halbtcoutsrc.c: if ((mac->mode == WIRELESS_MODE_B) || (mac->mode == WIRELESS_MODE_G))
> drivers/staging/rtl8821ae/btcoexist/halbtcoutsrc.c: if ((mac->mode == WIRELESS_MODE_B) || (mac->mode == WIRELESS_MODE_B))
>
> Larry probably has a better idea.
Re: [PATCH] Check for Null return of function of affs_bread in function affs_truncate
Thanks for standing up for me, Thomas. If you have any ideas about what is better, please let me know.

Cheers,
Nick

On Fri, Jun 20, 2014 at 7:59 PM, Thomas Gleixner wrote:
> On Fri, 20 Jun 2014, Nick Krause wrote:
>
>> Ok that's fine I would return as if it's a NULL the other parts of the
>> function can't continue.
>> Nick
>>
>> On Thu, Jun 19, 2014 at 1:21 AM, Dan Carpenter wrote:
>> > On Wed, Jun 18, 2014 at 06:08:05PM -0400, Nicholas Krause wrote:
>> >> Signed-off-by: Nicholas Krause
>> >> ---
>> >>  fs/affs/file.c | 2 ++
>> >>  1 file changed, 2 insertions(+)
>> >>
>> >> diff --git a/fs/affs/file.c b/fs/affs/file.c
>> >> index a7fe57d..f26482d 100644
>> >> --- a/fs/affs/file.c
>> >> +++ b/fs/affs/file.c
>> >> @@ -923,6 +923,8 @@ affs_truncate(struct inode *inode)
>> >>
>> >>  	while (ext_key) {
>> >>  		ext_bh = affs_bread(sb, ext_key);
>> >> +		if (!ext_bh)
>> >> +			return;
>> >
>> > The problem is that we don't know if we should return here or break
>> > here.  If you don't understand the code, then it's best to just leave it
>> > alone.
>
> Dan, what kind of attitude is that?
>
> Nick certainly found an issue where a possible NULL return from
> affs_bread() can cause havoc.
>
> Do YOU understand that code?
>
> If yes, you better explain, WHY Nick's finding is a false positive
> instead of just telling him off in a very impolite way.
>
> If not, you better refrain from telling a reporter that he does not
> understand the code and should stay away.
>
> You clearly stated that you do not understand it either:
>
>> > The problem is that we don't know if we should return here or break
>> > here.
>
> The problem here is that proceeding with a known NULL pointer is wrong
> to begin with. It does not matter at all whether break or return is
> the proper thing to do. What matters is that proceeding with a NULL
> pointer is wrong to begin with, no matter what.
>
> So either explain why this is a non issue and the NULL pointer return
> cannot happen or shut up and try to find a proper solution for that
> "return" vs. "break" issue.
>
> Thanks,
>
>	tglx
Re: [PATCH v2] irqchip: nvic: Use the generic noop function
On Wed, Jun 04, 2014 at 04:01:52PM +0100, Daniel Thompson wrote:
> Using the generic function saves looking up this custom one in a source
> navigator.
>
> Signed-off-by: Daniel Thompson
> Cc: Thomas Gleixner
> Cc: Jason Cooper
> ---
>  drivers/irqchip/irq-nvic.c | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)

Applied to irqchip/core with Uwe's Ack.

thx,

Jason.
Re: [PATCH] irqchip: brcmstb-l2: Level-2 interrupts are edge sensitive
On Mon, Jun 09, 2014 at 11:05:02AM -0700, Florian Fainelli wrote:
> The driver was configuring the interrupt handler for the Level-2
> interrupts to be "level" triggered while they are in fact "edge"
> triggered. Fix this by using the correct handler.
>
> Reported-by: Brian Norris
> Signed-off-by: Florian Fainelli
> ---
>  drivers/irqchip/irq-brcmstb-l2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied to irqchip/urgent

thx,

Jason.
[tip:x86/urgent] x86/vdso: Improve the fake section headers
Commit-ID: bfad381c0d1e19cae8461e105d8d4387dd2a14fe Gitweb: http://git.kernel.org/tip/bfad381c0d1e19cae8461e105d8d4387dd2a14fe Author: Andy Lutomirski AuthorDate: Wed, 18 Jun 2014 15:59:48 -0700 Committer: H. Peter Anvin CommitDate: Thu, 19 Jun 2014 15:45:12 -0700 x86/vdso: Improve the fake section headers Fully stripping the vDSO has other unfortunate side effects: - binutils is unable to find ELF notes without a SHT_NOTE section. - Even elfutils has trouble: it can find ELF notes without a section table at all, but if a section table is present, it won't look for PT_NOTE. - gdb wants section names to match between stripped DSOs and their symbols; otherwise it will corrupt symbol addresses. We're also breaking the rules: section 0 is supposed to be SHT_NULL. Fix these problems by building a better fake section table. While we're at it, we might as well let buggy Go versions keep working well by giving the SHT_DYNSYM entry the correct size. This is a bit unfortunate: it adds quite a bit of size to the vdso image. If/when binutils improves and the improved versions become widespread, it would be worth considering dropping most of this. Signed-off-by: Andy Lutomirski Link: http://lkml.kernel.org/r/0e546a5eeaafdf1840e6ee654a55c1e727c26663.1403129369.git.l...@amacapital.net Signed-off-by: H. 
Peter Anvin --- arch/x86/vdso/Makefile | 4 +- arch/x86/vdso/vdso-fakesections.c| 44 arch/x86/vdso/vdso-layout.lds.S | 40 +-- arch/x86/vdso/vdso.lds.S | 2 + arch/x86/vdso/vdso2c.c | 31 -- arch/x86/vdso/vdso2c.h | 180 +++ arch/x86/vdso/vdso32/vdso-fakesections.c | 1 + arch/x86/vdso/vdsox32.lds.S | 2 + 8 files changed, 237 insertions(+), 67 deletions(-) diff --git a/arch/x86/vdso/Makefile b/arch/x86/vdso/Makefile index 3c0809a..2c1ca98 100644 --- a/arch/x86/vdso/Makefile +++ b/arch/x86/vdso/Makefile @@ -11,7 +11,6 @@ VDSO32-$(CONFIG_COMPAT) := y # files to link into the vdso vobjs-y := vdso-note.o vclock_gettime.o vgetcpu.o vdso-fakesections.o -vobjs-nox32 := vdso-fakesections.o # files to link into kernel obj-y += vma.o @@ -134,7 +133,7 @@ override obj-dirs = $(dir $(obj)) $(obj)/vdso32/ targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/vclock_gettime.o $(vdso32.so-y:%=vdso32/%.o) -targets += vdso32/vclock_gettime.o +targets += vdso32/vclock_gettime.o vdso32/vdso-fakesections.o $(obj)/vdso32.o: $(vdso32-images:%=$(obj)/%) @@ -155,6 +154,7 @@ $(vdso32-images:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_32) $(vdso32-images:%=$(obj)/%.dbg): $(obj)/vdso32-%.so.dbg: FORCE \ $(obj)/vdso32/vdso32.lds \ $(obj)/vdso32/vclock_gettime.o \ +$(obj)/vdso32/vdso-fakesections.o \ $(obj)/vdso32/note.o \ $(obj)/vdso32/%.o $(call if_changed,vdso) diff --git a/arch/x86/vdso/vdso-fakesections.c b/arch/x86/vdso/vdso-fakesections.c index cb8a8d7..56927a7 100644 --- a/arch/x86/vdso/vdso-fakesections.c +++ b/arch/x86/vdso/vdso-fakesections.c @@ -2,31 +2,23 @@ * Copyright 2014 Andy Lutomirski * Subject to the GNU Public License, v.2 * - * Hack to keep broken Go programs working. 
- * - * The Go runtime had a couple of bugs: it would read the section table to try - * to figure out how many dynamic symbols there were (it shouldn't have looked - * at the section table at all) and, if there were no SHT_SYNDYM section table - * entry, it would use an uninitialized value for the number of symbols. As a - * workaround, we supply a minimal section table. vdso2c will adjust the - * in-memory image so that "vdso_fake_sections" becomes the section table. - * - * The bug was introduced by: - * https://code.google.com/p/go/source/detail?r=56ea40aac72b (2012-08-31) - * and is being addressed in the Go runtime in this issue: - * https://code.google.com/p/go/issues/detail?id=8197 + * String table for loadable section headers. See vdso2c.h for why + * this exists. */ -#ifndef __x86_64__ -#error This hack is specific to the 64-bit vDSO -#endif - -#include - -extern const __visible struct elf64_shdr vdso_fake_sections[]; -const __visible struct elf64_shdr vdso_fake_sections[] = { - { - .sh_type = SHT_DYNSYM, - .sh_entsize = sizeof(Elf64_Sym), - } -}; +const char fake_shstrtab[] __attribute__((section(".fake_shstrtab"))) = + ".hash\0" + ".dynsym\0" + ".dynstr\0" + ".gnu.version\0" + ".gnu.version_d\0" + ".dynamic\0" + ".rodata\0" + ".fake_shstrtab\0" /* Yay, self-referential code. */ + ".note\0" + ".data\0" + ".altinstructions\0" + ".altinstr_replacement\0" + ".eh_frame_hdr\0" + ".eh_frame\0"
[tip:x86/urgent] x86/vdso: Remove some redundant in-memory section headers
Commit-ID: 0e3727a8839c988a3c56170bc8da76d55a16acad Gitweb: http://git.kernel.org/tip/0e3727a8839c988a3c56170bc8da76d55a16acad Author: Andy Lutomirski AuthorDate: Wed, 18 Jun 2014 15:59:49 -0700 Committer: H. Peter Anvin CommitDate: Thu, 19 Jun 2014 15:45:26 -0700 x86/vdso: Remove some redundant in-memory section headers .data doesn't need to be separate from .rodata: they're both readonly. .altinstructions and .altinstr_replacement aren't needed by anything except vdso2c; strip them from the final image. While we're at it, rather than aligning the actual executable text, just shove some unused-at-runtime data in between real data and text. My vdso image is still above 4k, but I'm disinclined to try to trim it harder for 3.16. For future trimming, I suspect that these sections could be moved to later in the file and dropped from the in-memory image: .gnu.version and .gnu.version_d (this may lose versions in gdb) .eh_frame (should be harmless) .eh_frame_hdr (I'm not really sure) .hash (AFAIK nothing needs this section header) Signed-off-by: Andy Lutomirski Link: http://lkml.kernel.org/r/2e96d0c49016ea6d026a614ae645e93edd325961.1403129369.git.l...@amacapital.net Signed-off-by: H. Peter Anvin --- arch/x86/vdso/vdso-fakesections.c | 3 --- arch/x86/vdso/vdso-layout.lds.S | 43 +-- arch/x86/vdso/vdso2c.h| 4 +++- 3 files changed, 26 insertions(+), 24 deletions(-) diff --git a/arch/x86/vdso/vdso-fakesections.c b/arch/x86/vdso/vdso-fakesections.c index 56927a7..aa5fbfa 100644 --- a/arch/x86/vdso/vdso-fakesections.c +++ b/arch/x86/vdso/vdso-fakesections.c @@ -16,9 +16,6 @@ const char fake_shstrtab[] __attribute__((section(".fake_shstrtab"))) = ".rodata\0" ".fake_shstrtab\0" /* Yay, self-referential code. 
*/ ".note\0" - ".data\0" - ".altinstructions\0" - ".altinstr_replacement\0" ".eh_frame_hdr\0" ".eh_frame\0" ".text"; diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S index e4cbc21..9197544 100644 --- a/arch/x86/vdso/vdso-layout.lds.S +++ b/arch/x86/vdso/vdso-layout.lds.S @@ -14,7 +14,7 @@ # error unknown VDSO target #endif -#define NUM_FAKE_SHDRS 16 +#define NUM_FAKE_SHDRS 13 SECTIONS { @@ -28,15 +28,17 @@ SECTIONS .gnu.version_d : { *(.gnu.version_d) } .gnu.version_r : { *(.gnu.version_r) } - .note : { *(.note.*) }:text :note - - .eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr - .eh_frame : { KEEP (*(.eh_frame)) } :text - .dynamic: { *(.dynamic) } :text :dynamic .rodata : { *(.rodata*) + *(.data*) + *(.sdata*) + *(.got.plt) *(.got) + *(.gnu.linkonce.d.*) + *(.bss*) + *(.dynbss*) + *(.gnu.linkonce.b.*) /* * Ideally this would live in a C file, but that won't @@ -50,28 +52,29 @@ SECTIONS .fake_shstrtab : { *(.fake_shstrtab) } :text - .data : { - *(.data*) - *(.sdata*) - *(.got.plt) *(.got) - *(.gnu.linkonce.d.*) - *(.bss*) - *(.dynbss*) - *(.gnu.linkonce.b.*) - } - .altinstructions: { *(.altinstructions) } - .altinstr_replacement : { *(.altinstr_replacement) } + .note : { *(.note.*) }:text :note + + .eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr + .eh_frame : { KEEP (*(.eh_frame)) } :text + /* -* Align the actual code well away from the non-instruction data. -* This is the best thing for the I-cache. +* Text is well-separated from actual data: there's plenty of +* stuff that isn't used at runtime in between. */ - . = ALIGN(0x100); .text : { *(.text*) } :text =0x90909090, /* +* At the end so that eu-elflint stays happy when vdso2c strips +* these. A better implementation would avoid allocating space +* for these. 
+*/ + .altinstructions: { *(.altinstructions) } :text + .altinstr_replacement : { *(.altinstr_replacement) } :text + + /* * The remainder of the vDSO consists of special pages that are * shared between the kernel and userspace. It needs to be at the * end so that it doesn't overlap the mapping of the actual diff --git a/arch/x86/vdso/vdso2c.h b/arch/x86/vdso/vdso2c.h index f01ed4b..f42e2dd 100644 --- a/arch/x86/vdso/vdso2c.h +++ b/arch/x86/vdso/vdso2c.h @@ -92,7 +92,9 @@ static void BITSFUNC(copy_section)(struct
[tip:x86/urgent] x86/vdso: Create .build-id links for unstripped vdso files
Commit-ID: dda1e95cee38b416b23f751cac65421d781e3c10 Gitweb: http://git.kernel.org/tip/dda1e95cee38b416b23f751cac65421d781e3c10 Author: Andy Lutomirski AuthorDate: Fri, 20 Jun 2014 12:20:44 -0700 Committer: H. Peter Anvin CommitDate: Fri, 20 Jun 2014 13:18:49 -0700 x86/vdso: Create .build-id links for unstripped vdso files With this change, doing 'make vdso_install' and telling gdb: set debug-file-directory /lib/modules/KVER/vdso will enable vdso debugging with symbols. This is useful for testing, but kernel RPM builds will probably want to manually delete these symlinks or otherwise do something sensible when they strip the vdso/*.so files. If ld does not support --build-id, then the symlinks will not be created. Note that kernel packagers that use vdso_install may need to adjust their packaging scripts to accommodate this change. For example, Fedora's scripts create build-id symlinks themselves in a different location, so the spec should probably be updated to remove the symlinks created by make vdso_install. Signed-off-by: Andy Lutomirski Link: http://lkml.kernel.org/r/a424b189ce3ced85fe1e82d032a20e765e0fe0d3.1403291930.git.l...@amacapital.net Signed-off-by: H. Peter Anvin --- arch/x86/vdso/Makefile | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/arch/x86/vdso/Makefile b/arch/x86/vdso/Makefile index 2c1ca98..68a15c4 100644 --- a/arch/x86/vdso/Makefile +++ b/arch/x86/vdso/Makefile @@ -169,14 +169,24 @@ quiet_cmd_vdso = VDSO$@ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=sysv) \ - -Wl,-Bsymbolic $(LTO_CFLAGS) + $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) GCOV_PROFILE := n # -# Install the unstripped copies of vdso*.so. +# Install the unstripped copies of vdso*.so. If our toolchain supports +# build-id, install .build-id links as well.
# quiet_cmd_vdso_install = INSTALL $(@:install_%=%) - cmd_vdso_install = cp $< $(MODLIB)/vdso/$(@:install_%=%) +define cmd_vdso_install + cp $< "$(MODLIB)/vdso/$(@:install_%=%)"; \ + if readelf -n $< |grep -q 'Build ID'; then \ + buildid=`readelf -n $< |grep 'Build ID' |sed -e 's/^.*Build ID: \(.*\)$$/\1/'`; \ + first=`echo $$buildid | cut -b-2`; \ + last=`echo $$buildid | cut -b3-`; \ + mkdir -p "$(MODLIB)/vdso/.build-id/$$first"; \ + ln -sf "../../$(@:install_%=%)" "$(MODLIB)/vdso/.build-id/$$first/$$last.debug"; \ + fi +endef vdso_img_insttargets := $(vdso_img_sodbg:%.dbg=install_%)
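For readers who want to see the link layout outside of make, here is a minimal userspace sketch of the naming rule that cmd_vdso_install applies: the first two hex digits of the build ID become a directory and the remaining digits become the file name with a .debug suffix. The helper name build_id_link_path() and its return convention are assumptions made for illustration only; nothing like it exists in the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): format the symlink path
 * ".build-id/<first two hex digits>/<remaining digits>.debug", mirroring
 * the `cut -b-2` / `cut -b3-` split done in cmd_vdso_install.
 * Returns 0 on success, -1 if the ID is too short or the buffer too small.
 */
static int build_id_link_path(const char *buildid, char *out, size_t outlen)
{
	int n;

	/* Need at least two digits for the directory and one for the file. */
	if (strlen(buildid) < 3)
		return -1;

	n = snprintf(out, outlen, ".build-id/%.2s/%s.debug",
		     buildid, buildid + 2);
	if (n < 0 || (size_t)n >= outlen)
		return -1;

	return 0;
}
```

The resulting path is relative to the directory given to gdb as debug-file-directory, which is why the Makefile rule points the link two levels up ("../../") at the unstripped vdso image.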
[tip:x86/urgent] x86/vdso2c: Use better macros for ELF bitness
Commit-ID: c1979c370273fd9f7326ffa27a63b9ddb0f495f4 Gitweb: http://git.kernel.org/tip/c1979c370273fd9f7326ffa27a63b9ddb0f495f4 Author: Andy Lutomirski AuthorDate: Wed, 18 Jun 2014 15:59:47 -0700 Committer: H. Peter Anvin CommitDate: Thu, 19 Jun 2014 15:44:59 -0700 x86/vdso2c: Use better macros for ELF bitness Rather than using a separate macro for each replacement, use generic macros. Signed-off-by: Andy Lutomirski Link: http://lkml.kernel.org/r/d953cd2e70ceee1400985d091188cdd65fba2f05.1403129369.git.l...@amacapital.net Signed-off-by: H. Peter Anvin --- arch/x86/vdso/vdso2c.c | 42 +- arch/x86/vdso/vdso2c.h | 23 --- 2 files changed, 25 insertions(+), 40 deletions(-) diff --git a/arch/x86/vdso/vdso2c.c b/arch/x86/vdso/vdso2c.c index 7a6bf50..7343899 100644 --- a/arch/x86/vdso/vdso2c.c +++ b/arch/x86/vdso/vdso2c.c @@ -83,37 +83,21 @@ extern void bad_put_le(void); #define NSYMS (sizeof(required_syms) / sizeof(required_syms[0])) -#define BITS 64 -#define GOFUNC go64 -#define Elf_Ehdr Elf64_Ehdr -#define Elf_Shdr Elf64_Shdr -#define Elf_Phdr Elf64_Phdr -#define Elf_Sym Elf64_Sym -#define Elf_Dyn Elf64_Dyn +#define BITSFUNC3(name, bits) name##bits +#define BITSFUNC2(name, bits) BITSFUNC3(name, bits) +#define BITSFUNC(name) BITSFUNC2(name, ELF_BITS) + +#define ELF_BITS_XFORM2(bits, x) Elf##bits##_##x +#define ELF_BITS_XFORM(bits, x) ELF_BITS_XFORM2(bits, x) +#define ELF(x) ELF_BITS_XFORM(ELF_BITS, x) + +#define ELF_BITS 64 #include "vdso2c.h" -#undef BITS -#undef GOFUNC -#undef Elf_Ehdr -#undef Elf_Shdr -#undef Elf_Phdr -#undef Elf_Sym -#undef Elf_Dyn - -#define BITS 32 -#define GOFUNC go32 -#define Elf_Ehdr Elf32_Ehdr -#define Elf_Shdr Elf32_Shdr -#define Elf_Phdr Elf32_Phdr -#define Elf_Sym Elf32_Sym -#define Elf_Dyn Elf32_Dyn +#undef ELF_BITS + +#define ELF_BITS 32 #include "vdso2c.h" -#undef BITS -#undef GOFUNC -#undef Elf_Ehdr -#undef Elf_Shdr -#undef Elf_Phdr -#undef Elf_Sym -#undef Elf_Dyn +#undef ELF_BITS static void go(void *addr, size_t len, FILE *outfile, const 
char *name) { diff --git a/arch/x86/vdso/vdso2c.h b/arch/x86/vdso/vdso2c.h index c6eefaf..8e185ce 100644 --- a/arch/x86/vdso/vdso2c.h +++ b/arch/x86/vdso/vdso2c.h @@ -4,23 +4,24 @@ * are built for 32-bit userspace. */ -static void GOFUNC(void *addr, size_t len, FILE *outfile, const char *name) +static void BITSFUNC(go)(void *addr, size_t len, +FILE *outfile, const char *name) { int found_load = 0; unsigned long load_size = -1; /* Work around bogus warning */ unsigned long data_size; - Elf_Ehdr *hdr = (Elf_Ehdr *)addr; + ELF(Ehdr) *hdr = (ELF(Ehdr) *)addr; int i; unsigned long j; - Elf_Shdr *symtab_hdr = NULL, *strtab_hdr, *secstrings_hdr, + ELF(Shdr) *symtab_hdr = NULL, *strtab_hdr, *secstrings_hdr, *alt_sec = NULL; - Elf_Dyn *dyn = 0, *dyn_end = 0; + ELF(Dyn) *dyn = 0, *dyn_end = 0; const char *secstrings; uint64_t syms[NSYMS] = {}; uint64_t fake_sections_value = 0, fake_sections_size = 0; - Elf_Phdr *pt = (Elf_Phdr *)(addr + GET_LE(>e_phoff)); + ELF(Phdr) *pt = (ELF(Phdr) *)(addr + GET_LE(>e_phoff)); /* Walk the segment table. 
*/ for (i = 0; i < GET_LE(>e_phnum); i++) { @@ -61,7 +62,7 @@ static void GOFUNC(void *addr, size_t len, FILE *outfile, const char *name) GET_LE(>e_shentsize)*GET_LE(>e_shstrndx); secstrings = addr + GET_LE(_hdr->sh_offset); for (i = 0; i < GET_LE(>e_shnum); i++) { - Elf_Shdr *sh = addr + GET_LE(>e_shoff) + + ELF(Shdr) *sh = addr + GET_LE(>e_shoff) + GET_LE(>e_shentsize) * i; if (GET_LE(>sh_type) == SHT_SYMTAB) symtab_hdr = sh; @@ -82,7 +83,7 @@ static void GOFUNC(void *addr, size_t len, FILE *outfile, const char *name) i < GET_LE(_hdr->sh_size) / GET_LE(_hdr->sh_entsize); i++) { int k; - Elf_Sym *sym = addr + GET_LE(_hdr->sh_offset) + + ELF(Sym) *sym = addr + GET_LE(_hdr->sh_offset) + GET_LE(_hdr->sh_entsize) * i; const char *name = addr + GET_LE(_hdr->sh_offset) + GET_LE(>st_name); @@ -123,12 +124,12 @@ static void GOFUNC(void *addr, size_t len, FILE *outfile, const char *name) fail("end_mapping must be a multiple of 4096\n"); /* Remove sections or use fakes */ - if (fake_sections_size % sizeof(Elf_Shdr)) + if (fake_sections_size % sizeof(ELF(Shdr))) fail("vdso_fake_sections size is not a multiple of %ld\n", -(long)sizeof(Elf_Shdr)); +(long)sizeof(ELF(Shdr))); PUT_LE(>e_shoff, fake_sections_value); - PUT_LE(>e_shentsize, fake_sections_value ? sizeof(Elf_Shdr) : 0); -
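The macro scheme in this patch can be tried in isolation. The sketch below reuses the BITSFUNC() and ELF() definitions from the diff, but instead of including vdso2c.h twice it defines a small function twice in one file; ehdr_size is a made-up name used only for this illustration, and the example assumes a Linux host where <elf.h> provides Elf32_Ehdr and Elf64_Ehdr. The extra macro level is what lets ELF_BITS expand to 64 or 32 before the ## operator pastes the tokens together.

```c
#include <elf.h>
#include <stddef.h>

/* Two-step expansion so that ELF_BITS is replaced by its value
 * before ## pastes the tokens (same definitions as in the patch). */
#define BITSFUNC3(name, bits) name##bits
#define BITSFUNC2(name, bits) BITSFUNC3(name, bits)
#define BITSFUNC(name) BITSFUNC2(name, ELF_BITS)

#define ELF_BITS_XFORM2(bits, x) Elf##bits##_##x
#define ELF_BITS_XFORM(bits, x) ELF_BITS_XFORM2(bits, x)
#define ELF(x) ELF_BITS_XFORM(ELF_BITS, x)

/* First instantiation: expands to ehdr_size64() using Elf64_Ehdr. */
#define ELF_BITS 64
static size_t BITSFUNC(ehdr_size)(void)
{
	return sizeof(ELF(Ehdr));
}
#undef ELF_BITS

/* Second instantiation: expands to ehdr_size32() using Elf32_Ehdr. */
#define ELF_BITS 32
static size_t BITSFUNC(ehdr_size)(void)
{
	return sizeof(ELF(Ehdr));
}
#undef ELF_BITS
```

With only BITSFUNC3 and no intermediate level, BITSFUNC(ehdr_size) would paste the literal token ELF_BITS and both definitions would collide as ehdr_sizeELF_BITS.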
[tip:x86/urgent] x86/vdso: Discard the __bug_table section
Commit-ID: 5f56e7167e6d438324fcba87018255d81e201383 Gitweb: http://git.kernel.org/tip/5f56e7167e6d438324fcba87018255d81e201383 Author: Andy Lutomirski AuthorDate: Wed, 18 Jun 2014 15:59:46 -0700 Committer: H. Peter Anvin CommitDate: Thu, 19 Jun 2014 15:44:51 -0700 x86/vdso: Discard the __bug_table section It serves no purpose in user code. Signed-off-by: Andy Lutomirski Link: http://lkml.kernel.org/r/2a5bebff42defd8a5e81d96f7dc00f21143c80e8.1403129369.git.l...@amacapital.net Signed-off-by: H. Peter Anvin --- arch/x86/vdso/vdso-layout.lds.S | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/vdso/vdso-layout.lds.S b/arch/x86/vdso/vdso-layout.lds.S index 2ec72f6..c84166c 100644 --- a/arch/x86/vdso/vdso-layout.lds.S +++ b/arch/x86/vdso/vdso-layout.lds.S @@ -75,6 +75,7 @@ SECTIONS /DISCARD/ : { *(.discard) *(.discard.*) + *(__bug_table) } }
Re: [PATCH] x86/mce: Don't unregister CPU hotplug notifier in error path
On 06/20/2014 05:11 PM, Borislav Petkov wrote: On Fri, Jun 20, 2014 at 04:43:37PM -0400, Boris Ostrovsky wrote: We are getting CPU_ONLINE notifier for ASPs during boot: Bah, that's craptastic. Hmm, ok, let's try this instead: I'll try it later but this doesn't look sufficient to me: we might not reach this point if subsys_system_register() or zalloc_cpumask_var() fail. We could register the notifier as the first thing in this routine (probably after mce_available() succeeds). -boris -- diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index bb92f38153b2..9a79c8dbd8e8 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -2451,6 +2451,12 @@ static __init int mcheck_init_device(void) for_each_online_cpu(i) { err = mce_device_create(i); if (err) { + /* +* Register notifier anyway (and do not unreg it) so +* that we don't leave undeleted timers, see notifier +* callback above. +*/ + __register_hotcpu_notifier(_cpu_notifier); cpu_notifier_register_done(); goto err_device_create; } @@ -2471,10 +2477,6 @@ static __init int mcheck_init_device(void) err_register: unregister_syscore_ops(_syscore_ops); - cpu_notifier_register_begin(); - __unregister_hotcpu_notifier(_cpu_notifier); - cpu_notifier_register_done(); - err_device_create: /* * We didn't keep track of which devices were created above, but -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] openrisc: irq: use irqchip framework
On Thu, May 29, 2014 at 11:28:08PM +0300, Stefan Kristiansson wrote: > On Tue, May 27, 2014 at 08:47:36AM +0200, Jonas Bonn wrote: > > On 05/26/2014 10:52 PM, Geert Uytterhoeven wrote: > > > CC devicetree for the bindings > > > > > > On Mon, May 26, 2014 at 10:31 PM, Stefan Kristiansson > > > wrote: > > >> +++ > > >> b/Documentation/devicetree/bindings/interrupt-controller/opencores,or1k-pic.txt > > >> @@ -0,0 +1,23 @@ > > >> +OpenRISC 1000 Programmable Interrupt Controller > > >> + > > >> +Required properties: > > >> + > > >> +- compatible : should be "opencores,or1k-pic-level" for variants with > > >> + level triggered interrupt lines, "opencores,or1k-pic-edge" for > > >> variants with > > >> + edge triggered interrupt lines or "opencores,or1200-pic" for machines > > >> + with the non-spec compliant or1200 type implementation. > > >> + > > >> + "opencores,or1k-pic" is also provided as an alias to > > >> "opencores,or1200-pic", > > >> + but this is only for backwards compatibility. > > > > I still think this identifier needs to be versioned. Use the same > > version number as we have on the cpu identifier since the OR1200 PIC > > hasn't changed since then; i.e. opencores,or1200-pic-rtlsvnXYZ. > > > > I can change that if you *really* insist on it... > But I don't understand the purpose of the versioning here, > there will never be any other or1200-pic version than the one that currently > exists, so IMO "or1200" should be enough versioning information. I'm horribly unfamiliar with openrisc, but compatible strings are compatible strings. ;-) Is the *actual* IP block called or1200-pic? Or is it, eg or1235-pic, and you're using or1200-pic as a generic catch-all? Please use the specific IP name without wildcards. That compatible string will then be used on that IP and future IP that is compatible with the original IP. Once an incompatible change is introduced, then we'll create a new compatible string, say or1300-pic, or or1237-pic. When in doubt, be specific. 
I don't think the '-rtlsvnXYZ' should be necessary, though. thx, Jason.
SmPL for automatic request_firmware_nowait() conversion
I was just porting an ethernet driver [0] over to request_firmware_nowait(), since firmware loading can apparently take over a minute on one device. While at it I noticed that no other ethernet driver uses this API yet, so I figure this may be a coming trend if devices are getting as complex as cxgb4. The cxgb4 driver happens to use the firmware API 3 times! Obviously I considered writing SmPL for this, but one thing which seemed hard is that after the request_firmware_nowait() call we tend to tuck the rest of the code, which followed the old request_firmware() call in the original function, away into a new routine. Is there a way to dump all that code into the new routine? I think the hardest thing would be to also move the right set of variables over. In the third patch in this series, for example [1], there was a state variable that I moved from being static over to the ethernet private data structure. It's hard for me to think of how I can give Coccinelle enough information about what stuff it needs to move around. I think one hint would be: "Hey all that code that is static and is used *before* and *after* request_firmware() stuff it into the private data structure" We'd have to infer the private data structure but that's easy and I already know that's possible. Is this possible? The only other challenge I thought might be tough would be to come up with a reasonable name for the completion call, but I guess we can use the name of the original routine where request_firmware() was being used and postfix _completion or something. netdev: how worthy is this effort? [0] https://lkml.org/lkml/2014/6/20/688 [1] https://lkml.org/lkml/2014/6/20/691 Luis
[PATCH] staging: ft1000_dnld.c:code indent should use tabs where possible
This patch fixes the following checkpatch.pl issue in ft1000/ft1000-pcmcia/ft1000_dnld.c ERROR: code indent should use tabs where possible Signed-off-by: Quentin Lee --- drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c b/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c index d44e858..afaab07 100644 --- a/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c +++ b/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c @@ -15,8 +15,8 @@ Suite 330, Boston, MA 02111-1307, USA. -- - Description: This module will handshake with the DSP bootloader to - download the DSP runtime image. + Description: This module will handshake with the DSP bootloader to + download the DSP runtime image. ---*/ -- 1.7.9.5
Re: [for-next][PATCH v2 1/3] tracing: Fix syscall_*regfunc() vs copy_process() race
On Fri, 20 Jun 2014 18:11:25 -0700 "Paul E. McKenney" wrote: > On Fri, Jun 20, 2014 at 06:45:19AM -0400, Steven Rostedt wrote: > > From: Oleg Nesterov > > > > syscall_regfunc() and syscall_unregfunc() should set/clear > > TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race > > with copy_process() and miss the new child which was not added to > > the process/thread lists yet. > > > > Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT > > under tasklist. > > > > Link: http://lkml.kernel.org/p/20140413185854.gb20...@redhat.com > > > > Cc: sta...@vger.kernel.org # 2.6.33 > > Fixes: a871bd33a6c0 "tracing: Add syscall tracepoints" > > Acked-by: Frederic Weisbecker > > Signed-off-by: Oleg Nesterov > > Signed-off-by: Steven Rostedt > > Acked-by: Paul E. McKenney > I don't usually rebase my for-next branch for acks, but I already rebased once for fixing an issue, and it's early in the rc cycle, and this is the first patch on the branch, so I think I will do it. Thanks! -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [for-next][PATCH v2 1/3] tracing: Fix syscall_*regfunc() vs copy_process() race
On Fri, Jun 20, 2014 at 06:45:19AM -0400, Steven Rostedt wrote: > From: Oleg Nesterov > > syscall_regfunc() and syscall_unregfunc() should set/clear > TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race > with copy_process() and miss the new child which was not added to > the process/thread lists yet. > > Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT > under tasklist. > > Link: http://lkml.kernel.org/p/20140413185854.gb20...@redhat.com > > Cc: sta...@vger.kernel.org # 2.6.33 > Fixes: a871bd33a6c0 "tracing: Add syscall tracepoints" > Acked-by: Frederic Weisbecker > Signed-off-by: Oleg Nesterov > Signed-off-by: Steven Rostedt Acked-by: Paul E. McKenney > --- > include/trace/syscall.h | 15 +++ > kernel/fork.c | 2 ++ > 2 files changed, 17 insertions(+) > > diff --git a/include/trace/syscall.h b/include/trace/syscall.h > index fed853f3d7aa..9674145e2f6a 100644 > --- a/include/trace/syscall.h > +++ b/include/trace/syscall.h > @@ -4,6 +4,7 @@ > #include > #include > #include > +#include > > #include > > @@ -32,4 +33,18 @@ struct syscall_metadata { > struct ftrace_event_call *exit_event; > }; > > +#if defined(CONFIG_TRACEPOINTS) && defined(CONFIG_HAVE_SYSCALL_TRACEPOINTS) > +static inline void syscall_tracepoint_update(struct task_struct *p) > +{ > + if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) > + set_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT); > + else > + clear_tsk_thread_flag(p, TIF_SYSCALL_TRACEPOINT); > +} > +#else > +static inline void syscall_tracepoint_update(struct task_struct *p) > +{ > +} > +#endif > + > #endif /* _TRACE_SYSCALL_H */ > diff --git a/kernel/fork.c b/kernel/fork.c > index d2799d1fc952..6a13c46cd87d 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -1487,7 +1487,9 @@ static struct task_struct *copy_process(unsigned long > clone_flags, > > total_forks++; > spin_unlock(>sighand->siglock); > + syscall_tracepoint_update(p); > write_unlock_irq(_lock); > + > proc_fork_connector(p); > cgroup_post_fork(p); > if 
(clone_flags & CLONE_THREAD) > -- > 2.0.0
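The helper added by this patch both sets and clears the flag on the child, rather than only setting it. A plausible reading is that the child's copy of the thread flags may already be stale by the time it is linked into the task lists, so it has to be resynchronised in both directions. The userspace sketch below models only that bit logic; struct task, TRACEPOINT_FLAG and tracepoint_flag_inherit() are invented names for illustration, and the tasklist locking that makes the update race-free in the kernel is not modelled at all.

```c
/* Illustrative stand-in for TIF_SYSCALL_TRACEPOINT; the value is arbitrary. */
#define TRACEPOINT_FLAG (1u << 0)

/* Minimal model of a task: only the flag word matters here. */
struct task {
	unsigned int flags;
};

/*
 * Make the child's tracepoint flag match the parent's current state.
 * Both branches are needed: the child's flags were copied earlier and
 * the parent's flag may have been set or cleared since then.
 */
static void tracepoint_flag_inherit(const struct task *parent,
				    struct task *child)
{
	if (parent->flags & TRACEPOINT_FLAG)
		child->flags |= TRACEPOINT_FLAG;
	else
		child->flags &= ~TRACEPOINT_FLAG;
}
```

Other bits in the child's flag word are left untouched, matching the single-flag scope of syscall_tracepoint_update().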
Re: [patch 13/13] mm: memcontrol: rewrite uncharge API
On 06/20/2014 08:56 PM, Andrew Morton wrote: > On Fri, 20 Jun 2014 20:34:43 -0400 Sasha Levin wrote: > >> I'm seeing the following when booting a VM, bisection pointed me to this >> patch. >> >> [ 32.830823] BUG: using __this_cpu_add() in preemptible [] code: >> mkdir/8677 > > Thanks. This one was fixed earlier today. Thanks, Andrew. My first bisection attempt went sideways and ended up pointing at "fs/mpage.c: forgotten WRITE_SYNC in case of data integrity write" for some reason. My attempt to understand what data integrity has to do with cgroups was unfruitful :( Thanks, Sasha
Re: [patch 13/13] mm: memcontrol: rewrite uncharge API
On Fri, 20 Jun 2014 20:34:43 -0400 Sasha Levin wrote: > I'm seeing the following when booting a VM, bisection pointed me to this > patch. > > [ 32.830823] BUG: using __this_cpu_add() in preemptible [] code: > mkdir/8677 Thanks. This one was fixed earlier today. From: Michal Hocko Subject: memcg: mem_cgroup_charge_statistics needs preempt_disable preempt_disable was previously disabled by lock_page_cgroup which has been removed by "mm: memcontrol: rewrite uncharge API". This fixes the a flood of splats like this: [3.149371] BUG: using __this_cpu_add() in preemptible [] code: udevd/1271 [3.151458] caller is __this_cpu_preempt_check+0x13/0x15 [3.152927] CPU: 0 PID: 1271 Comm: udevd Not tainted 3.15.0-test1 #366 [3.154637] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [3.156788] 8805fba8 814efe3f [3.158810] 8805fbd8 8125b969 880007413448 0001 [3.160836] ea1e8c00 0001 8805fbe8 8125b9a8 [3.162950] Call Trace: [3.163598] [] dump_stack+0x4e/0x7a [3.164942] [] check_preemption_disabled+0xd2/0xe5 [3.166618] [] __this_cpu_preempt_check+0x13/0x15 [3.168267] [] mem_cgroup_charge_statistics.isra.36+0xb5/0xc6 [3.170169] [] commit_charge+0x23c/0x256 [3.171823] [] mem_cgroup_commit_charge+0xb8/0xd7 [3.173838] [] shmem_getpage_gfp+0x399/0x605 [3.175363] [] shmem_write_begin+0x3d/0x58 [3.176854] [] generic_perform_write+0xbc/0x192 [3.178445] [] ? 
file_update_time+0x34/0xac [3.179952] [] __generic_file_aio_write+0x2c0/0x300 [3.181655] [] generic_file_aio_write+0x52/0xbd [3.183234] [] do_sync_write+0x59/0x78 [3.184630] [] vfs_write+0xc4/0x181 [3.185957] [] SyS_write+0x4a/0x91 [3.187258] [] tracesys+0xd0/0xd5 Signed-off-by: Michal Hocko Cc: Johannes Weiner Signed-off-by: Andrew Morton --- mm/memcontrol.c |3 +++ 1 file changed, 3 insertions(+) diff -puN mm/memcontrol.c~mm-memcontrol-rewrite-uncharge-api-fix-4 mm/memcontrol.c --- a/mm/memcontrol.c~mm-memcontrol-rewrite-uncharge-api-fix-4 +++ a/mm/memcontrol.c @@ -904,6 +904,8 @@ static void mem_cgroup_charge_statistics struct page *page, int nr_pages) { + preempt_disable(); + /* * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is * counted as CACHE even if it's on ANON LRU. @@ -928,6 +930,7 @@ static void mem_cgroup_charge_statistics } __this_cpu_add(memcg->stat->nr_page_events, nr_pages); + preempt_enable(); } unsigned long mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list lru) _ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: scsi-mq
> -Original Message- > From: Bart Van Assche [mailto:bvanass...@acm.org] > Sent: Wednesday, 18 June, 2014 2:09 AM > To: Jens Axboe; Christoph Hellwig; James Bottomley > Cc: Elliott, Robert (Server Storage); linux-s...@vger.kernel.org; linux- > ker...@vger.kernel.org > Subject: Re: scsi-mq > ... > Hello Jens, > > Fio reports the same queue depth for use_blk_mq=Y (mq below) and > use_blk_mq=N (sq below), namely ">=64". However, the number of context > switches differs significantly for the random read-write tests. > ... > It seems like with the traditional SCSI mid-layer and block core (sq) > that the number of context switches does not depend too much on the > number of I/O operations but that for the multi-queue SCSI core there > are a little bit more than two context switches per I/O in the > particular test I ran. The "randrw" script I used for this test takes > SCSI LUNs as arguments (/dev/sdX) and starts the fio tool as follows: Some of those context switches might be from scsi_end_request(), which always schedules the scsi_requeue_run_queue() function via the requeue_work workqueue for scsi-mq. That causes lots of context switches from a busy application thread (e.g., fio) to a kworker thread. As shown by ftrace: fio-19340 [005] dNh. 12067.908444: scsi_io_completion <-scsi_finish_command fio-19340 [005] dNh. 12067.908444: scsi_end_request <-scsi_io_completion fio-19340 [005] dNh. 12067.908444: blk_update_request <-scsi_end_request fio-19340 [005] dNh. 12067.908445: blk_account_io_completion <-blk_update_request fio-19340 [005] dNh. 12067.908445: scsi_mq_free_sgtables <-scsi_end_request fio-19340 [005] dNh. 12067.908445: scsi_free_sgtable <-scsi_mq_free_sgtables fio-19340 [005] dNh. 12067.908445: blk_account_io_done <-__blk_mq_end_io fio-19340 [005] dNh. 12067.908445: blk_mq_free_request <-__blk_mq_end_io fio-19340 [005] dNh. 12067.908446: blk_mq_map_queue <-blk_mq_free_request fio-19340 [005] dNh. 
12067.908446: blk_mq_put_tag <-__blk_mq_free_request fio-19340 [005] .N.. 12067.908446: blkdev_direct_IO <-generic_file_direct_write kworker/5:1H-3207 [005] 12067.908448: scsi_requeue_run_queue <-process_one_work kworker/5:1H-3207 [005] 12067.908448: scsi_run_queue <-scsi_requeue_run_queue kworker/5:1H-3207 [005] 12067.908448: blk_mq_start_stopped_hw_queues <-scsi_run_queue fio-19340 [005] 12067.908449: blk_start_plug <-do_blockdev_direct_IO fio-19340 [005] 12067.908449: blkdev_get_block <-do_direct_IO fio-19340 [005] 12067.908450: blk_throtl_bio <-generic_make_request_checks fio-19340 [005] 12067.908450: blk_sq_make_request <-generic_make_request fio-19340 [005] 12067.908450: blk_queue_bounce <-blk_sq_make_request fio-19340 [005] 12067.908450: blk_mq_map_request <-blk_sq_make_request fio-19340 [005] 12067.908451: blk_mq_queue_enter <-blk_mq_map_request fio-19340 [005] 12067.908451: blk_mq_map_queue <-blk_mq_map_request fio-19340 [005] 12067.908451: blk_mq_get_tag <-__blk_mq_alloc_request fio-19340 [005] 12067.908451: blk_mq_bio_to_request <-blk_sq_make_request fio-19340 [005] 12067.908451: blk_rq_bio_prep <-init_request_from_bio fio-19340 [005] 12067.908451: blk_recount_segments <-bio_phys_segments fio-19340 [005] 12067.908452: blk_account_io_start <-blk_mq_bio_to_request fio-19340 [005] 12067.908452: blk_mq_hctx_mark_pending <-__blk_mq_insert_request fio-19340 [005] 12067.908452: blk_mq_run_hw_queue <-blk_sq_make_request fio-19340 [005] 12067.908452: blk_mq_start_request <-__blk_mq_run_hw_queue In one snapshot just tracing scsi_end_request() and scsi_request_run_queue(), 30K scsi_end_request() calls yielded 20k scsi_request_run_queue() calls. In this case, blk_mq_start_stopped_hw_queues() doesn't end up doing anything since there aren't any stopped queues to restart (blk_mq_run_hw_queue() gets called a bit later during routine fio work); the context switch turned out to be a waste of time. 
If it did find a stopped queue, then it would call blk_mq_run_hw_queue() itself. --- Rob Elliott HP Server Storage
[RFT 1/3] cxgb4: make ethtool set_flash use request_firmware_nowait()
From: "Luis R. Rodriguez" cxgb4 loading can take a while, this is part of the crusade to change it to be asynchronous. Cc: Casey Leedom Cc: Hariprasad Shenai Cc: Philip Oswald Cc: Santosh Rastapur Cc: Jeffrey Cheung Cc: David Chang Signed-off-by: Luis R. Rodriguez --- drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 3 ++ drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 40 - 2 files changed, 36 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index f503dce..bcf9acf 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -647,6 +647,9 @@ struct adapter { struct dentry *debugfs_root; spinlock_t stats_lock; + + struct completion flash_comp; + int flash_comp_status; }; /* Defined bit width of user definable filter tuples diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 2f8d6b9..9cf6f3e 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -2713,22 +2713,48 @@ out: return err; } +static void cxgb4_flash_complete(const struct firmware *fw, void *context) +{ + struct adapter *adap = context; + int ret; + + if (!fw) { + adap->flash_comp_status = -EINVAL; + goto out; + } + + ret = t4_load_fw(adap, fw->data, fw->size); + if (!ret) + adap->flash_comp_status = ret; + +out: + release_firmware(fw); + complete(&adap->flash_comp); +} + static int set_flash(struct net_device *netdev, struct ethtool_flash *ef) { int ret; - const struct firmware *fw; struct adapter *adap = netdev2adap(netdev); + init_completion(&adap->flash_comp); + adap->flash_comp_status = 0; + ef->data[sizeof(ef->data) - 1] = '\0'; - ret = request_firmware(&fw, ef->data, adap->pdev_dev); + ret = request_firmware_nowait(THIS_MODULE, 1, ef->data, + adap->pdev_dev, GFP_KERNEL, + adap, cxgb4_flash_complete); if (ret < 0) return ret; - ret = t4_load_fw(adap, 
fw->data,
fw->size); - release_firmware(fw); - if (!ret) - dev_info(adap->pdev_dev, "loaded firmware %s\n", ef->data); - return ret; + wait_for_completion(&adap->flash_comp); + + if (adap->flash_comp_status != 0) + return adap->flash_comp_status; + + dev_info(adap->pdev_dev, "loaded firmware %s\n", ef->data); + + return 0; } #define WOL_SUPPORTED (WAKE_BCAST | WAKE_MAGIC) -- 2.0.0
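The shape of this conversion, a callback that records a status in a context structure and then signals the waiter, can be sketched in plain userspace C. Everything below is hypothetical scaffolding: struct fake_fw, struct flash_ctx, fake_load() and flash_sync() are not kernel or cxgb4 names, the "done" field merely stands in for struct completion, and the callback is invoked directly instead of from a separate context. As a deliberate simplification, the sketch stores the loader's result unconditionally so that the caller sees success and failure alike.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct firmware. */
struct fake_fw {
	const unsigned char *data;
	size_t size;
};

/* Context shared between the requester and the callback. */
struct flash_ctx {
	int done;	/* stands in for struct completion */
	int status;	/* result reported back to the requester */
};

/* Hypothetical loader: rejects empty images, accepts everything else. */
static int fake_load(const unsigned char *data, size_t size)
{
	(void)data;
	return size ? 0 : -EINVAL;
}

/* Callback: record the outcome first, then signal completion. */
static void flash_complete(const struct fake_fw *fw, void *context)
{
	struct flash_ctx *ctx = context;

	if (!fw)
		ctx->status = -EINVAL;	/* image could not be found */
	else
		ctx->status = fake_load(fw->data, fw->size);

	ctx->done = 1;
}

/* Requester: start the request, "wait", then return the stored status. */
static int flash_sync(const struct fake_fw *fw)
{
	struct flash_ctx ctx = { 0, 0 };

	/* A real asynchronous loader would invoke this later, elsewhere. */
	flash_complete(fw, &ctx);

	return ctx.done ? ctx.status : -EIO;
}
```

Writing the status before setting the completion marker matters in the real asynchronous case, since the waiter may read the status as soon as it is woken.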
[RFT 0/3] cxgb4: use request_firmware_nowait()
From: "Luis R. Rodriguez" It's reported that loading cxgb4 can take over 1 minute; use the more sane request_firmware_nowait() API call just in case this amount of time is causing issues. The driver uses the firmware API 3 times: once for the firmware, once for configuration and once for flash. This series provides the port for all three cases. I don't have the hardware so please test. I did verify we can use this during pci probe and also during the ethtool flash callback. Luis R. Rodriguez (3): cxgb4: make ethtool set_flash use request_firmware_nowait() cxgb4: make configuration load use request_firmware_nowait() cxgb4: make device firmware load use request_firmware_nowait() drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 13 ++ drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 258 +++- 2 files changed, 176 insertions(+), 95 deletions(-) -- 2.0.0
[RFT 3/3] cxgb4: make device firmware load use request_firmware_nowait()
From: "Luis R. Rodriguez" cxgb4 loading can take a while, this ends the crusade to change it to be asynchronous. Cc: Casey Leedom Cc: Hariprasad Shenai Cc: Philip Oswald Cc: Santosh Rastapur Cc: Jeffrey Cheung Cc: David Chang Signed-off-by: Luis R. Rodriguez --- drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 6 ++ drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 105 ++-- 2 files changed, 67 insertions(+), 44 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index 1507dc2..89296f1 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -654,6 +654,12 @@ struct adapter { char fw_config_file[32]; struct completion config_comp; int config_comp_status; + + struct fw_info *fw_info; + struct completion fw_comp; + int fw_comp_status; + enum dev_state state; + int reset; }; /* Defined bit width of user definable filter tuples diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 65e4124..105b83a 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -5341,6 +5341,39 @@ static struct fw_info *find_fw_info(int chip) return NULL; } +static void cxgb4_fw_complete(const struct firmware *fw, void *context) +{ + struct adapter *adap = context; + struct fw_hdr *card_fw; + const u8 *fw_data = NULL; + unsigned int fw_size = 0; + + /* allocate memory to read the header of the firmware on the +* card +*/ + card_fw = t4_alloc_mem(sizeof(*card_fw)); + + if (!fw) { + dev_err(adap->pdev_dev, + "unable to load firmware image %s\n", + adap->fw_info->fw_mod_name); + } else { + fw_data = fw->data; + fw_size = fw->size; + } + + /* upgrade FW logic */ + adap->fw_comp_status = t4_prep_fw(adap, adap->fw_info, fw_data, + fw_size, card_fw, adap->state, + &adap->reset); + + /* Cleaning up */ + if (fw != NULL) + release_firmware(fw); + t4_free_mem(card_fw); + 
complete(&adap->fw_comp); +} + /* * Phase 0 of initialization: contact FW, obtain config, perform basic init. */ @@ -5348,10 +5381,10 @@ static int adap_init0(struct adapter *adap) { int ret; u32 v, port_vec; - enum dev_state state; u32 params[7], val[7]; struct fw_caps_config_cmd caps_cmd; - int reset = 1; + + adap->reset = 1; /* * Contact FW, advertising Master capability (and potentially forcing @@ -5360,7 +5393,7 @@ static int adap_init0(struct adapter *adap) */ ret = t4_fw_hello(adap, adap->mbox, adap->fn, force_init ? MASTER_MUST : MASTER_MAY, - &state); + &adap->state); if (ret < 0) { dev_err(adap->pdev_dev, "could not connect to FW, error %d\n", ret); @@ -5368,8 +5401,8 @@ static int adap_init0(struct adapter *adap) } if (ret == adap->mbox) adap->flags |= MASTER_PF; - if (force_init && state == DEV_STATE_INIT) - state = DEV_STATE_UNINIT; + if (force_init && adap->state == DEV_STATE_INIT) + adap->state = DEV_STATE_UNINIT; /* * If we're the Master PF Driver and the device is uninitialized, @@ -5380,51 +5413,34 @@ static int adap_init0(struct adapter *adap) */ t4_get_fw_version(adap, &adap->params.fw_vers); t4_get_tp_version(adap, &adap->params.tp_vers); - if ((adap->flags & MASTER_PF) && state != DEV_STATE_INIT) { - struct fw_info *fw_info; - struct fw_hdr *card_fw; - const struct firmware *fw; - const u8 *fw_data = NULL; - unsigned int fw_size = 0; + if ((adap->flags & MASTER_PF) && adap->state != DEV_STATE_INIT) { + init_completion(&adap->fw_comp); + adap->fw_comp_status = 0; /* This is the firmware whose headers the driver was compiled * against */ - fw_info = find_fw_info(CHELSIO_CHIP_VERSION(adap->params.chip)); - if (fw_info == NULL) { + adap->fw_info = + find_fw_info(CHELSIO_CHIP_VERSION(adap->params.chip)); + if (adap->fw_info == NULL) { dev_err(adap->pdev_dev, "unable to get firmware info for chip %d.\n", CHELSIO_CHIP_VERSION(adap->params.chip)); return -EINVAL; } - /* allocate memory to
[RFT 2/3] cxgb4: make configuration load use request_firmware_nowait()
From: "Luis R. Rodriguez" cxgb4 loading can take a while, this is part of the crusade to change it to be asynchronous. One more to go. Cc: Philip Oswald Cc: Santosh Rastapur Cc: Jeffrey Cheung Cc: David Chang Cc: Casey Leedom Cc: Hariprasad Shenai Signed-off-by: Luis R. Rodriguez --- drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 4 + drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 113 +++- 2 files changed, 73 insertions(+), 44 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index bcf9acf..1507dc2 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -650,6 +650,10 @@ struct adapter { struct completion flash_comp; int flash_comp_status; + + char fw_config_file[32]; + struct completion config_comp; + int config_comp_status; }; /* Defined bit width of user definable filter tuples diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 9cf6f3e..65e4124 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -4827,51 +4827,18 @@ static int adap_init0_tweaks(struct adapter *adapter) return 0; } -/* - * Attempt to initialize the adapter via a Firmware Configuration File. - */ -static int adap_init0_config(struct adapter *adapter, int reset) +static void cxgb4_config_complete(const struct firmware *cf, void *context) { - struct fw_caps_config_cmd caps_cmd; - const struct firmware *cf; + struct adapter *adapter = context; unsigned long mtype = 0, maddr = 0; u32 finiver, finicsum, cfcsum; - int ret; - int config_issued = 0; - char *fw_config_file, fw_config_file_path[256]; char *config_name = NULL; + struct fw_caps_config_cmd caps_cmd; + int config_issued = 0; + int ret = 0; + char fw_config_file_path[256]; - /* -* Reset device if necessary. 
-*/ - if (reset) { - ret = t4_fw_reset(adapter, adapter->mbox, - PIORSTMODE | PIORST); - if (ret < 0) - goto bye; - } - - /* -* If we have a T4 configuration file under /lib/firmware/cxgb4/, -* then use that. Otherwise, use the configuration file stored -* in the adapter flash ... -*/ - switch (CHELSIO_CHIP_VERSION(adapter->params.chip)) { - case CHELSIO_T4: - fw_config_file = FW4_CFNAME; - break; - case CHELSIO_T5: - fw_config_file = FW5_CFNAME; - break; - default: - dev_err(adapter->pdev_dev, "Device %d is not supported\n", - adapter->pdev->device); - ret = -EINVAL; - goto bye; - } - - ret = request_firmware(&cf, fw_config_file, adapter->pdev_dev); - if (ret < 0) { + if (!cf) { config_name = "On FLASH"; mtype = FW_MEMTYPE_CF_FLASH; maddr = t4_flash_cfg_addr(adapter); @@ -4879,7 +4846,7 @@ static int adap_init0_config(struct adapter *adapter, int reset) u32 params[7], val[7]; sprintf(fw_config_file_path, - "/lib/firmware/%s", fw_config_file); + "/lib/firmware/%s", adapter->fw_config_file); config_name = fw_config_file_path; if (cf->size >= FLASH_CFG_MAX_SIZE) @@ -4898,7 +4865,7 @@ static int adap_init0_config(struct adapter *adapter, int reset) * to write that out separately since we can't * guarantee that the bytes following the * residual byte in the buffer returned by -* request_firmware() are zeroed out ... +* request_firmware_nowait() are zeroed out ... */ size_t resid = cf->size & 0x3; size_t size = cf->size & ~0x3; @@ -5018,7 +4985,8 @@ static int adap_init0_config(struct adapter *adapter, int reset) dev_info(adapter->pdev_dev, "Successfully configured using Firmware "\ "Configuration File \"%s\", version %#x, computed checksum %#x\n", config_name, finiver, cfcsum); - return 0; + complete(&adapter->config_comp); + return; /* * Something bad happened. Return the error ... (If the "error" @@ -5026,10 +4994,67 @@ static int adap_init0_config(struct adapter *adapter, int reset) * want to issue a warning since this is fairly common.) 
*/ bye: + adapter->flash_comp_status = ret; if (config_issued && ret != -ENOENT)
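As a rough illustration of the shape this patch moves to (a request that returns immediately, a callback that receives NULL when the file is unavailable, and a completion the caller can later check), here is a single-threaded userspace sketch. Every name in it (`struct blob`, `struct dev_ctx`, `request_blob_nowait()`, `run_pending()`, `config_complete()`) is a hypothetical stand-in and not part of cxgb4 or the firmware loader; the `done` flag only mimics `struct completion`, and a real loader would invoke the callback from another context.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct firmware. */
struct blob {
	const unsigned char *data;
	size_t size;
};

/* Hypothetical stand-in for the adapter fields the patch adds. */
struct dev_ctx {
	int done;            /* plays the role of struct completion */
	int status;          /* plays the role of config_comp_status */
	const char *source;  /* where the configuration came from */
};

typedef void (*load_cb)(const struct blob *b, void *context);

/* A single pending request; a real loader would queue work elsewhere. */
static struct {
	const struct blob *result;
	load_cb cb;
	void *context;
	int pending;
} request;

/* Returns immediately; the callback runs later from run_pending(). */
static int request_blob_nowait(const struct blob *result, load_cb cb,
			       void *context)
{
	if (request.pending)
		return -1;
	request.result = result;
	request.cb = cb;
	request.context = context;
	request.pending = 1;
	return 0;
}

/* Simulates the loader finishing and invoking the callback. */
static void run_pending(void)
{
	if (!request.pending)
		return;
	request.pending = 0;
	request.cb(request.result, request.context);
}

/* Callback: a NULL blob means the file was unavailable, so fall back. */
static void config_complete(const struct blob *b, void *context)
{
	struct dev_ctx *ctx = context;

	ctx->source = b ? "file" : "flash";
	ctx->status = 0;
	ctx->done = 1;   /* analogous to complete(&adapter->config_comp) */
}
```

The point of the sketch is only that error reporting has to travel through the context structure, since the callback cannot return a value to the original caller.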
Re: [PATCH 1/4] cfq: Increase default value of target_latency
On Fri, Jun 20, 2014 at 12:30:25PM +0100, Mel Gorman wrote:
> On Fri, Jun 20, 2014 at 07:42:14AM +1000, Dave Chinner wrote:
> > On Thu, Jun 19, 2014 at 02:38:44PM -0400, Jeff Moyer wrote:
> > > Mel Gorman writes:
> > >
> > > > The existing CFQ default target_latency results in very poor performance
> > > > for larger numbers of threads doing sequential reads. While this can be
> > > > easily described as a tuning problem for users, it is one that is tricky
> > > > to detect. This patch increases the default on the assumption that people
> > > > with access to expensive fast storage also know how to tune their IO
> > > > scheduler.
> > > >
> > > > The following is from tiobench run on a mid-range desktop with a single
> > > > spinning disk.
> > > >
> > > >                          3.16.0-rc1         3.16.0-rc1              3.0.0
> > > >                             vanilla             cfq600            vanilla
> > > > Mean SeqRead-MB/sec-1   121.88 (  0.00%)   121.60 ( -0.23%)   134.59 ( 10.42%)
> > > > Mean SeqRead-MB/sec-2   101.99 (  0.00%)   102.35 (  0.36%)   122.59 ( 20.20%)
> > > > Mean SeqRead-MB/sec-4    97.42 (  0.00%)    99.71 (  2.35%)   114.78 ( 17.82%)
> > > > Mean SeqRead-MB/sec-8    83.39 (  0.00%)    90.39 (  8.39%)   100.14 ( 20.09%)
> > > > Mean SeqRead-MB/sec-16   68.90 (  0.00%)    77.29 ( 12.18%)    81.64 ( 18.50%)
> > >
> > > Did you test any workloads other than this? Also, what normal workload
> > > has 8 or more threads doing sequential reads? (That's an honest
> > > question.)
> >
> > I'd also suggest that making changes based on the assumption that
> > people affected by the change know how to tune CFQ is a bad idea.
> > When CFQ misbehaves, most people just switch to deadline or no-op
> > because they don't understand how CFQ works, nor what all the
> > knobs do or which ones to tweak to solve their problem.
>
> Ok, that's fair enough. Tuning CFQ is tricky but as it is, the default
> performance is not great in comparison to older kernels and it's something
> that has varied considerably over time. I'm surprised there have not been
> more complaints but maybe I just missed them on the lists.

That's because there are widespread recommendations not to use CFQ if you have any sort of significant storage or IO workload.

We specifically recommend that you don't use CFQ with XFS because it does not play nicely with correlated multi-process IO. This is something that happens a lot, even with single threaded workloads. e.g. a single fsync can issue dependent IOs from multiple process contexts - the syscall process for data IO, the allocation workqueue kworker for btree blocks, the xfsaild to push metadata to disk to make space available for the allocation transaction, and then the journal IO from the xfs log workqueue kworker.

There's 4 IOs, all from different process contexts, all of which need to be dispatched and completed with the minimum of latency. With CFQ adding scheduling and idling delays in the middle of this, it tends to leave disks idle when they really should be doing work.

We also don't recommend using CFQ when you have hardware raid with caches, because the HW RAID does a much, much better job of optimising and prioritising IO through its cache. Idling is wrong if the cache has hardware readahead, because most subsequent read IOs will hit the hardware cache. Hence you could be dispatching other IO instead of idling, yet still get minimal IO latency across multiple streams of different read workloads.

Hence people search on CFQ problems, see the "use deadline" recommendations, change to deadline and see their IO workload going faster. So they shrug their shoulders, set deadline as the default, and move on to the next problem...

Cheers, Dave.
-- 
Dave Chinner da...@fromorbit.com
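For readers following the "change to deadline" advice: on kernels of this era the active scheduler for a block device is shown in brackets in the sysfs file `/sys/block/<dev>/queue/scheduler` (for example `noop deadline [cfq]`), and writing a scheduler name to that file switches it. The helper below is only a minimal sketch for reading that bracketed name out of such a line; `active_scheduler()` is a name made up for this example, not a kernel or libc function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) scheduler name out of a line such as
 * "noop deadline [cfq]". Returns 0 on success, -1 if no bracketed
 * name is found or the output buffer is too small.
 */
static int active_scheduler(const char *line, char *out, size_t outlen)
{
	const char *open = strchr(line, '[');
	const char *close = open ? strchr(open, ']') : NULL;
	size_t len;

	if (!open || !close)
		return -1;

	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

A caller would read the sysfs file into a buffer first and pass that buffer in; the file handling is left out here to keep the sketch short.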
Re: [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression
On Fri, Jun 20, 2014 at 05:14:18PM -0700, Paul E. McKenney wrote: > On Fri, Jun 20, 2014 at 04:52:15PM -0700, j...@joshtriplett.org wrote: > > On Fri, Jun 20, 2014 at 04:30:33PM -0700, Paul E. McKenney wrote: > > > On Fri, Jun 20, 2014 at 03:39:51PM -0700, j...@joshtriplett.org wrote: > > > > On Fri, Jun 20, 2014 at 03:11:20PM -0700, Paul E. McKenney wrote: > > > > > On Fri, Jun 20, 2014 at 02:24:23PM -0700, j...@joshtriplett.org wrote: > > > > > > On Fri, Jun 20, 2014 at 12:12:36PM -0700, Paul E. McKenney wrote: > > > > > > > o Make cond_resched() a no-op for PREEMPT=y. This might well turn > > > > > > > out to be a good thing, but it doesn't help give RCU the > > > > > > > quiescent > > > > > > > states that it needs. > > > > > > > > > > > > What about doing this, together with letting the fqs logic poke > > > > > > un-quiesced kernel code as needed? That way, rather than having > > > > > > cond_resched do any work, you have the fqs logic recognize that a > > > > > > particular CPU has gone too long without quiescing, without > > > > > > disturbing > > > > > > that CPU at all if it hasn't gone too long. > > > > > > > > > > My next stop is to post the previous series, but with a couple of > > > > > exports and one bug fix uncovered by testing thus far, but after > > > > > another round of testing. Then I am going to take a close look at > > > > > this one: > > > > > > > > > > o Push the checks further into cond_resched(), so that the > > > > > fastpath does the same sequence of instructions that the > > > > > original > > > > > did. This might work well, but requires IPIs, which are not so > > > > > good for latencies on the remote CPU. It nevertheless might be > > > > > a > > > > > decent long-term solution given that if your CPU is spending > > > > > many > > > > > jiffies looping in the kernel, you aren't getting good latencies > > > > > anyway. 
It also has the benefit of allowing RCU to take > > > > > advantage > > > > > of the implicit quiescent states of all cond_resched() calls, > > > > > and of eliminating the need for a separate cond_resched_rcu_qs() > > > > > and for RCU_COND_RESCHED_QS. > > > > > > > > > > The one you call out is of course interesting as well. But there are > > > > > a couple of questions: > > > > > > > > > > 1.Why wasn't cond_resched() a no-op in CONFIG_PREEMPT to start > > > > > with? It just seems to obvious a thing to do for it to possibly > > > > > be an oversight. (What, me paranoid?) > > > > > > > > > > 2.When RCU recognizes that a particular CPU has gone too long, > > > > > exactly what are you suggesting that RCU do about it? When > > > > > formulating your answer, please give due consideration to the > > > > > implications of that CPU being a NO_HZ_FULL CPU. ;-) > > > > > > > > Send it an IPI that either causes it to flag a quiescent state > > > > immediately if currently quiesced or causes it to quiesce at the next > > > > opportunity if not. > > > > > > OK. But if we are in a !PREEMPT kernel, > > > > That's not the case I was suggesting. > > Fair enough, but we still need to support !PREEMPT kernels. > > >*If* the kernel is fully > > preemptible, then it makes little sense to put any code in cond_resched, > > when instead another thread can simply cause a preemption if it needs a > > quiescent state. That has the advantage of not imposing any unnecessary > > polling on code running in the kernel. > > OK. Exactly which thread are you suggesting should cause the preemption? > > > In a !PREEMPT kernel, it makes a bit more sense to have cond_resched as > > a voluntary preemption point. But voluntary preemption points don't > > make as much sense in a kernel prepared to preempt a thread anywhere. > > That does sound intuitive, but I am not yet prepared to believe that > the scheduler guys missed this trick. 
There might well be some good > reason for cond_resched() doing something, though I cannot think what it > might be (something to do with preempt_enable_no_resched(), perhaps?). > We should at least ask them, although if you want to do some testing > before asking them, I of course have no objection to your doing so. Oh, and it turns out to be possible to drive RCU's need-a-qs check much farther down the cond_resched() rabbit hole than I expected. Looks like it can be driven all the way down to rcu_note_context_switch(). Thanx, Paul
Re: [patch 13/13] mm: memcontrol: rewrite uncharge API
On 06/18/2014 04:40 PM, Johannes Weiner wrote: > The memcg uncharging code that is involved towards the end of a page's > lifetime - truncation, reclaim, swapout, migration - is impressively > complicated and fragile. > > Because anonymous and file pages were always charged before they had > their page->mapping established, uncharges had to happen when the page > type could still be known from the context; as in unmap for anonymous, > page cache removal for file and shmem pages, and swap cache truncation > for swap pages. However, these operations happen well before the page > is actually freed, and so a lot of synchronization is necessary: > > - Charging, uncharging, page migration, and charge migration all need > to take a per-page bit spinlock as they could race with uncharging. > > - Swap cache truncation happens during both swap-in and swap-out, and > possibly repeatedly before the page is actually freed. This means > that the memcg swapout code is called from many contexts that make > no sense and it has to figure out the direction from page state to > make sure memory and memory+swap are always correctly charged. > > - On page migration, the old page might be unmapped but then reused, > so memcg code has to prevent untimely uncharging in that case. > Because this code - which should be a simple charge transfer - is so > special-cased, it is not reusable for replace_page_cache(). > > But now that charged pages always have a page->mapping, introduce > mem_cgroup_uncharge(), which is called after the final put_page(), > when we know for sure that nobody is looking at the page anymore. > > For page migration, introduce mem_cgroup_migrate(), which is called > after the migration is successful and the new page is fully rmapped. > Because the old page is no longer uncharged after migration, prevent > double charges by decoupling the page's memcg association (PCG_USED > and pc->mem_cgroup) from the page holding an actual charge. 
The new > bits PCG_MEM and PCG_MEMSW represent the respective charges and are > transferred to the new page during migration. > > mem_cgroup_migrate() is suitable for replace_page_cache() as well, > which gets rid of mem_cgroup_replace_page_cache(). > > Swap accounting is massively simplified: because the page is no longer > uncharged as early as swap cache deletion, a new mem_cgroup_swapout() > can transfer the page's memory+swap charge (PCG_MEMSW) to the swap > entry before the final put_page() in page reclaim. > > Finally, page_cgroup changes are now protected by whatever protection > the page itself offers: anonymous pages are charged under the page > table lock, whereas page cache insertions, swapin, and migration hold > the page lock. Uncharging happens under full exclusion with no > outstanding references. Charging and uncharging also ensure that the > page is off-LRU, which serializes against charge migration. Remove > the very costly page_cgroup lock and set pc->flags non-atomically. > > Signed-off-by: Johannes Weiner Hi Johannes, I'm seeing the following when booting a VM, bisection pointed me to this patch. [ 32.830823] BUG: using __this_cpu_add() in preemptible [] code: mkdir/8677 [ 32.831522] caller is __this_cpu_preempt_check+0x13/0x20 [ 32.832079] CPU: 35 PID: 8677 Comm: mkdir Not tainted 3.16.0-rc1-next-20140620-sasha-00023-g8fc12ed #700 [ 32.832898] b27ea69d 8800cb91b618 b151820b 0002 [ 32.833607] 0023 8800cb91b648 aeb4c799 88006efa5b60 [ 32.834318] ea0007cff9c0 0001 0001 8800cb91b658 [ 32.835030] Call Trace: [ 32.835257] dump_stack (lib/dump_stack.c:52) [ 32.835755] check_preemption_disabled (./arch/x86/include/asm/preempt.h:80 lib/smp_processor_id.c:49) [ 32.836336] __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 32.836991] mem_cgroup_charge_statistics.isra.23 (mm/memcontrol.c:930) [ 32.837682] commit_charge (mm/memcontrol.c:2761) [ 32.838187] ? 
_raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:168 kernel/locking/spinlock.c:199) [ 32.838735] ? get_parent_ip (kernel/sched/core.c:2546) [ 32.839230] mem_cgroup_commit_charge (mm/memcontrol.c:6519) [ 32.839807] __add_to_page_cache_locked (mm/filemap.c:588 include/linux/jump_label.h:115 include/trace/events/filemap.h:50 mm/filemap.c:589) [ 32.840479] add_to_page_cache_lru (mm/filemap.c:627) [ 32.841048] read_cache_pages (mm/readahead.c:92) [ 32.841560] ? v9fs_cache_session_get_key (fs/9p/cache.c:306) [ 32.842145] ? v9fs_write_begin (fs/9p/vfs_addr.c:99) [ 32.842694] v9fs_vfs_readpages (fs/9p/vfs_addr.c:127) [ 32.843251] __do_page_cache_readahead (mm/readahead.c:123 mm/readahead.c:200) [ 32.843848] ? __do_
[PATCH v2] selinux: no recursive read_lock of policy_rwlock in security_genfs_sid()
v1->v2:
 - Add an internal helper to switch on/off lock acquisition instead of
   modifying the external API.

With introduction of fair queued rwlock, recursive read_lock() may hang the offending process if there is a write_lock() somewhere in between.

With recursive read_lock checking enabled, the following error was reported:

=============================================
[ INFO: possible recursive locking detected ]
3.16.0-rc1 #2 Tainted: GE
---------------------------------------------
load_policy/708 is trying to acquire lock:
 (policy_rwlock){.+.+..}, at: [] security_genfs_sid+0x3a/0x170

but task is already holding lock:
 (policy_rwlock){.+.+..}, at: [] security_fs_use+0x2c/0x110

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(policy_rwlock);
  lock(policy_rwlock);

This patch fixes the occurrence of recursive read_lock() of policy_rwlock in security_genfs_sid() by adding a helper function which has a 5th argument to indicate if the rwlock has been taken.

Signed-off-by: Waiman Long
---
 security/selinux/ss/services.c | 36
 1 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 4bca494..5f4c1f3 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -2277,20 +2277,22 @@ out:
 }
 
 /**
- * security_genfs_sid - Obtain a SID for a file in a filesystem
+ * __security_genfs_sid - Helper to obtain a SID for a file in a filesystem
  * @fstype: filesystem type
  * @path: path from root of mount
  * @sclass: file security class
  * @sid: SID for path
+ * @locked: true if policy_rwlock taken
  *
  * Obtain a SID to use for a file in a filesystem that
  * cannot support xattr or use a fixed labeling behavior like
  * transition SIDs or task SIDs.
  */
-int security_genfs_sid(const char *fstype,
-		       char *path,
-		       u16 orig_sclass,
-		       u32 *sid)
+static inline int __security_genfs_sid(const char *fstype,
+				       char *path,
+				       u16 orig_sclass,
+				       u32 *sid,
+				       int locked)
 {
 	int len;
 	u16 sclass;
@@ -2301,7 +2303,8 @@ int security_genfs_sid(const char *fstype,
 	while (path[0] == '/' && path[1] == '/')
 		path++;
 
-	read_lock(&policy_rwlock);
+	if (!locked)
+		read_lock(&policy_rwlock);
 
 	sclass = unmap_class(orig_sclass);
 	*sid = SECINITSID_UNLABELED;
@@ -2336,11 +2339,27 @@ int security_genfs_sid(const char *fstype,
 	*sid = c->sid[0];
 	rc = 0;
 out:
-	read_unlock(&policy_rwlock);
+	if (!locked)
+		read_unlock(&policy_rwlock);
 	return rc;
 }
 
 /**
+ * security_genfs_sid - Obtain a SID for a file in a filesystem
+ * @fstype: filesystem type
+ * @path: path from root of mount
+ * @sclass: file security class
+ * @sid: SID for path
+ */
+int security_genfs_sid(const char *fstype,
+		       char *path,
+		       u16 orig_sclass,
+		       u32 *sid)
+{
+	return __security_genfs_sid(fstype, path, orig_sclass, sid, false);
+}
+
+/**
  * security_fs_use - Determine how to handle labeling for a filesystem.
  * @sb: superblock in question
  */
@@ -2370,7 +2389,8 @@ int security_fs_use(struct super_block *sb)
 		}
 		sbsec->sid = c->sid[0];
 	} else {
-		rc = security_genfs_sid(fstype, "/", SECCLASS_DIR, &sbsec->sid);
+		rc = __security_genfs_sid(fstype, "/", SECCLASS_DIR,
+					  &sbsec->sid, true);
 		if (rc) {
 			sbsec->behavior = SECURITY_FS_USE_NONE;
 			rc = 0;
-- 
1.7.1
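The locked-flag helper pattern used by this patch can be sketched outside the kernel. The example below is a single-threaded toy, not SELinux code: `struct toy_rwlock`, `toy_read_lock()`, `toy_read_unlock()`, `lookup()`, `lookup_internal()` and `lookup_while_locked()` are all hypothetical names, and the "lock" merely counts read-side nesting so that a recursive acquisition becomes visible as a depth of 2.

```c
#include <stdbool.h>

/* Toy stand-in for policy_rwlock: it only counts nested read acquisitions. */
struct toy_rwlock {
	int readers;    /* current read-side nesting depth */
	int max_depth;  /* deepest nesting observed so far */
};

static struct toy_rwlock policy_lock;
static int policy_value = 42;   /* stand-in for data guarded by the lock */

static void toy_read_lock(struct toy_rwlock *l)
{
	l->readers++;
	if (l->readers > l->max_depth)
		l->max_depth = l->readers;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	l->readers--;
}

/* Internal helper: takes the lock only when the caller does not hold it. */
static int lookup_internal(int *out, bool locked)
{
	if (!locked)
		toy_read_lock(&policy_lock);

	*out = policy_value;

	if (!locked)
		toy_read_unlock(&policy_lock);
	return 0;
}

/* External API keeps its signature and always locks for itself. */
static int lookup(int *out)
{
	return lookup_internal(out, false);
}

/* A caller already holding the read lock passes locked = true. */
static int lookup_while_locked(int *out)
{
	int rc;

	toy_read_lock(&policy_lock);
	rc = lookup_internal(out, true);
	toy_read_unlock(&policy_lock);
	return rc;
}
```

Had `lookup_while_locked()` called `lookup()` instead, `max_depth` would reach 2, which is the recursive read acquisition that a fair queued rwlock can turn into a hang once a writer is waiting.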
Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
Willy Tarreau writes:

> Hi Eric,
>
> On Fri, Jun 20, 2014 at 03:16:07PM -0700, Eric W. Biederman wrote:
>> Willy Tarreau writes:
>>
>> > Hi Luis,
>> >
>> > On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
>> >> I was finally able to spend some more time with this and tried (a
>> >> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
>> >> Although I haven't done any extended testing, I don't see the two
>> >> stack traces and the /proc/sys/net/ipv4/ directory seems to be
>> >> correctly populated.
>> >>
>> >> I'm attaching the patch I've used, based on Tyler's.
>> >
>> > Would any of you or Tyler please kindly pass me a signed-off-by with
>> > a commit message ? That would be great. Alternately I'd do it myself
>> > and mention you authored them.
>>
>> If my memory serves it is possible in 2.6.32 to set
>> .ctl_name = CTL_UNNEEDED
>>
>> and not need to implement a .strategy routine at all.
>
> Ah that's quite interesting, thanks for the tip!
>
>> Given the fact that most people got the strategy routines
>> slightly wrong and that sys_sysctl is effectively unused
>> a strategy where you don't implement code that no-one
>> will use in a backport would be preferable.
>
> OK.
>
>> Since you have mentioned this has come up a couple of times if nothing
>> else this will be something to think about for next time.
>
> I'm keeping your e-mail where I manage patches, hoping to recognize
> this case next time.
>
>> I am puzzled why .ctl_name was populated in a backport at all.
>
> Oh it's simply because I didn't know it did not have to be there,
> and among the few reviewers, I guess that it's not common to know
> what version uses what semantics.

I guess what I meant is that the field .ctl_name does not even exist anymore for the same reasons .strategy does not exist anymore. So I was just surprised that someone picked a randomish number and stuck it in there.

If anyone actually were to use those randomish numbers in the binary sys_sysctl call their applications would break when they eventually moved to a more recent kernel. Which is one of the motivations it was decided there would be no more binary sysctls allocated around the 2.6.32 timeframe.

> Thank you for the explanation, it's really helpful. We're not used
> to backport sysctl changes but here I got caught a few times and have
> found some sysctl.conf with bogus values in the field a few times, so it
> was really important to backport this one.

Sysctls do have their uses, and at least 2.6.32 has runtime sysctl checks to keep the insanity to a dull roar.

Eric
Re: [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression
On Fri, Jun 20, 2014 at 04:52:15PM -0700, j...@joshtriplett.org wrote: > On Fri, Jun 20, 2014 at 04:30:33PM -0700, Paul E. McKenney wrote: > > On Fri, Jun 20, 2014 at 03:39:51PM -0700, j...@joshtriplett.org wrote: > > > On Fri, Jun 20, 2014 at 03:11:20PM -0700, Paul E. McKenney wrote: > > > > On Fri, Jun 20, 2014 at 02:24:23PM -0700, j...@joshtriplett.org wrote: > > > > > On Fri, Jun 20, 2014 at 12:12:36PM -0700, Paul E. McKenney wrote: > > > > > > o Make cond_resched() a no-op for PREEMPT=y. This might well turn > > > > > > out to be a good thing, but it doesn't help give RCU the > > > > > > quiescent > > > > > > states that it needs. > > > > > > > > > > What about doing this, together with letting the fqs logic poke > > > > > un-quiesced kernel code as needed? That way, rather than having > > > > > cond_resched do any work, you have the fqs logic recognize that a > > > > > particular CPU has gone too long without quiescing, without disturbing > > > > > that CPU at all if it hasn't gone too long. > > > > > > > > My next stop is to post the previous series, but with a couple of > > > > exports and one bug fix uncovered by testing thus far, but after > > > > another round of testing. Then I am going to take a close look at > > > > this one: > > > > > > > > o Push the checks further into cond_resched(), so that the > > > > fastpath does the same sequence of instructions that the > > > > original > > > > did. This might work well, but requires IPIs, which are not so > > > > good for latencies on the remote CPU. It nevertheless might be > > > > a > > > > decent long-term solution given that if your CPU is spending > > > > many > > > > jiffies looping in the kernel, you aren't getting good latencies > > > > anyway. It also has the benefit of allowing RCU to take > > > > advantage > > > > of the implicit quiescent states of all cond_resched() calls, > > > > and of eliminating the need for a separate cond_resched_rcu_qs() > > > > and for RCU_COND_RESCHED_QS. 
> > > > > > > > The one you call out is of course interesting as well. But there are > > > > a couple of questions: > > > > > > > > 1. Why wasn't cond_resched() a no-op in CONFIG_PREEMPT to start > > > > with? It just seems to obvious a thing to do for it to possibly > > > > be an oversight. (What, me paranoid?) > > > > > > > > 2. When RCU recognizes that a particular CPU has gone too long, > > > > exactly what are you suggesting that RCU do about it? When > > > > formulating your answer, please give due consideration to the > > > > implications of that CPU being a NO_HZ_FULL CPU. ;-) > > > > > > Send it an IPI that either causes it to flag a quiescent state > > > immediately if currently quiesced or causes it to quiesce at the next > > > opportunity if not. > > > > OK. But if we are in a !PREEMPT kernel, > > That's not the case I was suggesting. Fair enough, but we still need to support !PREEMPT kernels. >*If* the kernel is fully > preemptible, then it makes little sense to put any code in cond_resched, > when instead another thread can simply cause a preemption if it needs a > quiescent state. That has the advantage of not imposing any unnecessary > polling on code running in the kernel. OK. Exactly which thread are you suggesting should cause the preemption? > In a !PREEMPT kernel, it makes a bit more sense to have cond_resched as > a voluntary preemption point. But voluntary preemption points don't > make as much sense in a kernel prepared to preempt a thread anywhere. That does sound intuitive, but I am not yet prepared to believe that the scheduler guys missed this trick. There might well be some good reason for cond_resched() doing something, though I cannot think what it might be (something to do with preempt_enable_no_resched(), perhaps?). We should at least ask them, although if you want to do some testing before asking them, I of course have no objection to your doing so. 
Thanx, Paul
[PATCH] arm64,ia64,ppc,s390,sh,tile,um,x86,mm: Remove default gate area
The core mm code will provide a default gate area based on FIXADDR_USER_START and FIXADDR_USER_END if !defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR). This default is only useful for ia64. arm64, ppc, s390, sh, tile, 64-bit UML, and x86_32 have their own code just to disable it. arm, 32-bit UML, and x86_64 have gate areas, but they have their own implementations. This gets rid of the default and moves the code into ia64. This should save some code on architectures without a gate area: it's now possible to inline the gate_area functions in the default case. Signed-off-by: Andy Lutomirski --- arch/arm64/include/asm/page.h | 3 --- arch/arm64/kernel/vdso.c | 19 --- arch/ia64/include/asm/page.h | 2 ++ arch/ia64/mm/init.c| 26 ++ arch/powerpc/include/asm/page.h| 3 --- arch/powerpc/kernel/vdso.c | 16 arch/s390/include/asm/page.h | 2 -- arch/s390/kernel/vdso.c| 15 --- arch/sh/include/asm/page.h | 5 - arch/sh/kernel/vsyscall/vsyscall.c | 15 --- arch/tile/include/asm/page.h | 6 -- arch/tile/kernel/vdso.c| 15 --- arch/um/include/asm/page.h | 5 + arch/x86/include/asm/page.h| 1 - arch/x86/include/asm/page_64.h | 2 ++ arch/x86/um/asm/elf.h | 1 - arch/x86/um/mem_64.c | 15 --- arch/x86/vdso/vdso32-setup.c | 19 +-- include/linux/mm.h | 17 - mm/memory.c| 38 -- mm/nommu.c | 5 - 21 files changed, 48 insertions(+), 182 deletions(-) diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 46bf666..992710f 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -28,9 +28,6 @@ #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) -/* We do define AT_SYSINFO_EHDR but don't use the gate mechanism */ -#define __HAVE_ARCH_GATE_AREA 1 - #ifndef __ASSEMBLY__ #ifdef CONFIG_ARM64_64K_PAGES diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c index 50384fe..f630626 100644 --- a/arch/arm64/kernel/vdso.c +++ b/arch/arm64/kernel/vdso.c @@ -187,25 +187,6 @@ const char *arch_vma_name(struct vm_area_struct 
*vma) } /* - * We define AT_SYSINFO_EHDR, so we need these function stubs to keep - * Linux happy. - */ -int in_gate_area_no_mm(unsigned long addr) -{ - return 0; -} - -int in_gate_area(struct mm_struct *mm, unsigned long addr) -{ - return 0; -} - -struct vm_area_struct *get_gate_vma(struct mm_struct *mm) -{ - return NULL; -} - -/* * Update the vDSO data page to keep in sync with kernel timekeeping. */ void update_vsyscall(struct timekeeper *tk) diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h index f1e1b2e..1f1bf14 100644 --- a/arch/ia64/include/asm/page.h +++ b/arch/ia64/include/asm/page.h @@ -231,4 +231,6 @@ get_order (unsigned long size) #define PERCPU_ADDR (-PERCPU_PAGE_SIZE) #define LOAD_OFFSET (KERNEL_START - KERNEL_TR_PAGE_SIZE) +#define __HAVE_ARCH_GATE_AREA 1 + #endif /* _ASM_IA64_PAGE_H */ diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index 25c3502..35efaa3 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -278,6 +278,32 @@ setup_gate (void) ia64_patch_gate(); } +static struct vm_area_struct gate_vma; + +static int __init gate_vma_init(void) +{ + gate_vma.vm_mm = NULL; + gate_vma.vm_start = FIXADDR_USER_START; + gate_vma.vm_end = FIXADDR_USER_END; + gate_vma.vm_flags = VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC; + gate_vma.vm_page_prot = __P101; + + return 0; +} +__initcall(gate_vma_init); + +struct vm_area_struct *get_gate_vma(struct mm_struct *mm) +{ + return &gate_vma; +} + +int in_gate_area_no_mm(unsigned long addr) +{ + if ((addr >= FIXADDR_USER_START) && (addr < FIXADDR_USER_END)) + return 1; + return 0; +} + void ia64_mmu_init(void *my_cpu_data) { unsigned long pta, impl_va_bits; diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 32e4e21..26fe1ae 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -48,9 +48,6 @@ extern unsigned int HPAGE_SHIFT; #define HUGE_MAX_HSTATE (MMU_PAGE_COUNT-1) #endif -/* We do define AT_SYSINFO_EHDR but don't use the 
gate mechanism */ -#define __HAVE_ARCH_GATE_AREA 1 - /* * Subtle: (1 << PAGE_SHIFT) is an int, not an unsigned long. So if we * assign PAGE_MASK to a larger type it gets extended the way we want diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index ce74c33..f174351 100644 --- a/arch/powerpc/kernel/vdso.c +++
Re: [PATCH 2/2] drivers/net/usb/asix_devices.c: inline ax88772_unbind
Hello.

On 06/21/2014 12:40 AM, Fabian Frederick wrote:

> inline this one line function used in driver_info structure
>
> Cc: "David S. Miller"
> Cc: Emil Goode
> Cc: linux-...@vger.kernel.org
> Signed-off-by: Fabian Frederick
> ---
>  drivers/net/usb/asix_devices.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
> index 8a7582b..a41926a 100644
> --- a/drivers/net/usb/asix_devices.c
> +++ b/drivers/net/usb/asix_devices.c
> @@ -497,7 +497,7 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
>  	return 0;
>  }
>
> -static void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf)
> +static inline void ax88772_unbind(struct usbnet *dev, struct usb_interface *intf)
>  {
>  	kfree(dev->driver_priv);
>  }

gcc is perfectly capable of figuring that out. No need to use *inline*
outside the *.h files.

WBR, Sergei
Re: [PATCH] i2c: exynos5: Properly use the "noirq" variants of suspend/resume
On 21.06.2014 01:53, Doug Anderson wrote: > Kevin, > > On Fri, Jun 20, 2014 at 4:13 PM, Kevin Hilman wrote: >> Doug Anderson writes: >> >>> Kevin, >>> >>> On Fri, Jun 20, 2014 at 2:48 PM, Kevin Hilman wrote: Hi Doug, Doug Anderson writes: > On Thu, Jun 19, 2014 at 11:43 AM, Kevin Hilman wrote: >> Doug Anderson writes: >> >>> The original code for the exynos i2c controller registered for the >>> "noirq" variants. However during review feedback it was moved to >>> SIMPLE_DEV_PM_OPS without anyone noticing that it meant we were no >>> longer actually "noirq" (despite functions named >>> exynos5_i2c_suspend_noirq and exynos5_i2c_resume_noirq). >>> >>> i2c controllers that might have wakeup sources on them seem to need to >>> resume at noirq time so that the individual drivers can actually read >>> the i2c bus to handle their wakeup. >> >> I suspect usage of the noirq variants pre-dates the existence of the >> late/early callbacks in the PM core, but based on the description above, >> I suspect what you actually want is the late/early callbacks. > > I think it actually really needs noirq. ;) Yes, it appears it does. Objection withdrawn. I just wanted to be sure because since the introduction of late/early, the need for noirq should be pretty rare, but there certainly are needs. In this case though, the need for it has more to do with the lack of a way for us to describe non parent-child device dependencies than whether or not IRQs are enabled or not. >>> >>> Actually, I'm not sure that's true, but I'll talk through it and you >>> can point to where I'm wrong (I often am!) >>> >>> If you're a wakeup device then you need to be ready to handle >>> interrupts as soon as the "noirq" phase of resume is done, right? >> >> As soon as the noirq phase of your own driver is done, correct. 
>> >>> Said another way: you need to be ready to handle interrupts _before_ >>> the normal resume code is called and be ready to handle interrupts >>> even _before_ the early resume code is called. >> >> Correct. >> >>> That means if you are implementing a bus that's needed by any devices >>> with wakeup interrupts then it's your responsibility to also be >>> prepared to run this early. >>> >>> In this particular case the max77686 driver doesn't need to do >>> anything at all to be ready to handle interrupts. It's suspend and >>> resume code is just boilerplate "enable wakeups / disable wakeups" and >>> it has no "noirq" code. The max77686 driver doesn't have any "noirq" >>> wake call because it would just be empty. >>> >>> Said another way: the problem isn't that the max77686 wakeup gets >>> called before the i2c wakeup. The problem is that i2c is needed ASAP >>> once IRQs are enabled and thus needs to be run noirq. >>> >>> Does that sound semi-correct? >> >> Yes that's correct. >> >> My point above was (trying to be) that ultimately this is an ordering >> issue. e.g. the bus device needs to be "ready" before wakeup devices on >> that bus can handle wakeup interrupts etc. The way we're handling that >> ordering is by the implied ordering of noirq, late/early and "normal" >> callbacks. That's convenient, but not exactly obvious. >> >> It works because we dont' typically need too many layers here, but it >> would be much more understandable if we could describe this kind of >> dependency in a way that the suspend/resume code would suspend/resume >> things in the right order rather than by tinkering with callback levels >> (since otherwise suspend/resume ordering just depends on probe order.) 
>> >> This issue then usually gets me headed down my usual rant path about how >> I think runtime PM is much better suited for handling ordering and >> dependencies becuase it automatically handles parent/child dependencies >> and non parent/child dependencies can be handled by taking advantage of >> the get/put APIs which are refcounted, ect etc. but that's another can >> worms. > > Ah, I gotcha. Yes, I'm a fan of having explicit dependency orderings too. > > So I guess in this case the truly correct way to handle it is: > > 1. i2c controller should have Runtime PM even though (as per the code > now) there's nothing you can do to it to save power under normal > circumstances. So the runtime "suspend" code would be a no-op. > > 2. When the i2c controller is told to runtime resume, it should > double-check if a full SoC poweroff has happened since the last time > it checked. In this case it should reinit its hardware. > > 3. If the i2c controller gets a full "resume" callback then it should > also reinit the hardware just so it's not sitting in a half-configured > state until the first peripheral uses it. > > If later someone finds a way to power gate the i2c controller when no > active transfers are going (and we actually save non-trivial power > doing this) then we've got a
Re: [PATCH] Check for Null return of function of affs_bread in function affs_truncate
On Fri, 20 Jun 2014, Nick Krause wrote:

> Ok that's fine I would return as if it's a NULL the other parts of the
> function can't continue.
> Nick
>
> On Thu, Jun 19, 2014 at 1:21 AM, Dan Carpenter wrote:
> > On Wed, Jun 18, 2014 at 06:08:05PM -0400, Nicholas Krause wrote:
> >> Signed-off-by: Nicholas Krause
> >> ---
> >>  fs/affs/file.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/fs/affs/file.c b/fs/affs/file.c
> >> index a7fe57d..f26482d 100644
> >> --- a/fs/affs/file.c
> >> +++ b/fs/affs/file.c
> >> @@ -923,6 +923,8 @@ affs_truncate(struct inode *inode)
> >>
> >>  	while (ext_key) {
> >>  		ext_bh = affs_bread(sb, ext_key);
> >> +		if (!ext_bh)
> >> +			return;
> >
> > The problem is that we don't know if we should return here or break
> > here. If you don't understand the code, then it's best to just leave it
> > alone.

Dan, what kind of attitude is that?

Nick certainly found an issue where a possible NULL return from
affs_bread() can cause havoc.

Do YOU understand that code?

If yes, you better explain WHY Nick's finding is a false positive
instead of just telling him off in a very impolite way.

If not, you better refrain from telling a reporter that he does not
understand the code and should stay away. You clearly stated that you
do not understand it either:

> > The problem is that we don't know if we should return here or break
> > here.

The problem here is that proceeding with a known NULL pointer is wrong
to begin with. It does not matter at all whether break or return is
the proper thing to do. What matters is that proceeding with a NULL
pointer is wrong, no matter what.

So either explain why this is a non-issue and the NULL pointer return
cannot happen, or shut up and try to find a proper solution for that
"return" vs. "break" issue.

Thanks,

	tglx
Re: [PATCH v2] ARM: mvebu: Fix missing binding documentation for Armada 38x
On Fri, Jun 20, 2014 at 05:33:06PM -0500, Rob Herring wrote:
> On Fri, Jun 20, 2014 at 1:52 PM, Jason Cooper wrote:
> > On Thu, Jun 19, 2014 at 06:40:43PM +0200, Gregory CLEMENT wrote:
> >> For the Armada 380 and Armada 385 SoCs, the common binding for those
> >> 2 SoCs was forgotten. This patch adds the documentation for the
> >> marvell,armada38x property.
> >>
> >> Signed-off-by: Gregory CLEMENT
> >> --
> >> Hi,
> >>
> >> This fix should be merged in 3.16. For 3.15 I am not sure as it is not
> >> a regression.
> >>
> >> Changelog:
> >> v1->v2
> >>
> >> - Reformulate to make clear that we will need marvell,armada38x _and_ a
> >> SoC specific string. For consistency I duplicated what we have done in
> >> armada-370-xp.txt
> >>
> >>
> >> Thanks,
> >> Gregory
> >>
> >>
> >>  Documentation/devicetree/bindings/arm/armada-38x.txt | 17 +++--
> >>  1 file changed, 15 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/devicetree/bindings/arm/armada-38x.txt b/Documentation/devicetree/bindings/arm/armada-38x.txt
> >> index 11f2330a6554..fa08760046df 100644
> >> --- a/Documentation/devicetree/bindings/arm/armada-38x.txt
> >> +++ b/Documentation/devicetree/bindings/arm/armada-38x.txt
> >> @@ -6,5 +6,18 @@ following property:
> >>
> >> Required root node property:
> >>
> >> - - compatible: must contain either "marvell,armada380" or
> >> - "marvell,armada385" depending on the variant of the SoC being used.
> >> +compatible: must contain "marvell,armada38x"
> >
> > I agree with Sergei on this one. We generally avoid wildcards in
> > compatible strings. Is there a use case where specifying one of the
> > below wouldn't be sufficient?
>
> Isn't this a case of just documenting what is already in use?

Technically, yes. However, there are no products shipping with this
SoC yet. So there aren't any _real_ users other than the developers
bringing in mainline support.

> I agree wildcards alone are not good, but along with a specific
> compatible is okay. But also there should be some need to have the
> common property.

I'm curious what you would consider to be a sufficient need? This can
be easily handled by a match table, but a match table could also be
considered rather heavy for this task. I think any
implementation-based justification is prone to opening a can of worms.
And I'm struggling to see a DT-only justification...

thx,

Jason.
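The "match table" alternative mentioned in the reply above can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `armada38x_socs` and `soc_is_armada38x()` are hypothetical names, and this is not the kernel's own device-tree matching code. It only shows the idea of listing each SoC-specific string explicitly instead of relying on a wildcard compatible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical match table: every SoC-specific string is listed explicitly. */
static const char *const armada38x_socs[] = {
	"marvell,armada380",
	"marvell,armada385",
};

/*
 * Return 1 if any entry of the board's compatible list names one of the
 * SoCs in the table above, 0 otherwise.
 */
int soc_is_armada38x(const char *const *compat, size_t n)
{
	const size_t nsocs = sizeof(armada38x_socs) / sizeof(armada38x_socs[0]);
	size_t i, j;

	for (i = 0; i < n; i++)
		for (j = 0; j < nsocs; j++)
			if (strcmp(compat[i], armada38x_socs[j]) == 0)
				return 1;
	return 0;
}
```

The table has to grow whenever a new variant appears, which is roughly the "rather heavy" cost referred to above; a common string avoids that at the price of a wildcard in the binding.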
Re: [PATCH] i2c: exynos5: Properly use the "noirq" variants of suspend/resume
Kevin, On Fri, Jun 20, 2014 at 4:13 PM, Kevin Hilman wrote: > Doug Anderson writes: > >> Kevin, >> >> On Fri, Jun 20, 2014 at 2:48 PM, Kevin Hilman wrote: >>> Hi Doug, >>> >>> Doug Anderson writes: >>> On Thu, Jun 19, 2014 at 11:43 AM, Kevin Hilman wrote: > Doug Anderson writes: > >> The original code for the exynos i2c controller registered for the >> "noirq" variants. However during review feedback it was moved to >> SIMPLE_DEV_PM_OPS without anyone noticing that it meant we were no >> longer actually "noirq" (despite functions named >> exynos5_i2c_suspend_noirq and exynos5_i2c_resume_noirq). >> >> i2c controllers that might have wakeup sources on them seem to need to >> resume at noirq time so that the individual drivers can actually read >> the i2c bus to handle their wakeup. > > I suspect usage of the noirq variants pre-dates the existence of the > late/early callbacks in the PM core, but based on the description above, > I suspect what you actually want is the late/early callbacks. I think it actually really needs noirq. ;) >>> >>> Yes, it appears it does. Objection withdrawn. >>> >>> I just wanted to be sure because since the introduction of late/early, >>> the need for noirq should be pretty rare, but there certainly are needs. >>> >>> >>> In this case though, the need for it has more to do with the >>> lack of a way for us to describe non parent-child device dependencies >>> than whether or not IRQs are enabled or not. >>> >> >> Actually, I'm not sure that's true, but I'll talk through it and you >> can point to where I'm wrong (I often am!) >> >> If you're a wakeup device then you need to be ready to handle >> interrupts as soon as the "noirq" phase of resume is done, right? > > As soon as the noirq phase of your own driver is done, correct. > >> Said another way: you need to be ready to handle interrupts _before_ >> the normal resume code is called and be ready to handle interrupts >> even _before_ the early resume code is called. > > Correct. 
> >> That means if you are implementing a bus that's needed by any devices >> with wakeup interrupts then it's your responsibility to also be >> prepared to run this early. >> >> In this particular case the max77686 driver doesn't need to do >> anything at all to be ready to handle interrupts. It's suspend and >> resume code is just boilerplate "enable wakeups / disable wakeups" and >> it has no "noirq" code. The max77686 driver doesn't have any "noirq" >> wake call because it would just be empty. >> >> Said another way: the problem isn't that the max77686 wakeup gets >> called before the i2c wakeup. The problem is that i2c is needed ASAP >> once IRQs are enabled and thus needs to be run noirq. >> >> Does that sound semi-correct? > > Yes that's correct. > > My point above was (trying to be) that ultimately this is an ordering > issue. e.g. the bus device needs to be "ready" before wakeup devices on > that bus can handle wakeup interrupts etc. The way we're handling that > ordering is by the implied ordering of noirq, late/early and "normal" > callbacks. That's convenient, but not exactly obvious. > > It works because we dont' typically need too many layers here, but it > would be much more understandable if we could describe this kind of > dependency in a way that the suspend/resume code would suspend/resume > things in the right order rather than by tinkering with callback levels > (since otherwise suspend/resume ordering just depends on probe order.) > > This issue then usually gets me headed down my usual rant path about how > I think runtime PM is much better suited for handling ordering and > dependencies becuase it automatically handles parent/child dependencies > and non parent/child dependencies can be handled by taking advantage of > the get/put APIs which are refcounted, ect etc. but that's another can > worms. Ah, I gotcha. Yes, I'm a fan of having explicit dependency orderings too. So I guess in this case the truly correct way to handle it is: 1. 
i2c controller should have Runtime PM even though (as per the code now) there's nothing you can do to it to save power under normal circumstances. So the runtime "suspend" code would be a no-op. 2. When the i2c controller is told to runtime resume, it should double-check if a full SoC poweroff has happened since the last time it checked. In this case it should reinit its hardware. 3. If the i2c controller gets a full "resume" callback then it should also reinit the hardware just so it's not sitting in a half-configured state until the first peripheral uses it. If later someone finds a way to power gate the i2c controller when no active transfers are going (and we actually save non-trivial power doing this) then we've got a nice place to put that code. NOTE: Unless we can actually save power by power gating the i2c peripheral when there are no active transfers, we would also just have the i2c_xfer()
Re: [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression
On Fri, Jun 20, 2014 at 04:30:33PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 20, 2014 at 03:39:51PM -0700, j...@joshtriplett.org wrote:
> > On Fri, Jun 20, 2014 at 03:11:20PM -0700, Paul E. McKenney wrote:
> > > On Fri, Jun 20, 2014 at 02:24:23PM -0700, j...@joshtriplett.org wrote:
> > > > On Fri, Jun 20, 2014 at 12:12:36PM -0700, Paul E. McKenney wrote:
> > > > > o   Make cond_resched() a no-op for PREEMPT=y. This might well turn
> > > > >     out to be a good thing, but it doesn't help give RCU the
> > > > >     quiescent states that it needs.
> > > >
> > > > What about doing this, together with letting the fqs logic poke
> > > > un-quiesced kernel code as needed? That way, rather than having
> > > > cond_resched do any work, you have the fqs logic recognize that a
> > > > particular CPU has gone too long without quiescing, without disturbing
> > > > that CPU at all if it hasn't gone too long.
> > >
> > > My next stop is to post the previous series, but with a couple of
> > > exports and one bug fix uncovered by testing thus far, but after
> > > another round of testing. Then I am going to take a close look at
> > > this one:
> > >
> > > o   Push the checks further into cond_resched(), so that the
> > >     fastpath does the same sequence of instructions that the original
> > >     did. This might work well, but requires IPIs, which are not so
> > >     good for latencies on the remote CPU. It nevertheless might be a
> > >     decent long-term solution given that if your CPU is spending many
> > >     jiffies looping in the kernel, you aren't getting good latencies
> > >     anyway. It also has the benefit of allowing RCU to take advantage
> > >     of the implicit quiescent states of all cond_resched() calls,
> > >     and of eliminating the need for a separate cond_resched_rcu_qs()
> > >     and for RCU_COND_RESCHED_QS.
> > >
> > > The one you call out is of course interesting as well. But there are
> > > a couple of questions:
> > >
> > > 1.  Why wasn't cond_resched() a no-op in CONFIG_PREEMPT to start
> > >     with? It just seems too obvious a thing to do for it to possibly
> > >     be an oversight. (What, me paranoid?)
> > >
> > > 2.  When RCU recognizes that a particular CPU has gone too long,
> > >     exactly what are you suggesting that RCU do about it? When
> > >     formulating your answer, please give due consideration to the
> > >     implications of that CPU being a NO_HZ_FULL CPU. ;-)
> >
> > Send it an IPI that either causes it to flag a quiescent state
> > immediately if currently quiesced or causes it to quiesce at the next
> > opportunity if not.
>
> OK. But if we are in a !PREEMPT kernel,

That's not the case I was suggesting.

*If* the kernel is fully preemptible, then it makes little sense to put
any code in cond_resched, when instead another thread can simply cause
a preemption if it needs a quiescent state. That has the advantage of
not imposing any unnecessary polling on code running in the kernel.

In a !PREEMPT kernel, it makes a bit more sense to have cond_resched as
a voluntary preemption point. But voluntary preemption points don't
make as much sense in a kernel prepared to preempt a thread anywhere.

- Josh Triplett
[PATCH 1/2] lib: list_sort_test(): Return -ENOMEM when allocation fails
Signed-off-by: Rasmus Villemoes
---
 lib/list_sort.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/list_sort.c b/lib/list_sort.c
index 1183fa7..291412a 100644
--- a/lib/list_sort.c
+++ b/lib/list_sort.c
@@ -207,7 +207,7 @@ static int __init cmp(void *priv, struct list_head *a, struct list_head *b)
 
 static int __init list_sort_test(void)
 {
-	int i, count = 1, err = -EINVAL;
+	int i, count = 1, err = -ENOMEM;
 	struct debug_el *el;
 	struct list_head *cur, *tmp;
 	LIST_HEAD(head);
@@ -239,6 +239,7 @@ static int __init list_sort_test(void)
 
 	list_sort(NULL, &head, cmp);
 
+	err = -EINVAL;
 	for (cur = head.next; cur->next != &head; cur = cur->next) {
 		struct debug_el *el1;
 		int cmp_result;
-- 
1.9.2
[PATCH 2/2] lib: list_sort_test(): Add extra corruption check
Add a check to make sure that the prev pointer of the list head points
to the last element on the list.

Signed-off-by: Rasmus Villemoes
---
 lib/list_sort.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/list_sort.c b/lib/list_sort.c
index 291412a..832f525 100644
--- a/lib/list_sort.c
+++ b/lib/list_sort.c
@@ -272,6 +272,11 @@ static int __init list_sort_test(void)
 		}
 		count++;
 	}
+	if (head.prev != cur) {
+		printk(KERN_ERR "list_sort_test: error: list is corrupted\n");
+		goto exit;
+	}
+
 	if (count != TEST_LIST_LEN) {
 		printk(KERN_ERR "list_sort_test: error: bad list length %d",
-- 
1.9.2
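The invariant this patch tests can be shown outside the kernel with a minimal sketch. The `struct node` type and both helper names below are stand-ins invented for this example (the kernel's `struct list_head` and `list_add_tail()` are not used), so treat it as an illustration of the check rather than the test code itself.

```c
#include <stddef.h>

/* Minimal stand-in for a circular doubly linked list node (assumption). */
struct node {
	struct node *next, *prev;
};

/* Append n at the tail of the circular list anchored at head. */
void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Walk the list forward. Return 1 only if every back pointer matches the
 * forward link and head->prev points at the last element reached.
 * Assumes the forward links still form a cycle through head; a broken
 * next chain would need a separate bound on the walk.
 */
int list_is_consistent(const struct node *head)
{
	const struct node *cur;

	for (cur = head; cur->next != head; cur = cur->next)
		if (cur->next->prev != cur)
			return 0;
	/* cur is now the last element, or head itself for an empty list. */
	return head->prev == cur;
}
```

The final comparison is the extra check: a forward walk alone never looks at the head's prev pointer, so a sort that forgot to fix it up would otherwise go unnoticed.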
[PATCH 7/9] tools, perf: Make get_srcline fall back to sym+offset
From: Andi Kleen When the source line is not found fall back to sym + offset. This is generally much more useful than a raw address. For this we need to pass in the symbol from the caller. For some callers it's awkward to compute, so we stay at the old behaviour. Signed-off-by: Andi Kleen --- tools/perf/util/annotate.c | 2 +- tools/perf/util/callchain.c | 3 ++- tools/perf/util/map.c | 2 +- tools/perf/util/sort.c | 6 -- tools/perf/util/srcline.c | 12 +--- tools/perf/util/util.h | 4 +++- 6 files changed, 20 insertions(+), 9 deletions(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 12997ff..363b0c1 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1187,7 +1187,7 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map, goto next; offset = start + i; - src_line->path = get_srcline(map->dso, offset); + src_line->path = get_srcline(map->dso, offset, NULL, false); insert_source_line(_root, src_line); next: diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c index 2ca3655..ad4d7cb 100644 --- a/tools/perf/util/callchain.c +++ b/tools/perf/util/callchain.c @@ -690,7 +690,8 @@ char *callchain_list__sym_name(struct callchain_list *cl, cl->ms.map && !cl->srcline) cl->srcline = get_srcline(cl->ms.map->dso, map__rip_2objdump(cl->ms.map, - cl->ip)); + cl->ip), + cl->ms.sym, false); if (cl->srcline) printed = scnprintf(bf, bfsize, "%s %s", cl->ms.sym->name, cl->srcline); diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c index 8ccbb32..57cdc33 100644 --- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -355,7 +355,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix, if (map && map->dso) { srcline = get_srcline(map->dso, - map__rip_2objdump(map, addr)); + map__rip_2objdump(map, addr), NULL, true); if (srcline != SRCLINE_UNKNOWN) ret = fprintf(fp, "%s%s", prefix, srcline); free_srcline(srcline); diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c 
index 901f44b..fee07ca 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -285,7 +285,8 @@ sort__srcline_cmp(struct hist_entry *left, struct hist_entry *right) else { struct map *map = left->ms.map; left->srcline = get_srcline(map->dso, - map__rip_2objdump(map, left->ip)); + map__rip_2objdump(map, left->ip), + left->ms.sym, true); } } if (!right->srcline) { @@ -294,7 +295,8 @@ sort__srcline_cmp(struct hist_entry *left, struct hist_entry *right) else { struct map *map = right->ms.map; right->srcline = get_srcline(map->dso, - map__rip_2objdump(map, right->ip)); +map__rip_2objdump(map, right->ip), +right->ms.sym, true); } } return strcmp(right->srcline, left->srcline); diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c index ac877f9..36a7aff 100644 --- a/tools/perf/util/srcline.c +++ b/tools/perf/util/srcline.c @@ -8,12 +8,13 @@ #include "util/util.h" #include "util/debug.h" +#include "symbol.h" + #ifdef HAVE_LIBBFD_SUPPORT /* * Implement addr2line using libbfd. */ -#define PACKAGE "perf" #include struct a2l_data { @@ -250,7 +251,8 @@ void dso__free_a2l(struct dso *dso __maybe_unused) */ #define A2L_FAIL_LIMIT 123 -char *get_srcline(struct dso *dso, unsigned long addr) +char *get_srcline(struct dso *dso, unsigned long addr, struct symbol *sym, + bool show_sym) { char *file = NULL; unsigned line = 0; @@ -289,7 +291,11 @@ out: dso->has_srcline = 0; dso__free_a2l(dso); } - if (asprintf(, "%s[%lx]", dso->short_name, addr) < 0) + if (sym) { + if (asprintf(, "%s+%ld", show_sym ? sym->name : "", + addr - sym->start) < 0) + return SRCLINE_UNKNOWN; + } else if (asprintf(, "%s[%lx]", dso->short_name, addr) < 0) return
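The fallback described in this patch (symbol plus offset when a symbol is known, otherwise DSO name plus raw address) can be sketched as a standalone function. `fallback_label()` and its parameters are hypothetical and are not perf's `get_srcline()`; the sketch formats through a fixed local buffer and prints the offset as unsigned, both of which are simplifications chosen here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build the label used when no file:line is found.
 *   sym_name != NULL -> "<name>+<offset>" (name left empty if !show_sym)
 *   sym_name == NULL -> "<dso>[<hex addr>]"
 * Returns a malloc()ed string, or NULL on failure; the caller frees it.
 */
char *fallback_label(const char *dso_name, unsigned long addr,
		     const char *sym_name, unsigned long sym_start,
		     int show_sym)
{
	char buf[256];
	char *out;
	int len;

	if (sym_name)
		len = snprintf(buf, sizeof(buf), "%s+%lu",
			       show_sym ? sym_name : "", addr - sym_start);
	else
		len = snprintf(buf, sizeof(buf), "%s[%lx]", dso_name, addr);

	/* Give up on a formatting error or a label too long for this sketch. */
	if (len < 0 || (size_t)len >= sizeof(buf))
		return NULL;

	out = malloc((size_t)len + 1);
	if (out)
		memcpy(out, buf, (size_t)len + 1);
	return out;
}
```

With a symbol `foo` starting at 0x1000, an address of 0x1010 yields `foo+16`; without a symbol the same address in `libc.so` yields `libc.so[1010]`.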
[PATCH 9/9] tools, perf: Add asprintf replacement
From: Andi Kleen

asprintf corrupts memory on some older glibc versions. Provide a
replacement. This fixes various segfaults with --branch-history on
older Fedoras.

Signed-off-by: Andi Kleen
---
 tools/perf/Makefile.perf   | 1 +
 tools/perf/util/asprintf.c | 28
 2 files changed, 29 insertions(+)
 create mode 100644 tools/perf/util/asprintf.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index ae20edf..57be4b7 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -372,6 +372,7 @@ LIB_OBJS += $(OUTPUT)util/vdso.o
 LIB_OBJS += $(OUTPUT)util/stat.o
 LIB_OBJS += $(OUTPUT)util/record.o
 LIB_OBJS += $(OUTPUT)util/srcline.o
+LIB_OBJS += $(OUTPUT)util/asprintf.o
 LIB_OBJS += $(OUTPUT)util/data.o
 
 LIB_OBJS += $(OUTPUT)ui/setup.o
diff --git a/tools/perf/util/asprintf.c b/tools/perf/util/asprintf.c
new file mode 100644
index 000..9aafaca
--- /dev/null
+++ b/tools/perf/util/asprintf.c
@@ -0,0 +1,28 @@
+/* Replacement for asprintf as it's buggy in older glibc versions */
+#include
+#include
+#include
+#include
+
+int vasprintf(char **str, const char *fmt, va_list ap)
+{
+	char buf[1024];
+	int len = vsnprintf(buf, sizeof buf, fmt, ap);
+
+	*str = malloc(len + 1);
+	if (!*str)
+		return -1;
+	strcpy(*str, buf);
+	return len;
+}
+
+int asprintf(char **str, const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vasprintf(str, fmt, ap);
+	va_end(ap);
+	return ret;
+}
-- 
1.9.3
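For comparison, here is a sketch of the same idea without the fixed 1024-byte buffer, using two vsnprintf() passes: one to measure, one to format. It assumes C99 vsnprintf() semantics (the return value is the full length that would have been written), which may not hold on the very old C libraries the patch targets. The `sketch_` names are made up for this example so they do not clash with the libc functions.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical two-pass variant; not the functions from the patch. */
int sketch_vasprintf(char **str, const char *fmt, va_list ap)
{
	va_list ap2;
	int len;

	/* First pass: measure only. A va_list may be consumed once, so copy it. */
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap2);
	va_end(ap2);
	if (len < 0)
		return -1;

	*str = malloc((size_t)len + 1);
	if (!*str)
		return -1;

	/* Second pass: format into the exactly sized buffer. */
	vsnprintf(*str, (size_t)len + 1, fmt, ap);
	return len;
}

int sketch_asprintf(char **str, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = sketch_vasprintf(str, fmt, ap);
	va_end(ap);
	return ret;
}
```

The trade-off is a second formatting pass in exchange for handling output longer than 1023 characters and for rejecting a negative length before it reaches malloc().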
[PATCH 3/9] perf, tools: Enable printing the srcline in the history v4
From: Andi Kleen For lbr-as-callgraph we need to see the line number in the history, because many LBR entries can be in a single function, and just showing the same function name many times is not useful. When the history code is configured to sort by address, also try to resolve the address to a file:srcline and display this in the browser. If that doesn't work still display the address. This can be also useful without LBRs for understanding which call in a large function (or in which inlined function) called something else. Contains fixes from Namhyung Kim v2: Refactor code into common function v3: Fix GTK build v4: Rebase Signed-off-by: Andi Kleen --- tools/perf/ui/browsers/hists.c | 17 - tools/perf/ui/gtk/hists.c | 11 +-- tools/perf/ui/stdio/hist.c | 23 +-- tools/perf/util/callchain.c| 29 + tools/perf/util/callchain.h| 5 + tools/perf/util/machine.c | 2 +- tools/perf/util/srcline.c | 6 -- 7 files changed, 49 insertions(+), 44 deletions(-) diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index 52c03fb..e0f32eb 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -422,23 +422,6 @@ out: return key; } -static char *callchain_list__sym_name(struct callchain_list *cl, - char *bf, size_t bfsize, bool show_dso) -{ - int printed; - - if (cl->ms.sym) - printed = scnprintf(bf, bfsize, "%s", cl->ms.sym->name); - else - printed = scnprintf(bf, bfsize, "%#" PRIx64, cl->ip); - - if (show_dso) - scnprintf(bf + printed, bfsize - printed, " %s", - cl->ms.map ? 
cl->ms.map->dso->short_name : "unknown"); - - return bf; -} - #define LEVEL_OFFSET_STEP 3 static int hist_browser__show_callchain_node_rb_tree(struct hist_browser *browser, diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c index 6ca60e4..a21b77e 100644 --- a/tools/perf/ui/gtk/hists.c +++ b/tools/perf/ui/gtk/hists.c @@ -87,15 +87,6 @@ void perf_gtk__init_hpp(void) perf_gtk__hpp_color_overhead_acc; } -static void callchain_list__sym_name(struct callchain_list *cl, -char *bf, size_t bfsize) -{ - if (cl->ms.sym) - scnprintf(bf, bfsize, "%s", cl->ms.sym->name); - else - scnprintf(bf, bfsize, "%#" PRIx64, cl->ip); -} - static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store, GtkTreeIter *parent, int col, u64 total) { @@ -126,7 +117,7 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store, scnprintf(buf, sizeof(buf), "%5.2f%%", percent); gtk_tree_store_set(store, , 0, buf, -1); - callchain_list__sym_name(chain, buf, sizeof(buf)); + callchain_list__sym_name(chain, buf, sizeof(buf), false); gtk_tree_store_set(store, , col, buf, -1); if (need_new_parent) { diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c index 90122ab..570d79d 100644 --- a/tools/perf/ui/stdio/hist.c +++ b/tools/perf/ui/stdio/hist.c @@ -41,6 +41,7 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain, { int i; size_t ret = 0; + char bf[1024]; ret += callchain__fprintf_left_margin(fp, left_margin); for (i = 0; i < depth; i++) { @@ -56,11 +57,8 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain, } else ret += fprintf(fp, "%s", " "); } - if (chain->ms.sym) - ret += fprintf(fp, "%s\n", chain->ms.sym->name); - else - ret += fprintf(fp, "0x%0" PRIx64 "\n", chain->ip); - + fputs(callchain_list__sym_name(chain, bf, sizeof(bf), false), fp); + fputc('\n', fp); return ret; } @@ -168,6 +166,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root, struct rb_node 
*node; int i = 0; int ret = 0; + char bf[1024]; /* * If have one single callchain root, don't bother printing @@ -196,10 +195,8 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root, } else ret += callchain__fprintf_left_margin(fp, left_margin); - if (chain->ms.sym) - ret += fprintf(fp, " %s\n", chain->ms.sym->name); - else - ret += fprintf(fp, " %p\n", (void *)(long)chain->ip); + ret += fprintf(fp, "%s\n", callchain_list__sym_name(chain, bf, sizeof(bf), +
[PATCH 2/9] perf, tools: Add --branch-history option to report v3
From: Andi Kleen Add a --branch-history option to perf report that changes all the settings necessary for using the branches in callstacks. This is just a short cut to make this nicer to use, it does not enable any functionality by itself. v2: Change sort order. Rename option to --branch-history to be less confusing. v3: Updates Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-report.txt | 5 + tools/perf/builtin-report.c | 34 +++- tools/perf/util/machine.c| 12 +-- 3 files changed, 40 insertions(+), 11 deletions(-) diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index 29a21b0..45f73c9 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -255,6 +255,11 @@ OPTIONS branch stacks and it will automatically switch to the branch view mode, unless --no-branch-stack is used. +--branch-history:: + Add the addresses of sampled taken branches to the callstack. + This allows to examine the path the program took to each sample. + The data collection must have used -b (or -j) and -g. + --objdump=:: Path to objdump binary. diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index 4dcb4db..c2dc8f27 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -220,8 +220,9 @@ static int report__setup_sample_type(struct report *rep) return -EINVAL; } if (symbol_conf.use_callchain) { - ui__error("Selected -g but no callchain data. Did " - "you call 'perf record' without -g?\n"); + ui__error("Selected -g or --branch-history but no " + "callchain data. 
Did\n" + "you call 'perf record' without -g?\n"); return -1; } } else if (!rep->dont_use_callchains && @@ -544,6 +545,16 @@ parse_branch_mode(const struct option *opt __maybe_unused, } static int +parse_branch_call_mode(const struct option *opt __maybe_unused, + const char *str __maybe_unused, int unset) +{ + int *branch_mode = opt->value; + + *branch_mode = !unset; + return 0; +} + +static int parse_percent_limit(const struct option *opt, const char *str, int unset __maybe_unused) { @@ -558,7 +569,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused) struct perf_session *session; struct stat st; bool has_br_stack = false; - int branch_mode = -1; + int branch_mode = -1, branch_call_mode = -1; int ret = -1; char callchain_default_opt[] = "fractal,0.5,callee"; const char * const report_usage[] = { @@ -669,7 +680,11 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused) OPT_BOOLEAN(0, "group", _conf.event_group, "Show event group information together"), OPT_CALLBACK_NOOPT('b', "branch-stack", _mode, "", - "use branch records for histogram filling", parse_branch_mode), + "use branch records for per branch histogram filling", + parse_branch_mode), + OPT_CALLBACK_NOOPT(0, "branch-history", _call_mode, "", + "add last branch records to call history", + parse_branch_call_mode), OPT_STRING(0, "objdump", _path, "path", "objdump binary to use for disassembly and annotations"), OPT_BOOLEAN(0, "demangle", _conf.demangle, @@ -719,10 +734,19 @@ repeat: has_br_stack = perf_header__has_feat(>header, HEADER_BRANCH_STACK); - if (branch_mode == -1 && has_br_stack) { + if (branch_mode == -1 && has_br_stack && branch_call_mode == -1) { sort__mode = SORT_MODE__BRANCH; symbol_conf.cumulate_callchain = false; } + if (branch_call_mode != -1) { + callchain_param.branch_callstack = 1; + callchain_param.key = CCKEY_ADDRESS; + symbol_conf.use_callchain = true; + callchain_register_param(_param); + if (sort_order == default_sort_order) + 
sort_order = "srcline,symbol,dso"; + branch_mode = 0; + } if (report.mem_mode) { if (sort__mode == SORT_MODE__BRANCH) { diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index dee1695..ab04045 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -1379,15 +1379,15 @@ static int machine__resolve_callchain_sample(struct machine *machine, * - No annotations (should annotate somehow) */ - if (branch->nr >
[PATCH 1/9] perf, tools: Support handling complete branch stacks as histograms v7
From: Andi Kleen Currently branch stacks can be only shown as edge histograms for individual branches. I never found this display particularly useful. This implements an alternative mode that creates histograms over complete branch traces, instead of individual branches, similar to how normal callgraphs are handled. This is done by putting it in front of the normal callgraph and then using the normal callgraph histogram infrastructure to unify them. This way in complex functions we can understand the control flow that lead to a particular sample, and may even see some control flow in the caller for short functions. Example (simplified, of course for such simple code this is usually not needed): tcall.c: volatile a = 1, b = 10, c; __attribute__((noinline)) f2() { c = a / b; } __attribute__((noinline)) f1() { f2(); f2(); } main() { int i; for (i = 0; i < 100; i++) f1(); } % perf record -b -g ./tsrc/tcall [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ] % perf report --branch-history ... 54.91% tcall.c:6 [.] f2 tcall | |--65.53%-- f2 tcall.c:5 | | | |--70.83%-- f1 tcall.c:11 | | f1 tcall.c:10 | | main tcall.c:18 | | main tcall.c:18 | | main tcall.c:17 | | main tcall.c:17 | | f1 tcall.c:13 | | f1 tcall.c:13 | | f2 tcall.c:7 | | f2 tcall.c:5 | | f1 tcall.c:12 | | f1 tcall.c:12 | | f2 tcall.c:7 | | f2 tcall.c:5 | | f1 tcall.c:11 | | | --29.17%-- f1 tcall.c:12 | f1 tcall.c:12 | f2 tcall.c:7 | f2 tcall.c:5 | f1 tcall.c:11 | f1 tcall.c:10 | main tcall.c:18 | main tcall.c:18 | main tcall.c:17 | main tcall.c:17 | f1 tcall.c:13 | f1 tcall.c:13 | f2 tcall.c:7 | f2 tcall.c:5 | f1 tcall.c:12 The default output is unchanged. This is only implemented in perf report, no change to record or anywhere else. This adds the basic code to report: - add a new "branch" option to the -g option parser to enable this mode - when the flag is set include the LBR into the callstack in machine.c. 
The rest of the history code is unchanged and doesn't know the difference between LBR entry and normal call entry. - detect overlaps with the callchain - remove small loop duplicates in the LBR Current limitations: - The LBR flags (mispredict etc.) are not shown in the history and LBR entries have no special marker. - It would be nice if annotate marked the LBR entries somehow (e.g. with arrows) v2: Various fixes. v3: Merge further patches into this one. Fix white space. v4: Improve manpage. Address review feedback. v5: Rename functions. Better error message without -g. Fix crash without -b. v6: Rebase v7: Rebase. Use NO_ENTRY in memset. Signed-off-by: Andi Kleen --- tools/perf/Documentation/perf-report.txt | 7 +- tools/perf/builtin-report.c | 4 +- tools/perf/util/callchain.c | 11 ++- tools/perf/util/callchain.h | 1 + tools/perf/util/machine.c| 159 +++ tools/perf/util/symbol.h | 3 +- 6 files changed, 158 insertions(+), 27 deletions(-) diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index cefdf43..29a21b0 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -143,7 +143,7 @@ OPTIONS --dump-raw-trace:: Dump raw trace in ASCII. --g [type,min[,limit],order[,key]]:: +-g [type,min[,limit],order[,key][,branch]]:: --call-graph:: Display call chains using type, min percent threshold, optional print limit and order. @@ -161,6 +161,11 @@ OPTIONS - function: compare on functions - address: compare on individual code addresses + branch can be: + - branch: include last branch information in callgraph + when available. Usually more convenient to use --branch-history + for this. + Default: fractal,0.5,callee,function.
[PATCH 4/9] perf, tools: Only print base source file for srcline
From: Andi Kleen

For perf report with --sort srcline, only print the base source file
name. This makes the results generally fit the screen much better.
The path is usually not that useful anyway because it often comes
from a different system.

Signed-off-by: Andi Kleen
---
 tools/perf/util/srcline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index c6a7cdc..ac877f9 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -274,7 +274,7 @@ char *get_srcline(struct dso *dso, unsigned long addr)
 	if (!addr2line(dso_name, addr, &file, &line, dso))
 		goto out;
 
-	if (asprintf(&srcline, "%s:%u", file, line) < 0) {
+	if (asprintf(&srcline, "%s:%u", basename(file), line) < 0) {
 		free(file);
 		goto out;
 	}
-- 
1.9.3
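As a rough illustration of the effect of this change, here is a
standalone sketch in userspace C. The names path_base() and
format_short_srcline() are hypothetical and are not perf functions;
path_base() is a simplified stand-in for basename() that only looks
for the last '/'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for basename(): return a pointer just past the
 * last '/' in path, or path itself when it contains no '/'.
 */
static const char *path_base(const char *path)
{
	const char *slash;

	assert(path != NULL);
	slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}

/* Format "file:line" using only the base name of file. */
static int format_short_srcline(char *bf, size_t size, const char *file,
				unsigned int line)
{
	return snprintf(bf, size, "%s:%u", path_base(file), line);
}
```

Unlike the real basename(), this sketch does not special-case paths
with a trailing slash; that should not matter for source file names
as reported by addr2line, but it is an assumption of the example.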
perf: Implement lbr-as-callgraph v8
[Even more review feedback and some bugs addressed.]
[Only port to changes in perf/core. No other changes.]
[Rebase to latest perf/core]
[Another rebase. No changes]

This patchkit implements lbr-as-callgraphs in perf report,
as an alternative way to present LBR information.

Current perf report does a histogram over the branch edges,
which is useful to look at basic blocks, but doesn't tell
you anything about the larger control flow behaviour.

This patchkit adds a new option --branch-history that
adds the branch paths to the callgraph history instead.

This allows reasoning about individual branch paths leading
to specific samples.

Also available at
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/lbr-callgraph5

v2:
- rebased on perf/core
- fix various issues
- rename the option to --branch-history
- various fixes to display the information more concisely
v3:
- White space changes
- Consolidate some patches
- Update some descriptions
v4:
- Fix various display problems
- Unknown srcline is now printed as symbol+offset
- Refactor some code to address review feedback
- Merge with latest tip
- Fix missing srcline display in stdio hist output.
v5:
- Rename functions
- Fix gtk build problem
- Fix crash without -g
- Improve error messages
- Improve srcline display in various ways
v6:
- Port to latest perf/core
v7:
- Really port to latest perf/core
v8:
- Rebased on 3.16-rc1

Example output:

    % perf record -b -g ./tsrc/tcall
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
    % perf report --branch-history
    ...
    54.91%  tcall.c:6  [.] f2  tcall
            |
            |--65.53%-- f2 tcall.c:5
            |          |
            |          |--70.83%-- f1 tcall.c:11
            |          |          f1 tcall.c:10
            |          |          main tcall.c:18
            |          |          main tcall.c:18
            |          |          main tcall.c:17
            |          |          main tcall.c:17
            |          |          f1 tcall.c:13
            |          |          f1 tcall.c:13
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:12
            |          |          f1 tcall.c:12
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:11
[PATCH 5/9] perf, tools: Support source line numbers in annotate
From: Andi Kleen With srcline key/sort'ing it's useful to have line numbers in the annotate window. This patch implements this. Use objdump -l to request the line numbers and save them in the line structure. Then the browser displays them for source lines. The line numbers are not displayed by default, but can be toggled on with 'k' There is one unfortunate problem with this setup. For lines not containing source and which are outside functions objdump -l reports line numbers off by a few: it always reports the first line number in the next function even for lines that are outside the function. I haven't found a nice way to detect/correct this. Probably objdump has to be fixed. See https://sourceware.org/bugzilla/show_bug.cgi?id=16433 The line numbers are still useful even with these problems, as most are correct and the ones which are not are nearby. Signed-off-by: Andi Kleen --- tools/perf/ui/browsers/annotate.c | 13 - tools/perf/util/annotate.c| 30 +- tools/perf/util/annotate.h| 1 + 3 files changed, 38 insertions(+), 6 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index f0697a3..8df6787 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -27,6 +27,7 @@ static struct annotate_browser_opt { bool hide_src_code, use_offset, jump_arrows, +show_linenr, show_nr_jumps; } annotate_browser__opts = { .use_offset = true, @@ -128,7 +129,11 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int if (!*dl->line) slsmg_write_nstring(" ", width - pcnt_width); else if (dl->offset == -1) { - printed = scnprintf(bf, sizeof(bf), "%*s ", + if (dl->line_nr && annotate_browser__opts.show_linenr) + printed = scnprintf(bf, sizeof(bf), "%*s %-5d ", + ab->addr_width, " ", dl->line_nr); + else + printed = scnprintf(bf, sizeof(bf), "%*s ", ab->addr_width, " "); slsmg_write_nstring(bf, printed); slsmg_write_nstring(dl->line, width - printed - pcnt_width + 1); @@ -733,6 +738,7 @@ 
static int annotate_browser__run(struct annotate_browser *browser, "o Toggle disassembler output/simplified view\n" "s Toggle source code view\n" "/ Search string\n" + "k Toggle line numbers\n" "r Run available scripts\n" "? Search string backwards\n"); continue; @@ -741,6 +747,10 @@ static int annotate_browser__run(struct annotate_browser *browser, script_browse(NULL); continue; } + case 'k': + annotate_browser__opts.show_linenr = + !annotate_browser__opts.show_linenr; + break; case 'H': nd = browser->curr_hot; break; @@ -984,6 +994,7 @@ static struct annotate_config { } annotate__configs[] = { ANNOTATE_CFG(hide_src_code), ANNOTATE_CFG(jump_arrows), + ANNOTATE_CFG(show_linenr), ANNOTATE_CFG(show_nr_jumps), ANNOTATE_CFG(use_offset), }; diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 809b4c5..12997ff 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -17,11 +17,13 @@ #include "debug.h" #include "annotate.h" #include "evsel.h" +#include #include #include const char *disassembler_style; const char *objdump_path; +static regex_t file_lineno; static struct ins *ins__find(const char *name); static int disasm_line__parse(char *line, char **namep, char **rawp); @@ -564,13 +566,15 @@ out_free_name: return -1; } -static struct disasm_line *disasm_line__new(s64 offset, char *line, size_t privsize) +static struct disasm_line *disasm_line__new(s64 offset, char *line, + size_t privsize, int line_nr) { struct disasm_line *dl = zalloc(sizeof(*dl) + privsize); if (dl != NULL) { dl->offset = offset; dl->line = strdup(line); + dl->line_nr = line_nr; if (dl->line == NULL) goto out_delete; @@ -782,13 +786,15 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st * The ops.raw part will be parsed further according to type of the instruction. */ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map, - FILE *file, size_t
[PATCH 6/9] perf, tools: Fix srcline sort key output to use width
From: Andi Kleen

The srcline sort output ignored the width, which caused various
problems with displaying srcline in the TUI browser. Just cut it off
at width.

Signed-off-by: Andi Kleen
---
 tools/perf/util/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 45512ba..901f44b 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -304,7 +304,7 @@ static int hist_entry__srcline_snprintf(struct hist_entry *he, char *bf,
 					size_t size,
 					unsigned int width __maybe_unused)
 {
-	return repsep_snprintf(bf, size, "%s", he->srcline);
+	return repsep_snprintf(bf, size, "%.*s", width, he->srcline);
 }
 
 struct sort_entry sort_srcline = {
-- 
1.9.3
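For readers less familiar with the "%.*s" conversion used here, a
minimal userspace sketch follows. It uses plain snprintf() rather than
perf's repsep_snprintf(), and format_srcline() is a hypothetical name
chosen for this example only. The precision given through '*' caps
the number of characters taken from the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): copy at most width characters
 * of srcline into bf. Returns whatever snprintf() returns.
 */
static int format_srcline(char *bf, size_t size, unsigned int width,
			  const char *srcline)
{
	assert(bf != NULL && size > 0);
	/* The '*' precision argument of "%.*s" must have type int. */
	return snprintf(bf, size, "%.*s", (int)width, srcline);
}
```

With a width of 7, "tcall.c:11" would be cut to "tcall.c"; with a
width larger than the string, the string is emitted unchanged. Note
the cast: the precision argument is an int, while the width here is
an unsigned int.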
[PATCH 8/9] tools, perf: Make srcline output address with -v
From: Andi Kleen

When -v is specified, always print the hex address for the srcline.

Signed-off-by: Andi Kleen
---
 tools/perf/util/srcline.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 36a7aff..a22be7c 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -258,6 +258,12 @@ char *get_srcline(struct dso *dso, unsigned long addr, struct symbol *sym,
 	unsigned line = 0;
 	char *srcline;
 	const char *dso_name;
+	char astr[50];
+
+	if (verbose)
+		snprintf(astr, sizeof astr, " %#lx", addr);
+	else
+		astr[0] = 0;
 
 	if (!dso->has_srcline)
 		goto out;
@@ -276,7 +282,12 @@ char *get_srcline(struct dso *dso, unsigned long addr, struct symbol *sym,
 	if (!addr2line(dso_name, addr, &file, &line, dso))
 		goto out;
 
-	if (asprintf(&srcline, "%s:%u", basename(file), line) < 0) {
+	if (line == 0) {
+		free(file);
+		goto fallback;
+	}
+
+	if (asprintf(&srcline, "%s:%u%s", basename(file), line, astr) < 0) {
 		free(file);
 		goto out;
 	}
@@ -291,9 +302,10 @@ out:
 		dso->has_srcline = 0;
 		dso__free_a2l(dso);
 	}
+fallback:
 	if (sym) {
-		if (asprintf(&srcline, "%s+%ld", show_sym ? sym->name : "",
-					addr - sym->start) < 0)
+		if (asprintf(&srcline, "%s+%ld%s", show_sym ? sym->name : "",
+					addr - sym->start, astr) < 0)
 			return SRCLINE_UNKNOWN;
 	} else if (asprintf(&srcline, "%s[%lx]", dso->short_name, addr) < 0)
 		return SRCLINE_UNKNOWN;
-- 
1.9.3
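The resulting output formats can be sketched in standalone C as
below. This is only an approximation of the control flow in the
patch, under several assumptions: format_srcline_verbose() is a
hypothetical name, the caller passes the symbol offset directly
instead of a struct symbol, and the last fallback omits the dso short
name that the real code prints.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of the three output shapes:
 *   "file:line[ 0xaddr]"   when a line number is known
 *   "sym+offset[ 0xaddr]"  when only the symbol is known
 *   "[addr]"               otherwise
 * The " 0xaddr" suffix is only added when verbose is non-zero.
 */
static int format_srcline_verbose(char *bf, size_t size, const char *file,
				  unsigned int line, const char *sym,
				  unsigned long offset, unsigned long addr,
				  int verbose)
{
	char astr[50];

	assert(bf != NULL && size > 0);
	if (verbose)
		snprintf(astr, sizeof(astr), " %#lx", addr);
	else
		astr[0] = '\0';

	/* A line number of 0 means "unknown", so fall back to the symbol. */
	if (file != NULL && line != 0)
		return snprintf(bf, size, "%s:%u%s", file, line, astr);
	if (sym != NULL)
		return snprintf(bf, size, "%s+%lu%s", sym, offset, astr);
	return snprintf(bf, size, "[%lx]", addr);
}
```

Building the suffix once into a small buffer keeps each format string
simple, since an empty suffix costs nothing in the non-verbose case.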
Re: [PATCH] selinux: no recursive read_lock of policy_rwlock in security_genfs_sid()
On 06/20/2014 01:49 PM, Stephen Smalley wrote:
> On 06/20/2014 01:45 PM, Waiman Long wrote:
>> With introduction of fair queued rwlock, recursive read_lock()
>> may hang the offending process if there is a write_lock()
>> somewhere in between.
>>
>> With recursive read_lock checking enabled, the following error was
>> reported:
>>
>> =
>> [ INFO: possible recursive locking detected ]
>> 3.16.0-rc1 #2 Tainted: GE
>> -
>> load_policy/708 is trying to acquire lock:
>>  (policy_rwlock){.+.+..}, at: [] security_genfs_sid+0x3a/0x170
>>
>> but task is already holding lock:
>>  (policy_rwlock){.+.+..}, at: [] security_fs_use+0x2c/0x110
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>   lock(policy_rwlock);
>>   lock(policy_rwlock);
>>
>> This patch fixes the occurrence of recursive read_lock() of
>> policy_rwlock in security_genfs_sid() by adding a 5th argument to
>> indicate if the rwlock has been taken.
>>
>> Signed-off-by: Waiman Long
>
> Thanks, but I'd prefer to instead create a static helper function in
> services.c that does not take the lock at all, use that function from
> security_fs_use, and leave the extern function unmodified.

On second thought, this is exactly how I want to change the patch. I
will send out a new one later today.

-Longman
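The helper-function approach suggested above might look roughly like
the following userspace sketch. Everything in it is hypothetical: the
toy lock only counts read-lock nesting depth so that recursion can be
observed, and the function names and the placeholder lookup are not
taken from services.c.

```c
#include <assert.h>

/* Toy stand-in for policy_rwlock: it only tracks read-lock nesting. */
static int read_depth;
static int max_read_depth;

static void toy_read_lock(void)
{
	read_depth++;
	if (read_depth > max_read_depth)
		max_read_depth = read_depth;
}

static void toy_read_unlock(void)
{
	assert(read_depth > 0);
	read_depth--;
}

/* Helper that takes no lock: the caller must already hold it. */
static int genfs_sid_locked(int key)
{
	assert(read_depth > 0);
	return 100 + key;	/* placeholder for the real lookup */
}

/* Extern-style entry point: takes the lock, then uses the helper. */
static int genfs_sid(int key)
{
	int sid;

	toy_read_lock();
	sid = genfs_sid_locked(key);
	toy_read_unlock();
	return sid;
}

/* Already holds the lock, so it calls the helper, not genfs_sid(). */
static int fs_use(int key)
{
	int sid;

	toy_read_lock();
	sid = genfs_sid_locked(key);
	toy_read_unlock();
	return sid;
}
```

With this split the nesting depth never exceeds one, and the
signature of the externally visible function stays unchanged, which
is the property asked for in the review comment.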
[PATCHv6 1/3] devicetree: Addition of the Altera SDRAM controller
From: Thor Thayer

Addition of the Altera SDRAM Controller bindings and device tree changes.

v2: Changes to SoC SDRAM EDAC code.

v3: Implement code suggestions for SDRAM EDAC code.

v4: Remove syscon from SDRAM controller bindings.

v5: No Change, bump version for consistency.

v6: Only map the ctrlcfg register as syscon.

Signed-off-by: Thor Thayer
---
 .../bindings/arm/altera/socfpga-sdram.txt | 11 +++
 arch/arm/boot/dts/socfpga.dtsi            |  5 +
 2 files changed, 16 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt

diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt
new file mode 100644
index 000..5027026
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt
@@ -0,0 +1,11 @@
+Altera SOCFPGA SDRAM Controller
+
+Required properties:
+- compatible : "altr,sdr-ctl";
+- reg : Should contain 1 register ranges(address and length)
+
+Example:
+	sdrctl@ffc25000 {
+		compatible = "altr,sdr-ctl";
+		reg = <0xffc25000 0x4>;
+	};
diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
index 4676f25..310292e 100644
--- a/arch/arm/boot/dts/socfpga.dtsi
+++ b/arch/arm/boot/dts/socfpga.dtsi
@@ -682,6 +682,11 @@
 			clocks = <_sp_clk>;
 		};
 
+		sdrctl@ffc25000 {
+			compatible = "altr,sdr-ctl", "syscon";
+			reg = <0xffc25000 0x4>;
+		};
+
 		rst: rstmgr@ffd05000 {
 			compatible = "altr,rst-mgr";
 			reg = <0xffd05000 0x1000>;
-- 
1.7.9.5
Re: [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression
On Fri, Jun 20, 2014 at 03:39:51PM -0700, j...@joshtriplett.org wrote: > On Fri, Jun 20, 2014 at 03:11:20PM -0700, Paul E. McKenney wrote: > > On Fri, Jun 20, 2014 at 02:24:23PM -0700, j...@joshtriplett.org wrote: > > > On Fri, Jun 20, 2014 at 12:12:36PM -0700, Paul E. McKenney wrote: > > > > o Make cond_resched() a no-op for PREEMPT=y. This might well turn > > > > out to be a good thing, but it doesn't help give RCU the > > > > quiescent > > > > states that it needs. > > > > > > What about doing this, together with letting the fqs logic poke > > > un-quiesced kernel code as needed? That way, rather than having > > > cond_resched do any work, you have the fqs logic recognize that a > > > particular CPU has gone too long without quiescing, without disturbing > > > that CPU at all if it hasn't gone too long. > > > > My next stop is to post the previous series, but with a couple of > > exports and one bug fix uncovered by testing thus far, but after > > another round of testing. Then I am going to take a close look at > > this one: > > > > o Push the checks further into cond_resched(), so that the > > fastpath does the same sequence of instructions that the original > > did. This might work well, but requires IPIs, which are not so > > good for latencies on the remote CPU. It nevertheless might be a > > decent long-term solution given that if your CPU is spending many > > jiffies looping in the kernel, you aren't getting good latencies > > anyway. It also has the benefit of allowing RCU to take advantage > > of the implicit quiescent states of all cond_resched() calls, > > and of eliminating the need for a separate cond_resched_rcu_qs() > > and for RCU_COND_RESCHED_QS. > > > > The one you call out is of course interesting as well. But there are > > a couple of questions: > > > > 1. Why wasn't cond_resched() a no-op in CONFIG_PREEMPT to start > > with? It just seems to obvious a thing to do for it to possibly > > be an oversight. (What, me paranoid?) > > > > 2. 
When RCU recognizes that a particular CPU has gone too long, > > exactly what are you suggesting that RCU do about it? When > > formulating your answer, please give due consideration to the > > implications of that CPU being a NO_HZ_FULL CPU. ;-) > > Send it an IPI that either causes it to flag a quiescent state > immediately if currently quiesced or causes it to quiesce at the next > opportunity if not. OK. But if we are in a !PREEMPT kernel, we have to assume that any point in the kernel is not a quiescent state, at least for the rcu_read_lock() flavor of RCU. So in that case, what constitutes the set of next opportunities, and what is the time bound on when the next opportunity will arrive? Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v8 2/4] Documentation: dts: Add bindings for APM X-Gene SoC ethernet driver
This patch adds documentation for APM X-Gene SoC ethernet DTS binding. Signed-off-by: Iyappan Subramanian Signed-off-by: Ravi Patel Signed-off-by: Keyur Chudgar --- .../devicetree/bindings/net/apm-xgene-enet.txt | 72 ++ 1 file changed, 72 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/apm-xgene-enet.txt diff --git a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt new file mode 100644 index 000..3e2a295 --- /dev/null +++ b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt @@ -0,0 +1,72 @@ +APM X-Gene SoC Ethernet nodes + +Ethernet nodes are defined to describe on-chip ethernet interfaces in +APM X-Gene SoC. + +Required properties: +- compatible: Should be "apm,xgene-enet" +- reg: Address and length of the register set for the device. It contains the + information of registers in the same order as described by reg-names +- reg-names: Should contain the register set names + "enet_csr": Ethernet control and status register address space + "ring_csr": Descriptor ring control and status register address space + "ring_cmd": Descriptor ring command register address space +- interrupts: Ethernet main interrupt +- clocks: Reference to the clock entry. +- local-mac-address: MAC address assigned to this device +- phy-connection-type: Interface type between ethernet device and PHY device +- phy-handle: Reference to a PHY node connected to this device + +- mdio:Device tree subnode with the following required + properties: + + - compatible: Must be "apm,xgene-mdio". + - #address-cells: Must be <1>. + - #size-cells: Must be <0>. + + For the phy on the mdio bus, there must be a node with the following + fields: + + - compatible: PHY identifier. Please refer ./phy.txt for the format. + - reg: The ID number for the phy. + +Optional properties: +- status : Should be "ok" or "disabled" for enabled/disabled. + Default is "ok". 
+ + +Example: + menetclk: menetclk { + compatible = "apm,xgene-device-clock"; + clock-output-names = "menetclk"; + status = "ok"; + }; + + menet: ethernet@1702 { + compatible = "apm,xgene-enet"; + status = "disabled"; + reg = <0x0 0x1702 0x0 0xd100>, + <0x0 0X1703 0x0 0X400>, + <0x0 0X1000 0x0 0X200>; + reg-names = "enet_csr", "ring_csr", "ring_cmd"; + interrupts = <0x0 0x3c 0x4>; + clocks = < 0>; + local-mac-address = [00 01 73 00 00 01]; + phy-connection-type = "rgmii"; + phy-handle = <>; + mdio { + compatible = "apm,xgene-mdio"; + #address-cells = <1>; + #size-cells = <0>; + menetphy: menetphy@3 { + compatible = "ethernet-phy-id001c.c915"; + reg = <0x3>; + }; + + }; + }; + +/* Board-specific peripheral configurations */ + { +status = "ok"; +}; -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v8 0/4] net: Add APM X-Gene SoC Ethernet driver support
Adding APM X-Gene SoC Ethernet driver. v8: Address comments from v7 review * changed angle bracket to double quotes in header file include. v7: Address comments from v6 review * fixed skb memory leak when dma_map_single fails in xmit. v6: Address comments from v5 review * added basic ethtool support * added ndo_get_stats64 call back * deleted priting Rx error messages * renamed set_bits to xgene_set_bits to fix kbuild error (make ARCH=powerpc) v5: Address comments from v4 review * Documentation: Added phy-handle, reg-names and changed mdio part * dtb: Added reg-names supplemental property * changed platform_get_resource to platform_get_resource_byname * added separate tx/rx set_desc/get_desc functions to do raw_write/raw_read * removed set_desc/get_desc table lookup logic * added error handling logic based on per packet descriptor bits * added software managed Rx packet and error counters * added busy wait for register read/writes * changed mdio_bus->id to avoid conflict * fixed mdio_bus leak in case of mdio_config error * changed phy reg hard coded value to MII_BMSR * changed phy addr hard coded value to phy_device->addr * added paranthesis around macro arguments * converted helper macros to inline functions * changed use of goto's only to common work such as cleanup v4: Address comments from v3 review * MAINTAINERS: changed status to supported * Kconfig: made default to no * changed to bool data type wherever applicable * cleaned up single bit set and masking code * removed statistics counters masking * removed unnecessary OOM message printing * fixed dma_map_single and dma_unmap_single size parameter * changed set bits macro body using new set_bits function v3: Address comments from v2 review * cleaned up set_desc and get_desc functions * added dtb mdio node and phy-handle subnode * renamed dtb phy-mode to phy-connection-type * added of_phy_connect call to connec to PHY * added empty line after last local variable declaration * removed type casting when not 
required * removed inline keyword from source files * removed CONFIG_CPU_BIG_ENDIAN ifdef v2 * Completely redesigned ethernet driver * Added support to work with big endian kernel * Renamed dtb phyid entry to phy_addr * Changed dtb local-mac-address entry to byte string format * Renamed dtb eth8clk entry to menetclk v1 * Initial version Signed-off-by: Iyappan Subramanian Signed-off-by: Ravi Patel Signed-off-by: Keyur Chudgar --- Iyappan Subramanian (4): MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver Documentation: dts: Add bindings for APM X-Gene SoC ethernet driver dts: Add bindings for APM X-Gene SoC ethernet driver drivers: net: Add APM X-Gene SoC ethernet driver support. .../devicetree/bindings/net/apm-xgene-enet.txt | 72 ++ MAINTAINERS| 8 + arch/arm64/boot/dts/apm-mustang.dts| 4 + arch/arm64/boot/dts/apm-storm.dtsi | 30 +- drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + drivers/net/ethernet/apm/Kconfig | 1 + drivers/net/ethernet/apm/Makefile | 5 + drivers/net/ethernet/apm/xgene/Kconfig | 9 + drivers/net/ethernet/apm/xgene/Makefile| 6 + .../net/ethernet/apm/xgene/xgene_enet_ethtool.c| 125 +++ drivers/net/ethernet/apm/xgene/xgene_enet_hw.c | 848 +++ drivers/net/ethernet/apm/xgene/xgene_enet_hw.h | 394 + drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 939 + drivers/net/ethernet/apm/xgene/xgene_enet_main.h | 109 +++ 15 files changed, 2549 insertions(+), 3 deletions(-) create mode 100644 Documentation/devicetree/bindings/net/apm-xgene-enet.txt create mode 100644 drivers/net/ethernet/apm/Kconfig create mode 100644 drivers/net/ethernet/apm/Makefile create mode 100644 drivers/net/ethernet/apm/xgene/Kconfig create mode 100644 drivers/net/ethernet/apm/xgene/Makefile create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_main.c 
create mode 100644 drivers/net/ethernet/apm/xgene/xgene_enet_main.h -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v8 1/4] MAINTAINERS: Add entry for APM X-Gene SoC ethernet driver
This patch adds a MAINTAINERS entry for the APM X-Gene SoC ethernet driver.

Signed-off-by: Iyappan Subramanian
Signed-off-by: Ravi Patel
Signed-off-by: Keyur Chudgar
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 134483f..d65a3be 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -700,6 +700,14 @@ S:	Maintained
 F:	drivers/net/appletalk/
 F:	net/appletalk/
 
+APPLIED MICRO (APM) X-GENE SOC ETHERNET DRIVER
+M:	Iyappan Subramanian
+M:	Keyur Chudgar
+M:	Ravi Patel
+S:	Supported
+F:	drivers/net/ethernet/apm/xgene/
+F:	Documentation/devicetree/bindings/net/apm-xgene-enet.txt
+
 APTINA CAMERA SENSOR PLL
 M:	Laurent Pinchart
 L:	linux-me...@vger.kernel.org
-- 
1.9.1
[PATCH v8 3/4] dts: Add bindings for APM X-Gene SoC ethernet driver
This patch adds bindings for APM X-Gene SoC ethernet driver.

Signed-off-by: Iyappan Subramanian
Signed-off-by: Ravi Patel
Signed-off-by: Keyur Chudgar
---
 arch/arm64/boot/dts/apm-mustang.dts |  4
 arch/arm64/boot/dts/apm-storm.dtsi  | 30 +++---
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/apm-mustang.dts b/arch/arm64/boot/dts/apm-mustang.dts
index 1247ca1..e2fb1ef 100644
--- a/arch/arm64/boot/dts/apm-mustang.dts
+++ b/arch/arm64/boot/dts/apm-mustang.dts
@@ -24,3 +24,7 @@
 		reg = < 0x1 0x 0x0 0x8000 >; /* Updated by bootloader */
 	};
 };
+
+ {
+	status = "ok";
+};
diff --git a/arch/arm64/boot/dts/apm-storm.dtsi b/arch/arm64/boot/dts/apm-storm.dtsi
index c5f0a47..bd7a614 100644
--- a/arch/arm64/boot/dts/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm-storm.dtsi
@@ -167,14 +167,13 @@
 				clock-output-names = "ethclk";
 			};
 
-			eth8clk: eth8clk {
+			menetclk: menetclk {
 				compatible = "apm,xgene-device-clock";
 				#clock-cells = <1>;
 				clocks = < 0>;
-				clock-names = "eth8clk";
 				reg = <0x0 0x1702C000 0x0 0x1000>;
 				reg-names = "csr-reg";
-				clock-output-names = "eth8clk";
+				clock-output-names = "menetclk";
 			};
 
 			sataphy1clk: sataphy1clk@1f21c000 {
@@ -363,5 +362,30 @@
 			#clock-cells = <1>;
 			clocks = < 0>;
 		};
+
+		menet: ethernet@1702 {
+			compatible = "apm,xgene-enet";
+			status = "disabled";
+			reg = <0x0 0x1702 0x0 0xd100>,
+			      <0x0 0X1703 0x0 0X400>,
+			      <0x0 0X1000 0x0 0X200>;
+			reg-names = "enet_csr", "ring_csr", "ring_cmd";
+			interrupts = <0x0 0x3c 0x4>;
+			dma-coherent;
+			clocks = < 0>;
+			local-mac-address = [00 01 73 00 00 01];
+			phy-connection-type = "rgmii";
+			phy-handle = <>;
+			mdio {
+				compatible = "apm,xgene-mdio";
+				#address-cells = <1>;
+				#size-cells = <0>;
+				menetphy: menetphy@3 {
+					compatible = "ethernet-phy-id001c.c915";
+					reg = <0x3>;
+				};
+
+			};
+		};
 	};
 };
-- 
1.9.1
Re: [PATCH] vfio: Fix endianness handling for emulated BARs
On Sat, 2014-06-21 at 00:14 +1000, Alexey Kardashevskiy wrote:

> We can still use __raw_writel, would that be ok?

Not unless you understand precisely what kind of memory barriers each
platform requires for these.

Cheers,
Ben.
[PATCHv6 3/3] edac: altera: Add EDAC support for SDRAM Ctlr
From: Thor Thayer Addition of the driver to support the Altera SDRAM Controller. This patch adds support for the CycloneV and ArriaV SDRAM controllers. Correction and reporting of SBEs, Panic on DBEs. v2: Use the SDRAM controller registers to calculate memory size instead of the Device Tree. Update To & Cc list. Add maintainer information. v3: EDAC driver cleanup based on comments from Mailing list. v4: Panic on DBE. Add macro around inject-error reads to prevent them from being optimized out. Remove of_match_ptr since this will always use Device Tree. v5: Addition of printk to trigger function to ensure read vars are not optimized out. v6: Changes to split out shared SDRAM controller reg (offset 0x00) as a syscon device and allocate ECC specific SDRAM registers to EDAC. Signed-off-by: Thor Thayer --- drivers/edac/Kconfig |9 + drivers/edac/Makefile |2 + drivers/edac/altera_edac.c | 448 3 files changed, 459 insertions(+) create mode 100644 drivers/edac/altera_edac.c diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index 878f090..4f4d379 100644 --- a/drivers/edac/Kconfig +++ b/drivers/edac/Kconfig @@ -368,4 +368,13 @@ config EDAC_OCTEON_PCI Support for error detection and correction on the Cavium Octeon family of SOCs. +config EDAC_ALTERA_MC + bool "Altera SDRAM Memory Controller EDAC" + depends on EDAC_MM_EDAC && ARCH_SOCFPGA + help + Support for error detection and correction on the + Altera SDRAM memory controller. Note that the + preloader must initialize the SDRAM before loading + the kernel. 
+ endif # EDAC diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 4154ed6..9741336 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -64,3 +64,5 @@ obj-$(CONFIG_EDAC_OCTEON_PC) += octeon_edac-pc.o obj-$(CONFIG_EDAC_OCTEON_L2C) += octeon_edac-l2c.o obj-$(CONFIG_EDAC_OCTEON_LMC) += octeon_edac-lmc.o obj-$(CONFIG_EDAC_OCTEON_PCI) += octeon_edac-pci.o + +obj-$(CONFIG_EDAC_ALTERA_MC) += altera_edac.o diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c new file mode 100644 index 000..e3fcd27 --- /dev/null +++ b/drivers/edac/altera_edac.c @@ -0,0 +1,448 @@ +/* + * Copyright Altera Corporation (C) 2014. All rights reserved. + * Copyright 2011-2012 Calxeda, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. 
+ + * + * Adapted from the highbank_mc_edac driver + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "edac_core.h" +#include "edac_module.h" + +#define EDAC_MOD_STR "altera_edac" +#define EDAC_VERSION "1" + +/* SDRAM Controller CtrlCfg Register */ +#define CTLCFG 0x00 + +/* SDRAM Controller CtrlCfg Register Bit Masks */ +#define CTLCFG_ECC_EN 0x400 +#define CTLCFG_ECC_CORR_EN 0x800 +#define CTLCFG_GEN_SB_ERR 0x2000 +#define CTLCFG_GEN_DB_ERR 0x4000 + +#define CTLCFG_ECC_AUTO_EN (CTLCFG_ECC_EN | \ +CTLCFG_ECC_CORR_EN) + +/* SDRAM Controller ECC Register Offset */ +#define ECC_REG_OFFSET 0x2C + +/* SDRAM Controller Address Width Register */ +#define DRAMADDRW (0x2C-ECC_REG_OFFSET) + +/* SDRAM Controller Address Widths Field Register */ +#define DRAMADDRW_COLBIT_MASK 0x001F +#define DRAMADDRW_COLBIT_LSB 0 +#define DRAMADDRW_ROWBIT_MASK 0x03E0 +#define DRAMADDRW_ROWBIT_LSB 5 +#define DRAMADDRW_BANKBIT_MASK 0x1C00 +#define DRAMADDRW_BANKBIT_LSB 10 +#define DRAMADDRW_CSBIT_MASK 0xE000 +#define DRAMADDRW_CSBIT_LSB13 + +/* SDRAM Controller Interface Data Width Register */ +#define DRAMIFWIDTH(0x30-ECC_REG_OFFSET) + +/* SDRAM Controller Interface Data Width Defines */ +#define DRAMIFWIDTH_16B_ECC24 +#define DRAMIFWIDTH_32B_ECC40 + +/* SDRAM Controller DRAM Status Register */ +#define DRAMSTS(0x38-ECC_REG_OFFSET) + +/* SDRAM Controller DRAM Status Register Bit Masks */ +#define DRAMSTS_SBEERR 0x04 +#define DRAMSTS_DBEERR 0x08 +#define DRAMSTS_CORR_DROP 0x10 + +/* SDRAM Controller DRAM IRQ Register */ +#define DRAMINTR (0x3C-ECC_REG_OFFSET) + +/* SDRAM Controller DRAM IRQ
Re: [PATCH v2] devicetree: Add generic IOMMU device tree bindings
On 5/30/2014 12:06 PM, Arnd Bergmann wrote:
> On Friday 30 May 2014 08:16:05 Rob Herring wrote:
>> On Fri, May 23, 2014 at 3:33 PM, Thierry Reding wrote:
>>> From: Thierry Reding
>>> +IOMMU master node:
>>> +==
>>> +
>>> +Devices that access memory through an IOMMU are called masters. A device can
>>> +have multiple master interfaces (to one or more IOMMU devices).
>>> +
>>> +Required properties:
>>> +
>>> +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
>>> +  master interfaces of the device. One entry in the list describes one master
>>> +  interface of the device.
>>> +
>>> +When an "iommus" property is specified in a device tree node, the IOMMU will
>>> +be used for address translation. If a "dma-ranges" property exists in the
>>> +device's parent node it will be ignored. An exception to this rule is if the
>>> +referenced IOMMU is disabled, in which case the "dma-ranges" property of the
>>> +parent shall take effect.
>>
>> Just thinking out loud, could you have dma-ranges in the iommu node
>> for the case when the iommu is enabled rather than putting the DMA
>> window information into the iommus property?
>>
>> This would probably mean that you need both #iommu-cells and #address-cells.
>
> The reason for doing like this was that you may need a different window
> for each device, while there can only be one dma-ranges property in
> an iommu node.
>
>>> +
>>> +Optional properties:
>>> +
>>> +- iommu-names: A list of names identifying each entry in the "iommus"
>>> +  property.
>>
>> Do we really need a name here? I would not expect that you have
>> clearly documented names here from the datasheet like you would for
>> interrupts or clocks, so you'd just be making up names. Sorry, but I'm
>> not a fan of names properties in general.
>
> Good point, this was really overdesign by modeling it after other
> subsystems that can have a use for names.
>
>>> +Multiple-master IOMMU:
>>> +--
>>> +
>>> +	iommu {
>>> +		/* the specifier represents the ID of the master */
>>> +		#address-cells = <1>;
>>> +		#size-cells = <0>;
>>> +	};
>>> +
>>> +	master {
>>> +		/* device has master ID 42 in the IOMMU */
>>> +		iommus = <&/iommu 42>;
>>> +	};
>>
>> Presumably the ID would be the streamID on ARM's SMMU. How would a
>> master with 8 streamIDs be described? This is what Calxeda midway has
>> for SATA and I would expect that to be somewhat common. Either you
>> need some ID masking or you'll have lots of duplication when you have
>> windows.
>
> I don't understand the problem. If you have stream IDs 0 through 7,
> you would have
>
>	master@a {
>		...
>		iommus = < 0>;
>	};
>
>	master@b {
>		...
>		iommus = < 1;
>	};
>
>	...
>
>	master@12 {
>		...
>		iommus = < 7;
>	};
>
> and you don't need a window at all. Why would you need a mask of
> some sort?

We have multiple-master SMMUs and each master emits a variable number of
StreamIDs. However, we have to apply a mask (the ARM SMMU spec allows for
this) to the StreamIDs due to limited number of StreamID 2 Context Bank
entries in the SMMU. If my understanding is correct we would represent
this in the DT like this:

	iommu {
		#address-cells = <2>;
		#size-cells = <0>;
	};

	master@a {
		...
		iommus = < StreamID0 MASK0>,
			 < StreamID1 MASK1>,
			 < StreamID2 MASK2>;
	};

	master@b {
		...
		iommus = < StreamID3 MASK3>,
			 < StreamID4 MASK4>;
	};

Thanks,

Olav Haugan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
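For readers following the StreamID/mask discussion, here is a minimal
userspace sketch of how a <StreamID MASK> pair could be matched against an
incoming stream ID. It assumes the convention that a set mask bit means
"ignore this ID bit"; the struct and function names are made up for
illustration and are not taken from any driver or from the proposed binding.

```c
#include <stdbool.h>
#include <stdint.h>

/* One <StreamID MASK> pair (hypothetical layout, for illustration only). */
struct stream_match {
	uint16_t id;
	uint16_t mask;	/* assumption: a set bit means "don't care" */
};

/* True if sid equals the entry's ID in every bit the mask does not hide. */
bool stream_matches(const struct stream_match *m, uint16_t sid)
{
	return ((sid ^ m->id) & (uint16_t)~m->mask) == 0;
}

/* Return the index of the first entry matching sid, or -1 if none does. */
int find_stream_match(const struct stream_match *tbl, int n, uint16_t sid)
{
	int i;

	for (i = 0; i < n; i++)
		if (stream_matches(&tbl[i], sid))
			return i;
	return -1;
}
```

Under that assumption, a single entry with ID 0 and mask 0x7 covers stream
IDs 0 through 7, which is how one table entry can stand in for eight
unmasked ones.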
[PATCHv6 2/3] devicetree: Addition of the Altera SDRAM EDAC
From: Thor Thayer

Addition of the Altera SDRAM EDAC bindings and device tree changes

v2: Changes to SoC EDAC source code.
v3: Fix typo in device tree documentation.
v4,v5: No changes - bump version for consistency.
v6: Assign ECC registers in SDRAM controller to EDAC

Signed-off-by: Thor Thayer
---
 .../bindings/arm/altera/socfpga-sdram-edac.txt | 15 +++
 arch/arm/boot/dts/socfpga.dtsi                 |  6 ++
 2 files changed, 21 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt

diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt
new file mode 100644
index 000..540c9cf
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt
@@ -0,0 +1,15 @@
+Altera SOCFPGA SDRAM Error Detection & Correction [EDAC]
+
+Required properties:
+- compatible : should contain "altr,sdram-edac";
+- reg : should contain the ECC register range in sdram
+	controller (address and length).
+- interrupts : Should contain the SDRAM ECC IRQ in the
+	appropriate format for the IRQ controller.
+
+Example:
+	sdramedac@0 {
+		compatible = "altr,sdram-edac";
+		reg = <0xffc2502C 0x28>;
+		interrupts = <0 39 4>;
+	};
diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
index 310292e..fe9832e 100644
--- a/arch/arm/boot/dts/socfpga.dtsi
+++ b/arch/arm/boot/dts/socfpga.dtsi
@@ -687,6 +687,12 @@
 			reg = <0xffc25000 0x4>;
 		};
 
+		sdramedac@0 {
+			compatible = "altr,sdram-edac";
+			reg = <0xffc2502C 0x28>;
+			interrupts = <0 39 4>;
+		};
+
 		rst: rstmgr@ffd05000 {
 			compatible = "altr,rst-mgr";
 			reg = <0xffd05000 0x1000>;
-- 
1.7.9.5
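As a rough illustration of how the EDAC window in this binding relates to
the register offsets in patch 3/3, the sketch below translates a
controller-relative offset (for example DRAMINTR at 0x3C) into an offset
inside the reg = <0xffc2502C 0x28> window. The base address, ECC_REG_OFFSET
and the window length come from the patches; the helper name and the
assumption of 32-bit wide register accesses are mine.

```c
#include <stdbool.h>
#include <stdint.h>

#define SDR_CTL_BASE	0xffc25000u	/* sdrctl node, from the dtsi */
#define ECC_REG_OFFSET	0x2Cu		/* from altera_edac.c */
#define ECC_WINDOW_LEN	0x28u		/* length cell of the sdramedac reg */

/*
 * Translate an offset relative to the SDRAM controller base into an
 * offset inside the EDAC register window. Returns false if a 32-bit
 * access at that offset would fall outside the window (assumption: all
 * registers are accessed as 32-bit words).
 */
bool ecc_window_offset(uint32_t ctl_off, uint32_t *win_off)
{
	if (ctl_off < ECC_REG_OFFSET)
		return false;
	if (ctl_off - ECC_REG_OFFSET + sizeof(uint32_t) > ECC_WINDOW_LEN)
		return false;
	*win_off = ctl_off - ECC_REG_OFFSET;
	return true;
}
```

With these numbers SDR_CTL_BASE + ECC_REG_OFFSET is 0xffc2502C, DRAMINTR
lands at window offset 0x10, and CTLCFG at offset 0x00 is rejected, which
fits the v6 note that the ctrlcfg register is handled separately as a
syscon device.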
Re: [PATCH] i2c: exynos5: Properly use the "noirq" variants of suspend/resume
Doug Anderson writes:

> Kevin,
>
> On Fri, Jun 20, 2014 at 2:48 PM, Kevin Hilman wrote:
>> Hi Doug,
>>
>> Doug Anderson writes:
>>
>>> On Thu, Jun 19, 2014 at 11:43 AM, Kevin Hilman wrote:
>>>> Doug Anderson writes:
>>>>
>>>>> The original code for the exynos i2c controller registered for the
>>>>> "noirq" variants. However during review feedback it was moved to
>>>>> SIMPLE_DEV_PM_OPS without anyone noticing that it meant we were no
>>>>> longer actually "noirq" (despite functions named
>>>>> exynos5_i2c_suspend_noirq and exynos5_i2c_resume_noirq).
>>>>>
>>>>> i2c controllers that might have wakeup sources on them seem to need to
>>>>> resume at noirq time so that the individual drivers can actually read
>>>>> the i2c bus to handle their wakeup.
>>>>
>>>> I suspect usage of the noirq variants pre-dates the existence of the
>>>> late/early callbacks in the PM core, but based on the description above,
>>>> I suspect what you actually want is the late/early callbacks.
>>>
>>> I think it actually really needs noirq. ;)
>>
>> Yes, it appears it does. Objection withdrawn.
>>
>> I just wanted to be sure because since the introduction of late/early,
>> the need for noirq should be pretty rare, but there certainly are needs.
>>
>> In this case though, the need for it has more to do with the
>> lack of a way for us to describe non parent-child device dependencies
>> than whether or not IRQs are enabled or not.
>
> Actually, I'm not sure that's true, but I'll talk through it and you
> can point to where I'm wrong (I often am!)
>
> If you're a wakeup device then you need to be ready to handle
> interrupts as soon as the "noirq" phase of resume is done, right?

As soon as the noirq phase of your own driver is done, correct.

> Said another way: you need to be ready to handle interrupts _before_
> the normal resume code is called and be ready to handle interrupts
> even _before_ the early resume code is called.

Correct.

> That means if you are implementing a bus that's needed by any devices
> with wakeup interrupts then it's your responsibility to also be
> prepared to run this early.
>
> In this particular case the max77686 driver doesn't need to do
> anything at all to be ready to handle interrupts. Its suspend and
> resume code is just boilerplate "enable wakeups / disable wakeups" and
> it has no "noirq" code. The max77686 driver doesn't have any "noirq"
> wake call because it would just be empty.
>
> Said another way: the problem isn't that the max77686 wakeup gets
> called before the i2c wakeup. The problem is that i2c is needed ASAP
> once IRQs are enabled and thus needs to be run noirq.
>
> Does that sound semi-correct?

Yes that's correct.

My point above was (trying to be) that ultimately this is an ordering
issue. e.g. the bus device needs to be "ready" before wakeup devices on
that bus can handle wakeup interrupts etc. The way we're handling that
ordering is by the implied ordering of noirq, late/early and "normal"
callbacks. That's convenient, but not exactly obvious.

It works because we don't typically need too many layers here, but it
would be much more understandable if we could describe this kind of
dependency in a way that the suspend/resume code would suspend/resume
things in the right order rather than by tinkering with callback levels
(since otherwise suspend/resume ordering just depends on probe order.)

This issue then usually gets me headed down my usual rant path about how
I think runtime PM is much better suited for handling ordering and
dependencies because it automatically handles parent/child dependencies
and non parent/child dependencies can be handled by taking advantage of
the get/put APIs which are refcounted, etc. etc. but that's another can
of worms.

Kevin
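The ordering argument in this thread can be pictured with a small toy
model. This is not kernel code: the enum and function names are invented,
and the only facts it encodes are the phase order discussed above (noirq,
then early, then normal resume) and that a wakeup device's handler can run
as soon as the noirq phase has finished.

```c
#include <stdbool.h>

/* System resume phases, in the order they run (toy model only). */
enum resume_phase {
	PHASE_RESUME_NOIRQ,	/* device interrupt handlers do not run yet */
	PHASE_RESUME_EARLY,	/* handlers can run from here on */
	PHASE_RESUME,
	PHASE_COUNT
};

/*
 * Walk the phases in order and deliver a pending wakeup interrupt as soon
 * as the noirq phase is over. Returns true if the bus had already been
 * resumed at the moment the client's handler needed to talk over it.
 */
bool wakeup_irq_finds_bus_ready(enum resume_phase bus_resume_phase)
{
	bool bus_ready = false;
	int phase;

	for (phase = PHASE_RESUME_NOIRQ; phase < PHASE_COUNT; phase++) {
		/* Interrupts are delivered again once noirq has completed. */
		if (phase == PHASE_RESUME_EARLY)
			return bus_ready;
		if (phase == (int)bus_resume_phase)
			bus_ready = true;
	}
	return bus_ready;
}
```

In this model only a bus that resumes in the noirq phase is ready in time,
which is the point Doug makes about the i2c controller: the callback level
is being used to express a dependency the PM core cannot otherwise see.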
Re: [PATCH] vfio: Fix endianness handling for emulated BARs
On Thu, 2014-06-19 at 21:21 -0600, Alex Williamson wrote:

> Working on big endian being an accident may be a matter of perspective
> :-)

> The comment remains that this patch doesn't actually fix anything except
> the overhead on big endian systems doing redundant byte swapping and
> maybe the philosophy that vfio regions are little endian.

Yes, that works by accident because technically VFIO is a transport and
thus shouldn't perform any endian swapping of any sort, which remains
the responsibility of the end driver which is the only one to know
whether a given BAR location is a register or some streaming data
and in the former case whether it's LE or BE (some PCI devices are BE
even ! :-)

But yes, in the end, it works with the dual "cancelling" swaps and the
overhead of those swaps is probably drowned in the noise of the syscall
overhead.

> I'm still not a fan of iowrite vs iowritebe, there must be something we
> can use that doesn't have an implicit swap.

Sadly there isn't ... In the old days we didn't even have the "be"
variant and readl/writel style accessors still don't have them either
for all archs.

There is __raw_readl/writel but here the semantics are much more than
just "don't swap", they also don't have memory barriers (which means
they are essentially useless to most drivers unless those are platform
specific drivers which know exactly what they are doing, or in the rare
cases such as accessing a framebuffer which we know never has side
effects).

> Calling it iowrite*_native is also an abuse of the namespace.
> Next thing we know some common code
> will legitimately use that name.

It might make sense to move those definitions into a common header.
There have been a handful of cases in the past that wanted that sort of
"native byte order" MMIOs iirc (though don't ask me for examples, I
can't really remember).

> If we do need to define an alias
> (which I'd like to avoid) it should be something like vfio_iowrite32.
> Thanks,

Cheers,
Ben.

> Alex
>
> > > ===
> > >
> > > any better?
> > >
> >
> > Suggested-by: Benjamin Herrenschmidt
> > Signed-off-by: Alexey Kardashevskiy
> > ---
> >  drivers/vfio/pci/vfio_pci_rdwr.c | 20
> >  1 file changed, 16 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> > index 210db24..f363b5a 100644
> > --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> > +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> > @@ -21,6 +21,18 @@
> >
> >  #include "vfio_pci_private.h"
> >
> > +#ifdef __BIG_ENDIAN__
> > +#define ioread16_native		ioread16be
> > +#define ioread32_native		ioread32be
> > +#define iowrite16_native	iowrite16be
> > +#define iowrite32_native	iowrite32be
> > +#else
> > +#define ioread16_native		ioread16
> > +#define ioread32_native		ioread32
> > +#define iowrite16_native	iowrite16
> > +#define iowrite32_native	iowrite32
> > +#endif
> > +
> >  /*
> >   * Read or write from an __iomem region (MMIO or I/O port) with an excluded
> >   * range which is inaccessible.  The excluded range drops writes and fills
> > @@ -50,9 +62,9 @@ static ssize_t do_io_rw(void __iomem *io, char __user *buf,
> >  			if (copy_from_user(&val, buf, 4))
> >  				return -EFAULT;
> >
> > -			iowrite32(le32_to_cpu(val), io + off);
> > +			iowrite32_native(val, io + off);
> >  		} else {
> > -			val = cpu_to_le32(ioread32(io + off));
> > +			val = ioread32_native(io + off);
> >
> >  			if (copy_to_user(buf, &val, 4))
> >  				return -EFAULT;
> > @@ -66,9 +78,9 @@ static ssize_t do_io_rw(void __iomem *io, char __user *buf,
> >  			if (copy_from_user(&val, buf, 2))
> >  				return -EFAULT;
> >
> > -			iowrite16(le16_to_cpu(val), io + off);
> > +			iowrite16_native(val, io + off);
> >  		} else {
> > -			val = cpu_to_le16(ioread16(io + off));
> > +			val = ioread16_native(io + off);
> >
> >  			if (copy_to_user(buf, &val, 2))
> >  				return -EFAULT;
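To make the "cancelling swaps" point concrete, here is a small userspace
model, not kernel code. It assumes iowrite32() delivers its argument to
the device in little-endian byte order and iowrite32be() in big-endian
order; the model_* helpers and the big_endian flag (which stands in for
the CPU's byte order) are invented for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

uint32_t swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* The value a CPU of the given endianness sees when loading 4 bytes. */
uint32_t cpu_load32(const uint8_t b[4], bool big_endian)
{
	uint32_t le = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		      ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);

	return big_endian ? swab32(le) : le;
}

/* Model of iowrite32(): bytes reach the device in little-endian order. */
void model_iowrite32(uint8_t reg[4], uint32_t v)
{
	reg[0] = (uint8_t)v;
	reg[1] = (uint8_t)(v >> 8);
	reg[2] = (uint8_t)(v >> 16);
	reg[3] = (uint8_t)(v >> 24);
}

/* Model of iowrite32be(): bytes reach the device in big-endian order. */
void model_iowrite32be(uint8_t reg[4], uint32_t v)
{
	reg[0] = (uint8_t)(v >> 24);
	reg[1] = (uint8_t)(v >> 16);
	reg[2] = (uint8_t)(v >> 8);
	reg[3] = (uint8_t)v;
}

/* Old path: iowrite32(le32_to_cpu(val)). */
void write_old(uint8_t reg[4], const uint8_t user[4], bool big_endian)
{
	uint32_t val = cpu_load32(user, big_endian);

	/* le32_to_cpu() swaps on big endian, is a no-op on little endian. */
	model_iowrite32(reg, big_endian ? swab32(val) : val);
}

/* Patched path: iowrite32_native(val). */
void write_native(uint8_t reg[4], const uint8_t user[4], bool big_endian)
{
	uint32_t val = cpu_load32(user, big_endian);

	if (big_endian)
		model_iowrite32be(reg, val);
	else
		model_iowrite32(reg, val);
}
```

In this model both paths leave the user's bytes in the register in their
original order on either endianness, so the patch only removes the pair of
swaps on big endian rather than changing what the device sees.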
[PATCHv6 3/3] edac: altera: Add EDAC support for SDRAM Ctlr
From: Thor Thayer v2: Use the SDRAM controller registers to calculate memory size instead of the Device Tree. Update To & Cc list. Add maintainer information. v3: EDAC driver cleanup based on comments from Mailing list. v4: Panic on DBE. Add macro around inject-error reads to prevent them from being optimized out. Remove of_match_ptr since this will always use Device Tree. v5: Addition of printk to trigger function to ensure read vars are not optimized out. v6: Changes to split out shared SDRAM controller reg (offset 0x00) as a syscon device and allocate ECC specific SDRAM registers to EDAC. Signed-off-by: Thor Thayer --- drivers/edac/Kconfig |9 + drivers/edac/Makefile |2 + drivers/edac/altera_edac.c | 448 3 files changed, 459 insertions(+) create mode 100644 drivers/edac/altera_edac.c diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index 878f090..4f4d379 100644 --- a/drivers/edac/Kconfig +++ b/drivers/edac/Kconfig @@ -368,4 +368,13 @@ config EDAC_OCTEON_PCI Support for error detection and correction on the Cavium Octeon family of SOCs. +config EDAC_ALTERA_MC + bool "Altera SDRAM Memory Controller EDAC" + depends on EDAC_MM_EDAC && ARCH_SOCFPGA + help + Support for error detection and correction on the + Altera SDRAM memory controller. Note that the + preloader must initialize the SDRAM before loading + the kernel. + endif # EDAC diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 4154ed6..9741336 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -64,3 +64,5 @@ obj-$(CONFIG_EDAC_OCTEON_PC) += octeon_edac-pc.o obj-$(CONFIG_EDAC_OCTEON_L2C) += octeon_edac-l2c.o obj-$(CONFIG_EDAC_OCTEON_LMC) += octeon_edac-lmc.o obj-$(CONFIG_EDAC_OCTEON_PCI) += octeon_edac-pci.o + +obj-$(CONFIG_EDAC_ALTERA_MC) += altera_edac.o diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c new file mode 100644 index 000..e3fcd27 --- /dev/null +++ b/drivers/edac/altera_edac.c @@ -0,0 +1,448 @@ +/* + * Copyright Altera Corporation (C) 2014. 
All rights reserved. + * Copyright 2011-2012 Calxeda, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + + * + * Adapted from the highbank_mc_edac driver + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "edac_core.h" +#include "edac_module.h" + +#define EDAC_MOD_STR "altera_edac" +#define EDAC_VERSION "1" + +/* SDRAM Controller CtrlCfg Register */ +#define CTLCFG 0x00 + +/* SDRAM Controller CtrlCfg Register Bit Masks */ +#define CTLCFG_ECC_EN 0x400 +#define CTLCFG_ECC_CORR_EN 0x800 +#define CTLCFG_GEN_SB_ERR 0x2000 +#define CTLCFG_GEN_DB_ERR 0x4000 + +#define CTLCFG_ECC_AUTO_EN (CTLCFG_ECC_EN | \ +CTLCFG_ECC_CORR_EN) + +/* SDRAM Controller ECC Register Offset */ +#define ECC_REG_OFFSET 0x2C + +/* SDRAM Controller Address Width Register */ +#define DRAMADDRW (0x2C-ECC_REG_OFFSET) + +/* SDRAM Controller Address Widths Field Register */ +#define DRAMADDRW_COLBIT_MASK 0x001F +#define DRAMADDRW_COLBIT_LSB 0 +#define DRAMADDRW_ROWBIT_MASK 0x03E0 +#define DRAMADDRW_ROWBIT_LSB 5 +#define DRAMADDRW_BANKBIT_MASK 0x1C00 +#define DRAMADDRW_BANKBIT_LSB 10 +#define DRAMADDRW_CSBIT_MASK 0xE000 +#define DRAMADDRW_CSBIT_LSB13 + +/* SDRAM Controller Interface Data Width Register */ +#define DRAMIFWIDTH(0x30-ECC_REG_OFFSET) + +/* SDRAM Controller Interface Data Width Defines */ +#define DRAMIFWIDTH_16B_ECC24 +#define 
DRAMIFWIDTH_32B_ECC40 + +/* SDRAM Controller DRAM Status Register */ +#define DRAMSTS(0x38-ECC_REG_OFFSET) + +/* SDRAM Controller DRAM Status Register Bit Masks */ +#define DRAMSTS_SBEERR 0x04 +#define DRAMSTS_DBEERR 0x08 +#define DRAMSTS_CORR_DROP 0x10 + +/* SDRAM Controller DRAM IRQ Register */ +#define DRAMINTR (0x3C-ECC_REG_OFFSET) + +/* SDRAM Controller DRAM IRQ Register Bit Masks */ +#define DRAMINTR_INTREN0x01 +#define DRAMINTR_SBEMASK 0x02 +#define DRAMINTR_DBEMASK 0x04 +#define DRAMINTR_CORRDROPMASK 0x08 +#define
[PATCHv6 2/3] devicetree: Addition of the Altera SDRAM EDAC
From: Thor Thayer

v2: Changes to SoC EDAC source code.
v3: Fix typo in device tree documentation.
v4,v5: No changes - bump version for consistency.
v6: Assign ECC registers in SDRAM controller to EDAC

Signed-off-by: Thor Thayer
---
 .../bindings/arm/altera/socfpga-sdram-edac.txt | 15 +++
 arch/arm/boot/dts/socfpga.dtsi                 |  6 ++
 2 files changed, 21 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt

diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt
new file mode 100644
index 000..540c9cf
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt
@@ -0,0 +1,15 @@
+Altera SOCFPGA SDRAM Error Detection & Correction [EDAC]
+
+Required properties:
+- compatible : should contain "altr,sdram-edac";
+- reg : should contain the ECC register range in sdram
+	controller (address and length).
+- interrupts : Should contain the SDRAM ECC IRQ in the
+	appropriate format for the IRQ controller.
+
+Example:
+	sdramedac@0 {
+		compatible = "altr,sdram-edac";
+		reg = <0xffc2502C 0x28>;
+		interrupts = <0 39 4>;
+	};
diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
index 310292e..fe9832e 100644
--- a/arch/arm/boot/dts/socfpga.dtsi
+++ b/arch/arm/boot/dts/socfpga.dtsi
@@ -687,6 +687,12 @@
 			reg = <0xffc25000 0x4>;
 		};
 
+		sdramedac@0 {
+			compatible = "altr,sdram-edac";
+			reg = <0xffc2502C 0x28>;
+			interrupts = <0 39 4>;
+		};
+
 		rst: rstmgr@ffd05000 {
 			compatible = "altr,rst-mgr";
 			reg = <0xffd05000 0x1000>;
-- 
1.7.9.5
[PATCHv6 1/3] devicetree: Addition of the Altera SDRAM controller
From: Thor Thayer

v2: Changes to SoC SDRAM EDAC code.
v3: Implement code suggestions for SDRAM EDAC code.
v4: Remove syscon from SDRAM controller bindings.
v5: No Change, bump version for consistency.
v6: Only map the ctrlcfg register as syscon.

Signed-off-by: Thor Thayer
---
 .../bindings/arm/altera/socfpga-sdram.txt | 11 +++
 arch/arm/boot/dts/socfpga.dtsi            |  5 +
 2 files changed, 16 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt

diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt
new file mode 100644
index 000..5027026
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt
@@ -0,0 +1,11 @@
+Altera SOCFPGA SDRAM Controller
+
+Required properties:
+- compatible : "altr,sdr-ctl";
+- reg : Should contain 1 register ranges(address and length)
+
+Example:
+	sdrctl@ffc25000 {
+		compatible = "altr,sdr-ctl";
+		reg = <0xffc25000 0x4>;
+	};
diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
index 4676f25..310292e 100644
--- a/arch/arm/boot/dts/socfpga.dtsi
+++ b/arch/arm/boot/dts/socfpga.dtsi
@@ -682,6 +682,11 @@
 			clocks = <_sp_clk>;
 		};
 
+		sdrctl@ffc25000 {
+			compatible = "altr,sdr-ctl", "syscon";
+			reg = <0xffc25000 0x4>;
+		};
+
 		rst: rstmgr@ffd05000 {
 			compatible = "altr,rst-mgr";
 			reg = <0xffd05000 0x1000>;
-- 
1.7.9.5
[PATCHv6 0/3] Addition of Altera SDRAM EDAC
From: Thor Thayer

Addition of the Altera SDRAM controller to the EDAC driver.

Thor Thayer (3):
  Addition of the Altera SDRAM controller bindings and device tree
    changes to the Altera SoC project.
  Addition of the Altera SDRAM EDAC bindings and device tree changes to
    the Altera SoC project.
  edac: altera: Add EDAC support for Altera SoC SDRAM Controller. This
    patch adds support for the CycloneV and ArriaV SDRAM controllers.
    Correction and reporting of SBEs, Panic on DBEs.

 .../bindings/arm/altera/socfpga-sdram-edac.txt |  15 +
 .../bindings/arm/altera/socfpga-sdram.txt      |  11 +
 arch/arm/boot/dts/socfpga.dtsi                 |  11 +
 drivers/edac/Kconfig                           |   9 +
 drivers/edac/Makefile                          |   2 +
 drivers/edac/altera_edac.c                     | 448
 6 files changed, 496 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/altera/socfpga-sdram-edac.txt
 create mode 100644 Documentation/devicetree/bindings/arm/altera/socfpga-sdram.txt
 create mode 100644 drivers/edac/altera_edac.c

-- 
1.7.9.5
Add EDAC support for Altera SDRAM Controller
[PATCHv6 1/3] dt: bindings: Addition of the Altera SDRAM controller
[PATCHv6 2/3] dt: bindings: Addition of the Altera SDRAM EDAC
[PATCHv6 3/3] edac: altera: Add EDAC support for Altera SoC SDRAM
Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
Hi Eric,

On Fri, Jun 20, 2014 at 03:16:07PM -0700, Eric W. Biederman wrote:
> Willy Tarreau writes:
>
> > Hi Luis,
> >
> > On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
> >> I was finally able to spend some more time with this and tried (a
> >> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
> >> Although I haven't done any extended testing, I don't see the two
> >> stack traces and the /proc/sys/net/ipv4/ directory seems to be
> >> correctly populated.
> >>
> >> I'm attaching the patch I've used, based on Tyler's.
> >
> > Would any of you or Tyler please kindly pass me a signed-off-by with
> > a commit message ? That would be great. Alternately I'd do it myself
> > and mention you authored them.
>
> If my memory serves it is possible in 2.6.32 to set
> .ctl_name = CTL_UNNEEDED
>
> and not need to implement a .strategy routine at all.

Ah that's quite interesting, thanks for the tip!

> Given the fact that most people got the strategy routines
> slightly wrong and that sys_sysctl is effectively unused
> a strategy where you don't implement code that no-one
> will use in a backport would be preferable.

OK.

> Since you have mentioned this has come up a couple of times if nothing
> else this will be something to think about for next time.

I'm keeping your e-mail where I manage patches, hoping to recognize this
case next time.

> I am puzzled why .ctl_name was populated in a backport at all.

Oh it's simply because I didn't know it did not have to be there, and
among the few reviewers, I guess that it's not common to know what
version uses what semantics.

Thank you for the explanation, it's really helpful. We're not used to
backporting sysctl changes but here I got caught a few times and have
found some sysctl.conf with bogus values in the field a few times, so it
was really important to backport this one.

Best regards,
Willy
Re: [rtc-linux] [PATCH] rtc: add support of nvram for maxim dallas rtc ds1343
On Sat, 24 May 2014 21:34:33 +0530 Raghavendra Ganiga wrote:

> This is a patch to add support of nvram for maxim dallas
> rtc ds1343
>
> ...
>
> --- a/drivers/rtc/rtc-ds1343.c
> +++ b/drivers/rtc/rtc-ds1343.c
> @@ -4,6 +4,7 @@
>   * Real Time Clock
>   *
>   * Author : Raghavendra Chandra Ganiga
> + *          Ankur Srivastava : DS1343 Nvram Support
>   *
>   * This program is free software; you can redistribute it and/or modify
>   * it under the terms of the GNU General Public License version 2 as
> @@ -45,6 +46,9 @@
>  #define DS1343_CONTROL_REG	0x0F
>  #define DS1343_STATUS_REG	0x10
>  #define DS1343_TRICKLE_REG	0x11
> +#define DS1343_NVRAM		0x20
> +
> +#define DS1343_NVRAM_LEN	96
>
>  /* DS1343 Control Registers bits */
>  #define DS1343_EOSC		0x80
> @@ -149,6 +153,64 @@ static ssize_t ds1343_store_glitchfilter(struct device *dev,
>  static DEVICE_ATTR(glitch_filter, S_IRUGO | S_IWUSR, ds1343_show_glitchfilter,
>  			ds1343_store_glitchfilter);
>
> +static ssize_t ds1343_nvram_write(struct file *filp, struct kobject *kobj,
> +			struct bin_attribute *attr,
> +			char *buf, loff_t off, size_t count)
> +{
> +	int ret;
> +	unsigned char address;
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct ds1343_priv *priv = dev_get_drvdata(dev);
> +
> +	if (unlikely(!count))
> +		return count;
> +
> +	if ((count + off) > DS1343_NVRAM_LEN)

I worry about what happens if (count + off) wraps through zero.

> +		count = DS1343_NVRAM_LEN - off;

We might end up with an enormous value in `count'?

> +	address = DS1343_NVRAM + off;
> +
> +	ret = regmap_bulk_write(priv->map, address, buf, count);
> +	if (ret < 0)
> +		dev_err(&priv->spi->dev, "Error in nvram write %d", ret);
> +
> +	return (ret < 0) ? ret : count;
> +}
> +
> +
> +static ssize_t ds1343_nvram_read(struct file *filp, struct kobject *kobj,
> +			struct bin_attribute *attr,
> +			char *buf, loff_t off, size_t count)
> +{
> +	int ret;
> +	unsigned char address;
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct ds1343_priv *priv = dev_get_drvdata(dev);
> +
> +	if (unlikely(!count))
> +		return count;
> +
> +	if ((count + off) > DS1343_NVRAM_LEN)
> +		count = DS1343_NVRAM_LEN - off;

Here too.

> +	address = DS1343_NVRAM + off;
> +
> +	ret = regmap_bulk_read(priv->map, address, buf, count);
> +	if (ret < 0)
> +		dev_err(&priv->spi->dev, "Error in nvram read %d\n", ret);
> +
> +	return (ret < 0) ? ret : count;
> +}
> +
> +
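For what it's worth, the wrap concern goes away if the check never forms
count + off at all. Below is a minimal userspace sketch of that idea, not
the driver's code: nvram_clamp() and NVRAM_LEN are made-up names, int64_t
stands in for the kernel's loff_t, and 96 is simply DS1343_NVRAM_LEN from
the quoted patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors DS1343_NVRAM_LEN from the quoted patch; the name is illustrative. */
#define NVRAM_LEN 96u

/*
 * Clamp a request for `count` bytes at offset `off` to an NVRAM_LEN-byte
 * region. The sum off + count is never computed, so nothing can wrap.
 * Returns the number of bytes that may be transferred, or 0 when the
 * offset is negative or at/past the end of the region.
 */
static size_t nvram_clamp(int64_t off, size_t count)
{
	size_t room;

	if (off < 0 || (uint64_t)off >= NVRAM_LEN)
		return 0;

	/* off is known to be in [0, NVRAM_LEN), so this cannot underflow. */
	room = NVRAM_LEN - (size_t)off;

	return count < room ? count : room;
}
```

With this shape an out-of-range offset yields a zero-length transfer
instead of a huge `count', and a request that runs off the end is trimmed
to what is left.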
Re: [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression
On Fri, Jun 20, 2014 at 03:11:20PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 20, 2014 at 02:24:23PM -0700, j...@joshtriplett.org wrote:
> > On Fri, Jun 20, 2014 at 12:12:36PM -0700, Paul E. McKenney wrote:
> > > o Make cond_resched() a no-op for PREEMPT=y. This might well turn
> > >   out to be a good thing, but it doesn't help give RCU the quiescent
> > >   states that it needs.
> >
> > What about doing this, together with letting the fqs logic poke
> > un-quiesced kernel code as needed? That way, rather than having
> > cond_resched do any work, you have the fqs logic recognize that a
> > particular CPU has gone too long without quiescing, without disturbing
> > that CPU at all if it hasn't gone too long.
>
> My next stop is to post the previous series, but with a couple of
> exports and one bug fix uncovered by testing thus far, but after
> another round of testing. Then I am going to take a close look at
> this one:
>
> o Push the checks further into cond_resched(), so that the
>   fastpath does the same sequence of instructions that the original
>   did. This might work well, but requires IPIs, which are not so
>   good for latencies on the remote CPU. It nevertheless might be a
>   decent long-term solution given that if your CPU is spending many
>   jiffies looping in the kernel, you aren't getting good latencies
>   anyway. It also has the benefit of allowing RCU to take advantage
>   of the implicit quiescent states of all cond_resched() calls,
>   and of eliminating the need for a separate cond_resched_rcu_qs()
>   and for RCU_COND_RESCHED_QS.
>
> The one you call out is of course interesting as well. But there are
> a couple of questions:
>
> 1. Why wasn't cond_resched() a no-op in CONFIG_PREEMPT to start
>    with? It just seems to obvious a thing to do for it to possibly
>    be an oversight. (What, me paranoid?)
>
> 2. When RCU recognizes that a particular CPU has gone too long,
>    exactly what are you suggesting that RCU do about it? When
>    formulating your answer, please give due consideration to the
>    implications of that CPU being a NO_HZ_FULL CPU. ;-)

Send it an IPI that either causes it to flag a quiescent state
immediately if currently quiesced or causes it to quiesce at the next
opportunity if not.

- Josh Triplett
Re: [PATCH v2] ARM: mvebu: Fix missing binding documentation for Armada 38x
On Fri, Jun 20, 2014 at 1:52 PM, Jason Cooper wrote:
> On Thu, Jun 19, 2014 at 06:40:43PM +0200, Gregory CLEMENT wrote:
>> For the Armada 380 and Armada 385 SoCs, the common bindings for those
>> 2 SoCs, was forgotten. This patch add the documentation for the
>> marvell,aramda38x property.
>>
>> Signed-off-by: Gregory CLEMENT
>> --
>> Hi,
>>
>> This fix should be merged in 3.16. For 3.15 I am not sure as it is not
>> a regression.
>>
>> Changelog:
>> v1->v2
>>
>> - Reformulate to make clear that we will need marvell,armada38x _and_ a
>> SoC specific string. For consistency I duplicated what we have done in
>> armada-370-xp.txt
>>
>>
>> Thanks,
>> Gregory
>>
>>
>> Documentation/devicetree/bindings/arm/armada-38x.txt | 17 +++--
>> 1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/armada-38x.txt
>> b/Documentation/devicetree/bindings/arm/armada-38x.txt
>> index 11f2330a6554..fa08760046df 100644
>> --- a/Documentation/devicetree/bindings/arm/armada-38x.txt
>> +++ b/Documentation/devicetree/bindings/arm/armada-38x.txt
>> @@ -6,5 +6,18 @@ following property:
>>
>> Required root node property:
>>
>> - - compatible: must contain either "marvell,armada380" or
>> - "marvell,armada385" depending on the variant of the SoC being used.
>> +compatible: must contain "marvell,armada38x"
>
> I agree with Sergei on this one. We generally avoid wildcards in
> compatible strings. Is there a use case where specifying one of the
> below wouldn't be sufficient?

Isn't this a case of just documenting what is already in use?

I agree wildcards alone are not good, but along with a specific
compatible is okay. But also there should be some need to have the
common property.

Rob
Re: [Linux-kernel] [PATCH 2/4] drivers/base: devres.c: Add block copy func. for managed devices
On Thu, 2014-06-19 at 16:46 +0100, Rob Jones wrote:
[...]
> --- a/drivers/base/devres.c
> +++ b/drivers/base/devres.c
> @@ -793,7 +793,7 @@ EXPORT_SYMBOL_GPL(devm_kmalloc);
>  /**
>   * devm_kstrdup - Allocate resource managed space and
>   *                copy an existing string into that.
> - * @dev: Device to allocate memory for
> + * @dev:Device to allocate memory for

You shouldn't be changing this comment...

Ben.

>   * @s: the string to duplicate
>   * @gfp: the GFP mask used in the devm_kmalloc() call when
>   * allocating memory
[...]
Re: [PATCH 1/3] pwm: add Rockchip SoC PWM support
On Sat, Jun 21, 2014 at 12:00:36AM +0200, Beniamino Galvani wrote:
> On Tue, Jun 17, 2014 at 11:42:58PM +0200, Thierry Reding wrote:
> > On Thu, May 08, 2014 at 01:08:33AM +0200, Beniamino Galvani wrote:
[...]
> > > diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c
[...]
> > > +static int rockchip_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
> > > +			       int duty_ns, int period_ns)
> > > +{
> > > +	struct rockchip_pwm_chip *pc = to_rockchip_pwm_chip(chip);
> > > +	unsigned long clk_rate, period, duty;
> > > +	u64 div;
> > > +	int ret;
> > > +
> > > +	clk_rate = clk_get_rate(pc->clk);
> > > +
> > > +	/*
> > > +	 * Since period and duty cycle registers have a width of 32
> > > +	 * bits, every possible input period can be obtained using the
> > > +	 * default prescaler value for all practical clock rate values.
> > > +	 */
> > > +	div = clk_rate;
> > > +	div *= period_ns;
> >
> > Perhaps shorten this to "div = clk_rate * period_ns;"?
>
> I will change this, adding a cast to avoid the truncation of the
> result to 32 bits: "div = (u64)clk_rate * period_ns;"

Alternatively you could simply make clk_rate a u64 since it's only used
in this context anyway.

Thierry
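The truncation under discussion is easy to reproduce outside the kernel.
The sketch below is only an illustration: it models a 32-bit kernel's
unsigned long with uint32_t, and both function names are hypothetical
rather than anything in pwm-rockchip.c.

```c
#include <stdint.h>

/*
 * Both operands are 32 bits wide, so the product is computed modulo 2^32
 * and only then widened to 64 bits. The high bits are already lost.
 */
static uint64_t mul_truncating(uint32_t clk_rate, uint32_t period_ns)
{
	return clk_rate * period_ns;
}

/*
 * Widening one operand first makes the whole multiplication 64-bit,
 * which is what the "(u64)clk_rate * period_ns" cast achieves.
 */
static uint64_t mul_widened(uint32_t clk_rate, uint32_t period_ns)
{
	return (uint64_t)clk_rate * period_ns;
}
```

For a 24 MHz clock and a 1 ms period the exact product is 24 * 10^12,
which needs more than 32 bits, so the two functions disagree. Note that
the original two-statement form (div = clk_rate; div *= period_ns;) is
already safe because div is a u64; only the shortened one-liner needs the
cast or a wider clk_rate.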
Re: [Linux-kernel] [PATCH 1/4] drivers/gpio: devres.c: allow gpio array requests for managed devices
On Thu, 2014-06-19 at 16:46 +0100, Rob Jones wrote:
[...]
> +int devm_gpio_request_array(struct device *dev,
> +			    const struct gpio *array,
> +			    size_t num)
> +{
> +	int i, err = 0;
> +
> +	for (i = 0; i < num; i++, array++) {
> +		err = devm_gpio_request_one(dev,
> +					    array->gpio,
> +					    array->flags,
> +					    array->label);
> +		if (err) {
> +			while (i--)
> +				devm_gpio_free(dev, (--array)->gpio);

Missing break here.

> +		}
> +	}
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(devm_gpio_request_array);
[...]
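To make the point concrete, here is a small userspace model of the
request-then-unwind loop with the break in place. Everything in it
(struct fake_gpio, fake_request(), fake_free(), request_array()) is a
stand-in invented for illustration; none of it is the gpiolib or devres
API.

```c
#include <stddef.h>

/* Stand-in for a GPIO line: `fail` forces the request to fail. */
struct fake_gpio {
	unsigned int gpio;
	int fail;
	int held;
};

/* Pretend to claim a line; returns 0 on success, -1 on failure. */
static int fake_request(struct fake_gpio *g)
{
	if (g->fail)
		return -1;
	g->held = 1;
	return 0;
}

static void fake_free(struct fake_gpio *g)
{
	g->held = 0;
}

/*
 * Request every entry in order. On the first failure, release the
 * entries claimed so far in reverse order and stop. Without the break,
 * the for loop would keep running after the unwind.
 */
static int request_array(struct fake_gpio *array, size_t num)
{
	size_t i;
	int err = 0;

	for (i = 0; i < num; i++) {
		err = fake_request(&array[i]);
		if (err) {
			while (i--)
				fake_free(&array[i]);
			break;
		}
	}

	return err;
}
```

Indexing the array instead of advancing the pointer also keeps the index
and the element in step during the unwind, which is one less thing to get
wrong.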
Re: [PATCH v7 1/2] video: ARM CLCD: Add DT support
On 17 June 2014 16:21, Pawel Moll wrote: > This patch adds basic DT bindings for the PL11x CLCD cells > and make their fbdev driver use them. > +* ARM PrimeCell Color LCD Controller PL110/PL111 > + > +See also Documentation/devicetree/bindings/arm/primecell.txt > + > +Required properties: > + > +- compatible: must be one of: > + "arm,pl110", "arm,primecell" > + "arm,pl111", "arm,primecell" > + > +- reg: base address and size of the control registers block > + > +- interrupt-names: either the single entry "combined" representing a > + combined interrupt output (CLCDINTR), or the four entries > + "mbe", "vcomp", "lnbu", "fuf" representing the individual > + CLCDMBEINTR, CLCDVCOMPINTR, CLCDLNBUINTR, CLCDFUFINTR interrupts > + > +- interrupts: contains an interrupt specifier for each entry in > + interrupt-names > + > +- clocks-names: should contain "clcdclk" and "apb_pclk" > + > +- clocks: contains phandle and clock specifier pairs for the entries > + in the clock-names property. See > + Documentation/devicetree/binding/clock/clock-bindings.txt > + > +Optional properties: > + > +- arm,pl11x,framebuffer-base: a pair of two 32-bit values, address and size, > + defining the framebuffer that must be used; if not present, the > + framebuffer may be located anywhere in the memory > + > +- max-memory-bandwidth: maximum bandwidth in bytes per second that the > + cell's memory interface can handle > + > +Required sub-nodes: > + > +- port: describes LCD panel signals, following the common binding > + for video transmitter interfaces; see > + Documentation/devicetree/bindings/media/video-interfaces.txt; > + when it is a TFT panel, the port's endpoint must define the > + following property: > + > + - arm,pl11x,tft-r0g0b0-pads: an array of three 32-bit values, > + defining the way CLD pads are wired up; this implicitly > + defines available color modes, for example: > + - PL111 TFT 4:4:4 panel: > + arm,pl11x,tft-r0g0b0-pads = <4 15 20>; > + - PL110 TFT (1:)5:5:5 panel: > + 
arm,pl11x,tft-r0g0b0-pads = <1 7 13>;
> + - PL111 TFT (1:)5:5:5 panel:
> + arm,pl11x,tft-r0g0b0-pads = <3 11 19>;
> + - PL111 TFT 5:6:5 panel:
> + arm,pl11x,tft-r0g0b0-pads = <3 10 19>;
> + - PL110 and PL111 TFT 8:8:8 panel:
> + arm,pl11x,tft-r0g0b0-pads = <0 8 16>;
> + - PL110 and PL111 TFT 8:8:8 panel, R & B components swapped:
> + arm,pl11x,tft-r0g0b0-pads = <16 8 0>;

How does this work for boards like the versatilepb which have a mux
between a PL110 and the TFT, allowing it to effectively rewire the pads
at runtime under control of the SYS_CLCD sysreg (to give a wider range
of colour modes than the PL110 supports natively)?

thanks
-- PMM
Re: [PATCH 6/7] pwm: st: Add new driver for ST's PWM IP
On Thu, Jun 19, 2014 at 07:57:14PM +0530, Ajit Pal wrote: > On Thursday 19 June 2014 02:14 PM, Lee Jones wrote: > >On Thu, 19 Jun 2014, Thierry Reding wrote: > >>On Wed, Jun 18, 2014 at 03:52:51PM +0100, Lee Jones wrote: [...] > >>>+ cdata->max_prescale + 1, sizeof(unsigned long), > >>>+ st_pwm_cmp_periods); > >>>+ if (!found) { > >>>+ dev_err(dev, "failed to find matching period\n"); > >>>+ return -EINVAL; > >>>+ } > >>>+ > >>>+ prescale = found - >pwm_periods[0]; > >> > >>This is somewhat unconventional. None of the other drivers precompute > >>possible periods and I'm not convinced that it's an advantage. Setting > >>the period (and configuring the PWM in general) is a fairly uncommon > >>operation. > > > >Another one for Ajit I feel. > > For ST PWM IP, the PWM period is fixed to 256 local clock pulses.There is no > register interface to select PWM periods.To change the period we have to > change the prescaler. > We precompute the possible periods, so as to avoid the calculations > everytime the .config function is called. Based upon a matching period we > then select the prescaler. > Sorry but why do you think precomputing is not helpful ? Mostly I dislike it here because it sticks out as nobody else is doing it. Secondly I'm not convinced that it gives you much of a performance gain since the computations aren't that involved and typically the period isn't changed all that often. Also computing the value directly in .config() makes the code much easier to follow. > >>>+static int st_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm) > >>>+{ > >>>+ struct st_pwm_chip *pc = to_st_pwmchip(chip); > >>>+ struct device *dev = pc->dev; > >>>+ int ret; > >>>+ > >>>+ ret = clk_enable(pc->clk); > >>>+ if (ret) > >>>+ return ret; > >>>+ > >>>+ ret = regmap_field_write(pc->pwm_en, 1); > >>>+ if (ret) > >>>+ dev_err(dev, "%s,pwm_en write failed\n", __func__); > > >> > >>This error message is somewhat cryptic, perhaps: > >> > >> "failed to enable PWM" > > > >Agreed. 
I also can't believe I missed that nasty __func__ too. > > > >>? Also what implications does this have on controllers with multiple > >>channels? > > > >I believe this enables both channels, but I'm sure Ajit will correct > >me if I'm wrong. > > Yes it enables all channels.Unfortunately we do not have the facility to > enable/disable individual channels on the ST PWM IP. That's bad. If you can't control them separately then there's no way you can guarantee the semantics of the PWM framework. > >>>+ dev_dbg(dev, "pwm counter :%u\n", val); > >>>+ > >>>+ clk_disable(pc->clk); > >>>+} > >>>+ > >>>+static const struct pwm_ops st_pwm_ops = { > >>>+ .config = st_pwm_config, > >>>+ .enable = st_pwm_enable, > >>>+ .disable = st_pwm_disable, > >>>+ .owner = THIS_MODULE, > >>>+}; > >>>+ > >>>+static int st_pwm_probe_dt(struct st_pwm_chip *pc) > >>>+{ > >>>+ struct device *dev = pc->dev; > >>>+ const struct reg_field *reg_fields; > >>>+ struct device_node *np = dev->of_node; > >>>+ struct st_pwm_compat_data *cdata = pc->cdata; > >>>+ u32 num_chan; > >>>+ > >>>+ of_property_read_u32(np, "st,pwm-num-chan", _chan); > >>>+ if (num_chan) > >>>+ cdata->num_chan = num_chan; > >> > >>I don't like this very much. What influences the number of channels? Is > >>it that specific SoC revisions have one and others have two? > > > >Ajit? > > > Depends on the board type on which the SoC is used. I don't understand. How can the board influence the number of PWM channels that the SoC supports? It does make sense for a board to define how many of them are actually *used*, but that's nothing that DT should contain nor that the driver should care about. The driver (and DT for that matter) should expose the hardware block's full capabilities. The use-case is what should determine what's used and what not. Thierry pgpu97gq4OBC5.pgp Description: PGP signature
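As a rough illustration of what computing the value directly in .config()
could look like, here is a userspace sketch. It rests on assumptions that
the thread does not confirm: that one period is 256 clock pulses divided
by (prescale + 1), and that only exact matches are accepted. The function
name and the PWM_CNT constant are hypothetical, and the real driver's
rounding may well differ.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
/* Clock pulses per PWM period, per the description in this thread. */
#define PWM_CNT 256ULL

/*
 * Find the prescale value for which
 *   period_ns == PWM_CNT * (prescale + 1) * NSEC_PER_SEC / clk_rate
 * holds exactly (the "+ 1" is an assumption). Returns 0 and stores the
 * result in *prescale on success, -1 if no exact match exists within
 * [0, max_prescale].
 */
static int find_prescale(uint64_t clk_rate, uint64_t period_ns,
			 unsigned int max_prescale, unsigned int *prescale)
{
	uint64_t ticks, div;

	if (!clk_rate || !period_ns)
		return -1;

	/* Convert the requested period to input clock ticks. */
	ticks = period_ns * clk_rate;
	if (ticks % NSEC_PER_SEC)
		return -1;
	ticks /= NSEC_PER_SEC;

	/* The period must be a whole multiple of PWM_CNT ticks. */
	if (ticks % PWM_CNT)
		return -1;
	div = ticks / PWM_CNT;

	if (div == 0 || div - 1 > max_prescale)
		return -1;

	*prescale = (unsigned int)(div - 1);
	return 0;
}
```

With a 25.6 MHz clock, for example, each prescale step is 10000 ns under
these assumptions, so a 50000 ns request maps to prescale 4 and a
15000 ns request is rejected. No table and no bsearch() are needed.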
[PATCH] sched: Fix potential near-infinite distribute_cfs_runtime loop
distribute_cfs_runtime intentionally only hands out enough runtime to bring each cfs_rq to 1 ns of runtime, expecting the cfs_rqs to then take the runtime they need only once they actually get to run. However, if they get to run sufficiently quickly, the period timer is still in distribute_cfs_runtime and no runtime is available, causing them to throttle. Then distribute has to handle them again, and this can go on until distribute has handed out all of the runtime 1ns at a time, which takes far too long. Instead allow access to the same runtime that distribute is handing out, accepting that corner cases with very low quota may be able to spend the entire cfs_b->runtime during distribute_cfs_runtime, meaning that the runtime directly handed out by distribute_cfs_runtime was over quota. In addition, if a cfs_rq does manage to throttle like this, make sure the existing distribute_cfs_runtime no longer loops over it again. Signed-off-by: Ben Segall --- kernel/sched/fair.c | 41 - 1 file changed, 20 insertions(+), 21 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 1f9c457..ef5eac7 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3361,7 +3361,11 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq) cfs_rq->throttled = 1; cfs_rq->throttled_clock = rq_clock(rq); raw_spin_lock(_b->lock); - list_add_tail_rcu(_rq->throttled_list, _b->throttled_cfs_rq); + /* +* Add to the _head_ of the list, so that an already-started +* distribute_cfs_runtime will not see us +*/ + list_add_rcu(_rq->throttled_list, _b->throttled_cfs_rq); if (!cfs_b->timer_active) __start_cfs_bandwidth(cfs_b, false); raw_spin_unlock(_b->lock); @@ -3418,7 +3422,8 @@ static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, u64 remaining, u64 expires) { struct cfs_rq *cfs_rq; - u64 runtime = remaining; + u64 runtime; + u64 starting_runtime = remaining; rcu_read_lock(); list_for_each_entry_rcu(cfs_rq, _b->throttled_cfs_rq, @@ -3449,7 +3454,7 @@ next: } 
rcu_read_unlock(); - return remaining; + return starting_runtime - remaining; } /* @@ -3495,22 +3500,17 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) /* account preceding periods in which throttling occurred */ cfs_b->nr_throttled += overrun; - /* -* There are throttled entities so we must first use the new bandwidth -* to unthrottle them before making it generally available. This -* ensures that all existing debts will be paid before a new cfs_rq is -* allowed to run. -*/ - runtime = cfs_b->runtime; runtime_expires = cfs_b->runtime_expires; - cfs_b->runtime = 0; /* -* This check is repeated as we are holding onto the new bandwidth -* while we unthrottle. This can potentially race with an unthrottled -* group trying to acquire new bandwidth from the global pool. +* This check is repeated as we are holding onto the new bandwidth while +* we unthrottle. This can potentially race with an unthrottled group +* trying to acquire new bandwidth from the global pool. This can result +* in us over-using our runtime if it is all used during this loop, but +* only by limited amounts in that extreme case. 
*/ - while (throttled && runtime > 0) { + while (throttled && cfs_b->runtime > 0) { + runtime = cfs_b->runtime; raw_spin_unlock(_b->lock); /* we can't nest cfs_b->lock while distributing bandwidth */ runtime = distribute_cfs_runtime(cfs_b, runtime, @@ -3518,10 +3518,10 @@ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun) raw_spin_lock(_b->lock); throttled = !list_empty(_b->throttled_cfs_rq); + + cfs_b->runtime -= min(runtime, cfs_b->runtime); } - /* return (any) remaining runtime */ - cfs_b->runtime = runtime; /* * While we are ensured activity in the period following an * unthrottle, this also covers the case in which the new bandwidth is @@ -3632,10 +3632,9 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b) return; } - if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice) { + if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice) runtime = cfs_b->runtime; - cfs_b->runtime = 0; - } + expires = cfs_b->runtime_expires; raw_spin_unlock(_b->lock); @@ -3646,7 +3645,7 @@ static void do_sched_cfs_slack_timer(struct cfs_bandwidth *cfs_b) raw_spin_lock(_b->lock); if (expires == cfs_b->runtime_expires) -
Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
Willy Tarreau writes:

> Hi Luis,
>
> On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
>> I was finally able to spend some more time with this and tried (a
>> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
>> Although I haven't done any extended testing, I don't see the two
>> stack traces and the /proc/sys/net/ipv4/ directory seems to be
>> correctly populated.
>>
>> I'm attaching the patch I've used, based on Tyler's.
>
> Would any of you or Tyler please kindly pass me a signed-off-by with
> a commit message ? That would be great. Alternately I'd do it myself
> and mention you authored them.

If my memory serves it is possible in 2.6.32 to set .ctl_name =
CTL_UNNEEDED and not need to implement a .strategy routine at all.

Given the fact that most people got the strategy routines slightly wrong
and that sys_sysctl is effectively unused, a strategy where you don't
implement code that no-one will use in a backport would be preferable.

Since you have mentioned this has come up a couple of times, if nothing
else this will be something to think about for next time.

I am puzzled why .ctl_name was populated in a backport at all.

Eric
Re: [PATCH 6/7] pwm: st: Add new driver for ST's PWM IP
On Thu, Jun 19, 2014 at 09:44:04AM +0100, Lee Jones wrote: > I'll comment on some of the more fluffy topics, I'll let Ajit reply to > the more technical details of the patch. > > On Thu, 19 Jun 2014, Thierry Reding wrote: > > On Wed, Jun 18, 2014 at 03:52:51PM +0100, Lee Jones wrote: > > > This driver supports all current STi platforms' PWM IPs. > > > > > > Signed-off-by: Lee Jones > > > --- > > > drivers/pwm/Kconfig | 9 ++ > > > drivers/pwm/Makefile | 1 + > > > drivers/pwm/pwm-st.c | 378 > > > +++ > > > 3 files changed, 388 insertions(+) > > > create mode 100644 drivers/pwm/pwm-st.c > > > > > > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig > > > index 4ad7b89..98a7bbc 100644 > > > --- a/drivers/pwm/Kconfig > > > +++ b/drivers/pwm/Kconfig > > > @@ -292,4 +292,13 @@ config PWM_VT8500 > > > To compile this driver as a module, choose M here: the module > > > will be called pwm-vt8500. > > > > > > +config PWM_ST > > > > PWM_ST is awfully generic, perhaps PWM_STI would be a better choice? > > Even that's very generic. Maybe PWM_STI_H4XX? There's nothing wrong with > > supporting STiH{5,6,7,...}xx SoCs with such a driver. I'm just trying to > > think ahead what will happen if at some point a new SoC family is > > released that requires a different driver. > > I'm inclined to agree with you, but as it stands, this driver supports > all ST h/w, so it's correct for it to be generic. If some new IP > comes into fuition, at worst we'll have to change the name of the > driver. I'm happy to put myself on the line for that if the time > comes. Renaming a driver isn't a trivial matter. People may be using the name in blacklists or scripts and renaming will likely annoy them. Like I said, there's nothing wrong with the driver name being less generic, we have other ways to identify what hardware it will run on. > > > diff --git a/drivers/pwm/pwm-st.c b/drivers/pwm/pwm-st.c [...] 
> > > +#define MAX_PWM_CNT_DEFAULT 255 > > > +#define MAX_PRESCALE_DEFAULT 0xff > > > +#define NUM_CHAN_DEFAULT 1 > > > > These are only used in one place and their meaning is fairly obvious, so > > I'd just drop them. > > I _always_ prefer defines over magic numbers, but as you wish - will fix. In general I agree, but there are cases where in my opinion the defines obfuscate rather than help. This is one of those. These aren't really magic numbers, since they are used in a context where their meaning is crystal clear. > > > + PWM_EN, > > > + PWM_INT_EN, > > > + /* keep last */ > > > + MAX_REGFIELDS > > > +}; > > > + > > > +struct st_pwm_chip { > > > + struct device *dev; > > > + struct clk *clk; > > > + unsigned long clk_rate; > > > + struct regmap *regmap; > > > + struct st_pwm_compat_data *cdata; > > > > Doesn't this require a predeclaration of struct st_pwm_compat_data? Or > > maybe just move struct st_pwm_compat_data before this. > > You're right, will fix. > > I think I would have expected at least a compiler warning about that? Me too. Perhaps one of the includes has a forward declaration? I'd hope not. > > > +}; > > > + > > > +struct st_pwm_compat_data { > > > + const struct reg_field *reg_fields; > > > + int num_chan; > > > + int max_pwm_cnt; > > > + int max_prescale; > > > > Can't these three be unsigned? > > I see no reason why not. They can also be signed. :) I prefer if variables use the strictest type possible. > > > +static void st_pwm_calc_periods(struct st_pwm_chip *pc) > > > +{ > > > + struct st_pwm_compat_data *cdata = pc->cdata; > > > + struct device *dev = pc->dev; > > > + unsigned long val; > > > + int i; > > > > unsigned? > > Why? > > It's much more common this way: > > $ git grep $'\t'"int i;" | wc -l > 17018 > $ git grep $'\t'"unsigned int i;" | wc -l > 2033 That just means that not everybody is as pedantic as I am. 
The reason why it should be unsigned int is that it's used in a loop and compared to a value which should also be unsigned (cdata->max_prescale). There just isn't a reasonable scenario where they would need to be negative. > > > + * 16 possible period values are supported (for a particular clock rate). > > > + * The requested period will be applied only if it matches one of these > > > + * 16 values. > > > + */ > > > +static int st_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm, > > > + int duty_ns, int period_ns) > > > +{ > > > + struct st_pwm_chip *pc = to_st_pwmchip(chip); > > > + struct device *dev = pc->dev; > > > + struct st_pwm_compat_data *cdata = pc->cdata; > > > + unsigned int prescale, pwmvalx; > > > + unsigned long *found; > > > + int ret; > > > + > > > + /* > > > + * Search for matching period value. The corresponding index is our > > > + * prescale value > > > + */ > > > + found = bsearch(_ns, >pwm_periods[0], > > > > Technically doesn't period_ns need to be converted to an unsigned long > > here? Otherwise this won't be compatible with 64-bit