date:20140226

Re: [PATCH 02/16] scsi: atari_scsi: fix sleep_on race

2014-02-26 Thread Michael Schmitz


Arnd Bergmann wrote:

sleep_on is known broken and going away. The atari_scsi driver is one of
two remaining users in the falcon_get_lock() function, which is a rather
crazy piece of code. This does not attempt to fix the driver's locking
scheme in general, but at least prevents falcon_get_lock from going to
sleep when no other thread holds the same lock or tries to get it,
and we no longer schedule with irqs disabled.

Signed-off-by: Arnd Bergmann 
Cc: Michael Schmitz 
Cc: Geert Uytterhoeven 
Cc: James E.J. Bottomley 
Cc: linux-s...@vger.kernel.org
---
 drivers/scsi/atari_scsi.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index a3e6c8a..b33ce34 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -90,6 +90,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 

 #include 
@@ -549,8 +550,10 @@ static void falcon_get_lock(void)
 
 	local_irq_save(flags);
 
-	while (!in_irq() && falcon_got_lock && stdma_others_waiting())

-   sleep_on(&falcon_fairness_wait);
+   wait_event_cmd(falcon_fairness_wait,
+  !in_irq() && falcon_got_lock && stdma_others_waiting(),
+  local_irq_restore(flags),
+  local_irq_save(flags));
 
 	while (!falcon_got_lock) {

if (in_irq())
@@ -562,7 +565,10 @@ static void falcon_get_lock(void)
falcon_trying_lock = 0;
wake_up(&falcon_try_wait);
} else {
-   sleep_on(&falcon_try_wait);
+   wait_event_cmd(falcon_try_wait,
+  falcon_got_lock && !falcon_trying_lock,
+  local_irq_restore(flags),
+  local_irq_save(flags));
}
}
 
  
Nack - the completion condition in the first hunk has its logic 
reversed. Try this instead (while() loops while condition true, do {} 
until () loops while condition false, no?)
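The polarity point can be checked mechanically. Below is a minimal userspace sketch, not driver code: the three flags stand in for in_irq(), falcon_got_lock and stdma_others_waiting(), and the helper names are made up for illustration. It assumes only that the wait_event family stops waiting once its condition becomes true, so the condition passed in has to be the negation of the old while () condition.

```c
#include <stdbool.h>

/* Condition of the original while () loop: keep sleeping while this holds. */
static bool keep_sleeping(bool in_irq, bool got_lock, bool others_waiting)
{
	return !in_irq && got_lock && others_waiting;
}

/* Condition for a wait_event-style macro: stop waiting once this holds. */
static bool wake_condition(bool in_irq, bool got_lock, bool others_waiting)
{
	return in_irq || !got_lock || !others_waiting;
}

/* Returns true if the two conditions are exact complements for all inputs. */
static bool conditions_are_complementary(void)
{
	for (int bits = 0; bits < 8; bits++) {
		bool a = bits & 1, b = bits & 2, c = bits & 4;

		if (wake_condition(a, b, c) == keep_sleeping(a, b, c))
			return false;
	}
	return true;
}
```

Calling conditions_are_complementary() walks all eight flag combinations, which is just De Morgan applied to the old loop condition.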


I'm 99% confident I had tested your current version of the patch before 
and found it still attempts to schedule while in interrupt. I can retest 
if you prefer, but that'll have to wait a few days.


diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index a3e6c8a..cc1b013 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -90,6 +90,7 @@
#include 
#include 
#include 
+#include 

#include 
#include 
@@ -549,8 +550,10 @@ static void falcon_get_lock(void)

   local_irq_save(flags);

-   while (!in_irq() && falcon_got_lock && stdma_others_waiting())
-   sleep_on(&falcon_fairness_wait);
+   wait_event_cmd(falcon_fairness_wait,
+   in_irq() || !falcon_got_lock || !stdma_others_waiting(),
+   local_irq_restore(flags),
+   local_irq_save(flags));

   while (!falcon_got_lock) {
   if (in_irq())
@@ -562,7 +565,10 @@ static void falcon_get_lock(void)
   falcon_trying_lock = 0;
   wake_up(&falcon_try_wait);
   } else {
-   sleep_on(&falcon_try_wait);
+   wait_event_cmd(falcon_try_wait,
+   falcon_got_lock && !falcon_trying_lock,
+   local_irq_restore(flags),
+   local_irq_save(flags));
   }
   }


Cheers,

   Michael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2 2/4] spi: spidev: Add support for Dual/Quad SPI Transfers

2014-02-26 Thread Mark Brown

On Tue, Feb 25, 2014 at 11:40:17AM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven 
> 
> Add support for Dual/Quad SPI Transfers to the spidev API.
> As this uses SPI mode bits that don't fit in a single byte, two new
> ioctls (SPI_IOC_RD_MODE32 and SPI_IOC_WR_MODE32) are introduced.

Applied, thanks.
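
For readers following the thread, here is a small sketch of why a one-byte mode cannot carry the Dual/Quad flags. The DEMO_ constants are local copies written from memory of the kernel's SPI headers and should be treated as illustrative; the helper names are hypothetical and not part of spidev.

```c
#include <stdint.h>

/* Illustrative copies of a few mode flags; the kernel headers are authoritative. */
#define DEMO_SPI_CPHA     0x01u
#define DEMO_SPI_TX_DUAL  0x100u
#define DEMO_SPI_TX_QUAD  0x200u
#define DEMO_SPI_RX_DUAL  0x400u
#define DEMO_SPI_RX_QUAD  0x800u

/* What a one-byte mode ioctl can carry: only the low 8 bits survive. */
static uint8_t mode_as_byte(uint32_t mode)
{
	return (uint8_t)(mode & 0xffu);
}

/* Nonzero if the mode uses any flag above bit 7, i.e. needs a wider ioctl. */
static int needs_mode32(uint32_t mode)
{
	return (mode & ~0xffu) != 0;
}
```

With these definitions, a mode of DEMO_SPI_RX_QUAD | DEMO_SPI_CPHA passed through mode_as_byte() comes back as DEMO_SPI_CPHA alone, which is the case the new 32-bit ioctls cover.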


signature.asc
Description: Digital signature

Re: [PATCH v2 0/6] spi: sh-msiof: Add support for R-Car H2 and M2

2014-02-26 Thread Mark Brown

On Tue, Feb 25, 2014 at 11:21:07AM +0100, Geert Uytterhoeven wrote:
> Hi Mark,
> 
> This patch series refactors the sh-msiof SPI driver and adds support for
> the MSIOF variant in the Renesas R-Car H2 (r8a7790) and M2 (r8a7791) SoCs.

I applied all these, thanks.  Laurent does make a valid point about the
fallback, though - can you please send a followup patch which addresses
that?


Re: [PATCH v2 4/4] spi: spidev_fdx: Add support for Dual/Quad SPI Transfers

2014-02-26 Thread Mark Brown

On Tue, Feb 25, 2014 at 11:40:19AM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven 
> 
> Use SPI_IOC_RD_MODE32 to print the full SPI mode, now in hex.

Applied, thanks.



Re: [PATCH] ASoC: fsl-sai: Add SND_SOC_DAIFMT_DSP_A/B support.

2014-02-26 Thread Mark Brown

On Thu, Feb 27, 2014 at 08:45:01AM +0800, Xiubo Li wrote:
> o Add SND_SOC_DAIFMT_DSP_A support.
> o Add SND_SOC_DAIFMT_DSP_B support.

Applied, thanks.



Re: [PATCH v2 3/4] spi: spidev_test: Add support for Dual/Quad SPI Transfers

2014-02-26 Thread Mark Brown

On Tue, Feb 25, 2014 at 11:40:18AM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven 

Applied, thanks.



Re: [PATCH v2 1/4] spi: spidev: Restore all SPI mode flags on ioctl failure

2014-02-26 Thread Mark Brown

On Tue, Feb 25, 2014 at 11:40:16AM +0100, Geert Uytterhoeven wrote:
> From: Geert Uytterhoeven 
> 
> In commit f477b7fb13df2b843997559ff34e87d054ba6538 ("spi: DUAL and QUAD
> support"), spi_device.mode was enlarged from 8 to 16 bits.

Applied, thanks.

> For SPI_IOC_WR_MODE this is probably not so important, as it doesn't allow
> setting Quad or Dual mode anyway, but SPI_IOC_WR_LSB_FIRST is used to just
> set or clear a single bit.

Since there's no API for it at present I'd not expect something that is
using spidev to be able to have enabled any of the high mode bits.
Unless I'm missing some path for this?
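
To make the failure mode concrete, here is a userspace sketch of the save/restore pattern under discussion. It is not the spidev code: struct demo_spi_device, the always-failing setup hook and both setter variants are hypothetical, and the flag values are illustrative copies. As noted above, the truncation only matters once something has actually set a high mode bit.

```c
#include <stdint.h>

/* Illustrative copies of two mode flags; the kernel headers are authoritative. */
#define DEMO_SPI_LSB_FIRST 0x08u   /* fits in the low byte */
#define DEMO_SPI_RX_QUAD   0x800u  /* lives above bit 7 */

/* Hypothetical stand-in for struct spi_device, reduced to the mode word. */
struct demo_spi_device {
	uint16_t mode;
};

/* Hypothetical setup hook that always fails, to exercise the restore path. */
static int demo_setup_fails(struct demo_spi_device *spi)
{
	(void)spi;
	return -1;
}

/* Set or clear the single LSB-first bit, leaving all other flags alone. */
static void demo_apply_lsb_first(struct demo_spi_device *spi, int enable)
{
	if (enable)
		spi->mode = (uint16_t)(spi->mode | DEMO_SPI_LSB_FIRST);
	else
		spi->mode = (uint16_t)(spi->mode & ~DEMO_SPI_LSB_FIRST);
}

/* One-byte save: the restore silently clears every flag above bit 7. */
static int demo_set_lsb_first_narrow(struct demo_spi_device *spi, int enable)
{
	uint8_t save = (uint8_t)spi->mode;
	int ret;

	demo_apply_lsb_first(spi, enable);
	ret = demo_setup_fails(spi);
	if (ret < 0)
		spi->mode = save;
	return ret;
}

/* Full-width save: the restore puts back exactly what was there. */
static int demo_set_lsb_first_wide(struct demo_spi_device *spi, int enable)
{
	uint16_t save = spi->mode;
	int ret;

	demo_apply_lsb_first(spi, enable);
	ret = demo_setup_fails(spi);
	if (ret < 0)
		spi->mode = save;
	return ret;
}
```

Starting from a mode of DEMO_SPI_RX_QUAD, the narrow variant leaves the mode at 0 after the failed call, while the wide variant leaves DEMO_SPI_RX_QUAD in place.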



Re: [PATCH 3/3] regulator: 88pm8607: fix indent code style

2014-02-26 Thread Mark Brown

On Wed, Feb 26, 2014 at 10:22:05AM +0900, Jingoo Han wrote:
> Fix indent code style in order to fix the following checkpatch
> issues.

Applied, thanks.



[PATCH -tip v7 02/26] kprobes/x86: Allow to handle reentered kprobe on singlestepping

2014-02-26 Thread Masami Hiramatsu

Since NMI handlers (e.g. perf) can interrupt
single-stepping (or the preparation for single-stepping,
do_debug, etc.), we should consider that a kprobe can be
hit in an NMI handler. Even in that case, the kprobe is
allowed to be reentered, the same as kprobes hit in kprobe
handlers (KPROBE_HIT_ACTIVE or KPROBE_HIT_SSDONE).
The real issue arises when a kprobe is hit while another
reentered kprobe is being processed (KPROBE_REENTER),
because we have already consumed the save area for the
previous kprobe.
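
The resulting decision can be summarised in a small standalone sketch. The enum and function names below are hypothetical stand-ins, not the kernel's definitions; only the grouping of states follows the description above.

```c
/* Hypothetical mirror of the per-CPU kprobe status values named above. */
enum demo_kprobe_status {
	DEMO_HIT_ACTIVE,
	DEMO_HIT_SS,
	DEMO_REENTER,
	DEMO_HIT_SSDONE
};

/* What a reentry handler would do for a probe hit in each state. */
enum demo_reentry_action {
	DEMO_COUNT_MISS_AND_SINGLESTEP, /* recoverable: save area still free */
	DEMO_UNRECOVERABLE              /* save area already used by a reentry */
};

static enum demo_reentry_action classify_reentry(enum demo_kprobe_status status)
{
	switch (status) {
	case DEMO_HIT_SSDONE:
	case DEMO_HIT_ACTIVE:
	case DEMO_HIT_SS:
		/* A hit during single-stepping is now treated like the others. */
		return DEMO_COUNT_MISS_AND_SINGLESTEP;
	case DEMO_REENTER:
		return DEMO_UNRECOVERABLE;
	}
	return DEMO_UNRECOVERABLE;
}
```

Only a hit while already in the reentered state is treated as fatal here; a hit during single-stepping is counted as missed and single-stepped like the other recoverable cases.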

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/kernel/kprobes/core.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index b482e96..a9a42fa 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -531,10 +531,11 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, 
struct kprobe_ctlblk *kcb
switch (kcb->kprobe_status) {
case KPROBE_HIT_SSDONE:
case KPROBE_HIT_ACTIVE:
+   case KPROBE_HIT_SS:
kprobes_inc_nmissed_count(p);
setup_singlestep(p, regs, kcb, 1);
break;
-   case KPROBE_HIT_SS:
+   case KPROBE_REENTER:
/* A probe has been hit in the codepath leading up to, or just
 * after, single-stepping of a probed instruction. This entire
 * codepath should strictly reside in .kprobes.text section.



[PATCH -tip v7 01/26] [BUGFIX]kprobes/x86: Fix page-fault handling logic

2014-02-26 Thread Masami Hiramatsu

The current kprobes in-kernel page-fault handler doesn't
expect that its single-stepping can be interrupted by
an NMI handler which may cause a page fault (e.g. perf
with callback tracing).
In that case, the page fault is handled by kprobes, which
wrongly assumes that the fault was caused by the
single-stepping code and tries to restore the IP address
to the probed address.
In fact the page fault was caused by the NMI handler, so
do_page_fault fails to handle the real page fault because
the IP address has been modified, and this causes kernel
BUGs like the one below.

 
 [ 2264.726905] BUG: unable to handle kernel NULL pointer
 dereference at 0020
[ 2264.727190] IP: [] copy_user_generic_string+0x0/0x40
[ 2264.727380] PGD cbcd067 PUD cbcc067 PMD 0
[ 2264.727529] Oops:  [#1] SMP
[ 2264.727683] Modules linked in: ipt_MASQUERADE bnep bluetooth 6lowpan_iphc 
iptable_nat nf_nat_ipv4 nf_nat aesni_intel aes_x86_64 ablk_helper cryptd lrw 
gf128mul glue_helper virtio_balloon snd_hda_intel snd_hda_codec snd_hwdep
[ 2264.728391] CPU: 1 PID: 25094 Comm: perf Not tainted 3.14.0-rc1.badprobe+ #24
[ 2264.728592] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 2264.728747] task: 88003db9c210 ti: 88000caac000 task.ti: 
88000caac000
[ 2264.728950] RIP: 0010:[]  [] 
copy_user_generic_string+0x0/0x40
[ 2264.729163] RSP: 0018:88003fd06bd0  EFLAGS: 00010246
[ 2264.729291] RAX:  RBX: 88003fd06bf8 RCX: 0002
[ 2264.729472] RDX:  RSI: 0020 RDI: 88003fd06bf8
[ 2264.729661] RBP: 88003fd06bd8 R08: 0030 R09: 
[ 2264.729789] R10: 001e R11: 0015 R12: 88000caadfd8
[ 2264.729789] R13: 88003d76bc00 R14: 88003db9c210 R15: 0020
[ 2264.729789] FS:  7f398bbcc780() GS:88003fd0() 
knlGS:
[ 2264.729789] CS:  0010 DS:  ES:  CR0: 80050033
[ 2264.729789] CR2: 0020 CR3: 204f2000 CR4: 07e0
[ 2264.729789] Stack:
[ 2264.729789]  813c5fd4 88003fd06c30 810183b0 
88003d76bc00
[ 2264.729789]  88003fd06ef8   
88003d76bc00
[ 2264.729789]  000c 00052ce0 88000956f800 
88000caadf58
[ 2264.729789] Call Trace:
[ 2264.729789]  
[ 2264.729789]  [] ? copy_from_user_nmi+0x64/0x70
[ 2264.729789]  [] perf_callchain_user+0xc0/0x220
[ 2264.729789]  [] perf_callchain+0x1c4/0x210
[ 2264.729789]  [] perf_prepare_sample+0x253/0x320
[ 2264.729789]  [] __perf_event_overflow+0xe7/0x230
[ 2264.729789]  [] ? x86_perf_event_set_period+0xe8/0x150
[ 2264.729789]  [] perf_event_overflow+0x14/0x20
[ 2264.729789]  [] intel_pmu_handle_irq+0x1cd/0x400
[ 2264.729789]  [] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789]  [] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789]  [] perf_event_nmi_handler+0x2b/0x50
[ 2264.729789]  [] nmi_handle+0x88/0x180
[ 2264.729789]  [] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789]  [] default_do_nmi+0x4a/0x140
[ 2264.729789]  [] do_nmi+0xa8/0xe0
[ 2264.729789]  [] end_repeat_nmi+0x1e/0x2e
[ 2264.729789]  [] ? copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789]  [] ? skip_prefixes+0x1c/0x40
[ 2264.729789]  [] ? bad_get_user+0x17/0x17
[ 2264.729789]  [] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789]  [] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789]  [] ? ftrace_regs_caller+0x81/0xcd
[ 2264.729789]  <>
[ 2264.729789]  <#DB>  [] ? 
copy_user_generic_unrolled+0xc0/0xc0
[ 2264.729789]  [] ? copy_user_generic_string+0x1/0x40
[ 2264.729789]  [] ? ftrace_cmp_recs+0x1/0x30
[ 2264.729789]  [] ? inat_get_opcode_attribute+0x5/0x20
[ 2264.729789]  [] ? inat_get_opcode_attribute+0x5/0x20
[ 2264.729789]  [] ? skip_prefixes+0x1c/0x40
[ 2264.729789]  [] resume_execution+0x37/0x1d0
[ 2264.729789]  [] kprobe_debug_handler+0x3f/0xe0
[ 2264.729789]  [] do_debug+0x7f/0x1d0
[ 2264.729789]  [] debug+0x3a/0x50
[ 2264.729789]  <>
[ 2264.729789]  [] ? seq_read+0x88/0x390
[ 2264.729789]  [] ? security_file_permission+0x84/0xa0
[ 2264.729789]  [] proc_reg_read+0x3d/0x80
[ 2264.729789]  [] vfs_read+0x9b/0x160
[ 2264.729789]  [] SyS_read+0x49/0xa0
[ 2264.729789]  [] system_call_fastpath+0x16/0x1b
[ 2264.729789] Code: c9 75 ee 21 d2 74 10 89 d1 8a 06 88 07 48 ff c6 48 ff c7 ff
 c9 75 f2 31 c0 0f 1f 00 c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00  1f 00
 83 fa 08 72 27 89 f9 83 e1 07 74 15 83 e9 08 f7 d9 29
[ 2264.729789] RIP  [] copy_user_generic_string+0x0/0x40
[ 2264.729789]  RSP 
[ 2264.729789] CR2: 0020
[ 2264.729789] ---[ end trace 533fc16b4cc45447 ]---
[ 2264.729789] Kernel panic - not syncing: Fatal exception in interrupt
[ 2264.729789] Kernel Offset: 0x0 from 0x8100 (relocation range: 0xf
fff8000-0x9fff)
 

To handle this correctly, I fixed the kprobes fault
handler to check that the faulting IP address points
at its own single-step buffer instead of checking the
current kprobe state.
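
The idea can be sketched outside the kernel as follows. This is not the patched handler: struct demo_kprobe and both helpers are hypothetical, and the sketch assumes the check is an equality test against the address of the out-of-line instruction copy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of the currently running probe. */
struct demo_kprobe {
	uintptr_t addr;      /* probed address in the original code */
	uintptr_t insn_slot; /* address of the out-of-line single-step copy */
};

/* The fault belongs to kprobes only if it hit the single-step copy itself. */
static bool fault_is_from_singlestep(const struct demo_kprobe *cur,
				     uintptr_t fault_ip)
{
	return cur != NULL && fault_ip == cur->insn_slot;
}

/* Rewrite the IP only for our own fault; leave anyone else's fault alone. */
static uintptr_t fixup_fault_ip(const struct demo_kprobe *cur,
				uintptr_t fault_ip)
{
	if (fault_is_from_singlestep(cur, fault_ip))
		return cur->addr;
	return fault_ip;
}
```

A fault raised from an NMI handler while a probe is mid-step has a different IP, so fixup_fault_ip() returns it unchanged and the normal page-fault path sees the real faulting address.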


Signed-off-by: Masami

[PATCH -tip v7 06/26] [BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt

2014-02-26 Thread Masami Hiramatsu

Prohibit probing on native_set_debugreg and native_load_idt.
Since kprobes uses do_debug for single-stepping,
functions called from do_debug before notify_die must not
be probed.
native_load_idt is also called from paranoid_exit when
returning from int3, so it must not be probed either.

Signed-off-by: Masami Hiramatsu 
Cc: Jeremy Fitzhardinge 
Cc: Chris Wright 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/kernel/paravirt.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 4c785fd..abff75f 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -390,8 +390,10 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.end_context_switch = paravirt_nop,
 };
 
-/* At this point, native_get_debugreg has real function entry */
+/* At this point, native_get/set_debugreg has real function entry */
 NOKPROBE_SYMBOL(native_get_debugreg);
+NOKPROBE_SYMBOL(native_set_debugreg);
+NOKPROBE_SYMBOL(native_load_idt);
 
 struct pv_apic_ops pv_apic_ops = {
 #ifdef CONFIG_X86_LOCAL_APIC



[PATCH -tip v7 04/26] kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist

2014-02-26 Thread Masami Hiramatsu

Introduce the NOKPROBE_SYMBOL() macro, which builds a kprobe
blacklist at build time. The usage of this macro is similar
to EXPORT_SYMBOL: put NOKPROBE_SYMBOL(function); just
after the function definition.
Since this macro will inhibit inlining of static/inline
functions, this patch also introduces the nokprobe_inline macro
for static/inline functions. In this case, we must use
NOKPROBE_SYMBOL() for the inline function's caller.

When CONFIG_KPROBES=y, the macro stores the given function
address in the "_kprobe_blacklist" section.

Since the data structures are not fully initialized by the
macro (because there is no "size" information), they
are re-initialized at boot time by using kallsyms.
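
The two-stage initialization can be pictured with a small userspace sketch. Everything here is hypothetical (struct demo_blacklist_entry, demo_set_size, demo_within_blacklist); it only illustrates why a start address alone is not enough and what changes once the size has been looked up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical blacklist entry; the macro can only record the start address. */
struct demo_blacklist_entry {
	uintptr_t start;
	size_t size;     /* 0 until filled in from the symbol table */
};

/* Stand-in for the boot-time pass that looks up each symbol's size. */
static void demo_set_size(struct demo_blacklist_entry *ent, size_t symbol_size)
{
	ent->size = symbol_size;
}

/* True if addr falls inside any blacklisted function body. */
static bool demo_within_blacklist(const struct demo_blacklist_entry *tbl,
				  size_t count, uintptr_t addr)
{
	for (size_t i = 0; i < count; i++) {
		if (addr >= tbl[i].start && addr - tbl[i].start < tbl[i].size)
			return true;
	}
	return false;
}
```

Before demo_set_size() runs, an entry has size 0 and matches nothing; afterwards any address from start up to, but not including, start + size is rejected, which covers probes placed at an offset inside the function.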

Changes from previous version:
 - Add nokprobe_inline for inline functions according to
   Steven's suggestion.

Signed-off-by: Masami Hiramatsu 
Cc: Ananth N Mavinakayanahalli 
Cc: "David S. Miller" 
Cc: Rob Landley 
Cc: Jeremy Fitzhardinge 
Cc: Chris Wright 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Arnd Bergmann 
Cc: Peter Zijlstra 
---
 Documentation/kprobes.txt |   16 ++
 arch/x86/include/asm/asm.h|7 +++
 arch/x86/kernel/paravirt.c|4 +
 include/asm-generic/vmlinux.lds.h |9 +++
 include/linux/compiler.h  |2 +
 include/linux/kprobes.h   |   20 ++-
 kernel/kprobes.c  |  100 +++--
 kernel/sched/core.c   |1 
 8 files changed, 107 insertions(+), 52 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 0cfb00f..7062631 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -22,8 +22,9 @@ Appendix B: The kprobes sysctl interface
 
 Kprobes enables you to dynamically break into any kernel routine and
 collect debugging and performance information non-disruptively. You
-can trap at almost any kernel code address, specifying a handler
+can trap at almost any kernel code address(*), specifying a handler
 routine to be invoked when the breakpoint is hit.
+(*: at some part of kernel code can not be trapped, see 1.5 Blacklist)
 
 There are currently three types of probes: kprobes, jprobes, and
 kretprobes (also called return probes).  A kprobe can be inserted
@@ -273,6 +274,19 @@ using one of the following techniques:
  or
 - Execute 'sysctl -w debug.kprobes_optimization=n'
 
+1.5 Blacklist
+
+Kprobes can probe almost of the kernel except itself. This means
+that there are some functions where kprobes cannot probe. Probing
+(trapping) such functions can cause recursive trap (e.g. double
+fault) or at least the nested probe handler never be called.
+Kprobes manages such functions as a blacklist.
+If you want to add a function into the blacklist, you just need
+to (1) include linux/kprobes.h and (2) use NOKPROBE_SYMBOL() macro
+to specify a blacklisted function.
+Kprobes checks given probe address with the blacklist and reject
+registering if the given address is in the blacklist.
+
 2. Architectures Supported
 
 Kprobes, jprobes, and return probes are implemented on the following
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 4582e8e..7730c1c 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -57,6 +57,12 @@
.long (from) - . ;  \
.long (to) - . + 0x7ff0 ;   \
.popsection
+
+# define _ASM_NOKPROBE(entry)  \
+   .pushsection "_kprobe_blacklist","aw" ; \
+   _ASM_ALIGN ;\
+   _ASM_PTR (entry);   \
+   .popsection
 #else
 # define _ASM_EXTABLE(from,to) \
" .pushsection \"__ex_table\",\"a\"\n"  \
@@ -71,6 +77,7 @@
" .long (" #from ") - .\n"  \
" .long (" #to ") - . + 0x7ff0\n"   \
" .popsection\n"
+/* For C file, we already have NOKPROBE_SYMBOL macro */
 #endif
 
 #endif /* _ASM_X86_ASM_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1b10af8..4c785fd 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -389,6 +390,9 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
.end_context_switch = paravirt_nop,
 };
 
+/* At this point, native_get_debugreg has real function entry */
+NOKPROBE_SYMBOL(native_get_debugreg);
+
 struct pv_apic_ops pv_apic_ops = {
 #ifdef CONFIG_X86_LOCAL_APIC
.startup_ipi_hook = paravirt_nop,
diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index bc2121f..81d07d5 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -109,6 +109,14 @@
 #define BRANCH_PROFILE()

[PATCH -tip v7 05/26] [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*

2014-02-26 Thread Masami Hiramatsu

Prohibit probing on debug_stack_reset and debug_stack_set_zero.
Since both functions are called from the TRACE_IRQS_ON/OFF_DEBUG
macros, which run in the int3 IST entry, probing them may cause a
soft lockup.

This happens when the kernel is built with CONFIG_DYNAMIC_FTRACE=y
and CONFIG_TRACE_IRQFLAGS=y.

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Borislav Petkov 
Cc: Fenghua Yu 
Cc: Seiji Aguchi 
---
 arch/x86/kernel/cpu/common.c |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8e28bf2..fd9caa4 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1159,6 +1160,7 @@ int is_debug_stack(unsigned long addr)
(addr <= __get_cpu_var(debug_stack_addr) &&
 addr > (__get_cpu_var(debug_stack_addr) - DEBUG_STKSZ));
 }
+NOKPROBE_SYMBOL(is_debug_stack);
 
 DEFINE_PER_CPU(u32, debug_idt_ctr);
 
@@ -1167,6 +1169,7 @@ void debug_stack_set_zero(void)
this_cpu_inc(debug_idt_ctr);
load_current_idt();
 }
+NOKPROBE_SYMBOL(debug_stack_set_zero);
 
 void debug_stack_reset(void)
 {
@@ -1175,6 +1178,7 @@ void debug_stack_reset(void)
if (this_cpu_dec_return(debug_idt_ctr) == 0)
load_current_idt();
 }
+NOKPROBE_SYMBOL(debug_stack_reset);
 
 #else  /* CONFIG_X86_64 */
 



[PATCH -tip v7 10/26] kprobes/x86: Allow probe on some kprobe preparation functions

2014-02-26 Thread Masami Hiramatsu

There is no need to prohibit probing on the functions
used in the preparation phase. They can be probed safely because
they are not invoked from breakpoint/fault/debug handlers,
so there is no chance of causing recursive exceptions.

The following functions are now removed from the kprobes blacklist:
 can_boost
 can_probe
 can_optimize
 is_IF_modifier
 __copy_instruction
 copy_optimized_instructions
 arch_copy_kprobe
 arch_prepare_kprobe
 arch_arm_kprobe
 arch_disarm_kprobe
 arch_remove_kprobe
 arch_trampoline_kprobe
 arch_prepare_kprobe_ftrace
 arch_prepare_optimized_kprobe
 arch_check_optimized_kprobe
 arch_within_optimized_kprobe
 __arch_remove_optimized_kprobe
 arch_remove_optimized_kprobe
 arch_optimize_kprobes
 arch_unoptimize_kprobe

I tested the safety via kprobe-tracer as below:

 # cd /sys/kernel/debug/tracing
 # cat above-coverted-symbols-list | while read s; do
   echo "p $s"; done > kprobe_events
 (Note: some symbols are not found, those are inlined)
 # echo 1 > events/kprobes/enable
 # echo p:foo vfs_symlink >> kprobe_events
 # echo p:bar vfs_symlink+5 >> kprobe_events
 # echo p vfs_symlink+5 >> kprobe_events
 # echo 1 > events/kprobes/foo/enable
 # ln -sf /tmp/foo /tmp/bar
 # echo 0 > events/kprobes/foo/enable
 # echo -:foo >> kprobe_events
 # head -n 20 trace
 # echo 0 > events/kprobes/enable
 # echo > kprobe_events
 # echo > trace

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Steven Rostedt 
Cc: Andrew Morton 
---
 arch/x86/kernel/kprobes/core.c   |   20 ++--
 arch/x86/kernel/kprobes/ftrace.c |2 +-
 arch/x86/kernel/kprobes/opt.c|   24 
 3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 566958e..ae5aafb 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -159,7 +159,7 @@ static kprobe_opcode_t *__kprobes 
skip_prefixes(kprobe_opcode_t *insn)
  * Returns non-zero if opcode is boostable.
  * RIP relative instructions are adjusted at copying time in 64 bits mode
  */
-int __kprobes can_boost(kprobe_opcode_t *opcodes)
+int can_boost(kprobe_opcode_t *opcodes)
 {
kprobe_opcode_t opcode;
kprobe_opcode_t *orig_opcodes = opcodes;
@@ -260,7 +260,7 @@ unsigned long recover_probed_instruction(kprobe_opcode_t 
*buf, unsigned long add
 }
 
 /* Check if paddr is at an instruction boundary */
-static int __kprobes can_probe(unsigned long paddr)
+static int can_probe(unsigned long paddr)
 {
unsigned long addr, __addr, offset = 0;
struct insn insn;
@@ -299,7 +299,7 @@ static int __kprobes can_probe(unsigned long paddr)
 /*
  * Returns non-zero if opcode modifies the interrupt flag.
  */
-static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
+static int is_IF_modifier(kprobe_opcode_t *insn)
 {
/* Skip prefixes */
insn = skip_prefixes(insn);
@@ -322,7 +322,7 @@ static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
  * If not, return null.
  * Only applicable to 64-bit x86.
  */
-int __kprobes __copy_instruction(u8 *dest, u8 *src)
+int __copy_instruction(u8 *dest, u8 *src)
 {
struct insn insn;
kprobe_opcode_t buf[MAX_INSN_SIZE];
@@ -365,7 +365,7 @@ int __kprobes __copy_instruction(u8 *dest, u8 *src)
return insn.length;
 }
 
-static int __kprobes arch_copy_kprobe(struct kprobe *p)
+static int arch_copy_kprobe(struct kprobe *p)
 {
int ret;
 
@@ -392,7 +392,7 @@ static int __kprobes arch_copy_kprobe(struct kprobe *p)
return 0;
 }
 
-int __kprobes arch_prepare_kprobe(struct kprobe *p)
+int arch_prepare_kprobe(struct kprobe *p)
 {
if (alternatives_text_reserved(p->addr, p->addr))
return -EINVAL;
@@ -407,17 +407,17 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return arch_copy_kprobe(p);
 }
 
-void __kprobes arch_arm_kprobe(struct kprobe *p)
+void arch_arm_kprobe(struct kprobe *p)
 {
text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
 }
 
-void __kprobes arch_disarm_kprobe(struct kprobe *p)
+void arch_disarm_kprobe(struct kprobe *p)
 {
text_poke(p->addr, &p->opcode, 1);
 }
 
-void __kprobes arch_remove_kprobe(struct kprobe *p)
+void arch_remove_kprobe(struct kprobe *p)
 {
if (p->ainsn.insn) {
free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1));
@@ -1057,7 +1057,7 @@ int __init arch_init_kprobes(void)
return 0;
 }
 
-int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+int arch_trampoline_kprobe(struct kprobe *p)
 {
return 0;
 }
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 23ef5c5..dcaa131 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -85,7 +85,7 @@ end:
local_irq_restore(flags);
 }
 
-int __kprobes arch_prepare_kprobe_ftrace(struct kprobe *p)
+int arch_prepare_kprobe_ftrace(struct kprobe *p)
 {

[PATCH -tip v7 07/26] [BUGFIX] x86: Prohibit probing on thunk functions and restore

2014-02-26 Thread Masami Hiramatsu

The thunk/restore functions are also used for tracing irqoff etc.,
and they are involved in kprobes' exception handling.
Prohibit probing on them to avoid a kernel crash.

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/lib/thunk_32.S |3 ++-
 arch/x86/lib/thunk_64.S |3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/thunk_32.S b/arch/x86/lib/thunk_32.S
index 2930ae0..28f85c91 100644
--- a/arch/x86/lib/thunk_32.S
+++ b/arch/x86/lib/thunk_32.S
@@ -4,8 +4,8 @@
  *  (inspired by Andi Kleen's thunk_64.S)
  * Subject to the GNU public license, v.2. No warranty of any kind.
  */
-
#include 
+   #include 
 
 #ifdef CONFIG_TRACE_IRQFLAGS
/* put return address in eax (arg1) */
@@ -22,6 +22,7 @@
popl %ecx
popl %eax
ret
+   _ASM_NOKPROBE(\name)
.endm
 
thunk_ra trace_hardirqs_on_thunk,trace_hardirqs_on_caller
diff --git a/arch/x86/lib/thunk_64.S b/arch/x86/lib/thunk_64.S
index a63efd6..92d9fea 100644
--- a/arch/x86/lib/thunk_64.S
+++ b/arch/x86/lib/thunk_64.S
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
/* rdi: arg1 ... normal C conventions. rax is saved/restored. */
.macro THUNK name, func, put_ret_addr_in_rdi=0
@@ -25,6 +26,7 @@
call \func
jmp  restore
CFI_ENDPROC
+   _ASM_NOKPROBE(\name)
.endm
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -43,3 +45,4 @@ restore:
RESTORE_ARGS
ret
CFI_ENDPROC
+   _ASM_NOKPROBE(restore)



[PATCH -tip v7 12/26] ftrace/*probes: Allow probing on some functions

2014-02-26 Thread Masami Hiramatsu

There is no need to prohibit probing on the functions
used for preparation or on the uprobe-only fetch functions.
They can be probed safely because they are not invoked
from kprobes' breakpoint/fault/debug handlers, so there
is no chance of causing recursive exceptions.

The following functions are now removed from the kprobes blacklist:
update_bitfield_fetch_param
free_bitfield_fetch_param
kprobe_register
FETCH_FUNC_NAME(stack, type) in trace_uprobe.c
FETCH_FUNC_NAME(memory, type) in trace_uprobe.c
FETCH_FUNC_NAME(memory, string) in trace_uprobe.c
FETCH_FUNC_NAME(memory, string_size) in trace_uprobe.c
FETCH_FUNC_NAME(file_offset, type) in trace_uprobe.c

Changes from v6:
  - allow probing fetch functions in trace_uprobe.c

Signed-off-by: Masami Hiramatsu 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
Cc: Namhyung Kim 
---
 kernel/trace/trace_kprobe.c |2 +-
 kernel/trace/trace_probe.c  |4 ++--
 kernel/trace/trace_uprobe.c |   20 ++--
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index bdbae45..d0ffbbe 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1209,7 +1209,7 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct 
kretprobe_instance *ri,
  * kprobe_trace_self_tests_init() does enable_trace_probe/disable_trace_probe
  * lockless, but we can't race with this __init function.
  */
-static __kprobes
+static
 int kprobe_register(struct ftrace_event_call *event,
enum trace_reg type, void *data)
 {
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 8364a42..d3a91e4 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -183,7 +183,7 @@ DEFINE_BASIC_FETCH_FUNCS(bitfield)
 #define fetch_bitfield_string  NULL
 #define fetch_bitfield_string_size NULL
 
-static __kprobes void
+static void
 update_bitfield_fetch_param(struct bitfield_fetch_param *data)
 {
/*
@@ -196,7 +196,7 @@ update_bitfield_fetch_param(struct bitfield_fetch_param 
*data)
update_symbol_cache(data->orig.data);
 }
 
-static __kprobes void
+static void
 free_bitfield_fetch_param(struct bitfield_fetch_param *data)
 {
/*
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 79e52d9..8751efd4 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -108,8 +108,8 @@ static unsigned long get_user_stack_nth(struct pt_regs 
*regs, unsigned int n)
  * Uprobes-specific fetch functions
  */
 #define DEFINE_FETCH_stack(type)   \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
- void *offset, void *dest) \
+static void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
+void *offset, void *dest)  \
 {  \
*(type *)dest = (type)get_user_stack_nth(regs,  \
  ((unsigned long)offset)); \
@@ -120,8 +120,8 @@ DEFINE_BASIC_FETCH_FUNCS(stack)
 #define fetch_stack_string_sizeNULL
 
 #define DEFINE_FETCH_memory(type)  \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
-   void *addr, void *dest) \
+static void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,
\
+ void *addr, void *dest)   \
 {  \
type retval;\
void __user *vaddr = (void __force __user *) addr;  \
@@ -136,8 +136,8 @@ DEFINE_BASIC_FETCH_FUNCS(memory)
  * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
  * length and relative data location.
  */
-static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+   void *addr, void *dest)
 {
long ret;
u32 rloc = *(u32 *)dest;
@@ -158,8 +158,8 @@ static __kprobes void FETCH_FUNC_NAME(memory, 
string)(struct pt_regs *regs,
}
 }
 
-static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs 
*regs,
- void *addr, void *dest)
+static void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+void *addr, void *dest)
 {
int len;
void __user *vaddr = (void __force __user *) addr;
@@ -184,8 +184,8 @@ static unsigned long translate_user_vaddr(void *file_offset)
 }
 
 #define

[PATCH -tip v7 09/26] x86: Call exception_enter after kprobes handled

2014-02-26 Thread Masami Hiramatsu

Move the exception_enter() call to after the kprobes
handler is done. Since exception_enter() involves
many other functions (like printk), it can cause a
recursive int3/break loop when kprobes probes such
functions.
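
A toy model of the ordering problem, for illustration only. Nothing here is kernel code: demo_int3() and its parameters are made up, the "helper" stands for any function reached via exception_enter(), and the depth cap replaces what would otherwise be an endless loop.

```c
#include <stdbool.h>

/* Cap used by the model in place of a real, unbounded recursion. */
#define DEMO_MAX_DEPTH 8

/*
 * Returns the nesting depth reached by one breakpoint hit.
 * helper_is_probed:     a probe sits on a function the entry hook calls
 * enter_before_kprobes: the entry hook runs before the kprobes handler
 */
static int demo_int3(bool helper_is_probed, bool enter_before_kprobes,
		     int depth)
{
	if (depth >= DEMO_MAX_DEPTH)
		return depth; /* stand-in for the recursive int3 loop */

	/* Old order: the hook runs first, hits the probe, and traps again. */
	if (enter_before_kprobes && helper_is_probed)
		return demo_int3(helper_is_probed, enter_before_kprobes,
				 depth + 1);

	/* Otherwise the kprobes handler consumes the breakpoint and returns. */
	return depth;
}
```

In this model, a probed helper with the hook first runs up to the cap, while the same probe with the kprobes handler first stays at the starting depth.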

Signed-off-by: Masami Hiramatsu 
---
 arch/x86/kernel/traps.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e5d4a70..ba9abe9 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -327,7 +327,6 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
if (poke_int3_handler(regs))
return;
 
-   prev_state = exception_enter();
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -338,6 +337,7 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
if (kprobe_int3_handler(regs))
return;
 #endif
+   prev_state = exception_enter();
 
if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
@@ -415,8 +415,6 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
unsigned long dr6;
int si_code;
 
-   prev_state = exception_enter();
-
get_debugreg(dr6, 6);
 
/* Filter out all the reserved bits which are preset to 1 */
@@ -449,6 +447,7 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
if (kprobe_debug_handler(regs))
goto exit;
 #endif
+   prev_state = exception_enter();
 
if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -tip v7 15/26] kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes

2014-02-26 Thread Masami Hiramatsu

Use NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of __kprobes annotation.

Signed-off-by: Masami Hiramatsu 
Cc: Ananth N Mavinakayanahalli 
Cc: "David S. Miller" 
---
 kernel/kprobes.c |   67 +-
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4db2cc6..a21b4e6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -301,7 +301,7 @@ static inline void reset_kprobe_instance(void)
  * OR
  * - with preemption disabled - from arch/xxx/kernel/kprobes.c
  */
-struct kprobe __kprobes *get_kprobe(void *addr)
+struct kprobe *get_kprobe(void *addr)
 {
struct hlist_head *head;
struct kprobe *p;
@@ -314,8 +314,9 @@ struct kprobe __kprobes *get_kprobe(void *addr)
 
return NULL;
 }
+NOKPROBE_SYMBOL(get_kprobe);
 
-static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);
+static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs);
 
 /* Return true if the kprobe is an aggregator */
 static inline int kprobe_aggrprobe(struct kprobe *p)
@@ -347,7 +348,7 @@ static bool kprobes_allow_optimization;
  * Call all pre_handler on the list, but ignores its return value.
  * This must be called from arch-dep optimized caller.
  */
-void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
+void opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
 {
struct kprobe *kp;
 
@@ -359,6 +360,7 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
reset_kprobe_instance();
}
 }
+NOKPROBE_SYMBOL(opt_pre_handler);
 
 /* Free optimized instructions and optimized_kprobe */
 static void free_aggr_kprobe(struct kprobe *p)
@@ -995,7 +997,7 @@ static void disarm_kprobe(struct kprobe *kp, bool reopt)
  * Aggregate handlers for multiple kprobes support - these handlers
  * take care of invoking the individual kprobe handlers on p->list
  */
-static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
+static int aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
 {
struct kprobe *kp;
 
@@ -1009,9 +1011,10 @@ static int __kprobes aggr_pre_handler(struct kprobe *p, struct pt_regs *regs)
}
return 0;
 }
+NOKPROBE_SYMBOL(aggr_pre_handler);
 
-static void __kprobes aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
-   unsigned long flags)
+static void aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags)
 {
struct kprobe *kp;
 
@@ -1023,9 +1026,10 @@ static void __kprobes aggr_post_handler(struct kprobe *p, struct pt_regs *regs,
}
}
 }
+NOKPROBE_SYMBOL(aggr_post_handler);
 
-static int __kprobes aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
-   int trapnr)
+static int aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
+ int trapnr)
 {
struct kprobe *cur = __this_cpu_read(kprobe_instance);
 
@@ -1039,8 +1043,9 @@ static int __kprobes aggr_fault_handler(struct kprobe *p, struct pt_regs *regs,
}
return 0;
 }
+NOKPROBE_SYMBOL(aggr_fault_handler);
 
-static int __kprobes aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
+static int aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
 {
struct kprobe *cur = __this_cpu_read(kprobe_instance);
int ret = 0;
@@ -1052,9 +1057,10 @@ static int __kprobes aggr_break_handler(struct kprobe *p, struct pt_regs *regs)
reset_kprobe_instance();
return ret;
 }
+NOKPROBE_SYMBOL(aggr_break_handler);
 
 /* Walks the list and increments nmissed count for multiprobe case */
-void __kprobes kprobes_inc_nmissed_count(struct kprobe *p)
+void kprobes_inc_nmissed_count(struct kprobe *p)
 {
struct kprobe *kp;
if (!kprobe_aggrprobe(p)) {
@@ -1065,9 +1071,10 @@ void __kprobes kprobes_inc_nmissed_count(struct kprobe *p)
}
return;
 }
+NOKPROBE_SYMBOL(kprobes_inc_nmissed_count);
 
-void __kprobes recycle_rp_inst(struct kretprobe_instance *ri,
-   struct hlist_head *head)
+void recycle_rp_inst(struct kretprobe_instance *ri,
+struct hlist_head *head)
 {
struct kretprobe *rp = ri->rp;
 
@@ -1082,8 +1089,9 @@ void __kprobes recycle_rp_inst(struct kretprobe_instance *ri,
/* Unregistering */
hlist_add_head(&ri->hlist, head);
 }
+NOKPROBE_SYMBOL(recycle_rp_inst);
 
-void __kprobes kretprobe_hash_lock(struct task_struct *tsk,
+void kretprobe_hash_lock(struct task_struct *tsk,
 struct hlist_head **head, unsigned long *flags)
 __acquires(hlist_lock)
 {
@@ -1094,17 +1102,19 @@ __acquires(hlist_lock)
hlist_lock = kretprobe_table_lock_ptr(hash);
raw_spin_lock_irqsave(hlist_lock, *flags);

[PATCH -tip v7 08/26] kprobes/x86: Call exception handlers directly from do_int3/do_debug

2014-02-26 Thread Masami Hiramatsu

To avoid a kernel crash when probing lockdep code, call
kprobe_int3_handler and kprobe_debug_handler directly
from do_int3 and do_debug. notify_die takes a lock, so
lockdep code can be invoked from it. And because lockdep
involves printk()-related code, we would theoretically
need to prohibit probing on much more code...

Anyway, most of the int3 handlers in the kernel are already
called from do_int3 directly, e.g. ftrace_int3_handler,
poke_int3_handler, kgdb_ll_trap. Actually only
kprobe_exceptions_notify is on the notifier_call_chain.

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ananth N Mavinakayanahalli 
Cc: Andi Kleen 
Cc: Steven Rostedt 
Cc: Sasha Levin 
Cc: Andrew Morton 
Cc: Seiji Aguchi 
Cc: Frederic Weisbecker 
---
 arch/x86/include/asm/kprobes.h |2 ++
 arch/x86/kernel/kprobes/core.c |   24 +++-
 arch/x86/kernel/traps.c|   10 ++
 3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 9454c16..53cdfb2 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -116,4 +116,6 @@ struct kprobe_ctlblk {
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);
+extern int kprobe_int3_handler(struct pt_regs *regs);
+extern int kprobe_debug_handler(struct pt_regs *regs);
 #endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 4708d6e..566958e 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -559,7 +559,7 @@ reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb
  * Interrupts are disabled on entry as trap3 is an interrupt gate and they
  * remain disabled throughout this function.
  */
-static int __kprobes kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_int3_handler(struct pt_regs *regs)
 {
kprobe_opcode_t *addr;
struct kprobe *p;
@@ -857,7 +857,7 @@ no_change:
  * Interrupts are disabled on entry as trap1 is an interrupt gate and they
  * remain disabled throughout this function.
  */
-static int __kprobes post_kprobe_handler(struct pt_regs *regs)
+int __kprobes kprobe_debug_handler(struct pt_regs *regs)
 {
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
@@ -960,22 +960,7 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
if (args->regs && user_mode_vm(args->regs))
return ret;
 
-   switch (val) {
-   case DIE_INT3:
-   if (kprobe_handler(args->regs))
-   ret = NOTIFY_STOP;
-   break;
-   case DIE_DEBUG:
-   if (post_kprobe_handler(args->regs)) {
-   /*
-* Reset the BS bit in dr6 (pointed by args->err) to
-* denote completion of processing
-*/
-   (*(unsigned long *)ERR_PTR(args->err)) &= ~DR_STEP;
-   ret = NOTIFY_STOP;
-   }
-   break;
-   case DIE_GPF:
+   if (val == DIE_GPF) {
/*
 * To be potentially processing a kprobe fault and to
 * trust the result from kprobe_running(), we have
@@ -984,9 +969,6 @@ kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *d
if (!preemptible() && kprobe_running() &&
kprobe_fault_handler(args->regs, args->trapnr))
ret = NOTIFY_STOP;
-   break;
-   default:
-   break;
}
return ret;
 }
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 57409f6..e5d4a70 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -334,6 +334,11 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
goto exit;
 #endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
 
+#ifdef CONFIG_KPROBES
+   if (kprobe_int3_handler(regs))
+   return;
+#endif
+
if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
SIGTRAP) == NOTIFY_STOP)
goto exit;
@@ -440,6 +445,11 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code)
/* Store the virtualized DR6 value */
tsk->thread.debugreg6 = dr6;
 
+#ifdef CONFIG_KPROBES
+   if (kprobe_debug_handler(regs))
+   goto exit;
+#endif
+
if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
SIGTRAP) == NOTIFY_STOP)
goto exit;



[PATCH -tip v7 13/26] x86: Allow kprobes on text_poke/hw_breakpoint

2014-02-26 Thread Masami Hiramatsu

Allow kprobes on text_poke/hw_breakpoint because
those are not related to the critical int3-debug
recursive path of kprobes at this moment.

Signed-off-by: Masami Hiramatsu 
---
 arch/x86/kernel/alternative.c   |3 +--
 arch/x86/kernel/hw_breakpoint.c |5 ++---
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index df94598..703130f 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -5,7 +5,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -551,7 +550,7 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
  *
  * Note: Must be called under text_mutex.
  */
-void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
+void *text_poke(void *addr, const void *opcode, size_t len)
 {
unsigned long flags;
char *vaddr;
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index a67b47c..5f9cf20 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -32,7 +32,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -424,7 +423,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
  * NOTIFY_STOP returned for all other cases
  *
  */
-static int __kprobes hw_breakpoint_handler(struct die_args *args)
+static int hw_breakpoint_handler(struct die_args *args)
 {
int i, cpu, rc = NOTIFY_STOP;
struct perf_event *bp;
@@ -511,7 +510,7 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
 /*
  * Handle debug exception notifications.
  */
-int __kprobes hw_breakpoint_exceptions_notify(
+int hw_breakpoint_exceptions_notify(
struct notifier_block *unused, unsigned long val, void *data)
 {
if (val != DIE_DEBUG)



[PATCH -tip v7 17/26] notifier: Use NOKPROBE_SYMBOL macro in notifier

2014-02-26 Thread Masami Hiramatsu

Use NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of __kprobes annotation in notifier.

Signed-off-by: Masami Hiramatsu 
---
 kernel/notifier.c |   22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/notifier.c b/kernel/notifier.c
index 2d5cc4c..61fc78a 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -71,9 +71,9 @@ static int notifier_chain_unregister(struct notifier_block **nl,
  * @returns:   notifier_call_chain returns the value returned by the
  * last notifier function called.
  */
-static int __kprobes notifier_call_chain(struct notifier_block **nl,
-   unsigned long val, void *v,
-   int nr_to_call, int *nr_calls)
+static int notifier_call_chain(struct notifier_block **nl,
+  unsigned long val, void *v,
+  int nr_to_call, int *nr_calls)
 {
int ret = NOTIFY_DONE;
struct notifier_block *nb, *next_nb;
@@ -102,6 +102,7 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
}
return ret;
 }
+NOKPROBE_SYMBOL(notifier_call_chain);
 
 /*
  * Atomic notifier chain routines.  Registration and unregistration
@@ -172,9 +173,9 @@ EXPORT_SYMBOL_GPL(atomic_notifier_chain_unregister);
  * Otherwise the return value is the return value
  * of the last notifier function called.
  */
-int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
-   unsigned long val, void *v,
-   int nr_to_call, int *nr_calls)
+int __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
+unsigned long val, void *v,
+int nr_to_call, int *nr_calls)
 {
int ret;
 
@@ -184,13 +185,15 @@ int __kprobes __atomic_notifier_call_chain(struct atomic_notifier_head *nh,
return ret;
 }
 EXPORT_SYMBOL_GPL(__atomic_notifier_call_chain);
+NOKPROBE_SYMBOL(__atomic_notifier_call_chain);
 
-int __kprobes atomic_notifier_call_chain(struct atomic_notifier_head *nh,
-   unsigned long val, void *v)
+int atomic_notifier_call_chain(struct atomic_notifier_head *nh,
+  unsigned long val, void *v)
 {
return __atomic_notifier_call_chain(nh, val, v, -1, NULL);
 }
 EXPORT_SYMBOL_GPL(atomic_notifier_call_chain);
+NOKPROBE_SYMBOL(atomic_notifier_call_chain);
 
 /*
  * Blocking notifier chain routines.  All access to the chain is
@@ -527,7 +530,7 @@ EXPORT_SYMBOL_GPL(srcu_init_notifier_head);
 
 static ATOMIC_NOTIFIER_HEAD(die_chain);
 
-int notrace __kprobes notify_die(enum die_val val, const char *str,
+int notrace notify_die(enum die_val val, const char *str,
   struct pt_regs *regs, long err, int trap, int sig)
 {
struct die_args args = {
@@ -540,6 +543,7 @@ int notrace __kprobes notify_die(enum die_val val, const char *str,
};
return atomic_notifier_call_chain(&die_chain, val, &args);
 }
+NOKPROBE_SYMBOL(notify_die);
 
 int register_die_notifier(struct notifier_block *nb)
 {



[PATCH -tip v7 11/26] kprobes: Allow probe on some kprobe functions

2014-02-26 Thread Masami Hiramatsu

There is no need to prohibit probing on the functions
used for preparation, registration, optimization,
control, etc. They can be probed safely: because they are
not invoked from breakpoint/fault/debug handlers,
there is no chance of causing recursive exceptions.

The following functions are now removed from the kprobes blacklist:
add_new_kprobe
aggr_kprobe_disabled
alloc_aggr_kprobe
alloc_aggr_kprobe
arm_all_kprobes
__arm_kprobe
arm_kprobe
arm_kprobe_ftrace
check_kprobe_address_safe
collect_garbage_slots
collect_garbage_slots
collect_one_slot
debugfs_kprobe_init
__disable_kprobe
disable_kprobe
disarm_all_kprobes
__disarm_kprobe
disarm_kprobe
disarm_kprobe_ftrace
do_free_cleaned_kprobes
do_optimize_kprobes
do_unoptimize_kprobes
enable_kprobe
force_unoptimize_kprobe
free_aggr_kprobe
free_aggr_kprobe
__free_insn_slot
__get_insn_slot
get_optimized_kprobe
__get_valid_kprobe
init_aggr_kprobe
init_aggr_kprobe
in_nokprobe_functions
kick_kprobe_optimizer
kill_kprobe
kill_optimized_kprobe
kprobe_addr
kprobe_optimizer
kprobe_queued
kprobe_seq_next
kprobe_seq_start
kprobe_seq_stop
kprobes_module_callback
kprobes_open
optimize_all_kprobes
optimize_kprobe
prepare_kprobe
prepare_optimized_kprobe
register_aggr_kprobe
register_jprobe
register_jprobes
register_kprobe
register_kprobes
register_kretprobe
register_kretprobe
register_kretprobes
register_kretprobes
report_probe
show_kprobe_addr
try_to_optimize_kprobe
unoptimize_all_kprobes
unoptimize_kprobe
unregister_jprobe
unregister_jprobes
unregister_kprobe
__unregister_kprobe_bottom
unregister_kprobes
__unregister_kprobe_top
unregister_kretprobe
unregister_kretprobe
unregister_kretprobes
unregister_kretprobes
wait_for_kprobe_optimizer

Signed-off-by: Masami Hiramatsu 
Cc: Ananth N Mavinakayanahalli 
Cc: "David S. Miller" 
---
 kernel/kprobes.c |  153 +++---
 1 file changed, 76 insertions(+), 77 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 5ffc687..4db2cc6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -138,13 +138,13 @@ struct kprobe_insn_cache kprobe_insn_slots = {
.insn_size = MAX_INSN_SIZE,
.nr_garbage = 0,
 };
-static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c);
+static int collect_garbage_slots(struct kprobe_insn_cache *c);
 
 /**
  * __get_insn_slot() - Find a slot on an executable page for an instruction.
  * We allocate an executable page if there's no room on existing ones.
  */
-kprobe_opcode_t __kprobes *__get_insn_slot(struct kprobe_insn_cache *c)
+kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c)
 {
struct kprobe_insn_page *kip;
kprobe_opcode_t *slot = NULL;
@@ -201,7 +201,7 @@ out:
 }
 
 /* Return 1 if all garbages are collected, otherwise 0. */
-static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
+static int collect_one_slot(struct kprobe_insn_page *kip, int idx)
 {
kip->slot_used[idx] = SLOT_CLEAN;
kip->nused--;
@@ -222,7 +222,7 @@ static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
return 0;
 }
 
-static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c)
+static int collect_garbage_slots(struct kprobe_insn_cache *c)
 {
struct kprobe_insn_page *kip, *next;
 
@@ -244,8 +244,8 @@ static int __kprobes collect_garbage_slots(struct kprobe_insn_cache *c)
return 0;
 }
 
-void __kprobes __free_insn_slot(struct kprobe_insn_cache *c,
-   kprobe_opcode_t *slot, int dirty)
+void __free_insn_slot(struct kprobe_insn_cache *c,
+ kprobe_opcode_t *slot, int dirty)
 {
struct kprobe_insn_page *kip;
 
@@ -361,7 +361,7 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
 }
 
 /* Free optimized instructions and optimized_kprobe */
-static __kprobes void free_aggr_kprobe(struct kprobe *p)
+static void free_aggr_kprobe(struct kprobe *p)
 {
struct optimized_kprobe *op;
 
@@ -399,7 +399,7 @@ static inline int kprobe_disarmed(struct kprobe *p)
 }
 
 /* Return true(!0) if the probe is queued on (un)optimizing lists */
-static int __kprobes kprobe_queued(struct kprobe *p)
+static int kprobe_queued(struct kprobe *p)
 {
struct optimized_kprobe *op;
 
@@ -415,7 +415,7 @@ static int __kprobes kprobe_queued(struct kprobe *p)
  * Return an optimized kprobe whose optimizing code replaces
  * instructions including addr (exclude breakpoint).
  */
-static struct kprobe *__kprobes get_optimized_kprobe(unsigned long addr)
+static struct kprobe *get_optimized_kprobe(unsigned long addr)
 {
int i;
struct kprobe *p = NULL;
@@ -447,7 +447,7 @@ static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
  * Optimize (replace a breakpoint with a jump) kprobes listed on
  * optimizing_list.
  */
-static __kprobes void do_optimize_kprobes(void)
+static void do_optimize_kprobes(void)
 {
/* Optimization never be done when disarmed */
if

[PATCH -tip v7 19/26] kprobes: Show blacklist entries via debugfs

2014-02-26 Thread Masami Hiramatsu

Show blacklist entries (function names with the address
range) via /sys/kernel/debug/kprobes/blacklist.
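
As a rough illustration of consuming this file from user space, the sketch
below parses one line in the "start-end<TAB>symbol" layout produced by the
seq_printf() format in this patch. The helper parse_blacklist_line(), its
struct, and any sample line fed to it are hypothetical, and the exact text
that %p and %ps emit may differ between kernel versions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One parsed blacklist line (hypothetical user-space representation). */
struct blacklist_line {
    unsigned long long start;  /* start address of the range */
    unsigned long long end;    /* end address of the range */
    char symbol[64];           /* symbol name, truncated to 63 characters */
};

/* Parse "0x<start>-0x<end>\t<symbol>".
 * Returns 0 on success and -1 on a malformed line. */
static int parse_blacklist_line(const char *line, struct blacklist_line *out)
{
    assert(line != NULL && out != NULL);

    /* %llx accepts an optional 0x prefix; %63s stops at whitespace. */
    if (sscanf(line, "%llx-%llx %63s",
               &out->start, &out->end, out->symbol) != 3)
        return -1;

    /* Reject ranges that are empty or reversed. */
    return out->start < out->end ? 0 : -1;
}
```

A caller would typically read the file line by line with fgets() and pass
each line to the helper; reading debugfs normally requires root.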

Signed-off-by: Masami Hiramatsu 
Cc: Ananth N Mavinakayanahalli 
Cc: "David S. Miller" 
---
 kernel/kprobes.c |   61 +++---
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index a21b4e6..3214289 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2249,6 +2249,46 @@ static const struct file_operations debugfs_kprobes_operations = {
.release= seq_release,
 };
 
+/* kprobes/blacklist -- shows which functions can not be probed */
+static void *kprobe_blacklist_seq_start(struct seq_file *m, loff_t *pos)
+{
+   return seq_list_start(&kprobe_blacklist, *pos);
+}
+
+static void *kprobe_blacklist_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+   return seq_list_next(v, &kprobe_blacklist, pos);
+}
+
+static int kprobe_blacklist_seq_show(struct seq_file *m, void *v)
+{
+   struct kprobe_blacklist_entry *ent =
+   list_entry(v, struct kprobe_blacklist_entry, list);
+
+   seq_printf(m, "0x%p-0x%p\t%ps\n", (void *)ent->start_addr,
+  (void *)ent->end_addr, (void *)ent->start_addr);
+   return 0;
+}
+
+static const struct seq_operations kprobe_blacklist_seq_ops = {
+   .start = kprobe_blacklist_seq_start,
+   .next  = kprobe_blacklist_seq_next,
+   .stop  = kprobe_seq_stop,   /* Reuse void function */
+   .show  = kprobe_blacklist_seq_show,
+};
+
+static int kprobe_blacklist_open(struct inode *inode, struct file *filp)
+{
+   return seq_open(filp, &kprobe_blacklist_seq_ops);
+}
+
+static const struct file_operations debugfs_kprobe_blacklist_ops = {
+   .open   = kprobe_blacklist_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= seq_release,
+};
+
 static void arm_all_kprobes(void)
 {
struct hlist_head *head;
@@ -2372,19 +2412,24 @@ static int __init debugfs_kprobe_init(void)
 
file = debugfs_create_file("list", 0444, dir, NULL,
&debugfs_kprobes_operations);
-   if (!file) {
-   debugfs_remove(dir);
-   return -ENOMEM;
-   }
+   if (!file)
+   goto error;
 
file = debugfs_create_file("enabled", 0600, dir,
, _kp);
-   if (!file) {
-   debugfs_remove(dir);
-   return -ENOMEM;
-   }
+   if (!file)
+   goto error;
+
+   file = debugfs_create_file("blacklist", 0444, dir, NULL,
+   &debugfs_kprobe_blacklist_ops);
+   if (!file)
+   goto error;
 
return 0;
+
+error:
+   debugfs_remove(dir);
+   return -ENOMEM;
 }
 
 late_initcall(debugfs_kprobe_init);



[PATCH -tip v7 16/26] ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace

2014-02-26 Thread Masami Hiramatsu

Use NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of __kprobes annotation in ftrace.
This applies the nokprobe_inline annotation in some cases,
because NOKPROBE_SYMBOL() inhibits inlining by
referring to the symbol address.

Changes from previous:
 - Use nokprobe_inline for call_fetch (Thanks to Steven Rostedt)
 - Use nokprobe_inline instead of __always_inline.
 - Apply NOKPROBE_SYMBOL to __get_data_size and store_trace_args
   (Thanks to Steven Rostedt)

Signed-off-by: Masami Hiramatsu 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
---
 kernel/trace/trace_event_perf.c |5 ++-
 kernel/trace/trace_kprobe.c |   64 +++
 kernel/trace/trace_probe.c  |   61 -
 kernel/trace/trace_probe.h  |   15 -
 4 files changed, 81 insertions(+), 64 deletions(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index e854f42..c97a795 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -232,8 +232,8 @@ void perf_trace_del(struct perf_event *p_event, int flags)
tp_event->class->reg(tp_event, TRACE_REG_PERF_DEL, p_event);
 }
 
-__kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
-  struct pt_regs *regs, int *rctxp)
+void *perf_trace_buf_prepare(int size, unsigned short type,
+struct pt_regs *regs, int *rctxp)
 {
struct trace_entry *entry;
unsigned long flags;
@@ -265,6 +265,7 @@ __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
return raw_data;
 }
 EXPORT_SYMBOL_GPL(perf_trace_buf_prepare);
+NOKPROBE_SYMBOL(perf_trace_buf_prepare);
 
 #ifdef CONFIG_FUNCTION_TRACER
 static void
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d0ffbbe..6fc79e4 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -45,27 +45,27 @@ struct event_file_link {
(sizeof(struct probe_arg) * (n)))
 
 
-static __kprobes bool trace_kprobe_is_return(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
 {
return tk->rp.handler != NULL;
 }
 
-static __kprobes const char *trace_kprobe_symbol(struct trace_kprobe *tk)
+static nokprobe_inline const char *trace_kprobe_symbol(struct trace_kprobe *tk)
 {
return tk->symbol ? tk->symbol : "unknown";
 }
 
-static __kprobes unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
+static nokprobe_inline unsigned long trace_kprobe_offset(struct trace_kprobe *tk)
 {
return tk->rp.kp.offset;
 }
 
-static __kprobes bool trace_kprobe_has_gone(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_has_gone(struct trace_kprobe *tk)
 {
return !!(kprobe_gone(&tk->rp.kp));
 }
 
-static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
+static nokprobe_inline bool trace_kprobe_within_module(struct trace_kprobe *tk,
 struct module *mod)
 {
int len = strlen(mod->name);
@@ -73,7 +73,7 @@ static __kprobes bool trace_kprobe_within_module(struct trace_kprobe *tk,
return strncmp(mod->name, name, len) == 0 && name[len] == ':';
 }
 
-static __kprobes bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
+static nokprobe_inline bool trace_kprobe_is_on_module(struct trace_kprobe *tk)
 {
return !!strchr(trace_kprobe_symbol(tk), ':');
 }
@@ -137,19 +137,21 @@ struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
  * Kprobes-specific fetch functions
  */
 #define DEFINE_FETCH_stack(type)   \
-static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
+static void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs, \
  void *offset, void *dest) \
 {  \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs,   \
(unsigned int)((unsigned long)offset)); \
-}
+}  \
+NOKPROBE_SYMBOL(FETCH_FUNC_NAME(stack, type));
+
 DEFINE_BASIC_FETCH_FUNCS(stack)
 /* No string on the stack entry */
 #define fetch_stack_string NULL
 #define fetch_stack_string_size NULL
 
 #define DEFINE_FETCH_memory(type)  \
-static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
+static void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs, \
  void *addr, void *dest)   \
 {  \
type retval;\
@@ -157,14 +159,16 @@ static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs

[PATCH -tip v7 22/26] kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text

2014-02-26 Thread Masami Hiramatsu

Use kprobe_blacklist for blacklisting .entry.text and .kprobes.text
instead of arch_within_kprobe_blacklist. This also makes them visible
via (debugfs)/kprobes/blacklist.
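
The range walk that kprobe_blacklist_add_range() performs in this patch can
be sketched in user space as follows. Everything prefixed demo_ is
hypothetical: a small fixed table stands in for kallsyms, the lookup only
succeeds at an exact symbol start (a simplification of
kallsyms_lookup_size_offset()), a fixed array replaces the kmalloc'ed list,
and the half-open [start, end) membership test is an assumption about how
entries are matched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical symbol table standing in for kallsyms: start and size. */
struct demo_sym { unsigned long addr; unsigned long size; };
static const struct demo_sym demo_syms[] = {
    { 0x1000, 0x40 }, { 0x1040, 0x20 }, { 0x1060, 0xa0 },
};

struct demo_range { unsigned long start; unsigned long end; };

#define DEMO_MAX_RANGES 16
static struct demo_range demo_blacklist[DEMO_MAX_RANGES];
static size_t demo_nr_ranges;

/* Simplified lookup: succeeds only when addr is exactly a symbol start. */
static bool demo_lookup_size(unsigned long addr, unsigned long *size)
{
    for (size_t i = 0; i < sizeof(demo_syms) / sizeof(demo_syms[0]); i++) {
        if (demo_syms[i].addr == addr) {
            *size = demo_syms[i].size;
            return true;
        }
    }
    return false;
}

/* Walk [start, end) one symbol at a time, recording one range per symbol. */
static int demo_blacklist_add_range(unsigned long start, unsigned long end)
{
    assert(start <= end);
    while (start < end) {
        unsigned long size = 0;

        if (!demo_lookup_size(start, &size) || size == 0)
            return -1;  /* no symbol here: mirrors the -ENOENT path */
        if (demo_nr_ranges == DEMO_MAX_RANGES)
            return -1;  /* table full: stands in for -ENOMEM */
        demo_blacklist[demo_nr_ranges].start = start;
        demo_blacklist[demo_nr_ranges].end = start + size;
        demo_nr_ranges++;
        start += size;
    }
    return 0;
}

/* True when addr falls inside any recorded [start, end) range. */
static bool demo_within_blacklist(unsigned long addr)
{
    for (size_t i = 0; i < demo_nr_ranges; i++) {
        if (addr >= demo_blacklist[i].start && addr < demo_blacklist[i].end)
            return true;
    }
    return false;
}
```

Splitting a section into one entry per symbol, rather than storing a single
large range, is what lets each blacklisted function appear as its own line
in the debugfs listing.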

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ananth N Mavinakayanahalli 
Cc: "David S. Miller" 
Cc: Steven Rostedt 
Cc: Andrew Morton 
---
 arch/x86/kernel/kprobes/core.c |   11 +--
 include/linux/kprobes.h|1 +
 kernel/kprobes.c   |   64 
 3 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index d00103a..4767fda 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1065,17 +1065,10 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 }
 NOKPROBE_SYMBOL(longjmp_break_handler);
 
-bool arch_within_kprobe_blacklist(unsigned long addr)
-{
-   return  (addr >= (unsigned long)__kprobes_text_start &&
-addr < (unsigned long)__kprobes_text_end) ||
-   (addr >= (unsigned long)__entry_text_start &&
-addr < (unsigned long)__entry_text_end);
-}
-
 int __init arch_init_kprobes(void)
 {
-   return 0;
+   return kprobe_blacklist_add_range((unsigned long)__entry_text_start,
+ (unsigned long) __entry_text_end);
 }
 
 int arch_trampoline_kprobe(struct kprobe *p)
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e059507..e81bced 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -266,6 +266,7 @@ extern int arch_init_kprobes(void);
 extern void show_registers(struct pt_regs *regs);
 extern void kprobes_inc_nmissed_count(struct kprobe *p);
 extern bool arch_within_kprobe_blacklist(unsigned long addr);
+extern int kprobe_blacklist_add_range(unsigned long start, unsigned long end);
 
 struct kprobe_insn_cache {
struct mutex mutex;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 8319048..abdede5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1325,13 +1325,6 @@ out:
return ret;
 }
 
-bool __weak arch_within_kprobe_blacklist(unsigned long addr)
-{
-   /* The __kprobes marked functions and entry code must not be probed */
-   return addr >= (unsigned long)__kprobes_text_start &&
-  addr < (unsigned long)__kprobes_text_end;
-}
-
 static struct kprobe_blacklist_entry *find_blacklist_entry(unsigned long addr)
 {
struct kprobe_blacklist_entry *ent;
@@ -1346,8 +1339,6 @@ static struct kprobe_blacklist_entry *find_blacklist_entry(unsigned long addr)
 
 static bool within_kprobe_blacklist(unsigned long addr)
 {
-   if (arch_within_kprobe_blacklist(addr))
-   return true;
/*
 * If there exists a kprobe_blacklist, verify and
 * fail any probe registration in the prohibited area
@@ -2032,6 +2023,40 @@ void dump_kprobe(struct kprobe *kp)
 }
 NOKPROBE_SYMBOL(dump_kprobe);
 
+static int __kprobe_blacklist_add(unsigned long start, unsigned long end)
+{
+   struct kprobe_blacklist_entry *ent;
+
+   ent = kmalloc(sizeof(*ent), GFP_KERNEL);
+   if (!ent)
+   return -ENOMEM;
+
+   ent->start_addr = start;
+   ent->end_addr = end;
+   INIT_LIST_HEAD(&ent->list);
+   list_add_tail(&ent->list, &kprobe_blacklist);
+   return 0;
+}
+
+int kprobe_blacklist_add_range(unsigned long start, unsigned long end)
+{
+   unsigned long offset = 0, size = 0;
+   int err = 0;
+
+   mutex_lock(&kprobe_blacklist_mutex);
+   while (!err && start < end) {
+   if (!kallsyms_lookup_size_offset(start, &size, &offset) ||
+   size == 0) {
+   err = -ENOENT;
+   break;
+   }
+   err = __kprobe_blacklist_add(start, start + size);
+   start += size;
+   }
+   mutex_unlock(&kprobe_blacklist_mutex);
+   return err;
+}
+
 /*
  * Lookup and populate the kprobe_blacklist.
  *
@@ -2043,8 +2068,8 @@ NOKPROBE_SYMBOL(dump_kprobe);
 static int populate_kprobe_blacklist(unsigned long *start, unsigned long *end)
 {
unsigned long *iter;
-   struct kprobe_blacklist_entry *ent;
unsigned long offset = 0, size = 0;
+   int ret;
 
mutex_lock(&kprobe_blacklist_mutex);
for (iter = start; iter < end; iter++) {
@@ -2052,14 +2077,7 @@ static int populate_kprobe_blacklist(unsigned long *start, unsigned long *end)
pr_err("Failed to find blacklist %p\n", (void *)*iter);
continue;
}
-
-   ent = kmalloc(sizeof(*ent), GFP_KERNEL);
-   if (!ent)
-   return -ENOMEM;
-   ent->start_addr = *iter;
-   ent->end_addr = *iter + size;
-   INIT_LIST_HEAD(&ent->list);
-   list_add_tail(&ent->list, &kprobe_blacklist);
+   ret = __kprobe_blacklist_add(*iter, *iter

[PATCH -tip v7 20/26] kprobes: Support blacklist functions in module

2014-02-26 Thread Masami Hiramatsu

To blacklist functions in a module (e.g. a user-defined
kprobe handler and the functions invoked from it), extend
blacklist support to modules.
With this change, users can use the NOKPROBE_SYMBOL() macro in
their own modules.
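
As a rough illustration of what blacklist support for modules involves, here is a userspace sketch; it is not the code from this patch. Entries are address ranges, lookup is a linear scan, and unloading a module has to drop its entries again. The names bl_entry, bl_add, bl_find and bl_remove are made up for this example, and a plain singly linked list stands in for the kernel list and mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for struct kprobe_blacklist_entry (hypothetical layout). */
struct bl_entry {
	unsigned long start_addr;
	unsigned long end_addr;		/* exclusive upper bound */
	struct bl_entry *next;
};

static struct bl_entry *bl_head;

/* Add the range [start, end) to the blacklist; returns 0, or -1 if out of memory. */
static int bl_add(unsigned long start, unsigned long end)
{
	struct bl_entry *ent;

	assert(start < end);
	ent = malloc(sizeof(*ent));
	if (!ent)
		return -1;
	ent->start_addr = start;
	ent->end_addr = end;
	ent->next = bl_head;
	bl_head = ent;
	return 0;
}

/* Return the entry covering addr, or NULL if addr is not blacklisted. */
static struct bl_entry *bl_find(unsigned long addr)
{
	struct bl_entry *ent;

	for (ent = bl_head; ent; ent = ent->next)
		if (addr >= ent->start_addr && addr < ent->end_addr)
			return ent;
	return NULL;
}

/* Drop the entry covering addr, as a module unload would; returns 1 if one was removed. */
static int bl_remove(unsigned long addr)
{
	struct bl_entry **pp;

	for (pp = &bl_head; *pp; pp = &(*pp)->next) {
		struct bl_entry *ent = *pp;

		if (addr >= ent->start_addr && addr < ent->end_addr) {
			*pp = ent->next;
			free(ent);
			return 1;
		}
	}
	return 0;
}
```

In this sketch a probe request at any address inside [start, end) is rejected, which matches the "start_addr <= addr < end_addr" test used by the blacklist lookup in this series.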

Signed-off-by: Masami Hiramatsu 
Cc: Ananth N Mavinakayanahalli 
Cc: "David S. Miller" 
Cc: Rob Landley 
Cc: Rusty Russell 
---
 Documentation/kprobes.txt |8 ++
 include/linux/module.h|5 
 kernel/kprobes.c  |   63 ++---
 kernel/module.c   |6 
 4 files changed, 72 insertions(+), 10 deletions(-)

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 7062631..c6634b3 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -512,6 +512,14 @@ int enable_jprobe(struct jprobe *jp);
 Enables *probe which has been disabled by disable_*probe(). You must specify
 the probe which has been registered.
 
+4.9 NOKPROBE_SYMBOL()
+
+#include 
+NOKPROBE_SYMBOL(FUNCTION);
+
+Protects given FUNCTION from other kprobes. This is useful for handler
+functions and functions called from the handlers.
+
 5. Kprobes Features and Limitations
 
 Kprobes allows multiple probes at the same address.  Currently,
diff --git a/include/linux/module.h b/include/linux/module.h
index eaf60ff..7afb64a 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -360,6 +361,10 @@ struct module {
unsigned int num_ftrace_callsites;
unsigned long *ftrace_callsites;
 #endif
+#ifdef CONFIG_KPROBES
+   unsigned int num_kprobe_blacklist;
+   unsigned long  *kprobe_blacklist;
+#endif
 
 #ifdef CONFIG_MODULE_UNLOAD
/* What modules depend on me? */
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 3214289..8319048 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -88,6 +88,7 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash)
 
 /* Blacklist -- list of struct kprobe_blacklist_entry */
 static LIST_HEAD(kprobe_blacklist);
+static DEFINE_MUTEX(kprobe_blacklist_mutex);
 
 #ifdef __ARCH_WANT_KPROBES_INSN_SLOT
 /*
@@ -1331,22 +1332,27 @@ bool __weak arch_within_kprobe_blacklist(unsigned long addr)
 	       addr < (unsigned long)__kprobes_text_end;
 }
 
-static bool within_kprobe_blacklist(unsigned long addr)
+static struct kprobe_blacklist_entry *find_blacklist_entry(unsigned long addr)
 {
 	struct kprobe_blacklist_entry *ent;
 
+	list_for_each_entry(ent, &kprobe_blacklist, list) {
+		if (addr >= ent->start_addr && addr < ent->end_addr)
+			return ent;
+	}
+
+	return NULL;
+}
+
+static bool within_kprobe_blacklist(unsigned long addr)
+{
 	if (arch_within_kprobe_blacklist(addr))
 		return true;
 	/*
 	 * If there exists a kprobe_blacklist, verify and
 	 * fail any probe registration in the prohibited area
 	 */
-	list_for_each_entry(ent, &kprobe_blacklist, list) {
-		if (addr >= ent->start_addr && addr < ent->end_addr)
-			return true;
-	}
-
-	return false;
+	return !!find_blacklist_entry(addr);
 }
 
 /*
@@ -1432,6 +1438,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
 #endif
 	}
 
+	mutex_lock(&kprobe_blacklist_mutex);
 	jump_label_lock();
 	preempt_disable();
 
@@ -1469,6 +1476,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
 out:
 	preempt_enable();
 	jump_label_unlock();
+	mutex_unlock(&kprobe_blacklist_mutex);
 
 	return ret;
 }
@@ -2032,13 +2040,13 @@ NOKPROBE_SYMBOL(dump_kprobe);
  * since a kprobe need not necessarily be at the beginning
  * of a function.
  */
-static int __init populate_kprobe_blacklist(unsigned long *start,
-					    unsigned long *end)
+static int populate_kprobe_blacklist(unsigned long *start, unsigned long *end)
 {
 	unsigned long *iter;
 	struct kprobe_blacklist_entry *ent;
 	unsigned long offset = 0, size = 0;
 
+	mutex_lock(&kprobe_blacklist_mutex);
 	for (iter = start; iter < end; iter++) {
 		if (!kallsyms_lookup_size_offset(*iter, &size, &offset)) {
 			pr_err("Failed to find blacklist %p\n", (void *)*iter);
@@ -2053,9 +2061,28 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
 		INIT_LIST_HEAD(&ent->list);
 		list_add_tail(&ent->list, &kprobe_blacklist);
 	}
+	mutex_unlock(&kprobe_blacklist_mutex);
+
 	return 0;
 }
 
+/* Shrink the blacklist */
+static void shrink_kprobe_blacklist(unsigned long *start, unsigned long *end)
+{
+	struct kprobe_blacklist_entry *ent;
+	unsigned long *iter;
+
+	mutex_lock(&kprobe_blacklist_mutex);
+	for (iter = start; iter < end; iter++) {
+		ent = find_blacklist_entry(*iter);
+		if (!ent)
+			continue;
+

[PATCH -tip v7 18/26] sched: Use NOKPROBE_SYMBOL macro in sched

2014-02-26 Thread Masami Hiramatsu

Use NOKPROBE_SYMBOL macro to protect functions from
kprobes instead of __kprobes annotation in sched/core.c.

Signed-off-by: Masami Hiramatsu 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
---
 kernel/sched/core.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 55d31fa..09716bb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2484,7 +2484,7 @@ notrace unsigned long get_parent_ip(unsigned long addr)
 #if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
defined(CONFIG_PREEMPT_TRACER))
 
-void __kprobes preempt_count_add(int val)
+void preempt_count_add(int val)
 {
 #ifdef CONFIG_DEBUG_PREEMPT
/*
@@ -2510,8 +2510,9 @@ void __kprobes preempt_count_add(int val)
}
 }
 EXPORT_SYMBOL(preempt_count_add);
+NOKPROBE_SYMBOL(preempt_count_add);
 
-void __kprobes preempt_count_sub(int val)
+void preempt_count_sub(int val)
 {
 #ifdef CONFIG_DEBUG_PREEMPT
/*
@@ -2532,6 +2533,7 @@ void __kprobes preempt_count_sub(int val)
__preempt_count_sub(val);
 }
 EXPORT_SYMBOL(preempt_count_sub);
+NOKPROBE_SYMBOL(preempt_count_sub);
 
 #endif
 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -tip v7 26/26] ftrace: Introduce FTRACE_OPS_FL_SELF_FILTER for ftrace-kprobe

2014-02-26 Thread Masami Hiramatsu

Since kprobes itself owns a hash table to get the kprobe
data structure corresponding to a given ip address, there
is no need to test the ftrace hash on the ftrace side.
To achieve better performance on ftrace-based kprobes, introduce
the FTRACE_OPS_FL_SELF_FILTER flag for ftrace_ops, which means
that ftrace skips testing its own hash table.
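
The effect of the flag can be modelled in userspace as below. This is only a sketch under my own assumptions: struct ops, ops_test() and dispatch() are stand-ins invented for the example (the real code uses struct ftrace_ops and ftrace_ops_test()), and a single filter_ip field replaces the per-ops hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OPS_FL_SELF_FILTER (1u << 9)	/* same bit position as in this patch */

/* Hypothetical, much reduced stand-in for struct ftrace_ops. */
struct ops {
	unsigned int flags;
	unsigned long filter_ip;	/* stands in for the per-ops filter hash */
	int calls;			/* how often the callback would have run */
	int hash_tests;			/* how often the filter was consulted */
};

/* Stand-in for ftrace_ops_test(): counts each lookup so the saving is visible. */
static bool ops_test(struct ops *op, unsigned long ip)
{
	op->hash_tests++;
	return op->filter_ip == ip;
}

/* Walk all ops; self-filtering ops are called without consulting the filter. */
static void dispatch(struct ops *list, size_t n, unsigned long ip)
{
	size_t i;

	assert(list != NULL);
	for (i = 0; i < n; i++) {
		struct ops *op = &list[i];

		if ((op->flags & OPS_FL_SELF_FILTER) || ops_test(op, ip))
			op->calls++;	/* stands in for op->func(ip, ...) */
	}
}
```

The trade-off is that a self-filtering callback is entered on every hit and must reject uninteresting addresses itself, which kprobes can do with its own hash table.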

Without this patch, ftrace_lookup_ip() is the biggest cycle
consumer when 20,000 kprobes are enabled.
  
  Samples: 917  of event 'cycles', Event count (approx.): 427239250
  +  22.21%  [k] ftrace_lookup_ip
  +  10.22%  [k] kprobe_trace_func
  +   3.77%  [k] get_kprobe_cached
  

With this patch, ftrace_lookup_ip() vanishes from the
cycle consumer list (of course, there is no caller on the
hot path anymore :))
  
  Samples: 1K of event 'cycles', Event count (approx.): 337873938
  +   7.88%  [k] kprobe_trace_func
  +   7.32%  [k] kprobe_ftrace_handler
  +   5.78%  [k] get_kprobe_cached
  

Signed-off-by: Masami Hiramatsu 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
---
 include/linux/ftrace.h |3 +++
 kernel/kprobes.c   |2 +-
 kernel/trace/ftrace.c  |3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index f4233b1..1842334 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -92,6 +92,8 @@ typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
  * STUB   - The ftrace_ops is just a place holder.
  * INITIALIZED - The ftrace_ops has already been initialized (first use time
  *register_ftrace_function() is called, it will initialized the 
ops)
+ * SELF_FILTER - The ftrace_ops function filters ip by itself. Do not need to
+ *check hash table on each hit.
  */
 enum {
FTRACE_OPS_FL_ENABLED   = 1 << 0,
@@ -103,6 +105,7 @@ enum {
FTRACE_OPS_FL_RECURSION_SAFE= 1 << 6,
FTRACE_OPS_FL_STUB  = 1 << 7,
FTRACE_OPS_FL_INITIALIZED   = 1 << 8,
+   FTRACE_OPS_FL_SELF_FILTER   = 1 << 9,
 };
 
 struct ftrace_ops {
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 65b18f6..2f82117 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1019,7 +1019,7 @@ static struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
 #ifdef CONFIG_KPROBES_ON_FTRACE
 static struct ftrace_ops kprobe_ftrace_ops __read_mostly = {
.func = kprobe_ftrace_handler,
-   .flags = FTRACE_OPS_FL_SAVE_REGS,
+   .flags = FTRACE_OPS_FL_SAVE_REGS | FTRACE_OPS_FL_SELF_FILTER,
 };
 static int kprobe_ftrace_enabled;
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index cd7f76d..2734f20 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4502,7 +4502,8 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 */
preempt_disable_notrace();
do_for_each_ftrace_op(op, ftrace_ops_list) {
-   if (ftrace_ops_test(op, ip, regs))
+   if (op->flags & FTRACE_OPS_FL_SELF_FILTER ||
+   ftrace_ops_test(op, ip, regs))
op->func(ip, parent_ip, op, regs);
} while_for_each_ftrace_op(op);
preempt_enable_notrace();



[PATCH -tip v7 25/26] kprobes: Introduce kprobe cache to reduce cache misshits

2014-02-26 Thread Masami Hiramatsu

Introduce a kprobe cache to reduce cache misses with a
massive number of kprobes.
For stress testing kprobes, we need to activate as many
kprobes as possible. This situation causes a storm of cache
misses on the kprobe hash list. The kprobe hash list is already
enlarged to 4k entries, and this is still small for 40k
kprobes.

For example, when registering 40k probes on the hlist and
enabling 20k probes, perf still shows that a lot of
cache misses occur in get_kprobe.
  
  Samples: 4K of event 'cache-misses', Event count (approx.): 7473222
  +  79.94%  [k] get_kprobe
  +   5.55%  [k] ftrace_lookup_ip
  +   1.23%  [k] kprobe_trace_func
  

Also, I found that most of the kprobes are not hit.
In that case, to reduce cache misses, we can reduce the
random memory accesses by introducing a per-cpu cache which
caches the address of each frequently used kprobe data structure
and its probe address.

With kpcache enabled, get_kprobe_cached goes down to
around 3% of the cache misses with 20k probes.
  
  Samples: 578  of event 'cache-misses', Event count (approx.): 621689
  +  18.37%  [k] ftrace_lookup_ip
  +   6.74%  [k] kprobe_trace_func
  +   3.92%  [k] kprobe_ftrace_handler
  +   3.44%  [k] get_kprobe_cached
  

Of course this reduces the enabling time too:

Without this fix (just enlarge hash table):
(2303 sec, 1 min intervals for each 2000 probes enabled)

  Enabling trace events: start at 1392794306
  0 1392794307 a2mp_chan_alloc_skb_cb_38556
  1 1392794307 a2mp_chan_close_cb_38555
  
  19997 1392796603 nfs4_negotiate_security_12119
  19998 1392796603 nfs4_open_confirm_done_11767
  1 1392796603 nfs4_open_confirm_prepare_11779

With this fix:
(1768 sec, 1 min intervals for each 2000 probes enabled)
  
  Enabling trace events: start at 1392901057
  0 1392901059 a2mp_chan_alloc_skb_cb_38558
  1 1392901059 a2mp_chan_close_cb_38557
  2 1392901059 a2mp_channel_create_38706
  
  19997 1392902824 nfs4_match_stateid_11734
  19998 1392902824 nfs4_negotiate_security_12121
  1 1392902825 nfs4_open_confirm_done_11769
  

This patch implements a simple per-cpu 4way/4096entry cache
for the kprobes hlist. All get_kprobe calls on the hot path use the
cache, and on a cache miss, it searches for the kprobe on the hlist
and inserts the found kprobe into the cache entry.
When removing kprobes, it clears the cache entries by using IPIs,
because the cache is per-cpu.
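
To make the idea concrete, here is a userspace sketch of a 4-way, 4096-set cache in front of a slow lookup. It is not the implementation in this patch: the kpc_* names, the hash function and the round-robin replacement are my assumptions for illustration, there is only one table instead of one per CPU, and demo_slow() is a made-up stand-in for the hlist walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KPC_SETS 4096
#define KPC_WAYS 4

struct kpc_entry {
	uintptr_t addr;		/* probe address; 0 means the way is empty */
	void *probe;
};

struct kpc_set {
	struct kpc_entry way[KPC_WAYS];
	unsigned int next_victim;	/* round-robin replacement (an assumption) */
};

static struct kpc_set kpc[KPC_SETS];	/* the real design would keep one per CPU */

typedef void *(*slow_lookup_fn)(uintptr_t addr);

/* Multiplicative hash; the top 12 bits select one of the 4096 sets. */
static size_t kpc_index(uintptr_t addr)
{
	return (size_t)(((uint64_t)addr * 0x9E3779B97F4A7C15ull) >> 52) % KPC_SETS;
}

/* Look addr up in the cache; on a miss, fall back to slow() and cache the result. */
static void *kpc_get(uintptr_t addr, slow_lookup_fn slow)
{
	struct kpc_set *set = &kpc[kpc_index(addr)];
	void *p;
	int i;

	assert(addr != 0);
	for (i = 0; i < KPC_WAYS; i++)
		if (set->way[i].addr == addr)
			return set->way[i].probe;	/* cache hit */

	p = slow(addr);
	if (p) {
		struct kpc_entry *e = &set->way[set->next_victim];

		set->next_victim = (set->next_victim + 1) % KPC_WAYS;
		e->addr = addr;
		e->probe = p;
	}
	return p;
}

/* Forget addr, as needed when a probe is removed. */
static void kpc_invalidate(uintptr_t addr)
{
	struct kpc_set *set = &kpc[kpc_index(addr)];
	int i;

	for (i = 0; i < KPC_WAYS; i++)
		if (set->way[i].addr == addr) {
			set->way[i].addr = 0;
			set->way[i].probe = NULL;
		}
}

/* Made-up slow path for the example: knows exactly one probe, at 0x1000. */
static int slow_calls;
static int demo_probe;

static void *demo_slow(uintptr_t addr)
{
	slow_calls++;
	return addr == 0x1000 ? (void *)&demo_probe : NULL;
}
```

Note that misses for addresses without a probe are not cached in this sketch, so they always take the slow path.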

Note that this consumes a lot of memory (272KB/cpu) just for
kprobes, and it is only good for users who use
thousands of probes at a time, e.g. for kprobe stress testing.
Thus I've added the CONFIG_KPROBE_CACHE option for this feature.
If you aren't interested in stress testing, you should
set CONFIG_KPROBE_CACHE=n.

Signed-off-by: Masami Hiramatsu 
---
 arch/Kconfig |   10 +++
 arch/x86/kernel/kprobes/core.c   |2 -
 arch/x86/kernel/kprobes/ftrace.c |2 -
 include/linux/kprobes.h  |1 
 kernel/kprobes.c |  125 +++---
 5 files changed, 128 insertions(+), 12 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 80bbb8c..e38787e 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -46,6 +46,16 @@ config KPROBES
  for kernel debugging, non-intrusive instrumentation and testing.
  If in doubt, say "N".
 
+config KPROBE_CACHE
+   bool "Kprobe per-cpu cache for massive multiple probes"
+   depends on KPROBES
+   help
+ For handling massive multiple kprobes with better performance,
+ kprobe per-cpu cache is enabled by this option. This cache is
+ only for users who would like to use more than 10,000 probes
+ at a time, which is usually stress testing, debugging etc.
+ If in doubt, say "N".
+
 config JUMP_LABEL
bool "Optimize very unlikely/likely branches"
depends on HAVE_ARCH_JUMP_LABEL
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 8ef676f..374d207 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -576,7 +576,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
 
kcb = get_kprobe_ctlblk();
-   p = get_kprobe(addr);
+   p = get_kprobe_cached(addr);
 
if (p) {
if (kprobe_running()) {
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 717b02a..8178dd4 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -63,7 +63,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
/* Disable irq for emulating a breakpoint and avoiding preempt */
local_irq_save(flags);
 
-   p = get_kprobe((kprobe_opcode_t *)ip);
+   p = get_kprobe_cached((kprobe_opcode_t *)ip);
if (unlikely(!p) || kprobe_disabled(p))
goto end;
 
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index

[PATCH -tip v7 24/26] kprobes: Enlarge hash table to 4096 entries

2014-02-26 Thread Masami Hiramatsu

Currently, since kprobes is expected to be used
with fewer than 100 probe points, its hash table
has just 64 entries. This is too few to handle
several thousand probes.
Enlarge this to 4096 entries, which consumes just
32KB (on 64-bit archs), for better scalability.
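
The arithmetic behind these numbers can be checked with the small sketch below. It assumes one pointer per bucket head (8 bytes on a 64-bit arch) and probes spread evenly over the buckets; table_bytes() and avg_chain() are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by a table of 2^bits bucket heads, each one pointer wide. */
static size_t table_bytes(unsigned int bits, size_t ptr_size)
{
	assert(bits < 8 * sizeof(size_t));
	return ((size_t)1 << bits) * ptr_size;
}

/* Average chain length, rounded up, if probes spread evenly over the buckets. */
static unsigned long avg_chain(unsigned long probes, unsigned int bits)
{
	unsigned long buckets;

	assert(bits < 8 * sizeof(unsigned long));
	buckets = 1ul << bits;
	return (probes + buckets - 1) / buckets;
}
```

With KPROBE_HASH_BITS = 6, the 17787 probes from the test below mean about 278 entries per chain; with 12 bits, 20,000 probes mean about 5 entries per chain, for a table of 4096 * 8 = 32768 bytes.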

Without this patch, enabling 17787 probes takes
more than 2 hours! (9428sec, 1 min intervals for
each 2000 probes enabled)

  Enabling trace events: start at 1392782584
  0 1392782585 a2mp_chan_alloc_skb_cb_38556
  1 1392782585 a2mp_chan_close_cb_38555
  
  17785 1392792008 lookup_vport_34987
  17786 1392792010 loop_add_23485
  17787 1392792012 loop_attr_do_show_autoclear_23464

I profiled it and saw that more than 90% of
cycles are consumed on get_kprobe.

  Samples: 18K of event 'cycles', Event count (approx.): 37759714934
  +  95.90%  [k] get_kprobe
  +   0.76%  [k] ftrace_lookup_ip
  +   0.54%  [k] kprobe_trace_func

And also more than 60% of executed instructions
were in get_kprobe too.

  Samples: 17K of event 'instructions', Event count (approx.): 1321391290
  +  65.48%  [k] get_kprobe
  +   4.07%  [k] kprobe_trace_func
  +   2.93%  [k] optimized_callback


Annotating get_kprobe also shows that the hlist
is too long and that walking it takes a long time.

       |    struct hlist_head *head;
       |    struct kprobe *p;
       |
       |    head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
       |    hlist_for_each_entry_rcu(p, head, hlist) {
 86.33 |      mov    (%rax),%rax
 11.24 |      test   %rax,%rax
       |      jne    60
       |            if (p->addr == addr)
       |                    return p;
       |    }

With this fix, enabling 20,000 probes takes just
under 40 min (2303 sec, 1 min intervals for
each 2000 probes enabled)

  Enabling trace events: start at 1392794306
  0 1392794307 a2mp_chan_alloc_skb_cb_38556
  1 1392794307 a2mp_chan_close_cb_38555
  
  19997 1392796603 nfs4_negotiate_security_12119
  19998 1392796603 nfs4_open_confirm_done_11767
  1 1392796603 nfs4_open_confirm_prepare_11779

And it reduced cycles on get_kprobe (with 20,000 probes).

  Samples: 5K of event 'cycles', Event count (approx.): 4540269674
  +  68.77%  [k] get_kprobe
  +   8.56%  [k] ftrace_lookup_ip
  +   3.04%  [k] kprobe_trace_func

Signed-off-by: Masami Hiramatsu 
---
 kernel/kprobes.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index abdede5..302ff42 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -54,7 +54,7 @@
 #include 
 #include 
 
-#define KPROBE_HASH_BITS 6
+#define KPROBE_HASH_BITS 12
 #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
 
 



[PATCH -tip v7 23/26] kprobes/x86: Remove unneeded preempt_disable/enable in interrupt handlers

2014-02-26 Thread Masami Hiramatsu

Since int3 itself disables local irqs and kprobes
keeps them disabled until the single step is done,
kernel preemption never happens while processing a kprobe.
This means that we don't need to disable/enable preemption.
Also, this changes kprobe_int3_handler to use the goto-out style.

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
---
 arch/x86/kernel/kprobes/core.c |   24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 4767fda..8ef676f 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -506,7 +506,6 @@ static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
 * stepping.
 */
regs->ip = (unsigned long)p->ainsn.insn;
-   preempt_enable_no_resched();
return;
}
 #endif
@@ -575,13 +574,6 @@ int kprobe_int3_handler(struct pt_regs *regs)
struct kprobe_ctlblk *kcb;
 
addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
-   /*
-* We don't want to be preempted for the entire
-* duration of kprobe processing. We conditionally
-* re-enable preemption at the end of this function,
-* and also in reenter_kprobe() and setup_singlestep().
-*/
-   preempt_disable();
 
kcb = get_kprobe_ctlblk();
p = get_kprobe(addr);
@@ -589,7 +581,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
if (p) {
if (kprobe_running()) {
if (reenter_kprobe(p, regs, kcb))
-   return 1;
+   goto handled;
} else {
set_current_kprobe(p, regs, kcb);
kcb->kprobe_status = KPROBE_HIT_ACTIVE;
@@ -604,7 +596,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
 */
if (!p->pre_handler || !p->pre_handler(p, regs))
setup_singlestep(p, regs, kcb, 0);
-   return 1;
+   goto handled;
}
} else if (*addr != BREAKPOINT_INSTRUCTION) {
/*
@@ -617,19 +609,20 @@ int kprobe_int3_handler(struct pt_regs *regs)
 * the original instruction.
 */
regs->ip = (unsigned long)addr;
-   preempt_enable_no_resched();
-   return 1;
+   goto handled;
} else if (kprobe_running()) {
p = __this_cpu_read(current_kprobe);
if (p->break_handler && p->break_handler(p, regs)) {
if (!skip_singlestep(p, regs, kcb))
setup_singlestep(p, regs, kcb, 0);
-   return 1;
+   goto handled;
}
} /* else: not a kprobe fault; let the kernel handle it */
 
-   preempt_enable_no_resched();
return 0;
+
+handled:
+   return 1;
 }
 NOKPROBE_SYMBOL(kprobe_int3_handler);
 
@@ -894,7 +887,6 @@ int kprobe_debug_handler(struct pt_regs *regs)
}
reset_current_kprobe();
 out:
-   preempt_enable_no_resched();
 
/*
 * if somebody else is singlestepping across a probe point, flags
@@ -927,7 +919,6 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
restore_previous_kprobe(kcb);
else
reset_current_kprobe();
-   preempt_enable_no_resched();
} else if (kcb->kprobe_status == KPROBE_HIT_ACTIVE ||
   kcb->kprobe_status == KPROBE_HIT_SSDONE) {
/*
@@ -1058,7 +1049,6 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
   kcb->jprobes_stack,
   MIN_STACK_SIZE(kcb->jprobe_saved_sp));
-   preempt_enable_no_resched();
return 1;
}
return 0;



[PATCH -tip v7 21/26] kprobes: Use NOKPROBE_SYMBOL() in sample modules

2014-02-26 Thread Masami Hiramatsu

Use NOKPROBE_SYMBOL() to protect handlers from kprobes
in sample modules.

Signed-off-by: Masami Hiramatsu 
Cc: Ananth N Mavinakayanahalli 
---
 samples/kprobes/jprobe_example.c|1 +
 samples/kprobes/kprobe_example.c|3 +++
 samples/kprobes/kretprobe_example.c |2 ++
 3 files changed, 6 insertions(+)

diff --git a/samples/kprobes/jprobe_example.c b/samples/kprobes/jprobe_example.c
index b754135..40114ac 100644
--- a/samples/kprobes/jprobe_example.c
+++ b/samples/kprobes/jprobe_example.c
@@ -35,6 +35,7 @@ static long jdo_fork(unsigned long clone_flags, unsigned long stack_start,
jprobe_return();
return 0;
 }
+NOKPROBE_SYMBOL(jdo_fork);
 
 static struct jprobe my_jprobe = {
.entry  = jdo_fork,
diff --git a/samples/kprobes/kprobe_example.c b/samples/kprobes/kprobe_example.c
index 366db1a..462d90f 100644
--- a/samples/kprobes/kprobe_example.c
+++ b/samples/kprobes/kprobe_example.c
@@ -46,6 +46,7 @@ static int handler_pre(struct kprobe *p, struct pt_regs *regs)
/* A dump_stack() here will give a stack backtrace */
return 0;
 }
+NOKPROBE_SYMBOL(handler_pre);
 
 /* kprobe post_handler: called after the probed instruction is executed */
 static void handler_post(struct kprobe *p, struct pt_regs *regs,
@@ -68,6 +69,7 @@ static void handler_post(struct kprobe *p, struct pt_regs *regs,
p->addr, regs->ex1);
 #endif
 }
+NOKPROBE_SYMBOL(handler_post);
 
 /*
  * fault_handler: this is called if an exception is generated for any
@@ -81,6 +83,7 @@ static int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr)
/* Return 0 because we don't handle the fault. */
return 0;
 }
+NOKPROBE_SYMBOL(handler_fault);
 
 static int __init kprobe_init(void)
 {
diff --git a/samples/kprobes/kretprobe_example.c b/samples/kprobes/kretprobe_example.c
index 1041b67..d932c52 100644
--- a/samples/kprobes/kretprobe_example.c
+++ b/samples/kprobes/kretprobe_example.c
@@ -47,6 +47,7 @@ static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
data->entry_stamp = ktime_get();
return 0;
 }
+NOKPROBE_SYMBOL(entry_handler);
 
 /*
  * Return-probe handler: Log the return value and duration. Duration may turn
@@ -66,6 +67,7 @@ static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
func_name, retval, (long long)delta);
return 0;
 }
+NOKPROBE_SYMBOL(ret_handler);
 
 static struct kretprobe my_kretprobe = {
.handler= ret_handler,



[PATCH -tip v7 14/26] x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation

2014-02-26 Thread Masami Hiramatsu

Use NOKPROBE_SYMBOL macro for protecting functions
from kprobes instead of __kprobes annotation under
arch/x86.

This applies the nokprobe_inline annotation in some cases,
because NOKPROBE_SYMBOL() inhibits inlining by
referring to the symbol address.

This just folds a bunch of previous NOKPROBE_SYMBOL()
cleanup patches for x86 into one patch.

Changes from previous version:
 - Use nokprobe_inline instead of __always_inline.

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Peter Zijlstra 
Cc: Oleg Nesterov 
Cc: Steven Rostedt 
Cc: Arnaldo Carvalho de Melo 
Cc: Andi Kleen 
Cc: Frederic Weisbecker 
Cc: Seiji Aguchi 
---
 arch/x86/include/asm/traps.h |2 -
 arch/x86/kernel/apic/hw_nmi.c|3 +
 arch/x86/kernel/cpu/perf_event.c |3 +
 arch/x86/kernel/cpu/perf_event_amd_ibs.c |3 +
 arch/x86/kernel/dumpstack.c  |9 ++--
 arch/x86/kernel/kprobes/core.c   |   77 +++---
 arch/x86/kernel/kprobes/ftrace.c |   15 --
 arch/x86/kernel/kprobes/opt.c|8 ++-
 arch/x86/kernel/kvm.c|4 +-
 arch/x86/kernel/nmi.c|   18 +--
 arch/x86/kernel/traps.c  |   20 +---
 arch/x86/mm/fault.c  |   28 +++
 12 files changed, 121 insertions(+), 69 deletions(-)

diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 58d66fe..ca32508 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -68,7 +68,7 @@ dotraplinkage void do_segment_not_present(struct pt_regs *, long);
 dotraplinkage void do_stack_segment(struct pt_regs *, long);
 #ifdef CONFIG_X86_64
 dotraplinkage void do_double_fault(struct pt_regs *, long);
-asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *);
+asmlinkage struct pt_regs *sync_regs(struct pt_regs *);
 #endif
 dotraplinkage void do_general_protection(struct pt_regs *, long);
 dotraplinkage void do_page_fault(struct pt_regs *, unsigned long);
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index a698d71..73eb5b3 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -60,7 +60,7 @@ void arch_trigger_all_cpu_backtrace(void)
smp_mb__after_clear_bit();
 }
 
-static int __kprobes
+static int
 arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 {
int cpu;
@@ -80,6 +80,7 @@ arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
 
return NMI_DONE;
 }
+NOKPROBE_SYMBOL(arch_trigger_all_cpu_backtrace_handler);
 
 static int __init register_trigger_all_cpu_backtrace(void)
 {
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 895604f..c0dd9ba 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1273,7 +1273,7 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
 }
 
-static int __kprobes
+static int
 perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
u64 start_clock;
@@ -1291,6 +1291,7 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 
return ret;
 }
+NOKPROBE_SYMBOL(perf_event_nmi_handler);
 
 struct event_constraint emptyconstraint;
 struct event_constraint unconstrained;
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 4b8e4d3..8aa687b 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -593,7 +593,7 @@ out:
return 1;
 }
 
-static int __kprobes
+static int
 perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
int handled = 0;
@@ -606,6 +606,7 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 
return handled;
 }
+NOKPROBE_SYMBOL(perf_ibs_nmi_handler);
 
 static __init int perf_ibs_pmu_init(struct perf_ibs *perf_ibs, char *name)
 {
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index d9c12d3..b74ebc7 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -200,7 +200,7 @@ static arch_spinlock_t die_lock = __ARCH_SPIN_LOCK_UNLOCKED;
 static int die_owner = -1;
 static unsigned int die_nest_count;
 
-unsigned __kprobes long oops_begin(void)
+unsigned long oops_begin(void)
 {
int cpu;
unsigned long flags;
@@ -223,8 +223,9 @@ unsigned __kprobes long oops_begin(void)
return flags;
 }
 EXPORT_SYMBOL_GPL(oops_begin);
+NOKPROBE_SYMBOL(oops_begin);
 
-void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
+void oops_end(unsigned long flags, struct pt_regs *regs, int signr)
 {
if (regs && kexec_should_crash(current))
crash_kexec(regs);
@@ -247,8 +248,9 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
panic("Fatal exception");

[PATCH -tip v7 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalbility efforts

2014-02-26 Thread Masami Hiramatsu

Hi,
Here is version 7 of the NOKPROBE_SYMBOL series. :)

This includes several scalability improvements for a massive number
of probes (over 10k probes), which are useful for stress
testing of kprobes (putting kprobes on every function entry).
I also include the bugfixes which I sent last week(*), because they are
required to pass the stress test.
 (*) https://lkml.org/lkml/2014/2/19/744

Changes
=======
From this series, I removed 2 patches:
 - Prohibiting probing on memset/memcpy, since the problem
   could not be reproduced.
 - Original instruction recovery code for emergency, since
   I had hit a problem with it.
It adds (includes) the 2 previous bugfixes:
 - Fix page-fault handling logic on x86 kprobes.
 - Allow handling of a reentered kprobe while singlestepping.
   Both of them are needed for profiling kprobes
   with perf.
And it adds 4 new patches:
 - Call exception_enter after kprobes are handled, since
   exception_enter involves a large set of functions.
 - Enlarge the kprobes hash table, since the current
   table size of just 64 entries is too small.
 - Kprobe cache for frequently accessed kprobes, to
   solve cache misses on the kprobe hash table.
 - Skip the ftrace hlist check for ftrace-based kprobes,
   since the ftrace-based kprobe already has its
   own hlist. We don't need to search the hlist twice.

Blacklist improvements
======================
Currently, kprobes uses the __kprobes annotation and an internal
symbol-name based blacklist to prohibit probing on some functions,
because probing those functions may cause an infinite recursive loop of
int3/debug exceptions.
However, the current mechanisms have some problems, especially from the
viewpoint of code maintenance:
 - __kprobes is easily mistaken to mean that the function is
   used by kprobes, although it just means "no kprobe
   on it".
 - __kprobes moves functions to a different section,
   which is not good for cache optimization.
 - The symbol-name based solution is not good at all,
   since a symbol name can easily be changed, and
   we cannot notice it.
 - It doesn't support functions in modules at all.

Thus, I decided to introduce a new NOKPROBE_SYMBOL macro for building
an integrated kprobe blacklist.

The new macro stores the address of the given symbol into the
_kprobe_blacklist section, and the blacklist is initialized from that
address list at boot time.
This also applies to modules. When loading a module, kprobes
finds the blacklist symbols in the _kprobe_blacklist section of the
module automatically.
This series also replaces all __kprobes on x86 and generic code with
NOKPROBE_SYMBOL().
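
The section trick can be tried in userspace with the sketch below. It is an analogue, not the kernel macro: it assumes GCC or Clang on an ELF target with a linker that provides __start_/__stop_ symbols for sections whose names are valid C identifiers, and all demo_* names (including the section name) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace analogue of NOKPROBE_SYMBOL(): record the address of fname
 * in a dedicated section so that it can be enumerated at run time.
 */
#define DEMO_NOKPROBE_SYMBOL(fname)					\
	static unsigned long demo_kbl_addr_##fname			\
	__attribute__((used, section("demo_kprobe_blacklist"))) =	\
		(unsigned long)fname

/* Section bounds, provided by the linker (GNU ld and lld do this on ELF). */
extern unsigned long __start_demo_kprobe_blacklist[];
extern unsigned long __stop_demo_kprobe_blacklist[];

static int demo_handler(int x)
{
	return x + 1;
}
DEMO_NOKPROBE_SYMBOL(demo_handler);

/* Deliberately not blacklisted. */
static int demo_other(int x)
{
	return x * 2;
}

/* Return 1 if addr was recorded by DEMO_NOKPROBE_SYMBOL(), else 0. */
static int demo_is_blacklisted(unsigned long addr)
{
	unsigned long *p;

	assert(addr != 0);
	for (p = __start_demo_kprobe_blacklist;
	     p != __stop_demo_kprobe_blacklist; p++)
		if (*p == addr)
			return 1;
	return 0;
}
```

Unlike this sketch, which only matches the recorded address exactly, the kernel side looks up the size of each recorded symbol and blacklists the whole function range. Taking the address of the function is also what prevents it from being inlined away.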

However, the new blacklist still supports old-style __kprobes by
decoding .kprobes.text if it exists, because it is still used in
arch-dependent code other than x86.

Scalability effort
==================
This series fixes not only the "qualitative" bugs that can crash the
kernel but also the "quantitative" issues with a massive number of
kprobes. Thus we can now do a stress test, putting kprobes on all
(non-blacklisted) kernel functions and enabling all of them.
To set kprobes on all kernel functions, run the script below.
  
  #!/bin/sh
  TRACE_DIR=/sys/kernel/debug/tracing/
  echo > $TRACE_DIR/kprobe_events
  grep -iw t /proc/kallsyms | tr -d . | \
awk 'BEGIN{i=0};{print("p:"$3"_"i, "0x"$1); i++}' | \
while read l; do echo $l >> $TRACE_DIR/kprobe_events ; done
  
Since it doesn't check the blacklist at all, you'll see many write
errors, but that is no problem :).

Note that a performance issue remains in the kprobe-tracer
if you trace all functions. Since a few ftrace functions are called
inside the kprobe tracer even if we shut off tracing (tracing_on
= 0), enabling kprobe-events on those functions has a bad
performance impact (it is safe, but you'll see the system slow down
and no events recorded, because they are just ignored).
To find those functions, you can use the third column of
(debugfs)/tracing/kprobe_profile as below, which tells you the number
of misses (ignored hits) for each event. If you find events
which have a small number in the 2nd column and a large number in the
3rd column, those may cause the slowdown.
  
  # sort -rnk 3 (debugfs)/tracing/kprobe_profile | head
  ftrace_cmp_recs_4907                  264950231  33648874543
  ring_buffer_lock_reserve_5087                 0   4802719935
  trace_buffer_lock_reserve_5199                0   4385319303
  trace_event_buffer_lock_reserve_5200          0   4379968153
  ftrace_location_range_4918             18944015   2407616669
  bsearch_17098                          18979815   2407579741
  ftrace_location_4972                   18927061   2406723128
  ftrace_int3_handler_1211               18926980   2406303531
  poke_int3_handler_199                  18448012   1403516611
  inat_get_opcode_attribute_16941012715314
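
The "small 2nd column, large 3rd column" rule of thumb above could be
checked mechanically, for example with a helper like the one below. This
is only a sketch: the function name profile_line_is_suspect and the
MISS_RATIO cut-off are assumptions made for illustration, not anything
defined by kprobes, and the input is assumed to be one
"name hits misses" line of kprobe_profile.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed cut-off: flag an event when its miss count is more than
 * MISS_RATIO times its hit count. */
#define MISS_RATIO 100ULL

/* Parse one "name hits misses" line and report whether it looks like
 * an event that is mostly ignored. */
bool profile_line_is_suspect(const char *line)
{
	char name[128];
	unsigned long long hits, misses;

	if (sscanf(line, "%127s %llu %llu", name, &hits, &misses) != 3)
		return false;	/* malformed line: ignore it */

	return misses / MISS_RATIO > hits;
}
```

A caller would read kprobe_profile line by line and print the lines for
which the helper returns true. The ratio is arbitrary and would need
tuning for a real workload.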
  

I'd recommend you to enable events on such functions

[PATCH -tip v7 03/26] kprobes: Prohibit probing on .entry.text code

2014-02-26 Thread Masami Hiramatsu

.entry.text is a code area used for interrupt/syscall entries, and
it contains a lot of sensitive code.
Thus, it is better to prohibit probing on all of that code instead
of only part of it.
Since some of its symbols are already registered on the kprobe
blacklist, this also removes them from the blacklist.

Signed-off-by: Masami Hiramatsu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ananth N Mavinakayanahalli 
Cc: Al Viro 
Cc: Seiji Aguchi 
Cc: Peter Zijlstra 
Cc: Frederic Weisbecker 
---
 arch/x86/kernel/entry_32.S |   33 -
 arch/x86/kernel/entry_64.S |   20 
 arch/x86/kernel/kprobes/core.c |8 
 include/linux/kprobes.h|1 +
 kernel/kprobes.c   |   13 -
 5 files changed, 17 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index a2a4f46..0ca5bf1 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -315,10 +315,6 @@ ENTRY(ret_from_kernel_thread)
 ENDPROC(ret_from_kernel_thread)
 
 /*
- * Interrupt exit functions should be protected against kprobes
- */
-   .pushsection .kprobes.text, "ax"
-/*
  * Return to user mode is not as complex as all this looks,
  * but we want the default path for a system call return to
  * go as quickly as possible which is why some of this is
@@ -372,10 +368,6 @@ need_resched:
 END(resume_kernel)
 #endif
CFI_ENDPROC
-/*
- * End of kprobes section
- */
-   .popsection
 
 /* SYSENTER_RETURN points to after the "sysenter" instruction in
the vsyscall page.  See vsyscall-sysentry.S, which defines the symbol.  */
@@ -495,10 +487,6 @@ sysexit_audit:
PTGS_TO_GS_EX
 ENDPROC(ia32_sysenter_target)
 
-/*
- * syscall stub including irq exit should be protected against kprobes
- */
-   .pushsection .kprobes.text, "ax"
# system call handler stub
 ENTRY(system_call)
RING0_INT_FRAME # can't unwind into user space anyway
@@ -691,10 +679,6 @@ syscall_badsys:
jmp resume_userspace
 END(syscall_badsys)
CFI_ENDPROC
-/*
- * End of kprobes section
- */
-   .popsection
 
 .macro FIXUP_ESPFIX_STACK
 /*
@@ -781,10 +765,6 @@ common_interrupt:
 ENDPROC(common_interrupt)
CFI_ENDPROC
 
-/*
- *  Irq entries should be protected against kprobes
- */
-   .pushsection .kprobes.text, "ax"
 #define BUILD_INTERRUPT3(name, nr, fn) \
 ENTRY(name)\
RING0_INT_FRAME;\
@@ -961,10 +941,6 @@ ENTRY(spurious_interrupt_bug)
jmp error_code
CFI_ENDPROC
 END(spurious_interrupt_bug)
-/*
- * End of kprobes section
- */
-   .popsection
 
 #ifdef CONFIG_XEN
 /* Xen doesn't set %esp to be precisely what the normal sysenter
@@ -1239,11 +1215,6 @@ return_to_handler:
jmp *%ecx
 #endif
 
-/*
- * Some functions should be protected against kprobes
- */
-   .pushsection .kprobes.text, "ax"
-
 #ifdef CONFIG_TRACING
 ENTRY(trace_page_fault)
RING0_EC_FRAME
@@ -1453,7 +1424,3 @@ ENTRY(async_page_fault)
 END(async_page_fault)
 #endif
 
-/*
- * End of kprobes section
- */
-   .popsection
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1e96c36..43bb389 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -487,8 +487,6 @@ ENDPROC(native_usergs_sysret64)
TRACE_IRQS_OFF
.endm
 
-/* save complete stack frame */
-   .pushsection .kprobes.text, "ax"
 ENTRY(save_paranoid)
XCPT_FRAME 1 RDI+8
cld
@@ -517,7 +515,6 @@ ENTRY(save_paranoid)
 1: ret
CFI_ENDPROC
 END(save_paranoid)
-   .popsection
 
 /*
  * A newly forked process directly context switches into this address.
@@ -975,10 +972,6 @@ END(interrupt)
call \func
.endm
 
-/*
- * Interrupt entry/exit should be protected against kprobes
- */
-   .pushsection .kprobes.text, "ax"
/*
 * The interrupt stubs push (~vector+0x80) onto the stack and
 * then jump to common_interrupt.
@@ -1113,10 +1106,6 @@ ENTRY(retint_kernel)
 
CFI_ENDPROC
 END(common_interrupt)
-/*
- * End of kprobes section
- */
-   .popsection
 
 /*
  * APIC interrupts.
@@ -1477,11 +1466,6 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
hyperv_callback_vector hyperv_vector_handler
 #endif /* CONFIG_HYPERV */
 
-/*
- * Some functions should be protected against kprobes
- */
-   .pushsection .kprobes.text, "ax"
-
 paranoidzeroentry_ist debug do_debug DEBUG_STACK
 paranoidzeroentry_ist int3 do_int3 DEBUG_STACK
 paranoiderrorentry stack_segment do_stack_segment
@@ -1898,7 +1882,3 @@ ENTRY(ignore_sysret)
CFI_ENDPROC
 END(ignore_sysret)
 
-/*
- * End of kprobes section
- */
-   .popsection
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index a9a42fa..4708d6e 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1062,6 +1062,14 @@ int

[PATCH] mm/slab.c: cleanup outdated comments and unify variables naming

2014-02-26 Thread Jianyu Zhan

Over time the code has changed a lot, leaving outdated comments
scattered around; instead of facilitating understanding, they cause
more confusion. This patch cleans them up.

It also unifies the naming of some variables.

Signed-off-by: Jianyu Zhan 
---
 mm/slab.c | 66 +++
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index b264214..5678673 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -277,8 +277,8 @@ static void kmem_cache_node_init(struct kmem_cache_node 
*parent)
  * OTOH the cpuarrays can contain lots of objects,
  * which could lock up otherwise freeable slabs.
  */
-#define REAPTIMEOUT_CPUC   (2*HZ)
-#define REAPTIMEOUT_LIST3  (4*HZ)
+#define REAPTIMEOUT_AC (2*HZ)
+#define REAPTIMEOUT_NODE   (4*HZ)
 
 #if STATS
 #defineSTATS_INC_ACTIVE(x) ((x)->num_active++)
@@ -1067,7 +1067,7 @@ static int init_cache_node_node(int node)
 
list_for_each_entry(cachep, _caches, list) {
/*
-* Set up the size64 kmemlist for cpu before we can
+* Set up the kmem_cache_node for cpu before we can
 * begin anything. Make sure some other cpu on this
 * node has not already allocated this
 */
@@ -1076,12 +1076,12 @@ static int init_cache_node_node(int node)
if (!n)
return -ENOMEM;
kmem_cache_node_init(n);
-   n->next_reap = jiffies + REAPTIMEOUT_LIST3 +
-   ((unsigned long)cachep) % REAPTIMEOUT_LIST3;
+   n->next_reap = jiffies + REAPTIMEOUT_NODE +
+   ((unsigned long)cachep) % REAPTIMEOUT_NODE;
 
/*
-* The l3s don't come and go as CPUs come and
-* go.  slab_mutex is sufficient
+* The kmem_cache_nodes don't come and go as CPUs
+* come and go.  slab_mutex is sufficient
 * protection here.
 */
cachep->node[node] = n;
@@ -1406,8 +1406,8 @@ static void __init set_up_node(struct kmem_cache *cachep, 
int index)
for_each_online_node(node) {
cachep->node[node] = _kmem_cache_node[index + node];
cachep->node[node]->next_reap = jiffies +
-   REAPTIMEOUT_LIST3 +
-   ((unsigned long)cachep) % REAPTIMEOUT_LIST3;
+   REAPTIMEOUT_NODE +
+   ((unsigned long)cachep) % REAPTIMEOUT_NODE;
}
 }
 
@@ -2103,8 +2103,8 @@ static int __init_refok setup_cpu_cache(struct kmem_cache 
*cachep, gfp_t gfp)
}
}
cachep->node[numa_mem_id()]->next_reap =
-   jiffies + REAPTIMEOUT_LIST3 +
-   ((unsigned long)cachep) % REAPTIMEOUT_LIST3;
+   jiffies + REAPTIMEOUT_NODE +
+   ((unsigned long)cachep) % REAPTIMEOUT_NODE;
 
cpu_cache_get(cachep)->avail = 0;
cpu_cache_get(cachep)->limit = BOOT_CPUCACHE_ENTRIES;
@@ -2300,10 +2300,10 @@ __kmem_cache_create (struct kmem_cache *cachep, 
unsigned long flags)
if (flags & CFLGS_OFF_SLAB) {
cachep->freelist_cache = kmalloc_slab(freelist_size, 0u);
/*
-* This is a possibility for one of the malloc_sizes caches.
+* This is a possibility for one of the kmalloc_{dma,}_caches.
 * But since we go off slab only for object size greater than
-* PAGE_SIZE/8, and malloc_sizes gets created in ascending 
order,
-* this should not happen at all.
+* PAGE_SIZE/8, and kmalloc_{dma,}_caches get created
+* in ascending order,this should not happen at all.
 * But leave a BUG_ON for some lucky dude.
 */
BUG_ON(ZERO_OR_NULL_PTR(cachep->freelist_cache));
@@ -2511,14 +2511,17 @@ int __kmem_cache_shutdown(struct kmem_cache *cachep)
 
 /*
  * Get the memory for a slab management obj.
- * For a slab cache when the slab descriptor is off-slab, slab descriptors
- * always come from malloc_sizes caches.  The slab descriptor cannot
- * come from the same cache which is getting created because,
- * when we are searching for an appropriate cache for these
- * descriptors in kmem_cache_create, we search through the malloc_sizes array.
- * If we are creating a malloc_sizes cache here it would not be visible to
- * kmem_find_general_cachep till the initialization is complete.
- * Hence we cannot have freelist_cache same as the original cache.
+ *
+ * For a slab cache when the slab descriptor is off-slab, the
+ * slab descriptor can't come from the same cache which is being created,
+ * Because if it is the case, that means

[PATCH 2/3] ARM: dts: vf610: i2c: Add eDMA support

2014-02-26 Thread Yuan Yao

Add i2c dts node properties for eDMA support; they depend on the eDMA driver.

Signed-off-by: Yuan Yao 
---
 arch/arm/boot/dts/vf610.dtsi | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/boot/dts/vf610.dtsi b/arch/arm/boot/dts/vf610.dtsi
index 91a7757..9d14a19 100644
--- a/arch/arm/boot/dts/vf610.dtsi
+++ b/arch/arm/boot/dts/vf610.dtsi
@@ -273,6 +273,9 @@
interrupts =<0 71 IRQ_TYPE_LEVEL_HIGH>;
clocks = < VF610_CLK_I2C0>;
clock-names = "ipg";
+   dmas = < 0 50>,
+   < 0 51>;
+   dma-names = "rx","tx";
status = "disabled";
};
 
-- 
1.8.4


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 0/3] i2c: add DMA support for freescale i2c driver

2014-02-26 Thread Yuan Yao


Added in v1:
- Enable DMA if the controller supports it and the transfer size is bigger than the threshold.
- Add device tree bindings for i2c eDMA support.
- Add eDMA support for the i2c driver.
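
The first point amounts to a simple per-transfer decision, which might
look roughly like the sketch below. The helper name demo_should_use_dma
is hypothetical, and the value 16 mirrors the IMX_I2C_DMA_THRESHOLD
define in patch 1/3. How the driver actually makes the decision is up
to that patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors IMX_I2C_DMA_THRESHOLD from patch 1/3. */
#define DEMO_DMA_THRESHOLD 16

/* Use DMA only when the hardware supports it and the transfer is
 * bigger than the threshold; otherwise stay with PIO. */
bool demo_should_use_dma(bool has_dma_support, size_t len)
{
	return has_dma_support && len > DEMO_DMA_THRESHOLD;
}
```

Short transfers stay on the PIO path under this rule, presumably because
setting up a DMA transfer costs more than it saves for a few bytes.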

Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 3/3] Documentation:add DMA support for freescale i2c driver

2014-02-26 Thread Yuan Yao

Add i2c dts node properties for eDMA support; they depend on the eDMA driver.

Signed-off-by: Yuan Yao 
---
 Documentation/devicetree/bindings/i2c/i2c-imx.txt | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-imx.txt 
b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
index 4a8513e..52d37fd 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-imx.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c-imx.txt
@@ -11,6 +11,8 @@ Required properties:
 Optional properties:
 - clock-frequency : Constains desired I2C/HS-I2C bus clock frequency in Hz.
   The absence of the propoerty indicates the default frequency 100 kHz.
+- dmas: A list of two dma specifiers, one for each entry in dma-names.
+- dma-names: should contain "tx" and "rx".
 
 Examples:
 
@@ -26,3 +28,12 @@ i2c@70038000 { /* HS-I2C on i.MX51 */
interrupts = <64>;
clock-frequency = <40>;
 };
+
+i2c0: i2c@40066000 { /* i2c0 on vf610 */
+   compatible = "fsl,vf610-i2c";
+   reg = <0x40066000 0x1000>;
+   interrupts =<0 71 0x04>;
+   dmas = < 0 50>,
+   < 0 51>;
+   dma-names = "rx","tx";
+};
-- 
1.8.4


Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/3] i2c: add DMA support for freescale i2c driver

2014-02-26 Thread Yuan Yao

Add DMA support for i2c. This feature depends on the DMA driver.
You can turn it on by writing both the dmas and dma-names properties in the dts node.
You should also set ".has_dma_support" to true in the imx_i2c_hwdata struct.

Signed-off-by: Yuan Yao 
---
 drivers/i2c/busses/i2c-imx.c | 358 +--
 1 file changed, 344 insertions(+), 14 deletions(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index db895fb..6ec392b 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -37,22 +37,27 @@
 /** Includes 
***
 
***/
 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
+#include 
 
 /** Defines 

 
***/
@@ -63,6 +68,9 @@
 /* Default value */
 #define IMX_I2C_BIT_RATE   10  /* 100kHz */
 
+/* enable DMA if transfer size is bigger than this threshold */
+#define IMX_I2C_DMA_THRESHOLD  16
+
 /* IMX I2C registers:
  * the I2C register offset is different between SoCs,
  * to provid support for all these chips, split the
@@ -88,6 +96,7 @@
 #define I2SR_IBB   0x20
 #define I2SR_IAAS  0x40
 #define I2SR_ICF   0x80
+#define I2CR_DMAEN 0x02
 #define I2CR_RSTA  0x04
 #define I2CR_TXAK  0x08
 #define I2CR_MTX   0x10
@@ -172,6 +181,17 @@ struct imx_i2c_hwdata {
unsignedndivs;
unsignedi2sr_clr_opcode;
unsignedi2cr_ien_opcode;
+   boolhas_dma_support;
+};
+
+struct imx_i2c_dma {
+   struct dma_chan *chan_tx;
+   struct dma_chan *chan_rx;
+   dma_addr_t  buf_tx;
+   dma_addr_t  buf_rx;
+   unsigned intlen_tx;
+   unsigned intlen_rx;
+   struct completion   cmd_complete;
 };
 
 struct imx_i2c_struct {
@@ -184,6 +204,9 @@ struct imx_i2c_struct {
int stopped;
unsigned intifdr; /* IMX_I2C_IFDR */
const struct imx_i2c_hwdata *hwdata;
+
+   booluse_dma;
+   struct imx_i2c_dma  *dma;
 };
 
 static const struct imx_i2c_hwdata imx1_i2c_hwdata  = {
@@ -193,6 +216,7 @@ static const struct imx_i2c_hwdata imx1_i2c_hwdata  = {
.ndivs  = ARRAY_SIZE(imx_i2c_clk_div),
.i2sr_clr_opcode= I2SR_CLR_OPCODE_W0C,
.i2cr_ien_opcode= I2CR_IEN_OPCODE_1,
+   .has_dma_support= false,
 
 };
 
@@ -203,6 +227,7 @@ static const struct imx_i2c_hwdata imx21_i2c_hwdata  = {
.ndivs  = ARRAY_SIZE(imx_i2c_clk_div),
.i2sr_clr_opcode= I2SR_CLR_OPCODE_W0C,
.i2cr_ien_opcode= I2CR_IEN_OPCODE_1,
+   .has_dma_support= false,
 
 };
 
@@ -213,6 +238,7 @@ static struct imx_i2c_hwdata vf610_i2c_hwdata = {
.ndivs  = ARRAY_SIZE(vf610_i2c_clk_div),
.i2sr_clr_opcode= I2SR_CLR_OPCODE_W1C,
.i2cr_ien_opcode= I2CR_IEN_OPCODE_0,
+   .has_dma_support= true,
 
 };
 
@@ -254,6 +280,155 @@ static inline unsigned char imx_i2c_read_reg(struct 
imx_i2c_struct *i2c_imx,
return readb(i2c_imx->base + (reg << i2c_imx->hwdata->regshift));
 }
 
+/** Functions for DMA support 
+**/
+static int i2c_imx_dma_request(struct imx_i2c_struct *i2c_imx, u32 phy_addr)
+{
+   struct imx_i2c_dma *dma = i2c_imx->dma;
+   struct dma_slave_config dma_sconfig;
+   int ret;
+
+   dma->chan_tx = dma_request_slave_channel(_imx->adapter.dev, "tx");
+   if (!dma->chan_tx) {
+   dev_err(_imx->adapter.dev,
+   "Dma tx channel request failed!\n");
+   return -ENODEV;
+   }
+
+   dma_sconfig.dst_addr = (dma_addr_t)phy_addr +
+   (IMX_I2C_I2DR << i2c_imx->hwdata->regshift);
+   dma_sconfig.dst_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+   dma_sconfig.dst_maxburst = 1;
+   dma_sconfig.direction = DMA_MEM_TO_DEV;
+   ret = dmaengine_slave_config(dma->chan_tx, _sconfig);
+   if (ret < 0) {
+   dev_err(_imx->adapter.dev,
+   "Dma slave config failed, err = %d\n", ret);
+   dma_release_channel(dma->chan_tx);
+   return ret;
+   }
+
+   dma->chan_rx =

Re: [PATCH] s390: select CONFIG_TTY for use of tty in unconditional keyboard driver

2014-02-26 Thread Heiko Carstens

On Wed, Feb 26, 2014 at 06:13:06PM -0800, Josh Triplett wrote:
> The unconditionally built keyboard driver, drivers/s390/char/keyboard.c,
> requires CONFIG_TTY, so select it from CONFIG_S390 to prevent a build
> error.
> 
> Signed-off-by: Josh Triplett 
> ---
>  arch/s390/Kconfig | 1 +
>  1 file changed, 1 insertion(+)

Applied, thanks!

Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3/3] ARM: S3C24XX: select COMMON_CLK_SAMSUNG for S3C24XX

2014-02-26 Thread Heiko Stübner

On Thursday, 27 February 2014, 10:48:26, Pankaj Dubey wrote:
> On 02/27/2014 09:16 AM, Mike Turquette wrote:
> > Quoting Pankaj Dubey (2014-02-25 21:24:07)
> > 
> >> CC: Ben Dooks 
> >> CC: Kukjin Kim 
> >> CC: Russell King 
> >> Signed-off-by: Pankaj Dubey 
> >> ---
> >> 
> >>   arch/arm/mach-s3c24xx/Kconfig |3 +++
> >>   1 file changed, 3 insertions(+)
> >> 
> >> diff --git a/arch/arm/mach-s3c24xx/Kconfig
> >> b/arch/arm/mach-s3c24xx/Kconfig
> >> index 80373da..5cf82a1 100644
> >> --- a/arch/arm/mach-s3c24xx/Kconfig
> >> +++ b/arch/arm/mach-s3c24xx/Kconfig
> >> @@ -40,6 +40,7 @@ config CPU_S3C2410
> >> 
> >>   config CPU_S3C2412
> >>   
> >>  bool "SAMSUNG S3C2412"
> >>  select COMMON_CLK
> >> 
> >> +   select COMMON_CLK_SAMSUNG
> > 
> > I guess this depends on Heiko's "[PATCH 00/12] ARM: S3C24XX: convert
> > s3c2410, s3c2440 s3c2442 to common clock framework" series?
> > 
> > Regards,
> > Mike
> 
> Yes, this series is based on latest kgene/for-next branch where Heiko's
> series is merged.

Just to clarify: the s3c2416/s3c2443 (first series) and s3c2412 (second
series) are converted, because the clockout for s3c2410 etc. seems to need
a bit more work. I've just moved two common patches (shared PLLs and a
platform change) from the s3c2410 et al. series into the s3c2412 one.

Both of these series are merged in kgene's tree as mentioned. At this point
I'm not sure whether I will have the time to respin the s3c2410 series for
3.15, which might also be good, as it lets all the other series touching
Samsung clock code settle.


Heiko

> 
> >>  select CPU_ARM926T
> >>  select CPU_LLSERIAL_S3C2440
> >>  select S3C2412_COMMON_CLK
> >> 
> >> @@ -51,6 +52,7 @@ config CPU_S3C2412
> >> 
> >>   config CPU_S3C2416
> >>   
> >>  bool "SAMSUNG S3C2416/S3C2450"
> >>  select COMMON_CLK
> >> 
> >> +   select COMMON_CLK_SAMSUNG
> >> 
> >>  select CPU_ARM926T
> >>  select CPU_LLSERIAL_S3C2440
> >>  select S3C2416_PM if PM
> >> 
> >> @@ -89,6 +91,7 @@ config CPU_S3C244X
> >> 
> >>   config CPU_S3C2443
> >>   
> >>  bool "SAMSUNG S3C2443"
> >>  select COMMON_CLK
> >> 
> >> +   select COMMON_CLK_SAMSUNG
> >> 
> >>  select CPU_ARM920T
> >>  select CPU_LLSERIAL_S3C2440
> >>  select S3C2443_COMMON_CLK
> >> 
> >> ___
> >> linux-arm-kernel mailing list
> >> linux-arm-ker...@lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] usb: gadget: return the right length in ffs_epfile_io()

2014-02-26 Thread Chuansheng Liu

When the request length is aligned to maxpacketsize, the returned
length ret is sometimes greater than the len requested by user space.

In that case we use min_t(size_t, ret, len) to limit the copy size
and avoid overflowing the user data buffer.

But we also need to return min_t(size_t, ret, len), so that user
space is told the correct length.

Signed-off-by: Chuansheng Liu 
---
 drivers/usb/gadget/f_fs.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/gadget/f_fs.c b/drivers/usb/gadget/f_fs.c
index 2b43343..31ee7af 100644
--- a/drivers/usb/gadget/f_fs.c
+++ b/drivers/usb/gadget/f_fs.c
@@ -687,10 +687,12 @@ static ssize_t ffs_epfile_io(struct file *file,
 * space for.
 */
ret = ep->status;
-   if (read && ret > 0 &&
-   unlikely(copy_to_user(buf, data,
- min_t(size_t, ret, len))))
-   ret = -EFAULT;
+   if (read && ret > 0) {
+   ret = min_t(size_t, ret, len);
+
+   if (unlikely(copy_to_user(buf, data, ret)))
+   ret = -EFAULT;
+   }
}
}
 
-- 
1.9.rc0
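
The return-length fix in the patch above can be sketched in isolation
as follows. The helper name clamp_read_len is hypothetical and the
copy to user space is left out; only the clamping of the reported
length is shown.

```c
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* status: length reported by the endpoint (or a negative error).
 * len:    length requested by user space.
 * Returns the value that should be handed back to user space. */
ssize_t clamp_read_len(ssize_t status, size_t len)
{
	if (status <= 0)
		return status;	/* error or nothing transferred */

	/* Never report more than the caller asked for. */
	return (size_t)status < len ? status : (ssize_t)len;
}
```

For example, a 512-byte completion for a 500-byte request is reported
as 500, while errors and shorter transfers pass through unchanged.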

Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] thermal,rcar_thermal: Add dependency on HAS_IOMEM

2014-02-26 Thread Zhang Rui

On Sat, 2014-01-25 at 23:29 +0100, Richard Weinberger wrote:
> Commit beeb5a1e (thermal: rcar-thermal: Enable driver compilation with 
> COMPILE_TEST)
> broke build on archs wihout io memory.
> 
> On archs like S390 or um this driver cannot build nor work.
> Make it depend on HAS_IOMEM to bypass build failures.
> 
> drivers/thermal/rcar_thermal.c:404: undefined reference to 
> `devm_ioremap_resource'
> drivers/thermal/rcar_thermal.c:426: undefined reference to 
> `devm_ioremap_resource'
> 
> Signed-off-by: Richard Weinberger 

Kuninori,

are you okay with this patch?

thanks,
rui

> ---
>  drivers/thermal/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 35c0664..88efa8f 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -136,6 +136,7 @@ config SPEAR_THERMAL
>  config RCAR_THERMAL
>   tristate "Renesas R-Car thermal driver"
>   depends on ARCH_SHMOBILE || COMPILE_TEST
> + depends on HAS_IO_MEM
>   help
> Enable this to plug the R-Car thermal sensor driver into the Linux
> thermal framework.


Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v3] mm: per-thread vma caching

2014-02-26 Thread Michel Lespinasse

Agree with Linus; this is starting to look pretty good.

I still have nits though :)

On Wed, Feb 26, 2014 at 4:07 PM, Davidlohr Bueso  wrote:
> @@ -0,0 +1,45 @@
> +#ifndef __LINUX_VMACACHE_H
> +#define __LINUX_VMACACHE_H
> +
> +#include 
> +
> +#ifdef CONFIG_MMU
> +#define VMACACHE_BITS 2
> +#else
> +#define VMACACHE_BITS 0
> +#endif

I wouldn't even bother with the #ifdef here - why not just always use 2 bits?

> +#define vmacache_flush(tsk) \
> +   do { \
> +   memset(tsk->vmacache, 0, sizeof(tsk->vmacache)); \
> +   } while (0)

I think inline functions are preferred.

> diff --git a/mm/nommu.c b/mm/nommu.c
> index 8740213..9a5347b 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -768,16 +768,19 @@ static void add_vma_to_mm(struct mm_struct *mm, struct 
> vm_area_struct *vma)
>   */
>  static void delete_vma_from_mm(struct vm_area_struct *vma)
>  {
> +   int i;
> struct address_space *mapping;
> struct mm_struct *mm = vma->vm_mm;
> +   struct task_struct *curr = current;
>
> kenter("%p", vma);
>
> protect_vma(vma, 0);
>
> mm->map_count--;
> -   if (mm->mmap_cache == vma)
> -   mm->mmap_cache = NULL;
> +   for (i = 0; i < VMACACHE_SIZE; i++)
> +   if (curr->vmacache[i] == vma)
> +   curr->vmacache[i] = NULL;

Why is the invalidation done differently here? Shouldn't it be done
by bumping the mm's sequence number, so that invalidation works across
all threads sharing that mm?

> +#ifndef CONFIG_MMU
> +struct vm_area_struct *vmacache_find_exact(struct mm_struct *mm,
> +  unsigned long start,
> +  unsigned long end)
> +{
> +   int i;
> +
> +   if (!vmacache_valid(mm))
> +   return NULL;
> +
> +   for (i = 0; i < VMACACHE_SIZE; i++) {
> +   struct vm_area_struct *vma = current->vmacache[i];
> +
> +   if (vma && vma->vm_start == start && vma->vm_end == end)
> +   return vma;
> +   }
> +
> +   return NULL;
> +
> +}
> +#endif

I think the caller could instead do:

	vma = vmacache_find(mm, start);
	if (vma && vma->vm_start == start && vma->vm_end == end) {
	}

I.e. it is better to deal with it at the call site than to add a new
vmacache function for it.
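
To make the suggestion concrete, here is a self-contained user-space
sketch of the call-site check. Everything prefixed with demo_ is a
simplified stand-in invented for illustration (the cache is a plain
global array with an assumed size, and the lookup takes no mm
argument); it is not the vmacache code from the patch.

```c
#include <stddef.h>

#define DEMO_CACHE_SIZE 4	/* assumed size, for illustration only */

struct demo_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

struct demo_vma *demo_cache[DEMO_CACHE_SIZE];

/* Stand-in for vmacache_find(): first cached area containing addr. */
struct demo_vma *demo_cache_find(unsigned long addr)
{
	for (int i = 0; i < DEMO_CACHE_SIZE; i++) {
		struct demo_vma *vma = demo_cache[i];

		if (vma && vma->vm_start <= addr && addr < vma->vm_end)
			return vma;
	}
	return NULL;
}

/* The exact-match test done at the call site, on top of the
 * ordinary lookup, instead of a dedicated cache function. */
struct demo_vma *demo_find_exact(unsigned long start, unsigned long end)
{
	struct demo_vma *vma = demo_cache_find(start);

	if (vma && vma->vm_start == start && vma->vm_end == end)
		return vma;
	return NULL;
}
```

The exact-match variant then needs no cache walk of its own; it only
filters the result of the ordinary lookup.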

These are nits, the code looks good already.

I would like to propose an LRU eviction scheme to replace your
VMACACHE_HASH mechanism; I will probably do that as a follow-up once
you have the code in Andrew's tree.

Reviewed-by: Michel Lespinasse 

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] thermal: add generic IIO channel thermal sensor driver

2014-02-26 Thread Zhang Rui

On Wed, 2014-02-05 at 17:43 -0800, Courtney Cavin wrote:
> This driver is a generic method for using IIO ADC channels as thermal
> sensors.
> 
> Signed-off-by: Courtney Cavin 

Eduardo,

what do you think of this patch?

thanks,
rui
> ---
>  .../devicetree/bindings/thermal/iio-thermal.txt|  46 +
>  drivers/thermal/Kconfig|  13 ++
>  drivers/thermal/Makefile   |   1 +
>  drivers/thermal/iio_thermal.c  | 207 
> +
>  4 files changed, 267 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/thermal/iio-thermal.txt
>  create mode 100644 drivers/thermal/iio_thermal.c
> 
> diff --git a/Documentation/devicetree/bindings/thermal/iio-thermal.txt 
> b/Documentation/devicetree/bindings/thermal/iio-thermal.txt
> new file mode 100644
> index 000..3be11b6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/thermal/iio-thermal.txt
> @@ -0,0 +1,46 @@
> +Generic IIO channel thermal sensor bindings
> +
> +compatible:
> + Usage: required
> + Type: string
> + Desc: compatible string, must be "iio-thermal"
> +
> +conversion-method:
> + Usage: required
> + Type: string
> + Desc: How to convert IIO voltage values to temperature, one of:
> + "interpolation" - interpolate between values in lookup table
> + "scalar" - use values as multiplier and divisor
> +
> +conversion-values:
> + Usage: required
> + Type: u32 array, 2-tuples
> + Desc: lookup table for conversion, for conversion-method:
> + "interpolation" - 2-tuples of < uV mK >; micro-volts to
> +   milli-kelvin; table must ascend
> + "scalar" - single scalar 2-tuple as < M D >; where:
> +mK = uV * M / D
> +
> +io-channels:
> + Usage: required
> + Type: prop-encoded-array
> + Desc: See bindings/iio/iio-bindings.txt; must be a voltage channel
> +
> +Example:
> +
> +vadc: some_vadc {
> + compatible = "...";
> + #io-channel-cells = <1>;
> +};
> +
> +iio-thermal {
> + compatible = "iio-thermal";
> + io-channels = < 0>;
> + conversion-method = "interpolation";
> + conversion-values = <
> +   3 398200
> +  385000 318200
> + 1738000 233200
> + >;
> +};
> +
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 35c0664..f83a8e8 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -114,6 +114,19 @@ config THERMAL_EMULATION
> because userland can easily disable the thermal policy by simply
> flooding this sysfs node with low temperature values.
>  
> +config IIO_THERMAL
> + tristate "Temperature sensor driver for generic IIO channels"
> + depends on IIO
> + depends on THERMAL_OF
> + help
> +   Support for generic IIO channels, such as ADCs.  This driver allows
> +   you to expose an IIO voltage channel as a thermal sensor.  This is
> +   implemented as a thermal sensor, not a thermal zone, and thus
> +   requires DT defined thermal infrastructure in order to be useful.
> +
> +   If you aren't sure that you need this support, or haven't configured
> +   a thermal infrastructure in device tree, you should say 'N' here.
> +
>  config IMX_THERMAL
>   tristate "Temperature sensor driver for Freescale i.MX SoCs"
>   depends on CPU_THERMAL
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 54e4ec9..0ee2c92 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -25,6 +25,7 @@ obj-y   += samsung/
>  obj-$(CONFIG_DOVE_THERMAL)   += dove_thermal.o
>  obj-$(CONFIG_DB8500_THERMAL) += db8500_thermal.o
>  obj-$(CONFIG_ARMADA_THERMAL) += armada_thermal.o
> +obj-$(CONFIG_IIO_THERMAL)+= iio_thermal.o
>  obj-$(CONFIG_IMX_THERMAL)+= imx_thermal.o
>  obj-$(CONFIG_DB8500_CPUFREQ_COOLING) += db8500_cpufreq_cooling.o
>  obj-$(CONFIG_INTEL_POWERCLAMP)   += intel_powerclamp.o
> diff --git a/drivers/thermal/iio_thermal.c b/drivers/thermal/iio_thermal.c
> new file mode 100644
> index 000..df21dbc
> --- /dev/null
> +++ b/drivers/thermal/iio_thermal.c
> @@ -0,0 +1,207 @@
> +/*
> + * An IIO channel based thermal sensor driver
> + *
> + * Copyright (C) 2014 Sony Mobile Communications, AB
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define MILLIKELVIN_0C

Re: [PATCH RESEND v10 3/4] PHY: add APM X-Gene SoC 15Gbps Multi-purpose PHY driver

2014-02-26 Thread Kishon Vijay Abraham I


On Thursday 27 February 2014 12:11 PM, Loc Ho wrote:

Hi,


+
+static void sds_wr(void __iomem *csr_base, u32 indirect_cmd_reg,
+  u32 indirect_data_reg, u32 addr, u32 data)
+{
+   u32 val;
+   u32 cmd;
+
+   cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
+   cmd = CFG_IND_ADDR_SET(cmd, addr);





This looks hacky. If 'CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK' should
be set then it should be part of the second argument. From the macro
'CFG_IND_ADDR_SET' the first argument should be more like the current value
present in the register right? I feel the macro (CFG_IND_ADDR_SET) is not
used in the way it is intended to.




The macro XXX_SET is intended to update a field within the register.
The updated value is returned. The first assignment lines are setting
other fields. Those two lines can be written as:

cmd = 0;
cmd |= CFG_IND_WR_CMD_MASK;==> Set the CMD bit
cmd |= CFG_IND_CMD_DONE_MASK;==> Set the DONE bit
cmd = CFG_IND_ADDR_SET(cmd, addr);===> Set the field ADDR




#define  CFG_IND_ADDR_SET(dst, src) \
  (((dst) & ~0x0030) | (((u32)(src)<<4) & 0x0030))

  From this macro the first argument should be the present value in that
register. Here you reset the address bits and write the new address bits.



Yes. This is correct. I am clearing a number of bits and then setting the
new value.


IMO the first argument should be the value in 'csr_base +
indirect_cmd_reg'.
So it resets the address bits in 'csr_base + indirect_cmd_reg' and write
down the new address bits.



Yes. The above code does just that. In addition, I am also setting
the bits CFG_IND_WR_CMD_MASK and CFG_IND_CMD_DONE_MASK with the two
previous statements. Think of the code flow as follows:

val = readl(some void * address); /* read the register */



Where are you reading the register in your code (before CFG_IND_ADDR_SET)?


I am not reading the register as I will be completely setting them.


Ok. Never mind then. Sorry for the noise. Your code is fine.

Thanks
Kishon
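
For readers following the thread, the clear-then-set pattern discussed
above can be sketched like this. All DEMO_ names, the bit positions and
the mask values are hypothetical and chosen only for illustration; they
are not taken from the X-Gene PHY driver.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only:
 * bit 0 = write command, bit 1 = done flag, bits 4..21 = address. */
#define DEMO_WR_CMD_MASK	0x00000001u
#define DEMO_CMD_DONE_MASK	0x00000002u
#define DEMO_ADDR_MASK		0x003ffff0u
#define DEMO_ADDR_SHIFT		4

/* Clear the address field in dst, then insert src into it.
 * All bits outside the field are kept as they are. */
#define DEMO_ADDR_SET(dst, src) \
	(((dst) & ~DEMO_ADDR_MASK) | \
	 (((uint32_t)(src) << DEMO_ADDR_SHIFT) & DEMO_ADDR_MASK))

/* Build a complete command word from scratch: no register read is
 * needed because every field is set explicitly. */
uint32_t demo_build_write_cmd(uint32_t addr)
{
	uint32_t cmd = DEMO_WR_CMD_MASK | DEMO_CMD_DONE_MASK;

	return DEMO_ADDR_SET(cmd, addr);
}
```

The first macro argument only has to carry the other fields, whether
they come from a register read or, as here, from freshly set bits.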
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/1] thermal: fix default governor assignment

2014-02-26 Thread Zhang Rui

On Tue, 2014-01-14 at 14:45 -0400, Eduardo Valentin wrote:
> When registering a thermal zone, passing an invalid
> .governor_name via struct thermal_zone_params may
> create a thermal zone without a governor, when it
> is supposed to be the default governor.
> 
> This patch fixes this issue by assigning the
> default governor, whenever the zone has a governor
> set to NULL.
> 
Please check whether the patch at https://patchwork.kernel.org/patch/3730391/
fixes the same problem.

thanks,
rui

> Cc: Zhang Rui 
> Cc: linux...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Reported-by: Wei Ni 
> Signed-off-by: Eduardo Valentin 
> ---
>  drivers/thermal/thermal_core.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a621e90..967d980 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -89,7 +89,7 @@ int thermal_register_governor(struct thermal_governor *governor)
>   list_for_each_entry(pos, &thermal_tz_list, node) {
>   if (pos->governor)
>   continue;
> - if (pos->tzp)
> + if (pos->tzp && pos->tzp->governor_name)
>   name = pos->tzp->governor_name;
>   else
>   name = DEFAULT_THERMAL_GOVERNOR;
> @@ -1527,9 +1527,10 @@ struct thermal_zone_device *thermal_zone_device_register(const char *type,
>   /* Update 'this' zone's governor information */
>   mutex_lock(&thermal_governor_lock);
>  
> - if (tz->tzp)
> + if (tz->tzp && tz->tzp->governor_name)
>   tz->governor = __find_governor(tz->tzp->governor_name);
> - else
> +
> + if (!tz->governor)
>   tz->governor = __find_governor(DEFAULT_THERMAL_GOVERNOR);
>  
>   mutex_unlock(&thermal_governor_lock);
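
For readers following the quoted patch, the fallback it introduces can be
sketched in plain user-space C. This is only an illustration under stated
assumptions: the demo_* names, the governor table and the default name
"step_wise" are hypothetical stand-ins, not the thermal core's data. The
point is the two-step lookup: try the requested name if there is one, then
fall back to the default whenever no governor was found.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the thermal core's types and defaults. */
struct demo_governor {
	const char *name;
};

#define DEMO_DEFAULT_GOVERNOR "step_wise"

static const struct demo_governor demo_governors[] = {
	{ "step_wise" },
	{ "fair_share" },
};

/* Look a governor up by name; returns NULL when the name is unknown. */
static const struct demo_governor *demo_find_governor(const char *name)
{
	size_t i;

	if (!name)
		return NULL;
	for (i = 0; i < sizeof(demo_governors) / sizeof(demo_governors[0]); i++)
		if (strcmp(demo_governors[i].name, name) == 0)
			return &demo_governors[i];
	return NULL;
}

/*
 * Mirror of the patched flow: use the requested name when one is given,
 * then assign the default whenever the result is still NULL. This covers
 * both a missing name and a name that matches no registered governor.
 */
static const struct demo_governor *demo_pick_governor(const char *requested)
{
	const struct demo_governor *gov = NULL;

	if (requested)
		gov = demo_find_governor(requested);

	if (!gov)
		gov = demo_find_governor(DEMO_DEFAULT_GOVERNOR);

	return gov;
}
```

With this shape, demo_pick_governor(NULL) and demo_pick_governor("bogus")
both end up with the default, while a valid name is honoured.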



Re: [PATCH RESEND v10 3/4] PHY: add APM X-Gene SoC 15Gbps Multi-purpose PHY driver

2014-02-26 Thread Kishon Vijay Abraham I


On Thursday 27 February 2014 12:04 PM, Kishon Vijay Abraham I wrote:

On Thursday 27 February 2014 11:55 AM, Loc Ho wrote:

Hi,


+
+static void sds_wr(void __iomem *csr_base, u32 indirect_cmd_reg,
+  u32 indirect_data_reg, u32 addr, u32 data)
+{
+   u32 val;
+   u32 cmd;
+
+   cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
+   cmd = CFG_IND_ADDR_SET(cmd, addr);




This looks hacky. If 'CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK' should
be set then it should be part of the second argument. From the macro
'CFG_IND_ADDR_SET' the first argument should be more like the current value
present in the register, right? I feel the macro (CFG_IND_ADDR_SET) is not
used in the way it is intended to be.



The macro XXX_SET is intended to update a field within the register.
The updated value is returned. The first assignment sets the other
fields. Those two lines can be written as:

cmd = 0;
cmd |= CFG_IND_WR_CMD_MASK;==> Set the CMD bit
cmd |= CFG_IND_CMD_DONE_MASK;==> Set the DONE bit
cmd = CFG_IND_ADDR_SET(cmd, addr);===> Set the field ADDR



#define  CFG_IND_ADDR_SET(dst, src) \
 (((dst) & ~0x0030) | (((u32)(src)<<4) & 0x0030))

 From this macro the first argument should be the present value in that
register. Here you reset the address bits and write the new address bits.


Yes, this is correct. I am clearing x bits and then setting the new
value.


IMO the first argument should be the value in 'csr_base + indirect_cmd_reg'.
So it resets the address bits in 'csr_base + indirect_cmd_reg' and writes
the new address bits.


Yes, the above code does just that. In addition, I am also setting
the bits CFG_IND_WR_CMD_MASK and CFG_IND_CMD_DONE_MASK with the two
previous statements. Think of the code flow as follows:

val = readl(some void * address); /* read the register */


Where are you reading the register in your code (before CFG_IND_ADDR_SET)?

val = _SET(val, 0x1);/* set bit 0  - assuming  set
bit 0 only */

If you want to set other bits (other than address) don't use
CFG_IND_ADDR_SET macro. That looks hacky to me.


Hmm, I looked at it again and I think only the readl is missing. If you can
add that, it should be fine.


How about something like this:

val = readl(csr_base + indirect_cmd_reg);
val = CFG_IND_ADDR_SET(val, addr);
val |= CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
writel(val, csr_base + indirect_cmd_reg);
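
To make the difference between the two approaches in this thread concrete,
here is a small user-space sketch. It works on plain 32-bit values instead
of readl()/writel(), and the DEMO_* masks are illustrative assumptions, not
the driver's real register layout. It shows that a read-modify-write keeps
bits outside the touched fields, while building the word from scratch
discards whatever was there before.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define DEMO_WR_CMD_MASK   0x00000001u
#define DEMO_CMD_DONE_MASK 0x00000002u
#define DEMO_ADDR_MASK     0x007ffff0u
#define DEMO_ADDR_SHIFT    4

/* Clear the ADDR field of dst and insert src; other bits of dst survive. */
#define DEMO_ADDR_SET(dst, src) \
	(((dst) & ~DEMO_ADDR_MASK) | \
	 (((uint32_t)(src) << DEMO_ADDR_SHIFT) & DEMO_ADDR_MASK))

/* Read-modify-write: start from the previous register contents. */
static uint32_t demo_cmd_rmw(uint32_t old, uint32_t addr)
{
	uint32_t val = old;

	val = DEMO_ADDR_SET(val, addr);
	val |= DEMO_WR_CMD_MASK | DEMO_CMD_DONE_MASK;
	return val;
}

/* Full overwrite: every field is set explicitly, old contents are ignored. */
static uint32_t demo_cmd_overwrite(uint32_t addr)
{
	uint32_t cmd;

	cmd = DEMO_WR_CMD_MASK | DEMO_CMD_DONE_MASK;
	cmd = DEMO_ADDR_SET(cmd, addr);
	return cmd;
}
```

The two helpers agree whenever the old value has no bits set outside the
three fields, which is the case where skipping the read is harmless.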

Cheers
Kishon

Re: [PATCH RESEND v10 3/4] PHY: add APM X-Gene SoC 15Gbps Multi-purpose PHY driver

2014-02-26 Thread Loc Ho

Hi,

>> +
>> +static void sds_wr(void __iomem *csr_base, u32 indirect_cmd_reg,
>> +  u32 indirect_data_reg, u32 addr, u32 data)
>> +{
>> +   u32 val;
>> +   u32 cmd;
>> +
>> +   cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
>> +   cmd = CFG_IND_ADDR_SET(cmd, addr);
>
>
>
>
> This looks hacky. If 'CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK' should
> be set then it should be part of the second argument. From the macro
> 'CFG_IND_ADDR_SET' the first argument should be more like the current value
> present in the register, right? I feel the macro (CFG_IND_ADDR_SET) is not
> used in the way it is intended to be.



 The macro XXX_SET is intended to update a field within the register.
 The updated value is returned. The first assignment sets the other
 fields. Those two lines can be written as:

 cmd = 0;
 cmd |= CFG_IND_WR_CMD_MASK;==> Set the CMD bit
 cmd |= CFG_IND_CMD_DONE_MASK;==> Set the DONE bit
 cmd = CFG_IND_ADDR_SET(cmd, addr);===> Set the field ADDR
>>>
>>>
>>>
>>> #define  CFG_IND_ADDR_SET(dst, src) \
>>>  (((dst) & ~0x0030) | (((u32)(src)<<4) & 0x0030))
>>>
>>>  From this macro the first argument should be the present value in that
>>> register. Here you reset the address bits and write the new address bits.
>>
>>
>> Yes, this is correct. I am clearing x bits and then setting the new
>> value.
>>
>>> IMO the first argument should be the value in 'csr_base + indirect_cmd_reg'.
>>> So it resets the address bits in 'csr_base + indirect_cmd_reg' and writes
>>> the new address bits.
>>
>>
>> Yes, the above code does just that. In addition, I am also setting
>> the bits CFG_IND_WR_CMD_MASK and CFG_IND_CMD_DONE_MASK with the two
>> previous statements. Think of the code flow as follows:
>>
>> val = readl(some void * address); /* read the register */
>
>
> Where are you reading the register in your code (before CFG_IND_ADDR_SET)?

I am not reading the register because I will be setting all of its fields.
This example is to show how these macros are intended to be used.

>>
>> val = _SET(val, 0x1);/* set bit 0  - assuming  set
>> bit 0 only */
>
> If you want to set other bits (other than address) don't use
> CFG_IND_ADDR_SET macro. That looks hacky to me.

I am not. This example only shows how the macros can be used when there
are multiple fields to set. I need to set three fields: two 1-bit
fields and one 19-bit field. For the 1-bit fields, I just use the
mask. For the 19-bit field, I use the CFG_IND_ADDR_SET macro. Is there
any issue before I post an updated version?
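
As a minimal sketch of that usage (two 1-bit masks plus one multi-bit field
set through a *_SET macro), the following user-space C compares the
two-statement form with the expanded form. The DEMO_* names and mask values
are assumptions chosen for illustration: the 19-bit field is placed at bits
22:4 here, which may differ from the real register.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define DEMO_WR_CMD_MASK   0x00000001u  /* 1-bit field: write command */
#define DEMO_CMD_DONE_MASK 0x00000002u  /* 1-bit field: done flag */
#define DEMO_ADDR_SHIFT    4
#define DEMO_ADDR_MASK     0x007ffff0u  /* 19-bit field at bits 22:4 */

/* Clear the ADDR field in dst, then insert src; other bits of dst survive. */
#define DEMO_ADDR_SET(dst, src) \
	(((dst) & ~DEMO_ADDR_MASK) | \
	 (((uint32_t)(src) << DEMO_ADDR_SHIFT) & DEMO_ADDR_MASK))

/* Two-statement form, as in the patch under review. */
static uint32_t demo_build_cmd_short(uint32_t addr)
{
	uint32_t cmd;

	cmd = DEMO_WR_CMD_MASK | DEMO_CMD_DONE_MASK;
	cmd = DEMO_ADDR_SET(cmd, addr);
	return cmd;
}

/* Expanded form: start from zero and set one field per statement. */
static uint32_t demo_build_cmd_long(uint32_t addr)
{
	uint32_t cmd = 0;

	cmd |= DEMO_WR_CMD_MASK;        /* set the CMD bit */
	cmd |= DEMO_CMD_DONE_MASK;      /* set the DONE bit */
	cmd = DEMO_ADDR_SET(cmd, addr); /* set the ADDR field */
	return cmd;
}
```

Both helpers produce the same word for any address, and address bits that
do not fit in the field are masked off rather than spilling into
neighbouring bits.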

>
>> val = _SET(val, 0x1);  /* set bit 1 - assuming  set
>> bit 1 only */
>> val = _SET(val, 0x5);/* set upper 16 bit of the
>> register to 0x5 - assuming  set field of the upper 16 bits */
>>
>> Instead of writing the above, I am replacing the above 4 lines with these
>> two lines:
>>
>> cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
>> cmd = CFG_IND_ADDR_SET(cmd, addr);
>>
>> Is that clear?
>>

-Loc

Re: [PATCH 00/16] sleep_on removal, second try

2014-02-26 Thread Michael Schmitz


Arnd,



It's been a while since the first submission of these patches,
but a lot of them have made it into linux-next already, so here
is the stuff that is not merged yet, hopefully addressing all
the comments.

Geert and Michael: I was expecting the ataflop and atari_scsi
patches to be merged already, based on earlier discussion.
Can you apply them to the linux-m68k tree, or do you prefer
them to go through the scsi and block maintainers?


Not sure what we decided to do - I'd prefer to double-check the latest 
ones first, but I'd be OK with these to go via m68k.


Maybe Geert is waiting for acks from linux-scsi and linux-block? (The rest
of my patches to Atari SCSI still await comment there.)


Geert?

Regards,

Michael


Jens: I did not get any comments for the DAC960 and swim3 patches,
I assume they are good to go in. Please merge.

Hans and Mauro: As I commented on the old thread, I thought the
four media patches were on their way. I have addressed the one
comment that I missed earlier now, and used Hans' version for
the two patches he changed. Please merge or let me know the status
if you have already put them in some tree, but not yet into linux-next

Greg or Andrew: The parport subsystem is orphaned unfortunately,
can one of you pick up that patch?

Davem: The two ATM patches got acks, but I did not hear back from
Karsten regarding the ISDN patches. Can you pick up all six, or
should we wait for comments about the ISDN patches?

Arnd

Cc: Andrew Morton 
Cc: David S. Miller 
Cc: Geert Uytterhoeven 
Cc: Greg Kroah-Hartman 
Cc: Ingo Molnar 
Cc: "James E.J. Bottomley" 
Cc: Jens Axboe 
Cc: Karsten Keil 
Cc: Mauro Carvalho Chehab 
Cc: Michael Schmitz 
Cc: Peter Zijlstra 
Cc: linux-atm-gene...@lists.sourceforge.net
Cc: linux-me...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: net...@vger.kernel.org

Arnd Bergmann (16):
  ataflop: fix sleep_on races
  scsi: atari_scsi: fix sleep_on race
  DAC960: remove sleep_on usage
  swim3: fix interruptible_sleep_on race
  [media] omap_vout: avoid sleep_on race
  [media] usbvision: drop unused define USBVISION_SAY_AND_WAIT
  [media] radio-cadet: avoid interruptible_sleep_on race
  [media] arv: fix sleep_on race
  parport: fix interruptible_sleep_on race
  atm: nicstar: remove interruptible_sleep_on_timeout
  atm: firestream: fix interruptible_sleep_on race
  isdn: pcbit: fix interruptible_sleep_on race
  isdn: hisax/elsa: fix sleep_on race in elsa FSM
  isdn: divert, hysdn: fix interruptible_sleep_on race
  isdn: fix multiple sleep_on races
  sched: remove sleep_on() and friends

 Documentation/DocBook/kernel-hacking.tmpl    | 10 --
 drivers/atm/firestream.c                     |  4 +--
 drivers/atm/nicstar.c                        | 13
 drivers/block/DAC960.c                       | 34 ++--
 drivers/block/ataflop.c                      | 16 +-
 drivers/block/swim3.c                        | 18 ++-
 drivers/isdn/divert/divert_procfs.c          |  7 +++--
 drivers/isdn/hisax/elsa.c                    |  9 --
 drivers/isdn/hisax/elsa_ser.c                |  3 +-
 drivers/isdn/hysdn/hysdn_proclog.c           |  7 +++--
 drivers/isdn/i4l/isdn_common.c               | 13 +---
 drivers/isdn/pcbit/drv.c                     |  6 ++--
 drivers/media/platform/arv.c                 |  6 ++--
 drivers/media/platform/omap/omap_vout_vrfb.c |  3 +-
 drivers/media/radio/radio-cadet.c            | 46
 drivers/media/usb/usbvision/usbvision.h      |  8 -
 drivers/parport/share.c                      |  3 +-
 drivers/scsi/atari_scsi.c                    | 12 ++--
 include/linux/wait.h                         | 11 ---
 kernel/sched/core.c                          | 46
 20 files changed, 113 insertions(+), 162 deletions(-)

--
1.8.3.2



Re: [PATCH RESEND v10 3/4] PHY: add APM X-Gene SoC 15Gbps Multi-purpose PHY driver

2014-02-26 Thread Kishon Vijay Abraham I


On Thursday 27 February 2014 11:55 AM, Loc Ho wrote:

Hi,


+
+static void sds_wr(void __iomem *csr_base, u32 indirect_cmd_reg,
+  u32 indirect_data_reg, u32 addr, u32 data)
+{
+   u32 val;
+   u32 cmd;
+
+   cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
+   cmd = CFG_IND_ADDR_SET(cmd, addr);




This looks hacky. If 'CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK' should
be set then it should be part of the second argument. From the macro
'CFG_IND_ADDR_SET' the first argument should be more like the current value
present in the register right? I feel the macro (CFG_IND_ADDR_SET) is not
used in the way it is intended to.



The macro XXX_SET is intended to update a field within the register.
The updated value is returned. The first assignment sets the other
fields. Those two lines can be written as:

cmd = 0;
cmd |= CFG_IND_WR_CMD_MASK;==> Set the CMD bit
cmd |= CFG_IND_CMD_DONE_MASK;==> Set the DONE bit
cmd = CFG_IND_ADDR_SET(cmd, addr);===> Set the field ADDR



#define  CFG_IND_ADDR_SET(dst, src) \
 (((dst) & ~0x0030) | (((u32)(src)<<4) & 0x0030))

 From this macro the first argument should be the present value in that
register. Here you reset the address bits and write the new address bits.


Yes, this is correct. I am clearing x bits and then setting the new value.


IMO the first argument should be the value in 'csr_base + indirect_cmd_reg'.
So it resets the address bits in 'csr_base + indirect_cmd_reg' and writes
the new address bits.


Yes, the above code does just that. In addition, I am also setting
the bits CFG_IND_WR_CMD_MASK and CFG_IND_CMD_DONE_MASK with the two
previous statements. Think of the code flow as follows:

val = readl(some void * address); /* read the register */


Where are you reading the register in your code (before CFG_IND_ADDR_SET)?

val = _SET(val, 0x1);/* set bit 0  - assuming  set
bit 0 only */
If you want to set other bits (other than address) don't use 
CFG_IND_ADDR_SET macro. That looks hacky to me.



val = _SET(val, 0x1);  /* set bit 1 - assuming  set
bit 1 only */
val = _SET(val, 0x5);/* set upper 16 bit of the
register to 0x5 - assuming  set field of the upper 16 bits */

Instead of writing the above, I am replacing the above 4 lines with these
two lines:

cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
cmd = CFG_IND_ADDR_SET(cmd, addr);

Is that clear?

-Loc



Cheers
Kishon


Re: [PATCH RESEND v10 3/4] PHY: add APM X-Gene SoC 15Gbps Multi-purpose PHY driver

2014-02-26 Thread Loc Ho

Hi,

 +
 +static void sds_wr(void __iomem *csr_base, u32 indirect_cmd_reg,
 +  u32 indirect_data_reg, u32 addr, u32 data)
 +{
 +   u32 val;
 +   u32 cmd;
 +
 +   cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
 +   cmd = CFG_IND_ADDR_SET(cmd, addr);
>>>
>>>
>>>
>>> This looks hacky. If 'CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK' should
>>> be set then it should be part of the second argument. From the macro
>>> 'CFG_IND_ADDR_SET' the first argument should be more like the current value
>>> present in the register right? I feel the macro (CFG_IND_ADDR_SET) is not
>>> used in the way it is intended to.
>>
>>
>> The macro XXX_SET is intended to update a field within the register.
>> The updated value is returned. The first assignment sets the other
>> fields. Those two lines can be written as:
>>
>> cmd = 0;
>> cmd |= CFG_IND_WR_CMD_MASK;==> Set the CMD bit
>> cmd |= CFG_IND_CMD_DONE_MASK;==> Set the DONE bit
>> cmd = CFG_IND_ADDR_SET(cmd, addr);===> Set the field ADDR
>
>
> #define  CFG_IND_ADDR_SET(dst, src) \
> (((dst) & ~0x0030) | (((u32)(src)<<4) & 0x0030))
>
> From this macro the first argument should be the present value in that
> register. Here you reset the address bits and write the new address bits.

Yes, this is correct. I am clearing x bits and then setting the new value.

> IMO the first argument should be the value in 'csr_base + indirect_cmd_reg'.
> So it resets the address bits in 'csr_base + indirect_cmd_reg' and writes
> the new address bits.

Yes, the above code does just that. In addition, I am also setting
the bits CFG_IND_WR_CMD_MASK and CFG_IND_CMD_DONE_MASK with the two
previous statements. Think of the code flow as follows:

val = readl(some void * address); /* read the register */
val = _SET(val, 0x1);/* set bit 0  - assuming  set
bit 0 only */
val = _SET(val, 0x1);  /* set bit 1 - assuming  set
bit 1 only */
val = _SET(val, 0x5);/* set upper 16 bit of the
register to 0x5 - assuming  set field of the upper 16 bits */

Instead of writing the above, I am replacing the above 4 lines with these
two lines:

cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
cmd = CFG_IND_ADDR_SET(cmd, addr);

Is that clear?

-Loc

[PATCH] md / procfs: avoid Oops if md-mod removed while /proc/mdstat is being polled.

2014-02-26 Thread NeilBrown



If poll or select is waiting on /proc/mdstat when md-mod is unloaded
an oops will ensue when the poll/select completes.

This is because the wait_queue_head which is registered with poll_wait()
is local to the module and no longer exists when the poll completes and
detaches that wait_queue_head (in poll_free_wait -> remove_wait_queue).

To fix this we need the wait_queue_head to have (at least) the same life
time as the proc_dir_entry.  So this patch places it in that structure.

We:
  - add pde_poll_wait to struct proc_dir_entry
  - call poll_wait() passing this when poll() is called on the proc file
  - export a function proc_wake_up which will call wake_up() on pde_poll_wait

and make use of all that in md.c

Reported-by: "majianpeng" 
Signed-off-by: NeilBrown 

--

Do we have a maintainer for fs/proc ??
If I could get a couple of Acks, or constructive comments, on this,
I would  appreciate it.
Thanks,
NeilBrown


diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4ad5cc4e63e8..1bf70d9c55d3 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -193,12 +193,12 @@ EXPORT_SYMBOL_GPL(bio_clone_mddev);
  *  start array, stop array, error, add device, remove device,
  *  start build, activate spare
  */
-static DECLARE_WAIT_QUEUE_HEAD(md_event_waiters);
+static struct proc_dir_entry *mdstat_pde;
 static atomic_t md_event_count;
 void md_new_event(struct mddev *mddev)
 {
atomic_inc(&md_event_count);
-   wake_up(&md_event_waiters);
+   proc_wake_up(mdstat_pde);
 }
 EXPORT_SYMBOL_GPL(md_new_event);
 
@@ -208,7 +208,7 @@ EXPORT_SYMBOL_GPL(md_new_event);
 static void md_new_event_inintr(struct mddev *mddev)
 {
atomic_inc(&md_event_count);
-   wake_up(&md_event_waiters);
+   proc_wake_up(mdstat_pde);
 }
 
 /*
@@ -7187,8 +7187,6 @@ static unsigned int mdstat_poll(struct file *filp, poll_table *wait)
struct seq_file *seq = filp->private_data;
int mask;
 
-   poll_wait(filp, &md_event_waiters, wait);
-
/* always allow read */
mask = POLLIN | POLLRDNORM;
 
@@ -8557,7 +8555,7 @@ static void md_geninit(void)
 {
pr_debug("md: sizeof(mdp_super_t) = %d\n", (int)sizeof(mdp_super_t));
 
-   proc_create("mdstat", S_IRUGO, NULL, &md_seq_fops);
+   mdstat_pde = proc_create("mdstat", S_IRUGO, NULL, &md_seq_fops);
 }
 
 static int __init md_init(void)
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index b7f268eb5f45..c579da4cd765 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -357,6 +357,7 @@ static struct proc_dir_entry *__proc_create(struct proc_dir_entry **parent,
atomic_set(&ent->count, 1);
spin_lock_init(&ent->pde_unload_lock);
INIT_LIST_HEAD(&ent->pde_openers);
+   init_waitqueue_head(&ent->pde_poll_wait);
 out:
return ent;
 }
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 124fc43c7090..353fc199e8b5 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -234,13 +234,21 @@ static unsigned int proc_reg_poll(struct file *file, struct poll_table_struct *p
unsigned int (*poll)(struct file *, struct poll_table_struct *);
if (use_pde(pde)) {
poll = pde->proc_fops->poll;
-   if (poll)
+   if (poll) {
+   poll_wait(file, &pde->pde_poll_wait, pts);
rv = poll(file, pts);
+   }
unuse_pde(pde);
}
return rv;
 }
 
+void proc_wake_up(struct proc_dir_entry *pde)
+{
+   wake_up(&pde->pde_poll_wait);
+}
+EXPORT_SYMBOL_GPL(proc_wake_up);
+
 static long proc_reg_unlocked_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
struct proc_dir_entry *pde = PDE(file_inode(file));
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 651d09a11dde..6f9f84eecded 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -46,6 +46,7 @@ struct proc_dir_entry {
struct completion *pde_unload_completion;
struct list_head pde_openers;   /* who did ->open, but not ->release */
spinlock_t pde_unload_lock; /* proc_fops checks and pde_users bumps */
+   wait_queue_head_t pde_poll_wait; /* For proc_reg_poll */
u8 namelen;
char name[];
 };
diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index 608e60a74c3c..a4a3d5f001ef 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -34,6 +34,7 @@ static inline struct proc_dir_entry *proc_create(
return proc_create_data(name, mode, parent, proc_fops, NULL);
 }
 
+extern void proc_wake_up(struct proc_dir_entry *pde);
 extern void proc_set_size(struct proc_dir_entry *, loff_t);
 extern void proc_set_user(struct proc_dir_entry *, kuid_t, kgid_t);
 extern void *PDE_DATA(const struct inode *);
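
The lifetime argument behind this patch can be illustrated with a toy
user-space model. Everything here is a hypothetical stand-in: the demo_*
types only count waiters and wake-ups and do no real sleeping or waking.
What it shows is the ownership choice: the wait head is a member of the
entry, so it exists exactly as long as the entry does, and the module only
ever reaches it through the entry.

```c
#include <stddef.h>

/* Toy stand-in for wait_queue_head_t: counts waiters and wake-ups only. */
struct demo_wait_head {
	unsigned int waiters;
	unsigned int wakeups;
};

/*
 * Toy stand-in for struct proc_dir_entry. Because poll_wait is embedded,
 * a poller that registered on it can always detach from it while the
 * entry is alive, regardless of what happens to the module.
 */
struct demo_proc_entry {
	const char *name;
	struct demo_wait_head poll_wait;
};

static void demo_entry_init(struct demo_proc_entry *e, const char *name)
{
	e->name = name;
	e->poll_wait.waiters = 0;
	e->poll_wait.wakeups = 0;
}

/* Poller side: register on the entry's own wait head. */
static void demo_poll_wait(struct demo_proc_entry *e)
{
	e->poll_wait.waiters++;
}

/* Poller side: detach again when the poll completes. */
static void demo_poll_free_wait(struct demo_proc_entry *e)
{
	e->poll_wait.waiters--;
}

/* Module side: the only way to signal pollers is through the entry. */
static void demo_proc_wake_up(struct demo_proc_entry *e)
{
	e->poll_wait.wakeups++;
}
```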



Re: [PATCH resend] clk: axi-clkgen: Add support for v2

2014-02-26 Thread Lars-Peter Clausen


On 02/27/2014 02:04 AM, Mike Turquette wrote:

Quoting Lars-Peter Clausen (2014-02-17 01:31:53)

This patch adds support for the new v2 version of the axi-clkgen core.
Unfortunately the method of accessing the registers is quite different on v2,
while the content still stays largely the same. So the patch adds a small
abstraction layer which implements the specific read and write functions for v1
and v2 in callback functions.


Hi,

This patch almost doubles the size of clk-axi-clkgen.c. Should it be a
separate clock driver? I guess that depends on the relationship between
"v1" and "v2". Are both of those versions of the clkgen core going into
production?


Hi,

The only thing that is different between the two versions is how the PLL
registers are accessed. The content that is written to those registers is
100% identical. So splitting it up into two drivers makes no sense, since
you'd have to copy all the application logic. Both versions of the
core can be found in the wild.
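
A rough user-space sketch of that split, for illustration: shared logic
calls through an ops table, and only the register access differs per
version. All demo_* names are hypothetical, the register file is a plain
array, and the way the v2-style path packs address and data into a control
word is an assumption made for this sketch, not the core's documented
format. Only the SEL and READ bit positions follow the defines in the
patch.

```c
#include <stdint.h>

#define DEMO_REG_CLKOUT     0x08u
#define DEMO_DRP_CNTRL_SEL  (1u << 29)
#define DEMO_DRP_CNTRL_READ (1u << 28)

struct demo_clkgen;

/* Per-version accessors; everything above this layer is shared. */
struct demo_mmcm_ops {
	void (*write)(struct demo_clkgen *cg, unsigned int reg, uint16_t val);
	uint16_t (*read)(struct demo_clkgen *cg, unsigned int reg);
};

struct demo_clkgen {
	const struct demo_mmcm_ops *ops;
	uint16_t regs[256];  /* simulated PLL register file */
	uint32_t drp_cntrl;  /* simulated control word, v2-style path only */
};

/* v1 style: each PLL register is addressed directly. */
static void demo_v1_write(struct demo_clkgen *cg, unsigned int reg, uint16_t val)
{
	cg->regs[reg & 0xffu] = val;
}

static uint16_t demo_v1_read(struct demo_clkgen *cg, unsigned int reg)
{
	return cg->regs[reg & 0xffu];
}

/* v2 style: address and data go through one control word (assumed packing). */
static void demo_v2_write(struct demo_clkgen *cg, unsigned int reg, uint16_t val)
{
	cg->drp_cntrl = DEMO_DRP_CNTRL_SEL | ((uint32_t)(reg & 0xffu) << 16) | val;
	/* Simulated hardware side: unpack the control word into the register file. */
	cg->regs[(cg->drp_cntrl >> 16) & 0xffu] = (uint16_t)(cg->drp_cntrl & 0xffffu);
}

static uint16_t demo_v2_read(struct demo_clkgen *cg, unsigned int reg)
{
	cg->drp_cntrl = DEMO_DRP_CNTRL_SEL | DEMO_DRP_CNTRL_READ |
			((uint32_t)(reg & 0xffu) << 16);
	return cg->regs[(cg->drp_cntrl >> 16) & 0xffu];
}

static const struct demo_mmcm_ops demo_v1_ops = { demo_v1_write, demo_v1_read };
static const struct demo_mmcm_ops demo_v2_ops = { demo_v2_write, demo_v2_read };

/* Shared logic: identical for both versions, only the ops differ. */
static int demo_program_clkout(struct demo_clkgen *cg, uint16_t val)
{
	cg->ops->write(cg, DEMO_REG_CLKOUT, val);
	return cg->ops->read(cg, DEMO_REG_CLKOUT) == val ? 0 : -1;
}
```

Selecting demo_v1_ops or demo_v2_ops at probe time (for example from the
compatible string) would leave demo_program_clkout() untouched.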


- Lars



Regards,
Mike



Signed-off-by: Lars-Peter Clausen 
---
  .../devicetree/bindings/clock/axi-clkgen.txt   |   2 +-
  drivers/clk/clk-axi-clkgen.c   | 312 ++---
  2 files changed, 270 insertions(+), 44 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/axi-clkgen.txt 
b/Documentation/devicetree/bindings/clock/axi-clkgen.txt
index 028b493..20e1704 100644
--- a/Documentation/devicetree/bindings/clock/axi-clkgen.txt
+++ b/Documentation/devicetree/bindings/clock/axi-clkgen.txt
@@ -5,7 +5,7 @@ This binding uses the common clock binding[1].
  [1] Documentation/devicetree/bindings/clock/clock-bindings.txt

  Required properties:
-- compatible : shall be "adi,axi-clkgen".
+- compatible : shall be "adi,axi-clkgen-1.00.a" or "adi,axi-clkgen-2.00.a".
  - #clock-cells : from common clock binding; Should always be set to 0.
  - reg : Address and length of the axi-clkgen register set.
  - clocks : Phandle and clock specifier for the parent clock.
diff --git a/drivers/clk/clk-axi-clkgen.c b/drivers/clk/clk-axi-clkgen.c
index 8137327..1127ee4 100644
--- a/drivers/clk/clk-axi-clkgen.c
+++ b/drivers/clk/clk-axi-clkgen.c
@@ -17,23 +17,75 @@
  #include 
  #include 

-#define AXI_CLKGEN_REG_UPDATE_ENABLE   0x04
-#define AXI_CLKGEN_REG_CLK_OUT1        0x08
-#define AXI_CLKGEN_REG_CLK_OUT2        0x0c
-#define AXI_CLKGEN_REG_CLK_DIV         0x10
-#define AXI_CLKGEN_REG_CLK_FB1         0x14
-#define AXI_CLKGEN_REG_CLK_FB2         0x18
-#define AXI_CLKGEN_REG_LOCK1           0x1c
-#define AXI_CLKGEN_REG_LOCK2           0x20
-#define AXI_CLKGEN_REG_LOCK3           0x24
-#define AXI_CLKGEN_REG_FILTER1         0x28
-#define AXI_CLKGEN_REG_FILTER2         0x2c
+#define AXI_CLKGEN_V1_REG_UPDATE_ENABLE        0x04
+#define AXI_CLKGEN_V1_REG_CLK_OUT1     0x08
+#define AXI_CLKGEN_V1_REG_CLK_OUT2     0x0c
+#define AXI_CLKGEN_V1_REG_CLK_DIV      0x10
+#define AXI_CLKGEN_V1_REG_CLK_FB1      0x14
+#define AXI_CLKGEN_V1_REG_CLK_FB2      0x18
+#define AXI_CLKGEN_V1_REG_LOCK1        0x1c
+#define AXI_CLKGEN_V1_REG_LOCK2        0x20
+#define AXI_CLKGEN_V1_REG_LOCK3        0x24
+#define AXI_CLKGEN_V1_REG_FILTER1      0x28
+#define AXI_CLKGEN_V1_REG_FILTER2      0x2c
+
+#define AXI_CLKGEN_V2_REG_RESET        0x40
+#define AXI_CLKGEN_V2_REG_DRP_CNTRL    0x70
+#define AXI_CLKGEN_V2_REG_DRP_STATUS   0x74
+
+#define AXI_CLKGEN_V2_RESET_MMCM_ENABLE        BIT(1)
+#define AXI_CLKGEN_V2_RESET_ENABLE     BIT(0)
+
+#define AXI_CLKGEN_V2_DRP_CNTRL_SEL    BIT(29)
+#define AXI_CLKGEN_V2_DRP_CNTRL_READ   BIT(28)
+
+#define AXI_CLKGEN_V2_DRP_STATUS_BUSY  BIT(16)
+#define MMCM_REG_CLKOUT0_1 0x08
+#define MMCM_REG_CLKOUT0_2 0x09
+#define MMCM_REG_CLK_FB1   0x14
+#define MMCM_REG_CLK_FB2   0x15
+#define MMCM_REG_CLK_DIV   0x16
+#define MMCM_REG_LOCK1 0x18
+#define MMCM_REG_LOCK2 0x19
+#define MMCM_REG_LOCK3 0x1a
+#define MMCM_REG_FILTER1   0x4e
+#define MMCM_REG_FILTER2   0x4f
+
+struct axi_clkgen;
+
+struct axi_clkgen_mmcm_ops {
+   void (*enable)(struct axi_clkgen *axi_clkgen, bool enable);
+   int (*write)(struct axi_clkgen *axi_clkgen, unsigned int reg,
+unsigned int val, unsigned int mask);
+   int (*read)(struct axi_clkgen *axi_clkgen, unsigned int reg,
+   unsigned int *val);
+};

  struct axi_clkgen {
 void __iomem *base;
+   const struct axi_clkgen_mmcm_ops *mmcm_ops;
 struct clk_hw clk_hw;
  };

+static void axi_clkgen_mmcm_enable(struct axi_clkgen *axi_clkgen,
+   bool enable)
+{
+   axi_clkgen->mmcm_ops->enable(axi_clkgen, enable);
+}
+
+static int axi_clkgen_mmcm_write(struct axi_clkgen *axi_clkgen,
+   unsigned int reg, unsigned int val, unsigned int mask)
+{
+   return axi_clkgen->mmcm_ops->write(axi_clkgen, reg, val, mask);
+}
+
+static int axi_clkgen_mmcm_read(struct axi_clkgen *axi_clkgen,
+   unsigned int reg,

linux-next: Tree for Feb 27

2014-02-26 Thread Stephen Rothwell

Hi all,

This tree fails (more than usual) the powerpc allyesconfig build.

Changes since 20140226:

The powerpc tree still had its build failure.

The libata tree lost its build failure.

The mfd-lj tree still had its build failure so I used the version from
next-20140210.

The drm-tegra tree gained a build failure so I used the version from
next-20140226.

The wireless-next tree still had its build failure so I used the version
from next-20140224.

I reverted a commit from the tip tree due to a reported regression.

The pwm tree gained a conflict against the usb tree.

I added some supplied patches to the akpm-current tree to address its
build failure.

Non-merge commits (relative to Linus' tree): 4818
 4730 files changed, 179975 insertions(+), 101515 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64 and a
multi_v7_defconfig for arm. After the final fixups (if any), it is also
built with powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig and
allyesconfig (minus CONFIG_PROFILE_ALL_BRANCHES - this fails its final
link) and i386, sparc, sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

I am currently merging 210 trees (counting Linus' and 28 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (d2a0476307e6 Merge branch 'akpm' (patches from Andrew Morton))
Merging fixes/master (b0031f227e47 Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator)
Merging kbuild-current/rc-fixes (38dbfb59d117 Linus 3.14-rc1)
Merging arc-current/for-curr (7e22e91102c6 Linux 3.13-rc8)
Merging arm-current/fixes (b36345759308 ARM: 7980/1: kernel: improve error message when LPAE config doesn't match CPU)
Merging m68k-current/for-linus (7247f55381d5 m68k: Wire up sched_setattr and sched_getattr)
Merging metag-fixes/fixes (f229006ec6be irq-metag*: stop set_affinity vectoring to offline cpus)
Merging powerpc-merge/merge (66f9af83e56b powerpc/eeh: Disable EEH on reboot)
Merging sparc/master (10527106abec Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux)
Merging net/master (ee6154e11eec bonding: fix a div error caused by the slave release path)
Merging ipsec/master (3a9016f97fdc xfrm: Fix unlink race when policies are deleted.)
Merging sound-current/for-linus (fce0a0c72618 ALSA: hda/realtek - Add more entry for enable HP mute led)
Merging pci-current/for-linus (fc40363b2140 ahci: Fix broken fallback to single MSI mode)
Merging wireless/master (b7b146c9c9a0 ath9k: fix invalid descriptor discarding)
Merging driver-core.current/driver-core-linus (fed95bab8d29 sysfs: fix namespace refcnt leak)
Merging tty.current/tty-linus (cfbf8d4857c2 Linux 3.14-rc4)
Merging usb.current/usb-linus (cfbf8d4857c2 Linux 3.14-rc4)
Merging staging.current/staging-linus (260ea9c2e2d3 staging: r8188eu: Add new device ID)
Merging char-misc.current/char-misc-linus (58e868be77bd Merge 3.14-rc4 into char-misc-linus)
Merging input-current/for-linus (70b0052425ff Input: da9052_onkey - use correct register bit for key status)
Merging md-current/for-linus (d47648fcf061 raid5: avoid finding "discard" stripe)
Merging crypto-current/master (ee97dc7db4cb crypto: s390 - fix des and des3_ede ctr concurrency issue)
Merging ide/master (738b52bb9845 Merge tag 'microblaze-3.14-rc3' of git://git.monstr.eu/linux-2.6-microblaze)
Merging dwmw2/master (5950f0803ca9 pcmcia: remove RPX board stuff)
Merging devicetree-current/devicetree/merge (1f42e5dd5065 of: Add self test

Re: [PATCH RESEND v10 3/4] PHY: add APM X-Gene SoC 15Gbps Multi-purpose PHY driver

2014-02-26 Thread Kishon Vijay Abraham I


On Thursday 27 February 2014 02:15 AM, Loc Ho wrote:

Hi,



+config PHY_XGENE
+   tristate "APM X-Gene 15Gbps PHY support"
+   depends on ARM64 || COMPILE_TEST
+   select GENERIC_PHY



depends on HAS_IOMEM and CONFIG_OF..


I will make it depend on "HAS_IOMEM && OF && (ARM64 || COMPILE_TEST)".



+/* PLL Clock Macro Unit (CMU) CSR accessing from SDS indirectly */
+#define CMU_REG0   0x0
+#define  CMU_REG0_PLL_REF_SEL_MASK 0x2000
+#define  CMU_REG0_PLL_REF_SEL_SET(dst, src)\
+   (((dst) & ~0x2000) | (((u32)(src) << 0xd) & 0x2000))



using decimals for shift value would be better. No strong feeling though.


I will change it to a decimal value.


+/*
+ * For chip earlier than A3 version, enable this flag.
+ * To enable, pass boot argument phy_xgene.preA3Chip=1
+ */
+static int preA3Chip;
+MODULE_PARM_DESC(preA3Chip, "Enable pre-A3 chip support (1=enable 0=disable)");
+module_param_named(preA3Chip, preA3Chip, int, 0444);



Do we need to have module param for this? I mean we can differentiate between
different chip versions in dt data only.


This is only required for the short term. Once all the pre-A3 systems
are replaced, there won't be a need for this. For those who still have a
pre-A3 silicon system, this would provide a short-term solution.
DT isn't quite correct here, since this is a global setting. I guess I
can OR all nodes. If it is still better to put it in the DT, let me know
and I will move it.


+
+static void sds_wr(void __iomem *csr_base, u32 indirect_cmd_reg,
+  u32 indirect_data_reg, u32 addr, u32 data)
+{
+   u32 val;
+   u32 cmd;
+
+   cmd = CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK;
+   cmd = CFG_IND_ADDR_SET(cmd, addr);



This looks hacky. If 'CFG_IND_WR_CMD_MASK | CFG_IND_CMD_DONE_MASK' should be 
set then it should be part of the second argument. From the macro 
'CFG_IND_ADDR_SET' the first argument should be more like the current value 
present in the register right? I feel the macro (CFG_IND_ADDR_SET) is not used 
in the way it is intended to.


The XXX_SET macros are intended to update a field within the register
value; the updated value is returned. The first assignment line sets
other fields. Those two lines can be written as:

cmd = 0;
cmd |= CFG_IND_WR_CMD_MASK;         /* set the CMD bit */
cmd |= CFG_IND_CMD_DONE_MASK;       /* set the DONE bit */
cmd = CFG_IND_ADDR_SET(cmd, addr);  /* set the ADDR field */


#define  CFG_IND_ADDR_SET(dst, src) \
(((dst) & ~0x0030) | (((u32)(src)<<4) & 0x0030))

From this macro, the first argument should be the present value of that
register: the macro resets the address bits and writes the new ones.
IMO the first argument should be the value read from 'csr_base +
indirect_cmd_reg', so that the address bits of 'csr_base +
indirect_cmd_reg' are reset and the new address bits are written.
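
The read-modify-write behaviour of such a field macro can be sketched in
plain C. This is only an illustration: the mask and shift are the ones
quoted above, while ind_addr_set(), build_wr_cmd() and the two command bit
values are hypothetical names and values picked for the sketch, not taken
from the driver.

```c
#include <stdint.h>

/* Mask and shift as quoted in the thread for the ADDR field. */
#define IND_ADDR_MASK    0x0030u
#define IND_ADDR_SHIFT   4

/* Hypothetical bit positions, for illustration only. */
#define IND_WR_CMD_BIT   0x1u
#define IND_CMD_DONE_BIT 0x2u

/* Clear the ADDR field in dst, then place src into it; other bits survive. */
static uint32_t ind_addr_set(uint32_t dst, uint32_t src)
{
	return (dst & ~IND_ADDR_MASK) | ((src << IND_ADDR_SHIFT) & IND_ADDR_MASK);
}

/* Build a command word from scratch instead of from a register read. */
static uint32_t build_wr_cmd(uint32_t addr)
{
	uint32_t cmd = IND_WR_CMD_BIT | IND_CMD_DONE_BIT;

	return ind_addr_set(cmd, addr);
}
```

Either way only the ADDR bits are replaced; the open question in the thread
is whether dst should start from a fresh value, as in build_wr_cmd(), or
from the current register content.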


Thanks
Kishon
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 0/2] pwm-backlight: switch to gpiod interface (part 1)

2014-02-26 Thread Alexandre Courbot

These two patches initiate the switch of the pwm-backlight driver to
the gpiod GPIO interface, as it considerably simplifies the code.

For compatibility with current users of the driver, it is still possible
to pass the enable GPIO number as platform data. Two platforms still
rely on this feature (pxa/palmtc and shmobile/armadillo800eva); it
will be removed as soon as these last users are switched to GPIO mapping
tables.

Alexandre Courbot (2):
  ARM: SAMSUNG: remove gpio flags in dev-backlight
  pwm-backlight: switch to gpiod interface

 arch/arm/plat-samsung/dev-backlight.c |  2 -
 drivers/video/backlight/pwm_bl.c  | 72 +++
 include/linux/pwm_backlight.h |  5 +--
 3 files changed, 32 insertions(+), 47 deletions(-)

-- 
1.9.0


[PATCH 1/2] ARM: SAMSUNG: remove gpio flags in dev-backlight

2014-02-26 Thread Alexandre Courbot

The pwm-backlight driver is moving to use the gpiod interface,
which has its own mapping mechanism for platform data GPIOs.
These mappings carry GPIO properties like active low so they don't have
to be explicitly handled by GPIO consumers.

Because of this change, the enable_gpio_flags member of
platform_pwm_backlight_data is going away. dev-backlight was passing
this member, but had no user making use of it, so it can safely be
removed. Further GPIOs used by pwm-backlight are expected to be
defined using the mechanisms provided by the gpiod API.

Signed-off-by: Alexandre Courbot 
---
 arch/arm/plat-samsung/dev-backlight.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm/plat-samsung/dev-backlight.c 
b/arch/arm/plat-samsung/dev-backlight.c
index be4ad0b21c08..2157c5b539e6 100644
--- a/arch/arm/plat-samsung/dev-backlight.c
+++ b/arch/arm/plat-samsung/dev-backlight.c
@@ -124,8 +124,6 @@ void __init samsung_bl_set(struct samsung_bl_gpio_info 
*gpio_info,
samsung_bl_data->pwm_period_ns = bl_data->pwm_period_ns;
if (bl_data->enable_gpio >= 0)
samsung_bl_data->enable_gpio = bl_data->enable_gpio;
-   if (bl_data->enable_gpio_flags)
-   samsung_bl_data->enable_gpio_flags = bl_data->enable_gpio_flags;
if (bl_data->init)
samsung_bl_data->init = bl_data->init;
if (bl_data->notify)
-- 
1.9.0


[PATCH 2/2] pwm-backlight: switch to gpiod interface

2014-02-26 Thread Alexandre Courbot

Switch to the new gpiod interface, which handles GPIO properties such
as active low transparently and removes a whole bunch of code.

There are still a couple of users of this driver that rely on passing
the enable GPIO number through platform data, so a fallback mechanism
using a GPIO number is still available to avoid breaking them. It will
be removed once current users have switched to the GPIO lookup tables
provided by the gpiod interface.
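
As a rough userspace illustration of what "transparently" means here: the
consumer writes a logical value and the descriptor decides the physical
line level. The struct and function names below are hypothetical stand-ins
for the sketch, not the gpiod API itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a GPIO descriptor carrying its polarity. */
struct model_gpio_desc {
	bool active_low;	/* property attached to the mapping, not the consumer */
	int physical;		/* level that would be driven on the line */
};

/* The consumer passes a logical value; polarity is applied here, once. */
static void model_gpiod_set_value(struct model_gpio_desc *desc, int logical)
{
	int value = !!logical;

	desc->physical = desc->active_low ? !value : value;
}
```

With the polarity stored in the descriptor, the consumer no longer needs
the if/else on PWM_BACKLIGHT_GPIO_ACTIVE_LOW that this patch removes.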

Signed-off-by: Alexandre Courbot 
---
 drivers/video/backlight/pwm_bl.c | 72 +---
 include/linux/pwm_backlight.h|  5 +--
 2 files changed, 32 insertions(+), 45 deletions(-)

diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index b75201ff46f6..533057688d93 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -10,8 +10,8 @@
  * published by the Free Software Foundation.
  */
 
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -32,8 +32,7 @@ struct pwm_bl_data {
unsigned int*levels;
boolenabled;
struct regulator*power_supply;
-   int enable_gpio;
-   unsigned long   enable_gpio_flags;
+   struct gpio_desc*enable_gpio;
unsigned intscale;
int (*notify)(struct device *,
  int brightness);
@@ -54,12 +53,8 @@ static void pwm_backlight_power_on(struct pwm_bl_data *pb, 
int brightness)
if (err < 0)
dev_err(pb->dev, "failed to enable power supply\n");
 
-   if (gpio_is_valid(pb->enable_gpio)) {
-   if (pb->enable_gpio_flags & PWM_BACKLIGHT_GPIO_ACTIVE_LOW)
-   gpio_set_value(pb->enable_gpio, 0);
-   else
-   gpio_set_value(pb->enable_gpio, 1);
-   }
+   if (pb->enable_gpio)
+   gpiod_set_value(pb->enable_gpio, 1);
 
pwm_enable(pb->pwm);
pb->enabled = true;
@@ -73,12 +68,8 @@ static void pwm_backlight_power_off(struct pwm_bl_data *pb)
pwm_config(pb->pwm, 0, pb->period);
pwm_disable(pb->pwm);
 
-   if (gpio_is_valid(pb->enable_gpio)) {
-   if (pb->enable_gpio_flags & PWM_BACKLIGHT_GPIO_ACTIVE_LOW)
-   gpio_set_value(pb->enable_gpio, 1);
-   else
-   gpio_set_value(pb->enable_gpio, 0);
-   }
+   if (pb->enable_gpio)
+   gpiod_set_value(pb->enable_gpio, 0);
 
regulator_disable(pb->power_supply);
pb->enabled = false;
@@ -148,7 +139,6 @@ static int pwm_backlight_parse_dt(struct device *dev,
  struct platform_pwm_backlight_data *data)
 {
struct device_node *node = dev->of_node;
-   enum of_gpio_flags flags;
struct property *prop;
int length;
u32 value;
@@ -189,14 +179,6 @@ static int pwm_backlight_parse_dt(struct device *dev,
data->max_brightness--;
}
 
-   data->enable_gpio = of_get_named_gpio_flags(node, "enable-gpios", 0,
-   &flags);
-   if (data->enable_gpio == -EPROBE_DEFER)
-   return -EPROBE_DEFER;
-
-   if (gpio_is_valid(data->enable_gpio) && (flags & OF_GPIO_ACTIVE_LOW))
-   data->enable_gpio_flags |= PWM_BACKLIGHT_GPIO_ACTIVE_LOW;
-
return 0;
 }
 
@@ -256,8 +238,6 @@ static int pwm_backlight_probe(struct platform_device *pdev)
} else
pb->scale = data->max_brightness;
 
-   pb->enable_gpio = data->enable_gpio;
-   pb->enable_gpio_flags = data->enable_gpio_flags;
pb->notify = data->notify;
pb->notify_after = data->notify_after;
pb->check_fb = data->check_fb;
@@ -265,26 +245,39 @@ static int pwm_backlight_probe(struct platform_device 
*pdev)
pb->dev = &pdev->dev;
pb->enabled = false;
 
-   if (gpio_is_valid(pb->enable_gpio)) {
-   unsigned long flags;
-
-   if (pb->enable_gpio_flags & PWM_BACKLIGHT_GPIO_ACTIVE_LOW)
-   flags = GPIOF_OUT_INIT_HIGH;
-   else
-   flags = GPIOF_OUT_INIT_LOW;
+   pb->enable_gpio = devm_gpiod_get(&pdev->dev, "enable");
+   if (IS_ERR(pb->enable_gpio)) {
+   ret = PTR_ERR(pb->enable_gpio);
+   if (ret == -ENOENT) {
+   pb->enable_gpio = NULL;
+   ret = 0;
+   } else {
+   goto err_alloc;
+   }
+   }
 
-   ret = gpio_request_one(pb->enable_gpio, flags, "enable");
+   /*
+* Compatibility fallback for drivers still using the integer GPIO
+* platform data. Must go away soon.
+*/
+   if (pb->enable_gpio == NULL && gpio_is_valid(data->enable_gpio)) {
+   ret = devm_gpio_request_one(&pdev->dev,

Re: linux-next: build failure after merge of the char-misc tree

2014-02-26 Thread Stephen Rothwell

Hi Greg,

On Wed, 26 Feb 2014 19:37:16 -0800 Greg KH  wrote:
>
> On Wed, Feb 26, 2014 at 05:47:21PM +1100, Stephen Rothwell wrote:
> > 
> > On Fri, 21 Feb 2014 16:47:11 +1100 Stephen Rothwell  
> > wrote:
> > >
> > > After merging the char-misc tree, today's linux-next build (x86_64
> > > allmodconfig) failed like this:
> > > 
> > > In file included from drivers/misc/mei/hw-txe.c:25:0:
> > > drivers/misc/mei/hw-txe.h:63:1: error: unknown type name 'irqreturn_t'
> > >  irqreturn_t mei_txe_irq_quick_handler(int irq, void *dev_id);
> > >  ^
> > > 
> > > Caused by commit 266f6178d1f1 ("mei: txe: add hw-txe.h header file") but
> > > probably exposed by commit 46cb7b1bd86f ("PCI: Remove unused SR-IOV VF
> > > Migration support") from the pci tree which removed the include of
> > > irqreturn.h from pci.h ...
> > > 
> > > See Rule 1 from Documentation/SubmitChecklist ...
> > > 
> > > I added the following merge fix patch (this should be applied to the
> > > char-misc tree):
> > 
> > Ping?
> 
> I've merged everything together, and it all builds properly for me in
> the char-misc branches, so I don't see what is missing.  What did I do
> wrong?

Nothing, your tree is fine, except when merged with the pci tree.  There
is a commit in the pci tree that removed the include of irqreturn.h from
pci.h, thus exposing that drivers/misc/mei/hw-txe.c did not include
irqreturn.h directly despite using stuff from there (similarly for
hw-txe.h).  My patch is just a "quality of implementation" thing in your
tree at the moment, but applying it to your tree will save doing the
semantic merge conflict fixup in linux-next and later in Linus' tree when
your tree and the pci tree meet there.

i.e. it does not hurt your tree to apply it and will save us forgetting
later.
-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au



Re: [PATCH 3/3] cpufreq: stats: Refactor common code into __cpufreq_stats_create_table()

2014-02-26 Thread Viresh Kumar

On 27 February 2014 01:47, Saravana Kannan  wrote:
> cpufreq_frequency_get_table() is called from all callers of
> __cpufreq_stats_create_table(). So, move it inside.
>
> Suggested-by: Viresh Kumar 
> Signed-off-by: Saravana Kannan 
> ---
>  drivers/cpufreq/cpufreq_stats.c | 22 +-
>  1 file changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> index c52b440..9d9e366 100644
> --- a/drivers/cpufreq/cpufreq_stats.c
> +++ b/drivers/cpufreq/cpufreq_stats.c
> @@ -180,13 +180,18 @@ static void cpufreq_stats_free_table(unsigned int cpu)
> cpufreq_cpu_put(policy);
>  }
>
> -static int __cpufreq_stats_create_table(struct cpufreq_policy *policy,
> -   struct cpufreq_frequency_table *table)
> +static int __cpufreq_stats_create_table(struct cpufreq_policy *policy)
>  {
> unsigned int i, j, count = 0, ret = 0;
> struct cpufreq_stats *stat;
> unsigned int alloc_size;
> unsigned int cpu = policy->cpu;
> +   struct cpufreq_frequency_table *table;
> +
> +   table = cpufreq_frequency_get_table(policy->cpu);

s/policy->cpu/cpu

Otherwise, Acked-by: Viresh Kumar 

Re: [PATCH 1/2] x86: Mark __vdso entries as asmlinkage

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 09:19 PM, Andy Lutomirski wrote:
>>
>> The normal ABI almost certainly makes more sense; as such -mregparm=3 is
>> probably not what we want, and I suspect it makes more sense to just
>> drop that from the CFLAGS line?
> 
> Hmm.  What happens on a native 32-bit build?  IIRC the whole kernel is
> built with regparm(3).
> 

Well, the vdso is still built separately, so we can use different CFLAGS
if we want to.

> If we want to save a cycle or two, then regparm(3) is probably faster.
>  But I think that these functions should either be asmlinkage or (on
> 32 bit builds) explicitly regparm(3) to avoid confusion.

I suggest using the standard ABI, but I suggest doing it via CFLAGS.

It isn't any faster if the C library has to provide a wrapper just to
marshal parameters.

-hpa



Re: [PATCH 1/2] asm-generic: rwsem: ensure sem->cnt is only accessed via atomic_long_*

2014-02-26 Thread Davidlohr Bueso

On Fri, 2014-02-21 at 17:22 +, Will Deacon wrote:
> The asm-generic rwsem implementation directly accesses sem->cnt when
> performing a __down_read_trylock operation. Whilst this is probably safe
> on all architectures, we should stick to the atomic_long_* API and use
> atomic_long_read instead.
> 
> Signed-off-by: Will Deacon 
> ---
>  include/asm-generic/rwsem.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/asm-generic/rwsem.h b/include/asm-generic/rwsem.h
> index bb1e2cdeb9bf..75af612f54f8 100644
> --- a/include/asm-generic/rwsem.h
> +++ b/include/asm-generic/rwsem.h
> @@ -41,7 +41,7 @@ static inline int __down_read_trylock(struct rw_semaphore 
> *sem)
>  {
>   long tmp;
>  
> - while ((tmp = sem->count) >= 0) {
> + while ((tmp = atomic_long_read((atomic_long_t *)&sem->count)) >= 0) {

That's pretty ugly, how about having an infinite loop and just doing the tmp
assignment separately from the conditional?

It also looks like a cpu_relax() could help here between iterations.
Other than that:

Reviewed-by: Davidlohr Bueso 
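
A rough userspace sketch of the loop shape suggested above, using C11
atomics in place of atomic_long_read() and cmpxchg(). struct model_rwsem,
READ_BIAS and model_down_read_trylock() are assumptions made for the
sketch; it omits cpu_relax() and is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define READ_BIAS 1L	/* stands in for RWSEM_ACTIVE_READ_BIAS */

/* Hypothetical userspace stand-in for struct rw_semaphore. */
struct model_rwsem {
	atomic_long count;	/* negative while a writer is involved */
};

static bool model_down_read_trylock(struct model_rwsem *sem)
{
	for (;;) {
		/* The read is its own statement, not buried in the loop condition. */
		long tmp = atomic_load(&sem->count);

		if (tmp < 0)
			return false;

		/* Succeeds only if count still equals tmp; otherwise retry. */
		if (atomic_compare_exchange_weak(&sem->count, &tmp, tmp + READ_BIAS))
			return true;
	}
}
```

Splitting the load from the test avoids the cast inside the while()
condition, which is what made the patched line hard to read.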


Re: [PATCH v2] drivers: cpufreq: Mark function as static in cpufreq.c

2014-02-26 Thread Viresh Kumar

Hi Rashika,

On 26 February 2014 22:08, Rashika Kheria  wrote:
> Mark function as static in cpufreq.c because it is not
> used outside this file.
>
> This eliminates the following warning in cpufreq.c:
> drivers/cpufreq/cpufreq.c:355:9: warning: no previous prototype for 
> 'show_boost' [-Wmissing-prototypes]

Can you please elaborate how this warning is related to
the non-static definition of this function?
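
For context, GCC's -Wmissing-prototypes warns when a global function is
defined without a previous prototype declaration; functions with internal
linkage are not affected. A minimal sketch, with exported_show() and
local_show() as made-up names for illustration:

```c
#include <stdio.h>

/* Normally this prototype would live in a header shared with callers. */
int exported_show(char *buf, size_t len, int enabled);

/* Global definition with a prior prototype: no warning. */
int exported_show(char *buf, size_t len, int enabled)
{
	return snprintf(buf, len, "%d\n", enabled);
}

/* File-local definition: internal linkage, so no prototype is expected. */
static int local_show(char *buf, size_t len, int enabled)
{
	return snprintf(buf, len, "%d\n", enabled);
}
```

Dropping the prototype of exported_show() while keeping it non-static
would reproduce the kind of warning quoted in the patch description.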

> Signed-off-by: Rashika Kheria 
> ---
>  drivers/cpufreq/cpufreq.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 08ca8c9..54fd670 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -352,7 +352,7 @@ EXPORT_SYMBOL_GPL(cpufreq_notify_post_transition);
>  /*
>   *  SYSFS INTERFACE  *
>   */
> -ssize_t show_boost(struct kobject *kobj,
> +static ssize_t show_boost(struct kobject *kobj,
>  struct attribute *attr, char *buf)
>  {
> return sprintf(buf, "%d\n", cpufreq_driver->boost_enabled);
> --
> 1.7.9.5
>

Re: [PATCH 1/2] x86: Mark __vdso entries as asmlinkage

2014-02-26 Thread Andy Lutomirski

On Wed, Feb 26, 2014 at 10:06 PM, H. Peter Anvin  wrote:
> On 02/26/2014 07:39 PM, Andi Kleen wrote:
>> On Wed, Feb 26, 2014 at 05:02:13PM -0800, Andy Lutomirski wrote:
>>> This makes no difference for 64-bit, but it's critical for 32-bit code:
>>> these functions are called from outside the kernel, so they need to comply
>>> with the ABI.
>>
>> That's an odd patch. If that was wrong things couldn't have worked at all.
>> Probably hidden by inlining? If yes just make it static
>>
>> Also you would rather need notrace more often.
>>
>
> It has to support *an* ABI... the syscall vdso entry point uses the old
> int $0x80 calling convention rather than the normal ABI.  It would
> depend on the test program and eventual glibc implementation.  And sure
> enough, the test program has:
>
> int (*vdso_gettimeofday)(struct timeval *tv, struct timezone *tz)
> __attribute__ ((regparm (3)));
> int (*vdso_clock_gettime)(clockid_t clk_id, struct timespec *tp)
> __attribute__ ((regparm (3)));
> time_t (*vdso_time)(time_t *t) __attribute__ ((regparm (3)));
>
> That being said, since this code is compiled separately, the compiler
> flags there determine what actually matters.  However, there we have:
>
> KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=3 -freg-struct-return -fpic
>
> The normal ABI almost certainly makes more sense; as such -mregparm=3 is
> probably not what we want, and I suspect it makes more sense to just
> drop that from the CFLAGS line?

Hmm.  What happens on a native 32-bit build?  IIRC the whole kernel is
built with regparm(3).

If we want to save a cycle or two, then regparm(3) is probably faster.
 But I think that these functions should either be asmlinkage or (on
32 bit builds) explicitly regparm(3) to avoid confusion.

--Andy

Re: Final: Add 32 bit VDSO time function support

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 12:54 PM, Andy Lutomirski wrote:
> On Wed, Feb 26, 2014 at 12:45 PM, Greg KH  wrote:
>> On Wed, Feb 26, 2014 at 08:34:58PM +0100, Stefani Seibold wrote:
>>> Hi,
>>>
>>> I am still waiting for ACKs for the 32 bit VDSO time function support. What's
>>> the next step? Is there a way to apply it to linux-git or linux-next
>>> in the near future?
>>
>> I thought this was already in the tip tree.  Didn't the emails saying
>> this happened come by a few days ago?  Or was it pulled from there for
>> some reason?
> 
> It is in tip.  I think that hpa wants people to test it before it goes
> to Linus, and I certainly want to convince myself that there's no
> performance regression.
> 

Absolutely.  Your help is *greatly* appreciated.

-hpa



[PATCH 2/2] usb: musb: musb_cppi41: Dont reprogram DMA if tear down is initiated

2014-02-26 Thread George Cherian

Reprogramming the DMA after tear down is initiated leads to warning.
This is mainly seen with ISOCH since we do a delayed completion for
ISOCH transfers. In ISOCH transfers dma_completion should not reprogram
if the channel tear down is initiated.

Signed-off-by: George Cherian 
---
 drivers/usb/musb/musb_cppi41.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/musb/musb_cppi41.c b/drivers/usb/musb/musb_cppi41.c
index f3ec7d2..e201b1e 100644
--- a/drivers/usb/musb/musb_cppi41.c
+++ b/drivers/usb/musb/musb_cppi41.c
@@ -132,7 +132,8 @@ static void cppi41_trans_done(struct cppi41_dma_channel 
*cppi41_channel)
struct musb_hw_ep *hw_ep = cppi41_channel->hw_ep;
struct musb *musb = hw_ep->musb;
 
-   if (!cppi41_channel->prog_len) {
+   if (!cppi41_channel->prog_len ||
+   (cppi41_channel->channel.status == MUSB_DMA_STATUS_FREE)) {
 
/* done, complete */
cppi41_channel->channel.actual_len =
-- 
1.8.1


[PATCH 1/2] dma: cppi41: start tear down only if channel is busy

2014-02-26 Thread George Cherian

Start the channel tear down only if the channel is busy, else just
bail out. In some cases it's seen that by the time the tear down is
initiated the cppi has completed the DMA, especially in ISOCH transfers.

Signed-off-by: George Cherian 
---
 drivers/dma/cppi41.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/cppi41.c b/drivers/dma/cppi41.c
index c18aebf..d028f36 100644
--- a/drivers/dma/cppi41.c
+++ b/drivers/dma/cppi41.c
@@ -620,12 +620,15 @@ static int cppi41_stop_chan(struct dma_chan *chan)
u32 desc_phys;
int ret;
 
+   desc_phys = lower_32_bits(c->desc_phys);
+   desc_num = (desc_phys - cdd->descs_phys) / sizeof(struct cppi41_desc);
+   if (!cdd->chan_busy[desc_num])
+   return 0;
+
ret = cppi41_tear_down_chan(c);
if (ret)
return ret;
 
-   desc_phys = lower_32_bits(c->desc_phys);
-   desc_num = (desc_phys - cdd->descs_phys) / sizeof(struct cppi41_desc);
WARN_ON(!cdd->chan_busy[desc_num]);
cdd->chan_busy[desc_num] = NULL;
 
-- 
1.8.1


[PATCH 0/2] Fix CPPI Warnings during tear down after ISOCH transfers

2014-02-26 Thread George Cherian

Warnings are seen after ISOCH transfers, during channel tear down.
This is mainly because we handle ISOCH differently compared to
other transfers.

Patch 1: make sure we do channel tear down only if the channel is busy.
 If not, the tear down will never succeed.

Patch 2: ISOCH completions are done differently, so this might lead to
reprogramming of a DMA channel on which a teardown has already been done.
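
The guard from patch 1 can be modelled in a few lines of userspace C. The
struct, the descriptor count and the teardown counter are assumptions made
for this sketch; the real driver tracks considerably more state.

```c
#include <stddef.h>

#define MODEL_NUM_DESC 4	/* arbitrary size for the sketch */

/* Hypothetical stand-in for the per-controller bookkeeping. */
struct model_cdd {
	void *chan_busy[MODEL_NUM_DESC];	/* non-NULL while a transfer is pending */
	int teardowns;				/* how many tear downs were started */
};

static int model_stop_chan(struct model_cdd *cdd, unsigned int desc_num)
{
	/* Nothing in flight: the DMA already completed, so just bail out. */
	if (!cdd->chan_busy[desc_num])
		return 0;

	cdd->teardowns++;		/* stands in for the tear down sequence */
	cdd->chan_busy[desc_num] = NULL;
	return 0;
}
```

Checking the busy slot first means a tear down is never started for a
channel whose transfer has already finished.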


George Cherian (2):
  dma: cppi41: start tear down only if channel is busy
  usb: musb: musb_cppi41: Dont reprogram DMA if tear down is initiated

 drivers/dma/cppi41.c   | 7 +--
 drivers/usb/musb/musb_cppi41.c | 3 ++-
 2 files changed, 7 insertions(+), 3 deletions(-)

-- 
1.8.1


Re: [PATCH 1/2] x86: Mark __vdso entries as asmlinkage

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 07:39 PM, Andi Kleen wrote:
> On Wed, Feb 26, 2014 at 05:02:13PM -0800, Andy Lutomirski wrote:
>> This makes no difference for 64-bit, but it's critical for 32-bit code:
>> these functions are called from outside the kernel, so they need to comply
>> with the ABI.
> 
> That's an odd patch. If that was wrong things couldn't have worked at all.
> Probably hidden by inlining? If yes just make it static
> 
> Also you would rather need notrace more often.
> 

It has to support *an* ABI... the syscall vdso entry point uses the old
int $0x80 calling convention rather than the normal ABI.  It would
depend on the test program and eventual glibc implementation.  And sure
enough, the test program has:

int (*vdso_gettimeofday)(struct timeval *tv, struct timezone *tz)
__attribute__ ((regparm (3)));
int (*vdso_clock_gettime)(clockid_t clk_id, struct timespec *tp)
__attribute__ ((regparm (3)));
time_t (*vdso_time)(time_t *t) __attribute__ ((regparm (3)));

That being said, since this code is compiled separately, the compiler
flags there determine what actually matters.  However, there we have:

KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=3 -freg-struct-return -fpic

The normal ABI almost certainly makes more sense; as such -mregparm=3 is
probably not what we want, and I suspect it makes more sense to just
drop that from the CFLAGS line?

-hpa


Re: [PATCH 1/2] x86: Mark __vdso entries as asmlinkage

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 07:39 PM, Andi Kleen wrote:
> 
> Also you would rather need notrace more often.
> 

Again, can be dealt with in CFLAGS, no?

-hpa



RE: [PATCHv9 Resend 1/4] pwm: Add Freescale FTM PWM driver support

2014-02-26 Thread li.xi...@freescale.com

Hi Thierry,

Thanks very much, I will fix them all.

:)

--
Best Regards,
Xiubo

> Sorry for taking so long to get back to you. Things have been quite busy
> lately. A few more comments below, but we're getting there.
> 
> On Wed, Feb 19, 2014 at 04:38:54PM +0800, Xiubo Li wrote:
> [...]
> > diff --git a/drivers/pwm/pwm-fsl-ftm.c b/drivers/pwm/pwm-fsl-ftm.c
> [...]
> > +static unsigned long fsl_pwm_calculate_period(struct fsl_pwm_chip *fpc,
> > + unsigned long period_ns)
> > +{
> > +   struct clk *cnt_clk[3];
> > +   enum fsl_pwm_clk m0, m1;
> > +   unsigned long fix_rate, ext_rate, cycles;
> > +
> > +   fpc->counter_clk = fpc->sys_clk;
> > +   cycles = fsl_pwm_calculate_period_cycles(fpc, period_ns,
> > +   FSL_PWM_CLK_SYS);
> > +   if (cycles)
> > +   return cycles;
> > +
> > +   cnt_clk[FSL_PWM_CLK_FIX] = devm_clk_get(fpc->chip.dev, "ftm_fix");
> > +   if (IS_ERR(cnt_clk[FSL_PWM_CLK_FIX]))
> > +   return PTR_ERR(cnt_clk[FSL_PWM_CLK_FIX]);
> > +
> > +   cnt_clk[FSL_PWM_CLK_EXT] = devm_clk_get(fpc->chip.dev, "ftm_ext");
> > +   if (IS_ERR(cnt_clk[FSL_PWM_CLK_EXT]))
> > +   return PTR_ERR(cnt_clk[FSL_PWM_CLK_EXT]);
> > +
> > +   fpc->counter_clk_en = devm_clk_get(fpc->chip.dev, "ftm_cnt_clk_en");
> > +   if (IS_ERR(fpc->counter_clk_en))
> > +   return PTR_ERR(fpc->counter_clk_en);
> 
> You shouldn't do this. You're obtaining a reference to each of these
> clocks whenever pwm_config() is called. And devres will only clean those
> up after the driver is unbound. Can't you simply keep a reference to
> these within struct fsl_pwm_chip?
> 
> > +static int fsl_counter_clock_enable(struct fsl_pwm_chip *fpc)
> > +{
> > +   u32 val;
> > +   int ret;
> > +
> > +   if (fpc->counter_clk_enable++)
> 
> This function is always called with the fpc->lock held, so you could
> make this much easier by incrementing the .counter_clk_enable field only
> at the very end of the function. That way...
> 
> > +   return 0;
> > +
> > +   ret = clk_prepare_enable(fpc->counter_clk);
> > +   if (ret) {
> > +   fpc->counter_clk_enable--;
> 
> ... this won't be necessary...
> 
> > +   return ret;
> > +   }
> > +
> > +   ret = clk_prepare_enable(fpc->counter_clk_en);
> > +   if (ret) {
> > +   fpc->counter_clk_enable--;
> 
> ... and neither will this.
> 
> > +static int fsl_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
> > +{
> > +   struct fsl_pwm_chip *fpc = to_fsl_chip(chip);
> > +   u32 val;
> > +   int ret;
> > +
> > +   val = readl(fpc->base + FTM_OUTMASK);
> > +   val &= ~BIT(pwm->hwpwm);
> > +   writel(val, fpc->base + FTM_OUTMASK);
> > +
> > +   mutex_lock(&fpc->lock);
> 
> I think you want to extend the lock to cover the FTM_OUTMASK register
> access as well because there could be a race between pwm_enable() and
> pwm_disable().
> 
> > +   ret = fsl_counter_clock_enable(fpc);
> > +   mutex_unlock(&fpc->lock);
> > +
> > +   return ret;
> > +}
> 
> Can this function be moved somewhere else so fsl_counter_clock_enable()
> and fsl_counter_clock_disable() are grouped together?
> 
> > +static void fsl_counter_clock_disable(struct fsl_pwm_chip *fpc)
> > +{
> > +   u32 val;
> > +
> > +   if (--fpc->counter_clk_enable)
> > +   return;
> 
> This is going to break. Consider the case where you call pwm_disable()
> on a PWM device and fpc->counter_clk_enable == 1. In that case, this
> will decrement counter_clk_enable to 0 and proceed with the remainder of
> this function.
> 
> Now you call pwm_disable() again. The above will decrement again and
> cause fpc->counter_clk_enable to wrap around to UINT_MAX.
> 
> So I think a more correct implementation would be:
> 
>   /*
>* already disabled, do nothing (perhaps output warning message
>* to catch unbalanced calls? )
>*/
>   if (fpc->counter_clk_enable == 0)
>   return;
> 
>   /* there are still users, so can't disable yet */
>   if (--fpc->counter_clk_enable > 0)
>   return;
> 
>   /* no users left, disable clock */
> 
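
The counting scheme described above can be tried out in userspace. This is
a sketch under assumptions: struct model_chip and the hw_ret parameter
(standing in for the result of clk_prepare_enable()) are invented for the
illustration and do not appear in the driver.

```c
/* Userspace stand-in for the relevant fields of struct fsl_pwm_chip. */
struct model_chip {
	unsigned int counter_clk_enable;	/* number of current users */
	int clock_on;				/* 1 while the modelled clock runs */
};

/* hw_ret models clk_prepare_enable(): 0 on success, negative on error. */
static int model_counter_clock_enable(struct model_chip *chip, int hw_ret)
{
	if (chip->counter_clk_enable == 0) {
		if (hw_ret)
			return hw_ret;	/* count untouched, nothing to undo */
		chip->clock_on = 1;
	}

	/* Incremented only at the very end, once enabling has succeeded. */
	chip->counter_clk_enable++;
	return 0;
}

static void model_counter_clock_disable(struct model_chip *chip)
{
	/* Already disabled: do nothing instead of wrapping to UINT_MAX. */
	if (chip->counter_clk_enable == 0)
		return;

	/* There are still users, so the clock stays on. */
	if (--chip->counter_clk_enable > 0)
		return;

	/* No users left: disable the clock. */
	chip->clock_on = 0;
}
```

An unbalanced extra disable leaves the count at zero, and a failed enable
needs no decrement on its error path.
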
> > +static void fsl_pwm_disable(struct pwm_chip *chip, struct pwm_device *pwm)
> > +{
> > +   struct fsl_pwm_chip *fpc = to_fsl_chip(chip);
> > +   u32 val;
> > +
> > +   val = readl(fpc->base + FTM_OUTMASK);
> > +   val |= BIT(pwm->hwpwm);
> > +   writel(val, fpc->base + FTM_OUTMASK);
> > +
> > +   mutex_lock(&fpc->lock);
> 
> This lock should also include the access to FTM_OUTMASK above.
> 
> > +static int fsl_pwm_probe(struct platform_device *pdev)
> > +{
> [...]
> > +   fpc->sys_clk = devm_clk_get(&pdev->dev, "ftm_sys");
> > +   if (IS_ERR(fpc->sys_clk)) {
> > +   dev_err(&pdev->dev,
> > +   "failed to get \"ftm_sys\" clock\n");
> 
> The above easily fits on a single line, no need for the wrapping.
> 
> > +   return PTR_ERR(fpc->sys_clk);
> > +   }
> > +
> > +   ret = clk_prepare_enable(fpc->sys_clk);
> > +   if (ret)
> > +   return ret;
> > +
> > +

[PATCH 5/5] hwrng: timeriomem - Use devm_*() functions

2014-02-26 Thread Jingoo Han

Use devm_*() functions to make cleanup paths simpler.
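
The idea behind managed resources can be sketched in userspace: every
allocation is recorded against the device and released in one pass, so
error paths need no per-resource labels. All names below (model_device,
model_devm_kzalloc() and so on) are hypothetical; this is not the kernel's
devres implementation.

```c
#include <stdlib.h>

/* One recorded cleanup action. */
struct model_devres {
	void (*release)(void *data);
	void *data;
	struct model_devres *next;
};

/* Hypothetical device owning a list of cleanup actions. */
struct model_device {
	struct model_devres *head;
};

/* Record a cleanup action; the most recently added one is released first. */
static int model_devm_add(struct model_device *dev,
			  void (*release)(void *data), void *data)
{
	struct model_devres *node = malloc(sizeof(*node));

	if (!node)
		return -1;
	node->release = release;
	node->data = data;
	node->next = dev->head;
	dev->head = node;
	return 0;
}

/* Zeroed allocation that is freed automatically on release. */
static void *model_devm_kzalloc(struct model_device *dev, size_t size)
{
	void *mem = calloc(1, size);

	if (!mem)
		return NULL;
	if (model_devm_add(dev, free, mem)) {
		free(mem);
		return NULL;
	}
	return mem;
}

/* Run on probe failure or removal; returns how many actions were run. */
static int model_devm_release_all(struct model_device *dev)
{
	int released = 0;

	while (dev->head) {
		struct model_devres *node = dev->head;

		dev->head = node->next;
		node->release(node->data);
		free(node);
		released++;
	}
	return released;
}
```

Because every allocation registers its own cleanup, an early return from
the probe path is enough, which is why the patch can drop the out_free and
out_release_io labels.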

Signed-off-by: Jingoo Han 
---
 drivers/char/hw_random/timeriomem-rng.c |   40 ---
 1 file changed, 10 insertions(+), 30 deletions(-)

diff --git a/drivers/char/hw_random/timeriomem-rng.c 
b/drivers/char/hw_random/timeriomem-rng.c
index 73ce739..439ff8b 100644
--- a/drivers/char/hw_random/timeriomem-rng.c
+++ b/drivers/char/hw_random/timeriomem-rng.c
@@ -118,7 +118,8 @@ static int timeriomem_rng_probe(struct platform_device 
*pdev)
}
 
/* Allocate memory for the device structure (and zero it) */
-   priv = kzalloc(sizeof(struct timeriomem_rng_private_data), GFP_KERNEL);
+   priv = devm_kzalloc(&pdev->dev,
+   sizeof(struct timeriomem_rng_private_data), GFP_KERNEL);
if (!priv) {
dev_err(&pdev->dev, "failed to allocate device structure.\n");
return -ENOMEM;
@@ -134,17 +135,16 @@ static int timeriomem_rng_probe(struct platform_device 
*pdev)
period = i;
else {
dev_err(&pdev->dev, "missing period\n");
-   err = -EINVAL;
-   goto out_free;
+   return -EINVAL;
}
-   } else
+   } else {
period = pdata->period;
+   }
 
priv->period = usecs_to_jiffies(period);
if (priv->period < 1) {
dev_err(&pdev->dev, "period is less than one jiffy\n");
-   err = -EINVAL;
-   goto out_free;
+   return -EINVAL;
}
 
priv->expires   = jiffies;
@@ -160,24 +160,16 @@ static int timeriomem_rng_probe(struct platform_device *pdev)
priv->timeriomem_rng_ops.data_read  = timeriomem_rng_data_read;
priv->timeriomem_rng_ops.priv   = (unsigned long)priv;
 
-   if (!request_mem_region(res->start, resource_size(res),
-   dev_name(&pdev->dev))) {
-   dev_err(&pdev->dev, "request_mem_region failed\n");
-   err = -EBUSY;
+   priv->io_base = devm_ioremap_resource(&pdev->dev, res);
+   if (IS_ERR(priv->io_base)) {
+   err = PTR_ERR(priv->io_base);
goto out_timer;
}
 
-   priv->io_base = ioremap(res->start, resource_size(res));
-   if (priv->io_base == NULL) {
-   dev_err(&pdev->dev, "ioremap failed\n");
-   err = -EIO;
-   goto out_release_io;
-   }
-
err = hwrng_register(&priv->timeriomem_rng_ops);
if (err) {
dev_err(&pdev->dev, "problem registering\n");
-   goto out;
+   goto out_timer;
}
 
dev_info(&pdev->dev, "32bits from 0x%p @ %dus\n",
@@ -185,30 +177,18 @@ static int timeriomem_rng_probe(struct platform_device *pdev)
 
return 0;
 
-out:
-   iounmap(priv->io_base);
-out_release_io:
-   release_mem_region(res->start, resource_size(res));
 out_timer:
del_timer_sync(&priv->timer);
-out_free:
-   kfree(priv);
return err;
 }
 
 static int timeriomem_rng_remove(struct platform_device *pdev)
 {
struct timeriomem_rng_private_data *priv = platform_get_drvdata(pdev);
-   struct resource *res;
-
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 
hwrng_unregister(&priv->timeriomem_rng_ops);
 
del_timer_sync(&priv->timer);
-   iounmap(priv->io_base);
-   release_mem_region(res->start, resource_size(res));
-   kfree(priv);
 
return 0;
 }
-- 
1.7.10.4


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: mtip32xx blk-mq status?

2014-02-26 Thread Jens Axboe


On 2014-02-26 20:42, Sam Bradshaw (sbradshaw) wrote:
>> On 2014-02-26 11:29, Christoph Hellwig wrote:
>>> Hi all,
>>>
>>> with blk-mq stabilizing in mainline and Jens using mtip32xx as the
>>> major example drivers in the past is there any progress on getting the
>>> conversion finished and merged?
>>
>> I'll pick up the pieces as soon as I get back and can test again. The
>> basic conversion is easy enough, I'll just dust that off. On top of
>> that I started splitting the issue groups up, since it actually gets
>> you a good step further than just the basic conversion.
>
> If you have a public branch for the latter, send it our way.  We can
> help with the SRSI testing.

Thanks, I'll get it setup next week.

>> The only problem area I ran into was the special case of the unaligned
>> writes.
>
> Yup, it's a painful limitation.  If you need any p420m hardware, let me
> know.

Sure, a card for testing is always useful.

--
Jens Axboe



[PATCH 4/5] hwrng: nomadik - Use devm_*() functions

2014-02-26 Thread Jingoo Han

Use devm_*() functions to make cleanup paths simpler.

Signed-off-by: Jingoo Han 
---
 drivers/char/hw_random/nomadik-rng.c |   13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/char/hw_random/nomadik-rng.c b/drivers/char/hw_random/nomadik-rng.c
index 00e9d2d..9c85815 100644
--- a/drivers/char/hw_random/nomadik-rng.c
+++ b/drivers/char/hw_random/nomadik-rng.c
@@ -43,7 +43,7 @@ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
void __iomem *base;
int ret;
 
-   rng_clk = clk_get(&dev->dev, NULL);
+   rng_clk = devm_clk_get(&dev->dev, NULL);
if (IS_ERR(rng_clk)) {
dev_err(&dev->dev, "could not get rng clock\n");
ret = PTR_ERR(rng_clk);
@@ -56,33 +56,28 @@ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
if (ret)
goto out_clk;
ret = -ENOMEM;
-   base = ioremap(dev->res.start, resource_size(&dev->res));
+   base = devm_ioremap(&dev->dev, dev->res.start,
+   resource_size(&dev->res));
if (!base)
goto out_release;
nmk_rng.priv = (unsigned long)base;
ret = hwrng_register(&nmk_rng);
if (ret)
-   goto out_unmap;
+   goto out_release;
return 0;
 
-out_unmap:
-   iounmap(base);
 out_release:
amba_release_regions(dev);
 out_clk:
clk_disable(rng_clk);
-   clk_put(rng_clk);
return ret;
 }
 
 static int nmk_rng_remove(struct amba_device *dev)
 {
-   void __iomem *base = (void __iomem *)nmk_rng.priv;
hwrng_unregister(&nmk_rng);
-   iounmap(base);
amba_release_regions(dev);
clk_disable(rng_clk);
-   clk_put(rng_clk);
return 0;
 }
 
-- 
1.7.10.4



[PATCH 3/5] hwrng: picoxcell - Use devm_clk_get()

2014-02-26 Thread Jingoo Han

Use devm_clk_get() to make cleanup paths simpler.

Signed-off-by: Jingoo Han 
---
 drivers/char/hw_random/picoxcell-rng.c |8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/char/hw_random/picoxcell-rng.c b/drivers/char/hw_random/picoxcell-rng.c
index c03beee..eab5448 100644
--- a/drivers/char/hw_random/picoxcell-rng.c
+++ b/drivers/char/hw_random/picoxcell-rng.c
@@ -108,7 +108,7 @@ static int picoxcell_trng_probe(struct platform_device *pdev)
if (IS_ERR(rng_base))
return PTR_ERR(rng_base);
 
-   rng_clk = clk_get(&pdev->dev, NULL);
+   rng_clk = devm_clk_get(&pdev->dev, NULL);
if (IS_ERR(rng_clk)) {
dev_warn(&pdev->dev, "no clk\n");
return PTR_ERR(rng_clk);
@@ -117,7 +117,7 @@ static int picoxcell_trng_probe(struct platform_device *pdev)
ret = clk_enable(rng_clk);
if (ret) {
dev_warn(&pdev->dev, "unable to enable clk\n");
-   goto err_enable;
+   return ret;
}
 
picoxcell_trng_start();
@@ -132,9 +132,6 @@ static int picoxcell_trng_probe(struct platform_device *pdev)
 
 err_register:
clk_disable(rng_clk);
-err_enable:
-   clk_put(rng_clk);
-
return ret;
 }
 
@@ -142,7 +139,6 @@ static int picoxcell_trng_remove(struct platform_device *pdev)
 {
hwrng_unregister(&picoxcell_trng);
clk_disable(rng_clk);
-   clk_put(rng_clk);
 
return 0;
 }
-- 
1.7.10.4



[PATCH 2/5] hwrng: omap3-rom: Use devm_clk_get()

2014-02-26 Thread Jingoo Han

Use devm_clk_get() to make cleanup paths simpler.

Signed-off-by: Jingoo Han 
---
 drivers/char/hw_random/omap3-rom-rng.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/char/hw_random/omap3-rom-rng.c b/drivers/char/hw_random/omap3-rom-rng.c
index c853e9e..6f2eaff 100644
--- a/drivers/char/hw_random/omap3-rom-rng.c
+++ b/drivers/char/hw_random/omap3-rom-rng.c
@@ -103,7 +103,7 @@ static int omap3_rom_rng_probe(struct platform_device *pdev)
}
 
setup_timer(&idle_timer, omap3_rom_rng_idle, 0);
-   rng_clk = clk_get(&pdev->dev, "ick");
+   rng_clk = devm_clk_get(&pdev->dev, "ick");
if (IS_ERR(rng_clk)) {
pr_err("unable to get RNG clock\n");
return PTR_ERR(rng_clk);
@@ -120,7 +120,6 @@ static int omap3_rom_rng_remove(struct platform_device *pdev)
 {
hwrng_unregister(&omap3_rom_rng_ops);
clk_disable_unprepare(rng_clk);
-   clk_put(rng_clk);
return 0;
 }
 
-- 
1.7.10.4



Re: [RFC PATCH 1/6] PM / Voltagedomain: Add generic clk notifier handler for regulator based dynamic voltage scaling

2014-02-26 Thread Mike Turquette

Quoting Nishanth Menon (2014-02-26 18:34:55)
> +/**
> + * pm_runtime_get_rate() - Returns the device operational frequency
> + * @dev:   Device to handle
> + * @rate:  Returns rate in Hz.
> + *
> + * Returns appropriate error value in case of error conditions, else
> + * returns 0 and rate is updated. The pm_domain logic does all the necessary
> + * operation (which may consider magic hardware stuff) to provide the rate.
> + *
> + * NOTE: the rate returned is a snapshot and in many cases just a bypass
> + * to clk api to set the rate.
> + */
> +int pm_runtime_get_rate(struct device *dev, unsigned long *rate)

Instead of "rate", how about we use "level" and leave it undefined as to
what that means? It would be equally valid for level to represent a
clock rate, or an opp from a table of opp's, or a p-state, or some value
passed to a PM microcontroller.

Code that is tightly coupled to the hardware would simply know what
value to use with no extra sugar.

Generic code would need to get the various supported "levels" populated
at run time, but a DT binding could do that, or a query to the ACPI
tables, or whatever.

> +{
> +   unsigned long flags;
> +   int error = -ENOSYS;
> +
> +   if (!rate || !dev)
> +   return -EINVAL;
> +
> +   spin_lock_irqsave(&dev->power.lock, flags);
> +   if (!pm_runtime_active(dev)) {
> +   error = -EINVAL;
> +   goto out;
> +   }
> +
> +   if (dev->pm_domain && dev->pm_domain->active_ops.get_rate)
> +   error = dev->pm_domain->active_ops.get_rate(dev, rate);
> +out:
> +   spin_unlock_irqrestore(&dev->power.lock, flags);
> +
> +   return error;
> +}
> +
> +/**
> + * pm_runtime_set_rate() - Set a specific rate for the device operation
> + * @dev:   Device to handle
> + * @rate:  Rate to set in Hz
> + *
> + * Returns appropriate error value in case of error conditions, else
> + * returns 0. The pm_domain logic does all the necessary operation (which
> + * may include voltage scale operations or other magic hardware stuff) to
> + * achieve the operation. It is guaranteed that the requested rate is
> + * achieved on returning from this function if return value is 0.
> + */
> +int pm_runtime_set_rate(struct device *dev, unsigned long rate)

Additionally I wonder if the function signature should include a way to
specify the sub-unit of a device that we are operating on? This is a way
to tackle the issues you raised regarding multiple clocks per device,
etc. Two approaches come to mind:

int pm_runtime_set_rate(struct device *dev, int index,
unsigned long rate);

Where index is a sub-unit of struct device *dev. The second approach is
to create a publicly declared structure representing the sub-unit. Some
variations on that theme:

int pm_runtime_set_rate(struct perf_domain *perfdm, unsigned long rate);

or,

int pm_runtime_set_rate(struct generic_power_domain *gpd,
unsigned long rate);

or whatever that sub-unit looks like. The gpd thing might be a total
layering violation, I don't know. Or perhaps it's a decent idea but it
shouldn't be as a PM runtime call. Again, I dunno.

Regards,
Mike

[PATCH 1/5] hwrng: atmel - Use devm_clk_get()

2014-02-26 Thread Jingoo Han

Use devm_clk_get() to make cleanup paths simpler.

Signed-off-by: Jingoo Han 
---
 drivers/char/hw_random/atmel-rng.c |8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/char/hw_random/atmel-rng.c b/drivers/char/hw_random/atmel-rng.c
index dfeddf2..851bc7e 100644
--- a/drivers/char/hw_random/atmel-rng.c
+++ b/drivers/char/hw_random/atmel-rng.c
@@ -63,13 +63,13 @@ static int atmel_trng_probe(struct platform_device *pdev)
if (IS_ERR(trng->base))
return PTR_ERR(trng->base);
 
-   trng->clk = clk_get(&pdev->dev, NULL);
+   trng->clk = devm_clk_get(&pdev->dev, NULL);
if (IS_ERR(trng->clk))
return PTR_ERR(trng->clk);
 
ret = clk_enable(trng->clk);
if (ret)
-   goto err_enable;
+   return ret;
 
writel(TRNG_KEY | 1, trng->base + TRNG_CR);
trng->rng.name = pdev->name;
@@ -85,9 +85,6 @@ static int atmel_trng_probe(struct platform_device *pdev)
 
 err_register:
clk_disable(trng->clk);
-err_enable:
-   clk_put(trng->clk);
-
return ret;
 }
 
@@ -99,7 +96,6 @@ static int atmel_trng_remove(struct platform_device *pdev)
 
writel(TRNG_KEY, trng->base + TRNG_CR);
clk_disable(trng->clk);
-   clk_put(trng->clk);
 
return 0;
 }
-- 
1.7.10.4



linux-next: manual merge of the pwm tree with the usb tree

2014-02-26 Thread Stephen Rothwell

Hi Thierry,

Today's linux-next merge of the pwm tree got a conflict in
arch/arm/Kconfig between commit f6723b569a67 ("usb: host: remove selects
of USB_ARCH_HAS_?HCI") from the usb tree and commit 557fe99d9d49 ("pwm:
Remove obsolete HAVE_PWM Kconfig symbol") from the pwm tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/arm/Kconfig
index c85745d2d20a,cc6ce44064a2..
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@@ -628,7 -629,7 +625,6 @@@ config ARCH_LPC32X
select CPU_ARM926T
select GENERIC_CLOCKEVENTS
select HAVE_IDE
-   select HAVE_PWM
 -  select USB_ARCH_HAS_OHCI
select USE_OF
help
  Support for the NXP LPC32XX family of processors



Re: 3.13.5 : rm -rf running forever, one cpu at approx 100%

2014-02-26 Thread Mike Galbraith

On Thu, 2014-02-27 at 03:45 +, Ken Moffat wrote: 
> On Thu, Feb 27, 2014 at 04:26:35AM +0100, Mike Galbraith wrote:
> > 
> > I would start with strace to see if a task is looping in userspace, then
> > move on to perf top -g -p <pid> (or perf record/report) to peek at what
> > it's up to in the kernel.  Once you have the where, trace_printk() is
> > the best thing since sliced bread (which ranks just below printk()).
> > 
> > -Mike
>  Thanks.  I'll need to build perf.

You may want to build the kernel with frame-pointers too, for easy gdb
list *0x(hexnum) of *func()+0x(hexoffset) use.  Crash is also pretty
handy both for rummaging live via crash vmlinux /proc/kcore, and for
leisurely postmortem analysis if you set the box up to crashdump in
advance, and force a dump (poke sysrq-c or echo c > /proc/sysrq-trigger)
when you see the bad thing happen.  Crash has all kinds of goodies,
including invocation of gdb.

-Mike


Re: [PATCH] ACPI / EC: Clear stale EC events on Samsung systems

2014-02-26 Thread Li Guang


Kieran Clancy wrote:

On Thu, Feb 27, 2014 at 12:29 PM, Li Guang  wrote:
   

+#define ACPI_EC_CLEAR_MAX  20  /* Maximum number of events to query
+* when trying to clear the EC */

   


20 is enough?
the query index is length of a byte.
 

On my machine, 8 seems to be enough, so 20 seems to be a conservative
maximum. Just reading your other email, maybe we should set this to
32? or 40? 100?

If it's not enough, hopefully anyone seeing bugs will notice the
warning "maximum of X stale EC events cleared".
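
The capped query loop being discussed can be sketched in userspace as
follows. This is only a model under stated assumptions: toy_ec,
toy_ec_query() and toy_ec_clear() are hypothetical names, the 0x66 value is
taken from the log below, and a query result of 0 is assumed to mean "no
more events".

```c
#include <stdio.h>

#define TOY_EC_CLEAR_MAX 20	/* same cap as the value under discussion */

/* Hypothetical stand-in for the EC: a count of queued stale events. */
struct toy_ec {
	int pending;
};

/* Returns the query value (0x66 here) while events remain, 0 once drained. */
static unsigned char toy_ec_query(struct toy_ec *ec)
{
	if (ec->pending <= 0)
		return 0;
	ec->pending--;
	return 0x66;
}

/* Query until the EC reports no event or the cap is hit; returns events cleared. */
static int toy_ec_clear(struct toy_ec *ec)
{
	int i;

	for (i = 0; i < TOY_EC_CLEAR_MAX; i++) {
		if (toy_ec_query(ec) == 0)
			break;
	}
	if (i == TOY_EC_CLEAR_MAX)
		printf("maximum of %d stale EC events cleared\n", i);
	else
		printf("%d stale EC events cleared\n", i);
	return i;
}
```

With 8 queued events the loop stops on the ninth query; with more events
than the cap, some are left behind and only the warning hints at it, which
is the trade-off behind choosing 20 versus a larger value.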

Here's what happens if I plug/replug the AC lots of times (more than
8) during suspend:

[ 8807.019800] ACPI : EC: --->  status = 0x29
[ 8807.019804] ACPI : EC: --->  data = 0x66
[ 8807.020790] ACPI : EC: --->  status = 0x29
[ 8807.020793] ACPI : EC: --->  data = 0x66
[ 8807.021793] ACPI : EC: --->  status = 0x29
[ 8807.021798] ACPI : EC: --->  data = 0x66
[ 8807.022831] ACPI : EC: --->  status = 0x29
[ 8807.022834] ACPI : EC: --->  data = 0x66
[ 8807.023788] ACPI : EC: --->  status = 0x29
[ 8807.023792] ACPI : EC: --->  data = 0x66
[ 8807.024787] ACPI : EC: --->  status = 0x29
[ 8807.024791] ACPI : EC: --->  data = 0x66
[ 8807.025787] ACPI : EC: --->  status = 0x29
[ 8807.025790] ACPI : EC: --->  data = 0x66
[ 8807.026787] ACPI : EC: --->  status = 0x29
[ 8807.026790] ACPI : EC: --->  data = 0x66
[ 8807.027786] ACPI : EC: --->  status = 0x09
[ 8807.027790] ACPI : EC: --->  data = 0x00
[ 8807.027792] ACPI : EC: 8 stale EC events cleared

Note that most of these have SCI_EVT set, but the OS is not notified
according to ACPI specs (seemingly because these events happened
during sleep).

The _Q66 method in my DSDT, is:

 P8XH (Zero, 0x66)
 If (LEqual (B1EX, One))
 {
 Notify (BAT1, 0x80)
 }

So, basically, this is supposed to notify that the battery (BAT1 =
PNP0C0A) has changed state, but they are stale events so we don't run
the handlers.

   

+static int EC_FLAGS_CLEAR_ON_RESUME; /* EC should be polled on boot/resume */
   

seems name is implicit, what about EC_FLAGS_QEVENT_CLR_ON_RESUME?
seems too long :-)
 

In my mind this is referring to the function name (acpi_ec_)clear.
Perhaps we could just make the connection more explicit in the
comment:

/* needs acpi_ec_clear() on boot/resume */

Not sure if this is better?

   

+   /* Some hardware may need the EC to be cleared before use */
   

description is implicit, should specify what we clear is Q event, not EC.
 

Are Q events the only thing we can get from the EC data port? I've
read the relevant parts of the ACPI spec and I can't say I am 100%
sure.

   

I guess you want to clear Q events here,
EC usually has ACPI space to be read by cmd 80.

Thanks!




RE: mtip32xx blk-mq status?

2014-02-26 Thread Sam Bradshaw (sbradshaw)


> On 2014-02-26 11:29, Christoph Hellwig wrote:
> > Hi all,
> >
> > with blk-mq stabilizing in mainline and Jens using mtip32xx as the
> > major example drivers in the past is there any progress on getting the
> > conversion finished and merged?
> 
> I'll pick up the pieces as soon as I get back and can test again. The
> basic conversion is easy enough, I'll just dust that off. On top of
> that
> I started splitting the issue groups up, since it actually gets you a
> good step further than just the basic conversion.

If you have a public branch for the latter, send it our way.  We can
help with the SRSI testing.
 
> The only problem area I ran into was the special case of the unaligned
> writes.

Yup, it's a painful limitation.  If you need any p420m hardware, let me 
know.

-Sam

[PATCH 0/3] fixes on page table walker and hugepage rmapping

2014-02-26 Thread Naoya Horiguchi

Hi,

Sasha, could you test if the bug you reported recently [1] reproduces
on the latest next tree with this patchset? (I'm not sure of this
because the problem looks differently in my own testing...)

[1] http://thread.gmane.org/gmane.linux.kernel.mm/113374/focus=113
---
Summary:

Naoya Horiguchi (3):
  mm/pagewalk.c: fix end address calculation in walk_page_range()
  mm, hugetlbfs: fix rmapping for anonymous hugepages with page_pgoff()
  mm: call vma_adjust_trans_huge() only for thp-enabled vma

 include/linux/pagemap.h | 13 +
 mm/huge_memory.c        |  2 +-
 mm/hugetlb.c            |  5 +
 mm/memory-failure.c     |  4 ++--
 mm/mmap.c               |  3 ++-
 mm/pagewalk.c           |  5 +++--
 mm/rmap.c               |  8 ++--
 7 files changed, 28 insertions(+), 12 deletions(-)

[PATCH 1/3] mm/pagewalk.c: fix end address calculation in walk_page_range()

2014-02-26 Thread Naoya Horiguchi

When we try to walk over inside a vma, walk_page_range() tries to walk
until vma->vm_end even if a given end is before that point.
So this patch takes the smaller one as an end address.
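
As a minimal sketch of the clamping described above (a userspace model, not
the kernel code: toy_vma and toy_walk_next() are hypothetical names used
only for illustration):

```c
/* Hypothetical, simplified stand-in for a vma: just [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* End of the next chunk to walk: never past the caller's end nor the vma's end. */
static unsigned long toy_walk_next(const struct toy_vma *vma, unsigned long end)
{
	return end < vma->vm_end ? end : vma->vm_end;
}
```

For a vma covering 0x1000-0x5000, a requested end of 0x3000 stops the walk
at 0x3000 instead of running on to 0x5000.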

Signed-off-by: Naoya Horiguchi 
---
 mm/pagewalk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git next-20140220.orig/mm/pagewalk.c next-20140220/mm/pagewalk.c
index 416e981243b1..b418407ff4da 100644
--- next-20140220.orig/mm/pagewalk.c
+++ next-20140220/mm/pagewalk.c
@@ -321,8 +321,9 @@ int walk_page_range(unsigned long start, unsigned long end,
next = vma->vm_start;
} else { /* inside the found vma */
walk->vma = vma;
-   next = vma->vm_end;
-   err = walk_page_test(start, end, walk);
+   next = min_t(unsigned long, end, vma->vm_end);
+
+   err = walk_page_test(start, next, walk);
if (skip_lower_level_walking(walk))
continue;
if (err)
-- 
1.8.5.3


[PATCH 3/3] mm: call vma_adjust_trans_huge() only for thp-enabled vma

2014-02-26 Thread Naoya Horiguchi

vma_adjust() is also called for VM_HUGETLB vmas, so we could end up trying
to split a hugetlbfs hugepage. Exclude that possibility.

Signed-off-by: Naoya Horiguchi 
---
 mm/mmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git next-20140220.orig/mm/mmap.c next-20140220/mm/mmap.c
index f53397806d7f..45a9c0d51e3f 100644
--- next-20140220.orig/mm/mmap.c
+++ next-20140220/mm/mmap.c
@@ -772,7 +772,8 @@ again:  remove_next = 1 + (end > next->vm_end);
}
}
 
-   vma_adjust_trans_huge(vma, start, end, adjust_next);
+   if (transparent_hugepage_enabled(vma))
+   vma_adjust_trans_huge(vma, start, end, adjust_next);
 
anon_vma = vma->anon_vma;
if (!anon_vma && adjust_next)
-- 
1.8.5.3


[PATCH 2/3] mm, hugetlbfs: fix rmapping for anonymous hugepages with page_pgoff()

2014-02-26 Thread Naoya Horiguchi

page->index stores the pagecache index when the page is mapped into a file
mapping region, and the index is in units of the pagecache page size, so it
depends on the page size. Some users of reverse mapping evidently assume that
page->index is in PAGE_CACHE_SHIFT units, so they don't work for anonymous
hugepages.

For example, consider that we have 3-hugepage vma and try to mbind the 2nd
hugepage to migrate to another node. Then the vma is split and migrate_page()
is called for the 2nd hugepage (belonging to the middle vma.)
In migrate operation, rmap_walk_anon() tries to find the relevant vma to
which the target hugepage belongs, but here we miscalculate pgoff.
So anon_vma_interval_tree_foreach() grabs invalid vma, which fires VM_BUG_ON.

This patch introduces a new API that is usable both for normal page and
hugepage to get PAGE_SIZE offset from page->index. Users should clearly
distinguish page_index for pagecache index and page_pgoff for page offset.
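
The unit conversion can be sketched in userspace like this. It is a model
only: toy_page and toy_page_pgoff() are hypothetical names, the order is
stored directly in the toy structure instead of being looked up from an
hstate, and PAGE_CACHE_SHIFT == PAGE_SHIFT is assumed.

```c
#include <stdbool.h>

/* Hypothetical toy page: index is in units of the page's own size. */
struct toy_page {
	unsigned long index;
	bool huge;
	unsigned int order;	/* log2(base pages per hugepage), e.g. 9 for 2MB on 4KB pages */
};

/* Convert page->index into an offset counted in base (PAGE_SIZE) pages. */
static unsigned long toy_page_pgoff(const struct toy_page *page)
{
	if (page->huge)
		return page->index << page->order;
	return page->index;	/* assumes PAGE_CACHE_SHIFT == PAGE_SHIFT */
}
```

In the 3-hugepage example above, the 2nd hugepage has index 1; with order 9
its offset in base pages is 512, not 1, which is the value the interval
tree lookup needs.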

Reported-by: Sasha Levin  # if the reported problem is fixed
Signed-off-by: Naoya Horiguchi 
Cc: sta...@vger.kernel.org # 3.12+
---
 include/linux/pagemap.h | 13 +
 mm/huge_memory.c        |  2 +-
 mm/hugetlb.c            |  5 +
 mm/memory-failure.c     |  4 ++--
 mm/rmap.c               |  8 ++--
 5 files changed, 23 insertions(+), 9 deletions(-)

diff --git next-20140220.orig/include/linux/pagemap.h next-20140220/include/linux/pagemap.h
index 4f591df66778..a8bd14f42032 100644
--- next-20140220.orig/include/linux/pagemap.h
+++ next-20140220/include/linux/pagemap.h
@@ -316,6 +316,19 @@ static inline loff_t page_file_offset(struct page *page)
return ((loff_t)page_file_index(page)) << PAGE_CACHE_SHIFT;
 }
 
+extern pgoff_t hugepage_pgoff(struct page *page);
+
+/*
+ * page->index stores pagecache index whose unit is not always PAGE_SIZE.
+ * This function converts it into PAGE_SIZE offset.
+ */
+#define page_pgoff(page)   \
+({ \
+   unlikely(PageHuge(page)) ?  \
+   hugepage_pgoff(page) :  \
+   page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); \
+})
+
 extern pgoff_t linear_hugepage_index(struct vm_area_struct *vma,
 unsigned long address);
 
diff --git next-20140220.orig/mm/huge_memory.c next-20140220/mm/huge_memory.c
index 6ac89e9f82ef..ef96763a6abf 100644
--- next-20140220.orig/mm/huge_memory.c
+++ next-20140220/mm/huge_memory.c
@@ -1800,7 +1800,7 @@ static void __split_huge_page(struct page *page,
  struct list_head *list)
 {
int mapcount, mapcount2;
-   pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+   pgoff_t pgoff = page_pgoff(page);
struct anon_vma_chain *avc;
 
BUG_ON(!PageHead(page));
diff --git next-20140220.orig/mm/hugetlb.c next-20140220/mm/hugetlb.c
index 2252cacf98e8..e159e593d99f 100644
--- next-20140220.orig/mm/hugetlb.c
+++ next-20140220/mm/hugetlb.c
@@ -764,6 +764,11 @@ pgoff_t __basepage_index(struct page *page)
return (index << compound_order(page_head)) + compound_idx;
 }
 
+pgoff_t hugepage_pgoff(struct page *page)
+{
+   return page->index << huge_page_order(page_hstate(page));
+}
+
 static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
 {
struct page *page;
diff --git next-20140220.orig/mm/memory-failure.c next-20140220/mm/memory-failure.c
index 35ef28acf137..5d85a4afb22c 100644
--- next-20140220.orig/mm/memory-failure.c
+++ next-20140220/mm/memory-failure.c
@@ -404,7 +404,7 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
if (av == NULL) /* Not actually mapped anymore */
return;
 
-   pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+   pgoff = page_pgoff(page);
read_lock(&tasklist_lock);
for_each_process (tsk) {
struct anon_vma_chain *vmac;
@@ -437,7 +437,7 @@ static void collect_procs_file(struct page *page, struct list_head *to_kill,
mutex_lock(&mapping->i_mmap_mutex);
read_lock(&tasklist_lock);
for_each_process(tsk) {
-   pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+   pgoff_t pgoff = page_pgoff(page);
 
if (!task_early_kill(tsk))
continue;
diff --git next-20140220.orig/mm/rmap.c next-20140220/mm/rmap.c
index 9056a1f00b87..78405051474a 100644
--- next-20140220.orig/mm/rmap.c
+++ next-20140220/mm/rmap.c
@@ -515,11 +515,7 @@ void page_unlock_anon_vma_read(struct anon_vma *anon_vma)
 static inline unsigned long
 __vma_address(struct page *page, struct vm_area_struct *vma)
 {
-   pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-
-   if (unlikely(is_vm_hugetlb_page(vma)))
-   pgoff = page->index << huge_page_order(page_hstate(page));
-
+   pgoff_t pgoff = page_pgoff(page);

Re: wacom: Fixes for stylus pressure values for Thinkpad Yoga

2014-02-26 Thread Ping Cheng

Hi Carl,

Thank you for the heads up. I believe Jason's patchset 4 of 4
(http://www.spinics.net/lists/linux-input/msg29435.html) fixed the
issue for your device and for others. The patch was submitted last
month. If you can test the set on your device and give us a Tested-by
here, it will help Dmitry to merge the patch upstream.

Thank you for your effort.

Ping

On Wed, Feb 26, 2014 at 2:38 PM, Carl Worth  wrote:
> This series of patches fixes the pressure values reported for the
> wacom tablet built-in to a Lenovo ThinkPad Yoga laptop. Prior to this
> patch series, if I slowly increased stylus pressure, (expecting a
> gradual increase of values from 0 to 1023), I instead received values
> that increased slowly to 255, then reset to 0 and increased slowly
> again, etc.
>
> The buggy arithmetic that is updated here appears to exist in
> identical forms for other drivers. I did not update any code that I
> was not able to test directly. But it looks like wacom_pl_irq and
> wacom_dtu_irq potentially have similar bugs, (depending on the actual
> pressure_max values encountered in practice).
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-input" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH V2] sparse: Allow override of sizeof(bool) warning

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 08:26 PM, Ben Pfaff wrote:
> 
>> Because sizeof(_Bool) is a little bit special compare to sizeof(long).
>> In the case of long, all sizeof(long) * 8 bits are use in the actual value.
>> But for the _Bool, only the 1 bit is used in the 8 bits size. In other words,
>> the _Bool has a special case of the actual bit size is not a multiple of 8.

Quite frankly, this is silly in my opinion, *and* it is not guaranteed
by C either (read about "trap representations").

>> Sparse has two hats, it is a C compiler front end, and more often it is
>> used in the Linux kernel source sanitize checking. Depending on the sizeof
>> _Bool sounds a little bit suspicious in the kernel. I would love to the heard
>> your actual usage case of the sizeof(_Bool). Why do you care about this
>> warning?

Anything that moves data around in a generic fashion.  It can be as
simple as:

memcpy(foo, bar, sizeof *foo);
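
A slightly fuller sketch of that idiom, for illustration only (toy_flags,
toy_copy_flags() and toy_copy_bool() are hypothetical names, not code from
any tree):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical record with a bool member, copied without naming any type. */
struct toy_flags {
	bool enabled;
	long counter;
};

/* Generic copy: the size comes from the destination object itself. */
static void toy_copy_flags(struct toy_flags *dst, const struct toy_flags *src)
{
	memcpy(dst, src, sizeof *dst);
}

/* The same idiom applied to a bare bool, where the operand of sizeof has type bool. */
static void toy_copy_bool(bool *dst, const bool *src)
{
	memcpy(dst, src, sizeof *dst);
}
```

Neither function cares how large bool is; the code only needs sizeof to
report whatever the implementation chose.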

-hpa


Re: The sheer number of sparse warnings in the kernel

2014-02-26 Thread Greg KH

On Wed, Feb 26, 2014 at 10:15:08PM -0500, Dave Jones wrote:
> On Wed, Feb 26, 2014 at 05:34:24PM -0800, Greg KH wrote:
> 
>  > Yes, for some areas of the kernel it will take some work, but for
>  > others, sparse works really well.  As an example, building all of
>  > drivers/usb/* with sparse only brings up 2 issues, both of which should
>  > probably be fixed (or annotated properly in the case of the locking
>  > warning.)
> 
> Hm. I see 102 in drivers/usb. Mostly in gadget. 
> http://paste.fedoraproject.org/80787/39347077/raw/

Ick, gadget, I don't build that for my systems as I don't have that
hardware, sorry, I forgot to take that into consideration here.

Hey, at least I know what I'm going to be doing tomorrow :)

greg k-h

Re: The sheer number of sparse warnings in the kernel

2014-02-26 Thread Greg KH

On Wed, Feb 26, 2014 at 08:19:23PM -0800, H. Peter Anvin wrote:
> On 02/26/2014 05:52 PM, Peter Hurley wrote:
> > 
> > Well there was that "should we do a bug-fix-only 4.0 release?" message
> > from Linus back at the 3.12 release.
> > 
> 
> Sure... but will it actually happen?

I sure hope not, the backlog it would cause would be immense.

greg k-h

Re: [PATCH V2] sparse: Allow override of sizeof(bool) warning

2014-02-26 Thread Ben Pfaff

On Wed, Feb 26, 2014 at 08:19:57PM -0800, H. Peter Anvin wrote:
> On 02/26/2014 08:00 PM, Ben Pfaff wrote:
> > 
> > The commit *relaxed* sparse behavior: because previously sizeof(bool)
> > was an error.  I'm not in favor of any diagnostic at all for
> > sizeof(bool), but my recollection is that a sparse maintainer wanted it
> > to yield one.
> 
> Still not clear as to why.

The discussion is here:
http://comments.gmane.org/gmane.comp.parsers.sparse/2462

Quoting from that discussion, the core of Christopher Li's argument was
this:
> On Mon, May 9, 2011 at 1:02 PM, Ben Pfaff  nicira.com> wrote:
> > Thank you for applying my patch.  It does work for me, in the sense
> > that I get a warning instead of an error now, but I'm not so happy to
> > get any diagnostic at all.  Is there some reason why sizeof(_Bool)
> > warrants a warning when, say, sizeof(long) does not?  After all, both
> > sizes are implementation defined.

> Because sizeof(_Bool) is a little bit special compared to sizeof(long).
> In the case of long, all sizeof(long) * 8 bits are used in the actual value.
> But for _Bool, only 1 bit is used out of the 8-bit size. In other words,
> _Bool is a special case where the actual bit size is not a multiple of 8.

> Sparse wears two hats: it is a C compiler front end, and more often it is
> used for sanity checking of the Linux kernel source. Depending on the sizeof
> _Bool sounds a little bit suspicious in the kernel. I would love to hear
> your actual use case for sizeof(_Bool). Why do you care about this
> warning?
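
To make the quoted argument concrete, here is a minimal C sketch (not taken from the thread). It assumes nothing beyond standard C99; the helper names bool_storage_bits() and bool_from_int() are hypothetical and exist only for illustration. It shows that the storage size of _Bool is whatever the implementation chooses, while the value it holds is always 0 or 1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage bits occupied by one _Bool object. The size is
 * implementation-defined, just like sizeof(long). */
size_t bool_storage_bits(void)
{
	return sizeof(_Bool) * CHAR_BIT;
}

/* Conversion to _Bool yields 1 for any nonzero value and 0 otherwise,
 * so only one value bit is meaningful whatever the storage size is. */
int bool_from_int(int v)
{
	_Bool b = v;

	assert(b == 0 || b == 1);
	return b;
}
```

For example, bool_from_int(256) gives 1 rather than the truncated 0 that a cast to an 8-bit unsigned char would give, which is the sense in which only one of the stored bits carries information.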

Re: [PATCH V2] sparse: Allow override of sizeof(bool) warning

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 08:00 PM, Ben Pfaff wrote:
> 
> The commit *relaxed* sparse behavior: because previously sizeof(bool)
> was an error.  I'm not in favor of any diagnostic at all for
> sizeof(bool), but my recollection is that a sparse maintainer wanted it
> to yield one.
> 

Still not clear as to why.

-hpa



Re: The sheer number of sparse warnings in the kernel

2014-02-26 Thread H. Peter Anvin

On 02/26/2014 05:52 PM, Peter Hurley wrote:
> 
> Well there was that "should we do a bug-fix-only 4.0 release?" message
> from Linus back at the 3.12 release.
> 

Sure... but will it actually happen?

-hpa


Re: [PATCH RESEND] scsi: Output error messages using structured printk in single line

2014-02-26 Thread Yoshihiro YUNOMAE


Hi Hannes,

Although I sent you a message 6 days ago asking about your work, which
is similar to my patch, I am resending my patch because I think
this problem should be fixed as soon as possible.

Thank you,
Yoshihiro YUNOMAE


Re: 3.13.5 : rm -rf running forever, one cpu at approx 100%

2014-02-26 Thread Gene Heskett

On Wednesday 26 February 2014, Ken Moffat wrote:
>On Thu, Feb 27, 2014 at 04:26:35AM +0100, Mike Galbraith wrote:
>> I would start with strace to see if a task is looping in userspace,
>> then move on to perf top -g -p  (or perf record/report) to peek
>> at what it's up to in the kernel.  Once you have the where,
>> trace_printk() is the best thing since sliced bread (which ranks just
>> below printk()).
>> 
>> -Mike
>
> Thanks.  I'll need to build perf.
>
>ĸen
I probably will too, but I don't have a huge amount of tracing turned on in 
this kernel.  We'll see what happens tonight & go from there.
FWIW, about all I see in htop is the command line that launched it.


Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 

NOTICE: Will pay 100 USD for an HP-4815A defective but
complete probe assembly.


[PATCH RESEND] scsi: Output error messages using structured printk in single line

2014-02-26 Thread Yoshihiro YUNOMAE

Output error messages using structured printk in a single line.
In SCSI drivers, some error messages that should be output on a single line are
split across multiple lines. When user tools handle the error messages, those
split messages cause inconvenience.

The cause of this problem is the use of structured printk for error messages.
Structured printk can add device information to printk, and it is used in
scmd_printk() and sd_printk(). Structured printk aims to output each message
atomically, so we cannot use those functions to join multiple messages the way
KERN_CONT does.
However, some error messages are implemented as follows:
   structured_printk("DEVICE INFORMATION:");
   printk(KERN_CONT "DETAIL INFORMATION\n");
This implementation is expected to output "DEVICE INFORMATION: DETAIL
INFORMATION", but the actual output is as follows:
   DEVICE INFORMATION:
   DETAIL INFORMATION

For instance, in the following pseudo SCSI error test, the device information and
the detail information are split:

-- Pseudo SCSI error test for current kernel
  # modprobe scsi_debug
  # cd /sys/bus/pseudo/drivers/scsi_debug
  # echo 2 > opts
  # dd if=/dev/sdb of=/dev/null 2> /dev/null

-- Result for current kernel
  # dmesg

[   17.842110] sd 2:0:0:0: [sdb] Attached SCSI disk
[   18.859098] sd 2:0:0:0: [sdb] Unhandled sense code
[   18.859103] sd 2:0:0:0: [sdb]
[   18.859106] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   18.859108] sd 2:0:0:0: [sdb]
[   18.859110] Sense Key : Medium Error [current]
[   18.859114] Info fld=0x1234
[   18.859116] sd 2:0:0:0: [sdb]
[   18.859119] Add. Sense: Unrecovered read error
[   18.859122] sd 2:0:0:0: [sdb] CDB:
[   18.859124] Read(10): 28 00 00 00 11 e0 00 01 00 00

In the SCSI disk driver, sd_print_result() is implemented as follows:
   sd_print_result()
   {
   sd_printk(KERN_INFO, sdkp, " ");
   scsi_show_result(result);
   }
Here, sd_printk() first outputs "sd 2:0:0:0: [sdb] ", then scsi_show_result()
outputs "Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE". sd_printk() does not
include "\n", but it forcibly starts a new line. Therefore, when the driver
outputs error messages, those messages are split.

This patch makes those multi-line messages output on a single line as follows:

  # dmesg

[   17.145085]  sdb: unknown partition table
[   17.149096] sd 2:0:0:0: [sdb] Attached SCSI disk
[   18.166090] sd 2:0:0:0: [sdb] Unhandled sense code
[   18.166095] sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   18.166099] sd 2:0:0:0: [sdb] Sense Key : Medium Error [current]
[   18.166104] Info fld=0x1234
[   18.166106] sd 2:0:0:0: [sdb] Add. Sense: Unrecovered read error
[   18.166111] sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 11 e0 00 01 00 00
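
As a rough user-space illustration of the accumulate-then-emit idea behind scsi_log_add() in this patch (a sketch only, not the kernel code): the names struct log_line, log_line_init(), log_line_add(), log_line_flush() and LOG_LINE_MAX are hypothetical, and vsnprintf() stands in for the kernel's vscnprintf(), with the return value clamped to the number of characters actually stored.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LOG_LINE_MAX 128	/* hypothetical; the patch uses SCSI_LOG_LINE_MAX */

struct log_line {
	char buf[LOG_LINE_MAX];
	size_t offset;		/* characters stored so far, excluding the NUL */
};

void log_line_init(struct log_line *log)
{
	log->buf[0] = '\0';
	log->offset = 0;
}

/* Append formatted text and return the number of characters actually
 * stored. vsnprintf() returns the would-be length, so clamp it. */
size_t log_line_add(struct log_line *log, const char *fmt, ...)
{
	size_t room = LOG_LINE_MAX - log->offset;	/* always >= 1 */
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(log->buf + log->offset, room, fmt, args);
	va_end(args);

	if (n < 0) {
		/* Formatting error: keep the buffer terminated and unchanged. */
		log->buf[log->offset] = '\0';
		return 0;
	}
	if ((size_t)n >= room)
		n = (int)(room - 1);	/* output was truncated */
	log->offset += (size_t)n;
	return (size_t)n;
}

/* Emit the accumulated record with one call so it cannot be split. */
void log_line_flush(struct log_line *log, FILE *out)
{
	fprintf(out, "%s\n", log->buf);
	log_line_init(log);
}
```

A caller would add the device prefix and the detail text with two log_line_add() calls and then flush once, so the prefix and the detail reach the output as one record instead of two.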

Signed-off-by: Yoshihiro YUNOMAE 
Cc: James E.J. Bottomley 
Cc: Hannes Reinecke 
Cc: Kay Sievers 
Cc: linux-kernel@vger.kernel.org
Cc: linux-s...@vger.kernel.org
---
 drivers/scsi/constants.c |  206 --
 drivers/scsi/scsi.c  |   28 --
 drivers/scsi/sd.c|   19 +++-
 include/scsi/scsi_dbg.h  |   23 -
 4 files changed, 176 insertions(+), 100 deletions(-)

diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
index d35a5d6..cb93435 100644
--- a/drivers/scsi/constants.c
+++ b/drivers/scsi/constants.c
@@ -256,8 +256,26 @@ static const char * get_sa_name(const struct value_name_pair * arr,
return (k < arr_sz) ? arr->name : NULL;
 }
 
+/* Store a SCSI logging event to buf. */
+__printf(2, 3)
+void scsi_log_add(struct scsi_log_line *log, const char *fmt, ...)
+{
+   va_list args;
+   int len;
+
+   va_start(args, fmt);
+   len = vscnprintf(log->buf + log->offset,
+SCSI_LOG_LINE_MAX - log->offset, fmt, args);
+   WARN_ONCE(!len, "Cannot store the message '%s' in a local log buffer\n",
+ fmt);
+   log->offset += len;
+   va_end(args);
+}
+EXPORT_SYMBOL(scsi_log_add);
+
 /* attempt to guess cdb length if cdb_len==0 . No trailing linefeed. */
-static void print_opcode_name(unsigned char * cdbp, int cdb_len)
+static void print_opcode_name(unsigned char *cdbp, int cdb_len,
+ struct scsi_log_line *log)
 {
int sa, len, cdb0;
int fin_name = 0;
@@ -268,20 +286,22 @@ static void print_opcode_name(unsigned char * cdbp, int cdb_len)
case VARIABLE_LENGTH_CMD:
len = scsi_varlen_cdb_length(cdbp);
if (len < 10) {
-   printk("short variable length command, "
-  "len=%d ext_len=%d", len, cdb_len);
+   scsi_log_add(log,
+"short variable length command, len=%d ext_len=%d",
+len, cdb_len);
break;
}
sa = (cdbp[8] << 8) + cdbp[9];
name = get_sa_name(variable_length_arr, VARIABLE_LENGTH_SZ,

Re: [PATCH] ASoC: io: Clean up snd_soc_codec_set_cache_io()

2014-02-26 Thread Mark Brown

On Thu, Feb 27, 2014 at 09:37:45AM +0800, Xiubo Li wrote:

> I'm also wondering whether we could just drop the snd_soc_codec_set_cache_io()
> call from each individual driver to simplify the code, and just bind
> it to devm_regmap_init_i2c() and devm_regmap_init_spi()...

> Are there any other limitations on snd_soc_codec_set_cache_io() usage?

That's the goal overall, I'm not sure it's worth going through and
changing the signature of the function and then later going through and
merging it, it's just too much churn.  The main thing we need to do in
order to do that is to make sure nothing is relying on specific
sequencing during startup, probably by providing a way to manually set
the regmap pointer in the main device probe function rather than in the
ASoC one.


Re: [PATCH V2] sparse: Allow override of sizeof(bool) warning

2014-02-26 Thread Ben Pfaff

On Wed, Feb 26, 2014 at 07:38:46PM -0800, Joe Perches wrote:
> (adding Ben Pfaff and Christopher Li)
> 
> On Wed, 2014-02-26 at 19:29 -0800, H. Peter Anvin wrote:
> > On 02/26/2014 06:58 PM, Josh Triplett wrote:
> > > On Wed, Feb 26, 2014 at 06:53:14PM -0800, Joe Perches wrote:
> > >> Allow an override to emit or not the sizeof(bool) warning
> > >> Add a description to the manpage.
> > >>
> > >> Signed-off-by: Joe Perches 
> > > 
> > > Reviewed-by: Josh Triplett 
> > > 
> > 
> > I have to admit that this particular warning is a bit odd to me.  I'm
> > wondering what kind of bugs it was intended to catch.
> > 
> > > In particular, things that incorrectly assume the size of bool to be
> > anything in particular would seem unlikely to actually use sizeof().
> 
> Dunno, the commit log for the commit that added it doesn't quite
> match the code and is seemingly unaware that the c99 spec doesn't
> specify sizeof(bool).

The commit *relaxed* sparse behavior: because previously sizeof(bool)
was an error.  I'm not in favor of any diagnostic at all for
sizeof(bool), but my recollection is that a sparse maintainer wanted it
to yield one.

I don't care about the particular result for sizeof(bool) as long as it
matches the ABI.
