Re: [RFC PATCH v2 2/6] x86/arch_prctl: add ARCH_GET_NOPTI and ARCH_SET_NOPTI to enable/disable PTI

2018-01-09 Thread Ingo Molnar

* Willy Tarreau  wrote:

> On Wed, Jan 10, 2018 at 08:31:28AM +0100, Ingo Molnar wrote:
> > 
> > * Borislav Petkov  wrote:
> > 
> > > Oh, and you've built the kernel with the option to be able to disable
> > > PTI so it's not like you haven't seen it already.
> > 
> > In general in many corporate environments requiring kernel reboots or kernel
> > rebuilds limits the real-world usability of any kernel feature we offer down to
> > "non-existent". Saying "build your own kernel or reboot" is excluding a large
> > subset of our real-world users.
> > 
> > Build and boot options are fine for developers and testing. Otherwise _everything_
> > not readily accessible when your distro kernel has booted up is essentially behind
> > a usability (and corporate policy) wall so steep that it's essentially
> > non-existent to many users.
> > 
> > So either we make this properly sysctl (and/or prctl) controllable, or just don't
> > do it at all.
> 
> After having slept over it, I really prefer the sysctl+prctl approach.
> It's much more consistent with the rest of the tunables which act
> similarly. We have mmap_min_addr, mmap_rnd_bits, randomize_va_space, etc
> All of them are here to trade some protections for something else (mostly
> compatibility).
> 
> What I'd like to have would be a sysctl with 3 values :
>   -  0 : default disabled : arch_prctl() fails, this is the default
>   -  1 : forced enabled : arch_prctl() succeeds for CAP_SYS_RAWIO
>   - -1 : permanently disabled : fails and cannot be switched back to enabled.

Btw., I wouldn't call the value of 1 'forced enabled', it's simply enabled.

BTW., we might eventually also want to introduce a 'super flag' for all the 
permanent disabling features, so that sysadmins/distros who want to default to 
restrictive policies can set that and don't have to be aware of new tunables, and
this also protects against renames, etc.

Thanks,

Ingo
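
The three-value sysctl proposed above can be modeled as a small state machine. The following userspace C sketch only illustrates those semantics and is not code from the patch series: the NOPTI_* names, both helper functions, and the choice of error codes are assumptions made for this example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names; the three values follow the proposal quoted above. */
enum nopti_mode {
	NOPTI_LOCKED   = -1,	/* permanently disabled, cannot be re-enabled */
	NOPTI_DISABLED =  0,	/* default: arch_prctl() fails */
	NOPTI_ENABLED  =  1,	/* arch_prctl() succeeds for CAP_SYS_RAWIO */
};

/* Model of a write to the sysctl: once -1 has been written, it sticks. */
static int nopti_sysctl_write(int *mode, int new_mode)
{
	if (new_mode < NOPTI_LOCKED || new_mode > NOPTI_ENABLED)
		return -EINVAL;
	if (*mode == NOPTI_LOCKED && new_mode != NOPTI_LOCKED)
		return -EPERM;	/* assumed error code */
	*mode = new_mode;
	return 0;
}

/* Model of the check a per-task arch_prctl() request would have to pass. */
static int nopti_prctl_allowed(int mode, bool has_cap_sys_rawio)
{
	if (mode != NOPTI_ENABLED || !has_cap_sys_rawio)
		return -EPERM;	/* assumed error code */
	return 0;
}
```

With this model, a distro that wants the restrictive policy writes -1 once at boot, and every later attempt to write 0 or 1 is rejected.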


Re: [patches] [PATCH 1/6] riscv/ftrace: Add RECORD_MCOUNT support

2018-01-09 Thread Alan Kao
On Wed, Jan 10, 2018 at 08:43:54AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 10, 2018 at 03:38:09PM +0800, Alan Kao wrote:
> > -LDFLAGS_vmlinux :=
> > +ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> > +   LDFLAGS_vmlinux := --no-relax
> > +else
> > +   LDFLAGS_vmlinux :=
> > +endif
> 
> Why not:
> 
> LDFLAGS_vmlinux :=
> ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> LDFLAGS_vmlinux += --no-relax
> endif
> 
Thanks for the comment! This will be enhanced in the next try.

Alan


Re: [PATCH v2 3/4] clk: Show symbolic clock flags in debugfs

2018-01-09 Thread Geert Uytterhoeven
Hi Stephen,

On Wed, Jan 10, 2018 at 3:02 AM, Stephen Boyd  wrote:
> On 01/03, Geert Uytterhoeven wrote:
>> Currently the virtual "clk_flags" file in debugfs shows the numeric
>> value of the top-level framework flags for the specified clock.
>> Hence the user must manually interpret these values.
>>
>> Moreover, on big-endian 64-bit systems, the wrong half of the value is
>> shown, due to the cast from "unsigned long *" to "u32 *".
>>
>> Fix both issues by showing the symbolic flag names instead.
>> Any non-standard flags are shown as a hex number.
>>
>> Signed-off-by: Geert Uytterhoeven 

> I wonder if it can be a little simpler with something like the
> below patch squashed in? It would also be nice to detect if we

Sure, I didn't bother adding a macro for that as the list isn't that long,
and fairly static.

But feel free to fold in that part.

> fail to add another flag, but that may mean we need to make the
> flags into some sort of enum that we also set equal to BIT(x) and
> then have a case statement in the for loop instead of an array
> lookup. Not sure that's a big win.

Yes, you need an enum for that.
The case statement and for loop can be avoided by indexing the array
by enum value.

And the check for unknown flags can be moved to clock registration time
later, if you deem that's the way forward.

Thanks!

> @@ -2558,18 +2559,20 @@ static const struct {
>         unsigned long flag;
>         const char *name;
>  } clk_flags[] = {
> -       { CLK_SET_RATE_GATE,        "CLK_SET_RATE_GATE",        },
> -       { CLK_SET_PARENT_GATE,      "CLK_SET_PARENT_GATE",      },
> -       { CLK_SET_RATE_PARENT,      "CLK_SET_RATE_PARENT",      },
> -       { CLK_IGNORE_UNUSED,        "CLK_IGNORE_UNUSED",        },
> -       { CLK_IS_BASIC,             "CLK_IS_BASIC",             },
> -       { CLK_GET_RATE_NOCACHE,     "CLK_GET_RATE_NOCACHE",     },
> -       { CLK_SET_RATE_NO_REPARENT, "CLK_SET_RATE_NO_REPARENT", },
> -       { CLK_GET_ACCURACY_NOCACHE, "CLK_GET_ACCURACY_NOCACHE", },
> -       { CLK_RECALC_NEW_RATES,     "CLK_RECALC_NEW_RATES",     },
> -       { CLK_SET_RATE_UNGATE,      "CLK_SET_RATE_UNGATE",      },
> -       { CLK_IS_CRITICAL,          "CLK_IS_CRITICAL",          },
> -       { CLK_OPS_PARENT_ENABLE,    "CLK_OPS_PARENT_ENABLE",    },
> +#define ENTRY(f) { f, __stringify(f) }
> +   ENTRY(CLK_SET_RATE_GATE),
> +   ENTRY(CLK_SET_PARENT_GATE),
> +   ENTRY(CLK_SET_RATE_PARENT),
> +   ENTRY(CLK_IGNORE_UNUSED),
> +   ENTRY(CLK_IS_BASIC),
> +   ENTRY(CLK_GET_RATE_NOCACHE),
> +   ENTRY(CLK_SET_RATE_NO_REPARENT),
> +   ENTRY(CLK_GET_ACCURACY_NOCACHE),
> +   ENTRY(CLK_RECALC_NEW_RATES),
> +   ENTRY(CLK_SET_RATE_UNGATE),
> +   ENTRY(CLK_IS_CRITICAL),
> +   ENTRY(CLK_OPS_PARENT_ENABLE),
> +#undef ENTRY
>  };
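
The macro-generated table, together with the "unknown bits are shown as hex" behavior from the patch description, can be tried out in userspace. The sketch below is an illustration under assumptions: it uses only a subset of the flags, the bit positions are chosen for the example, and format_clk_flags() is a hypothetical helper rather than the debugfs code. It also uses `#f` instead of `__stringify(f)`: when the flags are object-like macros, as assumed here, a two-level stringify would emit their expansion rather than their name.

```c
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Illustrative subset; the bit positions are assumptions for this sketch. */
#define CLK_SET_RATE_GATE	(1UL << 0)
#define CLK_SET_PARENT_GATE	(1UL << 1)
#define CLK_SET_RATE_PARENT	(1UL << 2)
#define CLK_IGNORE_UNUSED	(1UL << 3)

static const struct {
	unsigned long flag;
	const char *name;
} clk_flags[] = {
/* '#f' stringifies the argument as written, before macro expansion. */
#define ENTRY(f) { f, #f }
	ENTRY(CLK_SET_RATE_GATE),
	ENTRY(CLK_SET_PARENT_GATE),
	ENTRY(CLK_SET_RATE_PARENT),
	ENTRY(CLK_IGNORE_UNUSED),
#undef ENTRY
};

/*
 * Write one line per known flag set in 'flags', then any leftover
 * (unknown) bits as a single hex number. Output is truncated to 'size'.
 */
static void format_clk_flags(char *buf, size_t size, unsigned long flags)
{
	size_t i, len = 0;

	if (size == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < ARRAY_SIZE(clk_flags) && len < size; i++) {
		if (!(flags & clk_flags[i].flag))
			continue;
		len += (size_t)snprintf(buf + len, size - len, "%s\n",
					clk_flags[i].name);
		flags &= ~clk_flags[i].flag;	/* mark this bit as handled */
	}

	if (flags && len < size)
		snprintf(buf + len, size - len, "0x%lx\n", flags);
}
```

Clearing each recognized bit as it is printed is what makes the final hex line contain only the flags that are missing from the table.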

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH v2] f2fs: handle newly created page when revoking inmem pages

2018-01-09 Thread Daeho Jeong
When committing inmem pages is successful, we revoke already committed
blocks in __revoke_inmem_pages() and finally replace the committed
ones with the old blocks using f2fs_replace_block(). However, if
the committed block was a newly created one, the address of the old
block is NEW_ADDR and __f2fs_replace_block() cannot handle NEW_ADDR
as new_blkaddr properly and a kernel panic occurs.

Signed-off-by: Daeho Jeong 
Tested-by: Shu Tan 
---
 fs/f2fs/segment.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..0673d08 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -248,7 +248,11 @@ static int __revoke_inmem_pages(struct inode *inode,
 				goto next;
 			}
 			get_node_info(sbi, dn.nid, &ni);
-			f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
+			if (cur->old_addr == NEW_ADDR) {
+				invalidate_blocks(sbi, dn.data_blkaddr);
+				f2fs_update_data_blkaddr(&dn, NEW_ADDR);
+			} else
+				f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
 					cur->old_addr, ni.version, true, true);
 			f2fs_put_dnode(&dn);
 		}
-- 
1.9.1



Re: [RFC git branches] rseq and membarrier against v4.15-rc7

2018-01-09 Thread Ingo Molnar

* Mathieu Desnoyers <mathieu.desnoy...@efficios.com> wrote:

> Hi,
> 
> I rebased the rseq and membarrier development branches on top of v4.15-rc7. I'm
> not sending an RFC round now considering that everyone in CC here has other
> fishes to fry (speculatively speaking).
> 
> Those with time on their hands and interested to test those branches can fetch
> them at:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/rseq/linux-rseq.git/
> 
> tags:
> v4.15-rc7-rseq-20180109
> v4.15-rc7-membarrier-20180109
> 
> Feedback is welcome, although I really hope those involved with the recent
> security effort will take a break and breathe some fresh air rather than
> look at my branches in the short term.

Please submit all pending membarrier patches you intend for v4.16 inclusion to me
and PeterZ (as a reviewable git-send-email patch series, not as a Git tree URI 
only), for inclusion into the scheduler tree.

Thanks,

Ingo


Re: BUG: soft lockup (2)

2018-01-09 Thread Eric Biggers
On Fri, Jan 05, 2018 at 09:47:01AM -0800, syzbot wrote:
> syzkaller has found reproducer for the following crash on
> e1915c8195b38393005be9b74bfa6a3a367c83b3
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
> 
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by:
> syzbot+f76f3c62dfadce022fd1c1deff15a61e09ac7...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed.
> 
> watchdog: BUG: soft lockup - CPU#0 stuck for 135s! [syzkaller670324:3527]
> Modules linked in:
> irq event stamp: 2531226
> hardirqs last  enabled at (2531225): []
> snd_pcm_stream_unlock_irq+0x78/0xe0 sound/core/pcm_native.c:166
> hardirqs last disabled at (2531226): [<3c6ef1cd>]
> apic_timer_interrupt+0xa4/0xb0 arch/x86/entry/entry_64.S:920
> softirqs last  enabled at (41848): [<81bd5f03>]
> __do_softirq+0x7a0/0xb85 kernel/softirq.c:311
> softirqs last disabled at (41829): [] invoke_softirq
> kernel/softirq.c:365 [inline]
> softirqs last disabled at (41829): [] irq_exit+0x1cc/0x200
> kernel/softirq.c:405
> CPU: 0 PID: 3527 Comm: syzkaller670324 Not tainted 4.15.0-rc6+ #158
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:memcpy+0x45/0x50 mm/kasan/kasan.c:305
> RSP: 0018:8801bf6676f0 EFLAGS: 0246 ORIG_RAX: ff11
> RAX: c9000137ba06 RBX: 0002 RCX: 
> RDX: 0002 RSI: 8801bf6677da RDI: c9000137ba08
> RBP: 8801bf667708 R08: f5200026f741 R09: f5200026f741
> R10: 0001 R11: f5200026f740 R12: c9000137ba06
> R13: 8801bf6677d8 R14: dc00 R15: c9000137ba06
> FS:  () GS:8801db20(0063) knlGS:f7ec6b40
> CS:  0010 DS: 002b ES: 002b CR0: 80050033
> CR2: 20735ee0 CR3: 0001bfba8002 CR4: 001606f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  memcpy include/linux/string.h:344 [inline]
>  cvt_s16_to_native sound/core/oss/mulaw.c:164 [inline]
>  mulaw_decode+0x52f/0x770 sound/core/oss/mulaw.c:195
>  mulaw_transfer+0x222/0x270 sound/core/oss/mulaw.c:273
>  snd_pcm_plug_write_transfer+0x22d/0x420 sound/core/oss/pcm_plugin.c:611
>  snd_pcm_oss_write2+0x260/0x420 sound/core/oss/pcm_oss.c:1311
>  snd_pcm_oss_write1 sound/core/oss/pcm_oss.c:1372 [inline]
>  snd_pcm_oss_write+0x5fe/0x830 sound/core/oss/pcm_oss.c:2646
>  __vfs_write+0xef/0x970 fs/read_write.c:480
>  vfs_write+0x189/0x510 fs/read_write.c:544
>  SYSC_write fs/read_write.c:589 [inline]
>  SyS_write+0xef/0x220 fs/read_write.c:581
>  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
>  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
>  entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129

Seems that this is fixed in sound/for-linus by:

#syz fix: ALSA: pcm: Abort properly at pending signal in OSS read/write loops


Re: [patches] [PATCH 1/6] riscv/ftrace: Add RECORD_MCOUNT support

2018-01-09 Thread Christoph Hellwig
On Wed, Jan 10, 2018 at 03:38:09PM +0800, Alan Kao wrote:
> -LDFLAGS_vmlinux :=
> +ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> + LDFLAGS_vmlinux := --no-relax
> +else
> + LDFLAGS_vmlinux :=
> +endif

Why not:

LDFLAGS_vmlinux :=
ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
LDFLAGS_vmlinux += --no-relax
endif



Re: [PATCH] Fix: membarrier: add missing preempt off around smp_call_function_many

2018-01-09 Thread Ingo Molnar

* Mathieu Desnoyers  wrote:

> Hi Linus,
> 
> Can you pick up this straightforward fix please ? Let me know whether
> I need to re-send the patch if for some reason the original post is
> too far back in your inbox.

The fix looks much more reasonable than previous attempts: I'll pick it up into 
tip:sched/urgent and send it Linuswards.

Thanks,

Ingo


[PATCH v2 2/2] phy: rockchip-emmc: use regmap_read_poll_timeout to poll dllrdy

2018-01-09 Thread Caesar Wang
From: Shawn Lin 

Just use the API instead of open-coding it, no functional change
intended.

Signed-off-by: Shawn Lin 
Reviewed-by: Brian Norris 
Signed-off-by: Caesar Wang 

---

Changes in v2:
- As Brian commented on https://patchwork.kernel.org/patch/10139891/,
  changed the note and added to print error value with
  regmap_read_poll_timeout API.

 drivers/phy/rockchip/phy-rockchip-emmc.c | 33 +++-
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c b/drivers/phy/rockchip/phy-rockchip-emmc.c
index 574838f..343c623 100644
--- a/drivers/phy/rockchip/phy-rockchip-emmc.c
+++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
@@ -79,6 +79,9 @@
 #define PHYCTRL_IS_CALDONE(x) \
 	((((x) >> PHYCTRL_CALDONE_SHIFT) & \
 	  PHYCTRL_CALDONE_MASK) == PHYCTRL_CALDONE_DONE)
+#define PHYCTRL_IS_DLLRDY(x) \
+	((((x) >> PHYCTRL_DLLRDY_SHIFT) & \
+	  PHYCTRL_DLLRDY_MASK) == PHYCTRL_DLLRDY_DONE)
 
 struct rockchip_emmc_phy {
 	unsigned int	reg_offset;
@@ -93,7 +96,6 @@ static int rockchip_emmc_phy_power(struct phy *phy, bool on_off)
 	unsigned int dllrdy;
 	unsigned int freqsel = PHYCTRL_FREQSEL_200M;
 	unsigned long rate;
-	unsigned long timeout;
 	int ret;
 
 	/*
@@ -217,28 +219,15 @@ static int rockchip_emmc_phy_power(struct phy *phy, bool on_off)
 * NOTE: There appear to be corner cases where the DLL seems to take
 * extra long to lock for reasons that aren't understood.  In some
 * extreme cases we've seen it take up to over 10ms (!).  We'll be
-* generous and give it 50ms.  We still busy wait here because:
-* - In most cases it should be super fast.
-* - This is not called lots during normal operation so it shouldn't
-*   be a power or performance problem to busy wait.  We expect it
-*   only at boot / resume.  In both cases, eMMC is probably on the
-*   critical path so busy waiting a little extra time should be OK.
+* generous and give it 50ms.
 */
-   timeout = jiffies + msecs_to_jiffies(50);
-   do {
-   udelay(1);
-
-   regmap_read(rk_phy->reg_base,
-   rk_phy->reg_offset + GRF_EMMCPHY_STATUS,
-   &dllrdy);
-   dllrdy = (dllrdy >> PHYCTRL_DLLRDY_SHIFT) & PHYCTRL_DLLRDY_MASK;
-   if (dllrdy == PHYCTRL_DLLRDY_DONE)
-   break;
-   } while (!time_after(jiffies, timeout));
-
-   if (dllrdy != PHYCTRL_DLLRDY_DONE) {
-   pr_err("rockchip_emmc_phy_power: dllrdy timeout.\n");
-   return -ETIMEDOUT;
+   ret = regmap_read_poll_timeout(rk_phy->reg_base,
+  rk_phy->reg_offset + GRF_EMMCPHY_STATUS,
+  dllrdy, PHYCTRL_IS_DLLRDY(dllrdy),
+  1, 50 * USEC_PER_MSEC);
+   if (ret) {
+   pr_err("%s: dllrdy timeout. ret=%d\n", __func__, ret);
+   return ret;
}
 
return 0;
-- 
2.7.4
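
The loop this patch removes and the helper that replaces it follow the same pattern: read a status value, test a condition, sleep briefly, and give up once a deadline has passed. The userspace sketch below models that pattern against a simulated device with a fake clock so that it runs deterministically. It is not the regmap implementation; struct fake_phy, poll_dllrdy(), and the bit layout behind IS_DLLRDY() are all hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical layout of the DLL-ready bit in the status word. */
#define DLLRDY_SHIFT	5
#define DLLRDY_MASK	0x1u
#define DLLRDY_DONE	0x1u
#define IS_DLLRDY(x)	((((x) >> DLLRDY_SHIFT) & DLLRDY_MASK) == DLLRDY_DONE)

/* Simulated device: the ready bit appears once the fake clock reaches ready_at_us. */
struct fake_phy {
	uint64_t now_us;
	uint64_t ready_at_us;
};

static uint32_t fake_read_status(const struct fake_phy *phy)
{
	return phy->now_us >= phy->ready_at_us ?
	       (uint32_t)(DLLRDY_DONE << DLLRDY_SHIFT) : 0;
}

/* Advance the fake clock; always move forward so the poll loop terminates. */
static void fake_sleep_us(struct fake_phy *phy, unsigned int us)
{
	phy->now_us += us ? us : 1;
}

/*
 * Poll until the ready bit is set or timeout_us has elapsed.
 * Returns 0 on success and -ETIMEDOUT otherwise; *val holds the
 * last status value that was read.
 */
static int poll_dllrdy(struct fake_phy *phy, uint32_t *val,
		       unsigned int sleep_us, uint64_t timeout_us)
{
	uint64_t deadline = phy->now_us + timeout_us;

	for (;;) {
		*val = fake_read_status(phy);
		if (IS_DLLRDY(*val))
			return 0;
		if (phy->now_us > deadline)
			return -ETIMEDOUT;
		fake_sleep_us(phy, sleep_us);
	}
}
```

Checking the condition before the deadline on every pass means a value that becomes ready exactly at the deadline is still reported as success.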



Re: [PATCH][next] IB/mlx5: remove redundant assignment of mdev

2018-01-09 Thread Leon Romanovsky
On Tue, Jan 09, 2018 at 03:55:43PM +, Colin King wrote:
> From: Colin Ian King 
>
> The initial assignment to mdev is redundant as mdev is re-assigned
> later and the first assigned value is never read. Remove this
> redundant assignment.
>
> Cleans up clang warning:
> drivers/infiniband/hw/mlx5/main.c:359:24: warning: Value stored
> to 'mdev' during its initialization is never read
>
> Signed-off-by: Colin Ian King 
> ---
>  drivers/infiniband/hw/mlx5/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

Thanks,
Acked-by: Leon Romanovsky 


Re: [PATCH 04/18] arm: implement nospec_ptr()

2018-01-09 Thread Hanjun Guo
On 2018/1/10 10:04, Laura Abbott wrote:
> On 01/05/2018 05:10 PM, Dan Williams wrote:
>> From: Mark Rutland 
>>
>> This patch implements nospec_ptr() for arm, following the recommended
>> architectural sequences for the arm and thumb instruction sets.
>>
> Fedora picked up the series and it fails on arm:
> 
> In file included from ./include/linux/compiler.h:242:0,
>  from ./include/uapi/linux/swab.h:6,
>  from ./include/linux/swab.h:5,
>  from ./arch/arm/include/asm/opcodes.h:89,
>  from ./arch/arm/include/asm/bug.h:7,
>  from ./include/linux/bug.h:5,
>  from ./include/linux/mmdebug.h:5,
>  from ./include/linux/gfp.h:5,
>  from ./include/linux/slab.h:15,
>  from kernel/fork.c:14:
> ./include/linux/fdtable.h: In function '__fcheck_files':
> ./arch/arm/include/asm/barrier.h:112:41: error: expected declaration 
> specifiers or '...' before numeric constant
>   __load_no_speculate(&__np_ptr, lo, hi, 0, __np_ptr);  \
>  ^
> ./arch/arm/include/asm/barrier.h:68:32: note: in definition of macro 
> '__load_no_speculate_n'
>    (typeof(*ptr)(unsigned long)(failval));  \
>     ^~~
> ./arch/arm/include/asm/barrier.h:112:2: note: in expansion of macro 
> '__load_no_speculate'
>   __load_no_speculate(&__np_ptr, lo, hi, 0, __np_ptr);  \
>   ^~~
> ./include/asm-generic/barrier.h:122:2: note: in expansion of macro 
> 'nospec_ptr'
>   nospec_ptr(__arr + __idx, __arr, __arr + __sz);   \
>   ^~
> ./include/linux/fdtable.h:86:13: note: in expansion of macro 
> 'nospec_array_ptr'
>   if ((fdp = nospec_array_ptr(fdt->fd, fd, fdt->max_fds)))
>  ^~~~
> ./arch/arm/include/asm/barrier.h:112:41: error: expected declaration 
> specifiers or '...' before numeric constant
>   __load_no_speculate(&__np_ptr, lo, hi, 0, __np_ptr);  \
>  ^
> ./arch/arm/include/asm/barrier.h:68:32: note: in definition of macro 
> '__load_no_speculate_n'
>    (typeof(*ptr)(unsigned long)(failval));  \
>     ^~~
> ./arch/arm/include/asm/barrier.h:112:2: note: in expansion of macro 
> '__load_no_speculate'
>   __load_no_speculate(&__np_ptr, lo, hi, 0, __np_ptr);  \
>   ^~~
> ./include/asm-generic/barrier.h:122:2: note: in expansion of macro 
> 'nospec_ptr'
>   nospec_ptr(__arr + __idx, __arr, __arr + __sz);   \
>   ^~
> ./include/linux/fdtable.h:86:13: note: in expansion of macro 
> 'nospec_array_ptr'
>   if ((fdp = nospec_array_ptr(fdt->fd, fd, fdt->max_fds)))
>  ^~~~
> ./arch/arm/include/asm/barrier.h:112:41: error: expected declaration 
> specifiers or '...' before numeric constant
>   __load_no_speculate(&__np_ptr, lo, hi, 0, __np_ptr);  \
>  ^
> ./arch/arm/include/asm/barrier.h:68:32: note: in definition of macro 
> '__load_no_speculate_n'
>    (typeof(*ptr)(unsigned long)(failval));  \
>     ^~~
> ./arch/arm/include/asm/barrier.h:112:2: note: in expansion of macro 
> '__load_no_speculate'
>   __load_no_speculate(&__np_ptr, lo, hi, 0, __np_ptr);  \
>   ^~~
> ./include/asm-generic/barrier.h:122:2: note: in expansion of macro 
> 'nospec_ptr'
>   nospec_ptr(__arr + __idx, __arr, __arr + __sz);   \
>   ^~
> ./include/linux/fdtable.h:86:13: note: in expansion of macro 
> 'nospec_array_ptr'
>   if ((fdp = nospec_array_ptr(fdt->fd, fd, fdt->max_fds)))
> 
> I can't puzzle out what exactly is the problem here, except that it really
> does not seem to like that failval. Does the arm compiler not like doing
> the typeof with the __arr + __idx?

>> +#define __load_no_speculate_n(ptr, lo, hi, failval, cmpptr, sz)    \
>> +({    \
>> +    typeof(*ptr) __nln_val;    \
>> +    typeof(*ptr) __failval =    \
>> +    (typeof(*ptr)(unsigned long)(failval));    \

It is just a typo: the closing parenthesis of the typeof(*ptr) cast is misplaced.

- (typeof(*ptr)(unsigned long)(failval)); \
+ (typeof(*ptr))(unsigned long)(failval); \

Please try it.

Thanks
Hanjun
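
For readers following along, here is a minimal user-space sketch of the corrected cast. It assumes GCC or Clang (for `__typeof__`), and `FAILVAL_AS` and the two wrapper functions are hypothetical names that are not part of the patch. With the misplaced parenthesis, `typeof(*ptr)(unsigned long)(failval)` appears to be parsed as a function type, so the expanded `(0)` is read as a parameter list, which would explain the "expected declaration specifiers" error above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): convert failval to the type
 * that ptr points to, with the cast parenthesized as in the fix.
 */
#define FAILVAL_AS(ptr, failval) \
	((__typeof__(*(ptr)))(unsigned long)(failval))

/* *slot has pointer type, so the result is a null pointer for failval 0. */
static int *failval_for_int_ptr(int **slot)
{
	return FAILVAL_AS(slot, 0);
}

/* *slot has integer type, so the result is a plain zero for failval 0. */
static unsigned char failval_for_byte(const unsigned char *slot)
{
	return FAILVAL_AS(slot, 0);
}
```

Going through `unsigned long` first lets the same macro body serve both pointer and integer element types.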



Re: [PATCH][next] IB/mlx5: remove redundant assignment of mdev

2018-01-09 Thread Leon Romanovsky
On Tue, Jan 09, 2018 at 03:55:43PM +, Colin King wrote:
> From: Colin Ian King 
>
> The initial assignment to mdev is redundant as mdev is re-assigned
> later and the first assigned value is never read. Remove this
> redundant assignment.
>
> Cleans up clang warning:
> drivers/infiniband/hw/mlx5/main.c:359:24: warning: Value stored
> to 'mdev' during its initialization is never read
>
> Signed-off-by: Colin Ian King 
> ---
>  drivers/infiniband/hw/mlx5/main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

Thanks,
Acked-by: Leon Romanovsky 




[PATCH 3/6] riscv/ftrace: Add dynamic function graph tracer support

2018-01-09 Thread Alan Kao
Once the function_graph tracer is enabled, a filtered function has the
following call sequence:

* ftrace_caller      ==> on/off by ftrace_make_call/ftrace_make_nop
* ftrace_graph_caller
* ftrace_graph_call  ==> on/off by ftrace_en/disable_ftrace_graph_caller
* prepare_ftrace_return

Considering that the DYNAMIC_FTRACE_WITH_REGS feature, which introduces
another hook entry, ftrace_regs_caller, will access ftrace_graph_call
when needed, it is more extensible to have a ftrace_graph_caller
function instead of calling prepare_ftrace_return directly in ftrace_caller.

Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/kernel/ftrace.c | 25 +++-
 arch/riscv/kernel/mcount-dyn.S | 65 ++
 2 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 49d2d799f532..239ef5d56f24 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -45,7 +45,7 @@ static int __ftrace_modify_call(unsigned long hook_pos, 
unsigned long target,
unsigned int nops[2] = {NOP4, NOP4};
int ret = 0;
 
-   /* when ftrace_make_nop is called */
+   /* for ftrace_make_nop and ftrace_disable_ftrace_graph_caller */
if (!enable)
ret = ftrace_check_current_call(hook_pos, calls);
 
@@ -99,6 +99,7 @@ int __init ftrace_dyn_arch_init(void)
 }
 #endif
 
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
 /*
  * Most of this function is copied from arm64.
  */
@@ -131,3 +132,25 @@ void prepare_ftrace_return(unsigned long *parent, unsigned 
long self_addr,
return;
*parent = return_hooker;
 }
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+extern void ftrace_graph_call(void);
+int ftrace_enable_ftrace_graph_caller(void)
+{
+   int ret = ftrace_check_current_call((unsigned long)&ftrace_graph_call,
+                                       NULL);
+
+   if (ret)
+       return ret;
+
+   return __ftrace_modify_call((unsigned long)&ftrace_graph_call,
+                               (unsigned long)&prepare_ftrace_return, true);
+}
+
+int ftrace_disable_ftrace_graph_caller(void)
+{
+   return __ftrace_modify_call((unsigned long)&ftrace_graph_call,
+                               (unsigned long)&prepare_ftrace_return, false);
+}
+#endif /* CONFIG_DYNAMIC_FTRACE */
+#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 57f80fe09cbd..64e715d4e180 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -14,18 +14,63 @@
.text
 
.macro SAVE_ABI_STATE
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   addi sp, sp, -48
+   sd  s0, 32(sp)
+   sd  ra, 40(sp)
+   addi s0, sp, 48
+   sd  t0, 24(sp)
+   sd  t1, 16(sp)
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+   sd  t2, 8(sp)
+#endif
+#else
addi sp, sp, -16
sd  s0, 0(sp)
sd  ra, 8(sp)
addi s0, sp, 16
+#endif
.endm
 
.macro RESTORE_ABI_STATE
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   ld  s0, 32(sp)
+   ld  ra, 40(sp)
+   addi sp, sp, 48
+#else
ld  ra, 8(sp)
ld  s0, 0(sp)
addi sp, sp, 16
+#endif
.endm
 
+   .macro RESTORE_GRAPH_ARGS
+   ld  a0, 24(sp)
+   ld  a1, 16(sp)
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+   ld  a2, 8(sp)
+#endif
+   .endm
+
+ENTRY(ftrace_graph_caller)
+   addi sp, sp, -16
+   sd  s0, 0(sp)
+   sd  ra, 8(sp)
+   addi s0, sp, 16
+ftrace_graph_call:
+   .global ftrace_graph_call
+   /*
+* Calling ftrace_enable/disable_ftrace_graph_caller would overwrite the
+* nops below.  Check ftrace_modify_all_code for details.
+*/
+   addi x0, x0, 0
+   addi x0, x0, 0
+   ld  ra, 8(sp)
+   ld  s0, 0(sp)
+   addi sp, sp, 16
+   ret
+ENDPROC(ftrace_graph_caller)
+
 ENTRY(ftrace_caller)
/*
 * a0: the address in the caller when calling ftrace_caller
@@ -33,6 +78,20 @@ ENTRY(ftrace_caller)
 */
ld  a1, -8(s0)
addi a0, ra, -MCOUNT_INSN_SIZE
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   /*
+* the graph tracer (specifically, prepare_ftrace_return) needs these
+* arguments but for now the function tracer occupies the regs, so we
+* save them in temporary regs to recover later.
+*/
+   addi t0, s0, -8
+   mv  t1, a0
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+   ld  t2, -16(s0)
+#endif
+#endif
+
SAVE_ABI_STATE
 ftrace_call:
.global ftrace_call
@@ -47,6 +106,12 @@ ftrace_call:
 */
addi x0, x0, 0
addi x0, x0, 0
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   RESTORE_GRAPH_ARGS
+   call ftrace_graph_caller
+#endif
+
RESTORE_ABI_STATE
ret
 ENDPROC(ftrace_caller)
-- 
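
As a reading aid, the 48-byte frame that SAVE_ABI_STATE builds when CONFIG_FUNCTION_GRAPH_TRACER is set can be pictured as a C struct. This is only a sketch: the struct, its field names, and `restore_graph_args` are hypothetical and do not appear in the patch. The offsets are taken from the `sd`/`ld` instructions in the hunks above, and the argument meanings are inferred from how t0, t1, and t2 are set up in ftrace_caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the frame; one 64-bit slot per 8 bytes of stack. */
struct graph_abi_frame {
	uint64_t unused;     /*  0(sp): not written by the macro              */
	uint64_t t2_fp;      /*  8(sp): t2, only with FUNCTION_GRAPH_FP_TEST  */
	uint64_t t1_self;    /* 16(sp): t1, copy of a0 (self address)         */
	uint64_t t0_parent;  /* 24(sp): t0, s0 - 8 (parent return-addr slot)  */
	uint64_t s0;         /* 32(sp): saved frame pointer                   */
	uint64_t ra;         /* 40(sp): saved return address                  */
};

/* Argument registers as loaded by RESTORE_GRAPH_ARGS. */
struct graph_args {
	uint64_t a0, a1, a2;
};

/* Mirrors RESTORE_GRAPH_ARGS: a0 <- 24(sp), a1 <- 16(sp), a2 <- 8(sp). */
static struct graph_args restore_graph_args(const struct graph_abi_frame *f)
{
	struct graph_args args = { f->t0_parent, f->t1_self, f->t2_fp };
	return args;
}
```

Fixed-width fields keep the offsets the same on any host, so the layout can be checked with `offsetof` without a RISC-V toolchain.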

[PATCH 1/6] riscv/ftrace: Add RECORD_MCOUNT support

2018-01-09 Thread Alan Kao
Now recordmcount.pl recognizes RISC-V object files. For the mechanism to
work, we have to disable the linker relaxation. This is because
relaxation happens after the script records offsets of _mcount call
sites, resulting in an unreliable record.

Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/Kconfig  | 1 +
 arch/riscv/Makefile | 6 +-
 scripts/recordmcount.pl | 5 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 504ba386b22e..346dd1b0fb05 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -112,6 +112,7 @@ config ARCH_RV64I
select 64BIT
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_GRAPH_TRACER
+   select HAVE_FTRACE_MCOUNT_RECORD
 
 endchoice
 
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6719dd30ec5b..2bc39c6d9662 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -10,7 +10,11 @@
 
 LDFLAGS :=
 OBJCOPYFLAGS:= -O binary
-LDFLAGS_vmlinux :=
+ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
+   LDFLAGS_vmlinux := --no-relax
+else
+   LDFLAGS_vmlinux :=
+endif
 KBUILD_AFLAGS_MODULE += -fPIC
 KBUILD_CFLAGS_MODULE += -fPIC
 
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 2033af758173..d44d55db7c06 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -376,6 +376,11 @@ if ($arch eq "x86_64") {
 $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s__mcount\$";
 $type = ".quad";
 $alignment = 8;
+} elsif ($arch eq "riscv") {
+$function_regex = "^([0-9a-fA-F]+)\\s+<([^.0-9][0-9a-zA-Z_\\.]+)>:";
+$mcount_regex = "^\\s*([0-9a-fA-F]+):\\sR_RISCV_CALL\\s_mcount\$";
+$type = ".quad";
+$alignment = 2;
 } else {
 die "Arch $arch is not supported with CONFIG_FTRACE_MCOUNT_RECORD";
 }
-- 
2.15.1
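
To illustrate what the new $mcount_regex accepts, here is a rough C equivalent of the pattern. It is a sketch only: `parse_mcount_reloc` is a hypothetical name, and the sample lines used to exercise it are made up rather than captured from real objdump output.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Rough equivalent of: ^\s*([0-9a-fA-F]+):\sR_RISCV_CALL\s_mcount$
 * On a match, stores the captured hex offset in *offset and returns true.
 */
static bool parse_mcount_reloc(const char *line, unsigned long *offset)
{
	static const char reloc[] = "R_RISCV_CALL";
	static const char sym[] = "_mcount";
	const char *p = line;
	unsigned long value = 0;
	int digits = 0;

	/* ^\s* : optional leading whitespace */
	while (isspace((unsigned char)*p))
		p++;

	/* ([0-9a-fA-F]+) : at least one hex digit */
	for (; isxdigit((unsigned char)*p); p++, digits++) {
		int c = tolower((unsigned char)*p);

		value = value * 16 +
			(unsigned long)(isdigit(c) ? c - '0' : c - 'a' + 10);
	}
	if (digits == 0 || *p++ != ':')
		return false;

	/* \s : exactly one whitespace character before the relocation type */
	if (!isspace((unsigned char)*p++))
		return false;
	if (strncmp(p, reloc, sizeof(reloc) - 1) != 0)
		return false;
	p += sizeof(reloc) - 1;

	/* \s : exactly one whitespace character before the symbol */
	if (!isspace((unsigned char)*p++))
		return false;
	if (strncmp(p, sym, sizeof(sym) - 1) != 0)
		return false;
	p += sizeof(sym) - 1;

	/* $ : end of string, or a single trailing newline as in Perl */
	if (!(p[0] == '\0' || (p[0] == '\n' && p[1] == '\0')))
		return false;

	*offset = value;
	return true;
}
```

The captured offset is what the script records for each call site, which is why offsets that shift during linker relaxation would make the record unreliable.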



[PATCH 6/6] riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support

2018-01-09 Thread Alan Kao
When unwinding in walk_stackframe, pc is now obtained by calling
ftrace_graph_ret_addr instead of being calculated manually.

Note that the original expression,
pc = frame->ra - 4
is buggy if the instruction at the return address happens to be a
compressed instruction. But since it is not a critical part of ftrace and
is RISC-V-specific behavior, it is ignored for now to ease the
review process.

Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/include/asm/ftrace.h | 1 +
 arch/riscv/kernel/ftrace.c  | 2 +-
 arch/riscv/kernel/stacktrace.c  | 6 ++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 429a6a156645..6e4b4c96b63e 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -8,6 +8,7 @@
 #if defined(CONFIG_FUNCTION_GRAPH_TRACER) && defined(CONFIG_FRAME_POINTER)
 #define HAVE_FUNCTION_GRAPH_FP_TEST
 #endif
+#define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
 
 #define ARCH_SUPPORTS_FTRACE_OPS 1
 #ifndef __ASSEMBLY__
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index c9cc884961d7..e02ecd44fe47 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -144,7 +144,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned 
long self_addr,
return;
 
err = ftrace_push_return_trace(old, self_addr, ,
-  frame_pointer, NULL);
+  frame_pointer, parent);
if (err == -EBUSY)
return;
*parent = return_hooker;
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 559aae781154..a4b1d94371a0 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_FRAME_POINTER
 
@@ -63,7 +64,12 @@ static void notrace walk_stackframe(struct task_struct *task,
frame = (struct stackframe *)fp - 1;
sp = fp;
fp = frame->fp;
+#ifdef HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
+   pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
+  (unsigned long *)(fp - 8));
+#else
pc = frame->ra - 0x4;
+#endif
}
 }
 
-- 
2.15.1
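
The compressed-instruction caveat in the commit message can be made concrete with a small sketch. In the RISC-V encoding, an instruction whose two lowest bits are not 0b11 is a 16-bit compressed instruction, so subtracting a fixed 4 from a return address is only right after a 32-bit call. The helper names below are hypothetical, encodings longer than 32 bits are deliberately ignored, and the caller is assumed to already know the first 16-bit parcel of the call instruction.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Length in bytes of an instruction, given its first 16-bit parcel.
 * Only the standard 16-bit and 32-bit encodings are distinguished.
 */
static unsigned int riscv_insn_len(uint16_t first_parcel)
{
	return (first_parcel & 0x3) == 0x3 ? 4 : 2;
}

/* Address of the call instruction that produced return address ra. */
static uint64_t call_site_from_ra(uint64_t ra, uint16_t call_first_parcel)
{
	return ra - riscv_insn_len(call_first_parcel);
}
```

For example, 0x80e7 is the low parcel of `jalr ra, 0(ra)` (4 bytes), while 0x9082 encodes `c.jalr ra` (2 bytes), so the same return address maps to two different call-site addresses.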



[PATCH 2/6] riscv/ftrace: Add dynamic function tracer support

2018-01-09 Thread Alan Kao
We now have dynamic ftrace with the following added items:

* ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c)
  These two functions turn any recorded call site of a filtered function
  into a call to ftrace_caller or into nops.

* ftrace_update_ftrace_func (in kernel/ftrace.c)
  Turns the nops at ftrace_call into a call to a generic entry for
  function tracers.

* ftrace_caller (in kernel/mcount-dyn.S)
  The entry that each _mcount call site calls once the function
  is filtered to be traced.

Also, this patch fixes the semantic problems in mcount.S, which will be
treated as only a reference implementation once we have the dynamic
ftrace.

Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/Kconfig  |  1 +
 arch/riscv/include/asm/ftrace.h | 45 
 arch/riscv/kernel/Makefile  |  5 ++-
 arch/riscv/kernel/ftrace.c  | 94 -
 arch/riscv/kernel/mcount-dyn.S  | 52 +++
 arch/riscv/kernel/mcount.S  | 22 ++
 6 files changed, 207 insertions(+), 12 deletions(-)
 create mode 100644 arch/riscv/kernel/mcount-dyn.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 346dd1b0fb05..96db66272db5 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -113,6 +113,7 @@ config ARCH_RV64I
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
+   select HAVE_DYNAMIC_FTRACE
 
 endchoice
 
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 66d4175eb13e..acf0c7d001f3 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -8,3 +8,48 @@
 #if defined(CONFIG_FUNCTION_GRAPH_TRACER) && defined(CONFIG_FRAME_POINTER)
 #define HAVE_FUNCTION_GRAPH_FP_TEST
 #endif
+
+#ifndef __ASSEMBLY__
+void _mcount(void);
+static inline unsigned long ftrace_call_adjust(unsigned long addr)
+{
+   return addr;
+}
+
+struct dyn_arch_ftrace {
+};
+#endif
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+/*
+ * A general call in RISC-V is a pair of insts:
+ * 1) auipc: setting high-20 pc-related bits to ra register
+ * 2) jalr: setting low-12 offset to ra, jump to ra, and set ra to
+ *  return address (original pc + 4)
+ *
+ * Dynamic ftrace generates probes to call sites, so we must deal with
+ * both auipc and jalr at the same time.
+ */
+
+#define MCOUNT_ADDR        ((unsigned long)_mcount)
+#define JALR_SIGN_MASK     (0x00000800)
+#define JALR_OFFSET_MASK   (0x00000fff)
+#define AUIPC_OFFSET_MASK  (0xfffff000)
+#define AUIPC_PAD          (0x00001000)
+#define JALR_SHIFT         20
+#define JALR_BASIC         (0x000080e7)
+#define AUIPC_BASIC        (0x00000097)
+#define NOP4               (0x00000013)
+
+#define to_jalr_insn(_offset) \
+   (((_offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_BASIC)
+
+#define to_auipc_insn(_offset) ((_offset & JALR_SIGN_MASK) ? \
+   (((_offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_BASIC) : \
+   ((_offset & AUIPC_OFFSET_MASK) | AUIPC_BASIC))
+
+/*
+ * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here.
+ */
+#define MCOUNT_INSN_SIZE 8
+#endif
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 196f62ffc428..d7bdf888f1ca 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -34,7 +34,8 @@ CFLAGS_setup.o := -mcmodel=medany
 obj-$(CONFIG_SMP)  += smpboot.o
 obj-$(CONFIG_SMP)  += smp.o
 obj-$(CONFIG_MODULES)  += module.o
-obj-$(CONFIG_FUNCTION_TRACER)  += mcount.o
-obj-$(CONFIG_FUNCTION_GRAPH_TRACER)+= ftrace.o
+
+obj-$(CONFIG_FUNCTION_TRACER)  += mcount.o ftrace.o
+obj-$(CONFIG_DYNAMIC_FTRACE)   += mcount-dyn.o
 
 clean:
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index d0de68d144cb..49d2d799f532 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -6,9 +6,101 @@
  */
 
 #include 
+#include 
+#include 
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+static int ftrace_check_current_call(unsigned long hook_pos,
+unsigned int *expected)
+{
+   unsigned int replaced[2];
+   unsigned int nops[2] = {NOP4, NOP4};
+
+   /* we expect nops at the hook position */
+   if (!expected)
+   expected = nops;
+
+   /* read the text we want to modify */
+   if (probe_kernel_read(replaced, (void *)hook_pos, MCOUNT_INSN_SIZE))
+   return -EFAULT;
+
+   /* Make sure it is what we expect it to be */
+   if (replaced[0] != expected[0] || replaced[1] != expected[1]) {
+   pr_err("%p: expected (%08x %08x) but get (%08x %08x)",
+  (void *)hook_pos, expected[0], expected[1], replaced[0],
+  replaced[1]);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int __ftrace_modify_call(unsigned long hook_pos, 
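
The auipc+jalr encoding described in the ftrace.h comment can be tried out in user space. The sketch below restates to_jalr_insn and to_auipc_insn as functions, assuming the masks cover the low 12 and high 20 bits as that comment describes. `decode_call_offset` is a hypothetical helper added only to check the round trip. Because jalr sign-extends its 12-bit immediate, the auipc part is bumped by 0x1000 whenever bit 11 of the offset is set.

```c
#include <assert.h>
#include <stdint.h>

#define JALR_SIGN_MASK     0x00000800u
#define JALR_OFFSET_MASK   0x00000fffu
#define AUIPC_OFFSET_MASK  0xfffff000u
#define AUIPC_PAD          0x00001000u
#define JALR_SHIFT         20
#define JALR_BASIC         0x000080e7u  /* jalr  ra, 0(ra) */
#define AUIPC_BASIC        0x00000097u  /* auipc ra, 0     */

/* Low 12 bits of the offset go into the jalr immediate field. */
static uint32_t to_jalr_insn(uint32_t offset)
{
	return ((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_BASIC;
}

/* High 20 bits go into auipc, padded when the jalr part is negative. */
static uint32_t to_auipc_insn(uint32_t offset)
{
	uint32_t hi = offset & AUIPC_OFFSET_MASK;

	if (offset & JALR_SIGN_MASK)
		hi += AUIPC_PAD;
	return hi | AUIPC_BASIC;
}

/* Hypothetical inverse: recover the 32-bit offset from the pair. */
static uint32_t decode_call_offset(uint32_t auipc, uint32_t jalr)
{
	uint32_t hi = auipc & AUIPC_OFFSET_MASK;
	uint32_t lo = jalr >> JALR_SHIFT;

	if (lo & JALR_SIGN_MASK)
		lo |= AUIPC_OFFSET_MASK;  /* sign-extend the 12-bit immediate */
	return hi + lo;
}
```

Offsets are handled as unsigned 32-bit values here so that negative displacements wrap in a well-defined way.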

[PATCH 4/6] riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support

2018-01-09 Thread Alan Kao
Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/include/asm/ftrace.h | 1 +
 arch/riscv/kernel/mcount-dyn.S  | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index acf0c7d001f3..429a6a156645 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -9,6 +9,7 @@
 #define HAVE_FUNCTION_GRAPH_FP_TEST
 #endif
 
+#define ARCH_SUPPORTS_FTRACE_OPS 1
 #ifndef __ASSEMBLY__
 void _mcount(void);
 static inline unsigned long ftrace_call_adjust(unsigned long addr)
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 64e715d4e180..627478571c7a 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -75,9 +75,12 @@ ENTRY(ftrace_caller)
/*
 * a0: the address in the caller when calling ftrace_caller
 * a1: the caller's return address
+* a2: the address of global variable function_trace_op
 */
ld  a1, -8(s0)
addi a0, ra, -MCOUNT_INSN_SIZE
+   la  t5, function_trace_op
+   ld  a2, 0(t5)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
/*
-- 
2.15.1



[PATCH 5/6] riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support

2018-01-09 Thread Alan Kao
Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/Kconfig |   1 +
 arch/riscv/kernel/ftrace.c |  17 ++
 arch/riscv/kernel/mcount-dyn.S | 124 +
 3 files changed, 142 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 96db66272db5..06685bcf5643 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -114,6 +114,7 @@ config ARCH_RV64I
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_DYNAMIC_FTRACE
+   select HAVE_DYNAMIC_FTRACE_WITH_REGS
 
 endchoice
 
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 239ef5d56f24..c9cc884961d7 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -99,6 +99,23 @@ int __init ftrace_dyn_arch_init(void)
 }
 #endif
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+  unsigned long addr)
+{
+   unsigned int offset = (unsigned int)(old_addr - rec->ip);
+   unsigned int auipc_call = to_auipc_insn(offset);
+   unsigned int jalr_call = to_jalr_insn(offset);
+   unsigned int calls[2] = {auipc_call, jalr_call};
+   int ret = ftrace_check_current_call(rec->ip, calls);
+
+   if (ret)
+   return ret;
+
+   return __ftrace_modify_call(rec->ip, addr, true);
+}
+#endif
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 /*
  * Most of this function is copied from arm64.
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 627478571c7a..3ec3ddbfb5e7 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -118,3 +118,127 @@ ftrace_call:
RESTORE_ABI_STATE
ret
 ENDPROC(ftrace_caller)
+
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+   .macro SAVE_ALL
+   addi sp, sp, -(PT_SIZE_ON_STACK+16)
+   sd  s0, (PT_SIZE_ON_STACK)(sp)
+   sd  ra, (PT_SIZE_ON_STACK+8)(sp)
+   addi s0, sp, (PT_SIZE_ON_STACK+16)
+
+   sd x1,  PT_RA(sp)
+   sd x2,  PT_SP(sp)
+   sd x3,  PT_GP(sp)
+   sd x4,  PT_TP(sp)
+   sd x5,  PT_T0(sp)
+   sd x6,  PT_T1(sp)
+   sd x7,  PT_T2(sp)
+   sd x8,  PT_S0(sp)
+   sd x9,  PT_S1(sp)
+   sd x10, PT_A0(sp)
+   sd x11, PT_A1(sp)
+   sd x12, PT_A2(sp)
+   sd x13, PT_A3(sp)
+   sd x14, PT_A4(sp)
+   sd x15, PT_A5(sp)
+   sd x16, PT_A6(sp)
+   sd x17, PT_A7(sp)
+   sd x18, PT_S2(sp)
+   sd x19, PT_S3(sp)
+   sd x20, PT_S4(sp)
+   sd x21, PT_S5(sp)
+   sd x22, PT_S6(sp)
+   sd x23, PT_S7(sp)
+   sd x24, PT_S8(sp)
+   sd x25, PT_S9(sp)
+   sd x26, PT_S10(sp)
+   sd x27, PT_S11(sp)
+   sd x28, PT_T3(sp)
+   sd x29, PT_T4(sp)
+   sd x30, PT_T5(sp)
+   sd x31, PT_T6(sp)
+   .endm
+
+   .macro RESTORE_ALL
+   ld x1,  PT_RA(sp)
+   ld x2,  PT_SP(sp)
+   ld x3,  PT_GP(sp)
+   ld x4,  PT_TP(sp)
+   ld x5,  PT_T0(sp)
+   ld x6,  PT_T1(sp)
+   ld x7,  PT_T2(sp)
+   ld x8,  PT_S0(sp)
+   ld x9,  PT_S1(sp)
+   ld x10, PT_A0(sp)
+   ld x11, PT_A1(sp)
+   ld x12, PT_A2(sp)
+   ld x13, PT_A3(sp)
+   ld x14, PT_A4(sp)
+   ld x15, PT_A5(sp)
+   ld x16, PT_A6(sp)
+   ld x17, PT_A7(sp)
+   ld x18, PT_S2(sp)
+   ld x19, PT_S3(sp)
+   ld x20, PT_S4(sp)
+   ld x21, PT_S5(sp)
+   ld x22, PT_S6(sp)
+   ld x23, PT_S7(sp)
+   ld x24, PT_S8(sp)
+   ld x25, PT_S9(sp)
+   ld x26, PT_S10(sp)
+   ld x27, PT_S11(sp)
+   ld x28, PT_T3(sp)
+   ld x29, PT_T4(sp)
+   ld x30, PT_T5(sp)
+   ld x31, PT_T6(sp)
+
+   ld  s0, (PT_SIZE_ON_STACK)(sp)
+   ld  ra, (PT_SIZE_ON_STACK+8)(sp)
+   addi sp, sp, (PT_SIZE_ON_STACK+16)
+   .endm
+
+   .macro RESTORE_GRAPH_REG_ARGS
+   ld  a0, PT_T0(sp)
+   ld  a1, PT_T1(sp)
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+   ld  a2, PT_T2(sp)
+#endif
+   .endm
+
+/*
+ * Most of the contents are the same as ftrace_caller.
+ */
+ENTRY(ftrace_regs_caller)
+   /*
+* a3: the address of all registers in the stack
+*/
+   ld  a1, -8(s0)
+   addi a0, ra, -MCOUNT_INSN_SIZE
+   la  t5, function_trace_op
+   ld  a2, 0(t5)
+   addi a3, sp, -(PT_SIZE_ON_STACK+16)
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   addit0, s0, -8
+   mv  t1, a0
+#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
+   ld  t2, -16(s0)
+#endif
+#endif
+   SAVE_ALL
+
+ftrace_regs_call:
+   .global ftrace_regs_call
+   addix0, x0, 0
+   addix0, x0, 0
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   RESTORE_GRAPH_REG_ARGS
+   callftrace_graph_caller
+#endif
+
+   RESTORE_ALL
+   ret
+ENDPROC(ftrace_regs_caller)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
-- 
2.15.1
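
For readers following the series, ftrace_modify_call() in this patch (arch/riscv/kernel/ftrace.c) follows a check-then-patch pattern: it confirms that the call site still holds the expected auipc+jalr pair before rewriting it. Below is a minimal user-space sketch of that pattern, not the kernel code. check_current_call() and modify_call() are hypothetical names, and a plain array stands in for kernel text.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NOP4 0x00000013u /* addi x0, x0, 0 */

/* Compare the two 32-bit words at the call site with what we expect.
 * A NULL 'expected' means "expect a pair of nops", as in the patch. */
static int check_current_call(const uint32_t *site, const uint32_t *expected)
{
    static const uint32_t nops[2] = { NOP4, NOP4 };

    if (!expected)
        expected = nops;
    return memcmp(site, expected, 2 * sizeof(uint32_t)) ? -EINVAL : 0;
}

/* Rewrite the call site only if it currently holds the expected pair. */
static int modify_call(uint32_t *site, const uint32_t *expected,
                       const uint32_t *replacement)
{
    int ret;

    assert(site && replacement);
    ret = check_current_call(site, expected);
    if (ret)
        return ret;
    memcpy(site, replacement, 2 * sizeof(uint32_t));
    return 0;
}
```

Refusing to patch when the site does not match keeps a stale or wrong address from silently corrupting text; the kernel version additionally has to read the text safely and flush the instruction cache, which this sketch leaves out.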



[PATCH 2/6] riscv/ftrace: Add dynamic function tracer support

2018-01-09 Thread Alan Kao
We now have dynamic ftrace with the following added items:

* ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c)
  The two functions turn any recorded call site of filtered functions
  into a call to ftrace_caller or nops

* ftrace_update_ftrace_func (in kernel/ftrace.c)
  turns the nops at ftrace_call into a call to a generic entry for
  function tracers.

* ftrace_caller (in kernel/mcount-dyn.S)
  The entry that each _mcount call site calls once the function
  is filtered to be traced.

Also, this patch fixes the semantic problems in mcount.S, which will be
treated as only a reference implementation once we have the dynamic
ftrace.

Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/Kconfig  |  1 +
 arch/riscv/include/asm/ftrace.h | 45 
 arch/riscv/kernel/Makefile  |  5 ++-
 arch/riscv/kernel/ftrace.c  | 94 -
 arch/riscv/kernel/mcount-dyn.S  | 52 +++
 arch/riscv/kernel/mcount.S  | 22 ++
 6 files changed, 207 insertions(+), 12 deletions(-)
 create mode 100644 arch/riscv/kernel/mcount-dyn.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 346dd1b0fb05..96db66272db5 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -113,6 +113,7 @@ config ARCH_RV64I
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
+   select HAVE_DYNAMIC_FTRACE
 
 endchoice
 
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 66d4175eb13e..acf0c7d001f3 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -8,3 +8,48 @@
 #if defined(CONFIG_FUNCTION_GRAPH_TRACER) && defined(CONFIG_FRAME_POINTER)
 #define HAVE_FUNCTION_GRAPH_FP_TEST
 #endif
+
+#ifndef __ASSEMBLY__
+void _mcount(void);
+static inline unsigned long ftrace_call_adjust(unsigned long addr)
+{
+   return addr;
+}
+
+struct dyn_arch_ftrace {
+};
+#endif
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+/*
+ * A general call in RISC-V is a pair of insts:
+ * 1) auipc: setting high-20 pc-related bits to ra register
+ * 2) jalr: setting low-12 offset to ra, jump to ra, and set ra to
+ *  return address (original pc + 4)
+ *
+ * Dynamic ftrace generates probes to call sites, so we must deal with
+ * both auipc and jalr at the same time.
+ */
+
+#define MCOUNT_ADDR        ((unsigned long)_mcount)
+#define JALR_SIGN_MASK     (0x00000800)
+#define JALR_OFFSET_MASK   (0x00000fff)
+#define AUIPC_OFFSET_MASK  (0xfffff000)
+#define AUIPC_PAD          (0x00001000)
+#define JALR_SHIFT         20
+#define JALR_BASIC         (0x000080e7)
+#define AUIPC_BASIC        (0x00000097)
+#define NOP4               (0x00000013)
+
+#define to_jalr_insn(_offset) \
+   (((_offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_BASIC)
+
+#define to_auipc_insn(_offset) ((_offset & JALR_SIGN_MASK) ? \
+   (((_offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_BASIC) : \
+   ((_offset & AUIPC_OFFSET_MASK) | AUIPC_BASIC))
+
+/*
+ * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here.
+ */
+#define MCOUNT_INSN_SIZE 8
+#endif
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 196f62ffc428..d7bdf888f1ca 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -34,7 +34,8 @@ CFLAGS_setup.o := -mcmodel=medany
 obj-$(CONFIG_SMP)  += smpboot.o
 obj-$(CONFIG_SMP)  += smp.o
 obj-$(CONFIG_MODULES)  += module.o
-obj-$(CONFIG_FUNCTION_TRACER)  += mcount.o
-obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
+
+obj-$(CONFIG_FUNCTION_TRACER)  += mcount.o ftrace.o
+obj-$(CONFIG_DYNAMIC_FTRACE)   += mcount-dyn.o
 
 clean:
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index d0de68d144cb..49d2d799f532 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -6,9 +6,101 @@
  */
 
 #include 
+#include 
+#include 
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+static int ftrace_check_current_call(unsigned long hook_pos,
+unsigned int *expected)
+{
+   unsigned int replaced[2];
+   unsigned int nops[2] = {NOP4, NOP4};
+
+   /* we expect nops at the hook position */
+   if (!expected)
+   expected = nops;
+
+   /* read the text we want to modify */
+   if (probe_kernel_read(replaced, (void *)hook_pos, MCOUNT_INSN_SIZE))
+   return -EFAULT;
+
+   /* Make sure it is what we expect it to be */
+   if (replaced[0] != expected[0] || replaced[1] != expected[1]) {
+   pr_err("%p: expected (%08x %08x) but get (%08x %08x)",
+  (void *)hook_pos, expected[0], expected[1], replaced[0],
+  replaced[1]);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target,
+ 
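
The header comment in this patch describes a RISC-V call as an auipc+jalr pair, and to_auipc_insn() adds AUIPC_PAD when bit 11 of the offset is set because jalr sign-extends its 12-bit immediate. The following user-space sketch works through that arithmetic. It assumes AUIPC_OFFSET_MASK covers the upper 20 bits of the offset; decode_call_offset() and check_roundtrip() are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define JALR_SIGN_MASK    0x00000800u
#define JALR_OFFSET_MASK  0x00000fffu
#define AUIPC_OFFSET_MASK 0xfffff000u /* assumed: upper 20 bits */
#define AUIPC_PAD         0x00001000u
#define JALR_SHIFT        20
#define JALR_BASIC        0x000080e7u /* jalr  ra, 0(ra) */
#define AUIPC_BASIC       0x00000097u /* auipc ra, 0     */

/* Low 12 bits of the offset go into the jalr immediate field. */
static uint32_t to_jalr_insn(uint32_t offset)
{
    return ((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_BASIC;
}

/* High 20 bits go into auipc; add one page when the low part will be
 * sign-extended to a negative value by jalr. */
static uint32_t to_auipc_insn(uint32_t offset)
{
    uint32_t hi = offset & AUIPC_OFFSET_MASK;

    if (offset & JALR_SIGN_MASK)
        hi += AUIPC_PAD;
    return hi | AUIPC_BASIC;
}

/* Hypothetical inverse: recover the 32-bit pc-relative offset that the
 * hardware would compute from the instruction pair. */
static uint32_t decode_call_offset(uint32_t auipc, uint32_t jalr)
{
    uint32_t hi = auipc & AUIPC_OFFSET_MASK;
    uint32_t lo12 = (jalr >> JALR_SHIFT) & JALR_OFFSET_MASK;
    /* Sign-extend the 12-bit immediate using unsigned wrap-around. */
    uint32_t lo = (lo12 & JALR_SIGN_MASK) ? lo12 - 0x1000u : lo12;

    return hi + lo;
}

/* Encoding followed by decoding should give the original offset. */
static void check_roundtrip(uint32_t offset)
{
    assert(decode_call_offset(to_auipc_insn(offset),
                              to_jalr_insn(offset)) == offset);
}
```

For example, an offset of 0x1800 has bit 11 set, so the pair encodes 0x2000 in auipc and -0x800 in jalr, which sum back to 0x1800.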

[PATCH 4/6] riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support

2018-01-09 Thread Alan Kao
Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 arch/riscv/include/asm/ftrace.h | 1 +
 arch/riscv/kernel/mcount-dyn.S  | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index acf0c7d001f3..429a6a156645 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -9,6 +9,7 @@
 #define HAVE_FUNCTION_GRAPH_FP_TEST
 #endif
 
+#define ARCH_SUPPORTS_FTRACE_OPS 1
 #ifndef __ASSEMBLY__
 void _mcount(void);
 static inline unsigned long ftrace_call_adjust(unsigned long addr)
diff --git a/arch/riscv/kernel/mcount-dyn.S b/arch/riscv/kernel/mcount-dyn.S
index 64e715d4e180..627478571c7a 100644
--- a/arch/riscv/kernel/mcount-dyn.S
+++ b/arch/riscv/kernel/mcount-dyn.S
@@ -75,9 +75,12 @@ ENTRY(ftrace_caller)
/*
 * a0: the address in the caller when calling ftrace_caller
 * a1: the caller's return address
+* a2: the address of global variable function_trace_op
 */
ld  a1, -8(s0)
addi a0, ra, -MCOUNT_INSN_SIZE
+   la  t5, function_trace_op
+   ld  a2, 0(t5)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
/*
-- 
2.15.1
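
As a rough illustration of why this patch loads function_trace_op into a2: with ARCH_SUPPORTS_FTRACE_OPS defined, the tracer callback is expected to receive the ftrace_ops pointer as its third argument, after the traced address (a0) and the parent return address (a1). The sketch below mimics that argument passing in user space. struct ops_sketch, trace_func_sketch_t, function_trace_op_sketch and ftrace_caller_sketch() are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MCOUNT_INSN_SIZE 8 /* auipc + jalr, as defined in this series */

/* Hypothetical stand-in for struct ftrace_ops. */
struct ops_sketch {
    unsigned long hits;
    unsigned long last_ip;
};

/* Callback shape: (ip, parent_ip, op, regs). */
typedef void (*trace_func_sketch_t)(unsigned long ip, unsigned long parent_ip,
                                    struct ops_sketch *op, void *regs);

/* Stand-in for the global function_trace_op pointer. */
static struct ops_sketch *function_trace_op_sketch;

static void count_hit(unsigned long ip, unsigned long parent_ip,
                      struct ops_sketch *op, void *regs)
{
    (void)parent_ip;
    (void)regs;
    op->hits++;
    op->last_ip = ip;
}

/* Mirrors the register setup in ftrace_caller: a0, a1, a2. */
static void ftrace_caller_sketch(trace_func_sketch_t func, unsigned long ra,
                                 unsigned long parent_ra)
{
    unsigned long a0 = ra - MCOUNT_INSN_SIZE;       /* traced call site */
    unsigned long a1 = parent_ra;                   /* caller's return address */
    struct ops_sketch *a2 = function_trace_op_sketch;

    assert(a2 != NULL);
    func(a0, a1, a2, NULL);
}
```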



Re: [RFC PATCH v2 2/6] x86/arch_prctl: add ARCH_GET_NOPTI and ARCH_SET_NOPTI to enable/disable PTI

2018-01-09 Thread Willy Tarreau
On Wed, Jan 10, 2018 at 08:31:28AM +0100, Ingo Molnar wrote:
> 
> * Borislav Petkov  wrote:
> 
> > Oh, and you've built the kernel with the option to be able to disable
> > PTI so it's not like you haven't seen it already.
> 
> In general in many corporate environments requiring kernel reboots or kernel
> rebuilds limits the real-world usability of any kernel feature we offer down
> to "non-existent". Saying "build your own kernel or reboot" is excluding a large
> subset of our real-world users.
> 
> Build and boot options are fine for developers and testing. Otherwise
> _everything_ not readily accessible when your distro kernel has booted up is
> essentially behind a usability (and corporate policy) wall so steep that it's
> essentially non-existent to many users.
> 
> So either we make this properly sysctl (and/or prctl) controllable, or just
> don't do it at all.

After having slept over it, I really prefer the sysctl+prctl approach.
It's much more consistent with the rest of the tunables which act
similarly. We have mmap_min_addr, mmap_rnd_bits, randomize_va_space, etc.
All of them are here to trade some protections for something else (mostly
compatibility).

What I'd like to have would be a sysctl with 3 values :
  -  0 : default disabled : arch_prctl() fails, this is the default
  -  1 : forced enabled : arch_prctl() succeeds for CAP_SYS_RAWIO
  - -1 : permanently disabled : fails and cannot be switched back to enabled.

This way the admin always has the last choice, it's possible to globally
disable it without having to fear that someone would enable it again if
desired, and it's possible to try if it helps without rebooting in
emergency situations.

Willy
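
To make the three-value proposal above concrete, here is a small user-space sketch of the state machine it implies. Nothing here is existing kernel code: struct nopti_policy, nopti_sysctl_write() and nopti_prctl_allowed() are hypothetical names, and the choice of -EPERM and -EINVAL as error codes is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    NOPTI_PERM_DISABLED = -1, /* fails, cannot be switched back */
    NOPTI_DISABLED      = 0,  /* default: arch_prctl() fails */
    NOPTI_ENABLED       = 1,  /* arch_prctl() succeeds for CAP_SYS_RAWIO */
};

struct nopti_policy {
    int state;
};

/* Admin writes a new value to the (hypothetical) sysctl. */
static int nopti_sysctl_write(struct nopti_policy *p, int val)
{
    assert(p != NULL);
    if (val < NOPTI_PERM_DISABLED || val > NOPTI_ENABLED)
        return -EINVAL;
    /* Once permanently disabled, only rewriting -1 is accepted. */
    if (p->state == NOPTI_PERM_DISABLED)
        return val == NOPTI_PERM_DISABLED ? 0 : -EPERM;
    p->state = val;
    return 0;
}

/* Would arch_prctl(ARCH_SET_NOPTI) be allowed for this caller? */
static int nopti_prctl_allowed(const struct nopti_policy *p,
                               bool has_cap_sys_rawio)
{
    assert(p != NULL);
    if (p->state != NOPTI_ENABLED)
        return -EPERM;
    return has_cap_sys_rawio ? 0 : -EPERM;
}
```

The one-way transition into -1 is what gives the admin the last word: after it, neither the sysctl nor arch_prctl() can re-enable the feature until reboot.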


[PATCH 0/6] Add dynamic ftrace support for RISC-V platforms

2018-01-09 Thread Alan Kao
This patch set includes the building blocks of dynamic ftrace features
for RISC-V machines.

Alan Kao (6):
  riscv/ftrace: Add RECORD_MCOUNT support
  riscv/ftrace: Add dynamic function tracer support
  riscv/ftrace: Add dynamic function graph tracer support
  riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support
  riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support
  riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support

 arch/riscv/Kconfig  |   3 +
 arch/riscv/Makefile |   6 +-
 arch/riscv/include/asm/ftrace.h |  47 
 arch/riscv/kernel/Makefile  |   5 +-
 arch/riscv/kernel/ftrace.c  | 136 +-
 arch/riscv/kernel/mcount-dyn.S  | 244 
 arch/riscv/kernel/mcount.S  |  22 ++--
 arch/riscv/kernel/stacktrace.c  |   6 +
 scripts/recordmcount.pl |   5 +
 9 files changed, 460 insertions(+), 14 deletions(-)
 create mode 100644 arch/riscv/kernel/mcount-dyn.S

-- 
2.15.1



[PATCH v2 0/2] phy: rockchip-emmc: fixes emmc-phy power on failed with rk3399 SoCs

2018-01-09 Thread Caesar Wang
Hi Kishon,

Since Shawn isn't available, I am taking over this patch series for now.

The original bug was tracked at https://issuetracker.google.com/71561742.
In some cases, powering on the mmc phy failed during boot.
The log is shown below:
...
[    2.375333] rockchip_emmc_phy_power: caldone timeout.
[    2.377815] phy phy-ff77.syscon:phy@f780.4: phy poweron failed --> -110
...
[    2.489295] mmc0: mmc_select_hs400es failed, error -110
[    2.489302] mmc0: error -110 whilst initialising MMC card
...

In practice, waiting 5us for calpad busy trimming is not enough.
We need to give it enough margin.

Verified on:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4
This patch series can also be applied to kernel-next and brings up an
rk3399 chromebook.

-Caesar


Changes in v2:
- print the return value when regmap_read_poll_timeout fails.
- As Brian commented on https://patchwork.kernel.org/patch/10139891/,
  changed the note and added printing of the error value with the
  regmap_read_poll_timeout API.

Shawn Lin (2):
  phy: rockchip-emmc: retry calpad busy trimming
  phy: rockchip-emmc: use regmap_read_poll_timeout to poll dllrdy

 drivers/phy/rockchip/phy-rockchip-emmc.c | 60 +++-
 1 file changed, 28 insertions(+), 32 deletions(-)

-- 
2.7.4



[PATCH v2 1/2] phy: rockchip-emmc: retry calpad busy trimming

2018-01-09 Thread Caesar Wang
From: Shawn Lin 

It turns out that 5us isn't enough for all cases, so let's
retry some more times to wait for caldone.

Signed-off-by: Shawn Lin 
Tested-by: Ziyuan Xu 
Signed-off-by: Caesar Wang 
---

Changes in v2:
- print the return value when regmap_read_poll_timeout fails.

 drivers/phy/rockchip/phy-rockchip-emmc.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c b/drivers/phy/rockchip/phy-rockchip-emmc.c
index f1b24f1..574838f 100644
--- a/drivers/phy/rockchip/phy-rockchip-emmc.c
+++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
@@ -76,6 +76,10 @@
 #define PHYCTRL_OTAPDLYSEL_MASK    0xf
 #define PHYCTRL_OTAPDLYSEL_SHIFT   0x7
 
+#define PHYCTRL_IS_CALDONE(x) \
+   ((((x) >> PHYCTRL_CALDONE_SHIFT) & \
+     PHYCTRL_CALDONE_MASK) == PHYCTRL_CALDONE_DONE)
+
 struct rockchip_emmc_phy {
unsigned int    reg_offset;
struct regmap   *reg_base;
@@ -90,6 +94,7 @@ static int rockchip_emmc_phy_power(struct phy *phy, bool on_off)
unsigned int freqsel = PHYCTRL_FREQSEL_200M;
unsigned long rate;
unsigned long timeout;
+   int ret;
 
/*
 * Keep phyctrl_pdb and phyctrl_endll low to allow
@@ -160,17 +165,19 @@ static int rockchip_emmc_phy_power(struct phy *phy, bool on_off)
   PHYCTRL_PDB_SHIFT));
 
/*
-* According to the user manual, it asks driver to
-* wait 5us for calpad busy trimming
+* According to the user manual, it asks driver to wait 5us for
+* calpad busy trimming. However it is documented that this value is
+* PVT(A.K.A process,voltage and temperature) relevant, so some
+* failure cases are found which indicates we should be more tolerant
+* to calpad busy trimming.
 */
-   udelay(5);
-   regmap_read(rk_phy->reg_base,
-   rk_phy->reg_offset + GRF_EMMCPHY_STATUS,
-   &caldone);
-   caldone = (caldone >> PHYCTRL_CALDONE_SHIFT) & PHYCTRL_CALDONE_MASK;
-   if (caldone != PHYCTRL_CALDONE_DONE) {
-   pr_err("rockchip_emmc_phy_power: caldone timeout.\n");
-   return -ETIMEDOUT;
+   ret = regmap_read_poll_timeout(rk_phy->reg_base,
+  rk_phy->reg_offset + GRF_EMMCPHY_STATUS,
+  caldone, PHYCTRL_IS_CALDONE(caldone),
+  5, 50);
+   if (ret) {
+   pr_err("%s: caldone timeout, ret=%d\n", __func__, ret);
+   return ret;
}
 
/* Set the frequency of the DLL operation */
-- 
2.7.4
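
The change above replaces a single fixed 5us wait with polling every 5us for up to 50us. As a rough user-space model of that behaviour (not the regmap implementation), the sketch below polls a fake status register against a simulated clock. struct fake_phy, fake_read_status() and poll_caldone() are hypothetical, and the CALDONE shift, mask and done values are assumed for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define CALDONE_SHIFT 6
#define CALDONE_MASK  0x1u
#define CALDONE_DONE  0x1u
#define IS_CALDONE(x) ((((x) >> CALDONE_SHIFT) & CALDONE_MASK) == CALDONE_DONE)

/* A fake PHY whose calibration completes after 'ready_after_us'. */
struct fake_phy {
    unsigned int ready_after_us;
    unsigned int now_us; /* simulated clock, advanced by the poll loop */
};

static uint32_t fake_read_status(const struct fake_phy *phy)
{
    return phy->now_us >= phy->ready_after_us ?
           (CALDONE_DONE << CALDONE_SHIFT) : 0;
}

/* Poll until caldone is set or 'timeout_us' has elapsed. */
static int poll_caldone(struct fake_phy *phy, unsigned int sleep_us,
                        unsigned int timeout_us)
{
    unsigned int deadline = phy->now_us + timeout_us;

    assert(sleep_us > 0); /* otherwise the simulated clock never advances */
    for (;;) {
        if (IS_CALDONE(fake_read_status(phy)))
            return 0;
        if (phy->now_us >= deadline)
            return -ETIMEDOUT;
        phy->now_us += sleep_us; /* stands in for the real sleep */
    }
}
```

A part that needs 20us to calibrate fails under the old single 5us wait but succeeds here on the fifth read, while a part that never finishes still returns -ETIMEDOUT once the 50us budget is spent.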



Re: [PATCH 3/3] mmc: sdhci: fix o2 eMMC init bug and add support for hardware tuning

2018-01-09 Thread Adrian Hunter
On 28/12/17 12:00, ernest.zhang wrote:
> In some case of eMMC used as boot device, the eMMC signaling voltage is
> fixed to 1.8v, bios can set o2 sd host controller register 0x308 bit4 to
> let host controller skip try 3.3.v signaling voltage in eMMC initialize
> process.
> O2 sd host controller has a function named hardware tuning. In software
> tuning mode CPU should send multiple command to host controller but in
> hardware tuning mode, CPU need send only one tuning command to sd host
> controller. It can improve the speed linux boot from eMMC.

Please put the changes from patch 2 into this patch, re-base, and put V3 on
the next submission.  Note the mmc tree is:

git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git

Re-base on the 'next' branch.

Also Craig Bergstrom reported problems getting it to work, so please respond
to that, otherwise I am left wondering if this rather strange tuning
procedure actually works:

https://marc.info/?l=linux-mmc&m=151387671202146&w=2

> 
> Signed-off-by: ernest.zhang 
> ---
>  drivers/mmc/host/sdhci-pci-o2micro.c | 218 ++-
>  1 file changed, 217 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-pci-o2micro.c b/drivers/mmc/host/sdhci-pci-o2micro.c
> index 14273ca00641..fd244d88b07e 100644
> --- a/drivers/mmc/host/sdhci-pci-o2micro.c
> +++ b/drivers/mmc/host/sdhci-pci-o2micro.c
> @@ -16,11 +16,211 @@
>   */
>  
>  #include 
> -
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
>  #include "sdhci.h"
>  #include "sdhci-pci.h"
>  #include "sdhci-pci-o2micro.h"
>  
> +static void sdhci_o2_start_tuning(struct sdhci_host *host)
> +{
> + u16 ctrl;
> +
> + ctrl = sdhci_readw(host, SDHCI_HOST_CONTROL2);
> + ctrl |= SDHCI_CTRL_EXEC_TUNING;
> + if (host->quirks2 & SDHCI_QUIRK2_TUNING_WORK_AROUND)
> + ctrl |= SDHCI_CTRL_TUNED_CLK;

Why program for a quirk that you don't use?

> + sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);
> +
> + /*
> +  * As per the Host Controller spec v3.00, tuning command
> +  * generates Buffer Read Ready interrupt, so enable that.
> +  *
> +  * Note: The spec clearly says that when tuning sequence
> +  * is being performed, the controller does not generate
> +  * interrupts other than Buffer Read Ready interrupt. But
> +  * to make sure we don't hit a controller bug, we _only_
> +  * enable Buffer Read Ready interrupt here.
> +  */
> + sdhci_writel(host, SDHCI_INT_DATA_AVAIL, SDHCI_INT_ENABLE);
> + sdhci_writel(host, SDHCI_INT_DATA_AVAIL, SDHCI_SIGNAL_ENABLE);
> +}
> +
> +static void sdhci_o2_end_tuning(struct sdhci_host *host)
> +{
> + sdhci_writel(host, host->ier, SDHCI_INT_ENABLE);
> + sdhci_writel(host, host->ier, SDHCI_SIGNAL_ENABLE);
> +}
> +
> +static inline bool sdhci_data_line_cmd(struct mmc_command *cmd)
> +{
> + return cmd->data || cmd->flags & MMC_RSP_BUSY;
> +}
> +
> +static void sdhci_del_timer(struct sdhci_host *host, struct mmc_request *mrq)
> +{
> + if (sdhci_data_line_cmd(mrq->cmd))
> + del_timer(&host->data_timer);
> + else
> + del_timer(&host->timer);
> +}
> +
> +static void sdhci_o2_set_tuning_mode(struct sdhci_host *host, bool hw)

Since you only ever call this with hw = true, maybe drop the 'hw' parameter
altogether.

> +{
> + u16 reg;
> +
> + if (hw) {
> + // enable hardware tuning

For consistency, please use old C-style comments /* */ instead of //

> + reg = sdhci_readw(host, O2_SD_VENDOR_SETTING);
> + reg &= (~O2_SD_HW_TUNING_ENABLE);
> + sdhci_writew(host, reg, O2_SD_VENDOR_SETTING);
> + } else {
> + reg = sdhci_readw(host, O2_SD_VENDOR_SETTING);
> + reg |= O2_SD_HW_TUNING_ENABLE;
> + sdhci_writew(host, reg, O2_SD_VENDOR_SETTING);
> + }
> +}
> +
> +static u8 data_buf[64];

It would be better to allocate data_buf.

> +
> +static void sdhci_o2_send_tuning(struct sdhci_host *host, u32 opcode)
> +{
> + struct mmc_command cmd = { };
> + struct mmc_data data = { };
> + struct scatterlist sg;
> + struct mmc_request mrq = { };
> + unsigned long flags;
> + u32 b = host->sdma_boundary;
> + int size = sizeof(data_buf);
> +
> + cmd.opcode = opcode;
> + cmd.flags = MMC_RSP_PRESENT | MMC_RSP_OPCODE | MMC_RSP_CRC;
> + cmd.mrq = &mrq;
> + mrq.cmd = &cmd;
> + mrq.data = &data;
> + data.blksz = size;
> + data.blocks = 1;
> + data.flags = MMC_DATA_READ;
> +
> + data.timeout_ns = 150 * NSEC_PER_MSEC;

It seems inconsistent to set 150ms timeout here but 50ms for waiting for
Buffer Read Ready.

> +
> + data.sg = &sg;
> + data.sg_len = 1;
> + sg_init_one(&sg, data_buf, size);
> +
> + spin_lock_irqsave(&host->lock, flags);
> +
> + sdhci_writew(host, SDHCI_MAKE_BLKSZ(b, 64), SDHCI_BLOCK_SIZE);
> +
> + /*
> +  * The tuning block is sent by the card to the host controller.

Re: [PATCH 3/3] mmc: sdhci: fix o2 eMMC init bug and add support for hardware tuning

2018-01-09 Thread Adrian Hunter
On 28/12/17 12:00, ernest.zhang wrote:
> In some case of eMMC used as boot device, the eMMC signaling voltage is
> fixed to 1.8v, bios can set o2 sd host controller register 0x308 bit4 to
> let host controller skip try 3.3.v signaling voltage in eMMC initialize
> process.
> O2 sd host controller has a function named hardware tuning. In software
> tuning mode CPU should send multiple command to host controller but in
> hardware tuning mode, CPU need send only one tuning command to sd host
> controller. It can improve the speed linux boot from eMMC.

Please put the changes from patch 2 into this patch, re-base, and put V3 on
the next submission.  Note the mmc tree is:

git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git

Re-base on the 'next' branch.

Also Craig Bergstrom reported problems getting it to work, so please respond
to that, otherwise I am left wondering if this rather strange tuning
procedure actually works:

https://marc.info/?l=linux-mmc=151387671202146=2

> 
> Signed-off-by: ernest.zhang 
> ---
>  drivers/mmc/host/sdhci-pci-o2micro.c | 218 
> ++-
>  1 file changed, 217 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-pci-o2micro.c 
> b/drivers/mmc/host/sdhci-pci-o2micro.c
> index 14273ca00641..fd244d88b07e 100644
> --- a/drivers/mmc/host/sdhci-pci-o2micro.c
> +++ b/drivers/mmc/host/sdhci-pci-o2micro.c
> @@ -16,11 +16,211 @@
>   */
>  
>  #include 
> -
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
>  #include "sdhci.h"
>  #include "sdhci-pci.h"
>  #include "sdhci-pci-o2micro.h"
>  
> +static void sdhci_o2_start_tuning(struct sdhci_host *host)
> +{
> + u16 ctrl;
> +
> + ctrl = sdhci_readw(host, SDHCI_HOST_CONTROL2);
> + ctrl |= SDHCI_CTRL_EXEC_TUNING;
> + if (host->quirks2 & SDHCI_QUIRK2_TUNING_WORK_AROUND)
> + ctrl |= SDHCI_CTRL_TUNED_CLK;

Why program for a quirk that you don't use?

> + sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);
> +
> + /*
> +  * As per the Host Controller spec v3.00, tuning command
> +  * generates Buffer Read Ready interrupt, so enable that.
> +  *
> +  * Note: The spec clearly says that when tuning sequence
> +  * is being performed, the controller does not generate
> +  * interrupts other than Buffer Read Ready interrupt. But
> +  * to make sure we don't hit a controller bug, we _only_
> +  * enable Buffer Read Ready interrupt here.
> +  */
> + sdhci_writel(host, SDHCI_INT_DATA_AVAIL, SDHCI_INT_ENABLE);
> + sdhci_writel(host, SDHCI_INT_DATA_AVAIL, SDHCI_SIGNAL_ENABLE);
> +}
> +
> +static void sdhci_o2_end_tuning(struct sdhci_host *host)
> +{
> + sdhci_writel(host, host->ier, SDHCI_INT_ENABLE);
> + sdhci_writel(host, host->ier, SDHCI_SIGNAL_ENABLE);
> +}
> +
> +static inline bool sdhci_data_line_cmd(struct mmc_command *cmd)
> +{
> + return cmd->data || cmd->flags & MMC_RSP_BUSY;
> +}
> +
> +static void sdhci_del_timer(struct sdhci_host *host, struct mmc_request *mrq)
> +{
> + if (sdhci_data_line_cmd(mrq->cmd))
> + del_timer(>data_timer);
> + else
> + del_timer(>timer);
> +}
> +
> +static void sdhci_o2_set_tuning_mode(struct sdhci_host *host, bool hw)

Since you only ever call this with hw = true, maybe drop the 'hw' parameter
altogether.

> +{
> + u16 reg;
> +
> + if (hw) {
> + // enable hardware tuning

For consistency, please use old C-style comments /* */ instead of //

> + reg = sdhci_readw(host, O2_SD_VENDOR_SETTING);
> + reg &= (~O2_SD_HW_TUNING_ENABLE);
> + sdhci_writew(host, reg, O2_SD_VENDOR_SETTING);
> + } else {
> + reg = sdhci_readw(host, O2_SD_VENDOR_SETTING);
> + reg |= O2_SD_HW_TUNING_ENABLE;
> + sdhci_writew(host, reg, O2_SD_VENDOR_SETTING);
> + }
> +}
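
A minimal sketch of what dropping the 'hw' parameter could look like, assuming
the function is only ever used to enable hardware tuning. The register is
modelled as a plain variable so the snippet builds outside the kernel;
fake_vendor_setting, the helper name and the bit position given to
O2_SD_HW_TUNING_ENABLE are illustrative assumptions, not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical bit position; the patch's real definition is not quoted here. */
#define O2_SD_HW_TUNING_ENABLE (1u << 4)

/* Stand-in for the O2_SD_VENDOR_SETTING register (assumption for the sketch). */
static uint16_t fake_vendor_setting = 0xffff;

/*
 * Without the 'hw' parameter only the enable path remains. As in the
 * quoted patch, hardware tuning is enabled by clearing the bit.
 */
static void o2_enable_hw_tuning(void)
{
	uint16_t reg = fake_vendor_setting;	/* sdhci_readw() in the driver */

	reg &= (uint16_t)~O2_SD_HW_TUNING_ENABLE;
	fake_vendor_setting = reg;		/* sdhci_writew() in the driver */
}
```

In the driver the read and write would go through sdhci_readw() and
sdhci_writew() as in the quoted hunk; only the branching disappears.
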
> +
> +static u8 data_buf[64];

It would be better to allocate data_buf.

> +
> +static void sdhci_o2_send_tuning(struct sdhci_host *host, u32 opcode)
> +{
> + struct mmc_command cmd = { };
> + struct mmc_data data = { };
> + struct scatterlist sg;
> + struct mmc_request mrq = { };
> + unsigned long flags;
> + u32 b = host->sdma_boundary;
> + int size = sizeof(data_buf);
> +
> + cmd.opcode = opcode;
> + cmd.flags = MMC_RSP_PRESENT | MMC_RSP_OPCODE | MMC_RSP_CRC;
> + cmd.mrq = &mrq;
> + mrq.cmd = &cmd;
> + mrq.data = &data;
> + data.blksz = size;
> + data.blocks = 1;
> + data.flags = MMC_DATA_READ;
> +
> + data.timeout_ns = 150 * NSEC_PER_MSEC;

It seems inconsistent to set 150ms timeout here but 50ms for waiting for
Buffer Read Ready.

> +
> + data.sg = &sg;
> + data.sg_len = 1;
> + sg_init_one(&sg, data_buf, size);
> +
> + spin_lock_irqsave(&host->lock, flags);
> +
> + sdhci_writew(host, SDHCI_MAKE_BLKSZ(b, 64), SDHCI_BLOCK_SIZE);
> +
> + /*
> +  * The tuning block is sent by the card to the host controller.
> + 

Re: [RFC PATCH v2 2/6] x86/arch_prctl: add ARCH_GET_NOPTI and ARCH_SET_NOPTI to enable/disable PTI

2018-01-09 Thread Ingo Molnar

* Borislav Petkov  wrote:

> Oh, and you've built the kernel with the option to be able to disable
> PTI so it's not like you haven't seen it already.

In general in many corporate environments requiring kernel reboots or kernel 
rebuilds limits the real-world usability of any kernel feature we offer down to 
"non-existent". Saying "build your own kernel or reboot" is excluding a large 
subset of our real-world users.

Build and boot options are fine for developers and testing. Otherwise
_everything_ not readily accessible when your distro kernel has booted up is
essentially behind a usability (and corporate policy) wall so steep that it's
essentially non-existent to many users.

So either we make this properly sysctl (and/or prctl) controllable, or just
don't do it at all.

Thanks,

Ingo


[GIT PULL] sound fixes for 4.15-rc8

2018-01-09 Thread Takashi Iwai
Linus,

please pull sound fixes for v4.15-rc8 from:

  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git tags/sound-4.15-rc8

The topmost commit is 900498a34a3ac9c611e9b425094c8106bdd7dc1c



sound fixes for 4.15-rc8

A collection of the last-minute small PCM fixes:
- A workaround for the recent regression wrt PulseAudio
- Removal of spurious WARN_ON() that is triggered by syzkaller
- Fixes for aloop, hardening racy accesses
- Fixes in PCM OSS emulation wrt the unabortable loops that may cause
  RCU stall



Takashi Iwai (8):
  ALSA: pcm: Remove incorrect snd_BUG_ON() usages
  ALSA: pcm: Add missing error checks in OSS emulation plugin builder
  ALSA: pcm: Workaround for weird PulseAudio behavior on rewind error
  ALSA: aloop: Release cable upon open error path
  ALSA: aloop: Fix inconsistent format due to incomplete rule
  ALSA: aloop: Fix racy hw constraints adjustment
  ALSA: pcm: Abort properly at pending signal in OSS read/write loops
  ALSA: pcm: Allow aborting mutex lock at OSS read/write loops

---
 sound/core/oss/pcm_oss.c| 41 ---
 sound/core/oss/pcm_plugin.c | 14 +--
 sound/core/pcm_lib.c|  4 +-
 sound/core/pcm_native.c |  9 -
 sound/drivers/aloop.c   | 98 +++--
 5 files changed, 97 insertions(+), 69 deletions(-)

diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index e49f448ee04f..c2db7e905f7d 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -455,7 +455,6 @@ static int snd_pcm_hw_param_near(struct snd_pcm_substream 
*pcm,
v = snd_pcm_hw_param_last(pcm, params, var, dir);
else
v = snd_pcm_hw_param_first(pcm, params, var, dir);
-   snd_BUG_ON(v < 0);
return v;
 }
 
@@ -1335,8 +1334,11 @@ static ssize_t snd_pcm_oss_write1(struct 
snd_pcm_substream *substream, const cha
 
if ((tmp = snd_pcm_oss_make_ready(substream)) < 0)
return tmp;
-   mutex_lock(&runtime->oss.params_lock);
while (bytes > 0) {
+   if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
+   tmp = -ERESTARTSYS;
+   break;
+   }
if (bytes < runtime->oss.period_bytes || 
runtime->oss.buffer_used > 0) {
tmp = bytes;
if (tmp + runtime->oss.buffer_used > 
runtime->oss.period_bytes)
@@ -1380,14 +1382,18 @@ static ssize_t snd_pcm_oss_write1(struct 
snd_pcm_substream *substream, const cha
xfer += tmp;
if ((substream->f_flags & O_NONBLOCK) != 0 &&
tmp != runtime->oss.period_bytes)
-   break;
+   tmp = -EAGAIN;
}
-   }
-   mutex_unlock(&runtime->oss.params_lock);
-   return xfer;
-
  err:
-   mutex_unlock(&runtime->oss.params_lock);
+   mutex_unlock(&runtime->oss.params_lock);
+   if (tmp < 0)
+   break;
+   if (signal_pending(current)) {
+   tmp = -ERESTARTSYS;
+   break;
+   }
+   tmp = 0;
+   }
return xfer > 0 ? (snd_pcm_sframes_t)xfer : tmp;
 }
 
@@ -1435,8 +1441,11 @@ static ssize_t snd_pcm_oss_read1(struct 
snd_pcm_substream *substream, char __use
 
if ((tmp = snd_pcm_oss_make_ready(substream)) < 0)
return tmp;
-   mutex_lock(&runtime->oss.params_lock);
while (bytes > 0) {
+   if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
+   tmp = -ERESTARTSYS;
+   break;
+   }
if (bytes < runtime->oss.period_bytes || 
runtime->oss.buffer_used > 0) {
if (runtime->oss.buffer_used == 0) {
tmp = snd_pcm_oss_read2(substream, 
runtime->oss.buffer, runtime->oss.period_bytes, 1);
@@ -1467,12 +1476,16 @@ static ssize_t snd_pcm_oss_read1(struct 
snd_pcm_substream *substream, char __use
bytes -= tmp;
xfer += tmp;
}
-   }
-   mutex_unlock(&runtime->oss.params_lock);
-   return xfer;
-
  err:
-   mutex_unlock(&runtime->oss.params_lock);
+   mutex_unlock(&runtime->oss.params_lock);
+   if (tmp < 0)
+   break;
+   if (signal_pending(current)) {
+   tmp = -ERESTARTSYS;
+   break;
+   }
+   tmp = 0;
+   }
return xfer > 0 ? (snd_pcm_sframes_t)xfer : tmp;
 }
 
diff --git a/sound/core/oss/pcm_plugin.c b/sound/core/oss/pcm_plugin.c
index cadc93792868..85a56af104bd 100644
--- a/sound/core/oss/pcm_plugin.c
+++ b/sound/core/oss/pcm_plugin.c
@@ -592,18 +592,26 @@ 
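
The OSS read/write hunks above move the lock inside the loop and check for an
abort condition after every chunk. A rough userspace sketch of that loop shape
follows; the lock and the signal check are modelled with plain variables, and
chunked_transfer(), params_lock(), params_unlock() and abort_requested are
hypothetical names used only for illustration. -EINTR stands in for the
kernel-internal -ERESTARTSYS.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch builds in userspace. */
static int lock_depth;		/* models runtime->oss.params_lock */
static int abort_requested;	/* models signal_pending(current) */

static void params_lock(void)   { lock_depth++; }
static void params_unlock(void) { lock_depth--; }

/*
 * Transfer 'bytes' in chunks of at most 'period'. The lock is taken and
 * dropped once per iteration and the abort flag is checked between chunks,
 * so the loop never spins indefinitely while holding the lock.
 */
static long chunked_transfer(size_t bytes, size_t period)
{
	long xfer = 0;
	long tmp = 0;

	if (period == 0)
		return -EINVAL;

	while (bytes > 0) {
		size_t chunk = bytes < period ? bytes : period;

		params_lock();
		/* ... move one chunk of data here ... */
		bytes -= chunk;
		xfer += (long)chunk;
		params_unlock();

		if (abort_requested) {
			tmp = -EINTR;	/* the kernel code returns -ERESTARTSYS */
			break;
		}
	}

	/* Same convention as the patch: report progress if any, else the error. */
	return xfer > 0 ? xfer : tmp;
}
```

As in the patch, a partial transfer is reported as a byte count and the error
code is only returned when nothing was transferred.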

Re: [RFC PATCH v2 3/6] x86/pti: add a per-cpu variable pti_disable

2018-01-09 Thread Willy Tarreau
On Wed, Jan 10, 2018 at 08:19:51AM +0100, Ingo Molnar wrote:
> 
> * Willy Tarreau  wrote:
> 
> > +#ifdef CONFIG_PAGE_TABLE_ISOLATION
> > +   this_cpu_write(pti_disable,
> > +  next_p->mm && next_p->mm->context.pti_disable);
> > +#endif
> 
> Another pet peeve, please write:
> 
> > +   this_cpu_write(pti_disable, next_p->mm && next_p->mm->context.pti_disable);
> 
> or consider introducing an 'mm_next' local variable, set to next_p->mm, and
> use that to shorten the sequence.

OK.

> More importantly, any strong reasons why the flag is logic-inverted? I.e. why
> not ::pti_enabled?

For me it's a matter of default case. Having a "pti_enabled" flag makes
one think the default is disabled and an action is required to turn it on.
With "pti_disabled", it becomes clearer that the default is enabled and an
action is required to turn it off. While it causes a double inversion for
the user due to the temporary choice of prctl name (we could have
ARCH_SET_PTI for example), I think it results in more readable code in
the sensitive parts like the asm one where these tests could possibly
end up inside #ifdefs. If we had "pti_enabled", something like this could
be confusing because it's not obvious whether this pti_enabled *enforces*
PTI or if its absence disables it :

#ifdef CONFIG_ALLOW_DISABLE_PTI
cmpb $0, PER_CPU_VAR(pti_enabled)
jz .Lend\@
#endif

But this is open to discussion of course.
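
One way to read the "default case" argument is in terms of zero-initialized
state. The sketch below is only an illustration under that assumption;
struct mm_ctx_sketch and pti_active() are hypothetical names, not part of the
patch series.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm context discussed in the thread. */
struct mm_ctx_sketch {
	bool pti_disable;	/* zero-initialized: isolation stays on */
};

/* PTI is skipped only when the flag has been set explicitly. */
static bool pti_active(const struct mm_ctx_sketch *ctx)
{
	return !ctx->pti_disable;
}
```

With this polarity a freshly zeroed context needs no extra initialization to
get the protective behaviour.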

Willy


Re: [PATCH 0/2] pinctrl: meson: use one uniform 'function' name

2018-01-09 Thread Jerome Brunet
On Wed, 2018-01-10 at 10:12 +0800, Yixun Lan wrote:
> 
> On 01/08/18 16:52, Jerome Brunet wrote:
> > On Mon, 2018-01-08 at 15:33 +0800, Yixun Lan wrote:
> > > These two patches are general improvement for meson pinctrl driver.
> > > It make the two pinctrl trees (ee/ao) to share one uniform 'function' 
> > > name for
> > > one hardware block even its pin groups live inside two differet hardware 
> > > domains,
> > > which for example EE vs AO domain here.
> > > 
> > > This idea is motivated by Martin's question at [1]
> > > 
> > > [1]
> > >  
> > > http://lkml.kernel.org/r/CAFBinCCuQ-NK747+GHDkhZty_UMMgzCYOYFcNTrRDJgU8OM=g...@mail.gmail.com
> > > 
> > > 
> > > Yixun Lan (2):
> > >   pinctrl: meson: introduce a macro to have name/groups seperated
> > >   pinctrl: meson-axg: correct the pin expansion of UART_AO_B
> > > 
> > >  drivers/pinctrl/meson/pinctrl-meson-axg.c | 4 ++--
> > >  drivers/pinctrl/meson/pinctrl-meson.h | 8 +---
> > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > 
> > Hi Yixun,
> > 
> > Honestly, I don't like the idea. I think it adds an unnecessary complexity.
> > I don't see the point of FUNCTION_EX(uart_ao_b, _z) when you could simply 
> > write 
> > FUNCTION(uart_ao_b_z) ... especially when there is just a couple of 
> > function per
> > SoC available on different domains.
> > 
> > A pinctrl driver can already be challenging to understand at first, let's 
> > keep
> > it simple and avoid adding more macros.
> > 
> 
> Hi Jerome:
>   In my opinion, the idea of keeping one uniform 'function' in DT (thus
> introducing another macro) is worth considering. It would make the DT
> part much clean.

Ok this is your opinion. I don't share it. Keeping function names tidy is good,
I don't think we need another macro to do so.

>   And yes, it's a trade-off here, either we 1) do more in code to make
> DT clean or 2) do nothing in the code level to make DT live with it.

I don't see how adding a macro doing just string concatenation is going to make
anything cleaner. It does not prevent one from writing FUNCTION_EX(uart_ao_b,
_gpioz), resulting in uart_ao_b_gpioz, which is what is apparently considered
'not clean'.

BTW, there is no cleanness issue here; the name is just outside the 'usual
scheme', but there is no problem with it. If you want to change this, and
s/uart_ao_b_gpioz/uart_ao_b_z/, now is the time to change it.
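
To illustrate the concatenation point, here is a simplified sketch. The two
macros below are hypothetical reductions that only build the name string; the
real macros in pinctrl-meson.h build struct initializers and are not quoted in
this thread.

```c
#include <string.h>

/* Hypothetical simplifications that only produce the function name. */
#define FUNCTION_NAME(fn)		#fn
#define FUNCTION_NAME_EX(fn, ex)	#fn #ex

/* Returns 1 when both spellings yield the same string. */
static int same_name(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

Under these assumptions FUNCTION_NAME_EX(uart_ao_b, _z) and
FUNCTION_NAME(uart_ao_b_z) produce the same string, and nothing stops a caller
from passing _gpioz as the suffix.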

> 
> Yixun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [PATCH] f2fs: handle newly created page when revoking inmem pages

2018-01-09 Thread Daeho Jeong
Hi Chao,

> Original intention here is to recover status to the timing before
> committing atomic write. As at that timing blkaddr in dnode should be
> cur->old_addr(NEW_ADDR), so we need to change to call:
 
> f2fs_update_data_blkaddr(, NEW_ADDR);

Ok, I'll change NULL_ADDR to NEW_ADDR.

Thanks,
 
> Otherwise, metadata will become inconsistent, because blkaddr value is
> NULL_ADDR means that current block is not preallocated, but
> total_valid_block_count has already been updated. Right?
 
> Thanks,
 

 
 


Re: [RFC PATCH v2 2/6] x86/arch_prctl: add ARCH_GET_NOPTI and ARCH_SET_NOPTI to enable/disable PTI

2018-01-09 Thread Ingo Molnar

* Borislav Petkov  wrote:

> On Tue, Jan 09, 2018 at 01:26:57PM -0800, Andy Lutomirski wrote:
> > 2.Turning off PTI is, in general, a terrible idea.  It totally breaks
> > any semblance of a security model on a Meltdown-affected CPU.  So I
> > think we should require CAP_SYS_RAWIO *and* that the system is booted
> > with pti=allow_optout or something like that.
> 
> Uhh, I like that.
> 
> Maybe also taint the kernel ...

Tainting the kernel is really a bad idea: a CAP_SYS_RAWIO user _already_ has
the privilege to directly write to the hardware. The binary makes use of that
privilege to allow execution of ring 3 code that has the theoretical ability to
_read_ kernel memory.

Unless we plan on tainting all uses of /dev/mem as well this is totally over
the top.

We could taint the kernel and warn prominently in the syslog when PTI is
disabled globally on the boot line though, if running on affected CPUs.

Something like:

 "x86/intel: Page Table Isolation (PTI) is disabled globally. This allows
  unprivileged, untrusted code to exploit the Meltdown CPU bug to read kernel
  data."

Thanks,

Ingo


Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set

2018-01-09 Thread Willy Tarreau
On Wed, Jan 10, 2018 at 08:15:10AM +0100, Ingo Molnar wrote:
> > +   /* The "pti_disable" mm attribute is mirrored into this per-cpu var */
> > +   cmpb$0, PER_CPU_VAR(pti_disable)
> > +   jne .Lend_\@
> 
> Could you please do this small change for future iterations:
> 
> s/per-cpu
>  /per-CPU
> 
> ... to make the spelling more consistent with the rest of the code base?

Now done as well, thank you. Do not hesitate to send future cosmetic
changes directly to me so that we keep a max of participants mostly
focused on the design choices. (and don't worry, I'll pick them, I
really appreciate these as well).

Thanks,
Willy


Re: [PATCH v8 00/37] tracing: Inter-event (e.g. latency) support

2018-01-09 Thread Steven Rostedt
On Wed, 10 Jan 2018 14:45:07 +0900
Namhyung Kim  wrote:

> On Thu, Dec 21, 2017 at 10:02:22AM -0600, Tom Zanussi wrote:
> > Hi,
> > 
> > This is V8 of the inter-event tracing patchset, addressing input from
> > V7.
> > 
> > These changes address Namhyung's most recent comments (thanks,
> > Namhyung!) and move everything to the latest tracing/for-next:
> > 
> >   - moved a couple hunks switching hist_field_fn_t params from 15/37
> > (add variable support) to 20/37 (Pass tracing_map_elt to
> > hist_field_accessor)
> >   - in hist_trigger_elt_data_alloc(), remove the unnecessary '+1' from
> > TASK_COMM_LEN size.
> >   - simplified find_var_file() code according to Namhyung's
> > suggestions.
> >   - fixed bug in print_synth_event() where entry->fields was being
> > used instead of the address as it should have been  
> 
> I only have a nitpick in the patch 24 but otherwise cannot find an
> issue anymore, so
> 
> Reviewed-by: Namhyung Kim 

Thanks for reviewing this Namhyung.

I'm currently traveling (what else is new?), and I want to start
pulling this in. It won't be ready for the next merge window as it's too
close, but I want to get it in by the one after that. I need to
allocate some time to pull these patches in and review them as well.

-- Steve


Re: [RFC PATCH v2 3/6] x86/pti: add a per-cpu variable pti_disable

2018-01-09 Thread Ingo Molnar

* Willy Tarreau  wrote:

> +#ifdef CONFIG_PAGE_TABLE_ISOLATION
> + this_cpu_write(pti_disable,
> +next_p->mm && next_p->mm->context.pti_disable);
> +#endif

Another pet peeve, please write:

> + this_cpu_write(pti_disable, next_p->mm && next_p->mm->context.pti_disable);

or consider introducing an 'mm_next' local variable, set to next_p->mm, and use 
that to shorten the sequence.
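
A sketch of the 'mm_next' variant, using minimal stand-in types so it builds
outside the kernel. All *_sketch types, pti_disable_this_cpu and
mirror_pti_disable() are assumptions for illustration; the real code writes
the per-CPU variable with this_cpu_write().

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel types (assumptions for the sketch). */
struct mm_context_sketch { bool pti_disable; };
struct mm_sketch { struct mm_context_sketch context; };
struct task_sketch { struct mm_sketch *mm; };

/* Stand-in for the per-CPU variable written with this_cpu_write(). */
static bool pti_disable_this_cpu;

static void mirror_pti_disable(const struct task_sketch *next_p)
{
	const struct mm_sketch *mm_next = next_p->mm;

	/* Kernel threads have no mm, so the flag falls back to false. */
	pti_disable_this_cpu = mm_next && mm_next->context.pti_disable;
}
```

The local variable keeps the assignment on one short line without changing
the NULL check on next_p->mm.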

More importantly, any strong reasons why the flag is logic-inverted? I.e. why
not ::pti_enabled?

Thanks,

Ingo


Re: [RFC PATCH v2 5/6] x86/entry/pti: avoid setting CR3 when it's already correct

2018-01-09 Thread Willy Tarreau
On Wed, Jan 10, 2018 at 08:16:24AM +0100, Ingo Molnar wrote:
> 
> * Willy Tarreau  wrote:
> 
> > +   /* if we're already on the kernel PGD, we don't switch */
> > +* If we're already on the kernel PGD, we don't switch,
> > +* If we saved a kernel context on entry, we didn't switch the CR3,
> 
> It's hard enough to read assembly code, please use consistent capitalization:
> 
> s/if
>  /If

Will do, thanks for the review ;-)

Willy


Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

2018-01-09 Thread Steven Rostedt
On Tue, 9 Jan 2018 14:53:56 -0800
Tejun Heo  wrote:

> Hello, Steven.
> 
> On Tue, Jan 09, 2018 at 05:47:50PM -0500, Steven Rostedt wrote:
> > > Maybe it can break out eventually but that can take a really long
> > > time.  It's OOM.  Most of userland is waiting for reclaim.  There
> > > isn't all that much going on outside that and there can only be one
> > > CPU which is OOMing.  The kernel isn't gonna be all that chatty.  
> > 
> > Are you saying that the OOM is stuck printing over and over on a single
> > CPU. Perhaps we should fix THAT.  
> 
> I'm not sure what you meant but OOM code isn't doing anything bad

My point is, that your test is only hammering at a single CPU. You say
it is the scenario you see, which means that the OOM is printing out
more than it should, because if it prints it out once, it should not
print it out again for the same process, or go into a loop doing it
over and over on a single CPU. That would be a bug in the
implementation.

> other than excluding others from doing OOM kills simultaneously, which
> is what we want, and printing a lot of messages and then gets caught
> up in a positive feedback loop.
> 
> To me, the whole point of this effort is preventing printk messages
> from causing significant or critical disruptions to overall system
> operation.

I agree, and my patch helps with this tremendously, if we are not doing
something stupid like printk thousands of times in an interrupt
handler, over and over on a single CPU.

>  IOW, it's rather dumb if the machine goes down because
> somebody printk'd wrong or just failed to foresee the combinations of
> events which could lead to such conditions.

I'd still like to see a trace of a real situation.

> 
> It's not like we don't know how to fix this either.

But we don't want the fix to introduce regressions, and offloading
printk does. Heck, the current fixes to printk have caused issues for me
in my own debugging. Like we can no longer do large dumps of printk from
NMI context, which I used to do when detecting a lock up and then doing
a task list dump of all tasks. Or even a ftrace_dump_on_oops.

http://lkml.kernel.org/r/20180109162019.gl3...@hirez.programming.kicks-ass.net


-- Steve


Re: [PATCH] Remove silentoldconfig from "make help"; fix kconfig/conf's help

2018-01-09 Thread Masahiro Yamada
2018-01-06 7:21 GMT+09:00 Marc Herbert :
> On 04/01/2018 09:21, Masahiro Yamada wrote:
>> (+CC Michal's new address)
>>
>> 2017-12-19 10:26 GMT+09:00 Marc Herbert :
>>> As explained by Michal Marek at https://lkml.org/lkml/2011/8/31/189
>>> silentoldconfig has become a misnomer. It has become an internal
>>> interface and "oldconfig" is just as silent now.
>>
>>
>> Hmm, I'd like to be sure about your intention.
>
> My main intention is to stop advertising the now internal silentoldconfig
> target in the user interface. A secondary goal is to provide accurate
> background information in the commit message.

OK.


>> "oldconfig" is not silent.  (nor is silentoldconfig).
>> When it finds a new symbol, it will show a dialog
>> to ask users to input a value.
>>
>> "olddefconfig" is really silent
>> because it automatically sets new symbols to default.
>
> I think "silent" is typically missing well-defined semantics (another appeal
> to remove "silentoldconfig" from the user interface...) and I'm not sure
> "silent" ever meant "non-interactive" as you just described here. I think
> silent just meant "quiet(er)" here.

I imagined "silent" meant "non-interactive" at first,
but, you are right, Michal probably used this word to mean "quiet(er)".


> This commit message was purely based on Michal's message that I'm
> referencing. He wrote there: "... nowadays oldconfig is silent as well"
>
> I can change that part of the commit message to:
>
> | As explained by Michal Marek at https://lkml.org/lkml/2011/8/31/189
> | silentoldconfig has become a misnomer. It has become an internal
> | interface and "oldconfig" is just as QUIET now.


Sounds good.

(but, currently, there is a slight difference of quiet level
between oldconfig and silentoldconfig)

See below.


> ... or to anything else you prefer.
>
>
>> If you drop silentoldconfig help,
>> the "Same as silentoldconfig" is not sensible.
>> You need to update this line, too.
>
>> I think "Same as oldconfig but ..." will be OK.
>
> Agreed, thank you! I will also search for other occurrences.
>
>
>> What do you mean by "oldconfig used to be more verbose" ?
>> Did oldconfig change its behavior?
>>
>> Unless I am missing something, the current behavior of "oldconfig" has
>> been the same at least since the beginning of the git era.
>
> Again that's what Michal's message claimed in 2011. I don't know which
> even older era he was referring to.
>
> It was already quite time consuming to understand and verify the subtle
> nuances of the current state (which luckily still matches what Michal
> reported 7 years ago), so for the even older past I just deferred to Michal.
>
> Now I just checked out v2.6.12-rc2 (2005) and it looks like Michal was
> right: oldconfig was much more verbose then; it was dumping the entire
> .config file on stdout.

You and Michal are right.

It is commit cd9140e1e73a ("kconfig: make oldconfig is now less chatty").


I took a closer look at this.

Currently, oldconfig is a little bit more verbose than silentoldconfig,
but it should not be.  I'd like to fix it.

I attached a test case in my patch
https://patchwork.kernel.org/patch/10154095/

At the end of its git-log,
I attached deeper analysis of the history of oldconfig.
Please check it out if you are interested.


> If you prefer I can keep referring to Michal's message but without
> paraphrasing it at all; sticking to the description of the current
> behaviours and not mentioning any possible past behaviour and saving all of
> us the time spent doing archeology. Just let me know, thx!

Sounds good to me.  (I recorded the background in my patch, anyway...)

I leave the detail of the commit log up to you.



> +   printf("  --silentoldconfig   Similar to oldconfig but:\n"
> +  "- no re-formatting of .config when nothing's missing\n"
> +  "- generates configuration in include/{generated/,config/}\n"
> +  "  (oldconfig used to be more verbose)\n");

How about "Similar to oldconfig but, generates configuration ..." ?

I'd like to drop the following description.

"no re-formatting of .config when nothing's missing"
   This is a very subtle difference, less important.
   It may not be stable in the future.


"(oldconfig used to be more verbose)"
   The historical background is in git.
   If people are interested in archeology,
   they would be able to do it by "git log", "git blame", etc.
   We are generally interested in the current behavior.



-- 
Best Regards
Masahiro Yamada


Re: [RFC PATCH v2 5/6] x86/entry/pti: avoid setting CR3 when it's already correct

2018-01-09 Thread Willy Tarreau
On Wed, Jan 10, 2018 at 08:16:24AM +0100, Ingo Molnar wrote:
> 
> * Willy Tarreau  wrote:
> 
> > +   /* if we're already on the kernel PGD, we don't switch */
> > +* If we're already on the kernel PGD, we don't switch,
> > +* If we saved a kernel context on entry, we didn't switch the CR3,
> 
> It's hard enough to read assembly code, please use consistent capitalization:
> 
> s/if
>  /If

Will do, thanks for the review ;-)

Willy


Re: [RFC PATCH v2 5/6] x86/entry/pti: avoid setting CR3 when it's already correct

2018-01-09 Thread Ingo Molnar

* Willy Tarreau  wrote:

> + /* if we're already on the kernel PGD, we don't switch */
> +  * If we're already on the kernel PGD, we don't switch,
> +  * If we saved a kernel context on entry, we didn't switch the CR3,

It's hard enough to read assembly code, please use consistent capitalization:

s/if
 /If

Thanks,

Ingo


Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set

2018-01-09 Thread Ingo Molnar

* Willy Tarreau  wrote:

> When a syscall returns to userspace with pti_disable set, it means the
> current mm is configured to disable page table isolation (PTI). In this
> case, returns from kernel to user will not switch the CR3, leaving it
> to the kernel one which already maps both user and kernel pages. This
> avoids a TLB flush, and saves another one on next entry.
> 
> Thanks to these changes, haproxy running under KVM went back from
> 12700 conn/s (without PCID) or 19700 (with PCID) to 23100 once loaded
> after calling prctl(), indicating that PTI has no measurable impact on
> this workload.
> 
> Signed-off-by: Willy Tarreau 
> Cc: Andy Lutomirski 
> Cc: Borislav Petkov 
> Cc: Brian Gerst 
> Cc: Dave Hansen 
> Cc: Ingo Molnar 
> Cc: Linus Torvalds 
> Cc: Peter Zijlstra 
> Cc: Thomas Gleixner 
> Cc: Josh Poimboeuf 
> Cc: "H. Peter Anvin" 
> Cc: Greg Kroah-Hartman 
> Cc: Kees Cook 
> 
> v2:
>   - use pti_disable instead of task flag
> ---
>  arch/x86/entry/calling.h | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> index 2c0d3b5..5361a10 100644
> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -229,6 +229,11 @@
>  
>  .macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
>   ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> +
> + /* The "pti_disable" mm attribute is mirrored into this per-cpu var */
> + cmpb $0, PER_CPU_VAR(pti_disable)
> + jne .Lend_\@

Could you please do this small change for future iterations:

s/per-cpu
 /per-CPU

... to make the spelling more consistent with the rest of the code base?

Thanks,

Ingo
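
For readers less used to the entry macros, the control flow of the quoted hunk can be modelled in plain C. This is only an illustrative sketch: the function name and its two parameters are hypothetical stand-ins for the X86_FEATURE_PTI alternative and the per-CPU pti_disable byte, and the real code is assembly that runs on the kernel exit path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the decision made at the top of
 * SWITCH_TO_USER_CR3_NOSTACK after the quoted patch:
 *
 *   - without X86_FEATURE_PTI the ALTERNATIVE leaves "jmp .Lend" in
 *     place, so CR3 is never switched;
 *   - with PTI, a non-zero per-CPU pti_disable byte also jumps to
 *     .Lend ("cmpb $0, ...; jne .Lend"), leaving the kernel CR3 loaded.
 */
static bool should_switch_to_user_cr3(bool pti_feature,
				      unsigned char pti_disable)
{
	if (!pti_feature)
		return false;
	if (pti_disable != 0)
		return false;
	return true;
}
```

Only the combination "PTI present and pti_disable == 0" reaches the CR3 write, which is where the saved TLB flush mentioned in the commit message comes from.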


Re: [PATCH] x86/retpoline: Fix NOSPEC_JMP for tip

2018-01-09 Thread David Woodhouse
On Tue, 2018-01-09 at 16:39 -0800, Linus Torvalds wrote:
> On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen wrote:
> > 
> > 
> > The following patch fixes it for me. Something doesn't
> > seem to work with ALTERNATIVE_2. It adds only a few bytes
> > more code, so seems acceptable.
> Ugh. It's kind of stupid, though.
> 
> Why is the code sequence not simply:
> 
>   ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
>   ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE
> 
> ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
> simply fall through to what will be the "jmp *\reg" of the
> non-RETPOLINE version.
> 
> Then just make sure X86_FEATURE_RETPOLINE_AMD disables
> X86_FEATURE_RETPOLINE.
> 
> That is both simpler and smaller, no?

Not smaller, as the lfence *could* have gone in all the space left by
turning the whole retpoline thing into a single 'jmp'. But I'll
certainly give you simpler. I'll retest and merge Andi's latest patches
for that; thanks.

I'd really like to know what went wrong though. Did we merge Borislav's
attempt to peek at jumps inside alternatives, perchance? Will take a
look...



Re: [RFC PATCH v2 2/6] x86/arch_prctl: add ARCH_GET_NOPTI and ARCH_SET_NOPTI to enable/disable PTI

2018-01-09 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> On Tue, Jan 9, 2018 at 6:54 AM, Willy Tarreau  wrote:
> > On Tue, Jan 09, 2018 at 03:51:57PM +0100, Borislav Petkov wrote:
> >> On Tue, Jan 09, 2018 at 03:36:53PM +0100, Willy Tarreau wrote:
> >> > I see and am not particularly against this, but what use case do you
> >> > have in mind precisely ? I doubt it's just saving a few tens of bytes,
> >> > so probably you're more concerned about the potential risks this opens ?
> >> > But given we only allow this for CAP_SYS_RAWIO and these ones already
> >> > have access to /dev/mem and many other things, don't you think there
> >> > are much easier ways to dump kernel memory in this case than trying to
> >> > inject some meltdown code into the victim process ? Or maybe you have
> >> > other cases in mind that I'm not seeing.
> >>
> >> I'd like this to be config-controllable so that distros can make the
> >> decision whether/if they want to support the whole per-mm thing.
> >
> > OK.
> >
> >> Also, if CAP_SYS_RAWIO is going to protect, please make the
> >> ARCH_GET_NOPTI variant check it too.
> >
> > Interestingly I removed the check consecutive to the discussions. But
> > I think I'll simply remove the whole ARCH_GET_NOPTI as it has no real
> > value beyond initial development.
> >
> 
> I've thought about this a bit more.  Here are my thoughts:
> 
> 1. I don't like it being per-mm.  I think it should be a per-thread
> control so that a program can have a thread with PTI that runs
> less-trusted JavaScript and other network threads with PTI off.
> Obviously we lose NX protection mm-wide if any threads have PTI off.
> I think the way to implement this is:

Btw., the "NX protection", the NX bit set in the PTI kernel pagetables for the 
user range really just matters for non-SMEP hardware, right? On SMEP a CPU in 
kernel privilege mode cannot execute user pages, i.e. the fact that it's user 
pages is already NX, guaranteed by the CPU.

And note how there's a happy circumstance for users, regarding SMEP and PTI NX:

- All Intel desktop/server CPUs currently sold and those built in the last ~3 
  years have SMEP enabled already, so are not affected.

- AMD CPUs don't have PTI enabled, so they already don't have NX for their user 
  pages - no change in behavior.

I.e.: non-issue and not a real constraint on the flexibility of this ABI, 
AFAICS - it's "only" an implementational matter.

Thanks,

Ingo
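
Whether a given machine falls into the SMEP case can be checked from userspace. The sketch below assumes the caller has already read the "flags" line of /proc/cpuinfo into a string (Linux lists "smep" there on CPUs that support it); the helper name cpu_flags_has() is hypothetical and the function only does whole-token matching, it does not read the file itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `name` occurs as a whole whitespace-separated token
 * in `flags` (for example the "flags" line from /proc/cpuinfo).
 * Substring hits such as "xsmep" or "smepx" do not count.
 */
static bool cpu_flags_has(const char *flags, const char *name)
{
	const char *p;
	size_t len;

	assert(flags != NULL && name != NULL);

	len = strlen(name);
	if (len == 0)
		return false;

	for (p = flags; (p = strstr(p, name)) != NULL; p += len) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
	}
	return false;
}
```

A caller would pass the text after the colon of the flags line, e.g. cpu_flags_has(line, "smep").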


Re: [patch v7 3/3] platform/mellanox: mlxreg-hotplug: modify to use regmap intreface

2018-01-09 Thread kbuild test robot
Hi Vadim,

I love your patch! Yet something to improve:

[auto build test ERROR on platform-drivers-x86/for-next]
[cannot apply to linus/master]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:    https://github.com/0day-ci/linux/commits/Vadim-Pasternak/rivers-platform-replace-module-x86-mlxcpld-hotplug-with-mellanox-mlxreg-hotplug/20180110-115215
base:   git://git.infradead.org/users/dvhart/linux-platform-drivers-x86.git for-next
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=ia64

All errors (new ones prefixed by >>):

   WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-ath79.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/gpio/gpio-iop.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/accel/kxsd9-i2c.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/iio/adc/qcom-vadc-common.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/mtk-vcodec/mtk-vcodec-common.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/tegra-cec/tegra_cec.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/mtd/nand/denali_pci.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/power/reset/zx-reboot.o
   see include/linux/module.h for more information
   WARNING: modpost: missing MODULE_LICENSE() in drivers/staging/comedi/drivers/ni_atmio.o
   see include/linux/module.h for more information
   ERROR: "ia64_delay_loop" [drivers/spi/spi-thunderx.ko] undefined!
>> ERROR: "of_update_property" [drivers/platform/mellanox/mlxreg-hotplug.ko] undefined!
   ERROR: "ia64_delay_loop" [drivers/net/phy/mdio-cavium.ko] undefined!

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


[PATCH v1 6/8] perf util: Allocate time slices buffer according to number of comma

2018-01-09 Thread Jin Yao
Previously we used a magic number 10 to limit the number of
time slices, which is an arbitrary cap.

This patch creates a new function perf_time__range_alloc()
to allocate the time slices buffer. The number of buffer entries is
determined by the number of commas in the string, but at least one
entry is allocated even if no comma is found.

Signed-off-by: Jin Yao 
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f97..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 	return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+	const char *p1, *p2;
+	int i = 1;
+	struct perf_time_interval *ptime;
+
+	/*
+	 * At least allocate one time range.
+	 */
+	if (!ostr)
+		goto alloc;
+
+	p1 = ostr;
+	while (p1 < ostr + strlen(ostr)) {
+		p2 = strchr(p1, ',');
+		if (!p2)
+			break;
+
+		p1 = p2 + 1;
+		i++;
+	}
+
+alloc:
+	*size = i;
+	ptime = calloc(i, sizeof(*ptime));
+	return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
 	/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 				 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
-- 
2.7.4
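
The sizing rule in this patch (one entry per comma-separated field, never fewer than one) can be tried outside the perf tree with a small standalone sketch. The struct below is a hypothetical stand-in for perf's struct perf_time_interval, and range_alloc() is not the function from the patch: it follows the same counting idea but, as a choice of this sketch, reports a size of 0 when the allocation fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct perf_time_interval. */
struct time_interval {
	unsigned long long start, end;
};

/*
 * Allocate one zeroed entry per comma-separated field in ostr.
 * A NULL or comma-free string still yields one entry.
 * On allocation failure, returns NULL and sets *size to 0.
 */
static struct time_interval *range_alloc(const char *ostr, int *size)
{
	struct time_interval *buf;
	int n = 1;

	assert(size != NULL);

	if (ostr) {
		const char *p = ostr;

		/* Each comma found adds one more slice. */
		while ((p = strchr(p, ',')) != NULL) {
			n++;
			p++;
		}
	}

	buf = calloc((size_t)n, sizeof(*buf));
	*size = buf ? n : 0;
	return buf;
}
```

A caller is expected to check the returned pointer and free() it when done; for a string such as "10%/1,20%/2,30%/3" the buffer has three entries.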



[PATCH v1 4/8] perf util: Support no index time percent slice

2018-01-09 Thread Jin Yao
Previously, a time percent slice needed an index to specify
which slice the user wants.

It is more convenient if the index can be omitted, in which case
it defaults to 1. With this patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao 
---
 tools/perf/util/time-utils.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab..5769f97 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval *ptime_buf, int num,
return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+  const char *ostr, u64 start, u64 end, char *c)
+{
+   char *str;
+   int len = strlen(ostr), ret;
+
+   /*
+* c points to '%'.
+* '%' should be the last character
+*/
+   if (ostr + len - 1 != c)
+   return -1;
+
+   /*
+* Construct a string like "xx%/1"
+*/
+   str = malloc(len + 3);
+   if (str == NULL)
+   return -ENOMEM;
+
+   memcpy(str, ostr, len);
+   strcpy(str + len, "/1");
+
+   ret = percent_slash_split(str, ptime_buf, start, end);
+   if (ret == 0)
+   ret = 1;
+
+   free(str);
+   return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 * ostr example:
 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 * 0%-10%,30%-40%: multiple time range
+* 50%: just one percent
 */
 
memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
   end, percent_dash_split);
}
 
+   c = strchr(ostr, '%');
+   if (c)
+   return one_percent_convert(ptime_buf, ostr, start, end, c);
+
return -1;
 }
 
-- 
2.7.4
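
The string rewrite done by one_percent_convert() above ("xx%" becomes "xx%/1" before being handed to the existing slash parser) can be sketched on its own. This is a simplified illustration, not the perf code: append_default_index() is a hypothetical helper name, and it returns the new string instead of calling percent_slash_split().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of str with "/1" appended, or NULL if
 * the first '%' is not the final character (or on allocation failure).
 * The caller frees the result.
 */
static char *append_default_index(const char *str)
{
	size_t len = strlen(str);
	const char *pct = strchr(str, '%');
	char *out;

	if (!pct || pct != str + len - 1)
		return NULL;

	out = malloc(len + 3);	/* len bytes + '/' + '1' + NUL */
	if (!out)
		return NULL;

	memcpy(out, str, len);
	memcpy(out + len, "/1", 3);	/* copies the terminating NUL too */
	return out;
}
```

Under these assumptions "10%" maps to "10%/1", while "10%/2" and "10" are rejected because '%' is not the last character.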



[PATCH v1 2/8] perf script: Improve error msg when no first/last sample time found

2018-01-09 Thread Jin Yao
Print the following message when 'perf script --time' is executed
on a perf data file that doesn't contain the first/last sample time:

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, please set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c1cce47..4f691af 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3448,7 +3448,9 @@ int cmd_script(int argc, const char **argv)
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf data.\n"
+  "Please use latest perf binary to execute 'perf record'\n"
+  "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
err = -EINVAL;
goto out_delete;
}
-- 
2.7.4



[PATCH v1 8/8] perf script: Remove the time slices number limitation

2018-01-09 Thread Jin Yao
Previously, at most 10 time slices could be used
in 'perf script --time'.

This patch removes this limitation.
For example, the following command line is now accepted (12 time slices):

perf script --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
---
 tools/perf/Documentation/perf-script.txt | 10 +-
 tools/perf/builtin-script.c  | 17 +
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 806ec63..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
to end of file.
 
Also support time percent with multipe time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
-   Select the second 10% time slice
+   Select the second 10% time slice:
perf script --time 10%/2
 
-   Select from 0% to 10% time slice
+   Select from 0% to 10% time slice:
perf script --time 0%-10%
 
-   Select the first and second 10% time slices
+   Select the first and second 10% time slices:
perf script --time 10%/1,10%/2
 
-   Select from 0% to 10% and 30% to 40% slices
+   Select from 0% to 10% and 30% to 40% slices:
perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4f691af..f116a31 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
return 0;
 }
 
-#define PTIME_RANGE_MAX 10
-
 struct perf_script {
struct perf_tool tool;
struct perf_session *session;
@@ -1496,7 +1494,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
 };
 
@@ -3444,6 +3443,13 @@ int cmd_script(int argc, const char **argv)
if (err < 0)
goto out_delete;
 
+   script.ptime_range = perf_time__range_alloc(script.time_str,
+   &script.range_size);
+   if (!script.ptime_range) {
+   err = -ENOMEM;
+   goto out_delete;
+   }
+
/* needs to be parsed after looking up reference time */
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
@@ -3456,7 +3462,7 @@ int cmd_script(int argc, const char **argv)
}
 
script.range_num = perf_time__percent_parse_str(
-   script.ptime_range, PTIME_RANGE_MAX,
+   script.ptime_range, script.range_size,
script.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
@@ -3475,6 +3481,9 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
 
 out_delete:
+   if (script.ptime_range)
+   free(script.ptime_range);
+
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
 
-- 
2.7.4



[PATCH v1 7/8] perf report: Remove the time slices number limitation

2018-01-09 Thread Jin Yao
Previously, at most 10 time slices could be used
in 'perf report --time'.

This patch removes this limitation.
For example, the following command line is now accepted (12 time slices):

perf report --stdio --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c  | 23 +--
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 5522ce0..1940c4f 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
to end of file.
 
Also support time percent with multiple time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 77c954c..fe89021 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include 
 #include 
 
-#define PTIME_RANGE_MAX 10
-
 struct report {
struct perf_tool tool;
struct perf_session *session;
@@ -76,7 +74,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
float   min_percent;
u64 nr_entries;
@@ -1299,24 +1298,33 @@ int cmd_report(int argc, const char **argv)
if (symbol__init(&session->header.env) < 0)
goto error;
 
+   report.ptime_range = perf_time__range_alloc(report.time_str,
+   &report.range_size);
+   if (!report.ptime_range) {
+   ret = -ENOMEM;
+   goto error;
+   }
+
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
pr_err("HINT: no first/last sample time found in perf data.\n"
   "Please use latest perf binary to execute 'perf record'\n"
   "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
 
report.range_num = perf_time__percent_parse_str(
-   report.ptime_range, PTIME_RANGE_MAX,
+   report.ptime_range, report.range_size,
report.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
 
if (report.range_num < 0) {
pr_err("Invalid time string\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
} else {
report.range_num = 1;
@@ -1332,6 +1340,9 @@ int cmd_report(int argc, const char **argv)
ret = 0;
 
 error:
+   if (report.ptime_range)
+   free(report.ptime_range);
+
perf_session__delete(session);
return ret;
 }
-- 
2.7.4
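
The reason the patch above turns the early "return -EINVAL" statements into "goto error" is that the range buffer is now heap-allocated, so every exit path has to pass through the free(). A minimal sketch of that single-exit pattern follows. All names here are hypothetical: parse_ranges() merely stands in for the real time-string parsers, and the buffer type is simplified.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical validator standing in for the real time-string parsers. */
static int parse_ranges(const char *str)
{
	return (str && strchr(str, '%')) ? 0 : -1;
}

/* Returns 0 on success or a negative errno; buf is freed on every path. */
static int run_with_ranges(const char *time_str, int nr)
{
	int ret = -EINVAL;
	unsigned long long *buf;

	buf = calloc(nr, sizeof(*buf));
	if (!buf)
		return -ENOMEM;	/* nothing allocated yet, plain return is fine */

	if (parse_ranges(time_str) < 0)
		goto error;	/* an early "return -EINVAL" here would leak buf */

	ret = 0;
error:
	free(buf);		/* single cleanup point for all later exits */
	return ret;
}
```

Since free(NULL) is defined by the C standard to do nothing, the "if (report.ptime_range)" guard before free() in the patch is harmless but not strictly required.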



[PATCH v1 3/8] perf util: Improve error checking for time percent input

2018-01-09 Thread Jin Yao
A command line like 'perf report --stdio --time 1abc%/1' was
accepted by perf even though '1abc%' is not a valid percent.

This patch replaces atof() with strtod() and checks that the entire
string was consumed. For the same command line, perf now returns the
error message "Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao 
---
 tools/perf/util/time-utils.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f..88510ab 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-   char *c;
+   char *c, *endptr;
+   double d;
 
c = strchr(str, '%');
if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
else
return -1;
 
-   *pcnt = atof(str) / 100.0;
+   d = strtod(str, &endptr);
+   if (endptr != str + strlen(str))
+   return -1;
 
+   *pcnt = d / 100.0;
return 0;
 }
 
-- 
2.7.4
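
The difference between atof() and the strtod() end-pointer check in the patch above can be tried in isolation. The sketch below is a standalone approximation of parse_percent(): it assumes the lines hidden between the two hunks overwrite the '%' with a NUL byte, and it adds an "endptr == str" test that rejects an empty number, which is not part of the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse "NN%" into a fraction (10% -> 0.10). Returns 0 on success and
 * -1 on malformed input. str is modified: the '%' is overwritten with NUL.
 */
static int parse_percent(double *pcnt, char *str)
{
	char *c, *endptr;
	double d;

	c = strchr(str, '%');
	if (!c)
		return -1;
	*c = '\0';

	d = strtod(str, &endptr);
	/* endptr == str: nothing was converted (extra check, not in the patch) */
	if (endptr == str || *endptr != '\0')
		return -1;

	*pcnt = d / 100.0;
	return 0;
}
```

atof("1abc") quietly returns 1.0, which is why "1abc%" used to be accepted. With strtod(), endptr is left pointing at 'a', so the trailing garbage is detected.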



[PATCH v1 5/8] perf report: Add an indication of what time slices are used

2018-01-09 Thread Jin Yao
Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index a6c5cf2..77c954c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -403,6 +403,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
if (evname != NULL)
ret += fprintf(fp, " of event '%s'", evname);
 
+   if (rep->time_str)
+   ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
if (symbol_conf.show_ref_callgraph &&
strstr(evname, "call-graph=no")) {
ret += fprintf(fp, ", show reference callgraph");
-- 
2.7.4
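
The header change above amounts to appending an optional suffix only when a --time string was given. A small sketch of that idea, using snprintf() into a buffer instead of fprintf() to a stream, is shown below. format_samples_header() and its parameters are hypothetical and simplified; the real function builds the line from a sample count and unit rather than a ready-made string.

```c
#include <stdio.h>

/*
 * Format the "Samples" header line into buf; the time-slice suffix is
 * appended only when time_str is non-NULL. Returns the snprintf-style
 * total length.
 */
static int format_samples_header(char *buf, size_t size, const char *nr,
				 const char *evname, const char *time_str)
{
	int ret;

	ret = snprintf(buf, size, "# Samples: %s of event '%s'", nr, evname);
	if (time_str && ret >= 0 && (size_t)ret < size)
		ret += snprintf(buf + ret, size - ret, " (time slices: %s)",
				time_str);
	return ret;
}
```

Called with "9K", "cycles:ppp" and "10%", this produces the header line shown in the commit message; with a NULL time string the line is unchanged from the old output.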



[PATCH v1 1/8] perf report: Improve error msg when no first/last sample time found

2018-01-09 Thread Jin Yao
Print the following message when 'perf report --time' is executed
on a perf data file that doesn't contain the first/last sample time:

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, please set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a..a6c5cf2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1299,7 +1299,9 @@ int cmd_report(int argc, const char **argv)
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf data.\n"
+  "Please use latest perf binary to execute 'perf record'\n"
+  "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
return -EINVAL;
}
 
-- 
2.7.4



[PATCH v1 0/8] perf: Follow-up patches to improve time slice

2018-01-09 Thread Jin Yao
This is a follow-up series that improves the perf time slice feature
(perf report/script --time xxx).

1. Improve the error message
   perf report: Improve error msg when no first/last sample time found
   perf script: Improve error msg when no first/last sample time found

2. Fix an issue where an invalid percent was previously accepted (e.g. 1abc%)
   perf util: Improve error checking for time percent input

3. Omit the slice index if possible. For example,
   perf report --stdio --time 10%/1 is equivalent to
   perf report --stdio --time 10%

   perf util: Support no index time percent slice 

4. Add indication of time slices in perf report header.
   perf report: Add an indication of what time slices are used

5. Remove the time slices number limitation in perf report/script
   perf util: Allocate time slices buffer according to number of comma
   perf report: Remove the time slices number limitation
   perf script: Remove the time slices number limitation

Jin Yao (8):
  perf report: Improve error msg when no first/last sample time found
  perf script: Improve error msg when no first/last sample time found
  perf util: Improve error checking for time percent input
  perf util: Support no index time percent slice
  perf report: Add an indication of what time slices are used
  perf util: Allocate time slices buffer according to number of comma
  perf report: Remove the time slices number limitation
  perf script: Remove the time slices number limitation

 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/Documentation/perf-script.txt | 10 ++---
 tools/perf/builtin-report.c  | 30 +
 tools/perf/builtin-script.c  | 21 +++---
 tools/perf/util/time-utils.c | 72 +++-
 tools/perf/util/time-utils.h |  2 +
 6 files changed, 117 insertions(+), 20 deletions(-)

-- 
2.7.4



Re: RFC(V3): Audit Kernel Container IDs

2018-01-09 Thread Richard Guy Briggs
On 2018-01-09 11:18, Simo Sorce wrote:
> On Tue, 2018-01-09 at 07:16 -0500, Richard Guy Briggs wrote:
> > Containers are a userspace concept.  The kernel knows nothing of them.
> > 
> > The Linux audit system needs a way to be able to track the container
> > provenance of events and actions.  Audit needs the kernel's help to do
> > this.
> > 
> > Since the concept of a container is entirely a userspace concept, a
> > registration from the userspace container orchestration system initiates
> > this.  This will define a point in time and a set of resources
> > associated with a particular container with an audit container
> > identifier.
> > 
> > The registration is a u64 representing the audit container identifier
> > written to a special file in a pseudo filesystem (proc, since PID tree
> > already exists) representing a process that will become a parent process
> > in that container.  This write might place restrictions on mount
> > namespaces required to define a container, or at least careful checking
> > of namespaces in the kernel to verify permissions of the orchestrator so
> > it can't change its own container ID.  A bind mount of nsfs may be
> > necessary in the container orchestrator's mount namespace.  This write
> > can only happen once per process.
> > 
> > Note: The justification for using a u64 is that it minimizes the
> > information printed in every audit record, reducing bandwidth, and limits
> > comparisons to a single u64, which will be faster and less error-prone.
> > 
> > Require CAP_AUDIT_CONTROL to be able to carry out the registration.  At
> > that time, record the target container's user-supplied audit container
> > identifier along with a target container's parent process (which may
> > become the target container's "init" process) process ID (referenced
> > from the initial PID namespace) in a new record AUDIT_CONTAINER with a
> > qualifying op=$action field.
> > 
> > Issue a new auxiliary record AUDIT_CONTAINER_INFO for each valid
> > container ID present on an auditable action or event.
> > 
> > Forked and cloned processes inherit their parent's audit container
> > identifier, referenced in the process' task_struct.  Since the audit
> > container identifier is inherited rather than written, it can still be
> > written once.  This will prevent tampering while allowing nesting.
> > (This can be implemented with an internal settable flag upon
> > registration that does not get copied across a fork/clone.)
> > 
> > Mimic setns(2) and return an error if the process has already initiated
> > threading or forked since this registration should happen before the
> > process execution is started by the orchestrator and hence should not
> > yet have any threads or children.  If this is deemed overly restrictive,
> > switch all of the target's threads and children to the new containerID.
> > 
> > Trust the orchestrator to judiciously use and restrict CAP_AUDIT_CONTROL.
> > 
> > When a container ceases to exist because the last process in that
> > container has exited, log the fact to balance the registration action.
> > (This is likely needed for certification accountability.)
> > 
> > At this point it appears unnecessary to add a container session
> > identifier since this is all tracked from loginuid and sessionid to
> > communicate with the container orchestrator to spawn an additional
> > session into an existing container which would be logged.  It can be
> > added at a later date without breaking API should it be deemed
> > necessary.
> > 
> > The following namespace logging actions are not needed for certification
> > purposes at this point, but are helpful for tracking namespace activity.
> > These are auxiliary records that are associated with namespace
> > manipulation syscalls unshare(2), clone(2) and setns(2), so the records
> > will only show up if explicit syscall rules have been added to document
> > this activity.
> > 
> > Log the creation of every namespace, inheriting/adding its spawning
> > process' audit container identifier(s), if applicable.  Include the
> > spawning and spawned namespace IDs (device and inode number tuples).
> > [AUDIT_NS_CREATE, AUDIT_NS_DESTROY] [clone(2), unshare(2), setns(2)]
> > Note: At this point it appears only network namespaces may need to track
> > container IDs apart from processes since incoming packets may cause an
> > auditable event before being associated with a process.  Since a
> > namespace can be shared by processes in different containers, the
> > namespace will need to track all containers to which it has been
> > assigned.
> > 
> > Upon registration, the target process' namespace IDs (in the form of a
> > nsfs device number and inode number tuple) will be recorded in an
> > AUDIT_NS_INFO auxiliary record.
> > 
> > Log the destruction of every namespace that is no longer used by any
> > process, including the namespace IDs (device and inode number tuples).
> > [AUDIT_NS_DESTROY] [process exit, unshare(2), setns(2)]
> > 
> > 
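
For illustration only, the registration write described in the quoted
proposal could look roughly like the sketch below from the orchestrator's
side. The RFC names neither the proc file nor its encoding, so the path
argument, the decimal-text format, and the helper name
register_container_id() are all assumptions, not part of the proposal.

```c
#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write a u64 audit container identifier, as decimal text, to `path`.
 * ASSUMPTION: the file location and the decimal-text encoding are
 * hypothetical; the RFC only says "a special file in proc".
 * Returns 0 on success, -1 on failure.
 */
static int register_container_id(const char *path, uint64_t id)
{
	char buf[32];
	int len, fd;
	ssize_t written;

	assert(path != NULL);

	len = snprintf(buf, sizeof(buf), "%" PRIu64, id);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* A single write(2): the proposal allows the write once per process. */
	written = write(fd, buf, (size_t)len);
	if (close(fd) < 0 || written != (ssize_t)len)
		return -1;
	return 0;
}
```

Under the proposal the caller would need CAP_AUDIT_CONTROL, and the write
would have to happen before the target process starts threads or forks.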

[PATCH 5/6] kconfig: remove redundant input_mode test for check_conf() loop

2018-01-09 Thread Masahiro Yamada
check_conf() never increments conf_cnt for listnewconfig, so conf_cnt
is always zero.

In other words, whenever conf_cnt is not zero, "input_mode != listnewconfig"
is already met, so the extra test is redundant.

Signed-off-by: Masahiro Yamada 
---

 scripts/kconfig/conf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
index 693cd5f..1d2ed3e 100644
--- a/scripts/kconfig/conf.c
+++ b/scripts/kconfig/conf.c
@@ -669,7 +669,7 @@ int main(int ac, char **av)
 		do {
 			conf_cnt = 0;
 			check_conf();
-		} while (conf_cnt && input_mode != listnewconfig);
+		} while (conf_cnt);
 		break;
 	case alldefconfig:
 	case defconfig:
-- 
2.7.4
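
The equivalence claimed in the commit message can be checked with a small
stand-alone model. This is only a sketch: check_conf_model(), run_loop() and
the `pending` counter are hypothetical stand-ins, not code from
scripts/kconfig/conf.c. The model assumes, as the commit message states, that
conf_cnt is never incremented in listnewconfig mode.

```c
#include <assert.h>
#include <stdbool.h>

enum input_mode { oldconfig, listnewconfig, olddefconfig };

static int conf_cnt;

/*
 * Stand-in for check_conf(): `pending` models symbols that still need a
 * value. conf_cnt is left untouched for listnewconfig (the stated premise).
 */
static void check_conf_model(enum input_mode mode, int *pending)
{
	if (*pending > 0) {
		(*pending)--;
		if (mode != listnewconfig)
			conf_cnt++;
	}
}

/* Run the update loop and return the number of passes it made. */
static int run_loop(enum input_mode mode, int pending, bool with_mode_test)
{
	int passes = 0;

	assert(pending >= 0);
	do {
		conf_cnt = 0;
		check_conf_model(mode, &pending);
		passes++;
	} while (with_mode_test ? (conf_cnt && mode != listnewconfig)
				: conf_cnt);
	return passes;
}
```

In this model, run_loop() makes the same number of passes with and without
the input_mode test for every mode, which is the behaviour the patch relies
on.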


