Re: [PATCH] ARM: s3c64xx: Tidy up handling of regulator GPIO lookups

2018-05-13 Thread Linus Walleij
On Thu, Apr 19, 2018 at 5:01 PM, Charles Keepax wrote:

> From: Charles Keepax 
>
> Rather than unconditionally registering the GPIO lookup table only do so
> for devices that require it.
>
> Signed-off-by: Charles Keepax 
> ---
>
> Do you have any objections to the following?
>
> If we are lucky I might be able to find time to test these early
> next week. Well at least there is reasonable chance I can test
> the 5102 stuff when you resend, not sure I have a device to
> test the wm1277 but will have a look. Also I haven't run up
> Cragganmore in a little while so might depend a little on how
> much people have broken it since last I did :-)

I folded this in on top of my series, also adding the table entries
for wm5102 and wm5102 reva.

Sorry for the delay, I was sidetracked...

Yours,
Linus Walleij
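
The pattern described in the quoted commit message (register a lookup table only for the devices that need one) can be illustrated with a small userspace sketch. Everything here is hypothetical: `lookup_table`, `module_info`, `register_lookup()` and `probe_module()` are stand-ins invented for illustration, not the s3c64xx board code and not the gpiod lookup API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a GPIO lookup table. */
struct lookup_table {
	const char *dev_id;
};

/* Hypothetical module description; lookup stays NULL when none is needed. */
struct module_info {
	const char *name;
	const struct lookup_table *lookup;
};

/* Counts registrations so the effect can be observed. */
static int registered_tables;

static void register_lookup(const struct lookup_table *table)
{
	assert(table != NULL);
	registered_tables++;
}

static const struct lookup_table wm5102_lookup = { .dev_id = "wm5102" };

static const struct module_info modules[] = {
	{ .name = "wm5102", .lookup = &wm5102_lookup },
	{ .name = "no-regulator-gpio", .lookup = NULL },
};

/*
 * Look the module up by name and register its table only if it has one.
 * Returns 0 when the module is known, -1 otherwise.
 */
static int probe_module(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(modules) / sizeof(modules[0]); i++) {
		if (strcmp(modules[i].name, name) != 0)
			continue;
		if (modules[i].lookup)
			register_lookup(modules[i].lookup);
		return 0;
	}
	return -1;
}
```

Keeping the table pointer next to the module entry means a module without regulator GPIOs never touches the registration path at all.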


Re: [PATCH] scsi: clean up generated file scsi_devinfo_tbl.c

2018-05-13 Thread Hannes Reinecke
On Sun, 13 May 2018 17:10:52 -0700
Randy Dunlap  wrote:

> From: Randy Dunlap 
> 
> "make clean" should remove the generated file "scsi_devinfo_tbl.c",
> so list it in the clean-files variable so that the file gets
> cleaned up.
> 
> Fixes: 345e29608b4b ("scsi: scsi: Export blacklist flags to sysfs")
> 
> Cc: Hannes Reinecke 
> Signed-off-by: Randy Dunlap 
> ---
>  drivers/scsi/Makefile |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- linux-4.17-rc4.orig/drivers/scsi/Makefile
> +++ linux-4.17-rc4/drivers/scsi/Makefile
> @@ -182,7 +182,7 @@ zalon7xx-objs := zalon.o ncr53c8xx.o
>  NCR_Q720_mod-objs:= NCR_Q720.o ncr53c8xx.o
>  
>  # Files generated that shall be removed upon make clean
> -clean-files :=   53c700_d.h 53c700_u.h
> +clean-files :=   53c700_d.h 53c700_u.h scsi_devinfo_tbl.c
>  
>  $(obj)/53c700.o $(MODVERDIR)/$(obj)/53c700.ver: $(obj)/53c700_d.h
>  
> 
> 

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes



Re: [PATCH 01/10] autofs4 - merge auto_fs.h and auto_fs4.h

2018-05-13 Thread Ian Kent
On 14/05/18 11:15, Al Viro wrote:
> On Mon, May 14, 2018 at 11:03:50AM +0800, Ian Kent wrote:
>> The autofs module has long since been removed so there's no need to have
>> two separate include files for autofs.
> 
> Umm...  Why does fs/compat_ioctl.c need either include, actually?
> 
>> --- a/fs/compat_ioctl.c
>> +++ b/fs/compat_ioctl.c
>> @@ -39,7 +39,6 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  #include 
>>  #include 
>>  #include 
> 
> AFAICS, we can just delete both.  Matter of fact, a *lot* of those includes
> are pointless nowadays...
> 

OK, I'll have a look at that and post a follow up patch.

Thanks for having a look at these Al.


Re: general protection fault in rds_ib_get_mr

2018-05-13 Thread santosh.shilim...@oracle.com

On 5/13/18 2:10 PM, Eric Biggers wrote:
> On Wed, Mar 21, 2018 at 09:00:01AM -0700, syzbot wrote:

[...]

> Still reproducible on Linus' tree (commit 66e1c94db3cd4) and linux-next
> (next-20180511).  Here's a simplified reproducer:


Thanks for the test case !!

Regards,
Santosh


general protection fault in shmem_unused_huge_count

2018-05-13 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:3eb2ce825ea1 Linux 4.16-rc7
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16bd9d9380
kernel config:  https://syzkaller.appspot.com/x/.config?x=8addcf4530d93e53
dashboard link: https://syzkaller.appspot.com/bug?extid=d2586fde8fdcead3647f
compiler:   gcc (GCC) 7.1.1 20170620

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+d2586fde8fdcead36...@syzkaller.appspotmail.com

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault:  [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 9915 Comm: syz-executor3 Not tainted 4.16.0-rc7+ #3
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

RIP: 0010:__read_once_size include/linux/compiler.h:188 [inline]
RIP: 0010:shmem_unused_huge_count+0x8e/0x100 mm/shmem.c:561
RSP: 0018:8801b0e9f460 EFLAGS: 00010206
RAX: dc00 RBX: 1100361d3e8d RCX: 8195e5e7
RDX: 0021 RSI: 8801b0e9f778 RDI: 0108
RBP: 8801b0e9f4e0 R08:  R09: 1100361d3e76
R10: 8801b0e9f378 R11: 0001 R12: 
R13: dc00 R14: 8801b4c5cdf0 R15: 
FS:  00d5a940() GS:8801db20() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 00d5ac10 CR3: 0001d8120003 CR4: 001606f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 super_cache_count+0x96/0x280 fs/super.c:131
 do_shrink_slab mm/vmscan.c:310 [inline]
 shrink_slab.part.46+0x30c/0xe80 mm/vmscan.c:475
 shrink_slab+0x9d/0xb0 mm/vmscan.c:442
 shrink_node+0x51e/0xf70 mm/vmscan.c:2556
 shrink_zones mm/vmscan.c:2728 [inline]
 do_try_to_free_pages+0x383/0x1020 mm/vmscan.c:2790
 try_to_free_mem_cgroup_pages+0x44d/0xb40 mm/vmscan.c:3079
 reclaim_high.constprop.64+0x1e2/0x330 mm/memcontrol.c:1862
 mem_cgroup_handle_over_high+0x8d/0x130 mm/memcontrol.c:1887
 tracehook_notify_resume include/linux/tracehook.h:193 [inline]
 exit_to_usermode_loop+0x242/0x2f0 arch/x86/entry/common.c:166
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath+0x487/0x550 arch/x86/entry/common.c:265
 ret_from_fork+0x15/0x50 arch/x86/entry/entry_64.S:399
RIP: 0033:0x452f5a
RSP: 002b:7ffdaa5ef700 EFLAGS: 0246 ORIG_RAX: 0038
RAX:  RBX: 7ffdaa5ef700 RCX: 00452f5a
RDX:  RSI:  RDI: 01200011
RBP: 7ffdaa5ef740 R08: 0001 R09: 00d5a940
R10: 00d5ac10 R11: 0246 R12: 0001
R13:  R14:  R15: 1380
Code: c1 e8 03 42 80 3c 28 00 75 6f 4d 8b a4 24 80 06 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 5e 48 8d 7d a8 48 ba 00 00 00 00 00 fc ff df 49
RIP: __read_once_size include/linux/compiler.h:188 [inline] RSP: 8801b0e9f460

RIP: shmem_unused_huge_count+0x8e/0x100 mm/shmem.c:561 RSP: 8801b0e9f460
---[ end trace 8116f40602cb9839 ]---


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
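
As a reading aid for the report above, the sketch below shows the usual x86-64 KASAN address-to-shadow mapping and checks it against the values in the dump. The offset 0xdffffc0000000000 is read from the `48 b8 00 00 00 00 00 fc ff df` bytes in the Code: line (a movabs into RAX), and the faulting instruction `<80> 3c 02 00` is a byte compare at RDX + RAX. The register values in this archive are truncated, so treating RDI as 0x108 and RDX as 0x21 is an assumption based on the surviving digits. If that holds, the checked address is 0x108, which would be consistent with a NULL base pointer plus a field offset, though the dump alone cannot confirm that.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 KASAN layout: one shadow byte describes 8 bytes of memory. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Map a memory address to the address of its shadow byte. */
static uint64_t kasan_mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Map a shadow address back to the start of the 8-byte granule it covers. */
static uint64_t kasan_shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/* Check the mapping against the (assumed) RDI and RDX values in the report. */
static void check_report_values(void)
{
	/* RDX is RDI shifted right by 3: 0x108 >> 3 == 0x21. */
	assert((0x108ULL >> KASAN_SHADOW_SCALE_SHIFT) == 0x21);
	/* The faulting access is at RDX + RAX, a non-canonical address. */
	assert(kasan_mem_to_shadow(0x108) == 0xdffffc0000000021ULL);
	assert(kasan_shadow_to_mem(0xdffffc0000000021ULL) == 0x108);
}
```

Because the shadow of a near-NULL address is not a canonical x86-64 address, the inline KASAN check itself faults, which is why the report says the GPF "could be caused by NULL-ptr deref or user memory access".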


linux-next: manual merge of the kselftest tree with the kvm-fixes tree

2018-05-13 Thread Stephen Rothwell
Hi Shuah,

Today's linux-next merge of the kselftest tree got a conflict in:

  tools/testing/selftests/kvm/vmx_tsc_adjust_test.c

between commit:

  bcb2b94ae010 ("KVM: selftests: exit with 0 status code when tests cannot be run")

from the kvm-fixes tree and commit:

  13911360966d ("selftests: kvm: return Kselftest Skip code for skipped tests")

from the kselftest tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc tools/testing/selftests/kvm/vmx_tsc_adjust_test.c
index aaa633263b2c,62fb73699eb6..
--- a/tools/testing/selftests/kvm/vmx_tsc_adjust_test.c
+++ b/tools/testing/selftests/kvm/vmx_tsc_adjust_test.c
@@@ -189,8 -191,8 +191,8 @@@ int main(int argc, char *argv[]
struct kvm_cpuid_entry2 *entry = kvm_get_supported_cpuid_entry(1);
  
if (!(entry->ecx & CPUID_VMX)) {
 -  printf("nested VMX not enabled, skipping test");
 +  fprintf(stderr, "nested VMX not enabled, skipping test\n");
-   exit(KSFT_SKIP);
+   return KSFT_SKIP;
}
  
vm = vm_create_default_vmx(VCPU_ID, (void *) l1_guest_code);
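
The resolution above keeps the stderr message from one side and the skip return code from the other. A minimal standalone sketch of that convention follows. The KSFT_PASS and KSFT_SKIP values match tools/testing/selftests/kselftest.h; `run_test()` and its `have_nested_vmx` parameter are hypothetical stand-ins for the real CPUID check and test body.

```c
#include <assert.h>
#include <stdio.h>

/* Exit codes as defined in tools/testing/selftests/kselftest.h. */
#define KSFT_PASS 0
#define KSFT_SKIP 4

/*
 * Hypothetical test entry point: report a skip on stderr and return the
 * skip code when the required feature is missing.
 */
static int run_test(int have_nested_vmx)
{
	if (!have_nested_vmx) {
		fprintf(stderr, "nested VMX not enabled, skipping test\n");
		return KSFT_SKIP;
	}
	/* ... the real test body would run here ... */
	return KSFT_PASS;
}

/* Usage: a missing feature is a skip, not a pass and not a failure. */
static void usage_example(void)
{
	assert(run_test(0) == KSFT_SKIP);
	assert(run_test(1) == KSFT_PASS);
}
```

Returning a distinct skip code lets a test runner tell "could not run" apart from "ran and passed", which a plain 0 exit status cannot express.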




Re: [PATCH v5 11/23] driver core: add per device iommu param

2018-05-13 Thread Lu Baolu
Hi,

On 05/12/2018 04:54 AM, Jacob Pan wrote:
> DMA faults can be detected by IOMMU at device level. Adding a pointer
> to struct device allows IOMMU subsystem to report relevant faults
> back to the device driver for further handling.
> For direct assigned device (or user space drivers), guest OS holds
> responsibility to handle and respond per device IOMMU fault.
> Therefore we need fault reporting mechanism to propagate faults beyond
> IOMMU subsystem.
>
> There are two other IOMMU data pointers under struct device today, here
> we introduce iommu_param as a parent pointer such that all device IOMMU
> data can be consolidated here. The idea was suggested here by Greg KH
> and Joerg. The name iommu_param is chosen here since iommu_data has been used.

This doesn't match what you've done in the patch. Maybe you
forgot to clean up? :-)

The idea is to create a parent pointer under device struct and
move previous iommu_group and iommu_fwspec together with
the iommu fault related data into it.

Best regards,
Lu Baolu

>
> Suggested-by: Greg Kroah-Hartman 
> Reviewed-by: Greg Kroah-Hartman 
> Signed-off-by: Jacob Pan 
> Link: https://lkml.org/lkml/2017/10/6/81
> ---
>  include/linux/device.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 4779569..c1b1796 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -41,6 +41,7 @@ struct iommu_ops;
>  struct iommu_group;
>  struct iommu_fwspec;
>  struct dev_pin_info;
> +struct iommu_param;
>  
>  struct bus_attribute {
>  	struct attribute	attr;
> @@ -899,6 +900,7 @@ struct dev_links_info {
>   *   device (i.e. the bus driver that discovered the device).
>   * @iommu_group: IOMMU group the device belongs to.
>   * @iommu_fwspec: IOMMU-specific properties supplied by firmware.
> + * @iommu_param: Per device generic IOMMU runtime data
>   *
>   * @offline_disabled: If set, the device is permanently online.
>   * @offline: Set after successful invocation of bus type's .offline().
> @@ -988,6 +990,7 @@ struct device {
>  	void	(*release)(struct device *dev);
>  	struct iommu_group	*iommu_group;
>  	struct iommu_fwspec	*iommu_fwspec;
> +	struct iommu_param	*iommu_param;
>  
>  	bool			offline_disabled:1;
>  	bool			offline:1;
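
The consolidation Lu Baolu describes (one parent pointer in struct device that holds the group, the fwspec and the fault data) might look roughly like the userspace sketch below. This is an assumption about the shape of such a structure, not the layout from the patch series: the field names, `struct iommu_fault_param`, `struct device_model` and the two helpers are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Opaque types, forward-declared in the style of include/linux/device.h. */
struct iommu_group;
struct iommu_fwspec;
struct iommu_fault_param;	/* hypothetical name for the fault data */

/* Hypothetical parent structure gathering all per-device IOMMU data. */
struct iommu_param {
	struct iommu_group *group;
	struct iommu_fwspec *fwspec;
	struct iommu_fault_param *fault_param;
};

/* Cut-down model of struct device carrying a single IOMMU pointer. */
struct device_model {
	struct iommu_param *iommu_param;
};

/* Allocate a zeroed container; returns 0 on success, -1 on failure. */
static int device_model_init_iommu(struct device_model *dev)
{
	assert(dev != NULL);
	dev->iommu_param = calloc(1, sizeof(*dev->iommu_param));
	return dev->iommu_param ? 0 : -1;
}

/* Release the container and clear the pointer. */
static void device_model_free_iommu(struct device_model *dev)
{
	assert(dev != NULL);
	free(dev->iommu_param);
	dev->iommu_param = NULL;
}
```

With this shape, struct device would grow by one pointer instead of three, at the cost of an extra indirection on every access to the group or fwspec.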



Re: [PATCH 2/2] arm64: Clear the stack

2018-05-13 Thread Mark Rutland
On Sun, May 13, 2018 at 11:40:07AM +0300, Alexander Popov wrote:
> It seems that previously I was very "lucky" to accidentally have those
> MIN_STACK_LEFT, call trace depth and oops=panic together to experience a hang
> on stack overflow during BUG().
>
>
> When I run my test in a loop _without_ VMAP_STACK, I manage to corrupt the
> neighbour processes with BUG() handling overstepping the stack boundary. It's
> a pity, but I have an idea.

I think that in the absence of VMAP_STACK, there will always be cases where we
*could* corrupt a neighbouring stack, but I agree that trying to minimize that
possibility would be good.

> In kernel/sched/core.c we already have:
> 
> #ifdef CONFIG_SCHED_STACK_END_CHECK
>   if (task_stack_end_corrupted(prev))
>   panic("corrupted stack end detected inside scheduler\n");
> #endif
> 
> So what would you think if I do the following in check_alloca():
> 
>   if (size >= stack_left) {
> #if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
>   panic("alloca over the kernel stack boundary\n");
> #else
>   BUG();
> #endif

Given this is already out-of-line, how about we always use panic(), regardless
of VMAP_STACK and SCHED_STACK_END_CHECK? i.e. just

	if (unlikely(size >= stack_left))
		panic("alloca over the kernel stack boundary");

If we have VMAP_STACK selected, and overflow during the panic, it's the same as
if we overflowed during the BUG(). It's likely that panic() will use less stack
space than BUG(), and the compiler can put the call in a slow path that
shouldn't affect most calls, so in all cases it's likely preferable.

Thanks,
Mark.
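
The bounds test at the heart of this suggestion can be modelled in userspace as below. This is only a sketch of the comparison: `alloca_overflows()` and its parameters are hypothetical, the stack is assumed to grow downwards from `sp` towards `stack_begin`, and the caller (in the kernel, the place that would call panic()) decides what to do with the result.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when an alloca of `size` bytes would reach or cross the
 * lowest valid stack address. Assumes a downward-growing stack, so the
 * current stack pointer is never below stack_begin.
 */
static bool alloca_overflows(unsigned long sp, unsigned long stack_begin,
			     unsigned long size)
{
	unsigned long stack_left;

	assert(sp >= stack_begin);
	stack_left = sp - stack_begin;
	return size >= stack_left;
}
```

For example, with 0x1000 bytes left, a request for 0xfff bytes passes and a request for 0x1000 bytes is rejected, matching the `size >= stack_left` condition quoted above.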


Re: CONFIG_KCOV causing crash in svm_vcpu_run()

2018-05-13 Thread Dmitry Vyukov
On Mon, May 14, 2018 at 5:02 AM, Eric Biggers  wrote:
> Sorry, messed up address for KVM mailing list.  See message below.
>
> On Sun, May 13, 2018 at 08:00:07PM -0700, Eric Biggers wrote:
>> With CONFIG_KCOV=y and an AMD processor, running the following program
>> crashes the kernel with no output (I'm testing in a VM, so it's using nested
>> virtualization):
>>
>>   #include <fcntl.h>
>>   #include <linux/kvm.h>
>>   #include <sys/ioctl.h>
>>
>>   int main()
>>   {
>>           int dev, vm, cpu;
>>           char page[4096] __attribute__((aligned(4096))) = { 0 };
>>           struct kvm_userspace_memory_region memreg = {
>>                   .memory_size = 4096,
>>                   .userspace_addr = (unsigned long)page,
>>           };
>>           dev = open("/dev/kvm", O_RDONLY);
>>           vm = ioctl(dev, KVM_CREATE_VM, 0);
>>           cpu = ioctl(vm, KVM_CREATE_VCPU, 0);
>>           ioctl(vm, KVM_SET_USER_MEMORY_REGION, &memreg);
>>           ioctl(cpu, KVM_RUN, 0);
>>   }
>>
>> It bisects down to commit b2ac58f90540e39 ("KVM/SVM: Allow direct access to
>> MSR_IA32_SPEC_CTRL").  The bug is apparently that due to the new code for
>> managing the SPEC_CTRL MSR, __sanitizer_cov_trace_pc() is being called from
>> svm_vcpu_run() before the host's MSR_GS_BASE has been restored, which causes
>> a crash somehow.  The following patch fixes it, though I don't know that it's
>> the right solution; maybe KCOV should be disabled in the function instead, or
>> maybe there's a more fundamental problem.  What do people think?


If __sanitizer_cov_trace_pc() crashes, I would expect there must be
a few more of them here:

	if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);

	if (svm->spec_ctrl)
		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);

The compiler inserts these callbacks into every basic block/edge. Aren't there?

Unfortunately we don't have an attribute that disables instrumentation
of a single function. This is currently possible only on file level.
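
To make the "every basic block" point concrete, here is a hand-instrumented userspace model. It is a sketch only: `trace_pc()` is a hypothetical stand-in for __sanitizer_cov_trace_pc(), the calls are written out by hand roughly where a compiler building with trace-pc coverage would be expected to place them, and `clear_if_set()` merely mimics the shape of the two-branch code quoted above. For the file-level opt-out mentioned in the message, the kernel build system accepts a per-object Makefile line of the form `KCOV_INSTRUMENT_<file>.o := n`.

```c
#include <assert.h>

/* Number of coverage callbacks seen so far. */
static unsigned long trace_calls;

/* Hypothetical stand-in for __sanitizer_cov_trace_pc(). */
static void trace_pc(void)
{
	trace_calls++;
}

/* Hand-instrumented function: one callback per basic block. */
static int clear_if_set(int value)
{
	trace_pc();		/* function entry block */
	if (value) {
		trace_pc();	/* "then" block */
		value = 0;
	}
	trace_pc();		/* join block after the branch */
	return value;
}

/* Usage: the not-taken path hits two callbacks, the taken path three. */
static void usage_example(void)
{
	trace_calls = 0;
	assert(clear_if_set(0) == 0);
	assert(trace_calls == 2);
	assert(clear_if_set(5) == 0);
	assert(trace_calls == 5);
}
```

Even this tiny function gets a callback before any of its own work runs, so any instrumented code between guest exit and the restore of host state would be expected to reach the callback.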



>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 1fc05e428aba8..d35ef241e66d8 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -5652,6 +5652,15 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
>>  #endif
>>   );
>>
>> +#ifdef CONFIG_X86_64
>> + wrmsrl(MSR_GS_BASE, svm->host.gs_base);
>> +#else
>> + loadsegment(fs, svm->host.fs);
>> +#ifndef CONFIG_X86_32_LAZY_GS
>> + loadsegment(gs, svm->host.gs);
>> +#endif
>> +#endif
>> +
>>   /*
>>* We do not use IBRS in the kernel. If this vCPU has used the
>>* SPEC_CTRL MSR it may have left it on; save the value and
>> @@ -5676,15 +5685,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
>>   /* Eliminate branch target predictions from guest mode */
>>   vmexit_fill_RSB();
>>
>> -#ifdef CONFIG_X86_64
>> - wrmsrl(MSR_GS_BASE, svm->host.gs_base);
>> -#else
>> - loadsegment(fs, svm->host.fs);
>> -#ifndef CONFIG_X86_32_LAZY_GS
>> - loadsegment(gs, svm->host.gs);
>> -#endif
>> -#endif
>> -
>>   reload_tss(vcpu);
>>
>>   local_irq_disable();


Re: [PATCH v6 3/6] kernel/reboot.c: export pm_power_off_prepare

2018-05-13 Thread Oleksij Rempel


On 14.05.2018 06:33, Oleksij Rempel wrote:
> 
> 
> On 12.05.2018 13:13, Rafael J. Wysocki wrote:
>> On Friday, May 4, 2018 8:50:52 PM CEST Oleksij Rempel wrote:
>>> Hallo Andrew,
>>> I need your ACK or NACK for this patch.
>>>
>>> This function is used to configure external PMIC to interpret
>>> signal which will be triggered by pm_power_off as power off.
>>> Since same signal can be used for stand by, I linked PMIC configuration
>>> with pm_power_off_prepare to avoid possible conflicts.
>>>
>>> On Mon, Mar 05, 2018 at 11:25:20AM +0100, Oleksij Rempel wrote:
>>>> Export pm_power_off_prepare. It is needed to implement power off on
>>>> Freescale/NXP iMX6 based boards with external power management
>>>> integrated circuit (PMIC).
>>>>
>>>> Signed-off-by: Oleksij Rempel
>>>> ---
>>>>  kernel/reboot.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/kernel/reboot.c b/kernel/reboot.c
>>>> index e4ced883d8de..350be6baa60d 100644
>>>> --- a/kernel/reboot.c
>>>> +++ b/kernel/reboot.c
>>>> @@ -49,6 +49,7 @@ int reboot_force;
>>>>   */
>>>>
>>>>  void (*pm_power_off_prepare)(void);
>>>> +EXPORT_SYMBOL(pm_power_off_prepare);
>>
>> Why not EXPORT_SYMBOL_GPL() ?
> 
> No special reason. Fixed.
> Any other comments?
> 

Or in other words, will it be enough to get your Signed-off-by for this
patch?
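
For context on why a driver would want this symbol exported, the sketch below models the hook in userspace. The hook is a plain function pointer that stays NULL until some code installs a callback, and the power-off path calls it only when it is set. The names `pmic_power_off_prepare()`, `pmic_configured` and `power_off_model()` are hypothetical; only the `pm_power_off_prepare` pointer itself mirrors the declaration in the quoted diff.

```c
#include <assert.h>
#include <stddef.h>

/* Model of the hook from kernel/reboot.c; NULL means no hook installed. */
static void (*pm_power_off_prepare)(void);

/* Hypothetical driver state: set once the PMIC has been told to power off. */
static int pmic_configured;

/* Hypothetical PMIC driver callback. */
static void pmic_power_off_prepare(void)
{
	pmic_configured = 1;
}

/* Model of the power-off path: run the prepare hook only when present. */
static void power_off_model(void)
{
	if (pm_power_off_prepare)
		pm_power_off_prepare();
	/* ... the machine would be powered off here ... */
}

/* Usage: nothing happens without a hook; the callback runs once installed. */
static void usage_example(void)
{
	pm_power_off_prepare = NULL;
	pmic_configured = 0;
	power_off_model();
	assert(pmic_configured == 0);

	pm_power_off_prepare = pmic_power_off_prepare;
	power_off_model();
	assert(pmic_configured == 1);
}
```

A driver built as a module can only assign to such a pointer if the symbol is exported, which is the point of the one-line patch under discussion.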





Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Joel Fernandes
On Sun, May 13, 2018 at 08:47:24PM -0700, Randy Dunlap wrote:
> On 05/13/2018 08:15 PM, Joel Fernandes (Google) wrote:
> > rcu_seq_snap may be tricky for someone looking at it for the first time.
> > Lets document how it works with an example to make it easier.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >  kernel/rcu/rcu.h | 24 +++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 003671825d62..fc3170914ac7 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
> > WRITE_ONCE(*sp, rcu_seq_endval(sp));
> >  }
> >  
> > -/* Take a snapshot of the update side's sequence number. */
> > +/*
> > + * Take a snapshot of the update side's sequence number.
> > + *
> > + * This function predicts what the grace period number will be the next
> > + * time an RCU callback will be executed, given the current grace period's
> > + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> > + * already in progress.
> > + *
> > + * We do this with a single addition and masking.
> > + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
> > + * the seq is used to track if a GP is in progress or not, its sufficient if we
> 
>   it's

Pardon my English. Fixed, thanks,

- Joel
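
The "single addition and masking" from the quoted comment can be tried out in
userspace. The sketch below assumes the toy layout used in that comment (one
state bit, RCU_SEQ_STATE_MASK=1, counter in the remaining bits); the toy_*
names are hypothetical and the kernel's real layout may differ.

```c
#include <assert.h>

/* Toy layout from the quoted comment: bit 0 = "GP in progress" (assumption). */
#define TOY_SEQ_STATE_MASK 1UL

/* Grace-period number carried in a sequence value. */
static unsigned long toy_seq_ctr(unsigned long seq)
{
	return seq >> 1;
}

/* Snapshot: add (2 * mask + 1), then clear the state bit. */
static unsigned long toy_seq_snap(unsigned long seq)
{
	return (seq + 2 * TOY_SEQ_STATE_MASK + 1) & ~TOY_SEQ_STATE_MASK;
}

/* Walk through the two cases named in the comment. */
static void toy_seq_selfcheck(void)
{
	/* gp 2, idle (0b100): snapshot is gp 3, i.e. gp+1. */
	assert(toy_seq_ctr(toy_seq_snap(4)) == 3);
	/* gp 2, in progress (0b101): snapshot is gp 4, i.e. gp+2. */
	assert(toy_seq_ctr(toy_seq_snap(5)) == 4);
}
```

The addition rounds an in-progress value up past the next boundary, while an
idle value only reaches the next one; the mask then drops the state bit.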



Re: [tip/core/rcu,16/21] rcu: Add funnel locking to rcu_start_this_gp()

2018-05-13 Thread Joel Fernandes
On Sun, May 13, 2018 at 07:22:06PM -0700, Paul E. McKenney wrote:
[..]
> > > > > > If you don't mind going through the if conditions in the funnel locking loop
> > > > > > with me, it would be quite helpful so that I don't mess the code up and would
> > > > > > also help me add tracing correctly.
> > > > > > 
> > > > > > The if condition for prestarted is this:
> > > > > > 
> > > > > >    if (need_future_gp_element(rnp_root, c) ||
> > > > > >        ULONG_CMP_GE(rnp_root->gpnum, c) ||
> > > > > >        (rnp != rnp_root &&
> > > > > >         rnp_root->gpnum != rnp_root->completed)) {
> > > > > >            trace_rcu_this_gp(rnp_root, rdp, c, TPS("Prestarted"));
> > > > > >            goto unlock_out;
> > > > > >    need_future_gp_element(rnp_root, c) = true;
> > > > > > 
> > > > > > As of 16/21, the heart of the loop is the above (excluding the locking bits)
> > > > > > 
> > > > > > In this what confuses me is the second and the third condition for
> > > > > > pre-started.
> > > > > > 
> > > > > > The second condition is:  ULONG_CMP_GE(rnp_root->gpnum, c).
> > > > > > AIUI the goal of this condition is to check whether the requested grace
> > > > > > period has already started. I believe then the above check is insufficient.
> > > > > > The reason I think its insufficient is I believe we should also check the
> > > > > > state of the grace period to augment this check.
> > > > > > IMO the condition should really be:
> > > > > > (ULONG_CMP_GT(rnp_root->gpnum, c) ||
> > > > > 
> > > > > The above asks whether the -next- grace period -after- the requested
> > > > > one had started.
> > > > > 
> > > > > >   (rnp_root->gpnum == c && rnp_root->gpnum != rnp_root->completed))
> > > > > 
> > > > > This asks that the requested grace period not have completed.
> > > > > 
> > > > > What about the case where the requested grace period has completed,
> > > > > but the one after has not yet started?  If you add that in, I bet you
> > > > > will have something that simplifies to my original.
> > > > > 
> > > > > > In a later patch you replaced this with rseq_done(&rnp_root->gp_seq, c) which
> > > > > > kind of accounts for the state, except that rseq_done uses ULONG_CMP_GE,
> > > > > > whereas to fix this, rseq_done IMO should be using ULONG_CMP_GT to be equivalent
> > > > > > to the above check. Do you agree?
> > > > > 
> > > > > I do not believe that I do.  The ULONG_CMP_GE() allows for the missing case
> > > > > where the requested grace period completed, but the following grace period
> > > > > has not yet started.
> > > > 
> > > > Ok thanks that clears it up. For some reason I was thinking if
> > > > rnp_root->gpnum == c, that could mean 'c' has not yet started, unless we
> > > > also checked the state. Obviously, now I realize gpnum == c can only mean 2
> > > > things:
> > > >  - c has started but not yet completed
> > > >  - c has completed
> > > > 
> > > > Both of these cases should cause a bail out so I agree now with your
> > > > condition ULONG_CMP_GE, thanks.
> > > > 
> > > > > 
> > > > > > The third condition for pre-started is:
> > > > > >    (rnp != rnp_root && rnp_root->gpnum != rnp_root->completed))
> > > > > > This as I followed from your commit message is if an intermediate node thinks
> > > > > > RCU is non-idle, then its not necessary to mark the tree and we can bail out
> > > > > > since the clean up will scan the whole tree anyway. That makes sense to me
> > > > > > but I think I will like to squash the diff in your previous email into this
> > > > > > condition as well to handle both conditions together.
> > > > > 
> > > > > Please keep in mind that it is necessary to actually record the request
> > > > > in the leaf case.  Or are you advocating use of ?: or similar to make this
> > > > > happen?
> > > > 
> > > > Yes, I realized yesterday you wanted to record it for the leaf that's why
> > > > you're doing things this way. I'll let you know if I find any other ways of
> > > > simplifying it once I look at your latest tree.
> > > > 
> > > > Btw, I checked your git tree and couldn't see the update that you mentioned
> > > > you queued above. Could you push those changes?
> > > 
> > > Good point, pushed now.  And the patch that I forgot to include in the
> > > last email is below.
> > 
> > Cool, thanks. Also one thing I wanted to discuss, I am a bit unclear about
> > the if (rcu_seq_done..) condition in the loop which decides if the GP
> > requested is pre-started.
> 
> Actually, rcu_seq_done() instead determines whether or not the GP has
> -completed-.
> 
> > Say c is 8 (0b1000) - i.e. gp requested is 2.
> > I drew some 

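The ULONG_CMP_GE versus "strictly greater" question in the quoted exchange can
be checked with a small userspace sketch. It assumes the usual
wraparound-tolerant definition of the comparison (a is "at or after" b when
a - b is at most ULONG_MAX/2); ulong_cmp_gt() and already_started() are
hypothetical helpers written for this illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wraparound-tolerant "a >= b": a is at most ULONG_MAX/2 ahead of b. */
static bool ulong_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* Hypothetical strict variant, standing in for the ULONG_CMP_GT idea. */
static bool ulong_cmp_gt(unsigned long a, unsigned long b)
{
	return a != b && ulong_cmp_ge(a, b);
}

/* Bail-out test for a request for grace period c at a node with this gpnum. */
static bool already_started(unsigned long gpnum, unsigned long c)
{
	return ulong_cmp_ge(gpnum, c);
}

static void cmp_selfcheck(void)
{
	/* gpnum == c: c has started (and possibly completed), so bail out. */
	assert(already_started(8, 8));
	/* The strict variant would miss exactly that case. */
	assert(!ulong_cmp_gt(8, 8));
	/* gpnum behind c: the request still has to be recorded. */
	assert(!already_started(7, 8));
	/* Counter wrapped: 0 still counts as "after" ULONG_MAX. */
	assert(already_started(0, ULONG_MAX));
}
```

The gpnum == c row is the one the thread settles on: both "started but not
completed" and "completed, next not yet started" land there, and both should
bail out.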
linux-next: build warning after merge of the staging tree

2018-05-13 Thread Stephen Rothwell
Hi Greg,

After merging the staging tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

WARNING: drivers/staging/vc04_services/bcm2835-camera/bcm2835-v4l2.o(.data+0x0): Section mismatch in reference from the variable bcm2835_camera_driver to the function .init.text:bcm2835_mmal_probe()
The variable bcm2835_camera_driver references
the function __init bcm2835_mmal_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console

WARNING: drivers/staging/vc04_services/bcm2835-camera/bcm2835-v4l2.o(.data+0x0): Section mismatch in reference from the variable bcm2835_camera_driver to the function .init.text:bcm2835_mmal_probe()
The variable bcm2835_camera_driver references
the function __init bcm2835_mmal_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console

Introduced by commit

  4bebb0312ea9 ("staging/bcm2835-camera: Set ourselves up as a platform driver.")

-- 
Cheers,
Stephen Rothwell




linux-next: build warnings after merge of the staging tree

2018-05-13 Thread Stephen Rothwell
Hi Greg,

After merging the staging tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/staging/most/video/video.c: In function 'vidioc_enum_fmt_vid_cap':
drivers/staging/most/video/video.c:265:25: warning: unused variable 'mdev' [-Wunused-variable]
  struct most_video_dev *mdev = fh->mdev;
                         ^~~~
drivers/staging/most/video/video.c: In function 'vidioc_g_fmt_vid_cap':
drivers/staging/most/video/video.c:282:25: warning: unused variable 'mdev' [-Wunused-variable]
  struct most_video_dev *mdev = fh->mdev;
                         ^~~~
drivers/staging/most/video/video.c: In function 'vidioc_g_std':
drivers/staging/most/video/video.c:309:25: warning: unused variable 'mdev' [-Wunused-variable]
  struct most_video_dev *mdev = fh->mdev;
                         ^~~~

Introduced by commit

  7d7cdb4fa552 ("staging: most: video: remove debugging code")

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v6 3/6] kernel/reboot.c: export pm_power_off_prepare

2018-05-13 Thread Oleksij Rempel


On 12.05.2018 13:13, Rafael J. Wysocki wrote:
> On Friday, May 4, 2018 8:50:52 PM CEST Oleksij Rempel wrote:
>> Hallo Andrew,
>> I need your ACK or NACK for this patch.
>>
>> This function is used to configure external PMIC to interpret
>> signal which will be triggered by pm_power_off as power off.
>> Since same signal can be used for stand by, I linked PMIC configuration
>> with pm_power_off_prepare to avoid possible conflicts.
>>
>> On Mon, Mar 05, 2018 at 11:25:20AM +0100, Oleksij Rempel wrote:
>>> Export pm_power_off_prepare. It is needed to implement power off on
>>> Freescale/NXP iMX6 based boards with external power management
>>> integrated circuit (PMIC).
>>>
>>> Signed-off-by: Oleksij Rempel 
>>> ---
>>>  kernel/reboot.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/kernel/reboot.c b/kernel/reboot.c
>>> index e4ced883d8de..350be6baa60d 100644
>>> --- a/kernel/reboot.c
>>> +++ b/kernel/reboot.c
>>> @@ -49,6 +49,7 @@ int reboot_force;
>>>   */
>>>  
>>>  void (*pm_power_off_prepare)(void);
>>> +EXPORT_SYMBOL(pm_power_off_prepare);
> 
> Why not EXPORT_SYMBOL_GPL() ?

No special reason. Fixed.
Any other comments?





Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Mon, May 14, 2018 at 05:04:15AM +0100, Al Viro wrote:
> diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> index b428d317ae92..92682fcc41f6 100644
> --- a/fs/sysfs/mount.c
> +++ b/fs/sysfs/mount.c
> @@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
>  {
>   struct dentry *root;
>   void *ns;
> - bool new_sb;
> + bool new_sb = false;
>  
>   if (!(flags & SB_KERNMOUNT)) {
>   if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
> @@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
>   ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
>   root = kernfs_mount_ns(fs_type, flags, sysfs_root,
>   SYSFS_MAGIC, &new_sb, ns);
> - if (IS_ERR(root) || !new_sb)
> + if (!new_sb)
>   kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
> - else if (new_sb)
> + else if (!IS_ERR(root))
>   root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
>  
>   return root;

What we want for that kobj_ns_drop() is "no fs instances created" (== no
->kill_sb(), be it now or later, to drop that kobj reference); for setting
->s_iflags - "new instance successfully set up".

That's it; all we need is new_sb that would be accurate on its own.
The problem is with kludging over the cases when it's left uninitialized
(early exits from kernfs_mount_ns()) with IS_ERR(root), which happens to
grab the cases when new_sb *was* set to true.  So the fix is to initialize
new_sb properly and get rid of that kludge.  Which turns the whole thing
into
	if (!new_sb)
		...
	if (!IS_ERR(root) && new_sb)
		...
i.e.
	if (!new_sb)
		...
	else if (!IS_ERR(root))
		...
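
To see where the old and new conditions part ways, the two decision chains can
be compared over all four input combinations. This is a userspace sketch, not
kernel code: is_err stands for IS_ERR(root), new_sb is assumed to be properly
initialized in both versions, and the enum and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum action { DROP_NS_REF, MARK_USERNS_VISIBLE, NOTHING };

/* Decision chain before the patch (is_err models IS_ERR(root)). */
static enum action old_logic(bool is_err, bool new_sb)
{
	if (is_err || !new_sb)
		return DROP_NS_REF;
	else if (new_sb)
		return MARK_USERNS_VISIBLE;
	return NOTHING;
}

/* Decision chain after the patch. */
static enum action new_logic(bool is_err, bool new_sb)
{
	if (!new_sb)
		return DROP_NS_REF;
	else if (!is_err)
		return MARK_USERNS_VISIBLE;
	return NOTHING;
}

/* The chains agree everywhere except "error, but a new sb was created". */
static void logic_selfcheck(void)
{
	assert(old_logic(false, false) == new_logic(false, false));
	assert(old_logic(true, false) == new_logic(true, false));
	assert(old_logic(false, true) == new_logic(false, true));
	assert(old_logic(true, true) == DROP_NS_REF);
	assert(new_logic(true, true) == NOTHING);
}
```

The single differing row is the case described above: a new instance exists,
so ->kill_sb() will drop the kobj reference and the caller must not drop it
a second time.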


[RFC 2/6] perf probe: Parse linerange for C++ functions

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

perf probe --funcs will demangle C++ symbols by default but these
functions can not be used for listing sourcecode. Modify the scanner
to start searching for a line number only after a single ':'.

./perf probe -x ./cxx-example -L \
"std::vector::at:1"

Signed-off-by: Holger Hans Peter Freyther 
---
 tools/perf/util/probe-event.c | 57 ++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index e1dbc98..39a2d47 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1237,6 +1237,59 @@ static bool is_c_func_name(const char *name)
return true;
 }
 
+/* Symbols in demangled CXX function names */
+static inline bool is_cxx_symbol(const char symbol)
+{
+   switch (symbol) {
+   case '_':
+   case ' ':
+   case '&':
+   case '*':
+   case '@':
+   case ',':
+   case ':':
+   case '<':
+   case '>':
+   case '(':
+   case ')':
+   return true;
+   default:
+   return false;
+   }
+}
+
+/* Is name a C++ name? */
+static bool is_cxx_func_name(const char *name)
+{
+   /* C name or a mangled name */
+   if (is_c_func_name(name))
+   return true;
+   while (*++name != '\0') {
+   if (!isalpha(*name) && !isdigit(*name) && !is_cxx_symbol(*name))
+   return false;
+   }
+   return true;
+}
+
+/*
+ * Find the first ':' that isn't part of a C++ namespace or class
+ * name.
+ */
+static char *first_non_cxx_ns(char *name)
+{
+   while (*name) {
+   char cur = *name, nxt = *(name + 1);
+
+   if (cur == ':' && nxt == ':')
+   name += 2;
+   else if (cur == ':')
+   return name;
+
+   name += 1;
+   }
+   return NULL;
+}
+
 /*
  * Stuff 'lr' according to the line range described by 'arg'.
  * The line range syntax is described by:
@@ -1255,7 +1308,7 @@ int parse_line_range_desc(const char *arg, struct line_range *lr)
lr->start = 0;
lr->end = INT_MAX;
 
-   range = strchr(name, ':');
+   range = first_non_cxx_ns(name);
if (range) {
*range++ = '\0';
 
@@ -1309,6 +1362,8 @@ int parse_line_range_desc(const char *arg, struct line_range *lr)
lr->file = name;
else if (is_c_func_name(name))/* We reuse it for checking funcname */
lr->function = name;
+   else if (is_cxx_func_name(name))
+   lr->function = name;
else {  /* Invalid name */
semantic_error("'%s' is not a valid function name.\n", name);
err = -EINVAL;
-- 
2.7.4
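
The scanning rule from the patch description (skip every "::", stop at the
first lone ':') can be exercised outside perf with a standalone sketch.
find_range_sep() is a hypothetical name, and the loop deliberately differs
from the posted one: it uses continue after skipping a "::" pair, whereas the
loop as posted appears to advance one further character after the skip, which
could step past the terminator for a name ending in "::".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer to the first ':' that is not part of a "::" pair,
 * or NULL if the string has no such separator.
 */
static const char *find_range_sep(const char *name)
{
	while (*name) {
		if (name[0] == ':' && name[1] == ':') {
			name += 2;	/* skip the scope operator as a unit */
			continue;
		}
		if (name[0] == ':')
			return name;
		name++;
	}
	return NULL;
}

static void sep_selfcheck(void)
{
	const char *cxx = "std::vector::at:1";
	const char *plain = "main:3";

	assert(find_range_sep(cxx) == cxx + 15);	/* the ':' before "1" */
	assert(find_range_sep(plain) == plain + 4);
	assert(find_range_sep("ns::func") == NULL);
	assert(find_range_sep("ns::") == NULL);	/* trailing "::" is safe */
}
```

Reading name[1] is safe here because it is only done when name[0] is not the
terminator, so the access never goes beyond the end of the string.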



[RFC 1/6] perf probe: Do not exclude mangled C++ funcs

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

Using --funcs --no-demangle on a C++ binary does not list any of the C++
functions. Change the default filter to not exclude the Common C++ ABI
symbols.

 $ ./perf probe -x ./cxx-example --funcs --no-demangle
 ...
 _ZN9__gnu_cxx13new_allocatorIiEC1Ev
 ...

Signed-off-by: Holger Hans Peter Freyther 
---
 tools/perf/builtin-probe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index c006592..d69f679 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -43,7 +43,7 @@
 #include "util/probe-file.h"
 
 #define DEFAULT_VAR_FILTER "!__k???tab_* & !__crc_*"
-#define DEFAULT_FUNC_FILTER "!_*"
+#define DEFAULT_FUNC_FILTER "!_* | _Z*"
 #define DEFAULT_LIST_FILTER "*"
 
 /* Session management structure */
-- 
2.7.4
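
The effect of the filter change can be approximated in userspace with POSIX
fnmatch(). This is a sketch under assumptions: perf uses its own glob matcher
rather than fnmatch(), and the new filter string is read here as
(!_*) | (_Z*); the function names are hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

/* Glob match via POSIX fnmatch(); a stand-in for perf's own matcher. */
static bool glob_match(const char *pattern, const char *name)
{
	return fnmatch(pattern, name, 0) == 0;
}

/* Old default: "!_*" rejects everything starting with an underscore. */
static bool old_func_filter(const char *name)
{
	return !glob_match("_*", name);
}

/* New default: "!_* | _Z*", read as (!_*) | (_Z*). */
static bool new_func_filter(const char *name)
{
	return !glob_match("_*", name) || glob_match("_Z*", name);
}

static void filter_selfcheck(void)
{
	const char *mangled = "_ZN9__gnu_cxx13new_allocatorIiEC1Ev";

	assert(!old_func_filter(mangled));	/* hidden before the change */
	assert(new_func_filter(mangled));	/* listed after the change */
	assert(new_func_filter("main"));	/* plain names still pass */
	assert(!new_func_filter("_start"));	/* other '_' names still hidden */
}
```

Only names with the _Z prefix of the common C++ ABI gain visibility; other
underscore-prefixed symbols stay filtered out as before.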



[RFC 3/6] perf probe: Make listing of C++ functions work

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

If die_match_name does not match, attempt to demangle the linkage name.
To use the generic demangling API we require to have a struct dso. Store
it inside the debuginfo and pass it to the relevant callbacks.

./perf probe -x ./foo -L \
"std::vector::at:2-3"
<...::at@/usr/include/c++/5/bits/stl_vector.h:2>
  2 _M_range_check(__n);
  3 return (*this)[__n];

Signed-off-by: Holger Hans Peter Freyther 
---
 tools/perf/util/probe-finder.c | 55 ++
 tools/perf/util/probe-finder.h |  3 +++
 2 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index c37fbef..c73dccc 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -96,7 +96,7 @@ static int debuginfo__init_offline_dwarf(struct debuginfo *dbg,
return -ENOENT;
 }
 
-static struct debuginfo *__debuginfo__new(const char *path)
+static struct debuginfo *__debuginfo__new(const char *path, struct dso *dso)
 {
struct debuginfo *dbg = zalloc(sizeof(*dbg));
if (!dbg)
@@ -104,8 +104,10 @@ static struct debuginfo *__debuginfo__new(const char *path)
 
if (debuginfo__init_offline_dwarf(dbg, path) < 0)
zfree(&dbg);
-   if (dbg)
+   if (dbg) {
pr_debug("Open Debuginfo file: %s\n", path);
+   dbg->dso = dso__get(dso);
+   }
return dbg;
 }
 
@@ -135,13 +137,15 @@ struct debuginfo *debuginfo__new(const char *path)
if (dso__read_binary_type_filename(dso, *type, &nil,
   buf, PATH_MAX) < 0)
continue;
-   dinfo = __debuginfo__new(buf);
+   dinfo = __debuginfo__new(buf, dso);
}
-   dso__put(dso);
 
 out:
/* if failed to open all distro debuginfo, open given binary */
-   return dinfo ? : __debuginfo__new(path);
+   if (!dinfo)
+   dinfo = __debuginfo__new(path, dso);
+   dso__put(dso);
+   return dinfo;
 }
 
 void debuginfo__delete(struct debuginfo *dbg)
@@ -149,6 +153,7 @@ void debuginfo__delete(struct debuginfo *dbg)
if (dbg) {
if (dbg->dwfl)
dwfl_end(dbg->dwfl);
+   dso__put(dbg->dso);
free(dbg);
}
 }
@@ -167,6 +172,32 @@ static struct probe_trace_arg_ref *alloc_trace_arg_ref(long offs)
 }
 
 /*
+ * Check if the demangled linkage_name matches the function. E.g. the
+ * linkage name of _ZNKSt6vectorIiSaIiEE4sizeEv matching the c++ function name
+ * of std::vector<int, std::allocator<int> >::size() const.
+ */
+static bool matches_demangled(struct debuginfo *dbg, Dwarf_Die *dw_die,
+ const char *function)
+{
+   const char *name;
+   char *demangled;
+   bool res;
+
+   name = die_get_linkage_name(dw_die);
+   if (!name)
+   return false;
+
+   demangled = dso__demangle_sym(dbg->dso, 0, name);
+   if (!demangled)
+   return false;
+
+   res = strglobmatch(demangled, function);
+   free(demangled);
+   return res;
+}
+
+
+/*
  * Convert a location into trace_arg.
  * If tvar == NULL, this just checks variable can be converted.
  * If fentry == true and vr_die is a parameter, do huristic search
@@ -975,6 +1006,7 @@ static int probe_point_inline_cb(Dwarf_Die *in_die, void *data)
 struct dwarf_callback_param {
void *data;
int retval;
+   struct debuginfo *dbg;
 };
 
 /* Search function from function name */
@@ -1721,7 +1753,8 @@ static int line_range_search_cb(Dwarf_Die *sp_die, void *data)
return DWARF_CB_OK;
 
if (die_is_func_def(sp_die) &&
-   die_match_name(sp_die, lr->function)) {
+   (die_match_name(sp_die, lr->function) ||
+matches_demangled(param->dbg, sp_die, lr->function))) {
lf->fname = dwarf_decl_file(sp_die);
dwarf_decl_line(sp_die, &lr->offset);
pr_debug("fname: %s, lineno:%d\n", lf->fname, lr->offset);
@@ -1744,9 +1777,11 @@ static int line_range_search_cb(Dwarf_Die *sp_die, void *data)
return DWARF_CB_OK;
 }
 
-static int find_line_range_by_func(struct line_finder *lf)
+static int find_line_range_by_func(struct debuginfo *dbg,
+  struct line_finder *lf)
 {
-   struct dwarf_callback_param param = {.data = (void *)lf, .retval = 0};
+   struct dwarf_callback_param param = {
+   .data = (void *)lf, .retval = 0, .dbg = dbg};
dwarf_getfuncs(&lf->cu_die, line_range_search_cb, &param, 0);
return param.retval;
 }
@@ -1766,7 +1801,7 @@ int debuginfo__find_line_range(struct debuginfo *dbg, struct line_range *lr)
.function = lr->function, .file = lr->file,
   

[RFC 0/6] perf probe: Attempt to improve C++ probing

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

Currently perf probe -x app --funcs will list and demangle C++ functions,
but the other probe actions can't work with them. When asked not to
demangle, probe does not list any of the application symbols, creating the
impression that there are no symbols at all.

Make --funcs --no-demangle list all C++ functions and modify the handling
for listing code, listing variables and adding a uprobe to work with the
demangled C++ function name.

I tried to keep this as minimal as possible but having to keep the dso in
the debuginfo and passing it everywhere to be able to demangle the linkage
name isn't pretty (and for C++ demangling the struct dso is not of much
use. Maybe having a static "empty" dso could avoid a lot of the changes).

Maybe the easiest first patch is to default to --no-demangle and change
the DEFAULT_FUNC_FILTER to not exclude mangled C++ symbols. The remaining
tooling would work then.
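
For illustration, the effect of the filter "!_* | _Z*" can be sketched in
plain C. This assumes that '!' binds tighter than '|', so the filter reads
as (not "_*") or "_Z*". func_filter_keeps() is a hypothetical name and
fnmatch(3) stands in for perf's own glob matcher.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for evaluating the filter "!_* | _Z*": keep a symbol
 * unless it starts with an underscore, but keep names carrying the "_Z"
 * prefix of the common C++ ABI mangling anyway.
 */
bool func_filter_keeps(const char *sym)
{
	assert(sym != NULL);

	bool not_underscore = fnmatch("_*", sym, 0) != 0;	/* "!_*" */
	bool mangled_cxx = fnmatch("_Z*", sym, 0) == 0;		/* "_Z*" */

	return not_underscore || mangled_cxx;
}
```

Under this reading "main" and "_ZN9__gnu_cxx13new_allocatorIiEC1Ev" are
kept while "_init" is still dropped. Any non-C++ symbol that happens to
start with "_Z" would be kept as well.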

This has seen very little testing outside the following commands.

My test set includes:

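 ./perf probe -x . -L "std::vector<int, std::allocator<int> >::at"
 ./perf probe -x . -L "std::vector<int, std::allocator<int> >::at:2-3"

 ./perf probe -x . -V "std::vector<int, std::allocator<int> >::at"
 ./perf probe -x . -V "std::vector<int, std::allocator<int> >::at:2"
 ./perf probe -x . -V "std::vector<int, std::allocator<int> >::size%return"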


Holger Hans Peter Freyther (6):
  perf probe: Do not exclude mangled C++ funcs
  perf probe: Parse linerange for C++ functions
  perf probe: Make listing of C++ functions work
  perf probe: Show variables for C++ functions
  perf probe: Make listing of variables work for C++ functions
  perf probe: Make it possible to add a C++ uprobe

 tools/perf/builtin-probe.c |   2 +-
 tools/perf/util/probe-event.c  |  77 -
 tools/perf/util/probe-finder.c | 152 ++---
 tools/perf/util/probe-finder.h |   3 +
 tools/perf/util/string.c   |  57 
 tools/perf/util/string2.h  |   1 +
 6 files changed, 247 insertions(+), 45 deletions(-)

-- 
2.7.4



[RFC 6/6] perf probe: Make it possible to add a C++ uprobe

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

If the linkage name looks like a common C++ ABI name use it instead of
the original function name. This makes adding a uprobe for a C++ symbol
possible.

./perf probe -x ./cxx-example  "std::vector<int, std::allocator<int> >::at"
Added new event:
  probe_foo:_ZNSt6vectorIiSaIiEE2atEm (on _ZN... in /cxx-example)

You can now use it in all perf tools, such as:

perf record -e probe_foo:_ZNSt6vectorIiSaIiEE2atEm -aR sleep 1

Signed-off-by: Holger Hans Peter Freyther 
---
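The selection rule can be sketched in isolation. pick_probe_symbol() is a
hypothetical helper, not a perf function; it only shows the "_Z" prefix check
and, as an assumption about what a more defensive version could do, reports
allocation failure to the caller instead of storing an unchecked copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not perf code: return a newly allocated copy of the
 * symbol to probe. A linkage name with the "_Z" prefix wins over the
 * user-supplied function name. The caller frees the result; NULL means the
 * allocation failed.
 */
char *pick_probe_symbol(const char *function, const char *linkage_name)
{
	const char *chosen = function;

	assert(function != NULL);

	if (linkage_name && strncmp(linkage_name, "_Z", 2) == 0)
		chosen = linkage_name;

	/* Copy by hand so the sketch needs only ISO C, not POSIX strdup(). */
	size_t len = strlen(chosen) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, chosen, len);
	return copy;
}
```

For the example above this returns "_ZNSt6vectorIiSaIiEE2atEm", while a plain
C function without a mangled linkage name keeps its original name.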
 tools/perf/util/probe-finder.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 4ba4b18..4cfa3de 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1317,6 +1317,7 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
struct perf_probe_point *pp = &pf->pev->point;
struct probe_trace_event *tev;
struct perf_probe_arg *args = NULL;
+   const char *linkage_name;
int ret, i;
 
/* Check number of tevs */
@@ -1333,6 +1334,16 @@ static int add_probe_trace_event(Dwarf_Die *sc_die, struct probe_finder *pf)
if (ret < 0)
goto end;
 
+   /*
+* Adding a C++ name like std::vector<int, std::allocator<int> >::at
+* will fail. Check if we want to use the linkage name instead.
+*/
+   linkage_name = die_get_linkage_name(&pf->sp_die);
+   if (linkage_name && strncmp(linkage_name, "_Z", 2) == 0) {
+   free(pp->function);
+   pp->function = strdup(linkage_name);
+   }
+
tev->point.realname = strdup(dwarf_diename(sc_die));
if (!tev->point.realname) {
ret = -ENOMEM;
-- 
2.7.4



[RFC 5/6] perf probe: Make listing of variables work for C++ functions

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

Update call sites with die_match_name to call matches_demangled as well.
This requires passing the struct debuginfo/struct dso to the callbacks
and modifies the closure/void *data parameter. For most functions this
changes the parameter from struct probe_finder to the generic struct
dwarf_callback_param.

$ ./perf probe -x ./foo -V "std::vector<int, std::allocator<int> >::at"
Available variables at std::vector<int, std::allocator<int> >::at
@
size_type   __n
vector<int, std::allocator<int> >*  this

Signed-off-by: Holger Hans Peter Freyther 
---
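The change to the void *data parameter follows a common C pattern: wrap the
callback's private state in a small struct that also carries the extra
context. The sketch below is a generic illustration with hypothetical names
(walk_param, walk_items, count_matching); only the shape of the struct mirrors
dwarf_callback_param.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the shape of dwarf_callback_param; all names are illustrative. */
struct walk_param {
	void *data;		/* the callback's original private state */
	int retval;		/* result reported back to the caller */
	const char *ctx;	/* extra context, standing in for struct debuginfo */
};

struct counter {
	int hits;
};

typedef int (*walk_cb)(const char *item, void *data);

/* Hypothetical walker with a fixed callback signature. */
int walk_items(const char *const *items, size_t n, walk_cb cb, void *data)
{
	for (size_t i = 0; i < n; i++) {
		int ret = cb(items[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* The callback unwraps the parameter to reach both state and context. */
int count_matching_cb(const char *item, void *data)
{
	struct walk_param *param = data;
	struct counter *c = param->data;

	assert(param->ctx != NULL);
	if (strcmp(item, param->ctx) == 0)
		c->hits++;
	return 0;
}

/* Caller side: build the wrapper on the stack and pass it as void *data. */
int count_matching(const char *const *items, size_t n, const char *wanted)
{
	struct counter c = { .hits = 0 };
	struct walk_param param = { .data = &c, .retval = 0, .ctx = wanted };

	param.retval = walk_items(items, n, count_matching_cb, &param);
	return param.retval ? param.retval : c.hits;
}
```

The walker's signature stays untouched, which is why only the callbacks and
their immediate callers need to change.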
 tools/perf/util/probe-finder.c | 88 +++---
 1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index c73dccc..4ba4b18 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -760,6 +760,7 @@ struct find_scope_param {
int line;
int diff;
Dwarf_Die *die_mem;
+   struct debuginfo *dbg;
bool found;
 };
 
@@ -777,7 +778,8 @@ static int find_best_scope_cb(Dwarf_Die *fn_die, void *data)
}
/* If the function name is given, that's what user expects */
if (fsp->function) {
-   if (die_match_name(fn_die, fsp->function)) {
+   if (die_match_name(fn_die, fsp->function) ||
+   matches_demangled(fsp->dbg, fn_die, fsp->function)) {
memcpy(fsp->die_mem, fn_die, sizeof(Dwarf_Die));
fsp->found = true;
return 1;
@@ -795,8 +797,16 @@ static int find_best_scope_cb(Dwarf_Die *fn_die, void *data)
return 0;
 }
 
+/* Callback parameter with return value for libdw */
+struct dwarf_callback_param {
+   void *data;
+   int retval;
+   struct debuginfo *dbg;
+};
+
 /* Find an appropriate scope fits to given conditions */
-static Dwarf_Die *find_best_scope(struct probe_finder *pf, Dwarf_Die *die_mem)
+static Dwarf_Die *find_best_scope(struct debuginfo *dbg,
+ struct probe_finder *pf, Dwarf_Die *die_mem)
 {
struct find_scope_param fsp = {
.function = pf->pev->point.function,
@@ -804,6 +814,7 @@ static Dwarf_Die *find_best_scope(struct probe_finder *pf, Dwarf_Die *die_mem)
.line = pf->lno,
.diff = INT_MAX,
.die_mem = die_mem,
+   .dbg = dbg,
.found = false,
};
 
@@ -815,7 +826,8 @@ static Dwarf_Die *find_best_scope(struct probe_finder *pf, Dwarf_Die *die_mem)
 static int probe_point_line_walker(const char *fname, int lineno,
   Dwarf_Addr addr, void *data)
 {
-   struct probe_finder *pf = data;
+   struct dwarf_callback_param *param = data;
+   struct probe_finder *pf = param->data;
Dwarf_Die *sc_die, die_mem;
int ret;
 
@@ -823,7 +835,7 @@ static int probe_point_line_walker(const char *fname, int lineno,
return 0;
 
pf->addr = addr;
-   sc_die = find_best_scope(pf, &die_mem);
+   sc_die = find_best_scope(param->dbg, pf, &die_mem);
if (!sc_die) {
pr_warning("Failed to find scope of probe point.\n");
return -ENOENT;
@@ -836,9 +848,12 @@ static int probe_point_line_walker(const char *fname, int lineno,
 }
 
 /* Find probe point from its line number */
-static int find_probe_point_by_line(struct probe_finder *pf)
+static int find_probe_point_by_line(struct debuginfo *dbg,
+   struct probe_finder *pf)
 {
-   return die_walk_lines(&pf->cu_die, probe_point_line_walker, pf);
+   struct dwarf_callback_param param = {
+   .data = (void *)pf, .dbg = dbg, .retval = 0};
+   return die_walk_lines(&pf->cu_die, probe_point_line_walker, &param);
 }
 
 /* Find lines which match lazy pattern */
@@ -884,7 +899,8 @@ static int find_lazy_match_lines(struct intlist *list,
 static int probe_point_lazy_walker(const char *fname, int lineno,
   Dwarf_Addr addr, void *data)
 {
-   struct probe_finder *pf = data;
+   struct dwarf_callback_param *param = data;
+   struct probe_finder *pf = param->data;
Dwarf_Die *sc_die, die_mem;
int ret;
 
@@ -896,7 +912,7 @@ static int probe_point_lazy_walker(const char *fname, int lineno,
 lineno, (unsigned long long)addr);
pf->addr = addr;
pf->lno = lineno;
-   sc_die = find_best_scope(pf, &die_mem);
+   sc_die = find_best_scope(param->dbg, pf, &die_mem);
if (!sc_die) {
pr_warning("Failed to find scope of probe point.\n");
return -ENOENT;
@@ -912,8 +928,10 @@ static int probe_point_lazy_walker(const char *fname, int lineno,
 }
 
 /* Find probe points from lazy pattern  */
-static int 

[RFC 4/6] perf probe: Show variables for C++ functions

2018-05-13 Thread Holger Freyther
From: Holger Hans Peter Freyther 

The demangled C++ function name contains spaces and the generic
argv_split would split the function name in the middle. Create a separate
version that tracks the nesting of opening '<' and closing '>' for
templated functions.

$ ./perf probe -x ./foo -V "std::vector<int, std::allocator<int> >::at"
Available variables at std::vector<int, std::allocator<int> >::at
@
size_type   __n
vector<int, std::allocator<int> >*  this

Signed-off-by: Holger Hans Peter Freyther 
---
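The template-aware scan can be tried on its own. The sketch below is a
simplified standalone illustration of the idea behind skip_arg_cxx(), with
hypothetical function names. One deliberate difference, stated as an
assumption about what is desirable: the nesting depth is never allowed to
drop below zero, so a stray '>' cannot make later whitespace look nested.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Simplified illustration of the idea behind skip_arg_cxx(): return a pointer
 * to the end of the first argument, treating whitespace inside '<' ... '>'
 * as part of the argument.
 */
const char *end_of_cxx_arg(const char *cp)
{
	int depth = 0;

	assert(cp != NULL);

	for (; *cp; cp++) {
		if (depth == 0 && isspace((unsigned char)*cp))
			break;
		if (*cp == '<')
			depth++;
		else if (*cp == '>' && depth > 0)
			depth--;
	}
	return cp;
}

/* Count whitespace-separated arguments using the template-aware scanner. */
int count_cxx_args(const char *str)
{
	int count = 0;

	assert(str != NULL);

	while (*str) {
		/* Skip separators between arguments. */
		while (isspace((unsigned char)*str))
			str++;
		if (!*str)
			break;
		str = end_of_cxx_arg(str);
		count++;
	}
	return count;
}
```

The input "std::vector<int, std::allocator<int> >::at:2 __n" is counted as
two arguments rather than four. Names such as operator< would still confuse
this kind of counting, since their angle brackets are not template brackets.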
 tools/perf/util/probe-event.c | 20 +--
 tools/perf/util/string.c  | 57 +++
 tools/perf/util/string2.h |  1 +
 3 files changed, 76 insertions(+), 2 deletions(-)
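
The handling of "::" when splitting off the @file, :line and %return parts
can be sketched the same way. find_func_separator() is a hypothetical,
simplified variant of split_func_name(): it uses plain strpbrk(3) instead of
perf's escape-aware strpbrk_esc() and works on a const string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified variant of split_func_name(): find the first
 * separator in ";:+@%" that is not the start of a "::" scope operator.
 * Returns NULL if the string contains no such separator.
 */
const char *find_func_separator(const char *arg)
{
	const char *ptr = arg;

	assert(arg != NULL);

	while ((ptr = strpbrk(ptr, ";:+@%")) != NULL) {
		if (ptr[0] == ':' && ptr[1] == ':') {
			ptr += 2;	/* step over the scope operator */
			continue;
		}
		return ptr;
	}
	return NULL;
}
```

For "std::vector<int, std::allocator<int> >::at:2-3" the returned pointer
refers to ":2-3", so the function name and the line range can be separated
at that position.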

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 39a2d47..97d6b6a 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1407,6 +1407,22 @@ static int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
return 0;
 }
 
+/* Split the function name from  @file, :line, %return but be C++ aware */
+static char *split_func_name(char *arg)
+{
+   char *ptr = arg;
+
+   while ((ptr = strpbrk_esc(ptr, ";:+@%"))) {
+   if (ptr[0] == ':' && ptr[1] == ':') {
+   ptr += 2;
+   continue;
+   }
+   return ptr;
+   }
+
+   return NULL;
+}
+
 /* Parse probepoint definition. */
 static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 {
@@ -1486,7 +1502,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
file_spec = true;
}
 
-   ptr = strpbrk_esc(arg, ";:+@%");
+   ptr = split_func_name(arg);
if (ptr) {
nc = *ptr;
*ptr++ = '\0';
@@ -1726,7 +1742,7 @@ int parse_perf_probe_command(const char *cmd, struct perf_probe_event *pev)
char **argv;
int argc, i, ret = 0;
 
-   argv = argv_split(cmd, &argc);
+   argv = argv_split_cxx(cmd, &argc);
if (!argv) {
pr_debug("Failed to split arguments.\n");
return -ENOMEM;
diff --git a/tools/perf/util/string.c b/tools/perf/util/string.c
index d8bfd0c..bb96fe2 100644
--- a/tools/perf/util/string.c
+++ b/tools/perf/util/string.c
@@ -80,6 +80,23 @@ static const char *skip_arg(const char *cp)
return cp;
 }
 
+static const char *skip_arg_cxx(const char *cp)
+{
+   int tmpl = 0;
+
+   while (*cp) {
+   if (tmpl == 0 && isspace(*cp))
+   break;
+   if (*cp == '<')
+   tmpl += 1;
+   if (*cp == '>')
+   tmpl -= 1;
+   cp++;
+   }
+
+   return cp;
+}
+
 static int count_argc(const char *str)
 {
int count = 0;
@@ -163,6 +180,46 @@ char **argv_split(const char *str, int *argcp)
return NULL;
 }
 
+char **argv_split_cxx(const char *str, int *argcp)
+{
+   int argc = count_argc(str);
+   char **argv = calloc(argc + 1, sizeof(*argv));
+   char **argvp;
+
+   if (argv == NULL)
+   goto out;
+
+   argvp = argv;
+
+   while (*str) {
+   str = skip_sep(str);
+
+   if (*str) {
+   const char *p = str;
+   char *t;
+
+   str = skip_arg_cxx(str);
+
+   t = strndup(p, str-p);
+   if (t == NULL)
+   goto fail;
+   *argvp++ = t;
+   }
+   }
+   if (argcp)
+   *argcp = argvp - argv;
+   *argvp = NULL;
+
+out:
+   return argv;
+
+fail:
+   if (argcp)
+   *argcp = 0;
+   argv_free(argv);
+   return NULL;
+}
+
 /* Character class matching */
 static bool __match_charclass(const char *pat, char c, const char **npat)
 {
diff --git a/tools/perf/util/string2.h b/tools/perf/util/string2.h
index 4c68a09..d32de6f 100644
--- a/tools/perf/util/string2.h
+++ b/tools/perf/util/string2.h
@@ -8,6 +8,7 @@
 
 s64 perf_atoll(const char *str);
 char **argv_split(const char *str, int *argcp);
+char **argv_split_cxx(const char *str, int *argcp);
 void argv_free(char **argv);
 bool strglobmatch(const char *str, const char *pat);
 bool strglobmatch_nocase(const char *str, const char *pat);
-- 
2.7.4



Re: WARNING: suspicious RCU usage in tipc_bearer_find

2018-05-13 Thread Eric Biggers
On Fri, Feb 09, 2018 at 12:00:01PM -0800, syzbot wrote:
> syzbot has found reproducer for the following crash on net-next commit
> 617aebe6a97efa539cc4b8a52adccd89596e6be0 (Sun Feb 4 00:25:42 2018 +)
> Merge tag 'usercopy-v4.16-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
> 
> So far this crash happened 13 times on net-next, upstream.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+b743957adcee51f5e...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed.
> 
> 
> audit: type=1400 audit(1518206230.395:8): avc:  denied  { create } for
> pid=4164 comm="syzkaller756462"
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tclass=netlink_generic_socket permissive=1
> =
> audit: type=1400 audit(1518206230.396:9): avc:  denied  { write } for
> pid=4164 comm="syzkaller756462"
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tclass=netlink_generic_socket permissive=1
> WARNING: suspicious RCU usage
> 4.15.0+ #221 Not tainted
> -
> net/tipc/bearer.c:177 suspicious rcu_dereference_protected() usage!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by syzkaller756462/4164:
>  #0:  (cb_lock){}, at: [<3bb01113>] genl_rcv+0x19/0x40
> net/netlink/genetlink.c:634
>  #1:  (genl_mutex){+.+.}, at: [<2e321e71>] genl_lock
> net/netlink/genetlink.c:33 [inline]
>  #1:  (genl_mutex){+.+.}, at: [<2e321e71>] genl_rcv_msg+0x115/0x140
> net/netlink/genetlink.c:622
> 
> stack backtrace:
> CPU: 0 PID: 4164 Comm: syzkaller756462 Not tainted 4.15.0+ #221
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
>  tipc_bearer_find+0x2b4/0x3b0 net/tipc/bearer.c:177
>  tipc_nl_compat_link_set+0x329/0x9f0 net/tipc/netlink_compat.c:729
>  __tipc_nl_compat_doit net/tipc/netlink_compat.c:288 [inline]
>  tipc_nl_compat_doit+0x15b/0x670 net/tipc/netlink_compat.c:335
>  tipc_nl_compat_handle net/tipc/netlink_compat.c:1119 [inline]
>  tipc_nl_compat_recv+0x1135/0x18f0 net/tipc/netlink_compat.c:1201
>  genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:599
>  genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
>  netlink_rcv_skb+0x14b/0x380 net/netlink/af_netlink.c:2442
>  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
>  netlink_unicast_kernel net/netlink/af_netlink.c:1308 [inline]
>  netlink_unicast+0x4c4/0x6b0 net/netlink/af_netlink.c:1334
>  netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1897
>  sock_sendmsg_nosec net/socket.c:630 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:640
>  ___sys_sendmsg+0x767/0x8b0 net/socket.c:2046
>  __sys_sendmsg+0xe5/0x210 net/socket.c:2080
>  SYSC_sendmsg net/socket.c:2091 [inline]
>  SyS_sendmsg+0x2d/0x50 net/socket.c:2087
>  entry_SYSCALL_64_fastpath+0x29/0xa0
> RIP: 0033:0x43fd69
> RSP: 002b:7fff09979378 EFLAGS: 0203 ORIG_RAX: 002e
> RAX: ffda RBX: 004002c8 RCX: 0043fd69
> RDX:  RSI: 20003000 RDI: 0003
> RBP: 006ca018 R08:  R09: 
> R10:  R11: 0203 R12: 00401690
> R13: 00401720 R14:  R15: 000
> 

This was fixed by commit ed4ffdfec26df:

#syz fix: tipc: Fix missing RTNL lock protection during setting link properties

- Eric


Re: [PATCH 6/6] vfs: change inode times to use struct timespec64

2018-05-13 Thread Deepa Dinamani
Al,

Are you ok with this approach to changing vfs timestamps?

Kees mentioned that he wants to merge a patch to pstore that changes
it to use timespec64 internally for 4.17:
https://lkml.org/lkml/2018/5/13/3

I'm not sure how we usually merge such flag day patches. Should this
be targeted for 4.17 or 4.18? The above might or might not be a
problem based on when this series is merged.

If you are ok with this approach, I could post a v2 with a couple of
requested fix-ups.

-Deepa

On Fri, May 11, 2018 at 11:44 PM, Kees Cook  wrote:
> On Fri, May 11, 2018 at 9:59 PM, Deepa Dinamani wrote:
>> diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
>> index 5fcb845b9fec..fb681d302bb3 100644
>> --- a/fs/pstore/inode.c
>> +++ b/fs/pstore/inode.c
>> @@ -392,7 +392,7 @@ int pstore_mkfile(struct dentry *root, struct pstore_record *record)
>>         inode->i_private = private;
>>
>>         if (record->time.tv_sec)
>> -               inode->i_mtime = inode->i_ctime = record->time;
>> +               inode->i_mtime = inode->i_ctime = timespec_to_timespec64(record->time);
>>
>>         d_add(dentry, inode);
>
> I'm fine to just convert pstore internally to timespec64 right now. Is
> it correct to say that I should use timespec64_to_timespec() here
> until this flag day patch? And I'd need to do this as well, yes?
>
> fs/pstore/platform.c: record->time =
> ns_to_timespec64(ktime_get_real_fast_ns());
>
> Thanks!
>
> -Kees
>
> --
> Kees Cook
> Pixel Security
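
For anyone following the conversion question, here is a minimal user-space sketch of splitting a nanosecond count into seconds plus nanoseconds. The names ts64_demo and ns_to_ts64_demo() are made up for illustration; they are not the kernel's struct timespec64 or ns_to_timespec64(). The only assumption is the usual convention that the nanosecond field stays in the range [0, 1000000000).

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for a 64-bit timestamp; not the kernel type. */
struct ts64_demo {
	int64_t tv_sec;
	long tv_nsec;
};

/* Split a signed nanosecond count into whole seconds and a remainder. */
static struct ts64_demo ns_to_ts64_demo(int64_t ns)
{
	struct ts64_demo ts;
	int64_t rem = ns % DEMO_NSEC_PER_SEC;

	ts.tv_sec = ns / DEMO_NSEC_PER_SEC;
	if (rem < 0) {
		/* C division truncates toward zero, so normalise the remainder. */
		ts.tv_sec--;
		rem += DEMO_NSEC_PER_SEC;
	}
	ts.tv_nsec = (long)rem;
	return ts;
}
```

The point of the 64-bit seconds field is that it does not overflow in 2038 the way a 32-bit time_t does, which is why the inode timestamps are being moved over.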


Re: general protection fault in kernfs_kill_sb (2)

2018-05-13 Thread Al Viro
On Mon, May 14, 2018 at 12:20:16PM +0900, Tetsuo Handa wrote:

> But there remains a refcount bug because deactivate_locked_super() from
> kernfs_mount_ns() triggers kobj_ns_drop() from sysfs_kill_sb() via
> sb->kill_sb() when kobj_ns_drop() is always called by sysfs_mount()
> if kernfs_mount_ns() returned an error.

Trivial:

unfuck sysfs_mount()

Signed-off-by: Al Viro 
---
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index b428d317ae92..92682fcc41f6 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -25,7 +25,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
 {
 	struct dentry *root;
 	void *ns;
-	bool new_sb;
+	bool new_sb = false;
 
 	if (!(flags & SB_KERNMOUNT)) {
 		if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
@@ -35,9 +35,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
 	ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
 	root = kernfs_mount_ns(fs_type, flags, sysfs_root,
 				SYSFS_MAGIC, &new_sb, ns);
-	if (IS_ERR(root) || !new_sb)
+	if (!new_sb)
 		kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
-	else if (new_sb)
+	else if (!IS_ERR(root))
 		root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
 
 	return root;
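
To make the double drop easier to see, here is a small user-space model of the reference accounting. It is only a sketch: struct demo_mount_result and the two demo_drops_*() helpers are invented for illustration, and the model assumes that a failed mount which had already created a new superblock drops the namespace reference exactly once through ->kill_sb().

```c
#include <stdbool.h>

/* Hypothetical summary of what the mount helper reported back. */
struct demo_mount_result {
	bool is_err;	/* the returned root is an error pointer */
	bool new_sb;	/* a new superblock was created */
};

/* Drops done on the ->kill_sb() path when a fresh superblock is torn down. */
static int demo_kill_sb_drops(struct demo_mount_result r)
{
	return (r.is_err && r.new_sb) ? 1 : 0;
}

/* Old caller logic: drop on any error or when no new superblock was made. */
static int demo_drops_old(struct demo_mount_result r)
{
	int drops = demo_kill_sb_drops(r);

	if (r.is_err || !r.new_sb)
		drops++;
	return drops;
}

/* Patched caller logic: drop only when no new superblock took the reference. */
static int demo_drops_new(struct demo_mount_result r)
{
	int drops = demo_kill_sb_drops(r);

	if (!r.new_sb)
		drops++;
	return drops;
}
```

With is_err and new_sb both set, the old logic counts two drops for a single grab, while the patched logic counts one. Initialising new_sb to false matters for the same reason: if the helper fails before it ever writes the flag, the caller still has to drop its reference.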


Re: [PATCH RFC 1/8] rcu: Add comment documenting how rcu_seq_snap works

2018-05-13 Thread Randy Dunlap
On 05/13/2018 08:15 PM, Joel Fernandes (Google) wrote:
> rcu_seq_snap may be tricky for someone looking at it for the first time.
> Lets document how it works with an example to make it easier.
> 
> Signed-off-by: Joel Fernandes (Google) 
> ---
>  kernel/rcu/rcu.h | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index 003671825d62..fc3170914ac7 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -91,7 +91,29 @@ static inline void rcu_seq_end(unsigned long *sp)
>   WRITE_ONCE(*sp, rcu_seq_endval(sp));
>  }
>  
> -/* Take a snapshot of the update side's sequence number. */
> +/*
> + * Take a snapshot of the update side's sequence number.
> + *
> + * This function predicts what the grace period number will be the next
> + * time an RCU callback will be executed, given the current grace period's
> + * number. This can be gp+1 if RCU is idle, or gp+2 if a grace period is
> + * already in progress.
> + *
> + * We do this with a single addition and masking.
> + * For example, if RCU_SEQ_STATE_MASK=1 and the least significant bit (LSB) of
> + * the seq is used to track if a GP is in progress or not, its sufficient if we

  it's

> + * add (2+1) and mask with ~1. Let's see why with an example:
> + *
> + * Say the current seq is 6 which is 0b110 (gp is 3 and state bit is 0).
> + * To get the next GP number, we have to at least add 0b10 to this (0x1 << 1)
> + * to account for the state bit. However, if the current seq is 7 (gp is 3 and
> + * state bit is 1), then it means the current grace period is already in
> + * progress so the next time the callback will run is at the end of grace
> + * period number gp+2. To account for the extra +1, we just overflow the LSB by
> + * adding another 0x1 and masking with ~0x1. In case no GP was in progress (RCU
> + * is idle), then the addition of the extra 0x1 and masking will have no
> + * effect. This is calculated as below.
> + */
>  static inline unsigned long rcu_seq_snap(unsigned long *sp)
>  {
>   unsigned long s;
> 
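
The arithmetic in the comment can be checked with a tiny user-space sketch. DEMO_SEQ_STATE_MASK and demo_seq_snap() are stand-ins named for this example, and the mask is fixed at 1 only because that is what the comment's example assumes; the real mask in the tree may be wider.

```c
/* Assumed for the example only: a single state bit in the LSB. */
#define DEMO_SEQ_STATE_MASK 1UL

/*
 * Add one full grace-period step (2 * mask) plus one extra, then clear the
 * state bits. The extra 1 carries into the counter only if a grace period
 * is already in progress (state bit set).
 */
static unsigned long demo_seq_snap(unsigned long seq)
{
	return (seq + 2 * DEMO_SEQ_STATE_MASK + 1) & ~DEMO_SEQ_STATE_MASK;
}

/* Extract the grace-period number from a sequence value. */
static unsigned long demo_seq_gp(unsigned long seq)
{
	return seq >> 1;
}
```

With seq = 6 (gp 3, idle) the snapshot is 8, which is gp 4, so gp+1. With seq = 7 (gp 3, in progress) the snapshot is 10, which is gp 5, so gp+2.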


-- 
~Randy


linux-next: manual merge of the rcu tree with Linus' tree

2018-05-13 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the rcu tree got a conflict in:

  drivers/nvme/host/core.c

between commit:

  12d9f07022dc ("nvme: fix use-after-free in nvme_free_ns_head")

from Linus' tree and commit:

  d9cf21bae6cf ("nvme: Avoid flush dependency in delete controller flow")

from the rcu tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/nvme/host/core.c
index 99b857e5a7a9,c3cea8a29843..
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@@ -351,8 -349,7 +351,8 @@@ static void nvme_free_ns_head(struct kr
  	nvme_mpath_remove_disk(head);
  	ida_simple_remove(&head->subsys->ns_ida, head->instance);
  	list_del_init(&head->entry);
- 	cleanup_srcu_struct(&head->srcu);
+ 	cleanup_srcu_struct_quiesced(&head->srcu);
 +	nvme_put_subsystem(head->subsys);
  	kfree(head);
  }
  


Re: [PATCH v3] ext4: handle errors on ext4_commit_super

2018-05-13 Thread Theodore Y. Ts'o
On Mon, Apr 23, 2018 at 08:46:26AM -0600, Jaegeuk Kim wrote:
> When remounting ext4 from ro to rw, currently it allows its transition,
> even if ext4_commit_super() returns EIO. Even worse thing is, after that,
> fs/buffer complains buffer dirty bits like:
> 
>  Call trace:
>  [] mark_buffer_dirty+0x184/0x1a4
>  [] __ext4_handle_dirty_super+0x4c/0xfc
>  [] ext4_file_open+0x154/0x1c0
>  [] do_dentry_open+0x114/0x2d0
>  [] vfs_open+0x5c/0x94
>  [] path_openat+0x668/0xfe8
>  [] do_filp_open+0x74/0x120
>  [] do_sys_open+0x148/0x254
>  [] SyS_openat+0x10/0x18
>  [] el0_svc_naked+0x24/0x28
>  EXT4-fs (dm-1): previous I/O error to superblock detected
>  Buffer I/O error on dev dm-1, logical block 0, lost sync page write
>  EXT4-fs (dm-1): re-mounted. Opts: (null)
>  Buffer I/O error on dev dm-1, logical block 80, lost async page write
> 
> Cc: "Theodore Ts'o" 
> Cc: Andreas Dilger 
> Cc: linux-e...@vger.kernel.org
> Cc: Jaegeuk Kim 
> Signed-off-by: Jaegeuk Kim 

Applied with a fix up to ext4_fill_super() when it calls
ext4_setup_super() with the changed error return semantics.

- Ted

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index aac33c155363..1388e56bb3f5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4254,8 +4254,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 			goto failed_mount4;
 	}
 
-	if (ext4_setup_super(sb, es, sb_rdonly(sb)))
+	ret = ext4_setup_super(sb, es, sb_rdonly(sb));
+	if (ret == -EROFS) {
 		sb->s_flags |= SB_RDONLY;
+		ret = 0;
+	} else if (ret)
+		goto failed_mount4a;
 
 	/* determine the minimum size of new large inodes, if present */
 	if (sbi->s_inode_size > EXT4_GOOD_OLD_INODE_SIZE &&
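
The changed return semantics boil down to a three-way split, sketched here in user space. demo_handle_setup() and its read_only flag are hypothetical names for this example; the sketch assumes, as the fix-up does, that -EROFS means "continue, but read-only" and any other non-zero value is a hard failure.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Map the setup result onto the mount path:
 *   0       -> carry on unchanged
 *   -EROFS  -> force read-only and carry on
 *   other   -> propagate the error so the caller can bail out
 */
static int demo_handle_setup(int setup_ret, bool *read_only)
{
	if (setup_ret == -EROFS) {
		*read_only = true;
		return 0;
	}
	return setup_ret;
}
```

Before the change, any non-zero return was treated as the read-only case, so a real I/O error would have been silently downgraded instead of failing the mount.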


Re: [PATCH] f2fs: Fix deadlock in shutdown ioctl

2018-05-13 Thread Chao Yu
On 2018/5/10 21:20, Sahitya Tummala wrote:
> f2fs_ioc_shutdown() ioctl gets stuck in the below path
> when going down with full sync (F2FS_GOING_DOWN_FULLSYNC)
> option.
> 
> __switch_to+0x90/0xc4
> percpu_down_write+0x8c/0xc0
> freeze_super+0xec/0x1e4
> freeze_bdev+0xc4/0xcc
> f2fs_ioctl+0xc0c/0x1ce0
> f2fs_compat_ioctl+0x98/0x1f0
> 
> Fix this by not holding write access during this ioctl.

I think we can just remove the lock coverage for the F2FS_GOING_DOWN_FULLSYNC
path; for the other paths, we need to keep it as it is.

Thanks,

> 
> Signed-off-by: Sahitya Tummala 
> ---
>  fs/f2fs/file.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index b926df7..2c2e61b 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -1835,10 +1835,6 @@ static int f2fs_ioc_shutdown(struct file *filp, 
> unsigned long arg)
>   if (get_user(in, (__u32 __user *)arg))
>   return -EFAULT;
>  
> - ret = mnt_want_write_file(filp);
> - if (ret)
> - return ret;
> -
>   switch (in) {
>   case F2FS_GOING_DOWN_FULLSYNC:
>   sb = freeze_bdev(sb->s_bdev);
> @@ -1878,7 +1874,6 @@ static int f2fs_ioc_shutdown(struct file *filp, 
> unsigned long arg)
>  
>   f2fs_update_time(sbi, REQ_TIME);
>  out:
> - mnt_drop_write_file(filp);
>   return ret;
>  }
>  
> 
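
One way to read the suggestion above, sketched in user space with stand-in names: take write access for every mode except full sync, since the backtrace suggests the freeze in that path is what ends up waiting on writers. The enum demo_shutdown_mode, struct demo_state and demo_shutdown() are all invented for this example and do not correspond to real f2fs symbols or values.

```c
#include <stdbool.h>

/* Hypothetical modes; the real ioctl constants are not reproduced here. */
enum demo_shutdown_mode {
	DEMO_GOING_DOWN_FULLSYNC,
	DEMO_GOING_DOWN_METASYNC,
	DEMO_GOING_DOWN_NOSYNC,
};

/* Counts how many write references the caller currently holds. */
struct demo_state {
	int write_refs;
};

/* Only the non-freezing paths keep write access around the work. */
static bool demo_needs_write_access(enum demo_shutdown_mode mode)
{
	return mode != DEMO_GOING_DOWN_FULLSYNC;
}

/* Returns the number of write references held while the work runs. */
static int demo_shutdown(struct demo_state *st, enum demo_shutdown_mode mode)
{
	bool held = demo_needs_write_access(mode);
	int held_during_work;

	if (held)
		st->write_refs++;

	held_during_work = st->write_refs;	/* the mode-specific work goes here */

	if (held)
		st->write_refs--;
	return held_during_work;
}
```

In this model the full sync path runs with zero write references held, so a freeze that waits for writers to drain would not be waiting on the caller itself, while the other modes keep their existing protection.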



Re: [PATCH v5 09/23] iommu/vt-d: add svm/sva invalidate function

2018-05-13 Thread Lu Baolu
Hi,

On 05/12/2018 04:54 AM, Jacob Pan wrote:
> When Shared Virtual Address (SVA) is enabled for a guest OS via
> vIOMMU, we need to provide invalidation support at IOMMU API and driver
> level. This patch adds Intel VT-d specific function to implement
> iommu passdown invalidate API for shared virtual address.
>
> The use case is for supporting caching structure invalidation
> of assigned SVM capable devices. Emulated IOMMU exposes queue
> invalidation capability and passes down all descriptors from the guest
> to the physical IOMMU.
>
> The assumption is that guest to host device ID mapping should be
> resolved prior to calling IOMMU driver. Based on the device handle,
> host IOMMU driver can replace certain fields before submit to the
> invalidation queue.
>
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/intel-iommu.c | 129 
> 
>  1 file changed, 129 insertions(+)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 732a10f..684bd98 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4973,6 +4973,134 @@ static void intel_iommu_detach_device(struct 
> iommu_domain *domain,
>   dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
>  }
>  
> +/*
> + * 2D array for converting and sanitizing IOMMU generic TLB granularity to
> + * VT-d granularity. Invalidation is typically included in the unmap 
> operation
> + * as a result of DMA or VFIO unmap. However, for assigned device where guest
> + * could own the first level page tables without being shadowed by QEMU. In
> + * this case there is no pass down unmap to the host IOMMU as a result of 
> unmap
> + * in the guest. Only invalidations are trapped and passed down.
> + * In all cases, only first level TLB invalidation (request with PASID) can 
> be
> + * passed down, therefore we do not include IOTLB granularity for request
> + * without PASID (second level).
> + *
> + * For an example, to find the VT-d granularity encoding for IOTLB
> + * type and page selective granularity within PASID:
> + * X: indexed by enum iommu_inv_type
> + * Y: indexed by enum iommu_inv_granularity
> + * [IOMMU_INV_TYPE_TLB][IOMMU_INV_GRANU_PAGE_PASID]
> + *
> + * Granu_map array indicates validity of the table. 1: valid, 0: invalid
> + *
> + */
> +const static int inv_type_granu_map[IOMMU_INV_NR_TYPE][IOMMU_INV_NR_GRANU] = 
> {
> + /* Extended dev TLBs */
> + {1, 1, 1},
> + /* Extended IOTLB */
> + {1, 1, 1},
> + /* PASID cache */
> + {1, 1, 0}
> +};
> +
> +const static u64 inv_type_granu_table[IOMMU_INV_NR_TYPE][IOMMU_INV_NR_GRANU] 
> = {
> + /* extended dev IOTLBs */
> + {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
> + /* Extended IOTLB */
> + {QI_GRAN_NONG_ALL, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
> + /* PASID cache */
> + {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
> +};
> +
> +static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
> +{
> + if (type >= IOMMU_INV_NR_TYPE || granu >= IOMMU_INV_NR_GRANU ||
> + !inv_type_granu_map[type][granu])

Alignment should match open parenthesis.

> + return -EINVAL;
> +
> + *vtd_granu = inv_type_granu_table[type][granu];
> +
> + return 0;
> +}
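
For illustration, this is roughly what the lookup looks like with the continuation lines aligned to the open parenthesis. It is a user-space sketch: the demo_* names and the numeric table entries are placeholders, not the real QI_* encodings, and the parameters are made unsigned here so that the bounds check also rejects negative input, which is an assumption of this example rather than something the patch does.

```c
#include <errno.h>
#include <stdint.h>

enum { DEMO_NR_TYPE = 3, DEMO_NR_GRANU = 3 };

/* 1: this (type, granularity) pair is valid, 0: it is not. */
static const int demo_valid[DEMO_NR_TYPE][DEMO_NR_GRANU] = {
	{1, 1, 1},
	{1, 1, 1},
	{1, 1, 0},
};

/* Placeholder encodings, indexed the same way as demo_valid. */
static const uint64_t demo_table[DEMO_NR_TYPE][DEMO_NR_GRANU] = {
	{10, 11, 0},
	{20, 21, 22},
	{30, 31, 0},
};

/* Bounds-check both indexes, then consult the validity map before the table. */
static int demo_to_granularity(unsigned int type, unsigned int granu,
			       uint64_t *out)
{
	if (type >= DEMO_NR_TYPE || granu >= DEMO_NR_GRANU ||
	    !demo_valid[type][granu])
		return -EINVAL;

	*out = demo_table[type][granu];

	return 0;
}
```

Keeping the validity map separate from the encoding table lets 0 remain a legal encoding, at the cost of having to keep the two arrays in sync.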
> +
> +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info)

Ditto.

> +{
> + struct intel_iommu *iommu;
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + u16 did, sid;
> + u8 bus, devfn;
> + int ret = 0;
> + u64 granu;
> + unsigned long flags;
> +

I prefer to keep this in order.

struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct device_domain_info *info;
struct intel_iommu *iommu;
unsigned long flags;
u8 bus, devfn;
u16 did, sid;
int ret = 0;
u64 granu;

> + if (!inv_info || !dmar_domain ||
> + inv_info->hdr.type != TLB_INV_HDR_VERSION_1)

Ditto.

> + return -EINVAL;
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> +
> + if (!dev || !dev_is_pci(dev))
> + return -ENODEV;
> +
> + did = dmar_domain->iommu_did[iommu->seq_id];
> + sid = PCI_DEVID(bus, devfn);
> + ret = to_vtd_granularity(inv_info->hdr.type, inv_info->granularity,
> + &granu);
> + if (ret) {
> + pr_err("Invalid range type %d, granu %d\n", inv_info->hdr.type,
> + inv_info->granularity);
> + return ret;
> + }
> +
> + spin_lock(&iommu->lock);
> + spin_lock_irqsave(&device_domain_lock, flags);
> +
> + switch (inv_info->hdr.type) {
> + case 

Re: [PATCH v5 09/23] iommu/vt-d: add svm/sva invalidate function

2018-05-13 Thread Lu Baolu
Hi,

On 05/12/2018 04:54 AM, Jacob Pan wrote:
> When Shared Virtual Address (SVA) is enabled for a guest OS via
> vIOMMU, we need to provide invalidation support at IOMMU API and driver
> level. This patch adds Intel VT-d specific function to implement
> iommu passdown invalidate API for shared virtual address.
>
> The use case is for supporting caching structure invalidation
> of assigned SVM capable devices. Emulated IOMMU exposes queue
> invalidation capability and passes down all descriptors from the guest
> to the physical IOMMU.
>
> The assumption is that guest to host device ID mapping should be
> resolved prior to calling IOMMU driver. Based on the device handle,
> host IOMMU driver can replace certain fields before submit to the
> invalidation queue.
>
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/intel-iommu.c | 129 
> 
>  1 file changed, 129 insertions(+)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 732a10f..684bd98 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4973,6 +4973,134 @@ static void intel_iommu_detach_device(struct 
> iommu_domain *domain,
>   dmar_remove_one_dev_info(to_dmar_domain(domain), dev);
>  }
>  
> +/*
> + * 2D array for converting and sanitizing IOMMU generic TLB granularity to
> + * VT-d granularity. Invalidation is typically included in the unmap 
> operation
> + * as a result of DMA or VFIO unmap. However, for assigned device where guest
> + * could own the first level page tables without being shadowed by QEMU. In
> + * this case there is no pass down unmap to the host IOMMU as a result of 
> unmap
> + * in the guest. Only invalidations are trapped and passed down.
> + * In all cases, only first level TLB invalidation (request with PASID) can 
> be
> + * passed down, therefore we do not include IOTLB granularity for request
> + * without PASID (second level).
> + *
> + * For an example, to find the VT-d granularity encoding for IOTLB
> + * type and page selective granularity within PASID:
> + * X: indexed by enum iommu_inv_type
> + * Y: indexed by enum iommu_inv_granularity
> + * [IOMMU_INV_TYPE_TLB][IOMMU_INV_GRANU_PAGE_PASID]
> + *
> + * Granu_map array indicates validity of the table. 1: valid, 0: invalid
> + *
> + */
> +const static int inv_type_granu_map[IOMMU_INV_NR_TYPE][IOMMU_INV_NR_GRANU] = {
> + /* Extended dev TLBs */
> + {1, 1, 1},
> + /* Extended IOTLB */
> + {1, 1, 1},
> + /* PASID cache */
> + {1, 1, 0}
> +};
> +
> +const static u64 inv_type_granu_table[IOMMU_INV_NR_TYPE][IOMMU_INV_NR_GRANU] = {
> + /* extended dev IOTLBs */
> + {QI_DEV_IOTLB_GRAN_ALL, QI_DEV_IOTLB_GRAN_PASID_SEL, 0},
> + /* Extended IOTLB */
> + {QI_GRAN_NONG_ALL, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID},
> + /* PASID cache */
> + {QI_PC_ALL_PASIDS, QI_PC_PASID_SEL, 0},
> +};
> +
> +static inline int to_vtd_granularity(int type, int granu, u64 *vtd_granu)
> +{
> + if (type >= IOMMU_INV_NR_TYPE || granu >= IOMMU_INV_NR_GRANU ||
> + !inv_type_granu_map[type][granu])

Alignment should match open parenthesis.
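
As a rough sketch of the aligned form, here is a standalone version of the
lookup. The array dimensions and the numeric encodings are placeholders made
up for illustration (they stand in for IOMMU_INV_NR_TYPE, IOMMU_INV_NR_GRANU
and the QI_* constants, whose real values are not reproduced here); only the
shape of the check and the continuation-line alignment matter.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder dimensions; the patch uses IOMMU_INV_NR_TYPE/IOMMU_INV_NR_GRANU. */
enum { NR_TYPE = 3, NR_GRANU = 3 };

/* 1: valid combination, 0: invalid (same layout as inv_type_granu_map). */
static const int granu_map[NR_TYPE][NR_GRANU] = {
	{1, 1, 1},	/* extended dev TLB */
	{1, 1, 1},	/* extended IOTLB */
	{1, 1, 0},	/* PASID cache */
};

/* Placeholder encodings standing in for the QI_* constants (assumed values). */
static const uint64_t granu_table[NR_TYPE][NR_GRANU] = {
	{10, 11, 0},
	{20, 21, 22},
	{30, 31, 0},
};

static int to_vtd_granularity(int type, int granu, uint64_t *vtd_granu)
{
	/* The continuation line starts under the first character after "if (". */
	if (type < 0 || type >= NR_TYPE || granu < 0 || granu >= NR_GRANU ||
	    !granu_map[type][granu])
		return -EINVAL;

	*vtd_granu = granu_table[type][granu];

	return 0;
}
```

The negative-index checks are an addition in this sketch, not something the
patch has; they are there because both arguments are plain int.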

> + return -EINVAL;
> +
> + *vtd_granu = inv_type_granu_table[type][granu];
> +
> + return 0;
> +}
> +
> +static int intel_iommu_sva_invalidate(struct iommu_domain *domain,
> + struct device *dev, struct tlb_invalidate_info *inv_info)

Ditto.

> +{
> + struct intel_iommu *iommu;
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct device_domain_info *info;
> + u16 did, sid;
> + u8 bus, devfn;
> + int ret = 0;
> + u64 granu;
> + unsigned long flags;
> +

I prefer to keep this in order.

struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct device_domain_info *info;
struct intel_iommu *iommu;
unsigned long flags;
u8 bus, devfn;
u16 did, sid;
int ret = 0;
u64 granu;

> + if (!inv_info || !dmar_domain ||
> + inv_info->hdr.type != TLB_INV_HDR_VERSION_1)

Ditto.

> + return -EINVAL;
> +
> + iommu = device_to_iommu(dev, &bus, &devfn);
> + if (!iommu)
> + return -ENODEV;
> +
> + if (!dev || !dev_is_pci(dev))
> + return -ENODEV;
> +
> + did = dmar_domain->iommu_did[iommu->seq_id];
> + sid = PCI_DEVID(bus, devfn);
> + ret = to_vtd_granularity(inv_info->hdr.type, inv_info->granularity,
> + &granu);
> + if (ret) {
> + pr_err("Invalid range type %d, granu %d\n", inv_info->hdr.type,
> + inv_info->granularity);
> + return ret;
> + }
> +
> + spin_lock(&iommu->lock);
> + spin_lock_irqsave(&device_domain_lock, flags);
> +
> + switch (inv_info->hdr.type) {
> + case IOMMU_INV_TYPE_TLB:
> + if (inv_info->size &&
> + 

[PATCH v2] Revert "alx: remove WoL support"

2018-05-13 Thread AceLan Kao
This reverts commit bc2bebe8de8ed4ba6482c9cc370b0dd72ffe8cd2.

WoL support is required to pass Energy Star 6.1 and above, because power
consumption is measured during S3 with WoL enabled.

Revert "alx: remove WoL support"; we will try to fix the unintentional
wake-up issue that occurs when WoL is enabled.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=61651

Signed-off-by: AceLan Kao 
---
 drivers/net/ethernet/atheros/alx/ethtool.c |  36 +
 drivers/net/ethernet/atheros/alx/hw.c  | 154 -
 drivers/net/ethernet/atheros/alx/hw.h  |   5 +
 drivers/net/ethernet/atheros/alx/main.c| 142 +--
 4 files changed, 326 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/atheros/alx/ethtool.c 
b/drivers/net/ethernet/atheros/alx/ethtool.c
index 2f4eabf652e8..859e27236ce4 100644
--- a/drivers/net/ethernet/atheros/alx/ethtool.c
+++ b/drivers/net/ethernet/atheros/alx/ethtool.c
@@ -310,11 +310,47 @@ static int alx_get_sset_count(struct net_device *netdev, 
int sset)
}
 }
 
+static void alx_get_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
+{
+   struct alx_priv *alx = netdev_priv(netdev);
+   struct alx_hw *hw = &alx->hw;
+
+   wol->supported = WAKE_MAGIC | WAKE_PHY;
+   wol->wolopts = 0;
+
+   if (hw->sleep_ctrl & ALX_SLEEP_WOL_MAGIC)
+   wol->wolopts |= WAKE_MAGIC;
+   if (hw->sleep_ctrl & ALX_SLEEP_WOL_PHY)
+   wol->wolopts |= WAKE_PHY;
+}
+
+static int alx_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
+{
+   struct alx_priv *alx = netdev_priv(netdev);
+   struct alx_hw *hw = &alx->hw;
+
+   if (wol->wolopts & ~(WAKE_MAGIC | WAKE_PHY))
+   return -EOPNOTSUPP;
+
+   hw->sleep_ctrl = 0;
+
+   if (wol->wolopts & WAKE_MAGIC)
+   hw->sleep_ctrl |= ALX_SLEEP_WOL_MAGIC;
+   if (wol->wolopts & WAKE_PHY)
+   hw->sleep_ctrl |= ALX_SLEEP_WOL_PHY;
+
+   device_set_wakeup_enable(&alx->hw.pdev->dev, hw->sleep_ctrl);
+
+   return 0;
+}
+
 const struct ethtool_ops alx_ethtool_ops = {
.get_pauseparam = alx_get_pauseparam,
.set_pauseparam = alx_set_pauseparam,
.get_msglevel   = alx_get_msglevel,
.set_msglevel   = alx_set_msglevel,
+   .get_wol= alx_get_wol,
+   .set_wol= alx_set_wol,
.get_link   = ethtool_op_get_link,
.get_strings= alx_get_strings,
.get_sset_count = alx_get_sset_count,
diff --git a/drivers/net/ethernet/atheros/alx/hw.c 
b/drivers/net/ethernet/atheros/alx/hw.c
index 6ac40b0003a3..f9bf612550ab 100644
--- a/drivers/net/ethernet/atheros/alx/hw.c
+++ b/drivers/net/ethernet/atheros/alx/hw.c
@@ -332,6 +332,16 @@ void alx_set_macaddr(struct alx_hw *hw, const u8 *addr)
alx_write_mem32(hw, ALX_STAD1, val);
 }
 
+static void alx_enable_osc(struct alx_hw *hw)
+{
+   u32 val;
+
+   /* rising edge */
+   val = alx_read_mem32(hw, ALX_MISC);
+   alx_write_mem32(hw, ALX_MISC, val & ~ALX_MISC_INTNLOSC_OPEN);
+   alx_write_mem32(hw, ALX_MISC, val | ALX_MISC_INTNLOSC_OPEN);
+}
+
 static void alx_reset_osc(struct alx_hw *hw, u8 rev)
 {
u32 val, val2;
@@ -774,7 +784,6 @@ int alx_setup_speed_duplex(struct alx_hw *hw, u32 ethadv, 
u8 flowctrl)
return err;
 }
 
-
 void alx_post_phy_link(struct alx_hw *hw)
 {
u16 phy_val, len, agc;
@@ -848,6 +857,65 @@ void alx_post_phy_link(struct alx_hw *hw)
}
 }
 
+/* NOTE:
+ *1. phy link must be established before calling this function
+ *2. wol option (pattern,magic,link,etc.) is configed before call it.
+ */
+int alx_pre_suspend(struct alx_hw *hw, int speed, u8 duplex)
+{
+   u32 master, mac, phy, val;
+   int err = 0;
+
+   master = alx_read_mem32(hw, ALX_MASTER);
+   master &= ~ALX_MASTER_PCLKSEL_SRDS;
+   mac = hw->rx_ctrl;
+   /* 10/100 half */
+   ALX_SET_FIELD(mac, ALX_MAC_CTRL_SPEED,  ALX_MAC_CTRL_SPEED_10_100);
+   mac &= ~(ALX_MAC_CTRL_FULLD | ALX_MAC_CTRL_RX_EN | ALX_MAC_CTRL_TX_EN);
+
+   phy = alx_read_mem32(hw, ALX_PHY_CTRL);
+   phy &= ~(ALX_PHY_CTRL_DSPRST_OUT | ALX_PHY_CTRL_CLS);
+   phy |= ALX_PHY_CTRL_RST_ANALOG | ALX_PHY_CTRL_HIB_PULSE |
+  ALX_PHY_CTRL_HIB_EN;
+
+   /* without any activity  */
+   if (!(hw->sleep_ctrl & ALX_SLEEP_ACTIVE)) {
+   err = alx_write_phy_reg(hw, ALX_MII_IER, 0);
+   if (err)
+   return err;
+   phy |= ALX_PHY_CTRL_IDDQ | ALX_PHY_CTRL_POWER_DOWN;
+   } else {
+   if (hw->sleep_ctrl & (ALX_SLEEP_WOL_MAGIC | ALX_SLEEP_CIFS))
+   mac |= ALX_MAC_CTRL_RX_EN | ALX_MAC_CTRL_BRD_EN;
+   if (hw->sleep_ctrl & ALX_SLEEP_CIFS)
+   mac |= ALX_MAC_CTRL_TX_EN;
+   if (duplex == DUPLEX_FULL)
+   mac |= ALX_MAC_CTRL_FULLD;
+   if (speed == SPEED_1000)
+   
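
A minimal standalone sketch of the option mapping that alx_get_wol() and
alx_set_wol() perform in the patch above. The WAKE_* values follow the ethtool
UAPI header; the SLEEP_WOL_* bit positions are stand-ins for the driver's
ALX_SLEEP_WOL_* flags and are assumptions, as are the two helper names.

```c
#include <errno.h>
#include <stdint.h>

/* ethtool wake-up option flags (as in include/uapi/linux/ethtool.h). */
#define WAKE_PHY	(1u << 0)
#define WAKE_MAGIC	(1u << 5)

/* Stand-ins for ALX_SLEEP_WOL_PHY / ALX_SLEEP_WOL_MAGIC (assumed bit positions). */
#define SLEEP_WOL_PHY	(1u << 0)
#define SLEEP_WOL_MAGIC	(1u << 1)

/* Hypothetical helper: translate ethtool wolopts into a sleep_ctrl mask. */
static int wolopts_to_sleep_ctrl(uint32_t wolopts, uint32_t *sleep_ctrl)
{
	uint32_t ctrl = 0;

	/* Reject any wake option other than magic packet and PHY link change. */
	if (wolopts & ~(WAKE_MAGIC | WAKE_PHY))
		return -EOPNOTSUPP;

	if (wolopts & WAKE_MAGIC)
		ctrl |= SLEEP_WOL_MAGIC;
	if (wolopts & WAKE_PHY)
		ctrl |= SLEEP_WOL_PHY;

	*sleep_ctrl = ctrl;
	return 0;
}

/* Hypothetical helper: the reverse translation, as reported by get_wol. */
static uint32_t sleep_ctrl_to_wolopts(uint32_t sleep_ctrl)
{
	uint32_t wolopts = 0;

	if (sleep_ctrl & SLEEP_WOL_MAGIC)
		wolopts |= WAKE_MAGIC;
	if (sleep_ctrl & SLEEP_WOL_PHY)
		wolopts |= WAKE_PHY;

	return wolopts;
}
```

Unsupported options leave the output mask untouched, which mirrors the patch
returning -EOPNOTSUPP before it clears hw->sleep_ctrl.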

[PATCH v2 13/13] soc: rockchip: power-domain: add power domain support for px30

2018-05-13 Thread Elaine Zhang
From: Finley Xiao 

This driver is modified to support PX30 SoC.

Signed-off-by: Finley Xiao 
Signed-off-by: Elaine Zhang 
---
 drivers/soc/rockchip/pm_domains.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/soc/rockchip/pm_domains.c 
b/drivers/soc/rockchip/pm_domains.c
index 90dcd5e21ae6..d0c5615132e3 100644
--- a/drivers/soc/rockchip/pm_domains.c
+++ b/drivers/soc/rockchip/pm_domains.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -114,6 +115,9 @@ struct rockchip_pmu {
.active_wakeup = wakeup,\
 }
 
+#define DOMAIN_PX30(pwr, status, req, wakeup)  \
+   DOMAIN_M(pwr, status, req, (req) + 16, req, wakeup)
+
 #define DOMAIN_RK3288(pwr, status, req, wakeup)\
DOMAIN(pwr, status, req, req, (req) + 16, wakeup)
 
@@ -712,6 +716,17 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
return error;
 }
 
+static const struct rockchip_domain_info px30_pm_domains[] = {
+   [PX30_PD_USB]   = DOMAIN_PX30(5, 5, 10, false),
+   [PX30_PD_SDCARD]= DOMAIN_PX30(8, 8, 9, false),
+   [PX30_PD_GMAC]  = DOMAIN_PX30(10, 10, 6, false),
+   [PX30_PD_MMC_NAND]  = DOMAIN_PX30(11, 11, 5, false),
+   [PX30_PD_VPU]   = DOMAIN_PX30(12, 12, 14, false),
+   [PX30_PD_VO]= DOMAIN_PX30(13, 13, 7, false),
+   [PX30_PD_VI]= DOMAIN_PX30(14, 14, 8, false),
+   [PX30_PD_GPU]   = DOMAIN_PX30(15, 15, 2, false),
+};
+
 static const struct rockchip_domain_info rk3036_pm_domains[] = {
[RK3036_PD_MSCH]= DOMAIN_RK3036(14, 23, 30, true),
[RK3036_PD_CORE]= DOMAIN_RK3036(13, 17, 24, false),
@@ -811,6 +826,17 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
[RK3399_PD_SDIOAUDIO]   = DOMAIN_RK3399(31, 31, 29, true),
 };
 
+static const struct rockchip_pmu_info px30_pmu = {
+   .pwr_offset = 0x18,
+   .status_offset = 0x20,
+   .req_offset = 0x64,
+   .idle_offset = 0x6c,
+   .ack_offset = 0x6c,
+
+   .num_domains = ARRAY_SIZE(px30_pm_domains),
+   .domain_info = px30_pm_domains,
+};
+
 static const struct rockchip_pmu_info rk3036_pmu = {
.req_offset = 0x148,
.idle_offset = 0x14c,
@@ -915,6 +941,10 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
 
 static const struct of_device_id rockchip_pm_domain_dt_match[] = {
{
+   .compatible = "rockchip,px30-power-controller",
+   .data = (void *)&px30_pmu,
+   },
+   {
.compatible = "rockchip,rk3036-power-controller",
.data = (void *)&rk3036_pmu,
},
-- 
1.9.1




[PATCH v2 11/13] dt-bindings: power: add PX30 SoCs header for power-domain

2018-05-13 Thread Elaine Zhang
From: Finley Xiao 

According to a description from TRM, add all the power domains.

Signed-off-by: Finley Xiao 
Signed-off-by: Elaine Zhang 
---
 include/dt-bindings/power/px30-power.h | 32 
 1 file changed, 32 insertions(+)
 create mode 100644 include/dt-bindings/power/px30-power.h

diff --git a/include/dt-bindings/power/px30-power.h 
b/include/dt-bindings/power/px30-power.h
new file mode 100644
index ..4ed482e80950
--- /dev/null
+++ b/include/dt-bindings/power/px30-power.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (c) 2017 Fuzhou Rockchip Electronics Co., Ltd
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+#ifndef __DT_BINDINGS_POWER_PX30_POWER_H__
+#define __DT_BINDINGS_POWER_PX30_POWER_H__
+
+/* VD_CORE */
+#define PX30_PD_A35_0  0
+#define PX30_PD_A35_1  1
+#define PX30_PD_A35_2  2
+#define PX30_PD_A35_3  3
+#define PX30_PD_SCU4
+
+/* VD_LOGIC */
+#define PX30_PD_USB5
+#define PX30_PD_DDR6
+#define PX30_PD_SDCARD 7
+#define PX30_PD_CRYPTO 8
+#define PX30_PD_GMAC   9
+#define PX30_PD_MMC_NAND   10
+#define PX30_PD_VPU11
+#define PX30_PD_VO 12
+#define PX30_PD_VI 13
+#define PX30_PD_GPU14
+
+/* VD_PMU */
+#define PX30_PD_PMU15
+
+#endif
-- 
1.9.1




[PATCH v2 12/13] dt-bindings: add binding for px30 power domains

2018-05-13 Thread Elaine Zhang
From: Finley Xiao 

Add binding documentation for the power domains
found on Rockchip PX30 SoCs.

Signed-off-by: Finley Xiao 
Signed-off-by: Elaine Zhang 
---
 Documentation/devicetree/bindings/soc/rockchip/power_domain.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt 
b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
index affe36dcfa17..5d49d0a2ff29 100644
--- a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
+++ b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
@@ -5,6 +5,7 @@ powered up/down by software based on different application 
scenes to save power.
 
 Required properties for power domain controller:
 - compatible: Should be one of the following.
+   "rockchip,px30-power-controller" - for PX30 SoCs.
"rockchip,rk3036-power-controller" - for RK3036 SoCs.
"rockchip,rk3128-power-controller" - for RK3128 SoCs.
"rockchip,rk3228-power-controller" - for RK3228 SoCs.
@@ -20,6 +21,7 @@ Required properties for power domain controller:
 
 Required properties for power domain sub nodes:
 - reg: index of the power domain, should use macros in:
+   "include/dt-bindings/power/px30-power.h" - for PX30 type power domain.
"include/dt-bindings/power/rk3036-power.h" - for RK3036 type power 
domain.
"include/dt-bindings/power/rk3128-power.h" - for RK3128 type power 
domain.
"include/dt-bindings/power/rk3228-power.h" - for RK3228 type power 
domain.
@@ -99,6 +101,7 @@ Node of a device using power domains must have a 
power-domains property,
 containing a phandle to the power device node and an index specifying which
 power domain to use.
 The index should use macros in:
+   "include/dt-bindings/power/px30-power.h" - for px30 type power domain.
"include/dt-bindings/power/rk3036-power.h" - for rk3036 type power 
domain.
"include/dt-bindings/power/rk3128-power.h" - for rk3128 type power 
domain.
"include/dt-bindings/power/rk3128-power.h" - for rk3228 type power 
domain.
-- 
1.9.1




[PATCH v2 10/13] soc: rockchip: power-domain: add power domain support for rk3228

2018-05-13 Thread Elaine Zhang
This driver is modified to support RK3228 SoC.

Signed-off-by: Elaine Zhang 
---
 drivers/soc/rockchip/pm_domains.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/soc/rockchip/pm_domains.c 
b/drivers/soc/rockchip/pm_domains.c
index 99a2dd8a7801..90dcd5e21ae6 100644
--- a/drivers/soc/rockchip/pm_domains.c
+++ b/drivers/soc/rockchip/pm_domains.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -729,6 +730,20 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
[RK3128_PD_GPU] = DOMAIN_RK3288(1, 1, 3, false),
 };
 
+static const struct rockchip_domain_info rk3228_pm_domains[] = {
+   [RK3228_PD_CORE]= DOMAIN_RK3036(0, 0, 16, true),
+   [RK3228_PD_MSCH]= DOMAIN_RK3036(1, 1, 17, true),
+   [RK3228_PD_BUS] = DOMAIN_RK3036(2, 2, 18, true),
+   [RK3228_PD_SYS] = DOMAIN_RK3036(3, 3, 19, true),
+   [RK3228_PD_VIO] = DOMAIN_RK3036(4, 4, 20, false),
+   [RK3228_PD_VOP] = DOMAIN_RK3036(5, 5, 21, false),
+   [RK3228_PD_VPU] = DOMAIN_RK3036(6, 6, 22, false),
+   [RK3228_PD_RKVDEC]  = DOMAIN_RK3036(7, 7, 23, false),
+   [RK3228_PD_GPU] = DOMAIN_RK3036(8, 8, 24, false),
+   [RK3228_PD_PERI]= DOMAIN_RK3036(9, 9, 25, true),
+   [RK3228_PD_GMAC]= DOMAIN_RK3036(10, 10, 26, false),
+};
+
 static const struct rockchip_domain_info rk3288_pm_domains[] = {
[RK3288_PD_VIO] = DOMAIN_RK3288(7, 7, 4, false),
[RK3288_PD_HEVC]= DOMAIN_RK3288(14, 10, 9, false),
@@ -816,6 +831,15 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
.domain_info = rk3128_pm_domains,
 };
 
+static const struct rockchip_pmu_info rk3228_pmu = {
+   .req_offset = 0x40c,
+   .idle_offset = 0x488,
+   .ack_offset = 0x488,
+
+   .num_domains = ARRAY_SIZE(rk3228_pm_domains),
+   .domain_info = rk3228_pm_domains,
+};
+
 static const struct rockchip_pmu_info rk3288_pmu = {
.pwr_offset = 0x08,
.status_offset = 0x0c,
@@ -899,6 +923,10 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
.data = (void *)_pmu,
},
{
+   .compatible = "rockchip,rk3228-power-controller",
+   .data = (void *)&rk3228_pmu,
+   },
+   {
.compatible = "rockchip,rk3288-power-controller",
.data = (void *)&rk3288_pmu,
},
-- 
1.9.1




[PATCH v2 09/13] dt-bindings: add binding for rk3228 power domains

2018-05-13 Thread Elaine Zhang
Add binding documentation for the power domains
found on Rockchip RK3228 SoCs.

Signed-off-by: Elaine Zhang 
---
 Documentation/devicetree/bindings/soc/rockchip/power_domain.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt 
b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
index 9a3f5fd36a80..affe36dcfa17 100644
--- a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
+++ b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
@@ -7,6 +7,7 @@ Required properties for power domain controller:
 - compatible: Should be one of the following.
"rockchip,rk3036-power-controller" - for RK3036 SoCs.
"rockchip,rk3128-power-controller" - for RK3128 SoCs.
+   "rockchip,rk3228-power-controller" - for RK3228 SoCs.
"rockchip,rk3288-power-controller" - for RK3288 SoCs.
"rockchip,rk3328-power-controller" - for RK3328 SoCs.
"rockchip,rk3366-power-controller" - for RK3366 SoCs.
@@ -21,6 +22,7 @@ Required properties for power domain sub nodes:
 - reg: index of the power domain, should use macros in:
"include/dt-bindings/power/rk3036-power.h" - for RK3036 type power 
domain.
"include/dt-bindings/power/rk3128-power.h" - for RK3128 type power 
domain.
+   "include/dt-bindings/power/rk3228-power.h" - for RK3228 type power 
domain.
"include/dt-bindings/power/rk3288-power.h" - for RK3288 type power 
domain.
"include/dt-bindings/power/rk3328-power.h" - for RK3328 type power 
domain.
"include/dt-bindings/power/rk3366-power.h" - for RK3366 type power 
domain.
@@ -99,6 +101,7 @@ power domain to use.
 The index should use macros in:
"include/dt-bindings/power/rk3036-power.h" - for rk3036 type power 
domain.
"include/dt-bindings/power/rk3128-power.h" - for rk3128 type power 
domain.
+   "include/dt-bindings/power/rk3128-power.h" - for rk3228 type power 
domain.
"include/dt-bindings/power/rk3288-power.h" - for rk3288 type power 
domain.
"include/dt-bindings/power/rk3328-power.h" - for rk3328 type power 
domain.
"include/dt-bindings/power/rk3366-power.h" - for rk3366 type power 
domain.
-- 
1.9.1




[PATCH v2 08/13] dt-bindings: power: add RK3228 SoCs header for power-domain

2018-05-13 Thread Elaine Zhang
According to a description from TRM, add all the power domains.

Signed-off-by: Elaine Zhang 
---
 include/dt-bindings/power/rk3228-power.h | 26 ++
 1 file changed, 26 insertions(+)
 create mode 100644 include/dt-bindings/power/rk3228-power.h

diff --git a/include/dt-bindings/power/rk3228-power.h 
b/include/dt-bindings/power/rk3228-power.h
new file mode 100644
index 000000000000..fa1264d5a995
--- /dev/null
+++ b/include/dt-bindings/power/rk3228-power.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (c) 2018 Fuzhou Rockchip Electronics Co., Ltd
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+#ifndef __DT_BINDINGS_POWER_RK3228_POWER_H__
+#define __DT_BINDINGS_POWER_RK3228_POWER_H__
+
+/**
+ * RK3228 idle id Summary.
+ */
+
+#define RK3228_PD_CORE 0
+#define RK3228_PD_MSCH 1
+#define RK3228_PD_BUS  2
+#define RK3228_PD_SYS  3
+#define RK3228_PD_VIO  4
+#define RK3228_PD_VOP  5
+#define RK3228_PD_VPU  6
+#define RK3228_PD_RKVDEC   7
+#define RK3228_PD_GPU  8
+#define RK3228_PD_PERI 9
+#define RK3228_PD_GMAC 10
+
+#endif
-- 
1.9.1




[PATCH v2 07/13] soc: rockchip: power-domain: add power domain support for rk3128

2018-05-13 Thread Elaine Zhang
This driver is modified to support RK3128 SoC.

Signed-off-by: Elaine Zhang 
---
 drivers/soc/rockchip/pm_domains.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/drivers/soc/rockchip/pm_domains.c 
b/drivers/soc/rockchip/pm_domains.c
index 01d4ba26a054..99a2dd8a7801 100644
--- a/drivers/soc/rockchip/pm_domains.c
+++ b/drivers/soc/rockchip/pm_domains.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include <dt-bindings/power/rk3128-power.h>
 #include 
 #include 
 #include 
@@ -720,6 +721,14 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
[RK3036_PD_SYS] = DOMAIN_RK3036(8, 22, 29, false),
 };
 
+static const struct rockchip_domain_info rk3128_pm_domains[] = {
+   [RK3128_PD_CORE]= DOMAIN_RK3288(0, 0, 4, false),
+   [RK3128_PD_MSCH]= DOMAIN_RK3288(-1, -1, 6, true),
+   [RK3128_PD_VIO] = DOMAIN_RK3288(3, 3, 2, false),
+   [RK3128_PD_VIDEO]   = DOMAIN_RK3288(2, 2, 1, false),
+   [RK3128_PD_GPU] = DOMAIN_RK3288(1, 1, 3, false),
+};
+
 static const struct rockchip_domain_info rk3288_pm_domains[] = {
[RK3288_PD_VIO] = DOMAIN_RK3288(7, 7, 4, false),
[RK3288_PD_HEVC]= DOMAIN_RK3288(14, 10, 9, false),
@@ -796,6 +805,17 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
.domain_info = rk3036_pm_domains,
 };
 
+static const struct rockchip_pmu_info rk3128_pmu = {
+   .pwr_offset = 0x04,
+   .status_offset = 0x08,
+   .req_offset = 0x0c,
+   .idle_offset = 0x10,
+   .ack_offset = 0x10,
+
+   .num_domains = ARRAY_SIZE(rk3128_pm_domains),
+   .domain_info = rk3128_pm_domains,
+};
+
 static const struct rockchip_pmu_info rk3288_pmu = {
.pwr_offset = 0x08,
.status_offset = 0x0c,
@@ -875,6 +895,10 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
.data = (void *)&rk3036_pmu,
},
{
+   .compatible = "rockchip,rk3128-power-controller",
+   .data = (void *)&rk3128_pmu,
+   },
+   {
.compatible = "rockchip,rk3288-power-controller",
.data = (void *)&rk3288_pmu,
},
-- 
1.9.1
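
For readers following the table in this patch, here is a minimal standalone
sketch of how the rk3128 entries resolve to bit masks. It assumes that
DOMAIN() takes (pwr, status, req, idle, ack, wakeup), which matches the
DOMAIN_RK3288 definition quoted in patch 03/13 but is not shown in this hunk.
struct domain_info, BIT_OR_ZERO() and rk3128_domain() are hypothetical
stand-ins, not the driver's own names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Domain indices as defined in rk3128-power.h (patch 05/13). */
#define RK3128_PD_CORE  0
#define RK3128_PD_VIO   1
#define RK3128_PD_VIDEO 2
#define RK3128_PD_GPU   3
#define RK3128_PD_MSCH  4

/* Hypothetical, simplified stand-in for struct rockchip_domain_info. */
struct domain_info {
	uint32_t pwr_mask;
	uint32_t status_mask;
	uint32_t req_mask;
	uint32_t idle_mask;
	uint32_t ack_mask;
	bool active_wakeup;
};

/* A negative bit index means "this domain has no such bit". */
#define BIT_OR_ZERO(n) (((n) >= 0) ? (UINT32_C(1) << ((n) & 31)) : 0)

/* Assumed argument order: pwr, status, req, idle, ack, wakeup. */
#define DOMAIN(pwr, status, req, idle, ack, wakeup)	\
{							\
	.pwr_mask = BIT_OR_ZERO(pwr),			\
	.status_mask = BIT_OR_ZERO(status),		\
	.req_mask = BIT_OR_ZERO(req),			\
	.idle_mask = BIT_OR_ZERO(idle),			\
	.ack_mask = BIT_OR_ZERO(ack),			\
	.active_wakeup = (wakeup),			\
}

/* Idle bit equals the request bit; ack bit sits 16 positions above it. */
#define DOMAIN_RK3288(pwr, status, req, wakeup)		\
	DOMAIN(pwr, status, req, req, (req) + 16, wakeup)

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct domain_info rk3128_pm_domains[] = {
	[RK3128_PD_CORE]  = DOMAIN_RK3288(0, 0, 4, false),
	[RK3128_PD_MSCH]  = DOMAIN_RK3288(-1, -1, 6, true),
	[RK3128_PD_VIO]   = DOMAIN_RK3288(3, 3, 2, false),
	[RK3128_PD_VIDEO] = DOMAIN_RK3288(2, 2, 1, false),
	[RK3128_PD_GPU]   = DOMAIN_RK3288(1, 1, 3, false),
};

/* Returns the entry for a domain id, or NULL if the id is out of range. */
static const struct domain_info *rk3128_domain(size_t id)
{
	return id < ARRAY_SIZE(rk3128_pm_domains) ? &rk3128_pm_domains[id] : NULL;
}
```

Under these assumptions the MSCH entry (pwr and status of -1) ends up with
empty power and status masks, so only its idle request bits are meaningful.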




[PATCH v2 05/13] dt-bindings: power: add RK3128 SoCs header for power-domain

2018-05-13 Thread Elaine Zhang
According to a description from TRM, add all the power domains.

Signed-off-by: Elaine Zhang 
---
 include/dt-bindings/power/rk3128-power.h | 28 
 1 file changed, 28 insertions(+)
 create mode 100644 include/dt-bindings/power/rk3128-power.h

diff --git a/include/dt-bindings/power/rk3128-power.h 
b/include/dt-bindings/power/rk3128-power.h
new file mode 100644
index 000000000000..26aef519cd94
--- /dev/null
+++ b/include/dt-bindings/power/rk3128-power.h
@@ -0,0 +1,28 @@
+/*
+ * Copyright (c) 2017 Rockchip Electronics Co. Ltd.
+ * Author: Elaine Zhang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __DT_BINDINGS_POWER_RK3128_POWER_H__
+#define __DT_BINDINGS_POWER_RK3128_POWER_H__
+
+/* VD_CORE */
+#define RK3128_PD_CORE 0
+
+/* VD_LOGIC */
+#define RK3128_PD_VIO  1
+#define RK3128_PD_VIDEO 2
+#define RK3128_PD_GPU  3
+#define RK3128_PD_MSCH 4
+
+#endif
-- 
1.9.1




[PATCH v2 06/13] dt-bindings: add binding for rk3128 power domains

2018-05-13 Thread Elaine Zhang
Add binding documentation for the power domains
found on Rockchip RK3128 SoCs.

Signed-off-by: Elaine Zhang 
---
 Documentation/devicetree/bindings/soc/rockchip/power_domain.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt 
b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
index 79924ee9ae86..9a3f5fd36a80 100644
--- a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
+++ b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
@@ -6,6 +6,7 @@ powered up/down by software based on different application 
scenes to save power.
 Required properties for power domain controller:
 - compatible: Should be one of the following.
"rockchip,rk3036-power-controller" - for RK3036 SoCs.
+   "rockchip,rk3128-power-controller" - for RK3128 SoCs.
"rockchip,rk3288-power-controller" - for RK3288 SoCs.
"rockchip,rk3328-power-controller" - for RK3328 SoCs.
"rockchip,rk3366-power-controller" - for RK3366 SoCs.
@@ -19,6 +20,7 @@ Required properties for power domain controller:
 Required properties for power domain sub nodes:
 - reg: index of the power domain, should use macros in:
"include/dt-bindings/power/rk3036-power.h" - for RK3036 type power 
domain.
+   "include/dt-bindings/power/rk3128-power.h" - for RK3128 type power 
domain.
"include/dt-bindings/power/rk3288-power.h" - for RK3288 type power 
domain.
"include/dt-bindings/power/rk3328-power.h" - for RK3328 type power 
domain.
"include/dt-bindings/power/rk3366-power.h" - for RK3366 type power 
domain.
@@ -96,6 +98,7 @@ containing a phandle to the power device node and an index 
specifying which
 power domain to use.
 The index should use macros in:
"include/dt-bindings/power/rk3036-power.h" - for rk3036 type power 
domain.
+   "include/dt-bindings/power/rk3128-power.h" - for rk3128 type power 
domain.
"include/dt-bindings/power/rk3288-power.h" - for rk3288 type power 
domain.
"include/dt-bindings/power/rk3328-power.h" - for rk3328 type power 
domain.
"include/dt-bindings/power/rk3366-power.h" - for rk3366 type power 
domain.
-- 
1.9.1




[PATCH v2 04/13] soc: rockchip: power-domain: Fix wrong value when power up pd

2018-05-13 Thread Elaine Zhang
From: Finley Xiao 

Fix power domains that could be turned off but never turned on again
when the pd registers use writemask bits.

Fix up the code error for commit:
commit 79bb17ce8edb3141339b5882e372d0ec7346217c
Author: Elaine Zhang 
Date:   Fri Dec 23 11:47:52 2016 +0800

soc: rockchip: power-domain: Support domain control in hiword-registers

New Rockchips SoCs may have their power-domain control in registers
using a writemask-based access scheme (upper 16bit being the write
mask). So add a DOMAIN_M type and handle this case accordingly.
Signed-off-by: Elaine Zhang 
Signed-off-by: Heiko Stuebner 

Signed-off-by: Finley Xiao 
Signed-off-by: Elaine Zhang 
---
 drivers/soc/rockchip/pm_domains.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/rockchip/pm_domains.c 
b/drivers/soc/rockchip/pm_domains.c
index ebd7c41898c0..01d4ba26a054 100644
--- a/drivers/soc/rockchip/pm_domains.c
+++ b/drivers/soc/rockchip/pm_domains.c
@@ -264,7 +264,7 @@ static void rockchip_do_pmu_set_power_domain(struct 
rockchip_pm_domain *pd,
return;
else if (pd->info->pwr_w_mask)
regmap_write(pmu->regmap, pmu->info->pwr_offset,
-on ? pd->info->pwr_mask :
+on ? pd->info->pwr_w_mask :
 (pd->info->pwr_mask | pd->info->pwr_w_mask));
else
regmap_update_bits(pmu->regmap, pmu->info->pwr_offset,
-- 
1.9.1
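
To illustrate why the old value could never power a domain back up, here is
a small standalone model of a hiword-mask register. It is only a sketch:
hiword_write() and pd_power_value() are hypothetical helpers, and the bit
positions used with them are made up for the example. The model assumes, as
the quoted commit describes, that the upper 16 bits of the written value act
as the write mask for the lower 16 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of a hiword-mask register write: bits 31..16 of the written value
 * select which of bits 15..0 are actually updated.
 */
static uint32_t hiword_write(uint32_t reg, uint32_t val)
{
	uint32_t wmask = val >> 16;

	return ((reg & ~wmask) | (val & wmask)) & 0xffff;
}

/*
 * Value written by the fixed code. A set power bit means "powered down",
 * so powering on writes only the write-enable bit (data bit cleared).
 */
static uint32_t pd_power_value(uint32_t pwr_mask, uint32_t pwr_w_mask, bool on)
{
	return on ? pwr_w_mask : (pwr_mask | pwr_w_mask);
}
```

With a power bit at position 3 and its write-enable bit at position 19,
writing the old "on" value (pwr_mask alone) leaves the register untouched
because no write-enable bit is set, while the fixed value clears the bit.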




[PATCH v2 03/13] Soc: rockchip: power-domain: add power domain support for rk3036

2018-05-13 Thread Elaine Zhang
From: Caesar Wang 

This driver is modified to support RK3036 SoC.

Signed-off-by: Caesar Wang 
Signed-off-by: Elaine Zhang 
---
 drivers/soc/rockchip/pm_domains.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/soc/rockchip/pm_domains.c 
b/drivers/soc/rockchip/pm_domains.c
index 53efc386b1ad..ebd7c41898c0 100644
--- a/drivers/soc/rockchip/pm_domains.c
+++ b/drivers/soc/rockchip/pm_domains.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include <dt-bindings/power/rk3036-power.h>
 #include 
 #include 
 #include 
@@ -102,6 +103,14 @@ struct rockchip_pmu {
.ack_mask = (ack >= 0) ? BIT(ack) : 0,  \
.active_wakeup = wakeup,\
 }
+#define DOMAIN_RK3036(req, ack, idle, wakeup)  \
+{  \
+   .req_mask = (req >= 0) ? BIT(req) : 0,  \
+   .req_w_mask = (req >= 0) ?  BIT(req + 16) : 0,  \
+   .ack_mask = (ack >= 0) ? BIT(ack) : 0,  \
+   .idle_mask = (idle >= 0) ? BIT(idle) : 0,   \
+   .active_wakeup = wakeup,\
+}
 
 #define DOMAIN_RK3288(pwr, status, req, wakeup)\
DOMAIN(pwr, status, req, req, (req) + 16, wakeup)
@@ -701,6 +710,16 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
return error;
 }
 
+static const struct rockchip_domain_info rk3036_pm_domains[] = {
+   [RK3036_PD_MSCH]= DOMAIN_RK3036(14, 23, 30, true),
+   [RK3036_PD_CORE]= DOMAIN_RK3036(13, 17, 24, false),
+   [RK3036_PD_PERI]= DOMAIN_RK3036(12, 18, 25, false),
+   [RK3036_PD_VIO] = DOMAIN_RK3036(11, 19, 26, false),
+   [RK3036_PD_VPU] = DOMAIN_RK3036(10, 20, 27, false),
+   [RK3036_PD_GPU] = DOMAIN_RK3036(9, 21, 28, false),
+   [RK3036_PD_SYS] = DOMAIN_RK3036(8, 22, 29, false),
+};
+
 static const struct rockchip_domain_info rk3288_pm_domains[] = {
[RK3288_PD_VIO] = DOMAIN_RK3288(7, 7, 4, false),
[RK3288_PD_HEVC]= DOMAIN_RK3288(14, 10, 9, false),
@@ -768,6 +787,15 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
[RK3399_PD_SDIOAUDIO]   = DOMAIN_RK3399(31, 31, 29, true),
 };
 
+static const struct rockchip_pmu_info rk3036_pmu = {
+   .req_offset = 0x148,
+   .idle_offset = 0x14c,
+   .ack_offset = 0x14c,
+
+   .num_domains = ARRAY_SIZE(rk3036_pm_domains),
+   .domain_info = rk3036_pm_domains,
+};
+
 static const struct rockchip_pmu_info rk3288_pmu = {
.pwr_offset = 0x08,
.status_offset = 0x0c,
@@ -843,6 +871,10 @@ static int rockchip_pm_domain_probe(struct platform_device 
*pdev)
 
 static const struct of_device_id rockchip_pm_domain_dt_match[] = {
{
+   .compatible = "rockchip,rk3036-power-controller",
+   .data = (void *)&rk3036_pmu,
+   },
+   {
.compatible = "rockchip,rk3288-power-controller",
.data = (void *)&rk3288_pmu,
},
-- 
1.9.1
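
As a rough illustration of what DOMAIN_RK3036 produces, the sketch below
computes the same masks in a plain function. struct rk3036_domain and
rk3036_domain_masks() are hypothetical names, the struct holds only the
fields this macro sets, and the negative-index guard of the real macro is
left out by taking unsigned bit positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Hypothetical subset of the fields that DOMAIN_RK3036 initialises. */
struct rk3036_domain {
	uint32_t req_mask;
	uint32_t req_w_mask;
	uint32_t ack_mask;
	uint32_t idle_mask;
	bool active_wakeup;
};

/* Mirrors the macro: the write-enable bit sits 16 bits above the request bit. */
static struct rk3036_domain rk3036_domain_masks(unsigned int req, unsigned int ack,
						 unsigned int idle, bool wakeup)
{
	struct rk3036_domain d = {
		.req_mask = BIT(req),
		.req_w_mask = BIT(req + 16),
		.ack_mask = BIT(ack),
		.idle_mask = BIT(idle),
		.active_wakeup = wakeup,
	};

	return d;
}
```

For the MSCH entry (14, 23, 30, true) this yields a request bit at position
14 and a write-enable bit at position 30. The idle bit is also at position
30, but according to rk3036_pmu it lives in a different register (offset
0x14c rather than 0x148).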




[PATCH v2 02/13] dt-bindings: add binding for rk3036 power domains

2018-05-13 Thread Elaine Zhang
From: Caesar Wang 

Add binding documentation for the power domains
found on Rockchip RK3036 SoCs.

Signed-off-by: Caesar Wang 
Signed-off-by: Elaine Zhang 
---
 Documentation/devicetree/bindings/soc/rockchip/power_domain.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt 
b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
index 301d2a9bc1b8..79924ee9ae86 100644
--- a/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
+++ b/Documentation/devicetree/bindings/soc/rockchip/power_domain.txt
@@ -5,6 +5,7 @@ powered up/down by software based on different application 
scenes to save power.
 
 Required properties for power domain controller:
 - compatible: Should be one of the following.
+   "rockchip,rk3036-power-controller" - for RK3036 SoCs.
"rockchip,rk3288-power-controller" - for RK3288 SoCs.
"rockchip,rk3328-power-controller" - for RK3328 SoCs.
"rockchip,rk3366-power-controller" - for RK3366 SoCs.
@@ -17,6 +18,7 @@ Required properties for power domain controller:
 
 Required properties for power domain sub nodes:
 - reg: index of the power domain, should use macros in:
+   "include/dt-bindings/power/rk3036-power.h" - for RK3036 type power 
domain.
"include/dt-bindings/power/rk3288-power.h" - for RK3288 type power 
domain.
"include/dt-bindings/power/rk3328-power.h" - for RK3328 type power 
domain.
"include/dt-bindings/power/rk3366-power.h" - for RK3366 type power 
domain.
@@ -93,6 +95,7 @@ Node of a device using power domains must have a 
power-domains property,
 containing a phandle to the power device node and an index specifying which
 power domain to use.
 The index should use macros in:
+   "include/dt-bindings/power/rk3036-power.h" - for rk3036 type power 
domain.
"include/dt-bindings/power/rk3288-power.h" - for rk3288 type power 
domain.
"include/dt-bindings/power/rk3328-power.h" - for rk3328 type power 
domain.
"include/dt-bindings/power/rk3366-power.h" - for rk3366 type power 
domain.
-- 
1.9.1




[PATCH v2 01/13] dt-bindings: power: add RK3036 SoCs header for power-domain

2018-05-13 Thread Elaine Zhang
From: Caesar Wang 

According to a description from TRM, add all the power domains.

Signed-off-by: Caesar Wang 
Signed-off-by: Elaine Zhang 
---
 include/dt-bindings/power/rk3036-power.h | 27 +++
 1 file changed, 27 insertions(+)
 create mode 100644 include/dt-bindings/power/rk3036-power.h

diff --git a/include/dt-bindings/power/rk3036-power.h 
b/include/dt-bindings/power/rk3036-power.h
new file mode 100644
index 000000000000..59e09f1c5af7
--- /dev/null
+++ b/include/dt-bindings/power/rk3036-power.h
@@ -0,0 +1,27 @@
+/*
+ * Copyright (c) 2017 Rockchip Electronics Co. Ltd.
+ * Author: Caesar Wang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __DT_BINDINGS_POWER_RK3036_POWER_H__
+#define __DT_BINDINGS_POWER_RK3036_POWER_H__
+
+#define RK3036_PD_MSCH 0
+#define RK3036_PD_CORE 1
+#define RK3036_PD_PERI 2
+#define RK3036_PD_VIO  3
+#define RK3036_PD_VPU  4
+#define RK3036_PD_GPU  5
+#define RK3036_PD_SYS  6
+
+#endif
-- 
1.9.1




[PATCH v2 00/13] add power domain support for Rockchip Socs

2018-05-13 Thread Elaine Zhang
Add power domain support for the RK3036/RK3128/RK3228/PX30 SoCs.
Fix the wrong value written when powering up a power domain.

Changes in v2:
Fix up the commit message descriptions and assign authorship.

Caesar Wang (3):
  dt-bindings: power: add RK3036 SoCs header for power-domain
  dt-bindings: add binding for rk3036 power domains
  Soc: rockchip: power-domain: add power domain support for rk3036

Elaine Zhang (6):
  dt-bindings: power: add RK3128 SoCs header for power-domain
  dt-bindings: add binding for rk3128 power domains
  soc: rockchip: power-domain: add power domain support for rk3128
  dt-bindings: power: add RK3228 SoCs header for power-domain
  dt-bindings: add binding for rk3228 power domains
  soc: rockchip: power-domain: add power domain support for rk3228

Finley Xiao (4):
  soc: rockchip: power-domain: Fix wrong value when power up pd
  dt-bindings: power: add PX30 SoCs header for power-domain
  dt-bindings: add binding for px30 power domains
  soc: rockchip: power-domain: add power domain support for px30

 .../bindings/soc/rockchip/power_domain.txt |  12 +++
 drivers/soc/rockchip/pm_domains.c  | 116 -
 include/dt-bindings/power/px30-power.h |  32 ++
 include/dt-bindings/power/rk3036-power.h   |  27 +
 include/dt-bindings/power/rk3128-power.h   |  28 +
 include/dt-bindings/power/rk3228-power.h   |  26 +
 6 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 include/dt-bindings/power/px30-power.h
 create mode 100644 include/dt-bindings/power/rk3036-power.h
 create mode 100644 include/dt-bindings/power/rk3128-power.h
 create mode 100644 include/dt-bindings/power/rk3228-power.h

-- 
1.9.1




Re: [PATCH 2/2] powerpc: Enable ASYM_SMT on interleaved big-core systems

2018-05-13 Thread Michael Neuling
On Fri, 2018-05-11 at 16:47 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> Each of the SMT4 cores forming a fused-core are more or less
> independent units. Thus when multiple tasks are scheduled to run on
> the fused core, we get the best performance when the tasks are spread
> across the pair of SMT4 cores.
> 
> Since the threads in the pair of SMT4 cores of an interleaved big-core
> are numbered {0,2,4,6} and {1,3,5,7} respectively, enable ASYM_SMT on
> such interleaved big-cores that will bias the load-balancing of tasks
> on smaller numbered threads, which will automatically result in
> spreading the tasks uniformly across the associated pair of SMT4
> cores.
> 
> Signed-off-by: Gautham R. Shenoy 
> ---
>  arch/powerpc/kernel/smp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 9ca7148..0153f01 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1082,7 +1082,7 @@ static int powerpc_smt_flags(void)
>  {
>   int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
>  
> - if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
> + if (cpu_has_feature(CPU_FTR_ASYM_SMT) || has_interleaved_big_core) {

Shouldn't we just set CPU_FTR_ASYM_SMT and leave this code unchanged?


>   printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
>   flags |= SD_ASYM_PACKING;
>   }
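
The spreading effect described in the quoted commit message can be
illustrated with a small user-space sketch. It assumes an idealised
packing in which task k always lands on thread k; the real scheduler only
biases load-balancing towards lower-numbered threads. The helper names
small_core_of() and tasks_per_small_core() are hypothetical and do not
appear in the patch.

```c
#include <assert.h>

#define NR_SMALL_CORES       2
#define THREADS_PER_BIG_CORE 8

/* With interleaved numbering, thread t belongs to small core t % 2:
 * {0,2,4,6} -> core 0, {1,3,5,7} -> core 1. */
static int small_core_of(int thread_id)
{
	return thread_id % NR_SMALL_CORES;
}

/* Assumption: tasks fill the lowest-numbered threads first. */
static void tasks_per_small_core(int nr_tasks, int counts[NR_SMALL_CORES])
{
	assert(nr_tasks >= 0 && nr_tasks <= THREADS_PER_BIG_CORE);

	for (int c = 0; c < NR_SMALL_CORES; c++)
		counts[c] = 0;
	for (int t = 0; t < nr_tasks; t++)
		counts[small_core_of(t)]++;
}
```

Under that assumption, three tasks occupy threads 0, 1 and 2, so the two
small cores carry two tasks and one task respectively, and four tasks
split evenly.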


Re: [PATCH 1/2] powerpc: Detect the presence of big-core with interleaved threads

2018-05-13 Thread Michael Neuling
Thanks for posting this... A couple of comments below.

On Fri, 2018-05-11 at 16:47 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> A pair of IBM POWER9 SMT4 cores can be fused together to form a
> big-core with 8 SMT threads. This can be discovered via the
> "ibm,thread-groups" CPU property in the device tree which will
> indicate which group of threads that share the L1 cache, translation
> cache and instruction data flow.  If there are multiple such group of
> threads, then the core is a big-core. The thread-ids of the threads of
> the big-core can be obtained by interleaving the thread-ids of the
> thread-groups (component small core).
> 
> Eg: Threads in the pair of component SMT4 cores of an interleaved
> big-core are numbered {0,2,4,6} and {1,3,5,7} respectively.
> 
> This patch introduces a function to check if a given device tree node
> corresponding to a CPU node represents an interleaved big-core.
> 
> This function is invoked during the boot-up to detect the presence of
> interleaved big-cores. The presence of such an interleaved big-core is
> recorded in a global variable for later use.
> 
> Signed-off-by: Gautham R. Shenoy 
> ---
>  arch/powerpc/include/asm/cputhreads.h |  8 +++--
>  arch/powerpc/kernel/setup-common.c| 63 +-
> -
>  2 files changed, 66 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cputhreads.h b/arch/powerpc/include/asm/cputhreads.h
> index d71a909..b706f0a 100644
> --- a/arch/powerpc/include/asm/cputhreads.h
> +++ b/arch/powerpc/include/asm/cputhreads.h
> @@ -23,11 +23,13 @@
>  extern int threads_per_core;
>  extern int threads_per_subcore;
>  extern int threads_shift;
> +extern bool has_interleaved_big_core;
>  extern cpumask_t threads_core_mask;
>  #else
> -#define threads_per_core 1
> -#define threads_per_subcore 1
> -#define threads_shift 0
> +#define threads_per_core 1
> +#define threads_per_subcore 1
> +#define threads_shift 0
> +#define has_interleaved_big_core 0
>  #define threads_core_mask (*get_cpu_mask(0))
>  #endif
>  
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index 0af5c11..884dff2 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -408,10 +408,12 @@ void __init check_for_initrd(void)
>  #ifdef CONFIG_SMP
>  
>  int threads_per_core, threads_per_subcore, threads_shift;
> +bool has_interleaved_big_core;
>  cpumask_t threads_core_mask;
>  EXPORT_SYMBOL_GPL(threads_per_core);
>  EXPORT_SYMBOL_GPL(threads_per_subcore);
>  EXPORT_SYMBOL_GPL(threads_shift);
> +EXPORT_SYMBOL_GPL(has_interleaved_big_core);
>  EXPORT_SYMBOL_GPL(threads_core_mask);
>  
>  static void __init cpu_init_thread_core_maps(int tpc)
> @@ -436,8 +438,56 @@ static void __init cpu_init_thread_core_maps(int tpc)
>   printk(KERN_DEBUG " (thread shift is %d)\n", threads_shift);
>  }
>  
> -
>  u32 *cpu_to_phys_id = NULL;
> +/*
> + * check_for_interleaved_big_core - Checks if the core represented by
> + *dn is a big-core whose threads are interleavings of the
> + *threads of the component small cores.
> + *
> + * @dn: device node corresponding to the core.
> + *
> + * Returns true if the core is a interleaved big-core.
> + * Returns false otherwise.
> + */
> +static inline bool check_for_interleaved_big_core(struct device_node *dn)
> +{
> + int len, nr_groups, threads_per_group;
> + const __be32 *thread_groups;
> + __be32 *thread_list, *first_cpu_idx;
> + int cur_cpu, next_cpu, i, j;
> +
> + thread_groups = of_get_property(dn, "ibm,thread-groups", &len);
> + if (!thread_groups)
> + return false;

Can you document what this property looks like? Seems to be nr_groups,
threads_per_group, thread_list. Can you explain what each of these mean?

If we get configured with an SMT2 big-core (ie. two interleaved SMT1 normal
cores), will this code also work there?

> +
> + nr_groups = be32_to_cpu(*(thread_groups + 1));
> + if (nr_groups <= 1)
> + return false;
> +
> + threads_per_group = be32_to_cpu(*(thread_groups + 2));
> + thread_list = (__be32 *)thread_groups + 3;
> +
> + /*
> +  * In case of an interleaved big-core, the thread-ids of the
> +  * big-core can be obtained by interleaving the the thread-ids
> +  * of the component small
> +  *
> +  * Eg: On a 8-thread big-core with two SMT4 small cores, the
> +  * threads of the two component small cores will be
> +  * {0, 2, 4, 6} and {1, 3, 5, 7}.
> +  */
> + for (i = 0; i < nr_groups; i++) {
> + first_cpu_idx = thread_list + i * threads_per_group;
> +
> + for (j = 0; j < threads_per_group - 1; j++) {
> + cur_cpu = be32_to_cpu(*(first_cpu_idx + j));
> + next_cpu = be32_to_cpu(*(first_cpu_idx + j + 1));
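
The property layout and the SMT2 question raised in the review can be
explored with a user-space sketch of the same walk. It is not the patch
itself: it reads host-endian cells instead of __be32, the name
is_interleaved_big_core() is hypothetical, and the comparison
"next == cur + nr_groups" is an assumption, because the quoted code is cut
off before the condition is complete. Cell 0 is skipped, as in the quoted
code; its meaning is not stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * prop[1] = nr_groups, prop[2] = threads_per_group,
 * prop[3..] = thread ids, one group after another.
 */
static bool is_interleaved_big_core(const uint32_t *prop)
{
	assert(prop != NULL);

	uint32_t nr_groups = prop[1];
	uint32_t threads_per_group = prop[2];
	const uint32_t *thread_list = prop + 3;

	/* A single group means an ordinary (small) core. */
	if (nr_groups <= 1)
		return false;

	for (uint32_t i = 0; i < nr_groups; i++) {
		const uint32_t *group = thread_list + i * threads_per_group;

		/* Assumed check: ids within a group step by nr_groups. */
		for (uint32_t j = 0; j + 1 < threads_per_group; j++) {
			if (group[j + 1] != group[j] + nr_groups)
				return false;
		}
	}
	return true;
}
```

With the cells {1, 2, 4, 0, 2, 4, 6, 1, 3, 5, 7} the sketch returns true.
With two groups of one thread each, {1, 2, 1, 0, 1}, the inner loop never
runs and the sketch also returns true, so under these assumptions an SMT2
big-core would be reported as interleaved without any pair of ids ever
being compared.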

[PATCH RFC 3/8] rcu: Add back the cpuend tracepoint

2018-05-13 Thread Joel Fernandes (Google)
Commit be4b8beed87d ("rcu: Move RCU's grace-period-change code to ->gp_seq")
removed the cpuend grace period trace point. This patch adds it back.

Signed-off-by: Joel Fernandes (Google) 
---
 kernel/rcu/tree.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9ad931bff409..29ccc60bdbfc 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1774,10 +1774,12 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
 
/* Handle the ends of any preceding grace periods first. */
if (rcu_seq_completed_gp(rdp->gp_seq, rnp->gp_seq) ||
-   unlikely(READ_ONCE(rdp->gpwrap)))
+   unlikely(READ_ONCE(rdp->gpwrap))) {
ret = rcu_advance_cbs(rsp, rnp, rdp); /* Advance callbacks. */
-   else
+   trace_rcu_grace_period(rsp->name, rdp->gp_seq, TPS("cpuend"));
+   } else {
ret = rcu_accelerate_cbs(rsp, rnp, rdp); /* Recent callbacks. */
+   }
 
/* Now handle the beginnings of any new-to-this-CPU grace periods. */
if (rcu_seq_new_gp(rdp->gp_seq, rnp->gp_seq) ||
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH RFC 4/8] rcu: Get rid of old c variable from places in tree RCU

2018-05-13 Thread Joel Fernandes (Google)
The 'c' variable was previously used to store the grace period
that is being requested. However, the name is not very meaningful to
a code reader, so this patch replaces it with gp_seq_start, indicating
that this is the grace period that was requested. The tracing is also
updated to use the new name.

Just a clean up patch, no logical change.

Signed-off-by: Joel Fernandes (Google) 
---
 include/trace/events/rcu.h | 15 ++--
 kernel/rcu/tree.c  | 47 ++
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index ce9d1a1cac78..539900a9f8c7 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -103,15 +103,16 @@ TRACE_EVENT(rcu_grace_period,
  */
 TRACE_EVENT(rcu_future_grace_period,
 
-   TP_PROTO(const char *rcuname, unsigned long gp_seq, unsigned long c,
-u8 level, int grplo, int grphi, const char *gpevent),
+   TP_PROTO(const char *rcuname, unsigned long gp_seq,
+unsigned long gp_seq_start, u8 level, int grplo, int grphi,
+const char *gpevent),
 
-   TP_ARGS(rcuname, gp_seq, c, level, grplo, grphi, gpevent),
+   TP_ARGS(rcuname, gp_seq, gp_seq_start, level, grplo, grphi, gpevent),
 
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(unsigned long, gp_seq)
-   __field(unsigned long, c)
+   __field(unsigned long, gp_seq_start)
__field(u8, level)
__field(int, grplo)
__field(int, grphi)
@@ -121,7 +122,7 @@ TRACE_EVENT(rcu_future_grace_period,
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->gp_seq = gp_seq;
-   __entry->c = c;
+   __entry->gp_seq_start = gp_seq_start;
__entry->level = level;
__entry->grplo = grplo;
__entry->grphi = grphi;
@@ -129,7 +130,7 @@ TRACE_EVENT(rcu_future_grace_period,
),
 
TP_printk("%s %lu %lu %u %d %d %s",
- __entry->rcuname, __entry->gp_seq, __entry->c, __entry->level,
+ __entry->rcuname, __entry->gp_seq, __entry->gp_seq_start, __entry->level,
  __entry->grplo, __entry->grphi, __entry->gpevent)
 );
 
@@ -751,7 +752,7 @@ TRACE_EVENT(rcu_barrier,
 #else /* #ifdef CONFIG_RCU_TRACE */
 
 #define trace_rcu_grace_period(rcuname, gp_seq, gpevent) do { } while (0)
-#define trace_rcu_future_grace_period(rcuname, gp_seq, c, \
+#define trace_rcu_future_grace_period(rcuname, gp_seq, gp_seq_start, \
  level, grplo, grphi, event) \
  do { } while (0)
 #define trace_rcu_grace_period_init(rcuname, gp_seq, level, grplo, grphi, \
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 29ccc60bdbfc..9f5679ba413b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1541,13 +1541,18 @@ void rcu_cpu_stall_reset(void)
 
 /* Trace-event wrapper function for trace_rcu_future_grace_period.  */
 static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
- unsigned long c, const char *s)
+ unsigned long gp_seq_start, const char *s)
 {
-   trace_rcu_future_grace_period(rdp->rsp->name, rnp->gp_seq, c,
+   trace_rcu_future_grace_period(rdp->rsp->name, rnp->gp_seq, gp_seq_start,
  rnp->level, rnp->grplo, rnp->grphi, s);
 }
 
 /*
+ * rcu_start_this_gp - Request the start of a particular grace period
+ * @rnp: The leaf node of the CPU from which to start.
+ * @rdp: The rcu_data corresponding to the CPU from which to start.
+ * @gp_seq_start: The gp_seq of the grace period to start.
+ *
  * Start the specified grace period, as needed to handle newly arrived
  * callbacks.  The required future grace periods are recorded in each
  * rcu_node structure's ->gp_seq_needed field.  Returns true if there
@@ -1555,9 +1560,11 @@ static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
  *
  * The caller must hold the specified rcu_node structure's ->lock, which
  * is why the caller is responsible for waking the grace-period kthread.
+ *
+ * Returns true if the GP thread needs to be awakened else false.
  */
 static bool rcu_start_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
- unsigned long c)
+ unsigned long gp_seq_start)
 {
bool ret = false;
struct rcu_state *rsp = rdp->rsp;
@@ -1573,18 +1580,19 @@ static bool rcu_start_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
 * not be released.
 */
raw_lockdep_assert_held_rcu_node(rnp);
-   trace_rcu_this_gp(rnp, rdp, c, TPS("Startleaf"));
+   trace_rcu_this_gp(rnp, rdp, gp_seq_start, TPS("Startleaf"));
for (rnp_root = rnp; 1; rnp_root = rnp_root->parent) {
