date:20180213

Re: [PATCH] rtc: ds1302: remove redundant initializations of pointer bp

2018-02-13 Thread Alexandre Belloni

On 23/01/2018 at 10:17:27 +, Colin King wrote:
> From: Colin Ian King 
> 
> Pointer bp is being initialized and this value is never read, it
> is being updated to the same value later just before it is going to
> be used. Remove the initialization as it is never read and keep
> the setting of bp closer to the use of bp.
> 
> Cleans up clang warnings:
> drivers/rtc/rtc-ds1302.c:115:7: warning: Value stored to 'bp' during
> its initialization is never read
> drivers/rtc/rtc-ds1302.c:46:7: warning: Value stored to 'bp' during
> its initialization is never read
> 
> Signed-off-by: Colin Ian King 
> ---
>  drivers/rtc/rtc-ds1302.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
Applied, thanks.

-- 
Alexandre Belloni, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com
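
The pattern the patch cleans up can be sketched in isolation. This is not
the rtc-ds1302 code: pack_write() and its parameters are hypothetical names
chosen for illustration. It only shows a pointer that is declared without
an initializer and assigned right before its first use, which is what the
clang warning quoted above asks for.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the driver): packs an address byte
 * followed by a payload into buf and returns the number of bytes written.
 */
size_t pack_write(uint8_t *buf, uint8_t addr, const uint8_t *data, size_t len)
{
	uint8_t *bp;	/* no initializer: nothing reads bp before the assignment below */
	size_t i;

	bp = buf;	/* assigned right before its first use */
	*bp++ = addr;
	for (i = 0; i < len; i++)
		*bp++ = data[i];

	return (size_t)(bp - buf);
}
```

Writing "uint8_t *bp = buf;" at the declaration and then assigning bp = buf
again before use would behave the same, but the first store would be dead,
which is the situation the warning reports.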

Re: [tip:perf/core] perf/headers: Sync new perf_event.h with the tools/include/uapi version

2018-02-13 Thread Ingo Molnar


* Alexei Starovoitov  wrote:

> On Tue, Feb 06, 2018 at 03:52:59AM -0800, tip-bot for Song Liu wrote:
> > Commit-ID:  0d8dd67be013727ae57645ecd3ea2c36365d7da8
> > Gitweb: 
> > https://git.kernel.org/tip/0d8dd67be013727ae57645ecd3ea2c36365d7da8
> > Author: Song Liu 
> > AuthorDate: Wed, 6 Dec 2017 14:45:14 -0800
> > Committer:  Ingo Molnar 
> > CommitDate: Tue, 6 Feb 2018 10:18:05 +0100
> 
> any chance these patches can still make into this release considering
> they were ready back in December ?

The actual kernel side patches were only applied a week ago:

 33ea4b24277b: perf/core: Implement the 'perf_uprobe' PMU

 include/linux/trace_events.h    |  4
 kernel/events/core.c            | 48 +++-
 kernel/trace/trace_event_perf.c | 53 +
 kernel/trace/trace_probe.h      |  4
 kernel/trace/trace_uprobe.c     | 86 ++
 5 files changed, 186 insertions(+), 9 deletions(-)

 e12f03d7031a: perf/core: Implement the 'perf_kprobe' PMU

 include/linux/trace_events.h    |   4
 kernel/events/core.c            | 142 +++---
 kernel/trace/trace_event_perf.c |  49 +
 kernel/trace/trace_kprobe.c     |  91 +++
 kernel/trace/trace_probe.h      |   7 +++
 5 files changed, 250 insertions(+), 43 deletions(-)

 Commit: Ingo Molnar 
 CommitDate: Tue Feb 6 11:29:28 2018 +0100

They are also large and complex, so I can only send this to Linus in the v4.17 
merge window.

> We have few followups for them and if we don't get them via Linus's tree
> into net-next/bpf-next we cannot really proceed further.
> The other option would be to cherry-pick them into bpf-next/net-next,
> but also a bit scary due to potential conflicts?

No cherry-picking of such large patches please.

But I suppose you could git-pull tip:perf/core into the BPF tree, it only has 
these changes:

  33ea4b24277b: perf/core: Implement the 'perf_uprobe' PMU
  e12f03d7031a: perf/core: Implement the 'perf_kprobe' PMU
  0d8dd67be013: perf/headers: Sync new perf_event.h with the tools/include/uapi version
  65074d43fc77: perf/core: Prepare perf_event.h for new types: 'perf_kprobe' and 'perf_uprobe'

... on top of an upstream commit (59410f5ac70a).

The risks are:

 - In the v4.17 merge window the BPF tree should only be sent to Linus once he
   has pulled the perf tree - i.e. there's a dependency.

 - If any of these commits needs serious fixes or a revert then that would have
   to be pulled into the BPF tree too later on. (I don't expect there to be many
   problems though, no regression was reported so far.)

Thanks,

Ingo

Re: [PATCH] x86/entry/64: Fix CR3 restore order in paranoid_exit()

2018-02-13 Thread Ingo Molnar


* Dave Hansen  wrote:

> On 02/13/2018 06:27 PM, Josh Poimboeuf wrote:
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -1167,10 +1167,10 @@ ENTRY(paranoid_exit)
> > UNWIND_HINT_REGS
> > DISABLE_INTERRUPTS(CLBR_ANY)
> > TRACE_IRQS_OFF_DEBUG
> > +   RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
> > testl   %ebx, %ebx  /* swapgs needed? */
> > jnz .Lparanoid_exit_no_swapgs
> > TRACE_IRQS_IRETQ
> > -   RESTORE_CR3 scratch_reg=%rbx save_reg=%r14
> > SWAPGS_UNSAFE_STACK
> > jmp .Lparanoid_exit_restore
> >  .Lparanoid_exit_no_swapgs:
> 
> TRACE_IRQS_* call non-entry functions that are not mapped by the user
> CR3.  How can this possibly work?  What am I missing?

How about something like the patch below? (Totally untested)

Thanks,

Ingo
---
 arch/x86/entry/entry_64.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index cd216c9431e1..8971bd64d515 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1175,6 +1175,7 @@ ENTRY(paranoid_exit)
 	jmp	.Lparanoid_exit_restore
 .Lparanoid_exit_no_swapgs:
 	TRACE_IRQS_IRETQ_DEBUG
+	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
 .Lparanoid_exit_restore:
 	jmp restore_regs_and_return_to_kernel
 END(paranoid_exit)

Re: [PATCH 00/17] Add kexec_file_load support to s390

2018-02-13 Thread Dave Young

Hi Philipp,

I added AKASHI to Cc; he posted an arm64 kexec_file series previously.
I would like to read both series, especially the general part, but
probably not until the end of this month because of a holiday.

From the patch log the cleanup looks nice, but I still need to read the
details.

On 02/12/18 at 11:07am, Philipp Rudo wrote:
> Hi everybody
> 
> resending the series as there was no reaction, yet. Furthermore i was told
> that Andrew and the x86 list should also be CCed, so welcome.
> 
> No changes made to the patches since first time i sent them. The patches
> apply to the current master (v4.16-rc1).
> 
> Thanks
> Philipp
> 
> ---
> 
> this series adds the kexec_file_load system call to s390. Before the system
> call is added there are some preparations/clean ups to common
> kexec_file_load. In detail this series contains:
> 
> Patch #1&2: Minor cleanups/fixes.
> 
> Patch #3-9: Clean up the purgatory load/relocation code. Especially remove
> the mis-use of the purgatory_info->sechdrs->sh_offset field, currently
> holding a pointer into either kexec_purgatory (ro) or purgatory_buf (rw)
> depending on the section. With these patches the section address will be
> calculated verbosely and sh_offset will contain the offset of the section
> in the stripped purgatory binary (purgatory_buf).
> 
> Patch #10: Allows architectures to set the purgatory load address. This
> patch is important for s390 as the kernel and purgatory have to be loaded
> to fixed addresses. In current code this is impossible as the purgatory
> load is opaque to the architecture.
> 
> Patch #11: Moves the x86 purgatory's sha256 implementation to the common
> lib/ directory.
> 
> Patches #12-17: Finally add the kexec_file_load system call to s390.
> 
> Please note that I had to touch arch code for x86 and power a little. In
> theory this should not change the behavior but I don't have a way to test
> it. Cross-compiling with defconfig(*) works fine for both.
> 
> Thanks
> Philipp
> 
> (*) On x86 with the orc unwinder turned off. objtool SEGFAULTs on s390...
> 
> Philipp Rudo (17):
>   kexec_file: Silence compile warnings
>   kexec_file: Remove checks in kexec_purgatory_load
>   kexec_file: Make purgatory_info->ehdr const
>   kexec_file: Search symbols in read-only kexec_purgatory
>   kexec_file: Use read-only sections in arch_kexec_apply_relocations*
>   kexec_file: Split up __kexec_load_puragory
>   kexec_file: Simplify kexec_purgatory_setup_sechdrs 1
>   kexec_file: Simplify kexec_purgatory_setup_sechdrs 2
>   kexec_file: Remove mis-use of sh_offset field
>   kexec_file: Allow archs to set purgatory load address
>   kexec_file: Move purgatories sha256 to common code
>   s390/kexec_file: Prepare setup.h for kexec_file_load
>   s390/kexec_file: Add purgatory
>   s390/kexec_file: Add kexec_file_load system call
>   s390/kexec_file: Add image loader
>   s390/kexec_file: Add crash support to image loader
>   s390/kexec_file: Add ELF loader
> 
>  arch/powerpc/kernel/kexec_elf_64.c |   9 +-
>  arch/s390/Kbuild   |   1 +
>  arch/s390/Kconfig  |   4 +
>  arch/s390/include/asm/kexec.h  |  23 ++
>  arch/s390/include/asm/purgatory.h  |  17 ++
>  arch/s390/include/asm/setup.h  |  40 ++-
>  arch/s390/kernel/Makefile  |   1 +
>  arch/s390/kernel/asm-offsets.c |   5 +
>  arch/s390/kernel/compat_wrapper.c  |   1 +
>  arch/s390/kernel/kexec_elf.c   | 149 ++
>  arch/s390/kernel/kexec_image.c |  78 +
>  arch/s390/kernel/machine_kexec_file.c  | 291 +++
>  arch/s390/kernel/syscalls/syscall.tbl  |   1 +
>  arch/s390/purgatory/Makefile   |  37 +++
>  arch/s390/purgatory/head.S | 279 ++
>  arch/s390/purgatory/purgatory.c|  42 +++
>  arch/x86/kernel/kexec-bzimage64.c  |   8 +-
>  arch/x86/kernel/machine_kexec_64.c |  66 ++---
>  arch/x86/purgatory/Makefile|   3 +
>  arch/x86/purgatory/purgatory.c |   2 +-
>  include/linux/kexec.h  |  38 +--
>  {arch/x86/purgatory => include/linux}/sha256.h |  10 +-
>  kernel/kexec_file.c| 375 -
>  {arch/x86/purgatory => lib}/sha256.c   |   4 +-
>  24 files changed, 1200 insertions(+), 284 deletions(-)
>  create mode 100644 arch/s390/include/asm/purgatory.h
>  create mode 100644 arch/s390/kernel/kexec_elf.c
>  create mode 100644 arch/s390/kernel/kexec_image.c
>  create mode 100644 arch/s390/kernel/machine_kexec_file.c
>  create mode 100644 arch/s390/purgatory/Makefile
>  create mode 100644 arch/s390/purgatory/head.S
>  create mode 100644 arch/s390/purgatory/purgatory.c
>  rename {arch/x86/purgatory => include/linux}/sha256.h (63%)
>  rename {arch/x86/purgatory => lib}/sha256.c (99%)
> 
> -- 
> 2.13.5
> 
>

Re: [PATCH] x86/entry/64: Fix CR3 restore order in paranoid_exit()

2018-02-13 Thread Ingo Molnar

* Josh Poimboeuf  wrote:

> I haven't actually seen any real-world bugs caused by this, so I'm not
> sure how theoretical it is.  I just stumbled upon it in code review when
> looking for another bug.

I believe it's a real bug, but the fix is wrong with irq tracing or lockdep 
enabled as Dave points out.

I think the reason we haven't seen this bug yet is that "paranoid" entry points 
are limited to:

idtentry double_fault   do_double_fault  has_error_code=1 paranoid=2
idtentry debug          do_debug         has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
idtentry int3           do_int3          has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
idtentry machine_check  do_mce           has_error_code=0 paranoid=1

Only machine_check is one that will interrupt an IRQS-off critical section 
asynchronously - and machine check events are rare.

The other main asynchronous entries are NMI entries, which can be very
high-freq with perf profiling, but they are special: they don't use the
'idtentry' macro but are open coded and restore user CR3 unconditionally so
don't seem to have this bug.

Thanks,

Ingo

Re: [PATCH 08/12] Drivers: hv: vmbus: Implement Direct Mode for stimer0

2018-02-13 Thread Dan Carpenter

On Wed, Feb 14, 2018 at 02:58:41AM +, Michael Kelley (EOSG) wrote:
> > -Original Message-
> > From: Dan Carpenter 
> > Sent: Monday, February 12, 2018 12:42 AM
> > To: KY Srinivasan ; Stephen Hemminger
> > 
> > Cc: gre...@linuxfoundation.org; linux-kernel@vger.kernel.org; 
> > de...@linuxdriverproject.org;
> > o...@aepfle.de; a...@canonical.com; vkuzn...@redhat.com; 
> > jasow...@redhat.com;
> > leann.ogasaw...@canonical.com; marcelo.ce...@canonical.com; Stephen 
> > Hemminger
> > ; Michael Kelley (EOSG) 
> > 
> > Subject: Re: [PATCH 08/12] Drivers: hv: vmbus: Implement Direct Mode for 
> > stimer0
> > 
> > On Sun, Feb 11, 2018 at 05:33:16PM -0700, k...@exchange.microsoft.com wrote:
> > > @@ -116,9 +146,29 @@ static int hv_ce_set_oneshot(struct clock_event_device *evt)
> > >  {
> > >   union hv_timer_config timer_cfg;
> > >
> > > + timer_cfg.as_uint64 = 0;
> > >   timer_cfg.enable = 1;
> > >   timer_cfg.auto_enable = 1;
> > > - timer_cfg.sintx = VMBUS_MESSAGE_SINT;
> > > + if (direct_mode_enabled)
> > > + /*
> > > +  * When it expires, the timer will directly interrupt
> > > +  * on the specified hardware vector/IRQ.
> > > +  */
> > > + {
> > > + timer_cfg.direct_mode = 1;
> > > + timer_cfg.apic_vector = stimer0_vector;
> > > + hv_enable_stimer0_percpu_irq(stimer0_irq);
> > > + }
> > > + else
> > > + /*
> > > +  * When it expires, the timer will generate a VMbus message,
> > > +  * to be handled by the normal VMbus interrupt handler.
> > > +  */
> > > + {
> > > + timer_cfg.direct_mode = 0;
> > > + timer_cfg.sintx = VMBUS_MESSAGE_SINT;
> > > + }
> > > +
> > 
> > This indenting isn't right.  We should probably zero out .apic_vector
> > if .direct_mode is zero.  Or maybe it's fine.  I don't know if any
> > static analysis tools will complain...
> 
> I'll fix the indenting.  Old habits 
> 
> The "timer_cfg.as_uint64 = 0" statement already zeroes out .apic_vector
> along with all the other unused fields in the 64-bit value, as required by
> the Hyper-V spec.

Ah, you're right, of course.

regards,
dan carpenter

Re: [PATCH] x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME

2018-02-13 Thread Kai Huang

On Tue, 2018-02-13 at 22:57 -0600, Tom Lendacky wrote:
> On 2/13/2018 10:21 PM, Kirill A. Shutemov wrote:
> > On Tue, Feb 13, 2018 at 10:10:22PM -0600, Tom Lendacky wrote:
> > > On 2/8/2018 6:55 AM, Kirill A. Shutemov wrote:
> > > > AMD SME claims one bit from physical address to indicate whether the
> > > > page is encrypted or not. To achieve that we clear out the bit from
> > > > __PHYSICAL_MASK.
> > > 
> > > I was actually working on a suggestion by Linus to use one of the
> > > software page table bits to indicate encryption and translate that to
> > > the hardware bit when writing the actual page table entry.  With that,
> > > __PHYSICAL_MASK would go back to its original definition.
> > 
> > But you would need to mask it on reading of pfn from page table entry,
> > right? I expect it to have more overhead than this one.
> 
> When reading back an entry it would translate the hardware bit position
> back to the software bit position.  The suggestion for changing it was
> to make _PAGE_ENC a constant and not tied to the sme_me_mask.
> 
> See https://marc.info/?l=linux-kernel&m=151017622615894&w=2
> 
> > 
> > And software bits are valuable. Do we still have a spare one for
> > this?
> 
> I was looking at possibly using bit 57 (_PAGE_BIT_SOFTW5).

But MK-TME supports up to 15 bits (architecturally) as keyID. How is this
supposed to work with MK-TME?

Thanks,
-Kai
> 
> Thanks,
> Tom
> 
> >

FW: WWW message.

2018-02-13 Thread Martin Kiefer

Dear Sir or Madam,

Having visited your homepage, we would like to present an offer of products
that will enable you to significantly increase sales of your products and
services.

I am offering you the brand-new address catalog of Swiss companies, which
contains direct contact details for company owners and managers.

The new catalog contains 187,764 Swiss companies and provides data such as:
company name, company address, contact details of the company owner or
manager, e-mail address, telephone number, fax number, industry, etc.

With this large directory, your offer will reach more than a million
companies and people who will become your new customers.

*** Switzerland 2018 ( 187 764 ) 149 € * until 14.02.2018 ***


http://www.ch-contact.net/?page=catalog

Kind regards,
Martin Kiefer

Re: [PATCH] x86/spectre: fix an error message

2018-02-13 Thread Joe Perches

On Wed, 2018-02-14 at 10:14 +0300, Dan Carpenter wrote:
> If i == ARRAY_SIZE(mitigation_options) then we accidentally print
> garbage from one space beyond the end of the mitigation_options[] array.
> 
> Fixes: 9005c6834c0f ("x86/spectre: Simplify spectre_v2 command line parsing")
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index debcdda88560..acee4ebec04f 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -174,7 +174,7 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
>  		}
>  
>  		if (i >= ARRAY_SIZE(mitigation_options)) {
> -			pr_err("unknown option (%s). Switching to AUTO select\n", mitigation_options[i].option);
> +			pr_err("unknown option (%s). Switching to AUTO select\n", arg);
>  			return SPECTRE_V2_CMD_AUTO;
>  		}
>  	}

Should probably unindent this block too by
removing the else after the return

---
 arch/x86/kernel/cpu/bugs.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 4acf16a76d1e..0af8245afb6c 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -161,22 +161,21 @@ static enum spectre_v2_mitigation_cmd __init 
spectre_v2_parse_cmdline(void)
 
if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
return SPECTRE_V2_CMD_NONE;
-   else {
-   ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, 
sizeof(arg));
-   if (ret < 0)
-   return SPECTRE_V2_CMD_AUTO;
-
-   for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
-   if (!match_option(arg, ret, 
mitigation_options[i].option))
-   continue;
-   cmd = mitigation_options[i].cmd;
-   break;
-   }
 
-   if (i >= ARRAY_SIZE(mitigation_options)) {
-   pr_err("unknown option (%s). Switching to AUTO 
select\n", mitigation_options[i].option);
-   return SPECTRE_V2_CMD_AUTO;
-   }
+   ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, 
sizeof(arg));
+   if (ret < 0)
+   return SPECTRE_V2_CMD_AUTO;
+
+   for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
+   if (!match_option(arg, ret, mitigation_options[i].option))
+   continue;
+   cmd = mitigation_options[i].cmd;
+   break;
+   }
+
+   if (i >= ARRAY_SIZE(mitigation_options)) {
+   pr_err("unknown option (%s). Switching to AUTO select\n", arg);
+   return SPECTRE_V2_CMD_AUTO;
}
 
if ((cmd == SPECTRE_V2_CMD_RETPOLINE ||

[PATCH] x86/spectre: fix an error message

2018-02-13 Thread Dan Carpenter

If i == ARRAY_SIZE(mitigation_options) then we accidentally print
garbage from one space beyond the end of the mitigation_options[] array.

Fixes: 9005c6834c0f ("x86/spectre: Simplify spectre_v2 command line parsing")
Signed-off-by: Dan Carpenter 

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index debcdda88560..acee4ebec04f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -174,7 +174,7 @@ static enum spectre_v2_mitigation_cmd __init 
spectre_v2_parse_cmdline(void)
}
 
if (i >= ARRAY_SIZE(mitigation_options)) {
-   pr_err("unknown option (%s). Switching to AUTO 
select\n", mitigation_options[i].option);
+   pr_err("unknown option (%s). Switching to AUTO 
select\n", arg);
return SPECTRE_V2_CMD_AUTO;
}
}
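
For readers following the fix, here is a minimal userspace sketch of the
lookup pattern the patch corrects. The table, enum and function names
(demo_options, demo_cmd, demo_parse) are hypothetical stand-ins, not the
kernel's, and strcmp() is assumed in place of the kernel's match_option().
When the loop finds nothing, i equals ARRAY_SIZE(), so only the caller's
string can safely be printed.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the kernel's command enum and option table. */
enum demo_cmd { DEMO_CMD_AUTO, DEMO_CMD_FORCE, DEMO_CMD_RETPOLINE };

static const struct {
	const char *option;
	enum demo_cmd cmd;
} demo_options[] = {
	{ "auto",      DEMO_CMD_AUTO },
	{ "on",        DEMO_CMD_FORCE },
	{ "retpoline", DEMO_CMD_RETPOLINE },
};

/*
 * Look up arg in demo_options[]. On a miss the loop leaves
 * i == ARRAY_SIZE(demo_options), so demo_options[i] must not be touched;
 * the message reports the caller's string instead.
 */
static enum demo_cmd demo_parse(const char *arg)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(demo_options); i++) {
		if (strcmp(arg, demo_options[i].option) == 0)
			break;
	}

	if (i >= ARRAY_SIZE(demo_options)) {
		fprintf(stderr, "unknown option (%s). Switching to AUTO select\n", arg);
		return DEMO_CMD_AUTO;
	}

	return demo_options[i].cmd;
}
```

Calling demo_parse("bogus") takes the error path and falls back to
DEMO_CMD_AUTO without reading past the end of the table.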

[PATCH] perf test: Fix test case inet_pton to accept inlines.

2018-02-13 Thread Thomas Richter

Using Fedora 27 and the latest Linux kernel, the test case
trace+probe_libc_inet_pton.sh fails again on s390.
This time it is the inlining of functions that does not match.
After an update of glibc (from 2.26-16 to 2.26-24)
the output is different.

The expected output is:
   __inet_pton (/usr/lib64/libc-2.26.so)
   gaih_inet (inlined)
   

The actual output is:
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
 0.000 probe_libc:inet_pton:(3ffb2140448))
   __inet_pton (inlined)
   gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
   ...

Fix this by being less strict about 'inlined' versus library
name and accept both.

Signed-off-by: Thomas Richter 
---
 tools/perf/tests/shell/trace+probe_libc_inet_pton.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh 
b/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
index c446c894b297..8c4ab0b390c0 100755
--- a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
+++ b/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
@@ -21,12 +21,12 @@ trace_libc_inet_pton_backtrace() {
expected[3]=".*packets transmitted.*"
expected[4]="rtt min.*"

expected[5]="[0-9]+\.[0-9]+[[:space:]]+probe_libc:inet_pton:\([[:xdigit:]]+\)"
-   expected[6]=".*inet_pton[[:space:]]\($libc\)$"
+   expected[6]=".*inet_pton[[:space:]]\($libc|inlined\)$"
case "$(uname -m)" in
s390x)
eventattr='call-graph=dwarf'
-   expected[7]="gaih_inet[[:space:]]\(inlined\)$"
-   expected[8]="__GI_getaddrinfo[[:space:]]\(inlined\)$"
+   expected[7]="gaih_inet.*[[:space:]]\($libc|inlined\)$"
+   expected[8]="__GI_getaddrinfo[[:space:]]\($libc|inlined\)$"
expected[9]="main[[:space:]]\(.*/bin/ping.*\)$"
expected[10]="__libc_start_main[[:space:]]\($libc\)$"
expected[11]="_start[[:space:]]\(.*/bin/ping.*\)$"
-- 
2.14.3
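
As an illustration of the relaxed match, the sketch below checks two sample
backtrace lines against one pattern, assuming the test's patterns are
interpreted as POSIX extended regular expressions. The helper name
demo_line_matches, the hard-coded libc path standing in for $libc, and the
extra inner group around the alternation are assumptions made for this
example, not part of the patch.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if line matches the POSIX extended regex pattern. */
static bool demo_line_matches(const char *pattern, const char *line)
{
	regex_t re;
	bool ok;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	ok = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}

/*
 * Assumed expansion of $libc. The inner group is added for illustration so
 * that the alternation only covers the text between the parentheses.
 */
static const char demo_pattern[] =
	"gaih_inet.*[[:space:]]\\((/usr/lib64/libc-2\\.26\\.so|inlined)\\)$";
```

With this pattern both "gaih_inet (inlined)" and
"gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)" are accepted, while a frame
attributed to some other object is still rejected.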

Re: [PATCH v3 1/5] ALSA: emu10k1: remove reserved_page

2018-02-13 Thread Takashi Iwai

On Wed, 14 Feb 2018 00:04:58 +0100,
 Maciej S. Szmigiero  wrote:
> 
> The emu10k1-family chips need the first page (index 0) reserved in their
> page tables for some reason (every emu10k1 driver I've checked does this
> without much of an explanation).
> Using the first page for normal samples results in a broken playback.
> 
> However, we already have a dummy page allocated - so called "silent page"
> and, in fact, had always been setting it as the first page in the chip page
> table because an initialization of every entry of the page table to point
> to a silent page happens after and overwrites the reserved_page allocation.
> 
> So the only thing remaining to remove the reserved_page allocation is a
> trivial change to the page allocation logic to ignore the first page entry
> and start its allocations from the second entry (index 1).
> 
> Signed-off-by: Maciej S. Szmigiero 
> ---
> Changes from v1, v2: None in this patch.

Thanks, applied all 5 patches now.


Takashi

[RFT PATCH] drm/msm: Trigger fence completion from GPU

2018-02-13 Thread Bjorn Andersson

Interrupt commands cause the CP to trigger an interrupt as the command
is processed, regardless of the GPU being done processing previous
commands. This is seen by the interrupt being delivered before the
fence is written on 8974 and is likely the cause of the additional
CP_WAIT_FOR_IDLE workaround found for a306, which would cause the CP to
wait for the GPU to go idle before triggering the interrupt.

Instead we can set the (undocumented) BIT(31) of the CACHE_FLUSH_TS
which will cause a special CACHE_FLUSH_TS interrupt to be triggered from
the GPU as the write event is processed.

Add CACHE_FLUSH_TS to the IRQ masks of A3xx and A4xx and remove the
workaround for A306.

Suggested-by: Jordan Crouse 
Signed-off-by: Bjorn Andersson 
---

This is only tested on 8974.

 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 18 ++
 3 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 4baef2738178..a3a43be920d0 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -35,6 +35,7 @@
 A3XX_INT0_CP_RB_INT | \
 A3XX_INT0_CP_REG_PROTECT_FAULT |  \
 A3XX_INT0_CP_AHB_ERROR_HALT | \
+A3XX_INT0_CACHE_FLUSH_TS |\
 A3XX_INT0_UCHE_OOB_ACCESS)
 
 extern bool hang_debug;
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 8199a4b9f2fa..b44cd0d90621 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -27,6 +27,7 @@
 A4XX_INT0_CP_RB_INT | \
 A4XX_INT0_CP_REG_PROTECT_FAULT |  \
 A4XX_INT0_CP_AHB_ERROR_HALT | \
+A4XX_INT0_CACHE_FLUSH_TS |\
 A4XX_INT0_UCHE_OOB_ACCESS)
 
 extern bool hang_debug;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index de63ff26a062..5806f9942514 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -293,26 +293,12 @@ void adreno_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
OUT_RING(ring, 0x00000000);
}
 
+   /* BIT(31) of CACHE_FLUSH_TS triggers CACHE_FLUSH_TS IRQ from GPU */
OUT_PKT3(ring, CP_EVENT_WRITE, 3);
-   OUT_RING(ring, CACHE_FLUSH_TS);
+   OUT_RING(ring, CACHE_FLUSH_TS | BIT(31));
OUT_RING(ring, rbmemptr(ring, fence));
OUT_RING(ring, submit->seqno);
 
-   /* we could maybe be clever and only CP_COND_EXEC the interrupt: */
-   OUT_PKT3(ring, CP_INTERRUPT, 1);
-   OUT_RING(ring, 0x80000000);
-
-   /* Workaround for missing irq issue on 8x16/a306.  Unsure if the
-* root cause is a platform issue or some a306 quirk, but this
-* keeps things humming along:
-*/
-   if (adreno_is_a306(adreno_gpu)) {
-   OUT_PKT3(ring, CP_WAIT_FOR_IDLE, 1);
-   OUT_RING(ring, 0x00000000);
-   OUT_PKT3(ring, CP_INTERRUPT, 1);
-   OUT_RING(ring, 0x80000000);
-   }
-
 #if 0
if (adreno_is_a3xx(adreno_gpu)) {
/* Dummy set-constant to trigger context rollover */
-- 
2.15.0

[PATCH -mm] proc: faster open/close of files without ->release hook

2018-02-13 Thread Alexey Dobriyan

The whole point of code in fs/proc/inode.c is to make sure ->release
hook is called either at close() or at rmmod time.

All of it is unnecessary if there is no ->release hook.

Save allocation+list manipulations under spinlock in that case.

Signed-off-by: Alexey Dobriyan 
---

 fs/proc/inode.c |   41 +++--
 1 file changed, 23 insertions(+), 18 deletions(-)

--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -342,31 +342,36 @@ static int proc_reg_open(struct inode *inode, struct file 
*file)
 *
 * Save every "struct file" with custom ->release hook.
 */
-   pdeo = kmalloc(sizeof(struct pde_opener), GFP_KERNEL);
-   if (!pdeo)
-   return -ENOMEM;
-
-   if (!use_pde(pde)) {
-   kfree(pdeo);
+   if (!use_pde(pde))
return -ENOENT;
-   }
-   open = pde->proc_fops->open;
+
release = pde->proc_fops->release;
+   if (release) {
+   pdeo = kmalloc(sizeof(struct pde_opener), GFP_KERNEL);
+   if (!pdeo) {
+   rv = -ENOMEM;
+   goto out_unuse;
+   }
+   }
 
+   open = pde->proc_fops->open;
if (open)
rv = open(inode, file);
 
-   if (rv == 0 && release) {
-   /* To know what to release. */
-   pdeo->file = file;
-   pdeo->closing = false;
-   pdeo->c = NULL;
-   spin_lock(&pde->pde_unload_lock);
-   list_add(&pdeo->lh, &pde->pde_openers);
-   spin_unlock(&pde->pde_unload_lock);
-   } else
-   kfree(pdeo);
+   if (release) {
+   if (rv == 0) {
+   /* To know what to release. */
+   pdeo->file = file;
+   pdeo->closing = false;
+   pdeo->c = NULL;
+   spin_lock(&pde->pde_unload_lock);
+   list_add(&pdeo->lh, &pde->pde_openers);
+   spin_unlock(&pde->pde_unload_lock);
+   } else
+   kfree(pdeo);
+   }
 
+out_unuse:
unuse_pde(pde);
return rv;
 }
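
The idea of the patch can be sketched in plain userspace C as follows. All
names here (demo_entry, demo_opener, demo_open and the two example hooks) are
hypothetical, malloc() stands in for kmalloc(), and the spinlock around the
list update is left out; this is only meant to show that entries without a
release hook skip both the allocation and the list manipulation.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the proc entry and its opener record. */
struct demo_opener {
	void *file;
	struct demo_opener *next;
};

struct demo_entry {
	int (*open)(void *file);
	int (*release)(void *file);
	struct demo_opener *openers;	/* only files that need ->release */
};

/* Example hooks used to exercise demo_open(). */
static int demo_ok_hook(void *file)   { (void)file; return 0; }
static int demo_fail_hook(void *file) { (void)file; return -EIO; }

/*
 * Allocate and queue an opener record only when a release hook exists.
 * Entries without one skip both the allocation and the list update.
 */
static int demo_open(struct demo_entry *e, void *file)
{
	struct demo_opener *o = NULL;
	int rv = 0;

	if (e->release) {
		o = malloc(sizeof(*o));
		if (!o)
			return -ENOMEM;
	}

	if (e->open)
		rv = e->open(file);

	if (e->release) {
		if (rv == 0) {
			/* Remember what to release later. */
			o->file = file;
			o->next = e->openers;
			e->openers = o;
		} else {
			free(o);
		}
	}
	return rv;
}
```

An entry whose release pointer is NULL goes straight to its open hook, which
is the fast path the patch is after.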

[PATCH] x86/apic: Make setup_local_APIC() static

2018-02-13 Thread Dou Liyang

This function isn't used outside of apic.c, so let's mark it static.

Signed-off-by: Dou Liyang 
---
 arch/x86/include/asm/apic.h | 1 -
 arch/x86/kernel/apic/apic.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 98722773391d..09d1c6ed2408 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -138,7 +138,6 @@ extern void lapic_shutdown(void);
 extern void sync_Arb_IDs(void);
 extern void init_bsp_APIC(void);
 extern void apic_intr_mode_init(void);
-extern void setup_local_APIC(void);
 extern void init_apic_mappings(void);
 void register_lapic_address(unsigned long address);
 extern void setup_boot_APIC_clock(void);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 6f59ac2238ab..3fda9734db25 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1464,7 +1464,7 @@ static void apic_pending_intr_clear(void)
  * Used to setup local APIC while initializing BSP or bringing up APs.
  * Always called with preemption disabled.
  */
-void setup_local_APIC(void)
+static void setup_local_APIC(void)
 {
int cpu = smp_processor_id();
unsigned int value;
-- 
2.14.3

[PATCH] x86/apic: Move pending intr check code into it's own function

2018-02-13 Thread Dou Liyang

The pending interrupt check code is mixed with the local APIC setup code,
which looks messy.

Extract the related code, move it into a new function named
apic_pending_intr_clear().

Signed-off-by: Dou Liyang 
---
 arch/x86/kernel/apic/apic.c | 98 -
 1 file changed, 52 insertions(+), 46 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 3fc259b4dd2d..6f59ac2238ab 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1408,6 +1408,56 @@ static void lapic_setup_esr(void)
oldvalue, value);
 }
 
+static void apic_pending_intr_clear(void)
+{
+   long long max_loops = cpu_khz ? cpu_khz : 100;
+   unsigned long long tsc = 0, ntsc;
+   unsigned int value, queued;
+   int i, j, acked = 0;
+
+   if (boot_cpu_has(X86_FEATURE_TSC))
+   tsc = rdtsc();
+   /*
+* After a crash, we no longer service the interrupts and a pending
+* interrupt from previous kernel might still have ISR bit set.
+*
+* Most probably by now CPU has serviced that pending interrupt and
+* it might not have done the ack_APIC_irq() because it thought,
+* interrupt came from i8259 as ExtInt. LAPIC did not get EOI so it
+* does not clear the ISR bit and cpu thinks it has already serivced
+* the interrupt. Hence a vector might get locked. It was noticed
+* for timer irq (vector 0x31). Issue an extra EOI to clear ISR.
+*/
+   do {
+   queued = 0;
+   for (i = APIC_ISR_NR - 1; i >= 0; i--)
+   queued |= apic_read(APIC_IRR + i*0x10);
+
+   for (i = APIC_ISR_NR - 1; i >= 0; i--) {
+   value = apic_read(APIC_ISR + i*0x10);
+   for (j = 31; j >= 0; j--) {
+   if (value & (1<<j)) {
+   ack_APIC_irq();
+   acked++;
+   }
+   }
+   }
+   if (acked > 256) {
+   printk(KERN_ERR "LAPIC pending interrupts after %d 
EOI\n",
+  acked);
+   break;
+   }
+   if (queued) {
+   if (boot_cpu_has(X86_FEATURE_TSC) && cpu_khz) {
+   ntsc = rdtsc();
+   max_loops = (cpu_khz << 10) - (ntsc - tsc);
+   } else
+   max_loops--;
+   }
+   } while (queued && max_loops > 0);
+   WARN_ON(max_loops <= 0);
+}
+
 /**
  * setup_local_APIC - setup the local APIC
  *
@@ -1417,13 +1467,7 @@ static void lapic_setup_esr(void)
 void setup_local_APIC(void)
 {
int cpu = smp_processor_id();
-   unsigned int value, queued;
-   int i, j, acked = 0;
-   unsigned long long tsc = 0, ntsc;
-   long long max_loops = cpu_khz ? cpu_khz : 100;
-
-   if (boot_cpu_has(X86_FEATURE_TSC))
-   tsc = rdtsc();
+   unsigned int value;
 
if (disable_apic) {
disable_ioapic_support();
@@ -1475,45 +1519,7 @@ void setup_local_APIC(void)
value &= ~APIC_TPRI_MASK;
apic_write(APIC_TASKPRI, value);
 
-   /*
-* After a crash, we no longer service the interrupts and a pending
-* interrupt from previous kernel might still have ISR bit set.
-*
-* Most probably by now CPU has serviced that pending interrupt and
-* it might not have done the ack_APIC_irq() because it thought,
-* interrupt came from i8259 as ExtInt. LAPIC did not get EOI so it
-* does not clear the ISR bit and cpu thinks it has already serivced
-* the interrupt. Hence a vector might get locked. It was noticed
-* for timer irq (vector 0x31). Issue an extra EOI to clear ISR.
-*/
-   do {
-   queued = 0;
-   for (i = APIC_ISR_NR - 1; i >= 0; i--)
-   queued |= apic_read(APIC_IRR + i*0x10);
-
-   for (i = APIC_ISR_NR - 1; i >= 0; i--) {
-   value = apic_read(APIC_ISR + i*0x10);
-   for (j = 31; j >= 0; j--) {
-   if (value & (1<<j)) {
-   ack_APIC_irq();
-   acked++;
-   }
-   }
-   }
-   if (acked > 256) {
-   printk(KERN_ERR "LAPIC pending interrupts after %d 
EOI\n",
-  acked);
-   break;
-   }
-   if (queued) {
-   if (boot_cpu_has(X86_FEATURE_TSC) && cpu_khz) {
-   ntsc = rdtsc();
-   max_loops = (cpu_khz << 10) - (ntsc - tsc);
-   } else
-   max_loops--;
-   }
-   } while (queued && max_loops > 0);
-   WARN_ON(max_loops <= 0);
+   apic_pending_intr_clear();
 
/*
 * Now that we are all set up, enable the APIC
-- 
2.14.3

HOPE TO HEAR FROM YOU

2018-02-13 Thread Miss Nadege

Hello dear how are you?

Nice to meet you,my name is Miss Nadege Yann, can we become friends? hope to 
hear from you so that we can know each other very well, love matters mostly in 
life,i will also send you my pictures and tell you more about myself, my email 
address is(missnade...@gmail.com)

waiting to hear from you soon.
Miss.Nadege Yann

[PATCHv3 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-13 Thread Sergey Senozhatsky

Not every object can share its zspage with other objects, e.g. when
the object is as big as a zspage or nearly as big as a zspage.
For such objects zsmalloc has a so-called huge class: every object
which belongs to the huge class consumes the entire zspage (which
consists of a single physical page). On an x86_64 box with
PAGE_SHIFT 12, the first non-huge class size is 3264, so starting
down from size 3264, objects can share page(-s) and thus minimize
memory wastage.

ZRAM, however, has its own statically defined watermark for huge
objects - "3 * PAGE_SIZE / 4 = 3072", and forcibly stores every
object larger than this watermark (3072) as a PAGE_SIZE object,
in other words, to a huge class, while zsmalloc can keep some of
those objects in non-huge classes. This results in increased
memory consumption.

zsmalloc knows better whether an object is huge or not. Introduce
the zs_huge_object() function, which tells whether the given object
can be stored in one of the non-huge classes. This will let us drop
ZRAM's huge object watermark and fully rely on zsmalloc when we
decide if the object is huge.

Signed-off-by: Sergey Senozhatsky 
---
 include/linux/zsmalloc.h |  2 ++
 mm/zsmalloc.c| 26 ++
 2 files changed, 28 insertions(+)

diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index 57a8e98f2708..9a1baf673cc1 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -47,6 +47,8 @@ void zs_destroy_pool(struct zs_pool *pool);
 unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t flags);
 void zs_free(struct zs_pool *pool, unsigned long obj);
 
+bool zs_huge_object(size_t sz);
+
 void *zs_map_object(struct zs_pool *pool, unsigned long handle,
enum zs_mapmode mm);
 void zs_unmap_object(struct zs_pool *pool, unsigned long handle);
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index c3013505c305..e43fc6ebb8e1 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -192,6 +192,7 @@ static struct vfsmount *zsmalloc_mnt;
  * (see: fix_fullness_group())
  */
 static const int fullness_threshold_frac = 4;
+static size_t zs_huge_class_size;
 
 struct size_class {
spinlock_t lock;
@@ -1417,6 +1418,28 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
 }
 EXPORT_SYMBOL_GPL(zs_unmap_object);
 
+/**
+ * zs_huge_object() - Test if a compressed object's size is too big for normal
+ *zspool classes and it should be stored in a huge class.
+ * @sz: Size of the compressed object (in bytes).
+ *
+ * The function checks if the object's size falls into huge_class
+ * area. We must take handle size into account and test the actual
+ * size we are going to use, because zs_malloc() unconditionally
+ * adds %ZS_HANDLE_SIZE before it performs &size_class lookup.
+ *
+ * Context: Any context.
+ *
+ * Return:
+ * * true  - The object is too big, it will be stored in the huge class.
+ * * false - The object will be stored in a normal zspool class.
+ */
+bool zs_huge_object(size_t sz)
+{
+   return sz + ZS_HANDLE_SIZE >= zs_huge_class_size;
+}
+EXPORT_SYMBOL_GPL(zs_huge_object);
+
 static unsigned long obj_malloc(struct size_class *class,
struct zspage *zspage, unsigned long handle)
 {
@@ -2404,6 +2427,9 @@ struct zs_pool *zs_create_pool(const char *name)
INIT_LIST_HEAD(&class->fullness_list[fullness]);
 
prev_class = class;
+   if (pages_per_zspage == 1 && objs_per_zspage == 1
+   && !zs_huge_class_size)
+   zs_huge_class_size = size;
}
 
/* debug only, don't abort if it fails */
-- 
2.16.1
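
The difference between ZRAM's static watermark and the check added by this
patch can be sketched in plain userspace C. This is only a model under stated
assumptions: the `model_*` names are hypothetical, the page size is fixed at
4096, and the huge class size is passed in as a parameter rather than being
computed by zs_create_pool() as in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for an x86_64 box with PAGE_SHIFT 12. */
#define MODEL_PAGE_SIZE   4096u
/* Assumption: the handle is one unsigned long, as ZS_HANDLE_SIZE suggests. */
#define MODEL_HANDLE_SIZE sizeof(unsigned long)

/* ZRAM's static watermark quoted in the changelog: 3 * PAGE_SIZE / 4. */
static bool model_zram_watermark_huge(size_t sz)
{
	return sz > 3 * MODEL_PAGE_SIZE / 4;
}

/*
 * Same comparison as zs_huge_object() in the patch, except that the
 * huge class size is an argument instead of a static variable.
 */
static bool model_zs_huge_object(size_t sz, size_t huge_class_size)
{
	return sz + MODEL_HANDLE_SIZE >= huge_class_size;
}
```

With an illustrative huge class size of 3280, a 3100-byte object is above the
3072 watermark, so ZRAM would store it as a full page, while the zsmalloc-side
check would still place it in a non-huge class.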

Re: [PATCH v7 31/37] MAINTAINERS: Add nds32

2018-02-13 Thread Greentime Hu

2018-02-14 0:02 GMT+08:00 Joe Perches :
> On Tue, 2018-02-13 at 17:09 +0800, Greentime Hu wrote:
>> Add a maintainer information for the nds32(Andes) architecture.
> []
>> diff --git a/MAINTAINERS b/MAINTAINERS
> []
>> @@ -868,6 +868,17 @@ X:   drivers/iio/*/adjd*
>>  F:   drivers/staging/iio/*/ad*
>>  F:   drivers/staging/iio/trigger/iio-trig-bfin-timer.c
>>
>> +ANDES ARCHITECTURE
>> +M:   Greentime Hu 
>> +M:   Vincent Chen 
>> +T:   git https://github.com/andestech/linux.git
>> +S:   Supported
>> +F:   arch/nds32
>
> This should have a trailing /
>
> F:  arch/nds32/

Thank you Joe.
I will add this trailing /
>
>> +F:   Documentation/devicetree/bindings/interrupt-controller/andestech,ativic32.txt
>> +F:   Documentation/devicetree/bindings/nds32

And here

>> +K:   nds32
>
> Perhaps this should be
>
> K:  \bnds32
>
> as there are some existing uses of nds32 in the current tree.
>
> or maybe case insensitive like
>
> K:  (?i:\bnds32)
> or
> K:  (?:\bnds32|\bNDS32)
>

I think it might be better to keep it as "nds32" because some intrinsic
functions are defined as __nds32__xxx.
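
The trade-off between the two keyword patterns can be illustrated with a small
model. It is only a sketch: get_maintainer.pl evaluates K: entries as Perl
regexes, whereas the hypothetical helpers below imitate a plain substring
match and a `\b`-anchored match by hand. Since `_` is a word character, there
is no word boundary in front of `nds32` inside `__nds32__xxx`.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* '_' counts as a word character for \b, just like letters and digits. */
static bool is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/* Models "K: nds32": the keyword may appear anywhere in the text. */
static bool matches_plain(const char *text, const char *kw)
{
	return strstr(text, kw) != NULL;
}

/* Models "K: \bnds32": the keyword must start at a word boundary. */
static bool matches_at_word_start(const char *text, const char *kw)
{
	const char *p = text;

	while ((p = strstr(p, kw)) != NULL) {
		if (p == text || !is_word_char(p[-1]))
			return true;
		p++;
	}
	return false;
}
```

In this model the plain pattern matches both `arch/nds32/` and `__nds32__xxx`,
while the anchored one matches only the former.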

Re: [PATCHv2 1/2] zsmalloc: introduce zs_huge_object() function

2018-02-13 Thread Sergey Senozhatsky

On (02/11/18 09:05), Mike Rapoport wrote:
[..]
> > +/**
> > + * zs_huge_object() - Test if a compressed object's size is too big for normal
> > + *zspool classes and it shall be stored in a huge class.
> 
> I think "it should be stored" is more appropriate
> 
> > + * @sz: Size of the compressed object (in bytes).
> > + *
> > + * The function checks if the object's size falls into huge_class
> > + * area. We must take handle size into account and test the actual
> > + * size we are going to use, because zs_malloc() unconditionally
> > + * adds %ZS_HANDLE_SIZE before it performs %size_class lookup.
> 
> ^ &size_class ;-)

I'm sorry, Mike. I got lost in branches/versions and sent out a
half-baked version.

-ss

[PATCH v5 5/6] x86/apic: Rename variable/function related to x86_io_apic_ops

2018-02-13 Thread Baoquan He

The names of x86_io_apic_ops and its two member variables are
misleading. The .read member reads an IO_APIC register, while .disable,
which hooks native_disable_io_apic/irq_remapping_disable_io_apic,
is actually used to restore the boot irq mode, not to disable the IO_APIC.

So rename x86_io_apic_ops to x86_apic_ops, since it handles not only
the IO_APIC but also the LAPIC.

Also rename its member variables and the related hooks.

Signed-off-by: Baoquan He 
---
 arch/x86/include/asm/io_apic.h  | 6 +++---
 arch/x86/include/asm/x86_init.h | 8 
 arch/x86/kernel/apic/io_apic.c  | 4 ++--
 arch/x86/kernel/x86_init.c  | 6 +++---
 arch/x86/xen/apic.c | 2 +-
 drivers/iommu/irq_remapping.c   | 4 ++--
 6 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 5e389145d808..06fec4426458 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -183,11 +183,11 @@ extern void disable_ioapic_support(void);
 
 extern void __init io_apic_init_mappings(void);
 extern unsigned int native_io_apic_read(unsigned int apic, unsigned int reg);
-extern void native_disable_io_apic(void);
+extern void native_restore_boot_irq_mode(void);
 
 static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
 {
-   return x86_io_apic_ops.read(apic, reg);
+   return x86_apic_ops.io_apic_read(apic, reg);
 }
 
 extern void setup_IO_APIC(void);
@@ -229,7 +229,7 @@ static inline void mp_save_irq(struct mpc_intsrc *m) { }
 static inline void disable_ioapic_support(void) { }
 static inline void io_apic_init_mappings(void) { }
 #define native_io_apic_read NULL
-#define native_disable_io_apic NULL
+#define native_restore_boot_irq_mode   NULL
 
 static inline void setup_IO_APIC(void) { }
 static inline void enable_IO_APIC(void) { }
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index fc2f082ac635..88306054bd98 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -274,16 +274,16 @@ struct x86_msi_ops {
void (*restore_msi_irqs)(struct pci_dev *dev);
 };
 
-struct x86_io_apic_ops {
-   unsigned int (*read)   (unsigned int apic, unsigned int reg);
-   void (*disable)(void);
+struct x86_apic_ops {
+   unsigned int (*io_apic_read)   (unsigned int apic, unsigned int reg);
+   void (*restore)(void);
 };
 
 extern struct x86_init_ops x86_init;
 extern struct x86_cpuinit_ops x86_cpuinit;
 extern struct x86_platform_ops x86_platform;
 extern struct x86_msi_ops x86_msi;
-extern struct x86_io_apic_ops x86_io_apic_ops;
+extern struct x86_apic_ops x86_apic_ops;
 
 extern void x86_early_init_platform_quirks(void);
 extern void x86_init_noop(void);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 9d86b10c2121..68129f11e7db 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1410,7 +1410,7 @@ void __init enable_IO_APIC(void)
clear_IO_APIC();
 }
 
-void native_disable_io_apic(void)
+void native_restore_boot_irq_mode(void)
 {
/*
 * If the i8259 is routed through an IOAPIC
@@ -1443,7 +1443,7 @@ void restore_boot_irq_mode(void)
if (!nr_legacy_irqs())
return;
 
-   x86_io_apic_ops.disable();
+   x86_apic_ops.restore();
 }
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 1151ccd72ce9..2bccd03bd654 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -146,7 +146,7 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
 }
 #endif
 
-struct x86_io_apic_ops x86_io_apic_ops __ro_after_init = {
-   .read   = native_io_apic_read,
-   .disable= native_disable_io_apic,
+struct x86_apic_ops x86_apic_ops __ro_after_init = {
+   .io_apic_read   = native_io_apic_read,
+   .restore= native_restore_boot_irq_mode,
 };
diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index de58533d3664..2163888497d3 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -215,7 +215,7 @@ static void __init xen_apic_check(void)
 }
 void __init xen_init_apic(void)
 {
-   x86_io_apic_ops.read = xen_io_apic_read;
+   x86_apic_ops.io_apic_read = xen_io_apic_read;
/* On PV guests the APIC CPUID bit is disabled so none of the
 * routines end up executing. */
if (!xen_initial_domain())
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index 49721b4e1975..496deee3ae3a 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -27,7 +27,7 @@ int disable_irq_post = 0;
 static int disable_irq_remap;
 static struct irq_remap_ops *remap_ops;
 
-static void irq_remapping_disable_io_apic(void)
+static void irq_remapping_restore_boot_irq_mode(void)
 {
/*
 * With interrupt-remapping, for now we will use virtual wire

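The renamed ops table follows the usual function-pointer pattern: a default
set of native hooks that a platform may override at init time, as the Xen hunk
does for the read hook. A minimal userspace sketch of that pattern, with
hypothetical `model_*` names and dummy return values that do not reflect real
register contents, could look like this:

```c
/* Model of the renamed ops table: one IO-APIC reader, one restore hook. */
struct model_apic_ops {
	unsigned int (*io_apic_read)(unsigned int apic, unsigned int reg);
	void (*restore)(void);
};

static int model_boot_irq_mode_restored;

/* Stand-in for native_io_apic_read(); returns a recognisable dummy value. */
static unsigned int model_native_read(unsigned int apic, unsigned int reg)
{
	return (apic << 8) | reg;
}

/* Stand-in for a platform override; returns a fixed dummy value. */
static unsigned int model_pv_read(unsigned int apic, unsigned int reg)
{
	(void)apic;
	(void)reg;
	return 0xffffffffu;
}

/* Stand-in for native_restore_boot_irq_mode(). */
static void model_native_restore(void)
{
	model_boot_irq_mode_restored = 1;
}

static struct model_apic_ops model_ops = {
	.io_apic_read = model_native_read,
	.restore      = model_native_restore,
};

/* Callers go through the table, mirroring io_apic_read() in the patch. */
static unsigned int model_io_apic_read(unsigned int apic, unsigned int reg)
{
	return model_ops.io_apic_read(apic, reg);
}
```

Callers never name the native or overriding function directly, which is why
the rename only has to touch the table definition, its users and the hooks.
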
[PATCH v5 6/6] x86/apic: Set up through-local-APIC on boot CPU if 'noapic' specified

2018-02-13 Thread Baoquan He

Currently the kdump kernel becomes very slow if 'noapic' is specified.
The normal kernel does not.

The kernel parameter 'noapic' is used to disable the IO-APIC in the
system for testing or special purposes. Here the root cause is that in
the kdump kernel the LAPIC is disabled since commit 522e664644
("x86/apic: Disable I/O APIC before shutdown of the local APIC").
In this case we need to set up through-local-APIC on the boot CPU in
setup_local_APIC().

In the normal kernel, by contrast, the legacy irq mode is enabled by
the BIOS. If it is virtual wire mode, the local APIC has already been
enabled and set up as through-local-APIC.

Although we fix the regression introduced by the offending commit
522e664644, for safety and clarity it is better to set up
through-local-APIC explicitly rather than rely on the default boot
irq mode.

Do it now.

Signed-off-by: Baoquan He 
---
 arch/x86/kernel/apic/apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 25ddf02598d2..3fc259b4dd2d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1570,7 +1570,7 @@ void setup_local_APIC(void)
 * TODO: set up through-local-APIC from through-I/O-APIC? --macro
 */
value = apic_read(APIC_LVT0) & APIC_LVT_MASKED;
-   if (!cpu && (pic_mode || !value)) {
+   if (!cpu && (pic_mode || !value || skip_ioapic_setup)) {
value = APIC_DM_EXTINT;
apic_printk(APIC_VERBOSE, "enabled ExtINT on CPU#%d\n", cpu);
} else {
-- 
2.13.6
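
The one-line change to the LVT0 decision can be written out as a pair of
predicates. This is a sketch with hypothetical function names; the parameters
stand for `cpu`, `pic_mode`, the masked bit read from APIC_LVT0, and
`skip_ioapic_setup`, which is assumed here to be the flag set by 'noapic'.

```c
#include <stdbool.h>

/* Condition before the patch: ExtINT only in PIC mode or if LVT0 is unmasked. */
static bool model_extint_before(int cpu, bool pic_mode, bool lvt0_masked)
{
	return cpu == 0 && (pic_mode || !lvt0_masked);
}

/* Condition after the patch: 'noapic' also selects ExtINT on the boot CPU. */
static bool model_extint_after(int cpu, bool pic_mode, bool lvt0_masked,
			       bool skip_ioapic_setup)
{
	return cpu == 0 && (pic_mode || !lvt0_masked || skip_ioapic_setup);
}
```

The case the patch targets is the boot CPU with LVT0 masked and no PIC mode:
the old condition leaves ExtINT masked, the new one enables it when 'noapic'
is given. Other CPUs are unaffected in both versions.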

[PATCH v5 3/6] x86/apic: Fix restoring boot irq mode in reboot and kexec/kdump

2018-02-13 Thread Baoquan He

This is a regression fix.

Previously, to fix erratum AVR31, commit 522e66464467 ("x86/apic: Disable
I/O APIC before shutdown of the local APIC") moved the lapic_shutdown()
call after disable_IO_APIC() in the reboot and kexec/kdump code paths.
This introduced a regression. The root cause is that disable_IO_APIC()
not only clears the IO_APIC but also restores the boot irq mode by setting
LAPIC/APIC/IMCR. Calling lapic_shutdown() after disable_IO_APIC() disables
the LAPIC and ruins the possible virtual wire mode setting which the code
has been trying to establish all along.

As a consequence, a KVM guest always prints the warning below while the
kexec/kdump kernel boots up. It comes from setup_local_APIC(), where the
'do { xxx } while (queued && max_loops > 0)' loop no longer works properly
if a pending irq exists in the APIC IRR after the LAPIC has been disabled.

[0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 
setup_local_APIC+0x228/0x330
[0.001000] Modules linked in:
[0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
[0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1.fc26 04/01/2014
[0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
[0.001000] RSP: :b6e03eb8 EFLAGS: 00010286
[0.001000] RAX: 009edb4c4d84 RBX:  RCX: b099d800
[0.001000] RDX: 009e RSI:  RDI: 0810
[0.001000] RBP:  R08:  R09: 0001
[0.001000] R10: 98ce6a801c00 R11: 0761076d072f0776 R12: 0001
[0.001000] R13: 00f0 R14: 4000 R15: c6ff
[0.001000] FS:  () GS:98ce6bc0() 
knlGS:
[0.001000] CS:  0010 DS:  ES:  CR0: 80050033
[0.001000] CR2:  CR3: 22209000 CR4: 000406b0
[0.001000] Call Trace:
[0.001000]  apic_bsp_setup+0x56/0x74
[0.001000]  x86_late_time_init+0x11/0x16
[0.001000]  start_kernel+0x3c9/0x486
[0.001000]  secondary_startup_64+0xa5/0xb0
[0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 
7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
[0.001000] ---[ end trace b88e71b9a6ebebdd ]---
[0.001000] masked ExtINT on CPU#0

To fix this, just call clear_IO_APIC() to stop IO_APIC where
disable_IO_APIC() was called, and call restore_boot_irq_mode() to
restore boot irq mode before reboot or kexec/kdump jump.

Signed-off-by: Baoquan He 
Fixes: commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC")
Cc: sta...@vger.kernel.org
---
v4->v5:
  Take the change related to KEXEC_JUMP out to new patch 0002.

v3->v4:
  Eric pointed out the change related to KEXEC_JUMP is not right.
  Correct it.

  Add Fixes tag and Cc to stable.

 arch/x86/kernel/crash.c  | 3 ++-
 arch/x86/kernel/reboot.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..1f6680427ff0 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -199,9 +199,10 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 #ifdef CONFIG_X86_IO_APIC
/* Prevent crash_kexec() from deadlocking on ioapic_lock. */
ioapic_zap_locks();
-   disable_IO_APIC();
+   clear_IO_APIC();
 #endif
lapic_shutdown();
+   restore_boot_irq_mode();
 #ifdef CONFIG_HPET_TIMER
hpet_disable();
 #endif
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 2126b9d27c34..725624b6c0c0 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -666,7 +666,7 @@ void native_machine_shutdown(void)
 * Even without the erratum, it still makes sense to quiet IO APIC
 * before disabling Local APIC.
 */
-   disable_IO_APIC();
+   clear_IO_APIC();
 #endif
 
 #ifdef CONFIG_SMP
@@ -680,6 +680,7 @@ void native_machine_shutdown(void)
 #endif
 
lapic_shutdown();
+   restore_boot_irq_mode();
 
 #ifdef CONFIG_HPET_TIMER
hpet_disable();
-- 
2.13.6
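
Why the call order matters can be shown with a toy state machine. This is a
deliberately simplified sketch: the `model_*` functions are hypothetical
stand-ins, and the only thing tracked is whether the virtual wire setup
survives until the jump to the next kernel.

```c
#include <stdbool.h>

/* Toy state: only tracks whether virtual wire mode is still in place. */
struct model_irq_state {
	bool ioapic_cleared;
	bool virtual_wire;
};

static void model_clear_IO_APIC(struct model_irq_state *s)
{
	s->ioapic_cleared = true;
}

static void model_restore_boot_irq_mode(struct model_irq_state *s)
{
	s->virtual_wire = true;
}

/* Shutting the LAPIC down undoes any virtual wire setup done before it. */
static void model_lapic_shutdown(struct model_irq_state *s)
{
	s->virtual_wire = false;
}

/* Order since commit 522e66464467: disable_IO_APIC() = clear + restore. */
static void model_shutdown_before_fix(struct model_irq_state *s)
{
	model_clear_IO_APIC(s);
	model_restore_boot_irq_mode(s);
	model_lapic_shutdown(s);
}

/* Order proposed by this patch. */
static void model_shutdown_after_fix(struct model_irq_state *s)
{
	model_clear_IO_APIC(s);
	model_lapic_shutdown(s);
	model_restore_boot_irq_mode(s);
}
```

Both sequences quiet the IO_APIC before the LAPIC is shut down, which is what
the erratum workaround needs, but only the second leaves the boot irq mode
restored at the end.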

[PATCH v5 0/6] x86/apic: Fix restoring boot irq mode in reboot and kexec/kdump

2018-02-13 Thread Baoquan He

This is the v5 post. The newly added patch 0002 contains the change
related to the KEXEC_JUMP path. Patch 0003 now contains only the
regression fix.

A regression was introduced by the commit below:
commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC")

It broke our attempt to restore the boot irq mode in reboot and
kexec/kdump. Details can be found in patch 0003.

The warning is always seen during kdump kernel boot on the qemu/kvm
platform. Our customer even saw the kdump kernel hang occasionally,
about once in ~30 attempts, during stress testing of kdump on a KVM
machine.

v4->v5:
  Take out the change related to KEXEC_JUMP to a new patch 0002
  according to Eric's suggestion.
  Patch 0003 in this series only includes the regression fix.

v3->v4:
  Eric pointed out that in patch 0002 the change related to
  KEXEC_JUMP is not right.
  Correct it.

  Add Fixes tag and Cc to stable.

v2->v3:
  Change as Eric suggested:

  Rearrange patches and change code and messy function/variable naming.
  Change patch subject and log to make it more understandable. 

Baoquan He (6):
  x86/apic: Split out restore_boot_irq_mode from disable_IO_APIC
  x86/apic: Replace disable_IO_APIC for KEXEC_JUMP
  x86/apic: Fix restoring boot irq mode in reboot and kexec/kdump
  x86/apic: Remove useless disable_IO_APIC
  x86/apic: Rename variable/function related to x86_io_apic_ops
  x86/apic: Set up through-local-APIC on boot CPU if 'noapic' specified

 arch/x86/include/asm/io_apic.h |  9 +
 arch/x86/include/asm/x86_init.h|  8 
 arch/x86/kernel/apic/apic.c|  2 +-
 arch/x86/kernel/apic/io_apic.c | 16 
 arch/x86/kernel/crash.c|  3 ++-
 arch/x86/kernel/machine_kexec_32.c |  8 
 arch/x86/kernel/machine_kexec_64.c |  8 
 arch/x86/kernel/reboot.c   |  3 ++-
 arch/x86/kernel/x86_init.c |  6 +++---
 arch/x86/xen/apic.c|  2 +-
 drivers/iommu/irq_remapping.c  |  4 ++--
 11 files changed, 32 insertions(+), 37 deletions(-)

-- 
2.13.6

[PATCH v5 3/6] x86/apic: Fix restoring boot irq mode in reboot and kexec/kdump

2018-02-13 Thread Baoquan He

This is a regression fix.

Before, to fix erratum AVR31, commit 522e66464467 ("x86/apic: Disable
I/O APIC before shutdown of the local APIC") moved lapic_shutdown()
calling after disable_IO_APIC() in reboot and kexec/kdump code path.
This introdued a regression. The root cause is that disable_IO_APIC()
not only clears IO_APIC, also restore boot irq mode by setting
LAPIC/APIC/IMCR, calling lapic_shutdown() after disable_IO_APIC() will
disable LAPIC and ruin the possible virtual wire mode setting which
the code has been trying to do all along.

The consequence is, in KVM guest kernel always prints warning as below
during kexec/kdump kernel boots up. That happened in setup_local_APIC()
since 'do { xxx } while (queued && max_loops > 0)' loop does not function
well any more if pending irq exists in APIC IRR after LAPIC is disabled.

[0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 
setup_local_APIC+0x228/0x330
[0.001000] Modules linked in:
[0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
[0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1.fc26 04/01/2014
[0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
[0.001000] RSP: :b6e03eb8 EFLAGS: 00010286
[0.001000] RAX: 009edb4c4d84 RBX:  RCX: b099d800
[0.001000] RDX: 009e RSI:  RDI: 0810
[0.001000] RBP:  R08:  R09: 0001
[0.001000] R10: 98ce6a801c00 R11: 0761076d072f0776 R12: 0001
[0.001000] R13: 00f0 R14: 4000 R15: c6ff
[0.001000] FS:  () GS:98ce6bc0() 
knlGS:
[0.001000] CS:  0010 DS:  ES:  CR0: 80050033
[0.001000] CR2:  CR3: 22209000 CR4: 000406b0
[0.001000] Call Trace:
[0.001000]  apic_bsp_setup+0x56/0x74
[0.001000]  x86_late_time_init+0x11/0x16
[0.001000]  start_kernel+0x3c9/0x486
[0.001000]  secondary_startup_64+0xa5/0xb0
[0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 
7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
[0.001000] ---[ end trace b88e71b9a6ebebdd ]---
[0.001000] masked ExtINT on CPU#0

To fix this, just call clear_IO_APIC() to stop IO_APIC where
disable_IO_APIC() was called, and call restore_boot_irq_mode() to
restore boot irq mode before reboot or kexec/kdump jump.

Signed-off-by: Baoquan He 
Fixes: 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC")
Cc: sta...@vger.kernel.org
---
v4->v5:
  Take the change related to KEXEC_JUMP out to new patch 0002.

v3->v4:
  Eric pointed out the change related to KEXEC_JUMP is not right.
  Correct it.

  Add Fixes tag and Cc to stable.

 arch/x86/kernel/crash.c  | 3 ++-
 arch/x86/kernel/reboot.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 10e74d4778a1..1f6680427ff0 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -199,9 +199,10 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 #ifdef CONFIG_X86_IO_APIC
/* Prevent crash_kexec() from deadlocking on ioapic_lock. */
ioapic_zap_locks();
-   disable_IO_APIC();
+   clear_IO_APIC();
 #endif
lapic_shutdown();
+   restore_boot_irq_mode();
 #ifdef CONFIG_HPET_TIMER
hpet_disable();
 #endif
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 2126b9d27c34..725624b6c0c0 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -666,7 +666,7 @@ void native_machine_shutdown(void)
 * Even without the erratum, it still makes sense to quiet IO APIC
 * before disabling Local APIC.
 */
-   disable_IO_APIC();
+   clear_IO_APIC();
 #endif
 
 #ifdef CONFIG_SMP
@@ -680,6 +680,7 @@ void native_machine_shutdown(void)
 #endif
 
lapic_shutdown();
+   restore_boot_irq_mode();
 
 #ifdef CONFIG_HPET_TIMER
hpet_disable();
-- 
2.13.6
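
As a reading aid for the patch above, the call order it establishes in
native_machine_shutdown() and native_machine_crash_shutdown() can be modelled
in user space. This is only a sketch under stated assumptions: every demo_*
name is hypothetical, the functions are stand-ins that record their call
order rather than the kernel's clear_IO_APIC(), lapic_shutdown() and
restore_boot_irq_mode(), and the single flag used for "boot irq mode" is a
simplification made for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Toy state (hypothetical): LAPIC enable bit, boot irq mode, call trace. */
struct demo_apic_state {
	bool lapic_enabled;
	bool boot_irq_mode;
	char trace[64];		/* call order, e.g. "clear,lapic,restore" */
};

/* Append a tag to the comma-separated trace. */
static void demo_trace(struct demo_apic_state *s, const char *tag)
{
	if (s->trace[0] != '\0')
		strncat(s->trace, ",", sizeof(s->trace) - strlen(s->trace) - 1);
	strncat(s->trace, tag, sizeof(s->trace) - strlen(s->trace) - 1);
}

/* Stand-in for clear_IO_APIC(): only quiesces the IO-APIC. */
static void demo_clear_io_apic(struct demo_apic_state *s)
{
	demo_trace(s, "clear");
}

/*
 * Stand-in for lapic_shutdown(). Assumption of this model: disabling
 * the LAPIC undoes a boot irq mode setup that was done before it.
 */
static void demo_lapic_shutdown(struct demo_apic_state *s)
{
	s->lapic_enabled = false;
	s->boot_irq_mode = false;
	demo_trace(s, "lapic");
}

/* Stand-in for restore_boot_irq_mode(). */
static void demo_restore_boot_irq_mode(struct demo_apic_state *s)
{
	s->boot_irq_mode = true;
	demo_trace(s, "restore");
}

/* Order before the fix: the restore step runs ahead of the LAPIC shutdown. */
static struct demo_apic_state demo_shutdown_old(void)
{
	struct demo_apic_state s = { .lapic_enabled = true };

	demo_clear_io_apic(&s);
	demo_restore_boot_irq_mode(&s);
	demo_lapic_shutdown(&s);
	return s;
}

/* Order after the fix: the restore step is the last of the three. */
static struct demo_apic_state demo_shutdown_new(void)
{
	struct demo_apic_state s = { .lapic_enabled = true };

	demo_clear_io_apic(&s);
	demo_lapic_shutdown(&s);
	demo_restore_boot_irq_mode(&s);
	return s;
}
```

In this model demo_shutdown_old() ends with boot_irq_mode cleared, while
demo_shutdown_new() ends with it set, which mirrors the ordering argument
made in the changelog.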

[PATCH v5 0/6] x86/apic: Fix restoring boot irq mode in reboot and kexec/kdump

2018-02-13 Thread Baoquan He

This is the v5 post. The newly added patch 0002 contains the change
related to the KEXEC_JUMP path, so patch 0003 now contains only the
regression fix.

A regression was introduced by the commit below:
commit 522e66464467 ("x86/apic: Disable I/O APIC before shutdown of the local APIC")

It breaks the attempt to restore the boot irq mode in reboot and
kexec/kdump. Details can be found in patch 0003.

The warning can always be seen during kdump kernel boot on the qemu/kvm
platform. Our customer even saw an occasional kdump kernel hang, roughly
once in 30 attempts, during stress testing of kdump on a KVM machine.

v4->v5:
  Take out the change related to KEXEC_JUMP to a new patch 0002
  according to Eric's suggestion.
  Patch 0003 in this series only includes the regression fix.

v3->v4:
  Eric pointed out that in patch 0002 the change related to
  KEXEC_JUMP is not right.
  Correct it.

  Add Fixes tag and Cc to stable.

v2->v3:
  Change as Eric suggested:

  Rearrange patches and change code and messy function/variable naming.
  Change patch subject and log to make it more understandable.

Baoquan He (6):
  x86/apic: Split out restore_boot_irq_mode from disable_IO_APIC
  x86/apic: Replace disable_IO_APIC for KEXEC_JUMP
  x86/apic: Fix restoring boot irq mode in reboot and kexec/kdump
  x86/apic: Remove useless disable_IO_APIC
  x86/apic: Rename variable/function related to x86_io_apic_ops
  x86/apic: Set up through-local-APIC on boot CPU if 'noapic' specified

 arch/x86/include/asm/io_apic.h     |  9 +
 arch/x86/include/asm/x86_init.h    |  8
 arch/x86/kernel/apic/apic.c        |  2 +-
 arch/x86/kernel/apic/io_apic.c     | 16
 arch/x86/kernel/crash.c            |  3 ++-
 arch/x86/kernel/machine_kexec_32.c |  8
 arch/x86/kernel/machine_kexec_64.c |  8
 arch/x86/kernel/reboot.c           |  3 ++-
 arch/x86/kernel/x86_init.c         |  6 +++---
 arch/x86/xen/apic.c                |  2 +-
 drivers/iommu/irq_remapping.c      |  4 ++--
 11 files changed, 32 insertions(+), 37 deletions(-)

-- 
2.13.6

[PATCH v5 4/6] x86/apic: Remove useless disable_IO_APIC

2018-02-13 Thread Baoquan He

No one uses it anymore.

Signed-off-by: Baoquan He 
---
 arch/x86/include/asm/io_apic.h |  1 -
 arch/x86/kernel/apic/io_apic.c | 13 -
 arch/x86/kernel/machine_kexec_32.c |  5 ++---
 arch/x86/kernel/machine_kexec_64.c |  5 ++---
 4 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 0fa95bfacb39..5e389145d808 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -192,7 +192,6 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
 
 extern void setup_IO_APIC(void);
 extern void enable_IO_APIC(void);
-extern void disable_IO_APIC(void);
 extern void clear_IO_APIC(void);
 extern void restore_boot_irq_mode(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 2d7cd2db77f5..9d86b10c2121 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1438,19 +1438,6 @@ void native_disable_io_apic(void)
disconnect_bsp_APIC(ioapic_i8259.pin != -1);
 }
 
-/*
- * Not an __init, needed by the reboot code
- */
-void disable_IO_APIC(void)
-{
-   /*
-* Clear the IO-APIC before rebooting:
-*/
-   clear_IO_APIC();
-
-   restore_boot_irq_mode();
-}
-
 void restore_boot_irq_mode(void)
 {
if (!nr_legacy_irqs())
diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
index 4cd79d88a4ac..60cdec6628b0 100644
--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -195,9 +195,8 @@ void machine_kexec(struct kimage *image)
/*
 * We need to put APICs in legacy mode so that we can
 * get timer interrupts in second kernel. kexec/kdump
-* paths already have calls to disable_IO_APIC() in
-* one form or other. kexec jump path also need
-* one.
+* paths already have calls to restore_boot_irq_mode()
+* in one form or other. kexec jump path also need one.
 */
clear_IO_APIC();
restore_boot_irq_mode();
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 2ab14b9c1a89..5ffbc55ea80f 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -293,9 +293,8 @@ void machine_kexec(struct kimage *image)
/*
 * We need to put APICs in legacy mode so that we can
 * get timer interrupts in second kernel. kexec/kdump
-* paths already have calls to disable_IO_APIC() in
-* one form or other. kexec jump path also need
-* one.
+* paths already have calls to restore_boot_irq_mode()
+* in one form or other. kexec jump path also need one.
 */
clear_IO_APIC();
restore_boot_irq_mode();
-- 
2.13.6

[PATCH v5 2/6] x86/apic: Replace disable_IO_APIC for KEXEC_JUMP

2018-02-13 Thread Baoquan He

Later disable_IO_APIC() will be broken down into clear_IO_APIC()
and restore_boot_irq_mode(). These two functions will be called
separately where they are needed to fix a regression introduced
by commit 522e66464467 ("x86/apic: Disable I/O APIC before
shutdown of the local APIC").

The KEXEC_JUMP code, unlike kexec/kdump, doesn't call lapic_shutdown()
before the jump, so it's not impacted by commit 522e66464467.

Hence make clear_IO_APIC() public, and replace disable_IO_APIC()
with clear_IO_APIC() and restore_boot_irq_mode() to keep the KEXEC_JUMP
code unchanged in essence. No functional change.

Signed-off-by: Baoquan He 
---
v4->v5:
  Make this patch replace disable_IO_APIC() with clear_IO_APIC()
  and restore_boot_irq_mode() for the KEXEC_JUMP path only. This makes
  the patch easier to review, as Eric suggested.

 arch/x86/include/asm/io_apic.h | 1 +
 arch/x86/kernel/apic/io_apic.c | 2 +-
 arch/x86/kernel/machine_kexec_32.c | 3 ++-
 arch/x86/kernel/machine_kexec_64.c | 3 ++-
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 558d1a6a13ad..0fa95bfacb39 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -193,6 +193,7 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
 extern void setup_IO_APIC(void);
 extern void enable_IO_APIC(void);
 extern void disable_IO_APIC(void);
+extern void clear_IO_APIC(void);
 extern void restore_boot_irq_mode(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
 extern void print_IO_APICs(void);
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 7b73b6b9b4b6..2d7cd2db77f5 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -587,7 +587,7 @@ static void clear_IO_APIC_pin(unsigned int apic, unsigned int pin)
   mpc_ioapic_id(apic), pin);
 }
 
-static void clear_IO_APIC (void)
+void clear_IO_APIC (void)
 {
int apic, pin;
 
diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
index edfede768688..4cd79d88a4ac 100644
--- a/arch/x86/kernel/machine_kexec_32.c
+++ b/arch/x86/kernel/machine_kexec_32.c
@@ -199,7 +199,8 @@ void machine_kexec(struct kimage *image)
 * one form or other. kexec jump path also need
 * one.
 */
-   disable_IO_APIC();
+   clear_IO_APIC();
+   restore_boot_irq_mode();
 #endif
}
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 1f790cf9d38f..2ab14b9c1a89 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -297,7 +297,8 @@ void machine_kexec(struct kimage *image)
 * one form or other. kexec jump path also need
 * one.
 */
-   disable_IO_APIC();
+   clear_IO_APIC();
+   restore_boot_irq_mode();
 #endif
}
 
-- 
2.13.6

[PATCH v5 1/6] x86/apic: Split out restore_boot_irq_mode from disable_IO_APIC

2018-02-13 Thread Baoquan He

This is a preparation patch. Split the code which restores boot
irq mode out of disable_IO_APIC() and wrap it in a new function,
restore_boot_irq_mode(). No functional change.

Signed-off-by: Baoquan He 
---
 arch/x86/include/asm/io_apic.h | 1 +
 arch/x86/kernel/apic/io_apic.c | 5 +
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index a8834dd546cd..558d1a6a13ad 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -193,6 +193,7 @@ static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
 extern void setup_IO_APIC(void);
 extern void enable_IO_APIC(void);
 extern void disable_IO_APIC(void);
+extern void restore_boot_irq_mode(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
 extern void print_IO_APICs(void);
 #else  /* !CONFIG_X86_IO_APIC */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 8ad2e410974f..7b73b6b9b4b6 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1448,6 +1448,11 @@ void disable_IO_APIC(void)
 */
clear_IO_APIC();
 
+   restore_boot_irq_mode();
+}
+
+void restore_boot_irq_mode(void)
+{
if (!nr_legacy_irqs())
return;
 
-- 
2.13.6

Re: [2/2] powerpc/pseries: Declare optional dummy function for find_and_online_cpu_nid

2018-02-13 Thread Michael Ellerman

On Mon, 2018-02-12 at 22:34:08 UTC, Guenter Roeck wrote:
> Commit e67e02a544e9 ("powerpc/pseries: Fix cpu hotplug crash with
> memoryless nodes") adds an unconditional call to find_and_online_cpu_nid(),
> which is only declared if CONFIG_PPC_SPLPAR is enabled. This results in
> the following build error if this is not the case.
> 
> arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
> arch/powerpc/platforms/pseries/hotplug-cpu.c:369:
>   undefined reference to `.find_and_online_cpu_nid'
> 
> Follow the guideline provided by similar functions and provide a dummy
> function if CONFIG_PPC_SPLPAR is not enabled. This also moves the external
> function declaration into an include file where it should be.
> 
> Fixes: e67e02a544e9 ("powerpc/pseries: Fix cpu hotplug crash with ...")
> Cc: Michael Bringmann 
> Cc: Michael Ellerman 
> Cc: Nathan Fontenot 
> Signed-off-by: Guenter Roeck 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/82343484a2d4c97a03bfd81303b549

cheers
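
The "dummy function" approach described in the quoted changelog is a common
pattern: when a config option is off, a header supplies an empty static
inline so that callers need no #ifdef of their own. A minimal sketch follows,
assuming hypothetical names throughout: DEMO_HAVE_SPLPAR stands in for
CONFIG_PPC_SPLPAR and demo_find_and_online_cpu_nid() for the real function,
whose exact prototype is not shown in this thread.

```c
/* DEMO_HAVE_SPLPAR is a hypothetical stand-in for CONFIG_PPC_SPLPAR. */
#ifdef DEMO_HAVE_SPLPAR
/* Option enabled: only declare here, the real body lives elsewhere. */
int demo_find_and_online_cpu_nid(int cpu);
#else
/* Option disabled: a dummy keeps callers compiling and linking. */
static inline int demo_find_and_online_cpu_nid(int cpu)
{
	(void)cpu;	/* unused in the dummy */
	return 0;
}
#endif

/* The caller can invoke the function unconditionally. */
static int demo_online_cpu(int cpu)
{
	return demo_find_and_online_cpu_nid(cpu);
}
```

Built without DEMO_HAVE_SPLPAR, demo_online_cpu() resolves to the inline
dummy, so no undefined reference is left for the linker.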

Re: [1/2] powerpc/kdump: Add missing optional dummy functions

2018-02-13 Thread Michael Ellerman

On Mon, 2018-02-12 at 22:34:07 UTC, Guenter Roeck wrote:
> If KEXEC_CORE is not enabled, PowerNV builds fail as follows.
> 
> arch/powerpc/platforms/powernv/smp.c: In function 'pnv_smp_cpu_kill_self':
> arch/powerpc/platforms/powernv/smp.c:236:4: error:
>   implicit declaration of function 'crash_ipi_callback'
> 
> Add dummy function calls, similar to kdump_in_progress(), to solve the
> problem.
> 
> Fixes: 4145f358644b ("powernv/kdump: Fix cases where the kdump kernel ...")
> Cc: Balbir Singh 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Signed-off-by: Guenter Roeck 
> Acked-by: Balbir Singh 

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/910961754572a2f4c83ad7e610d180

cheers

Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-13 Thread Tomasz Figa

On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam wrote:
> Hi Tomasz,
>
> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa wrote:
>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark wrote:
>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa wrote:
>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark wrote:
>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa wrote:
>>>>>> Hi Vivek,
>>>>>>
>>>>>> Thanks for the patch. Please see my comments inline.
>>>>>>
>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam wrote:
>>>>>>> While handling the concerned iommu, there should not be a
>>>>>>> need to power control the drm devices from iommu interface.
>>>>>>> If these drm devices need to be powered around this time,
>>>>>>> the respective drivers should take care of this.
>>>>>>>
>>>>>>> Replace the pm_runtime_get/put_sync() with
>>>>>>> pm_runtime_get/put_suppliers() calls, to power-up
>>>>>>> the connected iommu through the device link interface.
>>>>>>> In case the device link is not setup these get/put_suppliers()
>>>>>>> calls will be a no-op, and the iommu driver should take care of
>>>>>>> powering on its devices accordingly.
>>>>>>>
>>>>>>> Signed-off-by: Vivek Gautam
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/msm/msm_iommu.c | 16
>>>>>>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>> index b23d33622f37..1ab629bbee69 100644
>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names,
>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu);
>>>>>>> int ret;
>>>>>>>
>>>>>>> -   pm_runtime_get_sync(mmu->dev);
>>>>>>> +   pm_runtime_get_suppliers(mmu->dev);
>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev);
>>>>>>> -   pm_runtime_put_sync(mmu->dev);
>>>>>>> +   pm_runtime_put_suppliers(mmu->dev);
>>>>>>
>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device()
>>>>>> callback and that's where necessary runtime PM gets should happen, if
>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
>>>>>> with power state of device controlled by driver B (ARM SMMU).
>>>>>
>>>>> Note that we end up having to do the same, because of iommu_unmap()
>>>>> while DRM driver is powered off..  it might be cleaner if it was all
>>>>> self contained in the iommu driver, but that would make it so other
>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is
>>>>> apparently something that some of them want to do..
>>>>
>>>> I'd assume that runtime PM status is already guaranteed to be active
>>>> when the IRQ handler is running, by some other means (e.g.
>>>> pm_runtime_get_sync() called earlier, when queuing some work to the
>>>> hardware). Otherwise, I'm not sure how a powered down device could
>>>> trigger an IRQ.
>>>>
>>>> So, if the master device power is already on, suppliers should be
>>>> powered on as well, thanks to device links.
>>>>
>>>
>>> umm, that is kindof the inverse of the problem..  the problem is
>>> things like gpu driver (and v4l2 drivers that import dma-buf's,
>>> afaict).. they will potentially call iommu->unmap() when device is not
>>> active (due to userspace or things beyond the control of the driver)..
>>> so *they* would want iommu to do pm get/put calls.
>>
>> Which is fine and which is actually already done by one of the patches
>> in this series, not for map/unmap, but probe, add_device,
>> remove_device. Having parts of the API doing it inside the callback
>> and other parts outside sounds at least inconsistent.
>>
>>> But other drivers
>>> trying to unmap from irq ctx would not.  Which is the contradictory
>>> requirement that lead to the idea of iommu user powering up iommu for
>>> unmap.
>>
>> Sorry, maybe I wasn't clear. My last message was supposed to show that
>> it's not contradictory at all, because "other drivers trying to unmap
>> from irq ctx" would already have called pm_runtime_get_*() earlier
>> from a non-irq ctx, which would have also done the same on all the
>> linked suppliers, including the IOMMU. The ultimate result would be
>> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync()
>> would do nothing besides incrementing the reference count.
>
> The entire point was to avoid the slowpath that pm_runtime_get/put_sync()
> would add in map/unmap. It would not be correct to add a slowpath in irq_ctx
> for taking care of non-irq_ctx and for the situations where master is already
> powered-off.

Correct me if I'm wrong, but I believe that with what I'm proposing
there wouldn't be any slow path.
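
The reference-counting argument made in this thread can be illustrated with
a small user-space model. It is a sketch under stated assumptions, not the
kernel's runtime PM implementation: struct demo_dev, demo_pm_get() and
demo_pm_put() are hypothetical names, a single supplier pointer stands in
for a device link, and autosuspend, locking and error handling are ignored.

```c
#include <stddef.h>

/* Hypothetical device with a runtime-PM-like usage counter. */
struct demo_dev {
	int usage_count;		/* references currently held */
	int power_on_calls;		/* how often the slow power-on path ran */
	struct demo_dev *supplier;	/* modelled device link, or NULL */
};

/* Take a reference; only the 0 -> 1 transition powers the device on. */
static void demo_pm_get(struct demo_dev *dev)
{
	if (dev->usage_count++ == 0) {
		if (dev->supplier)
			demo_pm_get(dev->supplier);	/* suppliers come first */
		dev->power_on_calls++;			/* modelled slow path */
	}
}

/* Drop a reference; the last put releases the supplier as well. */
static void demo_pm_put(struct demo_dev *dev)
{
	if (--dev->usage_count == 0 && dev->supplier)
		demo_pm_put(dev->supplier);
}

/*
 * Scenario from the discussion: the master is powered up from non-irq
 * context before work is queued, then the IOMMU driver takes its own
 * reference inside unmap. Returns how many times the IOMMU slow path
 * ran during that unmap.
 */
static int demo_unmap_slow_path_count(void)
{
	struct demo_dev iommu = { 0 };
	struct demo_dev master = { .supplier = &iommu };
	int before;

	demo_pm_get(&master);		/* powers master and, via the link, iommu */
	before = iommu.power_on_calls;

	demo_pm_get(&iommu);		/* inside unmap: count goes 1 -> 2 only */
	demo_pm_put(&iommu);

	demo_pm_put(&master);
	return iommu.power_on_calls - before;
}
```

In this model the get issued from the unmap path only bumps the counter,
because the supplier was already powered through the master's reference.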

a) For IRQ context, the master is already powered on and so the SMMU
is also powered on, through respective device link.
pm_runtime_get_sync() would

Re: [PATCH 2/4 v6] lib: debugobjects: add global free list and the counter

2018-02-13 Thread Yang Shi




On 2/13/18 2:02 AM, Thomas Gleixner wrote:

On Mon, 12 Feb 2018, Yang Shi wrote:

On 2/12/18 8:25 AM, Thomas Gleixner wrote:

On Tue, 6 Feb 2018, Yang Shi wrote:

+   /*
+* Reuse objs from the global free list, they will be reinitialized
+* when allocating
+*/
+   while (obj_nr_tofree > 0 && (obj_pool_free < obj_pool_min_free)) {
+   raw_spin_lock_irqsave(_lock, flags);
+   obj = hlist_entry(obj_to_free.first, typeof(*obj), node);

This is racy vs. the worker thread. Assume obj_nr_tofree = 1:

CPU0                                    CPU1
worker
 lock(&pool_lock);                      while (obj_nr_tofree > 0 && ...) {
   obj = hlist_entry(obj_to_free);        lock(&pool_lock);
   hlist_del(obj);
   obj_nr_tofree--;
   ...
 unlock(&pool_lock);
                                          obj = hlist_entry(obj_to_free);
                                          hlist_del(obj); <--- NULL pointer dereference

Not what you want, right? The counter or the list head need to be rechecked
after the lock is acquired.
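
A minimal userspace sketch of the recheck described above. It is not the code that was applied; it assumes a C11 atomic_flag as a stand-in for the spinlock, and struct obj, free_list_add() and free_list_take() are hypothetical names used only for illustration. The unlocked counter read is treated as a hint, and the counter is checked again once the lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model; every name below is a stand-in, not kernel code. */
struct obj {
	struct obj *next;
};

static atomic_flag pool_lock = ATOMIC_FLAG_INIT; /* models pool_lock */
static struct obj *obj_to_free;                  /* models the global free list */
static atomic_int obj_nr_tofree;                 /* models the list counter */

static void pool_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&pool_lock, memory_order_acquire))
		; /* spin until the flag is ours */
}

static void pool_lock_release(void)
{
	atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/* Put one object on the global free list. */
static void free_list_add(struct obj *o)
{
	pool_lock_acquire();
	o->next = obj_to_free;
	obj_to_free = o;
	atomic_fetch_add_explicit(&obj_nr_tofree, 1, memory_order_relaxed);
	pool_lock_release();
}

/* Take one object off the list, or return NULL if someone else emptied it. */
static struct obj *free_list_take(void)
{
	struct obj *o = NULL;

	/* Unlocked read: only a hint that there may be work to do. */
	if (atomic_load_explicit(&obj_nr_tofree, memory_order_relaxed) <= 0)
		return NULL;

	pool_lock_acquire();
	/* Recheck with the lock held; the hint may be stale by now. */
	if (atomic_load_explicit(&obj_nr_tofree, memory_order_relaxed) > 0) {
		o = obj_to_free;
		assert(o != NULL); /* counter and list must agree under the lock */
		obj_to_free = o->next;
		atomic_fetch_sub_explicit(&obj_nr_tofree, 1, memory_order_relaxed);
	}
	pool_lock_release();
	return o;
}
```

With one object on the list, the first free_list_take() returns it and a second call returns NULL instead of dereferencing an empty list head.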

Yes, you are right. Will fix the race in newer version.

I fixed up all the minor issues with this series and applied it to:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/debugobjects

Please double check the result.


Thanks a lot. It looks good.

Regards,
Yang



Thanks,

tglx

Re: [PATCH v7 5/9] dmtimer: Add timer ops to the platform data structure

2018-02-13 Thread Keerthy



On Tuesday 13 February 2018 08:16 AM, Suman Anna wrote:
> On 01/09/2018 12:23 AM, J, KEERTHY wrote:
>> Add timer ops to the platform data structure
>>
>> Signed-off-by: Keerthy 
>> Reviewed-by: Sebastian Reichel 
>> Tested-by: Ladislav Michl 
>> ---
>>   include/linux/platform_data/dmtimer-omap.h | 38 
>> ++
>>   1 file changed, 38 insertions(+)
>>
>> diff --git a/include/linux/platform_data/dmtimer-omap.h 
>> b/include/linux/platform_data/dmtimer-omap.h
>> index a19b78d..a3e1794 100644
>> --- a/include/linux/platform_data/dmtimer-omap.h
>> +++ b/include/linux/platform_data/dmtimer-omap.h
>> @@ -20,12 +20,50 @@
>>   #ifndef __PLATFORM_DATA_DMTIMER_OMAP_H__
>>   #define __PLATFORM_DATA_DMTIMER_OMAP_H__
>>
>> +struct omap_dm_timer_ops {
>> +   struct omap_dm_timer *(*request_by_node)(struct device_node *np);
>> +   struct omap_dm_timer *(*request_specific)(int timer_id);
>> +   struct omap_dm_timer *(*request)(void);
>> +
>> +   int (*free)(struct omap_dm_timer *timer);
>> +
>> +   void (*enable)(struct omap_dm_timer *timer);
>> +   void (*disable)(struct omap_dm_timer *timer);
>> +
>> +   int (*get_irq)(struct omap_dm_timer *timer);
>> +   int (*set_int_enable)(struct omap_dm_timer *timer,
>> + unsigned int value);
>> +   int (*set_int_disable)(struct omap_dm_timer *timer, u32 mask);
>> +
>> +   struct clk *(*get_fclk)(struct omap_dm_timer *timer);
>> +
>> +   int (*start)(struct omap_dm_timer *timer);
>> +   int (*stop)(struct omap_dm_timer *timer);
>> +   int (*set_source)(struct omap_dm_timer *timer, int source);
>> +
>> +   int (*set_load)(struct omap_dm_timer *timer, int autoreload,
>> +   unsigned int value);
>> +   int (*set_match)(struct omap_dm_timer *timer, int enable,
>> +unsigned int match);
>> +   int (*set_pwm)(struct omap_dm_timer *timer, int def_on,
>> +  int toggle, int trigger);
>> +   int (*set_prescaler)(struct omap_dm_timer *timer, int prescaler);
>> +
>> +   unsigned int (*read_counter)(struct omap_dm_timer *timer);
>> +   int (*write_counter)(struct omap_dm_timer *timer,
>> +unsigned int value);
>> +   unsigned int (*read_status)(struct omap_dm_timer *timer);
>> +   int (*write_status)(struct omap_dm_timer *timer,
>> +   unsigned int value);
>> +};
>> +
>>   struct dmtimer_platform_data {
>>   /* set_timer_src - Only used for OMAP1 devices */
>>   int (*set_timer_src)(struct platform_device *pdev, int source);
> Have you looked into collapsing this into the set_source() option above
> for OMAP1? Looks like the only reason the pdev is needed is for
> retrieving the pdev id, which is also stored in the omap_dm_timer structure?

I would prefer not to touch the mach-omap1 part as part of this
migration series. I will revisit this once the migration is done.

> 
>>   u32 timer_capability;
>>   u32 timer_errata;
>>   int (*get_context_loss_count)(struct device *);
>> +   struct omap_dm_timer_ops *timer_ops;
> 
> Any reason why this is not a const? We don't expect this to change right.

Yes this can be.
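
A small sketch of what a const ops pointer in platform data could look like. This is an illustration only, not the follow-up patch: all demo_* names are hypothetical and the ops table is cut down to three callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* All demo_* names are hypothetical stand-ins for the omap_dm_timer types. */
struct demo_timer {
	unsigned int counter;
	int running;
};

struct demo_timer_ops {
	int (*start)(struct demo_timer *timer);
	int (*stop)(struct demo_timer *timer);
	unsigned int (*read_counter)(struct demo_timer *timer);
};

struct demo_platform_data {
	/* const: consumers may call through the table but not modify it */
	const struct demo_timer_ops *timer_ops;
};

static int demo_start(struct demo_timer *timer)
{
	timer->running = 1;
	return 0;
}

static int demo_stop(struct demo_timer *timer)
{
	timer->running = 0;
	return 0;
}

static unsigned int demo_read_counter(struct demo_timer *timer)
{
	return timer->counter;
}

/* The table itself can be const as well and end up in read-only data. */
static const struct demo_timer_ops demo_ops = {
	.start = demo_start,
	.stop = demo_stop,
	.read_counter = demo_read_counter,
};

static const struct demo_platform_data demo_pdata = {
	.timer_ops = &demo_ops,
};

/* A consumer only needs the platform data to drive the timer. */
static unsigned int demo_sample(const struct demo_platform_data *pdata,
				struct demo_timer *timer)
{
	unsigned int value;

	assert(pdata != NULL && pdata->timer_ops != NULL);
	pdata->timer_ops->start(timer);
	value = pdata->timer_ops->read_counter(timer);
	pdata->timer_ops->stop(timer);
	return value;
}
```

Any attempt to assign through demo_pdata.timer_ops (for example to replace the start callback) is rejected at compile time, which is the property the review comment asks for.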

> 
> regards
> Suman
> 
>>   };
>>
>>   #endif /* __PLATFORM_DATA_DMTIMER_OMAP_H__ */
>> -- 
>> 1.9.1
>>
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>

Re: [PATCH v7 4/9] arm: OMAP: Move dmtimer driver out of plat-omap to drivers under clocksource

2018-02-13 Thread Keerthy



On Tuesday 13 February 2018 07:54 AM, Suman Anna wrote:
> Hi Keerthy,
> 
> On 01/09/2018 12:23 AM, J, KEERTHY wrote:
>> Move the dmtimer driver out of plat-omap to clocksource.
>> So that non-omap devices also could use this.
> 
> What non-omap devices do you have in mind? I don't think this driver is
> ready for that yet. It still has a lot of OMAP dependencies. So you
> should defer this for later along with the rest of the cleanup and when
> the driver is ready for that.

I mean non-OMAP TI devices, not totally generic ones.

> 
>>
>> No Code changes done to the driver file only renamed to timer-dm.c.
>> Also removed the config dependencies for OMAP_DM_TIMER.
>>
>> Signed-off-by: Keerthy 
>> Reviewed-by: Sebastian Reichel 
>> Tested-by: Ladislav Michl 
>> ---
>>   arch/arm/plat-omap/Kconfig | 6 --
>>   arch/arm/plat-omap/Makefile| 1 -
>>   drivers/clocksource/Kconfig| 3 +++
>>   drivers/clocksource/Makefile   | 1 +
>>   arch/arm/plat-omap/dmtimer.c => drivers/clocksource/timer-dm.c | 0
>>   5 files changed, 4 insertions(+), 7 deletions(-)
>>   rename arch/arm/plat-omap/dmtimer.c => drivers/clocksource/timer-dm.c 
>> (100%)
>>
>> diff --git a/arch/arm/plat-omap/Kconfig b/arch/arm/plat-omap/Kconfig
>> index 7276afe..afc1a1d 100644
>> --- a/arch/arm/plat-omap/Kconfig
>> +++ b/arch/arm/plat-omap/Kconfig
>> @@ -106,12 +106,6 @@ config OMAP3_L2_AUX_SECURE_SERVICE_SET_ID
>>   help
>> PPA routine service ID for setting L2 auxiliary control register.
>>
>> -config OMAP_DM_TIMER
>> -   bool "Use dual-mode timer"
>> -   depends on ARCH_OMAP16XX || ARCH_OMAP2PLUS
>> -   help
>> -Select this option if you want to use OMAP Dual-Mode timers.
>> -
>>   config OMAP_SERIAL_WAKE
>>   bool "Enable wake-up events for serial ports"
>>   depends on ARCH_OMAP1 && OMAP_MUX
>> diff --git a/arch/arm/plat-omap/Makefile b/arch/arm/plat-omap/Makefile
>> index 47e1867..7215ada 100644
>> --- a/arch/arm/plat-omap/Makefile
>> +++ b/arch/arm/plat-omap/Makefile
>> @@ -9,5 +9,4 @@ obj-y := sram.o dma.o counter_32k.o
>>
>>   # omap_device support (OMAP2+ only at the moment)
>>
>> -obj-$(CONFIG_OMAP_DM_TIMER) += dmtimer.o
>>   obj-$(CONFIG_OMAP_DEBUG_LEDS) += debug-leds.o
>> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
>> index c729a88..3f799b2 100644
>> --- a/drivers/clocksource/Kconfig
>> +++ b/drivers/clocksource/Kconfig
>> @@ -21,6 +21,9 @@ config CLKEVT_I8253
>>   config I8253_LOCK
>>   bool
>>
>> +config OMAP_DM_TIMER
>> +   bool
>> +
>>   config CLKBLD_I8253
>>   def_bool y if CLKSRC_I8253 || CLKEVT_I8253 || I8253_LOCK
>>
>> diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
>> index 72711f1..27b5497 100644
>> --- a/drivers/clocksource/Makefile
>> +++ b/drivers/clocksource/Makefile
>> @@ -16,6 +16,7 @@ obj-$(CONFIG_EM_TIMER_STI)+= em_sti.o
>>   obj-$(CONFIG_CLKBLD_I8253)  += i8253.o
>>   obj-$(CONFIG_CLKSRC_MMIO)   += mmio.o
>>   obj-$(CONFIG_DIGICOLOR_TIMER)   += timer-digicolor.o
>> +obj-$(CONFIG_OMAP_DM_TIMER)+= timer-dm.o
>>   obj-$(CONFIG_DW_APB_TIMER)  += dw_apb_timer.o
>>   obj-$(CONFIG_DW_APB_TIMER_OF)   += dw_apb_timer_of.o
>>   obj-$(CONFIG_FTTMR010_TIMER)+= timer-fttmr010.o
>> diff --git a/arch/arm/plat-omap/dmtimer.c b/drivers/clocksource/timer-dm.c
>> similarity index 100%
>> rename from arch/arm/plat-omap/dmtimer.c
>> rename to drivers/clocksource/timer-dm.c
> 
> Similar comments as in patch 3 about the file name at the top, and the
> question about adding omap to the file name.

I will go with timer-ti-dm.c following timer-ti-32k.c
Hope that is okay.

> 
> Also, I see that omap_dm_timer_get_fclk() is only defined for
> !CONFIG_ARCH_OMAP1, but currently the function is declared in the header
> file for both OMAP1 and OMAP2. You would want to inline that for OMAP1
> in the header file (we currently get away with it because no one uses it).

Sure.
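
A sketch of the usual header pattern for a function that only exists on some platforms: a real declaration where the implementation is built, and a static inline stub elsewhere. The names are hypothetical (DEMO_HAS_FCLK stands in for the !CONFIG_ARCH_OMAP1 condition) and this is not the change that was eventually made.

```c
#include <assert.h>
#include <stddef.h>

struct demo_clk;		/* opaque, hypothetical clock type */

struct demo_timer {
	struct demo_clk *fclk;
};

#ifdef DEMO_HAS_FCLK
/* Platforms with an implementation only need the declaration. */
struct demo_clk *demo_timer_get_fclk(struct demo_timer *timer);
#else
/* The stub keeps callers compiling and linking where nothing is built. */
static inline struct demo_clk *demo_timer_get_fclk(struct demo_timer *timer)
{
	(void)timer;
	return NULL;
}
#endif

/* Callers treat NULL as "no functional clock available". */
static int demo_timer_has_fclk(struct demo_timer *timer)
{
	assert(timer != NULL);
	return demo_timer_get_fclk(timer) != NULL;
}
```

Built without DEMO_HAS_FCLK, demo_timer_get_fclk() returns NULL and no out-of-line symbol is required at link time.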

> 
> regards
> Suman
> 
>> -- 
>> 1.9.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>

Re: [PATCH v3 1/1] mm: initialize pages on demand during boot

2018-02-13 Thread Sergey Senozhatsky

On (02/09/18 14:22), Pavel Tatashin wrote:
[..]
> +/*
> + * If this zone has deferred pages, try to grow it by initializing enough
> + * deferred pages to satisfy the allocation specified by order, rounded up to
> + * the nearest PAGES_PER_SECTION boundary.  So we're adding memory in increments
> + * of SECTION_SIZE bytes by initializing struct pages in increments of
> + * PAGES_PER_SECTION * sizeof(struct page) bytes.
> + *
> + * Return true when zone was grown by at least number of pages specified by
> + * order. Otherwise return false.
> + *
> + * Note: We use noinline because this function is needed only during boot, and
> + * it is called from a __ref function _deferred_grow_zone. This way we are
> + * making sure that it is not inlined into permanent text section.
> + */
> +static noinline bool __init
> +deferred_grow_zone(struct zone *zone, unsigned int order)
> +{
> + int zid = zone_idx(zone);
> + int nid = zone->node;

^

Should be CONFIG_NUMA dependent

struct zone {
...
#ifdef CONFIG_NUMA
int node;
#endif
...

-ss
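
A sketch of the accessor pattern that avoids touching a config-dependent field directly. The kernel's zone_to_nid() helper follows the same idea; the code below is a userspace model with hypothetical names, where DEMO_NUMA stands in for CONFIG_NUMA.

```c
#include <assert.h>
#include <stddef.h>

/* Define DEMO_NUMA to model a CONFIG_NUMA=y build; demo_* names are made up. */
struct demo_zone {
	unsigned long spanned_pages;
#ifdef DEMO_NUMA
	int node;
#endif
};

/* Single accessor, so callers never touch the conditional field directly. */
static inline int demo_zone_to_nid(const struct demo_zone *zone)
{
	assert(zone != NULL);
#ifdef DEMO_NUMA
	return zone->node;
#else
	return 0;	/* only one node exists without NUMA */
#endif
}
```

Code that writes "int nid = demo_zone_to_nid(zone);" builds in both configurations, while "zone->node" fails to compile when the field is configured out.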

[PATCH] lib/scatterlist: Add SG_CHAIN and SG_EMARK macros for LSB encodings

2018-02-13 Thread Anshuman Khandual

This replaces scatterlist->page_link LSB encodings with SG_CHAIN and
SG_EMARK definitions without any functional change.

Signed-off-by: Anshuman Khandual 
---
 include/linux/scatterlist.h | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 22b2131bcdcd..63d00bdb2fb3 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -65,16 +65,18 @@ struct sg_table {
  */
 
 #define SG_MAGIC   0x87654321
+#define SG_CHAIN   0x01
+#define SG_EMARK   0x02
 
 /*
  * We overload the LSB of the page pointer to indicate whether it's
  * a valid sg entry, or whether it points to the start of a new scatterlist.
  * Those low bits are there for everyone! (thanks mason :-)
  */
-#define sg_is_chain(sg)((sg)->page_link & 0x01)
-#define sg_is_last(sg) ((sg)->page_link & 0x02)
+#define sg_is_chain(sg)((sg)->page_link & SG_CHAIN)
+#define sg_is_last(sg) ((sg)->page_link & SG_EMARK)
 #define sg_chain_ptr(sg)   \
-   ((struct scatterlist *) ((sg)->page_link & ~0x03))
+   ((struct scatterlist *) ((sg)->page_link & ~(SG_CHAIN | SG_EMARK)))
 
 /**
  * sg_assign_page - Assign a given page to an SG entry
@@ -88,13 +90,13 @@ struct sg_table {
  **/
 static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
 {
-   unsigned long page_link = sg->page_link & 0x3;
+   unsigned long page_link = sg->page_link & (SG_CHAIN | SG_EMARK);
 
/*
 * In order for the low bit stealing approach to work, pages
 * must be aligned at a 32-bit boundary as a minimum.
 */
-   BUG_ON((unsigned long) page & 0x03);
+   BUG_ON((unsigned long) page & (SG_CHAIN | SG_EMARK));
 #ifdef CONFIG_DEBUG_SG
BUG_ON(sg->sg_magic != SG_MAGIC);
BUG_ON(sg_is_chain(sg));
@@ -130,7 +132,7 @@ static inline struct page *sg_page(struct scatterlist *sg)
BUG_ON(sg->sg_magic != SG_MAGIC);
BUG_ON(sg_is_chain(sg));
 #endif
-   return (struct page *)((sg)->page_link & ~0x3);
+   return (struct page *)((sg)->page_link & ~(SG_CHAIN | SG_EMARK));
 }
 
 /**
@@ -178,7 +180,8 @@ static inline void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
 * Set lowest bit to indicate a link pointer, and make sure to clear
 * the termination bit if it happens to be set.
 */
-   prv[prv_nents - 1].page_link = ((unsigned long) sgl | 0x01) & ~0x02;
+   prv[prv_nents - 1].page_link = ((unsigned long) sgl | SG_CHAIN)
+   & ~SG_EMARK;
 }
 
 /**
@@ -198,8 +201,8 @@ static inline void sg_mark_end(struct scatterlist *sg)
/*
 * Set termination bit, clear potential chain bit
 */
-   sg->page_link |= 0x02;
-   sg->page_link &= ~0x01;
+   sg->page_link |= SG_EMARK;
+   sg->page_link &= ~SG_CHAIN;
 }
 
 /**
@@ -215,7 +218,7 @@ static inline void sg_unmark_end(struct scatterlist *sg)
 #ifdef CONFIG_DEBUG_SG
BUG_ON(sg->sg_magic != SG_MAGIC);
 #endif
-   sg->page_link &= ~0x02;
+   sg->page_link &= ~SG_EMARK;
 }
 
 /**
-- 
2.11.0
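
For readers unfamiliar with the encoding the patch names, here is a userspace sketch of low-bit pointer tagging with the same two flag values. SG_CHAIN and SG_EMARK come from the patch; struct demo_sg, DEMO_SG_FLAGS and the demo_sg_* helpers are hypothetical and only mirror the shape of the kernel macros.

```c
#include <assert.h>
#include <stdint.h>

#define SG_CHAIN	((uintptr_t)0x01)	/* entry links to another list */
#define SG_EMARK	((uintptr_t)0x02)	/* entry is the last one */
#define DEMO_SG_FLAGS	(SG_CHAIN | SG_EMARK)	/* hypothetical helper mask */

struct demo_sg {
	uintptr_t page_link;	/* pointer value with flags in the two low bits */
};

static void demo_sg_assign_page(struct demo_sg *sg, void *page)
{
	uintptr_t flags = sg->page_link & DEMO_SG_FLAGS;

	/* The two low bits must be free, i.e. the pointer is 4-byte aligned. */
	assert(((uintptr_t)page & DEMO_SG_FLAGS) == 0);
	sg->page_link = flags | (uintptr_t)page;
}

static void *demo_sg_page(const struct demo_sg *sg)
{
	return (void *)(sg->page_link & ~DEMO_SG_FLAGS);
}

static int demo_sg_is_chain(const struct demo_sg *sg)
{
	return (sg->page_link & SG_CHAIN) != 0;
}

static int demo_sg_is_last(const struct demo_sg *sg)
{
	return (sg->page_link & SG_EMARK) != 0;
}

/* Set the termination bit and clear a potential chain bit. */
static void demo_sg_mark_end(struct demo_sg *sg)
{
	sg->page_link |= SG_EMARK;
	sg->page_link &= ~SG_CHAIN;
}

static void demo_sg_unmark_end(struct demo_sg *sg)
{
	sg->page_link &= ~SG_EMARK;
}
```

The flags travel with the pointer in one word, and masking with ~(SG_CHAIN | SG_EMARK) recovers the original address, which is why the alignment check matters.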

Re: [PATCH] x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME

2018-02-13 Thread Tom Lendacky

On 2/13/2018 10:21 PM, Kirill A. Shutemov wrote:
> On Tue, Feb 13, 2018 at 10:10:22PM -0600, Tom Lendacky wrote:
>> On 2/8/2018 6:55 AM, Kirill A. Shutemov wrote:
>>> AMD SME claims one bit from physical address to indicate whether the
>>> page is encrypted or not. To achieve that we clear out the bit from
>>> __PHYSICAL_MASK.
>>
>> I was actually working on a suggestion by Linus to use one of the software
>> page table bits to indicate encryption and translate that to the hardware
>> bit when writing the actual page table entry.  With that, __PHYSICAL_MASK
>> would go back to its original definition.
> 
> But you would need to mask it on reading of pfn from page table entry,
> right? I expect it to have more overhead than this one.

When reading back an entry it would translate the hardware bit position
back to the software bit position.  The suggestion for changing it was
to make _PAGE_ENC a constant and not tied to the sme_me_mask.

See https://marc.info/?l=linux-kernel&m=151017622615894&w=2

> 
> And software bits are valuable. Do we still have a spare one for this?

I was looking at possibly using bit 57 (_PAGE_BIT_SOFTW5).
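
A rough sketch of the translation idea, to make the overhead question concrete. It is not the work in progress referred to above: the demo_* names are hypothetical, bit 57 is taken from this message, and the hardware bit position of 47 is only a placeholder for a value that would be discovered at boot.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SW_ENC_BIT	57u	/* software bit mentioned in the message */
#define DEMO_SW_ENC	(UINT64_C(1) << DEMO_SW_ENC_BIT)

/* Hardware bit position is only known at boot; 47 is a placeholder. */
static unsigned int demo_hw_enc_bit = 47;

static uint64_t demo_hw_enc_mask(void)
{
	assert(demo_hw_enc_bit < 64 && demo_hw_enc_bit != DEMO_SW_ENC_BIT);
	return UINT64_C(1) << demo_hw_enc_bit;
}

/* Software view -> value written to the page table. */
static uint64_t demo_pte_to_hw(uint64_t sw)
{
	if (!(sw & DEMO_SW_ENC))
		return sw;
	return (sw & ~DEMO_SW_ENC) | demo_hw_enc_mask();
}

/* Value read from the page table -> software view. */
static uint64_t demo_pte_from_hw(uint64_t hw)
{
	uint64_t mask = demo_hw_enc_mask();

	if (!(hw & mask))
		return hw;
	return (hw & ~mask) | DEMO_SW_ENC;
}
```

In this model the software-visible encryption flag is a compile-time constant, and each conversion costs one test plus a mask-and-or; whether that is cheaper than a dynamic __PHYSICAL_MASK is the open question in the thread.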

Thanks,
Tom

>

Re: plan9 semantics on Linux - mount namespaces

2018-02-13 Thread Aleksa Sarai

On 2018-02-14, Enrico Weigelt  wrote:
> On 13.02.2018 22:27, Aleksa Sarai wrote:
> 
> > You can do this by creating a new user namespace (CLONE_NEWUSER), which
> > then gives you the required permissions to create other namespaces
> > (CLONE_NEWNS). This is how "rootless containers" or unprivileged
> > containers operate.
> 
> hmm, unshare -U doesn't work for me (even as root). But docker works,
> so user namespaces should be working. Any idea what could be wrong ?

It depends how old your kernel is and what distro you use. Arch Linux
disables user namespaces entirely, Debian requires that you set a sysctl
to enable unprivileged user namespaces, and RHEL requires you to set
both a sysctl and a kernel boot-flag. Also check how old your kernel is
(unprivileged user namespace support was added in 3.8).

Also Docker doesn't use user namespaces by default (you need to manually
enable it with --userns-remap, check the docs for more details). You
probably also want to be using "unshare -r" in your testing (as "unshare
-U" will leave you without mapped users).
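
A sketch in C of roughly what "unshare -r" does, for anyone who wants to try this without the tool. It is Linux-only and makes several assumptions: the helper names are hypothetical, unshare(2) is invoked through syscall(2) so the example does not depend on feature-test macros, and /proc/self/setgroups is assumed to exist (kernels older than 3.19 do not have it). Error handling is minimal.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/sched.h>	/* CLONE_NEWUSER */
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Format one line for uid_map or gid_map: "<inside> <outside> 1\n". */
static int format_id_map(char *buf, size_t len, unsigned int inside,
			 unsigned int outside)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%u %u 1\n", inside, outside);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write a short string to an existing file; 0 on success, -1 on failure. */
static int write_file(const char *path, const char *data)
{
	size_t len = strlen(data);
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	n = write(fd, data, len);
	close(fd);
	return (n == (ssize_t)len) ? 0 : -1;
}

/* Create a user namespace and map the caller to uid 0 / gid 0 inside it. */
int enter_userns_as_root(void)
{
	char map[64];
	/* Read the ids first; they look unmapped once the namespace exists. */
	unsigned int uid = (unsigned int)getuid();
	unsigned int gid = (unsigned int)getgid();

	if (syscall(SYS_unshare, CLONE_NEWUSER) != 0)
		return -1;	/* disabled by the distro, or kernel too old */

	if (format_id_map(map, sizeof(map), 0, uid) < 0 ||
	    write_file("/proc/self/uid_map", map) != 0)
		return -1;

	/* An unprivileged writer must deny setgroups before writing gid_map. */
	if (write_file("/proc/self/setgroups", "deny") != 0)
		return -1;

	if (format_id_map(map, sizeof(map), 0, gid) < 0 ||
	    write_file("/proc/self/gid_map", map) != 0)
		return -1;

	return 0;
}
```

If enter_userns_as_root() returns 0, getuid() reports 0 inside the new namespace and the process can go on to unshare(CLONE_NEWNS). A failure at the first step usually points at the distro restrictions listed above.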

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH




Re: [PATCH v7 3/9] arm: omap: Move dmtimer.h out of plat-omap

2018-02-13 Thread Keerthy



On Tuesday 13 February 2018 07:36 AM, Suman Anna wrote:
> Hi Keerthy,
> 
> On 01/09/2018 12:23 AM, J, KEERTHY wrote:
>> The header file is currently under plat-omap directory
>> under arch/omap. Move this out to an accessible place.
>>
>> No Code changes done to the header file.
>>
>> Signed-off-by: Keerthy 
>> Reviewed-by: Sebastian Reichel 
>> Tested-by: Ladislav Michl 
>> ---
>>   arch/arm/mach-omap1/pm.c   | 2 +-
>>   arch/arm/mach-omap1/timer.c| 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_2420_data.c | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_2430_data.c | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_3xxx_data.c | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_44xx_data.c | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_54xx_data.c | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_7xx_data.c  | 2 +-
>>   arch/arm/mach-omap2/omap_hwmod_81xx_data.c | 2 +-
>>   arch/arm/mach-omap2/pdata-quirks.c | 2 +-
>>   arch/arm/mach-omap2/timer.c| 2 +-
>>   arch/arm/plat-omap/dmtimer.c   | 2 +-
>>   {arch/arm/plat-omap/include/plat => include/clocksource}/dmtimer.h | 0
>>   14 files changed, 13 insertions(+), 13 deletions(-)
>>   rename {arch/arm/plat-omap/include/plat => include/clocksource}/dmtimer.h 
>> (100%)
>>
>> diff --git a/arch/arm/mach-omap1/pm.c b/arch/arm/mach-omap1/pm.c
>> index f1135bf..a07d47cf 100644
>> --- a/arch/arm/mach-omap1/pm.c
>> +++ b/arch/arm/mach-omap1/pm.c
>> @@ -55,7 +55,7 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>
>>   #include 
>>
>> diff --git a/arch/arm/mach-omap1/timer.c b/arch/arm/mach-omap1/timer.c
>> index 8fb1ec6..7c057ab 100644
>> --- a/arch/arm/mach-omap1/timer.c
>> +++ b/arch/arm/mach-omap1/timer.c
>> @@ -27,7 +27,7 @@
>>   #include 
>>   #include 
>>
>> -#include 
>> +#include 
>>
>>   #include "soc.h"
>>
>> diff --git a/arch/arm/mach-omap2/omap_hwmod_2420_data.c 
>> b/arch/arm/mach-omap2/omap_hwmod_2420_data.c
>> index 0afb014..0a8b95f 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod_2420_data.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod_2420_data.c
>> @@ -16,7 +16,7 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>
>>   #include "omap_hwmod.h"
>>   #include "l3_2xxx.h"
>> diff --git a/arch/arm/mach-omap2/omap_hwmod_2430_data.c 
>> b/arch/arm/mach-omap2/omap_hwmod_2430_data.c
>> index 013b26b..16e3d8c 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod_2430_data.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod_2430_data.c
>> @@ -18,7 +18,7 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>> +#include 
> 
> These headers are actually not needed in the first-place since we no
> longer create any non-DT timer devices. I have submitted a series to
> cleanup the presence of this header file, as part of a larger hwmod data
> cleanup series.

I will rebase my series on top of your cleanup series. Thanks for the
cleanups.

> 
>>
>>   #include "omap_hwmod.h"
>>   #include "l3_2xxx.h"
>> diff --git a/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c 
>> b/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
>> index 4b094cb..8a65f70 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod_2xxx_ipblock_data.c
>> @@ -11,7 +11,7 @@
>>
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>   #include 
>>
>>   #include "omap_hwmod.h"
>> diff --git a/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c 
>> b/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c
>> index 1a2f224..b030137 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod_3xxx_data.c
>> @@ -25,7 +25,7 @@
>>   #include "l4_3xxx.h"
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>
>>   #include "soc.h"
>>   #include "omap_hwmod.h"
>> diff --git a/arch/arm/mach-omap2/omap_hwmod_44xx_data.c 
>> b/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>> index a1901c2..51c7d62 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>> @@ -30,7 +30,7 @@
>>
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>
>>   #include "omap_hwmod.h"
>>   #include "omap_hwmod_common_data.h"
>> diff --git a/arch/arm/mach-omap2/omap_hwmod_54xx_data.c 
>> b/arch/arm/mach-omap2/omap_hwmod_54xx_data.c
>> index 988e7ea..530334e 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod_54xx_data.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod_54xx_data.c
>> @@ -26,7 +26,7 @@
>>   #include 
>>   #include 
>>   #include 
>> -#include 
>> +#include 
>>
>>   #include "omap_hwmod.h"
>>   #include "omap_hwmod_common_data.h"
>> diff --git

Re: [RFC PATCH 1/2] KVM: x86: Add a framework for supporting MSR-based features

2018-02-13 Thread Tom Lendacky

On 2/13/2018 10:25 AM, Paolo Bonzini wrote:
> On 08/02/2018 23:58, Tom Lendacky wrote:
>> +bool kvm_valid_msr_feature(u32 msr, u64 data)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < num_msr_based_features; i++) {
>> +		struct kvm_msr_based_features *m = msr_based_features + i;
>> +
>> +		if (msr != m->msr)
>> +			continue;
>> +
>> +		/* Make sure not trying to change unsupported bits */
>> +		return (data & ~m->mask) ? false : true;
>> +	}
>> +
>> +	return false;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_valid_msr_feature);
>> +
>> +
> 
> This is probably unnecessary too (the allowed values are a bit more
> complicated for, you just guessed it, VMX capability MSRs) and you can
> just check bits other than LFENCE in svm_set_msr.

The whole routine or just the bit checking?  I can see still needing the
check to be sure the "feature" is present.

Thanks,
Tom

> 
> Paolo
>
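
For readers following the thread, the bit check being discussed can be
sketched in standalone userspace C. This is only an illustration under
stated assumptions, not the kernel code: the ex_* names and the
single-entry table are hypothetical, and the MSR index and bit position
are illustrative values for DE_CFG and its LFENCE-serializing bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the patch's feature table. */
struct ex_msr_feature {
	uint32_t msr;	/* MSR index */
	uint64_t mask;	/* bits a caller is allowed to set */
};

/* Illustrative values for DE_CFG and its LFENCE-serializing bit. */
#define EX_MSR_DECFG			0xc0011029u
#define EX_DECFG_LFENCE_SERIALIZE	(1ULL << 1)

static const struct ex_msr_feature ex_features[] = {
	{ EX_MSR_DECFG, EX_DECFG_LFENCE_SERIALIZE },
};

/* Returns true only if the MSR is listed and no bit outside its mask is set. */
bool ex_valid_msr_feature(uint32_t msr, uint64_t data)
{
	size_t i;

	for (i = 0; i < sizeof(ex_features) / sizeof(ex_features[0]); i++) {
		if (ex_features[i].msr != msr)
			continue;

		return (data & ~ex_features[i].mask) == 0;
	}

	return false;	/* MSR not listed: the "feature" is not present */
}
```

The sketch separates the two questions raised above: the table lookup
answers whether the feature is present at all, and the mask test answers
whether the written value touches unsupported bits.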

Re: [RFC PATCH 2/2] KVM: SVM: Add MSR feature support for serializing LFENCE

2018-02-13 Thread Tom Lendacky

On 2/13/2018 10:22 AM, Paolo Bonzini wrote:
> On 08/02/2018 23:58, Tom Lendacky wrote:
>> Create an entry in the new MSR as a feature framework to allow a guest to
>> recognize LFENCE as a serializing instruction on AMD processors.  The MSR
>> can only be set by the host, any write by the guest will be ignored.  A
>> read by the guest will return the value as set by the host.  In this way,
>> the support to expose the feature to the guest is controlled by the
>> hypervisor.
>>
>> Signed-off-by: Tom Lendacky 
>> ---
>>  arch/x86/kvm/svm.c |   16 
>>  arch/x86/kvm/x86.c |6 ++
>>  2 files changed, 22 insertions(+)
>>
>> @@ -4047,6 +4052,17 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>>  	case MSR_VM_IGNNE:
>>  		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
>>  		break;
>> +	case MSR_F10H_DECFG:
>> +		/* Only the host can set this MSR, silently ignore */
>> +		if (!msr->host_initiated)
>> +			break;
> 
> Just one thing I'm wondering, should we #GP if the guest attempts to
> clear MSR_F10H_DECFG_LFENCE_SERIALIZE?

It would be more consistent with other entries to do "return 1" here
instead.  The current kernel code that writes this bit is using
msr_set_bit(), so a #GP is caught and handled.

Thanks,
Tom

> 
> Thanks,
> 
> Paolo
> 
>> +
>> +		/* Check the supported bits */
>> +		if (!kvm_valid_msr_feature(MSR_F10H_DECFG, data))
>> +			return 1;
>> +
>> +		svm->msr_decfg = data;
>> +		break;
>>  	case MSR_IA32_APICBASE:
>>  		if (kvm_vcpu_apicv_active(vcpu))
>>  			avic_update_vapic_bar(to_svm(vcpu), data);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 4251c34..21ec73b 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1060,7 +1060,13 @@ struct kvm_msr_based_features {
>>  	u64 value;	/* MSR value */
>>  };
>>  
>> +static const struct x86_cpu_id msr_decfg_match[] = {
>> +	{ X86_VENDOR_AMD, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_LFENCE_RDTSC },
>> +	{}
>> +};
>> +
>>  static struct kvm_msr_based_features msr_based_features[] = {
>> +	{ MSR_F10H_DECFG, MSR_F10H_DECFG_LFENCE_SERIALIZE, msr_decfg_match },
>>  	{}
>>  };
>>  
>>
>
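
The "return 1" variant suggested above can be modelled in a few lines of
userspace C. This is a rough sketch, not the svm.c code: struct ex_vcpu
and ex_set_decfg() are hypothetical names, and the bit position is an
illustrative value. The one KVM behaviour it relies on is that a
non-zero return from the MSR write path ends up as a #GP for a guest
WRMSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_DECFG_LFENCE_SERIALIZE	(1ULL << 1)	/* illustrative bit position */

/* Hypothetical per-vCPU state holding the value the host programmed. */
struct ex_vcpu {
	uint64_t msr_decfg;
};

/*
 * Returns 0 on success and 1 on failure. A non-zero result stands for
 * "fault the write" (a #GP in the guest) instead of silently ignoring it.
 */
int ex_set_decfg(struct ex_vcpu *vcpu, uint64_t data, bool host_initiated)
{
	/* Only the host may set this MSR; reject guest writes. */
	if (!host_initiated)
		return 1;

	/* Reject any bit other than the LFENCE-serializing one. */
	if (data & ~EX_DECFG_LFENCE_SERIALIZE)
		return 1;

	vcpu->msr_decfg = data;
	return 0;
}
```

With this shape a guest that writes the bit through a fault-tolerant
helper such as msr_set_bit() sees the fault handled, as noted in the
reply, while the stored value stays under host control.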

Re: [RFC PATCH 1/2] KVM: x86: Add a framework for supporting MSR-based features

2018-02-13 Thread Tom Lendacky

On 2/13/2018 10:21 AM, Paolo Bonzini wrote:
> On 08/02/2018 23:58, Tom Lendacky wrote:
>> Provide a new KVM capability that allows bits within MSRs to be recognized
>> as features.  Two new ioctls are added to the VM ioctl routine to retrieve
>> the list of these MSRs and their values. The MSR features can optionally
>> be exposed based on a CPU and/or a CPU feature.
> 
> Yes, pretty much.  Just two changes:
> 
>> +struct kvm_msr_based_features {
>> +	u32 msr;			/* MSR to query */
>> +	u64 mask;			/* MSR mask */
>> +	const struct x86_cpu_id *match;	/* Match criteria */
>> +	u64 value;			/* MSR value */
> 
> 1) These two should be replaced by a kvm_x86_ops callback, because
> computing the value is sometimes a bit more complicated than just rdmsr
> (for example, MSRs for VMX capabilities depend on the kvm_intel.ko
> module parameters).

Ok, I'll rework this.

> 
> 
>> +case KVM_CAP_GET_MSR_FEATURES:
> 
> This should be KVM_GET_MSR.

Yup, not sure what I was thinking there.

> 
>> +r = msr_io(NULL, argp, do_get_msr_features, 1);
>> +break;
> 
> 
> Bonus points for writing documentation :) and for moving the MSR handling
> code to arch/x86/kvm/msr.{c,h}.

Yup, there will be documentation on it - I wanted to make sure the
direction was correct first.  Splitting out msr.c/msr.h might be
best as a separate patchset, let me see what's involved.

Thanks,
Tom

> 
> Thanks,
> 
> Paolo
>
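
The callback approach requested in the review could look roughly like
the following userspace sketch. Everything here is an assumption made
for illustration: struct ex_x86_ops, its get_msr_feature member and the
ex_* helpers are hypothetical names, not the kvm_x86_ops interface, and
the MSR index and value are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor ops table with a single MSR-feature callback. */
struct ex_x86_ops {
	/* Fills *data and returns 0 if the MSR feature is supported, else 1. */
	int (*get_msr_feature)(uint32_t msr, uint64_t *data);
};

/* Example vendor callback; a real one could consult module parameters. */
static int ex_svm_get_msr_feature(uint32_t msr, uint64_t *data)
{
	switch (msr) {
	case 0xc0011029u:		/* illustrative DE_CFG index */
		*data = 1ULL << 1;	/* illustrative LFENCE-serializing bit */
		return 0;
	default:
		return 1;
	}
}

const struct ex_x86_ops ex_svm_ops = {
	.get_msr_feature = ex_svm_get_msr_feature,
};

/* Generic code asks the vendor module instead of reading a static table. */
int ex_get_msr_feature(const struct ex_x86_ops *ops, uint32_t msr,
		       uint64_t *data)
{
	if (!ops || !ops->get_msr_feature)
		return 1;

	return ops->get_msr_feature(msr, data);
}
```

The point of the indirection is that the value is computed by the vendor
module at query time, which covers cases where a plain rdmsr would not
give the right answer.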

Re: [PATCH] x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME

2018-02-13 Thread Kirill A. Shutemov

On Tue, Feb 13, 2018 at 10:10:22PM -0600, Tom Lendacky wrote:
> On 2/8/2018 6:55 AM, Kirill A. Shutemov wrote:
> > AMD SME claims one bit from physical address to indicate whether the
> > page is encrypted or not. To achieve that we clear out the bit from
> > __PHYSICAL_MASK.
> 
> I was actually working on a suggestion by Linus to use one of the software
> page table bits to indicate encryption and translate that to the hardware
> bit when writing the actual page table entry.  With that, __PHYSICAL_MASK
> would go back to its original definition.

But you would need to mask it on reading of pfn from page table entry,
right? I expect it to have more overhead than this one.

And software bits are valuable. Do we still have a spare one for this?

-- 
 Kirill A. Shutemov
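
The trade-off raised here can be made concrete with a small userspace
model. It is a sketch under loud assumptions: the bit positions chosen
for the software bit (58) and the hardware encryption bit (47), the
52-bit physical mask and all ex_* names are hypothetical and only serve
to show where the extra masking step would appear.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT	12
#define EX_PHYS_MASK	((1ULL << 52) - 1)	/* unmodified physical mask */
#define EX_PFN_MASK	(EX_PHYS_MASK & ~((1ULL << EX_PAGE_SHIFT) - 1))

#define EX_SW_ENC	(1ULL << 58)	/* hypothetical software PTE bit */
#define EX_HW_ENC	(1ULL << 47)	/* hypothetical hardware encryption bit */

/* Translate the software bit to the hardware bit when writing an entry. */
uint64_t ex_pte_sw_to_hw(uint64_t pte)
{
	if (pte & EX_SW_ENC)
		pte = (pte & ~EX_SW_ENC) | EX_HW_ENC;
	return pte;
}

/* Naive pfn read: the hardware bit leaks into the pfn. */
uint64_t ex_pte_pfn_naive(uint64_t hw_pte)
{
	return (hw_pte & EX_PFN_MASK) >> EX_PAGE_SHIFT;
}

/* Correct pfn read: one extra mask on every read of a hardware entry. */
uint64_t ex_pte_pfn(uint64_t hw_pte)
{
	return (hw_pte & EX_PFN_MASK & ~EX_HW_ENC) >> EX_PAGE_SHIFT;
}
```

In this model the physical mask keeps its original definition, but every
pfn read pays for the additional mask, and one software bit is consumed,
which are the two costs questioned above.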

Re: [PATCH] x86/entry/64: Fix CR3 restore order in paranoid_exit()

2018-02-13 Thread Dave Hansen

On 02/13/2018 06:27 PM, Josh Poimboeuf wrote:
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1167,10 +1167,10 @@ ENTRY(paranoid_exit)
>   UNWIND_HINT_REGS
>   DISABLE_INTERRUPTS(CLBR_ANY)
>   TRACE_IRQS_OFF_DEBUG
> + RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
>   testl   %ebx, %ebx  /* swapgs needed? */
>   jnz .Lparanoid_exit_no_swapgs
>   TRACE_IRQS_IRETQ
> - RESTORE_CR3 scratch_reg=%rbx save_reg=%r14
>   SWAPGS_UNSAFE_STACK
>   jmp .Lparanoid_exit_restore
>  .Lparanoid_exit_no_swapgs:

TRACE_IRQS_* call non-entry functions that are not mapped by the user
CR3.  How can this possibly work?  What am I missing?

Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-13 Thread Vivek Gautam

Hi Tomasz,

On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa  wrote:
> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark  wrote:
>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa  wrote:
>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark  wrote:
>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa  wrote:
>>>>> Hi Vivek,
>>>>>
>>>>> Thanks for the patch. Please see my comments inline.
>>>>>
>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>>>  wrote:
>>>>>> While handling the concerned iommu, there should not be a
>>>>>> need to power control the drm devices from iommu interface.
>>>>>> If these drm devices need to be powered around this time,
>>>>>> the respective drivers should take care of this.
>>>>>>
>>>>>> Replace the pm_runtime_get/put_sync() with
>>>>>> pm_runtime_get/put_suppliers() calls, to power-up
>>>>>> the connected iommu through the device link interface.
>>>>>> In case the device link is not setup these get/put_suppliers()
>>>>>> calls will be a no-op, and the iommu driver should take care of
>>>>>> powering on its devices accordingly.
>>>>>>
>>>>>> Signed-off-by: Vivek Gautam 
>>>>>> ---
>>>>>>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>>>>>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>> index b23d33622f37..1ab629bbee69 100644
>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names,
>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu);
>>>>>> int ret;
>>>>>>
>>>>>> -   pm_runtime_get_sync(mmu->dev);
>>>>>> +   pm_runtime_get_suppliers(mmu->dev);
>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev);
>>>>>> -   pm_runtime_put_sync(mmu->dev);
>>>>>> +   pm_runtime_put_suppliers(mmu->dev);
>>>>>
>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device()
>>>>> callback and that's where necessary runtime PM gets should happen, if
>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
>>>>> with power state of device controlled by driver B (ARM SMMU).
>>>>
>>>> Note that we end up having to do the same, because of iommu_unmap()
>>>> while DRM driver is powered off..  it might be cleaner if it was all
>>>> self contained in the iommu driver, but that would make it so other
>>>> drivers couldn't call iommu_unmap() from an irq handler, which is
>>>> apparently something that some of them want to do..
>>>
>>> I'd assume that runtime PM status is already guaranteed to be active
>>> when the IRQ handler is running, by some other means (e.g.
>>> pm_runtime_get_sync() called earlier, when queuing some work to the
>>> hardware). Otherwise, I'm not sure how a powered down device could
>>> trigger an IRQ.
>>>
>>> So, if the master device power is already on, suppliers should be
>>> powered on as well, thanks to device links.
>>>
>>
>> umm, that is kindof the inverse of the problem..  the problem is
>> things like gpu driver (and v4l2 drivers that import dma-buf's,
>> afaict).. they will potentially call iommu->unmap() when device is not
>> active (due to userspace or things beyond the control of the driver)..
>> so *they* would want iommu to do pm get/put calls.
>
> Which is fine and which is actually already done by one of the patches
> in this series, not for map/unmap, but probe, add_device,
> remove_device. Having parts of the API doing it inside the callback
> and other parts outside sounds at least inconsistent.
>
>> But other drivers
>> trying to unmap from irq ctx would not.  Which is the contradictory
>> requirement that lead to the idea of iommu user powering up iommu for
>> unmap.
>
> Sorry, maybe I wasn't clear. My last message was supposed to show that
> it's not contradictory at all, because "other drivers trying to unmap
> from irq ctx" would already have called pm_runtime_get_*() earlier
> from a non-irq ctx, which would have also done the same on all the
> linked suppliers, including the IOMMU. The ultimate result would be
> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync()
> would do nothing besides incrementing the reference count.

The entire point was to avoid the slowpath that pm_runtime_get/put_sync()
would add in map/unmap. It would not be correct to add a slowpath in irq_ctx
for taking care of non-irq_ctx and for the situations where master is already
powered-off.

>
>>
>> There has already been some discussion about this on various earlier
>> permutations of this patchset.  I think we have exhausted all other
>> options.
>
> I guess I should have read those. Let me do that now.
Yeah, I point to the thread in the cover letter and in [PATCH 1/6].
Thanks.

regards
Vivek

>
> Best regards,
> Tomasz
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm"
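
The reference-counting argument in this exchange can be illustrated with
a toy userspace model. It is deliberately not the runtime PM API: struct
ex_dev, ex_pm_get() and ex_pm_put() are hypothetical, and the model only
assumes that powering up a consumer first powers up its linked supplier.

```c
#include <stddef.h>

/* Toy device: a usage count, a count of real power-ups, one supplier. */
struct ex_dev {
	int usage;		/* outstanding get calls */
	int resumes;		/* number of actual power-up transitions */
	struct ex_dev *supplier;	/* linked supplier, or NULL */
};

/* Take a reference; power up (supplier first) only on the 0 -> 1 edge. */
void ex_pm_get(struct ex_dev *dev)
{
	if (dev->usage++ > 0)
		return;		/* already active: only the count changes */

	if (dev->supplier)
		ex_pm_get(dev->supplier);
	dev->resumes++;		/* stands in for the slow resume path */
}

/* Drop a reference; release the supplier on the 1 -> 0 edge. */
void ex_pm_put(struct ex_dev *dev)
{
	if (--dev->usage > 0)
		return;

	if (dev->supplier)
		ex_pm_put(dev->supplier);
}
```

In this model a get on the IOMMU while the master is active is a pure
counter increment, which is the case argued for IRQ context. A get on
the IOMMU while the master is powered off takes the resume path, which
is the slow path the reply above wants to keep out of map/unmap.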

Re: [PATCH v7 3/9] arm: omap: Move dmtimer.h out of plat-omap

2018-02-13 Thread Keerthy



On Tuesday 13 February 2018 08:39 PM, Tony Lindgren wrote:
> * Suman Anna  [180213 02:07]:
>> On 01/09/2018 12:23 AM, J, KEERTHY wrote:
>>> The header file is currently under plat-omap directory
>>> under arch/omap. Move this out to an accessible place.
>>> @@ -18,7 +18,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> -#include 
>>> +#include 
>>
>> These headers are actually not needed in the first-place since we no
>> longer create any non-DT timer devices. I have submitted a series to
>> cleanup the presence of this header file, as part of a larger hwmod data
>> cleanup series.
> 
> OK great. Keerthy, care to take a look? Seems like it
> simplifies things a bit.

Sure Tony. I will look into this.

> 
> Regards,
> 
> Tony
>

Re: [PATCH] x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME

2018-02-13 Thread Tom Lendacky

On 2/8/2018 6:55 AM, Kirill A. Shutemov wrote:
> AMD SME claims one bit from physical address to indicate whether the
> page is encrypted or not. To achieve that we clear out the bit from
> __PHYSICAL_MASK.

I was actually working on a suggestion by Linus to use one of the software
page table bits to indicate encryption and translate that to the hardware
bit when writing the actual page table entry.  With that, __PHYSICAL_MASK
would go back to its original definition.

Thanks,
Tom

> 
> The capability to adjust __PHYSICAL_MASK is required beyond AMD SME.
> For instance for upcoming Intel Multi-Key Total Memory Encryption.
> 
> Let's factor it out into separate feature with own Kconfig handle.
> 
> It also helps with overhead of AMD SME. It saves more than 3k in .text
> on defconfig + AMD_MEM_ENCRYPT:
> 
>   add/remove: 3/2 grow/shrink: 5/110 up/down: 189/-3753 (-3564)
> 
> We would need to return to this once we have infrastructure to patch
> constants in code. That's good candidate for it.
> 
> Signed-off-by: Kirill A. Shutemov 
> ---
>  arch/x86/Kconfig | 4 
>  arch/x86/boot/compressed/pagetable.c | 3 +++
>  arch/x86/include/asm/page_types.h| 8 +++-
>  arch/x86/mm/mem_encrypt.c| 3 +++
>  arch/x86/mm/pgtable.c| 5 +
>  5 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b52cdf48ad26..ffd9ef3f6ca6 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -332,6 +332,9 @@ config ARCH_SUPPORTS_UPROBES
>  config FIX_EARLYCON_MEM
>   def_bool y
>  
> +config DYNAMIC_PHYSICAL_MASK
> + bool
> +
>  config PGTABLE_LEVELS
>   int
>   default 5 if X86_5LEVEL
> @@ -1469,6 +1472,7 @@ config ARCH_HAS_MEM_ENCRYPT
>  config AMD_MEM_ENCRYPT
>   bool "AMD Secure Memory Encryption (SME) support"
>   depends on X86_64 && CPU_SUP_AMD
> + select DYNAMIC_PHYSICAL_MASK
>   ---help---
> Say yes to enable support for the encryption of system memory.
> This requires an AMD processor that supports Secure Memory
> diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
> index b5e5e02f8cde..4318ac0af815 100644
> --- a/arch/x86/boot/compressed/pagetable.c
> +++ b/arch/x86/boot/compressed/pagetable.c
> @@ -16,6 +16,9 @@
>  #define __pa(x)  ((unsigned long)(x))
>  #define __va(x)  ((void *)((unsigned long)(x)))
>  
> +/* No need for an adjustable __PHYSICAL_MASK during the decompression phase */
> +#undef CONFIG_DYNAMIC_PHYSICAL_MASK
> +
>  /*
>   * The pgtable.h and mm/ident_map.c includes make use of the SME related
>   * information which is not used in the compressed image support. Un-define
> diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
> index 1e53560a84bb..c85e15010f48 100644
> --- a/arch/x86/include/asm/page_types.h
> +++ b/arch/x86/include/asm/page_types.h
> @@ -17,7 +17,6 @@
>  #define PUD_PAGE_SIZE(_AC(1, UL) << PUD_SHIFT)
>  #define PUD_PAGE_MASK(~(PUD_PAGE_SIZE-1))
>  
> -#define __PHYSICAL_MASK  ((phys_addr_t)(__sme_clr((1ULL << __PHYSICAL_MASK_SHIFT) - 1)))
>  #define __VIRTUAL_MASK   ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
>  
>  /* Cast *PAGE_MASK to a signed type so that it is sign-extended if
> @@ -55,6 +54,13 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +#ifdef CONFIG_DYNAMIC_PHYSICAL_MASK
> +extern phys_addr_t physical_mask;
> +#define __PHYSICAL_MASK  physical_mask
> +#else
> +#define __PHYSICAL_MASK  ((phys_addr_t)((1ULL << __PHYSICAL_MASK_SHIFT) - 1))
> +#endif
> +
>  extern int devmem_is_allowed(unsigned long pagenr);
>  
>  extern unsigned long max_low_pfn_mapped;
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 1a53071e2e17..18954f97f3da 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -999,6 +999,7 @@ void __init __nostackprotector sme_enable(struct boot_params *bp)
>   /* SEV state cannot be controlled by a command line option */
>   sme_me_mask = me_mask;
>   sev_enabled = true;
> + physical_mask &= ~sme_me_mask;
>   return;
>   }
>  
> @@ -1033,4 +1034,6 @@ void __init __nostackprotector sme_enable(struct boot_params *bp)
>   sme_me_mask = 0;
>   else
>   sme_me_mask = active_by_default ? me_mask : 0;
> +
> + physical_mask &= ~sme_me_mask;
>  }
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 004abf9ebf12..a4dfe85f2fd8 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -7,6 +7,11 @@
>  #include 
>  #include 
>  
> +#ifdef CONFIG_DYNAMIC_PHYSICAL_MASK
> +phys_addr_t physical_mask __ro_after_init = (1ULL << __PHYSICAL_MASK_SHIFT) - 1;
> +EXPORT_SYMBOL(physical_mask);
> +#endif
> +
>  #define PGALLOC_GFP (GFP_KERNEL_ACCOUNT | __GFP_ZERO)
>  
>  #ifdef CONFIG_HIGHPTE
>
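
The mask arithmetic in the quoted patch can be tried out in user space. The
sketch below is only a model, not kernel code: the shift of 52 and the helper
names are assumptions for illustration, and the encryption bit position is
passed in by the caller because on real hardware it is reported by the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width of the physical address field, for illustration only. */
#define MODEL_PHYSICAL_MASK_SHIFT 52

/* Starts out covering every physical address bit, like the patch's
 * physical_mask variable. */
static uint64_t physical_mask = (1ULL << MODEL_PHYSICAL_MASK_SHIFT) - 1;
static uint64_t sme_me_mask;

/* Model of the patch's "physical_mask &= ~sme_me_mask" step.
 * c_bit is the (hypothetical) position of the encryption bit. */
static void model_sme_enable(unsigned int c_bit)
{
	assert(c_bit < MODEL_PHYSICAL_MASK_SHIFT);
	sme_me_mask = 1ULL << c_bit;
	physical_mask &= ~sme_me_mask;
}

/* Extract the page frame address from a page table entry value,
 * dropping the low 12 flag bits and anything outside physical_mask. */
static uint64_t model_pte_to_phys(uint64_t pte)
{
	return pte & physical_mask & ~0xfffULL;
}
```

With the encryption bit removed from the mask, an entry that has the bit set
resolves to the same physical address as one that does not.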

Re: [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-02-13 Thread Tomasz Figa

On Tue, Feb 13, 2018 at 7:25 PM, Vivek Gautam
 wrote:
>>> +static int arm_smmu_init_clks(struct arm_smmu_device *smmu)
>>> +{
>>> +   int i;
>>> +   int num = smmu->num_clks;
>>> +   const struct arm_smmu_match_data *data;
>>> +
>>> +   if (num < 1)
>>> +   return 0;
>>> +
>>> +   smmu->clocks = devm_kcalloc(smmu->dev, num,
>>> +   sizeof(*smmu->clocks), GFP_KERNEL);
>>> +   if (!smmu->clocks)
>>> +   return -ENOMEM;
>>> +
>>> +   data = of_device_get_match_data(smmu->dev);
>>> +
>>> +   for (i = 0; i < num; i++)
>>> +   smmu->clocks[i].id = data->clks[i];
>>
>> I'd argue that arm_smmu_device_dt_probe() is a better place for all
>> the code above, since this function is called regardless of whether
>> the device is probed from DT or not. Going further,
>> arm_smmu_device_acpi_probe() could fill smmu->num_clks and ->clocks
>> using ACPI-like way (as opposed to OF match data) if necessary.
>
> Right, it's valid to fill the data in arm_smmu_device_dt_probe().
> Perhaps we can just keep the devm_clk_bulk_get() in arm_smmu_device_probe()
> at the point where we are currently doing arm_smmu_init_clks().

Sounds good to me. Thanks.

Best regards,
Tomasz
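
The split agreed on above can be sketched in user space as follows. This is
a simplified model under stated assumptions, not the driver code: the struct
layouts, the num_clks field in the match data, and the function name are
hypothetical, and calloc() stands in for devm_kcalloc(). The DT-specific step
only fills in the clock names; acquiring the clocks would stay in the common
probe path.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for one clock entry; only the name matters here. */
struct clk_entry {
	const char *id;
};

/* Hypothetical per-compatible match data listing the clock names. */
struct match_data {
	const char *const *clks;
	int num_clks;
};

struct smmu_model {
	struct clk_entry *clocks;
	int num_clks;
};

/* DT-specific step: copy clock names from match data into the device.
 * Returns 0 on success and -1 on allocation failure. */
static int model_dt_probe(struct smmu_model *smmu, const struct match_data *data)
{
	int i;

	assert(smmu != NULL && data != NULL);

	smmu->clocks = NULL;
	smmu->num_clks = data->num_clks;
	if (smmu->num_clks < 1)
		return 0;

	smmu->clocks = calloc((size_t)smmu->num_clks, sizeof(*smmu->clocks));
	if (!smmu->clocks)
		return -1;

	for (i = 0; i < smmu->num_clks; i++)
		smmu->clocks[i].id = data->clks[i];

	return 0;
}
```

An ACPI probe path could fill the same two fields from a different source
without touching the common code that later requests the clocks.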

[PATCH RESEND v5 0/3] Add support for Hi3660 mailbox driver

2018-02-13 Thread Leo Yan

The Hi3660 mailbox controller is used to send messages between multiple
processors, the MCU, HIFI, etc. This patch series implements an
initial version of the Hi3660 mailbox driver with "automatic
acknowledge" mode.

The patch set has been verified with the Hi3660 stub clock driver, so
we can send messages to the MCU to execute CPU frequency scaling. This
was tested on the 96boards Hikey960.

Changes from v4:
* Following Jassi's suggestion, refactored the mailbox driver and removed
  "inline" from function declarations;

Changes from v3:
* Following Jassi's suggestion, renamed the structure to
  "struct hi3660_chan_info";
* Following Jassi's suggestion, moved the channel 'lock'+'acquire'
  operations into .startup();

Changes from v2:
* Following Mark Rutland's suggestions, removed sev()/wfe() from the
  driver; no two masters in the system share the same channel for
  data transfer, so we don't need these instructions;
* Refined the DT binding and doc according to Rob's suggestions;
* Refined the driver according to Julien's suggestions;

Changes from v1:
* Added a cover letter to track the changelog;
* Added a document for the DT binding;
* Refactored and simplified the mailbox driver with "automatic ack" mode;
* Refined the commit logs for the patches;


Kaihua Zhong (2):
  mailbox: Add support for Hi3660 mailbox
  dts: arm64: Add mailbox binding for hi3660

Leo Yan (1):
  dt-bindings: mailbox: Introduce Hi3660 controller binding

 .../bindings/mailbox/hisilicon,hi3660-mailbox.txt  |  51 
 arch/arm64/boot/dts/hisilicon/hi3660.dtsi  |   8 +
 drivers/mailbox/Kconfig|   8 +
 drivers/mailbox/Makefile   |   2 +
 drivers/mailbox/hi3660-mailbox.c   | 316 +
 5 files changed, 385 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mailbox/hisilicon,hi3660-mailbox.txt
 create mode 100644 drivers/mailbox/hi3660-mailbox.c

-- 
1.9.1

[PATCH RESEND v5 1/3] dt-bindings: mailbox: Introduce Hi3660 controller binding

2018-02-13 Thread Leo Yan

Introduce a binding for the Hi3660 mailbox controller. The mailbox is
used for communication among the application processor (AP), the
communication processor (CP), HIFI, the MCU, etc.

Acked-by: Rob Herring 
Signed-off-by: Leo Yan 
---
 .../bindings/mailbox/hisilicon,hi3660-mailbox.txt  | 51 ++
 1 file changed, 51 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mailbox/hisilicon,hi3660-mailbox.txt

diff --git a/Documentation/devicetree/bindings/mailbox/hisilicon,hi3660-mailbox.txt b/Documentation/devicetree/bindings/mailbox/hisilicon,hi3660-mailbox.txt
new file mode 100644
index 000..3e5b453
--- /dev/null
+++ b/Documentation/devicetree/bindings/mailbox/hisilicon,hi3660-mailbox.txt
@@ -0,0 +1,51 @@
+Hisilicon Hi3660 Mailbox Controller
+
+Hisilicon Hi3660 mailbox controller supports up to 32 channels.  Messages
+are passed between processors, including application & communication
+processors, MCU, HIFI, etc.  Each channel is unidirectional and accessed
+by using MMIO registers; a message can be up to 8 words long.
+
+Controller
+--
+
+Required properties:
+- compatible:  : Shall be "hisilicon,hi3660-mbox"
+- reg: : Offset and length of the device's register set
+- #mbox-cells: : Must be 3
+ <&phandle channel dst_irq ack_irq>
+   phandle : Label name of controller
+   channel : Channel number
+   dst_irq : Remote interrupt vector
+   ack_irq : Local interrupt vector
+
+- interrupts:  : Contains the two IRQ lines for mailbox.
+
+Example:
+
+mailbox: mailbox@e896b000 {
+   compatible = "hisilicon,hi3660-mbox";
+   reg = <0x0 0xe896b000 0x0 0x1000>;
+   interrupts = <0x0 0xc0 0x4>,
+<0x0 0xc1 0x4>;
+   #mbox-cells = <3>;
+};
+
+Client
+--
+
+Required properties:
+- compatible   : See the client docs
+- mboxes   : Standard property to specify a Mailbox (See ./mailbox.txt)
+ Cells must match 'mbox-cells' (See Controller docs above)
+
+Optional properties
+- mbox-names   : Name given to channels seen in the 'mboxes' property.
+
+Example:
+
+stub_clock: stub_clock@e896b500 {
+   compatible = "hisilicon,hi3660-stub-clk";
+   reg = <0x0 0xe896b500 0x0 0x0100>;
+   #clock-cells = <1>;
+   mboxes = <&mailbox 13 3 0>;
+};
-- 
1.9.1
