Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-15 Thread Eduardo Valentin
Hey Radim,

On Thu, Nov 09, 2017 at 03:17:33PM +0100, Radim Krčmář wrote:



> 
> This is what I'm doubting, because the patch is adding about two
> thousand cycles to every spinlock-taken path.
> Doesn't this patch yield better results?
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 3df743b60c80..d9225e48c11a 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -676,6 +676,12 @@ void __init kvm_spinlock_init(void)
>  {
>   if (!kvm_para_available())
>   return;
> +
> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED)) {
> + static_branch_disable(_spin_lock_key);
> + return;
> + }
> +

Yes, the above suggestion is a much better approach. The code has probably 
changed from the time I wrote the first version. I will refresh with the above 
suggestion.


>   /* Does host kernel support KVM_FEATURE_PV_UNHALT? */
>   if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
>   return;
> 
> >  However, the 
> > key aspect
> > here is this patch gives a way for the host to instruct the guest to use 
> > qspinlock.
> > Even with Longman's patch which allows guest to select the spinlock 
> > implementation,
> > there should still be the auto-select mode. In such mode, PV_DEDICATED 
> > should
> > allow the host to get the guest to use qspinlock, without, the guest will 
> > fallback
> > to tas when PV_UNHALT == 0.
> 
> I agree that a flag can be useful for certains setups.

Cool!

> 

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-09 Thread Eduardo Valentin
Hello,

On Wed, Nov 08, 2017 at 06:36:52PM +0100, Radim Krčmář wrote:
> 2017-11-06 12:26-0800, Eduardo Valentin:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> > 
> > With this patch, when in autoselect mode, the guest will
> > use the default spinlock implementation based on host feature
> > flags as follows:
> > 
> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> > 
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: "Radim Krčmář" <rkrc...@redhat.com>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > Cc: x...@kernel.org
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Waiman Long <long...@redhat.com>
> > Cc: k...@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org
> > Cc: Jan H. Schoenherr <jscho...@amazon.de>
> > Cc: Anthony Liguori <aligu...@amazon.com>
> > Suggested-by: Matt Wilson <m...@amazon.com>
> > Signed-off-by: Eduardo Valentin <edu...@amazon.com>
> > ---
> > V3:
> >  - When PV_DEDICATED is set (1), qspinlock is selected,
> >regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
> >  - Refreshed on top of tip/master.
> > V2:
> >  - rebase on top of tip/master
> > 
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++
> >  arch/x86/include/asm/qspinlock.h | 4 
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  arch/x86/kernel/kvm.c| 2 ++
> >  4 files changed, 13 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/cpuid.txt 
> > b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest 
> > checks this feature bit
> > ||   || before enabling 
> > paravirtualized
> > ||   || spinlock support.
> >  
> > --
> > +KVM_FEATURE_PV_DEDICATED   || 8 || guest checks this feature 
> > bit
> > +   ||   || to determine if they run on
> > +   ||   || dedicated vCPUs, allowing 
> > opti-
> > +   ||   || mizations such as usage of
> > +   ||   || qspinlocks.
> > +--
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no 
> > guest-side
> > ||   || per-cpu warps are expected 
> > in
> > ||   || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h 
> > b/arch/x86/include/asm/qspinlock.h
> > index 5e16b5d..de42694 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -3,6 +3,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >  
> >  #include 
> > +#include 
> > +
> >  #include 
> >  #include 
> >  #include 
> > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > if (!static_branch_likely(_spin_lock_key))
> > return false;
> >  
> > +   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +   return false;
> 
> Hm, every spinlock slowpath calls cpuid, which causes a VM exit, so I
> wouldn't expect it to be faster than the existing implementations.
> (Using the static key would be better.)
> 
> How does this patch perform compared to user-forced qspinlock and hybrid
> pvqspinlock?

This pa

Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-08 Thread Eduardo Valentin
Paolo,

On Tue, Nov 07, 2017 at 01:43:15PM +0100, Paolo Bonzini wrote:
> On 07/11/2017 13:39, Eduardo Valentin wrote:
> >> is this still needed after Waiman's patch to adaptively switch between
> >> tas and pvqspinlock?
> > Can you please point me to it ? Is it already in tip/master?
> > 
> 
> No, he just posted it:
> 
> https://marc.info/?l=linux-kernel=150972337909996=2

OK, Thanks. I've reviewed his V2. I think the patch to have PV_DEDICATED is 
still interesting to have,
for the case the guest is in autoselect mode and honors what the host gives as 
hints.

> 
> Paolo

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-07 Thread Eduardo Valentin
On Tue, Nov 07, 2017 at 01:23:56PM +0100, Paolo Bonzini wrote:
> On 06/11/2017 21:26, Eduardo Valentin wrote:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> > 
> > With this patch, when in autoselect mode, the guest will
> > use the default spinlock implementation based on host feature
> > flags as follows:
> > 
> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> 
> Hi Eduardo,
> 
> besides the suggestion to use a separate word than the one for features,

Ok. I will take a look.

> is this still needed after Waiman's patch to adaptively switch between
> tas and pvqspinlock?

Can you please point me to it ? Is it already in tip/master?

> 
> Paolo
> 
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: "Radim Krčmář" <rkrc...@redhat.com>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > Cc: x...@kernel.org
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Waiman Long <long...@redhat.com>
> > Cc: k...@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org
> > Cc: Jan H. Schoenherr <jscho...@amazon.de>
> > Cc: Anthony Liguori <aligu...@amazon.com>
> > Suggested-by: Matt Wilson <m...@amazon.com>
> > Signed-off-by: Eduardo Valentin <edu...@amazon.com>
> > ---
> > V3:
> >  - When PV_DEDICATED is set (1), qspinlock is selected,
> >regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
> >  - Refreshed on top of tip/master.
> > V2:
> >  - rebase on top of tip/master
> > 
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++
> >  arch/x86/include/asm/qspinlock.h | 4 
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  arch/x86/kernel/kvm.c| 2 ++
> >  4 files changed, 13 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/cpuid.txt 
> > b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest 
> > checks this feature bit
> > ||   || before enabling 
> > paravirtualized
> > ||   || spinlock support.
> >  
> > --
> > +KVM_FEATURE_PV_DEDICATED   || 8 || guest checks this feature 
> > bit
> > +   ||   || to determine if they run on
> > +   ||   || dedicated vCPUs, allowing 
> > opti-
> > +   ||   || mizations such as usage of
> > +   ||   || qspinlocks.
> > +--
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no 
> > guest-side
> > ||   || per-cpu warps are expected 
> > in
> > ||   || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h 
> > b/arch/x86/include/asm/qspinlock.h
> > index 5e16b5d..de42694 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -3,6 +3,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >  
> >  #include 
> > +#include 
> > +
> >  #include 
> >  #include 
> >  #include 
> > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > if (!static_branch_likely(_spin_lock_key))
> > return false;
> >  
> > +   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +   return false;
> >

[PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-06 Thread Eduardo Valentin
Currently, the existing qspinlock implementation will fallback to
test-and-set if the hypervisor has not set the PV_UNHALT flag.

This patch gives the opportunity to guest kernels to select
between test-and-set and the regular queueu fair lock implementation
based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
flag is not set, the code will still fall back to test-and-set,
but when the PV_DEDICATED flag is set, the code will use
the regular queue spinlock implementation.

With this patch, when in autoselect mode, the guest will
use the default spinlock implementation based on host feature
flags as follows:

PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
PV_DEDICATED = 0, PV_UNHALT = 0: default is tas

Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: "Radim Krčmář" <rkrc...@redhat.com>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Waiman Long <long...@redhat.com>
Cc: k...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: Jan H. Schoenherr <jscho...@amazon.de>
Cc: Anthony Liguori <aligu...@amazon.com>
Suggested-by: Matt Wilson <m...@amazon.com>
Signed-off-by: Eduardo Valentin <edu...@amazon.com>
---
V3:
 - When PV_DEDICATED is set (1), qspinlock is selected,
   regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
 - Refreshed on top of tip/master.
V2:
 - rebase on top of tip/master

 Documentation/virtual/kvm/cpuid.txt  | 6 ++
 arch/x86/include/asm/qspinlock.h | 4 
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 arch/x86/kernel/kvm.c| 2 ++
 4 files changed, 13 insertions(+)

diff --git a/Documentation/virtual/kvm/cpuid.txt 
b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..117066a 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest checks 
this feature bit
||   || before enabling paravirtualized
||   || spinlock support.
 --
+KVM_FEATURE_PV_DEDICATED   || 8 || guest checks this feature bit
+   ||   || to determine if they run on
+   ||   || dedicated vCPUs, allowing opti-
+   ||   || mizations such as usage of
+   ||   || qspinlocks.
+--
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no guest-side
||   || per-cpu warps are expected in
||   || kvmclock.
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 5e16b5d..de42694 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -3,6 +3,8 @@
 #define _ASM_X86_QSPINLOCK_H
 
 #include 
+#include 
+
 #include 
 #include 
 #include 
@@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
if (!static_branch_likely(_spin_lock_key))
return false;
 
+   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+   return false;
/*
 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
 * back to a Test-and-Set spinlock, because fair locks have
diff --git a/arch/x86/include/uapi/asm/kvm_para.h 
b/arch/x86/include/uapi/asm/kvm_para.h
index 554aa8f..85a9875 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -25,6 +25,7 @@
 #define KVM_FEATURE_STEAL_TIME 5
 #define KVM_FEATURE_PV_EOI 6
 #define KVM_FEATURE_PV_UNHALT  7
+#define KVM_FEATURE_PV_DEDICATED   8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594..dacd7cf 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
 {
if (!kvm_para_available())
return;
+   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+   return;
/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
return;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-03 Thread Eduardo Valentin
On Fri, Nov 03, 2017 at 11:09:53AM +0100, Paolo Bonzini wrote:
> On 02/11/2017 19:43, Eduardo Valentin wrote:
> > On Thu, Nov 02, 2017 at 07:24:16PM +0100, Paolo Bonzini wrote:
> >> On 02/11/2017 19:08, Eduardo Valentin wrote:
> >>> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> >>>> On 02/11/2017 18:45, Eduardo Valentin wrote:
> >>>>> Currently, the existing qspinlock implementation will fallback to
> >>>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >>>>>
> >>>>> This patch gives the opportunity to guest kernels to select
> >>>>> between test-and-set and the regular queueu fair lock implementation
> >>>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> >>>>> flag is not set, the code will still fall back to test-and-set,
> >>>>> but when the PV_DEDICATED flag is set, the code will use
> >>>>> the regular queue spinlock implementation.
> >>>>
> >>>> Have you seen Waiman's series that lets you specify this on the guest
> >>>> command line instead?  Would this be acceptable for your use case?
> >>>
> >>> No, can you please share a link to it? is it already merged to tip/master?
> >>
> >> [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
> >> https://lkml.org/lkml/2017/11/1/655
> >>
> >>>> (In other words, is there a difference for you between making the host
> >>>> vs. guest administrator toggle the feature?  "@amazon.com" means you are
> >>>> the host admin, how would you use it?)
> >>>
> >>> The way I think of this is this is a flag set by host side so the
> >>> guest adapts accordingly.
> >>>
> >>> If the admin in guest side wants to ignore what the host is
> >>> flagging, that is a different story.
> >>
> >> Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
> >> such as "configuration hints", rather than properly a feature.
> > 
> > Oh OK, you don't think this starts to deviate from the feature concept.
> > But would the PV_UNHALT also go to "configuration hints" bucket?
> 
> PV_UNHALT says whether the pvqspinlock API is available, PV_DEDICATED
> says whether it should be used.
> 
> > Another way to see this is we have three locking feature options to select 
> > from,
> > so we need at least two bits here.
> 
> PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> 
> What do you think?

Sounds reasonable, and it is almost what the patch does. But to achieve the 
above
table, we need in include the following chunk:
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594..dacd7cf 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
 {
if (!kvm_para_available())
return;
+   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+   return;
/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
return;

Now we get:
PV_DEDICATEDPV_UNHALT   IMPLEMENTATION
1   X   qspinlock
0   1   pvspinlock
0   0   tas

Do you still think PV_DEDICATED goes as a "configuration hint", given that it 
would take precedence on a feature (PV_UNHALT)?
Or do we keep everything as features (two bits to represent selection of three 
features)?

BR,


> 
> Paolo

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-02 Thread Eduardo Valentin
On Thu, Nov 02, 2017 at 07:24:16PM +0100, Paolo Bonzini wrote:
> On 02/11/2017 19:08, Eduardo Valentin wrote:
> > On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> >> On 02/11/2017 18:45, Eduardo Valentin wrote:
> >>> Currently, the existing qspinlock implementation will fallback to
> >>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >>>
> >>> This patch gives the opportunity to guest kernels to select
> >>> between test-and-set and the regular queueu fair lock implementation
> >>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> >>> flag is not set, the code will still fall back to test-and-set,
> >>> but when the PV_DEDICATED flag is set, the code will use
> >>> the regular queue spinlock implementation.
> >>
> >> Have you seen Waiman's series that lets you specify this on the guest
> >> command line instead?  Would this be acceptable for your use case?
> > 
> > No, can you please share a link to it? is it already merged to tip/master?
> 
> [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
> https://lkml.org/lkml/2017/11/1/655
> 
> >> (In other words, is there a difference for you between making the host
> >> vs. guest administrator toggle the feature?  "@amazon.com" means you are
> >> the host admin, how would you use it?)
> > 
> > The way I think of this is this is a flag set by host side so the
> > guest adapts accordingly.
> > 
> > If the admin in guest side wants to ignore what the host is
> > flagging, that is a different story.
> 
> Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
> such as "configuration hints", rather than properly a feature.

Oh OK, you don't think this starts to deviate from the feature concept.
But would the PV_UNHALT also go to "configuration hints" bucket?

Another way to see this is we have three locking feature options to select from,
so we need at least two bits here.

> 
> Paolo

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-02 Thread Eduardo Valentin
Longman,

On Thu, Nov 02, 2017 at 02:12:13PM -0400, Waiman Long wrote:
> On 11/02/2017 02:08 PM, Eduardo Valentin wrote:
> > On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> >> On 02/11/2017 18:45, Eduardo Valentin wrote:
> >>> Currently, the existing qspinlock implementation will fallback to
> >>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >>>
> >>> This patch gives the opportunity to guest kernels to select
> >>> between test-and-set and the regular queueu fair lock implementation
> >>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> >>> flag is not set, the code will still fall back to test-and-set,
> >>> but when the PV_DEDICATED flag is set, the code will use
> >>> the regular queue spinlock implementation.
> >> Have you seen Waiman's series that lets you specify this on the guest
> >> command line instead?  Would this be acceptable for your use case?
> >>
> > No, can you please share a link to it? is it already merged to tip/master?
> 
> See https://lkml.org/lkml/2017/11/1/655

Oh I see, thanks. I think that patch would help, but I believe the series and 
this patch are complementary.

Paolo, back to your question, I think this patch still makes sense in 
combination with Waiman's series
for the following case:

+ * If this argument is not specified, the kernel will automatically choose
+ * an appropriate one depending on X86_FEATURE_HYPERVISOR and hypervisor
+ * specific settings.
+ */

 In this case, the hypervisor can still flag PV_DEDICATED and the guest would 
not pick test, when
that scenario is desirable.

> 
> Cheers,
> Longman
> 

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-02 Thread Eduardo Valentin
On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> On 02/11/2017 18:45, Eduardo Valentin wrote:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> 
> Have you seen Waiman's series that lets you specify this on the guest
> command line instead?  Would this be acceptable for your use case?
> 

No, can you please share a link to it? is it already merged to tip/master?

> (In other words, is there a difference for you between making the host
> vs. guest administrator toggle the feature?  "@amazon.com" means you are
> the host admin, how would you use it?)
> 

The way I think of this is this is a flag set by host side so the guest adapts 
accordingly.

If the admin in guest side wants to ignore what the host is flagging, that is a 
different story.

> Thanks,
> 
> Paolo
> 
> > Cc: Paolo Bonzini <pbonz...@redhat.com>
> > Cc: "Radim Krčmář" <rkrc...@redhat.com>
> > Cc: Jonathan Corbet <cor...@lwn.net>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > Cc: x...@kernel.org
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: Waiman Long <long...@redhat.com>
> > Cc: k...@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-ker...@vger.kernel.org
> > Cc: Jan H. Schoenherr <jscho...@amazon.de>
> > Cc: Anthony Liguori <aligu...@amazon.com>
> > Suggested-by: Matt Wilson <m...@amazon.com>
> > Signed-off-by: Eduardo Valentin <edu...@amazon.com>
> > ---
> > V2:
> >  - rebase on top of tip/master
> > 
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++
> >  arch/x86/include/asm/qspinlock.h | 4 
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  3 files changed, 11 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/cpuid.txt 
> > b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest 
> > checks this feature bit
> > ||   || before enabling 
> > paravirtualized
> > ||   || spinlock support.
> >  
> > --
> > +KVM_FEATURE_PV_DEDICATED   || 8 || guest checks this feature 
> > bit
> > +   ||   || to determine if they run on
> > +   ||   || dedicated vCPUs, allowing 
> > opti-
> > +   ||   || mizations such as usage of
> > +   ||   || qspinlocks.
> > +--
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no 
> > guest-side
> > ||   || per-cpu warps are expected 
> > in
> > ||   || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h 
> > b/arch/x86/include/asm/qspinlock.h
> > index 308dfd0..3751898 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -2,6 +2,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >  
> >  #include 
> > +#include 
> > +
> >  #include 
> >  #include 
> >  #include 
> > @@ -57,6 +59,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > if (!static_branch_likely(_spin_lock_key))
> > return false;
> >  
> > +   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +   return false;
> > /*
> >  * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >  * back to a Test-and-Set spinlock, because fair locks have
> > diff --git a/arch/x86/include/uapi/asm/kvm_para.h 
> > b/arch/x86/include/uapi/asm/kvm_para.h
> > index a96

[PATCHv2 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-11-02 Thread Eduardo Valentin
Currently, the existing qspinlock implementation will fallback to
test-and-set if the hypervisor has not set the PV_UNHALT flag.

This patch gives the opportunity to guest kernels to select
between test-and-set and the regular queueu fair lock implementation
based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
flag is not set, the code will still fall back to test-and-set,
but when the PV_DEDICATED flag is set, the code will use
the regular queue spinlock implementation.

Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: "Radim Krčmář" <rkrc...@redhat.com>
Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Waiman Long <long...@redhat.com>
Cc: k...@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: Jan H. Schoenherr <jscho...@amazon.de>
Cc: Anthony Liguori <aligu...@amazon.com>
Suggested-by: Matt Wilson <m...@amazon.com>
Signed-off-by: Eduardo Valentin <edu...@amazon.com>
---
V2:
 - rebase on top of tip/master

 Documentation/virtual/kvm/cpuid.txt  | 6 ++
 arch/x86/include/asm/qspinlock.h | 4 
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 3 files changed, 11 insertions(+)

diff --git a/Documentation/virtual/kvm/cpuid.txt 
b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..117066a 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest checks 
this feature bit
||   || before enabling paravirtualized
||   || spinlock support.
 --
+KVM_FEATURE_PV_DEDICATED   || 8 || guest checks this feature bit
+   ||   || to determine if they run on
+   ||   || dedicated vCPUs, allowing opti-
+   ||   || mizations such as usage of
+   ||   || qspinlocks.
+--
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no guest-side
||   || per-cpu warps are expected in
||   || kvmclock.
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 308dfd0..3751898 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -2,6 +2,8 @@
 #define _ASM_X86_QSPINLOCK_H
 
 #include 
+#include 
+
 #include 
 #include 
 #include 
@@ -57,6 +59,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
if (!static_branch_likely(_spin_lock_key))
return false;
 
+   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+   return false;
/*
 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
 * back to a Test-and-Set spinlock, because fair locks have
diff --git a/arch/x86/include/uapi/asm/kvm_para.h 
b/arch/x86/include/uapi/asm/kvm_para.h
index a965e5b0..d151300 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
 #define KVM_FEATURE_STEAL_TIME 5
 #define KVM_FEATURE_PV_EOI 6
 #define KVM_FEATURE_PV_UNHALT  7
+#define KVM_FEATURE_PV_DEDICATED   8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-10-31 Thread Eduardo Valentin
Hello Radim,

On Tue, Oct 24, 2017 at 01:18:59PM +0200, Radim Krčmář wrote:
> 2017-10-23 17:44-0700, Eduardo Valentin:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> 
> Where have you detected the main source of overhead with pinned VCPUs?
> Makes me wonder if we couldn't improve general PV_UNHALT,

This is essentially for cases of non-overcommitted vCPUs in which we want 
the instance vCPUs to run uninterrupted as much as possible. Here by disabling
the PV_UNHALT,  we avoid the accounting needed to properly do the PV_UNHALT 
hypercall, as the lock holder won't be preempted anyway for the 1:1 pin case.

> 
> thanks.
> 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> 
> Some flag makes sense and we do want to make sure that userspaces don't
> enable it in pass-through-cpuid mode.

Did you mean something like:
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0099e10..8ceb503 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -211,7 +211,8 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
}
for (i = 0; i < cpuid->nent; i++) {
vcpu->arch.cpuid_entries[i].function = 
cpuid_entries[i].function;
-   vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
+   vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax &
+   
~KVM_FEATURE_PV_DEDICATED;
vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;


But I do not see any other KVM_FEATURE_* being enforced (e.g. PV_UNHALT).
Do you mind elaborating a bit here?

> 

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-10-24 Thread Eduardo Valentin
Hey Waiman,

On Tue, Oct 24, 2017 at 12:07:04PM -0400, Waiman Long wrote:
> On 10/24/2017 11:37 AM, Eduardo Valentin wrote:
> > Hello Peter,
> > On Tue, Oct 24, 2017 at 10:13:45AM +0200, Peter Zijlstra wrote:
> >> On Mon, Oct 23, 2017 at 05:44:27PM -0700, Eduardo Valentin wrote:
> >>> @@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock 
> >>> *lock)
> >>>   if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
> >>>   return false;
> >>>  
> >>> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> >>> + return false;
> >>>   /*
> >>>* On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >>>* back to a Test-and-Set spinlock, because fair locks have
> >> This does not apply. Much has been changed here recently.
> >>
> >  I checked against Linus master branch before sending. Which tree/branch 
> > are you referring to / should I based this?
> >
> Please check the tip tree
> (https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git) which has
> the latest changes in locking code.

I will rebase the patch on top of the tip tree.

Thanks.

> 
> Cheers,
> Longman
> 

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-10-24 Thread Eduardo Valentin
Hello Peter,
On Tue, Oct 24, 2017 at 10:13:45AM +0200, Peter Zijlstra wrote:
> On Mon, Oct 23, 2017 at 05:44:27PM -0700, Eduardo Valentin wrote:
> > @@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
> > return false;
> >  
> > +   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +   return false;
> > /*
> >  * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >  * back to a Test-and-Set spinlock, because fair locks have
> 
> This does not apply. Much has been changed here recently.
> 

 I checked against Linus master branch before sending. Which tree/branch are 
you referring to / should I based this?

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

2017-10-23 Thread Eduardo Valentin
Currently, the existing qspinlock implementation will fallback to
test-and-set if the hypervisor has not set the PV_UNHALT flag.

This patch gives the opportunity to guest kernels to select
between test-and-set and the regular queueu fair lock implementation
based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
flag is not set, the code will still fall back to test-and-set,
but when the PV_DEDICATED flag is set, the code will use
the regular queue spinlock implementation.

Cc: Paolo Bonzini <pbonz...@redhat.com> 
Cc: "Radim Krčmář" <rkrc...@redhat.com> 
Cc: Jonathan Corbet <cor...@lwn.net> 
Cc: Thomas Gleixner <t...@linutronix.de> 
Cc: Ingo Molnar <mi...@redhat.com> 
Cc: "H. Peter Anvin" <h...@zytor.com> 
Cc: x...@kernel.org 
Cc: Peter Zijlstra <pet...@infradead.org> 
Cc: Waiman Long <long...@redhat.com> 
Cc: k...@vger.kernel.org 
Cc: linux-doc@vger.kernel.org 
Cc: linux-ker...@vger.kernel.org 
Cc: Jan H. Schoenherr <jscho...@amazon.de>
Cc: Anthony Liguori <aligu...@amazon.com>
Suggested-by: Matt Wilson <m...@amazon.com>
Signed-off-by: Eduardo Valentin <edu...@amazon.com>
---
 Documentation/virtual/kvm/cpuid.txt  | 6 ++
 arch/x86/include/asm/qspinlock.h | 4 
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 3 files changed, 11 insertions(+)

diff --git a/Documentation/virtual/kvm/cpuid.txt 
b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..117066a 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT  || 7 || guest checks 
this feature bit
||   || before enabling paravirtualized
||   || spinlock support.
 --
+KVM_FEATURE_PV_DEDICATED   || 8 || guest checks this feature bit
+   ||   || to determine if they run on
+   ||   || dedicated vCPUs, allowing opti-
+   ||   || mizations such as usage of
+   ||   || qspinlocks.
+--
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||24 || host will warn if no guest-side
||   || per-cpu warps are expected in
||   || kvmclock.
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index eaba080..f89b469 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_QSPINLOCK_H
 #define _ASM_X86_QSPINLOCK_H
 
+#include 
+
 #include 
 #include 
 #include 
@@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
return false;
 
+   if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+   return false;
/*
 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
 * back to a Test-and-Set spinlock, because fair locks have
diff --git a/arch/x86/include/uapi/asm/kvm_para.h 
b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..ad2e8fe 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
 #define KVM_FEATURE_STEAL_TIME 5
 #define KVM_FEATURE_PV_EOI 6
 #define KVM_FEATURE_PV_UNHALT  7
+#define KVM_FEATURE_PV_DEDICATED   8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
-- 
2.7.5

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] Documentation: update kernel enforcement support list

2017-10-17 Thread Eduardo Valentin
Adding myself to the list as I missed the window to be in the original patch.

Cc: Jonathan Corbet <cor...@lwn.net>
Cc: Bart Van Assche <bart.vanass...@wdc.com>
Cc: Namhyung Kim <namhy...@kernel.org>
Cc: Olof Johansson <o...@lixom.net>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Javier Martinez Canillas <jav...@dowhile0.org>
Signed-off-by: Eduardo Valentin <edu...@amazon.com>
---
 Documentation/process/kernel-enforcement-statement.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/process/kernel-enforcement-statement.rst 
b/Documentation/process/kernel-enforcement-statement.rst
index 1e23d42..18cfb4c 100644
--- a/Documentation/process/kernel-enforcement-statement.rst
+++ b/Documentation/process/kernel-enforcement-statement.rst
@@ -145,3 +145,4 @@ we might work for today, have in the past, or will in the 
future.
   - Masahiro Yamada
   - Wei Yongjun
   - Lv Zheng
+  - Eduardo Valentin 
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [xfstests PATCH v3 1/5] generic: add a writeback error handling test

2017-05-31 Thread Eduardo Valentin
   }
> + fd[i] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
> + if (fd[i] < 0) {
> + printf("Second open of fd[%d] failed: %m\n", i);
> + return 1;
> + }
> + ret = fsync(fd[i]);
> + if (ret < 0) {
> + /* New opens should not return an error */
> + printf("First fsync after reopen of fd[%d] failed: 
> %m\n", i);
> + return 1;
> + }
> + }
> +
> + printf("Test passed!\n");
> + return 0;
> +}
> diff --git a/tests/generic/999 b/tests/generic/999
> new file mode 100755
> index ..6de143d1149e
> --- /dev/null
> +++ b/tests/generic/999
> @@ -0,0 +1,76 @@
> +#! /bin/bash
> +# FS QA Test No. 999
> +#
> +# Open a file several times, write to it, fsync on all fds and make sure that
> +# they all return 0. Change the device to start throwing errors. Write again
> +# on all fds and fsync on all fds. Ensure that we get errors on all of them.
> +# Then fsync on all one last time and verify that all return 0.
> +#
> +#---
> +# Copyright (c) 2017, Jeff Layton <jlay...@redhat.com>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#---

Sure you want to track the address? Maybe just remove it?

> +



> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +cd /
> +rm -rf $tmp.* $testdir
> +_dmerror_cleanup
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmerror
> +
> +# real QA test starts here
> +_supported_os Linux
> +_require_scratch
> +_require_logdev
> +_require_dm_target error
> +_require_test_program fsync-err
> +
> +rm -f $seqres.full
> +
> +echo "Format and mount"
> +$XFS_IO_PROG -d -c "pwrite -S 0x7c -b 1048576 0 $((64 * 1048576))" 
> $SCRATCH_DEV >> $seqres.full
> +_scratch_mkfs > $seqres.full 2>&1
> +_dmerror_init
> +_dmerror_mount >> $seqres.full 2>&1
> +_dmerror_unmount
> +_dmerror_mount
> +
> +_require_fs_space $SCRATCH_MNT 8192
> +
> +testfile=$SCRATCH_MNT/fsync-err-test
> +
> +$here/src/fsync-err $testfile
> +
> +# success, all done
> +_dmerror_load_working_table
> +_dmerror_unmount
> +_dmerror_cleanup
> +_repair_scratch_fs >> $seqres.full
> +status=0
> +exit
> diff --git a/tests/generic/999.out b/tests/generic/999.out
> new file mode 100644
> index ..2e48492ff6d1
> --- /dev/null
> +++ b/tests/generic/999.out
> @@ -0,0 +1,3 @@
> +QA output created by 999
> +Format and mount
> +Test passed!
> diff --git a/tests/generic/group b/tests/generic/group
> index 438c299030f2..39f7b14657f1 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -440,3 +440,4 @@
>  435 auto encrypt
>  436 auto quick rw
>  437 auto quick
> +999 auto quick
> diff --git a/tools/dmerror b/tools/dmerror
> new file mode 100755
> index ..4aaf682ee5f9
> --- /dev/null
> +++ b/tools/dmerror
> @@ -0,0 +1,44 @@
> +#!/bin/bash
> +#---
> +# Copyright (c) 2017, Jeff Layton <jlay...@redhat.com>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#---
> +
> +. ./common/config
> +. ./common/dmerror
> +
> +_dmerror_setup
> +
> +case $1 in
> +cleanup)
> + _dmerror_cleanup
> + ;;
> +init)
> + _dmerror_init
> + ;;
> +load_error_table)
> + _dmerror_load_error_table
> + ;;
> +load_working_table)
> + _dmerror_load_working_table
> + ;;
> +*)
> + echo "Usage: $0 {init|cleanup|load_error_table|load_working_table}"
> + exit 1
> + ;;
> +esac
> +
> +status=0
> +exit
> -- 
> 2.9.4
> 
> 

-- 
All the best,
Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 1/5] thermal: Add support for hardware-tracked trip points

2016-07-01 Thread Eduardo Valentin
On Wed, Jun 22, 2016 at 11:15:19AM +0800, Zhang Rui wrote:
> On 一, 2016-06-06 at 19:44 +0800, Caesar Wang wrote:
> > From: Sascha Hauer <s.ha...@pengutronix.de>
> > 
> > This adds support for hardware-tracked trip points to the device tree
> > thermal sensor framework.
> > 
> > The framework supports an arbitrary number of trip points. Whenever
> > the current temperature is updated, the trip points immediately
> > below and above the current temperature are found. A .set_trips
> > callback is then called with the temperatures. If there is no trip
> > point above or below the current temperature, the passed trip
> > temperature will be -INT_MAX or INT_MAX respectively. In this
> > callback,
> > the driver should program the hardware such that it is notified
> > when either of these trip points are triggered. When a trip point
> > is triggered, the driver should call `thermal_zone_device_update'
> > for the respective thermal zone. This will cause the trip points
> > to be updated again.
> > 
> > If .set_trips is not implemented, the framework behaves as before.
> > 
> > This patch is based on an earlier version from Mikko Perttunen
> > <mikko.perttu...@kapsi.fi>
> > 
> > Signed-off-by: Sascha Hauer <s.ha...@pengutronix.de>
> > Signed-off-by: Caesar Wang <w...@rock-chips.com>
> > Cc: Zhang Rui <rui.zh...@intel.com>
> > Cc: Eduardo Valentin <edubez...@gmail.com>
> > Cc: linux...@vger.kernel.org
> > 
> > ---
> > 
> > Changes in v5:
> > - add the lock for thermal_zone_set_trips function.
> > - change based on next kernel.
> > 
> > Changes in v4:
> > - Missing the lock added in v3.
> > 
> > Changes in v3:
> > - as Javi comments on https://patchwork.kernel.org/patch/9001281/.
> > - add the lock for preventing the called from multi placce
> > - add the note for pre_low/high_trip.
> > 
> > Changes in v2:
> > - update the sysfs-api.txt for set_trips.
> > 
> >  Documentation/thermal/sysfs-api.txt |  7 +
> >  drivers/thermal/thermal_core.c  | 54
> > +
> >  drivers/thermal/thermal_sysfs.c |  3 +++
> >  include/linux/thermal.h | 10 +++
> >  4 files changed, 74 insertions(+)
> > 
> > diff --git a/Documentation/thermal/sysfs-api.txt
> > b/Documentation/thermal/sysfs-api.txt
> > index efc3f3d..75d8838 100644
> > --- a/Documentation/thermal/sysfs-api.txt
> > +++ b/Documentation/thermal/sysfs-api.txt
> > @@ -49,6 +49,9 @@ temperature) and throttle appropriate devices.
> >     .bind: bind the thermal zone device with a thermal cooling
> > device.
> >     .unbind: unbind the thermal zone device with a thermal
> > cooling device.
> >     .get_temp: get the current temperature of the thermal zone.
> > +   .set_trips: set the trip points window. Whenever the current
> > temperature
> > +   is updated, the trip points immediately below
> > and above the
> > +   current temperature are found.
> >     .get_mode: get the current mode (enabled/disabled) of the
> > thermal zone.
> >     - "enabled" means the kernel thermal management is
> > enabled.
> >     - "disabled" will prevent kernel thermal driver action
> > upon trip points
> > @@ -95,6 +98,10 @@ temperature) and throttle appropriate devices.
> >     get_temp:   a pointer to a function
> > that reads the
> >     sensor temperature. This is
> > mandatory
> >     callback provided by sensor
> > driver.
> > +   set_trips:  a pointer to a function that
> > sets a
> > +   temperature window. When
> > this window is
> > +   left the driver must inform
> > the thermal
> > +   core via
> > thermal_zone_device_update.
> >     get_trend:  a pointer to a function
> > that reads the
> >     sensor temperature trend.
> >     set_emul_temp:  a pointer to a
> > function that sets
> > diff --git a/drivers/thermal/thermal_core.c
> > b/drivers/thermal/thermal_core.c
> > index 792aab7..e69aee2 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -368,6 +368,56 @@ static void handle_thermal_trip(struct
> > thermal_zone_device *t

Re: [PATCH v2 0/5] Thermal: Support for hardware-tracked trip points

2016-05-23 Thread Eduardo Valentin
On Mon, May 23, 2016 at 03:32:39PM +0800, Caesar Wang wrote:
> Hello Eduardo & 'Zhang Rui'
> 
> Do we have the chance to merge this series patches for next kernel?
> I had picked them up in my github, and tested for a period of time with
> rockchip inside kernel.
> 
> Let me know if someone have some suggestions or against opinios.
> Thanks,

I will have a look after after the current merge window.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Syntactic and factual errors in the API document

2016-03-31 Thread Eduardo Valentin
On Thu, Mar 31, 2016 at 12:43:22AM -0600, Jonathan Corbet wrote:
> On Tue, 22 Mar 2016 12:37:25 +
> Andy Champ  wrote:
> 
> > There are several places where the English in the document is syntactically
> > invalid, or unclear. There are also one or two factual errors.
> 
> I went to apply this one to the docs tree, but it no longer applies to
> current kernels.  Any chance of getting a respin?

Jonanthan,

I applied this one into the thermal-soc tree. But if you are willing to
add it via the documentation tree, let me know, I can drop it from my
-fixes branch.

> 
> Thanks,
> 
> jon
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Syntactic and factual errors in the API document

2016-03-28 Thread Eduardo Valentin
On Tue, Mar 22, 2016 at 09:00:32AM -0700, Eduardo Valentin wrote:
> Hello Andy,
> 
> On Tue, Mar 22, 2016 at 12:37:25PM +, Andy Champ wrote:
> > There are several places where the English in the document is syntactically
> > invalid, or unclear. There are also one or two factual errors.

BTW, you are supposed to sign your patches too. I am adding your
signed-off-by here, but next time, remember to sign your patches (have a
look at the Documentation/SubmittingPatches file.

> > 
> > ---
> >  Documentation/thermal/sysfs-api.txt | 44 
> > ++---
> >  1 file changed, 22 insertions(+), 22 deletions(-)
> 
> Thanks for your patch improving the documentation. I have no comments in
> the text correction. The only minor is that the patch title should have
> 'thermal:' prefix (have a look in the git log
> Documentation/thermal/sysfs-api.txt). 
> 
> Also, I will get this patch only after the merge window is finished and
> my current pending patches queue has been sent and merged.
> 
> BR,
> 
> Eduardo Valentin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 0/6] thermal: add devm_ version of thermal_zone register and driver for max77620

2016-03-09 Thread Eduardo Valentin
On Wed, Mar 09, 2016 at 06:40:04PM +0530, Laxman Dewangan wrote:
> The series add the devm_ version of thermal_zone_of_sensor_register/
> unregister, interface details, and use this in new thermal driver
> for max77620.
> 
> The header file for max77620 is part of MFD patch
>   https://lkml.org/lkml/2016/2/11/186
> 
> Changes from V1:
> - Run checkpatch with --strict and fix all warnings.
> - Added details of interfaces in spifs-api.txt
> - Added DT binding doc.
> 
> Laxman Dewangan (6):
>   thermal: doc: Add details of
> thermal_zone_of_sensor_{register,unregister}
>   thermal: of-thermal: Add devm version of
> thermal_zone_of_sensor_register
>   thermal: Add devm_thermal_zone_of_sensor_register() in managed devices
> list
>   thermal: doc: Add details of
> devm_thermal_zone_of_sensor_{register,unregister}
>   thermal: max77620: Add thermal driver for reporting junction temp
>   thermal: Add DT binding doc for thermal of PMIC max77620


Add patches 1-3 in my tree. Patch 4 should probably go via documentation
tree. And patch 5 needs dependency to be sorted, and 6 need minor changes.



signature.asc
Description: Digital signature


Re: [PATCH V2 3/6] thermal: Add devm_thermal_zone_of_sensor_register() in managed devices list

2016-03-09 Thread Eduardo Valentin
On Wed, Mar 09, 2016 at 06:40:07PM +0530, Laxman Dewangan wrote:
> The interface thermal_zone_of_sensor_register() and
> thermal_zone_of_sensor_unregister() gained their devm_
> wrappers. Add these APIs in the list of managed devices.
> 
> Signed-off-by: Laxman Dewangan <ldewan...@nvidia.com>

Acked-by: Eduardo Valentin <edubez...@gmail.com>


This should probably go via the documentation tree.


> 
> ---
> Changes from V1:
> - No change.
> 
>  Documentation/driver-model/devres.txt | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/Documentation/driver-model/devres.txt 
> b/Documentation/driver-model/devres.txt
> index 5930d9a..dd7ce58 100644
> --- a/Documentation/driver-model/devres.txt
> +++ b/Documentation/driver-model/devres.txt
> @@ -342,6 +342,10 @@ SLAVE DMA ENGINE
>  SPI
>devm_spi_register_master()
>  
> +THERMAL
> + devm_thermal_zone_of_sensor_register()
> + devm_thermal_zone_of_sensor_unregister()
> +
>  WATCHDOG
>devm_watchdog_register_device()
>devm_watchdog_unregister_device()
> -- 
> 2.1.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 6/6] thermal: Add DT binding doc for thermal of PMIC max77620

2016-03-09 Thread Eduardo Valentin
On Wed, Mar 09, 2016 at 06:40:10PM +0530, Laxman Dewangan wrote:
> Maxim Semiconductor MAX77620 supports alarm interrupts when
> its die temperature crosses 120C and 140C. These threshold
> temperatures are not configurable.
> 
> Add DT binding document to details out the DT property related
> to MAX77620 thermal functionality.
> 
> Signed-off-by: Laxman Dewangan 
> 
> ---
> Changes from V1:
> - New in series to add DT binding doc per V1 review.
> 
>  .../bindings/thermal/thermal-max77620.txt  | 43 
> ++
>  1 file changed, 43 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/thermal/thermal-max77620.txt
> 
> diff --git a/Documentation/devicetree/bindings/thermal/thermal-max77620.txt 
> b/Documentation/devicetree/bindings/thermal/thermal-max77620.txt
> new file mode 100644
> index 000..d76412f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/thermal/thermal-max77620.txt
> @@ -0,0 +1,43 @@
> +Thermal driver for MAX77620 Power management IC from Maxim Semiconductor.
> +
> +Maxim Semiconductor MAX77620 supports alarm interrupts when its
> +die temperature crosses 120C and 140C. These threshold temperatures
> +are not configurable. Device does not provide the real temperature
> +of die other than just indicating whether temperature is above or
> +below threshold level.
> +
> +Required properties:
> +---
> +#thermal-sensor-cells:   Please refer 
> 
> + for more details.
> + The value must be 0.
> +
> +For more details, please refer generic thermal DT binding document
> +.
> +
> +Please refer  for mfd DT binding
> +document for the MAX77620.
> +
> +Example:
> +
> +#include 
> +...
> +spmic: max77620@3c {
> + compatible = "maxim,max77620";
> +
> + #thermal-sensor-cells = <0>;
> +};
> +
> +thermal-zones {
> + PMIC-Die {
> + thermal-sensors = <>;
> + trips {
> + die_temp_thresh: hot-die {
> + temperature = <12>;
> + type = "active";
> + hysteresis = <0>;
> + };
> + };
> + };
> +};


Considering that this is an example (and people will copy and paste it),
please add all the required DT properties for a thermal-zone node.

> +
> -- 
> 2.1.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 5/6] thermal: max77620: Add thermal driver for reporting junction temp

2016-03-09 Thread Eduardo Valentin
On Wed, Mar 09, 2016 at 06:40:09PM +0530, Laxman Dewangan wrote:
> Maxim Semiconductor Max77620 supports alarm interrupts when
> its die temperature crosses 120C and 140C. These threshold
> temperatures are not configurable.
> 
> Add thermal driver to register PMIC die temperature as thermal
> zone sensor and capture the die temperature warning interrupts
> to notifying the client.
> 
> Signed-off-by: Laxman Dewangan 
> 
> ---
> Changes from V1:
> - checkpatch warning fix and simplifying the dev.of_node
>   initialisation.
> 
>  drivers/thermal/Kconfig|  10 +++
>  drivers/thermal/Makefile   |   1 +
>  drivers/thermal/thermal-max77620.c | 151 
> +
>  3 files changed, 162 insertions(+)
>  create mode 100644 drivers/thermal/thermal-max77620.c
> 
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 5e7c97a..fc856eb 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -194,6 +194,16 @@ config IMX_THERMAL
> cpufreq is used as the cooling device to throttle CPUs when the
> passive trip is crossed.
>  
> +config MAX77620_THERMAL
> + tristate "Temperature sensor driver for Maxim MAX77620 PMIC"
> + depends on MFD_MAX77620
> + depends on OF

The previous question on compile test was more if we could add depends
on COMPILE_TEST flag. Sorry if I was not clear enough. 


I am adding the flag here:

 
  config MAX77620_THERMAL
  tristate "Temperature sensor driver for Maxim MAX77620 PMIC"
  -   depends on MFD_MAX77620
  +   depends on MFD_MAX77620 || COMPILE_TEST
  +   depends on HAS_IOMEM
  depends on OF
  help
Support for die junction temperature
warning alarm for Maxim


but I still cannot compile test the file because of the missing header.
drivers/thermal/thermal-max77620.c:16:32: fatal error:
linux/mfd/max77620.h: No such file or directory
 #include 


I wont be able to merge this as it is. We need to sort out the
dependency first.

The patches with devm_ helpers could go, though.

> + help
> +   Support for die junction temperature warning alarm for Maxim
> +   Semiconductor PMIC MAX77620 device. Device generates two alarm
> +   interrupts when PMIC die temperature cross the threshold of
> +   120 degC and 140 degC.
> +
>  config SPEAR_THERMAL
>   tristate "SPEAr thermal sensor driver"
>   depends on PLAT_SPEAR || COMPILE_TEST
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 8e9cbc3..c6bc2bd 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -36,6 +36,7 @@ obj-$(CONFIG_DOVE_THERMAL)  += dove_thermal.o
>  obj-$(CONFIG_DB8500_THERMAL) += db8500_thermal.o
>  obj-$(CONFIG_ARMADA_THERMAL) += armada_thermal.o
>  obj-$(CONFIG_IMX_THERMAL)+= imx_thermal.o
> +obj-$(CONFIG_MAX77620_THERMAL)   += thermal-max77620.o
>  obj-$(CONFIG_DB8500_CPUFREQ_COOLING) += db8500_cpufreq_cooling.o
>  obj-$(CONFIG_INTEL_POWERCLAMP)   += intel_powerclamp.o
>  obj-$(CONFIG_X86_PKG_TEMP_THERMAL)   += x86_pkg_temp_thermal.o
> diff --git a/drivers/thermal/thermal-max77620.c 
> b/drivers/thermal/thermal-max77620.c
> new file mode 100644
> index 000..5fba00f
> --- /dev/null
> +++ b/drivers/thermal/thermal-max77620.c
> @@ -0,0 +1,151 @@
> +/*
> + * Junction temperature thermal driver for Maxim Max77620.
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * Author: Laxman Dewangan 
> + *  Mallikarjun Kasoju 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define MAX77620_NORMAL_OPERATING_TEMP   10
> +#define MAX77620_TJALARM1_TEMP   12
> +#define MAX77620_TJALARM2_TEMP   14
> +
> +struct max77620_therm_info {
> + struct device   *dev;
> + struct regmap   *rmap;
> + struct thermal_zone_device  *tz_device;
> + int irq_tjalarm1;
> + int irq_tjalarm2;
> +};
> +
> +static int max77620_thermal_read_temp(void *data, int *temp)
> +{
> + struct max77620_therm_info *mtherm = data;
> + unsigned int val;
> + int ret;
> +
> + ret = regmap_read(mtherm->rmap, MAX77620_REG_STATLBT, );
> + if (ret < 0) {
> + dev_err(mtherm->dev, "Failed to read STATLBT, %d\n", ret);
> + return -EINVAL;
> + }
> +
> + if (val & MAX77620_IRQ_TJALRM2_MASK)
> + *temp = MAX77620_TJALARM2_TEMP;
> + 

Re: [PATCH V2 0/6] thermal: add devm_ version of thermal_zone register and driver for max77620

2016-03-09 Thread Eduardo Valentin
Laxman,


On Wed, Mar 09, 2016 at 06:40:04PM +0530, Laxman Dewangan wrote:
> The series add the devm_ version of thermal_zone_of_sensor_register/
> unregister, interface details, and use this in new thermal driver
> for max77620.
> 
> The header file for max77620 is part of MFD patch
>   https://lkml.org/lkml/2016/2/11/186
> 
> Changes from V1:
> - Run checkpatch with --strict and fix all warnings.
> - Added details of interfaces in spifs-api.txt
> - Added DT binding doc.

Thanks a lot for working on the comments. I don't see anything specific
right now. So, I am starting a couple of automated testing on this
series. If I find something, I will let you know.



signature.asc
Description: Digital signature


Re: [PATCH 3/3] thermal: max77620: Add thermal driver for reporting junction temp

2016-03-08 Thread Eduardo Valentin

Hello Laxman,

Thanks for working on this. Impressed how simplified these drivers are
becoming. I really liked you added the so waited devm_ helpers. Very
minor comments as follows (now that you will send a new version anyway).

On Fri, Mar 04, 2016 at 07:10:10PM +0530, Laxman Dewangan wrote:
> Maxim Semiconductor Max77620 supports alarm interrupts when
> its die temperature crosses 120C and 140C. These threshold
> temperatures are not configurable.
> 
> Add thermal driver to register PMIC die temperature as thermal
> zone sensor and capture the die temperature warning interrupts
> to notifying the client.

Are any of these critical?

> 
> Signed-off-by: Laxman Dewangan 
> ---
>  drivers/thermal/Kconfig|   9 +++
>  drivers/thermal/Makefile   |   1 +
>  drivers/thermal/thermal-max77620.c | 151 
> +

Given that it is a DT based driver, please add also a binding entry
under Documentation/devicetree/bindings/thermal/


>  3 files changed, 161 insertions(+)
>  create mode 100644 drivers/thermal/thermal-max77620.c
> 
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 5e7c97a..faba1a3 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -194,6 +194,15 @@ config IMX_THERMAL
> cpufreq is used as the cooling device to throttle CPUs when the
> passive trip is crossed.
>  
> +config MAX77620_THERMAL
> + tristate "Temperature sensor driver for Maxim MAX77620 PMIC"
> + depends on MFD_MAX77620

Can this be COMPILE_TEST'ed?

> + depends on OF

> + help
> +   Support for die junction temperature warning alarm for Maxim
> +   Semiconductor PMIC MAX77620 device. Device generates alert
> +   signal/interrupt when die temperature cross its threshold.
> +

Somehow checkpatch.pl --strict is complaining about this entry. Can you
please check?


>  config SPEAR_THERMAL
>   tristate "SPEAr thermal sensor driver"
>   depends on PLAT_SPEAR || COMPILE_TEST
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 8e9cbc3..c6bc2bd 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -36,6 +36,7 @@ obj-$(CONFIG_DOVE_THERMAL)  += dove_thermal.o
>  obj-$(CONFIG_DB8500_THERMAL) += db8500_thermal.o
>  obj-$(CONFIG_ARMADA_THERMAL) += armada_thermal.o
>  obj-$(CONFIG_IMX_THERMAL)+= imx_thermal.o
> +obj-$(CONFIG_MAX77620_THERMAL)   += thermal-max77620.o
>  obj-$(CONFIG_DB8500_CPUFREQ_COOLING) += db8500_cpufreq_cooling.o
>  obj-$(CONFIG_INTEL_POWERCLAMP)   += intel_powerclamp.o
>  obj-$(CONFIG_X86_PKG_TEMP_THERMAL)   += x86_pkg_temp_thermal.o
> diff --git a/drivers/thermal/thermal-max77620.c 
> b/drivers/thermal/thermal-max77620.c
> new file mode 100644
> index 000..846da62
> --- /dev/null
> +++ b/drivers/thermal/thermal-max77620.c
> @@ -0,0 +1,151 @@
> +/*
> + * Junction temperature thermal driver for Maxim Max77620.
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
> + *
> + * Author: Laxman Dewangan 
> + *  Mallikarjun Kasoju 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define MAX77620_NORMAL_OPERATING_TEMP   10
> +#define MAX77620_TJALARM1_TEMP   12
> +#define MAX77620_TJALARM1_TEMP   14
> +
> +struct max77620_therm_info {
> + struct device   *dev;
> + struct regmap   *rmap;
> + struct thermal_zone_device  *tz_device;
> + int irq_tjalarm1;
> + int irq_tjalarm2;
> +};
> +
> +static int max77620_thermal_read_temp(void *data, int *temp)
> +{
> + struct max77620_therm_info *mtherm = data;
> + unsigned int val;
> + int ret;
> +
> + ret = regmap_read(mtherm->rmap, MAX77620_REG_STATLBT, );
> + if (ret < 0) {
> + dev_err(mtherm->dev, "Failed to read STATLBT, %d\n", ret);
> + return -EINVAL;
> + }
> +
> + if (val & MAX77620_IRQ_TJALRM2_MASK)
> + *temp = MAX77620_TJALARM2_TEMP;
> + else if (val & MAX77620_IRQ_TJALRM1_MASK)
> + *temp = MAX77620_TJALARM1_TEMP;
> + else
> + *temp = MAX77620_NORMAL_OPERATING_TEMP;

So, no way at all to get a temp?

> + return 0;
> +}
> +
> +static const struct thermal_zone_of_device_ops max77620_thermal_ops = {
> + .get_temp = max77620_thermal_read_temp,
> +};
> +
> +static irqreturn_t max77620_thermal_irq(int irq, void *data)
> +{
> + struct max77620_therm_info *mtherm = data;
> +
> + if (irq == mtherm->irq_tjalarm1)
> +