RE: UART/TTY console deadlock

2020-07-02 Thread S, Shirish
This one:
https://lkml.org/lkml/2020/6/30/394

I did reply to all, not sure what I am missing while replying.



Regards,
Shirish S

-Original Message-
From: Greg Kroah-Hartman  
Sent: Thursday, July 2, 2020 11:42 AM
To: S, Shirish 
Cc: Tony Lindgren ; Sergey Senozhatsky 
; Petr Mladek ; Andy Shevchenko 
; Raul Rangel ; Sergey 
Senozhatsky ; linux-kernel 
; Andy Shevchenko 
; k...@linutronix.de; Peter Zijlstra 
; John Ogness ; Steven Rostedt 

Subject: Re: UART/TTY console deadlock

On Thu, Jul 02, 2020 at 03:48:43AM +, S, Shirish wrote:
> Hi All,
> 
> Can we land this patch upstream?

What patch?

> Feel free to add my tested-by.

Please send it in a form that I can add it, to the patch in question.

thanks,

greg k-h


RE: UART/TTY console deadlock

2020-07-01 Thread S, Shirish
Hi All,

Can we land this patch upstream?
Feel free to add my tested-by.

Thanks.

Regards,
Shirish S

-Original Message-
From: S, Shirish  
Sent: Wednesday, July 1, 2020 12:15 PM
To: Tony Lindgren ; Sergey Senozhatsky 

Cc: Petr Mladek ; Andy Shevchenko 
; Raul Rangel ; Sergey 
Senozhatsky ; linux-kernel 
; Greg Kroah-Hartman 
; Andy Shevchenko 
; k...@linutronix.de; S, Shirish 
; Peter Zijlstra ; John Ogness 
; Steven Rostedt 
Subject: Re: UART/TTY console deadlock


On 6/30/2020 11:32 PM, Tony Lindgren wrote:
> * Sergey Senozhatsky  [200630 13:06]:
>> On (20/06/30 14:22), Petr Mladek wrote:
> ...
>
>>>>>> @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
>>>>>>   * allow register changes to become visible.
>>>>>>   */
>>>>>>  spin_lock_irqsave(>lock, flags);
>>>>>> -if (up->port.irqflags & IRQF_SHARED)
>>>>>> -disable_irq_nosync(port->irq);
>>>>>>   
>>>>>>  wait_for_xmitr(up, UART_LSR_THRE);
>>>>>>  serial_port_out_sync(port, UART_IER, UART_IER_THRI); @@ 
>>>>>> -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
>>>>>>  iir = serial_port_in(port, UART_IIR);
>>>>>>  serial_port_out(port, UART_IER, 0);
>>>>>>   
>>>>>> -if (port->irqflags & IRQF_SHARED)
>>>>>> -enable_irq(port->irq);
>>>>>>  spin_unlock_irqrestore(>lock, flags);
>>>>>> +if (irq_shared)
>>>>>> +enable_irq(port->irq);
>>>>>>   
>>>>>>  /*
>>>>>>   * If the interrupt is not reasserted, or we otherwise
>>>>> I think that it might be safe but I am not 100% sure, sigh.
>>>> Yeah, I'm not 100%, but I'd give it a try.
>>> I do not feel brave enough to ack it today. But I am all for trying 
>>> it if anyone more familiar with the code is fine with it.
>> I see. Well, I suppose we need Ack-s from tty/serial/8250 maintainers.
>> I would not be very happy if _only_ printk people Ack the patch.

FWIW, the lockdep trace is not seen anymore with the patch applied.

Regards,

Shirish S

> This conditional disable for irq_shared does not look nice to me from 
> the other device point of view :)
>
> Would it be possible to just set up te dummy interrupt handler for the 
> startup, then change it back afterwards? See for example 
> omap8250_no_handle_irq().
>
> The other device for irq_shared should be capable of dealing with 
> spurious interrupts if it's shared.
>
> Regards,
>
> Tony

--
Regards,
Shirish S



Re: UART/TTY console deadlock

2020-07-01 Thread S, Shirish



On 6/30/2020 11:32 PM, Tony Lindgren wrote:

* Sergey Senozhatsky  [200630 13:06]:

On (20/06/30 14:22), Petr Mladek wrote:

...


@@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
 * allow register changes to become visible.
 */
spin_lock_irqsave(>lock, flags);
-   if (up->port.irqflags & IRQF_SHARED)
-   disable_irq_nosync(port->irq);
  
  		wait_for_xmitr(up, UART_LSR_THRE);

serial_port_out_sync(port, UART_IER, UART_IER_THRI);
@@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
iir = serial_port_in(port, UART_IIR);
serial_port_out(port, UART_IER, 0);
  
-		if (port->irqflags & IRQF_SHARED)

-   enable_irq(port->irq);
spin_unlock_irqrestore(>lock, flags);
+   if (irq_shared)
+   enable_irq(port->irq);
  
  		/*

 * If the interrupt is not reasserted, or we otherwise

I think that it might be safe but I am not 100% sure, sigh.

Yeah, I'm not 100%, but I'd give it a try.

I do not feel brave enough to ack it today. But I am all for trying it
if anyone more familiar with the code is fine with it.

I see. Well, I suppose we need Ack-s from tty/serial/8250 maintainers.
I would not be very happy if _only_ printk people Ack the patch.


FWIW, the lockdep trace is not seen anymore with the patch applied.

Regards,

Shirish S


This conditional disable for irq_shared does not look nice to me
from the other device point of view :)

Would it be possible to just set up te dummy interrupt handler
for the startup, then change it back afterwards? See for example
omap8250_no_handle_irq().

The other device for irq_shared should be capable of dealing
with spurious interrupts if it's shared.

Regards,

Tony


--
Regards,
Shirish S



Re: linux-next: Fixes tag needs some work in the amdgpu tree

2019-10-04 Thread S, Shirish
Hi Stephen,


On 10/3/2019 3:41 AM, Stephen Rothwell wrote:
> Hi all,
>
> In commit
>
>8c9f69bc5cc4 ("drm/amdgpu: fix build error without CONFIG_HSA_AMD")
>
> Fixes tag
>
>Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W 
> scheduling policy")
>
> has these problem(s):
>
>- Target SHA1 does not exist
>
> Did you mean
>
> Fixes: aa978594cf7f ("drm/amdgpu: disable gfxoff while use no H/W scheduling 
> policy")

Yes, I meant "aa978594cf7f".

Regards,

Shirish S

-- 
Regards,
Shirish S



RE: [PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path as well

2019-01-16 Thread S, Shirish
I believe its fixed now in : 

https://lkml.org/lkml/2019/1/16/507
https://lkml.org/lkml/2019/1/16/508

I don’t see S@vger ... in the From field in the above links instead I see "S, 
Shirish" <>


Regards,
Shirish S

-Original Message-
From: S, Shirish 
Sent: Wednesday, January 16, 2019 9:01 PM
To: Borislav Petkov 
Cc: Thomas Gleixner ; Ingo Molnar ; H . 
Peter Anvin ; maintainer : X86 ARCHITECTURE ; 
Tony Luck ; Vishal Verma ; open 
list : X86 ARCHITECTURE ; Lendacky, Thomas 
; Singh, Brijesh 
Subject: RE: [PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path 
as well

Nope thats not my email id, am not sure how(s...@vger.kernel.org) its getting 
added.
Do you find the same for the new patchset I have sent? 


Regards,
Shirish S

-Original Message-
From: Borislav Petkov 
Sent: Wednesday, January 16, 2019 8:57 PM
To: S, Shirish 
Cc: Thomas Gleixner ; Ingo Molnar ; H . 
Peter Anvin ; maintainer : X86 ARCHITECTURE ; 
Tony Luck ; Vishal Verma ; open 
list : X86 ARCHITECTURE ; Lendacky, Thomas 
; Singh, Brijesh 
Subject: Re: [PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path 
as well

On Wed, Jan 16, 2019 at 03:14:29PM +, S wrote:
> Done. Hope the mailer is fine now, i have added --from while sending 
> the new patchset and also got it reviewed from folks here.

No, it isn't. And the problem is not git. The problem is your normal emails 
you're sending. Look at what you just sent:

https://lore.kernel.org/lkml/8bed39c8-df48-dbb9-3674-dfbc4aea2...@amd.com/raw

and especially your From: field:

From:   s...@vger.kernel.org

Is that your email address?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


RE: [PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path as well

2019-01-16 Thread S, Shirish
Nope thats not my email id, am not sure how(s...@vger.kernel.org) its getting 
added.
Do you find the same for the new patchset I have sent? 


Regards,
Shirish S

-Original Message-
From: Borislav Petkov  
Sent: Wednesday, January 16, 2019 8:57 PM
To: S, Shirish 
Cc: Thomas Gleixner ; Ingo Molnar ; H . 
Peter Anvin ; maintainer : X86 ARCHITECTURE ; 
Tony Luck ; Vishal Verma ; open 
list : X86 ARCHITECTURE ; Lendacky, Thomas 
; Singh, Brijesh 
Subject: Re: [PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path 
as well

On Wed, Jan 16, 2019 at 03:14:29PM +, S wrote:
> Done. Hope the mailer is fine now, i have added --from while sending 
> the new patchset and also got it reviewed from folks here.

No, it isn't. And the problem is not git. The problem is your normal emails 
you're sending. Look at what you just sent:

https://lore.kernel.org/lkml/8bed39c8-df48-dbb9-3674-dfbc4aea2...@amd.com/raw

and especially your From: field:

From:   s...@vger.kernel.org

Is that your email address?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


[PATCH 2/2] x86/mce/amd: carve out MC4_MISC thresholding quirk

2019-01-16 Thread S, Shirish
MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
S3 -> S0 state transitions, which follow different code paths, hence
carve it out and move it mce_amd_feature_init(), which is  the converging
point of both code paths.

Changelog[v2]:
 - move the quirk to mce/amd.c

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/amd.c  | 34 ++
 arch/x86/kernel/cpu/mce/core.c | 29 -
 2 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..f6a5c96 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -545,6 +545,33 @@ prepare_threshold_block(unsigned int bank, unsigned int 
block, u32 addr,
return offset;
 }
 
+void mc4_misc_thresholding_quirk(void)
+{
+   int i;
+   u64 hwcr;
+   bool need_toggle;
+   u32 msrs[] = {
+   0x0413, /* MC4_MISC0 */
+   0xc408, /* MC4_MISC1 */
+   };
+
+   rdmsrl(MSR_K7_HWCR, hwcr);
+
+   /* McStatusWrEn has to be set */
+   need_toggle = !(hwcr & BIT(18));
+
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+   /* Clear CntP bit safely */
+   for (i = 0; i < ARRAY_SIZE(msrs); i++)
+   msr_clear_bit(msrs[i], 62);
+
+   /* restore old settings */
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr);
+}
+
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
@@ -552,6 +579,13 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
unsigned int bank, block, cpu = smp_processor_id();
int offset = -1;
 
+   /*
+* Turn off MC4_MISC thresholding banks on all family 15 models since
+* they're not supported there.
+*/
+   if (c->x86 == 0x15)
+   mc4_misc_thresholding_quirk();
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (mce_flags.smca)
smca_configure(bank, cpu);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d0c5416..6063ae2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1611,35 +1611,6 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
*c)
if (c->x86 == 0x15 && c->x86_model <= 0xf)
mce_flags.overflow_recov = 1;
 
-   /*
-* Turn off MC4_MISC thresholding banks on all models since
-* they're not supported there.
-*/
-   if (c->x86 == 0x15) {
-   int i;
-   u64 hwcr;
-   bool need_toggle;
-   u32 msrs[] = {
-   0x0413, /* MC4_MISC0 */
-   0xc408, /* MC4_MISC1 */
-   };
-
-   rdmsrl(MSR_K7_HWCR, hwcr);
-
-   /* McStatusWrEn has to be set */
-   need_toggle = !(hwcr & BIT(18));
-
-   if (need_toggle)
-   wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
-
-   /* Clear CntP bit safely */
-   for (i = 0; i < ARRAY_SIZE(msrs); i++)
-   msr_clear_bit(msrs[i], 62);
-
-   /* restore old settings */
-   if (need_toggle)
-   wrmsrl(MSR_K7_HWCR, hwcr);
-   }
}
 
if (c->x86_vendor == X86_VENDOR_INTEL) {
-- 
2.7.4



[PATCH 1/2] x86/mce/amd: apply MC4_MISC thresholding to all models of family 15

2019-01-16 Thread S, Shirish
Its evident from various forums and logs that MC4_MISC thresholding is not
supported for the family 15 processors, hence skip the x86_model check
while applying quirk.

Changelog[v2]:
 - reword commit message to adhere to coding standards
 - remove check of model range

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/core.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..d0c5416 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
*c)
mce_flags.overflow_recov = 1;
 
/*
-* Turn off MC4_MISC thresholding banks on those models since
+* Turn off MC4_MISC thresholding banks on all models since
 * they're not supported there.
 */
-   if (c->x86 == 0x15 &&
-   (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+   if (c->x86 == 0x15) {
int i;
u64 hwcr;
bool need_toggle;
-- 
2.7.4



[PATCH 0/2] x86/mce/amd: apply missing quirks to family 15 models (v2)

2019-01-16 Thread S, Shirish
Below patch series applies to family 15 CPU's of AMD platform, to address a
consistent warning of:

 "[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

at every boot and every resume, which is misguiding as the reason is not a
Firmware Bug but "MC4_MISC thresholding quirk" not being apporpriately
applied.

Shirish S (2):
  x86/mce/amd: apply MC4_MISC thresholding to all models of family 15
  x86/mce/amd: carve out MC4_MISC thresholding quirk

 arch/x86/kernel/cpu/mce/amd.c  | 34 ++
 arch/x86/kernel/cpu/mce/core.c | 30 --
 2 files changed, 34 insertions(+), 30 deletions(-)

-- 
2.7.4



[PATCH 3/3] x86/mce/amd: apply MC4_MISC thresholding quirk in resume path

2019-01-09 Thread S, Shirish
There are 2 code paths leading to mce_amd_feature_init() as below.
1) S5 -> S0: (boot)
secondary_startup_64 -> start_kernel -> identify_boot_cpu ->
identify_cpu ->
mcheck_cpu_init (calls  __mcheck_cpu_apply_quirks before) ->
mce_amd_feature_init

2) S3 -> S0: (resume)
syscore_resume -> mce_syscore_resume -> mce_amd_feature_init

Its clear that the quirks are not applied in S3 -> S0 path, hence apply
the same.

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/amd.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..ab1b12a 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -552,6 +552,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
unsigned int bank, block, cpu = smp_processor_id();
int offset = -1;
 
+   /*
+* mcheck_cpu_init() is not called when called by mce_syscore_resume(),
+* hence re-apply quirks, to be on safer side.
+*/
+   quirk_fam15_mc4_misc_thresholding();
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (mce_flags.smca)
smca_configure(bank, cpu);
-- 
2.7.4



[PATCH 0/3] x86/mce/amd: apply missing quirks to family 15 models (v2)

2019-01-09 Thread S, Shirish
Below patch series applies to family 15 CPU's of AMD platform, to address a
consistent warning of:

 "[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

at every boot and every resume, which is misguiding as the reason is not a
Firmware Bug but "MC4_MISC thresholding quirk" not being apporpriately applied.

Shirish S (3):
  x86/mce/amd: apply MC4_MISC thresholding to all models of family 15
  x86/mce/amd: carve out MC4_MISC thresholding quirk
  x86/mce/amd: apply MC4_MISC thresholding quirk in resume path

 arch/x86/include/asm/mce.h |  1 +
 arch/x86/kernel/cpu/mce/amd.c  |  6 
 arch/x86/kernel/cpu/mce/core.c | 65 +++---
 3 files changed, 42 insertions(+), 30 deletions(-)

-- 
2.7.4



[PATCH 1/3] x86/mce/amd: apply MC4_MISC thresholding to all models of family 15

2019-01-09 Thread S, Shirish
Its evident from various forums and logs that MC4_MISC thresholding is not
supported for the family 15 processors, hence skip the x86_model check
while applying quirk.

Changelog[v2]:
 - reword commit message to adhere to coding standards
 - remove check of model range

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/core.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..d0c5416 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1612,11 +1612,10 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
*c)
mce_flags.overflow_recov = 1;
 
/*
-* Turn off MC4_MISC thresholding banks on those models since
+* Turn off MC4_MISC thresholding banks on all models since
 * they're not supported there.
 */
-   if (c->x86 == 0x15 &&
-   (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+   if (c->x86 == 0x15) {
int i;
u64 hwcr;
bool need_toggle;
-- 
2.7.4



[PATCH 2/3] x86/mce/amd: carve out MC4_MISC thresholding quirk

2019-01-09 Thread S, Shirish
MC4_MISC thresholding quirk needs to be applied during S5 -> S0 and
S3 -> S0 state transitions, which follow different code paths, hence
carve it out so as to facilitate its application in both scenarios.

Signed-off-by: Shirish S 
---
 arch/x86/include/asm/mce.h |  1 +
 arch/x86/kernel/cpu/mce/core.c | 64 +++---
 2 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index c1a812b..328b65c 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -216,6 +216,7 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, 
u64 *sys_addr);
 static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
 static inline int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 
*sys_addr) { return -EINVAL; };
 #endif
+void quirk_fam15_mc4_misc_thresholding(void);
 
 static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return 
mce_amd_feature_init(c); }
 
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d0c5416..51f61cf 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1570,6 +1570,39 @@ static void quirk_sandybridge_ifu(int bank, struct mce 
*m, struct pt_regs *regs)
m->cs = regs->cs;
 }
 
+/*
+ * Turn off MC4_MISC thresholding banks on all family 15 models since
+ * they're not supported there.
+ */
+void quirk_fam15_mc4_misc_thresholding(void)
+{
+   if (boot_cpu_data.x86 == 0x15) {
+   int i;
+   u64 hwcr;
+   bool need_toggle;
+   u32 msrs[] = {
+   0x0413, /* MC4_MISC0 */
+   0xc408, /* MC4_MISC1 */
+   };
+
+   rdmsrl(MSR_K7_HWCR, hwcr);
+
+   /* McStatusWrEn has to be set */
+   need_toggle = !(hwcr & BIT(18));
+
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+   /* Clear CntP bit safely */
+   for (i = 0; i < ARRAY_SIZE(msrs); i++)
+   msr_clear_bit(msrs[i], 62);
+
+   /* restore old settings */
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr);
+   }
+}
+
 /* Add per CPU specific workarounds here */
 static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
 {
@@ -1611,35 +1644,8 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
*c)
if (c->x86 == 0x15 && c->x86_model <= 0xf)
mce_flags.overflow_recov = 1;
 
-   /*
-* Turn off MC4_MISC thresholding banks on all models since
-* they're not supported there.
-*/
-   if (c->x86 == 0x15) {
-   int i;
-   u64 hwcr;
-   bool need_toggle;
-   u32 msrs[] = {
-   0x0413, /* MC4_MISC0 */
-   0xc408, /* MC4_MISC1 */
-   };
-
-   rdmsrl(MSR_K7_HWCR, hwcr);
-
-   /* McStatusWrEn has to be set */
-   need_toggle = !(hwcr & BIT(18));
-
-   if (need_toggle)
-   wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
-
-   /* Clear CntP bit safely */
-   for (i = 0; i < ARRAY_SIZE(msrs); i++)
-   msr_clear_bit(msrs[i], 62);
-
-   /* restore old settings */
-   if (need_toggle)
-   wrmsrl(MSR_K7_HWCR, hwcr);
-   }
+   quirk_fam15_mc4_misc_thresholding();
+
}
 
if (c->x86_vendor == X86_VENDOR_INTEL) {
-- 
2.7.4



[PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path as well

2019-01-07 Thread S, Shirish
This patch adds threshold quirk applicable for family 15
in resume path as well, since mce_amd_feature_init()
does not have quirks applied when originating from mce_syscore_resume(),
resulting in the below message at every successful resume:

"[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/amd.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..27cbf66 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -545,6 +545,34 @@ prepare_threshold_block(unsigned int bank, unsigned int 
block, u32 addr,
return offset;
 }
 
+void disable_err_thresholding(struct cpuinfo_x86 *c)
+{
+   int i;
+   u64 hwcr;
+   bool need_toggle;
+   u32 msrs[] = {
+   0x0413, /* MC4_MISC0 */
+   0xc408, /* MC4_MISC1 */
+   };
+
+   if (c->x86_model >= 0x10 && c->x86_model <= 0x7f) {
+   rdmsrl(MSR_K7_HWCR, hwcr);
+
+   /* McStatusWrEn has to be set */
+   need_toggle = !(hwcr & BIT(18));
+
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+   /* Clear CntP bit safely */
+   for (i = 0; i < ARRAY_SIZE(msrs); i++)
+   msr_clear_bit(msrs[i], 62);
+
+   /* restore old settings */
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr);
+   }
+}
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
@@ -552,6 +580,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
unsigned int bank, block, cpu = smp_processor_id();
int offset = -1;
 
+   /* Disable error thresholding bank in S3 resume path as well,
+* for 15h family
+*/
+   if (c->x86 == 0x15)
+   disable_err_thresholding(c);
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (mce_flags.smca)
smca_configure(bank, cpu);
-- 
2.7.4



[PATCH 1/2] x86/mce/amd: Extend "Disable error thresholding bank 4" to more models

2019-01-07 Thread S, Shirish
The below patch added this quirk only for the first generation of family
15 processors, over time its noticed that its required for later
generations too.

"575203b4747c x86, MCE, AMD: Disable error thresholding bank 4 on some
models"

This patch extends the quirk to make it applicable till 7th Generation,
so as to address the below warning at boot:

"[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..051b536 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1616,7 +1616,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
*c)
 * they're not supported there.
 */
if (c->x86 == 0x15 &&
-   (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+   (c->x86_model >= 0x10 && c->x86_model <= 0x7f)) {
int i;
u64 hwcr;
bool need_toggle;
-- 
2.7.4



[PATCH 0/2] x86/mce/amd: apply missing quirks for family 15 models

2019-01-07 Thread S, Shirish
This patch series applies to family 15 CPU's of AMD platforms,
so as to address a consistent warning of
 "[Firmware Bug]: cpu 0, invalid threshold interrupt offset"
at every boot and upon completiong of successful S3 cycle,
due to a missing quirk, which was not extended to newer models
and also not applied in resume path.

Shirish S (2):
  x86/mce/amd: Extend "Disable error thresholding bank 4" to more models
  x86/mce/amd: Ensure quirks are applied in resume path as well

 arch/x86/kernel/cpu/mce/amd.c  | 34 ++
 arch/x86/kernel/cpu/mce/core.c |  2 +-
 2 files changed, 35 insertions(+), 1 deletion(-)

-- 
2.7.4



[PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path as well

2019-01-04 Thread S, Shirish
This patch adds threshold quirk applicable for family 15
in resume path as well, since mce_amd_feature_init()
does not have quirks applied when originating from mce_syscore_resume(),
resulting in the below message at every successful resume:

"[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/amd.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 89298c8..27cbf66 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -545,6 +545,34 @@ prepare_threshold_block(unsigned int bank, unsigned int 
block, u32 addr,
return offset;
 }
 
+void disable_err_thresholding(struct cpuinfo_x86 *c)
+{
+   int i;
+   u64 hwcr;
+   bool need_toggle;
+   u32 msrs[] = {
+   0x0413, /* MC4_MISC0 */
+   0xc408, /* MC4_MISC1 */
+   };
+
+   if (c->x86_model >= 0x10 && c->x86_model <= 0x7f) {
+   rdmsrl(MSR_K7_HWCR, hwcr);
+
+   /* McStatusWrEn has to be set */
+   need_toggle = !(hwcr & BIT(18));
+
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
+
+   /* Clear CntP bit safely */
+   for (i = 0; i < ARRAY_SIZE(msrs); i++)
+   msr_clear_bit(msrs[i], 62);
+
+   /* restore old settings */
+   if (need_toggle)
+   wrmsrl(MSR_K7_HWCR, hwcr);
+   }
+}
 /* cpu init entry point, called from mce.c with preempt off */
 void mce_amd_feature_init(struct cpuinfo_x86 *c)
 {
@@ -552,6 +580,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
unsigned int bank, block, cpu = smp_processor_id();
int offset = -1;
 
+   /* Disable error thresholding bank in S3 resume path as well,
+* for 15h family
+*/
+   if (c->x86 == 0x15)
+   disable_err_thresholding(c);
+
for (bank = 0; bank < mca_cfg.banks; ++bank) {
if (mce_flags.smca)
smca_configure(bank, cpu);
-- 
2.7.4



[PATCH 1/2] x86/mce/amd: Extend "Disable error thresholding bank 4" to more models

2019-01-04 Thread S, Shirish
The below patch added this quirk only for the first generation of family
15 processors, over time its noticed that its required for later
generations too.

"575203b4747c x86, MCE, AMD: Disable error thresholding bank 4 on some
models"

This patch extends the quirk to make it applicable till 7th Generation,
so as to address the below warning at boot:

"[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."

Signed-off-by: Shirish S 
---
 arch/x86/kernel/cpu/mce/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c722..051b536 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1616,7 +1616,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 
*c)
 * they're not supported there.
 */
if (c->x86 == 0x15 &&
-   (c->x86_model >= 0x10 && c->x86_model <= 0x1f)) {
+   (c->x86_model >= 0x10 && c->x86_model <= 0x7f)) {
int i;
u64 hwcr;
bool need_toggle;
-- 
2.7.4



Re: [PATCH] tpm: Fix NULL pointer dereference in tpm_transmit()

2018-07-04 Thread S, Shirish




On 7/4/2018 10:43 PM, Jarkko Sakkinen wrote:

On Wed, Jul 04, 2018 at 02:33:40PM +0530, Shirish S wrote:

During system shutdown,
tpm_class_shutdown() when called with TPM_CHIP_FLAG_TPM2
flag set, makes chip->ops NULL.

However tpm_chip_unregister() called later in shutdown
sequence tries to access chip->ops in tpm_try_transmit()
leading the NULL pointer dereference.

This patch fixes this issue.
Below is the trace for reference:

BUG: unable to handle kernel NULL pointer dereference at
0048
IP: tpm_transmit+0x267/0x565
PGD 0 P4D 0
Oops:  [#1] PREEMPT SMP NOPTI
...
task: 937c847fe580 task.stack: a79f80b04000
RIP: 0010:tpm_transmit+0x267/0x565
RSP: 0018:a79f80b07c08 EFLAGS: 00010286
RAX:  RBX: 937ca9bc8000 RCX: 937c847fe580
RDX:  RSI: 0002 RDI: 98e3cd40
RBP: a79f80b07c88 R08: 0001fff4 R09: 
R10:  R11:  R12: a79f80b07cd4
R13: 008c R14: ffc3 R15: 
FS:  7ef31f747740() GS:937caed0()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0048 CR3: 0001243d2000 CR4: 001406e0
Call Trace:
tpm_transmit_cmd+0x25/0x70
tpm2_shutdown+0x69/0xa3
? __radix_tree_replace+0xd9/0x120
? idr_replace_ext+0x92/0xb6
tpm_chip_unregister+0xaa/0xdb
cr50_i2c_shutdown+0x1e/0x41
device_shutdown+0x157/0x193
kernel_power_off+0x35/0x6e
SYSC_reboot+0x120/0x1a3
? do_writepages+0x36/0x6e
? do_writepages+0x36/0x6e
? sync_inodes_one_sb+0x17/0x17
? _raw_spin_unlock+0xe/0x20
? iput+0x87/0x1bd
do_syscall_64+0x64/0x72
entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Signed-off-by: Shirish S 
---
  drivers/char/tpm/tpm-interface.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index e32f6e8..b14d196 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -411,6 +411,14 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip,
unsigned long stop;
bool need_locality;
  
+	/* chip->ops is made NULL in tpm_class_shutdown()

+* This case is hit when tpm_chip_unregister() is called post
+* tpm_class_shutdown(), hence exit early and return
+* transmit operation not permitted
+*/
+   if (!chip->ops)
+   return -EPERM;
+
rc = tpm_validate_command(chip, space, buf, bufsiz);
if (rc == -EINVAL)
return rc;
--
2.7.4


What is cr50 anyway?

Please ignore this patch, i realised the cr50 is not yet upstreamed.
Hence the issue lies in in cr50 and not tpm driver.
Thanks.
Regards,
Shirish S

/Jarkko


--
Regards,
Shirish S



Re: [PATCH] tpm: Fix NULL pointer dereference in tpm_transmit()

2018-07-04 Thread S, Shirish




On 7/4/2018 10:43 PM, Jarkko Sakkinen wrote:

On Wed, Jul 04, 2018 at 02:33:40PM +0530, Shirish S wrote:

During system shutdown,
tpm_class_shutdown() when called with TPM_CHIP_FLAG_TPM2
flag set, makes chip->ops NULL.

However tpm_chip_unregister() called later in shutdown
sequence tries to access chip->ops in tpm_try_transmit()
leading the NULL pointer dereference.

This patch fixes this issue.
Below is the trace for reference:

BUG: unable to handle kernel NULL pointer dereference at
0048
IP: tpm_transmit+0x267/0x565
PGD 0 P4D 0
Oops:  [#1] PREEMPT SMP NOPTI
...
task: 937c847fe580 task.stack: a79f80b04000
RIP: 0010:tpm_transmit+0x267/0x565
RSP: 0018:a79f80b07c08 EFLAGS: 00010286
RAX:  RBX: 937ca9bc8000 RCX: 937c847fe580
RDX:  RSI: 0002 RDI: 98e3cd40
RBP: a79f80b07c88 R08: 0001fff4 R09: 
R10:  R11:  R12: a79f80b07cd4
R13: 008c R14: ffc3 R15: 
FS:  7ef31f747740() GS:937caed0()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0048 CR3: 0001243d2000 CR4: 001406e0
Call Trace:
tpm_transmit_cmd+0x25/0x70
tpm2_shutdown+0x69/0xa3
? __radix_tree_replace+0xd9/0x120
? idr_replace_ext+0x92/0xb6
tpm_chip_unregister+0xaa/0xdb
cr50_i2c_shutdown+0x1e/0x41
device_shutdown+0x157/0x193
kernel_power_off+0x35/0x6e
SYSC_reboot+0x120/0x1a3
? do_writepages+0x36/0x6e
? do_writepages+0x36/0x6e
? sync_inodes_one_sb+0x17/0x17
? _raw_spin_unlock+0xe/0x20
? iput+0x87/0x1bd
do_syscall_64+0x64/0x72
entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Signed-off-by: Shirish S 
---
  drivers/char/tpm/tpm-interface.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index e32f6e8..b14d196 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -411,6 +411,14 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip,
unsigned long stop;
bool need_locality;
  
+	/* chip->ops is made NULL in tpm_class_shutdown()

+* This case is hit when tpm_chip_unregister() is called post
+* tpm_class_shutdown(), hence exit early and return
+* transmit operation not permitted
+*/
+   if (!chip->ops)
+   return -EPERM;
+
rc = tpm_validate_command(chip, space, buf, bufsiz);
if (rc == -EINVAL)
return rc;
--
2.7.4


What is cr50 anyway?

Please ignore this patch, i realised the cr50 is not yet upstreamed.
Hence the issue lies in in cr50 and not tpm driver.
Thanks.
Regards,
Shirish S

/Jarkko


--
Regards,
Shirish S