Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-10-02 Thread Hawa, Hanna




On 9/30/2019 5:50 PM, Borislav Petkov wrote:

On Mon, Sep 23, 2019 at 08:17:40PM +0100, Hanna Hawa wrote:

+void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
+   int inst_nr, int block_nr, const char *msg)
+{
+   __edac_device_handle_ce(edac_dev, 1, inst_nr, block_nr, msg);
+}
+EXPORT_SYMBOL_GPL(edac_device_handle_ce);


Eww, I don't like that: exporting the function *and* the __ counterpart.
The user will get confused and that is unnecessary.

See below for a better version. This way you solve the whole deal with a
single patch.

I'm okay with this version, minor comment below.



---
From: Hanna Hawa 
Date: Mon, 23 Sep 2019 20:17:40 +0100
Subject: [PATCH] EDAC/device: Rework error logging API

Make the main workhorse the "count" functions which can log a @count
of errors. Have the current APIs edac_device_handle_{ce,ue}() call
the _count() variants and this way keep the exported symbols number
unchanged.

  [ bp: Rewrite. ]

Signed-off-by: Hanna Hawa 
Signed-off-by: Borislav Petkov 
Cc: b...@amazon.com
Cc: d...@amazon.co.uk
Cc: hano...@amazon.com
Cc: James Morse 
Cc: jon...@amazon.com
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: ron...@amazon.com
Cc: ta...@amazon.com
Cc: Tony Luck 
Link: https://lkml.kernel.org/r/20190923191741.29322-2-hhh...@amazon.com
---
  drivers/edac/edac_device.c | 44 ---
  drivers/edac/edac_device.h | 54 ++
  2 files changed, 66 insertions(+), 32 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 65cf2b9355c4..d4d8bed5b55d 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -555,8 +555,9 @@ static inline int edac_device_get_panic_on_ue(struct 
edac_device_ctl_info
return edac_dev->panic_on_ue;
  }
  
-void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,

-   int inst_nr, int block_nr, const char *msg)
+void edac_device_handle_ce_count(struct edac_device_ctl_info *edac_dev,
+unsigned int count, int inst_nr, int block_nr,
+const char *msg)
  {
struct edac_device_instance *instance;
struct edac_device_block *block = NULL;


Missing count check, same in *_ue_count():
if (count)
return;


@@ -582,23 +583,24 @@ void edac_device_handle_ce(struct edac_device_ctl_info 
*edac_dev,
  
  	if (instance->nr_blocks > 0) {

block = instance->blocks + block_nr;
-   block->counters.ce_count++;
+   block->counters.ce_count += count;
}
  
  	/* Propagate the count up the 'totals' tree */

-   instance->counters.ce_count++;
-   edac_dev->counters.ce_count++;
+   instance->counters.ce_count += count;
+   edac_dev->counters.ce_count += count;
  
  	if (edac_device_get_log_ce(edac_dev))

edac_device_printk(edac_dev, KERN_WARNING,
-   "CE: %s instance: %s block: %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+  "CE: %s instance: %s block: %s count: %d 
'%s'\n",
+  edac_dev->ctl_name, instance->name,
+  block ? block->name : "N/A", count, msg);
  }
-EXPORT_SYMBOL_GPL(edac_device_handle_ce);
+EXPORT_SYMBOL_GPL(edac_device_handle_ce_count);
  
-void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,

-   int inst_nr, int block_nr, const char *msg)
+void edac_device_handle_ue_count(struct edac_device_ctl_info *edac_dev,
+unsigned int count, int inst_nr, int block_nr,
+const char *msg)
  {
struct edac_device_instance *instance;
struct edac_device_block *block = NULL;
@@ -624,22 +626,22 @@ void edac_device_handle_ue(struct edac_device_ctl_info 
*edac_dev,
  
  	if (instance->nr_blocks > 0) {

block = instance->blocks + block_nr;
-   block->counters.ue_count++;
+   block->counters.ue_count += count;
}
  
  	/* Propagate the count up the 'totals' tree */

-   instance->counters.ue_count++;
-   edac_dev->counters.ue_count++;
+   instance->counters.ue_count += count;
+   edac_dev->counters.ue_count += count;
  
  	if (edac_device_get_log_ue(edac_dev))

edac_device_printk(edac_dev, KERN_EMERG,
-   "UE: %s instance: %s block: %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+  "UE: %s instance: %s block: %s count: %d 
'%s'\n",
+  edac_dev->ctl_name, instance->name,
+  block ? block->name : "N/A", count, msg);
  
  	if 

Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-10-01 Thread Robert Richter
On 01.10.19 12:25:39, Borislav Petkov wrote:
> On Tue, Oct 01, 2019 at 09:47:07AM +, Robert Richter wrote:
> > If you move to static inline for edac_device_handle_{ce,ue} the
> > symbols vanish and this breaks the abi. That's why the split in two
> > patches.
> 
> ABI issues do not concern upstream. And that coming from me working at a
> company who dance a lot to make ABI happy.
> 
> Also, I'm missing the reasoning why you use the ABI as an argument at
> all: do you know of a particular case where people are thinking of
> backporting this or this is all hypothetical.
> 
> > Your comment to not have a __ version as a third variant of the
> > interface makes sense to me. But to keep ABI your patch still needs to
> > be split.
> 
> Not really - normally, when you fix ABI issues with symbols
> disappearing, all of a sudden, you add dummy ones so that the ABI
> checker is happy.

Let's go with a single patch then and the function naming you
suggested before.

Thanks,

-Robert


Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-10-01 Thread Borislav Petkov
On Tue, Oct 01, 2019 at 09:47:07AM +, Robert Richter wrote:
> If you move to static inline for edac_device_handle_{ce,ue} the
> symbols vanish and this breaks the abi. That's why the split in two
> patches.

ABI issues do not concern upstream. And that coming from me working at a
company who dance a lot to make ABI happy.

Also, I'm missing the reasoning why you use the ABI as an argument at
all: do you know of a particular case where people are thinking of
backporting this or this is all hypothetical.

> Your comment to not have a __ version as a third variant of the
> interface makes sense to me. But to keep ABI your patch still needs to
> be split.

Not really - normally, when you fix ABI issues with symbols
disappearing, all of a sudden, you add dummy ones so that the ABI
checker is happy.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-10-01 Thread Robert Richter
On 01.10.19 10:32:42, Borislav Petkov wrote:

> --
> On Tue, Oct 01, 2019 at 06:56:58AM +, Robert Richter wrote:
> > It is *not* the counterpart. The __* version already has the...
> 
> Lemme cut to the chase:
> 
> "Make the main workhorse the "count" functions which can log a @count
> of errors. Have the current APIs edac_device_handle_{ce,ue}() call
> the _count() variants and this way keep the exported symbols number
> unchanged."
> 
> I want to see exactly *two* pairs of functions:
> 
> edac_device_handle_{ce,ue}_count  - logs a @count of errors
> edac_device_handle_{ce,ue}- logs a single error
> 
> Not three pairs. I.e., the "__" versions are not needed.
> 
> > The first patch only adds functionality but keeps the abi. Thus it
> > makes a backport easier.
> 
> Works just the same with my version - single backport.

If you move to static inline for edac_device_handle_{ce,ue} the
symbols vanish and this breaks the abi. That's why the split in two
patches.

Your comment to not have a __ version as a third variant of the
interface makes sense to me. But to keep ABI your patch still needs to
be split. The first patch still must contain symbols for
edac_device_handle_{ce,ue}. I am not sure if ABI compatability is
something we want to make easier here. I personally like the approach.

-Robert


Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-10-01 Thread Borislav Petkov
On Tue, Oct 01, 2019 at 06:56:58AM +, Robert Richter wrote:
> It is *not* the counterpart. The __* version already has the...

Lemme cut to the chase:

"Make the main workhorse the "count" functions which can log a @count
of errors. Have the current APIs edac_device_handle_{ce,ue}() call
the _count() variants and this way keep the exported symbols number
unchanged."

I want to see exactly *two* pairs of functions:

edac_device_handle_{ce,ue}_count- logs a @count of errors
edac_device_handle_{ce,ue}  - logs a single error

Not three pairs. I.e., the "__" versions are not needed.

> The first patch only adds functionality but keeps the abi. Thus it
> makes a backport easier.

Works just the same with my version - single backport.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-10-01 Thread Robert Richter
On 30.09.19 16:50:46, Borislav Petkov wrote:
> --
> On Mon, Sep 23, 2019 at 08:17:40PM +0100, Hanna Hawa wrote:
> > +void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
> > +   int inst_nr, int block_nr, const char *msg)
> > +{
> > +   __edac_device_handle_ce(edac_dev, 1, inst_nr, block_nr, msg);
> > +}
> > +EXPORT_SYMBOL_GPL(edac_device_handle_ce);
> 
> Eww, I don't like that: exporting the function *and* the __ counterpart.
> The user will get confused and that is unnecessary.

It is *not* the counterpart. The __* version already has the
additional count argument in. There are 2 patches:

1) introduce new function __edac_device_handle_ce/ue() including the
   *_count() inline functions, but keep the old symbols (note the
   count=1 parameter).

2) remove old symbol edac_device_handle_ce/ue() and replace it with an
   inline function to use the counterpart symbol too.

The first patch only adds functionality but keeps the abi. Thus it
makes a backport easier. The 2nd switches completely to the new
function.

-Robert



Re: [PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-09-30 Thread Borislav Petkov
On Mon, Sep 23, 2019 at 08:17:40PM +0100, Hanna Hawa wrote:
> +void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
> + int inst_nr, int block_nr, const char *msg)
> +{
> + __edac_device_handle_ce(edac_dev, 1, inst_nr, block_nr, msg);
> +}
> +EXPORT_SYMBOL_GPL(edac_device_handle_ce);

Eww, I don't like that: exporting the function *and* the __ counterpart.
The user will get confused and that is unnecessary.

See below for a better version. This way you solve the whole deal with a
single patch.

---
From: Hanna Hawa 
Date: Mon, 23 Sep 2019 20:17:40 +0100
Subject: [PATCH] EDAC/device: Rework error logging API

Make the main workhorse the "count" functions which can log a @count
of errors. Have the current APIs edac_device_handle_{ce,ue}() call
the _count() variants and this way keep the exported symbols number
unchanged.

 [ bp: Rewrite. ]

Signed-off-by: Hanna Hawa 
Signed-off-by: Borislav Petkov 
Cc: b...@amazon.com
Cc: d...@amazon.co.uk
Cc: hano...@amazon.com
Cc: James Morse 
Cc: jon...@amazon.com
Cc: linux-edac 
Cc: Mauro Carvalho Chehab 
Cc: ron...@amazon.com
Cc: ta...@amazon.com
Cc: Tony Luck 
Link: https://lkml.kernel.org/r/20190923191741.29322-2-hhh...@amazon.com
---
 drivers/edac/edac_device.c | 44 ---
 drivers/edac/edac_device.h | 54 ++
 2 files changed, 66 insertions(+), 32 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 65cf2b9355c4..d4d8bed5b55d 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -555,8 +555,9 @@ static inline int edac_device_get_panic_on_ue(struct 
edac_device_ctl_info
return edac_dev->panic_on_ue;
 }
 
-void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
-   int inst_nr, int block_nr, const char *msg)
+void edac_device_handle_ce_count(struct edac_device_ctl_info *edac_dev,
+unsigned int count, int inst_nr, int block_nr,
+const char *msg)
 {
struct edac_device_instance *instance;
struct edac_device_block *block = NULL;
@@ -582,23 +583,24 @@ void edac_device_handle_ce(struct edac_device_ctl_info 
*edac_dev,
 
if (instance->nr_blocks > 0) {
block = instance->blocks + block_nr;
-   block->counters.ce_count++;
+   block->counters.ce_count += count;
}
 
/* Propagate the count up the 'totals' tree */
-   instance->counters.ce_count++;
-   edac_dev->counters.ce_count++;
+   instance->counters.ce_count += count;
+   edac_dev->counters.ce_count += count;
 
if (edac_device_get_log_ce(edac_dev))
edac_device_printk(edac_dev, KERN_WARNING,
-   "CE: %s instance: %s block: %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+  "CE: %s instance: %s block: %s count: %d 
'%s'\n",
+  edac_dev->ctl_name, instance->name,
+  block ? block->name : "N/A", count, msg);
 }
-EXPORT_SYMBOL_GPL(edac_device_handle_ce);
+EXPORT_SYMBOL_GPL(edac_device_handle_ce_count);
 
-void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
-   int inst_nr, int block_nr, const char *msg)
+void edac_device_handle_ue_count(struct edac_device_ctl_info *edac_dev,
+unsigned int count, int inst_nr, int block_nr,
+const char *msg)
 {
struct edac_device_instance *instance;
struct edac_device_block *block = NULL;
@@ -624,22 +626,22 @@ void edac_device_handle_ue(struct edac_device_ctl_info 
*edac_dev,
 
if (instance->nr_blocks > 0) {
block = instance->blocks + block_nr;
-   block->counters.ue_count++;
+   block->counters.ue_count += count;
}
 
/* Propagate the count up the 'totals' tree */
-   instance->counters.ue_count++;
-   edac_dev->counters.ue_count++;
+   instance->counters.ue_count += count;
+   edac_dev->counters.ue_count += count;
 
if (edac_device_get_log_ue(edac_dev))
edac_device_printk(edac_dev, KERN_EMERG,
-   "UE: %s instance: %s block: %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+  "UE: %s instance: %s block: %s count: %d 
'%s'\n",
+  edac_dev->ctl_name, instance->name,
+  block ? block->name : "N/A", count, msg);
 
if (edac_device_get_panic_on_ue(edac_dev))
-   panic("EDAC %s: UE instance: %s block %s '%s'\n",
-   edac_dev->ctl_name, 

[PATCH v4 1/2] edac: Add an API for edac device to report for multiple errors

2019-09-23 Thread Hanna Hawa
Add an API for EDAC device to report multiple errors with same type.

Signed-off-by: Hanna Hawa 
Acked-by: Robert Richter 
---
 drivers/edac/edac_device.c | 56 --
 drivers/edac/edac_device.h | 47 
 2 files changed, 83 insertions(+), 20 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 65cf2b9355c4..6701fd3a8ce0 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -555,8 +555,9 @@ static inline int edac_device_get_panic_on_ue(struct 
edac_device_ctl_info
return edac_dev->panic_on_ue;
 }
 
-void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
-   int inst_nr, int block_nr, const char *msg)
+void __edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
+unsigned int count, int inst_nr, int block_nr,
+const char *msg)
 {
struct edac_device_instance *instance;
struct edac_device_block *block = NULL;
@@ -582,23 +583,24 @@ void edac_device_handle_ce(struct edac_device_ctl_info 
*edac_dev,
 
if (instance->nr_blocks > 0) {
block = instance->blocks + block_nr;
-   block->counters.ce_count++;
+   block->counters.ce_count += count;
}
 
/* Propagate the count up the 'totals' tree */
-   instance->counters.ce_count++;
-   edac_dev->counters.ce_count++;
+   instance->counters.ce_count += count;
+   edac_dev->counters.ce_count += count;
 
if (edac_device_get_log_ce(edac_dev))
edac_device_printk(edac_dev, KERN_WARNING,
-   "CE: %s instance: %s block: %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+  "CE: %s instance: %s block: %s count: %d 
'%s'\n",
+  edac_dev->ctl_name, instance->name,
+  block ? block->name : "N/A", count, msg);
 }
-EXPORT_SYMBOL_GPL(edac_device_handle_ce);
+EXPORT_SYMBOL_GPL(__edac_device_handle_ce);
 
-void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
-   int inst_nr, int block_nr, const char *msg)
+void __edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
+unsigned int count, int inst_nr, int block_nr,
+const char *msg)
 {
struct edac_device_instance *instance;
struct edac_device_block *block = NULL;
@@ -624,22 +626,36 @@ void edac_device_handle_ue(struct edac_device_ctl_info 
*edac_dev,
 
if (instance->nr_blocks > 0) {
block = instance->blocks + block_nr;
-   block->counters.ue_count++;
+   block->counters.ue_count += count;
}
 
/* Propagate the count up the 'totals' tree */
-   instance->counters.ue_count++;
-   edac_dev->counters.ue_count++;
+   instance->counters.ue_count += count;
+   edac_dev->counters.ue_count += count;
 
if (edac_device_get_log_ue(edac_dev))
edac_device_printk(edac_dev, KERN_EMERG,
-   "UE: %s instance: %s block: %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+  "UE: %s instance: %s block: %s count: %d 
'%s'\n",
+  edac_dev->ctl_name, instance->name,
+  block ? block->name : "N/A", count, msg);
 
if (edac_device_get_panic_on_ue(edac_dev))
-   panic("EDAC %s: UE instance: %s block %s '%s'\n",
-   edac_dev->ctl_name, instance->name,
-   block ? block->name : "N/A", msg);
+   panic("EDAC %s: UE instance: %s block %s count: %d '%s'\n",
+ edac_dev->ctl_name, instance->name,
+ block ? block->name : "N/A", count, msg);
+}
+EXPORT_SYMBOL_GPL(__edac_device_handle_ue);
+
+void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
+   int inst_nr, int block_nr, const char *msg)
+{
+   __edac_device_handle_ce(edac_dev, 1, inst_nr, block_nr, msg);
+}
+EXPORT_SYMBOL_GPL(edac_device_handle_ce);
+
+void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
+   int inst_nr, int block_nr, const char *msg)
+{
+   __edac_device_handle_ue(edac_dev, 1, inst_nr, block_nr, msg);
 }
 EXPORT_SYMBOL_GPL(edac_device_handle_ue);
diff --git a/drivers/edac/edac_device.h b/drivers/edac/edac_device.h
index 1aaba74ae411..7869f036138a 100644
--- a/drivers/edac/edac_device.h
+++ b/drivers/edac/edac_device.h
@@ -285,6 +285,33 @@ extern int edac_device_add_device(struct 
edac_device_ctl_info *edac_dev);
  */
 extern struct